All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net-next v2 00/11] Devlink health reporting and recovery system
@ 2019-01-17 21:59 Eran Ben Elisha
  2019-01-17 21:59 ` [PATCH net-next v2 01/11] devlink: Add health buffer support Eran Ben Elisha
                   ` (12 more replies)
  0 siblings, 13 replies; 36+ messages in thread
From: Eran Ben Elisha @ 2019-01-17 21:59 UTC (permalink / raw)
  To: netdev, Jiri Pirko, David S. Miller
  Cc: Ariel Almog, Aya Levin, Eran Ben Elisha, Moshe Shemesh

The health mechanism is targeted for Real Time Alerting, in order to know when
something bad had happened to a PCI device
- Provide alert debug information
- Self healing
- If problem needs vendor support, provide a way to gather all needed debugging
  information.

The main idea is to unify and centralize driver health reports in the
generic devlink instance and allow the user to set different
attributes of the health reporting and recovery procedures.

The devlink health reporter:
Device driver creates a "health reporter" per each error/health type.
Error/Health type can be a known/generic (eg pci error, fw error, rx/tx error)
or unknown (driver specific).
For each registered health reporter a driver can issue error/health reports
asynchronously. All health reports handling is done by devlink.
Device driver can provide specific callbacks for each "health reporter", e.g.
 - Recovery procedures
 - Diagnostics and object dump procedures
 - OOB initial attributes
Different parts of the driver can register different types of health reporters
with different handlers.

Once an error is reported, devlink health will do the following actions:
  * A log is being send to the kernel trace events buffer
  * Health status and statistics are being updated for the reporter instance
  * Object dump is being taken and saved at the reporter instance (as long as
    there is no other dump which is already stored)
  * Auto recovery attempt is being done. Depends on:
    - Auto-recovery configuration
    - Grace period vs. time passed since last recover

The user interface:
User can access/change each reporter attributes and driver specific callbacks
via devlink, e.g per error type (per health reporter)
 - Configure reporter's generic attributes (like: Disable/enable auto recovery)
 - Invoke recovery procedure
 - Run diagnostics
 - Object dump

The devlink health interface (via netlink):
DEVLINK_CMD_HEALTH_REPORTER_GET
  Retrieves status and configuration info per DEV and reporter.
DEVLINK_CMD_HEALTH_REPORTER_SET
  Allows reporter-related configuration setting.
DEVLINK_CMD_HEALTH_REPORTER_RECOVER
  Triggers a reporter's recovery procedure.
DEVLINK_CMD_HEALTH_REPORTER_DIAGNOSE
  Retrieves diagnostics data from a reporter on a device.
DEVLINK_CMD_HEALTH_REPORTER_DUMP_GET
  Retrieves the last stored dump. Devlink health
  saves a single dump. If an dump is not already stored by the devlink
  for this reporter, devlink generates a new dump.
  dump output is defined by the reporter.
DEVLINK_CMD_HEALTH_REPORTER_DUMP_CLEAR
  Clears the last saved dump file for the specified reporter.


                                               netlink
                                      +--------------------------+
                                      |                          |
                                      |            +             |
                                      |            |             |
                                      +--------------------------+
                                                   |request for ops
                                                   |(diagnose,
 mlx5_core                             devlink     |recover,
                                                   |dump)
+--------+                            +--------------------------+
|        |                            |    reporter|             |
|        |                            |  +---------v----------+  |
|        |   ops execution            |  |                    |  |
|     <----------------------------------+                    |  |
|        |                            |  |                    |  |
|        |                            |  + ^------------------+  |
|        |                            |    | request for ops     |
|        |                            |    | (recover, dump)     |
|        |                            |    |                     |
|        |                            |  +-+------------------+  |
|        |     health report          |  | health handler     |  |
|        +------------------------------->                    |  |
|        |                            |  +--------------------+  |
|        |     health reporter create |                          |
|        +---------------------------->                          |
+--------+                            +--------------------------+

In this patchset, mlx5e TX reporter is implemented.

v2:
- Remove FW* reporters to decrease the amount of patches in the patchset

Aya Levin (1):
  devlink: Add Documentation/networking/devlink-health.txt

Eran Ben Elisha (10):
  devlink: Add health buffer support
  devlink: Add health reporter create/destroy functionality
  devlink: Add health report functionality
  devlink: Add health get command
  devlink: Add health set command
  devlink: Add health recover command
  devlink: Add health diagnose command
  devlink: Add health dump {get,clear} commands
  net/mlx5e: Add TX reporter support
  net/mlx5e: Add TX timeout support for mlx5e TX reporter

 Documentation/networking/devlink-health.txt   |   86 ++
 .../net/ethernet/mellanox/mlx5/core/Makefile  |    2 +-
 drivers/net/ethernet/mellanox/mlx5/core/en.h  |   18 +-
 .../ethernet/mellanox/mlx5/core/en/reporter.h |   15 +
 .../mellanox/mlx5/core/en/reporter_tx.c       |  356 ++++++
 .../net/ethernet/mellanox/mlx5/core/en_main.c |  186 +--
 .../net/ethernet/mellanox/mlx5/core/en_tx.c   |    2 +-
 include/net/devlink.h                         |  144 +++
 include/trace/events/devlink.h                |   62 +
 include/uapi/linux/devlink.h                  |   25 +
 net/core/devlink.c                            | 1055 +++++++++++++++++
 11 files changed, 1782 insertions(+), 169 deletions(-)
 create mode 100644 Documentation/networking/devlink-health.txt
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en/reporter.h
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c

-- 
2.17.1


^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH net-next v2 01/11] devlink: Add health buffer support
  2019-01-17 21:59 [PATCH net-next v2 00/11] Devlink health reporting and recovery system Eran Ben Elisha
@ 2019-01-17 21:59 ` Eran Ben Elisha
  2019-01-20 10:03   ` Jiri Pirko
  2019-01-20 11:20   ` Jiri Pirko
  2019-01-17 21:59 ` [PATCH net-next v2 02/11] devlink: Add health reporter create/destroy functionality Eran Ben Elisha
                   ` (11 subsequent siblings)
  12 siblings, 2 replies; 36+ messages in thread
From: Eran Ben Elisha @ 2019-01-17 21:59 UTC (permalink / raw)
  To: netdev, Jiri Pirko, David S. Miller
  Cc: Ariel Almog, Aya Levin, Eran Ben Elisha, Moshe Shemesh

Devlink health buffer is a mechanism to pass descriptors between drivers
and devlink. The API allows the driver to add objects, object pair,
value array (nested attributes), value and name.

Driver can use this API to fill the buffers in a format which can be
translated by the devlink to the netlink message.

In order to fulfill it, an internal buffer descriptor is defined. This
will hold the data and metadata per each attribute and by used to pass
actual commands to the netlink.

This mechanism will be later used in devlink health for dump and diagnose
data store by the drivers.

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
---
 include/net/devlink.h        |  76 ++++++
 include/uapi/linux/devlink.h |   8 +
 net/core/devlink.c           | 501 +++++++++++++++++++++++++++++++++++
 3 files changed, 585 insertions(+)

diff --git a/include/net/devlink.h b/include/net/devlink.h
index 67f4293bc970..77c77319290a 100644
--- a/include/net/devlink.h
+++ b/include/net/devlink.h
@@ -423,6 +423,8 @@ struct devlink_region;
 
 typedef void devlink_snapshot_data_dest_t(const void *data);
 
+struct devlink_health_buffer;
+
 struct devlink_ops {
 	int (*reload)(struct devlink *devlink, struct netlink_ext_ack *extack);
 	int (*port_type_set)(struct devlink_port *devlink_port,
@@ -584,6 +586,22 @@ int devlink_region_snapshot_create(struct devlink_region *region, u64 data_len,
 				   u8 *data, u32 snapshot_id,
 				   devlink_snapshot_data_dest_t *data_destructor);
 
+int devlink_health_buffer_nest_start(struct devlink_health_buffer *buffer,
+				     int attrtype);
+void devlink_health_buffer_nest_end(struct devlink_health_buffer *buffer);
+void devlink_health_buffer_nest_cancel(struct devlink_health_buffer *buffer);
+int devlink_health_buffer_put_object_name(struct devlink_health_buffer *buffer,
+					  char *name);
+int devlink_health_buffer_put_value_u8(struct devlink_health_buffer *buffer,
+				       u8 value);
+int devlink_health_buffer_put_value_u32(struct devlink_health_buffer *buffer,
+					u32 value);
+int devlink_health_buffer_put_value_u64(struct devlink_health_buffer *buffer,
+					u64 value);
+int devlink_health_buffer_put_value_string(struct devlink_health_buffer *buffer,
+					   char *name);
+int devlink_health_buffer_put_value_data(struct devlink_health_buffer *buffer,
+					 void *data, int len);
 #else
 
 static inline struct devlink *devlink_alloc(const struct devlink_ops *ops,
@@ -844,6 +862,64 @@ devlink_region_snapshot_create(struct devlink_region *region, u64 data_len,
 	return 0;
 }
 
+static inline int
+devlink_health_buffer_nest_start(struct devlink_health_buffer *buffer,
+				 int attrtype)
+{
+	return 0;
+}
+
+static inline void
+devlink_health_buffer_nest_end(struct devlink_health_buffer *buffer)
+{
+}
+
+static inline void
+devlink_health_buffer_nest_cancel(struct devlink_health_buffer *buffer)
+{
+}
+
+static inline int
+devlink_health_buffer_put_object_name(struct devlink_health_buffer *buffer,
+				      char *name)
+{
+	return 0;
+}
+
+static inline int
+devlink_health_buffer_put_value_u8(struct devlink_health_buffer *buffer,
+				   u8 value)
+{
+	return 0;
+}
+
+static inline int
+devlink_health_buffer_put_value_u32(struct devlink_health_buffer *buffer,
+				    u32 value)
+{
+	return 0;
+}
+
+static inline int
+devlink_health_buffer_put_value_u64(struct devlink_health_buffer *buffer,
+				    u64 value)
+{
+	return 0;
+}
+
+static inline int
+devlink_health_buffer_put_value_string(struct devlink_health_buffer *buffer,
+				       char *name)
+{
+	return 0;
+}
+
+static inline int
+devlink_health_buffer_put_value_data(struct devlink_health_buffer *buffer,
+				     void *data, int len)
+{
+	return 0;
+}
 #endif
 
 #endif /* _NET_DEVLINK_H_ */
diff --git a/include/uapi/linux/devlink.h b/include/uapi/linux/devlink.h
index 6e52d3660654..cff0e0cb5ac2 100644
--- a/include/uapi/linux/devlink.h
+++ b/include/uapi/linux/devlink.h
@@ -285,6 +285,14 @@ enum devlink_attr {
 	DEVLINK_ATTR_REGION_CHUNK_ADDR,         /* u64 */
 	DEVLINK_ATTR_REGION_CHUNK_LEN,          /* u64 */
 
+	DEVLINK_ATTR_HEALTH_BUFFER_OBJECT,		/* nested */
+	DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_PAIR,		/* nested */
+	DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_NAME,		/* string */
+	DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_VALUE,	/* nested */
+	DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_VALUE_ARRAY,	/* nested */
+	DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_VALUE_TYPE,	/* u8 */
+	DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_VALUE_DATA,	/* dynamic */
+
 	/* add new attributes above here, update the policy in devlink.c */
 
 	__DEVLINK_ATTR_MAX,
diff --git a/net/core/devlink.c b/net/core/devlink.c
index abb0da9d7b4b..8984501edade 100644
--- a/net/core/devlink.c
+++ b/net/core/devlink.c
@@ -3597,6 +3597,507 @@ static int devlink_nl_cmd_region_read_dumpit(struct sk_buff *skb,
 	return 0;
 }
 
+#define DEVLINK_HEALTH_BUFFER_SIZE (4096 - GENL_HDRLEN)
+#define DEVLINK_HEALTH_BUFFER_DATA_SIZE (DEVLINK_HEALTH_BUFFER_SIZE / 2)
+#define DEVLINK_HEALTH_SIZE_TO_BUFFERS(size) DIV_ROUND_UP(size, DEVLINK_HEALTH_BUFFER_DATA_SIZE)
+#define DEVLINK_HEALTH_BUFFER_MAX_CHUNK 1024
+
+struct devlink_health_buffer {
+	void *data;
+	u64 offset;
+	u64 bytes_left;
+	u64 bytes_left_metadata;
+	u64 max_nested_depth;
+	u64 curr_nest;
+};
+
+struct devlink_health_buffer_desc {
+	int attrtype;
+	u16 len;
+	u8 nla_type;
+	u8 nest_end;
+	int value[0];
+};
+
+static void
+devlink_health_buffers_reset(struct devlink_health_buffer **buffers_list,
+			     u64 num_of_buffers)
+{
+	u64 i;
+
+	for (i = 0; i < num_of_buffers; i++) {
+		memset(buffers_list[i]->data, 0, DEVLINK_HEALTH_BUFFER_SIZE);
+		buffers_list[i]->offset = 0;
+		buffers_list[i]->bytes_left = DEVLINK_HEALTH_BUFFER_DATA_SIZE;
+		buffers_list[i]->bytes_left_metadata =
+			DEVLINK_HEALTH_BUFFER_DATA_SIZE;
+		buffers_list[i]->max_nested_depth = 0;
+		buffers_list[i]->curr_nest = 0;
+	}
+}
+
+static void
+devlink_health_buffers_destroy(struct devlink_health_buffer **buffers_list,
+			       u64 size);
+
+static struct devlink_health_buffer **
+devlink_health_buffers_create(u64 size)
+{
+	struct devlink_health_buffer **buffers_list;
+	u64 num_of_buffers = DEVLINK_HEALTH_SIZE_TO_BUFFERS(size);
+	u64 i;
+
+	buffers_list = kcalloc(num_of_buffers,
+			       sizeof(struct devlink_health_buffer *),
+			       GFP_KERNEL);
+	if (!buffers_list)
+		return NULL;
+
+	for (i = 0; i < num_of_buffers; i++) {
+		struct devlink_health_buffer *buffer;
+		void *data;
+
+		buffer = kzalloc(sizeof(*buffer), GFP_KERNEL);
+		data = kzalloc(DEVLINK_HEALTH_BUFFER_SIZE, GFP_KERNEL);
+		if (!buffer || !data) {
+			kfree(buffer);
+			kfree(data);
+			goto buffers_cleanup;
+		}
+		buffers_list[i] = buffer;
+		buffer->data = data;
+	}
+	devlink_health_buffers_reset(buffers_list, num_of_buffers);
+
+	return buffers_list;
+
+buffers_cleanup:
+	devlink_health_buffers_destroy(buffers_list, --i);
+	kfree(buffers_list);
+	return NULL;
+}
+
+static void
+devlink_health_buffers_destroy(struct devlink_health_buffer **buffers_list,
+			       u64 num_of_buffers)
+{
+	u64 i;
+
+	for (i = 0; i < num_of_buffers; i++) {
+		kfree(buffers_list[i]->data);
+		kfree(buffers_list[i]);
+	}
+}
+
+void
+devlink_health_buffer_offset_inc(struct devlink_health_buffer *buffer,
+				 int len)
+{
+	buffer->offset += len;
+}
+
+/* In order to store a nest, need two descriptors, for start and end */
+#define DEVLINK_HEALTH_BUFFER_NEST_SIZE (sizeof(struct devlink_health_buffer_desc) * 2)
+
+int devlink_health_buffer_verify_len(struct devlink_health_buffer *buffer,
+				     int len, int metadata_len)
+{
+	if (len > DEVLINK_HEALTH_BUFFER_DATA_SIZE)
+		return -EINVAL;
+
+	if (buffer->bytes_left < len ||
+	    buffer->bytes_left_metadata < metadata_len)
+		return -ENOMEM;
+
+	return 0;
+}
+
+static struct devlink_health_buffer_desc *
+devlink_health_buffer_get_desc_from_offset(struct devlink_health_buffer *buffer)
+{
+	return buffer->data + buffer->offset;
+}
+
+int
+devlink_health_buffer_nest_start(struct devlink_health_buffer *buffer,
+				 int attrtype)
+{
+	struct devlink_health_buffer_desc *desc;
+	int err;
+
+	err = devlink_health_buffer_verify_len(buffer, 0,
+					       DEVLINK_HEALTH_BUFFER_NEST_SIZE);
+	if (err)
+		return err;
+
+	if (attrtype != DEVLINK_ATTR_HEALTH_BUFFER_OBJECT &&
+	    attrtype != DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_PAIR &&
+	    attrtype != DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_VALUE &&
+	    attrtype != DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_VALUE_ARRAY)
+		return -EINVAL;
+
+	desc = devlink_health_buffer_get_desc_from_offset(buffer);
+
+	desc->attrtype = attrtype;
+	buffer->bytes_left_metadata -= DEVLINK_HEALTH_BUFFER_NEST_SIZE;
+	devlink_health_buffer_offset_inc(buffer, sizeof(*desc));
+
+	buffer->curr_nest++;
+	buffer->max_nested_depth = max(buffer->max_nested_depth,
+				       buffer->curr_nest);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(devlink_health_buffer_nest_start);
+
+enum devlink_health_buffer_nest_end_cancel {
+	DEVLINK_HEALTH_BUFFER_NEST_END = 1,
+	DEVLINK_HEALTH_BUFFER_NEST_CANCEL,
+};
+
+static void
+devlink_health_buffer_nest_end_cancel(struct devlink_health_buffer *buffer,
+				      enum devlink_health_buffer_nest_end_cancel nest)
+{
+	struct devlink_health_buffer_desc *desc;
+
+	WARN_ON(!buffer->curr_nest);
+	buffer->curr_nest--;
+
+	desc = devlink_health_buffer_get_desc_from_offset(buffer);
+	desc->nest_end = nest;
+	devlink_health_buffer_offset_inc(buffer, sizeof(*desc));
+}
+
+void devlink_health_buffer_nest_end(struct devlink_health_buffer *buffer)
+{
+	devlink_health_buffer_nest_end_cancel(buffer,
+					      DEVLINK_HEALTH_BUFFER_NEST_END);
+}
+EXPORT_SYMBOL_GPL(devlink_health_buffer_nest_end);
+
+void devlink_health_buffer_nest_cancel(struct devlink_health_buffer *buffer)
+{
+	devlink_health_buffer_nest_end_cancel(buffer,
+					      DEVLINK_HEALTH_BUFFER_NEST_CANCEL);
+}
+EXPORT_SYMBOL_GPL(devlink_health_buffer_nest_cancel);
+
+int
+devlink_health_buffer_put_object_name(struct devlink_health_buffer *buffer,
+				      char *name)
+{
+	struct devlink_health_buffer_desc *desc;
+	int err;
+
+	err = devlink_health_buffer_verify_len(buffer, strlen(name) + 1,
+					       sizeof(*desc));
+	if (err)
+		return err;
+
+	desc = devlink_health_buffer_get_desc_from_offset(buffer);
+	desc->attrtype = DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_NAME;
+	desc->nla_type = NLA_NUL_STRING;
+	desc->len = strlen(name) + 1;
+	memcpy(&desc->value, name, desc->len);
+	devlink_health_buffer_offset_inc(buffer, sizeof(*desc) + desc->len);
+
+	buffer->bytes_left_metadata -= sizeof(*desc);
+	buffer->bytes_left -= (strlen(name) + 1);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(devlink_health_buffer_put_object_name);
+
+static int
+devlink_health_buffer_put_value(struct devlink_health_buffer *buffer,
+				u8 nla_type, void *value, int len)
+{
+	struct devlink_health_buffer_desc *desc;
+	int err;
+
+	err = devlink_health_buffer_verify_len(buffer, len, sizeof(*desc));
+	if (err)
+		return err;
+
+	desc = devlink_health_buffer_get_desc_from_offset(buffer);
+	desc->attrtype = DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_VALUE_DATA;
+	desc->nla_type = nla_type;
+	desc->len = len;
+	memcpy(&desc->value, value, len);
+	devlink_health_buffer_offset_inc(buffer, sizeof(*desc) + desc->len);
+
+	buffer->bytes_left_metadata -= sizeof(*desc);
+	buffer->bytes_left -= len;
+
+	return 0;
+}
+
+int
+devlink_health_buffer_put_value_u8(struct devlink_health_buffer *buffer,
+				   u8 value)
+{
+	int err;
+
+	err = devlink_health_buffer_put_value(buffer, NLA_U8, &value,
+					      sizeof(value));
+	if (err)
+		return err;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(devlink_health_buffer_put_value_u8);
+
+int
+devlink_health_buffer_put_value_u32(struct devlink_health_buffer *buffer,
+				    u32 value)
+{
+	int err;
+
+	err = devlink_health_buffer_put_value(buffer, NLA_U32, &value,
+					      sizeof(value));
+	if (err)
+		return err;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(devlink_health_buffer_put_value_u32);
+
+int
+devlink_health_buffer_put_value_u64(struct devlink_health_buffer *buffer,
+				    u64 value)
+{
+	int err;
+
+	err = devlink_health_buffer_put_value(buffer, NLA_U64, &value,
+					      sizeof(value));
+	if (err)
+		return err;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(devlink_health_buffer_put_value_u64);
+
+int
+devlink_health_buffer_put_value_string(struct devlink_health_buffer *buffer,
+				       char *name)
+{
+	int err;
+
+	if (strlen(name) + 1 > DEVLINK_HEALTH_BUFFER_MAX_CHUNK)
+		return -EINVAL;
+
+	err = devlink_health_buffer_put_value(buffer, NLA_NUL_STRING, name,
+					      strlen(name) + 1);
+	if (err)
+		return err;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(devlink_health_buffer_put_value_string);
+
+int
+devlink_health_buffer_put_value_data(struct devlink_health_buffer *buffer,
+				     void *data, int len)
+{
+	int err;
+
+	if (len > DEVLINK_HEALTH_BUFFER_MAX_CHUNK)
+		return -EINVAL;
+
+	err = devlink_health_buffer_put_value(buffer, NLA_BINARY, data, len);
+	if (err)
+		return err;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(devlink_health_buffer_put_value_data);
+
+static int
+devlink_health_buffer_fill_data(struct sk_buff *skb,
+				struct devlink_health_buffer_desc *desc)
+{
+	int err = -EINVAL;
+
+	switch (desc->nla_type) {
+	case NLA_U8:
+		err = nla_put_u8(skb, DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_VALUE_DATA,
+				 *(u8 *)desc->value);
+		break;
+	case NLA_U32:
+		err = nla_put_u32(skb, DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_VALUE_DATA,
+				  *(u32 *)desc->value);
+		break;
+	case NLA_U64:
+		err = nla_put_u64_64bit(skb,
+					DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_VALUE_DATA,
+					*(u64 *)desc->value, DEVLINK_ATTR_PAD);
+		break;
+	case NLA_NUL_STRING:
+		err = nla_put_string(skb,
+				     DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_VALUE_DATA,
+				     (char *)&desc->value);
+		break;
+	case NLA_BINARY:
+		err = nla_put(skb, DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_VALUE_DATA,
+			      desc->len, (void *)&desc->value);
+		break;
+	}
+
+	return err;
+}
+
+static int
+devlink_health_buffer_fill_type(struct sk_buff *skb,
+				struct devlink_health_buffer_desc *desc)
+{
+	int err = -EINVAL;
+
+	switch (desc->nla_type) {
+	case NLA_U8:
+		err = nla_put_u8(skb, DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_VALUE_TYPE,
+				 NLA_U8);
+		break;
+	case NLA_U32:
+		err = nla_put_u8(skb, DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_VALUE_TYPE,
+				 NLA_U32);
+		break;
+	case NLA_U64:
+		err = nla_put_u8(skb, DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_VALUE_TYPE,
+				 NLA_U64);
+		break;
+	case NLA_NUL_STRING:
+		err = nla_put_u8(skb, DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_VALUE_TYPE,
+				 NLA_NUL_STRING);
+		break;
+	case NLA_BINARY:
+		err = nla_put_u8(skb, DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_VALUE_TYPE,
+				 NLA_BINARY);
+		break;
+	}
+
+	return err;
+}
+
+static inline struct devlink_health_buffer_desc *
+devlink_health_buffer_get_next_desc(struct devlink_health_buffer_desc *desc)
+{
+	return (void *)&desc->value + desc->len;
+}
+
+static int
+devlink_health_buffer_prepare_skb(struct sk_buff *skb,
+				  struct devlink_health_buffer *buffer)
+{
+	struct devlink_health_buffer_desc *last_desc, *desc;
+	struct nlattr **buffer_nlattr;
+	int err;
+	int i = 0;
+
+	buffer_nlattr = kcalloc(buffer->max_nested_depth,
+				sizeof(*buffer_nlattr), GFP_KERNEL);
+	if (!buffer_nlattr)
+		return -EINVAL;
+
+	last_desc = devlink_health_buffer_get_desc_from_offset(buffer);
+	desc = buffer->data;
+	while (desc != last_desc) {
+		switch (desc->attrtype) {
+		case DEVLINK_ATTR_HEALTH_BUFFER_OBJECT:
+		case DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_PAIR:
+		case DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_VALUE:
+		case DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_VALUE_ARRAY:
+			buffer_nlattr[i] = nla_nest_start(skb, desc->attrtype);
+			if (!buffer_nlattr[i])
+				goto nla_put_failure;
+			i++;
+			break;
+		case DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_VALUE_DATA:
+			err = devlink_health_buffer_fill_data(skb, desc);
+			if (err)
+				goto nla_put_failure;
+			err = devlink_health_buffer_fill_type(skb, desc);
+			if (err)
+				goto nla_put_failure;
+			break;
+		case DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_NAME:
+			err = nla_put_string(skb, desc->attrtype,
+					     (char *)&desc->value);
+			if (err)
+				goto nla_put_failure;
+			break;
+		default:
+			WARN_ON(!desc->nest_end);
+			WARN_ON(i <= 0);
+			if (desc->nest_end == DEVLINK_HEALTH_BUFFER_NEST_END)
+				nla_nest_end(skb, buffer_nlattr[--i]);
+			else
+				nla_nest_cancel(skb, buffer_nlattr[--i]);
+			break;
+		}
+		desc = devlink_health_buffer_get_next_desc(desc);
+	}
+
+	return 0;
+
+nla_put_failure:
+	kfree(buffer_nlattr);
+	return err;
+}
+
+static int
+devlink_health_buffer_snd(struct genl_info *info,
+			  enum devlink_command cmd, int flags,
+			  struct devlink_health_buffer **buffers_array,
+			  u64 num_of_buffers)
+{
+	struct sk_buff *skb;
+	struct nlmsghdr *nlh;
+	void *hdr;
+	int err;
+	u64 i;
+
+	for (i = 0; i < num_of_buffers; i++) {
+		/* Skip buffer if driver did not fill it up with any data */
+		if (!buffers_array[i]->offset)
+			continue;
+
+		skb = genlmsg_new(GENLMSG_DEFAULT_SIZE, GFP_KERNEL);
+		if (!skb)
+			return -ENOMEM;
+
+		hdr = genlmsg_put(skb, info->snd_portid, info->snd_seq,
+				  &devlink_nl_family, NLM_F_MULTI, cmd);
+		if (!hdr)
+			goto nla_put_failure;
+
+		err = devlink_health_buffer_prepare_skb(skb, buffers_array[i]);
+		if (err)
+			goto nla_put_failure;
+
+		genlmsg_end(skb, hdr);
+		err = genlmsg_reply(skb, info);
+		if (err)
+			return err;
+	}
+
+	skb = genlmsg_new(GENLMSG_DEFAULT_SIZE, GFP_KERNEL);
+	if (!skb)
+		return -ENOMEM;
+	nlh = nlmsg_put(skb, info->snd_portid, info->snd_seq,
+			NLMSG_DONE, 0, flags | NLM_F_MULTI);
+	err = genlmsg_reply(skb, info);
+	if (err)
+		return err;
+
+	return 0;
+
+nla_put_failure:
+	err = -EIO;
+	nlmsg_free(skb);
+	return err;
+}
+
 static const struct nla_policy devlink_nl_policy[DEVLINK_ATTR_MAX + 1] = {
 	[DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING },
 	[DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING },
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH net-next v2 02/11] devlink: Add health reporter create/destroy functionality
  2019-01-17 21:59 [PATCH net-next v2 00/11] Devlink health reporting and recovery system Eran Ben Elisha
  2019-01-17 21:59 ` [PATCH net-next v2 01/11] devlink: Add health buffer support Eran Ben Elisha
@ 2019-01-17 21:59 ` Eran Ben Elisha
  2019-01-20 11:49   ` Jiri Pirko
  2019-01-17 21:59 ` [PATCH net-next v2 03/11] devlink: Add health report functionality Eran Ben Elisha
                   ` (10 subsequent siblings)
  12 siblings, 1 reply; 36+ messages in thread
From: Eran Ben Elisha @ 2019-01-17 21:59 UTC (permalink / raw)
  To: netdev, Jiri Pirko, David S. Miller
  Cc: Ariel Almog, Aya Levin, Eran Ben Elisha, Moshe Shemesh

Devlink health reporter is an instance for reporting, diagnosing and
recovering from run time errors discovered by the reporters.
Define it's data structure and supported operations.
In addition, expose devlink API to create and destroy a reporter.
Each devlink instance will hold it's own reporters list.

As part of the allocation, driver shall provide a set of callbacks which
will be used the devlink in order to handle health reports and user
commands related to this reporter. In addition, driver is entitled to
provide some priv pointer, which can be fetched from the reporter by
devlink_health_reporter_priv function.

For each reporter, devlink will hold a metadata of statistics,
buffers and status.

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
---
 include/net/devlink.h |  59 ++++++++++++++++++++
 net/core/devlink.c    | 127 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 186 insertions(+)

diff --git a/include/net/devlink.h b/include/net/devlink.h
index 77c77319290a..7fe30d67678a 100644
--- a/include/net/devlink.h
+++ b/include/net/devlink.h
@@ -30,6 +30,7 @@ struct devlink {
 	struct list_head param_list;
 	struct list_head region_list;
 	u32 snapshot_id;
+	struct list_head reporter_list;
 	struct devlink_dpipe_headers *dpipe_headers;
 	const struct devlink_ops *ops;
 	struct device *dev;
@@ -424,6 +425,34 @@ struct devlink_region;
 typedef void devlink_snapshot_data_dest_t(const void *data);
 
 struct devlink_health_buffer;
+struct devlink_health_reporter;
+
+/**
+ * struct devlink_health_reporter_ops - Reporter operations
+ * @name: reporter name
+ * dump_size: dump buffer size allocated by the devlink
+ * diagnose_size: diagnose buffer size allocated by the devlink
+ * recover: callback to recover from reported error
+ *          if priv_ctx is NULL, run a full recover
+ * dump: callback to dump an object
+ *       if priv_ctx is NULL, run a full dump
+ * diagnose: callback to diagnose the current status
+ */
+
+struct devlink_health_reporter_ops {
+	char *name;
+	unsigned int dump_size;
+	unsigned int diagnose_size;
+	int (*recover)(struct devlink_health_reporter *reporter,
+		       void *priv_ctx);
+	int (*dump)(struct devlink_health_reporter *reporter,
+		    struct devlink_health_buffer **buffers_array,
+		    unsigned int buffer_size, unsigned int num_buffers,
+		    void *priv_ctx);
+	int (*diagnose)(struct devlink_health_reporter *reporter,
+			struct devlink_health_buffer **buffers_array,
+			unsigned int buffer_size, unsigned int num_buffers);
+};
 
 struct devlink_ops {
 	int (*reload)(struct devlink *devlink, struct netlink_ext_ack *extack);
@@ -602,6 +631,16 @@ int devlink_health_buffer_put_value_string(struct devlink_health_buffer *buffer,
 					   char *name);
 int devlink_health_buffer_put_value_data(struct devlink_health_buffer *buffer,
 					 void *data, int len);
+struct devlink_health_reporter *
+devlink_health_reporter_create(struct devlink *devlink,
+			       const struct devlink_health_reporter_ops *ops,
+			       u64 graceful_period, bool auto_recover,
+			       void *priv);
+void
+devlink_health_reporter_destroy(struct devlink_health_reporter *reporter);
+
+void *
+devlink_health_reporter_priv(struct devlink_health_reporter *reporter);
 #else
 
 static inline struct devlink *devlink_alloc(const struct devlink_ops *ops,
@@ -920,6 +959,26 @@ devlink_health_buffer_put_value_data(struct devlink_health_buffer *buffer,
 {
 	return 0;
 }
+
+static inline struct devlink_health_reporter *
+devlink_health_reporter_create(struct devlink *devlink,
+			       const struct devlink_health_reporter_ops *ops,
+			       u64 graceful_period, bool auto_recover,
+			       void *priv)
+{
+	return NULL;
+}
+
+static inline void
+devlink_health_reporter_destroy(struct devlink_health_reporter *reporter)
+{
+}
+
+static inline void *
+devlink_health_reporter_priv(struct devlink_health_reporter *reporter)
+{
+	return NULL;
+}
 #endif
 
 #endif /* _NET_DEVLINK_H_ */
diff --git a/net/core/devlink.c b/net/core/devlink.c
index 8984501edade..fec169a28dba 100644
--- a/net/core/devlink.c
+++ b/net/core/devlink.c
@@ -4098,6 +4098,132 @@ devlink_health_buffer_snd(struct genl_info *info,
 	return err;
 }
 
+struct devlink_health_reporter {
+	struct list_head list;
+	struct devlink_health_buffer **dump_buffers_array;
+	struct mutex dump_lock; /* lock parallel read/write from dump buffers */
+	struct devlink_health_buffer **diagnose_buffers_array;
+	struct mutex diagnose_lock; /* lock parallel read/write from diagnose buffers */
+	void *priv;
+	const struct devlink_health_reporter_ops *ops;
+	struct devlink *devlink;
+	u64 graceful_period;
+	bool auto_recover;
+	u8 health_state;
+};
+
+void *
+devlink_health_reporter_priv(struct devlink_health_reporter *reporter)
+{
+	return reporter->priv;
+}
+EXPORT_SYMBOL_GPL(devlink_health_reporter_priv);
+
+static struct devlink_health_reporter *
+devlink_health_reporter_find_by_name(struct devlink *devlink,
+				     const char *reporter_name)
+{
+	struct devlink_health_reporter *reporter;
+
+	list_for_each_entry(reporter, &devlink->reporter_list, list)
+		if (!strcmp(reporter->ops->name, reporter_name))
+			return reporter;
+	return NULL;
+}
+
+/**
+ *	devlink_health_reporter_create - create devlink health reporter
+ *
+ *	@devlink: devlink
+ *	@ops: ops
+ *	@graceful_period: to avoid recovery loops, in msecs
+ *	@auto_recover: auto recover when error occurs
+ *	@priv: priv
+ */
+struct devlink_health_reporter *
+devlink_health_reporter_create(struct devlink *devlink,
+			       const struct devlink_health_reporter_ops *ops,
+			       u64 graceful_period, bool auto_recover,
+			       void *priv)
+{
+	struct devlink_health_reporter *reporter;
+
+	mutex_lock(&devlink->lock);
+	if (devlink_health_reporter_find_by_name(devlink, ops->name)) {
+		reporter = ERR_PTR(-EEXIST);
+		goto unlock;
+	}
+
+	if (WARN_ON(ops->dump && !ops->dump_size) ||
+	    WARN_ON(ops->diagnose && !ops->diagnose_size) ||
+	    WARN_ON(auto_recover && !ops->recover) ||
+	    WARN_ON(graceful_period && !ops->recover)) {
+		reporter = ERR_PTR(-EINVAL);
+		goto unlock;
+	}
+
+	reporter = kzalloc(sizeof(*reporter), GFP_KERNEL);
+	if (!reporter) {
+		reporter = ERR_PTR(-ENOMEM);
+		goto unlock;
+	}
+
+	if (ops->dump) {
+		reporter->dump_buffers_array =
+			devlink_health_buffers_create(ops->dump_size);
+		if (!reporter->dump_buffers_array) {
+			kfree(reporter);
+			reporter = ERR_PTR(-ENOMEM);
+			goto unlock;
+		}
+	}
+
+	if (ops->diagnose) {
+		reporter->diagnose_buffers_array =
+			devlink_health_buffers_create(ops->diagnose_size);
+		if (!reporter->diagnose_buffers_array) {
+			devlink_health_buffers_destroy(reporter->dump_buffers_array,
+						       DEVLINK_HEALTH_SIZE_TO_BUFFERS(ops->dump_size));
+			kfree(reporter);
+			reporter = ERR_PTR(-ENOMEM);
+			goto unlock;
+		}
+	}
+
+	list_add_tail(&reporter->list, &devlink->reporter_list);
+	mutex_init(&reporter->dump_lock);
+	mutex_init(&reporter->diagnose_lock);
+
+	reporter->priv = priv;
+	reporter->ops = ops;
+	reporter->devlink = devlink;
+	reporter->graceful_period = graceful_period;
+	reporter->auto_recover = auto_recover;
+unlock:
+	mutex_unlock(&devlink->lock);
+	return reporter;
+}
+EXPORT_SYMBOL_GPL(devlink_health_reporter_create);
+
+/**
+ *	devlink_health_reporter_destroy - destroy devlink health reporter
+ *
+ *	@reporter: devlink health reporter to destroy
+ */
+void
+devlink_health_reporter_destroy(struct devlink_health_reporter *reporter)
+{
+	mutex_lock(&reporter->devlink->lock);
+	list_del(&reporter->list);
+	devlink_health_buffers_destroy(reporter->dump_buffers_array,
+				       DEVLINK_HEALTH_SIZE_TO_BUFFERS(reporter->ops->dump_size));
+	devlink_health_buffers_destroy(reporter->diagnose_buffers_array,
+				       DEVLINK_HEALTH_SIZE_TO_BUFFERS(reporter->ops->diagnose_size));
+	kfree(reporter);
+	mutex_unlock(&reporter->devlink->lock);
+}
+EXPORT_SYMBOL_GPL(devlink_health_reporter_destroy);
+
 static const struct nla_policy devlink_nl_policy[DEVLINK_ATTR_MAX + 1] = {
 	[DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING },
 	[DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING },
@@ -4383,6 +4509,7 @@ struct devlink *devlink_alloc(const struct devlink_ops *ops, size_t priv_size)
 	INIT_LIST_HEAD(&devlink->resource_list);
 	INIT_LIST_HEAD(&devlink->param_list);
 	INIT_LIST_HEAD(&devlink->region_list);
+	INIT_LIST_HEAD(&devlink->reporter_list);
 	mutex_init(&devlink->lock);
 	return devlink;
 }
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH net-next v2 03/11] devlink: Add health report functionality
  2019-01-17 21:59 [PATCH net-next v2 00/11] Devlink health reporting and recovery system Eran Ben Elisha
  2019-01-17 21:59 ` [PATCH net-next v2 01/11] devlink: Add health buffer support Eran Ben Elisha
  2019-01-17 21:59 ` [PATCH net-next v2 02/11] devlink: Add health reporter create/destroy functionality Eran Ben Elisha
@ 2019-01-17 21:59 ` Eran Ben Elisha
  2019-01-20 11:27   ` Jiri Pirko
  2019-01-17 21:59 ` [PATCH net-next v2 04/11] devlink: Add health get command Eran Ben Elisha
                   ` (9 subsequent siblings)
  12 siblings, 1 reply; 36+ messages in thread
From: Eran Ben Elisha @ 2019-01-17 21:59 UTC (permalink / raw)
  To: netdev, Jiri Pirko, David S. Miller
  Cc: Ariel Almog, Aya Levin, Eran Ben Elisha, Moshe Shemesh

Upon error discover, every driver can report it to the devlink health
mechanism via devlink_health_report function, using the appropriate
reporter registered to it. Driver can pass error specific context which
will be delivered to it as part of the dump / recovery callbacks.

Once an error is reported, devlink health will do the following actions:
* A log is being send to the kernel trace events buffer
* Health status and statistics are being updated for the reporter instance
* Object dump is being taken and stored at the reporter instance (as long
  as there is no other dump which is already stored)
* Auto recovery attempt is being done. depends on:
  - Auto Recovery configuration
  - Grace period vs. time since last recover

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
---
 include/net/devlink.h          |  9 ++++
 include/trace/events/devlink.h | 62 +++++++++++++++++++++++
 net/core/devlink.c             | 93 ++++++++++++++++++++++++++++++++++
 3 files changed, 164 insertions(+)

diff --git a/include/net/devlink.h b/include/net/devlink.h
index 7fe30d67678a..a81a1b7a67d7 100644
--- a/include/net/devlink.h
+++ b/include/net/devlink.h
@@ -641,6 +641,8 @@ devlink_health_reporter_destroy(struct devlink_health_reporter *reporter);
 
 void *
 devlink_health_reporter_priv(struct devlink_health_reporter *reporter);
+int devlink_health_report(struct devlink_health_reporter *reporter,
+			  const char *msg, void *priv_ctx);
 #else
 
 static inline struct devlink *devlink_alloc(const struct devlink_ops *ops,
@@ -979,6 +981,13 @@ devlink_health_reporter_priv(struct devlink_health_reporter *reporter)
 {
 	return NULL;
 }
+
+static inline int
+devlink_health_report(struct devlink_health_reporter *reporter,
+		      const char *msg, void *priv_ctx)
+{
+	return 0;
+}
 #endif
 
 #endif /* _NET_DEVLINK_H_ */
diff --git a/include/trace/events/devlink.h b/include/trace/events/devlink.h
index 44acfbca1266..7e39d2fc7c75 100644
--- a/include/trace/events/devlink.h
+++ b/include/trace/events/devlink.h
@@ -46,6 +46,65 @@ TRACE_EVENT(devlink_hwmsg,
 		  (int) __entry->len, __get_dynamic_array(buf), __entry->len)
 );
 
+TRACE_EVENT(devlink_health_report,
+	TP_PROTO(const struct devlink *devlink, const char *reporter_name,
+		 const char *msg),
+
+	TP_ARGS(devlink, reporter_name, msg),
+
+	TP_STRUCT__entry(
+		__string(bus_name, devlink->dev->bus->name)
+		__string(dev_name, dev_name(devlink->dev))
+		__string(driver_name, devlink->dev->driver->name)
+		__string(reporter_name, msg)
+		__string(msg, msg)
+	),
+
+	TP_fast_assign(
+		__assign_str(bus_name, devlink->dev->bus->name);
+		__assign_str(dev_name, dev_name(devlink->dev));
+		__assign_str(driver_name, devlink->dev->driver->name);
+		__assign_str(reporter_name, reporter_name);
+		__assign_str(msg, msg);
+	),
+
+	TP_printk("bus_name=%s dev_name=%s driver_name=%s reporter_name=%s: %s",
+		  __get_str(bus_name), __get_str(dev_name),
+		  __get_str(driver_name), __get_str(reporter_name),
+		  __get_str(msg))
+);
+
+TRACE_EVENT(devlink_health_recover_aborted,
+	TP_PROTO(const struct devlink *devlink, const char *reporter_name,
+		 bool health_state, u64 time_since_last_recover),
+
+	TP_ARGS(devlink, reporter_name, health_state, time_since_last_recover),
+
+	TP_STRUCT__entry(
+		__string(bus_name, devlink->dev->bus->name)
+		__string(dev_name, dev_name(devlink->dev))
+		__string(driver_name, devlink->dev->driver->name)
+		__string(reporter_name, reporter_name)
+		__field(bool, health_state)
+		__field(u64, time_since_last_recover)
+	),
+
+	TP_fast_assign(
+		__assign_str(bus_name, devlink->dev->bus->name);
+		__assign_str(dev_name, dev_name(devlink->dev));
+		__assign_str(driver_name, devlink->dev->driver->name);
+		__assign_str(reporter_name, reporter_name);
+		__entry->health_state = health_state;
+		__entry->time_since_last_recover = time_since_last_recover;
+	),
+
+	TP_printk("bus_name=%s dev_name=%s driver_name=%s reporter_name=%s: health_state=%d time_since_last_recover = %llu recover aborted",
+		  __get_str(bus_name), __get_str(dev_name),
+		  __get_str(driver_name), __get_str(reporter_name),
+		  __entry->health_state,
+		  __entry->time_since_last_recover)
+);
+
 #endif /* _TRACE_DEVLINK_H */
 
 /* This part must be outside protection */
@@ -64,6 +123,9 @@ static inline void trace_devlink_hwmsg(const struct devlink *devlink,
 {
 }
 
+static inline void trace_devlink_health(const char *msg)
+{
+}
 #endif /* _TRACE_DEVLINK_H */
 
 #endif
diff --git a/net/core/devlink.c b/net/core/devlink.c
index fec169a28dba..943d3e7dea6a 100644
--- a/net/core/devlink.c
+++ b/net/core/devlink.c
@@ -4110,6 +4110,16 @@ struct devlink_health_reporter {
 	u64 graceful_period;
 	bool auto_recover;
 	u8 health_state;
+	u8 dump_avail;
+	u64 dump_ts;
+	u64 error_count;
+	u64 recovery_count;
+	u64 last_recovery_ts;
+};
+
+enum devlink_health_reporter_state {
+	DEVLINK_HEALTH_REPORTER_STATE_HEALTHY,
+	DEVLINK_HEALTH_REPORTER_STATE_ERROR,
 };
 
 void *
@@ -4224,6 +4234,89 @@ devlink_health_reporter_destroy(struct devlink_health_reporter *reporter)
 }
 EXPORT_SYMBOL_GPL(devlink_health_reporter_destroy);
 
+static int
+devlink_health_reporter_recover(struct devlink_health_reporter *reporter,
+				void *priv_ctx)
+{
+	int err;
+
+	if (!reporter->ops->recover)
+		return -EOPNOTSUPP;
+
+	err = reporter->ops->recover(reporter, priv_ctx);
+	if (err)
+		return err;
+
+	reporter->recovery_count++;
+	reporter->health_state = DEVLINK_HEALTH_REPORTER_STATE_HEALTHY;
+	reporter->last_recovery_ts = jiffies;
+
+	return 0;
+}
+
+static int devlink_health_do_dump(struct devlink_health_reporter *reporter,
+				  void *priv_ctx)
+{
+	int err;
+
+	if (!reporter->ops->dump)
+		return 0;
+
+	if (reporter->dump_avail)
+		return 0;
+
+	devlink_health_buffers_reset(reporter->dump_buffers_array,
+				     DEVLINK_HEALTH_SIZE_TO_BUFFERS(reporter->ops->dump_size));
+	err = reporter->ops->dump(reporter, reporter->dump_buffers_array,
+				     DEVLINK_HEALTH_BUFFER_SIZE,
+				     DEVLINK_HEALTH_SIZE_TO_BUFFERS(reporter->ops->dump_size),
+				     priv_ctx);
+	if (!err) {
+		reporter->dump_avail = true;
+		reporter->dump_ts = jiffies;
+	}
+
+	return err;
+}
+
+int devlink_health_report(struct devlink_health_reporter *reporter,
+			  const char *msg, void *priv_ctx)
+{
+	struct devlink *devlink = reporter->devlink;
+	int err = 0;
+
+	/* write a log message of the current error */
+	WARN_ON(!msg);
+	trace_devlink_health_report(devlink, reporter->ops->name, msg);
+	reporter->error_count++;
+
+	/* abort if the previous error wasn't recovered */
+	if (reporter->auto_recover &&
+	    (reporter->health_state != DEVLINK_HEALTH_REPORTER_STATE_HEALTHY ||
+	     jiffies - reporter->last_recovery_ts <
+	     msecs_to_jiffies(reporter->graceful_period))) {
+		trace_devlink_health_recover_aborted(devlink,
+						     reporter->ops->name,
+						     reporter->health_state,
+						     jiffies -
+						     reporter->last_recovery_ts);
+		return -ECANCELED;
+	}
+
+	reporter->health_state = DEVLINK_HEALTH_REPORTER_STATE_ERROR;
+
+	mutex_lock(&reporter->dump_lock);
+	/* store current dump of current error, for later analysis */
+	devlink_health_do_dump(reporter, priv_ctx);
+	mutex_unlock(&reporter->dump_lock);
+
+	if (reporter->auto_recover)
+		err = devlink_health_reporter_recover(reporter, priv_ctx);
+
+	return err;
+}
+EXPORT_SYMBOL_GPL(devlink_health_report);
+
 static const struct nla_policy devlink_nl_policy[DEVLINK_ATTR_MAX + 1] = {
 	[DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING },
 	[DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING },
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH net-next v2 04/11] devlink: Add health get command
  2019-01-17 21:59 [PATCH net-next v2 00/11] Devlink health reporting and recovery system Eran Ben Elisha
                   ` (2 preceding siblings ...)
  2019-01-17 21:59 ` [PATCH net-next v2 03/11] devlink: Add health report functionality Eran Ben Elisha
@ 2019-01-17 21:59 ` Eran Ben Elisha
  2019-01-20 11:31   ` Jiri Pirko
  2019-01-17 21:59 ` [PATCH net-next v2 05/11] devlink: Add health set command Eran Ben Elisha
                   ` (8 subsequent siblings)
  12 siblings, 1 reply; 36+ messages in thread
From: Eran Ben Elisha @ 2019-01-17 21:59 UTC (permalink / raw)
  To: netdev, Jiri Pirko, David S. Miller
  Cc: Ariel Almog, Aya Levin, Eran Ben Elisha, Moshe Shemesh

Add devlink health get command to provide reporter/s data for user space.
Add the ability to get data per reporter or dump data from all available
reporters.

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
---
 include/uapi/linux/devlink.h |  12 +++
 net/core/devlink.c           | 152 +++++++++++++++++++++++++++++++++++
 2 files changed, 164 insertions(+)

diff --git a/include/uapi/linux/devlink.h b/include/uapi/linux/devlink.h
index cff0e0cb5ac2..c05470578b99 100644
--- a/include/uapi/linux/devlink.h
+++ b/include/uapi/linux/devlink.h
@@ -89,6 +89,8 @@ enum devlink_command {
 	DEVLINK_CMD_REGION_DEL,
 	DEVLINK_CMD_REGION_READ,
 
+	DEVLINK_CMD_HEALTH_REPORTER_GET,
+
 	/* add new commands above here */
 	__DEVLINK_CMD_MAX,
 	DEVLINK_CMD_MAX = __DEVLINK_CMD_MAX - 1
@@ -293,6 +295,16 @@ enum devlink_attr {
 	DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_VALUE_TYPE,	/* u8 */
 	DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_VALUE_DATA,	/* dynamic */
 
+	DEVLINK_ATTR_HEALTH_REPORTER,			/* nested */
+	DEVLINK_ATTR_HEALTH_REPORTER_NAME,		/* string */
+	DEVLINK_ATTR_HEALTH_REPORTER_STATE,		/* u8 */
+	DEVLINK_ATTR_HEALTH_REPORTER_ERR,		/* u64 */
+	DEVLINK_ATTR_HEALTH_REPORTER_RECOVER,		/* u64 */
+	DEVLINK_ATTR_HEALTH_REPORTER_DUMP_AVAIL,	/* u8 */
+	DEVLINK_ATTR_HEALTH_REPORTER_DUMP_TS,		/* u64 */
+	DEVLINK_ATTR_HEALTH_REPORTER_GRACEFUL_PERIOD,	/* u64 */
+	DEVLINK_ATTR_HEALTH_REPORTER_AUTO_RECOVER,	/* u8 */
+
 	/* add new attributes above here, update the policy in devlink.c */
 
 	__DEVLINK_ATTR_MAX,
diff --git a/net/core/devlink.c b/net/core/devlink.c
index 943d3e7dea6a..2ba9275449c2 100644
--- a/net/core/devlink.c
+++ b/net/core/devlink.c
@@ -4317,6 +4317,149 @@ int devlink_health_report(struct devlink_health_reporter *reporter,
 }
 EXPORT_SYMBOL_GPL(devlink_health_report);
 
+static struct devlink_health_reporter *
+devlink_health_reporter_get_from_info(struct devlink *devlink,
+				      struct genl_info *info)
+{
+	char *reporter_name;
+
+	if (!info->attrs[DEVLINK_ATTR_HEALTH_REPORTER_NAME])
+		return NULL;
+
+	reporter_name =
+		nla_data(info->attrs[DEVLINK_ATTR_HEALTH_REPORTER_NAME]);
+	return devlink_health_reporter_find_by_name(devlink, reporter_name);
+}
+
+static int
+devlink_nl_health_reporter_fill(struct sk_buff *msg,
+				struct devlink *devlink,
+				struct devlink_health_reporter *reporter,
+				enum devlink_command cmd, u32 portid,
+				u32 seq, int flags)
+{
+	struct nlattr *reporter_attr;
+	void *hdr;
+
+	hdr = genlmsg_put(msg, portid, seq, &devlink_nl_family, flags, cmd);
+	if (!hdr)
+		return -EMSGSIZE;
+
+	if (devlink_nl_put_handle(msg, devlink))
+		goto genlmsg_cancel;
+
+	reporter_attr = nla_nest_start(msg, DEVLINK_ATTR_HEALTH_REPORTER);
+	if (!reporter_attr)
+		goto genlmsg_cancel;
+	if (nla_put_string(msg, DEVLINK_ATTR_HEALTH_REPORTER_NAME,
+			   reporter->ops->name))
+		goto reporter_nest_cancel;
+	if (nla_put_u8(msg, DEVLINK_ATTR_HEALTH_REPORTER_STATE,
+		       reporter->health_state))
+		goto reporter_nest_cancel;
+	if (nla_put_u64_64bit(msg, DEVLINK_ATTR_HEALTH_REPORTER_ERR,
+			      reporter->error_count, DEVLINK_ATTR_PAD))
+		goto reporter_nest_cancel;
+	if (nla_put_u64_64bit(msg, DEVLINK_ATTR_HEALTH_REPORTER_RECOVER,
+			      reporter->recovery_count, DEVLINK_ATTR_PAD))
+		goto reporter_nest_cancel;
+	if (nla_put_u64_64bit(msg, DEVLINK_ATTR_HEALTH_REPORTER_GRACEFUL_PERIOD,
+			      reporter->graceful_period,
+			      DEVLINK_ATTR_PAD))
+		goto reporter_nest_cancel;
+	if (nla_put_u8(msg, DEVLINK_ATTR_HEALTH_REPORTER_AUTO_RECOVER,
+		       reporter->auto_recover))
+		goto reporter_nest_cancel;
+	if (nla_put_u8(msg, DEVLINK_ATTR_HEALTH_REPORTER_DUMP_AVAIL,
+		       reporter->dump_avail))
+		goto reporter_nest_cancel;
+	if (reporter->dump_avail &&
+	    nla_put_u64_64bit(msg, DEVLINK_ATTR_HEALTH_REPORTER_DUMP_TS,
+			      jiffies_to_msecs(reporter->dump_ts),
+			      DEVLINK_ATTR_PAD))
+		goto reporter_nest_cancel;
+
+	nla_nest_end(msg, reporter_attr);
+	genlmsg_end(msg, hdr);
+	return 0;
+
+reporter_nest_cancel:
+	nla_nest_end(msg, reporter_attr);
+genlmsg_cancel:
+	genlmsg_cancel(msg, hdr);
+	return -EMSGSIZE;
+}
+
+static int devlink_nl_cmd_health_reporter_get_doit(struct sk_buff *skb,
+						   struct genl_info *info)
+{
+	struct devlink *devlink = info->user_ptr[0];
+	struct devlink_health_reporter *reporter;
+	struct sk_buff *msg;
+	int err;
+
+	reporter = devlink_health_reporter_get_from_info(devlink, info);
+	if (!reporter)
+		return -EINVAL;
+
+	msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
+	if (!msg)
+		return -ENOMEM;
+
+	err = devlink_nl_health_reporter_fill(msg, devlink, reporter,
+					      DEVLINK_CMD_HEALTH_REPORTER_GET,
+					      info->snd_portid, info->snd_seq,
+					      0);
+	if (err) {
+		nlmsg_free(msg);
+		return err;
+	}
+
+	return genlmsg_reply(msg, info);
+}
+
+static int
+devlink_nl_cmd_health_reporter_get_dumpit(struct sk_buff *msg,
+					  struct netlink_callback *cb)
+{
+	struct devlink_health_reporter *reporter;
+	struct devlink *devlink;
+	int start = cb->args[0];
+	int idx = 0;
+	int err;
+
+	mutex_lock(&devlink_mutex);
+	list_for_each_entry(devlink, &devlink_list, list) {
+		if (!net_eq(devlink_net(devlink), sock_net(msg->sk)))
+			continue;
+		mutex_lock(&devlink->lock);
+		list_for_each_entry(reporter, &devlink->reporter_list,
+				    list) {
+			if (idx < start) {
+				idx++;
+				continue;
+			}
+			err = devlink_nl_health_reporter_fill(msg, devlink,
+							      reporter,
+							      DEVLINK_CMD_HEALTH_REPORTER_GET,
+							      NETLINK_CB(cb->skb).portid,
+							      cb->nlh->nlmsg_seq,
+							      NLM_F_MULTI);
+			if (err) {
+				mutex_unlock(&devlink->lock);
+				goto out;
+			}
+			idx++;
+		}
+		mutex_unlock(&devlink->lock);
+	}
+out:
+	mutex_unlock(&devlink_mutex);
+
+	cb->args[0] = idx;
+	return msg->len;
+}
+
 static const struct nla_policy devlink_nl_policy[DEVLINK_ATTR_MAX + 1] = {
 	[DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING },
 	[DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING },
@@ -4342,6 +4485,7 @@ static const struct nla_policy devlink_nl_policy[DEVLINK_ATTR_MAX + 1] = {
 	[DEVLINK_ATTR_PARAM_VALUE_CMODE] = { .type = NLA_U8 },
 	[DEVLINK_ATTR_REGION_NAME] = { .type = NLA_NUL_STRING },
 	[DEVLINK_ATTR_REGION_SNAPSHOT_ID] = { .type = NLA_U32 },
+	[DEVLINK_ATTR_HEALTH_REPORTER_NAME] = { .type = NLA_NUL_STRING },
 };
 
 static const struct genl_ops devlink_nl_ops[] = {
@@ -4562,6 +4706,14 @@ static const struct genl_ops devlink_nl_ops[] = {
 		.flags = GENL_ADMIN_PERM,
 		.internal_flags = DEVLINK_NL_FLAG_NEED_DEVLINK,
 	},
+	{
+		.cmd = DEVLINK_CMD_HEALTH_REPORTER_GET,
+		.doit = devlink_nl_cmd_health_reporter_get_doit,
+		.dumpit = devlink_nl_cmd_health_reporter_get_dumpit,
+		.policy = devlink_nl_policy,
+		.internal_flags = DEVLINK_NL_FLAG_NEED_DEVLINK,
+		/* can be retrieved by unprivileged users */
+	},
 };
 
 static struct genl_family devlink_nl_family __ro_after_init = {
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH net-next v2 05/11] devlink: Add health set command
  2019-01-17 21:59 [PATCH net-next v2 00/11] Devlink health reporting and recovery system Eran Ben Elisha
                   ` (3 preceding siblings ...)
  2019-01-17 21:59 ` [PATCH net-next v2 04/11] devlink: Add health get command Eran Ben Elisha
@ 2019-01-17 21:59 ` Eran Ben Elisha
  2019-01-20 11:32   ` Jiri Pirko
  2019-01-17 21:59 ` [PATCH net-next v2 06/11] devlink: Add health recover command Eran Ben Elisha
                   ` (7 subsequent siblings)
  12 siblings, 1 reply; 36+ messages in thread
From: Eran Ben Elisha @ 2019-01-17 21:59 UTC (permalink / raw)
  To: netdev, Jiri Pirko, David S. Miller
  Cc: Ariel Almog, Aya Levin, Eran Ben Elisha, Moshe Shemesh

Add devlink health set command, in order to set configuration parameters
for a specific reporter.
Supported parameters are:
- graceful_period: Time interval between auto recoveries (in msec)
- auto_recover: Determines if the devlink shall execute recover upon
		receiving error for the reporter

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
---
 include/uapi/linux/devlink.h |  1 +
 net/core/devlink.c           | 36 ++++++++++++++++++++++++++++++++++++
 2 files changed, 37 insertions(+)

diff --git a/include/uapi/linux/devlink.h b/include/uapi/linux/devlink.h
index c05470578b99..49ad5a76b121 100644
--- a/include/uapi/linux/devlink.h
+++ b/include/uapi/linux/devlink.h
@@ -90,6 +90,7 @@ enum devlink_command {
 	DEVLINK_CMD_REGION_READ,
 
 	DEVLINK_CMD_HEALTH_REPORTER_GET,
+	DEVLINK_CMD_HEALTH_REPORTER_SET,
 
 	/* add new commands above here */
 	__DEVLINK_CMD_MAX,
diff --git a/net/core/devlink.c b/net/core/devlink.c
index 2ba9275449c2..a34414bf1c27 100644
--- a/net/core/devlink.c
+++ b/net/core/devlink.c
@@ -4460,6 +4460,33 @@ devlink_nl_cmd_health_reporter_get_dumpit(struct sk_buff *msg,
 	return msg->len;
 }
 
+static int
+devlink_nl_cmd_health_reporter_set_doit(struct sk_buff *skb,
+					struct genl_info *info)
+{
+	struct devlink *devlink = info->user_ptr[0];
+	struct devlink_health_reporter *reporter;
+
+	reporter = devlink_health_reporter_get_from_info(devlink, info);
+	if (!reporter)
+		return -EINVAL;
+
+	if (!reporter->ops->recover &&
+	    (info->attrs[DEVLINK_ATTR_HEALTH_REPORTER_GRACEFUL_PERIOD] ||
+	     info->attrs[DEVLINK_ATTR_HEALTH_REPORTER_AUTO_RECOVER]))
+		return -EINVAL;
+
+	if (info->attrs[DEVLINK_ATTR_HEALTH_REPORTER_GRACEFUL_PERIOD])
+		reporter->graceful_period =
+			nla_get_u64(info->attrs[DEVLINK_ATTR_HEALTH_REPORTER_GRACEFUL_PERIOD]);
+
+	if (info->attrs[DEVLINK_ATTR_HEALTH_REPORTER_AUTO_RECOVER])
+		reporter->auto_recover =
+			nla_get_u8(info->attrs[DEVLINK_ATTR_HEALTH_REPORTER_AUTO_RECOVER]);
+
+	return 0;
+}
+
 static const struct nla_policy devlink_nl_policy[DEVLINK_ATTR_MAX + 1] = {
 	[DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING },
 	[DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING },
@@ -4486,6 +4513,8 @@ static const struct nla_policy devlink_nl_policy[DEVLINK_ATTR_MAX + 1] = {
 	[DEVLINK_ATTR_REGION_NAME] = { .type = NLA_NUL_STRING },
 	[DEVLINK_ATTR_REGION_SNAPSHOT_ID] = { .type = NLA_U32 },
 	[DEVLINK_ATTR_HEALTH_REPORTER_NAME] = { .type = NLA_NUL_STRING },
+	[DEVLINK_ATTR_HEALTH_REPORTER_GRACEFUL_PERIOD] = { .type = NLA_U64 },
+	[DEVLINK_ATTR_HEALTH_REPORTER_AUTO_RECOVER] = { .type = NLA_U8 },
 };
 
 static const struct genl_ops devlink_nl_ops[] = {
@@ -4714,6 +4743,13 @@ static const struct genl_ops devlink_nl_ops[] = {
 		.internal_flags = DEVLINK_NL_FLAG_NEED_DEVLINK,
 		/* can be retrieved by unprivileged users */
 	},
+	{
+		.cmd = DEVLINK_CMD_HEALTH_REPORTER_SET,
+		.doit = devlink_nl_cmd_health_reporter_set_doit,
+		.policy = devlink_nl_policy,
+		.flags = GENL_ADMIN_PERM,
+		.internal_flags = DEVLINK_NL_FLAG_NEED_DEVLINK,
+	},
 };
 
 static struct genl_family devlink_nl_family __ro_after_init = {
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH net-next v2 06/11] devlink: Add health recover command
  2019-01-17 21:59 [PATCH net-next v2 00/11] Devlink health reporting and recovery system Eran Ben Elisha
                   ` (4 preceding siblings ...)
  2019-01-17 21:59 ` [PATCH net-next v2 05/11] devlink: Add health set command Eran Ben Elisha
@ 2019-01-17 21:59 ` Eran Ben Elisha
  2019-01-20 11:33   ` Jiri Pirko
  2019-01-17 21:59 ` [PATCH net-next v2 07/11] devlink: Add health diagnose command Eran Ben Elisha
                   ` (6 subsequent siblings)
  12 siblings, 1 reply; 36+ messages in thread
From: Eran Ben Elisha @ 2019-01-17 21:59 UTC (permalink / raw)
  To: netdev, Jiri Pirko, David S. Miller
  Cc: Ariel Almog, Aya Levin, Eran Ben Elisha, Moshe Shemesh

Add devlink health recover command to the uapi, in order to allow the user
to execute a recover operation over a specific reporter.

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
---
 include/uapi/linux/devlink.h |  1 +
 net/core/devlink.c           | 20 ++++++++++++++++++++
 2 files changed, 21 insertions(+)

diff --git a/include/uapi/linux/devlink.h b/include/uapi/linux/devlink.h
index 49ad5a76b121..1c186fd77343 100644
--- a/include/uapi/linux/devlink.h
+++ b/include/uapi/linux/devlink.h
@@ -91,6 +91,7 @@ enum devlink_command {
 
 	DEVLINK_CMD_HEALTH_REPORTER_GET,
 	DEVLINK_CMD_HEALTH_REPORTER_SET,
+	DEVLINK_CMD_HEALTH_REPORTER_RECOVER,
 
 	/* add new commands above here */
 	__DEVLINK_CMD_MAX,
diff --git a/net/core/devlink.c b/net/core/devlink.c
index a34414bf1c27..b224d0d31c0c 100644
--- a/net/core/devlink.c
+++ b/net/core/devlink.c
@@ -4487,6 +4487,19 @@ devlink_nl_cmd_health_reporter_set_doit(struct sk_buff *skb,
 	return 0;
 }
 
+static int devlink_nl_cmd_health_reporter_recover_doit(struct sk_buff *skb,
+						       struct genl_info *info)
+{
+	struct devlink *devlink = info->user_ptr[0];
+	struct devlink_health_reporter *reporter;
+
+	reporter = devlink_health_reporter_get_from_info(devlink, info);
+	if (!reporter)
+		return -EINVAL;
+
+	return devlink_health_reporter_recover(reporter, NULL);
+}
+
 static const struct nla_policy devlink_nl_policy[DEVLINK_ATTR_MAX + 1] = {
 	[DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING },
 	[DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING },
@@ -4750,6 +4763,13 @@ static const struct genl_ops devlink_nl_ops[] = {
 		.flags = GENL_ADMIN_PERM,
 		.internal_flags = DEVLINK_NL_FLAG_NEED_DEVLINK,
 	},
+	{
+		.cmd = DEVLINK_CMD_HEALTH_REPORTER_RECOVER,
+		.doit = devlink_nl_cmd_health_reporter_recover_doit,
+		.policy = devlink_nl_policy,
+		.flags = GENL_ADMIN_PERM,
+		.internal_flags = DEVLINK_NL_FLAG_NEED_DEVLINK,
+	},
 };
 
 static struct genl_family devlink_nl_family __ro_after_init = {
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH net-next v2 07/11] devlink: Add health diagnose command
  2019-01-17 21:59 [PATCH net-next v2 00/11] Devlink health reporting and recovery system Eran Ben Elisha
                   ` (5 preceding siblings ...)
  2019-01-17 21:59 ` [PATCH net-next v2 06/11] devlink: Add health recover command Eran Ben Elisha
@ 2019-01-17 21:59 ` Eran Ben Elisha
  2019-01-20 11:38   ` Jiri Pirko
  2019-01-17 21:59 ` [PATCH net-next v2 08/11] devlink: Add health dump {get,clear} commands Eran Ben Elisha
                   ` (5 subsequent siblings)
  12 siblings, 1 reply; 36+ messages in thread
From: Eran Ben Elisha @ 2019-01-17 21:59 UTC (permalink / raw)
  To: netdev, Jiri Pirko, David S. Miller
  Cc: Ariel Almog, Aya Levin, Eran Ben Elisha, Moshe Shemesh

Add devlink health diagnose command, in order to run a diagnose
operation over a specific reporter.

It is expected from driver's callback for diagnose command to fill it
via the buffer descriptors API. Devlink will parse it and convert it to
netlink nla API in order to pass it to the user.

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
---
 include/uapi/linux/devlink.h |  1 +
 net/core/devlink.c           | 51 ++++++++++++++++++++++++++++++++++++
 2 files changed, 52 insertions(+)

diff --git a/include/uapi/linux/devlink.h b/include/uapi/linux/devlink.h
index 1c186fd77343..51b4d7612cf8 100644
--- a/include/uapi/linux/devlink.h
+++ b/include/uapi/linux/devlink.h
@@ -92,6 +92,7 @@ enum devlink_command {
 	DEVLINK_CMD_HEALTH_REPORTER_GET,
 	DEVLINK_CMD_HEALTH_REPORTER_SET,
 	DEVLINK_CMD_HEALTH_REPORTER_RECOVER,
+	DEVLINK_CMD_HEALTH_REPORTER_DIAGNOSE,
 
 	/* add new commands above here */
 	__DEVLINK_CMD_MAX,
diff --git a/net/core/devlink.c b/net/core/devlink.c
index b224d0d31c0c..57252ca31e1e 100644
--- a/net/core/devlink.c
+++ b/net/core/devlink.c
@@ -4500,6 +4500,50 @@ static int devlink_nl_cmd_health_reporter_recover_doit(struct sk_buff *skb,
 	return devlink_health_reporter_recover(reporter, NULL);
 }
 
+static int devlink_nl_cmd_health_reporter_diagnose_doit(struct sk_buff *skb,
+							struct genl_info *info)
+{
+	struct devlink *devlink = info->user_ptr[0];
+	struct devlink_health_reporter *reporter;
+	u64 num_of_buffers;
+	int err;
+
+	reporter = devlink_health_reporter_get_from_info(devlink, info);
+	if (!reporter)
+		return -EINVAL;
+
+	if (!reporter->ops->diagnose)
+		return -EOPNOTSUPP;
+
+	num_of_buffers =
+		DEVLINK_HEALTH_SIZE_TO_BUFFERS(reporter->ops->diagnose_size);
+
+	mutex_lock(&reporter->diagnose_lock);
+	devlink_health_buffers_reset(reporter->diagnose_buffers_array,
+				     num_of_buffers);
+
+	err = reporter->ops->diagnose(reporter,
+				      reporter->diagnose_buffers_array,
+				      DEVLINK_HEALTH_BUFFER_SIZE,
+				      num_of_buffers);
+	if (err)
+		goto out;
+
+	err = devlink_health_buffer_snd(info,
+					DEVLINK_CMD_HEALTH_REPORTER_DIAGNOSE,
+					0, reporter->diagnose_buffers_array,
+					num_of_buffers);
+	if (err)
+		goto out;
+
+	mutex_unlock(&reporter->diagnose_lock);
+	return 0;
+
+out:
+	mutex_unlock(&reporter->diagnose_lock);
+	return err;
+}
+
 static const struct nla_policy devlink_nl_policy[DEVLINK_ATTR_MAX + 1] = {
 	[DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING },
 	[DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING },
@@ -4770,6 +4814,13 @@ static const struct genl_ops devlink_nl_ops[] = {
 		.flags = GENL_ADMIN_PERM,
 		.internal_flags = DEVLINK_NL_FLAG_NEED_DEVLINK,
 	},
+	{
+		.cmd = DEVLINK_CMD_HEALTH_REPORTER_DIAGNOSE,
+		.doit = devlink_nl_cmd_health_reporter_diagnose_doit,
+		.policy = devlink_nl_policy,
+		.flags = GENL_ADMIN_PERM,
+		.internal_flags = DEVLINK_NL_FLAG_NEED_DEVLINK,
+	},
 };
 
 static struct genl_family devlink_nl_family __ro_after_init = {
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH net-next v2 08/11] devlink: Add health dump {get,clear} commands
  2019-01-17 21:59 [PATCH net-next v2 00/11] Devlink health reporting and recovery system Eran Ben Elisha
                   ` (6 preceding siblings ...)
  2019-01-17 21:59 ` [PATCH net-next v2 07/11] devlink: Add health diagnose command Eran Ben Elisha
@ 2019-01-17 21:59 ` Eran Ben Elisha
  2019-01-17 21:59 ` [PATCH net-next v2 09/11] net/mlx5e: Add TX reporter support Eran Ben Elisha
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 36+ messages in thread
From: Eran Ben Elisha @ 2019-01-17 21:59 UTC (permalink / raw)
  To: netdev, Jiri Pirko, David S. Miller
  Cc: Ariel Almog, Aya Levin, Eran Ben Elisha, Moshe Shemesh

Add devlink health dump commands, in order to run an dump operation
over a specific reporter.

The supported operations are dump_get in order to get last saved
dump (if not exist, dump now) and dump_clear to clear last saved
dump.

It is expected from driver's callback for diagnose command to fill it
via the buffer descriptors API. Devlink will parse it and convert it to
netlink nla API in order to pass it to the user.

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
---
 include/uapi/linux/devlink.h |  2 +
 net/core/devlink.c           | 75 ++++++++++++++++++++++++++++++++++++
 2 files changed, 77 insertions(+)

diff --git a/include/uapi/linux/devlink.h b/include/uapi/linux/devlink.h
index 51b4d7612cf8..6b26bb2ce4dc 100644
--- a/include/uapi/linux/devlink.h
+++ b/include/uapi/linux/devlink.h
@@ -93,6 +93,8 @@ enum devlink_command {
 	DEVLINK_CMD_HEALTH_REPORTER_SET,
 	DEVLINK_CMD_HEALTH_REPORTER_RECOVER,
 	DEVLINK_CMD_HEALTH_REPORTER_DIAGNOSE,
+	DEVLINK_CMD_HEALTH_REPORTER_DUMP_GET,
+	DEVLINK_CMD_HEALTH_REPORTER_DUMP_CLEAR,
 
 	/* add new commands above here */
 	__DEVLINK_CMD_MAX,
diff --git a/net/core/devlink.c b/net/core/devlink.c
index 57252ca31e1e..60248a53c0ad 100644
--- a/net/core/devlink.c
+++ b/net/core/devlink.c
@@ -4544,6 +4544,65 @@ static int devlink_nl_cmd_health_reporter_diagnose_doit(struct sk_buff *skb,
 	return err;
 }
 
+static void
+devlink_health_dump_clear(struct devlink_health_reporter *reporter)
+{
+	reporter->dump_avail = false;
+	reporter->dump_ts = 0;
+	devlink_health_buffers_reset(reporter->dump_buffers_array,
+				     DEVLINK_HEALTH_SIZE_TO_BUFFERS(reporter->ops->dump_size));
+}
+
+static int devlink_nl_cmd_health_reporter_dump_get_doit(struct sk_buff *skb,
+							struct genl_info *info)
+{
+	struct devlink *devlink = info->user_ptr[0];
+	struct devlink_health_reporter *reporter;
+	u64 num_of_buffers;
+	int err;
+
+	reporter = devlink_health_reporter_get_from_info(devlink, info);
+	if (!reporter)
+		return -EINVAL;
+
+	if (!reporter->ops->dump)
+		return -EOPNOTSUPP;
+
+	num_of_buffers =
+		DEVLINK_HEALTH_SIZE_TO_BUFFERS(reporter->ops->dump_size);
+
+	mutex_lock(&reporter->dump_lock);
+	err = devlink_health_do_dump(reporter, NULL);
+	if (err)
+		goto out;
+
+	err = devlink_health_buffer_snd(info,
+					DEVLINK_CMD_HEALTH_REPORTER_DUMP_GET,
+					0, reporter->dump_buffers_array,
+					num_of_buffers);
+
+out:
+	mutex_unlock(&reporter->dump_lock);
+	return err;
+}
+
+static int
+devlink_nl_cmd_health_reporter_dump_clear_doit(struct sk_buff *skb,
+					       struct genl_info *info)
+{
+	struct devlink *devlink = info->user_ptr[0];
+	struct devlink_health_reporter *reporter;
+
+	reporter = devlink_health_reporter_get_from_info(devlink, info);
+	if (!reporter)
+		return -EINVAL;
+
+	mutex_lock(&reporter->dump_lock);
+	devlink_health_dump_clear(reporter);
+	mutex_unlock(&reporter->dump_lock);
+	return 0;
+}
+
 static const struct nla_policy devlink_nl_policy[DEVLINK_ATTR_MAX + 1] = {
 	[DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING },
 	[DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING },
@@ -4821,6 +4880,22 @@ static const struct genl_ops devlink_nl_ops[] = {
 		.flags = GENL_ADMIN_PERM,
 		.internal_flags = DEVLINK_NL_FLAG_NEED_DEVLINK,
 	},
+	{
+		.cmd = DEVLINK_CMD_HEALTH_REPORTER_DUMP_GET,
+		.doit = devlink_nl_cmd_health_reporter_dump_get_doit,
+		.policy = devlink_nl_policy,
+		.flags = GENL_ADMIN_PERM,
+		.internal_flags = DEVLINK_NL_FLAG_NEED_DEVLINK |
+				  DEVLINK_NL_FLAG_NO_LOCK,
+	},
+	{
+		.cmd = DEVLINK_CMD_HEALTH_REPORTER_DUMP_CLEAR,
+		.doit = devlink_nl_cmd_health_reporter_dump_clear_doit,
+		.policy = devlink_nl_policy,
+		.flags = GENL_ADMIN_PERM,
+		.internal_flags = DEVLINK_NL_FLAG_NEED_DEVLINK |
+				  DEVLINK_NL_FLAG_NO_LOCK,
+	},
 };
 
 static struct genl_family devlink_nl_family __ro_after_init = {
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH net-next v2 09/11] net/mlx5e: Add TX reporter support
  2019-01-17 21:59 [PATCH net-next v2 00/11] Devlink health reporting and recovery system Eran Ben Elisha
                   ` (7 preceding siblings ...)
  2019-01-17 21:59 ` [PATCH net-next v2 08/11] devlink: Add health dump {get,clear} commands Eran Ben Elisha
@ 2019-01-17 21:59 ` Eran Ben Elisha
  2019-01-20 11:06   ` Jiri Pirko
  2019-01-17 21:59 ` [PATCH net-next v2 10/11] net/mlx5e: Add TX timeout support for mlx5e TX reporter Eran Ben Elisha
                   ` (3 subsequent siblings)
  12 siblings, 1 reply; 36+ messages in thread
From: Eran Ben Elisha @ 2019-01-17 21:59 UTC (permalink / raw)
  To: netdev, Jiri Pirko, David S. Miller
  Cc: Ariel Almog, Aya Levin, Eran Ben Elisha, Moshe Shemesh

Add mlx5e tx reporter to devlink health reporters. This reporter will be
responsible for diagnosing, reporting and recovering of TX errors.
This patch declares the TX reporter operations and allocate it using the
devlink health API. Currently, this reporter supports reporting and
recovering from send error CQE only. In addition, it adds diagnose
information for the open SQs.

For a local SQ recover (due to driver error report), in case of SQ recover
failure, the recover operation will be considered as a failure.
For a full TX recover, an attempt to close and open the channels will be
done. If this one passed successfully, it will be considered as a
successful recover.

The SQ recover from error CQE flow is not a new feature in the driver,
this patch re-organize the functions and adapt them for the devlink
health API. For this purpose, move code from en_main.c to a new file
named reporter_tx.c.

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
---
 .../net/ethernet/mellanox/mlx5/core/Makefile  |   2 +-
 drivers/net/ethernet/mellanox/mlx5/core/en.h  |  18 +-
 .../ethernet/mellanox/mlx5/core/en/reporter.h |  14 +
 .../mellanox/mlx5/core/en/reporter_tx.c       | 321 ++++++++++++++++++
 .../net/ethernet/mellanox/mlx5/core/en_main.c | 135 +-------
 .../net/ethernet/mellanox/mlx5/core/en_tx.c   |   2 +-
 6 files changed, 367 insertions(+), 125 deletions(-)
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en/reporter.h
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Makefile b/drivers/net/ethernet/mellanox/mlx5/core/Makefile
index 9de9abacf7f6..6bb2a860b15b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/Makefile
+++ b/drivers/net/ethernet/mellanox/mlx5/core/Makefile
@@ -22,7 +22,7 @@ mlx5_core-y :=	main.o cmd.o debugfs.o fw.o eq.o uar.o pagealloc.o \
 #
 mlx5_core-$(CONFIG_MLX5_CORE_EN) += en_main.o en_common.o en_fs.o en_ethtool.o \
 		en_tx.o en_rx.o en_dim.o en_txrx.o en/xdp.o en_stats.o \
-		en_selftest.o en/port.o en/monitor_stats.o
+		en_selftest.o en/port.o en/monitor_stats.o en/reporter_tx.o
 
 #
 # Netdev extra
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 8fa8fdd30b85..27e276c9bf84 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -388,10 +388,7 @@ struct mlx5e_txqsq {
 	struct mlx5e_channel      *channel;
 	int                        txq_ix;
 	u32                        rate_limit;
-	struct mlx5e_txqsq_recover {
-		struct work_struct         recover_work;
-		u64                        last_recover;
-	} recover;
+	struct work_struct         recover_work;
 } ____cacheline_aligned_in_smp;
 
 struct mlx5e_dma_info {
@@ -682,6 +679,13 @@ struct mlx5e_rss_params {
 	u8	hfunc;
 };
 
+struct mlx5e_modify_sq_param {
+	int curr_state;
+	int next_state;
+	int rl_update;
+	int rl_index;
+};
+
 struct mlx5e_priv {
 	/* priv data path fields - start */
 	struct mlx5e_txqsq *txq2sq[MLX5E_MAX_NUM_CHANNELS * MLX5E_MAX_NUM_TC];
@@ -737,6 +741,7 @@ struct mlx5e_priv {
 #ifdef CONFIG_MLX5_EN_TLS
 	struct mlx5e_tls          *tls;
 #endif
+	struct devlink_health_reporter *tx_reporter;
 };
 
 struct mlx5e_profile {
@@ -866,6 +871,11 @@ void mlx5e_set_rq_type(struct mlx5_core_dev *mdev, struct mlx5e_params *params);
 void mlx5e_init_rq_type_params(struct mlx5_core_dev *mdev,
 			       struct mlx5e_params *params);
 
+int mlx5e_modify_sq(struct mlx5_core_dev *mdev, u32 sqn,
+		    struct mlx5e_modify_sq_param *p);
+void mlx5e_activate_txqsq(struct mlx5e_txqsq *sq);
+void mlx5e_tx_disable_queue(struct netdev_queue *txq);
+
 static inline bool mlx5e_tunnel_inner_ft_supported(struct mlx5_core_dev *mdev)
 {
 	return (MLX5_CAP_ETH(mdev, tunnel_stateless_gre) &&
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter.h b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter.h
new file mode 100644
index 000000000000..74083e120ab3
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
+/* Copyright (c) 2018 Mellanox Technologies. */
+
+#ifndef __MLX5E_EN_REPORTER_H
+#define __MLX5E_EN_REPORTER_H
+
+#include <linux/mlx5/driver.h>
+#include "en.h"
+
+int mlx5e_tx_reporter_create(struct mlx5e_priv *priv);
+void mlx5e_tx_reporter_destroy(struct mlx5e_priv *priv);
+void mlx5e_tx_reporter_err_cqe(struct mlx5e_txqsq *sq);
+
+#endif
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c
new file mode 100644
index 000000000000..9800df4909c2
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c
@@ -0,0 +1,321 @@
+/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
+/* Copyright (c) 2018 Mellanox Technologies. */
+
+#include <net/devlink.h>
+#include "reporter.h"
+#include "lib/eq.h"
+
+#define MLX5E_TX_REPORTER_PER_SQ_MAX_LEN 256
+
+struct mlx5e_tx_err_ctx {
+	int (*recover)(struct mlx5e_txqsq *sq);
+	struct mlx5e_txqsq *sq;
+};
+
+static int mlx5e_wait_for_sq_flush(struct mlx5e_txqsq *sq)
+{
+	unsigned long exp_time = jiffies + msecs_to_jiffies(2000);
+
+	while (time_before(jiffies, exp_time)) {
+		if (sq->cc == sq->pc)
+			return 0;
+
+		msleep(20);
+	}
+
+	netdev_err(sq->channel->netdev,
+		   "Wait for SQ 0x%x flush timeout (sq cc = 0x%x, sq pc = 0x%x)\n",
+		   sq->sqn, sq->cc, sq->pc);
+
+	return -ETIMEDOUT;
+}
+
+static void mlx5e_reset_txqsq_cc_pc(struct mlx5e_txqsq *sq)
+{
+	WARN_ONCE(sq->cc != sq->pc,
+		  "SQ 0x%x: cc (0x%x) != pc (0x%x)\n",
+		  sq->sqn, sq->cc, sq->pc);
+	sq->cc = 0;
+	sq->dma_fifo_cc = 0;
+	sq->pc = 0;
+}
+
+static int mlx5e_sq_to_ready(struct mlx5e_txqsq *sq, int curr_state)
+{
+	struct mlx5_core_dev *mdev = sq->channel->mdev;
+	struct net_device *dev = sq->channel->netdev;
+	struct mlx5e_modify_sq_param msp = {0};
+	int err;
+
+	msp.curr_state = curr_state;
+	msp.next_state = MLX5_SQC_STATE_RST;
+
+	err = mlx5e_modify_sq(mdev, sq->sqn, &msp);
+	if (err) {
+		netdev_err(dev, "Failed to move sq 0x%x to reset\n", sq->sqn);
+		return err;
+	}
+
+	memset(&msp, 0, sizeof(msp));
+	msp.curr_state = MLX5_SQC_STATE_RST;
+	msp.next_state = MLX5_SQC_STATE_RDY;
+
+	err = mlx5e_modify_sq(mdev, sq->sqn, &msp);
+	if (err) {
+		netdev_err(dev, "Failed to move sq 0x%x to ready\n", sq->sqn);
+		return err;
+	}
+
+	return 0;
+}
+
+static int mlx5e_tx_reporter_err_cqe_recover(struct mlx5e_txqsq *sq)
+{
+	struct mlx5_core_dev *mdev = sq->channel->mdev;
+	struct net_device *dev = sq->channel->netdev;
+	u8 state;
+	int err;
+
+	if (!test_bit(MLX5E_SQ_STATE_RECOVERING, &sq->state))
+		return 0;
+
+	err = mlx5_core_query_sq_state(mdev, sq->sqn, &state);
+	if (err) {
+		netdev_err(dev, "Failed to query SQ 0x%x state. err = %d\n",
+			   sq->sqn, err);
+		return err;
+	}
+
+	if (state != MLX5_RQC_STATE_ERR) {
+		netdev_err(dev, "SQ 0x%x not in ERROR state\n", sq->sqn);
+		return -EINVAL;
+	}
+
+	mlx5e_tx_disable_queue(sq->txq);
+
+	err = mlx5e_wait_for_sq_flush(sq);
+	if (err)
+		return err;
+
+	/* At this point, no new packets will arrive from the stack as TXQ is
+	 * marked with QUEUE_STATE_DRV_XOFF. In addition, NAPI cleared all
+	 * pending WQEs.  SQ can safely reset the SQ.
+	 */
+
+	err = mlx5e_sq_to_ready(sq, state);
+	if (err)
+		return err;
+
+	mlx5e_reset_txqsq_cc_pc(sq);
+	sq->stats->recover++;
+	mlx5e_activate_txqsq(sq);
+
+	return 0;
+}
+
+void mlx5e_tx_reporter_err_cqe(struct mlx5e_txqsq *sq)
+{
+	char err_str[MLX5E_TX_REPORTER_PER_SQ_MAX_LEN];
+	struct mlx5e_tx_err_ctx err_ctx = {0};
+
+	err_ctx.sq       = sq;
+	err_ctx.recover  = mlx5e_tx_reporter_err_cqe_recover;
+	sprintf(err_str, "ERR CQE on SQ: 0x%x", sq->sqn);
+
+	devlink_health_report(sq->channel->priv->tx_reporter, err_str,
+			      &err_ctx);
+}
+
+/* state lock cannot be grabbed within this function.
+ * It can cause a dead lock or a read-after-free.
+ */
+int mlx5e_tx_reporter_recover_from_ctx(struct mlx5e_tx_err_ctx *err_ctx)
+{
+	return err_ctx->recover(err_ctx->sq);
+}
+
+static int mlx5e_tx_reporter_recover_all(struct mlx5e_priv *priv)
+{
+	int err;
+
+	mutex_lock(&priv->state_lock);
+	mlx5e_close_locked(priv->netdev);
+	err = mlx5e_open_locked(priv->netdev);
+	mutex_unlock(&priv->state_lock);
+
+	return err;
+}
+
+static int mlx5e_tx_reporter_recover(struct devlink_health_reporter *reporter,
+				     void *context)
+{
+	struct mlx5e_priv *priv = devlink_health_reporter_priv(reporter);
+	struct mlx5e_tx_err_ctx *err_ctx = context;
+
+	return err_ctx ? mlx5e_tx_reporter_recover_from_ctx(err_ctx) :
+			 mlx5e_tx_reporter_recover_all(priv);
+}
+
+static int
+mlx5e_tx_reporter_build_diagnose_output(struct devlink_health_buffer *buffer,
+					u32 sqn, u8 state, u8 stopped)
+{
+	int err, i;
+	int nest = 0;
+	char name[20];
+
+	err = devlink_health_buffer_nest_start(buffer,
+					       DEVLINK_ATTR_HEALTH_BUFFER_OBJECT);
+	if (err)
+		goto buffer_error;
+	nest++;
+
+	err = devlink_health_buffer_nest_start(buffer,
+					       DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_PAIR);
+	if (err)
+		goto buffer_error;
+	nest++;
+
+	sprintf(name, "SQ 0x%x", sqn);
+	err = devlink_health_buffer_put_object_name(buffer, name);
+	if (err)
+		goto buffer_error;
+
+	err = devlink_health_buffer_nest_start(buffer,
+					       DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_VALUE);
+	if (err)
+		goto buffer_error;
+	nest++;
+
+	err = devlink_health_buffer_nest_start(buffer,
+					       DEVLINK_ATTR_HEALTH_BUFFER_OBJECT);
+	if (err)
+		goto buffer_error;
+	nest++;
+
+	err = devlink_health_buffer_nest_start(buffer,
+					       DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_PAIR);
+	if (err)
+		goto buffer_error;
+	nest++;
+
+	err = devlink_health_buffer_put_object_name(buffer, "HW state");
+	if (err)
+		goto buffer_error;
+
+	err = devlink_health_buffer_nest_start(buffer,
+					       DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_VALUE);
+	if (err)
+		goto buffer_error;
+	nest++;
+
+	err = devlink_health_buffer_put_value_u8(buffer, state);
+	if (err)
+		goto buffer_error;
+
+	devlink_health_buffer_nest_end(buffer); /* DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_VALUE */
+	nest--;
+
+	devlink_health_buffer_nest_end(buffer); /* DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_PAIR */
+	nest--;
+
+	err = devlink_health_buffer_nest_start(buffer,
+					       DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_PAIR);
+	if (err)
+		goto buffer_error;
+	nest++;
+
+	err = devlink_health_buffer_put_object_name(buffer, "stopped");
+	if (err)
+		goto buffer_error;
+
+	err = devlink_health_buffer_nest_start(buffer,
+					       DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_VALUE);
+	if (err)
+		goto buffer_error;
+	nest++;
+
+	err = devlink_health_buffer_put_value_u8(buffer, stopped);
+	if (err)
+		goto buffer_error;
+
+	for (i = 0; i < nest; i++)
+		devlink_health_buffer_nest_end(buffer);
+
+	return 0;
+
+buffer_error:
+	for (i = 0; i < nest; i++)
+		devlink_health_buffer_nest_cancel(buffer);
+	return err;
+}
+
+static int mlx5e_tx_reporter_diagnose(struct devlink_health_reporter *reporter,
+				      struct devlink_health_buffer **buffers_array,
+				      unsigned int buffer_size,
+				      unsigned int num_buffers)
+{
+	struct mlx5e_priv *priv = devlink_health_reporter_priv(reporter);
+	unsigned int buff = 0;
+	int i = 0, err = 0;
+
+	if (buffer_size < MLX5E_TX_REPORTER_PER_SQ_MAX_LEN)
+		return -ENOMEM;
+
+	mutex_lock(&priv->state_lock);
+
+	if (!test_bit(MLX5E_STATE_OPENED, &priv->state)) {
+		mutex_unlock(&priv->state_lock);
+		return 0;
+	}
+
+	while (i < priv->channels.num * priv->channels.params.num_tc) {
+		struct mlx5e_txqsq *sq = priv->txq2sq[i];
+		u8 state;
+
+		err = mlx5_core_query_sq_state(priv->mdev, sq->sqn, &state);
+		if (err)
+			break;
+
+		err = mlx5e_tx_reporter_build_diagnose_output(buffers_array[buff],
+							      sq->sqn, state,
+							      netif_xmit_stopped(sq->txq));
+		if (err) {
+			if (++buff == num_buffers)
+				break;
+		} else {
+			i++;
+		}
+	}
+
+	mutex_unlock(&priv->state_lock);
+	return err;
+}
+
+static const struct devlink_health_reporter_ops mlx5_tx_reporter_ops = {
+		.name = "TX",
+		.recover = mlx5e_tx_reporter_recover,
+		.diagnose_size = MLX5E_MAX_NUM_CHANNELS * MLX5E_MAX_NUM_TC *
+				 MLX5E_TX_REPORTER_PER_SQ_MAX_LEN,
+		.diagnose = mlx5e_tx_reporter_diagnose,
+		.dump_size = 0,
+		.dump = NULL,
+};
+
+#define MLX5_REPORTER_TX_GRACEFUL_PERIOD 500
+int mlx5e_tx_reporter_create(struct mlx5e_priv *priv)
+{
+	struct mlx5_core_dev *mdev = priv->mdev;
+	struct devlink *devlink = priv_to_devlink(mdev);
+
+	priv->tx_reporter =
+		devlink_health_reporter_create(devlink, &mlx5_tx_reporter_ops,
+					       MLX5_REPORTER_TX_GRACEFUL_PERIOD,
+					       true, priv);
+	return PTR_ERR_OR_ZERO(priv->tx_reporter);
+}
+
+void mlx5e_tx_reporter_destroy(struct mlx5e_priv *priv)
+{
+	devlink_health_reporter_destroy(priv->tx_reporter);
+}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 8cfd2ec7c0a2..4c4507384fed 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -51,6 +51,7 @@
 #include "en/xdp.h"
 #include "lib/eq.h"
 #include "en/monitor_stats.h"
+#include "en/reporter.h"
 
 struct mlx5e_rq_param {
 	u32			rqc[MLX5_ST_SZ_DW(rqc)];
@@ -1160,7 +1161,7 @@ static int mlx5e_alloc_txqsq_db(struct mlx5e_txqsq *sq, int numa)
 	return 0;
 }
 
-static void mlx5e_sq_recover(struct work_struct *work);
+static void mlx5e_tx_err_cqe_work(struct work_struct *recover_work);
 static int mlx5e_alloc_txqsq(struct mlx5e_channel *c,
 			     int txq_ix,
 			     struct mlx5e_params *params,
@@ -1182,7 +1183,7 @@ static int mlx5e_alloc_txqsq(struct mlx5e_channel *c,
 	sq->uar_map   = mdev->mlx5e_res.bfreg.map;
 	sq->min_inline_mode = params->tx_min_inline_mode;
 	sq->stats     = &c->priv->channel_stats[c->ix].sq[tc];
-	INIT_WORK(&sq->recover.recover_work, mlx5e_sq_recover);
+	INIT_WORK(&sq->recover_work, mlx5e_tx_err_cqe_work);
 	if (MLX5_IPSEC_DEV(c->priv->mdev))
 		set_bit(MLX5E_SQ_STATE_IPSEC, &sq->state);
 	if (mlx5_accel_is_tls_device(c->priv->mdev))
@@ -1270,15 +1271,8 @@ static int mlx5e_create_sq(struct mlx5_core_dev *mdev,
 	return err;
 }
 
-struct mlx5e_modify_sq_param {
-	int curr_state;
-	int next_state;
-	bool rl_update;
-	int rl_index;
-};
-
-static int mlx5e_modify_sq(struct mlx5_core_dev *mdev, u32 sqn,
-			   struct mlx5e_modify_sq_param *p)
+int mlx5e_modify_sq(struct mlx5_core_dev *mdev, u32 sqn,
+		    struct mlx5e_modify_sq_param *p)
 {
 	void *in;
 	void *sqc;
@@ -1376,17 +1370,7 @@ static int mlx5e_open_txqsq(struct mlx5e_channel *c,
 	return err;
 }
 
-static void mlx5e_reset_txqsq_cc_pc(struct mlx5e_txqsq *sq)
-{
-	WARN_ONCE(sq->cc != sq->pc,
-		  "SQ 0x%x: cc (0x%x) != pc (0x%x)\n",
-		  sq->sqn, sq->cc, sq->pc);
-	sq->cc = 0;
-	sq->dma_fifo_cc = 0;
-	sq->pc = 0;
-}
-
-static void mlx5e_activate_txqsq(struct mlx5e_txqsq *sq)
+void mlx5e_activate_txqsq(struct mlx5e_txqsq *sq)
 {
 	sq->txq = netdev_get_tx_queue(sq->channel->netdev, sq->txq_ix);
 	clear_bit(MLX5E_SQ_STATE_RECOVERING, &sq->state);
@@ -1395,7 +1379,7 @@ static void mlx5e_activate_txqsq(struct mlx5e_txqsq *sq)
 	netif_tx_start_queue(sq->txq);
 }
 
-static inline void netif_tx_disable_queue(struct netdev_queue *txq)
+void mlx5e_tx_disable_queue(struct netdev_queue *txq)
 {
 	__netif_tx_lock_bh(txq);
 	netif_tx_stop_queue(txq);
@@ -1411,7 +1395,7 @@ static void mlx5e_deactivate_txqsq(struct mlx5e_txqsq *sq)
 	/* prevent netif_tx_wake_queue */
 	napi_synchronize(&c->napi);
 
-	netif_tx_disable_queue(sq->txq);
+	mlx5e_tx_disable_queue(sq->txq);
 
 	/* last doorbell out, godspeed .. */
 	if (mlx5e_wqc_has_room_for(wq, sq->cc, sq->pc, 1)) {
@@ -1431,6 +1415,7 @@ static void mlx5e_close_txqsq(struct mlx5e_txqsq *sq)
 	struct mlx5_rate_limit rl = {0};
 
 	cancel_work_sync(&sq->dim.work);
+	cancel_work_sync(&sq->recover_work);
 	mlx5e_destroy_sq(mdev, sq->sqn);
 	if (sq->rate_limit) {
 		rl.rate = sq->rate_limit;
@@ -1440,105 +1425,15 @@ static void mlx5e_close_txqsq(struct mlx5e_txqsq *sq)
 	mlx5e_free_txqsq(sq);
 }
 
-static int mlx5e_wait_for_sq_flush(struct mlx5e_txqsq *sq)
+static void mlx5e_tx_err_cqe_work(struct work_struct *recover_work)
 {
-	unsigned long exp_time = jiffies + msecs_to_jiffies(2000);
-
-	while (time_before(jiffies, exp_time)) {
-		if (sq->cc == sq->pc)
-			return 0;
-
-		msleep(20);
-	}
-
-	netdev_err(sq->channel->netdev,
-		   "Wait for SQ 0x%x flush timeout (sq cc = 0x%x, sq pc = 0x%x)\n",
-		   sq->sqn, sq->cc, sq->pc);
-
-	return -ETIMEDOUT;
-}
-
-static int mlx5e_sq_to_ready(struct mlx5e_txqsq *sq, int curr_state)
-{
-	struct mlx5_core_dev *mdev = sq->channel->mdev;
-	struct net_device *dev = sq->channel->netdev;
-	struct mlx5e_modify_sq_param msp = {0};
-	int err;
-
-	msp.curr_state = curr_state;
-	msp.next_state = MLX5_SQC_STATE_RST;
-
-	err = mlx5e_modify_sq(mdev, sq->sqn, &msp);
-	if (err) {
-		netdev_err(dev, "Failed to move sq 0x%x to reset\n", sq->sqn);
-		return err;
-	}
-
-	memset(&msp, 0, sizeof(msp));
-	msp.curr_state = MLX5_SQC_STATE_RST;
-	msp.next_state = MLX5_SQC_STATE_RDY;
+	struct mlx5e_txqsq *sq = container_of(recover_work, struct mlx5e_txqsq,
+					      recover_work);
 
-	err = mlx5e_modify_sq(mdev, sq->sqn, &msp);
-	if (err) {
-		netdev_err(dev, "Failed to move sq 0x%x to ready\n", sq->sqn);
-		return err;
-	}
-
-	return 0;
-}
-
-static void mlx5e_sq_recover(struct work_struct *work)
-{
-	struct mlx5e_txqsq_recover *recover =
-		container_of(work, struct mlx5e_txqsq_recover,
-			     recover_work);
-	struct mlx5e_txqsq *sq = container_of(recover, struct mlx5e_txqsq,
-					      recover);
-	struct mlx5_core_dev *mdev = sq->channel->mdev;
-	struct net_device *dev = sq->channel->netdev;
-	u8 state;
-	int err;
-
-	err = mlx5_core_query_sq_state(mdev, sq->sqn, &state);
-	if (err) {
-		netdev_err(dev, "Failed to query SQ 0x%x state. err = %d\n",
-			   sq->sqn, err);
-		return;
-	}
-
-	if (state != MLX5_RQC_STATE_ERR) {
-		netdev_err(dev, "SQ 0x%x not in ERROR state\n", sq->sqn);
-		return;
-	}
-
-	netif_tx_disable_queue(sq->txq);
-
-	if (mlx5e_wait_for_sq_flush(sq))
-		return;
-
-	/* If the interval between two consecutive recovers per SQ is too
-	 * short, don't recover to avoid infinite loop of ERR_CQE -> recover.
-	 * If we reached this state, there is probably a bug that needs to be
-	 * fixed. let's keep the queue close and let tx timeout cleanup.
-	 */
-	if (jiffies_to_msecs(jiffies - recover->last_recover) <
-	    MLX5E_SQ_RECOVER_MIN_INTERVAL) {
-		netdev_err(dev, "Recover SQ 0x%x canceled, too many error CQEs\n",
-			   sq->sqn);
-		return;
-	}
-
-	/* At this point, no new packets will arrive from the stack as TXQ is
-	 * marked with QUEUE_STATE_DRV_XOFF. In addition, NAPI cleared all
-	 * pending WQEs.  SQ can safely reset the SQ.
-	 */
-	if (mlx5e_sq_to_ready(sq, state))
+	if (!sq->channel->priv->tx_reporter)
 		return;
 
-	mlx5e_reset_txqsq_cc_pc(sq);
-	sq->stats->recover++;
-	recover->last_recover = jiffies;
-	mlx5e_activate_txqsq(sq);
+	mlx5e_tx_reporter_err_cqe(sq);
 }
 
 static int mlx5e_open_icosq(struct mlx5e_channel *c,
@@ -3207,6 +3102,7 @@ static void mlx5e_cleanup_nic_tx(struct mlx5e_priv *priv)
 {
 	int tc;
 
+	mlx5e_tx_reporter_destroy(priv);
 	for (tc = 0; tc < priv->profile->max_tc; tc++)
 		mlx5e_destroy_tis(priv->mdev, priv->tisn[tc]);
 }
@@ -4908,6 +4804,7 @@ static int mlx5e_init_nic_tx(struct mlx5e_priv *priv)
 #ifdef CONFIG_MLX5_CORE_EN_DCB
 	mlx5e_dcbnl_initialize(priv);
 #endif
+	mlx5e_tx_reporter_create(priv);
 	return 0;
 }
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
index 598ad7e4d5c9..a8e052a5ce36 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
@@ -514,7 +514,7 @@ bool mlx5e_poll_tx_cq(struct mlx5e_cq *cq, int napi_budget)
 				mlx5e_dump_error_cqe(sq,
 						     (struct mlx5_err_cqe *)cqe);
 				queue_work(cq->channel->priv->wq,
-					   &sq->recover.recover_work);
+					   &sq->recover_work);
 			}
 			stats->cqe_err++;
 		}
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH net-next v2 10/11] net/mlx5e: Add TX timeout support for mlx5e TX reporter
  2019-01-17 21:59 [PATCH net-next v2 00/11] Devlink health reporting and recovery system Eran Ben Elisha
                   ` (8 preceding siblings ...)
  2019-01-17 21:59 ` [PATCH net-next v2 09/11] net/mlx5e: Add TX reporter support Eran Ben Elisha
@ 2019-01-17 21:59 ` Eran Ben Elisha
  2019-01-17 21:59 ` [PATCH net-next v2 11/11] devlink: Add Documentation/networking/devlink-health.txt Eran Ben Elisha
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 36+ messages in thread
From: Eran Ben Elisha @ 2019-01-17 21:59 UTC (permalink / raw)
  To: netdev, Jiri Pirko, David S. Miller
  Cc: Ariel Almog, Aya Levin, Eran Ben Elisha, Moshe Shemesh

With this patch, ndo_tx_timeout callback will be redirected to the TX
reporter in order to detect a TX timeout error and report it to the
devlink health. (The watchdog detects TX timeouts, but the driver verify
the issue still exists before launching any recover method).

In addition, recover from TX timeout in case of lost interrupt was added
to the TX reporter recover method. The TX timeout recover from lost
interrupt is not a new feature in the driver, this patch re-organize the
functionality and move it to the TX reporter recovery flow.

TX timeout example:
(with auto_recover set to false, if set to true, the manual recover and
diagnose sections are irrelevant)

$cat /sys/kernel/debug/tracing/trace
...
devlink_health_report: bus_name=pci dev_name=0000:00:09.0
driver_name=mlx5_core reporter_name=TX: TX timeout on queue: 0, SQ: 0xd8a, CQ:
0x406, SQ Cons: 0x2 SQ Prod: 0x2, usecs since last trans: 13972000

$devlink health diagnose pci/0000:00:09 reporter TX
SQ 0xd8a: HW state: 1, stopped: 1
SQ 0xe44: HW state: 1, stopped: 0
SQ 0xeb4: HW state: 1, stopped: 0
SQ 0xf1f: HW state: 1, stopped: 0
SQ 0xf80: HW state: 1, stopped: 0
SQ 0xfe5: HW state: 1, stopped: 0

$devlink health recover pci/0000:00:09 reporter TX
$devlink health show
pci/0000:00:09.0:
  name TX state healthy #err 1 #recover 1 last_dump_ts N/A dump_available false
    attributes:
        grace_period 500 auto_recover false

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
---
 .../ethernet/mellanox/mlx5/core/en/reporter.h |  1 +
 .../mellanox/mlx5/core/en/reporter_tx.c       | 35 +++++++++++++
 .../net/ethernet/mellanox/mlx5/core/en_main.c | 51 +++----------------
 3 files changed, 43 insertions(+), 44 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter.h b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter.h
index 74083e120ab3..2335c5b48820 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter.h
@@ -10,5 +10,6 @@
 int mlx5e_tx_reporter_create(struct mlx5e_priv *priv);
 void mlx5e_tx_reporter_destroy(struct mlx5e_priv *priv);
 void mlx5e_tx_reporter_err_cqe(struct mlx5e_txqsq *sq);
+void mlx5e_tx_reporter_timeout(struct mlx5e_txqsq *sq);
 
 #endif
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c
index 9800df4909c2..d9675afbb924 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c
@@ -126,6 +126,41 @@ void mlx5e_tx_reporter_err_cqe(struct mlx5e_txqsq *sq)
 			      &err_ctx);
 }
 
+static int mlx5e_tx_reporter_timeout_recover(struct mlx5e_txqsq *sq)
+{
+	struct mlx5_eq_comp *eq = sq->cq.mcq.eq;
+	u32 eqe_count;
+
+	netdev_err(sq->channel->netdev, "EQ 0x%x: Cons = 0x%x, irqn = 0x%x\n",
+		   eq->core.eqn, eq->core.cons_index, eq->core.irqn);
+
+	eqe_count = mlx5_eq_poll_irq_disabled(eq);
+	if (!eqe_count) {
+		clear_bit(MLX5E_SQ_STATE_ENABLED, &sq->state);
+		return 1;
+	}
+
+	netdev_err(sq->channel->netdev, "Recover %d eqes on EQ 0x%x\n",
+		   eqe_count, eq->core.eqn);
+	sq->channel->stats->eq_rearm++;
+	return 0;
+}
+
+void mlx5e_tx_reporter_timeout(struct mlx5e_txqsq *sq)
+{
+	struct mlx5e_tx_err_ctx err_ctx;
+	char err_str[MLX5E_TX_REPORTER_PER_SQ_MAX_LEN];
+
+	err_ctx.sq       = sq;
+	err_ctx.recover  = mlx5e_tx_reporter_timeout_recover;
+	sprintf(err_str,
+		"TX timeout on queue: %d, SQ: 0x%x, CQ: 0x%x, SQ Cons: 0x%x SQ Prod: 0x%x, usecs since last trans: %u\n",
+		sq->channel->ix, sq->sqn, sq->cq.mcq.cqn, sq->cc, sq->pc,
+		jiffies_to_usecs(jiffies - sq->txq->trans_start));
+	devlink_health_report(sq->channel->priv->tx_reporter, err_str,
+			      &err_ctx);
+}
+
 /* state lock cannot be grabbed within this function.
  * It can cause a dead lock or a read-after-free.
  */
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 4c4507384fed..dee0c8f3d4e9 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -4074,31 +4074,14 @@ netdev_features_t mlx5e_features_check(struct sk_buff *skb,
 	return features;
 }
 
-static bool mlx5e_tx_timeout_eq_recover(struct net_device *dev,
-					struct mlx5e_txqsq *sq)
-{
-	struct mlx5_eq_comp *eq = sq->cq.mcq.eq;
-	u32 eqe_count;
-
-	netdev_err(dev, "EQ 0x%x: Cons = 0x%x, irqn = 0x%x\n",
-		   eq->core.eqn, eq->core.cons_index, eq->core.irqn);
-
-	eqe_count = mlx5_eq_poll_irq_disabled(eq);
-	if (!eqe_count)
-		return false;
-
-	netdev_err(dev, "Recover %d eqes on EQ 0x%x\n", eqe_count, eq->core.eqn);
-	sq->channel->stats->eq_rearm++;
-	return true;
-}
-
 static void mlx5e_tx_timeout_work(struct work_struct *work)
 {
 	struct mlx5e_priv *priv = container_of(work, struct mlx5e_priv,
 					       tx_timeout_work);
-	struct net_device *dev = priv->netdev;
-	bool reopen_channels = false;
-	int i, err;
+	int i;
+
+	if (!priv->tx_reporter)
+		return;
 
 	rtnl_lock();
 	mutex_lock(&priv->state_lock);
@@ -4107,36 +4090,16 @@ static void mlx5e_tx_timeout_work(struct work_struct *work)
 		goto unlock;
 
 	for (i = 0; i < priv->channels.num * priv->channels.params.num_tc; i++) {
-		struct netdev_queue *dev_queue = netdev_get_tx_queue(dev, i);
+		struct netdev_queue *dev_queue =
+			netdev_get_tx_queue(priv->netdev, i);
 		struct mlx5e_txqsq *sq = priv->txq2sq[i];
 
 		if (!netif_xmit_stopped(dev_queue))
 			continue;
 
-		netdev_err(dev,
-			   "TX timeout on queue: %d, SQ: 0x%x, CQ: 0x%x, SQ Cons: 0x%x SQ Prod: 0x%x, usecs since last trans: %u\n",
-			   i, sq->sqn, sq->cq.mcq.cqn, sq->cc, sq->pc,
-			   jiffies_to_usecs(jiffies - dev_queue->trans_start));
-
-		/* If we recover a lost interrupt, most likely TX timeout will
-		 * be resolved, skip reopening channels
-		 */
-		if (!mlx5e_tx_timeout_eq_recover(dev, sq)) {
-			clear_bit(MLX5E_SQ_STATE_ENABLED, &sq->state);
-			reopen_channels = true;
-		}
+		mlx5e_tx_reporter_timeout(sq);
 	}
 
-	if (!reopen_channels)
-		goto unlock;
-
-	mlx5e_close_locked(dev);
-	err = mlx5e_open_locked(dev);
-	if (err)
-		netdev_err(priv->netdev,
-			   "mlx5e_open_locked failed recovering from a tx_timeout, err(%d).\n",
-			   err);
-
 unlock:
 	mutex_unlock(&priv->state_lock);
 	rtnl_unlock();
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH net-next v2 11/11] devlink: Add Documentation/networking/devlink-health.txt
  2019-01-17 21:59 [PATCH net-next v2 00/11] Devlink health reporting and recovery system Eran Ben Elisha
                   ` (9 preceding siblings ...)
  2019-01-17 21:59 ` [PATCH net-next v2 10/11] net/mlx5e: Add TX timeout support for mlx5e TX reporter Eran Ben Elisha
@ 2019-01-17 21:59 ` Eran Ben Elisha
  2019-01-18 22:59 ` [PATCH net-next v2 00/11] Devlink health reporting and recovery system David Miller
  2019-01-20  9:27 ` [PATCH iproute2-next v2] devlink: Add health command support Aya Levin
  12 siblings, 0 replies; 36+ messages in thread
From: Eran Ben Elisha @ 2019-01-17 21:59 UTC (permalink / raw)
  To: netdev, Jiri Pirko, David S. Miller
  Cc: Ariel Almog, Aya Levin, Eran Ben Elisha, Moshe Shemesh

From: Aya Levin <ayal@mellanox.com>

This patch adds a new file to add information about devlink health
mechanism.

Signed-off-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
---
 Documentation/networking/devlink-health.txt | 86 +++++++++++++++++++++
 1 file changed, 86 insertions(+)
 create mode 100644 Documentation/networking/devlink-health.txt

diff --git a/Documentation/networking/devlink-health.txt b/Documentation/networking/devlink-health.txt
new file mode 100644
index 000000000000..1db3fbea0831
--- /dev/null
+++ b/Documentation/networking/devlink-health.txt
@@ -0,0 +1,86 @@
+The health mechanism is targeted for Real Time Alerting, in order to know when
+something bad had happened to a PCI device
+- Provide alert debug information
+- Self healing
+- If problem needs vendor support, provide a way to gather all needed debugging
+  information.
+
+The main idea is to unify and centralize driver health reports in the
+generic devlink instance and allow the user to set different
+attributes of the health reporting and recovery procedures.
+
+The devlink health reporter:
+Device driver creates a "health reporter" per each error/health type.
+Error/Health type can be a known/generic (eg pci error, fw error, rx/tx error)
+or unknown (driver specific).
+For each registered health reporter a driver can issue error/health reports
+asynchronously. All health reports handling is done by devlink.
+Device driver can provide specific callbacks for each "health reporter", e.g.
+ - Recovery procedures
+ - Diagnostics and object dump procedures
+ - OOB initial parameters
+Different parts of the driver can register different types of health reporters
+with different handlers.
+
+Once an error is reported, devlink health will do the following actions:
+  * A log is being send to the kernel trace events buffer
+  * Health status and statistics are being updated for the reporter instance
+  * Object dump is being taken and saved at the reporter instance (as long as
+    there is no other dump which is already stored)
+  * Auto recovery attempt is being done. Depends on:
+    - Auto-recovery configuration
+    - Grace period vs. time passed since last recover
+
+The user interface:
+User can access/change each reporter's parameters and driver specific callbacks
+via devlink, e.g per error type (per health reporter)
+ - Configure reporter's generic parameters (like: disable/enable auto recovery)
+ - Invoke recovery procedure
+ - Run diagnostics
+ - Object dump
+
+The devlink health interface (via netlink):
+DEVLINK_CMD_HEALTH_REPORTER_GET
+  Retrieves status and configuration info per DEV and reporter.
+DEVLINK_CMD_HEALTH_REPORTER_SET
+  Allows reporter-related configuration setting.
+DEVLINK_CMD_HEALTH_REPORTER_RECOVER
+  Triggers a reporter's recovery procedure.
+DEVLINK_CMD_HEALTH_REPORTER_DIAGNOSE
+  Retrieves diagnostics data from a reporter on a device.
+DEVLINK_CMD_HEALTH_REPORTER_DUMP_GET
+  Retrieves the last stored dump. Devlink health
+  saves a single dump. If an dump is not already stored by the devlink
+  for this reporter, devlink generates a new dump.
+  dump output is defined by the reporter.
+DEVLINK_CMD_HEALTH_REPORTER_DUMP_CLEAR
+  Clears the last saved dump file for the specified reporter.
+
+
+                                               netlink
+                                      +--------------------------+
+                                      |                          |
+                                      |            +             |
+                                      |            |             |
+                                      +--------------------------+
+                                                   |request for ops
+                                                   |(diagnose,
+ mlx5_core                             devlink     |recover,
+                                                   |dump)
++--------+                            +--------------------------+
+|        |                            |    reporter|             |
+|        |                            |  +---------v----------+  |
+|        |   ops execution            |  |                    |  |
+|     <----------------------------------+                    |  |
+|        |                            |  |                    |  |
+|        |                            |  + ^------------------+  |
+|        |                            |    | request for ops     |
+|        |                            |    | (recover, dump)     |
+|        |                            |    |                     |
+|        |                            |  +-+------------------+  |
+|        |     health report          |  | health handler     |  |
+|        +------------------------------->                    |  |
+|        |                            |  +--------------------+  |
+|        |     health reporter create |                          |
+|        +---------------------------->                          |
++--------+                            +--------------------------+
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [PATCH net-next v2 00/11] Devlink health reporting and recovery system
  2019-01-17 21:59 [PATCH net-next v2 00/11] Devlink health reporting and recovery system Eran Ben Elisha
                   ` (10 preceding siblings ...)
  2019-01-17 21:59 ` [PATCH net-next v2 11/11] devlink: Add Documentation/networking/devlink-health.txt Eran Ben Elisha
@ 2019-01-18 22:59 ` David Miller
  2019-01-20  9:27 ` [PATCH iproute2-next v2] devlink: Add health command support Aya Levin
  12 siblings, 0 replies; 36+ messages in thread
From: David Miller @ 2019-01-18 22:59 UTC (permalink / raw)
  To: eranbe; +Cc: netdev, jiri, ariela, ayal, moshe

From: Eran Ben Elisha <eranbe@mellanox.com>
Date: Thu, 17 Jan 2019 23:59:09 +0200

> The health mechanism is targeted for Real Time Alerting, in order to know when
> something bad had happened to a PCI device
> - Provide alert debug information
> - Self healing
> - If problem needs vendor support, provide a way to gather all needed debugging
>   information.
> 
> The main idea is to unify and centralize driver health reports in the
> generic devlink instance and allow the user to set different
> attributes of the health reporting and recovery procedures.
 ...
> In this patchset, mlx5e TX reporter is implemented.
> 
> v2:
> - Remove FW* reporters to decrease the amount of patches in the patchset

Series applied.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH iproute2-next v2] devlink: Add health command support
  2019-01-17 21:59 [PATCH net-next v2 00/11] Devlink health reporting and recovery system Eran Ben Elisha
                   ` (11 preceding siblings ...)
  2019-01-18 22:59 ` [PATCH net-next v2 00/11] Devlink health reporting and recovery system David Miller
@ 2019-01-20  9:27 ` Aya Levin
  2019-01-23  3:37   ` David Ahern
  12 siblings, 1 reply; 36+ messages in thread
From: Aya Levin @ 2019-01-20  9:27 UTC (permalink / raw)
  To: David Ahern, netdev, David S. Miller, Jiri Pirko
  Cc: Moshe Shemesh, Aya Levin, Eran Ben Elisha, Tal Alon, Ariel Almog

This patch add support for the following commands:
Devlink health show [DEV reporter REPORTE_NAME],
Devlink health recover DEV reporter REPORTER_NAME,
Devlink health diagnose DEV reporter REPORTER_NAME,
Devlink health dump show DEV reporter REPORTER_NAME,
Devlink health dump clear DEV reporter REPORTER_NAME,
Devlink health set DEV reporter REPORTER_NAME NAME VALUE

 * Devlink health show command displays status and configuration info on
   specific reporter on a device or dump the info on all reporters on all
   devices.
 * Devlink health recover enables the user to initiate a
   recovery on a reporter. This operation will increment the recoveries
   counter displayed in the show command.
 * Devlink health diagnose enables the user to retrieve diagnostics data
   on a reporter on a device. The command's output is a free text defined
   by the reporter.
 * Devlink health dump show displays the last saved dump. Devlink
   health saves a single dump. If an dump is not already stored by
   the Devlink for this reporter, Devlink generates a new dump. The
   dump can be generated automatically when a reporter reports on an
   error or by the user's request.
   dump output is defined by the reporter.
 * Devlink health dump clear, deletes the last saved dump file.
 * Devlink health set, enables the user to configure:
	1) grace_period [msec] time interval between auto recoveries.
	2) auto_recover [true/false] whether the devlink should execute
	automatic recover on error.

Examples:
$devlink health show pci/0000:00:09.0 reporter TX
pci/0000:00:09.0:
  name TX 
  state healthy #err 0 #recover 1 last_dump_ts N/A dump_available false
  attributes:
    grace period 600 auto_recover true
$devlink health diagnose pci/0000:00:09.0 reporter TX
SQ 0x9b: HW state: 1 stopped: 0
SQ 0x9f: HW state: 1 stopped: 0
SQ 0xa3: HW state: 1 stopped: 0
SQ 0xa7: HW state: 1 stopped: 0
SQ 0xab: HW state: 1 stopped: 0
$devlink health dump show pci/0000:00:09.0 reporter TX
TX dump data
$devlink health dump clear pci/0000:00:09.0 reporter TX
$devlink health set pci/0000:00:09.0 reporter TX grace_period 3500
$devlink health set pci/0000:00:09.0 reporter TX auto_recover false

Signed-off-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
---
 devlink/devlink.c            | 536 ++++++++++++++++++++++++++++++++++++++++++-
 include/uapi/linux/devlink.h |  29 +++
 man/man8/devlink-health.8    | 175 ++++++++++++++
 man/man8/devlink.8           |   7 +-
 4 files changed, 743 insertions(+), 4 deletions(-)
 create mode 100644 man/man8/devlink-health.8

diff --git a/devlink/devlink.c b/devlink/devlink.c
index 3651e90c1159..9fc19668ccd0 100644
--- a/devlink/devlink.c
+++ b/devlink/devlink.c
@@ -1,4 +1,5 @@
 /*
+ *
  * devlink.c	Devlink tool
  *
  *              This program is free software; you can redistribute it and/or
@@ -22,7 +23,7 @@
 #include <linux/devlink.h>
 #include <libmnl/libmnl.h>
 #include <netinet/ether.h>
-
+#include <sys/sysinfo.h>
 #include "SNAPSHOT.h"
 #include "list.h"
 #include "mnlg.h"
@@ -40,6 +41,12 @@
 #define PARAM_CMODE_DRIVERINIT_STR "driverinit"
 #define PARAM_CMODE_PERMANENT_STR "permanent"
 
+#define HEALTH_REPORTER_STATE_HEALTHY_STR "healthy"
+#define HEALTH_REPORTER_STATE_ERROR_STR "error"
+#define HEALTH_REPORTER_GRACE_PERIOD_STR "grace_period"
+#define HEALTH_REPORTER_AUTO_RECOVER_STR "auto_recover"
+#define HEALTH_REPORTER_TS_LEN 80
+
 static int g_new_line_count;
 
 #define pr_err(args...) fprintf(stderr, ##args)
@@ -199,6 +206,10 @@ static void ifname_map_free(struct ifname_map *ifname_map)
 #define DL_OPT_REGION_SNAPSHOT_ID	BIT(22)
 #define DL_OPT_REGION_ADDRESS		BIT(23)
 #define DL_OPT_REGION_LENGTH		BIT(24)
+#define DL_OPT_HANDLE_HEALTH		BIT(25)
+#define DL_OPT_HEALTH_REPORTER_NAME	BIT(26)
+#define DL_OPT_HEALTH_REPORTER_DEV	BIT(27)
+
 
 struct dl_opts {
 	uint32_t present; /* flags of present items */
@@ -230,6 +241,10 @@ struct dl_opts {
 	uint32_t region_snapshot_id;
 	uint64_t region_address;
 	uint64_t region_length;
+	const char *reporter_name;
+	const char *reporter_param_name;
+	const char *reporter_param_value;
+
 };
 
 struct dl {
@@ -959,7 +974,7 @@ static int dl_argv_parse(struct dl *dl, uint32_t o_required,
 		if (err)
 			return err;
 		o_found |= handle_bit;
-	} else if (o_required & DL_OPT_HANDLE) {
+	} else if (DL_OPT_HANDLE) {
 		err = dl_argv_handle(dl, &opts->bus_name, &opts->dev_name);
 		if (err)
 			return err;
@@ -1178,6 +1193,15 @@ static int dl_argv_parse(struct dl *dl, uint32_t o_required,
 			if (err)
 				return err;
 			o_found |= DL_OPT_REGION_LENGTH;
+		} else if (dl_argv_match(dl, "reporter") &&
+			   (o_all & DL_OPT_HEALTH_REPORTER_NAME)) {
+			dl_arg_inc(dl);
+			err = dl_argv_str(dl, &opts->reporter_name);
+			if (err)
+				return err;
+			o_found |= DL_OPT_HEALTH_REPORTER_NAME;
+			o_found |= DL_OPT_HANDLE;
+			break;
 		} else {
 			pr_err("Unknown option \"%s\"\n", dl_argv(dl));
 			return -EINVAL;
@@ -1298,6 +1322,12 @@ static int dl_argv_parse(struct dl *dl, uint32_t o_required,
 		return -EINVAL;
 	}
 
+	if ((o_required & DL_OPT_HEALTH_REPORTER_NAME) &&
+	    !(o_found & DL_OPT_HEALTH_REPORTER_NAME)) {
+		pr_err("Reporter name expected.\n");
+		return -EINVAL;
+	}
+
 	return 0;
 }
 
@@ -1382,6 +1412,9 @@ static void dl_opts_put(struct nlmsghdr *nlh, struct dl *dl)
 	if (opts->present & DL_OPT_REGION_LENGTH)
 		mnl_attr_put_u64(nlh, DEVLINK_ATTR_REGION_CHUNK_LEN,
 				 opts->region_length);
+	if (opts->present & DL_OPT_HEALTH_REPORTER_NAME)
+		mnl_attr_put_strz(nlh, DEVLINK_ATTR_HEALTH_REPORTER_NAME,
+				  opts->reporter_name);
 }
 
 static int dl_argv_parse_put(struct nlmsghdr *nlh, struct dl *dl,
@@ -1513,6 +1546,8 @@ static void __pr_out_handle_start(struct dl *dl, struct nlattr **tb,
 				__pr_out_newline();
 				__pr_out_indent_inc();
 				arr_last_handle_set(dl, bus_name, dev_name);
+			} else {
+				__pr_out_indent_inc();
 			}
 		} else {
 			pr_out("%s%s", buf, content ? ":" : "");
@@ -5557,11 +5592,502 @@ static int cmd_region(struct dl *dl)
 	return -ENOENT;
 }
 
+static int cmd_health_set_params(struct dl *dl)
+{
+	struct nlmsghdr *nlh;
+	uint64_t period;
+	bool auto_recover;
+	int err;
+
+	nlh = mnlg_msg_prepare(dl->nlg, DEVLINK_CMD_HEALTH_REPORTER_SET,
+			       NLM_F_REQUEST | NLM_F_ACK);
+	err = dl_argv_parse(dl, DL_OPT_HANDLE |
+			    DL_OPT_HEALTH_REPORTER_NAME, 0);
+	if (err)
+		return err;
+
+	err = dl_argv_str(dl, &dl->opts.reporter_param_name);
+	if (err)
+		return err;
+	err = dl_argv_str(dl, &dl->opts.reporter_param_value);
+	if (err)
+		return err;
+	dl_opts_put(nlh, dl);
+
+	if (!strncmp(dl->opts.reporter_param_name,
+		     HEALTH_REPORTER_GRACE_PERIOD_STR, strlen("garce"))) {
+		err = strtouint64_t(dl->opts.reporter_param_value, &period);
+		if (err)
+			goto err_param_value_parse;
+		mnl_attr_put_u64(nlh, DEVLINK_ATTR_HEALTH_REPORTER_PERIOD,
+				 period);
+	} else if (!strncmp(dl->opts.reporter_param_name,
+			    HEALTH_REPORTER_AUTO_RECOVER_STR,
+			    strlen("auto"))) {
+		err = strtobool(dl->opts.reporter_param_value, &auto_recover);
+		if (err)
+			goto err_param_value_parse;
+		mnl_attr_put_u8(nlh, DEVLINK_ATTR_HEALTH_REPORTER_AUTO_REC,
+				(uint8_t)auto_recover);
+	} else {
+		printf("Parameter name: %s  is not supported\n",
+		       dl->opts.reporter_param_name);
+		return -ENOTSUP;
+	}
+
+	return _mnlg_socket_sndrcv(dl->nlg, nlh, NULL, NULL);
+
+err_param_value_parse:
+	pr_err("Value \"%s\" is not a number or not within range\n",
+	       dl->opts.param_value);
+	return err;
+}
+
+static int cmd_health_dump_clear(struct dl *dl)
+{
+	struct nlmsghdr *nlh;
+	int err;
+
+	nlh = mnlg_msg_prepare(dl->nlg,
+			       DEVLINK_CMD_HEALTH_REPORTER_DUMP_CLEAR,
+			       NLM_F_REQUEST | NLM_F_ACK);
+
+	err = dl_argv_parse_put(nlh, dl, DL_OPT_HANDLE |
+				DL_OPT_HEALTH_REPORTER_NAME, 0);
+	if (err)
+		return err;
+
+	dl_opts_put(nlh, dl);
+	return _mnlg_socket_sndrcv(dl->nlg, nlh, NULL, NULL);
+}
+
+static int health_value_show(struct dl *dl, int type, struct nlattr *nl_data)
+{
+	const char *str;
+	uint8_t *data;
+	uint32_t len;
+	uint64_t u64;
+	uint32_t u32;
+	uint16_t u16;
+	uint8_t u8;
+	int i;
+
+	switch (type) {
+	case MNL_TYPE_FLAG:
+		if (dl->json_output)
+			jsonw_string(dl->jw, nl_data ? "true" : "false");
+		else
+			pr_out("%s ", nl_data ? "true" : "false");
+		break;
+	case MNL_TYPE_U8:
+		u8 = mnl_attr_get_u8(nl_data);
+		if (dl->json_output)
+			jsonw_uint(dl->jw, u8);
+		else
+			pr_out("%u ", u8);
+		break;
+	case MNL_TYPE_U16:
+		u16 = mnl_attr_get_u16(nl_data);
+		if (dl->json_output)
+			jsonw_uint(dl->jw, u16);
+		else
+			pr_out("%u ", u16);
+		break;
+	case MNL_TYPE_U32:
+		u32 = mnl_attr_get_u32(nl_data);
+		if (dl->json_output)
+			jsonw_uint(dl->jw, u32);
+		else
+			pr_out("%u ", u32);
+		break;
+	case MNL_TYPE_U64:
+		u64 = mnl_attr_get_u64(nl_data);
+		if (dl->json_output)
+			jsonw_u64(dl->jw, u64);
+		else
+			pr_out("%lu ", u64);
+		break;
+	case MNL_TYPE_STRING:
+	case MNL_TYPE_NUL_STRING:
+		str = mnl_attr_get_str(nl_data);
+		if (dl->json_output)
+			jsonw_string(dl->jw, str);
+		else
+			pr_out("%s ", str);
+		break;
+	case MNL_TYPE_BINARY:
+		len = mnl_attr_get_payload_len(nl_data);
+		data = mnl_attr_get_payload(nl_data);
+		i = 0;
+		while (i < len) {
+			if (dl->json_output)
+				jsonw_printf(dl->jw, "%d", data[i]);
+			else
+				pr_out("%02x ", data[i]);
+			i++;
+		}
+		break;
+	default:
+		return -EINVAL;
+	}
+	return MNL_CB_OK;
+}
+
+static int health_object_pair_show(struct dl *dl, struct nlattr *nl)
+{
+	struct nlattr *nla_pair[DEVLINK_ATTR_MAX + 1] = {};
+	struct nlattr *nla_value[DEVLINK_ATTR_MAX + 1] = {};
+	struct nlattr *nla_value_data;
+	struct nlattr *nla_obj_pair;
+	struct nlattr *nla_object;
+	struct nlattr *nla_array;
+	struct nlattr *nla_val;
+	int err, type;
+	const char *name;
+
+	err = mnl_attr_parse_nested(nl, attr_cb, nla_pair);
+	if (err != MNL_CB_OK)
+		return -EINVAL;
+
+	if (!nla_pair[DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_NAME] ||
+	    !nla_pair[DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_VALUE])
+		return -EINVAL;
+
+	name = mnl_attr_get_str(nla_pair[DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_NAME]);
+	if (dl->json_output)
+		jsonw_name(dl->jw, name);
+	else
+		pr_out("%s: ", name);
+
+	nla_val = nla_pair[DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_VALUE];
+	err = mnl_attr_parse_nested(nla_val, attr_cb, nla_value);
+	if (err != MNL_CB_OK)
+		return -EINVAL;
+
+	if (!nla_value[DEVLINK_ATTR_HEALTH_BUFFER_OBJECT] &&
+	    !nla_value[DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_VALUE_ARRAY] &&
+	    !nla_value[DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_VALUE_TYPE])
+		return -EINVAL;
+
+	if (nla_value[DEVLINK_ATTR_HEALTH_BUFFER_OBJECT]) {
+		nla_object = nla_value[DEVLINK_ATTR_HEALTH_BUFFER_OBJECT];
+		/*inside must be an object pair*/
+		if (dl->json_output)
+			jsonw_start_object(dl->jw);
+
+		mnl_attr_for_each_nested(nla_obj_pair, nla_object) {
+			err = health_object_pair_show(dl, nla_obj_pair);
+			if (err != MNL_CB_OK)
+				break;
+		}
+		if (dl->json_output)
+			jsonw_end_object(dl->jw);
+	}
+	if (nla_value[DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_VALUE_TYPE]) {
+		type = mnl_attr_get_u8(nla_value[DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_VALUE_TYPE]);
+		if (!nla_value[DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_VALUE_DATA])
+			return -EINVAL;
+		nla_value_data = nla_value[DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_VALUE_DATA];
+		health_value_show(dl, type, nla_value_data);
+	}
+	if (nla_value[DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_VALUE_ARRAY]) {
+		if (dl->json_output)
+			jsonw_start_array(dl->jw);
+		nla_array = nla_value[DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_VALUE_ARRAY];
+		mnl_attr_for_each_nested(nla_object, nla_array) {
+			mnl_attr_for_each_nested(nla_obj_pair, nla_object) {
+				err = health_object_pair_show(dl, nla_obj_pair);
+				if (err != MNL_CB_OK)
+					break;
+			}
+		}
+		if (dl->json_output)
+			jsonw_end_array(dl->jw);
+	}
+	return err;
+}
+
+static int cmd_health_object_cb(const struct nlmsghdr *nlh, void *data)
+{
+	struct genlmsghdr *genl = mnl_nlmsg_get_payload(nlh);
+	struct nlattr *tb[DEVLINK_ATTR_MAX + 1] = {};
+	struct nlattr *nla_object;
+	struct nlattr *nla_object_pair;
+	struct dl *dl = data;
+	int err = MNL_CB_OK;
+
+	mnl_attr_parse(nlh, sizeof(*genl), attr_cb, tb);
+	if (!tb[DEVLINK_ATTR_HEALTH_BUFFER_OBJECT])
+		return MNL_CB_ERROR;
+
+	mnl_attr_for_each(nla_object, nlh, sizeof(*genl)) {
+		mnl_attr_for_each_nested(nla_object_pair, nla_object) {
+			err = health_object_pair_show(dl, nla_object_pair);
+			if (err != MNL_CB_OK)
+				break;
+		}
+		if (!dl->json_output)
+			pr_out("\n");
+	}
+
+	return err;
+}
+
+static int cmd_health_object_common(struct dl *dl, uint8_t cmd)
+{
+	struct nlmsghdr *nlh;
+	int err;
+
+	nlh = mnlg_msg_prepare(dl->nlg, cmd,
+			       NLM_F_REQUEST | NLM_F_ACK);
+
+	err = dl_argv_parse_put(nlh, dl, DL_OPT_HANDLE |
+				DL_OPT_HEALTH_REPORTER_NAME, 0);
+	if (err)
+		return err;
+
+	if (dl->json_output)
+		jsonw_start_object(dl->jw);
+
+	err = _mnlg_socket_sndrcv(dl->nlg, nlh, cmd_health_object_cb, dl);
+
+	if (dl->json_output)
+		jsonw_end_object(dl->jw);
+
+	return err;
+}
+
+static int cmd_health_diagnose(struct dl *dl)
+{
+	return cmd_health_object_common(dl,
+					DEVLINK_CMD_HEALTH_REPORTER_DIAGNOSE);
+}
+
+static int cmd_health_dump_show(struct dl *dl)
+{
+	return cmd_health_object_common(dl,
+					DEVLINK_CMD_HEALTH_REPORTER_DUMP_GET);
+}
+
+static int cmd_health_recover(struct dl *dl)
+{
+	struct nlmsghdr *nlh;
+	int err;
+
+	nlh = mnlg_msg_prepare(dl->nlg, DEVLINK_CMD_HEALTH_REPORTER_RECOVER,
+			       NLM_F_REQUEST | NLM_F_ACK);
+
+	err = dl_argv_parse_put(nlh, dl, DL_OPT_HANDLE |
+				DL_OPT_HEALTH_REPORTER_NAME, 0);
+	if (err)
+		return err;
+
+	dl_opts_put(nlh, dl);
+	return _mnlg_socket_sndrcv(dl->nlg, nlh, NULL, NULL);
+}
+
+static void pr_out_array_parameters_start(struct dl *dl, const char *name)
+{
+	if (dl->json_output) {
+		jsonw_name(dl->jw, name);
+		jsonw_start_array(dl->jw);
+	} else {
+		__pr_out_newline();
+		pr_out("%s:", name);
+		 __pr_out_indent_inc();
+		__pr_out_newline();
+	}
+}
+
+static const char *health_state_name(uint8_t state)
+{
+	switch (state) {
+	case DEVLINK_HEALTH_REPORTER_STATE_HEALTHY:
+		return HEALTH_REPORTER_STATE_HEALTHY_STR;
+	case DEVLINK_HEALTH_REPORTER_STATE_ERROR:
+		return HEALTH_REPORTER_STATE_ERROR_STR;
+	default: return "<unknown state>";
+	}
+}
+
+static int jiffies_to_logtime(uint64_t jiffies, char *output)
+{
+	struct sysinfo s_info;
+	struct tm *info;
+	time_t now, sec;
+	int err;
+
+	time(&now);
+	info = localtime(&now);
+	strftime(output, HEALTH_REPORTER_TS_LEN, "%Y-%b-%d %H:%M:%S", info);
+	err = sysinfo(&s_info);
+	if (err)
+		return err;
+	/*substruct uptime in sec from now
+	 * add jiffies and 5 minutes factor*/
+	sec = now - (s_info.uptime - jiffies/1000) + 300;
+	info = localtime(&sec);
+	strftime(output, HEALTH_REPORTER_TS_LEN, "%Y-%b-%d %H:%M:%S", info);
+
+	return 0;
+}
+
+static void pr_out_health(struct dl *dl, struct nlattr **tb)
+{
+	struct nlattr *hlt[DEVLINK_ATTR_MAX + 1] = {};
+	enum devlink_health_reporter_state state;
+	char dump_time_date[HEALTH_REPORTER_TS_LEN] = "N/A";
+	bool auto_recover = false;
+	const struct nlattr *attr;
+	bool dmp = false;
+	uint64_t jiffies;
+	int err;
+
+	state = DEVLINK_HEALTH_REPORTER_STATE_HEALTHY;
+
+	err = mnl_attr_parse_nested(tb[DEVLINK_ATTR_HEALTH_REPORTER], attr_cb,
+				    hlt);
+	if (err != MNL_CB_OK)
+		return;
+
+	if (!hlt[DEVLINK_ATTR_HEALTH_REPORTER_NAME]		||
+	    !hlt[DEVLINK_ATTR_HEALTH_REPORTER_ERR]		||
+	    !hlt[DEVLINK_ATTR_HEALTH_REPORTER_RECOVER]		||
+	    !hlt[DEVLINK_ATTR_HEALTH_REPORTER_DUMP_AVAIL]	||
+	    !hlt[DEVLINK_ATTR_HEALTH_REPORTER_STATE]		||
+	    !hlt[DEVLINK_ATTR_HEALTH_REPORTER_AUTO_REC]		||
+	    !hlt[DEVLINK_ATTR_HEALTH_REPORTER_PERIOD])
+		return;
+
+	dmp = mnl_attr_get_u8(hlt[DEVLINK_ATTR_HEALTH_REPORTER_DUMP_AVAIL]);
+	if (dmp) {
+		if (hlt[DEVLINK_ATTR_HEALTH_REPORTER_DUMP_TS]) {
+			attr = hlt[DEVLINK_ATTR_HEALTH_REPORTER_DUMP_TS];
+			jiffies = mnl_attr_get_u64(attr);
+			err = jiffies_to_logtime(jiffies, dump_time_date);
+			if (err)
+				sprintf(dump_time_date,
+					"Failed to retrieve dump's timestamp");
+		}
+	}
+
+	pr_out_handle_start_arr(dl, tb);
+
+	pr_out_str(dl, "name",
+		   mnl_attr_get_str(hlt[DEVLINK_ATTR_HEALTH_REPORTER_NAME]));
+	__pr_out_newline();
+	__pr_out_indent_inc();
+	state = mnl_attr_get_u8(hlt[DEVLINK_ATTR_HEALTH_REPORTER_STATE]);
+	pr_out_str(dl, "state", health_state_name(state));
+	pr_out_u64(dl, "#err",
+		   mnl_attr_get_u64(hlt[DEVLINK_ATTR_HEALTH_REPORTER_ERR]));
+	pr_out_u64(dl, "#recover",
+		   mnl_attr_get_u64(hlt[DEVLINK_ATTR_HEALTH_REPORTER_RECOVER]));
+	pr_out_str(dl, "last_dump_ts", dump_time_date);
+	pr_out_bool(dl, "dump_available", dmp);
+	pr_out_array_parameters_start(dl, "parameters");
+	pr_out_entry_start(dl);
+	pr_out_u64(dl, "grace_period",
+		   mnl_attr_get_u64(hlt[DEVLINK_ATTR_HEALTH_REPORTER_PERIOD]));
+	auto_recover = mnl_attr_get_u8(hlt[DEVLINK_ATTR_HEALTH_REPORTER_AUTO_REC]);
+	pr_out_bool(dl, "auto_recover", auto_recover);
+	pr_out_entry_end(dl);
+	pr_out_array_end(dl);
+	__pr_out_indent_dec();
+	pr_out_handle_end(dl);
+}
+
+static int cmd_health_show_cb(const struct nlmsghdr *nlh, void *data)
+{
+	struct genlmsghdr *genl = mnl_nlmsg_get_payload(nlh);
+	struct nlattr *tb[DEVLINK_ATTR_MAX + 1] = {};
+	struct dl *dl = data;
+
+	mnl_attr_parse(nlh, sizeof(*genl), attr_cb, tb);
+	if (!tb[DEVLINK_ATTR_BUS_NAME] || !tb[DEVLINK_ATTR_DEV_NAME] ||
+	    !tb[DEVLINK_ATTR_HEALTH_REPORTER])
+		return MNL_CB_ERROR;
+
+	pr_out_health(dl, tb);
+
+	return MNL_CB_OK;
+}
+
+static int cmd_health_show(struct dl *dl)
+{
+	struct nlmsghdr *nlh;
+	uint16_t flags = NLM_F_REQUEST | NLM_F_ACK;
+	int err;
+
+	if (dl_argc(dl) == 0)
+		flags |= NLM_F_DUMP;
+	nlh = mnlg_msg_prepare(dl->nlg, DEVLINK_CMD_HEALTH_REPORTER_GET,
+			       flags);
+
+	if (dl_argc(dl) > 0) {
+		err = dl_argv_parse_put(nlh, dl, 0, DL_OPT_HANDLE |
+					DL_OPT_HEALTH_REPORTER_NAME);
+		if (err)
+			return err;
+	}
+	pr_out_section_start(dl, "health");
+
+	err = _mnlg_socket_sndrcv(dl->nlg, nlh, cmd_health_show_cb, dl);
+	if (err)
+		return err;
+	pr_out_section_end(dl);
+	return err;
+}
+
+static void cmd_health_help(void)
+{
+	pr_err("Usage: devlink health show [ dev DEV reporter REPORTER_NAME ]\n");
+	pr_err("Usage: devlink health recover DEV reporter REPORTER_NAME\n");
+	pr_err("Usage: devlink health diagnose DEV reporter REPORTER_NAME\n");
+	pr_err("Usage: devlink health dump show DEV reporter REPORTER_NAME\n");
+	pr_err("Usage: devlink health dump clear DEV reporter REPORTER_NAME\n");
+	pr_err("Usage: devlink health set DEV reporter REPORTER_NAME NAME VALUE\n");
+}
+
+static int cmd_health(struct dl *dl)
+{
+	if (dl_no_arg(dl)) {
+		return cmd_health_show(dl);
+	} else if (dl_argv_match(dl, "help")) {
+		cmd_health_help();
+		return 0;
+	} else if (dl_argv_match(dl, "show")) {
+		dl_arg_inc(dl);
+		return cmd_health_show(dl);
+	} else if (dl_argv_match(dl, "recover")) {
+		dl_arg_inc(dl);
+		return cmd_health_recover(dl);
+	} else if (dl_argv_match(dl, "diagnose")) {
+		dl_arg_inc(dl);
+		return cmd_health_diagnose(dl);
+	} else if (dl_argv_match(dl, "dump")) {
+		dl_arg_inc(dl);
+		if (dl_argv_match(dl, "show")) {
+			dl_arg_inc(dl);
+			return cmd_health_dump_show(dl);
+		} else if (dl_argv_match(dl, "clear")) {
+			dl_arg_inc(dl);
+			return cmd_health_dump_clear(dl);
+		}
+	} else if (dl_argv_match(dl, "set")) {
+		dl_arg_inc(dl);
+		return cmd_health_set_params(dl);
+	}
+
+	pr_err("Command \"%s\" not found\n", dl_argv(dl));
+	return -ENOENT;
+}
+
 static void help(void)
 {
 	pr_err("Usage: devlink [ OPTIONS ] OBJECT { COMMAND | help }\n"
 	       "       devlink [ -f[orce] ] -b[atch] filename\n"
-	       "where  OBJECT := { dev | port | sb | monitor | dpipe | resource | region }\n"
+	       "where  OBJECT := { dev | port | sb | monitor | dpipe | resource | region | health }\n"
 	       "       OPTIONS := { -V[ersion] | -n[o-nice-names] | -j[son] | -p[retty] | -v[erbose] }\n");
 }
 
@@ -5594,7 +6120,11 @@ static int dl_cmd(struct dl *dl, int argc, char **argv)
 	} else if (dl_argv_match(dl, "region")) {
 		dl_arg_inc(dl);
 		return cmd_region(dl);
+	} else if (dl_argv_match(dl, "health")) {
+		dl_arg_inc(dl);
+		return cmd_health(dl);
 	}
+
 	pr_err("Object \"%s\" not found\n", dl_argv(dl));
 	return -ENOENT;
 }
diff --git a/include/uapi/linux/devlink.h b/include/uapi/linux/devlink.h
index d0a33d79dc22..90c6ebb4829d 100644
--- a/include/uapi/linux/devlink.h
+++ b/include/uapi/linux/devlink.h
@@ -89,6 +89,12 @@ enum devlink_command {
 	DEVLINK_CMD_REGION_DEL,
 	DEVLINK_CMD_REGION_READ,
 
+	DEVLINK_CMD_HEALTH_REPORTER_GET,
+	DEVLINK_CMD_HEALTH_REPORTER_SET,
+	DEVLINK_CMD_HEALTH_REPORTER_RECOVER,
+	DEVLINK_CMD_HEALTH_REPORTER_DIAGNOSE,
+	DEVLINK_CMD_HEALTH_REPORTER_DUMP_GET,
+	DEVLINK_CMD_HEALTH_REPORTER_DUMP_CLEAR,
 	/* add new commands above here */
 	__DEVLINK_CMD_MAX,
 	DEVLINK_CMD_MAX = __DEVLINK_CMD_MAX - 1
@@ -168,6 +174,11 @@ enum devlink_param_fw_load_policy_value {
 	DEVLINK_PARAM_FW_LOAD_POLICY_VALUE_FLASH,
 };
 
+enum devlink_health_reporter_state {
+	DEVLINK_HEALTH_REPORTER_STATE_HEALTHY,
+	DEVLINK_HEALTH_REPORTER_STATE_ERROR,
+};
+
 enum devlink_attr {
 	/* don't change the order or add anything between, this is ABI! */
 	DEVLINK_ATTR_UNSPEC,
@@ -285,6 +296,24 @@ enum devlink_attr {
 	DEVLINK_ATTR_REGION_CHUNK_ADDR,         /* u64 */
 	DEVLINK_ATTR_REGION_CHUNK_LEN,          /* u64 */
 
+	DEVLINK_ATTR_HEALTH_BUFFER_OBJECT,		/* nested */
+	DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_PAIR,		/* nested */
+	DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_NAME,		/* string */
+	DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_VALUE,	/* nested */
+	DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_VALUE_ARRAY,	/* nested */
+	DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_VALUE_TYPE,	/* u8 */
+	DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_VALUE_DATA,	/* dynamic */
+
+	DEVLINK_ATTR_HEALTH_REPORTER,			/* nested */
+	DEVLINK_ATTR_HEALTH_REPORTER_NAME,		/* string */
+	DEVLINK_ATTR_HEALTH_REPORTER_STATE,		/* u8 */
+	DEVLINK_ATTR_HEALTH_REPORTER_ERR,		/* u64 */
+	DEVLINK_ATTR_HEALTH_REPORTER_RECOVER,		/* u64 */
+	DEVLINK_ATTR_HEALTH_REPORTER_DUMP_AVAIL,	/* u8 */
+	DEVLINK_ATTR_HEALTH_REPORTER_DUMP_TS,		/* u64 */
+	DEVLINK_ATTR_HEALTH_REPORTER_PERIOD,		/* u64 */
+	DEVLINK_ATTR_HEALTH_REPORTER_AUTO_REC,		/* u8 */
+
 	/* add new attributes above here, update the policy in devlink.c */
 
 	__DEVLINK_ATTR_MAX,
diff --git a/man/man8/devlink-health.8 b/man/man8/devlink-health.8
new file mode 100644
index 000000000000..880672290d36
--- /dev/null
+++ b/man/man8/devlink-health.8
@@ -0,0 +1,175 @@
+.TH DEVLINK\-HEALTH 8 "27 Dec 2018" "iproute2" "Linux"
+.SH NAME
+devlink-health \- devlink health reporting and recovery
+.SH SYNOPSIS
+.sp
+.ad l
+.in +8
+.ti -8
+.B devlink
+.RI "[ " OPTIONS " ]"
+.B health
+.RI  " { " COMMAND " | "
+.BR help " }"
+.sp
+.ti -8
+.IR OPTIONS " := { "
+\fB\-V\fR[\fIersion\fR] }
+.ti -8
+.BR "devlink health show"
+.RI "[ "
+.RI "" DEV ""
+.BR " reporter "
+.RI ""REPORTER " ] "
+.ti -8
+.BR "devlink health recover"
+.RI "" DEV ""
+.BR "reporter"
+.RI "" REPORTER ""
+.ti -8
+.BR "devlink health diagnose"
+.RI "" DEV ""
+.BR "reporter"
+.RI "" REPORTER ""
+.ti -8
+.BR "devlink health dump show"
+.RI "" DEV ""
+.BR "reporter"
+.RI "" REPORTER ""
+.ti -8
+.BR "devlink health dump clear"
+.RI "" DEV ""
+.BR "reporter"
+.RI "" REPORTER ""
+.ti -8
+.BR "devlink health set"
+.RI "" DEV ""
+.BR "reporter"
+.RI "" REPORTER ""
+.RI "" NAME ""
+.RI "" VALUE ""
+.ti -8
+.B devlink health help
+.SH "DESCRIPTION"
+.SS devlink health show - Show status and configuration on all supported reporters on all devlink devices.
+.PP
+.I "DEV"
+- specifies the devlink device.
+.PP
+.I "REPORTER"
+- specifies the reporter's name registered on the devlink device.
+.SS devlink health recover - Initiate a recovery operation on a reporter.
+This action performs a recovery and increases the recoveries counter on success.
+.PP
+.I "DEV"
+- specifies the devlink device.
+.PP
+.I "REPORTER"
+- specifies the reporter's name registered on the devlink device.
+.SS devlink health diagnose - Retrieve diagnostics data on a reporter.
+.PP
+.I "DEV"
+- specifies the devlink device.
+.PP
+.I "REPORTER"
+- specifies the reporter's name registered on the devlink device.
+.SS devlink health dump show - Display the last saved dump.
+.PD 0
+.P
+Devlink health saves a single dump per reporter. If an dump is
+.P
+not already stored by the Devlink, this command will generate a new
+.P
+dump. The dump can be generated either automatically when a
+.P
+reporter reports on an error or manually at the user's request.
+.PD
+.PP
+.I "DEV"
+- specifies the devlink device.
+.PP
+.I "REPORTER"
+- specifies the reporter's name registered on the devlink device.
+.SS devlink health dump clear - Delete the saved dump.
+Deleting the save dump enables a generation of a new dump on
+.PD 0
+.P
+the next "devlink health dump show" command.
+.PD
+.PP
+.I "DEV"
+- specifies the devlink device.
+.PP
+.I "REPORTER"
+- specifies the reporter's name registered on the devlink device.
+.SS devlink health set - Enable the user to configure:
+.PD 0
+1) grace_period [msec] - Time interval between auto recoveries.
+.P
+2) auto_recover [true/false] - Indicates whether the devlink should execute automatic recover on error.
+.PD
+.PP
+.I "DEV"
+- specifies the devlink device.
+.PP
+.I "REPORTER"
+- specifies the reporter's name registered on the devlink device.
+.SH "EXAMPLES"
+.PP
+devlink health show
+.RS 4
+pci/0000:00:09.0:
+  name TX state healthy #err 1 #recover 1 last_dump_ts N/A dump_available false
+  attributes:
+    grace period 600 auto_recover true
+.RE
+.PP
+devlink health recover pci/0000:00:09.0 reporter TX
+.RS 4
+Initiate recovery on TX reporter registered on pci/0000:00:09.0.
+.RE
+.PP
+devlink health diagnose pci/0000:00:09.0 reporter TX
+.RS 4
+.PD 0
+SQ 0x9b: HW state: 1 stopped: 0
+.P
+SQ 0x9f: HW state: 1 stopped: 0
+.P
+SQ 0xa3: HW state: 1 stopped: 0
+.P
+SQ 0xa7: HW state: 1 stopped: 0
+.P
+SQ 0xab: HW state: 1 stopped: 0
+.PD
+.RE
+.PP
+devlink health dump show pci/0000:00:09.0 reporter TX
+.RS 4
+Display the last saved dump on TX reporter registered on pci/0000:00:09.0.
+.RE
+.PP
+devlink health dump clear pci/0000:00:09.0 reporter TX
+.RS 4
+Delete saved dump on TX reporter registered on pci/0000:00:09.0.
+.RE
+.PP
+devlink health set pci/0000:00:09.0 reporter TX grace_period 3500
+.RS 4
+Set time interval between auto recoveries to minimum of 3500 mSec on
+TX reporter registered on pci/0000:00:09.0.
+.RE
+.PP
+devlink health set pci/0000:00:09.0 reporter TX auto_recover false
+.RS 4
+Turn off auto recovery on TX reporter registered on pci/0000:00:09.0.
+.RE
+.SH SEE ALSO
+.BR devlink (8),
+.BR devlink-dev (8),
+.BR devlink-port (8),
+.BR devlink-region (8),
+.br
+
+.SH AUTHOR
+Aya Levin <ayal@mellanox.com>
diff --git a/man/man8/devlink.8 b/man/man8/devlink.8
index 8d527e7e1d60..13d4dcd908b3 100644
--- a/man/man8/devlink.8
+++ b/man/man8/devlink.8
@@ -7,7 +7,7 @@ devlink \- Devlink tool
 .in +8
 .ti -8
 .B devlink
-.RI "[ " OPTIONS " ] { " dev | port | monitor | sb | resource | region " } { " COMMAND " | "
+.RI "[ " OPTIONS " ] { " dev | port | monitor | sb | resource | region | health " } { " COMMAND " | "
 .BR help " }"
 .sp
 
@@ -78,6 +78,10 @@ Turn on verbose output.
 .B region
 - devlink address region access
 
+.TP
+.B health
+- devlink reporting and recovery
+
 .SS
 .I COMMAND
 
@@ -109,6 +113,7 @@ Exit status is 0 if command was successful or a positive integer upon failure.
 .BR devlink-sb (8),
 .BR devlink-resource (8),
 .BR devlink-region (8),
+.BR devlink-health (8),
 .br
 
 .SH REPORTING BUGS
-- 
2.14.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [PATCH net-next v2 01/11] devlink: Add health buffer support
  2019-01-17 21:59 ` [PATCH net-next v2 01/11] devlink: Add health buffer support Eran Ben Elisha
@ 2019-01-20 10:03   ` Jiri Pirko
  2019-01-20 11:06     ` Eran Ben Elisha
  2019-01-20 11:20   ` Jiri Pirko
  1 sibling, 1 reply; 36+ messages in thread
From: Jiri Pirko @ 2019-01-20 10:03 UTC (permalink / raw)
  To: Eran Ben Elisha
  Cc: netdev, Jiri Pirko, David S. Miller, Ariel Almog, Aya Levin,
	Moshe Shemesh

Thu, Jan 17, 2019 at 10:59:10PM CET, eranbe@mellanox.com wrote:

[...]

>+static void
>+devlink_health_buffers_destroy(struct devlink_health_buffer **buffers_list,
>+			       u64 size);

Avoid fwd declarations.


>+
>+static struct devlink_health_buffer **
>+devlink_health_buffers_create(u64 size)
>+{
>+	struct devlink_health_buffer **buffers_list;
>+	u64 num_of_buffers = DEVLINK_HEALTH_SIZE_TO_BUFFERS(size);
>+	u64 i;
>+
>+	buffers_list = kcalloc(num_of_buffers,
>+			       sizeof(struct devlink_health_buffer *),
>+			       GFP_KERNEL);
>+	if (!buffers_list)
>+		return NULL;
>+
>+	for (i = 0; i < num_of_buffers; i++) {
>+		struct devlink_health_buffer *buffer;
>+		void *data;
>+
>+		buffer = kzalloc(sizeof(*buffer), GFP_KERNEL);
>+		data = kzalloc(DEVLINK_HEALTH_BUFFER_SIZE, GFP_KERNEL);
>+		if (!buffer || !data) {
>+			kfree(buffer);
>+			kfree(data);
>+			goto buffers_cleanup;
>+		}
>+		buffers_list[i] = buffer;
>+		buffer->data = data;
>+	}
>+	devlink_health_buffers_reset(buffers_list, num_of_buffers);
>+
>+	return buffers_list;
>+
>+buffers_cleanup:
>+	devlink_health_buffers_destroy(buffers_list, --i);

Just do for-kfree here.


>+	kfree(buffers_list);
>+	return NULL;
>+}
>+
>+static void
>+devlink_health_buffers_destroy(struct devlink_health_buffer **buffers_list,
>+			       u64 num_of_buffers)
>+{
>+	u64 i;
>+
>+	for (i = 0; i < num_of_buffers; i++) {
>+		kfree(buffers_list[i]->data);
>+		kfree(buffers_list[i]);
>+	}
>+}
>+

[...]

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH net-next v2 01/11] devlink: Add health buffer support
  2019-01-20 10:03   ` Jiri Pirko
@ 2019-01-20 11:06     ` Eran Ben Elisha
  2019-01-20 11:08       ` Jiri Pirko
  0 siblings, 1 reply; 36+ messages in thread
From: Eran Ben Elisha @ 2019-01-20 11:06 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: netdev, Jiri Pirko, David S. Miller, Ariel Almog, Aya Levin,
	Moshe Shemesh



On 1/20/2019 12:03 PM, Jiri Pirko wrote:
> Thu, Jan 17, 2019 at 10:59:10PM CET, eranbe@mellanox.com wrote:
> 
> [...]
> 
>> +static void
>> +devlink_health_buffers_destroy(struct devlink_health_buffer **buffers_list,
>> +			       u64 size);
> 
> Avoid fwd declarations.
> 
> 
>> +
>> +static struct devlink_health_buffer **
>> +devlink_health_buffers_create(u64 size)
>> +{
>> +	struct devlink_health_buffer **buffers_list;
>> +	u64 num_of_buffers = DEVLINK_HEALTH_SIZE_TO_BUFFERS(size);
>> +	u64 i;
>> +
>> +	buffers_list = kcalloc(num_of_buffers,
>> +			       sizeof(struct devlink_health_buffer *),
>> +			       GFP_KERNEL);
>> +	if (!buffers_list)
>> +		return NULL;
>> +
>> +	for (i = 0; i < num_of_buffers; i++) {
>> +		struct devlink_health_buffer *buffer;
>> +		void *data;
>> +
>> +		buffer = kzalloc(sizeof(*buffer), GFP_KERNEL);
>> +		data = kzalloc(DEVLINK_HEALTH_BUFFER_SIZE, GFP_KERNEL);
>> +		if (!buffer || !data) {
>> +			kfree(buffer);
>> +			kfree(data);
>> +			goto buffers_cleanup;
>> +		}
>> +		buffers_list[i] = buffer;
>> +		buffer->data = data;
>> +	}
>> +	devlink_health_buffers_reset(buffers_list, num_of_buffers);
>> +
>> +	return buffers_list;
>> +
>> +buffers_cleanup:
>> +	devlink_health_buffers_destroy(buffers_list, --i);
> 
> Just do for-kfree here.
> 
> 
>> +	kfree(buffers_list);
>> +	return NULL;
>> +}
>> +
>> +static void
>> +devlink_health_buffers_destroy(struct devlink_health_buffer **buffers_list,
>> +			       u64 num_of_buffers)
>> +{
>> +	u64 i;
>> +
>> +	for (i = 0; i < num_of_buffers; i++) {
>> +		kfree(buffers_list[i]->data);
>> +		kfree(buffers_list[i]);
>> +	}
>> +}
>> +
> 
> [...]
> 

Hi Jiri,
The series is merged. I can take the relevant comments as send as fix 
with the rest of the series if you wish to.

Eran

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH net-next v2 09/11] net/mlx5e: Add TX reporter support
  2019-01-17 21:59 ` [PATCH net-next v2 09/11] net/mlx5e: Add TX reporter support Eran Ben Elisha
@ 2019-01-20 11:06   ` Jiri Pirko
  2019-01-21 11:32     ` Eran Ben Elisha
  0 siblings, 1 reply; 36+ messages in thread
From: Jiri Pirko @ 2019-01-20 11:06 UTC (permalink / raw)
  To: Eran Ben Elisha
  Cc: netdev, Jiri Pirko, David S. Miller, Ariel Almog, Aya Levin,
	Moshe Shemesh

Thu, Jan 17, 2019 at 10:59:18PM CET, eranbe@mellanox.com wrote:
>Add mlx5e tx reporter to devlink health reporters. This reporter will be
>responsible for diagnosing, reporting and recovering of TX errors.
>This patch declares the TX reporter operations and allocate it using the
>devlink health API. Currently, this reporter supports reporting and
>recovering from send error CQE only. In addition, it adds diagnose
>information for the open SQs.
>
>For a local SQ recover (due to driver error report), in case of SQ recover
>failure, the recover operation will be considered as a failure.
>For a full TX recover, an attempt to close and open the channels will be
>done. If this one passed successfully, it will be considered as a
>successful recover.
>
>The SQ recover from error CQE flow is not a new feature in the driver,
>this patch re-organize the functions and adapt them for the devlink
>health API. For this purpose, move code from en_main.c to a new file
>named reporter_tx.c.
>
>Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
>Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
>---
> .../net/ethernet/mellanox/mlx5/core/Makefile  |   2 +-
> drivers/net/ethernet/mellanox/mlx5/core/en.h  |  18 +-
> .../ethernet/mellanox/mlx5/core/en/reporter.h |  14 +
> .../mellanox/mlx5/core/en/reporter_tx.c       | 321 ++++++++++++++++++
> .../net/ethernet/mellanox/mlx5/core/en_main.c | 135 +-------
> .../net/ethernet/mellanox/mlx5/core/en_tx.c   |   2 +-
> 6 files changed, 367 insertions(+), 125 deletions(-)
> create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en/reporter.h
> create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c
>
>diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Makefile b/drivers/net/ethernet/mellanox/mlx5/core/Makefile
>index 9de9abacf7f6..6bb2a860b15b 100644
>--- a/drivers/net/ethernet/mellanox/mlx5/core/Makefile
>+++ b/drivers/net/ethernet/mellanox/mlx5/core/Makefile
>@@ -22,7 +22,7 @@ mlx5_core-y :=	main.o cmd.o debugfs.o fw.o eq.o uar.o pagealloc.o \
> #
> mlx5_core-$(CONFIG_MLX5_CORE_EN) += en_main.o en_common.o en_fs.o en_ethtool.o \
> 		en_tx.o en_rx.o en_dim.o en_txrx.o en/xdp.o en_stats.o \
>-		en_selftest.o en/port.o en/monitor_stats.o
>+		en_selftest.o en/port.o en/monitor_stats.o en/reporter_tx.o
> 
> #
> # Netdev extra
>diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
>index 8fa8fdd30b85..27e276c9bf84 100644
>--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
>+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
>@@ -388,10 +388,7 @@ struct mlx5e_txqsq {
> 	struct mlx5e_channel      *channel;
> 	int                        txq_ix;
> 	u32                        rate_limit;
>-	struct mlx5e_txqsq_recover {
>-		struct work_struct         recover_work;
>-		u64                        last_recover;
>-	} recover;
>+	struct work_struct         recover_work;
> } ____cacheline_aligned_in_smp;
> 
> struct mlx5e_dma_info {
>@@ -682,6 +679,13 @@ struct mlx5e_rss_params {
> 	u8	hfunc;
> };
> 
>+struct mlx5e_modify_sq_param {
>+	int curr_state;
>+	int next_state;
>+	int rl_update;
>+	int rl_index;
>+};
>+
> struct mlx5e_priv {
> 	/* priv data path fields - start */
> 	struct mlx5e_txqsq *txq2sq[MLX5E_MAX_NUM_CHANNELS * MLX5E_MAX_NUM_TC];
>@@ -737,6 +741,7 @@ struct mlx5e_priv {
> #ifdef CONFIG_MLX5_EN_TLS
> 	struct mlx5e_tls          *tls;
> #endif
>+	struct devlink_health_reporter *tx_reporter;
> };
> 
> struct mlx5e_profile {
>@@ -866,6 +871,11 @@ void mlx5e_set_rq_type(struct mlx5_core_dev *mdev, struct mlx5e_params *params);
> void mlx5e_init_rq_type_params(struct mlx5_core_dev *mdev,
> 			       struct mlx5e_params *params);
> 
>+int mlx5e_modify_sq(struct mlx5_core_dev *mdev, u32 sqn,
>+		    struct mlx5e_modify_sq_param *p);
>+void mlx5e_activate_txqsq(struct mlx5e_txqsq *sq);
>+void mlx5e_tx_disable_queue(struct netdev_queue *txq);
>+
> static inline bool mlx5e_tunnel_inner_ft_supported(struct mlx5_core_dev *mdev)
> {
> 	return (MLX5_CAP_ETH(mdev, tunnel_stateless_gre) &&
>diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter.h b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter.h
>new file mode 100644
>index 000000000000..74083e120ab3
>--- /dev/null
>+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter.h
>@@ -0,0 +1,14 @@
>+/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
>+/* Copyright (c) 2018 Mellanox Technologies. */
>+
>+#ifndef __MLX5E_EN_REPORTER_H
>+#define __MLX5E_EN_REPORTER_H
>+
>+#include <linux/mlx5/driver.h>
>+#include "en.h"
>+
>+int mlx5e_tx_reporter_create(struct mlx5e_priv *priv);
>+void mlx5e_tx_reporter_destroy(struct mlx5e_priv *priv);
>+void mlx5e_tx_reporter_err_cqe(struct mlx5e_txqsq *sq);
>+
>+#endif
>diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c
>new file mode 100644
>index 000000000000..9800df4909c2
>--- /dev/null
>+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c
>@@ -0,0 +1,321 @@
>+/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
>+/* Copyright (c) 2018 Mellanox Technologies. */
>+
>+#include <net/devlink.h>
>+#include "reporter.h"
>+#include "lib/eq.h"
>+
>+#define MLX5E_TX_REPORTER_PER_SQ_MAX_LEN 256

Some made-up number. I don't like how this whole thing is do. I will
comment on it in the devlink part.


[...]


>+static int
>+mlx5e_tx_reporter_build_diagnose_output(struct devlink_health_buffer *buffer,
>+					u32 sqn, u8 state, u8 stopped)
>+{
>+	int err, i;
>+	int nest = 0;
>+	char name[20];
>+
>+	err = devlink_health_buffer_nest_start(buffer,
>+					       DEVLINK_ATTR_HEALTH_BUFFER_OBJECT);
>+	if (err)
>+		goto buffer_error;
>+	nest++;
>+
>+	err = devlink_health_buffer_nest_start(buffer,
>+					       DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_PAIR);
>+	if (err)
>+		goto buffer_error;
>+	nest++;
>+
>+	sprintf(name, "SQ 0x%x", sqn);

No. The whole idea of having the message build up with nested attributes
(json-like) was to avoid things like this. No sprintfs please. If you
want to do sprintf, most of the time you are doing something wrong.


>+	err = devlink_health_buffer_put_object_name(buffer, name);
>+	if (err)
>+		goto buffer_error;
>+
>+	err = devlink_health_buffer_nest_start(buffer,
>+					       DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_VALUE);
>+	if (err)
>+		goto buffer_error;
>+	nest++;
>+
>+	err = devlink_health_buffer_nest_start(buffer,
>+					       DEVLINK_ATTR_HEALTH_BUFFER_OBJECT);
>+	if (err)
>+		goto buffer_error;
>+	nest++;
>+
>+	err = devlink_health_buffer_nest_start(buffer,
>+					       DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_PAIR);
>+	if (err)
>+		goto buffer_error;
>+	nest++;
>+
>+	err = devlink_health_buffer_put_object_name(buffer, "HW state");
>+	if (err)
>+		goto buffer_error;
>+
>+	err = devlink_health_buffer_nest_start(buffer,
>+					       DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_VALUE);

How could you put an object name and don't start nesting? It looks
implicit to me.



>+	if (err)
>+		goto buffer_error;
>+	nest++;
>+
>+	err = devlink_health_buffer_put_value_u8(buffer, state);
>+	if (err)
>+		goto buffer_error;
>+
>+	devlink_health_buffer_nest_end(buffer); /* DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_VALUE */
>+	nest--;
>+
>+	devlink_health_buffer_nest_end(buffer); /* DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_PAIR */
>+	nest--;
>+
>+	err = devlink_health_buffer_nest_start(buffer,
>+					       DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_PAIR);
>+	if (err)
>+		goto buffer_error;
>+	nest++;
>+
>+	err = devlink_health_buffer_put_object_name(buffer, "stopped");
>+	if (err)
>+		goto buffer_error;
>+
>+	err = devlink_health_buffer_nest_start(buffer,
>+					       DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_VALUE);
>+	if (err)
>+		goto buffer_error;
>+	nest++;

What's up with this nest++ nest--?
When you open a nest, you should have an error path inten to close the
nest and jump there in case of an error. This is hard to read and
understand.
	


>+
>+	err = devlink_health_buffer_put_value_u8(buffer, stopped);
>+	if (err)
>+		goto buffer_error;
>+
>+	for (i = 0; i < nest; i++)
>+		devlink_health_buffer_nest_end(buffer);
>+
>+	return 0;
>+
>+buffer_error:
>+	for (i = 0; i < nest; i++)
>+		devlink_health_buffer_nest_cancel(buffer);
>+	return err;
>+}
>+
>+static int mlx5e_tx_reporter_diagnose(struct devlink_health_reporter *reporter,
>+				      struct devlink_health_buffer **buffers_array,
>+				      unsigned int buffer_size,
>+				      unsigned int num_buffers)
>+{
>+	struct mlx5e_priv *priv = devlink_health_reporter_priv(reporter);
>+	unsigned int buff = 0;
>+	int i = 0, err = 0;
>+
>+	if (buffer_size < MLX5E_TX_REPORTER_PER_SQ_MAX_LEN)
>+		return -ENOMEM;
>+
>+	mutex_lock(&priv->state_lock);
>+
>+	if (!test_bit(MLX5E_STATE_OPENED, &priv->state)) {
>+		mutex_unlock(&priv->state_lock);
>+		return 0;
>+	}
>+
>+	while (i < priv->channels.num * priv->channels.params.num_tc) {
>+		struct mlx5e_txqsq *sq = priv->txq2sq[i];
>+		u8 state;
>+
>+		err = mlx5_core_query_sq_state(priv->mdev, sq->sqn, &state);
>+		if (err)
>+			break;
>+
>+		err = mlx5e_tx_reporter_build_diagnose_output(buffers_array[buff],
>+							      sq->sqn, state,
>+							      netif_xmit_stopped(sq->txq));
>+		if (err) {
>+			if (++buff == num_buffers)
>+				break;
>+		} else {
>+			i++;
>+		}
>+	}
>+
>+	mutex_unlock(&priv->state_lock);
>+	return err;
>+}
>+
>+static const struct devlink_health_reporter_ops mlx5_tx_reporter_ops = {
>+		.name = "TX",
>+		.recover = mlx5e_tx_reporter_recover,
>+		.diagnose_size = MLX5E_MAX_NUM_CHANNELS * MLX5E_MAX_NUM_TC *
>+				 MLX5E_TX_REPORTER_PER_SQ_MAX_LEN,
>+		.diagnose = mlx5e_tx_reporter_diagnose,
>+		.dump_size = 0,
>+		.dump = NULL,

Don't initialize 0 and NULL in static const.

[...]

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH net-next v2 01/11] devlink: Add health buffer support
  2019-01-20 11:06     ` Eran Ben Elisha
@ 2019-01-20 11:08       ` Jiri Pirko
  2019-01-20 18:45         ` David Miller
  0 siblings, 1 reply; 36+ messages in thread
From: Jiri Pirko @ 2019-01-20 11:08 UTC (permalink / raw)
  To: Eran Ben Elisha
  Cc: netdev, Jiri Pirko, David S. Miller, Ariel Almog, Aya Levin,
	Moshe Shemesh

Sun, Jan 20, 2019 at 12:06:18PM CET, eranbe@mellanox.com wrote:
>
>
>On 1/20/2019 12:03 PM, Jiri Pirko wrote:
>> Thu, Jan 17, 2019 at 10:59:10PM CET, eranbe@mellanox.com wrote:
>> 
>> [...]
>> 
>>> +static void
>>> +devlink_health_buffers_destroy(struct devlink_health_buffer **buffers_list,
>>> +			       u64 size);
>> 
>> Avoid fwd declarations.
>> 
>> 
>>> +
>>> +static struct devlink_health_buffer **
>>> +devlink_health_buffers_create(u64 size)
>>> +{
>>> +	struct devlink_health_buffer **buffers_list;
>>> +	u64 num_of_buffers = DEVLINK_HEALTH_SIZE_TO_BUFFERS(size);
>>> +	u64 i;
>>> +
>>> +	buffers_list = kcalloc(num_of_buffers,
>>> +			       sizeof(struct devlink_health_buffer *),
>>> +			       GFP_KERNEL);
>>> +	if (!buffers_list)
>>> +		return NULL;
>>> +
>>> +	for (i = 0; i < num_of_buffers; i++) {
>>> +		struct devlink_health_buffer *buffer;
>>> +		void *data;
>>> +
>>> +		buffer = kzalloc(sizeof(*buffer), GFP_KERNEL);
>>> +		data = kzalloc(DEVLINK_HEALTH_BUFFER_SIZE, GFP_KERNEL);
>>> +		if (!buffer || !data) {
>>> +			kfree(buffer);
>>> +			kfree(data);
>>> +			goto buffers_cleanup;
>>> +		}
>>> +		buffers_list[i] = buffer;
>>> +		buffer->data = data;
>>> +	}
>>> +	devlink_health_buffers_reset(buffers_list, num_of_buffers);
>>> +
>>> +	return buffers_list;
>>> +
>>> +buffers_cleanup:
>>> +	devlink_health_buffers_destroy(buffers_list, --i);
>> 
>> Just do for-kfree here.
>> 
>> 
>>> +	kfree(buffers_list);
>>> +	return NULL;
>>> +}
>>> +
>>> +static void
>>> +devlink_health_buffers_destroy(struct devlink_health_buffer **buffers_list,
>>> +			       u64 num_of_buffers)
>>> +{
>>> +	u64 i;
>>> +
>>> +	for (i = 0; i < num_of_buffers; i++) {
>>> +		kfree(buffers_list[i]->data);
>>> +		kfree(buffers_list[i]);
>>> +	}
>>> +}
>>> +
>> 
>> [...]
>> 
>
>Hi Jiri,
>The series is merged. I can take the relevant comments as send as fix 
>with the rest of the series if you wish to.

I haven't have time to review this due to travel. I think it was mistake
to merge this as the buffer api is wrong in my opinion. I would vote for
revert if possible.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH net-next v2 01/11] devlink: Add health buffer support
  2019-01-17 21:59 ` [PATCH net-next v2 01/11] devlink: Add health buffer support Eran Ben Elisha
  2019-01-20 10:03   ` Jiri Pirko
@ 2019-01-20 11:20   ` Jiri Pirko
  1 sibling, 0 replies; 36+ messages in thread
From: Jiri Pirko @ 2019-01-20 11:20 UTC (permalink / raw)
  To: Eran Ben Elisha
  Cc: netdev, Jiri Pirko, David S. Miller, Ariel Almog, Aya Levin,
	Moshe Shemesh

Thu, Jan 17, 2019 at 10:59:10PM CET, eranbe@mellanox.com wrote:
>Devlink health buffer is a mechanism to pass descriptors between drivers
>and devlink. The API allows the driver to add objects, object pair,
>value array (nested attributes), value and name.
>
>Driver can use this API to fill the buffers in a format which can be
>translated by the devlink to the netlink message.
>
>In order to fulfill it, an internal buffer descriptor is defined. This
>will hold the data and metadata per each attribute and by used to pass
>actual commands to the netlink.
>
>This mechanism will be later used in devlink health for dump and diagnose
>data store by the drivers.
>
>Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
>Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
>---
> include/net/devlink.h        |  76 ++++++
> include/uapi/linux/devlink.h |   8 +
> net/core/devlink.c           | 501 +++++++++++++++++++++++++++++++++++
> 3 files changed, 585 insertions(+)
>

[...]

	
>+static int
>+devlink_health_buffer_snd(struct genl_info *info,
>+			  enum devlink_command cmd, int flags,
>+			  struct devlink_health_buffer **buffers_array,
>+			  u64 num_of_buffers)
>+{
>+	struct sk_buff *skb;
>+	struct nlmsghdr *nlh;
>+	void *hdr;
>+	int err;
>+	u64 i;
>+
>+	for (i = 0; i < num_of_buffers; i++) {
>+		/* Skip buffer if driver did not fill it up with any data */
>+		if (!buffers_array[i]->offset)
>+			continue;
>+
>+		skb = genlmsg_new(GENLMSG_DEFAULT_SIZE, GFP_KERNEL);
>+		if (!skb)
>+			return -ENOMEM;
>+
>+		hdr = genlmsg_put(skb, info->snd_portid, info->snd_seq,
>+				  &devlink_nl_family, NLM_F_MULTI, cmd);
>+		if (!hdr)
>+			goto nla_put_failure;
>+
>+		err = devlink_health_buffer_prepare_skb(skb, buffers_array[i]);
>+		if (err)
>+			goto nla_put_failure;
>+
>+		genlmsg_end(skb, hdr);
>+		err = genlmsg_reply(skb, info);
>+		if (err)
>+			return err;
>+	}

So you have an array of "buffers". I don't see a need for it. Mapping
this "buffer" 1:1 to netlink skb leads to lots of skbs for info
that could be send in one or a few skbs.

The API to the driver should be different. The driver should not care
about any "buffer" or size of it (in bytes) or how many of them should
be. The driver should just construct a "message". The helpers should be
similar to what you have, but the arg should be say "struct devlink_msg".
It is not really anything special to "health". It is basically json-like
formatted message.

So the driver should use the API to open and close objects, to fill
values etc. Devlink should take care of allocation of needed
buffer/buffers/structs/objects on fly. It could be one linked list of
object for all it matters. No byte buffer needed.

Then, when devlink needs to send netlink skb, it should take this
"struct devlink msg" and translate it to one or many skbs.

Basically the whole API to the driver is wrong, I think it would be
much easier to revert, redo and reapply.



>+
>+	skb = genlmsg_new(GENLMSG_DEFAULT_SIZE, GFP_KERNEL);
>+	if (!skb)
>+		return -ENOMEM;
>+	nlh = nlmsg_put(skb, info->snd_portid, info->snd_seq,
>+			NLMSG_DONE, 0, flags | NLM_F_MULTI);
>+	err = genlmsg_reply(skb, info);
>+	if (err)
>+		return err;
>+
>+	return 0;
>+
>+nla_put_failure:
>+	err = -EIO;
>+	nlmsg_free(skb);
>+	return err;
>+}
>+
> static const struct nla_policy devlink_nl_policy[DEVLINK_ATTR_MAX + 1] = {
> 	[DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING },
> 	[DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING },
>-- 
>2.17.1
>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH net-next v2 03/11] devlink: Add health report functionality
  2019-01-17 21:59 ` [PATCH net-next v2 03/11] devlink: Add health report functionality Eran Ben Elisha
@ 2019-01-20 11:27   ` Jiri Pirko
  2019-01-21 11:12     ` Eran Ben Elisha
  0 siblings, 1 reply; 36+ messages in thread
From: Jiri Pirko @ 2019-01-20 11:27 UTC (permalink / raw)
  To: Eran Ben Elisha
  Cc: netdev, Jiri Pirko, David S. Miller, Ariel Almog, Aya Levin,
	Moshe Shemesh

Thu, Jan 17, 2019 at 10:59:12PM CET, eranbe@mellanox.com wrote:

[...]

>+
>+TRACE_EVENT(devlink_health_recover_aborted,
>+	TP_PROTO(const struct devlink *devlink, const char *reporter_name,
>+		 bool health_state, u64 time_since_last_recover),
>+
>+	TP_ARGS(devlink, reporter_name, health_state, time_since_last_recover),
>+
>+	TP_STRUCT__entry(
>+		__string(bus_name, devlink->dev->bus->name)
>+		__string(dev_name, dev_name(devlink->dev))
>+		__string(driver_name, devlink->dev->driver->name)
>+		__string(reporter_name, reporter_name)
>+		__field(bool, health_state)
>+		__field(u64, time_since_last_recover)
>+	),
>+
>+	TP_fast_assign(
>+		__assign_str(bus_name, devlink->dev->bus->name);
>+		__assign_str(dev_name, dev_name(devlink->dev));
>+		__assign_str(driver_name, devlink->dev->driver->name);
>+		__assign_str(reporter_name, reporter_name);
>+		__entry->health_state = health_state;
>+		__entry->time_since_last_recover = time_since_last_recover;
>+	),
>+
>+	TP_printk("bus_name=%s dev_name=%s driver_name=%s reporter_name=%s: health_state=%d time_since_last_recover = %llu recover aborted",

Drop spaces around "=".

Otherwise, this looks fine.
Acked-by: Jiri Pirko <jiri@mellanox.com>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH net-next v2 04/11] devlink: Add health get command
  2019-01-17 21:59 ` [PATCH net-next v2 04/11] devlink: Add health get command Eran Ben Elisha
@ 2019-01-20 11:31   ` Jiri Pirko
  0 siblings, 0 replies; 36+ messages in thread
From: Jiri Pirko @ 2019-01-20 11:31 UTC (permalink / raw)
  To: Eran Ben Elisha
  Cc: netdev, Jiri Pirko, David S. Miller, Ariel Almog, Aya Levin,
	Moshe Shemesh

Thu, Jan 17, 2019 at 10:59:13PM CET, eranbe@mellanox.com wrote:
>Add devlink health get command to provide reporter/s data for user space.
>Add the ability to get data per reporter or dump data from all available
>reporters.
>
>Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
>Reviewed-by: Moshe Shemesh <moshe@mellanox.com>

Acked-by: Jiri Pirko <jiri@mellanox.com>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH net-next v2 05/11] devlink: Add health set command
  2019-01-17 21:59 ` [PATCH net-next v2 05/11] devlink: Add health set command Eran Ben Elisha
@ 2019-01-20 11:32   ` Jiri Pirko
  0 siblings, 0 replies; 36+ messages in thread
From: Jiri Pirko @ 2019-01-20 11:32 UTC (permalink / raw)
  To: Eran Ben Elisha
  Cc: netdev, Jiri Pirko, David S. Miller, Ariel Almog, Aya Levin,
	Moshe Shemesh

Thu, Jan 17, 2019 at 10:59:14PM CET, eranbe@mellanox.com wrote:
>Add devlink health set command, in order to set configuration parameters
>for a specific reporter.
>Supported parameters are:
>- graceful_period: Time interval between auto recoveries (in msec)
>- auto_recover: Determines if the devlink shall execute recover upon
>		receiving error for the reporter
>
>Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
>Reviewed-by: Moshe Shemesh <moshe@mellanox.com>

Acked-by: Jiri Pirko <jiri@mellanox.com>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH net-next v2 06/11] devlink: Add health recover command
  2019-01-17 21:59 ` [PATCH net-next v2 06/11] devlink: Add health recover command Eran Ben Elisha
@ 2019-01-20 11:33   ` Jiri Pirko
  0 siblings, 0 replies; 36+ messages in thread
From: Jiri Pirko @ 2019-01-20 11:33 UTC (permalink / raw)
  To: Eran Ben Elisha
  Cc: netdev, Jiri Pirko, David S. Miller, Ariel Almog, Aya Levin,
	Moshe Shemesh

Thu, Jan 17, 2019 at 10:59:15PM CET, eranbe@mellanox.com wrote:
>Add devlink health recover command to the uapi, in order to allow the user
>to execute a recover operation over a specific reporter.
>
>Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
>Reviewed-by: Moshe Shemesh <moshe@mellanox.com>

Acked-by: Jiri Pirko <jiri@mellanox.com>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH net-next v2 07/11] devlink: Add health diagnose command
  2019-01-17 21:59 ` [PATCH net-next v2 07/11] devlink: Add health diagnose command Eran Ben Elisha
@ 2019-01-20 11:38   ` Jiri Pirko
  0 siblings, 0 replies; 36+ messages in thread
From: Jiri Pirko @ 2019-01-20 11:38 UTC (permalink / raw)
  To: Eran Ben Elisha
  Cc: netdev, Jiri Pirko, David S. Miller, Ariel Almog, Aya Levin,
	Moshe Shemesh

Thu, Jan 17, 2019 at 10:59:16PM CET, eranbe@mellanox.com wrote:
>Add devlink health diagnose command, in order to run a diagnose
>operation over a specific reporter.
>
>It is expected from driver's callback for diagnose command to fill it
>via the buffer descriptors API. Devlink will parse it and convert it to
>netlink nla API in order to pass it to the user.
>
>Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
>Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
>---
> include/uapi/linux/devlink.h |  1 +
> net/core/devlink.c           | 51 ++++++++++++++++++++++++++++++++++++
> 2 files changed, 52 insertions(+)
>
>diff --git a/include/uapi/linux/devlink.h b/include/uapi/linux/devlink.h
>index 1c186fd77343..51b4d7612cf8 100644
>--- a/include/uapi/linux/devlink.h
>+++ b/include/uapi/linux/devlink.h
>@@ -92,6 +92,7 @@ enum devlink_command {
> 	DEVLINK_CMD_HEALTH_REPORTER_GET,
> 	DEVLINK_CMD_HEALTH_REPORTER_SET,
> 	DEVLINK_CMD_HEALTH_REPORTER_RECOVER,
>+	DEVLINK_CMD_HEALTH_REPORTER_DIAGNOSE,
> 
> 	/* add new commands above here */
> 	__DEVLINK_CMD_MAX,
>diff --git a/net/core/devlink.c b/net/core/devlink.c
>index b224d0d31c0c..57252ca31e1e 100644
>--- a/net/core/devlink.c
>+++ b/net/core/devlink.c
>@@ -4500,6 +4500,50 @@ static int devlink_nl_cmd_health_reporter_recover_doit(struct sk_buff *skb,
> 	return devlink_health_reporter_recover(reporter, NULL);
> }
> 
>+static int devlink_nl_cmd_health_reporter_diagnose_doit(struct sk_buff *skb,
>+							struct genl_info *info)
>+{
>+	struct devlink *devlink = info->user_ptr[0];
>+	struct devlink_health_reporter *reporter;
>+	u64 num_of_buffers;
>+	int err;
>+
>+	reporter = devlink_health_reporter_get_from_info(devlink, info);
>+	if (!reporter)
>+		return -EINVAL;
>+
>+	if (!reporter->ops->diagnose)
>+		return -EOPNOTSUPP;
>+
>+	num_of_buffers =
>+		DEVLINK_HEALTH_SIZE_TO_BUFFERS(reporter->ops->diagnose_size);

This is odd. You need to convert to number of "buffers" the magic value
the driver provided. And then couple lines below....


>+
>+	mutex_lock(&reporter->diagnose_lock);
>+	devlink_health_buffers_reset(reporter->diagnose_buffers_array,
>+				     num_of_buffers);
>+
>+	err = reporter->ops->diagnose(reporter,
>+				      reporter->diagnose_buffers_array,
>+				      DEVLINK_HEALTH_BUFFER_SIZE,
>+				      num_of_buffers);

....you call the driver with "buffers array", buffer size and number of
buffers. It should be just:

	struct devlink_msg_ctx msg_ctx;

	err = reporter->ops->diagnose(reporter, &msg_ctx);

and then;

	err = devlink_health_buffer_snd(info,
					devlink_cmd_health_reporter_diagnose,
					0, &msg_ctx);


>+	if (err)
>+		goto out;
>+
>+	err = devlink_health_buffer_snd(info,
>+					devlink_cmd_health_reporter_diagnose,
>+					0, reporter->diagnose_buffers_array,
>+					num_of_buffers);
>+	if (err)
>+		goto out;
>+
>+	mutex_unlock(&reporter->diagnose_lock);
>+	return 0;
>+
>+out:
>+	mutex_unlock(&reporter->diagnose_lock);
>+	return err;
>+}
>+
> static const struct nla_policy devlink_nl_policy[DEVLINK_ATTR_MAX + 1] = {
> 	[DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING },
> 	[DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING },
>@@ -4770,6 +4814,13 @@ static const struct genl_ops devlink_nl_ops[] = {
> 		.flags = GENL_ADMIN_PERM,
> 		.internal_flags = DEVLINK_NL_FLAG_NEED_DEVLINK,
> 	},
>+	{
>+		.cmd = DEVLINK_CMD_HEALTH_REPORTER_DIAGNOSE,
>+		.doit = devlink_nl_cmd_health_reporter_diagnose_doit,
>+		.policy = devlink_nl_policy,
>+		.flags = GENL_ADMIN_PERM,
>+		.internal_flags = DEVLINK_NL_FLAG_NEED_DEVLINK,
>+	},
> };
> 
> static struct genl_family devlink_nl_family __ro_after_init = {
>-- 
>2.17.1
>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH net-next v2 02/11] devlink: Add health reporter create/destroy functionality
  2019-01-17 21:59 ` [PATCH net-next v2 02/11] devlink: Add health reporter create/destroy functionality Eran Ben Elisha
@ 2019-01-20 11:49   ` Jiri Pirko
  0 siblings, 0 replies; 36+ messages in thread
From: Jiri Pirko @ 2019-01-20 11:49 UTC (permalink / raw)
  To: Eran Ben Elisha
  Cc: netdev, Jiri Pirko, David S. Miller, Ariel Almog, Aya Levin,
	Moshe Shemesh

Thu, Jan 17, 2019 at 10:59:11PM CET, eranbe@mellanox.com wrote:
>Devlink health reporter is an instance for reporting, diagnosing and
>recovering from run time errors discovered by the reporters.
>Define it's data structure and supported operations.
>In addition, expose devlink API to create and destroy a reporter.
>Each devlink instance will hold it's own reporters list.
>
>As part of the allocation, driver shall provide a set of callbacks which
>will be used the devlink in order to handle health reports and user
>commands related to this reporter. In addition, driver is entitled to
>provide some priv pointer, which can be fetched from the reporter by
>devlink_health_reporter_priv function.
>
>For each reporter, devlink will hold a metadata of statistics,
>buffers and status.
>
>Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
>Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
>---
> include/net/devlink.h |  59 ++++++++++++++++++++
> net/core/devlink.c    | 127 ++++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 186 insertions(+)
>
>diff --git a/include/net/devlink.h b/include/net/devlink.h
>index 77c77319290a..7fe30d67678a 100644
>--- a/include/net/devlink.h
>+++ b/include/net/devlink.h
>@@ -30,6 +30,7 @@ struct devlink {
> 	struct list_head param_list;
> 	struct list_head region_list;
> 	u32 snapshot_id;
>+	struct list_head reporter_list;
> 	struct devlink_dpipe_headers *dpipe_headers;
> 	const struct devlink_ops *ops;
> 	struct device *dev;
>@@ -424,6 +425,34 @@ struct devlink_region;
> typedef void devlink_snapshot_data_dest_t(const void *data);
> 
> struct devlink_health_buffer;
>+struct devlink_health_reporter;
>+
>+/**
>+ * struct devlink_health_reporter_ops - Reporter operations
>+ * @name: reporter name
>+ * dump_size: dump buffer size allocated by the devlink
>+ * diagnose_size: diagnose buffer size allocated by the devlink
>+ * recover: callback to recover from reported error
>+ *          if priv_ctx is NULL, run a full recover
>+ * dump: callback to dump an object
>+ *       if priv_ctx is NULL, run a full dump
>+ * diagnose: callback to diagnose the current status
>+ */
>+
>+struct devlink_health_reporter_ops {
>+	char *name;
>+	unsigned int dump_size;
>+	unsigned int diagnose_size;
>+	int (*recover)(struct devlink_health_reporter *reporter,
>+		       void *priv_ctx);
>+	int (*dump)(struct devlink_health_reporter *reporter,
>+		    struct devlink_health_buffer **buffers_array,
>+		    unsigned int buffer_size, unsigned int num_buffers,
>+		    void *priv_ctx);
>+	int (*diagnose)(struct devlink_health_reporter *reporter,
>+			struct devlink_health_buffer **buffers_array,
>+			unsigned int buffer_size, unsigned int num_buffers);
>+};
> 
> struct devlink_ops {
> 	int (*reload)(struct devlink *devlink, struct netlink_ext_ack *extack);
>@@ -602,6 +631,16 @@ int devlink_health_buffer_put_value_string(struct devlink_health_buffer *buffer,
> 					   char *name);
> int devlink_health_buffer_put_value_data(struct devlink_health_buffer *buffer,
> 					 void *data, int len);
>+struct devlink_health_reporter *
>+devlink_health_reporter_create(struct devlink *devlink,
>+			       const struct devlink_health_reporter_ops *ops,
>+			       u64 graceful_period, bool auto_recover,
>+			       void *priv);
>+void
>+devlink_health_reporter_destroy(struct devlink_health_reporter *reporter);
>+
>+void *
>+devlink_health_reporter_priv(struct devlink_health_reporter *reporter);
> #else
> 
> static inline struct devlink *devlink_alloc(const struct devlink_ops *ops,
>@@ -920,6 +959,26 @@ devlink_health_buffer_put_value_data(struct devlink_health_buffer *buffer,
> {
> 	return 0;
> }
>+
>+static inline struct devlink_health_reporter *
>+devlink_health_reporter_create(struct devlink *devlink,
>+			       const struct devlink_health_reporter_ops *ops,
>+			       u64 graceful_period, bool auto_recover,
>+			       void *priv)
>+{
>+	return NULL;
>+}
>+
>+static inline void
>+devlink_health_reporter_destroy(struct devlink_health_reporter *reporter)
>+{
>+}
>+
>+static inline void *
>+devlink_health_reporter_priv(struct devlink_health_reporter *reporter)
>+{
>+	return NULL;
>+}
> #endif
> 
> #endif /* _NET_DEVLINK_H_ */
>diff --git a/net/core/devlink.c b/net/core/devlink.c
>index 8984501edade..fec169a28dba 100644
>--- a/net/core/devlink.c
>+++ b/net/core/devlink.c
>@@ -4098,6 +4098,132 @@ devlink_health_buffer_snd(struct genl_info *info,
> 	return err;
> }
> 
>+struct devlink_health_reporter {
>+	struct list_head list;
>+	struct devlink_health_buffer **dump_buffers_array;
>+	struct mutex dump_lock; /* lock parallel read/write from dump buffers */
>+	struct devlink_health_buffer **diagnose_buffers_array;
>+	struct mutex diagnose_lock; /* lock parallel read/write from diagnose buffers */
>+	void *priv;
>+	const struct devlink_health_reporter_ops *ops;
>+	struct devlink *devlink;
>+	u64 graceful_period;
>+	bool auto_recover;
>+	u8 health_state;
>+};
>+
>+void *
>+devlink_health_reporter_priv(struct devlink_health_reporter *reporter)
>+{
>+	return reporter->priv;
>+}
>+EXPORT_SYMBOL_GPL(devlink_health_reporter_priv);
>+
>+static struct devlink_health_reporter *
>+devlink_health_reporter_find_by_name(struct devlink *devlink,
>+				     const char *reporter_name)
>+{
>+	struct devlink_health_reporter *reporter;
>+
>+	list_for_each_entry(reporter, &devlink->reporter_list, list)
>+		if (!strcmp(reporter->ops->name, reporter_name))
>+			return reporter;
>+	return NULL;
>+}
>+
>+/**
>+ *	devlink_health_reporter_create - create devlink health reporter
>+ *
>+ *	@devlink: devlink
>+ *	@ops: ops
>+ *	@graceful_period: to avoid recovery loops, in msecs
>+ *	@auto_recover: auto recover when error occurs
>+ *	@priv: priv
>+ */
>+struct devlink_health_reporter *
>+devlink_health_reporter_create(struct devlink *devlink,
>+			       const struct devlink_health_reporter_ops *ops,
>+			       u64 graceful_period, bool auto_recover,
>+			       void *priv)
>+{
>+	struct devlink_health_reporter *reporter;
>+
>+	mutex_lock(&devlink->lock);
>+	if (devlink_health_reporter_find_by_name(devlink, ops->name)) {
>+		reporter = ERR_PTR(-EEXIST);
>+		goto unlock;
>+	}
>+
>+	if (WARN_ON(ops->dump && !ops->dump_size) ||
>+	    WARN_ON(ops->diagnose && !ops->diagnose_size) ||
>+	    WARN_ON(auto_recover && !ops->recover) ||
>+	    WARN_ON(graceful_period && !ops->recover)) {
>+		reporter = ERR_PTR(-EINVAL);
>+		goto unlock;
>+	}
>+
>+	reporter = kzalloc(sizeof(*reporter), GFP_KERNEL);
>+	if (!reporter) {
>+		reporter = ERR_PTR(-ENOMEM);
>+		goto unlock;
>+	}
>+
>+	if (ops->dump) {
>+		reporter->dump_buffers_array =
>+			devlink_health_buffers_create(ops->dump_size);
>+		if (!reporter->dump_buffers_array) {
>+			kfree(reporter);
>+			reporter = ERR_PTR(-ENOMEM);
>+			goto unlock;
>+		}
>+	}
>+
>+	if (ops->diagnose) {
>+		reporter->diagnose_buffers_array =
>+			devlink_health_buffers_create(ops->diagnose_size);
>+		if (!reporter->diagnose_buffers_array) {
>+			devlink_health_buffers_destroy(reporter->dump_buffers_array,
>+						       DEVLINK_HEALTH_SIZE_TO_BUFFERS(ops->dump_size));

This is just ugly. :/

As I wrote in the other email, should be converted to simple
"msg_ctx = msg_ctx_create(), msg_ctx_destroy(msg_ctx)", no sizes, no
buffers visible to the driver.


>+			kfree(reporter);
>+			reporter = ERR_PTR(-ENOMEM);
>+			goto unlock;
>+		}
>+	}
>+
>+	list_add_tail(&reporter->list, &devlink->reporter_list);
>+	mutex_init(&reporter->dump_lock);
>+	mutex_init(&reporter->diagnose_lock);
>+
>+	reporter->priv = priv;
>+	reporter->ops = ops;
>+	reporter->devlink = devlink;
>+	reporter->graceful_period = graceful_period;
>+	reporter->auto_recover = auto_recover;
>+unlock:
>+	mutex_unlock(&devlink->lock);
>+	return reporter;
>+}
>+EXPORT_SYMBOL_GPL(devlink_health_reporter_create);
>+
>+/**
>+ *	devlink_health_reporter_destroy - destroy devlink health reporter
>+ *
>+ *	@reporter: devlink health reporter to destroy
>+ */
>+void
>+devlink_health_reporter_destroy(struct devlink_health_reporter *reporter)
>+{
>+	mutex_lock(&reporter->devlink->lock);
>+	list_del(&reporter->list);
>+	devlink_health_buffers_destroy(reporter->dump_buffers_array,
>+				       DEVLINK_HEALTH_SIZE_TO_BUFFERS(reporter->ops->dump_size));
>+	devlink_health_buffers_destroy(reporter->diagnose_buffers_array,
>+				       DEVLINK_HEALTH_SIZE_TO_BUFFERS(reporter->ops->diagnose_size));
>+	kfree(reporter);
>+	mutex_unlock(&reporter->devlink->lock);
>+}
>+EXPORT_SYMBOL_GPL(devlink_health_reporter_destroy);
>+
> static const struct nla_policy devlink_nl_policy[DEVLINK_ATTR_MAX + 1] = {
> 	[DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING },
> 	[DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING },
>@@ -4383,6 +4509,7 @@ struct devlink *devlink_alloc(const struct devlink_ops *ops, size_t priv_size)
> 	INIT_LIST_HEAD(&devlink->resource_list);
> 	INIT_LIST_HEAD(&devlink->param_list);
> 	INIT_LIST_HEAD(&devlink->region_list);
>+	INIT_LIST_HEAD(&devlink->reporter_list);
> 	mutex_init(&devlink->lock);
> 	return devlink;
> }
>-- 
>2.17.1
>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH net-next v2 01/11] devlink: Add health buffer support
  2019-01-20 11:08       ` Jiri Pirko
@ 2019-01-20 18:45         ` David Miller
  2019-01-21 11:07           ` Eran Ben Elisha
  0 siblings, 1 reply; 36+ messages in thread
From: David Miller @ 2019-01-20 18:45 UTC (permalink / raw)
  To: jiri; +Cc: eranbe, netdev, jiri, ariela, ayal, moshe

From: Jiri Pirko <jiri@resnulli.us>
Date: Sun, 20 Jan 2019 12:08:50 +0100

> I haven't have time to review this due to travel. I think it was mistake
> to merge this as the buffer api is wrong in my opinion. I would vote for
> revert if possible.

Let's spend a couple days trying to fix things up first.

I cannot make contributors hostage to other people's travel
schedule, no way.

If it seems hopeless by Wednesday, I will revert.

Thanks.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH net-next v2 01/11] devlink: Add health buffer support
  2019-01-20 18:45         ` David Miller
@ 2019-01-21 11:07           ` Eran Ben Elisha
  2019-01-21 12:08             ` Jiri Pirko
  0 siblings, 1 reply; 36+ messages in thread
From: Eran Ben Elisha @ 2019-01-21 11:07 UTC (permalink / raw)
  To: David Miller, jiri
  Cc: netdev, Jiri Pirko, Ariel Almog, Aya Levin, Moshe Shemesh



On 1/20/2019 8:45 PM, David Miller wrote:
> From: Jiri Pirko <jiri@resnulli.us>
> Date: Sun, 20 Jan 2019 12:08:50 +0100
> 
>> I haven't have time to review this due to travel. I think it was mistake
>> to merge this as the buffer api is wrong in my opinion. I would vote for
>> revert if possible.
> 
> Let's spend a couple days trying to fix things up first.

will do.
I reviewed the comments and saw that the major comment is around the 
"Devlink buffers", which Jiri asked to replace with "Devlink msg".

The issue is that there is no easy way to modify one into the other 
without breaking things, so I am planning to provide a series on top 
which does:
1. Add devlink msg API
2. Move devlink reporter to use new API
3. Move existing reporter to use the new API
4. Get rid of the old API

thanks,
Eran

> 
> I cannot make contributors hostage to other people's travel
> schedule, no way.
> 
> If it seems hopeless by Wednesday, I will revert.

Should be published by then, hopefully sooner.

> 
> Thanks.
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH net-next v2 03/11] devlink: Add health report functionality
  2019-01-20 11:27   ` Jiri Pirko
@ 2019-01-21 11:12     ` Eran Ben Elisha
  0 siblings, 0 replies; 36+ messages in thread
From: Eran Ben Elisha @ 2019-01-21 11:12 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: netdev, Jiri Pirko, David S. Miller, Ariel Almog, Aya Levin,
	Moshe Shemesh



On 1/20/2019 1:27 PM, Jiri Pirko wrote:
> Thu, Jan 17, 2019 at 10:59:12PM CET, eranbe@mellanox.com wrote:
> 
> [...]
> 
>> +
>> +TRACE_EVENT(devlink_health_recover_aborted,
>> +	TP_PROTO(const struct devlink *devlink, const char *reporter_name,
>> +		 bool health_state, u64 time_since_last_recover),
>> +
>> +	TP_ARGS(devlink, reporter_name, health_state, time_since_last_recover),
>> +
>> +	TP_STRUCT__entry(
>> +		__string(bus_name, devlink->dev->bus->name)
>> +		__string(dev_name, dev_name(devlink->dev))
>> +		__string(driver_name, devlink->dev->driver->name)
>> +		__string(reporter_name, reporter_name)
>> +		__field(bool, health_state)
>> +		__field(u64, time_since_last_recover)
>> +	),
>> +
>> +	TP_fast_assign(
>> +		__assign_str(bus_name, devlink->dev->bus->name);
>> +		__assign_str(dev_name, dev_name(devlink->dev));
>> +		__assign_str(driver_name, devlink->dev->driver->name);
>> +		__assign_str(reporter_name, reporter_name);
>> +		__entry->health_state = health_state;
>> +		__entry->time_since_last_recover = time_since_last_recover;
>> +	),
>> +
>> +	TP_printk("bus_name=%s dev_name=%s driver_name=%s reporter_name=%s: health_state=%d time_since_last_recover = %llu recover aborted",
> 
> Drop spaces around "=".

ack.

> 
> Otherwise, this looks fine.
> Acked-by: Jiri Pirko <jiri@mellanox.com>
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH net-next v2 09/11] net/mlx5e: Add TX reporter support
  2019-01-20 11:06   ` Jiri Pirko
@ 2019-01-21 11:32     ` Eran Ben Elisha
  2019-01-21 12:11       ` Jiri Pirko
  0 siblings, 1 reply; 36+ messages in thread
From: Eran Ben Elisha @ 2019-01-21 11:32 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: netdev, Jiri Pirko, David S. Miller, Ariel Almog, Aya Levin,
	Moshe Shemesh



On 1/20/2019 1:06 PM, Jiri Pirko wrote:
> Thu, Jan 17, 2019 at 10:59:18PM CET, eranbe@mellanox.com wrote:
>> Add mlx5e tx reporter to devlink health reporters. This reporter will be
>> responsible for diagnosing, reporting and recovering of TX errors.
>> This patch declares the TX reporter operations and allocate it using the
>> devlink health API. Currently, this reporter supports reporting and
>> recovering from send error CQE only. In addition, it adds diagnose
>> information for the open SQs.
>>
>> For a local SQ recover (due to driver error report), in case of SQ recover
>> failure, the recover operation will be considered as a failure.
>> For a full TX recover, an attempt to close and open the channels will be
>> done. If this one passed successfully, it will be considered as a
>> successful recover.
>>
>> The SQ recover from error CQE flow is not a new feature in the driver,
>> this patch re-organize the functions and adapt them for the devlink
>> health API. For this purpose, move code from en_main.c to a new file
>> named reporter_tx.c.
>>
>> Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
>> Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
>> ---
>> .../net/ethernet/mellanox/mlx5/core/Makefile  |   2 +-
>> drivers/net/ethernet/mellanox/mlx5/core/en.h  |  18 +-
>> .../ethernet/mellanox/mlx5/core/en/reporter.h |  14 +
>> .../mellanox/mlx5/core/en/reporter_tx.c       | 321 ++++++++++++++++++
>> .../net/ethernet/mellanox/mlx5/core/en_main.c | 135 +-------
>> .../net/ethernet/mellanox/mlx5/core/en_tx.c   |   2 +-
>> 6 files changed, 367 insertions(+), 125 deletions(-)
>> create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en/reporter.h
>> create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c
>>
>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Makefile b/drivers/net/ethernet/mellanox/mlx5/core/Makefile
>> index 9de9abacf7f6..6bb2a860b15b 100644
>> --- a/drivers/net/ethernet/mellanox/mlx5/core/Makefile
>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/Makefile
>> @@ -22,7 +22,7 @@ mlx5_core-y :=	main.o cmd.o debugfs.o fw.o eq.o uar.o pagealloc.o \
>> #
>> mlx5_core-$(CONFIG_MLX5_CORE_EN) += en_main.o en_common.o en_fs.o en_ethtool.o \
>> 		en_tx.o en_rx.o en_dim.o en_txrx.o en/xdp.o en_stats.o \
>> -		en_selftest.o en/port.o en/monitor_stats.o
>> +		en_selftest.o en/port.o en/monitor_stats.o en/reporter_tx.o
>>
>> #
>> # Netdev extra
>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
>> index 8fa8fdd30b85..27e276c9bf84 100644
>> --- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
>> @@ -388,10 +388,7 @@ struct mlx5e_txqsq {
>> 	struct mlx5e_channel      *channel;
>> 	int                        txq_ix;
>> 	u32                        rate_limit;
>> -	struct mlx5e_txqsq_recover {
>> -		struct work_struct         recover_work;
>> -		u64                        last_recover;
>> -	} recover;
>> +	struct work_struct         recover_work;
>> } ____cacheline_aligned_in_smp;
>>
>> struct mlx5e_dma_info {
>> @@ -682,6 +679,13 @@ struct mlx5e_rss_params {
>> 	u8	hfunc;
>> };
>>
>> +struct mlx5e_modify_sq_param {
>> +	int curr_state;
>> +	int next_state;
>> +	int rl_update;
>> +	int rl_index;
>> +};
>> +
>> struct mlx5e_priv {
>> 	/* priv data path fields - start */
>> 	struct mlx5e_txqsq *txq2sq[MLX5E_MAX_NUM_CHANNELS * MLX5E_MAX_NUM_TC];
>> @@ -737,6 +741,7 @@ struct mlx5e_priv {
>> #ifdef CONFIG_MLX5_EN_TLS
>> 	struct mlx5e_tls          *tls;
>> #endif
>> +	struct devlink_health_reporter *tx_reporter;
>> };
>>
>> struct mlx5e_profile {
>> @@ -866,6 +871,11 @@ void mlx5e_set_rq_type(struct mlx5_core_dev *mdev, struct mlx5e_params *params);
>> void mlx5e_init_rq_type_params(struct mlx5_core_dev *mdev,
>> 			       struct mlx5e_params *params);
>>
>> +int mlx5e_modify_sq(struct mlx5_core_dev *mdev, u32 sqn,
>> +		    struct mlx5e_modify_sq_param *p);
>> +void mlx5e_activate_txqsq(struct mlx5e_txqsq *sq);
>> +void mlx5e_tx_disable_queue(struct netdev_queue *txq);
>> +
>> static inline bool mlx5e_tunnel_inner_ft_supported(struct mlx5_core_dev *mdev)
>> {
>> 	return (MLX5_CAP_ETH(mdev, tunnel_stateless_gre) &&
>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter.h b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter.h
>> new file mode 100644
>> index 000000000000..74083e120ab3
>> --- /dev/null
>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter.h
>> @@ -0,0 +1,14 @@
>> +/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
>> +/* Copyright (c) 2018 Mellanox Technologies. */
>> +
>> +#ifndef __MLX5E_EN_REPORTER_H
>> +#define __MLX5E_EN_REPORTER_H
>> +
>> +#include <linux/mlx5/driver.h>
>> +#include "en.h"
>> +
>> +int mlx5e_tx_reporter_create(struct mlx5e_priv *priv);
>> +void mlx5e_tx_reporter_destroy(struct mlx5e_priv *priv);
>> +void mlx5e_tx_reporter_err_cqe(struct mlx5e_txqsq *sq);
>> +
>> +#endif
>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c
>> new file mode 100644
>> index 000000000000..9800df4909c2
>> --- /dev/null
>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c
>> @@ -0,0 +1,321 @@
>> +/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
>> +/* Copyright (c) 2018 Mellanox Technologies. */
>> +
>> +#include <net/devlink.h>
>> +#include "reporter.h"
>> +#include "lib/eq.h"
>> +
>> +#define MLX5E_TX_REPORTER_PER_SQ_MAX_LEN 256
> 
> Some made-up number. I don't like how this whole thing is do. I will
> comment on it in the devlink part.
> 
> 
> [...]
> 
> 
>> +static int
>> +mlx5e_tx_reporter_build_diagnose_output(struct devlink_health_buffer *buffer,
>> +					u32 sqn, u8 state, u8 stopped)
>> +{
>> +	int err, i;
>> +	int nest = 0;
>> +	char name[20];
>> +
>> +	err = devlink_health_buffer_nest_start(buffer,
>> +					       DEVLINK_ATTR_HEALTH_BUFFER_OBJECT);
>> +	if (err)
>> +		goto buffer_error;
>> +	nest++;
>> +
>> +	err = devlink_health_buffer_nest_start(buffer,
>> +					       DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_PAIR);
>> +	if (err)
>> +		goto buffer_error;
>> +	nest++;
>> +
>> +	sprintf(name, "SQ 0x%x", sqn);
> 
> No. The whole idea of having the message build up with nested attributes
> (json-like) was to avoid things like this. No sprintfs please. If you
> want to do sprintf, most of the time you are doing something wrong.

I wanted that each SQ object will have a unique name. But I can merge 
the sqn into its attributes instead.

> 
> 
>> +	err = devlink_health_buffer_put_object_name(buffer, name);
>> +	if (err)
>> +		goto buffer_error;
>> +
>> +	err = devlink_health_buffer_nest_start(buffer,
>> +					       DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_VALUE);
>> +	if (err)
>> +		goto buffer_error;
>> +	nest++;
>> +
>> +	err = devlink_health_buffer_nest_start(buffer,
>> +					       DEVLINK_ATTR_HEALTH_BUFFER_OBJECT);
>> +	if (err)
>> +		goto buffer_error;
>> +	nest++;
>> +
>> +	err = devlink_health_buffer_nest_start(buffer,
>> +					       DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_PAIR);
>> +	if (err)
>> +		goto buffer_error;
>> +	nest++;
>> +
>> +	err = devlink_health_buffer_put_object_name(buffer, "HW state");
>> +	if (err)
>> +		goto buffer_error;
>> +
>> +	err = devlink_health_buffer_nest_start(buffer,
>> +					       DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_VALUE);
> 
> How could you put an object name and don't start nesting? It looks
> implicit to me.

I will add some helper functions that you could review. Just keep in 
mind the implicit nest start must come with implicit nest end, so it 
won't fit into every use...
> 
> 
> 
>> +	if (err)
>> +		goto buffer_error;
>> +	nest++;
>> +
>> +	err = devlink_health_buffer_put_value_u8(buffer, state);
>> +	if (err)
>> +		goto buffer_error;
>> +
>> +	devlink_health_buffer_nest_end(buffer); /* DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_VALUE */
>> +	nest--;
>> +
>> +	devlink_health_buffer_nest_end(buffer); /* DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_PAIR */
>> +	nest--;
>> +
>> +	err = devlink_health_buffer_nest_start(buffer,
>> +					       DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_PAIR);
>> +	if (err)
>> +		goto buffer_error;
>> +	nest++;
>> +
>> +	err = devlink_health_buffer_put_object_name(buffer, "stopped");
>> +	if (err)
>> +		goto buffer_error;
>> +
>> +	err = devlink_health_buffer_nest_start(buffer,
>> +					       DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_VALUE);
>> +	if (err)
>> +		goto buffer_error;
>> +	nest++;
> 
> What's up with this nest++ nest--?
> When you open a nest, you should have an error path inten to close the
> nest and jump there in case of an error. This is hard to read and
> understand.
it was in order to have the loop for nest_end. I removed it.
> 	
> 
> 
>> +
>> +	err = devlink_health_buffer_put_value_u8(buffer, stopped);
>> +	if (err)
>> +		goto buffer_error;
>> +
>> +	for (i = 0; i < nest; i++)
>> +		devlink_health_buffer_nest_end(buffer);
>> +
>> +	return 0;
>> +
>> +buffer_error:
>> +	for (i = 0; i < nest; i++)
>> +		devlink_health_buffer_nest_cancel(buffer);
>> +	return err;
>> +}
>> +
>> +static int mlx5e_tx_reporter_diagnose(struct devlink_health_reporter *reporter,
>> +				      struct devlink_health_buffer **buffers_array,
>> +				      unsigned int buffer_size,
>> +				      unsigned int num_buffers)
>> +{
>> +	struct mlx5e_priv *priv = devlink_health_reporter_priv(reporter);
>> +	unsigned int buff = 0;
>> +	int i = 0, err = 0;
>> +
>> +	if (buffer_size < MLX5E_TX_REPORTER_PER_SQ_MAX_LEN)
>> +		return -ENOMEM;
>> +
>> +	mutex_lock(&priv->state_lock);
>> +
>> +	if (!test_bit(MLX5E_STATE_OPENED, &priv->state)) {
>> +		mutex_unlock(&priv->state_lock);
>> +		return 0;
>> +	}
>> +
>> +	while (i < priv->channels.num * priv->channels.params.num_tc) {
>> +		struct mlx5e_txqsq *sq = priv->txq2sq[i];
>> +		u8 state;
>> +
>> +		err = mlx5_core_query_sq_state(priv->mdev, sq->sqn, &state);
>> +		if (err)
>> +			break;
>> +
>> +		err = mlx5e_tx_reporter_build_diagnose_output(buffers_array[buff],
>> +							      sq->sqn, state,
>> +							      netif_xmit_stopped(sq->txq));
>> +		if (err) {
>> +			if (++buff == num_buffers)
>> +				break;
>> +		} else {
>> +			i++;
>> +		}
>> +	}
>> +
>> +	mutex_unlock(&priv->state_lock);
>> +	return err;
>> +}
>> +
>> +static const struct devlink_health_reporter_ops mlx5_tx_reporter_ops = {
>> +		.name = "TX",
>> +		.recover = mlx5e_tx_reporter_recover,
>> +		.diagnose_size = MLX5E_MAX_NUM_CHANNELS * MLX5E_MAX_NUM_TC *
>> +				 MLX5E_TX_REPORTER_PER_SQ_MAX_LEN,
>> +		.diagnose = mlx5e_tx_reporter_diagnose,
>> +		.dump_size = 0,
>> +		.dump = NULL,
> 
> Don't initialize 0 and NULL in static const.
Ack.
> 
> [...]
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH net-next v2 01/11] devlink: Add health buffer support
  2019-01-21 11:07           ` Eran Ben Elisha
@ 2019-01-21 12:08             ` Jiri Pirko
  0 siblings, 0 replies; 36+ messages in thread
From: Jiri Pirko @ 2019-01-21 12:08 UTC (permalink / raw)
  To: Eran Ben Elisha
  Cc: David Miller, netdev, Jiri Pirko, Ariel Almog, Aya Levin, Moshe Shemesh

Mon, Jan 21, 2019 at 12:07:14PM CET, eranbe@mellanox.com wrote:
>
>
>On 1/20/2019 8:45 PM, David Miller wrote:
>> From: Jiri Pirko <jiri@resnulli.us>
>> Date: Sun, 20 Jan 2019 12:08:50 +0100
>> 
>>> I haven't have time to review this due to travel. I think it was mistake
>>> to merge this as the buffer api is wrong in my opinion. I would vote for
>>> revert if possible.
>> 
>> Let's spend a couple days trying to fix things up first.
>
>will do.
>I reviewed the comments and saw that the major comment is around the 
>"Devlink buffers", which Jiri asked to replace with "Devlink msg".
>
>The issue is that there is no easy way to modify one into the other 
>without breaking things, so I am planning to provide a series on top 
>which does:
>1. Add devlink msg API
>2. Move devlink reporter to use new API
>3. Move existing reporter to use the new API
>4. Get rid of the old API

Will be harder to review then revert and redo. Also harder for you to do
the patches. Small nightmare :/

>
>thanks,
>Eran
>
>> 
>> I cannot make contributors hostage to other people's travel
>> schedule, no way.
>> 
>> If it seems hopeless by Wednesday, I will revert.
>
>Should be published by then, hopefully sooner.
>
>> 
>> Thanks.
>> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH net-next v2 09/11] net/mlx5e: Add TX reporter support
  2019-01-21 11:32     ` Eran Ben Elisha
@ 2019-01-21 12:11       ` Jiri Pirko
  2019-01-21 13:06         ` Eran Ben Elisha
  0 siblings, 1 reply; 36+ messages in thread
From: Jiri Pirko @ 2019-01-21 12:11 UTC (permalink / raw)
  To: Eran Ben Elisha
  Cc: netdev, Jiri Pirko, David S. Miller, Ariel Almog, Aya Levin,
	Moshe Shemesh

Mon, Jan 21, 2019 at 12:32:07PM CET, eranbe@mellanox.com wrote:
>
>
>On 1/20/2019 1:06 PM, Jiri Pirko wrote:
>> Thu, Jan 17, 2019 at 10:59:18PM CET, eranbe@mellanox.com wrote:

[...]

	
>>> +static int
>>> +mlx5e_tx_reporter_build_diagnose_output(struct devlink_health_buffer *buffer,
>>> +					u32 sqn, u8 state, u8 stopped)
>>> +{
>>> +	int err, i;
>>> +	int nest = 0;
>>> +	char name[20];
>>> +
>>> +	err = devlink_health_buffer_nest_start(buffer,
>>> +					       DEVLINK_ATTR_HEALTH_BUFFER_OBJECT);
>>> +	if (err)
>>> +		goto buffer_error;
>>> +	nest++;
>>> +
>>> +	err = devlink_health_buffer_nest_start(buffer,
>>> +					       DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_PAIR);
>>> +	if (err)
>>> +		goto buffer_error;
>>> +	nest++;
>>> +
>>> +	sprintf(name, "SQ 0x%x", sqn);
>> 
>> No. The whole idea of having the message build up with nested attributes
>> (json-like) was to avoid things like this. No sprintfs please. If you
>> want to do sprintf, most of the time you are doing something wrong.
>
>I wanted that each SQ object will have a unique name. But I can merge 
>the sqn into its attributes instead.

Should be an array.


>
>> 
>> 
>>> +	err = devlink_health_buffer_put_object_name(buffer, name);
>>> +	if (err)
>>> +		goto buffer_error;
>>> +
>>> +	err = devlink_health_buffer_nest_start(buffer,
>>> +					       DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_VALUE);
>>> +	if (err)
>>> +		goto buffer_error;
>>> +	nest++;
>>> +
>>> +	err = devlink_health_buffer_nest_start(buffer,
>>> +					       DEVLINK_ATTR_HEALTH_BUFFER_OBJECT);
>>> +	if (err)
>>> +		goto buffer_error;
>>> +	nest++;
>>> +
>>> +	err = devlink_health_buffer_nest_start(buffer,
>>> +					       DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_PAIR);
>>> +	if (err)
>>> +		goto buffer_error;
>>> +	nest++;
>>> +
>>> +	err = devlink_health_buffer_put_object_name(buffer, "HW state");
>>> +	if (err)
>>> +		goto buffer_error;
>>> +
>>> +	err = devlink_health_buffer_nest_start(buffer,
>>> +					       DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_VALUE);
>> 
>> How could you put an object name and don't start nesting? It looks
>> implicit to me.
>
>I will add some helper functions that you could review. Just keep in 
>mind the implicit nest start must come with implicit nest end, so it 
>won't fit into every use...

You can do just object_start(), object_finish() or something like that.

[...]

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH net-next v2 09/11] net/mlx5e: Add TX reporter support
  2019-01-21 12:11       ` Jiri Pirko
@ 2019-01-21 13:06         ` Eran Ben Elisha
  2019-01-21 13:45           ` Jiri Pirko
  0 siblings, 1 reply; 36+ messages in thread
From: Eran Ben Elisha @ 2019-01-21 13:06 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: netdev, Jiri Pirko, David S. Miller, Ariel Almog, Aya Levin,
	Moshe Shemesh



On 1/21/2019 2:11 PM, Jiri Pirko wrote:
> Mon, Jan 21, 2019 at 12:32:07PM CET, eranbe@mellanox.com wrote:
>>
>>
>> On 1/20/2019 1:06 PM, Jiri Pirko wrote:
>>> Thu, Jan 17, 2019 at 10:59:18PM CET, eranbe@mellanox.com wrote:
> 
> [...]
> 
> 	
>>>> +static int
>>>> +mlx5e_tx_reporter_build_diagnose_output(struct devlink_health_buffer *buffer,
>>>> +					u32 sqn, u8 state, u8 stopped)
>>>> +{
>>>> +	int err, i;
>>>> +	int nest = 0;
>>>> +	char name[20];
>>>> +
>>>> +	err = devlink_health_buffer_nest_start(buffer,
>>>> +					       DEVLINK_ATTR_HEALTH_BUFFER_OBJECT);
>>>> +	if (err)
>>>> +		goto buffer_error;
>>>> +	nest++;
>>>> +
>>>> +	err = devlink_health_buffer_nest_start(buffer,
>>>> +					       DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_PAIR);
>>>> +	if (err)
>>>> +		goto buffer_error;
>>>> +	nest++;
>>>> +
>>>> +	sprintf(name, "SQ 0x%x", sqn);
>>>
>>> No. The whole idea of having the message build up with nested attributes
>>> (json-like) was to avoid things like this. No sprintfs please. If you
>>> want to do sprintf, most of the time you are doing something wrong.
>>
>> I wanted that each SQ object will have a unique name. But I can merge
>> the sqn into its attributes instead.
> 
> Should be an array.

Every SQ shall hold it's own properties. I don't see why all SQs shall 
be held under one array, this array do not provide any additional 
info/order.

In addition, it is better to keep main preliminary objects up to the 
size of one netlink SKB, otherwise it will be impossible for the devlink 
to prepare this one big object as skb fragments, and also to re-assmble 
them in user space as driver intended them to be.

If all SWs will be under one big array, under one big object, we will 
hit this corner case.

> 
> 
>>
>>>
>>>
>>>> +	err = devlink_health_buffer_put_object_name(buffer, name);
>>>> +	if (err)
>>>> +		goto buffer_error;
>>>> +
>>>> +	err = devlink_health_buffer_nest_start(buffer,
>>>> +					       DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_VALUE);
>>>> +	if (err)
>>>> +		goto buffer_error;
>>>> +	nest++;
>>>> +
>>>> +	err = devlink_health_buffer_nest_start(buffer,
>>>> +					       DEVLINK_ATTR_HEALTH_BUFFER_OBJECT);
>>>> +	if (err)
>>>> +		goto buffer_error;
>>>> +	nest++;
>>>> +
>>>> +	err = devlink_health_buffer_nest_start(buffer,
>>>> +					       DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_PAIR);
>>>> +	if (err)
>>>> +		goto buffer_error;
>>>> +	nest++;
>>>> +
>>>> +	err = devlink_health_buffer_put_object_name(buffer, "HW state");
>>>> +	if (err)
>>>> +		goto buffer_error;
>>>> +
>>>> +	err = devlink_health_buffer_nest_start(buffer,
>>>> +					       DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_VALUE);
>>>
>>> How could you put an object name and don't start nesting? It looks
>>> implicit to me.
>>
>> I will add some helper functions that you could review. Just keep in
>> mind the implicit nest start must come with implicit nest end, so it
>> won't fit into every use...
> 
> You can do just object_start(), object_finish() or something like that.

ack.

Better to have also helpers that start and close their nests on one 
shot. like pair_name_value or pair_name_value_array.

Also, The json is too powerful to close it inside few wrappers, we must 
give the basic blocks as well for nontraditional structures.


> 
> [...]
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH net-next v2 09/11] net/mlx5e: Add TX reporter support
  2019-01-21 13:06         ` Eran Ben Elisha
@ 2019-01-21 13:45           ` Jiri Pirko
  0 siblings, 0 replies; 36+ messages in thread
From: Jiri Pirko @ 2019-01-21 13:45 UTC (permalink / raw)
  To: Eran Ben Elisha
  Cc: netdev, Jiri Pirko, David S. Miller, Ariel Almog, Aya Levin,
	Moshe Shemesh

Mon, Jan 21, 2019 at 02:06:44PM CET, eranbe@mellanox.com wrote:
>
>
>On 1/21/2019 2:11 PM, Jiri Pirko wrote:
>> Mon, Jan 21, 2019 at 12:32:07PM CET, eranbe@mellanox.com wrote:
>>>
>>>
>>> On 1/20/2019 1:06 PM, Jiri Pirko wrote:
>>>> Thu, Jan 17, 2019 at 10:59:18PM CET, eranbe@mellanox.com wrote:
>> 
>> [...]
>> 
>> 	
>>>>> +static int
>>>>> +mlx5e_tx_reporter_build_diagnose_output(struct devlink_health_buffer *buffer,
>>>>> +					u32 sqn, u8 state, u8 stopped)
>>>>> +{
>>>>> +	int err, i;
>>>>> +	int nest = 0;
>>>>> +	char name[20];
>>>>> +
>>>>> +	err = devlink_health_buffer_nest_start(buffer,
>>>>> +					       DEVLINK_ATTR_HEALTH_BUFFER_OBJECT);
>>>>> +	if (err)
>>>>> +		goto buffer_error;
>>>>> +	nest++;
>>>>> +
>>>>> +	err = devlink_health_buffer_nest_start(buffer,
>>>>> +					       DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_PAIR);
>>>>> +	if (err)
>>>>> +		goto buffer_error;
>>>>> +	nest++;
>>>>> +
>>>>> +	sprintf(name, "SQ 0x%x", sqn);
>>>>
>>>> No. The whole idea of having the message build up with nested attributes
>>>> (json-like) was to avoid things like this. No sprintfs please. If you
>>>> want to do sprintf, most of the time you are doing something wrong.
>>>
>>> I wanted that each SQ object will have a unique name. But I can merge
>>> the sqn into its attributes instead.
>> 
>> Should be an array.
>
>Every SQ shall hold it's own properties. I don't see why all SQs shall 
>be held under one array, this array do not provide any additional 
>info/order.

If you have 8 SQs, you should have 8 objects (with subobjects) in an
array.


>
>In addition, it is better to keep main preliminary objects up to the 
>size of one netlink SKB, otherwise it will be impossible for the devlink 
>to prepare this one big object as skb fragments, and also to re-assmble 
>them in user space as driver intended them to be.
>
>If all SWs will be under one big array, under one big object, we will 
>hit this corner case.
>
>> 
>> 
>>>
>>>>
>>>>
>>>>> +	err = devlink_health_buffer_put_object_name(buffer, name);
>>>>> +	if (err)
>>>>> +		goto buffer_error;
>>>>> +
>>>>> +	err = devlink_health_buffer_nest_start(buffer,
>>>>> +					       DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_VALUE);
>>>>> +	if (err)
>>>>> +		goto buffer_error;
>>>>> +	nest++;
>>>>> +
>>>>> +	err = devlink_health_buffer_nest_start(buffer,
>>>>> +					       DEVLINK_ATTR_HEALTH_BUFFER_OBJECT);
>>>>> +	if (err)
>>>>> +		goto buffer_error;
>>>>> +	nest++;
>>>>> +
>>>>> +	err = devlink_health_buffer_nest_start(buffer,
>>>>> +					       DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_PAIR);
>>>>> +	if (err)
>>>>> +		goto buffer_error;
>>>>> +	nest++;
>>>>> +
>>>>> +	err = devlink_health_buffer_put_object_name(buffer, "HW state");
>>>>> +	if (err)
>>>>> +		goto buffer_error;
>>>>> +
>>>>> +	err = devlink_health_buffer_nest_start(buffer,
>>>>> +					       DEVLINK_ATTR_HEALTH_BUFFER_OBJECT_VALUE);
>>>>
>>>> How could you put an object name and don't start nesting? It looks
>>>> implicit to me.
>>>
>>> I will add some helper functions that you could review. Just keep in
>>> mind the implicit nest start must come with implicit nest end, so it
>>> won't fit into every use...
>> 
>> You can do just object_start(), object_finish() or something like that.
>
>ack.
>
>Better to have also helpers that start and close their nests on one 
>shot. like pair_name_value or pair_name_value_array.
>
>Also, The json is too powerful to close it inside few wrappers, we must 
>give the basic blocks as well for nontraditional structures.
>
>
>> 
>> [...]
>> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH iproute2-next v2] devlink: Add health command support
  2019-01-20  9:27 ` [PATCH iproute2-next v2] devlink: Add health command support Aya Levin
@ 2019-01-23  3:37   ` David Ahern
  2019-01-23 11:27     ` Aya Levin
  0 siblings, 1 reply; 36+ messages in thread
From: David Ahern @ 2019-01-23  3:37 UTC (permalink / raw)
  To: Aya Levin, netdev, David S. Miller, Jiri Pirko
  Cc: Moshe Shemesh, Eran Ben Elisha, Tal Alon, Ariel Almog

On 1/20/19 2:27 AM, Aya Levin wrote:
> diff --git a/devlink/devlink.c b/devlink/devlink.c
> index 3651e90c1159..9fc19668ccd0 100644
> --- a/devlink/devlink.c
> +++ b/devlink/devlink.c
> @@ -1,4 +1,5 @@
>  /*
> + *

extra newline

>   * devlink.c	Devlink tool
>   *
>   *              This program is free software; you can redistribute it and/or
> @@ -22,7 +23,7 @@
>  #include <linux/devlink.h>
>  #include <libmnl/libmnl.h>
>  #include <netinet/ether.h>
> -
> +#include <sys/sysinfo.h>
>  #include "SNAPSHOT.h"
>  #include "list.h"
>  #include "mnlg.h"
> @@ -40,6 +41,12 @@
>  #define PARAM_CMODE_DRIVERINIT_STR "driverinit"
>  #define PARAM_CMODE_PERMANENT_STR "permanent"
>  
> +#define HEALTH_REPORTER_STATE_HEALTHY_STR "healthy"
> +#define HEALTH_REPORTER_STATE_ERROR_STR "error"
> +#define HEALTH_REPORTER_GRACE_PERIOD_STR "grace_period"
> +#define HEALTH_REPORTER_AUTO_RECOVER_STR "auto_recover"
> +#define HEALTH_REPORTER_TS_LEN 80
> +
>  static int g_new_line_count;
>  
>  #define pr_err(args...) fprintf(stderr, ##args)
> @@ -199,6 +206,10 @@ static void ifname_map_free(struct ifname_map *ifname_map)
>  #define DL_OPT_REGION_SNAPSHOT_ID	BIT(22)
>  #define DL_OPT_REGION_ADDRESS		BIT(23)
>  #define DL_OPT_REGION_LENGTH		BIT(24)
> +#define DL_OPT_HANDLE_HEALTH		BIT(25)
> +#define DL_OPT_HEALTH_REPORTER_NAME	BIT(26)
> +#define DL_OPT_HEALTH_REPORTER_DEV	BIT(27)
> +
>  

extra newline


>  struct dl_opts {
>  	uint32_t present; /* flags of present items */
> @@ -230,6 +241,10 @@ struct dl_opts {
>  	uint32_t region_snapshot_id;
>  	uint64_t region_address;
>  	uint64_t region_length;
> +	const char *reporter_name;
> +	const char *reporter_param_name;
> +	const char *reporter_param_value;
> +
>  };
>  
>  struct dl {
> @@ -959,7 +974,7 @@ static int dl_argv_parse(struct dl *dl, uint32_t o_required,
>  		if (err)
>  			return err;
>  		o_found |= handle_bit;
> -	} else if (o_required & DL_OPT_HANDLE) {
> +	} else if (DL_OPT_HANDLE) {

typo? Seems like o_required is required.


>  		err = dl_argv_handle(dl, &opts->bus_name, &opts->dev_name);
>  		if (err)
>  			return err;
> @@ -1178,6 +1193,15 @@ static int dl_argv_parse(struct dl *dl, uint32_t o_required,
>  			if (err)
>  				return err;
>  			o_found |= DL_OPT_REGION_LENGTH;
> +		} else if (dl_argv_match(dl, "reporter") &&
> +			   (o_all & DL_OPT_HEALTH_REPORTER_NAME)) {
> +			dl_arg_inc(dl);
> +			err = dl_argv_str(dl, &opts->reporter_name);
> +			if (err)
> +				return err;
> +			o_found |= DL_OPT_HEALTH_REPORTER_NAME;
> +			o_found |= DL_OPT_HANDLE;
> +			break;

why the break? It is the only option that breaks after a match. If it is
required, please add a comment why for future readers.


>  		} else {
>  			pr_err("Unknown option \"%s\"\n", dl_argv(dl));
>  			return -EINVAL;
> @@ -1298,6 +1322,12 @@ static int dl_argv_parse(struct dl *dl, uint32_t o_required,
>  		return -EINVAL;
>  	}
>  
> +	if ((o_required & DL_OPT_HEALTH_REPORTER_NAME) &&
> +	    !(o_found & DL_OPT_HEALTH_REPORTER_NAME)) {
> +		pr_err("Reporter name expected.\n");
> +		return -EINVAL;
> +	}

I realize your are following suit with this change, but these checks for
'required and not found' are getting really long. There is a better way
to do this.


> +
>  	return 0;
>  }
>  
> @@ -1382,6 +1412,9 @@ static void dl_opts_put(struct nlmsghdr *nlh, struct dl *dl)
>  	if (opts->present & DL_OPT_REGION_LENGTH)
>  		mnl_attr_put_u64(nlh, DEVLINK_ATTR_REGION_CHUNK_LEN,
>  				 opts->region_length);
> +	if (opts->present & DL_OPT_HEALTH_REPORTER_NAME)
> +		mnl_attr_put_strz(nlh, DEVLINK_ATTR_HEALTH_REPORTER_NAME,
> +				  opts->reporter_name);
>  }
>  
>  static int dl_argv_parse_put(struct nlmsghdr *nlh, struct dl *dl,
> @@ -1513,6 +1546,8 @@ static void __pr_out_handle_start(struct dl *dl, struct nlattr **tb,
>  				__pr_out_newline();
>  				__pr_out_indent_inc();
>  				arr_last_handle_set(dl, bus_name, dev_name);
> +			} else {
> +				__pr_out_indent_inc();
>  			}
>  		} else {
>  			pr_out("%s%s", buf, content ? ":" : "");
> @@ -5557,11 +5592,502 @@ static int cmd_region(struct dl *dl)
>  	return -ENOENT;
>  }
>  
> +static int cmd_health_set_params(struct dl *dl)
> +{
> +	struct nlmsghdr *nlh;
> +	uint64_t period;
> +	bool auto_recover;
> +	int err;
> +
> +	nlh = mnlg_msg_prepare(dl->nlg, DEVLINK_CMD_HEALTH_REPORTER_SET,
> +			       NLM_F_REQUEST | NLM_F_ACK);
> +	err = dl_argv_parse(dl, DL_OPT_HANDLE |
> +			    DL_OPT_HEALTH_REPORTER_NAME, 0);
> +	if (err)
> +		return err;
> +
> +	err = dl_argv_str(dl, &dl->opts.reporter_param_name);
> +	if (err)
> +		return err;
> +	err = dl_argv_str(dl, &dl->opts.reporter_param_value);
> +	if (err)
> +		return err;
> +	dl_opts_put(nlh, dl);
> +
> +	if (!strncmp(dl->opts.reporter_param_name,
> +		     HEALTH_REPORTER_GRACE_PERIOD_STR, strlen("garce"))) {
> +		err = strtouint64_t(dl->opts.reporter_param_value, &period);
> +		if (err)
> +			goto err_param_value_parse;
> +		mnl_attr_put_u64(nlh, DEVLINK_ATTR_HEALTH_REPORTER_PERIOD,
> +				 period);
> +	} else if (!strncmp(dl->opts.reporter_param_name,
> +			    HEALTH_REPORTER_AUTO_RECOVER_STR,
> +			    strlen("auto"))) {
> +		err = strtobool(dl->opts.reporter_param_value, &auto_recover);
> +		if (err)
> +			goto err_param_value_parse;
> +		mnl_attr_put_u8(nlh, DEVLINK_ATTR_HEALTH_REPORTER_AUTO_REC,
> +				(uint8_t)auto_recover);
> +	} else {
> +		printf("Parameter name: %s  is not supported\n",
> +		       dl->opts.reporter_param_name);
> +		return -ENOTSUP;
> +	}
> +
> +	return _mnlg_socket_sndrcv(dl->nlg, nlh, NULL, NULL);
> +
> +err_param_value_parse:
> +	pr_err("Value \"%s\" is not a number or not within range\n",
> +	       dl->opts.param_value);
> +	return err;
> +}
> +
> +static int cmd_health_dump_clear(struct dl *dl)
> +{
> +	struct nlmsghdr *nlh;
> +	int err;
> +
> +	nlh = mnlg_msg_prepare(dl->nlg,
> +			       DEVLINK_CMD_HEALTH_REPORTER_DUMP_CLEAR,
> +			       NLM_F_REQUEST | NLM_F_ACK);
> +
> +	err = dl_argv_parse_put(nlh, dl, DL_OPT_HANDLE |
> +				DL_OPT_HEALTH_REPORTER_NAME, 0);
> +	if (err)
> +		return err;
> +
> +	dl_opts_put(nlh, dl);
> +	return _mnlg_socket_sndrcv(dl->nlg, nlh, NULL, NULL);
> +}
> +
> +static int health_value_show(struct dl *dl, int type, struct nlattr *nl_data)
> +{
> +	const char *str;
> +	uint8_t *data;
> +	uint32_t len;
> +	uint64_t u64;
> +	uint32_t u32;
> +	uint16_t u16;
> +	uint8_t u8;
> +	int i;
> +
> +	switch (type) {
> +	case MNL_TYPE_FLAG:
> +		if (dl->json_output)
> +			jsonw_string(dl->jw, nl_data ? "true" : "false");
> +		else
> +			pr_out("%s ", nl_data ? "true" : "false");
> +		break;
> +	case MNL_TYPE_U8:
> +		u8 = mnl_attr_get_u8(nl_data);
> +		if (dl->json_output)
> +			jsonw_uint(dl->jw, u8);
> +		else
> +			pr_out("%u ", u8);
> +		break;
> +	case MNL_TYPE_U16:
> +		u16 = mnl_attr_get_u16(nl_data);
> +		if (dl->json_output)
> +			jsonw_uint(dl->jw, u16);
> +		else
> +			pr_out("%u ", u16);
> +		break;
> +	case MNL_TYPE_U32:
> +		u32 = mnl_attr_get_u32(nl_data);
> +		if (dl->json_output)
> +			jsonw_uint(dl->jw, u32);
> +		else
> +			pr_out("%u ", u32);
> +		break;
> +	case MNL_TYPE_U64:
> +		u64 = mnl_attr_get_u64(nl_data);
> +		if (dl->json_output)
> +			jsonw_u64(dl->jw, u64);
> +		else
> +			pr_out("%lu ", u64);
> +		break;
> +	case MNL_TYPE_STRING:
> +	case MNL_TYPE_NUL_STRING:
> +		str = mnl_attr_get_str(nl_data);
> +		if (dl->json_output)
> +			jsonw_string(dl->jw, str);
> +		else
> +			pr_out("%s ", str);
> +		break;
> +	case MNL_TYPE_BINARY:
> +		len = mnl_attr_get_payload_len(nl_data);
> +		data = mnl_attr_get_payload(nl_data);
> +		i = 0;
> +		while (i < len) {
> +			if (dl->json_output)
> +				jsonw_printf(dl->jw, "%d", data[i]);
> +			else
> +				pr_out("%02x ", data[i]);
> +			i++;
> +		}

If 'len' is an arbitrary size then line lengths can be ridiculous here.


> +		break;
> +	default:
> +		return -EINVAL;
> +	}
> +	return MNL_CB_OK;
> +}
> +

...

> +static int jiffies_to_logtime(uint64_t jiffies, char *output)
> +{
> +	struct sysinfo s_info;
> +	struct tm *info;
> +	time_t now, sec;
> +	int err;
> +
> +	time(&now);
> +	info = localtime(&now);
> +	strftime(output, HEALTH_REPORTER_TS_LEN, "%Y-%b-%d %H:%M:%S", info);

This strftime should really be done in the error path of sysinfo as
opposed to every call to this function calling strftime twice.

> +	err = sysinfo(&s_info);
> +	if (err)
> +		return err;
> +	/*substruct uptime in sec from now
> +	 * add jiffies and 5 minutes factor*/

fix the comment style.


> +	sec = now - (s_info.uptime - jiffies/1000) + 300;
> +	info = localtime(&sec);
> +	strftime(output, HEALTH_REPORTER_TS_LEN, "%Y-%b-%d %H:%M:%S", info);
> +
> +	return 0;
> +}
> +


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH iproute2-next v2] devlink: Add health command support
  2019-01-23  3:37   ` David Ahern
@ 2019-01-23 11:27     ` Aya Levin
  2019-01-24  0:27       ` David Ahern
  0 siblings, 1 reply; 36+ messages in thread
From: Aya Levin @ 2019-01-23 11:27 UTC (permalink / raw)
  To: David Ahern, netdev, David S. Miller, Jiri Pirko
  Cc: Moshe Shemesh, Eran Ben Elisha, Tal Alon, Ariel Almog


נכתב על ידי David Ahern, ב־1/23/2019 בשעה 5:37 AM:
> On 1/20/19 2:27 AM, Aya Levin wrote:
>> diff --git a/devlink/devlink.c b/devlink/devlink.c
>> index 3651e90c1159..9fc19668ccd0 100644
>> --- a/devlink/devlink.c
>> +++ b/devlink/devlink.c
>> @@ -1,4 +1,5 @@
>>   /*
>> + *
> 
> extra newline
> 
>>    * devlink.c	Devlink tool
>>    *
>>    *              This program is free software; you can redistribute it and/or
>> @@ -22,7 +23,7 @@
>>   #include <linux/devlink.h>
>>   #include <libmnl/libmnl.h>
>>   #include <netinet/ether.h>
>> -
>> +#include <sys/sysinfo.h>
>>   #include "SNAPSHOT.h"
>>   #include "list.h"
>>   #include "mnlg.h"
>> @@ -40,6 +41,12 @@
>>   #define PARAM_CMODE_DRIVERINIT_STR "driverinit"
>>   #define PARAM_CMODE_PERMANENT_STR "permanent"
>>   
>> +#define HEALTH_REPORTER_STATE_HEALTHY_STR "healthy"
>> +#define HEALTH_REPORTER_STATE_ERROR_STR "error"
>> +#define HEALTH_REPORTER_GRACE_PERIOD_STR "grace_period"
>> +#define HEALTH_REPORTER_AUTO_RECOVER_STR "auto_recover"
>> +#define HEALTH_REPORTER_TS_LEN 80
>> +
>>   static int g_new_line_count;
>>   
>>   #define pr_err(args...) fprintf(stderr, ##args)
>> @@ -199,6 +206,10 @@ static void ifname_map_free(struct ifname_map *ifname_map)
>>   #define DL_OPT_REGION_SNAPSHOT_ID	BIT(22)
>>   #define DL_OPT_REGION_ADDRESS		BIT(23)
>>   #define DL_OPT_REGION_LENGTH		BIT(24)
>> +#define DL_OPT_HANDLE_HEALTH		BIT(25)
>> +#define DL_OPT_HEALTH_REPORTER_NAME	BIT(26)
>> +#define DL_OPT_HEALTH_REPORTER_DEV	BIT(27)
>> +
>>   
> 
> extra newline
> 
> 
>>   struct dl_opts {
>>   	uint32_t present; /* flags of present items */
>> @@ -230,6 +241,10 @@ struct dl_opts {
>>   	uint32_t region_snapshot_id;
>>   	uint64_t region_address;
>>   	uint64_t region_length;
>> +	const char *reporter_name;
>> +	const char *reporter_param_name;
>> +	const char *reporter_param_value;
>> +
>>   };
>>   
>>   struct dl {
>> @@ -959,7 +974,7 @@ static int dl_argv_parse(struct dl *dl, uint32_t o_required,
>>   		if (err)
>>   			return err;
>>   		o_found |= handle_bit;
>> -	} else if (o_required & DL_OPT_HANDLE) {
>> +	} else if (DL_OPT_HANDLE) {
> 
> typo? Seems like o_required is required.
typo should have been o_all
> 
> 
>>   		err = dl_argv_handle(dl, &opts->bus_name, &opts->dev_name);
>>   		if (err)
>>   			return err;
>> @@ -1178,6 +1193,15 @@ static int dl_argv_parse(struct dl *dl, uint32_t o_required,
>>   			if (err)
>>   				return err;
>>   			o_found |= DL_OPT_REGION_LENGTH;
>> +		} else if (dl_argv_match(dl, "reporter") &&
>> +			   (o_all & DL_OPT_HEALTH_REPORTER_NAME)) {
>> +			dl_arg_inc(dl);
>> +			err = dl_argv_str(dl, &opts->reporter_name);
>> +			if (err)
>> +				return err;
>> +			o_found |= DL_OPT_HEALTH_REPORTER_NAME;
>> +			o_found |= DL_OPT_HANDLE;
>> +			break;
> 
> why the break? It is the only option that breaks after a match. If it is
> required, please add a comment why for future readers.
> 
> 
>>   		} else {
>>   			pr_err("Unknown option \"%s\"\n", dl_argv(dl));
>>   			return -EINVAL;
>> @@ -1298,6 +1322,12 @@ static int dl_argv_parse(struct dl *dl, uint32_t o_required,
>>   		return -EINVAL;
>>   	}
>>   
>> +	if ((o_required & DL_OPT_HEALTH_REPORTER_NAME) &&
>> +	    !(o_found & DL_OPT_HEALTH_REPORTER_NAME)) {
>> +		pr_err("Reporter name expected.\n");
>> +		return -EINVAL;
>> +	}
> 
> I realize your are following suit with this change, but these checks for
> 'required and not found' are getting really long. There is a better way
> to do this.
Do you mean somthing like:
#define requiered_not_found(o_found, o_required, flag, msg)            \
           ({                                                           \
                   if ((o_required & flag) && !(o_found & flag)) {      \
                           pr_err("%s \n", msg);                        \
                           return -EINVAL;                              \
                   }                                                    \
           })

> 
> 
>> +
>>   	return 0;
>>   }
>>   
>> @@ -1382,6 +1412,9 @@ static void dl_opts_put(struct nlmsghdr *nlh, struct dl *dl)
>>   	if (opts->present & DL_OPT_REGION_LENGTH)
>>   		mnl_attr_put_u64(nlh, DEVLINK_ATTR_REGION_CHUNK_LEN,
>>   				 opts->region_length);
>> +	if (opts->present & DL_OPT_HEALTH_REPORTER_NAME)
>> +		mnl_attr_put_strz(nlh, DEVLINK_ATTR_HEALTH_REPORTER_NAME,
>> +				  opts->reporter_name);
>>   }
>>   
>>   static int dl_argv_parse_put(struct nlmsghdr *nlh, struct dl *dl,
>> @@ -1513,6 +1546,8 @@ static void __pr_out_handle_start(struct dl *dl, struct nlattr **tb,
>>   				__pr_out_newline();
>>   				__pr_out_indent_inc();
>>   				arr_last_handle_set(dl, bus_name, dev_name);
>> +			} else {
>> +				__pr_out_indent_inc();
>>   			}
>>   		} else {
>>   			pr_out("%s%s", buf, content ? ":" : "");
>> @@ -5557,11 +5592,502 @@ static int cmd_region(struct dl *dl)
>>   	return -ENOENT;
>>   }
>>   
>> +static int cmd_health_set_params(struct dl *dl)
>> +{
>> +	struct nlmsghdr *nlh;
>> +	uint64_t period;
>> +	bool auto_recover;
>> +	int err;
>> +
>> +	nlh = mnlg_msg_prepare(dl->nlg, DEVLINK_CMD_HEALTH_REPORTER_SET,
>> +			       NLM_F_REQUEST | NLM_F_ACK);
>> +	err = dl_argv_parse(dl, DL_OPT_HANDLE |
>> +			    DL_OPT_HEALTH_REPORTER_NAME, 0);
>> +	if (err)
>> +		return err;
>> +
>> +	err = dl_argv_str(dl, &dl->opts.reporter_param_name);
>> +	if (err)
>> +		return err;
>> +	err = dl_argv_str(dl, &dl->opts.reporter_param_value);
>> +	if (err)
>> +		return err;
>> +	dl_opts_put(nlh, dl);
>> +
>> +	if (!strncmp(dl->opts.reporter_param_name,
>> +		     HEALTH_REPORTER_GRACE_PERIOD_STR, strlen("garce"))) {
>> +		err = strtouint64_t(dl->opts.reporter_param_value, &period);
>> +		if (err)
>> +			goto err_param_value_parse;
>> +		mnl_attr_put_u64(nlh, DEVLINK_ATTR_HEALTH_REPORTER_PERIOD,
>> +				 period);
>> +	} else if (!strncmp(dl->opts.reporter_param_name,
>> +			    HEALTH_REPORTER_AUTO_RECOVER_STR,
>> +			    strlen("auto"))) {
>> +		err = strtobool(dl->opts.reporter_param_value, &auto_recover);
>> +		if (err)
>> +			goto err_param_value_parse;
>> +		mnl_attr_put_u8(nlh, DEVLINK_ATTR_HEALTH_REPORTER_AUTO_REC,
>> +				(uint8_t)auto_recover);
>> +	} else {
>> +		printf("Parameter name: %s  is not supported\n",
>> +		       dl->opts.reporter_param_name);
>> +		return -ENOTSUP;
>> +	}
>> +
>> +	return _mnlg_socket_sndrcv(dl->nlg, nlh, NULL, NULL);
>> +
>> +err_param_value_parse:
>> +	pr_err("Value \"%s\" is not a number or not within range\n",
>> +	       dl->opts.param_value);
>> +	return err;
>> +}
>> +
>> +static int cmd_health_dump_clear(struct dl *dl)
>> +{
>> +	struct nlmsghdr *nlh;
>> +	int err;
>> +
>> +	nlh = mnlg_msg_prepare(dl->nlg,
>> +			       DEVLINK_CMD_HEALTH_REPORTER_DUMP_CLEAR,
>> +			       NLM_F_REQUEST | NLM_F_ACK);
>> +
>> +	err = dl_argv_parse_put(nlh, dl, DL_OPT_HANDLE |
>> +				DL_OPT_HEALTH_REPORTER_NAME, 0);
>> +	if (err)
>> +		return err;
>> +
>> +	dl_opts_put(nlh, dl);
>> +	return _mnlg_socket_sndrcv(dl->nlg, nlh, NULL, NULL);
>> +}
>> +
>> +static int health_value_show(struct dl *dl, int type, struct nlattr *nl_data)
>> +{
>> +	const char *str;
>> +	uint8_t *data;
>> +	uint32_t len;
>> +	uint64_t u64;
>> +	uint32_t u32;
>> +	uint16_t u16;
>> +	uint8_t u8;
>> +	int i;
>> +
>> +	switch (type) {
>> +	case MNL_TYPE_FLAG:
>> +		if (dl->json_output)
>> +			jsonw_string(dl->jw, nl_data ? "true" : "false");
>> +		else
>> +			pr_out("%s ", nl_data ? "true" : "false");
>> +		break;
>> +	case MNL_TYPE_U8:
>> +		u8 = mnl_attr_get_u8(nl_data);
>> +		if (dl->json_output)
>> +			jsonw_uint(dl->jw, u8);
>> +		else
>> +			pr_out("%u ", u8);
>> +		break;
>> +	case MNL_TYPE_U16:
>> +		u16 = mnl_attr_get_u16(nl_data);
>> +		if (dl->json_output)
>> +			jsonw_uint(dl->jw, u16);
>> +		else
>> +			pr_out("%u ", u16);
>> +		break;
>> +	case MNL_TYPE_U32:
>> +		u32 = mnl_attr_get_u32(nl_data);
>> +		if (dl->json_output)
>> +			jsonw_uint(dl->jw, u32);
>> +		else
>> +			pr_out("%u ", u32);
>> +		break;
>> +	case MNL_TYPE_U64:
>> +		u64 = mnl_attr_get_u64(nl_data);
>> +		if (dl->json_output)
>> +			jsonw_u64(dl->jw, u64);
>> +		else
>> +			pr_out("%lu ", u64);
>> +		break;
>> +	case MNL_TYPE_STRING:
>> +	case MNL_TYPE_NUL_STRING:
>> +		str = mnl_attr_get_str(nl_data);
>> +		if (dl->json_output)
>> +			jsonw_string(dl->jw, str);
>> +		else
>> +			pr_out("%s ", str);
>> +		break;
>> +	case MNL_TYPE_BINARY:
>> +		len = mnl_attr_get_payload_len(nl_data);
>> +		data = mnl_attr_get_payload(nl_data);
>> +		i = 0;
>> +		while (i < len) {
>> +			if (dl->json_output)
>> +				jsonw_printf(dl->jw, "%d", data[i]);
>> +			else
>> +				pr_out("%02x ", data[i]);
>> +			i++;
>> +		}
> 
> If 'len' is an arbitrary size then line lengths can be ridiculous here.
> 
> 
>> +		break;
>> +	default:
>> +		return -EINVAL;
>> +	}
>> +	return MNL_CB_OK;
>> +}
>> +
> 
> ...
> 
>> +static int jiffies_to_logtime(uint64_t jiffies, char *output)
>> +{
>> +	struct sysinfo s_info;
>> +	struct tm *info;
>> +	time_t now, sec;
>> +	int err;
>> +
>> +	time(&now);
>> +	info = localtime(&now);
>> +	strftime(output, HEALTH_REPORTER_TS_LEN, "%Y-%b-%d %H:%M:%S", info);
> 
> This strftime should really be done in the error path of sysinfo as
> opposed to every call to this function calling strftime twice.
> 
>> +	err = sysinfo(&s_info);
>> +	if (err)
>> +		return err;
>> +	/*substruct uptime in sec from now
>> +	 * add jiffies and 5 minutes factor*/
> 
> fix the comment style.
> 
> 
>> +	sec = now - (s_info.uptime - jiffies/1000) + 300;
>> +	info = localtime(&sec);
>> +	strftime(output, HEALTH_REPORTER_TS_LEN, "%Y-%b-%d %H:%M:%S", info);
>> +
>> +	return 0;
>> +}
>> +
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH iproute2-next v2] devlink: Add health command support
  2019-01-23 11:27     ` Aya Levin
@ 2019-01-24  0:27       ` David Ahern
  0 siblings, 0 replies; 36+ messages in thread
From: David Ahern @ 2019-01-24  0:27 UTC (permalink / raw)
  To: Aya Levin, netdev, David S. Miller, Jiri Pirko
  Cc: Moshe Shemesh, Eran Ben Elisha, Tal Alon, Ariel Almog

On 1/23/19 4:27 AM, Aya Levin wrote:
>>> @@ -1298,6 +1322,12 @@ static int dl_argv_parse(struct dl *dl, uint32_t o_required,
>>>   		return -EINVAL;
>>>   	}
>>>   
>>> +	if ((o_required & DL_OPT_HEALTH_REPORTER_NAME) &&
>>> +	    !(o_found & DL_OPT_HEALTH_REPORTER_NAME)) {
>>> +		pr_err("Reporter name expected.\n");
>>> +		return -EINVAL;
>>> +	}
>>
>> I realize your are following suit with this change, but these checks for
>> 'required and not found' are getting really long. There is a better way
>> to do this.
> Do you mean somthing like:
> #define requiered_not_found(o_found, o_required, flag, msg)            \
>            ({                                                           \
>                    if ((o_required & flag) && !(o_found & flag)) {      \
>                            pr_err("%s \n", msg);                        \
>                            return -EINVAL;                              \
>                    }                                                    \
>            })
> 

That's one way.

I was also thinking a helper that does a single loop with an array of
bits and messages to print if required and not found.

That parse function is currently 355 lines long with a lot of repetition.

^ permalink raw reply	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2019-01-24  0:27 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-01-17 21:59 [PATCH net-next v2 00/11] Devlink health reporting and recovery system Eran Ben Elisha
2019-01-17 21:59 ` [PATCH net-next v2 01/11] devlink: Add health buffer support Eran Ben Elisha
2019-01-20 10:03   ` Jiri Pirko
2019-01-20 11:06     ` Eran Ben Elisha
2019-01-20 11:08       ` Jiri Pirko
2019-01-20 18:45         ` David Miller
2019-01-21 11:07           ` Eran Ben Elisha
2019-01-21 12:08             ` Jiri Pirko
2019-01-20 11:20   ` Jiri Pirko
2019-01-17 21:59 ` [PATCH net-next v2 02/11] devlink: Add health reporter create/destroy functionality Eran Ben Elisha
2019-01-20 11:49   ` Jiri Pirko
2019-01-17 21:59 ` [PATCH net-next v2 03/11] devlink: Add health report functionality Eran Ben Elisha
2019-01-20 11:27   ` Jiri Pirko
2019-01-21 11:12     ` Eran Ben Elisha
2019-01-17 21:59 ` [PATCH net-next v2 04/11] devlink: Add health get command Eran Ben Elisha
2019-01-20 11:31   ` Jiri Pirko
2019-01-17 21:59 ` [PATCH net-next v2 05/11] devlink: Add health set command Eran Ben Elisha
2019-01-20 11:32   ` Jiri Pirko
2019-01-17 21:59 ` [PATCH net-next v2 06/11] devlink: Add health recover command Eran Ben Elisha
2019-01-20 11:33   ` Jiri Pirko
2019-01-17 21:59 ` [PATCH net-next v2 07/11] devlink: Add health diagnose command Eran Ben Elisha
2019-01-20 11:38   ` Jiri Pirko
2019-01-17 21:59 ` [PATCH net-next v2 08/11] devlink: Add health dump {get,clear} commands Eran Ben Elisha
2019-01-17 21:59 ` [PATCH net-next v2 09/11] net/mlx5e: Add TX reporter support Eran Ben Elisha
2019-01-20 11:06   ` Jiri Pirko
2019-01-21 11:32     ` Eran Ben Elisha
2019-01-21 12:11       ` Jiri Pirko
2019-01-21 13:06         ` Eran Ben Elisha
2019-01-21 13:45           ` Jiri Pirko
2019-01-17 21:59 ` [PATCH net-next v2 10/11] net/mlx5e: Add TX timeout support for mlx5e TX reporter Eran Ben Elisha
2019-01-17 21:59 ` [PATCH net-next v2 11/11] devlink: Add Documentation/networking/devlink-health.txt Eran Ben Elisha
2019-01-18 22:59 ` [PATCH net-next v2 00/11] Devlink health reporting and recovery system David Miller
2019-01-20  9:27 ` [PATCH iproute2-next v2] devlink: Add health command support Aya Levin
2019-01-23  3:37   ` David Ahern
2019-01-23 11:27     ` Aya Levin
2019-01-24  0:27       ` David Ahern

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.