netdev.vger.kernel.org archive mirror
* [PATCH net-next 0/3] mlxsw: Expose critical and emergency module alarms
@ 2020-09-03 13:41 Ido Schimmel
  2020-09-03 13:41 ` [PATCH net-next 1/3] mlxsw: core_hwmon: Split temperature querying from show functions Ido Schimmel
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Ido Schimmel @ 2020-09-03 13:41 UTC
  To: netdev
  Cc: davem, kuba, jiri, amcohen, petrm, vadimp, andrew, mlxsw, Ido Schimmel

From: Ido Schimmel <idosch@nvidia.com>

Amit says:

Extend the hwmon interface with critical and emergency module alarms.

If the current module temperature reaches or exceeds the emergency
threshold, an EMERGENCY alarm is reported by the sensors utility:

$ sensors
...
front panel 025:  +55.0°C  (crit = +35.0°C, emerg = +40.0°C) ALARM(EMERGENCY)

If the current module temperature reaches or exceeds the critical
threshold but is still below the emergency threshold, a CRIT alarm is
reported by the sensors utility:

$ sensors
...
front panel 025:  +54.0°C  (crit = +35.0°C, emerg = +80.0°C) ALARM(CRIT)

Patch set overview:

Patches #1-#2 refactor the code so that it is easier to extend.

Patch #3 extends the hwmon interface with the new module alarms.

Amit Cohen (3):
  mlxsw: core_hwmon: Split temperature querying from show functions
  mlxsw: core_hwmon: Calculate MLXSW_HWMON_ATTR_COUNT more accurately
  mlxsw: core_hwmon: Extend hwmon interface with critical and emergency
    alarms

 .../net/ethernet/mellanox/mlxsw/core_hwmon.c  | 152 +++++++++++++++---
 1 file changed, 134 insertions(+), 18 deletions(-)

-- 
2.26.2



* [PATCH net-next 1/3] mlxsw: core_hwmon: Split temperature querying from show functions
  2020-09-03 13:41 [PATCH net-next 0/3] mlxsw: Expose critical and emergency module alarms Ido Schimmel
@ 2020-09-03 13:41 ` Ido Schimmel
  2020-09-03 13:41 ` [PATCH net-next 2/3] mlxsw: core_hwmon: Calculate MLXSW_HWMON_ATTR_COUNT more accurately Ido Schimmel
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Ido Schimmel @ 2020-09-03 13:41 UTC
  To: netdev
  Cc: davem, kuba, jiri, amcohen, petrm, vadimp, andrew, mlxsw,
	Amit Cohen, Ido Schimmel

From: Amit Cohen <amitc@mellanox.com>

mlxsw_hwmon_module_temp_show(), mlxsw_hwmon_module_temp_critical_show()
and mlxsw_hwmon_module_temp_emergency_show() query the relevant
temperature from firmware and fill the value in provided buffers.

Split the temperature querying functionality into individual get()
functions and call them from the show() functions.

The get() functions will be used by subsequent patches in the set.

Signed-off-by: Amit Cohen <amitc@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
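The get()/show() split described above can be illustrated outside the
kernel. The following is only a minimal user-space sketch: struct
fake_module and the module_temp_*() names are hypothetical stand-ins for
the firmware query and the sysfs callbacks, not the driver's API.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the firmware register query (assumption). */
struct fake_module {
	int temp;      /* value the simulated query would return */
	int query_err; /* non-zero simulates a failed query */
};

/* get(): fetch the raw value only; returns 0 or an error code. */
static int module_temp_get(const struct fake_module *m, int *p_temp)
{
	assert(p_temp != NULL);

	if (m->query_err)
		return m->query_err;

	*p_temp = m->temp;
	return 0;
}

/* show(): format whatever get() returned into the caller's buffer. */
static int module_temp_show(const struct fake_module *m, char *buf,
			    size_t len)
{
	int err, temp;

	err = module_temp_get(m, &temp);
	if (err)
		return err;

	return snprintf(buf, len, "%d\n", temp);
}
```

With this shape, a later caller that needs the raw value (for example, to
compare it against a threshold) can call the get() helper directly
instead of parsing formatted text.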
 .../net/ethernet/mellanox/mlxsw/core_hwmon.c  | 70 ++++++++++++++-----
 1 file changed, 54 insertions(+), 16 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/core_hwmon.c b/drivers/net/ethernet/mellanox/mlxsw/core_hwmon.c
index 3fe878d7c94c..3f1822535bc6 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/core_hwmon.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/core_hwmon.c
@@ -205,25 +205,39 @@ static ssize_t mlxsw_hwmon_pwm_store(struct device *dev,
 	return len;
 }
 
-static ssize_t mlxsw_hwmon_module_temp_show(struct device *dev,
-					    struct device_attribute *attr,
-					    char *buf)
+static int mlxsw_hwmon_module_temp_get(struct device *dev,
+				       struct device_attribute *attr,
+				       int *p_temp)
 {
 	struct mlxsw_hwmon_attr *mlwsw_hwmon_attr =
 			container_of(attr, struct mlxsw_hwmon_attr, dev_attr);
 	struct mlxsw_hwmon *mlxsw_hwmon = mlwsw_hwmon_attr->hwmon;
 	char mtmp_pl[MLXSW_REG_MTMP_LEN];
 	u8 module;
-	int temp;
 	int err;
 
 	module = mlwsw_hwmon_attr->type_index - mlxsw_hwmon->sensor_count;
 	mlxsw_reg_mtmp_pack(mtmp_pl, MLXSW_REG_MTMP_MODULE_INDEX_MIN + module,
 			    false, false);
 	err = mlxsw_reg_query(mlxsw_hwmon->core, MLXSW_REG(mtmp), mtmp_pl);
+	if (err) {
+		dev_err(dev, "Failed to query module temperature\n");
+		return err;
+	}
+	mlxsw_reg_mtmp_unpack(mtmp_pl, p_temp, NULL, NULL);
+
+	return 0;
+}
+
+static ssize_t mlxsw_hwmon_module_temp_show(struct device *dev,
+					    struct device_attribute *attr,
+					    char *buf)
+{
+	int err, temp;
+
+	err = mlxsw_hwmon_module_temp_get(dev, attr, &temp);
 	if (err)
 		return err;
-	mlxsw_reg_mtmp_unpack(mtmp_pl, &temp, NULL, NULL);
 
 	return sprintf(buf, "%d\n", temp);
 }
@@ -270,48 +284,72 @@ static ssize_t mlxsw_hwmon_module_temp_fault_show(struct device *dev,
 	return sprintf(buf, "%u\n", fault);
 }
 
-static ssize_t
-mlxsw_hwmon_module_temp_critical_show(struct device *dev,
-				      struct device_attribute *attr, char *buf)
+static int mlxsw_hwmon_module_temp_critical_get(struct device *dev,
+						struct device_attribute *attr,
+						int *p_temp)
 {
 	struct mlxsw_hwmon_attr *mlwsw_hwmon_attr =
 			container_of(attr, struct mlxsw_hwmon_attr, dev_attr);
 	struct mlxsw_hwmon *mlxsw_hwmon = mlwsw_hwmon_attr->hwmon;
-	int temp;
 	u8 module;
 	int err;
 
 	module = mlwsw_hwmon_attr->type_index - mlxsw_hwmon->sensor_count;
 	err = mlxsw_env_module_temp_thresholds_get(mlxsw_hwmon->core, module,
-						   SFP_TEMP_HIGH_WARN, &temp);
+						   SFP_TEMP_HIGH_WARN, p_temp);
 	if (err) {
 		dev_err(dev, "Failed to query module temperature thresholds\n");
 		return err;
 	}
 
-	return sprintf(buf, "%u\n", temp);
+	return 0;
 }
 
 static ssize_t
-mlxsw_hwmon_module_temp_emergency_show(struct device *dev,
-				       struct device_attribute *attr,
-				       char *buf)
+mlxsw_hwmon_module_temp_critical_show(struct device *dev,
+				      struct device_attribute *attr, char *buf)
+{
+	int err, temp;
+
+	err = mlxsw_hwmon_module_temp_critical_get(dev, attr, &temp);
+	if (err)
+		return err;
+
+	return sprintf(buf, "%u\n", temp);
+}
+
+static int mlxsw_hwmon_module_temp_emergency_get(struct device *dev,
+						 struct device_attribute *attr,
+						 int *p_temp)
 {
 	struct mlxsw_hwmon_attr *mlwsw_hwmon_attr =
 			container_of(attr, struct mlxsw_hwmon_attr, dev_attr);
 	struct mlxsw_hwmon *mlxsw_hwmon = mlwsw_hwmon_attr->hwmon;
 	u8 module;
-	int temp;
 	int err;
 
 	module = mlwsw_hwmon_attr->type_index - mlxsw_hwmon->sensor_count;
 	err = mlxsw_env_module_temp_thresholds_get(mlxsw_hwmon->core, module,
-						   SFP_TEMP_HIGH_ALARM, &temp);
+						   SFP_TEMP_HIGH_ALARM, p_temp);
 	if (err) {
 		dev_err(dev, "Failed to query module temperature thresholds\n");
 		return err;
 	}
 
+	return 0;
+}
+
+static ssize_t
+mlxsw_hwmon_module_temp_emergency_show(struct device *dev,
+				       struct device_attribute *attr,
+				       char *buf)
+{
+	int err, temp;
+
+	err = mlxsw_hwmon_module_temp_emergency_get(dev, attr, &temp);
+	if (err)
+		return err;
+
 	return sprintf(buf, "%u\n", temp);
 }
 
-- 
2.26.2



* [PATCH net-next 2/3] mlxsw: core_hwmon: Calculate MLXSW_HWMON_ATTR_COUNT more accurately
  2020-09-03 13:41 [PATCH net-next 0/3] mlxsw: Expose critical and emergency module alarms Ido Schimmel
  2020-09-03 13:41 ` [PATCH net-next 1/3] mlxsw: core_hwmon: Split temperature querying from show functions Ido Schimmel
@ 2020-09-03 13:41 ` Ido Schimmel
  2020-09-03 13:41 ` [PATCH net-next 3/3] mlxsw: core_hwmon: Extend hwmon interface with critical and emergency alarms Ido Schimmel
  2020-09-03 19:12 ` [PATCH net-next 0/3] mlxsw: Expose critical and emergency module alarms David Miller
  3 siblings, 0 replies; 5+ messages in thread
From: Ido Schimmel @ 2020-09-03 13:41 UTC
  To: netdev
  Cc: davem, kuba, jiri, amcohen, petrm, vadimp, andrew, mlxsw,
	Amit Cohen, Ido Schimmel

From: Amit Cohen <amitc@mellanox.com>

Currently, the value of MLXSW_HWMON_ATTR_COUNT is not calculated
accurately.

Add several defines to make the calculation clearer and easier to
change.

Calculate the precise upper bound on the number of attributes that may
be needed.

Signed-off-by: Amit Cohen <amitc@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
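To make the arithmetic behind the new bound concrete, here is a small
standalone sketch that evaluates both formulas at compile time. The
shortened macro names are illustrative, and TACHOS_MAX/PWMS_MAX use
assumed values (10 and 5) because MLXSW_MFCR_TACHOS_MAX and
MLXSW_MFCR_PWMS_MAX are defined outside this patch.

```c
#include <assert.h>

/* Limits and per-object attribute counts as introduced by this patch. */
#define SENSORS_MAX_COUNT   64
#define MODULES_MAX_COUNT   64
#define GEARBOXES_MAX_COUNT 32

#define ATTR_PER_SENSOR  3
#define ATTR_PER_MODULE  5
#define ATTR_PER_GEARBOX 4

/* Assumed values; the real definitions live elsewhere in the driver. */
#define TACHOS_MAX 10
#define PWMS_MAX   5

/* New bound: each object class contributes count * attributes. */
#define ATTR_COUNT (SENSORS_MAX_COUNT * ATTR_PER_SENSOR + \
		    MODULES_MAX_COUNT * ATTR_PER_MODULE + \
		    GEARBOXES_MAX_COUNT * ATTR_PER_GEARBOX + \
		    TACHOS_MAX + PWMS_MAX)

/* Old bound, kept only for comparison. */
#define OLD_ATTR_COUNT (127 * 4 + TACHOS_MAX + PWMS_MAX)

/* 192 + 320 + 128 temperature attributes, plus 15 fan/PWM attributes. */
static_assert(ATTR_COUNT == 192 + 320 + 128 + 15, "new attribute bound");
static_assert(OLD_ATTR_COUNT == 508 + 15, "old attribute bound");
```

Under these assumptions the bound grows from 523 to 655, and adding a
per-module attribute later only requires changing ATTR_PER_MODULE.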
 drivers/net/ethernet/mellanox/mlxsw/core_hwmon.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/core_hwmon.c b/drivers/net/ethernet/mellanox/mlxsw/core_hwmon.c
index 3f1822535bc6..f1b0c176eaeb 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/core_hwmon.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/core_hwmon.c
@@ -12,8 +12,17 @@
 #include "core.h"
 #include "core_env.h"
 
-#define MLXSW_HWMON_TEMP_SENSOR_MAX_COUNT 127
-#define MLXSW_HWMON_ATTR_COUNT (MLXSW_HWMON_TEMP_SENSOR_MAX_COUNT * 4 + \
+#define MLXSW_HWMON_SENSORS_MAX_COUNT 64
+#define MLXSW_HWMON_MODULES_MAX_COUNT 64
+#define MLXSW_HWMON_GEARBOXES_MAX_COUNT 32
+
+#define MLXSW_HWMON_ATTR_PER_SENSOR 3
+#define MLXSW_HWMON_ATTR_PER_MODULE 5
+#define MLXSW_HWMON_ATTR_PER_GEARBOX 4
+
+#define MLXSW_HWMON_ATTR_COUNT (MLXSW_HWMON_SENSORS_MAX_COUNT * MLXSW_HWMON_ATTR_PER_SENSOR + \
+				MLXSW_HWMON_MODULES_MAX_COUNT * MLXSW_HWMON_ATTR_PER_MODULE + \
+				MLXSW_HWMON_GEARBOXES_MAX_COUNT * MLXSW_HWMON_ATTR_PER_GEARBOX + \
 				MLXSW_MFCR_TACHOS_MAX + MLXSW_MFCR_PWMS_MAX)
 
 struct mlxsw_hwmon_attr {
-- 
2.26.2



* [PATCH net-next 3/3] mlxsw: core_hwmon: Extend hwmon interface with critical and emergency alarms
  2020-09-03 13:41 [PATCH net-next 0/3] mlxsw: Expose critical and emergency module alarms Ido Schimmel
  2020-09-03 13:41 ` [PATCH net-next 1/3] mlxsw: core_hwmon: Split temperature querying from show functions Ido Schimmel
  2020-09-03 13:41 ` [PATCH net-next 2/3] mlxsw: core_hwmon: Calculate MLXSW_HWMON_ATTR_COUNT more accurately Ido Schimmel
@ 2020-09-03 13:41 ` Ido Schimmel
  2020-09-03 19:12 ` [PATCH net-next 0/3] mlxsw: Expose critical and emergency module alarms David Miller
  3 siblings, 0 replies; 5+ messages in thread
From: Ido Schimmel @ 2020-09-03 13:41 UTC
  To: netdev
  Cc: davem, kuba, jiri, amcohen, petrm, vadimp, andrew, mlxsw,
	Amit Cohen, Ido Schimmel

From: Amit Cohen <amitc@mellanox.com>

Add new attributes to the hwmon object to expose critical and emergency
alarms.

If the current temperature reaches or exceeds the emergency threshold,
an EMERGENCY alarm is reported by the sensors utility:

$ sensors
...
front panel 025:  +55.0°C  (crit = +35.0°C, emerg = +40.0°C) ALARM(EMERGENCY)

If the current temperature reaches or exceeds the critical threshold
but is still below the emergency threshold, a CRIT alarm is reported by
the sensors utility:

$ sensors
...
front panel 025:  +54.0°C  (crit = +35.0°C, emerg = +80.0°C) ALARM(CRIT)

Signed-off-by: Amit Cohen <amitc@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Acked-by: Vadim Pasternak <vadimp@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
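The alarm rules in this patch can be restated as two pure functions. This
is a simplified sketch of the decision logic only: the function names are
hypothetical, and the firmware queries and error handling of the real
show() callbacks are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Emergency alarm: non-positive readings never raise an alarm. */
static bool emergency_alarm(int temp, int emerg)
{
	if (temp <= 0)
		return false;

	return temp >= emerg;
}

/* Critical alarm: at or above critical, but not already an emergency. */
static bool critical_alarm(int temp, int crit, int emerg)
{
	if (temp <= 0)
		return false;

	if (temp >= emerg)
		return false;

	return temp >= crit;
}

/* Replays the two sensors examples from the commit message (in Celsius). */
static void alarm_examples(void)
{
	/* +55.0 with crit = +35.0, emerg = +40.0: EMERGENCY only. */
	assert(emergency_alarm(55, 40) && !critical_alarm(55, 35, 40));
	/* +54.0 with crit = +35.0, emerg = +80.0: CRIT only. */
	assert(!emergency_alarm(54, 80) && critical_alarm(54, 35, 80));
}
```

Because the critical check returns false once the emergency threshold is
reached, at most one of the two alarms is set for a given reading.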
 .../net/ethernet/mellanox/mlxsw/core_hwmon.c  | 71 ++++++++++++++++++-
 1 file changed, 70 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/core_hwmon.c b/drivers/net/ethernet/mellanox/mlxsw/core_hwmon.c
index f1b0c176eaeb..8232bc0f5c03 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/core_hwmon.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/core_hwmon.c
@@ -17,7 +17,7 @@
 #define MLXSW_HWMON_GEARBOXES_MAX_COUNT 32
 
 #define MLXSW_HWMON_ATTR_PER_SENSOR 3
-#define MLXSW_HWMON_ATTR_PER_MODULE 5
+#define MLXSW_HWMON_ATTR_PER_MODULE 7
 #define MLXSW_HWMON_ATTR_PER_GEARBOX 4
 
 #define MLXSW_HWMON_ATTR_COUNT (MLXSW_HWMON_SENSORS_MAX_COUNT * MLXSW_HWMON_ATTR_PER_SENSOR + \
@@ -388,6 +388,53 @@ mlxsw_hwmon_gbox_temp_label_show(struct device *dev,
 	return sprintf(buf, "gearbox %03u\n", index);
 }
 
+static ssize_t mlxsw_hwmon_temp_critical_alarm_show(struct device *dev,
+						    struct device_attribute *attr,
+						    char *buf)
+{
+	int err, temp, emergency_temp, critic_temp;
+
+	err = mlxsw_hwmon_module_temp_get(dev, attr, &temp);
+	if (err)
+		return err;
+
+	if (temp <= 0)
+		return sprintf(buf, "%d\n", false);
+
+	err = mlxsw_hwmon_module_temp_emergency_get(dev, attr, &emergency_temp);
+	if (err)
+		return err;
+
+	if (temp >= emergency_temp)
+		return sprintf(buf, "%d\n", false);
+
+	err = mlxsw_hwmon_module_temp_critical_get(dev, attr, &critic_temp);
+	if (err)
+		return err;
+
+	return sprintf(buf, "%d\n", temp >= critic_temp);
+}
+
+static ssize_t mlxsw_hwmon_temp_emergency_alarm_show(struct device *dev,
+						     struct device_attribute *attr,
+						     char *buf)
+{
+	int err, temp, emergency_temp;
+
+	err = mlxsw_hwmon_module_temp_get(dev, attr, &temp);
+	if (err)
+		return err;
+
+	if (temp <= 0)
+		return sprintf(buf, "%d\n", false);
+
+	err = mlxsw_hwmon_module_temp_emergency_get(dev, attr, &emergency_temp);
+	if (err)
+		return err;
+
+	return sprintf(buf, "%d\n", temp >= emergency_temp);
+}
+
 enum mlxsw_hwmon_attr_type {
 	MLXSW_HWMON_ATTR_TYPE_TEMP,
 	MLXSW_HWMON_ATTR_TYPE_TEMP_MAX,
@@ -401,6 +448,8 @@ enum mlxsw_hwmon_attr_type {
 	MLXSW_HWMON_ATTR_TYPE_TEMP_MODULE_EMERG,
 	MLXSW_HWMON_ATTR_TYPE_TEMP_MODULE_LABEL,
 	MLXSW_HWMON_ATTR_TYPE_TEMP_GBOX_LABEL,
+	MLXSW_HWMON_ATTR_TYPE_TEMP_CRIT_ALARM,
+	MLXSW_HWMON_ATTR_TYPE_TEMP_EMERGENCY_ALARM,
 };
 
 static void mlxsw_hwmon_attr_add(struct mlxsw_hwmon *mlxsw_hwmon,
@@ -491,6 +540,20 @@ static void mlxsw_hwmon_attr_add(struct mlxsw_hwmon *mlxsw_hwmon,
 		snprintf(mlxsw_hwmon_attr->name, sizeof(mlxsw_hwmon_attr->name),
 			 "temp%u_label", num + 1);
 		break;
+	case MLXSW_HWMON_ATTR_TYPE_TEMP_CRIT_ALARM:
+		mlxsw_hwmon_attr->dev_attr.show =
+			mlxsw_hwmon_temp_critical_alarm_show;
+		mlxsw_hwmon_attr->dev_attr.attr.mode = 0444;
+		snprintf(mlxsw_hwmon_attr->name, sizeof(mlxsw_hwmon_attr->name),
+			 "temp%u_crit_alarm", num + 1);
+		break;
+	case MLXSW_HWMON_ATTR_TYPE_TEMP_EMERGENCY_ALARM:
+		mlxsw_hwmon_attr->dev_attr.show =
+			mlxsw_hwmon_temp_emergency_alarm_show;
+		mlxsw_hwmon_attr->dev_attr.attr.mode = 0444;
+		snprintf(mlxsw_hwmon_attr->name, sizeof(mlxsw_hwmon_attr->name),
+			 "temp%u_emergency_alarm", num + 1);
+		break;
 	default:
 		WARN_ON(1);
 	}
@@ -613,6 +676,12 @@ static int mlxsw_hwmon_module_init(struct mlxsw_hwmon *mlxsw_hwmon)
 		mlxsw_hwmon_attr_add(mlxsw_hwmon,
 				     MLXSW_HWMON_ATTR_TYPE_TEMP_MODULE_LABEL,
 				     i, i);
+		mlxsw_hwmon_attr_add(mlxsw_hwmon,
+				     MLXSW_HWMON_ATTR_TYPE_TEMP_CRIT_ALARM,
+				     i, i);
+		mlxsw_hwmon_attr_add(mlxsw_hwmon,
+				     MLXSW_HWMON_ATTR_TYPE_TEMP_EMERGENCY_ALARM,
+				     i, i);
 	}
 
 	return 0;
-- 
2.26.2



* Re: [PATCH net-next 0/3] mlxsw: Expose critical and emergency module alarms
  2020-09-03 13:41 [PATCH net-next 0/3] mlxsw: Expose critical and emergency module alarms Ido Schimmel
                   ` (2 preceding siblings ...)
  2020-09-03 13:41 ` [PATCH net-next 3/3] mlxsw: core_hwmon: Extend hwmon interface with critical and emergency alarms Ido Schimmel
@ 2020-09-03 19:12 ` David Miller
  3 siblings, 0 replies; 5+ messages in thread
From: David Miller @ 2020-09-03 19:12 UTC
  To: idosch; +Cc: netdev, kuba, jiri, amcohen, petrm, vadimp, andrew, mlxsw, idosch

From: Ido Schimmel <idosch@idosch.org>
Date: Thu,  3 Sep 2020 16:41:43 +0300

> Amit says:
> 
> Extend hwmon interface with critical and emergency module alarms.
 ...

Looks good, series applied, thanks.


