All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net-next 0/3] mlxsw: Thermal and hwmon extensions
@ 2019-06-23 12:56 Ido Schimmel
  2019-06-23 12:56 ` [PATCH net-next 1/3] mlxsw: core: Extend thermal core with per inter-connect device thermal zones Ido Schimmel
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Ido Schimmel @ 2019-06-23 12:56 UTC (permalink / raw)
  To: netdev; +Cc: davem, jiri, mlxsw, Ido Schimmel

From: Ido Schimmel <idosch@mellanox.com>

This patchset from Vadim includes various enhancements to thermal and
hwmon code in mlxsw.

Patch #1 adds a thermal zone for each inter-connect device (gearbox).
These devices are present in SN3800 systems and code to expose their
temperature via hwmon was added in commit 2e265a8b6c09 ("mlxsw: core:
Extend hwmon interface with inter-connect temperature attributes").

Currently, there are multiple thermal zones in mlxsw and only a few
cooling devices. Patch #2 detects the hottest thermal zone and the
cooling devices are switched to follow its trends. RFC was sent last
month [1].

Patch #3 allows to read and report negative temperature of the sensors
mlxsw exposes via hwmon and thermal subsystems.

[1] https://patchwork.ozlabs.org/patch/1107161/

Vadim Pasternak (3):
  mlxsw: core: Extend thermal core with per inter-connect device thermal
    zones
  mlxsw: core: Add the hottest thermal zone detection
  mlxsw: core: Add support for negative temperature readout

 .../net/ethernet/mellanox/mlxsw/core_hwmon.c  |  12 +-
 .../ethernet/mellanox/mlxsw/core_thermal.c    | 208 +++++++++++++++++-
 drivers/net/ethernet/mellanox/mlxsw/reg.h     |  12 +-
 3 files changed, 208 insertions(+), 24 deletions(-)

-- 
2.20.1


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH net-next 1/3] mlxsw: core: Extend thermal core with per inter-connect device thermal zones
  2019-06-23 12:56 [PATCH net-next 0/3] mlxsw: Thermal and hwmon extensions Ido Schimmel
@ 2019-06-23 12:56 ` Ido Schimmel
  2019-06-23 12:56 ` [PATCH net-next 2/3] mlxsw: core: Add the hottest thermal zone detection Ido Schimmel
  2019-06-23 12:56 ` [PATCH net-next 3/3] mlxsw: core: Add support for negative temperature readout Ido Schimmel
  2 siblings, 0 replies; 9+ messages in thread
From: Ido Schimmel @ 2019-06-23 12:56 UTC (permalink / raw)
  To: netdev; +Cc: davem, jiri, mlxsw, Vadim Pasternak, Ido Schimmel

From: Vadim Pasternak <vadimp@mellanox.com>

Add a dedicated thermal zone for each inter-connect device. The
current temperature is obtained from inter-connect temperature sensor
and the default trip points are set to the same values as default ASIC
trip points. These settings could be changed from the user space.
A cooling device (fan) is bound to all inter-connect devices.

Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
---
 .../ethernet/mellanox/mlxsw/core_thermal.c    | 137 +++++++++++++++++-
 1 file changed, 136 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/core_thermal.c b/drivers/net/ethernet/mellanox/mlxsw/core_thermal.c
index cfab0e330a47..88f43ad2cc4f 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/core_thermal.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/core_thermal.c
@@ -98,7 +98,7 @@ struct mlxsw_thermal_module {
 	struct thermal_zone_device *tzdev;
 	struct mlxsw_thermal_trip trips[MLXSW_THERMAL_NUM_TRIPS];
 	enum thermal_device_mode mode;
-	int module;
+	int module; /* Module or gearbox number */
 };
 
 struct mlxsw_thermal {
@@ -111,6 +111,8 @@ struct mlxsw_thermal {
 	struct mlxsw_thermal_trip trips[MLXSW_THERMAL_NUM_TRIPS];
 	enum thermal_device_mode mode;
 	struct mlxsw_thermal_module *tz_module_arr;
+	struct mlxsw_thermal_module *tz_gearbox_arr;
+	u8 tz_gearbox_num;
 };
 
 static inline u8 mlxsw_state_to_duty(int state)
@@ -554,6 +556,46 @@ static struct thermal_zone_device_ops mlxsw_thermal_module_ops = {
 	.set_trip_hyst	= mlxsw_thermal_module_trip_hyst_set,
 };
 
+static int mlxsw_thermal_gearbox_temp_get(struct thermal_zone_device *tzdev,
+					  int *p_temp)
+{
+	struct mlxsw_thermal_module *tz = tzdev->devdata;
+	struct mlxsw_thermal *thermal = tz->parent;
+	char mtmp_pl[MLXSW_REG_MTMP_LEN];
+	unsigned int temp;
+	u16 index;
+	int err;
+
+	index = MLXSW_REG_MTMP_GBOX_INDEX_MIN + tz->module;
+	mlxsw_reg_mtmp_pack(mtmp_pl, index, false, false);
+
+	err = mlxsw_reg_query(thermal->core, MLXSW_REG(mtmp), mtmp_pl);
+	if (err)
+		return err;
+
+	mlxsw_reg_mtmp_unpack(mtmp_pl, &temp, NULL, NULL);
+
+	*p_temp = (int) temp;
+	return 0;
+}
+
+static struct thermal_zone_device_ops mlxsw_thermal_gearbox_ops = {
+	.bind		= mlxsw_thermal_module_bind,
+	.unbind		= mlxsw_thermal_module_unbind,
+	.get_mode	= mlxsw_thermal_module_mode_get,
+	.set_mode	= mlxsw_thermal_module_mode_set,
+	.get_temp	= mlxsw_thermal_gearbox_temp_get,
+	.get_trip_type	= mlxsw_thermal_module_trip_type_get,
+	.get_trip_temp	= mlxsw_thermal_module_trip_temp_get,
+	.set_trip_temp	= mlxsw_thermal_module_trip_temp_set,
+	.get_trip_hyst	= mlxsw_thermal_module_trip_hyst_get,
+	.set_trip_hyst	= mlxsw_thermal_module_trip_hyst_set,
+};
+
+static struct thermal_zone_params mlxsw_thermal_gearbox_params = {
+	.governor_name = "user_space",
+};
+
 static int mlxsw_thermal_get_max_state(struct thermal_cooling_device *cdev,
 				       unsigned long *p_state)
 {
@@ -779,6 +821,92 @@ mlxsw_thermal_modules_fini(struct mlxsw_thermal *thermal)
 	kfree(thermal->tz_module_arr);
 }
 
+static int
+mlxsw_thermal_gearbox_tz_init(struct mlxsw_thermal_module *gearbox_tz)
+{
+	char tz_name[MLXSW_THERMAL_ZONE_MAX_NAME];
+
+	snprintf(tz_name, sizeof(tz_name), "mlxsw-gearbox%d",
+		 gearbox_tz->module + 1);
+	gearbox_tz->tzdev = thermal_zone_device_register(tz_name,
+						MLXSW_THERMAL_NUM_TRIPS,
+						MLXSW_THERMAL_TRIP_MASK,
+						gearbox_tz,
+						&mlxsw_thermal_gearbox_ops,
+						&mlxsw_thermal_gearbox_params,
+						0, 0);
+	if (IS_ERR(gearbox_tz->tzdev))
+		return PTR_ERR(gearbox_tz->tzdev);
+
+	return 0;
+}
+
+static void
+mlxsw_thermal_gearbox_tz_fini(struct mlxsw_thermal_module *gearbox_tz)
+{
+	thermal_zone_device_unregister(gearbox_tz->tzdev);
+}
+
+static int
+mlxsw_thermal_gearboxes_init(struct device *dev, struct mlxsw_core *core,
+			     struct mlxsw_thermal *thermal)
+{
+	struct mlxsw_thermal_module *gearbox_tz;
+	char mgpir_pl[MLXSW_REG_MGPIR_LEN];
+	int i;
+	int err;
+
+	if (!mlxsw_core_res_query_enabled(core))
+		return 0;
+
+	mlxsw_reg_mgpir_pack(mgpir_pl);
+	err = mlxsw_reg_query(core, MLXSW_REG(mgpir), mgpir_pl);
+	if (err)
+		return err;
+
+	mlxsw_reg_mgpir_unpack(mgpir_pl, &thermal->tz_gearbox_num, NULL, NULL);
+	if (!thermal->tz_gearbox_num)
+		return 0;
+
+	thermal->tz_gearbox_arr = kcalloc(thermal->tz_gearbox_num,
+					  sizeof(*thermal->tz_gearbox_arr),
+					  GFP_KERNEL);
+	if (!thermal->tz_gearbox_arr)
+		return -ENOMEM;
+
+	for (i = 0; i < thermal->tz_gearbox_num; i++) {
+		gearbox_tz = &thermal->tz_gearbox_arr[i];
+		memcpy(gearbox_tz->trips, default_thermal_trips,
+		       sizeof(thermal->trips));
+		gearbox_tz->module = i;
+		gearbox_tz->parent = thermal;
+		err = mlxsw_thermal_gearbox_tz_init(gearbox_tz);
+		if (err)
+			goto err_unreg_tz_gearbox;
+	}
+
+	return 0;
+
+err_unreg_tz_gearbox:
+	for (i--; i >= 0; i--)
+		mlxsw_thermal_gearbox_tz_fini(&thermal->tz_gearbox_arr[i]);
+	kfree(thermal->tz_gearbox_arr);
+	return err;
+}
+
+static void
+mlxsw_thermal_gearboxes_fini(struct mlxsw_thermal *thermal)
+{
+	int i;
+
+	if (!mlxsw_core_res_query_enabled(thermal->core))
+		return;
+
+	for (i = thermal->tz_gearbox_num - 1; i >= 0; i--)
+		mlxsw_thermal_gearbox_tz_fini(&thermal->tz_gearbox_arr[i]);
+	kfree(thermal->tz_gearbox_arr);
+}
+
 int mlxsw_thermal_init(struct mlxsw_core *core,
 		       const struct mlxsw_bus_info *bus_info,
 		       struct mlxsw_thermal **p_thermal)
@@ -869,10 +997,16 @@ int mlxsw_thermal_init(struct mlxsw_core *core,
 	if (err)
 		goto err_unreg_tzdev;
 
+	err = mlxsw_thermal_gearboxes_init(dev, core, thermal);
+	if (err)
+		goto err_unreg_modules_tzdev;
+
 	thermal->mode = THERMAL_DEVICE_ENABLED;
 	*p_thermal = thermal;
 	return 0;
 
+err_unreg_modules_tzdev:
+	mlxsw_thermal_modules_fini(thermal);
 err_unreg_tzdev:
 	if (thermal->tzdev) {
 		thermal_zone_device_unregister(thermal->tzdev);
@@ -891,6 +1025,7 @@ void mlxsw_thermal_fini(struct mlxsw_thermal *thermal)
 {
 	int i;
 
+	mlxsw_thermal_gearboxes_fini(thermal);
 	mlxsw_thermal_modules_fini(thermal);
 	if (thermal->tzdev) {
 		thermal_zone_device_unregister(thermal->tzdev);
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH net-next 2/3] mlxsw: core: Add the hottest thermal zone detection
  2019-06-23 12:56 [PATCH net-next 0/3] mlxsw: Thermal and hwmon extensions Ido Schimmel
  2019-06-23 12:56 ` [PATCH net-next 1/3] mlxsw: core: Extend thermal core with per inter-connect device thermal zones Ido Schimmel
@ 2019-06-23 12:56 ` Ido Schimmel
  2019-06-23 12:56 ` [PATCH net-next 3/3] mlxsw: core: Add support for negative temperature readout Ido Schimmel
  2 siblings, 0 replies; 9+ messages in thread
From: Ido Schimmel @ 2019-06-23 12:56 UTC (permalink / raw)
  To: netdev; +Cc: davem, jiri, mlxsw, Vadim Pasternak, Ido Schimmel

From: Vadim Pasternak <vadimp@mellanox.com>

When multiple sensors are mapped to the same cooling device, the
cooling device should be set according the worst sensor from the
sensors associated with this cooling device.

Provide the hottest thermal zone detection and enforce cooling device
to follow the temperature trends of the hottest zone only.
Prevent competition for the cooling device control from others zones,
by "stable trend" indication. A cooling device will not perform any
actions associated with a zone with a "stable trend".

When other thermal zone is detected as a hottest, a cooling device is
to be switched to following temperature trends of new hottest zone.

Thermal zone score is represented by 32 bits unsigned integer and
calculated according to the next formula:
For T < TZ<t><i>, where t from {normal trip = 0, high trip = 1, hot
trip = 2, critical = 3}:
TZ<i> score = (T + (TZ<t><i> - T) / 2) / (TZ<t><i> - T) * 256 ** j;
Highest thermal zone score s is set as MAX(TZ<i>score);
Following this formula, if TZ<i> is in trip point higher than TZ<k>,
the higher score is to be always assigned to TZ<i>.

For two thermal zones located at the same kind of trip point, the higher
score will be assigned to the zone which is closer to the next trip
point. Thus, the highest score will always be assigned objectively to
the hottest thermal zone.

All the thermal zones initially are to be configured with mode
"enabled" with the "step_wise" governor.

Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
---
 .../ethernet/mellanox/mlxsw/core_thermal.c    | 75 +++++++++++++++----
 1 file changed, 62 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/core_thermal.c b/drivers/net/ethernet/mellanox/mlxsw/core_thermal.c
index 88f43ad2cc4f..504a34d240f7 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/core_thermal.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/core_thermal.c
@@ -23,6 +23,7 @@
 #define MLXSW_THERMAL_HYSTERESIS_TEMP	5000	/* 5C */
 #define MLXSW_THERMAL_MODULE_TEMP_SHIFT	(MLXSW_THERMAL_HYSTERESIS_TEMP * 2)
 #define MLXSW_THERMAL_ZONE_MAX_NAME	16
+#define MLXSW_THERMAL_TEMP_SCORE_MAX	GENMASK(31, 0)
 #define MLXSW_THERMAL_MAX_STATE	10
 #define MLXSW_THERMAL_MAX_DUTY	255
 /* Minimum and maximum fan allowed speed in percent: from 20% to 100%. Values
@@ -113,6 +114,8 @@ struct mlxsw_thermal {
 	struct mlxsw_thermal_module *tz_module_arr;
 	struct mlxsw_thermal_module *tz_gearbox_arr;
 	u8 tz_gearbox_num;
+	unsigned int tz_highest_score;
+	struct thermal_zone_device *tz_highest_dev;
 };
 
 static inline u8 mlxsw_state_to_duty(int state)
@@ -197,6 +200,34 @@ mlxsw_thermal_module_trips_update(struct device *dev, struct mlxsw_core *core,
 	return 0;
 }
 
+static void mlxsw_thermal_tz_score_update(struct mlxsw_thermal *thermal,
+					  struct thermal_zone_device *tzdev,
+					  struct mlxsw_thermal_trip *trips,
+					  int temp)
+{
+	struct mlxsw_thermal_trip *trip = trips;
+	unsigned int score, delta, i, shift = 1;
+
+	/* Calculate thermal zone score, if temperature is above the critical
+	 * threshold score is set to MLXSW_THERMAL_TEMP_SCORE_MAX.
+	 */
+	score = MLXSW_THERMAL_TEMP_SCORE_MAX;
+	for (i = MLXSW_THERMAL_TEMP_TRIP_NORM; i < MLXSW_THERMAL_NUM_TRIPS;
+	     i++, trip++) {
+		if (temp < trip->temp) {
+			delta = DIV_ROUND_CLOSEST(temp, trip->temp - temp);
+			score = delta * shift;
+			break;
+		}
+		shift *= 256;
+	}
+
+	if (score > thermal->tz_highest_score) {
+		thermal->tz_highest_score = score;
+		thermal->tz_highest_dev = tzdev;
+	}
+}
+
 static int mlxsw_thermal_bind(struct thermal_zone_device *tzdev,
 			      struct thermal_cooling_device *cdev)
 {
@@ -292,6 +323,9 @@ static int mlxsw_thermal_get_temp(struct thermal_zone_device *tzdev,
 		return err;
 	}
 	mlxsw_reg_mtmp_unpack(mtmp_pl, &temp, NULL, NULL);
+	if (temp > 0)
+		mlxsw_thermal_tz_score_update(thermal, tzdev, thermal->trips,
+					      temp);
 
 	*p_temp = (int) temp;
 	return 0;
@@ -353,6 +387,22 @@ static int mlxsw_thermal_set_trip_hyst(struct thermal_zone_device *tzdev,
 	return 0;
 }
 
+static int mlxsw_thermal_trend_get(struct thermal_zone_device *tzdev,
+				   int trip, enum thermal_trend *trend)
+{
+	struct mlxsw_thermal_module *tz = tzdev->devdata;
+	struct mlxsw_thermal *thermal = tz->parent;
+
+	if (trip < 0 || trip >= MLXSW_THERMAL_NUM_TRIPS)
+		return -EINVAL;
+
+	if (tzdev == thermal->tz_highest_dev)
+		return 1;
+
+	*trend = THERMAL_TREND_STABLE;
+	return 0;
+}
+
 static struct thermal_zone_device_ops mlxsw_thermal_ops = {
 	.bind = mlxsw_thermal_bind,
 	.unbind = mlxsw_thermal_unbind,
@@ -364,6 +414,7 @@ static struct thermal_zone_device_ops mlxsw_thermal_ops = {
 	.set_trip_temp	= mlxsw_thermal_set_trip_temp,
 	.get_trip_hyst	= mlxsw_thermal_get_trip_hyst,
 	.set_trip_hyst	= mlxsw_thermal_set_trip_hyst,
+	.get_trend	= mlxsw_thermal_trend_get,
 };
 
 static int mlxsw_thermal_module_bind(struct thermal_zone_device *tzdev,
@@ -474,7 +525,9 @@ static int mlxsw_thermal_module_temp_get(struct thermal_zone_device *tzdev,
 		return 0;
 
 	/* Update trip points. */
-	mlxsw_thermal_module_trips_update(dev, thermal->core, tz);
+	err = mlxsw_thermal_module_trips_update(dev, thermal->core, tz);
+	if (!err)
+		mlxsw_thermal_tz_score_update(thermal, tzdev, tz->trips, temp);
 
 	return 0;
 }
@@ -539,10 +592,6 @@ mlxsw_thermal_module_trip_hyst_set(struct thermal_zone_device *tzdev, int trip,
 	return 0;
 }
 
-static struct thermal_zone_params mlxsw_thermal_module_params = {
-	.governor_name = "user_space",
-};
-
 static struct thermal_zone_device_ops mlxsw_thermal_module_ops = {
 	.bind		= mlxsw_thermal_module_bind,
 	.unbind		= mlxsw_thermal_module_unbind,
@@ -554,6 +603,7 @@ static struct thermal_zone_device_ops mlxsw_thermal_module_ops = {
 	.set_trip_temp	= mlxsw_thermal_module_trip_temp_set,
 	.get_trip_hyst	= mlxsw_thermal_module_trip_hyst_get,
 	.set_trip_hyst	= mlxsw_thermal_module_trip_hyst_set,
+	.get_trend	= mlxsw_thermal_trend_get,
 };
 
 static int mlxsw_thermal_gearbox_temp_get(struct thermal_zone_device *tzdev,
@@ -574,6 +624,8 @@ static int mlxsw_thermal_gearbox_temp_get(struct thermal_zone_device *tzdev,
 		return err;
 
 	mlxsw_reg_mtmp_unpack(mtmp_pl, &temp, NULL, NULL);
+	if (temp > 0)
+		mlxsw_thermal_tz_score_update(thermal, tzdev, tz->trips, temp);
 
 	*p_temp = (int) temp;
 	return 0;
@@ -590,10 +642,7 @@ static struct thermal_zone_device_ops mlxsw_thermal_gearbox_ops = {
 	.set_trip_temp	= mlxsw_thermal_module_trip_temp_set,
 	.get_trip_hyst	= mlxsw_thermal_module_trip_hyst_get,
 	.set_trip_hyst	= mlxsw_thermal_module_trip_hyst_set,
-};
-
-static struct thermal_zone_params mlxsw_thermal_gearbox_params = {
-	.governor_name = "user_space",
+	.get_trend	= mlxsw_thermal_trend_get,
 };
 
 static int mlxsw_thermal_get_max_state(struct thermal_cooling_device *cdev,
@@ -709,13 +758,13 @@ mlxsw_thermal_module_tz_init(struct mlxsw_thermal_module *module_tz)
 							MLXSW_THERMAL_TRIP_MASK,
 							module_tz,
 							&mlxsw_thermal_module_ops,
-							&mlxsw_thermal_module_params,
-							0, 0);
+							NULL, 0, 0);
 	if (IS_ERR(module_tz->tzdev)) {
 		err = PTR_ERR(module_tz->tzdev);
 		return err;
 	}
 
+	module_tz->mode = THERMAL_DEVICE_ENABLED;
 	return 0;
 }
 
@@ -833,11 +882,11 @@ mlxsw_thermal_gearbox_tz_init(struct mlxsw_thermal_module *gearbox_tz)
 						MLXSW_THERMAL_TRIP_MASK,
 						gearbox_tz,
 						&mlxsw_thermal_gearbox_ops,
-						&mlxsw_thermal_gearbox_params,
-						0, 0);
+						NULL, 0, 0);
 	if (IS_ERR(gearbox_tz->tzdev))
 		return PTR_ERR(gearbox_tz->tzdev);
 
+	gearbox_tz->mode = THERMAL_DEVICE_ENABLED;
 	return 0;
 }
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH net-next 3/3] mlxsw: core: Add support for negative temperature readout
  2019-06-23 12:56 [PATCH net-next 0/3] mlxsw: Thermal and hwmon extensions Ido Schimmel
  2019-06-23 12:56 ` [PATCH net-next 1/3] mlxsw: core: Extend thermal core with per inter-connect device thermal zones Ido Schimmel
  2019-06-23 12:56 ` [PATCH net-next 2/3] mlxsw: core: Add the hottest thermal zone detection Ido Schimmel
@ 2019-06-23 12:56 ` Ido Schimmel
  2019-06-23 15:44   ` Andrew Lunn
  2 siblings, 1 reply; 9+ messages in thread
From: Ido Schimmel @ 2019-06-23 12:56 UTC (permalink / raw)
  To: netdev; +Cc: davem, jiri, mlxsw, Vadim Pasternak, Ido Schimmel

From: Vadim Pasternak <vadimp@mellanox.com>

Extend macros MLXSW_REG_MTMP_TEMP_TO_MC() to allow support of negative
temperature readout, since chip and others thermal components are
capable of operating within the negative temperature.
With no such support negative temperature will be consider as very high
temperature and it will cause wrong readout and thermal shutdown.
For negative values 2`s complement is used.
Tested in chamber.
Example of chip ambient temperature readout with chamber temperature:
-10 Celsius:
temp1:             -6.0C  (highest =  -5.0C)
-5 Celsius:
temp1:             -1.0C  (highest =  -1.0C)

Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlxsw/core_hwmon.c   | 12 +++++-------
 drivers/net/ethernet/mellanox/mlxsw/core_thermal.c | 14 +++++++-------
 drivers/net/ethernet/mellanox/mlxsw/reg.h          | 12 +++++++-----
 3 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/core_hwmon.c b/drivers/net/ethernet/mellanox/mlxsw/core_hwmon.c
index 056e3f55ae6c..fb1fd334a8d4 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/core_hwmon.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/core_hwmon.c
@@ -52,8 +52,7 @@ static ssize_t mlxsw_hwmon_temp_show(struct device *dev,
 			container_of(attr, struct mlxsw_hwmon_attr, dev_attr);
 	struct mlxsw_hwmon *mlxsw_hwmon = mlwsw_hwmon_attr->hwmon;
 	char mtmp_pl[MLXSW_REG_MTMP_LEN];
-	unsigned int temp;
-	int index;
+	int temp, index;
 	int err;
 
 	index = mlxsw_hwmon_get_attr_index(mlwsw_hwmon_attr->type_index,
@@ -65,7 +64,7 @@ static ssize_t mlxsw_hwmon_temp_show(struct device *dev,
 		return err;
 	}
 	mlxsw_reg_mtmp_unpack(mtmp_pl, &temp, NULL, NULL);
-	return sprintf(buf, "%u\n", temp);
+	return sprintf(buf, "%d\n", temp);
 }
 
 static ssize_t mlxsw_hwmon_temp_max_show(struct device *dev,
@@ -76,8 +75,7 @@ static ssize_t mlxsw_hwmon_temp_max_show(struct device *dev,
 			container_of(attr, struct mlxsw_hwmon_attr, dev_attr);
 	struct mlxsw_hwmon *mlxsw_hwmon = mlwsw_hwmon_attr->hwmon;
 	char mtmp_pl[MLXSW_REG_MTMP_LEN];
-	unsigned int temp_max;
-	int index;
+	int temp_max, index;
 	int err;
 
 	index = mlxsw_hwmon_get_attr_index(mlwsw_hwmon_attr->type_index,
@@ -89,7 +87,7 @@ static ssize_t mlxsw_hwmon_temp_max_show(struct device *dev,
 		return err;
 	}
 	mlxsw_reg_mtmp_unpack(mtmp_pl, NULL, &temp_max, NULL);
-	return sprintf(buf, "%u\n", temp_max);
+	return sprintf(buf, "%d\n", temp_max);
 }
 
 static ssize_t mlxsw_hwmon_temp_rst_store(struct device *dev,
@@ -215,8 +213,8 @@ static ssize_t mlxsw_hwmon_module_temp_show(struct device *dev,
 			container_of(attr, struct mlxsw_hwmon_attr, dev_attr);
 	struct mlxsw_hwmon *mlxsw_hwmon = mlwsw_hwmon_attr->hwmon;
 	char mtmp_pl[MLXSW_REG_MTMP_LEN];
-	unsigned int temp;
 	u8 module;
+	int temp;
 	int err;
 
 	module = mlwsw_hwmon_attr->type_index - mlxsw_hwmon->sensor_count;
diff --git a/drivers/net/ethernet/mellanox/mlxsw/core_thermal.c b/drivers/net/ethernet/mellanox/mlxsw/core_thermal.c
index 504a34d240f7..35a1dc89c28a 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/core_thermal.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/core_thermal.c
@@ -312,7 +312,7 @@ static int mlxsw_thermal_get_temp(struct thermal_zone_device *tzdev,
 	struct mlxsw_thermal *thermal = tzdev->devdata;
 	struct device *dev = thermal->bus_info->dev;
 	char mtmp_pl[MLXSW_REG_MTMP_LEN];
-	unsigned int temp;
+	int temp;
 	int err;
 
 	mlxsw_reg_mtmp_pack(mtmp_pl, 0, false, false);
@@ -327,7 +327,7 @@ static int mlxsw_thermal_get_temp(struct thermal_zone_device *tzdev,
 		mlxsw_thermal_tz_score_update(thermal, tzdev, thermal->trips,
 					      temp);
 
-	*p_temp = (int) temp;
+	*p_temp = temp;
 	return 0;
 }
 
@@ -503,7 +503,7 @@ static int mlxsw_thermal_module_temp_get(struct thermal_zone_device *tzdev,
 	struct mlxsw_thermal *thermal = tz->parent;
 	struct device *dev = thermal->bus_info->dev;
 	char mtmp_pl[MLXSW_REG_MTMP_LEN];
-	unsigned int temp;
+	int temp;
 	int err;
 
 	/* Read module temperature. */
@@ -519,14 +519,14 @@ static int mlxsw_thermal_module_temp_get(struct thermal_zone_device *tzdev,
 		return 0;
 	}
 	mlxsw_reg_mtmp_unpack(mtmp_pl, &temp, NULL, NULL);
-	*p_temp = (int) temp;
+	*p_temp = temp;
 
 	if (!temp)
 		return 0;
 
 	/* Update trip points. */
 	err = mlxsw_thermal_module_trips_update(dev, thermal->core, tz);
-	if (!err)
+	if (!err && temp > 0)
 		mlxsw_thermal_tz_score_update(thermal, tzdev, tz->trips, temp);
 
 	return 0;
@@ -612,8 +612,8 @@ static int mlxsw_thermal_gearbox_temp_get(struct thermal_zone_device *tzdev,
 	struct mlxsw_thermal_module *tz = tzdev->devdata;
 	struct mlxsw_thermal *thermal = tz->parent;
 	char mtmp_pl[MLXSW_REG_MTMP_LEN];
-	unsigned int temp;
 	u16 index;
+	int temp;
 	int err;
 
 	index = MLXSW_REG_MTMP_GBOX_INDEX_MIN + tz->module;
@@ -627,7 +627,7 @@ static int mlxsw_thermal_gearbox_temp_get(struct thermal_zone_device *tzdev,
 	if (temp > 0)
 		mlxsw_thermal_tz_score_update(thermal, tzdev, tz->trips, temp);
 
-	*p_temp = (int) temp;
+	*p_temp = temp;
 	return 0;
 }
 
diff --git a/drivers/net/ethernet/mellanox/mlxsw/reg.h b/drivers/net/ethernet/mellanox/mlxsw/reg.h
index 452f645fa040..e5f6bfd8a35a 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/reg.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/reg.h
@@ -8050,7 +8050,10 @@ MLXSW_REG_DEFINE(mtmp, MLXSW_REG_MTMP_ID, MLXSW_REG_MTMP_LEN);
 MLXSW_ITEM32(reg, mtmp, sensor_index, 0x00, 0, 12);
 
 /* Convert to milli degrees Celsius */
-#define MLXSW_REG_MTMP_TEMP_TO_MC(val) (val * 125)
+#define MLXSW_REG_MTMP_TEMP_TO_MC(val) ({ typeof(val) v_ = (val); \
+					  ((v_) >= 0) ? ((v_) * 125) : \
+					  ((s16)((GENMASK(15, 0) + (v_) + 1) \
+					   * 125)); })
 
 /* reg_mtmp_temperature
  * Temperature reading from the sensor. Reading is in 0.125 Celsius
@@ -8121,11 +8124,10 @@ static inline void mlxsw_reg_mtmp_pack(char *payload, u16 sensor_index,
 						    MLXSW_REG_MTMP_THRESH_HI);
 }
 
-static inline void mlxsw_reg_mtmp_unpack(char *payload, unsigned int *p_temp,
-					 unsigned int *p_max_temp,
-					 char *sensor_name)
+static inline void mlxsw_reg_mtmp_unpack(char *payload, int *p_temp,
+					 int *p_max_temp, char *sensor_name)
 {
-	u16 temp;
+	s16 temp;
 
 	if (p_temp) {
 		temp = mlxsw_reg_mtmp_temperature_get(payload);
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH net-next 3/3] mlxsw: core: Add support for negative temperature readout
  2019-06-23 12:56 ` [PATCH net-next 3/3] mlxsw: core: Add support for negative temperature readout Ido Schimmel
@ 2019-06-23 15:44   ` Andrew Lunn
  2019-06-23 16:00     ` Vadim Pasternak
  0 siblings, 1 reply; 9+ messages in thread
From: Andrew Lunn @ 2019-06-23 15:44 UTC (permalink / raw)
  To: Ido Schimmel; +Cc: netdev, davem, jiri, mlxsw, Vadim Pasternak, Ido Schimmel

> --- a/drivers/net/ethernet/mellanox/mlxsw/core_hwmon.c
> +++ b/drivers/net/ethernet/mellanox/mlxsw/core_hwmon.c
> @@ -52,8 +52,7 @@ static ssize_t mlxsw_hwmon_temp_show(struct device *dev,
>  			container_of(attr, struct mlxsw_hwmon_attr, dev_attr);
>  	struct mlxsw_hwmon *mlxsw_hwmon = mlwsw_hwmon_attr->hwmon;
>  	char mtmp_pl[MLXSW_REG_MTMP_LEN];
> -	unsigned int temp;
> -	int index;
> +	int temp, index;
>  	int err;
>  
>  	index = mlxsw_hwmon_get_attr_index(mlwsw_hwmon_attr->type_index,
> @@ -65,7 +64,7 @@ static ssize_t mlxsw_hwmon_temp_show(struct device *dev,
>  		return err;
>  	}
>  	mlxsw_reg_mtmp_unpack(mtmp_pl, &temp, NULL, NULL);
> -	return sprintf(buf, "%u\n", temp);
> +	return sprintf(buf, "%d\n", temp);
>  }

If you had used the hwmon core, rather than implementing it yourself,
you could of avoided this part of the bug.

>  static ssize_t mlxsw_hwmon_temp_rst_store(struct device *dev,
> @@ -215,8 +213,8 @@ static ssize_t mlxsw_hwmon_module_temp_show(struct device *dev,
>  			container_of(attr, struct mlxsw_hwmon_attr, dev_attr);
>  	struct mlxsw_hwmon *mlxsw_hwmon = mlwsw_hwmon_attr->hwmon;
>  	char mtmp_pl[MLXSW_REG_MTMP_LEN];
> -	unsigned int temp;
>  	u8 module;
> +	int temp;
>  	int err;
>  
>  	module = mlwsw_hwmon_attr->type_index - mlxsw_hwmon->sensor_count;

I think you missed changing the %u to %d in this function.

> @@ -519,14 +519,14 @@ static int mlxsw_thermal_module_temp_get(struct thermal_zone_device *tzdev,
>  		return 0;
>  	}
>  	mlxsw_reg_mtmp_unpack(mtmp_pl, &temp, NULL, NULL);
> -	*p_temp = (int) temp;
> +	*p_temp = temp;
>  
>  	if (!temp)
>  		return 0;
>  
>  	/* Update trip points. */
>  	err = mlxsw_thermal_module_trips_update(dev, thermal->core, tz);
> -	if (!err)
> +	if (!err && temp > 0)
>  		mlxsw_thermal_tz_score_update(thermal, tzdev, tz->trips, temp);

Why the > 0?

    Andrew

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [PATCH net-next 3/3] mlxsw: core: Add support for negative temperature readout
  2019-06-23 15:44   ` Andrew Lunn
@ 2019-06-23 16:00     ` Vadim Pasternak
  2019-06-23 16:11       ` Vadim Pasternak
  2019-06-23 16:25       ` Andrew Lunn
  0 siblings, 2 replies; 9+ messages in thread
From: Vadim Pasternak @ 2019-06-23 16:00 UTC (permalink / raw)
  To: Andrew Lunn, Ido Schimmel; +Cc: netdev, davem, Jiri Pirko, mlxsw, Ido Schimmel



> -----Original Message-----
> From: Andrew Lunn <andrew@lunn.ch>
> Sent: Sunday, June 23, 2019 6:44 PM
> To: Ido Schimmel <idosch@idosch.org>
> Cc: netdev@vger.kernel.org; davem@davemloft.net; Jiri Pirko
> <jiri@mellanox.com>; mlxsw <mlxsw@mellanox.com>; Vadim Pasternak
> <vadimp@mellanox.com>; Ido Schimmel <idosch@mellanox.com>
> Subject: Re: [PATCH net-next 3/3] mlxsw: core: Add support for negative
> temperature readout
> 
> > --- a/drivers/net/ethernet/mellanox/mlxsw/core_hwmon.c
> > +++ b/drivers/net/ethernet/mellanox/mlxsw/core_hwmon.c
> > @@ -52,8 +52,7 @@ static ssize_t mlxsw_hwmon_temp_show(struct device
> *dev,
> >  			container_of(attr, struct mlxsw_hwmon_attr,
> dev_attr);
> >  	struct mlxsw_hwmon *mlxsw_hwmon = mlwsw_hwmon_attr->hwmon;
> >  	char mtmp_pl[MLXSW_REG_MTMP_LEN];
> > -	unsigned int temp;
> > -	int index;
> > +	int temp, index;
> >  	int err;
> >
> >  	index = mlxsw_hwmon_get_attr_index(mlwsw_hwmon_attr-
> >type_index,
> > @@ -65,7 +64,7 @@ static ssize_t mlxsw_hwmon_temp_show(struct device
> *dev,
> >  		return err;
> >  	}
> >  	mlxsw_reg_mtmp_unpack(mtmp_pl, &temp, NULL, NULL);
> > -	return sprintf(buf, "%u\n", temp);
> > +	return sprintf(buf, "%d\n", temp);
> >  }
> 
> If you had used the hwmon core, rather than implementing it yourself, you could
> of avoided this part of the bug.
> 

Hi Andrew.

Yes.
But before we handle only positive temperature.
And currently support for the negative readouts has been added.

> >  static ssize_t mlxsw_hwmon_temp_rst_store(struct device *dev, @@
> > -215,8 +213,8 @@ static ssize_t mlxsw_hwmon_module_temp_show(struct
> device *dev,
> >  			container_of(attr, struct mlxsw_hwmon_attr,
> dev_attr);
> >  	struct mlxsw_hwmon *mlxsw_hwmon = mlwsw_hwmon_attr->hwmon;
> >  	char mtmp_pl[MLXSW_REG_MTMP_LEN];
> > -	unsigned int temp;
> >  	u8 module;
> > +	int temp;
> >  	int err;
> >
> >  	module = mlwsw_hwmon_attr->type_index - mlxsw_hwmon-
> >sensor_count;
> 
> I think you missed changing the %u to %d in this function.

If I am not wrong, I think you refer to mlxsw_hwmon_fan_rpm_show(),
where it should be %u.

> 
> > @@ -519,14 +519,14 @@ static int mlxsw_thermal_module_temp_get(struct
> thermal_zone_device *tzdev,
> >  		return 0;
> >  	}
> >  	mlxsw_reg_mtmp_unpack(mtmp_pl, &temp, NULL, NULL);
> > -	*p_temp = (int) temp;
> > +	*p_temp = temp;
> >
> >  	if (!temp)
> >  		return 0;
> >
> >  	/* Update trip points. */
> >  	err = mlxsw_thermal_module_trips_update(dev, thermal->core, tz);
> > -	if (!err)
> > +	if (!err && temp > 0)
> >  		mlxsw_thermal_tz_score_update(thermal, tzdev, tz->trips,
> temp);
> 
> Why the > 0?

We don't consider negative temperature for thermal control.

> 
>     Andrew

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [PATCH net-next 3/3] mlxsw: core: Add support for negative temperature readout
  2019-06-23 16:00     ` Vadim Pasternak
@ 2019-06-23 16:11       ` Vadim Pasternak
  2019-06-23 16:25       ` Andrew Lunn
  1 sibling, 0 replies; 9+ messages in thread
From: Vadim Pasternak @ 2019-06-23 16:11 UTC (permalink / raw)
  To: Andrew Lunn, Ido Schimmel; +Cc: netdev, davem, Jiri Pirko, mlxsw, Ido Schimmel



> -----Original Message-----
> From: Vadim Pasternak
> Sent: Sunday, June 23, 2019 7:01 PM
> To: Andrew Lunn <andrew@lunn.ch>; Ido Schimmel <idosch@idosch.org>
> Cc: netdev@vger.kernel.org; davem@davemloft.net; Jiri Pirko
> <jiri@mellanox.com>; mlxsw <mlxsw@mellanox.com>; Ido Schimmel
> <idosch@mellanox.com>
> Subject: RE: [PATCH net-next 3/3] mlxsw: core: Add support for negative
> temperature readout
> 
> 
> 
> > -----Original Message-----
> > From: Andrew Lunn <andrew@lunn.ch>
> > Sent: Sunday, June 23, 2019 6:44 PM
> > To: Ido Schimmel <idosch@idosch.org>
> > Cc: netdev@vger.kernel.org; davem@davemloft.net; Jiri Pirko
> > <jiri@mellanox.com>; mlxsw <mlxsw@mellanox.com>; Vadim Pasternak
> > <vadimp@mellanox.com>; Ido Schimmel <idosch@mellanox.com>
> > Subject: Re: [PATCH net-next 3/3] mlxsw: core: Add support for
> > negative temperature readout
> >
> > > --- a/drivers/net/ethernet/mellanox/mlxsw/core_hwmon.c
> > > +++ b/drivers/net/ethernet/mellanox/mlxsw/core_hwmon.c
> > > @@ -52,8 +52,7 @@ static ssize_t mlxsw_hwmon_temp_show(struct device
> > *dev,
> > >  			container_of(attr, struct mlxsw_hwmon_attr,
> > dev_attr);
> > >  	struct mlxsw_hwmon *mlxsw_hwmon = mlwsw_hwmon_attr->hwmon;
> > >  	char mtmp_pl[MLXSW_REG_MTMP_LEN];
> > > -	unsigned int temp;
> > > -	int index;
> > > +	int temp, index;
> > >  	int err;
> > >
> > >  	index = mlxsw_hwmon_get_attr_index(mlwsw_hwmon_attr-
> > >type_index,
> > > @@ -65,7 +64,7 @@ static ssize_t mlxsw_hwmon_temp_show(struct device
> > *dev,
> > >  		return err;
> > >  	}
> > >  	mlxsw_reg_mtmp_unpack(mtmp_pl, &temp, NULL, NULL);
> > > -	return sprintf(buf, "%u\n", temp);
> > > +	return sprintf(buf, "%d\n", temp);
> > >  }
> >
> > If you had used the hwmon core, rather than implementing it yourself,
> > you could of avoided this part of the bug.
> >
> 
> Hi Andrew.
> 
> Yes.
> But before we handle only positive temperature.
> And currently support for the negative readouts has been added.
> 
> > >  static ssize_t mlxsw_hwmon_temp_rst_store(struct device *dev, @@
> > > -215,8 +213,8 @@ static ssize_t mlxsw_hwmon_module_temp_show(struct
> > device *dev,
> > >  			container_of(attr, struct mlxsw_hwmon_attr,
> > dev_attr);
> > >  	struct mlxsw_hwmon *mlxsw_hwmon = mlwsw_hwmon_attr->hwmon;
> > >  	char mtmp_pl[MLXSW_REG_MTMP_LEN];
> > > -	unsigned int temp;
> > >  	u8 module;
> > > +	int temp;
> > >  	int err;
> > >
> > >  	module = mlwsw_hwmon_attr->type_index - mlxsw_hwmon-
> sensor_count;
> >
> > I think you missed changing the %u to %d in this function.
> 
> If I am not wrong, I think you refer to mlxsw_hwmon_fan_rpm_show(), where it
> should be %u.
> 
 O, I see what you mentioned.
This is mlxsw_hwmon_module_temp_show().
Yes, right it should be %d.
Thank you.

> >
> > > @@ -519,14 +519,14 @@ static int
> > > mlxsw_thermal_module_temp_get(struct
> > thermal_zone_device *tzdev,
> > >  		return 0;
> > >  	}
> > >  	mlxsw_reg_mtmp_unpack(mtmp_pl, &temp, NULL, NULL);
> > > -	*p_temp = (int) temp;
> > > +	*p_temp = temp;
> > >
> > >  	if (!temp)
> > >  		return 0;
> > >
> > >  	/* Update trip points. */
> > >  	err = mlxsw_thermal_module_trips_update(dev, thermal->core, tz);
> > > -	if (!err)
> > > +	if (!err && temp > 0)
> > >  		mlxsw_thermal_tz_score_update(thermal, tzdev, tz->trips,
> > temp);
> >
> > Why the > 0?
> 
> We don't consider negative temperature for thermal control.
> 
> >
> >     Andrew

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH net-next 3/3] mlxsw: core: Add support for negative temperature readout
  2019-06-23 16:00     ` Vadim Pasternak
  2019-06-23 16:11       ` Vadim Pasternak
@ 2019-06-23 16:25       ` Andrew Lunn
  2019-06-23 16:34         ` Vadim Pasternak
  1 sibling, 1 reply; 9+ messages in thread
From: Andrew Lunn @ 2019-06-23 16:25 UTC (permalink / raw)
  To: Vadim Pasternak
  Cc: Ido Schimmel, netdev, davem, Jiri Pirko, mlxsw, Ido Schimmel

> > Why the > 0?
> 
> We don't consider negative temperature for thermal control.

Is this because the thermal control is also broken and does not
support negative values? This is just a workaround papering over the
cracks?

I've worked on some systems where the thermal subsystem has controller
a heater. Mostly industrial systems, extended temperature range, and
you have to make sure the hardware is kept above -25C, otherwise the
DRAM timing goes to pot and the system crashed and froze.

	Andrew

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [PATCH net-next 3/3] mlxsw: core: Add support for negative temperature readout
  2019-06-23 16:25       ` Andrew Lunn
@ 2019-06-23 16:34         ` Vadim Pasternak
  0 siblings, 0 replies; 9+ messages in thread
From: Vadim Pasternak @ 2019-06-23 16:34 UTC (permalink / raw)
  To: Andrew Lunn; +Cc: Ido Schimmel, netdev, davem, Jiri Pirko, mlxsw, Ido Schimmel



> -----Original Message-----
> From: Andrew Lunn <andrew@lunn.ch>
> Sent: Sunday, June 23, 2019 7:26 PM
> To: Vadim Pasternak <vadimp@mellanox.com>
> Cc: Ido Schimmel <idosch@idosch.org>; netdev@vger.kernel.org;
> davem@davemloft.net; Jiri Pirko <jiri@mellanox.com>; mlxsw
> <mlxsw@mellanox.com>; Ido Schimmel <idosch@mellanox.com>
> Subject: Re: [PATCH net-next 3/3] mlxsw: core: Add support for negative
> temperature readout
> 
> > > Why the > 0?
> >
> > We don't consider negative temperature for thermal control.
> 
> Is this because the thermal control is also broken and does not support negative
> values? This is just a workaround papering over the cracks?

We just have system hardware requirements for minimal speed for system
PWM. It could not be less than 20%.
So for temperature ~40C or below it PWM will set to this speed.

> 
> I've worked on some systems where the thermal subsystem has controller a
> heater. Mostly industrial systems, extended temperature range, and you have to
> make sure the hardware is kept above -25C, otherwise the DRAM timing goes to
> pot and the system crashed and froze.

Interesting input. I didn't know about such feature.
We don't have heaters within our systems.
Maybe we should think about it for the next generatin systems.

> 
> 	Andrew

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2019-06-23 16:34 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-06-23 12:56 [PATCH net-next 0/3] mlxsw: Thermal and hwmon extensions Ido Schimmel
2019-06-23 12:56 ` [PATCH net-next 1/3] mlxsw: core: Extend thermal core with per inter-connect device thermal zones Ido Schimmel
2019-06-23 12:56 ` [PATCH net-next 2/3] mlxsw: core: Add the hottest thermal zone detection Ido Schimmel
2019-06-23 12:56 ` [PATCH net-next 3/3] mlxsw: core: Add support for negative temperature readout Ido Schimmel
2019-06-23 15:44   ` Andrew Lunn
2019-06-23 16:00     ` Vadim Pasternak
2019-06-23 16:11       ` Vadim Pasternak
2019-06-23 16:25       ` Andrew Lunn
2019-06-23 16:34         ` Vadim Pasternak

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.