All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/6] Add hw overheat IRQ support to Marvell thermal driver
@ 2018-11-23 16:17 ` Miquel Raynal
  0 siblings, 0 replies; 19+ messages in thread
From: Miquel Raynal @ 2018-11-23 16:17 UTC (permalink / raw)
  To: Gregory Clement, Jason Cooper, Andrew Lunn,
	Sebastian Hesselbarth, Zhang Rui, Eduardo Valentin,
	Daniel Lezcano
  Cc: Mark Rutland, devicetree, linux-pm, Antoine Tenart,
	Catalin Marinas, Will Deacon, Russell King, Maxime Chevallier,
	Nadav Haklai, Marc Zyngier, David Sniatkiwicz, Rob Herring,
	Thomas Petazzoni, Miquel Raynal, linux-arm-kernel

Hello,

This is the last batch of patches about the thermal driver that was
suspended, waiting for the ICU/SEI series to be merged. Now that
everything is ready in mainline, let's add hardware overheat interrupt
support to this driver.

Bindings and DT are updated accordingly. The interrupt will only be
triggered if the platform goes above 102°C (threshold set to 100°C,
hysteresis to > 2°C). The interrupt property is of course not
mandatory.

In the mean time, I add myself to the MAINTAINERS file to receive and
review possible fixes/new features.

Comments are welcome.

Thanks,
Miquèl


Changes since v1:
=================
* Use a threaded IRQ handler to avoid a potential lock depency when
  notifying the core of an overheat situation.


Miquel Raynal (6):
  thermal: armada: add overheat interrupt support
  MAINTAINERS: thermal: add entry for Marvell MVEBU thermal driver
  dt-bindings: ap806: document the thermal interrupt capabilities
  dt-bindings: cp110: document the thermal interrupt capabilities
  arm64: dts: marvell: add interrupt support to ap806 thermal node
  arm64: dts: marvell: add interrupt support to cp110 thermal node

 .../arm/marvell/ap806-system-controller.txt   |   8 +
 .../arm/marvell/cp110-system-controller.txt   |   9 +
 MAINTAINERS                                   |   5 +
 arch/arm64/boot/dts/marvell/armada-ap806.dtsi |  18 +-
 arch/arm64/boot/dts/marvell/armada-cp110.dtsi |  15 +-
 drivers/thermal/armada_thermal.c              | 270 +++++++++++++++++-
 6 files changed, 318 insertions(+), 7 deletions(-)

-- 
2.19.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v2 0/6] Add hw overheat IRQ support to Marvell thermal driver
@ 2018-11-23 16:17 ` Miquel Raynal
  0 siblings, 0 replies; 19+ messages in thread
From: Miquel Raynal @ 2018-11-23 16:17 UTC (permalink / raw)
  To: linux-arm-kernel

Hello,

This is the last batch of patches about the thermal driver that was
suspended, waiting for the ICU/SEI series to be merged. Now that
everything is ready in mainline, let's add hardware overheat interrupt
support to this driver.

Bindings and DT are updated accordingly. The interrupt will only be
triggered if the platform goes above 102?C (threshold set to 100?C,
hysteresis to > 2?C). The interrupt property is of course not
mandatory.

In the mean time, I add myself to the MAINTAINERS file to receive and
review possible fixes/new features.

Comments are welcome.

Thanks,
Miqu?l


Changes since v1:
=================
* Use a threaded IRQ handler to avoid a potential lock depency when
  notifying the core of an overheat situation.


Miquel Raynal (6):
  thermal: armada: add overheat interrupt support
  MAINTAINERS: thermal: add entry for Marvell MVEBU thermal driver
  dt-bindings: ap806: document the thermal interrupt capabilities
  dt-bindings: cp110: document the thermal interrupt capabilities
  arm64: dts: marvell: add interrupt support to ap806 thermal node
  arm64: dts: marvell: add interrupt support to cp110 thermal node

 .../arm/marvell/ap806-system-controller.txt   |   8 +
 .../arm/marvell/cp110-system-controller.txt   |   9 +
 MAINTAINERS                                   |   5 +
 arch/arm64/boot/dts/marvell/armada-ap806.dtsi |  18 +-
 arch/arm64/boot/dts/marvell/armada-cp110.dtsi |  15 +-
 drivers/thermal/armada_thermal.c              | 270 +++++++++++++++++-
 6 files changed, 318 insertions(+), 7 deletions(-)

-- 
2.19.1

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v2 1/6] thermal: armada: add overheat interrupt support
  2018-11-23 16:17 ` Miquel Raynal
@ 2018-11-23 16:17   ` Miquel Raynal
  -1 siblings, 0 replies; 19+ messages in thread
From: Miquel Raynal @ 2018-11-23 16:17 UTC (permalink / raw)
  To: Gregory Clement, Jason Cooper, Andrew Lunn,
	Sebastian Hesselbarth, Zhang Rui, Eduardo Valentin,
	Daniel Lezcano
  Cc: Mark Rutland, devicetree, linux-pm, Antoine Tenart,
	Catalin Marinas, Will Deacon, Russell King, Maxime Chevallier,
	Nadav Haklai, Marc Zyngier, David Sniatkiwicz, Rob Herring,
	Thomas Petazzoni, Miquel Raynal, linux-arm-kernel

The IP can manage to trigger interrupts on overheat situation from all
the sensors.

However, the interrupt source changes along with the last selected
source (ie. the last read sensor), which is an inconsistent behavior.
Avoid possible glitches by always selecting back only one channel which
will then be referenced as the "overheat_sensor" (arbitrarily: the first
in the DT which has a critical trip point filled in).

It is possible that the scan of all thermal zone nodes did not bring a
critical trip point from which the overheat interrupt could be
configured. In this case just complain but do not fail the probe.

Also disable sensor switch during overheat situations because changing
the channel while the system is too hot could clear the overheat state
by changing the source while the temperature is still very high.

Even if the overheat state is not declared, overheat interrupt must be
cleared by reading the DFX interrupt cause _after_ the temperature has
fallen down to the low threshold, otherwise future possible interrupts
would not be served. A work polls the corresponding register until the
overheat flag gets cleared in this case.

Suggested-by: David Sniatkiwicz <davidsn@marvell.com>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
---
 drivers/thermal/armada_thermal.c | 270 ++++++++++++++++++++++++++++++-
 1 file changed, 269 insertions(+), 1 deletion(-)

diff --git a/drivers/thermal/armada_thermal.c b/drivers/thermal/armada_thermal.c
index 92f67d40f2e9..55cfaee0da77 100644
--- a/drivers/thermal/armada_thermal.c
+++ b/drivers/thermal/armada_thermal.c
@@ -26,6 +26,11 @@
 #include <linux/iopoll.h>
 #include <linux/mfd/syscon.h>
 #include <linux/regmap.h>
+#include <linux/interrupt.h>
+
+#include "thermal_core.h"
+
+#define TO_MCELSIUS(c)			((c) * 1000)
 
 /* Thermal Manager Control and Status Register */
 #define PMU_TDC0_SW_RST_MASK		(0x1 << 1)
@@ -61,9 +66,13 @@
 #define CONTROL1_TSEN_AVG_MASK		0x7
 #define CONTROL1_EXT_TSEN_SW_RESET	BIT(7)
 #define CONTROL1_EXT_TSEN_HW_RESETn	BIT(8)
+#define CONTROL1_TSEN_INT_EN		BIT(25)
+#define CONTROL1_TSEN_SELECT_OFF	21
+#define CONTROL1_TSEN_SELECT_MASK	0x3
 
 #define STATUS_POLL_PERIOD_US		1000
 #define STATUS_POLL_TIMEOUT_US		100000
+#define OVERHEAT_INT_POLL_DELAY_MS	1000
 
 struct armada_thermal_data;
 
@@ -75,7 +84,11 @@ struct armada_thermal_priv {
 	/* serialize temperature reads/updates */
 	struct mutex update_lock;
 	struct armada_thermal_data *data;
+	struct thermal_zone_device *overheat_sensor;
+	int interrupt_source;
 	int current_channel;
+	long current_threshold;
+	long current_hysteresis;
 };
 
 struct armada_thermal_data {
@@ -93,12 +106,20 @@ struct armada_thermal_data {
 	/* Register shift and mask to access the sensor temperature */
 	unsigned int temp_shift;
 	unsigned int temp_mask;
+	unsigned int thresh_shift;
+	unsigned int hyst_shift;
+	unsigned int hyst_mask;
 	u32 is_valid_bit;
 
 	/* Syscon access */
 	unsigned int syscon_control0_off;
 	unsigned int syscon_control1_off;
 	unsigned int syscon_status_off;
+	unsigned int dfx_irq_cause_off;
+	unsigned int dfx_irq_mask_off;
+	unsigned int dfx_overheat_irq;
+	unsigned int dfx_server_irq_mask_off;
+	unsigned int dfx_server_irq_en;
 
 	/* One sensor is in the thermal IC, the others are in the CPUs if any */
 	unsigned int cpu_nr;
@@ -272,6 +293,41 @@ static bool armada_is_valid(struct armada_thermal_priv *priv)
 	return reg & priv->data->is_valid_bit;
 }
 
+static void armada_enable_overheat_interrupt(struct armada_thermal_priv *priv)
+{
+	struct armada_thermal_data *data = priv->data;
+	u32 reg;
+
+	/* Clear DFX temperature IRQ cause */
+	regmap_read(priv->syscon, data->dfx_irq_cause_off, &reg);
+
+	/* Enable DFX Temperature IRQ */
+	regmap_read(priv->syscon, data->dfx_irq_mask_off, &reg);
+	reg |= data->dfx_overheat_irq;
+	regmap_write(priv->syscon, data->dfx_irq_mask_off, reg);
+
+	/* Enable DFX server IRQ */
+	regmap_read(priv->syscon, data->dfx_server_irq_mask_off, &reg);
+	reg |= data->dfx_server_irq_en;
+	regmap_write(priv->syscon, data->dfx_server_irq_mask_off, reg);
+
+	/* Enable overheat interrupt */
+	regmap_read(priv->syscon, data->syscon_control1_off, &reg);
+	reg |= CONTROL1_TSEN_INT_EN;
+	regmap_write(priv->syscon, data->syscon_control1_off, reg);
+}
+
+static void __maybe_unused
+armada_disable_overheat_interrupt(struct armada_thermal_priv *priv)
+{
+	struct armada_thermal_data *data = priv->data;
+	u32 reg;
+
+	regmap_read(priv->syscon, data->syscon_control1_off, &reg);
+	reg &= ~CONTROL1_TSEN_INT_EN;
+	regmap_write(priv->syscon, data->syscon_control1_off, reg);
+}
+
 /* There is currently no board with more than one sensor per channel */
 static int armada_select_channel(struct armada_thermal_priv *priv, int channel)
 {
@@ -388,6 +444,16 @@ static int armada_get_temp(void *_sensor, int *temp)
 
 	/* Do the actual reading */
 	ret = armada_read_sensor(priv, temp);
+	if (ret)
+		goto unlock_mutex;
+
+	/*
+	 * Select back the interrupt source channel from which a potential
+	 * critical trip point has been set.
+	 */
+	ret = armada_select_channel(priv, priv->interrupt_source);
+	if (ret)
+		goto unlock_mutex;
 
 unlock_mutex:
 	mutex_unlock(&priv->update_lock);
@@ -399,6 +465,121 @@ static struct thermal_zone_of_device_ops of_ops = {
 	.get_temp = armada_get_temp,
 };
 
+static unsigned int armada_mc_to_reg_temp(struct armada_thermal_data *data,
+					  unsigned int temp_mc)
+{
+	s64 b = data->coef_b;
+	s64 m = data->coef_m;
+	s64 div = data->coef_div;
+	unsigned int sample;
+
+	if (data->inverted)
+		sample = div_s64(((temp_mc * div) + b), m);
+	else
+		sample = div_s64((b - (temp_mc * div)), m);
+
+	return sample & data->temp_mask;
+}
+
+static unsigned int armada_mc_to_reg_hyst(struct armada_thermal_data *data,
+					  unsigned int hyst_mc)
+{
+	/*
+	 * The documentation states:
+	 * high/low watermark = threshold +/- 0.4761 * 2^(hysteresis + 2)
+	 * which is the mathematical derivation for:
+	 * 0x0 <=> 1.9°C, 0x1 <=> 3.8°C, 0x2 <=> 7.6°C, 0x3 <=> 15.2
+	 */
+	unsigned int hyst_levels_mc[] = {1900, 3800, 7600, 15200};
+	int i;
+
+	/*
+	 * We will always take the smallest possible hysteresis to avoid risking
+	 * the hardware integrity by enlarging the threshold by +8°C in the
+	 * worst case.
+	 */
+	for (i = ARRAY_SIZE(hyst_levels_mc) - 1; i > 0; i--)
+		if (hyst_mc >= hyst_levels_mc[i])
+			break;
+
+	return i & data->hyst_mask;
+}
+
+static void armada_set_overheat_thresholds(struct armada_thermal_priv *priv,
+					   int thresh_mc, int hyst_mc)
+{
+	struct armada_thermal_data *data = priv->data;
+	unsigned int threshold = armada_mc_to_reg_temp(data, thresh_mc);
+	unsigned int hysteresis = armada_mc_to_reg_hyst(data, hyst_mc);
+	u32 ctrl1;
+
+	regmap_read(priv->syscon, data->syscon_control1_off, &ctrl1);
+
+	/* Set Threshold */
+	if (thresh_mc >= 0) {
+		ctrl1 &= ~(data->temp_mask << data->thresh_shift);
+		ctrl1 |= threshold << data->thresh_shift;
+		priv->current_threshold = thresh_mc;
+	}
+
+	/* Set Hysteresis */
+	if (hyst_mc >= 0) {
+		ctrl1 &= ~(data->hyst_mask << data->hyst_shift);
+		ctrl1 |= hysteresis << data->hyst_shift;
+		priv->current_hysteresis = hyst_mc;
+	}
+
+	regmap_write(priv->syscon, data->syscon_control1_off, ctrl1);
+}
+
+static irqreturn_t armada_overheat_isr(int irq, void *blob)
+{
+	/*
+	 * Disable the IRQ and continue in thread context (thermal core
+	 * notification and temperature monitoring).
+	 */
+	disable_irq_nosync(irq);
+
+	return IRQ_WAKE_THREAD;
+}
+
+static irqreturn_t armada_overheat_isr_thread(int irq, void *blob)
+{
+	struct armada_thermal_priv *priv = (struct armada_thermal_priv *)blob;
+	int low_threshold = priv->current_threshold - priv->current_hysteresis;
+	int temperature;
+	u32 dummy;
+	int ret;
+
+	/* Notify the core in thread context */
+	thermal_zone_device_update(priv->overheat_sensor,
+				   THERMAL_EVENT_UNSPECIFIED);
+
+	/*
+	 * The overheat interrupt must be cleared by reading the DFX interrupt
+	 * cause _after_ the temperature has fallen down to the low threshold.
+	 * Otherwise future interrupts might not be served.
+	 */
+	do {
+		msleep(OVERHEAT_INT_POLL_DELAY_MS);
+		mutex_lock(&priv->update_lock);
+		ret = armada_read_sensor(priv, &temperature);
+		mutex_unlock(&priv->update_lock);
+		if (ret)
+			return ret;
+	} while (temperature >= low_threshold);
+
+	regmap_read(priv->syscon, priv->data->dfx_irq_cause_off, &dummy);
+
+	/* Notify the thermal core that the temperature is acceptable again */
+	thermal_zone_device_update(priv->overheat_sensor,
+				   THERMAL_EVENT_UNSPECIFIED);
+
+	enable_irq(irq);
+
+	return IRQ_HANDLED;
+}
+
 static const struct armada_thermal_data armadaxp_data = {
 	.init = armadaxp_init,
 	.temp_shift = 10,
@@ -454,6 +635,9 @@ static const struct armada_thermal_data armada_ap806_data = {
 	.is_valid_bit = BIT(16),
 	.temp_shift = 0,
 	.temp_mask = 0x3ff,
+	.thresh_shift = 3,
+	.hyst_shift = 19,
+	.hyst_mask = 0x3,
 	.coef_b = -150000LL,
 	.coef_m = 423ULL,
 	.coef_div = 1,
@@ -462,6 +646,11 @@ static const struct armada_thermal_data armada_ap806_data = {
 	.syscon_control0_off = 0x84,
 	.syscon_control1_off = 0x88,
 	.syscon_status_off = 0x8C,
+	.dfx_irq_cause_off = 0x108,
+	.dfx_irq_mask_off = 0x10C,
+	.dfx_overheat_irq = BIT(22),
+	.dfx_server_irq_mask_off = 0x104,
+	.dfx_server_irq_en = BIT(1),
 	.cpu_nr = 4,
 };
 
@@ -470,6 +659,9 @@ static const struct armada_thermal_data armada_cp110_data = {
 	.is_valid_bit = BIT(10),
 	.temp_shift = 0,
 	.temp_mask = 0x3ff,
+	.thresh_shift = 16,
+	.hyst_shift = 26,
+	.hyst_mask = 0x3,
 	.coef_b = 1172499100ULL,
 	.coef_m = 2000096ULL,
 	.coef_div = 4201,
@@ -477,6 +669,11 @@ static const struct armada_thermal_data armada_cp110_data = {
 	.syscon_control0_off = 0x70,
 	.syscon_control1_off = 0x74,
 	.syscon_status_off = 0x78,
+	.dfx_irq_cause_off = 0x108,
+	.dfx_irq_mask_off = 0x10C,
+	.dfx_overheat_irq = BIT(20),
+	.dfx_server_irq_mask_off = 0x104,
+	.dfx_server_irq_en = BIT(1),
 };
 
 static const struct of_device_id armada_thermal_id_table[] = {
@@ -592,6 +789,48 @@ static void armada_set_sane_name(struct platform_device *pdev,
 	} while (insane_char);
 }
 
+/*
+ * The IP can manage to trigger interrupts on overheat situation from all the
+ * sensors. However, the interrupt source changes along with the last selected
+ * source (ie. the last read sensor), which is an inconsistent behavior. Avoid
+ * possible glitches by always selecting back only one channel (arbitrarily: the
+ * first in the DT which has a critical trip point). We also disable sensor
+ * switch during overheat situations.
+ */
+static int armada_configure_overheat_int(struct armada_thermal_priv *priv,
+					 struct thermal_zone_device *tz,
+					 int sensor_id)
+{
+	/* Retrieve the critical trip point to enable the overheat interrupt */
+	const struct thermal_trip *trips = of_thermal_get_trip_points(tz);
+	int ret;
+	int i;
+
+	if (!trips)
+		return -EINVAL;
+
+	for (i = 0; i < of_thermal_get_ntrips(tz); i++)
+		if (trips[i].type == THERMAL_TRIP_CRITICAL)
+			break;
+
+	if (i == of_thermal_get_ntrips(tz))
+		return -EINVAL;
+
+	ret = armada_select_channel(priv, sensor_id);
+	if (ret)
+		return ret;
+
+	armada_set_overheat_thresholds(priv,
+				       trips[i].temperature,
+				       trips[i].hysteresis);
+	priv->overheat_sensor = tz;
+	priv->interrupt_source = sensor_id;
+
+	armada_enable_overheat_interrupt(priv);
+
+	return 0;
+}
+
 static int armada_thermal_probe(struct platform_device *pdev)
 {
 	struct thermal_zone_device *tz;
@@ -599,7 +838,7 @@ static int armada_thermal_probe(struct platform_device *pdev)
 	struct armada_drvdata *drvdata;
 	const struct of_device_id *match;
 	struct armada_thermal_priv *priv;
-	int sensor_id;
+	int sensor_id, irq;
 	int ret;
 
 	match = of_match_device(armada_thermal_id_table, &pdev->dev);
@@ -669,6 +908,23 @@ static int armada_thermal_probe(struct platform_device *pdev)
 	drvdata->data.priv = priv;
 	platform_set_drvdata(pdev, drvdata);
 
+	irq = platform_get_irq(pdev, 0);
+	if (irq == -EPROBE_DEFER)
+		return irq;
+
+	/* The overheat interrupt feature is not mandatory */
+	if (irq >= 0) {
+		ret = devm_request_threaded_irq(&pdev->dev, irq,
+						armada_overheat_isr,
+						armada_overheat_isr_thread,
+						0, NULL, priv);
+		if (ret) {
+			dev_err(&pdev->dev, "Cannot request threaded IRQ %d\n",
+				irq);
+			return ret;
+		}
+	}
+
 	/*
 	 * There is one channel for the IC and one per CPU (if any), each
 	 * channel has one sensor.
@@ -692,8 +948,20 @@ static int armada_thermal_probe(struct platform_device *pdev)
 			devm_kfree(&pdev->dev, sensor);
 			continue;
 		}
+
+		/*
+		 * The first channel that has a critical trip point registered
+		 * in the DT will serve as interrupt source. Others possible
+		 * critical trip points will simply be ignored by the driver.
+		 */
+		if (irq >= 0 && !priv->overheat_sensor)
+			armada_configure_overheat_int(priv, tz, sensor->id);
 	}
 
+	/* Just complain if no overheat interrupt was set up */
+	if (!priv->overheat_sensor)
+		dev_warn(&pdev->dev, "Overheat interrupt not available\n");
+
 	return 0;
 }
 
-- 
2.19.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 1/6] thermal: armada: add overheat interrupt support
@ 2018-11-23 16:17   ` Miquel Raynal
  0 siblings, 0 replies; 19+ messages in thread
From: Miquel Raynal @ 2018-11-23 16:17 UTC (permalink / raw)
  To: linux-arm-kernel

The IP can manage to trigger interrupts on overheat situation from all
the sensors.

However, the interrupt source changes along with the last selected
source (ie. the last read sensor), which is an inconsistent behavior.
Avoid possible glitches by always selecting back only one channel which
will then be referenced as the "overheat_sensor" (arbitrarily: the first
in the DT which has a critical trip point filled in).

It is possible that the scan of all thermal zone nodes did not bring a
critical trip point from which the overheat interrupt could be
configured. In this case just complain but do not fail the probe.

Also disable sensor switch during overheat situations because changing
the channel while the system is too hot could clear the overheat state
by changing the source while the temperature is still very high.

Even if the overheat state is not declared, overheat interrupt must be
cleared by reading the DFX interrupt cause _after_ the temperature has
fallen down to the low threshold, otherwise future possible interrupts
would not be served. A work polls the corresponding register until the
overheat flag gets cleared in this case.

Suggested-by: David Sniatkiwicz <davidsn@marvell.com>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
---
 drivers/thermal/armada_thermal.c | 270 ++++++++++++++++++++++++++++++-
 1 file changed, 269 insertions(+), 1 deletion(-)

diff --git a/drivers/thermal/armada_thermal.c b/drivers/thermal/armada_thermal.c
index 92f67d40f2e9..55cfaee0da77 100644
--- a/drivers/thermal/armada_thermal.c
+++ b/drivers/thermal/armada_thermal.c
@@ -26,6 +26,11 @@
 #include <linux/iopoll.h>
 #include <linux/mfd/syscon.h>
 #include <linux/regmap.h>
+#include <linux/interrupt.h>
+
+#include "thermal_core.h"
+
+#define TO_MCELSIUS(c)			((c) * 1000)
 
 /* Thermal Manager Control and Status Register */
 #define PMU_TDC0_SW_RST_MASK		(0x1 << 1)
@@ -61,9 +66,13 @@
 #define CONTROL1_TSEN_AVG_MASK		0x7
 #define CONTROL1_EXT_TSEN_SW_RESET	BIT(7)
 #define CONTROL1_EXT_TSEN_HW_RESETn	BIT(8)
+#define CONTROL1_TSEN_INT_EN		BIT(25)
+#define CONTROL1_TSEN_SELECT_OFF	21
+#define CONTROL1_TSEN_SELECT_MASK	0x3
 
 #define STATUS_POLL_PERIOD_US		1000
 #define STATUS_POLL_TIMEOUT_US		100000
+#define OVERHEAT_INT_POLL_DELAY_MS	1000
 
 struct armada_thermal_data;
 
@@ -75,7 +84,11 @@ struct armada_thermal_priv {
 	/* serialize temperature reads/updates */
 	struct mutex update_lock;
 	struct armada_thermal_data *data;
+	struct thermal_zone_device *overheat_sensor;
+	int interrupt_source;
 	int current_channel;
+	long current_threshold;
+	long current_hysteresis;
 };
 
 struct armada_thermal_data {
@@ -93,12 +106,20 @@ struct armada_thermal_data {
 	/* Register shift and mask to access the sensor temperature */
 	unsigned int temp_shift;
 	unsigned int temp_mask;
+	unsigned int thresh_shift;
+	unsigned int hyst_shift;
+	unsigned int hyst_mask;
 	u32 is_valid_bit;
 
 	/* Syscon access */
 	unsigned int syscon_control0_off;
 	unsigned int syscon_control1_off;
 	unsigned int syscon_status_off;
+	unsigned int dfx_irq_cause_off;
+	unsigned int dfx_irq_mask_off;
+	unsigned int dfx_overheat_irq;
+	unsigned int dfx_server_irq_mask_off;
+	unsigned int dfx_server_irq_en;
 
 	/* One sensor is in the thermal IC, the others are in the CPUs if any */
 	unsigned int cpu_nr;
@@ -272,6 +293,41 @@ static bool armada_is_valid(struct armada_thermal_priv *priv)
 	return reg & priv->data->is_valid_bit;
 }
 
+static void armada_enable_overheat_interrupt(struct armada_thermal_priv *priv)
+{
+	struct armada_thermal_data *data = priv->data;
+	u32 reg;
+
+	/* Clear DFX temperature IRQ cause */
+	regmap_read(priv->syscon, data->dfx_irq_cause_off, &reg);
+
+	/* Enable DFX Temperature IRQ */
+	regmap_read(priv->syscon, data->dfx_irq_mask_off, &reg);
+	reg |= data->dfx_overheat_irq;
+	regmap_write(priv->syscon, data->dfx_irq_mask_off, reg);
+
+	/* Enable DFX server IRQ */
+	regmap_read(priv->syscon, data->dfx_server_irq_mask_off, &reg);
+	reg |= data->dfx_server_irq_en;
+	regmap_write(priv->syscon, data->dfx_server_irq_mask_off, reg);
+
+	/* Enable overheat interrupt */
+	regmap_read(priv->syscon, data->syscon_control1_off, &reg);
+	reg |= CONTROL1_TSEN_INT_EN;
+	regmap_write(priv->syscon, data->syscon_control1_off, reg);
+}
+
+static void __maybe_unused
+armada_disable_overheat_interrupt(struct armada_thermal_priv *priv)
+{
+	struct armada_thermal_data *data = priv->data;
+	u32 reg;
+
+	regmap_read(priv->syscon, data->syscon_control1_off, &reg);
+	reg &= ~CONTROL1_TSEN_INT_EN;
+	regmap_write(priv->syscon, data->syscon_control1_off, reg);
+}
+
 /* There is currently no board with more than one sensor per channel */
 static int armada_select_channel(struct armada_thermal_priv *priv, int channel)
 {
@@ -388,6 +444,16 @@ static int armada_get_temp(void *_sensor, int *temp)
 
 	/* Do the actual reading */
 	ret = armada_read_sensor(priv, temp);
+	if (ret)
+		goto unlock_mutex;
+
+	/*
+	 * Select back the interrupt source channel from which a potential
+	 * critical trip point has been set.
+	 */
+	ret = armada_select_channel(priv, priv->interrupt_source);
+	if (ret)
+		goto unlock_mutex;
 
 unlock_mutex:
 	mutex_unlock(&priv->update_lock);
@@ -399,6 +465,121 @@ static struct thermal_zone_of_device_ops of_ops = {
 	.get_temp = armada_get_temp,
 };
 
+static unsigned int armada_mc_to_reg_temp(struct armada_thermal_data *data,
+					  unsigned int temp_mc)
+{
+	s64 b = data->coef_b;
+	s64 m = data->coef_m;
+	s64 div = data->coef_div;
+	unsigned int sample;
+
+	if (data->inverted)
+		sample = div_s64(((temp_mc * div) + b), m);
+	else
+		sample = div_s64((b - (temp_mc * div)), m);
+
+	return sample & data->temp_mask;
+}
+
+static unsigned int armada_mc_to_reg_hyst(struct armada_thermal_data *data,
+					  unsigned int hyst_mc)
+{
+	/*
+	 * The documentation states:
+	 * high/low watermark = threshold +/- 0.4761 * 2^(hysteresis + 2)
+	 * which is the mathematical derivation for:
+	 * 0x0 <=> 1.9?C, 0x1 <=> 3.8?C, 0x2 <=> 7.6?C, 0x3 <=> 15.2
+	 */
+	unsigned int hyst_levels_mc[] = {1900, 3800, 7600, 15200};
+	int i;
+
+	/*
+	 * We will always take the smallest possible hysteresis to avoid risking
+	 * the hardware integrity by enlarging the threshold by +8?C in the
+	 * worst case.
+	 */
+	for (i = ARRAY_SIZE(hyst_levels_mc) - 1; i > 0; i--)
+		if (hyst_mc >= hyst_levels_mc[i])
+			break;
+
+	return i & data->hyst_mask;
+}
+
+static void armada_set_overheat_thresholds(struct armada_thermal_priv *priv,
+					   int thresh_mc, int hyst_mc)
+{
+	struct armada_thermal_data *data = priv->data;
+	unsigned int threshold = armada_mc_to_reg_temp(data, thresh_mc);
+	unsigned int hysteresis = armada_mc_to_reg_hyst(data, hyst_mc);
+	u32 ctrl1;
+
+	regmap_read(priv->syscon, data->syscon_control1_off, &ctrl1);
+
+	/* Set Threshold */
+	if (thresh_mc >= 0) {
+		ctrl1 &= ~(data->temp_mask << data->thresh_shift);
+		ctrl1 |= threshold << data->thresh_shift;
+		priv->current_threshold = thresh_mc;
+	}
+
+	/* Set Hysteresis */
+	if (hyst_mc >= 0) {
+		ctrl1 &= ~(data->hyst_mask << data->hyst_shift);
+		ctrl1 |= hysteresis << data->hyst_shift;
+		priv->current_hysteresis = hyst_mc;
+	}
+
+	regmap_write(priv->syscon, data->syscon_control1_off, ctrl1);
+}
+
+static irqreturn_t armada_overheat_isr(int irq, void *blob)
+{
+	/*
+	 * Disable the IRQ and continue in thread context (thermal core
+	 * notification and temperature monitoring).
+	 */
+	disable_irq_nosync(irq);
+
+	return IRQ_WAKE_THREAD;
+}
+
+static irqreturn_t armada_overheat_isr_thread(int irq, void *blob)
+{
+	struct armada_thermal_priv *priv = (struct armada_thermal_priv *)blob;
+	int low_threshold = priv->current_threshold - priv->current_hysteresis;
+	int temperature;
+	u32 dummy;
+	int ret;
+
+	/* Notify the core in thread context */
+	thermal_zone_device_update(priv->overheat_sensor,
+				   THERMAL_EVENT_UNSPECIFIED);
+
+	/*
+	 * The overheat interrupt must be cleared by reading the DFX interrupt
+	 * cause _after_ the temperature has fallen down to the low threshold.
+	 * Otherwise future interrupts might not be served.
+	 */
+	do {
+		msleep(OVERHEAT_INT_POLL_DELAY_MS);
+		mutex_lock(&priv->update_lock);
+		ret = armada_read_sensor(priv, &temperature);
+		mutex_unlock(&priv->update_lock);
+		if (ret)
+			return ret;
+	} while (temperature >= low_threshold);
+
+	regmap_read(priv->syscon, priv->data->dfx_irq_cause_off, &dummy);
+
+	/* Notify the thermal core that the temperature is acceptable again */
+	thermal_zone_device_update(priv->overheat_sensor,
+				   THERMAL_EVENT_UNSPECIFIED);
+
+	enable_irq(irq);
+
+	return IRQ_HANDLED;
+}
+
 static const struct armada_thermal_data armadaxp_data = {
 	.init = armadaxp_init,
 	.temp_shift = 10,
@@ -454,6 +635,9 @@ static const struct armada_thermal_data armada_ap806_data = {
 	.is_valid_bit = BIT(16),
 	.temp_shift = 0,
 	.temp_mask = 0x3ff,
+	.thresh_shift = 3,
+	.hyst_shift = 19,
+	.hyst_mask = 0x3,
 	.coef_b = -150000LL,
 	.coef_m = 423ULL,
 	.coef_div = 1,
@@ -462,6 +646,11 @@ static const struct armada_thermal_data armada_ap806_data = {
 	.syscon_control0_off = 0x84,
 	.syscon_control1_off = 0x88,
 	.syscon_status_off = 0x8C,
+	.dfx_irq_cause_off = 0x108,
+	.dfx_irq_mask_off = 0x10C,
+	.dfx_overheat_irq = BIT(22),
+	.dfx_server_irq_mask_off = 0x104,
+	.dfx_server_irq_en = BIT(1),
 	.cpu_nr = 4,
 };
 
@@ -470,6 +659,9 @@ static const struct armada_thermal_data armada_cp110_data = {
 	.is_valid_bit = BIT(10),
 	.temp_shift = 0,
 	.temp_mask = 0x3ff,
+	.thresh_shift = 16,
+	.hyst_shift = 26,
+	.hyst_mask = 0x3,
 	.coef_b = 1172499100ULL,
 	.coef_m = 2000096ULL,
 	.coef_div = 4201,
@@ -477,6 +669,11 @@ static const struct armada_thermal_data armada_cp110_data = {
 	.syscon_control0_off = 0x70,
 	.syscon_control1_off = 0x74,
 	.syscon_status_off = 0x78,
+	.dfx_irq_cause_off = 0x108,
+	.dfx_irq_mask_off = 0x10C,
+	.dfx_overheat_irq = BIT(20),
+	.dfx_server_irq_mask_off = 0x104,
+	.dfx_server_irq_en = BIT(1),
 };
 
 static const struct of_device_id armada_thermal_id_table[] = {
@@ -592,6 +789,48 @@ static void armada_set_sane_name(struct platform_device *pdev,
 	} while (insane_char);
 }
 
+/*
+ * The IP can manage to trigger interrupts on overheat situation from all the
+ * sensors. However, the interrupt source changes along with the last selected
+ * source (ie. the last read sensor), which is an inconsistent behavior. Avoid
+ * possible glitches by always selecting back only one channel (arbitrarily: the
+ * first in the DT which has a critical trip point). We also disable sensor
+ * switch during overheat situations.
+ */
+static int armada_configure_overheat_int(struct armada_thermal_priv *priv,
+					 struct thermal_zone_device *tz,
+					 int sensor_id)
+{
+	/* Retrieve the critical trip point to enable the overheat interrupt */
+	const struct thermal_trip *trips = of_thermal_get_trip_points(tz);
+	int ret;
+	int i;
+
+	if (!trips)
+		return -EINVAL;
+
+	for (i = 0; i < of_thermal_get_ntrips(tz); i++)
+		if (trips[i].type == THERMAL_TRIP_CRITICAL)
+			break;
+
+	if (i == of_thermal_get_ntrips(tz))
+		return -EINVAL;
+
+	ret = armada_select_channel(priv, sensor_id);
+	if (ret)
+		return ret;
+
+	armada_set_overheat_thresholds(priv,
+				       trips[i].temperature,
+				       trips[i].hysteresis);
+	priv->overheat_sensor = tz;
+	priv->interrupt_source = sensor_id;
+
+	armada_enable_overheat_interrupt(priv);
+
+	return 0;
+}
+
 static int armada_thermal_probe(struct platform_device *pdev)
 {
 	struct thermal_zone_device *tz;
@@ -599,7 +838,7 @@ static int armada_thermal_probe(struct platform_device *pdev)
 	struct armada_drvdata *drvdata;
 	const struct of_device_id *match;
 	struct armada_thermal_priv *priv;
-	int sensor_id;
+	int sensor_id, irq;
 	int ret;
 
 	match = of_match_device(armada_thermal_id_table, &pdev->dev);
@@ -669,6 +908,23 @@ static int armada_thermal_probe(struct platform_device *pdev)
 	drvdata->data.priv = priv;
 	platform_set_drvdata(pdev, drvdata);
 
+	irq = platform_get_irq(pdev, 0);
+	if (irq == -EPROBE_DEFER)
+		return irq;
+
+	/* The overheat interrupt feature is not mandatory */
+	if (irq >= 0) {
+		ret = devm_request_threaded_irq(&pdev->dev, irq,
+						armada_overheat_isr,
+						armada_overheat_isr_thread,
+						0, NULL, priv);
+		if (ret) {
+			dev_err(&pdev->dev, "Cannot request threaded IRQ %d\n",
+				irq);
+			return ret;
+		}
+	}
+
 	/*
 	 * There is one channel for the IC and one per CPU (if any), each
 	 * channel has one sensor.
@@ -692,8 +948,20 @@ static int armada_thermal_probe(struct platform_device *pdev)
 			devm_kfree(&pdev->dev, sensor);
 			continue;
 		}
+
+		/*
+		 * The first channel that has a critical trip point registered
+		 * in the DT will serve as interrupt source. Others possible
+		 * critical trip points will simply be ignored by the driver.
+		 */
+		if (irq >= 0 && !priv->overheat_sensor)
+			armada_configure_overheat_int(priv, tz, sensor->id);
 	}
 
+	/* Just complain if no overheat interrupt was set up */
+	if (!priv->overheat_sensor)
+		dev_warn(&pdev->dev, "Overheat interrupt not available\n");
+
 	return 0;
 }
 
-- 
2.19.1

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 2/6] MAINTAINERS: thermal: add entry for Marvell MVEBU thermal driver
  2018-11-23 16:17 ` Miquel Raynal
@ 2018-11-23 16:17   ` Miquel Raynal
  -1 siblings, 0 replies; 19+ messages in thread
From: Miquel Raynal @ 2018-11-23 16:17 UTC (permalink / raw)
  To: Gregory Clement, Jason Cooper, Andrew Lunn,
	Sebastian Hesselbarth, Zhang Rui, Eduardo Valentin,
	Daniel Lezcano
  Cc: Mark Rutland, devicetree, linux-pm, Antoine Tenart,
	Catalin Marinas, Will Deacon, Russell King, Maxime Chevallier,
	Nadav Haklai, Marc Zyngier, David Sniatkiwicz, Rob Herring,
	Thomas Petazzoni, Miquel Raynal, linux-arm-kernel

Add myself as Marvell MVEBU thermal driver maintainer.

Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
---
 MAINTAINERS | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index f4855974f325..42028c06e4e5 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8902,6 +8902,11 @@ L:	netdev@vger.kernel.org
 S:	Maintained
 F:	drivers/net/phy/marvell10g.c
 
+MARVELL MVEBU THERMAL DRIVER
+M:	Miquel Raynal <miquel.raynal@bootlin.com>
+S:	Maintained
+F:	drivers/thermal/armada_thermal.c
+
 MARVELL MVNETA ETHERNET DRIVER
 M:	Thomas Petazzoni <thomas.petazzoni@bootlin.com>
 L:	netdev@vger.kernel.org
-- 
2.19.1

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 2/6] MAINTAINERS: thermal: add entry for Marvell MVEBU thermal driver
@ 2018-11-23 16:17   ` Miquel Raynal
  0 siblings, 0 replies; 19+ messages in thread
From: Miquel Raynal @ 2018-11-23 16:17 UTC (permalink / raw)
  To: linux-arm-kernel

Add myself as Marvell MVEBU thermal driver maintainer.

Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
---
 MAINTAINERS | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index f4855974f325..42028c06e4e5 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8902,6 +8902,11 @@ L:	netdev at vger.kernel.org
 S:	Maintained
 F:	drivers/net/phy/marvell10g.c
 
+MARVELL MVEBU THERMAL DRIVER
+M:	Miquel Raynal <miquel.raynal@bootlin.com>
+S:	Maintained
+F:	drivers/thermal/armada_thermal.c
+
 MARVELL MVNETA ETHERNET DRIVER
 M:	Thomas Petazzoni <thomas.petazzoni@bootlin.com>
 L:	netdev at vger.kernel.org
-- 
2.19.1

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 3/6] dt-bindings: ap806: document the thermal interrupt capabilities
  2018-11-23 16:17 ` Miquel Raynal
@ 2018-11-23 16:17   ` Miquel Raynal
  -1 siblings, 0 replies; 19+ messages in thread
From: Miquel Raynal @ 2018-11-23 16:17 UTC (permalink / raw)
  To: Gregory Clement, Jason Cooper, Andrew Lunn,
	Sebastian Hesselbarth, Zhang Rui, Eduardo Valentin,
	Daniel Lezcano
  Cc: Mark Rutland, devicetree, linux-pm, Antoine Tenart,
	Catalin Marinas, Will Deacon, Russell King, Maxime Chevallier,
	Nadav Haklai, Marc Zyngier, David Sniatkiwicz, Rob Herring,
	Thomas Petazzoni, Miquel Raynal, linux-arm-kernel

The thermal IP can produce interrupts on overheat situation.
Describe them.

Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
---
 .../bindings/arm/marvell/ap806-system-controller.txt      | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/Documentation/devicetree/bindings/arm/marvell/ap806-system-controller.txt b/Documentation/devicetree/bindings/arm/marvell/ap806-system-controller.txt
index 3fd21bb7cb37..35e8dd2edfd2 100644
--- a/Documentation/devicetree/bindings/arm/marvell/ap806-system-controller.txt
+++ b/Documentation/devicetree/bindings/arm/marvell/ap806-system-controller.txt
@@ -114,12 +114,18 @@ Documentation/devicetree/bindings/thermal/thermal.txt
 The thermal IP can probe the temperature all around the processor. It
 may feature several channels, each of them wired to one sensor.
 
+It is possible to setup an overheat interrupt by giving at least one
+critical point to any subnode of the thermal-zone node.
+
 Required properties:
 - compatible: must be one of:
   * marvell,armada-ap806-thermal
 - reg: register range associated with the thermal functions.
 
 Optional properties:
+- interrupt-parent/interrupts: overheat interrupt handle. Should point to
+  line 18 of the SEI irqchip.
+  See interrupt-controller/interrupts.txt
 - #thermal-sensor-cells: shall be <1> when thermal-zones subnodes refer
   to this IP and represents the channel ID. There is one sensor per
   channel. O refers to the thermal IP internal channel, while positive
@@ -133,6 +139,8 @@ ap_syscon1: system-controller@6f8000 {
 	ap_thermal: thermal-sensor@80 {
 		compatible = "marvell,armada-ap806-thermal";
 		reg = <0x80 0x10>;
+		interrupt-parent = <&sei>;
+		interrupts = <18>;
 		#thermal-sensor-cells = <1>;
 	};
 };
-- 
2.19.1

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 3/6] dt-bindings: ap806: document the thermal interrupt capabilities
@ 2018-11-23 16:17   ` Miquel Raynal
  0 siblings, 0 replies; 19+ messages in thread
From: Miquel Raynal @ 2018-11-23 16:17 UTC (permalink / raw)
  To: linux-arm-kernel

The thermal IP can produce interrupts on overheat situation.
Describe them.

Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
---
 .../bindings/arm/marvell/ap806-system-controller.txt      | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/Documentation/devicetree/bindings/arm/marvell/ap806-system-controller.txt b/Documentation/devicetree/bindings/arm/marvell/ap806-system-controller.txt
index 3fd21bb7cb37..35e8dd2edfd2 100644
--- a/Documentation/devicetree/bindings/arm/marvell/ap806-system-controller.txt
+++ b/Documentation/devicetree/bindings/arm/marvell/ap806-system-controller.txt
@@ -114,12 +114,18 @@ Documentation/devicetree/bindings/thermal/thermal.txt
 The thermal IP can probe the temperature all around the processor. It
 may feature several channels, each of them wired to one sensor.
 
+It is possible to setup an overheat interrupt by giving at least one
+critical point to any subnode of the thermal-zone node.
+
 Required properties:
 - compatible: must be one of:
   * marvell,armada-ap806-thermal
 - reg: register range associated with the thermal functions.
 
 Optional properties:
+- interrupt-parent/interrupts: overheat interrupt handle. Should point to
+  line 18 of the SEI irqchip.
+  See interrupt-controller/interrupts.txt
 - #thermal-sensor-cells: shall be <1> when thermal-zones subnodes refer
   to this IP and represents the channel ID. There is one sensor per
   channel. O refers to the thermal IP internal channel, while positive
@@ -133,6 +139,8 @@ ap_syscon1: system-controller at 6f8000 {
 	ap_thermal: thermal-sensor at 80 {
 		compatible = "marvell,armada-ap806-thermal";
 		reg = <0x80 0x10>;
+		interrupt-parent = <&sei>;
+		interrupts = <18>;
 		#thermal-sensor-cells = <1>;
 	};
 };
-- 
2.19.1

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 4/6] dt-bindings: cp110: document the thermal interrupt capabilities
  2018-11-23 16:17 ` Miquel Raynal
@ 2018-11-23 16:17   ` Miquel Raynal
  -1 siblings, 0 replies; 19+ messages in thread
From: Miquel Raynal @ 2018-11-23 16:17 UTC (permalink / raw)
  To: Gregory Clement, Jason Cooper, Andrew Lunn,
	Sebastian Hesselbarth, Zhang Rui, Eduardo Valentin,
	Daniel Lezcano
  Cc: Mark Rutland, devicetree, linux-pm, Antoine Tenart,
	Catalin Marinas, Will Deacon, Russell King, Maxime Chevallier,
	Nadav Haklai, Marc Zyngier, David Sniatkiwicz, Rob Herring,
	Thomas Petazzoni, Miquel Raynal, linux-arm-kernel

The thermal IP can produce interrupts on overheat situation.
Describe them.

Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
---
 .../bindings/arm/marvell/cp110-system-controller.txt     | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/Documentation/devicetree/bindings/arm/marvell/cp110-system-controller.txt b/Documentation/devicetree/bindings/arm/marvell/cp110-system-controller.txt
index 81ce742d2760..4db4119a6d19 100644
--- a/Documentation/devicetree/bindings/arm/marvell/cp110-system-controller.txt
+++ b/Documentation/devicetree/bindings/arm/marvell/cp110-system-controller.txt
@@ -199,6 +199,9 @@ Thermal:
 The thermal IP can probe the temperature all around the processor. It
 may feature several channels, each of them wired to one sensor.
 
+It is possible to setup an overheat interrupt by giving at least one
+critical point to any subnode of the thermal-zone node.
+
 For common binding part and usage, refer to
 Documentation/devicetree/bindings/thermal/thermal.txt
 
@@ -208,6 +211,11 @@ Required properties:
 - reg: register range associated with the thermal functions.
 
 Optional properties:
+- interrupts-extended: overheat interrupt handle. Should point to
+  a line of the ICU-SEI irqchip (116 is what is usually used by the
+  firmware). The ICU-SEI will redirect towards interrupt line #37 of the
+  AP SEI which is shared across all CPs.
+  See interrupt-controller/interrupts.txt
 - #thermal-sensor-cells: shall be <1> when thermal-zones subnodes refer
   to this IP and represents the channel ID. There is one sensor per
   channel. O refers to the thermal IP internal channel.
@@ -220,6 +228,7 @@ CP110_LABEL(syscon1): system-controller@6f8000 {
 	CP110_LABEL(thermal): thermal-sensor@70 {
 		compatible = "marvell,armada-cp110-thermal";
 		reg = <0x70 0x10>;
+		interrupts-extended = <&CP110_LABEL(icu_sei) 116 IRQ_TYPE_LEVEL_HIGH>;
 		#thermal-sensor-cells = <1>;
 	};
 };
-- 
2.19.1

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 4/6] dt-bindings: cp110: document the thermal interrupt capabilities
@ 2018-11-23 16:17   ` Miquel Raynal
  0 siblings, 0 replies; 19+ messages in thread
From: Miquel Raynal @ 2018-11-23 16:17 UTC (permalink / raw)
  To: linux-arm-kernel

The thermal IP can produce interrupts on overheat situation.
Describe them.

Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
---
 .../bindings/arm/marvell/cp110-system-controller.txt     | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/Documentation/devicetree/bindings/arm/marvell/cp110-system-controller.txt b/Documentation/devicetree/bindings/arm/marvell/cp110-system-controller.txt
index 81ce742d2760..4db4119a6d19 100644
--- a/Documentation/devicetree/bindings/arm/marvell/cp110-system-controller.txt
+++ b/Documentation/devicetree/bindings/arm/marvell/cp110-system-controller.txt
@@ -199,6 +199,9 @@ Thermal:
 The thermal IP can probe the temperature all around the processor. It
 may feature several channels, each of them wired to one sensor.
 
+It is possible to setup an overheat interrupt by giving at least one
+critical point to any subnode of the thermal-zone node.
+
 For common binding part and usage, refer to
 Documentation/devicetree/bindings/thermal/thermal.txt
 
@@ -208,6 +211,11 @@ Required properties:
 - reg: register range associated with the thermal functions.
 
 Optional properties:
+- interrupts-extended: overheat interrupt handle. Should point to
+  a line of the ICU-SEI irqchip (116 is what is usually used by the
+  firmware). The ICU-SEI will redirect towards interrupt line #37 of the
+  AP SEI which is shared across all CPs.
+  See interrupt-controller/interrupts.txt
 - #thermal-sensor-cells: shall be <1> when thermal-zones subnodes refer
   to this IP and represents the channel ID. There is one sensor per
   channel. O refers to the thermal IP internal channel.
@@ -220,6 +228,7 @@ CP110_LABEL(syscon1): system-controller at 6f8000 {
 	CP110_LABEL(thermal): thermal-sensor at 70 {
 		compatible = "marvell,armada-cp110-thermal";
 		reg = <0x70 0x10>;
+		interrupts-extended = <&CP110_LABEL(icu_sei) 116 IRQ_TYPE_LEVEL_HIGH>;
 		#thermal-sensor-cells = <1>;
 	};
 };
-- 
2.19.1

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 5/6] arm64: dts: marvell: add interrupt support to ap806 thermal node
  2018-11-23 16:17 ` Miquel Raynal
@ 2018-11-23 16:17   ` Miquel Raynal
  -1 siblings, 0 replies; 19+ messages in thread
From: Miquel Raynal @ 2018-11-23 16:17 UTC (permalink / raw)
  To: Gregory Clement, Jason Cooper, Andrew Lunn,
	Sebastian Hesselbarth, Zhang Rui, Eduardo Valentin,
	Daniel Lezcano
  Cc: Mark Rutland, devicetree, linux-pm, Antoine Tenart,
	Catalin Marinas, Will Deacon, Russell King, Maxime Chevallier,
	Nadav Haklai, Marc Zyngier, David Sniatkiwicz, Rob Herring,
	Thomas Petazzoni, Miquel Raynal, linux-arm-kernel

Add interrupt properties in the thermal node as well as a critical trip
point in the thermal-zone.

Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
---
 arch/arm64/boot/dts/marvell/armada-ap806.dtsi | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/boot/dts/marvell/armada-ap806.dtsi b/arch/arm64/boot/dts/marvell/armada-ap806.dtsi
index 073610ac0a53..70b244a4c4c2 100644
--- a/arch/arm64/boot/dts/marvell/armada-ap806.dtsi
+++ b/arch/arm64/boot/dts/marvell/armada-ap806.dtsi
@@ -293,6 +293,8 @@
 				ap_thermal: thermal-sensor@80 {
 					compatible = "marvell,armada-ap806-thermal";
 					reg = <0x80 0x10>;
+					interrupt-parent = <&sei>;
+					interrupts = <18>;
 					#thermal-sensor-cells = <1>;
 				};
 			};
@@ -303,16 +305,26 @@
 	 * The thermal IP features one internal sensor plus, if applicable, one
 	 * remote channel wired to one sensor per CPU.
 	 *
+	 * Only one thermal zone per AP/CP may trigger interrupts at a time, the
+	 * first one that will have a critical trip point will be chosen.
+	 *
 	 * The cooling maps are always empty as there are no cooling devices.
 	 */
 	thermal-zones {
 		ap_thermal_ic: ap-thermal-ic {
-			polling-delay-passive = <1000>;
-			polling-delay = <1000>;
+			polling-delay-passive = <0>; /* Interrupt driven */
+			polling-delay = <0>; /* Interrupt driven */
 
 			thermal-sensors = <&ap_thermal 0>;
 
-			trips {	};
+			trips {
+				ap_crit: ap-crit {
+					temperature = <100000>; /* mC degrees */
+					hysteresis = <2000>; /* mC degrees */
+					type = "critical";
+				};
+			};
+
 			cooling-maps { };
 		};
 
-- 
2.19.1

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 5/6] arm64: dts: marvell: add interrupt support to ap806 thermal node
@ 2018-11-23 16:17   ` Miquel Raynal
  0 siblings, 0 replies; 19+ messages in thread
From: Miquel Raynal @ 2018-11-23 16:17 UTC (permalink / raw)
  To: linux-arm-kernel

Add interrupt properties in the thermal node as well as a critical trip
point in the thermal-zone.

Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
---
 arch/arm64/boot/dts/marvell/armada-ap806.dtsi | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/boot/dts/marvell/armada-ap806.dtsi b/arch/arm64/boot/dts/marvell/armada-ap806.dtsi
index 073610ac0a53..70b244a4c4c2 100644
--- a/arch/arm64/boot/dts/marvell/armada-ap806.dtsi
+++ b/arch/arm64/boot/dts/marvell/armada-ap806.dtsi
@@ -293,6 +293,8 @@
 				ap_thermal: thermal-sensor at 80 {
 					compatible = "marvell,armada-ap806-thermal";
 					reg = <0x80 0x10>;
+					interrupt-parent = <&sei>;
+					interrupts = <18>;
 					#thermal-sensor-cells = <1>;
 				};
 			};
@@ -303,16 +305,26 @@
 	 * The thermal IP features one internal sensor plus, if applicable, one
 	 * remote channel wired to one sensor per CPU.
 	 *
+	 * Only one thermal zone per AP/CP may trigger interrupts at a time, the
+	 * first one that will have a critical trip point will be chosen.
+	 *
 	 * The cooling maps are always empty as there are no cooling devices.
 	 */
 	thermal-zones {
 		ap_thermal_ic: ap-thermal-ic {
-			polling-delay-passive = <1000>;
-			polling-delay = <1000>;
+			polling-delay-passive = <0>; /* Interrupt driven */
+			polling-delay = <0>; /* Interrupt driven */
 
 			thermal-sensors = <&ap_thermal 0>;
 
-			trips {	};
+			trips {
+				ap_crit: ap-crit {
+					temperature = <100000>; /* mC degrees */
+					hysteresis = <2000>; /* mC degrees */
+					type = "critical";
+				};
+			};
+
 			cooling-maps { };
 		};
 
-- 
2.19.1

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 6/6] arm64: dts: marvell: add interrupt support to cp110 thermal node
  2018-11-23 16:17 ` Miquel Raynal
@ 2018-11-23 16:17   ` Miquel Raynal
  -1 siblings, 0 replies; 19+ messages in thread
From: Miquel Raynal @ 2018-11-23 16:17 UTC (permalink / raw)
  To: Gregory Clement, Jason Cooper, Andrew Lunn,
	Sebastian Hesselbarth, Zhang Rui, Eduardo Valentin,
	Daniel Lezcano
  Cc: Mark Rutland, devicetree, linux-pm, Antoine Tenart,
	Catalin Marinas, Will Deacon, Russell King, Maxime Chevallier,
	Nadav Haklai, Marc Zyngier, David Sniatkiwicz, Rob Herring,
	Thomas Petazzoni, Miquel Raynal, linux-arm-kernel

Add interrupt properties in the thermal node as well as a critical trip
point in the thermal-zone.

Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
---
 arch/arm64/boot/dts/marvell/armada-cp110.dtsi | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/boot/dts/marvell/armada-cp110.dtsi b/arch/arm64/boot/dts/marvell/armada-cp110.dtsi
index b9d9f31e3ba1..4d6e4a097f72 100644
--- a/arch/arm64/boot/dts/marvell/armada-cp110.dtsi
+++ b/arch/arm64/boot/dts/marvell/armada-cp110.dtsi
@@ -28,12 +28,19 @@
 	 */
 	thermal-zones {
 		CP110_LABEL(thermal_ic): CP110_NODE_NAME(thermal-ic) {
-			polling-delay-passive = <1000>;
-			polling-delay = <1000>;
+			polling-delay-passive = <0>; /* Interrupt driven */
+			polling-delay = <0>; /* Interrupt driven */
 
 			thermal-sensors = <&CP110_LABEL(thermal) 0>;
 
-			trips {	};
+			trips {
+				CP110_LABEL(crit): crit {
+					temperature = <100000>; /* mC degrees */
+					hysteresis = <2000>; /* mC degrees */
+					type = "critical";
+				};
+			};
+
 			cooling-maps { };
 		};
 	};
@@ -259,6 +266,8 @@
 			CP110_LABEL(thermal): thermal-sensor@70 {
 				compatible = "marvell,armada-cp110-thermal";
 				reg = <0x70 0x10>;
+				interrupts-extended =
+					<&CP110_LABEL(icu_sei) 116 IRQ_TYPE_LEVEL_HIGH>;
 				#thermal-sensor-cells = <1>;
 			};
 		};
-- 
2.19.1

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 6/6] arm64: dts: marvell: add interrupt support to cp110 thermal node
@ 2018-11-23 16:17   ` Miquel Raynal
  0 siblings, 0 replies; 19+ messages in thread
From: Miquel Raynal @ 2018-11-23 16:17 UTC (permalink / raw)
  To: linux-arm-kernel

Add interrupt properties in the thermal node as well as a critical trip
point in the thermal-zone.

Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
---
 arch/arm64/boot/dts/marvell/armada-cp110.dtsi | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/boot/dts/marvell/armada-cp110.dtsi b/arch/arm64/boot/dts/marvell/armada-cp110.dtsi
index b9d9f31e3ba1..4d6e4a097f72 100644
--- a/arch/arm64/boot/dts/marvell/armada-cp110.dtsi
+++ b/arch/arm64/boot/dts/marvell/armada-cp110.dtsi
@@ -28,12 +28,19 @@
 	 */
 	thermal-zones {
 		CP110_LABEL(thermal_ic): CP110_NODE_NAME(thermal-ic) {
-			polling-delay-passive = <1000>;
-			polling-delay = <1000>;
+			polling-delay-passive = <0>; /* Interrupt driven */
+			polling-delay = <0>; /* Interrupt driven */
 
 			thermal-sensors = <&CP110_LABEL(thermal) 0>;
 
-			trips {	};
+			trips {
+				CP110_LABEL(crit): crit {
+					temperature = <100000>; /* mC degrees */
+					hysteresis = <2000>; /* mC degrees */
+					type = "critical";
+				};
+			};
+
 			cooling-maps { };
 		};
 	};
@@ -259,6 +266,8 @@
 			CP110_LABEL(thermal): thermal-sensor at 70 {
 				compatible = "marvell,armada-cp110-thermal";
 				reg = <0x70 0x10>;
+				interrupts-extended =
+					<&CP110_LABEL(icu_sei) 116 IRQ_TYPE_LEVEL_HIGH>;
 				#thermal-sensor-cells = <1>;
 			};
 		};
-- 
2.19.1

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/6] thermal: armada: add overheat interrupt support
  2018-11-23 16:17   ` Miquel Raynal
@ 2018-11-23 16:55     ` Marc Zyngier
  -1 siblings, 0 replies; 19+ messages in thread
From: Marc Zyngier @ 2018-11-23 16:55 UTC (permalink / raw)
  To: Miquel Raynal, Gregory Clement, Jason Cooper, Andrew Lunn,
	Sebastian Hesselbarth, Zhang Rui, Eduardo Valentin,
	Daniel Lezcano
  Cc: Mark Rutland, devicetree, linux-pm, Antoine Tenart,
	Catalin Marinas, Will Deacon, Russell King, Maxime Chevallier,
	Nadav Haklai, David Sniatkiwicz, Rob Herring, Thomas Petazzoni,
	linux-arm-kernel

On 23/11/2018 16:17, Miquel Raynal wrote:
> The IP can manage to trigger interrupts on overheat situation from all
> the sensors.
> 
> However, the interrupt source changes along with the last selected
> source (ie. the last read sensor), which is an inconsistent behavior.
> Avoid possible glitches by always selecting back only one channel which
> will then be referenced as the "overheat_sensor" (arbitrarily: the first
> in the DT which has a critical trip point filled in).
> 
> It is possible that the scan of all thermal zone nodes did not bring a
> critical trip point from which the overheat interrupt could be
> configured. In this case just complain but do not fail the probe.
> 
> Also disable sensor switch during overheat situations because changing
> the channel while the system is too hot could clear the overheat state
> by changing the source while the temperature is still very high.
> 
> Even if the overheat state is not declared, overheat interrupt must be
> cleared by reading the DFX interrupt cause _after_ the temperature has
> fallen down to the low threshold, otherwise future possible interrupts
> would not be served. A work polls the corresponding register until the
> overheat flag gets cleared in this case.
> 
> Suggested-by: David Sniatkiwicz <davidsn@marvell.com>
> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
> ---
>  drivers/thermal/armada_thermal.c | 270 ++++++++++++++++++++++++++++++-
>  1 file changed, 269 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/thermal/armada_thermal.c b/drivers/thermal/armada_thermal.c
> index 92f67d40f2e9..55cfaee0da77 100644
> --- a/drivers/thermal/armada_thermal.c
> +++ b/drivers/thermal/armada_thermal.c
> @@ -26,6 +26,11 @@
>  #include <linux/iopoll.h>
>  #include <linux/mfd/syscon.h>
>  #include <linux/regmap.h>
> +#include <linux/interrupt.h>
> +
> +#include "thermal_core.h"
> +
> +#define TO_MCELSIUS(c)			((c) * 1000)
>  
>  /* Thermal Manager Control and Status Register */
>  #define PMU_TDC0_SW_RST_MASK		(0x1 << 1)
> @@ -61,9 +66,13 @@
>  #define CONTROL1_TSEN_AVG_MASK		0x7
>  #define CONTROL1_EXT_TSEN_SW_RESET	BIT(7)
>  #define CONTROL1_EXT_TSEN_HW_RESETn	BIT(8)
> +#define CONTROL1_TSEN_INT_EN		BIT(25)
> +#define CONTROL1_TSEN_SELECT_OFF	21
> +#define CONTROL1_TSEN_SELECT_MASK	0x3
>  
>  #define STATUS_POLL_PERIOD_US		1000
>  #define STATUS_POLL_TIMEOUT_US		100000
> +#define OVERHEAT_INT_POLL_DELAY_MS	1000
>  
>  struct armada_thermal_data;
>  
> @@ -75,7 +84,11 @@ struct armada_thermal_priv {
>  	/* serialize temperature reads/updates */
>  	struct mutex update_lock;
>  	struct armada_thermal_data *data;
> +	struct thermal_zone_device *overheat_sensor;
> +	int interrupt_source;
>  	int current_channel;
> +	long current_threshold;
> +	long current_hysteresis;
>  };
>  
>  struct armada_thermal_data {
> @@ -93,12 +106,20 @@ struct armada_thermal_data {
>  	/* Register shift and mask to access the sensor temperature */
>  	unsigned int temp_shift;
>  	unsigned int temp_mask;
> +	unsigned int thresh_shift;
> +	unsigned int hyst_shift;
> +	unsigned int hyst_mask;
>  	u32 is_valid_bit;
>  
>  	/* Syscon access */
>  	unsigned int syscon_control0_off;
>  	unsigned int syscon_control1_off;
>  	unsigned int syscon_status_off;
> +	unsigned int dfx_irq_cause_off;
> +	unsigned int dfx_irq_mask_off;
> +	unsigned int dfx_overheat_irq;
> +	unsigned int dfx_server_irq_mask_off;
> +	unsigned int dfx_server_irq_en;
>  
>  	/* One sensor is in the thermal IC, the others are in the CPUs if any */
>  	unsigned int cpu_nr;
> @@ -272,6 +293,41 @@ static bool armada_is_valid(struct armada_thermal_priv *priv)
>  	return reg & priv->data->is_valid_bit;
>  }
>  
> +static void armada_enable_overheat_interrupt(struct armada_thermal_priv *priv)
> +{
> +	struct armada_thermal_data *data = priv->data;
> +	u32 reg;
> +
> +	/* Clear DFX temperature IRQ cause */
> +	regmap_read(priv->syscon, data->dfx_irq_cause_off, &reg);
> +
> +	/* Enable DFX Temperature IRQ */
> +	regmap_read(priv->syscon, data->dfx_irq_mask_off, &reg);
> +	reg |= data->dfx_overheat_irq;
> +	regmap_write(priv->syscon, data->dfx_irq_mask_off, reg);
> +
> +	/* Enable DFX server IRQ */
> +	regmap_read(priv->syscon, data->dfx_server_irq_mask_off, &reg);
> +	reg |= data->dfx_server_irq_en;
> +	regmap_write(priv->syscon, data->dfx_server_irq_mask_off, reg);
> +
> +	/* Enable overheat interrupt */
> +	regmap_read(priv->syscon, data->syscon_control1_off, &reg);
> +	reg |= CONTROL1_TSEN_INT_EN;
> +	regmap_write(priv->syscon, data->syscon_control1_off, reg);
> +}
> +
> +static void __maybe_unused
> +armada_disable_overheat_interrupt(struct armada_thermal_priv *priv)
> +{
> +	struct armada_thermal_data *data = priv->data;
> +	u32 reg;
> +
> +	regmap_read(priv->syscon, data->syscon_control1_off, &reg);
> +	reg &= ~CONTROL1_TSEN_INT_EN;
> +	regmap_write(priv->syscon, data->syscon_control1_off, reg);
> +}
> +
>  /* There is currently no board with more than one sensor per channel */
>  static int armada_select_channel(struct armada_thermal_priv *priv, int channel)
>  {
> @@ -388,6 +444,16 @@ static int armada_get_temp(void *_sensor, int *temp)
>  
>  	/* Do the actual reading */
>  	ret = armada_read_sensor(priv, temp);
> +	if (ret)
> +		goto unlock_mutex;
> +
> +	/*
> +	 * Select back the interrupt source channel from which a potential
> +	 * critical trip point has been set.
> +	 */
> +	ret = armada_select_channel(priv, priv->interrupt_source);
> +	if (ret)
> +		goto unlock_mutex;

Do you really need a "goto" to the immediately following line?

>  
>  unlock_mutex:
>  	mutex_unlock(&priv->update_lock);
> @@ -399,6 +465,121 @@ static struct thermal_zone_of_device_ops of_ops = {
>  	.get_temp = armada_get_temp,
>  };
>  
> +static unsigned int armada_mc_to_reg_temp(struct armada_thermal_data *data,
> +					  unsigned int temp_mc)
> +{
> +	s64 b = data->coef_b;
> +	s64 m = data->coef_m;
> +	s64 div = data->coef_div;
> +	unsigned int sample;
> +
> +	if (data->inverted)
> +		sample = div_s64(((temp_mc * div) + b), m);
> +	else
> +		sample = div_s64((b - (temp_mc * div)), m);
> +
> +	return sample & data->temp_mask;
> +}
> +
> +static unsigned int armada_mc_to_reg_hyst(struct armada_thermal_data *data,
> +					  unsigned int hyst_mc)
> +{
> +	/*
> +	 * The documentation states:
> +	 * high/low watermark = threshold +/- 0.4761 * 2^(hysteresis + 2)
> +	 * which is the mathematical derivation for:
> +	 * 0x0 <=> 1.9°C, 0x1 <=> 3.8°C, 0x2 <=> 7.6°C, 0x3 <=> 15.2
> +	 */
> +	unsigned int hyst_levels_mc[] = {1900, 3800, 7600, 15200};

Why isn't this a static array rather than something you have to push on
the stack each time you execute this function?

> +	int i;
> +
> +	/*
> +	 * We will always take the smallest possible hysteresis to avoid risking
> +	 * the hardware integrity by enlarging the threshold by +8°C in the
> +	 * worst case.
> +	 */
> +	for (i = ARRAY_SIZE(hyst_levels_mc) - 1; i > 0; i--)
> +		if (hyst_mc >= hyst_levels_mc[i])
> +			break;
> +
> +	return i & data->hyst_mask;
> +}
> +
> +static void armada_set_overheat_thresholds(struct armada_thermal_priv *priv,
> +					   int thresh_mc, int hyst_mc)
> +{
> +	struct armada_thermal_data *data = priv->data;
> +	unsigned int threshold = armada_mc_to_reg_temp(data, thresh_mc);
> +	unsigned int hysteresis = armada_mc_to_reg_hyst(data, hyst_mc);
> +	u32 ctrl1;
> +
> +	regmap_read(priv->syscon, data->syscon_control1_off, &ctrl1);
> +
> +	/* Set Threshold */
> +	if (thresh_mc >= 0) {
> +		ctrl1 &= ~(data->temp_mask << data->thresh_shift);
> +		ctrl1 |= threshold << data->thresh_shift;
> +		priv->current_threshold = thresh_mc;
> +	}
> +
> +	/* Set Hysteresis */
> +	if (hyst_mc >= 0) {
> +		ctrl1 &= ~(data->hyst_mask << data->hyst_shift);
> +		ctrl1 |= hysteresis << data->hyst_shift;
> +		priv->current_hysteresis = hyst_mc;
> +	}
> +
> +	regmap_write(priv->syscon, data->syscon_control1_off, ctrl1);
> +}
> +
> +static irqreturn_t armada_overheat_isr(int irq, void *blob)
> +{
> +	/*
> +	 * Disable the IRQ and continue in thread context (thermal core
> +	 * notification and temperature monitoring).
> +	 */
> +	disable_irq_nosync(irq);
> +
> +	return IRQ_WAKE_THREAD;
> +}
> +
> +static irqreturn_t armada_overheat_isr_thread(int irq, void *blob)
> +{
> +	struct armada_thermal_priv *priv = (struct armada_thermal_priv *)blob;

Unnecessary cast.

> +	int low_threshold = priv->current_threshold - priv->current_hysteresis;
> +	int temperature;
> +	u32 dummy;
> +	int ret;
> +
> +	/* Notify the core in thread context */
> +	thermal_zone_device_update(priv->overheat_sensor,
> +				   THERMAL_EVENT_UNSPECIFIED);
> +
> +	/*
> +	 * The overheat interrupt must be cleared by reading the DFX interrupt
> +	 * cause _after_ the temperature has fallen down to the low threshold.
> +	 * Otherwise future interrupts might not be served.
> +	 */
> +	do {
> +		msleep(OVERHEAT_INT_POLL_DELAY_MS);
> +		mutex_lock(&priv->update_lock);
> +		ret = armada_read_sensor(priv, &temperature);
> +		mutex_unlock(&priv->update_lock);
> +		if (ret)
> +			return ret;

How will the interrupt fire again if you exit without re-enabling it? Is
"ret" guaranteed to be a valid irqreturn_t?

> +	} while (temperature >= low_threshold);
> +
> +	regmap_read(priv->syscon, priv->data->dfx_irq_cause_off, &dummy);
> +
> +	/* Notify the thermal core that the temperature is acceptable again */
> +	thermal_zone_device_update(priv->overheat_sensor,
> +				   THERMAL_EVENT_UNSPECIFIED);
> +
> +	enable_irq(irq);
> +
> +	return IRQ_HANDLED;
> +}
> +
>  static const struct armada_thermal_data armadaxp_data = {
>  	.init = armadaxp_init,
>  	.temp_shift = 10,
> @@ -454,6 +635,9 @@ static const struct armada_thermal_data armada_ap806_data = {
>  	.is_valid_bit = BIT(16),
>  	.temp_shift = 0,
>  	.temp_mask = 0x3ff,
> +	.thresh_shift = 3,
> +	.hyst_shift = 19,
> +	.hyst_mask = 0x3,
>  	.coef_b = -150000LL,
>  	.coef_m = 423ULL,
>  	.coef_div = 1,
> @@ -462,6 +646,11 @@ static const struct armada_thermal_data armada_ap806_data = {
>  	.syscon_control0_off = 0x84,
>  	.syscon_control1_off = 0x88,
>  	.syscon_status_off = 0x8C,
> +	.dfx_irq_cause_off = 0x108,
> +	.dfx_irq_mask_off = 0x10C,
> +	.dfx_overheat_irq = BIT(22),
> +	.dfx_server_irq_mask_off = 0x104,
> +	.dfx_server_irq_en = BIT(1),
>  	.cpu_nr = 4,
>  };
>  
> @@ -470,6 +659,9 @@ static const struct armada_thermal_data armada_cp110_data = {
>  	.is_valid_bit = BIT(10),
>  	.temp_shift = 0,
>  	.temp_mask = 0x3ff,
> +	.thresh_shift = 16,
> +	.hyst_shift = 26,
> +	.hyst_mask = 0x3,
>  	.coef_b = 1172499100ULL,
>  	.coef_m = 2000096ULL,
>  	.coef_div = 4201,
> @@ -477,6 +669,11 @@ static const struct armada_thermal_data armada_cp110_data = {
>  	.syscon_control0_off = 0x70,
>  	.syscon_control1_off = 0x74,
>  	.syscon_status_off = 0x78,
> +	.dfx_irq_cause_off = 0x108,
> +	.dfx_irq_mask_off = 0x10C,
> +	.dfx_overheat_irq = BIT(20),
> +	.dfx_server_irq_mask_off = 0x104,
> +	.dfx_server_irq_en = BIT(1),
>  };
>  
>  static const struct of_device_id armada_thermal_id_table[] = {
> @@ -592,6 +789,48 @@ static void armada_set_sane_name(struct platform_device *pdev,
>  	} while (insane_char);
>  }
>  
> +/*
> + * The IP can manage to trigger interrupts on overheat situation from all the
> + * sensors. However, the interrupt source changes along with the last selected
> + * source (ie. the last read sensor), which is an inconsistent behavior. Avoid
> + * possible glitches by always selecting back only one channel (arbitrarily: the
> + * first in the DT which has a critical trip point). We also disable sensor

The DT doesn't seem to *mandate* this. Should it? Does it also mean that
you can only have a single critical point that will generate an
interrupt on overheat?

> + * switch during overheat situations.
> + */
> +static int armada_configure_overheat_int(struct armada_thermal_priv *priv,
> +					 struct thermal_zone_device *tz,
> +					 int sensor_id)
> +{
> +	/* Retrieve the critical trip point to enable the overheat interrupt */
> +	const struct thermal_trip *trips = of_thermal_get_trip_points(tz);
> +	int ret;
> +	int i;
> +
> +	if (!trips)
> +		return -EINVAL;
> +
> +	for (i = 0; i < of_thermal_get_ntrips(tz); i++)
> +		if (trips[i].type == THERMAL_TRIP_CRITICAL)
> +			break;
> +
> +	if (i == of_thermal_get_ntrips(tz))
> +		return -EINVAL;
> +
> +	ret = armada_select_channel(priv, sensor_id);
> +	if (ret)
> +		return ret;
> +
> +	armada_set_overheat_thresholds(priv,
> +				       trips[i].temperature,
> +				       trips[i].hysteresis);
> +	priv->overheat_sensor = tz;
> +	priv->interrupt_source = sensor_id;
> +
> +	armada_enable_overheat_interrupt(priv);
> +
> +	return 0;
> +}
> +
>  static int armada_thermal_probe(struct platform_device *pdev)
>  {
>  	struct thermal_zone_device *tz;
> @@ -599,7 +838,7 @@ static int armada_thermal_probe(struct platform_device *pdev)
>  	struct armada_drvdata *drvdata;
>  	const struct of_device_id *match;
>  	struct armada_thermal_priv *priv;
> -	int sensor_id;
> +	int sensor_id, irq;
>  	int ret;
>  
>  	match = of_match_device(armada_thermal_id_table, &pdev->dev);
> @@ -669,6 +908,23 @@ static int armada_thermal_probe(struct platform_device *pdev)
>  	drvdata->data.priv = priv;
>  	platform_set_drvdata(pdev, drvdata);
>  
> +	irq = platform_get_irq(pdev, 0);
> +	if (irq == -EPROBE_DEFER)
> +		return irq;
> +
> +	/* The overheat interrupt feature is not mandatory */
> +	if (irq >= 0) {

0 is not a valid interrupt.

> +		ret = devm_request_threaded_irq(&pdev->dev, irq,
> +						armada_overheat_isr,
> +						armada_overheat_isr_thread,
> +						0, NULL, priv);
> +		if (ret) {
> +			dev_err(&pdev->dev, "Cannot request threaded IRQ %d\n",
> +				irq);
> +			return ret;
> +		}
> +	}
> +
>  	/*
>  	 * There is one channel for the IC and one per CPU (if any), each
>  	 * channel has one sensor.
> @@ -692,8 +948,20 @@ static int armada_thermal_probe(struct platform_device *pdev)
>  			devm_kfree(&pdev->dev, sensor);
>  			continue;
>  		}
> +
> +		/*
> +		 * The first channel that has a critical trip point registered
> +		 * in the DT will serve as interrupt source. Others possible
> +		 * critical trip points will simply be ignored by the driver.
> +		 */
> +		if (irq >= 0 && !priv->overheat_sensor)

Same here.

> +			armada_configure_overheat_int(priv, tz, sensor->id);
>  	}
>  
> +	/* Just complain if no overheat interrupt was set up */
> +	if (!priv->overheat_sensor)
> +		dev_warn(&pdev->dev, "Overheat interrupt not available\n");
> +
>  	return 0;
>  }
>  
> 

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v2 1/6] thermal: armada: add overheat interrupt support
@ 2018-11-23 16:55     ` Marc Zyngier
  0 siblings, 0 replies; 19+ messages in thread
From: Marc Zyngier @ 2018-11-23 16:55 UTC (permalink / raw)
  To: linux-arm-kernel

On 23/11/2018 16:17, Miquel Raynal wrote:
> The IP can manage to trigger interrupts on overheat situation from all
> the sensors.
> 
> However, the interrupt source changes along with the last selected
> source (ie. the last read sensor), which is an inconsistent behavior.
> Avoid possible glitches by always selecting back only one channel which
> will then be referenced as the "overheat_sensor" (arbitrarily: the first
> in the DT which has a critical trip point filled in).
> 
> It is possible that the scan of all thermal zone nodes did not bring a
> critical trip point from which the overheat interrupt could be
> configured. In this case just complain but do not fail the probe.
> 
> Also disable sensor switch during overheat situations because changing
> the channel while the system is too hot could clear the overheat state
> by changing the source while the temperature is still very high.
> 
> Even if the overheat state is not declared, overheat interrupt must be
> cleared by reading the DFX interrupt cause _after_ the temperature has
> fallen down to the low threshold, otherwise future possible interrupts
> would not be served. A work polls the corresponding register until the
> overheat flag gets cleared in this case.
> 
> Suggested-by: David Sniatkiwicz <davidsn@marvell.com>
> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
> ---
>  drivers/thermal/armada_thermal.c | 270 ++++++++++++++++++++++++++++++-
>  1 file changed, 269 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/thermal/armada_thermal.c b/drivers/thermal/armada_thermal.c
> index 92f67d40f2e9..55cfaee0da77 100644
> --- a/drivers/thermal/armada_thermal.c
> +++ b/drivers/thermal/armada_thermal.c
> @@ -26,6 +26,11 @@
>  #include <linux/iopoll.h>
>  #include <linux/mfd/syscon.h>
>  #include <linux/regmap.h>
> +#include <linux/interrupt.h>
> +
> +#include "thermal_core.h"
> +
> +#define TO_MCELSIUS(c)			((c) * 1000)
>  
>  /* Thermal Manager Control and Status Register */
>  #define PMU_TDC0_SW_RST_MASK		(0x1 << 1)
> @@ -61,9 +66,13 @@
>  #define CONTROL1_TSEN_AVG_MASK		0x7
>  #define CONTROL1_EXT_TSEN_SW_RESET	BIT(7)
>  #define CONTROL1_EXT_TSEN_HW_RESETn	BIT(8)
> +#define CONTROL1_TSEN_INT_EN		BIT(25)
> +#define CONTROL1_TSEN_SELECT_OFF	21
> +#define CONTROL1_TSEN_SELECT_MASK	0x3
>  
>  #define STATUS_POLL_PERIOD_US		1000
>  #define STATUS_POLL_TIMEOUT_US		100000
> +#define OVERHEAT_INT_POLL_DELAY_MS	1000
>  
>  struct armada_thermal_data;
>  
> @@ -75,7 +84,11 @@ struct armada_thermal_priv {
>  	/* serialize temperature reads/updates */
>  	struct mutex update_lock;
>  	struct armada_thermal_data *data;
> +	struct thermal_zone_device *overheat_sensor;
> +	int interrupt_source;
>  	int current_channel;
> +	long current_threshold;
> +	long current_hysteresis;
>  };
>  
>  struct armada_thermal_data {
> @@ -93,12 +106,20 @@ struct armada_thermal_data {
>  	/* Register shift and mask to access the sensor temperature */
>  	unsigned int temp_shift;
>  	unsigned int temp_mask;
> +	unsigned int thresh_shift;
> +	unsigned int hyst_shift;
> +	unsigned int hyst_mask;
>  	u32 is_valid_bit;
>  
>  	/* Syscon access */
>  	unsigned int syscon_control0_off;
>  	unsigned int syscon_control1_off;
>  	unsigned int syscon_status_off;
> +	unsigned int dfx_irq_cause_off;
> +	unsigned int dfx_irq_mask_off;
> +	unsigned int dfx_overheat_irq;
> +	unsigned int dfx_server_irq_mask_off;
> +	unsigned int dfx_server_irq_en;
>  
>  	/* One sensor is in the thermal IC, the others are in the CPUs if any */
>  	unsigned int cpu_nr;
> @@ -272,6 +293,41 @@ static bool armada_is_valid(struct armada_thermal_priv *priv)
>  	return reg & priv->data->is_valid_bit;
>  }
>  
> +static void armada_enable_overheat_interrupt(struct armada_thermal_priv *priv)
> +{
> +	struct armada_thermal_data *data = priv->data;
> +	u32 reg;
> +
> +	/* Clear DFX temperature IRQ cause */
> +	regmap_read(priv->syscon, data->dfx_irq_cause_off, &reg);
> +
> +	/* Enable DFX Temperature IRQ */
> +	regmap_read(priv->syscon, data->dfx_irq_mask_off, &reg);
> +	reg |= data->dfx_overheat_irq;
> +	regmap_write(priv->syscon, data->dfx_irq_mask_off, reg);
> +
> +	/* Enable DFX server IRQ */
> +	regmap_read(priv->syscon, data->dfx_server_irq_mask_off, &reg);
> +	reg |= data->dfx_server_irq_en;
> +	regmap_write(priv->syscon, data->dfx_server_irq_mask_off, reg);
> +
> +	/* Enable overheat interrupt */
> +	regmap_read(priv->syscon, data->syscon_control1_off, &reg);
> +	reg |= CONTROL1_TSEN_INT_EN;
> +	regmap_write(priv->syscon, data->syscon_control1_off, reg);
> +}
> +
> +static void __maybe_unused
> +armada_disable_overheat_interrupt(struct armada_thermal_priv *priv)
> +{
> +	struct armada_thermal_data *data = priv->data;
> +	u32 reg;
> +
> +	regmap_read(priv->syscon, data->syscon_control1_off, &reg);
> +	reg &= ~CONTROL1_TSEN_INT_EN;
> +	regmap_write(priv->syscon, data->syscon_control1_off, reg);
> +}
> +
>  /* There is currently no board with more than one sensor per channel */
>  static int armada_select_channel(struct armada_thermal_priv *priv, int channel)
>  {
> @@ -388,6 +444,16 @@ static int armada_get_temp(void *_sensor, int *temp)
>  
>  	/* Do the actual reading */
>  	ret = armada_read_sensor(priv, temp);
> +	if (ret)
> +		goto unlock_mutex;
> +
> +	/*
> +	 * Select back the interrupt source channel from which a potential
> +	 * critical trip point has been set.
> +	 */
> +	ret = armada_select_channel(priv, priv->interrupt_source);
> +	if (ret)
> +		goto unlock_mutex;

Do you really need a "goto" to the immediately following line?

>  
>  unlock_mutex:
>  	mutex_unlock(&priv->update_lock);
> @@ -399,6 +465,121 @@ static struct thermal_zone_of_device_ops of_ops = {
>  	.get_temp = armada_get_temp,
>  };
>  
> +static unsigned int armada_mc_to_reg_temp(struct armada_thermal_data *data,
> +					  unsigned int temp_mc)
> +{
> +	s64 b = data->coef_b;
> +	s64 m = data->coef_m;
> +	s64 div = data->coef_div;
> +	unsigned int sample;
> +
> +	if (data->inverted)
> +		sample = div_s64(((temp_mc * div) + b), m);
> +	else
> +		sample = div_s64((b - (temp_mc * div)), m);
> +
> +	return sample & data->temp_mask;
> +}
> +
> +static unsigned int armada_mc_to_reg_hyst(struct armada_thermal_data *data,
> +					  unsigned int hyst_mc)
> +{
> +	/*
> +	 * The documentation states:
> +	 * high/low watermark = threshold +/- 0.4761 * 2^(hysteresis + 2)
> +	 * which is the mathematical derivation for:
> +	 * 0x0 <=> 1.9?C, 0x1 <=> 3.8?C, 0x2 <=> 7.6?C, 0x3 <=> 15.2
> +	 */
> +	unsigned int hyst_levels_mc[] = {1900, 3800, 7600, 15200};

Why isn't this a static array rather than something you have to push on
the stack each time you execute this function?

> +	int i;
> +
> +	/*
> +	 * We will always take the smallest possible hysteresis to avoid risking
> +	 * the hardware integrity by enlarging the threshold by +8?C in the
> +	 * worst case.
> +	 */
> +	for (i = ARRAY_SIZE(hyst_levels_mc) - 1; i > 0; i--)
> +		if (hyst_mc >= hyst_levels_mc[i])
> +			break;
> +
> +	return i & data->hyst_mask;
> +}
> +
> +static void armada_set_overheat_thresholds(struct armada_thermal_priv *priv,
> +					   int thresh_mc, int hyst_mc)
> +{
> +	struct armada_thermal_data *data = priv->data;
> +	unsigned int threshold = armada_mc_to_reg_temp(data, thresh_mc);
> +	unsigned int hysteresis = armada_mc_to_reg_hyst(data, hyst_mc);
> +	u32 ctrl1;
> +
> +	regmap_read(priv->syscon, data->syscon_control1_off, &ctrl1);
> +
> +	/* Set Threshold */
> +	if (thresh_mc >= 0) {
> +		ctrl1 &= ~(data->temp_mask << data->thresh_shift);
> +		ctrl1 |= threshold << data->thresh_shift;
> +		priv->current_threshold = thresh_mc;
> +	}
> +
> +	/* Set Hysteresis */
> +	if (hyst_mc >= 0) {
> +		ctrl1 &= ~(data->hyst_mask << data->hyst_shift);
> +		ctrl1 |= hysteresis << data->hyst_shift;
> +		priv->current_hysteresis = hyst_mc;
> +	}
> +
> +	regmap_write(priv->syscon, data->syscon_control1_off, ctrl1);
> +}
> +
> +static irqreturn_t armada_overheat_isr(int irq, void *blob)
> +{
> +	/*
> +	 * Disable the IRQ and continue in thread context (thermal core
> +	 * notification and temperature monitoring).
> +	 */
> +	disable_irq_nosync(irq);
> +
> +	return IRQ_WAKE_THREAD;
> +}
> +
> +static irqreturn_t armada_overheat_isr_thread(int irq, void *blob)
> +{
> +	struct armada_thermal_priv *priv = (struct armada_thermal_priv *)blob;

Unnecessary cast.

> +	int low_threshold = priv->current_threshold - priv->current_hysteresis;
> +	int temperature;
> +	u32 dummy;
> +	int ret;
> +
> +	/* Notify the core in thread context */
> +	thermal_zone_device_update(priv->overheat_sensor,
> +				   THERMAL_EVENT_UNSPECIFIED);
> +
> +	/*
> +	 * The overheat interrupt must be cleared by reading the DFX interrupt
> +	 * cause _after_ the temperature has fallen down to the low threshold.
> +	 * Otherwise future interrupts might not be served.
> +	 */
> +	do {
> +		msleep(OVERHEAT_INT_POLL_DELAY_MS);
> +		mutex_lock(&priv->update_lock);
> +		ret = armada_read_sensor(priv, &temperature);
> +		mutex_unlock(&priv->update_lock);
> +		if (ret)
> +			return ret;

How will the interrupt fire again if you exit without re-enabling it? Is
"ret" guaranteed to be a valid irqreturn_t?

> +	} while (temperature >= low_threshold);
> +
> +	regmap_read(priv->syscon, priv->data->dfx_irq_cause_off, &dummy);
> +
> +	/* Notify the thermal core that the temperature is acceptable again */
> +	thermal_zone_device_update(priv->overheat_sensor,
> +				   THERMAL_EVENT_UNSPECIFIED);
> +
> +	enable_irq(irq);
> +
> +	return IRQ_HANDLED;
> +}
> +
>  static const struct armada_thermal_data armadaxp_data = {
>  	.init = armadaxp_init,
>  	.temp_shift = 10,
> @@ -454,6 +635,9 @@ static const struct armada_thermal_data armada_ap806_data = {
>  	.is_valid_bit = BIT(16),
>  	.temp_shift = 0,
>  	.temp_mask = 0x3ff,
> +	.thresh_shift = 3,
> +	.hyst_shift = 19,
> +	.hyst_mask = 0x3,
>  	.coef_b = -150000LL,
>  	.coef_m = 423ULL,
>  	.coef_div = 1,
> @@ -462,6 +646,11 @@ static const struct armada_thermal_data armada_ap806_data = {
>  	.syscon_control0_off = 0x84,
>  	.syscon_control1_off = 0x88,
>  	.syscon_status_off = 0x8C,
> +	.dfx_irq_cause_off = 0x108,
> +	.dfx_irq_mask_off = 0x10C,
> +	.dfx_overheat_irq = BIT(22),
> +	.dfx_server_irq_mask_off = 0x104,
> +	.dfx_server_irq_en = BIT(1),
>  	.cpu_nr = 4,
>  };
>  
> @@ -470,6 +659,9 @@ static const struct armada_thermal_data armada_cp110_data = {
>  	.is_valid_bit = BIT(10),
>  	.temp_shift = 0,
>  	.temp_mask = 0x3ff,
> +	.thresh_shift = 16,
> +	.hyst_shift = 26,
> +	.hyst_mask = 0x3,
>  	.coef_b = 1172499100ULL,
>  	.coef_m = 2000096ULL,
>  	.coef_div = 4201,
> @@ -477,6 +669,11 @@ static const struct armada_thermal_data armada_cp110_data = {
>  	.syscon_control0_off = 0x70,
>  	.syscon_control1_off = 0x74,
>  	.syscon_status_off = 0x78,
> +	.dfx_irq_cause_off = 0x108,
> +	.dfx_irq_mask_off = 0x10C,
> +	.dfx_overheat_irq = BIT(20),
> +	.dfx_server_irq_mask_off = 0x104,
> +	.dfx_server_irq_en = BIT(1),
>  };
>  
>  static const struct of_device_id armada_thermal_id_table[] = {
> @@ -592,6 +789,48 @@ static void armada_set_sane_name(struct platform_device *pdev,
>  	} while (insane_char);
>  }
>  
> +/*
> + * The IP can manage to trigger interrupts on overheat situation from all the
> + * sensors. However, the interrupt source changes along with the last selected
> + * source (ie. the last read sensor), which is an inconsistent behavior. Avoid
> + * possible glitches by always selecting back only one channel (arbitrarily: the
> + * first in the DT which has a critical trip point). We also disable sensor

The DT doesn't seem to *mandate* this. Should it? Does it also mean that
you can only have a single critical point that will generate an
interrupt on overheat?

> + * switch during overheat situations.
> + */
> +static int armada_configure_overheat_int(struct armada_thermal_priv *priv,
> +					 struct thermal_zone_device *tz,
> +					 int sensor_id)
> +{
> +	/* Retrieve the critical trip point to enable the overheat interrupt */
> +	const struct thermal_trip *trips = of_thermal_get_trip_points(tz);
> +	int ret;
> +	int i;
> +
> +	if (!trips)
> +		return -EINVAL;
> +
> +	for (i = 0; i < of_thermal_get_ntrips(tz); i++)
> +		if (trips[i].type == THERMAL_TRIP_CRITICAL)
> +			break;
> +
> +	if (i == of_thermal_get_ntrips(tz))
> +		return -EINVAL;
> +
> +	ret = armada_select_channel(priv, sensor_id);
> +	if (ret)
> +		return ret;
> +
> +	armada_set_overheat_thresholds(priv,
> +				       trips[i].temperature,
> +				       trips[i].hysteresis);
> +	priv->overheat_sensor = tz;
> +	priv->interrupt_source = sensor_id;
> +
> +	armada_enable_overheat_interrupt(priv);
> +
> +	return 0;
> +}
> +
>  static int armada_thermal_probe(struct platform_device *pdev)
>  {
>  	struct thermal_zone_device *tz;
> @@ -599,7 +838,7 @@ static int armada_thermal_probe(struct platform_device *pdev)
>  	struct armada_drvdata *drvdata;
>  	const struct of_device_id *match;
>  	struct armada_thermal_priv *priv;
> -	int sensor_id;
> +	int sensor_id, irq;
>  	int ret;
>  
>  	match = of_match_device(armada_thermal_id_table, &pdev->dev);
> @@ -669,6 +908,23 @@ static int armada_thermal_probe(struct platform_device *pdev)
>  	drvdata->data.priv = priv;
>  	platform_set_drvdata(pdev, drvdata);
>  
> +	irq = platform_get_irq(pdev, 0);
> +	if (irq == -EPROBE_DEFER)
> +		return irq;
> +
> +	/* The overheat interrupt feature is not mandatory */
> +	if (irq >= 0) {

0 is not a valid interrupt.

> +		ret = devm_request_threaded_irq(&pdev->dev, irq,
> +						armada_overheat_isr,
> +						armada_overheat_isr_thread,
> +						0, NULL, priv);
> +		if (ret) {
> +			dev_err(&pdev->dev, "Cannot request threaded IRQ %d\n",
> +				irq);
> +			return ret;
> +		}
> +	}
> +
>  	/*
>  	 * There is one channel for the IC and one per CPU (if any), each
>  	 * channel has one sensor.
> @@ -692,8 +948,20 @@ static int armada_thermal_probe(struct platform_device *pdev)
>  			devm_kfree(&pdev->dev, sensor);
>  			continue;
>  		}
> +
> +		/*
> +		 * The first channel that has a critical trip point registered
> +		 * in the DT will serve as interrupt source. Others possible
> +		 * critical trip points will simply be ignored by the driver.
> +		 */
> +		if (irq >= 0 && !priv->overheat_sensor)

Same here.

> +			armada_configure_overheat_int(priv, tz, sensor->id);
>  	}
>  
> +	/* Just complain if no overheat interrupt was set up */
> +	if (!priv->overheat_sensor)
> +		dev_warn(&pdev->dev, "Overheat interrupt not available\n");
> +
>  	return 0;
>  }
>  
> 

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/6] thermal: armada: add overheat interrupt support
  2018-11-23 16:55     ` Marc Zyngier
@ 2018-11-29 17:12       ` Eduardo Valentin
  -1 siblings, 0 replies; 19+ messages in thread
From: Eduardo Valentin @ 2018-11-29 17:12 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Mark Rutland, Andrew Lunn, Jason Cooper, devicetree,
	Antoine Tenart, Catalin Marinas, Gregory Clement, Daniel Lezcano,
	Will Deacon, Russell King, Maxime Chevallier, Nadav Haklai,
	David Sniatkiwicz, Rob Herring, Thomas Petazzoni, Miquel Raynal,
	linux-pm, Zhang Rui, linux-arm-kernel, Sebastian Hesselbarth

On Fri, Nov 23, 2018 at 04:55:59PM +0000, Marc Zyngier wrote:
> On 23/11/2018 16:17, Miquel Raynal wrote:
> > The IP can manage to trigger interrupts on overheat situation from all
> > the sensors.
> > 
> > However, the interrupt source changes along with the last selected
> > source (ie. the last read sensor), which is an inconsistent behavior.
> > Avoid possible glitches by always selecting back only one channel which
> > will then be referenced as the "overheat_sensor" (arbitrarily: the first
> > in the DT which has a critical trip point filled in).
> > 
> > It is possible that the scan of all thermal zone nodes did not bring a
> > critical trip point from which the overheat interrupt could be
> > configured. In this case just complain but do not fail the probe.
> > 
> > Also disable sensor switch during overheat situations because changing
> > the channel while the system is too hot could clear the overheat state
> > by changing the source while the temperature is still very high.
> > 
> > Even if the overheat state is not declared, overheat interrupt must be
> > cleared by reading the DFX interrupt cause _after_ the temperature has
> > fallen down to the low threshold, otherwise future possible interrupts
> > would not be served. A work polls the corresponding register until the
> > overheat flag gets cleared in this case.
> > 
> > Suggested-by: David Sniatkiwicz <davidsn@marvell.com>
> > Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
> > ---
> >  drivers/thermal/armada_thermal.c | 270 ++++++++++++++++++++++++++++++-
> >  1 file changed, 269 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/thermal/armada_thermal.c b/drivers/thermal/armada_thermal.c
> > index 92f67d40f2e9..55cfaee0da77 100644
> > --- a/drivers/thermal/armada_thermal.c
> > +++ b/drivers/thermal/armada_thermal.c
> > @@ -26,6 +26,11 @@
> >  #include <linux/iopoll.h>
> >  #include <linux/mfd/syscon.h>
> >  #include <linux/regmap.h>
> > +#include <linux/interrupt.h>
> > +
> > +#include "thermal_core.h"
> > +
> > +#define TO_MCELSIUS(c)			((c) * 1000)
> >  
> >  /* Thermal Manager Control and Status Register */
> >  #define PMU_TDC0_SW_RST_MASK		(0x1 << 1)
> > @@ -61,9 +66,13 @@
> >  #define CONTROL1_TSEN_AVG_MASK		0x7
> >  #define CONTROL1_EXT_TSEN_SW_RESET	BIT(7)
> >  #define CONTROL1_EXT_TSEN_HW_RESETn	BIT(8)
> > +#define CONTROL1_TSEN_INT_EN		BIT(25)
> > +#define CONTROL1_TSEN_SELECT_OFF	21
> > +#define CONTROL1_TSEN_SELECT_MASK	0x3
> >  
> >  #define STATUS_POLL_PERIOD_US		1000
> >  #define STATUS_POLL_TIMEOUT_US		100000
> > +#define OVERHEAT_INT_POLL_DELAY_MS	1000
> >  
> >  struct armada_thermal_data;
> >  
> > @@ -75,7 +84,11 @@ struct armada_thermal_priv {
> >  	/* serialize temperature reads/updates */
> >  	struct mutex update_lock;
> >  	struct armada_thermal_data *data;
> > +	struct thermal_zone_device *overheat_sensor;
> > +	int interrupt_source;
> >  	int current_channel;
> > +	long current_threshold;
> > +	long current_hysteresis;
> >  };
> >  
> >  struct armada_thermal_data {
> > @@ -93,12 +106,20 @@ struct armada_thermal_data {
> >  	/* Register shift and mask to access the sensor temperature */
> >  	unsigned int temp_shift;
> >  	unsigned int temp_mask;
> > +	unsigned int thresh_shift;
> > +	unsigned int hyst_shift;
> > +	unsigned int hyst_mask;
> >  	u32 is_valid_bit;
> >  
> >  	/* Syscon access */
> >  	unsigned int syscon_control0_off;
> >  	unsigned int syscon_control1_off;
> >  	unsigned int syscon_status_off;
> > +	unsigned int dfx_irq_cause_off;
> > +	unsigned int dfx_irq_mask_off;
> > +	unsigned int dfx_overheat_irq;
> > +	unsigned int dfx_server_irq_mask_off;
> > +	unsigned int dfx_server_irq_en;
> >  
> >  	/* One sensor is in the thermal IC, the others are in the CPUs if any */
> >  	unsigned int cpu_nr;
> > @@ -272,6 +293,41 @@ static bool armada_is_valid(struct armada_thermal_priv *priv)
> >  	return reg & priv->data->is_valid_bit;
> >  }
> >  
> > +static void armada_enable_overheat_interrupt(struct armada_thermal_priv *priv)
> > +{
> > +	struct armada_thermal_data *data = priv->data;
> > +	u32 reg;
> > +
> > +	/* Clear DFX temperature IRQ cause */
> > +	regmap_read(priv->syscon, data->dfx_irq_cause_off, &reg);
> > +
> > +	/* Enable DFX Temperature IRQ */
> > +	regmap_read(priv->syscon, data->dfx_irq_mask_off, &reg);
> > +	reg |= data->dfx_overheat_irq;
> > +	regmap_write(priv->syscon, data->dfx_irq_mask_off, reg);
> > +
> > +	/* Enable DFX server IRQ */
> > +	regmap_read(priv->syscon, data->dfx_server_irq_mask_off, &reg);
> > +	reg |= data->dfx_server_irq_en;
> > +	regmap_write(priv->syscon, data->dfx_server_irq_mask_off, reg);
> > +
> > +	/* Enable overheat interrupt */
> > +	regmap_read(priv->syscon, data->syscon_control1_off, &reg);
> > +	reg |= CONTROL1_TSEN_INT_EN;
> > +	regmap_write(priv->syscon, data->syscon_control1_off, reg);
> > +}
> > +
> > +static void __maybe_unused
> > +armada_disable_overheat_interrupt(struct armada_thermal_priv *priv)
> > +{
> > +	struct armada_thermal_data *data = priv->data;
> > +	u32 reg;
> > +
> > +	regmap_read(priv->syscon, data->syscon_control1_off, &reg);
> > +	reg &= ~CONTROL1_TSEN_INT_EN;
> > +	regmap_write(priv->syscon, data->syscon_control1_off, reg);
> > +}
> > +
> >  /* There is currently no board with more than one sensor per channel */
> >  static int armada_select_channel(struct armada_thermal_priv *priv, int channel)
> >  {
> > @@ -388,6 +444,16 @@ static int armada_get_temp(void *_sensor, int *temp)
> >  
> >  	/* Do the actual reading */
> >  	ret = armada_read_sensor(priv, temp);
> > +	if (ret)
> > +		goto unlock_mutex;
> > +
> > +	/*
> > +	 * Select back the interrupt source channel from which a potential
> > +	 * critical trip point has been set.
> > +	 */
> > +	ret = armada_select_channel(priv, priv->interrupt_source);
> > +	if (ret)
> > +		goto unlock_mutex;
> 
> Do you really need a "goto" to the immediately following line?
> 

Agreed, that goto can be removed..

> >  
> >  unlock_mutex:
> >  	mutex_unlock(&priv->update_lock);
> > @@ -399,6 +465,121 @@ static struct thermal_zone_of_device_ops of_ops = {
> >  	.get_temp = armada_get_temp,
> >  };
> >  
> > +static unsigned int armada_mc_to_reg_temp(struct armada_thermal_data *data,
> > +					  unsigned int temp_mc)
> > +{
> > +	s64 b = data->coef_b;
> > +	s64 m = data->coef_m;
> > +	s64 div = data->coef_div;
> > +	unsigned int sample;
> > +
> > +	if (data->inverted)
> > +		sample = div_s64(((temp_mc * div) + b), m);
> > +	else
> > +		sample = div_s64((b - (temp_mc * div)), m);
> > +
> > +	return sample & data->temp_mask;
> > +}
> > +
> > +static unsigned int armada_mc_to_reg_hyst(struct armada_thermal_data *data,
> > +					  unsigned int hyst_mc)
> > +{
> > +	/*
> > +	 * The documentation states:
> > +	 * high/low watermark = threshold +/- 0.4761 * 2^(hysteresis + 2)
> > +	 * which is the mathematical derivation for:
> > +	 * 0x0 <=> 1.9°C, 0x1 <=> 3.8°C, 0x2 <=> 7.6°C, 0x3 <=> 15.2
> > +	 */
> > +	unsigned int hyst_levels_mc[] = {1900, 3800, 7600, 15200};
> 
> Why isn't this a static array rather than something you have to push on
> the stack each time you execute this function?

Also, where those constants come from? Is this something the driver
needs to be updated when new chips are added for instance?

> 
> > +	int i;
> > +
> > +	/*
> > +	 * We will always take the smallest possible hysteresis to avoid risking
> > +	 * the hardware integrity by enlarging the threshold by +8°C in the
> > +	 * worst case.
> > +	 */
> > +	for (i = ARRAY_SIZE(hyst_levels_mc) - 1; i > 0; i--)
> > +		if (hyst_mc >= hyst_levels_mc[i])
> > +			break;
> > +
> > +	return i & data->hyst_mask;
> > +}
> > +
> > +static void armada_set_overheat_thresholds(struct armada_thermal_priv *priv,
> > +					   int thresh_mc, int hyst_mc)
> > +{
> > +	struct armada_thermal_data *data = priv->data;
> > +	unsigned int threshold = armada_mc_to_reg_temp(data, thresh_mc);
> > +	unsigned int hysteresis = armada_mc_to_reg_hyst(data, hyst_mc);
> > +	u32 ctrl1;
> > +
> > +	regmap_read(priv->syscon, data->syscon_control1_off, &ctrl1);
> > +
> > +	/* Set Threshold */
> > +	if (thresh_mc >= 0) {
> > +		ctrl1 &= ~(data->temp_mask << data->thresh_shift);
> > +		ctrl1 |= threshold << data->thresh_shift;
> > +		priv->current_threshold = thresh_mc;
> > +	}
> > +
> > +	/* Set Hysteresis */
> > +	if (hyst_mc >= 0) {
> > +		ctrl1 &= ~(data->hyst_mask << data->hyst_shift);
> > +		ctrl1 |= hysteresis << data->hyst_shift;
> > +		priv->current_hysteresis = hyst_mc;
> > +	}
> > +
> > +	regmap_write(priv->syscon, data->syscon_control1_off, ctrl1);
> > +}
> > +
> > +static irqreturn_t armada_overheat_isr(int irq, void *blob)
> > +{
> > +	/*
> > +	 * Disable the IRQ and continue in thread context (thermal core
> > +	 * notification and temperature monitoring).
> > +	 */
> > +	disable_irq_nosync(irq);
> > +
> > +	return IRQ_WAKE_THREAD;
> > +}
> > +
> > +static irqreturn_t armada_overheat_isr_thread(int irq, void *blob)
> > +{
> > +	struct armada_thermal_priv *priv = (struct armada_thermal_priv *)blob;
> 
> Unnecessary cast.
> 
> > +	int low_threshold = priv->current_threshold - priv->current_hysteresis;
> > +	int temperature;
> > +	u32 dummy;
> > +	int ret;
> > +
> > +	/* Notify the core in thread context */
> > +	thermal_zone_device_update(priv->overheat_sensor,
> > +				   THERMAL_EVENT_UNSPECIFIED);
> > +
> > +	/*
> > +	 * The overheat interrupt must be cleared by reading the DFX interrupt
> > +	 * cause _after_ the temperature has fallen down to the low threshold.
> > +	 * Otherwise future interrupts might not be served.
> > +	 */
> > +	do {
> > +		msleep(OVERHEAT_INT_POLL_DELAY_MS);
> > +		mutex_lock(&priv->update_lock);
> > +		ret = armada_read_sensor(priv, &temperature);
> > +		mutex_unlock(&priv->update_lock);
> > +		if (ret)
> > +			return ret;
> 
> How will the interrupt fire again if you exit without re-enabling it? Is
> "ret" guaranteed to be a valid irqreturn_t?
> 
> > +	} while (temperature >= low_threshold);
> > +
> > +	regmap_read(priv->syscon, priv->data->dfx_irq_cause_off, &dummy);
> > +
> > +	/* Notify the thermal core that the temperature is acceptable again */
> > +	thermal_zone_device_update(priv->overheat_sensor,
> > +				   THERMAL_EVENT_UNSPECIFIED);
> > +
> > +	enable_irq(irq);
> > +
> > +	return IRQ_HANDLED;
> > +}
> > +
> >  static const struct armada_thermal_data armadaxp_data = {
> >  	.init = armadaxp_init,
> >  	.temp_shift = 10,
> > @@ -454,6 +635,9 @@ static const struct armada_thermal_data armada_ap806_data = {
> >  	.is_valid_bit = BIT(16),
> >  	.temp_shift = 0,
> >  	.temp_mask = 0x3ff,
> > +	.thresh_shift = 3,
> > +	.hyst_shift = 19,
> > +	.hyst_mask = 0x3,
> >  	.coef_b = -150000LL,
> >  	.coef_m = 423ULL,
> >  	.coef_div = 1,
> > @@ -462,6 +646,11 @@ static const struct armada_thermal_data armada_ap806_data = {
> >  	.syscon_control0_off = 0x84,
> >  	.syscon_control1_off = 0x88,
> >  	.syscon_status_off = 0x8C,
> > +	.dfx_irq_cause_off = 0x108,
> > +	.dfx_irq_mask_off = 0x10C,
> > +	.dfx_overheat_irq = BIT(22),
> > +	.dfx_server_irq_mask_off = 0x104,
> > +	.dfx_server_irq_en = BIT(1),
> >  	.cpu_nr = 4,
> >  };
> >  
> > @@ -470,6 +659,9 @@ static const struct armada_thermal_data armada_cp110_data = {
> >  	.is_valid_bit = BIT(10),
> >  	.temp_shift = 0,
> >  	.temp_mask = 0x3ff,
> > +	.thresh_shift = 16,
> > +	.hyst_shift = 26,
> > +	.hyst_mask = 0x3,
> >  	.coef_b = 1172499100ULL,
> >  	.coef_m = 2000096ULL,
> >  	.coef_div = 4201,
> > @@ -477,6 +669,11 @@ static const struct armada_thermal_data armada_cp110_data = {
> >  	.syscon_control0_off = 0x70,
> >  	.syscon_control1_off = 0x74,
> >  	.syscon_status_off = 0x78,
> > +	.dfx_irq_cause_off = 0x108,
> > +	.dfx_irq_mask_off = 0x10C,
> > +	.dfx_overheat_irq = BIT(20),
> > +	.dfx_server_irq_mask_off = 0x104,
> > +	.dfx_server_irq_en = BIT(1),
> >  };
> >  
> >  static const struct of_device_id armada_thermal_id_table[] = {
> > @@ -592,6 +789,48 @@ static void armada_set_sane_name(struct platform_device *pdev,
> >  	} while (insane_char);
> >  }
> >  
> > +/*
> > + * The IP can manage to trigger interrupts on overheat situation from all the
> > + * sensors. However, the interrupt source changes along with the last selected
> > + * source (ie. the last read sensor), which is an inconsistent behavior. Avoid
> > + * possible glitches by always selecting back only one channel (arbitrarily: the
> > + * first in the DT which has a critical trip point). We also disable sensor
> 
> The DT doesn't seem to *mandate* this. Should it? Does it also mean that
> you can only have a single critical point that will generate an
> interrupt on overheat?
> 

Also, keep in mind that thermal core will shutdown the system when it
sees any zone with temp crossing the first critical trip.


> > + * switch during overheat situations.
> > + */
> > +static int armada_configure_overheat_int(struct armada_thermal_priv *priv,
> > +					 struct thermal_zone_device *tz,
> > +					 int sensor_id)
> > +{
> > +	/* Retrieve the critical trip point to enable the overheat interrupt */
> > +	const struct thermal_trip *trips = of_thermal_get_trip_points(tz);
> > +	int ret;
> > +	int i;
> > +
> > +	if (!trips)
> > +		return -EINVAL;
> > +
> > +	for (i = 0; i < of_thermal_get_ntrips(tz); i++)
> > +		if (trips[i].type == THERMAL_TRIP_CRITICAL)
> > +			break;
> > +
> > +	if (i == of_thermal_get_ntrips(tz))
> > +		return -EINVAL;
> > +
> > +	ret = armada_select_channel(priv, sensor_id);
> > +	if (ret)
> > +		return ret;
> > +
> > +	armada_set_overheat_thresholds(priv,
> > +				       trips[i].temperature,
> > +				       trips[i].hysteresis);
> > +	priv->overheat_sensor = tz;
> > +	priv->interrupt_source = sensor_id;
> > +
> > +	armada_enable_overheat_interrupt(priv);
> > +
> > +	return 0;
> > +}
> > +
> >  static int armada_thermal_probe(struct platform_device *pdev)
> >  {
> >  	struct thermal_zone_device *tz;
> > @@ -599,7 +838,7 @@ static int armada_thermal_probe(struct platform_device *pdev)
> >  	struct armada_drvdata *drvdata;
> >  	const struct of_device_id *match;
> >  	struct armada_thermal_priv *priv;
> > -	int sensor_id;
> > +	int sensor_id, irq;
> >  	int ret;
> >  
> >  	match = of_match_device(armada_thermal_id_table, &pdev->dev);
> > @@ -669,6 +908,23 @@ static int armada_thermal_probe(struct platform_device *pdev)
> >  	drvdata->data.priv = priv;
> >  	platform_set_drvdata(pdev, drvdata);
> >  
> > +	irq = platform_get_irq(pdev, 0);
> > +	if (irq == -EPROBE_DEFER)
> > +		return irq;
> > +
> > +	/* The overheat interrupt feature is not mandatory */
> > +	if (irq >= 0) {
> 
> 0 is not a valid interrupt.
> 
> > +		ret = devm_request_threaded_irq(&pdev->dev, irq,
> > +						armada_overheat_isr,
> > +						armada_overheat_isr_thread,
> > +						0, NULL, priv);
> > +		if (ret) {
> > +			dev_err(&pdev->dev, "Cannot request threaded IRQ %d\n",
> > +				irq);
> > +			return ret;
> > +		}
> > +	}
> > +
> >  	/*
> >  	 * There is one channel for the IC and one per CPU (if any), each
> >  	 * channel has one sensor.
> > @@ -692,8 +948,20 @@ static int armada_thermal_probe(struct platform_device *pdev)
> >  			devm_kfree(&pdev->dev, sensor);
> >  			continue;
> >  		}
> > +
> > +		/*
> > +		 * The first channel that has a critical trip point registered
> > +		 * in the DT will serve as interrupt source. Others possible
> > +		 * critical trip points will simply be ignored by the driver.
> > +		 */
> > +		if (irq >= 0 && !priv->overheat_sensor)
> 
> Same here.
> 
> > +			armada_configure_overheat_int(priv, tz, sensor->id);
> >  	}
> >  
> > +	/* Just complain if no overheat interrupt was set up */
> > +	if (!priv->overheat_sensor)
> > +		dev_warn(&pdev->dev, "Overheat interrupt not available\n");
> > +
> >  	return 0;
> >  }
> >  
> > 
> 
> Thanks,
> 
> 	M.
> -- 
> Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/6] thermal: armada: add overheat interrupt support
@ 2018-11-29 17:12       ` Eduardo Valentin
  0 siblings, 0 replies; 19+ messages in thread
From: Eduardo Valentin @ 2018-11-29 17:12 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Mark Rutland, Andrew Lunn, Jason Cooper, devicetree,
	Antoine Tenart, Catalin Marinas, Gregory Clement, Daniel Lezcano,
	Will Deacon, Russell King, Maxime Chevallier, Nadav Haklai,
	David Sniatkiwicz, Rob Herring, Thomas Petazzoni, Miquel Raynal,
	linux-pm, Zhang Rui, linux-arm-kernel, Sebastian Hesselbarth

On Fri, Nov 23, 2018 at 04:55:59PM +0000, Marc Zyngier wrote:
> On 23/11/2018 16:17, Miquel Raynal wrote:
> > The IP can manage to trigger interrupts on overheat situation from all
> > the sensors.
> > 
> > However, the interrupt source changes along with the last selected
> > source (ie. the last read sensor), which is an inconsistent behavior.
> > Avoid possible glitches by always selecting back only one channel which
> > will then be referenced as the "overheat_sensor" (arbitrarily: the first
> > in the DT which has a critical trip point filled in).
> > 
> > It is possible that the scan of all thermal zone nodes did not bring a
> > critical trip point from which the overheat interrupt could be
> > configured. In this case just complain but do not fail the probe.
> > 
> > Also disable sensor switch during overheat situations because changing
> > the channel while the system is too hot could clear the overheat state
> > by changing the source while the temperature is still very high.
> > 
> > Even if the overheat state is not declared, overheat interrupt must be
> > cleared by reading the DFX interrupt cause _after_ the temperature has
> > fallen down to the low threshold, otherwise future possible interrupts
> > would not be served. A work polls the corresponding register until the
> > overheat flag gets cleared in this case.
> > 
> > Suggested-by: David Sniatkiwicz <davidsn@marvell.com>
> > Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
> > ---
> >  drivers/thermal/armada_thermal.c | 270 ++++++++++++++++++++++++++++++-
> >  1 file changed, 269 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/thermal/armada_thermal.c b/drivers/thermal/armada_thermal.c
> > index 92f67d40f2e9..55cfaee0da77 100644
> > --- a/drivers/thermal/armada_thermal.c
> > +++ b/drivers/thermal/armada_thermal.c
> > @@ -26,6 +26,11 @@
> >  #include <linux/iopoll.h>
> >  #include <linux/mfd/syscon.h>
> >  #include <linux/regmap.h>
> > +#include <linux/interrupt.h>
> > +
> > +#include "thermal_core.h"
> > +
> > +#define TO_MCELSIUS(c)			((c) * 1000)
> >  
> >  /* Thermal Manager Control and Status Register */
> >  #define PMU_TDC0_SW_RST_MASK		(0x1 << 1)
> > @@ -61,9 +66,13 @@
> >  #define CONTROL1_TSEN_AVG_MASK		0x7
> >  #define CONTROL1_EXT_TSEN_SW_RESET	BIT(7)
> >  #define CONTROL1_EXT_TSEN_HW_RESETn	BIT(8)
> > +#define CONTROL1_TSEN_INT_EN		BIT(25)
> > +#define CONTROL1_TSEN_SELECT_OFF	21
> > +#define CONTROL1_TSEN_SELECT_MASK	0x3
> >  
> >  #define STATUS_POLL_PERIOD_US		1000
> >  #define STATUS_POLL_TIMEOUT_US		100000
> > +#define OVERHEAT_INT_POLL_DELAY_MS	1000
> >  
> >  struct armada_thermal_data;
> >  
> > @@ -75,7 +84,11 @@ struct armada_thermal_priv {
> >  	/* serialize temperature reads/updates */
> >  	struct mutex update_lock;
> >  	struct armada_thermal_data *data;
> > +	struct thermal_zone_device *overheat_sensor;
> > +	int interrupt_source;
> >  	int current_channel;
> > +	long current_threshold;
> > +	long current_hysteresis;
> >  };
> >  
> >  struct armada_thermal_data {
> > @@ -93,12 +106,20 @@ struct armada_thermal_data {
> >  	/* Register shift and mask to access the sensor temperature */
> >  	unsigned int temp_shift;
> >  	unsigned int temp_mask;
> > +	unsigned int thresh_shift;
> > +	unsigned int hyst_shift;
> > +	unsigned int hyst_mask;
> >  	u32 is_valid_bit;
> >  
> >  	/* Syscon access */
> >  	unsigned int syscon_control0_off;
> >  	unsigned int syscon_control1_off;
> >  	unsigned int syscon_status_off;
> > +	unsigned int dfx_irq_cause_off;
> > +	unsigned int dfx_irq_mask_off;
> > +	unsigned int dfx_overheat_irq;
> > +	unsigned int dfx_server_irq_mask_off;
> > +	unsigned int dfx_server_irq_en;
> >  
> >  	/* One sensor is in the thermal IC, the others are in the CPUs if any */
> >  	unsigned int cpu_nr;
> > @@ -272,6 +293,41 @@ static bool armada_is_valid(struct armada_thermal_priv *priv)
> >  	return reg & priv->data->is_valid_bit;
> >  }
> >  
> > +static void armada_enable_overheat_interrupt(struct armada_thermal_priv *priv)
> > +{
> > +	struct armada_thermal_data *data = priv->data;
> > +	u32 reg;
> > +
> > +	/* Clear DFX temperature IRQ cause */
> > +	regmap_read(priv->syscon, data->dfx_irq_cause_off, &reg);
> > +
> > +	/* Enable DFX Temperature IRQ */
> > +	regmap_read(priv->syscon, data->dfx_irq_mask_off, &reg);
> > +	reg |= data->dfx_overheat_irq;
> > +	regmap_write(priv->syscon, data->dfx_irq_mask_off, reg);
> > +
> > +	/* Enable DFX server IRQ */
> > +	regmap_read(priv->syscon, data->dfx_server_irq_mask_off, &reg);
> > +	reg |= data->dfx_server_irq_en;
> > +	regmap_write(priv->syscon, data->dfx_server_irq_mask_off, reg);
> > +
> > +	/* Enable overheat interrupt */
> > +	regmap_read(priv->syscon, data->syscon_control1_off, &reg);
> > +	reg |= CONTROL1_TSEN_INT_EN;
> > +	regmap_write(priv->syscon, data->syscon_control1_off, reg);
> > +}
> > +
> > +static void __maybe_unused
> > +armada_disable_overheat_interrupt(struct armada_thermal_priv *priv)
> > +{
> > +	struct armada_thermal_data *data = priv->data;
> > +	u32 reg;
> > +
> > +	regmap_read(priv->syscon, data->syscon_control1_off, &reg);
> > +	reg &= ~CONTROL1_TSEN_INT_EN;
> > +	regmap_write(priv->syscon, data->syscon_control1_off, reg);
> > +}
> > +
> >  /* There is currently no board with more than one sensor per channel */
> >  static int armada_select_channel(struct armada_thermal_priv *priv, int channel)
> >  {
> > @@ -388,6 +444,16 @@ static int armada_get_temp(void *_sensor, int *temp)
> >  
> >  	/* Do the actual reading */
> >  	ret = armada_read_sensor(priv, temp);
> > +	if (ret)
> > +		goto unlock_mutex;
> > +
> > +	/*
> > +	 * Select back the interrupt source channel from which a potential
> > +	 * critical trip point has been set.
> > +	 */
> > +	ret = armada_select_channel(priv, priv->interrupt_source);
> > +	if (ret)
> > +		goto unlock_mutex;
> 
> Do you really need a "goto" to the immediately following line?
> 

Agreed, that goto can be removed..

> >  
> >  unlock_mutex:
> >  	mutex_unlock(&priv->update_lock);
> > @@ -399,6 +465,121 @@ static struct thermal_zone_of_device_ops of_ops = {
> >  	.get_temp = armada_get_temp,
> >  };
> >  
> > +static unsigned int armada_mc_to_reg_temp(struct armada_thermal_data *data,
> > +					  unsigned int temp_mc)
> > +{
> > +	s64 b = data->coef_b;
> > +	s64 m = data->coef_m;
> > +	s64 div = data->coef_div;
> > +	unsigned int sample;
> > +
> > +	if (data->inverted)
> > +		sample = div_s64(((temp_mc * div) + b), m);
> > +	else
> > +		sample = div_s64((b - (temp_mc * div)), m);
> > +
> > +	return sample & data->temp_mask;
> > +}
> > +
> > +static unsigned int armada_mc_to_reg_hyst(struct armada_thermal_data *data,
> > +					  unsigned int hyst_mc)
> > +{
> > +	/*
> > +	 * The documentation states:
> > +	 * high/low watermark = threshold +/- 0.4761 * 2^(hysteresis + 2)
> > +	 * which is the mathematical derivation for:
> > +	 * 0x0 <=> 1.9°C, 0x1 <=> 3.8°C, 0x2 <=> 7.6°C, 0x3 <=> 15.2
> > +	 */
> > +	unsigned int hyst_levels_mc[] = {1900, 3800, 7600, 15200};
> 
> Why isn't this a static array rather than something you have to push on
> the stack each time you execute this function?

Also, where those constants come from? Is this something the driver
needs to be updated when new chips are added for instance?

> 
> > +	int i;
> > +
> > +	/*
> > +	 * We will always take the smallest possible hysteresis to avoid risking
> > +	 * the hardware integrity by enlarging the threshold by +8°C in the
> > +	 * worst case.
> > +	 */
> > +	for (i = ARRAY_SIZE(hyst_levels_mc) - 1; i > 0; i--)
> > +		if (hyst_mc >= hyst_levels_mc[i])
> > +			break;
> > +
> > +	return i & data->hyst_mask;
> > +}
> > +
> > +static void armada_set_overheat_thresholds(struct armada_thermal_priv *priv,
> > +					   int thresh_mc, int hyst_mc)
> > +{
> > +	struct armada_thermal_data *data = priv->data;
> > +	unsigned int threshold = armada_mc_to_reg_temp(data, thresh_mc);
> > +	unsigned int hysteresis = armada_mc_to_reg_hyst(data, hyst_mc);
> > +	u32 ctrl1;
> > +
> > +	regmap_read(priv->syscon, data->syscon_control1_off, &ctrl1);
> > +
> > +	/* Set Threshold */
> > +	if (thresh_mc >= 0) {
> > +		ctrl1 &= ~(data->temp_mask << data->thresh_shift);
> > +		ctrl1 |= threshold << data->thresh_shift;
> > +		priv->current_threshold = thresh_mc;
> > +	}
> > +
> > +	/* Set Hysteresis */
> > +	if (hyst_mc >= 0) {
> > +		ctrl1 &= ~(data->hyst_mask << data->hyst_shift);
> > +		ctrl1 |= hysteresis << data->hyst_shift;
> > +		priv->current_hysteresis = hyst_mc;
> > +	}
> > +
> > +	regmap_write(priv->syscon, data->syscon_control1_off, ctrl1);
> > +}
> > +
> > +static irqreturn_t armada_overheat_isr(int irq, void *blob)
> > +{
> > +	/*
> > +	 * Disable the IRQ and continue in thread context (thermal core
> > +	 * notification and temperature monitoring).
> > +	 */
> > +	disable_irq_nosync(irq);
> > +
> > +	return IRQ_WAKE_THREAD;
> > +}
> > +
> > +static irqreturn_t armada_overheat_isr_thread(int irq, void *blob)
> > +{
> > +	struct armada_thermal_priv *priv = (struct armada_thermal_priv *)blob;
> 
> Unnecessary cast.
> 
> > +	int low_threshold = priv->current_threshold - priv->current_hysteresis;
> > +	int temperature;
> > +	u32 dummy;
> > +	int ret;
> > +
> > +	/* Notify the core in thread context */
> > +	thermal_zone_device_update(priv->overheat_sensor,
> > +				   THERMAL_EVENT_UNSPECIFIED);
> > +
> > +	/*
> > +	 * The overheat interrupt must be cleared by reading the DFX interrupt
> > +	 * cause _after_ the temperature has fallen down to the low threshold.
> > +	 * Otherwise future interrupts might not be served.
> > +	 */
> > +	do {
> > +		msleep(OVERHEAT_INT_POLL_DELAY_MS);
> > +		mutex_lock(&priv->update_lock);
> > +		ret = armada_read_sensor(priv, &temperature);
> > +		mutex_unlock(&priv->update_lock);
> > +		if (ret)
> > +			return ret;
> 
> How will the interrupt fire again if you exit without re-enabling it? Is
> "ret" guaranteed to be a valid irqreturn_t?
> 
> > +	} while (temperature >= low_threshold);
> > +
> > +	regmap_read(priv->syscon, priv->data->dfx_irq_cause_off, &dummy);
> > +
> > +	/* Notify the thermal core that the temperature is acceptable again */
> > +	thermal_zone_device_update(priv->overheat_sensor,
> > +				   THERMAL_EVENT_UNSPECIFIED);
> > +
> > +	enable_irq(irq);
> > +
> > +	return IRQ_HANDLED;
> > +}
> > +
> >  static const struct armada_thermal_data armadaxp_data = {
> >  	.init = armadaxp_init,
> >  	.temp_shift = 10,
> > @@ -454,6 +635,9 @@ static const struct armada_thermal_data armada_ap806_data = {
> >  	.is_valid_bit = BIT(16),
> >  	.temp_shift = 0,
> >  	.temp_mask = 0x3ff,
> > +	.thresh_shift = 3,
> > +	.hyst_shift = 19,
> > +	.hyst_mask = 0x3,
> >  	.coef_b = -150000LL,
> >  	.coef_m = 423ULL,
> >  	.coef_div = 1,
> > @@ -462,6 +646,11 @@ static const struct armada_thermal_data armada_ap806_data = {
> >  	.syscon_control0_off = 0x84,
> >  	.syscon_control1_off = 0x88,
> >  	.syscon_status_off = 0x8C,
> > +	.dfx_irq_cause_off = 0x108,
> > +	.dfx_irq_mask_off = 0x10C,
> > +	.dfx_overheat_irq = BIT(22),
> > +	.dfx_server_irq_mask_off = 0x104,
> > +	.dfx_server_irq_en = BIT(1),
> >  	.cpu_nr = 4,
> >  };
> >  
> > @@ -470,6 +659,9 @@ static const struct armada_thermal_data armada_cp110_data = {
> >  	.is_valid_bit = BIT(10),
> >  	.temp_shift = 0,
> >  	.temp_mask = 0x3ff,
> > +	.thresh_shift = 16,
> > +	.hyst_shift = 26,
> > +	.hyst_mask = 0x3,
> >  	.coef_b = 1172499100ULL,
> >  	.coef_m = 2000096ULL,
> >  	.coef_div = 4201,
> > @@ -477,6 +669,11 @@ static const struct armada_thermal_data armada_cp110_data = {
> >  	.syscon_control0_off = 0x70,
> >  	.syscon_control1_off = 0x74,
> >  	.syscon_status_off = 0x78,
> > +	.dfx_irq_cause_off = 0x108,
> > +	.dfx_irq_mask_off = 0x10C,
> > +	.dfx_overheat_irq = BIT(20),
> > +	.dfx_server_irq_mask_off = 0x104,
> > +	.dfx_server_irq_en = BIT(1),
> >  };
> >  
> >  static const struct of_device_id armada_thermal_id_table[] = {
> > @@ -592,6 +789,48 @@ static void armada_set_sane_name(struct platform_device *pdev,
> >  	} while (insane_char);
> >  }
> >  
> > +/*
> > + * The IP can manage to trigger interrupts on overheat situation from all the
> > + * sensors. However, the interrupt source changes along with the last selected
> > + * source (ie. the last read sensor), which is an inconsistent behavior. Avoid
> > + * possible glitches by always selecting back only one channel (arbitrarily: the
> > + * first in the DT which has a critical trip point). We also disable sensor
> 
> The DT doesn't seem to *mandate* this. Should it? Does it also mean that
> you can only have a single critical point that will generate an
> interrupt on overheat?
> 

Also, keep in mind that thermal core will shutdown the system when it
sees any zone with temp crossing the first critical trip.


> > + * switch during overheat situations.
> > + */
> > +static int armada_configure_overheat_int(struct armada_thermal_priv *priv,
> > +					 struct thermal_zone_device *tz,
> > +					 int sensor_id)
> > +{
> > +	/* Retrieve the critical trip point to enable the overheat interrupt */
> > +	const struct thermal_trip *trips = of_thermal_get_trip_points(tz);
> > +	int ret;
> > +	int i;
> > +
> > +	if (!trips)
> > +		return -EINVAL;
> > +
> > +	for (i = 0; i < of_thermal_get_ntrips(tz); i++)
> > +		if (trips[i].type == THERMAL_TRIP_CRITICAL)
> > +			break;
> > +
> > +	if (i == of_thermal_get_ntrips(tz))
> > +		return -EINVAL;
> > +
> > +	ret = armada_select_channel(priv, sensor_id);
> > +	if (ret)
> > +		return ret;
> > +
> > +	armada_set_overheat_thresholds(priv,
> > +				       trips[i].temperature,
> > +				       trips[i].hysteresis);
> > +	priv->overheat_sensor = tz;
> > +	priv->interrupt_source = sensor_id;
> > +
> > +	armada_enable_overheat_interrupt(priv);
> > +
> > +	return 0;
> > +}
> > +
> >  static int armada_thermal_probe(struct platform_device *pdev)
> >  {
> >  	struct thermal_zone_device *tz;
> > @@ -599,7 +838,7 @@ static int armada_thermal_probe(struct platform_device *pdev)
> >  	struct armada_drvdata *drvdata;
> >  	const struct of_device_id *match;
> >  	struct armada_thermal_priv *priv;
> > -	int sensor_id;
> > +	int sensor_id, irq;
> >  	int ret;
> >  
> >  	match = of_match_device(armada_thermal_id_table, &pdev->dev);
> > @@ -669,6 +908,23 @@ static int armada_thermal_probe(struct platform_device *pdev)
> >  	drvdata->data.priv = priv;
> >  	platform_set_drvdata(pdev, drvdata);
> >  
> > +	irq = platform_get_irq(pdev, 0);
> > +	if (irq == -EPROBE_DEFER)
> > +		return irq;
> > +
> > +	/* The overheat interrupt feature is not mandatory */
> > +	if (irq >= 0) {
> 
> 0 is not a valid interrupt.
> 
> > +		ret = devm_request_threaded_irq(&pdev->dev, irq,
> > +						armada_overheat_isr,
> > +						armada_overheat_isr_thread,
> > +						0, NULL, priv);
> > +		if (ret) {
> > +			dev_err(&pdev->dev, "Cannot request threaded IRQ %d\n",
> > +				irq);
> > +			return ret;
> > +		}
> > +	}
> > +
> >  	/*
> >  	 * There is one channel for the IC and one per CPU (if any), each
> >  	 * channel has one sensor.
> > @@ -692,8 +948,20 @@ static int armada_thermal_probe(struct platform_device *pdev)
> >  			devm_kfree(&pdev->dev, sensor);
> >  			continue;
> >  		}
> > +
> > +		/*
> > +		 * The first channel that has a critical trip point registered
> > +		 * in the DT will serve as interrupt source. Others possible
> > +		 * critical trip points will simply be ignored by the driver.
> > +		 */
> > +		if (irq >= 0 && !priv->overheat_sensor)
> 
> Same here.
> 
> > +			armada_configure_overheat_int(priv, tz, sensor->id);
> >  	}
> >  
> > +	/* Just complain if no overheat interrupt was set up */
> > +	if (!priv->overheat_sensor)
> > +		dev_warn(&pdev->dev, "Overheat interrupt not available\n");
> > +
> >  	return 0;
> >  }
> >  
> > 
> 
> Thanks,
> 
> 	M.
> -- 
> Jazz is not dead. It just smells funny...

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/6] thermal: armada: add overheat interrupt support
  2018-11-29 17:12       ` Eduardo Valentin
  (?)
@ 2018-12-03 15:31       ` Miquel Raynal
  -1 siblings, 0 replies; 19+ messages in thread
From: Miquel Raynal @ 2018-12-03 15:31 UTC (permalink / raw)
  To: Eduardo Valentin
  Cc: Mark Rutland, Andrew Lunn, Jason Cooper, devicetree,
	Marc Zyngier, Catalin Marinas, Gregory Clement, Daniel Lezcano,
	Will Deacon, Russell King, Maxime Chevallier, Nadav Haklai,
	Antoine Tenart, David Sniatkiwicz, Rob Herring, Thomas Petazzoni,
	linux-pm, Zhang Rui, linux-arm-kernel, Sebastian Hesselbarth

Hi Marc, Eduardo,

Thanks for the reviews! Two questions below.

Eduardo Valentin <edubezval@gmail.com> wrote on Thu, 29 Nov 2018
09:12:31 -0800:

> On Fri, Nov 23, 2018 at 04:55:59PM +0000, Marc Zyngier wrote:
> > On 23/11/2018 16:17, Miquel Raynal wrote:  
> > > The IP can manage to trigger interrupts on overheat situation from all
> > > the sensors.
> > > 
> > > However, the interrupt source changes along with the last selected
> > > source (ie. the last read sensor), which is an inconsistent behavior.
> > > Avoid possible glitches by always selecting back only one channel which
> > > will then be referenced as the "overheat_sensor" (arbitrarily: the first
> > > in the DT which has a critical trip point filled in).
> > > 
> > > It is possible that the scan of all thermal zone nodes did not bring a
> > > critical trip point from which the overheat interrupt could be
> > > configured. In this case just complain but do not fail the probe.
> > > 
> > > Also disable sensor switch during overheat situations because changing
> > > the channel while the system is too hot could clear the overheat state
> > > by changing the source while the temperature is still very high.
> > > 
> > > Even if the overheat state is not declared, overheat interrupt must be
> > > cleared by reading the DFX interrupt cause _after_ the temperature has
> > > fallen down to the low threshold, otherwise future possible interrupts
> > > would not be served. A work polls the corresponding register until the
> > > overheat flag gets cleared in this case.
> > > 
> > > Suggested-by: David Sniatkiwicz <davidsn@marvell.com>
> > > Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
> > > ---
> > >  drivers/thermal/armada_thermal.c | 270 ++++++++++++++++++++++++++++++-
> > >  1 file changed, 269 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/thermal/armada_thermal.c b/drivers/thermal/armada_thermal.c
> > > index 92f67d40f2e9..55cfaee0da77 100644
> > > --- a/drivers/thermal/armada_thermal.c
> > > +++ b/drivers/thermal/armada_thermal.c
> > > @@ -26,6 +26,11 @@
> > >  #include <linux/iopoll.h>
> > >  #include <linux/mfd/syscon.h>
> > >  #include <linux/regmap.h>
> > > +#include <linux/interrupt.h>
> > > +
> > > +#include "thermal_core.h"
> > > +
> > > +#define TO_MCELSIUS(c)			((c) * 1000)
> > >  
> > >  /* Thermal Manager Control and Status Register */
> > >  #define PMU_TDC0_SW_RST_MASK		(0x1 << 1)
> > > @@ -61,9 +66,13 @@
> > >  #define CONTROL1_TSEN_AVG_MASK		0x7
> > >  #define CONTROL1_EXT_TSEN_SW_RESET	BIT(7)
> > >  #define CONTROL1_EXT_TSEN_HW_RESETn	BIT(8)
> > > +#define CONTROL1_TSEN_INT_EN		BIT(25)
> > > +#define CONTROL1_TSEN_SELECT_OFF	21
> > > +#define CONTROL1_TSEN_SELECT_MASK	0x3
> > >  
> > >  #define STATUS_POLL_PERIOD_US		1000
> > >  #define STATUS_POLL_TIMEOUT_US		100000
> > > +#define OVERHEAT_INT_POLL_DELAY_MS	1000
> > >  
> > >  struct armada_thermal_data;
> > >  
> > > @@ -75,7 +84,11 @@ struct armada_thermal_priv {
> > >  	/* serialize temperature reads/updates */
> > >  	struct mutex update_lock;
> > >  	struct armada_thermal_data *data;
> > > +	struct thermal_zone_device *overheat_sensor;
> > > +	int interrupt_source;
> > >  	int current_channel;
> > > +	long current_threshold;
> > > +	long current_hysteresis;
> > >  };
> > >  
> > >  struct armada_thermal_data {
> > > @@ -93,12 +106,20 @@ struct armada_thermal_data {
> > >  	/* Register shift and mask to access the sensor temperature */
> > >  	unsigned int temp_shift;
> > >  	unsigned int temp_mask;
> > > +	unsigned int thresh_shift;
> > > +	unsigned int hyst_shift;
> > > +	unsigned int hyst_mask;
> > >  	u32 is_valid_bit;
> > >  
> > >  	/* Syscon access */
> > >  	unsigned int syscon_control0_off;
> > >  	unsigned int syscon_control1_off;
> > >  	unsigned int syscon_status_off;
> > > +	unsigned int dfx_irq_cause_off;
> > > +	unsigned int dfx_irq_mask_off;
> > > +	unsigned int dfx_overheat_irq;
> > > +	unsigned int dfx_server_irq_mask_off;
> > > +	unsigned int dfx_server_irq_en;
> > >  
> > >  	/* One sensor is in the thermal IC, the others are in the CPUs if any */
> > >  	unsigned int cpu_nr;
> > > @@ -272,6 +293,41 @@ static bool armada_is_valid(struct armada_thermal_priv *priv)
> > >  	return reg & priv->data->is_valid_bit;
> > >  }
> > >  
> > > +static void armada_enable_overheat_interrupt(struct armada_thermal_priv *priv)
> > > +{
> > > +	struct armada_thermal_data *data = priv->data;
> > > +	u32 reg;
> > > +
> > > +	/* Clear DFX temperature IRQ cause */
> > > +	regmap_read(priv->syscon, data->dfx_irq_cause_off, &reg);
> > > +
> > > +	/* Enable DFX Temperature IRQ */
> > > +	regmap_read(priv->syscon, data->dfx_irq_mask_off, &reg);
> > > +	reg |= data->dfx_overheat_irq;
> > > +	regmap_write(priv->syscon, data->dfx_irq_mask_off, reg);
> > > +
> > > +	/* Enable DFX server IRQ */
> > > +	regmap_read(priv->syscon, data->dfx_server_irq_mask_off, &reg);
> > > +	reg |= data->dfx_server_irq_en;
> > > +	regmap_write(priv->syscon, data->dfx_server_irq_mask_off, reg);
> > > +
> > > +	/* Enable overheat interrupt */
> > > +	regmap_read(priv->syscon, data->syscon_control1_off, &reg);
> > > +	reg |= CONTROL1_TSEN_INT_EN;
> > > +	regmap_write(priv->syscon, data->syscon_control1_off, reg);
> > > +}
> > > +
> > > +static void __maybe_unused
> > > +armada_disable_overheat_interrupt(struct armada_thermal_priv *priv)
> > > +{
> > > +	struct armada_thermal_data *data = priv->data;
> > > +	u32 reg;
> > > +
> > > +	regmap_read(priv->syscon, data->syscon_control1_off, &reg);
> > > +	reg &= ~CONTROL1_TSEN_INT_EN;
> > > +	regmap_write(priv->syscon, data->syscon_control1_off, reg);
> > > +}
> > > +
> > >  /* There is currently no board with more than one sensor per channel */
> > >  static int armada_select_channel(struct armada_thermal_priv *priv, int channel)
> > >  {
> > > @@ -388,6 +444,16 @@ static int armada_get_temp(void *_sensor, int *temp)
> > >  
> > >  	/* Do the actual reading */
> > >  	ret = armada_read_sensor(priv, temp);
> > > +	if (ret)
> > > +		goto unlock_mutex;
> > > +
> > > +	/*
> > > +	 * Select back the interrupt source channel from which a potential
> > > +	 * critical trip point has been set.
> > > +	 */
> > > +	ret = armada_select_channel(priv, priv->interrupt_source);
> > > +	if (ret)
> > > +		goto unlock_mutex;  
> > 
> > Do you really need a "goto" to the immediately following line?
> >   
> 
> Agreed, that goto can be removed..

Sure.

> 
> > >  
> > >  unlock_mutex:
> > >  	mutex_unlock(&priv->update_lock);
> > > @@ -399,6 +465,121 @@ static struct thermal_zone_of_device_ops of_ops = {
> > >  	.get_temp = armada_get_temp,
> > >  };
> > >  
> > > +static unsigned int armada_mc_to_reg_temp(struct armada_thermal_data *data,
> > > +					  unsigned int temp_mc)
> > > +{
> > > +	s64 b = data->coef_b;
> > > +	s64 m = data->coef_m;
> > > +	s64 div = data->coef_div;
> > > +	unsigned int sample;
> > > +
> > > +	if (data->inverted)
> > > +		sample = div_s64(((temp_mc * div) + b), m);
> > > +	else
> > > +		sample = div_s64((b - (temp_mc * div)), m);
> > > +
> > > +	return sample & data->temp_mask;
> > > +}
> > > +
> > > +static unsigned int armada_mc_to_reg_hyst(struct armada_thermal_data *data,
> > > +					  unsigned int hyst_mc)
> > > +{
> > > +	/*
> > > +	 * The documentation states:
> > > +	 * high/low watermark = threshold +/- 0.4761 * 2^(hysteresis + 2)
> > > +	 * which is the mathematical derivation for:
> > > +	 * 0x0 <=> 1.9°C, 0x1 <=> 3.8°C, 0x2 <=> 7.6°C, 0x3 <=> 15.2
> > > +	 */
> > > +	unsigned int hyst_levels_mc[] = {1900, 3800, 7600, 15200};  
> > 
> > Why isn't this a static array rather than something you have to push on
> > the stack each time you execute this function?  

Yes, this should definitely be declared in a static array.

> 
> Also, where those constants come from? Is this something the driver
> needs to be updated when new chips are added for instance?

Well, I can't tell for sure. These values are derived thanks to the
mathematical function in the comment. I don't think this is subject to
change in the future but you never know. I propose to just have one
static array for now, if we need to tweak these parameters for similar
IPs in the future we will integrate it in platform data. What do you
think?

> 
> >   
> > > +	int i;
> > > +
> > > +	/*
> > > +	 * We will always take the smallest possible hysteresis to avoid risking
> > > +	 * the hardware integrity by enlarging the threshold by +8°C in the
> > > +	 * worst case.
> > > +	 */
> > > +	for (i = ARRAY_SIZE(hyst_levels_mc) - 1; i > 0; i--)
> > > +		if (hyst_mc >= hyst_levels_mc[i])
> > > +			break;
> > > +
> > > +	return i & data->hyst_mask;
> > > +}
> > > +
> > > +static void armada_set_overheat_thresholds(struct armada_thermal_priv *priv,
> > > +					   int thresh_mc, int hyst_mc)
> > > +{
> > > +	struct armada_thermal_data *data = priv->data;
> > > +	unsigned int threshold = armada_mc_to_reg_temp(data, thresh_mc);
> > > +	unsigned int hysteresis = armada_mc_to_reg_hyst(data, hyst_mc);
> > > +	u32 ctrl1;
> > > +
> > > +	regmap_read(priv->syscon, data->syscon_control1_off, &ctrl1);
> > > +
> > > +	/* Set Threshold */
> > > +	if (thresh_mc >= 0) {
> > > +		ctrl1 &= ~(data->temp_mask << data->thresh_shift);
> > > +		ctrl1 |= threshold << data->thresh_shift;
> > > +		priv->current_threshold = thresh_mc;
> > > +	}
> > > +
> > > +	/* Set Hysteresis */
> > > +	if (hyst_mc >= 0) {
> > > +		ctrl1 &= ~(data->hyst_mask << data->hyst_shift);
> > > +		ctrl1 |= hysteresis << data->hyst_shift;
> > > +		priv->current_hysteresis = hyst_mc;
> > > +	}
> > > +
> > > +	regmap_write(priv->syscon, data->syscon_control1_off, ctrl1);
> > > +}
> > > +
> > > +static irqreturn_t armada_overheat_isr(int irq, void *blob)
> > > +{
> > > +	/*
> > > +	 * Disable the IRQ and continue in thread context (thermal core
> > > +	 * notification and temperature monitoring).
> > > +	 */
> > > +	disable_irq_nosync(irq);
> > > +
> > > +	return IRQ_WAKE_THREAD;
> > > +}
> > > +
> > > +static irqreturn_t armada_overheat_isr_thread(int irq, void *blob)
> > > +{
> > > +	struct armada_thermal_priv *priv = (struct armada_thermal_priv *)blob;  
> > 
> > Unnecessary cast.

Ok.

> >   
> > > +	int low_threshold = priv->current_threshold - priv->current_hysteresis;
> > > +	int temperature;
> > > +	u32 dummy;
> > > +	int ret;
> > > +
> > > +	/* Notify the core in thread context */
> > > +	thermal_zone_device_update(priv->overheat_sensor,
> > > +				   THERMAL_EVENT_UNSPECIFIED);
> > > +
> > > +	/*
> > > +	 * The overheat interrupt must be cleared by reading the DFX interrupt
> > > +	 * cause _after_ the temperature has fallen down to the low threshold.
> > > +	 * Otherwise future interrupts might not be served.
> > > +	 */
> > > +	do {
> > > +		msleep(OVERHEAT_INT_POLL_DELAY_MS);
> > > +		mutex_lock(&priv->update_lock);
> > > +		ret = armada_read_sensor(priv, &temperature);
> > > +		mutex_unlock(&priv->update_lock);
> > > +		if (ret)
> > > +			return ret;  
> > 
> > How will the interrupt fire again if you exit without re-enabling it? Is
> > "ret" guaranteed to be a valid irqreturn_t?

You are right this structure is wrong, I will find a better way to
handle errors.

> >   
> > > +	} while (temperature >= low_threshold);
> > > +
> > > +	regmap_read(priv->syscon, priv->data->dfx_irq_cause_off, &dummy);
> > > +
> > > +	/* Notify the thermal core that the temperature is acceptable again */
> > > +	thermal_zone_device_update(priv->overheat_sensor,
> > > +				   THERMAL_EVENT_UNSPECIFIED);
> > > +
> > > +	enable_irq(irq);
> > > +
> > > +	return IRQ_HANDLED;
> > > +}
> > > +
> > >  static const struct armada_thermal_data armadaxp_data = {
> > >  	.init = armadaxp_init,
> > >  	.temp_shift = 10,
> > > @@ -454,6 +635,9 @@ static const struct armada_thermal_data armada_ap806_data = {
> > >  	.is_valid_bit = BIT(16),
> > >  	.temp_shift = 0,
> > >  	.temp_mask = 0x3ff,
> > > +	.thresh_shift = 3,
> > > +	.hyst_shift = 19,
> > > +	.hyst_mask = 0x3,
> > >  	.coef_b = -150000LL,
> > >  	.coef_m = 423ULL,
> > >  	.coef_div = 1,
> > > @@ -462,6 +646,11 @@ static const struct armada_thermal_data armada_ap806_data = {
> > >  	.syscon_control0_off = 0x84,
> > >  	.syscon_control1_off = 0x88,
> > >  	.syscon_status_off = 0x8C,
> > > +	.dfx_irq_cause_off = 0x108,
> > > +	.dfx_irq_mask_off = 0x10C,
> > > +	.dfx_overheat_irq = BIT(22),
> > > +	.dfx_server_irq_mask_off = 0x104,
> > > +	.dfx_server_irq_en = BIT(1),
> > >  	.cpu_nr = 4,
> > >  };
> > >  
> > > @@ -470,6 +659,9 @@ static const struct armada_thermal_data armada_cp110_data = {
> > >  	.is_valid_bit = BIT(10),
> > >  	.temp_shift = 0,
> > >  	.temp_mask = 0x3ff,
> > > +	.thresh_shift = 16,
> > > +	.hyst_shift = 26,
> > > +	.hyst_mask = 0x3,
> > >  	.coef_b = 1172499100ULL,
> > >  	.coef_m = 2000096ULL,
> > >  	.coef_div = 4201,
> > > @@ -477,6 +669,11 @@ static const struct armada_thermal_data armada_cp110_data = {
> > >  	.syscon_control0_off = 0x70,
> > >  	.syscon_control1_off = 0x74,
> > >  	.syscon_status_off = 0x78,
> > > +	.dfx_irq_cause_off = 0x108,
> > > +	.dfx_irq_mask_off = 0x10C,
> > > +	.dfx_overheat_irq = BIT(20),
> > > +	.dfx_server_irq_mask_off = 0x104,
> > > +	.dfx_server_irq_en = BIT(1),
> > >  };
> > >  
> > >  static const struct of_device_id armada_thermal_id_table[] = {
> > > @@ -592,6 +789,48 @@ static void armada_set_sane_name(struct platform_device *pdev,
> > >  	} while (insane_char);
> > >  }
> > >  
> > > +/*
> > > + * The IP can manage to trigger interrupts on overheat situation from all the
> > > + * sensors. However, the interrupt source changes along with the last selected
> > > + * source (ie. the last read sensor), which is an inconsistent behavior. Avoid
> > > + * possible glitches by always selecting back only one channel (arbitrarily: the
> > > + * first in the DT which has a critical trip point). We also disable sensor  
> > 
> > The DT doesn't seem to *mandate* this. Should it? Does it also mean that
> > you can only have a single critical point that will generate an
> > interrupt on overheat?

Yes:
* the IP only supports one single critical point that will generate
  an interrupt,
* any channel can trigger the interrupt,
* one (and only one) channel is selected at any time, it is the one
  that will trigger the interrupt.

> 
> Also, keep in mind that thermal core will shutdown the system when it
> sees any zone with temp crossing the first critical trip.

I had in mind the possibility to declare a trip point less important
than "critical" from which actions could take place in order to reduce
the temperature and try to avoid the shutdown. If you think this is not
useful I might probably simplify a lot the interrupt handling. But
then, there is no way to recover from such interrupt.

> 
> 
> > > + * switch during overheat situations.
> > > + */
> > > +static int armada_configure_overheat_int(struct armada_thermal_priv *priv,
> > > +					 struct thermal_zone_device *tz,
> > > +					 int sensor_id)
> > > +{
> > > +	/* Retrieve the critical trip point to enable the overheat interrupt */
> > > +	const struct thermal_trip *trips = of_thermal_get_trip_points(tz);
> > > +	int ret;
> > > +	int i;
> > > +
> > > +	if (!trips)
> > > +		return -EINVAL;
> > > +
> > > +	for (i = 0; i < of_thermal_get_ntrips(tz); i++)
> > > +		if (trips[i].type == THERMAL_TRIP_CRITICAL)
> > > +			break;
> > > +
> > > +	if (i == of_thermal_get_ntrips(tz))
> > > +		return -EINVAL;
> > > +
> > > +	ret = armada_select_channel(priv, sensor_id);
> > > +	if (ret)
> > > +		return ret;
> > > +
> > > +	armada_set_overheat_thresholds(priv,
> > > +				       trips[i].temperature,
> > > +				       trips[i].hysteresis);
> > > +	priv->overheat_sensor = tz;
> > > +	priv->interrupt_source = sensor_id;
> > > +
> > > +	armada_enable_overheat_interrupt(priv);
> > > +
> > > +	return 0;
> > > +}
> > > +
> > >  static int armada_thermal_probe(struct platform_device *pdev)
> > >  {
> > >  	struct thermal_zone_device *tz;
> > > @@ -599,7 +838,7 @@ static int armada_thermal_probe(struct platform_device *pdev)
> > >  	struct armada_drvdata *drvdata;
> > >  	const struct of_device_id *match;
> > >  	struct armada_thermal_priv *priv;
> > > -	int sensor_id;
> > > +	int sensor_id, irq;
> > >  	int ret;
> > >  
> > >  	match = of_match_device(armada_thermal_id_table, &pdev->dev);
> > > @@ -669,6 +908,23 @@ static int armada_thermal_probe(struct platform_device *pdev)
> > >  	drvdata->data.priv = priv;
> > >  	platform_set_drvdata(pdev, drvdata);
> > >  
> > > +	irq = platform_get_irq(pdev, 0);
> > > +	if (irq == -EPROBE_DEFER)
> > > +		return irq;
> > > +
> > > +	/* The overheat interrupt feature is not mandatory */
> > > +	if (irq >= 0) {  
> > 
> > 0 is not a valid interrupt.

Right.

> >   
> > > +		ret = devm_request_threaded_irq(&pdev->dev, irq,
> > > +						armada_overheat_isr,
> > > +						armada_overheat_isr_thread,
> > > +						0, NULL, priv);
> > > +		if (ret) {
> > > +			dev_err(&pdev->dev, "Cannot request threaded IRQ %d\n",
> > > +				irq);
> > > +			return ret;
> > > +		}
> > > +	}
> > > +
> > >  	/*
> > >  	 * There is one channel for the IC and one per CPU (if any), each
> > >  	 * channel has one sensor.
> > > @@ -692,8 +948,20 @@ static int armada_thermal_probe(struct platform_device *pdev)
> > >  			devm_kfree(&pdev->dev, sensor);
> > >  			continue;
> > >  		}
> > > +
> > > +		/*
> > > +		 * The first channel that has a critical trip point registered
> > > +		 * in the DT will serve as interrupt source. Others possible
> > > +		 * critical trip points will simply be ignored by the driver.
> > > +		 */
> > > +		if (irq >= 0 && !priv->overheat_sensor)  
> > 
> > Same here.

Yes.

> >   
> > > +			armada_configure_overheat_int(priv, tz, sensor->id);
> > >  	}
> > >  
> > > +	/* Just complain if no overheat interrupt was set up */
> > > +	if (!priv->overheat_sensor)
> > > +		dev_warn(&pdev->dev, "Overheat interrupt not available\n");
> > > +
> > >  	return 0;
> > >  }
> > >  
> > >   
> > 
> > Thanks,
> > 
> > 	M.
> > -- 
> > Jazz is not dead. It just smells funny...  


Thanks,
Miquèl

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2018-12-03 15:31 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-11-23 16:17 [PATCH v2 0/6] Add hw overheat IRQ support to Marvell thermal driver Miquel Raynal
2018-11-23 16:17 ` Miquel Raynal
2018-11-23 16:17 ` [PATCH v2 1/6] thermal: armada: add overheat interrupt support Miquel Raynal
2018-11-23 16:17   ` Miquel Raynal
2018-11-23 16:55   ` Marc Zyngier
2018-11-23 16:55     ` Marc Zyngier
2018-11-29 17:12     ` Eduardo Valentin
2018-11-29 17:12       ` Eduardo Valentin
2018-12-03 15:31       ` Miquel Raynal
2018-11-23 16:17 ` [PATCH v2 2/6] MAINTAINERS: thermal: add entry for Marvell MVEBU thermal driver Miquel Raynal
2018-11-23 16:17   ` Miquel Raynal
2018-11-23 16:17 ` [PATCH v2 3/6] dt-bindings: ap806: document the thermal interrupt capabilities Miquel Raynal
2018-11-23 16:17   ` Miquel Raynal
2018-11-23 16:17 ` [PATCH v2 4/6] dt-bindings: cp110: " Miquel Raynal
2018-11-23 16:17   ` Miquel Raynal
2018-11-23 16:17 ` [PATCH v2 5/6] arm64: dts: marvell: add interrupt support to ap806 thermal node Miquel Raynal
2018-11-23 16:17   ` Miquel Raynal
2018-11-23 16:17 ` [PATCH v2 6/6] arm64: dts: marvell: add interrupt support to cp110 " Miquel Raynal
2018-11-23 16:17   ` Miquel Raynal

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.