linux-kernel.vger.kernel.org archive mirror
* [PATCH v1] hwmon: (lm90) Use edge-triggered interrupt
@ 2021-06-16 19:07 Dmitry Osipenko
  2021-06-17  0:12 ` Guenter Roeck
  0 siblings, 1 reply; 11+ messages in thread
From: Dmitry Osipenko @ 2021-06-16 19:07 UTC (permalink / raw)
  To: Jean Delvare, Guenter Roeck; +Cc: linux-kernel, linux-hwmon

The LM90 driver uses level-based interrupt triggering. The interrupt
handler prints a warning message about the breached temperature and
quits. There is no way to stop the interrupt from re-triggering since
it's level-based, thus thousands of warning messages are printed per
second once the interrupt is triggered. Use an edge-triggered interrupt
to fix this problem.

Fixes: 109b1283fb532 ("hwmon: (lm90) Add support to handle IRQ")
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
---
 drivers/hwmon/lm90.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/hwmon/lm90.c b/drivers/hwmon/lm90.c
index ebbfd5f352c0..ce8ebe60fcdc 100644
--- a/drivers/hwmon/lm90.c
+++ b/drivers/hwmon/lm90.c
@@ -1908,7 +1908,7 @@ static int lm90_probe(struct i2c_client *client)
 		dev_dbg(dev, "IRQ: %d\n", client->irq);
 		err = devm_request_threaded_irq(dev, client->irq,
 						NULL, lm90_irq_thread,
-						IRQF_TRIGGER_LOW | IRQF_ONESHOT,
+						IRQF_TRIGGER_FALLING | IRQF_ONESHOT,
 						"lm90", client);
 		if (err < 0) {
 			dev_err(dev, "cannot request IRQ %d\n", client->irq);
-- 
2.30.2



* Re: [PATCH v1] hwmon: (lm90) Use edge-triggered interrupt
  2021-06-16 19:07 [PATCH v1] hwmon: (lm90) Use edge-triggered interrupt Dmitry Osipenko
@ 2021-06-17  0:12 ` Guenter Roeck
  2021-06-17  7:11   ` Dmitry Osipenko
  0 siblings, 1 reply; 11+ messages in thread
From: Guenter Roeck @ 2021-06-17  0:12 UTC (permalink / raw)
  To: Dmitry Osipenko; +Cc: Jean Delvare, linux-kernel, linux-hwmon

On Wed, Jun 16, 2021 at 10:07:08PM +0300, Dmitry Osipenko wrote:
> The LM90 driver uses level-based interrupt triggering. The interrupt
> handler prints a warning message about the breached temperature and
> quits. There is no way to stop interrupt from re-triggering since it's
> level-based, thus thousands of warning messages are printed per second
> once interrupt is triggered. Use edge-triggered interrupt in order to
> fix this trouble.
> 
> Fixes: 109b1283fb532 ("hwmon: (lm90) Add support to handle IRQ")
> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> ---
>  drivers/hwmon/lm90.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/hwmon/lm90.c b/drivers/hwmon/lm90.c
> index ebbfd5f352c0..ce8ebe60fcdc 100644
> --- a/drivers/hwmon/lm90.c
> +++ b/drivers/hwmon/lm90.c
> @@ -1908,7 +1908,7 @@ static int lm90_probe(struct i2c_client *client)
>  		dev_dbg(dev, "IRQ: %d\n", client->irq);
>  		err = devm_request_threaded_irq(dev, client->irq,
>  						NULL, lm90_irq_thread,
> -						IRQF_TRIGGER_LOW | IRQF_ONESHOT,
> +						IRQF_TRIGGER_FALLING | IRQF_ONESHOT,
>  						"lm90", client);

We can't do that. Problem is that many of the devices supported by this driver
behave differently when it comes to interrupts. Specifically, the interrupt
handler is supposed to reset the interrupt condition (i.e. reading the status
register should reset it). If that is not the case for a specific chip,
we'll have to update the code to address the problem for that specific chip.
The above code would probably just generate a single interrupt while never
resetting the interrupt condition, which is obviously not what we want to
happen.

Guenter

>  		if (err < 0) {
>  			dev_err(dev, "cannot request IRQ %d\n", client->irq);
> -- 
> 2.30.2
> 


* Re: [PATCH v1] hwmon: (lm90) Use edge-triggered interrupt
  2021-06-17  0:12 ` Guenter Roeck
@ 2021-06-17  7:11   ` Dmitry Osipenko
  2021-06-17 13:12     ` Guenter Roeck
  0 siblings, 1 reply; 11+ messages in thread
From: Dmitry Osipenko @ 2021-06-17  7:11 UTC (permalink / raw)
  To: Guenter Roeck; +Cc: Jean Delvare, linux-kernel, linux-hwmon

17.06.2021 03:12, Guenter Roeck wrote:
> On Wed, Jun 16, 2021 at 10:07:08PM +0300, Dmitry Osipenko wrote:
>> The LM90 driver uses level-based interrupt triggering. The interrupt
>> handler prints a warning message about the breached temperature and
>> quits. There is no way to stop interrupt from re-triggering since it's
>> level-based, thus thousands of warning messages are printed per second
>> once interrupt is triggered. Use edge-triggered interrupt in order to
>> fix this trouble.
>>
>> Fixes: 109b1283fb532 ("hwmon: (lm90) Add support to handle IRQ")
>> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
>> ---
>>  drivers/hwmon/lm90.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/hwmon/lm90.c b/drivers/hwmon/lm90.c
>> index ebbfd5f352c0..ce8ebe60fcdc 100644
>> --- a/drivers/hwmon/lm90.c
>> +++ b/drivers/hwmon/lm90.c
>> @@ -1908,7 +1908,7 @@ static int lm90_probe(struct i2c_client *client)
>>  		dev_dbg(dev, "IRQ: %d\n", client->irq);
>>  		err = devm_request_threaded_irq(dev, client->irq,
>>  						NULL, lm90_irq_thread,
>> -						IRQF_TRIGGER_LOW | IRQF_ONESHOT,
>> +						IRQF_TRIGGER_FALLING | IRQF_ONESHOT,
>>  						"lm90", client);
> 
> We can't do that. Problem is that many of the devices supported by this driver
> behave differently when it comes to interrupts. Specifically, the interrupt
> handler is supposed to reset the interrupt condition (i.e. reading the status
> register should reset it). If that is not the case for a specific chip,
> we'll have to update the code to address the problem for that specific chip.
> The above code would probably just generate a single interrupt while never
> resetting the interrupt condition, which is obviously not what we want to
> happen.

The nct1008/72 datasheet [1] says that reading the status register
doesn't reset the interrupt until the temperature has returned to the
normal state, which is what I'm witnessing.

[1] https://www.onsemi.com/pdf/datasheet/nct1008-d.pdf

Page 10 "Status Register":

"Reading the status register clears the five flags, Bit 6 to Bit 2,
provided the error conditions causing the flags to be set have gone
away. A flag bit can be reset only if the corresponding value register
contains an in-limit measurement or if the sensor is good."

So the interrupt handler doesn't actually stop the interrupt from
recurring, and the whole KMSG is instantly spammed with:

...
[  217.484034] lm90 0-004c: temp2 out of range, please check!
[  217.484569] lm90 0-004c: temp2 out of range, please check!
[  217.485006] systemd-journald[179]: /dev/kmsg buffer overrun, some messages lost.
[  217.485109] lm90 0-004c: temp2 out of range, please check!
[  217.485699] lm90 0-004c: temp2 out of range, please check!
[  217.486235] lm90 0-004c: temp2 out of range, please check!
[  217.486776] lm90 0-004c: temp2 out of range, please check!
[  217.486874] systemd-journald[179]: /dev/kmsg buffer overrun, ...
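
To make the difference concrete, here is a minimal user-space sketch, not
driver code. It assumes an active-low ALERT line sampled at fixed steps and
a handler that cannot deassert the line while the temperature stays out of
range (the nct1008 behaviour quoted above). The names model_trigger and
model_count_handler_runs are hypothetical and exist only for this
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical trigger modes for this toy model (not kernel constants). */
enum model_trigger { MODEL_LEVEL_LOW, MODEL_EDGE_FALLING };

/*
 * Count how often the handler would run for a sampled trace of an
 * active-low ALERT line (true = line is low, i.e. asserted).
 * Assumption: the handler cannot deassert the line by itself.
 */
static unsigned int model_count_handler_runs(const bool *asserted, size_t n,
					     enum model_trigger trigger)
{
	unsigned int runs = 0;
	bool prev = false;	/* line starts deasserted (high) */
	size_t i;

	assert(asserted != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (trigger == MODEL_LEVEL_LOW) {
			/* Level: fires again on every sample while low. */
			if (asserted[i])
				runs++;
		} else {
			/* Edge: fires only on the high-to-low transition. */
			if (asserted[i] && !prev)
				runs++;
		}
		prev = asserted[i];
	}

	return runs;
}
```

For a trace that stays asserted for four consecutive samples, the
level-triggered model runs the handler four times and the falling-edge
model runs it once.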

It's interesting that the very first version of the nct1008-support
patch used edge-triggered interrupt flags [2].

[2] http://lkml.iu.edu/hypermail/linux/kernel/1104.1/01669.html

Limiting the interrupt rate could be an alternative solution.

What do you think about something like this:

diff --git a/drivers/hwmon/lm90.c b/drivers/hwmon/lm90.c
index ce8ebe60fcdc..74886b8066ab 100644
--- a/drivers/hwmon/lm90.c
+++ b/drivers/hwmon/lm90.c
@@ -79,6 +79,7 @@
  * concern all supported chipsets, unless mentioned otherwise.
  */

+#include <linux/delay.h>
 #include <linux/module.h>
 #include <linux/init.h>
 #include <linux/slab.h>
@@ -201,6 +202,9 @@ enum chips { lm90, adm1032, lm99, lm86, max6657, max6659, adt7461, max6680,
 #define MAX6696_STATUS2_R2OT2	(1 << 6) /* remote2 emergency limit tripped */
 #define MAX6696_STATUS2_LOT2	(1 << 7) /* local emergency limit tripped */

+/* Prevent instant interrupt re-triggering */
+#define LM90_IRQ_DELAY		(15 * MSEC_PER_SEC)
+
 /*
  * Driver data (common to all clients)
  */
@@ -1756,10 +1760,12 @@ static irqreturn_t lm90_irq_thread(int irq, void *dev_id)
 	struct i2c_client *client = dev_id;
 	u16 status;

-	if (lm90_is_tripped(client, &status))
-		return IRQ_HANDLED;
-	else
+	if (!lm90_is_tripped(client, &status))
 		return IRQ_NONE;
+
+	msleep(LM90_IRQ_DELAY);
+
+	return IRQ_HANDLED;
 }

 static void lm90_remove_pec(void *dev)


* Re: [PATCH v1] hwmon: (lm90) Use edge-triggered interrupt
  2021-06-17  7:11   ` Dmitry Osipenko
@ 2021-06-17 13:12     ` Guenter Roeck
  2021-06-17 13:48       ` Dmitry Osipenko
  0 siblings, 1 reply; 11+ messages in thread
From: Guenter Roeck @ 2021-06-17 13:12 UTC (permalink / raw)
  To: Dmitry Osipenko; +Cc: Jean Delvare, linux-kernel, linux-hwmon

On Thu, Jun 17, 2021 at 10:11:19AM +0300, Dmitry Osipenko wrote:
> 17.06.2021 03:12, Guenter Roeck wrote:
> > On Wed, Jun 16, 2021 at 10:07:08PM +0300, Dmitry Osipenko wrote:
> >> The LM90 driver uses level-based interrupt triggering. The interrupt
> >> handler prints a warning message about the breached temperature and
> >> quits. There is no way to stop interrupt from re-triggering since it's
> >> level-based, thus thousands of warning messages are printed per second
> >> once interrupt is triggered. Use edge-triggered interrupt in order to
> >> fix this trouble.
> >>
> >> Fixes: 109b1283fb532 ("hwmon: (lm90) Add support to handle IRQ")
> >> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> >> ---
> >>  drivers/hwmon/lm90.c | 2 +-
> >>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/hwmon/lm90.c b/drivers/hwmon/lm90.c
> >> index ebbfd5f352c0..ce8ebe60fcdc 100644
> >> --- a/drivers/hwmon/lm90.c
> >> +++ b/drivers/hwmon/lm90.c
> >> @@ -1908,7 +1908,7 @@ static int lm90_probe(struct i2c_client *client)
> >>  		dev_dbg(dev, "IRQ: %d\n", client->irq);
> >>  		err = devm_request_threaded_irq(dev, client->irq,
> >>  						NULL, lm90_irq_thread,
> >> -						IRQF_TRIGGER_LOW | IRQF_ONESHOT,
> >> +						IRQF_TRIGGER_FALLING | IRQF_ONESHOT,
> >>  						"lm90", client);
> > 
> > We can't do that. Problem is that many of the devices supported by this driver
> > behave differently when it comes to interrupts. Specifically, the interrupt
> > handler is supposed to reset the interrupt condition (i.e. reading the status
> > register should reset it). If that is not the case for a specific chip,
> > we'll have to update the code to address the problem for that specific chip.
> > The above code would probably just generate a single interrupt while never
> > resetting the interrupt condition, which is obviously not what we want to
> > happen.
> 
> The nct1008/72 datasheet [1] says that reading the status register
> doesn't reset interrupt until temperature is returned back into normal
> state, which is what I'm witnessing.
> 
> [1] https://www.onsemi.com/pdf/datasheet/nct1008-d.pdf
> 
> Page 10 "Status Register":
> 
> "Reading the status register clears the five flags, Bit 6 to Bit 2,
> provided the error conditions causing the flags to be set have gone
> away. A flag bit can be reset only if the corresponding value register
> contains an in-limit measurement or if the sensor is good."
> 
> So the interrupt handler doesn't actually stop interrupt from
> reoccurring and the whole KMSG is instantly spammed with:
> 
> ...
> [  217.484034] lm90 0-004c: temp2 out of range, please check!
> [  217.484569] lm90 0-004c: temp2 out of range, please check!
> [  217.485006] systemd-journald[179]: /dev/kmsg buffer overrun, some messages lost.
> [  217.485109] lm90 0-004c: temp2 out of range, please check!
> [  217.485699] lm90 0-004c: temp2 out of range, please check!
> [  217.486235] lm90 0-004c: temp2 out of range, please check!
> [  217.486776] lm90 0-004c: temp2 out of range, please check!
> [  217.486874] systemd-journald[179]: /dev/kmsg buffer overrun, ...
> 
> It's interesting that the very first version of the nct1008-support
> patch used edge-triggered interrupt flags [2].
> 
> [2] http://lkml.iu.edu/hypermail/linux/kernel/1104.1/01669.html
> 
A lot of this depends on the chip and its wiring, as well as on chip
configuration. Even for a specific chip there may be configuration
dependencies. The interrupt configuration in situations like this
should really be determined by devicetree configuration, and not
be hardcoded. Is this a devicetree based system ? If so, there should
be an entry for this chip pointing to the interrupt, and that entry
should include a trigger mask. That mask should be set to edge
triggered.

> Limiting the interrupt rate could be an alternative solution.
> 
> What do you think about something like this:
> 
A sleep in an interrupt handler to "prevent" an interrupt storm
is never acceptable.

Guenter

> diff --git a/drivers/hwmon/lm90.c b/drivers/hwmon/lm90.c
> index ce8ebe60fcdc..74886b8066ab 100644
> --- a/drivers/hwmon/lm90.c
> +++ b/drivers/hwmon/lm90.c
> @@ -79,6 +79,7 @@
>   * concern all supported chipsets, unless mentioned otherwise.
>   */
> 
> +#include <linux/delay.h>
>  #include <linux/module.h>
>  #include <linux/init.h>
>  #include <linux/slab.h>
> @@ -201,6 +202,9 @@ enum chips { lm90, adm1032, lm99, lm86, max6657, max6659, adt7461, max6680,
>  #define MAX6696_STATUS2_R2OT2	(1 << 6) /* remote2 emergency limit tripped */
>  #define MAX6696_STATUS2_LOT2	(1 << 7) /* local emergency limit tripped */
> 
> +/* Prevent instant interrupt re-triggering */
> +#define LM90_IRQ_DELAY		(15 * MSEC_PER_SEC)
> +
>  /*
>   * Driver data (common to all clients)
>   */
> @@ -1756,10 +1760,12 @@ static irqreturn_t lm90_irq_thread(int irq, void *dev_id)
>  	struct i2c_client *client = dev_id;
>  	u16 status;
> 
> -	if (lm90_is_tripped(client, &status))
> -		return IRQ_HANDLED;
> -	else
> +	if (!lm90_is_tripped(client, &status))
>  		return IRQ_NONE;
> +
> +	msleep(LM90_IRQ_DELAY);
> +
> +	return IRQ_HANDLED;
>  }
> 
>  static void lm90_remove_pec(void *dev)


* Re: [PATCH v1] hwmon: (lm90) Use edge-triggered interrupt
  2021-06-17 13:12     ` Guenter Roeck
@ 2021-06-17 13:48       ` Dmitry Osipenko
  2021-06-17 14:13         ` Guenter Roeck
  0 siblings, 1 reply; 11+ messages in thread
From: Dmitry Osipenko @ 2021-06-17 13:48 UTC (permalink / raw)
  To: Guenter Roeck; +Cc: Jean Delvare, linux-kernel, linux-hwmon

17.06.2021 16:12, Guenter Roeck wrote:
> On Thu, Jun 17, 2021 at 10:11:19AM +0300, Dmitry Osipenko wrote:
>> 17.06.2021 03:12, Guenter Roeck wrote:
>>> On Wed, Jun 16, 2021 at 10:07:08PM +0300, Dmitry Osipenko wrote:
>>>> The LM90 driver uses level-based interrupt triggering. The interrupt
>>>> handler prints a warning message about the breached temperature and
>>>> quits. There is no way to stop interrupt from re-triggering since it's
>>>> level-based, thus thousands of warning messages are printed per second
>>>> once interrupt is triggered. Use edge-triggered interrupt in order to
>>>> fix this trouble.
>>>>
>>>> Fixes: 109b1283fb532 ("hwmon: (lm90) Add support to handle IRQ")
>>>> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
>>>> ---
>>>>  drivers/hwmon/lm90.c | 2 +-
>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/hwmon/lm90.c b/drivers/hwmon/lm90.c
>>>> index ebbfd5f352c0..ce8ebe60fcdc 100644
>>>> --- a/drivers/hwmon/lm90.c
>>>> +++ b/drivers/hwmon/lm90.c
>>>> @@ -1908,7 +1908,7 @@ static int lm90_probe(struct i2c_client *client)
>>>>  		dev_dbg(dev, "IRQ: %d\n", client->irq);
>>>>  		err = devm_request_threaded_irq(dev, client->irq,
>>>>  						NULL, lm90_irq_thread,
>>>> -						IRQF_TRIGGER_LOW | IRQF_ONESHOT,
>>>> +						IRQF_TRIGGER_FALLING | IRQF_ONESHOT,
>>>>  						"lm90", client);
>>>
>>> We can't do that. Problem is that many of the devices supported by this driver
>>> behave differently when it comes to interrupts. Specifically, the interrupt
>>> handler is supposed to reset the interrupt condition (i.e. reading the status
>>> register should reset it). If that is not the case for a specific chip,
>>> we'll have to update the code to address the problem for that specific chip.
>>> The above code would probably just generate a single interrupt while never
>>> resetting the interrupt condition, which is obviously not what we want to
>>> happen.
>>
>> The nct1008/72 datasheet [1] says that reading the status register
>> doesn't reset interrupt until temperature is returned back into normal
>> state, which is what I'm witnessing.
>>
>> [1] https://www.onsemi.com/pdf/datasheet/nct1008-d.pdf
>>
>> Page 10 "Status Register":
>>
>> "Reading the status register clears the five flags, Bit 6 to Bit 2,
>> provided the error conditions causing the flags to be set have gone
>> away. A flag bit can be reset only if the corresponding value register
>> contains an in-limit measurement or if the sensor is good."
>>
>> So the interrupt handler doesn't actually stop interrupt from
>> reoccurring and the whole KMSG is instantly spammed with:
>>
>> ...
>> [  217.484034] lm90 0-004c: temp2 out of range, please check!
>> [  217.484569] lm90 0-004c: temp2 out of range, please check!
>> [  217.485006] systemd-journald[179]: /dev/kmsg buffer overrun, some messages lost.
>> [  217.485109] lm90 0-004c: temp2 out of range, please check!
>> [  217.485699] lm90 0-004c: temp2 out of range, please check!
>> [  217.486235] lm90 0-004c: temp2 out of range, please check!
>> [  217.486776] lm90 0-004c: temp2 out of range, please check!
>> [  217.486874] systemd-journald[179]: /dev/kmsg buffer overrun, ...
>>
>> It's interesting that the very first version of the nct1008-support
>> patch used edge-triggered interrupt flags [2].
>>
>> [2] http://lkml.iu.edu/hypermail/linux/kernel/1104.1/01669.html
>>
> A lot of this depends on the chip and its wiring, as well as on chip
> configuration. Even for a specific chip there may be configuration
> dependencies. The interrupt configuration in situations like this
> should really be determined by devicetree configuration, and not
> be hardcoded. Is this a devicetree based system ? If so, there should
> be an entry for this chip pointing to the interrupt, and that entry
> should include a trigger mask. That mask should be set to edge
> triggered.

This is a device-tree based system, in particular it's NVIDIA Tegra30
Nexus 7. The interrupt support was originally added to the lm90 driver
by Wei Ni who works at NVIDIA and did it for the Tegra boards. The Tegra
device-trees are specifying the trigger mask and apparently they all are
cargo-culted and wrong because they use IRQ_TYPE_LEVEL_HIGH, while it
should be IRQ_TYPE_EDGE_FALLING.

The IRQF flag in devm_request_threaded_irq() overrides the trigger mask
specified in a device-tree. IIUC, the interrupt is used only by OF-based
devices, hence I think we could simply remove the IRQF flag from the
code and fix the device-trees. Does it sound good to you?
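
The precedence described above can be sketched as a small user-space
model. It assumes that a trigger type passed in the request flags wins
over the one taken from the device-tree, and that the device-tree type is
used only when the flags carry no trigger bits. The MODEL_* constants are
local stand-ins and model_effective_trigger is a hypothetical helper; none
of them are kernel API.

```c
#include <assert.h>

/* Local stand-ins for trigger bits; hypothetical names for this sketch. */
#define MODEL_TRIGGER_RISING	0x1u
#define MODEL_TRIGGER_FALLING	0x2u
#define MODEL_TRIGGER_HIGH	0x4u
#define MODEL_TRIGGER_LOW	0x8u
#define MODEL_TRIGGER_MASK	0xfu
#define MODEL_ONESHOT		0x2000u	/* unrelated, non-trigger flag */

/*
 * Return the trigger type that ends up in effect.
 * Assumption: trigger bits in the request flags override the type that
 * came from the device-tree; without such bits the device-tree type stays.
 */
static unsigned int model_effective_trigger(unsigned int dt_trigger,
					    unsigned int request_flags)
{
	unsigned int requested = request_flags & MODEL_TRIGGER_MASK;

	/* The device-tree value must contain trigger bits only. */
	assert((dt_trigger & ~MODEL_TRIGGER_MASK) == 0);

	return requested ? requested : dt_trigger;
}
```

With a device-tree type of MODEL_TRIGGER_FALLING, requesting
MODEL_TRIGGER_LOW | MODEL_ONESHOT yields MODEL_TRIGGER_LOW, while
requesting only MODEL_ONESHOT keeps MODEL_TRIGGER_FALLING. That is the
reason for dropping the trigger flag from the driver and fixing the
device-trees instead.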


* Re: [PATCH v1] hwmon: (lm90) Use edge-triggered interrupt
  2021-06-17 13:48       ` Dmitry Osipenko
@ 2021-06-17 14:13         ` Guenter Roeck
  2021-06-17 14:46           ` Dmitry Osipenko
  0 siblings, 1 reply; 11+ messages in thread
From: Guenter Roeck @ 2021-06-17 14:13 UTC (permalink / raw)
  To: Dmitry Osipenko; +Cc: Jean Delvare, linux-kernel, linux-hwmon

On Thu, Jun 17, 2021 at 04:48:08PM +0300, Dmitry Osipenko wrote:
> 17.06.2021 16:12, Guenter Roeck wrote:
> > On Thu, Jun 17, 2021 at 10:11:19AM +0300, Dmitry Osipenko wrote:
> >> 17.06.2021 03:12, Guenter Roeck wrote:
> >>> On Wed, Jun 16, 2021 at 10:07:08PM +0300, Dmitry Osipenko wrote:
> >>>> The LM90 driver uses level-based interrupt triggering. The interrupt
> >>>> handler prints a warning message about the breached temperature and
> >>>> quits. There is no way to stop interrupt from re-triggering since it's
> >>>> level-based, thus thousands of warning messages are printed per second
> >>>> once interrupt is triggered. Use edge-triggered interrupt in order to
> >>>> fix this trouble.
> >>>>
> >>>> Fixes: 109b1283fb532 ("hwmon: (lm90) Add support to handle IRQ")
> >>>> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> >>>> ---
> >>>>  drivers/hwmon/lm90.c | 2 +-
> >>>>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/drivers/hwmon/lm90.c b/drivers/hwmon/lm90.c
> >>>> index ebbfd5f352c0..ce8ebe60fcdc 100644
> >>>> --- a/drivers/hwmon/lm90.c
> >>>> +++ b/drivers/hwmon/lm90.c
> >>>> @@ -1908,7 +1908,7 @@ static int lm90_probe(struct i2c_client *client)
> >>>>  		dev_dbg(dev, "IRQ: %d\n", client->irq);
> >>>>  		err = devm_request_threaded_irq(dev, client->irq,
> >>>>  						NULL, lm90_irq_thread,
> >>>> -						IRQF_TRIGGER_LOW | IRQF_ONESHOT,
> >>>> +						IRQF_TRIGGER_FALLING | IRQF_ONESHOT,
> >>>>  						"lm90", client);
> >>>
> >>> We can't do that. Problem is that many of the devices supported by this driver
> >>> behave differently when it comes to interrupts. Specifically, the interrupt
> >>> handler is supposed to reset the interrupt condition (i.e. reading the status
> >>> register should reset it). If that is not the case for a specific chip,
> >>> we'll have to update the code to address the problem for that specific chip.
> >>> The above code would probably just generate a single interrupt while never
> >>> resetting the interrupt condition, which is obviously not what we want to
> >>> happen.
> >>
> >> The nct1008/72 datasheet [1] says that reading the status register
> >> doesn't reset interrupt until temperature is returned back into normal
> >> state, which is what I'm witnessing.
> >>
> >> [1] https://www.onsemi.com/pdf/datasheet/nct1008-d.pdf
> >>
> >> Page 10 "Status Register":
> >>
> >> "Reading the status register clears the five flags, Bit 6 to Bit 2,
> >> provided the error conditions causing the flags to be set have gone
> >> away. A flag bit can be reset only if the corresponding value register
> >> contains an in-limit measurement or if the sensor is good."
> >>
> >> So the interrupt handler doesn't actually stop interrupt from
> >> reoccurring and the whole KMSG is instantly spammed with:
> >>
> >> ...
> >> [  217.484034] lm90 0-004c: temp2 out of range, please check!
> >> [  217.484569] lm90 0-004c: temp2 out of range, please check!
> >> [  217.485006] systemd-journald[179]: /dev/kmsg buffer overrun, some messages lost.
> >> [  217.485109] lm90 0-004c: temp2 out of range, please check!
> >> [  217.485699] lm90 0-004c: temp2 out of range, please check!
> >> [  217.486235] lm90 0-004c: temp2 out of range, please check!
> >> [  217.486776] lm90 0-004c: temp2 out of range, please check!
> >> [  217.486874] systemd-journald[179]: /dev/kmsg buffer overrun, ...
> >>
> >> It's interesting that the very first version of the nct1008-support
> >> patch used edge-triggered interrupt flags [2].
> >>
> >> [2] http://lkml.iu.edu/hypermail/linux/kernel/1104.1/01669.html
> >>
> > A lot of this depends on the chip and its wiring, as well as on chip
> > configuration. Even for a specific chip there may be configuration
> > dependencies. The interrupt configuration in situations like this
> > should really be determined by devicetree configuration, and not
> > be hardcoded. Is this a devicetree based system ? If so, there should
> > be an entry for this chip pointing to the interrupt, and that entry
> > should include a trigger mask. That mask should be set to edge
> > triggered.
> 
> This is a device-tree based system, in particular it's NVIDIA Tegra30
> Nexus 7. The interrupt support was originally added to the lm90 driver
> by Wei Ni who works at NVIDIA and did it for the Tegra boards. The Tegra
> device-trees are specifying the trigger mask and apparently they all are
> cargo-culted and wrong because they use IRQ_TYPE_LEVEL_HIGH, while it

Be fair, no one is perfect.

> should be IRQ_TYPE_EDGE_FALLING.

It should probably be both IRQ_TYPE_EDGE_FALLING and IRQ_TYPE_EDGE_RISING,
and the interrupt handler should call hwmon_notify_event() instead of
clogging the kernel log, but that should be done in a separate patch.

Anyway, the tegra30 dts files in the upstream kernel either use
IRQ_TYPE_LEVEL_LOW or no interrupts for nct1008. The Nexus 7 dts file
in the upstream kernel has no interrupt configured (and coincidentally
it was you who added that entry). Where do you see IRQ_TYPE_LEVEL_HIGH ?

> 
> The IRQF flag in devm_request_threaded_irq() overrides the trigger mask
> specified in a device-tree. IIUC, the interrupt is used only by OF-based
> devices, hence I think we could simply remove the IRQF flag from the
> code and fix the device-trees. Does it sound good to you?

Yes, that is a better approach.

Thanks,
Guenter


* Re: [PATCH v1] hwmon: (lm90) Use edge-triggered interrupt
  2021-06-17 14:13         ` Guenter Roeck
@ 2021-06-17 14:46           ` Dmitry Osipenko
  2021-06-17 15:12             ` Guenter Roeck
  0 siblings, 1 reply; 11+ messages in thread
From: Dmitry Osipenko @ 2021-06-17 14:46 UTC (permalink / raw)
  To: Guenter Roeck; +Cc: Jean Delvare, linux-kernel, linux-hwmon

17.06.2021 17:13, Guenter Roeck wrote:
...
>> This is a device-tree based system, in particular it's NVIDIA Tegra30
>> Nexus 7. The interrupt support was originally added to the lm90 driver
>> by Wei Ni who works at NVIDIA and did it for the Tegra boards. The Tegra
>> device-trees are specifying the trigger mask and apparently they all are
>> cargo-culted and wrong because they use IRQ_TYPE_LEVEL_HIGH, while it
> 
> Be fair, no one is perfect.

This is a very minor problem, so no wonder that nobody noticed or
bothered to fix it yet. I'm just clarifying the status here.

>> should be IRQ_TYPE_EDGE_FALLING.
> 
> It should probably be both IRQ_TYPE_EDGE_FALLING and IRQ_TYPE_EDGE_RISING,

For now I see that the rising edge isn't needed; the TEMP_ALERT goes
HIGH by itself when the temperature returns to normal. But I will try to
double check.

> and the interrupt handler should call hwmon_notify_event() instead of
> clogging the kernel log, but that should be done in a separate patch.

Thank you for the suggestion, I will take a look.

> Anyway, the tegra30 dts files in the upstream kernel either use
> IRQ_TYPE_LEVEL_LOW or no interrupts for nct1008. The Nexus 7 dts file
> in the upstream kernel has no interrupt configured (and coincidentally
> it was you who added that entry). Where do you see IRQ_TYPE_LEVEL_HIGH ?

I have a patch that will add the interrupt property, it's stashed
locally for the next kernel release.

IIUC, it's not only the Tegra30 dts, but all the TegraXXX boards that
use IRQ_TYPE_LEVEL_LOW are in the same position.

>> The IRQF flag in devm_request_threaded_irq() overrides the trigger mask
>> specified in a device-tree. IIUC, the interrupt is used only by OF-based
>> devices, hence I think we could simply remove the IRQF flag from the
>> code and fix the device-trees. Does it sound good to you?
> 
> Yes, that is a better approach.

Thank you for reviewing this patch. I'll prepare v2.


* Re: [PATCH v1] hwmon: (lm90) Use edge-triggered interrupt
  2021-06-17 14:46           ` Dmitry Osipenko
@ 2021-06-17 15:12             ` Guenter Roeck
  2021-06-17 15:27               ` Dmitry Osipenko
  0 siblings, 1 reply; 11+ messages in thread
From: Guenter Roeck @ 2021-06-17 15:12 UTC (permalink / raw)
  To: Dmitry Osipenko; +Cc: Jean Delvare, linux-kernel, linux-hwmon

On Thu, Jun 17, 2021 at 05:46:33PM +0300, Dmitry Osipenko wrote:
> 17.06.2021 17:13, Guenter Roeck wrote:
> ...
> >> This is a device-tree based system, in particular it's NVIDIA Tegra30
> >> Nexus 7. The interrupt support was originally added to the lm90 driver
> >> by Wei Ni who works at NVIDIA and did it for the Tegra boards. The Tegra
> >> device-trees are specifying the trigger mask and apparently they all are
> >> cargo-culted and wrong because they use IRQ_TYPE_LEVEL_HIGH, while it
> > 
> > Be fair, no one is perfect.
> 
> This is a very minor problem, so no wonder that nobody noticed or
> bothered to fix it yet. I'm just clarifying the status here.
> 
> >> should be IRQ_TYPE_EDGE_FALLING.
> > 
> > It should probably be both IRQ_TYPE_EDGE_FALLING and IRQ_TYPE_EDGE_RISING,
> 
> For now I see that the rising edge isn't needed, the TEMP_ALERT goes
> HIGH by itself when temperature backs to normal. But I will try to
> double check.
> 
The point is that a sysfs event should be sent to userspace on both
edges, not only when an alarm is raised. But, you are correct,
IRQ_TYPE_EDGE_RISING is currently not needed since sysfs events
are not generated.
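
As a rough user-space illustration of that point (a sketch only, not the
driver and not hwmon_notify_event() itself): assuming an active-low ALERT
line, a falling-only trigger sees just the moment the alarm is raised,
while a both-edges trigger also sees it being cleared. The names
MODEL_NOTIFY_* and model_count_events are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical edge selection bits for this sketch. */
#define MODEL_NOTIFY_FALLING	0x1u	/* alarm raised: line goes low */
#define MODEL_NOTIFY_RISING	0x2u	/* alarm cleared: line returns high */

/*
 * Count the events that would be reported for a sampled trace of the
 * ALERT line (true = high/idle, false = low/asserted), given which
 * edges the interrupt is configured to trigger on.
 */
static unsigned int model_count_events(const bool *line_high, size_t n,
				       unsigned int edges)
{
	unsigned int events = 0;
	size_t i;

	assert(line_high != NULL || n == 0);

	for (i = 1; i < n; i++) {
		bool fell = line_high[i - 1] && !line_high[i];
		bool rose = !line_high[i - 1] && line_high[i];

		if (fell && (edges & MODEL_NOTIFY_FALLING))
			events++;
		if (rose && (edges & MODEL_NOTIFY_RISING))
			events++;
	}

	return events;
}
```

For a trace where the line goes low once and later returns high, the
falling-only configuration reports one event and the both-edges
configuration reports two.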

> > and the interrupt handler should call hwmon_notify_event() instead of
> > clogging the kernel log, but that should be done in a separate patch.
> 
> Thank you for suggestion, I will take a look.
> 
> > Anyway, the tegra30 dts files in the upstream kernel either use
> > IRQ_TYPE_LEVEL_LOW or no interrupts for nct1008. The Nexus 7 dts file
> > in the upstream kernel has no interrupt configured (and coincidentally
> > it was you who added that entry). Where do you see IRQ_TYPE_LEVEL_HIGH ?
> 
> I have a patch that will add the interrupt property, it's stashed
> locally for the next kernel release.
> 
> IIUC, it's not only the Tegra30 dts, but all the TegraXXX boards that
> use IRQ_TYPE_LEVEL_LOW are in the same position.

I still don't see an IRQ_TYPE_LEVEL_HIGH, though.

Thanks,
Guenter

> 
> >> The IRQF flag in devm_request_threaded_irq() overrides the trigger mask
> >> specified in a device-tree. IIUC, the interrupt is used only by OF-based
> >> devices, hence I think we could simply remove the IRQF flag from the
> >> code and fix the device-trees. Does it sound good to you?
> > 
> > Yes, that is a better approach.
> 
> Thank you for reviewing this patch. I'll prepare v2.


* Re: [PATCH v1] hwmon: (lm90) Use edge-triggered interrupt
  2021-06-17 15:12             ` Guenter Roeck
@ 2021-06-17 15:27               ` Dmitry Osipenko
  2021-06-17 21:42                 ` Guenter Roeck
  0 siblings, 1 reply; 11+ messages in thread
From: Dmitry Osipenko @ 2021-06-17 15:27 UTC (permalink / raw)
  To: Guenter Roeck; +Cc: Jean Delvare, linux-kernel, linux-hwmon

17.06.2021 18:12, Guenter Roeck wrote:
>> For now I see that the rising edge isn't needed, the TEMP_ALERT goes
>> HIGH by itself when temperature backs to normal. But I will try to
>> double check.
>>
> The point is that a sysfs event should be sent to userspace on both
> edges, not only when an alarm is raised. But, you are correct,
> IRQ_TYPE_EDGE_RISING is currently not needed since sysfs events
> are not generated.

Ok, thank you for the clarification.

>>> Anyway, the tegra30 dts files in the upstream kernel either use
>>> IRQ_TYPE_LEVEL_LOW or no interrupts for nct1008. The Nexus 7 dts file
>>> in the upstream kernel has no interrupt configured (and coincidentally
>>> it was you who added that entry). Where do you see IRQ_TYPE_LEVEL_HIGH ?
>> I have a patch that will add the interrupt property, it's stashed
>> locally for the next kernel release.
>>
>> IIUC, it's not only the Tegra30 dts, but all the TegraXXX boards that
>> use IRQ_TYPE_LEVEL_LOW are in the same position.
> I still don't see a IRQ_TYPE_LEVEL_HIGH, though.

Could you please clarify why you're looking for HIGH and not for LOW?
The TEMP_ALERT is active-low.


* Re: [PATCH v1] hwmon: (lm90) Use edge-triggered interrupt
  2021-06-17 15:27               ` Dmitry Osipenko
@ 2021-06-17 21:42                 ` Guenter Roeck
  2021-06-18  8:55                   ` Dmitry Osipenko
  0 siblings, 1 reply; 11+ messages in thread
From: Guenter Roeck @ 2021-06-17 21:42 UTC (permalink / raw)
  To: Dmitry Osipenko; +Cc: Jean Delvare, linux-kernel, linux-hwmon

On Thu, Jun 17, 2021 at 06:27:50PM +0300, Dmitry Osipenko wrote:
> 17.06.2021 18:12, Guenter Roeck wrote:
> >> For now I see that the rising edge isn't needed, the TEMP_ALERT goes
> >> HIGH by itself when the temperature returns to normal. But I will try to
> >> double check.
> >>
> > The point is that a sysfs event should be sent to userspace on both
> > edges, not only when an alarm is raised. But, you are correct,
> > IRQ_TYPE_EDGE_RISING is currently not needed since sysfs events
> > are not generated.
> 
> Ok, thank you for the clarification.
> 
> >>> Anyway, the tegra30 dts files in the upstream kernel either use
> >>> IRQ_TYPE_LEVEL_LOW or no interrupts for nct1008. The Nexus 7 dts file
> >>> in the upstream kernel has no interrupt configured (and coincidentally
> >>> it was you who added that entry). Where do you see IRQ_TYPE_LEVEL_HIGH ?
> >> I have a patch that will add the interrupt property, it's stashed
> >> locally for the next kernel release.
> >>
> >> IIUC, it's not only the Tegra30 dts, but all the TegraXXX boards that
> >> use IRQ_TYPE_LEVEL_LOW are in the same position.
> > > I still don't see an IRQ_TYPE_LEVEL_HIGH, though.
> 
> Could you please clarify why you're looking for HIGH and not for LOW?
> The TEMP_ALERT is active-low.

Because you stated earlier:

"... cargo-culted and wrong because they use IRQ_TYPE_LEVEL_HIGH ..."
                                             ^^^^^^^^^^^^^^^^^^^

Guenter



* Re: [PATCH v1] hwmon: (lm90) Use edge-triggered interrupt
  2021-06-17 21:42                 ` Guenter Roeck
@ 2021-06-18  8:55                   ` Dmitry Osipenko
  0 siblings, 0 replies; 11+ messages in thread
From: Dmitry Osipenko @ 2021-06-18  8:55 UTC (permalink / raw)
  To: Guenter Roeck; +Cc: Jean Delvare, linux-kernel, linux-hwmon

18.06.2021 00:42, Guenter Roeck wrote:
> On Thu, Jun 17, 2021 at 06:27:50PM +0300, Dmitry Osipenko wrote:
>> 17.06.2021 18:12, Guenter Roeck wrote:
>>>> For now I see that the rising edge isn't needed, the TEMP_ALERT goes
>>>> HIGH by itself when the temperature returns to normal. But I will try to
>>>> double check.
>>>>
>>> The point is that a sysfs event should be sent to userspace on both
>>> edges, not only when an alarm is raised. But, you are correct,
>>> IRQ_TYPE_EDGE_RISING is currently not needed since sysfs events
>>> are not generated.
>>
>> Ok, thank you for the clarification.
>>
>>>>> Anyway, the tegra30 dts files in the upstream kernel either use
>>>>> IRQ_TYPE_LEVEL_LOW or no interrupts for nct1008. The Nexus 7 dts file
>>>>> in the upstream kernel has no interrupt configured (and coincidentally
>>>>> it was you who added that entry). Where do you see IRQ_TYPE_LEVEL_HIGH ?
>>>> I have a patch that will add the interrupt property, it's stashed
>>>> locally for the next kernel release.
>>>>
>>>> IIUC, it's not only the Tegra30 dts, but all the TegraXXX boards that
>>>> use IRQ_TYPE_LEVEL_LOW are in the same position.
>>> I still don't see an IRQ_TYPE_LEVEL_HIGH, though.
>>
>> Could you please clarify why you're looking for HIGH and not for LOW?
>> The TEMP_ALERT is active-low.
> 
> Because you stated earlier:
> 
> "... cargo-culted and wrong because they use IRQ_TYPE_LEVEL_HIGH ..."
>                                              ^^^^^^^^^^^^^^^^^^^

That was a typo, my bad.



Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
2021-06-16 19:07 [PATCH v1] hwmon: (lm90) Use edge-triggered interrupt Dmitry Osipenko
2021-06-17  0:12 ` Guenter Roeck
2021-06-17  7:11   ` Dmitry Osipenko
2021-06-17 13:12     ` Guenter Roeck
2021-06-17 13:48       ` Dmitry Osipenko
2021-06-17 14:13         ` Guenter Roeck
2021-06-17 14:46           ` Dmitry Osipenko
2021-06-17 15:12             ` Guenter Roeck
2021-06-17 15:27               ` Dmitry Osipenko
2021-06-17 21:42                 ` Guenter Roeck
2021-06-18  8:55                   ` Dmitry Osipenko
