linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Guenter Roeck <linux@roeck-us.net>
To: Dmitry Osipenko <digetx@gmail.com>
Cc: Jean Delvare <jdelvare@suse.com>,
	linux-kernel@vger.kernel.org, linux-hwmon@vger.kernel.org
Subject: Re: [PATCH v1] hwmon: (lm90) Use edge-triggered interrupt
Date: Thu, 17 Jun 2021 06:12:05 -0700	[thread overview]
Message-ID: <20210617131205.GA59767@roeck-us.net> (raw)
In-Reply-To: <3c6cbaf3-187b-1682-69b8-a2b34f23b928@gmail.com>

On Thu, Jun 17, 2021 at 10:11:19AM +0300, Dmitry Osipenko wrote:
> 17.06.2021 03:12, Guenter Roeck пишет:
> > On Wed, Jun 16, 2021 at 10:07:08PM +0300, Dmitry Osipenko wrote:
> >> The LM90 driver uses level-based interrupt triggering. The interrupt
> >> handler prints a warning message about the breached temperature and
> >> quits. There is no way to stop interrupt from re-triggering since it's
> >> level-based, thus thousands of warning messages are printed per second
> >> once interrupt is triggered. Use edge-triggered interrupt in order to
> >> fix this trouble.
> >>
> >> Fixes: 109b1283fb532 ("hwmon: (lm90) Add support to handle IRQ")
> >> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> >> ---
> >>  drivers/hwmon/lm90.c | 2 +-
> >>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/hwmon/lm90.c b/drivers/hwmon/lm90.c
> >> index ebbfd5f352c0..ce8ebe60fcdc 100644
> >> --- a/drivers/hwmon/lm90.c
> >> +++ b/drivers/hwmon/lm90.c
> >> @@ -1908,7 +1908,7 @@ static int lm90_probe(struct i2c_client *client)
> >>  		dev_dbg(dev, "IRQ: %d\n", client->irq);
> >>  		err = devm_request_threaded_irq(dev, client->irq,
> >>  						NULL, lm90_irq_thread,
> >> -						IRQF_TRIGGER_LOW | IRQF_ONESHOT,
> >> +						IRQF_TRIGGER_FALLING | IRQF_ONESHOT,
> >>  						"lm90", client);
> > 
> > We can't do that. Problem is that many of the devices supported by this driver
> > behave differently when it comes to interrupts. Specifically, the interrupt
> > handler is supposed to reset the interrupt condition (ie reading the status
> > register should reset it). If that is the not the case for a specific chip,
> > we'll have to update the code to address the problem for that specific chip.
> > The above code would probably just generate a single interrupt while never
> > resetting the interrupt condition, which is obviously not what we want to
> > happen.
> 
> The nct1008/72 datasheet [1] says that reading the status register
> doesn't reset interrupt until temperature is returned back into normal
> state, which is what I'm witnessing.
> 
> [1] https://www.onsemi.com/pdf/datasheet/nct1008-d.pdf
> 
> Page 10 "Status Register":
> 
> "Reading the status register clears the five flags, Bit 6 to Bit 2,
> provided the error conditions causing the flags to beset  have  gone
> away.  A  flag  bit  can  be  reset  only  if  the corresponding
> value    register    contains    an    in-limit measurement or if the
> sensor is good."
> 
> So the interrupt handler doesn't actually stop interrupt from
> reoccurring and the whole KMSG is instantly spammed with:
> 
> ...
> [  217.484034] lm90 0-004c: temp2 out of range, please check!
> [  217.484569] lm90 0-004c: temp2 out of range, please check!
> [  217.485006] systemd-journald[179]: /dev/kmsg buffer overrun, some
> messages lost.
> [  217.485109] lm90 0-004c: temp2 out of range, please check!
> [  217.485699] lm90 0-004c: temp2 out of range, please check!
> [  217.486235] lm90 0-004c: temp2 out of range, please check!
> [  217.486776] lm90 0-004c: temp2 out of range, please check!
> [  217.486874] systemd-journald[179]: /dev/kmsg buffer overrun, ...
> 
> It's interesting that the very first version of the nct1008-support
> patch used edge-triggered interrupt flags [2].
> 
> [2] http://lkml.iu.edu/hypermail/linux/kernel/1104.1/01669.html
> 
A lot of this depends on the chip and its wiring, as well as on chip
configuration. Even for a specific chip there may be configuration
dependencies. The interrupt configuration in situations like this
should really be determined by devicetree configuration, and not
be hardcoded. Is this a devicetree based system ? If so, there should
be an entry for this chip pointing to the interrupt, and that entry
should include a trigger mask. That mask should be set to edge
triggered.

> Limiting the interrupt rate could be an alternative solution.
> 
> What do you think about something like this:
> 
A sleep in an interrupt handler to "prevent" an interrupt storm
is never acceptable.

Guenter

> diff --git a/drivers/hwmon/lm90.c b/drivers/hwmon/lm90.c
> index ce8ebe60fcdc..74886b8066ab 100644
> --- a/drivers/hwmon/lm90.c
> +++ b/drivers/hwmon/lm90.c
> @@ -79,6 +79,7 @@
>   * concern all supported chipsets, unless mentioned otherwise.
>   */
> 
> +#include <linux/delay.h>
>  #include <linux/module.h>
>  #include <linux/init.h>
>  #include <linux/slab.h>
> @@ -201,6 +202,9 @@ enum chips { lm90, adm1032, lm99, lm86, max6657,
> max6659, adt7461, max6680,
>  #define MAX6696_STATUS2_R2OT2	(1 << 6) /* remote2 emergency limit
> tripped */
>  #define MAX6696_STATUS2_LOT2	(1 << 7) /* local emergency limit tripped */
> 
> +/* Prevent instant interrupt re-triggering */
> +#define LM90_IRQ_DELAY		(15 * MSEC_PER_SEC)
> +
>  /*
>   * Driver data (common to all clients)
>   */
> @@ -1756,10 +1760,12 @@ static irqreturn_t lm90_irq_thread(int irq, void
> *dev_id)
>  	struct i2c_client *client = dev_id;
>  	u16 status;
> 
> -	if (lm90_is_tripped(client, &status))
> -		return IRQ_HANDLED;
> -	else
> +	if (!lm90_is_tripped(client, &status))
>  		return IRQ_NONE;
> +
> +	msleep(LM90_IRQ_DELAY);
> +
> +	return IRQ_HANDLED;
>  }
> 
>  static void lm90_remove_pec(void *dev)

  reply	other threads:[~2021-06-17 13:12 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-16 19:07 Dmitry Osipenko
2021-06-17  0:12 ` Guenter Roeck
2021-06-17  7:11   ` Dmitry Osipenko
2021-06-17 13:12     ` Guenter Roeck [this message]
2021-06-17 13:48       ` Dmitry Osipenko
2021-06-17 14:13         ` Guenter Roeck
2021-06-17 14:46           ` Dmitry Osipenko
2021-06-17 15:12             ` Guenter Roeck
2021-06-17 15:27               ` Dmitry Osipenko
2021-06-17 21:42                 ` Guenter Roeck
2021-06-18  8:55                   ` Dmitry Osipenko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210617131205.GA59767@roeck-us.net \
    --to=linux@roeck-us.net \
    --cc=digetx@gmail.com \
    --cc=jdelvare@suse.com \
    --cc=linux-hwmon@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --subject='Re: [PATCH v1] hwmon: (lm90) Use edge-triggered interrupt' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).