linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] tpm_tis: Disable interrupts if interrupt storm detected
@ 2020-11-30 23:23 Jerry Snitselaar
  2020-12-01  2:58 ` [PATCH v2] " Jerry Snitselaar
  0 siblings, 1 reply; 12+ messages in thread
From: Jerry Snitselaar @ 2020-11-30 23:23 UTC (permalink / raw)
  To: linux-kernel, linux-integrity
  Cc: Jarkko Sakkinen, Jason Gunthorpe, Peter Huewe, James Bottomley,
	Matthew Garrett, Hans de Goede

When enabling the interrupt code for the tpm_tis driver we have
noticed some systems have a bios issue causing an interrupt storm to
occur. The issue isn't limited to a single tpm or system vendor
so keeping a denylist of systems with the issue isn't optimal. Instead
try to detect the problem occurring, disable interrupts, and revert to
polling when it happens.

Cc: Jarkko Sakkinen <jarkko@kernel.org>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Peter Huewe <peterhuewe@gmx.de>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Matthew Garrett <mjg59@google.com>
Cc: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com>
---
 drivers/char/tpm/tpm_tis_core.c | 34 +++++++++++++++++++++++++++++++++
 drivers/char/tpm/tpm_tis_core.h |  2 ++
 2 files changed, 36 insertions(+)

diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
index 92c51c6cfd1b..19115a628f25 100644
--- a/drivers/char/tpm/tpm_tis_core.c
+++ b/drivers/char/tpm/tpm_tis_core.c
@@ -24,6 +24,8 @@
 #include <linux/wait.h>
 #include <linux/acpi.h>
 #include <linux/freezer.h>
+#include <linux/workqueue.h>
+#include <linux/kernel_stat.h>
 #include "tpm.h"
 #include "tpm_tis_core.h"
 
@@ -711,13 +713,30 @@ static bool tpm_tis_req_canceled(struct tpm_chip *chip, u8 status)
 	}
 }
 
+static struct workqueue_struct *tpm_tis_wq;
+static DEFINE_MUTEX(tpm_tis_wq_lock);
+
 static irqreturn_t tis_int_handler(int dummy, void *dev_id)
 {
 	struct tpm_chip *chip = dev_id;
 	struct tpm_tis_data *priv = dev_get_drvdata(&chip->dev);
+	static bool check_storm = true;
+	static unsigned int check_start;
 	u32 interrupt;
 	int i, rc;
 
+	if (unlikely(check_storm)) {
+		if (!check_start) {
+			check_start = jiffies_to_msecs(jiffies);
+		} else if ((kstat_irqs(priv->irq) > 1000) &&
+			   (jiffies_to_msecs(jiffies) - check_start < 500)) {
+			check_storm = false;
+			queue_work(tpm_tis_wq, &priv->storm_work);
+		} else if (jiffies_to_msecs(jiffies) - check_start >= 500) {
+			check_storm = false;
+		}
+	}
+
 	rc = tpm_tis_read32(priv, TPM_INT_STATUS(priv->locality), &interrupt);
 	if (rc < 0)
 		return IRQ_NONE;
@@ -943,6 +962,14 @@ static const struct tpm_class_ops tpm_tis = {
 	.clk_enable = tpm_tis_clkrun_enable,
 };
 
+static void tpm_tis_storm_work(struct work_struct *work)
+{
+	struct tpm_tis_data *priv = container_of(work, struct tpm_tis_data, storm_work);
+
+	disable_interrupts(priv->chip);
+	dev_warn(&priv->chip->dev, "Interrupt storm detected, using polling.\n");
+}
+
 int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
 		      const struct tpm_tis_phy_ops *phy_ops,
 		      acpi_handle acpi_dev_handle)
@@ -959,6 +986,13 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
 	if (IS_ERR(chip))
 		return PTR_ERR(chip);
 
+	priv->chip = chip;
+	tpm_tis_wq = alloc_workqueue("tpm_tis_wq", WQ_MEM_RECLAIM, 0);
+	if (!tpm_tis_wq)
+		return -ENOMEM;
+
+	INIT_WORK(&priv->storm_work, tpm_tis_storm_work);
+
 #ifdef CONFIG_ACPI
 	chip->acpi_dev_handle = acpi_dev_handle;
 #endif
diff --git a/drivers/char/tpm/tpm_tis_core.h b/drivers/char/tpm/tpm_tis_core.h
index 9b2d32a59f67..973297ee2e16 100644
--- a/drivers/char/tpm/tpm_tis_core.h
+++ b/drivers/char/tpm/tpm_tis_core.h
@@ -95,6 +95,8 @@ struct tpm_tis_data {
 	u16 clkrun_enabled;
 	wait_queue_head_t int_queue;
 	wait_queue_head_t read_queue;
+	struct work_struct storm_work;
+	struct tpm_chip *chip;
 	const struct tpm_tis_phy_ops *phy_ops;
 	unsigned short rng_quality;
 };
-- 
2.27.0


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v2] tpm_tis: Disable interrupts if interrupt storm detected
  2020-11-30 23:23 [PATCH] tpm_tis: Disable interrupts if interrupt storm detected Jerry Snitselaar
@ 2020-12-01  2:58 ` Jerry Snitselaar
  2020-12-01  3:26   ` Jerry Snitselaar
  0 siblings, 1 reply; 12+ messages in thread
From: Jerry Snitselaar @ 2020-12-01  2:58 UTC (permalink / raw)
  To: linux-kernel, linux-integrity
  Cc: Jarkko Sakkinen, Jason Gunthorpe, Peter Huewe, James Bottomley,
	Matthew Garrett, Hans de Goede

When enabling the interrupt code for the tpm_tis driver we have
noticed some systems have a bios issue causing an interrupt storm to
occur. The issue isn't limited to a single tpm or system manufacturer
so keeping a denylist of systems with the issue isn't optimal. Instead
try to detect the problem occurring, disable interrupts, and revert to
polling when it happens.

Cc: Jarkko Sakkinen <jarkko@kernel.org>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Peter Huewe <peterhuewe@gmx.de>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Matthew Garrett <mjg59@google.com>
Cc: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com>
---
v2: drop tpm_tis specific workqueue and use just system_wq

drivers/char/tpm/tpm_tis_core.c | 27 +++++++++++++++++++++++++++
 drivers/char/tpm/tpm_tis_core.h |  2 ++
 2 files changed, 29 insertions(+)

diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
index 23b60583928b..72cc8a5a152c 100644
--- a/drivers/char/tpm/tpm_tis_core.c
+++ b/drivers/char/tpm/tpm_tis_core.c
@@ -24,6 +24,8 @@
 #include <linux/wait.h>
 #include <linux/acpi.h>
 #include <linux/freezer.h>
+#include <linux/workqueue.h>
+#include <linux/kernel_stat.h>
 #include "tpm.h"
 #include "tpm_tis_core.h"
 
@@ -745,9 +747,23 @@ static irqreturn_t tis_int_handler(int dummy, void *dev_id)
 {
 	struct tpm_chip *chip = dev_id;
 	struct tpm_tis_data *priv = dev_get_drvdata(&chip->dev);
+	static bool check_storm = true;
+	static unsigned int check_start;
 	u32 interrupt;
 	int i, rc;
 
+	if (unlikely(check_storm)) {
+		if (!check_start) {
+			check_start = jiffies_to_msecs(jiffies);
+		} else if ((kstat_irqs(priv->irq) > 1000) &&
+			   (jiffies_to_msecs(jiffies) - check_start < 500)) {
+			check_storm = false;
+			schedule_work(&priv->storm_work);
+		} else if (jiffies_to_msecs(jiffies) - check_start >= 500) {
+			check_storm = false;
+		}
+	}
+
 	rc = tpm_tis_read32(priv, TPM_INT_STATUS(priv->locality), &interrupt);
 	if (rc < 0)
 		return IRQ_NONE;
@@ -987,6 +1003,14 @@ static const struct tpm_class_ops tpm_tis = {
 	.clk_enable = tpm_tis_clkrun_enable,
 };
 
+static void tpm_tis_storm_work(struct work_struct *work)
+{
+	struct tpm_tis_data *priv = container_of(work, struct tpm_tis_data, storm_work);
+
+	disable_interrupts(priv->chip);
+	dev_warn(&priv->chip->dev, "Interrupt storm detected, using polling.\n");
+}
+
 int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
 		      const struct tpm_tis_phy_ops *phy_ops,
 		      acpi_handle acpi_dev_handle)
@@ -1003,6 +1027,9 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
 	if (IS_ERR(chip))
 		return PTR_ERR(chip);
 
+	priv->chip = chip;
+	INIT_WORK(&priv->storm_work, tpm_tis_storm_work);
+
 #ifdef CONFIG_ACPI
 	chip->acpi_dev_handle = acpi_dev_handle;
 #endif
diff --git a/drivers/char/tpm/tpm_tis_core.h b/drivers/char/tpm/tpm_tis_core.h
index edeb5dc61c95..5630f294dc0c 100644
--- a/drivers/char/tpm/tpm_tis_core.h
+++ b/drivers/char/tpm/tpm_tis_core.h
@@ -95,6 +95,8 @@ struct tpm_tis_data {
 	u16 clkrun_enabled;
 	wait_queue_head_t int_queue;
 	wait_queue_head_t read_queue;
+	struct work_struct storm_work;
+	struct tpm_chip *chip;
 	const struct tpm_tis_phy_ops *phy_ops;
 	unsigned short rng_quality;
 };
-- 
2.27.0


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2] tpm_tis: Disable interrupts if interrupt storm detected
  2020-12-01  2:58 ` [PATCH v2] " Jerry Snitselaar
@ 2020-12-01  3:26   ` Jerry Snitselaar
  2020-12-01 19:59     ` Jerry Snitselaar
  0 siblings, 1 reply; 12+ messages in thread
From: Jerry Snitselaar @ 2020-12-01  3:26 UTC (permalink / raw)
  To: linux-kernel, linux-integrity
  Cc: Jarkko Sakkinen, Jason Gunthorpe, Peter Huewe, James Bottomley,
	Matthew Garrett, Hans de Goede


Jerry Snitselaar @ 2020-11-30 19:58 MST:

> When enabling the interrupt code for the tpm_tis driver we have
> noticed some systems have a bios issue causing an interrupt storm to
> occur. The issue isn't limited to a single tpm or system manufacturer
> so keeping a denylist of systems with the issue isn't optimal. Instead
> try to detect the problem occurring, disable interrupts, and revert to
> polling when it happens.
>
> Cc: Jarkko Sakkinen <jarkko@kernel.org>
> Cc: Jason Gunthorpe <jgg@ziepe.ca>
> Cc: Peter Huewe <peterhuewe@gmx.de>
> Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
> Cc: Matthew Garrett <mjg59@google.com>
> Cc: Hans de Goede <hdegoede@redhat.com>
> Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com>
> ---
> v2: drop tpm_tis specific workqueue and use just system_wq
>
> drivers/char/tpm/tpm_tis_core.c | 27 +++++++++++++++++++++++++++
>  drivers/char/tpm/tpm_tis_core.h |  2 ++
>  2 files changed, 29 insertions(+)
>
> diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
> index 23b60583928b..72cc8a5a152c 100644
> --- a/drivers/char/tpm/tpm_tis_core.c
> +++ b/drivers/char/tpm/tpm_tis_core.c
> @@ -24,6 +24,8 @@
>  #include <linux/wait.h>
>  #include <linux/acpi.h>
>  #include <linux/freezer.h>
> +#include <linux/workqueue.h>
> +#include <linux/kernel_stat.h>
>  #include "tpm.h"
>  #include "tpm_tis_core.h"
>  
> @@ -745,9 +747,23 @@ static irqreturn_t tis_int_handler(int dummy, void *dev_id)
>  {
>  	struct tpm_chip *chip = dev_id;
>  	struct tpm_tis_data *priv = dev_get_drvdata(&chip->dev);
> +	static bool check_storm = true;
> +	static unsigned int check_start;
>  	u32 interrupt;
>  	int i, rc;
>  
> +	if (unlikely(check_storm)) {
> +		if (!check_start) {
> +			check_start = jiffies_to_msecs(jiffies);
> +		} else if ((kstat_irqs(priv->irq) > 1000) &&
> +			   (jiffies_to_msecs(jiffies) - check_start < 500)) {
> +			check_storm = false;
> +			schedule_work(&priv->storm_work);
> +		} else if (jiffies_to_msecs(jiffies) - check_start >= 500) {
> +			check_storm = false;
> +		}
> +	}
> +
>  	rc = tpm_tis_read32(priv, TPM_INT_STATUS(priv->locality), &interrupt);
>  	if (rc < 0)
>  		return IRQ_NONE;
> @@ -987,6 +1003,14 @@ static const struct tpm_class_ops tpm_tis = {
>  	.clk_enable = tpm_tis_clkrun_enable,
>  };
>  
> +static void tpm_tis_storm_work(struct work_struct *work)
> +{
> +	struct tpm_tis_data *priv = container_of(work, struct tpm_tis_data, storm_work);
> +
> +	disable_interrupts(priv->chip);
> +	dev_warn(&priv->chip->dev, "Interrupt storm detected, using polling.\n");
> +}
> +
>  int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
>  		      const struct tpm_tis_phy_ops *phy_ops,
>  		      acpi_handle acpi_dev_handle)
> @@ -1003,6 +1027,9 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
>  	if (IS_ERR(chip))
>  		return PTR_ERR(chip);
>  
> +	priv->chip = chip;
> +	INIT_WORK(&priv->storm_work, tpm_tis_storm_work);
> +
>  #ifdef CONFIG_ACPI
>  	chip->acpi_dev_handle = acpi_dev_handle;
>  #endif
> diff --git a/drivers/char/tpm/tpm_tis_core.h b/drivers/char/tpm/tpm_tis_core.h
> index edeb5dc61c95..5630f294dc0c 100644
> --- a/drivers/char/tpm/tpm_tis_core.h
> +++ b/drivers/char/tpm/tpm_tis_core.h
> @@ -95,6 +95,8 @@ struct tpm_tis_data {
>  	u16 clkrun_enabled;
>  	wait_queue_head_t int_queue;
>  	wait_queue_head_t read_queue;
> +	struct work_struct storm_work;
> +	struct tpm_chip *chip;
>  	const struct tpm_tis_phy_ops *phy_ops;
>  	unsigned short rng_quality;
>  };

I've tested this with the Intel platform that has an Infineon chip that
I found the other week. It works, but isn't the complete fix. With this
on top of James' patchset I sometimes see the message "Lost Interrupt
waiting for TPM stat", so I guess there needs to be a check in
wait_for_tpm_stat and request_locality to see if interrupts were
disabled when the wait_event_interruptible_timeout call times out.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2] tpm_tis: Disable interrupts if interrupt storm detected
  2020-12-01  3:26   ` Jerry Snitselaar
@ 2020-12-01 19:59     ` Jerry Snitselaar
  2020-12-02 16:49       ` Jarkko Sakkinen
  0 siblings, 1 reply; 12+ messages in thread
From: Jerry Snitselaar @ 2020-12-01 19:59 UTC (permalink / raw)
  To: linux-kernel, linux-integrity
  Cc: Jarkko Sakkinen, Jason Gunthorpe, Peter Huewe, James Bottomley,
	Matthew Garrett, Hans de Goede


Jerry Snitselaar @ 2020-11-30 20:26 MST:

> Jerry Snitselaar @ 2020-11-30 19:58 MST:
>
>> When enabling the interrupt code for the tpm_tis driver we have
>> noticed some systems have a bios issue causing an interrupt storm to
>> occur. The issue isn't limited to a single tpm or system manufacturer
>> so keeping a denylist of systems with the issue isn't optimal. Instead
>> try to detect the problem occurring, disable interrupts, and revert to
>> polling when it happens.
>>
>> Cc: Jarkko Sakkinen <jarkko@kernel.org>
>> Cc: Jason Gunthorpe <jgg@ziepe.ca>
>> Cc: Peter Huewe <peterhuewe@gmx.de>
>> Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
>> Cc: Matthew Garrett <mjg59@google.com>
>> Cc: Hans de Goede <hdegoede@redhat.com>
>> Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com>
>> ---
>> v2: drop tpm_tis specific workqueue and use just system_wq
>>
>> drivers/char/tpm/tpm_tis_core.c | 27 +++++++++++++++++++++++++++
>>  drivers/char/tpm/tpm_tis_core.h |  2 ++
>>  2 files changed, 29 insertions(+)
>>
>> diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
>> index 23b60583928b..72cc8a5a152c 100644
>> --- a/drivers/char/tpm/tpm_tis_core.c
>> +++ b/drivers/char/tpm/tpm_tis_core.c
>> @@ -24,6 +24,8 @@
>>  #include <linux/wait.h>
>>  #include <linux/acpi.h>
>>  #include <linux/freezer.h>
>> +#include <linux/workqueue.h>
>> +#include <linux/kernel_stat.h>
>>  #include "tpm.h"
>>  #include "tpm_tis_core.h"
>>  
>> @@ -745,9 +747,23 @@ static irqreturn_t tis_int_handler(int dummy, void *dev_id)
>>  {
>>  	struct tpm_chip *chip = dev_id;
>>  	struct tpm_tis_data *priv = dev_get_drvdata(&chip->dev);
>> +	static bool check_storm = true;
>> +	static unsigned int check_start;
>>  	u32 interrupt;
>>  	int i, rc;
>>  
>> +	if (unlikely(check_storm)) {
>> +		if (!check_start) {
>> +			check_start = jiffies_to_msecs(jiffies);
>> +		} else if ((kstat_irqs(priv->irq) > 1000) &&
>> +			   (jiffies_to_msecs(jiffies) - check_start < 500)) {
>> +			check_storm = false;
>> +			schedule_work(&priv->storm_work);
>> +		} else if (jiffies_to_msecs(jiffies) - check_start >= 500) {
>> +			check_storm = false;
>> +		}
>> +	}
>> +
>>  	rc = tpm_tis_read32(priv, TPM_INT_STATUS(priv->locality), &interrupt);
>>  	if (rc < 0)
>>  		return IRQ_NONE;
>> @@ -987,6 +1003,14 @@ static const struct tpm_class_ops tpm_tis = {
>>  	.clk_enable = tpm_tis_clkrun_enable,
>>  };
>>  
>> +static void tpm_tis_storm_work(struct work_struct *work)
>> +{
>> +	struct tpm_tis_data *priv = container_of(work, struct tpm_tis_data, storm_work);
>> +
>> +	disable_interrupts(priv->chip);
>> +	dev_warn(&priv->chip->dev, "Interrupt storm detected, using polling.\n");
>> +}
>> +
>>  int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
>>  		      const struct tpm_tis_phy_ops *phy_ops,
>>  		      acpi_handle acpi_dev_handle)
>> @@ -1003,6 +1027,9 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
>>  	if (IS_ERR(chip))
>>  		return PTR_ERR(chip);
>>  
>> +	priv->chip = chip;
>> +	INIT_WORK(&priv->storm_work, tpm_tis_storm_work);
>> +
>>  #ifdef CONFIG_ACPI
>>  	chip->acpi_dev_handle = acpi_dev_handle;
>>  #endif
>> diff --git a/drivers/char/tpm/tpm_tis_core.h b/drivers/char/tpm/tpm_tis_core.h
>> index edeb5dc61c95..5630f294dc0c 100644
>> --- a/drivers/char/tpm/tpm_tis_core.h
>> +++ b/drivers/char/tpm/tpm_tis_core.h
>> @@ -95,6 +95,8 @@ struct tpm_tis_data {
>>  	u16 clkrun_enabled;
>>  	wait_queue_head_t int_queue;
>>  	wait_queue_head_t read_queue;
>> +	struct work_struct storm_work;
>> +	struct tpm_chip *chip;
>>  	const struct tpm_tis_phy_ops *phy_ops;
>>  	unsigned short rng_quality;
>>  };
>
> I've tested this with the Intel platform that has an Infineon chip that
> I found the other week. It works, but isn't the complete fix. With this
> on top of James' patchset I sometimes see the message "Lost Interrupt
> waiting for TPM stat", so I guess there needs to be a check in
> wait_for_tpm_stat and request_locality to see if interrupts were
> disabled when the wait_event_interruptible_timeout call times out.

As kernel test robot pointed out. kstat_irqs isn't visible when tpm_tis
builds as a module. It looks like it is only called by kstat_irq_usrs,
and that is only by the fs/proc code. I have a patch to export it, but
the i915 driver open codes their own version instead of using it. Is
there any reason not to export it?


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2] tpm_tis: Disable interrupts if interrupt storm detected
  2020-12-01 19:59     ` Jerry Snitselaar
@ 2020-12-02 16:49       ` Jarkko Sakkinen
  2020-12-03  0:02         ` Jerry Snitselaar
  0 siblings, 1 reply; 12+ messages in thread
From: Jarkko Sakkinen @ 2020-12-02 16:49 UTC (permalink / raw)
  To: Jerry Snitselaar, Jani Nikula
  Cc: linux-kernel, linux-integrity, Jason Gunthorpe, Peter Huewe,
	James Bottomley, Matthew Garrett, Hans de Goede

On Tue, Dec 01, 2020 at 12:59:23PM -0700, Jerry Snitselaar wrote:
> 
> Jerry Snitselaar @ 2020-11-30 20:26 MST:
> 
> > Jerry Snitselaar @ 2020-11-30 19:58 MST:
> >
> >> When enabling the interrupt code for the tpm_tis driver we have
> >> noticed some systems have a bios issue causing an interrupt storm to
> >> occur. The issue isn't limited to a single tpm or system manufacturer
> >> so keeping a denylist of systems with the issue isn't optimal. Instead
> >> try to detect the problem occurring, disable interrupts, and revert to
> >> polling when it happens.
> >>
> >> Cc: Jarkko Sakkinen <jarkko@kernel.org>
> >> Cc: Jason Gunthorpe <jgg@ziepe.ca>
> >> Cc: Peter Huewe <peterhuewe@gmx.de>
> >> Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
> >> Cc: Matthew Garrett <mjg59@google.com>
> >> Cc: Hans de Goede <hdegoede@redhat.com>
> >> Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com>
> >> ---
> >> v2: drop tpm_tis specific workqueue and use just system_wq
> >>
> >> drivers/char/tpm/tpm_tis_core.c | 27 +++++++++++++++++++++++++++
> >>  drivers/char/tpm/tpm_tis_core.h |  2 ++
> >>  2 files changed, 29 insertions(+)
> >>
> >> diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
> >> index 23b60583928b..72cc8a5a152c 100644
> >> --- a/drivers/char/tpm/tpm_tis_core.c
> >> +++ b/drivers/char/tpm/tpm_tis_core.c
> >> @@ -24,6 +24,8 @@
> >>  #include <linux/wait.h>
> >>  #include <linux/acpi.h>
> >>  #include <linux/freezer.h>
> >> +#include <linux/workqueue.h>
> >> +#include <linux/kernel_stat.h>
> >>  #include "tpm.h"
> >>  #include "tpm_tis_core.h"
> >>  
> >> @@ -745,9 +747,23 @@ static irqreturn_t tis_int_handler(int dummy, void *dev_id)
> >>  {
> >>  	struct tpm_chip *chip = dev_id;
> >>  	struct tpm_tis_data *priv = dev_get_drvdata(&chip->dev);
> >> +	static bool check_storm = true;
> >> +	static unsigned int check_start;
> >>  	u32 interrupt;
> >>  	int i, rc;
> >>  
> >> +	if (unlikely(check_storm)) {
> >> +		if (!check_start) {
> >> +			check_start = jiffies_to_msecs(jiffies);
> >> +		} else if ((kstat_irqs(priv->irq) > 1000) &&
> >> +			   (jiffies_to_msecs(jiffies) - check_start < 500)) {
> >> +			check_storm = false;
> >> +			schedule_work(&priv->storm_work);
> >> +		} else if (jiffies_to_msecs(jiffies) - check_start >= 500) {
> >> +			check_storm = false;
> >> +		}
> >> +	}
> >> +
> >>  	rc = tpm_tis_read32(priv, TPM_INT_STATUS(priv->locality), &interrupt);
> >>  	if (rc < 0)
> >>  		return IRQ_NONE;
> >> @@ -987,6 +1003,14 @@ static const struct tpm_class_ops tpm_tis = {
> >>  	.clk_enable = tpm_tis_clkrun_enable,
> >>  };
> >>  
> >> +static void tpm_tis_storm_work(struct work_struct *work)
> >> +{
> >> +	struct tpm_tis_data *priv = container_of(work, struct tpm_tis_data, storm_work);
> >> +
> >> +	disable_interrupts(priv->chip);
> >> +	dev_warn(&priv->chip->dev, "Interrupt storm detected, using polling.\n");
> >> +}
> >> +
> >>  int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
> >>  		      const struct tpm_tis_phy_ops *phy_ops,
> >>  		      acpi_handle acpi_dev_handle)
> >> @@ -1003,6 +1027,9 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
> >>  	if (IS_ERR(chip))
> >>  		return PTR_ERR(chip);
> >>  
> >> +	priv->chip = chip;
> >> +	INIT_WORK(&priv->storm_work, tpm_tis_storm_work);
> >> +
> >>  #ifdef CONFIG_ACPI
> >>  	chip->acpi_dev_handle = acpi_dev_handle;
> >>  #endif
> >> diff --git a/drivers/char/tpm/tpm_tis_core.h b/drivers/char/tpm/tpm_tis_core.h
> >> index edeb5dc61c95..5630f294dc0c 100644
> >> --- a/drivers/char/tpm/tpm_tis_core.h
> >> +++ b/drivers/char/tpm/tpm_tis_core.h
> >> @@ -95,6 +95,8 @@ struct tpm_tis_data {
> >>  	u16 clkrun_enabled;
> >>  	wait_queue_head_t int_queue;
> >>  	wait_queue_head_t read_queue;
> >> +	struct work_struct storm_work;
> >> +	struct tpm_chip *chip;
> >>  	const struct tpm_tis_phy_ops *phy_ops;
> >>  	unsigned short rng_quality;
> >>  };
> >
> > I've tested this with the Intel platform that has an Infineon chip that
> > I found the other week. It works, but isn't the complete fix. With this
> > on top of James' patchset I sometimes see the message "Lost Interrupt
> > waiting for TPM stat", so I guess there needs to be a check in
> > wait_for_tpm_stat and request_locality to see if interrupts were
> > disabled when the wait_event_interruptible_timeout call times out.
> 
> As kernel test robot pointed out. kstat_irqs isn't visible when tpm_tis
> builds as a module. It looks like it is only called by kstat_irq_usrs,
> and that is only by the fs/proc code. I have a patch to export it, but
> the i915 driver open codes their own version instead of using it. Is
> there any reason not to export it?

If you add a patch that exports it, then for coherency it'd be better to
also patch i915 driver. Jani?

/Jarkko

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2] tpm_tis: Disable interrupts if interrupt storm detected
  2020-12-02 16:49       ` Jarkko Sakkinen
@ 2020-12-03  0:02         ` Jerry Snitselaar
  2020-12-03  6:11           ` Jerry Snitselaar
  0 siblings, 1 reply; 12+ messages in thread
From: Jerry Snitselaar @ 2020-12-03  0:02 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Jani Nikula, linux-kernel, linux-integrity, Jason Gunthorpe,
	Peter Huewe, James Bottomley, Matthew Garrett, Hans de Goede


Jarkko Sakkinen @ 2020-12-02 09:49 MST:

> On Tue, Dec 01, 2020 at 12:59:23PM -0700, Jerry Snitselaar wrote:
>> 
>> Jerry Snitselaar @ 2020-11-30 20:26 MST:
>> 
>> > Jerry Snitselaar @ 2020-11-30 19:58 MST:
>> >
>> >> When enabling the interrupt code for the tpm_tis driver we have
>> >> noticed some systems have a bios issue causing an interrupt storm to
>> >> occur. The issue isn't limited to a single tpm or system manufacturer
>> >> so keeping a denylist of systems with the issue isn't optimal. Instead
>> >> try to detect the problem occurring, disable interrupts, and revert to
>> >> polling when it happens.
>> >>
>> >> Cc: Jarkko Sakkinen <jarkko@kernel.org>
>> >> Cc: Jason Gunthorpe <jgg@ziepe.ca>
>> >> Cc: Peter Huewe <peterhuewe@gmx.de>
>> >> Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
>> >> Cc: Matthew Garrett <mjg59@google.com>
>> >> Cc: Hans de Goede <hdegoede@redhat.com>
>> >> Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com>
>> >> ---
>> >> v2: drop tpm_tis specific workqueue and use just system_wq
>> >>
>> >> drivers/char/tpm/tpm_tis_core.c | 27 +++++++++++++++++++++++++++
>> >>  drivers/char/tpm/tpm_tis_core.h |  2 ++
>> >>  2 files changed, 29 insertions(+)
>> >>
>> >> diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
>> >> index 23b60583928b..72cc8a5a152c 100644
>> >> --- a/drivers/char/tpm/tpm_tis_core.c
>> >> +++ b/drivers/char/tpm/tpm_tis_core.c
>> >> @@ -24,6 +24,8 @@
>> >>  #include <linux/wait.h>
>> >>  #include <linux/acpi.h>
>> >>  #include <linux/freezer.h>
>> >> +#include <linux/workqueue.h>
>> >> +#include <linux/kernel_stat.h>
>> >>  #include "tpm.h"
>> >>  #include "tpm_tis_core.h"
>> >>  
>> >> @@ -745,9 +747,23 @@ static irqreturn_t tis_int_handler(int dummy, void *dev_id)
>> >>  {
>> >>  	struct tpm_chip *chip = dev_id;
>> >>  	struct tpm_tis_data *priv = dev_get_drvdata(&chip->dev);
>> >> +	static bool check_storm = true;
>> >> +	static unsigned int check_start;
>> >>  	u32 interrupt;
>> >>  	int i, rc;
>> >>  
>> >> +	if (unlikely(check_storm)) {
>> >> +		if (!check_start) {
>> >> +			check_start = jiffies_to_msecs(jiffies);
>> >> +		} else if ((kstat_irqs(priv->irq) > 1000) &&
>> >> +			   (jiffies_to_msecs(jiffies) - check_start < 500)) {
>> >> +			check_storm = false;
>> >> +			schedule_work(&priv->storm_work);
>> >> +		} else if (jiffies_to_msecs(jiffies) - check_start >= 500) {
>> >> +			check_storm = false;
>> >> +		}
>> >> +	}
>> >> +
>> >>  	rc = tpm_tis_read32(priv, TPM_INT_STATUS(priv->locality), &interrupt);
>> >>  	if (rc < 0)
>> >>  		return IRQ_NONE;
>> >> @@ -987,6 +1003,14 @@ static const struct tpm_class_ops tpm_tis = {
>> >>  	.clk_enable = tpm_tis_clkrun_enable,
>> >>  };
>> >>  
>> >> +static void tpm_tis_storm_work(struct work_struct *work)
>> >> +{
>> >> +	struct tpm_tis_data *priv = container_of(work, struct tpm_tis_data, storm_work);
>> >> +
>> >> +	disable_interrupts(priv->chip);
>> >> +	dev_warn(&priv->chip->dev, "Interrupt storm detected, using polling.\n");
>> >> +}
>> >> +
>> >>  int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
>> >>  		      const struct tpm_tis_phy_ops *phy_ops,
>> >>  		      acpi_handle acpi_dev_handle)
>> >> @@ -1003,6 +1027,9 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
>> >>  	if (IS_ERR(chip))
>> >>  		return PTR_ERR(chip);
>> >>  
>> >> +	priv->chip = chip;
>> >> +	INIT_WORK(&priv->storm_work, tpm_tis_storm_work);
>> >> +
>> >>  #ifdef CONFIG_ACPI
>> >>  	chip->acpi_dev_handle = acpi_dev_handle;
>> >>  #endif
>> >> diff --git a/drivers/char/tpm/tpm_tis_core.h b/drivers/char/tpm/tpm_tis_core.h
>> >> index edeb5dc61c95..5630f294dc0c 100644
>> >> --- a/drivers/char/tpm/tpm_tis_core.h
>> >> +++ b/drivers/char/tpm/tpm_tis_core.h
>> >> @@ -95,6 +95,8 @@ struct tpm_tis_data {
>> >>  	u16 clkrun_enabled;
>> >>  	wait_queue_head_t int_queue;
>> >>  	wait_queue_head_t read_queue;
>> >> +	struct work_struct storm_work;
>> >> +	struct tpm_chip *chip;
>> >>  	const struct tpm_tis_phy_ops *phy_ops;
>> >>  	unsigned short rng_quality;
>> >>  };
>> >
>> > I've tested this with the Intel platform that has an Infineon chip that
>> > I found the other week. It works, but isn't the complete fix. With this
>> > on top of James' patchset I sometimes see the message "Lost Interrupt
>> > waiting for TPM stat", so I guess there needs to be a check in
>> > wait_for_tpm_stat and request_locality to see if interrupts were
>> > disabled when the wait_event_interruptible_timeout call times out.
>> 
>> As kernel test robot pointed out. kstat_irqs isn't visible when tpm_tis
>> builds as a module. It looks like it is only called by kstat_irq_usrs,
>> and that is only by the fs/proc code. I have a patch to export it, but
>> the i915 driver open codes their own version instead of using it. Is
>> there any reason not to export it?
>
> If you add a patch that exports it, then for coherency it'd be better to
> also patch i915 driver. Jani?
>
> /Jarkko

It looks like this might not solve all cases. I'm having Lenovo test
another build to make sure I gave them the right code, but they reported
with the L490 that the system hangs right when it is initializing
tpm_tis. I'm working on getting a build on the T490s I have to try there
as well. With the Intel system it spits out that it detects the
interrupt storm, and continues on its way.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2] tpm_tis: Disable interrupts if interrupt storm detected
  2020-12-03  0:02         ` Jerry Snitselaar
@ 2020-12-03  6:11           ` Jerry Snitselaar
  2020-12-03 20:14             ` Jerry Snitselaar
  2020-12-04  1:59             ` Jarkko Sakkinen
  0 siblings, 2 replies; 12+ messages in thread
From: Jerry Snitselaar @ 2020-12-03  6:11 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Jani Nikula, linux-kernel, linux-integrity, Jason Gunthorpe,
	Peter Huewe, James Bottomley, Matthew Garrett, Hans de Goede


Jerry Snitselaar @ 2020-12-02 17:02 MST:

> Jarkko Sakkinen @ 2020-12-02 09:49 MST:
>
>> On Tue, Dec 01, 2020 at 12:59:23PM -0700, Jerry Snitselaar wrote:
>>> 
>>> Jerry Snitselaar @ 2020-11-30 20:26 MST:
>>> 
>>> > Jerry Snitselaar @ 2020-11-30 19:58 MST:
>>> >
>>> >> When enabling the interrupt code for the tpm_tis driver we have
>>> >> noticed some systems have a bios issue causing an interrupt storm to
>>> >> occur. The issue isn't limited to a single tpm or system manufacturer
>>> >> so keeping a denylist of systems with the issue isn't optimal. Instead
>>> >> try to detect the problem occurring, disable interrupts, and revert to
>>> >> polling when it happens.
>>> >>
>>> >> Cc: Jarkko Sakkinen <jarkko@kernel.org>
>>> >> Cc: Jason Gunthorpe <jgg@ziepe.ca>
>>> >> Cc: Peter Huewe <peterhuewe@gmx.de>
>>> >> Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
>>> >> Cc: Matthew Garrett <mjg59@google.com>
>>> >> Cc: Hans de Goede <hdegoede@redhat.com>
>>> >> Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com>
>>> >> ---
>>> >> v2: drop tpm_tis specific workqueue and use just system_wq
>>> >>
>>> >> drivers/char/tpm/tpm_tis_core.c | 27 +++++++++++++++++++++++++++
>>> >>  drivers/char/tpm/tpm_tis_core.h |  2 ++
>>> >>  2 files changed, 29 insertions(+)
>>> >>
>>> >> diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
>>> >> index 23b60583928b..72cc8a5a152c 100644
>>> >> --- a/drivers/char/tpm/tpm_tis_core.c
>>> >> +++ b/drivers/char/tpm/tpm_tis_core.c
>>> >> @@ -24,6 +24,8 @@
>>> >>  #include <linux/wait.h>
>>> >>  #include <linux/acpi.h>
>>> >>  #include <linux/freezer.h>
>>> >> +#include <linux/workqueue.h>
>>> >> +#include <linux/kernel_stat.h>
>>> >>  #include "tpm.h"
>>> >>  #include "tpm_tis_core.h"
>>> >>  
>>> >> @@ -745,9 +747,23 @@ static irqreturn_t tis_int_handler(int dummy, void *dev_id)
>>> >>  {
>>> >>  	struct tpm_chip *chip = dev_id;
>>> >>  	struct tpm_tis_data *priv = dev_get_drvdata(&chip->dev);
>>> >> +	static bool check_storm = true;
>>> >> +	static unsigned int check_start;
>>> >>  	u32 interrupt;
>>> >>  	int i, rc;
>>> >>  
>>> >> +	if (unlikely(check_storm)) {
>>> >> +		if (!check_start) {
>>> >> +			check_start = jiffies_to_msecs(jiffies);
>>> >> +		} else if ((kstat_irqs(priv->irq) > 1000) &&
>>> >> +			   (jiffies_to_msecs(jiffies) - check_start < 500)) {
>>> >> +			check_storm = false;
>>> >> +			schedule_work(&priv->storm_work);
>>> >> +		} else if (jiffies_to_msecs(jiffies) - check_start >= 500) {
>>> >> +			check_storm = false;
>>> >> +		}
>>> >> +	}
>>> >> +
>>> >>  	rc = tpm_tis_read32(priv, TPM_INT_STATUS(priv->locality), &interrupt);
>>> >>  	if (rc < 0)
>>> >>  		return IRQ_NONE;
>>> >> @@ -987,6 +1003,14 @@ static const struct tpm_class_ops tpm_tis = {
>>> >>  	.clk_enable = tpm_tis_clkrun_enable,
>>> >>  };
>>> >>  
>>> >> +static void tpm_tis_storm_work(struct work_struct *work)
>>> >> +{
>>> >> +	struct tpm_tis_data *priv = container_of(work, struct tpm_tis_data, storm_work);
>>> >> +
>>> >> +	disable_interrupts(priv->chip);
>>> >> +	dev_warn(&priv->chip->dev, "Interrupt storm detected, using polling.\n");
>>> >> +}
>>> >> +
>>> >>  int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
>>> >>  		      const struct tpm_tis_phy_ops *phy_ops,
>>> >>  		      acpi_handle acpi_dev_handle)
>>> >> @@ -1003,6 +1027,9 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
>>> >>  	if (IS_ERR(chip))
>>> >>  		return PTR_ERR(chip);
>>> >>  
>>> >> +	priv->chip = chip;
>>> >> +	INIT_WORK(&priv->storm_work, tpm_tis_storm_work);
>>> >> +
>>> >>  #ifdef CONFIG_ACPI
>>> >>  	chip->acpi_dev_handle = acpi_dev_handle;
>>> >>  #endif
>>> >> diff --git a/drivers/char/tpm/tpm_tis_core.h b/drivers/char/tpm/tpm_tis_core.h
>>> >> index edeb5dc61c95..5630f294dc0c 100644
>>> >> --- a/drivers/char/tpm/tpm_tis_core.h
>>> >> +++ b/drivers/char/tpm/tpm_tis_core.h
>>> >> @@ -95,6 +95,8 @@ struct tpm_tis_data {
>>> >>  	u16 clkrun_enabled;
>>> >>  	wait_queue_head_t int_queue;
>>> >>  	wait_queue_head_t read_queue;
>>> >> +	struct work_struct storm_work;
>>> >> +	struct tpm_chip *chip;
>>> >>  	const struct tpm_tis_phy_ops *phy_ops;
>>> >>  	unsigned short rng_quality;
>>> >>  };
>>> >
>>> > I've tested this with the Intel platform that has an Infineon chip that
>>> > I found the other week. It works, but isn't the complete fix. With this
>>> > on top of James' patchset I sometimes see the message "Lost Interrupt
>>> > waiting for TPM stat", so I guess there needs to be a check in
>>> > wait_for_tpm_stat and request_locality to see if interrupts were
>>> > disabled when the wait_event_interruptible_timeout call times out.
>>> 
>>> As kernel test robot pointed out. kstat_irqs isn't visible when tpm_tis
>>> builds as a module. It looks like it is only called by kstat_irq_usrs,
>>> and that is only by the fs/proc code. I have a patch to export it, but
>>> the i915 driver open codes their own version instead of using it. Is
>>> there any reason not to export it?
>>
>> If you add a patch that exports it, then for coherency it'd be better to
>> also patch i915 driver. Jani?
>>
>> /Jarkko
>
> It looks like this might not solve all cases. I'm having Lenovo test
> another build to make sure I gave them the right code, but they reported
> with the L490 that the system hangs right when it is initializing
> tpm_tis. I'm working on getting a build on the T490s I have to try there
> as well. With the Intel system it spits out that it detects the
> interrupt storm, and continues on its way.

The interrupt storm detection code works on the T490s. I'm not sure what
is going on with the L490. I will see if I can get access to one.

Jerry


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2] tpm_tis: Disable interrupts if interrupt storm detected
  2020-12-03  6:11           ` Jerry Snitselaar
@ 2020-12-03 20:14             ` Jerry Snitselaar
  2020-12-03 21:05               ` James Bottomley
  2020-12-04  1:59             ` Jarkko Sakkinen
  1 sibling, 1 reply; 12+ messages in thread
From: Jerry Snitselaar @ 2020-12-03 20:14 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Jani Nikula, linux-kernel, linux-integrity, Jason Gunthorpe,
	Peter Huewe, James Bottomley, Matthew Garrett, Hans de Goede


Jerry Snitselaar @ 2020-12-02 23:11 MST:

> Jerry Snitselaar @ 2020-12-02 17:02 MST:
>
>> Jarkko Sakkinen @ 2020-12-02 09:49 MST:
>>
>>> On Tue, Dec 01, 2020 at 12:59:23PM -0700, Jerry Snitselaar wrote:
>>>> 
>>>> Jerry Snitselaar @ 2020-11-30 20:26 MST:
>>>> 
>>>> > Jerry Snitselaar @ 2020-11-30 19:58 MST:
>>>> >
>>>> >> When enabling the interrupt code for the tpm_tis driver we have
>>>> >> noticed some systems have a bios issue causing an interrupt storm to
>>>> >> occur. The issue isn't limited to a single tpm or system manufacturer
>>>> >> so keeping a denylist of systems with the issue isn't optimal. Instead
>>>> >> try to detect the problem occurring, disable interrupts, and revert to
>>>> >> polling when it happens.
>>>> >>
>>>> >> Cc: Jarkko Sakkinen <jarkko@kernel.org>
>>>> >> Cc: Jason Gunthorpe <jgg@ziepe.ca>
>>>> >> Cc: Peter Huewe <peterhuewe@gmx.de>
>>>> >> Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
>>>> >> Cc: Matthew Garrett <mjg59@google.com>
>>>> >> Cc: Hans de Goede <hdegoede@redhat.com>
>>>> >> Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com>
>>>> >> ---
>>>> >> v2: drop tpm_tis specific workqueue and use just system_wq
>>>> >>
>>>> >> drivers/char/tpm/tpm_tis_core.c | 27 +++++++++++++++++++++++++++
>>>> >>  drivers/char/tpm/tpm_tis_core.h |  2 ++
>>>> >>  2 files changed, 29 insertions(+)
>>>> >>
>>>> >> diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
>>>> >> index 23b60583928b..72cc8a5a152c 100644
>>>> >> --- a/drivers/char/tpm/tpm_tis_core.c
>>>> >> +++ b/drivers/char/tpm/tpm_tis_core.c
>>>> >> @@ -24,6 +24,8 @@
>>>> >>  #include <linux/wait.h>
>>>> >>  #include <linux/acpi.h>
>>>> >>  #include <linux/freezer.h>
>>>> >> +#include <linux/workqueue.h>
>>>> >> +#include <linux/kernel_stat.h>
>>>> >>  #include "tpm.h"
>>>> >>  #include "tpm_tis_core.h"
>>>> >>  
>>>> >> @@ -745,9 +747,23 @@ static irqreturn_t tis_int_handler(int dummy, void *dev_id)
>>>> >>  {
>>>> >>  	struct tpm_chip *chip = dev_id;
>>>> >>  	struct tpm_tis_data *priv = dev_get_drvdata(&chip->dev);
>>>> >> +	static bool check_storm = true;
>>>> >> +	static unsigned int check_start;
>>>> >>  	u32 interrupt;
>>>> >>  	int i, rc;
>>>> >>  
>>>> >> +	if (unlikely(check_storm)) {
>>>> >> +		if (!check_start) {
>>>> >> +			check_start = jiffies_to_msecs(jiffies);
>>>> >> +		} else if ((kstat_irqs(priv->irq) > 1000) &&
>>>> >> +			   (jiffies_to_msecs(jiffies) - check_start < 500)) {
>>>> >> +			check_storm = false;
>>>> >> +			schedule_work(&priv->storm_work);
>>>> >> +		} else if (jiffies_to_msecs(jiffies) - check_start >= 500) {
>>>> >> +			check_storm = false;
>>>> >> +		}
>>>> >> +	}
>>>> >> +
>>>> >>  	rc = tpm_tis_read32(priv, TPM_INT_STATUS(priv->locality), &interrupt);
>>>> >>  	if (rc < 0)
>>>> >>  		return IRQ_NONE;
>>>> >> @@ -987,6 +1003,14 @@ static const struct tpm_class_ops tpm_tis = {
>>>> >>  	.clk_enable = tpm_tis_clkrun_enable,
>>>> >>  };
>>>> >>  
>>>> >> +static void tpm_tis_storm_work(struct work_struct *work)
>>>> >> +{
>>>> >> +	struct tpm_tis_data *priv = container_of(work, struct tpm_tis_data, storm_work);
>>>> >> +
>>>> >> +	disable_interrupts(priv->chip);
>>>> >> +	dev_warn(&priv->chip->dev, "Interrupt storm detected, using polling.\n");
>>>> >> +}
>>>> >> +
>>>> >>  int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
>>>> >>  		      const struct tpm_tis_phy_ops *phy_ops,
>>>> >>  		      acpi_handle acpi_dev_handle)
>>>> >> @@ -1003,6 +1027,9 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
>>>> >>  	if (IS_ERR(chip))
>>>> >>  		return PTR_ERR(chip);
>>>> >>  
>>>> >> +	priv->chip = chip;
>>>> >> +	INIT_WORK(&priv->storm_work, tpm_tis_storm_work);
>>>> >> +
>>>> >>  #ifdef CONFIG_ACPI
>>>> >>  	chip->acpi_dev_handle = acpi_dev_handle;
>>>> >>  #endif
>>>> >> diff --git a/drivers/char/tpm/tpm_tis_core.h b/drivers/char/tpm/tpm_tis_core.h
>>>> >> index edeb5dc61c95..5630f294dc0c 100644
>>>> >> --- a/drivers/char/tpm/tpm_tis_core.h
>>>> >> +++ b/drivers/char/tpm/tpm_tis_core.h
>>>> >> @@ -95,6 +95,8 @@ struct tpm_tis_data {
>>>> >>  	u16 clkrun_enabled;
>>>> >>  	wait_queue_head_t int_queue;
>>>> >>  	wait_queue_head_t read_queue;
>>>> >> +	struct work_struct storm_work;
>>>> >> +	struct tpm_chip *chip;
>>>> >>  	const struct tpm_tis_phy_ops *phy_ops;
>>>> >>  	unsigned short rng_quality;
>>>> >>  };
>>>> >
>>>> > I've tested this with the Intel platform that has an Infineon chip that
>>>> > I found the other week. It works, but isn't the complete fix. With this
>>>> > on top of James' patchset I sometimes see the message "Lost Interrupt
>>>> > waiting for TPM stat", so I guess there needs to be a check in
>>>> > wait_for_tpm_stat and request_locality to see if interrupts were
>>>> > disabled when the wait_event_interruptible_timeout call times out.
>>>> 
>>>> As kernel test robot pointed out. kstat_irqs isn't visible when tpm_tis
>>>> builds as a module. It looks like it is only called by kstat_irq_usrs,
>>>> and that is only by the fs/proc code. I have a patch to export it, but
>>>> the i915 driver open codes their own version instead of using it. Is
>>>> there any reason not to export it?
>>>
>>> If you add a patch that exports it, then for coherency it'd be better to
>>> also patch i915 driver. Jani?
>>>
>>> /Jarkko
>>
>> It looks like this might not solve all cases. I'm having Lenovo test
>> another build to make sure I gave them the right code, but they reported
>> with the L490 that the system hangs right when it is initializing
>> tpm_tis. I'm working on getting a build on the T490s I have to try there
>> as well. With the Intel system it spits out that it detects the
>> interrupt storm, and continues on its way.
>
> The interrupt storm detection code works on the T490s. I'm not sure what
> is going on with the L490. I will see if I can get access to one.
>
> Jerry

Lenovo verified that the L490 hangs.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2] tpm_tis: Disable interrupts if interrupt storm detected
  2020-12-03 20:14             ` Jerry Snitselaar
@ 2020-12-03 21:05               ` James Bottomley
  2020-12-04 21:51                 ` Jerry Snitselaar
  0 siblings, 1 reply; 12+ messages in thread
From: James Bottomley @ 2020-12-03 21:05 UTC (permalink / raw)
  To: Jerry Snitselaar, Jarkko Sakkinen
  Cc: Jani Nikula, linux-kernel, linux-integrity, Jason Gunthorpe,
	Peter Huewe, Matthew Garrett, Hans de Goede

On Thu, 2020-12-03 at 13:14 -0700, Jerry Snitselaar wrote:
> Jerry Snitselaar @ 2020-12-02 23:11 MST:
[...]
> > The interrupt storm detection code works on the T490s. I'm not sure
> > what is going on with the L490. I will see if I can get access to
> > one.
> > 
> > Jerry
> 
> Lenovo verified that the L490 hangs.

Just to confirm, that's this system:

https://www.lenovo.com/us/en/laptops/thinkpad/thinkpad-l/ThinkPad-L490/p/22TP2TBL490

We could ask if lenovo will give us one, but if not we could pull a
Jens.  [the backstory is that when Jens was doing queueing in the block
layer, there were lots of SATA devices that didn't work quite right but
you couldn't tell unless you actually tried them out.  Getting
manufacturers to send samples is rather arduous, so he took to ordering
them online, testing them out, and then returning them for a full
refund within the allowed window]

It looks like Lenovo has a nice christmas returns policy:

https://www.lenovo.com/us/en/shopping-faq/#returns

James



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2] tpm_tis: Disable interrupts if interrupt storm detected
  2020-12-03  6:11           ` Jerry Snitselaar
  2020-12-03 20:14             ` Jerry Snitselaar
@ 2020-12-04  1:59             ` Jarkko Sakkinen
  1 sibling, 0 replies; 12+ messages in thread
From: Jarkko Sakkinen @ 2020-12-04  1:59 UTC (permalink / raw)
  To: Jerry Snitselaar
  Cc: Jani Nikula, linux-kernel, linux-integrity, Jason Gunthorpe,
	Peter Huewe, James Bottomley, Matthew Garrett, Hans de Goede

On Wed, Dec 02, 2020 at 11:11:41PM -0700, Jerry Snitselaar wrote:
> 
> Jerry Snitselaar @ 2020-12-02 17:02 MST:
> 
> > Jarkko Sakkinen @ 2020-12-02 09:49 MST:
> >
> >> On Tue, Dec 01, 2020 at 12:59:23PM -0700, Jerry Snitselaar wrote:
> >>> 
> >>> Jerry Snitselaar @ 2020-11-30 20:26 MST:
> >>> 
> >>> > Jerry Snitselaar @ 2020-11-30 19:58 MST:
> >>> >
> >>> >> When enabling the interrupt code for the tpm_tis driver we have
> >>> >> noticed some systems have a bios issue causing an interrupt storm to
> >>> >> occur. The issue isn't limited to a single tpm or system manufacturer
> >>> >> so keeping a denylist of systems with the issue isn't optimal. Instead
> >>> >> try to detect the problem occurring, disable interrupts, and revert to
> >>> >> polling when it happens.
> >>> >>
> >>> >> Cc: Jarkko Sakkinen <jarkko@kernel.org>
> >>> >> Cc: Jason Gunthorpe <jgg@ziepe.ca>
> >>> >> Cc: Peter Huewe <peterhuewe@gmx.de>
> >>> >> Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
> >>> >> Cc: Matthew Garrett <mjg59@google.com>
> >>> >> Cc: Hans de Goede <hdegoede@redhat.com>
> >>> >> Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com>
> >>> >> ---
> >>> >> v2: drop tpm_tis specific workqueue and use just system_wq
> >>> >>
> >>> >> drivers/char/tpm/tpm_tis_core.c | 27 +++++++++++++++++++++++++++
> >>> >>  drivers/char/tpm/tpm_tis_core.h |  2 ++
> >>> >>  2 files changed, 29 insertions(+)
> >>> >>
> >>> >> diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
> >>> >> index 23b60583928b..72cc8a5a152c 100644
> >>> >> --- a/drivers/char/tpm/tpm_tis_core.c
> >>> >> +++ b/drivers/char/tpm/tpm_tis_core.c
> >>> >> @@ -24,6 +24,8 @@
> >>> >>  #include <linux/wait.h>
> >>> >>  #include <linux/acpi.h>
> >>> >>  #include <linux/freezer.h>
> >>> >> +#include <linux/workqueue.h>
> >>> >> +#include <linux/kernel_stat.h>
> >>> >>  #include "tpm.h"
> >>> >>  #include "tpm_tis_core.h"
> >>> >>  
> >>> >> @@ -745,9 +747,23 @@ static irqreturn_t tis_int_handler(int dummy, void *dev_id)
> >>> >>  {
> >>> >>  	struct tpm_chip *chip = dev_id;
> >>> >>  	struct tpm_tis_data *priv = dev_get_drvdata(&chip->dev);
> >>> >> +	static bool check_storm = true;
> >>> >> +	static unsigned int check_start;
> >>> >>  	u32 interrupt;
> >>> >>  	int i, rc;
> >>> >>  
> >>> >> +	if (unlikely(check_storm)) {
> >>> >> +		if (!check_start) {
> >>> >> +			check_start = jiffies_to_msecs(jiffies);
> >>> >> +		} else if ((kstat_irqs(priv->irq) > 1000) &&
> >>> >> +			   (jiffies_to_msecs(jiffies) - check_start < 500)) {
> >>> >> +			check_storm = false;
> >>> >> +			schedule_work(&priv->storm_work);
> >>> >> +		} else if (jiffies_to_msecs(jiffies) - check_start >= 500) {
> >>> >> +			check_storm = false;
> >>> >> +		}
> >>> >> +	}
> >>> >> +
> >>> >>  	rc = tpm_tis_read32(priv, TPM_INT_STATUS(priv->locality), &interrupt);
> >>> >>  	if (rc < 0)
> >>> >>  		return IRQ_NONE;
> >>> >> @@ -987,6 +1003,14 @@ static const struct tpm_class_ops tpm_tis = {
> >>> >>  	.clk_enable = tpm_tis_clkrun_enable,
> >>> >>  };
> >>> >>  
> >>> >> +static void tpm_tis_storm_work(struct work_struct *work)
> >>> >> +{
> >>> >> +	struct tpm_tis_data *priv = container_of(work, struct tpm_tis_data, storm_work);
> >>> >> +
> >>> >> +	disable_interrupts(priv->chip);
> >>> >> +	dev_warn(&priv->chip->dev, "Interrupt storm detected, using polling.\n");
> >>> >> +}
> >>> >> +
> >>> >>  int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
> >>> >>  		      const struct tpm_tis_phy_ops *phy_ops,
> >>> >>  		      acpi_handle acpi_dev_handle)
> >>> >> @@ -1003,6 +1027,9 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
> >>> >>  	if (IS_ERR(chip))
> >>> >>  		return PTR_ERR(chip);
> >>> >>  
> >>> >> +	priv->chip = chip;
> >>> >> +	INIT_WORK(&priv->storm_work, tpm_tis_storm_work);
> >>> >> +
> >>> >>  #ifdef CONFIG_ACPI
> >>> >>  	chip->acpi_dev_handle = acpi_dev_handle;
> >>> >>  #endif
> >>> >> diff --git a/drivers/char/tpm/tpm_tis_core.h b/drivers/char/tpm/tpm_tis_core.h
> >>> >> index edeb5dc61c95..5630f294dc0c 100644
> >>> >> --- a/drivers/char/tpm/tpm_tis_core.h
> >>> >> +++ b/drivers/char/tpm/tpm_tis_core.h
> >>> >> @@ -95,6 +95,8 @@ struct tpm_tis_data {
> >>> >>  	u16 clkrun_enabled;
> >>> >>  	wait_queue_head_t int_queue;
> >>> >>  	wait_queue_head_t read_queue;
> >>> >> +	struct work_struct storm_work;
> >>> >> +	struct tpm_chip *chip;
> >>> >>  	const struct tpm_tis_phy_ops *phy_ops;
> >>> >>  	unsigned short rng_quality;
> >>> >>  };
> >>> >
> >>> > I've tested this with the Intel platform that has an Infineon chip that
> >>> > I found the other week. It works, but isn't the complete fix. With this
> >>> > on top of James' patchset I sometimes see the message "Lost Interrupt
> >>> > waiting for TPM stat", so I guess there needs to be a check in
> >>> > wait_for_tpm_stat and request_locality to see if interrupts were
> >>> > disabled when the wait_event_interruptible_timeout call times out.
> >>> 
> >>> As kernel test robot pointed out. kstat_irqs isn't visible when tpm_tis
> >>> builds as a module. It looks like it is only called by kstat_irq_usrs,
> >>> and that is only by the fs/proc code. I have a patch to export it, but
> >>> the i915 driver open codes their own version instead of using it. Is
> >>> there any reason not to export it?
> >>
> >> If you add a patch that exports it, then for coherency it'd be better to
> >> also patch i915 driver. Jani?
> >>
> >> /Jarkko
> >
> > It looks like this might not solve all cases. I'm having Lenovo test
> > another build to make sure I gave them the right code, but they reported
> > with the L490 that the system hangs right when it is initializing
> > tpm_tis. I'm working on getting a build on the T490s I have to try there
> > as well. With the Intel system it spits out that it detects the
> > interrupt storm, and continues on its way.
> 
> The interrupt storm detection code works on the T490s. I'm not sure what
> is going on with the L490. I will see if I can get access to one.

If you cannot, just send it. We can consider applying it, if it does
not make life worse everyone else, even if it turned out to have some
bad spots.

/Jarkko

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2] tpm_tis: Disable interrupts if interrupt storm detected
  2020-12-03 21:05               ` James Bottomley
@ 2020-12-04 21:51                 ` Jerry Snitselaar
  2020-12-05  0:06                   ` James Bottomley
  0 siblings, 1 reply; 12+ messages in thread
From: Jerry Snitselaar @ 2020-12-04 21:51 UTC (permalink / raw)
  To: James Bottomley
  Cc: Jarkko Sakkinen, Jani Nikula, linux-kernel, linux-integrity,
	Jason Gunthorpe, Peter Huewe, Matthew Garrett, Hans de Goede


James Bottomley @ 2020-12-03 14:05 MST:

> On Thu, 2020-12-03 at 13:14 -0700, Jerry Snitselaar wrote:
>> Jerry Snitselaar @ 2020-12-02 23:11 MST:
> [...]
>> > The interrupt storm detection code works on the T490s. I'm not sure
>> > what is going on with the L490. I will see if I can get access to
>> > one.
>> > 
>> > Jerry
>> 
>> Lenovo verified that the L490 hangs.
>
> Just to confirm, that's this system:
>
> https://www.lenovo.com/us/en/laptops/thinkpad/thinkpad-l/ThinkPad-L490/p/22TP2TBL490
>
> We could ask if lenovo will give us one, but if not we could pull a
> Jens.  [the backstory is that when Jens was doing queueing in the block
> layer, there were lots of SATA devices that didn't work quite right but
> you couldn't tell unless you actually tried them out.  Getting
> manufacturers to send samples is rather arduous, so he took to ordering
> them online, testing them out, and then returning them for a full
> refund within the allowed window]
>
> It looks like Lenovo has a nice christmas returns policy:
>
> https://www.lenovo.com/us/en/shopping-faq/#returns
>
> James

Yes, that is the one. I'm seeing if we have any located somewhere, or if
Lenovo will loan me one. I think for the time being the patch that
disabled interrupts for the T490s could be changed to it for the L490
instead. I'll post a v3 of my current patchset. It would probably make
sense for it to go in with your patches when they land.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2] tpm_tis: Disable interrupts if interrupt storm detected
  2020-12-04 21:51                 ` Jerry Snitselaar
@ 2020-12-05  0:06                   ` James Bottomley
  0 siblings, 0 replies; 12+ messages in thread
From: James Bottomley @ 2020-12-05  0:06 UTC (permalink / raw)
  To: Jerry Snitselaar
  Cc: Jarkko Sakkinen, Jani Nikula, linux-kernel, linux-integrity,
	Jason Gunthorpe, Peter Huewe, Matthew Garrett, Hans de Goede

On Fri, 2020-12-04 at 14:51 -0700, Jerry Snitselaar wrote:
> James Bottomley @ 2020-12-03 14:05 MST:
> 
> > On Thu, 2020-12-03 at 13:14 -0700, Jerry Snitselaar wrote:
> > > Jerry Snitselaar @ 2020-12-02 23:11 MST:
> > [...]
> > > > The interrupt storm detection code works on the T490s. I'm not
> > > > sure what is going on with the L490. I will see if I can get
> > > > access to one.
> > > > 
> > > > Jerry
> > > 
> > > Lenovo verified that the L490 hangs.
> > 
> > Just to confirm, that's this system:
> > 
> > https://www.lenovo.com/us/en/laptops/thinkpad/thinkpad-l/ThinkPad-L490/p/22TP2TBL490
> > 
> > We could ask if lenovo will give us one, but if not we could pull a
> > Jens.  [the backstory is that when Jens was doing queueing in the
> > block layer, there were lots of SATA devices that didn't work quite
> > right but you couldn't tell unless you actually tried them
> > out.  Getting manufacturers to send samples is rather arduous, so
> > he took to ordering them online, testing them out, and then
> > returning them for a full refund within the allowed window]
> > 
> > It looks like Lenovo has a nice christmas returns policy:
> > 
> > https://www.lenovo.com/us/en/shopping-faq/#returns
> > 
> > James
> 
> Yes, that is the one. I'm seeing if we have any located somewhere, or
> if Lenovo will loan me one.

OK, that would be best since it's your patch.  We've got 12 days left
to get the free returns policy, so if you haven't managed to find one
by the end of next week, I'll order one and play with it ... you could
always put your manager's credit card down for one as well ...
especially as it should get refunded ...

>  I think for the time being the patch that disabled interrupts for
> the T490s could be changed to it for the L490 instead. I'll post a v3
> of my current patchset. It would probably make sense for it to go in
> with your patches when they land.

Absolutely ... we can't turn interrupts on and then have a load of hung
laptops we already knew about ...

James



^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2020-12-05  0:07 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-30 23:23 [PATCH] tpm_tis: Disable interrupts if interrupt storm detected Jerry Snitselaar
2020-12-01  2:58 ` [PATCH v2] " Jerry Snitselaar
2020-12-01  3:26   ` Jerry Snitselaar
2020-12-01 19:59     ` Jerry Snitselaar
2020-12-02 16:49       ` Jarkko Sakkinen
2020-12-03  0:02         ` Jerry Snitselaar
2020-12-03  6:11           ` Jerry Snitselaar
2020-12-03 20:14             ` Jerry Snitselaar
2020-12-03 21:05               ` James Bottomley
2020-12-04 21:51                 ` Jerry Snitselaar
2020-12-05  0:06                   ` James Bottomley
2020-12-04  1:59             ` Jarkko Sakkinen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).