All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] Fix Atmel TPM crash caused by too frequent queries
@ 2021-06-20 23:18 Hao Wu
  2021-06-23 13:35 ` Jarkko Sakkinen
  2021-06-24  5:33 ` Hao Wu
  0 siblings, 2 replies; 47+ messages in thread
From: Hao Wu @ 2021-06-20 23:18 UTC (permalink / raw)
  To: hao.wu, shrihari.kalkar, seungyeop.han, anish.jhaveri,
	peterhuewe, jarkko, jgg, linux-integrity, pmenzel, kgold, zohar,
	why2jjj.linux, hamza, gregkh, arnd, nayna, James.Bottomley

This is a fix for the ATMEL TPM crash bug reported in
https://patchwork.kernel.org/project/linux-integrity/patch/20200926223150.109645-1-hao.wu@rubrik.com/

According to the discussions in the original thread,
we don't want to revert the timeout of wait_for_tpm_stat
for non-ATMEL chips, which brings back the performance cost.
For investigation and analysis of why wait_for_tpm_stat
caused the issue, and how the regression was introduced,
please read the original thread above.

Thus the proposed fix here is to only revert the timeout
for ATMEL chips by checking the vendor ID.

Test Plan:
- Run fixed kernel with ATMEL TPM chips and see crash
  has been fixed.
- Run fixed kernel with non-ATMEL TPM chips, and confirm
  the timeout has not been changed.
---
 drivers/char/tpm/tpm.h          |  9 ++++++++-
 drivers/char/tpm/tpm_tis_core.c | 19 +++++++++++++++++--
 include/linux/tpm.h             |  2 ++
 3 files changed, 27 insertions(+), 3 deletions(-)

diff --git a/drivers/char/tpm/tpm.h b/drivers/char/tpm/tpm.h
index 283f78211c3a..bc6aa7f9e119 100644
--- a/drivers/char/tpm/tpm.h
+++ b/drivers/char/tpm/tpm.h
@@ -42,7 +42,9 @@ enum tpm_timeout {
 	TPM_TIMEOUT_RANGE_US = 300,	/* usecs */
 	TPM_TIMEOUT_POLL = 1,	/* msecs */
 	TPM_TIMEOUT_USECS_MIN = 100,      /* usecs */
-	TPM_TIMEOUT_USECS_MAX = 500      /* usecs */
+	TPM_TIMEOUT_USECS_MAX = 500,	/* usecs */
+	TPM_TIMEOUT_WAIT_STAT = 500,	/* usecs */
+	TPM_ATML_TIMEOUT_WAIT_STAT = 15000	/* usecs */
 };
 
 /* TPM addresses */
@@ -189,6 +191,11 @@ static inline void tpm_msleep(unsigned int delay_msec)
 		     delay_msec * 1000);
 };
 
+static inline void tpm_usleep(unsigned int delay_usec)
+{
+	usleep_range(delay_usec - TPM_TIMEOUT_RANGE_US, delay_usec);
+};
+
 int tpm_chip_start(struct tpm_chip *chip);
 void tpm_chip_stop(struct tpm_chip *chip);
 struct tpm_chip *tpm_find_get_ops(struct tpm_chip *chip);
diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
index 55b9d3965ae1..9ddd4edfe1c2 100644
--- a/drivers/char/tpm/tpm_tis_core.c
+++ b/drivers/char/tpm/tpm_tis_core.c
@@ -80,8 +80,12 @@ static int wait_for_tpm_stat(struct tpm_chip *chip, u8 mask,
 		}
 	} else {
 		do {
-			usleep_range(TPM_TIMEOUT_USECS_MIN,
-				     TPM_TIMEOUT_USECS_MAX);
+			if (chip->timeout_wait_stat && 
+				chip->timeout_wait_stat >= TPM_TIMEOUT_WAIT_STAT) {
+				tpm_usleep((unsigned int)(chip->timeout_wait_stat));
+			} else {
+				tpm_usleep((unsigned int)(TPM_TIMEOUT_WAIT_STAT));
+			}
 			status = chip->ops->status(chip);
 			if ((status & mask) == mask)
 				return 0;
@@ -934,6 +938,8 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
 	chip->timeout_b = msecs_to_jiffies(TIS_TIMEOUT_B_MAX);
 	chip->timeout_c = msecs_to_jiffies(TIS_TIMEOUT_C_MAX);
 	chip->timeout_d = msecs_to_jiffies(TIS_TIMEOUT_D_MAX);
+	/* init timeout for wait_for_tpm_stat */
+	chip->timeout_wait_stat = TPM_TIMEOUT_WAIT_STAT;
 	priv->phy_ops = phy_ops;
 	dev_set_drvdata(&chip->dev, priv);
 
@@ -983,6 +989,15 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
 
 	priv->manufacturer_id = vendor;
 
+	switch (priv->manufacturer_id) {
+	case TPM_VID_ATML:
+        /* ATMEL chip needs longer timeout to avoid crash */
+		chip->timeout_wait_stat = TPM_ATML_TIMEOUT_WAIT_STAT;
+		break;
+	default:
+		chip->timeout_wait_stat = TPM_TIMEOUT_WAIT_STAT;
+	}
+
 	rc = tpm_tis_read8(priv, TPM_RID(0), &rid);
 	if (rc < 0)
 		goto out_err;
diff --git a/include/linux/tpm.h b/include/linux/tpm.h
index aa11fe323c56..35f2a0260d76 100644
--- a/include/linux/tpm.h
+++ b/include/linux/tpm.h
@@ -150,6 +150,7 @@ struct tpm_chip {
 	bool timeout_adjusted;
 	unsigned long duration[TPM_NUM_DURATIONS]; /* jiffies */
 	bool duration_adjusted;
+	unsigned long timeout_wait_stat; /* usecs */
 
 	struct dentry *bios_dir[TPM_NUM_EVENT_LOG_FILES];
 
@@ -269,6 +270,7 @@ enum tpm2_cc_attrs {
 #define TPM_VID_INTEL    0x8086
 #define TPM_VID_WINBOND  0x1050
 #define TPM_VID_STM      0x104A
+#define TPM_VID_ATML     0x1114
 
 enum tpm_chip_flags {
 	TPM_CHIP_FLAG_TPM2		= BIT(1),
-- 
2.29.0.vfs.0.0


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* Re: [PATCH] Fix Atmel TPM crash caused by too frequent queries
  2021-06-20 23:18 [PATCH] Fix Atmel TPM crash caused by too frequent queries Hao Wu
@ 2021-06-23 13:35 ` Jarkko Sakkinen
  2021-06-24  5:49   ` Hao Wu
  2021-06-24  5:33 ` Hao Wu
  1 sibling, 1 reply; 47+ messages in thread
From: Jarkko Sakkinen @ 2021-06-23 13:35 UTC (permalink / raw)
  To: Hao Wu
  Cc: shrihari.kalkar, seungyeop.han, anish.jhaveri, peterhuewe, jgg,
	linux-integrity, pmenzel, kgold, zohar, why2jjj.linux, hamza,
	gregkh, arnd, nayna, James.Bottomley

On Sun, Jun 20, 2021 at 04:18:09PM -0700, Hao Wu wrote:
> This is a fix for the ATMEL TPM crash bug reported in
> https://patchwork.kernel.org/project/linux-integrity/patch/20200926223150.109645-1-hao.wu@rubrik.com/
> 
> According to the discussions in the original thread,
> we don't want to revert the timeout of wait_for_tpm_stat
> for non-ATMEL chips, which brings back the performance cost.
> For investigation and analysis of why wait_for_tpm_stat
> caused the issue, and how the regression was introduced,
> please read the original thread above.
> 
> Thus the proposed fix here is to only revert the timeout
> for ATMEL chips by checking the vendor ID.
> 
> Test Plan:
> - Run fixed kernel with ATMEL TPM chips and see crash
>   has been fixed.
> - Run fixed kernel with non-ATMEL TPM chips, and confirm
>   the timeout has not been changed.

Please move test plan right before diffstat if you wan to include such,
so that it does not go into the commit log.


> ---
>  drivers/char/tpm/tpm.h          |  9 ++++++++-
>  drivers/char/tpm/tpm_tis_core.c | 19 +++++++++++++++++--
>  include/linux/tpm.h             |  2 ++
>  3 files changed, 27 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/char/tpm/tpm.h b/drivers/char/tpm/tpm.h
> index 283f78211c3a..bc6aa7f9e119 100644
> --- a/drivers/char/tpm/tpm.h
> +++ b/drivers/char/tpm/tpm.h
> @@ -42,7 +42,9 @@ enum tpm_timeout {
>  	TPM_TIMEOUT_RANGE_US = 300,	/* usecs */
>  	TPM_TIMEOUT_POLL = 1,	/* msecs */
>  	TPM_TIMEOUT_USECS_MIN = 100,      /* usecs */
> -	TPM_TIMEOUT_USECS_MAX = 500      /* usecs */
> +	TPM_TIMEOUT_USECS_MAX = 500,	/* usecs */
> +	TPM_TIMEOUT_WAIT_STAT = 500,	/* usecs */
> +	TPM_ATML_TIMEOUT_WAIT_STAT = 15000	/* usecs */
>  };
>  
>  /* TPM addresses */
> @@ -189,6 +191,11 @@ static inline void tpm_msleep(unsigned int delay_msec)
>  		     delay_msec * 1000);
>  };
>  
> +static inline void tpm_usleep(unsigned int delay_usec)
> +{
> +	usleep_range(delay_usec - TPM_TIMEOUT_RANGE_US, delay_usec);
> +};
> +
>  int tpm_chip_start(struct tpm_chip *chip);
>  void tpm_chip_stop(struct tpm_chip *chip);
>  struct tpm_chip *tpm_find_get_ops(struct tpm_chip *chip);
> diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
> index 55b9d3965ae1..9ddd4edfe1c2 100644
> --- a/drivers/char/tpm/tpm_tis_core.c
> +++ b/drivers/char/tpm/tpm_tis_core.c
> @@ -80,8 +80,12 @@ static int wait_for_tpm_stat(struct tpm_chip *chip, u8 mask,
>  		}
>  	} else {
>  		do {
> -			usleep_range(TPM_TIMEOUT_USECS_MIN,
> -				     TPM_TIMEOUT_USECS_MAX);
> +			if (chip->timeout_wait_stat && 
> +				chip->timeout_wait_stat >= TPM_TIMEOUT_WAIT_STAT) {
> +				tpm_usleep((unsigned int)(chip->timeout_wait_stat));
> +			} else {
> +				tpm_usleep((unsigned int)(TPM_TIMEOUT_WAIT_STAT));
> +			}
>  			status = chip->ops->status(chip);
>  			if ((status & mask) == mask)
>  				return 0;
> @@ -934,6 +938,8 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
>  	chip->timeout_b = msecs_to_jiffies(TIS_TIMEOUT_B_MAX);
>  	chip->timeout_c = msecs_to_jiffies(TIS_TIMEOUT_C_MAX);
>  	chip->timeout_d = msecs_to_jiffies(TIS_TIMEOUT_D_MAX);
> +	/* init timeout for wait_for_tpm_stat */
> +	chip->timeout_wait_stat = TPM_TIMEOUT_WAIT_STAT;
>  	priv->phy_ops = phy_ops;
>  	dev_set_drvdata(&chip->dev, priv);
>  
> @@ -983,6 +989,15 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
>  
>  	priv->manufacturer_id = vendor;
>  
> +	switch (priv->manufacturer_id) {
> +	case TPM_VID_ATML:
> +        /* ATMEL chip needs longer timeout to avoid crash */
> +		chip->timeout_wait_stat = TPM_ATML_TIMEOUT_WAIT_STAT;
> +		break;
> +	default:
> +		chip->timeout_wait_stat = TPM_TIMEOUT_WAIT_STAT;
> +	}
> +
>  	rc = tpm_tis_read8(priv, TPM_RID(0), &rid);
>  	if (rc < 0)
>  		goto out_err;
> diff --git a/include/linux/tpm.h b/include/linux/tpm.h
> index aa11fe323c56..35f2a0260d76 100644
> --- a/include/linux/tpm.h
> +++ b/include/linux/tpm.h
> @@ -150,6 +150,7 @@ struct tpm_chip {
>  	bool timeout_adjusted;
>  	unsigned long duration[TPM_NUM_DURATIONS]; /* jiffies */
>  	bool duration_adjusted;
> +	unsigned long timeout_wait_stat; /* usecs */
>  
>  	struct dentry *bios_dir[TPM_NUM_EVENT_LOG_FILES];
>  
> @@ -269,6 +270,7 @@ enum tpm2_cc_attrs {
>  #define TPM_VID_INTEL    0x8086
>  #define TPM_VID_WINBOND  0x1050
>  #define TPM_VID_STM      0x104A
> +#define TPM_VID_ATML     0x1114
>  
>  enum tpm_chip_flags {
>  	TPM_CHIP_FLAG_TPM2		= BIT(1),
> -- 
> 2.29.0.vfs.0.0
> 
> 

/Jarkko

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH] Fix Atmel TPM crash caused by too frequent queries
  2021-06-20 23:18 [PATCH] Fix Atmel TPM crash caused by too frequent queries Hao Wu
  2021-06-23 13:35 ` Jarkko Sakkinen
@ 2021-06-24  5:33 ` Hao Wu
  2021-06-29 20:07   ` Jarkko Sakkinen
  2021-06-30  4:22   ` [PATCH] tpm: fix ATMEL " Hao Wu
  1 sibling, 2 replies; 47+ messages in thread
From: Hao Wu @ 2021-06-24  5:33 UTC (permalink / raw)
  To: hao.wu, shrihari.kalkar, seungyeop.han, anish.jhaveri,
	peterhuewe, jarkko, jgg, linux-integrity, pmenzel, kgold, zohar,
	why2jjj.linux, hamza, gregkh, arnd, nayna, James.Bottomley

This is a fix for the ATMEL TPM crash bug reported in
https://patchwork.kernel.org/project/linux-integrity/patch/20200926223150.109645-1-hao.wu@rubrik.com/

According to the discussions in the original thread,
we don't want to revert the timeout of wait_for_tpm_stat
for non-ATMEL chips, which brings back the performance cost.
For investigation and analysis of why wait_for_tpm_stat
caused the issue, and how the regression was introduced,
please read the original thread above.

Thus the proposed fix here is to only revert the timeout
for ATMEL chips by checking the vendor ID.
---
 drivers/char/tpm/tpm.h          |  9 ++++++++-
 drivers/char/tpm/tpm_tis_core.c | 19 +++++++++++++++++--
 include/linux/tpm.h             |  2 ++
 3 files changed, 27 insertions(+), 3 deletions(-)

diff --git a/drivers/char/tpm/tpm.h b/drivers/char/tpm/tpm.h
index 283f78211c3a..bc6aa7f9e119 100644
--- a/drivers/char/tpm/tpm.h
+++ b/drivers/char/tpm/tpm.h
@@ -42,7 +42,9 @@ enum tpm_timeout {
 	TPM_TIMEOUT_RANGE_US = 300,	/* usecs */
 	TPM_TIMEOUT_POLL = 1,	/* msecs */
 	TPM_TIMEOUT_USECS_MIN = 100,      /* usecs */
-	TPM_TIMEOUT_USECS_MAX = 500      /* usecs */
+	TPM_TIMEOUT_USECS_MAX = 500,	/* usecs */
+	TPM_TIMEOUT_WAIT_STAT = 500,	/* usecs */
+	TPM_ATML_TIMEOUT_WAIT_STAT = 15000	/* usecs */
 };
 
 /* TPM addresses */
@@ -189,6 +191,11 @@ static inline void tpm_msleep(unsigned int delay_msec)
 		     delay_msec * 1000);
 };
 
+static inline void tpm_usleep(unsigned int delay_usec)
+{
+	usleep_range(delay_usec - TPM_TIMEOUT_RANGE_US, delay_usec);
+};
+
 int tpm_chip_start(struct tpm_chip *chip);
 void tpm_chip_stop(struct tpm_chip *chip);
 struct tpm_chip *tpm_find_get_ops(struct tpm_chip *chip);
diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
index 55b9d3965ae1..9ddd4edfe1c2 100644
--- a/drivers/char/tpm/tpm_tis_core.c
+++ b/drivers/char/tpm/tpm_tis_core.c
@@ -80,8 +80,12 @@ static int wait_for_tpm_stat(struct tpm_chip *chip, u8 mask,
 		}
 	} else {
 		do {
-			usleep_range(TPM_TIMEOUT_USECS_MIN,
-				     TPM_TIMEOUT_USECS_MAX);
+			if (chip->timeout_wait_stat && 
+				chip->timeout_wait_stat >= TPM_TIMEOUT_WAIT_STAT) {
+				tpm_usleep((unsigned int)(chip->timeout_wait_stat));
+			} else {
+				tpm_usleep((unsigned int)(TPM_TIMEOUT_WAIT_STAT));
+			}
 			status = chip->ops->status(chip);
 			if ((status & mask) == mask)
 				return 0;
@@ -934,6 +938,8 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
 	chip->timeout_b = msecs_to_jiffies(TIS_TIMEOUT_B_MAX);
 	chip->timeout_c = msecs_to_jiffies(TIS_TIMEOUT_C_MAX);
 	chip->timeout_d = msecs_to_jiffies(TIS_TIMEOUT_D_MAX);
+	/* init timeout for wait_for_tpm_stat */
+	chip->timeout_wait_stat = TPM_TIMEOUT_WAIT_STAT;
 	priv->phy_ops = phy_ops;
 	dev_set_drvdata(&chip->dev, priv);
 
@@ -983,6 +989,15 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
 
 	priv->manufacturer_id = vendor;
 
+	switch (priv->manufacturer_id) {
+	case TPM_VID_ATML:
+        /* ATMEL chip needs longer timeout to avoid crash */
+		chip->timeout_wait_stat = TPM_ATML_TIMEOUT_WAIT_STAT;
+		break;
+	default:
+		chip->timeout_wait_stat = TPM_TIMEOUT_WAIT_STAT;
+	}
+
 	rc = tpm_tis_read8(priv, TPM_RID(0), &rid);
 	if (rc < 0)
 		goto out_err;
diff --git a/include/linux/tpm.h b/include/linux/tpm.h
index aa11fe323c56..35f2a0260d76 100644
--- a/include/linux/tpm.h
+++ b/include/linux/tpm.h
@@ -150,6 +150,7 @@ struct tpm_chip {
 	bool timeout_adjusted;
 	unsigned long duration[TPM_NUM_DURATIONS]; /* jiffies */
 	bool duration_adjusted;
+	unsigned long timeout_wait_stat; /* usecs */
 
 	struct dentry *bios_dir[TPM_NUM_EVENT_LOG_FILES];
 
@@ -269,6 +270,7 @@ enum tpm2_cc_attrs {
 #define TPM_VID_INTEL    0x8086
 #define TPM_VID_WINBOND  0x1050
 #define TPM_VID_STM      0x104A
+#define TPM_VID_ATML     0x1114
 
 enum tpm_chip_flags {
 	TPM_CHIP_FLAG_TPM2		= BIT(1),
-- 
2.29.0.vfs.0.0


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* Re: [PATCH] Fix Atmel TPM crash caused by too frequent queries
  2021-06-23 13:35 ` Jarkko Sakkinen
@ 2021-06-24  5:49   ` Hao Wu
  2021-06-29 20:06     ` Jarkko Sakkinen
  0 siblings, 1 reply; 47+ messages in thread
From: Hao Wu @ 2021-06-24  5:49 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Shrihari Kalkar, Seungyeop Han, Anish Jhaveri, peterhuewe, jgg,
	linux-integrity, Paul Menzel, Ken Goldman, zohar, why2jjj.linux,
	Hamza Attak, gregkh, arnd, Nayna, James.Bottomley

> On Jun 23, 2021, at 6:35 AM, Jarkko Sakkinen <jarkko@kernel.org> wrote:
> 
> On Sun, Jun 20, 2021 at 04:18:09PM -0700, Hao Wu wrote:
>> This is a fix for the ATMEL TPM crash bug reported in
>> https://patchwork.kernel.org/project/linux-integrity/patch/20200926223150.109645-1-hao.wu@rubrik.com/
>> 
>> According to the discussions in the original thread,
>> we don't want to revert the timeout of wait_for_tpm_stat
>> for non-ATMEL chips, which brings back the performance cost.
>> For investigation and analysis of why wait_for_tpm_stat
>> caused the issue, and how the regression was introduced,
>> please read the original thread above.
>> 
>> Thus the proposed fix here is to only revert the timeout
>> for ATMEL chips by checking the vendor ID.
>> 
>> Test Plan:
>> - Run fixed kernel with ATMEL TPM chips and see crash
>>  has been fixed.
>> - Run fixed kernel with non-ATMEL TPM chips, and confirm
>>  the timeout has not been changed.
> 
> Please move test plan right before diffstat if you wan to include such,
> so that it does not go into the commit log.
Hi Jarkko, not sure I understood your suggestion or not. I removed
the test plan from the commit message in a updated commit
https://patchwork.kernel.org/project/linux-integrity/patch/20210624053321.861-1-hao.wu@rubrik.com/

Let me know if I misunderstood this. I am fine to not include test plan,
If this is not something expected by linux community.
I personally think it is helpful to understand the confidence of the commit.

> 
>> ---
>> drivers/char/tpm/tpm.h          |  9 ++++++++-
>> drivers/char/tpm/tpm_tis_core.c | 19 +++++++++++++++++--
>> include/linux/tpm.h             |  2 ++
>> 3 files changed, 27 insertions(+), 3 deletions(-)
>> 
>> diff --git a/drivers/char/tpm/tpm.h b/drivers/char/tpm/tpm.h
>> index 283f78211c3a..bc6aa7f9e119 100644
>> --- a/drivers/char/tpm/tpm.h
>> +++ b/drivers/char/tpm/tpm.h
>> @@ -42,7 +42,9 @@ enum tpm_timeout {
>> 	TPM_TIMEOUT_RANGE_US = 300,	/* usecs */
>> 	TPM_TIMEOUT_POLL = 1,	/* msecs */
>> 	TPM_TIMEOUT_USECS_MIN = 100,      /* usecs */
>> -	TPM_TIMEOUT_USECS_MAX = 500      /* usecs */
>> +	TPM_TIMEOUT_USECS_MAX = 500,	/* usecs */
>> +	TPM_TIMEOUT_WAIT_STAT = 500,	/* usecs */
>> +	TPM_ATML_TIMEOUT_WAIT_STAT = 15000	/* usecs */
>> };
>> 
>> /* TPM addresses */
>> @@ -189,6 +191,11 @@ static inline void tpm_msleep(unsigned int delay_msec)
>> 		     delay_msec * 1000);
>> };
>> 
>> +static inline void tpm_usleep(unsigned int delay_usec)
>> +{
>> +	usleep_range(delay_usec - TPM_TIMEOUT_RANGE_US, delay_usec);
>> +};
>> +
>> int tpm_chip_start(struct tpm_chip *chip);
>> void tpm_chip_stop(struct tpm_chip *chip);
>> struct tpm_chip *tpm_find_get_ops(struct tpm_chip *chip);
>> diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
>> index 55b9d3965ae1..9ddd4edfe1c2 100644
>> --- a/drivers/char/tpm/tpm_tis_core.c
>> +++ b/drivers/char/tpm/tpm_tis_core.c
>> @@ -80,8 +80,12 @@ static int wait_for_tpm_stat(struct tpm_chip *chip, u8 mask,
>> 		}
>> 	} else {
>> 		do {
>> -			usleep_range(TPM_TIMEOUT_USECS_MIN,
>> -				     TPM_TIMEOUT_USECS_MAX);
>> +			if (chip->timeout_wait_stat && 
>> +				chip->timeout_wait_stat >= TPM_TIMEOUT_WAIT_STAT) {
>> +				tpm_usleep((unsigned int)(chip->timeout_wait_stat));
>> +			} else {
>> +				tpm_usleep((unsigned int)(TPM_TIMEOUT_WAIT_STAT));
>> +			}
>> 			status = chip->ops->status(chip);
>> 			if ((status & mask) == mask)
>> 				return 0;
>> @@ -934,6 +938,8 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
>> 	chip->timeout_b = msecs_to_jiffies(TIS_TIMEOUT_B_MAX);
>> 	chip->timeout_c = msecs_to_jiffies(TIS_TIMEOUT_C_MAX);
>> 	chip->timeout_d = msecs_to_jiffies(TIS_TIMEOUT_D_MAX);
>> +	/* init timeout for wait_for_tpm_stat */
>> +	chip->timeout_wait_stat = TPM_TIMEOUT_WAIT_STAT;
>> 	priv->phy_ops = phy_ops;
>> 	dev_set_drvdata(&chip->dev, priv);
>> 
>> @@ -983,6 +989,15 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
>> 
>> 	priv->manufacturer_id = vendor;
>> 
>> +	switch (priv->manufacturer_id) {
>> +	case TPM_VID_ATML:
>> +        /* ATMEL chip needs longer timeout to avoid crash */
>> +		chip->timeout_wait_stat = TPM_ATML_TIMEOUT_WAIT_STAT;
>> +		break;
>> +	default:
>> +		chip->timeout_wait_stat = TPM_TIMEOUT_WAIT_STAT;
>> +	}
>> +
>> 	rc = tpm_tis_read8(priv, TPM_RID(0), &rid);
>> 	if (rc < 0)
>> 		goto out_err;
>> diff --git a/include/linux/tpm.h b/include/linux/tpm.h
>> index aa11fe323c56..35f2a0260d76 100644
>> --- a/include/linux/tpm.h
>> +++ b/include/linux/tpm.h
>> @@ -150,6 +150,7 @@ struct tpm_chip {
>> 	bool timeout_adjusted;
>> 	unsigned long duration[TPM_NUM_DURATIONS]; /* jiffies */
>> 	bool duration_adjusted;
>> +	unsigned long timeout_wait_stat; /* usecs */
>> 
>> 	struct dentry *bios_dir[TPM_NUM_EVENT_LOG_FILES];
>> 
>> @@ -269,6 +270,7 @@ enum tpm2_cc_attrs {
>> #define TPM_VID_INTEL    0x8086
>> #define TPM_VID_WINBOND  0x1050
>> #define TPM_VID_STM      0x104A
>> +#define TPM_VID_ATML     0x1114
>> 
>> enum tpm_chip_flags {
>> 	TPM_CHIP_FLAG_TPM2		= BIT(1),
>> -- 
>> 2.29.0.vfs.0.0
>> 
>> 
> 
> /Jarkko

Hao


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH] Fix Atmel TPM crash caused by too frequent queries
  2021-06-24  5:49   ` Hao Wu
@ 2021-06-29 20:06     ` Jarkko Sakkinen
  2021-06-30  4:27       ` Hao Wu
  0 siblings, 1 reply; 47+ messages in thread
From: Jarkko Sakkinen @ 2021-06-29 20:06 UTC (permalink / raw)
  To: Hao Wu
  Cc: Shrihari Kalkar, Seungyeop Han, Anish Jhaveri, peterhuewe, jgg,
	linux-integrity, Paul Menzel, Ken Goldman, zohar, why2jjj.linux,
	Hamza Attak, gregkh, arnd, Nayna, James.Bottomley

On Wed, Jun 23, 2021 at 10:49:27PM -0700, Hao Wu wrote:
> > On Jun 23, 2021, at 6:35 AM, Jarkko Sakkinen <jarkko@kernel.org> wrote:
> > 
> > On Sun, Jun 20, 2021 at 04:18:09PM -0700, Hao Wu wrote:
> >> This is a fix for the ATMEL TPM crash bug reported in
> >> https://patchwork.kernel.org/project/linux-integrity/patch/20200926223150.109645-1-hao.wu@rubrik.com/
> >> 
> >> According to the discussions in the original thread,
> >> we don't want to revert the timeout of wait_for_tpm_stat
> >> for non-ATMEL chips, which brings back the performance cost.
> >> For investigation and analysis of why wait_for_tpm_stat
> >> caused the issue, and how the regression was introduced,
> >> please read the original thread above.
> >> 
> >> Thus the proposed fix here is to only revert the timeout
> >> for ATMEL chips by checking the vendor ID.
> >> 
> >> Test Plan:
> >> - Run fixed kernel with ATMEL TPM chips and see crash
> >>  has been fixed.
> >> - Run fixed kernel with non-ATMEL TPM chips, and confirm
> >>  the timeout has not been changed.
> > 
> > Please move test plan right before diffstat if you wan to include such,
> > so that it does not go into the commit log.
> Hi Jarkko, not sure I understood your suggestion or not. I removed
> the test plan from the commit message in a updated commit
> https://patchwork.kernel.org/project/linux-integrity/patch/20210624053321.861-1-hao.wu@rubrik.com/
> 
> Let me know if I misunderstood this. I am fine to not include test plan,
> If this is not something expected by linux community.
> I personally think it is helpful to understand the confidence of the commit.
> 
> > 
> >> ---

You can add it right here. Then it won't be included to the actual
commit log but is still available in the patch.

/Jarkko 

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH] Fix Atmel TPM crash caused by too frequent queries
  2021-06-24  5:33 ` Hao Wu
@ 2021-06-29 20:07   ` Jarkko Sakkinen
  2021-06-30  4:22   ` [PATCH] tpm: fix ATMEL " Hao Wu
  1 sibling, 0 replies; 47+ messages in thread
From: Jarkko Sakkinen @ 2021-06-29 20:07 UTC (permalink / raw)
  To: Hao Wu
  Cc: shrihari.kalkar, seungyeop.han, anish.jhaveri, peterhuewe, jgg,
	linux-integrity, pmenzel, kgold, zohar, why2jjj.linux, hamza,
	gregkh, arnd, nayna, James.Bottomley

On Wed, Jun 23, 2021 at 10:33:21PM -0700, Hao Wu wrote:
> This is a fix for the ATMEL TPM crash bug reported in
> https://patchwork.kernel.org/project/linux-integrity/patch/20200926223150.109645-1-hao.wu@rubrik.com/
> 
> According to the discussions in the original thread,
> we don't want to revert the timeout of wait_for_tpm_stat
> for non-ATMEL chips, which brings back the performance cost.
> For investigation and analysis of why wait_for_tpm_stat
> caused the issue, and how the regression was introduced,
> please read the original thread above.
> 
> Thus the proposed fix here is to only revert the timeout
> for ATMEL chips by checking the vendor ID.
> ---

Lacking "Fixes" and "Signed-off-by".

/Jarkko

>  drivers/char/tpm/tpm.h          |  9 ++++++++-
>  drivers/char/tpm/tpm_tis_core.c | 19 +++++++++++++++++--
>  include/linux/tpm.h             |  2 ++
>  3 files changed, 27 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/char/tpm/tpm.h b/drivers/char/tpm/tpm.h
> index 283f78211c3a..bc6aa7f9e119 100644
> --- a/drivers/char/tpm/tpm.h
> +++ b/drivers/char/tpm/tpm.h
> @@ -42,7 +42,9 @@ enum tpm_timeout {
>  	TPM_TIMEOUT_RANGE_US = 300,	/* usecs */
>  	TPM_TIMEOUT_POLL = 1,	/* msecs */
>  	TPM_TIMEOUT_USECS_MIN = 100,      /* usecs */
> -	TPM_TIMEOUT_USECS_MAX = 500      /* usecs */
> +	TPM_TIMEOUT_USECS_MAX = 500,	/* usecs */
> +	TPM_TIMEOUT_WAIT_STAT = 500,	/* usecs */
> +	TPM_ATML_TIMEOUT_WAIT_STAT = 15000	/* usecs */
>  };
>  
>  /* TPM addresses */
> @@ -189,6 +191,11 @@ static inline void tpm_msleep(unsigned int delay_msec)
>  		     delay_msec * 1000);
>  };
>  
> +static inline void tpm_usleep(unsigned int delay_usec)
> +{
> +	usleep_range(delay_usec - TPM_TIMEOUT_RANGE_US, delay_usec);
> +};
> +
>  int tpm_chip_start(struct tpm_chip *chip);
>  void tpm_chip_stop(struct tpm_chip *chip);
>  struct tpm_chip *tpm_find_get_ops(struct tpm_chip *chip);
> diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
> index 55b9d3965ae1..9ddd4edfe1c2 100644
> --- a/drivers/char/tpm/tpm_tis_core.c
> +++ b/drivers/char/tpm/tpm_tis_core.c
> @@ -80,8 +80,12 @@ static int wait_for_tpm_stat(struct tpm_chip *chip, u8 mask,
>  		}
>  	} else {
>  		do {
> -			usleep_range(TPM_TIMEOUT_USECS_MIN,
> -				     TPM_TIMEOUT_USECS_MAX);
> +			if (chip->timeout_wait_stat && 
> +				chip->timeout_wait_stat >= TPM_TIMEOUT_WAIT_STAT) {
> +				tpm_usleep((unsigned int)(chip->timeout_wait_stat));
> +			} else {
> +				tpm_usleep((unsigned int)(TPM_TIMEOUT_WAIT_STAT));
> +			}
>  			status = chip->ops->status(chip);
>  			if ((status & mask) == mask)
>  				return 0;
> @@ -934,6 +938,8 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
>  	chip->timeout_b = msecs_to_jiffies(TIS_TIMEOUT_B_MAX);
>  	chip->timeout_c = msecs_to_jiffies(TIS_TIMEOUT_C_MAX);
>  	chip->timeout_d = msecs_to_jiffies(TIS_TIMEOUT_D_MAX);
> +	/* init timeout for wait_for_tpm_stat */
> +	chip->timeout_wait_stat = TPM_TIMEOUT_WAIT_STAT;
>  	priv->phy_ops = phy_ops;
>  	dev_set_drvdata(&chip->dev, priv);
>  
> @@ -983,6 +989,15 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
>  
>  	priv->manufacturer_id = vendor;
>  
> +	switch (priv->manufacturer_id) {
> +	case TPM_VID_ATML:
> +        /* ATMEL chip needs longer timeout to avoid crash */
> +		chip->timeout_wait_stat = TPM_ATML_TIMEOUT_WAIT_STAT;
> +		break;
> +	default:
> +		chip->timeout_wait_stat = TPM_TIMEOUT_WAIT_STAT;
> +	}
> +
>  	rc = tpm_tis_read8(priv, TPM_RID(0), &rid);
>  	if (rc < 0)
>  		goto out_err;
> diff --git a/include/linux/tpm.h b/include/linux/tpm.h
> index aa11fe323c56..35f2a0260d76 100644
> --- a/include/linux/tpm.h
> +++ b/include/linux/tpm.h
> @@ -150,6 +150,7 @@ struct tpm_chip {
>  	bool timeout_adjusted;
>  	unsigned long duration[TPM_NUM_DURATIONS]; /* jiffies */
>  	bool duration_adjusted;
> +	unsigned long timeout_wait_stat; /* usecs */
>  
>  	struct dentry *bios_dir[TPM_NUM_EVENT_LOG_FILES];
>  
> @@ -269,6 +270,7 @@ enum tpm2_cc_attrs {
>  #define TPM_VID_INTEL    0x8086
>  #define TPM_VID_WINBOND  0x1050
>  #define TPM_VID_STM      0x104A
> +#define TPM_VID_ATML     0x1114
>  
>  enum tpm_chip_flags {
>  	TPM_CHIP_FLAG_TPM2		= BIT(1),
> -- 
> 2.29.0.vfs.0.0
> 
> 

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH] tpm: fix ATMEL TPM crash caused by too frequent queries
  2021-06-24  5:33 ` Hao Wu
  2021-06-29 20:07   ` Jarkko Sakkinen
@ 2021-06-30  4:22   ` Hao Wu
  2021-07-02  6:35     ` Jarkko Sakkinen
                       ` (3 more replies)
  1 sibling, 4 replies; 47+ messages in thread
From: Hao Wu @ 2021-06-30  4:22 UTC (permalink / raw)
  To: hao.wu, shrihari.kalkar, seungyeop.han, anish.jhaveri,
	peterhuewe, jarkko, jgg, linux-integrity, pmenzel, kgold, zohar,
	why2jjj.linux, hamza, gregkh, arnd, nayna, James.Bottomley

This is a fix for the ATMEL TPM crash bug reported in
https://patchwork.kernel.org/project/linux-integrity/patch/20200926223150.109645-1-hao.wu@rubrik.com/

According to the discussions in the original thread,
we don't want to revert the timeout of wait_for_tpm_stat
for non-ATMEL chips, which brings back the performance cost.
For investigation and analysis of why wait_for_tpm_stat
caused the issue, and how the regression was introduced,
please read the original thread above.

Thus the proposed fix here is to only revert the timeout
for ATMEL chips by checking the vendor ID.

Signed-off-by: Hao Wu <hao.wu@rubrik.com>
Fixes: 9f3fc7bcddcb ("tpm: replace msleep() with usleep_range() in TPM 1.2/2.0 generic drivers")
---
Test Plan:
- Run fixed kernel with ATMEL TPM chips and see crash
has been fixed.
- Run fixed kernel with non-ATMEL TPM chips, and confirm
the timeout has not been changed.

 drivers/char/tpm/tpm.h          |  9 ++++++++-
 drivers/char/tpm/tpm_tis_core.c | 19 +++++++++++++++++--
 include/linux/tpm.h             |  2 ++
 3 files changed, 27 insertions(+), 3 deletions(-)

diff --git a/drivers/char/tpm/tpm.h b/drivers/char/tpm/tpm.h
index 283f78211c3a..bc6aa7f9e119 100644
--- a/drivers/char/tpm/tpm.h
+++ b/drivers/char/tpm/tpm.h
@@ -42,7 +42,9 @@ enum tpm_timeout {
 	TPM_TIMEOUT_RANGE_US = 300,	/* usecs */
 	TPM_TIMEOUT_POLL = 1,	/* msecs */
 	TPM_TIMEOUT_USECS_MIN = 100,      /* usecs */
-	TPM_TIMEOUT_USECS_MAX = 500      /* usecs */
+	TPM_TIMEOUT_USECS_MAX = 500,	/* usecs */
+	TPM_TIMEOUT_WAIT_STAT = 500,	/* usecs */
+	TPM_ATML_TIMEOUT_WAIT_STAT = 15000	/* usecs */
 };
 
 /* TPM addresses */
@@ -189,6 +191,11 @@ static inline void tpm_msleep(unsigned int delay_msec)
 		     delay_msec * 1000);
 };
 
+static inline void tpm_usleep(unsigned int delay_usec)
+{
+	usleep_range(delay_usec - TPM_TIMEOUT_RANGE_US, delay_usec);
+};
+
 int tpm_chip_start(struct tpm_chip *chip);
 void tpm_chip_stop(struct tpm_chip *chip);
 struct tpm_chip *tpm_find_get_ops(struct tpm_chip *chip);
diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
index 55b9d3965ae1..9ddd4edfe1c2 100644
--- a/drivers/char/tpm/tpm_tis_core.c
+++ b/drivers/char/tpm/tpm_tis_core.c
@@ -80,8 +80,12 @@ static int wait_for_tpm_stat(struct tpm_chip *chip, u8 mask,
 		}
 	} else {
 		do {
-			usleep_range(TPM_TIMEOUT_USECS_MIN,
-				     TPM_TIMEOUT_USECS_MAX);
+			if (chip->timeout_wait_stat && 
+				chip->timeout_wait_stat >= TPM_TIMEOUT_WAIT_STAT) {
+				tpm_usleep((unsigned int)(chip->timeout_wait_stat));
+			} else {
+				tpm_usleep((unsigned int)(TPM_TIMEOUT_WAIT_STAT));
+			}
 			status = chip->ops->status(chip);
 			if ((status & mask) == mask)
 				return 0;
@@ -934,6 +938,8 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
 	chip->timeout_b = msecs_to_jiffies(TIS_TIMEOUT_B_MAX);
 	chip->timeout_c = msecs_to_jiffies(TIS_TIMEOUT_C_MAX);
 	chip->timeout_d = msecs_to_jiffies(TIS_TIMEOUT_D_MAX);
+	/* init timeout for wait_for_tpm_stat */
+	chip->timeout_wait_stat = TPM_TIMEOUT_WAIT_STAT;
 	priv->phy_ops = phy_ops;
 	dev_set_drvdata(&chip->dev, priv);
 
@@ -983,6 +989,15 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
 
 	priv->manufacturer_id = vendor;
 
+	switch (priv->manufacturer_id) {
+	case TPM_VID_ATML:
+        /* ATMEL chip needs longer timeout to avoid crash */
+		chip->timeout_wait_stat = TPM_ATML_TIMEOUT_WAIT_STAT;
+		break;
+	default:
+		chip->timeout_wait_stat = TPM_TIMEOUT_WAIT_STAT;
+	}
+
 	rc = tpm_tis_read8(priv, TPM_RID(0), &rid);
 	if (rc < 0)
 		goto out_err;
diff --git a/include/linux/tpm.h b/include/linux/tpm.h
index aa11fe323c56..35f2a0260d76 100644
--- a/include/linux/tpm.h
+++ b/include/linux/tpm.h
@@ -150,6 +150,7 @@ struct tpm_chip {
 	bool timeout_adjusted;
 	unsigned long duration[TPM_NUM_DURATIONS]; /* jiffies */
 	bool duration_adjusted;
+	unsigned long timeout_wait_stat; /* usecs */
 
 	struct dentry *bios_dir[TPM_NUM_EVENT_LOG_FILES];
 
@@ -269,6 +270,7 @@ enum tpm2_cc_attrs {
 #define TPM_VID_INTEL    0x8086
 #define TPM_VID_WINBOND  0x1050
 #define TPM_VID_STM      0x104A
+#define TPM_VID_ATML     0x1114
 
 enum tpm_chip_flags {
 	TPM_CHIP_FLAG_TPM2		= BIT(1),
-- 
2.29.0.vfs.0.0


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* Re: [PATCH] Fix Atmel TPM crash caused by too frequent queries
  2021-06-29 20:06     ` Jarkko Sakkinen
@ 2021-06-30  4:27       ` Hao Wu
  0 siblings, 0 replies; 47+ messages in thread
From: Hao Wu @ 2021-06-30  4:27 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Shrihari Kalkar, Seungyeop Han, Anish Jhaveri, peterhuewe, jgg,
	linux-integrity, Paul Menzel, Ken Goldman, zohar, why2jjj.linux,
	Hamza Attak, gregkh, arnd, Nayna, James.Bottomley

> On Jun 29, 2021, at 1:06 PM, Jarkko Sakkinen <jarkko@kernel.org> wrote:
> 
> On Wed, Jun 23, 2021 at 10:49:27PM -0700, Hao Wu wrote:
>>> On Jun 23, 2021, at 6:35 AM, Jarkko Sakkinen <jarkko@kernel.org> wrote:
>>> 
>>> On Sun, Jun 20, 2021 at 04:18:09PM -0700, Hao Wu wrote:
>>>> This is a fix for the ATMEL TPM crash bug reported in
>>>> https://patchwork.kernel.org/project/linux-integrity/patch/20200926223150.109645-1-hao.wu@rubrik.com/
>>>> 
>>>> According to the discussions in the original thread,
>>>> we don't want to revert the timeout of wait_for_tpm_stat
>>>> for non-ATMEL chips, which brings back the performance cost.
>>>> For investigation and analysis of why wait_for_tpm_stat
>>>> caused the issue, and how the regression was introduced,
>>>> please read the original thread above.
>>>> 
>>>> Thus the proposed fix here is to only revert the timeout
>>>> for ATMEL chips by checking the vendor ID.
>>>> 
>>>> Test Plan:
>>>> - Run fixed kernel with ATMEL TPM chips and see crash
>>>> has been fixed.
>>>> - Run fixed kernel with non-ATMEL TPM chips, and confirm
>>>> the timeout has not been changed.
>>> 
>>> Please move test plan right before diffstat if you wan to include such,
>>> so that it does not go into the commit log.
>> Hi Jarkko, not sure I understood your suggestion or not. I removed
>> the test plan from the commit message in a updated commit
>> https://patchwork.kernel.org/project/linux-integrity/patch/20210624053321.861-1-hao.wu@rubrik.com/
>> 
>> Let me know if I misunderstood this. I am fine to not include test plan,
>> If this is not something expected by linux community.
>> I personally think it is helpful to understand the confidence of the commit.
>> 
>>> 
>>>> ---
> 
> You can add it right here. Then it won't be included to the actual
> commit log but is still available in the patch.
> 
I see, thanks Jarkko. Updated the patch
https://patchwork.kernel.org/project/linux-integrity/patch/20210630042205.30051-1-hao.wu@rubrik.com/
Hopefull it makes more sense now.

> /Jarkko 

Hao

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH] tpm: fix ATMEL TPM crash caused by too frequent queries
  2021-06-30  4:22   ` [PATCH] tpm: fix ATMEL " Hao Wu
@ 2021-07-02  6:35     ` Jarkko Sakkinen
  2021-07-02  7:12       ` Greg KH
  2021-07-02  7:33       ` Hao Wu
  2021-07-04  0:07     ` Hao Wu
                       ` (2 subsequent siblings)
  3 siblings, 2 replies; 47+ messages in thread
From: Jarkko Sakkinen @ 2021-07-02  6:35 UTC (permalink / raw)
  To: Hao Wu
  Cc: shrihari.kalkar, seungyeop.han, anish.jhaveri, peterhuewe, jgg,
	linux-integrity, pmenzel, kgold, zohar, why2jjj.linux, hamza,
	gregkh, arnd, nayna, James.Bottomley

On Tue, Jun 29, 2021 at 09:22:05PM -0700, Hao Wu wrote:
> This is a fix for the ATMEL TPM crash bug reported in
> https://patchwork.kernel.org/project/linux-integrity/patch/20200926223150.109645-1-hao.wu@rubrik.com/
> 
> According to the discussions in the original thread,
> we don't want to revert the timeout of wait_for_tpm_stat
> for non-ATMEL chips, which brings back the performance cost.
> For investigation and analysis of why wait_for_tpm_stat
> caused the issue, and how the regression was introduced,
> please read the original thread above.
> 
> Thus the proposed fix here is to only revert the timeout
> for ATMEL chips by checking the vendor ID.
> 
> Signed-off-by: Hao Wu <hao.wu@rubrik.com>
> Fixes: 9f3fc7bcddcb ("tpm: replace msleep() with usleep_range() in TPM 1.2/2.0 generic drivers")

Fixes tag should be before SOB.

> ---
> Test Plan:
> - Run fixed kernel with ATMEL TPM chips and see crash
> has been fixed.
> - Run fixed kernel with non-ATMEL TPM chips, and confirm
> the timeout has not been changed.
> 
>  drivers/char/tpm/tpm.h          |  9 ++++++++-
>  drivers/char/tpm/tpm_tis_core.c | 19 +++++++++++++++++--
>  include/linux/tpm.h             |  2 ++
>  3 files changed, 27 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/char/tpm/tpm.h b/drivers/char/tpm/tpm.h
> index 283f78211c3a..bc6aa7f9e119 100644
> --- a/drivers/char/tpm/tpm.h
> +++ b/drivers/char/tpm/tpm.h
> @@ -42,7 +42,9 @@ enum tpm_timeout {
>  	TPM_TIMEOUT_RANGE_US = 300,	/* usecs */
>  	TPM_TIMEOUT_POLL = 1,	/* msecs */
>  	TPM_TIMEOUT_USECS_MIN = 100,      /* usecs */
> -	TPM_TIMEOUT_USECS_MAX = 500      /* usecs */
> +	TPM_TIMEOUT_USECS_MAX = 500,	/* usecs */

What is this change?

> +	TPM_TIMEOUT_WAIT_STAT = 500,	/* usecs */
> +	TPM_ATML_TIMEOUT_WAIT_STAT = 15000	/* usecs */
>  };
>  
>  /* TPM addresses */
> @@ -189,6 +191,11 @@ static inline void tpm_msleep(unsigned int delay_msec)
>  		     delay_msec * 1000);
>  };
>  
> +static inline void tpm_usleep(unsigned int delay_usec)
> +{
> +	usleep_range(delay_usec - TPM_TIMEOUT_RANGE_US, delay_usec);
> +};

Please remove this, and open code.

> +
>  int tpm_chip_start(struct tpm_chip *chip);
>  void tpm_chip_stop(struct tpm_chip *chip);
>  struct tpm_chip *tpm_find_get_ops(struct tpm_chip *chip);
> diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
> index 55b9d3965ae1..9ddd4edfe1c2 100644
> --- a/drivers/char/tpm/tpm_tis_core.c
> +++ b/drivers/char/tpm/tpm_tis_core.c
> @@ -80,8 +80,12 @@ static int wait_for_tpm_stat(struct tpm_chip *chip, u8 mask,
>  		}
>  	} else {
>  		do {
> -			usleep_range(TPM_TIMEOUT_USECS_MIN,
> -				     TPM_TIMEOUT_USECS_MAX);
> +			if (chip->timeout_wait_stat && 
> +				chip->timeout_wait_stat >= TPM_TIMEOUT_WAIT_STAT) {
> +				tpm_usleep((unsigned int)(chip->timeout_wait_stat));
> +			} else {
> +				tpm_usleep((unsigned int)(TPM_TIMEOUT_WAIT_STAT));
> +			}

Invalid use of braces. Please read

https://www.kernel.org/doc/html/v5.13/process/coding-style.html

Why do you have to use this field conditionally anyway? Why doesn't
it always contain a legit value?

>  			status = chip->ops->status(chip);
>  			if ((status & mask) == mask)
>  				return 0;
> @@ -934,6 +938,8 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
>  	chip->timeout_b = msecs_to_jiffies(TIS_TIMEOUT_B_MAX);
>  	chip->timeout_c = msecs_to_jiffies(TIS_TIMEOUT_C_MAX);
>  	chip->timeout_d = msecs_to_jiffies(TIS_TIMEOUT_D_MAX);
> +	/* init timeout for wait_for_tpm_stat */
> +	chip->timeout_wait_stat = TPM_TIMEOUT_WAIT_STAT;
>  	priv->phy_ops = phy_ops;
>  	dev_set_drvdata(&chip->dev, priv);
>  
> @@ -983,6 +989,15 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
>  
>  	priv->manufacturer_id = vendor;
>  
> +	switch (priv->manufacturer_id) {
> +	case TPM_VID_ATML:
> +        /* ATMEL chip needs longer timeout to avoid crash */
> +		chip->timeout_wait_stat = TPM_ATML_TIMEOUT_WAIT_STAT;
> +		break;
> +	default:
> +		chip->timeout_wait_stat = TPM_TIMEOUT_WAIT_STAT;
> +	}
> +
>  	rc = tpm_tis_read8(priv, TPM_RID(0), &rid);
>  	if (rc < 0)
>  		goto out_err;
> diff --git a/include/linux/tpm.h b/include/linux/tpm.h
> index aa11fe323c56..35f2a0260d76 100644
> --- a/include/linux/tpm.h
> +++ b/include/linux/tpm.h
> @@ -150,6 +150,7 @@ struct tpm_chip {
>  	bool timeout_adjusted;
>  	unsigned long duration[TPM_NUM_DURATIONS]; /* jiffies */
>  	bool duration_adjusted;
> +	unsigned long timeout_wait_stat; /* usecs */
>  
>  	struct dentry *bios_dir[TPM_NUM_EVENT_LOG_FILES];
>  
> @@ -269,6 +270,7 @@ enum tpm2_cc_attrs {
>  #define TPM_VID_INTEL    0x8086
>  #define TPM_VID_WINBOND  0x1050
>  #define TPM_VID_STM      0x104A
> +#define TPM_VID_ATML     0x1114
>  
>  enum tpm_chip_flags {
>  	TPM_CHIP_FLAG_TPM2		= BIT(1),
> -- 
> 2.29.0.vfs.0.0
> 
> 

/Jarkko

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH] tpm: fix ATMEL TPM crash caused by too frequent queries
  2021-07-02  6:35     ` Jarkko Sakkinen
@ 2021-07-02  7:12       ` Greg KH
  2021-07-02  7:33       ` Hao Wu
  1 sibling, 0 replies; 47+ messages in thread
From: Greg KH @ 2021-07-02  7:12 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Hao Wu, shrihari.kalkar, seungyeop.han, anish.jhaveri,
	peterhuewe, jgg, linux-integrity, pmenzel, kgold, zohar,
	why2jjj.linux, hamza, arnd, nayna, James.Bottomley

On Fri, Jul 02, 2021 at 09:35:55AM +0300, Jarkko Sakkinen wrote:
> On Tue, Jun 29, 2021 at 09:22:05PM -0700, Hao Wu wrote:
> > This is a fix for the ATMEL TPM crash bug reported in
> > https://patchwork.kernel.org/project/linux-integrity/patch/20200926223150.109645-1-hao.wu@rubrik.com/
> > 
> > According to the discussions in the original thread,
> > we don't want to revert the timeout of wait_for_tpm_stat
> > for non-ATMEL chips, which brings back the performance cost.
> > For investigation and analysis of why wait_for_tpm_stat
> > caused the issue, and how the regression was introduced,
> > please read the original thread above.
> > 
> > Thus the proposed fix here is to only revert the timeout
> > for ATMEL chips by checking the vendor ID.
> > 
> > Signed-off-by: Hao Wu <hao.wu@rubrik.com>
> > Fixes: 9f3fc7bcddcb ("tpm: replace msleep() with usleep_range() in TPM 1.2/2.0 generic drivers")
> 
> Fixes tag should be before SOB.

Does not matter :)

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH] tpm: fix ATMEL TPM crash caused by too frequent queries
  2021-07-02  6:35     ` Jarkko Sakkinen
  2021-07-02  7:12       ` Greg KH
@ 2021-07-02  7:33       ` Hao Wu
  2021-07-02  7:35         ` Hao Wu
  2021-07-02  7:45         ` Jarkko Sakkinen
  1 sibling, 2 replies; 47+ messages in thread
From: Hao Wu @ 2021-07-02  7:33 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Shrihari Kalkar, Seungyeop Han, Anish Jhaveri, peterhuewe, jgg,
	linux-integrity, Paul Menzel, Ken Goldman, zohar, why2jjj.linux,
	Hamza Attak, gregkh, arnd, Nayna, James Bottomley



> On Jul 1, 2021, at 11:35 PM, Jarkko Sakkinen <jarkko@kernel.org> wrote:
> 
> On Tue, Jun 29, 2021 at 09:22:05PM -0700, Hao Wu wrote:
>> This is a fix for the ATMEL TPM crash bug reported in
>> https://patchwork.kernel.org/project/linux-integrity/patch/20200926223150.109645-1-hao.wu@rubrik.com/
>> 
>> According to the discussions in the original thread,
>> we don't want to revert the timeout of wait_for_tpm_stat
>> for non-ATMEL chips, which brings back the performance cost.
>> For investigation and analysis of why wait_for_tpm_stat
>> caused the issue, and how the regression was introduced,
>> please read the original thread above.
>> 
>> Thus the proposed fix here is to only revert the timeout
>> for ATMEL chips by checking the vendor ID.
>> 
>> Signed-off-by: Hao Wu <hao.wu@rubrik.com>
>> Fixes: 9f3fc7bcddcb ("tpm: replace msleep() with usleep_range() in TPM 1.2/2.0 generic drivers")
> 
> Fixes tag should be before SOB.
> 
>> ---
>> Test Plan:
>> - Run fixed kernel with ATMEL TPM chips and see crash
>> has been fixed.
>> - Run fixed kernel with non-ATMEL TPM chips, and confirm
>> the timeout has not been changed.
>> 
>> drivers/char/tpm/tpm.h          |  9 ++++++++-
>> drivers/char/tpm/tpm_tis_core.c | 19 +++++++++++++++++--
>> include/linux/tpm.h             |  2 ++
>> 3 files changed, 27 insertions(+), 3 deletions(-)
>> 
>> diff --git a/drivers/char/tpm/tpm.h b/drivers/char/tpm/tpm.h
>> index 283f78211c3a..bc6aa7f9e119 100644
>> --- a/drivers/char/tpm/tpm.h
>> +++ b/drivers/char/tpm/tpm.h
>> @@ -42,7 +42,9 @@ enum tpm_timeout {
>> 	TPM_TIMEOUT_RANGE_US = 300,	/* usecs */
>> 	TPM_TIMEOUT_POLL = 1,	/* msecs */
>> 	TPM_TIMEOUT_USECS_MIN = 100,      /* usecs */
>> -	TPM_TIMEOUT_USECS_MAX = 500      /* usecs */
>> +	TPM_TIMEOUT_USECS_MAX = 500,	/* usecs */
> 
> What is this change?
Need to add the tailing comma

> 
>> +	TPM_TIMEOUT_WAIT_STAT = 500,	/* usecs */
>> +	TPM_ATML_TIMEOUT_WAIT_STAT = 15000	/* usecs */
>> };
>> 
>> /* TPM addresses */
>> @@ -189,6 +191,11 @@ static inline void tpm_msleep(unsigned int delay_msec)
>> 		     delay_msec * 1000);
>> };
>> 
>> +static inline void tpm_usleep(unsigned int delay_usec)
>> +{
>> +	usleep_range(delay_usec - TPM_TIMEOUT_RANGE_US, delay_usec);
>> +};
> 
> Please remove this, and open code.
Ok, will do

>> +
>> int tpm_chip_start(struct tpm_chip *chip);
>> void tpm_chip_stop(struct tpm_chip *chip);
>> struct tpm_chip *tpm_find_get_ops(struct tpm_chip *chip);
>> diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
>> index 55b9d3965ae1..9ddd4edfe1c2 100644
>> --- a/drivers/char/tpm/tpm_tis_core.c
>> +++ b/drivers/char/tpm/tpm_tis_core.c
>> @@ -80,8 +80,12 @@ static int wait_for_tpm_stat(struct tpm_chip *chip, u8 mask,
>> 		}
>> 	} else {
>> 		do {
>> -			usleep_range(TPM_TIMEOUT_USECS_MIN,
>> -				     TPM_TIMEOUT_USECS_MAX);
>> +			if (chip->timeout_wait_stat && 
>> +				chip->timeout_wait_stat >= TPM_TIMEOUT_WAIT_STAT) {
>> +				tpm_usleep((unsigned int)(chip->timeout_wait_stat));
>> +			} else {
>> +				tpm_usleep((unsigned int)(TPM_TIMEOUT_WAIT_STAT));
>> +			}
> 
> Invalid use of braces. Please read
> 
> https://www.kernel.org/doc/html/v5.13/process/coding-style.html
> 
> Why do you have to use this field conditionally anyway? Why doesn't
> it always contain a legit value?
The field is legit now, but doesn’t hurt to do addition check for robustness 
to ensure no crash ? Just in case the value is updated below TPM_TIMEOUT_WAIT_STAT ? 

Can remove if we think it is not needed.

> 
>> 			status = chip->ops->status(chip);
>> 			if ((status & mask) == mask)
>> 				return 0;
>> @@ -934,6 +938,8 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
>> 	chip->timeout_b = msecs_to_jiffies(TIS_TIMEOUT_B_MAX);
>> 	chip->timeout_c = msecs_to_jiffies(TIS_TIMEOUT_C_MAX);
>> 	chip->timeout_d = msecs_to_jiffies(TIS_TIMEOUT_D_MAX);
>> +	/* init timeout for wait_for_tpm_stat */
>> +	chip->timeout_wait_stat = TPM_TIMEOUT_WAIT_STAT;
>> 	priv->phy_ops = phy_ops;
>> 	dev_set_drvdata(&chip->dev, priv);
>> 
>> @@ -983,6 +989,15 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
>> 
>> 	priv->manufacturer_id = vendor;
>> 
>> +	switch (priv->manufacturer_id) {
>> +	case TPM_VID_ATML:
>> +        /* ATMEL chip needs longer timeout to avoid crash */
Will fix the indentation

Also according to Kenneth we only want to do so for TPM 1.2, 
I will try checking chip->flags against TPM_CHIP_FLAG_TPM2 here
Let me know if there are concerns.
 
>> +		chip->timeout_wait_stat = TPM_ATML_TIMEOUT_WAIT_STAT;
>> +		break;
>> +	default:
>> +		chip->timeout_wait_stat = TPM_TIMEOUT_WAIT_STAT;
>> +	}
>> +
>> 	rc = tpm_tis_read8(priv, TPM_RID(0), &rid);
>> 	if (rc < 0)
>> 		goto out_err;
>> diff --git a/include/linux/tpm.h b/include/linux/tpm.h
>> index aa11fe323c56..35f2a0260d76 100644
>> --- a/include/linux/tpm.h
>> +++ b/include/linux/tpm.h
>> @@ -150,6 +150,7 @@ struct tpm_chip {
>> 	bool timeout_adjusted;
>> 	unsigned long duration[TPM_NUM_DURATIONS]; /* jiffies */
>> 	bool duration_adjusted;
>> +	unsigned long timeout_wait_stat; /* usecs */
>> 
>> 	struct dentry *bios_dir[TPM_NUM_EVENT_LOG_FILES];
>> 
>> @@ -269,6 +270,7 @@ enum tpm2_cc_attrs {
>> #define TPM_VID_INTEL    0x8086
>> #define TPM_VID_WINBOND  0x1050
>> #define TPM_VID_STM      0x104A
>> +#define TPM_VID_ATML     0x1114
>> 
>> enum tpm_chip_flags {
>> 	TPM_CHIP_FLAG_TPM2		= BIT(1),
>> -- 
>> 2.29.0.vfs.0.0
>> 
>> 
> 
> /Jarkko


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH] tpm: fix ATMEL TPM crash caused by too frequent queries
  2021-07-02  7:33       ` Hao Wu
@ 2021-07-02  7:35         ` Hao Wu
  2021-07-02  7:45         ` Jarkko Sakkinen
  1 sibling, 0 replies; 47+ messages in thread
From: Hao Wu @ 2021-07-02  7:35 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Shrihari Kalkar, Seungyeop Han, Anish Jhaveri, peterhuewe, jgg,
	linux-integrity, Paul Menzel, Ken Goldman, zohar, why2jjj.linux,
	Hamza Attak, gregkh, arnd, Nayna, James Bottomley



> On Jul 2, 2021, at 12:33 AM, Hao Wu <hao.wu@rubrik.com> wrote:
> 
> 
> 
>> On Jul 1, 2021, at 11:35 PM, Jarkko Sakkinen <jarkko@kernel.org> wrote:
>> 
>> On Tue, Jun 29, 2021 at 09:22:05PM -0700, Hao Wu wrote:
>>> This is a fix for the ATMEL TPM crash bug reported in
>>> https://patchwork.kernel.org/project/linux-integrity/patch/20200926223150.109645-1-hao.wu@rubrik.com/
>>> 
>>> According to the discussions in the original thread,
>>> we don't want to revert the timeout of wait_for_tpm_stat
>>> for non-ATMEL chips, which brings back the performance cost.
>>> For investigation and analysis of why wait_for_tpm_stat
>>> caused the issue, and how the regression was introduced,
>>> please read the original thread above.
>>> 
>>> Thus the proposed fix here is to only revert the timeout
>>> for ATMEL chips by checking the vendor ID.
>>> 
>>> Signed-off-by: Hao Wu <hao.wu@rubrik.com>
>>> Fixes: 9f3fc7bcddcb ("tpm: replace msleep() with usleep_range() in TPM 1.2/2.0 generic drivers")
>> 
>> Fixes tag should be before SOB.
>> 
>>> ---
>>> Test Plan:
>>> - Run fixed kernel with ATMEL TPM chips and see crash
>>> has been fixed.
>>> - Run fixed kernel with non-ATMEL TPM chips, and confirm
>>> the timeout has not been changed.
>>> 
>>> drivers/char/tpm/tpm.h          |  9 ++++++++-
>>> drivers/char/tpm/tpm_tis_core.c | 19 +++++++++++++++++--
>>> include/linux/tpm.h             |  2 ++
>>> 3 files changed, 27 insertions(+), 3 deletions(-)
>>> 
>>> diff --git a/drivers/char/tpm/tpm.h b/drivers/char/tpm/tpm.h
>>> index 283f78211c3a..bc6aa7f9e119 100644
>>> --- a/drivers/char/tpm/tpm.h
>>> +++ b/drivers/char/tpm/tpm.h
>>> @@ -42,7 +42,9 @@ enum tpm_timeout {
>>> 	TPM_TIMEOUT_RANGE_US = 300,	/* usecs */
>>> 	TPM_TIMEOUT_POLL = 1,	/* msecs */
>>> 	TPM_TIMEOUT_USECS_MIN = 100,      /* usecs */
>>> -	TPM_TIMEOUT_USECS_MAX = 500      /* usecs */
>>> +	TPM_TIMEOUT_USECS_MAX = 500,	/* usecs */
>> 
>> What is this change?
> Need to add the tailing comma
Ah, sorry, didn’t notice added the duplicated line by mistake. Will remove it. 
> 
>> 
>>> +	TPM_TIMEOUT_WAIT_STAT = 500,	/* usecs */
>>> +	TPM_ATML_TIMEOUT_WAIT_STAT = 15000	/* usecs */
>>> };
>>> 
>>> /* TPM addresses */
>>> @@ -189,6 +191,11 @@ static inline void tpm_msleep(unsigned int delay_msec)
>>> 		     delay_msec * 1000);
>>> };
>>> 
>>> +static inline void tpm_usleep(unsigned int delay_usec)
>>> +{
>>> +	usleep_range(delay_usec - TPM_TIMEOUT_RANGE_US, delay_usec);
>>> +};
>> 
>> Please remove this, and open code.
> Ok, will do
> 
>>> +
>>> int tpm_chip_start(struct tpm_chip *chip);
>>> void tpm_chip_stop(struct tpm_chip *chip);
>>> struct tpm_chip *tpm_find_get_ops(struct tpm_chip *chip);
>>> diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
>>> index 55b9d3965ae1..9ddd4edfe1c2 100644
>>> --- a/drivers/char/tpm/tpm_tis_core.c
>>> +++ b/drivers/char/tpm/tpm_tis_core.c
>>> @@ -80,8 +80,12 @@ static int wait_for_tpm_stat(struct tpm_chip *chip, u8 mask,
>>> 		}
>>> 	} else {
>>> 		do {
>>> -			usleep_range(TPM_TIMEOUT_USECS_MIN,
>>> -				     TPM_TIMEOUT_USECS_MAX);
>>> +			if (chip->timeout_wait_stat && 
>>> +				chip->timeout_wait_stat >= TPM_TIMEOUT_WAIT_STAT) {
>>> +				tpm_usleep((unsigned int)(chip->timeout_wait_stat));
>>> +			} else {
>>> +				tpm_usleep((unsigned int)(TPM_TIMEOUT_WAIT_STAT));
>>> +			}
>> 
>> Invalid use of braces. Please read
>> 
>> https://www.kernel.org/doc/html/v5.13/process/coding-style.html
>> 
>> Why do you have to use this field conditionally anyway? Why doesn't
>> it always contain a legit value?
> The field is legit now, but doesn’t hurt to do addition check for robustness 
> to ensure no crash ? Just in case the value is updated below TPM_TIMEOUT_WAIT_STAT ? 
> 
> Can remove if we think it is not needed.
> 
>> 
>>> 			status = chip->ops->status(chip);
>>> 			if ((status & mask) == mask)
>>> 				return 0;
>>> @@ -934,6 +938,8 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
>>> 	chip->timeout_b = msecs_to_jiffies(TIS_TIMEOUT_B_MAX);
>>> 	chip->timeout_c = msecs_to_jiffies(TIS_TIMEOUT_C_MAX);
>>> 	chip->timeout_d = msecs_to_jiffies(TIS_TIMEOUT_D_MAX);
>>> +	/* init timeout for wait_for_tpm_stat */
>>> +	chip->timeout_wait_stat = TPM_TIMEOUT_WAIT_STAT;
>>> 	priv->phy_ops = phy_ops;
>>> 	dev_set_drvdata(&chip->dev, priv);
>>> 
>>> @@ -983,6 +989,15 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
>>> 
>>> 	priv->manufacturer_id = vendor;
>>> 
>>> +	switch (priv->manufacturer_id) {
>>> +	case TPM_VID_ATML:
>>> +        /* ATMEL chip needs longer timeout to avoid crash */
> Will fix the indentation
> 
> Also according to Kenneth we only want to do so for TPM 1.2, 
> I will try checking chip->flags against TPM_CHIP_FLAG_TPM2 here
> Let me know if there are concerns.
> 
>>> +		chip->timeout_wait_stat = TPM_ATML_TIMEOUT_WAIT_STAT;
>>> +		break;
>>> +	default:
>>> +		chip->timeout_wait_stat = TPM_TIMEOUT_WAIT_STAT;
>>> +	}
>>> +
>>> 	rc = tpm_tis_read8(priv, TPM_RID(0), &rid);
>>> 	if (rc < 0)
>>> 		goto out_err;
>>> diff --git a/include/linux/tpm.h b/include/linux/tpm.h
>>> index aa11fe323c56..35f2a0260d76 100644
>>> --- a/include/linux/tpm.h
>>> +++ b/include/linux/tpm.h
>>> @@ -150,6 +150,7 @@ struct tpm_chip {
>>> 	bool timeout_adjusted;
>>> 	unsigned long duration[TPM_NUM_DURATIONS]; /* jiffies */
>>> 	bool duration_adjusted;
>>> +	unsigned long timeout_wait_stat; /* usecs */
>>> 
>>> 	struct dentry *bios_dir[TPM_NUM_EVENT_LOG_FILES];
>>> 
>>> @@ -269,6 +270,7 @@ enum tpm2_cc_attrs {
>>> #define TPM_VID_INTEL    0x8086
>>> #define TPM_VID_WINBOND  0x1050
>>> #define TPM_VID_STM      0x104A
>>> +#define TPM_VID_ATML     0x1114
>>> 
>>> enum tpm_chip_flags {
>>> 	TPM_CHIP_FLAG_TPM2		= BIT(1),
>>> -- 
>>> 2.29.0.vfs.0.0
>>> 
>>> 
>> 
>> /Jarkko

Hao


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH] tpm: fix ATMEL TPM crash caused by too frequent queries
  2021-07-02  7:33       ` Hao Wu
  2021-07-02  7:35         ` Hao Wu
@ 2021-07-02  7:45         ` Jarkko Sakkinen
  2021-07-02  7:59           ` Hao Wu
  1 sibling, 1 reply; 47+ messages in thread
From: Jarkko Sakkinen @ 2021-07-02  7:45 UTC (permalink / raw)
  To: Hao Wu
  Cc: Shrihari Kalkar, Seungyeop Han, Anish Jhaveri, peterhuewe, jgg,
	linux-integrity, Paul Menzel, Ken Goldman, zohar, why2jjj.linux,
	Hamza Attak, gregkh, arnd, Nayna, James Bottomley

On Fri, Jul 02, 2021 at 12:33:15AM -0700, Hao Wu wrote:
> 
> 
> > On Jul 1, 2021, at 11:35 PM, Jarkko Sakkinen <jarkko@kernel.org> wrote:
> > 
> > On Tue, Jun 29, 2021 at 09:22:05PM -0700, Hao Wu wrote:
> >> This is a fix for the ATMEL TPM crash bug reported in
> >> https://patchwork.kernel.org/project/linux-integrity/patch/20200926223150.109645-1-hao.wu@rubrik.com/
> >> 
> >> According to the discussions in the original thread,
> >> we don't want to revert the timeout of wait_for_tpm_stat
> >> for non-ATMEL chips, which brings back the performance cost.
> >> For investigation and analysis of why wait_for_tpm_stat
> >> caused the issue, and how the regression was introduced,
> >> please read the original thread above.
> >> 
> >> Thus the proposed fix here is to only revert the timeout
> >> for ATMEL chips by checking the vendor ID.
> >> 
> >> Signed-off-by: Hao Wu <hao.wu@rubrik.com>
> >> Fixes: 9f3fc7bcddcb ("tpm: replace msleep() with usleep_range() in TPM 1.2/2.0 generic drivers")
> > 
> > Fixes tag should be before SOB.
> > 
> >> ---
> >> Test Plan:
> >> - Run fixed kernel with ATMEL TPM chips and see crash
> >> has been fixed.
> >> - Run fixed kernel with non-ATMEL TPM chips, and confirm
> >> the timeout has not been changed.
> >> 
> >> drivers/char/tpm/tpm.h          |  9 ++++++++-
> >> drivers/char/tpm/tpm_tis_core.c | 19 +++++++++++++++++--
> >> include/linux/tpm.h             |  2 ++
> >> 3 files changed, 27 insertions(+), 3 deletions(-)
> >> 
> >> diff --git a/drivers/char/tpm/tpm.h b/drivers/char/tpm/tpm.h
> >> index 283f78211c3a..bc6aa7f9e119 100644
> >> --- a/drivers/char/tpm/tpm.h
> >> +++ b/drivers/char/tpm/tpm.h
> >> @@ -42,7 +42,9 @@ enum tpm_timeout {
> >> 	TPM_TIMEOUT_RANGE_US = 300,	/* usecs */
> >> 	TPM_TIMEOUT_POLL = 1,	/* msecs */
> >> 	TPM_TIMEOUT_USECS_MIN = 100,      /* usecs */
> >> -	TPM_TIMEOUT_USECS_MAX = 500      /* usecs */
> >> +	TPM_TIMEOUT_USECS_MAX = 500,	/* usecs */
> > 
> > What is this change?
> Need to add the tailing comma
> 
> > 
> >> +	TPM_TIMEOUT_WAIT_STAT = 500,	/* usecs */
> >> +	TPM_ATML_TIMEOUT_WAIT_STAT = 15000	/* usecs */
> >> };
> >> 
> >> /* TPM addresses */
> >> @@ -189,6 +191,11 @@ static inline void tpm_msleep(unsigned int delay_msec)
> >> 		     delay_msec * 1000);
> >> };
> >> 
> >> +static inline void tpm_usleep(unsigned int delay_usec)
> >> +{
> >> +	usleep_range(delay_usec - TPM_TIMEOUT_RANGE_US, delay_usec);
> >> +};
> > 
> > Please remove this, and open code.
> Ok, will do
> 
> >> +
> >> int tpm_chip_start(struct tpm_chip *chip);
> >> void tpm_chip_stop(struct tpm_chip *chip);
> >> struct tpm_chip *tpm_find_get_ops(struct tpm_chip *chip);
> >> diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
> >> index 55b9d3965ae1..9ddd4edfe1c2 100644
> >> --- a/drivers/char/tpm/tpm_tis_core.c
> >> +++ b/drivers/char/tpm/tpm_tis_core.c
> >> @@ -80,8 +80,12 @@ static int wait_for_tpm_stat(struct tpm_chip *chip, u8 mask,
> >> 		}
> >> 	} else {
> >> 		do {
> >> -			usleep_range(TPM_TIMEOUT_USECS_MIN,
> >> -				     TPM_TIMEOUT_USECS_MAX);
> >> +			if (chip->timeout_wait_stat && 
> >> +				chip->timeout_wait_stat >= TPM_TIMEOUT_WAIT_STAT) {
> >> +				tpm_usleep((unsigned int)(chip->timeout_wait_stat));
> >> +			} else {
> >> +				tpm_usleep((unsigned int)(TPM_TIMEOUT_WAIT_STAT));
> >> +			}
> > 
> > Invalid use of braces. Please read
> > 
> > https://www.kernel.org/doc/html/v5.13/process/coding-style.html
> > 
> > Why do you have to use this field conditionally anyway? Why doesn't
> > it always contain a legit value?
> The field is legit now, but doesn’t hurt to do addition check for robustness 
> to ensure no crash ? Just in case the value is updated below TPM_TIMEOUT_WAIT_STAT ? 
> 
> Can remove if we think it is not needed.

A simple question: why you use it conditionally? Can the field contain invalid value?

/Jarkko

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH] tpm: fix ATMEL TPM crash caused by too frequent queries
  2021-07-02  7:45         ` Jarkko Sakkinen
@ 2021-07-02  7:59           ` Hao Wu
  2021-07-02  8:42             ` Jarkko Sakkinen
  0 siblings, 1 reply; 47+ messages in thread
From: Hao Wu @ 2021-07-02  7:59 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Shrihari Kalkar, Seungyeop Han, Anish Jhaveri, peterhuewe, jgg,
	linux-integrity, Paul Menzel, Ken Goldman, zohar, why2jjj.linux,
	Hamza Attak, gregkh, arnd, Nayna, James Bottomley

> On Jul 2, 2021, at 12:45 AM, Jarkko Sakkinen <jarkko@kernel.org> wrote:
> 
> On Fri, Jul 02, 2021 at 12:33:15AM -0700, Hao Wu wrote:
>> 
>> 
>>> On Jul 1, 2021, at 11:35 PM, Jarkko Sakkinen <jarkko@kernel.org> wrote:
>>> 
>>> On Tue, Jun 29, 2021 at 09:22:05PM -0700, Hao Wu wrote:
>>>> This is a fix for the ATMEL TPM crash bug reported in
>>>> https://patchwork.kernel.org/project/linux-integrity/patch/20200926223150.109645-1-hao.wu@rubrik.com/
>>>> 
>>>> According to the discussions in the original thread,
>>>> we don't want to revert the timeout of wait_for_tpm_stat
>>>> for non-ATMEL chips, which brings back the performance cost.
>>>> For investigation and analysis of why wait_for_tpm_stat
>>>> caused the issue, and how the regression was introduced,
>>>> please read the original thread above.
>>>> 
>>>> Thus the proposed fix here is to only revert the timeout
>>>> for ATMEL chips by checking the vendor ID.
>>>> 
>>>> Signed-off-by: Hao Wu <hao.wu@rubrik.com>
>>>> Fixes: 9f3fc7bcddcb ("tpm: replace msleep() with usleep_range() in TPM 1.2/2.0 generic drivers")
>>> 
>>> Fixes tag should be before SOB.
>>> 
>>>> ---
>>>> Test Plan:
>>>> - Run fixed kernel with ATMEL TPM chips and see crash
>>>> has been fixed.
>>>> - Run fixed kernel with non-ATMEL TPM chips, and confirm
>>>> the timeout has not been changed.
>>>> 
>>>> drivers/char/tpm/tpm.h          |  9 ++++++++-
>>>> drivers/char/tpm/tpm_tis_core.c | 19 +++++++++++++++++--
>>>> include/linux/tpm.h             |  2 ++
>>>> 3 files changed, 27 insertions(+), 3 deletions(-)
>>>> 
>>>> diff --git a/drivers/char/tpm/tpm.h b/drivers/char/tpm/tpm.h
>>>> index 283f78211c3a..bc6aa7f9e119 100644
>>>> --- a/drivers/char/tpm/tpm.h
>>>> +++ b/drivers/char/tpm/tpm.h
>>>> @@ -42,7 +42,9 @@ enum tpm_timeout {
>>>> 	TPM_TIMEOUT_RANGE_US = 300,	/* usecs */
>>>> 	TPM_TIMEOUT_POLL = 1,	/* msecs */
>>>> 	TPM_TIMEOUT_USECS_MIN = 100,      /* usecs */
>>>> -	TPM_TIMEOUT_USECS_MAX = 500      /* usecs */
>>>> +	TPM_TIMEOUT_USECS_MAX = 500,	/* usecs */
>>> 
>>> What is this change?
>> Need to add the tailing comma
>> 
>>> 
>>>> +	TPM_TIMEOUT_WAIT_STAT = 500,	/* usecs */
>>>> +	TPM_ATML_TIMEOUT_WAIT_STAT = 15000	/* usecs */
>>>> };
>>>> 
>>>> /* TPM addresses */
>>>> @@ -189,6 +191,11 @@ static inline void tpm_msleep(unsigned int delay_msec)
>>>> 		     delay_msec * 1000);
>>>> };
>>>> 
>>>> +static inline void tpm_usleep(unsigned int delay_usec)
>>>> +{
>>>> +	usleep_range(delay_usec - TPM_TIMEOUT_RANGE_US, delay_usec);
>>>> +};
>>> 
>>> Please remove this, and open code.
>> Ok, will do
>> 
>>>> +
>>>> int tpm_chip_start(struct tpm_chip *chip);
>>>> void tpm_chip_stop(struct tpm_chip *chip);
>>>> struct tpm_chip *tpm_find_get_ops(struct tpm_chip *chip);
>>>> diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
>>>> index 55b9d3965ae1..9ddd4edfe1c2 100644
>>>> --- a/drivers/char/tpm/tpm_tis_core.c
>>>> +++ b/drivers/char/tpm/tpm_tis_core.c
>>>> @@ -80,8 +80,12 @@ static int wait_for_tpm_stat(struct tpm_chip *chip, u8 mask,
>>>> 		}
>>>> 	} else {
>>>> 		do {
>>>> -			usleep_range(TPM_TIMEOUT_USECS_MIN,
>>>> -				     TPM_TIMEOUT_USECS_MAX);
>>>> +			if (chip->timeout_wait_stat && 
>>>> +				chip->timeout_wait_stat >= TPM_TIMEOUT_WAIT_STAT) {
>>>> +				tpm_usleep((unsigned int)(chip->timeout_wait_stat));
>>>> +			} else {
>>>> +				tpm_usleep((unsigned int)(TPM_TIMEOUT_WAIT_STAT));
>>>> +			}
>>> 
>>> Invalid use of braces. Please read
>>> 
>>> https://www.kernel.org/doc/html/v5.13/process/coding-style.html
>>> 
>>> Why do you have to use this field conditionally anyway? Why doesn't
>>> it always contain a legit value?
>> The field is legit now, but doesn’t hurt to do addition check for robustness 
>> to ensure no crash ? Just in case the value is updated below TPM_TIMEOUT_WAIT_STAT ? 
>> 
>> Can remove if we think it is not needed.
> 
> A simple question: why you use it conditionally? Can the field contain invalid value?
> 
There are two checks
- chip->timeout_wait_stat >= TPM_TIMEOUT_WAIT_STAT
It could be invalid when future developer set it to some value less than `TPM_TIMEOUT_USECS_MIN`,
and crash the usleep 

- chip->timeout_wait_stat
Yes this is needed, because this code path will be run even chip->timeout_wait_stat has not been initialized in tpm_tis_core_init
from my observation. Didn’t dig into how it is used though.  

> /Jarkko

Hao


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH] tpm: fix ATMEL TPM crash caused by too frequent queries
  2021-07-02  7:59           ` Hao Wu
@ 2021-07-02  8:42             ` Jarkko Sakkinen
  2021-07-02 11:57               ` Jarkko Sakkinen
  0 siblings, 1 reply; 47+ messages in thread
From: Jarkko Sakkinen @ 2021-07-02  8:42 UTC (permalink / raw)
  To: Hao Wu
  Cc: Shrihari Kalkar, Seungyeop Han, Anish Jhaveri, peterhuewe, jgg,
	linux-integrity, Paul Menzel, Ken Goldman, zohar, why2jjj.linux,
	Hamza Attak, gregkh, arnd, Nayna, James Bottomley

On Fri, Jul 02, 2021 at 12:59:18AM -0700, Hao Wu wrote:
> > On Jul 2, 2021, at 12:45 AM, Jarkko Sakkinen <jarkko@kernel.org> wrote:
> > 
> > On Fri, Jul 02, 2021 at 12:33:15AM -0700, Hao Wu wrote:
> >> 
> >> 
> >>> On Jul 1, 2021, at 11:35 PM, Jarkko Sakkinen <jarkko@kernel.org> wrote:
> >>> 
> >>> On Tue, Jun 29, 2021 at 09:22:05PM -0700, Hao Wu wrote:
> >>>> This is a fix for the ATMEL TPM crash bug reported in
> >>>> https://patchwork.kernel.org/project/linux-integrity/patch/20200926223150.109645-1-hao.wu@rubrik.com/
> >>>> 
> >>>> According to the discussions in the original thread,
> >>>> we don't want to revert the timeout of wait_for_tpm_stat
> >>>> for non-ATMEL chips, which brings back the performance cost.
> >>>> For investigation and analysis of why wait_for_tpm_stat
> >>>> caused the issue, and how the regression was introduced,
> >>>> please read the original thread above.
> >>>> 
> >>>> Thus the proposed fix here is to only revert the timeout
> >>>> for ATMEL chips by checking the vendor ID.
> >>>> 
> >>>> Signed-off-by: Hao Wu <hao.wu@rubrik.com>
> >>>> Fixes: 9f3fc7bcddcb ("tpm: replace msleep() with usleep_range() in TPM 1.2/2.0 generic drivers")
> >>> 
> >>> Fixes tag should be before SOB.
> >>> 
> >>>> ---
> >>>> Test Plan:
> >>>> - Run fixed kernel with ATMEL TPM chips and see crash
> >>>> has been fixed.
> >>>> - Run fixed kernel with non-ATMEL TPM chips, and confirm
> >>>> the timeout has not been changed.
> >>>> 
> >>>> drivers/char/tpm/tpm.h          |  9 ++++++++-
> >>>> drivers/char/tpm/tpm_tis_core.c | 19 +++++++++++++++++--
> >>>> include/linux/tpm.h             |  2 ++
> >>>> 3 files changed, 27 insertions(+), 3 deletions(-)
> >>>> 
> >>>> diff --git a/drivers/char/tpm/tpm.h b/drivers/char/tpm/tpm.h
> >>>> index 283f78211c3a..bc6aa7f9e119 100644
> >>>> --- a/drivers/char/tpm/tpm.h
> >>>> +++ b/drivers/char/tpm/tpm.h
> >>>> @@ -42,7 +42,9 @@ enum tpm_timeout {
> >>>> 	TPM_TIMEOUT_RANGE_US = 300,	/* usecs */
> >>>> 	TPM_TIMEOUT_POLL = 1,	/* msecs */
> >>>> 	TPM_TIMEOUT_USECS_MIN = 100,      /* usecs */
> >>>> -	TPM_TIMEOUT_USECS_MAX = 500      /* usecs */
> >>>> +	TPM_TIMEOUT_USECS_MAX = 500,	/* usecs */
> >>> 
> >>> What is this change?
> >> Need to add the tailing comma
> >> 
> >>> 
> >>>> +	TPM_TIMEOUT_WAIT_STAT = 500,	/* usecs */
> >>>> +	TPM_ATML_TIMEOUT_WAIT_STAT = 15000	/* usecs */
> >>>> };
> >>>> 
> >>>> /* TPM addresses */
> >>>> @@ -189,6 +191,11 @@ static inline void tpm_msleep(unsigned int delay_msec)
> >>>> 		     delay_msec * 1000);
> >>>> };
> >>>> 
> >>>> +static inline void tpm_usleep(unsigned int delay_usec)
> >>>> +{
> >>>> +	usleep_range(delay_usec - TPM_TIMEOUT_RANGE_US, delay_usec);
> >>>> +};
> >>> 
> >>> Please remove this, and open code.
> >> Ok, will do
> >> 
> >>>> +
> >>>> int tpm_chip_start(struct tpm_chip *chip);
> >>>> void tpm_chip_stop(struct tpm_chip *chip);
> >>>> struct tpm_chip *tpm_find_get_ops(struct tpm_chip *chip);
> >>>> diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
> >>>> index 55b9d3965ae1..9ddd4edfe1c2 100644
> >>>> --- a/drivers/char/tpm/tpm_tis_core.c
> >>>> +++ b/drivers/char/tpm/tpm_tis_core.c
> >>>> @@ -80,8 +80,12 @@ static int wait_for_tpm_stat(struct tpm_chip *chip, u8 mask,
> >>>> 		}
> >>>> 	} else {
> >>>> 		do {
> >>>> -			usleep_range(TPM_TIMEOUT_USECS_MIN,
> >>>> -				     TPM_TIMEOUT_USECS_MAX);
> >>>> +			if (chip->timeout_wait_stat && 
> >>>> +				chip->timeout_wait_stat >= TPM_TIMEOUT_WAIT_STAT) {
> >>>> +				tpm_usleep((unsigned int)(chip->timeout_wait_stat));
> >>>> +			} else {
> >>>> +				tpm_usleep((unsigned int)(TPM_TIMEOUT_WAIT_STAT));
> >>>> +			}
> >>> 
> >>> Invalid use of braces. Please read
> >>> 
> >>> https://www.kernel.org/doc/html/v5.13/process/coding-style.html
> >>> 
> >>> Why do you have to use this field conditionally anyway? Why doesn't
> >>> it always contain a legit value?
> >> The field is legit now, but doesn’t hurt to do addition check for robustness 
> >> to ensure no crash ? Just in case the value is updated below TPM_TIMEOUT_WAIT_STAT ? 
> >> 
> >> Can remove if we think it is not needed.
> > 
> > A simple question: why you use it conditionally? Can the field contain invalid value?
> > 
> There are two checks
> - chip->timeout_wait_stat >= TPM_TIMEOUT_WAIT_STAT
> It could be invalid when future developer set it to some value less than `TPM_TIMEOUT_USECS_MIN`,
> and crash the usleep 

I don't understand this. Why you don't set to appropriate value?

/Jarkko

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH] tpm: fix ATMEL TPM crash caused by too frequent queries
  2021-07-02  8:42             ` Jarkko Sakkinen
@ 2021-07-02 11:57               ` Jarkko Sakkinen
  2021-07-02 19:16                 ` Hao Wu
  0 siblings, 1 reply; 47+ messages in thread
From: Jarkko Sakkinen @ 2021-07-02 11:57 UTC (permalink / raw)
  To: Hao Wu
  Cc: Shrihari Kalkar, Seungyeop Han, Anish Jhaveri, peterhuewe, jgg,
	linux-integrity, Paul Menzel, Ken Goldman, zohar, why2jjj.linux,
	Hamza Attak, gregkh, arnd, Nayna, James Bottomley

On Fri, Jul 02, 2021 at 11:42:39AM +0300, Jarkko Sakkinen wrote:
> On Fri, Jul 02, 2021 at 12:59:18AM -0700, Hao Wu wrote:
> > > On Jul 2, 2021, at 12:45 AM, Jarkko Sakkinen <jarkko@kernel.org> wrote:
> > > 
> > > On Fri, Jul 02, 2021 at 12:33:15AM -0700, Hao Wu wrote:
> > >> 
> > >> 
> > >>> On Jul 1, 2021, at 11:35 PM, Jarkko Sakkinen <jarkko@kernel.org> wrote:
> > >>> 
> > >>> On Tue, Jun 29, 2021 at 09:22:05PM -0700, Hao Wu wrote:
> > >>>> This is a fix for the ATMEL TPM crash bug reported in
> > >>>> https://patchwork.kernel.org/project/linux-integrity/patch/20200926223150.109645-1-hao.wu@rubrik.com/
> > >>>> 
> > >>>> According to the discussions in the original thread,
> > >>>> we don't want to revert the timeout of wait_for_tpm_stat
> > >>>> for non-ATMEL chips, which brings back the performance cost.
> > >>>> For investigation and analysis of why wait_for_tpm_stat
> > >>>> caused the issue, and how the regression was introduced,
> > >>>> please read the original thread above.
> > >>>> 
> > >>>> Thus the proposed fix here is to only revert the timeout
> > >>>> for ATMEL chips by checking the vendor ID.
> > >>>> 
> > >>>> Signed-off-by: Hao Wu <hao.wu@rubrik.com>
> > >>>> Fixes: 9f3fc7bcddcb ("tpm: replace msleep() with usleep_range() in TPM 1.2/2.0 generic drivers")
> > >>> 
> > >>> Fixes tag should be before SOB.
> > >>> 
> > >>>> ---
> > >>>> Test Plan:
> > >>>> - Run fixed kernel with ATMEL TPM chips and see crash
> > >>>> has been fixed.
> > >>>> - Run fixed kernel with non-ATMEL TPM chips, and confirm
> > >>>> the timeout has not been changed.
> > >>>> 
> > >>>> drivers/char/tpm/tpm.h          |  9 ++++++++-
> > >>>> drivers/char/tpm/tpm_tis_core.c | 19 +++++++++++++++++--
> > >>>> include/linux/tpm.h             |  2 ++
> > >>>> 3 files changed, 27 insertions(+), 3 deletions(-)
> > >>>> 
> > >>>> diff --git a/drivers/char/tpm/tpm.h b/drivers/char/tpm/tpm.h
> > >>>> index 283f78211c3a..bc6aa7f9e119 100644
> > >>>> --- a/drivers/char/tpm/tpm.h
> > >>>> +++ b/drivers/char/tpm/tpm.h
> > >>>> @@ -42,7 +42,9 @@ enum tpm_timeout {
> > >>>> 	TPM_TIMEOUT_RANGE_US = 300,	/* usecs */
> > >>>> 	TPM_TIMEOUT_POLL = 1,	/* msecs */
> > >>>> 	TPM_TIMEOUT_USECS_MIN = 100,      /* usecs */
> > >>>> -	TPM_TIMEOUT_USECS_MAX = 500      /* usecs */
> > >>>> +	TPM_TIMEOUT_USECS_MAX = 500,	/* usecs */
> > >>> 
> > >>> What is this change?
> > >> Need to add the tailing comma
> > >> 
> > >>> 
> > >>>> +	TPM_TIMEOUT_WAIT_STAT = 500,	/* usecs */
> > >>>> +	TPM_ATML_TIMEOUT_WAIT_STAT = 15000	/* usecs */
> > >>>> };
> > >>>> 
> > >>>> /* TPM addresses */
> > >>>> @@ -189,6 +191,11 @@ static inline void tpm_msleep(unsigned int delay_msec)
> > >>>> 		     delay_msec * 1000);
> > >>>> };
> > >>>> 
> > >>>> +static inline void tpm_usleep(unsigned int delay_usec)
> > >>>> +{
> > >>>> +	usleep_range(delay_usec - TPM_TIMEOUT_RANGE_US, delay_usec);
> > >>>> +};
> > >>> 
> > >>> Please remove this, and open code.
> > >> Ok, will do
> > >> 
> > >>>> +
> > >>>> int tpm_chip_start(struct tpm_chip *chip);
> > >>>> void tpm_chip_stop(struct tpm_chip *chip);
> > >>>> struct tpm_chip *tpm_find_get_ops(struct tpm_chip *chip);
> > >>>> diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
> > >>>> index 55b9d3965ae1..9ddd4edfe1c2 100644
> > >>>> --- a/drivers/char/tpm/tpm_tis_core.c
> > >>>> +++ b/drivers/char/tpm/tpm_tis_core.c
> > >>>> @@ -80,8 +80,12 @@ static int wait_for_tpm_stat(struct tpm_chip *chip, u8 mask,
> > >>>> 		}
> > >>>> 	} else {
> > >>>> 		do {
> > >>>> -			usleep_range(TPM_TIMEOUT_USECS_MIN,
> > >>>> -				     TPM_TIMEOUT_USECS_MAX);
> > >>>> +			if (chip->timeout_wait_stat && 
> > >>>> +				chip->timeout_wait_stat >= TPM_TIMEOUT_WAIT_STAT) {
> > >>>> +				tpm_usleep((unsigned int)(chip->timeout_wait_stat));
> > >>>> +			} else {
> > >>>> +				tpm_usleep((unsigned int)(TPM_TIMEOUT_WAIT_STAT));
> > >>>> +			}
> > >>> 
> > >>> Invalid use of braces. Please read
> > >>> 
> > >>> https://www.kernel.org/doc/html/v5.13/process/coding-style.html
> > >>> 
> > >>> Why do you have to use this field conditionally anyway? Why doesn't
> > >>> it always contain a legit value?
> > >> The field is legit now, but doesn’t hurt to do addition check for robustness 
> > >> to ensure no crash ? Just in case the value is updated below TPM_TIMEOUT_WAIT_STAT ? 
> > >> 
> > >> Can remove if we think it is not needed.
> > > 
> > > A simple question: why you use it conditionally? Can the field contain invalid value?
> > > 
> > There are two checks
> > - chip->timeout_wait_stat >= TPM_TIMEOUT_WAIT_STAT
> > It could be invalid when future developer set it to some value less than `TPM_TIMEOUT_USECS_MIN`,
> > and crash the usleep 
> 
> I don't understand this. Why you don't set to appropriate value?

What you should do, is to define two fields:

- tpm_timeout_min
- tpm_timeout_max

And initialize these to TPM_TIMEOUT_USECS_MIN and TPM_TIMEOUT_USECS_MAX.

Then fixup those for Atmel (with a simple if-statement, switch-case is
overkill).

The way you work out things right now is broken:

1. Before for non-Atmel: usleep_range(100, 500)
2. After for non-Atmel: usleep_range(200, 500)

I.e. the patch changes code semantically that it should not touch in the
first place.

/Jarkko

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH] tpm: fix ATMEL TPM crash caused by too frequent queries
  2021-07-02 11:57               ` Jarkko Sakkinen
@ 2021-07-02 19:16                 ` Hao Wu
  2021-07-05  5:19                   ` Jarkko Sakkinen
  0 siblings, 1 reply; 47+ messages in thread
From: Hao Wu @ 2021-07-02 19:16 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Shrihari Kalkar, Seungyeop Han, Anish Jhaveri, peterhuewe, jgg,
	linux-integrity, Paul Menzel, Ken Goldman, zohar, why2jjj.linux,
	Hamza Attak, gregkh, arnd, Nayna, James Bottomley


> On Jul 2, 2021, at 4:57 AM, Jarkko Sakkinen <jarkko@kernel.org> wrote:
> 
> On Fri, Jul 02, 2021 at 11:42:39AM +0300, Jarkko Sakkinen wrote:
>> On Fri, Jul 02, 2021 at 12:59:18AM -0700, Hao Wu wrote:
>>>> On Jul 2, 2021, at 12:45 AM, Jarkko Sakkinen <jarkko@kernel.org> wrote:
>>>> 
>>>> On Fri, Jul 02, 2021 at 12:33:15AM -0700, Hao Wu wrote:
>>>>> 
>>>>> 
>>>>>> On Jul 1, 2021, at 11:35 PM, Jarkko Sakkinen <jarkko@kernel.org> wrote:
>>>>>> 
>>>>>> On Tue, Jun 29, 2021 at 09:22:05PM -0700, Hao Wu wrote:
>>>>>>> This is a fix for the ATMEL TPM crash bug reported in
>>>>>>> https://patchwork.kernel.org/project/linux-integrity/patch/20200926223150.109645-1-hao.wu@rubrik.com/
>>>>>>> 
>>>>>>> According to the discussions in the original thread,
>>>>>>> we don't want to revert the timeout of wait_for_tpm_stat
>>>>>>> for non-ATMEL chips, which brings back the performance cost.
>>>>>>> For investigation and analysis of why wait_for_tpm_stat
>>>>>>> caused the issue, and how the regression was introduced,
>>>>>>> please read the original thread above.
>>>>>>> 
>>>>>>> Thus the proposed fix here is to only revert the timeout
>>>>>>> for ATMEL chips by checking the vendor ID.
>>>>>>> 
>>>>>>> Signed-off-by: Hao Wu <hao.wu@rubrik.com>
>>>>>>> Fixes: 9f3fc7bcddcb ("tpm: replace msleep() with usleep_range() in TPM 1.2/2.0 generic drivers")
>>>>>> 
>>>>>> Fixes tag should be before SOB.
>>>>>> 
>>>>>>> ---
>>>>>>> Test Plan:
>>>>>>> - Run fixed kernel with ATMEL TPM chips and see crash
>>>>>>> has been fixed.
>>>>>>> - Run fixed kernel with non-ATMEL TPM chips, and confirm
>>>>>>> the timeout has not been changed.
>>>>>>> 
>>>>>>> drivers/char/tpm/tpm.h          |  9 ++++++++-
>>>>>>> drivers/char/tpm/tpm_tis_core.c | 19 +++++++++++++++++--
>>>>>>> include/linux/tpm.h             |  2 ++
>>>>>>> 3 files changed, 27 insertions(+), 3 deletions(-)
>>>>>>> 
>>>>>>> diff --git a/drivers/char/tpm/tpm.h b/drivers/char/tpm/tpm.h
>>>>>>> index 283f78211c3a..bc6aa7f9e119 100644
>>>>>>> --- a/drivers/char/tpm/tpm.h
>>>>>>> +++ b/drivers/char/tpm/tpm.h
>>>>>>> @@ -42,7 +42,9 @@ enum tpm_timeout {
>>>>>>> 	TPM_TIMEOUT_RANGE_US = 300,	/* usecs */
>>>>>>> 	TPM_TIMEOUT_POLL = 1,	/* msecs */
>>>>>>> 	TPM_TIMEOUT_USECS_MIN = 100,      /* usecs */
>>>>>>> -	TPM_TIMEOUT_USECS_MAX = 500      /* usecs */
>>>>>>> +	TPM_TIMEOUT_USECS_MAX = 500,	/* usecs */
>>>>>> 
>>>>>> What is this change?
>>>>> Need to add the tailing comma
>>>>> 
>>>>>> 
>>>>>>> +	TPM_TIMEOUT_WAIT_STAT = 500,	/* usecs */
>>>>>>> +	TPM_ATML_TIMEOUT_WAIT_STAT = 15000	/* usecs */
>>>>>>> };
>>>>>>> 
>>>>>>> /* TPM addresses */
>>>>>>> @@ -189,6 +191,11 @@ static inline void tpm_msleep(unsigned int delay_msec)
>>>>>>> 		     delay_msec * 1000);
>>>>>>> };
>>>>>>> 
>>>>>>> +static inline void tpm_usleep(unsigned int delay_usec)
>>>>>>> +{
>>>>>>> +	usleep_range(delay_usec - TPM_TIMEOUT_RANGE_US, delay_usec);
>>>>>>> +};
>>>>>> 
>>>>>> Please remove this, and open code.
>>>>> Ok, will do
>>>>> 
>>>>>>> +
>>>>>>> int tpm_chip_start(struct tpm_chip *chip);
>>>>>>> void tpm_chip_stop(struct tpm_chip *chip);
>>>>>>> struct tpm_chip *tpm_find_get_ops(struct tpm_chip *chip);
>>>>>>> diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
>>>>>>> index 55b9d3965ae1..9ddd4edfe1c2 100644
>>>>>>> --- a/drivers/char/tpm/tpm_tis_core.c
>>>>>>> +++ b/drivers/char/tpm/tpm_tis_core.c
>>>>>>> @@ -80,8 +80,12 @@ static int wait_for_tpm_stat(struct tpm_chip *chip, u8 mask,
>>>>>>> 		}
>>>>>>> 	} else {
>>>>>>> 		do {
>>>>>>> -			usleep_range(TPM_TIMEOUT_USECS_MIN,
>>>>>>> -				     TPM_TIMEOUT_USECS_MAX);
>>>>>>> +			if (chip->timeout_wait_stat && 
>>>>>>> +				chip->timeout_wait_stat >= TPM_TIMEOUT_WAIT_STAT) {
>>>>>>> +				tpm_usleep((unsigned int)(chip->timeout_wait_stat));
>>>>>>> +			} else {
>>>>>>> +				tpm_usleep((unsigned int)(TPM_TIMEOUT_WAIT_STAT));
>>>>>>> +			}
>>>>>> 
>>>>>> Invalid use of braces. Please read
>>>>>> 
>>>>>> https://www.kernel.org/doc/html/v5.13/process/coding-style.html
>>>>>> 
>>>>>> Why do you have to use this field conditionally anyway? Why doesn't
>>>>>> it always contain a legit value?
>>>>> The field is legit now, but doesn’t hurt to do addition check for robustness 
>>>>> to ensure no crash ? Just in case the value is updated below TPM_TIMEOUT_WAIT_STAT ? 
>>>>> 
>>>>> Can remove if we think it is not needed.
>>>> 
>>>> A simple question: why you use it conditionally? Can the field contain invalid value?
>>>> 
>>> There are two checks
>>> - chip->timeout_wait_stat >= TPM_TIMEOUT_WAIT_STAT
>>> It could be invalid when future developer set it to some value less than `TPM_TIMEOUT_USECS_MIN`,
>>> and crash the usleep 
>> 
>> I don't understand this. Why you don't set to appropriate value?
Ok, fair enough, I assume developers will test it anyway to ensure no crash. Will remove this check.

> What you should do, is to define two fields:
> 
> - tpm_timeout_min
> - tpm_timeout_max
> 
> And initialize these to TPM_TIMEOUT_USECS_MIN and TPM_TIMEOUT_USECS_MAX.
> 
> Then fixup those for Atmel (with a simple if-statement, switch-case is
> overkill).
Switch was more for extensibility when other vendor has similar issue,
but we can refactor when needed in the future. I can use if-statement for now.

> The way you work out things right now is broken:
> 
> 1. Before for non-Atmel: usleep_range(100, 500)
> 2. After for non-Atmel: usleep_range(200, 500)
I realized this in day-1, I think this range change does not matter much.
`TPM_TIMEOUT_RANGE_US=300` is already used in the codebase, I assume people define
such if for general use cases for usleep_range in TPM
But we can add two fields if that makes us more comfortable to strictly follow the current code
semantically.

> I.e. the patch changes code semantically that it should not touch in the
> first place.
> 
> /Jarkko
Hao


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH] tpm: fix ATMEL TPM crash caused by too frequent queries
  2021-06-30  4:22   ` [PATCH] tpm: fix ATMEL " Hao Wu
  2021-07-02  6:35     ` Jarkko Sakkinen
@ 2021-07-04  0:07     ` Hao Wu
  2021-07-05  7:15       ` Jarkko Sakkinen
  2021-07-07  4:31     ` [PATCH v2] " Hao Wu
  2021-07-09  4:40     ` [PATCH v2] tpm: fix Atmel " Hao Wu
  3 siblings, 1 reply; 47+ messages in thread
From: Hao Wu @ 2021-07-04  0:07 UTC (permalink / raw)
  To: hao.wu, shrihari.kalkar, seungyeop.han, anish.jhaveri,
	peterhuewe, jarkko, jgg, linux-integrity, pmenzel, kgold, zohar,
	why2jjj.linux, hamza, gregkh, arnd, nayna, James.Bottomley

This is a fix for the ATMEL TPM crash bug reported in
https://patchwork.kernel.org/project/linux-integrity/patch/20200926223150.109645-1-hao.wu@rubrik.com/

According to the discussions in the original thread,
we don't want to revert the timeout of wait_for_tpm_stat
for non-ATMEL chips, which brings back the performance cost.
For investigation and analysis of why wait_for_tpm_stat
caused the issue, and how the regression was introduced,
please read the original thread above.

Thus the proposed fix here is to only revert the timeout
for ATMEL chips by checking the vendor ID.

Fixes: 9f3fc7bcddcb ("tpm: replace msleep() with usleep_range() in TPM 1.2/2.0 generic drivers")
Signed-off-by: Hao Wu <hao.wu@rubrik.com>
---
Test Plan:
- Run fixed kernel with ATMEL TPM chips and see crash
has been fixed.
- Run fixed kernel with non-ATMEL TPM chips, and confirm
the timeout has not been changed.

 drivers/char/tpm/tpm.h          |  6 ++++--
 drivers/char/tpm/tpm_tis_core.c | 23 +++++++++++++++++++++--
 include/linux/tpm.h             |  3 +++
 3 files changed, 28 insertions(+), 4 deletions(-)

diff --git a/drivers/char/tpm/tpm.h b/drivers/char/tpm/tpm.h
index 283f78211c3a..6de1b44c4aab 100644
--- a/drivers/char/tpm/tpm.h
+++ b/drivers/char/tpm/tpm.h
@@ -41,8 +41,10 @@ enum tpm_timeout {
 	TPM_TIMEOUT_RETRY = 100, /* msecs */
 	TPM_TIMEOUT_RANGE_US = 300,	/* usecs */
 	TPM_TIMEOUT_POLL = 1,	/* msecs */
-	TPM_TIMEOUT_USECS_MIN = 100,      /* usecs */
-	TPM_TIMEOUT_USECS_MAX = 500      /* usecs */
+	TPM_TIMEOUT_USECS_MIN = 100,	/* usecs */
+	TPM_TIMEOUT_USECS_MAX = 500,	/* usecs */
+	TPM_ATML_TIMEOUT_WAIT_STAT_MIN = 14700,	/* usecs */
+	TPM_ATML_TIMEOUT_WAIT_STAT_MAX = 15000	/* usecs */
 };
 
 /* TPM addresses */
diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
index 55b9d3965ae1..ae27d66fdd94 100644
--- a/drivers/char/tpm/tpm_tis_core.c
+++ b/drivers/char/tpm/tpm_tis_core.c
@@ -80,8 +80,17 @@ static int wait_for_tpm_stat(struct tpm_chip *chip, u8 mask,
 		}
 	} else {
 		do {
-			usleep_range(TPM_TIMEOUT_USECS_MIN,
-				     TPM_TIMEOUT_USECS_MAX);
+			/* this code path could be executed before
+			 * timeouts initialized in chip instance.
+			 */
+			if (chip->timeout_wait_stat_min &&
+			    chip->timeout_wait_stat_max)
+				usleep_range(chip->timeout_wait_stat_min,
+					     chip->timeout_wait_stat_max);
+			else
+				usleep_range(TPM_TIMEOUT_USECS_MIN,
+					     TPM_TIMEOUT_USECS_MAX);
+
 			status = chip->ops->status(chip);
 			if ((status & mask) == mask)
 				return 0;
@@ -934,6 +943,9 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
 	chip->timeout_b = msecs_to_jiffies(TIS_TIMEOUT_B_MAX);
 	chip->timeout_c = msecs_to_jiffies(TIS_TIMEOUT_C_MAX);
 	chip->timeout_d = msecs_to_jiffies(TIS_TIMEOUT_D_MAX);
+	/* init timeouts for wait_for_tpm_stat */
+	chip->timeout_wait_stat_min = TPM_TIMEOUT_USECS_MIN;
+	chip->timeout_wait_stat_max = TPM_TIMEOUT_USECS_MAX;
 	priv->phy_ops = phy_ops;
 	dev_set_drvdata(&chip->dev, priv);
 
@@ -983,6 +995,13 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
 
 	priv->manufacturer_id = vendor;
 
+	if (priv->manufacturer_id == TPM_VID_ATML &&
+		!(chip->flags & TPM_CHIP_FLAG_TPM2)) {
+		/* If TPM chip is 1.2 ATMEL chip, timeout need to be relaxed*/
+		chip->timeout_wait_stat_min = TPM_ATML_TIMEOUT_WAIT_STAT_MIN;
+		chip->timeout_wait_stat_max = TPM_ATML_TIMEOUT_WAIT_STAT_MAX;
+	}
+
 	rc = tpm_tis_read8(priv, TPM_RID(0), &rid);
 	if (rc < 0)
 		goto out_err;
diff --git a/include/linux/tpm.h b/include/linux/tpm.h
index aa11fe323c56..171b9102c976 100644
--- a/include/linux/tpm.h
+++ b/include/linux/tpm.h
@@ -150,6 +150,8 @@ struct tpm_chip {
 	bool timeout_adjusted;
 	unsigned long duration[TPM_NUM_DURATIONS]; /* jiffies */
 	bool duration_adjusted;
+	unsigned int timeout_wait_stat_min; /* usecs */
+	unsigned int timeout_wait_stat_max; /* usecs */
 
 	struct dentry *bios_dir[TPM_NUM_EVENT_LOG_FILES];
 
@@ -269,6 +271,7 @@ enum tpm2_cc_attrs {
 #define TPM_VID_INTEL    0x8086
 #define TPM_VID_WINBOND  0x1050
 #define TPM_VID_STM      0x104A
+#define TPM_VID_ATML     0x1114
 
 enum tpm_chip_flags {
 	TPM_CHIP_FLAG_TPM2		= BIT(1),
-- 
2.29.0.vfs.0.0


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* Re: [PATCH] tpm: fix ATMEL TPM crash caused by too frequent queries
  2021-07-02 19:16                 ` Hao Wu
@ 2021-07-05  5:19                   ` Jarkko Sakkinen
  2021-07-05  5:29                     ` Hao Wu
  0 siblings, 1 reply; 47+ messages in thread
From: Jarkko Sakkinen @ 2021-07-05  5:19 UTC (permalink / raw)
  To: Hao Wu
  Cc: Shrihari Kalkar, Seungyeop Han, Anish Jhaveri, peterhuewe, jgg,
	linux-integrity, Paul Menzel, Ken Goldman, zohar, why2jjj.linux,
	Hamza Attak, gregkh, arnd, Nayna, James Bottomley

On Fri, Jul 02, 2021 at 12:16:12PM -0700, Hao Wu wrote:
> 
> > On Jul 2, 2021, at 4:57 AM, Jarkko Sakkinen <jarkko@kernel.org> wrote:
> > 
> > On Fri, Jul 02, 2021 at 11:42:39AM +0300, Jarkko Sakkinen wrote:
> >> On Fri, Jul 02, 2021 at 12:59:18AM -0700, Hao Wu wrote:
> >>>> On Jul 2, 2021, at 12:45 AM, Jarkko Sakkinen <jarkko@kernel.org> wrote:
> >>>> 
> >>>> On Fri, Jul 02, 2021 at 12:33:15AM -0700, Hao Wu wrote:
> >>>>> 
> >>>>> 
> >>>>>> On Jul 1, 2021, at 11:35 PM, Jarkko Sakkinen <jarkko@kernel.org> wrote:
> >>>>>> 
> >>>>>> On Tue, Jun 29, 2021 at 09:22:05PM -0700, Hao Wu wrote:
> >>>>>>> This is a fix for the ATMEL TPM crash bug reported in
> >>>>>>> https://patchwork.kernel.org/project/linux-integrity/patch/20200926223150.109645-1-hao.wu@rubrik.com/
> >>>>>>> 
> >>>>>>> According to the discussions in the original thread,
> >>>>>>> we don't want to revert the timeout of wait_for_tpm_stat
> >>>>>>> for non-ATMEL chips, which brings back the performance cost.
> >>>>>>> For investigation and analysis of why wait_for_tpm_stat
> >>>>>>> caused the issue, and how the regression was introduced,
> >>>>>>> please read the original thread above.
> >>>>>>> 
> >>>>>>> Thus the proposed fix here is to only revert the timeout
> >>>>>>> for ATMEL chips by checking the vendor ID.
> >>>>>>> 
> >>>>>>> Signed-off-by: Hao Wu <hao.wu@rubrik.com>
> >>>>>>> Fixes: 9f3fc7bcddcb ("tpm: replace msleep() with usleep_range() in TPM 1.2/2.0 generic drivers")
> >>>>>> 
> >>>>>> Fixes tag should be before SOB.
> >>>>>> 
> >>>>>>> ---
> >>>>>>> Test Plan:
> >>>>>>> - Run fixed kernel with ATMEL TPM chips and see crash
> >>>>>>> has been fixed.
> >>>>>>> - Run fixed kernel with non-ATMEL TPM chips, and confirm
> >>>>>>> the timeout has not been changed.
> >>>>>>> 
> >>>>>>> drivers/char/tpm/tpm.h          |  9 ++++++++-
> >>>>>>> drivers/char/tpm/tpm_tis_core.c | 19 +++++++++++++++++--
> >>>>>>> include/linux/tpm.h             |  2 ++
> >>>>>>> 3 files changed, 27 insertions(+), 3 deletions(-)
> >>>>>>> 
> >>>>>>> diff --git a/drivers/char/tpm/tpm.h b/drivers/char/tpm/tpm.h
> >>>>>>> index 283f78211c3a..bc6aa7f9e119 100644
> >>>>>>> --- a/drivers/char/tpm/tpm.h
> >>>>>>> +++ b/drivers/char/tpm/tpm.h
> >>>>>>> @@ -42,7 +42,9 @@ enum tpm_timeout {
> >>>>>>> 	TPM_TIMEOUT_RANGE_US = 300,	/* usecs */
> >>>>>>> 	TPM_TIMEOUT_POLL = 1,	/* msecs */
> >>>>>>> 	TPM_TIMEOUT_USECS_MIN = 100,      /* usecs */
> >>>>>>> -	TPM_TIMEOUT_USECS_MAX = 500      /* usecs */
> >>>>>>> +	TPM_TIMEOUT_USECS_MAX = 500,	/* usecs */
> >>>>>> 
> >>>>>> What is this change?
> >>>>> Need to add the tailing comma
> >>>>> 
> >>>>>> 
> >>>>>>> +	TPM_TIMEOUT_WAIT_STAT = 500,	/* usecs */
> >>>>>>> +	TPM_ATML_TIMEOUT_WAIT_STAT = 15000	/* usecs */
> >>>>>>> };
> >>>>>>> 
> >>>>>>> /* TPM addresses */
> >>>>>>> @@ -189,6 +191,11 @@ static inline void tpm_msleep(unsigned int delay_msec)
> >>>>>>> 		     delay_msec * 1000);
> >>>>>>> };
> >>>>>>> 
> >>>>>>> +static inline void tpm_usleep(unsigned int delay_usec)
> >>>>>>> +{
> >>>>>>> +	usleep_range(delay_usec - TPM_TIMEOUT_RANGE_US, delay_usec);
> >>>>>>> +};
> >>>>>> 
> >>>>>> Please remove this, and open code.
> >>>>> Ok, will do
> >>>>> 
> >>>>>>> +
> >>>>>>> int tpm_chip_start(struct tpm_chip *chip);
> >>>>>>> void tpm_chip_stop(struct tpm_chip *chip);
> >>>>>>> struct tpm_chip *tpm_find_get_ops(struct tpm_chip *chip);
> >>>>>>> diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
> >>>>>>> index 55b9d3965ae1..9ddd4edfe1c2 100644
> >>>>>>> --- a/drivers/char/tpm/tpm_tis_core.c
> >>>>>>> +++ b/drivers/char/tpm/tpm_tis_core.c
> >>>>>>> @@ -80,8 +80,12 @@ static int wait_for_tpm_stat(struct tpm_chip *chip, u8 mask,
> >>>>>>> 		}
> >>>>>>> 	} else {
> >>>>>>> 		do {
> >>>>>>> -			usleep_range(TPM_TIMEOUT_USECS_MIN,
> >>>>>>> -				     TPM_TIMEOUT_USECS_MAX);
> >>>>>>> +			if (chip->timeout_wait_stat && 
> >>>>>>> +				chip->timeout_wait_stat >= TPM_TIMEOUT_WAIT_STAT) {
> >>>>>>> +				tpm_usleep((unsigned int)(chip->timeout_wait_stat));
> >>>>>>> +			} else {
> >>>>>>> +				tpm_usleep((unsigned int)(TPM_TIMEOUT_WAIT_STAT));
> >>>>>>> +			}
> >>>>>> 
> >>>>>> Invalid use of braces. Please read
> >>>>>> 
> >>>>>> https://www.kernel.org/doc/html/v5.13/process/coding-style.html
> >>>>>> 
> >>>>>> Why do you have to use this field conditionally anyway? Why doesn't
> >>>>>> it always contain a legit value?
> >>>>> The field is legit now, but doesn’t hurt to do addition check for robustness 
> >>>>> to ensure no crash ? Just in case the value is updated below TPM_TIMEOUT_WAIT_STAT ? 
> >>>>> 
> >>>>> Can remove if we think it is not needed.
> >>>> 
> >>>> A simple question: why you use it conditionally? Can the field contain invalid value?
> >>>> 
> >>> There are two checks
> >>> - chip->timeout_wait_stat >= TPM_TIMEOUT_WAIT_STAT
> >>> It could be invalid when future developer set it to some value less than `TPM_TIMEOUT_USECS_MIN`,
> >>> and crash the usleep 
> >> 
> >> I don't understand this. Why you don't set to appropriate value?
> Ok, fair enough, I assume developers will test it anyway to ensure no crash. Will remove this check.
> 
> > What you should do, is to define two fields:
> > 
> > - tpm_timeout_min
> > - tpm_timeout_max
> > 
> > And initialize these to TPM_TIMEOUT_USECS_MIN and TPM_TIMEOUT_USECS_MAX.
> > 
> > Then fixup those for Atmel (with a simple if-statement, switch-case is
> > overkill).
> Switch was more for extensibility when other vendor has similar issue,
> but we can refactor when needed in the future. I can use if-statement for now.

Make things more fancy *only* when you actually need more fancy.

> > The way you work out things right now is broken:
> > 
> > 1. Before for non-Atmel: usleep_range(100, 500)
> > 2. After for non-Atmel: usleep_range(200, 500)
> I realized this in day-1, I think this range change does not matter much.

By saying that you are actually saying that *undocumented* semantic changes
to the kernel code are fine as long as they don't change things "too much"

Are you serious about this?

> `TPM_TIMEOUT_RANGE_US=300` is already used in the codebase, I assume people define
> such if for general use cases for usleep_range in TPM
> But we can add two fields if that makes us more comfortable to strictly follow the current code
> semantically.

This has absolutely nothing to do with "comfortable". It's black and white
wrong.

/Jarkko

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH] tpm: fix ATMEL TPM crash caused by too frequent queries
  2021-07-05  5:19                   ` Jarkko Sakkinen
@ 2021-07-05  5:29                     ` Hao Wu
  0 siblings, 0 replies; 47+ messages in thread
From: Hao Wu @ 2021-07-05  5:29 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Shrihari Kalkar, Seungyeop Han, Anish Jhaveri, peterhuewe, jgg,
	linux-integrity, Paul Menzel, Ken Goldman, zohar, why2jjj.linux,
	Hamza Attak, gregkh, arnd, Nayna, James Bottomley

> On Jul 4, 2021, at 10:19 PM, Jarkko Sakkinen <jarkko@kernel.org> wrote:
> 
> On Fri, Jul 02, 2021 at 12:16:12PM -0700, Hao Wu wrote:
>> 
>>> On Jul 2, 2021, at 4:57 AM, Jarkko Sakkinen <jarkko@kernel.org> wrote:
>>> 
>>> On Fri, Jul 02, 2021 at 11:42:39AM +0300, Jarkko Sakkinen wrote:
>>>> On Fri, Jul 02, 2021 at 12:59:18AM -0700, Hao Wu wrote:
>>>>>> On Jul 2, 2021, at 12:45 AM, Jarkko Sakkinen <jarkko@kernel.org> wrote:
>>>>>> 
>>>>>> On Fri, Jul 02, 2021 at 12:33:15AM -0700, Hao Wu wrote:
>>>>>>> 
>>>>>>> 
>>>>>>>> On Jul 1, 2021, at 11:35 PM, Jarkko Sakkinen <jarkko@kernel.org> wrote:
>>>>>>>> 
>>>>>>>> On Tue, Jun 29, 2021 at 09:22:05PM -0700, Hao Wu wrote:
>>>>>>>>> This is a fix for the ATMEL TPM crash bug reported in
>>>>>>>>> https://patchwork.kernel.org/project/linux-integrity/patch/20200926223150.109645-1-hao.wu@rubrik.com/
>>>>>>>>> 
>>>>>>>>> According to the discussions in the original thread,
>>>>>>>>> we don't want to revert the timeout of wait_for_tpm_stat
>>>>>>>>> for non-ATMEL chips, which brings back the performance cost.
>>>>>>>>> For investigation and analysis of why wait_for_tpm_stat
>>>>>>>>> caused the issue, and how the regression was introduced,
>>>>>>>>> please read the original thread above.
>>>>>>>>> 
>>>>>>>>> Thus the proposed fix here is to only revert the timeout
>>>>>>>>> for ATMEL chips by checking the vendor ID.
>>>>>>>>> 
>>>>>>>>> Signed-off-by: Hao Wu <hao.wu@rubrik.com>
>>>>>>>>> Fixes: 9f3fc7bcddcb ("tpm: replace msleep() with usleep_range() in TPM 1.2/2.0 generic drivers")
>>>>>>>> 
>>>>>>>> Fixes tag should be before SOB.
>>>>>>>> 
>>>>>>>>> ---
>>>>>>>>> Test Plan:
>>>>>>>>> - Run fixed kernel with ATMEL TPM chips and see crash
>>>>>>>>> has been fixed.
>>>>>>>>> - Run fixed kernel with non-ATMEL TPM chips, and confirm
>>>>>>>>> the timeout has not been changed.
>>>>>>>>> 
>>>>>>>>> drivers/char/tpm/tpm.h          |  9 ++++++++-
>>>>>>>>> drivers/char/tpm/tpm_tis_core.c | 19 +++++++++++++++++--
>>>>>>>>> include/linux/tpm.h             |  2 ++
>>>>>>>>> 3 files changed, 27 insertions(+), 3 deletions(-)
>>>>>>>>> 
>>>>>>>>> diff --git a/drivers/char/tpm/tpm.h b/drivers/char/tpm/tpm.h
>>>>>>>>> index 283f78211c3a..bc6aa7f9e119 100644
>>>>>>>>> --- a/drivers/char/tpm/tpm.h
>>>>>>>>> +++ b/drivers/char/tpm/tpm.h
>>>>>>>>> @@ -42,7 +42,9 @@ enum tpm_timeout {
>>>>>>>>> 	TPM_TIMEOUT_RANGE_US = 300,	/* usecs */
>>>>>>>>> 	TPM_TIMEOUT_POLL = 1,	/* msecs */
>>>>>>>>> 	TPM_TIMEOUT_USECS_MIN = 100,      /* usecs */
>>>>>>>>> -	TPM_TIMEOUT_USECS_MAX = 500      /* usecs */
>>>>>>>>> +	TPM_TIMEOUT_USECS_MAX = 500,	/* usecs */
>>>>>>>> 
>>>>>>>> What is this change?
>>>>>>> Need to add the tailing comma
>>>>>>> 
>>>>>>>> 
>>>>>>>>> +	TPM_TIMEOUT_WAIT_STAT = 500,	/* usecs */
>>>>>>>>> +	TPM_ATML_TIMEOUT_WAIT_STAT = 15000	/* usecs */
>>>>>>>>> };
>>>>>>>>> 
>>>>>>>>> /* TPM addresses */
>>>>>>>>> @@ -189,6 +191,11 @@ static inline void tpm_msleep(unsigned int delay_msec)
>>>>>>>>> 		     delay_msec * 1000);
>>>>>>>>> };
>>>>>>>>> 
>>>>>>>>> +static inline void tpm_usleep(unsigned int delay_usec)
>>>>>>>>> +{
>>>>>>>>> +	usleep_range(delay_usec - TPM_TIMEOUT_RANGE_US, delay_usec);
>>>>>>>>> +};
>>>>>>>> 
>>>>>>>> Please remove this, and open code.
>>>>>>> Ok, will do
>>>>>>> 
>>>>>>>>> +
>>>>>>>>> int tpm_chip_start(struct tpm_chip *chip);
>>>>>>>>> void tpm_chip_stop(struct tpm_chip *chip);
>>>>>>>>> struct tpm_chip *tpm_find_get_ops(struct tpm_chip *chip);
>>>>>>>>> diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
>>>>>>>>> index 55b9d3965ae1..9ddd4edfe1c2 100644
>>>>>>>>> --- a/drivers/char/tpm/tpm_tis_core.c
>>>>>>>>> +++ b/drivers/char/tpm/tpm_tis_core.c
>>>>>>>>> @@ -80,8 +80,12 @@ static int wait_for_tpm_stat(struct tpm_chip *chip, u8 mask,
>>>>>>>>> 		}
>>>>>>>>> 	} else {
>>>>>>>>> 		do {
>>>>>>>>> -			usleep_range(TPM_TIMEOUT_USECS_MIN,
>>>>>>>>> -				     TPM_TIMEOUT_USECS_MAX);
>>>>>>>>> +			if (chip->timeout_wait_stat && 
>>>>>>>>> +				chip->timeout_wait_stat >= TPM_TIMEOUT_WAIT_STAT) {
>>>>>>>>> +				tpm_usleep((unsigned int)(chip->timeout_wait_stat));
>>>>>>>>> +			} else {
>>>>>>>>> +				tpm_usleep((unsigned int)(TPM_TIMEOUT_WAIT_STAT));
>>>>>>>>> +			}
>>>>>>>> 
>>>>>>>> Invalid use of braces. Please read
>>>>>>>> 
>>>>>>>> https://www.kernel.org/doc/html/v5.13/process/coding-style.html
>>>>>>>> 
>>>>>>>> Why do you have to use this field conditionally anyway? Why doesn't
>>>>>>>> it always contain a legit value?
>>>>>>> The field is legit now, but doesn’t hurt to do addition check for robustness 
>>>>>>> to ensure no crash ? Just in case the value is updated below TPM_TIMEOUT_WAIT_STAT ? 
>>>>>>> 
>>>>>>> Can remove if we think it is not needed.
>>>>>> 
>>>>>> A simple question: why you use it conditionally? Can the field contain invalid value?
>>>>>> 
>>>>> There are two checks
>>>>> - chip->timeout_wait_stat >= TPM_TIMEOUT_WAIT_STAT
>>>>> It could be invalid when future developer set it to some value less than `TPM_TIMEOUT_USECS_MIN`,
>>>>> and crash the usleep 
>>>> 
>>>> I don't understand this. Why you don't set to appropriate value?
>> Ok, fair enough, I assume developers will test it anyway to ensure no crash. Will remove this check.
>> 
>>> What you should do, is to define two fields:
>>> 
>>> - tpm_timeout_min
>>> - tpm_timeout_max
>>> 
>>> And initialize these to TPM_TIMEOUT_USECS_MIN and TPM_TIMEOUT_USECS_MAX.
>>> 
>>> Then fixup those for Atmel (with a simple if-statement, switch-case is
>>> overkill).
>> Switch was more for extensibility when other vendor has similar issue,
>> but we can refactor when needed in the future. I can use if-statement for now.
> 
> Make things more fancy *only* when you actually need more fancy.
> 
>>> The way you work out things right now is broken:
>>> 
>>> 1. Before for non-Atmel: usleep_range(100, 500)
>>> 2. After for non-Atmel: usleep_range(200, 500)
>> I realized this in day-1, I think this range change does not matter much.
> 
> By saying that you are actually saying that *undocumented* semantic changes
> to the kernel code are fine as long as they don't change things "too much"
> 
> Are you serious about this?
Fair enough, I agree that keeping things as it avoid potential issues. Thanks for pointing this out!
> 
>> `TPM_TIMEOUT_RANGE_US=300` is already used in the codebase, I assume people define
>> such if for general use cases for usleep_range in TPM
>> But we can add two fields if that makes us more comfortable to strictly follow the current code
>> semantically.
> 
> This has absolutely nothing to do with "comfortable". It's black and white
> wrong.
> 
> /Jarkko

I believe the comments are addressed in 
https://patchwork.kernel.org/project/linux-integrity/patch/20210704000754.1384-1-hao.wu@rubrik.com/

Have tested it with ATMEL 1.2 chip. 

Thanks
Hao

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH] tpm: fix ATMEL TPM crash caused by too frequent queries
  2021-07-04  0:07     ` Hao Wu
@ 2021-07-05  7:15       ` Jarkko Sakkinen
  2021-07-05 23:09         ` Hao Wu
  0 siblings, 1 reply; 47+ messages in thread
From: Jarkko Sakkinen @ 2021-07-05  7:15 UTC (permalink / raw)
  To: Hao Wu
  Cc: shrihari.kalkar, seungyeop.han, anish.jhaveri, peterhuewe, jgg,
	linux-integrity, pmenzel, kgold, zohar, why2jjj.linux, hamza,
	gregkh, arnd, nayna, James.Bottomley

Is this really the first version? Please, use git-format-patch -vX.

On Sat, Jul 03, 2021 at 05:07:54PM -0700, Hao Wu wrote:
> This is a fix for the ATMEL TPM crash bug reported in
> https://patchwork.kernel.org/project/linux-integrity/patch/20200926223150.109645-1-hao.wu@rubrik.com/
> 
> According to the discussions in the original thread,
> we don't want to revert the timeout of wait_for_tpm_stat
> for non-ATMEL chips, which brings back the performance cost.
> For investigation and analysis of why wait_for_tpm_stat
> caused the issue, and how the regression was introduced,
> please read the original thread above.

Please, no xrefs. Instead, describe what you are doing.

> Thus the proposed fix here is to only revert the timeout
> for ATMEL chips by checking the vendor ID.

What do you mean by reverting?

The long description needs a full rewrite.

You can add

Link: https://patchwork.kernel.org/project/linux-integrity/patch/20200926223150.109645-1-hao.wu@rubrik.com/

But do not expect anyone to read the thread in order to
understand what the commit is doing.

> Fixes: 9f3fc7bcddcb ("tpm: replace msleep() with usleep_range() in TPM 1.2/2.0 generic drivers")
> Signed-off-by: Hao Wu <hao.wu@rubrik.com>
> ---
> Test Plan:
> - Run fixed kernel with ATMEL TPM chips and see crash
> has been fixed.
> - Run fixed kernel with non-ATMEL TPM chips, and confirm
> the timeout has not been changed.

The changelog is missing.

Please read section 14 of

https://www.kernel.org/doc/html/v4.17/process/submitting-patches.html

/Jarkko

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH] tpm: fix ATMEL TPM crash caused by too frequent queries
  2021-07-05  7:15       ` Jarkko Sakkinen
@ 2021-07-05 23:09         ` Hao Wu
  2021-07-06 12:34           ` Mimi Zohar
  0 siblings, 1 reply; 47+ messages in thread
From: Hao Wu @ 2021-07-05 23:09 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Shrihari Kalkar, Seungyeop Han, Anish Jhaveri, peterhuewe, jgg,
	linux-integrity, Paul Menzel, Ken Goldman, zohar, why2jjj.linux,
	Hamza Attak, gregkh, arnd, Nayna, James.Bottomley

> On Jul 5, 2021, at 12:15 AM, Jarkko Sakkinen <jarkko@kernel.org> wrote:
> 
> Is this really the first version? Please, use git-format-patch -vX.
Got it, will re-send the patch with `[PATCH v2]`. 
Thanks for bearing my mistakes, I am not quite familiar with the workflow yet.

> On Sat, Jul 03, 2021 at 05:07:54PM -0700, Hao Wu wrote:
>> This is a fix for the ATMEL TPM crash bug reported in
>> https://patchwork.kernel.org/project/linux-integrity/patch/20200926223150.109645-1-hao.wu@rubrik.com/
>> 
>> According to the discussions in the original thread,
>> we don't want to revert the timeout of wait_for_tpm_stat
>> for non-ATMEL chips, which brings back the performance cost.
>> For investigation and analysis of why wait_for_tpm_stat
>> caused the issue, and how the regression was introduced,
>> please read the original thread above.
> 
> Please, no xrefs. Instead, describe what you are doing.
Ok, will rewrite the commit message
> 
>> Thus the proposed fix here is to only revert the timeout
>> for ATMEL chips by checking the vendor ID.
> 
> What do you mean by reverting?
> 
> The long description needs a full rewrite.
> 
> You can add
> 
> Link: https://patchwork.kernel.org/project/linux-integrity/patch/20200926223150.109645-1-hao.wu@rubrik.com/
> 
> But do not expect anyone to read the thread in order to
> understand what the commit is doing.
> 
>> Fixes: 9f3fc7bcddcb ("tpm: replace msleep() with usleep_range() in TPM 1.2/2.0 generic drivers")
>> Signed-off-by: Hao Wu <hao.wu@rubrik.com>
>> ---
>> Test Plan:
>> - Run fixed kernel with ATMEL TPM chips and see crash
>> has been fixed.
>> - Run fixed kernel with non-ATMEL TPM chips, and confirm
>> the timeout has not been changed.
> 
> The changelog is missing.
Sorry, I don’t get your point here. Could you help clarify it a bit.
I did follow the section 14, but I didn’t see anything specifically needed after `---`
Could you be specific ? Thanks for your time. 

> Please read section 14 of
> 
> https://www.kernel.org/doc/html/v4.17/process/submitting-patches.html
> 
> /Jarkko
Hao

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH] tpm: fix ATMEL TPM crash caused by too frequent queries
  2021-07-05 23:09         ` Hao Wu
@ 2021-07-06 12:34           ` Mimi Zohar
  2021-07-07  4:18             ` Hao Wu
  0 siblings, 1 reply; 47+ messages in thread
From: Mimi Zohar @ 2021-07-06 12:34 UTC (permalink / raw)
  To: Hao Wu, Jarkko Sakkinen
  Cc: Shrihari Kalkar, Seungyeop Han, Anish Jhaveri, peterhuewe, jgg,
	linux-integrity, Paul Menzel, Ken Goldman, zohar, why2jjj.linux,
	Hamza Attak, gregkh, arnd, Nayna, James.Bottomley

On Mon, 2021-07-05 at 16:09 -0700, Hao Wu wrote:
> > On Jul 5, 2021, at 12:15 AM, Jarkko Sakkinen <jarkko@kernel.org> wrote:
> > 
> 
> >> Signed-off-by: Hao Wu <hao.wu@rubrik.com>
> >> ---
> >> Test Plan:
> >> - Run fixed kernel with ATMEL TPM chips and see crash
> >> has been fixed.
> >> - Run fixed kernel with non-ATMEL TPM chips, and confirm
> >> the timeout has not been changed.
> > 
> > The changelog is missing.
> Sorry, I don’t get your point here. Could you help clarify it a bit.
> I did follow the section 14, but I didn’t see anything specifically needed after `---`
> Could you be specific ? Thanks for your time. 

The changes from one version of a patch, or patch set, to the next has
moved around a bit.  Some people put it in the cover letter, others put
it on the individual patches.   It's also moved from within the patch
description to after the dashes.  Documentation/process/submitting-
patches.rst provides an example, but leaves out the word "Changelog".  
For an explanation of what goes into the patch description versus the
Changelog, search for "Other comments relevant only to the moment or
the maintainer".

For example, this version of the patch limits increasing the delay just
for Atmel TPM 1.2 chips.  At minimum it would be included in the
Changelog, but more likely included in the patch description itself and
perhaps even in the Subject line.

thanks,

Mimi

> 
> > Please read section 14 of
> > 
> > https://www.kernel.org/doc/html/v4.17/process/submitting-patches.html
> > 
> > /Jarkko
> Hao



^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH] tpm: fix ATMEL TPM crash caused by too frequent queries
  2021-07-06 12:34           ` Mimi Zohar
@ 2021-07-07  4:18             ` Hao Wu
  2021-07-07  4:34               ` Hao Wu
  0 siblings, 1 reply; 47+ messages in thread
From: Hao Wu @ 2021-07-07  4:18 UTC (permalink / raw)
  To: Mimi Zohar
  Cc: Jarkko Sakkinen, Shrihari Kalkar, Seungyeop Han, Anish Jhaveri,
	peterhuewe, jgg, linux-integrity, Paul Menzel, Ken Goldman,
	zohar, why2jjj.linux, Hamza Attak, gregkh, arnd, Nayna,
	James.Bottomley

> On Jul 6, 2021, at 5:34 AM, Mimi Zohar <zohar@linux.ibm.com> wrote:
> 
> On Mon, 2021-07-05 at 16:09 -0700, Hao Wu wrote:
>>> On Jul 5, 2021, at 12:15 AM, Jarkko Sakkinen <jarkko@kernel.org> wrote:
>>> 
>> 
>>>> Signed-off-by: Hao Wu <hao.wu@rubrik.com>
>>>> ---
>>>> Test Plan:
>>>> - Run fixed kernel with ATMEL TPM chips and see crash
>>>> has been fixed.
>>>> - Run fixed kernel with non-ATMEL TPM chips, and confirm
>>>> the timeout has not been changed.
>>> 
>>> The changelog is missing.
>> Sorry, I don’t get your point here. Could you help clarify it a bit.
>> I did follow the section 14, but I didn’t see anything specifically needed after `---`
>> Could you be specific ? Thanks for your time. 
> 
> The changes from one version of a patch, or patch set, to the next has
> moved around a bit.  Some people put it in the cover letter, others put
> it on the individual patches.   It's also moved from within the patch
> description to after the dashes.  Documentation/process/submitting-
> patches.rst provides an example, but leaves out the word "Changelog".  
> For an explanation of what goes into the patch description versus the
> Changelog, search for "Other comments relevant only to the moment or
> the maintainer".
I see. That makes sense to me now. The term “changelog” appears multiple times
In the doc for different purpose, thus it was confusing to me. Here we are
referring to "patch changelog”. Will add it after `—`. Thanks for the clarification!

> 
> For example, this version of the patch limits increasing the delay just
> for Atmel TPM 1.2 chips.  At minimum it would be included in the
> Changelog, but more likely included in the patch description itself and
> perhaps even in the Subject line.
> 
> thanks,
> 
> Mimi
> 
>> 
>>> Please read section 14 of
>>> 
>>> https://www.kernel.org/doc/html/v4.17/process/submitting-patches.html
>>> 
>>> /Jarkko
>> Hao
> 
> 
Hao

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH v2] tpm: fix ATMEL TPM crash caused by too frequent queries
  2021-06-30  4:22   ` [PATCH] tpm: fix ATMEL " Hao Wu
  2021-07-02  6:35     ` Jarkko Sakkinen
  2021-07-04  0:07     ` Hao Wu
@ 2021-07-07  4:31     ` Hao Wu
  2021-07-07  9:24       ` Jarkko Sakkinen
  2021-07-09  4:40     ` [PATCH v2] tpm: fix Atmel " Hao Wu
  3 siblings, 1 reply; 47+ messages in thread
From: Hao Wu @ 2021-07-07  4:31 UTC (permalink / raw)
  To: hao.wu, shrihari.kalkar, seungyeop.han, anish.jhaveri,
	peterhuewe, jarkko, jgg, linux-integrity, pmenzel, kgold, zohar,
	why2jjj.linux, hamza, gregkh, arnd, nayna, James.Bottomley

Since kernel 4.14, there was a commit (9f3fc7bcddcb)
fixed the TPM sleep logic from msleep to usleep_range,
so that the TPM sleeps exactly with TPM_TIMEOUT (=5ms) afterward.
Before that fix, msleep(5) actually sleeps for around 15ms.

This timeout change caused the ATMEL 1.2 TPM chip crash,
and the patch here is to fix it in the master branch.
Crash signature is as follows:
```
$ tpm_sealdata -z
Tspi_Key_LoadKey failed: 0x00001087 - layer=tddl, code=0087 (135),
I/O error

$ sudo dmesg | grep tpm0
[59154.665549] tpm tpm0: tpm_try_transmit: send(): error -62
```

With at few more changes after 4.14, the timeout was
reduced to less than 1 ms today in the master branch.
- in 4.16 commit cf151a9a44d5 uses `TPM_POLL_SLEEP` instead of
  TPM_TIMEOUT for `wait_for_tpm_stat` and set `TPM_POLL_SLEEP` (1ms).
- in 4.18 commits 59f5a6b07f64 and 424eaf910c32 further
  reduced the timeout in wait_for_tpm_stat to less than 1ms.

This patch is to fix the TPM crash for ATMEL 1.2 TPM chip.
We specifically use 15ms timeout for the ATMEL 1.2 TPM chip
in wait_for_tpm_stat, but keep the timeout for other chips
unchanged. The 15ms timeout was the timeout
works for ATMEL 1.2 TPM chip before 4.14.

Fixes: 9f3fc7bcddcb ("tpm: replace msleep() with usleep_range() in TPM 1.2/2.0 generic drivers")
Link: https://patchwork.kernel.org/project/linux-integrity/patch/20200926223150.109645-1-hao.wu@rubrik.com/
Signed-off-by: Hao Wu <hao.wu@rubrik.com>
---
This version (v2) has following changes on top of the last (v1):
- follow the existing way to define two timeouts (min and max)
  for ATMEL chip, thus keep the exact timeout logic for 
  non-ATEML chips.
- limit the timeout increase to only ATMEL TPM 1.2 chips,
  because it is not an issue for TPM 2.0 chips yet.

Test Plan:
- Run fixed kernel with ATMEL TPM chips and see crash
has been fixed.
- Run fixed kernel with non-ATMEL TPM chips, and confirm
the timeout has not been changed.

 drivers/char/tpm/tpm.h          |  6 ++++--
 drivers/char/tpm/tpm_tis_core.c | 23 +++++++++++++++++++++--
 include/linux/tpm.h             |  3 +++
 3 files changed, 28 insertions(+), 4 deletions(-)

diff --git a/drivers/char/tpm/tpm.h b/drivers/char/tpm/tpm.h
index 283f78211c3a..6de1b44c4aab 100644
--- a/drivers/char/tpm/tpm.h
+++ b/drivers/char/tpm/tpm.h
@@ -41,8 +41,10 @@ enum tpm_timeout {
 	TPM_TIMEOUT_RETRY = 100, /* msecs */
 	TPM_TIMEOUT_RANGE_US = 300,	/* usecs */
 	TPM_TIMEOUT_POLL = 1,	/* msecs */
-	TPM_TIMEOUT_USECS_MIN = 100,      /* usecs */
-	TPM_TIMEOUT_USECS_MAX = 500      /* usecs */
+	TPM_TIMEOUT_USECS_MIN = 100,	/* usecs */
+	TPM_TIMEOUT_USECS_MAX = 500,	/* usecs */
+	TPM_ATML_TIMEOUT_WAIT_STAT_MIN = 14700,	/* usecs */
+	TPM_ATML_TIMEOUT_WAIT_STAT_MAX = 15000	/* usecs */
 };
 
 /* TPM addresses */
diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
index 55b9d3965ae1..ae27d66fdd94 100644
--- a/drivers/char/tpm/tpm_tis_core.c
+++ b/drivers/char/tpm/tpm_tis_core.c
@@ -80,8 +80,17 @@ static int wait_for_tpm_stat(struct tpm_chip *chip, u8 mask,
 		}
 	} else {
 		do {
-			usleep_range(TPM_TIMEOUT_USECS_MIN,
-				     TPM_TIMEOUT_USECS_MAX);
+			/* this code path could be executed before
+			 * timeouts initialized in chip instance.
+			 */
+			if (chip->timeout_wait_stat_min &&
+			    chip->timeout_wait_stat_max)
+				usleep_range(chip->timeout_wait_stat_min,
+					     chip->timeout_wait_stat_max);
+			else
+				usleep_range(TPM_TIMEOUT_USECS_MIN,
+					     TPM_TIMEOUT_USECS_MAX);
+
 			status = chip->ops->status(chip);
 			if ((status & mask) == mask)
 				return 0;
@@ -934,6 +943,9 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
 	chip->timeout_b = msecs_to_jiffies(TIS_TIMEOUT_B_MAX);
 	chip->timeout_c = msecs_to_jiffies(TIS_TIMEOUT_C_MAX);
 	chip->timeout_d = msecs_to_jiffies(TIS_TIMEOUT_D_MAX);
+	/* init timeouts for wait_for_tpm_stat */
+	chip->timeout_wait_stat_min = TPM_TIMEOUT_USECS_MIN;
+	chip->timeout_wait_stat_max = TPM_TIMEOUT_USECS_MAX;
 	priv->phy_ops = phy_ops;
 	dev_set_drvdata(&chip->dev, priv);
 
@@ -983,6 +995,13 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
 
 	priv->manufacturer_id = vendor;
 
+	if (priv->manufacturer_id == TPM_VID_ATML &&
+		!(chip->flags & TPM_CHIP_FLAG_TPM2)) {
+		/* If TPM chip is 1.2 ATMEL chip, timeout need to be relaxed*/
+		chip->timeout_wait_stat_min = TPM_ATML_TIMEOUT_WAIT_STAT_MIN;
+		chip->timeout_wait_stat_max = TPM_ATML_TIMEOUT_WAIT_STAT_MAX;
+	}
+
 	rc = tpm_tis_read8(priv, TPM_RID(0), &rid);
 	if (rc < 0)
 		goto out_err;
diff --git a/include/linux/tpm.h b/include/linux/tpm.h
index aa11fe323c56..171b9102c976 100644
--- a/include/linux/tpm.h
+++ b/include/linux/tpm.h
@@ -150,6 +150,8 @@ struct tpm_chip {
 	bool timeout_adjusted;
 	unsigned long duration[TPM_NUM_DURATIONS]; /* jiffies */
 	bool duration_adjusted;
+	unsigned int timeout_wait_stat_min; /* usecs */
+	unsigned int timeout_wait_stat_max; /* usecs */
 
 	struct dentry *bios_dir[TPM_NUM_EVENT_LOG_FILES];
 
@@ -269,6 +271,7 @@ enum tpm2_cc_attrs {
 #define TPM_VID_INTEL    0x8086
 #define TPM_VID_WINBOND  0x1050
 #define TPM_VID_STM      0x104A
+#define TPM_VID_ATML     0x1114
 
 enum tpm_chip_flags {
 	TPM_CHIP_FLAG_TPM2		= BIT(1),
-- 
2.29.0.vfs.0.0


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* Re: [PATCH] tpm: fix ATMEL TPM crash caused by too frequent queries
  2021-07-07  4:18             ` Hao Wu
@ 2021-07-07  4:34               ` Hao Wu
  0 siblings, 0 replies; 47+ messages in thread
From: Hao Wu @ 2021-07-07  4:34 UTC (permalink / raw)
  To: Mimi Zohar, Jarkko Sakkinen
  Cc: Shrihari Kalkar, Seungyeop Han, Anish Jhaveri, peterhuewe, jgg,
	linux-integrity, Paul Menzel, Ken Goldman, zohar, why2jjj.linux,
	Hamza Attak, gregkh, arnd, Nayna, James Bottomley

> On Jul 6, 2021, at 9:18 PM, Hao Wu <hao.wu@rubrik.com> wrote:
> 
>> On Jul 6, 2021, at 5:34 AM, Mimi Zohar <zohar@linux.ibm.com> wrote:
>> 
>> On Mon, 2021-07-05 at 16:09 -0700, Hao Wu wrote:
>>>> On Jul 5, 2021, at 12:15 AM, Jarkko Sakkinen <jarkko@kernel.org> wrote:
>>>> 
>>> 
>>>>> Signed-off-by: Hao Wu <hao.wu@rubrik.com>
>>>>> ---
>>>>> Test Plan:
>>>>> - Run fixed kernel with ATMEL TPM chips and see crash
>>>>> has been fixed.
>>>>> - Run fixed kernel with non-ATMEL TPM chips, and confirm
>>>>> the timeout has not been changed.
>>>> 
>>>> The changelog is missing.
>>> Sorry, I don’t get your point here. Could you help clarify it a bit.
>>> I did follow the section 14, but I didn’t see anything specifically needed after `---`
>>> Could you be specific ? Thanks for your time. 
>> 
>> The changes from one version of a patch, or patch set, to the next has
>> moved around a bit.  Some people put it in the cover letter, others put
>> it on the individual patches.   It's also moved from within the patch
>> description to after the dashes.  Documentation/process/submitting-
>> patches.rst provides an example, but leaves out the word "Changelog".  
>> For an explanation of what goes into the patch description versus the
>> Changelog, search for "Other comments relevant only to the moment or
>> the maintainer".
> I see. That makes sense to me now. The term “changelog” appears multiple times
> In the doc for different purpose, thus it was confusing to me. Here we are
> referring to "patch changelog”. Will add it after `—`. Thanks for the clarification!
> 
>> 
>> For example, this version of the patch limits increasing the delay just
>> for Atmel TPM 1.2 chips.  At minimum it would be included in the
>> Changelog, but more likely included in the patch description itself and
>> perhaps even in the Subject line.
>> 
>> thanks,
>> 
>> Mimi
>> 
>>> 
>>>> Please read section 14 of
>>>> 
>>>> https://www.kernel.org/doc/html/v4.17/process/submitting-patches.html
>>>> 
>>>> /Jarkko
>>> Hao
>> 
>> 
> Hao

Updated in https://patchwork.kernel.org/project/linux-integrity/patch/20210707043135.33434-1-hao.wu@rubrik.com/
Hopefully this time it is qualified.

Hao

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2] tpm: fix ATMEL TPM crash caused by too frequent queries
  2021-07-07  4:31     ` [PATCH v2] " Hao Wu
@ 2021-07-07  9:24       ` Jarkko Sakkinen
  2021-07-07 18:28         ` Hao Wu
  0 siblings, 1 reply; 47+ messages in thread
From: Jarkko Sakkinen @ 2021-07-07  9:24 UTC (permalink / raw)
  To: Hao Wu
  Cc: shrihari.kalkar, seungyeop.han, anish.jhaveri, peterhuewe, jgg,
	linux-integrity, pmenzel, kgold, zohar, why2jjj.linux, hamza,
	gregkh, arnd, nayna, James.Bottomley

On Tue, Jul 06, 2021 at 09:31:35PM -0700, Hao Wu wrote:
> Since kernel 4.14, there was a commit (9f3fc7bcddcb)
> fixed the TPM sleep logic from msleep to usleep_range,
> so that the TPM sleeps exactly with TPM_TIMEOUT (=5ms) afterward.
> Before that fix, msleep(5) actually sleeps for around 15ms.

What is TPM sleep logic?

/Jarkko

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2] tpm: fix ATMEL TPM crash caused by too frequent queries
  2021-07-07  9:24       ` Jarkko Sakkinen
@ 2021-07-07 18:28         ` Hao Wu
  2021-07-07 21:10           ` Jarkko Sakkinen
  0 siblings, 1 reply; 47+ messages in thread
From: Hao Wu @ 2021-07-07 18:28 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Shrihari Kalkar, Seungyeop Han, Anish Jhaveri, peterhuewe, jgg,
	linux-integrity, Paul Menzel, Ken Goldman, zohar, why2jjj.linux,
	Hamza Attak, gregkh, arnd, Nayna, James.Bottomley

> On Jul 7, 2021, at 2:24 AM, Jarkko Sakkinen <jarkko@kernel.org> wrote:
> 
> On Tue, Jul 06, 2021 at 09:31:35PM -0700, Hao Wu wrote:
>> Since kernel 4.14, there was a commit (9f3fc7bcddcb)
>> fixed the TPM sleep logic from msleep to usleep_range,
>> so that the TPM sleeps exactly with TPM_TIMEOUT (=5ms) afterward.
>> Before that fix, msleep(5) actually sleeps for around 15ms.
> 
> What is TPM sleep logic?
It is about the commit metnioned in the description
`tpm: replace msleep() with usleep_range() in TPM 1.2/2.0 generic drivers`
https://github.com/torvalds/linux/commit/9f3fc7bcddcb51234e23494531f93ab60475e1c3

Any better description or terms ?

> 
> /Jarkko

Hao

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2] tpm: fix ATMEL TPM crash caused by too frequent queries
  2021-07-07 18:28         ` Hao Wu
@ 2021-07-07 21:10           ` Jarkko Sakkinen
  2021-07-09  4:43             ` Hao Wu
  0 siblings, 1 reply; 47+ messages in thread
From: Jarkko Sakkinen @ 2021-07-07 21:10 UTC (permalink / raw)
  To: Hao Wu
  Cc: Shrihari Kalkar, Seungyeop Han, Anish Jhaveri, peterhuewe, jgg,
	linux-integrity, Paul Menzel, Ken Goldman, zohar, why2jjj.linux,
	Hamza Attak, gregkh, arnd, Nayna, James.Bottomley

On Wed, Jul 07, 2021 at 11:28:35AM -0700, Hao Wu wrote:
> > On Jul 7, 2021, at 2:24 AM, Jarkko Sakkinen <jarkko@kernel.org> wrote:
> > 
> > On Tue, Jul 06, 2021 at 09:31:35PM -0700, Hao Wu wrote:
> >> Since kernel 4.14, there was a commit (9f3fc7bcddcb)

BTW, please remove this. You have a fixes tag.

> >> fixed the TPM sleep logic from msleep to usleep_range,
> >> so that the TPM sleeps exactly with TPM_TIMEOUT (=5ms) afterward.
> >> Before that fix, msleep(5) actually sleeps for around 15ms.
> > 
> > What is TPM sleep logic?
> It is about the commit metnioned in the description
> `tpm: replace msleep() with usleep_range() in TPM 1.2/2.0 generic drivers`
> https://github.com/torvalds/linux/commit/9f3fc7bcddcb51234e23494531f93ab60475e1c3

What you should do is to explain in simple terms the unwanted behaviour
that you are observing, and also, *when* you observe it. E.g. does this
happen when you use /dev/tpm0, or is it visible already in klog at boot
time. And also: does it occur for anything you put to /dev/tpm0, or is
the bug triggering for some particular TPM commands.

You also need to have information what kind of Atmel chip triggers the
bug. I'd presume that you have access to a machine with such chip?

When you get all that figured out, you should explain how you change
the existing behaviour, and why it makes sense. E.g. if you fixup
timeouts, please just tell how'd you end up choosing the values that
you picked.

E.g. the rationale for that could come from testing and finding the "sweet
spot", or perhaps the reason could be that old values worked, new ones
don't.

Especially in bug fixes the reasoning is *at least* as important as the
the code change itself because I need to know what is going on.

/Jarkko

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH v2] tpm: fix Atmel TPM crash caused by too frequent queries
  2021-06-30  4:22   ` [PATCH] tpm: fix ATMEL " Hao Wu
                       ` (2 preceding siblings ...)
  2021-07-07  4:31     ` [PATCH v2] " Hao Wu
@ 2021-07-09  4:40     ` Hao Wu
  2021-07-09 17:47       ` Jarkko Sakkinen
  2021-07-11  7:51       ` [PATCH v3] " Hao Wu
  3 siblings, 2 replies; 47+ messages in thread
From: Hao Wu @ 2021-07-09  4:40 UTC (permalink / raw)
  To: hao.wu, shrihari.kalkar, seungyeop.han, anish.jhaveri,
	peterhuewe, jarkko, jgg, linux-integrity, pmenzel, kgold, zohar,
	why2jjj.linux, hamza, gregkh, arnd, nayna, James.Bottomley

The Atmel TPM 1.2 chips crash with error
`tpm_try_transmit: send(): error -62` since kernel 4.14.
It is observed from the kernel log after running `tpm_sealdata -z`.
The error thrown from the command is as follows
```
$ tpm_sealdata -z
Tspi_Key_LoadKey failed: 0x00001087 - layer=tddl,
code=0087 (135), I/O error
```

The issue was reproduced with the following Atmel TPM chip:
```
$ tpm_version
T0  TPM 1.2 Version Info:
  Chip Version:        1.2.66.1
  Spec Level:          2
  Errata Revision:     3
  TPM Vendor ID:       ATML
  TPM Version:         01010000
  Manufacturer Info:   41544d4c
```

The root cause of the issue is due to the TPM calls to msleep()
were replaced with usleep_range() [1], which reduces
the actual timeout. Via experiments, it is observed that
the original msleep(5) actually sleeps for 15ms.
Because of a known timeout issue in Atmel TPM 1.2 chip,
the shorter timeout than 15ms can cause the error described above.

A few further changes in kernel 4.16 [2] and 4.18 [3, 4] further
reduced the timeout to less than 1ms. With experiments,
the problematic timeout in the latest kernel is the one
for `wait_for_tpm_stat`.

To fix it, the patch reverts the timeout of `wait_for_tpm_stat`
to 15ms for all Atmel TPM 1.2 chips, but leave it untouched
for Ateml TPM 2.0 chip, and chips from other vendors.
As explained above, the chosen 15ms timeout is
the actual timeout before this issue introduced,
thus the old value is used here.
Particularly, TPM_ATML_TIMEOUT_WAIT_STAT_MIN is set to 14700us,
TPM_ATML_TIMEOUT_WAIT_STAT_MIN is set to 15000us according to
the existing TPM_TIMEOUT_RANGE_US (300us).
The fixed has been tested in the system with the affected Atmel chip
with no issues observed after boot up.

References:
[1] 9f3fc7bcddcb tpm: replace msleep() with usleep_range() in TPM
1.2/2.0 generic drivers
[2] cf151a9a44d5 tpm: reduce tpm polling delay in tpm_tis_core
[3] 59f5a6b07f64 tpm: reduce poll sleep time in tpm_transmit()
[4] 424eaf910c32 tpm: reduce polling time to usecs for even finer
granularity

Fixes: 9f3fc7bcddcb ("tpm: replace msleep() with usleep_range() in TPM 1.2/2.0 generic drivers")
Link: https://patchwork.kernel.org/project/linux-integrity/patch/20200926223150.109645-1-hao.wu@rubrik.com/
Signed-off-by: Hao Wu <hao.wu@rubrik.com>
---
This version (v2) has following changes on top of the last (v1):
- follow the existing way to define two timeouts (min and max)
  for ATMEL chip, thus keep the exact timeout logic for 
  non-ATEML chips.
- limit the timeout increase to only ATMEL TPM 1.2 chips,
  because it is not an issue for TPM 2.0 chips yet.

Test Plan:
- Run fixed kernel with ATMEL TPM chips and see crash has been fixed.
- Run fixed kernel with non-ATMEL TPM chips, and confirm
  the timeout has not been changed.

 drivers/char/tpm/tpm.h          |  6 ++++--
 drivers/char/tpm/tpm_tis_core.c | 23 +++++++++++++++++++++--
 include/linux/tpm.h             |  3 +++
 3 files changed, 28 insertions(+), 4 deletions(-)

diff --git a/drivers/char/tpm/tpm.h b/drivers/char/tpm/tpm.h
index 283f78211c3a..6de1b44c4aab 100644
--- a/drivers/char/tpm/tpm.h
+++ b/drivers/char/tpm/tpm.h
@@ -41,8 +41,10 @@ enum tpm_timeout {
 	TPM_TIMEOUT_RETRY = 100, /* msecs */
 	TPM_TIMEOUT_RANGE_US = 300,	/* usecs */
 	TPM_TIMEOUT_POLL = 1,	/* msecs */
-	TPM_TIMEOUT_USECS_MIN = 100,      /* usecs */
-	TPM_TIMEOUT_USECS_MAX = 500      /* usecs */
+	TPM_TIMEOUT_USECS_MIN = 100,	/* usecs */
+	TPM_TIMEOUT_USECS_MAX = 500,	/* usecs */
+	TPM_ATML_TIMEOUT_WAIT_STAT_MIN = 14700,	/* usecs */
+	TPM_ATML_TIMEOUT_WAIT_STAT_MAX = 15000	/* usecs */
 };
 
 /* TPM addresses */
diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
index 55b9d3965ae1..ae27d66fdd94 100644
--- a/drivers/char/tpm/tpm_tis_core.c
+++ b/drivers/char/tpm/tpm_tis_core.c
@@ -80,8 +80,17 @@ static int wait_for_tpm_stat(struct tpm_chip *chip, u8 mask,
 		}
 	} else {
 		do {
-			usleep_range(TPM_TIMEOUT_USECS_MIN,
-				     TPM_TIMEOUT_USECS_MAX);
+			/* this code path could be executed before
+			 * timeouts initialized in chip instance.
+			 */
+			if (chip->timeout_wait_stat_min &&
+			    chip->timeout_wait_stat_max)
+				usleep_range(chip->timeout_wait_stat_min,
+					     chip->timeout_wait_stat_max);
+			else
+				usleep_range(TPM_TIMEOUT_USECS_MIN,
+					     TPM_TIMEOUT_USECS_MAX);
+
 			status = chip->ops->status(chip);
 			if ((status & mask) == mask)
 				return 0;
@@ -934,6 +943,9 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
 	chip->timeout_b = msecs_to_jiffies(TIS_TIMEOUT_B_MAX);
 	chip->timeout_c = msecs_to_jiffies(TIS_TIMEOUT_C_MAX);
 	chip->timeout_d = msecs_to_jiffies(TIS_TIMEOUT_D_MAX);
+	/* init timeouts for wait_for_tpm_stat */
+	chip->timeout_wait_stat_min = TPM_TIMEOUT_USECS_MIN;
+	chip->timeout_wait_stat_max = TPM_TIMEOUT_USECS_MAX;
 	priv->phy_ops = phy_ops;
 	dev_set_drvdata(&chip->dev, priv);
 
@@ -983,6 +995,13 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
 
 	priv->manufacturer_id = vendor;
 
+	if (priv->manufacturer_id == TPM_VID_ATML &&
+		!(chip->flags & TPM_CHIP_FLAG_TPM2)) {
+		/* If TPM chip is 1.2 ATMEL chip, timeout need to be relaxed*/
+		chip->timeout_wait_stat_min = TPM_ATML_TIMEOUT_WAIT_STAT_MIN;
+		chip->timeout_wait_stat_max = TPM_ATML_TIMEOUT_WAIT_STAT_MAX;
+	}
+
 	rc = tpm_tis_read8(priv, TPM_RID(0), &rid);
 	if (rc < 0)
 		goto out_err;
diff --git a/include/linux/tpm.h b/include/linux/tpm.h
index aa11fe323c56..171b9102c976 100644
--- a/include/linux/tpm.h
+++ b/include/linux/tpm.h
@@ -150,6 +150,8 @@ struct tpm_chip {
 	bool timeout_adjusted;
 	unsigned long duration[TPM_NUM_DURATIONS]; /* jiffies */
 	bool duration_adjusted;
+	unsigned int timeout_wait_stat_min; /* usecs */
+	unsigned int timeout_wait_stat_max; /* usecs */
 
 	struct dentry *bios_dir[TPM_NUM_EVENT_LOG_FILES];
 
@@ -269,6 +271,7 @@ enum tpm2_cc_attrs {
 #define TPM_VID_INTEL    0x8086
 #define TPM_VID_WINBOND  0x1050
 #define TPM_VID_STM      0x104A
+#define TPM_VID_ATML     0x1114
 
 enum tpm_chip_flags {
 	TPM_CHIP_FLAG_TPM2		= BIT(1),
-- 
2.29.0.vfs.0.0


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* Re: [PATCH v2] tpm: fix ATMEL TPM crash caused by too frequent queries
  2021-07-07 21:10           ` Jarkko Sakkinen
@ 2021-07-09  4:43             ` Hao Wu
  0 siblings, 0 replies; 47+ messages in thread
From: Hao Wu @ 2021-07-09  4:43 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Shrihari Kalkar, Seungyeop Han, Anish Jhaveri, peterhuewe, jgg,
	linux-integrity, Paul Menzel, Ken Goldman, zohar, why2jjj.linux,
	Hamza Attak, gregkh, arnd, Nayna, James.Bottomley

> On Jul 7, 2021, at 2:10 PM, Jarkko Sakkinen <jarkko@kernel.org> wrote:
> 
> On Wed, Jul 07, 2021 at 11:28:35AM -0700, Hao Wu wrote:
>>> On Jul 7, 2021, at 2:24 AM, Jarkko Sakkinen <jarkko@kernel.org> wrote:
>>> 
>>> On Tue, Jul 06, 2021 at 09:31:35PM -0700, Hao Wu wrote:
>>>> Since kernel 4.14, there was a commit (9f3fc7bcddcb)
> 
> BTW, please remove this. You have a fixes tag.
> 
>>>> fixed the TPM sleep logic from msleep to usleep_range,
>>>> so that the TPM sleeps exactly with TPM_TIMEOUT (=5ms) afterward.
>>>> Before that fix, msleep(5) actually sleeps for around 15ms.
>>> 
>>> What is TPM sleep logic?
>> It is about the commit metnioned in the description
>> `tpm: replace msleep() with usleep_range() in TPM 1.2/2.0 generic drivers`
>> https://github.com/torvalds/linux/commit/9f3fc7bcddcb51234e23494531f93ab60475e1c3
> 
> What you should do is to explain in simple terms the unwanted behaviour
> that you are observing, and also, *when* you observe it. E.g. does this
> happen when you use /dev/tpm0, or is it visible already in klog at boot
> time. And also: does it occur for anything you put to /dev/tpm0, or is
> the bug triggering for some particular TPM commands.
> 
> You also need to have information what kind of Atmel chip triggers the
> bug. I'd presume that you have access to a machine with such chip?
> 
> When you get all that figured out, you should explain how you change
> the existing behaviour, and why it makes sense. E.g. if you fixup
> timeouts, please just tell how'd you end up choosing the values that
> you picked.
> 
> E.g. the rationale for that could come from testing and finding the "sweet
> spot", or perhaps the reason could be that old values worked, new ones
> don't.
> 
> Especially in bug fixes the reasoning is *at least* as important as the
> the code change itself because I need to know what is going on.
> 
> /Jarkko

Thanks Jarkko for pointing the direction! I have updated the description
and sent a new revision.

Hao


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2] tpm: fix Atmel TPM crash caused by too frequent queries
  2021-07-09  4:40     ` [PATCH v2] tpm: fix Atmel " Hao Wu
@ 2021-07-09 17:47       ` Jarkko Sakkinen
  2021-07-09 19:23         ` Hao Wu
  2021-07-11  7:51       ` [PATCH v3] " Hao Wu
  1 sibling, 1 reply; 47+ messages in thread
From: Jarkko Sakkinen @ 2021-07-09 17:47 UTC (permalink / raw)
  To: Hao Wu
  Cc: shrihari.kalkar, seungyeop.han, anish.jhaveri, peterhuewe, jgg,
	linux-integrity, pmenzel, kgold, zohar, why2jjj.linux, hamza,
	gregkh, arnd, nayna, James.Bottomley

On Thu, Jul 08, 2021 at 09:40:28PM -0700, Hao Wu wrote:
> The Atmel TPM 1.2 chips crash with error
> `tpm_try_transmit: send(): error -62` since kernel 4.14.
> It is observed from the kernel log after running `tpm_sealdata -z`.
> The error thrown from the command is as follows
> ```
> $ tpm_sealdata -z
> Tspi_Key_LoadKey failed: 0x00001087 - layer=tddl,
> code=0087 (135), I/O error
> ```
> 
> The issue was reproduced with the following Atmel TPM chip:
> ```
> $ tpm_version
> T0  TPM 1.2 Version Info:
>   Chip Version:        1.2.66.1
>   Spec Level:          2
>   Errata Revision:     3
>   TPM Vendor ID:       ATML
>   TPM Version:         01010000
>   Manufacturer Info:   41544d4c
> ```
> 
> The root cause of the issue is due to the TPM calls to msleep()
> were replaced with usleep_range() [1], which reduces
> the actual timeout. Via experiments, it is observed that
> the original msleep(5) actually sleeps for 15ms.
> Because of a known timeout issue in Atmel TPM 1.2 chip,
> the shorter timeout than 15ms can cause the error described above.
> 
> A few further changes in kernel 4.16 [2] and 4.18 [3, 4] further
> reduced the timeout to less than 1ms. With experiments,
> the problematic timeout in the latest kernel is the one
> for `wait_for_tpm_stat`.
> 
> To fix it, the patch reverts the timeout of `wait_for_tpm_stat`
> to 15ms for all Atmel TPM 1.2 chips, but leave it untouched
> for Ateml TPM 2.0 chip, and chips from other vendors.
> As explained above, the chosen 15ms timeout is
> the actual timeout before this issue introduced,
> thus the old value is used here.
> Particularly, TPM_ATML_TIMEOUT_WAIT_STAT_MIN is set to 14700us,
> TPM_ATML_TIMEOUT_WAIT_STAT_MIN is set to 15000us according to
> the existing TPM_TIMEOUT_RANGE_US (300us).
> The fixed has been tested in the system with the affected Atmel chip
> with no issues observed after boot up.
> 
> References:
> [1] 9f3fc7bcddcb tpm: replace msleep() with usleep_range() in TPM
> 1.2/2.0 generic drivers
> [2] cf151a9a44d5 tpm: reduce tpm polling delay in tpm_tis_core
> [3] 59f5a6b07f64 tpm: reduce poll sleep time in tpm_transmit()
> [4] 424eaf910c32 tpm: reduce polling time to usecs for even finer
> granularity
> 
> Fixes: 9f3fc7bcddcb ("tpm: replace msleep() with usleep_range() in TPM 1.2/2.0 generic drivers")
> Link: https://patchwork.kernel.org/project/linux-integrity/patch/20200926223150.109645-1-hao.wu@rubrik.com/
> Signed-off-by: Hao Wu <hao.wu@rubrik.com>
> ---
> This version (v2) has following changes on top of the last (v1):
> - follow the existing way to define two timeouts (min and max)
>   for ATMEL chip, thus keep the exact timeout logic for 
>   non-ATEML chips.
> - limit the timeout increase to only ATMEL TPM 1.2 chips,
>   because it is not an issue for TPM 2.0 chips yet.
> 
> Test Plan:
> - Run fixed kernel with ATMEL TPM chips and see crash has been fixed.
> - Run fixed kernel with non-ATMEL TPM chips, and confirm
>   the timeout has not been changed.
> 
>  drivers/char/tpm/tpm.h          |  6 ++++--
>  drivers/char/tpm/tpm_tis_core.c | 23 +++++++++++++++++++++--
>  include/linux/tpm.h             |  3 +++
>  3 files changed, 28 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/char/tpm/tpm.h b/drivers/char/tpm/tpm.h
> index 283f78211c3a..6de1b44c4aab 100644
> --- a/drivers/char/tpm/tpm.h
> +++ b/drivers/char/tpm/tpm.h
> @@ -41,8 +41,10 @@ enum tpm_timeout {
>  	TPM_TIMEOUT_RETRY = 100, /* msecs */
>  	TPM_TIMEOUT_RANGE_US = 300,	/* usecs */
>  	TPM_TIMEOUT_POLL = 1,	/* msecs */
> -	TPM_TIMEOUT_USECS_MIN = 100,      /* usecs */
> -	TPM_TIMEOUT_USECS_MAX = 500      /* usecs */
> +	TPM_TIMEOUT_USECS_MIN = 100,	/* usecs */
> +	TPM_TIMEOUT_USECS_MAX = 500,	/* usecs */
> +	TPM_ATML_TIMEOUT_WAIT_STAT_MIN = 14700,	/* usecs */
> +	TPM_ATML_TIMEOUT_WAIT_STAT_MAX = 15000	/* usecs */
>  };
>  
>  /* TPM addresses */
> diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
> index 55b9d3965ae1..ae27d66fdd94 100644
> --- a/drivers/char/tpm/tpm_tis_core.c
> +++ b/drivers/char/tpm/tpm_tis_core.c
> @@ -80,8 +80,17 @@ static int wait_for_tpm_stat(struct tpm_chip *chip, u8 mask,
>  		}
>  	} else {
>  		do {
> -			usleep_range(TPM_TIMEOUT_USECS_MIN,
> -				     TPM_TIMEOUT_USECS_MAX);
> +			/* this code path could be executed before
> +			 * timeouts initialized in chip instance.
> +			 */
> +			if (chip->timeout_wait_stat_min &&
> +			    chip->timeout_wait_stat_max)
> +				usleep_range(chip->timeout_wait_stat_min,
> +					     chip->timeout_wait_stat_max);
> +			else
> +				usleep_range(TPM_TIMEOUT_USECS_MIN,
> +					     TPM_TIMEOUT_USECS_MAX);

This starts to look otherwise fine but you don't need this condition.
Just initialize variables to TPM_TIMEOUT_USECS_{MIN, MAX} for non-Atmel.

/Jarkko

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2] tpm: fix Atmel TPM crash caused by too frequent queries
  2021-07-09 17:47       ` Jarkko Sakkinen
@ 2021-07-09 19:23         ` Hao Wu
  2021-07-11  7:37           ` Hao Wu
  0 siblings, 1 reply; 47+ messages in thread
From: Hao Wu @ 2021-07-09 19:23 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Shrihari Kalkar, Seungyeop Han, Anish Jhaveri, peterhuewe, jgg,
	linux-integrity, Paul Menzel, Ken Goldman, zohar, why2jjj.linux,
	Hamza Attak, gregkh, arnd, Nayna, James.Bottomley

> On Jul 9, 2021, at 10:47 AM, Jarkko Sakkinen <jarkko@kernel.org> wrote:
> 
> On Thu, Jul 08, 2021 at 09:40:28PM -0700, Hao Wu wrote:
>> The Atmel TPM 1.2 chips crash with error
>> `tpm_try_transmit: send(): error -62` since kernel 4.14.
>> It is observed from the kernel log after running `tpm_sealdata -z`.
>> The error thrown from the command is as follows
>> ```
>> $ tpm_sealdata -z
>> Tspi_Key_LoadKey failed: 0x00001087 - layer=tddl,
>> code=0087 (135), I/O error
>> ```
>> 
>> The issue was reproduced with the following Atmel TPM chip:
>> ```
>> $ tpm_version
>> T0  TPM 1.2 Version Info:
>>  Chip Version:        1.2.66.1
>>  Spec Level:          2
>>  Errata Revision:     3
>>  TPM Vendor ID:       ATML
>>  TPM Version:         01010000
>>  Manufacturer Info:   41544d4c
>> ```
>> 
>> The root cause of the issue is due to the TPM calls to msleep()
>> were replaced with usleep_range() [1], which reduces
>> the actual timeout. Via experiments, it is observed that
>> the original msleep(5) actually sleeps for 15ms.
>> Because of a known timeout issue in Atmel TPM 1.2 chip,
>> the shorter timeout than 15ms can cause the error described above.
>> 
>> A few further changes in kernel 4.16 [2] and 4.18 [3, 4] further
>> reduced the timeout to less than 1ms. With experiments,
>> the problematic timeout in the latest kernel is the one
>> for `wait_for_tpm_stat`.
>> 
>> To fix it, the patch reverts the timeout of `wait_for_tpm_stat`
>> to 15ms for all Atmel TPM 1.2 chips, but leave it untouched
>> for Ateml TPM 2.0 chip, and chips from other vendors.
>> As explained above, the chosen 15ms timeout is
>> the actual timeout before this issue introduced,
>> thus the old value is used here.
>> Particularly, TPM_ATML_TIMEOUT_WAIT_STAT_MIN is set to 14700us,
>> TPM_ATML_TIMEOUT_WAIT_STAT_MIN is set to 15000us according to
>> the existing TPM_TIMEOUT_RANGE_US (300us).
>> The fixed has been tested in the system with the affected Atmel chip
>> with no issues observed after boot up.
>> 
>> References:
>> [1] 9f3fc7bcddcb tpm: replace msleep() with usleep_range() in TPM
>> 1.2/2.0 generic drivers
>> [2] cf151a9a44d5 tpm: reduce tpm polling delay in tpm_tis_core
>> [3] 59f5a6b07f64 tpm: reduce poll sleep time in tpm_transmit()
>> [4] 424eaf910c32 tpm: reduce polling time to usecs for even finer
>> granularity
>> 
>> Fixes: 9f3fc7bcddcb ("tpm: replace msleep() with usleep_range() in TPM 1.2/2.0 generic drivers")
>> Link: https://patchwork.kernel.org/project/linux-integrity/patch/20200926223150.109645-1-hao.wu@rubrik.com/
>> Signed-off-by: Hao Wu <hao.wu@rubrik.com>
>> ---
>> This version (v2) has following changes on top of the last (v1):
>> - follow the existing way to define two timeouts (min and max)
>>  for ATMEL chip, thus keep the exact timeout logic for 
>>  non-ATEML chips.
>> - limit the timeout increase to only ATMEL TPM 1.2 chips,
>>  because it is not an issue for TPM 2.0 chips yet.
>> 
>> Test Plan:
>> - Run fixed kernel with ATMEL TPM chips and see crash has been fixed.
>> - Run fixed kernel with non-ATMEL TPM chips, and confirm
>>  the timeout has not been changed.
>> 
>> drivers/char/tpm/tpm.h          |  6 ++++--
>> drivers/char/tpm/tpm_tis_core.c | 23 +++++++++++++++++++++--
>> include/linux/tpm.h             |  3 +++
>> 3 files changed, 28 insertions(+), 4 deletions(-)
>> 
>> diff --git a/drivers/char/tpm/tpm.h b/drivers/char/tpm/tpm.h
>> index 283f78211c3a..6de1b44c4aab 100644
>> --- a/drivers/char/tpm/tpm.h
>> +++ b/drivers/char/tpm/tpm.h
>> @@ -41,8 +41,10 @@ enum tpm_timeout {
>> 	TPM_TIMEOUT_RETRY = 100, /* msecs */
>> 	TPM_TIMEOUT_RANGE_US = 300,	/* usecs */
>> 	TPM_TIMEOUT_POLL = 1,	/* msecs */
>> -	TPM_TIMEOUT_USECS_MIN = 100,      /* usecs */
>> -	TPM_TIMEOUT_USECS_MAX = 500      /* usecs */
>> +	TPM_TIMEOUT_USECS_MIN = 100,	/* usecs */
>> +	TPM_TIMEOUT_USECS_MAX = 500,	/* usecs */
>> +	TPM_ATML_TIMEOUT_WAIT_STAT_MIN = 14700,	/* usecs */
>> +	TPM_ATML_TIMEOUT_WAIT_STAT_MAX = 15000	/* usecs */
>> };
>> 
>> /* TPM addresses */
>> diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
>> index 55b9d3965ae1..ae27d66fdd94 100644
>> --- a/drivers/char/tpm/tpm_tis_core.c
>> +++ b/drivers/char/tpm/tpm_tis_core.c
>> @@ -80,8 +80,17 @@ static int wait_for_tpm_stat(struct tpm_chip *chip, u8 mask,
>> 		}
>> 	} else {
>> 		do {
>> -			usleep_range(TPM_TIMEOUT_USECS_MIN,
>> -				     TPM_TIMEOUT_USECS_MAX);
>> +			/* this code path could be executed before
>> +			 * timeouts initialized in chip instance.
>> +			 */
>> +			if (chip->timeout_wait_stat_min &&
>> +			    chip->timeout_wait_stat_max)
>> +				usleep_range(chip->timeout_wait_stat_min,
>> +					     chip->timeout_wait_stat_max);
>> +			else
>> +				usleep_range(TPM_TIMEOUT_USECS_MIN,
>> +					     TPM_TIMEOUT_USECS_MAX);
> 
> This starts to look otherwise fine but you don't need this condition.
> Just initialize variables to TPM_TIMEOUT_USECS_{MIN, MAX} for non-Atmel.
Not sure I got your point or not. We have discussed this question a few rounds before,
I answered you about this. This check is required because before the time of 
Initialization in the code I added in `tpm_tis_core_init`
```
+	chip->timeout_wait_stat_min = TPM_TIMEOUT_USECS_MIN;
+	chip->timeout_wait_stat_max = TPM_TIMEOUT_USECS_MAX;
```
The func `wait_for_tpm_stat` runs, we need the condition to fall back to avoid system startup crash.

Let me know if this makes sense. If needed, I can do another confirm.

> /Jarkko

Hao


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2] tpm: fix Atmel TPM crash caused by too frequent queries
  2021-07-09 19:23         ` Hao Wu
@ 2021-07-11  7:37           ` Hao Wu
  2021-07-16  5:30             ` Hao Wu
  0 siblings, 1 reply; 47+ messages in thread
From: Hao Wu @ 2021-07-11  7:37 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Shrihari Kalkar, Seungyeop Han, Anish Jhaveri, peterhuewe, jgg,
	linux-integrity, Paul Menzel, Ken Goldman, zohar, why2jjj.linux,
	Hamza Attak, gregkh, arnd, Nayna, James.Bottomley

> On Jul 9, 2021, at 12:23 PM, Hao Wu <hao.wu@rubrik.com> wrote:
> 
>> On Jul 9, 2021, at 10:47 AM, Jarkko Sakkinen <jarkko@kernel.org> wrote:
>> 
>> On Thu, Jul 08, 2021 at 09:40:28PM -0700, Hao Wu wrote:
>>> The Atmel TPM 1.2 chips crash with error
>>> `tpm_try_transmit: send(): error -62` since kernel 4.14.
>>> It is observed from the kernel log after running `tpm_sealdata -z`.
>>> The error thrown from the command is as follows
>>> ```
>>> $ tpm_sealdata -z
>>> Tspi_Key_LoadKey failed: 0x00001087 - layer=tddl,
>>> code=0087 (135), I/O error
>>> ```
>>> 
>>> The issue was reproduced with the following Atmel TPM chip:
>>> ```
>>> $ tpm_version
>>> T0  TPM 1.2 Version Info:
>>> Chip Version:        1.2.66.1
>>> Spec Level:          2
>>> Errata Revision:     3
>>> TPM Vendor ID:       ATML
>>> TPM Version:         01010000
>>> Manufacturer Info:   41544d4c
>>> ```
>>> 
>>> The root cause of the issue is due to the TPM calls to msleep()
>>> were replaced with usleep_range() [1], which reduces
>>> the actual timeout. Via experiments, it is observed that
>>> the original msleep(5) actually sleeps for 15ms.
>>> Because of a known timeout issue in Atmel TPM 1.2 chip,
>>> the shorter timeout than 15ms can cause the error described above.
>>> 
>>> A few further changes in kernel 4.16 [2] and 4.18 [3, 4] further
>>> reduced the timeout to less than 1ms. With experiments,
>>> the problematic timeout in the latest kernel is the one
>>> for `wait_for_tpm_stat`.
>>> 
>>> To fix it, the patch reverts the timeout of `wait_for_tpm_stat`
>>> to 15ms for all Atmel TPM 1.2 chips, but leave it untouched
>>> for Ateml TPM 2.0 chip, and chips from other vendors.
>>> As explained above, the chosen 15ms timeout is
>>> the actual timeout before this issue introduced,
>>> thus the old value is used here.
>>> Particularly, TPM_ATML_TIMEOUT_WAIT_STAT_MIN is set to 14700us,
>>> TPM_ATML_TIMEOUT_WAIT_STAT_MIN is set to 15000us according to
>>> the existing TPM_TIMEOUT_RANGE_US (300us).
>>> The fixed has been tested in the system with the affected Atmel chip
>>> with no issues observed after boot up.
>>> 
>>> References:
>>> [1] 9f3fc7bcddcb tpm: replace msleep() with usleep_range() in TPM
>>> 1.2/2.0 generic drivers
>>> [2] cf151a9a44d5 tpm: reduce tpm polling delay in tpm_tis_core
>>> [3] 59f5a6b07f64 tpm: reduce poll sleep time in tpm_transmit()
>>> [4] 424eaf910c32 tpm: reduce polling time to usecs for even finer
>>> granularity
>>> 
>>> Fixes: 9f3fc7bcddcb ("tpm: replace msleep() with usleep_range() in TPM 1.2/2.0 generic drivers")
>>> Link: https://patchwork.kernel.org/project/linux-integrity/patch/20200926223150.109645-1-hao.wu@rubrik.com/
>>> Signed-off-by: Hao Wu <hao.wu@rubrik.com>
>>> ---
>>> This version (v2) has following changes on top of the last (v1):
>>> - follow the existing way to define two timeouts (min and max)
>>> for ATMEL chip, thus keep the exact timeout logic for 
>>> non-ATEML chips.
>>> - limit the timeout increase to only ATMEL TPM 1.2 chips,
>>> because it is not an issue for TPM 2.0 chips yet.
>>> 
>>> Test Plan:
>>> - Run fixed kernel with ATMEL TPM chips and see crash has been fixed.
>>> - Run fixed kernel with non-ATMEL TPM chips, and confirm
>>> the timeout has not been changed.
>>> 
>>> drivers/char/tpm/tpm.h          |  6 ++++--
>>> drivers/char/tpm/tpm_tis_core.c | 23 +++++++++++++++++++++--
>>> include/linux/tpm.h             |  3 +++
>>> 3 files changed, 28 insertions(+), 4 deletions(-)
>>> 
>>> diff --git a/drivers/char/tpm/tpm.h b/drivers/char/tpm/tpm.h
>>> index 283f78211c3a..6de1b44c4aab 100644
>>> --- a/drivers/char/tpm/tpm.h
>>> +++ b/drivers/char/tpm/tpm.h
>>> @@ -41,8 +41,10 @@ enum tpm_timeout {
>>> 	TPM_TIMEOUT_RETRY = 100, /* msecs */
>>> 	TPM_TIMEOUT_RANGE_US = 300,	/* usecs */
>>> 	TPM_TIMEOUT_POLL = 1,	/* msecs */
>>> -	TPM_TIMEOUT_USECS_MIN = 100,      /* usecs */
>>> -	TPM_TIMEOUT_USECS_MAX = 500      /* usecs */
>>> +	TPM_TIMEOUT_USECS_MIN = 100,	/* usecs */
>>> +	TPM_TIMEOUT_USECS_MAX = 500,	/* usecs */
>>> +	TPM_ATML_TIMEOUT_WAIT_STAT_MIN = 14700,	/* usecs */
>>> +	TPM_ATML_TIMEOUT_WAIT_STAT_MAX = 15000	/* usecs */
>>> };
>>> 
>>> /* TPM addresses */
>>> diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
>>> index 55b9d3965ae1..ae27d66fdd94 100644
>>> --- a/drivers/char/tpm/tpm_tis_core.c
>>> +++ b/drivers/char/tpm/tpm_tis_core.c
>>> @@ -80,8 +80,17 @@ static int wait_for_tpm_stat(struct tpm_chip *chip, u8 mask,
>>> 		}
>>> 	} else {
>>> 		do {
>>> -			usleep_range(TPM_TIMEOUT_USECS_MIN,
>>> -				     TPM_TIMEOUT_USECS_MAX);
>>> +			/* this code path could be executed before
>>> +			 * timeouts initialized in chip instance.
>>> +			 */
>>> +			if (chip->timeout_wait_stat_min &&
>>> +			    chip->timeout_wait_stat_max)
>>> +				usleep_range(chip->timeout_wait_stat_min,
>>> +					     chip->timeout_wait_stat_max);
>>> +			else
>>> +				usleep_range(TPM_TIMEOUT_USECS_MIN,
>>> +					     TPM_TIMEOUT_USECS_MAX);
>> 
>> This starts to look otherwise fine but you don't need this condition.
>> Just initialize variables to TPM_TIMEOUT_USECS_{MIN, MAX} for non-Atmel.
> Not sure I got your point or not. We have discussed this question a few rounds before,
> I answered you about this. This check is required because before the time of 
> Initialization in the code I added in `tpm_tis_core_init`
> ```
> +	chip->timeout_wait_stat_min = TPM_TIMEOUT_USECS_MIN;
> +	chip->timeout_wait_stat_max = TPM_TIMEOUT_USECS_MAX;
> ```
> The func `wait_for_tpm_stat` runs, we need the condition to fall back to avoid system startup crash.
> 
> Let me know if this makes sense. If needed, I can do another confirm.
I double checked this, and found the current init lines in `tpm_tis_core_init` 
is actually before this code path now. Maybe it was an issue in one
of my old revision and I had the wrong impression. 
The condition seems ok to remove in the current revision. 

But I am not fully sure is if the behavior is consistent across other 1.2 chips, and TPM 2.0 chips.
Should we still keep the condition for robustness or ship without it ?  

>> /Jarkko
> 
> Hao

Hao


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH v3] tpm: fix Atmel TPM crash caused by too frequent queries
  2021-07-09  4:40     ` [PATCH v2] tpm: fix Atmel " Hao Wu
  2021-07-09 17:47       ` Jarkko Sakkinen
@ 2021-07-11  7:51       ` Hao Wu
  2021-07-27  2:46         ` Jarkko Sakkinen
  2021-08-14 22:25         ` [PATCH v4] " Hao Wu
  1 sibling, 2 replies; 47+ messages in thread
From: Hao Wu @ 2021-07-11  7:51 UTC (permalink / raw)
  To: hao.wu, shrihari.kalkar, seungyeop.han, anish.jhaveri,
	peterhuewe, jarkko, jgg, linux-integrity, pmenzel, kgold, zohar,
	why2jjj.linux, hamza, gregkh, arnd, nayna, James.Bottomley

The Atmel TPM 1.2 chips crash with error
`tpm_try_transmit: send(): error -62` since kernel 4.14.
It is observed from the kernel log after running `tpm_sealdata -z`.
The error thrown from the command is as follows
```
$ tpm_sealdata -z
Tspi_Key_LoadKey failed: 0x00001087 - layer=tddl,
code=0087 (135), I/O error
```

The issue was reproduced with the following Atmel TPM chip:
```
$ tpm_version
T0  TPM 1.2 Version Info:
  Chip Version:        1.2.66.1
  Spec Level:          2
  Errata Revision:     3
  TPM Vendor ID:       ATML
  TPM Version:         01010000
  Manufacturer Info:   41544d4c
```

The root cause of the issue is due to the TPM calls to msleep()
were replaced with usleep_range() [1], which reduces
the actual timeout. Via experiments, it is observed that
the original msleep(5) actually sleeps for 15ms.
Because of a known timeout issue in Atmel TPM 1.2 chip,
the shorter timeout than 15ms can cause the error described above.

A few further changes in kernel 4.16 [2] and 4.18 [3, 4] further
reduced the timeout to less than 1ms. With experiments,
the problematic timeout in the latest kernel is the one
for `wait_for_tpm_stat`.

To fix it, the patch reverts the timeout of `wait_for_tpm_stat`
to 15ms for all Atmel TPM 1.2 chips, but leave it untouched
for Ateml TPM 2.0 chip, and chips from other vendors.
As explained above, the chosen 15ms timeout is
the actual timeout before this issue introduced,
thus the old value is used here.
Particularly, TPM_ATML_TIMEOUT_WAIT_STAT_MIN is set to 14700us,
TPM_ATML_TIMEOUT_WAIT_STAT_MIN is set to 15000us according to
the existing TPM_TIMEOUT_RANGE_US (300us).
The fixed has been tested in the system with the affected Atmel chip
with no issues observed after boot up.

References:
[1] 9f3fc7bcddcb tpm: replace msleep() with usleep_range() in TPM
1.2/2.0 generic drivers
[2] cf151a9a44d5 tpm: reduce tpm polling delay in tpm_tis_core
[3] 59f5a6b07f64 tpm: reduce poll sleep time in tpm_transmit()
[4] 424eaf910c32 tpm: reduce polling time to usecs for even finer
granularity

Fixes: 9f3fc7bcddcb ("tpm: replace msleep() with usleep_range() in TPM 1.2/2.0 generic drivers")
Link: https://patchwork.kernel.org/project/linux-integrity/patch/20200926223150.109645-1-hao.wu@rubrik.com/
Signed-off-by: Hao Wu <hao.wu@rubrik.com>
---
This version (v3) removes unnecessary condition check
in `wait_for_tpm_stat`.

Test Plan:
- Run fixed kernel with ATMEL TPM chips and see crash
has been fixed.
- Run fixed kernel with non-ATMEL TPM chips, and confirm
the timeout has not been changed.

drivers/char/tpm/tpm.h          |  6 ++++--
 drivers/char/tpm/tpm_tis_core.c | 14 ++++++++++++--
 include/linux/tpm.h             |  3 +++
 3 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/drivers/char/tpm/tpm.h b/drivers/char/tpm/tpm.h
index 283f78211c3a..6de1b44c4aab 100644
--- a/drivers/char/tpm/tpm.h
+++ b/drivers/char/tpm/tpm.h
@@ -41,8 +41,10 @@ enum tpm_timeout {
 	TPM_TIMEOUT_RETRY = 100, /* msecs */
 	TPM_TIMEOUT_RANGE_US = 300,	/* usecs */
 	TPM_TIMEOUT_POLL = 1,	/* msecs */
-	TPM_TIMEOUT_USECS_MIN = 100,      /* usecs */
-	TPM_TIMEOUT_USECS_MAX = 500      /* usecs */
+	TPM_TIMEOUT_USECS_MIN = 100,	/* usecs */
+	TPM_TIMEOUT_USECS_MAX = 500,	/* usecs */
+	TPM_ATML_TIMEOUT_WAIT_STAT_MIN = 14700,	/* usecs */
+	TPM_ATML_TIMEOUT_WAIT_STAT_MAX = 15000	/* usecs */
 };
 
 /* TPM addresses */
diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
index 55b9d3965ae1..2de1f71e8ae1 100644
--- a/drivers/char/tpm/tpm_tis_core.c
+++ b/drivers/char/tpm/tpm_tis_core.c
@@ -80,8 +80,8 @@ static int wait_for_tpm_stat(struct tpm_chip *chip, u8 mask,
 		}
 	} else {
 		do {
-			usleep_range(TPM_TIMEOUT_USECS_MIN,
-				     TPM_TIMEOUT_USECS_MAX);
+			usleep_range(chip->timeout_wait_stat_min,
+				     chip->timeout_wait_stat_max);
 			status = chip->ops->status(chip);
 			if ((status & mask) == mask)
 				return 0;
@@ -934,6 +934,9 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
 	chip->timeout_b = msecs_to_jiffies(TIS_TIMEOUT_B_MAX);
 	chip->timeout_c = msecs_to_jiffies(TIS_TIMEOUT_C_MAX);
 	chip->timeout_d = msecs_to_jiffies(TIS_TIMEOUT_D_MAX);
+	/* init timeouts for wait_for_tpm_stat */
+	chip->timeout_wait_stat_min = TPM_TIMEOUT_USECS_MIN;
+	chip->timeout_wait_stat_max = TPM_TIMEOUT_USECS_MAX;
 	priv->phy_ops = phy_ops;
 	dev_set_drvdata(&chip->dev, priv);
 
@@ -983,6 +986,13 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
 
 	priv->manufacturer_id = vendor;
 
+	if (priv->manufacturer_id == TPM_VID_ATML &&
+		!(chip->flags & TPM_CHIP_FLAG_TPM2)) {
+		/* If TPM chip is 1.2 ATMEL chip, timeout need to be relaxed*/
+		chip->timeout_wait_stat_min = TPM_ATML_TIMEOUT_WAIT_STAT_MIN;
+		chip->timeout_wait_stat_max = TPM_ATML_TIMEOUT_WAIT_STAT_MAX;
+	}
+
 	rc = tpm_tis_read8(priv, TPM_RID(0), &rid);
 	if (rc < 0)
 		goto out_err;
diff --git a/include/linux/tpm.h b/include/linux/tpm.h
index aa11fe323c56..171b9102c976 100644
--- a/include/linux/tpm.h
+++ b/include/linux/tpm.h
@@ -150,6 +150,8 @@ struct tpm_chip {
 	bool timeout_adjusted;
 	unsigned long duration[TPM_NUM_DURATIONS]; /* jiffies */
 	bool duration_adjusted;
+	unsigned int timeout_wait_stat_min; /* usecs */
+	unsigned int timeout_wait_stat_max; /* usecs */
 
 	struct dentry *bios_dir[TPM_NUM_EVENT_LOG_FILES];
 
@@ -269,6 +271,7 @@ enum tpm2_cc_attrs {
 #define TPM_VID_INTEL    0x8086
 #define TPM_VID_WINBOND  0x1050
 #define TPM_VID_STM      0x104A
+#define TPM_VID_ATML     0x1114
 
 enum tpm_chip_flags {
 	TPM_CHIP_FLAG_TPM2		= BIT(1),
-- 
2.29.0.vfs.0.0


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* Re: [PATCH v2] tpm: fix Atmel TPM crash caused by too frequent queries
  2021-07-11  7:37           ` Hao Wu
@ 2021-07-16  5:30             ` Hao Wu
  0 siblings, 0 replies; 47+ messages in thread
From: Hao Wu @ 2021-07-16  5:30 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Shrihari Kalkar, Seungyeop Han, Anish Jhaveri, peterhuewe, jgg,
	linux-integrity, Paul Menzel, Ken Goldman, zohar, why2jjj.linux,
	Hamza Attak, gregkh, arnd, Nayna, James.Bottomley

> On Jul 11, 2021, at 12:37 AM, Hao Wu <hao.wu@rubrik.com> wrote:
> 
>> On Jul 9, 2021, at 12:23 PM, Hao Wu <hao.wu@rubrik.com> wrote:
>> 
>>> On Jul 9, 2021, at 10:47 AM, Jarkko Sakkinen <jarkko@kernel.org> wrote:
>>> 
>>> On Thu, Jul 08, 2021 at 09:40:28PM -0700, Hao Wu wrote:
>>>> The Atmel TPM 1.2 chips crash with error
>>>> `tpm_try_transmit: send(): error -62` since kernel 4.14.
>>>> It is observed from the kernel log after running `tpm_sealdata -z`.
>>>> The error thrown from the command is as follows
>>>> ```
>>>> $ tpm_sealdata -z
>>>> Tspi_Key_LoadKey failed: 0x00001087 - layer=tddl,
>>>> code=0087 (135), I/O error
>>>> ```
>>>> 
>>>> The issue was reproduced with the following Atmel TPM chip:
>>>> ```
>>>> $ tpm_version
>>>> T0  TPM 1.2 Version Info:
>>>> Chip Version:        1.2.66.1
>>>> Spec Level:          2
>>>> Errata Revision:     3
>>>> TPM Vendor ID:       ATML
>>>> TPM Version:         01010000
>>>> Manufacturer Info:   41544d4c
>>>> ```
>>>> 
>>>> The root cause of the issue is due to the TPM calls to msleep()
>>>> were replaced with usleep_range() [1], which reduces
>>>> the actual timeout. Via experiments, it is observed that
>>>> the original msleep(5) actually sleeps for 15ms.
>>>> Because of a known timeout issue in Atmel TPM 1.2 chip,
>>>> the shorter timeout than 15ms can cause the error described above.
>>>> 
>>>> A few further changes in kernel 4.16 [2] and 4.18 [3, 4] further
>>>> reduced the timeout to less than 1ms. With experiments,
>>>> the problematic timeout in the latest kernel is the one
>>>> for `wait_for_tpm_stat`.
>>>> 
>>>> To fix it, the patch reverts the timeout of `wait_for_tpm_stat`
>>>> to 15ms for all Atmel TPM 1.2 chips, but leave it untouched
>>>> for Ateml TPM 2.0 chip, and chips from other vendors.
>>>> As explained above, the chosen 15ms timeout is
>>>> the actual timeout before this issue introduced,
>>>> thus the old value is used here.
>>>> Particularly, TPM_ATML_TIMEOUT_WAIT_STAT_MIN is set to 14700us,
>>>> TPM_ATML_TIMEOUT_WAIT_STAT_MIN is set to 15000us according to
>>>> the existing TPM_TIMEOUT_RANGE_US (300us).
>>>> The fixed has been tested in the system with the affected Atmel chip
>>>> with no issues observed after boot up.
>>>> 
>>>> References:
>>>> [1] 9f3fc7bcddcb tpm: replace msleep() with usleep_range() in TPM
>>>> 1.2/2.0 generic drivers
>>>> [2] cf151a9a44d5 tpm: reduce tpm polling delay in tpm_tis_core
>>>> [3] 59f5a6b07f64 tpm: reduce poll sleep time in tpm_transmit()
>>>> [4] 424eaf910c32 tpm: reduce polling time to usecs for even finer
>>>> granularity
>>>> 
>>>> Fixes: 9f3fc7bcddcb ("tpm: replace msleep() with usleep_range() in TPM 1.2/2.0 generic drivers")
>>>> Link: https://patchwork.kernel.org/project/linux-integrity/patch/20200926223150.109645-1-hao.wu@rubrik.com/
>>>> Signed-off-by: Hao Wu <hao.wu@rubrik.com>
>>>> ---
>>>> This version (v2) has following changes on top of the last (v1):
>>>> - follow the existing way to define two timeouts (min and max)
>>>> for ATMEL chip, thus keep the exact timeout logic for 
>>>> non-ATEML chips.
>>>> - limit the timeout increase to only ATMEL TPM 1.2 chips,
>>>> because it is not an issue for TPM 2.0 chips yet.
>>>> 
>>>> Test Plan:
>>>> - Run fixed kernel with ATMEL TPM chips and see crash has been fixed.
>>>> - Run fixed kernel with non-ATMEL TPM chips, and confirm
>>>> the timeout has not been changed.
>>>> 
>>>> drivers/char/tpm/tpm.h          |  6 ++++--
>>>> drivers/char/tpm/tpm_tis_core.c | 23 +++++++++++++++++++++--
>>>> include/linux/tpm.h             |  3 +++
>>>> 3 files changed, 28 insertions(+), 4 deletions(-)
>>>> 
>>>> diff --git a/drivers/char/tpm/tpm.h b/drivers/char/tpm/tpm.h
>>>> index 283f78211c3a..6de1b44c4aab 100644
>>>> --- a/drivers/char/tpm/tpm.h
>>>> +++ b/drivers/char/tpm/tpm.h
>>>> @@ -41,8 +41,10 @@ enum tpm_timeout {
>>>> 	TPM_TIMEOUT_RETRY = 100, /* msecs */
>>>> 	TPM_TIMEOUT_RANGE_US = 300,	/* usecs */
>>>> 	TPM_TIMEOUT_POLL = 1,	/* msecs */
>>>> -	TPM_TIMEOUT_USECS_MIN = 100,      /* usecs */
>>>> -	TPM_TIMEOUT_USECS_MAX = 500      /* usecs */
>>>> +	TPM_TIMEOUT_USECS_MIN = 100,	/* usecs */
>>>> +	TPM_TIMEOUT_USECS_MAX = 500,	/* usecs */
>>>> +	TPM_ATML_TIMEOUT_WAIT_STAT_MIN = 14700,	/* usecs */
>>>> +	TPM_ATML_TIMEOUT_WAIT_STAT_MAX = 15000	/* usecs */
>>>> };
>>>> 
>>>> /* TPM addresses */
>>>> diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
>>>> index 55b9d3965ae1..ae27d66fdd94 100644
>>>> --- a/drivers/char/tpm/tpm_tis_core.c
>>>> +++ b/drivers/char/tpm/tpm_tis_core.c
>>>> @@ -80,8 +80,17 @@ static int wait_for_tpm_stat(struct tpm_chip *chip, u8 mask,
>>>> 		}
>>>> 	} else {
>>>> 		do {
>>>> -			usleep_range(TPM_TIMEOUT_USECS_MIN,
>>>> -				     TPM_TIMEOUT_USECS_MAX);
>>>> +			/* this code path could be executed before
>>>> +			 * timeouts initialized in chip instance.
>>>> +			 */
>>>> +			if (chip->timeout_wait_stat_min &&
>>>> +			    chip->timeout_wait_stat_max)
>>>> +				usleep_range(chip->timeout_wait_stat_min,
>>>> +					     chip->timeout_wait_stat_max);
>>>> +			else
>>>> +				usleep_range(TPM_TIMEOUT_USECS_MIN,
>>>> +					     TPM_TIMEOUT_USECS_MAX);
>>> 
>>> This starts to look otherwise fine but you don't need this condition.
>>> Just initialize variables to TPM_TIMEOUT_USECS_{MIN, MAX} for non-Atmel.
>> Not sure I got your point or not. We have discussed this question a few rounds before,
>> I answered you about this. This check is required because before the time of 
>> Initialization in the code I added in `tpm_tis_core_init`
>> ```
>> +	chip->timeout_wait_stat_min = TPM_TIMEOUT_USECS_MIN;
>> +	chip->timeout_wait_stat_max = TPM_TIMEOUT_USECS_MAX;
>> ```
>> The func `wait_for_tpm_stat` runs, we need the condition to fall back to avoid system startup crash.
>> 
>> Let me know if this makes sense. If needed, I can do another confirm.
> I double checked this, and found the current init lines in `tpm_tis_core_init` 
> is actually before this code path now. Maybe it was an issue in one
> of my old revision and I had the wrong impression. 
> The condition seems ok to remove in the current revision. 
> 
> But I am not fully sure is if the behavior is consistent across other 1.2 chips, and TPM 2.0 chips.
> Should we still keep the condition for robustness or ship without it ?  
> 
This has been updated in a v3 patch 
https://patchwork.kernel.org/project/linux-integrity/patch/20210711075122.30056-1-hao.wu@rubrik.com/

Let me know if that is preferred. I tested in both atmel and non-atmel machine. Works fine so far.

>>> /Jarkko
>> 
>> Hao
> 
> Hao

Hao


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v3] tpm: fix Atmel TPM crash caused by too frequent queries
  2021-07-11  7:51       ` [PATCH v3] " Hao Wu
@ 2021-07-27  2:46         ` Jarkko Sakkinen
  2021-07-27  3:40           ` Hao Wu
  2021-08-14 22:25         ` [PATCH v4] " Hao Wu
  1 sibling, 1 reply; 47+ messages in thread
From: Jarkko Sakkinen @ 2021-07-27  2:46 UTC (permalink / raw)
  To: Hao Wu
  Cc: shrihari.kalkar, seungyeop.han, anish.jhaveri, peterhuewe, jgg,
	linux-integrity, pmenzel, kgold, zohar, why2jjj.linux, hamza,
	gregkh, arnd, nayna, James.Bottomley

On Sun, Jul 11, 2021 at 12:51:22AM -0700, Hao Wu wrote:
> The Atmel TPM 1.2 chips crash with error
> `tpm_try_transmit: send(): error -62` since kernel 4.14.
> It is observed from the kernel log after running `tpm_sealdata -z`.
> The error thrown from the command is as follows
> ```
> $ tpm_sealdata -z
> Tspi_Key_LoadKey failed: 0x00001087 - layer=tddl,
> code=0087 (135), I/O error
> ```
> 
> The issue was reproduced with the following Atmel TPM chip:
> ```
> $ tpm_version
> T0  TPM 1.2 Version Info:
>   Chip Version:        1.2.66.1
>   Spec Level:          2
>   Errata Revision:     3
>   TPM Vendor ID:       ATML
>   TPM Version:         01010000
>   Manufacturer Info:   41544d4c
> ```
> 
> The root cause of the issue is due to the TPM calls to msleep()
> were replaced with usleep_range() [1], which reduces
> the actual timeout. Via experiments, it is observed that
> the original msleep(5) actually sleeps for 15ms.
> Because of a known timeout issue in Atmel TPM 1.2 chip,
> the shorter timeout than 15ms can cause the error described above.
> 
> A few further changes in kernel 4.16 [2] and 4.18 [3, 4] further
> reduced the timeout to less than 1ms. With experiments,
> the problematic timeout in the latest kernel is the one
> for `wait_for_tpm_stat`.
> 
> To fix it, the patch reverts the timeout of `wait_for_tpm_stat`
> to 15ms for all Atmel TPM 1.2 chips, but leave it untouched
> for Ateml TPM 2.0 chip, and chips from other vendors.
> As explained above, the chosen 15ms timeout is
> the actual timeout before this issue introduced,
> thus the old value is used here.
> Particularly, TPM_ATML_TIMEOUT_WAIT_STAT_MIN is set to 14700us,
> TPM_ATML_TIMEOUT_WAIT_STAT_MIN is set to 15000us according to
> the existing TPM_TIMEOUT_RANGE_US (300us).
> The fixed has been tested in the system with the affected Atmel chip
> with no issues observed after boot up.
> 
> References:
> [1] 9f3fc7bcddcb tpm: replace msleep() with usleep_range() in TPM
> 1.2/2.0 generic drivers
> [2] cf151a9a44d5 tpm: reduce tpm polling delay in tpm_tis_core
> [3] 59f5a6b07f64 tpm: reduce poll sleep time in tpm_transmit()
> [4] 424eaf910c32 tpm: reduce polling time to usecs for even finer
> granularity
> 
> Fixes: 9f3fc7bcddcb ("tpm: replace msleep() with usleep_range() in TPM 1.2/2.0 generic drivers")
> Link: https://patchwork.kernel.org/project/linux-integrity/patch/20200926223150.109645-1-hao.wu@rubrik.com/
> Signed-off-by: Hao Wu <hao.wu@rubrik.com>
> ---
> This version (v3) removes unnecessary condition check
> in `wait_for_tpm_stat`.

Missing change long v1 -> v2.

Please do something like

v3:
- ...

v2:
- ...

> 
> Test Plan:
> - Run fixed kernel with ATMEL TPM chips and see crash
> has been fixed.
> - Run fixed kernel with non-ATMEL TPM chips, and confirm
> the timeout has not been changed.
> 
> drivers/char/tpm/tpm.h          |  6 ++++--
>  drivers/char/tpm/tpm_tis_core.c | 14 ++++++++++++--
>  include/linux/tpm.h             |  3 +++
>  3 files changed, 19 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/char/tpm/tpm.h b/drivers/char/tpm/tpm.h
> index 283f78211c3a..6de1b44c4aab 100644
> --- a/drivers/char/tpm/tpm.h
> +++ b/drivers/char/tpm/tpm.h
> @@ -41,8 +41,10 @@ enum tpm_timeout {
>  	TPM_TIMEOUT_RETRY = 100, /* msecs */
>  	TPM_TIMEOUT_RANGE_US = 300,	/* usecs */
>  	TPM_TIMEOUT_POLL = 1,	/* msecs */
> -	TPM_TIMEOUT_USECS_MIN = 100,      /* usecs */
> -	TPM_TIMEOUT_USECS_MAX = 500      /* usecs */
> +	TPM_TIMEOUT_USECS_MIN = 100,	/* usecs */
> +	TPM_TIMEOUT_USECS_MAX = 500,	/* usecs */

What is going on here?

These lines should not change.

> +	TPM_ATML_TIMEOUT_WAIT_STAT_MIN = 14700,	/* usecs */
> +	TPM_ATML_TIMEOUT_WAIT_STAT_MAX = 15000	/* usecs */

Move these definitions to tpm_tis_core.h. They are only useful
for a single driver.


>  };
>  
>  /* TPM addresses */
> diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
> index 55b9d3965ae1..2de1f71e8ae1 100644
> --- a/drivers/char/tpm/tpm_tis_core.c
> +++ b/drivers/char/tpm/tpm_tis_core.c
> @@ -80,8 +80,8 @@ static int wait_for_tpm_stat(struct tpm_chip *chip, u8 mask,
>  		}
>  	} else {
>  		do {
> -			usleep_range(TPM_TIMEOUT_USECS_MIN,
> -				     TPM_TIMEOUT_USECS_MAX);
> +			usleep_range(chip->timeout_wait_stat_min,
> +				     chip->timeout_wait_stat_max);
>  			status = chip->ops->status(chip);
>  			if ((status & mask) == mask)
>  				return 0;
> @@ -934,6 +934,9 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
>  	chip->timeout_b = msecs_to_jiffies(TIS_TIMEOUT_B_MAX);
>  	chip->timeout_c = msecs_to_jiffies(TIS_TIMEOUT_C_MAX);
>  	chip->timeout_d = msecs_to_jiffies(TIS_TIMEOUT_D_MAX);
> +	/* init timeouts for wait_for_tpm_stat */

Remove this comment.

> +	chip->timeout_wait_stat_min = TPM_TIMEOUT_USECS_MIN;
> +	chip->timeout_wait_stat_max = TPM_TIMEOUT_USECS_MAX;
>  	priv->phy_ops = phy_ops;
>  	dev_set_drvdata(&chip->dev, priv);
>  
> @@ -983,6 +986,13 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
>  
>  	priv->manufacturer_id = vendor;
>  
> +	if (priv->manufacturer_id == TPM_VID_ATML &&
> +		!(chip->flags & TPM_CHIP_FLAG_TPM2)) {
> +		/* If TPM chip is 1.2 ATMEL chip, timeout need to be relaxed*/
> +		chip->timeout_wait_stat_min = TPM_ATML_TIMEOUT_WAIT_STAT_MIN;
> +		chip->timeout_wait_stat_max = TPM_ATML_TIMEOUT_WAIT_STAT_MAX;
> +	}
> +
>  	rc = tpm_tis_read8(priv, TPM_RID(0), &rid);
>  	if (rc < 0)
>  		goto out_err;
> diff --git a/include/linux/tpm.h b/include/linux/tpm.h
> index aa11fe323c56..171b9102c976 100644
> --- a/include/linux/tpm.h
> +++ b/include/linux/tpm.h
> @@ -150,6 +150,8 @@ struct tpm_chip {
>  	bool timeout_adjusted;
>  	unsigned long duration[TPM_NUM_DURATIONS]; /* jiffies */
>  	bool duration_adjusted;
> +	unsigned int timeout_wait_stat_min; /* usecs */
> +	unsigned int timeout_wait_stat_max; /* usecs */
>  
>  	struct dentry *bios_dir[TPM_NUM_EVENT_LOG_FILES];
>  
> @@ -269,6 +271,7 @@ enum tpm2_cc_attrs {
>  #define TPM_VID_INTEL    0x8086
>  #define TPM_VID_WINBOND  0x1050
>  #define TPM_VID_STM      0x104A
> +#define TPM_VID_ATML     0x1114
>  
>  enum tpm_chip_flags {
>  	TPM_CHIP_FLAG_TPM2		= BIT(1),
> -- 
> 2.29.0.vfs.0.0
> 
> 

/Jarkko

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v3] tpm: fix Atmel TPM crash caused by too frequent queries
  2021-07-27  2:46         ` Jarkko Sakkinen
@ 2021-07-27  3:40           ` Hao Wu
  0 siblings, 0 replies; 47+ messages in thread
From: Hao Wu @ 2021-07-27  3:40 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Shrihari Kalkar, Seungyeop Han, Anish Jhaveri, peterhuewe, jgg,
	linux-integrity, pmenzel, kgold, zohar, why2jjj.linux, hamza,
	gregkh, arnd, nayna, James.Bottomley

> On Jul 26, 2021, at 7:46 PM, Jarkko Sakkinen <jarkko@kernel.org> wrote:
> 
> On Sun, Jul 11, 2021 at 12:51:22AM -0700, Hao Wu wrote:
>> The Atmel TPM 1.2 chips crash with error
>> `tpm_try_transmit: send(): error -62` since kernel 4.14.
>> It is observed from the kernel log after running `tpm_sealdata -z`.
>> The error thrown from the command is as follows
>> ```
>> $ tpm_sealdata -z
>> Tspi_Key_LoadKey failed: 0x00001087 - layer=tddl,
>> code=0087 (135), I/O error
>> ```
>> 
>> The issue was reproduced with the following Atmel TPM chip:
>> ```
>> $ tpm_version
>> T0  TPM 1.2 Version Info:
>>  Chip Version:        1.2.66.1
>>  Spec Level:          2
>>  Errata Revision:     3
>>  TPM Vendor ID:       ATML
>>  TPM Version:         01010000
>>  Manufacturer Info:   41544d4c
>> ```
>> 
>> The root cause of the issue is due to the TPM calls to msleep()
>> were replaced with usleep_range() [1], which reduces
>> the actual timeout. Via experiments, it is observed that
>> the original msleep(5) actually sleeps for 15ms.
>> Because of a known timeout issue in Atmel TPM 1.2 chip,
>> the shorter timeout than 15ms can cause the error described above.
>> 
>> A few further changes in kernel 4.16 [2] and 4.18 [3, 4] further
>> reduced the timeout to less than 1ms. With experiments,
>> the problematic timeout in the latest kernel is the one
>> for `wait_for_tpm_stat`.
>> 
>> To fix it, the patch reverts the timeout of `wait_for_tpm_stat`
>> to 15ms for all Atmel TPM 1.2 chips, but leave it untouched
>> for Ateml TPM 2.0 chip, and chips from other vendors.
>> As explained above, the chosen 15ms timeout is
>> the actual timeout before this issue introduced,
>> thus the old value is used here.
>> Particularly, TPM_ATML_TIMEOUT_WAIT_STAT_MIN is set to 14700us,
>> TPM_ATML_TIMEOUT_WAIT_STAT_MIN is set to 15000us according to
>> the existing TPM_TIMEOUT_RANGE_US (300us).
>> The fixed has been tested in the system with the affected Atmel chip
>> with no issues observed after boot up.
>> 
>> References:
>> [1] 9f3fc7bcddcb tpm: replace msleep() with usleep_range() in TPM
>> 1.2/2.0 generic drivers
>> [2] cf151a9a44d5 tpm: reduce tpm polling delay in tpm_tis_core
>> [3] 59f5a6b07f64 tpm: reduce poll sleep time in tpm_transmit()
>> [4] 424eaf910c32 tpm: reduce polling time to usecs for even finer
>> granularity
>> 
>> Fixes: 9f3fc7bcddcb ("tpm: replace msleep() with usleep_range() in TPM 1.2/2.0 generic drivers")
>> Link: https://patchwork.kernel.org/project/linux-integrity/patch/20200926223150.109645-1-hao.wu@rubrik.com/
>> Signed-off-by: Hao Wu <hao.wu@rubrik.com>
>> ---
>> This version (v3) removes unnecessary condition check
>> in `wait_for_tpm_stat`.
> 
> Missing change long v1 -> v2.
> 
> Please do something like
> 
> v3:
> - ...
> 
> v2:
> - ...
Ok, I thought it is chained. I will add all changes

>> 
>> Test Plan:
>> - Run fixed kernel with ATMEL TPM chips and see crash
>> has been fixed.
>> - Run fixed kernel with non-ATMEL TPM chips, and confirm
>> the timeout has not been changed.
>> 
>> drivers/char/tpm/tpm.h          |  6 ++++--
>> drivers/char/tpm/tpm_tis_core.c | 14 ++++++++++++--
>> include/linux/tpm.h             |  3 +++
>> 3 files changed, 19 insertions(+), 4 deletions(-)
>> 
>> diff --git a/drivers/char/tpm/tpm.h b/drivers/char/tpm/tpm.h
>> index 283f78211c3a..6de1b44c4aab 100644
>> --- a/drivers/char/tpm/tpm.h
>> +++ b/drivers/char/tpm/tpm.h
>> @@ -41,8 +41,10 @@ enum tpm_timeout {
>> 	TPM_TIMEOUT_RETRY = 100, /* msecs */
>> 	TPM_TIMEOUT_RANGE_US = 300,	/* usecs */
>> 	TPM_TIMEOUT_POLL = 1,	/* msecs */
>> -	TPM_TIMEOUT_USECS_MIN = 100,      /* usecs */
>> -	TPM_TIMEOUT_USECS_MAX = 500      /* usecs */
>> +	TPM_TIMEOUT_USECS_MIN = 100,	/* usecs */
The spaces of this this line is wrong, it should use tab instead of spaces before the `/*`. 
Fixing this by the way here and align the comment 
>> +	TPM_TIMEOUT_USECS_MAX = 500,	/* usecs */
You need to add tailing comma, don’t we?
> What is going on here?
> 
> These lines should not change.

> 
>> +	TPM_ATML_TIMEOUT_WAIT_STAT_MIN = 14700,	/* usecs */
>> +	TPM_ATML_TIMEOUT_WAIT_STAT_MAX = 15000	/* usecs */
> 
> Move these definitions to tpm_tis_core.h. They are only useful
> for a single driver.
I thought putting them along with the original MIN / MAX is easier to understand for code reader. 
Let me know if you have strong opinion though.

> 
>> };
>> 
>> /* TPM addresses */
>> diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
>> index 55b9d3965ae1..2de1f71e8ae1 100644
>> --- a/drivers/char/tpm/tpm_tis_core.c
>> +++ b/drivers/char/tpm/tpm_tis_core.c
>> @@ -80,8 +80,8 @@ static int wait_for_tpm_stat(struct tpm_chip *chip, u8 mask,
>> 		}
>> 	} else {
>> 		do {
>> -			usleep_range(TPM_TIMEOUT_USECS_MIN,
>> -				     TPM_TIMEOUT_USECS_MAX);
>> +			usleep_range(chip->timeout_wait_stat_min,
>> +				     chip->timeout_wait_stat_max);
>> 			status = chip->ops->status(chip);
>> 			if ((status & mask) == mask)
>> 				return 0;
>> @@ -934,6 +934,9 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
>> 	chip->timeout_b = msecs_to_jiffies(TIS_TIMEOUT_B_MAX);
>> 	chip->timeout_c = msecs_to_jiffies(TIS_TIMEOUT_C_MAX);
>> 	chip->timeout_d = msecs_to_jiffies(TIS_TIMEOUT_D_MAX);
>> +	/* init timeouts for wait_for_tpm_stat */
> 
> Remove this comment.
Ok

> 
>> +	chip->timeout_wait_stat_min = TPM_TIMEOUT_USECS_MIN;
>> +	chip->timeout_wait_stat_max = TPM_TIMEOUT_USECS_MAX;
>> 	priv->phy_ops = phy_ops;
>> 	dev_set_drvdata(&chip->dev, priv);
>> 
>> @@ -983,6 +986,13 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
>> 
>> 	priv->manufacturer_id = vendor;
>> 
>> +	if (priv->manufacturer_id == TPM_VID_ATML &&
>> +		!(chip->flags & TPM_CHIP_FLAG_TPM2)) {
>> +		/* If TPM chip is 1.2 ATMEL chip, timeout need to be relaxed*/
>> +		chip->timeout_wait_stat_min = TPM_ATML_TIMEOUT_WAIT_STAT_MIN;
>> +		chip->timeout_wait_stat_max = TPM_ATML_TIMEOUT_WAIT_STAT_MAX;
>> +	}
>> +
>> 	rc = tpm_tis_read8(priv, TPM_RID(0), &rid);
>> 	if (rc < 0)
>> 		goto out_err;
>> diff --git a/include/linux/tpm.h b/include/linux/tpm.h
>> index aa11fe323c56..171b9102c976 100644
>> --- a/include/linux/tpm.h
>> +++ b/include/linux/tpm.h
>> @@ -150,6 +150,8 @@ struct tpm_chip {
>> 	bool timeout_adjusted;
>> 	unsigned long duration[TPM_NUM_DURATIONS]; /* jiffies */
>> 	bool duration_adjusted;
>> +	unsigned int timeout_wait_stat_min; /* usecs */
>> +	unsigned int timeout_wait_stat_max; /* usecs */
>> 
>> 	struct dentry *bios_dir[TPM_NUM_EVENT_LOG_FILES];
>> 
>> @@ -269,6 +271,7 @@ enum tpm2_cc_attrs {
>> #define TPM_VID_INTEL    0x8086
>> #define TPM_VID_WINBOND  0x1050
>> #define TPM_VID_STM      0x104A
>> +#define TPM_VID_ATML     0x1114
>> 
>> enum tpm_chip_flags {
>> 	TPM_CHIP_FLAG_TPM2		= BIT(1),
>> -- 
>> 2.29.0.vfs.0.0
>> 
>> 
> 
> /Jarkko

Hao

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH v4] tpm: fix Atmel TPM crash caused by too frequent queries
  2021-07-11  7:51       ` [PATCH v3] " Hao Wu
  2021-07-27  2:46         ` Jarkko Sakkinen
@ 2021-08-14 22:25         ` Hao Wu
  2021-08-26  5:38           ` Hao Wu
  2021-09-05  3:51           ` [PATCH v5] " Hao Wu
  1 sibling, 2 replies; 47+ messages in thread
From: Hao Wu @ 2021-08-14 22:25 UTC (permalink / raw)
  To: hao.wu, shrihari.kalkar, seungyeop.han, anish.jhaveri,
	peterhuewe, jarkko, jgg, linux-integrity, pmenzel, kgold, zohar,
	why2jjj.linux, hamza, gregkh, arnd, nayna, James.Bottomley

The Atmel TPM 1.2 chips crash with error
`tpm_try_transmit: send(): error -62` since kernel 4.14.
It is observed from the kernel log after running `tpm_sealdata -z`.
The error thrown from the command is as follows
```
$ tpm_sealdata -z
Tspi_Key_LoadKey failed: 0x00001087 - layer=tddl,
code=0087 (135), I/O error
```

The issue was reproduced with the following Atmel TPM chip:
```
$ tpm_version
T0  TPM 1.2 Version Info:
  Chip Version:        1.2.66.1
  Spec Level:          2
  Errata Revision:     3
  TPM Vendor ID:       ATML
  TPM Version:         01010000
  Manufacturer Info:   41544d4c
```

The root cause of the issue is due to the TPM calls to msleep()
were replaced with usleep_range() [1], which reduces
the actual timeout. Via experiments, it is observed that
the original msleep(5) actually sleeps for 15ms.
Because of a known timeout issue in Atmel TPM 1.2 chip,
the shorter timeout than 15ms can cause the error described above.

A few further changes in kernel 4.16 [2] and 4.18 [3, 4] further
reduced the timeout to less than 1ms. With experiments,
the problematic timeout in the latest kernel is the one
for `wait_for_tpm_stat`.

To fix it, the patch reverts the timeout of `wait_for_tpm_stat`
to 15ms for all Atmel TPM 1.2 chips, but leave it untouched
for Ateml TPM 2.0 chip, and chips from other vendors.
As explained above, the chosen 15ms timeout is
the actual timeout before this issue introduced,
thus the old value is used here.
Particularly, TPM_ATML_TIMEOUT_WAIT_STAT_MIN is set to 14700us,
TPM_ATML_TIMEOUT_WAIT_STAT_MIN is set to 15000us according to
the existing TPM_TIMEOUT_RANGE_US (300us).
The fixed has been tested in the system with the affected Atmel chip
with no issues observed after boot up.

References:
[1] 9f3fc7bcddcb tpm: replace msleep() with usleep_range() in TPM
1.2/2.0 generic drivers
[2] cf151a9a44d5 tpm: reduce tpm polling delay in tpm_tis_core
[3] 59f5a6b07f64 tpm: reduce poll sleep time in tpm_transmit()
[4] 424eaf910c32 tpm: reduce polling time to usecs for even finer
granularity

Fixes: 9f3fc7bcddcb ("tpm: replace msleep() with usleep_range() in TPM 1.2/2.0 generic drivers")
Link: https://patchwork.kernel.org/project/linux-integrity/patch/20200926223150.109645-1-hao.wu@rubrik.com/
Signed-off-by: Hao Wu <hao.wu@rubrik.com>
---
v4:
- Move timeout constants to drivers/char/tpm/tpm_tis_core.h
- Cleanup unnecessary inline comment

v3:
- removes unnecessary condition check in `wait_for_tpm_stat`

v2:
- follow the existing way to define two timeouts (min and max)
  for ATMEL chip, thus keep the exact timeout logic for 
  non-ATEML chips.
- limit the timeout increase to only ATMEL TPM 1.2 chips,
  because it is not an issue for TPM 2.0 chips yet.

Test Plan:
- Run fixed kernel with ATMEL TPM chips and see crash
has been fixed.
- Run fixed kernel with non-ATMEL TPM chips, and confirm
the timeout has not been changed.

 drivers/char/tpm/tpm_tis_core.c | 13 +++++++++++--
 drivers/char/tpm/tpm_tis_core.h |  2 ++
 include/linux/tpm.h             |  3 +++
 3 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
index 55b9d3965ae1..24605f100e96 100644
--- a/drivers/char/tpm/tpm_tis_core.c
+++ b/drivers/char/tpm/tpm_tis_core.c
@@ -80,8 +80,8 @@ static int wait_for_tpm_stat(struct tpm_chip *chip, u8 mask,
 		}
 	} else {
 		do {
-			usleep_range(TPM_TIMEOUT_USECS_MIN,
-				     TPM_TIMEOUT_USECS_MAX);
+			usleep_range(chip->timeout_wait_stat_min,
+				     chip->timeout_wait_stat_max);
 			status = chip->ops->status(chip);
 			if ((status & mask) == mask)
 				return 0;
@@ -934,6 +934,8 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
 	chip->timeout_b = msecs_to_jiffies(TIS_TIMEOUT_B_MAX);
 	chip->timeout_c = msecs_to_jiffies(TIS_TIMEOUT_C_MAX);
 	chip->timeout_d = msecs_to_jiffies(TIS_TIMEOUT_D_MAX);
+	chip->timeout_wait_stat_min = TPM_TIMEOUT_USECS_MIN;
+	chip->timeout_wait_stat_max = TPM_TIMEOUT_USECS_MAX;
 	priv->phy_ops = phy_ops;
 	dev_set_drvdata(&chip->dev, priv);
 
@@ -983,6 +985,13 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
 
 	priv->manufacturer_id = vendor;
 
+	if (priv->manufacturer_id == TPM_VID_ATML &&
+		!(chip->flags & TPM_CHIP_FLAG_TPM2)) {
+		/* If TPM chip is 1.2 ATMEL chip, timeout need to be relaxed*/
+		chip->timeout_wait_stat_min = TPM_ATML_TIMEOUT_WAIT_STAT_MIN;
+		chip->timeout_wait_stat_max = TPM_ATML_TIMEOUT_WAIT_STAT_MAX;
+	}
+
 	rc = tpm_tis_read8(priv, TPM_RID(0), &rid);
 	if (rc < 0)
 		goto out_err;
diff --git a/drivers/char/tpm/tpm_tis_core.h b/drivers/char/tpm/tpm_tis_core.h
index 9b2d32a59f67..2e431beb44f7 100644
--- a/drivers/char/tpm/tpm_tis_core.h
+++ b/drivers/char/tpm/tpm_tis_core.h
@@ -54,6 +54,8 @@ enum tis_defaults {
 	TIS_MEM_LEN = 0x5000,
 	TIS_SHORT_TIMEOUT = 750,	/* ms */
 	TIS_LONG_TIMEOUT = 2000,	/* 2 sec */
+	TPM_ATML_TIMEOUT_WAIT_STAT_MIN = 14700,	/* usecs */
+	TPM_ATML_TIMEOUT_WAIT_STAT_MAX = 15000,	/* usecs */
 };
 
 /* Some timeout values are needed before it is known whether the chip is
diff --git a/include/linux/tpm.h b/include/linux/tpm.h
index aa11fe323c56..171b9102c976 100644
--- a/include/linux/tpm.h
+++ b/include/linux/tpm.h
@@ -150,6 +150,8 @@ struct tpm_chip {
 	bool timeout_adjusted;
 	unsigned long duration[TPM_NUM_DURATIONS]; /* jiffies */
 	bool duration_adjusted;
+	unsigned int timeout_wait_stat_min; /* usecs */
+	unsigned int timeout_wait_stat_max; /* usecs */
 
 	struct dentry *bios_dir[TPM_NUM_EVENT_LOG_FILES];
 
@@ -269,6 +271,7 @@ enum tpm2_cc_attrs {
 #define TPM_VID_INTEL    0x8086
 #define TPM_VID_WINBOND  0x1050
 #define TPM_VID_STM      0x104A
+#define TPM_VID_ATML     0x1114
 
 enum tpm_chip_flags {
 	TPM_CHIP_FLAG_TPM2		= BIT(1),
-- 
2.29.0.vfs.0.0


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* Re: [PATCH v4] tpm: fix Atmel TPM crash caused by too frequent queries
  2021-08-14 22:25         ` [PATCH v4] " Hao Wu
@ 2021-08-26  5:38           ` Hao Wu
  2021-08-26 16:24             ` Jarkko Sakkinen
  2021-09-05  3:51           ` [PATCH v5] " Hao Wu
  1 sibling, 1 reply; 47+ messages in thread
From: Hao Wu @ 2021-08-26  5:38 UTC (permalink / raw)
  To: Hao Wu, Shrihari Kalkar, Han Seungyeop, Anish Jhaveri,
	peterhuewe, Jarkko Sakkinen, jgg, linux-integrity, Paul Menzel,
	Ken Goldman, zohar, why2jjj.linux, Hamza Attak, gregkh, arnd,
	Nayna, James Bottomley

> On Aug 14, 2021, at 3:25 PM, Hao Wu <hao.wu@rubrik.com> wrote:
> 
> The Atmel TPM 1.2 chips crash with error
> `tpm_try_transmit: send(): error -62` since kernel 4.14.
> It is observed from the kernel log after running `tpm_sealdata -z`.
> The error thrown from the command is as follows
> ```
> $ tpm_sealdata -z
> Tspi_Key_LoadKey failed: 0x00001087 - layer=tddl,
> code=0087 (135), I/O error
> ```
> 
> The issue was reproduced with the following Atmel TPM chip:
> ```
> $ tpm_version
> T0  TPM 1.2 Version Info:
>  Chip Version:        1.2.66.1
>  Spec Level:          2
>  Errata Revision:     3
>  TPM Vendor ID:       ATML
>  TPM Version:         01010000
>  Manufacturer Info:   41544d4c
> ```
> 
> The root cause of the issue is due to the TPM calls to msleep()
> were replaced with usleep_range() [1], which reduces
> the actual timeout. Via experiments, it is observed that
> the original msleep(5) actually sleeps for 15ms.
> Because of a known timeout issue in Atmel TPM 1.2 chip,
> the shorter timeout than 15ms can cause the error described above.
> 
> A few further changes in kernel 4.16 [2] and 4.18 [3, 4] further
> reduced the timeout to less than 1ms. With experiments,
> the problematic timeout in the latest kernel is the one
> for `wait_for_tpm_stat`.
> 
> To fix it, the patch reverts the timeout of `wait_for_tpm_stat`
> to 15ms for all Atmel TPM 1.2 chips, but leave it untouched
> for Ateml TPM 2.0 chip, and chips from other vendors.
> As explained above, the chosen 15ms timeout is
> the actual timeout before this issue introduced,
> thus the old value is used here.
> Particularly, TPM_ATML_TIMEOUT_WAIT_STAT_MIN is set to 14700us,
> TPM_ATML_TIMEOUT_WAIT_STAT_MIN is set to 15000us according to
> the existing TPM_TIMEOUT_RANGE_US (300us).
> The fixed has been tested in the system with the affected Atmel chip
> with no issues observed after boot up.
> 
> References:
> [1] 9f3fc7bcddcb tpm: replace msleep() with usleep_range() in TPM
> 1.2/2.0 generic drivers
> [2] cf151a9a44d5 tpm: reduce tpm polling delay in tpm_tis_core
> [3] 59f5a6b07f64 tpm: reduce poll sleep time in tpm_transmit()
> [4] 424eaf910c32 tpm: reduce polling time to usecs for even finer
> granularity
> 
> Fixes: 9f3fc7bcddcb ("tpm: replace msleep() with usleep_range() in TPM 1.2/2.0 generic drivers")
> Link: https://patchwork.kernel.org/project/linux-integrity/patch/20200926223150.109645-1-hao.wu@rubrik.com/
> Signed-off-by: Hao Wu <hao.wu@rubrik.com>
> ---
> v4:
> - Move timeout constants to drivers/char/tpm/tpm_tis_core.h
> - Cleanup unnecessary inline comment
> 
> v3:
> - removes unnecessary condition check in `wait_for_tpm_stat`
> 
> v2:
> - follow the existing way to define two timeouts (min and max)
>  for ATMEL chip, thus keep the exact timeout logic for 
>  non-ATEML chips.
> - limit the timeout increase to only ATMEL TPM 1.2 chips,
>  because it is not an issue for TPM 2.0 chips yet.
> 
> Test Plan:
> - Run fixed kernel with ATMEL TPM chips and see crash
> has been fixed.
> - Run fixed kernel with non-ATMEL TPM chips, and confirm
> the timeout has not been changed.
> 
> drivers/char/tpm/tpm_tis_core.c | 13 +++++++++++--
> drivers/char/tpm/tpm_tis_core.h |  2 ++
> include/linux/tpm.h             |  3 +++
> 3 files changed, 16 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
> index 55b9d3965ae1..24605f100e96 100644
> --- a/drivers/char/tpm/tpm_tis_core.c
> +++ b/drivers/char/tpm/tpm_tis_core.c
> @@ -80,8 +80,8 @@ static int wait_for_tpm_stat(struct tpm_chip *chip, u8 mask,
> 		}
> 	} else {
> 		do {
> -			usleep_range(TPM_TIMEOUT_USECS_MIN,
> -				     TPM_TIMEOUT_USECS_MAX);
> +			usleep_range(chip->timeout_wait_stat_min,
> +				     chip->timeout_wait_stat_max);
> 			status = chip->ops->status(chip);
> 			if ((status & mask) == mask)
> 				return 0;
> @@ -934,6 +934,8 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
> 	chip->timeout_b = msecs_to_jiffies(TIS_TIMEOUT_B_MAX);
> 	chip->timeout_c = msecs_to_jiffies(TIS_TIMEOUT_C_MAX);
> 	chip->timeout_d = msecs_to_jiffies(TIS_TIMEOUT_D_MAX);
> +	chip->timeout_wait_stat_min = TPM_TIMEOUT_USECS_MIN;
> +	chip->timeout_wait_stat_max = TPM_TIMEOUT_USECS_MAX;
> 	priv->phy_ops = phy_ops;
> 	dev_set_drvdata(&chip->dev, priv);
> 
> @@ -983,6 +985,13 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
> 
> 	priv->manufacturer_id = vendor;
> 
> +	if (priv->manufacturer_id == TPM_VID_ATML &&
> +		!(chip->flags & TPM_CHIP_FLAG_TPM2)) {
> +		/* If TPM chip is 1.2 ATMEL chip, timeout need to be relaxed*/
> +		chip->timeout_wait_stat_min = TPM_ATML_TIMEOUT_WAIT_STAT_MIN;
> +		chip->timeout_wait_stat_max = TPM_ATML_TIMEOUT_WAIT_STAT_MAX;
> +	}
> +
> 	rc = tpm_tis_read8(priv, TPM_RID(0), &rid);
> 	if (rc < 0)
> 		goto out_err;
> diff --git a/drivers/char/tpm/tpm_tis_core.h b/drivers/char/tpm/tpm_tis_core.h
> index 9b2d32a59f67..2e431beb44f7 100644
> --- a/drivers/char/tpm/tpm_tis_core.h
> +++ b/drivers/char/tpm/tpm_tis_core.h
> @@ -54,6 +54,8 @@ enum tis_defaults {
> 	TIS_MEM_LEN = 0x5000,
> 	TIS_SHORT_TIMEOUT = 750,	/* ms */
> 	TIS_LONG_TIMEOUT = 2000,	/* 2 sec */
> +	TPM_ATML_TIMEOUT_WAIT_STAT_MIN = 14700,	/* usecs */
> +	TPM_ATML_TIMEOUT_WAIT_STAT_MAX = 15000,	/* usecs */
> };
> 
> /* Some timeout values are needed before it is known whether the chip is
> diff --git a/include/linux/tpm.h b/include/linux/tpm.h
> index aa11fe323c56..171b9102c976 100644
> --- a/include/linux/tpm.h
> +++ b/include/linux/tpm.h
> @@ -150,6 +150,8 @@ struct tpm_chip {
> 	bool timeout_adjusted;
> 	unsigned long duration[TPM_NUM_DURATIONS]; /* jiffies */
> 	bool duration_adjusted;
> +	unsigned int timeout_wait_stat_min; /* usecs */
> +	unsigned int timeout_wait_stat_max; /* usecs */
> 
> 	struct dentry *bios_dir[TPM_NUM_EVENT_LOG_FILES];
> 
> @@ -269,6 +271,7 @@ enum tpm2_cc_attrs {
> #define TPM_VID_INTEL    0x8086
> #define TPM_VID_WINBOND  0x1050
> #define TPM_VID_STM      0x104A
> +#define TPM_VID_ATML     0x1114
> 
> enum tpm_chip_flags {
> 	TPM_CHIP_FLAG_TPM2		= BIT(1),
> -- 
> 2.29.0.vfs.0.0
> 

Just kindly remind this code review in case it has been missed somehow

Thanks
Hao


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4] tpm: fix Atmel TPM crash caused by too frequent queries
  2021-08-26  5:38           ` Hao Wu
@ 2021-08-26 16:24             ` Jarkko Sakkinen
  2021-08-27  0:35               ` Hao Wu
  0 siblings, 1 reply; 47+ messages in thread
From: Jarkko Sakkinen @ 2021-08-26 16:24 UTC (permalink / raw)
  To: Hao Wu, Shrihari Kalkar, Han Seungyeop, Anish Jhaveri,
	peterhuewe, jgg, linux-integrity, Paul Menzel, Ken Goldman,
	zohar, why2jjj.linux, Hamza Attak, gregkh, arnd, Nayna,
	James Bottomley

On Wed, 2021-08-25 at 22:38 -0700, Hao Wu wrote:
> > On Aug 14, 2021, at 3:25 PM, Hao Wu <hao.wu@rubrik.com> wrote:
> > 
> > The Atmel TPM 1.2 chips crash with error
> > `tpm_try_transmit: send(): error -62` since kernel 4.14.
> > It is observed from the kernel log after running `tpm_sealdata -z`.
> > The error thrown from the command is as follows
> > ```
> > $ tpm_sealdata -z
> > Tspi_Key_LoadKey failed: 0x00001087 - layer=tddl,
> > code=0087 (135), I/O error
> > ```
> > 
> > The issue was reproduced with the following Atmel TPM chip:
> > ```
> > $ tpm_version
> > T0  TPM 1.2 Version Info:
> >  Chip Version:        1.2.66.1
> >  Spec Level:          2
> >  Errata Revision:     3
> >  TPM Vendor ID:       ATML
> >  TPM Version:         01010000
> >  Manufacturer Info:   41544d4c
> > ```
> > 
> > The root cause of the issue is due to the TPM calls to msleep()
> > were replaced with usleep_range() [1], which reduces
> > the actual timeout. Via experiments, it is observed that
> > the original msleep(5) actually sleeps for 15ms.
> > Because of a known timeout issue in Atmel TPM 1.2 chip,
> > the shorter timeout than 15ms can cause the error described above.
> > 
> > A few further changes in kernel 4.16 [2] and 4.18 [3, 4] further
> > reduced the timeout to less than 1ms. With experiments,
> > the problematic timeout in the latest kernel is the one
> > for `wait_for_tpm_stat`.
> > 
> > To fix it, the patch reverts the timeout of `wait_for_tpm_stat`
> > to 15ms for all Atmel TPM 1.2 chips, but leave it untouched
> > for Ateml TPM 2.0 chip, and chips from other vendors.
> > As explained above, the chosen 15ms timeout is
> > the actual timeout before this issue introduced,
> > thus the old value is used here.
> > Particularly, TPM_ATML_TIMEOUT_WAIT_STAT_MIN is set to 14700us,
> > TPM_ATML_TIMEOUT_WAIT_STAT_MIN is set to 15000us according to
> > the existing TPM_TIMEOUT_RANGE_US (300us).
> > The fixed has been tested in the system with the affected Atmel chip
> > with no issues observed after boot up.
> > 
> > References:
> > [1] 9f3fc7bcddcb tpm: replace msleep() with usleep_range() in TPM
> > 1.2/2.0 generic drivers
> > [2] cf151a9a44d5 tpm: reduce tpm polling delay in tpm_tis_core
> > [3] 59f5a6b07f64 tpm: reduce poll sleep time in tpm_transmit()
> > [4] 424eaf910c32 tpm: reduce polling time to usecs for even finer
> > granularity
> > 
> > Fixes: 9f3fc7bcddcb ("tpm: replace msleep() with usleep_range() in TPM 1.2/2.0 generic drivers")
> > Link: https://patchwork.kernel.org/project/linux-integrity/patch/20200926223150.109645-1-hao.wu@rubrik.com/
> > Signed-off-by: Hao Wu <hao.wu@rubrik.com>
> > ---
> > v4:
> > - Move timeout constants to drivers/char/tpm/tpm_tis_core.h
> > - Cleanup unnecessary inline comment
> > 
> > v3:
> > - removes unnecessary condition check in `wait_for_tpm_stat`
> > 
> > v2:
> > - follow the existing way to define two timeouts (min and max)
> >  for ATMEL chip, thus keep the exact timeout logic for 
> >  non-ATEML chips.
> > - limit the timeout increase to only ATMEL TPM 1.2 chips,
> >  because it is not an issue for TPM 2.0 chips yet.
> > 
> > Test Plan:
> > - Run fixed kernel with ATMEL TPM chips and see crash
> > has been fixed.
> > - Run fixed kernel with non-ATMEL TPM chips, and confirm
> > the timeout has not been changed.
> > 
> > drivers/char/tpm/tpm_tis_core.c | 13 +++++++++++--
> > drivers/char/tpm/tpm_tis_core.h |  2 ++
> > include/linux/tpm.h             |  3 +++
> > 3 files changed, 16 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
> > index 55b9d3965ae1..24605f100e96 100644
> > --- a/drivers/char/tpm/tpm_tis_core.c
> > +++ b/drivers/char/tpm/tpm_tis_core.c
> > @@ -80,8 +80,8 @@ static int wait_for_tpm_stat(struct tpm_chip *chip, u8 mask,
> > 		}
> > 	} else {
> > 		do {
> > -			usleep_range(TPM_TIMEOUT_USECS_MIN,
> > -				     TPM_TIMEOUT_USECS_MAX);
> > +			usleep_range(chip->timeout_wait_stat_min,
> > +				     chip->timeout_wait_stat_max);
> > 			status = chip->ops->status(chip);
> > 			if ((status & mask) == mask)
> > 				return 0;
> > @@ -934,6 +934,8 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
> > 	chip->timeout_b = msecs_to_jiffies(TIS_TIMEOUT_B_MAX);
> > 	chip->timeout_c = msecs_to_jiffies(TIS_TIMEOUT_C_MAX);
> > 	chip->timeout_d = msecs_to_jiffies(TIS_TIMEOUT_D_MAX);
> > +	chip->timeout_wait_stat_min = TPM_TIMEOUT_USECS_MIN;
> > +	chip->timeout_wait_stat_max = TPM_TIMEOUT_USECS_MAX;
> > 	priv->phy_ops = phy_ops;
> > 	dev_set_drvdata(&chip->dev, priv);
> > 
> > @@ -983,6 +985,13 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
> > 
> > 	priv->manufacturer_id = vendor;
> > 
> > +	if (priv->manufacturer_id == TPM_VID_ATML &&
> > +		!(chip->flags & TPM_CHIP_FLAG_TPM2)) {
> > +		/* If TPM chip is 1.2 ATMEL chip, timeout need to be relaxed*/
> > +		chip->timeout_wait_stat_min = TPM_ATML_TIMEOUT_WAIT_STAT_MIN;
> > +		chip->timeout_wait_stat_max = TPM_ATML_TIMEOUT_WAIT_STAT_MAX;
> > +	}
> > +
> > 	rc = tpm_tis_read8(priv, TPM_RID(0), &rid);
> > 	if (rc < 0)
> > 		goto out_err;
> > diff --git a/drivers/char/tpm/tpm_tis_core.h b/drivers/char/tpm/tpm_tis_core.h
> > index 9b2d32a59f67..2e431beb44f7 100644
> > --- a/drivers/char/tpm/tpm_tis_core.h
> > +++ b/drivers/char/tpm/tpm_tis_core.h
> > @@ -54,6 +54,8 @@ enum tis_defaults {
> > 	TIS_MEM_LEN = 0x5000,
> > 	TIS_SHORT_TIMEOUT = 750,	/* ms */
> > 	TIS_LONG_TIMEOUT = 2000,	/* 2 sec */
> > +	TPM_ATML_TIMEOUT_WAIT_STAT_MIN = 14700,	/* usecs */
> > +	TPM_ATML_TIMEOUT_WAIT_STAT_MAX = 15000,	/* usecs */
> > };

I'd prefer TIS_TIMEOUT_{MIN, MAX}_ATML. I.e. no "WAIT_STAT" and without "TPM_"
to be consistent with other constants here.

> > 
> > /* Some timeout values are needed before it is known whether the chip is
> > diff --git a/include/linux/tpm.h b/include/linux/tpm.h
> > index aa11fe323c56..171b9102c976 100644
> > --- a/include/linux/tpm.h
> > +++ b/include/linux/tpm.h
> > @@ -150,6 +150,8 @@ struct tpm_chip {
> > 	bool timeout_adjusted;
> > 	unsigned long duration[TPM_NUM_DURATIONS]; /* jiffies */
> > 	bool duration_adjusted;
> > +	unsigned int timeout_wait_stat_min; /* usecs */
> > +	unsigned int timeout_wait_stat_max; /* usecs */

Please rename as timeout_{min, max}.

And I think tpm_chip is wrong place to put them as they are TIS
specific, i.e. they should be in tpm_tis_data.

> > 
> > 	struct dentry *bios_dir[TPM_NUM_EVENT_LOG_FILES];
> > 
> > @@ -269,6 +271,7 @@ enum tpm2_cc_attrs {
> > #define TPM_VID_INTEL    0x8086
> > #define TPM_VID_WINBOND  0x1050
> > #define TPM_VID_STM      0x104A
> > +#define TPM_VID_ATML     0x1114
> > 
> > enum tpm_chip_flags {
> > 	TPM_CHIP_FLAG_TPM2		= BIT(1),
> > -- 
> > 2.29.0.vfs.0.0
> > 
> 
> Just kindly remind this code review in case it has been missed somehow

I'm sorry, my bad. I managed to somehow miss this. Might be because
I've been recently reorganizing my email accounts. And thanks for
pinging so that I spotted it.

> Thanks
> Hao

/Jarkko


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4] tpm: fix Atmel TPM crash caused by too frequent queries
  2021-08-26 16:24             ` Jarkko Sakkinen
@ 2021-08-27  0:35               ` Hao Wu
  2021-09-04 21:14                 ` Hao Wu
  0 siblings, 1 reply; 47+ messages in thread
From: Hao Wu @ 2021-08-27  0:35 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Shrihari Kalkar, Han Seungyeop, Anish Jhaveri, peterhuewe, jgg,
	linux-integrity, Paul Menzel, Ken Goldman, zohar, why2jjj.linux,
	Hamza Attak, gregkh, arnd, Nayna, James Bottomley

> On Aug 26, 2021, at 9:24 AM, Jarkko Sakkinen <jarkko@kernel.org> wrote:
> 
> On Wed, 2021-08-25 at 22:38 -0700, Hao Wu wrote:
>>> On Aug 14, 2021, at 3:25 PM, Hao Wu <hao.wu@rubrik.com> wrote:
>>> 
>>> The Atmel TPM 1.2 chips crash with error
>>> `tpm_try_transmit: send(): error -62` since kernel 4.14.
>>> It is observed from the kernel log after running `tpm_sealdata -z`.
>>> The error thrown from the command is as follows
>>> ```
>>> $ tpm_sealdata -z
>>> Tspi_Key_LoadKey failed: 0x00001087 - layer=tddl,
>>> code=0087 (135), I/O error
>>> ```
>>> 
>>> The issue was reproduced with the following Atmel TPM chip:
>>> ```
>>> $ tpm_version
>>> T0  TPM 1.2 Version Info:
>>> Chip Version:        1.2.66.1
>>> Spec Level:          2
>>> Errata Revision:     3
>>> TPM Vendor ID:       ATML
>>> TPM Version:         01010000
>>> Manufacturer Info:   41544d4c
>>> ```
>>> 
>>> The root cause of the issue is due to the TPM calls to msleep()
>>> were replaced with usleep_range() [1], which reduces
>>> the actual timeout. Via experiments, it is observed that
>>> the original msleep(5) actually sleeps for 15ms.
>>> Because of a known timeout issue in Atmel TPM 1.2 chip,
>>> the shorter timeout than 15ms can cause the error described above.
>>> 
>>> A few further changes in kernel 4.16 [2] and 4.18 [3, 4] further
>>> reduced the timeout to less than 1ms. With experiments,
>>> the problematic timeout in the latest kernel is the one
>>> for `wait_for_tpm_stat`.
>>> 
>>> To fix it, the patch reverts the timeout of `wait_for_tpm_stat`
>>> to 15ms for all Atmel TPM 1.2 chips, but leave it untouched
>>> for Ateml TPM 2.0 chip, and chips from other vendors.
>>> As explained above, the chosen 15ms timeout is
>>> the actual timeout before this issue introduced,
>>> thus the old value is used here.
>>> Particularly, TPM_ATML_TIMEOUT_WAIT_STAT_MIN is set to 14700us,
>>> TPM_ATML_TIMEOUT_WAIT_STAT_MIN is set to 15000us according to
>>> the existing TPM_TIMEOUT_RANGE_US (300us).
>>> The fixed has been tested in the system with the affected Atmel chip
>>> with no issues observed after boot up.
>>> 
>>> References:
>>> [1] 9f3fc7bcddcb tpm: replace msleep() with usleep_range() in TPM
>>> 1.2/2.0 generic drivers
>>> [2] cf151a9a44d5 tpm: reduce tpm polling delay in tpm_tis_core
>>> [3] 59f5a6b07f64 tpm: reduce poll sleep time in tpm_transmit()
>>> [4] 424eaf910c32 tpm: reduce polling time to usecs for even finer
>>> granularity
>>> 
>>> Fixes: 9f3fc7bcddcb ("tpm: replace msleep() with usleep_range() in TPM 1.2/2.0 generic drivers")
>>> Link: https://patchwork.kernel.org/project/linux-integrity/patch/20200926223150.109645-1-hao.wu@rubrik.com/
>>> Signed-off-by: Hao Wu <hao.wu@rubrik.com>
>>> ---
>>> v4:
>>> - Move timeout constants to drivers/char/tpm/tpm_tis_core.h
>>> - Cleanup unnecessary inline comment
>>> 
>>> v3:
>>> - removes unnecessary condition check in `wait_for_tpm_stat`
>>> 
>>> v2:
>>> - follow the existing way to define two timeouts (min and max)
>>> for ATMEL chip, thus keep the exact timeout logic for 
>>> non-ATEML chips.
>>> - limit the timeout increase to only ATMEL TPM 1.2 chips,
>>> because it is not an issue for TPM 2.0 chips yet.
>>> 
>>> Test Plan:
>>> - Run fixed kernel with ATMEL TPM chips and see crash
>>> has been fixed.
>>> - Run fixed kernel with non-ATMEL TPM chips, and confirm
>>> the timeout has not been changed.
>>> 
>>> drivers/char/tpm/tpm_tis_core.c | 13 +++++++++++--
>>> drivers/char/tpm/tpm_tis_core.h |  2 ++
>>> include/linux/tpm.h             |  3 +++
>>> 3 files changed, 16 insertions(+), 2 deletions(-)
>>> 
>>> diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
>>> index 55b9d3965ae1..24605f100e96 100644
>>> --- a/drivers/char/tpm/tpm_tis_core.c
>>> +++ b/drivers/char/tpm/tpm_tis_core.c
>>> @@ -80,8 +80,8 @@ static int wait_for_tpm_stat(struct tpm_chip *chip, u8 mask,
>>> 		}
>>> 	} else {
>>> 		do {
>>> -			usleep_range(TPM_TIMEOUT_USECS_MIN,
>>> -				     TPM_TIMEOUT_USECS_MAX);
>>> +			usleep_range(chip->timeout_wait_stat_min,
>>> +				     chip->timeout_wait_stat_max);
>>> 			status = chip->ops->status(chip);
>>> 			if ((status & mask) == mask)
>>> 				return 0;
>>> @@ -934,6 +934,8 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
>>> 	chip->timeout_b = msecs_to_jiffies(TIS_TIMEOUT_B_MAX);
>>> 	chip->timeout_c = msecs_to_jiffies(TIS_TIMEOUT_C_MAX);
>>> 	chip->timeout_d = msecs_to_jiffies(TIS_TIMEOUT_D_MAX);
>>> +	chip->timeout_wait_stat_min = TPM_TIMEOUT_USECS_MIN;
>>> +	chip->timeout_wait_stat_max = TPM_TIMEOUT_USECS_MAX;
>>> 	priv->phy_ops = phy_ops;
>>> 	dev_set_drvdata(&chip->dev, priv);
>>> 
>>> @@ -983,6 +985,13 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
>>> 
>>> 	priv->manufacturer_id = vendor;
>>> 
>>> +	if (priv->manufacturer_id == TPM_VID_ATML &&
>>> +		!(chip->flags & TPM_CHIP_FLAG_TPM2)) {
>>> +		/* If TPM chip is 1.2 ATMEL chip, timeout need to be relaxed*/
>>> +		chip->timeout_wait_stat_min = TPM_ATML_TIMEOUT_WAIT_STAT_MIN;
>>> +		chip->timeout_wait_stat_max = TPM_ATML_TIMEOUT_WAIT_STAT_MAX;
>>> +	}
>>> +
>>> 	rc = tpm_tis_read8(priv, TPM_RID(0), &rid);
>>> 	if (rc < 0)
>>> 		goto out_err;
>>> diff --git a/drivers/char/tpm/tpm_tis_core.h b/drivers/char/tpm/tpm_tis_core.h
>>> index 9b2d32a59f67..2e431beb44f7 100644
>>> --- a/drivers/char/tpm/tpm_tis_core.h
>>> +++ b/drivers/char/tpm/tpm_tis_core.h
>>> @@ -54,6 +54,8 @@ enum tis_defaults {
>>> 	TIS_MEM_LEN = 0x5000,
>>> 	TIS_SHORT_TIMEOUT = 750,	/* ms */
>>> 	TIS_LONG_TIMEOUT = 2000,	/* 2 sec */
>>> +	TPM_ATML_TIMEOUT_WAIT_STAT_MIN = 14700,	/* usecs */
>>> +	TPM_ATML_TIMEOUT_WAIT_STAT_MAX = 15000,	/* usecs */
>>> };
> 
> I'd prefer TIS_TIMEOUT_{MIN, MAX}_ATML. I.e. no "WAIT_STAT" and without "TPM_"
> to be consistent with other constants here.
Ok will do
> 
>>> 
>>> /* Some timeout values are needed before it is known whether the chip is
>>> diff --git a/include/linux/tpm.h b/include/linux/tpm.h
>>> index aa11fe323c56..171b9102c976 100644
>>> --- a/include/linux/tpm.h
>>> +++ b/include/linux/tpm.h
>>> @@ -150,6 +150,8 @@ struct tpm_chip {
>>> 	bool timeout_adjusted;
>>> 	unsigned long duration[TPM_NUM_DURATIONS]; /* jiffies */
>>> 	bool duration_adjusted;
>>> +	unsigned int timeout_wait_stat_min; /* usecs */
>>> +	unsigned int timeout_wait_stat_max; /* usecs */
> 
> Please rename as timeout_{min, max}.
Ok will do
> 
> And I think tpm_chip is wrong place to put them as they are TIS
> specific, i.e. they should be in tpm_tis_data.
Sorry, I am not familiar with tpm_tis_data, could tell the the place that you want me to put the var? 
I think I may have hard time to move forward according toward this comment due to bandwidth constraints.
Some helps would be appreciated. 

Is tpm_tis_data something specific to a chip instance ? Given the values are tied to chip,
we need chip specific instance to make this work.

> 
>>> 
>>> 	struct dentry *bios_dir[TPM_NUM_EVENT_LOG_FILES];
>>> 
>>> @@ -269,6 +271,7 @@ enum tpm2_cc_attrs {
>>> #define TPM_VID_INTEL    0x8086
>>> #define TPM_VID_WINBOND  0x1050
>>> #define TPM_VID_STM      0x104A
>>> +#define TPM_VID_ATML     0x1114
>>> 
>>> enum tpm_chip_flags {
>>> 	TPM_CHIP_FLAG_TPM2		= BIT(1),
>>> -- 
>>> 2.29.0.vfs.0.0
>>> 
>> 
>> Just kindly remind this code review in case it has been missed somehow
> 
> I'm sorry, my bad. I managed to somehow miss this. Might be because
> I've been recently reorganizing my email accounts. And thanks for
> pinging so that I spotted it.
No worries, thanks for quick response!

> 
>> Thanks
>> Hao
> 
> /Jarkko

Hao

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4] tpm: fix Atmel TPM crash caused by too frequent queries
  2021-08-27  0:35               ` Hao Wu
@ 2021-09-04 21:14                 ` Hao Wu
  2021-09-04 23:15                   ` Hao Wu
  0 siblings, 1 reply; 47+ messages in thread
From: Hao Wu @ 2021-09-04 21:14 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Shrihari Kalkar, Han Seungyeop, Anish Jhaveri, peterhuewe, jgg,
	linux-integrity, Paul Menzel, Ken Goldman, zohar, why2jjj.linux,
	Hamza Attak, gregkh, arnd, Nayna, James Bottomley



> On Aug 26, 2021, at 5:35 PM, Hao Wu <hao.wu@rubrik.com> wrote:
> 
>> On Aug 26, 2021, at 9:24 AM, Jarkko Sakkinen <jarkko@kernel.org> wrote:
>> 
>> On Wed, 2021-08-25 at 22:38 -0700, Hao Wu wrote:
>>>> On Aug 14, 2021, at 3:25 PM, Hao Wu <hao.wu@rubrik.com> wrote:
>>>> 
>>>> The Atmel TPM 1.2 chips crash with error
>>>> `tpm_try_transmit: send(): error -62` since kernel 4.14.
>>>> It is observed from the kernel log after running `tpm_sealdata -z`.
>>>> The error thrown from the command is as follows
>>>> ```
>>>> $ tpm_sealdata -z
>>>> Tspi_Key_LoadKey failed: 0x00001087 - layer=tddl,
>>>> code=0087 (135), I/O error
>>>> ```
>>>> 
>>>> The issue was reproduced with the following Atmel TPM chip:
>>>> ```
>>>> $ tpm_version
>>>> T0  TPM 1.2 Version Info:
>>>> Chip Version:        1.2.66.1
>>>> Spec Level:          2
>>>> Errata Revision:     3
>>>> TPM Vendor ID:       ATML
>>>> TPM Version:         01010000
>>>> Manufacturer Info:   41544d4c
>>>> ```
>>>> 
>>>> The root cause of the issue is due to the TPM calls to msleep()
>>>> were replaced with usleep_range() [1], which reduces
>>>> the actual timeout. Via experiments, it is observed that
>>>> the original msleep(5) actually sleeps for 15ms.
>>>> Because of a known timeout issue in Atmel TPM 1.2 chip,
>>>> the shorter timeout than 15ms can cause the error described above.
>>>> 
>>>> A few further changes in kernel 4.16 [2] and 4.18 [3, 4] further
>>>> reduced the timeout to less than 1ms. With experiments,
>>>> the problematic timeout in the latest kernel is the one
>>>> for `wait_for_tpm_stat`.
>>>> 
>>>> To fix it, the patch reverts the timeout of `wait_for_tpm_stat`
>>>> to 15ms for all Atmel TPM 1.2 chips, but leave it untouched
>>>> for Ateml TPM 2.0 chip, and chips from other vendors.
>>>> As explained above, the chosen 15ms timeout is
>>>> the actual timeout before this issue introduced,
>>>> thus the old value is used here.
>>>> Particularly, TPM_ATML_TIMEOUT_WAIT_STAT_MIN is set to 14700us,
>>>> TPM_ATML_TIMEOUT_WAIT_STAT_MIN is set to 15000us according to
>>>> the existing TPM_TIMEOUT_RANGE_US (300us).
>>>> The fixed has been tested in the system with the affected Atmel chip
>>>> with no issues observed after boot up.
>>>> 
>>>> References:
>>>> [1] 9f3fc7bcddcb tpm: replace msleep() with usleep_range() in TPM
>>>> 1.2/2.0 generic drivers
>>>> [2] cf151a9a44d5 tpm: reduce tpm polling delay in tpm_tis_core
>>>> [3] 59f5a6b07f64 tpm: reduce poll sleep time in tpm_transmit()
>>>> [4] 424eaf910c32 tpm: reduce polling time to usecs for even finer
>>>> granularity
>>>> 
>>>> Fixes: 9f3fc7bcddcb ("tpm: replace msleep() with usleep_range() in TPM 1.2/2.0 generic drivers")
>>>> Link: https://patchwork.kernel.org/project/linux-integrity/patch/20200926223150.109645-1-hao.wu@rubrik.com/
>>>> Signed-off-by: Hao Wu <hao.wu@rubrik.com>
>>>> ---
>>>> v4:
>>>> - Move timeout constants to drivers/char/tpm/tpm_tis_core.h
>>>> - Cleanup unnecessary inline comment
>>>> 
>>>> v3:
>>>> - removes unnecessary condition check in `wait_for_tpm_stat`
>>>> 
>>>> v2:
>>>> - follow the existing way to define two timeouts (min and max)
>>>> for ATMEL chip, thus keep the exact timeout logic for 
>>>> non-ATEML chips.
>>>> - limit the timeout increase to only ATMEL TPM 1.2 chips,
>>>> because it is not an issue for TPM 2.0 chips yet.
>>>> 
>>>> Test Plan:
>>>> - Run fixed kernel with ATMEL TPM chips and see crash
>>>> has been fixed.
>>>> - Run fixed kernel with non-ATMEL TPM chips, and confirm
>>>> the timeout has not been changed.
>>>> 
>>>> drivers/char/tpm/tpm_tis_core.c | 13 +++++++++++--
>>>> drivers/char/tpm/tpm_tis_core.h |  2 ++
>>>> include/linux/tpm.h             |  3 +++
>>>> 3 files changed, 16 insertions(+), 2 deletions(-)
>>>> 
>>>> diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
>>>> index 55b9d3965ae1..24605f100e96 100644
>>>> --- a/drivers/char/tpm/tpm_tis_core.c
>>>> +++ b/drivers/char/tpm/tpm_tis_core.c
>>>> @@ -80,8 +80,8 @@ static int wait_for_tpm_stat(struct tpm_chip *chip, u8 mask,
>>>> 		}
>>>> 	} else {
>>>> 		do {
>>>> -			usleep_range(TPM_TIMEOUT_USECS_MIN,
>>>> -				     TPM_TIMEOUT_USECS_MAX);
>>>> +			usleep_range(chip->timeout_wait_stat_min,
>>>> +				     chip->timeout_wait_stat_max);
>>>> 			status = chip->ops->status(chip);
>>>> 			if ((status & mask) == mask)
>>>> 				return 0;
>>>> @@ -934,6 +934,8 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
>>>> 	chip->timeout_b = msecs_to_jiffies(TIS_TIMEOUT_B_MAX);
>>>> 	chip->timeout_c = msecs_to_jiffies(TIS_TIMEOUT_C_MAX);
>>>> 	chip->timeout_d = msecs_to_jiffies(TIS_TIMEOUT_D_MAX);
>>>> +	chip->timeout_wait_stat_min = TPM_TIMEOUT_USECS_MIN;
>>>> +	chip->timeout_wait_stat_max = TPM_TIMEOUT_USECS_MAX;
>>>> 	priv->phy_ops = phy_ops;
>>>> 	dev_set_drvdata(&chip->dev, priv);
>>>> 
>>>> @@ -983,6 +985,13 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
>>>> 
>>>> 	priv->manufacturer_id = vendor;
>>>> 
>>>> +	if (priv->manufacturer_id == TPM_VID_ATML &&
>>>> +		!(chip->flags & TPM_CHIP_FLAG_TPM2)) {
>>>> +		/* If TPM chip is 1.2 ATMEL chip, timeout need to be relaxed*/
>>>> +		chip->timeout_wait_stat_min = TPM_ATML_TIMEOUT_WAIT_STAT_MIN;
>>>> +		chip->timeout_wait_stat_max = TPM_ATML_TIMEOUT_WAIT_STAT_MAX;
>>>> +	}
>>>> +
>>>> 	rc = tpm_tis_read8(priv, TPM_RID(0), &rid);
>>>> 	if (rc < 0)
>>>> 		goto out_err;
>>>> diff --git a/drivers/char/tpm/tpm_tis_core.h b/drivers/char/tpm/tpm_tis_core.h
>>>> index 9b2d32a59f67..2e431beb44f7 100644
>>>> --- a/drivers/char/tpm/tpm_tis_core.h
>>>> +++ b/drivers/char/tpm/tpm_tis_core.h
>>>> @@ -54,6 +54,8 @@ enum tis_defaults {
>>>> 	TIS_MEM_LEN = 0x5000,
>>>> 	TIS_SHORT_TIMEOUT = 750,	/* ms */
>>>> 	TIS_LONG_TIMEOUT = 2000,	/* 2 sec */
>>>> +	TPM_ATML_TIMEOUT_WAIT_STAT_MIN = 14700,	/* usecs */
>>>> +	TPM_ATML_TIMEOUT_WAIT_STAT_MAX = 15000,	/* usecs */
>>>> };
>> 
>> I'd prefer TIS_TIMEOUT_{MIN, MAX}_ATML. I.e. no "WAIT_STAT" and without "TPM_"
>> to be consistent with other constants here.
> Ok will do
>> 
>>>> 
>>>> /* Some timeout values are needed before it is known whether the chip is
>>>> diff --git a/include/linux/tpm.h b/include/linux/tpm.h
>>>> index aa11fe323c56..171b9102c976 100644
>>>> --- a/include/linux/tpm.h
>>>> +++ b/include/linux/tpm.h
>>>> @@ -150,6 +150,8 @@ struct tpm_chip {
>>>> 	bool timeout_adjusted;
>>>> 	unsigned long duration[TPM_NUM_DURATIONS]; /* jiffies */
>>>> 	bool duration_adjusted;
>>>> +	unsigned int timeout_wait_stat_min; /* usecs */
>>>> +	unsigned int timeout_wait_stat_max; /* usecs */
>> 
>> Please rename as timeout_{min, max}.
> Ok will do
>> 
>> And I think tpm_chip is wrong place to put them as they are TIS
>> specific, i.e. they should be in tpm_tis_data.
> Sorry, I am not familiar with tpm_tis_data, could tell the the place that you want me to put the var? 
> I think I may have hard time to move forward according toward this comment due to bandwidth constraints.
> Some helps would be appreciated. 
> 
> Is tpm_tis_data something specific to a chip instance ? Given the values are tied to chip,
> we need chip specific instance to make this work.

Hi Jarkko, I have checked about your proposal a bit. It look slike we need to 
Run “struct tpm_tis_data *priv = dev_get_drvdata(&chip->dev)” in every wait_for_tpm_stat call. Would this be a performance concern ? 
If we cache this in tpm_chip instance, it is not the case. 

Please let me know your thought.

Hao 

>> 
>>>> 
>>>> 	struct dentry *bios_dir[TPM_NUM_EVENT_LOG_FILES];
>>>> 
>>>> @@ -269,6 +271,7 @@ enum tpm2_cc_attrs {
>>>> #define TPM_VID_INTEL    0x8086
>>>> #define TPM_VID_WINBOND  0x1050
>>>> #define TPM_VID_STM      0x104A
>>>> +#define TPM_VID_ATML     0x1114
>>>> 
>>>> enum tpm_chip_flags {
>>>> 	TPM_CHIP_FLAG_TPM2		= BIT(1),
>>>> -- 
>>>> 2.29.0.vfs.0.0
>>>> 
>>> 
>>> Just kindly remind this code review in case it has been missed somehow
>> 
>> I'm sorry, my bad. I managed to somehow miss this. Might be because
>> I've been recently reorganizing my email accounts. And thanks for
>> pinging so that I spotted it.
> No worries, thanks for quick response!
> 
>> 
>>> Thanks
>>> Hao
>> 
>> /Jarkko
> 
> Hao


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v4] tpm: fix Atmel TPM crash caused by too frequent queries
  2021-09-04 21:14                 ` Hao Wu
@ 2021-09-04 23:15                   ` Hao Wu
  0 siblings, 0 replies; 47+ messages in thread
From: Hao Wu @ 2021-09-04 23:15 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Shrihari Kalkar, Han Seungyeop, Anish Jhaveri, peterhuewe, jgg,
	linux-integrity, Paul Menzel, Ken Goldman, zohar, why2jjj.linux,
	Hamza Attak, gregkh, arnd, Nayna, James Bottomley



> On Sep 4, 2021, at 2:14 PM, Hao Wu <hao.wu@rubrik.com> wrote:
> 
> 
> 
>> On Aug 26, 2021, at 5:35 PM, Hao Wu <hao.wu@rubrik.com> wrote:
>> 
>>> On Aug 26, 2021, at 9:24 AM, Jarkko Sakkinen <jarkko@kernel.org> wrote:
>>> 
>>> On Wed, 2021-08-25 at 22:38 -0700, Hao Wu wrote:
>>>>> On Aug 14, 2021, at 3:25 PM, Hao Wu <hao.wu@rubrik.com> wrote:
>>>>> 
>>>>> The Atmel TPM 1.2 chips crash with error
>>>>> `tpm_try_transmit: send(): error -62` since kernel 4.14.
>>>>> It is observed from the kernel log after running `tpm_sealdata -z`.
>>>>> The error thrown from the command is as follows
>>>>> ```
>>>>> $ tpm_sealdata -z
>>>>> Tspi_Key_LoadKey failed: 0x00001087 - layer=tddl,
>>>>> code=0087 (135), I/O error
>>>>> ```
>>>>> 
>>>>> The issue was reproduced with the following Atmel TPM chip:
>>>>> ```
>>>>> $ tpm_version
>>>>> T0  TPM 1.2 Version Info:
>>>>> Chip Version:        1.2.66.1
>>>>> Spec Level:          2
>>>>> Errata Revision:     3
>>>>> TPM Vendor ID:       ATML
>>>>> TPM Version:         01010000
>>>>> Manufacturer Info:   41544d4c
>>>>> ```
>>>>> 
>>>>> The root cause of the issue is due to the TPM calls to msleep()
>>>>> were replaced with usleep_range() [1], which reduces
>>>>> the actual timeout. Via experiments, it is observed that
>>>>> the original msleep(5) actually sleeps for 15ms.
>>>>> Because of a known timeout issue in Atmel TPM 1.2 chip,
>>>>> the shorter timeout than 15ms can cause the error described above.
>>>>> 
>>>>> A few further changes in kernel 4.16 [2] and 4.18 [3, 4] further
>>>>> reduced the timeout to less than 1ms. With experiments,
>>>>> the problematic timeout in the latest kernel is the one
>>>>> for `wait_for_tpm_stat`.
>>>>> 
>>>>> To fix it, the patch reverts the timeout of `wait_for_tpm_stat`
>>>>> to 15ms for all Atmel TPM 1.2 chips, but leave it untouched
>>>>> for Ateml TPM 2.0 chip, and chips from other vendors.
>>>>> As explained above, the chosen 15ms timeout is
>>>>> the actual timeout before this issue introduced,
>>>>> thus the old value is used here.
>>>>> Particularly, TPM_ATML_TIMEOUT_WAIT_STAT_MIN is set to 14700us,
>>>>> TPM_ATML_TIMEOUT_WAIT_STAT_MIN is set to 15000us according to
>>>>> the existing TPM_TIMEOUT_RANGE_US (300us).
>>>>> The fixed has been tested in the system with the affected Atmel chip
>>>>> with no issues observed after boot up.
>>>>> 
>>>>> References:
>>>>> [1] 9f3fc7bcddcb tpm: replace msleep() with usleep_range() in TPM
>>>>> 1.2/2.0 generic drivers
>>>>> [2] cf151a9a44d5 tpm: reduce tpm polling delay in tpm_tis_core
>>>>> [3] 59f5a6b07f64 tpm: reduce poll sleep time in tpm_transmit()
>>>>> [4] 424eaf910c32 tpm: reduce polling time to usecs for even finer
>>>>> granularity
>>>>> 
>>>>> Fixes: 9f3fc7bcddcb ("tpm: replace msleep() with usleep_range() in TPM 1.2/2.0 generic drivers")
>>>>> Link: https://patchwork.kernel.org/project/linux-integrity/patch/20200926223150.109645-1-hao.wu@rubrik.com/
>>>>> Signed-off-by: Hao Wu <hao.wu@rubrik.com>
>>>>> ---
>>>>> v4:
>>>>> - Move timeout constants to drivers/char/tpm/tpm_tis_core.h
>>>>> - Cleanup unnecessary inline comment
>>>>> 
>>>>> v3:
>>>>> - removes unnecessary condition check in `wait_for_tpm_stat`
>>>>> 
>>>>> v2:
>>>>> - follow the existing way to define two timeouts (min and max)
>>>>> for ATMEL chip, thus keep the exact timeout logic for 
>>>>> non-ATEML chips.
>>>>> - limit the timeout increase to only ATMEL TPM 1.2 chips,
>>>>> because it is not an issue for TPM 2.0 chips yet.
>>>>> 
>>>>> Test Plan:
>>>>> - Run fixed kernel with ATMEL TPM chips and see crash
>>>>> has been fixed.
>>>>> - Run fixed kernel with non-ATMEL TPM chips, and confirm
>>>>> the timeout has not been changed.
>>>>> 
>>>>> drivers/char/tpm/tpm_tis_core.c | 13 +++++++++++--
>>>>> drivers/char/tpm/tpm_tis_core.h |  2 ++
>>>>> include/linux/tpm.h             |  3 +++
>>>>> 3 files changed, 16 insertions(+), 2 deletions(-)
>>>>> 
>>>>> diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
>>>>> index 55b9d3965ae1..24605f100e96 100644
>>>>> --- a/drivers/char/tpm/tpm_tis_core.c
>>>>> +++ b/drivers/char/tpm/tpm_tis_core.c
>>>>> @@ -80,8 +80,8 @@ static int wait_for_tpm_stat(struct tpm_chip *chip, u8 mask,
>>>>> 		}
>>>>> 	} else {
>>>>> 		do {
>>>>> -			usleep_range(TPM_TIMEOUT_USECS_MIN,
>>>>> -				     TPM_TIMEOUT_USECS_MAX);
>>>>> +			usleep_range(chip->timeout_wait_stat_min,
>>>>> +				     chip->timeout_wait_stat_max);
>>>>> 			status = chip->ops->status(chip);
>>>>> 			if ((status & mask) == mask)
>>>>> 				return 0;
>>>>> @@ -934,6 +934,8 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
>>>>> 	chip->timeout_b = msecs_to_jiffies(TIS_TIMEOUT_B_MAX);
>>>>> 	chip->timeout_c = msecs_to_jiffies(TIS_TIMEOUT_C_MAX);
>>>>> 	chip->timeout_d = msecs_to_jiffies(TIS_TIMEOUT_D_MAX);
>>>>> +	chip->timeout_wait_stat_min = TPM_TIMEOUT_USECS_MIN;
>>>>> +	chip->timeout_wait_stat_max = TPM_TIMEOUT_USECS_MAX;
>>>>> 	priv->phy_ops = phy_ops;
>>>>> 	dev_set_drvdata(&chip->dev, priv);
>>>>> 
>>>>> @@ -983,6 +985,13 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
>>>>> 
>>>>> 	priv->manufacturer_id = vendor;
>>>>> 
>>>>> +	if (priv->manufacturer_id == TPM_VID_ATML &&
>>>>> +		!(chip->flags & TPM_CHIP_FLAG_TPM2)) {
>>>>> +		/* If TPM chip is 1.2 ATMEL chip, timeout need to be relaxed*/
>>>>> +		chip->timeout_wait_stat_min = TPM_ATML_TIMEOUT_WAIT_STAT_MIN;
>>>>> +		chip->timeout_wait_stat_max = TPM_ATML_TIMEOUT_WAIT_STAT_MAX;
>>>>> +	}
>>>>> +
>>>>> 	rc = tpm_tis_read8(priv, TPM_RID(0), &rid);
>>>>> 	if (rc < 0)
>>>>> 		goto out_err;
>>>>> diff --git a/drivers/char/tpm/tpm_tis_core.h b/drivers/char/tpm/tpm_tis_core.h
>>>>> index 9b2d32a59f67..2e431beb44f7 100644
>>>>> --- a/drivers/char/tpm/tpm_tis_core.h
>>>>> +++ b/drivers/char/tpm/tpm_tis_core.h
>>>>> @@ -54,6 +54,8 @@ enum tis_defaults {
>>>>> 	TIS_MEM_LEN = 0x5000,
>>>>> 	TIS_SHORT_TIMEOUT = 750,	/* ms */
>>>>> 	TIS_LONG_TIMEOUT = 2000,	/* 2 sec */
>>>>> +	TPM_ATML_TIMEOUT_WAIT_STAT_MIN = 14700,	/* usecs */
>>>>> +	TPM_ATML_TIMEOUT_WAIT_STAT_MAX = 15000,	/* usecs */
>>>>> };
>>> 
>>> I'd prefer TIS_TIMEOUT_{MIN, MAX}_ATML. I.e. no "WAIT_STAT" and without "TPM_"
>>> to be consistent with other constants here.
>> Ok will do
>>> 
>>>>> 
>>>>> /* Some timeout values are needed before it is known whether the chip is
>>>>> diff --git a/include/linux/tpm.h b/include/linux/tpm.h
>>>>> index aa11fe323c56..171b9102c976 100644
>>>>> --- a/include/linux/tpm.h
>>>>> +++ b/include/linux/tpm.h
>>>>> @@ -150,6 +150,8 @@ struct tpm_chip {
>>>>> 	bool timeout_adjusted;
>>>>> 	unsigned long duration[TPM_NUM_DURATIONS]; /* jiffies */
>>>>> 	bool duration_adjusted;
>>>>> +	unsigned int timeout_wait_stat_min; /* usecs */
>>>>> +	unsigned int timeout_wait_stat_max; /* usecs */
>>> 
>>> Please rename as timeout_{min, max}.
>> Ok will do
To be honest, this naming could be misleading, because the timeout here only applies to wait_stat use case. 
But I will just follow you suggestion anyway.

Hao

>>> 
>>> And I think tpm_chip is wrong place to put them as they are TIS
>>> specific, i.e. they should be in tpm_tis_data.
>> Sorry, I am not familiar with tpm_tis_data, could tell the the place that you want me to put the var? 
>> I think I may have hard time to move forward according toward this comment due to bandwidth constraints.
>> Some helps would be appreciated. 
>> 
>> Is tpm_tis_data something specific to a chip instance ? Given the values are tied to chip,
>> we need chip specific instance to make this work.
> 
> Hi Jarkko, I have checked about your proposal a bit. It look slike we need to 
> Run “struct tpm_tis_data *priv = dev_get_drvdata(&chip->dev)” in every wait_for_tpm_stat call. Would this be a performance concern ? 
> If we cache this in tpm_chip instance, it is not the case. 
> 
> Please let me know your thought.
> 
> Hao 
> 
>>> 
>>>>> 
>>>>> 	struct dentry *bios_dir[TPM_NUM_EVENT_LOG_FILES];
>>>>> 
>>>>> @@ -269,6 +271,7 @@ enum tpm2_cc_attrs {
>>>>> #define TPM_VID_INTEL    0x8086
>>>>> #define TPM_VID_WINBOND  0x1050
>>>>> #define TPM_VID_STM      0x104A
>>>>> +#define TPM_VID_ATML     0x1114
>>>>> 
>>>>> enum tpm_chip_flags {
>>>>> 	TPM_CHIP_FLAG_TPM2		= BIT(1),
>>>>> -- 
>>>>> 2.29.0.vfs.0.0
>>>>> 
>>>> 
>>>> Just kindly remind this code review in case it has been missed somehow
>>> 
>>> I'm sorry, my bad. I managed to somehow miss this. Might be because
>>> I've been recently reorganizing my email accounts. And thanks for
>>> pinging so that I spotted it.
>> No worries, thanks for quick response!
>> 
>>> 
>>>> Thanks
>>>> Hao
>>> 
>>> /Jarkko
>> 
>> Hao
> 


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH v5] tpm: fix Atmel TPM crash caused by too frequent queries
  2021-08-14 22:25         ` [PATCH v4] " Hao Wu
  2021-08-26  5:38           ` Hao Wu
@ 2021-09-05  3:51           ` Hao Wu
  2021-09-07 17:43             ` Jarkko Sakkinen
  1 sibling, 1 reply; 47+ messages in thread
From: Hao Wu @ 2021-09-05  3:51 UTC (permalink / raw)
  To: hao.wu, shrihari.kalkar, seungyeop.han, anish.jhaveri,
	peterhuewe, jarkko, jgg, linux-integrity, pmenzel, kgold, zohar,
	why2jjj.linux, hamza, gregkh, arnd, nayna, James.Bottomley

The Atmel TPM 1.2 chips crash with error
`tpm_try_transmit: send(): error -62` since kernel 4.14.
It is observed from the kernel log after running `tpm_sealdata -z`.
The error thrown from the command is as follows
```
$ tpm_sealdata -z
Tspi_Key_LoadKey failed: 0x00001087 - layer=tddl,
code=0087 (135), I/O error
```

The issue was reproduced with the following Atmel TPM chip:
```
$ tpm_version
T0  TPM 1.2 Version Info:
  Chip Version:        1.2.66.1
  Spec Level:          2
  Errata Revision:     3
  TPM Vendor ID:       ATML
  TPM Version:         01010000
  Manufacturer Info:   41544d4c
```

The root cause of the issue is due to the TPM calls to msleep()
were replaced with usleep_range() [1], which reduces
the actual timeout. Via experiments, it is observed that
the original msleep(5) actually sleeps for 15ms.
Because of a known timeout issue in Atmel TPM 1.2 chip,
the shorter timeout than 15ms can cause the error described above.

A few further changes in kernel 4.16 [2] and 4.18 [3, 4] further
reduced the timeout to less than 1ms. With experiments,
the problematic timeout in the latest kernel is the one
for `wait_for_tpm_stat`.

To fix it, the patch reverts the timeout of `wait_for_tpm_stat`
to 15ms for all Atmel TPM 1.2 chips, but leave it untouched
for Ateml TPM 2.0 chip, and chips from other vendors.
As explained above, the chosen 15ms timeout is
the actual timeout before this issue introduced,
thus the old value is used here.
Particularly, TPM_ATML_TIMEOUT_WAIT_STAT_MIN is set to 14700us,
TPM_ATML_TIMEOUT_WAIT_STAT_MIN is set to 15000us according to
the existing TPM_TIMEOUT_RANGE_US (300us).
The fixed has been tested in the system with the affected Atmel chip
with no issues observed after boot up.

References:
[1] 9f3fc7bcddcb tpm: replace msleep() with usleep_range() in TPM
1.2/2.0 generic drivers
[2] cf151a9a44d5 tpm: reduce tpm polling delay in tpm_tis_core
[3] 59f5a6b07f64 tpm: reduce poll sleep time in tpm_transmit()
[4] 424eaf910c32 tpm: reduce polling time to usecs for even finer
granularity

Fixes: 9f3fc7bcddcb ("tpm: replace msleep() with usleep_range() in TPM 1.2/2.0 generic drivers")
Link: https://patchwork.kernel.org/project/linux-integrity/patch/20200926223150.109645-1-hao.wu@rubrik.com/
Signed-off-by: Hao Wu <hao.wu@rubrik.com>
---
v5:
- Rename variables according to feedbacks
- Move timeout min/max to tpm_tis_data

v4:
- Move timeout constants to drivers/char/tpm/tpm_tis_core.h
- Cleanup unnecessary inline comment

v3:
- removes unnecessary condition check in `wait_for_tpm_stat`

v2:
- follow the existing way to define two timeouts (min and max)
  for ATMEL chip, thus keep the exact timeout logic for 
  non-ATEML chips.
- limit the timeout increase to only ATMEL TPM 1.2 chips,
  because it is not an issue for TPM 2.0 chips yet.

Test Plan:
- Run fixed kernel with ATMEL TPM chips and see crash
has been fixed.
- Run fixed kernel with non-ATMEL TPM chips, and confirm
the timeout has not been changed.

 drivers/char/tpm/tpm_tis_core.c | 27 +++++++++++++++++++--------
 drivers/char/tpm/tpm_tis_core.h |  4 ++++
 include/linux/tpm.h             |  1 +
 3 files changed, 24 insertions(+), 8 deletions(-)

diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
index 55b9d3965ae1..29de383aec5f 100644
--- a/drivers/char/tpm/tpm_tis_core.c
+++ b/drivers/char/tpm/tpm_tis_core.c
@@ -79,9 +79,10 @@ static int wait_for_tpm_stat(struct tpm_chip *chip, u8 mask,
 			goto again;
 		}
 	} else {
+		struct tpm_tis_data *priv = dev_get_drvdata(&chip->dev);
 		do {
-			usleep_range(TPM_TIMEOUT_USECS_MIN,
-				     TPM_TIMEOUT_USECS_MAX);
+			usleep_range(priv->timeout_min,
+				     priv->timeout_max);
 			status = chip->ops->status(chip);
 			if ((status & mask) == mask)
 				return 0;
@@ -934,7 +935,23 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
 	chip->timeout_b = msecs_to_jiffies(TIS_TIMEOUT_B_MAX);
 	chip->timeout_c = msecs_to_jiffies(TIS_TIMEOUT_C_MAX);
 	chip->timeout_d = msecs_to_jiffies(TIS_TIMEOUT_D_MAX);
+	priv->timeout_min = TPM_TIMEOUT_USECS_MIN;
+	priv->timeout_max = TPM_TIMEOUT_USECS_MAX;
 	priv->phy_ops = phy_ops;
+
+	rc = tpm_tis_read32(priv, TPM_DID_VID(0), &vendor);
+	if (rc < 0)
+		goto out_err;
+
+	priv->manufacturer_id = vendor;
+
+	if (priv->manufacturer_id == TPM_VID_ATML &&
+		!(chip->flags & TPM_CHIP_FLAG_TPM2)) {
+		/* If TPM chip is 1.2 ATMEL chip, timeout need to be relaxed*/
+		priv->timeout_min = TIS_TIMEOUT_MIN_ATML;
+		priv->timeout_max = TIS_TIMEOUT_MAX_ATML;
+	}
+
 	dev_set_drvdata(&chip->dev, priv);
 
 	if (is_bsw()) {
@@ -977,12 +994,6 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
 	if (rc)
 		goto out_err;
 
-	rc = tpm_tis_read32(priv, TPM_DID_VID(0), &vendor);
-	if (rc < 0)
-		goto out_err;
-
-	priv->manufacturer_id = vendor;
-
 	rc = tpm_tis_read8(priv, TPM_RID(0), &rid);
 	if (rc < 0)
 		goto out_err;
diff --git a/drivers/char/tpm/tpm_tis_core.h b/drivers/char/tpm/tpm_tis_core.h
index 9b2d32a59f67..c33f27c929f4 100644
--- a/drivers/char/tpm/tpm_tis_core.h
+++ b/drivers/char/tpm/tpm_tis_core.h
@@ -54,6 +54,8 @@ enum tis_defaults {
 	TIS_MEM_LEN = 0x5000,
 	TIS_SHORT_TIMEOUT = 750,	/* ms */
 	TIS_LONG_TIMEOUT = 2000,	/* 2 sec */
+	TIS_TIMEOUT_MIN_ATML = 14700,	/* usecs */
+	TIS_TIMEOUT_MAX_ATML = 15000,	/* usecs */
 };
 
 /* Some timeout values are needed before it is known whether the chip is
@@ -97,6 +99,8 @@ struct tpm_tis_data {
 	wait_queue_head_t read_queue;
 	const struct tpm_tis_phy_ops *phy_ops;
 	unsigned short rng_quality;
+	unsigned int timeout_min; /* usecs */
+	unsigned int timeout_max; /* usecs */
 };
 
 struct tpm_tis_phy_ops {
diff --git a/include/linux/tpm.h b/include/linux/tpm.h
index aa11fe323c56..12d827734686 100644
--- a/include/linux/tpm.h
+++ b/include/linux/tpm.h
@@ -269,6 +269,7 @@ enum tpm2_cc_attrs {
 #define TPM_VID_INTEL    0x8086
 #define TPM_VID_WINBOND  0x1050
 #define TPM_VID_STM      0x104A
+#define TPM_VID_ATML     0x1114
 
 enum tpm_chip_flags {
 	TPM_CHIP_FLAG_TPM2		= BIT(1),
-- 
2.29.0.vfs.0.0


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* Re: [PATCH v5] tpm: fix Atmel TPM crash caused by too frequent queries
  2021-09-05  3:51           ` [PATCH v5] " Hao Wu
@ 2021-09-07 17:43             ` Jarkko Sakkinen
  2021-09-08  8:33               ` Hao Wu
  0 siblings, 1 reply; 47+ messages in thread
From: Jarkko Sakkinen @ 2021-09-07 17:43 UTC (permalink / raw)
  To: Hao Wu, shrihari.kalkar, seungyeop.han, anish.jhaveri,
	peterhuewe, jgg, linux-integrity, pmenzel, kgold, zohar,
	why2jjj.linux, hamza, gregkh, arnd, nayna, James.Bottomley

On Sat, 2021-09-04 at 20:51 -0700, Hao Wu wrote:
> The Atmel TPM 1.2 chips crash with error
> `tpm_try_transmit: send(): error -62` since kernel 4.14.
> It is observed from the kernel log after running `tpm_sealdata -z`.
> The error thrown from the command is as follows
> ```
> $ tpm_sealdata -z
> Tspi_Key_LoadKey failed: 0x00001087 - layer=tddl,
> code=0087 (135), I/O error
> ```
> 
> The issue was reproduced with the following Atmel TPM chip:
> ```
> $ tpm_version
> T0  TPM 1.2 Version Info:
>   Chip Version:        1.2.66.1
>   Spec Level:          2
>   Errata Revision:     3
>   TPM Vendor ID:       ATML
>   TPM Version:         01010000
>   Manufacturer Info:   41544d4c
> ```
> 
> The root cause of the issue is due to the TPM calls to msleep()
> were replaced with usleep_range() [1], which reduces
> the actual timeout. Via experiments, it is observed that
> the original msleep(5) actually sleeps for 15ms.
> Because of a known timeout issue in Atmel TPM 1.2 chip,
> the shorter timeout than 15ms can cause the error described above.
> 
> A few further changes in kernel 4.16 [2] and 4.18 [3, 4] further
> reduced the timeout to less than 1ms. With experiments,
> the problematic timeout in the latest kernel is the one
> for `wait_for_tpm_stat`.
> 
> To fix it, the patch reverts the timeout of `wait_for_tpm_stat`
> to 15ms for all Atmel TPM 1.2 chips, but leave it untouched
> for Ateml TPM 2.0 chip, and chips from other vendors.
> As explained above, the chosen 15ms timeout is
> the actual timeout before this issue introduced,
> thus the old value is used here.
> Particularly, TPM_ATML_TIMEOUT_WAIT_STAT_MIN is set to 14700us,
> TPM_ATML_TIMEOUT_WAIT_STAT_MIN is set to 15000us according to
> the existing TPM_TIMEOUT_RANGE_US (300us).
> The fixed has been tested in the system with the affected Atmel chip
> with no issues observed after boot up.
> 
> References:
> [1] 9f3fc7bcddcb tpm: replace msleep() with usleep_range() in TPM
> 1.2/2.0 generic drivers
> [2] cf151a9a44d5 tpm: reduce tpm polling delay in tpm_tis_core
> [3] 59f5a6b07f64 tpm: reduce poll sleep time in tpm_transmit()
> [4] 424eaf910c32 tpm: reduce polling time to usecs for even finer
> granularity
> 
> Fixes: 9f3fc7bcddcb ("tpm: replace msleep() with usleep_range() in TPM 1.2/2.0 generic drivers")
> Link: https://patchwork.kernel.org/project/linux-integrity/patch/20200926223150.109645-1-hao.wu@rubrik.com/
> Signed-off-by: Hao Wu <hao.wu@rubrik.com>
> ---
> v5:
> - Rename variables according to feedbacks
> - Move timeout min/max to tpm_tis_data
> 
> v4:
> - Move timeout constants to drivers/char/tpm/tpm_tis_core.h
> - Cleanup unnecessary inline comment
> 
> v3:
> - removes unnecessary condition check in `wait_for_tpm_stat`
> 
> v2:
> - follow the existing way to define two timeouts (min and max)
>   for ATMEL chip, thus keep the exact timeout logic for 
>   non-ATEML chips.
> - limit the timeout increase to only ATMEL TPM 1.2 chips,
>   because it is not an issue for TPM 2.0 chips yet.
> 
> Test Plan:
> - Run fixed kernel with ATMEL TPM chips and see crash
> has been fixed.
> - Run fixed kernel with non-ATMEL TPM chips, and confirm
> the timeout has not been changed.
> 
>  drivers/char/tpm/tpm_tis_core.c | 27 +++++++++++++++++++--------
>  drivers/char/tpm/tpm_tis_core.h |  4 ++++
>  include/linux/tpm.h             |  1 +
>  3 files changed, 24 insertions(+), 8 deletions(-)
> 

I just noticed that these are part of the same email thread from
lore.kernel.org. Please always use separate thread. E.g. I'm not sure if
this would play out well with tooling such as b4 that can pick up patch
sets from lore.


> diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
> index 55b9d3965ae1..29de383aec5f 100644
> --- a/drivers/char/tpm/tpm_tis_core.c
> +++ b/drivers/char/tpm/tpm_tis_core.c
> @@ -79,9 +79,10 @@ static int wait_for_tpm_stat(struct tpm_chip *chip, u8 mask,
>  			goto again;
>  		}
>  	} else {
> +		struct tpm_tis_data *priv = dev_get_drvdata(&chip->dev);

Move this declaration to the beginning of the function.

>  		do {
> -			usleep_range(TPM_TIMEOUT_USECS_MIN,
> -				     TPM_TIMEOUT_USECS_MAX);
> +			usleep_range(priv->timeout_min,
> +				     priv->timeout_max);
>  			status = chip->ops->status(chip);
>  			if ((status & mask) == mask)
>  				return 0;
> @@ -934,7 +935,23 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
>  	chip->timeout_b = msecs_to_jiffies(TIS_TIMEOUT_B_MAX);
>  	chip->timeout_c = msecs_to_jiffies(TIS_TIMEOUT_C_MAX);
>  	chip->timeout_d = msecs_to_jiffies(TIS_TIMEOUT_D_MAX);
> +	priv->timeout_min = TPM_TIMEOUT_USECS_MIN;
> +	priv->timeout_max = TPM_TIMEOUT_USECS_MAX;
>  	priv->phy_ops = phy_ops;
> +
> +	rc = tpm_tis_read32(priv, TPM_DID_VID(0), &vendor);
> +	if (rc < 0)
> +		goto out_err;
> +
> +	priv->manufacturer_id = vendor;
> +
> +	if (priv->manufacturer_id == TPM_VID_ATML &&
> +		!(chip->flags & TPM_CHIP_FLAG_TPM2)) {
> +		/* If TPM chip is 1.2 ATMEL chip, timeout need to be relaxed*/
                                                        
A ' ' character missing before the last asterisk.

Also the comment is just in English the same exact thing already
clearly expressed by the if-statement, so it's better that you
just remove the comment altogether.

> +		priv->timeout_min = TIS_TIMEOUT_MIN_ATML;
> +		priv->timeout_max = TIS_TIMEOUT_MAX_ATML;
> +	}
> +
>  	dev_set_drvdata(&chip->dev, priv);
>  
>  	if (is_bsw()) {
> @@ -977,12 +994,6 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
>  	if (rc)
>  		goto out_err;
>  
> -	rc = tpm_tis_read32(priv, TPM_DID_VID(0), &vendor);
> -	if (rc < 0)
> -		goto out_err;
> -
> -	priv->manufacturer_id = vendor;
> -
>  	rc = tpm_tis_read8(priv, TPM_RID(0), &rid);
>  	if (rc < 0)
>  		goto out_err;
> diff --git a/drivers/char/tpm/tpm_tis_core.h b/drivers/char/tpm/tpm_tis_core.h
> index 9b2d32a59f67..c33f27c929f4 100644
> --- a/drivers/char/tpm/tpm_tis_core.h
> +++ b/drivers/char/tpm/tpm_tis_core.h
> @@ -54,6 +54,8 @@ enum tis_defaults {
>  	TIS_MEM_LEN = 0x5000,
>  	TIS_SHORT_TIMEOUT = 750,	/* ms */
>  	TIS_LONG_TIMEOUT = 2000,	/* 2 sec */
> +	TIS_TIMEOUT_MIN_ATML = 14700,	/* usecs */
> +	TIS_TIMEOUT_MAX_ATML = 15000,	/* usecs */
>  };
>  
>  /* Some timeout values are needed before it is known whether the chip is
> @@ -97,6 +99,8 @@ struct tpm_tis_data {
>  	wait_queue_head_t read_queue;
>  	const struct tpm_tis_phy_ops *phy_ops;
>  	unsigned short rng_quality;
> +	unsigned int timeout_min; /* usecs */
> +	unsigned int timeout_max; /* usecs */
>  };
>  
>  struct tpm_tis_phy_ops {
> diff --git a/include/linux/tpm.h b/include/linux/tpm.h
> index aa11fe323c56..12d827734686 100644
> --- a/include/linux/tpm.h
> +++ b/include/linux/tpm.h
> @@ -269,6 +269,7 @@ enum tpm2_cc_attrs {
>  #define TPM_VID_INTEL    0x8086
>  #define TPM_VID_WINBOND  0x1050
>  #define TPM_VID_STM      0x104A
> +#define TPM_VID_ATML     0x1114
>  
>  enum tpm_chip_flags {
>  	TPM_CHIP_FLAG_TPM2		= BIT(1),

Looking good other than a those minor nitpicks. Please send the next as
a separate thread, and *not* as response, so that it can be picked up.

/Jarkko


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v5] tpm: fix Atmel TPM crash caused by too frequent queries
  2021-09-07 17:43             ` Jarkko Sakkinen
@ 2021-09-08  8:33               ` Hao Wu
  0 siblings, 0 replies; 47+ messages in thread
From: Hao Wu @ 2021-09-08  8:33 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Shrihari Kalkar, Seungyeop Han, Anish Jhaveri, peterhuewe, jgg,
	linux-integrity, Paul Menzel, Ken Goldman, zohar, why2jjj.linux,
	Hamza Attak, gregkh, arnd, Nayna, James.Bottomley


> On Sep 7, 2021, at 10:43 AM, Jarkko Sakkinen <jarkko@kernel.org> wrote:
> 
> On Sat, 2021-09-04 at 20:51 -0700, Hao Wu wrote:
>> The Atmel TPM 1.2 chips crash with error
>> `tpm_try_transmit: send(): error -62` since kernel 4.14.
>> It is observed from the kernel log after running `tpm_sealdata -z`.
>> The error thrown from the command is as follows
>> ```
>> $ tpm_sealdata -z
>> Tspi_Key_LoadKey failed: 0x00001087 - layer=tddl,
>> code=0087 (135), I/O error
>> ```
>> 
>> The issue was reproduced with the following Atmel TPM chip:
>> ```
>> $ tpm_version
>> T0  TPM 1.2 Version Info:
>>  Chip Version:        1.2.66.1
>>  Spec Level:          2
>>  Errata Revision:     3
>>  TPM Vendor ID:       ATML
>>  TPM Version:         01010000
>>  Manufacturer Info:   41544d4c
>> ```
>> 
>> The root cause of the issue is due to the TPM calls to msleep()
>> were replaced with usleep_range() [1], which reduces
>> the actual timeout. Via experiments, it is observed that
>> the original msleep(5) actually sleeps for 15ms.
>> Because of a known timeout issue in Atmel TPM 1.2 chip,
>> the shorter timeout than 15ms can cause the error described above.
>> 
>> A few further changes in kernel 4.16 [2] and 4.18 [3, 4] further
>> reduced the timeout to less than 1ms. With experiments,
>> the problematic timeout in the latest kernel is the one
>> for `wait_for_tpm_stat`.
>> 
>> To fix it, the patch reverts the timeout of `wait_for_tpm_stat`
>> to 15ms for all Atmel TPM 1.2 chips, but leave it untouched
>> for Ateml TPM 2.0 chip, and chips from other vendors.
>> As explained above, the chosen 15ms timeout is
>> the actual timeout before this issue introduced,
>> thus the old value is used here.
>> Particularly, TPM_ATML_TIMEOUT_WAIT_STAT_MIN is set to 14700us,
>> TPM_ATML_TIMEOUT_WAIT_STAT_MIN is set to 15000us according to
>> the existing TPM_TIMEOUT_RANGE_US (300us).
>> The fixed has been tested in the system with the affected Atmel chip
>> with no issues observed after boot up.
>> 
>> References:
>> [1] 9f3fc7bcddcb tpm: replace msleep() with usleep_range() in TPM
>> 1.2/2.0 generic drivers
>> [2] cf151a9a44d5 tpm: reduce tpm polling delay in tpm_tis_core
>> [3] 59f5a6b07f64 tpm: reduce poll sleep time in tpm_transmit()
>> [4] 424eaf910c32 tpm: reduce polling time to usecs for even finer
>> granularity
>> 
>> Fixes: 9f3fc7bcddcb ("tpm: replace msleep() with usleep_range() in TPM 1.2/2.0 generic drivers")
>> Link: https://patchwork.kernel.org/project/linux-integrity/patch/20200926223150.109645-1-hao.wu@rubrik.com/
>> Signed-off-by: Hao Wu <hao.wu@rubrik.com>
>> ---
>> v5:
>> - Rename variables according to feedbacks
>> - Move timeout min/max to tpm_tis_data
>> 
>> v4:
>> - Move timeout constants to drivers/char/tpm/tpm_tis_core.h
>> - Cleanup unnecessary inline comment
>> 
>> v3:
>> - removes unnecessary condition check in `wait_for_tpm_stat`
>> 
>> v2:
>> - follow the existing way to define two timeouts (min and max)
>>  for ATMEL chip, thus keep the exact timeout logic for 
>>  non-ATEML chips.
>> - limit the timeout increase to only ATMEL TPM 1.2 chips,
>>  because it is not an issue for TPM 2.0 chips yet.
>> 
>> Test Plan:
>> - Run fixed kernel with ATMEL TPM chips and see crash
>> has been fixed.
>> - Run fixed kernel with non-ATMEL TPM chips, and confirm
>> the timeout has not been changed.
>> 
>> drivers/char/tpm/tpm_tis_core.c | 27 +++++++++++++++++++--------
>> drivers/char/tpm/tpm_tis_core.h |  4 ++++
>> include/linux/tpm.h             |  1 +
>> 3 files changed, 24 insertions(+), 8 deletions(-)
>> 
> 
> I just noticed that these are part of the same email thread from
> lore.kernel.org. Please always use separate thread. E.g. I'm not sure if
> this would play out well with tooling such as b4 that can pick up patch
> sets from lore.
I see. I thought I need to chain these. Will send a separate one.

> 
>> diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
>> index 55b9d3965ae1..29de383aec5f 100644
>> --- a/drivers/char/tpm/tpm_tis_core.c
>> +++ b/drivers/char/tpm/tpm_tis_core.c
>> @@ -79,9 +79,10 @@ static int wait_for_tpm_stat(struct tpm_chip *chip, u8 mask,
>> 			goto again;
>> 		}
>> 	} else {
>> +		struct tpm_tis_data *priv = dev_get_drvdata(&chip->dev);
> 
> Move this declaration to the beginning of the function.
OK

>> 		do {
>> -			usleep_range(TPM_TIMEOUT_USECS_MIN,
>> -				     TPM_TIMEOUT_USECS_MAX);
>> +			usleep_range(priv->timeout_min,
>> +				     priv->timeout_max);
>> 			status = chip->ops->status(chip);
>> 			if ((status & mask) == mask)
>> 				return 0;
>> @@ -934,7 +935,23 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
>> 	chip->timeout_b = msecs_to_jiffies(TIS_TIMEOUT_B_MAX);
>> 	chip->timeout_c = msecs_to_jiffies(TIS_TIMEOUT_C_MAX);
>> 	chip->timeout_d = msecs_to_jiffies(TIS_TIMEOUT_D_MAX);
>> +	priv->timeout_min = TPM_TIMEOUT_USECS_MIN;
>> +	priv->timeout_max = TPM_TIMEOUT_USECS_MAX;
>> 	priv->phy_ops = phy_ops;
>> +
>> +	rc = tpm_tis_read32(priv, TPM_DID_VID(0), &vendor);
>> +	if (rc < 0)
>> +		goto out_err;
>> +
>> +	priv->manufacturer_id = vendor;
>> +
>> +	if (priv->manufacturer_id == TPM_VID_ATML &&
>> +		!(chip->flags & TPM_CHIP_FLAG_TPM2)) {
>> +		/* If TPM chip is 1.2 ATMEL chip, timeout need to be relaxed*/
> 
> A ' ' character missing before the last asterisk.
> 
> Also the comment is just in English the same exact thing already
> clearly expressed by the if-statement, so it's better that you
> just remove the comment altogether.
Sure will remove it
> 
>> +		priv->timeout_min = TIS_TIMEOUT_MIN_ATML;
>> +		priv->timeout_max = TIS_TIMEOUT_MAX_ATML;
>> +	}
>> +
>> 	dev_set_drvdata(&chip->dev, priv);
>> 
>> 	if (is_bsw()) {
>> @@ -977,12 +994,6 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
>> 	if (rc)
>> 		goto out_err;
>> 
>> -	rc = tpm_tis_read32(priv, TPM_DID_VID(0), &vendor);
>> -	if (rc < 0)
>> -		goto out_err;
>> -
>> -	priv->manufacturer_id = vendor;
>> -
>> 	rc = tpm_tis_read8(priv, TPM_RID(0), &rid);
>> 	if (rc < 0)
>> 		goto out_err;
>> diff --git a/drivers/char/tpm/tpm_tis_core.h b/drivers/char/tpm/tpm_tis_core.h
>> index 9b2d32a59f67..c33f27c929f4 100644
>> --- a/drivers/char/tpm/tpm_tis_core.h
>> +++ b/drivers/char/tpm/tpm_tis_core.h
>> @@ -54,6 +54,8 @@ enum tis_defaults {
>> 	TIS_MEM_LEN = 0x5000,
>> 	TIS_SHORT_TIMEOUT = 750,	/* ms */
>> 	TIS_LONG_TIMEOUT = 2000,	/* 2 sec */
>> +	TIS_TIMEOUT_MIN_ATML = 14700,	/* usecs */
>> +	TIS_TIMEOUT_MAX_ATML = 15000,	/* usecs */
>> };
>> 
>> /* Some timeout values are needed before it is known whether the chip is
>> @@ -97,6 +99,8 @@ struct tpm_tis_data {
>> 	wait_queue_head_t read_queue;
>> 	const struct tpm_tis_phy_ops *phy_ops;
>> 	unsigned short rng_quality;
>> +	unsigned int timeout_min; /* usecs */
>> +	unsigned int timeout_max; /* usecs */
>> };
>> 
>> struct tpm_tis_phy_ops {
>> diff --git a/include/linux/tpm.h b/include/linux/tpm.h
>> index aa11fe323c56..12d827734686 100644
>> --- a/include/linux/tpm.h
>> +++ b/include/linux/tpm.h
>> @@ -269,6 +269,7 @@ enum tpm2_cc_attrs {
>> #define TPM_VID_INTEL    0x8086
>> #define TPM_VID_WINBOND  0x1050
>> #define TPM_VID_STM      0x104A
>> +#define TPM_VID_ATML     0x1114
>> 
>> enum tpm_chip_flags {
>> 	TPM_CHIP_FLAG_TPM2		= BIT(1),
> 
> Looking good other than a those minor nitpicks. Please send the next as
> a separate thread, and *not* as response, so that it can be picked up.
> 
> /Jarkko
> 
Thanks!
Hao


^ permalink raw reply	[flat|nested] 47+ messages in thread

end of thread, other threads:[~2021-09-08  8:33 UTC | newest]

Thread overview: 47+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-20 23:18 [PATCH] Fix Atmel TPM crash caused by too frequent queries Hao Wu
2021-06-23 13:35 ` Jarkko Sakkinen
2021-06-24  5:49   ` Hao Wu
2021-06-29 20:06     ` Jarkko Sakkinen
2021-06-30  4:27       ` Hao Wu
2021-06-24  5:33 ` Hao Wu
2021-06-29 20:07   ` Jarkko Sakkinen
2021-06-30  4:22   ` [PATCH] tpm: fix ATMEL " Hao Wu
2021-07-02  6:35     ` Jarkko Sakkinen
2021-07-02  7:12       ` Greg KH
2021-07-02  7:33       ` Hao Wu
2021-07-02  7:35         ` Hao Wu
2021-07-02  7:45         ` Jarkko Sakkinen
2021-07-02  7:59           ` Hao Wu
2021-07-02  8:42             ` Jarkko Sakkinen
2021-07-02 11:57               ` Jarkko Sakkinen
2021-07-02 19:16                 ` Hao Wu
2021-07-05  5:19                   ` Jarkko Sakkinen
2021-07-05  5:29                     ` Hao Wu
2021-07-04  0:07     ` Hao Wu
2021-07-05  7:15       ` Jarkko Sakkinen
2021-07-05 23:09         ` Hao Wu
2021-07-06 12:34           ` Mimi Zohar
2021-07-07  4:18             ` Hao Wu
2021-07-07  4:34               ` Hao Wu
2021-07-07  4:31     ` [PATCH v2] " Hao Wu
2021-07-07  9:24       ` Jarkko Sakkinen
2021-07-07 18:28         ` Hao Wu
2021-07-07 21:10           ` Jarkko Sakkinen
2021-07-09  4:43             ` Hao Wu
2021-07-09  4:40     ` [PATCH v2] tpm: fix Atmel " Hao Wu
2021-07-09 17:47       ` Jarkko Sakkinen
2021-07-09 19:23         ` Hao Wu
2021-07-11  7:37           ` Hao Wu
2021-07-16  5:30             ` Hao Wu
2021-07-11  7:51       ` [PATCH v3] " Hao Wu
2021-07-27  2:46         ` Jarkko Sakkinen
2021-07-27  3:40           ` Hao Wu
2021-08-14 22:25         ` [PATCH v4] " Hao Wu
2021-08-26  5:38           ` Hao Wu
2021-08-26 16:24             ` Jarkko Sakkinen
2021-08-27  0:35               ` Hao Wu
2021-09-04 21:14                 ` Hao Wu
2021-09-04 23:15                   ` Hao Wu
2021-09-05  3:51           ` [PATCH v5] " Hao Wu
2021-09-07 17:43             ` Jarkko Sakkinen
2021-09-08  8:33               ` Hao Wu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.