All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] tpm: Revert "tpm: fix invalid locking in NONBLOCKING mode"
@ 2020-05-26 18:32 Mario Limonciello
  2020-05-26 19:14 ` James Bottomley
  0 siblings, 1 reply; 24+ messages in thread
From: Mario Limonciello @ 2020-05-26 18:32 UTC (permalink / raw)
  To: Peter Huewe, Jarkko Sakkinen, Jason Gunthorpe
  Cc: Arnd Bergmann, Greg Kroah-Hartman, linux-integrity, linux-kernel,
	Mario Limonciello, Jeffrin Jose T, Alex Guzman

This reverts commit d23d12484307b40eea549b8a858f5fffad913897.

This commit has caused regressions for the XPS 9560 containing
a Nuvoton TPM.

As mentioned by the reporter all TPM2 commands are failing with:
  ERROR:tcti:src/tss2-tcti/tcti-device.c:290:tcti_device_receive()
  Failed to read response from fd 3, got errno 1: Operation not permitted

The reporter bisected this issue back to this commit which was
backported to stable as commit 4d6ebc4.

Cc: Jeffrin Jose T <jeffrin@rajagiritech.edu.in>
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=206275
Fixes: d23d124 ("tpm: fix invalid locking in NONBLOCKING mode")
Reported-by: Alex Guzman <alex@guzman.io>
Signed-off-by: Mario Limonciello <mario.limonciello@dell.com>
---
 drivers/char/tpm/tpm-dev-common.c | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/drivers/char/tpm/tpm-dev-common.c b/drivers/char/tpm/tpm-dev-common.c
index 87f449340202..bc9d7c7ddc01 100644
--- a/drivers/char/tpm/tpm-dev-common.c
+++ b/drivers/char/tpm/tpm-dev-common.c
@@ -61,12 +61,6 @@ static void tpm_dev_async_work(struct work_struct *work)
 
 	mutex_lock(&priv->buffer_mutex);
 	priv->command_enqueued = false;
-	ret = tpm_try_get_ops(priv->chip);
-	if (ret) {
-		priv->response_length = ret;
-		goto out;
-	}
-
 	ret = tpm_dev_transmit(priv->chip, priv->space, priv->data_buffer,
 			       sizeof(priv->data_buffer));
 	tpm_put_ops(priv->chip);
@@ -74,7 +68,6 @@ static void tpm_dev_async_work(struct work_struct *work)
 		priv->response_length = ret;
 		mod_timer(&priv->user_read_timer, jiffies + (120 * HZ));
 	}
-out:
 	mutex_unlock(&priv->buffer_mutex);
 	wake_up_interruptible(&priv->async_wait);
 }
@@ -211,7 +204,6 @@ ssize_t tpm_common_write(struct file *file, const char __user *buf,
 	if (file->f_flags & O_NONBLOCK) {
 		priv->command_enqueued = true;
 		queue_work(tpm_dev_wq, &priv->async_work);
-		tpm_put_ops(priv->chip);
 		mutex_unlock(&priv->buffer_mutex);
 		return size;
 	}
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH] tpm: Revert "tpm: fix invalid locking in NONBLOCKING mode"
  2020-05-26 18:32 [PATCH] tpm: Revert "tpm: fix invalid locking in NONBLOCKING mode" Mario Limonciello
@ 2020-05-26 19:14 ` James Bottomley
  2020-05-26 19:23   ` Mario.Limonciello
  2020-05-26 19:39   ` Tadeusz Struk
  0 siblings, 2 replies; 24+ messages in thread
From: James Bottomley @ 2020-05-26 19:14 UTC (permalink / raw)
  To: Mario Limonciello, Peter Huewe, Jarkko Sakkinen, Jason Gunthorpe
  Cc: Arnd Bergmann, Greg Kroah-Hartman, linux-integrity, linux-kernel,
	Jeffrin Jose T, Alex Guzman

On Tue, 2020-05-26 at 13:32 -0500, Mario Limonciello wrote:
> This reverts commit d23d12484307b40eea549b8a858f5fffad913897.
> 
> This commit has caused regressions for the XPS 9560 containing
> a Nuvoton TPM.

Presumably this is using the tis driver?

> As mentioned by the reporter all TPM2 commands are failing with:
>   ERROR:tcti:src/tss2-tcti/tcti-device.c:290:tcti_device_receive()
>   Failed to read response from fd 3, got errno 1: Operation not
> permitted
> 
> The reporter bisected this issue back to this commit which was
> backported to stable as commit 4d6ebc4.

I think the problem is request_locality ... for some inexplicable
reason a failure there returns -1, which is EPERM to user space.

That seems to be a bug in the async code since everything else gives a
ESPIPE error if tpm_try_get_ops fails ... at least no-one assumes it
gives back a sensible return code.

What I think is happening is that with the patch the TPM goes through a
quick sequence of request, relinquish, request, relinquish and it's the
third request which is failing (likely timing out).  Without the patch,
the patch there's only one request,relinquish cycle because the ops are
held while the async work is executed.  I have a vague recollection
that there is a problem with too many locality request in quick
succession, but I'll defer to Jason, who I think understands the
intricacies of localities better than I do.

If that's the problem, the solution looks simple enough: just move the
ops get down because the priv state is already protected by the buffer
mutex

James

---

diff --git a/drivers/char/tpm/tpm-dev-common.c b/drivers/char/tpm/tpm-dev-common.c
index 87f449340202..1784530b8387 100644
--- a/drivers/char/tpm/tpm-dev-common.c
+++ b/drivers/char/tpm/tpm-dev-common.c
@@ -189,15 +189,6 @@ ssize_t tpm_common_write(struct file *file, const char __user *buf,
 		goto out;
 	}
 
-	/* atomic tpm command send and result receive. We only hold the ops
-	 * lock during this period so that the tpm can be unregistered even if
-	 * the char dev is held open.
-	 */
-	if (tpm_try_get_ops(priv->chip)) {
-		ret = -EPIPE;
-		goto out;
-	}
-
 	priv->response_length = 0;
 	priv->response_read = false;
 	*off = 0;
@@ -211,11 +202,19 @@ ssize_t tpm_common_write(struct file *file, const char __user *buf,
 	if (file->f_flags & O_NONBLOCK) {
 		priv->command_enqueued = true;
 		queue_work(tpm_dev_wq, &priv->async_work);
-		tpm_put_ops(priv->chip);
 		mutex_unlock(&priv->buffer_mutex);
 		return size;
 	}
 
+	/* atomic tpm command send and result receive. We only hold the ops
+	 * lock during this period so that the tpm can be unregistered even if
+	 * the char dev is held open.
+	 */
+	if (tpm_try_get_ops(priv->chip)) {
+		ret = -EPIPE;
+		goto out;
+	}
+
 	ret = tpm_dev_transmit(priv->chip, priv->space, priv->data_buffer,
 			       sizeof(priv->data_buffer));
 	tpm_put_ops(priv->chip);

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* RE: [PATCH] tpm: Revert "tpm: fix invalid locking in NONBLOCKING mode"
  2020-05-26 19:14 ` James Bottomley
@ 2020-05-26 19:23   ` Mario.Limonciello
  2020-05-26 19:38     ` James Bottomley
  2020-05-27 20:15     ` Jarkko Sakkinen
  2020-05-26 19:39   ` Tadeusz Struk
  1 sibling, 2 replies; 24+ messages in thread
From: Mario.Limonciello @ 2020-05-26 19:23 UTC (permalink / raw)
  To: James.Bottomley, peterhuewe, jarkko.sakkinen, jgg
  Cc: arnd, gregkh, linux-integrity, linux-kernel, jeffrin, alex

> On Tue, 2020-05-26 at 13:32 -0500, Mario Limonciello wrote:
> > This reverts commit d23d12484307b40eea549b8a858f5fffad913897.
> >
> > This commit has caused regressions for the XPS 9560 containing
> > a Nuvoton TPM.
> 
> Presumably this is using the tis driver?

Correct.

> 
> > As mentioned by the reporter all TPM2 commands are failing with:
> >   ERROR:tcti:src/tss2-tcti/tcti-device.c:290:tcti_device_receive()
> >   Failed to read response from fd 3, got errno 1: Operation not
> > permitted
> >
> > The reporter bisected this issue back to this commit which was
> > backported to stable as commit 4d6ebc4.
> 
> I think the problem is request_locality ... for some inexplicable
> reason a failure there returns -1, which is EPERM to user space.
> 
> That seems to be a bug in the async code since everything else gives a
> ESPIPE error if tpm_try_get_ops fails ... at least no-one assumes it
> gives back a sensible return code.
> 
> What I think is happening is that with the patch the TPM goes through a
> quick sequence of request, relinquish, request, relinquish and it's the
> third request which is failing (likely timing out).  Without the patch,
> the patch there's only one request,relinquish cycle because the ops are
> held while the async work is executed.  I have a vague recollection
> that there is a problem with too many locality request in quick
> succession, but I'll defer to Jason, who I think understands the
> intricacies of localities better than I do.

Thanks, I don't pretend to understand the nuances of this particular code,
but I was hoping that the request to revert got some attention since Alex's
kernel Bugzilla and message a few months ago to linux integrity weren't.

> 
> If that's the problem, the solution looks simple enough: just move the
> ops get down because the priv state is already protected by the buffer
> mutex

Yeah, if that works for Alex's situation it certainly sounds like a better
solution than reverting this patch as this patch actually does fix a problem
reported by Jeffrin originally.

Could you propose a specific patch that Alex and Jeffrin can perhaps both try?

> 
> James
> 
> ---
> 
> diff --git a/drivers/char/tpm/tpm-dev-common.c b/drivers/char/tpm/tpm-dev-
> common.c
> index 87f449340202..1784530b8387 100644
> --- a/drivers/char/tpm/tpm-dev-common.c
> +++ b/drivers/char/tpm/tpm-dev-common.c
> @@ -189,15 +189,6 @@ ssize_t tpm_common_write(struct file *file, const char
> __user *buf,
>  		goto out;
>  	}
> 
> -	/* atomic tpm command send and result receive. We only hold the ops
> -	 * lock during this period so that the tpm can be unregistered even if
> -	 * the char dev is held open.
> -	 */
> -	if (tpm_try_get_ops(priv->chip)) {
> -		ret = -EPIPE;
> -		goto out;
> -	}
> -
>  	priv->response_length = 0;
>  	priv->response_read = false;
>  	*off = 0;
> @@ -211,11 +202,19 @@ ssize_t tpm_common_write(struct file *file, const char
> __user *buf,
>  	if (file->f_flags & O_NONBLOCK) {
>  		priv->command_enqueued = true;
>  		queue_work(tpm_dev_wq, &priv->async_work);
> -		tpm_put_ops(priv->chip);
>  		mutex_unlock(&priv->buffer_mutex);
>  		return size;
>  	}
> 
> +	/* atomic tpm command send and result receive. We only hold the ops
> +	 * lock during this period so that the tpm can be unregistered even if
> +	 * the char dev is held open.
> +	 */
> +	if (tpm_try_get_ops(priv->chip)) {
> +		ret = -EPIPE;
> +		goto out;
> +	}
> +
>  	ret = tpm_dev_transmit(priv->chip, priv->space, priv->data_buffer,
>  			       sizeof(priv->data_buffer));
>  	tpm_put_ops(priv->chip);

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] tpm: Revert "tpm: fix invalid locking in NONBLOCKING mode"
  2020-05-26 19:23   ` Mario.Limonciello
@ 2020-05-26 19:38     ` James Bottomley
  2020-05-26 22:19       ` Alex Guzman
  2020-05-27 20:09       ` Jarkko Sakkinen
  2020-05-27 20:15     ` Jarkko Sakkinen
  1 sibling, 2 replies; 24+ messages in thread
From: James Bottomley @ 2020-05-26 19:38 UTC (permalink / raw)
  To: Mario.Limonciello, peterhuewe, jarkko.sakkinen, jgg
  Cc: arnd, gregkh, linux-integrity, linux-kernel, jeffrin, alex

On Tue, 2020-05-26 at 19:23 +0000, Mario.Limonciello@dell.com wrote:
> > On Tue, 2020-05-26 at 13:32 -0500, Mario Limonciello wrote:
> > > This reverts commit d23d12484307b40eea549b8a858f5fffad913897.
> > > 
> > > This commit has caused regressions for the XPS 9560 containing
> > > a Nuvoton TPM.
> > 
> > Presumably this is using the tis driver?
> 
> Correct.
> 
> > 
> > > As mentioned by the reporter all TPM2 commands are failing with:
> > >   ERROR:tcti:src/tss2-tcti/tcti-
> > > device.c:290:tcti_device_receive()
> > >   Failed to read response from fd 3, got errno 1: Operation not
> > > permitted
> > > 
> > > The reporter bisected this issue back to this commit which was
> > > backported to stable as commit 4d6ebc4.
> > 
> > I think the problem is request_locality ... for some inexplicable
> > reason a failure there returns -1, which is EPERM to user space.
> > 
> > That seems to be a bug in the async code since everything else
> > gives a ESPIPE error if tpm_try_get_ops fails ... at least no-one
> > assumes it gives back a sensible return code.
> > 
> > What I think is happening is that with the patch the TPM goes
> > through a quick sequence of request, relinquish, request,
> > relinquish and it's the third request which is failing (likely
> > timing out).  Without the patch, the patch there's only one
> > request,relinquish cycle because the ops are held while the async
> > work is executed.  I have a vague recollection that there is a
> > problem with too many locality request in quick succession, but
> > I'll defer to Jason, who I think understands the intricacies of
> > localities better than I do.
> 
> Thanks, I don't pretend to understand the nuances of this particular
> code, but I was hoping that the request to revert got some attention
> since Alex's kernel Bugzilla and message a few months ago to linux
> integrity weren't.
> 
> > 
> > If that's the problem, the solution looks simple enough: just move
> > the ops get down because the priv state is already protected by the
> > buffer mutex
> 
> Yeah, if that works for Alex's situation it certainly sounds like a
> better solution than reverting this patch as this patch actually does
> fix a problem reported by Jeffrin originally.
> 
> Could you propose a specific patch that Alex and Jeffrin can perhaps
> both try?

Um, what's wrong with the one I originally attached and which you quote
below?  It's only compile tested, but I think it will work, if the
theory is correct.

James

> > 
> > James
> > 
> > ---
> > 
> > diff --git a/drivers/char/tpm/tpm-dev-common.c
> > b/drivers/char/tpm/tpm-dev-
> > common.c
> > index 87f449340202..1784530b8387 100644
> > --- a/drivers/char/tpm/tpm-dev-common.c
> > +++ b/drivers/char/tpm/tpm-dev-common.c
> > @@ -189,15 +189,6 @@ ssize_t tpm_common_write(struct file *file,
> > const char
> > __user *buf,
> >  		goto out;
> >  	}
> > 
> > -	/* atomic tpm command send and result receive. We only
> > hold the ops
> > -	 * lock during this period so that the tpm can be
> > unregistered even if
> > -	 * the char dev is held open.
> > -	 */
> > -	if (tpm_try_get_ops(priv->chip)) {
> > -		ret = -EPIPE;
> > -		goto out;
> > -	}
> > -
> >  	priv->response_length = 0;
> >  	priv->response_read = false;
> >  	*off = 0;
> > @@ -211,11 +202,19 @@ ssize_t tpm_common_write(struct file *file,
> > const char
> > __user *buf,
> >  	if (file->f_flags & O_NONBLOCK) {
> >  		priv->command_enqueued = true;
> >  		queue_work(tpm_dev_wq, &priv->async_work);
> > -		tpm_put_ops(priv->chip);
> >  		mutex_unlock(&priv->buffer_mutex);
> >  		return size;
> >  	}
> > 
> > +	/* atomic tpm command send and result receive. We only
> > hold the ops
> > +	 * lock during this period so that the tpm can be
> > unregistered even if
> > +	 * the char dev is held open.
> > +	 */
> > +	if (tpm_try_get_ops(priv->chip)) {
> > +		ret = -EPIPE;
> > +		goto out;
> > +	}
> > +
> >  	ret = tpm_dev_transmit(priv->chip, priv->space, priv-
> > > data_buffer,
> > 
> >  			       sizeof(priv->data_buffer));
> >  	tpm_put_ops(priv->chip);


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] tpm: Revert "tpm: fix invalid locking in NONBLOCKING mode"
  2020-05-26 19:14 ` James Bottomley
  2020-05-26 19:23   ` Mario.Limonciello
@ 2020-05-26 19:39   ` Tadeusz Struk
  2020-05-26 20:00     ` James Bottomley
  2020-05-28  0:30     ` Jarkko Sakkinen
  1 sibling, 2 replies; 24+ messages in thread
From: Tadeusz Struk @ 2020-05-26 19:39 UTC (permalink / raw)
  To: James Bottomley, Mario Limonciello, Peter Huewe, Jarkko Sakkinen,
	Jason Gunthorpe
  Cc: Arnd Bergmann, Greg Kroah-Hartman, linux-integrity, linux-kernel,
	Jeffrin Jose T, Alex Guzman

On 5/26/20 12:14 PM, James Bottomley wrote:
> +	/* atomic tpm command send and result receive. We only hold the ops
> +	 * lock during this period so that the tpm can be unregistered even if
> +	 * the char dev is held open.
> +	 */
> +	if (tpm_try_get_ops(priv->chip)) {
> +		ret = -EPIPE;
> +		goto out;
> +	}
> +
Hi James,
This won't help if the message is read by an async tcti. If the problem lies
in the chip get locality code, perhaps this could help to debug the root-cause
instead of masking it out in the upper layer code:

diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
index 2435216bd10a..da5ecd0376bf 100644
--- a/drivers/char/tpm/tpm_tis_core.c
+++ b/drivers/char/tpm/tpm_tis_core.c
@@ -202,20 +202,22 @@ static int request_locality(struct tpm_chip *chip, int l)
 		return rc;
 
 	stop = jiffies + chip->timeout_a;
+	timeout = stop - jiffies;
 
 	if (chip->flags & TPM_CHIP_FLAG_IRQ) {
 again:
 		timeout = stop - jiffies;
 		if ((long)timeout <= 0)
-			return -1;
+			goto out;
+
 		rc = wait_event_interruptible_timeout(priv->int_queue,
-						      (check_locality
-						       (chip, l)),
+						      check_locality(chip, l),
 						      timeout);
 		if (rc > 0)
 			return l;
 		if (rc == -ERESTARTSYS && freezing(current)) {
 			clear_thread_flag(TIF_SIGPENDING);
+			timeout = stop - jiffies;
 			goto again;
 		}
 	} else {
@@ -226,6 +228,10 @@ static int request_locality(struct tpm_chip *chip, int l)
 			tpm_msleep(TPM_TIMEOUT);
 		} while (time_before(jiffies, stop));
 	}
+out:
+	dev_warn(&chip->dev, "%s: failed to request locality %d after %lu ms\n",
+			 __func__, l, timeout * HZ / 1000);
+
 	return -1;
 }
 

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH] tpm: Revert "tpm: fix invalid locking in NONBLOCKING mode"
  2020-05-26 19:39   ` Tadeusz Struk
@ 2020-05-26 20:00     ` James Bottomley
  2020-05-26 21:33       ` Tadeusz Struk
  2020-05-28  0:30     ` Jarkko Sakkinen
  1 sibling, 1 reply; 24+ messages in thread
From: James Bottomley @ 2020-05-26 20:00 UTC (permalink / raw)
  To: Tadeusz Struk, Mario Limonciello, Peter Huewe, Jarkko Sakkinen,
	Jason Gunthorpe
  Cc: Arnd Bergmann, Greg Kroah-Hartman, linux-integrity, linux-kernel,
	Jeffrin Jose T, Alex Guzman

On Tue, 2020-05-26 at 12:39 -0700, Tadeusz Struk wrote:
> On 5/26/20 12:14 PM, James Bottomley wrote:
> > +	/* atomic tpm command send and result receive. We only
> > hold the ops
> > +	 * lock during this period so that the tpm can be
> > unregistered even if
> > +	 * the char dev is held open.
> > +	 */
> > +	if (tpm_try_get_ops(priv->chip)) {
> > +		ret = -EPIPE;
> > +		goto out;
> > +	}
> > +
> 
> Hi James,
> This won't help if the message is read by an async tcti.

Why not?  It moves the ops get underneath the async path, so it's now
all done in the direct read or the async read.  That seems to be more
efficient.

>  If the problem lies in the chip get locality code, perhaps this
> could help to debug the root-cause instead of masking it out in the
> upper layer code:

I don't think there is a root cause other than a TIS TPM is getting
annoyed by us cycling localities too rapidly because we don't do an
actual TPM operation between request and relinquish.  Since the first
request/relinquish seems unnecessary for the async case, moving the ops
get eliminates the problem.

James


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] tpm: Revert "tpm: fix invalid locking in NONBLOCKING mode"
  2020-05-26 20:00     ` James Bottomley
@ 2020-05-26 21:33       ` Tadeusz Struk
  2020-05-26 22:34         ` Alex Guzman
  0 siblings, 1 reply; 24+ messages in thread
From: Tadeusz Struk @ 2020-05-26 21:33 UTC (permalink / raw)
  To: James Bottomley, Mario Limonciello, Peter Huewe, Jarkko Sakkinen,
	Jason Gunthorpe
  Cc: Arnd Bergmann, Greg Kroah-Hartman, linux-integrity, linux-kernel,
	Jeffrin Jose T, Alex Guzman

On 5/26/20 1:00 PM, James Bottomley wrote:
> I don't think there is a root cause other than a TIS TPM is getting
> annoyed by us cycling localities too rapidly because we don't do an
> actual TPM operation between request and relinquish.  Since the first
> request/relinquish seems unnecessary for the async case, moving the ops
> get eliminates the problem.

Could be, so maybe we could try both patches.
More debug info on the error path won't hurt.
Thanks,
Tadeusz

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] tpm: Revert "tpm: fix invalid locking in NONBLOCKING mode"
  2020-05-26 19:38     ` James Bottomley
@ 2020-05-26 22:19       ` Alex Guzman
  2020-05-26 23:06         ` James Bottomley
  2020-05-27 20:09       ` Jarkko Sakkinen
  1 sibling, 1 reply; 24+ messages in thread
From: Alex Guzman @ 2020-05-26 22:19 UTC (permalink / raw)
  To: James Bottomley, Mario.Limonciello, peterhuewe, jarkko.sakkinen, jgg
  Cc: arnd, gregkh, linux-integrity, linux-kernel, jeffrin

[-- Attachment #1: Type: text/plain, Size: 4462 bytes --]

On Tue, 2020-05-26 at 12:38 -0700, James Bottomley wrote:
> On Tue, 2020-05-26 at 19:23 +0000, Mario.Limonciello@dell.com wrote:
> > > On Tue, 2020-05-26 at 13:32 -0500, Mario Limonciello wrote:
> > > > This reverts commit d23d12484307b40eea549b8a858f5fffad913897.
> > > > 
> > > > This commit has caused regressions for the XPS 9560 containing
> > > > a Nuvoton TPM.
> > > 
> > > Presumably this is using the tis driver?
> > 
> > Correct.
> > 
> > > > As mentioned by the reporter all TPM2 commands are failing
> > > > with:
> > > >   ERROR:tcti:src/tss2-tcti/tcti-
> > > > device.c:290:tcti_device_receive()
> > > >   Failed to read response from fd 3, got errno 1: Operation not
> > > > permitted
> > > > 
> > > > The reporter bisected this issue back to this commit which was
> > > > backported to stable as commit 4d6ebc4.
> > > 
> > > I think the problem is request_locality ... for some inexplicable
> > > reason a failure there returns -1, which is EPERM to user space.
> > > 
> > > That seems to be a bug in the async code since everything else
> > > gives a ESPIPE error if tpm_try_get_ops fails ... at least no-one
> > > assumes it gives back a sensible return code.
> > > 
> > > What I think is happening is that with the patch the TPM goes
> > > through a quick sequence of request, relinquish, request,
> > > relinquish and it's the third request which is failing (likely
> > > timing out).  Without the patch, the patch there's only one
> > > request,relinquish cycle because the ops are held while the async
> > > work is executed.  I have a vague recollection that there is a
> > > problem with too many locality request in quick succession, but
> > > I'll defer to Jason, who I think understands the intricacies of
> > > localities better than I do.
> > 
> > Thanks, I don't pretend to understand the nuances of this
> > particular
> > code, but I was hoping that the request to revert got some
> > attention
> > since Alex's kernel Bugzilla and message a few months ago to linux
> > integrity weren't.
> > 
> > > If that's the problem, the solution looks simple enough: just
> > > move
> > > the ops get down because the priv state is already protected by
> > > the
> > > buffer mutex
> > 
> > Yeah, if that works for Alex's situation it certainly sounds like a
> > better solution than reverting this patch as this patch actually
> > does
> > fix a problem reported by Jeffrin originally.
> > 
> > Could you propose a specific patch that Alex and Jeffrin can
> > perhaps
> > both try?
> 
> Um, what's wrong with the one I originally attached and which you
> quote
> below?  It's only compile tested, but I think it will work, if the
> theory is correct.
> 
> James
> 
> > > James
> > > 
> > > ---
> > > 
> > > diff --git a/drivers/char/tpm/tpm-dev-common.c
> > > b/drivers/char/tpm/tpm-dev-
> > > common.c
> > > index 87f449340202..1784530b8387 100644
> > > --- a/drivers/char/tpm/tpm-dev-common.c
> > > +++ b/drivers/char/tpm/tpm-dev-common.c
> > > @@ -189,15 +189,6 @@ ssize_t tpm_common_write(struct file *file,
> > > const char
> > > __user *buf,
> > >  		goto out;
> > >  	}
> > > 
> > > -	/* atomic tpm command send and result receive. We only
> > > hold the ops
> > > -	 * lock during this period so that the tpm can be
> > > unregistered even if
> > > -	 * the char dev is held open.
> > > -	 */
> > > -	if (tpm_try_get_ops(priv->chip)) {
> > > -		ret = -EPIPE;
> > > -		goto out;
> > > -	}
> > > -
> > >  	priv->response_length = 0;
> > >  	priv->response_read = false;
> > >  	*off = 0;
> > > @@ -211,11 +202,19 @@ ssize_t tpm_common_write(struct file *file,
> > > const char
> > > __user *buf,
> > >  	if (file->f_flags & O_NONBLOCK) {
> > >  		priv->command_enqueued = true;
> > >  		queue_work(tpm_dev_wq, &priv->async_work);
> > > -		tpm_put_ops(priv->chip);
> > >  		mutex_unlock(&priv->buffer_mutex);
> > >  		return size;
> > >  	}
> > > 
> > > +	/* atomic tpm command send and result receive. We only
> > > hold the ops
> > > +	 * lock during this period so that the tpm can be
> > > unregistered even if
> > > +	 * the char dev is held open.
> > > +	 */
> > > +	if (tpm_try_get_ops(priv->chip)) {
> > > +		ret = -EPIPE;
> > > +		goto out;
> > > +	}
> > > +
> > >  	ret = tpm_dev_transmit(priv->chip, priv->space, priv-
> > > > data_buffer,
> > > 
> > >  			       sizeof(priv->data_buffer));
> > >  	tpm_put_ops(priv->chip);

When using your patch, I get a hang when trying to use tpm2_getcap, and
dmesg shows some info.



[-- Attachment #2: buglog.txt --]
[-- Type: text/plain, Size: 3761 bytes --]

[  570.913779] BUG: unable to handle page fault for address: ffffb20001247000
[  570.913782] #PF: supervisor write access in kernel mode
[  570.913783] #PF: error_code(0x0002) - not-present page
[  570.913784] PGD 0 P4D 0 
[  570.913785] Oops: 0002 [#3] SMP PTI
[  570.913787] CPU: 6 PID: 24744 Comm: tpm2_getcap Tainted: G     UD           5.7.0-rc7+ #31
[  570.913788] Hardware name: Dell Inc. XPS 15 9560/05FFDN, BIOS 1.18.0 11/17/2019
[  570.913791] RIP: 0010:iowrite8+0x9/0x50
[  570.913792] Code: 48 c7 c2 40 43 9f 99 48 89 04 24 e8 14 a7 90 ff 0f 0b 48 8b 04 24 48 83 c4 08 c3 66 0f 1f 44 00 00 48 81 fe ff ff 03 00 76 04 <40> 88 3e c3 48 81 fe 00 00 01 00 76 07 0f b7 d6 89 f8 ee c3 8b 05
[  570.913793] RSP: 0018:ffffb1ff049d7db0 EFLAGS: 00010292
[  570.913794] RAX: ffffffff981bf520 RBX: ffffb1ff049d7df9 RCX: ffffb1ff049d7df8
[  570.913795] RDX: 0000000000000001 RSI: ffffb20001247000 RDI: 0000000000000020
[  570.913796] RBP: ffffb1ff049d7df9 R08: 0000000000000000 R09: ffff8b80de5370f0
[  570.913797] R10: 0000000000b71b00 R11: 000000000000028f R12: ffff8b80b148cda8
[  570.913797] R13: 00000000fffff000 R14: ffff8b80b148cda8 R15: ffff8b80cb44a0ba
[  570.913799] FS:  00007f78f7cd0d80(0000) GS:ffff8b80de500000(0000) knlGS:0000000000000000
[  570.913799] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  570.913800] CR2: ffffb20001247000 CR3: 0000000795618001 CR4: 00000000003606e0
[  570.913801] Call Trace:
[  570.913803]  tpm_tcg_write_bytes+0x2f/0x40
[  570.913805]  release_locality+0x49/0x220
[  570.913807]  tpm_relinquish_locality+0x1f/0x40
[  570.913808]  tpm_chip_stop+0x21/0x40
[  570.913810]  tpm_put_ops+0x9/0x30
[  570.913811]  tpm_common_write+0x179/0x190
[  570.913813]  vfs_write+0xb1/0x1a0
[  570.913815]  ksys_write+0x5a/0xd0
[  570.913816]  do_syscall_64+0x43/0x130
[  570.913819]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  570.913820] RIP: 0033:0x7f78f7e00123
[  570.913821] Code: 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 00 00 00 64 8b 04 25 18 00 00 00 85 c0 75 14 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 55 c3 0f 1f 40 00 48 83 ec 28 48 89 54 24 18
[  570.913822] RSP: 002b:00007fff724e8c68 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[  570.913823] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f78f7e00123
[  570.913824] RDX: 0000000000000016 RSI: 0000564cf24a7220 RDI: 0000000000000003
[  570.913825] RBP: 0000000000000016 R08: 00007f78f7ccc785 R09: 00007f78f7ccca40
[  570.913826] R10: 00007fff724e8b10 R11: 0000000000000246 R12: 0000564cf24a7220
[  570.913826] R13: 0000000000000000 R14: 0000000000000016 R15: 00007f78f7ccc890
[  570.913827] Modules linked in: squashfs rtsx_pci_sdmmc x86_pkg_temp_thermal coretemp rtsx_pci mfd_core
[  570.913831] CR2: ffffb20001247000
[  570.913832] ---[ end trace c84437b00f0d01a0 ]---
[  570.913833] RIP: 0010:iowrite8+0x9/0x50
[  570.913834] Code: 48 c7 c2 40 43 9f 99 48 89 04 24 e8 14 a7 90 ff 0f 0b 48 8b 04 24 48 83 c4 08 c3 66 0f 1f 44 00 00 48 81 fe ff ff 03 00 76 04 <40> 88 3e c3 48 81 fe 00 00 01 00 76 07 0f b7 d6 89 f8 ee c3 8b 05
[  570.913835] RSP: 0018:ffffb1ff030b7db0 EFLAGS: 00010292
[  570.913836] RAX: ffffffff981bf520 RBX: ffffb1ff030b7df9 RCX: ffffb1ff030b7df8
[  570.913837] RDX: 0000000000000001 RSI: ffffb20001247000 RDI: 0000000000000020
[  570.913837] RBP: ffffb1ff030b7df9 R08: 0000000000000000 R09: ffff8b80de2370f0
[  570.913838] R10: 0000000000b71b00 R11: 000000000000019c R12: ffff8b80b148cda8
[  570.913839] R13: 00000000fffff000 R14: ffff8b80b148cda8 R15: ffff8b80c4cfc0ba
[  570.913840] FS:  00007f78f7cd0d80(0000) GS:ffff8b80de500000(0000) knlGS:0000000000000000
[  570.913840] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  570.913841] CR2: ffffb20001247000 CR3: 0000000795618001 CR4: 00000000003606e0

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] tpm: Revert "tpm: fix invalid locking in NONBLOCKING mode"
  2020-05-26 21:33       ` Tadeusz Struk
@ 2020-05-26 22:34         ` Alex Guzman
  0 siblings, 0 replies; 24+ messages in thread
From: Alex Guzman @ 2020-05-26 22:34 UTC (permalink / raw)
  To: Tadeusz Struk, James Bottomley, Mario Limonciello, Peter Huewe,
	Jarkko Sakkinen, Jason Gunthorpe
  Cc: Arnd Bergmann, Greg Kroah-Hartman, linux-integrity, linux-kernel,
	Jeffrin Jose T

On Tue, 2020-05-26 at 14:33 -0700, Tadeusz Struk wrote:
> On 5/26/20 1:00 PM, James Bottomley wrote:
> > I don't think there is a root cause other than a TIS TPM is getting
> > annoyed by us cycling localities too rapidly because we don't do an
> > actual TPM operation between request and relinquish.  Since the
> > first
> > request/relinquish seems unnecessary for the async case, moving the
> > ops
> > get eliminates the problem.
> 
> Could be, so maybe we could try both patches.
> More debug info on the error path won't hurt.
> Thanks,
> Tadeusz

With your logging patch, I consistently see this message in dmesg when
tpm2_getcap fails:

tpm tpm0: request_locality: failed to request locality 0 after 750 ms


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] tpm: Revert "tpm: fix invalid locking in NONBLOCKING mode"
  2020-05-26 22:19       ` Alex Guzman
@ 2020-05-26 23:06         ` James Bottomley
  2020-05-26 23:31           ` Alex Guzman
  0 siblings, 1 reply; 24+ messages in thread
From: James Bottomley @ 2020-05-26 23:06 UTC (permalink / raw)
  To: Alex Guzman, Mario.Limonciello, peterhuewe, jarkko.sakkinen, jgg
  Cc: arnd, gregkh, linux-integrity, linux-kernel, jeffrin

On Tue, 2020-05-26 at 15:19 -0700, Alex Guzman wrote:
[...]
> When using your patch, I get a hang when trying to use tpm2_getcap,
> and dmesg shows some info.

Are you sure it's all applied?  This

> [  570.913803]  tpm_tcg_write_bytes+0x2f/0x40
> [  570.913805]  release_locality+0x49/0x220
> [  570.913807]  tpm_relinquish_locality+0x1f/0x40
> [  570.913808]  tpm_chip_stop+0x21/0x40
> [  570.913810]  tpm_put_ops+0x9/0x30
> [  570.913811]  tpm_common_write+0x179/0x190
> [  570.913813]  vfs_write+0xb1/0x1a0

Implies an unmatched tpm_put_ops() in the async write path, as though
this hunk:

> @@ -211,11 +202,19 @@ ssize_t tpm_common_write(struct file *file,
> const char __user *buf,
>         if (file->f_flags & O_NONBLOCK) {
>                 priv->command_enqueued = true;
>                 queue_work(tpm_dev_wq, &priv->async_work);
> -               tpm_put_ops(priv->chip);
>                 mutex_unlock(&priv->buffer_mutex);
>                 return size;
>         }

Is missing.  I actually booted the patch in my TPM based VM and it all
seems to work OK when I execute tpm2_getcap (I verified it's using
O_NONBLOCK) and tssgetcapability in sync mode.

James


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] tpm: Revert "tpm: fix invalid locking in NONBLOCKING mode"
  2020-05-26 23:06         ` James Bottomley
@ 2020-05-26 23:31           ` Alex Guzman
  2020-05-27  0:18             ` James Bottomley
  0 siblings, 1 reply; 24+ messages in thread
From: Alex Guzman @ 2020-05-26 23:31 UTC (permalink / raw)
  To: James Bottomley
  Cc: Mario Limonciello, Peter Huewe, Jarkko Sakkinen, Jason Gunthorpe,
	Arnd Bergmann, Greg Kroah-Hartman, linux-integrity, linux-kernel,
	Jeffrin Jose T

On Tue, May 26, 2020 at 4:06 PM James Bottomley
<James.Bottomley@hansenpartnership.com> wrote:
>
> On Tue, 2020-05-26 at 15:19 -0700, Alex Guzman wrote:
> [...]
> > When using your patch, I get a hang when trying to use tpm2_getcap,
> > and dmesg shows some info.
>
> Are you sure it's all applied?  This
>
> > [  570.913803]  tpm_tcg_write_bytes+0x2f/0x40
> > [  570.913805]  release_locality+0x49/0x220
> > [  570.913807]  tpm_relinquish_locality+0x1f/0x40
> > [  570.913808]  tpm_chip_stop+0x21/0x40
> > [  570.913810]  tpm_put_ops+0x9/0x30
> > [  570.913811]  tpm_common_write+0x179/0x190
> > [  570.913813]  vfs_write+0xb1/0x1a0
>
> Implies an unmatched tpm_put_ops() in the async write path, as though
> this hunk:
>
> > @@ -211,11 +202,19 @@ ssize_t tpm_common_write(struct file *file,
> > const char __user *buf,
> >         if (file->f_flags & O_NONBLOCK) {
> >                 priv->command_enqueued = true;
> >                 queue_work(tpm_dev_wq, &priv->async_work);
> > -               tpm_put_ops(priv->chip);
> >                 mutex_unlock(&priv->buffer_mutex);
> >                 return size;
> >         }
>
> Is missing.  I actually booted the patch in my TPM based VM and it all
> seems to work OK when I execute tpm2_getcap (I verified it's using
> O_NONBLOCK) and tssgetcapability in sync mode.
>
> James
>

Oh, I did miss that bit. The patch had issues applying for some reason
and I missed the single-line removal when I was looking at the diff.

I gave it a spin on my machine again. getcap seems to work correctly
with and without having the async config flag set for tpm2-tss. The
pkcs11 plugin seems to work correctly again too. :)

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] tpm: Revert "tpm: fix invalid locking in NONBLOCKING mode"
  2020-05-26 23:31           ` Alex Guzman
@ 2020-05-27  0:18             ` James Bottomley
  0 siblings, 0 replies; 24+ messages in thread
From: James Bottomley @ 2020-05-27  0:18 UTC (permalink / raw)
  To: Alex Guzman
  Cc: Mario Limonciello, Peter Huewe, Jarkko Sakkinen, Jason Gunthorpe,
	Arnd Bergmann, Greg Kroah-Hartman, linux-integrity, linux-kernel,
	Jeffrin Jose T

On Tue, 2020-05-26 at 16:31 -0700, Alex Guzman wrote:
> On Tue, May 26, 2020 at 4:06 PM James Bottomley
> <James.Bottomley@hansenpartnership.com> wrote:
> > 
> > On Tue, 2020-05-26 at 15:19 -0700, Alex Guzman wrote:
> > [...]
> > > When using your patch, I get a hang when trying to use
> > > tpm2_getcap, and dmesg shows some info.
> > 
> > Are you sure it's all applied?  This
> > 
> > > [  570.913803]  tpm_tcg_write_bytes+0x2f/0x40
> > > [  570.913805]  release_locality+0x49/0x220
> > > [  570.913807]  tpm_relinquish_locality+0x1f/0x40
> > > [  570.913808]  tpm_chip_stop+0x21/0x40
> > > [  570.913810]  tpm_put_ops+0x9/0x30
> > > [  570.913811]  tpm_common_write+0x179/0x190
> > > [  570.913813]  vfs_write+0xb1/0x1a0
> > 
> > Implies an unmatched tpm_put_ops() in the async write path, as
> > though this hunk:
> > 
> > > @@ -211,11 +202,19 @@ ssize_t tpm_common_write(struct file *file,
> > > const char __user *buf,
> > >         if (file->f_flags & O_NONBLOCK) {
> > >                 priv->command_enqueued = true;
> > >                 queue_work(tpm_dev_wq, &priv->async_work);
> > > -               tpm_put_ops(priv->chip);
> > >                 mutex_unlock(&priv->buffer_mutex);
> > >                 return size;
> > >         }
> > 
> > Is missing.  I actually booted the patch in my TPM based VM and it
> > all seems to work OK when I execute tpm2_getcap (I verified it's
> > using O_NONBLOCK) and tssgetcapability in sync mode.
> > 
> > James
> > 
> 
> Oh, I did miss that bit. The patch had issues applying for some
> reason and I missed the single-line removal when I was looking at the
> diff.

Sorry, that's likely my fault: I did it on top of my current TPM tree.
I'll prepare a version against the vanilla kernel with a real
changelog.

> I gave it a spin on my machine again. getcap seems to work correctly
> with and without having the async config flag set for tpm2-tss. The
> pkcs11 plugin seems to work correctly again too. :)

Great, thanks!  I'll add your tested-by to the above.

James


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] tpm: Revert "tpm: fix invalid locking in NONBLOCKING mode"
  2020-05-26 19:38     ` James Bottomley
  2020-05-26 22:19       ` Alex Guzman
@ 2020-05-27 20:09       ` Jarkko Sakkinen
  2020-05-27 20:18         ` Mario.Limonciello
  1 sibling, 1 reply; 24+ messages in thread
From: Jarkko Sakkinen @ 2020-05-27 20:09 UTC (permalink / raw)
  To: James Bottomley, Mario.Limonciello, peterhuewe, jgg
  Cc: arnd, gregkh, linux-integrity, linux-kernel, jeffrin, alex

On Tue, 2020-05-26 at 12:38 -0700, James Bottomley wrote:
> On Tue, 2020-05-26 at 19:23 +0000, Mario.Limonciello@dell.com wrote:
> > > On Tue, 2020-05-26 at 13:32 -0500, Mario Limonciello wrote:
> > > > This reverts commit d23d12484307b40eea549b8a858f5fffad913897.
> > > > 
> > > > This commit has caused regressions for the XPS 9560 containing
> > > > a Nuvoton TPM.
> > > 
> > > Presumably this is using the tis driver?
> > 
> > Correct.
> > 
> > > > As mentioned by the reporter all TPM2 commands are failing with:
> > > >   ERROR:tcti:src/tss2-tcti/tcti-
> > > > device.c:290:tcti_device_receive()
> > > >   Failed to read response from fd 3, got errno 1: Operation not
> > > > permitted
> > > > 
> > > > The reporter bisected this issue back to this commit which was
> > > > backported to stable as commit 4d6ebc4.
> > > 
> > > I think the problem is request_locality ... for some inexplicable
> > > reason a failure there returns -1, which is EPERM to user space.
> > > 
> > > That seems to be a bug in the async code since everything else
> > > gives a ESPIPE error if tpm_try_get_ops fails ... at least no-one
> > > assumes it gives back a sensible return code.
> > > 
> > > What I think is happening is that with the patch the TPM goes
> > > through a quick sequence of request, relinquish, request,
> > > relinquish and it's the third request which is failing (likely
> > > timing out).  Without the patch, the patch there's only one
> > > request,relinquish cycle because the ops are held while the async
> > > work is executed.  I have a vague recollection that there is a
> > > problem with too many locality request in quick succession, but
> > > I'll defer to Jason, who I think understands the intricacies of
> > > localities better than I do.
> > 
> > Thanks, I don't pretend to understand the nuances of this particular
> > code, but I was hoping that the request to revert got some attention
> > since Alex's kernel Bugzilla and message a few months ago to linux
> > integrity weren't.
> > 
> > > If that's the problem, the solution looks simple enough: just move
> > > the ops get down because the priv state is already protected by the
> > > buffer mutex
> > 
> > Yeah, if that works for Alex's situation it certainly sounds like a
> > better solution than reverting this patch as this patch actually does
> > fix a problem reported by Jeffrin originally.
> > 
> > Could you propose a specific patch that Alex and Jeffrin can perhaps
> > both try?
> 
> Um, what's wrong with the one I originally attached and which you quote
> below?  It's only compile tested, but I think it will work, if the
> theory is correct.

Please send a legit patch, thanks.

/Jarkko


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] tpm: Revert "tpm: fix invalid locking in NONBLOCKING mode"
  2020-05-26 19:23   ` Mario.Limonciello
  2020-05-26 19:38     ` James Bottomley
@ 2020-05-27 20:15     ` Jarkko Sakkinen
  2020-05-28  1:10       ` Alex Guzman
  1 sibling, 1 reply; 24+ messages in thread
From: Jarkko Sakkinen @ 2020-05-27 20:15 UTC (permalink / raw)
  To: Mario.Limonciello, James.Bottomley, peterhuewe, jgg
  Cc: arnd, gregkh, linux-integrity, jeffrin, alex

On Tue, 2020-05-26 at 19:23 +0000, Mario.Limonciello@dell.com wrote:
> Thanks, I don't pretend to understand the nuances of this particular code,
> but I was hoping that the request to revert got some attention since Alex's
> kernel Bugzilla and message a few months ago to linux integrity weren't.

Removing linux-kernel from CC since this subsystem internal discussion.

Seeing the whole thing first time today.

Bugzilla is the first thing to ignore when busy. It is good as place
holder for bugs, but all discussions should happen only in LKML. There's
no official requirement to proactively use Bugzilla for anything.

That said I'm happy that people put stuff there so that it gets logged.

For follow-up's use only LKML if it is important to you. Those will get
processed.

As far as this goes, if nothing is heard from me, check that you put me
as CC to the original email. Otherwise, I might have missed it (by mistake,
not by purpose).

Honestly, I'm not sure what point was this patch when there was time to
wait for months without response. Why the passivity for all this time?

/Jarkko


^ permalink raw reply	[flat|nested] 24+ messages in thread

* RE: [PATCH] tpm: Revert "tpm: fix invalid locking in NONBLOCKING mode"
  2020-05-27 20:09       ` Jarkko Sakkinen
@ 2020-05-27 20:18         ` Mario.Limonciello
  2020-05-28  0:43           ` Jarkko Sakkinen
  0 siblings, 1 reply; 24+ messages in thread
From: Mario.Limonciello @ 2020-05-27 20:18 UTC (permalink / raw)
  To: jarkko.sakkinen, James.Bottomley, peterhuewe, jgg
  Cc: arnd, gregkh, linux-integrity, linux-kernel, jeffrin, alex

> -----Original Message-----
> From: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
> Sent: Wednesday, May 27, 2020 3:09 PM
> To: James Bottomley; Limonciello, Mario; peterhuewe@gmx.de; jgg@ziepe.ca
> Cc: arnd@arndb.de; gregkh@linuxfoundation.org; linux-integrity@vger.kernel.org;
> linux-kernel@vger.kernel.org; jeffrin@rajagiritech.edu.in; alex@guzman.io
> Subject: Re: [PATCH] tpm: Revert "tpm: fix invalid locking in NONBLOCKING mode"
> 
> 
> [EXTERNAL EMAIL]
> 
> On Tue, 2020-05-26 at 12:38 -0700, James Bottomley wrote:
> > On Tue, 2020-05-26 at 19:23 +0000, Mario.Limonciello@dell.com wrote:
> > > > On Tue, 2020-05-26 at 13:32 -0500, Mario Limonciello wrote:
> > > > > This reverts commit d23d12484307b40eea549b8a858f5fffad913897.
> > > > >
> > > > > This commit has caused regressions for the XPS 9560 containing
> > > > > a Nuvoton TPM.
> > > >
> > > > Presumably this is using the tis driver?
> > >
> > > Correct.
> > >
> > > > > As mentioned by the reporter all TPM2 commands are failing with:
> > > > >   ERROR:tcti:src/tss2-tcti/tcti-
> > > > > device.c:290:tcti_device_receive()
> > > > >   Failed to read response from fd 3, got errno 1: Operation not
> > > > > permitted
> > > > >
> > > > > The reporter bisected this issue back to this commit which was
> > > > > backported to stable as commit 4d6ebc4.
> > > >
> > > > I think the problem is request_locality ... for some inexplicable
> > > > reason a failure there returns -1, which is EPERM to user space.
> > > >
> > > > That seems to be a bug in the async code since everything else
> > > > gives a ESPIPE error if tpm_try_get_ops fails ... at least no-one
> > > > assumes it gives back a sensible return code.
> > > >
> > > > What I think is happening is that with the patch the TPM goes
> > > > through a quick sequence of request, relinquish, request,
> > > > relinquish and it's the third request which is failing (likely
> > > > timing out).  Without the patch, the patch there's only one
> > > > request,relinquish cycle because the ops are held while the async
> > > > work is executed.  I have a vague recollection that there is a
> > > > problem with too many locality request in quick succession, but
> > > > I'll defer to Jason, who I think understands the intricacies of
> > > > localities better than I do.
> > >
> > > Thanks, I don't pretend to understand the nuances of this particular
> > > code, but I was hoping that the request to revert got some attention
> > > since Alex's kernel Bugzilla and message a few months ago to linux
> > > integrity weren't.
> > >
> > > > If that's the problem, the solution looks simple enough: just move
> > > > the ops get down because the priv state is already protected by the
> > > > buffer mutex
> > >
> > > Yeah, if that works for Alex's situation it certainly sounds like a
> > > better solution than reverting this patch as this patch actually does
> > > fix a problem reported by Jeffrin originally.
> > >
> > > Could you propose a specific patch that Alex and Jeffrin can perhaps
> > > both try?
> >
> > Um, what's wrong with the one I originally attached and which you quote
> > below?  It's only compile tested, but I think it will work, if the
> > theory is correct.
> 
> Please send a legit patch, thanks.
> 
> /Jarkko

Jarkko,

After the confirmation from Alex that this patch attached to the end of the thread
worked, James did send a proper patch that can be accessed here:
https://lore.kernel.org/linux-integrity/20200527155800.ya43xm2ltuwduwjg@cantor/T/#t

Thanks,

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] tpm: Revert "tpm: fix invalid locking in NONBLOCKING mode"
  2020-05-26 19:39   ` Tadeusz Struk
  2020-05-26 20:00     ` James Bottomley
@ 2020-05-28  0:30     ` Jarkko Sakkinen
  2020-05-28  4:40       ` Tadeusz Struk
  1 sibling, 1 reply; 24+ messages in thread
From: Jarkko Sakkinen @ 2020-05-28  0:30 UTC (permalink / raw)
  To: Tadeusz Struk
  Cc: James Bottomley, Mario Limonciello, Peter Huewe, Jason Gunthorpe,
	Arnd Bergmann, Greg Kroah-Hartman, linux-integrity, linux-kernel,
	Jeffrin Jose T, Alex Guzman

On Tue, May 26, 2020 at 12:39:37PM -0700, Tadeusz Struk wrote:
> On 5/26/20 12:14 PM, James Bottomley wrote:
> > +	/* atomic tpm command send and result receive. We only hold the ops
> > +	 * lock during this period so that the tpm can be unregistered even if
> > +	 * the char dev is held open.
> > +	 */
> > +	if (tpm_try_get_ops(priv->chip)) {
> > +		ret = -EPIPE;
> > +		goto out;
> > +	}
> > +
> Hi James,
> This won't help if the message is read by an async tcti. If the problem lies
> in the chip get locality code, perhaps this could help to debug the root-cause
> instead of masking it out in the upper layer code:

What is TCTI and async TCTI? Not following.

> diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
> index 2435216bd10a..da5ecd0376bf 100644
> --- a/drivers/char/tpm/tpm_tis_core.c
> +++ b/drivers/char/tpm/tpm_tis_core.c
> @@ -202,20 +202,22 @@ static int request_locality(struct tpm_chip *chip, int l)
>  		return rc;
>  
>  	stop = jiffies + chip->timeout_a;
> +	timeout = stop - jiffies;
>  
>  	if (chip->flags & TPM_CHIP_FLAG_IRQ) {
>  again:
>  		timeout = stop - jiffies;
>  		if ((long)timeout <= 0)
> -			return -1;
> +			goto out;
> +
>  		rc = wait_event_interruptible_timeout(priv->int_queue,
> -						      (check_locality
> -						       (chip, l)),
> +						      check_locality(chip, l),
>  						      timeout);
>  		if (rc > 0)
>  			return l;
>  		if (rc == -ERESTARTSYS && freezing(current)) {
>  			clear_thread_flag(TIF_SIGPENDING);
> +			timeout = stop - jiffies;
>  			goto again;
>  		}
>  	} else {
> @@ -226,6 +228,10 @@ static int request_locality(struct tpm_chip *chip, int l)
>  			tpm_msleep(TPM_TIMEOUT);
>  		} while (time_before(jiffies, stop));
>  	}
> +out:
> +	dev_warn(&chip->dev, "%s: failed to request locality %d after %lu ms\n",
> +			 __func__, l, timeout * HZ / 1000);
> +
>  	return -1;
>  }
>  

/Jarkko

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] tpm: Revert "tpm: fix invalid locking in NONBLOCKING mode"
  2020-05-27 20:18         ` Mario.Limonciello
@ 2020-05-28  0:43           ` Jarkko Sakkinen
  2020-05-28  0:59             ` Mario.Limonciello
  0 siblings, 1 reply; 24+ messages in thread
From: Jarkko Sakkinen @ 2020-05-28  0:43 UTC (permalink / raw)
  To: Mario.Limonciello
  Cc: James.Bottomley, peterhuewe, jgg, arnd, gregkh, linux-integrity,
	linux-kernel, jeffrin, alex

On Wed, May 27, 2020 at 08:18:56PM +0000, Mario.Limonciello@dell.com wrote:
> > -----Original Message-----
> > From: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
> > Sent: Wednesday, May 27, 2020 3:09 PM
> > To: James Bottomley; Limonciello, Mario; peterhuewe@gmx.de; jgg@ziepe.ca
> > Cc: arnd@arndb.de; gregkh@linuxfoundation.org; linux-integrity@vger.kernel.org;
> > linux-kernel@vger.kernel.org; jeffrin@rajagiritech.edu.in; alex@guzman.io
> > Subject: Re: [PATCH] tpm: Revert "tpm: fix invalid locking in NONBLOCKING mode"
> > 
> > 
> > [EXTERNAL EMAIL]

What is this?

> > On Tue, 2020-05-26 at 12:38 -0700, James Bottomley wrote:
> > > On Tue, 2020-05-26 at 19:23 +0000, Mario.Limonciello@dell.com wrote:
> > > > > On Tue, 2020-05-26 at 13:32 -0500, Mario Limonciello wrote:
> > > > > > This reverts commit d23d12484307b40eea549b8a858f5fffad913897.
> > > > > >
> > > > > > This commit has caused regressions for the XPS 9560 containing
> > > > > > a Nuvoton TPM.
> > > > >
> > > > > Presumably this is using the tis driver?
> > > >
> > > > Correct.
> > > >
> > > > > > As mentioned by the reporter all TPM2 commands are failing with:
> > > > > >   ERROR:tcti:src/tss2-tcti/tcti-
> > > > > > device.c:290:tcti_device_receive()
> > > > > >   Failed to read response from fd 3, got errno 1: Operation not
> > > > > > permitted
> > > > > >
> > > > > > The reporter bisected this issue back to this commit which was
> > > > > > backported to stable as commit 4d6ebc4.
> > > > >
> > > > > I think the problem is request_locality ... for some inexplicable
> > > > > reason a failure there returns -1, which is EPERM to user space.
> > > > >
> > > > > That seems to be a bug in the async code since everything else
> > > > > gives a ESPIPE error if tpm_try_get_ops fails ... at least no-one
> > > > > assumes it gives back a sensible return code.
> > > > >
> > > > > What I think is happening is that with the patch the TPM goes
> > > > > through a quick sequence of request, relinquish, request,
> > > > > relinquish and it's the third request which is failing (likely
> > > > > timing out).  Without the patch, the patch there's only one
> > > > > request,relinquish cycle because the ops are held while the async
> > > > > work is executed.  I have a vague recollection that there is a
> > > > > problem with too many locality request in quick succession, but
> > > > > I'll defer to Jason, who I think understands the intricacies of
> > > > > localities better than I do.
> > > >
> > > > Thanks, I don't pretend to understand the nuances of this particular
> > > > code, but I was hoping that the request to revert got some attention
> > > > since Alex's kernel Bugzilla and message a few months ago to linux
> > > > integrity weren't.
> > > >
> > > > > If that's the problem, the solution looks simple enough: just move
> > > > > the ops get down because the priv state is already protected by the
> > > > > buffer mutex
> > > >
> > > > Yeah, if that works for Alex's situation it certainly sounds like a
> > > > better solution than reverting this patch as this patch actually does
> > > > fix a problem reported by Jeffrin originally.
> > > >
> > > > Could you propose a specific patch that Alex and Jeffrin can perhaps
> > > > both try?
> > >
> > > Um, what's wrong with the one I originally attached and which you quote
> > > below?  It's only compile tested, but I think it will work, if the
> > > theory is correct.
> > 
> > Please send a legit patch, thanks.
> > 
> > /Jarkko
> 
> Jarkko,
> 
> After the confirmation from Alex that this patch attached to the end of the thread
> worked, James did send a proper patch that can be accessed here:
> https://lore.kernel.org/linux-integrity/20200527155800.ya43xm2ltuwduwjg@cantor/T/#t
> 
> Thanks,

Hi thanks a lot! I did read the full discussions and agree with the
conclusions as I get a patch in proper form.

Please ping next time a bit earlier. It's not that I don't want to deal
with the issues quickly as possible. It's probably just that I've forgot
something or missed.

/Jarkko

^ permalink raw reply	[flat|nested] 24+ messages in thread

* RE: [PATCH] tpm: Revert "tpm: fix invalid locking in NONBLOCKING mode"
  2020-05-28  0:43           ` Jarkko Sakkinen
@ 2020-05-28  0:59             ` Mario.Limonciello
  2020-05-28  6:53               ` Jarkko Sakkinen
  0 siblings, 1 reply; 24+ messages in thread
From: Mario.Limonciello @ 2020-05-28  0:59 UTC (permalink / raw)
  To: jarkko.sakkinen
  Cc: James.Bottomley, peterhuewe, jgg, arnd, gregkh, linux-integrity,
	linux-kernel, jeffrin, alex

> > > [EXTERNAL EMAIL]
> 
> What is this?

Something my employer's mail system automatically tags in external email.

My mistakes in forgetting to remove it on the response.

> 
> > > On Tue, 2020-05-26 at 12:38 -0700, James Bottomley wrote:
> > > > On Tue, 2020-05-26 at 19:23 +0000, Mario.Limonciello@dell.com wrote:
> > > > > > On Tue, 2020-05-26 at 13:32 -0500, Mario Limonciello wrote:
> > > > > > > This reverts commit d23d12484307b40eea549b8a858f5fffad913897.
> > > > > > >
> > > > > > > This commit has caused regressions for the XPS 9560 containing
> > > > > > > a Nuvoton TPM.
> > > > > >
> > > > > > Presumably this is using the tis driver?
> > > > >
> > > > > Correct.
> > > > >
> > > > > > > As mentioned by the reporter all TPM2 commands are failing with:
> > > > > > >   ERROR:tcti:src/tss2-tcti/tcti-
> > > > > > > device.c:290:tcti_device_receive()
> > > > > > >   Failed to read response from fd 3, got errno 1: Operation not
> > > > > > > permitted
> > > > > > >
> > > > > > > The reporter bisected this issue back to this commit which was
> > > > > > > backported to stable as commit 4d6ebc4.
> > > > > >
> > > > > > I think the problem is request_locality ... for some inexplicable
> > > > > > reason a failure there returns -1, which is EPERM to user space.
> > > > > >
> > > > > > That seems to be a bug in the async code since everything else
> > > > > > gives a ESPIPE error if tpm_try_get_ops fails ... at least no-one
> > > > > > assumes it gives back a sensible return code.
> > > > > >
> > > > > > What I think is happening is that with the patch the TPM goes
> > > > > > through a quick sequence of request, relinquish, request,
> > > > > > relinquish and it's the third request which is failing (likely
> > > > > > timing out).  Without the patch, the patch there's only one
> > > > > > request,relinquish cycle because the ops are held while the async
> > > > > > work is executed.  I have a vague recollection that there is a
> > > > > > problem with too many locality request in quick succession, but
> > > > > > I'll defer to Jason, who I think understands the intricacies of
> > > > > > localities better than I do.
> > > > >
> > > > > Thanks, I don't pretend to understand the nuances of this particular
> > > > > code, but I was hoping that the request to revert got some attention
> > > > > since Alex's kernel Bugzilla and message a few months ago to linux
> > > > > integrity weren't.
> > > > >
> > > > > > If that's the problem, the solution looks simple enough: just move
> > > > > > the ops get down because the priv state is already protected by the
> > > > > > buffer mutex
> > > > >
> > > > > Yeah, if that works for Alex's situation it certainly sounds like a
> > > > > better solution than reverting this patch as this patch actually does
> > > > > fix a problem reported by Jeffrin originally.
> > > > >
> > > > > Could you propose a specific patch that Alex and Jeffrin can perhaps
> > > > > both try?
> > > >
> > > > Um, what's wrong with the one I originally attached and which you quote
> > > > below?  It's only compile tested, but I think it will work, if the
> > > > theory is correct.
> > >
> > > Please send a legit patch, thanks.
> > >
> > > /Jarkko
> >
> > Jarkko,
> >
> > After the confirmation from Alex that this patch attached to the end of the
> thread
> > worked, James did send a proper patch that can be accessed here:
> > https://lore.kernel.org/linux-
> integrity/20200527155800.ya43xm2ltuwduwjg@cantor/T/#t
> >
> > Thanks,
> 
> Hi thanks a lot! I did read the full discussions and agree with the
> conclusions as I get a patch in proper form.
> 
> Please ping next time a bit earlier. It's not that I don't want to deal
> with the issues quickly as possible. It's probably just that I've forgot
> something or missed.
> 
> /Jarkko

Thanks!

I completely forgot about it too, it was mentioned to me right after holidays
and I forgot to follow up and see that it got sorted.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] tpm: Revert "tpm: fix invalid locking in NONBLOCKING mode"
  2020-05-27 20:15     ` Jarkko Sakkinen
@ 2020-05-28  1:10       ` Alex Guzman
  2020-05-28  6:54         ` Jarkko Sakkinen
  2020-05-28 22:33         ` James Bottomley
  0 siblings, 2 replies; 24+ messages in thread
From: Alex Guzman @ 2020-05-28  1:10 UTC (permalink / raw)
  To: Jarkko Sakkinen, Mario.Limonciello, James.Bottomley, peterhuewe, jgg
  Cc: arnd, gregkh, linux-integrity, jeffrin

On Wed, 2020-05-27 at 23:15 +0300, Jarkko Sakkinen wrote:
> On Tue, 2020-05-26 at 19:23 +0000, Mario.Limonciello@dell.com wrote:
> > Thanks, I don't pretend to understand the nuances of this
> > particular code,
> > but I was hoping that the request to revert got some attention
> > since Alex's
> > kernel Bugzilla and message a few months ago to linux integrity
> > weren't.
> 
> Removing linux-kernel from CC since this subsystem internal
> discussion.
> 
> Seeing the whole thing first time today.
> 
> Bugzilla is the first thing to ignore when busy. It is good as place
> holder for bugs, but all discussions should happen only in LKML.
> There's
> no official requirement to proactively use Bugzilla for anything.
> 
> That said I'm happy that people put stuff there so that it gets
> logged.
> 
> For follow-up's use only LKML if it is important to you. Those will
> get
> processed.
> 
> As far as this goes, if nothing is heard from me, check that you put
> me
> as CC to the original email. Otherwise, I might have missed it (by
> mistake,
> not by purpose).
> 
> Honestly, I'm not sure what point was this patch when there was time
> to
> wait for months without response. Why the passivity for all this
> time?
> 
> /Jarkko
> 

It largely went quiet because I didn't raise the issue in the mailing
list again. I pinged back in February (
https://lore.kernel.org/linux-integrity/CAJ7-PMbujee92N1f9xVF8vtXgS49qpe7qHkeWh1Z0R-Rk-Jkaw@mail.gmail.com/
) but the conversation died out and I was content to simply use the
last working kernel version and see if the bug was resolved on its own.
I raised the issue again on the bugtracker a few days ago, leading to
this follow up here. :)

- Alex


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] tpm: Revert "tpm: fix invalid locking in NONBLOCKING mode"
  2020-05-28  0:30     ` Jarkko Sakkinen
@ 2020-05-28  4:40       ` Tadeusz Struk
  2020-05-28  6:55         ` Jarkko Sakkinen
  0 siblings, 1 reply; 24+ messages in thread
From: Tadeusz Struk @ 2020-05-28  4:40 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: James Bottomley, Mario Limonciello, Peter Huewe, Jason Gunthorpe,
	Arnd Bergmann, Greg Kroah-Hartman, linux-integrity, linux-kernel,
	Jeffrin Jose T, Alex Guzman

On 5/27/20 5:30 PM, Jarkko Sakkinen wrote:
>> This won't help if the message is read by an async tcti. If the problem lies
>> in the chip get locality code, perhaps this could help to debug the root-cause
>> instead of masking it out in the upper layer code:
> What is TCTI and async TCTI? Not following.

TPM Command Transmission Interface (TCTI) as defined by TCG in
https://trustedcomputinggroup.org/resource/tss-tcti-specification/

the reason we added the O_NONBLOCK mode was to satisfy the TCG spec for async TCTI.

Thanks,
Tadeusz

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] tpm: Revert "tpm: fix invalid locking in NONBLOCKING mode"
  2020-05-28  0:59             ` Mario.Limonciello
@ 2020-05-28  6:53               ` Jarkko Sakkinen
  0 siblings, 0 replies; 24+ messages in thread
From: Jarkko Sakkinen @ 2020-05-28  6:53 UTC (permalink / raw)
  To: Mario.Limonciello
  Cc: James.Bottomley, peterhuewe, jgg, arnd, gregkh, linux-integrity,
	linux-kernel, jeffrin, alex

On Thu, May 28, 2020 at 12:59:59AM +0000, Mario.Limonciello@dell.com wrote:
> > > > [EXTERNAL EMAIL]
> > 
> > What is this?
> 
> Something my employer's mail system automatically tags in external email.
> 
> My mistakes in forgetting to remove it on the response.

NP, just asking :-)

> > > > On Tue, 2020-05-26 at 12:38 -0700, James Bottomley wrote:
> > > > > On Tue, 2020-05-26 at 19:23 +0000, Mario.Limonciello@dell.com wrote:
> > > > > > > On Tue, 2020-05-26 at 13:32 -0500, Mario Limonciello wrote:
> > > > > > > > This reverts commit d23d12484307b40eea549b8a858f5fffad913897.
> > > > > > > >
> > > > > > > > This commit has caused regressions for the XPS 9560 containing
> > > > > > > > a Nuvoton TPM.
> > > > > > >
> > > > > > > Presumably this is using the tis driver?
> > > > > >
> > > > > > Correct.
> > > > > >
> > > > > > > > As mentioned by the reporter all TPM2 commands are failing with:
> > > > > > > >   ERROR:tcti:src/tss2-tcti/tcti-
> > > > > > > > device.c:290:tcti_device_receive()
> > > > > > > >   Failed to read response from fd 3, got errno 1: Operation not
> > > > > > > > permitted
> > > > > > > >
> > > > > > > > The reporter bisected this issue back to this commit which was
> > > > > > > > backported to stable as commit 4d6ebc4.
> > > > > > >
> > > > > > > I think the problem is request_locality ... for some inexplicable
> > > > > > > reason a failure there returns -1, which is EPERM to user space.
> > > > > > >
> > > > > > > That seems to be a bug in the async code since everything else
> > > > > > > gives a ESPIPE error if tpm_try_get_ops fails ... at least no-one
> > > > > > > assumes it gives back a sensible return code.
> > > > > > >
> > > > > > > What I think is happening is that with the patch the TPM goes
> > > > > > > through a quick sequence of request, relinquish, request,
> > > > > > > relinquish and it's the third request which is failing (likely
> > > > > > > timing out).  Without the patch, the patch there's only one
> > > > > > > request,relinquish cycle because the ops are held while the async
> > > > > > > work is executed.  I have a vague recollection that there is a
> > > > > > > problem with too many locality request in quick succession, but
> > > > > > > I'll defer to Jason, who I think understands the intricacies of
> > > > > > > localities better than I do.
> > > > > >
> > > > > > Thanks, I don't pretend to understand the nuances of this particular
> > > > > > code, but I was hoping that the request to revert got some attention
> > > > > > since Alex's kernel Bugzilla and message a few months ago to linux
> > > > > > integrity weren't.
> > > > > >
> > > > > > > If that's the problem, the solution looks simple enough: just move
> > > > > > > the ops get down because the priv state is already protected by the
> > > > > > > buffer mutex
> > > > > >
> > > > > > Yeah, if that works for Alex's situation it certainly sounds like a
> > > > > > better solution than reverting this patch as this patch actually does
> > > > > > fix a problem reported by Jeffrin originally.
> > > > > >
> > > > > > Could you propose a specific patch that Alex and Jeffrin can perhaps
> > > > > > both try?
> > > > >
> > > > > Um, what's wrong with the one I originally attached and which you quote
> > > > > below?  It's only compile tested, but I think it will work, if the
> > > > > theory is correct.
> > > >
> > > > Please send a legit patch, thanks.
> > > >
> > > > /Jarkko
> > >
> > > Jarkko,
> > >
> > > After the confirmation from Alex that this patch attached to the end of the
> > thread
> > > worked, James did send a proper patch that can be accessed here:
> > > https://lore.kernel.org/linux-
> > integrity/20200527155800.ya43xm2ltuwduwjg@cantor/T/#t
> > >
> > > Thanks,
> > 
> > Hi thanks a lot! I did read the full discussions and agree with the
> > conclusions as I get a patch in proper form.
> > 
> > Please ping next time a bit earlier. It's not that I don't want to deal
> > with the issues quickly as possible. It's probably just that I've forgot
> > something or missed.
> > 
> > /Jarkko
> 
> Thanks!
> 
> I completely forgot about it too, it was mentioned to me right after holidays
> and I forgot to follow up and see that it got sorted.

Yeah, sure, lets try to get a fix landed asap :-)

/Jarkko

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] tpm: Revert "tpm: fix invalid locking in NONBLOCKING mode"
  2020-05-28  1:10       ` Alex Guzman
@ 2020-05-28  6:54         ` Jarkko Sakkinen
  2020-05-28 22:33         ` James Bottomley
  1 sibling, 0 replies; 24+ messages in thread
From: Jarkko Sakkinen @ 2020-05-28  6:54 UTC (permalink / raw)
  To: Alex Guzman
  Cc: Mario.Limonciello, James.Bottomley, peterhuewe, jgg, arnd,
	gregkh, linux-integrity, jeffrin

On Wed, May 27, 2020 at 06:10:02PM -0700, Alex Guzman wrote:
> On Wed, 2020-05-27 at 23:15 +0300, Jarkko Sakkinen wrote:
> > On Tue, 2020-05-26 at 19:23 +0000, Mario.Limonciello@dell.com wrote:
> > > Thanks, I don't pretend to understand the nuances of this
> > > particular code,
> > > but I was hoping that the request to revert got some attention
> > > since Alex's
> > > kernel Bugzilla and message a few months ago to linux integrity
> > > weren't.
> > 
> > Removing linux-kernel from CC since this subsystem internal
> > discussion.
> > 
> > Seeing the whole thing first time today.
> > 
> > Bugzilla is the first thing to ignore when busy. It is good as place
> > holder for bugs, but all discussions should happen only in LKML.
> > There's
> > no official requirement to proactively use Bugzilla for anything.
> > 
> > That said I'm happy that people put stuff there so that it gets
> > logged.
> > 
> > For follow-up's use only LKML if it is important to you. Those will
> > get
> > processed.
> > 
> > As far as this goes, if nothing is heard from me, check that you put
> > me
> > as CC to the original email. Otherwise, I might have missed it (by
> > mistake,
> > not by purpose).
> > 
> > Honestly, I'm not sure what point was this patch when there was time
> > to
> > wait for months without response. Why the passivity for all this
> > time?
> > 
> > /Jarkko
> > 
> 
> It largely went quiet because I didn't raise the issue in the mailing
> list again. I pinged back in February (
> https://lore.kernel.org/linux-integrity/CAJ7-PMbujee92N1f9xVF8vtXgS49qpe7qHkeWh1Z0R-Rk-Jkaw@mail.gmail.com/
> ) but the conversation died out and I was content to simply use the
> last working kernel version and see if the bug was resolved on its own.
> I raised the issue again on the bugtracker a few days ago, leading to
> this follow up here. :)

OK  cool and thanks a lot for reporting this!

/Jarkko

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] tpm: Revert "tpm: fix invalid locking in NONBLOCKING mode"
  2020-05-28  4:40       ` Tadeusz Struk
@ 2020-05-28  6:55         ` Jarkko Sakkinen
  0 siblings, 0 replies; 24+ messages in thread
From: Jarkko Sakkinen @ 2020-05-28  6:55 UTC (permalink / raw)
  To: Tadeusz Struk
  Cc: James Bottomley, Mario Limonciello, Peter Huewe, Jason Gunthorpe,
	Arnd Bergmann, Greg Kroah-Hartman, linux-integrity, linux-kernel,
	Jeffrin Jose T, Alex Guzman

On Wed, May 27, 2020 at 09:40:25PM -0700, Tadeusz Struk wrote:
> On 5/27/20 5:30 PM, Jarkko Sakkinen wrote:
> >> This won't help if the message is read by an async tcti. If the problem lies
> >> in the chip get locality code, perhaps this could help to debug the root-cause
> >> instead of masking it out in the upper layer code:
> > What is TCTI and async TCTI? Not following.
> 
> TPM Command Transmission Interface (TCTI) as defined by TCG in
> https://trustedcomputinggroup.org/resource/tss-tcti-specification/
> 
> the reason we added the O_NONBLOCK mode was to satisfy the TCG spec for async TCTI.
> 
> Thanks,
> Tadeusz

OK, thanks recalling.

/Jarkko

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] tpm: Revert "tpm: fix invalid locking in NONBLOCKING mode"
  2020-05-28  1:10       ` Alex Guzman
  2020-05-28  6:54         ` Jarkko Sakkinen
@ 2020-05-28 22:33         ` James Bottomley
  1 sibling, 0 replies; 24+ messages in thread
From: James Bottomley @ 2020-05-28 22:33 UTC (permalink / raw)
  To: Alex Guzman, Jarkko Sakkinen, Mario.Limonciello, peterhuewe, jgg
  Cc: arnd, gregkh, linux-integrity, jeffrin

On Wed, 2020-05-27 at 18:10 -0700, Alex Guzman wrote:
> On Wed, 2020-05-27 at 23:15 +0300, Jarkko Sakkinen wrote:
> > On Tue, 2020-05-26 at 19:23 +0000, Mario.Limonciello@dell.com
> > wrote:
> > > Thanks, I don't pretend to understand the nuances of this
> > > particular code, but I was hoping that the request to revert got
> > > some attention since Alex's kernel Bugzilla and message a few
> > > months ago to linux integrity weren't.
> > 
> > Removing linux-kernel from CC since this subsystem internal
> > discussion.
> > 
> > Seeing the whole thing first time today.
> > 
> > Bugzilla is the first thing to ignore when busy. It is good as
> > place holder for bugs, but all discussions should happen only in
> > LKML. There's no official requirement to proactively use Bugzilla
> > for anything.
> > 
> > That said I'm happy that people put stuff there so that it gets
> > logged.
> > 
> > For follow-up's use only LKML if it is important to you. Those will
> > get processed.
> > 
> > As far as this goes, if nothing is heard from me, check that you
> > put me as CC to the original email. Otherwise, I might have missed
> > it (by mistake, not by purpose).
> > 
> > Honestly, I'm not sure what point was this patch when there was
> > time to wait for months without response. Why the passivity for all
> > this time?
> > 
> > /Jarkko
> > 
> 
> It largely went quiet because I didn't raise the issue in the mailing
> list again. I pinged back in February (
> https://lore.kernel.org/linux-integrity/CAJ7-
> PMbujee92N1f9xVF8vtXgS49qpe7qHkeWh1Z0R-Rk-Jkaw@mail.gmail.com/) but
> the conversation died out and I was content to simply use the
> last working kernel version and see if the bug was resolved on its
> own.

I think its just a state of knowledge problem: back in February I
didn't know how unusual EPERM errors are in the TPM so the issue just
flew by as a "this is a curious issue with an O_NONBLOCK path" thing,
but thanks to some key stuff I've been doing I now do.  So this time
your EPERM struck me as "that's impossible surely" which is why I dug
into the code to find out where it was coming from ... and sure enough,
it was impossible: it was an untranslated failure return, but at least
it accidentally told me exactly what the real error was.

So the upshot is you got lucky this time around ...

James

> I raised the issue again on the bugtracker a few days ago, leading to
> this follow up here. :)

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2020-05-28 22:33 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-26 18:32 [PATCH] tpm: Revert "tpm: fix invalid locking in NONBLOCKING mode" Mario Limonciello
2020-05-26 19:14 ` James Bottomley
2020-05-26 19:23   ` Mario.Limonciello
2020-05-26 19:38     ` James Bottomley
2020-05-26 22:19       ` Alex Guzman
2020-05-26 23:06         ` James Bottomley
2020-05-26 23:31           ` Alex Guzman
2020-05-27  0:18             ` James Bottomley
2020-05-27 20:09       ` Jarkko Sakkinen
2020-05-27 20:18         ` Mario.Limonciello
2020-05-28  0:43           ` Jarkko Sakkinen
2020-05-28  0:59             ` Mario.Limonciello
2020-05-28  6:53               ` Jarkko Sakkinen
2020-05-27 20:15     ` Jarkko Sakkinen
2020-05-28  1:10       ` Alex Guzman
2020-05-28  6:54         ` Jarkko Sakkinen
2020-05-28 22:33         ` James Bottomley
2020-05-26 19:39   ` Tadeusz Struk
2020-05-26 20:00     ` James Bottomley
2020-05-26 21:33       ` Tadeusz Struk
2020-05-26 22:34         ` Alex Guzman
2020-05-28  0:30     ` Jarkko Sakkinen
2020-05-28  4:40       ` Tadeusz Struk
2020-05-28  6:55         ` Jarkko Sakkinen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.