On Tue, 2020-05-26 at 12:38 -0700, James Bottomley wrote: > On Tue, 2020-05-26 at 19:23 +0000, Mario.Limonciello@dell.com wrote: > > > On Tue, 2020-05-26 at 13:32 -0500, Mario Limonciello wrote: > > > > This reverts commit d23d12484307b40eea549b8a858f5fffad913897. > > > > > > > > This commit has caused regressions for the XPS 9560 containing > > > > a Nuvoton TPM. > > > > > > Presumably this is using the tis driver? > > > > Correct. > > > > > > As mentioned by the reporter all TPM2 commands are failing > > > > with: > > > > ERROR:tcti:src/tss2-tcti/tcti- > > > > device.c:290:tcti_device_receive() > > > > Failed to read response from fd 3, got errno 1: Operation not > > > > permitted > > > > > > > > The reporter bisected this issue back to this commit which was > > > > backported to stable as commit 4d6ebc4. > > > > > > I think the problem is request_locality ... for some inexplicable > > > reason a failure there returns -1, which is EPERM to user space. > > > > > > That seems to be a bug in the async code since everything else > > > gives a ESPIPE error if tpm_try_get_ops fails ... at least no-one > > > assumes it gives back a sensible return code. > > > > > > What I think is happening is that with the patch the TPM goes > > > through a quick sequence of request, relinquish, request, > > > relinquish and it's the third request which is failing (likely > > > timing out). Without the patch, the patch there's only one > > > request,relinquish cycle because the ops are held while the async > > > work is executed. I have a vague recollection that there is a > > > problem with too many locality request in quick succession, but > > > I'll defer to Jason, who I think understands the intricacies of > > > localities better than I do. > > > > Thanks, I don't pretend to understand the nuances of this > > particular > > code, but I was hoping that the request to revert got some > > attention > > since Alex's kernel Bugzilla and message a few months ago to linux > > integrity weren't. > > > > > If that's the problem, the solution looks simple enough: just > > > move > > > the ops get down because the priv state is already protected by > > > the > > > buffer mutex > > > > Yeah, if that works for Alex's situation it certainly sounds like a > > better solution than reverting this patch as this patch actually > > does > > fix a problem reported by Jeffrin originally. > > > > Could you propose a specific patch that Alex and Jeffrin can > > perhaps > > both try? > > Um, what's wrong with the one I originally attached and which you > quote > below? It's only compile tested, but I think it will work, if the > theory is correct. > > James > > > > James > > > > > > --- > > > > > > diff --git a/drivers/char/tpm/tpm-dev-common.c > > > b/drivers/char/tpm/tpm-dev- > > > common.c > > > index 87f449340202..1784530b8387 100644 > > > --- a/drivers/char/tpm/tpm-dev-common.c > > > +++ b/drivers/char/tpm/tpm-dev-common.c > > > @@ -189,15 +189,6 @@ ssize_t tpm_common_write(struct file *file, > > > const char > > > __user *buf, > > > goto out; > > > } > > > > > > - /* atomic tpm command send and result receive. We only > > > hold the ops > > > - * lock during this period so that the tpm can be > > > unregistered even if > > > - * the char dev is held open. > > > - */ > > > - if (tpm_try_get_ops(priv->chip)) { > > > - ret = -EPIPE; > > > - goto out; > > > - } > > > - > > > priv->response_length = 0; > > > priv->response_read = false; > > > *off = 0; > > > @@ -211,11 +202,19 @@ ssize_t tpm_common_write(struct file *file, > > > const char > > > __user *buf, > > > if (file->f_flags & O_NONBLOCK) { > > > priv->command_enqueued = true; > > > queue_work(tpm_dev_wq, &priv->async_work); > > > - tpm_put_ops(priv->chip); > > > mutex_unlock(&priv->buffer_mutex); > > > return size; > > > } > > > > > > + /* atomic tpm command send and result receive. We only > > > hold the ops > > > + * lock during this period so that the tpm can be > > > unregistered even if > > > + * the char dev is held open. > > > + */ > > > + if (tpm_try_get_ops(priv->chip)) { > > > + ret = -EPIPE; > > > + goto out; > > > + } > > > + > > > ret = tpm_dev_transmit(priv->chip, priv->space, priv- > > > > data_buffer, > > > > > > sizeof(priv->data_buffer)); > > > tpm_put_ops(priv->chip); When using your patch, I get a hang when trying to use tpm2_getcap, and dmesg shows some info.