linux-kernel.vger.kernel.org archive mirror
* [PATCH] platform/chrome: cros_ec_proto: Send command again when timeout occurs
@ 2021-05-18  9:09 Patryk Duda
  2021-05-18  9:29 ` Greg KH
  2021-05-18 14:07 ` [PATCH v2] " Patryk Duda
  0 siblings, 2 replies; 6+ messages in thread
From: Patryk Duda @ 2021-05-18  9:09 UTC
  To: Benson Leung; +Cc: Guenter Roeck, linux-kernel, upstream, Patryk Duda, stable

Sometimes kernel is trying to probe Fingerprint MCU (FPMCU) when it
hasn't initialized SPI yet. This can happen because FPMCU is restarted
during system boot and kernel can send message in short window
eg. between sysjump to RW and SPI initialization.

Cc: <stable@vger.kernel.org> # 4.4+
Signed-off-by: Patryk Duda <pdk@semihalf.com>
---
Fingerprint MCU is rebooted during system startup by AP firmware (coreboot).
During cold boot kernel can query FPMCU in a window just after jump to RW
section of firmware but before SPI is initialized. The window was
shortened to <1ms, but it can't be eliminated completely.

Communication with FPMCU (and all devices based on EC) is bi-directional.
When kernel sends message, EC will send EC_SPI* status codes. When EC is
not able to process command one of bytes will be eg. EC_SPI_NOT_READY.
This mechanism won't work when SPI is not initialized on EC side. In fact,
buffer is filled with 0xFF bytes, so from kernel perspective device is not
responding. To avoid this problem, we can query device once again. We are
already waiting EC_MSG_DEADLINE_MS for response, so we can send command
immediately.
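
To illustrate the failure mode described above, here is a minimal sketch (not
the driver's code) of why a receive buffer filled with 0xFF carries no usable
status. The helper name rx_looks_unresponsive() and the RX_IDLE_BYTE constant
are made up for this example; only the 0xFF fill pattern comes from the
description above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name: the byte read back when the device drives nothing. */
#define RX_IDLE_BYTE 0xFF

/*
 * Return true when every received byte is 0xFF, i.e. the buffer holds
 * no status byte at all and the device looks like it is not responding.
 * An empty buffer is also reported as unresponsive.
 */
bool rx_looks_unresponsive(const uint8_t *buf, size_t len)
{
	for (size_t i = 0; i < len; i++) {
		if (buf[i] != RX_IDLE_BYTE)
			return false;
	}
	return true;
}
```

Any other byte in the buffer (for example a status code such as
EC_SPI_NOT_READY) makes the helper return false. Only the fully idle case is
treated as "no response", which is the case the retry is meant to cover.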

Best regards,
Patryk
 drivers/platform/chrome/cros_ec_proto.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/drivers/platform/chrome/cros_ec_proto.c b/drivers/platform/chrome/cros_ec_proto.c
index aa7f7aa77297..3384631d21e2 100644
--- a/drivers/platform/chrome/cros_ec_proto.c
+++ b/drivers/platform/chrome/cros_ec_proto.c
@@ -279,6 +279,18 @@ static int cros_ec_host_command_proto_query(struct cros_ec_device *ec_dev,
 	msg->insize = sizeof(struct ec_response_get_protocol_info);
 
 	ret = send_command(ec_dev, msg);
+	/*
+	 * Send command once again when timeout occurred.
+	 * Fingerprint MCU (FPMCU) is restarted during system boot which
+	 * introduces small window in which FPMCU won't respond for any
+	 * messages sent by kernel. There is no need to wait before next
+	 * attempt because we waited at least EC_MSG_DEADLINE_MS.
+	 */
+	if (ret == -ETIMEDOUT) {
+		dev_warn(ec_dev->dev,
+			 "Timeout to get response from EC. Retrying.\n");
+		ret = send_command(ec_dev, msg);
+	}
 
 	if (ret < 0) {
 		dev_dbg(ec_dev->dev,
-- 
2.31.1.751.gd2f1c929bd-goog



* Re: [PATCH] platform/chrome: cros_ec_proto: Send command again when timeout occurs
  2021-05-18  9:09 [PATCH] platform/chrome: cros_ec_proto: Send command again when timeout occurs Patryk Duda
@ 2021-05-18  9:29 ` Greg KH
  2021-05-18 11:22   ` Patryk Duda
  2021-05-18 14:07 ` [PATCH v2] " Patryk Duda
  1 sibling, 1 reply; 6+ messages in thread
From: Greg KH @ 2021-05-18  9:29 UTC
  To: Patryk Duda; +Cc: Benson Leung, Guenter Roeck, linux-kernel, upstream, stable

On Tue, May 18, 2021 at 11:09:25AM +0200, Patryk Duda wrote:
> Sometimes kernel is trying to probe Fingerprint MCU (FPMCU) when it
> hasn't initialized SPI yet. This can happen because FPMCU is restarted
> during system boot and kernel can send message in short window
> eg. between sysjump to RW and SPI initialization.
> 
> Cc: <stable@vger.kernel.org> # 4.4+
> Signed-off-by: Patryk Duda <pdk@semihalf.com>
> ---
> Fingerprint MCU is rebooted during system startup by AP firmware (coreboot).
> During cold boot kernel can query FPMCU in a window just after jump to RW
> section of firmware but before SPI is initialized. The window was
> shortened to <1ms, but it can't be eliminated completely.
> 
> Communication with FPMCU (and all devices based on EC) is bi-directional.
> When kernel sends message, EC will send EC_SPI* status codes. When EC is
> not able to process command one of bytes will be eg. EC_SPI_NOT_READY.
> This mechanism won't work when SPI is not initialized on EC side. In fact,
> buffer is filled with 0xFF bytes, so from kernel perspective device is not
> responding. To avoid this problem, we can query device once again. We are
> already waiting EC_MSG_DEADLINE_MS for response, so we can send command
> immediately.
> 
> Best regards,
> Patryk
>  drivers/platform/chrome/cros_ec_proto.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/drivers/platform/chrome/cros_ec_proto.c b/drivers/platform/chrome/cros_ec_proto.c
> index aa7f7aa77297..3384631d21e2 100644
> --- a/drivers/platform/chrome/cros_ec_proto.c
> +++ b/drivers/platform/chrome/cros_ec_proto.c
> @@ -279,6 +279,18 @@ static int cros_ec_host_command_proto_query(struct cros_ec_device *ec_dev,
>  	msg->insize = sizeof(struct ec_response_get_protocol_info);
>  
>  	ret = send_command(ec_dev, msg);
> +	/*
> +	 * Send command once again when timeout occurred.
> +	 * Fingerprint MCU (FPMCU) is restarted during system boot which
> +	 * introduces small window in which FPMCU won't respond for any
> +	 * messages sent by kernel. There is no need to wait before next
> +	 * attempt because we waited at least EC_MSG_DEADLINE_MS.
> +	 */
> +	if (ret == -ETIMEDOUT) {
> +		dev_warn(ec_dev->dev,
> +			 "Timeout to get response from EC. Retrying.\n");

If a user sees this, what can they do?  No need to spam the kernel logs,
just retry.

> +		ret = send_command(ec_dev, msg);

But wait, why just retry once?  Why not 10 times?  100?  1000?  How
about a simple loop here instead, with a "sane" number of retries as a
max.

thanks,

greg k-h


* Re: [PATCH] platform/chrome: cros_ec_proto: Send command again when timeout occurs
  2021-05-18  9:29 ` Greg KH
@ 2021-05-18 11:22   ` Patryk Duda
  2021-05-18 11:34     ` Greg KH
  0 siblings, 1 reply; 6+ messages in thread
From: Patryk Duda @ 2021-05-18 11:22 UTC
  To: Greg KH; +Cc: Benson Leung, Guenter Roeck, linux-kernel, upstream, stable

On Tue, 18 May 2021 at 11:29, Greg KH <greg@kroah.com> wrote:
>
> On Tue, May 18, 2021 at 11:09:25AM +0200, Patryk Duda wrote:
> > Sometimes kernel is trying to probe Fingerprint MCU (FPMCU) when it
> > hasn't initialized SPI yet. This can happen because FPMCU is restarted
> > during system boot and kernel can send message in short window
> > eg. between sysjump to RW and SPI initialization.
> >
> > Cc: <stable@vger.kernel.org> # 4.4+
> > Signed-off-by: Patryk Duda <pdk@semihalf.com>
> > ---
> > Fingerprint MCU is rebooted during system startup by AP firmware (coreboot).
> > During cold boot kernel can query FPMCU in a window just after jump to RW
> > section of firmware but before SPI is initialized. The window was
> > shortened to <1ms, but it can't be eliminated completely.
> >
> > Communication with FPMCU (and all devices based on EC) is bi-directional.
> > When kernel sends message, EC will send EC_SPI* status codes. When EC is
> > not able to process command one of bytes will be eg. EC_SPI_NOT_READY.
> > This mechanism won't work when SPI is not initialized on EC side. In fact,
> > buffer is filled with 0xFF bytes, so from kernel perspective device is not
> > responding. To avoid this problem, we can query device once again. We are
> > already waiting EC_MSG_DEADLINE_MS for response, so we can send command
> > immediately.
> >
> > Best regards,
> > Patryk
> >  drivers/platform/chrome/cros_ec_proto.c | 12 ++++++++++++
> >  1 file changed, 12 insertions(+)
> >
> > diff --git a/drivers/platform/chrome/cros_ec_proto.c b/drivers/platform/chrome/cros_ec_proto.c
> > index aa7f7aa77297..3384631d21e2 100644
> > --- a/drivers/platform/chrome/cros_ec_proto.c
> > +++ b/drivers/platform/chrome/cros_ec_proto.c
> > @@ -279,6 +279,18 @@ static int cros_ec_host_command_proto_query(struct cros_ec_device *ec_dev,
> >       msg->insize = sizeof(struct ec_response_get_protocol_info);
> >
> >       ret = send_command(ec_dev, msg);
> > +     /*
> > +      * Send command once again when timeout occurred.
> > +      * Fingerprint MCU (FPMCU) is restarted during system boot which
> > +      * introduces small window in which FPMCU won't respond for any
> > +      * messages sent by kernel. There is no need to wait before next
> > +      * attempt because we waited at least EC_MSG_DEADLINE_MS.
> > +      */
> > +     if (ret == -ETIMEDOUT) {
> > +             dev_warn(ec_dev->dev,
> > +                      "Timeout to get response from EC. Retrying.\n");
>
> If a user sees this, what can they do?  No need to spam the kernel logs,
> just retry.
User can do nothing about it. I will remove this in next version of patch.
>
> > +             ret = send_command(ec_dev, msg);
>
> But wait, why just retry once?  Why not 10 times?  100?  1000?  How
> about a simple loop here instead, with a "sane" number of retries as a
> max.
EC based devices are designed to respond always or return appropriate
status code when they can't process command. But this assumes that SPI is
always ready to work. It's true for Embedded Controller, but not for
Fingerprint MCU. So we can retry once, in case of sending message, when
FPMCU is in narrow window (~1ms) when SPI is not initialized.

Every send_command() call can take about 200ms when device is not
responding, so next retry will happen after 200ms, at least. If 200ms is
not enough for FPMCU to initialize SPI, it's definitely something wrong
with FPMCU.
>
> thanks,
>
> greg k-h

Best regards,
Patryk


* Re: [PATCH] platform/chrome: cros_ec_proto: Send command again when timeout occurs
  2021-05-18 11:22   ` Patryk Duda
@ 2021-05-18 11:34     ` Greg KH
  0 siblings, 0 replies; 6+ messages in thread
From: Greg KH @ 2021-05-18 11:34 UTC
  To: Patryk Duda; +Cc: Benson Leung, Guenter Roeck, linux-kernel, upstream, stable

On Tue, May 18, 2021 at 01:22:30PM +0200, Patryk Duda wrote:
> On Tue, 18 May 2021 at 11:29, Greg KH <greg@kroah.com> wrote:
> >
> > On Tue, May 18, 2021 at 11:09:25AM +0200, Patryk Duda wrote:
> > > Sometimes kernel is trying to probe Fingerprint MCU (FPMCU) when it
> > > hasn't initialized SPI yet. This can happen because FPMCU is restarted
> > > during system boot and kernel can send message in short window
> > > eg. between sysjump to RW and SPI initialization.
> > >
> > > Cc: <stable@vger.kernel.org> # 4.4+
> > > Signed-off-by: Patryk Duda <pdk@semihalf.com>
> > > ---
> > > Fingerprint MCU is rebooted during system startup by AP firmware (coreboot).
> > > During cold boot kernel can query FPMCU in a window just after jump to RW
> > > section of firmware but before SPI is initialized. The window was
> > > shortened to <1ms, but it can't be eliminated completely.
> > >
> > > Communication with FPMCU (and all devices based on EC) is bi-directional.
> > > When kernel sends message, EC will send EC_SPI* status codes. When EC is
> > > not able to process command one of bytes will be eg. EC_SPI_NOT_READY.
> > > This mechanism won't work when SPI is not initialized on EC side. In fact,
> > > buffer is filled with 0xFF bytes, so from kernel perspective device is not
> > > responding. To avoid this problem, we can query device once again. We are
> > > already waiting EC_MSG_DEADLINE_MS for response, so we can send command
> > > immediately.
> > >
> > > Best regards,
> > > Patryk
> > >  drivers/platform/chrome/cros_ec_proto.c | 12 ++++++++++++
> > >  1 file changed, 12 insertions(+)
> > >
> > > diff --git a/drivers/platform/chrome/cros_ec_proto.c b/drivers/platform/chrome/cros_ec_proto.c
> > > index aa7f7aa77297..3384631d21e2 100644
> > > --- a/drivers/platform/chrome/cros_ec_proto.c
> > > +++ b/drivers/platform/chrome/cros_ec_proto.c
> > > @@ -279,6 +279,18 @@ static int cros_ec_host_command_proto_query(struct cros_ec_device *ec_dev,
> > >       msg->insize = sizeof(struct ec_response_get_protocol_info);
> > >
> > >       ret = send_command(ec_dev, msg);
> > > +     /*
> > > +      * Send command once again when timeout occurred.
> > > +      * Fingerprint MCU (FPMCU) is restarted during system boot which
> > > +      * introduces small window in which FPMCU won't respond for any
> > > +      * messages sent by kernel. There is no need to wait before next
> > > +      * attempt because we waited at least EC_MSG_DEADLINE_MS.
> > > +      */
> > > +     if (ret == -ETIMEDOUT) {
> > > +             dev_warn(ec_dev->dev,
> > > +                      "Timeout to get response from EC. Retrying.\n");
> >
> > If a user sees this, what can they do?  No need to spam the kernel logs,
> > just retry.
> User can do nothing about it. I will remove this in next version of patch.
> >
> > > +             ret = send_command(ec_dev, msg);
> >
> > But wait, why just retry once?  Why not 10 times?  100?  1000?  How
> > about a simple loop here instead, with a "sane" number of retries as a
> > max.
> EC based devices are designed to respond always or return appropriate
> status code
> when they can't process command. But this assumes that SPI is always
> ready to work.
> It's true for Embedded Controller, but not for Fingerprint MCU. So we
> can retry once,
> in case of sending message, when FPMCU is in narrow window (~1ms) when SPI is
> not initialized.
> 
> Every send_command() call can take about 200ms when device is not responding,
> so next retry will happen after 200ms, at least. If 200ms is not
> enough for FPMCU
> to initialize SPI, it's definitely something wrong with FPMCU

Ok, then just loop for 2 for this, should make it more obvious.

thanks,

greg k-h
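
The "loop for 2" suggestion above could be sketched roughly as follows. This
is a standalone illustration, not the code that was applied: the names
PROTO_QUERY_MAX_ATTEMPTS, xfer_fn, send_with_retry(), struct fake_dev and
fake_send() are all hypothetical, and the transport is replaced by a test
double so the retry logic can be exercised outside the kernel.

```c
#include <errno.h>

/* Hypothetical cap on attempts, following the "loop for 2" suggestion. */
#define PROTO_QUERY_MAX_ATTEMPTS 2

/* Hypothetical transport callback: >= 0 on success, -errno on failure. */
typedef int (*xfer_fn)(void *ctx);

/*
 * Call xfer() up to PROTO_QUERY_MAX_ATTEMPTS times, retrying only when
 * it reports -ETIMEDOUT. Success or any other error is returned at once.
 */
int send_with_retry(xfer_fn xfer, void *ctx)
{
	int ret = -ETIMEDOUT;

	for (int i = 0; i < PROTO_QUERY_MAX_ATTEMPTS && ret == -ETIMEDOUT; i++)
		ret = xfer(ctx);

	return ret;
}

/* Test double: times out timeouts_left times, then succeeds. */
struct fake_dev {
	int timeouts_left;
	int calls;
};

int fake_send(void *ctx)
{
	struct fake_dev *dev = ctx;

	dev->calls++;
	if (dev->timeouts_left > 0) {
		dev->timeouts_left--;
		return -ETIMEDOUT;
	}
	return 0;
}
```

With a cap of 2 and roughly 200ms per unanswered attempt (the figure quoted
above), a device that never answers costs about 400ms before the error is
reported. No delay is added between attempts because each timed-out attempt
has already waited for the full deadline.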


* [PATCH v2] platform/chrome: cros_ec_proto: Send command again when timeout occurs
  2021-05-18  9:09 [PATCH] platform/chrome: cros_ec_proto: Send command again when timeout occurs Patryk Duda
  2021-05-18  9:29 ` Greg KH
@ 2021-05-18 14:07 ` Patryk Duda
  2021-07-26 23:20   ` Benson Leung
  1 sibling, 1 reply; 6+ messages in thread
From: Patryk Duda @ 2021-05-18 14:07 UTC
  To: Benson Leung; +Cc: Guenter Roeck, linux-kernel, upstream, Patryk Duda

Sometimes kernel is trying to probe Fingerprint MCU (FPMCU) when it
hasn't initialized SPI yet. This can happen because FPMCU is restarted
during system boot and kernel can send message in short window
eg. between sysjump to RW and SPI initialization.

Cc: <stable@vger.kernel.org> # 4.4+
Signed-off-by: Patryk Duda <pdk@semihalf.com>
---
Fingerprint MCU is rebooted during system startup by AP firmware (coreboot).
During cold boot kernel can query FPMCU in a window just after jump to RW
section of firmware but before SPI is initialized. The window was
shortened to <1ms, but it can't be eliminated completely.

Communication with FPMCU (and all devices based on EC) is bi-directional.
When kernel sends message, EC will send EC_SPI* status codes. When EC is
not able to process command one of bytes will be eg. EC_SPI_NOT_READY.
This mechanism won't work when SPI is not initialized on EC side. In fact,
buffer is filled with 0xFF bytes, so from kernel perspective device is not
responding. To avoid this problem, we can query device once again. We are
already waiting EC_MSG_DEADLINE_MS for response, so we can send command
immediately.

Best regards,
Patryk
v1 -> v2
- Removed message about timeout
 drivers/platform/chrome/cros_ec_proto.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/platform/chrome/cros_ec_proto.c b/drivers/platform/chrome/cros_ec_proto.c
index aa7f7aa77297..a7404d69b2d3 100644
--- a/drivers/platform/chrome/cros_ec_proto.c
+++ b/drivers/platform/chrome/cros_ec_proto.c
@@ -279,6 +279,15 @@ static int cros_ec_host_command_proto_query(struct cros_ec_device *ec_dev,
 	msg->insize = sizeof(struct ec_response_get_protocol_info);
 
 	ret = send_command(ec_dev, msg);
+	/*
+	 * Send command once again when timeout occurred.
+	 * Fingerprint MCU (FPMCU) is restarted during system boot which
+	 * introduces small window in which FPMCU won't respond for any
+	 * messages sent by kernel. There is no need to wait before next
+	 * attempt because we waited at least EC_MSG_DEADLINE_MS.
+	 */
+	if (ret == -ETIMEDOUT)
+		ret = send_command(ec_dev, msg);
 
 	if (ret < 0) {
 		dev_dbg(ec_dev->dev,
-- 
2.31.1.751.gd2f1c929bd-goog



* Re: [PATCH v2] platform/chrome: cros_ec_proto: Send command again when timeout occurs
  2021-05-18 14:07 ` [PATCH v2] " Patryk Duda
@ 2021-07-26 23:20   ` Benson Leung
  0 siblings, 0 replies; 6+ messages in thread
From: Benson Leung @ 2021-07-26 23:20 UTC
  To: Patryk Duda; +Cc: Guenter Roeck, upstream, linux-kernel, bleung, bleung


Hi Patryk,

On Tue, 18 May 2021 16:07:58 +0200, Patryk Duda wrote:
> Sometimes kernel is trying to probe Fingerprint MCU (FPMCU) when it
> hasn't initialized SPI yet. This can happen because FPMCU is restarted
> during system boot and kernel can send message in short window
> eg. between sysjump to RW and SPI initialization.

Applied, thanks!

[1/1] platform/chrome: cros_ec_proto: Send command again when timeout occurs
      commit: 3abc16af57c9939724df92fcbda296b25cc95168

Best regards,
-- 
Benson Leung
Staff Software Engineer
Chrome OS Kernel
Google Inc.
bleung@google.com
Chromium OS Project
bleung@chromium.org


