From mboxrd@z Thu Jan 1 00:00:00 1970 From: Shawn N Subject: Re: [PATCH v3] platform/chrome: Use proper protocol transfer function Date: Tue, 19 Sep 2017 10:03:03 -0700 Message-ID: References: <20170908205011.77986-1-briannorris@chromium.org> <02aa65e7-e967-055b-2af3-2e9b6ef77935@nvidia.com> Mime-Version: 1.0 Content-Type: text/plain; charset="UTF-8" Return-path: In-Reply-To: Sender: linux-tegra-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org To: Jon Hunter Cc: Brian Norris , Olof Johansson , Benson Leung , Lee Jones , linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Doug Anderson , Brian Norris , Javier Martinez Canillas , Gwendal Grignou , Enric Balletbo , Tomeu Vizoso , "linux-tegra-u79uwXL29TY76Z2rM5mHXA@public.gmane.org" List-Id: linux-tegra@vger.kernel.org On Tue, Sep 19, 2017 at 9:39 AM, Jon Hunter wrote: > > > On 19/09/17 15:09, Shawn N wrote: >> On Tue, Sep 19, 2017 at 6:44 AM, Jon Hunter wrote: >>> >>> Hi Brian, >>> >>> On 08/09/17 21:50, Brian Norris wrote: >>>> From: Shawn Nematbakhsh >>>> >>>> pkt_xfer should be used for protocol v3, and cmd_xfer otherwise. We had >>>> one instance of these functions correct, but not the second, fall-back >>>> case. We use the fall-back only when the first command returns an >>>> IN_PROGRESS status, which is only used on some EC firmwares where we >>>> don't want to constantly poll the bus, but instead back off and >>>> sleep/retry for a little while. >>>> >>>> Fixes: 2c7589af3c4d ("mfd: cros_ec: add proto v3 skeleton") >>>> Signed-off-by: Shawn Nematbakhsh >>>> Signed-off-by: Brian Norris >>>> Reviewed-by: Javier Martinez Canillas >>>> --- >>>> v3: >>>> * Added Javier's reviewed tag >>>> * It's been > 8 months since [1], so why not? And hey, Benson's officially in >>>> MAINTAINERS now! Too bad no one told me. >>>> >>>> [1] https://patchwork.kernel.org/patch/9450633/ >>>> >>>> >>>> v2: >>>> * Add Benson in 'To:' >>>> * make subject prefix more obvious >>>> >>>> drivers/platform/chrome/cros_ec_proto.c | 8 +++++--- >>>> 1 file changed, 5 insertions(+), 3 deletions(-) >>>> >>>> diff --git a/drivers/platform/chrome/cros_ec_proto.c b/drivers/platform/chrome/cros_ec_proto.c >>>> index 8dfa7fcb1248..e7bbdf947bbc 100644 >>>> --- a/drivers/platform/chrome/cros_ec_proto.c >>>> +++ b/drivers/platform/chrome/cros_ec_proto.c >>>> @@ -60,12 +60,14 @@ static int send_command(struct cros_ec_device *ec_dev, >>>> struct cros_ec_command *msg) >>>> { >>>> int ret; >>>> + int (*xfer_fxn)(struct cros_ec_device *ec, struct cros_ec_command *msg); >>>> >>>> if (ec_dev->proto_version > 2) >>>> - ret = ec_dev->pkt_xfer(ec_dev, msg); >>>> + xfer_fxn = ec_dev->pkt_xfer; >>>> else >>>> - ret = ec_dev->cmd_xfer(ec_dev, msg); >>>> + xfer_fxn = ec_dev->cmd_xfer;u >>>> >>>> + ret = (*xfer_fxn)(ec_dev, msg); >>>> if (msg->result == EC_RES_IN_PROGRESS) { >>>> int i; >>>> struct cros_ec_command *status_msg; >>>> @@ -88,7 +90,7 @@ static int send_command(struct cros_ec_device *ec_dev, >>>> for (i = 0; i < EC_COMMAND_RETRIES; i++) { >>>> usleep_range(10000, 11000); >>>> >>>> - ret = ec_dev->cmd_xfer(ec_dev, status_msg); >>>> + ret = (*xfer_fxn)(ec_dev, status_msg); >>>> if (ret < 0) >>>> break; >>>> >>> >>> Tegra124 Nyan-Big is currently crashing during boot with -next [0] and >>> bisect is pointing to this commit. Reverting the above on top of -next >>> does allow the board to boot successfully. Looks like this board is >>> proto_version 3 but I have not looked into this any further. Let me know >>> if you have any thoughts. >> >> >> Thanks for the bug report, I'll look into this today. >> >>> [ 1.502497] kernel BUG at drivers/platform/chrome/cros_ec_proto.c:34! >>> 34 BUG_ON(ec_dev->proto_version != EC_HOST_REQUEST_VERSION); >> >> So, ec_dev->proto_version > 3? That doesn't seem right. > > You mean != 3, but yes. Looks like an initialisation problem, because if I > add the following WARNING ... I meant > 3 because we check for > 2 in send_command(). > > diff --git a/drivers/platform/chrome/cros_ec_proto.c b/drivers/platform/chrome/cros_ec_proto.c > index e7bbdf947bbc..ad3b3a1e8d54 100644 > --- a/drivers/platform/chrome/cros_ec_proto.c > +++ b/drivers/platform/chrome/cros_ec_proto.c > @@ -31,6 +31,7 @@ static int prepare_packet(struct cros_ec_device *ec_dev, > int i; > u8 csum = 0; > > + WARN(ec_dev->proto_version != EC_HOST_REQUEST_VERSION, "%d != %d", ec_dev->proto_version, EC_HOST_REQUEST_VERSION); > BUG_ON(ec_dev->proto_version != EC_HOST_REQUEST_VERSION); > BUG_ON(msg->outsize + sizeof(*request) > ec_dev->dout_size); > > ... then I see ... > > [ 1.502495] WARNING: CPU: 0 PID: 1 at drivers/platform/chrome/cros_ec_proto.c:35 cros_ec_prepare_tx+0x190/0x1a8 > [ 1.512566] 65535 != 3 > > Any chance this is being called before the version is initialised? It's initialized in cros_ec_query_all(). Considering your trace shows (send_command+0x20/0xd8) as a caller, I'm guessing that we die on the first call to (*xfer_fxn): + ret = (*xfer_fxn)(ec_dev, msg); That part of the change should be a NOP, I only added a function pointer so we wouldn't have to re-check protocol_version later. The syntax looks fine to me even after re-checking, but maybe I missed something. Let me test on my side. > > Cheers > Jon > > -- > nvpublic From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751361AbdISRDK (ORCPT ); Tue, 19 Sep 2017 13:03:10 -0400 Received: from mail-wm0-f41.google.com ([74.125.82.41]:50652 "EHLO mail-wm0-f41.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750892AbdISRDH (ORCPT ); Tue, 19 Sep 2017 13:03:07 -0400 X-Google-Smtp-Source: AOwi7QD6iz2MYUXdhSJk2hNeXTS6dUqyI+gBbwpnnIhmKB8pYmVkm1E9VkhhVcMD+naipE+JOLnv3A7UwqFK4TFCly4= MIME-Version: 1.0 In-Reply-To: References: <20170908205011.77986-1-briannorris@chromium.org> <02aa65e7-e967-055b-2af3-2e9b6ef77935@nvidia.com> From: Shawn N Date: Tue, 19 Sep 2017 10:03:03 -0700 X-Google-Sender-Auth: BHTGkiw2bvshtSq-x118HokwxM8 Message-ID: Subject: Re: [PATCH v3] platform/chrome: Use proper protocol transfer function To: Jon Hunter Cc: Brian Norris , Olof Johansson , Benson Leung , Lee Jones , linux-kernel@vger.kernel.org, Doug Anderson , Brian Norris , Javier Martinez Canillas , Gwendal Grignou , Enric Balletbo , Tomeu Vizoso , "linux-tegra@vger.kernel.org" Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Sep 19, 2017 at 9:39 AM, Jon Hunter wrote: > > > On 19/09/17 15:09, Shawn N wrote: >> On Tue, Sep 19, 2017 at 6:44 AM, Jon Hunter wrote: >>> >>> Hi Brian, >>> >>> On 08/09/17 21:50, Brian Norris wrote: >>>> From: Shawn Nematbakhsh >>>> >>>> pkt_xfer should be used for protocol v3, and cmd_xfer otherwise. We had >>>> one instance of these functions correct, but not the second, fall-back >>>> case. We use the fall-back only when the first command returns an >>>> IN_PROGRESS status, which is only used on some EC firmwares where we >>>> don't want to constantly poll the bus, but instead back off and >>>> sleep/retry for a little while. >>>> >>>> Fixes: 2c7589af3c4d ("mfd: cros_ec: add proto v3 skeleton") >>>> Signed-off-by: Shawn Nematbakhsh >>>> Signed-off-by: Brian Norris >>>> Reviewed-by: Javier Martinez Canillas >>>> --- >>>> v3: >>>> * Added Javier's reviewed tag >>>> * It's been > 8 months since [1], so why not? And hey, Benson's officially in >>>> MAINTAINERS now! Too bad no one told me. >>>> >>>> [1] https://patchwork.kernel.org/patch/9450633/ >>>> >>>> >>>> v2: >>>> * Add Benson in 'To:' >>>> * make subject prefix more obvious >>>> >>>> drivers/platform/chrome/cros_ec_proto.c | 8 +++++--- >>>> 1 file changed, 5 insertions(+), 3 deletions(-) >>>> >>>> diff --git a/drivers/platform/chrome/cros_ec_proto.c b/drivers/platform/chrome/cros_ec_proto.c >>>> index 8dfa7fcb1248..e7bbdf947bbc 100644 >>>> --- a/drivers/platform/chrome/cros_ec_proto.c >>>> +++ b/drivers/platform/chrome/cros_ec_proto.c >>>> @@ -60,12 +60,14 @@ static int send_command(struct cros_ec_device *ec_dev, >>>> struct cros_ec_command *msg) >>>> { >>>> int ret; >>>> + int (*xfer_fxn)(struct cros_ec_device *ec, struct cros_ec_command *msg); >>>> >>>> if (ec_dev->proto_version > 2) >>>> - ret = ec_dev->pkt_xfer(ec_dev, msg); >>>> + xfer_fxn = ec_dev->pkt_xfer; >>>> else >>>> - ret = ec_dev->cmd_xfer(ec_dev, msg); >>>> + xfer_fxn = ec_dev->cmd_xfer;u >>>> >>>> + ret = (*xfer_fxn)(ec_dev, msg); >>>> if (msg->result == EC_RES_IN_PROGRESS) { >>>> int i; >>>> struct cros_ec_command *status_msg; >>>> @@ -88,7 +90,7 @@ static int send_command(struct cros_ec_device *ec_dev, >>>> for (i = 0; i < EC_COMMAND_RETRIES; i++) { >>>> usleep_range(10000, 11000); >>>> >>>> - ret = ec_dev->cmd_xfer(ec_dev, status_msg); >>>> + ret = (*xfer_fxn)(ec_dev, status_msg); >>>> if (ret < 0) >>>> break; >>>> >>> >>> Tegra124 Nyan-Big is currently crashing during boot with -next [0] and >>> bisect is pointing to this commit. Reverting the above on top of -next >>> does allow the board to boot successfully. Looks like this board is >>> proto_version 3 but I have not looked into this any further. Let me know >>> if you have any thoughts. >> >> >> Thanks for the bug report, I'll look into this today. >> >>> [ 1.502497] kernel BUG at drivers/platform/chrome/cros_ec_proto.c:34! >>> 34 BUG_ON(ec_dev->proto_version != EC_HOST_REQUEST_VERSION); >> >> So, ec_dev->proto_version > 3? That doesn't seem right. > > You mean != 3, but yes. Looks like an initialisation problem, because if I > add the following WARNING ... I meant > 3 because we check for > 2 in send_command(). > > diff --git a/drivers/platform/chrome/cros_ec_proto.c b/drivers/platform/chrome/cros_ec_proto.c > index e7bbdf947bbc..ad3b3a1e8d54 100644 > --- a/drivers/platform/chrome/cros_ec_proto.c > +++ b/drivers/platform/chrome/cros_ec_proto.c > @@ -31,6 +31,7 @@ static int prepare_packet(struct cros_ec_device *ec_dev, > int i; > u8 csum = 0; > > + WARN(ec_dev->proto_version != EC_HOST_REQUEST_VERSION, "%d != %d", ec_dev->proto_version, EC_HOST_REQUEST_VERSION); > BUG_ON(ec_dev->proto_version != EC_HOST_REQUEST_VERSION); > BUG_ON(msg->outsize + sizeof(*request) > ec_dev->dout_size); > > ... then I see ... > > [ 1.502495] WARNING: CPU: 0 PID: 1 at drivers/platform/chrome/cros_ec_proto.c:35 cros_ec_prepare_tx+0x190/0x1a8 > [ 1.512566] 65535 != 3 > > Any chance this is being called before the version is initialised? It's initialized in cros_ec_query_all(). Considering your trace shows (send_command+0x20/0xd8) as a caller, I'm guessing that we die on the first call to (*xfer_fxn): + ret = (*xfer_fxn)(ec_dev, msg); That part of the change should be a NOP, I only added a function pointer so we wouldn't have to re-check protocol_version later. The syntax looks fine to me even after re-checking, but maybe I missed something. Let me test on my side. > > Cheers > Jon > > -- > nvpublic