From: Eddie James <eajames@linux.ibm.com>
To: Joel Stanley <joel@jms.id.au>
Cc: OpenBMC Maillist <openbmc@lists.ozlabs.org>,
Mark Brown <broonie@kernel.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
linux-spi@vger.kernel.org
Subject: Re: [PATCH] spi: fsi: Implement a timeout for polling status
Date: Fri, 18 Mar 2022 09:10:47 -0500 [thread overview]
Message-ID: <acec0279-f5a2-2b0c-e044-6200e7a37e37@linux.ibm.com> (raw)
In-Reply-To: <CACPK8Xd42+NgTfS8ERagv-1GkAb8XiY8U71Q8Hz0wQ9dEUJekQ@mail.gmail.com>
On 3/17/22 23:19, Joel Stanley wrote:
> On Thu, 17 Mar 2022 at 21:14, Eddie James <eajames@linux.ibm.com> wrote:
>> The data transfer routines must poll the status register to
>> determine when more data can be shifted in or out. If the hardware
>> gets into a bad state, these polling loops may never exit. Prevent
>> this by returning an error if a timeout is exceeded.
> This makes sense. We may even want to put this code in regardless.
>
> However, I'm wondering why the code in fsi_spi_status didn't catch this.
Same, which is why I thought the problem couldn't be happening here for
a long time. See below with what I think is going on.
>
>> static int fsi_spi_status(struct fsi_spi *ctx, u64 *status, const char *dir)
>> {
>> int rc = fsi_spi_read_reg(ctx, SPI_FSI_STATUS, status);
> You mentioned the error condition is we get back 0xff. That means that
> status will be 0xffff_ffff_ffff_ffff ?
>
> Did you observe status being this value?
No, I think my observation of 0xff is not universal. I suspect that
while the CFAM is IN reset, 0xff is returned, but once it's been reset,
valid (though maybe uninitialized) data is returned. I observed a status
of 0x0001100000000000, which means the controller is idle, which makes
sense since it's been reset. So the issue occurs if we start an
operation before a CFAM reset and are waiting for it to complete during
the CFAM reset, but we also don't get any failed/invalid data FSI
operations during/after the reset (very timing dependent - the FSI
master does lock during the reset but doesn't wait afterwards for the
hardware to initialize).
>
>> if (rc)
>> return rc;
>>
>> if (*status & SPI_FSI_STATUS_ANY_ERROR) {
> I think that we're checking against 0xffe0f000.
>
>> dev_err(ctx->dev, "%s error: %016llx\n", dir, *status);
>>
>> rc = fsi_spi_reset(ctx);
>> if (rc)
>> return rc;
> Is the problem here? fsi_spi_reset writes to the clock config
> registers, but doesn't read the status.
>
> Obviously doing the writes causes a call to fsi_spi_check_status, but
> that checks the FSI2SPI bridge, not the SPI master.
>
> ...but it doesn't matter, because we're either going to return an
> error from the reset, or return EREMOTEIO, so there's no masking of
> the error.
Not sure I follow. I don't think we were hitting this path in this error
scenario. Do you think we need to check the status after a reset? It
should always be the same.
>
>> return -EREMOTEIO;
>> }
>>
>> return 0;
>> }
>
>> Signed-off-by: Eddie James <eajames@linux.ibm.com>
>> ---
>> drivers/spi/spi-fsi.c | 10 ++++++++++
>> 1 file changed, 10 insertions(+)
>>
>> diff --git a/drivers/spi/spi-fsi.c b/drivers/spi/spi-fsi.c
>> index b6c7467f0b59..d403a7a3021d 100644
>> --- a/drivers/spi/spi-fsi.c
>> +++ b/drivers/spi/spi-fsi.c
>> @@ -25,6 +25,7 @@
>>
>> #define SPI_FSI_BASE 0x70000
>> #define SPI_FSI_INIT_TIMEOUT_MS 1000
>> +#define SPI_FSI_STATUS_TIMEOUT_MS 100
> Can you add a comment (or put something in the commit message) about
> why you chose 100ms.
Hm, sure, but I chose it pretty arbitrarily. I'm not sure how to choose
something like this.
>
>> #define SPI_FSI_MAX_RX_SIZE 8
>> #define SPI_FSI_MAX_TX_SIZE 40
>>
>> @@ -299,6 +300,7 @@ static int fsi_spi_transfer_data(struct fsi_spi *ctx,
>> struct spi_transfer *transfer)
>> {
>> int rc = 0;
>> + unsigned long end;
>> u64 status = 0ULL;
>>
>> if (transfer->tx_buf) {
>> @@ -315,10 +317,14 @@ static int fsi_spi_transfer_data(struct fsi_spi *ctx,
>> if (rc)
>> return rc;
>>
>> + end = jiffies + msecs_to_jiffies(SPI_FSI_STATUS_TIMEOUT_MS);
>> do {
>> rc = fsi_spi_status(ctx, &status, "TX");
>> if (rc)
>> return rc;
>> +
>> + if (time_after(jiffies, end))
>> + return -ETIMEDOUT;
>> } while (status & SPI_FSI_STATUS_TDR_FULL);
>>
>> sent += nb;
>> @@ -329,10 +335,14 @@ static int fsi_spi_transfer_data(struct fsi_spi *ctx,
>> u8 *rx = transfer->rx_buf;
>>
>> while (transfer->len > recv) {
>> + end = jiffies + msecs_to_jiffies(SPI_FSI_STATUS_TIMEOUT_MS);
>> do {
>> rc = fsi_spi_status(ctx, &status, "RX");
>> if (rc)
>> return rc;
>> +
>> + if (time_after(jiffies, end))
>> + return -ETIMEDOUT;
>> } while (!(status & SPI_FSI_STATUS_RDR_FULL));
>>
>> rc = fsi_spi_read_reg(ctx, SPI_FSI_DATA_RX, &in);
>> --
>> 2.27.0
>>
next prev parent reply other threads:[~2022-03-18 14:11 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-03-17 21:14 [PATCH] spi: fsi: Implement a timeout for polling status Eddie James
2022-03-18 4:19 ` Joel Stanley
2022-03-18 14:10 ` Eddie James [this message]
2022-03-18 20:58 ` Mark Brown
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=acec0279-f5a2-2b0c-e044-6200e7a37e37@linux.ibm.com \
--to=eajames@linux.ibm.com \
--cc=broonie@kernel.org \
--cc=joel@jms.id.au \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-spi@vger.kernel.org \
--cc=openbmc@lists.ozlabs.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).