From: Heyi Guo <guoheyi@linux.alibaba.com>
To: Lei Yu <yulei.sh@bytedance.com>
Cc: Joel Stanley <joel@jms.id.au>,
	linux-aspeed <linux-aspeed@lists.ozlabs.org>,
	Andrew Jeffery <andrew@aj.id.au>,
	OpenBMC Maillist <openbmc@lists.ozlabs.org>,
	Brendan Higgins <brendanhiggins@google.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	"open list:I2C SUBSYSTEM HOST DRIVERS"
	<linux-i2c@vger.kernel.org>,
	Philipp Zabel <p.zabel@pengutronix.de>,
	Linux ARM <linux-arm-kernel@lists.infradead.org>
Subject: Re: [PATCH] drivers/i2c-aspeed: avoid invalid memory reference after timeout
Date: Wed, 1 Jun 2022 09:59:58 +0800
Message-ID: <374237cb-1cda-df12-eb9f-7422cab51fc4@linux.alibaba.com>
In-Reply-To: <CAGm54UFUxNpwKjQyQnqtbys_nfgx2KcEEJt3-0nJWYjyjM9pvw@mail.gmail.com>

Thanks for your feedback :)

On 2022/5/31 5:35 PM, Lei Yu wrote:
> I hit a similar problem that has a slightly different backtrace on a
> malfunctioning device.
> https://pastebin.com/TiWdkdrG
>
> With this patch, the kernel panic is gone and the kernel prints the logs below instead:
>
>   aspeed-i2c-bus 1e78a180.i2c-bus: bus in unknown state. irq_status: 0x1
>   aspeed-i2c-bus 1e78a180.i2c-bus: irq handled != irq. expected 0x00000001, but was 0x00000000
>   aspeed-i2c-bus 1e78a180.i2c-bus: bus in unknown state. irq_status: 0x10
>   aspeed-i2c-bus 1e78a180.i2c-bus: irq handled != irq. expected 0x00000010, but was 0x00000000
>
> So I think this patch is good in that it prevents the kernel panic.
>
> On Wed, Jan 19, 2022 at 11:00 AM Heyi Guo <guoheyi@linux.alibaba.com> wrote:
>>
>> On 2022/1/17 2:38 PM, Joel Stanley wrote:
>>> On Fri, 14 Jan 2022 at 14:01, Heyi Guo <guoheyi@linux.alibaba.com> wrote:
>>>> Hi Joel,
>>>>
>>>>
>>>> On 2022/1/11 6:51 PM, Joel Stanley wrote:
>>>>> On Tue, 11 Jan 2022 at 07:52, Heyi Guo <guoheyi@linux.alibaba.com> wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> Any comments?
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Heyi
>>>>>>
>>>>>> On 2022/1/9 9:26 PM, Heyi Guo wrote:
>>>>>>> The caller may free the message buffers when a transfer times out. If
>>>>>>> the peer device then responds after the timeout and triggers the
>>>>>>> interrupt handler of the aspeed i2c driver, the handler references the
>>>>>>> freed memory and the kernel panics.
>>>>>>>
>>>>>>> Set the msgs pointer to NULL on timeout to avoid the invalid memory
>>>>>>> reference and fix this potential kernel panic.
>>>>> Thanks for the patch. How did you discover this issue? Do you have a
>>>>> test I can run to reproduce the crash?
>>>> We use one i2c channel to communicate with another MCU through a
>>>> user-space implementation of the SSIF protocol, and the MCU may not
>>>> respond in time if it is busy. If it responds after the timeout occurs,
>>>> it triggers the kernel panic below:
>>>>
>>> Thanks for the details. It looks like you've done some testing of
>>> this, which is good.
>>>
>>>> After applying this patch, we get the warning below instead:
>>>>
>>>> "bus in unknown state. irq_status: 0x%x\n"
>>> Given we get to here in the irq handler, we've done these two tests:
>>>
>>>    - aspeed_i2c_is_irq_error()
>>>    - the state is not INACTIVE or PENDING
>>>
>>> but there's no buffer ready for us to use. So what has triggered the
>>> IRQ in this case? Do you have a record of the irq status bits?
>>>
>>> I am wondering if the driver should know that the transaction has
>>> timed out, instead of detecting this unknown state.
>> Yes, some drivers try to abort the transaction before returning to the
>> caller if a timeout happens.
>>
>> The irq status bits are not always the same; searching the historical
>> logs, I found messages like these:
>>
>> [  495.289499] aspeed-i2c-bus 1e78a680.i2c-bus: bus in unknown state. irq_status: 0x2
>> [  495.298003] aspeed-i2c-bus 1e78a680.i2c-bus: bus in unknown state. irq_status: 0x10
>>
>> [   65.176761] aspeed-i2c-bus 1e78a680.i2c-bus: bus in unknown state. irq_status: 0x15
>>
>> Thanks,
>>
>> Heyi
>>
>>>
>>>>> Can you provide a Fixes tag?
>>>> I think the bug was introduced by the first commit of this file :(
>>>>
>>>> f327c686d3ba44eda79a2d9e02a6a242e0b75787
>>>>
>>>>
>>>>> Do other i2c master drivers do this? I took a quick look at the meson
>>>>> driver and it doesn't appear to clear its pointer to msgs.
>>>> It is hard to say. Other drivers seem to have some recovery scheme, such
>>>> as aborting the transfer, or they loop over each message in process
>>>> context and do little in the IRQ handler, so they may disable interrupts
>>>> or not retain the buffer pointer before returning on timeout.
>>> I think your change is okay to go in as it fixes the crash, but first
>>> I want to work out if there's some missing handling of a timeout
>>> condition that we should add as well.
>>>
>>>
>>>> Thanks,
>>>>
>>>> Heyi
>>>>
>>>>
>>>>>>> Signed-off-by: Heyi Guo <guoheyi@linux.alibaba.com>
>>>>>>>
>>>>>>> -------
>>>>>>>
>>>>>>> Cc: Brendan Higgins <brendanhiggins@google.com>
>>>>>>> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>>>>>>> Cc: Joel Stanley <joel@jms.id.au>
>>>>>>> Cc: Andrew Jeffery <andrew@aj.id.au>
>>>>>>> Cc: Philipp Zabel <p.zabel@pengutronix.de>
>>>>>>> Cc: linux-i2c@vger.kernel.org
>>>>>>> Cc: openbmc@lists.ozlabs.org
>>>>>>> Cc: linux-arm-kernel@lists.infradead.org
>>>>>>> Cc: linux-aspeed@lists.ozlabs.org
>>>>>>> ---
>>>>>>>      drivers/i2c/busses/i2c-aspeed.c | 5 +++++
>>>>>>>      1 file changed, 5 insertions(+)
>>>>>>>
>>>>>>> diff --git a/drivers/i2c/busses/i2c-aspeed.c b/drivers/i2c/busses/i2c-aspeed.c
>>>>>>> index 67e8b97c0c950..3ab0396168680 100644
>>>>>>> --- a/drivers/i2c/busses/i2c-aspeed.c
>>>>>>> +++ b/drivers/i2c/busses/i2c-aspeed.c
>>>>>>> @@ -708,6 +708,11 @@ static int aspeed_i2c_master_xfer(struct i2c_adapter *adap,
>>>>>>>                  spin_lock_irqsave(&bus->lock, flags);
>>>>>>>                  if (bus->master_state == ASPEED_I2C_MASTER_PENDING)
>>>>>>>                          bus->master_state = ASPEED_I2C_MASTER_INACTIVE;
>>>>>>> +             /*
>>>>>>> +              * All the buffers may be freed after returning to caller, so
>>>>>>> +              * set msgs to NULL to avoid memory reference after freeing.
>>>>>>> +              */
>>>>>>> +             bus->msgs = NULL;
>>>>>>>                  spin_unlock_irqrestore(&bus->lock, flags);
>>>>>>>
>>>>>>>                  return -ETIMEDOUT;
>
>
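The effect of the hunk above can be sketched in user space. This is only a simplified model under stated assumptions: the sim_* names are hypothetical stand-ins, not the kernel structs or functions, the spinlock is omitted, and -1 stands in for -ETIMEDOUT. The timeout path drops the borrowed msgs pointer, and the interrupt path refuses to touch a buffer once that pointer is NULL, which corresponds to the "bus in unknown state" message seen in the logs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for driver state; not the kernel structs. */
enum sim_state { SIM_INACTIVE, SIM_PENDING, SIM_ACTIVE };

struct sim_msg {
	unsigned char *buf;
	size_t len;
};

struct sim_bus {
	enum sim_state master_state;
	struct sim_msg *msgs;       /* borrowed from the caller during a transfer */
	unsigned int unknown_irqs;  /* counts late interrupts with no buffer attached */
};

/* Timeout path of the transfer function, modeled on the patched hunk. */
static int sim_xfer_timeout(struct sim_bus *bus)
{
	if (bus->master_state == SIM_PENDING)
		bus->master_state = SIM_INACTIVE;
	bus->msgs = NULL;           /* the caller may free its buffers after we return */
	return -1;                  /* stands in for -ETIMEDOUT */
}

/* Interrupt path: store a received byte only if a buffer is still attached. */
static int sim_irq_rx(struct sim_bus *bus, unsigned char byte)
{
	if (!bus->msgs) {
		bus->unknown_irqs++;    /* the driver would log "bus in unknown state" */
		return -1;
	}
	bus->msgs[0].buf[0] = byte;
	return 0;
}

/* Walk-through: transfer times out, caller frees, a late interrupt arrives. */
static unsigned int sim_late_irq_demo(void)
{
	struct sim_msg msg = { malloc(1), 1 };
	struct sim_bus bus = { SIM_ACTIVE, &msg, 0 };

	assert(msg.buf != NULL);
	sim_xfer_timeout(&bus);
	free(msg.buf);              /* caller releases the buffer */
	msg.buf = NULL;
	sim_irq_rx(&bus, 0x5a);     /* late response: must not touch the freed buffer */
	return bus.unknown_irqs;
}
```

Calling sim_late_irq_demo() from a main() returns the number of late interrupts that were rejected. Without the NULL assignment in sim_xfer_timeout(), the same sequence would write through a dangling pointer.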

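For reading the irq_status values quoted in this thread (0x1, 0x2, 0x10, 0x15), a small decoder may help. The bit names and positions below are an assumption, reproduced from my reading of the ASPEED_I2CD_INTR_* definitions in drivers/i2c/busses/i2c-aspeed.c, and should be checked against the tree before relying on them. decode_irq_status() is a hypothetical helper, not part of the driver.

```c
#include <stddef.h>
#include <stdio.h>

struct irq_bit {
	unsigned int mask;
	const char *name;
};

/* Assumed layout of the low interrupt status bits; verify against the driver. */
static const struct irq_bit irq_bits[] = {
	{ 1u << 0, "TX_ACK" },
	{ 1u << 1, "TX_NAK" },
	{ 1u << 2, "RX_DONE" },
	{ 1u << 3, "ARBIT_LOSS" },
	{ 1u << 4, "NORMAL_STOP" },
	{ 1u << 5, "ABNORMAL" },
	{ 1u << 6, "SCL_TIMEOUT" },
	{ 1u << 7, "SLAVE_MATCH" },
};

/* Write the names of the set bits into out, separated by '|'. */
static char *decode_irq_status(unsigned int status, char *out, size_t out_len)
{
	size_t used = 0;
	size_t i;

	if (out_len == 0)
		return out;
	out[0] = '\0';

	for (i = 0; i < sizeof(irq_bits) / sizeof(irq_bits[0]); i++) {
		int n;

		if (!(status & irq_bits[i].mask))
			continue;
		n = snprintf(out + used, out_len - used, "%s%s",
			     used ? "|" : "", irq_bits[i].name);
		if (n < 0 || (size_t)n >= out_len - used)
			break;              /* output buffer too small: stop early */
		used += (size_t)n;
	}
	return out;
}
```

Under this assumed table, 0x15 decodes to TX_ACK|RX_DONE|NORMAL_STOP, 0x10 to NORMAL_STOP, and 0x2 to TX_NAK.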

Thread overview: 8+ messages
2022-01-09 13:26 [PATCH] drivers/i2c-aspeed: avoid invalid memory reference after timeout Heyi Guo
2022-01-11  7:52 ` Heyi Guo
2022-01-11 10:51   ` Joel Stanley
2022-01-14 14:01     ` Heyi Guo
2022-01-17  6:38       ` Joel Stanley
2022-01-19  2:59         ` Heyi Guo
2022-05-31  9:35           ` Lei Yu
2022-06-01  1:59             ` Heyi Guo [this message]
