All of lore.kernel.org
 help / color / mirror / Atom feed
From: mike.isely@cobaltdigital.com
To: Andi Shyti <andi.shyti@kernel.org>,
	Florian Fainelli <florian.fainelli@broadcom.com>
Cc: Mike Isely <mike.isely@cobaltdigital.com>,
	Mike Isely <isely@pobox.com>,
	Broadcom internal kernel review list 
	<bcm-kernel-feedback-list@broadcom.com>,
	Ray Jui <rjui@broadcom.com>,
	Scott Branden <sbranden@broadcom.com>,
	linux-rpi-kernel@lists.infradead.org,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org
Subject: [PATCH 2/2] [i2c-bcm2835] ALWAYS enable INTD
Date: Mon, 30 Oct 2023 11:21:14 -0500	[thread overview]
Message-ID: <20231030162114.3603829-3-mike.isely@cobaltdigital.com> (raw)
In-Reply-To: <20231030162114.3603829-1-mike.isely@cobaltdigital.com>

From: Mike Isely <mike.isely@cobaltdigital.com>

There is a race in the bcm2835 i2c hardware: When one starts a write
transaction, two things apparently take place at the same time: (1) an
interrupt is posted to cause the FIFO to be filled with TX data,
and (2) an I2C transaction is started on the wire with the slave
select byte.  The race happens if there's no slave, as this causes a
slave selection timeout, raising the ERR flag in the hardware and
setting DONE.  The setting of that DONE flag races against TXW.  If
TXW gets set first, then an interrupt is raised if INTT was set.  If
ERR gets set first, then an interrupt is raised if INTD was set.  It's
one or the other, not both - probably because DONE being set disables
the hardware INTT interrupt path.

MOST of the time, TXW gets set first, the ISR runs, sees ERR is set
and cleanly fails the transaction.  However some of the time DONE gets
set first - but since the driver doesn't enable INTD until it's on the
last message - there's no interrupt at all.  Thus the ISR never fires
and the driver detects a timeout instead.  At best, the "wrong" error
code is delivered to the owner of the transaction.  At worst, if the
timeout doesn't propertly clean up the hardware (see prior commit
fixing that), the next - likely unrelated - transaction will get
fouled, leading to bizarre behavior in logic otherwise unrelated to
the source of the original error.

The fix here is to set INTD on for all messages not just the last one.
In that way, unexpected failures which might set DONE earlier than
expected will always trigger an interrupt and be handled correctly.

The datasheet for this hardware doesn't describe any scenario where
the hardware can realistically hang - even a stretched clock will be
noticed if it takes too long.  So in theory a timeout should really
NEVER happen, and with this fix I was completely unable to trigger any
further timeouts at all.

Signed-off-by: Mike Isely <isely@pobox.com>
---
 drivers/i2c/busses/i2c-bcm2835.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/i2c/busses/i2c-bcm2835.c b/drivers/i2c/busses/i2c-bcm2835.c
index 96de875394e1..70005c037ff9 100644
--- a/drivers/i2c/busses/i2c-bcm2835.c
+++ b/drivers/i2c/busses/i2c-bcm2835.c
@@ -235,26 +235,22 @@ static void bcm2835_drain_rxfifo(struct bcm2835_i2c_dev *i2c_dev)
 
 static void bcm2835_i2c_start_transfer(struct bcm2835_i2c_dev *i2c_dev)
 {
-	u32 c = BCM2835_I2C_C_ST | BCM2835_I2C_C_I2CEN;
+	u32 c = BCM2835_I2C_C_ST | BCM2835_I2C_C_I2CEN | BCM2835_I2C_C_INTD;
 	struct i2c_msg *msg = i2c_dev->curr_msg;
-	bool last_msg = (i2c_dev->num_msgs == 1);
 
 	if (!i2c_dev->num_msgs)
 		return;
 
 	i2c_dev->num_msgs--;
 	i2c_dev->msg_buf = msg->buf;
 	i2c_dev->msg_buf_remaining = msg->len;
 
 	if (msg->flags & I2C_M_RD)
 		c |= BCM2835_I2C_C_READ | BCM2835_I2C_C_INTR;
 	else
 		c |= BCM2835_I2C_C_INTT;
 
-	if (last_msg)
-		c |= BCM2835_I2C_C_INTD;
-
 	bcm2835_i2c_writel(i2c_dev, BCM2835_I2C_A, msg->addr);
 	bcm2835_i2c_writel(i2c_dev, BCM2835_I2C_DLEN, msg->len);
 	bcm2835_i2c_writel(i2c_dev, BCM2835_I2C_C, c);
 }
-- 
2.39.2


WARNING: multiple messages have this Message-ID (diff)
From: mike.isely@cobaltdigital.com
To: Andi Shyti <andi.shyti@kernel.org>,
	Florian Fainelli <florian.fainelli@broadcom.com>
Cc: Mike Isely <mike.isely@cobaltdigital.com>,
	Mike Isely <isely@pobox.com>,
	Broadcom internal kernel review list
	<bcm-kernel-feedback-list@broadcom.com>,
	Ray Jui <rjui@broadcom.com>,
	Scott Branden <sbranden@broadcom.com>,
	linux-rpi-kernel@lists.infradead.org,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org
Subject: [PATCH 2/2] [i2c-bcm2835] ALWAYS enable INTD
Date: Mon, 30 Oct 2023 11:21:14 -0500	[thread overview]
Message-ID: <20231030162114.3603829-3-mike.isely@cobaltdigital.com> (raw)
In-Reply-To: <20231030162114.3603829-1-mike.isely@cobaltdigital.com>

From: Mike Isely <mike.isely@cobaltdigital.com>

There is a race in the bcm2835 i2c hardware: When one starts a write
transaction, two things apparently take place at the same time: (1) an
interrupt is posted to cause the FIFO to be filled with TX data,
and (2) an I2C transaction is started on the wire with the slave
select byte.  The race happens if there's no slave, as this causes a
slave selection timeout, raising the ERR flag in the hardware and
setting DONE.  The setting of that DONE flag races against TXW.  If
TXW gets set first, then an interrupt is raised if INTT was set.  If
ERR gets set first, then an interrupt is raised if INTD was set.  It's
one or the other, not both - probably because DONE being set disables
the hardware INTT interrupt path.

MOST of the time, TXW gets set first, the ISR runs, sees ERR is set
and cleanly fails the transaction.  However some of the time DONE gets
set first - but since the driver doesn't enable INTD until it's on the
last message - there's no interrupt at all.  Thus the ISR never fires
and the driver detects a timeout instead.  At best, the "wrong" error
code is delivered to the owner of the transaction.  At worst, if the
timeout doesn't propertly clean up the hardware (see prior commit
fixing that), the next - likely unrelated - transaction will get
fouled, leading to bizarre behavior in logic otherwise unrelated to
the source of the original error.

The fix here is to set INTD on for all messages not just the last one.
In that way, unexpected failures which might set DONE earlier than
expected will always trigger an interrupt and be handled correctly.

The datasheet for this hardware doesn't describe any scenario where
the hardware can realistically hang - even a stretched clock will be
noticed if it takes too long.  So in theory a timeout should really
NEVER happen, and with this fix I was completely unable to trigger any
further timeouts at all.

Signed-off-by: Mike Isely <isely@pobox.com>
---
 drivers/i2c/busses/i2c-bcm2835.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/i2c/busses/i2c-bcm2835.c b/drivers/i2c/busses/i2c-bcm2835.c
index 96de875394e1..70005c037ff9 100644
--- a/drivers/i2c/busses/i2c-bcm2835.c
+++ b/drivers/i2c/busses/i2c-bcm2835.c
@@ -235,26 +235,22 @@ static void bcm2835_drain_rxfifo(struct bcm2835_i2c_dev *i2c_dev)
 
 static void bcm2835_i2c_start_transfer(struct bcm2835_i2c_dev *i2c_dev)
 {
-	u32 c = BCM2835_I2C_C_ST | BCM2835_I2C_C_I2CEN;
+	u32 c = BCM2835_I2C_C_ST | BCM2835_I2C_C_I2CEN | BCM2835_I2C_C_INTD;
 	struct i2c_msg *msg = i2c_dev->curr_msg;
-	bool last_msg = (i2c_dev->num_msgs == 1);
 
 	if (!i2c_dev->num_msgs)
 		return;
 
 	i2c_dev->num_msgs--;
 	i2c_dev->msg_buf = msg->buf;
 	i2c_dev->msg_buf_remaining = msg->len;
 
 	if (msg->flags & I2C_M_RD)
 		c |= BCM2835_I2C_C_READ | BCM2835_I2C_C_INTR;
 	else
 		c |= BCM2835_I2C_C_INTT;
 
-	if (last_msg)
-		c |= BCM2835_I2C_C_INTD;
-
 	bcm2835_i2c_writel(i2c_dev, BCM2835_I2C_A, msg->addr);
 	bcm2835_i2c_writel(i2c_dev, BCM2835_I2C_DLEN, msg->len);
 	bcm2835_i2c_writel(i2c_dev, BCM2835_I2C_C, c);
 }
-- 
2.39.2


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  parent reply	other threads:[~2023-10-30 16:26 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-30 16:21 [PATCH 0/2] Fix error-leg bugs / misbehaviors in i2c-bcm2835 driver mike.isely
2023-10-30 16:21 ` mike.isely
2023-10-30 16:21 ` [PATCH 1/2] [i2c-bcm2835] Fully clean up hardware state machine after a timeout mike.isely
2023-10-30 16:21   ` mike.isely
2023-10-31 11:43   ` Andi Shyti
2023-10-31 11:43     ` Andi Shyti
2023-10-31 15:14     ` Dave Stevenson
2023-10-31 15:14       ` Dave Stevenson
2023-10-31 16:26       ` Mike Isely
2023-10-31 16:26         ` Mike Isely
2023-10-31 12:36   ` Stefan Wahren
2023-10-31 12:36     ` Stefan Wahren
2023-10-30 16:21 ` mike.isely [this message]
2023-10-30 16:21   ` [PATCH 2/2] [i2c-bcm2835] ALWAYS enable INTD mike.isely
2023-10-31 12:37   ` Stefan Wahren
2023-10-31 12:37     ` Stefan Wahren
2023-10-31 15:34     ` Dave Stevenson
2023-10-31 15:34       ` Dave Stevenson
2023-10-31 16:44       ` Mike Isely
2023-10-31 16:44         ` Mike Isely
2023-10-31 18:04         ` Dave Stevenson
2023-10-31 18:04           ` Dave Stevenson
2023-11-01 19:26           ` Mike Isely
2023-11-01 19:26             ` Mike Isely
2023-11-01 21:12             ` aborting i2c bcm2835 controller [was: [PATCH 2/2] [i2c-bcm2835] ALWAYS enable INTD] Mike Isely
2023-11-01 21:12               ` Mike Isely

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20231030162114.3603829-3-mike.isely@cobaltdigital.com \
    --to=mike.isely@cobaltdigital.com \
    --cc=andi.shyti@kernel.org \
    --cc=bcm-kernel-feedback-list@broadcom.com \
    --cc=florian.fainelli@broadcom.com \
    --cc=isely@pobox.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-rpi-kernel@lists.infradead.org \
    --cc=rjui@broadcom.com \
    --cc=sbranden@broadcom.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.