All of lore.kernel.org
 help / color / mirror / Atom feed
* pull-request: can-next 2021-04-07
@ 2021-04-07  8:01 Marc Kleine-Budde
  2021-04-07  8:01 ` [net-next 1/6] can: skb: alloc_can{,fd}_skb(): set "cf" to NULL if skb allocation fails Marc Kleine-Budde
                   ` (6 more replies)
  0 siblings, 7 replies; 33+ messages in thread
From: Marc Kleine-Budde @ 2021-04-07  8:01 UTC (permalink / raw)
  To: netdev; +Cc: davem, kuba, linux-can, kernel

Hello Jakub, hello David,

this is a pull request of 6 patches for net-next/master.

The first patch targets the CAN driver infrastructure, it improves the
alloc_can{,fd}_skb() function to set the pointer to the CAN frame to
NULL if skb allocation fails.

The next patch adds missing error handling to the m_can driver's RX
path (the code was introduced in -next, no need to backport).

In the next patch an unused constant is removed from an enum in the
c_can driver.

The last 3 patches target the mcp251xfd driver. They add BQL support
and try to work around a sometimes broken CRC when reading the TBC
register.

regards,
Marc

---

The following changes since commit 0b35e0deb5bee7d4882356d6663522c1562a8321:

  docs: ethtool: correct quotes (2021-04-06 16:56:58 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next.git tags/linux-can-next-for-5.13-20210407

for you to fetch changes up to c7eb923c3caf4c6a183465cc012dc368b199a4b2:

  can: mcp251xfd: mcp251xfd_regmap_crc_read(): work around broken CRC on TBC register (2021-04-07 09:31:28 +0200)

----------------------------------------------------------------
linux-can-next-for-5.13-20210407

----------------------------------------------------------------
Marc Kleine-Budde (6):
      can: skb: alloc_can{,fd}_skb(): set "cf" to NULL if skb allocation fails
      can: m_can: m_can_receive_skb(): add missing error handling to can_rx_offload_queue_sorted() call
      can: c_can: remove unused enum BOSCH_C_CAN_PLATFORM
      can: mcp251xfd: add BQL support
      can: mcp251xfd: mcp251xfd_regmap_crc_read_one(): Factor out crc check into separate function
      can: mcp251xfd: mcp251xfd_regmap_crc_read(): work around broken CRC on TBC register

 drivers/net/can/c_can/c_can.h                    |  1 -
 drivers/net/can/dev/skb.c                        | 10 +++-
 drivers/net/can/m_can/m_can.c                    | 13 +++--
 drivers/net/can/spi/mcp251xfd/mcp251xfd-core.c   | 23 +++++++--
 drivers/net/can/spi/mcp251xfd/mcp251xfd-regmap.c | 64 ++++++++++++++++++++----
 5 files changed, 90 insertions(+), 21 deletions(-)



^ permalink raw reply	[flat|nested] 33+ messages in thread

* [net-next 1/6] can: skb: alloc_can{,fd}_skb(): set "cf" to NULL if skb allocation fails
  2021-04-07  8:01 pull-request: can-next 2021-04-07 Marc Kleine-Budde
@ 2021-04-07  8:01 ` Marc Kleine-Budde
  2021-04-07  8:01 ` [net-next 2/6] can: m_can: m_can_receive_skb(): add missing error handling to can_rx_offload_queue_sorted() call Marc Kleine-Budde
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 33+ messages in thread
From: Marc Kleine-Budde @ 2021-04-07  8:01 UTC (permalink / raw)
  To: netdev; +Cc: davem, kuba, linux-can, kernel, Marc Kleine-Budde, Vincent MAILHOL

The handling of CAN bus errors typically consist of allocating a CAN
error SKB using alloc_can_err_skb() followed by stats handling and
filling the error details in the newly allocated CAN error SKB. Even
if the allocation of the SKB fails the stats handling should not be
skipped.

The common pattern in CAN drivers is to allocate the skb and work on
the struct can_frame pointer "cf", if it has been assigned by
alloc_can_err_skb().

|	skb = alloc_can_err_skb(priv->ndev, &cf);
|
| 	/* RX errors */
| 	if (bdiag1 & (MCP251XFD_REG_BDIAG1_DCRCERR |
| 		      MCP251XFD_REG_BDIAG1_NCRCERR)) {
| 		netdev_dbg(priv->ndev, "CRC error\n");
|
| 		stats->rx_errors++;
| 		if (cf)
| 			cf->data[3] |= CAN_ERR_PROT_LOC_CRC_SEQ;
| 	}

In case of an OOM alloc_can_err_skb() returns NULL, but doesn't set
"cf" to NULL as well. For the above pattern to work the "cf" has to be
initialized to NULL, which is easily forgotten.

To solve this kind of problems, set "cf" to NULL if
alloc_can_err_skb() returns NULL.

Link: https://lore.kernel.org/r/20210402102245.1512583-1-mkl@pengutronix.de
Suggested-by: Vincent MAILHOL <mailhol.vincent@wanadoo.fr>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
---
 drivers/net/can/dev/skb.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/net/can/dev/skb.c b/drivers/net/can/dev/skb.c
index 387c0bc0fb9c..61660248c69e 100644
--- a/drivers/net/can/dev/skb.c
+++ b/drivers/net/can/dev/skb.c
@@ -183,8 +183,11 @@ struct sk_buff *alloc_can_skb(struct net_device *dev, struct can_frame **cf)
 
 	skb = netdev_alloc_skb(dev, sizeof(struct can_skb_priv) +
 			       sizeof(struct can_frame));
-	if (unlikely(!skb))
+	if (unlikely(!skb)) {
+		*cf = NULL;
+
 		return NULL;
+	}
 
 	skb->protocol = htons(ETH_P_CAN);
 	skb->pkt_type = PACKET_BROADCAST;
@@ -211,8 +214,11 @@ struct sk_buff *alloc_canfd_skb(struct net_device *dev,
 
 	skb = netdev_alloc_skb(dev, sizeof(struct can_skb_priv) +
 			       sizeof(struct canfd_frame));
-	if (unlikely(!skb))
+	if (unlikely(!skb)) {
+		*cfd = NULL;
+
 		return NULL;
+	}
 
 	skb->protocol = htons(ETH_P_CANFD);
 	skb->pkt_type = PACKET_BROADCAST;

base-commit: 0b35e0deb5bee7d4882356d6663522c1562a8321
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [net-next 2/6] can: m_can: m_can_receive_skb(): add missing error handling to can_rx_offload_queue_sorted() call
  2021-04-07  8:01 pull-request: can-next 2021-04-07 Marc Kleine-Budde
  2021-04-07  8:01 ` [net-next 1/6] can: skb: alloc_can{,fd}_skb(): set "cf" to NULL if skb allocation fails Marc Kleine-Budde
@ 2021-04-07  8:01 ` Marc Kleine-Budde
  2021-04-07  8:01 ` [net-next 3/6] can: c_can: remove unused enum BOSCH_C_CAN_PLATFORM Marc Kleine-Budde
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 33+ messages in thread
From: Marc Kleine-Budde @ 2021-04-07  8:01 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, linux-can, kernel, Marc Kleine-Budde, coverity-bot,
	Kees Cook, Torin Cooper-Bennun

In commit 1be37d3b0414 ("can: m_can: fix periph RX path: use
rx-offload to ensure skbs are sent from softirq context") the RX path
for peripherals (i.e. SPI based m_can controllers) was converted to
the rx-offload infrastructure. However, the error handling for
can_rx_offload_queue_sorted() was forgotten.
can_rx_offload_queue_sorted() will return with an error if the
internal queue is full.

This patch adds the missing error handling, by increasing the
rx_fifo_errors.

Fixes: 1be37d3b0414 ("can: m_can: fix periph RX path: use rx-offload to ensure skbs are sent from softirq context")
Link: https://lore.kernel.org/r/20210401084515.1455013-1-mkl@pengutronix.de
Reported-by: coverity-bot <keescook+coverity-bot@chromium.org>
Addresses-Coverity-ID: 1503583 ("Error handling issues")
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Torin Cooper-Bennun <torin@maxiluxsystems.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
---
 drivers/net/can/m_can/m_can.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/net/can/m_can/m_can.c b/drivers/net/can/m_can/m_can.c
index 890ed826a355..34073cd077e4 100644
--- a/drivers/net/can/m_can/m_can.c
+++ b/drivers/net/can/m_can/m_can.c
@@ -466,10 +466,17 @@ static void m_can_receive_skb(struct m_can_classdev *cdev,
 			      struct sk_buff *skb,
 			      u32 timestamp)
 {
-	if (cdev->is_peripheral)
-		can_rx_offload_queue_sorted(&cdev->offload, skb, timestamp);
-	else
+	if (cdev->is_peripheral) {
+		struct net_device_stats *stats = &cdev->net->stats;
+		int err;
+
+		err = can_rx_offload_queue_sorted(&cdev->offload, skb,
+						  timestamp);
+		if (err)
+			stats->rx_fifo_errors++;
+	} else {
 		netif_receive_skb(skb);
+	}
 }
 
 static void m_can_read_fifo(struct net_device *dev, u32 rxfs)
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [net-next 3/6] can: c_can: remove unused enum BOSCH_C_CAN_PLATFORM
  2021-04-07  8:01 pull-request: can-next 2021-04-07 Marc Kleine-Budde
  2021-04-07  8:01 ` [net-next 1/6] can: skb: alloc_can{,fd}_skb(): set "cf" to NULL if skb allocation fails Marc Kleine-Budde
  2021-04-07  8:01 ` [net-next 2/6] can: m_can: m_can_receive_skb(): add missing error handling to can_rx_offload_queue_sorted() call Marc Kleine-Budde
@ 2021-04-07  8:01 ` Marc Kleine-Budde
  2021-04-07  8:01 ` [net-next 4/6] can: mcp251xfd: add BQL support Marc Kleine-Budde
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 33+ messages in thread
From: Marc Kleine-Budde @ 2021-04-07  8:01 UTC (permalink / raw)
  To: netdev; +Cc: davem, kuba, linux-can, kernel, Marc Kleine-Budde

This patch removes the unused enum BOSCH_C_CAN_PLATFORM.

Link: https://lore.kernel.org/r/20210406110617.1865592-2-mkl@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
---
 drivers/net/can/c_can/c_can.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/net/can/c_can/c_can.h b/drivers/net/can/c_can/c_can.h
index 8acedd9e63a7..06045f610f0e 100644
--- a/drivers/net/can/c_can/c_can.h
+++ b/drivers/net/can/c_can/c_can.h
@@ -149,7 +149,6 @@ static const u16 __maybe_unused reg_map_d_can[] = {
 };
 
 enum c_can_dev_id {
-	BOSCH_C_CAN_PLATFORM,
 	BOSCH_C_CAN,
 	BOSCH_D_CAN,
 };
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [net-next 4/6] can: mcp251xfd: add BQL support
  2021-04-07  8:01 pull-request: can-next 2021-04-07 Marc Kleine-Budde
                   ` (2 preceding siblings ...)
  2021-04-07  8:01 ` [net-next 3/6] can: c_can: remove unused enum BOSCH_C_CAN_PLATFORM Marc Kleine-Budde
@ 2021-04-07  8:01 ` Marc Kleine-Budde
  2021-04-07  8:01 ` [net-next 5/6] can: mcp251xfd: mcp251xfd_regmap_crc_read_one(): Factor out crc check into separate function Marc Kleine-Budde
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 33+ messages in thread
From: Marc Kleine-Budde @ 2021-04-07  8:01 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, linux-can, kernel, Marc Kleine-Budde,
	Manivannan Sadhasivam, Thomas Kopp

This patch re-adds BQL support to the driver. Support for
netdev_xmit_more() will be added in a separate patch series.

Link: https://lore.kernel.org/r/20210406110617.1865592-3-mkl@pengutronix.de
Cc: Manivannan Sadhasivam <mani@kernel.org>
Cc: Thomas Kopp <thomas.kopp@microchip.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
---
 .../net/can/spi/mcp251xfd/mcp251xfd-core.c    | 23 +++++++++++++++----
 1 file changed, 18 insertions(+), 5 deletions(-)

diff --git a/drivers/net/can/spi/mcp251xfd/mcp251xfd-core.c b/drivers/net/can/spi/mcp251xfd/mcp251xfd-core.c
index 142eb4506b55..970dc570e7a5 100644
--- a/drivers/net/can/spi/mcp251xfd/mcp251xfd-core.c
+++ b/drivers/net/can/spi/mcp251xfd/mcp251xfd-core.c
@@ -335,6 +335,8 @@ static void mcp251xfd_ring_init(struct mcp251xfd_priv *priv)
 	u8 len;
 	int i, j;
 
+	netdev_reset_queue(priv->ndev);
+
 	/* TEF */
 	tef_ring = priv->tef;
 	tef_ring->head = 0;
@@ -1262,7 +1264,8 @@ mcp251xfd_handle_tefif_recover(const struct mcp251xfd_priv *priv, const u32 seq)
 
 static int
 mcp251xfd_handle_tefif_one(struct mcp251xfd_priv *priv,
-			   const struct mcp251xfd_hw_tef_obj *hw_tef_obj)
+			   const struct mcp251xfd_hw_tef_obj *hw_tef_obj,
+			   unsigned int *frame_len_ptr)
 {
 	struct net_device_stats *stats = &priv->ndev->stats;
 	struct sk_buff *skb;
@@ -1288,8 +1291,8 @@ mcp251xfd_handle_tefif_one(struct mcp251xfd_priv *priv,
 		mcp251xfd_skb_set_timestamp(priv, skb, hw_tef_obj->ts);
 	stats->tx_bytes +=
 		can_rx_offload_get_echo_skb(&priv->offload,
-					    tef_tail,
-					    hw_tef_obj->ts, NULL);
+					    tef_tail, hw_tef_obj->ts,
+					    frame_len_ptr);
 	stats->tx_packets++;
 	priv->tef->tail++;
 
@@ -1347,6 +1350,7 @@ mcp251xfd_tef_obj_read(const struct mcp251xfd_priv *priv,
 static int mcp251xfd_handle_tefif(struct mcp251xfd_priv *priv)
 {
 	struct mcp251xfd_hw_tef_obj hw_tef_obj[MCP251XFD_TX_OBJ_NUM_MAX];
+	unsigned int total_frame_len = 0;
 	u8 tef_tail, len, l;
 	int err, i;
 
@@ -1368,7 +1372,9 @@ static int mcp251xfd_handle_tefif(struct mcp251xfd_priv *priv)
 	}
 
 	for (i = 0; i < len; i++) {
-		err = mcp251xfd_handle_tefif_one(priv, &hw_tef_obj[i]);
+		unsigned int frame_len = 0;
+
+		err = mcp251xfd_handle_tefif_one(priv, &hw_tef_obj[i], &frame_len);
 		/* -EAGAIN means the Sequence Number in the TEF
 		 * doesn't match our tef_tail. This can happen if we
 		 * read the TEF objects too early. Leave loop let the
@@ -1378,6 +1384,8 @@ static int mcp251xfd_handle_tefif(struct mcp251xfd_priv *priv)
 			goto out_netif_wake_queue;
 		if (err)
 			return err;
+
+		total_frame_len += frame_len;
 	}
 
  out_netif_wake_queue:
@@ -1403,6 +1411,7 @@ static int mcp251xfd_handle_tefif(struct mcp251xfd_priv *priv)
 			return err;
 
 		tx_ring->tail += len;
+		netdev_completed_queue(priv->ndev, len, total_frame_len);
 
 		err = mcp251xfd_check_tef_tail(priv);
 		if (err)
@@ -2446,6 +2455,7 @@ static netdev_tx_t mcp251xfd_start_xmit(struct sk_buff *skb,
 	struct mcp251xfd_priv *priv = netdev_priv(ndev);
 	struct mcp251xfd_tx_ring *tx_ring = priv->tx;
 	struct mcp251xfd_tx_obj *tx_obj;
+	unsigned int frame_len;
 	u8 tx_head;
 	int err;
 
@@ -2464,7 +2474,10 @@ static netdev_tx_t mcp251xfd_start_xmit(struct sk_buff *skb,
 	if (mcp251xfd_get_tx_free(tx_ring) == 0)
 		netif_stop_queue(ndev);
 
-	can_put_echo_skb(skb, ndev, tx_head, 0);
+	frame_len = can_skb_get_frame_len(skb);
+	err = can_put_echo_skb(skb, ndev, tx_head, frame_len);
+	if (!err)
+		netdev_sent_queue(priv->ndev, frame_len);
 
 	err = mcp251xfd_tx_obj_write(priv, tx_obj);
 	if (err)
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [net-next 5/6] can: mcp251xfd: mcp251xfd_regmap_crc_read_one(): Factor out crc check into separate function
  2021-04-07  8:01 pull-request: can-next 2021-04-07 Marc Kleine-Budde
                   ` (3 preceding siblings ...)
  2021-04-07  8:01 ` [net-next 4/6] can: mcp251xfd: add BQL support Marc Kleine-Budde
@ 2021-04-07  8:01 ` Marc Kleine-Budde
  2021-04-07  8:01 ` [net-next 6/6] can: mcp251xfd: mcp251xfd_regmap_crc_read(): work around broken CRC on TBC register Marc Kleine-Budde
  2021-04-07 22:10 ` pull-request: can-next 2021-04-07 patchwork-bot+netdevbpf
  6 siblings, 0 replies; 33+ messages in thread
From: Marc Kleine-Budde @ 2021-04-07  8:01 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, linux-can, kernel, Marc Kleine-Budde,
	Manivannan Sadhasivam, Thomas Kopp

This patch factors out the crc check into a separate function. This is
preparation for the next patch.

Link: https://lore.kernel.org/r/20210406110617.1865592-4-mkl@pengutronix.de
Cc: Manivannan Sadhasivam <mani@kernel.org>
Cc: Thomas Kopp <thomas.kopp@microchip.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
---
 .../net/can/spi/mcp251xfd/mcp251xfd-regmap.c  | 30 ++++++++++++-------
 1 file changed, 20 insertions(+), 10 deletions(-)

diff --git a/drivers/net/can/spi/mcp251xfd/mcp251xfd-regmap.c b/drivers/net/can/spi/mcp251xfd/mcp251xfd-regmap.c
index 314f868b3465..35557ac43c03 100644
--- a/drivers/net/can/spi/mcp251xfd/mcp251xfd-regmap.c
+++ b/drivers/net/can/spi/mcp251xfd/mcp251xfd-regmap.c
@@ -232,13 +232,31 @@ mcp251xfd_regmap_crc_write(void *context,
 						 count - data_offset);
 }
 
+static int
+mcp251xfd_regmap_crc_read_check_crc(const struct mcp251xfd_map_buf_crc * const buf_rx,
+				    const struct mcp251xfd_map_buf_crc * const buf_tx,
+				    unsigned int data_len)
+{
+	u16 crc_received, crc_calculated;
+
+	crc_received = get_unaligned_be16(buf_rx->data + data_len);
+	crc_calculated = mcp251xfd_crc16_compute2(&buf_tx->cmd,
+						  sizeof(buf_tx->cmd),
+						  buf_rx->data,
+						  data_len);
+	if (crc_received != crc_calculated)
+		return -EBADMSG;
+
+	return 0;
+}
+
+
 static int
 mcp251xfd_regmap_crc_read_one(struct mcp251xfd_priv *priv,
 			      struct spi_message *msg, unsigned int data_len)
 {
 	const struct mcp251xfd_map_buf_crc *buf_rx = priv->map_buf_crc_rx;
 	const struct mcp251xfd_map_buf_crc *buf_tx = priv->map_buf_crc_tx;
-	u16 crc_received, crc_calculated;
 	int err;
 
 	BUILD_BUG_ON(sizeof(buf_rx->cmd) != sizeof(__be16) + sizeof(u8));
@@ -248,15 +266,7 @@ mcp251xfd_regmap_crc_read_one(struct mcp251xfd_priv *priv,
 	if (err)
 		return err;
 
-	crc_received = get_unaligned_be16(buf_rx->data + data_len);
-	crc_calculated = mcp251xfd_crc16_compute2(&buf_tx->cmd,
-						  sizeof(buf_tx->cmd),
-						  buf_rx->data,
-						  data_len);
-	if (crc_received != crc_calculated)
-		return -EBADMSG;
-
-	return 0;
+	return mcp251xfd_regmap_crc_read_check_crc(buf_rx, buf_tx, data_len);
 }
 
 static int
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [net-next 6/6] can: mcp251xfd: mcp251xfd_regmap_crc_read(): work around broken CRC on TBC register
  2021-04-07  8:01 pull-request: can-next 2021-04-07 Marc Kleine-Budde
                   ` (4 preceding siblings ...)
  2021-04-07  8:01 ` [net-next 5/6] can: mcp251xfd: mcp251xfd_regmap_crc_read_one(): Factor out crc check into separate function Marc Kleine-Budde
@ 2021-04-07  8:01 ` Marc Kleine-Budde
  2021-04-21 19:58   ` Drew Fustini
  2021-04-07 22:10 ` pull-request: can-next 2021-04-07 patchwork-bot+netdevbpf
  6 siblings, 1 reply; 33+ messages in thread
From: Marc Kleine-Budde @ 2021-04-07  8:01 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, linux-can, kernel, Marc Kleine-Budde,
	Manivannan Sadhasivam, Thomas Kopp

MCP251XFD_REG_TBC is the time base counter register. It increments
once per SYS clock tick, which is 20 or 40 MHz. Observation shows that
if the lowest byte (which is transferred first on the SPI bus) of that
register is 0x00 or 0x80 the calculated CRC doesn't always match the
transferred one.

To reproduce this problem let the driver read the TBC register in a
high frequency. This can be done by attaching only the mcp251xfd CAN
controller to a valid terminated CAN bus and send a single CAN frame.
As there are no other CAN controller on the bus, the sent CAN frame is
not ACKed and the mcp251xfd repeats it. If user space enables the bus
error reporting, each of the NACK errors is reported with a time
stamp (which is read from the TBC register) to user space.

$ ip link set can0 down
$ ip link set can0 up type can bitrate 500000 berr-reporting on
$ cansend can0 4FF#ff.01.00.00.00.00.00.00

This leads to several error messages per second:

| mcp251xfd spi0.0 can0: CRC read error at address 0x0010 (length=4, data=00 3a 86 da, CRC=0x7753) retrying.
| mcp251xfd spi0.0 can0: CRC read error at address 0x0010 (length=4, data=80 01 b4 da, CRC=0x5830) retrying.
| mcp251xfd spi0.0 can0: CRC read error at address 0x0010 (length=4, data=00 e9 23 db, CRC=0xa723) retrying.
| mcp251xfd spi0.0 can0: CRC read error at address 0x0010 (length=4, data=00 8a 30 db, CRC=0x4a9c) retrying.
| mcp251xfd spi0.0 can0: CRC read error at address 0x0010 (length=4, data=80 f3 43 db, CRC=0x66d2) retrying.

If the highest bit in the lowest byte is flipped the transferred CRC
matches the calculated one. We assume for now the CRC calculation in
the chip works on wrong data and the transferred data is correct.

This patch implements the following workaround:

- If a CRC read error on the TBC register is detected and the lowest
  byte is 0x00 or 0x80, the highest bit of the lowest byte is flipped
  and the CRC is calculated again.
- If the CRC now matches, the _original_ data is passed to the reader.
  For now we assume transferred data was OK.

Link: https://lore.kernel.org/r/20210406110617.1865592-5-mkl@pengutronix.de
Cc: Manivannan Sadhasivam <mani@kernel.org>
Cc: Thomas Kopp <thomas.kopp@microchip.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
---
 .../net/can/spi/mcp251xfd/mcp251xfd-regmap.c  | 34 +++++++++++++++++++
 1 file changed, 34 insertions(+)

diff --git a/drivers/net/can/spi/mcp251xfd/mcp251xfd-regmap.c b/drivers/net/can/spi/mcp251xfd/mcp251xfd-regmap.c
index 35557ac43c03..297491516a26 100644
--- a/drivers/net/can/spi/mcp251xfd/mcp251xfd-regmap.c
+++ b/drivers/net/can/spi/mcp251xfd/mcp251xfd-regmap.c
@@ -321,6 +321,40 @@ mcp251xfd_regmap_crc_read(void *context,
 		if (err != -EBADMSG)
 			return err;
 
+		/* MCP251XFD_REG_TBC is the time base counter
+		 * register. It increments once per SYS clock tick,
+		 * which is 20 or 40 MHz.
+		 *
+		 * Observation shows that if the lowest byte (which is
+		 * transferred first on the SPI bus) of that register
+		 * is 0x00 or 0x80 the calculated CRC doesn't always
+		 * match the transferred one.
+		 *
+		 * If the highest bit in the lowest byte is flipped
+		 * the transferred CRC matches the calculated one. We
+		 * assume for now the CRC calculation in the chip
+		 * works on wrong data and the transferred data is
+		 * correct.
+		 */
+		if (reg == MCP251XFD_REG_TBC &&
+		    (buf_rx->data[0] == 0x0 || buf_rx->data[0] == 0x80)) {
+			/* Flip highest bit in lowest byte of le32 */
+			buf_rx->data[0] ^= 0x80;
+
+			/* re-check CRC */
+			err = mcp251xfd_regmap_crc_read_check_crc(buf_rx,
+								  buf_tx,
+								  val_len);
+			if (!err) {
+				/* If CRC is now correct, assume
+				 * transferred data was OK, flip bit
+				 * back to original value.
+				 */
+				buf_rx->data[0] ^= 0x80;
+				goto out;
+			}
+		}
+
 		/* MCP251XFD_REG_OSC is the first ever reg we read from.
 		 *
 		 * The chip may be in deep sleep and this SPI transfer
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: pull-request: can-next 2021-04-07
  2021-04-07  8:01 pull-request: can-next 2021-04-07 Marc Kleine-Budde
                   ` (5 preceding siblings ...)
  2021-04-07  8:01 ` [net-next 6/6] can: mcp251xfd: mcp251xfd_regmap_crc_read(): work around broken CRC on TBC register Marc Kleine-Budde
@ 2021-04-07 22:10 ` patchwork-bot+netdevbpf
  6 siblings, 0 replies; 33+ messages in thread
From: patchwork-bot+netdevbpf @ 2021-04-07 22:10 UTC (permalink / raw)
  To: Marc Kleine-Budde; +Cc: netdev, davem, kuba, linux-can, kernel

Hello:

This pull request was applied to netdev/net-next.git (refs/heads/master):

On Wed,  7 Apr 2021 10:01:12 +0200 you wrote:
> Hello Jakub, hello David,
> 
> this is a pull request of 6 patches for net-next/master.
> 
> The first patch targets the CAN driver infrastructure, it improves the
> alloc_can{,fd}_skb() function to set the pointer to the CAN frame to
> NULL if skb allocation fails.
> 
> [...]

Here is the summary with links:
  - pull-request: can-next 2021-04-07
    https://git.kernel.org/netdev/net-next/c/33b32a298426
  - [net-next,2/6] can: m_can: m_can_receive_skb(): add missing error handling to can_rx_offload_queue_sorted() call
    https://git.kernel.org/netdev/net-next/c/644022b1de9e
  - [net-next,3/6] can: c_can: remove unused enum BOSCH_C_CAN_PLATFORM
    https://git.kernel.org/netdev/net-next/c/8dc987519ae9
  - [net-next,4/6] can: mcp251xfd: add BQL support
    https://git.kernel.org/netdev/net-next/c/0084e298acfe
  - [net-next,5/6] can: mcp251xfd: mcp251xfd_regmap_crc_read_one(): Factor out crc check into separate function
    https://git.kernel.org/netdev/net-next/c/ef7a8c3e7599
  - [net-next,6/6] can: mcp251xfd: mcp251xfd_regmap_crc_read(): work around broken CRC on TBC register
    https://git.kernel.org/netdev/net-next/c/c7eb923c3caf

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [net-next 6/6] can: mcp251xfd: mcp251xfd_regmap_crc_read(): work around broken CRC on TBC register
  2021-04-07  8:01 ` [net-next 6/6] can: mcp251xfd: mcp251xfd_regmap_crc_read(): work around broken CRC on TBC register Marc Kleine-Budde
@ 2021-04-21 19:58   ` Drew Fustini
  2021-04-22  7:18     ` Marc Kleine-Budde
  0 siblings, 1 reply; 33+ messages in thread
From: Drew Fustini @ 2021-04-21 19:58 UTC (permalink / raw)
  To: Marc Kleine-Budde
  Cc: netdev, David S. Miller, Jakub Kicinski, linux-can, kernel,
	Manivannan Sadhasivam, Will C, Thomas Kopp

On Wed, Apr 7, 2021 at 1:01 AM Marc Kleine-Budde <mkl@pengutronix.de> wrote:
>
> MCP251XFD_REG_TBC is the time base counter register. It increments
> once per SYS clock tick, which is 20 or 40 MHz. Observation shows that
> if the lowest byte (which is transferred first on the SPI bus) of that
> register is 0x00 or 0x80 the calculated CRC doesn't always match the
> transferred one.
>
> To reproduce this problem let the driver read the TBC register in a
> high frequency. This can be done by attaching only the mcp251xfd CAN
> controller to a valid terminated CAN bus and send a single CAN frame.
> As there are no other CAN controller on the bus, the sent CAN frame is
> not ACKed and the mcp251xfd repeats it. If user space enables the bus
> error reporting, each of the NACK errors is reported with a time
> stamp (which is read from the TBC register) to user space.
>
> $ ip link set can0 down
> $ ip link set can0 up type can bitrate 500000 berr-reporting on
> $ cansend can0 4FF#ff.01.00.00.00.00.00.00
>
> This leads to several error messages per second:
>
> | mcp251xfd spi0.0 can0: CRC read error at address 0x0010 (length=4, data=00 3a 86 da, CRC=0x7753) retrying.
> | mcp251xfd spi0.0 can0: CRC read error at address 0x0010 (length=4, data=80 01 b4 da, CRC=0x5830) retrying.
> | mcp251xfd spi0.0 can0: CRC read error at address 0x0010 (length=4, data=00 e9 23 db, CRC=0xa723) retrying.
> | mcp251xfd spi0.0 can0: CRC read error at address 0x0010 (length=4, data=00 8a 30 db, CRC=0x4a9c) retrying.
> | mcp251xfd spi0.0 can0: CRC read error at address 0x0010 (length=4, data=80 f3 43 db, CRC=0x66d2) retrying.
>
> If the highest bit in the lowest byte is flipped the transferred CRC
> matches the calculated one. We assume for now the CRC calculation in
> the chip works on wrong data and the transferred data is correct.
>
> This patch implements the following workaround:
>
> - If a CRC read error on the TBC register is detected and the lowest
>   byte is 0x00 or 0x80, the highest bit of the lowest byte is flipped
>   and the CRC is calculated again.
> - If the CRC now matches, the _original_ data is passed to the reader.
>   For now we assume transferred data was OK.
>
> Link: https://lore.kernel.org/r/20210406110617.1865592-5-mkl@pengutronix.de
> Cc: Manivannan Sadhasivam <mani@kernel.org>
> Cc: Thomas Kopp <thomas.kopp@microchip.com>
> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
> ---
>  .../net/can/spi/mcp251xfd/mcp251xfd-regmap.c  | 34 +++++++++++++++++++
>  1 file changed, 34 insertions(+)
>
> diff --git a/drivers/net/can/spi/mcp251xfd/mcp251xfd-regmap.c b/drivers/net/can/spi/mcp251xfd/mcp251xfd-regmap.c
> index 35557ac43c03..297491516a26 100644
> --- a/drivers/net/can/spi/mcp251xfd/mcp251xfd-regmap.c
> +++ b/drivers/net/can/spi/mcp251xfd/mcp251xfd-regmap.c
> @@ -321,6 +321,40 @@ mcp251xfd_regmap_crc_read(void *context,
>                 if (err != -EBADMSG)
>                         return err;
>
> +               /* MCP251XFD_REG_TBC is the time base counter
> +                * register. It increments once per SYS clock tick,
> +                * which is 20 or 40 MHz.
> +                *
> +                * Observation shows that if the lowest byte (which is
> +                * transferred first on the SPI bus) of that register
> +                * is 0x00 or 0x80 the calculated CRC doesn't always
> +                * match the transferred one.
> +                *
> +                * If the highest bit in the lowest byte is flipped
> +                * the transferred CRC matches the calculated one. We
> +                * assume for now the CRC calculation in the chip
> +                * works on wrong data and the transferred data is
> +                * correct.
> +                */
> +               if (reg == MCP251XFD_REG_TBC &&
> +                   (buf_rx->data[0] == 0x0 || buf_rx->data[0] == 0x80)) {
> +                       /* Flip highest bit in lowest byte of le32 */
> +                       buf_rx->data[0] ^= 0x80;
> +
> +                       /* re-check CRC */
> +                       err = mcp251xfd_regmap_crc_read_check_crc(buf_rx,
> +                                                                 buf_tx,
> +                                                                 val_len);
> +                       if (!err) {
> +                               /* If CRC is now correct, assume
> +                                * transferred data was OK, flip bit
> +                                * back to original value.
> +                                */
> +                               buf_rx->data[0] ^= 0x80;
> +                               goto out;
> +                       }
> +               }
> +
>                 /* MCP251XFD_REG_OSC is the first ever reg we read from.
>                  *
>                  * The chip may be in deep sleep and this SPI transfer
> --
> 2.30.2
>
>

Hello Marc,

I am encountering similar error with the 5.10 raspberrypi kernel on
RPi 4 with MCP2518FD:

  mcp251xfd spi0.0 can0: CRC read error at address 0x0010 (length=4,
data=00 ad 58 67, CRC=0xbbfd) retrying

Would it be possible for you to pull these patches into a v5.10 branch
in your linux-rpi repo [1]?

Thanks,
Drew

[1] https://github.com/marckleinebudde/linux-rpi

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [net-next 6/6] can: mcp251xfd: mcp251xfd_regmap_crc_read(): work around broken CRC on TBC register
  2021-04-21 19:58   ` Drew Fustini
@ 2021-04-22  7:18     ` Marc Kleine-Budde
  2021-04-22 16:46       ` Patrick Menschel
  2021-05-07  7:25       ` Marc Kleine-Budde
  0 siblings, 2 replies; 33+ messages in thread
From: Marc Kleine-Budde @ 2021-04-22  7:18 UTC (permalink / raw)
  To: Drew Fustini
  Cc: netdev, linux-can, kernel, Manivannan Sadhasivam, Will C, Thomas Kopp


[-- Attachment #1.1: Type: text/plain, Size: 1338 bytes --]

On 4/21/21 9:58 PM, Drew Fustini wrote:
> I am encountering similar error with the 5.10 raspberrypi kernel on
> RPi 4 with MCP2518FD:
> 
>   mcp251xfd spi0.0 can0: CRC read error at address 0x0010 (length=4,
> data=00 ad 58 67, CRC=0xbbfd) retrying

What's the situation you see these errors?

I'm not particular happy with that patch, as it only works around that one
particular bit flip issue. If you really hammer the register, the driver will
still notice CRC errors that can be explained by other bits flipping. Consider
this as the first order approximation of a higher order problem :) - the root
cause is still unknown.

> Would it be possible for you to pull these patches into a v5.10 branch
> in your linux-rpi repo [1]?

Here you are:

https://github.com/marckleinebudde/linux-rpi/tree/v5.10-rpi/backport-performance-improvements

I've included the UINC performance enhancements, too. The branch is compiled
tested only, though. I'll send a pull request to the rpi kernel after I've
testing feedback from you.

regards,
Marc

-- 
Pengutronix e.K.                 | Marc Kleine-Budde           |
Embedded Linux                   | https://www.pengutronix.de  |
Vertretung West/Dortmund         | Phone: +49-231-2826-924     |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [net-next 6/6] can: mcp251xfd: mcp251xfd_regmap_crc_read(): work around broken CRC on TBC register
  2021-04-22  7:18     ` Marc Kleine-Budde
@ 2021-04-22 16:46       ` Patrick Menschel
  2021-05-07  7:25       ` Marc Kleine-Budde
  1 sibling, 0 replies; 33+ messages in thread
From: Patrick Menschel @ 2021-04-22 16:46 UTC (permalink / raw)
  To: Marc Kleine-Budde, Drew Fustini
  Cc: netdev, linux-can, kernel, Manivannan Sadhasivam, Will C, Thomas Kopp

Am 22.04.21 um 09:18 schrieb Marc Kleine-Budde:
> On 4/21/21 9:58 PM, Drew Fustini wrote:
>> I am encountering similar error with the 5.10 raspberrypi kernel on
>> RPi 4 with MCP2518FD:
>>
>>   mcp251xfd spi0.0 can0: CRC read error at address 0x0010 (length=4,
>> data=00 ad 58 67, CRC=0xbbfd) retrying
> 
> What's the situation you see these errors?
> 
> I'm not particular happy with that patch, as it only works around that one
> particular bit flip issue. If you really hammer the register, the driver will
> still notice CRC errors that can be explained by other bits flipping. Consider
> this as the first order approximation of a higher order problem :) - the root
> cause is still unknown.
> 
>> Would it be possible for you to pull these patches into a v5.10 branch
>> in your linux-rpi repo [1]?
> 
> Here you are:
> 
> https://github.com/marckleinebudde/linux-rpi/tree/v5.10-rpi/backport-performance-improvements
> 
> I've included the UINC performance enhancements, too. The branch is compiled
> tested only, though. I'll send a pull request to the rpi kernel after I've
> testing feedback from you.
> 
> regards,
> Marc
> 

I can also confirm these occasional CRC errors. I run a custom pytest
suite on a pi3b+ that uses isotp and bcm in parallel.
No CAN-FD, just 500k regular vehicle can for infotainment.

$ dmesg | grep CRC

[    8.198863] mcp251xfd spi0.1 can0: MCP2518FD rev0.0 (-RX_INT
-MAB_NO_WARN +CRC_REG +CRC_RX +CRC_TX +ECC -HD c:40.00MHz m:20.00MHz
r:17.00MHz e:16.66MHz) successfully initialized.

[    8.209159] mcp251xfd spi0.0 can1: MCP2518FD rev0.0 (-RX_INT
-MAB_NO_WARN +CRC_REG +CRC_RX +CRC_TX +ECC -HD c:40.00MHz m:20.00MHz
r:17.00MHz e:16.66MHz) successfully initialized.

[74264.462934] mcp251xfd spi0.0 mcp0: CRC read error at address 0x0010
(length=4, data=80 33 d2 8a, CRC=0x0e3e) retrying.

[74749.267977] mcp251xfd spi0.0 mcp0: CRC read error at address 0x0010
(length=4, data=80 1a ad 0e, CRC=0x8d17) retrying.

[591150.766153] mcp251xfd spi0.0 mcp0: CRC read error at address 0x0010
(length=4, data=80 3d b5 60, CRC=0x5e9c) retrying.


Best Regards,
Patrick

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [net-next 6/6] can: mcp251xfd: mcp251xfd_regmap_crc_read(): work around broken CRC on TBC register
  2021-04-22  7:18     ` Marc Kleine-Budde
  2021-04-22 16:46       ` Patrick Menschel
@ 2021-05-07  7:25       ` Marc Kleine-Budde
  2021-05-07  8:21         ` Patrick Menschel
  2021-05-07 22:36         ` Drew Fustini
  1 sibling, 2 replies; 33+ messages in thread
From: Marc Kleine-Budde @ 2021-05-07  7:25 UTC (permalink / raw)
  To: Drew Fustini; +Cc: netdev, linux-can, Will C, menschel.p

[-- Attachment #1: Type: text/plain, Size: 1536 bytes --]

On 22.04.2021 09:18:54, Marc Kleine-Budde wrote:
> On 4/21/21 9:58 PM, Drew Fustini wrote:
> > I am encountering similar error with the 5.10 raspberrypi kernel on
> > RPi 4 with MCP2518FD:
> > 
> >   mcp251xfd spi0.0 can0: CRC read error at address 0x0010 (length=4,
> > data=00 ad 58 67, CRC=0xbbfd) retrying
> 
> What's the situation you see these errors?
> 
> I'm not particular happy with that patch, as it only works around that one
> particular bit flip issue. If you really hammer the register, the driver will
> still notice CRC errors that can be explained by other bits flipping. Consider
> this as the first order approximation of a higher order problem :) - the root
> cause is still unknown.
> 
> > Would it be possible for you to pull these patches into a v5.10 branch
> > in your linux-rpi repo [1]?
> 
> Here you are:
> 
> https://github.com/marckleinebudde/linux-rpi/tree/v5.10-rpi/backport-performance-improvements
> 
> I've included the UINC performance enhancements, too. The branch is compiled
> tested only, though. I'll send a pull request to the rpi kernel after I've
> testing feedback from you.

Drew, Patrick, have you tested this branch? If so I'll send a pull
request to the raspi kernel.

regards,
Marc

-- 
Pengutronix e.K.                 | Marc Kleine-Budde           |
Embedded Linux                   | https://www.pengutronix.de  |
Vertretung West/Dortmund         | Phone: +49-231-2826-924     |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [net-next 6/6] can: mcp251xfd: mcp251xfd_regmap_crc_read(): work around broken CRC on TBC register
  2021-05-07  7:25       ` Marc Kleine-Budde
@ 2021-05-07  8:21         ` Patrick Menschel
  2021-05-07  8:25           ` Marc Kleine-Budde
  2021-05-07 22:36         ` Drew Fustini
  1 sibling, 1 reply; 33+ messages in thread
From: Patrick Menschel @ 2021-05-07  8:21 UTC (permalink / raw)
  To: Marc Kleine-Budde, Drew Fustini; +Cc: netdev, linux-can, Will C

Am 07.05.21 um 09:25 schrieb Marc Kleine-Budde:
> On 22.04.2021 09:18:54, Marc Kleine-Budde wrote:
>> On 4/21/21 9:58 PM, Drew Fustini wrote:
>>> I am encountering similar error with the 5.10 raspberrypi kernel on
>>> RPi 4 with MCP2518FD:
>>>
>>>   mcp251xfd spi0.0 can0: CRC read error at address 0x0010 (length=4,
>>> data=00 ad 58 67, CRC=0xbbfd) retrying
>>
>> What's the situation you see these errors?
>>
>> I'm not particular happy with that patch, as it only works around that one
>> particular bit flip issue. If you really hammer the register, the driver will
>> still notice CRC errors that can be explained by other bits flipping. Consider
>> this as the first order approximation of a higher order problem :) - the root
>> cause is still unknown.
>>
>>> Would it be possible for you to pull these patches into a v5.10 branch
>>> in your linux-rpi repo [1]?
>>
>> Here you are:
>>
>> https://github.com/marckleinebudde/linux-rpi/tree/v5.10-rpi/backport-performance-improvements
>>
>> I've included the UINC performance enhancements, too. The branch is compiled
>> tested only, though. I'll send a pull request to the rpi kernel after I've
>> testing feedback from you.
> 
> Drew, Patrick, have you tested this branch? If so I'll send a pull
> request to the raspi kernel.
> 
Sorry Marc,

not yet. Thanks for reminding me. I'll start a native build on a pi0w asap.

Is there any test application or stress test that I should run?

Regards,
Patrick

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [net-next 6/6] can: mcp251xfd: mcp251xfd_regmap_crc_read(): work around broken CRC on TBC register
  2021-05-07  8:21         ` Patrick Menschel
@ 2021-05-07  8:25           ` Marc Kleine-Budde
  2021-05-08 18:36             ` Patrick Menschel
  0 siblings, 1 reply; 33+ messages in thread
From: Marc Kleine-Budde @ 2021-05-07  8:25 UTC (permalink / raw)
  To: Patrick Menschel; +Cc: Drew Fustini, netdev, linux-can, Will C

[-- Attachment #1: Type: text/plain, Size: 1116 bytes --]

On 07.05.2021 08:21:57, Patrick Menschel wrote:
> >>> Would it be possible for you to pull these patches into a v5.10 branch
> >>> in your linux-rpi repo [1]?
> >>
> >> Here you are:
> >>
> >> https://github.com/marckleinebudde/linux-rpi/tree/v5.10-rpi/backport-performance-improvements
> >>
> >> I've included the UINC performance enhancements, too. The branch is compiled
> >> tested only, though. I'll send a pull request to the rpi kernel after I've
> >> testing feedback from you.
> > 
> > Drew, Patrick, have you tested this branch? If so I'll send a pull
> > request to the raspi kernel.
> > 

> not yet. Thanks for reminding me. I'll start a native build on a pi0w asap.
> 
> Is there any test application or stress test that I should run?

No, not any particular, do your normal (stress) testing.

regards,
Marc

-- 
Pengutronix e.K.                 | Marc Kleine-Budde           |
Embedded Linux                   | https://www.pengutronix.de  |
Vertretung West/Dortmund         | Phone: +49-231-2826-924     |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [net-next 6/6] can: mcp251xfd: mcp251xfd_regmap_crc_read(): work around broken CRC on TBC register
  2021-05-07  7:25       ` Marc Kleine-Budde
  2021-05-07  8:21         ` Patrick Menschel
@ 2021-05-07 22:36         ` Drew Fustini
  2021-05-08 12:30           ` Marc Kleine-Budde
  1 sibling, 1 reply; 33+ messages in thread
From: Drew Fustini @ 2021-05-07 22:36 UTC (permalink / raw)
  To: Marc Kleine-Budde; +Cc: Drew Fustini, netdev, linux-can, Will C, menschel.p

On Fri, May 7, 2021 at 12:56 AM Marc Kleine-Budde <mkl@pengutronix.de> wrote:
>
> On 22.04.2021 09:18:54, Marc Kleine-Budde wrote:
> > On 4/21/21 9:58 PM, Drew Fustini wrote:
> > > I am encountering similar error with the 5.10 raspberrypi kernel on
> > > RPi 4 with MCP2518FD:
> > >
> > >   mcp251xfd spi0.0 can0: CRC read error at address 0x0010 (length=4,
> > > data=00 ad 58 67, CRC=0xbbfd) retrying
> >
> > What's the situation you see these errors?
> >
> > I'm not particular happy with that patch, as it only works around that one
> > particular bit flip issue. If you really hammer the register, the driver will
> > still notice CRC errors that can be explained by other bits flipping. Consider
> > this as the first order approximation of a higher order problem :) - the root
> > cause is still unknown.
> >
> > > Would it be possible for you to pull these patches into a v5.10 branch
> > > in your linux-rpi repo [1]?
> >
> > Here you are:
> >
> > https://github.com/marckleinebudde/linux-rpi/tree/v5.10-rpi/backport-performance-improvements
> >
> > I've included the UINC performance enhancements, too. The branch is compiled
> > tested only, though. I'll send a pull request to the rpi kernel after I've
> > testing feedback from you.
>
> Drew, Patrick, have you tested this branch? If so I'll send a pull
> request to the raspi kernel.

Thank you for following up.

I need to build it and send it to the friend who was testing to check
if the CRC errors go away.  He is testing CANFD with a 2021 Ford F150
truck.  I will follow up here once I know the results.

-Drew

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [net-next 6/6] can: mcp251xfd: mcp251xfd_regmap_crc_read(): work around broken CRC on TBC register
  2021-05-07 22:36         ` Drew Fustini
@ 2021-05-08 12:30           ` Marc Kleine-Budde
  0 siblings, 0 replies; 33+ messages in thread
From: Marc Kleine-Budde @ 2021-05-08 12:30 UTC (permalink / raw)
  To: Drew Fustini; +Cc: Drew Fustini, netdev, linux-can, Will C, menschel.p

[-- Attachment #1: Type: text/plain, Size: 1254 bytes --]

On 07.05.2021 15:36:32, Drew Fustini wrote:
> > > > Would it be possible for you to pull these patches into a v5.10 branch
> > > > in your linux-rpi repo [1]?
> > >
> > > Here you are:
> > >
> > > https://github.com/marckleinebudde/linux-rpi/tree/v5.10-rpi/backport-performance-improvements
> > >
> > > I've included the UINC performance enhancements, too. The branch is compiled
> > > tested only, though. I'll send a pull request to the rpi kernel after I've
> > > testing feedback from you.
> >
> > Drew, Patrick, have you tested this branch? If so I'll send a pull
> > request to the raspi kernel.
> 
> Thank you for following up.
> 
> I need to build it and send it to the friend who was testing to check
> if the CRC errors go away.  He is testing CANFD with a 2021 Ford F150
> truck.  I will follow up here once I know the results.

The CRC Errors don't go away completely, however they are reduced by
more than an order of magnitude.

regards,
Marc

-- 
Pengutronix e.K.                 | Marc Kleine-Budde           |
Embedded Linux                   | https://www.pengutronix.de  |
Vertretung West/Dortmund         | Phone: +49-231-2826-924     |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [net-next 6/6] can: mcp251xfd: mcp251xfd_regmap_crc_read(): work around broken CRC on TBC register
  2021-05-07  8:25           ` Marc Kleine-Budde
@ 2021-05-08 18:36             ` Patrick Menschel
  2021-05-09  7:46               ` Patrick Menschel
  2021-05-10  7:43               ` Marc Kleine-Budde
  0 siblings, 2 replies; 33+ messages in thread
From: Patrick Menschel @ 2021-05-08 18:36 UTC (permalink / raw)
  To: Marc Kleine-Budde; +Cc: Drew Fustini, netdev, linux-can, Will C

Am 07.05.21 um 10:25 schrieb Marc Kleine-Budde:
> On 07.05.2021 08:21:57, Patrick Menschel wrote:
>>>>> Would it be possible for you to pull these patches into a v5.10 branch
>>>>> in your linux-rpi repo [1]?
>>>>
>>>> Here you are:
>>>>
>>>> https://github.com/marckleinebudde/linux-rpi/tree/v5.10-rpi/backport-performance-improvements
>>>>
>>>> I've included the UINC performance enhancements, too. The branch is compiled
>>>> tested only, though. I'll send a pull request to the rpi kernel after I've
>>>> testing feedback from you.
>>>
>>> Drew, Patrick, have you tested this branch? If so I'll send a pull
>>> request to the raspi kernel.
>>>
> 
>> not yet. Thanks for reminding me. I'll start a native build on a pi0w asap.
>>
>> Is there any test application or stress test that I should run?
> 
> No, not any particular, do your normal (stress) testing.
> 
Following up on this.

Build and test finished on a pi0w.

### Test conditions ###

Since I lacked a true stress test, I wrote one for regular tox with
pytest collection.

https://gitlab.com/Menschel/socketcan/-/blob/master/tests/test_socketcan.py#L872

It uses mcp0 and mcp1 which are directly connected.
No CAN FD, just 500k with regular frames, random id and random data.

I basically mimic cangen but enhanced with a queue that handles to the
rx thread what should be compared next.

### Extract from dmesg shows no CRC Errors ###

[   30.930608] CAN device driver interface
[   30.967349] spi_master spi0: will run message pump with realtime priority
[   31.054202] mcp251xfd spi0.1 can0: MCP2518FD rev0.0 (-RX_INT
-MAB_NO_WARN +CRC_REG +CRC_RX +CRC_TX +ECC -HD c:40.00MHz m:20.00MHz
r:17.00MHz e:16.66MHz) successfully initialized.
[   31.076906] mcp251xfd spi0.0 can1: MCP2518FD rev0.0 (-RX_INT
-MAB_NO_WARN +CRC_REG +CRC_RX +CRC_TX +ECC -HD c:40.00MHz m:20.00MHz
r:17.00MHz e:16.66MHz) successfully initialized.
[   31.298969] mcp251xfd spi0.0 mcp0: renamed from can1
[   31.339864] mcp251xfd spi0.1 mcp1: renamed from can0
[   33.471889] IPv6: ADDRCONF(NETDEV_CHANGE): mcp0: link becomes ready
[   34.482260] IPv6: ADDRCONF(NETDEV_CHANGE): mcp1: link becomes ready
[  215.218979] can: controller area network core
[  215.219146] NET: Registered protocol family 29
[  215.261599] can: raw protocol
[  218.745376] can: isotp protocol
[  220.931150] NOHZ tick-stop error: Non-RCU local softirq work is
pending, handler #08!!!
[  220.931274] NOHZ tick-stop error: Non-RCU local softirq work is
pending, handler #08!!!
[  220.931395] NOHZ tick-stop error: Non-RCU local softirq work is
pending, handler #08!!!
[  220.931518] NOHZ tick-stop error: Non-RCU local softirq work is
pending, handler #08!!!
[  220.931643] NOHZ tick-stop error: Non-RCU local softirq work is
pending, handler #08!!!
[  220.931768] NOHZ tick-stop error: Non-RCU local softirq work is
pending, handler #08!!!
[  220.931893] NOHZ tick-stop error: Non-RCU local softirq work is
pending, handler #08!!!
[  222.099822] NOHZ tick-stop error: Non-RCU local softirq work is
pending, handler #08!!!
[  222.099901] NOHZ tick-stop error: Non-RCU local softirq work is
pending, handler #08!!!
[  222.100022] NOHZ tick-stop error: Non-RCU local softirq work is
pending, handler #08!!!
[  222.330438] can: broadcast manager protocol

That softirq error has something to do with IsoTp. I was not able to
trace it back but I have it on multiple boards: pi0w, pi3b, pi3b+.


### Performance ###

## v5.10-rpi/backport-performance-improvements ##

I get about 20000 frames in 2 minutes.

2021-05-08 19:00:36 [    INFO] 20336 frames in 0:02:00
(test_socketcan.py:890)

2021-05-08 19:49:34 [    INFO] 20001 frames in 0:02:00
(test_socketcan.py:890)


## regular v5.10 ##

2021-05-08 20:19:55 [    INFO] 20000 frames in 0:02:00
(test_socketcan.py:890)

2021-05-08 20:22:40 [    INFO] 19995 frames in 0:02:00
(test_socketcan.py:890)

2021-05-08 20:25:22 [    INFO] 19931 frames in 0:02:00
(test_socketcan.py:890)


The numbers are slightly better but I count that as tolerance. I also
found that there are cross effects. If I run the same test on vcan0
before, the frame count goes down to 13000 instead.

I also have to admit, that I didn't get any crc errors with regular
v5.10 during that tests.

Do I have to change my test?

I can still update that pi3b+ that runs my micro-hil at work. That was
the one that occasionally had CRC errors.

Best Regards,
Patrick




^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [net-next 6/6] can: mcp251xfd: mcp251xfd_regmap_crc_read(): work around broken CRC on TBC register
  2021-05-08 18:36             ` Patrick Menschel
@ 2021-05-09  7:46               ` Patrick Menschel
  2021-05-10  7:45                 ` Marc Kleine-Budde
  2021-05-10  7:43               ` Marc Kleine-Budde
  1 sibling, 1 reply; 33+ messages in thread
From: Patrick Menschel @ 2021-05-09  7:46 UTC (permalink / raw)
  To: Marc Kleine-Budde; +Cc: Drew Fustini, netdev, linux-can, Will C

Am 08.05.21 um 20:36 schrieb Patrick Menschel:
> Am 07.05.21 um 10:25 schrieb Marc Kleine-Budde:
>> On 07.05.2021 08:21:57, Patrick Menschel wrote:
>>>>>> Would it be possible for you to pull these patches into a v5.10 branch
>>>>>> in your linux-rpi repo [1]?
>>>>>
>>>>> Here you are:
>>>>>
>>>>> https://github.com/marckleinebudde/linux-rpi/tree/v5.10-rpi/backport-performance-improvements
>>>>>
>>>>> I've included the UINC performance enhancements, too. The branch is compiled
>>>>> tested only, though. I'll send a pull request to the rpi kernel after I've
>>>>> testing feedback from you.
>>>>
>>>> Drew, Patrick, have you tested this branch? If so I'll send a pull
>>>> request to the raspi kernel.
>>>>
>>
>>> not yet. Thanks for reminding me. I'll start a native build on a pi0w asap.
>>>
>>> Is there any test application or stress test that I should run?
>>
>> No, not any particular, do your normal (stress) testing.
>>
> Following up on this.
> 
> ...
> 
> Do I have to change my test?

Hi Marc,

I changed my test to 1 hour and removed the sleep statement.
Still no measurable difference for performance and no CRC Errors with
both kernels.

Apparently the test is hard on the CPU, I have two pytest processes
listed in htop one with 80%CPU and one with 60% CPU, approx 30% ram
usage of 512MB. I have no clue how it reaches the CPU values, there
should be only one CPU on the pi0w.


### 5.10.17+ on pi0w ###

2021-05-09 08:02:56 [    INFO] 725649 frames in 1:00:00
(test_socketcan.py:890)


### 5.10.31-performance-backports+ on pi0w ###

2021-05-09 09:13:32 [    INFO] 715936 frames in 1:00:00
(test_socketcan.py:890)


I'll switch boards to a pi3b and test again with these settings.

Best Regards,
Patrick

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [net-next 6/6] can: mcp251xfd: mcp251xfd_regmap_crc_read(): work around broken CRC on TBC register
  2021-05-08 18:36             ` Patrick Menschel
  2021-05-09  7:46               ` Patrick Menschel
@ 2021-05-10  7:43               ` Marc Kleine-Budde
  2021-12-07 16:53                 ` Modilaynen, Pavel
  1 sibling, 1 reply; 33+ messages in thread
From: Marc Kleine-Budde @ 2021-05-10  7:43 UTC (permalink / raw)
  To: Patrick Menschel; +Cc: Drew Fustini, netdev, linux-can, Will C

[-- Attachment #1: Type: text/plain, Size: 6214 bytes --]

On 08.05.2021 18:36:56, Patrick Menschel wrote:
> ### Test conditions ###
> 
> Since I lacked a true stress test, I wrote one for regular tox with
> pytest collection.
> 
> https://gitlab.com/Menschel/socketcan/-/blob/master/tests/test_socketcan.py#L872
> 
> It uses mcp0 and mcp1 which are directly connected.
> No CAN FD, just 500k with regular frames, random id and random data.
> 
> I basically mimic cangen but enhanced with a queue that handles to the
> rx thread what should be compared next.
> 
> ### Extract from dmesg shows no CRC Errors ###
> 
> [   30.930608] CAN device driver interface
> [   30.967349] spi_master spi0: will run message pump with realtime priority
> [   31.054202] mcp251xfd spi0.1 can0: MCP2518FD rev0.0 (-RX_INT
> -MAB_NO_WARN +CRC_REG +CRC_RX +CRC_TX +ECC -HD c:40.00MHz m:20.00MHz
> r:17.00MHz e:16.66MHz) successfully initialized.
> [   31.076906] mcp251xfd spi0.0 can1: MCP2518FD rev0.0 (-RX_INT
> -MAB_NO_WARN +CRC_REG +CRC_RX +CRC_TX +ECC -HD c:40.00MHz m:20.00MHz
> r:17.00MHz e:16.66MHz) successfully initialized.
> [   31.298969] mcp251xfd spi0.0 mcp0: renamed from can1
> [   31.339864] mcp251xfd spi0.1 mcp1: renamed from can0
> [   33.471889] IPv6: ADDRCONF(NETDEV_CHANGE): mcp0: link becomes ready
> [   34.482260] IPv6: ADDRCONF(NETDEV_CHANGE): mcp1: link becomes ready
> [  215.218979] can: controller area network core
> [  215.219146] NET: Registered protocol family 29
> [  215.261599] can: raw protocol
> [  218.745376] can: isotp protocol
> [  220.931150] NOHZ tick-stop error: Non-RCU local softirq work is
> pending, handler #08!!!
> [  220.931274] NOHZ tick-stop error: Non-RCU local softirq work is
> pending, handler #08!!!
> [  220.931395] NOHZ tick-stop error: Non-RCU local softirq work is
> pending, handler #08!!!
> [  220.931518] NOHZ tick-stop error: Non-RCU local softirq work is
> pending, handler #08!!!
> [  220.931643] NOHZ tick-stop error: Non-RCU local softirq work is
> pending, handler #08!!!
> [  220.931768] NOHZ tick-stop error: Non-RCU local softirq work is
> pending, handler #08!!!
> [  220.931893] NOHZ tick-stop error: Non-RCU local softirq work is
> pending, handler #08!!!
> [  222.099822] NOHZ tick-stop error: Non-RCU local softirq work is
> pending, handler #08!!!
> [  222.099901] NOHZ tick-stop error: Non-RCU local softirq work is
> pending, handler #08!!!
> [  222.100022] NOHZ tick-stop error: Non-RCU local softirq work is
> pending, handler #08!!!
> [  222.330438] can: broadcast manager protocol
> 
> That softirq error has something to do with IsoTp. I was not able to
> trace it back but I have it on multiple boards: pi0w, pi3b, pi3b+.

The softirq error is known and shows up as the mcp251xfd driver raises a
softirq from threaded IRQ context. We're working on fixing this.

> ### Performance ###
> 
> ## v5.10-rpi/backport-performance-improvements ##
> 
> I get about 20000 frames in 2 minutes.
> 
> 2021-05-08 19:00:36 [    INFO] 20336 frames in 0:02:00
> (test_socketcan.py:890)
> 
> 2021-05-08 19:49:34 [    INFO] 20001 frames in 0:02:00
> (test_socketcan.py:890)
> 
> 
> ## regular v5.10 ##
> 
> 2021-05-08 20:19:55 [    INFO] 20000 frames in 0:02:00
> (test_socketcan.py:890)
> 
> 2021-05-08 20:22:40 [    INFO] 19995 frames in 0:02:00
> (test_socketcan.py:890)
> 
> 2021-05-08 20:25:22 [    INFO] 19931 frames in 0:02:00
> (test_socketcan.py:890)
> 
> 
> The numbers are slightly better but I count that as tolerance.

Makes sense. But you have only measured number of frames in a given
time. The raspi SPI driver is highly optimized so the changes in the
driver don't show up in those numbers.

Thanks for testing, I'll send a pull request to the raspi kernel.

If you are interested if there are performance benefits on your raspi,
consider measuring the spent CPU time and the number of SPI interrupts.

Measure CPU time by putting the command "time" in front of your test.
Measure SPI Interrupts by looking at /proc/interrupts before and after
the test. Note: there are SPI host controller interrupts and Interrupts
from the mcp251xfd.

On a raspi you probably only have a hand full of SPI host controller
interrupts, as the raspi driver only uses interrupts for long transfers.
There will be a mcp251xfd interrupt per TX-complete and RX CAN message,
maybe a few less if they overlap.

> I also found that there are cross effects. If I run the same test on
> vcan0 before, the frame count goes down to 13000 instead.

The changes only touch the mcp251xfd driver, if you see a difference
with the vcan driver, it's either a change in the kernel somewhere else
or your test setup is sensitive to something you changed without
noticing (starting condition, ...)

> I also have to admit, that I didn't get any crc errors with regular
> v5.10 during that tests.

The CRC errors the patch works around are CRC errors introduced by a
chip erratum, not by electromagnetic interference. In my observation
these CRC errors show up if the register contents changes while the
register is read. The register that changes most is the timer base
counter register. That register is only read if a CAN bus error is
signaled to user space (and this is maximized by enabling bus error
reporting). If it happens to be a CRC error while reading the TBC
register and the CRC can be "corrected" by flipping the upper most bit,
there will be no error message about any CRC errors.

Long story short. You only notice that this patch works, if in a
situation you had CRC errors on the TBC register (that is CAN errors are
reported to user space), you now have an order of magnitude less CRC
errors than before.

> Do I have to change my test?

No need to.

> I can still update that pi3b+ that runs my micro-hil at work. That was
> the one that occasionally had CRC errors.

Thanks again for testing!

regards,
Marc

-- 
Pengutronix e.K.                 | Marc Kleine-Budde           |
Embedded Linux                   | https://www.pengutronix.de  |
Vertretung West/Dortmund         | Phone: +49-231-2826-924     |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [net-next 6/6] can: mcp251xfd: mcp251xfd_regmap_crc_read(): work around broken CRC on TBC register
  2021-05-09  7:46               ` Patrick Menschel
@ 2021-05-10  7:45                 ` Marc Kleine-Budde
  2021-05-20 10:29                   ` Patrick Menschel
  0 siblings, 1 reply; 33+ messages in thread
From: Marc Kleine-Budde @ 2021-05-10  7:45 UTC (permalink / raw)
  To: Patrick Menschel; +Cc: Drew Fustini, netdev, linux-can, Will C

[-- Attachment #1: Type: text/plain, Size: 1191 bytes --]

On 09.05.2021 07:46:20, Patrick Menschel wrote:
> > Do I have to change my test?
> 
> I changed my test to 1 hour and removed the sleep statement.
> Still no measurable difference for performance and no CRC Errors with
> both kernels.

See other mail about my thoughts about performance and CRC.

> Apparently the test is hard on the CPU, I have two pytest processes
> listed in htop one with 80%CPU and one with 60% CPU, approx 30% ram
> usage of 512MB. I have no clue how it reaches the CPU values, there
> should be only one CPU on the pi0w.

Interesting :)

> ### 5.10.17+ on pi0w ###
> 
> 2021-05-09 08:02:56 [    INFO] 725649 frames in 1:00:00
> (test_socketcan.py:890)
> 
> ### 5.10.31-performance-backports+ on pi0w ###
> 
> 2021-05-09 09:13:32 [    INFO] 715936 frames in 1:00:00
> (test_socketcan.py:890)
> 
> I'll switch boards to a pi3b and test again with these settings.

Marc

-- 
Pengutronix e.K.                 | Marc Kleine-Budde           |
Embedded Linux                   | https://www.pengutronix.de  |
Vertretung West/Dortmund         | Phone: +49-231-2826-924     |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [net-next 6/6] can: mcp251xfd: mcp251xfd_regmap_crc_read(): work around broken CRC on TBC register
  2021-05-10  7:45                 ` Marc Kleine-Budde
@ 2021-05-20 10:29                   ` Patrick Menschel
  0 siblings, 0 replies; 33+ messages in thread
From: Patrick Menschel @ 2021-05-20 10:29 UTC (permalink / raw)
  To: Marc Kleine-Budde; +Cc: Drew Fustini, netdev, linux-can, Will C

Am 10.05.21 um 09:45 schrieb Marc Kleine-Budde:

Just checked that particular pi3b+. 2 errors in 10 days.

$ dmesg | grep -i crc

[    8.335215] mcp251xfd spi0.1 can0: MCP2518FD rev0.0 (-RX_INT
-MAB_NO_WARN +CRC_REG +CRC_RX +CRC_TX +ECC -HD c:40.00MHz m:20.00MHz
r:17.00MHz e:16.66MHz) successfully initialized.

[    8.362778] mcp251xfd spi0.0 can1: MCP2518FD rev0.0 (-RX_INT
-MAB_NO_WARN +CRC_REG +CRC_RX +CRC_TX +ECC -HD c:40.00MHz m:20.00MHz
r:17.00MHz e:16.66MHz) successfully initialized.

[178691.606232] mcp251xfd spi0.0 mcp0: CRC read error at address 0x0010
(length=4, data=84 35 73 16, CRC=0x9b26) retrying.

[188341.664546] mcp251xfd spi0.0 mcp0: CRC read error at address 0x0010
(length=4, data=14 df c5 f5, CRC=0xe06b) retrying.


Regards,
Patrick

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [net-next 6/6] can: mcp251xfd: mcp251xfd_regmap_crc_read(): work around broken CRC on TBC register
@ 2021-12-07 16:53                 ` Modilaynen, Pavel
  2021-12-08  8:54                   ` Marc Kleine-Budde
  2021-12-09 10:22                   ` Thomas.Kopp
  0 siblings, 2 replies; 33+ messages in thread
From: Modilaynen, Pavel @ 2021-12-07 16:53 UTC (permalink / raw)
  To: mkl; +Cc: drew, linux-can, menschel.p, netdev, will

Hello Marc,

We observe the very similar issue with MCP2517FD.

>The CRC errors the patch works around are CRC errors introduced by a
>chip erratum, not by electromagnetic interference. In my observation

Are you referring this errata doc
https://datasheet.octopart.com/MCP2517FD-H-JHA-Microchip-datasheet-136609045.pdf ?

We have the similar CRC read errors but
the lowest byte is not 0x00 and 0x80, it's actually 0x0x or 0x8x, e.g.

mcp251xfd spi0.0 can0: CRC read error at address 0x0010 (length=4, data=82 d1 fa 6c, CRC=0xd9c2) retrying.

0xb0 0x10 0x04 0x82 0xd1 0xfa 0x6c => 0x59FD (not matching)

but if I flip the first received bit  (highest bit in the lowest byte):
0xb0 0x10 0x04 0x02 0xd1 0xfa 0x6c => 0xD9C2 (matching!)

So, your fix covers only the case of 0x00 and 0x80, 
do you think that the workaround should be extended so check
     (buf_rx->data[0] == 0x0 || buf_rx->data[0] == 0x80)) {
turns into 
     ((buf_rx->data[0] & 0xf0) == 0x0 || (buf_rx->data[0] & 0xf0) == 0x80)) {

Errata, actually says
"Only bits 7/15/23/31 of the following registers can be affected:"

So, we could basically, in simplest case flip bit 31 and re-check CRC without any check of 
rx->data[0]....

Regards,
Pavel



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [net-next 6/6] can: mcp251xfd: mcp251xfd_regmap_crc_read(): work around broken CRC on TBC register
  2021-12-07 16:53                 ` Modilaynen, Pavel
@ 2021-12-08  8:54                   ` Marc Kleine-Budde
  2021-12-09 10:22                   ` Thomas.Kopp
  1 sibling, 0 replies; 33+ messages in thread
From: Marc Kleine-Budde @ 2021-12-08  8:54 UTC (permalink / raw)
  To: Modilaynen, Pavel; +Cc: drew, linux-can, menschel.p, netdev, will

[-- Attachment #1: Type: text/plain, Size: 2094 bytes --]

On 07.12.2021 16:53:12, Modilaynen, Pavel wrote:
> Hello Marc,
> 
> We observe the very similar issue with MCP2517FD.
> 
> >The CRC errors the patch works around are CRC errors introduced by a
> >chip erratum, not by electromagnetic interference. In my observation
> 
> Are you referring this errata doc
> https://datasheet.octopart.com/MCP2517FD-H-JHA-Microchip-datasheet-136609045.pdf ?

ACK - although there are newer versions available. The newest erratum
for the mcp2517fd is the "C" issue:

| https://ww1.microchip.com/downloads/en/DeviceDoc/MCP2517FD-Silicon-Errata-and-Data-Sheet-Clarification-DS80000792C.pdf

The mpc2518fd can be downloaded here:

| https://ww1.microchip.com/downloads/en/DeviceDoc/MCP2518FD-Silicon-Errata-and-Data-Sheet-Clarification-DS80000789C.pdf

> We have the similar CRC read errors but
> the lowest byte is not 0x00 and 0x80, it's actually 0x0x or 0x8x, e.g.
> 
> mcp251xfd spi0.0 can0: CRC read error at address 0x0010 (length=4, data=82 d1 fa 6c, CRC=0xd9c2) retrying.
> 
> 0xb0 0x10 0x04 0x82 0xd1 0xfa 0x6c => 0x59FD (not matching)
> 
> but if I flip the first received bit  (highest bit in the lowest byte):
> 0xb0 0x10 0x04 0x02 0xd1 0xfa 0x6c => 0xD9C2 (matching!)
> 
> So, your fix covers only the case of 0x00 and 0x80, 
> do you think that the workaround should be extended so check
>      (buf_rx->data[0] == 0x0 || buf_rx->data[0] == 0x80)) {
> turns into 
>      ((buf_rx->data[0] & 0xf0) == 0x0 || (buf_rx->data[0] & 0xf0) == 0x80)) {
> 
> Errata, actually says
> "Only bits 7/15/23/31 of the following registers can be affected:"
> 
> So, we could basically, in simplest case flip bit 31 and re-check CRC
> without any check of rx->data[0]....

Can you send a patch that updates the workaround and description.

regards,
Marc

-- 
Pengutronix e.K.                 | Marc Kleine-Budde           |
Embedded Linux                   | https://www.pengutronix.de  |
Vertretung West/Dortmund         | Phone: +49-231-2826-924     |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* RE: [net-next 6/6] can: mcp251xfd: mcp251xfd_regmap_crc_read(): work around broken CRC on TBC register
  2021-12-07 16:53                 ` Modilaynen, Pavel
  2021-12-08  8:54                   ` Marc Kleine-Budde
@ 2021-12-09 10:22                   ` Thomas.Kopp
  2021-12-09 11:17                     ` AW: " Sven Schuchmann
  2021-12-13 22:12                     ` Modilaynen, Pavel
  1 sibling, 2 replies; 33+ messages in thread
From: Thomas.Kopp @ 2021-12-09 10:22 UTC (permalink / raw)
  To: pavel.modilaynen, mkl; +Cc: drew, linux-can, menschel.p, netdev, will

Hi Pavel,
 
> We have the similar CRC read errors but
> the lowest byte is not 0x00 and 0x80, it's actually 0x0x or 0x8x, e.g.
> 
> mcp251xfd spi0.0 can0: CRC read error at address 0x0010 (length=4,
> data=82 d1 fa 6c, CRC=0xd9c2) retrying.
> 
> 0xb0 0x10 0x04 0x82 0xd1 0xfa 0x6c => 0x59FD (not matching)
> 
> but if I flip the first received bit  (highest bit in the lowest byte):
> 0xb0 0x10 0x04 0x02 0xd1 0xfa 0x6c => 0xD9C2 (matching!)

What settings do you have on your setup? Can you please print the dmesg output from the init? I'm especially interested in Sysclk and SPI speed.

Thanks,
Thomas

^ permalink raw reply	[flat|nested] 33+ messages in thread

* AW: [net-next 6/6] can: mcp251xfd: mcp251xfd_regmap_crc_read(): work around broken CRC on TBC register
  2021-12-09 10:22                   ` Thomas.Kopp
@ 2021-12-09 11:17                     ` Sven Schuchmann
  2021-12-09 11:27                       ` Marc Kleine-Budde
  2021-12-13 22:12                     ` Modilaynen, Pavel
  1 sibling, 1 reply; 33+ messages in thread
From: Sven Schuchmann @ 2021-12-09 11:17 UTC (permalink / raw)
  To: Thomas.Kopp, pavel.modilaynen, mkl
  Cc: drew, linux-can, menschel.p, netdev, will

Hi All,

we are also seeing the CRC Errors in our setup (rpi4, Kernel 5.10.x)
from time to time. I just wanted to post here what I am seeing, maybe it helps...

[    6.761711] spi_master spi1: will run message pump with realtime priority
[    6.778063] mcp251xfd spi1.0 can1: MCP2518FD rev0.0 (-RX_INT -MAB_NO_WARN +CRC_REG +CRC_RX +CRC_TX +ECC -HD c:40.00MHz m:20.00MHz r:17.00MHz e:16.66MHz) successfully initialized.

[ 4327.107856] mcp251xfd spi1.0 canfd1: CRC read error at address 0x0010 (length=4, data=00 cc 62 c4, CRC=0xa3a0) retrying.
[ 7770.163335] mcp251xfd spi1.0 canfd1: CRC read error at address 0x0010 (length=4, data=00 bf 16 d5, CRC=0x9d3c) retrying.
[ 8000.565955] mcp251xfd spi1.0 canfd1: CRC read error at address 0x0010 (length=4, data=00 40 66 fa, CRC=0x31d7) retrying.
[ 9753.658173] mcp251xfd spi1.0 canfd1: CRC read error at address 0x0010 (length=4, data=80 e9 01 4e, CRC=0xe862) retrying.


Sven


> -----Ursprüngliche Nachricht-----
> Von: Thomas.Kopp@microchip.com <Thomas.Kopp@microchip.com>
> Gesendet: Donnerstag, 9. Dezember 2021 11:22
> An: pavel.modilaynen@volvocars.com; mkl@pengutronix.de
> Cc: drew@beagleboard.org; linux-can@vger.kernel.org; menschel.p@posteo.de;
> netdev@vger.kernel.org; will@macchina.cc
> Betreff: RE: [net-next 6/6] can: mcp251xfd: mcp251xfd_regmap_crc_read(): work around
> broken CRC on TBC register
> 
> Hi Pavel,
> 
> > We have the similar CRC read errors but
> > the lowest byte is not 0x00 and 0x80, it's actually 0x0x or 0x8x, e.g.
> >
> > mcp251xfd spi0.0 can0: CRC read error at address 0x0010 (length=4,
> > data=82 d1 fa 6c, CRC=0xd9c2) retrying.
> >
> > 0xb0 0x10 0x04 0x82 0xd1 0xfa 0x6c => 0x59FD (not matching)
> >
> > but if I flip the first received bit  (highest bit in the lowest byte):
> > 0xb0 0x10 0x04 0x02 0xd1 0xfa 0x6c => 0xD9C2 (matching!)
> 
> What settings do you have on your setup? Can you please print the dmesg output from the
> init? I'm especially interested in Sysclk and SPI speed.
> 
> Thanks,
> Thomas

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [net-next 6/6] can: mcp251xfd: mcp251xfd_regmap_crc_read(): work around broken CRC on TBC register
  2021-12-09 11:17                     ` AW: " Sven Schuchmann
@ 2021-12-09 11:27                       ` Marc Kleine-Budde
  2021-12-09 12:53                         ` AW: " Sven Schuchmann
  0 siblings, 1 reply; 33+ messages in thread
From: Marc Kleine-Budde @ 2021-12-09 11:27 UTC (permalink / raw)
  To: Sven Schuchmann
  Cc: Thomas.Kopp, pavel.modilaynen, drew, linux-can, menschel.p, netdev, will

[-- Attachment #1: Type: text/plain, Size: 1890 bytes --]

On 09.12.2021 11:17:09, Sven Schuchmann wrote:
> we are also seeing the CRC Errors in our setup (rpi4, Kernel 5.10.x)
> from time to time. I just wanted to post here what I am seeing, maybe
> it helps...
> 
> [    6.761711] spi_master spi1: will run message pump with realtime priority
> [    6.778063] mcp251xfd spi1.0 can1: MCP2518FD rev0.0 (-RX_INT -MAB_NO_WARN +CRC_REG +CRC_RX +CRC_TX +ECC -HD c:40.00MHz m:20.00MHz r:17.00MHz e:16.66MHz) successfully initialized.
> 
> [ 4327.107856] mcp251xfd spi1.0 canfd1: CRC read error at address 0x0010 (length=4, data=00 cc 62 c4, CRC=0xa3a0) retrying.
> [ 7770.163335] mcp251xfd spi1.0 canfd1: CRC read error at address 0x0010 (length=4, data=00 bf 16 d5, CRC=0x9d3c) retrying.
> [ 8000.565955] mcp251xfd spi1.0 canfd1: CRC read error at address 0x0010 (length=4, data=00 40 66 fa, CRC=0x31d7) retrying.
> [ 9753.658173] mcp251xfd spi1.0 canfd1: CRC read error at address 0x0010 (length=4, data=80 e9 01 4e, CRC=0xe862) retrying.

You are using the a back port of my HW timestamp in your v5.10 branch.
So every 45 seconds the TBC register (address 0x0010) is read,
additionally for every CAN error frame.

In the mean time, I've implemented a workaround for the CRC read errors:

| c7eb923c3caf can: mcp251xfd: mcp251xfd_regmap_crc_read(): work around broken CRC on TBC register
| ef7a8c3e7599 can: mcp251xfd: mcp251xfd_regmap_crc_read_one(): Factor out crc check into separate function

It fixes the CRC read error, if the first data byte is 0x00 or 0x80.

These messages should disappear, if you cherry-pick the above patches.

regards,
Marc

-- 
Pengutronix e.K.                 | Marc Kleine-Budde           |
Embedded Linux                   | https://www.pengutronix.de  |
Vertretung West/Dortmund         | Phone: +49-231-2826-924     |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* AW: [net-next 6/6] can: mcp251xfd: mcp251xfd_regmap_crc_read(): work around broken CRC on TBC register
  2021-12-09 11:27                       ` Marc Kleine-Budde
@ 2021-12-09 12:53                         ` Sven Schuchmann
  0 siblings, 0 replies; 33+ messages in thread
From: Sven Schuchmann @ 2021-12-09 12:53 UTC (permalink / raw)
  To: Marc Kleine-Budde
  Cc: Thomas.Kopp, pavel.modilaynen, drew, linux-can, menschel.p, netdev, will

Hello Marc,

> Von: Marc Kleine-Budde <mkl@pengutronix.de>
> Gesendet: Donnerstag, 9. Dezember 2021 12:28
> An: Sven Schuchmann <schuchmann@schleissheimer.de>
> 
> On 09.12.2021 11:17:09, Sven Schuchmann wrote:
> > we are also seeing the CRC Errors in our setup (rpi4, Kernel 5.10.x)
> > from time to time. I just wanted to post here what I am seeing, maybe
> > it helps...
> >
> > [    6.761711] spi_master spi1: will run message pump with realtime priority
> > [    6.778063] mcp251xfd spi1.0 can1: MCP2518FD rev0.0 (-RX_INT -MAB_NO_WARN +CRC_REG
> +CRC_RX +CRC_TX +ECC -HD c:40.00MHz m:20.00MHz r:17.00MHz e:16.66MHz) successfully
> initialized.
> >
> > [ 4327.107856] mcp251xfd spi1.0 canfd1: CRC read error at address 0x0010 (length=4,
> data=00 cc 62 c4, CRC=0xa3a0) retrying.
> > [ 7770.163335] mcp251xfd spi1.0 canfd1: CRC read error at address 0x0010 (length=4,
> data=00 bf 16 d5, CRC=0x9d3c) retrying.
> > [ 8000.565955] mcp251xfd spi1.0 canfd1: CRC read error at address 0x0010 (length=4,
> data=00 40 66 fa, CRC=0x31d7) retrying.
> > [ 9753.658173] mcp251xfd spi1.0 canfd1: CRC read error at address 0x0010 (length=4,
> data=80 e9 01 4e, CRC=0xe862) retrying.
> 
> You are using the a back port of my HW timestamp in your v5.10 branch.
> So every 45 seconds the TBC register (address 0x0010) is read,
> additionally for every CAN error frame.
>
> In the mean time, I've implemented a workaround for the CRC read errors:
> 
> | c7eb923c3caf can: mcp251xfd: mcp251xfd_regmap_crc_read(): work around broken CRC on TBC
> register
> | ef7a8c3e7599 can: mcp251xfd: mcp251xfd_regmap_crc_read_one(): Factor out crc check into
> separate function
> 
> It fixes the CRC read error, if the first data byte is 0x00 or 0x80.
> 
> These messages should disappear, if you cherry-pick the above patches.

Sorry for the confusion, you are right.
I picked the two patches and so far no more CRC read errors.
Thanks a lot.

Sven

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [net-next 6/6] can: mcp251xfd: mcp251xfd_regmap_crc_read(): work around broken CRC on TBC register
  2021-12-09 10:22                   ` Thomas.Kopp
  2021-12-09 11:17                     ` AW: " Sven Schuchmann
@ 2021-12-13 22:12                     ` Modilaynen, Pavel
  2021-12-21 22:24                       ` Thomas.Kopp
  1 sibling, 1 reply; 33+ messages in thread
From: Modilaynen, Pavel @ 2021-12-13 22:12 UTC (permalink / raw)
  To: Thomas.Kopp, mkl; +Cc: drew, linux-can, menschel.p, netdev, will

Hi Thomas,

> > We have the similar CRC read errors but
> > the lowest byte is not 0x00 and 0x80, it's actually 0x0x or 0x8x, e.g.
> >
> > mcp251xfd spi0.0 can0: CRC read error at address 0x0010 (length=4,
> > data=82 d1 fa 6c, CRC=0xd9c2) retrying.
> > 
> > 0xb0 0x10 0x04 0x82 0xd1 0xfa 0x6c => 0x59FD (not matching)
> > 
> > but if I flip the first received bit  (highest bit in the lowest byte):
> > 0xb0 0x10 0x04 0x02 0xd1 0xfa 0x6c => 0xD9C2 (matching!)

> What settings do you have on your setup? Can you please print the dmesg output from the init? I'm especially interested in Sysclk and SPI speed.

mcp251xfd spi0.0 can0: MCP2517FD rev0.0 (-RX_INT +MAB_NO_WARN +CRC_REG +CRC_RX +CRC_TX +ECC -HD c:40.00MHz m:10.00MHz r:10.00MHz e:10.00MHz) successfully initialized.
...

Regards,
Pavel

^ permalink raw reply	[flat|nested] 33+ messages in thread

* RE: [net-next 6/6] can: mcp251xfd: mcp251xfd_regmap_crc_read(): work around broken CRC on TBC register
  2021-12-13 22:12                     ` Modilaynen, Pavel
@ 2021-12-21 22:24                       ` Thomas.Kopp
  2022-06-21 14:25                         ` Marc Kleine-Budde
  2022-06-26 19:14                         ` Marc Kleine-Budde
  0 siblings, 2 replies; 33+ messages in thread
From: Thomas.Kopp @ 2021-12-21 22:24 UTC (permalink / raw)
  To: pavel.modilaynen, mkl; +Cc: drew, linux-can, menschel.p, netdev, will

Hi Pavel,

> > > We have the similar CRC read errors but
> > > the lowest byte is not 0x00 and 0x80, it's actually 0x0x or 0x8x, e.g.
> > >
> > > mcp251xfd spi0.0 can0: CRC read error at address 0x0010 (length=4,
> > > data=82 d1 fa 6c, CRC=0xd9c2) retrying.
> > >
> > > 0xb0 0x10 0x04 0x82 0xd1 0xfa 0x6c => 0x59FD (not matching)
> > >
> > > but if I flip the first received bit  (highest bit in the lowest byte):
> > > 0xb0 0x10 0x04 0x02 0xd1 0xfa 0x6c => 0xD9C2 (matching!)
> 
> > What settings do you have on your setup? Can you please print the
> dmesg output from the init? I'm especially interested in Sysclk and SPI
> speed.
> 
> mcp251xfd spi0.0 can0: MCP2517FD rev0.0 (-RX_INT +MAB_NO_WARN
> +CRC_REG +CRC_RX +CRC_TX +ECC -HD c:40.00MHz m:10.00MHz
> r:10.00MHz e:10.00MHz) successfully initialized.

Thanks for the data. I've looked into this and it seems that the second bit being set in your case does not depend on the SPI-Rate (or the quirks for that matter) but it seems to be hardware setup related. 
I'm fine with changing the driver so that it ignores set LSBs but would limit it to 2 or 3 bits:
(buf_rx->data[0] == 0x0 || buf_rx->data[0] == 0x80))
becomes
((buf_rx->data[0] & 0xf8) == 0x0 || (buf_rx->data[0] & 0xf8) == 0x80)) {

The action also needs to be changed and the flip back of the bit needs to be removed. In this case the flipped databit that produces a matching CRC is actually  correct (i.e. consistent with the 7 LSBs in that byte.)
A patch could look like this (I'm currently not close to a setup where I can compile/test this.)

diff --git a/drivers/net/can/spi/mcp251xfd/mcp251xfd-regmap.c b/drivers/net/can/spi/mcp251xfd/mcp251xfd-regmap.c
index 297491516a26..e5bc897f37e8 100644
--- a/drivers/net/can/spi/mcp251xfd/mcp251xfd-regmap.c
+++ b/drivers/net/can/spi/mcp251xfd/mcp251xfd-regmap.c
@@ -332,12 +332,10 @@ mcp251xfd_regmap_crc_read(void *context,
                 *
                 * If the highest bit in the lowest byte is flipped
                 * the transferred CRC matches the calculated one. We
-                * assume for now the CRC calculation in the chip
-                * works on wrong data and the transferred data is
-                * correct.
+                * assume for now the CRC operates on the correct data.
                 */
                if (reg == MCP251XFD_REG_TBC &&
-                   (buf_rx->data[0] == 0x0 || buf_rx->data[0] == 0x80)) {
+                   ((buf_rx->data[0] & 0xF8) == 0x0 || (buf_rx->data[0] & 0xF8) == 0x80)) {
                        /* Flip highest bit in lowest byte of le32 */
                        buf_rx->data[0] ^= 0x80;

@@ -347,10 +345,8 @@ mcp251xfd_regmap_crc_read(void *context,
                                                                  val_len);
                        if (!err) {
                                /* If CRC is now correct, assume
-                                * transferred data was OK, flip bit
-                                * back to original value.
+                                * flipped data was OK.
                                 */
-                               buf_rx->data[0] ^= 0x80;
                                goto out;
                        }
                }

Thanks,
Thomas

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [net-next 6/6] can: mcp251xfd: mcp251xfd_regmap_crc_read(): work around broken CRC on TBC register
  2021-12-21 22:24                       ` Thomas.Kopp
@ 2022-06-21 14:25                         ` Marc Kleine-Budde
  2022-06-22  8:19                           ` Marc Kleine-Budde
  2022-06-22 13:47                           ` Thomas.Kopp
  2022-06-26 19:14                         ` Marc Kleine-Budde
  1 sibling, 2 replies; 33+ messages in thread
From: Marc Kleine-Budde @ 2022-06-21 14:25 UTC (permalink / raw)
  To: Thomas.Kopp; +Cc: pavel.modilaynen, drew, linux-can, menschel.p, netdev, will

[-- Attachment #1: Type: text/plain, Size: 3033 bytes --]

Picking up this old thread....

On 21.12.2021 22:24:52, Thomas.Kopp@microchip.com wrote:
> Thanks for the data. I've looked into this and it seems that the
> second bit being set in your case does not depend on the SPI-Rate (or
> the quirks for that matter) but it seems to be hardware setup related.
> 
> I'm fine with changing the driver so that it ignores set LSBs but
> would limit it to 2 or 3 bits:

> (buf_rx->data[0] == 0x0 || buf_rx->data[0] == 0x80))
> becomes
> ((buf_rx->data[0] & 0xf8) == 0x0 || (buf_rx->data[0] & 0xf8) == 0x80)) {
> 
> The action also needs to be changed and the flip back of the bit needs
> to be removed. In this case the flipped databit that produces a
> matching CRC is actually  correct (i.e. consistent with the 7 LSBs in
> that byte.)
> 
> A patch could look like this (I'm currently not close to a setup where
> I can compile/test this.)

Thomas, can I have your Signed-off-by for this patch?

Marc

> diff --git a/drivers/net/can/spi/mcp251xfd/mcp251xfd-regmap.c b/drivers/net/can/spi/mcp251xfd/mcp251xfd-regmap.c
> index 297491516a26..e5bc897f37e8 100644
> --- a/drivers/net/can/spi/mcp251xfd/mcp251xfd-regmap.c
> +++ b/drivers/net/can/spi/mcp251xfd/mcp251xfd-regmap.c
> @@ -332,12 +332,10 @@ mcp251xfd_regmap_crc_read(void *context,
>                  *
>                  * If the highest bit in the lowest byte is flipped
>                  * the transferred CRC matches the calculated one. We
> -                * assume for now the CRC calculation in the chip
> -                * works on wrong data and the transferred data is
> -                * correct.
> +                * assume for now the CRC operates on the correct data.
>                  */
>                 if (reg == MCP251XFD_REG_TBC &&
> -                   (buf_rx->data[0] == 0x0 || buf_rx->data[0] == 0x80)) {
> +                   ((buf_rx->data[0] & 0xF8) == 0x0 || (buf_rx->data[0] & 0xF8) == 0x80)) {
>                         /* Flip highest bit in lowest byte of le32 */
>                         buf_rx->data[0] ^= 0x80;
> 
> @@ -347,10 +345,8 @@ mcp251xfd_regmap_crc_read(void *context,
>                                                                   val_len);
>                         if (!err) {
>                                 /* If CRC is now correct, assume
> -                                * transferred data was OK, flip bit
> -                                * back to original value.
> +                                * flipped data was OK.
>                                  */
> -                               buf_rx->data[0] ^= 0x80;
>                                 goto out;
>                         }
>                 }
> 
> Thanks,
> Thomas
> 

-- 
Pengutronix e.K.                 | Marc Kleine-Budde           |
Embedded Linux                   | https://www.pengutronix.de  |
Vertretung West/Dortmund         | Phone: +49-231-2826-924     |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [net-next 6/6] can: mcp251xfd: mcp251xfd_regmap_crc_read(): work around broken CRC on TBC register
  2022-06-21 14:25                         ` Marc Kleine-Budde
@ 2022-06-22  8:19                           ` Marc Kleine-Budde
  2022-06-22 13:47                           ` Thomas.Kopp
  1 sibling, 0 replies; 33+ messages in thread
From: Marc Kleine-Budde @ 2022-06-22  8:19 UTC (permalink / raw)
  To: Thomas.Kopp; +Cc: pavel.modilaynen, drew, linux-can, menschel.p, netdev, will

[-- Attachment #1: Type: text/plain, Size: 1451 bytes --]

On 21.06.2022 16:25:15, Marc Kleine-Budde wrote:
> > Thanks for the data. I've looked into this and it seems that the
> > second bit being set in your case does not depend on the SPI-Rate (or
> > the quirks for that matter) but it seems to be hardware setup related.
> > 
> > I'm fine with changing the driver so that it ignores set LSBs but
> > would limit it to 2 or 3 bits:
> > 
> > (buf_rx->data[0] == 0x0 || buf_rx->data[0] == 0x80))
> > becomes
> > ((buf_rx->data[0] & 0xf8) == 0x0 || (buf_rx->data[0] & 0xf8) == 0x80)) {
> > 
> > The action also needs to be changed and the flip back of the bit
> > needs to be removed. In this case the flipped databit that produces
> > a matching CRC is actually correct (i.e. consistent with the 7 LSBs
> > in that byte.)

The mcp2517fd errata says the transmitted data is okay, but the CRC is
calculated on wrong data:

| It is possible that there is a mismatch between the transmitted CRC
| and the actual CRC for the transmitted data when data are updated at a
| specific time during the SPI READ_CRC command. In these cases, the
| transmitted CRC is wrong. The data transmitted are correct.

regards,
Marc

-- 
Pengutronix e.K.                 | Marc Kleine-Budde           |
Embedded Linux                   | https://www.pengutronix.de  |
Vertretung West/Dortmund         | Phone: +49-231-2826-924     |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* RE: [net-next 6/6] can: mcp251xfd: mcp251xfd_regmap_crc_read(): work around broken CRC on TBC register
  2022-06-21 14:25                         ` Marc Kleine-Budde
  2022-06-22  8:19                           ` Marc Kleine-Budde
@ 2022-06-22 13:47                           ` Thomas.Kopp
  1 sibling, 0 replies; 33+ messages in thread
From: Thomas.Kopp @ 2022-06-22 13:47 UTC (permalink / raw)
  To: mkl; +Cc: pavel.modilaynen, drew, linux-can, menschel.p, netdev, will

Hi Marc,

> Picking up this old thread....
> 
> Thomas, can I have your Signed-off-by for this patch?
> 
> Marc

> > diff --git a/drivers/net/can/spi/mcp251xfd/mcp251xfd-regmap.c
> b/drivers/net/can/spi/mcp251xfd/mcp251xfd-regmap.c
> > index 297491516a26..e5bc897f37e8 100644
> > --- a/drivers/net/can/spi/mcp251xfd/mcp251xfd-regmap.c
> > +++ b/drivers/net/can/spi/mcp251xfd/mcp251xfd-regmap.c
> > @@ -332,12 +332,10 @@ mcp251xfd_regmap_crc_read(void *context,
> >                  *
> >                  * If the highest bit in the lowest byte is flipped
> >                  * the transferred CRC matches the calculated one. We
> > -                * assume for now the CRC calculation in the chip
> > -                * works on wrong data and the transferred data is
> > -                * correct.
> > +                * assume for now the CRC operates on the correct data.
> >                  */
> >                 if (reg == MCP251XFD_REG_TBC &&
> > -                   (buf_rx->data[0] == 0x0 || buf_rx->data[0] == 0x80)) {
> > +                   ((buf_rx->data[0] & 0xF8) == 0x0 || (buf_rx->data[0] & 0xF8) ==
> 0x80)) {
> >                         /* Flip highest bit in lowest byte of le32 */
> >                         buf_rx->data[0] ^= 0x80;
> >
> > @@ -347,10 +345,8 @@ mcp251xfd_regmap_crc_read(void *context,
> >                                                                   val_len);
> >                         if (!err) {
> >                                 /* If CRC is now correct, assume
> > -                                * transferred data was OK, flip bit
> > -                                * back to original value.
> > +                                * flipped data was OK.
> >                                  */
> > -                               buf_rx->data[0] ^= 0x80;
> >                                 goto out;
> >                         }
> >                 }
> >

Signed-off-by: Thomas Kopp <thomas.kopp@microchip.com>

> The mcp2517fd errata says the transmitted data is okay, but the CRC is
> calculated on wrong data:

Yes, will be updated.

Best Regards,
Thomas


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [net-next 6/6] can: mcp251xfd: mcp251xfd_regmap_crc_read(): work around broken CRC on TBC register
  2021-12-21 22:24                       ` Thomas.Kopp
  2022-06-21 14:25                         ` Marc Kleine-Budde
@ 2022-06-26 19:14                         ` Marc Kleine-Budde
  1 sibling, 0 replies; 33+ messages in thread
From: Marc Kleine-Budde @ 2022-06-26 19:14 UTC (permalink / raw)
  To: Thomas.Kopp; +Cc: pavel.modilaynen, drew, linux-can, menschel.p, netdev, will

[-- Attachment #1: Type: text/plain, Size: 1511 bytes --]

On 21.12.2021 22:24:52, Thomas.Kopp@microchip.com wrote:
> diff --git a/drivers/net/can/spi/mcp251xfd/mcp251xfd-regmap.c b/drivers/net/can/spi/mcp251xfd/mcp251xfd-regmap.c
> index 297491516a26..e5bc897f37e8 100644
> --- a/drivers/net/can/spi/mcp251xfd/mcp251xfd-regmap.c
> +++ b/drivers/net/can/spi/mcp251xfd/mcp251xfd-regmap.c
> @@ -332,12 +332,10 @@ mcp251xfd_regmap_crc_read(void *context,
>                  *
>                  * If the highest bit in the lowest byte is flipped
>                  * the transferred CRC matches the calculated one. We
> -                * assume for now the CRC calculation in the chip
> -                * works on wrong data and the transferred data is
> -                * correct.
> +                * assume for now the CRC operates on the correct data.
>                  */
>                 if (reg == MCP251XFD_REG_TBC &&
> -                   (buf_rx->data[0] == 0x0 || buf_rx->data[0] == 0x80)) {
> +                   ((buf_rx->data[0] & 0xF8) == 0x0 || (buf_rx->data[0] & 0xF8) == 0x80)) {

With this change the read of the TBC on the mcp2517fd becomes much more
stable. No more single bit flips in the 1st data byte that can be fixed
with xor 0x80.

regards,
Marc

-- 
Pengutronix e.K.                 | Marc Kleine-Budde           |
Embedded Linux                   | https://www.pengutronix.de  |
Vertretung West/Dortmund         | Phone: +49-231-2826-924     |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2022-06-26 19:14 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-07  8:01 pull-request: can-next 2021-04-07 Marc Kleine-Budde
2021-04-07  8:01 ` [net-next 1/6] can: skb: alloc_can{,fd}_skb(): set "cf" to NULL if skb allocation fails Marc Kleine-Budde
2021-04-07  8:01 ` [net-next 2/6] can: m_can: m_can_receive_skb(): add missing error handling to can_rx_offload_queue_sorted() call Marc Kleine-Budde
2021-04-07  8:01 ` [net-next 3/6] can: c_can: remove unused enum BOSCH_C_CAN_PLATFORM Marc Kleine-Budde
2021-04-07  8:01 ` [net-next 4/6] can: mcp251xfd: add BQL support Marc Kleine-Budde
2021-04-07  8:01 ` [net-next 5/6] can: mcp251xfd: mcp251xfd_regmap_crc_read_one(): Factor out crc check into separate function Marc Kleine-Budde
2021-04-07  8:01 ` [net-next 6/6] can: mcp251xfd: mcp251xfd_regmap_crc_read(): work around broken CRC on TBC register Marc Kleine-Budde
2021-04-21 19:58   ` Drew Fustini
2021-04-22  7:18     ` Marc Kleine-Budde
2021-04-22 16:46       ` Patrick Menschel
2021-05-07  7:25       ` Marc Kleine-Budde
2021-05-07  8:21         ` Patrick Menschel
2021-05-07  8:25           ` Marc Kleine-Budde
2021-05-08 18:36             ` Patrick Menschel
2021-05-09  7:46               ` Patrick Menschel
2021-05-10  7:45                 ` Marc Kleine-Budde
2021-05-20 10:29                   ` Patrick Menschel
2021-05-10  7:43               ` Marc Kleine-Budde
2021-12-07 16:53                 ` Modilaynen, Pavel
2021-12-08  8:54                   ` Marc Kleine-Budde
2021-12-09 10:22                   ` Thomas.Kopp
2021-12-09 11:17                     ` AW: " Sven Schuchmann
2021-12-09 11:27                       ` Marc Kleine-Budde
2021-12-09 12:53                         ` AW: " Sven Schuchmann
2021-12-13 22:12                     ` Modilaynen, Pavel
2021-12-21 22:24                       ` Thomas.Kopp
2022-06-21 14:25                         ` Marc Kleine-Budde
2022-06-22  8:19                           ` Marc Kleine-Budde
2022-06-22 13:47                           ` Thomas.Kopp
2022-06-26 19:14                         ` Marc Kleine-Budde
2021-05-07 22:36         ` Drew Fustini
2021-05-08 12:30           ` Marc Kleine-Budde
2021-04-07 22:10 ` pull-request: can-next 2021-04-07 patchwork-bot+netdevbpf

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.