All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/5] can: various fixes to xilinx_can driver
@ 2017-02-13 13:12 Anssi Hannula
  2017-02-13 13:12 ` [PATCH 1/5] can: xilinx_can: fix device dropping off bus on RX overrun Anssi Hannula
                   ` (4 more replies)
  0 siblings, 5 replies; 15+ messages in thread
From: Anssi Hannula @ 2017-02-13 13:12 UTC (permalink / raw)
  To: Marc Kleine-Budde; +Cc: linux-can, Kedareswara rao Appana

Hi,

Here are several bugfixes for the xilinx_can driver found while taking
the driver into use.

I've marked the three more serious ones for stable (issues with RX/TX
FIFOs and RX overrun). Those are also the ones that mostly remove code
instead of adding lines.


Anssi Hannula (5):
      can: xilinx_can: fix device dropping off bus on RX overrun
      can: xilinx_can: fix RX loop if RXNEMP is asserted without RXOK
      can: xilinx_can: fix recovery from error states not being propagated
      can: xilinx_can: only report warning and passive states on state changes
      can: xilinx_can: fix TX FIFO accounting getting out of sync

 xilinx_can.c |  208 ++++++++++++++++++++++++++++++++++++++++-------------------
 1 file changed, 142 insertions(+), 66 deletions(-)


-- 
Anssi Hannula / Bitwise Oy


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 1/5] can: xilinx_can: fix device dropping off bus on RX overrun
  2017-02-13 13:12 [PATCH 0/5] can: various fixes to xilinx_can driver Anssi Hannula
@ 2017-02-13 13:12 ` Anssi Hannula
  2017-02-13 13:12 ` [PATCH 2/5] can: xilinx_can: fix RX loop if RXNEMP is asserted without RXOK Anssi Hannula
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 15+ messages in thread
From: Anssi Hannula @ 2017-02-13 13:12 UTC (permalink / raw)
  To: Marc Kleine-Budde; +Cc: linux-can, Kedareswara rao Appana, stable

The xilinx_can driver performs a software reset when an RX overrun is
detected. This causes the device to enter Configuration mode where no
messages are received or transmitted.

The documentation does not mention any need to perform a reset on an RX
overrun, and testing by inducing an RX overflow also indicated that the
device continues to work just fine without a reset.

Remove the software reset.

Tested with the integrated CAN on Zynq-7000 SoC.

Fixes: b1201e44f50b ("can: xilinx CAN controller support")
Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi>
Cc: <stable@vger.kernel.org>
---
 drivers/net/can/xilinx_can.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/net/can/xilinx_can.c b/drivers/net/can/xilinx_can.c
index c71a035..ddb3287 100644
--- a/drivers/net/can/xilinx_can.c
+++ b/drivers/net/can/xilinx_can.c
@@ -600,7 +600,6 @@ static void xcan_err_interrupt(struct net_device *ndev, u32 isr)
 	if (isr & XCAN_IXR_RXOFLW_MASK) {
 		stats->rx_over_errors++;
 		stats->rx_errors++;
-		priv->write_reg(priv, XCAN_SRR_OFFSET, XCAN_SRR_RESET_MASK);
 		if (skb) {
 			cf->can_id |= CAN_ERR_CRTL;
 			cf->data[1] |= CAN_ERR_CRTL_RX_OVERFLOW;
-- 
2.8.3

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 2/5] can: xilinx_can: fix RX loop if RXNEMP is asserted without RXOK
  2017-02-13 13:12 [PATCH 0/5] can: various fixes to xilinx_can driver Anssi Hannula
  2017-02-13 13:12 ` [PATCH 1/5] can: xilinx_can: fix device dropping off bus on RX overrun Anssi Hannula
@ 2017-02-13 13:12 ` Anssi Hannula
  2017-02-13 13:12 ` [PATCH 3/5] can: xilinx_can: fix recovery from error states not being propagated Anssi Hannula
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 15+ messages in thread
From: Anssi Hannula @ 2017-02-13 13:12 UTC (permalink / raw)
  To: Marc Kleine-Budde; +Cc: linux-can, Kedareswara rao Appana, stable

If the device gets into a state where RXNEMP (RX FIFO not empty)
interrupt is asserted without RXOK (new frame received successfully)
interrupt being asserted, xcan_rx_poll() will continue to try to clear
RXNEMP without actually reading frames from RX FIFO. If the RX FIFO is
not empty, the interrupt will not be cleared and napi_schedule() will
just be called again.

This situation can occur when:

(a) xcan_rx() returns without reading RX FIFO due to an error condition.
The code tries to clear both RXOK and RXNEMP but RXNEMP will not clear
due to a frame still being in the FIFO. The frame will never be read
from the FIFO as RXOK is no longer set.

(b) A frame is received between xcan_rx_poll() reading interrupt status
and clearing RXOK. RXOK will be cleared, but RXNEMP will again remain
set as the new message is still in the FIFO.

I'm able to trigger case (b) by flooding the bus with frames under load.

There does not seem to be any benefit in using both RXNEMP and RXOK in
the way the driver does, and the polling example in the reference manual
(UG585 v1.10 18.3.7 Read Messages from RxFIFO) also says that either
RXOK or RXNEMP can be used for detecting incoming messages.

Fix the issue and simplify the RX processing by only using RXNEMP
without RXOK.

Tested with the integrated CAN on Zynq-7000 SoC.

Fixes: b1201e44f50b ("can: xilinx CAN controller support")
Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi>
Cc: <stable@vger.kernel.org>
---
 drivers/net/can/xilinx_can.c | 18 +++++-------------
 1 file changed, 5 insertions(+), 13 deletions(-)

diff --git a/drivers/net/can/xilinx_can.c b/drivers/net/can/xilinx_can.c
index ddb3287..b23bbfb 100644
--- a/drivers/net/can/xilinx_can.c
+++ b/drivers/net/can/xilinx_can.c
@@ -101,7 +101,7 @@ enum xcan_reg {
 #define XCAN_INTR_ALL		(XCAN_IXR_TXOK_MASK | XCAN_IXR_BSOFF_MASK |\
 				 XCAN_IXR_WKUP_MASK | XCAN_IXR_SLP_MASK | \
 				 XCAN_IXR_RXNEMP_MASK | XCAN_IXR_ERROR_MASK | \
-				 XCAN_IXR_ARBLST_MASK | XCAN_IXR_RXOK_MASK)
+				 XCAN_IXR_ARBLST_MASK)
 
 /* CAN register bit shift - XCAN_<REG>_<BIT>_SHIFT */
 #define XCAN_BTR_SJW_SHIFT		7  /* Synchronous jump width */
@@ -708,15 +708,7 @@ static int xcan_rx_poll(struct napi_struct *napi, int quota)
 
 	isr = priv->read_reg(priv, XCAN_ISR_OFFSET);
 	while ((isr & XCAN_IXR_RXNEMP_MASK) && (work_done < quota)) {
-		if (isr & XCAN_IXR_RXOK_MASK) {
-			priv->write_reg(priv, XCAN_ICR_OFFSET,
-				XCAN_IXR_RXOK_MASK);
-			work_done += xcan_rx(ndev);
-		} else {
-			priv->write_reg(priv, XCAN_ICR_OFFSET,
-				XCAN_IXR_RXNEMP_MASK);
-			break;
-		}
+		work_done += xcan_rx(ndev);
 		priv->write_reg(priv, XCAN_ICR_OFFSET, XCAN_IXR_RXNEMP_MASK);
 		isr = priv->read_reg(priv, XCAN_ISR_OFFSET);
 	}
@@ -727,7 +719,7 @@ static int xcan_rx_poll(struct napi_struct *napi, int quota)
 	if (work_done < quota) {
 		napi_complete(napi);
 		ier = priv->read_reg(priv, XCAN_IER_OFFSET);
-		ier |= (XCAN_IXR_RXOK_MASK | XCAN_IXR_RXNEMP_MASK);
+		ier |= XCAN_IXR_RXNEMP_MASK;
 		priv->write_reg(priv, XCAN_IER_OFFSET, ier);
 	}
 	return work_done;
@@ -799,9 +791,9 @@ static irqreturn_t xcan_interrupt(int irq, void *dev_id)
 	}
 
 	/* Check for the type of receive interrupt and Processing it */
-	if (isr & (XCAN_IXR_RXNEMP_MASK | XCAN_IXR_RXOK_MASK)) {
+	if (isr & XCAN_IXR_RXNEMP_MASK) {
 		ier = priv->read_reg(priv, XCAN_IER_OFFSET);
-		ier &= ~(XCAN_IXR_RXNEMP_MASK | XCAN_IXR_RXOK_MASK);
+		ier &= ~XCAN_IXR_RXNEMP_MASK;
 		priv->write_reg(priv, XCAN_IER_OFFSET, ier);
 		napi_schedule(&priv->napi);
 	}
-- 
2.8.3

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 3/5] can: xilinx_can: fix recovery from error states not being propagated
  2017-02-13 13:12 [PATCH 0/5] can: various fixes to xilinx_can driver Anssi Hannula
  2017-02-13 13:12 ` [PATCH 1/5] can: xilinx_can: fix device dropping off bus on RX overrun Anssi Hannula
  2017-02-13 13:12 ` [PATCH 2/5] can: xilinx_can: fix RX loop if RXNEMP is asserted without RXOK Anssi Hannula
@ 2017-02-13 13:12 ` Anssi Hannula
  2017-02-13 13:12 ` [PATCH 4/5] can: xilinx_can: only report warning and passive states on state changes Anssi Hannula
  2017-02-13 13:12 ` [PATCH 5/5] can: xilinx_can: fix TX FIFO accounting getting out of sync Anssi Hannula
  4 siblings, 0 replies; 15+ messages in thread
From: Anssi Hannula @ 2017-02-13 13:12 UTC (permalink / raw)
  To: Marc Kleine-Budde; +Cc: linux-can, Kedareswara rao Appana

The xilinx_can driver contains no mechanism for propagating recovery
from CAN_STATE_ERROR_WARNING and CAN_STATE_ERROR_PASSIVE.

Add such a mechanism by factoring the handling of
XCAN_STATE_ERROR_PASSIVE and XCAN_STATE_ERROR_WARNING out of
xcan_err_interrupt and checking for recovery after RX and TX if the
interface is in one of those states.

Tested with the integrated CAN on Zynq-7000 SoC.

Fixes: b1201e44f50b ("can: xilinx CAN controller support")
Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi>
---
 drivers/net/can/xilinx_can.c | 155 +++++++++++++++++++++++++++++++++++--------
 1 file changed, 127 insertions(+), 28 deletions(-)

diff --git a/drivers/net/can/xilinx_can.c b/drivers/net/can/xilinx_can.c
index b23bbfb..e02fabf 100644
--- a/drivers/net/can/xilinx_can.c
+++ b/drivers/net/can/xilinx_can.c
@@ -2,6 +2,7 @@
  *
  * Copyright (C) 2012 - 2014 Xilinx, Inc.
  * Copyright (C) 2009 PetaLogix. All rights reserved.
+ * Copyright (C) 2017 Sandvik Mining and Construction Oy
  *
  * Description:
  * This driver is developed for Axi CAN IP and for Zynq CANPS Controller.
@@ -530,6 +531,123 @@ static int xcan_rx(struct net_device *ndev)
 }
 
 /**
+ * xcan_current_error_state - Get current error state from HW
+ * @ndev:	Pointer to net_device structure
+ *
+ * Checks the current CAN error state from the HW. Note that this
+ * only checks for ERROR_PASSIVE and ERROR_WARNING.
+ *
+ * Return:
+ * ERROR_PASSIVE or ERROR_WARNING if either is active, ERROR_ACTIVE
+ * otherwise.
+ */
+static enum can_state xcan_current_error_state(struct net_device *ndev)
+{
+	struct xcan_priv *priv = netdev_priv(ndev);
+	u32 status = priv->read_reg(priv, XCAN_SR_OFFSET);
+
+	if ((status & XCAN_SR_ESTAT_MASK) == XCAN_SR_ESTAT_MASK)
+		return CAN_STATE_ERROR_PASSIVE;
+	else if (status & XCAN_SR_ERRWRN_MASK)
+		return CAN_STATE_ERROR_WARNING;
+	else
+		return CAN_STATE_ERROR_ACTIVE;
+}
+
+/**
+ * xcan_set_error_state - Set new CAN error state
+ * @ndev:	Pointer to net_device structure
+ * @new_state:	The new CAN state to be set
+ * @cf:		Error frame to be populated or NULL
+ *
+ * Set new CAN error state for the device, updating statistics and
+ * populating the error frame if given.
+ */
+static void xcan_set_error_state(struct net_device *ndev,
+				 enum can_state new_state,
+				 struct can_frame *cf)
+{
+	struct xcan_priv *priv = netdev_priv(ndev);
+	u32 ecr = priv->read_reg(priv, XCAN_ECR_OFFSET);
+	u32 txerr = ecr & XCAN_ECR_TEC_MASK;
+	u32 rxerr = (ecr & XCAN_ECR_REC_MASK) >> XCAN_ESR_REC_SHIFT;
+
+	priv->can.state = new_state;
+
+	if (cf) {
+		cf->can_id |= CAN_ERR_CRTL;
+		cf->data[6] = txerr;
+		cf->data[7] = rxerr;
+	}
+
+	switch (new_state) {
+	case CAN_STATE_ERROR_PASSIVE:
+		priv->can.can_stats.error_passive++;
+		if (cf)
+			cf->data[1] = (rxerr > 127) ?
+					CAN_ERR_CRTL_RX_PASSIVE :
+					CAN_ERR_CRTL_TX_PASSIVE;
+		break;
+	case CAN_STATE_ERROR_WARNING:
+		priv->can.can_stats.error_warning++;
+		if (cf)
+			cf->data[1] |= (txerr > rxerr) ?
+					CAN_ERR_CRTL_TX_WARNING :
+					CAN_ERR_CRTL_RX_WARNING;
+		break;
+	case CAN_STATE_ERROR_ACTIVE:
+		if (cf)
+			cf->data[1] |= CAN_ERR_CRTL_ACTIVE;
+		break;
+	default:
+		/* non-ERROR states are handled elsewhere */
+		WARN_ON(1);
+		break;
+	}
+}
+
+/**
+ * xcan_update_error_state_after_rxtx - Update CAN error state after RX/TX
+ * @ndev:	Pointer to net_device structure
+ *
+ * If the device is in a ERROR-WARNING or ERROR-PASSIVE state, check if
+ * the performed RX/TX has caused it to drop to a lesser state and set
+ * the interface state accordingly.
+ */
+static void xcan_update_error_state_after_rxtx(struct net_device *ndev)
+{
+	struct xcan_priv *priv = netdev_priv(ndev);
+	enum can_state old_state = priv->can.state;
+	enum can_state new_state;
+
+	/* changing error state due to successful frame RX/TX can only
+	 * occur from these states
+	 */
+	if (old_state != CAN_STATE_ERROR_WARNING &&
+	    old_state != CAN_STATE_ERROR_PASSIVE)
+		return;
+
+	new_state = xcan_current_error_state(ndev);
+
+	if (new_state != old_state) {
+		struct sk_buff *skb;
+		struct can_frame *cf;
+
+		skb = alloc_can_err_skb(ndev, &cf);
+
+		xcan_set_error_state(ndev, new_state, skb ? cf : NULL);
+
+		if (skb) {
+			struct net_device_stats *stats = &ndev->stats;
+
+			stats->rx_packets++;
+			stats->rx_bytes += cf->can_dlc;
+			netif_rx(skb);
+		}
+	}
+}
+
+/**
  * xcan_err_interrupt - error frame Isr
  * @ndev:	net_device pointer
  * @isr:	interrupt status register value
@@ -544,16 +662,12 @@ static void xcan_err_interrupt(struct net_device *ndev, u32 isr)
 	struct net_device_stats *stats = &ndev->stats;
 	struct can_frame *cf;
 	struct sk_buff *skb;
-	u32 err_status, status, txerr = 0, rxerr = 0;
+	u32 err_status;
 
 	skb = alloc_can_err_skb(ndev, &cf);
 
 	err_status = priv->read_reg(priv, XCAN_ESR_OFFSET);
 	priv->write_reg(priv, XCAN_ESR_OFFSET, err_status);
-	txerr = priv->read_reg(priv, XCAN_ECR_OFFSET) & XCAN_ECR_TEC_MASK;
-	rxerr = ((priv->read_reg(priv, XCAN_ECR_OFFSET) &
-			XCAN_ECR_REC_MASK) >> XCAN_ESR_REC_SHIFT);
-	status = priv->read_reg(priv, XCAN_SR_OFFSET);
 
 	if (isr & XCAN_IXR_BSOFF_MASK) {
 		priv->can.state = CAN_STATE_BUS_OFF;
@@ -563,28 +677,10 @@ static void xcan_err_interrupt(struct net_device *ndev, u32 isr)
 		can_bus_off(ndev);
 		if (skb)
 			cf->can_id |= CAN_ERR_BUSOFF;
-	} else if ((status & XCAN_SR_ESTAT_MASK) == XCAN_SR_ESTAT_MASK) {
-		priv->can.state = CAN_STATE_ERROR_PASSIVE;
-		priv->can.can_stats.error_passive++;
-		if (skb) {
-			cf->can_id |= CAN_ERR_CRTL;
-			cf->data[1] = (rxerr > 127) ?
-					CAN_ERR_CRTL_RX_PASSIVE :
-					CAN_ERR_CRTL_TX_PASSIVE;
-			cf->data[6] = txerr;
-			cf->data[7] = rxerr;
-		}
-	} else if (status & XCAN_SR_ERRWRN_MASK) {
-		priv->can.state = CAN_STATE_ERROR_WARNING;
-		priv->can.can_stats.error_warning++;
-		if (skb) {
-			cf->can_id |= CAN_ERR_CRTL;
-			cf->data[1] |= (txerr > rxerr) ?
-					CAN_ERR_CRTL_TX_WARNING :
-					CAN_ERR_CRTL_RX_WARNING;
-			cf->data[6] = txerr;
-			cf->data[7] = rxerr;
-		}
+	} else {
+		enum can_state new_state = xcan_current_error_state(ndev);
+
+		xcan_set_error_state(ndev, new_state, skb ? cf : NULL);
 	}
 
 	/* Check for Arbitration lost interrupt */
@@ -713,8 +809,10 @@ static int xcan_rx_poll(struct napi_struct *napi, int quota)
 		isr = priv->read_reg(priv, XCAN_ISR_OFFSET);
 	}
 
-	if (work_done)
+	if (work_done) {
 		can_led_event(ndev, CAN_LED_EVENT_RX);
+		xcan_update_error_state_after_rxtx(ndev);
+	}
 
 	if (work_done < quota) {
 		napi_complete(napi);
@@ -745,6 +843,7 @@ static void xcan_tx_interrupt(struct net_device *ndev, u32 isr)
 		isr = priv->read_reg(priv, XCAN_ISR_OFFSET);
 	}
 	can_led_event(ndev, CAN_LED_EVENT_TX);
+	xcan_update_error_state_after_rxtx(ndev);
 	netif_wake_queue(ndev);
 }
 
-- 
2.8.3


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 4/5] can: xilinx_can: only report warning and passive states on state changes
  2017-02-13 13:12 [PATCH 0/5] can: various fixes to xilinx_can driver Anssi Hannula
                   ` (2 preceding siblings ...)
  2017-02-13 13:12 ` [PATCH 3/5] can: xilinx_can: fix recovery from error states not being propagated Anssi Hannula
@ 2017-02-13 13:12 ` Anssi Hannula
  2017-02-13 13:12 ` [PATCH 5/5] can: xilinx_can: fix TX FIFO accounting getting out of sync Anssi Hannula
  4 siblings, 0 replies; 15+ messages in thread
From: Anssi Hannula @ 2017-02-13 13:12 UTC (permalink / raw)
  To: Marc Kleine-Budde; +Cc: linux-can, Kedareswara rao Appana

The xilinx_can driver currently increments error-warning and
error-passive statistics on every error interrupt regardless of whether
the interface was already in the same state. Similarly, the error frame
sent on error interrupts is always sent with
CAN_ERR_CRTL_(RX|TX)_(PASSIVE|WARNING) bit set.

To make the error-warning and error-passive statistics more useful, add
a check to only set the error state when the state has actually been
changed.

Tested with the integrated CAN on Zynq-7000 SoC.

Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi>

---

I'm not 100% sure what the expected behavior here is. Another
alternative option would be to keep the code so that CAN_ERR_CRTL etc
is always set in the error frame (as the error frame is generated
anyway), and just make the statistics increment only on state changes.


 drivers/net/can/xilinx_can.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/can/xilinx_can.c b/drivers/net/can/xilinx_can.c
index e02fabf..392be77 100644
--- a/drivers/net/can/xilinx_can.c
+++ b/drivers/net/can/xilinx_can.c
@@ -680,7 +680,8 @@ static void xcan_err_interrupt(struct net_device *ndev, u32 isr)
 	} else {
 		enum can_state new_state = xcan_current_error_state(ndev);
 
-		xcan_set_error_state(ndev, new_state, skb ? cf : NULL);
+		if (new_state != priv->can.state)
+			xcan_set_error_state(ndev, new_state, skb ? cf : NULL);
 	}
 
 	/* Check for Arbitration lost interrupt */
-- 
2.8.3


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 5/5] can: xilinx_can: fix TX FIFO accounting getting out of sync
  2017-02-13 13:12 [PATCH 0/5] can: various fixes to xilinx_can driver Anssi Hannula
                   ` (3 preceding siblings ...)
  2017-02-13 13:12 ` [PATCH 4/5] can: xilinx_can: only report warning and passive states on state changes Anssi Hannula
@ 2017-02-13 13:12 ` Anssi Hannula
  2017-02-13 14:01   ` Anssi Hannula
  4 siblings, 1 reply; 15+ messages in thread
From: Anssi Hannula @ 2017-02-13 13:12 UTC (permalink / raw)
  To: Marc Kleine-Budde; +Cc: linux-can, Kedareswara rao Appana, stable

The xilinx_can driver assumes that the TXOK interrupt only clears after
it has been acknowledged as many times as there have been successfully
sent frames.

However, the documentation does not mention such behavior, instead
saying just that the interrupt is cleared when the clear bit is set.

Similarly, testing seems to also suggest that it is immediately cleared
regardless of the amount of frames having been sent. Performing some
heavy TX load and then going back to idle has the tx_head drifting
further away from tx_tail over time, steadily reducing the amount of
frames the driver keeps in the TX FIFO (but not to zero, as the TXOK
interrupt always frees up space for 1 frame from the driver's
perspective, so frames continue to be sent) and delaying the local echo
frames.

The TX FIFO tracking is also otherwise buggy as it does not account for
TX FIFO being cleared after software resets, causing
  BUG!, TX FIFO full when queue awake!
messages to be output.

Since there does not seem to be any way to accurately track the state of
the TX FIFO, drop the code trying to do that and perform
netif_stop_queue() based on whether the HW reports the FIFO being full.
Driver-side local echo support is also removed as it is not possible to
do accurately (relying instead on the CAN core fallback support).

Tested with the integrated CAN on Zynq-7000 SoC.

Fixes: b1201e44f50b ("can: xilinx CAN controller support")
Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi>
Cc: <stable@vger.kernel.org>
---
 drivers/net/can/xilinx_can.c | 33 +++++++++------------------------
 1 file changed, 9 insertions(+), 24 deletions(-)

diff --git a/drivers/net/can/xilinx_can.c b/drivers/net/can/xilinx_can.c
index 392be77..7cfb8b6d 100644
--- a/drivers/net/can/xilinx_can.c
+++ b/drivers/net/can/xilinx_can.c
@@ -119,9 +119,6 @@ enum xcan_reg {
 /**
  * struct xcan_priv - This definition define CAN driver instance
  * @can:			CAN private data structure.
- * @tx_head:			Tx CAN packets ready to send on the queue
- * @tx_tail:			Tx CAN packets successfully sended on the queue
- * @tx_max:			Maximum number packets the driver can send
  * @napi:			NAPI structure
  * @read_reg:			For reading data from CAN registers
  * @write_reg:			For writing data to CAN registers
@@ -133,9 +130,6 @@ enum xcan_reg {
  */
 struct xcan_priv {
 	struct can_priv can;
-	unsigned int tx_head;
-	unsigned int tx_tail;
-	unsigned int tx_max;
 	struct napi_struct napi;
 	u32 (*read_reg)(const struct xcan_priv *priv, enum xcan_reg reg);
 	void (*write_reg)(const struct xcan_priv *priv, enum xcan_reg reg,
@@ -439,9 +433,6 @@ static int xcan_start_xmit(struct sk_buff *skb, struct net_device *ndev)
 	if (cf->can_dlc > 4)
 		data[1] = be32_to_cpup((__be32 *)(cf->data + 4));
 
-	can_put_echo_skb(skb, ndev, priv->tx_head % priv->tx_max);
-	priv->tx_head++;
-
 	/* Write the Frame to Xilinx CAN TX FIFO */
 	priv->write_reg(priv, XCAN_TXFIFO_ID_OFFSET, id);
 	/* If the CAN frame is RTR frame this write triggers tranmission */
@@ -453,10 +444,13 @@ static int xcan_start_xmit(struct sk_buff *skb, struct net_device *ndev)
 		 */
 		priv->write_reg(priv, XCAN_TXFIFO_DW2_OFFSET, data[1]);
 		stats->tx_bytes += cf->can_dlc;
+		stats->tx_packets++;
 	}
 
+	dev_kfree_skb_any(skb);
+
 	/* Check if the TX buffer is full */
-	if ((priv->tx_head - priv->tx_tail) == priv->tx_max)
+	if (priv->read_reg(priv, XCAN_SR_OFFSET) & XCAN_SR_TXFLL_MASK)
 		netif_stop_queue(ndev);
 
 	return NETDEV_TX_OK;
@@ -832,17 +826,9 @@ static int xcan_rx_poll(struct napi_struct *napi, int quota)
 static void xcan_tx_interrupt(struct net_device *ndev, u32 isr)
 {
 	struct xcan_priv *priv = netdev_priv(ndev);
-	struct net_device_stats *stats = &ndev->stats;
 
-	while ((priv->tx_head - priv->tx_tail > 0) &&
-			(isr & XCAN_IXR_TXOK_MASK)) {
-		priv->write_reg(priv, XCAN_ICR_OFFSET, XCAN_IXR_TXOK_MASK);
-		can_get_echo_skb(ndev, priv->tx_tail %
-					priv->tx_max);
-		priv->tx_tail++;
-		stats->tx_packets++;
-		isr = priv->read_reg(priv, XCAN_ISR_OFFSET);
-	}
+	priv->write_reg(priv, XCAN_ICR_OFFSET, XCAN_IXR_TXOK_MASK);
+
 	can_led_event(ndev, CAN_LED_EVENT_TX);
 	xcan_update_error_state_after_rxtx(ndev);
 	netif_wake_queue(ndev);
@@ -1177,6 +1163,7 @@ static int xcan_probe(struct platform_device *pdev)
 		goto err;
 	}
 
+	/* tx-fifo-depth is currently not used for anything by the driver */
 	ret = of_property_read_u32(pdev->dev.of_node, "tx-fifo-depth", &tx_max);
 	if (ret < 0)
 		goto err;
@@ -1186,7 +1173,7 @@ static int xcan_probe(struct platform_device *pdev)
 		goto err;
 
 	/* Create a CAN device instance */
-	ndev = alloc_candev(sizeof(struct xcan_priv), tx_max);
+	ndev = alloc_candev(sizeof(struct xcan_priv), 0);
 	if (!ndev)
 		return -ENOMEM;
 
@@ -1198,11 +1185,9 @@ static int xcan_probe(struct platform_device *pdev)
 	priv->can.ctrlmode_supported = CAN_CTRLMODE_LOOPBACK |
 					CAN_CTRLMODE_BERR_REPORTING;
 	priv->reg_base = addr;
-	priv->tx_max = tx_max;
 
 	/* Get IRQ for the device */
 	ndev->irq = platform_get_irq(pdev, 0);
-	ndev->flags |= IFF_ECHO;	/* We support local echo */
 
 	platform_set_drvdata(pdev, ndev);
 	SET_NETDEV_DEV(ndev, &pdev->dev);
@@ -1265,7 +1250,7 @@ static int xcan_probe(struct platform_device *pdev)
 
 	netdev_dbg(ndev, "reg_base=0x%p irq=%d clock=%d, tx fifo depth:%d\n",
 			priv->reg_base, ndev->irq, priv->can.clock.freq,
-			priv->tx_max);
+			tx_max);
 
 	return 0;
 
-- 
2.8.3

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 5/5] can: xilinx_can: fix TX FIFO accounting getting out of sync
  2017-02-13 13:12 ` [PATCH 5/5] can: xilinx_can: fix TX FIFO accounting getting out of sync Anssi Hannula
@ 2017-02-13 14:01   ` Anssi Hannula
  2017-02-13 15:39     ` [PATCH 5/5 v2] " Anssi Hannula
  0 siblings, 1 reply; 15+ messages in thread
From: Anssi Hannula @ 2017-02-13 14:01 UTC (permalink / raw)
  To: Marc Kleine-Budde; +Cc: linux-can, Kedareswara rao Appana, stable

On 13.2.2017 15:12, Anssi Hannula wrote:
> The xilinx_can driver assumes that the TXOK interrupt only clears after
> it has been acknowledged as many times as there have been successfully
> sent frames.
>
> However, the documentation does not mention such behavior, instead
> saying just that the interrupt is cleared when the clear bit is set.
>
> Similarly, testing seems to also suggest that it is immediately cleared
> regardless of the amount of frames having been sent. Performing some
> heavy TX load and then going back to idle has the tx_head drifting
> further away from tx_tail over time, steadily reducing the amount of
> frames the driver keeps in the TX FIFO (but not to zero, as the TXOK
> interrupt always frees up space for 1 frame from the driver's
> perspective, so frames continue to be sent) and delaying the local echo
> frames.
>
> The TX FIFO tracking is also otherwise buggy as it does not account for
> TX FIFO being cleared after software resets, causing
>   BUG!, TX FIFO full when queue awake!
> messages to be output.
>
> Since there does not seem to be any way to accurately track the state of
> the TX FIFO, drop the code trying to do that and perform
> netif_stop_queue() based on whether the HW reports the FIFO being full.
> Driver-side local echo support is also removed as it is not possible to
> do accurately (relying instead on the CAN core fallback support).
>
> Tested with the integrated CAN on Zynq-7000 SoC.
>
> Fixes: b1201e44f50b ("can: xilinx CAN controller support")
> Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi>
> Cc: <stable@vger.kernel.org>
> ---
>  drivers/net/can/xilinx_can.c | 33 +++++++++------------------------
>  1 file changed, 9 insertions(+), 24 deletions(-)
>
> diff --git a/drivers/net/can/xilinx_can.c b/drivers/net/can/xilinx_can.c
> index 392be77..7cfb8b6d 100644
> --- a/drivers/net/can/xilinx_can.c
> +++ b/drivers/net/can/xilinx_can.c
[...]
> @@ -832,17 +826,9 @@ static int xcan_rx_poll(struct napi_struct *napi, int quota)
>  static void xcan_tx_interrupt(struct net_device *ndev, u32 isr)
>  {
>  	struct xcan_priv *priv = netdev_priv(ndev);
> -	struct net_device_stats *stats = &ndev->stats;
>  
> -	while ((priv->tx_head - priv->tx_tail > 0) &&
> -			(isr & XCAN_IXR_TXOK_MASK)) {
> -		priv->write_reg(priv, XCAN_ICR_OFFSET, XCAN_IXR_TXOK_MASK);
> -		can_get_echo_skb(ndev, priv->tx_tail %
> -					priv->tx_max);
> -		priv->tx_tail++;
> -		stats->tx_packets++;
> -		isr = priv->read_reg(priv, XCAN_ISR_OFFSET);
> -	}
> +	priv->write_reg(priv, XCAN_ICR_OFFSET, XCAN_IXR_TXOK_MASK);
> +
>  	can_led_event(ndev, CAN_LED_EVENT_TX);
>  	xcan_update_error_state_after_rxtx(ndev);
>  	netif_wake_queue(ndev);

Hmm, actually I think this netif_wake_queue() is racy with start_xmit()
as the latter
could've filled the TX FIFO at this point...

So maybe some locking is needed here, unless there is another way this
is usually handled.

-- 
Anssi Hannula / Bitwise Oy

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 5/5 v2] can: xilinx_can: fix TX FIFO accounting getting out of sync
  2017-02-13 14:01   ` Anssi Hannula
@ 2017-02-13 15:39     ` Anssi Hannula
  2017-02-20  9:42       ` Marc Kleine-Budde
  0 siblings, 1 reply; 15+ messages in thread
From: Anssi Hannula @ 2017-02-13 15:39 UTC (permalink / raw)
  To: Marc Kleine-Budde; +Cc: linux-can, Kedareswara rao Appana, stable

The xilinx_can driver assumes that the TXOK interrupt only clears after
it has been acknowledged as many times as there have been successfully
sent frames.

However, the documentation does not mention such behavior, instead
saying just that the interrupt is cleared when the clear bit is set.

Similarly, testing seems to also suggest that it is immediately cleared
regardless of the amount of frames having been sent. Performing some
heavy TX load and then going back to idle has the tx_head drifting
further away from tx_tail over time, steadily reducing the amount of
frames the driver keeps in the TX FIFO (but not to zero, as the TXOK
interrupt always frees up space for 1 frame from the driver's
perspective, so frames continue to be sent) and delaying the local echo
frames.

The TX FIFO tracking is also otherwise buggy as it does not account for
TX FIFO being cleared after software resets, causing
  BUG!, TX FIFO full when queue awake!
messages to be output.

Since there does not seem to be any way to accurately track the state of
the TX FIFO, drop the code trying to do that and perform
netif_stop_queue() based on whether the HW reports the FIFO being full.
Driver-side local echo support is also removed as it is not possible to
do accurately (relying instead on the CAN core fallback support).

Tested with the integrated CAN on Zynq-7000 SoC.

v2: Add FIFO space check before TX queue wake with locking to
synchronize with queue stop. This avoids waking the queue when xmit()
had just filled it.

Fixes: b1201e44f50b ("can: xilinx CAN controller support")
Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi>
Cc: <stable@vger.kernel.org>
---
 drivers/net/can/xilinx_can.c | 52 +++++++++++++++++++++++---------------------
 1 file changed, 27 insertions(+), 25 deletions(-)

diff --git a/drivers/net/can/xilinx_can.c b/drivers/net/can/xilinx_can.c
index 392be77..a4f69bc 100644
--- a/drivers/net/can/xilinx_can.c
+++ b/drivers/net/can/xilinx_can.c
@@ -28,6 +28,7 @@
 #include <linux/of.h>
 #include <linux/platform_device.h>
 #include <linux/skbuff.h>
+#include <linux/spinlock.h>
 #include <linux/string.h>
 #include <linux/types.h>
 #include <linux/can/dev.h>
@@ -119,9 +120,7 @@ enum xcan_reg {
 /**
  * struct xcan_priv - This definition define CAN driver instance
  * @can:			CAN private data structure.
- * @tx_head:			Tx CAN packets ready to send on the queue
- * @tx_tail:			Tx CAN packets successfully sended on the queue
- * @tx_max:			Maximum number packets the driver can send
+ * @txfifo_lock:		Lock for synchronizing queue start/stop
  * @napi:			NAPI structure
  * @read_reg:			For reading data from CAN registers
  * @write_reg:			For writing data to CAN registers
@@ -133,9 +132,7 @@ enum xcan_reg {
  */
 struct xcan_priv {
 	struct can_priv can;
-	unsigned int tx_head;
-	unsigned int tx_tail;
-	unsigned int tx_max;
+	spinlock_t txfifo_lock;
 	struct napi_struct napi;
 	u32 (*read_reg)(const struct xcan_priv *priv, enum xcan_reg reg);
 	void (*write_reg)(const struct xcan_priv *priv, enum xcan_reg reg,
@@ -393,6 +390,7 @@ static int xcan_start_xmit(struct sk_buff *skb, struct net_device *ndev)
 	struct net_device_stats *stats = &ndev->stats;
 	struct can_frame *cf = (struct can_frame *)skb->data;
 	u32 id, dlc, data[2] = {0, 0};
+	unsigned long flags;
 
 	if (can_dropped_invalid_skb(ndev, skb))
 		return NETDEV_TX_OK;
@@ -439,9 +437,6 @@ static int xcan_start_xmit(struct sk_buff *skb, struct net_device *ndev)
 	if (cf->can_dlc > 4)
 		data[1] = be32_to_cpup((__be32 *)(cf->data + 4));
 
-	can_put_echo_skb(skb, ndev, priv->tx_head % priv->tx_max);
-	priv->tx_head++;
-
 	/* Write the Frame to Xilinx CAN TX FIFO */
 	priv->write_reg(priv, XCAN_TXFIFO_ID_OFFSET, id);
 	/* If the CAN frame is RTR frame this write triggers tranmission */
@@ -453,12 +448,19 @@ static int xcan_start_xmit(struct sk_buff *skb, struct net_device *ndev)
 		 */
 		priv->write_reg(priv, XCAN_TXFIFO_DW2_OFFSET, data[1]);
 		stats->tx_bytes += cf->can_dlc;
+		stats->tx_packets++;
 	}
 
+	dev_kfree_skb_any(skb);
+
+	spin_lock_irqsave(&priv->txfifo_lock, flags);
+
 	/* Check if the TX buffer is full */
-	if ((priv->tx_head - priv->tx_tail) == priv->tx_max)
+	if (priv->read_reg(priv, XCAN_SR_OFFSET) & XCAN_SR_TXFLL_MASK)
 		netif_stop_queue(ndev);
 
+	spin_unlock_irqrestore(&priv->txfifo_lock, flags);
+
 	return NETDEV_TX_OK;
 }
 
@@ -832,20 +834,20 @@ static int xcan_rx_poll(struct napi_struct *napi, int quota)
 static void xcan_tx_interrupt(struct net_device *ndev, u32 isr)
 {
 	struct xcan_priv *priv = netdev_priv(ndev);
-	struct net_device_stats *stats = &ndev->stats;
+	unsigned long flags;
+
+	priv->write_reg(priv, XCAN_ICR_OFFSET, XCAN_IXR_TXOK_MASK);
 
-	while ((priv->tx_head - priv->tx_tail > 0) &&
-			(isr & XCAN_IXR_TXOK_MASK)) {
-		priv->write_reg(priv, XCAN_ICR_OFFSET, XCAN_IXR_TXOK_MASK);
-		can_get_echo_skb(ndev, priv->tx_tail %
-					priv->tx_max);
-		priv->tx_tail++;
-		stats->tx_packets++;
-		isr = priv->read_reg(priv, XCAN_ISR_OFFSET);
-	}
 	can_led_event(ndev, CAN_LED_EVENT_TX);
 	xcan_update_error_state_after_rxtx(ndev);
-	netif_wake_queue(ndev);
+
+	spin_lock_irqsave(&priv->txfifo_lock, flags);
+
+	if (netif_queue_stopped(ndev) &&
+	    !(priv->read_reg(priv, XCAN_SR_OFFSET) & XCAN_SR_TXFLL_MASK))
+		netif_wake_queue(ndev);
+
+	spin_unlock_irqrestore(&priv->txfifo_lock, flags);
 }
 
 /**
@@ -1177,6 +1179,7 @@ static int xcan_probe(struct platform_device *pdev)
 		goto err;
 	}
 
+	/* tx-fifo-depth is currently not used for anything by the driver */
 	ret = of_property_read_u32(pdev->dev.of_node, "tx-fifo-depth", &tx_max);
 	if (ret < 0)
 		goto err;
@@ -1186,7 +1189,7 @@ static int xcan_probe(struct platform_device *pdev)
 		goto err;
 
 	/* Create a CAN device instance */
-	ndev = alloc_candev(sizeof(struct xcan_priv), tx_max);
+	ndev = alloc_candev(sizeof(struct xcan_priv), 0);
 	if (!ndev)
 		return -ENOMEM;
 
@@ -1198,11 +1201,10 @@ static int xcan_probe(struct platform_device *pdev)
 	priv->can.ctrlmode_supported = CAN_CTRLMODE_LOOPBACK |
 					CAN_CTRLMODE_BERR_REPORTING;
 	priv->reg_base = addr;
-	priv->tx_max = tx_max;
+	spin_lock_init(&priv->txfifo_lock);
 
 	/* Get IRQ for the device */
 	ndev->irq = platform_get_irq(pdev, 0);
-	ndev->flags |= IFF_ECHO;	/* We support local echo */
 
 	platform_set_drvdata(pdev, ndev);
 	SET_NETDEV_DEV(ndev, &pdev->dev);
@@ -1265,7 +1267,7 @@ static int xcan_probe(struct platform_device *pdev)
 
 	netdev_dbg(ndev, "reg_base=0x%p irq=%d clock=%d, tx fifo depth:%d\n",
 			priv->reg_base, ndev->irq, priv->can.clock.freq,
-			priv->tx_max);
+			tx_max);
 
 	return 0;
 
-- 
2.8.3

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 5/5 v2] can: xilinx_can: fix TX FIFO accounting getting out of sync
  2017-02-13 15:39     ` [PATCH 5/5 v2] " Anssi Hannula
@ 2017-02-20  9:42       ` Marc Kleine-Budde
  2017-02-20 10:27         ` Anssi Hannula
  0 siblings, 1 reply; 15+ messages in thread
From: Marc Kleine-Budde @ 2017-02-20  9:42 UTC (permalink / raw)
  To: Anssi Hannula; +Cc: linux-can, Kedareswara rao Appana, stable


[-- Attachment #1.1: Type: text/plain, Size: 2086 bytes --]

On 02/13/2017 04:39 PM, Anssi Hannula wrote:
> The xilinx_can driver assumes that the TXOK interrupt only clears after
> it has been acknowledged as many times as there have been successfully
> sent frames.
> 
> However, the documentation does not mention such behavior, instead
> saying just that the interrupt is cleared when the clear bit is set.
> 
> Similarly, testing seems to also suggest that it is immediately cleared
> regardless of the amount of frames having been sent. Performing some
> heavy TX load and then going back to idle has the tx_head drifting
> further away from tx_tail over time, steadily reducing the amount of
> frames the driver keeps in the TX FIFO (but not to zero, as the TXOK
> interrupt always frees up space for 1 frame from the driver's
> perspective, so frames continue to be sent) and delaying the local echo
> frames.
> 
> The TX FIFO tracking is also otherwise buggy as it does not account for
> TX FIFO being cleared after software resets, causing
>   BUG!, TX FIFO full when queue awake!
> messages to be output.
> 
> Since there does not seem to be any way to accurately track the state of
> the TX FIFO, drop the code trying to do that and perform
> netif_stop_queue() based on whether the HW reports the FIFO being full.
> Driver-side local echo support is also removed as it is not possible to
> do accurately (relying instead on the CAN core fallback support).
> 
> Tested with the integrated CAN on Zynq-7000 SoC.
> 
> v2: Add FIFO space check before TX queue wake with locking to
> synchronize with queue stop. This avoids waking the queue when xmit()
> had just filled it.

Removing the echo on TX-complete should be avoided if possible. Is there
a way to get a indication that an individual package has been send?

Marc

-- 
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 5/5 v2] can: xilinx_can: fix TX FIFO accounting getting out of sync
  2017-02-20  9:42       ` Marc Kleine-Budde
@ 2017-02-20 10:27         ` Anssi Hannula
  2017-02-21 14:57           ` Anssi Hannula
  0 siblings, 1 reply; 15+ messages in thread
From: Anssi Hannula @ 2017-02-20 10:27 UTC (permalink / raw)
  To: Marc Kleine-Budde; +Cc: linux-can, Kedareswara rao Appana, stable

On 20.2.2017 11:42, Marc Kleine-Budde wrote:
> On 02/13/2017 04:39 PM, Anssi Hannula wrote:
>> The xilinx_can driver assumes that the TXOK interrupt only clears after
>> it has been acknowledged as many times as there have been successfully
>> sent frames.
>>
>> However, the documentation does not mention such behavior, instead
>> saying just that the interrupt is cleared when the clear bit is set.
>>
>> Similarly, testing seems to also suggest that it is immediately cleared
>> regardless of the amount of frames having been sent. Performing some
>> heavy TX load and then going back to idle has the tx_head drifting
>> further away from tx_tail over time, steadily reducing the amount of
>> frames the driver keeps in the TX FIFO (but not to zero, as the TXOK
>> interrupt always frees up space for 1 frame from the driver's
>> perspective, so frames continue to be sent) and delaying the local echo
>> frames.
>>
>> The TX FIFO tracking is also otherwise buggy as it does not account for
>> TX FIFO being cleared after software resets, causing
>>   BUG!, TX FIFO full when queue awake!
>> messages to be output.
>>
>> Since there does not seem to be any way to accurately track the state of
>> the TX FIFO, drop the code trying to do that and perform
>> netif_stop_queue() based on whether the HW reports the FIFO being full.
>> Driver-side local echo support is also removed as it is not possible to
>> do accurately (relying instead on the CAN core fallback support).
>>
>> Tested with the integrated CAN on Zynq-7000 SoC.
>>
>> v2: Add FIFO space check before TX queue wake with locking to
>> synchronize with queue stop. This avoids waking the queue when xmit()
>> had just filled it.
> Removing the echo on TX-complete should be avoided if possible. Is there
> a way to get a indication that an individual package has been send?

Looking at the reference manual [1, chapter 18 and appendix B.5] I can't
see any. The TX FIFO interrupts and status registers available are TXOK
(1 or more messages sent), FIFO empty, FIFO full, FIFO under watermark
(watermark not programmable while on-bus). There is no FIFO free space
counter that I can see.

One way would be to only put a single frame into the FIFO at a time, of
course, but at least to me that seems worse than losing echo-on-TX-complete.

[1]
https://www.xilinx.com/support/documentation/user_guides/ug585-Zynq-7000-TRM.pdf


-- 
Anssi Hannula / Bitwise Oy
+358503803997

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 5/5 v2] can: xilinx_can: fix TX FIFO accounting getting out of sync
  2017-02-20 10:27         ` Anssi Hannula
@ 2017-02-21 14:57           ` Anssi Hannula
  2017-02-21 15:14             ` Marc Kleine-Budde
  0 siblings, 1 reply; 15+ messages in thread
From: Anssi Hannula @ 2017-02-21 14:57 UTC (permalink / raw)
  To: Marc Kleine-Budde; +Cc: linux-can, Kedareswara rao Appana, stable

On 20.2.2017 12:27, Anssi Hannula wrote:
> On 20.2.2017 11:42, Marc Kleine-Budde wrote:
>> On 02/13/2017 04:39 PM, Anssi Hannula wrote:
>>> The xilinx_can driver assumes that the TXOK interrupt only clears after
>>> it has been acknowledged as many times as there have been successfully
>>> sent frames.
>>>
>>> However, the documentation does not mention such behavior, instead
>>> saying just that the interrupt is cleared when the clear bit is set.
>>>
>>> Similarly, testing seems to also suggest that it is immediately cleared
>>> regardless of the amount of frames having been sent. Performing some
>>> heavy TX load and then going back to idle has the tx_head drifting
>>> further away from tx_tail over time, steadily reducing the amount of
>>> frames the driver keeps in the TX FIFO (but not to zero, as the TXOK
>>> interrupt always frees up space for 1 frame from the driver's
>>> perspective, so frames continue to be sent) and delaying the local echo
>>> frames.
>>>
>>> The TX FIFO tracking is also otherwise buggy as it does not account for
>>> TX FIFO being cleared after software resets, causing
>>>   BUG!, TX FIFO full when queue awake!
>>> messages to be output.
>>>
>>> Since there does not seem to be any way to accurately track the state of
>>> the TX FIFO, drop the code trying to do that and perform
>>> netif_stop_queue() based on whether the HW reports the FIFO being full.
>>> Driver-side local echo support is also removed as it is not possible to
>>> do accurately (relying instead on the CAN core fallback support).
>>>
>>> Tested with the integrated CAN on Zynq-7000 SoC.
>>>
>>> v2: Add FIFO space check before TX queue wake with locking to
>>> synchronize with queue stop. This avoids waking the queue when xmit()
>>> had just filled it.
>> Removing the echo on TX-complete should be avoided if possible. Is there
>> a way to get a indication that an individual package has been send?
> Looking at the reference manual [1, chapter 18 and appendix B.5] I can't
> see any. The TX FIFO interrupts and status registers available are TXOK
> (1 or more messages sent), FIFO empty, FIFO full, FIFO under watermark
> (watermark not programmable while on-bus). There is no FIFO free space
> counter that I can see.
>
> One way would be to only put a single frame into the FIFO at a time, of
> course, but at least to me that seems worse than losing echo-on-TX-complete.

A correction to my last statement above, we could probably put up to 3
frames in TX FIFO at a time and still keep track of them using the TX
FIFO watermark, allowing echo-on-TX-complete to still be supported, with
something like this:
if (tx fifo empty)
   all frames sent;
else if (tx fifo under watermark 2)
   all except 1 frame sent;
else if (txok)
  1 frame sent; (all except 2)

Though 3 is still much less than 64, the current TX FIFO max size.

I'd myself prefer keeping full FIFO over keeping echo-on-TX-complete,
but not sure what is the general opinion.

> [1]
> https://www.xilinx.com/support/documentation/user_guides/ug585-Zynq-7000-TRM.pdf
>
>

-- 
Anssi Hannula / Bitwise Oy
+358503803997

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 5/5 v2] can: xilinx_can: fix TX FIFO accounting getting out of sync
  2017-02-21 14:57           ` Anssi Hannula
@ 2017-02-21 15:14             ` Marc Kleine-Budde
  2017-02-21 16:07               ` Anssi Hannula
  0 siblings, 1 reply; 15+ messages in thread
From: Marc Kleine-Budde @ 2017-02-21 15:14 UTC (permalink / raw)
  To: Anssi Hannula; +Cc: linux-can, Kedareswara rao Appana, stable


[-- Attachment #1.1: Type: text/plain, Size: 3587 bytes --]

On 02/21/2017 03:57 PM, Anssi Hannula wrote:
> On 20.2.2017 12:27, Anssi Hannula wrote:
>> On 20.2.2017 11:42, Marc Kleine-Budde wrote:
>>> On 02/13/2017 04:39 PM, Anssi Hannula wrote:
>>>> The xilinx_can driver assumes that the TXOK interrupt only clears after
>>>> it has been acknowledged as many times as there have been successfully
>>>> sent frames.
>>>>
>>>> However, the documentation does not mention such behavior, instead
>>>> saying just that the interrupt is cleared when the clear bit is set.
>>>>
>>>> Similarly, testing seems to also suggest that it is immediately cleared
>>>> regardless of the amount of frames having been sent. Performing some
>>>> heavy TX load and then going back to idle has the tx_head drifting
>>>> further away from tx_tail over time, steadily reducing the amount of
>>>> frames the driver keeps in the TX FIFO (but not to zero, as the TXOK
>>>> interrupt always frees up space for 1 frame from the driver's
>>>> perspective, so frames continue to be sent) and delaying the local echo
>>>> frames.
>>>>
>>>> The TX FIFO tracking is also otherwise buggy as it does not account for
>>>> TX FIFO being cleared after software resets, causing
>>>>   BUG!, TX FIFO full when queue awake!
>>>> messages to be output.
>>>>
>>>> Since there does not seem to be any way to accurately track the state of
>>>> the TX FIFO, drop the code trying to do that and perform
>>>> netif_stop_queue() based on whether the HW reports the FIFO being full.
>>>> Driver-side local echo support is also removed as it is not possible to
>>>> do accurately (relying instead on the CAN core fallback support).
>>>>
>>>> Tested with the integrated CAN on Zynq-7000 SoC.
>>>>
>>>> v2: Add FIFO space check before TX queue wake with locking to
>>>> synchronize with queue stop. This avoids waking the queue when xmit()
>>>> had just filled it.
>>> Removing the echo on TX-complete should be avoided if possible. Is there
>>> a way to get a indication that an individual package has been send?
>> Looking at the reference manual [1, chapter 18 and appendix B.5] I can't
>> see any. The TX FIFO interrupts and status registers available are TXOK
>> (1 or more messages sent), FIFO empty, FIFO full, FIFO under watermark
>> (watermark not programmable while on-bus). There is no FIFO free space
>> counter that I can see.
>>
>> One way would be to only put a single frame into the FIFO at a time, of
>> course, but at least to me that seems worse than losing echo-on-TX-complete.
> 
> A correction to my last statement above, we could probably put up to 3
> frames in TX FIFO at a time and still keep track of them using the TX
> FIFO watermark, allowing echo-on-TX-complete to still be supported, with
> something like this:
> if (tx fifo empty)
>    all frames sent;
> else if (tx fifo under watermark 2)
>    all except 1 frame sent;
> else if (txok)
>   1 frame sent; (all except 2)
> 
> Though 3 is still much less than 64, the current TX FIFO max size.

Sounds good, can you work on a patch?

> I'd myself prefer keeping full FIFO over keeping echo-on-TX-complete,
> but not sure what is the general opinion.

Big FIFOs have the drawback of bigger latencies, as you cannot reorder
high priority packages inside the kernel.

Marc

-- 
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 5/5 v2] can: xilinx_can: fix TX FIFO accounting getting out of sync
  2017-02-21 15:14             ` Marc Kleine-Budde
@ 2017-02-21 16:07               ` Anssi Hannula
  2017-02-22  9:11                 ` Marc Kleine-Budde
  0 siblings, 1 reply; 15+ messages in thread
From: Anssi Hannula @ 2017-02-21 16:07 UTC (permalink / raw)
  To: Marc Kleine-Budde; +Cc: linux-can, Kedareswara rao Appana, stable

On 21.2.2017 17:14, Marc Kleine-Budde wrote:
> On 02/21/2017 03:57 PM, Anssi Hannula wrote:
>> On 20.2.2017 12:27, Anssi Hannula wrote:
>>> On 20.2.2017 11:42, Marc Kleine-Budde wrote:
>>>> On 02/13/2017 04:39 PM, Anssi Hannula wrote:
>>>>> The xilinx_can driver assumes that the TXOK interrupt only clears after
>>>>> it has been acknowledged as many times as there have been successfully
>>>>> sent frames.
>>>>>
>>>>> However, the documentation does not mention such behavior, instead
>>>>> saying just that the interrupt is cleared when the clear bit is set.
>>>>>
>>>>> Similarly, testing seems to also suggest that it is immediately cleared
>>>>> regardless of the amount of frames having been sent. Performing some
>>>>> heavy TX load and then going back to idle has the tx_head drifting
>>>>> further away from tx_tail over time, steadily reducing the amount of
>>>>> frames the driver keeps in the TX FIFO (but not to zero, as the TXOK
>>>>> interrupt always frees up space for 1 frame from the driver's
>>>>> perspective, so frames continue to be sent) and delaying the local echo
>>>>> frames.
>>>>>
>>>>> The TX FIFO tracking is also otherwise buggy as it does not account for
>>>>> TX FIFO being cleared after software resets, causing
>>>>>   BUG!, TX FIFO full when queue awake!
>>>>> messages to be output.
>>>>>
>>>>> Since there does not seem to be any way to accurately track the state of
>>>>> the TX FIFO, drop the code trying to do that and perform
>>>>> netif_stop_queue() based on whether the HW reports the FIFO being full.
>>>>> Driver-side local echo support is also removed as it is not possible to
>>>>> do accurately (relying instead on the CAN core fallback support).
>>>>>
>>>>> Tested with the integrated CAN on Zynq-7000 SoC.
>>>>>
>>>>> v2: Add FIFO space check before TX queue wake with locking to
>>>>> synchronize with queue stop. This avoids waking the queue when xmit()
>>>>> had just filled it.
>>>> Removing the echo on TX-complete should be avoided if possible. Is there
>>>> a way to get a indication that an individual package has been send?
>>> Looking at the reference manual [1, chapter 18 and appendix B.5] I can't
>>> see any. The TX FIFO interrupts and status registers available are TXOK
>>> (1 or more messages sent), FIFO empty, FIFO full, FIFO under watermark
>>> (watermark not programmable while on-bus). There is no FIFO free space
>>> counter that I can see.
>>>
>>> One way would be to only put a single frame into the FIFO at a time, of
>>> course, but at least to me that seems worse than losing echo-on-TX-complete.
>> A correction to my last statement above, we could probably put up to 3
>> frames in TX FIFO at a time and still keep track of them using the TX
>> FIFO watermark, allowing echo-on-TX-complete to still be supported, with
>> something like this:
>> if (tx fifo empty)
>>    all frames sent;
>> else if (tx fifo under watermark 2)
>>    all except 1 frame sent;
>> else if (txok)
>>   1 frame sent; (all except 2)
>>
>> Though 3 is still much less than 64, the current TX FIFO max size.
> Sounds good, can you work on a patch?

Unfortunately I've just noticed that the soft-IP version of the
controller (compatible-string "xlnx,axi-can-1.00.a" instead of
"xlnx,zynq-can-1.0" found on Zynq SoCs) does not have the TX FIFO empty
bit nor the watermark programming feature:
https://www.xilinx.com/support/documentation/ip_documentation/can/v5_0/pg096-can.pdf

So the above idea for tracking more than 1 frame in FIFO would not work
on those.

What do you think should be done here?

>> I'd myself prefer keeping full FIFO over keeping echo-on-TX-complete,
>> but not sure what is the general opinion.
> Big FIFOs have the drawback of bigger latencies, as you cannot reorder
> high priority packages inside the kernel.
>
> Marc
>

-- 
Anssi Hannula / Bitwise Oy
+358503803997

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 5/5 v2] can: xilinx_can: fix TX FIFO accounting getting out of sync
  2017-02-21 16:07               ` Anssi Hannula
@ 2017-02-22  9:11                 ` Marc Kleine-Budde
  2017-03-03 15:42                   ` [PATCH 5/5 v3] can: xilinx_can: keep only 1-2 frames in TX FIFO to fix TX accounting Anssi Hannula
  0 siblings, 1 reply; 15+ messages in thread
From: Marc Kleine-Budde @ 2017-02-22  9:11 UTC (permalink / raw)
  To: Anssi Hannula; +Cc: linux-can, Kedareswara rao Appana, stable


[-- Attachment #1.1: Type: text/plain, Size: 4614 bytes --]

On 02/21/2017 05:07 PM, Anssi Hannula wrote:
> On 21.2.2017 17:14, Marc Kleine-Budde wrote:
>> On 02/21/2017 03:57 PM, Anssi Hannula wrote:
>>> On 20.2.2017 12:27, Anssi Hannula wrote:
>>>> On 20.2.2017 11:42, Marc Kleine-Budde wrote:
>>>>> On 02/13/2017 04:39 PM, Anssi Hannula wrote:
>>>>>> The xilinx_can driver assumes that the TXOK interrupt only clears after
>>>>>> it has been acknowledged as many times as there have been successfully
>>>>>> sent frames.
>>>>>>
>>>>>> However, the documentation does not mention such behavior, instead
>>>>>> saying just that the interrupt is cleared when the clear bit is set.
>>>>>>
>>>>>> Similarly, testing seems to also suggest that it is immediately cleared
>>>>>> regardless of the amount of frames having been sent. Performing some
>>>>>> heavy TX load and then going back to idle has the tx_head drifting
>>>>>> further away from tx_tail over time, steadily reducing the amount of
>>>>>> frames the driver keeps in the TX FIFO (but not to zero, as the TXOK
>>>>>> interrupt always frees up space for 1 frame from the driver's
>>>>>> perspective, so frames continue to be sent) and delaying the local echo
>>>>>> frames.
>>>>>>
>>>>>> The TX FIFO tracking is also otherwise buggy as it does not account for
>>>>>> TX FIFO being cleared after software resets, causing
>>>>>>   BUG!, TX FIFO full when queue awake!
>>>>>> messages to be output.
>>>>>>
>>>>>> Since there does not seem to be any way to accurately track the state of
>>>>>> the TX FIFO, drop the code trying to do that and perform
>>>>>> netif_stop_queue() based on whether the HW reports the FIFO being full.
>>>>>> Driver-side local echo support is also removed as it is not possible to
>>>>>> do accurately (relying instead on the CAN core fallback support).
>>>>>>
>>>>>> Tested with the integrated CAN on Zynq-7000 SoC.
>>>>>>
>>>>>> v2: Add FIFO space check before TX queue wake with locking to
>>>>>> synchronize with queue stop. This avoids waking the queue when xmit()
>>>>>> had just filled it.
>>>>> Removing the echo on TX-complete should be avoided if possible. Is there
>>>>> a way to get a indication that an individual package has been send?
>>>> Looking at the reference manual [1, chapter 18 and appendix B.5] I can't
>>>> see any. The TX FIFO interrupts and status registers available are TXOK
>>>> (1 or more messages sent), FIFO empty, FIFO full, FIFO under watermark
>>>> (watermark not programmable while on-bus). There is no FIFO free space
>>>> counter that I can see.
>>>>
>>>> One way would be to only put a single frame into the FIFO at a time, of
>>>> course, but at least to me that seems worse than losing echo-on-TX-complete.
>>> A correction to my last statement above, we could probably put up to 3
>>> frames in TX FIFO at a time and still keep track of them using the TX
>>> FIFO watermark, allowing echo-on-TX-complete to still be supported, with
>>> something like this:
>>> if (tx fifo empty)
>>>    all frames sent;
>>> else if (tx fifo under watermark 2)
>>>    all except 1 frame sent;
>>> else if (txok)
>>>   1 frame sent; (all except 2)
>>>
>>> Though 3 is still much less than 64, the current TX FIFO max size.
>> Sounds good, can you work on a patch?
> 
> Unfortunately I've just noticed that the soft-IP version of the
> controller (compatible-string "xlnx,axi-can-1.00.a" instead of
> "xlnx,zynq-can-1.0" found on Zynq SoCs) does not have the TX FIFO empty
> bit nor the watermark programming feature:
> https://www.xilinx.com/support/documentation/ip_documentation/can/v5_0/pg096-can.pdf
> 
> So the above idea for tracking more than 1 frame in FIFO would not work
> on those.
> 
> What do you think should be done here?

Depending on the compatible string we know that max usable FIFO size.
You can upgrade the driver to cope with this.

The xmit function doesn't change at all:

> 	/* Check if the TX buffer is full */
> 	if ((priv->tx_head - priv->tx_tail) == priv->tx_max)
> 		netif_stop_queue(ndev);

Just the calculation of priv->tx_max changes depending on the compatible.

The loop in xcan_tx_interrupt() get more complicated.

If this works, we can discuss on a property (maybe DT) to switch the
driver into "non"-IFF_ECHO mode, as you proposed in your original patch.

Marc

-- 
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 5/5 v3] can: xilinx_can: keep only 1-2 frames in TX FIFO to fix TX accounting
  2017-02-22  9:11                 ` Marc Kleine-Budde
@ 2017-03-03 15:42                   ` Anssi Hannula
  0 siblings, 0 replies; 15+ messages in thread
From: Anssi Hannula @ 2017-03-03 15:42 UTC (permalink / raw)
  To: Marc Kleine-Budde; +Cc: Kedareswara rao Appana, linux-can

The xilinx_can driver assumes that the TXOK interrupt only clears after
it has been acknowledged as many times as there have been successfully
sent frames.

However, the documentation does not mention such behavior, instead
saying just that the interrupt is cleared when the clear bit is set.

Similarly, testing seems to also suggest that it is immediately cleared
regardless of the amount of frames having been sent. Performing some
heavy TX load and then going back to idle has the tx_head drifting
further away from tx_tail over time, steadily reducing the amount of
frames the driver keeps in the TX FIFO (but not to zero, as the TXOK
interrupt always frees up space for 1 frame from the driver's
perspective, so frames continue to be sent) and delaying the local echo
frames.

The TX FIFO tracking is also otherwise buggy as it does not account for
TX FIFO being cleared after software resets, causing
  BUG!, TX FIFO full when queue awake!
messages to be output.

There does not seem to be any way to accurately track the state of the
TX FIFO for local echo support while using the full TX FIFO.

The Zynq version of the HW (but not the soft-AXI version) has watermark
programming support and with it an additional TX-FIFO-empty interrupt
bit.

Modify the driver to only put 1 frame into TX FIFO at a time on soft-AXI
and 2 frames at a time on Zynq. On Zynq the TXFEMP interrupt bit is used
to detect whether 1 or 2 frames have been sent at interrupt processing
time.

Tested with the integrated CAN on Zynq-7000 SoC. The 1-frame-FIFO mode
was also tested.

v2: Add FIFO space check before TX queue wake with locking to
synchronize with queue stop. This avoids waking the queue when xmit()
had just filled it.

v3: Keep local echo support and reduce the amount of frames in FIFO
instead.

Fixes: b1201e44f50b ("can: xilinx CAN controller support")
Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi>
---

It wasn't quite so simple after all. Unfortunately using TXFWMEMP also
did not seem to work, so this patch is limited to only 2 frames in FIFO
on Zynq and 1 frame on soft-AXI.

(I won't be reachable during next week)

 drivers/net/can/xilinx_can.c | 139 ++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 123 insertions(+), 16 deletions(-)

diff --git a/drivers/net/can/xilinx_can.c b/drivers/net/can/xilinx_can.c
index 392be77..a6b7c21 100644
--- a/drivers/net/can/xilinx_can.c
+++ b/drivers/net/can/xilinx_can.c
@@ -26,8 +26,10 @@
 #include <linux/module.h>
 #include <linux/netdevice.h>
 #include <linux/of.h>
+#include <linux/of_device.h>
 #include <linux/platform_device.h>
 #include <linux/skbuff.h>
+#include <linux/spinlock.h>
 #include <linux/string.h>
 #include <linux/types.h>
 #include <linux/can/dev.h>
@@ -119,6 +121,7 @@ enum xcan_reg {
 /**
  * struct xcan_priv - This definition define CAN driver instance
  * @can:			CAN private data structure.
+ * @tx_lock:			Lock for synchronizing TX interrupt handling
  * @tx_head:			Tx CAN packets ready to send on the queue
  * @tx_tail:			Tx CAN packets successfully sended on the queue
  * @tx_max:			Maximum number packets the driver can send
@@ -133,6 +136,7 @@ enum xcan_reg {
  */
 struct xcan_priv {
 	struct can_priv can;
+	spinlock_t tx_lock;
 	unsigned int tx_head;
 	unsigned int tx_tail;
 	unsigned int tx_max;
@@ -160,6 +164,11 @@ static const struct can_bittiming_const xcan_bittiming_const = {
 	.brp_inc = 1,
 };
 
+#define XCAN_CAP_WATERMARK	0x0001
+struct xcan_devtype_data {
+	unsigned int caps;
+};
+
 /**
  * xcan_write_reg_le - Write a value to the device register little endian
  * @priv:	Driver private data structure
@@ -239,6 +248,10 @@ static int set_reset_mode(struct net_device *ndev)
 		usleep_range(500, 10000);
 	}
 
+	/* reset clears FIFOs */
+	priv->tx_head = 0;
+	priv->tx_tail = 0;
+
 	return 0;
 }
 
@@ -393,6 +406,7 @@ static int xcan_start_xmit(struct sk_buff *skb, struct net_device *ndev)
 	struct net_device_stats *stats = &ndev->stats;
 	struct can_frame *cf = (struct can_frame *)skb->data;
 	u32 id, dlc, data[2] = {0, 0};
+	unsigned long flags;
 
 	if (can_dropped_invalid_skb(ndev, skb))
 		return NETDEV_TX_OK;
@@ -440,6 +454,9 @@ static int xcan_start_xmit(struct sk_buff *skb, struct net_device *ndev)
 		data[1] = be32_to_cpup((__be32 *)(cf->data + 4));
 
 	can_put_echo_skb(skb, ndev, priv->tx_head % priv->tx_max);
+
+	spin_lock_irqsave(&priv->tx_lock, flags);
+
 	priv->tx_head++;
 
 	/* Write the Frame to Xilinx CAN TX FIFO */
@@ -455,10 +472,16 @@ static int xcan_start_xmit(struct sk_buff *skb, struct net_device *ndev)
 		stats->tx_bytes += cf->can_dlc;
 	}
 
+	/* Clear TX-FIFO-empty interrupt for xcan_tx_interrupt() */
+	if (priv->tx_max > 1)
+		priv->write_reg(priv, XCAN_ICR_OFFSET, XCAN_IXR_TXFEMP_MASK);
+
 	/* Check if the TX buffer is full */
 	if ((priv->tx_head - priv->tx_tail) == priv->tx_max)
 		netif_stop_queue(ndev);
 
+	spin_unlock_irqrestore(&priv->tx_lock, flags);
+
 	return NETDEV_TX_OK;
 }
 
@@ -833,19 +856,71 @@ static void xcan_tx_interrupt(struct net_device *ndev, u32 isr)
 {
 	struct xcan_priv *priv = netdev_priv(ndev);
 	struct net_device_stats *stats = &ndev->stats;
+	unsigned int frames_in_fifo;
+	int frames_sent = 1; /* TXOK => at least 1 frame was sent */
+	unsigned long flags;
+	int retries = 0;
+
+	/* Synchronize with xmit as we need to know the exact number
+	 * of frames in the FIFO to stay in sync due to the TXFEMP
+	 * handling.
+	 * This also prevents a race between netif_wake_queue() and
+	 * netif_stop_queue().
+	 */
+	spin_lock_irqsave(&priv->tx_lock, flags);
+
+	frames_in_fifo = priv->tx_head - priv->tx_tail;
 
-	while ((priv->tx_head - priv->tx_tail > 0) &&
-			(isr & XCAN_IXR_TXOK_MASK)) {
+	if (WARN_ON_ONCE(frames_in_fifo == 0)) {
+		/* clear TXOK anyway to avoid getting back here */
 		priv->write_reg(priv, XCAN_ICR_OFFSET, XCAN_IXR_TXOK_MASK);
+		spin_unlock_irqrestore(&priv->tx_lock, flags);
+		return;
+	}
+
+	/* Check if 2 frames were sent (TXOK only means that at least 1
+	 * frame was sent).
+	 */
+	if (frames_in_fifo > 1) {
+		WARN_ON(frames_in_fifo > priv->tx_max);
+
+		/* Synchronize TXOK and isr so that after the loop:
+		 * (1) isr variable is up-to-date at least up to TXOK clear
+		 *     time. This avoids us clearing a TXOK of a second frame
+		 *     but not noticing that the FIFO is now empty and thus
+		 *     marking only a single frame as sent.
+		 * (2) No TXOK is left. Having one could mean leaving a
+		 *     stray TXOK as we might process the associated frame
+		 *     via TXFEMP handling as we read TXFEMP *after* TXOK
+		 *     clear to satisfy (1).
+		 */
+		while ((isr & XCAN_IXR_TXOK_MASK) && !WARN_ON(++retries == 100)) {
+			priv->write_reg(priv, XCAN_ICR_OFFSET, XCAN_IXR_TXOK_MASK);
+			isr = priv->read_reg(priv, XCAN_ISR_OFFSET);
+		}
+
+		if (isr & XCAN_IXR_TXFEMP_MASK) {
+			/* nothing in FIFO anymore */
+			frames_sent = frames_in_fifo;
+		}
+	} else {
+		/* single frame in fifo, just clear TXOK */
+		priv->write_reg(priv, XCAN_ICR_OFFSET, XCAN_IXR_TXOK_MASK);
+	}
+
+	while (frames_sent--) {
 		can_get_echo_skb(ndev, priv->tx_tail %
 					priv->tx_max);
 		priv->tx_tail++;
 		stats->tx_packets++;
-		isr = priv->read_reg(priv, XCAN_ISR_OFFSET);
 	}
+
+	netif_wake_queue(ndev);
+
+	spin_unlock_irqrestore(&priv->tx_lock, flags);
+
 	can_led_event(ndev, CAN_LED_EVENT_TX);
 	xcan_update_error_state_after_rxtx(ndev);
-	netif_wake_queue(ndev);
 }
 
 /**
@@ -1152,6 +1227,18 @@ static const struct dev_pm_ops xcan_dev_pm_ops = {
 	SET_RUNTIME_PM_OPS(xcan_runtime_suspend, xcan_runtime_resume, NULL)
 };
 
+static const struct xcan_devtype_data xcan_zynq_data = {
+	.caps = XCAN_CAP_WATERMARK,
+};
+
+/* Match table for OF platform binding */
+static const struct of_device_id xcan_of_match[] = {
+	{ .compatible = "xlnx,zynq-can-1.0", .data = &xcan_zynq_data },
+	{ .compatible = "xlnx,axi-can-1.00.a", },
+	{ /* end of list */ },
+};
+MODULE_DEVICE_TABLE(of, xcan_of_match);
+
 /**
  * xcan_probe - Platform registration call
  * @pdev:	Handle to the platform device structure
@@ -1166,8 +1253,10 @@ static int xcan_probe(struct platform_device *pdev)
 	struct resource *res; /* IO mem resources */
 	struct net_device *ndev;
 	struct xcan_priv *priv;
+	const struct of_device_id *of_id;
+	int caps = 0;
 	void __iomem *addr;
-	int ret, rx_max, tx_max;
+	int ret, rx_max, tx_max, tx_fifo_depth;
 
 	/* Get the virtual base address for the device */
 	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
@@ -1177,7 +1266,8 @@ static int xcan_probe(struct platform_device *pdev)
 		goto err;
 	}
 
-	ret = of_property_read_u32(pdev->dev.of_node, "tx-fifo-depth", &tx_max);
+	ret = of_property_read_u32(pdev->dev.of_node, "tx-fifo-depth",
+				   &tx_fifo_depth);
 	if (ret < 0)
 		goto err;
 
@@ -1185,6 +1275,30 @@ static int xcan_probe(struct platform_device *pdev)
 	if (ret < 0)
 		goto err;
 
+	of_id = of_match_device(xcan_of_match, &pdev->dev);
+	if (of_id) {
+		const struct xcan_devtype_data *devtype_data = of_id->data;
+
+		if (devtype_data)
+			caps = devtype_data->caps;
+	}
+
+	/* There is no way to directly figure out how many frames have been
+	 * sent when the TXOK interrupt is processed. If watermark programming
+	 * is supported, we can have 2 frames in the FIFO and use TXFEMP
+	 * to determine if 1 or 2 frames have been sent.
+	 * Theoretically we should be able to use TXFWMEMP to determine up
+	 * to 3 frames, but it seems that after putting a second frame in the
+	 * FIFO, with watermark at 2 frames, it can happen that TXFWMEMP (less
+	 * than 2 frames in FIFO) is set anyway with no TXOK (a frame was
+	 * sent), which is not a sensible state - possibly TXFWMEMP is not
+	 * completely synchronized with the rest of the bits?
+	 */
+	if (caps & XCAN_CAP_WATERMARK)
+		tx_max = min(tx_fifo_depth, 2);
+	else
+		tx_max = 1;
+
 	/* Create a CAN device instance */
 	ndev = alloc_candev(sizeof(struct xcan_priv), tx_max);
 	if (!ndev)
@@ -1199,6 +1313,7 @@ static int xcan_probe(struct platform_device *pdev)
 					CAN_CTRLMODE_BERR_REPORTING;
 	priv->reg_base = addr;
 	priv->tx_max = tx_max;
+	spin_lock_init(&priv->tx_lock);
 
 	/* Get IRQ for the device */
 	ndev->irq = platform_get_irq(pdev, 0);
@@ -1263,9 +1378,9 @@ static int xcan_probe(struct platform_device *pdev)
 
 	pm_runtime_put(&pdev->dev);
 
-	netdev_dbg(ndev, "reg_base=0x%p irq=%d clock=%d, tx fifo depth:%d\n",
+	netdev_dbg(ndev, "reg_base=0x%p irq=%d clock=%d, tx fifo depth: actual %d, using %d\n",
 			priv->reg_base, ndev->irq, priv->can.clock.freq,
-			priv->tx_max);
+			tx_fifo_depth, priv->tx_max);
 
 	return 0;
 
@@ -1299,14 +1414,6 @@ static int xcan_remove(struct platform_device *pdev)
 	return 0;
 }
 
-/* Match table for OF platform binding */
-static const struct of_device_id xcan_of_match[] = {
-	{ .compatible = "xlnx,zynq-can-1.0", },
-	{ .compatible = "xlnx,axi-can-1.00.a", },
-	{ /* end of list */ },
-};
-MODULE_DEVICE_TABLE(of, xcan_of_match);
-
 static struct platform_driver xcan_driver = {
 	.probe = xcan_probe,
 	.remove	= xcan_remove,
-- 
2.8.3


^ permalink raw reply related	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2017-03-03 15:51 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-02-13 13:12 [PATCH 0/5] can: various fixes to xilinx_can driver Anssi Hannula
2017-02-13 13:12 ` [PATCH 1/5] can: xilinx_can: fix device dropping off bus on RX overrun Anssi Hannula
2017-02-13 13:12 ` [PATCH 2/5] can: xilinx_can: fix RX loop if RXNEMP is asserted without RXOK Anssi Hannula
2017-02-13 13:12 ` [PATCH 3/5] can: xilinx_can: fix recovery from error states not being propagated Anssi Hannula
2017-02-13 13:12 ` [PATCH 4/5] can: xilinx_can: only report warning and passive states on state changes Anssi Hannula
2017-02-13 13:12 ` [PATCH 5/5] can: xilinx_can: fix TX FIFO accounting getting out of sync Anssi Hannula
2017-02-13 14:01   ` Anssi Hannula
2017-02-13 15:39     ` [PATCH 5/5 v2] " Anssi Hannula
2017-02-20  9:42       ` Marc Kleine-Budde
2017-02-20 10:27         ` Anssi Hannula
2017-02-21 14:57           ` Anssi Hannula
2017-02-21 15:14             ` Marc Kleine-Budde
2017-02-21 16:07               ` Anssi Hannula
2017-02-22  9:11                 ` Marc Kleine-Budde
2017-03-03 15:42                   ` [PATCH 5/5 v3] can: xilinx_can: keep only 1-2 frames in TX FIFO to fix TX accounting Anssi Hannula

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.