All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 0/4] Various nb8800 tweaks
@ 2017-11-14 10:53 ` Marc Gonzalez
  0 siblings, 0 replies; 70+ messages in thread
From: Marc Gonzalez @ 2017-11-14 10:53 UTC (permalink / raw)
  To: David Miller, Mans Rullgard
  Cc: netdev, Linux ARM, Florian Fainelli, Thibaud Cornic, Mason

Hello,

The main feature in this series is suspend/resume support (patch 4).
It is implemented by closing the device on suspend, and reopening it
on resume. To achieve this, HW inits are moved to ndo_open (patch 3).

After some back and forth with the HW dev, it turns out that the "raw" IP
does not support disabling DMA in software, so I want to get rid of the
nb8800_dma_stop() hack. Since there is no standard way to disable DMA,
I dropped generic support (patch 1).

The HW dev has also stated that it is safe to tweak the flow control bit
in the NB8800_RXC_CR register at any time (patch 2) i.e. no need for
special handling when RX DMA is enabled, despite the documentation stating:

> The Receive Channel Control Register holds channel enable, mode, endian,
> DMA control, and interrupt control information.  This register can only be
> written when the Receive DMA Channel is idle - the Enable bit in it is "0".

Marc Gonzalez (4):
  net: nb8800: Drop generic support
  net: nb8800: Simplify nb8800_pause_config()
  net: nb8800: Move HW init to ndo_open()
  net: nb8800: Add support for suspend/resume

 drivers/net/ethernet/aurora/nb8800.c | 189 +++++++++++------------------------
 drivers/net/ethernet/aurora/nb8800.h |   4 +-
 2 files changed, 63 insertions(+), 130 deletions(-)

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH v3 0/4] Various nb8800 tweaks
@ 2017-11-14 10:53 ` Marc Gonzalez
  0 siblings, 0 replies; 70+ messages in thread
From: Marc Gonzalez @ 2017-11-14 10:53 UTC (permalink / raw)
  To: linux-arm-kernel

Hello,

The main feature in this series is suspend/resume support (patch 4).
It is implemented by closing the device on suspend, and reopening it
on resume. To achieve this, HW inits are moved to ndo_open (patch 3).

After some back and forth with the HW dev, it turns out that the "raw" IP
does not support disabling DMA in software, so I want to get rid of the
nb8800_dma_stop() hack. Since there is no standard way to disable DMA,
I dropped generic support (patch 1).

The HW dev has also stated that it is safe to tweak the flow control bit
in the NB8800_RXC_CR register at any time (patch 2) i.e. no need for
special handling when RX DMA is enabled, despite the documentation stating:

> The Receive Channel Control Register holds channel enable, mode, endian,
> DMA control, and interrupt control information.  This register can only be
> written when the Receive DMA Channel is idle - the Enable bit in it is "0".

Marc Gonzalez (4):
  net: nb8800: Drop generic support
  net: nb8800: Simplify nb8800_pause_config()
  net: nb8800: Move HW init to ndo_open()
  net: nb8800: Add support for suspend/resume

 drivers/net/ethernet/aurora/nb8800.c | 189 +++++++++++------------------------
 drivers/net/ethernet/aurora/nb8800.h |   4 +-
 2 files changed, 63 insertions(+), 130 deletions(-)

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH v3 1/4] net: nb8800: Drop generic support
  2017-11-14 10:53 ` Marc Gonzalez
@ 2017-11-14 10:54   ` Marc Gonzalez
  -1 siblings, 0 replies; 70+ messages in thread
From: Marc Gonzalez @ 2017-11-14 10:54 UTC (permalink / raw)
  To: David Miller, Mans Rullgard
  Cc: netdev, Linux ARM, Florian Fainelli, Thibaud Cornic, Mason

According to our HW dev, there is no provision for software to safely
disable RX DMA in the AU-NB8800 hardware block (ethernet DMA). Thus,
it is the responsibility of the SoC designer to provide such a feature.

The nb8800_dma_stop() implementation is a clever hack that works most
of the times, but it breaks the DMA state machine in rare cases.

Therefore, let's drop generic support.

FWIW, tango chips provide a reset register. When the ethernet block
comes out of reset, DMA is disabled.

Signed-off-by: Marc Gonzalez <marc_gonzalez@sigmadesigns.com>
---
 drivers/net/ethernet/aurora/nb8800.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/net/ethernet/aurora/nb8800.c b/drivers/net/ethernet/aurora/nb8800.c
index e94159507847..26f719e2d6ca 100644
--- a/drivers/net/ethernet/aurora/nb8800.c
+++ b/drivers/net/ethernet/aurora/nb8800.c
@@ -1335,9 +1335,6 @@ static const struct nb8800_ops nb8800_tango4_ops = {
 };
 
 static const struct of_device_id nb8800_dt_ids[] = {
-	{
-		.compatible = "aurora,nb8800",
-	},
 	{
 		.compatible = "sigma,smp8642-ethernet",
 		.data = &nb8800_tangox_ops,
-- 
2.15.0

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v3 1/4] net: nb8800: Drop generic support
@ 2017-11-14 10:54   ` Marc Gonzalez
  0 siblings, 0 replies; 70+ messages in thread
From: Marc Gonzalez @ 2017-11-14 10:54 UTC (permalink / raw)
  To: linux-arm-kernel

According to our HW dev, there is no provision for software to safely
disable RX DMA in the AU-NB8800 hardware block (ethernet DMA). Thus,
it is the responsibility of the SoC designer to provide such a feature.

The nb8800_dma_stop() implementation is a clever hack that works most
of the times, but it breaks the DMA state machine in rare cases.

Therefore, let's drop generic support.

FWIW, tango chips provide a reset register. When the ethernet block
comes out of reset, DMA is disabled.

Signed-off-by: Marc Gonzalez <marc_gonzalez@sigmadesigns.com>
---
 drivers/net/ethernet/aurora/nb8800.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/net/ethernet/aurora/nb8800.c b/drivers/net/ethernet/aurora/nb8800.c
index e94159507847..26f719e2d6ca 100644
--- a/drivers/net/ethernet/aurora/nb8800.c
+++ b/drivers/net/ethernet/aurora/nb8800.c
@@ -1335,9 +1335,6 @@ static const struct nb8800_ops nb8800_tango4_ops = {
 };
 
 static const struct of_device_id nb8800_dt_ids[] = {
-	{
-		.compatible = "aurora,nb8800",
-	},
 	{
 		.compatible = "sigma,smp8642-ethernet",
 		.data = &nb8800_tangox_ops,
-- 
2.15.0

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v3 2/4] net: nb8800: Simplify nb8800_pause_config()
  2017-11-14 10:53 ` Marc Gonzalez
@ 2017-11-14 10:55   ` Marc Gonzalez
  -1 siblings, 0 replies; 70+ messages in thread
From: Marc Gonzalez @ 2017-11-14 10:55 UTC (permalink / raw)
  To: David Miller, Mans Rullgard
  Cc: netdev, Linux ARM, Florian Fainelli, Thibaud Cornic, Mason

The "flow control enable" bit can be tweaked, even if DMA is enabled.

Signed-off-by: Marc Gonzalez <marc_gonzalez@sigmadesigns.com>
---
 drivers/net/ethernet/aurora/nb8800.c | 18 +-----------------
 1 file changed, 1 insertion(+), 17 deletions(-)

diff --git a/drivers/net/ethernet/aurora/nb8800.c b/drivers/net/ethernet/aurora/nb8800.c
index 26f719e2d6ca..09b8001e1ecc 100644
--- a/drivers/net/ethernet/aurora/nb8800.c
+++ b/drivers/net/ethernet/aurora/nb8800.c
@@ -633,7 +633,6 @@ static void nb8800_pause_config(struct net_device *dev)
 {
 	struct nb8800_priv *priv = netdev_priv(dev);
 	struct phy_device *phydev = dev->phydev;
-	u32 rxcr;
 
 	if (priv->pause_aneg) {
 		if (!phydev || !phydev->link)
@@ -644,22 +643,7 @@ static void nb8800_pause_config(struct net_device *dev)
 	}
 
 	nb8800_modb(priv, NB8800_RX_CTL, RX_PAUSE_EN, priv->pause_rx);
-
-	rxcr = nb8800_readl(priv, NB8800_RXC_CR);
-	if (!!(rxcr & RCR_FL) == priv->pause_tx)
-		return;
-
-	if (netif_running(dev)) {
-		napi_disable(&priv->napi);
-		netif_tx_lock_bh(dev);
-		nb8800_dma_stop(dev);
-		nb8800_modl(priv, NB8800_RXC_CR, RCR_FL, priv->pause_tx);
-		nb8800_start_rx(dev);
-		netif_tx_unlock_bh(dev);
-		napi_enable(&priv->napi);
-	} else {
-		nb8800_modl(priv, NB8800_RXC_CR, RCR_FL, priv->pause_tx);
-	}
+	nb8800_modl(priv, NB8800_RXC_CR, RCR_FL, priv->pause_tx);
 }
 
 static void nb8800_link_reconfigure(struct net_device *dev)
-- 
2.15.0

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v3 2/4] net: nb8800: Simplify nb8800_pause_config()
@ 2017-11-14 10:55   ` Marc Gonzalez
  0 siblings, 0 replies; 70+ messages in thread
From: Marc Gonzalez @ 2017-11-14 10:55 UTC (permalink / raw)
  To: linux-arm-kernel

The "flow control enable" bit can be tweaked, even if DMA is enabled.

Signed-off-by: Marc Gonzalez <marc_gonzalez@sigmadesigns.com>
---
 drivers/net/ethernet/aurora/nb8800.c | 18 +-----------------
 1 file changed, 1 insertion(+), 17 deletions(-)

diff --git a/drivers/net/ethernet/aurora/nb8800.c b/drivers/net/ethernet/aurora/nb8800.c
index 26f719e2d6ca..09b8001e1ecc 100644
--- a/drivers/net/ethernet/aurora/nb8800.c
+++ b/drivers/net/ethernet/aurora/nb8800.c
@@ -633,7 +633,6 @@ static void nb8800_pause_config(struct net_device *dev)
 {
 	struct nb8800_priv *priv = netdev_priv(dev);
 	struct phy_device *phydev = dev->phydev;
-	u32 rxcr;
 
 	if (priv->pause_aneg) {
 		if (!phydev || !phydev->link)
@@ -644,22 +643,7 @@ static void nb8800_pause_config(struct net_device *dev)
 	}
 
 	nb8800_modb(priv, NB8800_RX_CTL, RX_PAUSE_EN, priv->pause_rx);
-
-	rxcr = nb8800_readl(priv, NB8800_RXC_CR);
-	if (!!(rxcr & RCR_FL) == priv->pause_tx)
-		return;
-
-	if (netif_running(dev)) {
-		napi_disable(&priv->napi);
-		netif_tx_lock_bh(dev);
-		nb8800_dma_stop(dev);
-		nb8800_modl(priv, NB8800_RXC_CR, RCR_FL, priv->pause_tx);
-		nb8800_start_rx(dev);
-		netif_tx_unlock_bh(dev);
-		napi_enable(&priv->napi);
-	} else {
-		nb8800_modl(priv, NB8800_RXC_CR, RCR_FL, priv->pause_tx);
-	}
+	nb8800_modl(priv, NB8800_RXC_CR, RCR_FL, priv->pause_tx);
 }
 
 static void nb8800_link_reconfigure(struct net_device *dev)
-- 
2.15.0

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v3 3/4] net: nb8800: Move HW init to ndo_open()
  2017-11-14 10:53 ` Marc Gonzalez
@ 2017-11-14 10:56   ` Marc Gonzalez
  -1 siblings, 0 replies; 70+ messages in thread
From: Marc Gonzalez @ 2017-11-14 10:56 UTC (permalink / raw)
  To: David Miller, Mans Rullgard
  Cc: netdev, Linux ARM, Florian Fainelli, Thibaud Cornic, Mason

Power entire ethernet block down in ndo_stop().
Power it back up in ndo_open() and perform HW init.
Delete nb8800_dma_stop.

Signed-off-by: Marc Gonzalez <marc_gonzalez@sigmadesigns.com>
---
 drivers/net/ethernet/aurora/nb8800.c | 146 +++++++++--------------------------
 drivers/net/ethernet/aurora/nb8800.h |   4 +-
 2 files changed, 40 insertions(+), 110 deletions(-)

diff --git a/drivers/net/ethernet/aurora/nb8800.c b/drivers/net/ethernet/aurora/nb8800.c
index 09b8001e1ecc..b71d8fb80610 100644
--- a/drivers/net/ethernet/aurora/nb8800.c
+++ b/drivers/net/ethernet/aurora/nb8800.c
@@ -40,7 +40,7 @@
 #include "nb8800.h"
 
 static void nb8800_tx_done(struct net_device *dev);
-static int nb8800_dma_stop(struct net_device *dev);
+static void nb8800_hw_init(struct net_device *dev);
 
 static inline u8 nb8800_readb(struct nb8800_priv *priv, int reg)
 {
@@ -862,61 +862,6 @@ static int nb8800_dma_init(struct net_device *dev)
 	return -ENOMEM;
 }
 
-static int nb8800_dma_stop(struct net_device *dev)
-{
-	struct nb8800_priv *priv = netdev_priv(dev);
-	struct nb8800_tx_buf *txb = &priv->tx_bufs[0];
-	struct nb8800_tx_desc *txd = &priv->tx_descs[0];
-	int retry = 5;
-	u32 txcr;
-	u32 rxcr;
-	int err;
-	unsigned int i;
-
-	/* wait for tx to finish */
-	err = readl_poll_timeout_atomic(priv->base + NB8800_TXC_CR, txcr,
-					!(txcr & TCR_EN) &&
-					priv->tx_done == priv->tx_next,
-					1000, 1000000);
-	if (err)
-		return err;
-
-	/* The rx DMA only stops if it reaches the end of chain.
-	 * To make this happen, we set the EOC flag on all rx
-	 * descriptors, put the device in loopback mode, and send
-	 * a few dummy frames.  The interrupt handler will ignore
-	 * these since NAPI is disabled and no real frames are in
-	 * the tx queue.
-	 */
-
-	for (i = 0; i < RX_DESC_COUNT; i++)
-		priv->rx_descs[i].desc.config |= DESC_EOC;
-
-	txd->desc[0].s_addr =
-		txb->dma_desc + offsetof(struct nb8800_tx_desc, buf);
-	txd->desc[0].config = DESC_BTS(2) | DESC_DS | DESC_EOF | DESC_EOC | 8;
-	memset(txd->buf, 0, sizeof(txd->buf));
-
-	nb8800_mac_af(dev, false);
-	nb8800_setb(priv, NB8800_MAC_MODE, LOOPBACK_EN);
-
-	do {
-		nb8800_writel(priv, NB8800_TX_DESC_ADDR, txb->dma_desc);
-		wmb();
-		nb8800_writel(priv, NB8800_TXC_CR, txcr | TCR_EN);
-
-		err = readl_poll_timeout_atomic(priv->base + NB8800_RXC_CR,
-						rxcr, !(rxcr & RCR_EN),
-						1000, 100000);
-	} while (err && --retry);
-
-	nb8800_mac_af(dev, true);
-	nb8800_clearb(priv, NB8800_MAC_MODE, LOOPBACK_EN);
-	nb8800_dma_reset(dev);
-
-	return retry ? 0 : -ETIMEDOUT;
-}
-
 static void nb8800_pause_adv(struct net_device *dev)
 {
 	struct nb8800_priv *priv = netdev_priv(dev);
@@ -941,6 +886,11 @@ static int nb8800_open(struct net_device *dev)
 	struct phy_device *phydev;
 	int err;
 
+	priv->ops->power_up(dev);
+	nb8800_hw_init(dev);
+	priv->ops->init(dev);
+	nb8800_update_mac_addr(dev);
+
 	/* clear any pending interrupts */
 	nb8800_writel(priv, NB8800_RXC_SR, 0xf);
 	nb8800_writel(priv, NB8800_TXC_SR, 0xf);
@@ -993,12 +943,11 @@ static int nb8800_stop(struct net_device *dev)
 	netif_stop_queue(dev);
 	napi_disable(&priv->napi);
 
-	nb8800_dma_stop(dev);
-	nb8800_mac_rx(dev, false);
-	nb8800_mac_tx(dev, false);
-
 	phy_disconnect(phydev);
 
+	priv->ops->power_down(dev);
+	priv->speed = 0;
+
 	free_irq(dev->irq, dev);
 
 	nb8800_dma_free(dev);
@@ -1150,7 +1099,7 @@ static const struct ethtool_ops nb8800_ethtool_ops = {
 	.set_link_ksettings	= phy_ethtool_set_link_ksettings,
 };
 
-static int nb8800_hw_init(struct net_device *dev)
+static void nb8800_hw_init(struct net_device *dev)
 {
 	struct nb8800_priv *priv = netdev_priv(dev);
 	u32 val;
@@ -1230,20 +1179,14 @@ static int nb8800_hw_init(struct net_device *dev)
 	nb8800_writeb(priv, NB8800_PQ1, val >> 8);
 	nb8800_writeb(priv, NB8800_PQ2, val & 0xff);
 
-	/* Auto-negotiate by default */
-	priv->pause_aneg = true;
-	priv->pause_rx = true;
-	priv->pause_tx = true;
-
 	nb8800_mc_init(dev, 0);
-
-	return 0;
 }
 
 static int nb8800_tangox_init(struct net_device *dev)
 {
 	struct nb8800_priv *priv = netdev_priv(dev);
 	u32 pad_mode = PAD_MODE_MII;
+	int clk_div;
 
 	switch (priv->phy_mode) {
 	case PHY_INTERFACE_MODE_MII:
@@ -1266,29 +1209,26 @@ static int nb8800_tangox_init(struct net_device *dev)
 
 	nb8800_writeb(priv, NB8800_TANGOX_PAD_MODE, pad_mode);
 
+	clk_div = DIV_ROUND_UP(clk_get_rate(priv->clk), 2 * MAX_MDC_CLOCK);
+	nb8800_writew(priv, NB8800_TANGOX_MDIO_CLKDIV, clk_div);
+
 	return 0;
 }
 
-static int nb8800_tangox_reset(struct net_device *dev)
+static void nb8800_tango_power_down(struct net_device *dev)
 {
-	struct nb8800_priv *priv = netdev_priv(dev);
-	int clk_div;
-
-	nb8800_writeb(priv, NB8800_TANGOX_RESET, 0);
-	usleep_range(1000, 10000);
-	nb8800_writeb(priv, NB8800_TANGOX_RESET, 1);
-
-	wmb();		/* ensure reset is cleared before proceeding */
-
-	clk_div = DIV_ROUND_UP(clk_get_rate(priv->clk), 2 * MAX_MDC_CLOCK);
-	nb8800_writew(priv, NB8800_TANGOX_MDIO_CLKDIV, clk_div);
+	nb8800_writel(netdev_priv(dev), NB8800_TANGOX_RESET, 0);
+}
 
-	return 0;
+static void nb8800_tango_power_up(struct net_device *dev)
+{
+	nb8800_writel(netdev_priv(dev), NB8800_TANGOX_RESET, 1);
 }
 
 static const struct nb8800_ops nb8800_tangox_ops = {
-	.init	= nb8800_tangox_init,
-	.reset	= nb8800_tangox_reset,
+	.init		= nb8800_tangox_init,
+	.power_down	= nb8800_tango_power_down,
+	.power_up	= nb8800_tango_power_up,
 };
 
 static int nb8800_tango4_init(struct net_device *dev)
@@ -1314,8 +1254,9 @@ static int nb8800_tango4_init(struct net_device *dev)
 }
 
 static const struct nb8800_ops nb8800_tango4_ops = {
-	.init	= nb8800_tango4_init,
-	.reset	= nb8800_tangox_reset,
+	.init		= nb8800_tango4_init,
+	.power_down	= nb8800_tango_power_down,
+	.power_up	= nb8800_tango_power_up,
 };
 
 static const struct of_device_id nb8800_dt_ids[] = {
@@ -1334,7 +1275,6 @@ MODULE_DEVICE_TABLE(of, nb8800_dt_ids);
 static int nb8800_probe(struct platform_device *pdev)
 {
 	const struct of_device_id *match;
-	const struct nb8800_ops *ops = NULL;
 	struct nb8800_priv *priv;
 	struct resource *res;
 	struct net_device *dev;
@@ -1345,8 +1285,8 @@ static int nb8800_probe(struct platform_device *pdev)
 	int ret;
 
 	match = of_match_device(nb8800_dt_ids, &pdev->dev);
-	if (match)
-		ops = match->data;
+	if (!match || !match->data)
+		return -ENODEV;
 
 	irq = platform_get_irq(pdev, 0);
 	if (irq <= 0) {
@@ -1388,12 +1328,6 @@ static int nb8800_probe(struct platform_device *pdev)
 
 	spin_lock_init(&priv->tx_lock);
 
-	if (ops && ops->reset) {
-		ret = ops->reset(dev);
-		if (ret)
-			goto err_disable_clk;
-	}
-
 	bus = devm_mdiobus_alloc(&pdev->dev);
 	if (!bus) {
 		ret = -ENOMEM;
@@ -1435,16 +1369,6 @@ static int nb8800_probe(struct platform_device *pdev)
 
 	priv->mii_bus = bus;
 
-	ret = nb8800_hw_init(dev);
-	if (ret)
-		goto err_deregister_fixed_link;
-
-	if (ops && ops->init) {
-		ret = ops->init(dev);
-		if (ret)
-			goto err_deregister_fixed_link;
-	}
-
 	dev->netdev_ops = &nb8800_netdev_ops;
 	dev->ethtool_ops = &nb8800_ethtool_ops;
 	dev->flags |= IFF_MULTICAST;
@@ -1457,24 +1381,28 @@ static int nb8800_probe(struct platform_device *pdev)
 	if (!is_valid_ether_addr(dev->dev_addr))
 		eth_hw_addr_random(dev);
 
-	nb8800_update_mac_addr(dev);
-
 	netif_carrier_off(dev);
 
 	ret = register_netdev(dev);
 	if (ret) {
 		netdev_err(dev, "failed to register netdev\n");
-		goto err_free_dma;
+		goto err_deregister_fixed_link;
 	}
 
 	netif_napi_add(dev, &priv->napi, nb8800_poll, NAPI_POLL_WEIGHT);
 
 	netdev_info(dev, "MAC address %pM\n", dev->dev_addr);
 
+	/* Auto-negotiate by default */
+	priv->pause_aneg = true;
+	priv->pause_rx = true;
+	priv->pause_tx = true;
+
+	priv->ops = match->data;
+	priv->ops->power_down(dev);
+
 	return 0;
 
-err_free_dma:
-	nb8800_dma_free(dev);
 err_deregister_fixed_link:
 	if (of_phy_is_fixed_link(pdev->dev.of_node))
 		of_phy_deregister_fixed_link(pdev->dev.of_node);
diff --git a/drivers/net/ethernet/aurora/nb8800.h b/drivers/net/ethernet/aurora/nb8800.h
index 6ec4a956e1e5..23fefca54804 100644
--- a/drivers/net/ethernet/aurora/nb8800.h
+++ b/drivers/net/ethernet/aurora/nb8800.h
@@ -305,11 +305,13 @@ struct nb8800_priv {
 	dma_addr_t			tx_desc_dma;
 
 	struct clk			*clk;
+	const struct nb8800_ops		*ops;
 };
 
 struct nb8800_ops {
 	int				(*init)(struct net_device *dev);
-	int				(*reset)(struct net_device *dev);
+	void				(*power_down)(struct net_device *dev);
+	void				(*power_up)(struct net_device *dev);
 };
 
 #endif /* _NB8800_H_ */
-- 
2.15.0

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v3 3/4] net: nb8800: Move HW init to ndo_open()
@ 2017-11-14 10:56   ` Marc Gonzalez
  0 siblings, 0 replies; 70+ messages in thread
From: Marc Gonzalez @ 2017-11-14 10:56 UTC (permalink / raw)
  To: linux-arm-kernel

Power entire ethernet block down in ndo_stop().
Power it back up in ndo_open() and perform HW init.
Delete nb8800_dma_stop.

Signed-off-by: Marc Gonzalez <marc_gonzalez@sigmadesigns.com>
---
 drivers/net/ethernet/aurora/nb8800.c | 146 +++++++++--------------------------
 drivers/net/ethernet/aurora/nb8800.h |   4 +-
 2 files changed, 40 insertions(+), 110 deletions(-)

diff --git a/drivers/net/ethernet/aurora/nb8800.c b/drivers/net/ethernet/aurora/nb8800.c
index 09b8001e1ecc..b71d8fb80610 100644
--- a/drivers/net/ethernet/aurora/nb8800.c
+++ b/drivers/net/ethernet/aurora/nb8800.c
@@ -40,7 +40,7 @@
 #include "nb8800.h"
 
 static void nb8800_tx_done(struct net_device *dev);
-static int nb8800_dma_stop(struct net_device *dev);
+static void nb8800_hw_init(struct net_device *dev);
 
 static inline u8 nb8800_readb(struct nb8800_priv *priv, int reg)
 {
@@ -862,61 +862,6 @@ static int nb8800_dma_init(struct net_device *dev)
 	return -ENOMEM;
 }
 
-static int nb8800_dma_stop(struct net_device *dev)
-{
-	struct nb8800_priv *priv = netdev_priv(dev);
-	struct nb8800_tx_buf *txb = &priv->tx_bufs[0];
-	struct nb8800_tx_desc *txd = &priv->tx_descs[0];
-	int retry = 5;
-	u32 txcr;
-	u32 rxcr;
-	int err;
-	unsigned int i;
-
-	/* wait for tx to finish */
-	err = readl_poll_timeout_atomic(priv->base + NB8800_TXC_CR, txcr,
-					!(txcr & TCR_EN) &&
-					priv->tx_done == priv->tx_next,
-					1000, 1000000);
-	if (err)
-		return err;
-
-	/* The rx DMA only stops if it reaches the end of chain.
-	 * To make this happen, we set the EOC flag on all rx
-	 * descriptors, put the device in loopback mode, and send
-	 * a few dummy frames.  The interrupt handler will ignore
-	 * these since NAPI is disabled and no real frames are in
-	 * the tx queue.
-	 */
-
-	for (i = 0; i < RX_DESC_COUNT; i++)
-		priv->rx_descs[i].desc.config |= DESC_EOC;
-
-	txd->desc[0].s_addr =
-		txb->dma_desc + offsetof(struct nb8800_tx_desc, buf);
-	txd->desc[0].config = DESC_BTS(2) | DESC_DS | DESC_EOF | DESC_EOC | 8;
-	memset(txd->buf, 0, sizeof(txd->buf));
-
-	nb8800_mac_af(dev, false);
-	nb8800_setb(priv, NB8800_MAC_MODE, LOOPBACK_EN);
-
-	do {
-		nb8800_writel(priv, NB8800_TX_DESC_ADDR, txb->dma_desc);
-		wmb();
-		nb8800_writel(priv, NB8800_TXC_CR, txcr | TCR_EN);
-
-		err = readl_poll_timeout_atomic(priv->base + NB8800_RXC_CR,
-						rxcr, !(rxcr & RCR_EN),
-						1000, 100000);
-	} while (err && --retry);
-
-	nb8800_mac_af(dev, true);
-	nb8800_clearb(priv, NB8800_MAC_MODE, LOOPBACK_EN);
-	nb8800_dma_reset(dev);
-
-	return retry ? 0 : -ETIMEDOUT;
-}
-
 static void nb8800_pause_adv(struct net_device *dev)
 {
 	struct nb8800_priv *priv = netdev_priv(dev);
@@ -941,6 +886,11 @@ static int nb8800_open(struct net_device *dev)
 	struct phy_device *phydev;
 	int err;
 
+	priv->ops->power_up(dev);
+	nb8800_hw_init(dev);
+	priv->ops->init(dev);
+	nb8800_update_mac_addr(dev);
+
 	/* clear any pending interrupts */
 	nb8800_writel(priv, NB8800_RXC_SR, 0xf);
 	nb8800_writel(priv, NB8800_TXC_SR, 0xf);
@@ -993,12 +943,11 @@ static int nb8800_stop(struct net_device *dev)
 	netif_stop_queue(dev);
 	napi_disable(&priv->napi);
 
-	nb8800_dma_stop(dev);
-	nb8800_mac_rx(dev, false);
-	nb8800_mac_tx(dev, false);
-
 	phy_disconnect(phydev);
 
+	priv->ops->power_down(dev);
+	priv->speed = 0;
+
 	free_irq(dev->irq, dev);
 
 	nb8800_dma_free(dev);
@@ -1150,7 +1099,7 @@ static const struct ethtool_ops nb8800_ethtool_ops = {
 	.set_link_ksettings	= phy_ethtool_set_link_ksettings,
 };
 
-static int nb8800_hw_init(struct net_device *dev)
+static void nb8800_hw_init(struct net_device *dev)
 {
 	struct nb8800_priv *priv = netdev_priv(dev);
 	u32 val;
@@ -1230,20 +1179,14 @@ static int nb8800_hw_init(struct net_device *dev)
 	nb8800_writeb(priv, NB8800_PQ1, val >> 8);
 	nb8800_writeb(priv, NB8800_PQ2, val & 0xff);
 
-	/* Auto-negotiate by default */
-	priv->pause_aneg = true;
-	priv->pause_rx = true;
-	priv->pause_tx = true;
-
 	nb8800_mc_init(dev, 0);
-
-	return 0;
 }
 
 static int nb8800_tangox_init(struct net_device *dev)
 {
 	struct nb8800_priv *priv = netdev_priv(dev);
 	u32 pad_mode = PAD_MODE_MII;
+	int clk_div;
 
 	switch (priv->phy_mode) {
 	case PHY_INTERFACE_MODE_MII:
@@ -1266,29 +1209,26 @@ static int nb8800_tangox_init(struct net_device *dev)
 
 	nb8800_writeb(priv, NB8800_TANGOX_PAD_MODE, pad_mode);
 
+	clk_div = DIV_ROUND_UP(clk_get_rate(priv->clk), 2 * MAX_MDC_CLOCK);
+	nb8800_writew(priv, NB8800_TANGOX_MDIO_CLKDIV, clk_div);
+
 	return 0;
 }
 
-static int nb8800_tangox_reset(struct net_device *dev)
+static void nb8800_tango_power_down(struct net_device *dev)
 {
-	struct nb8800_priv *priv = netdev_priv(dev);
-	int clk_div;
-
-	nb8800_writeb(priv, NB8800_TANGOX_RESET, 0);
-	usleep_range(1000, 10000);
-	nb8800_writeb(priv, NB8800_TANGOX_RESET, 1);
-
-	wmb();		/* ensure reset is cleared before proceeding */
-
-	clk_div = DIV_ROUND_UP(clk_get_rate(priv->clk), 2 * MAX_MDC_CLOCK);
-	nb8800_writew(priv, NB8800_TANGOX_MDIO_CLKDIV, clk_div);
+	nb8800_writel(netdev_priv(dev), NB8800_TANGOX_RESET, 0);
+}
 
-	return 0;
+static void nb8800_tango_power_up(struct net_device *dev)
+{
+	nb8800_writel(netdev_priv(dev), NB8800_TANGOX_RESET, 1);
 }
 
 static const struct nb8800_ops nb8800_tangox_ops = {
-	.init	= nb8800_tangox_init,
-	.reset	= nb8800_tangox_reset,
+	.init		= nb8800_tangox_init,
+	.power_down	= nb8800_tango_power_down,
+	.power_up	= nb8800_tango_power_up,
 };
 
 static int nb8800_tango4_init(struct net_device *dev)
@@ -1314,8 +1254,9 @@ static int nb8800_tango4_init(struct net_device *dev)
 }
 
 static const struct nb8800_ops nb8800_tango4_ops = {
-	.init	= nb8800_tango4_init,
-	.reset	= nb8800_tangox_reset,
+	.init		= nb8800_tango4_init,
+	.power_down	= nb8800_tango_power_down,
+	.power_up	= nb8800_tango_power_up,
 };
 
 static const struct of_device_id nb8800_dt_ids[] = {
@@ -1334,7 +1275,6 @@ MODULE_DEVICE_TABLE(of, nb8800_dt_ids);
 static int nb8800_probe(struct platform_device *pdev)
 {
 	const struct of_device_id *match;
-	const struct nb8800_ops *ops = NULL;
 	struct nb8800_priv *priv;
 	struct resource *res;
 	struct net_device *dev;
@@ -1345,8 +1285,8 @@ static int nb8800_probe(struct platform_device *pdev)
 	int ret;
 
 	match = of_match_device(nb8800_dt_ids, &pdev->dev);
-	if (match)
-		ops = match->data;
+	if (!match || !match->data)
+		return -ENODEV;
 
 	irq = platform_get_irq(pdev, 0);
 	if (irq <= 0) {
@@ -1388,12 +1328,6 @@ static int nb8800_probe(struct platform_device *pdev)
 
 	spin_lock_init(&priv->tx_lock);
 
-	if (ops && ops->reset) {
-		ret = ops->reset(dev);
-		if (ret)
-			goto err_disable_clk;
-	}
-
 	bus = devm_mdiobus_alloc(&pdev->dev);
 	if (!bus) {
 		ret = -ENOMEM;
@@ -1435,16 +1369,6 @@ static int nb8800_probe(struct platform_device *pdev)
 
 	priv->mii_bus = bus;
 
-	ret = nb8800_hw_init(dev);
-	if (ret)
-		goto err_deregister_fixed_link;
-
-	if (ops && ops->init) {
-		ret = ops->init(dev);
-		if (ret)
-			goto err_deregister_fixed_link;
-	}
-
 	dev->netdev_ops = &nb8800_netdev_ops;
 	dev->ethtool_ops = &nb8800_ethtool_ops;
 	dev->flags |= IFF_MULTICAST;
@@ -1457,24 +1381,28 @@ static int nb8800_probe(struct platform_device *pdev)
 	if (!is_valid_ether_addr(dev->dev_addr))
 		eth_hw_addr_random(dev);
 
-	nb8800_update_mac_addr(dev);
-
 	netif_carrier_off(dev);
 
 	ret = register_netdev(dev);
 	if (ret) {
 		netdev_err(dev, "failed to register netdev\n");
-		goto err_free_dma;
+		goto err_deregister_fixed_link;
 	}
 
 	netif_napi_add(dev, &priv->napi, nb8800_poll, NAPI_POLL_WEIGHT);
 
 	netdev_info(dev, "MAC address %pM\n", dev->dev_addr);
 
+	/* Auto-negotiate by default */
+	priv->pause_aneg = true;
+	priv->pause_rx = true;
+	priv->pause_tx = true;
+
+	priv->ops = match->data;
+	priv->ops->power_down(dev);
+
 	return 0;
 
-err_free_dma:
-	nb8800_dma_free(dev);
 err_deregister_fixed_link:
 	if (of_phy_is_fixed_link(pdev->dev.of_node))
 		of_phy_deregister_fixed_link(pdev->dev.of_node);
diff --git a/drivers/net/ethernet/aurora/nb8800.h b/drivers/net/ethernet/aurora/nb8800.h
index 6ec4a956e1e5..23fefca54804 100644
--- a/drivers/net/ethernet/aurora/nb8800.h
+++ b/drivers/net/ethernet/aurora/nb8800.h
@@ -305,11 +305,13 @@ struct nb8800_priv {
 	dma_addr_t			tx_desc_dma;
 
 	struct clk			*clk;
+	const struct nb8800_ops		*ops;
 };
 
 struct nb8800_ops {
 	int				(*init)(struct net_device *dev);
-	int				(*reset)(struct net_device *dev);
+	void				(*power_down)(struct net_device *dev);
+	void				(*power_up)(struct net_device *dev);
 };
 
 #endif /* _NB8800_H_ */
-- 
2.15.0

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v3 4/4] net: nb8800: Add support for suspend/resume
  2017-11-14 10:53 ` Marc Gonzalez
@ 2017-11-14 12:04   ` Marc Gonzalez
  -1 siblings, 0 replies; 70+ messages in thread
From: Marc Gonzalez @ 2017-11-14 12:04 UTC (permalink / raw)
  To: David Miller, Mans Rullgard
  Cc: netdev, Linux ARM, Florian Fainelli, Thibaud Cornic, Mason

Signed-off-by: Marc Gonzalez <marc_gonzalez@sigmadesigns.com>
---
 drivers/net/ethernet/aurora/nb8800.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/drivers/net/ethernet/aurora/nb8800.c b/drivers/net/ethernet/aurora/nb8800.c
index b71d8fb80610..9af2423aed03 100644
--- a/drivers/net/ethernet/aurora/nb8800.c
+++ b/drivers/net/ethernet/aurora/nb8800.c
@@ -1437,6 +1437,26 @@ static int nb8800_remove(struct platform_device *pdev)
 	return 0;
 }
 
+static int nb8800_suspend(struct platform_device *pdev, pm_message_t state)
+{
+	struct net_device *dev = platform_get_drvdata(pdev);
+
+	if (netif_running(dev))
+		return nb8800_stop(dev);
+
+	return 0;
+}
+
+static int nb8800_resume(struct platform_device *pdev)
+{
+	struct net_device *dev = platform_get_drvdata(pdev);
+
+	if (netif_running(dev))
+		return nb8800_open(dev);
+
+	return 0;
+}
+
 static struct platform_driver nb8800_driver = {
 	.driver = {
 		.name		= "nb8800",
@@ -1444,6 +1464,8 @@ static struct platform_driver nb8800_driver = {
 	},
 	.probe	= nb8800_probe,
 	.remove	= nb8800_remove,
+	.suspend = nb8800_suspend,
+	.resume = nb8800_resume,
 };
 
 module_platform_driver(nb8800_driver);
-- 
2.15.0

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v3 4/4] net: nb8800: Add support for suspend/resume
@ 2017-11-14 12:04   ` Marc Gonzalez
  0 siblings, 0 replies; 70+ messages in thread
From: Marc Gonzalez @ 2017-11-14 12:04 UTC (permalink / raw)
  To: linux-arm-kernel

Signed-off-by: Marc Gonzalez <marc_gonzalez@sigmadesigns.com>
---
 drivers/net/ethernet/aurora/nb8800.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/drivers/net/ethernet/aurora/nb8800.c b/drivers/net/ethernet/aurora/nb8800.c
index b71d8fb80610..9af2423aed03 100644
--- a/drivers/net/ethernet/aurora/nb8800.c
+++ b/drivers/net/ethernet/aurora/nb8800.c
@@ -1437,6 +1437,26 @@ static int nb8800_remove(struct platform_device *pdev)
 	return 0;
 }
 
+static int nb8800_suspend(struct platform_device *pdev, pm_message_t state)
+{
+	struct net_device *dev = platform_get_drvdata(pdev);
+
+	if (netif_running(dev))
+		return nb8800_stop(dev);
+
+	return 0;
+}
+
+static int nb8800_resume(struct platform_device *pdev)
+{
+	struct net_device *dev = platform_get_drvdata(pdev);
+
+	if (netif_running(dev))
+		return nb8800_open(dev);
+
+	return 0;
+}
+
 static struct platform_driver nb8800_driver = {
 	.driver = {
 		.name		= "nb8800",
@@ -1444,6 +1464,8 @@ static struct platform_driver nb8800_driver = {
 	},
 	.probe	= nb8800_probe,
 	.remove	= nb8800_remove,
+	.suspend = nb8800_suspend,
+	.resume = nb8800_resume,
 };
 
 module_platform_driver(nb8800_driver);
-- 
2.15.0

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 1/4] net: nb8800: Drop generic support
  2017-11-14 10:54   ` Marc Gonzalez
@ 2017-11-14 12:37     ` Måns Rullgård
  -1 siblings, 0 replies; 70+ messages in thread
From: Måns Rullgård @ 2017-11-14 12:37 UTC (permalink / raw)
  To: Marc Gonzalez
  Cc: Florian Fainelli, Mason, netdev, Thibaud Cornic, David Miller, Linux ARM

Marc Gonzalez <marc_gonzalez@sigmadesigns.com> writes:

> According to our HW dev, there is no provision for software to safely
> disable RX DMA in the AU-NB8800 hardware block (ethernet DMA). Thus,
> it is the responsibility of the SoC designer to provide such a feature.
>
> The nb8800_dma_stop() implementation is a clever hack that works most
> of the times, but it breaks the DMA state machine in rare cases.
>
> Therefore, let's drop generic support.
>
> FWIW, tango chips provide a reset register. When the ethernet block
> comes out of reset, DMA is disabled.
>
> Signed-off-by: Marc Gonzalez <marc_gonzalez@sigmadesigns.com>
> ---
>  drivers/net/ethernet/aurora/nb8800.c | 3 ---
>  1 file changed, 3 deletions(-)
>
> diff --git a/drivers/net/ethernet/aurora/nb8800.c b/drivers/net/ethernet/aurora/nb8800.c
> index e94159507847..26f719e2d6ca 100644
> --- a/drivers/net/ethernet/aurora/nb8800.c
> +++ b/drivers/net/ethernet/aurora/nb8800.c
> @@ -1335,9 +1335,6 @@ static const struct nb8800_ops nb8800_tango4_ops = {
>  };
>
>  static const struct of_device_id nb8800_dt_ids[] = {
> -	{
> -		.compatible = "aurora,nb8800",
> -	},
>  	{
>  		.compatible = "sigma,smp8642-ethernet",
>  		.data = &nb8800_tangox_ops,
> -- 

Please leave this.  It works just fine on tango3.

-- 
Måns Rullgård

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH v3 1/4] net: nb8800: Drop generic support
@ 2017-11-14 12:37     ` Måns Rullgård
  0 siblings, 0 replies; 70+ messages in thread
From: Måns Rullgård @ 2017-11-14 12:37 UTC (permalink / raw)
  To: linux-arm-kernel

Marc Gonzalez <marc_gonzalez@sigmadesigns.com> writes:

> According to our HW dev, there is no provision for software to safely
> disable RX DMA in the AU-NB8800 hardware block (ethernet DMA). Thus,
> it is the responsibility of the SoC designer to provide such a feature.
>
> The nb8800_dma_stop() implementation is a clever hack that works most
> of the times, but it breaks the DMA state machine in rare cases.
>
> Therefore, let's drop generic support.
>
> FWIW, tango chips provide a reset register. When the ethernet block
> comes out of reset, DMA is disabled.
>
> Signed-off-by: Marc Gonzalez <marc_gonzalez@sigmadesigns.com>
> ---
>  drivers/net/ethernet/aurora/nb8800.c | 3 ---
>  1 file changed, 3 deletions(-)
>
> diff --git a/drivers/net/ethernet/aurora/nb8800.c b/drivers/net/ethernet/aurora/nb8800.c
> index e94159507847..26f719e2d6ca 100644
> --- a/drivers/net/ethernet/aurora/nb8800.c
> +++ b/drivers/net/ethernet/aurora/nb8800.c
> @@ -1335,9 +1335,6 @@ static const struct nb8800_ops nb8800_tango4_ops = {
>  };
>
>  static const struct of_device_id nb8800_dt_ids[] = {
> -	{
> -		.compatible = "aurora,nb8800",
> -	},
>  	{
>  		.compatible = "sigma,smp8642-ethernet",
>  		.data = &nb8800_tangox_ops,
> -- 

Please leave this.  It works just fine on tango3.

-- 
M?ns Rullg?rd

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 2/4] net: nb8800: Simplify nb8800_pause_config()
  2017-11-14 10:55   ` Marc Gonzalez
@ 2017-11-14 12:38     ` Måns Rullgård
  -1 siblings, 0 replies; 70+ messages in thread
From: Måns Rullgård @ 2017-11-14 12:38 UTC (permalink / raw)
  To: Marc Gonzalez
  Cc: Florian Fainelli, Mason, netdev, Thibaud Cornic, David Miller, Linux ARM

Marc Gonzalez <marc_gonzalez@sigmadesigns.com> writes:

> The "flow control enable" bit can be tweaked, even if DMA is enabled.

No, it can't.  Maybe on some of your newer chips it can, but not on the
older ones.

> Signed-off-by: Marc Gonzalez <marc_gonzalez@sigmadesigns.com>
> ---
>  drivers/net/ethernet/aurora/nb8800.c | 18 +-----------------
>  1 file changed, 1 insertion(+), 17 deletions(-)
>
> diff --git a/drivers/net/ethernet/aurora/nb8800.c b/drivers/net/ethernet/aurora/nb8800.c
> index 26f719e2d6ca..09b8001e1ecc 100644
> --- a/drivers/net/ethernet/aurora/nb8800.c
> +++ b/drivers/net/ethernet/aurora/nb8800.c
> @@ -633,7 +633,6 @@ static void nb8800_pause_config(struct net_device *dev)
>  {
>  	struct nb8800_priv *priv = netdev_priv(dev);
>  	struct phy_device *phydev = dev->phydev;
> -	u32 rxcr;
>
>  	if (priv->pause_aneg) {
>  		if (!phydev || !phydev->link)
> @@ -644,22 +643,7 @@ static void nb8800_pause_config(struct net_device *dev)
>  	}
>
>  	nb8800_modb(priv, NB8800_RX_CTL, RX_PAUSE_EN, priv->pause_rx);
> -
> -	rxcr = nb8800_readl(priv, NB8800_RXC_CR);
> -	if (!!(rxcr & RCR_FL) == priv->pause_tx)
> -		return;
> -
> -	if (netif_running(dev)) {
> -		napi_disable(&priv->napi);
> -		netif_tx_lock_bh(dev);
> -		nb8800_dma_stop(dev);
> -		nb8800_modl(priv, NB8800_RXC_CR, RCR_FL, priv->pause_tx);
> -		nb8800_start_rx(dev);
> -		netif_tx_unlock_bh(dev);
> -		napi_enable(&priv->napi);
> -	} else {
> -		nb8800_modl(priv, NB8800_RXC_CR, RCR_FL, priv->pause_tx);
> -	}
> +	nb8800_modl(priv, NB8800_RXC_CR, RCR_FL, priv->pause_tx);
>  }
>
>  static void nb8800_link_reconfigure(struct net_device *dev)
> -- 

-- 
Måns Rullgård

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH v3 2/4] net: nb8800: Simplify nb8800_pause_config()
@ 2017-11-14 12:38     ` Måns Rullgård
  0 siblings, 0 replies; 70+ messages in thread
From: Måns Rullgård @ 2017-11-14 12:38 UTC (permalink / raw)
  To: linux-arm-kernel

Marc Gonzalez <marc_gonzalez@sigmadesigns.com> writes:

> The "flow control enable" bit can be tweaked, even if DMA is enabled.

No, it can't.  Maybe on some of your newer chips it can, but not on the
older ones.

> Signed-off-by: Marc Gonzalez <marc_gonzalez@sigmadesigns.com>
> ---
>  drivers/net/ethernet/aurora/nb8800.c | 18 +-----------------
>  1 file changed, 1 insertion(+), 17 deletions(-)
>
> diff --git a/drivers/net/ethernet/aurora/nb8800.c b/drivers/net/ethernet/aurora/nb8800.c
> index 26f719e2d6ca..09b8001e1ecc 100644
> --- a/drivers/net/ethernet/aurora/nb8800.c
> +++ b/drivers/net/ethernet/aurora/nb8800.c
> @@ -633,7 +633,6 @@ static void nb8800_pause_config(struct net_device *dev)
>  {
>  	struct nb8800_priv *priv = netdev_priv(dev);
>  	struct phy_device *phydev = dev->phydev;
> -	u32 rxcr;
>
>  	if (priv->pause_aneg) {
>  		if (!phydev || !phydev->link)
> @@ -644,22 +643,7 @@ static void nb8800_pause_config(struct net_device *dev)
>  	}
>
>  	nb8800_modb(priv, NB8800_RX_CTL, RX_PAUSE_EN, priv->pause_rx);
> -
> -	rxcr = nb8800_readl(priv, NB8800_RXC_CR);
> -	if (!!(rxcr & RCR_FL) == priv->pause_tx)
> -		return;
> -
> -	if (netif_running(dev)) {
> -		napi_disable(&priv->napi);
> -		netif_tx_lock_bh(dev);
> -		nb8800_dma_stop(dev);
> -		nb8800_modl(priv, NB8800_RXC_CR, RCR_FL, priv->pause_tx);
> -		nb8800_start_rx(dev);
> -		netif_tx_unlock_bh(dev);
> -		napi_enable(&priv->napi);
> -	} else {
> -		nb8800_modl(priv, NB8800_RXC_CR, RCR_FL, priv->pause_tx);
> -	}
> +	nb8800_modl(priv, NB8800_RXC_CR, RCR_FL, priv->pause_tx);
>  }
>
>  static void nb8800_link_reconfigure(struct net_device *dev)
> -- 

-- 
M?ns Rullg?rd

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 3/4] net: nb8800: Move HW init to ndo_open()
  2017-11-14 10:56   ` Marc Gonzalez
@ 2017-11-14 12:40     ` Måns Rullgård
  -1 siblings, 0 replies; 70+ messages in thread
From: Måns Rullgård @ 2017-11-14 12:40 UTC (permalink / raw)
  To: Marc Gonzalez
  Cc: David Miller, netdev, Linux ARM, Florian Fainelli, Thibaud Cornic, Mason

Marc Gonzalez <marc_gonzalez@sigmadesigns.com> writes:

> Power entire ethernet block down in ndo_stop().
> Power it back up in ndo_open() and perform HW init.
> Delete nb8800_dma_stop.

Leave it alone, please.  Not all chips might have a separate power
domain for this.  Also, it works just fine on the older chips.

> Signed-off-by: Marc Gonzalez <marc_gonzalez@sigmadesigns.com>
> ---
>  drivers/net/ethernet/aurora/nb8800.c | 146 +++++++++--------------------------
>  drivers/net/ethernet/aurora/nb8800.h |   4 +-
>  2 files changed, 40 insertions(+), 110 deletions(-)
>
> diff --git a/drivers/net/ethernet/aurora/nb8800.c b/drivers/net/ethernet/aurora/nb8800.c
> index 09b8001e1ecc..b71d8fb80610 100644
> --- a/drivers/net/ethernet/aurora/nb8800.c
> +++ b/drivers/net/ethernet/aurora/nb8800.c
> @@ -40,7 +40,7 @@
>  #include "nb8800.h"
>
>  static void nb8800_tx_done(struct net_device *dev);
> -static int nb8800_dma_stop(struct net_device *dev);
> +static void nb8800_hw_init(struct net_device *dev);
>
>  static inline u8 nb8800_readb(struct nb8800_priv *priv, int reg)
>  {
> @@ -862,61 +862,6 @@ static int nb8800_dma_init(struct net_device *dev)
>  	return -ENOMEM;
>  }
>
> -static int nb8800_dma_stop(struct net_device *dev)
> -{
> -	struct nb8800_priv *priv = netdev_priv(dev);
> -	struct nb8800_tx_buf *txb = &priv->tx_bufs[0];
> -	struct nb8800_tx_desc *txd = &priv->tx_descs[0];
> -	int retry = 5;
> -	u32 txcr;
> -	u32 rxcr;
> -	int err;
> -	unsigned int i;
> -
> -	/* wait for tx to finish */
> -	err = readl_poll_timeout_atomic(priv->base + NB8800_TXC_CR, txcr,
> -					!(txcr & TCR_EN) &&
> -					priv->tx_done == priv->tx_next,
> -					1000, 1000000);
> -	if (err)
> -		return err;
> -
> -	/* The rx DMA only stops if it reaches the end of chain.
> -	 * To make this happen, we set the EOC flag on all rx
> -	 * descriptors, put the device in loopback mode, and send
> -	 * a few dummy frames.  The interrupt handler will ignore
> -	 * these since NAPI is disabled and no real frames are in
> -	 * the tx queue.
> -	 */
> -
> -	for (i = 0; i < RX_DESC_COUNT; i++)
> -		priv->rx_descs[i].desc.config |= DESC_EOC;
> -
> -	txd->desc[0].s_addr =
> -		txb->dma_desc + offsetof(struct nb8800_tx_desc, buf);
> -	txd->desc[0].config = DESC_BTS(2) | DESC_DS | DESC_EOF | DESC_EOC | 8;
> -	memset(txd->buf, 0, sizeof(txd->buf));
> -
> -	nb8800_mac_af(dev, false);
> -	nb8800_setb(priv, NB8800_MAC_MODE, LOOPBACK_EN);
> -
> -	do {
> -		nb8800_writel(priv, NB8800_TX_DESC_ADDR, txb->dma_desc);
> -		wmb();
> -		nb8800_writel(priv, NB8800_TXC_CR, txcr | TCR_EN);
> -
> -		err = readl_poll_timeout_atomic(priv->base + NB8800_RXC_CR,
> -						rxcr, !(rxcr & RCR_EN),
> -						1000, 100000);
> -	} while (err && --retry);
> -
> -	nb8800_mac_af(dev, true);
> -	nb8800_clearb(priv, NB8800_MAC_MODE, LOOPBACK_EN);
> -	nb8800_dma_reset(dev);
> -
> -	return retry ? 0 : -ETIMEDOUT;
> -}
> -
>  static void nb8800_pause_adv(struct net_device *dev)
>  {
>  	struct nb8800_priv *priv = netdev_priv(dev);
> @@ -941,6 +886,11 @@ static int nb8800_open(struct net_device *dev)
>  	struct phy_device *phydev;
>  	int err;
>
> +	priv->ops->power_up(dev);
> +	nb8800_hw_init(dev);
> +	priv->ops->init(dev);
> +	nb8800_update_mac_addr(dev);
> +
>  	/* clear any pending interrupts */
>  	nb8800_writel(priv, NB8800_RXC_SR, 0xf);
>  	nb8800_writel(priv, NB8800_TXC_SR, 0xf);
> @@ -993,12 +943,11 @@ static int nb8800_stop(struct net_device *dev)
>  	netif_stop_queue(dev);
>  	napi_disable(&priv->napi);
>
> -	nb8800_dma_stop(dev);
> -	nb8800_mac_rx(dev, false);
> -	nb8800_mac_tx(dev, false);
> -
>  	phy_disconnect(phydev);
>
> +	priv->ops->power_down(dev);
> +	priv->speed = 0;
> +
>  	free_irq(dev->irq, dev);
>
>  	nb8800_dma_free(dev);
> @@ -1150,7 +1099,7 @@ static const struct ethtool_ops nb8800_ethtool_ops = {
>  	.set_link_ksettings	= phy_ethtool_set_link_ksettings,
>  };
>
> -static int nb8800_hw_init(struct net_device *dev)
> +static void nb8800_hw_init(struct net_device *dev)
>  {
>  	struct nb8800_priv *priv = netdev_priv(dev);
>  	u32 val;
> @@ -1230,20 +1179,14 @@ static int nb8800_hw_init(struct net_device *dev)
>  	nb8800_writeb(priv, NB8800_PQ1, val >> 8);
>  	nb8800_writeb(priv, NB8800_PQ2, val & 0xff);
>
> -	/* Auto-negotiate by default */
> -	priv->pause_aneg = true;
> -	priv->pause_rx = true;
> -	priv->pause_tx = true;
> -
>  	nb8800_mc_init(dev, 0);
> -
> -	return 0;
>  }
>
>  static int nb8800_tangox_init(struct net_device *dev)
>  {
>  	struct nb8800_priv *priv = netdev_priv(dev);
>  	u32 pad_mode = PAD_MODE_MII;
> +	int clk_div;
>
>  	switch (priv->phy_mode) {
>  	case PHY_INTERFACE_MODE_MII:
> @@ -1266,29 +1209,26 @@ static int nb8800_tangox_init(struct net_device *dev)
>
>  	nb8800_writeb(priv, NB8800_TANGOX_PAD_MODE, pad_mode);
>
> +	clk_div = DIV_ROUND_UP(clk_get_rate(priv->clk), 2 * MAX_MDC_CLOCK);
> +	nb8800_writew(priv, NB8800_TANGOX_MDIO_CLKDIV, clk_div);
> +
>  	return 0;
>  }
>
> -static int nb8800_tangox_reset(struct net_device *dev)
> +static void nb8800_tango_power_down(struct net_device *dev)
>  {
> -	struct nb8800_priv *priv = netdev_priv(dev);
> -	int clk_div;
> -
> -	nb8800_writeb(priv, NB8800_TANGOX_RESET, 0);
> -	usleep_range(1000, 10000);
> -	nb8800_writeb(priv, NB8800_TANGOX_RESET, 1);
> -
> -	wmb();		/* ensure reset is cleared before proceeding */
> -
> -	clk_div = DIV_ROUND_UP(clk_get_rate(priv->clk), 2 * MAX_MDC_CLOCK);
> -	nb8800_writew(priv, NB8800_TANGOX_MDIO_CLKDIV, clk_div);
> +	nb8800_writel(netdev_priv(dev), NB8800_TANGOX_RESET, 0);
> +}
>
> -	return 0;
> +static void nb8800_tango_power_up(struct net_device *dev)
> +{
> +	nb8800_writel(netdev_priv(dev), NB8800_TANGOX_RESET, 1);
>  }
>
>  static const struct nb8800_ops nb8800_tangox_ops = {
> -	.init	= nb8800_tangox_init,
> -	.reset	= nb8800_tangox_reset,
> +	.init		= nb8800_tangox_init,
> +	.power_down	= nb8800_tango_power_down,
> +	.power_up	= nb8800_tango_power_up,
>  };
>
>  static int nb8800_tango4_init(struct net_device *dev)
> @@ -1314,8 +1254,9 @@ static int nb8800_tango4_init(struct net_device *dev)
>  }
>
>  static const struct nb8800_ops nb8800_tango4_ops = {
> -	.init	= nb8800_tango4_init,
> -	.reset	= nb8800_tangox_reset,
> +	.init		= nb8800_tango4_init,
> +	.power_down	= nb8800_tango_power_down,
> +	.power_up	= nb8800_tango_power_up,
>  };
>
>  static const struct of_device_id nb8800_dt_ids[] = {
> @@ -1334,7 +1275,6 @@ MODULE_DEVICE_TABLE(of, nb8800_dt_ids);
>  static int nb8800_probe(struct platform_device *pdev)
>  {
>  	const struct of_device_id *match;
> -	const struct nb8800_ops *ops = NULL;
>  	struct nb8800_priv *priv;
>  	struct resource *res;
>  	struct net_device *dev;
> @@ -1345,8 +1285,8 @@ static int nb8800_probe(struct platform_device *pdev)
>  	int ret;
>
>  	match = of_match_device(nb8800_dt_ids, &pdev->dev);
> -	if (match)
> -		ops = match->data;
> +	if (!match || !match->data)
> +		return -ENODEV;
>
>  	irq = platform_get_irq(pdev, 0);
>  	if (irq <= 0) {
> @@ -1388,12 +1328,6 @@ static int nb8800_probe(struct platform_device *pdev)
>
>  	spin_lock_init(&priv->tx_lock);
>
> -	if (ops && ops->reset) {
> -		ret = ops->reset(dev);
> -		if (ret)
> -			goto err_disable_clk;
> -	}
> -
>  	bus = devm_mdiobus_alloc(&pdev->dev);
>  	if (!bus) {
>  		ret = -ENOMEM;
> @@ -1435,16 +1369,6 @@ static int nb8800_probe(struct platform_device *pdev)
>
>  	priv->mii_bus = bus;
>
> -	ret = nb8800_hw_init(dev);
> -	if (ret)
> -		goto err_deregister_fixed_link;
> -
> -	if (ops && ops->init) {
> -		ret = ops->init(dev);
> -		if (ret)
> -			goto err_deregister_fixed_link;
> -	}
> -
>  	dev->netdev_ops = &nb8800_netdev_ops;
>  	dev->ethtool_ops = &nb8800_ethtool_ops;
>  	dev->flags |= IFF_MULTICAST;
> @@ -1457,24 +1381,28 @@ static int nb8800_probe(struct platform_device *pdev)
>  	if (!is_valid_ether_addr(dev->dev_addr))
>  		eth_hw_addr_random(dev);
>
> -	nb8800_update_mac_addr(dev);
> -
>  	netif_carrier_off(dev);
>
>  	ret = register_netdev(dev);
>  	if (ret) {
>  		netdev_err(dev, "failed to register netdev\n");
> -		goto err_free_dma;
> +		goto err_deregister_fixed_link;
>  	}
>
>  	netif_napi_add(dev, &priv->napi, nb8800_poll, NAPI_POLL_WEIGHT);
>
>  	netdev_info(dev, "MAC address %pM\n", dev->dev_addr);
>
> +	/* Auto-negotiate by default */
> +	priv->pause_aneg = true;
> +	priv->pause_rx = true;
> +	priv->pause_tx = true;
> +
> +	priv->ops = match->data;
> +	priv->ops->power_down(dev);
> +
>  	return 0;
>
> -err_free_dma:
> -	nb8800_dma_free(dev);
>  err_deregister_fixed_link:
>  	if (of_phy_is_fixed_link(pdev->dev.of_node))
>  		of_phy_deregister_fixed_link(pdev->dev.of_node);
> diff --git a/drivers/net/ethernet/aurora/nb8800.h b/drivers/net/ethernet/aurora/nb8800.h
> index 6ec4a956e1e5..23fefca54804 100644
> --- a/drivers/net/ethernet/aurora/nb8800.h
> +++ b/drivers/net/ethernet/aurora/nb8800.h
> @@ -305,11 +305,13 @@ struct nb8800_priv {
>  	dma_addr_t			tx_desc_dma;
>
>  	struct clk			*clk;
> +	const struct nb8800_ops		*ops;
>  };
>
>  struct nb8800_ops {
>  	int				(*init)(struct net_device *dev);
> -	int				(*reset)(struct net_device *dev);
> +	void				(*power_down)(struct net_device *dev);
> +	void				(*power_up)(struct net_device *dev);
>  };
>
>  #endif /* _NB8800_H_ */
> -- 
> 2.15.0
>

-- 
Måns Rullgård

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH v3 3/4] net: nb8800: Move HW init to ndo_open()
@ 2017-11-14 12:40     ` Måns Rullgård
  0 siblings, 0 replies; 70+ messages in thread
From: Måns Rullgård @ 2017-11-14 12:40 UTC (permalink / raw)
  To: linux-arm-kernel

Marc Gonzalez <marc_gonzalez@sigmadesigns.com> writes:

> Power entire ethernet block down in ndo_stop().
> Power it back up in ndo_open() and perform HW init.
> Delete nb8800_dma_stop.

Leave it alone, please.  Not all chips might have a separate power
domain for this.  Also, it works just fine on the older chips.

> Signed-off-by: Marc Gonzalez <marc_gonzalez@sigmadesigns.com>
> ---
>  drivers/net/ethernet/aurora/nb8800.c | 146 +++++++++--------------------------
>  drivers/net/ethernet/aurora/nb8800.h |   4 +-
>  2 files changed, 40 insertions(+), 110 deletions(-)
>
> diff --git a/drivers/net/ethernet/aurora/nb8800.c b/drivers/net/ethernet/aurora/nb8800.c
> index 09b8001e1ecc..b71d8fb80610 100644
> --- a/drivers/net/ethernet/aurora/nb8800.c
> +++ b/drivers/net/ethernet/aurora/nb8800.c
> @@ -40,7 +40,7 @@
>  #include "nb8800.h"
>
>  static void nb8800_tx_done(struct net_device *dev);
> -static int nb8800_dma_stop(struct net_device *dev);
> +static void nb8800_hw_init(struct net_device *dev);
>
>  static inline u8 nb8800_readb(struct nb8800_priv *priv, int reg)
>  {
> @@ -862,61 +862,6 @@ static int nb8800_dma_init(struct net_device *dev)
>  	return -ENOMEM;
>  }
>
> -static int nb8800_dma_stop(struct net_device *dev)
> -{
> -	struct nb8800_priv *priv = netdev_priv(dev);
> -	struct nb8800_tx_buf *txb = &priv->tx_bufs[0];
> -	struct nb8800_tx_desc *txd = &priv->tx_descs[0];
> -	int retry = 5;
> -	u32 txcr;
> -	u32 rxcr;
> -	int err;
> -	unsigned int i;
> -
> -	/* wait for tx to finish */
> -	err = readl_poll_timeout_atomic(priv->base + NB8800_TXC_CR, txcr,
> -					!(txcr & TCR_EN) &&
> -					priv->tx_done == priv->tx_next,
> -					1000, 1000000);
> -	if (err)
> -		return err;
> -
> -	/* The rx DMA only stops if it reaches the end of chain.
> -	 * To make this happen, we set the EOC flag on all rx
> -	 * descriptors, put the device in loopback mode, and send
> -	 * a few dummy frames.  The interrupt handler will ignore
> -	 * these since NAPI is disabled and no real frames are in
> -	 * the tx queue.
> -	 */
> -
> -	for (i = 0; i < RX_DESC_COUNT; i++)
> -		priv->rx_descs[i].desc.config |= DESC_EOC;
> -
> -	txd->desc[0].s_addr =
> -		txb->dma_desc + offsetof(struct nb8800_tx_desc, buf);
> -	txd->desc[0].config = DESC_BTS(2) | DESC_DS | DESC_EOF | DESC_EOC | 8;
> -	memset(txd->buf, 0, sizeof(txd->buf));
> -
> -	nb8800_mac_af(dev, false);
> -	nb8800_setb(priv, NB8800_MAC_MODE, LOOPBACK_EN);
> -
> -	do {
> -		nb8800_writel(priv, NB8800_TX_DESC_ADDR, txb->dma_desc);
> -		wmb();
> -		nb8800_writel(priv, NB8800_TXC_CR, txcr | TCR_EN);
> -
> -		err = readl_poll_timeout_atomic(priv->base + NB8800_RXC_CR,
> -						rxcr, !(rxcr & RCR_EN),
> -						1000, 100000);
> -	} while (err && --retry);
> -
> -	nb8800_mac_af(dev, true);
> -	nb8800_clearb(priv, NB8800_MAC_MODE, LOOPBACK_EN);
> -	nb8800_dma_reset(dev);
> -
> -	return retry ? 0 : -ETIMEDOUT;
> -}
> -
>  static void nb8800_pause_adv(struct net_device *dev)
>  {
>  	struct nb8800_priv *priv = netdev_priv(dev);
> @@ -941,6 +886,11 @@ static int nb8800_open(struct net_device *dev)
>  	struct phy_device *phydev;
>  	int err;
>
> +	priv->ops->power_up(dev);
> +	nb8800_hw_init(dev);
> +	priv->ops->init(dev);
> +	nb8800_update_mac_addr(dev);
> +
>  	/* clear any pending interrupts */
>  	nb8800_writel(priv, NB8800_RXC_SR, 0xf);
>  	nb8800_writel(priv, NB8800_TXC_SR, 0xf);
> @@ -993,12 +943,11 @@ static int nb8800_stop(struct net_device *dev)
>  	netif_stop_queue(dev);
>  	napi_disable(&priv->napi);
>
> -	nb8800_dma_stop(dev);
> -	nb8800_mac_rx(dev, false);
> -	nb8800_mac_tx(dev, false);
> -
>  	phy_disconnect(phydev);
>
> +	priv->ops->power_down(dev);
> +	priv->speed = 0;
> +
>  	free_irq(dev->irq, dev);
>
>  	nb8800_dma_free(dev);
> @@ -1150,7 +1099,7 @@ static const struct ethtool_ops nb8800_ethtool_ops = {
>  	.set_link_ksettings	= phy_ethtool_set_link_ksettings,
>  };
>
> -static int nb8800_hw_init(struct net_device *dev)
> +static void nb8800_hw_init(struct net_device *dev)
>  {
>  	struct nb8800_priv *priv = netdev_priv(dev);
>  	u32 val;
> @@ -1230,20 +1179,14 @@ static int nb8800_hw_init(struct net_device *dev)
>  	nb8800_writeb(priv, NB8800_PQ1, val >> 8);
>  	nb8800_writeb(priv, NB8800_PQ2, val & 0xff);
>
> -	/* Auto-negotiate by default */
> -	priv->pause_aneg = true;
> -	priv->pause_rx = true;
> -	priv->pause_tx = true;
> -
>  	nb8800_mc_init(dev, 0);
> -
> -	return 0;
>  }
>
>  static int nb8800_tangox_init(struct net_device *dev)
>  {
>  	struct nb8800_priv *priv = netdev_priv(dev);
>  	u32 pad_mode = PAD_MODE_MII;
> +	int clk_div;
>
>  	switch (priv->phy_mode) {
>  	case PHY_INTERFACE_MODE_MII:
> @@ -1266,29 +1209,26 @@ static int nb8800_tangox_init(struct net_device *dev)
>
>  	nb8800_writeb(priv, NB8800_TANGOX_PAD_MODE, pad_mode);
>
> +	clk_div = DIV_ROUND_UP(clk_get_rate(priv->clk), 2 * MAX_MDC_CLOCK);
> +	nb8800_writew(priv, NB8800_TANGOX_MDIO_CLKDIV, clk_div);
> +
>  	return 0;
>  }
>
> -static int nb8800_tangox_reset(struct net_device *dev)
> +static void nb8800_tango_power_down(struct net_device *dev)
>  {
> -	struct nb8800_priv *priv = netdev_priv(dev);
> -	int clk_div;
> -
> -	nb8800_writeb(priv, NB8800_TANGOX_RESET, 0);
> -	usleep_range(1000, 10000);
> -	nb8800_writeb(priv, NB8800_TANGOX_RESET, 1);
> -
> -	wmb();		/* ensure reset is cleared before proceeding */
> -
> -	clk_div = DIV_ROUND_UP(clk_get_rate(priv->clk), 2 * MAX_MDC_CLOCK);
> -	nb8800_writew(priv, NB8800_TANGOX_MDIO_CLKDIV, clk_div);
> +	nb8800_writel(netdev_priv(dev), NB8800_TANGOX_RESET, 0);
> +}
>
> -	return 0;
> +static void nb8800_tango_power_up(struct net_device *dev)
> +{
> +	nb8800_writel(netdev_priv(dev), NB8800_TANGOX_RESET, 1);
>  }
>
>  static const struct nb8800_ops nb8800_tangox_ops = {
> -	.init	= nb8800_tangox_init,
> -	.reset	= nb8800_tangox_reset,
> +	.init		= nb8800_tangox_init,
> +	.power_down	= nb8800_tango_power_down,
> +	.power_up	= nb8800_tango_power_up,
>  };
>
>  static int nb8800_tango4_init(struct net_device *dev)
> @@ -1314,8 +1254,9 @@ static int nb8800_tango4_init(struct net_device *dev)
>  }
>
>  static const struct nb8800_ops nb8800_tango4_ops = {
> -	.init	= nb8800_tango4_init,
> -	.reset	= nb8800_tangox_reset,
> +	.init		= nb8800_tango4_init,
> +	.power_down	= nb8800_tango_power_down,
> +	.power_up	= nb8800_tango_power_up,
>  };
>
>  static const struct of_device_id nb8800_dt_ids[] = {
> @@ -1334,7 +1275,6 @@ MODULE_DEVICE_TABLE(of, nb8800_dt_ids);
>  static int nb8800_probe(struct platform_device *pdev)
>  {
>  	const struct of_device_id *match;
> -	const struct nb8800_ops *ops = NULL;
>  	struct nb8800_priv *priv;
>  	struct resource *res;
>  	struct net_device *dev;
> @@ -1345,8 +1285,8 @@ static int nb8800_probe(struct platform_device *pdev)
>  	int ret;
>
>  	match = of_match_device(nb8800_dt_ids, &pdev->dev);
> -	if (match)
> -		ops = match->data;
> +	if (!match || !match->data)
> +		return -ENODEV;
>
>  	irq = platform_get_irq(pdev, 0);
>  	if (irq <= 0) {
> @@ -1388,12 +1328,6 @@ static int nb8800_probe(struct platform_device *pdev)
>
>  	spin_lock_init(&priv->tx_lock);
>
> -	if (ops && ops->reset) {
> -		ret = ops->reset(dev);
> -		if (ret)
> -			goto err_disable_clk;
> -	}
> -
>  	bus = devm_mdiobus_alloc(&pdev->dev);
>  	if (!bus) {
>  		ret = -ENOMEM;
> @@ -1435,16 +1369,6 @@ static int nb8800_probe(struct platform_device *pdev)
>
>  	priv->mii_bus = bus;
>
> -	ret = nb8800_hw_init(dev);
> -	if (ret)
> -		goto err_deregister_fixed_link;
> -
> -	if (ops && ops->init) {
> -		ret = ops->init(dev);
> -		if (ret)
> -			goto err_deregister_fixed_link;
> -	}
> -
>  	dev->netdev_ops = &nb8800_netdev_ops;
>  	dev->ethtool_ops = &nb8800_ethtool_ops;
>  	dev->flags |= IFF_MULTICAST;
> @@ -1457,24 +1381,28 @@ static int nb8800_probe(struct platform_device *pdev)
>  	if (!is_valid_ether_addr(dev->dev_addr))
>  		eth_hw_addr_random(dev);
>
> -	nb8800_update_mac_addr(dev);
> -
>  	netif_carrier_off(dev);
>
>  	ret = register_netdev(dev);
>  	if (ret) {
>  		netdev_err(dev, "failed to register netdev\n");
> -		goto err_free_dma;
> +		goto err_deregister_fixed_link;
>  	}
>
>  	netif_napi_add(dev, &priv->napi, nb8800_poll, NAPI_POLL_WEIGHT);
>
>  	netdev_info(dev, "MAC address %pM\n", dev->dev_addr);
>
> +	/* Auto-negotiate by default */
> +	priv->pause_aneg = true;
> +	priv->pause_rx = true;
> +	priv->pause_tx = true;
> +
> +	priv->ops = match->data;
> +	priv->ops->power_down(dev);
> +
>  	return 0;
>
> -err_free_dma:
> -	nb8800_dma_free(dev);
>  err_deregister_fixed_link:
>  	if (of_phy_is_fixed_link(pdev->dev.of_node))
>  		of_phy_deregister_fixed_link(pdev->dev.of_node);
> diff --git a/drivers/net/ethernet/aurora/nb8800.h b/drivers/net/ethernet/aurora/nb8800.h
> index 6ec4a956e1e5..23fefca54804 100644
> --- a/drivers/net/ethernet/aurora/nb8800.h
> +++ b/drivers/net/ethernet/aurora/nb8800.h
> @@ -305,11 +305,13 @@ struct nb8800_priv {
>  	dma_addr_t			tx_desc_dma;
>
>  	struct clk			*clk;
> +	const struct nb8800_ops		*ops;
>  };
>
>  struct nb8800_ops {
>  	int				(*init)(struct net_device *dev);
> -	int				(*reset)(struct net_device *dev);
> +	void				(*power_down)(struct net_device *dev);
> +	void				(*power_up)(struct net_device *dev);
>  };
>
>  #endif /* _NB8800_H_ */
> -- 
> 2.15.0
>

-- 
M?ns Rullg?rd

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 1/4] net: nb8800: Drop generic support
  2017-11-14 12:37     ` Måns Rullgård
@ 2017-11-14 12:47       ` Marc Gonzalez
  -1 siblings, 0 replies; 70+ messages in thread
From: Marc Gonzalez @ 2017-11-14 12:47 UTC (permalink / raw)
  To: Mans Rullgard
  Cc: David Miller, netdev, Linux ARM, Florian Fainelli, Thibaud Cornic, Mason

On 14/11/2017 13:37, Måns Rullgård wrote:

> Marc Gonzalez writes:
> 
>> According to our HW dev, there is no provision for software to safely
>> disable RX DMA in the AU-NB8800 hardware block (ethernet DMA). Thus,
>> it is the responsibility of the SoC designer to provide such a feature.
>>
>> The nb8800_dma_stop() implementation is a clever hack that works most
>> of the times, but it breaks the DMA state machine in rare cases.
>>
>> Therefore, let's drop generic support.
>>
>> FWIW, tango chips provide a reset register. When the ethernet block
>> comes out of reset, DMA is disabled.
>>
>> Signed-off-by: Marc Gonzalez <marc_gonzalez@sigmadesigns.com>
>> ---
>>  drivers/net/ethernet/aurora/nb8800.c | 3 ---
>>  1 file changed, 3 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/aurora/nb8800.c b/drivers/net/ethernet/aurora/nb8800.c
>> index e94159507847..26f719e2d6ca 100644
>> --- a/drivers/net/ethernet/aurora/nb8800.c
>> +++ b/drivers/net/ethernet/aurora/nb8800.c
>> @@ -1335,9 +1335,6 @@ static const struct nb8800_ops nb8800_tango4_ops = {
>>  };
>>
>>  static const struct of_device_id nb8800_dt_ids[] = {
>> -	{
>> -		.compatible = "aurora,nb8800",
>> -	},
>>  	{
>>  		.compatible = "sigma,smp8642-ethernet",
>>  		.data = &nb8800_tangox_ops,
>> -- 
> 
> Please leave this.  It works just fine on tango3.

What you call "tango3" is your SMP8642-based board, I suppose.
That is covered by the "sigma,smp8642-ethernet" string.

There is no need for the generic "aurora,nb8800" string, as no other
known SoC uses the Aurora SSN8800+NB8800 IP; and as I point out in the
commit message, the raw IP lacks certain features.

There is no point in keeping this around. It just adds unnecessary
existence tests for every use of the ops struct.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH v3 1/4] net: nb8800: Drop generic support
@ 2017-11-14 12:47       ` Marc Gonzalez
  0 siblings, 0 replies; 70+ messages in thread
From: Marc Gonzalez @ 2017-11-14 12:47 UTC (permalink / raw)
  To: linux-arm-kernel

On 14/11/2017 13:37, M?ns Rullg?rd wrote:

> Marc Gonzalez writes:
> 
>> According to our HW dev, there is no provision for software to safely
>> disable RX DMA in the AU-NB8800 hardware block (ethernet DMA). Thus,
>> it is the responsibility of the SoC designer to provide such a feature.
>>
>> The nb8800_dma_stop() implementation is a clever hack that works most
>> of the times, but it breaks the DMA state machine in rare cases.
>>
>> Therefore, let's drop generic support.
>>
>> FWIW, tango chips provide a reset register. When the ethernet block
>> comes out of reset, DMA is disabled.
>>
>> Signed-off-by: Marc Gonzalez <marc_gonzalez@sigmadesigns.com>
>> ---
>>  drivers/net/ethernet/aurora/nb8800.c | 3 ---
>>  1 file changed, 3 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/aurora/nb8800.c b/drivers/net/ethernet/aurora/nb8800.c
>> index e94159507847..26f719e2d6ca 100644
>> --- a/drivers/net/ethernet/aurora/nb8800.c
>> +++ b/drivers/net/ethernet/aurora/nb8800.c
>> @@ -1335,9 +1335,6 @@ static const struct nb8800_ops nb8800_tango4_ops = {
>>  };
>>
>>  static const struct of_device_id nb8800_dt_ids[] = {
>> -	{
>> -		.compatible = "aurora,nb8800",
>> -	},
>>  	{
>>  		.compatible = "sigma,smp8642-ethernet",
>>  		.data = &nb8800_tangox_ops,
>> -- 
> 
> Please leave this.  It works just fine on tango3.

What you call "tango3" is your SMP8642-based board, I suppose.
That is covered by the "sigma,smp8642-ethernet" string.

There is no need for the generic "aurora,nb8800" string, as no other
known SoC uses the Aurora SSN8800+NB8800 IP; and as I point out in the
commit message, the raw IP lacks certain features.

There is no point in keeping this around. It just adds unnecessary
existence tests for every use of the ops struct.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 2/4] net: nb8800: Simplify nb8800_pause_config()
  2017-11-14 12:38     ` Måns Rullgård
@ 2017-11-14 12:56       ` Marc Gonzalez
  -1 siblings, 0 replies; 70+ messages in thread
From: Marc Gonzalez @ 2017-11-14 12:56 UTC (permalink / raw)
  To: Mans Rullgard
  Cc: David Miller, netdev, Linux ARM, Florian Fainelli, Thibaud Cornic, Mason

On 14/11/2017 13:38, Måns Rullgård wrote:

> Marc Gonzalez writes:
> 
>> The "flow control enable" bit can be tweaked, even if DMA is enabled.
> 
> No, it can't.  Maybe on some of your newer chips it can, but not on the
> older ones.

Again, I suppose you are referring to your SMP8642-based board.

Are you saying you tested this patch, and it doesn't work?
What are the symptoms?

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH v3 2/4] net: nb8800: Simplify nb8800_pause_config()
@ 2017-11-14 12:56       ` Marc Gonzalez
  0 siblings, 0 replies; 70+ messages in thread
From: Marc Gonzalez @ 2017-11-14 12:56 UTC (permalink / raw)
  To: linux-arm-kernel

On 14/11/2017 13:38, M?ns Rullg?rd wrote:

> Marc Gonzalez writes:
> 
>> The "flow control enable" bit can be tweaked, even if DMA is enabled.
> 
> No, it can't.  Maybe on some of your newer chips it can, but not on the
> older ones.

Again, I suppose you are referring to your SMP8642-based board.

Are you saying you tested this patch, and it doesn't work?
What are the symptoms?

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 4/4] net: nb8800: Add support for suspend/resume
  2017-11-14 12:04   ` Marc Gonzalez
@ 2017-11-14 13:02     ` Måns Rullgård
  -1 siblings, 0 replies; 70+ messages in thread
From: Måns Rullgård @ 2017-11-14 13:02 UTC (permalink / raw)
  To: Marc Gonzalez
  Cc: David Miller, netdev, Linux ARM, Florian Fainelli, Thibaud Cornic, Mason

Marc Gonzalez <marc_gonzalez@sigmadesigns.com> writes:

> Signed-off-by: Marc Gonzalez <marc_gonzalez@sigmadesigns.com>

Missing patch description.  Don't bother though.  I won't approve of
this implementation.

Suspend support has to depend on the chip since the EMAC core doesn't
have anything of the kind.

> ---
>  drivers/net/ethernet/aurora/nb8800.c | 22 ++++++++++++++++++++++
>  1 file changed, 22 insertions(+)
>
> diff --git a/drivers/net/ethernet/aurora/nb8800.c b/drivers/net/ethernet/aurora/nb8800.c
> index b71d8fb80610..9af2423aed03 100644
> --- a/drivers/net/ethernet/aurora/nb8800.c
> +++ b/drivers/net/ethernet/aurora/nb8800.c
> @@ -1437,6 +1437,26 @@ static int nb8800_remove(struct platform_device *pdev)
>  	return 0;
>  }
>
> +static int nb8800_suspend(struct platform_device *pdev, pm_message_t state)
> +{
> +	struct net_device *dev = platform_get_drvdata(pdev);
> +
> +	if (netif_running(dev))
> +		return nb8800_stop(dev);
> +
> +	return 0;
> +}
> +
> +static int nb8800_resume(struct platform_device *pdev)
> +{
> +	struct net_device *dev = platform_get_drvdata(pdev);
> +
> +	if (netif_running(dev))
> +		return nb8800_open(dev);
> +
> +	return 0;
> +}
> +
>  static struct platform_driver nb8800_driver = {
>  	.driver = {
>  		.name		= "nb8800",
> @@ -1444,6 +1464,8 @@ static struct platform_driver nb8800_driver = {
>  	},
>  	.probe	= nb8800_probe,
>  	.remove	= nb8800_remove,
> +	.suspend = nb8800_suspend,
> +	.resume = nb8800_resume,
>  };
>
>  module_platform_driver(nb8800_driver);
> -- 
> 2.15.0
>

-- 
Måns Rullgård

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH v3 4/4] net: nb8800: Add support for suspend/resume
@ 2017-11-14 13:02     ` Måns Rullgård
  0 siblings, 0 replies; 70+ messages in thread
From: Måns Rullgård @ 2017-11-14 13:02 UTC (permalink / raw)
  To: linux-arm-kernel

Marc Gonzalez <marc_gonzalez@sigmadesigns.com> writes:

> Signed-off-by: Marc Gonzalez <marc_gonzalez@sigmadesigns.com>

Missing patch description.  Don't bother though.  I won't approve of
this implementation.

Suspend support has to depend on the chip since the EMAC core doesn't
have anything of the kind.

> ---
>  drivers/net/ethernet/aurora/nb8800.c | 22 ++++++++++++++++++++++
>  1 file changed, 22 insertions(+)
>
> diff --git a/drivers/net/ethernet/aurora/nb8800.c b/drivers/net/ethernet/aurora/nb8800.c
> index b71d8fb80610..9af2423aed03 100644
> --- a/drivers/net/ethernet/aurora/nb8800.c
> +++ b/drivers/net/ethernet/aurora/nb8800.c
> @@ -1437,6 +1437,26 @@ static int nb8800_remove(struct platform_device *pdev)
>  	return 0;
>  }
>
> +static int nb8800_suspend(struct platform_device *pdev, pm_message_t state)
> +{
> +	struct net_device *dev = platform_get_drvdata(pdev);
> +
> +	if (netif_running(dev))
> +		return nb8800_stop(dev);
> +
> +	return 0;
> +}
> +
> +static int nb8800_resume(struct platform_device *pdev)
> +{
> +	struct net_device *dev = platform_get_drvdata(pdev);
> +
> +	if (netif_running(dev))
> +		return nb8800_open(dev);
> +
> +	return 0;
> +}
> +
>  static struct platform_driver nb8800_driver = {
>  	.driver = {
>  		.name		= "nb8800",
> @@ -1444,6 +1464,8 @@ static struct platform_driver nb8800_driver = {
>  	},
>  	.probe	= nb8800_probe,
>  	.remove	= nb8800_remove,
> +	.suspend = nb8800_suspend,
> +	.resume = nb8800_resume,
>  };
>
>  module_platform_driver(nb8800_driver);
> -- 
> 2.15.0
>

-- 
M?ns Rullg?rd

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 1/4] net: nb8800: Drop generic support
  2017-11-14 12:47       ` Marc Gonzalez
@ 2017-11-14 13:03         ` Måns Rullgård
  -1 siblings, 0 replies; 70+ messages in thread
From: Måns Rullgård @ 2017-11-14 13:03 UTC (permalink / raw)
  To: Marc Gonzalez
  Cc: David Miller, netdev, Linux ARM, Florian Fainelli, Thibaud Cornic, Mason

Marc Gonzalez <marc_gonzalez@sigmadesigns.com> writes:

> On 14/11/2017 13:37, Måns Rullgård wrote:
>
>> Marc Gonzalez writes:
>> 
>>> According to our HW dev, there is no provision for software to safely
>>> disable RX DMA in the AU-NB8800 hardware block (ethernet DMA). Thus,
>>> it is the responsibility of the SoC designer to provide such a feature.
>>>
>>> The nb8800_dma_stop() implementation is a clever hack that works most
>>> of the times, but it breaks the DMA state machine in rare cases.
>>>
>>> Therefore, let's drop generic support.
>>>
>>> FWIW, tango chips provide a reset register. When the ethernet block
>>> comes out of reset, DMA is disabled.
>>>
>>> Signed-off-by: Marc Gonzalez <marc_gonzalez@sigmadesigns.com>
>>> ---
>>>  drivers/net/ethernet/aurora/nb8800.c | 3 ---
>>>  1 file changed, 3 deletions(-)
>>>
>>> diff --git a/drivers/net/ethernet/aurora/nb8800.c b/drivers/net/ethernet/aurora/nb8800.c
>>> index e94159507847..26f719e2d6ca 100644
>>> --- a/drivers/net/ethernet/aurora/nb8800.c
>>> +++ b/drivers/net/ethernet/aurora/nb8800.c
>>> @@ -1335,9 +1335,6 @@ static const struct nb8800_ops nb8800_tango4_ops = {
>>>  };
>>>
>>>  static const struct of_device_id nb8800_dt_ids[] = {
>>> -	{
>>> -		.compatible = "aurora,nb8800",
>>> -	},
>>>  	{
>>>  		.compatible = "sigma,smp8642-ethernet",
>>>  		.data = &nb8800_tangox_ops,
>>> -- 
>> 
>> Please leave this.  It works just fine on tango3.
>
> What you call "tango3" is your SMP8642-based board, I suppose.
> That is covered by the "sigma,smp8642-ethernet" string.
>
> There is no need for the generic "aurora,nb8800" string, as no other
> known SoC uses the Aurora SSN8800+NB8800 IP; and as I point out in the
> commit message, the raw IP lacks certain features.
>
> There is no point in keeping this around. It just adds unnecessary
> existence tests for every use of the ops struct.

And if someone discovers another chip using this controller, having the
base support in there makes it much easier to get it working.

-- 
Måns Rullgård

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH v3 1/4] net: nb8800: Drop generic support
@ 2017-11-14 13:03         ` Måns Rullgård
  0 siblings, 0 replies; 70+ messages in thread
From: Måns Rullgård @ 2017-11-14 13:03 UTC (permalink / raw)
  To: linux-arm-kernel

Marc Gonzalez <marc_gonzalez@sigmadesigns.com> writes:

> On 14/11/2017 13:37, M?ns Rullg?rd wrote:
>
>> Marc Gonzalez writes:
>> 
>>> According to our HW dev, there is no provision for software to safely
>>> disable RX DMA in the AU-NB8800 hardware block (ethernet DMA). Thus,
>>> it is the responsibility of the SoC designer to provide such a feature.
>>>
>>> The nb8800_dma_stop() implementation is a clever hack that works most
>>> of the times, but it breaks the DMA state machine in rare cases.
>>>
>>> Therefore, let's drop generic support.
>>>
>>> FWIW, tango chips provide a reset register. When the ethernet block
>>> comes out of reset, DMA is disabled.
>>>
>>> Signed-off-by: Marc Gonzalez <marc_gonzalez@sigmadesigns.com>
>>> ---
>>>  drivers/net/ethernet/aurora/nb8800.c | 3 ---
>>>  1 file changed, 3 deletions(-)
>>>
>>> diff --git a/drivers/net/ethernet/aurora/nb8800.c b/drivers/net/ethernet/aurora/nb8800.c
>>> index e94159507847..26f719e2d6ca 100644
>>> --- a/drivers/net/ethernet/aurora/nb8800.c
>>> +++ b/drivers/net/ethernet/aurora/nb8800.c
>>> @@ -1335,9 +1335,6 @@ static const struct nb8800_ops nb8800_tango4_ops = {
>>>  };
>>>
>>>  static const struct of_device_id nb8800_dt_ids[] = {
>>> -	{
>>> -		.compatible = "aurora,nb8800",
>>> -	},
>>>  	{
>>>  		.compatible = "sigma,smp8642-ethernet",
>>>  		.data = &nb8800_tangox_ops,
>>> -- 
>> 
>> Please leave this.  It works just fine on tango3.
>
> What you call "tango3" is your SMP8642-based board, I suppose.
> That is covered by the "sigma,smp8642-ethernet" string.
>
> There is no need for the generic "aurora,nb8800" string, as no other
> known SoC uses the Aurora SSN8800+NB8800 IP; and as I point out in the
> commit message, the raw IP lacks certain features.
>
> There is no point in keeping this around. It just adds unnecessary
> existence tests for every use of the ops struct.

And if someone discovers another chip using this controller, having the
base support in there makes it much easier to get it working.

-- 
M?ns Rullg?rd

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 2/4] net: nb8800: Simplify nb8800_pause_config()
  2017-11-14 12:56       ` Marc Gonzalez
@ 2017-11-14 13:22         ` Måns Rullgård
  -1 siblings, 0 replies; 70+ messages in thread
From: Måns Rullgård @ 2017-11-14 13:22 UTC (permalink / raw)
  To: Marc Gonzalez
  Cc: David Miller, netdev, Linux ARM, Florian Fainelli, Thibaud Cornic, Mason

Marc Gonzalez <marc_gonzalez@sigmadesigns.com> writes:

> On 14/11/2017 13:38, Måns Rullgård wrote:
>
>> Marc Gonzalez writes:
>> 
>>> The "flow control enable" bit can be tweaked, even if DMA is enabled.
>> 
>> No, it can't.  Maybe on some of your newer chips it can, but not on the
>> older ones.
>
> Again, I suppose you are referring to your SMP8642-based board.
>
> Are you saying you tested this patch, and it doesn't work?
> What are the symptoms?

I didn't test that patch per se.  I did initially try simply changing
that bit, and this didn't work.  Either it had no effect, or it locked
up the controller, I forget which.  The manual explicitly states that
writes are forbidden while the DMA enabled bit is set.

If shutting down the DMA really isn't possible, I would rather just
allow changing the flow control setting only when the interface is
stopped.  The normal case of getting the initial setting from the
auto-negotiation will still work properly.  It just won't be possible to
change it while the link is active.

-- 
Måns Rullgård

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH v3 2/4] net: nb8800: Simplify nb8800_pause_config()
@ 2017-11-14 13:22         ` Måns Rullgård
  0 siblings, 0 replies; 70+ messages in thread
From: Måns Rullgård @ 2017-11-14 13:22 UTC (permalink / raw)
  To: linux-arm-kernel

Marc Gonzalez <marc_gonzalez@sigmadesigns.com> writes:

> On 14/11/2017 13:38, M?ns Rullg?rd wrote:
>
>> Marc Gonzalez writes:
>> 
>>> The "flow control enable" bit can be tweaked, even if DMA is enabled.
>> 
>> No, it can't.  Maybe on some of your newer chips it can, but not on the
>> older ones.
>
> Again, I suppose you are referring to your SMP8642-based board.
>
> Are you saying you tested this patch, and it doesn't work?
> What are the symptoms?

I didn't test that patch per se.  I did initially try simply changing
that bit, and this didn't work.  Either it had no effect, or it locked
up the controller, I forget which.  The manual explicitly states that
writes are forbidden while the DMA enabled bit is set.

If shutting down the DMA really isn't possible, I would rather just
allow changing the flow control setting only when the interface is
stopped.  The normal case of getting the initial setting from the
auto-negotiation will still work properly.  It just won't be possible to
change it while the link is active.

-- 
M?ns Rullg?rd

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 3/4] net: nb8800: Move HW init to ndo_open()
  2017-11-14 12:40     ` Måns Rullgård
@ 2017-11-14 13:26       ` Marc Gonzalez
  -1 siblings, 0 replies; 70+ messages in thread
From: Marc Gonzalez @ 2017-11-14 13:26 UTC (permalink / raw)
  To: Mans Rullgard
  Cc: David Miller, netdev, Linux ARM, Florian Fainelli, Thibaud Cornic, Mason

On 14/11/2017 13:40, Måns Rullgård wrote:

> Marc Gonzalez wrote:
> 
>> Power entire ethernet block down in ndo_stop().
>> Power it back up in ndo_open() and perform HW init.
>> Delete nb8800_dma_stop.
> 
> Leave it alone, please.  Not all chips might have a separate power
> domain for this.  Also, it works just fine on the older chips.

There is no need for separate power domains. The ethernet block is
clock-gated when it is held in reset. The reset register is implemented
on all tango3, tango4, tango5 chips.

nb8800_dma_stop() is a hack. The HW dev has stated that it is not supported.
One cannot conclude that it "works fine" just because you've never triggered
the error condition. (On tango5, the error condition triggers systematically.)

We have several customer bug reports on tango3 and tango4 chips complaining
about "broken" ethernet after a link down / link up cycle. They are using a
different driver, but it implements the same hack in enet_stop_rx().
There is a high probability that the DMA hack is responsible, and wedged the
RX DMA state machine.

Since older chips do support the reset register, this patch implements
the same method for all tango chips.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH v3 3/4] net: nb8800: Move HW init to ndo_open()
@ 2017-11-14 13:26       ` Marc Gonzalez
  0 siblings, 0 replies; 70+ messages in thread
From: Marc Gonzalez @ 2017-11-14 13:26 UTC (permalink / raw)
  To: linux-arm-kernel

On 14/11/2017 13:40, M?ns Rullg?rd wrote:

> Marc Gonzalez wrote:
> 
>> Power entire ethernet block down in ndo_stop().
>> Power it back up in ndo_open() and perform HW init.
>> Delete nb8800_dma_stop.
> 
> Leave it alone, please.  Not all chips might have a separate power
> domain for this.  Also, it works just fine on the older chips.

There is no need for separate power domains. The ethernet block is
clock-gated when it is held in reset. The reset register is implemented
on all tango3, tango4, tango5 chips.

nb8800_dma_stop() is a hack. The HW dev has stated that it is not supported.
One cannot conclude that it "works fine" just because you've never triggered
the error condition. (On tango5, the error condition triggers systematically.)

We have several customer bug reports on tango3 and tango4 chips complaining
about "broken" ethernet after a link down / link up cycle. They are using a
different driver, but it implements the same hack in enet_stop_rx().
There is a high probability that the DMA hack is responsible, and wedged the
RX DMA state machine.

Since older chips do support the reset register, this patch implements
the same method for all tango chips.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 3/4] net: nb8800: Move HW init to ndo_open()
  2017-11-14 13:26       ` Marc Gonzalez
@ 2017-11-14 13:54         ` Måns Rullgård
  -1 siblings, 0 replies; 70+ messages in thread
From: Måns Rullgård @ 2017-11-14 13:54 UTC (permalink / raw)
  To: Marc Gonzalez
  Cc: David Miller, netdev, Linux ARM, Florian Fainelli, Thibaud Cornic, Mason

Marc Gonzalez <marc_gonzalez@sigmadesigns.com> writes:

> On 14/11/2017 13:40, Måns Rullgård wrote:
>
>> Marc Gonzalez wrote:
>> 
>>> Power entire ethernet block down in ndo_stop().
>>> Power it back up in ndo_open() and perform HW init.
>>> Delete nb8800_dma_stop.
>> 
>> Leave it alone, please.  Not all chips might have a separate power
>> domain for this.  Also, it works just fine on the older chips.
>
> There is no need for separate power domains. The ethernet block is
> clock-gated when it is held in reset.

So you're not powering it down then.  Please be accurate.

> The reset register is implemented on all tango3, tango4, tango5 chips.

It's still not a core feature.

> nb8800_dma_stop() is a hack.

The hack originated from your company.

> The HW dev has stated that it is not supported.  One cannot conclude
> that it "works fine" just because you've never triggered the error
> condition. (On tango5, the error condition triggers systematically.)

That sounds like a problem for tango5.

Also, I have repeated asked you what happens if the tango5 runs out of
DMA buffers under normal operation.  Does that also cause it to lock up?
If so, you have a much bigger problem on your hands.

> We have several customer bug reports on tango3 and tango4 chips complaining
> about "broken" ethernet after a link down / link up cycle. They are using a
> different driver, but it implements the same hack in enet_stop_rx().
> There is a high probability that the DMA hack is responsible, and wedged the
> RX DMA state machine.

But you have no idea what's really the problem?

-- 
Måns Rullgård

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH v3 3/4] net: nb8800: Move HW init to ndo_open()
@ 2017-11-14 13:54         ` Måns Rullgård
  0 siblings, 0 replies; 70+ messages in thread
From: Måns Rullgård @ 2017-11-14 13:54 UTC (permalink / raw)
  To: linux-arm-kernel

Marc Gonzalez <marc_gonzalez@sigmadesigns.com> writes:

> On 14/11/2017 13:40, M?ns Rullg?rd wrote:
>
>> Marc Gonzalez wrote:
>> 
>>> Power entire ethernet block down in ndo_stop().
>>> Power it back up in ndo_open() and perform HW init.
>>> Delete nb8800_dma_stop.
>> 
>> Leave it alone, please.  Not all chips might have a separate power
>> domain for this.  Also, it works just fine on the older chips.
>
> There is no need for separate power domains. The ethernet block is
> clock-gated when it is held in reset.

So you're not powering it down then.  Please be accurate.

> The reset register is implemented on all tango3, tango4, tango5 chips.

It's still not a core feature.

> nb8800_dma_stop() is a hack.

The hack originated from your company.

> The HW dev has stated that it is not supported.  One cannot conclude
> that it "works fine" just because you've never triggered the error
> condition. (On tango5, the error condition triggers systematically.)

That sounds like a problem for tango5.

Also, I have repeated asked you what happens if the tango5 runs out of
DMA buffers under normal operation.  Does that also cause it to lock up?
If so, you have a much bigger problem on your hands.

> We have several customer bug reports on tango3 and tango4 chips complaining
> about "broken" ethernet after a link down / link up cycle. They are using a
> different driver, but it implements the same hack in enet_stop_rx().
> There is a high probability that the DMA hack is responsible, and wedged the
> RX DMA state machine.

But you have no idea what's really the problem?

-- 
M?ns Rullg?rd

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 4/4] net: nb8800: Add support for suspend/resume
  2017-11-14 13:02     ` Måns Rullgård
@ 2017-11-14 14:22       ` Marc Gonzalez
  -1 siblings, 0 replies; 70+ messages in thread
From: Marc Gonzalez @ 2017-11-14 14:22 UTC (permalink / raw)
  To: Mans Rullgard
  Cc: David Miller, netdev, Linux ARM, Florian Fainelli, Thibaud Cornic, Mason

On 14/11/2017 14:02, Måns Rullgård wrote:

> Missing patch description.  Don't bother though.  I won't approve of
> this implementation.

I hope I can convince David that you should not have veto power over
this driver just because it was you who submitted it upstream first.

Per the copyright notice, it was rewritten using Sigma's driver as a
reference. And as of today, only Sigma chips use this driver, a fact
which is unlikely to change, since the IP is no longer sold.

http://www.auroravlsi.com/products.html

> Suspend support has to depend on the chip since the EMAC core doesn't
> have anything of the kind.

Your statement makes no sense to me. If suspending the platform loses
context, then all that is required is saving said context at suspend,
and restoring it at resume.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH v3 4/4] net: nb8800: Add support for suspend/resume
@ 2017-11-14 14:22       ` Marc Gonzalez
  0 siblings, 0 replies; 70+ messages in thread
From: Marc Gonzalez @ 2017-11-14 14:22 UTC (permalink / raw)
  To: linux-arm-kernel

On 14/11/2017 14:02, M?ns Rullg?rd wrote:

> Missing patch description.  Don't bother though.  I won't approve of
> this implementation.

I hope I can convince David that you should not have veto power over
this driver just because it was you who submitted it upstream first.

Per the copyright notice, it was rewritten using Sigma's driver as a
reference. And as of today, only Sigma chips use this driver, a fact
which is unlikely to change, since the IP is no longer sold.

http://www.auroravlsi.com/products.html

> Suspend support has to depend on the chip since the EMAC core doesn't
> have anything of the kind.

Your statement makes no sense to me. If suspending the platform loses
context, then all that is required is saving said context at suspend,
and restoring it at resume.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 4/4] net: nb8800: Add support for suspend/resume
  2017-11-14 14:22       ` Marc Gonzalez
@ 2017-11-14 16:31         ` Andrew Lunn
  -1 siblings, 0 replies; 70+ messages in thread
From: Andrew Lunn @ 2017-11-14 16:31 UTC (permalink / raw)
  To: Marc Gonzalez
  Cc: Mans Rullgard, Florian Fainelli, Mason, netdev, Thibaud Cornic,
	David Miller, Linux ARM

On Tue, Nov 14, 2017 at 03:22:04PM +0100, Marc Gonzalez wrote:
> On 14/11/2017 14:02, Måns Rullgård wrote:
> 
> > Missing patch description.  Don't bother though.  I won't approve of
> > this implementation.
> 
> I hope I can convince David that you should not have veto power over
> this driver just because it was you who submitted it upstream first.

In practice, anybody can request a veto on a patch. If a patch
introduces a regression, and somebody reports it, that is pretty much
an automatic veto.

Please work with Mans to ensure you are not breaking the driver for
the hardware he cares about.

    Andrew

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH v3 4/4] net: nb8800: Add support for suspend/resume
@ 2017-11-14 16:31         ` Andrew Lunn
  0 siblings, 0 replies; 70+ messages in thread
From: Andrew Lunn @ 2017-11-14 16:31 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Nov 14, 2017 at 03:22:04PM +0100, Marc Gonzalez wrote:
> On 14/11/2017 14:02, M?ns Rullg?rd wrote:
> 
> > Missing patch description.  Don't bother though.  I won't approve of
> > this implementation.
> 
> I hope I can convince David that you should not have veto power over
> this driver just because it was you who submitted it upstream first.

In practice, anybody can request a veto on a patch. If a patch
introduces a regression, and somebody reports it, that is pretty much
an automatic veto.

Please work with Mans to ensure you are not breaking the driver for
the hardware he cares about.

    Andrew

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 3/4] net: nb8800: Move HW init to ndo_open()
  2017-11-14 13:54         ` Måns Rullgård
@ 2017-11-14 16:41           ` Marc Gonzalez
  -1 siblings, 0 replies; 70+ messages in thread
From: Marc Gonzalez @ 2017-11-14 16:41 UTC (permalink / raw)
  To: Mans Rullgard
  Cc: David Miller, netdev, Linux ARM, Florian Fainelli, Thibaud Cornic, Mason

On 14/11/2017 14:54, Måns Rullgård wrote:

> Marc Gonzalez writes:
> 
>> On 14/11/2017 13:40, Måns Rullgård wrote:
>>
>>> Marc Gonzalez wrote:
>>>
>>>> Power entire ethernet block down in ndo_stop().
>>>> Power it back up in ndo_open() and perform HW init.
>>>> Delete nb8800_dma_stop.
>>>
>>> Leave it alone, please.  Not all chips might have a separate power
>>> domain for this.  Also, it works just fine on the older chips.
>>
>> There is no need for separate power domains. The ethernet block is
>> clock-gated when it is held in reset.
> 
> So you're not powering it down then.  Please be accurate.

Smirk. That looks like trolling.


>> The reset register is implemented on all tango3, tango4, tango5 chips.
> 
> It's still not a core feature.

Correct. But it covers 100% of all chips using this driver.
There is no point in trying to implement support for chips that
have never existed, do not exist, and never will.


>> nb8800_dma_stop() is a hack.
> 
> The hack originated from your company.

So why are you so insistent that we keep using it?


>> The HW dev has stated that it is not supported.  One cannot conclude
>> that it "works fine" just because you've never triggered the error
>> condition. (On tango5, the error condition triggers systematically.)
> 
> That sounds like a problem for tango5.

tango5 does have its share of issues.


> Also, I have repeated asked you what happens if the tango5 runs out of
> DMA buffers under normal operation.  Does that also cause it to lock up?
> If so, you have a much bigger problem on your hands.

I will run iperf3 tests with RX_DESC_COUNT lowered to 2.
Would that produce conclusive results?
Do you have other suggestions?


>> We have several customer bug reports on tango3 and tango4 chips complaining
>> about "broken" ethernet after a link down / link up cycle. They are using a
>> different driver, but it implements the same hack in enet_stop_rx().
>> There is a high probability that the DMA hack is responsible, and wedged the
>> RX DMA state machine.
> 
> But you have no idea what's really the problem?

I have an idea that enet_stop_rx() wedged the RX DMA state machine.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH v3 3/4] net: nb8800: Move HW init to ndo_open()
@ 2017-11-14 16:41           ` Marc Gonzalez
  0 siblings, 0 replies; 70+ messages in thread
From: Marc Gonzalez @ 2017-11-14 16:41 UTC (permalink / raw)
  To: linux-arm-kernel

On 14/11/2017 14:54, M?ns Rullg?rd wrote:

> Marc Gonzalez writes:
> 
>> On 14/11/2017 13:40, M?ns Rullg?rd wrote:
>>
>>> Marc Gonzalez wrote:
>>>
>>>> Power entire ethernet block down in ndo_stop().
>>>> Power it back up in ndo_open() and perform HW init.
>>>> Delete nb8800_dma_stop.
>>>
>>> Leave it alone, please.  Not all chips might have a separate power
>>> domain for this.  Also, it works just fine on the older chips.
>>
>> There is no need for separate power domains. The ethernet block is
>> clock-gated when it is held in reset.
> 
> So you're not powering it down then.  Please be accurate.

Smirk. That looks like trolling.


>> The reset register is implemented on all tango3, tango4, tango5 chips.
> 
> It's still not a core feature.

Correct. But it covers 100% of all chips using this driver.
There is no point in trying to implement support for chips that
have never existed, do not exist, and never will.


>> nb8800_dma_stop() is a hack.
> 
> The hack originated from your company.

So why are you so insistent that we keep using it?


>> The HW dev has stated that it is not supported.  One cannot conclude
>> that it "works fine" just because you've never triggered the error
>> condition. (On tango5, the error condition triggers systematically.)
> 
> That sounds like a problem for tango5.

tango5 does have its share of issues.


> Also, I have repeated asked you what happens if the tango5 runs out of
> DMA buffers under normal operation.  Does that also cause it to lock up?
> If so, you have a much bigger problem on your hands.

I will run iperf3 tests with RX_DESC_COUNT lowered to 2.
Would that produce conclusive results?
Do you have other suggestions?


>> We have several customer bug reports on tango3 and tango4 chips complaining
>> about "broken" ethernet after a link down / link up cycle. They are using a
>> different driver, but it implements the same hack in enet_stop_rx().
>> There is a high probability that the DMA hack is responsible, and wedged the
>> RX DMA state machine.
> 
> But you have no idea what's really the problem?

I have an idea that enet_stop_rx() wedged the RX DMA state machine.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 3/4] net: nb8800: Move HW init to ndo_open()
  2017-11-14 16:41           ` Marc Gonzalez
@ 2017-11-14 16:55             ` Måns Rullgård
  -1 siblings, 0 replies; 70+ messages in thread
From: Måns Rullgård @ 2017-11-14 16:55 UTC (permalink / raw)
  To: Marc Gonzalez
  Cc: David Miller, netdev, Linux ARM, Florian Fainelli, Thibaud Cornic, Mason

Marc Gonzalez <marc_gonzalez@sigmadesigns.com> writes:

> On 14/11/2017 14:54, Måns Rullgård wrote:
>
>> Marc Gonzalez writes:
>> 
>>> On 14/11/2017 13:40, Måns Rullgård wrote:
>>>
>>>> Marc Gonzalez wrote:
>>>>
>>>>> Power entire ethernet block down in ndo_stop().
>>>>> Power it back up in ndo_open() and perform HW init.
>>>>> Delete nb8800_dma_stop.
>>>>
>>>> Leave it alone, please.  Not all chips might have a separate power
>>>> domain for this.  Also, it works just fine on the older chips.
>>>
>>> There is no need for separate power domains. The ethernet block is
>>> clock-gated when it is held in reset.
>> 
>> So you're not powering it down then.  Please be accurate.
>
> Smirk. That looks like trolling.
>
>>> The reset register is implemented on all tango3, tango4, tango5 chips.
>> 
>> It's still not a core feature.
>
> Correct. But it covers 100% of all chips using this driver.
> There is no point in trying to implement support for chips that
> have never existed, do not exist, and never will.

You can't know that.

>>> nb8800_dma_stop() is a hack.
>> 
>> The hack originated from your company.
>
> So why are you so insistent that we keep using it?

Because it's the only way to support some chip variants.  Ones you'd
apparently rather forget, but which nonetheless exist.

>> Also, I have repeated asked you what happens if the tango5 runs out of
>> DMA buffers under normal operation.  Does that also cause it to lock up?
>> If so, you have a much bigger problem on your hands.
>
> I will run iperf3 tests with RX_DESC_COUNT lowered to 2.
> Would that produce conclusive results?
> Do you have other suggestions?

Leave RX_DESC_COUNT alone but add a delay in the nb8800_poll() loop.
That should ensure that queue is drained slowly enough for the buffers
to run out.

-- 
Måns Rullgård

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH v3 3/4] net: nb8800: Move HW init to ndo_open()
@ 2017-11-14 16:55             ` Måns Rullgård
  0 siblings, 0 replies; 70+ messages in thread
From: Måns Rullgård @ 2017-11-14 16:55 UTC (permalink / raw)
  To: linux-arm-kernel

Marc Gonzalez <marc_gonzalez@sigmadesigns.com> writes:

> On 14/11/2017 14:54, M?ns Rullg?rd wrote:
>
>> Marc Gonzalez writes:
>> 
>>> On 14/11/2017 13:40, M?ns Rullg?rd wrote:
>>>
>>>> Marc Gonzalez wrote:
>>>>
>>>>> Power entire ethernet block down in ndo_stop().
>>>>> Power it back up in ndo_open() and perform HW init.
>>>>> Delete nb8800_dma_stop.
>>>>
>>>> Leave it alone, please.  Not all chips might have a separate power
>>>> domain for this.  Also, it works just fine on the older chips.
>>>
>>> There is no need for separate power domains. The ethernet block is
>>> clock-gated when it is held in reset.
>> 
>> So you're not powering it down then.  Please be accurate.
>
> Smirk. That looks like trolling.
>
>>> The reset register is implemented on all tango3, tango4, tango5 chips.
>> 
>> It's still not a core feature.
>
> Correct. But it covers 100% of all chips using this driver.
> There is no point in trying to implement support for chips that
> have never existed, do not exist, and never will.

You can't know that.

>>> nb8800_dma_stop() is a hack.
>> 
>> The hack originated from your company.
>
> So why are you so insistent that we keep using it?

Because it's the only way to support some chip variants.  Ones you'd
apparently rather forget, but which nonetheless exist.

>> Also, I have repeated asked you what happens if the tango5 runs out of
>> DMA buffers under normal operation.  Does that also cause it to lock up?
>> If so, you have a much bigger problem on your hands.
>
> I will run iperf3 tests with RX_DESC_COUNT lowered to 2.
> Would that produce conclusive results?
> Do you have other suggestions?

Leave RX_DESC_COUNT alone but add a delay in the nb8800_poll() loop.
That should ensure that queue is drained slowly enough for the buffers
to run out.

-- 
M?ns Rullg?rd

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 3/4] net: nb8800: Move HW init to ndo_open()
  2017-11-14 16:55             ` Måns Rullgård
@ 2017-11-14 17:07               ` Marc Gonzalez
  -1 siblings, 0 replies; 70+ messages in thread
From: Marc Gonzalez @ 2017-11-14 17:07 UTC (permalink / raw)
  To: Mans Rullgard
  Cc: David Miller, netdev, Linux ARM, Florian Fainelli, Thibaud Cornic, Mason

On 14/11/2017 17:55, Måns Rullgård wrote:

> Marc Gonzalez wrote:
> 
>> On 14/11/2017 14:54, Måns Rullgård wrote:
>>
>>> The hack originated from your company.
>>
>> So why are you so insistent that we keep using it?
> 
> Because it's the only way to support some chip variants.  Ones you'd
> apparently rather forget, but which nonetheless exist.

Which chip variants do you have in mind?
All tango3 board supports the reset method.

BTW, could you test my patch series on your board?
(I can't since the board is not supported upstream.)

>>> Also, I have repeated asked you what happens if the tango5 runs out of
>>> DMA buffers under normal operation.  Does that also cause it to lock up?
>>> If so, you have a much bigger problem on your hands.
>>
>> I will run iperf3 tests with RX_DESC_COUNT lowered to 2.
>> Would that produce conclusive results?
>> Do you have other suggestions?
> 
> Leave RX_DESC_COUNT alone but add a delay in the nb8800_poll() loop.
> That should ensure that queue is drained slowly enough for the buffers
> to run out.

OK. I will test this ASAP on tango4 and tango5.

I'll try finding a tango3 board, to check the nb8800_mdio_cmd()
quirk and the flow_control quirk. Again, the HW dev said they
are not needed, so if he's wrong, his credibility his shot.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH v3 3/4] net: nb8800: Move HW init to ndo_open()
@ 2017-11-14 17:07               ` Marc Gonzalez
  0 siblings, 0 replies; 70+ messages in thread
From: Marc Gonzalez @ 2017-11-14 17:07 UTC (permalink / raw)
  To: linux-arm-kernel

On 14/11/2017 17:55, M?ns Rullg?rd wrote:

> Marc Gonzalez wrote:
> 
>> On 14/11/2017 14:54, M?ns Rullg?rd wrote:
>>
>>> The hack originated from your company.
>>
>> So why are you so insistent that we keep using it?
> 
> Because it's the only way to support some chip variants.  Ones you'd
> apparently rather forget, but which nonetheless exist.

Which chip variants do you have in mind?
All tango3 board supports the reset method.

BTW, could you test my patch series on your board?
(I can't since the board is not supported upstream.)

>>> Also, I have repeated asked you what happens if the tango5 runs out of
>>> DMA buffers under normal operation.  Does that also cause it to lock up?
>>> If so, you have a much bigger problem on your hands.
>>
>> I will run iperf3 tests with RX_DESC_COUNT lowered to 2.
>> Would that produce conclusive results?
>> Do you have other suggestions?
> 
> Leave RX_DESC_COUNT alone but add a delay in the nb8800_poll() loop.
> That should ensure that queue is drained slowly enough for the buffers
> to run out.

OK. I will test this ASAP on tango4 and tango5.

I'll try finding a tango3 board, to check the nb8800_mdio_cmd()
quirk and the flow_control quirk. Again, the HW dev said they
are not needed, so if he's wrong, his credibility his shot.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 4/4] net: nb8800: Add support for suspend/resume
  2017-11-14 16:31         ` Andrew Lunn
@ 2017-11-14 17:08           ` Marc Gonzalez
  -1 siblings, 0 replies; 70+ messages in thread
From: Marc Gonzalez @ 2017-11-14 17:08 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Mans Rullgard, Florian Fainelli, Mason, netdev, Thibaud Cornic,
	David Miller, Linux ARM

On 14/11/2017 17:31, Andrew Lunn wrote:

> On Tue, Nov 14, 2017 at 03:22:04PM +0100, Marc Gonzalez wrote:
> 
>> On 14/11/2017 14:02, Måns Rullgård wrote:
>>
>>> Missing patch description.  Don't bother though.  I won't approve of
>>> this implementation.
>>
>> I hope I can convince David that you should not have veto power over
>> this driver just because it was you who submitted it upstream first.
> 
> In practice, anybody can request a veto on a patch. If a patch
> introduces a regression, and somebody reports it, that is pretty much
> an automatic veto.
> 
> Please work with Mans to ensure you are not breaking the driver for
> the hardware he cares about.

FWIW, removing generic support ("aurora,nb8800") does not break any existing hardware.

The ethernet controller in the board Mans uses is supported by "sigma,smp8642-ethernet".

I can't test my exact changes on a similar board because support for that board
is not (yet?) upstream.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH v3 4/4] net: nb8800: Add support for suspend/resume
@ 2017-11-14 17:08           ` Marc Gonzalez
  0 siblings, 0 replies; 70+ messages in thread
From: Marc Gonzalez @ 2017-11-14 17:08 UTC (permalink / raw)
  To: linux-arm-kernel

On 14/11/2017 17:31, Andrew Lunn wrote:

> On Tue, Nov 14, 2017 at 03:22:04PM +0100, Marc Gonzalez wrote:
> 
>> On 14/11/2017 14:02, M?ns Rullg?rd wrote:
>>
>>> Missing patch description.  Don't bother though.  I won't approve of
>>> this implementation.
>>
>> I hope I can convince David that you should not have veto power over
>> this driver just because it was you who submitted it upstream first.
> 
> In practice, anybody can request a veto on a patch. If a patch
> introduces a regression, and somebody reports it, that is pretty much
> an automatic veto.
> 
> Please work with Mans to ensure you are not breaking the driver for
> the hardware he cares about.

FWIW, removing generic support ("aurora,nb8800") does not break any existing hardware.

The ethernet controller in the board Mans uses is supported by "sigma,smp8642-ethernet".

I can't test my exact changes on a similar board because support for that board
is not (yet?) upstream.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 4/4] net: nb8800: Add support for suspend/resume
  2017-11-14 17:08           ` Marc Gonzalez
@ 2017-11-14 17:33             ` Andrew Lunn
  -1 siblings, 0 replies; 70+ messages in thread
From: Andrew Lunn @ 2017-11-14 17:33 UTC (permalink / raw)
  To: Marc Gonzalez
  Cc: Mans Rullgard, Florian Fainelli, Mason, netdev, Thibaud Cornic,
	David Miller, Linux ARM

On Tue, Nov 14, 2017 at 06:08:47PM +0100, Marc Gonzalez wrote:
> On 14/11/2017 17:31, Andrew Lunn wrote:
> 
> > On Tue, Nov 14, 2017 at 03:22:04PM +0100, Marc Gonzalez wrote:
> > 
> >> On 14/11/2017 14:02, Måns Rullgård wrote:
> >>
> >>> Missing patch description.  Don't bother though.  I won't approve of
> >>> this implementation.
> >>
> >> I hope I can convince David that you should not have veto power over
> >> this driver just because it was you who submitted it upstream first.
> > 
> > In practice, anybody can request a veto on a patch. If a patch
> > introduces a regression, and somebody reports it, that is pretty much
> > an automatic veto.
> > 
> > Please work with Mans to ensure you are not breaking the driver for
> > the hardware he cares about.
> 
> FWIW, removing generic support ("aurora,nb8800") does not break any existing hardware.

Hi Marc

I also did a quick search, and no board appears to use this, at the
moment.

But it does appear that the changes to the pause configuration while
the DMA is running will break stuff. So there appears to be a
legitimate reason for that patch to get a NACK.

	   Andrew

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH v3 4/4] net: nb8800: Add support for suspend/resume
@ 2017-11-14 17:33             ` Andrew Lunn
  0 siblings, 0 replies; 70+ messages in thread
From: Andrew Lunn @ 2017-11-14 17:33 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Nov 14, 2017 at 06:08:47PM +0100, Marc Gonzalez wrote:
> On 14/11/2017 17:31, Andrew Lunn wrote:
> 
> > On Tue, Nov 14, 2017 at 03:22:04PM +0100, Marc Gonzalez wrote:
> > 
> >> On 14/11/2017 14:02, M?ns Rullg?rd wrote:
> >>
> >>> Missing patch description.  Don't bother though.  I won't approve of
> >>> this implementation.
> >>
> >> I hope I can convince David that you should not have veto power over
> >> this driver just because it was you who submitted it upstream first.
> > 
> > In practice, anybody can request a veto on a patch. If a patch
> > introduces a regression, and somebody reports it, that is pretty much
> > an automatic veto.
> > 
> > Please work with Mans to ensure you are not breaking the driver for
> > the hardware he cares about.
> 
> FWIW, removing generic support ("aurora,nb8800") does not break any existing hardware.

Hi Marc

I also did a quick search, and no board appears to use this, at the
moment.

But it does appear that the changes to the pause configuration while
the DMA is running will break stuff. So there appears to be a
legitimate reason for that patch to get a NACK.

	   Andrew

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 2/4] net: nb8800: Simplify nb8800_pause_config()
  2017-11-14 13:22         ` Måns Rullgård
@ 2017-11-15 10:53           ` Marc Gonzalez
  -1 siblings, 0 replies; 70+ messages in thread
From: Marc Gonzalez @ 2017-11-15 10:53 UTC (permalink / raw)
  To: Mans Rullgard
  Cc: David Miller, netdev, Linux ARM, Florian Fainelli, Thibaud Cornic, Mason

On 14/11/2017 14:22, Måns Rullgård wrote:

> Marc Gonzalez wrote:
> 
>> On 14/11/2017 13:38, Måns Rullgård wrote:
>>
>>> Marc Gonzalez writes:
>>>
>>>> The "flow control enable" bit can be tweaked, even if DMA is enabled.
>>>
>>> No, it can't.  Maybe on some of your newer chips it can, but not on the
>>> older ones.
>>
>> Again, I suppose you are referring to your SMP8642-based board.
>>
>> Are you saying you tested this patch, and it doesn't work?
>> What are the symptoms?
> 
> I didn't test that patch per se.  I did initially try simply changing
> that bit, and this didn't work.  Either it had no effect, or it locked
> up the controller, I forget which.  The manual explicitly states that
> writes are forbidden while the DMA enabled bit is set.
> 
> If shutting down the DMA really isn't possible, I would rather just
> allow changing the flow control setting only when the interface is
> stopped.  The normal case of getting the initial setting from the
> auto-negotiation will still work properly.  It just won't be possible to
> change it while the link is active.

Hello Mans,

With the default setup,

	priv->pause_aneg = true;
	priv->pause_rx = true;
	priv->pause_tx = true;

even the very first call to nb8800_pause_config() occurs with netif_running=1
as well as the RX DMA bit enabled.

buildroot login: root
# ip addr add 172.27.64.23/18 brd 172.27.127.255 dev eth0
# ip link set eth0 up
[   25.771000] ENTER nb8800_pause_config: netif_running=1
[   25.776195] CPU: 0 PID: 193 Comm: kworker/0:1 Not tainted 4.9.54-1-rc6 #18
[   25.783102] Hardware name: Sigma Tango DT
[   25.787138] Workqueue: events_power_efficient phy_state_machine
[   25.793107] [<c010f290>] (unwind_backtrace) from [<c010af34>] (show_stack+0x10/0x14)
[   25.800896] [<c010af34>] (show_stack) from [<c02d06e8>] (dump_stack+0x88/0x9c)
[   25.808160] [<c02d06e8>] (dump_stack) from [<c03967e4>] (nb8800_pause_config+0x28/0xf0)
[   25.816208] [<c03967e4>] (nb8800_pause_config) from [<c03969e0>] (nb8800_link_reconfigure+0x134/0x148)
[   25.825565] [<c03969e0>] (nb8800_link_reconfigure) from [<c0391d84>] (phy_state_machine+0x348/0x468)
[   25.834750] [<c0391d84>] (phy_state_machine) from [<c012e858>] (process_one_work+0x1d8/0x3f0)
[   25.843323] [<c012e858>] (process_one_work) from [<c012f6a4>] (worker_thread+0x4c/0x560)
[   25.851459] [<c012f6a4>] (worker_thread) from [<c0134354>] (kthread+0xec/0xf4)
[   25.858721] [<c0134354>] (kthread) from [<c01078f8>] (ret_from_fork+0x14/0x3c)
[   25.874597] nb8800 26000.ethernet eth0: Link is Up - 100Mbps/Full - flow control rx/tx


This makes sense, since nb8800_open() calls nb8800_start_rx()
before phy_start().

Moving nb8800_start_rx() to after nb8800_pause_config() does
appear to work, but I'm not sure it is the correct action?

FWIW, this is the patch I'm testing:

diff --git a/drivers/net/ethernet/aurora/nb8800.c b/drivers/net/ethernet/aurora/nb8800.c
index da6e8bdbb77a..86081632346e 100644
--- a/drivers/net/ethernet/aurora/nb8800.c
+++ b/drivers/net/ethernet/aurora/nb8800.c
@@ -667,6 +667,7 @@ static void nb8800_link_reconfigure(struct net_device *dev)
                        nb8800_mac_config(dev);
 
                nb8800_pause_config(dev);
+               nb8800_start_rx(dev);
        }
 
        if (phydev->link != priv->link) {
@@ -918,7 +919,6 @@ static int nb8800_open(struct net_device *dev)
        napi_enable(&priv->napi);
        netif_start_queue(dev);
 
-       nb8800_start_rx(dev);
        phy_start(phydev);
 
        return 0;

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v3 2/4] net: nb8800: Simplify nb8800_pause_config()
@ 2017-11-15 10:53           ` Marc Gonzalez
  0 siblings, 0 replies; 70+ messages in thread
From: Marc Gonzalez @ 2017-11-15 10:53 UTC (permalink / raw)
  To: linux-arm-kernel

On 14/11/2017 14:22, M?ns Rullg?rd wrote:

> Marc Gonzalez wrote:
> 
>> On 14/11/2017 13:38, M?ns Rullg?rd wrote:
>>
>>> Marc Gonzalez writes:
>>>
>>>> The "flow control enable" bit can be tweaked, even if DMA is enabled.
>>>
>>> No, it can't.  Maybe on some of your newer chips it can, but not on the
>>> older ones.
>>
>> Again, I suppose you are referring to your SMP8642-based board.
>>
>> Are you saying you tested this patch, and it doesn't work?
>> What are the symptoms?
> 
> I didn't test that patch per se.  I did initially try simply changing
> that bit, and this didn't work.  Either it had no effect, or it locked
> up the controller, I forget which.  The manual explicitly states that
> writes are forbidden while the DMA enabled bit is set.
> 
> If shutting down the DMA really isn't possible, I would rather just
> allow changing the flow control setting only when the interface is
> stopped.  The normal case of getting the initial setting from the
> auto-negotiation will still work properly.  It just won't be possible to
> change it while the link is active.

Hello Mans,

With the default setup,

	priv->pause_aneg = true;
	priv->pause_rx = true;
	priv->pause_tx = true;

even the very first call to nb8800_pause_config() occurs with netif_running=1
as well as the RX DMA bit enabled.

buildroot login: root
# ip addr add 172.27.64.23/18 brd 172.27.127.255 dev eth0
# ip link set eth0 up
[   25.771000] ENTER nb8800_pause_config: netif_running=1
[   25.776195] CPU: 0 PID: 193 Comm: kworker/0:1 Not tainted 4.9.54-1-rc6 #18
[   25.783102] Hardware name: Sigma Tango DT
[   25.787138] Workqueue: events_power_efficient phy_state_machine
[   25.793107] [<c010f290>] (unwind_backtrace) from [<c010af34>] (show_stack+0x10/0x14)
[   25.800896] [<c010af34>] (show_stack) from [<c02d06e8>] (dump_stack+0x88/0x9c)
[   25.808160] [<c02d06e8>] (dump_stack) from [<c03967e4>] (nb8800_pause_config+0x28/0xf0)
[   25.816208] [<c03967e4>] (nb8800_pause_config) from [<c03969e0>] (nb8800_link_reconfigure+0x134/0x148)
[   25.825565] [<c03969e0>] (nb8800_link_reconfigure) from [<c0391d84>] (phy_state_machine+0x348/0x468)
[   25.834750] [<c0391d84>] (phy_state_machine) from [<c012e858>] (process_one_work+0x1d8/0x3f0)
[   25.843323] [<c012e858>] (process_one_work) from [<c012f6a4>] (worker_thread+0x4c/0x560)
[   25.851459] [<c012f6a4>] (worker_thread) from [<c0134354>] (kthread+0xec/0xf4)
[   25.858721] [<c0134354>] (kthread) from [<c01078f8>] (ret_from_fork+0x14/0x3c)
[   25.874597] nb8800 26000.ethernet eth0: Link is Up - 100Mbps/Full - flow control rx/tx


This makes sense, since nb8800_open() calls nb8800_start_rx()
before phy_start().

Moving nb8800_start_rx() to after nb8800_pause_config() does
appear to work, but I'm not sure it is the correct action?

FWIW, this is the patch I'm testing:

diff --git a/drivers/net/ethernet/aurora/nb8800.c b/drivers/net/ethernet/aurora/nb8800.c
index da6e8bdbb77a..86081632346e 100644
--- a/drivers/net/ethernet/aurora/nb8800.c
+++ b/drivers/net/ethernet/aurora/nb8800.c
@@ -667,6 +667,7 @@ static void nb8800_link_reconfigure(struct net_device *dev)
                        nb8800_mac_config(dev);
 
                nb8800_pause_config(dev);
+               nb8800_start_rx(dev);
        }
 
        if (phydev->link != priv->link) {
@@ -918,7 +919,6 @@ static int nb8800_open(struct net_device *dev)
        napi_enable(&priv->napi);
        netif_start_queue(dev);
 
-       nb8800_start_rx(dev);
        phy_start(phydev);
 
        return 0;

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 2/4] net: nb8800: Simplify nb8800_pause_config()
  2017-11-15 10:53           ` Marc Gonzalez
@ 2017-11-15 14:17             ` Andrew Lunn
  -1 siblings, 0 replies; 70+ messages in thread
From: Andrew Lunn @ 2017-11-15 14:17 UTC (permalink / raw)
  To: Marc Gonzalez
  Cc: Mans Rullgard, Florian Fainelli, Mason, netdev, Thibaud Cornic,
	David Miller, Linux ARM

On Wed, Nov 15, 2017 at 11:53:23AM +0100, Marc Gonzalez wrote:
> On 14/11/2017 14:22, Måns Rullgård wrote:
> 
> > Marc Gonzalez wrote:
> > 
> >> On 14/11/2017 13:38, Måns Rullgård wrote:
> >>
> >>> Marc Gonzalez writes:
> >>>
> >>>> The "flow control enable" bit can be tweaked, even if DMA is enabled.
> >>>
> >>> No, it can't.  Maybe on some of your newer chips it can, but not on the
> >>> older ones.
> >>
> >> Again, I suppose you are referring to your SMP8642-based board.
> >>
> >> Are you saying you tested this patch, and it doesn't work?
> >> What are the symptoms?
> > 
> > I didn't test that patch per se.  I did initially try simply changing
> > that bit, and this didn't work.  Either it had no effect, or it locked
> > up the controller, I forget which.  The manual explicitly states that
> > writes are forbidden while the DMA enabled bit is set.
> > 
> > If shutting down the DMA really isn't possible, I would rather just
> > allow changing the flow control setting only when the interface is
> > stopped.  The normal case of getting the initial setting from the
> > auto-negotiation will still work properly.  It just won't be possible to
> > change it while the link is active.
> 
> Hello Mans,
> 
> With the default setup,
> 
> 	priv->pause_aneg = true;
> 	priv->pause_rx = true;
> 	priv->pause_tx = true;

Hi Marc

So you are assuming the peer supports pause? Is that a safe assumption
to make? Should you not have it disabled until auto-neg has completed
and you then know what the peer can do?
> 
> even the very first call to nb8800_pause_config() occurs with netif_running=1
> as well as the RX DMA bit enabled.
> 
> buildroot login: root
> # ip addr add 172.27.64.23/18 brd 172.27.127.255 dev eth0
> # ip link set eth0 up
> [   25.771000] ENTER nb8800_pause_config: netif_running=1
> [   25.776195] CPU: 0 PID: 193 Comm: kworker/0:1 Not tainted 4.9.54-1-rc6 #18
> [   25.783102] Hardware name: Sigma Tango DT
> [   25.787138] Workqueue: events_power_efficient phy_state_machine
> [   25.793107] [<c010f290>] (unwind_backtrace) from [<c010af34>] (show_stack+0x10/0x14)
> [   25.800896] [<c010af34>] (show_stack) from [<c02d06e8>] (dump_stack+0x88/0x9c)
> [   25.808160] [<c02d06e8>] (dump_stack) from [<c03967e4>] (nb8800_pause_config+0x28/0xf0)
> [   25.816208] [<c03967e4>] (nb8800_pause_config) from [<c03969e0>] (nb8800_link_reconfigure+0x134/0x148)
> [   25.825565] [<c03969e0>] (nb8800_link_reconfigure) from [<c0391d84>] (phy_state_machine+0x348/0x468)
> [   25.834750] [<c0391d84>] (phy_state_machine) from [<c012e858>] (process_one_work+0x1d8/0x3f0)
> [   25.843323] [<c012e858>] (process_one_work) from [<c012f6a4>] (worker_thread+0x4c/0x560)
> [   25.851459] [<c012f6a4>] (worker_thread) from [<c0134354>] (kthread+0xec/0xf4)
> [   25.858721] [<c0134354>] (kthread) from [<c01078f8>] (ret_from_fork+0x14/0x3c)
> [   25.874597] nb8800 26000.ethernet eth0: Link is Up - 100Mbps/Full - flow control rx/tx
> 
> 
> This makes sense, since nb8800_open() calls nb8800_start_rx()
> before phy_start().
> 
> Moving nb8800_start_rx() to after nb8800_pause_config() does
> appear to work, but I'm not sure it is the correct action?

nb8800_link_reconfigure() can be called whenever the link to the peer
changes. auto-neg may happen later because the cable was not plugged
in until later, etc.

   Andrew

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH v3 2/4] net: nb8800: Simplify nb8800_pause_config()
@ 2017-11-15 14:17             ` Andrew Lunn
  0 siblings, 0 replies; 70+ messages in thread
From: Andrew Lunn @ 2017-11-15 14:17 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Nov 15, 2017 at 11:53:23AM +0100, Marc Gonzalez wrote:
> On 14/11/2017 14:22, M?ns Rullg?rd wrote:
> 
> > Marc Gonzalez wrote:
> > 
> >> On 14/11/2017 13:38, M?ns Rullg?rd wrote:
> >>
> >>> Marc Gonzalez writes:
> >>>
> >>>> The "flow control enable" bit can be tweaked, even if DMA is enabled.
> >>>
> >>> No, it can't.  Maybe on some of your newer chips it can, but not on the
> >>> older ones.
> >>
> >> Again, I suppose you are referring to your SMP8642-based board.
> >>
> >> Are you saying you tested this patch, and it doesn't work?
> >> What are the symptoms?
> > 
> > I didn't test that patch per se.  I did initially try simply changing
> > that bit, and this didn't work.  Either it had no effect, or it locked
> > up the controller, I forget which.  The manual explicitly states that
> > writes are forbidden while the DMA enabled bit is set.
> > 
> > If shutting down the DMA really isn't possible, I would rather just
> > allow changing the flow control setting only when the interface is
> > stopped.  The normal case of getting the initial setting from the
> > auto-negotiation will still work properly.  It just won't be possible to
> > change it while the link is active.
> 
> Hello Mans,
> 
> With the default setup,
> 
> 	priv->pause_aneg = true;
> 	priv->pause_rx = true;
> 	priv->pause_tx = true;

Hi Marc

So you are assuming the peer supports pause? Is that a safe assumption
to make? Should you not have it disabled until auto-neg has completed
and you then know what the peer can do?
> 
> even the very first call to nb8800_pause_config() occurs with netif_running=1
> as well as the RX DMA bit enabled.
> 
> buildroot login: root
> # ip addr add 172.27.64.23/18 brd 172.27.127.255 dev eth0
> # ip link set eth0 up
> [   25.771000] ENTER nb8800_pause_config: netif_running=1
> [   25.776195] CPU: 0 PID: 193 Comm: kworker/0:1 Not tainted 4.9.54-1-rc6 #18
> [   25.783102] Hardware name: Sigma Tango DT
> [   25.787138] Workqueue: events_power_efficient phy_state_machine
> [   25.793107] [<c010f290>] (unwind_backtrace) from [<c010af34>] (show_stack+0x10/0x14)
> [   25.800896] [<c010af34>] (show_stack) from [<c02d06e8>] (dump_stack+0x88/0x9c)
> [   25.808160] [<c02d06e8>] (dump_stack) from [<c03967e4>] (nb8800_pause_config+0x28/0xf0)
> [   25.816208] [<c03967e4>] (nb8800_pause_config) from [<c03969e0>] (nb8800_link_reconfigure+0x134/0x148)
> [   25.825565] [<c03969e0>] (nb8800_link_reconfigure) from [<c0391d84>] (phy_state_machine+0x348/0x468)
> [   25.834750] [<c0391d84>] (phy_state_machine) from [<c012e858>] (process_one_work+0x1d8/0x3f0)
> [   25.843323] [<c012e858>] (process_one_work) from [<c012f6a4>] (worker_thread+0x4c/0x560)
> [   25.851459] [<c012f6a4>] (worker_thread) from [<c0134354>] (kthread+0xec/0xf4)
> [   25.858721] [<c0134354>] (kthread) from [<c01078f8>] (ret_from_fork+0x14/0x3c)
> [   25.874597] nb8800 26000.ethernet eth0: Link is Up - 100Mbps/Full - flow control rx/tx
> 
> 
> This makes sense, since nb8800_open() calls nb8800_start_rx()
> before phy_start().
> 
> Moving nb8800_start_rx() to after nb8800_pause_config() does
> appear to work, but I'm not sure it is the correct action?

nb8800_link_reconfigure() can be called whenever the link to the peer
changes. auto-neg may happen later because the cable was not plugged
in until later, etc.

   Andrew

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 2/4] net: nb8800: Simplify nb8800_pause_config()
  2017-11-15 14:17             ` Andrew Lunn
@ 2017-11-15 14:33               ` Marc Gonzalez
  -1 siblings, 0 replies; 70+ messages in thread
From: Marc Gonzalez @ 2017-11-15 14:33 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Mans Rullgard, Florian Fainelli, Mason, netdev, Thibaud Cornic,
	David Miller, Linux ARM

On 15/11/2017 15:17, Andrew Lunn wrote:

> nb8800_link_reconfigure() can be called whenever the link to the peer
> changes. auto-neg may happen later because the cable was not plugged
> in until later, etc.

Hello Andrew,

AFAICS, Mans was right: trying to toggle the flow control bit when
the RX DMA bit is already set "breaks" the HW (ping times out).

Thus, I will drop patch 2/4.

In our local branch, I have completely disabled flow control support,
so I don't have to worry about this problem.

Regards.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH v3 2/4] net: nb8800: Simplify nb8800_pause_config()
@ 2017-11-15 14:33               ` Marc Gonzalez
  0 siblings, 0 replies; 70+ messages in thread
From: Marc Gonzalez @ 2017-11-15 14:33 UTC (permalink / raw)
  To: linux-arm-kernel

On 15/11/2017 15:17, Andrew Lunn wrote:

> nb8800_link_reconfigure() can be called whenever the link to the peer
> changes. auto-neg may happen later because the cable was not plugged
> in until later, etc.

Hello Andrew,

AFAICS, Mans was right: trying to toggle the flow control bit when
the RX DMA bit is already set "breaks" the HW (ping times out).

Thus, I will drop patch 2/4.

In our local branch, I have completely disabled flow control support,
so I don't have to worry about this problem.

Regards.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 3/4] net: nb8800: Move HW init to ndo_open()
  2017-11-14 16:55             ` Måns Rullgård
@ 2017-11-15 14:58               ` Marc Gonzalez
  -1 siblings, 0 replies; 70+ messages in thread
From: Marc Gonzalez @ 2017-11-15 14:58 UTC (permalink / raw)
  To: netdev
  Cc: Mans Rullgard, David Miller, Linux ARM, Florian Fainelli,
	Thibaud Cornic, Mason

On 14/11/2017 17:55, Måns Rullgård wrote:

> Marc Gonzalez wrote:
> 
>> I will run iperf3 tests with RX_DESC_COUNT lowered to 2.
>> Would that produce conclusive results?
>> Do you have other suggestions?
> 
> Leave RX_DESC_COUNT alone but add a delay in the nb8800_poll() loop.
> That should ensure that queue is drained slowly enough for the buffers
> to run out.

Using the following patch:

diff --git a/drivers/net/ethernet/aurora/nb8800.c b/drivers/net/ethernet/aurora/nb8800.c
index 31c3f0f10fbb..646300bb53b6 100644
--- a/drivers/net/ethernet/aurora/nb8800.c
+++ b/drivers/net/ethernet/aurora/nb8800.c
@@ -317,6 +317,7 @@ static int nb8800_poll(struct napi_struct *napi, int budget)
 
                len = RX_BYTES_TRANSFERRED(rxd->report);
 
+               udelay(200);
                if (IS_RX_ERROR(rxd->report))
                        nb8800_rx_error(dev, rxd->report);
                else
@@ -1416,9 +1417,9 @@ static int nb8800_probe(struct platform_device *pdev)
        netdev_info(dev, "MAC address %pM\n", dev->dev_addr);
 
        /* Auto-negotiate by default */
-       priv->pause_aneg = true;
-       priv->pause_rx = true;
-       priv->pause_tx = true;
+       priv->pause_aneg = false;
+       priv->pause_rx = false;
+       priv->pause_tx = false;
 
        priv->ops = match->data;
        priv->ops->power_down(dev);
diff --git a/drivers/net/ethernet/aurora/nb8800.h b/drivers/net/ethernet/aurora/nb8800.h
index 23fefca54804..9b59ea776e4a 100644
--- a/drivers/net/ethernet/aurora/nb8800.h
+++ b/drivers/net/ethernet/aurora/nb8800.h
@@ -7,7 +7,7 @@
 #include <linux/clk.h>
 #include <linux/bitops.h>
 
-#define RX_DESC_COUNT                  256
+#define RX_DESC_COUNT                  16
 #define TX_DESC_COUNT                  256
 
 #define NB8800_DESC_LOW                        4


I saw both tango4 and tango5 wedged... :-(

tango5:

[  5] local 172.27.64.23 port 5201 connected to 172.27.64.1 port 57415
[ ID] Interval           Transfer     Bandwidth       Jitter    Lost/Total Datagrams
[  5]   0.00-1.00   sec  6.33 MBytes  53.1 Mbits/sec  0.030 ms  73897/78442 (94%)  
[  5]   1.00-2.00   sec  6.59 MBytes  55.3 Mbits/sec  0.040 ms  76944/81675 (94%)  
[  5]   2.00-3.00   sec  6.59 MBytes  55.3 Mbits/sec  0.036 ms  76944/81675 (94%)  
[  5]   3.00-4.00   sec  6.59 MBytes  55.3 Mbits/sec  0.038 ms  76976/81708 (94%)  
[  5]   4.00-5.00   sec  6.59 MBytes  55.3 Mbits/sec  0.038 ms  76976/81708 (94%)  
[  5]   5.00-6.00   sec  6.59 MBytes  55.3 Mbits/sec  0.039 ms  76944/81675 (94%)  
[  5]   6.00-7.00   sec  6.59 MBytes  55.3 Mbits/sec  0.042 ms  76941/81674 (94%)  
[  5]   7.00-8.00   sec  6.59 MBytes  55.3 Mbits/sec  0.048 ms  76976/81707 (94%)  
[  5]   8.00-9.00   sec  6.59 MBytes  55.3 Mbits/sec  0.041 ms  76928/81660 (94%)  
[  5]   9.00-10.00  sec  6.59 MBytes  55.3 Mbits/sec  0.039 ms  76960/81692 (94%)  
[  5]  10.00-10.04  sec   279 KBytes  55.1 Mbits/sec  0.094 ms  3088/3284 (94%)  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Jitter    Lost/Total Datagrams
[  5]   0.00-10.04  sec  0.00 Bytes  0.00 bits/sec  0.094 ms  769574/816900 (94%)  
^Ciperf3: interrupt - the server has terminated
# ping -c 10 172.27.64.1
PING 172.27.64.1 (172.27.64.1): 56 data bytes

--- 172.27.64.1 ping statistics ---
10 packets transmitted, 0 packets received, 100% packet loss


tango4:

[  5] local 172.27.64.23 port 5201 connected to 172.27.64.1 port 52983
[ ID] Interval           Transfer     Bandwidth       Jitter    Lost/Total Datagrams
[  5]   0.00-1.00   sec  6.24 MBytes  52.3 Mbits/sec  0.035 ms  74019/78498 (94%)  
[  5]   1.00-2.00   sec  6.51 MBytes  54.6 Mbits/sec  0.046 ms  77024/81702 (94%)  
[  5]   2.00-3.00   sec  6.51 MBytes  54.7 Mbits/sec  0.050 ms  77024/81703 (94%)  
[  5]   3.00-4.00   sec  6.51 MBytes  54.7 Mbits/sec  0.050 ms  77024/81703 (94%)  
[  5]   4.00-5.00   sec  6.52 MBytes  54.7 Mbits/sec  0.052 ms  77024/81705 (94%)  
[  5]   5.00-6.00   sec  6.52 MBytes  54.7 Mbits/sec  0.037 ms  76960/81640 (94%)  
[  5]   6.00-7.00   sec  6.52 MBytes  54.7 Mbits/sec  0.043 ms  77008/81689 (94%)  
[  5]   7.00-8.00   sec  6.52 MBytes  54.7 Mbits/sec  0.050 ms  77024/81704 (94%)  
[  5]   8.00-9.00   sec  6.52 MBytes  54.7 Mbits/sec  0.045 ms  77024/81705 (94%)  
[  5]   9.00-10.00  sec  6.52 MBytes  54.7 Mbits/sec  0.043 ms  77008/81688 (94%)  
[  5]  10.00-10.04  sec   275 KBytes  54.5 Mbits/sec  0.103 ms  3040/3233 (94%)  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Jitter    Lost/Total Datagrams
[  5]   0.00-10.04  sec  0.00 Bytes  0.00 bits/sec  0.103 ms  770179/816970 (94%)  
^Ciperf3: interrupt - the server has terminated
# ping -c 10 172.27.64.1
PING 172.27.64.1 (172.27.64.1): 56 data bytes

--- 172.27.64.1 ping statistics ---
10 packets transmitted, 0 packets received, 100% packet loss


More tests needed...

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v3 3/4] net: nb8800: Move HW init to ndo_open()
@ 2017-11-15 14:58               ` Marc Gonzalez
  0 siblings, 0 replies; 70+ messages in thread
From: Marc Gonzalez @ 2017-11-15 14:58 UTC (permalink / raw)
  To: linux-arm-kernel

On 14/11/2017 17:55, M?ns Rullg?rd wrote:

> Marc Gonzalez wrote:
> 
>> I will run iperf3 tests with RX_DESC_COUNT lowered to 2.
>> Would that produce conclusive results?
>> Do you have other suggestions?
> 
> Leave RX_DESC_COUNT alone but add a delay in the nb8800_poll() loop.
> That should ensure that queue is drained slowly enough for the buffers
> to run out.

Using the following patch:

diff --git a/drivers/net/ethernet/aurora/nb8800.c b/drivers/net/ethernet/aurora/nb8800.c
index 31c3f0f10fbb..646300bb53b6 100644
--- a/drivers/net/ethernet/aurora/nb8800.c
+++ b/drivers/net/ethernet/aurora/nb8800.c
@@ -317,6 +317,7 @@ static int nb8800_poll(struct napi_struct *napi, int budget)
 
                len = RX_BYTES_TRANSFERRED(rxd->report);
 
+               udelay(200);
                if (IS_RX_ERROR(rxd->report))
                        nb8800_rx_error(dev, rxd->report);
                else
@@ -1416,9 +1417,9 @@ static int nb8800_probe(struct platform_device *pdev)
        netdev_info(dev, "MAC address %pM\n", dev->dev_addr);
 
        /* Auto-negotiate by default */
-       priv->pause_aneg = true;
-       priv->pause_rx = true;
-       priv->pause_tx = true;
+       priv->pause_aneg = false;
+       priv->pause_rx = false;
+       priv->pause_tx = false;
 
        priv->ops = match->data;
        priv->ops->power_down(dev);
diff --git a/drivers/net/ethernet/aurora/nb8800.h b/drivers/net/ethernet/aurora/nb8800.h
index 23fefca54804..9b59ea776e4a 100644
--- a/drivers/net/ethernet/aurora/nb8800.h
+++ b/drivers/net/ethernet/aurora/nb8800.h
@@ -7,7 +7,7 @@
 #include <linux/clk.h>
 #include <linux/bitops.h>
 
-#define RX_DESC_COUNT                  256
+#define RX_DESC_COUNT                  16
 #define TX_DESC_COUNT                  256
 
 #define NB8800_DESC_LOW                        4


I saw both tango4 and tango5 wedged... :-(

tango5:

[  5] local 172.27.64.23 port 5201 connected to 172.27.64.1 port 57415
[ ID] Interval           Transfer     Bandwidth       Jitter    Lost/Total Datagrams
[  5]   0.00-1.00   sec  6.33 MBytes  53.1 Mbits/sec  0.030 ms  73897/78442 (94%)  
[  5]   1.00-2.00   sec  6.59 MBytes  55.3 Mbits/sec  0.040 ms  76944/81675 (94%)  
[  5]   2.00-3.00   sec  6.59 MBytes  55.3 Mbits/sec  0.036 ms  76944/81675 (94%)  
[  5]   3.00-4.00   sec  6.59 MBytes  55.3 Mbits/sec  0.038 ms  76976/81708 (94%)  
[  5]   4.00-5.00   sec  6.59 MBytes  55.3 Mbits/sec  0.038 ms  76976/81708 (94%)  
[  5]   5.00-6.00   sec  6.59 MBytes  55.3 Mbits/sec  0.039 ms  76944/81675 (94%)  
[  5]   6.00-7.00   sec  6.59 MBytes  55.3 Mbits/sec  0.042 ms  76941/81674 (94%)  
[  5]   7.00-8.00   sec  6.59 MBytes  55.3 Mbits/sec  0.048 ms  76976/81707 (94%)  
[  5]   8.00-9.00   sec  6.59 MBytes  55.3 Mbits/sec  0.041 ms  76928/81660 (94%)  
[  5]   9.00-10.00  sec  6.59 MBytes  55.3 Mbits/sec  0.039 ms  76960/81692 (94%)  
[  5]  10.00-10.04  sec   279 KBytes  55.1 Mbits/sec  0.094 ms  3088/3284 (94%)  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Jitter    Lost/Total Datagrams
[  5]   0.00-10.04  sec  0.00 Bytes  0.00 bits/sec  0.094 ms  769574/816900 (94%)  
^Ciperf3: interrupt - the server has terminated
# ping -c 10 172.27.64.1
PING 172.27.64.1 (172.27.64.1): 56 data bytes

--- 172.27.64.1 ping statistics ---
10 packets transmitted, 0 packets received, 100% packet loss


tango4:

[  5] local 172.27.64.23 port 5201 connected to 172.27.64.1 port 52983
[ ID] Interval           Transfer     Bandwidth       Jitter    Lost/Total Datagrams
[  5]   0.00-1.00   sec  6.24 MBytes  52.3 Mbits/sec  0.035 ms  74019/78498 (94%)  
[  5]   1.00-2.00   sec  6.51 MBytes  54.6 Mbits/sec  0.046 ms  77024/81702 (94%)  
[  5]   2.00-3.00   sec  6.51 MBytes  54.7 Mbits/sec  0.050 ms  77024/81703 (94%)  
[  5]   3.00-4.00   sec  6.51 MBytes  54.7 Mbits/sec  0.050 ms  77024/81703 (94%)  
[  5]   4.00-5.00   sec  6.52 MBytes  54.7 Mbits/sec  0.052 ms  77024/81705 (94%)  
[  5]   5.00-6.00   sec  6.52 MBytes  54.7 Mbits/sec  0.037 ms  76960/81640 (94%)  
[  5]   6.00-7.00   sec  6.52 MBytes  54.7 Mbits/sec  0.043 ms  77008/81689 (94%)  
[  5]   7.00-8.00   sec  6.52 MBytes  54.7 Mbits/sec  0.050 ms  77024/81704 (94%)  
[  5]   8.00-9.00   sec  6.52 MBytes  54.7 Mbits/sec  0.045 ms  77024/81705 (94%)  
[  5]   9.00-10.00  sec  6.52 MBytes  54.7 Mbits/sec  0.043 ms  77008/81688 (94%)  
[  5]  10.00-10.04  sec   275 KBytes  54.5 Mbits/sec  0.103 ms  3040/3233 (94%)  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Jitter    Lost/Total Datagrams
[  5]   0.00-10.04  sec  0.00 Bytes  0.00 bits/sec  0.103 ms  770179/816970 (94%)  
^Ciperf3: interrupt - the server has terminated
# ping -c 10 172.27.64.1
PING 172.27.64.1 (172.27.64.1): 56 data bytes

--- 172.27.64.1 ping statistics ---
10 packets transmitted, 0 packets received, 100% packet loss


More tests needed...

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 2/4] net: nb8800: Simplify nb8800_pause_config()
  2017-11-15 14:33               ` Marc Gonzalez
@ 2017-11-15 15:03                 ` Andrew Lunn
  -1 siblings, 0 replies; 70+ messages in thread
From: Andrew Lunn @ 2017-11-15 15:03 UTC (permalink / raw)
  To: Marc Gonzalez
  Cc: Mans Rullgard, Florian Fainelli, Mason, netdev, Thibaud Cornic,
	David Miller, Linux ARM

On Wed, Nov 15, 2017 at 03:33:47PM +0100, Marc Gonzalez wrote:
> On 15/11/2017 15:17, Andrew Lunn wrote:
> 
> > nb8800_link_reconfigure() can be called whenever the link to the peer
> > changes. auto-neg may happen later because the cable was not plugged
> > in until later, etc.
> 
> Hello Andrew,
> 
> AFAICS, Mans was right: trying to toggle the flow control bit when
> the RX DMA bit is already set "breaks" the HW (ping times out).
> 
> Thus, I will drop patch 2/4.

O.K. Thanks for testing this.

> In our local branch, I have completely disabled flow control support,
> so I don't have to worry about this problem.

That is an interesting statement. You now know there is an issue here,
your solution is to fix your private branch and leave mainline as is.

     Andrew

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH v3 2/4] net: nb8800: Simplify nb8800_pause_config()
@ 2017-11-15 15:03                 ` Andrew Lunn
  0 siblings, 0 replies; 70+ messages in thread
From: Andrew Lunn @ 2017-11-15 15:03 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Nov 15, 2017 at 03:33:47PM +0100, Marc Gonzalez wrote:
> On 15/11/2017 15:17, Andrew Lunn wrote:
> 
> > nb8800_link_reconfigure() can be called whenever the link to the peer
> > changes. auto-neg may happen later because the cable was not plugged
> > in until later, etc.
> 
> Hello Andrew,
> 
> AFAICS, Mans was right: trying to toggle the flow control bit when
> the RX DMA bit is already set "breaks" the HW (ping times out).
> 
> Thus, I will drop patch 2/4.

O.K. Thanks for testing this.

> In our local branch, I have completely disabled flow control support,
> so I don't have to worry about this problem.

That is an interesting statement. You now know there is an issue here,
your solution is to fix your private branch and leave mainline as is.

     Andrew

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 3/4] net: nb8800: Move HW init to ndo_open()
  2017-11-15 14:58               ` Marc Gonzalez
@ 2017-11-15 15:11                 ` Måns Rullgård
  -1 siblings, 0 replies; 70+ messages in thread
From: Måns Rullgård @ 2017-11-15 15:11 UTC (permalink / raw)
  To: Marc Gonzalez
  Cc: netdev, David Miller, Linux ARM, Florian Fainelli, Thibaud Cornic, Mason

Marc Gonzalez <marc_gonzalez@sigmadesigns.com> writes:

> On 14/11/2017 17:55, Måns Rullgård wrote:
>
>> Marc Gonzalez wrote:
>> 
>>> I will run iperf3 tests with RX_DESC_COUNT lowered to 2.
>>> Would that produce conclusive results?
>>> Do you have other suggestions?
>> 
>> Leave RX_DESC_COUNT alone but add a delay in the nb8800_poll() loop.
>> That should ensure that queue is drained slowly enough for the buffers
>> to run out.
>
> Using the following patch:
>
> diff --git a/drivers/net/ethernet/aurora/nb8800.c b/drivers/net/ethernet/aurora/nb8800.c
> index 31c3f0f10fbb..646300bb53b6 100644
> --- a/drivers/net/ethernet/aurora/nb8800.c
> +++ b/drivers/net/ethernet/aurora/nb8800.c
> @@ -317,6 +317,7 @@ static int nb8800_poll(struct napi_struct *napi, int budget)
>
>                 len = RX_BYTES_TRANSFERRED(rxd->report);
>
> +               udelay(200);
>                 if (IS_RX_ERROR(rxd->report))
>                         nb8800_rx_error(dev, rxd->report);
>                 else
> @@ -1416,9 +1417,9 @@ static int nb8800_probe(struct platform_device *pdev)
>         netdev_info(dev, "MAC address %pM\n", dev->dev_addr);
>
>         /* Auto-negotiate by default */
> -       priv->pause_aneg = true;
> -       priv->pause_rx = true;
> -       priv->pause_tx = true;
> +       priv->pause_aneg = false;
> +       priv->pause_rx = false;
> +       priv->pause_tx = false;
>
>         priv->ops = match->data;
>         priv->ops->power_down(dev);
> diff --git a/drivers/net/ethernet/aurora/nb8800.h b/drivers/net/ethernet/aurora/nb8800.h
> index 23fefca54804..9b59ea776e4a 100644
> --- a/drivers/net/ethernet/aurora/nb8800.h
> +++ b/drivers/net/ethernet/aurora/nb8800.h
> @@ -7,7 +7,7 @@
>  #include <linux/clk.h>
>  #include <linux/bitops.h>
>
> -#define RX_DESC_COUNT                  256
> +#define RX_DESC_COUNT                  16
>  #define TX_DESC_COUNT                  256
>
>  #define NB8800_DESC_LOW                        4
>
> I saw both tango4 and tango5 wedged... :-(

Well, that's not good.  I'll see if I can replicate it later this week.

-- 
Måns Rullgård

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH v3 3/4] net: nb8800: Move HW init to ndo_open()
@ 2017-11-15 15:11                 ` Måns Rullgård
  0 siblings, 0 replies; 70+ messages in thread
From: Måns Rullgård @ 2017-11-15 15:11 UTC (permalink / raw)
  To: linux-arm-kernel

Marc Gonzalez <marc_gonzalez@sigmadesigns.com> writes:

> On 14/11/2017 17:55, M?ns Rullg?rd wrote:
>
>> Marc Gonzalez wrote:
>> 
>>> I will run iperf3 tests with RX_DESC_COUNT lowered to 2.
>>> Would that produce conclusive results?
>>> Do you have other suggestions?
>> 
>> Leave RX_DESC_COUNT alone but add a delay in the nb8800_poll() loop.
>> That should ensure that queue is drained slowly enough for the buffers
>> to run out.
>
> Using the following patch:
>
> diff --git a/drivers/net/ethernet/aurora/nb8800.c b/drivers/net/ethernet/aurora/nb8800.c
> index 31c3f0f10fbb..646300bb53b6 100644
> --- a/drivers/net/ethernet/aurora/nb8800.c
> +++ b/drivers/net/ethernet/aurora/nb8800.c
> @@ -317,6 +317,7 @@ static int nb8800_poll(struct napi_struct *napi, int budget)
>
>                 len = RX_BYTES_TRANSFERRED(rxd->report);
>
> +               udelay(200);
>                 if (IS_RX_ERROR(rxd->report))
>                         nb8800_rx_error(dev, rxd->report);
>                 else
> @@ -1416,9 +1417,9 @@ static int nb8800_probe(struct platform_device *pdev)
>         netdev_info(dev, "MAC address %pM\n", dev->dev_addr);
>
>         /* Auto-negotiate by default */
> -       priv->pause_aneg = true;
> -       priv->pause_rx = true;
> -       priv->pause_tx = true;
> +       priv->pause_aneg = false;
> +       priv->pause_rx = false;
> +       priv->pause_tx = false;
>
>         priv->ops = match->data;
>         priv->ops->power_down(dev);
> diff --git a/drivers/net/ethernet/aurora/nb8800.h b/drivers/net/ethernet/aurora/nb8800.h
> index 23fefca54804..9b59ea776e4a 100644
> --- a/drivers/net/ethernet/aurora/nb8800.h
> +++ b/drivers/net/ethernet/aurora/nb8800.h
> @@ -7,7 +7,7 @@
>  #include <linux/clk.h>
>  #include <linux/bitops.h>
>
> -#define RX_DESC_COUNT                  256
> +#define RX_DESC_COUNT                  16
>  #define TX_DESC_COUNT                  256
>
>  #define NB8800_DESC_LOW                        4
>
> I saw both tango4 and tango5 wedged... :-(

Well, that's not good.  I'll see if I can replicate it later this week.

-- 
M?ns Rullg?rd

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 2/4] net: nb8800: Simplify nb8800_pause_config()
  2017-11-15 15:03                 ` Andrew Lunn
@ 2017-11-15 15:19                   ` Marc Gonzalez
  -1 siblings, 0 replies; 70+ messages in thread
From: Marc Gonzalez @ 2017-11-15 15:19 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Mans Rullgard, Florian Fainelli, Mason, netdev, Thibaud Cornic,
	David Miller, Linux ARM

On 15/11/2017 16:03, Andrew Lunn wrote:

> On Wed, Nov 15, 2017 at 03:33:47PM +0100, Marc Gonzalez wrote:
> 
>> On 15/11/2017 15:17, Andrew Lunn wrote:
>>
>> In our local branch, I have completely disabled flow control support,
>> so I don't have to worry about this problem.
> 
> That is an interesting statement. You now know there is an issue here,
> your solution is to fix your private branch and leave mainline as is.

All my patches are NACKed, what would you have me do?

Moreover, mainline still has the nb8800_dma_stop() work-around,
which Mans has never seen hang.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH v3 2/4] net: nb8800: Simplify nb8800_pause_config()
@ 2017-11-15 15:19                   ` Marc Gonzalez
  0 siblings, 0 replies; 70+ messages in thread
From: Marc Gonzalez @ 2017-11-15 15:19 UTC (permalink / raw)
  To: linux-arm-kernel

On 15/11/2017 16:03, Andrew Lunn wrote:

> On Wed, Nov 15, 2017 at 03:33:47PM +0100, Marc Gonzalez wrote:
> 
>> On 15/11/2017 15:17, Andrew Lunn wrote:
>>
>> In our local branch, I have completely disabled flow control support,
>> so I don't have to worry about this problem.
> 
> That is an interesting statement. You now know there is an issue here,
> your solution is to fix your private branch and leave mainline as is.

All my patches are NACKed, what would you have me do?

Moreover, mainline still has the nb8800_dma_stop() work-around,
which Mans has never seen hang.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 2/4] net: nb8800: Simplify nb8800_pause_config()
  2017-11-15 15:19                   ` Marc Gonzalez
@ 2017-11-15 15:36                     ` Måns Rullgård
  -1 siblings, 0 replies; 70+ messages in thread
From: Måns Rullgård @ 2017-11-15 15:36 UTC (permalink / raw)
  To: Marc Gonzalez
  Cc: Andrew Lunn, Florian Fainelli, Mason, netdev, Thibaud Cornic,
	David Miller, Linux ARM

Marc Gonzalez <marc_gonzalez@sigmadesigns.com> writes:

> On 15/11/2017 16:03, Andrew Lunn wrote:
>
>> On Wed, Nov 15, 2017 at 03:33:47PM +0100, Marc Gonzalez wrote:
>> 
>>> On 15/11/2017 15:17, Andrew Lunn wrote:
>>>
>>> In our local branch, I have completely disabled flow control support,
>>> so I don't have to worry about this problem.
>> 
>> That is an interesting statement. You now know there is an issue here,
>> your solution is to fix your private branch and leave mainline as is.
>
> All my patches are NACKed, what would you have me do?
>
> Moreover, mainline still has the nb8800_dma_stop() work-around,
> which Mans has never seen hang.

Here's the thing, if that trick doesn't work, then the dma queue filling
up from real traffic will also hang the controller, which is a much
bigger problem.  Your test today suggests that this might be the case.

-- 
Måns Rullgård

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH v3 2/4] net: nb8800: Simplify nb8800_pause_config()
@ 2017-11-15 15:36                     ` Måns Rullgård
  0 siblings, 0 replies; 70+ messages in thread
From: Måns Rullgård @ 2017-11-15 15:36 UTC (permalink / raw)
  To: linux-arm-kernel

Marc Gonzalez <marc_gonzalez@sigmadesigns.com> writes:

> On 15/11/2017 16:03, Andrew Lunn wrote:
>
>> On Wed, Nov 15, 2017 at 03:33:47PM +0100, Marc Gonzalez wrote:
>> 
>>> On 15/11/2017 15:17, Andrew Lunn wrote:
>>>
>>> In our local branch, I have completely disabled flow control support,
>>> so I don't have to worry about this problem.
>> 
>> That is an interesting statement. You now know there is an issue here,
>> your solution is to fix your private branch and leave mainline as is.
>
> All my patches are NACKed, what would you have me do?
>
> Moreover, mainline still has the nb8800_dma_stop() work-around,
> which Mans has never seen hang.

Here's the thing, if that trick doesn't work, then the dma queue filling
up from real traffic will also hang the controller, which is a much
bigger problem.  Your test today suggests that this might be the case.

-- 
M?ns Rullg?rd

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 3/4] net: nb8800: Move HW init to ndo_open()
  2017-11-15 15:11                 ` Måns Rullgård
@ 2017-11-15 16:15                   ` Marc Gonzalez
  -1 siblings, 0 replies; 70+ messages in thread
From: Marc Gonzalez @ 2017-11-15 16:15 UTC (permalink / raw)
  To: Mans Rullgard
  Cc: netdev, David Miller, Linux ARM, Florian Fainelli, Thibaud Cornic, Mason

On 15/11/2017 16:11, Måns Rullgård wrote:

> Well, that's not good.  I'll see if I can replicate it later this week.

In my latest test, I started from v4.14 and only applied the
following patch:

diff --git a/drivers/net/ethernet/aurora/nb8800.c b/drivers/net/ethernet/aurora/nb8800.c
index e94159507847..5c109cc4bde8 100644
--- a/drivers/net/ethernet/aurora/nb8800.c
+++ b/drivers/net/ethernet/aurora/nb8800.c
@@ -317,6 +317,7 @@ static int nb8800_poll(struct napi_struct *napi, int budget)
 
                len = RX_BYTES_TRANSFERRED(rxd->report);
 
+               udelay(200);
                if (IS_RX_ERROR(rxd->report))
                        nb8800_rx_error(dev, rxd->report);
                else
@@ -1246,10 +1247,12 @@ static int nb8800_hw_init(struct net_device *dev)
        nb8800_writeb(priv, NB8800_PQ1, val >> 8);
        nb8800_writeb(priv, NB8800_PQ2, val & 0xff);
 
+#if 0
        /* Auto-negotiate by default */
        priv->pause_aneg = true;
        priv->pause_rx = true;
        priv->pause_tx = true;
+#endif
 
        nb8800_mc_init(dev, 0);
 
diff --git a/drivers/net/ethernet/aurora/nb8800.h b/drivers/net/ethernet/aurora/nb8800.h
index aacc3cce2cc0..a2d4b841306a 100644
--- a/drivers/net/ethernet/aurora/nb8800.h
+++ b/drivers/net/ethernet/aurora/nb8800.h
@@ -8,7 +8,7 @@
 #include <linux/clk.h>
 #include <linux/bitops.h>
 
-#define RX_DESC_COUNT                  256
+#define RX_DESC_COUNT                  16
 #define TX_DESC_COUNT                  256
 
 #define NB8800_DESC_LOW                        4



Once I booted the SMP8759 board, I ran

ip addr add 172.27.64.23/18 brd 172.27.127.255 dev eth0
ip link set eth0 up
sleep 5
iperf3 -s


And from the PC, I ran

iperf3 -c 172.27.64.23 -u -b 0 -l 1400 -t 120


[  5] local 172.27.64.23 port 5201 connected to 172.27.64.1 port 53771
[ ID] Interval           Transfer     Bandwidth       Jitter    Lost/Total Datagrams
[  5]   0.00-1.00   sec  5.89 MBytes  49.4 Mbits/sec  0.044 ms  77062/81472 (95%)  
[  5]   1.00-2.00   sec  6.25 MBytes  52.5 Mbits/sec  0.052 ms  80112/84796 (94%)  
[  5]   2.00-3.00   sec  6.26 MBytes  52.5 Mbits/sec  0.054 ms  80112/84797 (94%)  
[  5]   3.00-4.00   sec  6.26 MBytes  52.5 Mbits/sec  0.053 ms  80128/84814 (94%)  
[  5]   4.00-5.00   sec  6.26 MBytes  52.5 Mbits/sec  0.049 ms  80112/84798 (94%)  
[  5]   5.00-6.00   sec  6.26 MBytes  52.5 Mbits/sec  0.052 ms  80096/84783 (94%)  
[  5]   6.00-7.00   sec  6.26 MBytes  52.5 Mbits/sec  0.047 ms  80096/84784 (94%)  
[  5]   7.00-8.00   sec  6.26 MBytes  52.5 Mbits/sec  0.060 ms  80128/84815 (94%)  
iperf3: OUT OF ORDER - incoming packet = 731264 and received packet = 0 AND SP = 731279
iperf3: OUT OF ORDER - incoming packet = 731265 and received packet = 0 AND SP = 731279
iperf3: OUT OF ORDER - incoming packet = 731266 and received packet = 0 AND SP = 731279
iperf3: OUT OF ORDER - incoming packet = 731267 and received packet = 0 AND SP = 731279
iperf3: OUT OF ORDER - incoming packet = 731268 and received packet = 0 AND SP = 731279
iperf3: OUT OF ORDER - incoming packet = 731269 and received packet = 0 AND SP = 731279
iperf3: OUT OF ORDER - incoming packet = 731270 and received packet = 0 AND SP = 731279
iperf3: OUT OF ORDER - incoming packet = 731271 and received packet = 0 AND SP = 731279
iperf3: OUT OF ORDER - incoming packet = 731272 and received packet = 0 AND SP = 731279
[  5]   8.00-9.00   sec  4.17 MBytes  35.0 Mbits/sec  0.149 ms  53104/56220 (94%)  
[  5]   9.00-10.00  sec  0.00 Bytes  0.00 bits/sec  0.149 ms  0/0 (0%)  
[  5]  10.00-11.00  sec  0.00 Bytes  0.00 bits/sec  0.149 ms  0/0 (0%)  
[  5]  11.00-12.00  sec  0.00 Bytes  0.00 bits/sec  0.149 ms  0/0 (0%)  
[  5]  12.00-13.00  sec  0.00 Bytes  0.00 bits/sec  0.149 ms  0/0 (0%)  
[  5]  13.00-14.00  sec  0.00 Bytes  0.00 bits/sec  0.149 ms  0/0 (0%)  
^C
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Jitter    Lost/Total Datagrams
[  5]   0.00-17.04  sec  0.00 Bytes  0.00 bits/sec  0.149 ms  690950/731279 (94%)  
[SUM]  0.0-17.0 sec  9 datagrams received out-of-order
iperf3: interrupt - the server has terminated
# ping -c 10 172.27.64.1
PING 172.27.64.1 (172.27.64.1): 56 data bytes

--- 172.27.64.1 ping statistics ---
10 packets transmitted, 0 packets received, 100% packet loss


Given the out-of-order datagrams, I'm wondering if it's possible
for the DMA engine to overwrite a not-yet-read descriptor?

The EOC flag should stop the DMA engine though...

Maybe some kind of race...

I don't think I've been able to trigger the wedge when 256 descriptors
are used.

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v3 3/4] net: nb8800: Move HW init to ndo_open()
@ 2017-11-15 16:15                   ` Marc Gonzalez
  0 siblings, 0 replies; 70+ messages in thread
From: Marc Gonzalez @ 2017-11-15 16:15 UTC (permalink / raw)
  To: linux-arm-kernel

On 15/11/2017 16:11, M?ns Rullg?rd wrote:

> Well, that's not good.  I'll see if I can replicate it later this week.

In my latest test, I started from v4.14 and only applied the
following patch:

diff --git a/drivers/net/ethernet/aurora/nb8800.c b/drivers/net/ethernet/aurora/nb8800.c
index e94159507847..5c109cc4bde8 100644
--- a/drivers/net/ethernet/aurora/nb8800.c
+++ b/drivers/net/ethernet/aurora/nb8800.c
@@ -317,6 +317,7 @@ static int nb8800_poll(struct napi_struct *napi, int budget)
 
                len = RX_BYTES_TRANSFERRED(rxd->report);
 
+               udelay(200);
                if (IS_RX_ERROR(rxd->report))
                        nb8800_rx_error(dev, rxd->report);
                else
@@ -1246,10 +1247,12 @@ static int nb8800_hw_init(struct net_device *dev)
        nb8800_writeb(priv, NB8800_PQ1, val >> 8);
        nb8800_writeb(priv, NB8800_PQ2, val & 0xff);
 
+#if 0
        /* Auto-negotiate by default */
        priv->pause_aneg = true;
        priv->pause_rx = true;
        priv->pause_tx = true;
+#endif
 
        nb8800_mc_init(dev, 0);
 
diff --git a/drivers/net/ethernet/aurora/nb8800.h b/drivers/net/ethernet/aurora/nb8800.h
index aacc3cce2cc0..a2d4b841306a 100644
--- a/drivers/net/ethernet/aurora/nb8800.h
+++ b/drivers/net/ethernet/aurora/nb8800.h
@@ -8,7 +8,7 @@
 #include <linux/clk.h>
 #include <linux/bitops.h>
 
-#define RX_DESC_COUNT                  256
+#define RX_DESC_COUNT                  16
 #define TX_DESC_COUNT                  256
 
 #define NB8800_DESC_LOW                        4



Once I booted the SMP8759 board, I ran

ip addr add 172.27.64.23/18 brd 172.27.127.255 dev eth0
ip link set eth0 up
sleep 5
iperf3 -s


And from the PC, I ran

iperf3 -c 172.27.64.23 -u -b 0 -l 1400 -t 120


[  5] local 172.27.64.23 port 5201 connected to 172.27.64.1 port 53771
[ ID] Interval           Transfer     Bandwidth       Jitter    Lost/Total Datagrams
[  5]   0.00-1.00   sec  5.89 MBytes  49.4 Mbits/sec  0.044 ms  77062/81472 (95%)  
[  5]   1.00-2.00   sec  6.25 MBytes  52.5 Mbits/sec  0.052 ms  80112/84796 (94%)  
[  5]   2.00-3.00   sec  6.26 MBytes  52.5 Mbits/sec  0.054 ms  80112/84797 (94%)  
[  5]   3.00-4.00   sec  6.26 MBytes  52.5 Mbits/sec  0.053 ms  80128/84814 (94%)  
[  5]   4.00-5.00   sec  6.26 MBytes  52.5 Mbits/sec  0.049 ms  80112/84798 (94%)  
[  5]   5.00-6.00   sec  6.26 MBytes  52.5 Mbits/sec  0.052 ms  80096/84783 (94%)  
[  5]   6.00-7.00   sec  6.26 MBytes  52.5 Mbits/sec  0.047 ms  80096/84784 (94%)  
[  5]   7.00-8.00   sec  6.26 MBytes  52.5 Mbits/sec  0.060 ms  80128/84815 (94%)  
iperf3: OUT OF ORDER - incoming packet = 731264 and received packet = 0 AND SP = 731279
iperf3: OUT OF ORDER - incoming packet = 731265 and received packet = 0 AND SP = 731279
iperf3: OUT OF ORDER - incoming packet = 731266 and received packet = 0 AND SP = 731279
iperf3: OUT OF ORDER - incoming packet = 731267 and received packet = 0 AND SP = 731279
iperf3: OUT OF ORDER - incoming packet = 731268 and received packet = 0 AND SP = 731279
iperf3: OUT OF ORDER - incoming packet = 731269 and received packet = 0 AND SP = 731279
iperf3: OUT OF ORDER - incoming packet = 731270 and received packet = 0 AND SP = 731279
iperf3: OUT OF ORDER - incoming packet = 731271 and received packet = 0 AND SP = 731279
iperf3: OUT OF ORDER - incoming packet = 731272 and received packet = 0 AND SP = 731279
[  5]   8.00-9.00   sec  4.17 MBytes  35.0 Mbits/sec  0.149 ms  53104/56220 (94%)  
[  5]   9.00-10.00  sec  0.00 Bytes  0.00 bits/sec  0.149 ms  0/0 (0%)  
[  5]  10.00-11.00  sec  0.00 Bytes  0.00 bits/sec  0.149 ms  0/0 (0%)  
[  5]  11.00-12.00  sec  0.00 Bytes  0.00 bits/sec  0.149 ms  0/0 (0%)  
[  5]  12.00-13.00  sec  0.00 Bytes  0.00 bits/sec  0.149 ms  0/0 (0%)  
[  5]  13.00-14.00  sec  0.00 Bytes  0.00 bits/sec  0.149 ms  0/0 (0%)  
^C
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Jitter    Lost/Total Datagrams
[  5]   0.00-17.04  sec  0.00 Bytes  0.00 bits/sec  0.149 ms  690950/731279 (94%)  
[SUM]  0.0-17.0 sec  9 datagrams received out-of-order
iperf3: interrupt - the server has terminated
# ping -c 10 172.27.64.1
PING 172.27.64.1 (172.27.64.1): 56 data bytes

--- 172.27.64.1 ping statistics ---
10 packets transmitted, 0 packets received, 100% packet loss


Given the out-of-order datagrams, I'm wondering if it's possible
for the DMA engine to overwrite a not-yet-read descriptor?

The EOC flag should stop the DMA engine though...

Maybe some kind of race...

I don't think I've been able to trigger the wedge when 256 descriptors
are used.

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 2/4] net: nb8800: Simplify nb8800_pause_config()
  2017-11-15 15:19                   ` Marc Gonzalez
@ 2017-11-15 21:12                     ` Andrew Lunn
  -1 siblings, 0 replies; 70+ messages in thread
From: Andrew Lunn @ 2017-11-15 21:12 UTC (permalink / raw)
  To: Marc Gonzalez
  Cc: Mans Rullgard, Florian Fainelli, Mason, netdev, Thibaud Cornic,
	David Miller, Linux ARM

On Wed, Nov 15, 2017 at 04:19:56PM +0100, Marc Gonzalez wrote:
> On 15/11/2017 16:03, Andrew Lunn wrote:
> 
> > On Wed, Nov 15, 2017 at 03:33:47PM +0100, Marc Gonzalez wrote:
> > 
> >> On 15/11/2017 15:17, Andrew Lunn wrote:
> >>
> >> In our local branch, I have completely disabled flow control support,
> >> so I don't have to worry about this problem.
> > 
> > That is an interesting statement. You now know there is an issue here,
> > your solution is to fix your private branch and leave mainline as is.
> 
> All my patches are NACKed, what would you have me do?

Hi Marc

You need to consider your own maintenance burden. You want your local
branch to be as near to mainline as possible. Each change you have
means additional maintenance work for you. It also possibly means
additional work for your customers.

You seem to think flow control in your hardware is too broken to be
usable. So you probably want to submit a patch to mainline disabling
it. If it is accepted, that is one less patch you need to maintain.

    Andrew

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH v3 2/4] net: nb8800: Simplify nb8800_pause_config()
@ 2017-11-15 21:12                     ` Andrew Lunn
  0 siblings, 0 replies; 70+ messages in thread
From: Andrew Lunn @ 2017-11-15 21:12 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Nov 15, 2017 at 04:19:56PM +0100, Marc Gonzalez wrote:
> On 15/11/2017 16:03, Andrew Lunn wrote:
> 
> > On Wed, Nov 15, 2017 at 03:33:47PM +0100, Marc Gonzalez wrote:
> > 
> >> On 15/11/2017 15:17, Andrew Lunn wrote:
> >>
> >> In our local branch, I have completely disabled flow control support,
> >> so I don't have to worry about this problem.
> > 
> > That is an interesting statement. You now know there is an issue here,
> > your solution is to fix your private branch and leave mainline as is.
> 
> All my patches are NACKed, what would you have me do?

Hi Marc

You need to consider your own maintenance burden. You want your local
branch to be as near to mainline as possible. Each change you have
means additional maintenance work for you. It also possibly means
additional work for your customers.

You seem to think flow control in your hardware is too broken to be
usable. So you probably want to submit a patch to mainline disabling
it. If it is accepted, that is one less patch you need to maintain.

    Andrew

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 3/4] net: nb8800: Move HW init to ndo_open()
  2017-11-15 16:15                   ` Marc Gonzalez
@ 2017-11-16 12:21                     ` Marc Gonzalez
  -1 siblings, 0 replies; 70+ messages in thread
From: Marc Gonzalez @ 2017-11-16 12:21 UTC (permalink / raw)
  To: netdev
  Cc: Mans Rullgard, David Miller, Linux ARM, Florian Fainelli,
	Thibaud Cornic, Mason

On 15/11/2017 17:15, Marc Gonzalez wrote:

> Given the out-of-order datagrams, I'm wondering if it's possible
> for the DMA engine to overwrite a not-yet-read descriptor?
> 
> The EOC flag should stop the DMA engine though...
> 
> Maybe some kind of race...
> 
> I don't think I've been able to trigger the wedge when 256 descriptors
> are used.

I'm still taking stabs in the dark.

Adding a 10 ms delay for every 1024 packets, and using 256 descriptors
doesn't cause any hang, even after an hour.

At 85 packets per ms, 10 ms is large enough to consume all available
descriptors... Yet no issue.

Only when I lower the number of available RX descriptors do I see
the issue. And I don't need any delay...

I'm starting to think there is some kind of race condition between
SW and HW handling of descriptors. This might also explain the
out-of-order warnings.

Lowering the number of RX descriptors to 64, and adding no delay
causes many OUT OF ORDER warnings, and the network locks up in
under a minute:


iperf3: OUT OF ORDER - incoming packet = 13031 and received packet = 0 AND SP = 13094
iperf3: OUT OF ORDER - incoming packet = 21441 and received packet = 0 AND SP = 21504
iperf3: OUT OF ORDER - incoming packet = 21442 and received packet = 0 AND SP = 21504
[ ID] Interval           Transfer     Bandwidth       Jitter    Lost/Total Datagrams
[  5]   0.00-1.00   sec  97.4 MBytes   817 Mbits/sec  0.016 ms  8793/81753 (11%)  
iperf3: OUT OF ORDER - incoming packet = 92503 and received packet = 0 AND SP = 92566
iperf3: OUT OF ORDER - incoming packet = 92504 and received packet = 0 AND SP = 92566
iperf3: OUT OF ORDER - incoming packet = 125503 and received packet = 0 AND SP = 125566
[  5]   1.00-2.00   sec   101 MBytes   851 Mbits/sec  0.011 ms  8835/84797 (10%)  
iperf3: OUT OF ORDER - incoming packet = 198710 and received packet = 0 AND SP = 198773
iperf3: OUT OF ORDER - incoming packet = 243272 and received packet = 0 AND SP = 243335
iperf3: OUT OF ORDER - incoming packet = 243791 and received packet = 0 AND SP = 243854
[  5]   2.00-3.00   sec   101 MBytes   851 Mbits/sec  0.016 ms  8771/84782 (10%)  
iperf3: OUT OF ORDER - incoming packet = 294545 and received packet = 0 AND SP = 294608
iperf3: OUT OF ORDER - incoming packet = 301719 and received packet = 0 AND SP = 301782
iperf3: OUT OF ORDER - incoming packet = 301720 and received packet = 0 AND SP = 301782
iperf3: OUT OF ORDER - incoming packet = 305218 and received packet = 0 AND SP = 305281
iperf3: OUT OF ORDER - incoming packet = 314655 and received packet = 0 AND SP = 314718
iperf3: OUT OF ORDER - incoming packet = 325473 and received packet = 0 AND SP = 325536
iperf3: OUT OF ORDER - incoming packet = 325474 and received packet = 0 AND SP = 325536
[  5]   3.00-4.00   sec   101 MBytes   850 Mbits/sec  0.013 ms  8903/84798 (10%)  
iperf3: OUT OF ORDER - incoming packet = 340919 and received packet = 0 AND SP = 340982
iperf3: OUT OF ORDER - incoming packet = 340920 and received packet = 0 AND SP = 340982
iperf3: OUT OF ORDER - incoming packet = 346325 and received packet = 0 AND SP = 346388
iperf3: OUT OF ORDER - incoming packet = 346326 and received packet = 0 AND SP = 346388
[  5]   4.00-5.00   sec   101 MBytes   851 Mbits/sec  0.012 ms  8868/84829 (10%)  
iperf3: OUT OF ORDER - incoming packet = 441873 and received packet = 0 AND SP = 441936
iperf3: OUT OF ORDER - incoming packet = 459258 and received packet = 0 AND SP = 459321
iperf3: OUT OF ORDER - incoming packet = 495604 and received packet = 0 AND SP = 495667
[  5]   5.00-6.00   sec   101 MBytes   850 Mbits/sec  0.013 ms  8835/84760 (10%)  
iperf3: OUT OF ORDER - incoming packet = 558280 and received packet = 0 AND SP = 558343
iperf3: OUT OF ORDER - incoming packet = 558281 and received packet = 0 AND SP = 558343
iperf3: OUT OF ORDER - incoming packet = 587445 and received packet = 0 AND SP = 587508
iperf3: OUT OF ORDER - incoming packet = 587446 and received packet = 0 AND SP = 587508
[  5]   6.00-7.00   sec   102 MBytes   854 Mbits/sec  0.016 ms  8580/84843 (10%)  
iperf3: OUT OF ORDER - incoming packet = 617324 and received packet = 0 AND SP = 617387
iperf3: OUT OF ORDER - incoming packet = 620412 and received packet = 0 AND SP = 620475
iperf3: OUT OF ORDER - incoming packet = 635845 and received packet = 0 AND SP = 635908
iperf3: OUT OF ORDER - incoming packet = 657102 and received packet = 0 AND SP = 657165
iperf3: OUT OF ORDER - incoming packet = 670672 and received packet = 0 AND SP = 670735
iperf3: OUT OF ORDER - incoming packet = 670673 and received packet = 0 AND SP = 670735
[  5]   7.00-8.00   sec   102 MBytes   853 Mbits/sec  0.012 ms  8582/84757 (10%)  
iperf3: OUT OF ORDER - incoming packet = 676529 and received packet = 0 AND SP = 676592
iperf3: OUT OF ORDER - incoming packet = 682693 and received packet = 0 AND SP = 682756
iperf3: OUT OF ORDER - incoming packet = 682694 and received packet = 0 AND SP = 682756
iperf3: OUT OF ORDER - incoming packet = 690106 and received packet = 0 AND SP = 690169
iperf3: OUT OF ORDER - incoming packet = 694369 and received packet = 0 AND SP = 694432
iperf3: OUT OF ORDER - incoming packet = 715288 and received packet = 0 AND SP = 715351
iperf3: OUT OF ORDER - incoming packet = 715289 and received packet = 0 AND SP = 715351
iperf3: OUT OF ORDER - incoming packet = 731162 and received packet = 0 AND SP = 731225
iperf3: OUT OF ORDER - incoming packet = 731163 and received packet = 0 AND SP = 731225
iperf3: OUT OF ORDER - incoming packet = 750282 and received packet = 0 AND SP = 750345
iperf3: OUT OF ORDER - incoming packet = 752731 and received packet = 0 AND SP = 752794
[  5]   8.00-9.00   sec   102 MBytes   853 Mbits/sec  0.011 ms  8651/84784 (10%)  
iperf3: OUT OF ORDER - incoming packet = 786352 and received packet = 0 AND SP = 786415
iperf3: OUT OF ORDER - incoming packet = 786353 and received packet = 0 AND SP = 786415
iperf3: OUT OF ORDER - incoming packet = 801154 and received packet = 0 AND SP = 801217
iperf3: OUT OF ORDER - incoming packet = 803611 and received packet = 0 AND SP = 803674
iperf3: OUT OF ORDER - incoming packet = 803612 and received packet = 0 AND SP = 803674
iperf3: OUT OF ORDER - incoming packet = 809297 and received packet = 0 AND SP = 809360
iperf3: OUT OF ORDER - incoming packet = 821889 and received packet = 0 AND SP = 821952
iperf3: OUT OF ORDER - incoming packet = 828216 and received packet = 0 AND SP = 828279
iperf3: OUT OF ORDER - incoming packet = 828217 and received packet = 0 AND SP = 828279
iperf3: OUT OF ORDER - incoming packet = 830446 and received packet = 0 AND SP = 830509
[  5]   9.00-10.00  sec   102 MBytes   854 Mbits/sec  0.010 ms  8590/84807 (10%)  
iperf3: OUT OF ORDER - incoming packet = 863210 and received packet = 0 AND SP = 863273
iperf3: OUT OF ORDER - incoming packet = 863211 and received packet = 0 AND SP = 863273
iperf3: OUT OF ORDER - incoming packet = 870942 and received packet = 0 AND SP = 871005
iperf3: OUT OF ORDER - incoming packet = 870943 and received packet = 0 AND SP = 871005
iperf3: OUT OF ORDER - incoming packet = 900170 and received packet = 0 AND SP = 900233
iperf3: OUT OF ORDER - incoming packet = 919395 and received packet = 0 AND SP = 919458
iperf3: OUT OF ORDER - incoming packet = 919396 and received packet = 0 AND SP = 919458
[  5]  10.00-11.00  sec   102 MBytes   855 Mbits/sec  0.013 ms  8455/84823 (10%)  
iperf3: OUT OF ORDER - incoming packet = 973534 and received packet = 0 AND SP = 973597
iperf3: OUT OF ORDER - incoming packet = 1011487 and received packet = 0 AND SP = 1011550
[  5]  11.00-12.00  sec   102 MBytes   856 Mbits/sec  0.015 ms  8322/84753 (9.8%)  
iperf3: OUT OF ORDER - incoming packet = 1050138 and received packet = 0 AND SP = 1050201
iperf3: OUT OF ORDER - incoming packet = 1050139 and received packet = 0 AND SP = 1050201
[  5]  12.00-13.00  sec   102 MBytes   855 Mbits/sec  0.010 ms  8450/84812 (10%)  
iperf3: OUT OF ORDER - incoming packet = 1099930 and received packet = 0 AND SP = 1099993
iperf3: OUT OF ORDER - incoming packet = 1123640 and received packet = 0 AND SP = 1123703
iperf3: OUT OF ORDER - incoming packet = 1123641 and received packet = 0 AND SP = 1123703
iperf3: OUT OF ORDER - incoming packet = 1132657 and received packet = 0 AND SP = 1132720
iperf3: OUT OF ORDER - incoming packet = 1132658 and received packet = 0 AND SP = 1132720
iperf3: OUT OF ORDER - incoming packet = 1149744 and received packet = 0 AND SP = 1149807
iperf3: OUT OF ORDER - incoming packet = 1149745 and received packet = 0 AND SP = 1149807
iperf3: OUT OF ORDER - incoming packet = 1165178 and received packet = 0 AND SP = 1165241
[  5]  13.00-14.00  sec   102 MBytes   855 Mbits/sec  0.010 ms  8509/84818 (10%)  
iperf3: OUT OF ORDER - incoming packet = 1186134 and received packet = 0 AND SP = 1186197
iperf3: OUT OF ORDER - incoming packet = 1191699 and received packet = 0 AND SP = 1191762
iperf3: OUT OF ORDER - incoming packet = 1206111 and received packet = 0 AND SP = 1206174
iperf3: OUT OF ORDER - incoming packet = 1206112 and received packet = 0 AND SP = 1206174
iperf3: OUT OF ORDER - incoming packet = 1210586 and received packet = 0 AND SP = 1210649
[  5]  14.00-15.00  sec   102 MBytes   855 Mbits/sec  0.022 ms  8460/84786 (10%)  
iperf3: OUT OF ORDER - incoming packet = 1307589 and received packet = 0 AND SP = 1307652
iperf3: OUT OF ORDER - incoming packet = 1314198 and received packet = 0 AND SP = 1314261
iperf3: OUT OF ORDER - incoming packet = 1345142 and received packet = 0 AND SP = 1345205
[  5]  15.00-16.00  sec   102 MBytes   857 Mbits/sec  0.011 ms  8248/84782 (9.7%)  
iperf3: OUT OF ORDER - incoming packet = 1359295 and received packet = 0 AND SP = 1359358
iperf3: OUT OF ORDER - incoming packet = 1413394 and received packet = 0 AND SP = 1413457
iperf3: OUT OF ORDER - incoming packet = 1413395 and received packet = 0 AND SP = 1413457
iperf3: OUT OF ORDER - incoming packet = 1423771 and received packet = 0 AND SP = 1423834
iperf3: OUT OF ORDER - incoming packet = 1423772 and received packet = 0 AND SP = 1423834
[  5]  16.00-17.00  sec   102 MBytes   858 Mbits/sec  0.018 ms  8197/84768 (9.7%)  
iperf3: OUT OF ORDER - incoming packet = 1441628 and received packet = 0 AND SP = 1441691
iperf3: OUT OF ORDER - incoming packet = 1500381 and received packet = 0 AND SP = 1500444
iperf3: OUT OF ORDER - incoming packet = 1500382 and received packet = 0 AND SP = 1500444
[  5]  17.00-18.00  sec   102 MBytes   856 Mbits/sec  0.012 ms  8387/84849 (9.9%)  
iperf3: OUT OF ORDER - incoming packet = 1524208 and received packet = 0 AND SP = 1524271
iperf3: OUT OF ORDER - incoming packet = 1525402 and received packet = 0 AND SP = 1525465
iperf3: OUT OF ORDER - incoming packet = 1525403 and received packet = 0 AND SP = 1525465
iperf3: OUT OF ORDER - incoming packet = 1549050 and received packet = 0 AND SP = 1549113
iperf3: OUT OF ORDER - incoming packet = 1549051 and received packet = 0 AND SP = 1549113
iperf3: OUT OF ORDER - incoming packet = 1581586 and received packet = 0 AND SP = 1581649
iperf3: OUT OF ORDER - incoming packet = 1581587 and received packet = 0 AND SP = 1581649
[  5]  18.00-19.00  sec   102 MBytes   854 Mbits/sec  0.012 ms  8519/84776 (10%)  
iperf3: OUT OF ORDER - incoming packet = 1614525 and received packet = 0 AND SP = 1614588
iperf3: OUT OF ORDER - incoming packet = 1614526 and received packet = 0 AND SP = 1614588
iperf3: OUT OF ORDER - incoming packet = 1631851 and received packet = 0 AND SP = 1631914
iperf3: OUT OF ORDER - incoming packet = 1631852 and received packet = 0 AND SP = 1631914
iperf3: OUT OF ORDER - incoming packet = 1632403 and received packet = 0 AND SP = 1632466
iperf3: OUT OF ORDER - incoming packet = 1668834 and received packet = 0 AND SP = 1668897
[  5]  19.00-20.00  sec   102 MBytes   854 Mbits/sec  0.012 ms  8507/84777 (10%)  
iperf3: OUT OF ORDER - incoming packet = 1710901 and received packet = 0 AND SP = 1710964
iperf3: OUT OF ORDER - incoming packet = 1710902 and received packet = 0 AND SP = 1710964
iperf3: OUT OF ORDER - incoming packet = 1743216 and received packet = 0 AND SP = 1743279
iperf3: OUT OF ORDER - incoming packet = 1743217 and received packet = 0 AND SP = 1743279
[  5]  20.00-21.00  sec   102 MBytes   856 Mbits/sec  0.011 ms  8452/84842 (10%)  
iperf3: OUT OF ORDER - incoming packet = 1801411 and received packet = 0 AND SP = 1801474
iperf3: OUT OF ORDER - incoming packet = 1801412 and received packet = 0 AND SP = 1801474
iperf3: OUT OF ORDER - incoming packet = 1828046 and received packet = 0 AND SP = 1828109
[  5]  21.00-22.00  sec   102 MBytes   855 Mbits/sec  0.013 ms  8451/84819 (10%)  
iperf3: OUT OF ORDER - incoming packet = 1872009 and received packet = 0 AND SP = 1872072
iperf3: OUT OF ORDER - incoming packet = 1907637 and received packet = 0 AND SP = 1907700
iperf3: OUT OF ORDER - incoming packet = 1920279 and received packet = 0 AND SP = 1920342
iperf3: OUT OF ORDER - incoming packet = 1920280 and received packet = 0 AND SP = 1920342
[  5]  22.00-23.00  sec   102 MBytes   853 Mbits/sec  0.017 ms  8633/84757 (10%)  
iperf3: OUT OF ORDER - incoming packet = 1958806 and received packet = 0 AND SP = 1958869
iperf3: OUT OF ORDER - incoming packet = 1960612 and received packet = 0 AND SP = 1960675
iperf3: OUT OF ORDER - incoming packet = 1960613 and received packet = 0 AND SP = 1960675
iperf3: OUT OF ORDER - incoming packet = 1971512 and received packet = 0 AND SP = 1971575
iperf3: OUT OF ORDER - incoming packet = 1971513 and received packet = 0 AND SP = 1971575
iperf3: OUT OF ORDER - incoming packet = 1974598 and received packet = 0 AND SP = 1974661
iperf3: OUT OF ORDER - incoming packet = 1974599 and received packet = 0 AND SP = 1974661
iperf3: OUT OF ORDER - incoming packet = 2020663 and received packet = 0 AND SP = 2020726
[  5]  23.00-24.00  sec   102 MBytes   854 Mbits/sec  0.012 ms  8583/84833 (10%)  
iperf3: OUT OF ORDER - incoming packet = 2048932 and received packet = 0 AND SP = 2048995
iperf3: OUT OF ORDER - incoming packet = 2059769 and received packet = 0 AND SP = 2059832
iperf3: OUT OF ORDER - incoming packet = 2069989 and received packet = 0 AND SP = 2070052
iperf3: OUT OF ORDER - incoming packet = 2069990 and received packet = 0 AND SP = 2070052
iperf3: OUT OF ORDER - incoming packet = 2073023 and received packet = 0 AND SP = 2073086
iperf3: OUT OF ORDER - incoming packet = 2083300 and received packet = 0 AND SP = 2083363
iperf3: OUT OF ORDER - incoming packet = 2083301 and received packet = 0 AND SP = 2083363
iperf3: OUT OF ORDER - incoming packet = 2087583 and received packet = 0 AND SP = 2087646
iperf3: OUT OF ORDER - incoming packet = 2089400 and received packet = 0 AND SP = 2089463
iperf3: OUT OF ORDER - incoming packet = 2089401 and received packet = 0 AND SP = 2089463
iperf3: OUT OF ORDER - incoming packet = 2091835 and received packet = 0 AND SP = 2091898
iperf3: OUT OF ORDER - incoming packet = 2091836 and received packet = 0 AND SP = 2091898
iperf3: OUT OF ORDER - incoming packet = 2112610 and received packet = 0 AND SP = 2112673
[  5]  24.00-25.00  sec   102 MBytes   853 Mbits/sec  0.012 ms  8589/84768 (10%)  
iperf3: OUT OF ORDER - incoming packet = 2167633 and received packet = 0 AND SP = 2167696
iperf3: OUT OF ORDER - incoming packet = 2177652 and received packet = 0 AND SP = 2177715
iperf3: OUT OF ORDER - incoming packet = 2177653 and received packet = 0 AND SP = 2177715
[  5]  25.00-26.00  sec   102 MBytes   854 Mbits/sec  0.022 ms  8633/84853 (10%)  
iperf3: OUT OF ORDER - incoming packet = 2267485 and received packet = 0 AND SP = 2267548
iperf3: OUT OF ORDER - incoming packet = 2267486 and received packet = 0 AND SP = 2267548
[  5]  26.00-27.00  sec   102 MBytes   857 Mbits/sec  0.011 ms  8258/84791 (9.7%)  
iperf3: OUT OF ORDER - incoming packet = 2287486 and received packet = 0 AND SP = 2287549
iperf3: OUT OF ORDER - incoming packet = 2287487 and received packet = 0 AND SP = 2287549
iperf3: OUT OF ORDER - incoming packet = 2308286 and received packet = 0 AND SP = 2308349
iperf3: OUT OF ORDER - incoming packet = 2347272 and received packet = 0 AND SP = 2347335
iperf3: OUT OF ORDER - incoming packet = 2347273 and received packet = 0 AND SP = 2347335
iperf3: OUT OF ORDER - incoming packet = 2354962 and received packet = 0 AND SP = 2355025
iperf3: OUT OF ORDER - incoming packet = 2354963 and received packet = 0 AND SP = 2355025
iperf3: OUT OF ORDER - incoming packet = 2369466 and received packet = 0 AND SP = 2369529
[  5]  27.00-28.00  sec   102 MBytes   856 Mbits/sec  0.014 ms  8317/84758 (9.8%)  
iperf3: OUT OF ORDER - incoming packet = 2380936 and received packet = 0 AND SP = 2380999
iperf3: OUT OF ORDER - incoming packet = 2380937 and received packet = 0 AND SP = 2380999
iperf3: OUT OF ORDER - incoming packet = 2384122 and received packet = 0 AND SP = 2384185
iperf3: OUT OF ORDER - incoming packet = 2398106 and received packet = 0 AND SP = 2398169
iperf3: OUT OF ORDER - incoming packet = 2407507 and received packet = 0 AND SP = 2407570
[  5]  28.00-29.00  sec   102 MBytes   858 Mbits/sec  0.011 ms  8261/84849 (9.7%)  
iperf3: OUT OF ORDER - incoming packet = 2463519 and received packet = 0 AND SP = 2463582
iperf3: OUT OF ORDER - incoming packet = 2463520 and received packet = 0 AND SP = 2463582
iperf3: OUT OF ORDER - incoming packet = 2490271 and received packet = 0 AND SP = 2490334
iperf3: OUT OF ORDER - incoming packet = 2490272 and received packet = 0 AND SP = 2490334
iperf3: OUT OF ORDER - incoming packet = 2498122 and received packet = 0 AND SP = 2498185
iperf3: OUT OF ORDER - incoming packet = 2514674 and received packet = 0 AND SP = 2514737
iperf3: OUT OF ORDER - incoming packet = 2514675 and received packet = 0 AND SP = 2514737
iperf3: OUT OF ORDER - incoming packet = 2517846 and received packet = 0 AND SP = 2517909
[  5]  29.00-30.00  sec   102 MBytes   857 Mbits/sec  0.011 ms  8262/84806 (9.7%)  
iperf3: OUT OF ORDER - incoming packet = 2546398 and received packet = 0 AND SP = 2546461
iperf3: OUT OF ORDER - incoming packet = 2546399 and received packet = 0 AND SP = 2546461
iperf3: OUT OF ORDER - incoming packet = 2558838 and received packet = 0 AND SP = 2558901
iperf3: OUT OF ORDER - incoming packet = 2572958 and received packet = 0 AND SP = 2573021
iperf3: OUT OF ORDER - incoming packet = 2572959 and received packet = 0 AND SP = 2573021
iperf3: OUT OF ORDER - incoming packet = 2586545 and received packet = 0 AND SP = 2586608
iperf3: OUT OF ORDER - incoming packet = 2586546 and received packet = 0 AND SP = 2586608
iperf3: OUT OF ORDER - incoming packet = 2621613 and received packet = 0 AND SP = 2621676
[  5]  30.00-31.00  sec   102 MBytes   858 Mbits/sec  0.011 ms  8200/84785 (9.7%)  
iperf3: OUT OF ORDER - incoming packet = 2629982 and received packet = 0 AND SP = 2630045
iperf3: OUT OF ORDER - incoming packet = 2629983 and received packet = 0 AND SP = 2630045
iperf3: OUT OF ORDER - incoming packet = 2672925 and received packet = 0 AND SP = 2672988
iperf3: OUT OF ORDER - incoming packet = 2676140 and received packet = 0 AND SP = 2676203
iperf3: OUT OF ORDER - incoming packet = 2688740 and received packet = 0 AND SP = 2688803
iperf3: OUT OF ORDER - incoming packet = 2688741 and received packet = 0 AND SP = 2688803
iperf3: OUT OF ORDER - incoming packet = 2697968 and received packet = 0 AND SP = 2698031
[  5]  31.00-32.00  sec   102 MBytes   858 Mbits/sec  0.011 ms  8199/84780 (9.7%)  
iperf3: OUT OF ORDER - incoming packet = 2764257 and received packet = 0 AND SP = 2764320
iperf3: OUT OF ORDER - incoming packet = 2770960 and received packet = 0 AND SP = 2771023
iperf3: OUT OF ORDER - incoming packet = 2783857 and received packet = 0 AND SP = 2783920
[  5]  32.00-33.00  sec   102 MBytes   852 Mbits/sec  0.014 ms  8771/84809 (10%)  
[  5]  33.00-34.00  sec   101 MBytes   849 Mbits/sec  0.011 ms  8707/84542 (10%)  
iperf3: OUT OF ORDER - incoming packet = 2913446 and received packet = 0 AND SP = 2913509
iperf3: OUT OF ORDER - incoming packet = 2927697 and received packet = 0 AND SP = 2927760
iperf3: OUT OF ORDER - incoming packet = 2927698 and received packet = 0 AND SP = 2927760
[  5]  34.00-35.00  sec   102 MBytes   852 Mbits/sec  0.012 ms  8771/84799 (10%)  
iperf3: OUT OF ORDER - incoming packet = 3038091 and received packet = 0 AND SP = 3038250
iperf3: OUT OF ORDER - incoming packet = 3038092 and received packet = 0 AND SP = 3038250
iperf3: OUT OF ORDER - incoming packet = 3038093 and received packet = 0 AND SP = 3038250
iperf3: OUT OF ORDER - incoming packet = 3038094 and received packet = 0 AND SP = 3038250
iperf3: OUT OF ORDER - incoming packet = 3038095 and received packet = 0 AND SP = 3038250
iperf3: OUT OF ORDER - incoming packet = 3038096 and received packet = 0 AND SP = 3038250
iperf3: OUT OF ORDER - incoming packet = 3038097 and received packet = 0 AND SP = 3038250
iperf3: OUT OF ORDER - incoming packet = 3038098 and received packet = 0 AND SP = 3038250
[  5]  35.00-36.00  sec  88.0 MBytes   738 Mbits/sec  0.087 ms  7732/73605 (11%)  
[  5]  36.00-37.00  sec  0.00 Bytes  0.00 bits/sec  0.087 ms  0/0 (0%)  
[  5]  37.00-38.00  sec  0.00 Bytes  0.00 bits/sec  0.087 ms  0/0 (0%)  
...
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Jitter    Lost/Total Datagrams
[  5]   0.00-42.30  sec  0.00 Bytes  0.00 bits/sec  0.087 ms  306281/3038250 (10%)  
[SUM]  0.0-42.3 sec  189 datagrams received out-of-order
iperf3: interrupt - the server has terminated




Lowering the sender's bitrate to 500 Mbps also leads to an
even faster lock up.

iperf3: OUT OF ORDER - incoming packet = 34566 and received packet = 0 AND SP = 34629
[ ID] Interval           Transfer     Bandwidth       Jitter    Lost/Total Datagrams
[  5]   0.00-1.00   sec  49.5 MBytes   415 Mbits/sec  0.010 ms  3947/41030 (9.6%)  
iperf3: OUT OF ORDER - incoming packet = 76093 and received packet = 0 AND SP = 76156
iperf3: OUT OF ORDER - incoming packet = 76094 and received packet = 0 AND SP = 76156
[  5]   1.00-2.00   sec  54.7 MBytes   459 Mbits/sec  0.011 ms  4019/44998 (8.9%)  
iperf3: OUT OF ORDER - incoming packet = 98670 and received packet = 0 AND SP = 98733
iperf3: OUT OF ORDER - incoming packet = 98671 and received packet = 0 AND SP = 98733
iperf3: OUT OF ORDER - incoming packet = 98672 and received packet = 0 AND SP = 98733
iperf3: OUT OF ORDER - incoming packet = 98673 and received packet = 0 AND SP = 98733
iperf3: OUT OF ORDER - incoming packet = 98674 and received packet = 0 AND SP = 98733
iperf3: OUT OF ORDER - incoming packet = 98675 and received packet = 0 AND SP = 98733
iperf3: OUT OF ORDER - incoming packet = 98676 and received packet = 0 AND SP = 98733
iperf3: OUT OF ORDER - incoming packet = 98677 and received packet = 0 AND SP = 98733
iperf3: OUT OF ORDER - incoming packet = 98582 and received packet = 0 AND SP = 98733
iperf3: OUT OF ORDER - incoming packet = 98583 and received packet = 0 AND SP = 98733
iperf3: OUT OF ORDER - incoming packet = 98584 and received packet = 0 AND SP = 98733
iperf3: OUT OF ORDER - incoming packet = 98585 and received packet = 0 AND SP = 98733
iperf3: OUT OF ORDER - incoming packet = 98586 and received packet = 0 AND SP = 98733
iperf3: OUT OF ORDER - incoming packet = 98587 and received packet = 0 AND SP = 98733
iperf3: OUT OF ORDER - incoming packet = 98588 and received packet = 0 AND SP = 98733
iperf3: OUT OF ORDER - incoming packet = 98589 and received packet = 0 AND SP = 98733
iperf3: OUT OF ORDER - incoming packet = 98590 and received packet = 0 AND SP = 98733
iperf3: OUT OF ORDER - incoming packet = 98591 and received packet = 0 AND SP = 98733
iperf3: OUT OF ORDER - incoming packet = 98592 and received packet = 0 AND SP = 98733
iperf3: OUT OF ORDER - incoming packet = 98593 and received packet = 0 AND SP = 98733
iperf3: OUT OF ORDER - incoming packet = 98594 and received packet = 0 AND SP = 98733
iperf3: OUT OF ORDER - incoming packet = 98595 and received packet = 0 AND SP = 98733
iperf3: OUT OF ORDER - incoming packet = 98596 and received packet = 0 AND SP = 98733
iperf3: OUT OF ORDER - incoming packet = 98597 and received packet = 0 AND SP = 98733
iperf3: OUT OF ORDER - incoming packet = 98598 and received packet = 0 AND SP = 98733
[  5]   2.00-3.00   sec  15.4 MBytes   129 Mbits/sec  0.049 ms  1160/12705 (9.1%)  
[  5]   3.00-4.00   sec  0.00 Bytes  0.00 bits/sec  0.049 ms  0/0 (0%)  
[  5]   4.00-5.00   sec  0.00 Bytes  0.00 bits/sec  0.049 ms  0/0 (0%)  
...
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Jitter    Lost/Total Datagrams
[  5]   0.00-10.71  sec  0.00 Bytes  0.00 bits/sec  0.049 ms  9126/98733 (9.2%)  
[SUM]  0.0-10.7 sec  28 datagrams received out-of-order
iperf3: interrupt - the server has terminated



Even faster with sender @ 200 Mbps

iperf3: OUT OF ORDER - incoming packet = 3451 and received packet = 0 AND SP = 3514
iperf3: OUT OF ORDER - incoming packet = 3452 and received packet = 0 AND SP = 3514
iperf3: OUT OF ORDER - incoming packet = 8736 and received packet = 0 AND SP = 8799
[   15.299039] random: crng init done
iperf3: OUT OF ORDER - incoming packet = 12813 and received packet = 0 AND SP = 12876
iperf3: OUT OF ORDER - incoming packet = 12814 and received packet = 0 AND SP = 12876
iperf3: OUT OF ORDER - incoming packet = 12815 and received packet = 0 AND SP = 12876
iperf3: OUT OF ORDER - incoming packet = 12816 and received packet = 0 AND SP = 12876
iperf3: OUT OF ORDER - incoming packet = 12817 and received packet = 0 AND SP = 12876
iperf3: OUT OF ORDER - incoming packet = 12818 and received packet = 0 AND SP = 12876
iperf3: OUT OF ORDER - incoming packet = 12819 and received packet = 0 AND SP = 12876
iperf3: OUT OF ORDER - incoming packet = 12820 and received packet = 0 AND SP = 12876
iperf3: OUT OF ORDER - incoming packet = 12821 and received packet = 0 AND SP = 12876
[ ID] Interval           Transfer     Bandwidth       Jitter    Lost/Total Datagrams
[  5]   0.00-1.00   sec  15.5 MBytes   130 Mbits/sec  0.051 ms  1267/12876 (9.8%)  
[  5]   1.00-2.00   sec  0.00 Bytes  0.00 bits/sec  0.051 ms  0/0 (0%)  
[  5]   2.00-3.00   sec  0.00 Bytes  0.00 bits/sec  0.051 ms  0/0 (0%)  
...
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Jitter    Lost/Total Datagrams
[  5]   0.00-7.58   sec  0.00 Bytes  0.00 bits/sec  0.051 ms  1267/12876 (9.8%)  
[SUM]  0.0- 7.6 sec  12 datagrams received out-of-order
iperf3: interrupt - the server has terminated


Same problem with sender @ 80 Mbps


Increasing RX descriptors to 128... Cannot reproduce anymore...

Setting RX descriptors back to 64...

Keep digging...

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH v3 3/4] net: nb8800: Move HW init to ndo_open()
@ 2017-11-16 12:21                     ` Marc Gonzalez
  0 siblings, 0 replies; 70+ messages in thread
From: Marc Gonzalez @ 2017-11-16 12:21 UTC (permalink / raw)
  To: linux-arm-kernel

On 15/11/2017 17:15, Marc Gonzalez wrote:

> Given the out-of-order datagrams, I'm wondering if it's possible
> for the DMA engine to overwrite a not-yet-read descriptor?
> 
> The EOC flag should stop the DMA engine though...
> 
> Maybe some kind of race...
> 
> I don't think I've been able to trigger the wedge when 256 descriptors
> are used.

I'm still taking stabs in the dark.

Adding a 10 ms delay for every 1024 packets, and using 256 descriptors
doesn't cause any hang, even after an hour.

At 85 packets per ms, 10 ms is large enough to consume all available
descriptors... Yet no issue.

Only when I lower the number of available RX descriptors do I see
the issue. And I don't need any delay...

I'm starting to think there is some kind of race condition between
SW and HW handling of descriptors. This might also explain the
out-of-order warnings.

Lowering the number of RX descriptors to 64, and adding no delay
causes many OUT OF ORDER warnings, and the network locks up in
under a minute:


iperf3: OUT OF ORDER - incoming packet = 13031 and received packet = 0 AND SP = 13094
iperf3: OUT OF ORDER - incoming packet = 21441 and received packet = 0 AND SP = 21504
iperf3: OUT OF ORDER - incoming packet = 21442 and received packet = 0 AND SP = 21504
[ ID] Interval           Transfer     Bandwidth       Jitter    Lost/Total Datagrams
[  5]   0.00-1.00   sec  97.4 MBytes   817 Mbits/sec  0.016 ms  8793/81753 (11%)  
iperf3: OUT OF ORDER - incoming packet = 92503 and received packet = 0 AND SP = 92566
iperf3: OUT OF ORDER - incoming packet = 92504 and received packet = 0 AND SP = 92566
iperf3: OUT OF ORDER - incoming packet = 125503 and received packet = 0 AND SP = 125566
[  5]   1.00-2.00   sec   101 MBytes   851 Mbits/sec  0.011 ms  8835/84797 (10%)  
iperf3: OUT OF ORDER - incoming packet = 198710 and received packet = 0 AND SP = 198773
iperf3: OUT OF ORDER - incoming packet = 243272 and received packet = 0 AND SP = 243335
iperf3: OUT OF ORDER - incoming packet = 243791 and received packet = 0 AND SP = 243854
[  5]   2.00-3.00   sec   101 MBytes   851 Mbits/sec  0.016 ms  8771/84782 (10%)  
iperf3: OUT OF ORDER - incoming packet = 294545 and received packet = 0 AND SP = 294608
iperf3: OUT OF ORDER - incoming packet = 301719 and received packet = 0 AND SP = 301782
iperf3: OUT OF ORDER - incoming packet = 301720 and received packet = 0 AND SP = 301782
iperf3: OUT OF ORDER - incoming packet = 305218 and received packet = 0 AND SP = 305281
iperf3: OUT OF ORDER - incoming packet = 314655 and received packet = 0 AND SP = 314718
iperf3: OUT OF ORDER - incoming packet = 325473 and received packet = 0 AND SP = 325536
iperf3: OUT OF ORDER - incoming packet = 325474 and received packet = 0 AND SP = 325536
[  5]   3.00-4.00   sec   101 MBytes   850 Mbits/sec  0.013 ms  8903/84798 (10%)  
iperf3: OUT OF ORDER - incoming packet = 340919 and received packet = 0 AND SP = 340982
iperf3: OUT OF ORDER - incoming packet = 340920 and received packet = 0 AND SP = 340982
iperf3: OUT OF ORDER - incoming packet = 346325 and received packet = 0 AND SP = 346388
iperf3: OUT OF ORDER - incoming packet = 346326 and received packet = 0 AND SP = 346388
[  5]   4.00-5.00   sec   101 MBytes   851 Mbits/sec  0.012 ms  8868/84829 (10%)  
iperf3: OUT OF ORDER - incoming packet = 441873 and received packet = 0 AND SP = 441936
iperf3: OUT OF ORDER - incoming packet = 459258 and received packet = 0 AND SP = 459321
iperf3: OUT OF ORDER - incoming packet = 495604 and received packet = 0 AND SP = 495667
[  5]   5.00-6.00   sec   101 MBytes   850 Mbits/sec  0.013 ms  8835/84760 (10%)  
iperf3: OUT OF ORDER - incoming packet = 558280 and received packet = 0 AND SP = 558343
iperf3: OUT OF ORDER - incoming packet = 558281 and received packet = 0 AND SP = 558343
iperf3: OUT OF ORDER - incoming packet = 587445 and received packet = 0 AND SP = 587508
iperf3: OUT OF ORDER - incoming packet = 587446 and received packet = 0 AND SP = 587508
[  5]   6.00-7.00   sec   102 MBytes   854 Mbits/sec  0.016 ms  8580/84843 (10%)  
iperf3: OUT OF ORDER - incoming packet = 617324 and received packet = 0 AND SP = 617387
iperf3: OUT OF ORDER - incoming packet = 620412 and received packet = 0 AND SP = 620475
iperf3: OUT OF ORDER - incoming packet = 635845 and received packet = 0 AND SP = 635908
iperf3: OUT OF ORDER - incoming packet = 657102 and received packet = 0 AND SP = 657165
iperf3: OUT OF ORDER - incoming packet = 670672 and received packet = 0 AND SP = 670735
iperf3: OUT OF ORDER - incoming packet = 670673 and received packet = 0 AND SP = 670735
[  5]   7.00-8.00   sec   102 MBytes   853 Mbits/sec  0.012 ms  8582/84757 (10%)  
iperf3: OUT OF ORDER - incoming packet = 676529 and received packet = 0 AND SP = 676592
iperf3: OUT OF ORDER - incoming packet = 682693 and received packet = 0 AND SP = 682756
iperf3: OUT OF ORDER - incoming packet = 682694 and received packet = 0 AND SP = 682756
iperf3: OUT OF ORDER - incoming packet = 690106 and received packet = 0 AND SP = 690169
iperf3: OUT OF ORDER - incoming packet = 694369 and received packet = 0 AND SP = 694432
iperf3: OUT OF ORDER - incoming packet = 715288 and received packet = 0 AND SP = 715351
iperf3: OUT OF ORDER - incoming packet = 715289 and received packet = 0 AND SP = 715351
iperf3: OUT OF ORDER - incoming packet = 731162 and received packet = 0 AND SP = 731225
iperf3: OUT OF ORDER - incoming packet = 731163 and received packet = 0 AND SP = 731225
iperf3: OUT OF ORDER - incoming packet = 750282 and received packet = 0 AND SP = 750345
iperf3: OUT OF ORDER - incoming packet = 752731 and received packet = 0 AND SP = 752794
[  5]   8.00-9.00   sec   102 MBytes   853 Mbits/sec  0.011 ms  8651/84784 (10%)  
iperf3: OUT OF ORDER - incoming packet = 786352 and received packet = 0 AND SP = 786415
iperf3: OUT OF ORDER - incoming packet = 786353 and received packet = 0 AND SP = 786415
iperf3: OUT OF ORDER - incoming packet = 801154 and received packet = 0 AND SP = 801217
iperf3: OUT OF ORDER - incoming packet = 803611 and received packet = 0 AND SP = 803674
iperf3: OUT OF ORDER - incoming packet = 803612 and received packet = 0 AND SP = 803674
iperf3: OUT OF ORDER - incoming packet = 809297 and received packet = 0 AND SP = 809360
iperf3: OUT OF ORDER - incoming packet = 821889 and received packet = 0 AND SP = 821952
iperf3: OUT OF ORDER - incoming packet = 828216 and received packet = 0 AND SP = 828279
iperf3: OUT OF ORDER - incoming packet = 828217 and received packet = 0 AND SP = 828279
iperf3: OUT OF ORDER - incoming packet = 830446 and received packet = 0 AND SP = 830509
[  5]   9.00-10.00  sec   102 MBytes   854 Mbits/sec  0.010 ms  8590/84807 (10%)  
iperf3: OUT OF ORDER - incoming packet = 863210 and received packet = 0 AND SP = 863273
iperf3: OUT OF ORDER - incoming packet = 863211 and received packet = 0 AND SP = 863273
iperf3: OUT OF ORDER - incoming packet = 870942 and received packet = 0 AND SP = 871005
iperf3: OUT OF ORDER - incoming packet = 870943 and received packet = 0 AND SP = 871005
iperf3: OUT OF ORDER - incoming packet = 900170 and received packet = 0 AND SP = 900233
iperf3: OUT OF ORDER - incoming packet = 919395 and received packet = 0 AND SP = 919458
iperf3: OUT OF ORDER - incoming packet = 919396 and received packet = 0 AND SP = 919458
[  5]  10.00-11.00  sec   102 MBytes   855 Mbits/sec  0.013 ms  8455/84823 (10%)  
iperf3: OUT OF ORDER - incoming packet = 973534 and received packet = 0 AND SP = 973597
iperf3: OUT OF ORDER - incoming packet = 1011487 and received packet = 0 AND SP = 1011550
[  5]  11.00-12.00  sec   102 MBytes   856 Mbits/sec  0.015 ms  8322/84753 (9.8%)  
iperf3: OUT OF ORDER - incoming packet = 1050138 and received packet = 0 AND SP = 1050201
iperf3: OUT OF ORDER - incoming packet = 1050139 and received packet = 0 AND SP = 1050201
[  5]  12.00-13.00  sec   102 MBytes   855 Mbits/sec  0.010 ms  8450/84812 (10%)  
iperf3: OUT OF ORDER - incoming packet = 1099930 and received packet = 0 AND SP = 1099993
iperf3: OUT OF ORDER - incoming packet = 1123640 and received packet = 0 AND SP = 1123703
iperf3: OUT OF ORDER - incoming packet = 1123641 and received packet = 0 AND SP = 1123703
iperf3: OUT OF ORDER - incoming packet = 1132657 and received packet = 0 AND SP = 1132720
iperf3: OUT OF ORDER - incoming packet = 1132658 and received packet = 0 AND SP = 1132720
iperf3: OUT OF ORDER - incoming packet = 1149744 and received packet = 0 AND SP = 1149807
iperf3: OUT OF ORDER - incoming packet = 1149745 and received packet = 0 AND SP = 1149807
iperf3: OUT OF ORDER - incoming packet = 1165178 and received packet = 0 AND SP = 1165241
[  5]  13.00-14.00  sec   102 MBytes   855 Mbits/sec  0.010 ms  8509/84818 (10%)  
iperf3: OUT OF ORDER - incoming packet = 1186134 and received packet = 0 AND SP = 1186197
iperf3: OUT OF ORDER - incoming packet = 1191699 and received packet = 0 AND SP = 1191762
iperf3: OUT OF ORDER - incoming packet = 1206111 and received packet = 0 AND SP = 1206174
iperf3: OUT OF ORDER - incoming packet = 1206112 and received packet = 0 AND SP = 1206174
iperf3: OUT OF ORDER - incoming packet = 1210586 and received packet = 0 AND SP = 1210649
[  5]  14.00-15.00  sec   102 MBytes   855 Mbits/sec  0.022 ms  8460/84786 (10%)  
iperf3: OUT OF ORDER - incoming packet = 1307589 and received packet = 0 AND SP = 1307652
iperf3: OUT OF ORDER - incoming packet = 1314198 and received packet = 0 AND SP = 1314261
iperf3: OUT OF ORDER - incoming packet = 1345142 and received packet = 0 AND SP = 1345205
[  5]  15.00-16.00  sec   102 MBytes   857 Mbits/sec  0.011 ms  8248/84782 (9.7%)  
iperf3: OUT OF ORDER - incoming packet = 1359295 and received packet = 0 AND SP = 1359358
iperf3: OUT OF ORDER - incoming packet = 1413394 and received packet = 0 AND SP = 1413457
iperf3: OUT OF ORDER - incoming packet = 1413395 and received packet = 0 AND SP = 1413457
iperf3: OUT OF ORDER - incoming packet = 1423771 and received packet = 0 AND SP = 1423834
iperf3: OUT OF ORDER - incoming packet = 1423772 and received packet = 0 AND SP = 1423834
[  5]  16.00-17.00  sec   102 MBytes   858 Mbits/sec  0.018 ms  8197/84768 (9.7%)  
iperf3: OUT OF ORDER - incoming packet = 1441628 and received packet = 0 AND SP = 1441691
iperf3: OUT OF ORDER - incoming packet = 1500381 and received packet = 0 AND SP = 1500444
iperf3: OUT OF ORDER - incoming packet = 1500382 and received packet = 0 AND SP = 1500444
[  5]  17.00-18.00  sec   102 MBytes   856 Mbits/sec  0.012 ms  8387/84849 (9.9%)  
iperf3: OUT OF ORDER - incoming packet = 1524208 and received packet = 0 AND SP = 1524271
iperf3: OUT OF ORDER - incoming packet = 1525402 and received packet = 0 AND SP = 1525465
iperf3: OUT OF ORDER - incoming packet = 1525403 and received packet = 0 AND SP = 1525465
iperf3: OUT OF ORDER - incoming packet = 1549050 and received packet = 0 AND SP = 1549113
iperf3: OUT OF ORDER - incoming packet = 1549051 and received packet = 0 AND SP = 1549113
iperf3: OUT OF ORDER - incoming packet = 1581586 and received packet = 0 AND SP = 1581649
iperf3: OUT OF ORDER - incoming packet = 1581587 and received packet = 0 AND SP = 1581649
[  5]  18.00-19.00  sec   102 MBytes   854 Mbits/sec  0.012 ms  8519/84776 (10%)  
iperf3: OUT OF ORDER - incoming packet = 1614525 and received packet = 0 AND SP = 1614588
iperf3: OUT OF ORDER - incoming packet = 1614526 and received packet = 0 AND SP = 1614588
iperf3: OUT OF ORDER - incoming packet = 1631851 and received packet = 0 AND SP = 1631914
iperf3: OUT OF ORDER - incoming packet = 1631852 and received packet = 0 AND SP = 1631914
iperf3: OUT OF ORDER - incoming packet = 1632403 and received packet = 0 AND SP = 1632466
iperf3: OUT OF ORDER - incoming packet = 1668834 and received packet = 0 AND SP = 1668897
[  5]  19.00-20.00  sec   102 MBytes   854 Mbits/sec  0.012 ms  8507/84777 (10%)  
iperf3: OUT OF ORDER - incoming packet = 1710901 and received packet = 0 AND SP = 1710964
iperf3: OUT OF ORDER - incoming packet = 1710902 and received packet = 0 AND SP = 1710964
iperf3: OUT OF ORDER - incoming packet = 1743216 and received packet = 0 AND SP = 1743279
iperf3: OUT OF ORDER - incoming packet = 1743217 and received packet = 0 AND SP = 1743279
[  5]  20.00-21.00  sec   102 MBytes   856 Mbits/sec  0.011 ms  8452/84842 (10%)  
iperf3: OUT OF ORDER - incoming packet = 1801411 and received packet = 0 AND SP = 1801474
iperf3: OUT OF ORDER - incoming packet = 1801412 and received packet = 0 AND SP = 1801474
iperf3: OUT OF ORDER - incoming packet = 1828046 and received packet = 0 AND SP = 1828109
[  5]  21.00-22.00  sec   102 MBytes   855 Mbits/sec  0.013 ms  8451/84819 (10%)  
iperf3: OUT OF ORDER - incoming packet = 1872009 and received packet = 0 AND SP = 1872072
iperf3: OUT OF ORDER - incoming packet = 1907637 and received packet = 0 AND SP = 1907700
iperf3: OUT OF ORDER - incoming packet = 1920279 and received packet = 0 AND SP = 1920342
iperf3: OUT OF ORDER - incoming packet = 1920280 and received packet = 0 AND SP = 1920342
[  5]  22.00-23.00  sec   102 MBytes   853 Mbits/sec  0.017 ms  8633/84757 (10%)  
iperf3: OUT OF ORDER - incoming packet = 1958806 and received packet = 0 AND SP = 1958869
iperf3: OUT OF ORDER - incoming packet = 1960612 and received packet = 0 AND SP = 1960675
iperf3: OUT OF ORDER - incoming packet = 1960613 and received packet = 0 AND SP = 1960675
iperf3: OUT OF ORDER - incoming packet = 1971512 and received packet = 0 AND SP = 1971575
iperf3: OUT OF ORDER - incoming packet = 1971513 and received packet = 0 AND SP = 1971575
iperf3: OUT OF ORDER - incoming packet = 1974598 and received packet = 0 AND SP = 1974661
iperf3: OUT OF ORDER - incoming packet = 1974599 and received packet = 0 AND SP = 1974661
iperf3: OUT OF ORDER - incoming packet = 2020663 and received packet = 0 AND SP = 2020726
[  5]  23.00-24.00  sec   102 MBytes   854 Mbits/sec  0.012 ms  8583/84833 (10%)  
iperf3: OUT OF ORDER - incoming packet = 2048932 and received packet = 0 AND SP = 2048995
iperf3: OUT OF ORDER - incoming packet = 2059769 and received packet = 0 AND SP = 2059832
iperf3: OUT OF ORDER - incoming packet = 2069989 and received packet = 0 AND SP = 2070052
iperf3: OUT OF ORDER - incoming packet = 2069990 and received packet = 0 AND SP = 2070052
iperf3: OUT OF ORDER - incoming packet = 2073023 and received packet = 0 AND SP = 2073086
iperf3: OUT OF ORDER - incoming packet = 2083300 and received packet = 0 AND SP = 2083363
iperf3: OUT OF ORDER - incoming packet = 2083301 and received packet = 0 AND SP = 2083363
iperf3: OUT OF ORDER - incoming packet = 2087583 and received packet = 0 AND SP = 2087646
iperf3: OUT OF ORDER - incoming packet = 2089400 and received packet = 0 AND SP = 2089463
iperf3: OUT OF ORDER - incoming packet = 2089401 and received packet = 0 AND SP = 2089463
iperf3: OUT OF ORDER - incoming packet = 2091835 and received packet = 0 AND SP = 2091898
iperf3: OUT OF ORDER - incoming packet = 2091836 and received packet = 0 AND SP = 2091898
iperf3: OUT OF ORDER - incoming packet = 2112610 and received packet = 0 AND SP = 2112673
[  5]  24.00-25.00  sec   102 MBytes   853 Mbits/sec  0.012 ms  8589/84768 (10%)  
iperf3: OUT OF ORDER - incoming packet = 2167633 and received packet = 0 AND SP = 2167696
iperf3: OUT OF ORDER - incoming packet = 2177652 and received packet = 0 AND SP = 2177715
iperf3: OUT OF ORDER - incoming packet = 2177653 and received packet = 0 AND SP = 2177715
[  5]  25.00-26.00  sec   102 MBytes   854 Mbits/sec  0.022 ms  8633/84853 (10%)  
iperf3: OUT OF ORDER - incoming packet = 2267485 and received packet = 0 AND SP = 2267548
iperf3: OUT OF ORDER - incoming packet = 2267486 and received packet = 0 AND SP = 2267548
[  5]  26.00-27.00  sec   102 MBytes   857 Mbits/sec  0.011 ms  8258/84791 (9.7%)  
iperf3: OUT OF ORDER - incoming packet = 2287486 and received packet = 0 AND SP = 2287549
iperf3: OUT OF ORDER - incoming packet = 2287487 and received packet = 0 AND SP = 2287549
iperf3: OUT OF ORDER - incoming packet = 2308286 and received packet = 0 AND SP = 2308349
iperf3: OUT OF ORDER - incoming packet = 2347272 and received packet = 0 AND SP = 2347335
iperf3: OUT OF ORDER - incoming packet = 2347273 and received packet = 0 AND SP = 2347335
iperf3: OUT OF ORDER - incoming packet = 2354962 and received packet = 0 AND SP = 2355025
iperf3: OUT OF ORDER - incoming packet = 2354963 and received packet = 0 AND SP = 2355025
iperf3: OUT OF ORDER - incoming packet = 2369466 and received packet = 0 AND SP = 2369529
[  5]  27.00-28.00  sec   102 MBytes   856 Mbits/sec  0.014 ms  8317/84758 (9.8%)  
iperf3: OUT OF ORDER - incoming packet = 2380936 and received packet = 0 AND SP = 2380999
iperf3: OUT OF ORDER - incoming packet = 2380937 and received packet = 0 AND SP = 2380999
iperf3: OUT OF ORDER - incoming packet = 2384122 and received packet = 0 AND SP = 2384185
iperf3: OUT OF ORDER - incoming packet = 2398106 and received packet = 0 AND SP = 2398169
iperf3: OUT OF ORDER - incoming packet = 2407507 and received packet = 0 AND SP = 2407570
[  5]  28.00-29.00  sec   102 MBytes   858 Mbits/sec  0.011 ms  8261/84849 (9.7%)  
iperf3: OUT OF ORDER - incoming packet = 2463519 and received packet = 0 AND SP = 2463582
iperf3: OUT OF ORDER - incoming packet = 2463520 and received packet = 0 AND SP = 2463582
iperf3: OUT OF ORDER - incoming packet = 2490271 and received packet = 0 AND SP = 2490334
iperf3: OUT OF ORDER - incoming packet = 2490272 and received packet = 0 AND SP = 2490334
iperf3: OUT OF ORDER - incoming packet = 2498122 and received packet = 0 AND SP = 2498185
iperf3: OUT OF ORDER - incoming packet = 2514674 and received packet = 0 AND SP = 2514737
iperf3: OUT OF ORDER - incoming packet = 2514675 and received packet = 0 AND SP = 2514737
iperf3: OUT OF ORDER - incoming packet = 2517846 and received packet = 0 AND SP = 2517909
[  5]  29.00-30.00  sec   102 MBytes   857 Mbits/sec  0.011 ms  8262/84806 (9.7%)  
iperf3: OUT OF ORDER - incoming packet = 2546398 and received packet = 0 AND SP = 2546461
iperf3: OUT OF ORDER - incoming packet = 2546399 and received packet = 0 AND SP = 2546461
iperf3: OUT OF ORDER - incoming packet = 2558838 and received packet = 0 AND SP = 2558901
iperf3: OUT OF ORDER - incoming packet = 2572958 and received packet = 0 AND SP = 2573021
iperf3: OUT OF ORDER - incoming packet = 2572959 and received packet = 0 AND SP = 2573021
iperf3: OUT OF ORDER - incoming packet = 2586545 and received packet = 0 AND SP = 2586608
iperf3: OUT OF ORDER - incoming packet = 2586546 and received packet = 0 AND SP = 2586608
iperf3: OUT OF ORDER - incoming packet = 2621613 and received packet = 0 AND SP = 2621676
[  5]  30.00-31.00  sec   102 MBytes   858 Mbits/sec  0.011 ms  8200/84785 (9.7%)  
iperf3: OUT OF ORDER - incoming packet = 2629982 and received packet = 0 AND SP = 2630045
iperf3: OUT OF ORDER - incoming packet = 2629983 and received packet = 0 AND SP = 2630045
iperf3: OUT OF ORDER - incoming packet = 2672925 and received packet = 0 AND SP = 2672988
iperf3: OUT OF ORDER - incoming packet = 2676140 and received packet = 0 AND SP = 2676203
iperf3: OUT OF ORDER - incoming packet = 2688740 and received packet = 0 AND SP = 2688803
iperf3: OUT OF ORDER - incoming packet = 2688741 and received packet = 0 AND SP = 2688803
iperf3: OUT OF ORDER - incoming packet = 2697968 and received packet = 0 AND SP = 2698031
[  5]  31.00-32.00  sec   102 MBytes   858 Mbits/sec  0.011 ms  8199/84780 (9.7%)  
iperf3: OUT OF ORDER - incoming packet = 2764257 and received packet = 0 AND SP = 2764320
iperf3: OUT OF ORDER - incoming packet = 2770960 and received packet = 0 AND SP = 2771023
iperf3: OUT OF ORDER - incoming packet = 2783857 and received packet = 0 AND SP = 2783920
[  5]  32.00-33.00  sec   102 MBytes   852 Mbits/sec  0.014 ms  8771/84809 (10%)  
[  5]  33.00-34.00  sec   101 MBytes   849 Mbits/sec  0.011 ms  8707/84542 (10%)  
iperf3: OUT OF ORDER - incoming packet = 2913446 and received packet = 0 AND SP = 2913509
iperf3: OUT OF ORDER - incoming packet = 2927697 and received packet = 0 AND SP = 2927760
iperf3: OUT OF ORDER - incoming packet = 2927698 and received packet = 0 AND SP = 2927760
[  5]  34.00-35.00  sec   102 MBytes   852 Mbits/sec  0.012 ms  8771/84799 (10%)  
iperf3: OUT OF ORDER - incoming packet = 3038091 and received packet = 0 AND SP = 3038250
iperf3: OUT OF ORDER - incoming packet = 3038092 and received packet = 0 AND SP = 3038250
iperf3: OUT OF ORDER - incoming packet = 3038093 and received packet = 0 AND SP = 3038250
iperf3: OUT OF ORDER - incoming packet = 3038094 and received packet = 0 AND SP = 3038250
iperf3: OUT OF ORDER - incoming packet = 3038095 and received packet = 0 AND SP = 3038250
iperf3: OUT OF ORDER - incoming packet = 3038096 and received packet = 0 AND SP = 3038250
iperf3: OUT OF ORDER - incoming packet = 3038097 and received packet = 0 AND SP = 3038250
iperf3: OUT OF ORDER - incoming packet = 3038098 and received packet = 0 AND SP = 3038250
[  5]  35.00-36.00  sec  88.0 MBytes   738 Mbits/sec  0.087 ms  7732/73605 (11%)  
[  5]  36.00-37.00  sec  0.00 Bytes  0.00 bits/sec  0.087 ms  0/0 (0%)  
[  5]  37.00-38.00  sec  0.00 Bytes  0.00 bits/sec  0.087 ms  0/0 (0%)  
...
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Jitter    Lost/Total Datagrams
[  5]   0.00-42.30  sec  0.00 Bytes  0.00 bits/sec  0.087 ms  306281/3038250 (10%)  
[SUM]  0.0-42.3 sec  189 datagrams received out-of-order
iperf3: interrupt - the server has terminated




Lowering the sender's bitrate to 500 Mbps also leads to an
even faster lock up.

iperf3: OUT OF ORDER - incoming packet = 34566 and received packet = 0 AND SP = 34629
[ ID] Interval           Transfer     Bandwidth       Jitter    Lost/Total Datagrams
[  5]   0.00-1.00   sec  49.5 MBytes   415 Mbits/sec  0.010 ms  3947/41030 (9.6%)  
iperf3: OUT OF ORDER - incoming packet = 76093 and received packet = 0 AND SP = 76156
iperf3: OUT OF ORDER - incoming packet = 76094 and received packet = 0 AND SP = 76156
[  5]   1.00-2.00   sec  54.7 MBytes   459 Mbits/sec  0.011 ms  4019/44998 (8.9%)  
iperf3: OUT OF ORDER - incoming packet = 98670 and received packet = 0 AND SP = 98733
iperf3: OUT OF ORDER - incoming packet = 98671 and received packet = 0 AND SP = 98733
iperf3: OUT OF ORDER - incoming packet = 98672 and received packet = 0 AND SP = 98733
iperf3: OUT OF ORDER - incoming packet = 98673 and received packet = 0 AND SP = 98733
iperf3: OUT OF ORDER - incoming packet = 98674 and received packet = 0 AND SP = 98733
iperf3: OUT OF ORDER - incoming packet = 98675 and received packet = 0 AND SP = 98733
iperf3: OUT OF ORDER - incoming packet = 98676 and received packet = 0 AND SP = 98733
iperf3: OUT OF ORDER - incoming packet = 98677 and received packet = 0 AND SP = 98733
iperf3: OUT OF ORDER - incoming packet = 98582 and received packet = 0 AND SP = 98733
iperf3: OUT OF ORDER - incoming packet = 98583 and received packet = 0 AND SP = 98733
iperf3: OUT OF ORDER - incoming packet = 98584 and received packet = 0 AND SP = 98733
iperf3: OUT OF ORDER - incoming packet = 98585 and received packet = 0 AND SP = 98733
iperf3: OUT OF ORDER - incoming packet = 98586 and received packet = 0 AND SP = 98733
iperf3: OUT OF ORDER - incoming packet = 98587 and received packet = 0 AND SP = 98733
iperf3: OUT OF ORDER - incoming packet = 98588 and received packet = 0 AND SP = 98733
iperf3: OUT OF ORDER - incoming packet = 98589 and received packet = 0 AND SP = 98733
iperf3: OUT OF ORDER - incoming packet = 98590 and received packet = 0 AND SP = 98733
iperf3: OUT OF ORDER - incoming packet = 98591 and received packet = 0 AND SP = 98733
iperf3: OUT OF ORDER - incoming packet = 98592 and received packet = 0 AND SP = 98733
iperf3: OUT OF ORDER - incoming packet = 98593 and received packet = 0 AND SP = 98733
iperf3: OUT OF ORDER - incoming packet = 98594 and received packet = 0 AND SP = 98733
iperf3: OUT OF ORDER - incoming packet = 98595 and received packet = 0 AND SP = 98733
iperf3: OUT OF ORDER - incoming packet = 98596 and received packet = 0 AND SP = 98733
iperf3: OUT OF ORDER - incoming packet = 98597 and received packet = 0 AND SP = 98733
iperf3: OUT OF ORDER - incoming packet = 98598 and received packet = 0 AND SP = 98733
[  5]   2.00-3.00   sec  15.4 MBytes   129 Mbits/sec  0.049 ms  1160/12705 (9.1%)  
[  5]   3.00-4.00   sec  0.00 Bytes  0.00 bits/sec  0.049 ms  0/0 (0%)  
[  5]   4.00-5.00   sec  0.00 Bytes  0.00 bits/sec  0.049 ms  0/0 (0%)  
...
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Jitter    Lost/Total Datagrams
[  5]   0.00-10.71  sec  0.00 Bytes  0.00 bits/sec  0.049 ms  9126/98733 (9.2%)  
[SUM]  0.0-10.7 sec  28 datagrams received out-of-order
iperf3: interrupt - the server has terminated



Even faster with sender @ 200 Mbps

iperf3: OUT OF ORDER - incoming packet = 3451 and received packet = 0 AND SP = 3514
iperf3: OUT OF ORDER - incoming packet = 3452 and received packet = 0 AND SP = 3514
iperf3: OUT OF ORDER - incoming packet = 8736 and received packet = 0 AND SP = 8799
[   15.299039] random: crng init done
iperf3: OUT OF ORDER - incoming packet = 12813 and received packet = 0 AND SP = 12876
iperf3: OUT OF ORDER - incoming packet = 12814 and received packet = 0 AND SP = 12876
iperf3: OUT OF ORDER - incoming packet = 12815 and received packet = 0 AND SP = 12876
iperf3: OUT OF ORDER - incoming packet = 12816 and received packet = 0 AND SP = 12876
iperf3: OUT OF ORDER - incoming packet = 12817 and received packet = 0 AND SP = 12876
iperf3: OUT OF ORDER - incoming packet = 12818 and received packet = 0 AND SP = 12876
iperf3: OUT OF ORDER - incoming packet = 12819 and received packet = 0 AND SP = 12876
iperf3: OUT OF ORDER - incoming packet = 12820 and received packet = 0 AND SP = 12876
iperf3: OUT OF ORDER - incoming packet = 12821 and received packet = 0 AND SP = 12876
[ ID] Interval           Transfer     Bandwidth       Jitter    Lost/Total Datagrams
[  5]   0.00-1.00   sec  15.5 MBytes   130 Mbits/sec  0.051 ms  1267/12876 (9.8%)  
[  5]   1.00-2.00   sec  0.00 Bytes  0.00 bits/sec  0.051 ms  0/0 (0%)  
[  5]   2.00-3.00   sec  0.00 Bytes  0.00 bits/sec  0.051 ms  0/0 (0%)  
...
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Jitter    Lost/Total Datagrams
[  5]   0.00-7.58   sec  0.00 Bytes  0.00 bits/sec  0.051 ms  1267/12876 (9.8%)  
[SUM]  0.0- 7.6 sec  12 datagrams received out-of-order
iperf3: interrupt - the server has terminated


Same problem with sender @ 80 Mbps


Increasing RX descriptors to 128... Cannot reproduce anymore...

Setting RX descriptors back to 64...

Keep digging...

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 3/4] net: nb8800: Move HW init to ndo_open()
  2017-11-16 12:21                     ` Marc Gonzalez
@ 2017-11-16 16:23                       ` Andrew Lunn
  -1 siblings, 0 replies; 70+ messages in thread
From: Andrew Lunn @ 2017-11-16 16:23 UTC (permalink / raw)
  To: Marc Gonzalez
  Cc: netdev, Florian Fainelli, Mason, Thibaud Cornic, Mans Rullgard,
	David Miller, Linux ARM

> I'm starting to think there is some kind of race condition between
> SW and HW handling of descriptors. This might also explain the
> out-of-order warnings.

Hi Marc

Maybe take a look at your memory barriers. Most accesses using the
_relaxed() version, i.e, no barrier. And then there are specific
barriers when needed. One could be missing.

As a quick test, drop the _relaxed. Force a barrier with each
access. If that works, it is a clear indication you have a barrier
problem.

	Andrew

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH v3 3/4] net: nb8800: Move HW init to ndo_open()
@ 2017-11-16 16:23                       ` Andrew Lunn
  0 siblings, 0 replies; 70+ messages in thread
From: Andrew Lunn @ 2017-11-16 16:23 UTC (permalink / raw)
  To: linux-arm-kernel

> I'm starting to think there is some kind of race condition between
> SW and HW handling of descriptors. This might also explain the
> out-of-order warnings.

Hi Marc

Maybe take a look at your memory barriers. Most accesses using the
_relaxed() version, i.e, no barrier. And then there are specific
barriers when needed. One could be missing.

As a quick test, drop the _relaxed. Force a barrier with each
access. If that works, it is a clear indication you have a barrier
problem.

	Andrew

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 3/4] net: nb8800: Move HW init to ndo_open()
  2017-11-16 16:23                       ` Andrew Lunn
@ 2017-11-16 16:52                         ` Marc Gonzalez
  -1 siblings, 0 replies; 70+ messages in thread
From: Marc Gonzalez @ 2017-11-16 16:52 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: netdev, Florian Fainelli, Mason, Thibaud Cornic, Mans Rullgard,
	David Miller, Linux ARM

On 16/11/2017 17:23, Andrew Lunn wrote:

> Maybe take a look at your memory barriers. Most accesses using the
> _relaxed() version, i.e, no barrier. And then there are specific
> barriers when needed. One could be missing.
> 
> As a quick test, drop the _relaxed. Force a barrier with each
> access. If that works, it is a clear indication you have a barrier
> problem.

That was an interesting suggestion, thanks!

Unfortunately, adding wmb() in dozens of strategic places doesn't
prevent the issue where network connectivity is lost :-(

Regards.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH v3 3/4] net: nb8800: Move HW init to ndo_open()
@ 2017-11-16 16:52                         ` Marc Gonzalez
  0 siblings, 0 replies; 70+ messages in thread
From: Marc Gonzalez @ 2017-11-16 16:52 UTC (permalink / raw)
  To: linux-arm-kernel

On 16/11/2017 17:23, Andrew Lunn wrote:

> Maybe take a look at your memory barriers. Most accesses using the
> _relaxed() version, i.e, no barrier. And then there are specific
> barriers when needed. One could be missing.
> 
> As a quick test, drop the _relaxed. Force a barrier with each
> access. If that works, it is a clear indication you have a barrier
> problem.

That was an interesting suggestion, thanks!

Unfortunately, adding wmb() in dozens of strategic places doesn't
prevent the issue where network connectivity is lost :-(

Regards.

^ permalink raw reply	[flat|nested] 70+ messages in thread

end of thread, other threads:[~2017-11-16 16:52 UTC | newest]

Thread overview: 70+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-11-14 10:53 [PATCH v3 0/4] Various nb8800 tweaks Marc Gonzalez
2017-11-14 10:53 ` Marc Gonzalez
2017-11-14 10:54 ` [PATCH v3 1/4] net: nb8800: Drop generic support Marc Gonzalez
2017-11-14 10:54   ` Marc Gonzalez
2017-11-14 12:37   ` Måns Rullgård
2017-11-14 12:37     ` Måns Rullgård
2017-11-14 12:47     ` Marc Gonzalez
2017-11-14 12:47       ` Marc Gonzalez
2017-11-14 13:03       ` Måns Rullgård
2017-11-14 13:03         ` Måns Rullgård
2017-11-14 10:55 ` [PATCH v3 2/4] net: nb8800: Simplify nb8800_pause_config() Marc Gonzalez
2017-11-14 10:55   ` Marc Gonzalez
2017-11-14 12:38   ` Måns Rullgård
2017-11-14 12:38     ` Måns Rullgård
2017-11-14 12:56     ` Marc Gonzalez
2017-11-14 12:56       ` Marc Gonzalez
2017-11-14 13:22       ` Måns Rullgård
2017-11-14 13:22         ` Måns Rullgård
2017-11-15 10:53         ` Marc Gonzalez
2017-11-15 10:53           ` Marc Gonzalez
2017-11-15 14:17           ` Andrew Lunn
2017-11-15 14:17             ` Andrew Lunn
2017-11-15 14:33             ` Marc Gonzalez
2017-11-15 14:33               ` Marc Gonzalez
2017-11-15 15:03               ` Andrew Lunn
2017-11-15 15:03                 ` Andrew Lunn
2017-11-15 15:19                 ` Marc Gonzalez
2017-11-15 15:19                   ` Marc Gonzalez
2017-11-15 15:36                   ` Måns Rullgård
2017-11-15 15:36                     ` Måns Rullgård
2017-11-15 21:12                   ` Andrew Lunn
2017-11-15 21:12                     ` Andrew Lunn
2017-11-14 10:56 ` [PATCH v3 3/4] net: nb8800: Move HW init to ndo_open() Marc Gonzalez
2017-11-14 10:56   ` Marc Gonzalez
2017-11-14 12:40   ` Måns Rullgård
2017-11-14 12:40     ` Måns Rullgård
2017-11-14 13:26     ` Marc Gonzalez
2017-11-14 13:26       ` Marc Gonzalez
2017-11-14 13:54       ` Måns Rullgård
2017-11-14 13:54         ` Måns Rullgård
2017-11-14 16:41         ` Marc Gonzalez
2017-11-14 16:41           ` Marc Gonzalez
2017-11-14 16:55           ` Måns Rullgård
2017-11-14 16:55             ` Måns Rullgård
2017-11-14 17:07             ` Marc Gonzalez
2017-11-14 17:07               ` Marc Gonzalez
2017-11-15 14:58             ` Marc Gonzalez
2017-11-15 14:58               ` Marc Gonzalez
2017-11-15 15:11               ` Måns Rullgård
2017-11-15 15:11                 ` Måns Rullgård
2017-11-15 16:15                 ` Marc Gonzalez
2017-11-15 16:15                   ` Marc Gonzalez
2017-11-16 12:21                   ` Marc Gonzalez
2017-11-16 12:21                     ` Marc Gonzalez
2017-11-16 16:23                     ` Andrew Lunn
2017-11-16 16:23                       ` Andrew Lunn
2017-11-16 16:52                       ` Marc Gonzalez
2017-11-16 16:52                         ` Marc Gonzalez
2017-11-14 12:04 ` [PATCH v3 4/4] net: nb8800: Add support for suspend/resume Marc Gonzalez
2017-11-14 12:04   ` Marc Gonzalez
2017-11-14 13:02   ` Måns Rullgård
2017-11-14 13:02     ` Måns Rullgård
2017-11-14 14:22     ` Marc Gonzalez
2017-11-14 14:22       ` Marc Gonzalez
2017-11-14 16:31       ` Andrew Lunn
2017-11-14 16:31         ` Andrew Lunn
2017-11-14 17:08         ` Marc Gonzalez
2017-11-14 17:08           ` Marc Gonzalez
2017-11-14 17:33           ` Andrew Lunn
2017-11-14 17:33             ` Andrew Lunn

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.