linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next 0/4] atl1c: support for Mikrotik 10/25G NIC features
@ 2021-05-11 19:05 Gatis Peisenieks
  2021-05-11 19:05 ` [PATCH net-next 1/4] atl1c: show correct link speed on Mikrotik 10/25G NIC Gatis Peisenieks
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Gatis Peisenieks @ 2021-05-11 19:05 UTC (permalink / raw)
  To: chris.snook, davem, kuba, hkallweit1, jesse.brandeburg,
	dchickles, tully, eric.dumazet
  Cc: netdev, linux-kernel, Gatis Peisenieks

The new Mikrotik 10/25G NIC maintains compatibility with existing atl1c
driver. However it does have new features.

This patch set adds support for reporting cards higher link speed, max-mtu,
enables rx csum offload and improves tx performance.

Gatis Peisenieks (4):
  atl1c: show correct link speed on Mikrotik 10/25G NIC
  atl1c: improve performance by avoiding unnecessary pcie writes on xmit
  atl1c: adjust max mtu according to Mikrotik 10/25G NIC ability
  atl1c: enable rx csum offload on Mikrotik 10/25G NIC

 drivers/net/ethernet/atheros/atl1c/atl1c.h    |  3 ++
 drivers/net/ethernet/atheros/atl1c/atl1c_hw.c |  9 +++++
 drivers/net/ethernet/atheros/atl1c/atl1c_hw.h |  7 ++++
 .../net/ethernet/atheros/atl1c/atl1c_main.c   | 33 +++++++++++++++----
 4 files changed, 46 insertions(+), 6 deletions(-)


base-commit: 3913ba732e972d88ebc391323999e780a9295852
-- 
2.31.1


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH net-next 1/4] atl1c: show correct link speed on Mikrotik 10/25G NIC
  2021-05-11 19:05 [PATCH net-next 0/4] atl1c: support for Mikrotik 10/25G NIC features Gatis Peisenieks
@ 2021-05-11 19:05 ` Gatis Peisenieks
  2021-05-11 19:05 ` [PATCH net-next 2/4] atl1c: improve performance by avoiding unnecessary pcie writes on xmit Gatis Peisenieks
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 11+ messages in thread
From: Gatis Peisenieks @ 2021-05-11 19:05 UTC (permalink / raw)
  To: chris.snook, davem, kuba, hkallweit1, jesse.brandeburg,
	dchickles, tully, eric.dumazet
  Cc: netdev, linux-kernel, Gatis Peisenieks

The new Mikrotik 10/25G NIC maintains compatibility with existing atl1c
driver. However it does have new features.

This defines some new register offsets, code for identifying the new type
of NIC and correct speed detection for the NIC.

Signed-off-by: Gatis Peisenieks <gatis@mikrotik.com>
---
 drivers/net/ethernet/atheros/atl1c/atl1c.h      | 1 +
 drivers/net/ethernet/atheros/atl1c/atl1c_hw.c   | 9 +++++++++
 drivers/net/ethernet/atheros/atl1c/atl1c_hw.h   | 7 +++++++
 drivers/net/ethernet/atheros/atl1c/atl1c_main.c | 4 ++++
 4 files changed, 21 insertions(+)

diff --git a/drivers/net/ethernet/atheros/atl1c/atl1c.h b/drivers/net/ethernet/atheros/atl1c/atl1c.h
index 28ae5c16831e..3fda7eb3bd69 100644
--- a/drivers/net/ethernet/atheros/atl1c/atl1c.h
+++ b/drivers/net/ethernet/atheros/atl1c/atl1c.h
@@ -289,6 +289,7 @@ enum atl1c_nic_type {
 	athr_l2c_b2,
 	athr_l1d,
 	athr_l1d_2,
+	athr_mt,
 };
 
 enum atl1c_trans_queue {
diff --git a/drivers/net/ethernet/atheros/atl1c/atl1c_hw.c b/drivers/net/ethernet/atheros/atl1c/atl1c_hw.c
index 140358dcf61e..ddb9442416cd 100644
--- a/drivers/net/ethernet/atheros/atl1c/atl1c_hw.c
+++ b/drivers/net/ethernet/atheros/atl1c/atl1c_hw.c
@@ -648,6 +648,15 @@ int atl1c_get_speed_and_duplex(struct atl1c_hw *hw, u16 *speed, u16 *duplex)
 	int err;
 	u16 phy_data;
 
+	if (hw->nic_type == athr_mt) {
+		u32 spd;
+
+		AT_READ_REG(hw, REG_MT_SPEED, &spd);
+		*speed = spd;
+		*duplex = FULL_DUPLEX;
+		return 0;
+	}
+
 	/* Read   PHY Specific Status Register (17) */
 	err = atl1c_read_phy_reg(hw, MII_GIGA_PSSR, &phy_data);
 	if (err)
diff --git a/drivers/net/ethernet/atheros/atl1c/atl1c_hw.h b/drivers/net/ethernet/atheros/atl1c/atl1c_hw.h
index ce1a123dce2c..73cbc049a63e 100644
--- a/drivers/net/ethernet/atheros/atl1c/atl1c_hw.h
+++ b/drivers/net/ethernet/atheros/atl1c/atl1c_hw.h
@@ -764,6 +764,13 @@ void atl1c_post_phy_linkchg(struct atl1c_hw *hw, u16 link_speed);
 #define REG_DEBUG_DATA0 		0x1900
 #define REG_DEBUG_DATA1 		0x1904
 
+#define REG_MT_MAGIC			0x1F00
+#define REG_MT_MODE			0x1F04
+#define REG_MT_SPEED			0x1F08
+#define REG_MT_VERSION			0x1F0C
+
+#define MT_MAGIC			0xaabb1234
+
 #define L1D_MPW_PHYID1			0xD01C  /* V7 */
 #define L1D_MPW_PHYID2			0xD01D  /* V1-V6 */
 #define L1D_MPW_PHYID3			0xD01E  /* V8 */
diff --git a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
index c6263cf8d3c0..28c30d5288e4 100644
--- a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
+++ b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
@@ -644,6 +644,7 @@ static int atl1c_alloc_queues(struct atl1c_adapter *adapter)
 
 static void atl1c_set_mac_type(struct atl1c_hw *hw)
 {
+	u32 magic;
 	switch (hw->device_id) {
 	case PCI_DEVICE_ID_ATTANSIC_L2C:
 		hw->nic_type = athr_l2c;
@@ -662,6 +663,9 @@ static void atl1c_set_mac_type(struct atl1c_hw *hw)
 		break;
 	case PCI_DEVICE_ID_ATHEROS_L1D_2_0:
 		hw->nic_type = athr_l1d_2;
+		AT_READ_REG(hw, REG_MT_MAGIC, &magic);
+		if (magic == MT_MAGIC)
+			hw->nic_type = athr_mt;
 		break;
 	default:
 		break;
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH net-next 2/4] atl1c: improve performance by avoiding unnecessary pcie writes on xmit
  2021-05-11 19:05 [PATCH net-next 0/4] atl1c: support for Mikrotik 10/25G NIC features Gatis Peisenieks
  2021-05-11 19:05 ` [PATCH net-next 1/4] atl1c: show correct link speed on Mikrotik 10/25G NIC Gatis Peisenieks
@ 2021-05-11 19:05 ` Gatis Peisenieks
  2021-05-11 21:39   ` Eric Dumazet
  2021-05-12  2:39   ` Chris Snook
  2021-05-11 19:05 ` [PATCH net-next 3/4] atl1c: adjust max mtu according to Mikrotik 10/25G NIC ability Gatis Peisenieks
  2021-05-11 19:05 ` [PATCH net-next 4/4] atl1c: enable rx csum offload on Mikrotik 10/25G NIC Gatis Peisenieks
  3 siblings, 2 replies; 11+ messages in thread
From: Gatis Peisenieks @ 2021-05-11 19:05 UTC (permalink / raw)
  To: chris.snook, davem, kuba, hkallweit1, jesse.brandeburg,
	dchickles, tully, eric.dumazet
  Cc: netdev, linux-kernel, Gatis Peisenieks

The kernel has xmit_more facility that hints the networking driver xmit
path about whether more packets are coming soon. This information can be
used to avoid unnecessary expensive PCIe transaction per tx packet at a
slight increase in latency.

Max TX pps on Mikrotik 10/25G NIC in a Threadripper 3960X system
improved from 1150Kpps to 1700Kpps.

Signed-off-by: Gatis Peisenieks <gatis@mikrotik.com>
---
 drivers/net/ethernet/atheros/atl1c/atl1c_main.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
index 28c30d5288e4..2a8ab51b0ed9 100644
--- a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
+++ b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
@@ -2211,8 +2211,8 @@ static int atl1c_tx_map(struct atl1c_adapter *adapter,
 	return -1;
 }
 
-static void atl1c_tx_queue(struct atl1c_adapter *adapter, struct sk_buff *skb,
-			   struct atl1c_tpd_desc *tpd, enum atl1c_trans_queue type)
+static void atl1c_tx_queue(struct atl1c_adapter *adapter,
+			   enum atl1c_trans_queue type)
 {
 	struct atl1c_tpd_ring *tpd_ring = &adapter->tpd_ring[type];
 	u16 reg;
@@ -2238,6 +2238,7 @@ static netdev_tx_t atl1c_xmit_frame(struct sk_buff *skb,
 
 	if (atl1c_tpd_avail(adapter, type) < tpd_req) {
 		/* no enough descriptor, just stop queue */
+		atl1c_tx_queue(adapter, type);
 		netif_stop_queue(netdev);
 		return NETDEV_TX_BUSY;
 	}
@@ -2246,6 +2247,7 @@ static netdev_tx_t atl1c_xmit_frame(struct sk_buff *skb,
 
 	/* do TSO and check sum */
 	if (atl1c_tso_csum(adapter, skb, &tpd, type) != 0) {
+		atl1c_tx_queue(adapter, type);
 		dev_kfree_skb_any(skb);
 		return NETDEV_TX_OK;
 	}
@@ -2270,8 +2272,11 @@ static netdev_tx_t atl1c_xmit_frame(struct sk_buff *skb,
 		atl1c_tx_rollback(adapter, tpd, type);
 		dev_kfree_skb_any(skb);
 	} else {
-		netdev_sent_queue(adapter->netdev, skb->len);
-		atl1c_tx_queue(adapter, skb, tpd, type);
+		bool more = netdev_xmit_more();
+
+		__netdev_sent_queue(adapter->netdev, skb->len, more);
+		if (!more)
+			atl1c_tx_queue(adapter, type);
 	}
 
 	return NETDEV_TX_OK;
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH net-next 3/4] atl1c: adjust max mtu according to Mikrotik 10/25G NIC ability
  2021-05-11 19:05 [PATCH net-next 0/4] atl1c: support for Mikrotik 10/25G NIC features Gatis Peisenieks
  2021-05-11 19:05 ` [PATCH net-next 1/4] atl1c: show correct link speed on Mikrotik 10/25G NIC Gatis Peisenieks
  2021-05-11 19:05 ` [PATCH net-next 2/4] atl1c: improve performance by avoiding unnecessary pcie writes on xmit Gatis Peisenieks
@ 2021-05-11 19:05 ` Gatis Peisenieks
  2021-05-11 19:05 ` [PATCH net-next 4/4] atl1c: enable rx csum offload on Mikrotik 10/25G NIC Gatis Peisenieks
  3 siblings, 0 replies; 11+ messages in thread
From: Gatis Peisenieks @ 2021-05-11 19:05 UTC (permalink / raw)
  To: chris.snook, davem, kuba, hkallweit1, jesse.brandeburg,
	dchickles, tully, eric.dumazet
  Cc: netdev, linux-kernel, Gatis Peisenieks

The new Mikrotik 10/25G NIC supports jumbo frames. Jumbo frames are
supported for TSO as well.

This enables the support for mtu up to 9500 bytes.

Signed-off-by: Gatis Peisenieks <gatis@mikrotik.com>
---
 drivers/net/ethernet/atheros/atl1c/atl1c_main.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
index 2a8ab51b0ed9..c34ad1c3a532 100644
--- a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
+++ b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
@@ -478,6 +478,8 @@ static void atl1c_set_rxbufsize(struct atl1c_adapter *adapter,
 static netdev_features_t atl1c_fix_features(struct net_device *netdev,
 	netdev_features_t features)
 {
+	struct atl1c_adapter *adapter = netdev_priv(netdev);
+	struct atl1c_hw *hw = &adapter->hw;
 	/*
 	 * Since there is no support for separate rx/tx vlan accel
 	 * enable/disable make sure tx flag is always in same state as rx.
@@ -487,8 +489,10 @@ static netdev_features_t atl1c_fix_features(struct net_device *netdev,
 	else
 		features &= ~NETIF_F_HW_VLAN_CTAG_TX;
 
-	if (netdev->mtu > MAX_TSO_FRAME_SIZE)
-		features &= ~(NETIF_F_TSO | NETIF_F_TSO6);
+	if (hw->nic_type != athr_mt) {
+		if (netdev->mtu > MAX_TSO_FRAME_SIZE)
+			features &= ~(NETIF_F_TSO | NETIF_F_TSO6);
+	}
 
 	return features;
 }
@@ -517,6 +521,9 @@ static void atl1c_set_max_mtu(struct net_device *netdev)
 		netdev->max_mtu = MAX_JUMBO_FRAME_SIZE -
 				  (ETH_HLEN + ETH_FCS_LEN + VLAN_HLEN);
 		break;
+	case athr_mt:
+		netdev->max_mtu = 9500;
+		break;
 	/* The 10/100 devices don't support jumbo packets, max_mtu 1500 */
 	default:
 		netdev->max_mtu = ETH_DATA_LEN;
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH net-next 4/4] atl1c: enable rx csum offload on Mikrotik 10/25G NIC
  2021-05-11 19:05 [PATCH net-next 0/4] atl1c: support for Mikrotik 10/25G NIC features Gatis Peisenieks
                   ` (2 preceding siblings ...)
  2021-05-11 19:05 ` [PATCH net-next 3/4] atl1c: adjust max mtu according to Mikrotik 10/25G NIC ability Gatis Peisenieks
@ 2021-05-11 19:05 ` Gatis Peisenieks
  3 siblings, 0 replies; 11+ messages in thread
From: Gatis Peisenieks @ 2021-05-11 19:05 UTC (permalink / raw)
  To: chris.snook, davem, kuba, hkallweit1, jesse.brandeburg,
	dchickles, tully, eric.dumazet
  Cc: netdev, linux-kernel, Gatis Peisenieks

Mikrotik 10/25G NIC supports hw checksum verification on rx for
IP/IPv6 + TCP/UDP packets. HW checksum offload helps reduce host
cpu load.

This enables the csum offload specifically for Mikrotik 10/25G NIC
as other HW supported by the driver is known to have problems with it.

TCP iperf3 to Threadripper 3960X with NIC improved 16.5 -> 20.0 Gbps
with mtu=1500.

Signed-off-by: Gatis Peisenieks <gatis@mikrotik.com>
---
 drivers/net/ethernet/atheros/atl1c/atl1c.h      | 2 ++
 drivers/net/ethernet/atheros/atl1c/atl1c_main.c | 5 +++++
 2 files changed, 7 insertions(+)

diff --git a/drivers/net/ethernet/atheros/atl1c/atl1c.h b/drivers/net/ethernet/atheros/atl1c/atl1c.h
index 3fda7eb3bd69..9d70cb7544f1 100644
--- a/drivers/net/ethernet/atheros/atl1c/atl1c.h
+++ b/drivers/net/ethernet/atheros/atl1c/atl1c.h
@@ -241,6 +241,8 @@ struct atl1c_tpd_ext_desc {
 #define RRS_PACKET_PROT_IS_IPV6_ONLY(word) \
 	((((word) >> RRS_PROT_ID_SHIFT) & RRS_PROT_ID_MASK) == 6)
 
+#define RRS_MT_PROT_ID_TCPUDP	BIT(19)
+
 struct atl1c_recv_ret_status {
 	__le32  word0;
 	__le32	rss_hash;
diff --git a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
index c34ad1c3a532..7bb5631d9922 100644
--- a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
+++ b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
@@ -1670,6 +1670,11 @@ static irqreturn_t atl1c_intr(int irq, void *data)
 static inline void atl1c_rx_checksum(struct atl1c_adapter *adapter,
 		  struct sk_buff *skb, struct atl1c_recv_ret_status *prrs)
 {
+	if (adapter->hw.nic_type == athr_mt) {
+		if (prrs->word3 & RRS_MT_PROT_ID_TCPUDP)
+			skb->ip_summed = CHECKSUM_UNNECESSARY;
+		return;
+	}
 	/*
 	 * The pid field in RRS in not correct sometimes, so we
 	 * cannot figure out if the packet is fragmented or not,
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH net-next 2/4] atl1c: improve performance by avoiding unnecessary pcie writes on xmit
  2021-05-11 19:05 ` [PATCH net-next 2/4] atl1c: improve performance by avoiding unnecessary pcie writes on xmit Gatis Peisenieks
@ 2021-05-11 21:39   ` Eric Dumazet
  2021-05-12  7:53     ` Gatis Peisenieks
  2021-05-12  2:39   ` Chris Snook
  1 sibling, 1 reply; 11+ messages in thread
From: Eric Dumazet @ 2021-05-11 21:39 UTC (permalink / raw)
  To: Gatis Peisenieks, chris.snook, davem, kuba, hkallweit1,
	jesse.brandeburg, dchickles, tully, eric.dumazet
  Cc: netdev, linux-kernel



On 5/11/21 9:05 PM, Gatis Peisenieks wrote:
> The kernel has xmit_more facility that hints the networking driver xmit
> path about whether more packets are coming soon. This information can be
> used to avoid unnecessary expensive PCIe transaction per tx packet at a
> slight increase in latency.
> 
> Max TX pps on Mikrotik 10/25G NIC in a Threadripper 3960X system
> improved from 1150Kpps to 1700Kpps.
> 
> Signed-off-by: Gatis Peisenieks <gatis@mikrotik.com>
> ---
>  drivers/net/ethernet/atheros/atl1c/atl1c_main.c | 13 +++++++++----
>  1 file changed, 9 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
> index 28c30d5288e4..2a8ab51b0ed9 100644
> --- a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
> +++ b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
> @@ -2211,8 +2211,8 @@ static int atl1c_tx_map(struct atl1c_adapter *adapter,
>  	return -1;
>  }
>  
> -static void atl1c_tx_queue(struct atl1c_adapter *adapter, struct sk_buff *skb,
> -			   struct atl1c_tpd_desc *tpd, enum atl1c_trans_queue type)
> +static void atl1c_tx_queue(struct atl1c_adapter *adapter,
> +			   enum atl1c_trans_queue type)
>  {
>  	struct atl1c_tpd_ring *tpd_ring = &adapter->tpd_ring[type];
>  	u16 reg;
> @@ -2238,6 +2238,7 @@ static netdev_tx_t atl1c_xmit_frame(struct sk_buff *skb,
>  
>  	if (atl1c_tpd_avail(adapter, type) < tpd_req) {
>  		/* no enough descriptor, just stop queue */
> +		atl1c_tx_queue(adapter, type);
>  		netif_stop_queue(netdev);
>  		return NETDEV_TX_BUSY;
>  	}
> @@ -2246,6 +2247,7 @@ static netdev_tx_t atl1c_xmit_frame(struct sk_buff *skb,
>  
>  	/* do TSO and check sum */
>  	if (atl1c_tso_csum(adapter, skb, &tpd, type) != 0) {
> +		atl1c_tx_queue(adapter, type);
>  		dev_kfree_skb_any(skb);
>  		return NETDEV_TX_OK;
>  	}
> @@ -2270,8 +2272,11 @@ static netdev_tx_t atl1c_xmit_frame(struct sk_buff *skb,
>  		atl1c_tx_rollback(adapter, tpd, type);
>  		dev_kfree_skb_any(skb);
>  	} else {
> -		netdev_sent_queue(adapter->netdev, skb->len);
> -		atl1c_tx_queue(adapter, skb, tpd, type);
> +		bool more = netdev_xmit_more();
> +
> +		__netdev_sent_queue(adapter->netdev, skb->len, more);


This is probably buggy.

You must check and use the return code of this function,
as in :

	bool door_bell = __netdev_sent_queue(adapter->netdev, skb->len, netdev_xmit_more());

	if (door_bell)
		atl1c_tx_queue(adapter, type);


> +		if (!more)
> +			atl1c_tx_queue(adapter, type);
>  	}
>  
>  	return NETDEV_TX_OK;
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net-next 2/4] atl1c: improve performance by avoiding unnecessary pcie writes on xmit
  2021-05-11 19:05 ` [PATCH net-next 2/4] atl1c: improve performance by avoiding unnecessary pcie writes on xmit Gatis Peisenieks
  2021-05-11 21:39   ` Eric Dumazet
@ 2021-05-12  2:39   ` Chris Snook
  2021-05-12  8:33     ` David Laight
  2021-05-12 14:35     ` Gatis Peisenieks
  1 sibling, 2 replies; 11+ messages in thread
From: Chris Snook @ 2021-05-12  2:39 UTC (permalink / raw)
  To: Gatis Peisenieks
  Cc: David S. Miller, Jakub Kicinski, Heiner Kallweit,
	jesse.brandeburg, dchickles, tully, Eric Dumazet, netdev, LKML

Increases in latency tend to hurt more on single-queue devices. Has
this been tested on the original gigabit atl1c?

- Chris

On Tue, May 11, 2021 at 12:05 PM Gatis Peisenieks <gatis@mikrotik.com> wrote:
>
> The kernel has xmit_more facility that hints the networking driver xmit
> path about whether more packets are coming soon. This information can be
> used to avoid unnecessary expensive PCIe transaction per tx packet at a
> slight increase in latency.
>
> Max TX pps on Mikrotik 10/25G NIC in a Threadripper 3960X system
> improved from 1150Kpps to 1700Kpps.
>
> Signed-off-by: Gatis Peisenieks <gatis@mikrotik.com>
> ---
>  drivers/net/ethernet/atheros/atl1c/atl1c_main.c | 13 +++++++++----
>  1 file changed, 9 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
> index 28c30d5288e4..2a8ab51b0ed9 100644
> --- a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
> +++ b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
> @@ -2211,8 +2211,8 @@ static int atl1c_tx_map(struct atl1c_adapter *adapter,
>         return -1;
>  }
>
> -static void atl1c_tx_queue(struct atl1c_adapter *adapter, struct sk_buff *skb,
> -                          struct atl1c_tpd_desc *tpd, enum atl1c_trans_queue type)
> +static void atl1c_tx_queue(struct atl1c_adapter *adapter,
> +                          enum atl1c_trans_queue type)
>  {
>         struct atl1c_tpd_ring *tpd_ring = &adapter->tpd_ring[type];
>         u16 reg;
> @@ -2238,6 +2238,7 @@ static netdev_tx_t atl1c_xmit_frame(struct sk_buff *skb,
>
>         if (atl1c_tpd_avail(adapter, type) < tpd_req) {
>                 /* no enough descriptor, just stop queue */
> +               atl1c_tx_queue(adapter, type);
>                 netif_stop_queue(netdev);
>                 return NETDEV_TX_BUSY;
>         }
> @@ -2246,6 +2247,7 @@ static netdev_tx_t atl1c_xmit_frame(struct sk_buff *skb,
>
>         /* do TSO and check sum */
>         if (atl1c_tso_csum(adapter, skb, &tpd, type) != 0) {
> +               atl1c_tx_queue(adapter, type);
>                 dev_kfree_skb_any(skb);
>                 return NETDEV_TX_OK;
>         }
> @@ -2270,8 +2272,11 @@ static netdev_tx_t atl1c_xmit_frame(struct sk_buff *skb,
>                 atl1c_tx_rollback(adapter, tpd, type);
>                 dev_kfree_skb_any(skb);
>         } else {
> -               netdev_sent_queue(adapter->netdev, skb->len);
> -               atl1c_tx_queue(adapter, skb, tpd, type);
> +               bool more = netdev_xmit_more();
> +
> +               __netdev_sent_queue(adapter->netdev, skb->len, more);
> +               if (!more)
> +                       atl1c_tx_queue(adapter, type);
>         }
>
>         return NETDEV_TX_OK;
> --
> 2.31.1
>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net-next 2/4] atl1c: improve performance by avoiding unnecessary pcie writes on xmit
  2021-05-11 21:39   ` Eric Dumazet
@ 2021-05-12  7:53     ` Gatis Peisenieks
  0 siblings, 0 replies; 11+ messages in thread
From: Gatis Peisenieks @ 2021-05-12  7:53 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: chris.snook, davem, kuba, hkallweit1, jesse.brandeburg,
	dchickles, tully, netdev, linux-kernel



On 2021-05-12 00:39, Eric Dumazet wrote:
>> @@ -2270,8 +2272,11 @@ static netdev_tx_t atl1c_xmit_frame(struct 
>> sk_buff *skb,
>>  		atl1c_tx_rollback(adapter, tpd, type);
>>  		dev_kfree_skb_any(skb);
>>  	} else {
>> -		netdev_sent_queue(adapter->netdev, skb->len);
>> -		atl1c_tx_queue(adapter, skb, tpd, type);
>> +		bool more = netdev_xmit_more();
>> +
>> +		__netdev_sent_queue(adapter->netdev, skb->len, more);
> 
> 
> This is probably buggy.
> 
> You must check and use the return code of this function,
> as in :
> 
> 	bool door_bell = __netdev_sent_queue(adapter->netdev, skb->len,
> netdev_xmit_more());
> 
> 	if (door_bell)
> 		atl1c_tx_queue(adapter, type);
> 

Eric, thank you for taking your time to look at this!

You are correct, tx queue can get stopped in __netdev_sent_queue
and if there were more packets coming the submit to HW would be
missed / unnecessarily delayed.


> 
>> +		if (!more)
>> +			atl1c_tx_queue(adapter, type);
>>  	}
>> 
>>  	return NETDEV_TX_OK;
>> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [PATCH net-next 2/4] atl1c: improve performance by avoiding unnecessary pcie writes on xmit
  2021-05-12  2:39   ` Chris Snook
@ 2021-05-12  8:33     ` David Laight
  2021-05-12 14:35     ` Gatis Peisenieks
  1 sibling, 0 replies; 11+ messages in thread
From: David Laight @ 2021-05-12  8:33 UTC (permalink / raw)
  To: 'Chris Snook', Gatis Peisenieks
  Cc: David S. Miller, Jakub Kicinski, Heiner Kallweit,
	jesse.brandeburg, dchickles, tully, Eric Dumazet, netdev, LKML

From: Chris Snook <chris.snook@gmail.com>
> Sent: 12 May 2021 03:40
> 
> On Tue, May 11, 2021 at 12:05 PM Gatis Peisenieks <gatis@mikrotik.com> wrote:
> >
> > The kernel has xmit_more facility that hints the networking driver xmit
> > path about whether more packets are coming soon. This information can be
> > used to avoid unnecessary expensive PCIe transaction per tx packet at a
> > slight increase in latency.
> 
> Increases in latency tend to hurt more on single-queue devices. Has
> this been tested on the original gigabit atl1c?

It probably depends a lot on how expensive it is to 'kick' the mac unit.

A simple (posted) PCIe write when the PCIe host interface is idle (as is likely
when you've just been updating descriptors) is probably noise compared
to the rest of the cost of sending the packet.
(Eric will probably say they measured gains.)

OTOH if you have (as I have on one system) the e1000e driver and some
completely broken 'management interface' hardware which means it can
take a lot of microseconds to write to any MAC register you really
do need to look at netdev_xmit_more() [1].

Unfortunately it doesn't help that much.
netdev_xmit_more() reports the state of the tx queue when the current
skb transmit was passed to the mac driver.
It doesn't report the state of the queue at the time netdev_xmit_more()
is called - so any packets queued while the transmit setup is in
progress don't cause netdev_xmit_more() to return true.
I've traced this happening repeatedly...

[1] If the MI is active MAC writes are broken (may write to the
wrong register), so there is horrid code before each access that
(IIRC) effectively does:
	while (mi_active())
		mdelay(10);
This is just so broken (interrupts are even enabled).

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net-next 2/4] atl1c: improve performance by avoiding unnecessary pcie writes on xmit
  2021-05-12  2:39   ` Chris Snook
  2021-05-12  8:33     ` David Laight
@ 2021-05-12 14:35     ` Gatis Peisenieks
  1 sibling, 0 replies; 11+ messages in thread
From: Gatis Peisenieks @ 2021-05-12 14:35 UTC (permalink / raw)
  To: Chris Snook
  Cc: David S. Miller, Jakub Kicinski, Heiner Kallweit,
	jesse.brandeburg, dchickles, tully, Eric Dumazet, netdev, LKML



On 2021-05-12 05:39, Chris Snook wrote:
> Increases in latency tend to hurt more on single-queue devices. Has
> this been tested on the original gigabit atl1c?

Thank you Chris, for checking this out!

I did test the atl1c driver with and without this change on actual
AR8151 hardware.

My test system was Intel(R) Core(TM) i7-4790K + RB44Ge.
That is a 4 port 1G AR8151 based card.

I measured latency with external traffic generator with test system
doing L2 forwarding. Receiving traffic on one atl1c interface and
transmitting over another atl1c interface. I had default 1000 packet
pfifo queue configured on the atl1c interfaces.

Max 64 byte packet L2 forward pps rate improved 860K -> 1070K.

Any latency difference at 800Kpps was lost in the noise - with the
particular traffic generator system (a linux based RouterOS 
traffic-gen).
I measured average 285us for a 30 second run in both cases. Note that
this includes any traffic generator "internal" latency.

Note that I had to tweak atl1c tx interrupt moderation to get these
numbers. With default tx_imt = 1000 no matter what I get only 500
tx interrupts/sec. Since the tx clean is fast and do not get polled
repeatedly and ring size is 1024 I am limited to ~500Kpps.
tx_imt = 500 dobubles that, I used tx_imt = 200 for this test.

As a side note that still relates to latency discussion on AR8151
hardware what I did find out however is that rx interrupt moderation
timer value has a big effect on latency. Changing rx_imt
from 200 to 10 resulted in considerable improvement from 285us to 41us
average latency as measured by traffic generator. I do not have
enough knowledge of the quirks of all the hardware supported by
the driver to confidently put this in a patch though.

Mikrotik 10/25G NIC has its own interrupt moderation mechanism,
so this is not relevant to that if anyone is interested.


> 
> - Chris
> 
> On Tue, May 11, 2021 at 12:05 PM Gatis Peisenieks <gatis@mikrotik.com> 
> wrote:
>> 
>> The kernel has xmit_more facility that hints the networking driver 
>> xmit
>> path about whether more packets are coming soon. This information can 
>> be
>> used to avoid unnecessary expensive PCIe transaction per tx packet at 
>> a
>> slight increase in latency.
>> 
>> Max TX pps on Mikrotik 10/25G NIC in a Threadripper 3960X system
>> improved from 1150Kpps to 1700Kpps.
>> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH net-next 3/4] atl1c: adjust max mtu according to Mikrotik 10/25G NIC ability
  2021-04-19 14:34 [PATCH net-next 0/4] atl1c: support for Mikrotik 10/25G NIC features Gatis Peisenieks
@ 2021-04-19 14:34 ` Gatis Peisenieks
  0 siblings, 0 replies; 11+ messages in thread
From: Gatis Peisenieks @ 2021-04-19 14:34 UTC (permalink / raw)
  To: chris.snook, davem, kuba, hkallweit1, jesse.brandeburg,
	dchickles, tully, eric.dumazet
  Cc: netdev, linux-kernel, Gatis Peisenieks

The new Mikrotik 10/25G NIC supports jumbo frames. Jumbo frames are
supported for TSO as well.

This enables the support for mtu up to 9500 bytes.

Signed-off-by: Gatis Peisenieks <gatis@mikrotik.com>
---
 drivers/net/ethernet/atheros/atl1c/atl1c_main.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
index 8b097964a035..920e408ce7b4 100644
--- a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
+++ b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
@@ -478,6 +478,8 @@ static void atl1c_set_rxbufsize(struct atl1c_adapter *adapter,
 static netdev_features_t atl1c_fix_features(struct net_device *netdev,
 	netdev_features_t features)
 {
+	struct atl1c_adapter *adapter = netdev_priv(netdev);
+	struct atl1c_hw *hw = &adapter->hw;
 	/*
 	 * Since there is no support for separate rx/tx vlan accel
 	 * enable/disable make sure tx flag is always in same state as rx.
@@ -487,8 +489,10 @@ static netdev_features_t atl1c_fix_features(struct net_device *netdev,
 	else
 		features &= ~NETIF_F_HW_VLAN_CTAG_TX;
 
-	if (netdev->mtu > MAX_TSO_FRAME_SIZE)
-		features &= ~(NETIF_F_TSO | NETIF_F_TSO6);
+	if (hw->nic_type != athr_mt) {
+		if (netdev->mtu > MAX_TSO_FRAME_SIZE)
+			features &= ~(NETIF_F_TSO | NETIF_F_TSO6);
+	}
 
 	return features;
 }
@@ -517,6 +521,9 @@ static void atl1c_set_max_mtu(struct net_device *netdev)
 		netdev->max_mtu = MAX_JUMBO_FRAME_SIZE -
 				  (ETH_HLEN + ETH_FCS_LEN + VLAN_HLEN);
 		break;
+	case athr_mt:
+		netdev->max_mtu = 9500;
+		break;
 	/* The 10/100 devices don't support jumbo packets, max_mtu 1500 */
 	default:
 		netdev->max_mtu = ETH_DATA_LEN;
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2021-05-12 14:35 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-05-11 19:05 [PATCH net-next 0/4] atl1c: support for Mikrotik 10/25G NIC features Gatis Peisenieks
2021-05-11 19:05 ` [PATCH net-next 1/4] atl1c: show correct link speed on Mikrotik 10/25G NIC Gatis Peisenieks
2021-05-11 19:05 ` [PATCH net-next 2/4] atl1c: improve performance by avoiding unnecessary pcie writes on xmit Gatis Peisenieks
2021-05-11 21:39   ` Eric Dumazet
2021-05-12  7:53     ` Gatis Peisenieks
2021-05-12  2:39   ` Chris Snook
2021-05-12  8:33     ` David Laight
2021-05-12 14:35     ` Gatis Peisenieks
2021-05-11 19:05 ` [PATCH net-next 3/4] atl1c: adjust max mtu according to Mikrotik 10/25G NIC ability Gatis Peisenieks
2021-05-11 19:05 ` [PATCH net-next 4/4] atl1c: enable rx csum offload on Mikrotik 10/25G NIC Gatis Peisenieks
  -- strict thread matches above, loose matches on Subject: below --
2021-04-19 14:34 [PATCH net-next 0/4] atl1c: support for Mikrotik 10/25G NIC features Gatis Peisenieks
2021-04-19 14:34 ` [PATCH net-next 3/4] atl1c: adjust max mtu according to Mikrotik 10/25G NIC ability Gatis Peisenieks

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).