linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next v1 0/6] lan743x speed boost
@ 2021-01-29 19:52 Sven Van Asbroeck
  2021-01-29 19:52 ` [PATCH net-next v1 1/6] lan743x: boost performance on cpu archs w/o dma cache snooping Sven Van Asbroeck
                   ` (5 more replies)
  0 siblings, 6 replies; 32+ messages in thread
From: Sven Van Asbroeck @ 2021-01-29 19:52 UTC (permalink / raw)
  To: Bryan Whitehead, UNGLinuxDriver, David S Miller, Jakub Kicinski
  Cc: Sven Van Asbroeck, Andrew Lunn, Alexey Denisov, Sergej Bauer,
	Tim Harvey, Anders Rønningen, netdev, linux-kernel

From: Sven Van Asbroeck <thesven73@gmail.com>

The first patch of this series boosts the chip's rx performance by up to 3x
on cpus such as ARM. However it introduces a breaking change: the mtu
can no longer be changed while the network interface is up.

To get around this efficiently, the second patch adds driver support for
multi-buffer frames. This will allow us to change the mtu while the device
is up, without having to re-allocate all ring buffers.

Since this is an important change to the driver's rx logic, I have attempted
to very carefully test this. Test descriptions are included with each
commit message.

I invite all interested users of the lan743x to test out these changes, either
by testing them out "in the real world", or by repeating my artificial tests.

Suggestions for better tests are very welcome.

Tree: git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git # 46eb3c108fe1

To: Bryan Whitehead <bryan.whitehead@microchip.com>
To: UNGLinuxDriver@microchip.com
To: "David S. Miller" <davem@davemloft.net>
To: Jakub Kicinski <kuba@kernel.org>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Alexey Denisov <rtgbnm@gmail.com>
Cc: Sergej Bauer <sbauer@blackbox.su>
Cc: Tim Harvey <tharvey@gateworks.com>
Cc: Anders Rønningen <anders@ronningen.priv.no>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org (open list)

Sven Van Asbroeck (6):
  lan743x: boost performance on cpu archs w/o dma cache snooping
  lan743x: support rx multi-buffer packets
  lan743x: allow mtu change while network interface is up
  TEST ONLY: lan743x: limit rx ring buffer size to 500 bytes
  TEST ONLY: lan743x: skb_alloc failure test
  TEST ONLY: lan743x: skb_trim failure test

 drivers/net/ethernet/microchip/lan743x_main.c | 324 ++++++++----------
 drivers/net/ethernet/microchip/lan743x_main.h |   2 +
 2 files changed, 152 insertions(+), 174 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH net-next v1 1/6] lan743x: boost performance on cpu archs w/o dma cache snooping
  2021-01-29 19:52 [PATCH net-next v1 0/6] lan743x speed boost Sven Van Asbroeck
@ 2021-01-29 19:52 ` Sven Van Asbroeck
  2021-01-29 20:36   ` Andrew Lunn
                     ` (4 more replies)
  2021-01-29 19:52 ` [PATCH net-next v1 2/6] lan743x: support rx multi-buffer packets Sven Van Asbroeck
                   ` (4 subsequent siblings)
  5 siblings, 5 replies; 32+ messages in thread
From: Sven Van Asbroeck @ 2021-01-29 19:52 UTC (permalink / raw)
  To: Bryan Whitehead, UNGLinuxDriver, David S Miller, Jakub Kicinski
  Cc: Sven Van Asbroeck, Andrew Lunn, Alexey Denisov, Sergej Bauer,
	Tim Harvey, Anders Rønningen, netdev, linux-kernel

From: Sven Van Asbroeck <thesven73@gmail.com>

The buffers in the lan743x driver's receive ring are always 9K,
even when the largest packet that can be received (the mtu) is
much smaller. This performs particularly badly on cpu archs
without dma cache snooping (such as ARM): each received packet
results in a 9K dma_{map|unmap} operation, which is very expensive
because cpu caches need to be invalidated.

Careful measurement of the driver rx path on armv7 reveals that
the cpu spends the majority of its time waiting for cache
invalidation.

Optimize as follows:

1. set rx ring buffer size equal to the mtu. this limits the
   amount of cache that needs to be invalidated per dma_map().

2. when dma_unmap()ping, skip cpu sync. Sync only the packet data
   actually received, the size of which the chip will indicate in
   its rx ring descriptors. this limits the amount of cache that
   needs to be invalidated per dma_unmap().

These optimizations double the rx performance on armv7.
Third parties report 3x rx speedup on armv8.

Performance on dma cache snooping architectures (such as x86)
is expected to stay the same.

Tested with iperf3 on a freescale imx6qp + lan7430, both sides
set to mtu 1500 bytes, measure rx performance:

Before:
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-20.00  sec   550 MBytes   231 Mbits/sec    0
After:
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-20.00  sec  1.33 GBytes   570 Mbits/sec    0

Test by Anders Roenningen (anders@ronningen.priv.no) on armv8,
    rx iperf3:
Before 102 Mbits/sec
After  279 Mbits/sec

Signed-off-by: Sven Van Asbroeck <thesven73@gmail.com>
---

Tree: git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git # 46eb3c108fe1

To: Bryan Whitehead <bryan.whitehead@microchip.com>
To: UNGLinuxDriver@microchip.com
To: "David S. Miller" <davem@davemloft.net>
To: Jakub Kicinski <kuba@kernel.org>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Alexey Denisov <rtgbnm@gmail.com>
Cc: Sergej Bauer <sbauer@blackbox.su>
Cc: Tim Harvey <tharvey@gateworks.com>
Cc: Anders Rønningen <anders@ronningen.priv.no>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org (open list)

 drivers/net/ethernet/microchip/lan743x_main.c | 35 ++++++++++++-------
 1 file changed, 23 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/microchip/lan743x_main.c b/drivers/net/ethernet/microchip/lan743x_main.c
index f1f6eba4ace4..f485320e5784 100644
--- a/drivers/net/ethernet/microchip/lan743x_main.c
+++ b/drivers/net/ethernet/microchip/lan743x_main.c
@@ -1957,11 +1957,11 @@ static int lan743x_rx_next_index(struct lan743x_rx *rx, int index)
 
 static struct sk_buff *lan743x_rx_allocate_skb(struct lan743x_rx *rx)
 {
-	int length = 0;
+	struct net_device *netdev = rx->adapter->netdev;
 
-	length = (LAN743X_MAX_FRAME_SIZE + ETH_HLEN + 4 + RX_HEAD_PADDING);
-	return __netdev_alloc_skb(rx->adapter->netdev,
-				  length, GFP_ATOMIC | GFP_DMA);
+	return __netdev_alloc_skb(netdev,
+				  netdev->mtu + ETH_HLEN + 4 + RX_HEAD_PADDING,
+				  GFP_ATOMIC | GFP_DMA);
 }
 
 static void lan743x_rx_update_tail(struct lan743x_rx *rx, int index)
@@ -1977,9 +1977,10 @@ static int lan743x_rx_init_ring_element(struct lan743x_rx *rx, int index,
 {
 	struct lan743x_rx_buffer_info *buffer_info;
 	struct lan743x_rx_descriptor *descriptor;
-	int length = 0;
+	struct net_device *netdev = rx->adapter->netdev;
+	int length;
 
-	length = (LAN743X_MAX_FRAME_SIZE + ETH_HLEN + 4 + RX_HEAD_PADDING);
+	length = netdev->mtu + ETH_HLEN + 4 + RX_HEAD_PADDING;
 	descriptor = &rx->ring_cpu_ptr[index];
 	buffer_info = &rx->buffer_info[index];
 	buffer_info->skb = skb;
@@ -2148,11 +2149,18 @@ static int lan743x_rx_process_packet(struct lan743x_rx *rx)
 			descriptor = &rx->ring_cpu_ptr[first_index];
 
 			/* unmap from dma */
+			packet_length =	RX_DESC_DATA0_FRAME_LENGTH_GET_
+					(descriptor->data0);
 			if (buffer_info->dma_ptr) {
-				dma_unmap_single(&rx->adapter->pdev->dev,
-						 buffer_info->dma_ptr,
-						 buffer_info->buffer_length,
-						 DMA_FROM_DEVICE);
+				dma_sync_single_for_cpu(&rx->adapter->pdev->dev,
+							buffer_info->dma_ptr,
+							packet_length,
+							DMA_FROM_DEVICE);
+				dma_unmap_single_attrs(&rx->adapter->pdev->dev,
+						       buffer_info->dma_ptr,
+						       buffer_info->buffer_length,
+						       DMA_FROM_DEVICE,
+						       DMA_ATTR_SKIP_CPU_SYNC);
 				buffer_info->dma_ptr = 0;
 				buffer_info->buffer_length = 0;
 			}
@@ -2167,8 +2175,8 @@ static int lan743x_rx_process_packet(struct lan743x_rx *rx)
 			int index = first_index;
 
 			/* multi buffer packet not supported */
-			/* this should not happen since
-			 * buffers are allocated to be at least jumbo size
+			/* this should not happen since buffers are allocated
+			 * to be at least the mtu size configured in the mac.
 			 */
 
 			/* clean up buffers */
@@ -2628,6 +2636,9 @@ static int lan743x_netdev_change_mtu(struct net_device *netdev, int new_mtu)
 	struct lan743x_adapter *adapter = netdev_priv(netdev);
 	int ret = 0;
 
+	if (netif_running(netdev))
+		return -EBUSY;
+
 	ret = lan743x_mac_set_mtu(adapter, new_mtu);
 	if (!ret)
 		netdev->mtu = new_mtu;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH net-next v1 2/6] lan743x: support rx multi-buffer packets
  2021-01-29 19:52 [PATCH net-next v1 0/6] lan743x speed boost Sven Van Asbroeck
  2021-01-29 19:52 ` [PATCH net-next v1 1/6] lan743x: boost performance on cpu archs w/o dma cache snooping Sven Van Asbroeck
@ 2021-01-29 19:52 ` Sven Van Asbroeck
  2021-01-29 22:11   ` Willem de Bruijn
  2021-01-31  7:06   ` Bryan.Whitehead
  2021-01-29 19:52 ` [PATCH net-next v1 3/6] lan743x: allow mtu change while network interface is up Sven Van Asbroeck
                   ` (3 subsequent siblings)
  5 siblings, 2 replies; 32+ messages in thread
From: Sven Van Asbroeck @ 2021-01-29 19:52 UTC (permalink / raw)
  To: Bryan Whitehead, UNGLinuxDriver, David S Miller, Jakub Kicinski
  Cc: Sven Van Asbroeck, Andrew Lunn, Alexey Denisov, Sergej Bauer,
	Tim Harvey, Anders Rønningen, netdev, linux-kernel

From: Sven Van Asbroeck <thesven73@gmail.com>

Multi-buffer packets enable us to use rx ring buffers smaller than
the mtu. This will allow us to change the mtu on-the-fly, without
having to stop the network interface in order to re-size the rx
ring buffers.

This is a big change touching a key driver function (process_packet),
so care has been taken to test this extensively:

Tests with debug logging enabled (add #define DEBUG).

1. Limit rx buffer size to 500, so mtu (1500) takes 3 buffers.
   Ping to chip, verify correct packet size is sent to OS.
   Ping large packets to chip (ping -s 1400), verify correct
     packet size is sent to OS.
   Ping using packets around the buffer size, verify number of
     buffers is changing, verify correct packet size is sent
     to OS:
     $ ping -s 472
     $ ping -s 473
     $ ping -s 992
     $ ping -s 993
   Verify that each packet is followed by extension processing.

2. Limit rx buffer size to 500, so mtu (1500) takes 3 buffers.
   Run iperf3 -s on chip, verify that packets come in 3 buffers
     at a time.
   Verify that packet size is equal to mtu.
   Verify that each packet is followed by extension processing.

3. Set chip and host mtu to 2000.
   Limit rx buffer size to 500, so mtu (2000) takes 4 buffers.
   Run iperf3 -s on chip, verify that packets come in 4 buffers
     at a time.
   Verify that packet size is equal to mtu.
   Verify that each packet is followed by extension processing.

Tests with debug logging DISabled (remove #define DEBUG).

4. Limit rx buffer size to 500, so mtu (1500) takes 3 buffers.
   Run iperf3 -s on chip, note sustained rx speed.
   Set chip and host mtu to 2000, so mtu takes 4 buffers.
   Run iperf3 -s on chip, note sustained rx speed.
   Verify no packets are dropped in both cases.

Tests with DEBUG_KMEMLEAK on:
 $ mount -t debugfs nodev /sys/kernel/debug/
 $ echo scan > /sys/kernel/debug/kmemleak

5. Limit rx buffer size to 500, so mtu (1500) takes 3 buffers.
   Run the following tests concurrently for at least one hour:
   - iperf3 -s on chip
   - ping -> chip
   Monitor reported memory leaks.

6. Set chip and host mtu to 2000.
   Limit rx buffer size to 500, so mtu (2000) takes 4 buffers.
   Run the following tests concurrently for at least one hour:
   - iperf3 -s on chip
   - ping -> chip
   Monitor reported memory leaks.

7. Simulate low-memory in lan743x_rx_allocate_skb(): fail every
     100 allocations.
   Repeat (5) and (6).
   Monitor reported memory leaks.

8. Simulate  low-memory in lan743x_rx_allocate_skb(): fail 10
     allocations in a row in every 100.
   Repeat (5) and (6).
   Monitor reported memory leaks.

9. Simulate  low-memory in lan743x_rx_trim_skb(): fail 1 allocation
     in every 100.
   Repeat (5) and (6).
   Monitor reported memory leaks.

Signed-off-by: Sven Van Asbroeck <thesven73@gmail.com>
---

Tree: git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git # 46eb3c108fe1

To: Bryan Whitehead <bryan.whitehead@microchip.com>
To: UNGLinuxDriver@microchip.com
To: "David S. Miller" <davem@davemloft.net>
To: Jakub Kicinski <kuba@kernel.org>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Alexey Denisov <rtgbnm@gmail.com>
Cc: Sergej Bauer <sbauer@blackbox.su>
Cc: Tim Harvey <tharvey@gateworks.com>
Cc: Anders Rønningen <anders@ronningen.priv.no>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org (open list)

 drivers/net/ethernet/microchip/lan743x_main.c | 321 ++++++++----------
 drivers/net/ethernet/microchip/lan743x_main.h |   2 +
 2 files changed, 143 insertions(+), 180 deletions(-)

diff --git a/drivers/net/ethernet/microchip/lan743x_main.c b/drivers/net/ethernet/microchip/lan743x_main.c
index f485320e5784..b784e9feadac 100644
--- a/drivers/net/ethernet/microchip/lan743x_main.c
+++ b/drivers/net/ethernet/microchip/lan743x_main.c
@@ -1955,15 +1955,6 @@ static int lan743x_rx_next_index(struct lan743x_rx *rx, int index)
 	return ((++index) % rx->ring_size);
 }
 
-static struct sk_buff *lan743x_rx_allocate_skb(struct lan743x_rx *rx)
-{
-	struct net_device *netdev = rx->adapter->netdev;
-
-	return __netdev_alloc_skb(netdev,
-				  netdev->mtu + ETH_HLEN + 4 + RX_HEAD_PADDING,
-				  GFP_ATOMIC | GFP_DMA);
-}
-
 static void lan743x_rx_update_tail(struct lan743x_rx *rx, int index)
 {
 	/* update the tail once per 8 descriptors */
@@ -1972,37 +1963,37 @@ static void lan743x_rx_update_tail(struct lan743x_rx *rx, int index)
 				  index);
 }
 
-static int lan743x_rx_init_ring_element(struct lan743x_rx *rx, int index,
-					struct sk_buff *skb)
+static int lan743x_rx_init_ring_element(struct lan743x_rx *rx, int index)
 {
+	struct net_device *netdev = rx->adapter->netdev;
+	struct device *dev = &rx->adapter->pdev->dev;
 	struct lan743x_rx_buffer_info *buffer_info;
 	struct lan743x_rx_descriptor *descriptor;
-	struct net_device *netdev = rx->adapter->netdev;
+	struct sk_buff *skb;
+	dma_addr_t dma_ptr;
 	int length;
 
 	length = netdev->mtu + ETH_HLEN + 4 + RX_HEAD_PADDING;
+
 	descriptor = &rx->ring_cpu_ptr[index];
 	buffer_info = &rx->buffer_info[index];
-	buffer_info->skb = skb;
-	if (!(buffer_info->skb))
+	skb = __netdev_alloc_skb(netdev, length, GFP_ATOMIC | GFP_DMA);
+	if (!skb)
 		return -ENOMEM;
-	buffer_info->dma_ptr = dma_map_single(&rx->adapter->pdev->dev,
-					      buffer_info->skb->data,
-					      length,
-					      DMA_FROM_DEVICE);
-	if (dma_mapping_error(&rx->adapter->pdev->dev,
-			      buffer_info->dma_ptr)) {
-		buffer_info->dma_ptr = 0;
+	dma_ptr = dma_map_single(dev, skb->data, length, DMA_FROM_DEVICE);
+	if (dma_mapping_error(dev, dma_ptr)) {
+		dev_kfree_skb_any(skb);
 		return -ENOMEM;
 	}
 
+	buffer_info->skb = skb;
+	buffer_info->dma_ptr = dma_ptr;
 	buffer_info->buffer_length = length;
 	descriptor->data1 = cpu_to_le32(DMA_ADDR_LOW32(buffer_info->dma_ptr));
 	descriptor->data2 = cpu_to_le32(DMA_ADDR_HIGH32(buffer_info->dma_ptr));
 	descriptor->data3 = 0;
 	descriptor->data0 = cpu_to_le32((RX_DESC_DATA0_OWN_ |
 			    (length & RX_DESC_DATA0_BUF_LENGTH_MASK_)));
-	skb_reserve(buffer_info->skb, RX_HEAD_PADDING);
 	lan743x_rx_update_tail(rx, index);
 
 	return 0;
@@ -2051,16 +2042,32 @@ static void lan743x_rx_release_ring_element(struct lan743x_rx *rx, int index)
 	memset(buffer_info, 0, sizeof(*buffer_info));
 }
 
+static struct sk_buff *
+lan743x_rx_trim_skb(struct sk_buff *skb, int frame_length)
+{
+	if (skb_linearize(skb)) {
+		dev_kfree_skb_irq(skb);
+		return NULL;
+	}
+	frame_length = max_t(int, 0, frame_length - RX_HEAD_PADDING - 2);
+	if (skb->len > frame_length) {
+		skb->tail -= skb->len - frame_length;
+		skb->len = frame_length;
+	}
+	return skb;
+}
+
 static int lan743x_rx_process_packet(struct lan743x_rx *rx)
 {
-	struct skb_shared_hwtstamps *hwtstamps = NULL;
+	struct lan743x_rx_descriptor *descriptor, *desc_ext;
 	int result = RX_PROCESS_RESULT_NOTHING_TO_DO;
 	int current_head_index = le32_to_cpu(*rx->head_cpu_ptr);
 	struct lan743x_rx_buffer_info *buffer_info;
-	struct lan743x_rx_descriptor *descriptor;
+	struct skb_shared_hwtstamps *hwtstamps;
+	int frame_length, buffer_length;
+	struct sk_buff *skb;
 	int extension_index = -1;
-	int first_index = -1;
-	int last_index = -1;
+	bool is_last, is_first;
 
 	if (current_head_index < 0 || current_head_index >= rx->ring_size)
 		goto done;
@@ -2068,170 +2075,126 @@ static int lan743x_rx_process_packet(struct lan743x_rx *rx)
 	if (rx->last_head < 0 || rx->last_head >= rx->ring_size)
 		goto done;
 
-	if (rx->last_head != current_head_index) {
-		descriptor = &rx->ring_cpu_ptr[rx->last_head];
-		if (le32_to_cpu(descriptor->data0) & RX_DESC_DATA0_OWN_)
-			goto done;
+	if (rx->last_head == current_head_index)
+		goto done;
 
-		if (!(le32_to_cpu(descriptor->data0) & RX_DESC_DATA0_FS_))
-			goto done;
+	descriptor = &rx->ring_cpu_ptr[rx->last_head];
+	if (le32_to_cpu(descriptor->data0) & RX_DESC_DATA0_OWN_)
+		goto done;
+	buffer_info = &rx->buffer_info[rx->last_head];
 
-		first_index = rx->last_head;
-		if (le32_to_cpu(descriptor->data0) & RX_DESC_DATA0_LS_) {
-			last_index = rx->last_head;
-		} else {
-			int index;
-
-			index = lan743x_rx_next_index(rx, first_index);
-			while (index != current_head_index) {
-				descriptor = &rx->ring_cpu_ptr[index];
-				if (le32_to_cpu(descriptor->data0) & RX_DESC_DATA0_OWN_)
-					goto done;
-
-				if (le32_to_cpu(descriptor->data0) & RX_DESC_DATA0_LS_) {
-					last_index = index;
-					break;
-				}
-				index = lan743x_rx_next_index(rx, index);
-			}
-		}
-		if (last_index >= 0) {
-			descriptor = &rx->ring_cpu_ptr[last_index];
-			if (le32_to_cpu(descriptor->data0) & RX_DESC_DATA0_EXT_) {
-				/* extension is expected to follow */
-				int index = lan743x_rx_next_index(rx,
-								  last_index);
-				if (index != current_head_index) {
-					descriptor = &rx->ring_cpu_ptr[index];
-					if (le32_to_cpu(descriptor->data0) &
-					    RX_DESC_DATA0_OWN_) {
-						goto done;
-					}
-					if (le32_to_cpu(descriptor->data0) &
-					    RX_DESC_DATA0_EXT_) {
-						extension_index = index;
-					} else {
-						goto done;
-					}
-				} else {
-					/* extension is not yet available */
-					/* prevent processing of this packet */
-					first_index = -1;
-					last_index = -1;
-				}
-			}
-		}
+	is_last = le32_to_cpu(descriptor->data0) & RX_DESC_DATA0_LS_;
+	is_first = le32_to_cpu(descriptor->data0) & RX_DESC_DATA0_FS_;
+
+	if (is_last && le32_to_cpu(descriptor->data0) & RX_DESC_DATA0_EXT_) {
+		/* extension is expected to follow */
+		int index = lan743x_rx_next_index(rx, rx->last_head);
+
+		if (index == current_head_index)
+			/* extension not yet available */
+			goto done;
+		desc_ext = &rx->ring_cpu_ptr[index];
+		if (le32_to_cpu(desc_ext->data0) & RX_DESC_DATA0_OWN_)
+			/* extension not yet available */
+			goto done;
+		if (!(le32_to_cpu(desc_ext->data0) & RX_DESC_DATA0_EXT_))
+			goto move_forward;
+		extension_index = index;
 	}
-	if (first_index >= 0 && last_index >= 0) {
-		int real_last_index = last_index;
-		struct sk_buff *skb = NULL;
-		u32 ts_sec = 0;
-		u32 ts_nsec = 0;
-
-		/* packet is available */
-		if (first_index == last_index) {
-			/* single buffer packet */
-			struct sk_buff *new_skb = NULL;
-			int packet_length;
-
-			new_skb = lan743x_rx_allocate_skb(rx);
-			if (!new_skb) {
-				/* failed to allocate next skb.
-				 * Memory is very low.
-				 * Drop this packet and reuse buffer.
-				 */
-				lan743x_rx_reuse_ring_element(rx, first_index);
-				goto process_extension;
-			}
 
-			buffer_info = &rx->buffer_info[first_index];
-			skb = buffer_info->skb;
-			descriptor = &rx->ring_cpu_ptr[first_index];
-
-			/* unmap from dma */
-			packet_length =	RX_DESC_DATA0_FRAME_LENGTH_GET_
-					(descriptor->data0);
-			if (buffer_info->dma_ptr) {
-				dma_sync_single_for_cpu(&rx->adapter->pdev->dev,
-							buffer_info->dma_ptr,
-							packet_length,
-							DMA_FROM_DEVICE);
-				dma_unmap_single_attrs(&rx->adapter->pdev->dev,
-						       buffer_info->dma_ptr,
-						       buffer_info->buffer_length,
-						       DMA_FROM_DEVICE,
-						       DMA_ATTR_SKIP_CPU_SYNC);
-				buffer_info->dma_ptr = 0;
-				buffer_info->buffer_length = 0;
-			}
-			buffer_info->skb = NULL;
-			packet_length =	RX_DESC_DATA0_FRAME_LENGTH_GET_
-					(le32_to_cpu(descriptor->data0));
-			skb_put(skb, packet_length - 4);
-			skb->protocol = eth_type_trans(skb,
-						       rx->adapter->netdev);
-			lan743x_rx_init_ring_element(rx, first_index, new_skb);
-		} else {
-			int index = first_index;
+	/* Only the last buffer in a multi-buffer frame contains the total frame
+	 * length. All other buffers have a zero frame length. The chip
+	 * occasionally sends more buffers than strictly required to reach the
+	 * total frame length.
+	 * Handle this by adding all buffers to the skb in their entirety.
+	 * Once the real frame length is known, trim the skb.
+	 */
+	frame_length =
+		RX_DESC_DATA0_FRAME_LENGTH_GET_(le32_to_cpu(descriptor->data0));
+	buffer_length = buffer_info->buffer_length;
 
-			/* multi buffer packet not supported */
-			/* this should not happen since buffers are allocated
-			 * to be at least the mtu size configured in the mac.
-			 */
+	netdev_dbg(rx->adapter->netdev, "%s%schunk: %d/%d",
+		   is_first ? "first " : "      ",
+		   is_last  ? "last  " : "      ",
+		   frame_length, buffer_length);
 
-			/* clean up buffers */
-			if (first_index <= last_index) {
-				while ((index >= first_index) &&
-				       (index <= last_index)) {
-					lan743x_rx_reuse_ring_element(rx,
-								      index);
-					index = lan743x_rx_next_index(rx,
-								      index);
-				}
-			} else {
-				while ((index >= first_index) ||
-				       (index <= last_index)) {
-					lan743x_rx_reuse_ring_element(rx,
-								      index);
-					index = lan743x_rx_next_index(rx,
-								      index);
-				}
-			}
-		}
+	/* unmap from dma */
+	if (buffer_info->dma_ptr) {
+		dma_unmap_single(&rx->adapter->pdev->dev,
+				 buffer_info->dma_ptr,
+				 buffer_info->buffer_length,
+				 DMA_FROM_DEVICE);
+		buffer_info->dma_ptr = 0;
+		buffer_info->buffer_length = 0;
+	}
+	skb = buffer_info->skb;
 
-process_extension:
-		if (extension_index >= 0) {
-			descriptor = &rx->ring_cpu_ptr[extension_index];
-			buffer_info = &rx->buffer_info[extension_index];
-
-			ts_sec = le32_to_cpu(descriptor->data1);
-			ts_nsec = (le32_to_cpu(descriptor->data2) &
-				  RX_DESC_DATA2_TS_NS_MASK_);
-			lan743x_rx_reuse_ring_element(rx, extension_index);
-			real_last_index = extension_index;
-		}
+	/* allocate new skb and map to dma */
+	if (lan743x_rx_init_ring_element(rx, rx->last_head)) {
+		/* failed to allocate next skb.
+		 * Memory is very low.
+		 * Drop this packet and reuse buffer.
+		 */
+		lan743x_rx_reuse_ring_element(rx, rx->last_head);
+		goto process_extension;
+	}
+
+	/* add buffers to skb via skb->frag_list */
+	if (is_first) {
+		skb_reserve(skb, RX_HEAD_PADDING);
+		skb_put(skb, buffer_length - RX_HEAD_PADDING);
+		if (rx->skb_head)
+			dev_kfree_skb_irq(rx->skb_head);
+		rx->skb_head = skb;
+	} else if (rx->skb_head) {
+		skb_put(skb, buffer_length);
+		if (skb_shinfo(rx->skb_head)->frag_list)
+			rx->skb_tail->next = skb;
+		else
+			skb_shinfo(rx->skb_head)->frag_list = skb;
+		rx->skb_tail = skb;
+		rx->skb_head->len += skb->len;
+		rx->skb_head->data_len += skb->len;
+		rx->skb_head->truesize += skb->truesize;
+	} else {
+		rx->skb_head = skb;
+	}
 
-		if (!skb) {
-			result = RX_PROCESS_RESULT_PACKET_DROPPED;
-			goto move_forward;
+process_extension:
+	if (extension_index >= 0) {
+		u32 ts_sec;
+		u32 ts_nsec;
+
+		ts_sec = le32_to_cpu(desc_ext->data1);
+		ts_nsec = (le32_to_cpu(desc_ext->data2) &
+			  RX_DESC_DATA2_TS_NS_MASK_);
+		if (rx->skb_head) {
+			hwtstamps = skb_hwtstamps(rx->skb_head);
+			if (hwtstamps)
+				hwtstamps->hwtstamp = ktime_set(ts_sec, ts_nsec);
 		}
+		lan743x_rx_reuse_ring_element(rx, extension_index);
+		rx->last_head = extension_index;
+		netdev_dbg(rx->adapter->netdev, "process extension");
+	}
 
-		if (extension_index < 0)
-			goto pass_packet_to_os;
-		hwtstamps = skb_hwtstamps(skb);
-		if (hwtstamps)
-			hwtstamps->hwtstamp = ktime_set(ts_sec, ts_nsec);
+	if (is_last && rx->skb_head)
+		rx->skb_head = lan743x_rx_trim_skb(rx->skb_head, frame_length);
 
-pass_packet_to_os:
-		/* pass packet to OS */
-		napi_gro_receive(&rx->napi, skb);
-		result = RX_PROCESS_RESULT_PACKET_RECEIVED;
+	if (is_last && rx->skb_head) {
+		rx->skb_head->protocol = eth_type_trans(rx->skb_head,
+							rx->adapter->netdev);
+		netdev_dbg(rx->adapter->netdev, "sending %d byte frame to OS",
+			   rx->skb_head->len);
+		napi_gro_receive(&rx->napi, rx->skb_head);
+		rx->skb_head = NULL;
+	}
 
 move_forward:
-		/* push tail and head forward */
-		rx->last_tail = real_last_index;
-		rx->last_head = lan743x_rx_next_index(rx, real_last_index);
-	}
+	/* push tail and head forward */
+	rx->last_tail = rx->last_head;
+	rx->last_head = lan743x_rx_next_index(rx, rx->last_head);
+	result = RX_PROCESS_RESULT_PACKET_RECEIVED;
 done:
 	return result;
 }
@@ -2367,9 +2330,7 @@ static int lan743x_rx_ring_init(struct lan743x_rx *rx)
 
 	rx->last_head = 0;
 	for (index = 0; index < rx->ring_size; index++) {
-		struct sk_buff *new_skb = lan743x_rx_allocate_skb(rx);
-
-		ret = lan743x_rx_init_ring_element(rx, index, new_skb);
+		ret = lan743x_rx_init_ring_element(rx, index);
 		if (ret)
 			goto cleanup;
 	}
diff --git a/drivers/net/ethernet/microchip/lan743x_main.h b/drivers/net/ethernet/microchip/lan743x_main.h
index 751f2bc9ce84..17e31a6210c6 100644
--- a/drivers/net/ethernet/microchip/lan743x_main.h
+++ b/drivers/net/ethernet/microchip/lan743x_main.h
@@ -698,6 +698,8 @@ struct lan743x_rx {
 	struct napi_struct napi;
 
 	u32		frame_count;
+
+	struct sk_buff *skb_head, *skb_tail;
 };
 
 struct lan743x_adapter {
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH net-next v1 3/6] lan743x: allow mtu change while network interface is up
  2021-01-29 19:52 [PATCH net-next v1 0/6] lan743x speed boost Sven Van Asbroeck
  2021-01-29 19:52 ` [PATCH net-next v1 1/6] lan743x: boost performance on cpu archs w/o dma cache snooping Sven Van Asbroeck
  2021-01-29 19:52 ` [PATCH net-next v1 2/6] lan743x: support rx multi-buffer packets Sven Van Asbroeck
@ 2021-01-29 19:52 ` Sven Van Asbroeck
  2021-01-29 19:52 ` [PATCH net-next v1 4/6] TEST ONLY: lan743x: limit rx ring buffer size to 500 bytes Sven Van Asbroeck
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 32+ messages in thread
From: Sven Van Asbroeck @ 2021-01-29 19:52 UTC (permalink / raw)
  To: Bryan Whitehead, UNGLinuxDriver, David S Miller, Jakub Kicinski
  Cc: Sven Van Asbroeck, Andrew Lunn, Alexey Denisov, Sergej Bauer,
	Tim Harvey, Anders Rønningen, netdev, linux-kernel

From: Sven Van Asbroeck <thesven73@gmail.com>

Now that we can use rx ring buffers smaller than the mtu,
we allow users to change the mtu on the fly.

Tested as follows:

Tests with debug logging enabled (add #define DEBUG).

1. Set the chip mtu to 1500, generate lots of network traffic.
   Stop all network traffic.
   Set the chip and remote mtus to 8000.
   Ping remote -> chip: $ ping <chip ip> -s 7000
   Verify that the first few received packets are multi-buffer.
   Verify no pings are dropped.

Tests with DEBUG_KMEMLEAK on:
 $ mount -t debugfs nodev /sys/kernel/debug/
 $ echo scan > /sys/kernel/debug/kmemleak

2. Start with chip mtu at 1500, host mtu at 8000.
Run concurrently:
 - iperf3 -s on chip
 - ping -> chip

Cycle the chip mtu between 1500 and 8000 every 10 seconds.

Scan kmemleak periodically to watch for memory leaks.
Verify that the mtu changeover happens smoothly, i.e.
the iperf3 test does not report periods where speed
drops and recovers suddenly.

Note: iperf3 occasionally reports dropped packets on
changeover. This behaviour also occurs on the original
driver, it's not a regression. Possibly related to the
chip's mac rx being disabled when the mtu is changed.

Signed-off-by: Sven Van Asbroeck <thesven73@gmail.com>
---

Tree: git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git # 46eb3c108fe1

To: Bryan Whitehead <bryan.whitehead@microchip.com>
To: UNGLinuxDriver@microchip.com
To: "David S. Miller" <davem@davemloft.net>
To: Jakub Kicinski <kuba@kernel.org>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Alexey Denisov <rtgbnm@gmail.com>
Cc: Sergej Bauer <sbauer@blackbox.su>
Cc: Tim Harvey <tharvey@gateworks.com>
Cc: Anders Rønningen <anders@ronningen.priv.no>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org (open list)

 drivers/net/ethernet/microchip/lan743x_main.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/net/ethernet/microchip/lan743x_main.c b/drivers/net/ethernet/microchip/lan743x_main.c
index b784e9feadac..618f0714a2cf 100644
--- a/drivers/net/ethernet/microchip/lan743x_main.c
+++ b/drivers/net/ethernet/microchip/lan743x_main.c
@@ -2597,9 +2597,6 @@ static int lan743x_netdev_change_mtu(struct net_device *netdev, int new_mtu)
 	struct lan743x_adapter *adapter = netdev_priv(netdev);
 	int ret = 0;
 
-	if (netif_running(netdev))
-		return -EBUSY;
-
 	ret = lan743x_mac_set_mtu(adapter, new_mtu);
 	if (!ret)
 		netdev->mtu = new_mtu;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH net-next v1 4/6] TEST ONLY: lan743x: limit rx ring buffer size to 500 bytes
  2021-01-29 19:52 [PATCH net-next v1 0/6] lan743x speed boost Sven Van Asbroeck
                   ` (2 preceding siblings ...)
  2021-01-29 19:52 ` [PATCH net-next v1 3/6] lan743x: allow mtu change while network interface is up Sven Van Asbroeck
@ 2021-01-29 19:52 ` Sven Van Asbroeck
  2021-01-29 19:52 ` [PATCH net-next v1 5/6] TEST ONLY: lan743x: skb_alloc failure test Sven Van Asbroeck
  2021-01-29 19:52 ` [PATCH net-next v1 6/6] TEST ONLY: lan743x: skb_trim " Sven Van Asbroeck
  5 siblings, 0 replies; 32+ messages in thread
From: Sven Van Asbroeck @ 2021-01-29 19:52 UTC (permalink / raw)
  To: Bryan Whitehead, UNGLinuxDriver, David S Miller, Jakub Kicinski
  Cc: Sven Van Asbroeck, Andrew Lunn, Alexey Denisov, Sergej Bauer,
	Tim Harvey, Anders Rønningen, netdev, linux-kernel

From: Sven Van Asbroeck <thesven73@gmail.com>

Signed-off-by: Sven Van Asbroeck <thesven73@gmail.com>
---

Tree: git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git # 46eb3c108fe1

To: Bryan Whitehead <bryan.whitehead@microchip.com>
To: UNGLinuxDriver@microchip.com
To: "David S. Miller" <davem@davemloft.net>
To: Jakub Kicinski <kuba@kernel.org>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Alexey Denisov <rtgbnm@gmail.com>
Cc: Sergej Bauer <sbauer@blackbox.su>
Cc: Tim Harvey <tharvey@gateworks.com>
Cc: Anders Rønningen <anders@ronningen.priv.no>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org (open list)

 drivers/net/ethernet/microchip/lan743x_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/microchip/lan743x_main.c b/drivers/net/ethernet/microchip/lan743x_main.c
index 618f0714a2cf..ed4959ad9237 100644
--- a/drivers/net/ethernet/microchip/lan743x_main.c
+++ b/drivers/net/ethernet/microchip/lan743x_main.c
@@ -1973,7 +1973,7 @@ static int lan743x_rx_init_ring_element(struct lan743x_rx *rx, int index)
 	dma_addr_t dma_ptr;
 	int length;
 
-	length = netdev->mtu + ETH_HLEN + 4 + RX_HEAD_PADDING;
+	length = 500 + ETH_HLEN + 4 + RX_HEAD_PADDING;
 
 	descriptor = &rx->ring_cpu_ptr[index];
 	buffer_info = &rx->buffer_info[index];
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH net-next v1 5/6] TEST ONLY: lan743x: skb_alloc failure test
  2021-01-29 19:52 [PATCH net-next v1 0/6] lan743x speed boost Sven Van Asbroeck
                   ` (3 preceding siblings ...)
  2021-01-29 19:52 ` [PATCH net-next v1 4/6] TEST ONLY: lan743x: limit rx ring buffer size to 500 bytes Sven Van Asbroeck
@ 2021-01-29 19:52 ` Sven Van Asbroeck
  2021-01-29 19:52 ` [PATCH net-next v1 6/6] TEST ONLY: lan743x: skb_trim " Sven Van Asbroeck
  5 siblings, 0 replies; 32+ messages in thread
From: Sven Van Asbroeck @ 2021-01-29 19:52 UTC (permalink / raw)
  To: Bryan Whitehead, UNGLinuxDriver, David S Miller, Jakub Kicinski
  Cc: Sven Van Asbroeck, Andrew Lunn, Alexey Denisov, Sergej Bauer,
	Tim Harvey, Anders Rønningen, netdev, linux-kernel

From: Sven Van Asbroeck <thesven73@gmail.com>

Simulate low-memory in lan743x_rx_allocate_skb(): fail 10
allocations in a row in every 100.

Signed-off-by: Sven Van Asbroeck <thesven73@gmail.com>
---

Tree: git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git # 46eb3c108fe1

To: Bryan Whitehead <bryan.whitehead@microchip.com>
To: UNGLinuxDriver@microchip.com
To: "David S. Miller" <davem@davemloft.net>
To: Jakub Kicinski <kuba@kernel.org>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Alexey Denisov <rtgbnm@gmail.com>
Cc: Sergej Bauer <sbauer@blackbox.su>
Cc: Tim Harvey <tharvey@gateworks.com>
Cc: Anders Rønningen <anders@ronningen.priv.no>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org (open list)

 drivers/net/ethernet/microchip/lan743x_main.c | 21 +++++++++++++++----
 1 file changed, 17 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/microchip/lan743x_main.c b/drivers/net/ethernet/microchip/lan743x_main.c
index ed4959ad9237..149b482fd984 100644
--- a/drivers/net/ethernet/microchip/lan743x_main.c
+++ b/drivers/net/ethernet/microchip/lan743x_main.c
@@ -1963,7 +1963,20 @@ static void lan743x_rx_update_tail(struct lan743x_rx *rx, int index)
 				  index);
 }
 
-static int lan743x_rx_init_ring_element(struct lan743x_rx *rx, int index)
+static struct sk_buff *
+lan743x_alloc_skb(struct net_device *netdev, int length, bool can_fail)
+{
+	static int rx_alloc;
+	int counter = rx_alloc++ % 100;
+
+	if (can_fail && counter >= 20 && counter < 30)
+		return NULL;
+
+	return __netdev_alloc_skb(netdev, length, GFP_ATOMIC | GFP_DMA);
+}
+
+static int
+lan743x_rx_init_ring_element(struct lan743x_rx *rx, int index, bool can_fail)
 {
 	struct net_device *netdev = rx->adapter->netdev;
 	struct device *dev = &rx->adapter->pdev->dev;
@@ -1977,7 +1990,7 @@ static int lan743x_rx_init_ring_element(struct lan743x_rx *rx, int index)
 
 	descriptor = &rx->ring_cpu_ptr[index];
 	buffer_info = &rx->buffer_info[index];
-	skb = __netdev_alloc_skb(netdev, length, GFP_ATOMIC | GFP_DMA);
+	skb = lan743x_alloc_skb(netdev, length, can_fail);
 	if (!skb)
 		return -ENOMEM;
 	dma_ptr = dma_map_single(dev, skb->data, length, DMA_FROM_DEVICE);
@@ -2130,7 +2143,7 @@ static int lan743x_rx_process_packet(struct lan743x_rx *rx)
 	skb = buffer_info->skb;
 
 	/* allocate new skb and map to dma */
-	if (lan743x_rx_init_ring_element(rx, rx->last_head)) {
+	if (lan743x_rx_init_ring_element(rx, rx->last_head, true)) {
 		/* failed to allocate next skb.
 		 * Memory is very low.
 		 * Drop this packet and reuse buffer.
@@ -2330,7 +2343,7 @@ static int lan743x_rx_ring_init(struct lan743x_rx *rx)
 
 	rx->last_head = 0;
 	for (index = 0; index < rx->ring_size; index++) {
-		ret = lan743x_rx_init_ring_element(rx, index);
+		ret = lan743x_rx_init_ring_element(rx, index, false);
 		if (ret)
 			goto cleanup;
 	}
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH net-next v1 6/6] TEST ONLY: lan743x: skb_trim failure test
  2021-01-29 19:52 [PATCH net-next v1 0/6] lan743x speed boost Sven Van Asbroeck
                   ` (4 preceding siblings ...)
  2021-01-29 19:52 ` [PATCH net-next v1 5/6] TEST ONLY: lan743x: skb_alloc failure test Sven Van Asbroeck
@ 2021-01-29 19:52 ` Sven Van Asbroeck
  5 siblings, 0 replies; 32+ messages in thread
From: Sven Van Asbroeck @ 2021-01-29 19:52 UTC (permalink / raw)
  To: Bryan Whitehead, UNGLinuxDriver, David S Miller, Jakub Kicinski
  Cc: Sven Van Asbroeck, Andrew Lunn, Alexey Denisov, Sergej Bauer,
	Tim Harvey, Anders Rønningen, netdev, linux-kernel

From: Sven Van Asbroeck <thesven73@gmail.com>

Simulate low-memory in lan743x_rx_trim_skb(): fail one allocation
in every 100.

Signed-off-by: Sven Van Asbroeck <thesven73@gmail.com>
---

Tree: git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git # 46eb3c108fe1

To: Bryan Whitehead <bryan.whitehead@microchip.com>
To: UNGLinuxDriver@microchip.com
To: "David S. Miller" <davem@davemloft.net>
To: Jakub Kicinski <kuba@kernel.org>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Alexey Denisov <rtgbnm@gmail.com>
Cc: Sergej Bauer <sbauer@blackbox.su>
Cc: Tim Harvey <tharvey@gateworks.com>
Cc: Anders Rønningen <anders@ronningen.priv.no>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org (open list)

 drivers/net/ethernet/microchip/lan743x_main.c | 28 ++++++++-----------
 1 file changed, 11 insertions(+), 17 deletions(-)

diff --git a/drivers/net/ethernet/microchip/lan743x_main.c b/drivers/net/ethernet/microchip/lan743x_main.c
index 149b482fd984..dfae7745094b 100644
--- a/drivers/net/ethernet/microchip/lan743x_main.c
+++ b/drivers/net/ethernet/microchip/lan743x_main.c
@@ -1963,20 +1963,7 @@ static void lan743x_rx_update_tail(struct lan743x_rx *rx, int index)
 				  index);
 }
 
-static struct sk_buff *
-lan743x_alloc_skb(struct net_device *netdev, int length, bool can_fail)
-{
-	static int rx_alloc;
-	int counter = rx_alloc++ % 100;
-
-	if (can_fail && counter >= 20 && counter < 30)
-		return NULL;
-
-	return __netdev_alloc_skb(netdev, length, GFP_ATOMIC | GFP_DMA);
-}
-
-static int
-lan743x_rx_init_ring_element(struct lan743x_rx *rx, int index, bool can_fail)
+static int lan743x_rx_init_ring_element(struct lan743x_rx *rx, int index)
 {
 	struct net_device *netdev = rx->adapter->netdev;
 	struct device *dev = &rx->adapter->pdev->dev;
@@ -1990,7 +1977,7 @@ lan743x_rx_init_ring_element(struct lan743x_rx *rx, int index, bool can_fail)
 
 	descriptor = &rx->ring_cpu_ptr[index];
 	buffer_info = &rx->buffer_info[index];
-	skb = lan743x_alloc_skb(netdev, length, can_fail);
+	skb = __netdev_alloc_skb(netdev, length, GFP_ATOMIC | GFP_DMA);
 	if (!skb)
 		return -ENOMEM;
 	dma_ptr = dma_map_single(dev, skb->data, length, DMA_FROM_DEVICE);
@@ -2058,6 +2045,13 @@ static void lan743x_rx_release_ring_element(struct lan743x_rx *rx, int index)
 static struct sk_buff *
 lan743x_rx_trim_skb(struct sk_buff *skb, int frame_length)
 {
+	static int trim_cnt;
+
+	if ((trim_cnt++ % 100) == 77) {
+		dev_kfree_skb_irq(skb);
+		return NULL;
+	}
+
 	if (skb_linearize(skb)) {
 		dev_kfree_skb_irq(skb);
 		return NULL;
@@ -2143,7 +2137,7 @@ static int lan743x_rx_process_packet(struct lan743x_rx *rx)
 	skb = buffer_info->skb;
 
 	/* allocate new skb and map to dma */
-	if (lan743x_rx_init_ring_element(rx, rx->last_head, true)) {
+	if (lan743x_rx_init_ring_element(rx, rx->last_head)) {
 		/* failed to allocate next skb.
 		 * Memory is very low.
 		 * Drop this packet and reuse buffer.
@@ -2343,7 +2337,7 @@ static int lan743x_rx_ring_init(struct lan743x_rx *rx)
 
 	rx->last_head = 0;
 	for (index = 0; index < rx->ring_size; index++) {
-		ret = lan743x_rx_init_ring_element(rx, index, false);
+		ret = lan743x_rx_init_ring_element(rx, index);
 		if (ret)
 			goto cleanup;
 	}
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH net-next v1 1/6] lan743x: boost performance on cpu archs w/o dma cache snooping
  2021-01-29 19:52 ` [PATCH net-next v1 1/6] lan743x: boost performance on cpu archs w/o dma cache snooping Sven Van Asbroeck
@ 2021-01-29 20:36   ` Andrew Lunn
  2021-01-29 22:49     ` Sven Van Asbroeck
  2021-01-29 22:01   ` Jakub Kicinski
                     ` (3 subsequent siblings)
  4 siblings, 1 reply; 32+ messages in thread
From: Andrew Lunn @ 2021-01-29 20:36 UTC (permalink / raw)
  To: Sven Van Asbroeck
  Cc: Bryan Whitehead, UNGLinuxDriver, David S Miller, Jakub Kicinski,
	Alexey Denisov, Sergej Bauer, Tim Harvey, Anders Rønningen,
	netdev, linux-kernel

> diff --git a/drivers/net/ethernet/microchip/lan743x_main.c b/drivers/net/ethernet/microchip/lan743x_main.c
> index f1f6eba4ace4..f485320e5784 100644
> --- a/drivers/net/ethernet/microchip/lan743x_main.c
> +++ b/drivers/net/ethernet/microchip/lan743x_main.c
> @@ -1957,11 +1957,11 @@ static int lan743x_rx_next_index(struct lan743x_rx *rx, int index)
>  
>  static struct sk_buff *lan743x_rx_allocate_skb(struct lan743x_rx *rx)
>  {
> -	int length = 0;
> +	struct net_device *netdev = rx->adapter->netdev;
>  
> -	length = (LAN743X_MAX_FRAME_SIZE + ETH_HLEN + 4 + RX_HEAD_PADDING);
> -	return __netdev_alloc_skb(rx->adapter->netdev,
> -				  length, GFP_ATOMIC | GFP_DMA);
> +	return __netdev_alloc_skb(netdev,
> +				  netdev->mtu + ETH_HLEN + 4 + RX_HEAD_PADDING,
> +				  GFP_ATOMIC | GFP_DMA);
>  }
>  
>  static void lan743x_rx_update_tail(struct lan743x_rx *rx, int index)
> @@ -1977,9 +1977,10 @@ static int lan743x_rx_init_ring_element(struct lan743x_rx *rx, int index,
>  {
>  	struct lan743x_rx_buffer_info *buffer_info;
>  	struct lan743x_rx_descriptor *descriptor;
> -	int length = 0;
> +	struct net_device *netdev = rx->adapter->netdev;
> +	int length;

Please keep to reverse christmass tree.
>  
> -	length = (LAN743X_MAX_FRAME_SIZE + ETH_HLEN + 4 + RX_HEAD_PADDING);
> +	length = netdev->mtu + ETH_HLEN + 4 + RX_HEAD_PADDING;
>  	descriptor = &rx->ring_cpu_ptr[index];
>  	buffer_info = &rx->buffer_info[index];
>  	buffer_info->skb = skb;
> @@ -2148,11 +2149,18 @@ static int lan743x_rx_process_packet(struct lan743x_rx *rx)
>  			descriptor = &rx->ring_cpu_ptr[first_index];
>  
>  			/* unmap from dma */
> +			packet_length =	RX_DESC_DATA0_FRAME_LENGTH_GET_
> +					(descriptor->data0);
>  			if (buffer_info->dma_ptr) {
> -				dma_unmap_single(&rx->adapter->pdev->dev,
> -						 buffer_info->dma_ptr,
> -						 buffer_info->buffer_length,
> -						 DMA_FROM_DEVICE);
> +				dma_sync_single_for_cpu(&rx->adapter->pdev->dev,
> +							buffer_info->dma_ptr,
> +							packet_length,
> +							DMA_FROM_DEVICE);
> +				dma_unmap_single_attrs(&rx->adapter->pdev->dev,
> +						       buffer_info->dma_ptr,
> +						       buffer_info->buffer_length,
> +						       DMA_FROM_DEVICE,
> +						       DMA_ATTR_SKIP_CPU_SYNC);

So this patch appears to contain two different changes
1) You only allocate a receive buffer as big as the MTU plus overheads
2) You change the cache operations to operate on the received length.

The first change should be completely safe, and i guess, is giving
most of the benefits. The second one is where interesting things might
happen. So please split this patch into two.  If it does break, we can
git bisect, and probably end up on the second patch.

Thanks
	Andrew

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH net-next v1 1/6] lan743x: boost performance on cpu archs w/o dma cache snooping
  2021-01-29 19:52 ` [PATCH net-next v1 1/6] lan743x: boost performance on cpu archs w/o dma cache snooping Sven Van Asbroeck
  2021-01-29 20:36   ` Andrew Lunn
@ 2021-01-29 22:01   ` Jakub Kicinski
  2021-01-29 22:46     ` Sven Van Asbroeck
  2021-01-30 22:10   ` Bryan.Whitehead
                     ` (2 subsequent siblings)
  4 siblings, 1 reply; 32+ messages in thread
From: Jakub Kicinski @ 2021-01-29 22:01 UTC (permalink / raw)
  To: Sven Van Asbroeck
  Cc: Bryan Whitehead, UNGLinuxDriver, David S Miller, Andrew Lunn,
	Alexey Denisov, Sergej Bauer, Tim Harvey, Anders Rønningen,
	netdev, linux-kernel

On Fri, 29 Jan 2021 14:52:35 -0500 Sven Van Asbroeck wrote:
> From: Sven Van Asbroeck <thesven73@gmail.com>
> 
> The buffers in the lan743x driver's receive ring are always 9K,
> even when the largest packet that can be received (the mtu) is
> much smaller. This performs particularly badly on cpu archs
> without dma cache snooping (such as ARM): each received packet
> results in a 9K dma_{map|unmap} operation, which is very expensive
> because cpu caches need to be invalidated.
> 
> Careful measurement of the driver rx path on armv7 reveals that
> the cpu spends the majority of its time waiting for cache
> invalidation.
> 
> Optimize as follows:
> 
> 1. set rx ring buffer size equal to the mtu. this limits the
>    amount of cache that needs to be invalidated per dma_map().
> 
> 2. when dma_unmap()ping, skip cpu sync. Sync only the packet data
>    actually received, the size of which the chip will indicate in
>    its rx ring descriptors. this limits the amount of cache that
>    needs to be invalidated per dma_unmap().
> 
> These optimizations double the rx performance on armv7.
> Third parties report 3x rx speedup on armv8.
> 
> Performance on dma cache snooping architectures (such as x86)
> is expected to stay the same.
> 
> Tested with iperf3 on a freescale imx6qp + lan7430, both sides
> set to mtu 1500 bytes, measure rx performance:
> 
> Before:
> [ ID] Interval           Transfer     Bandwidth       Retr
> [  4]   0.00-20.00  sec   550 MBytes   231 Mbits/sec    0
> After:
> [ ID] Interval           Transfer     Bandwidth       Retr
> [  4]   0.00-20.00  sec  1.33 GBytes   570 Mbits/sec    0
> 
> Test by Anders Roenningen (anders@ronningen.priv.no) on armv8,
>     rx iperf3:
> Before 102 Mbits/sec
> After  279 Mbits/sec
> 
> Signed-off-by: Sven Van Asbroeck <thesven73@gmail.com>

You may need to rebase to see this:

drivers/net/ethernet/microchip/lan743x_main.c:2123:41: warning: restricted __le32 degrades to integer

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH net-next v1 2/6] lan743x: support rx multi-buffer packets
  2021-01-29 19:52 ` [PATCH net-next v1 2/6] lan743x: support rx multi-buffer packets Sven Van Asbroeck
@ 2021-01-29 22:11   ` Willem de Bruijn
  2021-01-29 23:02     ` Sven Van Asbroeck
  2021-01-31  7:06   ` Bryan.Whitehead
  1 sibling, 1 reply; 32+ messages in thread
From: Willem de Bruijn @ 2021-01-29 22:11 UTC (permalink / raw)
  To: Sven Van Asbroeck
  Cc: Bryan Whitehead, UNGLinuxDriver, David S Miller, Jakub Kicinski,
	Andrew Lunn, Alexey Denisov, Sergej Bauer, Tim Harvey,
	Anders Rønningen, Network Development, LKML

On Fri, Jan 29, 2021 at 2:56 PM Sven Van Asbroeck <thesven73@gmail.com> wrote:
>
> From: Sven Van Asbroeck <thesven73@gmail.com>
>
> Multi-buffer packets enable us to use rx ring buffers smaller than
> the mtu. This will allow us to change the mtu on-the-fly, without
> having to stop the network interface in order to re-size the rx
> ring buffers.
>
> This is a big change touching a key driver function (process_packet),
> so care has been taken to test this extensively:
>
> Tests with debug logging enabled (add #define DEBUG).
>
> 1. Limit rx buffer size to 500, so mtu (1500) takes 3 buffers.
>    Ping to chip, verify correct packet size is sent to OS.
>    Ping large packets to chip (ping -s 1400), verify correct
>      packet size is sent to OS.
>    Ping using packets around the buffer size, verify number of
>      buffers is changing, verify correct packet size is sent
>      to OS:
>      $ ping -s 472
>      $ ping -s 473
>      $ ping -s 992
>      $ ping -s 993
>    Verify that each packet is followed by extension processing.
>
> 2. Limit rx buffer size to 500, so mtu (1500) takes 3 buffers.
>    Run iperf3 -s on chip, verify that packets come in 3 buffers
>      at a time.
>    Verify that packet size is equal to mtu.
>    Verify that each packet is followed by extension processing.
>
> 3. Set chip and host mtu to 2000.
>    Limit rx buffer size to 500, so mtu (2000) takes 4 buffers.
>    Run iperf3 -s on chip, verify that packets come in 4 buffers
>      at a time.
>    Verify that packet size is equal to mtu.
>    Verify that each packet is followed by extension processing.
>
> Tests with debug logging DISabled (remove #define DEBUG).
>
> 4. Limit rx buffer size to 500, so mtu (1500) takes 3 buffers.
>    Run iperf3 -s on chip, note sustained rx speed.
>    Set chip and host mtu to 2000, so mtu takes 4 buffers.
>    Run iperf3 -s on chip, note sustained rx speed.
>    Verify no packets are dropped in both cases.
>
> Tests with DEBUG_KMEMLEAK on:
>  $ mount -t debugfs nodev /sys/kernel/debug/
>  $ echo scan > /sys/kernel/debug/kmemleak
>
> 5. Limit rx buffer size to 500, so mtu (1500) takes 3 buffers.
>    Run the following tests concurrently for at least one hour:
>    - iperf3 -s on chip
>    - ping -> chip
>    Monitor reported memory leaks.
>
> 6. Set chip and host mtu to 2000.
>    Limit rx buffer size to 500, so mtu (2000) takes 4 buffers.
>    Run the following tests concurrently for at least one hour:
>    - iperf3 -s on chip
>    - ping -> chip
>    Monitor reported memory leaks.
>
> 7. Simulate low-memory in lan743x_rx_allocate_skb(): fail every
>      100 allocations.
>    Repeat (5) and (6).
>    Monitor reported memory leaks.
>
> 8. Simulate  low-memory in lan743x_rx_allocate_skb(): fail 10
>      allocations in a row in every 100.
>    Repeat (5) and (6).
>    Monitor reported memory leaks.
>
> 9. Simulate  low-memory in lan743x_rx_trim_skb(): fail 1 allocation
>      in every 100.
>    Repeat (5) and (6).
>    Monitor reported memory leaks.
>
> Signed-off-by: Sven Van Asbroeck <thesven73@gmail.com>
> ---
>
> Tree: git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git # 46eb3c108fe1
>
> To: Bryan Whitehead <bryan.whitehead@microchip.com>
> To: UNGLinuxDriver@microchip.com
> To: "David S. Miller" <davem@davemloft.net>
> To: Jakub Kicinski <kuba@kernel.org>
> Cc: Andrew Lunn <andrew@lunn.ch>
> Cc: Alexey Denisov <rtgbnm@gmail.com>
> Cc: Sergej Bauer <sbauer@blackbox.su>
> Cc: Tim Harvey <tharvey@gateworks.com>
> Cc: Anders Rønningen <anders@ronningen.priv.no>
> Cc: netdev@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org (open list)
>
>  drivers/net/ethernet/microchip/lan743x_main.c | 321 ++++++++----------
>  drivers/net/ethernet/microchip/lan743x_main.h |   2 +
>  2 files changed, 143 insertions(+), 180 deletions(-)


> +static struct sk_buff *
> +lan743x_rx_trim_skb(struct sk_buff *skb, int frame_length)
> +{
> +       if (skb_linearize(skb)) {

Is this needed? That will be quite expensive

> +               dev_kfree_skb_irq(skb);
> +               return NULL;
> +       }
> +       frame_length = max_t(int, 0, frame_length - RX_HEAD_PADDING - 2);
> +       if (skb->len > frame_length) {
> +               skb->tail -= skb->len - frame_length;
> +               skb->len = frame_length;
> +       }
> +       return skb;
> +}
> +
>  static int lan743x_rx_process_packet(struct lan743x_rx *rx)
>  {
> -       struct skb_shared_hwtstamps *hwtstamps = NULL;
> +       struct lan743x_rx_descriptor *descriptor, *desc_ext;
>         int result = RX_PROCESS_RESULT_NOTHING_TO_DO;
>         int current_head_index = le32_to_cpu(*rx->head_cpu_ptr);
>         struct lan743x_rx_buffer_info *buffer_info;
> -       struct lan743x_rx_descriptor *descriptor;
> +       struct skb_shared_hwtstamps *hwtstamps;
> +       int frame_length, buffer_length;
> +       struct sk_buff *skb;
>         int extension_index = -1;
> -       int first_index = -1;
> -       int last_index = -1;
> +       bool is_last, is_first;
>
>         if (current_head_index < 0 || current_head_index >= rx->ring_size)
>                 goto done;
> @@ -2068,170 +2075,126 @@ static int lan743x_rx_process_packet(struct lan743x_rx *rx)
>         if (rx->last_head < 0 || rx->last_head >= rx->ring_size)
>                 goto done;
>
> -       if (rx->last_head != current_head_index) {
> -               descriptor = &rx->ring_cpu_ptr[rx->last_head];
> -               if (le32_to_cpu(descriptor->data0) & RX_DESC_DATA0_OWN_)
> -                       goto done;
> +       if (rx->last_head == current_head_index)
> +               goto done;

Is it possible to avoid the large indentation change, or else do that
in a separate patch? It makes it harder to follow the functional
change.

>
> -               if (!(le32_to_cpu(descriptor->data0) & RX_DESC_DATA0_FS_))
> -                       goto done;
> +       descriptor = &rx->ring_cpu_ptr[rx->last_head];
> +       if (le32_to_cpu(descriptor->data0) & RX_DESC_DATA0_OWN_)
> +               goto done;
> +       buffer_info = &rx->buffer_info[rx->last_head];
>
> -               first_index = rx->last_head;
> -               if (le32_to_cpu(descriptor->data0) & RX_DESC_DATA0_LS_) {
> -                       last_index = rx->last_head;
> -               } else {
> -                       int index;
> -
> -                       index = lan743x_rx_next_index(rx, first_index);
> -                       while (index != current_head_index) {
> -                               descriptor = &rx->ring_cpu_ptr[index];
> -                               if (le32_to_cpu(descriptor->data0) & RX_DESC_DATA0_OWN_)
> -                                       goto done;
> -
> -                               if (le32_to_cpu(descriptor->data0) & RX_DESC_DATA0_LS_) {
> -                                       last_index = index;
> -                                       break;
> -                               }
> -                               index = lan743x_rx_next_index(rx, index);
> -                       }
> -               }
> -               if (last_index >= 0) {
> -                       descriptor = &rx->ring_cpu_ptr[last_index];
> -                       if (le32_to_cpu(descriptor->data0) & RX_DESC_DATA0_EXT_) {
> -                               /* extension is expected to follow */
> -                               int index = lan743x_rx_next_index(rx,
> -                                                                 last_index);
> -                               if (index != current_head_index) {
> -                                       descriptor = &rx->ring_cpu_ptr[index];
> -                                       if (le32_to_cpu(descriptor->data0) &
> -                                           RX_DESC_DATA0_OWN_) {
> -                                               goto done;
> -                                       }
> -                                       if (le32_to_cpu(descriptor->data0) &
> -                                           RX_DESC_DATA0_EXT_) {
> -                                               extension_index = index;
> -                                       } else {
> -                                               goto done;
> -                                       }
> -                               } else {
> -                                       /* extension is not yet available */
> -                                       /* prevent processing of this packet */
> -                                       first_index = -1;
> -                                       last_index = -1;
> -                               }
> -                       }
> -               }
> +       is_last = le32_to_cpu(descriptor->data0) & RX_DESC_DATA0_LS_;
> +       is_first = le32_to_cpu(descriptor->data0) & RX_DESC_DATA0_FS_;
> +
> +       if (is_last && le32_to_cpu(descriptor->data0) & RX_DESC_DATA0_EXT_) {
> +               /* extension is expected to follow */
> +               int index = lan743x_rx_next_index(rx, rx->last_head);
> +
> +               if (index == current_head_index)
> +                       /* extension not yet available */
> +                       goto done;
> +               desc_ext = &rx->ring_cpu_ptr[index];
> +               if (le32_to_cpu(desc_ext->data0) & RX_DESC_DATA0_OWN_)
> +                       /* extension not yet available */
> +                       goto done;
> +               if (!(le32_to_cpu(desc_ext->data0) & RX_DESC_DATA0_EXT_))
> +                       goto move_forward;
> +               extension_index = index;
>         }
> -       if (first_index >= 0 && last_index >= 0) {
> -               int real_last_index = last_index;
> -               struct sk_buff *skb = NULL;
> -               u32 ts_sec = 0;
> -               u32 ts_nsec = 0;
> -
> -               /* packet is available */
> -               if (first_index == last_index) {
> -                       /* single buffer packet */
> -                       struct sk_buff *new_skb = NULL;
> -                       int packet_length;
> -
> -                       new_skb = lan743x_rx_allocate_skb(rx);
> -                       if (!new_skb) {
> -                               /* failed to allocate next skb.
> -                                * Memory is very low.
> -                                * Drop this packet and reuse buffer.
> -                                */
> -                               lan743x_rx_reuse_ring_element(rx, first_index);
> -                               goto process_extension;
> -                       }
>
> -                       buffer_info = &rx->buffer_info[first_index];
> -                       skb = buffer_info->skb;
> -                       descriptor = &rx->ring_cpu_ptr[first_index];
> -
> -                       /* unmap from dma */
> -                       packet_length = RX_DESC_DATA0_FRAME_LENGTH_GET_
> -                                       (descriptor->data0);
> -                       if (buffer_info->dma_ptr) {
> -                               dma_sync_single_for_cpu(&rx->adapter->pdev->dev,
> -                                                       buffer_info->dma_ptr,
> -                                                       packet_length,
> -                                                       DMA_FROM_DEVICE);
> -                               dma_unmap_single_attrs(&rx->adapter->pdev->dev,
> -                                                      buffer_info->dma_ptr,
> -                                                      buffer_info->buffer_length,
> -                                                      DMA_FROM_DEVICE,
> -                                                      DMA_ATTR_SKIP_CPU_SYNC);
> -                               buffer_info->dma_ptr = 0;
> -                               buffer_info->buffer_length = 0;
> -                       }
> -                       buffer_info->skb = NULL;
> -                       packet_length = RX_DESC_DATA0_FRAME_LENGTH_GET_
> -                                       (le32_to_cpu(descriptor->data0));
> -                       skb_put(skb, packet_length - 4);
> -                       skb->protocol = eth_type_trans(skb,
> -                                                      rx->adapter->netdev);
> -                       lan743x_rx_init_ring_element(rx, first_index, new_skb);
> -               } else {
> -                       int index = first_index;
> +       /* Only the last buffer in a multi-buffer frame contains the total frame
> +        * length. All other buffers have a zero frame length. The chip
> +        * occasionally sends more buffers than strictly required to reach the
> +        * total frame length.
> +        * Handle this by adding all buffers to the skb in their entirety.
> +        * Once the real frame length is known, trim the skb.
> +        */
> +       frame_length =
> +               RX_DESC_DATA0_FRAME_LENGTH_GET_(le32_to_cpu(descriptor->data0));
> +       buffer_length = buffer_info->buffer_length;
>
> -                       /* multi buffer packet not supported */
> -                       /* this should not happen since buffers are allocated
> -                        * to be at least the mtu size configured in the mac.
> -                        */
> +       netdev_dbg(rx->adapter->netdev, "%s%schunk: %d/%d",
> +                  is_first ? "first " : "      ",
> +                  is_last  ? "last  " : "      ",
> +                  frame_length, buffer_length);
>
> -                       /* clean up buffers */
> -                       if (first_index <= last_index) {
> -                               while ((index >= first_index) &&
> -                                      (index <= last_index)) {
> -                                       lan743x_rx_reuse_ring_element(rx,
> -                                                                     index);
> -                                       index = lan743x_rx_next_index(rx,
> -                                                                     index);
> -                               }
> -                       } else {
> -                               while ((index >= first_index) ||
> -                                      (index <= last_index)) {
> -                                       lan743x_rx_reuse_ring_element(rx,
> -                                                                     index);
> -                                       index = lan743x_rx_next_index(rx,
> -                                                                     index);
> -                               }
> -                       }
> -               }
> +       /* unmap from dma */
> +       if (buffer_info->dma_ptr) {
> +               dma_unmap_single(&rx->adapter->pdev->dev,
> +                                buffer_info->dma_ptr,
> +                                buffer_info->buffer_length,
> +                                DMA_FROM_DEVICE);
> +               buffer_info->dma_ptr = 0;
> +               buffer_info->buffer_length = 0;
> +       }
> +       skb = buffer_info->skb;
>
> -process_extension:
> -               if (extension_index >= 0) {
> -                       descriptor = &rx->ring_cpu_ptr[extension_index];
> -                       buffer_info = &rx->buffer_info[extension_index];
> -
> -                       ts_sec = le32_to_cpu(descriptor->data1);
> -                       ts_nsec = (le32_to_cpu(descriptor->data2) &
> -                                 RX_DESC_DATA2_TS_NS_MASK_);
> -                       lan743x_rx_reuse_ring_element(rx, extension_index);
> -                       real_last_index = extension_index;
> -               }
> +       /* allocate new skb and map to dma */
> +       if (lan743x_rx_init_ring_element(rx, rx->last_head)) {
> +               /* failed to allocate next skb.
> +                * Memory is very low.
> +                * Drop this packet and reuse buffer.
> +                */
> +               lan743x_rx_reuse_ring_element(rx, rx->last_head);
> +               goto process_extension;
> +       }
> +
> +       /* add buffers to skb via skb->frag_list */
> +       if (is_first) {
> +               skb_reserve(skb, RX_HEAD_PADDING);
> +               skb_put(skb, buffer_length - RX_HEAD_PADDING);
> +               if (rx->skb_head)
> +                       dev_kfree_skb_irq(rx->skb_head);
> +               rx->skb_head = skb;
> +       } else if (rx->skb_head) {
> +               skb_put(skb, buffer_length);
> +               if (skb_shinfo(rx->skb_head)->frag_list)
> +                       rx->skb_tail->next = skb;
> +               else
> +                       skb_shinfo(rx->skb_head)->frag_list = skb;

Instead of chaining skbs into frag_list, you could perhaps delay skb
alloc until after reception, allocate buffers stand-alone, and link
them into the skb as skb_frags? That might avoid a few skb alloc +
frees. Though a bit change, not sure how feasible.

> +               rx->skb_tail = skb;
> +               rx->skb_head->len += skb->len;
> +               rx->skb_head->data_len += skb->len;
> +               rx->skb_head->truesize += skb->truesize;
> +       } else {
> +               rx->skb_head = skb;
> +       }
>
> -               if (!skb) {
> -                       result = RX_PROCESS_RESULT_PACKET_DROPPED;
> -                       goto move_forward;
> +process_extension:
> +       if (extension_index >= 0) {
> +               u32 ts_sec;
> +               u32 ts_nsec;
> +
> +               ts_sec = le32_to_cpu(desc_ext->data1);
> +               ts_nsec = (le32_to_cpu(desc_ext->data2) &
> +                         RX_DESC_DATA2_TS_NS_MASK_);
> +               if (rx->skb_head) {
> +                       hwtstamps = skb_hwtstamps(rx->skb_head);
> +                       if (hwtstamps)

This is always true.

You can just call skb_hwtstamps(skb)->hwtstamp = ktime_set(ts_sec, ts_nsec);

Though I see that this is existing code just moved due to
aforementioned indentation change.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH net-next v1 1/6] lan743x: boost performance on cpu archs w/o dma cache snooping
  2021-01-29 22:01   ` Jakub Kicinski
@ 2021-01-29 22:46     ` Sven Van Asbroeck
  0 siblings, 0 replies; 32+ messages in thread
From: Sven Van Asbroeck @ 2021-01-29 22:46 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Bryan Whitehead, Microchip Linux Driver Support, David S Miller,
	Andrew Lunn, Alexey Denisov, Sergej Bauer, Tim Harvey,
	Anders Rønningen, netdev, Linux Kernel Mailing List

On Fri, Jan 29, 2021 at 5:01 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> You may need to rebase to see this:
>
> drivers/net/ethernet/microchip/lan743x_main.c:2123:41: warning: restricted __le32 degrades to integer

Good catch. The problem goes away with the next commit in the set.
This is probably because I rebased to the little endian/big endian patch at
the last minute. I'll fix it up in v2.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH net-next v1 1/6] lan743x: boost performance on cpu archs w/o dma cache snooping
  2021-01-29 20:36   ` Andrew Lunn
@ 2021-01-29 22:49     ` Sven Van Asbroeck
  0 siblings, 0 replies; 32+ messages in thread
From: Sven Van Asbroeck @ 2021-01-29 22:49 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Bryan Whitehead, Microchip Linux Driver Support, David S Miller,
	Jakub Kicinski, Alexey Denisov, Sergej Bauer, Tim Harvey,
	Anders Rønningen, netdev, Linux Kernel Mailing List

Hi Andrew, thank you so much for looking at this patch !

On Fri, Jan 29, 2021 at 3:36 PM Andrew Lunn <andrew@lunn.ch> wrote:
>
> So this patch appears to contain two different changes
> 1) You only allocate a receive buffer as big as the MTU plus overheads
> 2) You change the cache operations to operate on the received length.
>
> The first change should be completely safe, and i guess, is giving
> most of the benefits. The second one is where interesting things might
> happen. So please split this patch into two.  If it does break, we can
> git bisect, and probably end up on the second patch.
>

Yes, I tested this extensively on arm7, but you're right, it might behave
differently on other platforms. I will split into two, as you suggested.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH net-next v1 2/6] lan743x: support rx multi-buffer packets
  2021-01-29 22:11   ` Willem de Bruijn
@ 2021-01-29 23:02     ` Sven Van Asbroeck
  2021-01-29 23:08       ` Willem de Bruijn
  0 siblings, 1 reply; 32+ messages in thread
From: Sven Van Asbroeck @ 2021-01-29 23:02 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: Bryan Whitehead, Microchip Linux Driver Support, David S Miller,
	Jakub Kicinski, Andrew Lunn, Alexey Denisov, Sergej Bauer,
	Tim Harvey, Anders Rønningen, Network Development, LKML

Hoi Willem, thanks a lot for reviewing this patch, much appreciated !!

On Fri, Jan 29, 2021 at 5:11 PM Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
>
> > +static struct sk_buff *
> > +lan743x_rx_trim_skb(struct sk_buff *skb, int frame_length)
> > +{
> > +       if (skb_linearize(skb)) {
>
> Is this needed? That will be quite expensive

The skb will only be non-linear when it's created from a multi-buffer frame.
Multi-buffer frames are only generated right after a mtu change - fewer than
32 frames will be non-linear after an mtu increase. So as long as people don't
change the mtu in a tight loop, skb_linearize is just a single comparison,
99.999999+% of the time.

>
> Is it possible to avoid the large indentation change, or else do that
> in a separate patch? It makes it harder to follow the functional
> change.

It's not immediately obvious, but I have replaced the whole function
with slightly different logic, and the replacement content has a much
flatter indentation structure, and should be easier to follow.

Or perhaps I am misinterpreting your question?

> > +
> > +       /* add buffers to skb via skb->frag_list */
> > +       if (is_first) {
> > +               skb_reserve(skb, RX_HEAD_PADDING);
> > +               skb_put(skb, buffer_length - RX_HEAD_PADDING);
> > +               if (rx->skb_head)
> > +                       dev_kfree_skb_irq(rx->skb_head);
> > +               rx->skb_head = skb;
> > +       } else if (rx->skb_head) {
> > +               skb_put(skb, buffer_length);
> > +               if (skb_shinfo(rx->skb_head)->frag_list)
> > +                       rx->skb_tail->next = skb;
> > +               else
> > +                       skb_shinfo(rx->skb_head)->frag_list = skb;
>
> Instead of chaining skbs into frag_list, you could perhaps delay skb
> alloc until after reception, allocate buffers stand-alone, and link
> them into the skb as skb_frags? That might avoid a few skb alloc +
> frees. Though a bit change, not sure how feasible.

The problem here is this (copypasta from somewhere else in this patch):

/* Only the last buffer in a multi-buffer frame contains the total frame
* length. All other buffers have a zero frame length. The chip
* occasionally sends more buffers than strictly required to reach the
* total frame length.
* Handle this by adding all buffers to the skb in their entirety.
* Once the real frame length is known, trim the skb.
*/

In other words, the chip sometimes sends more buffers than strictly needed to
fit the frame. linearize + trim deals with this thorny issue perfectly.

If the skb weren't linearized, we would run into trouble when trying to trim
(remove from the end) a chunk bigger than the last skb fragment.

> > +process_extension:
> > +       if (extension_index >= 0) {
> > +               u32 ts_sec;
> > +               u32 ts_nsec;
> > +
> > +               ts_sec = le32_to_cpu(desc_ext->data1);
> > +               ts_nsec = (le32_to_cpu(desc_ext->data2) &
> > +                         RX_DESC_DATA2_TS_NS_MASK_);
> > +               if (rx->skb_head) {
> > +                       hwtstamps = skb_hwtstamps(rx->skb_head);
> > +                       if (hwtstamps)
>
> This is always true.
>
> You can just call skb_hwtstamps(skb)->hwtstamp = ktime_set(ts_sec, ts_nsec);

Thank you, will do !

>
> Though I see that this is existing code just moved due to
> aforementioned indentation change.

True, but I can make the change anyway.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH net-next v1 2/6] lan743x: support rx multi-buffer packets
  2021-01-29 23:02     ` Sven Van Asbroeck
@ 2021-01-29 23:08       ` Willem de Bruijn
  2021-01-29 23:10         ` Sven Van Asbroeck
  0 siblings, 1 reply; 32+ messages in thread
From: Willem de Bruijn @ 2021-01-29 23:08 UTC (permalink / raw)
  To: Sven Van Asbroeck
  Cc: Bryan Whitehead, Microchip Linux Driver Support, David S Miller,
	Jakub Kicinski, Andrew Lunn, Alexey Denisov, Sergej Bauer,
	Tim Harvey, Anders Rønningen, Network Development, LKML

On Fri, Jan 29, 2021 at 6:03 PM Sven Van Asbroeck <thesven73@gmail.com> wrote:
>
> Hoi Willem, thanks a lot for reviewing this patch, much appreciated !!
>
> On Fri, Jan 29, 2021 at 5:11 PM Willem de Bruijn
> <willemdebruijn.kernel@gmail.com> wrote:
> >
> > > +static struct sk_buff *
> > > +lan743x_rx_trim_skb(struct sk_buff *skb, int frame_length)
> > > +{
> > > +       if (skb_linearize(skb)) {
> >
> > Is this needed? That will be quite expensive
>
> The skb will only be non-linear when it's created from a multi-buffer frame.
> Multi-buffer frames are only generated right after a mtu change - fewer than
> 32 frames will be non-linear after an mtu increase. So as long as people don't
> change the mtu in a tight loop, skb_linearize is just a single comparison,
> 99.999999+% of the time.

Ah. I had missed the temporary state of this until the buffers are
reinitialized. Yes, then there is no reason to worry. Same for the
frag_list vs frags comment I made.

> >
> > Is it possible to avoid the large indentation change, or else do that
> > in a separate patch? It makes it harder to follow the functional
> > change.
>
> It's not immediately obvious, but I have replaced the whole function
> with slightly different logic, and the replacement content has a much
> flatter indentation structure, and should be easier to follow.
>
> Or perhaps I am misinterpreting your question?

Okay. I found it a bit hard to parse how much true code change was
mixed in with just reindenting existing code. If a lot, then no need
to split of the code refactor.

>
> > > +
> > > +       /* add buffers to skb via skb->frag_list */
> > > +       if (is_first) {
> > > +               skb_reserve(skb, RX_HEAD_PADDING);
> > > +               skb_put(skb, buffer_length - RX_HEAD_PADDING);
> > > +               if (rx->skb_head)
> > > +                       dev_kfree_skb_irq(rx->skb_head);
> > > +               rx->skb_head = skb;
> > > +       } else if (rx->skb_head) {
> > > +               skb_put(skb, buffer_length);
> > > +               if (skb_shinfo(rx->skb_head)->frag_list)
> > > +                       rx->skb_tail->next = skb;
> > > +               else
> > > +                       skb_shinfo(rx->skb_head)->frag_list = skb;
> >
> > Instead of chaining skbs into frag_list, you could perhaps delay skb
> > alloc until after reception, allocate buffers stand-alone, and link
> > them into the skb as skb_frags? That might avoid a few skb alloc +
> > frees. Though a bit change, not sure how feasible.
>
> The problem here is this (copypasta from somewhere else in this patch):
>
> /* Only the last buffer in a multi-buffer frame contains the total frame
> * length. All other buffers have a zero frame length. The chip
> * occasionally sends more buffers than strictly required to reach the
> * total frame length.
> * Handle this by adding all buffers to the skb in their entirety.
> * Once the real frame length is known, trim the skb.
> */
>
> In other words, the chip sometimes sends more buffers than strictly needed to
> fit the frame. linearize + trim deals with this thorny issue perfectly.
>
> If the skb weren't linearized, we would run into trouble when trying to trim
> (remove from the end) a chunk bigger than the last skb fragment.
>
> > > +process_extension:
> > > +       if (extension_index >= 0) {
> > > +               u32 ts_sec;
> > > +               u32 ts_nsec;
> > > +
> > > +               ts_sec = le32_to_cpu(desc_ext->data1);
> > > +               ts_nsec = (le32_to_cpu(desc_ext->data2) &
> > > +                         RX_DESC_DATA2_TS_NS_MASK_);
> > > +               if (rx->skb_head) {
> > > +                       hwtstamps = skb_hwtstamps(rx->skb_head);
> > > +                       if (hwtstamps)
> >
> > This is always true.
> >
> > You can just call skb_hwtstamps(skb)->hwtstamp = ktime_set(ts_sec, ts_nsec);
>
> Thank you, will do !
>
> >
> > Though I see that this is existing code just moved due to
> > aforementioned indentation change.
>
> True, but I can make the change anyway.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH net-next v1 2/6] lan743x: support rx multi-buffer packets
  2021-01-29 23:08       ` Willem de Bruijn
@ 2021-01-29 23:10         ` Sven Van Asbroeck
  0 siblings, 0 replies; 32+ messages in thread
From: Sven Van Asbroeck @ 2021-01-29 23:10 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: Bryan Whitehead, Microchip Linux Driver Support, David S Miller,
	Jakub Kicinski, Andrew Lunn, Alexey Denisov, Sergej Bauer,
	Tim Harvey, Anders Rønningen, Network Development, LKML

On Fri, Jan 29, 2021 at 6:08 PM Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
>
> Okay. I found it a bit hard to parse how much true code change was
> mixed in with just reindenting existing code. If a lot, then no need
> to split of the code refactor.

Thank you. The code is quite hard to review in patch format.
Probably easier to apply the patch first, then read the replaced
function.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [PATCH net-next v1 1/6] lan743x: boost performance on cpu archs w/o dma cache snooping
  2021-01-29 19:52 ` [PATCH net-next v1 1/6] lan743x: boost performance on cpu archs w/o dma cache snooping Sven Van Asbroeck
  2021-01-29 20:36   ` Andrew Lunn
  2021-01-29 22:01   ` Jakub Kicinski
@ 2021-01-30 22:10   ` Bryan.Whitehead
  2021-01-30 23:59     ` Sven Van Asbroeck
  2021-01-31  0:14     ` Sven Van Asbroeck
       [not found]   ` <20210204060210.2362-1-hdanton@sina.com>
  2021-02-05 12:44   ` Sergej Bauer
  4 siblings, 2 replies; 32+ messages in thread
From: Bryan.Whitehead @ 2021-01-30 22:10 UTC (permalink / raw)
  To: thesven73, UNGLinuxDriver, davem, kuba
  Cc: andrew, rtgbnm, sbauer, tharvey, anders, netdev, linux-kernel

Sven, see below comments

> @@ -2148,11 +2149,18 @@ static int lan743x_rx_process_packet(struct
> lan743x_rx *rx)
>                         descriptor = &rx->ring_cpu_ptr[first_index];
> 
>                         /* unmap from dma */
> +                       packet_length = RX_DESC_DATA0_FRAME_LENGTH_GET_
> +                                       (descriptor->data0);
It appears you moved this packet_length assignment from just below the following if block, however  you left out the le32_to_cpu.See next comment

>                         if (buffer_info->dma_ptr) {
> -                               dma_unmap_single(&rx->adapter->pdev->dev,
> -                                                buffer_info->dma_ptr,
> -                                                buffer_info->buffer_length,
> -                                                DMA_FROM_DEVICE);
> +                               dma_sync_single_for_cpu(&rx->adapter->pdev->dev,
> +                                                       buffer_info->dma_ptr,
> +                                                       packet_length,
> +                                                       DMA_FROM_DEVICE);
> +                               dma_unmap_single_attrs(&rx->adapter->pdev->dev,
> +                                                      buffer_info->dma_ptr,
> +                                                      buffer_info->buffer_length,
> +                                                      DMA_FROM_DEVICE,
> +
> + DMA_ATTR_SKIP_CPU_SYNC);
>                                 buffer_info->dma_ptr = 0;
>                                 buffer_info->buffer_length = 0;
>                         }
Just below here is the following line
		packet_length = RX_DESC_DATA0_FRAME_LENGTH_GET_
				(le32_to_cpu(descriptor->data0));
This line was moved above the previous if block, but the le32_to_cpu was removed. Is that intentional?
Also I don't see any mention of this packet_length assignment (after the if block) being removed.
Since packet_length already contains this value, it seems unnecessary to keep this assignment.

> @@ -2167,8 +2175,8 @@ static int lan743x_rx_process_packet(struct
> lan743x_rx *rx)
>                         int index = first_index;
> 
>                         /* multi buffer packet not supported */
> -                       /* this should not happen since
> -                        * buffers are allocated to be at least jumbo size
> +                       /* this should not happen since buffers are allocated
> +                        * to be at least the mtu size configured in the mac.
>                          */
> 
>                         /* clean up buffers */ @@ -2628,6 +2636,9 @@ static int
> lan743x_netdev_change_mtu(struct net_device *netdev, int new_mtu)
>         struct lan743x_adapter *adapter = netdev_priv(netdev);
>         int ret = 0;
> 
> +       if (netif_running(netdev))
> +               return -EBUSY;
> +
>         ret = lan743x_mac_set_mtu(adapter, new_mtu);
>         if (!ret)
>                 netdev->mtu = new_mtu;
> --
> 2.17.1


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH net-next v1 1/6] lan743x: boost performance on cpu archs w/o dma cache snooping
  2021-01-30 22:10   ` Bryan.Whitehead
@ 2021-01-30 23:59     ` Sven Van Asbroeck
  2021-01-31  0:14     ` Sven Van Asbroeck
  1 sibling, 0 replies; 32+ messages in thread
From: Sven Van Asbroeck @ 2021-01-30 23:59 UTC (permalink / raw)
  To: Bryan Whitehead
  Cc: Microchip Linux Driver Support, David Miller, Jakub Kicinski,
	Andrew Lunn, Alexey Denisov, Sergej Bauer, Tim Harvey,
	Anders Rønningen, netdev, Linux Kernel Mailing List

Hi Bryan, thank you so much for reviewing, I really appreciate it.

On Sat, Jan 30, 2021 at 5:11 PM <Bryan.Whitehead@microchip.com> wrote:
>
> >                         /* unmap from dma */
> > +                       packet_length = RX_DESC_DATA0_FRAME_LENGTH_GET_
> > +                                       (descriptor->data0);
> It appears you moved this packet_length assignment from just below the following if block, however  you left out the le32_to_cpu.See next comment
>

Correct on both counts. This is a merge snafu that crept in when I
rebased to Alexey's very recent little/big endian patch, at the last
minute.
My tests didn't catch it, because I'm running on a little-endian cpu,
where le32_to_cpu() compiles to a nop.

Had I had the good sense to run sparse on every patch, like Jakub has,
I would have caught it...

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH net-next v1 1/6] lan743x: boost performance on cpu archs w/o dma cache snooping
  2021-01-30 22:10   ` Bryan.Whitehead
  2021-01-30 23:59     ` Sven Van Asbroeck
@ 2021-01-31  0:14     ` Sven Van Asbroeck
  1 sibling, 0 replies; 32+ messages in thread
From: Sven Van Asbroeck @ 2021-01-31  0:14 UTC (permalink / raw)
  To: Bryan Whitehead
  Cc: Microchip Linux Driver Support, David Miller, Jakub Kicinski,
	Andrew Lunn, Alexey Denisov, Sergej Bauer, Tim Harvey,
	Anders Rønningen, netdev, Linux Kernel Mailing List

On Sat, Jan 30, 2021 at 5:11 PM <Bryan.Whitehead@microchip.com> wrote:
>
> It appears you moved this packet_length assignment from just below the following if block, however  you left out the le32_to_cpu.See next comment

PS this merge snafu is removed completely by the next patch in the set.
So this will not prevent you from reviewing/testing the multi-buffer support,
should you want to.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [PATCH net-next v1 2/6] lan743x: support rx multi-buffer packets
  2021-01-29 19:52 ` [PATCH net-next v1 2/6] lan743x: support rx multi-buffer packets Sven Van Asbroeck
  2021-01-29 22:11   ` Willem de Bruijn
@ 2021-01-31  7:06   ` Bryan.Whitehead
  2021-01-31 15:25     ` Sven Van Asbroeck
  1 sibling, 1 reply; 32+ messages in thread
From: Bryan.Whitehead @ 2021-01-31  7:06 UTC (permalink / raw)
  To: thesven73, UNGLinuxDriver, davem, kuba
  Cc: andrew, rtgbnm, sbauer, tharvey, anders, netdev, linux-kernel

Hi Sven, 

Looks good.
see comments below.

>  static int lan743x_rx_process_packet(struct lan743x_rx *rx)  {
It looks like this function no longer processes a packet, but rather only processes a single buffer.
So perhaps it should be renamed to lan743x_rx_process_buffer, so it is not misleading.

...
> +       /* unmap from dma */
> +       if (buffer_info->dma_ptr) {
> +               dma_unmap_single(&rx->adapter->pdev->dev,
> +                                buffer_info->dma_ptr,
> +                                buffer_info->buffer_length,
> +                                DMA_FROM_DEVICE);
> +               buffer_info->dma_ptr = 0;
> +               buffer_info->buffer_length = 0;
> +       }
> +       skb = buffer_info->skb;
> 
> -process_extension:
> -               if (extension_index >= 0) {
> -                       descriptor = &rx->ring_cpu_ptr[extension_index];
> -                       buffer_info = &rx->buffer_info[extension_index];
> -
> -                       ts_sec = le32_to_cpu(descriptor->data1);
> -                       ts_nsec = (le32_to_cpu(descriptor->data2) &
> -                                 RX_DESC_DATA2_TS_NS_MASK_);
> -                       lan743x_rx_reuse_ring_element(rx, extension_index);
> -                       real_last_index = extension_index;
> -               }
> +       /* allocate new skb and map to dma */
> +       if (lan743x_rx_init_ring_element(rx, rx->last_head)) {

If lan743x_rx_init_ring_element fails to allocate an skb,
Then lan743x_rx_reuse_ring_element will be called.
But that function expects the skb is already allocated and dma mapped.
But the dma was unmapped above.

Also if lan743x_rx_init_ring_element fails to allocate an skb.
Then control will jump to process_extension and therefor
the currently received skb will not be added to the skb list.
I assume that would corrupt the packet? Or am I missing something?

...
> -               if (!skb) {
> -                       result = RX_PROCESS_RESULT_PACKET_DROPPED;
It looks like this return value is no longer used.
If there is no longer a case where a packet will be dropped 
then maybe this return value should be deleted from the header file.

... 
>  move_forward:
> -               /* push tail and head forward */
> -               rx->last_tail = real_last_index;
> -               rx->last_head = lan743x_rx_next_index(rx, real_last_index);
> -       }
> +       /* push tail and head forward */
> +       rx->last_tail = rx->last_head;
> +       rx->last_head = lan743x_rx_next_index(rx, rx->last_head);
> +       result = RX_PROCESS_RESULT_PACKET_RECEIVED;

Since this function handles one buffer at a time,
  The return value RX_PROCESS_RESULT_PACKET_RECEIVED is now misleading.
  Can you change it to RX_PROCESS_RESULT_BUFFER_RECEIVED.



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH net-next v1 2/6] lan743x: support rx multi-buffer packets
  2021-01-31  7:06   ` Bryan.Whitehead
@ 2021-01-31 15:25     ` Sven Van Asbroeck
  2021-02-01 18:04       ` Bryan.Whitehead
  0 siblings, 1 reply; 32+ messages in thread
From: Sven Van Asbroeck @ 2021-01-31 15:25 UTC (permalink / raw)
  To: Bryan Whitehead
  Cc: Microchip Linux Driver Support, David Miller, Jakub Kicinski,
	Andrew Lunn, Alexey Denisov, Sergej Bauer, Tim Harvey,
	Anders Rønningen, netdev, Linux Kernel Mailing List

On Sun, Jan 31, 2021 at 2:06 AM <Bryan.Whitehead@microchip.com> wrote:
>
> >  static int lan743x_rx_process_packet(struct lan743x_rx *rx)  {
> It looks like this function no longer processes a packet, but rather only processes a single buffer.
> So perhaps it should be renamed to lan743x_rx_process_buffer, so it is not misleading.

Agreed, will do.

>
> If lan743x_rx_init_ring_element fails to allocate an skb,
> Then lan743x_rx_reuse_ring_element will be called.
> But that function expects the skb is already allocated and dma mapped.
> But the dma was unmapped above.

Good catch. I think you're right, the skb allocation always has to come before
the unmap. Because if we unmap, and then the skb allocation fails, there is no
guarantee that we can remap the old skb we've just unmapped (it could fail).
And then we'd end up with a broken driver.

BUT I actually joined skb alloc and init_ring_element, because of a very subtle
synchronization bug I was seeing: if someone changes the mtu _in_between_
skb alloc and init_ring_element, things will go haywire, because the skb and
mapping lengths would be different !

We could fix that by using a spinlock I guess, but synchronization primitives
in "hot paths" like these are far from ideal... Would be nice if we could
avoid that.

Here's an idea: what if we fold "unmap from dma" into init_ring_element()?
That way, we get the best of both worlds: length cannot change in the middle,
and the function can always "back out" without touching the ring element
in case an allocation or mapping fails.

Pseudo-code:

init_ring_element() {
    /* single "sampling" of mtu, so no synchronization required */
    length = netdev->mtu + ETH_HLEN + 4 + RX_HEAD_PADDING;

    skb = alloc(length);
    if (!skb) return FAIL;
    dma_ptr = dma_map(skb, length);
    if (!dma_ptr) {
        free(skb);
        return FAIL;
    }
    if (buffer_info->dma_ptr)
        dma_unmap(buffer_info->dma_ptr, buffer_info->buffer_length);
    buffer_info->skb = skb;
    buffer_info->dma_ptr = dma_ptr;
    buffer_info->buffer_length = length;

    return SUCCESS;
}

What do you think?

>
> Also if lan743x_rx_init_ring_element fails to allocate an skb.
> Then control will jump to process_extension and therefor
> the currently received skb will not be added to the skb list.
> I assume that would corrupt the packet? Or am I missing something?
>

Yes if an skb alloc failure in the middle of a multi-buffer frame, will corrupt
the packet inside the frame. A chunk will be missing. I had assumed that this
would be caught by an upper network layer, some checksum would be incorrect?

Are there current networking devices that would send a corrupted packet to
Linux if there is a corruption on the physical link? Especially if they don't
support checksumming?

Maybe my assumption is naive.
I'll fix this up if you believe that it could be an issue.

> ...
> > -               if (!skb) {
> > -                       result = RX_PROCESS_RESULT_PACKET_DROPPED;
> It looks like this return value is no longer used.
> If there is no longer a case where a packet will be dropped
> then maybe this return value should be deleted from the header file.

Agreed, will do.

>
> ...
> >  move_forward:
> > -               /* push tail and head forward */
> > -               rx->last_tail = real_last_index;
> > -               rx->last_head = lan743x_rx_next_index(rx, real_last_index);
> > -       }
> > +       /* push tail and head forward */
> > +       rx->last_tail = rx->last_head;
> > +       rx->last_head = lan743x_rx_next_index(rx, rx->last_head);
> > +       result = RX_PROCESS_RESULT_PACKET_RECEIVED;
>
> Since this function handles one buffer at a time,
>   The return value RX_PROCESS_RESULT_PACKET_RECEIVED is now misleading.
>   Can you change it to RX_PROCESS_RESULT_BUFFER_RECEIVED.

Agreed, will do.

RX_PROCESS_RESULT_XXX can now only take two values (RECEIVED and NOTHING_TO_DO),
so in theory it could be replaced by a bool. But perhaps we should keep the
current names, because they are clearer to the reader?

>
>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [PATCH net-next v1 2/6] lan743x: support rx multi-buffer packets
  2021-01-31 15:25     ` Sven Van Asbroeck
@ 2021-02-01 18:04       ` Bryan.Whitehead
  2021-02-03 18:53         ` Sven Van Asbroeck
  0 siblings, 1 reply; 32+ messages in thread
From: Bryan.Whitehead @ 2021-02-01 18:04 UTC (permalink / raw)
  To: thesven73
  Cc: UNGLinuxDriver, davem, kuba, andrew, rtgbnm, sbauer, tharvey,
	anders, netdev, linux-kernel

Hi Sven, see below

> >
> > If lan743x_rx_init_ring_element fails to allocate an skb, Then
> > lan743x_rx_reuse_ring_element will be called.
> > But that function expects the skb is already allocated and dma mapped.
> > But the dma was unmapped above.
> 
> Good catch. I think you're right, the skb allocation always has to come before
> the unmap. Because if we unmap, and then the skb allocation fails, there is
> no guarantee that we can remap the old skb we've just unmapped (it could
> fail).
> And then we'd end up with a broken driver.
> 
> BUT I actually joined skb alloc and init_ring_element, because of a very
> subtle synchronization bug I was seeing: if someone changes the mtu
> _in_between_ skb alloc and init_ring_element, things will go haywire,
> because the skb and mapping lengths would be different !
> 
> We could fix that by using a spinlock I guess, but synchronization primitives in
> "hot paths" like these are far from ideal... Would be nice if we could avoid
> that.
> 
> Here's an idea: what if we fold "unmap from dma" into init_ring_element()?
> That way, we get the best of both worlds: length cannot change in the
> middle, and the function can always "back out" without touching the ring
> element in case an allocation or mapping fails.
> 
> Pseudo-code:
> 
> init_ring_element() {
>     /* single "sampling" of mtu, so no synchronization required */
>     length = netdev->mtu + ETH_HLEN + 4 + RX_HEAD_PADDING;
> 
>     skb = alloc(length);
>     if (!skb) return FAIL;
>     dma_ptr = dma_map(skb, length);
>     if (!dma_ptr) {
>         free(skb);
>         return FAIL;
>     }
>     if (buffer_info->dma_ptr)
>         dma_unmap(buffer_info->dma_ptr, buffer_info->buffer_length);
>     buffer_info->skb = skb;
>     buffer_info->dma_ptr = dma_ptr;
>     buffer_info->buffer_length = length;
> 
>     return SUCCESS;
> }
> 
> What do you think?

Yes, I believe this will work.

> 
> >
> > Also if lan743x_rx_init_ring_element fails to allocate an skb.
> > Then control will jump to process_extension and therefor the currently
> > received skb will not be added to the skb list.
> > I assume that would corrupt the packet? Or am I missing something?
> >
> 
> Yes if an skb alloc failure in the middle of a multi-buffer frame, will corrupt
> the packet inside the frame. A chunk will be missing. I had assumed that this
> would be caught by an upper network layer, some checksum would be
> incorrect?
> 
> Are there current networking devices that would send a corrupted packet to
> Linux if there is a corruption on the physical link? Especially if they don't
> support checksumming?
> 
> Maybe my assumption is naive.
> I'll fix this up if you believe that it could be an issue.

Yes the upper layers will catch it and drop it.
But if we already know the packet is corrupted, I think it would be better if we
dropped it here, to avoid unnecessary processing upstream.

...
> 
> RX_PROCESS_RESULT_XXX can now only take two values (RECEIVED and
> NOTHING_TO_DO), so in theory it could be replaced by a bool. But perhaps
> we should keep the current names, because they are clearer to the reader?
> 
I'm ok if you want to change it too a bool. 


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH net-next v1 2/6] lan743x: support rx multi-buffer packets
  2021-02-01 18:04       ` Bryan.Whitehead
@ 2021-02-03 18:53         ` Sven Van Asbroeck
  2021-02-03 20:14           ` Bryan.Whitehead
  0 siblings, 1 reply; 32+ messages in thread
From: Sven Van Asbroeck @ 2021-02-03 18:53 UTC (permalink / raw)
  To: Bryan Whitehead
  Cc: Microchip Linux Driver Support, David Miller, Jakub Kicinski,
	Andrew Lunn, Alexey Denisov, Sergej Bauer, Tim Harvey,
	Anders Rønningen, netdev, Linux Kernel Mailing List

Thank you Bryan. I will prepare a v2 early next week.

Would Microchip be able to donate some time to test v2? My own tests
are certainly not perfect. Various stress tests across architectures
(intel/arm) would help create confidence in the multi-buffer frame
implementation. Perhaps Microchip has various test rigs already set
up?

After all, absence of bugs is more important than speed improvements.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [PATCH net-next v1 2/6] lan743x: support rx multi-buffer packets
  2021-02-03 18:53         ` Sven Van Asbroeck
@ 2021-02-03 20:14           ` Bryan.Whitehead
  2021-02-03 20:25             ` Sven Van Asbroeck
  0 siblings, 1 reply; 32+ messages in thread
From: Bryan.Whitehead @ 2021-02-03 20:14 UTC (permalink / raw)
  To: thesven73
  Cc: UNGLinuxDriver, davem, kuba, andrew, rtgbnm, sbauer, tharvey,
	anders, netdev, linux-kernel

Hi Sven,

We can test on x86 PC. We will just need about a week after you release your next version.

Thanks,
Bryan

> -----Original Message-----
> From: Sven Van Asbroeck <thesven73@gmail.com>
> Sent: Wednesday, February 3, 2021 1:53 PM
> To: Bryan Whitehead - C21958 <Bryan.Whitehead@microchip.com>
> Cc: UNGLinuxDriver <UNGLinuxDriver@microchip.com>; David Miller
> <davem@davemloft.net>; Jakub Kicinski <kuba@kernel.org>; Andrew Lunn
> <andrew@lunn.ch>; Alexey Denisov <rtgbnm@gmail.com>; Sergej Bauer
> <sbauer@blackbox.su>; Tim Harvey <tharvey@gateworks.com>; Anders
> Rønningen <anders@ronningen.priv.no>; netdev
> <netdev@vger.kernel.org>; Linux Kernel Mailing List <linux-
> kernel@vger.kernel.org>
> Subject: Re: [PATCH net-next v1 2/6] lan743x: support rx multi-buffer
> packets
> 
> EXTERNAL EMAIL: Do not click links or open attachments unless you know the
> content is safe
> 
> Thank you Bryan. I will prepare a v2 early next week.
> 
> Would Microchip be able to donate some time to test v2? My own tests are
> certainly not perfect. Various stress tests across architectures
> (intel/arm) would help create confidence in the multi-buffer frame
> implementation. Perhaps Microchip has various test rigs already set up?
> 
> After all, absence of bugs is more important than speed improvements.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH net-next v1 2/6] lan743x: support rx multi-buffer packets
  2021-02-03 20:14           ` Bryan.Whitehead
@ 2021-02-03 20:25             ` Sven Van Asbroeck
  2021-02-03 20:41               ` Bryan.Whitehead
  0 siblings, 1 reply; 32+ messages in thread
From: Sven Van Asbroeck @ 2021-02-03 20:25 UTC (permalink / raw)
  To: Bryan Whitehead
  Cc: Microchip Linux Driver Support, David Miller, Jakub Kicinski,
	Andrew Lunn, Alexey Denisov, Sergej Bauer, Tim Harvey,
	Anders Rønningen, netdev, Linux Kernel Mailing List

On Wed, Feb 3, 2021 at 3:14 PM <Bryan.Whitehead@microchip.com> wrote:
>
> We can test on x86 PC. We will just need about a week after you release your next version.
>

That's great. If you have any suggestions on how I can improve testing
on my end, feel free to reach out.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [PATCH net-next v1 2/6] lan743x: support rx multi-buffer packets
  2021-02-03 20:25             ` Sven Van Asbroeck
@ 2021-02-03 20:41               ` Bryan.Whitehead
  0 siblings, 0 replies; 32+ messages in thread
From: Bryan.Whitehead @ 2021-02-03 20:41 UTC (permalink / raw)
  To: thesven73
  Cc: UNGLinuxDriver, davem, kuba, andrew, rtgbnm, sbauer, tharvey,
	anders, netdev, linux-kernel

> On Wed, Feb 3, 2021 at 3:14 PM <Bryan.Whitehead@microchip.com> wrote:
> >
> > We can test on x86 PC. We will just need about a week after you release
> your next version.
> >
> 
> That's great. If you have any suggestions on how I can improve testing on my
> end, feel free to reach out.

If you are able, in addition to basic rx and tx iperf tests, I would recommend PTP tests.
PTP relies on the time stamps extracted from the extension descriptors, which is directly in the RX path you are modifying.

If you are not able, we will at least cover that for x86 PC.

Thanks,
Bryan


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH net-next v1 1/6] lan743x: boost performance on cpu archs w/o dma cache snooping
       [not found]   ` <20210204060210.2362-1-hdanton@sina.com>
@ 2021-02-05  9:31     ` Christoph Hellwig
  2021-02-05 14:01       ` Sven Van Asbroeck
  0 siblings, 1 reply; 32+ messages in thread
From: Christoph Hellwig @ 2021-02-05  9:31 UTC (permalink / raw)
  To: Hillf Danton
  Cc: Andrew Lunn, Sven Van Asbroeck, Christoph Hellwig,
	Bryan Whitehead, netdev, linux-kernel

This is a pattern we've seen in a few other net driver, so we should
be ok.  It just is rather hairy and needs a good justification, which
seems to be given here.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH net-next v1 1/6] lan743x: boost performance on cpu archs w/o dma cache snooping
  2021-01-29 19:52 ` [PATCH net-next v1 1/6] lan743x: boost performance on cpu archs w/o dma cache snooping Sven Van Asbroeck
                     ` (3 preceding siblings ...)
       [not found]   ` <20210204060210.2362-1-hdanton@sina.com>
@ 2021-02-05 12:44   ` Sergej Bauer
  2021-02-05 14:07     ` Sven Van Asbroeck
  4 siblings, 1 reply; 32+ messages in thread
From: Sergej Bauer @ 2021-02-05 12:44 UTC (permalink / raw)
  To: thesven73
  Cc: andrew, Markus.Elfring, rtgbnm, tharvey, anders, sbauer,
	Bryan Whitehead, maintainer:MICROCHIP LAN743X ETHERNET DRIVER,
	David S. Miller, Jakub Kicinski,
	open list:MICROCHIP LAN743X ETHERNET DRIVER, open list

Hi Sven
I can confirm great stability improvement after your patch
"lan743x: boost performance on cpu archs w/o dma cache snooping".
Please note, that test_ber opens raw sockets as
s = socket(AF_PACKET, SOCK_RAW, ETH_P_ALL)
and resulting 'average speed' is a average egress speed.

Test machine is Intel Pentium G4560 3.50GHz
lan743x with rejected virtual phy 'inside'

What I had before:
$ ifmtu eth7 500
$ test_ber -l eth7 -c 1000 -n 1000000 -f500 --no-conf
...
number of sent packets      = 1000000
number of received packets  = 289017
number of lost packets = 710983
number of out of order packets = 0
number of bit errors   = 0
total errors detected  = 710983
bit error rate         = 0.710983
average speed: 429.3799 Mbit/s

$ ifmtu eth7 1500
$ sudo test_ber -l eth7 -c 1000 -n 1000000 -f1500 --no-conf
...
number of sent packets      = 1000000
number of received packets  = 577194
number of lost packets = 422806
number of out of order packets = 0
number of bit errors   = 0
total errors detected  = 422806
bit error rate         = 0.422806
average speed: 644.6557 Mbit/s
---

and what I had after your patch:
$ ifmtu eth7 500
$ test_ber -l eth7 -c 1000 -n 1000000 -f500 --no-conf
...
number of sent packets      = 1000000
number of received packets  = 711329
number of lost packets = 288671
number of out of order packets = 0
number of bit errors   = 0
total errors detected  = 288671
bit error rate         = 0.288671
average speed: 429.2263 Mbit/s

$ ifmtu eth7 1500
$ test_ber -l eth7 -c 1000 -n 1000000 -f1500 --no-conf
...
number of sent packets      = 1000000
number of received packets  = 1000000
number of lost packets = 0
number of out of order packets = 0
number of bit errors   = 0
total errors detected  = 0
bit error rate         = 0
average speed: 644.5405 Mbit/s

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH net-next v1 1/6] lan743x: boost performance on cpu archs w/o dma cache snooping
  2021-02-05  9:31     ` Christoph Hellwig
@ 2021-02-05 14:01       ` Sven Van Asbroeck
  0 siblings, 0 replies; 32+ messages in thread
From: Sven Van Asbroeck @ 2021-02-05 14:01 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Hillf Danton, Andrew Lunn, Bryan Whitehead, netdev,
	Linux Kernel Mailing List

Hi Christoph,

On Fri, Feb 5, 2021 at 4:31 AM Christoph Hellwig <hch@lst.de> wrote:
>
> This is a pattern we've seen in a few other net driver, so we should
> be ok.  It just is rather hairy and needs a good justification, which
> seems to be given here.

Thank you so much for taking the time to look into this.
That is certainly good news !!

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH net-next v1 1/6] lan743x: boost performance on cpu archs w/o dma cache snooping
  2021-02-05 12:44   ` Sergej Bauer
@ 2021-02-05 14:07     ` Sven Van Asbroeck
  2021-02-05 15:09       ` Sergej Bauer
  0 siblings, 1 reply; 32+ messages in thread
From: Sven Van Asbroeck @ 2021-02-05 14:07 UTC (permalink / raw)
  To: Sergej Bauer
  Cc: Andrew Lunn, Markus.Elfring, Alexey Denisov, Tim Harvey,
	Anders Rønningen, Bryan Whitehead,
	maintainer:MICROCHIP LAN743X ETHERNET DRIVER, David S. Miller,
	Jakub Kicinski, open list:MICROCHIP LAN743X ETHERNET DRIVER,
	open list

Hi Sergej,

On Fri, Feb 5, 2021 at 7:44 AM Sergej Bauer <sbauer@blackbox.su> wrote:
>
> Hi Sven
> I can confirm great stability improvement after your patch
> "lan743x: boost performance on cpu archs w/o dma cache snooping".
>
> Test machine is Intel Pentium G4560 3.50GHz
> lan743x with rejected virtual phy 'inside'

Interesting, so the speed boost patch seems to improve things even on Intel...

Would you be able to apply and test the multi-buffer patch as well?
To do that, you can simply apply patches [2/6] and [3/6] on top of
what you already have.

Keeping in mind that Bryan has identified an issue with the above
patch, which will get fixed in v2. So YMMV.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH net-next v1 1/6] lan743x: boost performance on cpu archs w/o dma cache snooping
  2021-02-05 14:07     ` Sven Van Asbroeck
@ 2021-02-05 15:09       ` Sergej Bauer
  2021-02-05 16:39         ` Sven Van Asbroeck
  0 siblings, 1 reply; 32+ messages in thread
From: Sergej Bauer @ 2021-02-05 15:09 UTC (permalink / raw)
  To: thesven73
  Cc: andrew, Markus.Elfring, rtgbnm, tharvey, anders, sbauer,
	Bryan Whitehead, maintainer:MICROCHIP LAN743X ETHERNET DRIVER,
	David S. Miller, Jakub Kicinski,
	open list:MICROCHIP LAN743X ETHERNET DRIVER, open list

On Friday, February 5, 2021 5:07:22 PM MSK you wrote:
> Hi Sergej,
> 
> On Fri, Feb 5, 2021 at 7:44 AM Sergej Bauer <sbauer@blackbox.su> wrote:
> > Hi Sven
> > I can confirm great stability improvement after your patch
> > "lan743x: boost performance on cpu archs w/o dma cache snooping".
> > 
> > Test machine is Intel Pentium G4560 3.50GHz
> > lan743x with rejected virtual phy 'inside'
> 
> Interesting, so the speed boost patch seems to improve things even on
> Intel...
> 
> Would you be able to apply and test the multi-buffer patch as well?
> To do that, you can simply apply patches [2/6] and [3/6] on top of
> what you already have.
> 

Hi Sven
Tests after applying patches [2/6] and [3/6] are:
$ ifmtu eth7 500
$ sudo test_ber -l eth7 -c 1000 -n 1000000 -f500 --no-conf
...
number of sent packets      = 1000000
number of received packets  = 713288
number of lost packets = 286712
number of out of order packets = 0
number of bit errors   = 0
total errors detected  = 286712
bit error rate         = 0.286712
average speed: 427.8043 Mbit/s

$ ifmtu eth7 1500
$ sudo test_ber -l eth7 -c 1000 -n 1000000 -f500 --no-conf
...
number of sent packets      = 1000000
number of received packets  = 707869
number of lost packets = 292131
number of out of order packets = 0
number of bit errors   = 0
total errors detected  = 292131
bit error rate         = 0.292131
average speed: 431.0163 Mbit/s

$ sudo test_ber -l eth7 -c 1000 -n 1000000 -f1500 --no-conf
...
number of sent packets      = 1000000
number of received packets  = 1000000
number of lost packets = 0
number of out of order packets = 0
number of bit errors   = 0
total errors detected  = 0
bit error rate         = 0
average speed: 646.4932 Mbit/s

> Keeping in mind that Bryan has identified an issue with the above
> patch, which will get fixed in v2. So YMMV.
I'll perform tests with v2 as well.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH net-next v1 1/6] lan743x: boost performance on cpu archs w/o dma cache snooping
  2021-02-05 15:09       ` Sergej Bauer
@ 2021-02-05 16:39         ` Sven Van Asbroeck
  2021-02-05 16:59           ` Sergej Bauer
  0 siblings, 1 reply; 32+ messages in thread
From: Sven Van Asbroeck @ 2021-02-05 16:39 UTC (permalink / raw)
  To: Sergej Bauer
  Cc: Andrew Lunn, Markus.Elfring, Alexey Denisov, Tim Harvey,
	Anders Rønningen, Bryan Whitehead,
	maintainer:MICROCHIP LAN743X ETHERNET DRIVER, David S. Miller,
	Jakub Kicinski, open list:MICROCHIP LAN743X ETHERNET DRIVER,
	open list

Hi Sergej,

On Fri, Feb 5, 2021 at 10:09 AM Sergej Bauer <sbauer@blackbox.su> wrote:
>
> Tests after applying patches [2/6] and [3/6] are:
> $ ifmtu eth7 500
> $ sudo test_ber -l eth7 -c 1000 -n 1000000 -f500 --no-conf

Thank you! Is there a way for me to run test_ber myself?
Is this a standard, or a bespoke testing tool?

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH net-next v1 1/6] lan743x: boost performance on cpu archs w/o dma cache snooping
  2021-02-05 16:39         ` Sven Van Asbroeck
@ 2021-02-05 16:59           ` Sergej Bauer
  0 siblings, 0 replies; 32+ messages in thread
From: Sergej Bauer @ 2021-02-05 16:59 UTC (permalink / raw)
  To: Sven Van Asbroeck
  Cc: Andrew Lunn, Markus.Elfring, Alexey Denisov, Tim Harvey,
	Anders Rønningen, Bryan Whitehead,
	maintainer:MICROCHIP LAN743X ETHERNET DRIVER, David S. Miller,
	Jakub Kicinski, open list:MICROCHIP LAN743X ETHERNET DRIVER,
	open list

sOn Friday, February 5, 2021 7:39:40 PM MSK Sven Van Asbroeck wrote:
> Hi Sergej,
> 
> On Fri, Feb 5, 2021 at 10:09 AM Sergej Bauer <sbauer@blackbox.su> wrote:
> > Tests after applying patches [2/6] and [3/6] are:
> > $ ifmtu eth7 500
> > $ sudo test_ber -l eth7 -c 1000 -n 1000000 -f500 --no-conf
> 
> Thank you! Is there a way for me to run test_ber myself?
> Is this a standard, or a bespoke testing tool?
It's kind of bespoke... A part of framework to assist HW guys in developing
PHY-device. But the project is finished, so I could ask for permission to send
the source to you.




^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2021-02-05 23:56 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-29 19:52 [PATCH net-next v1 0/6] lan743x speed boost Sven Van Asbroeck
2021-01-29 19:52 ` [PATCH net-next v1 1/6] lan743x: boost performance on cpu archs w/o dma cache snooping Sven Van Asbroeck
2021-01-29 20:36   ` Andrew Lunn
2021-01-29 22:49     ` Sven Van Asbroeck
2021-01-29 22:01   ` Jakub Kicinski
2021-01-29 22:46     ` Sven Van Asbroeck
2021-01-30 22:10   ` Bryan.Whitehead
2021-01-30 23:59     ` Sven Van Asbroeck
2021-01-31  0:14     ` Sven Van Asbroeck
     [not found]   ` <20210204060210.2362-1-hdanton@sina.com>
2021-02-05  9:31     ` Christoph Hellwig
2021-02-05 14:01       ` Sven Van Asbroeck
2021-02-05 12:44   ` Sergej Bauer
2021-02-05 14:07     ` Sven Van Asbroeck
2021-02-05 15:09       ` Sergej Bauer
2021-02-05 16:39         ` Sven Van Asbroeck
2021-02-05 16:59           ` Sergej Bauer
2021-01-29 19:52 ` [PATCH net-next v1 2/6] lan743x: support rx multi-buffer packets Sven Van Asbroeck
2021-01-29 22:11   ` Willem de Bruijn
2021-01-29 23:02     ` Sven Van Asbroeck
2021-01-29 23:08       ` Willem de Bruijn
2021-01-29 23:10         ` Sven Van Asbroeck
2021-01-31  7:06   ` Bryan.Whitehead
2021-01-31 15:25     ` Sven Van Asbroeck
2021-02-01 18:04       ` Bryan.Whitehead
2021-02-03 18:53         ` Sven Van Asbroeck
2021-02-03 20:14           ` Bryan.Whitehead
2021-02-03 20:25             ` Sven Van Asbroeck
2021-02-03 20:41               ` Bryan.Whitehead
2021-01-29 19:52 ` [PATCH net-next v1 3/6] lan743x: allow mtu change while network interface is up Sven Van Asbroeck
2021-01-29 19:52 ` [PATCH net-next v1 4/6] TEST ONLY: lan743x: limit rx ring buffer size to 500 bytes Sven Van Asbroeck
2021-01-29 19:52 ` [PATCH net-next v1 5/6] TEST ONLY: lan743x: skb_alloc failure test Sven Van Asbroeck
2021-01-29 19:52 ` [PATCH net-next v1 6/6] TEST ONLY: lan743x: skb_trim " Sven Van Asbroeck

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).