linux-renesas-soc.vger.kernel.org archive mirror
* [PATCH 0/7] sh_eth: implement simple RX checksum offload
@ 2019-01-27 17:33 Sergei Shtylyov
  2019-01-27 17:36 ` [PATCH 1/7] sh_eth: rename sh_eth_cpu_data::hw_checksum Sergei Shtylyov
                   ` (7 more replies)
  0 siblings, 8 replies; 30+ messages in thread
From: Sergei Shtylyov @ 2019-01-27 17:33 UTC (permalink / raw)
  To: netdev, David S. Miller; +Cc: linux-renesas-soc, linux-sh

Hello!

Here's a set of 7 patches against DaveM's 'net-next.git' repo. I'm implementing
the simple RX checksum offload (like was done for the 'ravb' driver by Simon
Horman); it was only tested on the R8A77980 SoC, the other SoCs should just
work (according to their manuals)...

[1/7] sh_eth: rename sh_eth_cpu_data::hw_checksum
[2/7] sh_eth: RX checksum offload support
[3/7] sh_eth: offload RX checksum on R7S72100
[4/7] sh_eth: offload RX checksum on R8A7740
[5/7] sh_eth: offload RX checksum on R8A77980
[6/7] sh_eth: offload RX checksum on SH7734
[7/7] sh_eth: offload RX checksum on SH7763

MBR, Sergei


* [PATCH 1/7] sh_eth: rename sh_eth_cpu_data::hw_checksum
  2019-01-27 17:33 [PATCH 0/7] sh_eth: implement simple RX checksum offload Sergei Shtylyov
@ 2019-01-27 17:36 ` Sergei Shtylyov
  2019-01-28  9:21   ` Geert Uytterhoeven
  2019-01-27 17:37 ` [PATCH 2/7] sh_eth: RX checksum offload support Sergei Shtylyov
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 30+ messages in thread
From: Sergei Shtylyov @ 2019-01-27 17:36 UTC (permalink / raw)
  To: netdev, David S. Miller; +Cc: linux-renesas-soc, linux-sh

Commit 62e04b7e0e3c ("sh_eth: rename 'sh_eth_cpu_data::hw_crc'") renamed
the field to 'hw_checksum' for the Ether DMAC "intelligent checksum",
however some Ether MACs implement a simpler checksumming scheme, so that
name now seems misleading. Rename that field to 'csmr' as the "intelligent
checkmum" is always controlled by the CSMR register.

Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>

---
 drivers/net/ethernet/renesas/sh_eth.c |   14 +++++++-------
 drivers/net/ethernet/renesas/sh_eth.h |    2 +-
 2 files changed, 8 insertions(+), 8 deletions(-)

Index: net-next/drivers/net/ethernet/renesas/sh_eth.c
===================================================================
--- net-next.orig/drivers/net/ethernet/renesas/sh_eth.c
+++ net-next/drivers/net/ethernet/renesas/sh_eth.c
@@ -555,7 +555,7 @@ static int sh_eth_soft_reset_gether(stru
 	sh_eth_write(ndev, 0, RDFFR);
 
 	/* Reset HW CRC register */
-	if (mdp->cd->hw_checksum)
+	if (mdp->cd->csmr)
 		sh_eth_write(ndev, 0, CSMR);
 
 	/* Select MII mode */
@@ -619,7 +619,7 @@ static struct sh_eth_cpu_data r7s72100_d
 	.no_trimd	= 1,
 	.no_ade		= 1,
 	.xdfar_rw	= 1,
-	.hw_checksum	= 1,
+	.csmr		= 1,
 	.tsu		= 1,
 	.no_tx_cntrs	= 1,
 };
@@ -668,7 +668,7 @@ static struct sh_eth_cpu_data r8a7740_da
 	.no_trimd	= 1,
 	.no_ade		= 1,
 	.xdfar_rw	= 1,
-	.hw_checksum	= 1,
+	.csmr		= 1,
 	.tsu		= 1,
 	.select_mii	= 1,
 	.magic		= 1,
@@ -793,7 +793,7 @@ static struct sh_eth_cpu_data r8a77980_d
 	.no_trimd	= 1,
 	.no_ade		= 1,
 	.xdfar_rw	= 1,
-	.hw_checksum	= 1,
+	.csmr		= 1,
 	.select_mii	= 1,
 	.magic		= 1,
 	.cexcr		= 1,
@@ -1045,7 +1045,7 @@ static struct sh_eth_cpu_data sh7734_dat
 	.no_ade		= 1,
 	.xdfar_rw	= 1,
 	.tsu		= 1,
-	.hw_checksum	= 1,
+	.csmr		= 1,
 	.select_mii	= 1,
 	.magic		= 1,
 	.cexcr		= 1,
@@ -1633,7 +1633,7 @@ static int sh_eth_rx(struct net_device *
 		 * the RFS bits are from bit 25 to bit 16. So, the
 		 * driver needs right shifting by 16.
 		 */
-		if (mdp->cd->hw_checksum)
+		if (mdp->cd->csmr)
 			desc_status >>= 16;
 
 		skb = mdp->rx_skbuff[entry];
@@ -2173,7 +2173,7 @@ static size_t __sh_eth_get_regs(struct n
 	add_reg(MAFCR);
 	if (cd->rtrate)
 		add_reg(RTRATE);
-	if (cd->hw_checksum)
+	if (cd->csmr)
 		add_reg(CSMR);
 	if (cd->select_mii)
 		add_reg(RMII_MII);
Index: net-next/drivers/net/ethernet/renesas/sh_eth.h
===================================================================
--- net-next.orig/drivers/net/ethernet/renesas/sh_eth.h
+++ net-next/drivers/net/ethernet/renesas/sh_eth.h
@@ -499,7 +499,7 @@ struct sh_eth_cpu_data {
 	unsigned no_ade:1;	/* E-DMAC DOES NOT have ADE bit in EESR */
 	unsigned no_xdfar:1;	/* E-DMAC DOES NOT have RDFAR/TDFAR */
 	unsigned xdfar_rw:1;	/* E-DMAC has writeable RDFAR/TDFAR */
-	unsigned hw_checksum:1;	/* E-DMAC has CSMR */
+	unsigned csmr:1;	/* E-DMAC has CSMR */
 	unsigned select_mii:1;	/* EtherC has RMII_MII (MII select register) */
 	unsigned rmiimode:1;	/* EtherC has RMIIMODE register */
 	unsigned rtrate:1;	/* EtherC has RTRATE register */


* [PATCH 2/7] sh_eth: RX checksum offload support
  2019-01-27 17:33 [PATCH 0/7] sh_eth: implement simple RX checksum offload Sergei Shtylyov
  2019-01-27 17:36 ` [PATCH 1/7] sh_eth: rename sh_eth_cpu_data::hw_checksum Sergei Shtylyov
@ 2019-01-27 17:37 ` Sergei Shtylyov
  2019-01-28 12:18   ` Simon Horman
  2019-01-27 17:38 ` [PATCH 3/7] sh_eth: offload RX checksum on R7S72100 Sergei Shtylyov
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 30+ messages in thread
From: Sergei Shtylyov @ 2019-01-27 17:37 UTC (permalink / raw)
  To: netdev, David S. Miller; +Cc: linux-renesas-soc, linux-sh

Add support for the RX checksum offload. This is enabled by default and
may be disabled and re-enabled using 'ethtool':

# ethtool -K eth0 rx {on|off}

Some Ether MACs provide a simple checksumming scheme which appears to be
completely compatible with CHECKSUM_COMPLETE: sum of all packet data after
the L2 header is appended to packet data; this may be trivially read by
the driver and used to update the skb accordingly. The same checksumming
scheme is implemented in the EtherAVB MACs and now supported by the 'ravb'
driver.
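
As a rough userspace illustration of the scheme described above (not the
driver code itself), the sketch below reads the 2 trailing checksum bytes as
a little-endian value and trims them from the payload. The 'struct rx_buf'
type and the rx_take_hw_csum() helper are hypothetical names invented for
this example; the little-endian byte order mirrors the get_unaligned_le16()
call in the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the received data that follows the L2 header. */
struct rx_buf {
	const uint8_t *data;
	size_t len;
};

/*
 * Take the 16-bit hardware checksum from the last 2 bytes of the buffer
 * (little-endian) and shrink the buffer so that it covers packet data only.
 * Returns 0 on success, or -1 if the buffer is too short to hold a checksum.
 */
static int rx_take_hw_csum(struct rx_buf *b, uint16_t *csum)
{
	const uint8_t *p;

	if (b->len < sizeof(uint16_t))
		return -1;

	p = b->data + b->len - sizeof(uint16_t);
	*csum = (uint16_t)(p[0] | (p[1] << 8));
	b->len -= sizeof(uint16_t);
	return 0;
}
```

In the driver the value is additionally passed through csum_unfold() before
being stored in skb->csum; that step is left out of this sketch.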

In terms of performance, throughput is close to gigabit line rate with the
RX checksum offload both enabled and disabled.  The 'perf' output, however,
appears to indicate that significantly less time is spent in do_csum() --
this is as expected.

Test results with RX checksum offload enabled:

~/netperf-2.2pl4# perf record -a ./netperf -t TCP_MAERTS -H 192.168.2.4
TCP MAERTS TEST to 192.168.2.4
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

131072  16384  16384    10.01     933.93
[ perf record: Woken up 8 times to write data ]
[ perf record: Captured and wrote 1.955 MB perf.data (41940 samples) ]
~/netperf-2.2pl4# perf report
Samples: 41K of event 'cycles:ppp', Event count (approx.): 9915302763
Overhead  Command          Shared Object             Symbol
   9.44%  netperf          [kernel.kallsyms]         [k] __arch_copy_to_user
   7.75%  swapper          [kernel.kallsyms]         [k] _raw_spin_unlock_irq
   6.31%  swapper          [kernel.kallsyms]         [k] default_idle_call
   5.89%  swapper          [kernel.kallsyms]         [k] arch_cpu_idle
   4.37%  swapper          [kernel.kallsyms]         [k] tick_nohz_idle_exit
   4.02%  netperf          [kernel.kallsyms]         [k] _raw_spin_unlock_irq
   2.52%  netperf          [kernel.kallsyms]         [k] preempt_count_sub
   1.81%  netperf          [kernel.kallsyms]         [k] tcp_recvmsg
   1.80%  netperf          [kernel.kallsyms]         [k] _raw_spin_unlock_irqres
   1.78%  netperf          [kernel.kallsyms]         [k] preempt_count_add
   1.36%  netperf          [kernel.kallsyms]         [k] __tcp_transmit_skb
   1.20%  netperf          [kernel.kallsyms]         [k] __local_bh_enable_ip
   1.10%  netperf          [kernel.kallsyms]         [k] sh_eth_start_xmit

Test results with RX checksum offload disabled:

~/netperf-2.2pl4# perf record -a ./netperf -t TCP_MAERTS -H 192.168.2.4
TCP MAERTS TEST to 192.168.2.4
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec
131072  16384  16384    10.01     932.04
[ perf record: Woken up 14 times to write data ]
[ perf record: Captured and wrote 3.642 MB perf.data (78817 samples) ]
~/netperf-2.2pl4# perf report
Samples: 78K of event 'cycles:ppp', Event count (approx.): 18091442796          
Overhead  Command          Shared Object       Symbol                           
   7.00%  swapper          [kernel.kallsyms]   [k] do_csum                      
   3.94%  swapper          [kernel.kallsyms]   [k] sh_eth_poll                  
   3.83%  ksoftirqd/0      [kernel.kallsyms]   [k] do_csum                      
   3.23%  swapper          [kernel.kallsyms]   [k] _raw_spin_unlock_irq         
   2.87%  netperf          [kernel.kallsyms]   [k] __arch_copy_to_user          
   2.86%  swapper          [kernel.kallsyms]   [k] arch_cpu_idle                
   2.13%  swapper          [kernel.kallsyms]   [k] default_idle_call            
   2.12%  ksoftirqd/0      [kernel.kallsyms]   [k] sh_eth_poll                  
   2.02%  swapper          [kernel.kallsyms]   [k] _raw_spin_unlock_irqrestore  
   1.84%  swapper          [kernel.kallsyms]   [k] __softirqentry_text_start    
   1.64%  swapper          [kernel.kallsyms]   [k] tick_nohz_idle_exit          
   1.53%  netperf          [kernel.kallsyms]   [k] _raw_spin_unlock_irq         
   1.32%  netperf          [kernel.kallsyms]   [k] preempt_count_sub            
   1.27%  swapper          [kernel.kallsyms]   [k] __pi___inval_dcache_area     
   1.22%  swapper          [kernel.kallsyms]   [k] check_preemption_disabled    
   1.01%  ksoftirqd/0      [kernel.kallsyms]   [k] _raw_spin_unlock_irqrestore  

The above results were collected on the R-Car V3H Starter Kit board.

Based on the commit 4d86d3818627 ("ravb: RX checksum offload")...

Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>

---
 drivers/net/ethernet/renesas/sh_eth.c |   60 ++++++++++++++++++++++++++++++++--
 drivers/net/ethernet/renesas/sh_eth.h |    1 
 2 files changed, 59 insertions(+), 2 deletions(-)

Index: net-next/drivers/net/ethernet/renesas/sh_eth.c
===================================================================
--- net-next.orig/drivers/net/ethernet/renesas/sh_eth.c
+++ net-next/drivers/net/ethernet/renesas/sh_eth.c
@@ -1532,8 +1532,9 @@ static int sh_eth_dev_init(struct net_de
 	mdp->irq_enabled = true;
 	sh_eth_write(ndev, mdp->cd->eesipr_value, EESIPR);
 
-	/* PAUSE Prohibition */
+	/* EMAC Mode: PAUSE prohibition; Duplex; RX Checksum; TX; RX */
 	sh_eth_write(ndev, ECMR_ZPF | (mdp->duplex ? ECMR_DM : 0) |
+		     (ndev->features & NETIF_F_RXCSUM ? ECMR_RCSC : 0) |
 		     ECMR_TE | ECMR_RE, ECMR);
 
 	if (mdp->cd->set_rate)
@@ -1592,6 +1593,19 @@ static void sh_eth_dev_exit(struct net_d
 	update_mac_address(ndev);
 }
 
+static void sh_eth_rx_csum(struct sk_buff *skb)
+{
+	u8 *hw_csum;
+
+	/* The hardware checksum is 2 bytes appended to packet data */
+	if (unlikely(skb->len < sizeof(__sum16)))
+		return;
+	hw_csum = skb_tail_pointer(skb) - sizeof(__sum16);
+	skb->csum = csum_unfold((__force __sum16)get_unaligned_le16(hw_csum));
+	skb->ip_summed = CHECKSUM_COMPLETE;
+	skb_trim(skb, skb->len - sizeof(__sum16));
+}
+
 /* Packet receive function */
 static int sh_eth_rx(struct net_device *ndev, u32 intr_status, int *quota)
 {
@@ -1666,6 +1680,8 @@ static int sh_eth_rx(struct net_device *
 					 DMA_FROM_DEVICE);
 			skb_put(skb, pkt_len);
 			skb->protocol = eth_type_trans(skb, ndev);
+			if (ndev->features & NETIF_F_RXCSUM)
+				sh_eth_rx_csum(skb);
 			netif_receive_skb(skb);
 			ndev->stats.rx_packets++;
 			ndev->stats.rx_bytes += pkt_len;
@@ -2921,6 +2937,39 @@ static void sh_eth_set_rx_mode(struct ne
 	spin_unlock_irqrestore(&mdp->lock, flags);
 }
 
+static void sh_eth_set_rx_csum(struct net_device *ndev, bool enable)
+{
+	struct sh_eth_private *mdp = netdev_priv(ndev);
+	unsigned long flags;
+
+	spin_lock_irqsave(&mdp->lock, flags);
+
+	/* Disable TX and RX */
+	sh_eth_rcv_snd_disable(ndev);
+
+	/* Modify RX Checksum setting */
+	sh_eth_modify(ndev, ECMR, ECMR_RCSC, enable ? ECMR_RCSC : 0);
+
+	/* Enable TX and RX */
+	sh_eth_rcv_snd_enable(ndev);
+
+	spin_unlock_irqrestore(&mdp->lock, flags);
+}
+
+static int sh_eth_set_features(struct net_device *ndev,
+			       netdev_features_t features)
+{
+	netdev_features_t changed = ndev->features ^ features;
+	struct sh_eth_private *mdp = netdev_priv(ndev);
+
+	if (changed & NETIF_F_RXCSUM && mdp->cd->rx_csum)
+		sh_eth_set_rx_csum(ndev, features & NETIF_F_RXCSUM);
+
+	ndev->features = features;
+
+	return 0;
+}
+
 static int sh_eth_get_vtag_index(struct sh_eth_private *mdp)
 {
 	if (!mdp->port)
@@ -3102,6 +3151,7 @@ static const struct net_device_ops sh_et
 	.ndo_change_mtu		= sh_eth_change_mtu,
 	.ndo_validate_addr	= eth_validate_addr,
 	.ndo_set_mac_address	= eth_mac_addr,
+	.ndo_set_features	= sh_eth_set_features,
 };
 
 static const struct net_device_ops sh_eth_netdev_ops_tsu = {
@@ -3117,6 +3167,7 @@ static const struct net_device_ops sh_et
 	.ndo_change_mtu		= sh_eth_change_mtu,
 	.ndo_validate_addr	= eth_validate_addr,
 	.ndo_set_mac_address	= eth_mac_addr,
+	.ndo_set_features	= sh_eth_set_features,
 };
 
 #ifdef CONFIG_OF
@@ -3245,6 +3296,11 @@ static int sh_eth_drv_probe(struct platf
 	ndev->max_mtu = 2000 - (ETH_HLEN + VLAN_HLEN + ETH_FCS_LEN);
 	ndev->min_mtu = ETH_MIN_MTU;
 
+	if (mdp->cd->rx_csum) {
+		ndev->features = NETIF_F_RXCSUM;
+		ndev->hw_features = NETIF_F_RXCSUM;
+	}
+
 	/* set function */
 	if (mdp->cd->tsu)
 		ndev->netdev_ops = &sh_eth_netdev_ops_tsu;
@@ -3294,7 +3350,7 @@ static int sh_eth_drv_probe(struct platf
 			goto out_release;
 		}
 		mdp->port = port;
-		ndev->features = NETIF_F_HW_VLAN_CTAG_FILTER;
+		ndev->features |= NETIF_F_HW_VLAN_CTAG_FILTER;
 
 		/* Need to init only the first port of the two sharing a TSU */
 		if (port == 0) {
Index: net-next/drivers/net/ethernet/renesas/sh_eth.h
===================================================================
--- net-next.orig/drivers/net/ethernet/renesas/sh_eth.h
+++ net-next/drivers/net/ethernet/renesas/sh_eth.h
@@ -500,6 +500,7 @@ struct sh_eth_cpu_data {
 	unsigned no_xdfar:1;	/* E-DMAC DOES NOT have RDFAR/TDFAR */
 	unsigned xdfar_rw:1;	/* E-DMAC has writeable RDFAR/TDFAR */
 	unsigned csmr:1;	/* E-DMAC has CSMR */
+	unsigned rx_csum:1;	/* EtherC has ECMR.RCSC */
 	unsigned select_mii:1;	/* EtherC has RMII_MII (MII select register) */
 	unsigned rmiimode:1;	/* EtherC has RMIIMODE register */
 	unsigned rtrate:1;	/* EtherC has RTRATE register */



* [PATCH 3/7] sh_eth: offload RX checksum on R7S72100
  2019-01-27 17:33 [PATCH 0/7] sh_eth: implement simple RX checksum offload Sergei Shtylyov
  2019-01-27 17:36 ` [PATCH 1/7] sh_eth: rename sh_eth_cpu_data::hw_checksum Sergei Shtylyov
  2019-01-27 17:37 ` [PATCH 2/7] sh_eth: RX checksum offload support Sergei Shtylyov
@ 2019-01-27 17:38 ` Sergei Shtylyov
  2019-01-28 12:20   ` Simon Horman
  2019-01-27 17:39 ` [PATCH 4/7] sh_eth: offload RX checksum on R8A7740 Sergei Shtylyov
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 30+ messages in thread
From: Sergei Shtylyov @ 2019-01-27 17:38 UTC (permalink / raw)
  To: netdev, David S. Miller; +Cc: linux-renesas-soc, linux-sh

The RZ/A1H (R7S721000) SoC manual describes the Ether MAC's RX checksum
offload the same way as it's implemented in the EtherAVB MACs...

Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>

---
 drivers/net/ethernet/renesas/sh_eth.c |    1 +
 1 file changed, 1 insertion(+)

Index: net-next/drivers/net/ethernet/renesas/sh_eth.c
===================================================================
--- net-next.orig/drivers/net/ethernet/renesas/sh_eth.c
+++ net-next/drivers/net/ethernet/renesas/sh_eth.c
@@ -620,6 +620,7 @@ static struct sh_eth_cpu_data r7s72100_d
 	.no_ade		= 1,
 	.xdfar_rw	= 1,
 	.csmr		= 1,
+	.rx_csum	= 1,
 	.tsu		= 1,
 	.no_tx_cntrs	= 1,
 };



* [PATCH 4/7] sh_eth: offload RX checksum on R8A7740
  2019-01-27 17:33 [PATCH 0/7] sh_eth: implement simple RX checksum offload Sergei Shtylyov
                   ` (2 preceding siblings ...)
  2019-01-27 17:38 ` [PATCH 3/7] sh_eth: offload RX checksum on R7S72100 Sergei Shtylyov
@ 2019-01-27 17:39 ` Sergei Shtylyov
  2019-01-29 18:20   ` Geert Uytterhoeven
  2019-01-27 17:40 ` [PATCH 5/7] sh_eth: offload RX checksum on R8A77980 Sergei Shtylyov
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 30+ messages in thread
From: Sergei Shtylyov @ 2019-01-27 17:39 UTC (permalink / raw)
  To: netdev, David S. Miller; +Cc: linux-renesas-soc, linux-sh

The R-Mobile A1 (R8A7740) SoC manual describes the Ether MAC's RX checksum
offload the same way as it's implemented in the EtherAVB MAC...

Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>

---
 drivers/net/ethernet/renesas/sh_eth.c |    1 +
 1 file changed, 1 insertion(+)

Index: net-next/drivers/net/ethernet/renesas/sh_eth.c
===================================================================
--- net-next.orig/drivers/net/ethernet/renesas/sh_eth.c
+++ net-next/drivers/net/ethernet/renesas/sh_eth.c
@@ -670,6 +670,7 @@ static struct sh_eth_cpu_data r8a7740_da
 	.no_ade		= 1,
 	.xdfar_rw	= 1,
 	.csmr		= 1,
+	.rx_csum	= 1,
 	.tsu		= 1,
 	.select_mii	= 1,
 	.magic		= 1,


* [PATCH 5/7] sh_eth: offload RX checksum on R8A77980
  2019-01-27 17:33 [PATCH 0/7] sh_eth: implement simple RX checksum offload Sergei Shtylyov
                   ` (3 preceding siblings ...)
  2019-01-27 17:39 ` [PATCH 4/7] sh_eth: offload RX checksum on R8A7740 Sergei Shtylyov
@ 2019-01-27 17:40 ` Sergei Shtylyov
  2019-01-27 17:41 ` [PATCH 6/7] sh_eth: offload RX checksum on SH7734 Sergei Shtylyov
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 30+ messages in thread
From: Sergei Shtylyov @ 2019-01-27 17:40 UTC (permalink / raw)
  To: netdev, David S. Miller; +Cc: linux-renesas-soc, linux-sh

The R-Car V3H (R8A77980) SoC manual describes the Ether MAC's RX checksum
offload the same way as it's implemented in the EtherAVB MAC...

Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>

---
 drivers/net/ethernet/renesas/sh_eth.c |    1 +
 1 file changed, 1 insertion(+)

Index: net-next/drivers/net/ethernet/renesas/sh_eth.c
===================================================================
--- net-next.orig/drivers/net/ethernet/renesas/sh_eth.c
+++ net-next/drivers/net/ethernet/renesas/sh_eth.c
@@ -796,6 +796,7 @@ static struct sh_eth_cpu_data r8a77980_d
 	.no_ade		= 1,
 	.xdfar_rw	= 1,
 	.csmr		= 1,
+	.rx_csum	= 1,
 	.select_mii	= 1,
 	.magic		= 1,
 	.cexcr		= 1,



* [PATCH 6/7] sh_eth: offload RX checksum on SH7734
  2019-01-27 17:33 [PATCH 0/7] sh_eth: implement simple RX checksum offload Sergei Shtylyov
                   ` (4 preceding siblings ...)
  2019-01-27 17:40 ` [PATCH 5/7] sh_eth: offload RX checksum on R8A77980 Sergei Shtylyov
@ 2019-01-27 17:41 ` Sergei Shtylyov
  2019-01-27 17:42 ` [PATCH 7/7] sh_eth: offload RX checksum on SH7763 Sergei Shtylyov
  2019-01-27 17:52 ` [PATCH 0/7] sh_eth: implement simple RX checksum offload Heiner Kallweit
  7 siblings, 0 replies; 30+ messages in thread
From: Sergei Shtylyov @ 2019-01-27 17:41 UTC (permalink / raw)
  To: netdev, David S. Miller; +Cc: linux-renesas-soc, linux-sh

The SH7734 SoC manual describes the Ether MAC's RX checksum offload
the same way as it's implemented in the EtherAVB MACs...

Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>

---
 drivers/net/ethernet/renesas/sh_eth.c |    1 +
 1 file changed, 1 insertion(+)

Index: net-next/drivers/net/ethernet/renesas/sh_eth.c
===================================================================
--- net-next.orig/drivers/net/ethernet/renesas/sh_eth.c
+++ net-next/drivers/net/ethernet/renesas/sh_eth.c
@@ -1049,6 +1049,7 @@ static struct sh_eth_cpu_data sh7734_dat
 	.xdfar_rw	= 1,
 	.tsu		= 1,
 	.csmr		= 1,
+	.rx_csum	= 1,
 	.select_mii	= 1,
 	.magic		= 1,
 	.cexcr		= 1,





* [PATCH 7/7] sh_eth: offload RX checksum on SH7763
  2019-01-27 17:33 [PATCH 0/7] sh_eth: implement simple RX checksum offload Sergei Shtylyov
                   ` (5 preceding siblings ...)
  2019-01-27 17:41 ` [PATCH 6/7] sh_eth: offload RX checksum on SH7734 Sergei Shtylyov
@ 2019-01-27 17:42 ` Sergei Shtylyov
  2019-02-04 11:55   ` Rob Landley
  2019-01-27 17:52 ` [PATCH 0/7] sh_eth: implement simple RX checksum offload Heiner Kallweit
  7 siblings, 1 reply; 30+ messages in thread
From: Sergei Shtylyov @ 2019-01-27 17:42 UTC (permalink / raw)
  To: netdev, David S. Miller; +Cc: linux-renesas-soc, linux-sh

The SH7763 SoC manual describes the Ether MAC's RX checksum offload
the same way as it's implemented in the EtherAVB MACs...

Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>

---
 drivers/net/ethernet/renesas/sh_eth.c |    1 +
 1 file changed, 1 insertion(+)

Index: net-next/drivers/net/ethernet/renesas/sh_eth.c
===================================================================
--- net-next.orig/drivers/net/ethernet/renesas/sh_eth.c
+++ net-next/drivers/net/ethernet/renesas/sh_eth.c
@@ -1092,6 +1092,7 @@ static struct sh_eth_cpu_data sh7763_dat
 	.irq_flags	= IRQF_SHARED,
 	.magic		= 1,
 	.cexcr		= 1,
+	.rx_csum	= 1,
 	.dual_port	= 1,
 };
 


* Re: [PATCH 0/7] sh_eth: implement simple RX checksum offload
  2019-01-27 17:33 [PATCH 0/7] sh_eth: implement simple RX checksum offload Sergei Shtylyov
                   ` (6 preceding siblings ...)
  2019-01-27 17:42 ` [PATCH 7/7] sh_eth: offload RX checksum on SH7763 Sergei Shtylyov
@ 2019-01-27 17:52 ` Heiner Kallweit
  2019-01-29 11:06   ` Sergei Shtylyov
  7 siblings, 1 reply; 30+ messages in thread
From: Heiner Kallweit @ 2019-01-27 17:52 UTC (permalink / raw)
  To: Sergei Shtylyov, netdev, David S. Miller; +Cc: linux-renesas-soc, linux-sh

On 27.01.2019 18:33, Sergei Shtylyov wrote:
> Hello!
> 
> Here's a set of 7 patches against DaveM's 'net-next.git' repo. I'm implementing
> the simple RX checksum offload (like was done for the 'ravb' driver by Simon
> Horman); it was only tested on the R8A77980 SoC, the other SoCs should just
> work (according to their manuals)...
> 
> [1/7] sh_eth: rename sh_eth_cpu_data::hw_checksum
> [2/7] sh_eth: RX checksum offload support
> [3/7] sh_eth: offload RX checksum on R7S72100
> [4/7] sh_eth: offload RX checksum on R8A7740
> [5/7] sh_eth: offload RX checksum on R8A77980
> [6/7] sh_eth: offload RX checksum on SH7734
> [7/7] sh_eth: offload RX checksum on SH7763
> 
> MBR, Sergei
> 
Hi Sergei,

the formatting of the patch series isn't in line with the netdev standards.
See here: https://www.kernel.org/doc/html/latest/networking/netdev-FAQ.html

- cover letter isn't generated by git
 
- That's not ok
--- *net-next.orig/*drivers/net/ethernet/renesas/sh_eth.h
+++ *net-next/*drivers/net/ethernet/renesas/sh_eth.h

- patches miss net / net-next annotation

Heiner


* Re: [PATCH 1/7] sh_eth: rename sh_eth_cpu_data::hw_checksum
  2019-01-27 17:36 ` [PATCH 1/7] sh_eth: rename sh_eth_cpu_data::hw_checksum Sergei Shtylyov
@ 2019-01-28  9:21   ` Geert Uytterhoeven
  2019-01-28 11:08     ` Sergei Shtylyov
  0 siblings, 1 reply; 30+ messages in thread
From: Geert Uytterhoeven @ 2019-01-28  9:21 UTC (permalink / raw)
  To: Sergei Shtylyov; +Cc: netdev, David S. Miller, Linux-Renesas, Linux-sh list

Hi Sergei,

Thanks for your patch!

On Sun, Jan 27, 2019 at 6:40 PM Sergei Shtylyov
<sergei.shtylyov@cogentembedded.com> wrote:
> Commit 62e04b7e0e3c ("sh_eth: rename 'sh_eth_cpu_data::hw_crc'") renamed
> the field to 'hw_checksum' for the Ether DMAC "intelligent checksum",
> however some Ether MACs implement a simpler checksumming scheme, so that
> name now seems misleading. Rename that field to 'csmr' as the "intelligent
> checkmum" is always controlled by the CSMR register.

checksum

> Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>

Apart from that:
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>

> --- net-next.orig/drivers/net/ethernet/renesas/sh_eth.c
> +++ net-next/drivers/net/ethernet/renesas/sh_eth.c

> @@ -793,7 +793,7 @@ static struct sh_eth_cpu_data r8a77980_d
>         .no_trimd       = 1,
>         .no_ade         = 1,
>         .xdfar_rw       = 1,
> -       .hw_checksum    = 1,
> +       .csmr           = 1,

Interestingly, I cannot find the CSMR register in the R-Car Gen3 docs?
Not introduced by this patch, though.

Gr{oetje,eeting}s,

                        Geert


--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds


* Re: [PATCH 1/7] sh_eth: rename sh_eth_cpu_data::hw_checksum
  2019-01-28  9:21   ` Geert Uytterhoeven
@ 2019-01-28 11:08     ` Sergei Shtylyov
  2019-01-28 19:15       ` David Miller
  0 siblings, 1 reply; 30+ messages in thread
From: Sergei Shtylyov @ 2019-01-28 11:08 UTC (permalink / raw)
  To: Geert Uytterhoeven; +Cc: netdev, David S. Miller, Linux-Renesas, Linux-sh list

On 01/28/2019 12:21 PM, Geert Uytterhoeven wrote:

> On Sun, Jan 27, 2019 at 6:40 PM Sergei Shtylyov
> <sergei.shtylyov@cogentembedded.com> wrote:
>> Commit 62e04b7e0e3c ("sh_eth: rename 'sh_eth_cpu_data::hw_crc'") renamed
>> the field to 'hw_checksum' for the Ether DMAC "intelligent checksum",
>> however some Ether MACs implement a simpler checksumming scheme, so that
>> name now seems misleading. Rename that field to 'csmr' as the "intelligent
>> checkmum" is always controlled by the CSMR register.
> 
> checksum

   Oops! Do I need to repost?

> 
>> Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
> 
> Apart from that:
> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
> 
>> --- net-next.orig/drivers/net/ethernet/renesas/sh_eth.c
>> +++ net-next/drivers/net/ethernet/renesas/sh_eth.c
> 
>> @@ -793,7 +793,7 @@ static struct sh_eth_cpu_data r8a77980_d
>>         .no_trimd       = 1,
>>         .no_ade         = 1,
>>         .xdfar_rw       = 1,
>> -       .hw_checksum    = 1,
>> +       .csmr           = 1,
> 
> Interestingly, I cannot find the CSMR register in the R-Car Gen3 docs?

   Me neither... But if you remove that flag, the driver stops working due to
not doing >>= 16 in sh_eth_rx() anymore. Go figure... :-)
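
   For illustration only, the shift in question can be modelled as below.
This is a userspace sketch, not the driver code: rx_desc_status_normalize()
and its 'has_csmr' parameter are hypothetical names standing in for the
'mdp->cd->csmr' test in sh_eth_rx(), where the RFS bits sit in bits 25..16
of the descriptor status and have to be moved down before they are checked.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * When the 'csmr' flag is set, the receive frame status (RFS) bits occupy
 * bits 25..16 of the descriptor status word, so shift them down to bit 0;
 * otherwise return the status word unchanged.
 */
static uint32_t rx_desc_status_normalize(uint32_t desc_status, bool has_csmr)
{
	return has_csmr ? desc_status >> 16 : desc_status;
}
```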

> Not introduced by this patch, though.

   Yep.

> Gr{oetje,eeting}s,
> 
>                         Geert

MBR, Sergei



* Re: [PATCH 2/7] sh_eth: RX checksum offload support
  2019-01-27 17:37 ` [PATCH 2/7] sh_eth: RX checksum offload support Sergei Shtylyov
@ 2019-01-28 12:18   ` Simon Horman
  2019-01-28 15:45     ` Sergei Shtylyov
  0 siblings, 1 reply; 30+ messages in thread
From: Simon Horman @ 2019-01-28 12:18 UTC (permalink / raw)
  To: Sergei Shtylyov; +Cc: netdev, David S. Miller, linux-renesas-soc, linux-sh

Hi Sergei,

On Sun, Jan 27, 2019 at 08:37:33PM +0300, Sergei Shtylyov wrote:
> Add support for the RX checksum offload. This is enabled by default and
> may be disabled and re-enabled using 'ethtool':
> 
> # ethtool -K eth0 rx {on|off}
> 
> Some Ether MACs provide a simple checksumming scheme which appears to be
> completely compatible with CHECKSUM_COMPLETE: sum of all packet data after
> the L2 header is appended to packet data; this may be trivially read by
> the driver and used to update the skb accordingly. The same checksumming
> scheme is implemented in the EtherAVB MACs and now supported by the 'ravb'
> driver.
> 
> In terms of performance, throughput is close to gigabit line rate with the
> RX checksum offload both enabled and disabled.  The 'perf' output, however,
> appears to indicate that significantly less time is spent in do_csum() --
> this is as expected.

Nice.

FYI, this seems similar to what I observed for RAVB, perhaps on H3; I don't
exactly recall. On E3, which has less CPU power, I recently observed that
with rx-csum enabled I can achieve gigabit line rate, but with rx-csum
disabled throughput is significantly lower. I.e. on that system throughput
is CPU bound with 1500 byte packets unless rx-csum is enabled.


Next point:

2da64300fbc ("ravb: expand rx descriptor data to accommodate hw checksum")
is fresh in my mind and I wonder if mdp->rx_buf_sz needs to grow to ensure
that there is always enough space for the csum. In particular, have you
tested this with MTU-size frames with VLANs? (My test is to run iperf3 over
a VLAN netdev, netperf over a VLAN netdev would likely work just as well.)
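
The buffer-size concern can be sketched as simple arithmetic. The helper
names below are hypothetical, and whether the FCS actually lands in the RX
buffer on this hardware is an assumption made only for the example; the
header, VLAN tag and FCS lengths are the usual ETH_HLEN, VLAN_HLEN and
ETH_FCS_LEN values.

```c
#include <stdbool.h>
#include <stddef.h>

/* Same values as ETH_HLEN, VLAN_HLEN and ETH_FCS_LEN in the Linux headers. */
#define SKETCH_ETH_HLEN		14
#define SKETCH_VLAN_HLEN	4
#define SKETCH_ETH_FCS_LEN	4
/* The 2 checksum bytes the MAC appends to packet data. */
#define SKETCH_HW_CSUM_LEN	2

/* Bytes a full-size frame may occupy in the RX buffer (assumes FCS is stored). */
static size_t rx_bytes_needed(size_t mtu, bool vlan, bool rx_csum)
{
	size_t len = mtu + SKETCH_ETH_HLEN + SKETCH_ETH_FCS_LEN;

	if (vlan)
		len += SKETCH_VLAN_HLEN;
	if (rx_csum)
		len += SKETCH_HW_CSUM_LEN;
	return len;
}

/* True if a buffer of rx_buf_sz bytes can hold such a frame. */
static bool rx_buf_fits(size_t rx_buf_sz, size_t mtu, bool vlan, bool rx_csum)
{
	return rx_bytes_needed(mtu, vlan, rx_csum) <= rx_buf_sz;
}
```

Under these assumptions a 1500 byte MTU with a VLAN tag needs 1522 bytes
without the checksum and 1524 bytes with it, so a buffer sized exactly for
the former would be 2 bytes short.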

> 
> Test results with RX checksum offload enabled:
> 
> ~/netperf-2.2pl4# perf record -a ./netperf -t TCP_MAERTS -H 192.168.2.4
> TCP MAERTS TEST to 192.168.2.4
> Recv   Send    Send
> Socket Socket  Message  Elapsed
> Size   Size    Size     Time     Throughput
> bytes  bytes   bytes    secs.    10^6bits/sec
> 
> 131072  16384  16384    10.01     933.93
> [ perf record: Woken up 8 times to write data ]
> [ perf record: Captured and wrote 1.955 MB perf.data (41940 samples) ]
> ~/netperf-2.2pl4# perf report
> Samples: 41K of event 'cycles:ppp', Event count (approx.): 9915302763
> Overhead  Command          Shared Object             Symbol
>    9.44%  netperf          [kernel.kallsyms]         [k] __arch_copy_to_user
>    7.75%  swapper          [kernel.kallsyms]         [k] _raw_spin_unlock_irq
>    6.31%  swapper          [kernel.kallsyms]         [k] default_idle_call
>    5.89%  swapper          [kernel.kallsyms]         [k] arch_cpu_idle
>    4.37%  swapper          [kernel.kallsyms]         [k] tick_nohz_idle_exit
>    4.02%  netperf          [kernel.kallsyms]         [k] _raw_spin_unlock_irq
>    2.52%  netperf          [kernel.kallsyms]         [k] preempt_count_sub
>    1.81%  netperf          [kernel.kallsyms]         [k] tcp_recvmsg
>    1.80%  netperf          [kernel.kallsyms]         [k] _raw_spin_unlock_irqres
>    1.78%  netperf          [kernel.kallsyms]         [k] preempt_count_add
>    1.36%  netperf          [kernel.kallsyms]         [k] __tcp_transmit_skb
>    1.20%  netperf          [kernel.kallsyms]         [k] __local_bh_enable_ip
>    1.10%  netperf          [kernel.kallsyms]         [k] sh_eth_start_xmit
> 
> Test results with RX checksum offload disabled:
> 
> ~/netperf-2.2pl4# perf record -a ./netperf -t TCP_MAERTS -H 192.168.2.4
> TCP MAERTS TEST to 192.168.2.4
> Recv   Send    Send
> Socket Socket  Message  Elapsed
> Size   Size    Size     Time     Throughput
> bytes  bytes   bytes    secs.    10^6bits/sec
> 131072  16384  16384    10.01     932.04
> [ perf record: Woken up 14 times to write data ]
> [ perf record: Captured and wrote 3.642 MB perf.data (78817 samples) ]
> ~/netperf-2.2pl4# perf report
> Samples: 78K of event 'cycles:ppp', Event count (approx.): 18091442796          
> Overhead  Command          Shared Object       Symbol                           
>    7.00%  swapper          [kernel.kallsyms]   [k] do_csum                      
>    3.94%  swapper          [kernel.kallsyms]   [k] sh_eth_poll                  
>    3.83%  ksoftirqd/0      [kernel.kallsyms]   [k] do_csum                      
>    3.23%  swapper          [kernel.kallsyms]   [k] _raw_spin_unlock_irq         
>    2.87%  netperf          [kernel.kallsyms]   [k] __arch_copy_to_user          
>    2.86%  swapper          [kernel.kallsyms]   [k] arch_cpu_idle                
>    2.13%  swapper          [kernel.kallsyms]   [k] default_idle_call            
>    2.12%  ksoftirqd/0      [kernel.kallsyms]   [k] sh_eth_poll                  
>    2.02%  swapper          [kernel.kallsyms]   [k] _raw_spin_unlock_irqrestore  
>    1.84%  swapper          [kernel.kallsyms]   [k] __softirqentry_text_start    
>    1.64%  swapper          [kernel.kallsyms]   [k] tick_nohz_idle_exit          
>    1.53%  netperf          [kernel.kallsyms]   [k] _raw_spin_unlock_irq         
>    1.32%  netperf          [kernel.kallsyms]   [k] preempt_count_sub            
>    1.27%  swapper          [kernel.kallsyms]   [k] __pi___inval_dcache_area     
>    1.22%  swapper          [kernel.kallsyms]   [k] check_preemption_disabled    
>    1.01%  ksoftirqd/0      [kernel.kallsyms]   [k] _raw_spin_unlock_irqrestore  
> 
> The above results collected on the R-Car V3H Starter Kit board.
> 
> Based on the commit 4d86d3818627 ("ravb: RX checksum offload")...
> 
> Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
> 
> ---
>  drivers/net/ethernet/renesas/sh_eth.c |   60 ++++++++++++++++++++++++++++++++--
>  drivers/net/ethernet/renesas/sh_eth.h |    1 
>  2 files changed, 59 insertions(+), 2 deletions(-)
> 
> Index: net-next/drivers/net/ethernet/renesas/sh_eth.c
> ===================================================================
> --- net-next.orig/drivers/net/ethernet/renesas/sh_eth.c
> +++ net-next/drivers/net/ethernet/renesas/sh_eth.c
> @@ -1532,8 +1532,9 @@ static int sh_eth_dev_init(struct net_de
>  	mdp->irq_enabled = true;
>  	sh_eth_write(ndev, mdp->cd->eesipr_value, EESIPR);
>  
> -	/* PAUSE Prohibition */
> +	/* EMAC Mode: PAUSE prohibition; Duplex; RX Checksum; TX; RX */
>  	sh_eth_write(ndev, ECMR_ZPF | (mdp->duplex ? ECMR_DM : 0) |
> +		     (ndev->features & NETIF_F_RXCSUM ? ECMR_RCSC : 0) |
>  		     ECMR_TE | ECMR_RE, ECMR);
>  
>  	if (mdp->cd->set_rate)
> @@ -1592,6 +1593,19 @@ static void sh_eth_dev_exit(struct net_d
>  	update_mac_address(ndev);
>  }
>  
> +static void sh_eth_rx_csum(struct sk_buff *skb)
> +{
> +	u8 *hw_csum;
> +
> +	/* The hardware checksum is 2 bytes appended to packet data */
> +	if (unlikely(skb->len < sizeof(__sum16)))
> +		return;
> +	hw_csum = skb_tail_pointer(skb) - sizeof(__sum16);
> +	skb->csum = csum_unfold((__force __sum16)get_unaligned_le16(hw_csum));
> +	skb->ip_summed = CHECKSUM_COMPLETE;
> +	skb_trim(skb, skb->len - sizeof(__sum16));
> +}
> +
>  /* Packet receive function */
>  static int sh_eth_rx(struct net_device *ndev, u32 intr_status, int *quota)
>  {
> @@ -1666,6 +1680,8 @@ static int sh_eth_rx(struct net_device *
>  					 DMA_FROM_DEVICE);
>  			skb_put(skb, pkt_len);
>  			skb->protocol = eth_type_trans(skb, ndev);
> +			if (ndev->features & NETIF_F_RXCSUM)
> +				sh_eth_rx_csum(skb);
>  			netif_receive_skb(skb);
>  			ndev->stats.rx_packets++;
>  			ndev->stats.rx_bytes += pkt_len;
> @@ -2921,6 +2937,39 @@ static void sh_eth_set_rx_mode(struct ne
>  	spin_unlock_irqrestore(&mdp->lock, flags);
>  }
>  
> +static void sh_eth_set_rx_csum(struct net_device *ndev, bool enable)
> +{
> +	struct sh_eth_private *mdp = netdev_priv(ndev);
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&mdp->lock, flags);
> +
> +	/* Disable TX and RX */
> +	sh_eth_rcv_snd_disable(ndev);
> +
> +	/* Modify RX Checksum setting */
> +	sh_eth_modify(ndev, ECMR, ECMR_RCSC, enable ? ECMR_RCSC : 0);
> +
> +	/* Enable TX and RX */
> +	sh_eth_rcv_snd_enable(ndev);
> +
> +	spin_unlock_irqrestore(&mdp->lock, flags);
> +}
> +
> +static int sh_eth_set_features(struct net_device *ndev,
> +			       netdev_features_t features)
> +{
> +	netdev_features_t changed = ndev->features ^ features;
> +	struct sh_eth_private *mdp = netdev_priv(ndev);
> +
> +	if (changed & NETIF_F_RXCSUM && mdp->cd->rx_csum)
> +		sh_eth_set_rx_csum(ndev, features & NETIF_F_RXCSUM);
> +
> +	ndev->features = features;
> +
> +	return 0;
> +}
> +
>  static int sh_eth_get_vtag_index(struct sh_eth_private *mdp)
>  {
>  	if (!mdp->port)
> @@ -3102,6 +3151,7 @@ static const struct net_device_ops sh_et
>  	.ndo_change_mtu		= sh_eth_change_mtu,
>  	.ndo_validate_addr	= eth_validate_addr,
>  	.ndo_set_mac_address	= eth_mac_addr,
> +	.ndo_set_features	= sh_eth_set_features,
>  };
>  
>  static const struct net_device_ops sh_eth_netdev_ops_tsu = {
> @@ -3117,6 +3167,7 @@ static const struct net_device_ops sh_et
>  	.ndo_change_mtu		= sh_eth_change_mtu,
>  	.ndo_validate_addr	= eth_validate_addr,
>  	.ndo_set_mac_address	= eth_mac_addr,
> +	.ndo_set_features	= sh_eth_set_features,
>  };
>  
>  #ifdef CONFIG_OF
> @@ -3245,6 +3296,11 @@ static int sh_eth_drv_probe(struct platf
>  	ndev->max_mtu = 2000 - (ETH_HLEN + VLAN_HLEN + ETH_FCS_LEN);
>  	ndev->min_mtu = ETH_MIN_MTU;
>  
> +	if (mdp->cd->rx_csum) {
> +		ndev->features = NETIF_F_RXCSUM;
> +		ndev->hw_features = NETIF_F_RXCSUM;
> +	}
> +
>  	/* set function */
>  	if (mdp->cd->tsu)
>  		ndev->netdev_ops = &sh_eth_netdev_ops_tsu;
> @@ -3294,7 +3350,7 @@ static int sh_eth_drv_probe(struct platf
>  			goto out_release;
>  		}
>  		mdp->port = port;
> -		ndev->features = NETIF_F_HW_VLAN_CTAG_FILTER;
> +		ndev->features |= NETIF_F_HW_VLAN_CTAG_FILTER;
>  
>  		/* Need to init only the first port of the two sharing a TSU */
>  		if (port == 0) {
> Index: net-next/drivers/net/ethernet/renesas/sh_eth.h
> ===================================================================
> --- net-next.orig/drivers/net/ethernet/renesas/sh_eth.h
> +++ net-next/drivers/net/ethernet/renesas/sh_eth.h
> @@ -500,6 +500,7 @@ struct sh_eth_cpu_data {
>  	unsigned no_xdfar:1;	/* E-DMAC DOES NOT have RDFAR/TDFAR */
>  	unsigned xdfar_rw:1;	/* E-DMAC has writeable RDFAR/TDFAR */
>  	unsigned csmr:1;	/* E-DMAC has CSMR */
> +	unsigned rx_csum:1;	/* EtherC has ECMR.RCSC */
>  	unsigned select_mii:1;	/* EtherC has RMII_MII (MII select register) */
>  	unsigned rmiimode:1;	/* EtherC has RMIIMODE register */
>  	unsigned rtrate:1;	/* EtherC has RTRATE register */
> 
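
For readers following the sh_eth_set_features() hunk above, here is a minimal userspace sketch of the same toggle logic. It assumes sh_eth_modify() is a plain clear-then-set read-modify-write; the demo_* names and bit values are hypothetical stand-ins and are not the kernel's NETIF_F_RXCSUM or ECMR_RCSC definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real feature and register bit values
 * live in the kernel headers and are not reproduced here.
 */
#define DEMO_F_RXCSUM	(1u << 0)
#define DEMO_F_VLAN	(1u << 1)
#define DEMO_ECMR_RCSC	(1u << 23)

struct demo_dev {
	uint32_t features;	/* currently active feature bits */
	uint32_t ecmr;		/* models the ECMR register contents */
	bool has_rx_csum;	/* models sh_eth_cpu_data::rx_csum */
};

/* Read-modify-write: clear the bits in 'clear', then set those in 'set'. */
static void demo_modify(uint32_t *reg, uint32_t clear, uint32_t set)
{
	*reg = (*reg & ~clear) | set;
}

/* Only touch the register when the RX checksum bit actually changed. */
static int demo_set_features(struct demo_dev *dev, uint32_t features)
{
	uint32_t changed = dev->features ^ features;

	if ((changed & DEMO_F_RXCSUM) && dev->has_rx_csum)
		demo_modify(&dev->ecmr, DEMO_ECMR_RCSC,
			    (features & DEMO_F_RXCSUM) ? DEMO_ECMR_RCSC : 0);

	dev->features = features;
	return 0;
}
```

The XOR leaves a bit set only where the old and new feature masks differ, so toggling an unrelated feature does not rewrite the register. The sketch leaves out the locking and the TX/RX disable/enable bracket that the patch puts around the register write.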

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 3/7] sh_eth: offload RX checksum on R7S72100
  2019-01-27 17:38 ` [PATCH 3/7] sh_eth: offload RX checksum on R7S72100 Sergei Shtylyov
@ 2019-01-28 12:20   ` Simon Horman
  2019-01-28 15:21     ` Sergei Shtylyov
  0 siblings, 1 reply; 30+ messages in thread
From: Simon Horman @ 2019-01-28 12:20 UTC (permalink / raw)
  To: Sergei Shtylyov; +Cc: netdev, David S. Miller, linux-renesas-soc, linux-sh

On Sun, Jan 27, 2019 at 08:38:33PM +0300, Sergei Shtylyov wrote:
> The RZ/A1H (R7S721000) SoC manual describes the Ether MAC's RX checksum
> offload the same way as it's implemented in the EtherAVB MACs...
> 
> Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>

Regarding this and the remaining patches in this series,
which add rx-csum offload support in the sh_eth driver for
various SoCs: has this been tested?

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 3/7] sh_eth: offload RX checksum on R7S72100
  2019-01-28 12:20   ` Simon Horman
@ 2019-01-28 15:21     ` Sergei Shtylyov
  2019-01-29  8:00       ` Simon Horman
  0 siblings, 1 reply; 30+ messages in thread
From: Sergei Shtylyov @ 2019-01-28 15:21 UTC (permalink / raw)
  To: Simon Horman; +Cc: netdev, David S. Miller, linux-renesas-soc, linux-sh

On 01/28/2019 03:20 PM, Simon Horman wrote:

>> The RZ/A1H (R7S721000) SoC manual describes the Ether MAC's RX checksum
>> offload the same way as it's implemented in the EtherAVB MACs...
>>
>> Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
> 
> Regarding this and the remaining patches in this series,
> which add rx-csum offload support in the sh_eth driver for
> various SoCs: has this been tested?

   As I said, I've only tested it on R8A77980.

MBR, Sergei

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 2/7] sh_eth: RX checksum offload support
  2019-01-28 12:18   ` Simon Horman
@ 2019-01-28 15:45     ` Sergei Shtylyov
  2019-01-29  7:58       ` Simon Horman
  0 siblings, 1 reply; 30+ messages in thread
From: Sergei Shtylyov @ 2019-01-28 15:45 UTC (permalink / raw)
  To: Simon Horman; +Cc: netdev, David S. Miller, linux-renesas-soc, linux-sh

Hello!

On 01/28/2019 03:18 PM, Simon Horman wrote:

>> Add support for the RX checksum offload. This is enabled by default and
>> may be disabled and re-enabled using 'ethtool':
>>
>> # ethtool -K eth0 rx {on|off}
>>
>> Some Ether MACs provide a simple checksumming scheme which appears to be
>> completely compatible with CHECKSUM_COMPLETE: sum of all packet data after
>> the L2 header is appended to packet data; this may be trivially read by
>> the driver and used to update the skb accordingly. The same checksumming
>> scheme is implemented in the EtherAVB MACs and now supported by the 'ravb'
>> driver.
>>
>> In terms of performance, throughput is close to gigabit line rate with the
>> RX checksum offload both enabled and disabled.  The 'perf' output, however,
>> appears to indicate that significantly less time is spent in do_csum() --
>> this is as expected.
> 
> Nice.
> 
> FYI, this seems similar to what I observed for RAVB, perhaps on H3 I don't
> exactly recall. On E3, which has less CPU power, I recently observed that
> with rx-csum enabled I can achieve gigabit line rate, but with rx-csum
> disabled throughput is significantly lower. I.e. on that system throughput
> is CPU bound with 1500 byte packets unless rx-csum enabled.

   Unfortunately, we can't test these patches on the other gen3 boards. ISTR
you have RZ/A1H board... if it's still with you, I'd appreciate testing.

> Next point:
> 
> 2da64300fbc ("ravb: expand rx descriptor data to accommodate hw checksum")
> is fresh in my mind and I wonder if mdp->rx_buf_sz needs to grow to ensure
> that there is always enough space for the csum.

   Well, if you look at sh_eth_ring_init(), you'll see that the driver reserves
plenty of space at the end of the RX buffers.

> In particular, have you
> tested this with MTU-size frames with VLANs. (My test is to run iperf3 over
> a VLAN netdev, netperf over a VLAN netdev would likely work just as well.)

   Could you refresh me on how to bring up a VLAN on a given interface?

[...]
>> The above results collected on the R-Car V3H Starter Kit board.
>>
>> Based on the commit 4d86d3818627 ("ravb: RX checksum offload")...
>>
>> Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
[...]

MBR, Sergei

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/7] sh_eth: rename sh_eth_cpu_data::hw_checksum
  2019-01-28 11:08     ` Sergei Shtylyov
@ 2019-01-28 19:15       ` David Miller
  0 siblings, 0 replies; 30+ messages in thread
From: David Miller @ 2019-01-28 19:15 UTC (permalink / raw)
  To: sergei.shtylyov; +Cc: geert, netdev, linux-renesas-soc, linux-sh

From: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Date: Mon, 28 Jan 2019 14:08:48 +0300

> On 01/28/2019 12:21 PM, Geert Uytterhoeven wrote:
> 
>> On Sun, Jan 27, 2019 at 6:40 PM Sergei Shtylyov
>> <sergei.shtylyov@cogentembedded.com> wrote:
>>> Commit 62e04b7e0e3c ("sh_eth: rename 'sh_eth_cpu_data::hw_crc'") renamed
>>> the field to 'hw_checksum' for the Ether DMAC "intelligent checksum",
>>> however some Ether MACs implement a simpler checksumming scheme, so that
>>> name now seems misleading. Rename that filed to 'csmr' as the "intelligent
>>> checkmum" is always controlled by the CSMR register.
>> 
>> checksum
> 
>    Oops! Do I need to repost?

Please repost the series, thank you.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 2/7] sh_eth: RX checksum offload support
  2019-01-28 15:45     ` Sergei Shtylyov
@ 2019-01-29  7:58       ` Simon Horman
  2019-01-29 15:43         ` Sergei Shtylyov
  0 siblings, 1 reply; 30+ messages in thread
From: Simon Horman @ 2019-01-29  7:58 UTC (permalink / raw)
  To: Sergei Shtylyov; +Cc: netdev, David S. Miller, linux-renesas-soc, linux-sh

Hi Sergei,

On Mon, Jan 28, 2019 at 06:45:26PM +0300, Sergei Shtylyov wrote:
> Hello!
> 
> On 01/28/2019 03:18 PM, Simon Horman wrote:
> 
> >> Add support for the RX checksum offload. This is enabled by default and
> >> may be disabled and re-enabled using 'ethtool':
> >>
> >> # ethtool -K eth0 rx {on|off}
> >>
> >> Some Ether MACs provide a simple checksumming scheme which appears to be
> >> completely compatible with CHECKSUM_COMPLETE: sum of all packet data after
> >> the L2 header is appended to packet data; this may be trivially read by
> >> the driver and used to update the skb accordingly. The same checksumming
> >> scheme is implemented in the EtherAVB MACs and now supported by the 'ravb'
> >> driver.
> >>
> >> In terms of performance, throughput is close to gigabit line rate with the
> >> RX checksum offload both enabled and disabled.  The 'perf' output, however,
> >> appears to indicate that significantly less time is spent in do_csum() --
> >> this is as expected.
> > 
> > Nice.
> > 
> > FYI, this seems similar to what I observed for RAVB, perhaps on H3 I don't
> > exactly recall. On E3, which has less CPU power, I recently observed that
> > with rx-csum enabled I can achieve gigabit line rate, but with rx-csum
> > disabled throughput is significantly lower. I.e. on that system throughput
> > is CPU bound with 1500 byte packets unless rx-csum enabled.
> 
>    Unfortunately, we can't test these patches on the other gen3 boards. ISTR
> you have RZ/A1H board... if it's still with you, I'd appreciate testing.

Unfortunately, as of a few weeks ago, I no longer have that board.

> > Next point:
> > 
> > 2da64300fbc ("ravb: expand rx descriptor data to accommodate hw checksum")
> > is fresh in my mind and I wonder if mdp->rx_buf_sz needs to grow to ensure
> > that there is always enough space for the csum.
> 
>    Well, if you look at sh_eth_ring_init(), you'll see that the driver reserves
> plenty of space at the end of the RX buffers.

Yes, I see that. And I assume that was enough space before this patch.
But is it still enough space now that 2 bytes are needed for the hardware csum?
2 bytes that might have previously been used as packet data in some
circumstances.
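
For reference, the trimming that sh_eth_rx_csum() does in patch 2/7 can be modelled in userspace roughly as follows. This is only a sketch under the assumption that the received length already counts the 2 checksum bytes (as the skb_put()/skb_trim() sequence in the patch implies); the demo_* names are made up for illustration and nothing here touches a real skb.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian 16-bit value from a possibly unaligned pointer. */
static uint16_t demo_get_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

/* Take the trailing 2 bytes of the received data as the hardware
 * checksum. Returns the length without those bytes, or 'len' unchanged
 * (and *csum untouched) if the buffer is too short to hold a checksum.
 */
static size_t demo_strip_hw_csum(const uint8_t *buf, size_t len,
				 uint16_t *csum)
{
	if (len < sizeof(uint16_t))
		return len;

	*csum = demo_get_le16(buf + len - sizeof(uint16_t));
	return len - sizeof(uint16_t);
}
```

In this model the last 2 bytes of whatever was received are never passed on as payload once the offload is on, which is why the buffer has to have room for them in addition to the largest frame.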

> > In particular, have you
> > tested this with MTU-size frames with VLANs. (My test is to run iperf3 over
> > a VLAN netdev, netperf over a VLAN netdev would likely work just as well.)
> 
>    Could you refresh me on how to bring up a VLAN on a given interface?

You will need a kernel with CONFIG_VLAN_8021Q enabled.

Then you can do something like this:

	ip link add link eth0 name eth0.1 type vlan id 1
	ip addr add 10.1.1.100/24 dev eth0.1
	ip link set dev eth0.1 up


> [...]
> >> The above results collected on the R-Car V3H Starter Kit board.
> >>
> >> Based on the commit 4d86d3818627 ("ravb: RX checksum offload")...
> >>
> >> Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
> [...]
> 
> MBR, Sergei
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 3/7] sh_eth: offload RX checksum on R7S72100
  2019-01-28 15:21     ` Sergei Shtylyov
@ 2019-01-29  8:00       ` Simon Horman
  2019-01-29 10:37         ` Sergei Shtylyov
  0 siblings, 1 reply; 30+ messages in thread
From: Simon Horman @ 2019-01-29  8:00 UTC (permalink / raw)
  To: Sergei Shtylyov; +Cc: netdev, David S. Miller, linux-renesas-soc, linux-sh

On Mon, Jan 28, 2019 at 06:21:11PM +0300, Sergei Shtylyov wrote:
> On 01/28/2019 03:20 PM, Simon Horman wrote:
> 
> >> The RZ/A1H (R7S721000) SoC manual describes the Ether MAC's RX checksum
> >> offload the same way as it's implemented in the EtherAVB MACs...
> >>
> >> Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
> > 
> > Regarding this and the remaining patches in this series,
> > which add rx-csum offload support in the sh_eth driver for
> > various SoCs: has this been tested?
> 
>    As I said, I've only tested it on R8A77980.

Thanks, I missed that.

As you may have guessed the implication of my question is that
IMHO it would be best only to add this feature to SoCs where
it has been tested.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 3/7] sh_eth: offload RX checksum on R7S72100
  2019-01-29  8:00       ` Simon Horman
@ 2019-01-29 10:37         ` Sergei Shtylyov
  2019-01-29 15:02           ` Chris Brandt
  2019-01-30 10:08           ` Simon Horman
  0 siblings, 2 replies; 30+ messages in thread
From: Sergei Shtylyov @ 2019-01-29 10:37 UTC (permalink / raw)
  To: Simon Horman; +Cc: netdev, David S. Miller, linux-renesas-soc, linux-sh

Hello!

On 01/29/2019 11:00 AM, Simon Horman wrote:

>>>> The RZ/A1H (R7S721000) SoC manual describes the Ether MAC's RX checksum
>>>> offload the same way as it's implemented in the EtherAVB MACs...
>>>>
>>>> Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
>>>
>>> Regarding this and the remaining patches in this series,
>>> which add rx-csum offload support in the sh_eth driver for
>>> various SoCs: has this been tested?
>>
>>    As I said, I've only tested it on R8A77980.

   And still hoping Geert would be able to test on R8A7740.

> 
> Thanks, I missed that.
> 
> As you may have guessed the implication of my question is that
> IMHO it would be best only to add this feature to SoCs where
> it has been tested.

   You don't trust the manuals? :-)

MBR, Sergei

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/7] sh_eth: implement simple RX checksum offload
  2019-01-27 17:52 ` [PATCH 0/7] sh_eth: implement simple RX checksum offload Heiner Kallweit
@ 2019-01-29 11:06   ` Sergei Shtylyov
  0 siblings, 0 replies; 30+ messages in thread
From: Sergei Shtylyov @ 2019-01-29 11:06 UTC (permalink / raw)
  To: Heiner Kallweit, netdev, David S. Miller; +Cc: linux-renesas-soc, linux-sh

Hello!

On 01/27/2019 08:52 PM, Heiner Kallweit wrote:

>> Here's a set of 7 patches against DaveM's 'net-next.git' repo. I'm implemeting
>> the simple RX checksum offload (like was done for the 'ravb' driver by Simon
>> Horman); it was only tested on the R8A77980 SoC, the other SoCs should just
>> work (according to their manuals)...
>>
>> [1/7] sh_eth: rename sh_eth_cpu_data::hw_checksum
>> [2/7] sh_eth: RX checksum offload support
>> [3/7] sh_eth: offload RX checksum on R7S72100
>> [4/7] sh_eth: offload RX checksum on R8A7740
>> [5/7] sh_eth: offload RX checksum on R8A77980
>> [6/7] sh_eth: offload RX checksum on SH7734
>> [7/7] sh_eth: offload RX checksum on SH7763
>>
>> MBR, Sergei
>>
> Hi Sergei,
> 
> the formatting of the patch series isn't in line with the netdev standards.
> See here: https://www.kernel.org/doc/html/latest/networking/netdev-FAQ.html
> 
> - cover letter isn't generated by git

   Sorry, I don't use git for development. However, I fail to see where this
is requested in the above FAQ.

>  
> - That's not ok
> --- *net-next.orig/*drivers/net/ethernet/renesas/sh_eth.h
> +++ *net-next/*drivers/net/ethernet/renesas/sh_eth.h
> 
> - patches miss net / net-next annotation

   I noted the applicable repo in the cover letter. I know I should use
the subject... but I just keep forgetting about this requirement. :-)
 
> Heiner

MBR, Sergei

^ permalink raw reply	[flat|nested] 30+ messages in thread

* RE: [PATCH 3/7] sh_eth: offload RX checksum on R7S72100
  2019-01-29 10:37         ` Sergei Shtylyov
@ 2019-01-29 15:02           ` Chris Brandt
  2019-01-29 16:03             ` Sergei Shtylyov
  2019-01-30 10:08           ` Simon Horman
  1 sibling, 1 reply; 30+ messages in thread
From: Chris Brandt @ 2019-01-29 15:02 UTC (permalink / raw)
  To: Sergei Shtylyov, Simon Horman
  Cc: netdev, David S. Miller, linux-renesas-soc, linux-sh

On Tuesday, January 29, 2019, Sergei Shtylyov wrote:
> On 01/29/2019 11:00 AM, Simon Horman wrote:
> > As you may have guessed the implication of my question is that
> > IMHO it would be best only to add this feature to SoCs where
> > it has been tested.
> 
>    You don't trust the manuals? :-)
> 
> MBR, Sergei

How were you testing this feature with the R8A77980?

Chris


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 2/7] sh_eth: RX checksum offload support
  2019-01-29  7:58       ` Simon Horman
@ 2019-01-29 15:43         ` Sergei Shtylyov
  2019-01-30 10:06           ` Simon Horman
  0 siblings, 1 reply; 30+ messages in thread
From: Sergei Shtylyov @ 2019-01-29 15:43 UTC (permalink / raw)
  To: Simon Horman; +Cc: netdev, David S. Miller, linux-renesas-soc, linux-sh

On 01/29/2019 10:58 AM, Simon Horman wrote:

>>>> Add support for the RX checksum offload. This is enabled by default and
>>>> may be disabled and re-enabled using 'ethtool':
>>>>
>>>> # ethtool -K eth0 rx {on|off}
>>>>
>>>> Some Ether MACs provide a simple checksumming scheme which appears to be
>>>> completely compatible with CHECKSUM_COMPLETE: sum of all packet data after
>>>> the L2 header is appended to packet data; this may be trivially read by
>>>> the driver and used to update the skb accordingly. The same checksumming
>>>> scheme is implemented in the EtherAVB MACs and now supported by the 'ravb'
>>>> driver.
>>>>
>>>> In terms of performance, throughput is close to gigabit line rate with the
>>>> RX checksum offload both enabled and disabled.  The 'perf' output, however,
>>>> appears to indicate that significantly less time is spent in do_csum() --
>>>> this is as expected.
>>>
>>> Nice.
>>>
>>> FYI, this seems similar to what I observed for RAVB, perhaps on H3 I don't
>>> exactly recall. On E3, which has less CPU power, I recently observed that
>>> with rx-csum enabled I can achieve gigabit line rate, but with rx-csum
>>> disabled throughput is significantly lower. I.e. on that system throughput
>>> is CPU bound with 1500 byte packets unless rx-csum enabled.
>>
>>    Unfortunately, we can't test these patches on the other gen3 boards. ISTR
>> you have RZ/A1H board... if it's still with you, I'd appreciate testing.
> 
> Unfortunately, as of a few weeks ago, I no longer have that board.
> 
>>> Next point:
>>>
>>> 2da64300fbc ("ravb: expand rx descriptor data to accommodate hw checksum")
>>> is fresh in my mind and I wonder if mdp->rx_buf_sz needs to grow to ensure
>>> that there is always enough space for the csum.
>>
>>    Well, if you look at sh_eth_ring_init(), you'll see that the driver reserves
>> plenty of space at the end of the RX buffers.
> 
> Yes, I see that. And I assume that was enough space before this patch.
> But is it still enough space now that 2 bytes are needed for the hardware csum?

  To quote the source:

	/* +26 gets the maximum ethernet encapsulation, +7 & ~7 because the
	 * card needs room to do 8 byte alignment, +2 so we can reserve
	 * the first 2 bytes, and +16 gets room for the status word from the
	 * card.
	 */
	mdp->rx_buf_sz = (ndev->mtu <= 1492 ? PKT_BUF_SZ :
			  (((ndev->mtu + 26 + 7) & ~7) + 2 + 16));

   I have no idea what they mean by status word and why it takes 16 bytes (and I even
have the R8A771x manual!) but I think these 16 bytes are where our checksum goes...
that's why I said there's plenty of space. :-)
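
A quick userspace sketch of that arithmetic, in case anyone wants to plug in other MTUs. Assumptions: PKT_BUF_SZ is taken to be 1538 (check sh_eth.h in your tree), the worst-case frame is counted as Ethernet header + VLAN tag + MTU + FCS + 2 checksum bytes, and whether the MAC really stores the FCS in the buffer is not established here. The demo_* names are hypothetical.

```c
#include <stdint.h>

#define DEMO_PKT_BUF_SZ	1538u	/* assumed value of PKT_BUF_SZ */

/* Same expression as the rx_buf_sz computation quoted above. */
static uint32_t demo_rx_buf_sz(uint32_t mtu)
{
	return mtu <= 1492 ? DEMO_PKT_BUF_SZ
			   : (((mtu + 26 + 7) & ~7u) + 2 + 16);
}

/* Worst case assumed here: 14-byte Ethernet header, 4-byte VLAN tag,
 * 'mtu' bytes of payload, 4-byte FCS and the 2-byte hardware checksum.
 */
static uint32_t demo_max_rx_bytes(uint32_t mtu)
{
	return 14 + 4 + mtu + 4 + 2;
}
```

For an MTU of 1500 this gives a 1546-byte buffer against at most 1524 bytes needed under those assumptions, which is consistent with the "plenty of space" reading. It says nothing about where in the buffer the MAC actually places the data.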

> 2 bytes that might have previously been used as packet data in some
> circumstances.
> 
>>> In particular, have you
>>> tested this with MTU-size frames with VLANs. (My test is to run iperf3 over
>>> a VLAN netdev, netperf over a VLAN netdev would likely work just as well.)
>>
>>    Could you refresh me on how to bring up a VLAN on a given interface?
> 
> You will need a kernel with CONFIG_VLAN_8021Q enabled.
> 
> Then you can do something like this:
> 
> 	ip link add link eth0 name eth0.1 type vlan id 1
> 	ip addr add 10.1.1.100/24 dev eth0.1
> 	ip link set dev eth0.1 up

  Thank you! I'm not familiar with 'ip' at all, thought 'ifconfig' could do the same
thing easier but couldn't remember all the needed incantations... :-)
   Anyway, it worked!

>> [...]
>>>> The above results collected on the R-Car V3H Starter Kit board.
>>>>
>>>> Based on the commit 4d86d3818627 ("ravb: RX checksum offload")...
>>>>
>>>> Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
>> [...]

MBR, Sergei


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 3/7] sh_eth: offload RX checksum on R7S72100
  2019-01-29 15:02           ` Chris Brandt
@ 2019-01-29 16:03             ` Sergei Shtylyov
  0 siblings, 0 replies; 30+ messages in thread
From: Sergei Shtylyov @ 2019-01-29 16:03 UTC (permalink / raw)
  To: Chris Brandt, Simon Horman
  Cc: netdev, David S. Miller, linux-renesas-soc, linux-sh

On 01/29/2019 06:02 PM, Chris Brandt wrote:

>>> As you may have guessed the implication of my question is that
>>> IMHO it would be best only to add this feature to SoCs where
>>> it has been tested.
>>
>>    You don't trust the manuals? :-)
>>
>> MBR, Sergei
> 
> How were you testing this feature with the R8A77980?

  Like Simon, I used perf/netperf; you can see it in patch #2's
description. Do you have other ideas?

> Chris

MBR, Sergei

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 4/7] sh_eth: offload RX checksum on R8A7740
  2019-01-27 17:39 ` [PATCH 4/7] sh_eth: offload RX checksum on R8A7740 Sergei Shtylyov
@ 2019-01-29 18:20   ` Geert Uytterhoeven
  2019-01-31 10:52     ` Sergei Shtylyov
  0 siblings, 1 reply; 30+ messages in thread
From: Geert Uytterhoeven @ 2019-01-29 18:20 UTC (permalink / raw)
  To: Sergei Shtylyov; +Cc: netdev, David S. Miller, Linux-Renesas, Linux-sh list

Hi Sergei,

On Sun, Jan 27, 2019 at 6:41 PM Sergei Shtylyov
<sergei.shtylyov@cogentembedded.com> wrote:
> The R-Mobile A1 (R8A7740) SoC manual describes the Ether MAC's RX checksum
> offload the same way as it's implemented in the EtherAVB MAC...
>
> Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>

Thanks for your patch!

Running netperf as described in patch 2/7, perf tells me there's a reduction
for csum_partial from ca. 1.9% to 0.01%, so this feature seems to work.

Hence:
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>

However, while effective according to perf results, using ethtool to
enable/disable the feature prints an error message:
root@armadillo:~# ethtool -K eth0 rx on
Cannot get device udp-fragmentation-offload settings: Operation not supported
Cannot get device udp-fragmentation-offload settings: Operation not supported
root@armadillo:~# ethtool -K eth0 rx off
Cannot get device udp-fragmentation-offload settings: Operation not supported
Cannot get device udp-fragmentation-offload settings: Operation not supported
root@armadillo:~#

Do you have any clue?

Does this need testing on R-Mobile A1 with VLAN enabled, too, or is that
independent of the underlying sh-eth hardware version?

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 2/7] sh_eth: RX checksum offload support
  2019-01-29 15:43         ` Sergei Shtylyov
@ 2019-01-30 10:06           ` Simon Horman
  0 siblings, 0 replies; 30+ messages in thread
From: Simon Horman @ 2019-01-30 10:06 UTC (permalink / raw)
  To: Sergei Shtylyov; +Cc: netdev, David S. Miller, linux-renesas-soc, linux-sh

On Tue, Jan 29, 2019 at 06:43:45PM +0300, Sergei Shtylyov wrote:
> On 01/29/2019 10:58 AM, Simon Horman wrote:
> 
> >>>> Add support for the RX checksum offload. This is enabled by default and
> >>>> may be disabled and re-enabled using 'ethtool':
> >>>>
> >>>> # ethtool -K eth0 rx {on|off}
> >>>>
> >>>> Some Ether MACs provide a simple checksumming scheme which appears to be
> >>>> completely compatible with CHECKSUM_COMPLETE: sum of all packet data after
> >>>> the L2 header is appended to packet data; this may be trivially read by
> >>>> the driver and used to update the skb accordingly. The same checksumming
> >>>> scheme is implemented in the EtherAVB MACs and now supported by the 'ravb'
> >>>> driver.
> >>>>
> >>>> In terms of performance, throughput is close to gigabit line rate with the
> >>>> RX checksum offload both enabled and disabled.  The 'perf' output, however,
> >>>> appears to indicate that significantly less time is spent in do_csum() --
> >>>> this is as expected.
> >>>
> >>> Nice.
> >>>
> >>> FYI, this seems similar to what I observed for RAVB, perhaps on H3 I don't
> >>> exactly recall. On E3, which has less CPU power, I recently observed that
> >>> with rx-csum enabled I can achieve gigabit line rate, but with rx-csum
> >>> disabled throughput is significantly lower. I.e. on that system throughput
> >>> is CPU bound with 1500 byte packets unless rx-csum enabled.
> >>
> >>    Unfortunately, we can't test these patches on the other gen3 boards. ISTR
> >> you have RZ/A1H board... if it's still with you, I'd appreciate testing.
> > 
> > Unfortunately, as of a few weeks ago, I no longer have that board.
> > 
> >>> Next point:
> >>>
> >>> 2da64300fbc ("ravb: expand rx descriptor data to accommodate hw checksum")
> >>> is fresh in my mind and I wonder if mdp->rx_buf_sz needs to grow to ensure
> >>> that there is always enough space for the csum.
> >>
> >>    Well, if you look at sh_eth_ring_init(), you'll see that the driver reserves
> >> plenty of space at the end of the RX buffers.
> > 
> > Yes, I see that. And I assume that was enough space before this patch.
> > But is it still enough space now that 2 bytes are needed for the hardware csum?
> 
>   To quote the source:
> 
> 	/* +26 gets the maximum ethernet encapsulation, +7 & ~7 because the
> 	 * card needs room to do 8 byte alignment, +2 so we can reserve
> 	 * the first 2 bytes, and +16 gets room for the status word from the
> 	 * card.
> 	 */
> 	mdp->rx_buf_sz = (ndev->mtu <= 1492 ? PKT_BUF_SZ :
> 			  (((ndev->mtu + 26 + 7) & ~7) + 2 + 16));
> 
>    I have no idea what they mean by status word and why it takes 16 bytes (and I even
> have the R8A771x manual!) but I think these 16 bytes are where our checksum goes...
> that's why I said there's plenty of space. :-)

Ok. FWIW, I don't know either.

> > 2 bytes that might have previously been used as packet data in some
> > circumstances.
> > 
> >>> In particular, have you
> >>> tested this with MTU-size frames with VLANs. (My test is to run iperf3 over
> >>> a VLAN netdev, netperf over a VLAN netdev would likely work just as well.)
> >>
> >>    Could you refresh me on how to bring up a VLAN on a given interface?
> > 
> > You will need a kernel with CONFIG_VLAN_8021Q enabled.
> > 
> > Then you can do something like this:
> > 
> > 	ip link add link eth0 name eth0.1 type vlan id 1
> > 	ip addr add 10.1.1.100/24 dev eth0.1
> > 	ip link set dev eth0.1 up
> 
>   Thank you! I'm not familiar with 'ip' at all, thought 'ifconfig' could do the same
> thing easier but couldn't remember all the needed incantations... :-)
>    Anyway, it worked!
> 
> >> [...]
> >>>> The above results collected on the R-Car V3H Starter Kit board.
> >>>>
> >>>> Based on the commit 4d86d3818627 ("ravb: RX checksum offload")...
> >>>>
> >>>> Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
> >> [...]
> 
> MBR, Sergei
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 3/7] sh_eth: offload RX checksum on R7S72100
  2019-01-29 10:37         ` Sergei Shtylyov
  2019-01-29 15:02           ` Chris Brandt
@ 2019-01-30 10:08           ` Simon Horman
  1 sibling, 0 replies; 30+ messages in thread
From: Simon Horman @ 2019-01-30 10:08 UTC (permalink / raw)
  To: Sergei Shtylyov; +Cc: netdev, David S. Miller, linux-renesas-soc, linux-sh

On Tue, Jan 29, 2019 at 01:37:38PM +0300, Sergei Shtylyov wrote:
> Hello!
> 
> On 01/29/2019 11:00 AM, Simon Horman wrote:
> 
> >>>> The RZ/A1H (R7S721000) SoC manual describes the Ether MAC's RX checksum
> >>>> offload the same way as it's implemented in the EtherAVB MACs...
> >>>>
> >>>> Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
> >>>
> >>> Regarding this and the remaining patches in this series,
> >>> which add rx-csum offload support in the sh_eth driver for
> >>> various SoCs: has this been tested?
> >>
> >>    As I said, I've only tested it on R8A77980.
> 
>    And still hoping Geert would be able to test on R8A7740.
> 
> > 
> > Thanks, I missed that.
> > 
> > As you may have guessed the implication of my question is that
> > IMHO it would be best only to add this feature to SoCs where
> > it has been tested.
> 
>    You don't trust the manuals? :-)

As a rule I do not.
But sometimes I have to anyway.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 4/7] sh_eth: offload RX checksum on R8A7740
  2019-01-29 18:20   ` Geert Uytterhoeven
@ 2019-01-31 10:52     ` Sergei Shtylyov
  2019-01-31 11:11       ` Geert Uytterhoeven
  0 siblings, 1 reply; 30+ messages in thread
From: Sergei Shtylyov @ 2019-01-31 10:52 UTC (permalink / raw)
  To: Geert Uytterhoeven; +Cc: netdev, David S. Miller, Linux-Renesas, Linux-sh list

Hello!

On 01/29/2019 09:20 PM, Geert Uytterhoeven wrote:

>> The R-Mobile A1 (R8A7740) SoC manual describes the Ether MAC's RX checksum
>> offload the same way as it's implemented in the EtherAVB MAC...
>>
>> Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
> 
> Thanks for your patch!
> 
> Running netperf as described in patch 2/7, perf tells me there's a reduction
> for csum_partial from ca. 1.9% to 0.01%, so this feature seems to work.

   Hm, what about do_csum()?

> Hence:
> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>

   Thank you!
 
> However, while effective according to perf results, using ethtool to
> enable/disable the feature prints an error message:
> 
> root@armadillo:~# ethtool -K eth0 rx on
> Cannot get device udp-fragmentation-offload settings: Operation not supported
> Cannot get device udp-fragmentation-offload settings: Operation not supported
> root@armadillo:~# ethtool -K eth0 rx off
> Cannot get device udp-fragmentation-offload settings: Operation not supported
> Cannot get device udp-fragmentation-offload settings: Operation not supported
> root@armadillo:~#
> 
> Do you have any clue?

   No (I'm seeing the same).
 
> Does this need testing on R-Mobile A1 with VLAN enabled, too, or is that
> independent from the underlying sh-eth hardware version?

   It's dependent...

> Gr{oetje,eeting}s,
> 
>                         Geert

MBR, Sergei
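
Since 'ethtool -K' prints the messages above whether or not the change takes
effect, one way to confirm the resulting state is to read it back with
'ethtool -k eth0' and look at the "rx-checksumming" line. Below is a minimal
sketch of parsing that output. SAMPLE is made-up text in the usual
"name: on|off [fixed]" layout, not a capture from the Armadillo board, and
parse_ethtool_features() is a hypothetical helper.

```python
# Hypothetical 'ethtool -k eth0' output, for illustration only.
SAMPLE = """\
Features for eth0:
Cannot get device udp-fragmentation-offload settings: Operation not supported
rx-checksumming: on
tx-checksumming: off
\ttx-checksum-ipv4: off [fixed]
scatter-gather: off
"""


def parse_ethtool_features(text: str) -> dict:
    """Map feature name -> True/False from 'ethtool -k' style text."""
    features = {}
    for line in text.splitlines():
        name, sep, value = line.strip().partition(":")
        if not sep or not value.strip():
            continue  # header line ("Features for eth0:") or blank line
        state = value.split()[0]
        if state in ("on", "off"):  # ignores warnings such as "Operation not supported"
            features[name] = (state == "on")
    return features


if __name__ == "__main__":
    feats = parse_ethtool_features(SAMPLE)
    print("rx-checksumming:", "on" if feats.get("rx-checksumming") else "off")
```

On a real system the text would come from running 'ethtool -k <iface>' before
and after the 'ethtool -K <iface> rx on|off' call.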

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 4/7] sh_eth: offload RX checksum on R8A7740
  2019-01-31 10:52     ` Sergei Shtylyov
@ 2019-01-31 11:11       ` Geert Uytterhoeven
  0 siblings, 0 replies; 30+ messages in thread
From: Geert Uytterhoeven @ 2019-01-31 11:11 UTC (permalink / raw)
  To: Sergei Shtylyov; +Cc: netdev, David S. Miller, Linux-Renesas, Linux-sh list

Hi Sergei,

On Thu, Jan 31, 2019 at 11:52 AM Sergei Shtylyov
<sergei.shtylyov@cogentembedded.com> wrote:
> On 01/29/2019 09:20 PM, Geert Uytterhoeven wrote:
> >> The R-Mobile A1 (R8A7740) SoC manual describes the Ether MAC's RX checksum
> >> offload the same way as it's implemented in the EtherAVB MAC...
> >>
> >> Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
> >
> > Thanks for your patch!
> >
> > Running netperf as described in patch 2/7, perf tells me there's a reduction
> > for csum_partial from ca. 1.9% to 0.01%, so this feature seems to work.
>
>    Hm, what about do_csum()?

I had looked for that, but didn't see it. Probably inlined, as it's static.

Gr{oetje,eeting}s,

                        Geert
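
For readers unfamiliar with what csum_partial() spends its cycles on: it
accumulates the 16-bit ones' complement sum used by the IP, TCP and UDP
checksums, which is the work RX checksum offload moves into the MAC. Below is
a plain, unoptimized sketch of that sum as described in RFC 1071. The function
names are chosen for this example only and this is not the kernel's
implementation, which is optimized per architecture.

```python
def ones_complement_sum16(data: bytes) -> int:
    """16-bit ones' complement sum over big-endian words (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"  # pad an odd-length buffer with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold the carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


def internet_checksum(data: bytes) -> int:
    """Checksum field value: the complement of the ones' complement sum."""
    return ~ones_complement_sum16(data) & 0xFFFF


if __name__ == "__main__":
    # The worked example from RFC 1071, section 3.
    payload = bytes([0x00, 0x01, 0xF2, 0x03, 0xF4, 0xF5, 0xF6, 0xF7])
    print(hex(ones_complement_sum16(payload)))  # 0xddf2
    print(hex(internet_checksum(payload)))      # 0x220d
```

A receiver that sums the data together with the transmitted checksum gets
0xffff for an intact packet, so a sum supplied by the hardware lets the stack
skip walking the payload again.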

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 7/7] sh_eth: offload RX checksum on SH7763
  2019-01-27 17:42 ` [PATCH 7/7] sh_eth: offload RX checksum on SH7763 Sergei Shtylyov
@ 2019-02-04 11:55   ` Rob Landley
  2019-02-04 15:17     ` Sergei Shtylyov
  0 siblings, 1 reply; 30+ messages in thread
From: Rob Landley @ 2019-02-04 11:55 UTC (permalink / raw)
  To: Sergei Shtylyov, netdev, David S. Miller; +Cc: linux-renesas-soc, linux-sh

On 1/27/19 11:42 AM, Sergei Shtylyov wrote:
> The SH7763 SoC manual describes the Ether MAC's RX checksum offload
> the same way as it's implemented in the EtherAVB MACs...
> 
> Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>

I think this is the chip in the JCI N40 on my desk, how would I test this and
tell it was working? (Is there an existing test program or...?)

Also, can this be tested under QEMU's r2 board emulation?

Rob

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 7/7] sh_eth: offload RX checksum on SH7763
  2019-02-04 11:55   ` Rob Landley
@ 2019-02-04 15:17     ` Sergei Shtylyov
  0 siblings, 0 replies; 30+ messages in thread
From: Sergei Shtylyov @ 2019-02-04 15:17 UTC (permalink / raw)
  To: Rob Landley, netdev, David S. Miller; +Cc: linux-renesas-soc, linux-sh

Hello!

On 02/04/2019 02:55 PM, Rob Landley wrote:

>> The SH7763 SoC manual describes the Ether MAC's RX checksum offload
>> the same way as it's implemented in the EtherAVB MACs...
>>
>> Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
> 
> I think this is the chip in the JCI N40 on my desk, how would I test this and
> tell it was working? (Is there an existing test program or...?)

   There are programs, yes. I used the (rather old) netperf-2.2pl4 (I can send it to
you) running under perf (provided by the Poky rootfs). The details of running netperf
on the target are in the description of patch #2; on the host you just run netserver
from that same test suite. The current netperf is maintained by HP on GitHub but I was
unable to figure out how to build it quickly enough... :-)

> Also, can this be tested under QEMU's r2 board emulation?

   I have no idea, sorry.

> Rob

MBR, Sergei


^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2019-02-04 15:17 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
2019-01-27 17:33 [PATCH 0/7] sh_eth: implement simple RX checksum offload Sergei Shtylyov
2019-01-27 17:36 ` [PATCH 1/7] sh_eth: rename sh_eth_cpu_data::hw_checksum Sergei Shtylyov
2019-01-28  9:21   ` Geert Uytterhoeven
2019-01-28 11:08     ` Sergei Shtylyov
2019-01-28 19:15       ` David Miller
2019-01-27 17:37 ` [PATCH 2/7] sh_eth: RX checksum offload support Sergei Shtylyov
2019-01-28 12:18   ` Simon Horman
2019-01-28 15:45     ` Sergei Shtylyov
2019-01-29  7:58       ` Simon Horman
2019-01-29 15:43         ` Sergei Shtylyov
2019-01-30 10:06           ` Simon Horman
2019-01-27 17:38 ` [PATCH 3/7] sh_eth: offload RX checksum on R7S72100 Sergei Shtylyov
2019-01-28 12:20   ` Simon Horman
2019-01-28 15:21     ` Sergei Shtylyov
2019-01-29  8:00       ` Simon Horman
2019-01-29 10:37         ` Sergei Shtylyov
2019-01-29 15:02           ` Chris Brandt
2019-01-29 16:03             ` Sergei Shtylyov
2019-01-30 10:08           ` Simon Horman
2019-01-27 17:39 ` [PATCH 4/7] sh_eth: offload RX checksum on R8A7740 Sergei Shtylyov
2019-01-29 18:20   ` Geert Uytterhoeven
2019-01-31 10:52     ` Sergei Shtylyov
2019-01-31 11:11       ` Geert Uytterhoeven
2019-01-27 17:40 ` [PATCH 5/7] sh_eth: offload RX checksum on R8A77980 Sergei Shtylyov
2019-01-27 17:41 ` [PATCH 6/7] sh_eth: offload RX checksum on SH7734 Sergei Shtylyov
2019-01-27 17:42 ` [PATCH 7/7] sh_eth: offload RX checksum on SH7763 Sergei Shtylyov
2019-02-04 11:55   ` Rob Landley
2019-02-04 15:17     ` Sergei Shtylyov
2019-01-27 17:52 ` [PATCH 0/7] sh_eth: implement simple RX checksum offload Heiner Kallweit
2019-01-29 11:06   ` Sergei Shtylyov
