From: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
To: netdev@vger.kernel.org, "David S. Miller" <davem@davemloft.net>
Cc: linux-renesas-soc@vger.kernel.org, linux-sh@vger.kernel.org
Subject: [PATCH 2/7] sh_eth: RX checksum offload support
Date: Sun, 27 Jan 2019 20:37:33 +0300 [thread overview]
Message-ID: <29cb4d33-37c3-43c0-3bce-fa01737c0fa4@cogentembedded.com> (raw)
In-Reply-To: <fc43bfa2-4312-b52b-a749-e3e43e242a95@cogentembedded.com>
Add support for the RX checksum offload. This is enabled by default and
may be disabled and re-enabled using 'ethtool':
# ethtool -K eth0 rx {on|off}
Some Ether MACs provide a simple checksumming scheme which appears to be
completely compatible with CHECKSUM_COMPLETE: sum of all packet data after
the L2 header is appended to packet data; this may be trivially read by
the driver and used to update the skb accordingly. The same checksumming
scheme is implemented in the EtherAVB MACs and now supported by tha 'ravb'
driver.
In terms of performance, throughput is close to gigabit line rate with the
RX checksum offload both enabled and disabled. The 'perf' output, however,
appears to indicate that significantly less time is spent in do_csum() --
this is as expected.
Test results with RX checksum offload enabled:
~/netperf-2.2pl4# perf record -a ./netperf -t TCP_MAERTS -H 192.168.2.4
TCP MAERTS TEST to 192.168.2.4
Recv Send Send
Socket Socket Message Elapsed
Size Size Size Time Throughput
bytes bytes bytes secs. 10^6bits/sec
131072 16384 16384 10.01 933.93
[ perf record: Woken up 8 times to write data ]
[ perf record: Captured and wrote 1.955 MB perf.data (41940 samples) ]
~/netperf-2.2pl4# perf report
Samples: 41K of event 'cycles:ppp', Event count (approx.): 9915302763
Overhead Command Shared Object Symbol
9.44% netperf [kernel.kallsyms] [k] __arch_copy_to_user
7.75% swapper [kernel.kallsyms] [k] _raw_spin_unlock_irq
6.31% swapper [kernel.kallsyms] [k] default_idle_call
5.89% swapper [kernel.kallsyms] [k] arch_cpu_idle
4.37% swapper [kernel.kallsyms] [k] tick_nohz_idle_exit
4.02% netperf [kernel.kallsyms] [k] _raw_spin_unlock_irq
2.52% netperf [kernel.kallsyms] [k] preempt_count_sub
1.81% netperf [kernel.kallsyms] [k] tcp_recvmsg
1.80% netperf [kernel.kallsyms] [k] _raw_spin_unlock_irqres
1.78% netperf [kernel.kallsyms] [k] preempt_count_add
1.36% netperf [kernel.kallsyms] [k] __tcp_transmit_skb
1.20% netperf [kernel.kallsyms] [k] __local_bh_enable_ip
1.10% netperf [kernel.kallsyms] [k] sh_eth_start_xmit
Test results with RX checksum offload disabled:
~/netperf-2.2pl4# perf record -a ./netperf -t TCP_MAERTS -H 192.168.2.4
TCP MAERTS TEST to 192.168.2.4
Recv Send Send
Socket Socket Message Elapsed
Size Size Size Time Throughput
bytes bytes bytes secs. 10^6bits/sec
131072 16384 16384 10.01 932.04
[ perf record: Woken up 14 times to write data ]
[ perf record: Captured and wrote 3.642 MB perf.data (78817 samples) ]
~/netperf-2.2pl4# perf report
Samples: 78K of event 'cycles:ppp', Event count (approx.): 18091442796
Overhead Command Shared Object Symbol
7.00% swapper [kernel.kallsyms] [k] do_csum
3.94% swapper [kernel.kallsyms] [k] sh_eth_poll
3.83% ksoftirqd/0 [kernel.kallsyms] [k] do_csum
3.23% swapper [kernel.kallsyms] [k] _raw_spin_unlock_irq
2.87% netperf [kernel.kallsyms] [k] __arch_copy_to_user
2.86% swapper [kernel.kallsyms] [k] arch_cpu_idle
2.13% swapper [kernel.kallsyms] [k] default_idle_call
2.12% ksoftirqd/0 [kernel.kallsyms] [k] sh_eth_poll
2.02% swapper [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore
1.84% swapper [kernel.kallsyms] [k] __softirqentry_text_start
1.64% swapper [kernel.kallsyms] [k] tick_nohz_idle_exit
1.53% netperf [kernel.kallsyms] [k] _raw_spin_unlock_irq
1.32% netperf [kernel.kallsyms] [k] preempt_count_sub
1.27% swapper [kernel.kallsyms] [k] __pi___inval_dcache_area
1.22% swapper [kernel.kallsyms] [k] check_preemption_disabled
1.01% ksoftirqd/0 [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore
The above results collected on the R-Car V3H Starter Kit board.
Based on the commit 4d86d3818627 ("ravb: RX checksum offload")...
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
---
drivers/net/ethernet/renesas/sh_eth.c | 60 ++++++++++++++++++++++++++++++++--
drivers/net/ethernet/renesas/sh_eth.h | 1
2 files changed, 59 insertions(+), 2 deletions(-)
Index: net-next/drivers/net/ethernet/renesas/sh_eth.c
===================================================================
--- net-next.orig/drivers/net/ethernet/renesas/sh_eth.c
+++ net-next/drivers/net/ethernet/renesas/sh_eth.c
@@ -1532,8 +1532,9 @@ static int sh_eth_dev_init(struct net_de
mdp->irq_enabled = true;
sh_eth_write(ndev, mdp->cd->eesipr_value, EESIPR);
- /* PAUSE Prohibition */
+ /* EMAC Mode: PAUSE prohibition; Duplex; RX Checksum; TX; RX */
sh_eth_write(ndev, ECMR_ZPF | (mdp->duplex ? ECMR_DM : 0) |
+ (ndev->features & NETIF_F_RXCSUM ? ECMR_RCSC : 0) |
ECMR_TE | ECMR_RE, ECMR);
if (mdp->cd->set_rate)
@@ -1592,6 +1593,19 @@ static void sh_eth_dev_exit(struct net_d
update_mac_address(ndev);
}
+static void sh_eth_rx_csum(struct sk_buff *skb)
+{
+ u8 *hw_csum;
+
+ /* The hardware checksum is 2 bytes appended to packet data */
+ if (unlikely(skb->len < sizeof(__sum16)))
+ return;
+ hw_csum = skb_tail_pointer(skb) - sizeof(__sum16);
+ skb->csum = csum_unfold((__force __sum16)get_unaligned_le16(hw_csum));
+ skb->ip_summed = CHECKSUM_COMPLETE;
+ skb_trim(skb, skb->len - sizeof(__sum16));
+}
+
/* Packet receive function */
static int sh_eth_rx(struct net_device *ndev, u32 intr_status, int *quota)
{
@@ -1666,6 +1680,8 @@ static int sh_eth_rx(struct net_device *
DMA_FROM_DEVICE);
skb_put(skb, pkt_len);
skb->protocol = eth_type_trans(skb, ndev);
+ if (ndev->features & NETIF_F_RXCSUM)
+ sh_eth_rx_csum(skb);
netif_receive_skb(skb);
ndev->stats.rx_packets++;
ndev->stats.rx_bytes += pkt_len;
@@ -2921,6 +2937,39 @@ static void sh_eth_set_rx_mode(struct ne
spin_unlock_irqrestore(&mdp->lock, flags);
}
+static void sh_eth_set_rx_csum(struct net_device *ndev, bool enable)
+{
+ struct sh_eth_private *mdp = netdev_priv(ndev);
+ unsigned long flags;
+
+ spin_lock_irqsave(&mdp->lock, flags);
+
+ /* Disable TX and RX */
+ sh_eth_rcv_snd_disable(ndev);
+
+ /* Modify RX Checksum setting */
+ sh_eth_modify(ndev, ECMR, ECMR_RCSC, enable ? ECMR_RCSC : 0);
+
+ /* Enable TX and RX */
+ sh_eth_rcv_snd_enable(ndev);
+
+ spin_unlock_irqrestore(&mdp->lock, flags);
+}
+
+static int sh_eth_set_features(struct net_device *ndev,
+ netdev_features_t features)
+{
+ netdev_features_t changed = ndev->features ^ features;
+ struct sh_eth_private *mdp = netdev_priv(ndev);
+
+ if (changed & NETIF_F_RXCSUM && mdp->cd->rx_csum)
+ sh_eth_set_rx_csum(ndev, features & NETIF_F_RXCSUM);
+
+ ndev->features = features;
+
+ return 0;
+}
+
static int sh_eth_get_vtag_index(struct sh_eth_private *mdp)
{
if (!mdp->port)
@@ -3102,6 +3151,7 @@ static const struct net_device_ops sh_et
.ndo_change_mtu = sh_eth_change_mtu,
.ndo_validate_addr = eth_validate_addr,
.ndo_set_mac_address = eth_mac_addr,
+ .ndo_set_features = sh_eth_set_features,
};
static const struct net_device_ops sh_eth_netdev_ops_tsu = {
@@ -3117,6 +3167,7 @@ static const struct net_device_ops sh_et
.ndo_change_mtu = sh_eth_change_mtu,
.ndo_validate_addr = eth_validate_addr,
.ndo_set_mac_address = eth_mac_addr,
+ .ndo_set_features = sh_eth_set_features,
};
#ifdef CONFIG_OF
@@ -3245,6 +3296,11 @@ static int sh_eth_drv_probe(struct platf
ndev->max_mtu = 2000 - (ETH_HLEN + VLAN_HLEN + ETH_FCS_LEN);
ndev->min_mtu = ETH_MIN_MTU;
+ if (mdp->cd->rx_csum) {
+ ndev->features = NETIF_F_RXCSUM;
+ ndev->hw_features = NETIF_F_RXCSUM;
+ }
+
/* set function */
if (mdp->cd->tsu)
ndev->netdev_ops = &sh_eth_netdev_ops_tsu;
@@ -3294,7 +3350,7 @@ static int sh_eth_drv_probe(struct platf
goto out_release;
}
mdp->port = port;
- ndev->features = NETIF_F_HW_VLAN_CTAG_FILTER;
+ ndev->features |= NETIF_F_HW_VLAN_CTAG_FILTER;
/* Need to init only the first port of the two sharing a TSU */
if (port == 0) {
Index: net-next/drivers/net/ethernet/renesas/sh_eth.h
===================================================================
--- net-next.orig/drivers/net/ethernet/renesas/sh_eth.h
+++ net-next/drivers/net/ethernet/renesas/sh_eth.h
@@ -500,6 +500,7 @@ struct sh_eth_cpu_data {
unsigned no_xdfar:1; /* E-DMAC DOES NOT have RDFAR/TDFAR */
unsigned xdfar_rw:1; /* E-DMAC has writeable RDFAR/TDFAR */
unsigned csmr:1; /* E-DMAC has CSMR */
+ unsigned rx_csum:1; /* EtherC has ECMR.RCSC */
unsigned select_mii:1; /* EtherC has RMII_MII (MII select register) */
unsigned rmiimode:1; /* EtherC has RMIIMODE register */
unsigned rtrate:1; /* EtherC has RTRATE register */
next prev parent reply other threads:[~2019-01-27 17:37 UTC|newest]
Thread overview: 30+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-01-27 17:33 [PATCH 0/7] sh_eth: implement simple RX checksum offload Sergei Shtylyov
2019-01-27 17:36 ` [PATCH 1/7] sh_eth: rename sh_eth_cpu_data::hw_checksum Sergei Shtylyov
2019-01-28 9:21 ` Geert Uytterhoeven
2019-01-28 11:08 ` Sergei Shtylyov
2019-01-28 19:15 ` David Miller
2019-01-27 17:37 ` Sergei Shtylyov [this message]
2019-01-28 12:18 ` [PATCH 2/7] sh_eth: RX checksum offload support Simon Horman
2019-01-28 15:45 ` Sergei Shtylyov
2019-01-29 7:58 ` Simon Horman
2019-01-29 15:43 ` Sergei Shtylyov
2019-01-30 10:06 ` Simon Horman
2019-01-27 17:38 ` [PATCH 3/7] sh_eth: offload RX checksum on R7S72100 Sergei Shtylyov
2019-01-28 12:20 ` Simon Horman
2019-01-28 15:21 ` Sergei Shtylyov
2019-01-29 8:00 ` Simon Horman
2019-01-29 10:37 ` Sergei Shtylyov
2019-01-29 15:02 ` Chris Brandt
2019-01-29 16:03 ` Sergei Shtylyov
2019-01-30 10:08 ` Simon Horman
2019-01-27 17:39 ` [PATCH 4/7] sh_eth: offload RX checksum on R8A7740 Sergei Shtylyov
2019-01-29 18:20 ` Geert Uytterhoeven
2019-01-31 10:52 ` Sergei Shtylyov
2019-01-31 11:11 ` Geert Uytterhoeven
2019-01-27 17:40 ` [PATCH 5/7] sh_eth: offload RX checksum on R8A77980 Sergei Shtylyov
2019-01-27 17:41 ` [PATCH 6/7] sh_eth: offload RX checksum on SH7734 Sergei Shtylyov
2019-01-27 17:42 ` [PATCH 7/7] sh_eth: offload RX checksum on SH7763 Sergei Shtylyov
2019-02-04 11:55 ` Rob Landley
2019-02-04 15:17 ` Sergei Shtylyov
2019-01-27 17:52 ` [PATCH 0/7] sh_eth: implement simple RX checksum offload Heiner Kallweit
2019-01-29 11:06 ` Sergei Shtylyov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=29cb4d33-37c3-43c0-3bce-fa01737c0fa4@cogentembedded.com \
--to=sergei.shtylyov@cogentembedded.com \
--cc=davem@davemloft.net \
--cc=linux-renesas-soc@vger.kernel.org \
--cc=linux-sh@vger.kernel.org \
--cc=netdev@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).