netdev.vger.kernel.org archive mirror
* [PATCH net-next v7 0/9] xen-netback: TX grant mapping with SKBTX_DEV_ZEROCOPY instead of copy
@ 2014-03-06 21:48 Zoltan Kiss
From: Zoltan Kiss @ 2014-03-06 21:48 UTC
  To: ian.campbell, wei.liu2, xen-devel
  Cc: netdev, linux-kernel, jonathan.davies, Zoltan Kiss

A long-known problem of the upstream netback implementation is that on the TX
path (from guest to Dom0) it copies the whole packet from guest memory into
Dom0. That became a bottleneck with 10Gb NICs, and in general it is a huge
performance penalty. The classic kernel version of netback used grant
mapping, and to get notified when the page can be unmapped, it used page
destructors. Unfortunately that destructor is not an upstreamable solution.
Ian Campbell's skb fragment destructor patch series [1] tried to solve this
problem, however it seems to be very invasive on the network stack's code,
and therefore hasn't progressed very well.
This patch series uses the SKBTX_DEV_ZEROCOPY flag to tell the stack that
netback needs to know when the skb is freed up. That is the way KVM solved the
same problem, and based on my initial tests it can do the same for us. Avoiding
the extra copy boosted TX throughput from 6.8 Gbps to 7.9 Gbps (I used a slower
AMD Interlagos box, both Dom0 and guest on upstream kernel, on the same NUMA
node, running iperf 2.0.5, and the remote end was a bare metal box on the same
10Gb switch).
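
To illustrate the idea of being notified when a buffer is released, here is a
minimal userspace sketch. It is not kernel code and not what the patches do
literally: toy_buf, toy_owner, toy_buf_map and toy_buf_free are hypothetical
names, and the zerocopy field only plays the role that SKBTX_DEV_ZEROCOPY
plays for a real skb.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; none of these are kernel types. */
struct toy_buf;
typedef void (*toy_done_cb)(struct toy_buf *buf, void *ctx);

struct toy_buf {
    const void *data;  /* borrowed ("mapped") payload, never copied */
    size_t len;
    int zerocopy;      /* plays the role of the SKBTX_DEV_ZEROCOPY flag */
    toy_done_cb done;  /* invoked when the consumer releases the buffer */
    void *ctx;
};

/* Owner side: tracks how many pages were mapped and may now be unmapped. */
struct toy_owner {
    int mapped;
    int unmapped;
};

/* Completion callback: the owner learns the page can be unmapped. */
static void toy_unmap_cb(struct toy_buf *buf, void *ctx)
{
    struct toy_owner *owner = ctx;

    (void)buf;
    owner->unmapped++;
}

/* Wrap foreign data without copying it and register the callback. */
static struct toy_buf *toy_buf_map(struct toy_owner *owner,
                                   const void *data, size_t len)
{
    struct toy_buf *buf = malloc(sizeof(*buf));

    if (!buf)
        return NULL;
    buf->data = data;
    buf->len = len;
    buf->zerocopy = 1;
    buf->done = toy_unmap_cb;
    buf->ctx = owner;
    owner->mapped++;
    return buf;
}

/* Consumer side: releasing the buffer notifies the owner first. */
static void toy_buf_free(struct toy_buf *buf)
{
    assert(buf != NULL);
    if (buf->zerocopy && buf->done)
        buf->done(buf, buf->ctx);
    free(buf);
}
```

The point of the sketch is only the ownership handshake: the payload is never
copied, and the owner defers its cleanup until the consumer drops the buffer.
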
Based on my investigations the packet only gets copied if it is delivered to
the Dom0 IP stack through deliver_skb, which is due to this patch [2]. This
affects DomU->Dom0 IP traffic and the case when Dom0 does routing/NAT for the
guest. That's a bit unfortunate, but luckily it doesn't cause a major
regression for this use case. In the future we should try to eliminate that
copy somehow.
There are a few spinoff tasks which will be addressed in separate patches:
- grant copy the header directly instead of map and memcpy. This should help
  us avoid TLB flushing
- use something other than ballooned pages
- fix grant map to use page->index properly
I've tried to break it down into smaller patches, with mixed results, so I
welcome suggestions on that part as well:
1: Use skb->cb to store pending_idx
2: Some refactoring
3: Change RX path for mapped SKB fragments (moved here to keep bisectability,
review it after #4)
4: Introduce TX grant mapping
5: Remove old TX grant copy definitions and fix indentations
6: Add stat counters for zerocopy
7: Handle guests with too many frags
8: Timeout packets in RX path
9: Aggregate TX unmap operations
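
As an illustration of item 1, per-packet driver state can live in a fixed
scratch area carried by the packet itself. The sketch below is userspace code
under stated assumptions: toy_skb, toy_tx_cb and the two helpers are
hypothetical names, and the 48-byte size mirrors the size of skb->cb in the
kernel. The actual patch may lay this out differently.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_CB_SIZE 48  /* same size as skb->cb in the kernel */

/* Toy packet carrying an opaque per-layer scratch area. */
struct toy_skb {
    unsigned char cb[TOY_CB_SIZE];
};

/* Hypothetical driver-private state stored inside the scratch area. */
struct toy_tx_cb {
    uint16_t pending_idx;
};

/* The private state must fit into the scratch area. */
_Static_assert(sizeof(struct toy_tx_cb) <= TOY_CB_SIZE,
               "toy_tx_cb does not fit into cb");

/* Store the index in the packet; memcpy avoids aliasing problems. */
static void toy_set_pending_idx(struct toy_skb *skb, uint16_t idx)
{
    struct toy_tx_cb cb = { .pending_idx = idx };

    memcpy(skb->cb, &cb, sizeof(cb));
}

/* Read the index back from the packet. */
static uint16_t toy_get_pending_idx(const struct toy_skb *skb)
{
    struct toy_tx_cb cb;

    memcpy(&cb, skb->cb, sizeof(cb));
    return cb.pending_idx;
}
```

Keeping the index in the packet means no side table is needed to find it
again later, as long as no other layer reuses the scratch area in between.
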

v2: I've fixed some smaller things, see the individual patches. I've added a
few new stat counters, and handling of the important use case when an older
guest sends lots of slots. Instead of delayed copy we now time out packets on
the RX path, based on the assumption that otherwise packets should not get
stuck anywhere else. Finally, some unmap batching to avoid too many TLB
flushes.
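
The unmap batching can be pictured with the following userspace sketch. It is
only a model under assumptions: toy_unmap_queue, TOY_BATCH, toy_queue_unmap
and toy_flush are hypothetical names, and the flushes counter stands in for
the expensive operation (an unmap hypercall with its TLB flush).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_BATCH 4  /* hypothetical batch size */

struct toy_unmap_queue {
    uint16_t pending[TOY_BATCH];  /* indices waiting to be unmapped */
    size_t count;                 /* number of waiting entries */
    unsigned flushes;             /* stand-in for hypercall + TLB flush */
    unsigned unmapped;            /* total entries processed so far */
};

/* Process everything queued so far in one expensive operation. */
static void toy_flush(struct toy_unmap_queue *q)
{
    if (q->count == 0)
        return;
    q->unmapped += (unsigned)q->count;
    q->flushes++;
    q->count = 0;
}

/* Queue one entry; only flush once a full batch has accumulated. */
static void toy_queue_unmap(struct toy_unmap_queue *q, uint16_t idx)
{
    assert(q->count < TOY_BATCH);
    q->pending[q->count++] = idx;
    if (q->count == TOY_BATCH)
        toy_flush(q);
}
```

With a batch size of 4, queueing ten entries triggers two flushes and leaves
two entries pending, so a final toy_flush() call is needed to drain the rest.
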

v3: Apart from fixing a few things mentioned in responses, the important change
is the use of the hypercall directly for grant [un]mapping, so we can avoid
m2p override.

v4: Now we are using a new grant mapping API to avoid m2p_override. The RX queue
timeout logic has also changed.

v5: Only minor fixes based on Wei's comments

v6: Important bugfixes for the xenvif_poll exit path and the zerocopy callback,
see the first 2 patches. Also reworked the handling of packets with too many
slots, and reordered the series a bit.

v7: Small fixes in comments/log messages/error paths, and merging the frag
overflow stats patch into its parent.

[1] http://lwn.net/Articles/491522/
[2] https://lkml.org/lkml/2012/7/20/363

Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>

Thread overview: 36+ messages
2014-03-06 21:48 [PATCH net-next v7 0/9] xen-netback: TX grant mapping with SKBTX_DEV_ZEROCOPY instead of copy Zoltan Kiss
2014-03-06 21:48 ` [PATCH net-next v7 1/9] xen-netback: Use skb->cb for pending_idx Zoltan Kiss
2014-03-06 21:48 ` [PATCH net-next v7 2/9] xen-netback: Minor refactoring of netback code Zoltan Kiss
2014-03-06 21:48 ` [PATCH net-next v7 3/9] xen-netback: Handle foreign mapped pages on the guest RX path Zoltan Kiss
2014-03-06 21:48 ` [PATCH net-next v7 4/9] xen-netback: Introduce TX grant mapping Zoltan Kiss
2014-03-13 10:17   ` Ian Campbell
2014-03-13 12:34     ` Zoltan Kiss
2014-03-13 10:33   ` Ian Campbell
2014-03-13 10:56     ` [Xen-devel] " David Vrabel
2014-03-13 11:02       ` Ian Campbell
2014-03-13 11:09         ` David Vrabel
2014-03-13 11:13         ` Wei Liu
2014-03-13 13:17     ` Zoltan Kiss
2014-03-13 13:56       ` Ian Campbell
2014-03-13 17:43         ` Zoltan Kiss
2014-03-06 21:48 ` [PATCH net-next v7 5/9] xen-netback: Remove old TX grant copy definitons and fix indentations Zoltan Kiss
2014-03-06 21:48 ` [PATCH net-next v7 6/9] xen-netback: Add stat counters for zerocopy Zoltan Kiss
2014-03-06 21:48 ` [PATCH net-next v7 7/9] xen-netback: Handle guests with too many frags Zoltan Kiss
2014-03-06 21:48 ` [PATCH net-next v7 8/9] xen-netback: Timeout packets in RX path Zoltan Kiss
2014-03-13 10:39   ` Ian Campbell
2014-03-06 21:48 ` [PATCH net-next v7 9/9] xen-netback: Aggregate TX unmap operations Zoltan Kiss
2014-03-19 21:16   ` Zoltan Kiss
2014-03-20  9:53     ` Paul Durrant
2014-03-20 10:48     ` Wei Liu
2014-03-20 11:14       ` Paul Durrant
2014-03-20 12:38         ` Wei Liu
2014-03-20 16:11           ` Zoltan Kiss
2014-03-07 21:05 ` [PATCH net-next v7 0/9] xen-netback: TX grant mapping with SKBTX_DEV_ZEROCOPY instead of copy David Miller
2014-03-08 14:37   ` Zoltan Kiss
2014-03-08 23:57     ` David Miller
2014-03-10 10:15       ` Wei Liu
2014-03-12 15:40       ` Ian Campbell
2014-03-12 18:49         ` Zoltan Kiss
2014-03-13 10:43         ` Ian Campbell
2014-03-13 10:08 ` Ian Campbell
2014-03-13 18:23   ` Zoltan Kiss
