linux-kernel.vger.kernel.org archive mirror
* Re: [PATCH v2] virtio_net: fix PAGE_SIZE > 64k
@ 2017-01-25  4:07 Alexei Starovoitov
  2017-01-25 14:23 ` Michael S. Tsirkin
  0 siblings, 1 reply; 12+ messages in thread
From: Alexei Starovoitov @ 2017-01-25  4:07 UTC (permalink / raw)
  To: John Fastabend
  Cc: Michael S. Tsirkin, David Miller, linux-kernel, Jason Wang,
	virtualization, netdev

On Tue, Jan 24, 2017 at 7:48 PM, John Fastabend
<john.fastabend@gmail.com> wrote:
>
> It is a concern on my side. I want XDP and Linux stack to work
> reasonably well together.

btw, the micro benchmarks showed that the page-per-packet approach
that xdp took in mlx4 is about 10% slower than normal operation
for the tcp/ip stack. We thought that would be an acceptable
slowdown for our LB use case, but it turned out that overall we
got a performance boost: the xdp model simplified user space and
made the data path faster, so we got extra free cpu that is now
used by other apps on the same host, a net perf win despite the
extra overhead in tcp/ip.
Not all use cases are the same and not everyone will be as lucky,
so I'd like to see the performance of xdp_pass improve too, though
it turned out to be a lower priority than I initially estimated.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2] virtio_net: fix PAGE_SIZE > 64k
  2017-01-25  4:07 [PATCH v2] virtio_net: fix PAGE_SIZE > 64k Alexei Starovoitov
@ 2017-01-25 14:23 ` Michael S. Tsirkin
  0 siblings, 0 replies; 12+ messages in thread
From: Michael S. Tsirkin @ 2017-01-25 14:23 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: John Fastabend, David Miller, linux-kernel, Jason Wang,
	virtualization, netdev

On Tue, Jan 24, 2017 at 08:07:40PM -0800, Alexei Starovoitov wrote:
> On Tue, Jan 24, 2017 at 7:48 PM, John Fastabend
> <john.fastabend@gmail.com> wrote:
> >
> > It is a concern on my side. I want XDP and Linux stack to work
> > reasonably well together.
> 
> btw, the micro benchmarks showed that the page-per-packet approach
> that xdp took in mlx4 is about 10% slower than normal operation
> for the tcp/ip stack.

Interesting. TCP only or UDP too? What's the packet size? Are you tuning
your rmem limits at all?  The slowdown would be more noticeable with
UDP with default values and small packet sizes.

> We thought that would be an acceptable
> slowdown for our LB use case, but it turned out that overall we
> got a performance boost: the xdp model simplified user space and
> made the data path faster, so we got extra free cpu that is now
> used by other apps on the same host, a net perf win despite the
> extra overhead in tcp/ip.
> Not all use cases are the same and not everyone will be as lucky,
> so I'd like to see the performance of xdp_pass improve too, though
> it turned out to be a lower priority than I initially estimated.


* Re: [PATCH v2] virtio_net: fix PAGE_SIZE > 64k
  2017-01-24 21:56               ` Michael S. Tsirkin
@ 2017-01-25  3:48                 ` John Fastabend
  0 siblings, 0 replies; 12+ messages in thread
From: John Fastabend @ 2017-01-25  3:48 UTC (permalink / raw)
  To: Michael S. Tsirkin, David Miller
  Cc: linux-kernel, jasowang, virtualization, netdev

On 17-01-24 01:56 PM, Michael S. Tsirkin wrote:
> On Tue, Jan 24, 2017 at 04:10:46PM -0500, David Miller wrote:
>> This works in the regimen that XDP packets always live in exactly one
>> page.  That will be needed to mmap the RX ring into userspace, and it
>> helps make adjust_header trivial as well.

I still don't see why this is a hard requirement for mmap. Let me post
some patches later tonight to show how we do this with af_packet.

> 
> I think the point was to avoid resets across xdp attach/detach.  If we
> are doing resets now, we could do whatever buffering we want. We could
> also just disable mergeable buffers for that matter.
> 
>> MTU 1500, PAGE_SIZE >= 4096, so a headroom of 256 is no problem, and
>> we still have enough tailroom for skb_shared_info should we wrap
>> the buffer into a real SKB and push it into the stack.
>>
>> If you are trying to do buffering differently for virtio_net, well...
>> that's a self inflicted wound as far as I can tell.
> 
> Right but I was wondering about the fact that this makes XDP_PASS
> much slower than processing skbs without XDP, as truesize is huge
> so we'll quickly run out of rmem space.
> 
> When XDP is used to fight DOS attacks, why isn't this a concern?
> 

It is a concern on my side. I want XDP and Linux stack to work
reasonably well together.

.John


* Re: [PATCH v2] virtio_net: fix PAGE_SIZE > 64k
  2017-01-24 21:10             ` David Miller
@ 2017-01-24 21:56               ` Michael S. Tsirkin
  2017-01-25  3:48                 ` John Fastabend
  0 siblings, 1 reply; 12+ messages in thread
From: Michael S. Tsirkin @ 2017-01-24 21:56 UTC (permalink / raw)
  To: David Miller
  Cc: linux-kernel, jasowang, virtualization, netdev, john.fastabend

On Tue, Jan 24, 2017 at 04:10:46PM -0500, David Miller wrote:
> This works in the regimen that XDP packets always live in exactly one
> page.  That will be needed to mmap the RX ring into userspace, and it
> helps make adjust_header trivial as well.

I think the point was to avoid resets across xdp attach/detach.  If we
are doing resets now, we could do whatever buffering we want. We could
also just disable mergeable buffers for that matter.

> MTU 1500, PAGE_SIZE >= 4096, so a headroom of 256 is no problem, and
> we still have enough tailroom for skb_shared_info should we wrap
> the buffer into a real SKB and push it into the stack.
> 
> If you are trying to do buffering differently for virtio_net, well...
> that's a self inflicted wound as far as I can tell.

Right but I was wondering about the fact that this makes XDP_PASS
much slower than processing skbs without XDP, as truesize is huge
so we'll quickly run out of rmem space.

When XDP is used to fight DOS attacks, why isn't this a concern?

-- 
MST


* Re: [PATCH v2] virtio_net: fix PAGE_SIZE > 64k
  2017-01-24 21:07           ` Michael S. Tsirkin
@ 2017-01-24 21:10             ` David Miller
  2017-01-24 21:56               ` Michael S. Tsirkin
  0 siblings, 1 reply; 12+ messages in thread
From: David Miller @ 2017-01-24 21:10 UTC (permalink / raw)
  To: mst; +Cc: linux-kernel, jasowang, virtualization, netdev, john.fastabend

From: "Michael S. Tsirkin" <mst@redhat.com>
Date: Tue, 24 Jan 2017 23:07:51 +0200

> On Tue, Jan 24, 2017 at 03:53:31PM -0500, David Miller wrote:
>> From: "Michael S. Tsirkin" <mst@redhat.com>
>> Date: Tue, 24 Jan 2017 22:45:37 +0200
>> 
>> > On Tue, Jan 24, 2017 at 03:09:59PM -0500, David Miller wrote:
>> >> From: "Michael S. Tsirkin" <mst@redhat.com>
>> >> Date: Tue, 24 Jan 2017 21:53:13 +0200
>> >> 
>> >> > I didn't realise. Why can't we? I thought that adjust_header is an
>> >> > optional feature that userspace can test for, so no rush.
>> >> 
>> >> No, we want the base set of XDP features to be present in all drivers
>> >> supporting XDP.
>> > 
>> > I see, I didn't realize this. In light of this, is there any
>> > guidance *how much* head room is required to be considered
>> > valid? We already have 12 bytes of headroom.
>> 
>> The idea is to allow programs to implement arbitrary kinds of
>> encapsulation, so we need to let them push headers for all kinds
>> of software tunnels, with allowance for a few levels of nesting
>> in some extreme cases.
>> 
>> In that light, a nice round power of 2 number such as 256 seems quite
>> reasonable to me.
>> 
>> This seems to be what other XDP implementations in drivers use at the
>> moment as well.
> 
> It bothers me that this becomes part of the userspace ABI.
> Apps will see that everyone does 256 and will assume it,
> and we'll never be able to go back.
> 
> This does mean that XDP_PASS will use much more memory
> for small packets and by extension need a higher rmem limit.
> Would all admins be comfortable with this? Why would they want
> to if all their XDP does is DROP?
> Why not teach applications to query the headroom?

This works in the regimen that XDP packets always live in exactly one
page.  That will be needed to mmap the RX ring into userspace, and it
helps make adjust_header trivial as well.

MTU 1500, PAGE_SIZE >= 4096, so a headroom of 256 is no problem, and
we still have enough tailroom for skb_shared_info should we wrap
the buffer into a real SKB and push it into the stack.

If you are trying to do buffering differently for virtio_net, well...
that's a self inflicted wound as far as I can tell.


* Re: [PATCH v2] virtio_net: fix PAGE_SIZE > 64k
  2017-01-24 20:53         ` David Miller
@ 2017-01-24 21:07           ` Michael S. Tsirkin
  2017-01-24 21:10             ` David Miller
  0 siblings, 1 reply; 12+ messages in thread
From: Michael S. Tsirkin @ 2017-01-24 21:07 UTC (permalink / raw)
  To: David Miller
  Cc: linux-kernel, jasowang, virtualization, netdev, john.fastabend

On Tue, Jan 24, 2017 at 03:53:31PM -0500, David Miller wrote:
> From: "Michael S. Tsirkin" <mst@redhat.com>
> Date: Tue, 24 Jan 2017 22:45:37 +0200
> 
> > On Tue, Jan 24, 2017 at 03:09:59PM -0500, David Miller wrote:
> >> From: "Michael S. Tsirkin" <mst@redhat.com>
> >> Date: Tue, 24 Jan 2017 21:53:13 +0200
> >> 
> >> > I didn't realise. Why can't we? I thought that adjust_header is an
> >> > optional feature that userspace can test for, so no rush.
> >> 
> >> No, we want the base set of XDP features to be present in all drivers
> >> supporting XDP.
> > 
> > I see, I didn't realize this. In light of this, is there any
> > guidance *how much* head room is required to be considered
> > valid? We already have 12 bytes of headroom.
> 
> The idea is to allow programs to implement arbitrary kinds of
> encapsulation, so we need to let them push headers for all kinds
> of software tunnels, with allowance for a few levels of nesting
> in some extreme cases.
> 
> In that light, a nice round power of 2 number such as 256 seems quite
> reasonable to me.
> 
> This seems to be what other XDP implementations in drivers use at the
> moment as well.

It bothers me that this becomes part of the userspace ABI.
Apps will see that everyone does 256 and will assume it,
and we'll never be able to go back.

This does mean that XDP_PASS will use much more memory
for small packets and by extension need a higher rmem limit.
Would all admins be comfortable with this? Why would they want
to if all their XDP does is DROP?
Why not teach applications to query the headroom?

Or even better, do what we do with skbs and do data copies whenever you
run out of headroom instead of a failure. Anyone using build_skb already
has a ton of tailroom so that will work better.

-- 
MST


* Re: [PATCH v2] virtio_net: fix PAGE_SIZE > 64k
  2017-01-24 20:45       ` Michael S. Tsirkin
@ 2017-01-24 20:53         ` David Miller
  2017-01-24 21:07           ` Michael S. Tsirkin
  0 siblings, 1 reply; 12+ messages in thread
From: David Miller @ 2017-01-24 20:53 UTC (permalink / raw)
  To: mst; +Cc: linux-kernel, jasowang, virtualization, netdev, john.fastabend

From: "Michael S. Tsirkin" <mst@redhat.com>
Date: Tue, 24 Jan 2017 22:45:37 +0200

> On Tue, Jan 24, 2017 at 03:09:59PM -0500, David Miller wrote:
>> From: "Michael S. Tsirkin" <mst@redhat.com>
>> Date: Tue, 24 Jan 2017 21:53:13 +0200
>> 
>> > I didn't realise. Why can't we? I thought that adjust_header is an
>> > optional feature that userspace can test for, so no rush.
>> 
>> No, we want the base set of XDP features to be present in all drivers
>> supporting XDP.
> 
> I see, I didn't realize this. In light of this, is there any
> guidance *how much* head room is required to be considered
> valid? We already have 12 bytes of headroom.

The idea is to allow programs to implement arbitrary kinds of
encapsulation, so we need to let them push headers for all kinds
of software tunnels, with allowance for a few levels of nesting
in some extreme cases.

In that light, a nice round power of 2 number such as 256 seems quite
reasonable to me.

This seems to be what other XDP implementations in drivers use at the
moment as well.


* Re: [PATCH v2] virtio_net: fix PAGE_SIZE > 64k
  2017-01-24 20:09     ` David Miller
@ 2017-01-24 20:45       ` Michael S. Tsirkin
  2017-01-24 20:53         ` David Miller
  0 siblings, 1 reply; 12+ messages in thread
From: Michael S. Tsirkin @ 2017-01-24 20:45 UTC (permalink / raw)
  To: David Miller
  Cc: linux-kernel, jasowang, virtualization, netdev, john.fastabend

On Tue, Jan 24, 2017 at 03:09:59PM -0500, David Miller wrote:
> From: "Michael S. Tsirkin" <mst@redhat.com>
> Date: Tue, 24 Jan 2017 21:53:13 +0200
> 
> > I didn't realise. Why can't we? I thought that adjust_header is an
> > optional feature that userspace can test for, so no rush.
> 
> No, we want the base set of XDP features to be present in all drivers
> supporting XDP.

I see, I didn't realize this. In light of this, is there any
guidance *how much* head room is required to be considered
valid? We already have 12 bytes of headroom.


I'm sorry it's taking so long; a large part of that is the difficulty of
figuring out the requirements: when we discussed this at LPC, my
take-away was that one of the first users would be fighting DDOS attacks.

In light of this, I assumed that
- supporting just DROP (or DROP+TX) would already be useful
- XDP_PASS shouldn't be too slow as some people will run all
  their traffic with XDP enabled
- people actually want this in virtio because they run in a vm

Since then I heard opinions that seem to imply that
- you must support all features, not just DROP, otherwise it's useless
- XDP_PASS is a slow path fallback as people are not expected to mix XDP
  with regular sockets
- DDOS protection and by extension XDP in virtio is a developer's toy anyway

Thus, expect some slowness while I figure it all out.

-- 
MST


* Re: [PATCH v2] virtio_net: fix PAGE_SIZE > 64k
  2017-01-24 19:53   ` Michael S. Tsirkin
@ 2017-01-24 20:09     ` David Miller
  2017-01-24 20:45       ` Michael S. Tsirkin
  0 siblings, 1 reply; 12+ messages in thread
From: David Miller @ 2017-01-24 20:09 UTC (permalink / raw)
  To: mst; +Cc: linux-kernel, jasowang, virtualization, netdev, john.fastabend

From: "Michael S. Tsirkin" <mst@redhat.com>
Date: Tue, 24 Jan 2017 21:53:13 +0200

> I didn't realise. Why can't we? I thought that adjust_header is an
> optional feature that userspace can test for, so no rush.

No, we want the base set of XDP features to be present in all drivers
supporting XDP.


* Re: [PATCH v2] virtio_net: fix PAGE_SIZE > 64k
  2017-01-24 19:42 ` David Miller
@ 2017-01-24 19:53   ` Michael S. Tsirkin
  2017-01-24 20:09     ` David Miller
  0 siblings, 1 reply; 12+ messages in thread
From: Michael S. Tsirkin @ 2017-01-24 19:53 UTC (permalink / raw)
  To: David Miller
  Cc: linux-kernel, jasowang, virtualization, netdev, john.fastabend

On Tue, Jan 24, 2017 at 02:42:27PM -0500, David Miller wrote:
> From: "Michael S. Tsirkin" <mst@redhat.com>
> Date: Mon, 23 Jan 2017 21:37:52 +0200
> 
> > I don't have any guests with PAGE_SIZE > 64k, but the code is
> > clearly broken in that case: PAGE_SIZE / MERGEABLE_BUFFER_ALIGN
> > needs more than 8 bits, so mergeable_ctx_to_buf_address does not
> > recover the buffer's true size.
> > 
> > Cc: John Fastabend <john.fastabend@gmail.com>
> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> 
> Applied, thanks Michael.
> 
> I am really trying to be patient, but we are about to run out of
> time for fixing the adjust header XDP stuff.
> 
> That should have been resolved in a week or two, but now we're
> basically a month or so later.
> 
> Please come to some kind of agreement about how to implement this
> because we can't let v4.10 go out without this being resolved.
> 
> Thank you.

I didn't realise. Why can't we? I thought that adjust_header is an
optional feature that userspace can test for, so no rush.

-- 
MST


* Re: [PATCH v2] virtio_net: fix PAGE_SIZE > 64k
  2017-01-23 19:37 Michael S. Tsirkin
@ 2017-01-24 19:42 ` David Miller
  2017-01-24 19:53   ` Michael S. Tsirkin
  0 siblings, 1 reply; 12+ messages in thread
From: David Miller @ 2017-01-24 19:42 UTC (permalink / raw)
  To: mst; +Cc: linux-kernel, jasowang, virtualization, netdev, john.fastabend

From: "Michael S. Tsirkin" <mst@redhat.com>
Date: Mon, 23 Jan 2017 21:37:52 +0200

> I don't have any guests with PAGE_SIZE > 64k, but the code is
> clearly broken in that case: PAGE_SIZE / MERGEABLE_BUFFER_ALIGN
> needs more than 8 bits, so mergeable_ctx_to_buf_address does not
> recover the buffer's true size.
> 
> Cc: John Fastabend <john.fastabend@gmail.com>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

Applied, thanks Michael.

I am really trying to be patient, but we are about to run out of
time for fixing the adjust header XDP stuff.

That should have been resolved in a week or two, but now we're
basically a month or so later.

Please come to some kind of agreement about how to implement this
because we can't let v4.10 go out without this being resolved.

Thank you.


* [PATCH v2] virtio_net: fix PAGE_SIZE > 64k
@ 2017-01-23 19:37 Michael S. Tsirkin
  2017-01-24 19:42 ` David Miller
  0 siblings, 1 reply; 12+ messages in thread
From: Michael S. Tsirkin @ 2017-01-23 19:37 UTC (permalink / raw)
  To: linux-kernel; +Cc: Jason Wang, virtualization, netdev, John Fastabend

I don't have any guests with PAGE_SIZE > 64k, but the code is
clearly broken in that case: PAGE_SIZE / MERGEABLE_BUFFER_ALIGN
needs more than 8 bits, so mergeable_ctx_to_buf_address does not
recover the buffer's true size.

Cc: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---

changes from v1:
	fix build warnings

 drivers/net/virtio_net.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 4a10500..4dc373b 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -48,8 +48,16 @@ module_param(gso, bool, 0444);
  */
 DECLARE_EWMA(pkt_len, 1, 64)
 
+/* With mergeable buffers we align buffer address and use the low bits to
+ * encode its true size. Buffer size is up to 1 page so we need to align to
+ * square root of page size to ensure we reserve enough bits to encode the true
+ * size.
+ */
+#define MERGEABLE_BUFFER_MIN_ALIGN_SHIFT ((PAGE_SHIFT + 1) / 2)
+
 /* Minimum alignment for mergeable packet buffers. */
-#define MERGEABLE_BUFFER_ALIGN max(L1_CACHE_BYTES, 256)
+#define MERGEABLE_BUFFER_ALIGN max(L1_CACHE_BYTES, \
+				   1 << MERGEABLE_BUFFER_MIN_ALIGN_SHIFT)
 
 #define VIRTNET_DRIVER_VERSION "1.0.0"
 
-- 
MST

