From: Jason Wang <jasowang@redhat.com>
To: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Cc: dust.li@linux.alibaba.com, tonylu@linux.alibaba.com,
	"Michael S. Tsirkin" <mst@redhat.com>,
	"David S. Miller" <davem@davemloft.net>,
	"Jakub Kicinski" <kuba@kernel.org>,
	"Björn Töpel" <bjorn.topel@intel.com>,
	"Magnus Karlsson" <magnus.karlsson@intel.com>,
	"Jonathan Lemon" <jonathan.lemon@gmail.com>,
	"Alexei Starovoitov" <ast@kernel.org>,
	"Daniel Borkmann" <daniel@iogearbox.net>,
	"Jesper Dangaard Brouer" <hawk@kernel.org>,
	"John Fastabend" <john.fastabend@gmail.com>,
	"Andrii Nakryiko" <andrii@kernel.org>,
	"Martin KaFai Lau" <kafai@fb.com>,
	"Song Liu" <songliubraving@fb.com>, "Yonghong Song" <yhs@fb.com>,
	"KP Singh" <kpsingh@kernel.org>,
	"VIRTIO CORE AND NET DRIVERS"
	<virtualization@lists.linux-foundation.org>,
	"open list" <linux-kernel@vger.kernel.org>,
	"XDP SOCKETS (AF_XDP)" <bpf@vger.kernel.org>,
	netdev@vger.kernel.org
Subject: Re: [PATCH netdev 0/5] virtio-net support xdp socket zero copy xmit
Date: Wed, 6 Jan 2021 11:54:26 +0800
Message-ID: <b5dee65c-2a0c-296c-56b4-1ed17f7aec38@redhat.com>
In-Reply-To: <1609901717.683732-2-xuanzhuo@linux.alibaba.com>


On 2021/1/6 10:55 AM, Xuan Zhuo wrote:
> On Wed, 6 Jan 2021 10:46:43 +0800, Jason Wang <jasowang@redhat.com> wrote:
>> On 2021/1/5 8:42 PM, Xuan Zhuo wrote:
>>> On Tue, 5 Jan 2021 17:32:19 +0800, Jason Wang <jasowang@redhat.com> wrote:
>>>> On 2021/1/5 5:11 PM, Xuan Zhuo wrote:
>>>>> The first patch made some adjustments to xsk.
>>>> Thanks a lot for the work. It's rather interesting.
>>>>
>>>>
>>>>> The second patch itself can be used as an independent patch to solve the problem
>>>>> that XDP may fail to load when the number of queues is insufficient.
>>>> It would be better to send this as a separate patch. Several people
>>>> have asked for this before.
>>>>
>>>>
>>>>> The third to last patch implements support for xsk in virtio-net.
>>>>>
>>>>> A practical problem with virtio is that tx interrupts are not very reliable.
>>>>> There will always be some missing or delayed tx interrupts. So I specially added
>>>>> a point timer to solve this problem. Of course, considering performance issues,
>>>>> the timer only triggers when the ring of the network card is full.
>>>> This is sub-optimal. We need to figure out the root cause. We haven't
>>>> met such an issue before.
>>>>
>>>> Several questions:
>>>>
>>>> - is tx interrupt enabled?
>>>> - can you still see the issue if you disable event index?
>>>> - which backend did you use? qemu or vhost(user)?
>>> Sorry, it may just be a problem with the backend I used here. I just tested the
>>> latest qemu and it did not have this problem. I think I should delete the
>>> timer-related code?
>>
>> Yes, please.
>>
>>
>>>>> Regarding the issue of virtio-net supporting xsk's zero copy rx, I am also
>>>>> developing it, but I found that the modification may be relatively large, so I
>>>>> consider this patch set to be separated from the code related to xsk zero copy
>>>>> rx.
>>>> That's fine, but a question here.
>>>>
>>>> How is multiqueue being handled here? I'm asking since there's no
>>>> programmable filter/director support in the virtio spec now.
>>>>
>>>> Thanks
>>> I don't really understand what you mean. In the case of multiple queues,
>>> there is no problem.
>>
>> So consider we bind an xsk to queue 4; how can you make sure the traffic
>> is directed to queue 4? One possible solution is to use filters, as
>> suggested in af_xdp.rst:
>>
>>         ethtool -N p3p2 rx-flow-hash udp4 fn
>>         ethtool -N p3p2 flow-type udp4 src-port 4242 dst-port 4242 \
>>             action 16
>> ...
>>
>> But virtio-net doesn't have any filters that could be programmed from
>> the driver.
>>
>> Anything I missed here?
>>
>> Thanks
> I understand what you mean, this problem does exist, and I encountered it when I
> tested qemu.
>
> First of all, this problem only affects recv, while this patch is for
> xmit. Of course, our normal business must also have recv scenarios.
>
> My solution in developing the upper-level application is to bond all the queues
> to ensure that we can receive the packets we want.


I'm not sure I get you here. Note that one advantage of AF_XDP is that
it allows an XSK to be bound to a specific queue while the rest can still
be used by the kernel.
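
To make the per-queue binding concrete, here is a minimal userspace sketch of
how an XSK is tied to one queue through struct sockaddr_xdp from
<linux/if_xdp.h>. The helper names xsk_fill_bind_addr() and xsk_bind_queue()
are hypothetical, and the sketch assumes the UMEM and rings have already been
set up on the socket; it is an illustration, not code from this patch set.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/if_xdp.h>

#ifndef AF_XDP
#define AF_XDP 44 /* fallback for older libc headers */
#endif

/* Hypothetical helper: describe "queue queue_id of interface ifindex". */
static inline void xsk_fill_bind_addr(struct sockaddr_xdp *sxdp,
                                      unsigned int ifindex,
                                      unsigned int queue_id,
                                      unsigned short flags)
{
    assert(sxdp != NULL);
    memset(sxdp, 0, sizeof(*sxdp));
    sxdp->sxdp_family = AF_XDP;
    sxdp->sxdp_ifindex = ifindex;   /* which netdev */
    sxdp->sxdp_queue_id = queue_id; /* the one queue this XSK owns */
    sxdp->sxdp_flags = flags;       /* e.g. XDP_ZEROCOPY or XDP_COPY */
}

/*
 * Hypothetical helper: bind an already configured AF_XDP socket to a single
 * queue. Returns 0 on success, -1 with errno set on failure (for example
 * when the driver does not support zero copy on that queue).
 */
static inline int xsk_bind_queue(int xsk_fd, unsigned int ifindex,
                                 unsigned int queue_id)
{
    struct sockaddr_xdp sxdp;

    xsk_fill_bind_addr(&sxdp, ifindex, queue_id, XDP_ZEROCOPY);
    return bind(xsk_fd, (struct sockaddr *)&sxdp, sizeof(sxdp));
}
```

Only the queue named in sxdp_queue_id is handed to the socket; the other
queues of the device keep going through the regular kernel stack.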


>   And I think in the
> implementation of the use, even if the network card supports filters, we should
> also bond all the queues, because we don't know which queue the traffic we care
> about will arrive from.


With the help of filters the card can select a specific queue based on 
hash or n-tuple so it should work?
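
As a rough illustration of that idea, the following toy model shows the kind
of decision a device with filters could make: an n-tuple rule pins matching
flows to a fixed queue, and everything else falls back to a hash over the
flow. The struct layouts, select_rx_queue() and the hash function are all
assumptions made for this sketch; they do not correspond to any virtio or
ethtool interface, and real hardware commonly uses a different hash.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative flow description (not a real kernel structure). */
struct flow_key {
    uint32_t saddr, daddr;
    uint16_t sport, dport;
    uint8_t proto;
};

/* Illustrative n-tuple rule: match proto and ports, steer to a queue. */
struct ntuple_rule {
    uint8_t proto;
    uint16_t sport, dport;
    unsigned int queue;
};

static inline uint32_t toy_mix(uint32_t h, uint32_t v)
{
    h ^= v;
    h *= 16777619u; /* simple multiply-xor mixing step */
    return h;
}

/* Toy flow hash: deterministic per flow, not the hash a real NIC uses. */
static inline uint32_t toy_flow_hash(const struct flow_key *k)
{
    uint32_t h = 2166136261u;

    h = toy_mix(h, k->saddr);
    h = toy_mix(h, k->daddr);
    h = toy_mix(h, ((uint32_t)k->sport << 16) | k->dport);
    h = toy_mix(h, k->proto);
    return h;
}

/* An explicit rule wins; otherwise spread flows over the queues by hash. */
static inline unsigned int select_rx_queue(const struct flow_key *k,
                                           const struct ntuple_rule *rules,
                                           size_t nrules,
                                           unsigned int nqueues)
{
    assert(nqueues > 0);
    for (size_t i = 0; i < nrules; i++) {
        const struct ntuple_rule *r = &rules[i];

        if (r->proto == k->proto && r->sport == k->sport &&
            r->dport == k->dport && r->queue < nqueues)
            return r->queue;
    }
    return toy_flow_hash(k) % nqueues;
}
```

With one rule for UDP ports 4242/4242 pointing at queue 4, every matching
packet lands on the queue the XSK is bound to, while unrelated flows are
distributed by hash and may land on any queue.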


>
> Regarding the problem of virtio-net, I think our core question is whether we
> need to deal with this problem in the driver of virtio-net, I personally think
> that we should extend the virtio specification to define this scenario.


Yes, so do you want to do that? It would make virtio-net more user
friendly to AF_XDP. (Or if you wish, I can post a patch to extend the spec.)


>
> When I tested it, I found that some cloud vendors' implementations guarantee
> this queue selection algorithm.


Right, the spec suggests an automatic steering algorithm, but it's not
mandatory. Vendors can implement their own.

But a hash or n-tuple filter should still be useful.

Thanks


>
> Thanks!!
>
>>
>>>>> Xuan Zhuo (5):
>>>>>      xsk: support get page for drv
>>>>>      virtio-net: support XDP_TX when not more queues
>>>>>      virtio-net, xsk: distinguish XDP_TX and XSK XMIT ctx
>>>>>      xsk, virtio-net: prepare for support xsk
>>>>>      virtio-net, xsk: virtio-net support xsk zero copy tx
>>>>>
>>>>>     drivers/net/virtio_net.c    | 643 +++++++++++++++++++++++++++++++++++++++-----
>>>>>     include/linux/netdevice.h   |   1 +
>>>>>     include/net/xdp_sock_drv.h  |  10 +
>>>>>     include/net/xsk_buff_pool.h |   1 +
>>>>>     net/xdp/xsk_buff_pool.c     |  10 +-
>>>>>     5 files changed, 597 insertions(+), 68 deletions(-)
>>>>>
>>>>> --
>>>>> 1.8.3.1
>>>>>

