From: Heng Qi <hengqi@linux.alibaba.com>
To: "Michael S. Tsirkin" <mst@redhat.com>, Jason Wang <jasowang@redhat.com>
Cc: virtio-comment@lists.oasis-open.org, virtio-dev@lists.oasis-open.org,
    Parav Pandit <parav@nvidia.com>,
    Yuri Benditovich <yuri.benditovich@daynix.com>,
    Cornelia Huck <cohuck@redhat.com>,
    Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Subject: Re: [virtio-dev] Re: [PATCH v9] virtio-net: support inner header hash
Date: Tue, 28 Feb 2023 17:56:28 +0800
Message-ID: <8660f842-e443-d206-1f8c-7c298e576274@linux.alibaba.com>
In-Reply-To: <20230228034620-mutt-send-email-mst@kernel.org>

On 2023/2/28 4:52 PM, Michael S. Tsirkin wrote:
> On Tue, Feb 28, 2023 at 11:04:26AM +0800, Jason Wang wrote:
>> On Tue, Feb 28, 2023 at 1:49 AM Michael S. Tsirkin <mst@redhat.com> wrote:
>>> On Mon, Feb 27, 2023 at 04:35:09PM +0800, Jason Wang wrote:
>>>> On Mon, Feb 27, 2023 at 3:39 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>>>>> On Mon, Feb 27, 2023 at 12:07:17PM +0800, Jason Wang wrote:
>>>>>> Btw, this kind of 1:1 hash features seems not scalable and flexible.
>>>>>> It requires an endless extension on bits/fields. Modern NICs allow the
>>>>>> user to customize the hash calculation, for virtio-net we can allow to
>>>>>> use eBPF program to classify the packets. It seems to be more flexible
>>>>>> and scalable and there's almost no maintain burden in the spec (only
>>>>>> bytecode is required, no need any fancy features/interactions like
>>>>>> maps), easy to be migrated etc.
>>>>>>
>>>>>> Prototype is also easy, tun/tap had an eBPF classifier for years.
>>>>>>
>>>>>> Thanks
>>>>> Yea BPF offload would be great to have. We have been discussing it for
>>>>> years though - security issues keep blocking it. *Maybe* it's finally
>>>>> going to be there but I'm not going to block this work waiting for BPF
>>>>> offload. And easily migrated is what BPF is not.
>>>> Just to make sure we're at the same page. I meant to find a way to
>>>> allow the driver/user to fully customize what it wants to
>>>> hash/classify. Similar technologies which is based on private solution
>>>> has been used by some vendors, which allow user to customize the
>>>> classifier[1]
>>>>
>>>> eBPF looks like a good open-source solution candidate for this (there
>>>> could be others). But there could be many kinds of eBPF programs that
>>>> could be offloaded. One famous one is XDP which requires many features
>>>> other than the bytecode/VM like map access, tailcall. Starting from
>>>> such a complicated type is hard. Instead, we can start from a simple
>>>> type, that is the eBPF classifier. All it needs is to pass the
>>>> bytecode to the device, the device can choose to run it or compile it
>>>> to what it can understand for classifying. We don't need maps, tail
>>>> calls and other features.
>>> Until people start asking exactly for maps because they want
>>> state for their classifier?
>> Yes, but let's compare the eBPF without maps with the static feature
>> proposed here. It is much more scalable and flexible.
>>
>>> And it makes sense - if you want
>>> e.g. load balancing you need stats which needs maps.
>> Yes, but we know it's possible to have that (through the XDP offload).
>> This is impossible with the approach proposed here.
> I'm not actually objecting. And at least we then don't need to
> worry about leaking info - it's not virtio leaking info
> it's the bpf program. I wonder what does Heng Qi think.
> Heng Qi would it work for your scenario?

We are positive about eBPF, which looks adequate for our scenario.
However, offloading it currently has some problems: the interfaces are
imperfect and unstable, and user-unfriendly eBPF code may consume a lot
of device resources. Device support for eBPF will also take time.

Also, the presence of eBPF offload does not conflict with other
solutions, e.g. we still have RSS.

Our goal is to get this patch accepted first. As for eBPF offloading
support, we have not collected internal requirements yet, but it is
indeed a good direction.

Thanks.

>
>>>> We don't need to worry about the security
>>>> because of its simplicity: the eBPF program is only in charge of doing
>>>> classification, no other interactions with the driver and packet
>>>> modification is prohibited. The feature is limited only to the
>>>> VM/bytecode abstraction itself.
>>>>
>>>> What's more, it's a good first step to achieve full eBPF offloading in
>>>> the future.
>>>>
>>>> Thanks
>>>>
>>>> [1] https://www.intel.com/content/www/us/en/architecture-and-technology/ethernet/dynamic-device-personalization-brief.html
>>> Dave seems to have nacked this approach, no?
>> I may miss something but looking at kernel commit, there are few
>> patches to support that:
>>
>> E.g
>>
>> commit c7648810961682b9388be2dd041df06915647445
>> Author: Tony Nguyen <anthony.l.nguyen@intel.com>
>> Date:   Mon Sep 9 06:47:44 2019 -0700
>>
>>     ice: Implement Dynamic Device Personalization (DDP) download
>>
>> And it has been used by DPDK drivers.
>>
>> Thanks
> If we are talking about netdev then this discussion has to take place on netdev.
> If it's dpdk this is more believable.
>
>>>>> --
>>>>> MST
>>>>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
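
For illustration, the "classifier only" idea debated in this thread (no maps, no tail calls, no packet modification) can be sketched as a pure function from packet bytes to a queue index. This is a minimal sketch in Python, not eBPF bytecode and not anything proposed in the patch. It assumes IPv4, VXLAN on UDP port 4789 as the tunnel type, and CRC32 as a stand-in hash (not Toeplitz). All function names here are hypothetical.

```python
import struct
import zlib

ETH_LEN, UDP_LEN, VXLAN_LEN = 14, 8, 8
VXLAN_PORT = 4789  # assumption: VXLAN is the only tunnel type handled


def ipv4_flow(frame, off):
    """Parse Ethernet+IPv4 at `off`; return (flow_key, proto, l4_offset) or None."""
    if len(frame) < off + ETH_LEN + 20:
        return None
    if struct.unpack_from("!H", frame, off + 12)[0] != 0x0800:
        return None  # not IPv4
    ip = off + ETH_LEN
    ihl = (frame[ip] & 0x0F) * 4
    if ihl < 20 or len(frame) < ip + ihl:
        return None
    proto = frame[ip + 9]
    l4 = ip + ihl
    key = frame[ip + 12:ip + 20] + bytes([proto])  # src addr, dst addr, protocol
    if proto in (6, 17) and len(frame) >= l4 + 4:
        key += frame[l4:l4 + 4]  # TCP/UDP source and destination ports
    return key, proto, l4


def classify(frame, num_queues):
    """Stateless classifier: reads the packet, keeps no state, modifies nothing."""
    outer = ipv4_flow(frame, 0)
    if outer is None:
        return 0  # unparsed traffic falls back to queue 0
    key, proto, l4 = outer
    if proto == 17 and len(frame) >= l4 + 4:
        if struct.unpack_from("!H", frame, l4 + 2)[0] == VXLAN_PORT:
            inner = ipv4_flow(frame, l4 + UDP_LEN + VXLAN_LEN)
            if inner is not None:
                key = inner[0]  # hash the inner headers instead of the outer ones
    return zlib.crc32(key) % num_queues  # CRC32 is a stand-in hash


# Helpers that build a test frame (checksums left as zero for brevity).
def eth():
    return bytes(12) + struct.pack("!H", 0x0800)


def ipv4(src, dst, proto, payload_len):
    return struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20 + payload_len,
                       0, 0, 64, proto, 0, src, dst)


def udp(sport, dport, payload_len):
    return struct.pack("!HHHH", sport, dport, UDP_LEN + payload_len, 0)


def vxlan_frame(outer_sport, inner_sport, inner_dport):
    inner = (eth() + ipv4(b"\xc0\xa8\x00\x01", b"\xc0\xa8\x00\x02", 17, UDP_LEN)
             + udp(inner_sport, inner_dport, 0))
    vxlan = struct.pack("!II", 0x08000000, 42 << 8)  # I flag set, VNI 42
    tunnel_len = UDP_LEN + VXLAN_LEN + len(inner)
    return (eth() + ipv4(b"\x0a\x00\x00\x01", b"\x0a\x00\x00\x02", 17, tunnel_len)
            + udp(outer_sport, VXLAN_PORT, VXLAN_LEN + len(inner)) + vxlan + inner)


a = vxlan_frame(outer_sport=40000, inner_sport=1111, inner_dport=80)
b = vxlan_frame(outer_sport=50000, inner_sport=1111, inner_dport=80)
print(classify(a, 8) == classify(b, 8))  # True: same inner flow, same queue
```

The two frames differ only in the outer UDP source port, so a hash over outer headers could spread them across queues, while hashing the inner headers keeps the flow together. A classifier that needs per-flow statistics, as raised in the thread, would need state and is outside this sketch.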