From: David Ahern <dsahern@gmail.com>
To: John Fastabend <john.fastabend@gmail.com>,
	Lorenzo Bianconi <lorenzo@kernel.org>,
	bpf@vger.kernel.org, netdev@vger.kernel.org
Cc: lorenzo.bianconi@redhat.com, davem@davemloft.net,
	kuba@kernel.org, ast@kernel.org, daniel@iogearbox.net,
	shayagr@amazon.com, sameehj@amazon.com, dsahern@kernel.org,
	brouer@redhat.com, echaudro@redhat.com, jasowang@redhat.com,
	alexander.duyck@gmail.com, saeed@kernel.org,
	maciej.fijalkowski@intel.com, magnus.karlsson@intel.com,
	tirthendu.sarkar@intel.com
Subject: Re: [PATCH v9 bpf-next 00/14] mvneta: introduce XDP multi-buffer support
Date: Wed, 23 Jun 2021 08:40:44 -0600
Message-ID: <9710dfc6-94f6-b0bd-5ac8-4e7dbfa54e7e@gmail.com>
In-Reply-To: <60d2cb1fd2bf9_2052b20886@john-XPS-13-9370.notmuch>

On 6/22/21 11:48 PM, John Fastabend wrote:
> David Ahern wrote:
>> On 6/22/21 5:18 PM, John Fastabend wrote:
>>> At this point I don't think we can have a partial implementation. At
>>> the moment we have packet capture applications and protocol parsers
>>> running in production. If we allow this to go in staged, we are going
>>> to break those applications that make the fundamental assumption they
>>> have access to all the data in the packet.
>>
>> What about cases like netgpu where headers are accessible but data is
>> not (e.g., gpu memory)? If the API indicates limited buffer access, is
>> that sufficient?
> 
> I never considered netgpu, and I guess I don't fully understand the
> architecture well enough to say. But I would try to argue that an XDP API
> should allow XDP to reach into the payload of these GPU packets as well.
> Of course it might be slow.

AIUI, S/W on the host cannot access GPU memory, so that is not a
possibility at all.

Another use case is DDP and ZC. Mellanox has a proposal for NVMe (with
the intention to extend it to iSCSI) to do direct data placement. This
is really just an example of zerocopy (and netgpu has morphed into
zctap, with the current prototype working for host memory), which will
become more prominent. XDP programs accessing memory already mapped to
user space will be racy.

To me these proposals suggest a trend, and one that XDP APIs should be
ready to handle - e.g., by indicating limited access or specifying the
length that can be accessed.
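
As a rough illustration only: nothing below exists today, the helper
name and the check are purely hypothetical. The idea is that a program
could ask how many bytes the host can actually touch and degrade
gracefully instead of assuming the whole packet is mapped:

  #include <linux/bpf.h>
  #include <linux/if_ether.h>
  #include <linux/ip.h>
  #include <bpf/bpf_helpers.h>

  /* hypothetical helper: how many bytes of this packet can the host
   * actually read?  Nothing like this exists today.
   */
  extern __u32 bpf_xdp_get_access_len(struct xdp_md *ctx);

  SEC("xdp")
  int limited_access(struct xdp_md *ctx)
  {
      __u32 readable = bpf_xdp_get_access_len(ctx);

      /* headers live in host memory, but the payload may not
       * (e.g. GPU memory or pages already mapped to user space)
       */
      if (readable < sizeof(struct ethhdr) + sizeof(struct iphdr))
          return XDP_PASS;    /* don't touch what we can't see */

      /* parse L2/L3 headers within the readable region as usual ... */
      return XDP_PASS;
  }

  char _license[] SEC("license") = "GPL";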


> 
> I'm not really convinced just indicating it's a limited buffer is enough.
> I think we want to be able to read/write any byte in the packet. I see
> two ways to do it:
> 
>   /* xdp_pull_data moves data and data_end pointers into the frag
>    * containing the byte offset start.
>    *
>    * Returns a negative value on error, otherwise the offset of the
>    * data pointer into the payload.
>    */
>   int xdp_pull_data(int start)
> 
> This would be a helper call to push the xdp->data{_end} pointers into
> the correct frag and then normal verification should work. From my
> side this works because I can always find the next frag by starting
> at 'xdp_pull_data(xdp->data_end + 1)'. And by returning the offset we
> can always figure out where we are in the payload. This is the easiest
> thing I could come up with. And hopefully for _most_ cases the bytes
> we need are in the initial data. Also I don't see how extending tail
> works without something like this.
> 
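
For what it's worth, here is a rough sketch of how a program might walk
the payload with a helper like that. xdp_pull_data() is only the
proposal above, not an existing helper; I'm assuming it also takes the
ctx, that data/data_end are re-read after each call, and that a bounded
loop cap keeps the verifier happy:

  #include <linux/bpf.h>
  #include <bpf/bpf_helpers.h>

  #define MAX_FRAGS 17    /* assumption: skb head + MAX_SKB_FRAGS */

  /* proposed helper from above -- not in any kernel today */
  extern int xdp_pull_data(struct xdp_md *ctx, int start);

  SEC("xdp")
  int walk_payload(struct xdp_md *ctx)
  {
      int off = 0;    /* offset of ctx->data into the full payload */
      int i;

      for (i = 0; i < MAX_FRAGS; i++) {
          void *data = (void *)(long)ctx->data;
          void *data_end = (void *)(long)ctx->data_end;

          /* inspect bytes in [data, data_end) here ... */

          /* move data/data_end to the frag holding the first byte
           * past the current one; negative means no more payload
           */
          off = xdp_pull_data(ctx, off + (int)(data_end - data));
          if (off < 0)
              break;
      }

      return XDP_PASS;
  }

  char _license[] SEC("license") = "GPL";
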
> My other thought, which requires some verifier work, would be to extend
> 'struct xdp_md' with a frags[] member.
> 
>  struct xdp_md {
>    __u32 data;
>    __u32 data_end;
>    __u32 data_meta;
>    /* metadata stuff */
>    __u32 frags_end;           /* number of valid frags[] entries */
>    struct _xdp_md frags[];    /* flexible array member goes last */
>  };
> 
> Then an XDP program could read a frag like so:
> 
>   if (i < xdp->frags_end) {
>      frag = &xdp->frags[i];
>      if (frag->data + offset + hdr_size <= frag->data_end)
>          memcpy(dst, frag->data + offset, hdr_size);
>   }
> 
> The nice bit about the above is that you avoid the call, but maybe it
> doesn't matter; if you are already looking into frags, pps is probably
> not at 64B sizes anyway.
> 
> My main concern here is that we hit a case where the driver doesn't pull
> in the bytes we need and then we are stuck without a workaround. The
> helper looks fairly straightforward to me; could we try that?
> 
> Also I thought we had another driver in the works? Any ideas where
> that went...
> 
> Lastly, thanks for working on this, everyone.
> 
> .John
> 


Thread overview: 53+ messages
2021-06-14 12:49 [PATCH v9 bpf-next 00/14] mvneta: introduce XDP multi-buffer support Lorenzo Bianconi
2021-06-14 12:49 ` [PATCH v9 bpf-next 01/14] net: skbuff: add data_len field to skb_shared_info Lorenzo Bianconi
2021-06-28 19:58   ` Alexander Duyck
2021-06-29 12:44     ` Lorenzo Bianconi
2021-06-29 17:08       ` Jakub Kicinski
2021-06-29 18:18         ` Alexander Duyck
2021-06-29 18:37           ` Jakub Kicinski
2021-06-29 19:11             ` Jesper Dangaard Brouer
2021-06-29 19:18               ` Lorenzo Bianconi
2021-06-29 20:45                 ` Alexander Duyck
2021-06-14 12:49 ` [PATCH v9 bpf-next 02/14] xdp: introduce flags field in xdp_buff/xdp_frame Lorenzo Bianconi
2021-06-28 20:14   ` Alexander Duyck
2021-06-29 12:43     ` Lorenzo Bianconi
2021-06-29 13:07       ` Alexander Duyck
2021-06-29 13:25         ` Lorenzo Bianconi
2021-07-05 15:52         ` Lorenzo Bianconi
2021-07-05 21:35           ` Alexander Duyck
2021-07-06 11:53             ` Lorenzo Bianconi
2021-07-06 14:04               ` Alexander Duyck
2021-07-06 17:47                 ` Lorenzo Bianconi
2021-06-14 12:49 ` [PATCH v9 bpf-next 03/14] net: mvneta: update mb bit before passing the xdp buffer to eBPF layer Lorenzo Bianconi
2021-06-14 12:49 ` [PATCH v9 bpf-next 04/14] xdp: add multi-buff support to xdp_return_{buff/frame} Lorenzo Bianconi
2021-06-14 12:49 ` [PATCH v9 bpf-next 05/14] net: mvneta: add multi buffer support to XDP_TX Lorenzo Bianconi
2021-06-14 12:49 ` [PATCH v9 bpf-next 06/14] net: mvneta: enable jumbo frames for XDP Lorenzo Bianconi
2021-06-14 12:49 ` [PATCH v9 bpf-next 07/14] net: xdp: add multi-buff support to xdp_build_skb_from_frame Lorenzo Bianconi
2021-06-28 21:05   ` Alexander Duyck
2021-06-29 18:34     ` Lorenzo Bianconi
2021-06-14 12:49 ` [PATCH v9 bpf-next 08/14] bpf: add multi-buff support to the bpf_xdp_adjust_tail() API Lorenzo Bianconi
2021-06-22 23:37   ` John Fastabend
2021-06-24  9:26     ` Eelco Chaudron
2021-06-24 14:24       ` John Fastabend
2021-06-24 15:16         ` Zvi Effron
2021-06-29 13:19         ` Lorenzo Bianconi
2021-06-29 13:27           ` Toke Høiland-Jørgensen
2021-07-06 21:44         ` Backwards compatibility for XDP multi-buff (was: Re: [PATCH v9 bpf-next 08/14] bpf: add multi-buff support to the bpf_xdp_adjust_tail() API) Toke Høiland-Jørgensen
2021-06-14 12:49 ` [PATCH v9 bpf-next 09/14] bpf: introduce bpf_xdp_get_buff_len helper Lorenzo Bianconi
2021-06-14 12:49 ` [PATCH v9 bpf-next 10/14] bpf: add multi-buffer support to xdp copy helpers Lorenzo Bianconi
2021-06-22 23:49   ` John Fastabend
2021-06-24  9:42     ` Eelco Chaudron
2021-06-24 14:28       ` John Fastabend
2021-06-25  8:25         ` Eelco Chaudron
2021-06-29 13:23         ` Lorenzo Bianconi
2021-07-06 10:15         ` Eelco Chaudron
2021-06-14 12:49 ` [PATCH v9 bpf-next 11/14] bpf: move user_size out of bpf_test_init Lorenzo Bianconi
2021-06-14 12:49 ` [PATCH v9 bpf-next 12/14] bpf: introduce multibuff support to bpf_prog_test_run_xdp() Lorenzo Bianconi
2021-06-14 12:49 ` [PATCH v9 bpf-next 13/14] bpf: test_run: add xdp_shared_info pointer in bpf_test_finish signature Lorenzo Bianconi
2021-06-14 12:49 ` [PATCH v9 bpf-next 14/14] bpf: update xdp_adjust_tail selftest to include multi-buffer Lorenzo Bianconi
2021-06-22 23:18 ` [PATCH v9 bpf-next 00/14] mvneta: introduce XDP multi-buffer support John Fastabend
2021-06-23  3:41   ` David Ahern
2021-06-23  5:48     ` John Fastabend
2021-06-23 14:40       ` David Ahern [this message]
2021-06-24 14:22         ` John Fastabend
2021-07-01  7:56   ` Magnus Karlsson
