From: Jesper Dangaard Brouer <brouer@redhat.com>
To: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>,
	Alexei Starovoitov <ast@kernel.org>
Cc: "Daniel Borkmann" <daniel@iogearbox.net>,
	netdev@vger.kernel.org,
	"Jakub Kicinski" <jakub.kicinski@netronome.com>,
	"John Fastabend" <john.fastabend@gmail.com>,
	"Karlsson, Magnus" <magnus.karlsson@intel.com>,
	"Björn Töpel" <bjorn.topel@intel.com>,
	brouer@redhat.com
Subject: Re: [PATCH v6 bpf-next 4/9] veth: Handle xdp_frames in xdp napi ring
Date: Wed, 1 Aug 2018 17:09:49 +0200
Message-ID: <20180801170949.5bf6101e@redhat.com>
In-Reply-To: <90f355ef-1e56-5f12-ab78-a19c83fc9253@lab.ntt.co.jp>

On Wed, 1 Aug 2018 14:41:08 +0900
Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> wrote:

> On 2018/07/31 21:46, Jesper Dangaard Brouer wrote:
> > On Tue, 31 Jul 2018 19:40:08 +0900
> > Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> wrote:
> >   
> >> On 2018/07/31 19:26, Jesper Dangaard Brouer wrote:  
> >>>
> >>> Context needed from: [PATCH v6 bpf-next 2/9] veth: Add driver XDP
> >>>
> >>> On Mon, 30 Jul 2018 19:43:44 +0900
> >>> Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> wrote:
> >>>     
[...]
> >>>
> >>> Here you are adding an assumption that struct xdp_frame is always
> >>> located in-the-top of the packet-data area.  I tried hard not to add
> >>> such a dependency!  You can calculate the beginning of the frame from
> >>> the xdp_frame->data pointer.
> >>>
> >>> Why not add such a dependency?  Because for AF_XDP zero-copy, we cannot
> >>> make such an assumption.  
> >>>
> >>> Currently, when an RX-queue is in AF-XDP-ZC mode (MEM_TYPE_ZERO_COPY)
> >>> the packet will get dropped when calling convert_to_xdp_frame(), but as
> >>> the TODO comment indicated in convert_to_xdp_frame() this is not the
> >>> end-goal. 
> >>>
> >>> The comment in convert_to_xdp_frame() indicates we need a full
> >>> alloc+copy, but that is actually not necessary if we can just use
> >>> another memory area for struct xdp_frame, and a pointer to data.  This
> >>> would allow devmap-redir to work with ZC and allow cpumap-redir to do
> >>> the copy on the remote CPU.
> >>
> >> Thanks for pointing this out.
> >> Seems you are saying xdp_frame area is not reusable. That means we
> >> reduce usable headroom on every REDIRECT. I wanted to avoid this but
> >> actually it is impossible, right?  
> > 
> > I'm not sure I understand fully...  has this something to do, with the
> > below memset?  
> 
> Sorry for not being so clear...
> It has something to do with the memset as well but mainly I was talking
> about XDP_TX and REDIRECT introduced in patch 8. On REDIRECT,
> dev_map_enqueue() calls convert_to_xdp_frame() so we use the headroom
> for struct xdp_frame on REDIRECT. If we don't reuse xdp_frame region of
> the original xdp packet, we reduce the headroom size each time on
> REDIRECT. When ZC is used, in the future xdp_frame can be non-contiguous
> to the buffer, so we cannot reuse the xdp_frame region in
> convert_to_xdp_frame()? But current convert_to_xdp_frame()
> implementation requires xdp_frame region in headroom so I think I cannot
> avoid this dependency now.
> 
> SKB has a similar problem if we cannot reuse it. It can be passed to a
> bridge and redirected to another veth which has driver XDP. In that case
> we need to reallocate the page if we have reduced the headroom because
> sufficient headroom is required for XDP processing for now (can we
> remove this requirement actually?).

Okay, now I understand.  Your changes allow multiple levels of
XDP_REDIRECT between/into other veth net_devices.  This is very
interesting and exciting stuff, but also a bit scary when thinking
about whether we got the lifetime correct for the different memory
objects.

You have convinced me.  We should not sacrifice/reduce the headroom
this way.  I'll also fix up cpumap.
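
To make the headroom concern concrete, here is a rough userspace sketch
(not kernel code).  It assumes a simplified stand-in struct,
hdr_frame_sketch, whose size differs from the real struct xdp_frame, and
two hypothetical helpers.  It only models the arithmetic
"headroom - sizeof(*xdp_frame)" being applied on every REDIRECT when the
frame region is not reused:

```c
#include <stddef.h>

/* Stand-in for struct xdp_frame; fields and size are assumptions made
 * for illustration and do not match the kernel definition. */
struct hdr_frame_sketch {
	void *data;
	unsigned short len;
	unsigned short headroom;
};

/* Headroom left after one conversion that reserves a fresh frame-sized
 * region instead of reusing the previous one.
 * Returns -1 if the headroom cannot hold the frame struct. */
static int headroom_after_convert(int headroom)
{
	int need = (int)sizeof(struct hdr_frame_sketch);

	if (headroom < need)
		return -1;
	return headroom - need;
}

/* Number of conversions (REDIRECT hops) that succeed before the
 * headroom is exhausted, when no frame region is ever reused. */
static int max_redirects_without_reuse(int headroom)
{
	int hops = 0;

	while ((headroom = headroom_after_convert(headroom)) >= 0)
		hops++;
	return hops;
}
```

With reuse of the existing xdp_frame region, the headroom stays constant
across hops instead of shrinking by the struct size each time.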

To avoid the performance penalty of the memset, I propose that we just
clear the xdp_frame->data pointer.  But let's implement it via a common
sanitize/scrub function.
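
A minimal userspace sketch of the idea, assuming a simplified stand-in
struct (scrub_frame_sketch) and a hypothetical helper name
(frame_scrub_sketch); neither is taken from the patch set.  The point is
only that the scrub clears the 'data' pointer and leaves the rest of the
region untouched, instead of a memset over the whole struct:

```c
#include <stddef.h>

/* Stand-in for struct xdp_frame; the field set is an assumption for
 * illustration and does not match the kernel definition. */
struct scrub_frame_sketch {
	void *data;
	unsigned short len;
	unsigned short headroom;
};

/* Hypothetical common scrub helper: clear only the pointer that would
 * leak a kernel address, rather than zeroing the whole struct. */
static inline void frame_scrub_sketch(struct scrub_frame_sketch *frame)
{
	frame->data = NULL;
}
```

Callers that today memset the frame area would call such a helper once
before the memory becomes visible as packet headroom.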


> > When cpumap generates an SKB for the netstack, we sacrifice/reduce
> > the available SKB headroom, because convert_to_xdp_frame() reduces
> > the headroom by the xdp_frame size.
> > 
> >  xdp_frame->headroom = headroom - sizeof(*xdp_frame)
> > 
> > In order to avoid doing such a memset of this area, note that we are
> > actually only worried about exposing the 'data' pointer, thus we could
> > just clear that.  (See commit 6dfb970d3dbd; this is because Alexei is
> > planning to move from CAP_SYS_ADMIN to the lesser privileged
> > CAP_NET_ADMIN.)
> > 
> > See commits:
> >  97e19cce05e5 ("bpf: reserve xdp_frame size in xdp headroom")
> >  6dfb970d3dbd ("xdp: avoid leaking info stored in frame data on page reuse")  
> 
> We have talked about that...
> https://patchwork.ozlabs.org/patch/903536/
> 
> The memset is introduced as per your feedback, but I'm still not sure if
> we need this. In general the headroom is not cleared after allocation in
> drivers, so anyway unprivileged users should not see it no matter if it
> contains xdp_frame or not...

I actually got this request from Alexei. That is why I implemented it.
Personally I don't think this clearing is really needed, until someone
actually makes the TC/cls_act BPF hook CAP_NET_ADMIN.

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer
