From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eric Dumazet
Subject: Re: [PATCH bpf-next v3 07/15] bpf: introduce new bpf AF_XDP map type BPF_MAP_TYPE_XSKMAP
Date: Mon, 8 Oct 2018 08:31:50 -0700
Message-ID:
References: <20180502110136.3738-1-bjorn.topel@gmail.com> <20180502110136.3738-8-bjorn.topel@gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Cc: Björn Töpel, michael.lundkvist@ericsson.com, jesse.brandeburg@intel.com, anjali.singhai@intel.com, qi.z.zhang@intel.com
To: Björn Töpel, magnus.karlsson@intel.com, alexander.h.duyck@intel.com, alexander.duyck@gmail.com, john.fastabend@gmail.com, ast@fb.com, brouer@redhat.com, willemdebruijn.kernel@gmail.com, daniel@iogearbox.net, mst@redhat.com, netdev@vger.kernel.org
Return-path:
Received: from mail-pl1-f193.google.com ([209.85.214.193]:44223 "EHLO mail-pl1-f193.google.com" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S1726078AbeJHWoI (ORCPT );
	Mon, 8 Oct 2018 18:44:08 -0400
Received: by mail-pl1-f193.google.com with SMTP id p25-v6so10243413pli.11
	for ; Mon, 08 Oct 2018 08:31:52 -0700 (PDT)
In-Reply-To: <20180502110136.3738-8-bjorn.topel@gmail.com>
Content-Language: en-US
Sender: netdev-owner@vger.kernel.org
List-ID:

On 05/02/2018 04:01 AM, Björn Töpel wrote:
> From: Björn Töpel
>
> The xskmap is yet another BPF map, very much inspired by
> dev/cpu/sockmap, and is a holder of AF_XDP sockets. A user application
> adds AF_XDP sockets into the map, and by using the bpf_redirect_map
> helper, an XDP program can redirect XDP frames to an AF_XDP socket.
>
> Note that a socket that is bound to certain ifindex/queue index will
> *only* accept XDP frames from that netdev/queue index. If an XDP
> program tries to redirect from a netdev/queue index other than what
> the socket is bound to, the frame will not be received on the socket.
>
> A socket can reside in multiple maps.
>
> v3: Fixed race and simplified code.
> v2: Removed one indirection in map lookup.
>
> Signed-off-by: Björn Töpel
> ---
>  include/linux/bpf.h       |  25 +++++
>  include/linux/bpf_types.h |   3 +
>  include/net/xdp_sock.h    |   7 ++
>  include/uapi/linux/bpf.h  |   1 +
>  kernel/bpf/Makefile       |   3 +
>  kernel/bpf/verifier.c     |   8 +-
>  kernel/bpf/xskmap.c       | 239 ++++++++++++++++++++++++++++++++++++++++++++++
>  net/xdp/xsk.c             |   5 +
>  8 files changed, 289 insertions(+), 2 deletions(-)
>  create mode 100644 kernel/bpf/xskmap.c
>

This function is called under rcu_read_lock(), from map_update_elem()

> +
> +static int xsk_map_update_elem(struct bpf_map *map, void *key, void *value,
> +			       u64 map_flags)
> +{
> +	struct xsk_map *m = container_of(map, struct xsk_map, map);
> +	u32 i = *(u32 *)key, fd = *(u32 *)value;
> +	struct xdp_sock *xs, *old_xs;
> +	struct socket *sock;
> +	int err;
> +
> +	if (unlikely(map_flags > BPF_EXIST))
> +		return -EINVAL;
> +	if (unlikely(i >= m->map.max_entries))
> +		return -E2BIG;
> +	if (unlikely(map_flags == BPF_NOEXIST))
> +		return -EEXIST;
> +
> +	sock = sockfd_lookup(fd, &err);
> +	if (!sock)
> +		return err;
> +
> +	if (sock->sk->sk_family != PF_XDP) {
> +		sockfd_put(sock);
> +		return -EOPNOTSUPP;
> +	}
> +
> +	xs = (struct xdp_sock *)sock->sk;
> +
> +	if (!xsk_is_setup_for_bpf_map(xs)) {
> +		sockfd_put(sock);
> +		return -EOPNOTSUPP;
> +	}
> +
> +	sock_hold(sock->sk);
> +
> +	old_xs = xchg(&m->xsk_map[i], xs);
> +	if (old_xs) {
> +		/* Make sure we've flushed everything. */

So it is illegal to call synchronize_net(), since it is a reschedule point.

> +		synchronize_net();
> +		sock_put((struct sock *)old_xs);
> +	}
> +
> +	sockfd_put(sock);
> +	return 0;
> +}
>
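
To illustrate the constraint in isolation, here is a small user-space toy
model. It is not kernel code and not a proposed patch: every model_* name
is invented for this illustration, and nothing here uses or reproduces
rcu_read_lock(), synchronize_net() or any other kernel API. It only shows
the shape of the problem: a blocking grace-period wait has to be refused
inside a read-side section, so the release of the replaced entry is queued
and run once the section has ended.

```c
#include <assert.h>
#include <stddef.h>

/* User-space toy model only; all names are hypothetical, not kernel APIs. */
#define MODEL_MAX_DEFERRED 8

struct model_sock {
	int refcnt;
};

/* > 0 means "currently inside a read-side critical section". */
static int model_read_depth;

/* Old entries whose reference drop has been postponed. */
static struct model_sock *model_deferred[MODEL_MAX_DEFERRED];
static size_t model_ndeferred;

static void model_read_lock(void)   { model_read_depth++; }
static void model_read_unlock(void) { model_read_depth--; }

/*
 * Stand-in for a blocking grace-period wait. Sleeping is not allowed
 * inside a read-side section, so the model reports that as an error.
 */
static int model_wait_grace_period(void)
{
	return model_read_depth > 0 ? -1 : 0;
}

/* Queue the reference drop instead of waiting; never blocks. */
static int model_defer_put(struct model_sock *s)
{
	if (model_ndeferred == MODEL_MAX_DEFERRED)
		return -1;
	model_deferred[model_ndeferred++] = s;
	return 0;
}

/* Run the queued drops; only valid once no read-side section is active. */
static int model_run_deferred(void)
{
	size_t i;

	if (model_read_depth > 0)
		return -1;
	for (i = 0; i < model_ndeferred; i++)
		model_deferred[i]->refcnt--;
	model_ndeferred = 0;
	return 0;
}

/*
 * Model of the update path: take a reference on the new entry, swap it
 * into the slot, and defer (rather than wait for) the drop of the old one.
 */
static int model_update(struct model_sock **slot, struct model_sock *new_s)
{
	struct model_sock *old = *slot;

	new_s->refcnt++;
	*slot = new_s;
	return old ? model_defer_put(old) : 0;
}
```

In this model, calling model_update() twice between model_read_lock() and
model_read_unlock() succeeds without blocking, while
model_wait_grace_period() in the same window fails. The reference on the
replaced entry is only dropped by model_run_deferred() after the unlock.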