From: Tom Herbert <tom@herbertland.com>
To: Brenden Blanco <bblanco@plumgrid.com>
Cc: "David S. Miller" <davem@davemloft.net>,
	Linux Kernel Network Developers <netdev@vger.kernel.org>,
	Jamal Hadi Salim <jhs@mojatatu.com>,
	Saeed Mahameed <saeedm@dev.mellanox.co.il>,
	Martin KaFai Lau <kafai@fb.com>,
	Jesper Dangaard Brouer <brouer@redhat.com>,
	Ari Saha <as754m@att.com>,
	Alexei Starovoitov <alexei.starovoitov@gmail.com>,
	Or Gerlitz <gerlitz.or@gmail.com>,
	john fastabend <john.fastabend@gmail.com>,
	Hannes Frederic Sowa <hannes@stressinduktion.org>,
	Thomas Graf <tgraf@suug.ch>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Tariq Toukan <ttoukan.linux@gmail.com>,
	Aaron Yue <haoxuany@fb.com>
Subject: Re: [PATCH v10 12/12] bpf: add sample for xdp forwarding and rewrite
Date: Wed, 3 Aug 2016 10:01:54 -0700
Message-ID: <CALx6S34b1O6iiFn6M2=xX3iduGWDQm=G+jweYVvGqY-xa4E2sg@mail.gmail.com>
In-Reply-To: <1468955817-10604-13-git-send-email-bblanco@plumgrid.com>

On Tue, Jul 19, 2016 at 12:16 PM, Brenden Blanco <bblanco@plumgrid.com> wrote:
> Add a sample that rewrites and forwards packets out on the same
> interface. Observed single core forwarding performance of ~10Mpps.
>
> Since the mlx4 driver under test recycles every single packet page, the
> perf output shows almost exclusively just the ring management and bpf
> program work. Slowdowns are likely occurring due to cache misses.
>
> Signed-off-by: Brenden Blanco <bblanco@plumgrid.com>
> ---
>  samples/bpf/Makefile    |   5 +++
>  samples/bpf/xdp2_kern.c | 114 ++++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 119 insertions(+)
>  create mode 100644 samples/bpf/xdp2_kern.c
>
> diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
> index 0e4ab3a..d2d2b35 100644
> --- a/samples/bpf/Makefile
> +++ b/samples/bpf/Makefile
> @@ -22,6 +22,7 @@ hostprogs-y += map_perf_test
>  hostprogs-y += test_overhead
>  hostprogs-y += test_cgrp2_array_pin
>  hostprogs-y += xdp1
> +hostprogs-y += xdp2
>
>  test_verifier-objs := test_verifier.o libbpf.o
>  test_maps-objs := test_maps.o libbpf.o
> @@ -44,6 +45,8 @@ map_perf_test-objs := bpf_load.o libbpf.o map_perf_test_user.o
>  test_overhead-objs := bpf_load.o libbpf.o test_overhead_user.o
>  test_cgrp2_array_pin-objs := libbpf.o test_cgrp2_array_pin.o
>  xdp1-objs := bpf_load.o libbpf.o xdp1_user.o
> +# reuse xdp1 source intentionally
> +xdp2-objs := bpf_load.o libbpf.o xdp1_user.o
>
>  # Tell kbuild to always build the programs
>  always := $(hostprogs-y)
> @@ -67,6 +70,7 @@ always += test_overhead_kprobe_kern.o
>  always += parse_varlen.o parse_simple.o parse_ldabs.o
>  always += test_cgrp2_tc_kern.o
>  always += xdp1_kern.o
> +always += xdp2_kern.o
>
>  HOSTCFLAGS += -I$(objtree)/usr/include
>
> @@ -88,6 +92,7 @@ HOSTLOADLIBES_spintest += -lelf
>  HOSTLOADLIBES_map_perf_test += -lelf -lrt
>  HOSTLOADLIBES_test_overhead += -lelf -lrt
>  HOSTLOADLIBES_xdp1 += -lelf
> +HOSTLOADLIBES_xdp2 += -lelf
>
>  # Allows pointing LLC/CLANG to a LLVM backend with bpf support, redefine on cmdline:
>  #  make samples/bpf/ LLC=~/git/llvm/build/bin/llc CLANG=~/git/llvm/build/bin/clang
> diff --git a/samples/bpf/xdp2_kern.c b/samples/bpf/xdp2_kern.c
> new file mode 100644
> index 0000000..38fe7e1
> --- /dev/null
> +++ b/samples/bpf/xdp2_kern.c
> @@ -0,0 +1,114 @@
> +/* Copyright (c) 2016 PLUMgrid
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of version 2 of the GNU General Public
> + * License as published by the Free Software Foundation.
> + */
> +#define KBUILD_MODNAME "foo"
> +#include <uapi/linux/bpf.h>
> +#include <linux/in.h>
> +#include <linux/if_ether.h>
> +#include <linux/if_packet.h>
> +#include <linux/if_vlan.h>
> +#include <linux/ip.h>
> +#include <linux/ipv6.h>
> +#include "bpf_helpers.h"
> +
> +struct bpf_map_def SEC("maps") dropcnt = {
> +       .type = BPF_MAP_TYPE_PERCPU_ARRAY,
> +       .key_size = sizeof(u32),
> +       .value_size = sizeof(long),
> +       .max_entries = 256,
> +};
> +
> +static void swap_src_dst_mac(void *data)
> +{
> +       unsigned short *p = data;
> +       unsigned short dst[3];
> +
> +       dst[0] = p[0];
> +       dst[1] = p[1];
> +       dst[2] = p[2];
> +       p[0] = p[3];
> +       p[1] = p[4];
> +       p[2] = p[5];
> +       p[3] = dst[0];
> +       p[4] = dst[1];
> +       p[5] = dst[2];
> +}
> +
> +static int parse_ipv4(void *data, u64 nh_off, void *data_end)
> +{
> +       struct iphdr *iph = data + nh_off;
> +
> +       if (iph + 1 > data_end)
> +               return 0;
> +       return iph->protocol;
> +}
> +
> +static int parse_ipv6(void *data, u64 nh_off, void *data_end)
> +{
> +       struct ipv6hdr *ip6h = data + nh_off;
> +
> +       if (ip6h + 1 > data_end)
> +               return 0;
> +       return ip6h->nexthdr;
> +}
> +
> +SEC("xdp1")
> +int xdp_prog1(struct xdp_md *ctx)
> +{
> +       void *data_end = (void *)(long)ctx->data_end;
> +       void *data = (void *)(long)ctx->data;

Brenden,

It seems that the cast to long here is done because data_end and data
are u32s in xdp_md. So the effect is that we are upcasting a
thirty-two bit integer into a sixty-four bit pointer (in fact, without
the cast we see compiler warnings). I don't understand how this can be
correct. Can you shed some light on this?

Thanks,
Tom

> +       struct ethhdr *eth = data;
> +       int rc = XDP_DROP;
> +       long *value;
> +       u16 h_proto;
> +       u64 nh_off;
> +       u32 index;
> +
> +       nh_off = sizeof(*eth);
> +       if (data + nh_off > data_end)
> +               return rc;
> +
> +       h_proto = eth->h_proto;
> +
> +       if (h_proto == htons(ETH_P_8021Q) || h_proto == htons(ETH_P_8021AD)) {
> +               struct vlan_hdr *vhdr;
> +
> +               vhdr = data + nh_off;
> +               nh_off += sizeof(struct vlan_hdr);
> +               if (data + nh_off > data_end)
> +                       return rc;
> +               h_proto = vhdr->h_vlan_encapsulated_proto;
> +       }
> +       if (h_proto == htons(ETH_P_8021Q) || h_proto == htons(ETH_P_8021AD)) {
> +               struct vlan_hdr *vhdr;
> +
> +               vhdr = data + nh_off;
> +               nh_off += sizeof(struct vlan_hdr);
> +               if (data + nh_off > data_end)
> +                       return rc;
> +               h_proto = vhdr->h_vlan_encapsulated_proto;
> +       }
> +
> +       if (h_proto == htons(ETH_P_IP))
> +               index = parse_ipv4(data, nh_off, data_end);
> +       else if (h_proto == htons(ETH_P_IPV6))
> +               index = parse_ipv6(data, nh_off, data_end);
> +       else
> +               index = 0;
> +
> +       value = bpf_map_lookup_elem(&dropcnt, &index);
> +       if (value)
> +               *value += 1;
> +
> +       if (index == 17) {
> +               swap_src_dst_mac(data);
> +               rc = XDP_TX;
> +       }
> +
> +       return rc;
> +}
> +
> +char _license[] SEC("license") = "GPL";
> --
> 2.8.2
>
