bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Magnus Karlsson <magnus.karlsson@gmail.com>
To: Ye Xiaolong <xiaolong.ye@intel.com>
Cc: "Magnus Karlsson" <magnus.karlsson@intel.com>,
	"Björn Töpel" <bjorn.topel@intel.com>,
	"Alexei Starovoitov" <ast@kernel.org>,
	"Daniel Borkmann" <daniel@iogearbox.net>,
	"Network Development" <netdev@vger.kernel.org>,
	"Jesper Dangaard Brouer" <brouer@redhat.com>,
	bpf@vger.kernel.org, bruce.richardson@intel.com,
	ciara.loftus@intel.com,
	"Jakub Kicinski" <jakub.kicinski@netronome.com>,
	"Zhang, Qi Z" <qi.z.zhang@intel.com>,
	"Maxim Mikityanskiy" <maximmi@mellanox.com>,
	"Samudrala, Sridhar" <sridhar.samudrala@intel.com>,
	kevin.laatz@intel.com
Subject: Re: [RFC bpf-next 7/7] samples/bpf: add busy-poll support to xdpsock sample
Date: Mon, 27 May 2019 08:58:52 +0200	[thread overview]
Message-ID: <CAJ8uoz3MNrz_f7dy6+U=zj7GeywUda9E9pP2s9uU1jVW5O2zHw@mail.gmail.com> (raw)
In-Reply-To: <20190524025111.GB6321@intel.com>

On Fri, May 24, 2019 at 4:59 AM Ye Xiaolong <xiaolong.ye@intel.com> wrote:
>
> Hi, Magnus
>
> On 05/02, Magnus Karlsson wrote:
> >This patch adds busy-poll support to the xdpsock sample
> >application. It is enabled by the "-b" or the "--busy-poll" command
> >line options.
> >
> >Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
> >---
> > samples/bpf/xdpsock_user.c | 203 ++++++++++++++++++++++++++++-----------------
> > 1 file changed, 125 insertions(+), 78 deletions(-)
> >
> >diff --git a/samples/bpf/xdpsock_user.c b/samples/bpf/xdpsock_user.c
> >index d08ee1a..1272edf 100644
> >--- a/samples/bpf/xdpsock_user.c
> >+++ b/samples/bpf/xdpsock_user.c
> >@@ -66,6 +66,7 @@ static const char *opt_if = "";
> > static int opt_ifindex;
> > static int opt_queue;
> > static int opt_poll;
> >+static int opt_busy_poll;
> > static int opt_interval = 1;
> > static u32 opt_xdp_bind_flags;
> > static __u32 prog_id;
> >@@ -119,8 +120,11 @@ static void print_benchmark(bool running)
> >       else
> >               printf("        ");
> >
> >-      if (opt_poll)
> >+      if (opt_poll) {
> >+              if (opt_busy_poll)
> >+                      printf("busy-");
> >               printf("poll() ");
> >+      }
> >
> >       if (running) {
> >               printf("running...");
> >@@ -306,7 +310,7 @@ static struct xsk_socket_info *xsk_configure_socket(struct xsk_umem_info *umem)
> >       xsk->umem = umem;
> >       cfg.rx_size = XSK_RING_CONS__DEFAULT_NUM_DESCS;
> >       cfg.tx_size = XSK_RING_PROD__DEFAULT_NUM_DESCS;
> >-      cfg.libbpf_flags = 0;
>
> Any purpose for removing this line, as cfg here is a local variable, cfg.libbpf_flags
> can be random and may lead to xdpsock failure as `Invalid no_argument`.

No, that was a mistake. Thanks for catching.

I will produce a new patch set with the improvements needed to make
execution of application and softirq/ksoftirqd on the same core
efficient, make a v2 of the busy poll patch set (with all your
suggested improvements) on top of that, and then see what performance
we have.

Thanks: Magnus

> Thanks,
> Xiaolong
>
> >+      cfg.busy_poll = (opt_busy_poll ? BATCH_SIZE : 0);
> >       cfg.xdp_flags = opt_xdp_flags;
> >       cfg.bind_flags = opt_xdp_bind_flags;
> >       ret = xsk_socket__create(&xsk->xsk, opt_if, opt_queue, umem->umem,
> >@@ -319,17 +323,17 @@ static struct xsk_socket_info *xsk_configure_socket(struct xsk_umem_info *umem)
> >               exit_with_error(-ret);
> >
> >       ret = xsk_ring_prod__reserve(&xsk->umem->fq,
> >-                                   XSK_RING_PROD__DEFAULT_NUM_DESCS,
> >+                                   1024,
> >                                    &idx);
> >-      if (ret != XSK_RING_PROD__DEFAULT_NUM_DESCS)
> >+      if (ret != 1024)
> >               exit_with_error(-ret);
> >       for (i = 0;
> >-           i < XSK_RING_PROD__DEFAULT_NUM_DESCS *
> >+           i < 1024 *
> >                    XSK_UMEM__DEFAULT_FRAME_SIZE;
> >            i += XSK_UMEM__DEFAULT_FRAME_SIZE)
> >               *xsk_ring_prod__fill_addr(&xsk->umem->fq, idx++) = i;
> >       xsk_ring_prod__submit(&xsk->umem->fq,
> >-                            XSK_RING_PROD__DEFAULT_NUM_DESCS);
> >+                            1024);
> >
> >       return xsk;
> > }
> >@@ -341,6 +345,7 @@ static struct option long_options[] = {
> >       {"interface", required_argument, 0, 'i'},
> >       {"queue", required_argument, 0, 'q'},
> >       {"poll", no_argument, 0, 'p'},
> >+      {"busy-poll", no_argument, 0, 'b'},
> >       {"xdp-skb", no_argument, 0, 'S'},
> >       {"xdp-native", no_argument, 0, 'N'},
> >       {"interval", required_argument, 0, 'n'},
> >@@ -360,6 +365,7 @@ static void usage(const char *prog)
> >               "  -i, --interface=n    Run on interface n\n"
> >               "  -q, --queue=n        Use queue n (default 0)\n"
> >               "  -p, --poll           Use poll syscall\n"
> >+              "  -b, --busy-poll      Use poll syscall with busy poll\n"
> >               "  -S, --xdp-skb=n      Use XDP skb-mod\n"
> >               "  -N, --xdp-native=n   Enfore XDP native mode\n"
> >               "  -n, --interval=n     Specify statistics update interval (default 1 sec).\n"
> >@@ -377,7 +383,7 @@ static void parse_command_line(int argc, char **argv)
> >       opterr = 0;
> >
> >       for (;;) {
> >-              c = getopt_long(argc, argv, "Frtli:q:psSNn:cz", long_options,
> >+              c = getopt_long(argc, argv, "Frtli:q:pbsSNn:cz", long_options,
> >                               &option_index);
> >               if (c == -1)
> >                       break;
> >@@ -401,6 +407,10 @@ static void parse_command_line(int argc, char **argv)
> >               case 'p':
> >                       opt_poll = 1;
> >                       break;
> >+              case 'b':
> >+                      opt_busy_poll = 1;
> >+                      opt_poll = 1;
> >+                      break;
> >               case 'S':
> >                       opt_xdp_flags |= XDP_FLAGS_SKB_MODE;
> >                       opt_xdp_bind_flags |= XDP_COPY;
> >@@ -444,7 +454,8 @@ static void kick_tx(struct xsk_socket_info *xsk)
> >       exit_with_error(errno);
> > }
> >
> >-static inline void complete_tx_l2fwd(struct xsk_socket_info *xsk)
> >+static inline void complete_tx_l2fwd(struct xsk_socket_info *xsk,
> >+                                   struct pollfd *fds)
> > {
> >       u32 idx_cq = 0, idx_fq = 0;
> >       unsigned int rcvd;
> >@@ -453,7 +464,8 @@ static inline void complete_tx_l2fwd(struct xsk_socket_info *xsk)
> >       if (!xsk->outstanding_tx)
> >               return;
> >
> >-      kick_tx(xsk);
> >+      if (!opt_poll)
> >+              kick_tx(xsk);
> >       ndescs = (xsk->outstanding_tx > BATCH_SIZE) ? BATCH_SIZE :
> >               xsk->outstanding_tx;
> >
> >@@ -467,6 +479,8 @@ static inline void complete_tx_l2fwd(struct xsk_socket_info *xsk)
> >               while (ret != rcvd) {
> >                       if (ret < 0)
> >                               exit_with_error(-ret);
> >+                      if (opt_busy_poll)
> >+                              ret = poll(fds, num_socks, 0);
> >                       ret = xsk_ring_prod__reserve(&xsk->umem->fq, rcvd,
> >                                                    &idx_fq);
> >               }
> >@@ -490,7 +504,8 @@ static inline void complete_tx_only(struct xsk_socket_info *xsk)
> >       if (!xsk->outstanding_tx)
> >               return;
> >
> >-      kick_tx(xsk);
> >+      if (!opt_busy_poll)
> >+              kick_tx(xsk);
> >
> >       rcvd = xsk_ring_cons__peek(&xsk->umem->cq, BATCH_SIZE, &idx);
> >       if (rcvd > 0) {
> >@@ -500,10 +515,10 @@ static inline void complete_tx_only(struct xsk_socket_info *xsk)
> >       }
> > }
> >
> >-static void rx_drop(struct xsk_socket_info *xsk)
> >+static void rx_drop(struct xsk_socket_info *xsk, struct pollfd *fds)
> > {
> >-      unsigned int rcvd, i;
> >       u32 idx_rx = 0, idx_fq = 0;
> >+      unsigned int rcvd, i;
> >       int ret;
> >
> >       rcvd = xsk_ring_cons__peek(&xsk->rx, BATCH_SIZE, &idx_rx);
> >@@ -514,6 +529,8 @@ static void rx_drop(struct xsk_socket_info *xsk)
> >       while (ret != rcvd) {
> >               if (ret < 0)
> >                       exit_with_error(-ret);
> >+              if (opt_busy_poll)
> >+                      ret = poll(fds, num_socks, 0);
> >               ret = xsk_ring_prod__reserve(&xsk->umem->fq, rcvd, &idx_fq);
> >       }
> >
> >@@ -533,43 +550,68 @@ static void rx_drop(struct xsk_socket_info *xsk)
> >
> > static void rx_drop_all(void)
> > {
> >-      struct pollfd fds[MAX_SOCKS + 1];
> >-      int i, ret, timeout, nfds = 1;
> >+      struct pollfd fds[MAX_SOCKS];
> >+      int i, ret;
> >
> >       memset(fds, 0, sizeof(fds));
> >
> >       for (i = 0; i < num_socks; i++) {
> >               fds[i].fd = xsk_socket__fd(xsks[i]->xsk);
> >               fds[i].events = POLLIN;
> >-              timeout = 1000; /* 1sn */
> >       }
> >
> >       for (;;) {
> >               if (opt_poll) {
> >-                      ret = poll(fds, nfds, timeout);
> >+                      ret = poll(fds, num_socks, 0);
> >                       if (ret <= 0)
> >                               continue;
> >               }
> >
> >               for (i = 0; i < num_socks; i++)
> >-                      rx_drop(xsks[i]);
> >+                      rx_drop(xsks[i], fds);
> >+      }
> >+}
> >+
> >+static void tx_only(struct xsk_socket_info *xsk, u32 frame_nb)
> >+{
> >+      u32 idx;
> >+
> >+      if (xsk_ring_prod__reserve(&xsk->tx, BATCH_SIZE, &idx) ==
> >+          BATCH_SIZE) {
> >+              unsigned int i;
> >+
> >+              for (i = 0; i < BATCH_SIZE; i++) {
> >+                      xsk_ring_prod__tx_desc(&xsk->tx, idx + i)->addr
> >+                              = (frame_nb + i) <<
> >+                              XSK_UMEM__DEFAULT_FRAME_SHIFT;
> >+                      xsk_ring_prod__tx_desc(&xsk->tx, idx + i)->len =
> >+                              sizeof(pkt_data) - 1;
> >+              }
> >+
> >+              xsk_ring_prod__submit(&xsk->tx, BATCH_SIZE);
> >+              xsk->outstanding_tx += BATCH_SIZE;
> >+              frame_nb += BATCH_SIZE;
> >+              frame_nb %= NUM_FRAMES;
> >       }
> >+
> >+      complete_tx_only(xsk);
> > }
> >
> >-static void tx_only(struct xsk_socket_info *xsk)
> >+static void tx_only_all(void)
> > {
> >-      int timeout, ret, nfds = 1;
> >-      struct pollfd fds[nfds + 1];
> >-      u32 idx, frame_nb = 0;
> >+      struct pollfd fds[MAX_SOCKS];
> >+      u32 frame_nb[MAX_SOCKS] = {};
> >+      int i, ret;
> >
> >       memset(fds, 0, sizeof(fds));
> >-      fds[0].fd = xsk_socket__fd(xsk->xsk);
> >-      fds[0].events = POLLOUT;
> >-      timeout = 1000; /* 1sn */
> >+      for (i = 0; i < num_socks; i++) {
> >+              fds[0].fd = xsk_socket__fd(xsks[i]->xsk);
> >+              fds[0].events = POLLOUT;
> >+      }
> >
> >       for (;;) {
> >               if (opt_poll) {
> >-                      ret = poll(fds, nfds, timeout);
> >+                      ret = poll(fds, num_socks, 0);
> >                       if (ret <= 0)
> >                               continue;
> >
> >@@ -577,70 +619,75 @@ static void tx_only(struct xsk_socket_info *xsk)
> >                               continue;
> >               }
> >
> >-              if (xsk_ring_prod__reserve(&xsk->tx, BATCH_SIZE, &idx) ==
> >-                  BATCH_SIZE) {
> >-                      unsigned int i;
> >-
> >-                      for (i = 0; i < BATCH_SIZE; i++) {
> >-                              xsk_ring_prod__tx_desc(&xsk->tx, idx + i)->addr
> >-                                      = (frame_nb + i) <<
> >-                                      XSK_UMEM__DEFAULT_FRAME_SHIFT;
> >-                              xsk_ring_prod__tx_desc(&xsk->tx, idx + i)->len =
> >-                                      sizeof(pkt_data) - 1;
> >-                      }
> >-
> >-                      xsk_ring_prod__submit(&xsk->tx, BATCH_SIZE);
> >-                      xsk->outstanding_tx += BATCH_SIZE;
> >-                      frame_nb += BATCH_SIZE;
> >-                      frame_nb %= NUM_FRAMES;
> >-              }
> >-
> >-              complete_tx_only(xsk);
> >+              for (i = 0; i < num_socks; i++)
> >+                      tx_only(xsks[i], frame_nb[i]);
> >       }
> > }
> >
> >-static void l2fwd(struct xsk_socket_info *xsk)
> >+static void l2fwd(struct xsk_socket_info *xsk, struct pollfd *fds)
> > {
> >-      for (;;) {
> >-              unsigned int rcvd, i;
> >-              u32 idx_rx = 0, idx_tx = 0;
> >-              int ret;
> >+      unsigned int rcvd, i;
> >+      u32 idx_rx = 0, idx_tx = 0;
> >+      int ret;
> >
> >-              for (;;) {
> >-                      complete_tx_l2fwd(xsk);
> >+      complete_tx_l2fwd(xsk, fds);
> >
> >-                      rcvd = xsk_ring_cons__peek(&xsk->rx, BATCH_SIZE,
> >-                                                 &idx_rx);
> >-                      if (rcvd > 0)
> >-                              break;
> >-              }
> >+      rcvd = xsk_ring_cons__peek(&xsk->rx, BATCH_SIZE,
> >+                                 &idx_rx);
> >+      if (!rcvd)
> >+              return;
> >
> >+      ret = xsk_ring_prod__reserve(&xsk->tx, rcvd, &idx_tx);
> >+      while (ret != rcvd) {
> >+              if (ret < 0)
> >+                      exit_with_error(-ret);
> >+              if (opt_busy_poll)
> >+                      ret = poll(fds, num_socks, 0);
> >               ret = xsk_ring_prod__reserve(&xsk->tx, rcvd, &idx_tx);
> >-              while (ret != rcvd) {
> >-                      if (ret < 0)
> >-                              exit_with_error(-ret);
> >-                      ret = xsk_ring_prod__reserve(&xsk->tx, rcvd, &idx_tx);
> >-              }
> >+      }
> >+
> >+      for (i = 0; i < rcvd; i++) {
> >+              u64 addr = xsk_ring_cons__rx_desc(&xsk->rx,
> >+                                                idx_rx)->addr;
> >+              u32 len = xsk_ring_cons__rx_desc(&xsk->rx,
> >+                                               idx_rx++)->len;
> >+              char *pkt = xsk_umem__get_data(xsk->umem->buffer, addr);
> >
> >-              for (i = 0; i < rcvd; i++) {
> >-                      u64 addr = xsk_ring_cons__rx_desc(&xsk->rx,
> >-                                                        idx_rx)->addr;
> >-                      u32 len = xsk_ring_cons__rx_desc(&xsk->rx,
> >-                                                       idx_rx++)->len;
> >-                      char *pkt = xsk_umem__get_data(xsk->umem->buffer, addr);
> >+              swap_mac_addresses(pkt);
> >
> >-                      swap_mac_addresses(pkt);
> >+              hex_dump(pkt, len, addr);
> >+              xsk_ring_prod__tx_desc(&xsk->tx, idx_tx)->addr = addr;
> >+              xsk_ring_prod__tx_desc(&xsk->tx, idx_tx++)->len = len;
> >+      }
> >
> >-                      hex_dump(pkt, len, addr);
> >-                      xsk_ring_prod__tx_desc(&xsk->tx, idx_tx)->addr = addr;
> >-                      xsk_ring_prod__tx_desc(&xsk->tx, idx_tx++)->len = len;
> >-              }
> >+      xsk_ring_prod__submit(&xsk->tx, rcvd);
> >+      xsk_ring_cons__release(&xsk->rx, rcvd);
> >
> >-              xsk_ring_prod__submit(&xsk->tx, rcvd);
> >-              xsk_ring_cons__release(&xsk->rx, rcvd);
> >+      xsk->rx_npkts += rcvd;
> >+      xsk->outstanding_tx += rcvd;
> >+}
> >
> >-              xsk->rx_npkts += rcvd;
> >-              xsk->outstanding_tx += rcvd;
> >+static void l2fwd_all(void)
> >+{
> >+      struct pollfd fds[MAX_SOCKS];
> >+      int i, ret;
> >+
> >+      memset(fds, 0, sizeof(fds));
> >+
> >+      for (i = 0; i < num_socks; i++) {
> >+              fds[i].fd = xsk_socket__fd(xsks[i]->xsk);
> >+              fds[i].events = POLLOUT | POLLIN;
> >+      }
> >+
> >+      for (;;) {
> >+              if (opt_poll) {
> >+                      ret = poll(fds, num_socks, 0);
> >+                      if (ret <= 0)
> >+                              continue;
> >+              }
> >+
> >+              for (i = 0; i < num_socks; i++)
> >+                      l2fwd(xsks[i], fds);
> >       }
> > }
> >
> >@@ -693,9 +740,9 @@ int main(int argc, char **argv)
> >       if (opt_bench == BENCH_RXDROP)
> >               rx_drop_all();
> >       else if (opt_bench == BENCH_TXONLY)
> >-              tx_only(xsks[0]);
> >+              tx_only_all();
> >       else
> >-              l2fwd(xsks[0]);
> >+              l2fwd_all();
> >
> >       return 0;
> > }
> >--
> >2.7.4
> >

  reply	other threads:[~2019-05-27  6:59 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-05-02  8:39 [RFC bpf-next 0/7] busy poll support for AF_XDP sockets Magnus Karlsson
2019-05-02  8:39 ` [RFC bpf-next 1/7] net: fs: make busy poll budget configurable in napi_busy_loop Magnus Karlsson
2019-05-02  8:39 ` [RFC bpf-next 2/7] net: i40e: ixgbe: tun: veth: virtio-net: centralize xdp_rxq_info and add napi id Magnus Karlsson
2019-05-02  8:39 ` [RFC bpf-next 4/7] netdevice: introduce busy-poll setsockopt for AF_XDP Magnus Karlsson
2019-05-03  0:13   ` Samudrala, Sridhar
2019-05-03  6:35     ` Magnus Karlsson
2019-05-02  8:39 ` [RFC bpf-next 5/7] net: add busy-poll support for XDP sockets Magnus Karlsson
2019-05-03  0:23   ` Samudrala, Sridhar
2019-05-02  8:39 ` [RFC bpf-next 6/7] libbpf: add busy-poll support to " Magnus Karlsson
2019-05-02  8:39 ` [RFC bpf-next 7/7] samples/bpf: add busy-poll support to xdpsock sample Magnus Karlsson
2019-05-24  2:51   ` Ye Xiaolong
2019-05-27  6:58     ` Magnus Karlsson [this message]
2019-05-06 16:31 ` [RFC bpf-next 0/7] busy poll support for AF_XDP sockets Alexei Starovoitov
2019-05-07 11:51   ` Magnus Karlsson
2019-05-07 18:24     ` Alexei Starovoitov
2019-05-08 12:10       ` Magnus Karlsson
2019-05-16 12:37         ` Magnus Karlsson
2019-05-16 23:50           ` Samudrala, Sridhar
2019-05-17  7:50             ` Magnus Karlsson
2019-05-17 18:20           ` Jakub Kicinski
2019-05-18  8:49             ` Magnus Karlsson
2019-05-13 20:42       ` Jonathan Lemon
2019-05-13 23:30         ` Samudrala, Sridhar
2019-05-14  7:53         ` Björn Töpel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAJ8uoz3MNrz_f7dy6+U=zj7GeywUda9E9pP2s9uU1jVW5O2zHw@mail.gmail.com' \
    --to=magnus.karlsson@gmail.com \
    --cc=ast@kernel.org \
    --cc=bjorn.topel@intel.com \
    --cc=bpf@vger.kernel.org \
    --cc=brouer@redhat.com \
    --cc=bruce.richardson@intel.com \
    --cc=ciara.loftus@intel.com \
    --cc=daniel@iogearbox.net \
    --cc=jakub.kicinski@netronome.com \
    --cc=kevin.laatz@intel.com \
    --cc=magnus.karlsson@intel.com \
    --cc=maximmi@mellanox.com \
    --cc=netdev@vger.kernel.org \
    --cc=qi.z.zhang@intel.com \
    --cc=sridhar.samudrala@intel.com \
    --cc=xiaolong.ye@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).