From: Jason Wang <jasowang@redhat.com>
To: netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: kvm@vger.kernel.org, virtualization@lists.linux-foundation.org,
	mst@redhat.com, jasowang@redhat.com
Subject: [PATCH net-next 09/11] tuntap: accept an array of XDP buffs through sendmsg()
Date: Thu,  6 Sep 2018 12:05:24 +0800
Message-ID: <20180906040526.22518-10-jasowang@redhat.com>
In-Reply-To: <20180906040526.22518-1-jasowang@redhat.com>

This patch implements the TUN_MSG_PTR msg_control type. This type
allows the caller to pass an array of XDP buffs to tuntap through the
ptr field of struct tun_msg_ctl. If an XDP program is attached, tuntap
can run the XDP program directly. If not, tuntap builds an skb and does
a fast receive, since part of the work has already been done by
vhost_net.

This avoids a lot of indirect calls, which improves icache utilization,
and allows batched XDP flushing when doing XDP redirection.

Signed-off-by: Jason Wang <jasowang@redhat.com>
---
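Note on the batch interface (not part of the commit message): the
TUN_MSG_PTR branch of tun_sendmsg() reads ctl->type as a message type in
the low four bits and a buffer count from bit 16 upward. It then walks
the array and flushes at most once after the loop. Below is a minimal
userspace sketch of that decoding and of the deferred flush. The numeric
values of TUN_MSG_UBUF and TUN_MSG_PTR are assumed for illustration
(the real definitions belong to patch 08/11 and are not shown in this
mail), and every demo_* name is hypothetical.

```c
#include <assert.h>

/* Assumed values, for illustration only; see patch 08/11 for the
 * real definitions. */
#define TUN_MSG_UBUF 1
#define TUN_MSG_PTR  2

/* Hypothetical helper: pack a type and a buffer count in the layout
 * that tun_sendmsg() decodes (type in bits 0-3, count from bit 16). */
static int demo_msg_pack(int type, int n)
{
	assert(n >= 0 && n <= 0x7FFF);	/* keep the shift within int range */
	return (n << 16) | (type & 0xF);
}

/* These mirror the two expressions used in the TUN_MSG_PTR branch. */
static int demo_msg_type(int word)  { return word & 0xF; }
static int demo_msg_count(int word) { return word >> 16; }

/* Stand-ins for the XDP verdicts that matter to the batch loop. */
enum demo_act { DEMO_PASS, DEMO_DROP, DEMO_REDIRECT };

/* Walk a batch the way the TUN_MSG_PTR branch does: remember whether
 * any buffer was redirected, then flush once after the loop.
 * Returns the number of flushes performed (0 or 1). */
static int demo_run_batch(const enum demo_act *acts, int n)
{
	int flush = 0, flushes = 0, i;

	for (i = 0; i < n; i++)
		if (acts[i] == DEMO_REDIRECT)
			flush = 1;

	if (flush)
		flushes++;	/* stands in for xdp_do_flush_map() */

	return flushes;
}
```

In this sketch a batch with several redirected buffers still costs one
flush, which is the effect the commit message describes for XDP
redirection.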
 drivers/net/tun.c | 103 ++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 100 insertions(+), 3 deletions(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index c839a4bdcbd9..069db2e5dd08 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -2424,22 +2424,119 @@ static void tun_sock_write_space(struct sock *sk)
 	kill_fasync(&tfile->fasync, SIGIO, POLL_OUT);
 }
 
+static int tun_xdp_one(struct tun_struct *tun,
+		       struct tun_file *tfile,
+		       struct xdp_buff *xdp, int *flush)
+{
+	struct virtio_net_hdr *gso = xdp->data_hard_start + sizeof(int);
+	struct tun_pcpu_stats *stats;
+	struct bpf_prog *xdp_prog;
+	struct sk_buff *skb = NULL;
+	u32 rxhash = 0, act;
+	int buflen = *(int *)xdp->data_hard_start;
+	int err = 0;
+	bool skb_xdp = false;
+
+	xdp_prog = rcu_dereference(tun->xdp_prog);
+	if (xdp_prog) {
+		if (gso->gso_type) {
+			skb_xdp = true;
+			goto build;
+		}
+		xdp_set_data_meta_invalid(xdp);
+		xdp->rxq = &tfile->xdp_rxq;
+		act = tun_do_xdp(tun, tfile, xdp_prog, xdp, &err);
+		if (err)
+			goto out;
+		if (act == XDP_REDIRECT)
+			*flush = true;
+		if (act != XDP_PASS)
+			goto out;
+	}
+
+build:
+	skb = build_skb(xdp->data_hard_start, buflen);
+	if (!skb) {
+		err = -ENOMEM;
+		goto out;
+	}
+
+	skb_reserve(skb, xdp->data - xdp->data_hard_start);
+	skb_put(skb, xdp->data_end - xdp->data);
+
+	if (virtio_net_hdr_to_skb(skb, gso, tun_is_little_endian(tun))) {
+		this_cpu_inc(tun->pcpu_stats->rx_frame_errors);
+		kfree_skb(skb);
+		err = -EINVAL;
+		goto out;
+	}
+
+	skb->protocol = eth_type_trans(skb, tun->dev);
+	skb_reset_network_header(skb);
+	skb_probe_transport_header(skb, 0);
+
+	if (skb_xdp) {
+		err = do_xdp_generic(xdp_prog, skb);
+		if (err != XDP_PASS)
+			goto out;
+	}
+
+	if (!rcu_dereference(tun->steering_prog))
+		rxhash = __skb_get_hash_symmetric(skb);
+
+	netif_receive_skb(skb);
+
+	stats = get_cpu_ptr(tun->pcpu_stats);
+	u64_stats_update_begin(&stats->syncp);
+	stats->rx_packets++;
+	stats->rx_bytes += skb->len;
+	u64_stats_update_end(&stats->syncp);
+	put_cpu_ptr(stats);
+
+	if (rxhash)
+		tun_flow_update(tun, rxhash, tfile);
+
+out:
+	return err;
+}
+
 static int tun_sendmsg(struct socket *sock, struct msghdr *m, size_t total_len)
 {
-	int ret;
+	int ret, i;
 	struct tun_file *tfile = container_of(sock, struct tun_file, socket);
 	struct tun_struct *tun = tun_get(tfile);
 	struct tun_msg_ctl *ctl = m->msg_control;
+	struct xdp_buff *xdp;
 
 	if (!tun)
 		return -EBADFD;
 
-	if (ctl && ctl->type != TUN_MSG_UBUF)
-		return -EINVAL;
+	if (ctl && ((ctl->type & 0xF) == TUN_MSG_PTR)) {
+		int n = ctl->type >> 16;
+		int flush = 0;
+
+		local_bh_disable();
+		rcu_read_lock();
+
+		for (i = 0; i < n; i++) {
+			xdp = &((struct xdp_buff *)ctl->ptr)[i];
+			tun_xdp_one(tun, tfile, xdp, &flush);
+		}
+
+		if (flush)
+			xdp_do_flush_map();
+
+		rcu_read_unlock();
+		local_bh_enable();
+
+		ret = total_len;
+		goto out;
+	}
 
 	ret = tun_get_user(tun, tfile, ctl ? ctl->ptr : NULL, &m->msg_iter,
 			   m->msg_flags & MSG_DONTWAIT,
 			   m->msg_flags & MSG_MORE);
+out:
 	tun_put(tun);
 	return ret;
 }
-- 
2.17.1
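
Note on the buffer layout that tun_xdp_one() expects: it reads an int
buflen at xdp->data_hard_start and a struct virtio_net_hdr immediately
after it, then derives the skb headroom from data - data_hard_start and
the payload length from data_end - data. Presumably the caller fills in
those two fields before calling sendmsg(); that side is not part of this
patch. The following userspace sketch models the layout under that
assumption. The demo_* types are hypothetical stand-ins, reduced to the
fields the patch touches, and are not kernel structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for struct virtio_net_hdr, reduced to the one field that
 * tun_xdp_one() tests; hypothetical, for illustration only. */
struct demo_vnet_hdr {
	uint8_t gso_type;
};

/* Minimal model of the xdp_buff pointers used in the patch. */
struct demo_xdp_buff {
	unsigned char *data_hard_start;
	unsigned char *data;
	unsigned char *data_end;
};

/* Producer side (assumed): store buflen and the header at the start of
 * the headroom, then point data/data_end at the payload. */
static void demo_fill(struct demo_xdp_buff *xdp, unsigned char *buf,
		      int buflen, size_t headroom, size_t len,
		      uint8_t gso_type)
{
	struct demo_vnet_hdr hdr = { .gso_type = gso_type };

	/* The metadata must fit in front of the packet data. */
	assert(headroom >= sizeof(int) + sizeof(hdr));

	memcpy(buf, &buflen, sizeof(buflen));
	memcpy(buf + sizeof(int), &hdr, sizeof(hdr));
	xdp->data_hard_start = buf;
	xdp->data = buf + headroom;
	xdp->data_end = xdp->data + len;
}

/* Consumer side: mirrors the two reads at the top of tun_xdp_one(). */
static int demo_buflen(const struct demo_xdp_buff *xdp)
{
	int buflen;

	memcpy(&buflen, xdp->data_hard_start, sizeof(buflen));
	return buflen;
}

static uint8_t demo_gso_type(const struct demo_xdp_buff *xdp)
{
	struct demo_vnet_hdr hdr;

	memcpy(&hdr, xdp->data_hard_start + sizeof(int), sizeof(hdr));
	return hdr.gso_type;
}
```

The sketch uses memcpy() rather than the pointer casts in the patch only
to stay clear of alignment and aliasing concerns in portable userspace
C; the offsets are the same.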


