linux-kernel.vger.kernel.org archive mirror
* [RFC PATCH net-next 0/5] XDP rx handler
@ 2018-08-13  3:05 Jason Wang
  2018-08-13  3:05 ` [RFC PATCH net-next 1/5] net: core: generic XDP support for stacked device Jason Wang
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: Jason Wang @ 2018-08-13  3:05 UTC (permalink / raw)
  To: netdev, linux-kernel; +Cc: ast, daniel, jbrouer, mst, Jason Wang

Hi:

This series implements XDP support for rx handlers. This is useful
for running native XDP on stacked devices such as macvlan, bridge,
or even bond.

The idea is simple: a stacked device registers an XDP rx handler. When
the driver's XDP program returns XDP_PASS, the driver calls a new
helper, xdp_do_pass(), which tries to pass the XDP buff to the XDP rx
handler directly. The XDP rx handler then decides how to proceed: it
can consume the buff, ask the driver to drop the packet, or ask the
driver to fall back to the normal skb path.
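
As a rough user-space illustration of the registration and dispatch
idea (a sketch only, not the kernel code: struct model_dev,
model_register() and model_do_pass() are hypothetical stand-ins for
the net_device, the registration helper and xdp_do_pass()):

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
enum model_result { MODEL_CONSUMED, MODEL_DROP, MODEL_FALLBACK };

struct model_dev;
typedef enum model_result model_handler_t(struct model_dev *dev,
					  const void *buf);

struct model_dev {
	model_handler_t *rx_xdp_handler; /* NULL: no stacked device attached */
};

/* Only one XDP rx handler per device; a second registration gets -EBUSY. */
static int model_register(struct model_dev *dev, model_handler_t *handler)
{
	assert(dev && handler);
	if (dev->rx_xdp_handler)
		return -EBUSY;
	dev->rx_xdp_handler = handler;
	return 0;
}

/* Without a handler, the caller falls back to the normal skb path. */
static enum model_result model_do_pass(struct model_dev *dev, const void *buf)
{
	if (!dev->rx_xdp_handler)
		return MODEL_FALLBACK;
	return dev->rx_xdp_handler(dev, buf);
}

/* Example handler that always asks the driver to drop the packet. */
static enum model_result model_drop_all(struct model_dev *dev, const void *buf)
{
	(void)dev;
	(void)buf;
	return MODEL_DROP;
}
```

In this model a device without a registered handler always reports
MODEL_FALLBACK, so drivers that call the helper keep their existing
behaviour when nothing is stacked on top.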

A sample XDP rx handler is implemented for macvlan, and virtio-net
(mergeable buffer case) is converted to call xdp_do_pass() as an
example. For ease of comparison, generic XDP support for rx handlers
is also implemented.

Compared to skb mode XDP on macvlan, native XDP on macvlan (XDP_DROP)
shows about 83% improvement.

Please review.

Thanks

Jason Wang (5):
  net: core: generic XDP support for stacked device
  net: core: introduce XDP rx handler
  macvlan: count the number of vlan in source mode
  macvlan: basic XDP support
  virtio-net: support XDP rx handler

 drivers/net/macvlan.c      | 189 +++++++++++++++++++++++++++++++++++++++++++--
 drivers/net/virtio_net.c   |  11 +++
 include/linux/filter.h     |   1 +
 include/linux/if_macvlan.h |   1 +
 include/linux/netdevice.h  |  12 +++
 net/core/dev.c             |  34 ++++++++
 net/core/filter.c          |  28 +++++++
 7 files changed, 271 insertions(+), 5 deletions(-)

-- 
2.7.4



* [RFC PATCH net-next 1/5] net: core: generic XDP support for stacked device
  2018-08-13  3:05 [RFC PATCH net-next 0/5] XDP rx handler Jason Wang
@ 2018-08-13  3:05 ` Jason Wang
  2018-08-13  3:05 ` [RFC PATCH net-next 2/5] net: core: introduce XDP rx handler Jason Wang
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Jason Wang @ 2018-08-13  3:05 UTC (permalink / raw)
  To: netdev, linux-kernel; +Cc: ast, daniel, jbrouer, mst, Jason Wang

Stacked devices usually change skb->dev to their own device and return
RX_HANDLER_ANOTHER during rx handler processing. But we don't call the
generic XDP routine at that point, which means generic XDP can't work
for stacked devices.

Fix this by calling netif_do_generic_xdp() if the rx handler returns
RX_HANDLER_ANOTHER. This allows us to do generic XDP on stacked
devices, e.g. macvlan.
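
The effect can be modelled in user space roughly as follows. This is a
simplified sketch under stated assumptions, not the kernel receive
path: struct gdev and receive_core() are hypothetical names, a
non-NULL upper pointer stands for an rx handler that retargets the
packet and returns RX_HANDLER_ANOTHER, and only the stacked devices'
generic XDP verdicts are modelled.

```c
#include <assert.h>
#include <stddef.h>

enum verdict { V_PASS, V_DROP };

/* Hypothetical model of a device in a stack. */
struct gdev {
	enum verdict generic_xdp; /* generic XDP verdict, V_PASS if no prog */
	struct gdev *upper;       /* stacked device taking over, or NULL */
	int delivered;            /* packets delivered to the stack here */
};

/*
 * Returns the device that finally received the packet, or NULL if a
 * stacked device's generic XDP program did not return V_PASS.
 */
static struct gdev *receive_core(struct gdev *dev)
{
	assert(dev != NULL);
	while (dev->upper) {
		/* rx handler switched the packet to the upper device */
		dev = dev->upper;
		/* the new step: run generic XDP before the next round */
		if (dev->generic_xdp != V_PASS)
			return NULL;
	}
	dev->delivered++;
	return dev;
}
```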

Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 net/core/dev.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/net/core/dev.c b/net/core/dev.c
index 605c66e..a77ce08 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4822,6 +4822,11 @@ static int __netif_receive_skb_core(struct sk_buff *skb, bool pfmemalloc,
 			ret = NET_RX_SUCCESS;
 			goto out;
 		case RX_HANDLER_ANOTHER:
+			ret = netif_do_generic_xdp(skb);
+			if (ret != XDP_PASS) {
+				ret = NET_RX_SUCCESS;
+				goto out;
+			}
 			goto another_round;
 		case RX_HANDLER_EXACT:
 			deliver_exact = true;
-- 
2.7.4



* [RFC PATCH net-next 2/5] net: core: introduce XDP rx handler
  2018-08-13  3:05 [RFC PATCH net-next 0/5] XDP rx handler Jason Wang
  2018-08-13  3:05 ` [RFC PATCH net-next 1/5] net: core: generic XDP support for stacked device Jason Wang
@ 2018-08-13  3:05 ` Jason Wang
  2018-08-13  3:05 ` [RFC PATCH net-next 3/5] macvlan: count the number of vlan in source mode Jason Wang
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Jason Wang @ 2018-08-13  3:05 UTC (permalink / raw)
  To: netdev, linux-kernel; +Cc: ast, daniel, jbrouer, mst, Jason Wang

This patch introduces the XDP rx handler. It will be used by stacked
devices that depend on an rx handler to get a fast packet processing
path based on XDP.

The idea is simple: when the XDP program returns XDP_PASS, instead of
building an skb immediately, the driver calls xdp_do_pass() to check
whether there is an XDP rx handler; if so, it passes the XDP buffer to
the XDP rx handler first.

An XDP rx handler has two main tasks. The first is to check whether
the setup or packet can be processed through the XDP buff directly.
The second is to run the XDP program. An XDP rx handler can return
several different results, defined by enum rx_xdp_handler_result
(typedef rx_xdp_handler_result_t):

RX_XDP_HANDLER_CONSUMED: The XDP buff was consumed.
RX_XDP_HANDLER_DROP: The XDP rx handler asks the driver to drop the
packet.
RX_XDP_HANDLER_FALLBACK: The XDP rx handler cannot process the packet
(e.g. it needs cloning), and we need to fall back to the normal skb
path to deal with the packet.

Consider the following configuration: a Level 0 (L0) device has an rx
handler for a Level 1 (L1) device, which in turn has an rx handler for
a Level 2 (L2) device.

L2 device
    |
L1 device
    |
L0 device

With the help of the XDP rx handler, we can attach an XDP program to
each layer, or even run the native XDP handler for L2 without an XDP
prog attached to the L1 device:

(XDP prog for L2 device)
    |
L2 XDP rx handler for L1
    |
(XDP prog for L1 device)
    |
L1 XDP rx handler for L0
    |
XDP prog for L0 device

It works like this: when the XDP program for the L0 device returns
XDP_PASS, we first try to pass the XDP buff to its XDP rx handler, if
there is one. The L1 XDP rx handler is then called and runs the XDP
program for L1. When the L1 XDP program returns XDP_PASS, or there is
no XDP program attached to L1, we call xdp_do_pass() to pass the buff
to the XDP rx handler registered on L1. The XDP buff is then passed to
the L2 XDP rx handler, and so on, which runs the L2 XDP program, if
any. If there is no L2 XDP program, or the XDP program returns
XDP_PASS, the handler will usually build an skb and call netif_rx()
for a local receive. If any of the XDP rx handlers returns
RX_XDP_HANDLER_FALLBACK, the code returns to the L0 device, which
builds an skb and goes through the normal rx handler path for skbs.
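
The layered flow can be sketched as a small user-space model. All
names here (struct layer, do_pass(), handler(), driver_rx()) are
hypothetical, and the model makes one simplifying assumption: a layer
that gets a fallback result from the layer above builds the skb
itself, as the macvlan handler in patch 4 does.

```c
#include <assert.h>
#include <stddef.h>

enum act { ACT_PASS, ACT_DROP };
enum res { RES_CONSUMED, RES_DROP, RES_FALLBACK };

/* Hypothetical model of one device in the stack. */
struct layer {
	int has_prog;        /* is an XDP program attached to this device? */
	enum act prog_act;   /* verdict of that program */
	int can_xdp;         /* 0: handler entering this layer needs an skb */
	struct layer *upper; /* device stacked on top, or NULL */
	int local_rx;        /* packets received locally on this device */
};

static enum res handler(struct layer *dev);

/* Models xdp_do_pass(): hand the buff to the layer above, if any. */
static enum res do_pass(struct layer *dev)
{
	assert(dev != NULL);
	if (!dev->upper)
		return RES_FALLBACK;
	return handler(dev->upper);
}

/* Models the XDP rx handler that receives the buff on behalf of dev. */
static enum res handler(struct layer *dev)
{
	enum res r;

	if (!dev->can_xdp)
		return RES_FALLBACK;
	if (dev->has_prog && dev->prog_act == ACT_DROP)
		return RES_DROP;
	r = do_pass(dev);
	if (r != RES_FALLBACK)
		return r;
	dev->local_rx++; /* build an skb and receive it locally */
	return RES_CONSUMED;
}

/* Models the L0 driver; returns 1 if it must build the skb itself. */
static int driver_rx(struct layer *l0)
{
	if (l0->has_prog && l0->prog_act == ACT_DROP)
		return 0;
	return do_pass(l0) == RES_FALLBACK;
}
```

With an L0/L1/L2 stack and a program only on L0, the packet ends up in
L2's local receive without any skb being built at L0 or L1.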

Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 include/linux/filter.h    |  1 +
 include/linux/netdevice.h | 12 ++++++++++++
 net/core/dev.c            | 29 +++++++++++++++++++++++++++++
 net/core/filter.c         | 28 ++++++++++++++++++++++++++++
 4 files changed, 70 insertions(+)

diff --git a/include/linux/filter.h b/include/linux/filter.h
index c73dd73..7cc8e69 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -791,6 +791,7 @@ int xdp_do_generic_redirect(struct net_device *dev, struct sk_buff *skb,
 int xdp_do_redirect(struct net_device *dev,
 		    struct xdp_buff *xdp,
 		    struct bpf_prog *prog);
+rx_handler_result_t xdp_do_pass(struct xdp_buff *xdp);
 void xdp_do_flush_map(void);
 
 void bpf_warn_invalid_xdp_action(u32 act);
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 282e2e9..21f0a9e 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -421,6 +421,14 @@ enum rx_handler_result {
 typedef enum rx_handler_result rx_handler_result_t;
 typedef rx_handler_result_t rx_handler_func_t(struct sk_buff **pskb);
 
+enum rx_xdp_handler_result {
+	RX_XDP_HANDLER_CONSUMED,
+	RX_XDP_HANDLER_DROP,
+	RX_XDP_HANDLER_FALLBACK,
+};
+typedef enum rx_xdp_handler_result rx_xdp_handler_result_t;
+typedef rx_xdp_handler_result_t rx_xdp_handler_func_t(struct net_device *dev,
+						      struct xdp_buff *xdp);
 void __napi_schedule(struct napi_struct *n);
 void __napi_schedule_irqoff(struct napi_struct *n);
 
@@ -1898,6 +1906,7 @@ struct net_device {
 	struct bpf_prog __rcu	*xdp_prog;
 	unsigned long		gro_flush_timeout;
 	rx_handler_func_t __rcu	*rx_handler;
+	rx_xdp_handler_func_t __rcu *rx_xdp_handler;
 	void __rcu		*rx_handler_data;
 
 #ifdef CONFIG_NET_CLS_ACT
@@ -3530,7 +3539,10 @@ bool netdev_is_rx_handler_busy(struct net_device *dev);
 int netdev_rx_handler_register(struct net_device *dev,
 			       rx_handler_func_t *rx_handler,
 			       void *rx_handler_data);
+int netdev_rx_xdp_handler_register(struct net_device *dev,
+				   rx_xdp_handler_func_t *rx_xdp_handler);
 void netdev_rx_handler_unregister(struct net_device *dev);
+void netdev_rx_xdp_handler_unregister(struct net_device *dev);
 
 bool dev_valid_name(const char *name);
 int dev_ioctl(struct net *net, unsigned int cmd, struct ifreq *ifr,
diff --git a/net/core/dev.c b/net/core/dev.c
index a77ce08..b4e8949 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4638,6 +4638,12 @@ bool netdev_is_rx_handler_busy(struct net_device *dev)
 }
 EXPORT_SYMBOL_GPL(netdev_is_rx_handler_busy);
 
+static bool netdev_is_rx_xdp_handler_busy(struct net_device *dev)
+{
+	ASSERT_RTNL();
+	return dev && rtnl_dereference(dev->rx_xdp_handler);
+}
+
 /**
  *	netdev_rx_handler_register - register receive handler
  *	@dev: device to register a handler for
@@ -4670,6 +4676,22 @@ int netdev_rx_handler_register(struct net_device *dev,
 }
 EXPORT_SYMBOL_GPL(netdev_rx_handler_register);
 
+int netdev_rx_xdp_handler_register(struct net_device *dev,
+				   rx_xdp_handler_func_t *rx_xdp_handler)
+{
+	if (netdev_is_rx_xdp_handler_busy(dev))
+		return -EBUSY;
+
+	if (dev->priv_flags & IFF_NO_RX_HANDLER)
+		return -EINVAL;
+
+	rcu_assign_pointer(dev->rx_xdp_handler, rx_xdp_handler);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(netdev_rx_xdp_handler_register);
+
+
 /**
  *	netdev_rx_handler_unregister - unregister receive handler
  *	@dev: device to unregister a handler from
@@ -4692,6 +4714,13 @@ void netdev_rx_handler_unregister(struct net_device *dev)
 }
 EXPORT_SYMBOL_GPL(netdev_rx_handler_unregister);
 
+void netdev_rx_xdp_handler_unregister(struct net_device *dev)
+{
+	ASSERT_RTNL();
+	RCU_INIT_POINTER(dev->rx_xdp_handler, NULL);
+}
+EXPORT_SYMBOL_GPL(netdev_rx_xdp_handler_unregister);
+
 /*
  * Limit the use of PFMEMALLOC reserves to those protocols that implement
  * the special handling of PFMEMALLOC skbs.
diff --git a/net/core/filter.c b/net/core/filter.c
index 587bbfb..9ea3797 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -3312,6 +3312,34 @@ int xdp_do_redirect(struct net_device *dev, struct xdp_buff *xdp,
 }
 EXPORT_SYMBOL_GPL(xdp_do_redirect);
 
+rx_handler_result_t xdp_do_pass(struct xdp_buff *xdp)
+{
+	rx_xdp_handler_result_t ret;
+	rx_xdp_handler_func_t *rx_xdp_handler;
+	struct net_device *dev = xdp->rxq->dev;
+
+	ret = RX_XDP_HANDLER_FALLBACK;
+	rx_xdp_handler = rcu_dereference(dev->rx_xdp_handler);
+
+	if (rx_xdp_handler) {
+		ret = rx_xdp_handler(dev, xdp);
+		switch (ret) {
+		case RX_XDP_HANDLER_CONSUMED:
+			/* Fall through */
+		case RX_XDP_HANDLER_DROP:
+			/* Fall through */
+		case RX_XDP_HANDLER_FALLBACK:
+			break;
+		default:
+			BUG();
+			break;
+		}
+	}
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(xdp_do_pass);
+
 static int xdp_do_generic_redirect_map(struct net_device *dev,
 				       struct sk_buff *skb,
 				       struct xdp_buff *xdp,
-- 
2.7.4



* [RFC PATCH net-next 3/5] macvlan: count the number of vlan in source mode
  2018-08-13  3:05 [RFC PATCH net-next 0/5] XDP rx handler Jason Wang
  2018-08-13  3:05 ` [RFC PATCH net-next 1/5] net: core: generic XDP support for stacked device Jason Wang
  2018-08-13  3:05 ` [RFC PATCH net-next 2/5] net: core: introduce XDP rx handler Jason Wang
@ 2018-08-13  3:05 ` Jason Wang
  2018-08-13  3:05 ` [RFC PATCH net-next 4/5] macvlan: basic XDP support Jason Wang
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Jason Wang @ 2018-08-13  3:05 UTC (permalink / raw)
  To: netdev, linux-kernel; +Cc: ast, daniel, jbrouer, mst, Jason Wang

This patch counts the number of vlans in source mode. The count will
be used to implement the XDP rx handler for macvlan.
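
The bookkeeping can be sketched in user space as below. This is a
hypothetical model (struct port, struct vlan, vlan_add() and
vlan_set_mode() are illustration names, not the macvlan code); the
point it shows is that the old mode has to be compared before it is
overwritten, otherwise a switch into source mode is never counted.

```c
#include <assert.h>

enum mode { MODE_PRIVATE, MODE_BRIDGE, MODE_SOURCE };

struct port {
	unsigned long source_count; /* vlans on this port in source mode */
};

struct vlan {
	enum mode mode;
	struct port *port;
};

/* Models link creation: count the vlan if it starts in source mode. */
static void vlan_add(struct vlan *v, struct port *p, enum mode m)
{
	v->port = p;
	v->mode = m;
	if (m == MODE_SOURCE)
		p->source_count++;
}

/* Models a mode change: adjust the counter, then store the new mode. */
static void vlan_set_mode(struct vlan *v, enum mode new_mode)
{
	if (v->mode == new_mode)
		return;
	if (v->mode == MODE_SOURCE) {
		assert(v->port->source_count > 0);
		v->port->source_count--;
	}
	if (new_mode == MODE_SOURCE)
		v->port->source_count++;
	v->mode = new_mode; /* assigned only after both comparisons */
}
```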

Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 drivers/net/macvlan.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
index cfda146..b7c814d 100644
--- a/drivers/net/macvlan.c
+++ b/drivers/net/macvlan.c
@@ -53,6 +53,7 @@ struct macvlan_port {
 	struct hlist_head	vlan_source_hash[MACVLAN_HASH_SIZE];
 	DECLARE_BITMAP(mc_filter, MACVLAN_MC_FILTER_SZ);
 	unsigned char           perm_addr[ETH_ALEN];
+	unsigned long           source_count;
 };
 
 struct macvlan_source_entry {
@@ -1433,6 +1434,9 @@ int macvlan_common_newlink(struct net *src_net, struct net_device *dev,
 	if (err)
 		goto unregister_netdev;
 
+	if (vlan->mode == MACVLAN_MODE_SOURCE)
+		port->source_count++;
+
 	list_add_tail_rcu(&vlan->list, &port->vlans);
 	netif_stacked_transfer_operstate(lowerdev, dev);
 	linkwatch_fire_event(dev);
@@ -1477,6 +1481,7 @@ static int macvlan_changelink(struct net_device *dev,
 			      struct netlink_ext_ack *extack)
 {
 	struct macvlan_dev *vlan = netdev_priv(dev);
+	struct macvlan_port *port = vlan->port;
 	enum macvlan_mode mode;
 	bool set_mode = false;
 	enum macvlan_macaddr_mode macmode;
@@ -1491,8 +1496,10 @@ static int macvlan_changelink(struct net_device *dev,
 		    (vlan->mode == MACVLAN_MODE_PASSTHRU))
 			return -EINVAL;
 		if (vlan->mode == MACVLAN_MODE_SOURCE &&
-		    vlan->mode != mode)
+		    vlan->mode != mode) {
 			macvlan_flush_sources(vlan->port, vlan);
+			port->source_count--;
+		}
 	}
 
 	if (data && data[IFLA_MACVLAN_FLAGS]) {
@@ -1510,8 +1517,13 @@ static int macvlan_changelink(struct net_device *dev,
 		}
 		vlan->flags = flags;
 	}
-	if (set_mode)
+	if (set_mode) {
+		if (mode == MACVLAN_MODE_SOURCE &&
+		    vlan->mode != mode) {
+			port->source_count++;
+		}
 		vlan->mode = mode;
+	}
 	if (data && data[IFLA_MACVLAN_MACADDR_MODE]) {
 		if (vlan->mode != MACVLAN_MODE_SOURCE)
 			return -EINVAL;
-- 
2.7.4



* [RFC PATCH net-next 4/5] macvlan: basic XDP support
  2018-08-13  3:05 [RFC PATCH net-next 0/5] XDP rx handler Jason Wang
                   ` (2 preceding siblings ...)
  2018-08-13  3:05 ` [RFC PATCH net-next 3/5] macvlan: count the number of vlan in source mode Jason Wang
@ 2018-08-13  3:05 ` Jason Wang
  2018-08-13  3:05 ` [RFC PATCH net-next 5/5] virtio-net: support XDP rx handler Jason Wang
  2018-08-13  3:17 ` [RFC PATCH net-next 0/5] " Jason Wang
  5 siblings, 0 replies; 7+ messages in thread
From: Jason Wang @ 2018-08-13  3:05 UTC (permalink / raw)
  To: netdev, linux-kernel; +Cc: ast, daniel, jbrouer, mst, Jason Wang

This patch implements basic XDP support for macvlan. The
implementation is split into two parts:

1) XDP rx handler of underlay device:

We register an XDP rx handler (macvlan_handle_xdp) on the lower
device. In this handler, the following cases go through the slow
path (RX_XDP_HANDLER_FALLBACK):

- The packet is a multicast packet.
- A vlan is in source mode.
- The destination MAC address does not match any vlan.

If none of the above is true, we can take the XDP path directly. We
switch to the destination vlan's device and pass the XDP buff to its
receive routine (macvlan_receive_xdp).

2) If we find a destination vlan, we try to run its XDP prog.

If the XDP prog returns XDP_PASS, we call xdp_do_pass() to pass the
buff to the upper layer's XDP rx handler. This is needed for e.g.
macvtap to work. If RX_XDP_HANDLER_FALLBACK is returned, we build an
skb and call netif_rx() to finish the receive. Otherwise we just
return the result to the lower device. For XDP_TX, we build an skb and
use the generic XDP transmission routine for simplicity. This could be
optimized on top.
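
The fast-path decision in part 1 can be sketched as a user-space
classifier. This is an illustration under assumptions, not the macvlan
code: struct mvlan, struct mport and classify() are hypothetical, a
linear search stands in for the hash lookup, and passthru mode is left
out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAC_LEN 6

enum path { PATH_XDP, PATH_DROP, PATH_FALLBACK };

/* Hypothetical model of one macvlan device. */
struct mvlan {
	unsigned char mac[MAC_LEN];
	int up; /* non-zero if the device is up */
};

/* Hypothetical model of the port on the lower device. */
struct mport {
	unsigned long source_count; /* vlans in source mode */
	const struct mvlan *vlans;
	size_t nvlans;
};

/* Group bit: least significant bit of the first address octet. */
static int is_multicast(const unsigned char *mac)
{
	return mac[0] & 1;
}

/* Decide how a frame with destination address dest is handled. */
static enum path classify(const struct mport *port, const unsigned char *dest)
{
	size_t i;

	assert(port && dest);
	if (is_multicast(dest))
		return PATH_FALLBACK; /* multicast needs the skb path */
	if (port->source_count)
		return PATH_FALLBACK; /* some vlan is in source mode */
	for (i = 0; i < port->nvlans; i++) {
		if (memcmp(port->vlans[i].mac, dest, MAC_LEN) == 0)
			return port->vlans[i].up ? PATH_XDP : PATH_DROP;
	}
	return PATH_FALLBACK; /* no vlan owns this address */
}
```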

Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 drivers/net/macvlan.c      | 173 ++++++++++++++++++++++++++++++++++++++++++++-
 include/linux/if_macvlan.h |   1 +
 2 files changed, 171 insertions(+), 3 deletions(-)

diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
index b7c814d..42b747c 100644
--- a/drivers/net/macvlan.c
+++ b/drivers/net/macvlan.c
@@ -34,6 +34,7 @@
 #include <net/rtnetlink.h>
 #include <net/xfrm.h>
 #include <linux/netpoll.h>
+#include <linux/bpf.h>
 
 #define MACVLAN_HASH_BITS	8
 #define MACVLAN_HASH_SIZE	(1<<MACVLAN_HASH_BITS)
@@ -436,6 +437,122 @@ static void macvlan_forward_source(struct sk_buff *skb,
 	}
 }
 
+struct sk_buff *macvlan_xdp_build_skb(struct net_device *dev,
+				      struct xdp_buff *xdp)
+{
+	int len;
+	int buflen = xdp->data_end - xdp->data_hard_start;
+	int headroom = xdp->data - xdp->data_hard_start;
+	struct sk_buff *skb;
+
+	len = SKB_DATA_ALIGN(sizeof(struct skb_shared_info)) + headroom +
+	      SKB_DATA_ALIGN(buflen);
+
+	skb = build_skb(xdp->data_hard_start, len);
+	if (!skb)
+		return NULL;
+
+	skb_reserve(skb, headroom);
+	__skb_put(skb, xdp->data_end - xdp->data);
+
+	skb->protocol = eth_type_trans(skb, dev);
+	skb->dev = dev;
+
+	return skb;
+}
+
+static rx_xdp_handler_result_t macvlan_receive_xdp(struct net_device *dev,
+						   struct xdp_buff *xdp)
+{
+	struct macvlan_dev *vlan = netdev_priv(dev);
+	struct bpf_prog *xdp_prog;
+	struct sk_buff *skb;
+	u32 act = XDP_PASS;
+	rx_xdp_handler_result_t ret;
+	int err;
+
+	rcu_read_lock();
+	xdp_prog = rcu_dereference(vlan->xdp_prog);
+
+	if (xdp_prog)
+		act = bpf_prog_run_xdp(xdp_prog, xdp);
+
+	switch (act) {
+	case XDP_PASS:
+		ret = xdp_do_pass(xdp);
+		if (ret != RX_XDP_HANDLER_FALLBACK) {
+			rcu_read_unlock();
+			return ret;
+		}
+		skb = macvlan_xdp_build_skb(dev, xdp);
+		if (!skb) {
+			act = XDP_DROP;
+			break;
+		}
+		rcu_read_unlock();
+		netif_rx(skb);
+		macvlan_count_rx(vlan, skb->len, true, false);
+		goto out;
+	case XDP_TX:
+		skb = macvlan_xdp_build_skb(dev, xdp);
+		if (!skb) {
+			act = XDP_DROP;
+			break;
+		}
+		generic_xdp_tx(skb, xdp_prog);
+		break;
+	case XDP_REDIRECT:
+		err = xdp_do_redirect(dev, xdp, xdp_prog);
+		xdp_do_flush_map();
+		if (err)
+			act = XDP_DROP;
+		break;
+	case XDP_DROP:
+		break;
+	default:
+		bpf_warn_invalid_xdp_action(act);
+		break;
+	}
+
+	rcu_read_unlock();
+out:
+	if (act == XDP_DROP)
+		return RX_XDP_HANDLER_DROP;
+
+	return RX_XDP_HANDLER_CONSUMED;
+}
+
+/* called under rcu_read_lock() from XDP handler */
+static rx_xdp_handler_result_t macvlan_handle_xdp(struct net_device *dev,
+						  struct xdp_buff *xdp)
+{
+	const struct ethhdr *eth = (const struct ethhdr *)xdp->data;
+	struct macvlan_port *port;
+	struct macvlan_dev *vlan;
+
+	if (is_multicast_ether_addr(eth->h_dest))
+		return RX_XDP_HANDLER_FALLBACK;
+
+	port = macvlan_port_get_rcu(dev);
+	if (port->source_count)
+		return RX_XDP_HANDLER_FALLBACK;
+
+	if (macvlan_passthru(port))
+		vlan = list_first_or_null_rcu(&port->vlans,
+					      struct macvlan_dev, list);
+	else
+		vlan = macvlan_hash_lookup(port, eth->h_dest);
+
+	if (!vlan)
+		return RX_XDP_HANDLER_FALLBACK;
+
+	dev = vlan->dev;
+	if (unlikely(!(dev->flags & IFF_UP)))
+		return RX_XDP_HANDLER_DROP;
+
+	return macvlan_receive_xdp(dev, xdp);
+}
+
 /* called under rcu_read_lock() from netif_receive_skb */
 static rx_handler_result_t macvlan_handle_frame(struct sk_buff **pskb)
 {
@@ -1089,6 +1206,44 @@ static int macvlan_dev_get_iflink(const struct net_device *dev)
 	return vlan->lowerdev->ifindex;
 }
 
+static int macvlan_xdp_set(struct net_device *dev, struct bpf_prog *prog,
+			   struct netlink_ext_ack *extack)
+{
+	struct macvlan_dev *vlan = netdev_priv(dev);
+	struct bpf_prog *old_prog = rtnl_dereference(vlan->xdp_prog);
+
+	rcu_assign_pointer(vlan->xdp_prog, prog);
+
+	if (old_prog)
+		bpf_prog_put(old_prog);
+
+	return 0;
+}
+
+static u32 macvlan_xdp_query(struct net_device *dev)
+{
+	struct macvlan_dev *vlan = netdev_priv(dev);
+	const struct bpf_prog *xdp_prog = rtnl_dereference(vlan->xdp_prog);
+
+	if (xdp_prog)
+		return xdp_prog->aux->id;
+
+	return 0;
+}
+
+static int macvlan_xdp(struct net_device *dev, struct netdev_bpf *xdp)
+{
+	switch (xdp->command) {
+	case XDP_SETUP_PROG:
+		return macvlan_xdp_set(dev, xdp->prog, xdp->extack);
+	case XDP_QUERY_PROG:
+		xdp->prog_id = macvlan_xdp_query(dev);
+		return 0;
+	default:
+		return -EINVAL;
+	}
+}
+
 static const struct ethtool_ops macvlan_ethtool_ops = {
 	.get_link		= ethtool_op_get_link,
 	.get_link_ksettings	= macvlan_ethtool_get_link_ksettings,
@@ -1121,6 +1276,7 @@ static const struct net_device_ops macvlan_netdev_ops = {
 #endif
 	.ndo_get_iflink		= macvlan_dev_get_iflink,
 	.ndo_features_check	= passthru_features_check,
+	.ndo_bpf		= macvlan_xdp,
 };
 
 void macvlan_common_setup(struct net_device *dev)
@@ -1173,10 +1329,20 @@ static int macvlan_port_create(struct net_device *dev)
 	INIT_WORK(&port->bc_work, macvlan_process_broadcast);
 
 	err = netdev_rx_handler_register(dev, macvlan_handle_frame, port);
-	if (err)
+	if (err) {
 		kfree(port);
-	else
-		dev->priv_flags |= IFF_MACVLAN_PORT;
+		goto out;
+	}
+
+	err = netdev_rx_xdp_handler_register(dev, macvlan_handle_xdp);
+	if (err) {
+		netdev_rx_handler_unregister(dev);
+		kfree(port);
+		goto out;
+	}
+
+	dev->priv_flags |= IFF_MACVLAN_PORT;
+out:
 	return err;
 }
 
@@ -1187,6 +1353,7 @@ static void macvlan_port_destroy(struct net_device *dev)
 
 	dev->priv_flags &= ~IFF_MACVLAN_PORT;
 	netdev_rx_handler_unregister(dev);
+	netdev_rx_xdp_handler_unregister(dev);
 
 	/* After this point, no packet can schedule bc_work anymore,
 	 * but we need to cancel it and purge left skbs if any.
diff --git a/include/linux/if_macvlan.h b/include/linux/if_macvlan.h
index 2e55e4c..7c7059b 100644
--- a/include/linux/if_macvlan.h
+++ b/include/linux/if_macvlan.h
@@ -34,6 +34,7 @@ struct macvlan_dev {
 #ifdef CONFIG_NET_POLL_CONTROLLER
 	struct netpoll		*netpoll;
 #endif
+	struct bpf_prog __rcu   *xdp_prog;
 };
 
 static inline void macvlan_count_rx(const struct macvlan_dev *vlan,
-- 
2.7.4



* [RFC PATCH net-next 5/5] virtio-net: support XDP rx handler
  2018-08-13  3:05 [RFC PATCH net-next 0/5] XDP rx handler Jason Wang
                   ` (3 preceding siblings ...)
  2018-08-13  3:05 ` [RFC PATCH net-next 4/5] macvlan: basic XDP support Jason Wang
@ 2018-08-13  3:05 ` Jason Wang
  2018-08-13  3:17 ` [RFC PATCH net-next 0/5] " Jason Wang
  5 siblings, 0 replies; 7+ messages in thread
From: Jason Wang @ 2018-08-13  3:05 UTC (permalink / raw)
  To: netdev, linux-kernel; +Cc: ast, daniel, jbrouer, mst, Jason Wang

This patch adds XDP rx handler support to virtio-net. This is
straightforward: just call xdp_do_pass() and behave according to its
return value.
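
The driver-side dispatch can be summarised by the following user-space
sketch. The names (enum driver_step, on_xdp_pass()) are hypothetical
and the steps are abstractions of what a driver would do, not the
virtio-net code itself.

```c
#include <assert.h>

/* Model of the three XDP rx handler results. */
enum res { RES_CONSUMED, RES_DROP, RES_FALLBACK };

/* What the driver does next after its own program returned XDP_PASS. */
enum driver_step { STEP_DONE, STEP_FREE_BUFFER, STEP_BUILD_SKB };

static enum driver_step on_xdp_pass(enum res r)
{
	switch (r) {
	case RES_DROP:
		return STEP_FREE_BUFFER; /* same handling as XDP_DROP */
	case RES_FALLBACK:
		return STEP_BUILD_SKB;   /* continue on the normal skb path */
	case RES_CONSUMED:
		return STEP_DONE;        /* the handler owns the buffer now */
	}
	assert(!"unknown handler result");
	return STEP_FREE_BUFFER;
}
```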

Testing was done using XDP_DROP (xdp1) for macvlan on top of
virtio-net. SKB mode reached ~1.2Mpps while native XDP mode reached
~2.2Mpps, an improvement of about 83%.
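
For reference, the percentage follows from the two rounded rates. A
minimal sketch of the arithmetic (improvement_percent() is a
hypothetical helper, not part of the series):

```c
#include <assert.h>

/* Relative improvement of 'improved' over 'base', in percent. */
static double improvement_percent(double base, double improved)
{
	assert(base > 0.0);
	return (improved - base) / base * 100.0;
}
```

With the rounded figures above, improvement_percent(1.2, 2.2) comes
out at roughly 83.3.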

Note: for this RFC, only the mergeable buffer case is implemented.

Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 drivers/net/virtio_net.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 62311dd..1e22ad9 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -777,6 +777,7 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
 	rcu_read_lock();
 	xdp_prog = rcu_dereference(rq->xdp_prog);
 	if (xdp_prog) {
+		rx_xdp_handler_result_t ret;
 		struct xdp_frame *xdpf;
 		struct page *xdp_page;
 		struct xdp_buff xdp;
@@ -825,6 +826,15 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
 
 		switch (act) {
 		case XDP_PASS:
+			ret = xdp_do_pass(&xdp);
+			if (ret == RX_XDP_HANDLER_DROP)
+				goto drop;
+			if (ret != RX_XDP_HANDLER_FALLBACK) {
+				if (unlikely(xdp_page != page))
+					put_page(page);
+				rcu_read_unlock();
+				goto xdp_xmit;
+			}
 			/* recalculate offset to account for any header
 			 * adjustments. Note other cases do not build an
 			 * skb and avoid using offset
@@ -881,6 +891,7 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
 		case XDP_ABORTED:
 			trace_xdp_exception(vi->dev, xdp_prog, act);
 			/* fall through */
+drop:
 		case XDP_DROP:
 			if (unlikely(xdp_page != page))
 				__free_pages(xdp_page, 0);
-- 
2.7.4



* Re: [RFC PATCH net-next 0/5] XDP rx handler
  2018-08-13  3:05 [RFC PATCH net-next 0/5] XDP rx handler Jason Wang
                   ` (4 preceding siblings ...)
  2018-08-13  3:05 ` [RFC PATCH net-next 5/5] virtio-net: support XDP rx handler Jason Wang
@ 2018-08-13  3:17 ` Jason Wang
  5 siblings, 0 replies; 7+ messages in thread
From: Jason Wang @ 2018-08-13  3:17 UTC (permalink / raw)
  To: netdev, linux-kernel; +Cc: ast, daniel, jbrouer, mst



On 2018-08-13 11:05, Jason Wang wrote:
> Hi:
>
> This series tries to implement XDP support for rx handler. This would
> be useful for doing native XDP on stacked device like macvlan, bridge
> or even bond.
>
> The idea is simple, let stacked device register a XDP rx handler. And
> when driver return XDP_PASS, it will call a new helper xdp_do_pass()
> which will try to pass XDP buff to XDP rx handler directly. XDP rx
> handler may then decide how to proceed, it could consume the buff, ask
> driver to drop the packet or ask the driver to fallback to normal skb
> path.
>
> A sample XDP rx handler was implemented for macvlan. And virtio-net
> (mergeable buffer case) was converted to call xdp_do_pass() as an
> example. For ease of comparison, generic XDP support for rx handler was
> also implemented.
>
> Compared to skb mode XDP on macvlan, native XDP on macvlan (XDP_DROP)
> shows about 83% improvement.
>
> Please review.
>
> Thanks
>
> Jason Wang (5):
>    net: core: generic XDP support for stacked device
>    net: core: introduce XDP rx handler
>    macvlan: count the number of vlan in source mode
>    macvlan: basic XDP support
>    virtio-net: support XDP rx handler
>
>   drivers/net/macvlan.c      | 189 +++++++++++++++++++++++++++++++++++++++++++--
>   drivers/net/virtio_net.c   |  11 +++
>   include/linux/filter.h     |   1 +
>   include/linux/if_macvlan.h |   1 +
>   include/linux/netdevice.h  |  12 +++
>   net/core/dev.c             |  34 ++++++++
>   net/core/filter.c          |  28 +++++++
>   7 files changed, 271 insertions(+), 5 deletions(-)
>

It looks like a patch is missing. Let me post V2.

Thanks

