From mboxrd@z Thu Jan  1 00:00:00 1970
From: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Subject: Re: [PATCH v3 bpf-next 5/8] veth: Add ndo_xdp_xmit
Date: Tue, 24 Jul 2018 11:24:30 +0900
Message-ID: <1c45d12e-f88e-f0be-9b5c-910d6bf85d8f@lab.ntt.co.jp>
References: <20180722151308.5480-1-toshiaki.makita1@gmail.com>
 <20180722151308.5480-6-toshiaki.makita1@gmail.com>
 <20180723180246.1836bc11@cakuba.netronome.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Cc: Toshiaki Makita <toshiaki.makita1@gmail.com>,
        netdev@vger.kernel.org, Alexei Starovoitov <ast@kernel.org>,
        Daniel Borkmann <daniel@iogearbox.net>,
        Jesper Dangaard Brouer <brouer@redhat.com>
To: Jakub Kicinski <jakub.kicinski@netronome.com>
Return-path: <netdev-owner@vger.kernel.org>
Received: from tama50.ecl.ntt.co.jp ([129.60.39.147]:45814 "EHLO
        tama50.ecl.ntt.co.jp" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S2388194AbeGXD3Z (ORCPT
        <rfc822;netdev@vger.kernel.org>); Mon, 23 Jul 2018 23:29:25 -0400
In-Reply-To: <20180723180246.1836bc11@cakuba.netronome.com>
Content-Language: en-US
Sender: netdev-owner@vger.kernel.org
List-ID: <netdev.vger.kernel.org>

On 2018/07/24 10:02, Jakub Kicinski wrote:
> On Mon, 23 Jul 2018 00:13:05 +0900, Toshiaki Makita wrote:
>> From: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
>>
>> This allows NIC's XDP to redirect packets to veth. The destination veth
>> device enqueues redirected packets to the napi ring of its peer, then
>> they are processed by XDP on its peer veth device.
>> This can be thought of as calling another XDP program from an XDP
>> program using REDIRECT, when the peer enables driver XDP.
>>
>> Note that when the peer veth device does not set driver xdp, redirected
>> packets will be dropped because the peer is not ready for NAPI.
...
>> +static int veth_xdp_xmit(struct net_device *dev, int n,
>> +			 struct xdp_frame **frames, u32 flags)
>> +{
>> +	struct veth_priv *rcv_priv, *priv = netdev_priv(dev);
>> +	struct net_device *rcv;
>> +	int i, drops = 0;
>> +
>> +	if (unlikely(flags & ~XDP_XMIT_FLAGS_MASK))
>> +		return -EINVAL;
>> +
>> +	rcv = rcu_dereference(priv->peer);
>> +	if (unlikely(!rcv))
>> +		return -ENXIO;
>> +
>> +	rcv_priv = netdev_priv(rcv);
>> +	/* xdp_ring is initialized on receive side? */
>> +	if (!rcu_access_pointer(rcv_priv->xdp_prog))
>> +		return -ENXIO;
>> +
>> +	spin_lock(&rcv_priv->xdp_ring.producer_lock);
>> +	for (i = 0; i < n; i++) {
>> +		struct xdp_frame *frame = frames[i];
>> +		void *ptr = veth_xdp_to_ptr(frame);
>> +
>> +		if (unlikely(xdp_ok_fwd_dev(rcv, frame->len) ||
>> +			     __ptr_ring_produce(&rcv_priv->xdp_ring, ptr))) {
> 
> Would you mind sparing a few more words how this is safe vs the
> .ndo_close() on the peer?  Personally I'm a bit uncomfortable with the
> IFF_UP check in xdp_ok_fwd_dev(), I'm not sure what's supposed to
> guarantee the device doesn't go down right after that check, or is
> already down, but netdev->flags are not atomic...  

Actually it is guarded by RCU. When the device is closed,
rcv_priv->xdp_prog is set to NULL, and synchronize_net() is called from
within netif_napi_del(). Only after that is the ptr_ring cleaned up.
xdp_ok_fwd_dev() does the same check as the non-XDP case, but it may not
be appropriate here because, as you say, the IFF_UP check cannot be
relied on.
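
To make the ordering concrete, here is a rough single-threaded userspace
model of it. This is only a sketch, not the driver code: every model_*
name is made up for illustration, and the readers counter is just a
stand-in for RCU read-side sections (a real grace period would wait
rather than return false).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_ring {
	void **slots;
	size_t size;
	size_t head;
};

struct model_peer {
	void *xdp_prog;		/* non-NULL while the ring may be used */
	struct model_ring ring;
	int readers;		/* stand-in for RCU read-side sections */
};

static void model_open(struct model_peer *p, size_t size, void *prog)
{
	p->ring.slots = calloc(size, sizeof(void *));
	assert(p->ring.slots);
	p->ring.size = size;
	p->ring.head = 0;
	p->readers = 0;
	p->xdp_prog = prog;	/* publish last, once the ring is ready */
}

/* Returns 0 on success, -1 if the peer is not ready or the ring is full. */
static int model_xmit(struct model_peer *p, void *frame)
{
	int ret = -1;

	p->readers++;		/* roughly rcu_read_lock() */
	if (p->xdp_prog && p->ring.head < p->ring.size) {
		p->ring.slots[p->ring.head++] = frame;
		ret = 0;
	}
	p->readers--;		/* roughly rcu_read_unlock() */
	return ret;
}

/* Returns true once the ring has been torn down. */
static bool model_close(struct model_peer *p)
{
	p->xdp_prog = NULL;	/* 1. unpublish: new senders bail out */
	if (p->readers != 0)	/* 2. a grace period would wait here */
		return false;
	free(p->ring.slots);	/* 3. only now clean up the ring */
	p->ring.slots = NULL;
	p->ring.size = 0;
	p->ring.head = 0;
	return true;
}
```

The point is only the order in model_close(): the pointer that senders
test is cleared first, in-flight senders are waited for, and the ring is
freed last, so a sender that saw a non-NULL xdp_prog never touches a
freed ring.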

> 
>> +			xdp_return_frame_rx_napi(frame);
>> +			drops++;
>> +		}
>> +	}
>> +	spin_unlock(&rcv_priv->xdp_ring.producer_lock);
>> +
>> +	if (flags & XDP_XMIT_FLUSH)
>> +		__veth_xdp_flush(rcv_priv);
>> +
>> +	return n - drops;
>> +}
>> +
>>  static struct sk_buff *veth_xdp_rcv_one(struct veth_priv *priv,
>>  					struct xdp_frame *frame)
>>  {
>> @@ -760,6 +804,7 @@ static const struct net_device_ops veth_netdev_ops = {
>>  	.ndo_features_check	= passthru_features_check,
>>  	.ndo_set_rx_headroom	= veth_set_rx_headroom,
>>  	.ndo_bpf		= veth_xdp,
>> +	.ndo_xdp_xmit		= veth_xdp_xmit,
>>  };
>>  
>>  #define VETH_FEATURES (NETIF_F_SG | NETIF_F_FRAGLIST | NETIF_F_HW_CSUM | \
> 
> 
> 

-- 
Toshiaki Makita