All of lore.kernel.org
 help / color / mirror / Atom feed
From: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
To: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Toshiaki Makita <toshiaki.makita1@gmail.com>,
	netdev@vger.kernel.org, Tariq Toukan <tariqt@mellanox.com>
Subject: Re: [PATCH RFC 6/9] veth: Add ndo_xdp_xmit
Date: Tue, 1 May 2018 10:02:01 +0900	[thread overview]
Message-ID: <6ed7c0c5-d6cb-7870-38da-4bb49707a63c@lab.ntt.co.jp> (raw)
In-Reply-To: <20180430192709.1fc47265@redhat.com>

On 2018/05/01 2:27, Jesper Dangaard Brouer wrote:
> On Thu, 26 Apr 2018 19:52:40 +0900
> Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> wrote:
> 
>> On 2018/04/26 5:24, Jesper Dangaard Brouer wrote:
>>> On Tue, 24 Apr 2018 23:39:20 +0900
>>> Toshiaki Makita <toshiaki.makita1@gmail.com> wrote:
>>>   
>>>> +static int veth_xdp_xmit(struct net_device *dev, struct xdp_frame *frame)
>>>> +{
>>>> +	struct veth_priv *rcv_priv, *priv = netdev_priv(dev);
>>>> +	int headroom = frame->data - (void *)frame;
>>>> +	struct net_device *rcv;
>>>> +	int err = 0;
>>>> +
>>>> +	rcv = rcu_dereference(priv->peer);
>>>> +	if (unlikely(!rcv))
>>>> +		return -ENXIO;
>>>> +
>>>> +	rcv_priv = netdev_priv(rcv);
>>>> +	/* xdp_ring is initialized on receive side? */
>>>> +	if (rcu_access_pointer(rcv_priv->xdp_prog)) {
>>>> +		err = xdp_ok_fwd_dev(rcv, frame->len);
>>>> +		if (unlikely(err))
>>>> +			return err;
>>>> +
>>>> +		err = veth_xdp_enqueue(rcv_priv, veth_xdp_to_ptr(frame));
>>>> +	} else {
>>>> +		struct sk_buff *skb;
>>>> +
>>>> +		skb = veth_build_skb(frame, headroom, frame->len, 0);
>>>> +		if (unlikely(!skb))
>>>> +			return -ENOMEM;
>>>> +
>>>> +		/* Get page ref in case skb is dropped in netif_rx.
>>>> +		 * The caller is responsible for freeing the page on error.
>>>> +		 */
>>>> +		get_page(virt_to_page(frame->data));  
>>>
>>> I'm not sure you can make this assumption, that xdp_frames coming from
>>> another device driver uses a refcnt based memory model. But maybe I'm
>>> confused, as this looks like an SKB receive path, but in the
>>> ndo_xdp_xmit().  
>>
>> I find this path similar to cpumap, which creates skb from redirected
>> xdp frame. Once it is converted to skb, skb head is freed by
>> page_frag_free, so anyway I needed to get the refcount here regardless
>> of memory model.
> 
> Yes I know, I wrote cpumap ;-)
> 
> First of all, I don't want to see such xdp_frame to SKB conversion code
> in every driver.  Because that increase the chances of errors.  And
> when looking at the details, then it seems that you have made the
> mistake of making it possible to leak xdp_frame info to the SKB (which
> cpumap takes into account).

Do you mean leaving xdp_frame in skb->head is leaking something? how?

> 
> Second, I think the refcnt scheme here is wrong. The xdp_frame should
> be "owned" by XDP and have the proper refcnt to deliver it directly to
> the network stack.
> 
> Third, if we choose that we want a fallback, in-case XDP is not enabled
> on egress dev (but it have an ndo_xdp_xmit), then it should be placed
> in the generic/core code.  E.g. __bpf_tx_xdp_map() could look at the
> return code from dev->netdev_ops->ndo_xdp() and create an SKB.  (Hint,
> this would make it easy to implement TX bulking towards the dev).

Right, this is a much cleaner way.
Although I feel like we should add this fallback for veth because it
requires something which is different from other drivers (enabling XDP
on the peer device of the egress device), I'll drop the part for now. It
should not be resolved in the driver code.

-- 
Toshiaki Makita

  reply	other threads:[~2018-05-01  1:02 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-04-24 14:39 [PATCH RFC 0/9] veth: Driver XDP Toshiaki Makita
2018-04-24 14:39 ` [PATCH RFC 1/9] net: Export skb_headers_offset_update and skb_copy_header Toshiaki Makita
2018-04-24 14:39 ` [PATCH RFC 2/9] veth: Add driver XDP Toshiaki Makita
2018-04-25 20:39   ` Jesper Dangaard Brouer
2018-04-26 10:46     ` Toshiaki Makita
2018-04-24 14:39 ` [PATCH RFC 3/9] veth: Avoid drops by oversized packets when XDP is enabled Toshiaki Makita
2018-04-24 14:39 ` [PATCH RFC 4/9] veth: Use NAPI for XDP Toshiaki Makita
2018-05-01  7:50   ` Jesper Dangaard Brouer
2018-05-01  8:02     ` Toshiaki Makita
2018-05-01  8:43       ` Jesper Dangaard Brouer
2018-05-02  3:37         ` Toshiaki Makita
2018-04-24 14:39 ` [PATCH RFC 5/9] veth: Handle xdp_frame in xdp napi ring Toshiaki Makita
2018-04-24 14:39 ` [PATCH RFC 6/9] veth: Add ndo_xdp_xmit Toshiaki Makita
2018-04-25 20:24   ` Jesper Dangaard Brouer
2018-04-26 10:52     ` Toshiaki Makita
2018-04-30 17:27       ` Jesper Dangaard Brouer
2018-05-01  1:02         ` Toshiaki Makita [this message]
2018-05-01  8:14           ` Jesper Dangaard Brouer
2018-05-02  3:33             ` Toshiaki Makita
2018-05-02  5:56               ` Jesper Dangaard Brouer
2018-04-24 14:39 ` [PATCH RFC 7/9] veth: Add XDP TX and REDIRECT Toshiaki Makita
2018-04-24 14:39 ` [PATCH RFC 8/9] veth: Avoid per-packet spinlock of XDP napi ring on dequeueing Toshiaki Makita
2018-04-24 14:39 ` [PATCH RFC 9/9] veth: Avoid per-packet spinlock of XDP napi ring on enqueueing Toshiaki Makita

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6ed7c0c5-d6cb-7870-38da-4bb49707a63c@lab.ntt.co.jp \
    --to=makita.toshiaki@lab.ntt.co.jp \
    --cc=brouer@redhat.com \
    --cc=netdev@vger.kernel.org \
    --cc=tariqt@mellanox.com \
    --cc=toshiaki.makita1@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.