From: Jesper Dangaard Brouer <brouer@redhat.com>
To: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Cc: netdev@vger.kernel.org, jaswinder.singh@linaro.org,
	ard.biesheuvel@linaro.org, masami.hiramatsu@linaro.org,
	arnd@arndb.de, bjorn.topel@intel.com, magnus.karlsson@intel.com,
	daniel@iogearbox.net, ast@kernel.org,
	jesus.sanchez-palencia@intel.com, vinicius.gomes@intel.com,
	makita.toshiaki@lab.ntt.co.jp, brouer@redhat.com,
	Tariq Toukan <tariqt@mellanox.com>,
	Tariq Toukan <ttoukan.linux@gmail.com>
Subject: Re: [net-next, PATCH 1/2, v3] net: socionext: different approach on DMA
Date: Mon, 1 Oct 2018 11:26:31 +0200	[thread overview]
Message-ID: <20181001112631.4a1fbb62@redhat.com> (raw)
In-Reply-To: <1538220482-16129-2-git-send-email-ilias.apalodimas@linaro.org>


On Sat, 29 Sep 2018 14:28:01 +0300 Ilias Apalodimas <ilias.apalodimas@linaro.org> wrote:

> +static void *netsec_alloc_rx_data(struct netsec_priv *priv,
> +				  dma_addr_t *dma_handle, u16 *desc_len)
> +{
> +	size_t len = priv->ndev->mtu + ETH_HLEN + 2 * VLAN_HLEN + NET_SKB_PAD +
> +		NET_IP_ALIGN;
> +	dma_addr_t mapping;
> +	void *buf;
> +
> +	len += SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
> +	len = SKB_DATA_ALIGN(len);
> +
> +	buf = napi_alloc_frag(len);

Using napi_alloc_frag here ^^^^

> +	if (!buf)
> +		return NULL;
> +
> +	mapping = dma_map_single(priv->dev, buf, len, DMA_FROM_DEVICE);
> +	if (unlikely(dma_mapping_error(priv->dev, mapping)))
> +		goto err_out;
> +
> +	*dma_handle = mapping;
> +	*desc_len = len;
> +
> +	return buf;
> +
> +err_out:
> +	skb_free_frag(buf);
> +	return NULL;
> +}

Hmmm, you are using napi_alloc_frag() in the above code, which behind
your back allocates order-3 pages (32 KB of memory, i.e. 8 order-0 pages).

This violates at least two XDP principles:

#1: You are NOT using order-0 page-based allocations for XDP.

Notice, I'm not saying one page per packet: ixgbe + i40e already deviate
from that, and by practical code example it is now acceptable to split
the order-0 page and store two RX frames per order-0 page (4096 bytes).
(To make this fit, the XDP headroom has to be reduced to 192 bytes,
which killed the idea of placing the SKB in this area.)
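
As a back-of-the-envelope sketch (approximate 64-bit numbers, an
illustration only, not your driver's actual layout), the split works
out roughly like this:

enum {
	RX_BUF_SIZE  = 4096 / 2, /* two RX frames per order-0 page */
	RX_HEADROOM  = 192,      /* reduced from XDP_PACKET_HEADROOM (256) */
	RX_SHINFO    = 320,      /* ~SKB_DATA_ALIGN(sizeof(struct skb_shared_info)) */
	RX_DATA_ROOM = RX_BUF_SIZE - RX_HEADROOM - RX_SHINFO, /* = 1536 */
};
/* 1536 bytes still fits a 1514-byte Ethernet frame at 1500 MTU, but
 * there is no slack left for the full 256-byte headroom plus an
 * embedded SKB. */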

#2: You have allocations on the XDP fast-path.

The REAL secret behind XDP performance is avoiding allocations on the
fast-path.  While I just told you to use the page allocator and order-0
pages, doing that naively will actually kill performance.  Thus, to make
this fast, you need a driver-local recycle scheme that avoids going
through the page allocator; that is what makes XDP_DROP and XDP_TX
extremely fast.  For the XDP_REDIRECT action (which you seem to be
interested in, as it is needed for AF_XDP), there is an
xdp_return_frame() API that can make this fast.
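
As an illustration only (the mydrv_* names are hypothetical, not your
driver's code), this is roughly where xdp_return_frame*() ends up being
called in a v4.18-style batched ndo_xdp_xmit():

#include <net/xdp.h>

/* Sketch: frames that cannot be queued on the XDP TX ring are handed
 * back through the xdp_return_frame*() API instead of put_page(). */
static int mydrv_xdp_xmit(struct net_device *dev, int n,
			  struct xdp_frame **frames, u32 flags)
{
	struct mydrv_tx_ring *ring = mydrv_select_xdp_ring(dev); /* hypothetical */
	int i, drops = 0;

	for (i = 0; i < n; i++) {
		struct xdp_frame *xdpf = frames[i];

		if (mydrv_queue_xdp_frame(ring, xdpf)) {  /* hypothetical */
			/* Cheap return to the registered memory model
			 * (e.g. page_pool); safe from NAPI context. */
			xdp_return_frame_rx_napi(xdpf);
			drops++;
		}
	}
	return n - drops;
}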

To avoid every driver inventing its own driver-local page-recycle
cache (which many do today), we/I have created the page pool API.
See include/net/page_pool.h, and look at how the mlx5 driver uses it
in v4.18, links [1][2][3].  Do notice that mlx5 ALSO has a driver
recycle scheme on top, which Tariq is working on removing or
generalizing.  Also note that mlx5 does not yet use the DMA mapping
feature that page_pool provides.  (Contact me if you want to use
page_pool for handling DMA mapping; we might need to export
__page_pool_clean_page and call it before the XDP_PASS action.)


[1] https://github.com/torvalds/linux/blob/v4.18/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c#L226
[2] https://github.com/torvalds/linux/blob/v4.18/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c#L255
[3] https://github.com/torvalds/linux/blob/v4.18/drivers/net/ethernet/mellanox/mlx5/core/en_main.c#L598-L618
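
For reference, a minimal sketch of what using the page pool API looks
like (the driver struct and mydrv_* helpers are hypothetical; DMA
mapping by the pool is left out, as discussed above):

#include <net/page_pool.h>
#include <net/xdp.h>

struct mydrv_rx_ring {                  /* hypothetical driver struct */
	struct page_pool *page_pool;
	struct xdp_rxq_info xdp_rxq;
};

static int mydrv_create_rx_pool(struct mydrv_rx_ring *ring,
				struct device *dev, int ring_size)
{
	struct page_pool_params pp_params = {
		.order     = 0,              /* order-0 pages only */
		.flags     = 0,              /* no DMA mapping by the pool */
		.pool_size = ring_size,
		.nid       = NUMA_NO_NODE,
		.dev       = dev,
		.dma_dir   = DMA_FROM_DEVICE,
	};

	ring->page_pool = page_pool_create(&pp_params);
	if (IS_ERR(ring->page_pool))
		return PTR_ERR(ring->page_pool);

	/* Tie this RX queue's buffers to the pool, so that
	 * xdp_return_frame() recycles them back into it. */
	return xdp_rxq_info_reg_mem_model(&ring->xdp_rxq,
					  MEM_TYPE_PAGE_POOL,
					  ring->page_pool);
}

/* RX refill: order-0 page, recycled from the pool when possible */
static struct page *mydrv_get_rx_page(struct mydrv_rx_ring *ring)
{
	return page_pool_dev_alloc_pages(ring->page_pool);
}

/* XDP_DROP fast path: hand the page straight back to the pool's
 * in-softirq cache instead of the page allocator. */
static void mydrv_drop_rx_page(struct mydrv_rx_ring *ring, struct page *page)
{
	page_pool_recycle_direct(ring->page_pool, page);
}
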
-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

