From: Alexander Lobakin <aleksander.lobakin@intel.com>
To: "David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>
Cc: Alexander Lobakin <aleksander.lobakin@intel.com>,
	Christoph Hellwig <hch@lst.de>,
	Marek Szyprowski <m.szyprowski@samsung.com>,
	Robin Murphy <robin.murphy@arm.com>,
	Joerg Roedel <joro@8bytes.org>, Will Deacon <will@kernel.org>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	Magnus Karlsson <magnus.karlsson@intel.com>,
	Maciej Fijalkowski <maciej.fijalkowski@intel.com>,
	Alexander Duyck <alexanderduyck@fb.com>,
	bpf@vger.kernel.org, netdev@vger.kernel.org,
	iommu@lists.linux.dev, linux-kernel@vger.kernel.org
Subject: [PATCH net-next v3 0/7] dma: skip calling no-op sync ops when possible
Date: Wed, 14 Feb 2024 17:21:54 +0100
Message-ID: <20240214162201.4168778-1-aleksander.lobakin@intel.com>

The series grew from Eric's idea and patch at [0]. The idea of using the
shortcut for direct DMA as well belongs to Chris.

When an architecture doesn't need DMA synchronization and the buffer is
not an SWIOTLB buffer, most of the time the kernel and the drivers end up
calling DMA sync operations for nothing.
Even when DMA is direct, this involves a long ladder of non-inline calls
and eats a bunch of CPU time. With an IOMMU, this results in indirect
calls on the hotpath just to check what is already known and return.
XSk has been using a custom shortcut for that for quite some time.
I recently wanted to introduce a similar one for Page Pool. Let's combine
all this into one generic shortcut, which would cover all DMA sync ops
and all types of DMA (direct, IOMMU, ...).
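
To illustrate the idea only, here is a minimal userspace sketch of such a
shortcut. Everything in it (struct mock_device, the mock_*() helpers, the
slow_syncs counter) is hypothetical and made up for this example; it is
not the code from the series, just a model of "cache the verdict once,
test one bit on the hotpath".

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model, NOT the kernel's struct device. */
struct mock_device {
	bool dma_coherent;	  /* arch needs no cache maintenance here */
	bool uses_swiotlb;	  /* some mapping was bounced via SWIOTLB */
	bool skip_sync;		  /* cached "sync ops are no-ops" verdict */
	unsigned long slow_syncs; /* how many times the slow path ran */
};

/* Stand-in for the non-inline call ladder behind a real sync op. */
static void mock_sync_slow(struct mock_device *dev, void *buf, size_t len)
{
	(void)buf;
	(void)len;
	dev->slow_syncs++;
}

/* Compute the verdict once, during the initial setup. */
static void mock_setup_dma(struct mock_device *dev)
{
	dev->skip_sync = dev->dma_coherent && !dev->uses_swiotlb;
}

/* Once a buffer gets bounced, syncs matter again: clear the shortcut. */
static void mock_mark_swiotlb_used(struct mock_device *dev)
{
	dev->uses_swiotlb = true;
	dev->skip_sync = false;
}

/* Hotpath wrapper: a single bit test instead of the whole ladder. */
static inline void mock_sync_for_cpu(struct mock_device *dev, void *buf,
				     size_t len)
{
	if (dev->skip_sync)
		return;

	mock_sync_slow(dev, buf, len);
}
```

In this model, a coherent device that never bounced a buffer returns
right after the bit test, while the first SWIOTLB mapping clears the bit
and sends every later sync down the slow path.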

* #1 adds stub inlines to be able to skip DMA sync ops or even compile
     them out when not needed.
* #2 adds the generic shortcut and enables it for direct DMA.
* #3 adds the ability to skip DMA syncs behind an IOMMU.
* #4-5 are just cleanups for Page Pool to avoid merge conflicts in the
     future.
* #6 checks for the shortcut as early as possible in the Page Pool code to
     make sure no cycles are wasted.
* #7 replaces XSk's shortcut with the generic one.
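
As a rough illustration of what "stub inlines" in #1 can look like, here
is a small userspace sketch. STUB_DMA_NEED_SYNC, struct stub_dev and the
stub_*() functions are hypothetical stand-ins invented for this example
(the macro only models a Kconfig symbol); the real patch may be shaped
differently.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a Kconfig symbol: build with
 * -DSTUB_DMA_NEED_SYNC=1 to model a system that needs DMA syncs.
 */
#ifndef STUB_DMA_NEED_SYNC
#define STUB_DMA_NEED_SYNC 0
#endif

/* Hypothetical device model, made up for this sketch. */
struct stub_dev {
	bool skip_sync;		  /* runtime shortcut bit */
	unsigned long real_syncs; /* how many real sync calls were made */
};

/* Stand-in for the out-of-line sync implementation. */
static void stub_real_sync_for_device(struct stub_dev *dev, void *buf,
				      size_t len)
{
	(void)buf;
	(void)len;
	dev->real_syncs++;
}

/* The build-time check lives inside the skip helper itself. */
static inline bool stub_dma_skip_sync(const struct stub_dev *dev)
{
	if (!STUB_DMA_NEED_SYNC)
		return true;	/* constant: callers compile the sync out */

	return dev->skip_sync;
}

/* Static inline wrapper instead of a macro, so arguments stay typed. */
static inline void stub_sync_for_device(struct stub_dev *dev, void *buf,
					size_t len)
{
	if (!stub_dma_skip_sync(dev))
		stub_real_sync_for_device(dev, buf, len);
}
```

With the macro left at 0, stub_dma_skip_sync() is a compile-time
constant and the compiler is free to drop the call to the real sync
function; with it set to 1, the decision falls back to the per-device
bit.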

On a 100G NIC, the result is +3-5% for direct DMA and +10-11% for IOMMU.
As a bonus, XSk core now allows batched buffer allocations for IOMMU
setups.
If the shortcut is not available on some system, there should be no
visible performance regressions.

[0] https://lore.kernel.org/netdev/20221115182841.2640176-1-edumazet@google.com

Alexander Lobakin (7):
  dma: compile-out DMA sync op calls when not used
  dma: avoid redundant calls for sync operations
  iommu/dma: avoid expensive indirect calls for sync operations
  page_pool: make sure frag API fields don't span between cachelines
  page_pool: don't use driver-set flags field directly
  page_pool: check for DMA sync shortcut earlier
  xsk: use generic DMA sync shortcut instead of a custom one

 kernel/dma/Kconfig                            |  4 +
 include/net/page_pool/types.h                 | 21 ++++-
 include/linux/device.h                        |  5 ++
 include/linux/dma-map-ops.h                   | 21 +++++
 include/linux/dma-mapping.h                   | 84 ++++++++++++++++---
 include/net/xdp_sock_drv.h                    |  7 +-
 include/net/xsk_buff_pool.h                   | 13 +--
 drivers/base/dd.c                             |  2 +
 drivers/iommu/dma-iommu.c                     |  3 +-
 drivers/net/ethernet/engleder/tsnep_main.c    |  2 +-
 .../net/ethernet/freescale/dpaa2/dpaa2-xsk.c  |  2 +-
 drivers/net/ethernet/intel/i40e/i40e_xsk.c    |  2 +-
 drivers/net/ethernet/intel/ice/ice_xsk.c      |  2 +-
 drivers/net/ethernet/intel/igc/igc_main.c     |  2 +-
 drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c  |  2 +-
 .../ethernet/mellanox/mlx5/core/en/xsk/rx.c   |  4 +-
 .../net/ethernet/mellanox/mlx5/core/en_rx.c   |  2 +-
 drivers/net/ethernet/netronome/nfp/nfd3/xsk.c |  2 +-
 .../net/ethernet/stmicro/stmmac/stmmac_main.c |  2 +-
 kernel/dma/mapping.c                          | 59 ++++++++++---
 kernel/dma/swiotlb.c                          |  8 ++
 net/core/page_pool.c                          | 67 +++++++++------
 net/xdp/xsk_buff_pool.c                       | 29 +------
 23 files changed, 239 insertions(+), 106 deletions(-)

---
From v2[1]:
* #1:
  * use two tabs for indenting multi-line function prototypes (Chris);
* #2:
  * make shortcut clearing function generic and move it out of the
    SWIOTLB code (Chris);
  * remove dma_set_skip_sync(): use direct assignment during the initial
    setup, not used anywhere else (Chris);
  * commitmsg: remove "NIC" and the workaround paragraph (Chris).

From v1[2]:
* #1:
  * use static inlines instead of macros (Chris);
  * move CONFIG_DMA_NEED_SYNC check into dma_skip_sync() (Robin);
* #2:
  * use a new dma_map_ops flag instead of new callback, assume the same
    conditions as for direct DMA are enough (Petr, Robin);
  * add more code comments to make sure the whole idea and path are
    clear (Petr, Robin, Chris);
* #2, #3: correct the Git tags and the authorship a bit.

Not addressed in v2:
* #1:
  * dma_sync_*range_*() are still wrapped, as some subsystems may want
    to call the underscored versions directly (e.g. Page Pool);
* #2:
  * the new dev->dma_skip_sync bit is still preferred over checking for
    READ_ONCE(dev->dma_uses_io_tlb) + dev_is_dma_coherent() on hotpath
    as a faster solution.

[1] https://lore.kernel.org/netdev/20240205110426.764393-1-aleksander.lobakin@intel.com
[2] https://lore.kernel.org/netdev/20240126135456.704351-1-aleksander.lobakin@intel.com
-- 
2.43.0


Thread overview: 20+ messages
2024-02-14 16:21 Alexander Lobakin [this message]
2024-02-14 16:21 ` [PATCH net-next v3 1/7] dma: compile-out DMA sync op calls when not used Alexander Lobakin
2024-02-14 17:20   ` Robin Murphy
2024-02-15  5:06     ` Christoph Hellwig
2024-02-19 12:53     ` Alexander Lobakin
2024-02-26 16:27       ` Robin Murphy
2024-02-14 18:09   ` Robin Murphy
2024-02-15  5:06     ` Christoph Hellwig
2024-02-14 16:21 ` [PATCH net-next v3 2/7] dma: avoid redundant calls for sync operations Alexander Lobakin
2024-02-14 17:55   ` Robin Murphy
2024-02-15  5:08     ` Christoph Hellwig
2024-02-15 11:36       ` Robin Murphy
2024-02-19 12:49     ` Alexander Lobakin
2024-02-26 15:45       ` Robin Murphy
2024-02-14 16:21 ` [PATCH net-next v3 3/7] iommu/dma: avoid expensive indirect " Alexander Lobakin
2024-02-14 17:58   ` Robin Murphy
2024-02-14 16:21 ` [PATCH net-next v3 4/7] page_pool: make sure frag API fields don't span between cachelines Alexander Lobakin
2024-02-14 16:21 ` [PATCH net-next v3 5/7] page_pool: don't use driver-set flags field directly Alexander Lobakin
2024-02-14 16:22 ` [PATCH net-next v3 6/7] page_pool: check for DMA sync shortcut earlier Alexander Lobakin
2024-02-14 16:22 ` [PATCH net-next v3 7/7] xsk: use generic DMA sync shortcut instead of a custom one Alexander Lobakin
