* [PATCH net-next 0/3] net_prefetch API
@ 2020-08-26 12:54 Tariq Toukan
2020-08-26 12:54 ` [PATCH net-next 1/3] net: Take common prefetch code structure into a function Tariq Toukan
` (3 more replies)
0 siblings, 4 replies; 7+ messages in thread
From: Tariq Toukan @ 2020-08-26 12:54 UTC (permalink / raw)
To: David S. Miller
Cc: netdev, Moshe Shemesh, Saeed Mahameed, Jakub Kicinski, Tariq Toukan
Hi,
This patchset adds a common net API for L1 cacheline size-aware prefetch.
Patch 1 introduces the common API in net and aligns the drivers to use it.
Patches 2 and 3 add usage in mlx4 and mlx5 Eth drivers.
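To illustrate what "cacheline size-aware prefetch" means outside the kernel, here is a minimal userspace sketch. It assumes GCC or Clang (for `__builtin_prefetch`), and the names `SKETCH_L1_CACHE_BYTES`, `sketch_prefetch_count()`, `sketch_net_prefetch()` and `sketch_net_prefetchw()` are hypothetical stand-ins. The real helpers are the ones patch 1 adds to include/linux/netdevice.h.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's L1_CACHE_BYTES; 64 is an assumption. */
#ifndef SKETCH_L1_CACHE_BYTES
#define SKETCH_L1_CACHE_BYTES 64
#endif

/* Number of prefetches each helper below issues: 2 when lines are < 128 bytes. */
static inline int sketch_prefetch_count(void)
{
#if SKETCH_L1_CACHE_BYTES < 128
	return 2;
#else
	return 1;
#endif
}

/* Read-intent prefetch of the line at p, plus the next line on small-line CPUs. */
static inline void sketch_net_prefetch(const void *p)
{
	assert(p != NULL);
	__builtin_prefetch(p, 0);
#if SKETCH_L1_CACHE_BYTES < 128
	__builtin_prefetch((const unsigned char *)p + SKETCH_L1_CACHE_BYTES, 0);
#endif
}

/* Write-intent variant, mirroring the prefetch()/prefetchw() split. */
static inline void sketch_net_prefetchw(const void *p)
{
	assert(p != NULL);
	__builtin_prefetch(p, 1);
#if SKETCH_L1_CACHE_BYTES < 128
	__builtin_prefetch((const unsigned char *)p + SKETCH_L1_CACHE_BYTES, 1);
#endif
}
```

A prefetch is only a hint, so calling either helper on a buffer has no visible effect beyond cache state. Building with `-DSKETCH_L1_CACHE_BYTES=128` compiles the second prefetch out.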
Series generated against net-next commit:
079f921e9f4d Merge tag 'batadv-next-for-davem-20200824' of git://git.open-mesh.org/linux-merge
Thanks,
Tariq.
Tariq Toukan (3):
net: Take common prefetch code structure into a function
net/mlx5e: RX, Add a prefetch command for small L1_CACHE_BYTES
net/mlx4_en: RX, Add a prefetch command for small L1_CACHE_BYTES
drivers/net/ethernet/chelsio/cxgb3/sge.c | 5 +----
drivers/net/ethernet/hisilicon/hns/hns_enet.c | 5 +----
drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 5 +----
drivers/net/ethernet/intel/fm10k/fm10k_main.c | 5 +----
drivers/net/ethernet/intel/i40e/i40e_txrx.c | 12 ++++--------
drivers/net/ethernet/intel/iavf/iavf_txrx.c | 11 +++--------
drivers/net/ethernet/intel/ice/ice_txrx.c | 10 ++--------
drivers/net/ethernet/intel/igb/igb_main.c | 10 ++--------
drivers/net/ethernet/intel/igc/igc_main.c | 10 ++--------
drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 11 +++--------
.../net/ethernet/intel/ixgbevf/ixgbevf_main.c | 11 +++--------
drivers/net/ethernet/mellanox/mlx4/en_rx.c | 2 +-
drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c | 4 ++--
.../net/ethernet/mellanox/mlx5/core/en/xsk/rx.c | 4 ++--
drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 13 ++++++-------
.../ethernet/mellanox/mlx5/core/en_selftest.c | 3 +--
include/linux/netdevice.h | 16 ++++++++++++++++
17 files changed, 51 insertions(+), 86 deletions(-)
--
2.21.0
* [PATCH net-next 1/3] net: Take common prefetch code structure into a function
From: Tariq Toukan @ 2020-08-26 12:54 UTC (permalink / raw)
To: David S. Miller
Cc: netdev, Moshe Shemesh, Saeed Mahameed, Jakub Kicinski,
Tariq Toukan, Jakub Kicinski
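Many device drivers use the same prefetch code structure to
deal with a small L1 cacheline size: a second prefetch is issued
when L1_CACHE_BYTES < 128.
Move this code into a common helper, net_prefetch() (with a
net_prefetchw() counterpart), and call it from the drivers.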
Suggested-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
---
drivers/net/ethernet/chelsio/cxgb3/sge.c | 5 +----
drivers/net/ethernet/hisilicon/hns/hns_enet.c | 5 +----
drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 5 +----
drivers/net/ethernet/intel/fm10k/fm10k_main.c | 5 +----
drivers/net/ethernet/intel/i40e/i40e_txrx.c | 12 ++++--------
drivers/net/ethernet/intel/iavf/iavf_txrx.c | 11 +++--------
drivers/net/ethernet/intel/ice/ice_txrx.c | 10 ++--------
drivers/net/ethernet/intel/igb/igb_main.c | 10 ++--------
drivers/net/ethernet/intel/igc/igc_main.c | 10 ++--------
drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 11 +++--------
.../net/ethernet/intel/ixgbevf/ixgbevf_main.c | 11 +++--------
include/linux/netdevice.h | 16 ++++++++++++++++
12 files changed, 39 insertions(+), 72 deletions(-)
diff --git a/drivers/net/ethernet/chelsio/cxgb3/sge.c b/drivers/net/ethernet/chelsio/cxgb3/sge.c
index 6dabbf1502c7..ee6188dea705 100644
--- a/drivers/net/ethernet/chelsio/cxgb3/sge.c
+++ b/drivers/net/ethernet/chelsio/cxgb3/sge.c
@@ -2372,10 +2372,7 @@ static int process_responses(struct adapter *adap, struct sge_qset *qs,
if (fl->use_pages) {
void *addr = fl->sdesc[fl->cidx].pg_chunk.va;
- prefetch(addr);
-#if L1_CACHE_BYTES < 128
- prefetch(addr + L1_CACHE_BYTES);
-#endif
+ net_prefetch(addr);
__refill_fl(adap, fl);
if (lro > 0) {
lro_add_page(adap, qs, fl,
diff --git a/drivers/net/ethernet/hisilicon/hns/hns_enet.c b/drivers/net/ethernet/hisilicon/hns/hns_enet.c
index 23f278e46975..3af33ade7b60 100644
--- a/drivers/net/ethernet/hisilicon/hns/hns_enet.c
+++ b/drivers/net/ethernet/hisilicon/hns/hns_enet.c
@@ -557,10 +557,7 @@ static int hns_nic_poll_rx_skb(struct hns_nic_ring_data *ring_data,
va = (unsigned char *)desc_cb->buf + desc_cb->page_offset;
/* prefetch first cache line of first page */
- prefetch(va);
-#if L1_CACHE_BYTES < 128
- prefetch(va + L1_CACHE_BYTES);
-#endif
+ net_prefetch(va);
skb = *out_skb = napi_alloc_skb(&ring_data->napi,
HNS_RX_HEAD_SIZE);
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
index 87776ce3539b..1a1ba6a41bfe 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
@@ -3091,10 +3091,7 @@ static int hns3_handle_rx_bd(struct hns3_enet_ring *ring)
* lines. In such a case, single fetch would suffice to cache in the
* relevant part of the header.
*/
- prefetch(ring->va);
-#if L1_CACHE_BYTES < 128
- prefetch(ring->va + L1_CACHE_BYTES);
-#endif
+ net_prefetch(ring->va);
if (!skb) {
ret = hns3_alloc_skb(ring, length, ring->va);
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_main.c b/drivers/net/ethernet/intel/fm10k/fm10k_main.c
index d88dd41a9442..99b8252eb969 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_main.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_main.c
@@ -310,10 +310,7 @@ static struct sk_buff *fm10k_fetch_rx_buffer(struct fm10k_ring *rx_ring,
rx_buffer->page_offset;
/* prefetch first cache line of first page */
- prefetch(page_addr);
-#if L1_CACHE_BYTES < 128
- prefetch((void *)((u8 *)page_addr + L1_CACHE_BYTES));
-#endif
+ net_prefetch(page_addr);
/* allocate a skb to store the frags */
skb = napi_alloc_skb(&rx_ring->q_vector->napi,
diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
index 3e5c566ceb01..432a984ac335 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
@@ -1992,10 +1992,8 @@ static struct sk_buff *i40e_construct_skb(struct i40e_ring *rx_ring,
struct sk_buff *skb;
/* prefetch first cache line of first page */
- prefetch(xdp->data);
-#if L1_CACHE_BYTES < 128
- prefetch(xdp->data + L1_CACHE_BYTES);
-#endif
+ net_prefetch(xdp->data);
+
/* Note, we get here by enabling legacy-rx via:
*
* ethtool --set-priv-flags <dev> legacy-rx on
@@ -2078,10 +2076,8 @@ static struct sk_buff *i40e_build_skb(struct i40e_ring *rx_ring,
* likely have a consumer accessing first few bytes of meta
* data, and then actual data.
*/
- prefetch(xdp->data_meta);
-#if L1_CACHE_BYTES < 128
- prefetch(xdp->data_meta + L1_CACHE_BYTES);
-#endif
+ net_prefetch(xdp->data_meta);
+
/* build an skb around the page buffer */
skb = build_skb(xdp->data_hard_start, truesize);
if (unlikely(!skb))
diff --git a/drivers/net/ethernet/intel/iavf/iavf_txrx.c b/drivers/net/ethernet/intel/iavf/iavf_txrx.c
index ca041b39ffda..256fa07d54d5 100644
--- a/drivers/net/ethernet/intel/iavf/iavf_txrx.c
+++ b/drivers/net/ethernet/intel/iavf/iavf_txrx.c
@@ -1309,10 +1309,7 @@ static struct sk_buff *iavf_construct_skb(struct iavf_ring *rx_ring,
return NULL;
/* prefetch first cache line of first page */
va = page_address(rx_buffer->page) + rx_buffer->page_offset;
- prefetch(va);
-#if L1_CACHE_BYTES < 128
- prefetch(va + L1_CACHE_BYTES);
-#endif
+ net_prefetch(va);
/* allocate a skb to store the frags */
skb = __napi_alloc_skb(&rx_ring->q_vector->napi,
@@ -1376,10 +1373,8 @@ static struct sk_buff *iavf_build_skb(struct iavf_ring *rx_ring,
return NULL;
/* prefetch first cache line of first page */
va = page_address(rx_buffer->page) + rx_buffer->page_offset;
- prefetch(va);
-#if L1_CACHE_BYTES < 128
- prefetch(va + L1_CACHE_BYTES);
-#endif
+ net_prefetch(va);
+
/* build an skb around the page buffer */
skb = build_skb(va - IAVF_SKB_PAD, truesize);
if (unlikely(!skb))
diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
index 9d0d6b0025cf..d2fca4a52f51 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx.c
+++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
@@ -919,10 +919,7 @@ ice_build_skb(struct ice_ring *rx_ring, struct ice_rx_buf *rx_buf,
* likely have a consumer accessing first few bytes of meta
* data, and then actual data.
*/
- prefetch(xdp->data_meta);
-#if L1_CACHE_BYTES < 128
- prefetch((void *)(xdp->data + L1_CACHE_BYTES));
-#endif
+ net_prefetch(xdp->data_meta);
/* build an skb around the page buffer */
skb = build_skb(xdp->data_hard_start, truesize);
if (unlikely(!skb))
@@ -964,10 +961,7 @@ ice_construct_skb(struct ice_ring *rx_ring, struct ice_rx_buf *rx_buf,
struct sk_buff *skb;
/* prefetch first cache line of first page */
- prefetch(xdp->data);
-#if L1_CACHE_BYTES < 128
- prefetch((void *)(xdp->data + L1_CACHE_BYTES));
-#endif /* L1_CACHE_BYTES */
+ net_prefetch(xdp->data);
/* allocate a skb to store the frags */
skb = __napi_alloc_skb(&rx_ring->q_vector->napi, ICE_RX_HDR_SIZE,
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 4f05f6efe6af..698bb6a4b088 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -8047,10 +8047,7 @@ static struct sk_buff *igb_construct_skb(struct igb_ring *rx_ring,
struct sk_buff *skb;
/* prefetch first cache line of first page */
- prefetch(va);
-#if L1_CACHE_BYTES < 128
- prefetch(va + L1_CACHE_BYTES);
-#endif
+ net_prefetch(va);
/* allocate a skb to store the frags */
skb = napi_alloc_skb(&rx_ring->q_vector->napi, IGB_RX_HDR_LEN);
@@ -8104,10 +8101,7 @@ static struct sk_buff *igb_build_skb(struct igb_ring *rx_ring,
struct sk_buff *skb;
/* prefetch first cache line of first page */
- prefetch(va);
-#if L1_CACHE_BYTES < 128
- prefetch(va + L1_CACHE_BYTES);
-#endif
+ net_prefetch(va);
/* build an skb around the page buffer */
skb = build_skb(va - IGB_SKB_PAD, truesize);
diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index 9593aa4eea36..c6968fdb6caa 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -1550,10 +1550,7 @@ static struct sk_buff *igc_build_skb(struct igc_ring *rx_ring,
struct sk_buff *skb;
/* prefetch first cache line of first page */
- prefetch(va);
-#if L1_CACHE_BYTES < 128
- prefetch(va + L1_CACHE_BYTES);
-#endif
+ net_prefetch(va);
/* build an skb around the page buffer */
skb = build_skb(va - IGC_SKB_PAD, truesize);
@@ -1589,10 +1586,7 @@ static struct sk_buff *igc_construct_skb(struct igc_ring *rx_ring,
struct sk_buff *skb;
/* prefetch first cache line of first page */
- prefetch(va);
-#if L1_CACHE_BYTES < 128
- prefetch(va + L1_CACHE_BYTES);
-#endif
+ net_prefetch(va);
/* allocate a skb to store the frags */
skb = napi_alloc_skb(&rx_ring->q_vector->napi, IGC_RX_HDR_LEN);
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 2f8a4cfc5fa1..f4f2198f388b 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -2095,10 +2095,8 @@ static struct sk_buff *ixgbe_construct_skb(struct ixgbe_ring *rx_ring,
struct sk_buff *skb;
/* prefetch first cache line of first page */
- prefetch(xdp->data);
-#if L1_CACHE_BYTES < 128
- prefetch(xdp->data + L1_CACHE_BYTES);
-#endif
+ net_prefetch(xdp->data);
+
/* Note, we get here by enabling legacy-rx via:
*
* ethtool --set-priv-flags <dev> legacy-rx on
@@ -2161,10 +2159,7 @@ static struct sk_buff *ixgbe_build_skb(struct ixgbe_ring *rx_ring,
* likely have a consumer accessing first few bytes of meta
* data, and then actual data.
*/
- prefetch(xdp->data_meta);
-#if L1_CACHE_BYTES < 128
- prefetch(xdp->data_meta + L1_CACHE_BYTES);
-#endif
+ net_prefetch(xdp->data_meta);
/* build an skb to around the page buffer */
skb = build_skb(xdp->data_hard_start, truesize);
diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
index a428113e6d54..50afec43e001 100644
--- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
+++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
@@ -866,10 +866,8 @@ struct sk_buff *ixgbevf_construct_skb(struct ixgbevf_ring *rx_ring,
struct sk_buff *skb;
/* prefetch first cache line of first page */
- prefetch(xdp->data);
-#if L1_CACHE_BYTES < 128
- prefetch(xdp->data + L1_CACHE_BYTES);
-#endif
+ net_prefetch(xdp->data);
+
/* Note, we get here by enabling legacy-rx via:
*
* ethtool --set-priv-flags <dev> legacy-rx on
@@ -947,10 +945,7 @@ static struct sk_buff *ixgbevf_build_skb(struct ixgbevf_ring *rx_ring,
* have a consumer accessing first few bytes of meta data,
* and then actual data.
*/
- prefetch(xdp->data_meta);
-#if L1_CACHE_BYTES < 128
- prefetch(xdp->data_meta + L1_CACHE_BYTES);
-#endif
+ net_prefetch(xdp->data_meta);
/* build an skb around the page buffer */
skb = build_skb(xdp->data_hard_start, truesize);
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index b0e303f6603f..b8abe1d7aa0b 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2193,6 +2193,22 @@ int netdev_get_num_tc(struct net_device *dev)
return dev->num_tc;
}
+static inline void net_prefetch(void *p)
+{
+ prefetch(p);
+#if L1_CACHE_BYTES < 128
+ prefetch((u8 *)p + L1_CACHE_BYTES);
+#endif
+}
+
+static inline void net_prefetchw(void *p)
+{
+ prefetchw(p);
+#if L1_CACHE_BYTES < 128
+ prefetchw((u8 *)p + L1_CACHE_BYTES);
+#endif
+}
+
void netdev_unbind_sb_channel(struct net_device *dev,
struct net_device *sb_dev);
int netdev_bind_sb_channel_queue(struct net_device *dev,
--
2.21.0
* [PATCH net-next 2/3] net/mlx5e: RX, Add a prefetch command for small L1_CACHE_BYTES
From: Tariq Toukan @ 2020-08-26 12:54 UTC (permalink / raw)
To: David S. Miller
Cc: netdev, Moshe Shemesh, Saeed Mahameed, Jakub Kicinski, Tariq Toukan
A single cacheline might not contain the packet header for
small L1_CACHE_BYTES values.
Use net_prefetch() and net_prefetchw(), which issue an additional
prefetch in this case.
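As a rough worked example of why one line may not be enough: an Ethernet header (14 bytes) followed by IPv4 (20 bytes, no options) and TCP (20 bytes, no options) is 54 bytes, and with IPv6 (40 bytes) instead of IPv4 it is 74 bytes. The helper below, with the hypothetical name `lines_spanned()`, counts how many cache lines such a header touches for a given line size and start address. It is a userspace sketch for illustration only and is not part of this series.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative header sizes in bytes (no IP or TCP options assumed). */
enum {
	SKETCH_ETH_HLEN  = 14,
	SKETCH_IPV4_HLEN = 20,
	SKETCH_IPV6_HLEN = 40,
	SKETCH_TCP_HLEN  = 20,
};

/* Number of cache lines touched by len bytes starting at byte address addr. */
static size_t lines_spanned(size_t addr, size_t len, size_t line_size)
{
	assert(line_size > 0 && len > 0);

	size_t first = addr / line_size;
	size_t last = (addr + len - 1) / line_size;

	return last - first + 1;
}
```

For instance, `lines_spanned(0, 54, 32)` is 2 and `lines_spanned(0, 54, 64)` is 1, while the 74-byte IPv6/TCP case gives `lines_spanned(0, 74, 64) == 2`. A header that does not start on a line boundary can also straddle two 64-byte lines, e.g. `lines_spanned(40, 54, 64) == 2`.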
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
---
drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c | 4 ++--
drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c | 4 ++--
drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 13 ++++++-------
.../net/ethernet/mellanox/mlx5/core/en_selftest.c | 3 +--
4 files changed, 11 insertions(+), 13 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
index 0e6946fc121f..4bcb73a5522f 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
@@ -201,7 +201,7 @@ static void mlx5e_xdp_mpwqe_session_start(struct mlx5e_xdpsq *sq)
pi = mlx5e_xdpsq_get_next_pi(sq, MLX5_SEND_WQE_MAX_WQEBBS);
session->wqe = MLX5E_TX_FETCH_WQE(sq, pi);
- prefetchw(session->wqe->data);
+ net_prefetchw(session->wqe->data);
session->ds_count = MLX5E_XDP_TX_EMPTY_DS_COUNT;
session->pkt_count = 0;
@@ -322,7 +322,7 @@ mlx5e_xmit_xdp_frame(struct mlx5e_xdpsq *sq, struct mlx5e_xdp_xmit_data *xdptxd,
struct mlx5e_xdpsq_stats *stats = sq->stats;
- prefetchw(wqe);
+ net_prefetchw(wqe);
if (unlikely(dma_len < MLX5E_XDP_MIN_INLINE || sq->hw_mtu < dma_len)) {
stats->err++;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
index a33a1f762c70..786fedf52436 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
@@ -49,7 +49,7 @@ struct sk_buff *mlx5e_xsk_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq,
xdp->data_end = xdp->data + cqe_bcnt32;
xdp_set_data_meta_invalid(xdp);
xsk_buff_dma_sync_for_cpu(xdp);
- prefetch(xdp->data);
+ net_prefetch(xdp->data);
rcu_read_lock();
consumed = mlx5e_xdp_handle(rq, NULL, &cqe_bcnt32, xdp);
@@ -100,7 +100,7 @@ struct sk_buff *mlx5e_xsk_skb_from_cqe_linear(struct mlx5e_rq *rq,
xdp->data_end = xdp->data + cqe_bcnt;
xdp_set_data_meta_invalid(xdp);
xsk_buff_dma_sync_for_cpu(xdp);
- prefetch(xdp->data);
+ net_prefetch(xdp->data);
if (unlikely(get_cqe_opcode(cqe) != MLX5_CQE_RESP_SEND)) {
rq->stats->wqe_err++;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index 65828af120b7..228fd775bcd5 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -30,7 +30,6 @@
* SOFTWARE.
*/
-#include <linux/prefetch.h>
#include <linux/ip.h>
#include <linux/ipv6.h>
#include <linux/tcp.h>
@@ -1141,8 +1140,8 @@ mlx5e_skb_from_cqe_linear(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe,
dma_sync_single_range_for_cpu(rq->pdev, di->addr, wi->offset,
frag_size, DMA_FROM_DEVICE);
- prefetchw(va); /* xdp_frame data area */
- prefetch(data);
+ net_prefetchw(va); /* xdp_frame data area */
+ net_prefetch(data);
rcu_read_lock();
mlx5e_fill_xdp_buff(rq, va, rx_headroom, cqe_bcnt, &xdp);
@@ -1184,7 +1183,7 @@ mlx5e_skb_from_cqe_nonlinear(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe,
return NULL;
}
- prefetchw(skb->data);
+ net_prefetchw(skb->data);
while (byte_cnt) {
u16 frag_consumed_bytes =
@@ -1399,7 +1398,7 @@ mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *w
return NULL;
}
- prefetchw(skb->data);
+ net_prefetchw(skb->data);
if (unlikely(frag_offset >= PAGE_SIZE)) {
di++;
@@ -1452,8 +1451,8 @@ mlx5e_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi,
dma_sync_single_range_for_cpu(rq->pdev, di->addr, head_offset,
frag_size, DMA_FROM_DEVICE);
- prefetchw(va); /* xdp_frame data area */
- prefetch(data);
+ net_prefetchw(va); /* xdp_frame data area */
+ net_prefetch(data);
rcu_read_lock();
mlx5e_fill_xdp_buff(rq, va, rx_headroom, cqe_bcnt32, &xdp);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_selftest.c b/drivers/net/ethernet/mellanox/mlx5/core/en_selftest.c
index 46790216ce86..ce8ab1f01876 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_selftest.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_selftest.c
@@ -30,7 +30,6 @@
* SOFTWARE.
*/
-#include <linux/prefetch.h>
#include <linux/ip.h>
#include <linux/udp.h>
#include <net/udp.h>
@@ -115,7 +114,7 @@ static struct sk_buff *mlx5e_test_get_udp_skb(struct mlx5e_priv *priv)
return NULL;
}
- prefetchw(skb->data);
+ net_prefetchw(skb->data);
skb_reserve(skb, NET_IP_ALIGN);
/* Reserve for ethernet and IP header */
--
2.21.0
* [PATCH net-next 3/3] net/mlx4_en: RX, Add a prefetch command for small L1_CACHE_BYTES
From: Tariq Toukan @ 2020-08-26 12:54 UTC (permalink / raw)
To: David S. Miller
Cc: netdev, Moshe Shemesh, Saeed Mahameed, Jakub Kicinski, Tariq Toukan
A single cacheline might not contain the packet header for
small L1_CACHE_BYTES values.
Use net_prefetchw(), which issues an additional prefetch
in this case.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
---
drivers/net/ethernet/mellanox/mlx4/en_rx.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
index b50c567ef508..99d7737e8ad6 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
@@ -705,7 +705,7 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int bud
frags = ring->rx_info + (index << priv->log_rx_info);
va = page_address(frags[0].page) + frags[0].page_offset;
- prefetchw(va);
+ net_prefetchw(va);
/*
* make sure we read the CQE after we read the ownership bit
*/
--
2.21.0
* Re: [PATCH net-next 3/3] net/mlx4_en: RX, Add a prefetch command for small L1_CACHE_BYTES
From: Eric Dumazet @ 2020-08-26 14:55 UTC (permalink / raw)
To: Tariq Toukan, David S. Miller
Cc: netdev, Moshe Shemesh, Saeed Mahameed, Jakub Kicinski
On 8/26/20 5:54 AM, Tariq Toukan wrote:
> A single cacheline might not contain the packet header for
> small L1_CACHE_BYTES values.
> Use net_prefetch() as it issues an additional prefetch
> in this case.
>
> Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
> Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
> ---
> drivers/net/ethernet/mellanox/mlx4/en_rx.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> index b50c567ef508..99d7737e8ad6 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> +++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> @@ -705,7 +705,7 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int bud
>
> frags = ring->rx_info + (index << priv->log_rx_info);
> va = page_address(frags[0].page) + frags[0].page_offset;
> - prefetchw(va);
> + net_prefetchw(va);
> /*
> * make sure we read the CQE after we read the ownership bit
> */
>
Why would these cache lines be written next? Presumably we read the headers (pulled into skb->head).
Really, using prefetch() for the about-to-be-read packet is too late anyway for current CPUs.
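The timing concern above can be illustrated with a general pattern: a prefetch has more time to complete when it is issued for the next item while the current one is still being processed. The userspace sketch below shows that pattern with a read-intent prefetch (`rw = 0` in `__builtin_prefetch`, available in GCC and Clang). It is not taken from mlx4 or any other driver, and `struct sketch_pkt` and `sum_first_bytes()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical packet descriptor used only for this illustration. */
struct sketch_pkt {
	const unsigned char *data;
	size_t len;
};

/*
 * Sum the first byte of each packet. While packet i is processed, a
 * read-intent prefetch is issued for packet i + 1, so the hint is given
 * one iteration before the data is needed.
 */
static unsigned long sum_first_bytes(const struct sketch_pkt *pkts, size_t n)
{
	unsigned long sum = 0;

	for (size_t i = 0; i < n; i++) {
		if (i + 1 < n)
			__builtin_prefetch(pkts[i + 1].data, 0);
		if (pkts[i].len > 0)
			sum += pkts[i].data[0];
	}
	return sum;
}
```

Whether one iteration of lead time is enough depends on the CPU and on how much work each iteration does, so the distance is something to measure rather than assume.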
* Re: [PATCH net-next 3/3] net/mlx4_en: RX, Add a prefetch command for small L1_CACHE_BYTES
From: Saeed Mahameed @ 2020-08-26 17:09 UTC (permalink / raw)
To: eric.dumazet, tariqt, davem; +Cc: netdev, kuba, moshe
On Wed, 2020-08-26 at 07:55 -0700, Eric Dumazet wrote:
>
> On 8/26/20 5:54 AM, Tariq Toukan wrote:
> > A single cacheline might not contain the packet header for
> > small L1_CACHE_BYTES values.
> > Use net_prefetch() as it issues an additional prefetch
> > in this case.
> >
> > Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
> > Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
> > ---
> > drivers/net/ethernet/mellanox/mlx4/en_rx.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> > b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> > index b50c567ef508..99d7737e8ad6 100644
> > --- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> > +++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> > @@ -705,7 +705,7 @@ int mlx4_en_process_rx_cq(struct net_device
> > *dev, struct mlx4_en_cq *cq, int bud
> >
> > frags = ring->rx_info + (index << priv->log_rx_info);
> > va = page_address(frags[0].page) +
> > frags[0].page_offset;
> > - prefetchw(va);
> > + net_prefetchw(va);
> > /*
> > * make sure we read the CQE after we read the
> > ownership bit
> > */
> >
>
> Why would these cache lines be written next? Presumably we read the
> headers (pulled into skb->head).
>
XDP
> Really, using prefetch() for the about-to-be-read packet is too late
> anyway for current CPUs.
>
* Re: [PATCH net-next 0/3] net_prefetch API
From: David Miller @ 2020-08-26 22:56 UTC (permalink / raw)
To: tariqt; +Cc: netdev, moshe, saeedm, kuba
From: Tariq Toukan <tariqt@mellanox.com>
Date: Wed, 26 Aug 2020 15:54:15 +0300
> This patchset adds a common net API for L1 cacheline size-aware prefetch.
>
> Patch 1 introduces the common API in net and aligns the drivers to use it.
> Patches 2 and 3 add usage in mlx4 and mlx5 Eth drivers.
>
> Series generated against net-next commit:
> 079f921e9f4d Merge tag 'batadv-next-for-davem-20200824' of git://git.open-mesh.org/linux-merge
Effectively a nice little cleanup, series applied, thanks.