From mboxrd@z Thu Jan 1 00:00:00 1970 From: Yongseok Koh Subject: Re: [PATCH] net/mlx5: fix instruction hotspot on replenishing Rx buffer Date: Wed, 9 Jan 2019 09:56:50 +0000 Message-ID: <45D5A06B-4014-428E-A588-6E188C87A5AA@mellanox.com> References: <20190109085426.39965-1-yskoh@mellanox.com> <20190109095205.us53xaocvokx4jog@glumotte.dev.6wind.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Cc: David Marchand , Shahaf Shuler , "dev@dpdk.org" , "stable@dpdk.org" To: Olivier Matz Return-path: In-Reply-To: <20190109095205.us53xaocvokx4jog@glumotte.dev.6wind.com> Content-Language: en-US Content-ID: List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" > On Jan 9, 2019, at 1:52 AM, Olivier Matz wrote: >=20 > On Wed, Jan 09, 2019 at 10:38:07AM +0100, David Marchand wrote: >> On Wed, Jan 9, 2019 at 9:54 AM Yongseok Koh wrote: >>=20 >>> On replenishing Rx buffers for vectorized Rx, mbuf->buf_addr isn't need= ed >>> to be accessed as it is static and easily calculated from the mbuf addr= ess. >>> Accessing the mbuf content causes unnecessary load stall and it is wors= ened >>> on ARM. >>>=20 >>> Fixes: 545b884b1da3 ("net/mlx5: fix buffer address posting in SSE Rx") >>> Cc: stable@dpdk.org >>>=20 >>> Signed-off-by: Yongseok Koh >>> --- >>> drivers/net/mlx5/mlx5_rxtx_vec.h | 8 ++++++-- >>> 1 file changed, 6 insertions(+), 2 deletions(-) >>>=20 >>> diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.h >>> b/drivers/net/mlx5/mlx5_rxtx_vec.h >>> index fda7004e2d..ced5547307 100644 >>> --- a/drivers/net/mlx5/mlx5_rxtx_vec.h >>> +++ b/drivers/net/mlx5/mlx5_rxtx_vec.h >>> @@ -102,8 +102,12 @@ mlx5_rx_replenish_bulk_mbuf(struct mlx5_rxq_data >>> *rxq, uint16_t n) >>> return; >>> } >>> for (i =3D 0; i < n; ++i) { >>> - wq[i].addr =3D rte_cpu_to_be_64((uintptr_t)elts[i]->buf= _addr >>> + >>> - RTE_PKTMBUF_HEADROOM); >>> + uintptr_t buf_addr =3D >>> + (uintptr_t)elts[i] + sizeof(struct rte_mbuf) + >>> + rte_pktmbuf_priv_size(rxq->mp) + >>> RTE_PKTMBUF_HEADROOM; >>> + >>> + assert(buf_addr =3D=3D (uintptr_t)elts[i]->buf_addr); >>> + wq[i].addr =3D rte_cpu_to_be_64(buf_addr); >>> /* If there's only one MR, no need to replace LKey in WQ= E. >>> */ >>> if (unlikely(mlx5_mr_btree_len(&rxq->mr_ctrl.cache_bh) > >>> 1)) >>> wq[i].lkey =3D mlx5_rx_mb2mr(rxq, elts[i]); >>> -- >>> 2.11.0 >>>=20 >>>=20 >> How about having a macro / inline in the mbuf api to get this informatio= n >> in a consistent/unique way ? >> I can see we have this calculation at least in rte_pktmbuf_init() and >> rte_pktmbuf_detach(). >=20 > Agree. Maybe rte_mbuf_default_buf_addr(m) ? I'm also okay to add. Will come up with a new patch. > Side note, is the assert() correct in the patch? I'd say there's a > difference of RTE_PKTMBUF_HEADROOM between the 2 values. Oops, my fault. Thanks for the catch, you saved a crash. :-) Thanks, Yongseok