* [Patch v3] net: mana: Batch ringing RX queue doorbell on receiving packets
@ 2023-06-26 23:57 longli
  2023-06-29  8:42 ` Paolo Abeni
  0 siblings, 1 reply; 15+ messages in thread
From: longli @ 2023-06-26 23:57 UTC (permalink / raw)
  To: Jason Gunthorpe, Leon Romanovsky, Ajay Sharma, Dexuan Cui,
	K. Y. Srinivasan, Haiyang Zhang, Wei Liu, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: linux-rdma, linux-hyperv, netdev, linux-kernel, Long Li, stable

From: Long Li <longli@microsoft.com>

It's inefficient to ring the doorbell page every time a WQE is posted to
the receive queue. Excessive MMIO writes result in the CPU spending more
time waiting on LOCK instructions (atomic operations), resulting in
poor scaling performance.

Move the code that rings the doorbell page to after all WQEs have been
posted to the receive queue during a callback from napi_poll().

With this change, tests showed an improvement from 120Gbps to 160Gbps on
a 200Gbps physical link, with 16 or 32 hardware queues.

Tests showed no regression in network latency benchmarks on a single
connection.

While we are making changes in this code path, also change the doorbell
code to set WQE_COUNT to 0 for the Receive Queue, as the hardware
specification requires. Although the hardware doesn't currently enforce
this check, future releases may.

Cc: stable@vger.kernel.org
Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")

Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Long Li <longli@microsoft.com>
---
Change log:
v2:
Check for comp_read > 0 as it might be negative on a completion error.
Set rq.wqe_cnt to 0 according to the BNIC spec.

v3:
Add details to the commit message on the reason for the performance
increase, with test numbers.
Add details to the commit message on why rq.wqe_cnt should be set to 0
according to the hardware spec.
Add "Reviewed-by" from Haiyang and Dexuan.

 drivers/net/ethernet/microsoft/mana/gdma_main.c |  5 ++++-
 drivers/net/ethernet/microsoft/mana/mana_en.c   | 10 ++++++++--
 2 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/microsoft/mana/gdma_main.c b/drivers/net/ethernet/microsoft/mana/gdma_main.c
index 8f3f78b68592..3765d3389a9a 100644
--- a/drivers/net/ethernet/microsoft/mana/gdma_main.c
+++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c
@@ -300,8 +300,11 @@ static void mana_gd_ring_doorbell(struct gdma_context *gc, u32 db_index,
 
 void mana_gd_wq_ring_doorbell(struct gdma_context *gc, struct gdma_queue *queue)
 {
+	/* Hardware Spec specifies that software client should set 0 for
+	 * wqe_cnt for Receive Queues. This value is not used in Send Queues.
+	 */
 	mana_gd_ring_doorbell(gc, queue->gdma_dev->doorbell, queue->type,
-			      queue->id, queue->head * GDMA_WQE_BU_SIZE, 1);
+			      queue->id, queue->head * GDMA_WQE_BU_SIZE, 0);
 }
 
 void mana_gd_ring_cq(struct gdma_queue *cq, u8 arm_bit)
diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
index cd4d5ceb9f2d..1d8abe63fcb8 100644
--- a/drivers/net/ethernet/microsoft/mana/mana_en.c
+++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
@@ -1383,8 +1383,8 @@ static void mana_post_pkt_rxq(struct mana_rxq *rxq)
 
 	recv_buf_oob = &rxq->rx_oobs[curr_index];
 
-	err = mana_gd_post_and_ring(rxq->gdma_rq, &recv_buf_oob->wqe_req,
-				    &recv_buf_oob->wqe_inf);
+	err = mana_gd_post_work_request(rxq->gdma_rq, &recv_buf_oob->wqe_req,
+					&recv_buf_oob->wqe_inf);
 	if (WARN_ON_ONCE(err))
 		return;
 
@@ -1654,6 +1654,12 @@ static void mana_poll_rx_cq(struct mana_cq *cq)
 		mana_process_rx_cqe(rxq, cq, &comp[i]);
 	}
 
+	if (comp_read > 0) {
+		struct gdma_context *gc = rxq->gdma_rq->gdma_dev->gdma_context;
+
+		mana_gd_wq_ring_doorbell(gc, rxq->gdma_rq);
+	}
+
 	if (rxq->xdp_flush)
 		xdp_do_flush();
 }
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [Patch v3] net: mana: Batch ringing RX queue doorbell on receiving packets
  2023-06-26 23:57 [Patch v3] net: mana: Batch ringing RX queue doorbell on receiving packets longli
@ 2023-06-29  8:42 ` Paolo Abeni
  2023-06-29 16:11   ` Jakub Kicinski
                     ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Paolo Abeni @ 2023-06-29  8:42 UTC (permalink / raw)
  To: longli, Jason Gunthorpe, Leon Romanovsky, Ajay Sharma,
	Dexuan Cui, K. Y. Srinivasan, Haiyang Zhang, Wei Liu,
	David S. Miller, Eric Dumazet, Jakub Kicinski
  Cc: linux-rdma, linux-hyperv, netdev, linux-kernel, Long Li, stable

On Mon, 2023-06-26 at 16:57 -0700, longli@linuxonhyperv.com wrote:
> From: Long Li <longli@microsoft.com>
> 
> It's inefficient to ring the doorbell page every time a WQE is posted to
> the received queue. Excessive MMIO writes result in CPU spending more
> time waiting on LOCK instructions (atomic operations), resulting in
> poor scaling performance.
> 
> Move the code for ringing doorbell page to where after we have posted all
> WQEs to the receive queue during a callback from napi_poll().
> 
> With this change, tests showed an improvement from 120G/s to 160G/s on a
> 200G physical link, with 16 or 32 hardware queues.
> 
> Tests showed no regression in network latency benchmarks on single
> connection.
> 
> While we are making changes in this code path, change the code for
> ringing doorbell to set the WQE_COUNT to 0 for Receive Queue. The
> hardware specification specifies that it should set to 0. Although
> currently the hardware doesn't enforce the check, in the future releases
> it may do.
> 
> Cc: stable@vger.kernel.org
> Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")

Uhmmm... this looks like a performance improvement to me, more suitable
for the net-next tree ?!? (Note that net-next is closed now).

In any case you must avoid empty lines in the tag area.

If you really intend targeting the -net tree, please repost fixing the
above and explicitly specifying the target tree in the subj prefix.

thanks!

Paolo



* Re: [Patch v3] net: mana: Batch ringing RX queue doorbell on receiving packets
  2023-06-29  8:42 ` Paolo Abeni
@ 2023-06-29 16:11   ` Jakub Kicinski
  2023-06-29 18:19     ` Long Li
  2023-06-29 16:53   ` Haiyang Zhang
  2023-06-29 18:18   ` Long Li
  2 siblings, 1 reply; 15+ messages in thread
From: Jakub Kicinski @ 2023-06-29 16:11 UTC (permalink / raw)
  To: longli
  Cc: Paolo Abeni, Jason Gunthorpe, Leon Romanovsky, Ajay Sharma,
	Dexuan Cui, K. Y. Srinivasan, Haiyang Zhang, Wei Liu,
	David S. Miller, Eric Dumazet, linux-rdma, linux-hyperv, netdev,
	linux-kernel, Long Li, stable

On Thu, 29 Jun 2023 10:42:34 +0200 Paolo Abeni wrote:
> > While we are making changes in this code path, change the code for
> > ringing doorbell to set the WQE_COUNT to 0 for Receive Queue. The
> > hardware specification specifies that it should set to 0. Although
> > currently the hardware doesn't enforce the check, in the future releases
> > it may do.

And please split this cleanup into a separate patch, it doesn't sound
like it has to be done as part of the optimization.


* RE: [Patch v3] net: mana: Batch ringing RX queue doorbell on receiving packets
  2023-06-29  8:42 ` Paolo Abeni
  2023-06-29 16:11   ` Jakub Kicinski
@ 2023-06-29 16:53   ` Haiyang Zhang
  2023-06-29 17:06     ` Jakub Kicinski
  2023-06-29 18:18   ` Long Li
  2 siblings, 1 reply; 15+ messages in thread
From: Haiyang Zhang @ 2023-06-29 16:53 UTC (permalink / raw)
  To: Paolo Abeni, longli, Jason Gunthorpe, Leon Romanovsky,
	Ajay Sharma, Dexuan Cui, KY Srinivasan, Wei Liu, David S. Miller,
	Eric Dumazet, Jakub Kicinski
  Cc: linux-rdma, linux-hyperv, netdev, linux-kernel, Long Li, stable



> -----Original Message-----
> From: Paolo Abeni <pabeni@redhat.com>
> Sent: Thursday, June 29, 2023 4:43 AM
> To: longli@linuxonhyperv.com; Jason Gunthorpe <jgg@ziepe.ca>; Leon
> Romanovsky <leon@kernel.org>; Ajay Sharma <sharmaajay@microsoft.com>;
> Dexuan Cui <decui@microsoft.com>; KY Srinivasan <kys@microsoft.com>;
> Haiyang Zhang <haiyangz@microsoft.com>; Wei Liu <wei.liu@kernel.org>;
> David S. Miller <davem@davemloft.net>; Eric Dumazet
> <edumazet@google.com>; Jakub Kicinski <kuba@kernel.org>
> Cc: linux-rdma@vger.kernel.org; linux-hyperv@vger.kernel.org;
> netdev@vger.kernel.org; linux-kernel@vger.kernel.org; Long Li
> <longli@microsoft.com>; stable@vger.kernel.org
> Subject: Re: [Patch v3] net: mana: Batch ringing RX queue doorbell on receiving
> packets
> 
> On Mon, 2023-06-26 at 16:57 -0700, longli@linuxonhyperv.com wrote:
> > From: Long Li <longli@microsoft.com>
> >
> > It's inefficient to ring the doorbell page every time a WQE is posted to
> > the received queue. Excessive MMIO writes result in CPU spending more
> > time waiting on LOCK instructions (atomic operations), resulting in
> > poor scaling performance.
> >
> > Move the code for ringing doorbell page to where after we have posted all
> > WQEs to the receive queue during a callback from napi_poll().
> >
> > With this change, tests showed an improvement from 120G/s to 160G/s on a
> > 200G physical link, with 16 or 32 hardware queues.
> >
> > Tests showed no regression in network latency benchmarks on single
> > connection.
> >
> > While we are making changes in this code path, change the code for
> > ringing doorbell to set the WQE_COUNT to 0 for Receive Queue. The
> > hardware specification specifies that it should set to 0. Although
> > currently the hardware doesn't enforce the check, in the future releases
> > it may do.
> >
> > Cc: stable@vger.kernel.org
> > Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network
> Adapter (MANA)")
> 
> Uhmmm... this looks like a performance improvement to me, more suitable
> for the net-next tree ?!? (Note that net-next is closed now).

This web page shows the net-next is "open":
http://vger.kernel.org/~davem/net-next.html

Is this still the right place to check net-next status?

- Haiyang


* Re: [Patch v3] net: mana: Batch ringing RX queue doorbell on receiving packets
  2023-06-29 16:53   ` Haiyang Zhang
@ 2023-06-29 17:06     ` Jakub Kicinski
  0 siblings, 0 replies; 15+ messages in thread
From: Jakub Kicinski @ 2023-06-29 17:06 UTC (permalink / raw)
  To: Haiyang Zhang
  Cc: Paolo Abeni, longli, Jason Gunthorpe, Leon Romanovsky,
	Ajay Sharma, Dexuan Cui, KY Srinivasan, Wei Liu, David S. Miller,
	Eric Dumazet, linux-rdma, linux-hyperv, netdev, linux-kernel,
	Long Li, stable

On Thu, 29 Jun 2023 16:53:42 +0000 Haiyang Zhang wrote:
> This web page shows the net-next is "open":
> http://vger.kernel.org/~davem/net-next.html
> 
> Is this still the right place to check net-next status?

We're working on fixing it. Unfortunately it's a private page and most
of the netdev maintainers don't have access to changing it :(


* RE: [Patch v3] net: mana: Batch ringing RX queue doorbell on receiving packets
  2023-06-29  8:42 ` Paolo Abeni
  2023-06-29 16:11   ` Jakub Kicinski
  2023-06-29 16:53   ` Haiyang Zhang
@ 2023-06-29 18:18   ` Long Li
  2023-06-30 10:36     ` Paolo Abeni
  2 siblings, 1 reply; 15+ messages in thread
From: Long Li @ 2023-06-29 18:18 UTC (permalink / raw)
  To: Paolo Abeni, longli, Jason Gunthorpe, Leon Romanovsky,
	Ajay Sharma, Dexuan Cui, KY Srinivasan, Haiyang Zhang, Wei Liu,
	David S. Miller, Eric Dumazet, Jakub Kicinski
  Cc: linux-rdma, linux-hyperv, netdev, linux-kernel, stable

> Subject: Re: [Patch v3] net: mana: Batch ringing RX queue doorbell on receiving
> packets
> 
> On Mon, 2023-06-26 at 16:57 -0700, longli@linuxonhyperv.com wrote:
> > From: Long Li <longli@microsoft.com>
> >
> > It's inefficient to ring the doorbell page every time a WQE is posted
> > to the received queue. Excessive MMIO writes result in CPU spending
> > more time waiting on LOCK instructions (atomic operations), resulting
> > in poor scaling performance.
> >
> > Move the code for ringing doorbell page to where after we have posted
> > all WQEs to the receive queue during a callback from napi_poll().
> >
> > With this change, tests showed an improvement from 120G/s to 160G/s on
> > a 200G physical link, with 16 or 32 hardware queues.
> >
> > Tests showed no regression in network latency benchmarks on single
> > connection.
> >
> > While we are making changes in this code path, change the code for
> > ringing doorbell to set the WQE_COUNT to 0 for Receive Queue. The
> > hardware specification specifies that it should set to 0. Although
> > currently the hardware doesn't enforce the check, in the future
> > releases it may do.
> >
> > Cc: stable@vger.kernel.org
> > Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure
> > Network Adapter (MANA)")
> 
> Uhmmm... this looks like a performance improvement to me, more suitable for
> the net-next tree ?!? (Note that net-next is closed now).

This issue is a blocker for usage on a 200G physical link. I think it can be categorized as a fix.

> 
> In any case you must avoid empty lines in the tag area.
> 
> If you really intend targeting the -net tree, please repost fixing the above and
> explicitly specifying the target tree in the subj prefix.

Will do, thank you.

Long

> 
> thanks!
> 
> Paolo



* RE: [Patch v3] net: mana: Batch ringing RX queue doorbell on receiving packets
  2023-06-29 16:11   ` Jakub Kicinski
@ 2023-06-29 18:19     ` Long Li
  0 siblings, 0 replies; 15+ messages in thread
From: Long Li @ 2023-06-29 18:19 UTC (permalink / raw)
  To: Jakub Kicinski, longli
  Cc: Paolo Abeni, Jason Gunthorpe, Leon Romanovsky, Ajay Sharma,
	Dexuan Cui, KY Srinivasan, Haiyang Zhang, Wei Liu,
	David S. Miller, Eric Dumazet, linux-rdma, linux-hyperv, netdev,
	linux-kernel, stable

> Subject: Re: [Patch v3] net: mana: Batch ringing RX queue doorbell on receiving
> packets
> 
> On Thu, 29 Jun 2023 10:42:34 +0200 Paolo Abeni wrote:
> > > While we are making changes in this code path, change the code for
> > > ringing doorbell to set the WQE_COUNT to 0 for Receive Queue. The
> > > hardware specification specifies that it should set to 0. Although
> > > currently the hardware doesn't enforce the check, in the future
> > > releases it may do.
> 
> And please split this cleanup into a separate patch, it doesn't sound like it has to
> be done as part of the optimization.

Will do, thanks.

Long


* Re: [Patch v3] net: mana: Batch ringing RX queue doorbell on receiving packets
  2023-06-29 18:18   ` Long Li
@ 2023-06-30 10:36     ` Paolo Abeni
  2023-06-30 17:31       ` Long Li
  0 siblings, 1 reply; 15+ messages in thread
From: Paolo Abeni @ 2023-06-30 10:36 UTC (permalink / raw)
  To: Long Li, longli, Jason Gunthorpe, Leon Romanovsky, Ajay Sharma,
	Dexuan Cui, KY Srinivasan, Haiyang Zhang, Wei Liu,
	David S. Miller, Eric Dumazet, Jakub Kicinski
  Cc: linux-rdma, linux-hyperv, netdev, linux-kernel, stable

On Thu, 2023-06-29 at 18:18 +0000, Long Li wrote:
> > Subject: Re: [Patch v3] net: mana: Batch ringing RX queue doorbell
> > on receiving
> > packets
> > 
> > On Mon, 2023-06-26 at 16:57 -0700, longli@linuxonhyperv.com wrote:
> > > From: Long Li <longli@microsoft.com>
> > > 
> > > It's inefficient to ring the doorbell page every time a WQE is
> > > posted
> > > to the received queue. Excessive MMIO writes result in CPU
> > > spending
> > > more time waiting on LOCK instructions (atomic operations),
> > > resulting
> > > in poor scaling performance.
> > > 
> > > Move the code for ringing doorbell page to where after we have
> > > posted
> > > all WQEs to the receive queue during a callback from napi_poll().
> > > 
> > > With this change, tests showed an improvement from 120G/s to
> > > 160G/s on
> > > a 200G physical link, with 16 or 32 hardware queues.
> > > 
> > > Tests showed no regression in network latency benchmarks on
> > > single
> > > connection.
> > > 
> > > While we are making changes in this code path, change the code
> > > for
> > > ringing doorbell to set the WQE_COUNT to 0 for Receive Queue. The
> > > hardware specification specifies that it should set to 0.
> > > Although
> > > currently the hardware doesn't enforce the check, in the future
> > > releases it may do.
> > > 
> > > Cc: stable@vger.kernel.org
> > > Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure
> > > Network Adapter (MANA)")
> > 
> > Uhmmm... this looks like a performance improvement to me, more
> > suitable for
> > the net-next tree ?!? (Note that net-next is closed now).
> 
> This issue is a blocker for usage on 200G physical link. I think it
> can be categorized as a fix.

Let me ask the question the other way around: is there any specific
reason to have this fix into 6.5 and all the way back to 5.13?
Especially the latest bit (CC-ing stable) looks at least debatable.

Thanks,

Paolo



* RE: [Patch v3] net: mana: Batch ringing RX queue doorbell on receiving packets
  2023-06-30 10:36     ` Paolo Abeni
@ 2023-06-30 17:31       ` Long Li
  2023-06-30 19:46         ` Greg KH
  0 siblings, 1 reply; 15+ messages in thread
From: Long Li @ 2023-06-30 17:31 UTC (permalink / raw)
  To: Paolo Abeni, longli, Jason Gunthorpe, Leon Romanovsky,
	Ajay Sharma, Dexuan Cui, KY Srinivasan, Haiyang Zhang, Wei Liu,
	David S. Miller, Eric Dumazet, Jakub Kicinski
  Cc: linux-rdma, linux-hyperv, netdev, linux-kernel, stable

> Subject: Re: [Patch v3] net: mana: Batch ringing RX queue doorbell on
> receiving packets
> 
> On Thu, 2023-06-29 at 18:18 +0000, Long Li wrote:
> > > Subject: Re: [Patch v3] net: mana: Batch ringing RX queue doorbell
> > > on receiving packets
> > >
> > > On Mon, 2023-06-26 at 16:57 -0700, longli@linuxonhyperv.com wrote:
> > > > From: Long Li <longli@microsoft.com>
> > > >
> > > > It's inefficient to ring the doorbell page every time a WQE is
> > > > posted to the received queue. Excessive MMIO writes result in CPU
> > > > spending more time waiting on LOCK instructions (atomic
> > > > operations), resulting in poor scaling performance.
> > > >
> > > > Move the code for ringing doorbell page to where after we have
> > > > posted all WQEs to the receive queue during a callback from
> > > > napi_poll().
> > > >
> > > > With this change, tests showed an improvement from 120G/s to
> > > > 160G/s on a 200G physical link, with 16 or 32 hardware queues.
> > > >
> > > > Tests showed no regression in network latency benchmarks on single
> > > > connection.
> > > >
> > > > While we are making changes in this code path, change the code for
> > > > ringing doorbell to set the WQE_COUNT to 0 for Receive Queue. The
> > > > hardware specification specifies that it should set to 0.
> > > > Although
> > > > currently the hardware doesn't enforce the check, in the future
> > > > releases it may do.
> > > >
> > > > Cc: stable@vger.kernel.org
> > > > Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure
> > > > Network Adapter (MANA)")
> > >
> > > Uhmmm... this looks like a performance improvement to me, more
> > > suitable for the net-next tree ?!? (Note that net-next is closed
> > > now).
> >
> > This issue is a blocker for usage on 200G physical link. I think it
> > can be categorized as a fix.
> 
> Let me ask the question the other way around: is there any specific reason to
> have this fix into 6.5 and all the way back to 5.13?
> Especially the latest bit (CC-ing stable) looks at least debatable.

There are many deployed Linux distributions with the MANA driver on kernels 5.15 and 6.1 (both longterm kernels). They need this fix to achieve the performance target.

Thanks,

Long


* Re: [Patch v3] net: mana: Batch ringing RX queue doorbell on receiving packets
  2023-06-30 17:31       ` Long Li
@ 2023-06-30 19:46         ` Greg KH
  2023-06-30 20:42           ` Long Li
  0 siblings, 1 reply; 15+ messages in thread
From: Greg KH @ 2023-06-30 19:46 UTC (permalink / raw)
  To: Long Li
  Cc: Paolo Abeni, longli, Jason Gunthorpe, Leon Romanovsky,
	Ajay Sharma, Dexuan Cui, KY Srinivasan, Haiyang Zhang, Wei Liu,
	David S. Miller, Eric Dumazet, Jakub Kicinski, linux-rdma,
	linux-hyperv, netdev, linux-kernel, stable

On Fri, Jun 30, 2023 at 05:31:48PM +0000, Long Li wrote:
> > Subject: Re: [Patch v3] net: mana: Batch ringing RX queue doorbell on
> > receiving packets
> > 
> > On Thu, 2023-06-29 at 18:18 +0000, Long Li wrote:
> > > > Subject: Re: [Patch v3] net: mana: Batch ringing RX queue doorbell
> > > > on receiving packets
> > > >
> > > > On Mon, 2023-06-26 at 16:57 -0700, longli@linuxonhyperv.com wrote:
> > > > > From: Long Li <longli@microsoft.com>
> > > > >
> > > > > It's inefficient to ring the doorbell page every time a WQE is
> > > > > posted to the received queue. Excessive MMIO writes result in CPU
> > > > > spending more time waiting on LOCK instructions (atomic
> > > > > operations), resulting in poor scaling performance.
> > > > >
> > > > > Move the code for ringing doorbell page to where after we have
> > > > > posted all WQEs to the receive queue during a callback from
> > > > > napi_poll().
> > > > >
> > > > > With this change, tests showed an improvement from 120G/s to
> > > > > 160G/s on a 200G physical link, with 16 or 32 hardware queues.
> > > > >
> > > > > Tests showed no regression in network latency benchmarks on single
> > > > > connection.
> > > > >
> > > > > While we are making changes in this code path, change the code for
> > > > > ringing doorbell to set the WQE_COUNT to 0 for Receive Queue. The
> > > > > hardware specification specifies that it should set to 0.
> > > > > Although
> > > > > currently the hardware doesn't enforce the check, in the future
> > > > > releases it may do.
> > > > >
> > > > > Cc: stable@vger.kernel.org
> > > > > Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure
> > > > > Network Adapter (MANA)")
> > > >
> > > > Uhmmm... this looks like a performance improvement to me, more
> > > > suitable for the net-next tree ?!? (Note that net-next is closed
> > > > now).
> > >
> > > This issue is a blocker for usage on 200G physical link. I think it
> > > can be categorized as a fix.
> > 
> > Let me ask the question the other way around: is there any specific reason to
> > have this fix into 6.5 and all the way back to 5.13?
> > Especially the latest bit (CC-ing stable) looks at least debatable.
> 
> There are many deployed Linux distributions with MANA driver on kernel 5.15 and kernel 6.1. (those kernels are longterm) They need this fix to achieve the performance target.

Why can't they be upgraded to get that performance target, and all the
other goodness that those kernels have?  We don't normally backport new
features, right?

thanks,

greg k-h


* RE: [Patch v3] net: mana: Batch ringing RX queue doorbell on receiving packets
  2023-06-30 19:46         ` Greg KH
@ 2023-06-30 20:42           ` Long Li
  2023-06-30 23:38             ` Jakub Kicinski
  0 siblings, 1 reply; 15+ messages in thread
From: Long Li @ 2023-06-30 20:42 UTC (permalink / raw)
  To: Greg KH
  Cc: Paolo Abeni, longli, Jason Gunthorpe, Leon Romanovsky,
	Ajay Sharma, Dexuan Cui, KY Srinivasan, Haiyang Zhang, Wei Liu,
	David S. Miller, Eric Dumazet, Jakub Kicinski, linux-rdma,
	linux-hyperv, netdev, linux-kernel, stable

> Subject: Re: [Patch v3] net: mana: Batch ringing RX queue doorbell on
> receiving packets
> 
> On Fri, Jun 30, 2023 at 05:31:48PM +0000, Long Li wrote:
> > > Subject: Re: [Patch v3] net: mana: Batch ringing RX queue doorbell
> > > on receiving packets
> > >
> > > On Thu, 2023-06-29 at 18:18 +0000, Long Li wrote:
> > > > > Subject: Re: [Patch v3] net: mana: Batch ringing RX queue
> > > > > doorbell on receiving packets
> > > > >
> > > > > On Mon, 2023-06-26 at 16:57 -0700, longli@linuxonhyperv.com
> wrote:
> > > > > > From: Long Li <longli@microsoft.com>
> > > > > >
> > > > > > It's inefficient to ring the doorbell page every time a WQE is
> > > > > > posted to the received queue. Excessive MMIO writes result in
> > > > > > CPU spending more time waiting on LOCK instructions (atomic
> > > > > > operations), resulting in poor scaling performance.
> > > > > >
> > > > > > Move the code for ringing doorbell page to where after we have
> > > > > > posted all WQEs to the receive queue during a callback from
> > > > > > napi_poll().
> > > > > >
> > > > > > With this change, tests showed an improvement from 120G/s to
> > > > > > 160G/s on a 200G physical link, with 16 or 32 hardware queues.
> > > > > >
> > > > > > Tests showed no regression in network latency benchmarks on
> > > > > > single connection.
> > > > > >
> > > > > > While we are making changes in this code path, change the code
> > > > > > for ringing doorbell to set the WQE_COUNT to 0 for Receive
> > > > > > Queue. The hardware specification specifies that it should set to 0.
> > > > > > Although
> > > > > > currently the hardware doesn't enforce the check, in the
> > > > > > future releases it may do.
> > > > > >
> > > > > > Cc: stable@vger.kernel.org
> > > > > > Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft
> > > > > > Azure Network Adapter (MANA)")
> > > > >
> > > > > Uhmmm... this looks like a performance improvement to me, more
> > > > > suitable for the net-next tree ?!? (Note that net-next is closed
> > > > > now).
> > > >
> > > > This issue is a blocker for usage on 200G physical link. I think
> > > > it can be categorized as a fix.
> > >
> > > Let me ask the question the other way around: is there any specific
> > > reason to have this fix into 6.5 and all the way back to 5.13?
> > > Especially the latest bit (CC-ing stable) looks at least debatable.
> >
> > There are many deployed Linux distributions with MANA driver on kernel
> 5.15 and kernel 6.1. (those kernels are longterm) They need this fix to achieve
> the performance target.
> 
> Why can't they be upgraded to get that performance target, and all the other
> goodness that those kernels have?  We don't normally backport new features,
> right?

I think this should be considered as a fix, not a new feature.

MANA was designed for 200Gb full duplex from the start. Due to a lack of
hardware testing capability at the early stage of the project, we could only
test the Linux driver at 100Gb. Now that the hardware is fully capable of
reaching its designed spec, this bug in the Linux driver shows up.

Thanks,

Long

> 
> thanks,
> 
> greg k-h


* Re: [Patch v3] net: mana: Batch ringing RX queue doorbell on receiving packets
  2023-06-30 20:42           ` Long Li
@ 2023-06-30 23:38             ` Jakub Kicinski
  2023-07-02 20:18               ` Long Li
  0 siblings, 1 reply; 15+ messages in thread
From: Jakub Kicinski @ 2023-06-30 23:38 UTC (permalink / raw)
  To: Long Li
  Cc: Greg KH, Paolo Abeni, longli, Jason Gunthorpe, Leon Romanovsky,
	Ajay Sharma, Dexuan Cui, KY Srinivasan, Haiyang Zhang, Wei Liu,
	David S. Miller, Eric Dumazet, linux-rdma, linux-hyperv, netdev,
	linux-kernel, stable

On Fri, 30 Jun 2023 20:42:28 +0000 Long Li wrote:
> > > 5.15 and kernel 6.1. (those kernels are longterm) They need this fix to achieve
> > > the performance target.
> > 
> > Why can't they be upgraded to get that performance target, and all the other
> > goodness that those kernels have?  We don't normally backport new features,
> > right?  
> 
> I think this should be considered as a fix, not a new feature.
> 
> MANA is designed to be 200GB full duplex at the start.  Due to lack of
> hardware testing capability at early stage of the project, we could only test 100GB
> for the Linux driver. When hardware is fully capable of reaching designed spec,
> this bug in the Linux driver shows up.

That part we understand.

If I were you I'd try to convince Greg and Paolo that the change is
small and significant for user experience. And answer Greg's question
why upgrading the kernel past 6.1 is a challenge in your environment.


* RE: [Patch v3] net: mana: Batch ringing RX queue doorbell on receiving packets
  2023-06-30 23:38             ` Jakub Kicinski
@ 2023-07-02 20:18               ` Long Li
  2023-07-03 10:14                 ` Paolo Abeni
  0 siblings, 1 reply; 15+ messages in thread
From: Long Li @ 2023-07-02 20:18 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Greg KH, Paolo Abeni, longli, Jason Gunthorpe, Leon Romanovsky,
	Ajay Sharma, Dexuan Cui, KY Srinivasan, Haiyang Zhang, Wei Liu,
	David S. Miller, Eric Dumazet, linux-rdma, linux-hyperv, netdev,
	linux-kernel, stable

> Subject: Re: [Patch v3] net: mana: Batch ringing RX queue doorbell on receiving
> packets
> 
> On Fri, 30 Jun 2023 20:42:28 +0000 Long Li wrote:
> > > > 5.15 and kernel 6.1. (those kernels are longterm) They need this
> > > > fix to achieve the performance target.
> > >
> > > Why can't they be upgraded to get that performance target, and all
> > > the other goodness that those kernels have?  We don't normally
> > > backport new features, right?
> >
> > I think this should be considered as a fix, not a new feature.
> >
> > MANA is designed to be 200GB full duplex at the start.  Due to lack of
> > hardware testing capability at early stage of the project, we could
> > only test 100GB for the Linux driver. When hardware is fully capable
> > of reaching designed spec, this bug in the Linux driver shows up.
> 
> That part we understand.
> 
> If I were you I'd try to convince Greg and Paolo that the change is small and
> significant for user experience. And answer Greg's question why upgrading the
> kernel past 6.1 is a challenge in your environment.

I was under the impression that this patch was considered to be a feature,
not a bug fix, and I was trying to justify that the "Fixes:" tag was needed.

I apologize for misunderstanding this.

Without this fix, it's not possible to run a typical workload designed for
200Gb physical link speed.

We see a large number of customers and Linux distributions committed to the
5.15 and 6.1 kernels. They planned their product cycles and certification
processes around these longterm kernel versions. It's difficult for them to
upgrade to newer kernel versions.

Thanks,

Long


* Re: [Patch v3] net: mana: Batch ringing RX queue doorbell on receiving packets
  2023-07-02 20:18               ` Long Li
@ 2023-07-03 10:14                 ` Paolo Abeni
  2023-07-06 17:51                   ` Long Li
  0 siblings, 1 reply; 15+ messages in thread
From: Paolo Abeni @ 2023-07-03 10:14 UTC (permalink / raw)
  To: Long Li, Jakub Kicinski
  Cc: Greg KH, longli, Jason Gunthorpe, Leon Romanovsky, Ajay Sharma,
	Dexuan Cui, KY Srinivasan, Haiyang Zhang, Wei Liu,
	David S. Miller, Eric Dumazet, linux-rdma, linux-hyperv, netdev,
	linux-kernel, stable

On Sun, 2023-07-02 at 20:18 +0000, Long Li wrote:
> > > > > > > > Subject: Re: [Patch v3] net: mana: Batch ringing RX queue
> > > > > > > > doorbell on receiving packets
> > > > > > > > 
> > > > > > > > On Fri, 30 Jun 2023 20:42:28 +0000 Long Li wrote:
> > > > > > > > > > > > > > > > > > > > 5.15 and kernel 6.1. (those
> > > > > > > > > > > > > > > > > > > > kernels are longterm) They need
> > > > > > > > > > > > > > > > > > > > this fix to achieve the
> > > > > > > > > > > > > > > > > > > > performance target.
> > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > Why can't they be upgraded to get that
> > > > > > > > > > > > > > > > performance target, and all the other
> > > > > > > > > > > > > > > > goodness that those kernels have? We
> > > > > > > > > > > > > > > > don't normally backport new features,
> > > > > > > > > > > > > > > > right?
> > > > > > > > > > > > 
> > > > > > > > > > > > I think this should be considered as a fix, not
> > > > > > > > > > > > a new feature.
> > > > > > > > > > > > 
> > > > > > > > > > > > MANA is designed to be 200Gb full duplex at the
> > > > > > > > > > > > start. Due to lack of hardware testing
> > > > > > > > > > > > capability at the early stage of the project, we
> > > > > > > > > > > > could only test 100Gb for the Linux driver. When
> > > > > > > > > > > > the hardware is fully capable of reaching the
> > > > > > > > > > > > designed spec, this bug in the Linux driver
> > > > > > > > > > > > shows up.
> > > > > > > > 
> > > > > > > > That part we understand.
> > > > > > > > 
> > > > > > > > If I were you I'd try to convince Greg and Paolo that
> > > > > > > > the change is small and significant for user experience.
> > > > > > > > And answer Greg's question why upgrading the kernel past
> > > > > > > > 6.1 is a challenge in your environment.
> > > > 
> > > > I was under the impression that this patch was considered to be
> > > > a feature, not a bug fix. I was trying to justify that the
> > > > "Fixes:" tag was needed.
> > > > 
> > > > I apologize for misunderstanding this.
> > > > 
> > > > Without this fix, it's not possible to run a typical workload
> > > > designed for 200Gb physical link speed.
> > > > 
> > > > We see a large number of customers and Linux distributions
> > > > committed to the 5.15 and 6.1 kernels. They planned the product
> > > > cycles and certification processes around these longterm kernel
> > > > versions. It's difficult for them to upgrade to newer kernel
> > > > versions.

I think there is some misunderstanding WRT distros and stable kernels.
(Commercial) distros will backport the patch as needed, regardless of
whether such a patch lands in the 5.15 upstream tree or not. Individual
users running their own vanilla 5.15 kernel can't expect the performance
improvement to land there.

All in all I feel undecided. I would endorse this change going through
net-next (without the stable tag). I would feel less torn with this
change targeting -net without the stable tag. Targeting -net with the
stable tag sounds a bit too much to me.

Cheers,
Paolo


^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: [Patch v3] net: mana: Batch ringing RX queue doorbell on receiving packets
  2023-07-03 10:14                 ` Paolo Abeni
@ 2023-07-06 17:51                   ` Long Li
  0 siblings, 0 replies; 15+ messages in thread
From: Long Li @ 2023-07-06 17:51 UTC (permalink / raw)
  To: Paolo Abeni, Jakub Kicinski
  Cc: Greg KH, longli, Jason Gunthorpe, Leon Romanovsky, Ajay Sharma,
	Dexuan Cui, KY Srinivasan, Haiyang Zhang, Wei Liu,
	David S. Miller, Eric Dumazet, linux-rdma, linux-hyperv, netdev,
	linux-kernel, stable

> Subject: Re: [Patch v3] net: mana: Batch ringing RX queue doorbell on receiving
> packets
> 
> On Sun, 2023-07-02 at 20:18 +0000, Long Li wrote:
> > > > > > > > > Subject: Re: [Patch v3] net: mana: Batch ringing RX
> > > > > > > > > queue doorbell on receiving packets
> > > > > > > > >
> > > > > > > > > On Fri, 30 Jun 2023 20:42:28 +0000 Long Li wrote:
> > > > > > > > > > > > > > > > > > > > > 5.15 and kernel 6.1. (those
> > > > > > > > > > > > > > > > > > > > > kernels are longterm) They need
> > > > > > > > > > > > > > > > > > > > > this fix to achieve the
> > > > > > > > > > > > > > > > > > > > > performance target.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Why can't they be upgraded to get that
> > > > > > > > > > > > > > > > > performance target, and all the other
> > > > > > > > > > > > > > > > > goodness that those kernels have? We
> > > > > > > > > > > > > > > > > don't normally backport new features,
> > > > > > > > > > > > > > > > > right?
> > > > > > > > > > > > >
> > > > > > > > > > > > > I think this should be considered as a fix, not
> > > > > > > > > > > > > a new feature.
> > > > > > > > > > > > >
> > > > > > > > > > > > > MANA is designed to be 200Gb full duplex at
> > > > > > > > > > > > > the start. Due to lack of hardware testing
> > > > > > > > > > > > > capability at the early stage of the project,
> > > > > > > > > > > > > we could only test 100Gb for the Linux driver.
> > > > > > > > > > > > > When the hardware is fully capable of reaching
> > > > > > > > > > > > > the designed spec, this bug in the Linux
> > > > > > > > > > > > > driver shows up.
> > > > > > > > >
> > > > > > > > > That part we understand.
> > > > > > > > >
> > > > > > > > > If I were you I'd try to convince Greg and Paolo that
> > > > > > > > > the change is small and significant for user experience.
> > > > > > > > > And answer Greg's question why upgrading the kernel past
> > > > > > > > > 6.1 is a challenge in your environment.
> > > > >
> > > > > I was under the impression that this patch was considered to be
> > > > > a feature, not a bug fix. I was trying to justify that the
> > > > > "Fixes:" tag was needed.
> > > > >
> > > > > I apologize for misunderstanding this.
> > > > >
> > > > > Without this fix, it's not possible to run a typical workload
> > > > > designed for 200Gb physical link speed.
> > > > >
> > > > > We see a large number of customers and Linux distributions
> > > > > committed to the 5.15 and 6.1 kernels. They planned their
> > > > > product cycles and certification processes around these
> > > > > longterm kernel versions. It's difficult for them to upgrade
> > > > > to newer kernel versions.
> 
> I think there is some misunderstanding WRT distros and stable kernels.
> (Commercial) distros will backport the patch as needed, regardless of
> whether such a patch lands in the 5.15 upstream tree or not. Individual
> users running their own vanilla 5.15 kernel can't expect the
> performance improvement to land there.
> 
> All in all I feel undecided. I would endorse this change going through
> net-next (without the stable tag). I would feel less torn with this
> change targeting -net without the stable tag. Targeting -net with the
> stable tag sounds a bit too much to me.
> 
> Cheers,
> Paolo

I'm sending this patch to net-next without the stable tag.

Thanks,

Long


^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2023-07-06 17:51 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-06-26 23:57 [Patch v3] net: mana: Batch ringing RX queue doorbell on receiving packets longli
2023-06-29  8:42 ` Paolo Abeni
2023-06-29 16:11   ` Jakub Kicinski
2023-06-29 18:19     ` Long Li
2023-06-29 16:53   ` Haiyang Zhang
2023-06-29 17:06     ` Jakub Kicinski
2023-06-29 18:18   ` Long Li
2023-06-30 10:36     ` Paolo Abeni
2023-06-30 17:31       ` Long Li
2023-06-30 19:46         ` Greg KH
2023-06-30 20:42           ` Long Li
2023-06-30 23:38             ` Jakub Kicinski
2023-07-02 20:18               ` Long Li
2023-07-03 10:14                 ` Paolo Abeni
2023-07-06 17:51                   ` Long Li
