* [PATCH net-next] net: ibm: replenish rx pool and poll less frequently
@ 2021-06-02 17:01 Lijun Pan
  2021-06-02 17:47 ` Rick Lindsley
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Lijun Pan @ 2021-06-02 17:01 UTC (permalink / raw)
  To: netdev; +Cc: Lijun Pan

The old mechanism replenishes the rx pool even if only one frame is
processed in the poll function, which causes a lot of overhead. It also
restarts polling until the number of processed frames reaches the budget,
which can make the poll function loop back to restart_poll up to 63 times
and call replenish_rx_pool up to 63 times. This can easily cause a soft
lockup. So, don't replenish too often, and don't goto restart_poll in
each poll invocation. If there are pending descriptors, fetch them in the
next poll instance.

Signed-off-by: Lijun Pan <lijunp213@gmail.com>
---
 drivers/net/ethernet/ibm/ibmvnic.c | 15 +++------------
 1 file changed, 3 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index ffb2a91750c7..fae1eaa39dd0 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -2435,7 +2435,6 @@ static int ibmvnic_poll(struct napi_struct *napi, int budget)
 	frames_processed = 0;
 	rx_scrq = adapter->rx_scrq[scrq_num];
 
-restart_poll:
 	while (frames_processed < budget) {
 		struct sk_buff *skb;
 		struct ibmvnic_rx_buff *rx_buff;
@@ -2512,20 +2511,12 @@ static int ibmvnic_poll(struct napi_struct *napi, int budget)
 	}
 
 	if (adapter->state != VNIC_CLOSING &&
-	    ((atomic_read(&adapter->rx_pool[scrq_num].available) <
-	      adapter->req_rx_add_entries_per_subcrq / 2) ||
-	      frames_processed < budget))
+	    (atomic_read(&adapter->rx_pool[scrq_num].available) <
+	      adapter->req_rx_add_entries_per_subcrq / 2))
 		replenish_rx_pool(adapter, &adapter->rx_pool[scrq_num]);
 	if (frames_processed < budget) {
-		if (napi_complete_done(napi, frames_processed)) {
+		if (napi_complete_done(napi, frames_processed))
 			enable_scrq_irq(adapter, rx_scrq);
-			if (pending_scrq(adapter, rx_scrq)) {
-				if (napi_reschedule(napi)) {
-					disable_scrq_irq(adapter, rx_scrq);
-					goto restart_poll;
-				}
-			}
-		}
 	}
 	return frames_processed;
 }
-- 
2.23.0



* Re: [PATCH net-next] net: ibm: replenish rx pool and poll less frequently
  2021-06-02 17:01 [PATCH net-next] net: ibm: replenish rx pool and poll less frequently Lijun Pan
@ 2021-06-02 17:47 ` Rick Lindsley
  2021-06-02 17:58 ` Dany Madden
  2021-06-02 19:25 ` Sukadev Bhattiprolu
  2 siblings, 0 replies; 5+ messages in thread
From: Rick Lindsley @ 2021-06-02 17:47 UTC (permalink / raw)
  To: Lijun Pan, netdev

On 6/2/21 10:01 AM, Lijun Pan wrote:
> The old mechanism replenishes the rx pool even if only one frame is
> processed in the poll function, which causes a lot of overhead. It also
> restarts polling until the number of processed frames reaches the budget,
> which can make the poll function loop back to restart_poll up to 63 times
> and call replenish_rx_pool up to 63 times.

Presumably this frequency is to keep ahead of demand.  When you say lots
of overhead - can you attach a number to that?  How much does this change
improve on that number (and what workload benefits)?


* Re: [PATCH net-next] net: ibm: replenish rx pool and poll less frequently
  2021-06-02 17:01 [PATCH net-next] net: ibm: replenish rx pool and poll less frequently Lijun Pan
  2021-06-02 17:47 ` Rick Lindsley
@ 2021-06-02 17:58 ` Dany Madden
  2021-06-02 18:23   ` Dany Madden
  2021-06-02 19:25 ` Sukadev Bhattiprolu
  2 siblings, 1 reply; 5+ messages in thread
From: Dany Madden @ 2021-06-02 17:58 UTC (permalink / raw)
  To: Lijun Pan; +Cc: netdev

On 2021-06-02 10:01, Lijun Pan wrote:
> The old mechanism replenishes the rx pool even if only one frame is
> processed in the poll function, which causes a lot of overhead. It also
> restarts polling until the number of processed frames reaches the budget,
> which can make the poll function loop back to restart_poll up to 63 times
> and call replenish_rx_pool up to 63 times. This can easily cause a soft
> lockup. So, don't replenish too often, and don't goto restart_poll in
> each poll invocation. If there are pending descriptors, fetch them in the
> next poll instance.

Does this improve performance?

> 
> Signed-off-by: Lijun Pan <lijunp213@gmail.com>
> ---
>  drivers/net/ethernet/ibm/ibmvnic.c | 15 +++------------
>  1 file changed, 3 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/net/ethernet/ibm/ibmvnic.c
> b/drivers/net/ethernet/ibm/ibmvnic.c
> index ffb2a91750c7..fae1eaa39dd0 100644
> --- a/drivers/net/ethernet/ibm/ibmvnic.c
> +++ b/drivers/net/ethernet/ibm/ibmvnic.c
> @@ -2435,7 +2435,6 @@ static int ibmvnic_poll(struct napi_struct
> *napi, int budget)
>  	frames_processed = 0;
>  	rx_scrq = adapter->rx_scrq[scrq_num];
> 
> -restart_poll:
>  	while (frames_processed < budget) {
>  		struct sk_buff *skb;
>  		struct ibmvnic_rx_buff *rx_buff;
> @@ -2512,20 +2511,12 @@ static int ibmvnic_poll(struct napi_struct
> *napi, int budget)
>  	}
> 
>  	if (adapter->state != VNIC_CLOSING &&
> -	    ((atomic_read(&adapter->rx_pool[scrq_num].available) <
> -	      adapter->req_rx_add_entries_per_subcrq / 2) ||
> -	      frames_processed < budget))

There is a budget that the driver should adhere to. Even with only one
frame, the driver should still process that frame within the budget.

> +	    (atomic_read(&adapter->rx_pool[scrq_num].available) <
> +	      adapter->req_rx_add_entries_per_subcrq / 2))
>  		replenish_rx_pool(adapter, &adapter->rx_pool[scrq_num]);
>  	if (frames_processed < budget) {
> -		if (napi_complete_done(napi, frames_processed)) {
> +		if (napi_complete_done(napi, frames_processed))
>  			enable_scrq_irq(adapter, rx_scrq);
> -			if (pending_scrq(adapter, rx_scrq)) {
> -				if (napi_reschedule(napi)) {
> -					disable_scrq_irq(adapter, rx_scrq);
> -					goto restart_poll;
> -				}
> -			}
> -		}
>  	}
>  	return frames_processed;
>  }


* Re: [PATCH net-next] net: ibm: replenish rx pool and poll less frequently
  2021-06-02 17:58 ` Dany Madden
@ 2021-06-02 18:23   ` Dany Madden
  0 siblings, 0 replies; 5+ messages in thread
From: Dany Madden @ 2021-06-02 18:23 UTC (permalink / raw)
  To: Lijun Pan; +Cc: netdev

On 2021-06-02 10:58, Dany Madden wrote:
> On 2021-06-02 10:01, Lijun Pan wrote:
>> The old mechanism replenishes the rx pool even if only one frame is
>> processed in the poll function, which causes a lot of overhead. It also
>> restarts polling until the number of processed frames reaches the budget,
>> which can make the poll function loop back to restart_poll up to 63 times
>> and call replenish_rx_pool up to 63 times. This can easily cause a soft
>> lockup. So, don't replenish too often, and don't goto restart_poll in
>> each poll invocation. If there are pending descriptors, fetch them in the
>> next poll instance.
> 
> Does this improve performance?
> 
>> 
>> Signed-off-by: Lijun Pan <lijunp213@gmail.com>
>> ---
>>  drivers/net/ethernet/ibm/ibmvnic.c | 15 +++------------
>>  1 file changed, 3 insertions(+), 12 deletions(-)
>> 
>> diff --git a/drivers/net/ethernet/ibm/ibmvnic.c
>> b/drivers/net/ethernet/ibm/ibmvnic.c
>> index ffb2a91750c7..fae1eaa39dd0 100644
>> --- a/drivers/net/ethernet/ibm/ibmvnic.c
>> +++ b/drivers/net/ethernet/ibm/ibmvnic.c
>> @@ -2435,7 +2435,6 @@ static int ibmvnic_poll(struct napi_struct
>> *napi, int budget)
>>  	frames_processed = 0;
>>  	rx_scrq = adapter->rx_scrq[scrq_num];
>> 
>> -restart_poll:
>>  	while (frames_processed < budget) {
>>  		struct sk_buff *skb;
>>  		struct ibmvnic_rx_buff *rx_buff;
>> @@ -2512,20 +2511,12 @@ static int ibmvnic_poll(struct napi_struct
>> *napi, int budget)
>>  	}
>> 
>>  	if (adapter->state != VNIC_CLOSING &&
>> -	    ((atomic_read(&adapter->rx_pool[scrq_num].available) <
>> -	      adapter->req_rx_add_entries_per_subcrq / 2) ||
>> -	      frames_processed < budget))
> 
> There is a budget that the driver should adhere to. Even with only one
> frame, the driver should still process that frame within the budget.
I meant it should replenish the buffers, because the commit that added
this check,
https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/commit/?id=41ed0a00ffcd903ece4304a4a65d95706115ffcb,
stated that a low frames_processed means few incoming packets, so use the
idle time to refill the buffers.

So, it would be good to see some numbers on how this change performs in
comparison to the code before.

> 
>> +	    (atomic_read(&adapter->rx_pool[scrq_num].available) <
>> +	      adapter->req_rx_add_entries_per_subcrq / 2))
>>  		replenish_rx_pool(adapter, &adapter->rx_pool[scrq_num]);
>>  	if (frames_processed < budget) {
>> -		if (napi_complete_done(napi, frames_processed)) {
>> +		if (napi_complete_done(napi, frames_processed))
>>  			enable_scrq_irq(adapter, rx_scrq);
>> -			if (pending_scrq(adapter, rx_scrq)) {
>> -				if (napi_reschedule(napi)) {
>> -					disable_scrq_irq(adapter, rx_scrq);
>> -					goto restart_poll;
>> -				}
>> -			}
>> -		}
>>  	}
>>  	return frames_processed;
>>  }


* Re: [PATCH net-next] net: ibm: replenish rx pool and poll less frequently
  2021-06-02 17:01 [PATCH net-next] net: ibm: replenish rx pool and poll less frequently Lijun Pan
  2021-06-02 17:47 ` Rick Lindsley
  2021-06-02 17:58 ` Dany Madden
@ 2021-06-02 19:25 ` Sukadev Bhattiprolu
  2 siblings, 0 replies; 5+ messages in thread
From: Sukadev Bhattiprolu @ 2021-06-02 19:25 UTC (permalink / raw)
  To: Lijun Pan; +Cc: netdev

Lijun Pan [lijunp213@gmail.com] wrote:
> The old mechanism replenishes the rx pool even if only one frame is
> processed in the poll function, which causes a lot of overhead. It also

The soft lockup is not seen when replenishing a small number of buffers at
a time. It's only seen under some conditions when replenishing a _large_
number at once - apparently because the netdev_alloc_skb() calls
collectively take a long time.

Replenishing a small number at a time is not a problem.

> restarts polling until the number of processed frames reaches the budget,
> which can make the poll function loop back to restart_poll up to 63 times
> and call replenish_rx_pool up to 63 times. This can easily cause a soft
> lockup. So, don't replenish too often, and don't goto restart_poll in each

The 64 is from the budget the system gave us. And for us to hit the goto
restart_poll:
	a. pending_scrq() in the while loop must not have found a packet,
	   and
	b. by the time we have replenished the pool, completed napi, etc.,
	   we must have found a packet.

For this to happen 64 times, we must find
	- exactly zero packets in a., and
	- exactly one packet in b., and
	- that tight sequence must occur 64 times.

IOW, it's more theoretical, right?

Even if it did happen a handful of times, the only "overheads" in the
replenish are the netdev_alloc_skb() calls and the send-subcrq-indirect
hcall.

The skb alloc cannot be avoided - we must do it now or in the future
anyway. The hcall is issued once every 16 skbs. If we issue it for fewer
than 16 skbs, it means the traffic is extremely low, and there is no
point optimizing for that. Besides, the hcalls are not very expensive.

There was a lot of testing done in Nov 2020 when the subcrq-indirect
hcall support was added. We would need to repeat that testing at the
least.

Thanks,

Sukadev


