* [RFC PATCH net] ibmvnic: complete dev->poll nicely during adapter reset
@ 2021-03-05  7:44 Lijun Pan
  2021-03-05 18:41 ` Sukadev Bhattiprolu
  0 siblings, 1 reply; 4+ messages in thread
From: Lijun Pan @ 2021-03-05  7:44 UTC (permalink / raw)
  To: netdev; +Cc: kuba, davem, sukadev, drt, tlfalcon, Lijun Pan

The reset path will call ibmvnic_cleanup->ibmvnic_napi_disable
->napi_disable(). This is supposed to stop the polling.
Commit 21ecba6c48f9 ("ibmvnic: Exit polling routine correctly
during adapter reset") reported that during device reset, the
polling routine never completed and napi_disable slept indefinitely.
In order to solve that problem, resetting bit was checked and
napi_complete_done was called before dev->poll::ibmvnic_poll exited.

Checking for resetting bit in dev->poll is racy because resetting
bit may be false while being checked, but turns true immediately
afterwards.

Hence we call napi_complete in ibmvnic_napi_disable, which avoids
racing with the resetting bit and makes sure dev->poll and napi_disable
complete before the reset routine actually releases resources.

Fixes: 21ecba6c48f9 ("ibmvnic: Exit polling routine correctly during adapter reset")
Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
---
 drivers/net/ethernet/ibm/ibmvnic.c | 8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index b6102ccf9b90..338d3d071cec 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -785,6 +785,7 @@ static void ibmvnic_napi_disable(struct ibmvnic_adapter *adapter)
 
 	for (i = 0; i < adapter->req_rx_queues; i++) {
 		netdev_dbg(adapter->netdev, "Disabling napi[%d]\n", i);
+		napi_complete(&adapter->napi[i]);
 		napi_disable(&adapter->napi[i]);
 	}
 
@@ -2455,13 +2456,6 @@ static int ibmvnic_poll(struct napi_struct *napi, int budget)
 		u16 offset;
 		u8 flags = 0;
 
-		if (unlikely(test_bit(0, &adapter->resetting) &&
-			     adapter->reset_reason != VNIC_RESET_NON_FATAL)) {
-			enable_scrq_irq(adapter, rx_scrq);
-			napi_complete_done(napi, frames_processed);
-			return frames_processed;
-		}
-
 		if (!pending_scrq(adapter, rx_scrq))
 			break;
 		next = ibmvnic_next_scrq(adapter, rx_scrq);
-- 
2.23.0



* Re: [RFC PATCH net] ibmvnic: complete dev->poll nicely during adapter reset
  2021-03-05  7:44 [RFC PATCH net] ibmvnic: complete dev->poll nicely during adapter reset Lijun Pan
@ 2021-03-05 18:41 ` Sukadev Bhattiprolu
  2021-03-05 18:52   ` Lijun Pan
  0 siblings, 1 reply; 4+ messages in thread
From: Sukadev Bhattiprolu @ 2021-03-05 18:41 UTC (permalink / raw)
  To: Lijun Pan; +Cc: netdev, kuba, davem, drt, tlfalcon

Lijun Pan [ljp@linux.ibm.com] wrote:
> The reset path will call ibmvnic_cleanup->ibmvnic_napi_disable
> ->napi_disable(). This is supposed to stop the polling.
> Commit 21ecba6c48f9 ("ibmvnic: Exit polling routine correctly
> during adapter reset") reported that during device reset, the
> polling routine never completed and napi_disable slept indefinitely.
> In order to solve that problem, resetting bit was checked and
> napi_complete_done was called before dev->poll::ibmvnic_poll exited.
> 
> Checking for resetting bit in dev->poll is racy because resetting
> bit may be false while being checked, but turns true immediately
> afterwards.

Yes, have been testing a fix for that.
> 
> Hence we call napi_complete in ibmvnic_napi_disable, which avoids
> racing with the resetting bit and makes sure dev->poll and napi_disable

napi_complete() will prevent a new call to ibmvnic_poll() but what if
ibmvnic_poll() is already executing and attempting to access the scrqs
while the reset path is freeing them?

Sukadev


* Re: [RFC PATCH net] ibmvnic: complete dev->poll nicely during adapter reset
  2021-03-05 18:41 ` Sukadev Bhattiprolu
@ 2021-03-05 18:52   ` Lijun Pan
  2021-03-05 19:05     ` Sukadev Bhattiprolu
  0 siblings, 1 reply; 4+ messages in thread
From: Lijun Pan @ 2021-03-05 18:52 UTC (permalink / raw)
  To: Sukadev Bhattiprolu
  Cc: Lijun Pan, netdev, Jakub Kicinski, David S. Miller, Dany Madden,
	tlfalcon

On Fri, Mar 5, 2021 at 12:44 PM Sukadev Bhattiprolu
<sukadev@linux.ibm.com> wrote:
>
> Lijun Pan [ljp@linux.ibm.com] wrote:
> > The reset path will call ibmvnic_cleanup->ibmvnic_napi_disable
> > ->napi_disable(). This is supposed to stop the polling.
> > Commit 21ecba6c48f9 ("ibmvnic: Exit polling routine correctly
> > during adapter reset") reported that during device reset, the
> > polling routine never completed and napi_disable slept indefinitely.
> > In order to solve that problem, resetting bit was checked and
> > napi_complete_done was called before dev->poll::ibmvnic_poll exited.
> >
> > Checking for resetting bit in dev->poll is racy because resetting
> > bit may be false while being checked, but turns true immediately
> > afterwards.
>
> Yes, have been testing a fix for that.
> >
> > Hence we call napi_complete in ibmvnic_napi_disable, which avoids
> > racing with the resetting bit and makes sure dev->poll and napi_disable
>
> napi_complete() will prevent a new call to ibmvnic_poll() but what if
> ibmvnic_poll() is already executing and attempting to access the scrqs
> while the reset path is freeing them?
>
napi_complete() and napi_disable() are called in the earlier stages of the
reset path, i.e. before the reset path actually calls the functions that
free the scrqs.
So I don't think this is an issue here.


* Re: [RFC PATCH net] ibmvnic: complete dev->poll nicely during adapter reset
  2021-03-05 18:52   ` Lijun Pan
@ 2021-03-05 19:05     ` Sukadev Bhattiprolu
  0 siblings, 0 replies; 4+ messages in thread
From: Sukadev Bhattiprolu @ 2021-03-05 19:05 UTC (permalink / raw)
  To: Lijun Pan
  Cc: Lijun Pan, netdev, Jakub Kicinski, David S. Miller, Dany Madden,
	tlfalcon

Lijun Pan [lijunp213@gmail.com] wrote:
> On Fri, Mar 5, 2021 at 12:44 PM Sukadev Bhattiprolu
> <sukadev@linux.ibm.com> wrote:
> >
> > Lijun Pan [ljp@linux.ibm.com] wrote:
> > > The reset path will call ibmvnic_cleanup->ibmvnic_napi_disable
> > > ->napi_disable(). This is supposed to stop the polling.
> > > Commit 21ecba6c48f9 ("ibmvnic: Exit polling routine correctly
> > > during adapter reset") reported that during device reset, the
> > > polling routine never completed and napi_disable slept indefinitely.
> > > In order to solve that problem, resetting bit was checked and
> > > napi_complete_done was called before dev->poll::ibmvnic_poll exited.
> > >
> > > Checking for resetting bit in dev->poll is racy because resetting
> > > bit may be false while being checked, but turns true immediately
> > > afterwards.
> >
> > Yes, have been testing a fix for that.
> > >
> > > Hence we call napi_complete in ibmvnic_napi_disable, which avoids
> > > racing with the resetting bit and makes sure dev->poll and napi_disable
> >
> > napi_complete() will prevent a new call to ibmvnic_poll() but what if
> > ibmvnic_poll() is already executing and attempting to access the scrqs
> > while the reset path is freeing them?
> >
> napi_complete() and napi_disable() are called in the earlier stages of the
> reset path, i.e. before the reset path actually calls the functions that
> free the scrqs.

Yes, those will prevent a _new_ call to poll, right?

But what if poll is already executing? What prevents it from accessing
an scrq that the reset path will free?

> So I don't think this is an issue here.

