* [PATCH 1/1] powerpc/vnic: Extend "failover pending" window
@ 2020-09-23  4:53 Sukadev Bhattiprolu
  2020-09-23  8:00 ` Lijun Pan
  0 siblings, 1 reply; 5+ messages in thread
From: Sukadev Bhattiprolu @ 2020-09-23  4:53 UTC (permalink / raw)
  To: netdev; +Cc: drt, ljp


From 547fa5627b63102f3ef80edffff3a032d62c88c5 Mon Sep 17 00:00:00 2001
From: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Date: Thu, 10 Sep 2020 11:18:41 -0700
Subject: [PATCH 1/1] powerpc/vnic: Extend "failover pending" window

Commit 5a18e1e0c193b introduced the 'failover_pending' state to track
the "failover pending window" - where we wait for the partner to become
ready (after a transport event) before actually attempting to failover.
i.e., the window is between the following two events:

        a. we get a transport event due to a FAILOVER

        b. later, we get CRQ_INITIALIZED indicating the partner is
           ready, at which point we schedule a FAILOVER reset.

and ->failover_pending is true during this window.

If during this window, we attempt to open (or close) a device, we pretend
that the operation succeeded and let the FAILOVER reset path complete the
operation.

This is fine, except if the transport event ("a" above) occurs during the
open and after open has already checked whether a failover is pending. If
that happens, we fail the open, which can cause the boot scripts to leave
the interface down, requiring the administrator to manually bring up the device.

This fix "extends" the failover pending window till we are _actually_
ready to perform the failover reset (i.e., until after we get the RTNL
lock). Since open() holds the RTNL lock, we can be sure that we either
finish the open or if the open() fails due to the failover pending window,
we can again pretend that open is done and let the failover complete it.

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
---
 drivers/net/ethernet/ibm/ibmvnic.c | 33 +++++++++++++++++++++++++-----
 1 file changed, 28 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 1b702a43a5d0..cf75a649ed8b 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -1197,18 +1197,29 @@ static int ibmvnic_open(struct net_device *netdev)
 	if (adapter->state != VNIC_CLOSED) {
 		rc = ibmvnic_login(netdev);
 		if (rc)
-			return rc;
+			goto out;
 
 		rc = init_resources(adapter);
 		if (rc) {
-			netdev_err(netdev, "failed to initialize resources\n");
+			netdev_err(netdev,
+				"failed to initialize resources, failover %d\n",
+				adapter->failover_pending);
 			release_resources(adapter);
-			return rc;
+			goto out;
 		}
 	}
 
 	rc = __ibmvnic_open(netdev);
 
+out:
+	/*
+	 * If open fails due to a pending failover, set device state and
+	 * return. Device operation will be handled by reset routine.
+	 */
+	if (rc && adapter->failover_pending) {
+		adapter->state = VNIC_OPEN;
+		rc = 0;
+	}
 	return rc;
 }
 
@@ -1931,6 +1942,13 @@ static int do_reset(struct ibmvnic_adapter *adapter,
 		   rwi->reset_reason);
 
 	rtnl_lock();
+	/*
+	 * Now that we have the rtnl lock, clear any pending failover.
+	 * This will ensure ibmvnic_open() has either completed or will
+	 * block until failover is complete.
+	 */
+	if (rwi->reset_reason == VNIC_RESET_FAILOVER)
+		adapter->failover_pending = false;
 
 	netif_carrier_off(netdev);
 	adapter->reset_reason = rwi->reset_reason;
@@ -2275,9 +2293,15 @@ static int ibmvnic_reset(struct ibmvnic_adapter *adapter,
 	unsigned long flags;
 	int ret;
 
+	/*
+	 * If failover is pending don't schedule any other reset.
+	 * Instead let the failover complete. If there is already a
+	 * failover reset scheduled, we will detect and drop the
+	 * duplicate reset when walking the ->rwi_list below.
+	 */
 	if (adapter->state == VNIC_REMOVING ||
 	    adapter->state == VNIC_REMOVED ||
-	    adapter->failover_pending) {
+	    (adapter->failover_pending && reason != VNIC_RESET_FAILOVER)) {
 		ret = EBUSY;
 		netdev_dbg(netdev, "Adapter removing or pending failover, skipping reset\n");
 		goto err;
@@ -4653,7 +4677,6 @@ static void ibmvnic_handle_crq(union ibmvnic_crq *crq,
 		case IBMVNIC_CRQ_INIT:
 			dev_info(dev, "Partner initialized\n");
 			adapter->from_passive_init = true;
-			adapter->failover_pending = false;
 			if (!completion_done(&adapter->init_done)) {
 				complete(&adapter->init_done);
 				adapter->init_done_rc = -EIO;
-- 
2.26.2
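
The behaviour this patch gives ibmvnic_open() can be modelled in user
space. The following is a minimal sketch, not driver code: struct
model_adapter, model_open_finish() and model_failover_reset() are
hypothetical names, the enum values are illustrative, and moving to
VNIC_OPEN on success is an assumption about what __ibmvnic_open() does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model types; the values are illustrative, not the driver's. */
enum model_state { VNIC_PROBED, VNIC_OPEN, VNIC_CLOSED };

struct model_adapter {
	enum model_state state;
	bool failover_pending;
};

/*
 * Models the "out:" label added to ibmvnic_open(). `rc` stands for the
 * result of the login/init/open steps that ran before it.
 */
static int model_open_finish(struct model_adapter *a, int rc)
{
	assert(a != NULL);

	if (rc == 0) {
		/* Assumption: a successful open leaves the device in VNIC_OPEN. */
		a->state = VNIC_OPEN;
		return 0;
	}

	if (a->failover_pending) {
		/* Pretend the open worked; the failover reset completes it. */
		a->state = VNIC_OPEN;
		return 0;
	}

	return rc; /* genuine failure, no failover to fall back on */
}

/* Models the point in do_reset() where the flag is cleared under RTNL. */
static void model_failover_reset(struct model_adapter *a)
{
	assert(a != NULL);
	a->failover_pending = false;
}
```

With failover_pending set, a failing open reports success and leaves the
model in VNIC_OPEN; with it clear, the original error code is returned
and the state is untouched.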



* Re: [PATCH 1/1] powerpc/vnic: Extend "failover pending" window
  2020-09-23  4:53 [PATCH 1/1] powerpc/vnic: Extend "failover pending" window Sukadev Bhattiprolu
@ 2020-09-23  8:00 ` Lijun Pan
  2020-09-23 17:01   ` Sukadev Bhattiprolu
  0 siblings, 1 reply; 5+ messages in thread
From: Lijun Pan @ 2020-09-23  8:00 UTC (permalink / raw)
  To: Sukadev Bhattiprolu; +Cc: netdev, drt, ljp



> On Sep 22, 2020, at 11:53 PM, Sukadev Bhattiprolu <sukadev@linux.ibm.com> wrote:
> 
> 
> From 547fa5627b63102f3ef80edffff3a032d62c88c5 Mon Sep 17 00:00:00 2001
> From: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
> Date: Thu, 10 Sep 2020 11:18:41 -0700
> Subject: [PATCH 1/1] powerpc/vnic: Extend "failover pending" window
> 
> Commit 5a18e1e0c193b introduced the 'failover_pending' state to track
> the "failover pending window" - where we wait for the partner to become
> ready (after a transport event) before actually attempting to failover.
> i.e., the window is between the following two events:
> 
>        a. we get a transport event due to a FAILOVER
> 
>        b. later, we get CRQ_INITIALIZED indicating the partner is
>           ready, at which point we schedule a FAILOVER reset.
> 
> and ->failover_pending is true during this window.
> 
> If during this window, we attempt to open (or close) a device, we pretend
> that the operation succeeded and let the FAILOVER reset path complete the
> operation.
> 
> This is fine, except if the transport event ("a" above) occurs during the
> open and after open has already checked whether a failover is pending. If
> that happens, we fail the open, which can cause the boot scripts to leave
> the interface down, requiring the administrator to manually bring up the device.
> 
> This fix "extends" the failover pending window till we are _actually_
> ready to perform the failover reset (i.e., until after we get the RTNL
> lock). Since open() holds the RTNL lock, we can be sure that we either
> finish the open or if the open() fails due to the failover pending window,
> we can again pretend that open is done and let the failover complete it.
> 
> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
> ---
> drivers/net/ethernet/ibm/ibmvnic.c | 33 +++++++++++++++++++++++++-----
> 1 file changed, 28 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
> index 1b702a43a5d0..cf75a649ed8b 100644
> --- a/drivers/net/ethernet/ibm/ibmvnic.c
> +++ b/drivers/net/ethernet/ibm/ibmvnic.c
> @@ -1197,18 +1197,29 @@ static int ibmvnic_open(struct net_device *netdev)
> 	if (adapter->state != VNIC_CLOSED) {
> 		rc = ibmvnic_login(netdev);
> 		if (rc)
> -			return rc;
> +			goto out;
> 
> 		rc = init_resources(adapter);
> 		if (rc) {
> -			netdev_err(netdev, "failed to initialize resources\n");
> +			netdev_err(netdev,
> +				"failed to initialize resources, failover %d\n",
> +				adapter->failover_pending);

Would "..., failover_pending=%d\n" be more explicit than "failover %d"?

> 			release_resources(adapter);
> -			return rc;
> +			goto out;
> 		}
> 	}
> 
> 	rc = __ibmvnic_open(netdev);
> 
> +out:
> +	/*
> +	 * If open fails due to a pending failover, set device state and
> +	 * return. Device operation will be handled by reset routine.
> +	 */
> +	if (rc && adapter->failover_pending) {
> +		adapter->state = VNIC_OPEN;
> +		rc = 0;
> +	}
> 	return rc;
> }
> 
> @@ -1931,6 +1942,13 @@ static int do_reset(struct ibmvnic_adapter *adapter,
> 		   rwi->reset_reason);
> 
> 	rtnl_lock();
> +	/*
> +	 * Now that we have the rtnl lock, clear any pending failover.
> +	 * This will ensure ibmvnic_open() has either completed or will
> +	 * block until failover is complete.
> +	 */
> +	if (rwi->reset_reason == VNIC_RESET_FAILOVER)
> +		adapter->failover_pending = false;

The window extends until here.
Note that the VNIC_RESET_FAILOVER case sometimes calls do_hard_reset()
instead of do_reset(), depending on whether adapter->force_reset_recovery
is true or false.

> 
> 	netif_carrier_off(netdev);
> 	adapter->reset_reason = rwi->reset_reason;
> @@ -2275,9 +2293,15 @@ static int ibmvnic_reset(struct ibmvnic_adapter *adapter,
> 	unsigned long flags;
> 	int ret;
> 
> +	/*
> +	 * If failover is pending don't schedule any other reset.
> +	 * Instead let the failover complete. If there is already a
> +	 * failover reset scheduled, we will detect and drop the
> +	 * duplicate reset when walking the ->rwi_list below.
> +	 */
> 	if (adapter->state == VNIC_REMOVING ||
> 	    adapter->state == VNIC_REMOVED ||
> -	    adapter->failover_pending) {
> +	    (adapter->failover_pending && reason != VNIC_RESET_FAILOVER)) {

I don't quite get "reason != VNIC_RESET_FAILOVER".
Isn't failover_pending meant to describe VNIC_RESET_FAILOVER only?
Please give an example where failover_pending is true and reason is not
VNIC_RESET_FAILOVER.

Lijun

> 		ret = EBUSY;
> 		netdev_dbg(netdev, "Adapter removing or pending failover, skipping reset\n");
> 		goto err;
> @@ -4653,7 +4677,6 @@ static void ibmvnic_handle_crq(union ibmvnic_crq *crq,
> 		case IBMVNIC_CRQ_INIT:
> 			dev_info(dev, "Partner initialized\n");
> 			adapter->from_passive_init = true;
> -			adapter->failover_pending = false;
> 			if (!completion_done(&adapter->init_done)) {
> 				complete(&adapter->init_done);
> 				adapter->init_done_rc = -EIO;
> -- 
> 2.26.2
> 



* Re: [PATCH 1/1] powerpc/vnic: Extend "failover pending" window
  2020-09-23  8:00 ` Lijun Pan
@ 2020-09-23 17:01   ` Sukadev Bhattiprolu
  0 siblings, 0 replies; 5+ messages in thread
From: Sukadev Bhattiprolu @ 2020-09-23 17:01 UTC (permalink / raw)
  To: Lijun Pan; +Cc: netdev, drt

Lijun Pan [ljp@linux.vnet.ibm.com] wrote:
> 
> 
> > On Sep 22, 2020, at 11:53 PM, Sukadev Bhattiprolu <sukadev@linux.ibm.com> wrote:
> > 
> > 
> > From 547fa5627b63102f3ef80edffff3a032d62c88c5 Mon Sep 17 00:00:00 2001
> > From: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
> > Date: Thu, 10 Sep 2020 11:18:41 -0700
> > Subject: [PATCH 1/1] powerpc/vnic: Extend "failover pending" window
> > 
> > Commit 5a18e1e0c193b introduced the 'failover_pending' state to track
> > the "failover pending window" - where we wait for the partner to become
> > ready (after a transport event) before actually attempting to failover.
> > i.e., the window is between the following two events:
> > 
> >        a. we get a transport event due to a FAILOVER
> > 
> >        b. later, we get CRQ_INITIALIZED indicating the partner is
> >           ready, at which point we schedule a FAILOVER reset.
> > 
> > and ->failover_pending is true during this window.
> > 
> > If during this window, we attempt to open (or close) a device, we pretend
> > that the operation succeeded and let the FAILOVER reset path complete the
> > operation.
> > 
> > This is fine, except if the transport event ("a" above) occurs during the
> > open and after open has already checked whether a failover is pending. If
> > that happens, we fail the open, which can cause the boot scripts to leave
> > the interface down, requiring the administrator to manually bring up the device.
> > 
> > This fix "extends" the failover pending window till we are _actually_
> > ready to perform the failover reset (i.e., until after we get the RTNL
> > lock). Since open() holds the RTNL lock, we can be sure that we either
> > finish the open or if the open() fails due to the failover pending window,
> > we can again pretend that open is done and let the failover complete it.
> > 
> > Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
> > ---
> > drivers/net/ethernet/ibm/ibmvnic.c | 33 +++++++++++++++++++++++++-----
> > 1 file changed, 28 insertions(+), 5 deletions(-)
> > 
> > diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
> > index 1b702a43a5d0..cf75a649ed8b 100644
> > --- a/drivers/net/ethernet/ibm/ibmvnic.c
> > +++ b/drivers/net/ethernet/ibm/ibmvnic.c
> > @@ -1197,18 +1197,29 @@ static int ibmvnic_open(struct net_device *netdev)
> > 	if (adapter->state != VNIC_CLOSED) {
> > 		rc = ibmvnic_login(netdev);
> > 		if (rc)
> > -			return rc;
> > +			goto out;
> > 
> > 		rc = init_resources(adapter);
> > 		if (rc) {
> > -			netdev_err(netdev, "failed to initialize resources\n");
> > +			netdev_err(netdev,
> > +				"failed to initialize resources, failover %d\n",
> > +				adapter->failover_pending);
> 
> Would "..., failover_pending=%d\n" be more explicit than "failover %d"?

Sure.

> 
> > 			release_resources(adapter);
> > -			return rc;
> > +			goto out;
> > 		}
> > 	}
> > 
> > 	rc = __ibmvnic_open(netdev);
> > 
> > +out:
> > +	/*
> > +	 * If open fails due to a pending failover, set device state and
> > +	 * return. Device operation will be handled by reset routine.
> > +	 */
> > +	if (rc && adapter->failover_pending) {
> > +		adapter->state = VNIC_OPEN;
> > +		rc = 0;
> > +	}
> > 	return rc;
> > }
> > 
> > @@ -1931,6 +1942,13 @@ static int do_reset(struct ibmvnic_adapter *adapter,
> > 		   rwi->reset_reason);
> > 
> > 	rtnl_lock();
> > +	/*
> > +	 * Now that we have the rtnl lock, clear any pending failover.
> > +	 * This will ensure ibmvnic_open() has either completed or will
> > +	 * block until failover is complete.
> > +	 */
> > +	if (rwi->reset_reason == VNIC_RESET_FAILOVER)
> > +		adapter->failover_pending = false;
> 
> The window extends until here.
> Note that the VNIC_RESET_FAILOVER case sometimes calls do_hard_reset()
> instead of do_reset(), depending on whether adapter->force_reset_recovery
> is true or false.

If we encounter an error during failover, we drop the lock, return an error,
and initiate a hard reset. At that point, failover is no longer pending, so
it's OK for failover_pending to be false?

Once the hard reset completes, the adapter should go back to the PROBED or
OPEN state. We still need to think about what happens to a concurrent open
if there is a hard reset in progress. It is probably OK to fail it, unlike
in this failover case?
> 
> > 
> > 	netif_carrier_off(netdev);
> > 	adapter->reset_reason = rwi->reset_reason;
> > @@ -2275,9 +2293,15 @@ static int ibmvnic_reset(struct ibmvnic_adapter *adapter,
> > 	unsigned long flags;
> > 	int ret;
> > 
> > +	/*
> > +	 * If failover is pending don't schedule any other reset.
> > +	 * Instead let the failover complete. If there is already a
> > +	 * failover reset scheduled, we will detect and drop the
> > +	 * duplicate reset when walking the ->rwi_list below.
> > +	 */
> > 	if (adapter->state == VNIC_REMOVING ||
> > 	    adapter->state == VNIC_REMOVED ||
> > -	    adapter->failover_pending) {
> > +	    (adapter->failover_pending && reason != VNIC_RESET_FAILOVER)) {
> 
> I don't quite get "reason != VNIC_RESET_FAILOVER".
> Isn't failover_pending meant to describe VNIC_RESET_FAILOVER only?
> Please give an example where failover_pending is true and reason is not
> VNIC_RESET_FAILOVER.

This function (ibmvnic_reset()) queues various reset requests for a
worker thread to handle later, right? So we could queue a failover reset
and, before the worker thread processes it, encounter another reason to
reset and end up here. In that case, both the existing and the new code
return EBUSY.

Thanks for the questions and comments.

Sukadev
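
The change to the early-exit check in ibmvnic_reset() discussed above
can be written as a pair of predicates. This is a minimal user-space
sketch, assuming only what the hunk shows: the function names are
hypothetical and the enum values are illustrative, not the driver's.

```c
#include <stdbool.h>

/* Illustrative subsets of the driver's state and reset-reason enums. */
enum model_state { VNIC_OPEN, VNIC_REMOVING, VNIC_REMOVED };
enum model_reason { VNIC_RESET_FAILOVER, VNIC_RESET_MOBILITY };

/* Check before the patch: any reset is refused while failover is pending. */
static bool reset_refused_before(enum model_state state, bool failover_pending,
				 enum model_reason reason)
{
	(void)reason; /* the old check does not look at the reason */
	return state == VNIC_REMOVING || state == VNIC_REMOVED ||
	       failover_pending;
}

/* Check after the patch: the failover reset itself is let through. */
static bool reset_refused_after(enum model_state state, bool failover_pending,
				enum model_reason reason)
{
	return state == VNIC_REMOVING || state == VNIC_REMOVED ||
	       (failover_pending && reason != VNIC_RESET_FAILOVER);
}
```

Both predicates refuse a non-failover reset while failover_pending is
set, which matches the EBUSY behaviour described above. They differ only
for reason == VNIC_RESET_FAILOVER, which the new check has to accept
because the flag is no longer cleared when the partner's init message
arrives.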


* Re: [PATCH 1/1] powerpc/vnic: Extend "failover pending" window
  2020-10-21  3:14 Sukadev Bhattiprolu
@ 2020-10-21  3:17 ` Lijun Pan
  0 siblings, 0 replies; 5+ messages in thread
From: Lijun Pan @ 2020-10-21  3:17 UTC (permalink / raw)
  To: netdev



> On Oct 20, 2020, at 10:14 PM, Sukadev Bhattiprolu <sukadev@linux.ibm.com> wrote:
> 
> Commit 5a18e1e0c193b introduced the 'failover_pending' state to track
> the "failover pending window" - where we wait for the partner to become
> ready (after a transport event) before actually attempting to failover.
> i.e., the window is between the following two events:
> 
>        a. we get a transport event due to a FAILOVER
> 
>        b. later, we get CRQ_INITIALIZED indicating the partner is
>           ready, at which point we schedule a FAILOVER reset.
> 
> and ->failover_pending is true during this window.
> 
> If during this window, we attempt to open (or close) a device, we pretend
> that the operation succeeded and let the FAILOVER reset path complete the
> operation.
> 
> This is fine, except if the transport event ("a" above) occurs during the
> open and after open has already checked whether a failover is pending. If
> that happens, we fail the open, which can cause the boot scripts to leave
> the interface down, requiring the administrator to manually bring up the device.
> 
> This fix "extends" the failover pending window till we are _actually_
> ready to perform the failover reset (i.e., until after we get the RTNL
> lock). Since open() holds the RTNL lock, we can be sure that we either
> finish the open or if the open() fails due to the failover pending window,
> we can again pretend that open is done and let the failover complete it.
> 
> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
> ---
> Changelog [v2]:
> 	[Brian King] Ensure we clear failover_pending during hard reset
> ---
> drivers/net/ethernet/ibm/ibmvnic.c | 36 ++++++++++++++++++++++++++----
> 1 file changed, 32 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
> index 1b702a43a5d0..2a0f6f6820db 100644
> --- a/drivers/net/ethernet/ibm/ibmvnic.c
> +++ b/drivers/net/ethernet/ibm/ibmvnic.c
> @@ -1197,18 +1197,27 @@ static int ibmvnic_open(struct net_device *netdev)
> 	if (adapter->state != VNIC_CLOSED) {
> 		rc = ibmvnic_login(netdev);
> 		if (rc)
> -			return rc;
> +			goto out;
> 
> 		rc = init_resources(adapter);
> 		if (rc) {
> 			netdev_err(netdev, "failed to initialize resources\n");
> 			release_resources(adapter);
> -			return rc;
> +			goto out;
> 		}
> 	}
> 
> 	rc = __ibmvnic_open(netdev);
> 
> +out:
> +	/*
> +	 * If open fails due to a pending failover, set device state and
> +	 * return. Device operation will be handled by reset routine.
> +	 */
> +	if (rc && adapter->failover_pending) {
> +		adapter->state = VNIC_OPEN;
> +		rc = 0;
> +	}
> 	return rc;
> }
> 
> @@ -1931,6 +1940,13 @@ static int do_reset(struct ibmvnic_adapter *adapter,
> 		   rwi->reset_reason);
> 
> 	rtnl_lock();
> +	/*
> +	 * Now that we have the rtnl lock, clear any pending failover.
> +	 * This will ensure ibmvnic_open() has either completed or will
> +	 * block until failover is complete.
> +	 */
> +	if (rwi->reset_reason == VNIC_RESET_FAILOVER)
> +		adapter->failover_pending = false;
> 
> 	netif_carrier_off(netdev);
> 	adapter->reset_reason = rwi->reset_reason;
> @@ -2211,6 +2227,13 @@ static void __ibmvnic_reset(struct work_struct *work)
> 			/* CHANGE_PARAM requestor holds rtnl_lock */
> 			rc = do_change_param_reset(adapter, rwi, reset_state);
> 		} else if (adapter->force_reset_recovery) {
> +			/*
> +			 * Since we are doing a hard reset now, clear the
> +			 * failover_pending flag so we don't ignore any
> +			 * future MOBILITY or other resets.
> +			 */
> +			adapter->failover_pending = false;
> +

I think it would be better to move the chunk of code above into
do_hard_reset(), as you do for do_reset(), if you really want to extend
the window this way.

Extending the window that long may cause some resets to be skipped in
some scenarios, though I don't yet know which.
We have already seen the migration reset being skipped in some cases.

So my point is that extending the window is somewhat risky. Do we have an
alternative that addresses the "open" problem you originally wanted to solve?
For example, would it be viable to change only the code in
ibmvnic_open() or __ibmvnic_open(), without extending this window?

> 			/* Transport event occurred during previous reset */
> 			if (adapter->wait_for_reset) {
> 				/* Previous was CHANGE_PARAM; caller locked */
> @@ -2275,9 +2298,15 @@ static int ibmvnic_reset(struct ibmvnic_adapter *adapter,
> 	unsigned long flags;
> 	int ret;
> 
> +	/*
> +	 * If failover is pending don't schedule any other reset.
> +	 * Instead let the failover complete. If there is already a
> +	 * failover reset scheduled, we will detect and drop the
> +	 * duplicate reset when walking the ->rwi_list below.
> +	 */
> 	if (adapter->state == VNIC_REMOVING ||
> 	    adapter->state == VNIC_REMOVED ||
> -	    adapter->failover_pending) {
> +	    (adapter->failover_pending && reason != VNIC_RESET_FAILOVER)) {
> 		ret = EBUSY;
> 		netdev_dbg(netdev, "Adapter removing or pending failover, skipping reset\n");
> 		goto err;
> @@ -4653,7 +4682,6 @@ static void ibmvnic_handle_crq(union ibmvnic_crq *crq,
> 		case IBMVNIC_CRQ_INIT:
> 			dev_info(dev, "Partner initialized\n");
> 			adapter->from_passive_init = true;
> -			adapter->failover_pending = false;
> 			if (!completion_done(&adapter->init_done)) {
> 				complete(&adapter->init_done);
> 				adapter->init_done_rc = -EIO;
> -- 
> 2.25.4
> 
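
The placement suggested above, clearing failover_pending inside
do_hard_reset() in the same way do_reset() does, might look roughly like
this. It is a user-space sketch under stated assumptions, not the
driver: every model_* name is hypothetical, the enum values are
illustrative, and the dispatch on force_reset_recovery is reduced to the
one branch visible in the quoted hunk.

```c
#include <stdbool.h>

/* Illustrative subset of the driver's reset reasons. */
enum model_reason { VNIC_RESET_FAILOVER, VNIC_RESET_MOBILITY };

struct model_adapter {
	bool failover_pending;
	bool force_reset_recovery;
};

/* As in the patch: a normal reset clears the flag only for a failover. */
static void model_do_reset(struct model_adapter *a, enum model_reason reason)
{
	if (reason == VNIC_RESET_FAILOVER)
		a->failover_pending = false;
}

/* Suggested placement: the hard reset path clears the flag itself. */
static void model_do_hard_reset(struct model_adapter *a)
{
	a->failover_pending = false;
}

/* Simplified worker: force_reset_recovery selects the hard reset path. */
static void model_reset_worker(struct model_adapter *a,
			       enum model_reason reason)
{
	if (a->force_reset_recovery)
		model_do_hard_reset(a);
	else
		model_do_reset(a, reason);
}
```

In this model the flag ends up false after either path handles a
failover, and the clearing stays next to the reset code that depends on
it instead of in the worker loop.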



* [PATCH 1/1] powerpc/vnic: Extend "failover pending" window
@ 2020-10-21  3:14 Sukadev Bhattiprolu
  2020-10-21  3:17 ` Lijun Pan
  0 siblings, 1 reply; 5+ messages in thread
From: Sukadev Bhattiprolu @ 2020-10-21  3:14 UTC (permalink / raw)
  To: netdev; +Cc: Dany Madden, Lijun Pan

Commit 5a18e1e0c193b introduced the 'failover_pending' state to track
the "failover pending window" - where we wait for the partner to become
ready (after a transport event) before actually attempting to failover.
i.e., the window is between the following two events:

        a. we get a transport event due to a FAILOVER

        b. later, we get CRQ_INITIALIZED indicating the partner is
           ready, at which point we schedule a FAILOVER reset.

and ->failover_pending is true during this window.

If during this window, we attempt to open (or close) a device, we pretend
that the operation succeeded and let the FAILOVER reset path complete the
operation.

This is fine, except if the transport event ("a" above) occurs during the
open and after open has already checked whether a failover is pending. If
that happens, we fail the open, which can cause the boot scripts to leave
the interface down, requiring the administrator to manually bring up the device.

This fix "extends" the failover pending window till we are _actually_
ready to perform the failover reset (i.e., until after we get the RTNL
lock). Since open() holds the RTNL lock, we can be sure that we either
finish the open or if the open() fails due to the failover pending window,
we can again pretend that open is done and let the failover complete it.

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
---
Changelog [v2]:
	[Brian King] Ensure we clear failover_pending during hard reset
---
 drivers/net/ethernet/ibm/ibmvnic.c | 36 ++++++++++++++++++++++++++----
 1 file changed, 32 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 1b702a43a5d0..2a0f6f6820db 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -1197,18 +1197,27 @@ static int ibmvnic_open(struct net_device *netdev)
 	if (adapter->state != VNIC_CLOSED) {
 		rc = ibmvnic_login(netdev);
 		if (rc)
-			return rc;
+			goto out;
 
 		rc = init_resources(adapter);
 		if (rc) {
 			netdev_err(netdev, "failed to initialize resources\n");
 			release_resources(adapter);
-			return rc;
+			goto out;
 		}
 	}
 
 	rc = __ibmvnic_open(netdev);
 
+out:
+	/*
+	 * If open fails due to a pending failover, set device state and
+	 * return. Device operation will be handled by reset routine.
+	 */
+	if (rc && adapter->failover_pending) {
+		adapter->state = VNIC_OPEN;
+		rc = 0;
+	}
 	return rc;
 }
 
@@ -1931,6 +1940,13 @@ static int do_reset(struct ibmvnic_adapter *adapter,
 		   rwi->reset_reason);
 
 	rtnl_lock();
+	/*
+	 * Now that we have the rtnl lock, clear any pending failover.
+	 * This will ensure ibmvnic_open() has either completed or will
+	 * block until failover is complete.
+	 */
+	if (rwi->reset_reason == VNIC_RESET_FAILOVER)
+		adapter->failover_pending = false;
 
 	netif_carrier_off(netdev);
 	adapter->reset_reason = rwi->reset_reason;
@@ -2211,6 +2227,13 @@ static void __ibmvnic_reset(struct work_struct *work)
 			/* CHANGE_PARAM requestor holds rtnl_lock */
 			rc = do_change_param_reset(adapter, rwi, reset_state);
 		} else if (adapter->force_reset_recovery) {
+			/*
+			 * Since we are doing a hard reset now, clear the
+			 * failover_pending flag so we don't ignore any
+			 * future MOBILITY or other resets.
+			 */
+			adapter->failover_pending = false;
+
 			/* Transport event occurred during previous reset */
 			if (adapter->wait_for_reset) {
 				/* Previous was CHANGE_PARAM; caller locked */
@@ -2275,9 +2298,15 @@ static int ibmvnic_reset(struct ibmvnic_adapter *adapter,
 	unsigned long flags;
 	int ret;
 
+	/*
+	 * If failover is pending don't schedule any other reset.
+	 * Instead let the failover complete. If there is already a
+	 * failover reset scheduled, we will detect and drop the
+	 * duplicate reset when walking the ->rwi_list below.
+	 */
 	if (adapter->state == VNIC_REMOVING ||
 	    adapter->state == VNIC_REMOVED ||
-	    adapter->failover_pending) {
+	    (adapter->failover_pending && reason != VNIC_RESET_FAILOVER)) {
 		ret = EBUSY;
 		netdev_dbg(netdev, "Adapter removing or pending failover, skipping reset\n");
 		goto err;
@@ -4653,7 +4682,6 @@ static void ibmvnic_handle_crq(union ibmvnic_crq *crq,
 		case IBMVNIC_CRQ_INIT:
 			dev_info(dev, "Partner initialized\n");
 			adapter->from_passive_init = true;
-			adapter->failover_pending = false;
 			if (!completion_done(&adapter->init_done)) {
 				complete(&adapter->init_done);
 				adapter->init_done_rc = -EIO;
-- 
2.25.4


