* [PATCH net 00/15] ibmvnic: assorted bug fixes
@ 2020-11-20 22:40 Lijun Pan
  2020-11-20 22:40 ` [PATCH net 01/15] ibmvnic: handle inconsistent login with reset Lijun Pan
                   ` (15 more replies)
  0 siblings, 16 replies; 28+ messages in thread
From: Lijun Pan @ 2020-11-20 22:40 UTC (permalink / raw)
  To: netdev; +Cc: sukadev, drt, Lijun Pan

Assorted bug fixes and improvements for ibmvnic.

Dany Madden (9):
  ibmvnic: handle inconsistent login with reset
  ibmvnic: process HMC disable command
  ibmvnic: stop free_all_rwi on failed reset
  ibmvnic: remove free_all_rwi function
  ibmvnic: avoid memset null scrq msgs
  ibmvnic: restore adapter state on failed reset
  ibmvnic: send_login should check for crq errors
  ibmvnic: no reset timeout for 5 seconds after reset
  ibmvnic: reduce wait for completion time

Lijun Pan (3):
  ibmvnic: fix NULL pointer dereference in reset_sub_crq_queues
  ibmvnic: fix NULL pointer dereference in ibmvic_reset_crq
  ibmvnic: enhance resetting status check during module exit

Sukadev Bhattiprolu (3):
  ibmvnic: delay next reset if hard reset failed
  ibmvnic: track pending login
  ibmvnic: add some debugs

 drivers/net/ethernet/ibm/ibmvnic.c | 246 +++++++++++++++++++++--------
 drivers/net/ethernet/ibm/ibmvnic.h |   9 +-
 2 files changed, 183 insertions(+), 72 deletions(-)

-- 
2.23.0



* [PATCH net 01/15] ibmvnic: handle inconsistent login with reset
  2020-11-20 22:40 [PATCH net 00/15] ibmvnic: assorted bug fixes Lijun Pan
@ 2020-11-20 22:40 ` Lijun Pan
  2020-11-21 23:36   ` Jakub Kicinski
  2020-11-20 22:40 ` [PATCH net 02/15] ibmvnic: process HMC disable command Lijun Pan
                   ` (14 subsequent siblings)
  15 siblings, 1 reply; 28+ messages in thread
From: Lijun Pan @ 2020-11-20 22:40 UTC (permalink / raw)
  To: netdev; +Cc: sukadev, drt, Lijun Pan

From: Dany Madden <drt@linux.ibm.com>

An inconsistent login with the vnicserver currently causes the device to
be removed, which gives the device no chance to recover from the error
state. Schedule a FATAL reset instead to bring the adapter back up.

Signed-off-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
---
 drivers/net/ethernet/ibm/ibmvnic.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 2aa40b2f225c..dcb23015b6b4 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -4412,7 +4412,7 @@ static int handle_login_rsp(union ibmvnic_crq *login_rsp_crq,
 	     adapter->req_rx_add_queues !=
 	     be32_to_cpu(login_rsp->num_rxadd_subcrqs))) {
 		dev_err(dev, "FATAL: Inconsistent login and login rsp\n");
-		ibmvnic_remove(adapter->vdev);
+		ibmvnic_reset(adapter, VNIC_RESET_FATAL);
 		return -EIO;
 	}
 	size_array = (u64 *)((u8 *)(adapter->login_rsp_buf) +
-- 
2.23.0



* [PATCH net 02/15] ibmvnic: process HMC disable command
  2020-11-20 22:40 [PATCH net 00/15] ibmvnic: assorted bug fixes Lijun Pan
  2020-11-20 22:40 ` [PATCH net 01/15] ibmvnic: handle inconsistent login with reset Lijun Pan
@ 2020-11-20 22:40 ` Lijun Pan
  2020-11-21 23:36   ` Jakub Kicinski
  2020-11-20 22:40 ` [PATCH net 03/15] ibmvnic: stop free_all_rwi on failed reset Lijun Pan
                   ` (13 subsequent siblings)
  15 siblings, 1 reply; 28+ messages in thread
From: Lijun Pan @ 2020-11-20 22:40 UTC (permalink / raw)
  To: netdev; +Cc: sukadev, drt

From: Dany Madden <drt@linux.ibm.com>

Currently ibmvnic does not support the disable vnic command from the
Hardware Management Console. This patch enables ibmvnic to process
CRQ message 0x07, disable vnic adapter.

Signed-off-by: Dany Madden <drt@linux.ibm.com>
---
 drivers/net/ethernet/ibm/ibmvnic.c | 40 ++++++++++++++++++++++++++++++
 drivers/net/ethernet/ibm/ibmvnic.h |  3 ++-
 2 files changed, 42 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index dcb23015b6b4..82074e503ba9 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -109,6 +109,8 @@ static void release_crq_queue(struct ibmvnic_adapter *);
 static int __ibmvnic_set_mac(struct net_device *, u8 *);
 static int init_crq_queue(struct ibmvnic_adapter *adapter);
 static int send_query_phys_parms(struct ibmvnic_adapter *adapter);
+static void ibmvnic_disable(struct ibmvnic_adapter *adapter);
+static int ibmvnic_close(struct net_device *netdev);
 
 struct ibmvnic_stat {
 	char name[ETH_GSTRING_LEN];
@@ -1209,6 +1211,42 @@ static int ibmvnic_open(struct net_device *netdev)
 	return rc;
 }
 
+static void ibmvnic_disable(struct ibmvnic_adapter *adapter)
+{
+	struct list_head *entry, *tmp_entry;
+	struct net_device *netdev = adapter->netdev;
+	int rc = 0;
+
+	/* cancel all pending resets in the queue */
+	if (!list_empty(&adapter->rwi_list)) {
+		list_for_each_safe(entry, tmp_entry, &adapter->rwi_list)
+			list_del(entry);
+	}
+
+	/* wait for current reset to finish */
+	flush_work(&adapter->ibmvnic_reset);
+	flush_delayed_work(&adapter->ibmvnic_delayed_reset);
+
+	if (test_bit(0, &adapter->resetting) ||
+	    adapter->state == VNIC_PROBED ||
+	    adapter->state == VNIC_OPEN ||
+	    adapter->state == VNIC_OPENING) {
+		rc = ibmvnic_close(netdev);
+		/* Expect -EINVAL when crq is no longer active. Set link down
+		 * would fail.
+		 */
+		if (rc && rc != -EINVAL) {
+			netdev_err(netdev, "Failed to disable adapter, rc=%d\n", rc);
+			return;
+		}
+	} else {
+		netdev_dbg(netdev, "Disable adapter request ignored (state=%d)\n", adapter->state);
+		return;
+	}
+
+	netdev_dbg(netdev, "Adapter disabled\n");
+}
+
 static void clean_rx_pools(struct ibmvnic_adapter *adapter)
 {
 	struct ibmvnic_rx_pool *rx_pool;
@@ -4789,6 +4827,8 @@ static void ibmvnic_handle_crq(union ibmvnic_crq *crq,
 		} else if (gen_crq->cmd == IBMVNIC_DEVICE_FAILOVER) {
 			dev_info(dev, "Backing device failover detected\n");
 			adapter->failover_pending = true;
+		} else if (gen_crq->cmd == IBMVNIC_DEVICE_DISABLE) {
+			ibmvnic_disable(adapter);
 		} else {
 			/* The adapter lost the connection */
 			dev_err(dev, "Virtual Adapter failed (rc=%d)\n",
diff --git a/drivers/net/ethernet/ibm/ibmvnic.h b/drivers/net/ethernet/ibm/ibmvnic.h
index 217dcc7ded70..af68f85534bc 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.h
+++ b/drivers/net/ethernet/ibm/ibmvnic.h
@@ -834,10 +834,11 @@ enum ibmvnic_crq_type {
 	IBMVNIC_CRQ_XPORT_EVENT		= 0xFF,
 };
 
-enum ibmvfc_crq_format {
+enum ibmvnic_crq_format {
 	IBMVNIC_CRQ_INIT                 = 0x01,
 	IBMVNIC_CRQ_INIT_COMPLETE        = 0x02,
 	IBMVNIC_PARTITION_MIGRATED       = 0x06,
+	IBMVNIC_DEVICE_DISABLE		 = 0x07,
 	IBMVNIC_DEVICE_FAILOVER          = 0x08,
 };
 
-- 
2.23.0



* [PATCH net 03/15] ibmvnic: stop free_all_rwi on failed reset
  2020-11-20 22:40 [PATCH net 00/15] ibmvnic: assorted bug fixes Lijun Pan
  2020-11-20 22:40 ` [PATCH net 01/15] ibmvnic: handle inconsistent login with reset Lijun Pan
  2020-11-20 22:40 ` [PATCH net 02/15] ibmvnic: process HMC disable command Lijun Pan
@ 2020-11-20 22:40 ` Lijun Pan
  2020-11-20 22:40 ` [PATCH net 04/15] ibmvnic: remove free_all_rwi function Lijun Pan
                   ` (12 subsequent siblings)
  15 siblings, 0 replies; 28+ messages in thread
From: Lijun Pan @ 2020-11-20 22:40 UTC (permalink / raw)
  To: netdev; +Cc: sukadev, drt

From: Dany Madden <drt@linux.ibm.com>

When ibmvnic fails to reset, it breaks out of the reset loop and frees
all of the remaining resets from the workqueue. Doing so prevents the
adapter from recovering if no reset is scheduled after that. Instead,
have the driver continue to process resets on the workqueue.

Signed-off-by: Dany Madden <drt@linux.ibm.com>
---
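For illustration only, here is a minimal userspace sketch of the loop
behaviour described in the commit message above: a failed reset is logged
and the loop carries on with the next queued request. The queue type, the
helper names and the fail_on_reason_two() handler are hypothetical
stand-ins, not the driver's code.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for one queued reset request (a "rwi"). */
struct rwi {
	int reason;
	struct rwi *next;
};

/* Minimal FIFO of pending resets; not the driver's rwi_list. */
struct rwi_queue {
	struct rwi *head;
	struct rwi *tail;
};

/* Append a reset request; returns 0 on success, -1 if allocation fails. */
static int queue_rwi(struct rwi_queue *q, int reason)
{
	struct rwi *rwi = malloc(sizeof(*rwi));

	if (!rwi)
		return -1;
	rwi->reason = reason;
	rwi->next = NULL;
	if (q->tail)
		q->tail->next = rwi;
	else
		q->head = rwi;
	q->tail = rwi;
	return 0;
}

/* Detach and return the oldest request, or NULL when the queue is empty. */
static struct rwi *get_next_rwi(struct rwi_queue *q)
{
	struct rwi *rwi = q->head;

	if (rwi) {
		q->head = rwi->next;
		if (!q->head)
			q->tail = NULL;
	}
	return rwi;
}

/* Example reset handler for illustration: only reason 2 fails. */
static int fail_on_reason_two(int reason)
{
	return reason == 2 ? -1 : 0;
}

/* Drain the queue. A failed reset is logged and the loop moves on to the
 * next request instead of breaking out and freeing what is left.
 * Returns the number of resets that succeeded.
 */
static int process_resets(struct rwi_queue *q, int (*do_reset)(int reason))
{
	struct rwi *rwi;
	int ok = 0;

	while ((rwi = get_next_rwi(q)) != NULL) {
		int rc = do_reset(rwi->reason);

		free(rwi);
		if (rc)
			fprintf(stderr, "Reset failed, rc=%d\n", rc);
		else
			ok++;
	}
	return ok;
}
```

With three requests queued and the second one failing, the third is still
attempted, so process_resets() reports two successful resets and leaves
the queue empty.
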
 drivers/net/ethernet/ibm/ibmvnic.c | 11 +++--------
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 82074e503ba9..9e097c05e249 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -2291,9 +2291,9 @@ static void __ibmvnic_reset(struct work_struct *work)
 			else
 				adapter->state = reset_state;
 			rc = 0;
-		} else if (rc && rc != IBMVNIC_INIT_FAILED &&
-		    !adapter->force_reset_recovery)
-			break;
+		}
+		if (rc)
+			netdev_dbg(adapter->netdev, "Reset failed, rc=%d\n", rc);
 
 		rwi = get_next_rwi(adapter);
 
@@ -2307,11 +2307,6 @@ static void __ibmvnic_reset(struct work_struct *work)
 		complete(&adapter->reset_done);
 	}
 
-	if (rc) {
-		netdev_dbg(adapter->netdev, "Reset failed\n");
-		free_all_rwi(adapter);
-	}
-
 	clear_bit_unlock(0, &adapter->resetting);
 }
 
-- 
2.23.0



* [PATCH net 04/15] ibmvnic: remove free_all_rwi function
  2020-11-20 22:40 [PATCH net 00/15] ibmvnic: assorted bug fixes Lijun Pan
                   ` (2 preceding siblings ...)
  2020-11-20 22:40 ` [PATCH net 03/15] ibmvnic: stop free_all_rwi on failed reset Lijun Pan
@ 2020-11-20 22:40 ` Lijun Pan
  2020-11-21 23:39   ` Jakub Kicinski
  2020-11-20 22:40 ` [PATCH net 05/15] ibmvnic: avoid memset null scrq msgs Lijun Pan
                   ` (11 subsequent siblings)
  15 siblings, 1 reply; 28+ messages in thread
From: Lijun Pan @ 2020-11-20 22:40 UTC (permalink / raw)
  To: netdev; +Cc: sukadev, drt

From: Dany Madden <drt@linux.ibm.com>

Remove free_all_rwi() since it is no longer used. (__ibmvnic_reset() was
the last user of free_all_rwi()).

Signed-off-by: Dany Madden <drt@linux.ibm.com>
---
 drivers/net/ethernet/ibm/ibmvnic.c | 11 -----------
 1 file changed, 11 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 9e097c05e249..f0924019e617 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -2211,17 +2211,6 @@ static struct ibmvnic_rwi *get_next_rwi(struct ibmvnic_adapter *adapter)
 	return rwi;
 }
 
-static void free_all_rwi(struct ibmvnic_adapter *adapter)
-{
-	struct ibmvnic_rwi *rwi;
-
-	rwi = get_next_rwi(adapter);
-	while (rwi) {
-		kfree(rwi);
-		rwi = get_next_rwi(adapter);
-	}
-}
-
 static void __ibmvnic_reset(struct work_struct *work)
 {
 	struct ibmvnic_rwi *rwi;
-- 
2.23.0



* [PATCH net 05/15] ibmvnic: avoid memset null scrq msgs
  2020-11-20 22:40 [PATCH net 00/15] ibmvnic: assorted bug fixes Lijun Pan
                   ` (3 preceding siblings ...)
  2020-11-20 22:40 ` [PATCH net 04/15] ibmvnic: remove free_all_rwi function Lijun Pan
@ 2020-11-20 22:40 ` Lijun Pan
  2020-11-20 22:40 ` [PATCH net 06/15] ibmvnic: restore adapter state on failed reset Lijun Pan
                   ` (10 subsequent siblings)
  15 siblings, 0 replies; 28+ messages in thread
From: Lijun Pan @ 2020-11-20 22:40 UTC (permalink / raw)
  To: netdev; +Cc: sukadev, drt, Lijun Pan

From: Dany Madden <drt@linux.ibm.com>

scrq->msgs can be NULL during a device reset, and calling memset() on it
crashes the kernel. Check scrq and scrq->msgs for NULL before the memset().

Signed-off-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
---
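A minimal userspace sketch of the guard ordering, for illustration: the
pointer itself is tested first, and only then its message buffer, before
anything is cleared. struct sub_crq and reset_queue() are hypothetical,
simplified stand-ins and do not mirror the driver's structures.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified sub-CRQ: just a message buffer and a cursor. */
struct sub_crq {
	void *msgs;
	size_t msgs_len;
	int cur;
};

/* Clear the queue for reuse. Nothing is dereferenced until the
 * corresponding pointer has been checked.
 */
static int reset_queue(struct sub_crq *scrq)
{
	if (!scrq)
		return -EINVAL;	/* no fields of scrq may be touched here */
	if (!scrq->msgs)
		return -EINVAL;	/* buffer already freed by an earlier step */

	memset(scrq->msgs, 0, scrq->msgs_len);
	scrq->cur = 0;
	return 0;
}
```

Returning an error from both branches lets the caller treat a half torn
down queue as a failed reset rather than crashing.
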
 drivers/net/ethernet/ibm/ibmvnic.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index f0924019e617..ceafd999a1ac 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -2882,16 +2882,26 @@ static int reset_one_sub_crq_queue(struct ibmvnic_adapter *adapter,
 				   struct ibmvnic_sub_crq_queue *scrq)
 {
 	int rc;
+	if (!scrq) {
+		/* scrq is NULL here, so none of its fields can be printed */
+		netdev_dbg(adapter->netdev,
+			   "Invalid scrq reset: scrq is NULL\n");
+		return -EINVAL;
+	}
 
 	if (scrq->irq) {
 		free_irq(scrq->irq, scrq);
 		irq_dispose_mapping(scrq->irq);
 		scrq->irq = 0;
 	}
-
-	memset(scrq->msgs, 0, 4 * PAGE_SIZE);
-	atomic_set(&scrq->used, 0);
-	scrq->cur = 0;
+	if (scrq->msgs) {
+		memset(scrq->msgs, 0, 4 * PAGE_SIZE);
+		atomic_set(&scrq->used, 0);
+		scrq->cur = 0;
+	} else {
+		netdev_dbg(adapter->netdev, "Invalid scrq reset\n");
+		return -EINVAL;
+	}
 
 	rc = h_reg_sub_crq(adapter->vdev->unit_address, scrq->msg_token,
 			   4 * PAGE_SIZE, &scrq->crq_num, &scrq->hw_irq);
-- 
2.23.0



* [PATCH net 06/15] ibmvnic: restore adapter state on failed reset
  2020-11-20 22:40 [PATCH net 00/15] ibmvnic: assorted bug fixes Lijun Pan
                   ` (4 preceding siblings ...)
  2020-11-20 22:40 ` [PATCH net 05/15] ibmvnic: avoid memset null scrq msgs Lijun Pan
@ 2020-11-20 22:40 ` Lijun Pan
  2020-11-20 22:40 ` [PATCH net 07/15] ibmvnic: delay next reset if hard reset failed Lijun Pan
                   ` (9 subsequent siblings)
  15 siblings, 0 replies; 28+ messages in thread
From: Lijun Pan @ 2020-11-20 22:40 UTC (permalink / raw)
  To: netdev; +Cc: sukadev, drt

From: Dany Madden <drt@linux.ibm.com>

After a failed reset, the driver can end up in the VNIC_PROBED or
VNIC_CLOSED state and be unable to recover in subsequent resets, leaving
the adapter offline. Restore the adapter state to reset_state, the state
the adapter was in when the reset was called.

Signed-off-by: Dany Madden <drt@linux.ibm.com>
---
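The single-exit pattern described above can be sketched in userspace as
follows. This is only an illustration under simplifying assumptions: the
state names, struct adapter_sketch and reset_sketch() are hypothetical,
and the login and open results are passed in as parameters instead of
coming from real calls.

```c
/* Hypothetical subset of adapter states. */
enum sketch_state {
	SK_PROBED,
	SK_CLOSED,
	SK_OPEN,
};

struct adapter_sketch {
	enum sketch_state state;
};

/* Walk through the reset stages. Every failure jumps to the single exit
 * label, where the state recorded on entry is restored.
 */
static int reset_sketch(struct adapter_sketch *a, int login_rc, int open_rc)
{
	enum sketch_state reset_state = a->state;
	int rc;

	a->state = SK_PROBED;		/* teardown leaves us "probed" */

	rc = login_rc;			/* stands in for the login step */
	if (rc)
		goto out;

	a->state = SK_CLOSED;
	if (reset_state == SK_CLOSED)
		goto out;		/* was closed before, stay closed */

	rc = open_rc;			/* stands in for the open step */
	if (rc)
		goto out;

	a->state = SK_OPEN;
out:
	/* restore the adapter state if the reset failed */
	if (rc)
		a->state = reset_state;
	return rc;
}
```

Because the restore happens in one place, a new failure path only needs a
"goto out" and cannot forget to put the state back.
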
 drivers/net/ethernet/ibm/ibmvnic.c | 67 ++++++++++++++++--------------
 1 file changed, 36 insertions(+), 31 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index ceafd999a1ac..6f775ca4bea1 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -1895,7 +1895,7 @@ static int do_change_param_reset(struct ibmvnic_adapter *adapter,
 	if (reset_state == VNIC_OPEN) {
 		rc = __ibmvnic_close(netdev);
 		if (rc)
-			return rc;
+			goto out;
 	}
 
 	release_resources(adapter);
@@ -1913,24 +1913,25 @@ static int do_change_param_reset(struct ibmvnic_adapter *adapter,
 	}
 
 	rc = ibmvnic_reset_init(adapter, true);
-	if (rc)
-		return IBMVNIC_INIT_FAILED;
+	if (rc) {
+		rc = IBMVNIC_INIT_FAILED;
+		goto out;
+	}
 
 	/* If the adapter was in PROBE state prior to the reset,
 	 * exit here.
 	 */
 	if (reset_state == VNIC_PROBED)
-		return 0;
+		goto out;
 
 	rc = ibmvnic_login(netdev);
 	if (rc) {
-		adapter->state = reset_state;
-		return rc;
+		goto out;
 	}
 
 	rc = init_resources(adapter);
 	if (rc)
-		return rc;
+		goto out;
 
 	ibmvnic_disable_irqs(adapter);
 
@@ -1940,8 +1941,10 @@ static int do_change_param_reset(struct ibmvnic_adapter *adapter,
 		return 0;
 
 	rc = __ibmvnic_open(netdev);
-	if (rc)
-		return IBMVNIC_OPEN_FAILED;
+	if (rc) {
+		rc = IBMVNIC_OPEN_FAILED;
+		goto out;
+	}
 
 	/* refresh device's multicast list */
 	ibmvnic_set_multi(netdev);
@@ -1950,7 +1953,10 @@ static int do_change_param_reset(struct ibmvnic_adapter *adapter,
 	for (i = 0; i < adapter->req_rx_queues; i++)
 		napi_schedule(&adapter->napi[i]);
 
-	return 0;
+out:
+	if (rc)
+		adapter->state = reset_state;
+	return rc;
 }
 
 /**
@@ -2053,7 +2059,6 @@ static int do_reset(struct ibmvnic_adapter *adapter,
 
 		rc = ibmvnic_login(netdev);
 		if (rc) {
-			adapter->state = reset_state;
 			goto out;
 		}
 
@@ -2121,6 +2126,9 @@ static int do_reset(struct ibmvnic_adapter *adapter,
 	rc = 0;
 
 out:
+	/* restore the adapter state if reset failed */
+	if (rc)
+		adapter->state = reset_state;
 	rtnl_unlock();
 
 	return rc;
@@ -2153,43 +2161,46 @@ static int do_hard_reset(struct ibmvnic_adapter *adapter,
 	if (rc) {
 		netdev_err(adapter->netdev,
 			   "Couldn't initialize crq. rc=%d\n", rc);
-		return rc;
+		goto out;
 	}
 
 	rc = ibmvnic_reset_init(adapter, false);
 	if (rc)
-		return rc;
+		goto out;
 
 	/* If the adapter was in PROBE state prior to the reset,
 	 * exit here.
 	 */
 	if (reset_state == VNIC_PROBED)
-		return 0;
+		goto out;
 
 	rc = ibmvnic_login(netdev);
-	if (rc) {
-		adapter->state = VNIC_PROBED;
-		return 0;
-	}
+	if (rc)
+		goto out;
 
 	rc = init_resources(adapter);
 	if (rc)
-		return rc;
+		goto out;
 
 	ibmvnic_disable_irqs(adapter);
 	adapter->state = VNIC_CLOSED;
 
 	if (reset_state == VNIC_CLOSED)
-		return 0;
+		goto out;
 
 	rc = __ibmvnic_open(netdev);
-	if (rc)
-		return IBMVNIC_OPEN_FAILED;
+	if (rc) {
+		rc = IBMVNIC_OPEN_FAILED;
+		goto out;
+	}
 
 	call_netdevice_notifiers(NETDEV_NOTIFY_PEERS, netdev);
 	call_netdevice_notifiers(NETDEV_RESEND_IGMP, netdev);
-
-	return 0;
+out:
+	/* restore adapter state if reset failed */
+	if (rc)
+		adapter->state = reset_state;
+	return rc;
 }
 
 static struct ibmvnic_rwi *get_next_rwi(struct ibmvnic_adapter *adapter)
@@ -2274,13 +2285,7 @@ static void __ibmvnic_reset(struct work_struct *work)
 			rc = do_reset(adapter, rwi, reset_state);
 		}
 		kfree(rwi);
-		if (rc == IBMVNIC_OPEN_FAILED) {
-			if (list_empty(&adapter->rwi_list))
-				adapter->state = VNIC_CLOSED;
-			else
-				adapter->state = reset_state;
-			rc = 0;
-		}
+
 		if (rc)
 			netdev_dbg(adapter->netdev, "Reset failed, rc=%d\n", rc);
 
-- 
2.23.0



* [PATCH net 07/15] ibmvnic: delay next reset if hard reset failed
  2020-11-20 22:40 [PATCH net 00/15] ibmvnic: assorted bug fixes Lijun Pan
                   ` (5 preceding siblings ...)
  2020-11-20 22:40 ` [PATCH net 06/15] ibmvnic: restore adapter state on failed reset Lijun Pan
@ 2020-11-20 22:40 ` Lijun Pan
  2020-11-20 22:40 ` [PATCH net 08/15] ibmvnic: track pending login Lijun Pan
                   ` (8 subsequent siblings)
  15 siblings, 0 replies; 28+ messages in thread
From: Lijun Pan @ 2020-11-20 22:40 UTC (permalink / raw)
  To: netdev; +Cc: sukadev, drt

From: Sukadev Bhattiprolu <sukadev@linux.ibm.com>

If auto-priority failover is enabled, the backing device needs time
to settle if hard resetting fails for any reason. So add a delay of
60 seconds before retrying the hard-reset.

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
---
 drivers/net/ethernet/ibm/ibmvnic.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 6f775ca4bea1..b0a93556a51b 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -2280,6 +2280,14 @@ static void __ibmvnic_reset(struct work_struct *work)
 				rc = do_hard_reset(adapter, rwi, reset_state);
 				rtnl_unlock();
 			}
+			if (rc) {
+				/* give backing device time to settle down */
+				netdev_dbg(adapter->netdev,
+					   "[S:%d] Hard reset failed, waiting 60 secs\n",
+					   adapter->state);
+				set_current_state(TASK_UNINTERRUPTIBLE);
+				schedule_timeout(60 * HZ);
+			}
 		} else if (!(rwi->reset_reason == VNIC_RESET_FATAL &&
 				adapter->from_passive_init)) {
 			rc = do_reset(adapter, rwi, reset_state);
-- 
2.23.0



* [PATCH net 08/15] ibmvnic: track pending login
  2020-11-20 22:40 [PATCH net 00/15] ibmvnic: assorted bug fixes Lijun Pan
                   ` (6 preceding siblings ...)
  2020-11-20 22:40 ` [PATCH net 07/15] ibmvnic: delay next reset if hard reset failed Lijun Pan
@ 2020-11-20 22:40 ` Lijun Pan
  2020-11-20 22:40 ` [PATCH net 09/15] ibmvnic: send_login should check for crq errors Lijun Pan
                   ` (7 subsequent siblings)
  15 siblings, 0 replies; 28+ messages in thread
From: Lijun Pan @ 2020-11-20 22:40 UTC (permalink / raw)
  To: netdev; +Cc: sukadev, drt

From: Sukadev Bhattiprolu <sukadev@linux.ibm.com>

If ibmvnic sends a LOGIN and then gets a FAILOVER, the worker thread can
start the reset process and free the login response buffer before the
(now stale) LOGIN_RSP arrives. The ibmvnic tasklet will then try to
access the freed login response buffer and crash.

This patch tracks when ibmvnic sends a LOGIN and discards any stale login
responses.

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
---
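As a rough userspace illustration of the flag-based tracking described
above: a response is accepted only if a login is outstanding, and a
re-initialisation event clears the flag so that a late response is
dropped. All names here (struct login_tracker and the sketch_* helpers)
are hypothetical, and the sketch assumes a single context handles the
events, as the tasklet does.

```c
#include <stdbool.h>

/* Hypothetical tracker; the counters exist only to observe behaviour. */
struct login_tracker {
	bool login_pending;
	int accepted;
	int ignored;
};

/* Mark a login as outstanding just before the request is sent. */
static void sketch_send_login(struct login_tracker *t)
{
	t->login_pending = true;
}

/* The partner re-initialised: any response still in flight is stale. */
static void sketch_handle_init(struct login_tracker *t)
{
	t->login_pending = false;
}

/* Returns true if the response was expected and has been consumed. */
static bool sketch_handle_login_rsp(struct login_tracker *t)
{
	if (!t->login_pending) {
		t->ignored++;	/* stale or unsolicited response */
		return false;
	}
	t->login_pending = false;
	t->accepted++;
	return true;
}
```

A response that arrives after sketch_handle_init() is counted as ignored,
so the code that would touch the already freed response buffer is never
reached.
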
 drivers/net/ethernet/ibm/ibmvnic.c | 17 +++++++++++++++++
 drivers/net/ethernet/ibm/ibmvnic.h |  1 +
 2 files changed, 18 insertions(+)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index b0a93556a51b..c8242c0bfee0 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -3876,6 +3876,8 @@ static int send_login(struct ibmvnic_adapter *adapter)
 	crq.login.cmd = LOGIN;
 	crq.login.ioba = cpu_to_be32(buffer_token);
 	crq.login.len = cpu_to_be32(buffer_size);
+
+	adapter->login_pending = true;
 	ibmvnic_send_crq(adapter, &crq);
 
 	return 0;
@@ -4428,6 +4430,15 @@ static int handle_login_rsp(union ibmvnic_crq *login_rsp_crq,
 	u64 *size_array;
 	int i;
 
+	/* CHECK: Test/set of login_pending does not need to be atomic
+	 * because only ibmvnic_tasklet tests/clears this.
+	 */
+	if (!adapter->login_pending) {
+		netdev_warn(netdev, "Ignoring unexpected login response\n");
+		return 0;
+	}
+	adapter->login_pending = false;
+
 	dma_unmap_single(dev, adapter->login_buf_token, adapter->login_buf_sz,
 			 DMA_TO_DEVICE);
 	dma_unmap_single(dev, adapter->login_rsp_buf_token,
@@ -4799,6 +4810,11 @@ static void ibmvnic_handle_crq(union ibmvnic_crq *crq,
 		case IBMVNIC_CRQ_INIT:
 			dev_info(dev, "Partner initialized\n");
 			adapter->from_passive_init = true;
+			/* Discard any stale login responses from prev reset.
+			 * CHECK: should we clear even on INIT_COMPLETE?
+			 */
+			adapter->login_pending = false;
+
 			if (!completion_done(&adapter->init_done)) {
 				complete(&adapter->init_done);
 				adapter->init_done_rc = -EIO;
@@ -5230,6 +5246,7 @@ static int ibmvnic_probe(struct vio_dev *dev, const struct vio_device_id *id)
 	dev_set_drvdata(&dev->dev, netdev);
 	adapter->vdev = dev;
 	adapter->netdev = netdev;
+	adapter->login_pending = false;
 
 	ether_addr_copy(adapter->mac_addr, mac_addr_p);
 	ether_addr_copy(netdev->dev_addr, adapter->mac_addr);
diff --git a/drivers/net/ethernet/ibm/ibmvnic.h b/drivers/net/ethernet/ibm/ibmvnic.h
index af68f85534bc..9b1f34602f33 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.h
+++ b/drivers/net/ethernet/ibm/ibmvnic.h
@@ -1088,6 +1088,7 @@ struct ibmvnic_adapter {
 	struct delayed_work ibmvnic_delayed_reset;
 	unsigned long resetting;
 	bool napi_enabled, from_passive_init;
+	bool login_pending;
 
 	bool failover_pending;
 	bool force_reset_recovery;
-- 
2.23.0



* [PATCH net 09/15] ibmvnic: send_login should check for crq errors
  2020-11-20 22:40 [PATCH net 00/15] ibmvnic: assorted bug fixes Lijun Pan
                   ` (7 preceding siblings ...)
  2020-11-20 22:40 ` [PATCH net 08/15] ibmvnic: track pending login Lijun Pan
@ 2020-11-20 22:40 ` Lijun Pan
  2020-11-20 22:40 ` [PATCH net 10/15] ibmvnic: no reset timeout for 5 seconds after reset Lijun Pan
                   ` (6 subsequent siblings)
  15 siblings, 0 replies; 28+ messages in thread
From: Lijun Pan @ 2020-11-20 22:40 UTC (permalink / raw)
  To: netdev; +Cc: sukadev, drt

From: Dany Madden <drt@linux.ibm.com>

send_login() does not check the result of ibmvnic_send_crq() for the
login request. As a result, the driver needlessly retries the login
10 times even when the CRQ is no longer active. Check the return code and
give up if sending the CRQ fails.

The only time we want to retry is if we get a PARTIALSUCCESS response
from the partner.

Signed-off-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
---
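The retry policy described above can be sketched in userspace roughly as
follows: stop at once on a send error, and retry only while the partner
reports partial success. The result codes, login_with_retry() and the two
example callbacks are hypothetical and exist only for this illustration.

```c
/* Hypothetical result codes for one login attempt. */
enum login_rc {
	LOGIN_SEND_ERR = -1,	/* the request could not be sent at all */
	LOGIN_OK = 0,
	LOGIN_PARTIAL = 1,	/* partner asks us to renegotiate and retry */
};

/* Example callback: sending always fails. */
static int always_send_error(int attempt)
{
	(void)attempt;
	return LOGIN_SEND_ERR;
}

/* Example callback: partial success twice, then a clean login. */
static int partial_twice_then_ok(int attempt)
{
	return attempt < 2 ? LOGIN_PARTIAL : LOGIN_OK;
}

/* Try to log in at most max_attempts times. Only LOGIN_PARTIAL leads to
 * another attempt; any other result, including a send error, ends the
 * loop immediately. The number of attempts made is stored through
 * attempts_made.
 */
static int login_with_retry(int (*try_login)(int attempt), int max_attempts,
			    int *attempts_made)
{
	int rc = LOGIN_SEND_ERR;
	int attempt;

	*attempts_made = 0;
	for (attempt = 0; attempt < max_attempts; attempt++) {
		rc = try_login(attempt);
		(*attempts_made)++;
		if (rc != LOGIN_PARTIAL)
			break;
	}
	return rc;
}
```

With always_send_error() the loop gives up after a single attempt instead
of using up the whole retry budget.
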
 drivers/net/ethernet/ibm/ibmvnic.c | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index c8242c0bfee0..9d2eebd31ff6 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -852,10 +852,8 @@ static int ibmvnic_login(struct net_device *netdev)
 		adapter->init_done_rc = 0;
 		reinit_completion(&adapter->init_done);
 		rc = send_login(adapter);
-		if (rc) {
-			netdev_warn(netdev, "Unable to login\n");
+		if (rc)
 			return rc;
-		}
 
 		if (!wait_for_completion_timeout(&adapter->init_done,
 						 timeout)) {
@@ -3764,15 +3762,16 @@ static int send_login(struct ibmvnic_adapter *adapter)
 	struct ibmvnic_login_rsp_buffer *login_rsp_buffer;
 	struct ibmvnic_login_buffer *login_buffer;
 	struct device *dev = &adapter->vdev->dev;
+	struct vnic_login_client_data *vlcd;
 	dma_addr_t rsp_buffer_token;
 	dma_addr_t buffer_token;
 	size_t rsp_buffer_size;
 	union ibmvnic_crq crq;
+	int client_data_len;
 	size_t buffer_size;
 	__be64 *tx_list_p;
 	__be64 *rx_list_p;
-	int client_data_len;
-	struct vnic_login_client_data *vlcd;
+	int rc;
 	int i;
 
 	if (!adapter->tx_scrq || !adapter->rx_scrq) {
@@ -3878,16 +3877,23 @@ static int send_login(struct ibmvnic_adapter *adapter)
 	crq.login.len = cpu_to_be32(buffer_size);
 
 	adapter->login_pending = true;
-	ibmvnic_send_crq(adapter, &crq);
+	rc = ibmvnic_send_crq(adapter, &crq);
+	if (rc) {
+		adapter->login_pending = false;
+		netdev_err(adapter->netdev, "Failed to send login, rc=%d\n", rc);
+		goto buf_rsp_map_failed;
+	}
 
 	return 0;
 
 buf_rsp_map_failed:
 	kfree(login_rsp_buffer);
+	adapter->login_rsp_buf = NULL;
 buf_rsp_alloc_failed:
 	dma_unmap_single(dev, buffer_token, buffer_size, DMA_TO_DEVICE);
 buf_map_failed:
 	kfree(login_buffer);
+	adapter->login_buf = NULL;
 buf_alloc_failed:
 	return -1;
 }
-- 
2.23.0



* [PATCH net 10/15] ibmvnic: no reset timeout for 5 seconds after reset
  2020-11-20 22:40 [PATCH net 00/15] ibmvnic: assorted bug fixes Lijun Pan
                   ` (8 preceding siblings ...)
  2020-11-20 22:40 ` [PATCH net 09/15] ibmvnic: send_login should check for crq errors Lijun Pan
@ 2020-11-20 22:40 ` Lijun Pan
  2020-11-20 23:01   ` drt
  2020-11-20 22:40 ` [PATCH net 11/15] ibmvnic: reduce wait for completion time Lijun Pan
                   ` (5 subsequent siblings)
  15 siblings, 1 reply; 28+ messages in thread
From: Lijun Pan @ 2020-11-20 22:40 UTC (permalink / raw)
  To: netdev; +Cc: sukadev, drt

From: Dany Madden <drt@linux.ibm.com>

The timeout reset is going off right after an adapter reset. Only
schedule a timeout reset if at least 5 seconds have passed since the
last reset; 5 seconds is the default watchdog timeout.

Signed-off-by: Dany Madden <drt@linux.ibm.com>
---
 drivers/net/ethernet/ibm/ibmvnic.c | 11 +++++++++--
 drivers/net/ethernet/ibm/ibmvnic.h |  2 ++
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 9d2eebd31ff6..252af4ab6468 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -2291,6 +2291,7 @@ static void __ibmvnic_reset(struct work_struct *work)
 			rc = do_reset(adapter, rwi, reset_state);
 		}
 		kfree(rwi);
+		adapter->last_reset_time = jiffies;
 
 		if (rc)
 			netdev_dbg(adapter->netdev, "Reset failed, rc=%d\n", rc);
@@ -2394,7 +2395,13 @@ static void ibmvnic_tx_timeout(struct net_device *dev, unsigned int txqueue)
 			   "Adapter is resetting, skip timeout reset\n");
 		return;
 	}
-
+	/* No queuing up reset until at least 5 seconds (default watchdog val)
+	 * after last reset
+	 */
+	if (time_before(jiffies, (adapter->last_reset_time + dev->watchdog_timeo))) {
+		netdev_dbg(dev, "Not yet time to tx timeout.\n");
+		return;
+	}
 	ibmvnic_reset(adapter, VNIC_RESET_TIMEOUT);
 }
 
@@ -5316,7 +5323,7 @@ static int ibmvnic_probe(struct vio_dev *dev, const struct vio_device_id *id)
 	adapter->state = VNIC_PROBED;
 
 	adapter->wait_for_reset = false;
-
+	adapter->last_reset_time = jiffies;
 	return 0;
 
 ibmvnic_register_fail:
diff --git a/drivers/net/ethernet/ibm/ibmvnic.h b/drivers/net/ethernet/ibm/ibmvnic.h
index 9b1f34602f33..d15866cbc2a6 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.h
+++ b/drivers/net/ethernet/ibm/ibmvnic.h
@@ -1089,6 +1089,8 @@ struct ibmvnic_adapter {
 	unsigned long resetting;
 	bool napi_enabled, from_passive_init;
 	bool login_pending;
+	/* last device reset time */
+	unsigned long last_reset_time;
 
 	bool failover_pending;
 	bool force_reset_recovery;
-- 
2.23.0



* [PATCH net 11/15] ibmvnic: reduce wait for completion time
  2020-11-20 22:40 [PATCH net 00/15] ibmvnic: assorted bug fixes Lijun Pan
                   ` (9 preceding siblings ...)
  2020-11-20 22:40 ` [PATCH net 10/15] ibmvnic: no reset timeout for 5 seconds after reset Lijun Pan
@ 2020-11-20 22:40 ` Lijun Pan
  2020-11-20 22:40 ` [PATCH net 12/15] ibmvnic: fix NULL pointer dereference in reset_sub_crq_queues Lijun Pan
                   ` (4 subsequent siblings)
  15 siblings, 0 replies; 28+ messages in thread
From: Lijun Pan @ 2020-11-20 22:40 UTC (permalink / raw)
  To: netdev; +Cc: sukadev, drt

From: Dany Madden <drt@linux.ibm.com>

Reduce the wait time for Command Response Queue response from 30 seconds
to 20 seconds, as recommended by VIOS and Power Hypervisor teams.

Signed-off-by: Dany Madden <drt@linux.ibm.com>
---
 drivers/net/ethernet/ibm/ibmvnic.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 252af4ab6468..47446e5f8ec5 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -836,7 +836,7 @@ static void release_napi(struct ibmvnic_adapter *adapter)
 static int ibmvnic_login(struct net_device *netdev)
 {
 	struct ibmvnic_adapter *adapter = netdev_priv(netdev);
-	unsigned long timeout = msecs_to_jiffies(30000);
+	unsigned long timeout = msecs_to_jiffies(20000);
 	int retry_count = 0;
 	int retries = 10;
 	bool retry;
@@ -940,7 +940,7 @@ static void release_resources(struct ibmvnic_adapter *adapter)
 static int set_link_state(struct ibmvnic_adapter *adapter, u8 link_state)
 {
 	struct net_device *netdev = adapter->netdev;
-	unsigned long timeout = msecs_to_jiffies(30000);
+	unsigned long timeout = msecs_to_jiffies(20000);
 	union ibmvnic_crq crq;
 	bool resend;
 	int rc;
@@ -5164,7 +5164,7 @@ static int init_crq_queue(struct ibmvnic_adapter *adapter)
 static int ibmvnic_reset_init(struct ibmvnic_adapter *adapter, bool reset)
 {
 	struct device *dev = &adapter->vdev->dev;
-	unsigned long timeout = msecs_to_jiffies(30000);
+	unsigned long timeout = msecs_to_jiffies(20000);
 	u64 old_num_rx_queues, old_num_tx_queues;
 	int rc;
 
-- 
2.23.0



* [PATCH net 12/15] ibmvnic: fix NULL pointer dereference in reset_sub_crq_queues
  2020-11-20 22:40 [PATCH net 00/15] ibmvnic: assorted bug fixes Lijun Pan
                   ` (10 preceding siblings ...)
  2020-11-20 22:40 ` [PATCH net 11/15] ibmvnic: reduce wait for completion time Lijun Pan
@ 2020-11-20 22:40 ` Lijun Pan
  2020-11-21 23:44   ` Jakub Kicinski
  2020-11-20 22:40 ` [PATCH net 13/15] ibmvnic: fix NULL pointer dereference in ibmvic_reset_crq Lijun Pan
                   ` (3 subsequent siblings)
  15 siblings, 1 reply; 28+ messages in thread
From: Lijun Pan @ 2020-11-20 22:40 UTC (permalink / raw)
  To: netdev; +Cc: sukadev, drt, Lijun Pan

adapter->tx_scrq and adapter->rx_scrq could be NULL if the previous reset
did not complete after freeing sub crqs. Check for NULL before
dereferencing them.

Snippet of call trace:
ibmvnic 30000006 env6: Releasing sub-CRQ
ibmvnic 30000006 env6: Releasing CRQ
...
ibmvnic 30000006 env6: Got Control IP offload Response
ibmvnic 30000006 env6: Re-setting tx_scrq[0]
BUG: Kernel NULL pointer dereference on read at 0x00000000
Faulting instruction address: 0xc008000003dea7cc
Oops: Kernel access of bad area, sig: 11 [#1]
LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
Modules linked in: rpadlpar_io rpaphp xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat nft_counter nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables xsk_diag tcp_diag udp_diag raw_diag inet_diag unix_diag af_packet_diag netlink_diag tun bridge stp llc rfkill sunrpc pseries_rng xts vmx_crypto uio_pdrv_genirq uio binfmt_misc ip_tables xfs libcrc32c sd_mod t10_pi sg ibmvscsi ibmvnic ibmveth scsi_transport_srp dm_mirror dm_region_hash dm_log dm_mod
CPU: 80 PID: 1856 Comm: kworker/80:2 Tainted: G        W         5.8.0+ #4
Workqueue: events __ibmvnic_reset [ibmvnic]
NIP:  c008000003dea7cc LR: c008000003dea7bc CTR: 0000000000000000
REGS: c0000007ef7db860 TRAP: 0380   Tainted: G        W          (5.8.0+)
MSR:  800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 28002422  XER: 0000000d
CFAR: c000000000bd9520 IRQMASK: 0
GPR00: c008000003dea7bc c0000007ef7dbaf0 c008000003df7400 c0000007fa26ec00
GPR04: c0000007fcd0d008 c0000007fcd96350 0000000000000027 c0000007fcd0d010
GPR08: 0000000000000023 0000000000000000 0000000000000000 0000000000000000
GPR12: 0000000000002000 c00000001ec18e00 c0000000001982f8 c0000007bad6e840
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000000 0000000000000000 0000000000000000 fffffffffffffef7
GPR24: 0000000000000402 c0000007fa26f3a8 0000000000000003 c00000016f8ec048
GPR28: 0000000000000000 0000000000000000 0000000000000000 c0000007fa26ec00
NIP [c008000003dea7cc] ibmvnic_reset_init+0x15c/0x258 [ibmvnic]
LR [c008000003dea7bc] ibmvnic_reset_init+0x14c/0x258 [ibmvnic]
Call Trace:
[c0000007ef7dbaf0] [c008000003dea7bc] ibmvnic_reset_init+0x14c/0x258 [ibmvnic] (unreliable)
[c0000007ef7dbb80] [c008000003de8860] __ibmvnic_reset+0x408/0x970 [ibmvnic]
[c0000007ef7dbc50] [c00000000018b7cc] process_one_work+0x2cc/0x800
[c0000007ef7dbd20] [c00000000018bd78] worker_thread+0x78/0x520
[c0000007ef7dbdb0] [c0000000001984c4] kthread+0x1d4/0x1e0
[c0000007ef7dbe20] [c00000000000cea8] ret_from_kernel_thread+0x5c/0x74

Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
---
 drivers/net/ethernet/ibm/ibmvnic.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 47446e5f8ec5..a0dbd963a1ab 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -2930,6 +2930,13 @@ static int reset_sub_crq_queues(struct ibmvnic_adapter *adapter)
 {
 	int i, rc;
 
+	if (!adapter->tx_scrq || !adapter->rx_scrq) {
+		netdev_err(adapter->netdev,
+			   "tx_scrq (%p) or rx_scrq (%p) does not exist\n",
+			   adapter->tx_scrq, adapter->rx_scrq);
+		return -EINVAL;
+	}
+
 	for (i = 0; i < adapter->req_tx_queues; i++) {
 		netdev_dbg(adapter->netdev, "Re-setting tx_scrq[%d]\n", i);
 		rc = reset_one_sub_crq_queue(adapter, adapter->tx_scrq[i]);
-- 
2.23.0



* [PATCH net 13/15] ibmvnic: fix NULL pointer dereference in ibmvic_reset_crq
  2020-11-20 22:40 [PATCH net 00/15] ibmvnic: assorted bug fixes Lijun Pan
                   ` (11 preceding siblings ...)
  2020-11-20 22:40 ` [PATCH net 12/15] ibmvnic: fix NULL pointer dereference in reset_sub_crq_queues Lijun Pan
@ 2020-11-20 22:40 ` Lijun Pan
  2020-11-20 22:40 ` [PATCH net 14/15] ibmvnic: enhance resetting status check during module exit Lijun Pan
                   ` (2 subsequent siblings)
  15 siblings, 0 replies; 28+ messages in thread
From: Lijun Pan @ 2020-11-20 22:40 UTC (permalink / raw)
  To: netdev; +Cc: sukadev, drt, Lijun Pan

crq->msgs could be NULL if the previous reset did not complete after
freeing crq->msgs. Check for NULL before dereferencing it.

Snippet of call trace:
...
ibmvnic 30000003 env3 (unregistering): Releasing sub-CRQ
ibmvnic 30000003 env3 (unregistering): Releasing CRQ
BUG: Kernel NULL pointer dereference on read at 0x00000000
Faulting instruction address: 0xc0000000000c1a30
Oops: Kernel access of bad area, sig: 11 [#1]
LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
Modules linked in: ibmvnic(E-) rpadlpar_io rpaphp xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat nft_counter nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables xsk_diag tcp_diag udp_diag tun raw_diag inet_diag unix_diag bridge af_packet_diag netlink_diag stp llc rfkill sunrpc pseries_rng xts vmx_crypto uio_pdrv_genirq uio binfmt_misc ip_tables xfs libcrc32c sd_mod t10_pi sg ibmvscsi ibmveth scsi_transport_srp dm_mirror dm_region_hash dm_log dm_mod [last unloaded: ibmvnic]
CPU: 20 PID: 8426 Comm: kworker/20:0 Tainted: G            E     5.10.0-rc1+ #12
Workqueue: events __ibmvnic_reset [ibmvnic]
NIP:  c0000000000c1a30 LR: c008000001b00c18 CTR: 0000000000000400
REGS: c00000000d05b7a0 TRAP: 0380   Tainted: G            E      (5.10.0-rc1+)
MSR:  800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 44002480  XER: 20040000
CFAR: c0000000000c19ec IRQMASK: 0
GPR00: 0000000000000400 c00000000d05ba30 c008000001b17c00 0000000000000000
GPR04: 0000000000000000 0000000000000000 0000000000000000 00000000000001e2
GPR08: 000000000001f400 ffffffffffffd950 0000000000000000 c008000001b0b280
GPR12: c0000000000c19c8 c00000001ec72e00 c00000000019a778 c00000002647b440
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000006 0000000000000001 0000000000000003 0000000000000002
GPR24: 0000000000001000 c008000001b0d570 0000000000000005 c00000007ab5d550
GPR28: c00000007ab5c000 c000000032fcf848 c00000007ab5cc00 c000000032fcf800
NIP [c0000000000c1a30] memset+0x68/0x104
LR [c008000001b00c18] ibmvnic_reset_crq+0x70/0x110 [ibmvnic]
Call Trace:
[c00000000d05ba30] [0000000000000800] 0x800 (unreliable)
[c00000000d05bab0] [c008000001b0a930] do_reset.isra.40+0x224/0x634 [ibmvnic]
[c00000000d05bb80] [c008000001b08574] __ibmvnic_reset+0x17c/0x3c0 [ibmvnic]
[c00000000d05bc50] [c00000000018d9ac] process_one_work+0x2cc/0x800
[c00000000d05bd20] [c00000000018df58] worker_thread+0x78/0x520
[c00000000d05bdb0] [c00000000019a934] kthread+0x1c4/0x1d0
[c00000000d05be20] [c00000000000d5d0] ret_from_kernel_thread+0x5c/0x6c

Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
---
 drivers/net/ethernet/ibm/ibmvnic.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index a0dbd963a1ab..dda2d4bb9b40 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -5053,6 +5053,10 @@ static int ibmvnic_reset_crq(struct ibmvnic_adapter *adapter)
 	} while (rc == H_BUSY || H_IS_LONG_BUSY(rc));
 
 	/* Clean out the queue */
+	if (!crq->msgs) {
+		netdev_warn(adapter->netdev, "crq->msgs == NULL\n");
+		return -EINVAL;
+	}
 	memset(crq->msgs, 0, PAGE_SIZE);
 	crq->cur = 0;
 	crq->active = false;
-- 
2.23.0



* [PATCH net 14/15] ibmvnic: enhance resetting status check during module exit
  2020-11-20 22:40 [PATCH net 00/15] ibmvnic: assorted bug fixes Lijun Pan
                   ` (12 preceding siblings ...)
  2020-11-20 22:40 ` [PATCH net 13/15] ibmvnic: fix NULL pointer dereference in ibmvic_reset_crq Lijun Pan
@ 2020-11-20 22:40 ` Lijun Pan
  2020-11-20 22:40 ` [PATCH net 15/15] ibmvnic: add some debugs Lijun Pan
  2020-11-23 19:38 ` [PATCH net 00/15] ibmvnic: assorted bug fixes ljp
  15 siblings, 0 replies; 28+ messages in thread
From: Lijun Pan @ 2020-11-20 22:40 UTC (permalink / raw)
  To: netdev; +Cc: sukadev, drt, Lijun Pan

Based on the discussion with Sukadev Bhattiprolu and Dany Madden,
we believe that checking the adapter->resetting bit is preferred,
since the VNIC_RESETTING state flag is not as strict as the resetting bit.
The VNIC_RESETTING state flag is removed since it is now redundant.

Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
---
 drivers/net/ethernet/ibm/ibmvnic.c | 3 +--
 drivers/net/ethernet/ibm/ibmvnic.h | 3 +--
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index dda2d4bb9b40..b1519e92ccce 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -2251,7 +2251,6 @@ static void __ibmvnic_reset(struct work_struct *work)
 
 		if (!saved_state) {
 			reset_state = adapter->state;
-			adapter->state = VNIC_RESETTING;
 			saved_state = true;
 		}
 		spin_unlock_irqrestore(&adapter->state_lock, flags);
@@ -5362,7 +5361,7 @@ static int ibmvnic_remove(struct vio_dev *dev)
 	unsigned long flags;
 
 	spin_lock_irqsave(&adapter->state_lock, flags);
-	if (adapter->state == VNIC_RESETTING) {
+	if (test_bit(0, &adapter->resetting)) {
 		spin_unlock_irqrestore(&adapter->state_lock, flags);
 		return -EBUSY;
 	}
diff --git a/drivers/net/ethernet/ibm/ibmvnic.h b/drivers/net/ethernet/ibm/ibmvnic.h
index d15866cbc2a6..950f439bed32 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.h
+++ b/drivers/net/ethernet/ibm/ibmvnic.h
@@ -943,8 +943,7 @@ enum vnic_state {VNIC_PROBING = 1,
 		 VNIC_CLOSING,
 		 VNIC_CLOSED,
 		 VNIC_REMOVING,
-		 VNIC_REMOVED,
-		 VNIC_RESETTING};
+		 VNIC_REMOVED};
 
 enum ibmvnic_reset_reason {VNIC_RESET_FAILOVER = 1,
 			   VNIC_RESET_MOBILITY,
-- 
2.23.0
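
The difference between a state value and a dedicated busy bit can be
sketched in userspace C with C11 atomics. This is a rough model, not the
driver code: struct model_vnic and the model_*() helpers are hypothetical,
and it assumes the reset worker sets bit 0 of the flag on entry (the
patches in this thread only show clear_bit_unlock() and test_bit() on
adapter->resetting). Locking via state_lock is not modelled.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the adapter; only the busy flag is kept. */
struct model_vnic {
	atomic_ulong resetting;
};

/* Claim bit 0. Returns true only for the caller that actually set it. */
static bool model_reset_begin(struct model_vnic *v)
{
	return (atomic_fetch_or(&v->resetting, 1UL) & 1UL) == 0;
}

/* Release bit 0 once all queued resets have been processed. */
static void model_reset_end(struct model_vnic *v)
{
	atomic_fetch_and(&v->resetting, ~1UL);
}

/* Mirror of the check in ibmvnic_remove(): bail out while a reset runs. */
static int model_remove(struct model_vnic *v)
{
	if (atomic_load(&v->resetting) & 1UL)
		return -EBUSY;
	return 0;
}
```

Because the bit is only ever set by the reset path and cleared when that
path finishes, it stays set for the whole reset, whereas a state variable
can be overwritten by intermediate transitions while the reset is still
in progress.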



* [PATCH net 15/15] ibmvnic: add some debugs
  2020-11-20 22:40 [PATCH net 00/15] ibmvnic: assorted bug fixes Lijun Pan
                   ` (13 preceding siblings ...)
  2020-11-20 22:40 ` [PATCH net 14/15] ibmvnic: enhance resetting status check during module exit Lijun Pan
@ 2020-11-20 22:40 ` Lijun Pan
  2020-11-21 23:45   ` Jakub Kicinski
  2020-11-23 19:38 ` [PATCH net 00/15] ibmvnic: assorted bug fixes ljp
  15 siblings, 1 reply; 28+ messages in thread
From: Lijun Pan @ 2020-11-20 22:40 UTC (permalink / raw)
  To: netdev; +Cc: sukadev, drt

From: Sukadev Bhattiprolu <sukadev@linux.ibm.com>

We sometimes run into situations where a soft/hard reset of the adapter
takes a long time or fails to complete. Having additional messages that
include important adapter state info will hopefully help understand what
is happening, reduce the guess work and minimize requests to reproduce
problems with debug patches.

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
---
 drivers/net/ethernet/ibm/ibmvnic.c | 27 ++++++++++++++++++++++++---
 1 file changed, 24 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index b1519e92ccce..1abeb3edee33 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -406,6 +406,8 @@ static void replenish_pools(struct ibmvnic_adapter *adapter)
 		if (adapter->rx_pool[i].active)
 			replenish_rx_pool(adapter, &adapter->rx_pool[i]);
 	}
+
+	netdev_dbg(adapter->netdev, "Replenished %d pools\n", i);
 }
 
 static void release_stats_buffers(struct ibmvnic_adapter *adapter)
@@ -911,6 +913,7 @@ static int ibmvnic_login(struct net_device *netdev)
 
 	__ibmvnic_set_mac(netdev, adapter->mac_addr);
 
+	netdev_dbg(netdev, "[S:%d] Login succeeded\n", adapter->state);
 	return 0;
 }
 
@@ -1377,6 +1380,10 @@ static int ibmvnic_close(struct net_device *netdev)
 	struct ibmvnic_adapter *adapter = netdev_priv(netdev);
 	int rc;
 
+	netdev_dbg(netdev, "[S:%d FOP:%d FRR:%d] Closing\n",
+		   adapter->state, adapter->failover_pending,
+		   adapter->force_reset_recovery);
+
 	/* If device failover is pending, just set device state and return.
 	 * Device operation will be handled by reset routine.
 	 */
@@ -1969,8 +1976,10 @@ static int do_reset(struct ibmvnic_adapter *adapter,
 	struct net_device *netdev = adapter->netdev;
 	int i, rc;
 
-	netdev_dbg(adapter->netdev, "Re-setting driver (%d)\n",
-		   rwi->reset_reason);
+	netdev_dbg(adapter->netdev,
+		   "[S:%d FOP:%d] Reset reason %d, reset_state %d\n",
+		   adapter->state, adapter->failover_pending,
+		   rwi->reset_reason, reset_state);
 
 	rtnl_lock();
 	/*
@@ -2129,6 +2138,8 @@ static int do_reset(struct ibmvnic_adapter *adapter,
 		adapter->state = reset_state;
 	rtnl_unlock();
 
+	netdev_dbg(adapter->netdev, "[S:%d FOP:%d] Reset done, rc %d\n",
+		   adapter->state, adapter->failover_pending, rc);
 	return rc;
 }
 
@@ -2198,6 +2209,8 @@ static int do_hard_reset(struct ibmvnic_adapter *adapter,
 	/* restore adapter state if reset failed */
 	if (rc)
 		adapter->state = reset_state;
+	netdev_dbg(adapter->netdev, "[S:%d FOP:%d] Hard reset done, rc %d\n",
+		   adapter->state, adapter->failover_pending, rc);
 	return rc;
 }
 
@@ -2307,7 +2320,14 @@ static void __ibmvnic_reset(struct work_struct *work)
 		complete(&adapter->reset_done);
 	}
 
+	netdev_dbg(adapter->netdev, "FRR=%d\n", adapter->force_reset_recovery);
+
 	clear_bit_unlock(0, &adapter->resetting);
+
+	netdev_err(adapter->netdev,
+		   "[S:%d FRR:%d WFR:%d] Done processing resets\n",
+		   adapter->state, adapter->force_reset_recovery,
+		   adapter->wait_for_reset);
 }
 
 static void __ibmvnic_delayed_reset(struct work_struct *work)
@@ -2353,7 +2373,8 @@ static int ibmvnic_reset(struct ibmvnic_adapter *adapter,
 	list_for_each(entry, &adapter->rwi_list) {
 		tmp = list_entry(entry, struct ibmvnic_rwi, list);
 		if (tmp->reset_reason == reason) {
-			netdev_dbg(netdev, "Skipping matching reset\n");
+			netdev_dbg(netdev, "Skipping matching reset, reason=%d\n",
+				   reason);
 			spin_unlock_irqrestore(&adapter->rwi_lock, flags);
 			ret = EBUSY;
 			goto err;
-- 
2.23.0



* Re: [PATCH net 10/15] ibmvnic: no reset timeout for 5 seconds after reset
  2020-11-20 22:40 ` [PATCH net 10/15] ibmvnic: no reset timeout for 5 seconds after reset Lijun Pan
@ 2020-11-20 23:01   ` drt
  0 siblings, 0 replies; 28+ messages in thread
From: drt @ 2020-11-20 23:01 UTC (permalink / raw)
  To: Lijun Pan; +Cc: netdev, sukadev, drt

On 2020-11-20 14:40, Lijun Pan wrote:
> From: Dany Madden <drt@linux.ibm.com>
> 
> Reset timeout is going off right after adapter reset. This patch ensures
> that timeout is scheduled if it has been 5 seconds since the last reset.
> 5 seconds is the default watchdog timeout.
> 

Suggested-by: Brian King <brking@linux.ibm.com>

> Signed-off-by: Dany Madden <drt@linux.ibm.com>

Sorry I missed this. Thanks, Brian!

Dany

> ---
>  drivers/net/ethernet/ibm/ibmvnic.c | 11 +++++++++--
>  drivers/net/ethernet/ibm/ibmvnic.h |  2 ++
>  2 files changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
> index 9d2eebd31ff6..252af4ab6468 100644
> --- a/drivers/net/ethernet/ibm/ibmvnic.c
> +++ b/drivers/net/ethernet/ibm/ibmvnic.c
> @@ -2291,6 +2291,7 @@ static void __ibmvnic_reset(struct work_struct *work)
>  			rc = do_reset(adapter, rwi, reset_state);
>  		}
>  		kfree(rwi);
> +		adapter->last_reset_time = jiffies;
> 
>  		if (rc)
>  			netdev_dbg(adapter->netdev, "Reset failed, rc=%d\n", rc);
> @@ -2394,7 +2395,13 @@ static void ibmvnic_tx_timeout(struct net_device *dev, unsigned int txqueue)
>  			   "Adapter is resetting, skip timeout reset\n");
>  		return;
>  	}
> -
> +	/* No queuing up reset until at least 5 seconds (default watchdog val)
> +	 * after last reset
> +	 */
> +	if (time_before(jiffies, (adapter->last_reset_time + dev->watchdog_timeo))) {
> +		netdev_dbg(dev, "Not yet time to tx timeout.\n");
> +		return;
> +	}
>  	ibmvnic_reset(adapter, VNIC_RESET_TIMEOUT);
>  }
> 
> @@ -5316,7 +5323,7 @@ static int ibmvnic_probe(struct vio_dev *dev, const struct vio_device_id *id)
>  	adapter->state = VNIC_PROBED;
> 
>  	adapter->wait_for_reset = false;
> -
> +	adapter->last_reset_time = jiffies;
>  	return 0;
> 
>  ibmvnic_register_fail:
> diff --git a/drivers/net/ethernet/ibm/ibmvnic.h b/drivers/net/ethernet/ibm/ibmvnic.h
> index 9b1f34602f33..d15866cbc2a6 100644
> --- a/drivers/net/ethernet/ibm/ibmvnic.h
> +++ b/drivers/net/ethernet/ibm/ibmvnic.h
> @@ -1089,6 +1089,8 @@ struct ibmvnic_adapter {
>  	unsigned long resetting;
>  	bool napi_enabled, from_passive_init;
>  	bool login_pending;
> +	/* last device reset time */
> +	unsigned long last_reset_time;
> 
>  	bool failover_pending;
>  	bool force_reset_recovery;
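
The time_before() check quoted above relies on wraparound-safe
arithmetic on an unsigned tick counter. A minimal userspace sketch of
that comparison follows; model_time_before() imitates the kernel macro
with a signed difference, and tx_timeout_too_soon() is a hypothetical
helper standing in for the new test in ibmvnic_tx_timeout(). It assumes
the usual two's-complement conversion from unsigned long to long.

```c
#include <limits.h>
#include <stdbool.h>

/* Userspace model of time_before(a, b): true when tick a is earlier
 * than tick b, even if the counter wrapped between the two samples.
 */
static bool model_time_before(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}

/* Hypothetical helper: should a TX timeout be ignored because the last
 * reset finished less than one watchdog period ago?
 */
static bool tx_timeout_too_soon(unsigned long now, unsigned long last_reset,
				unsigned long watchdog_timeo)
{
	return model_time_before(now, last_reset + watchdog_timeo);
}
```

A plain "now < last_reset + watchdog_timeo" would give the wrong answer
once the sum overflows; the signed difference keeps the comparison
correct as long as the two ticks are less than half the counter range
apart.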


* Re: [PATCH net 01/15] ibmvnic: handle inconsistent login with reset
  2020-11-20 22:40 ` [PATCH net 01/15] ibmvnic: handle inconsistent login with reset Lijun Pan
@ 2020-11-21 23:36   ` Jakub Kicinski
  0 siblings, 0 replies; 28+ messages in thread
From: Jakub Kicinski @ 2020-11-21 23:36 UTC (permalink / raw)
  To: Lijun Pan; +Cc: netdev, sukadev, drt

On Fri, 20 Nov 2020 16:40:35 -0600 Lijun Pan wrote:
> From: Dany Madden <drt@linux.ibm.com>
> 
> Inconsistent login with the vnicserver is causing the device to be
> removed. This does not give the device a chance to recover from error
> state. This patch schedules a FATAL reset instead to bring the adapter
> up.
> 
> Signed-off-by: Dany Madden <drt@linux.ibm.com>
> Signed-off-by: Lijun Pan <ljp@linux.ibm.com>

Please provide fixes tags for all the patches.


* Re: [PATCH net 02/15] ibmvnic: process HMC disable command
  2020-11-20 22:40 ` [PATCH net 02/15] ibmvnic: process HMC disable command Lijun Pan
@ 2020-11-21 23:36   ` Jakub Kicinski
  2020-11-21 23:38     ` Jakub Kicinski
  2020-11-22 15:12     ` drt
  0 siblings, 2 replies; 28+ messages in thread
From: Jakub Kicinski @ 2020-11-21 23:36 UTC (permalink / raw)
  To: Lijun Pan; +Cc: netdev, sukadev, drt

On Fri, 20 Nov 2020 16:40:36 -0600 Lijun Pan wrote:
> From: Dany Madden <drt@linux.ibm.com>
> 
> Currently ibmvnic does not support the disable vnic command from the
> Hardware Management Console. This patch enables ibmvnic to process
> CRQ message 0x07, disable vnic adapter.

What user-visible problem does this one solve?


* Re: [PATCH net 02/15] ibmvnic: process HMC disable command
  2020-11-21 23:36   ` Jakub Kicinski
@ 2020-11-21 23:38     ` Jakub Kicinski
  2020-11-22 15:12     ` drt
  1 sibling, 0 replies; 28+ messages in thread
From: Jakub Kicinski @ 2020-11-21 23:38 UTC (permalink / raw)
  To: Lijun Pan; +Cc: netdev, sukadev, drt

On Sat, 21 Nov 2020 15:36:37 -0800 Jakub Kicinski wrote:
> On Fri, 20 Nov 2020 16:40:36 -0600 Lijun Pan wrote:
> > From: Dany Madden <drt@linux.ibm.com>
> > 
> > Currently ibmvnic does not support the disable vnic command from the
> > Hardware Management Console. This patch enables ibmvnic to process
> > CRQ message 0x07, disable vnic adapter.  
> 
> What user-visible problem does this one solve?

Re-reading the commit message - is Hardware Management Console operated
by a human? So this is basically adding a missing feature, not fixing a
bug? Unless not being able to disable vnic is causing other things to
break.


* Re: [PATCH net 04/15] ibmvnic: remove free_all_rwi function
  2020-11-20 22:40 ` [PATCH net 04/15] ibmvnic: remove free_all_rwi function Lijun Pan
@ 2020-11-21 23:39   ` Jakub Kicinski
  0 siblings, 0 replies; 28+ messages in thread
From: Jakub Kicinski @ 2020-11-21 23:39 UTC (permalink / raw)
  To: Lijun Pan; +Cc: netdev, sukadev, drt

On Fri, 20 Nov 2020 16:40:38 -0600 Lijun Pan wrote:
> From: Dany Madden <drt@linux.ibm.com>
> 
> Remove free_all_rwi() since it is no longer used. (__ibmvnic_remove() was
> the last user of free_all_rwi()).

Squash this with the appropriate change, please.


* Re: [PATCH net 12/15] ibmvnic: fix NULL pointer dereference in reset_sub_crq_queues
  2020-11-20 22:40 ` [PATCH net 12/15] ibmvnic: fix NULL pointer dereference in reset_sub_crq_queues Lijun Pan
@ 2020-11-21 23:44   ` Jakub Kicinski
  0 siblings, 0 replies; 28+ messages in thread
From: Jakub Kicinski @ 2020-11-21 23:44 UTC (permalink / raw)
  To: Lijun Pan; +Cc: netdev, sukadev, drt

On Fri, 20 Nov 2020 16:40:46 -0600 Lijun Pan wrote:
> adapter->tx_scrq and adapter->rx_scrq could be NULL if the previous reset
> did not complete after freeing sub crqs. Check for NULL before
> dereferencing them.

> diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
> index 47446e5f8ec5..a0dbd963a1ab 100644
> --- a/drivers/net/ethernet/ibm/ibmvnic.c
> +++ b/drivers/net/ethernet/ibm/ibmvnic.c
> @@ -2930,6 +2930,13 @@ static int reset_sub_crq_queues(struct ibmvnic_adapter *adapter)
>  {
>  	int i, rc;
>  
> +	if (!adapter->tx_scrq || !adapter->rx_scrq) {
> +		netdev_err(adapter->netdev,
> +			   "tx_scrq (%p) or rx_scrq (%p) does not exist\n",
> +			   adapter->tx_scrq, adapter->rx_scrq);

This is expected to happen for the condition you describe in the commit
message. Either prevent it from happening or silently ignore.

What's the impact to the user when this happens? Why would they want to
know that some pointer is NULL? Presumably there is already a message
printed when reset does not complete or such?

> +		return -EINVAL;
> +	}
> +
>  	for (i = 0; i < adapter->req_tx_queues; i++) {
>  		netdev_dbg(adapter->netdev, "Re-setting tx_scrq[%d]\n", i);
>  		rc = reset_one_sub_crq_queue(adapter, adapter->tx_scrq[i]);



* Re: [PATCH net 15/15] ibmvnic: add some debugs
  2020-11-20 22:40 ` [PATCH net 15/15] ibmvnic: add some debugs Lijun Pan
@ 2020-11-21 23:45   ` Jakub Kicinski
  2020-11-23 19:48     ` Sukadev Bhattiprolu
  0 siblings, 1 reply; 28+ messages in thread
From: Jakub Kicinski @ 2020-11-21 23:45 UTC (permalink / raw)
  To: Lijun Pan; +Cc: netdev, sukadev, drt

On Fri, 20 Nov 2020 16:40:49 -0600 Lijun Pan wrote:
> From: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
> 
> We sometimes run into situations where a soft/hard reset of the adapter
> takes a long time or fails to complete. Having additional messages that
> include important adapter state info will hopefully help understand what
> is happening, reduce the guess work and minimize requests to reproduce
> problems with debug patches.

This doesn't qualify as a bug fix, please send it to net-next.

> +	netdev_err(adapter->netdev,
> +		   "[S:%d FRR:%d WFR:%d] Done processing resets\n",
> +		   adapter->state, adapter->force_reset_recovery,
> +		   adapter->wait_for_reset);

Does reset only happen as a result of an error? Should this be a
netdev_info() instead?


* Re: [PATCH net 02/15] ibmvnic: process HMC disable command
  2020-11-21 23:36   ` Jakub Kicinski
  2020-11-21 23:38     ` Jakub Kicinski
@ 2020-11-22 15:12     ` drt
  2020-11-23 19:43       ` Jakub Kicinski
  1 sibling, 1 reply; 28+ messages in thread
From: drt @ 2020-11-22 15:12 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: Lijun Pan, netdev, sukadev, drt

On 2020-11-21 15:36, Jakub Kicinski wrote:
> On Fri, 20 Nov 2020 16:40:36 -0600 Lijun Pan wrote:
>> From: Dany Madden <drt@linux.ibm.com>
>> 
>> Currently ibmvnic does not support the disable vnic command from the
>> Hardware Management Console. This patch enables ibmvnic to process
>> CRQ message 0x07, disable vnic adapter.
> 
> What user-visible problem does this one solve?
This allows HMC to disconnect a Linux client from the network if the 
vNIC adapter is misbehaving and/or sending malicious traffic. The effect 
is the same as when a sysadmin sets a link down (ibmvnic_close()) on the 
Linux client. This patch extends this ability to the HMC.

Thanks!
Dany


* Re: [PATCH net 00/15] ibmvnic: assorted bug fixes
  2020-11-20 22:40 [PATCH net 00/15] ibmvnic: assorted bug fixes Lijun Pan
                   ` (14 preceding siblings ...)
  2020-11-20 22:40 ` [PATCH net 15/15] ibmvnic: add some debugs Lijun Pan
@ 2020-11-23 19:38 ` ljp
  15 siblings, 0 replies; 28+ messages in thread
From: ljp @ 2020-11-23 19:38 UTC (permalink / raw)
  To: Lijun Pan; +Cc: netdev, sukadev, drt

On 2020-11-20 16:40, Lijun Pan wrote:
> Assorted fixes and improvements for ibmvnic bugs.
> 
> Dany Madden (9):
>   ibmvnic: handle inconsistent login with reset
>   ibmvnic: process HMC disable command
>   ibmvnic: stop free_all_rwi on failed reset
>   ibmvnic: remove free_all_rwi function
>   ibmvnic: avoid memset null scrq msgs
>   ibmvnic: restore adapter state on failed reset
>   ibmvnic: send_login should check for crq errors
>   ibmvnic: no reset timeout for 5 seconds after reset
>   ibmvnic: reduce wait for completion time
> 
> Lijun Pan (3):
>   ibmvnic: fix NULL pointer dereference in reset_sub_crq_queues
>   ibmvnic: fix NULL pointer dereference in ibmvic_reset_crq
>   ibmvnic: enhance resetting status check during module exit
> 
> Sukadev Bhattiprolu (3):
>   ibmvnic: delay next reset if hard reset failed
>   ibmvnic: track pending login
>   ibmvnic: add some debugs
> 
>  drivers/net/ethernet/ibm/ibmvnic.c | 246 +++++++++++++++++++++--------
>  drivers/net/ethernet/ibm/ibmvnic.h |   9 +-
>  2 files changed, 183 insertions(+), 72 deletions(-)

In v2, we will split the series into 3 sets according to patch dependencies so
that the individual authors can rework them during the coming holiday season:
1-11 as a set, since they are dependent and most of them are Dany's patches;
12-14 as a set, since they are independent of 1-11;
15 to be sent to net-next.

Thanks,
Lijun


* Re: [PATCH net 02/15] ibmvnic: process HMC disable command
  2020-11-22 15:12     ` drt
@ 2020-11-23 19:43       ` Jakub Kicinski
  2020-11-23 21:46         ` drt
  0 siblings, 1 reply; 28+ messages in thread
From: Jakub Kicinski @ 2020-11-23 19:43 UTC (permalink / raw)
  To: drt; +Cc: Lijun Pan, netdev, sukadev, drt

On Sun, 22 Nov 2020 07:12:38 -0800 drt wrote:
> On 2020-11-21 15:36, Jakub Kicinski wrote:
> > On Fri, 20 Nov 2020 16:40:36 -0600 Lijun Pan wrote:  
> >> From: Dany Madden <drt@linux.ibm.com>
> >> 
> >> Currently ibmvnic does not support the disable vnic command from the
> >> Hardware Management Console. This patch enables ibmvnic to process
> >> CRQ message 0x07, disable vnic adapter.  
> > 
> > What user-visible problem does this one solve?  
> This allows HMC to disconnect a Linux client from the network if the 
> vNIC adapter is misbehaving and/or sending malicious traffic. The effect 
> is the same as when a sysadmin sets a link down (ibmvnic_close()) on the 
> Linux client. This patch extends this ability to the HMC.

Okay, sounds to me like net-next material, then.

IIUC we don't need to fix this ASAP and backport to stable.


* Re: [PATCH net 15/15] ibmvnic: add some debugs
  2020-11-21 23:45   ` Jakub Kicinski
@ 2020-11-23 19:48     ` Sukadev Bhattiprolu
  0 siblings, 0 replies; 28+ messages in thread
From: Sukadev Bhattiprolu @ 2020-11-23 19:48 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: Lijun Pan, netdev, drt

Jakub Kicinski [kuba@kernel.org] wrote:
> On Fri, 20 Nov 2020 16:40:49 -0600 Lijun Pan wrote:
> > From: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
> > 
> > We sometimes run into situations where a soft/hard reset of the adapter
> > takes a long time or fails to complete. Having additional messages that
> > include important adapter state info will hopefully help understand what
> > is happening, reduce the guess work and minimize requests to reproduce
> > problems with debug patches.
> 
> This doesn't qualify as a bug fix, please send it to net-next.

Ok.

> 
> > +	netdev_err(adapter->netdev,
> > +		   "[S:%d FRR:%d WFR:%d] Done processing resets\n",
> > +		   adapter->state, adapter->force_reset_recovery,
> > +		   adapter->wait_for_reset);
> 
> Does reset only happen as a result of an error? Should this be a
> netdev_info() instead?

It is an informational message, will change to netdev_info().

Thanks,

Sukadev


* Re: [PATCH net 02/15] ibmvnic: process HMC disable command
  2020-11-23 19:43       ` Jakub Kicinski
@ 2020-11-23 21:46         ` drt
  0 siblings, 0 replies; 28+ messages in thread
From: drt @ 2020-11-23 21:46 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: Lijun Pan, netdev, sukadev, drt

On 2020-11-23 11:43, Jakub Kicinski wrote:
> On Sun, 22 Nov 2020 07:12:38 -0800 drt wrote:
>> On 2020-11-21 15:36, Jakub Kicinski wrote:
>> > On Fri, 20 Nov 2020 16:40:36 -0600 Lijun Pan wrote:
>> >> From: Dany Madden <drt@linux.ibm.com>
>> >>
>> >> Currently ibmvnic does not support the disable vnic command from the
>> >> Hardware Management Console. This patch enables ibmvnic to process
>> >> CRQ message 0x07, disable vnic adapter.
>> >
>> > What user-visible problem does this one solve?
>> This allows HMC to disconnect a Linux client from the network if the
>> vNIC adapter is misbehaving and/or sending malicious traffic. The effect
>> is the same as when a sysadmin sets a link down (ibmvnic_close()) on the
>> Linux client. This patch extends this ability to the HMC.
> 
> Okay, sounds to me like net-next material, then.
> 
> IIUC we don't need to fix this ASAP and backport to stable.

Yes, I will submit v2 net-next. Thank you.


Thread overview: 28+ messages
2020-11-20 22:40 [PATCH net 00/15] ibmvnic: assorted bug fixes Lijun Pan
2020-11-20 22:40 ` [PATCH net 01/15] ibmvnic: handle inconsistent login with reset Lijun Pan
2020-11-21 23:36   ` Jakub Kicinski
2020-11-20 22:40 ` [PATCH net 02/15] ibmvnic: process HMC disable command Lijun Pan
2020-11-21 23:36   ` Jakub Kicinski
2020-11-21 23:38     ` Jakub Kicinski
2020-11-22 15:12     ` drt
2020-11-23 19:43       ` Jakub Kicinski
2020-11-23 21:46         ` drt
2020-11-20 22:40 ` [PATCH net 03/15] ibmvnic: stop free_all_rwi on failed reset Lijun Pan
2020-11-20 22:40 ` [PATCH net 04/15] ibmvnic: remove free_all_rwi function Lijun Pan
2020-11-21 23:39   ` Jakub Kicinski
2020-11-20 22:40 ` [PATCH net 05/15] ibmvnic: avoid memset null scrq msgs Lijun Pan
2020-11-20 22:40 ` [PATCH net 06/15] ibmvnic: restore adapter state on failed reset Lijun Pan
2020-11-20 22:40 ` [PATCH net 07/15] ibmvnic: delay next reset if hard reset failed Lijun Pan
2020-11-20 22:40 ` [PATCH net 08/15] ibmvnic: track pending login Lijun Pan
2020-11-20 22:40 ` [PATCH net 09/15] ibmvnic: send_login should check for crq errors Lijun Pan
2020-11-20 22:40 ` [PATCH net 10/15] ibmvnic: no reset timeout for 5 seconds after reset Lijun Pan
2020-11-20 23:01   ` drt
2020-11-20 22:40 ` [PATCH net 11/15] ibmvnic: reduce wait for completion time Lijun Pan
2020-11-20 22:40 ` [PATCH net 12/15] ibmvnic: fix NULL pointer dereference in reset_sub_crq_queues Lijun Pan
2020-11-21 23:44   ` Jakub Kicinski
2020-11-20 22:40 ` [PATCH net 13/15] ibmvnic: fix NULL pointer dereference in ibmvic_reset_crq Lijun Pan
2020-11-20 22:40 ` [PATCH net 14/15] ibmvnic: enhance resetting status check during module exit Lijun Pan
2020-11-20 22:40 ` [PATCH net 15/15] ibmvnic: add some debugs Lijun Pan
2020-11-21 23:45   ` Jakub Kicinski
2020-11-23 19:48     ` Sukadev Bhattiprolu
2020-11-23 19:38 ` [PATCH net 00/15] ibmvnic: assorted bug fixes ljp
