netdev.vger.kernel.org archive mirror
* [PATCH net 1/4] ibmvnic: Allow extra failures before disabling
@ 2022-01-22  2:59 Sukadev Bhattiprolu
  2022-01-22  2:59 ` [PATCH net 2/4] ibmvnic: init ->running_cap_crqs early Sukadev Bhattiprolu
                   ` (4 more replies)
  0 siblings, 5 replies; 9+ messages in thread
From: Sukadev Bhattiprolu @ 2022-01-22  2:59 UTC (permalink / raw)
  To: netdev; +Cc: Brian King, Dany Madden, Rick Lindsley

If auto-priority-failover (APF) is enabled and there are at least two
backing devices of different priorities, some resets, such as fail-over
or change-param, can cause at least two back-to-back failovers (failover
from the high-priority backing device to the lower-priority one, and
then back to the higher-priority one if that is still functional).

Depending on the timing of the two failovers, it is possible to trigger
a "hard" reset and for the hard reset to fail due to the failovers. When
this occurs, the driver assumes that the network is unstable and disables
the VNIC for a 60-second "settling time". This in turn can cause an
ethtool command to fail with "No such device" while the vnic
automatically recovers a little while later.

Given that it's possible to have two back-to-back failures, allow for
extra failures before disabling the vnic for the settling time.
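The retry policy described above can be sketched in isolation. This is an
illustrative stand-alone version, not driver code: the function name and the
harness around it are hypothetical, but the counting logic (increment on
failure, reset on success, back off at a threshold that tolerates the two
expected failover-induced failures) mirrors the patch.

```c
#include <stdbool.h>

#define MAX_RESET_FAILS 3	/* two failover-induced failures plus one more */

static int num_fails;		/* consecutive failed resets */

/* Record the result of one reset attempt; return true when the caller
 * should pause for the 60-second settling time.
 */
bool record_reset_result(int rc)
{
	if (rc)
		num_fails++;
	else
		num_fails = 0;	/* any success clears the streak */

	return num_fails >= MAX_RESET_FAILS;
}
```

Two failures in a row (the APF double-failover case) do not trigger the
backoff; only a third consecutive failure does, and any success resets the
count.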

Fixes: f15fde9d47b8 ("ibmvnic: delay next reset if hard reset fails")
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
---
 drivers/net/ethernet/ibm/ibmvnic.c | 21 +++++++++++++++++----
 1 file changed, 17 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 0bb3911dd014..9b2d16ad76f1 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -2598,6 +2598,7 @@ static void __ibmvnic_reset(struct work_struct *work)
 	struct ibmvnic_rwi *rwi;
 	unsigned long flags;
 	u32 reset_state;
+	int num_fails = 0;
 	int rc = 0;
 
 	adapter = container_of(work, struct ibmvnic_adapter, ibmvnic_reset);
@@ -2651,11 +2652,23 @@ static void __ibmvnic_reset(struct work_struct *work)
 				rc = do_hard_reset(adapter, rwi, reset_state);
 				rtnl_unlock();
 			}
-			if (rc) {
-				/* give backing device time to settle down */
+			if (rc)
+				num_fails++;
+			else
+				num_fails = 0;
+
+			/* If auto-priority-failover is enabled we can get
+			 * back to back failovers during resets, resulting
+			 * in at least two failed resets (from high-priority
+			 * backing device to low-priority one and then back).
+			 * If resets continue to fail beyond that, give the
+			 * adapter some time to settle down before retrying.
+			 */
+			if (num_fails >= 3) {
 				netdev_dbg(adapter->netdev,
-					   "[S:%s] Hard reset failed, waiting 60 secs\n",
-					   adapter_state_to_string(adapter->state));
+					   "[S:%s] Hard reset failed %d times, waiting 60 secs\n",
+					   adapter_state_to_string(adapter->state),
+					   num_fails);
 				set_current_state(TASK_UNINTERRUPTIBLE);
 				schedule_timeout(60 * HZ);
 			}
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH net 2/4] ibmvnic: init ->running_cap_crqs early
  2022-01-22  2:59 [PATCH net 1/4] ibmvnic: Allow extra failures before disabling Sukadev Bhattiprolu
@ 2022-01-22  2:59 ` Sukadev Bhattiprolu
  2022-01-23  0:30   ` Dany Madden
  2022-01-22  2:59 ` [PATCH net 3/4] ibmvnic: don't spin in tasklet Sukadev Bhattiprolu
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 9+ messages in thread
From: Sukadev Bhattiprolu @ 2022-01-22  2:59 UTC (permalink / raw)
  To: netdev; +Cc: Brian King, Dany Madden, Rick Lindsley

We use ->running_cap_crqs to determine when ibmvnic_tasklet() should
send out the next protocol message type, i.e. when we get back responses
to all our QUERY_CAPABILITY CRQs we send out REQUEST_CAPABILITY CRQs.
Similarly, when we get responses to all the REQUEST_CAPABILITY CRQs, we
send out the QUERY_IP_OFFLOAD CRQ.

We currently increment ->running_cap_crqs as we send out each CRQ and
have ibmvnic_tasklet() send out the next message type when this
->running_cap_crqs count drops to 0.

This assumes that all the CRQs of the current type were sent out before
the count drops to 0. However, it is possible that we send out, say, 6
CRQs, get preempted, and receive all 6 responses before we send out the
remaining CRQs. This can result in the ->running_cap_crqs count dropping
to zero before all messages of the current type were sent, so we end up
sending the next protocol message too early.

Instead, initialize ->running_cap_crqs upfront so the tasklet will only
send the next protocol message after all responses are received.

Use the cap_reqs local variable to also detect any discrepancy (now or
in the future) in the number of capability requests we actually send.
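The counting pattern the patch adopts can be sketched on its own. This is an
illustrative stand-in, not driver code: the helper names and the fixed set of
sends are hypothetical, but the idea is the patch's: publish the full
expected count before the first send, then decrement a local copy per send so
any drift between expected and actual sends is detectable at the end (the
patch uses WARN_ON for that check).

```c
#include <stdatomic.h>

static atomic_int running_cap_crqs;	/* read by the response handler */

/* Stand-in for ibmvnic_send_crq(); a real send would go out here. */
static void send_cap_request(int *cap_reqs)
{
	(*cap_reqs)--;
}

/* Send a fixed batch of capability requests. Returns the leftover
 * local count; anything nonzero means expected and actual disagreed.
 */
int send_query_caps(int expected)
{
	int cap_reqs = expected;

	/* Set the shared counter up front: a responder that races ahead
	 * of the remaining sends can never see it drop to zero early.
	 */
	atomic_store(&running_cap_crqs, cap_reqs);

	send_cap_request(&cap_reqs);	/* e.g. MIN_TX_QUEUES */
	send_cap_request(&cap_reqs);	/* e.g. MIN_RX_QUEUES */
	send_cap_request(&cap_reqs);	/* e.g. MIN_RX_ADD_QUEUES */

	return cap_reqs;
}
```

With the old increment-per-send scheme, the shared counter could hit zero
between two sends; here it starts at the full value, so responses can only
drain it to zero after every request is accounted for.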

Currently only send_query_cap() is affected by this behavior (of sending
the next message early) since it is called from the worker thread
(during reset) and from the application thread (during ->ndo_open()),
both of which can be preempted. send_request_cap() is called only from
the tasklet, which processes CRQ responses sequentially, so it is not
affected. But to maintain symmetry with send_query_cap() we update
send_request_cap() as well.

Fixes: 249168ad07cd ("ibmvnic: Make CRQ interrupt tasklet wait for all capabilities crqs")
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
---
 drivers/net/ethernet/ibm/ibmvnic.c | 106 +++++++++++++++++++----------
 1 file changed, 71 insertions(+), 35 deletions(-)


diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 9b2d16ad76f1..acd488310bbc 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -3849,11 +3849,25 @@ static void send_request_cap(struct ibmvnic_adapter *adapter, int retry)
 	struct device *dev = &adapter->vdev->dev;
 	union ibmvnic_crq crq;
 	int max_entries;
+	int cap_reqs;
+
+	/* We send out 6 or 7 REQUEST_CAPABILITY CRQs below (depending on
+	 * the PROMISC flag). Initialize this count upfront. When the tasklet
+	 * receives a response to all of these, it will send the next protocol
+	 * message (QUERY_IP_OFFLOAD).
+	 */
+	if (!(adapter->netdev->flags & IFF_PROMISC) ||
+	    adapter->promisc_supported)
+		cap_reqs = 7;
+	else
+		cap_reqs = 6;
 
 	if (!retry) {
 		/* Sub-CRQ entries are 32 byte long */
 		int entries_page = 4 * PAGE_SIZE / (sizeof(u64) * 4);
 
+		atomic_set(&adapter->running_cap_crqs, cap_reqs);
+
 		if (adapter->min_tx_entries_per_subcrq > entries_page ||
 		    adapter->min_rx_add_entries_per_subcrq > entries_page) {
 			dev_err(dev, "Fatal, invalid entries per sub-crq\n");
@@ -3914,44 +3928,45 @@ static void send_request_cap(struct ibmvnic_adapter *adapter, int retry)
 					adapter->opt_rx_comp_queues;
 
 		adapter->req_rx_add_queues = adapter->max_rx_add_queues;
+	} else {
+		atomic_add(cap_reqs, &adapter->running_cap_crqs);
 	}
-
 	memset(&crq, 0, sizeof(crq));
 	crq.request_capability.first = IBMVNIC_CRQ_CMD;
 	crq.request_capability.cmd = REQUEST_CAPABILITY;
 
 	crq.request_capability.capability = cpu_to_be16(REQ_TX_QUEUES);
 	crq.request_capability.number = cpu_to_be64(adapter->req_tx_queues);
-	atomic_inc(&adapter->running_cap_crqs);
+	cap_reqs--;
 	ibmvnic_send_crq(adapter, &crq);
 
 	crq.request_capability.capability = cpu_to_be16(REQ_RX_QUEUES);
 	crq.request_capability.number = cpu_to_be64(adapter->req_rx_queues);
-	atomic_inc(&adapter->running_cap_crqs);
+	cap_reqs--;
 	ibmvnic_send_crq(adapter, &crq);
 
 	crq.request_capability.capability = cpu_to_be16(REQ_RX_ADD_QUEUES);
 	crq.request_capability.number = cpu_to_be64(adapter->req_rx_add_queues);
-	atomic_inc(&adapter->running_cap_crqs);
+	cap_reqs--;
 	ibmvnic_send_crq(adapter, &crq);
 
 	crq.request_capability.capability =
 	    cpu_to_be16(REQ_TX_ENTRIES_PER_SUBCRQ);
 	crq.request_capability.number =
 	    cpu_to_be64(adapter->req_tx_entries_per_subcrq);
-	atomic_inc(&adapter->running_cap_crqs);
+	cap_reqs--;
 	ibmvnic_send_crq(adapter, &crq);
 
 	crq.request_capability.capability =
 	    cpu_to_be16(REQ_RX_ADD_ENTRIES_PER_SUBCRQ);
 	crq.request_capability.number =
 	    cpu_to_be64(adapter->req_rx_add_entries_per_subcrq);
-	atomic_inc(&adapter->running_cap_crqs);
+	cap_reqs--;
 	ibmvnic_send_crq(adapter, &crq);
 
 	crq.request_capability.capability = cpu_to_be16(REQ_MTU);
 	crq.request_capability.number = cpu_to_be64(adapter->req_mtu);
-	atomic_inc(&adapter->running_cap_crqs);
+	cap_reqs--;
 	ibmvnic_send_crq(adapter, &crq);
 
 	if (adapter->netdev->flags & IFF_PROMISC) {
@@ -3959,16 +3974,21 @@ static void send_request_cap(struct ibmvnic_adapter *adapter, int retry)
 			crq.request_capability.capability =
 			    cpu_to_be16(PROMISC_REQUESTED);
 			crq.request_capability.number = cpu_to_be64(1);
-			atomic_inc(&adapter->running_cap_crqs);
+			cap_reqs--;
 			ibmvnic_send_crq(adapter, &crq);
 		}
 	} else {
 		crq.request_capability.capability =
 		    cpu_to_be16(PROMISC_REQUESTED);
 		crq.request_capability.number = cpu_to_be64(0);
-		atomic_inc(&adapter->running_cap_crqs);
+		cap_reqs--;
 		ibmvnic_send_crq(adapter, &crq);
 	}
+
+	/* Keep at end to catch any discrepancy between expected and actual
+	 * CRQs sent.
+	 */
+	WARN_ON(cap_reqs != 0);
 }
 
 static int pending_scrq(struct ibmvnic_adapter *adapter,
@@ -4362,118 +4382,132 @@ static void send_query_map(struct ibmvnic_adapter *adapter)
 static void send_query_cap(struct ibmvnic_adapter *adapter)
 {
 	union ibmvnic_crq crq;
+	int cap_reqs;
+
+	/* We send out 25 QUERY_CAPABILITY CRQs below.  Initialize this count
+	 * upfront. When the tasklet receives a response to all of these, it
+	 * can send out the next protocol message (REQUEST_CAPABILITY).
+	 */
+	cap_reqs = 25;
+
+	atomic_set(&adapter->running_cap_crqs, cap_reqs);
 
-	atomic_set(&adapter->running_cap_crqs, 0);
 	memset(&crq, 0, sizeof(crq));
 	crq.query_capability.first = IBMVNIC_CRQ_CMD;
 	crq.query_capability.cmd = QUERY_CAPABILITY;
 
 	crq.query_capability.capability = cpu_to_be16(MIN_TX_QUEUES);
-	atomic_inc(&adapter->running_cap_crqs);
 	ibmvnic_send_crq(adapter, &crq);
+	cap_reqs--;
 
 	crq.query_capability.capability = cpu_to_be16(MIN_RX_QUEUES);
-	atomic_inc(&adapter->running_cap_crqs);
 	ibmvnic_send_crq(adapter, &crq);
+	cap_reqs--;
 
 	crq.query_capability.capability = cpu_to_be16(MIN_RX_ADD_QUEUES);
-	atomic_inc(&adapter->running_cap_crqs);
 	ibmvnic_send_crq(adapter, &crq);
+	cap_reqs--;
 
 	crq.query_capability.capability = cpu_to_be16(MAX_TX_QUEUES);
-	atomic_inc(&adapter->running_cap_crqs);
 	ibmvnic_send_crq(adapter, &crq);
+	cap_reqs--;
 
 	crq.query_capability.capability = cpu_to_be16(MAX_RX_QUEUES);
-	atomic_inc(&adapter->running_cap_crqs);
 	ibmvnic_send_crq(adapter, &crq);
+	cap_reqs--;
 
 	crq.query_capability.capability = cpu_to_be16(MAX_RX_ADD_QUEUES);
-	atomic_inc(&adapter->running_cap_crqs);
 	ibmvnic_send_crq(adapter, &crq);
+	cap_reqs--;
 
 	crq.query_capability.capability =
 	    cpu_to_be16(MIN_TX_ENTRIES_PER_SUBCRQ);
-	atomic_inc(&adapter->running_cap_crqs);
 	ibmvnic_send_crq(adapter, &crq);
+	cap_reqs--;
 
 	crq.query_capability.capability =
 	    cpu_to_be16(MIN_RX_ADD_ENTRIES_PER_SUBCRQ);
-	atomic_inc(&adapter->running_cap_crqs);
 	ibmvnic_send_crq(adapter, &crq);
+	cap_reqs--;
 
 	crq.query_capability.capability =
 	    cpu_to_be16(MAX_TX_ENTRIES_PER_SUBCRQ);
-	atomic_inc(&adapter->running_cap_crqs);
 	ibmvnic_send_crq(adapter, &crq);
+	cap_reqs--;
 
 	crq.query_capability.capability =
 	    cpu_to_be16(MAX_RX_ADD_ENTRIES_PER_SUBCRQ);
-	atomic_inc(&adapter->running_cap_crqs);
 	ibmvnic_send_crq(adapter, &crq);
+	cap_reqs--;
 
 	crq.query_capability.capability = cpu_to_be16(TCP_IP_OFFLOAD);
-	atomic_inc(&adapter->running_cap_crqs);
 	ibmvnic_send_crq(adapter, &crq);
+	cap_reqs--;
 
 	crq.query_capability.capability = cpu_to_be16(PROMISC_SUPPORTED);
-	atomic_inc(&adapter->running_cap_crqs);
 	ibmvnic_send_crq(adapter, &crq);
+	cap_reqs--;
 
 	crq.query_capability.capability = cpu_to_be16(MIN_MTU);
-	atomic_inc(&adapter->running_cap_crqs);
 	ibmvnic_send_crq(adapter, &crq);
+	cap_reqs--;
 
 	crq.query_capability.capability = cpu_to_be16(MAX_MTU);
-	atomic_inc(&adapter->running_cap_crqs);
 	ibmvnic_send_crq(adapter, &crq);
+	cap_reqs--;
 
 	crq.query_capability.capability = cpu_to_be16(MAX_MULTICAST_FILTERS);
-	atomic_inc(&adapter->running_cap_crqs);
 	ibmvnic_send_crq(adapter, &crq);
+	cap_reqs--;
 
 	crq.query_capability.capability = cpu_to_be16(VLAN_HEADER_INSERTION);
-	atomic_inc(&adapter->running_cap_crqs);
 	ibmvnic_send_crq(adapter, &crq);
+	cap_reqs--;
 
 	crq.query_capability.capability = cpu_to_be16(RX_VLAN_HEADER_INSERTION);
-	atomic_inc(&adapter->running_cap_crqs);
 	ibmvnic_send_crq(adapter, &crq);
+	cap_reqs--;
 
 	crq.query_capability.capability = cpu_to_be16(MAX_TX_SG_ENTRIES);
-	atomic_inc(&adapter->running_cap_crqs);
 	ibmvnic_send_crq(adapter, &crq);
+	cap_reqs--;
 
 	crq.query_capability.capability = cpu_to_be16(RX_SG_SUPPORTED);
-	atomic_inc(&adapter->running_cap_crqs);
 	ibmvnic_send_crq(adapter, &crq);
+	cap_reqs--;
 
 	crq.query_capability.capability = cpu_to_be16(OPT_TX_COMP_SUB_QUEUES);
-	atomic_inc(&adapter->running_cap_crqs);
 	ibmvnic_send_crq(adapter, &crq);
+	cap_reqs--;
 
 	crq.query_capability.capability = cpu_to_be16(OPT_RX_COMP_QUEUES);
-	atomic_inc(&adapter->running_cap_crqs);
 	ibmvnic_send_crq(adapter, &crq);
+	cap_reqs--;
 
 	crq.query_capability.capability =
 			cpu_to_be16(OPT_RX_BUFADD_Q_PER_RX_COMP_Q);
-	atomic_inc(&adapter->running_cap_crqs);
 	ibmvnic_send_crq(adapter, &crq);
+	cap_reqs--;
 
 	crq.query_capability.capability =
 			cpu_to_be16(OPT_TX_ENTRIES_PER_SUBCRQ);
-	atomic_inc(&adapter->running_cap_crqs);
 	ibmvnic_send_crq(adapter, &crq);
+	cap_reqs--;
 
 	crq.query_capability.capability =
 			cpu_to_be16(OPT_RXBA_ENTRIES_PER_SUBCRQ);
-	atomic_inc(&adapter->running_cap_crqs);
 	ibmvnic_send_crq(adapter, &crq);
+	cap_reqs--;
 
 	crq.query_capability.capability = cpu_to_be16(TX_RX_DESC_REQ);
-	atomic_inc(&adapter->running_cap_crqs);
+
 	ibmvnic_send_crq(adapter, &crq);
+	cap_reqs--;
+
+	/* Keep at end to catch any discrepancy between expected and actual
+	 * CRQs sent.
+	 */
+	WARN_ON(cap_reqs != 0);
 }
 
 static void send_query_ip_offload(struct ibmvnic_adapter *adapter)
@@ -4777,6 +4811,8 @@ static void handle_request_cap_rsp(union ibmvnic_crq *crq,
 	char *name;
 
 	atomic_dec(&adapter->running_cap_crqs);
+	netdev_dbg(adapter->netdev, "Outstanding request-caps: %d\n",
+		   atomic_read(&adapter->running_cap_crqs));
 	switch (be16_to_cpu(crq->request_capability_rsp.capability)) {
 	case REQ_TX_QUEUES:
 		req_value = &adapter->req_tx_queues;
-- 
2.27.0



* [PATCH net 3/4] ibmvnic: don't spin in tasklet
  2022-01-22  2:59 [PATCH net 1/4] ibmvnic: Allow extra failures before disabling Sukadev Bhattiprolu
  2022-01-22  2:59 ` [PATCH net 2/4] ibmvnic: init ->running_cap_crqs early Sukadev Bhattiprolu
@ 2022-01-22  2:59 ` Sukadev Bhattiprolu
  2022-01-23  0:32   ` Dany Madden
  2022-01-22  2:59 ` [PATCH net 4/4] ibmvnic: remove unused ->wait_capability Sukadev Bhattiprolu
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 9+ messages in thread
From: Sukadev Bhattiprolu @ 2022-01-22  2:59 UTC (permalink / raw)
  To: netdev; +Cc: Brian King, Dany Madden, Rick Lindsley

ibmvnic_tasklet() continuously spins waiting for responses to all
capability requests. It does this to avoid encountering an error during
initialization of the vnic. However, if there is a bug in the VIOS and
we do not receive a response to one or more queries, the tasklet ends up
spinning continuously, leading to hard lockups.

If we fail to receive a message from the VIOS, it is reasonable to time
out the login attempt rather than spin indefinitely in the tasklet.
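The bounded-wait principle behind this change can be sketched outside the
driver. Everything here is illustrative, not from ibmvnic: the function
names, the iteration bound, and the fake "pending responses" counters are
hypothetical. The point is that polling with an upper bound turns a lost
response into a reported timeout instead of a locked-up CPU.

```c
#include <stdbool.h>

#define MAX_POLL_ITERS 1000

static int pending = 3;		/* responses still outstanding */

/* Stand-in for reading adapter->running_cap_crqs: drains by one per
 * poll, as if a response arrived each time we look.
 */
static int responses_pending(void)
{
	return pending ? pending-- : 0;
}

static int never_answers(void)	/* models a buggy VIOS that stays silent */
{
	return 1;
}

/* Return true when all responses arrived, false on timeout. */
bool wait_for_responses(int (*outstanding)(void))
{
	for (int i = 0; i < MAX_POLL_ITERS; i++) {
		if (outstanding() == 0)
			return true;
		/* real code would sleep or reschedule here, not busy-loop */
	}
	return false;		/* timed out: fail the attempt instead */
}
```

The unbounded version of this loop is exactly what the removed tasklet code
did: with a silent VIOS, `outstanding()` never reaches zero and the loop
never exits.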

Fixes: 249168ad07cd ("ibmvnic: Make CRQ interrupt tasklet wait for all capabilities crqs")
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
---
 drivers/net/ethernet/ibm/ibmvnic.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index acd488310bbc..682a440151a8 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -5491,12 +5491,6 @@ static void ibmvnic_tasklet(struct tasklet_struct *t)
 			ibmvnic_handle_crq(crq, adapter);
 			crq->generic.first = 0;
 		}
-
-		/* remain in tasklet until all
-		 * capabilities responses are received
-		 */
-		if (!adapter->wait_capability)
-			done = true;
 	}
 	/* if capabilities CRQ's were sent in this tasklet, the following
 	 * tasklet must wait until all responses are received
-- 
2.27.0



* [PATCH net 4/4] ibmvnic: remove unused ->wait_capability
  2022-01-22  2:59 [PATCH net 1/4] ibmvnic: Allow extra failures before disabling Sukadev Bhattiprolu
  2022-01-22  2:59 ` [PATCH net 2/4] ibmvnic: init ->running_cap_crqs early Sukadev Bhattiprolu
  2022-01-22  2:59 ` [PATCH net 3/4] ibmvnic: don't spin in tasklet Sukadev Bhattiprolu
@ 2022-01-22  2:59 ` Sukadev Bhattiprolu
  2022-01-23  0:33   ` Dany Madden
  2022-01-23  0:22 ` [PATCH net 1/4] ibmvnic: Allow extra failures before disabling Dany Madden
  2022-01-24 12:10 ` patchwork-bot+netdevbpf
  4 siblings, 1 reply; 9+ messages in thread
From: Sukadev Bhattiprolu @ 2022-01-22  2:59 UTC (permalink / raw)
  To: netdev; +Cc: Brian King, Dany Madden, Rick Lindsley

With the previous bug fix, the ->wait_capability flag is no longer
needed and can be removed.

Fixes: 249168ad07cd ("ibmvnic: Make CRQ interrupt tasklet wait for all capabilities crqs")
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
---
 drivers/net/ethernet/ibm/ibmvnic.c | 38 +++++++++++-------------------
 drivers/net/ethernet/ibm/ibmvnic.h |  1 -
 2 files changed, 14 insertions(+), 25 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 682a440151a8..8ed0b95147db 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -4876,10 +4876,8 @@ static void handle_request_cap_rsp(union ibmvnic_crq *crq,
 	}
 
 	/* Done receiving requested capabilities, query IP offload support */
-	if (atomic_read(&adapter->running_cap_crqs) == 0) {
-		adapter->wait_capability = false;
+	if (atomic_read(&adapter->running_cap_crqs) == 0)
 		send_query_ip_offload(adapter);
-	}
 }
 
 static int handle_login_rsp(union ibmvnic_crq *login_rsp_crq,
@@ -5177,10 +5175,8 @@ static void handle_query_cap_rsp(union ibmvnic_crq *crq,
 	}
 
 out:
-	if (atomic_read(&adapter->running_cap_crqs) == 0) {
-		adapter->wait_capability = false;
+	if (atomic_read(&adapter->running_cap_crqs) == 0)
 		send_request_cap(adapter, 0);
-	}
 }
 
 static int send_query_phys_parms(struct ibmvnic_adapter *adapter)
@@ -5476,27 +5472,21 @@ static void ibmvnic_tasklet(struct tasklet_struct *t)
 	struct ibmvnic_crq_queue *queue = &adapter->crq;
 	union ibmvnic_crq *crq;
 	unsigned long flags;
-	bool done = false;
 
 	spin_lock_irqsave(&queue->lock, flags);
-	while (!done) {
-		/* Pull all the valid messages off the CRQ */
-		while ((crq = ibmvnic_next_crq(adapter)) != NULL) {
-			/* This barrier makes sure ibmvnic_next_crq()'s
-			 * crq->generic.first & IBMVNIC_CRQ_CMD_RSP is loaded
-			 * before ibmvnic_handle_crq()'s
-			 * switch(gen_crq->first) and switch(gen_crq->cmd).
-			 */
-			dma_rmb();
-			ibmvnic_handle_crq(crq, adapter);
-			crq->generic.first = 0;
-		}
+
+	/* Pull all the valid messages off the CRQ */
+	while ((crq = ibmvnic_next_crq(adapter)) != NULL) {
+		/* This barrier makes sure ibmvnic_next_crq()'s
+		 * crq->generic.first & IBMVNIC_CRQ_CMD_RSP is loaded
+		 * before ibmvnic_handle_crq()'s
+		 * switch(gen_crq->first) and switch(gen_crq->cmd).
+		 */
+		dma_rmb();
+		ibmvnic_handle_crq(crq, adapter);
+		crq->generic.first = 0;
 	}
-	/* if capabilities CRQ's were sent in this tasklet, the following
-	 * tasklet must wait until all responses are received
-	 */
-	if (atomic_read(&adapter->running_cap_crqs) != 0)
-		adapter->wait_capability = true;
+
 	spin_unlock_irqrestore(&queue->lock, flags);
 }
 
diff --git a/drivers/net/ethernet/ibm/ibmvnic.h b/drivers/net/ethernet/ibm/ibmvnic.h
index b8e42f67d897..a80f94e161ad 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.h
+++ b/drivers/net/ethernet/ibm/ibmvnic.h
@@ -921,7 +921,6 @@ struct ibmvnic_adapter {
 	int login_rsp_buf_sz;
 
 	atomic_t running_cap_crqs;
-	bool wait_capability;
 
 	struct ibmvnic_sub_crq_queue **tx_scrq ____cacheline_aligned;
 	struct ibmvnic_sub_crq_queue **rx_scrq ____cacheline_aligned;
-- 
2.27.0



* Re: [PATCH net 1/4] ibmvnic: Allow extra failures before disabling
  2022-01-22  2:59 [PATCH net 1/4] ibmvnic: Allow extra failures before disabling Sukadev Bhattiprolu
                   ` (2 preceding siblings ...)
  2022-01-22  2:59 ` [PATCH net 4/4] ibmvnic: remove unused ->wait_capability Sukadev Bhattiprolu
@ 2022-01-23  0:22 ` Dany Madden
  2022-01-24 12:10 ` patchwork-bot+netdevbpf
  4 siblings, 0 replies; 9+ messages in thread
From: Dany Madden @ 2022-01-23  0:22 UTC (permalink / raw)
  To: Sukadev Bhattiprolu; +Cc: netdev, Brian King, Rick Lindsley

On 2022-01-21 18:59, Sukadev Bhattiprolu wrote:
> If auto-priority-failover (APF) is enabled and there are at least two
> backing devices of different priorities, some resets like fail-over,
> change-param etc can cause at least two back to back failovers. 
> (Failover
> from high priority backing device to lower priority one and then back
> to the higher priority one if that is still functional).
> 
> Depending on the timing of the two failovers it is possible to trigger
> a "hard" reset and for the hard reset to fail due to failovers. When 
> this
> occurs, the driver assumes that the network is unstable and disables 
> the
> VNIC for a 60-second "settling time". This in turn can cause the 
> ethtool
> command to fail with "No such device" while the vnic automatically 
> recovers
> a little while later.
> 
> Given that it's possible to have two back to back failures, allow for 
> extra
> failures before disabling the vnic for the settling time.
> 
> Fixes: f15fde9d47b8 ("ibmvnic: delay next reset if hard reset fails")
> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Reviewed-by: Dany Madden <drt@linux.ibm.com>



* Re: [PATCH net 2/4] ibmvnic: init ->running_cap_crqs early
  2022-01-22  2:59 ` [PATCH net 2/4] ibmvnic: init ->running_cap_crqs early Sukadev Bhattiprolu
@ 2022-01-23  0:30   ` Dany Madden
  0 siblings, 0 replies; 9+ messages in thread
From: Dany Madden @ 2022-01-23  0:30 UTC (permalink / raw)
  To: Sukadev Bhattiprolu; +Cc: netdev, Brian King, Rick Lindsley

On 2022-01-21 18:59, Sukadev Bhattiprolu wrote:
> We use ->running_cap_crqs to determine when the ibmvnic_tasklet() 
> should
> send out the next protocol message type. i.e when we get back responses
> to all our QUERY_CAPABILITY CRQs we send out REQUEST_CAPABILITY crqs.
> Similarly, when we get responses to all the REQUEST_CAPABILITY crqs, we
> send out the QUERY_IP_OFFLOAD CRQ.
> 
> We currently increment ->running_cap_crqs as we send out each CRQ and
> have the ibmvnic_tasklet() send out the next message type, when this
> running_cap_crqs count drops to 0.
> 
> This assumes that all the CRQs of the current type were sent out before
> the count drops to 0. However it is possible that we send out say 6 
> CRQs,
> get preempted and receive all the 6 responses before we send out the
> remaining CRQs. This can result in ->running_cap_crqs count dropping to
> zero before all messages of the current type were sent and we end up
> sending the next protocol message too early.
> 
> Instead initialize the ->running_cap_crqs upfront so the tasklet will
> only send the next protocol message after all responses are received.
> 
> Use the cap_reqs local variable to also detect any discrepancy (either
> now or in future) in the number of capability requests we actually 
> send.
> 
> Currently only send_query_cap() is affected by this behavior (of 
> sending
> next message early) since it is called from the worker thread (during
> reset) and from application thread (during ->ndo_open()) and they can 
> be
> preempted. send_request_cap() is called only from the tasklet, which
> processes CRQ responses sequentially, so it is not affected. But to
> maintain symmetry with send_query_cap() we update
> send_request_cap() as well.
> 
> Fixes: 249168ad07cd ("ibmvnic: Make CRQ interrupt tasklet wait for all
> capabilities crqs")
> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Reviewed-by: Dany Madden <drt@linux.ibm.com>

> +	cap_reqs--;
> 
>  	crq.query_capability.capability = cpu_to_be16(VLAN_HEADER_INSERTION);
> -	atomic_inc(&adapter->running_cap_crqs);
>  	ibmvnic_send_crq(adapter, &crq);
> +	cap_reqs--;
> 
>  	crq.query_capability.capability = cpu_to_be16(RX_VLAN_HEADER_INSERTION);
> -	atomic_inc(&adapter->running_cap_crqs);
>  	ibmvnic_send_crq(adapter, &crq);
> +	cap_reqs--;
> 
>  	crq.query_capability.capability = cpu_to_be16(MAX_TX_SG_ENTRIES);
> -	atomic_inc(&adapter->running_cap_crqs);
>  	ibmvnic_send_crq(adapter, &crq);
> +	cap_reqs--;
> 
>  	crq.query_capability.capability = cpu_to_be16(RX_SG_SUPPORTED);
> -	atomic_inc(&adapter->running_cap_crqs);
>  	ibmvnic_send_crq(adapter, &crq);
> +	cap_reqs--;
> 
>  	crq.query_capability.capability = cpu_to_be16(OPT_TX_COMP_SUB_QUEUES);
> -	atomic_inc(&adapter->running_cap_crqs);
>  	ibmvnic_send_crq(adapter, &crq);
> +	cap_reqs--;
> 
>  	crq.query_capability.capability = cpu_to_be16(OPT_RX_COMP_QUEUES);
> -	atomic_inc(&adapter->running_cap_crqs);
>  	ibmvnic_send_crq(adapter, &crq);
> +	cap_reqs--;
> 
>  	crq.query_capability.capability =
>  			cpu_to_be16(OPT_RX_BUFADD_Q_PER_RX_COMP_Q);
> -	atomic_inc(&adapter->running_cap_crqs);
>  	ibmvnic_send_crq(adapter, &crq);
> +	cap_reqs--;
> 
>  	crq.query_capability.capability =
>  			cpu_to_be16(OPT_TX_ENTRIES_PER_SUBCRQ);
> -	atomic_inc(&adapter->running_cap_crqs);
>  	ibmvnic_send_crq(adapter, &crq);
> +	cap_reqs--;
> 
>  	crq.query_capability.capability =
>  			cpu_to_be16(OPT_RXBA_ENTRIES_PER_SUBCRQ);
> -	atomic_inc(&adapter->running_cap_crqs);
>  	ibmvnic_send_crq(adapter, &crq);
> +	cap_reqs--;
> 
>  	crq.query_capability.capability = cpu_to_be16(TX_RX_DESC_REQ);
> -	atomic_inc(&adapter->running_cap_crqs);
> +
>  	ibmvnic_send_crq(adapter, &crq);
> +	cap_reqs--;
> +
> +	/* Keep at end to catch any discrepancy between expected and actual
> +	 * CRQs sent.
> +	 */
> +	WARN_ON(cap_reqs != 0);
>  }
> 
>  static void send_query_ip_offload(struct ibmvnic_adapter *adapter)
> @@ -4777,6 +4811,8 @@ static void handle_request_cap_rsp(union ibmvnic_crq *crq,
>  	char *name;
> 
>  	atomic_dec(&adapter->running_cap_crqs);
> +	netdev_dbg(adapter->netdev, "Outstanding request-caps: %d\n",
> +		   atomic_read(&adapter->running_cap_crqs));
>  	switch (be16_to_cpu(crq->request_capability_rsp.capability)) {
>  	case REQ_TX_QUEUES:
>  		req_value = &adapter->req_tx_queues;

^ permalink raw reply	[flat|nested] 9+ messages in thread
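The patch above replaces per-send atomic_inc() calls with a count that is initialized up front and decremented as each CRQ goes out, so a mismatch between expected and actual sends is caught immediately by the WARN_ON. The pattern can be sketched in a reduced user-space form (the names and the count of 3 are illustrative; the driver sends 25 queries):

```c
#include <assert.h>

#define EXPECTED_CAP_REQS 3	/* the driver uses 25; 3 keeps the sketch short */

static int sent;		/* stand-in for ibmvnic_send_crq()'s side effect */

static void send_req(void)
{
	sent++;
}

/* Initialize the expected count up front, decrement once per send, and
 * report any discrepancy at the end (WARN_ON(cap_reqs != 0) in the driver).
 */
static int send_all_caps(void)
{
	int cap_reqs = EXPECTED_CAP_REQS;

	send_req();
	cap_reqs--;
	send_req();
	cap_reqs--;
	send_req();
	cap_reqs--;

	/* Keep at end to catch any discrepancy between expected and
	 * actual requests sent.
	 */
	return cap_reqs;
}
```

Because the counter starts at its final value, a response arriving for the first query can no longer race with later atomic_inc() calls and observe a transiently low count.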

* Re: [PATCH net 3/4] ibmvnic: don't spin in tasklet
  2022-01-22  2:59 ` [PATCH net 3/4] ibmvnic: don't spin in tasklet Sukadev Bhattiprolu
@ 2022-01-23  0:32   ` Dany Madden
  0 siblings, 0 replies; 9+ messages in thread
From: Dany Madden @ 2022-01-23  0:32 UTC (permalink / raw)
  To: Sukadev Bhattiprolu; +Cc: netdev, Brian King, Rick Lindsley

On 2022-01-21 18:59, Sukadev Bhattiprolu wrote:
> ibmvnic_tasklet() continuously spins waiting for responses to all
> capability requests. It does this to avoid encountering an error
> during initialization of the vnic. However, if there is a bug in the
> VIOS and we do not receive a response to one or more queries, the
> tasklet ends up spinning continuously, leading to hard lockups.
> 
> If we fail to receive a message from the VIOS it is reasonable to
> timeout the login attempt rather than spin indefinitely in the tasklet.
> 
> Fixes: 249168ad07cd ("ibmvnic: Make CRQ interrupt tasklet wait for all capabilities crqs")
> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Reviewed-by: Dany Madden <drt@linux.ibm.com>

> ---
>  drivers/net/ethernet/ibm/ibmvnic.c | 6 ------
>  1 file changed, 6 deletions(-)
> 
> diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
> index acd488310bbc..682a440151a8 100644
> --- a/drivers/net/ethernet/ibm/ibmvnic.c
> +++ b/drivers/net/ethernet/ibm/ibmvnic.c
> @@ -5491,12 +5491,6 @@ static void ibmvnic_tasklet(struct tasklet_struct *t)
>  			ibmvnic_handle_crq(crq, adapter);
>  			crq->generic.first = 0;
>  		}
> -
> -		/* remain in tasklet until all
> -		 * capabilities responses are received
> -		 */
> -		if (!adapter->wait_capability)
> -			done = true;
>  	}
>  	/* if capabilities CRQ's were sent in this tasklet, the following
>  	 * tasklet must wait until all responses are received

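The deleted lines are what allowed the tasklet to spin: the outer loop kept iterating until the wait_capability flag cleared, which never happens if a response is lost. What remains is a drain-once shape, consume everything currently queued, then return and rely on the next interrupt to reschedule the handler. A simplified user-space sketch (the queue and handler are illustrative stand-ins, not the driver's CRQ types):

```c
#include <assert.h>
#include <stddef.h>

#define QLEN 8

struct queue {
	int items[QLEN];
	size_t head, tail;
};

/* Return the next pending item, or NULL when the queue is empty
 * (stand-in for ibmvnic_next_crq()).
 */
static int *next_item(struct queue *q)
{
	if (q->head == q->tail)
		return NULL;
	return &q->items[q->head++];
}

/* Drain-once: handle everything currently queued, then return.
 * With no outer "while (!done)" loop, a lost response can delay
 * progress but can no longer keep the handler spinning forever.
 */
static int drain(struct queue *q)
{
	int handled = 0;
	int *item;

	while ((item = next_item(q)) != NULL)
		handled++;	/* stand-in for ibmvnic_handle_crq() */
	return handled;
}
```

A second call on an empty queue returns immediately, which is the property the fix restores.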

* Re: [PATCH net 4/4] ibmvnic: remove unused ->wait_capability
  2022-01-22  2:59 ` [PATCH net 4/4] ibmvnic: remove unused ->wait_capability Sukadev Bhattiprolu
@ 2022-01-23  0:33   ` Dany Madden
  0 siblings, 0 replies; 9+ messages in thread
From: Dany Madden @ 2022-01-23  0:33 UTC (permalink / raw)
  To: Sukadev Bhattiprolu; +Cc: netdev, Brian King, Rick Lindsley

On 2022-01-21 18:59, Sukadev Bhattiprolu wrote:
> With the previous bug fix, the ->wait_capability flag is no longer
> needed and can be removed.
> 
> Fixes: 249168ad07cd ("ibmvnic: Make CRQ interrupt tasklet wait for all capabilities crqs")
> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Reviewed-by: Dany Madden <drt@linux.ibm.com>

> ---
>  drivers/net/ethernet/ibm/ibmvnic.c | 38 +++++++++++-------------------
>  drivers/net/ethernet/ibm/ibmvnic.h |  1 -
>  2 files changed, 14 insertions(+), 25 deletions(-)
> 
> diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
> index 682a440151a8..8ed0b95147db 100644
> --- a/drivers/net/ethernet/ibm/ibmvnic.c
> +++ b/drivers/net/ethernet/ibm/ibmvnic.c
> @@ -4876,10 +4876,8 @@ static void handle_request_cap_rsp(union ibmvnic_crq *crq,
>  	}
> 
>  	/* Done receiving requested capabilities, query IP offload support */
> -	if (atomic_read(&adapter->running_cap_crqs) == 0) {
> -		adapter->wait_capability = false;
> +	if (atomic_read(&adapter->running_cap_crqs) == 0)
>  		send_query_ip_offload(adapter);
> -	}
>  }
> 
>  static int handle_login_rsp(union ibmvnic_crq *login_rsp_crq,
> @@ -5177,10 +5175,8 @@ static void handle_query_cap_rsp(union ibmvnic_crq *crq,
>  	}
> 
>  out:
> -	if (atomic_read(&adapter->running_cap_crqs) == 0) {
> -		adapter->wait_capability = false;
> +	if (atomic_read(&adapter->running_cap_crqs) == 0)
>  		send_request_cap(adapter, 0);
> -	}
>  }
> 
>  static int send_query_phys_parms(struct ibmvnic_adapter *adapter)
> @@ -5476,27 +5472,21 @@ static void ibmvnic_tasklet(struct tasklet_struct *t)
>  	struct ibmvnic_crq_queue *queue = &adapter->crq;
>  	union ibmvnic_crq *crq;
>  	unsigned long flags;
> -	bool done = false;
> 
>  	spin_lock_irqsave(&queue->lock, flags);
> -	while (!done) {
> -		/* Pull all the valid messages off the CRQ */
> -		while ((crq = ibmvnic_next_crq(adapter)) != NULL) {
> -			/* This barrier makes sure ibmvnic_next_crq()'s
> -			 * crq->generic.first & IBMVNIC_CRQ_CMD_RSP is loaded
> -			 * before ibmvnic_handle_crq()'s
> -			 * switch(gen_crq->first) and switch(gen_crq->cmd).
> -			 */
> -			dma_rmb();
> -			ibmvnic_handle_crq(crq, adapter);
> -			crq->generic.first = 0;
> -		}
> +
> +	/* Pull all the valid messages off the CRQ */
> +	while ((crq = ibmvnic_next_crq(adapter)) != NULL) {
> +		/* This barrier makes sure ibmvnic_next_crq()'s
> +		 * crq->generic.first & IBMVNIC_CRQ_CMD_RSP is loaded
> +		 * before ibmvnic_handle_crq()'s
> +		 * switch(gen_crq->first) and switch(gen_crq->cmd).
> +		 */
> +		dma_rmb();
> +		ibmvnic_handle_crq(crq, adapter);
> +		crq->generic.first = 0;
>  	}
> -	/* if capabilities CRQ's were sent in this tasklet, the following
> -	 * tasklet must wait until all responses are received
> -	 */
> -	if (atomic_read(&adapter->running_cap_crqs) != 0)
> -		adapter->wait_capability = true;
> +
>  	spin_unlock_irqrestore(&queue->lock, flags);
>  }
> 
> diff --git a/drivers/net/ethernet/ibm/ibmvnic.h b/drivers/net/ethernet/ibm/ibmvnic.h
> index b8e42f67d897..a80f94e161ad 100644
> --- a/drivers/net/ethernet/ibm/ibmvnic.h
> +++ b/drivers/net/ethernet/ibm/ibmvnic.h
> @@ -921,7 +921,6 @@ struct ibmvnic_adapter {
>  	int login_rsp_buf_sz;
> 
>  	atomic_t running_cap_crqs;
> -	bool wait_capability;
> 
>  	struct ibmvnic_sub_crq_queue **tx_scrq ____cacheline_aligned;
>  	struct ibmvnic_sub_crq_queue **rx_scrq ____cacheline_aligned;

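Once the spin loop is gone, the boolean merely mirrored whether running_cap_crqs had reached zero, so reading the counter directly is equivalent and removes one piece of state that could fall out of sync. A minimal sketch of the simplified logic (illustrative names, not the driver's API):

```c
#include <assert.h>

/* Before the patch, a separate bool mirrored "counter reached zero";
 * after it, the counter itself is the single source of truth.
 */
static int running_cap_crqs;

/* Record how many capability responses are outstanding. */
static void expect_caps(int n)
{
	running_cap_crqs = n;
}

/* One response arrived (stand-in for the atomic_dec() in
 * handle_request_cap_rsp() / handle_query_cap_rsp()).
 */
static void handle_response(void)
{
	running_cap_crqs--;
}

/* Proceed to the next protocol step only once every outstanding
 * capability response has arrived.
 */
static int all_caps_done(void)
{
	return running_cap_crqs == 0;
}
```

The driver's real counter is an atomic_t because responses are handled from the CRQ tasklet; plain int suffices for this single-threaded sketch.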

* Re: [PATCH net 1/4] ibmvnic: Allow extra failures before disabling
  2022-01-22  2:59 [PATCH net 1/4] ibmvnic: Allow extra failures before disabling Sukadev Bhattiprolu
                   ` (3 preceding siblings ...)
  2022-01-23  0:22 ` [PATCH net 1/4] ibmvnic: Allow extra failures before disabling Dany Madden
@ 2022-01-24 12:10 ` patchwork-bot+netdevbpf
  4 siblings, 0 replies; 9+ messages in thread
From: patchwork-bot+netdevbpf @ 2022-01-24 12:10 UTC (permalink / raw)
  To: Sukadev Bhattiprolu; +Cc: netdev, brking, drt, ricklind

Hello:

This series was applied to netdev/net.git (master)
by David S. Miller <davem@davemloft.net>:

On Fri, 21 Jan 2022 18:59:18 -0800 you wrote:
> If auto-priority-failover (APF) is enabled and there are at least two
> backing devices of different priorities, some resets like fail-over,
> change-param etc can cause at least two back to back failovers. (Failover
> from high priority backing device to lower priority one and then back
> to the higher priority one if that is still functional).
> 
> Depending on the timing of the two failovers it is possible to trigger
> a "hard" reset and for the hard reset to fail due to failovers. When this
> occurs, the driver assumes that the network is unstable and disables the
> VNIC for a 60-second "settling time". This in turn can cause the ethtool
> command to fail with "No such device" while the vnic automatically recovers
> a little while later.
> 
> [...]

Here is the summary with links:
  - [net,1/4] ibmvnic: Allow extra failures before disabling
    https://git.kernel.org/netdev/net/c/db9f0e8bf79e
  - [net,2/4] ibmvnic: init ->running_cap_crqs early
    https://git.kernel.org/netdev/net/c/151b6a5c06b6
  - [net,3/4] ibmvnic: don't spin in tasklet
    https://git.kernel.org/netdev/net/c/48079e7fdd02
  - [net,4/4] ibmvnic: remove unused ->wait_capability
    https://git.kernel.org/netdev/net/c/3a5d9db7fbdf

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html




end of thread, other threads:[~2022-01-24 12:10 UTC | newest]

Thread overview: 9+ messages
-- links below jump to the message on this page --
2022-01-22  2:59 [PATCH net 1/4] ibmvnic: Allow extra failures before disabling Sukadev Bhattiprolu
2022-01-22  2:59 ` [PATCH net 2/4] ibmvnic: init ->running_cap_crqs early Sukadev Bhattiprolu
2022-01-23  0:30   ` Dany Madden
2022-01-22  2:59 ` [PATCH net 3/4] ibmvnic: don't spin in tasklet Sukadev Bhattiprolu
2022-01-23  0:32   ` Dany Madden
2022-01-22  2:59 ` [PATCH net 4/4] ibmvnic: remove unused ->wait_capability Sukadev Bhattiprolu
2022-01-23  0:33   ` Dany Madden
2022-01-23  0:22 ` [PATCH net 1/4] ibmvnic: Allow extra failures before disabling Dany Madden
2022-01-24 12:10 ` patchwork-bot+netdevbpf
