From: Souradeep Chakrabarti <schakrabarti@microsoft.com>
To: Jesse Brandeburg <jesse.brandeburg@intel.com>,
	Souradeep Chakrabarti <schakrabarti@linux.microsoft.com>,
	Haiyang Zhang <haiyangz@microsoft.com>
Cc: KY Srinivasan <kys@microsoft.com>,
	"wei.liu@kernel.org" <wei.liu@kernel.org>,
	Dexuan Cui <decui@microsoft.com>,
	"davem@davemloft.net" <davem@davemloft.net>,
	"edumazet@google.com" <edumazet@google.com>,
	"kuba@kernel.org" <kuba@kernel.org>,
	"pabeni@redhat.com" <pabeni@redhat.com>,
	Long Li <longli@microsoft.com>,
	Ajay Sharma <sharmaajay@microsoft.com>,
	"leon@kernel.org" <leon@kernel.org>,
	"cai.huoqing@linux.dev" <cai.huoqing@linux.dev>,
	"ssengar@linux.microsoft.com" <ssengar@linux.microsoft.com>,
	vkuznets <vkuznets@redhat.com>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	"linux-hyperv@vger.kernel.org" <linux-hyperv@vger.kernel.org>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>,
	"stable@vger.kernel.org" <stable@vger.kernel.org>
Subject: RE: [EXTERNAL] Re: [PATCH V7 net] net: mana: Fix MANA VF unload when hardware is
Date: Tue, 1 Aug 2023 18:57:07 +0000	[thread overview]
Message-ID: <PUZP153MB078824A51D0D9887919E7F1CCC0AA@PUZP153MB0788.APCP153.PROD.OUTLOOK.COM> (raw)
In-Reply-To: <8ccbfab0-e24f-b758-cd11-27b6d8ab1d48@intel.com>



>-----Original Message-----
>From: Jesse Brandeburg <jesse.brandeburg@intel.com>
>Sent: Tuesday, August 1, 2023 11:34 PM
>To: Souradeep Chakrabarti <schakrabarti@linux.microsoft.com>; KY Srinivasan
><kys@microsoft.com>; Haiyang Zhang <haiyangz@microsoft.com>;
>wei.liu@kernel.org; Dexuan Cui <decui@microsoft.com>;
>davem@davemloft.net; edumazet@google.com; kuba@kernel.org;
>pabeni@redhat.com; Long Li <longli@microsoft.com>; Ajay Sharma
><sharmaajay@microsoft.com>; leon@kernel.org; cai.huoqing@linux.dev;
>ssengar@linux.microsoft.com; vkuznets <vkuznets@redhat.com>;
>tglx@linutronix.de; linux-hyperv@vger.kernel.org; netdev@vger.kernel.org;
>linux-kernel@vger.kernel.org; linux-rdma@vger.kernel.org
>Cc: Souradeep Chakrabarti <schakrabarti@microsoft.com>;
>stable@vger.kernel.org
>Subject: [EXTERNAL] Re: [PATCH V7 net] net: mana: Fix MANA VF unload when
>hardware is
>
>On 8/1/2023 5:29 AM, Souradeep Chakrabarti wrote:
>> When unloading the MANA driver, mana_dealloc_queues() waits for the
>> MANA hardware to complete any inflight packets and set the pending
>> send count to zero. But if the hardware has failed,
>> mana_dealloc_queues() could wait forever.
>>
>> Fix this by adding a timeout to the wait. Set the timeout to 120 seconds,
>
>tx timeout in stack defaults to 5 seconds, do you not have that on? What
>happens when you start getting resets while unloading?
>
>> which is a somewhat arbitrary value that is more than long enough for
>> functional hardware to complete any sends.
>
>I'd say 2 or 5 seconds is probably plenty of time to hang up a driver unload.
>
Thank you for the comment. This was already discussed in V4; here is a
summary:

The wait is usually much shorter than 120 seconds. The long wait only
happens in rare, unexpected cases where the NIC hardware stops responding.
At that point we no longer care whether the pending packets are sent.
But if we free the queues too soon and the hardware is merely slow, a
delayed completion notice will DMA into the freed memory and cause
corruption. That is why we use a longer timeout.
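
To make the trade-off concrete, here is a minimal userspace sketch of a
bounded drain loop with a doubling sleep interval. It is not the MANA driver
code: sim_queue, sim_clock, sim_sleep(), sim_pending() and
drain_with_deadline() are hypothetical names, and the clock is simulated so
the example runs instantly. It only mirrors the shape of the loop in the
patch (poll the pending count, sleep, double the sleep, stop at a deadline).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration only, not driver structures. */
struct sim_queue {
	int pending;         /* models the pending send count of one TX queue */
	uint64_t done_at_us; /* simulated completion time; UINT64_MAX = never */
};

struct sim_clock {
	uint64_t now_us;     /* simulated time in microseconds */
};

/* Advance the simulated clock instead of really sleeping. */
static void sim_sleep(struct sim_clock *clk, uint64_t us)
{
	clk->now_us += us;
}

/* Pending count as seen at the current simulated time. */
static int sim_pending(const struct sim_queue *q, const struct sim_clock *clk)
{
	return clk->now_us >= q->done_at_us ? 0 : q->pending;
}

/*
 * Poll until the queue drains or the deadline passes, doubling the sleep
 * interval on every iteration. Returns true if the queue drained.
 */
static bool drain_with_deadline(struct sim_queue *q, struct sim_clock *clk,
				uint64_t deadline_us)
{
	uint64_t tsleep = 1000; /* start at 1 ms */

	while (sim_pending(q, clk) > 0 && clk->now_us < deadline_us) {
		sim_sleep(clk, tsleep);
		tsleep <<= 1;
	}
	return sim_pending(q, clk) == 0;
}
```

In this sketch, a queue that completes after 5 ms is detected after three
sleeps (7 ms of simulated time). A queue that never completes exits with
false, but because the doubling is uncapped the final sleep alone is about
65 s, so the loop returns at roughly 131 s rather than exactly 120 s.
Whether to cap the interval is a design choice for the caller.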
>>
>> Cc: stable@vger.kernel.org
>> Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
>>
>> Signed-off-by: Souradeep Chakrabarti <schakrabarti@linux.microsoft.com>
>
>keep s-o-b and other trailers together please, no spaces, it messes up git and
>doesn't conform to kernel standards.
>
>
>> ---
>> V6 -> V7:
>> * Optimized the while loop for freeing skb.
>>
>> V5 -> V6:
>> * Added pcie_flr to reset the pci after timeout.
>> * Fixed the position of changelog.
>> * Removed unused variable like cq.
>>
>> V4 -> V5:
>> * Added fixes tag
>> * Changed the usleep_range from static to incremental value.
>> * Initialized timeout in the beginning.
>>
>> V3 -> V4:
>> * Removed the unnecessary braces from mana_dealloc_queues().
>>
>> V2 -> V3:
>> * Removed the unnecessary braces from mana_dealloc_queues().
>>
>> V1 -> V2:
>> * Added net branch
>> * Removed the typecasting to (struct mana_context*) of void pointer
>> * Repositioned timeout variable in mana_dealloc_queues()
>> * Repositioned vf_unload_timeout in mana_context struct, to utilise the
>>   6 bytes hole
>> ---
>>  drivers/net/ethernet/microsoft/mana/mana_en.c | 37 +++++++++++++++++--
>>  1 file changed, 33 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
>> index a499e460594b..3c5552a176d0 100644
>> --- a/drivers/net/ethernet/microsoft/mana/mana_en.c
>> +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
>> @@ -8,6 +8,7 @@
>>  #include <linux/ethtool.h>
>>  #include <linux/filter.h>
>>  #include <linux/mm.h>
>> +#include <linux/pci.h>
>>
>>  #include <net/checksum.h>
>>  #include <net/ip6_checksum.h>
>> @@ -2345,9 +2346,12 @@ int mana_attach(struct net_device *ndev)
>>  static int mana_dealloc_queues(struct net_device *ndev)
>>  {
>>  	struct mana_port_context *apc = netdev_priv(ndev);
>> +	unsigned long timeout = jiffies + 120 * HZ;
>>  	struct gdma_dev *gd = apc->ac->gdma_dev;
>>  	struct mana_txq *txq;
>> +	struct sk_buff *skb;
>>  	int i, err;
>> +	u32 tsleep;
>>
>>  	if (apc->port_is_up)
>>  		return -EINVAL;
>> @@ -2363,15 +2367,40 @@ static int mana_dealloc_queues(struct net_device *ndev)
>>  	 * to false, but it doesn't matter since mana_start_xmit() drops any
>>  	 * new packets due to apc->port_is_up being false.
>>  	 *
>> -	 * Drain all the in-flight TX packets
>> +	 * Drain all the in-flight TX packets.
>> +	 * A timeout of 120 seconds for all the queues is used.
>> +	 * This will break the while loop when h/w is not responding.
>> +	 * This value of 120 has been decided here considering max
>> +	 * number of queues.
>>  	 */
>> +
>>  	for (i = 0; i < apc->num_queues; i++) {
>>  		txq = &apc->tx_qp[i].txq;
>> -
>> -		while (atomic_read(&txq->pending_sends) > 0)
>> -			usleep_range(1000, 2000);
>> +		tsleep = 1000;
>> +		while (atomic_read(&txq->pending_sends) > 0 &&
>> +		       time_before(jiffies, timeout)) {
>> +			usleep_range(tsleep, tsleep + 1000);
>> +			tsleep <<= 1;
>> +		}
>> +		if (atomic_read(&txq->pending_sends)) {
>> +			err = pcie_flr(to_pci_dev(gd->gdma_context->dev));
>> +			if (err) {
>> +				netdev_err(ndev, "flr failed %d with %d pkts pending in txq %u\n",
>> +					   err, atomic_read(&txq->pending_sends),
>> +					   txq->gdma_txq_id);
>> +			}
>> +			break;
>> +		}
>>  	}
>>
>> +	for (i = 0; i < apc->num_queues; i++) {
>> +		txq = &apc->tx_qp[i].txq;
>> +		while (skb = skb_dequeue(&txq->pending_skbs)) {
>> +			mana_unmap_skb(skb, apc);
>> +			dev_consume_skb_any(skb);
>
>dev_kfree_skb_any() would be more correct here since this is an error path and
>the transmit is presumed dropped, isn't it?
Yes, dev_kfree_skb_any() is the better choice here. I will change it in the next version.
>
>
>> +		}
>> +		atomic_set(&txq->pending_sends, 0);
>> +	}
>>  	/* We're 100% sure the queues can no longer be woken up, because
>>  	 * we're sure now mana_poll_tx_cq() can't be running.
>>  	 */


