linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net] iavf: Do not restart Tx queues after reset task failure
@ 2022-11-08 10:25 Ivan Vecera
  2022-11-08 16:40 ` Jacob Keller
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Ivan Vecera @ 2022-11-08 10:25 UTC (permalink / raw)
  To: netdev
  Cc: sassmann, Jacob Keller, Patryk Piotrowski, SlawomirX Laba,
	Jesse Brandeburg, Tony Nguyen, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni,
	moderated list:INTEL ETHERNET DRIVERS, open list

After commit aa626da947e9 ("iavf: Detach device during reset task")
the device is detached during reset task and re-attached at its end.
The problem occurs when reset task fails because Tx queues are
restarted during device re-attach and this leads later to a crash.

To resolve this issue properly close the net device in cause of
failure in reset task to avoid restarting of tx queues at the end.
Also replace the hacky manipulation with IFF_UP flag by device close
that clears properly both IFF_UP and __LINK_STATE_START flags.
In these case iavf_close() does not do anything because the adapter
state is already __IAVF_DOWN.

Reproducer:
1) Run some Tx traffic (e.g. iperf3) over iavf interface
2) Set VF trusted / untrusted in loop

[root@host ~]# cat repro.sh
#!/bin/sh

PF=enp65s0f0
IF=${PF}v0

ip link set up $IF
ip addr add 192.168.0.2/24 dev $IF
sleep 1

iperf3 -c 192.168.0.1 -t 600 --logfile /dev/null &
sleep 2

while :; do
        ip link set $PF vf 0 trust on
        ip link set $PF vf 0 trust off
done
[root@host ~]# ./repro.sh

Result:
[ 2006.650969] iavf 0000:41:01.0: Failed to init adminq: -53
[ 2006.675662] ice 0000:41:00.0: VF 0 is now trusted
[ 2006.689997] iavf 0000:41:01.0: Reset task did not complete, VF disabled
[ 2006.696611] iavf 0000:41:01.0: failed to allocate resources during reinit
[ 2006.703209] ice 0000:41:00.0: VF 0 is now untrusted
[ 2006.737011] ice 0000:41:00.0: VF 0 is now trusted
[ 2006.764536] ice 0000:41:00.0: VF 0 is now untrusted
[ 2006.768919] BUG: kernel NULL pointer dereference, address: 0000000000000b4a
[ 2006.776358] #PF: supervisor read access in kernel mode
[ 2006.781488] #PF: error_code(0x0000) - not-present page
[ 2006.786620] PGD 0 P4D 0
[ 2006.789152] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 2006.792903] ice 0000:41:00.0: VF 0 is now trusted
[ 2006.793501] CPU: 4 PID: 0 Comm: swapper/4 Kdump: loaded Not tainted 6.1.0-rc3+ #2
[ 2006.805668] Hardware name: Abacus electric, s.r.o. - servis@abacus.cz Super Server/H12SSW-iN, BIOS 2.4 04/13/2022
[ 2006.815915] RIP: 0010:iavf_xmit_frame_ring+0x96/0xf70 [iavf]
[ 2006.821028] ice 0000:41:00.0: VF 0 is now untrusted
[ 2006.821572] Code: 48 83 c1 04 48 c1 e1 04 48 01 f9 48 83 c0 10 6b 50 f8 55 c1 ea 14 45 8d 64 14 01 48 39 c8 75 eb 41 83 fc 07 0f 8f e9 08 00 00 <0f> b7 45 4a 0f b7 55 48 41 8d 74 24 05 31 c9 66 39 d0 0f 86 da 00
[ 2006.845181] RSP: 0018:ffffb253004bc9e8 EFLAGS: 00010293
[ 2006.850397] RAX: ffff9d154de45b00 RBX: ffff9d15497d52e8 RCX: ffff9d154de45b00
[ 2006.856327] ice 0000:41:00.0: VF 0 is now trusted
[ 2006.857523] RDX: 0000000000000000 RSI: 00000000000005a8 RDI: ffff9d154de45ac0
[ 2006.857525] RBP: 0000000000000b00 R08: ffff9d159cb010ac R09: 0000000000000001
[ 2006.857526] R10: ffff9d154de45940 R11: 0000000000000000 R12: 0000000000000002
[ 2006.883600] R13: ffff9d1770838dc0 R14: 0000000000000000 R15: ffffffffc07b8380
[ 2006.885840] ice 0000:41:00.0: VF 0 is now untrusted
[ 2006.890725] FS:  0000000000000000(0000) GS:ffff9d248e900000(0000) knlGS:0000000000000000
[ 2006.890727] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2006.909419] CR2: 0000000000000b4a CR3: 0000000c39c10002 CR4: 0000000000770ee0
[ 2006.916543] PKRU: 55555554
[ 2006.918254] ice 0000:41:00.0: VF 0 is now trusted
[ 2006.919248] Call Trace:
[ 2006.919250]  <IRQ>
[ 2006.919252]  dev_hard_start_xmit+0x9e/0x1f0
[ 2006.932587]  sch_direct_xmit+0xa0/0x370
[ 2006.936424]  __dev_queue_xmit+0x7af/0xd00
[ 2006.940429]  ip_finish_output2+0x26c/0x540
[ 2006.944519]  ip_output+0x71/0x110
[ 2006.947831]  ? __ip_finish_output+0x2b0/0x2b0
[ 2006.952180]  __ip_queue_xmit+0x16d/0x400
[ 2006.952721] ice 0000:41:00.0: VF 0 is now untrusted
[ 2006.956098]  __tcp_transmit_skb+0xa96/0xbf0
[ 2006.965148]  __tcp_retransmit_skb+0x174/0x860
[ 2006.969499]  ? cubictcp_cwnd_event+0x40/0x40
[ 2006.973769]  tcp_retransmit_skb+0x14/0xb0
...

Fixes: aa626da947e9 ("iavf: Detach device during reset task")
Cc: Jacob Keller <jacob.e.keller@intel.com>
Cc: Patryk Piotrowski <patryk.piotrowski@intel.com>
Cc: SlawomirX Laba <slawomirx.laba@intel.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
---
 drivers/net/ethernet/intel/iavf/iavf_main.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c
index 5abcd66e7c7a..b66f8fa1d83b 100644
--- a/drivers/net/ethernet/intel/iavf/iavf_main.c
+++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
@@ -2921,7 +2921,6 @@ static void iavf_disable_vf(struct iavf_adapter *adapter)
 	iavf_free_queues(adapter);
 	memset(adapter->vf_res, 0, IAVF_VIRTCHNL_VF_RESOURCE_SIZE);
 	iavf_shutdown_adminq(&adapter->hw);
-	adapter->netdev->flags &= ~IFF_UP;
 	adapter->flags &= ~IAVF_FLAG_RESET_PENDING;
 	iavf_change_state(adapter, __IAVF_DOWN);
 	wake_up(&adapter->down_waitqueue);
@@ -3021,6 +3020,11 @@ static void iavf_reset_task(struct work_struct *work)
 		iavf_disable_vf(adapter);
 		mutex_unlock(&adapter->client_lock);
 		mutex_unlock(&adapter->crit_lock);
+		if (netif_running(netdev)) {
+			rtnl_lock();
+			dev_close(netdev);
+			rtnl_unlock();
+		}
 		return; /* Do not attempt to reinit. It's dead, Jim. */
 	}
 
@@ -3173,6 +3177,16 @@ static void iavf_reset_task(struct work_struct *work)
 
 	mutex_unlock(&adapter->client_lock);
 	mutex_unlock(&adapter->crit_lock);
+
+	if (netif_running(netdev)) {
+		/* Close device to ensure that Tx queues will not be started
+		 * during netif_device_attach() at the end of the reset task.
+		 */
+		rtnl_lock();
+		dev_close(netdev);
+		rtnl_unlock();
+	}
+
 	dev_err(&adapter->pdev->dev, "failed to allocate resources during reinit\n");
 reset_finish:
 	rtnl_lock();
-- 
2.37.4


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH net] iavf: Do not restart Tx queues after reset task failure
  2022-11-08 10:25 [PATCH net] iavf: Do not restart Tx queues after reset task failure Ivan Vecera
@ 2022-11-08 16:40 ` Jacob Keller
  2022-11-09 18:20 ` Leon Romanovsky
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 11+ messages in thread
From: Jacob Keller @ 2022-11-08 16:40 UTC (permalink / raw)
  To: Ivan Vecera, netdev
  Cc: sassmann, Patryk Piotrowski, SlawomirX Laba, Jesse Brandeburg,
	Tony Nguyen, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, moderated list:INTEL ETHERNET DRIVERS, open list



On 11/8/2022 2:25 AM, Ivan Vecera wrote:
> After commit aa626da947e9 ("iavf: Detach device during reset task")
> the device is detached during reset task and re-attached at its end.
> The problem occurs when reset task fails because Tx queues are
> restarted during device re-attach and this leads later to a crash.
> 
> To resolve this issue properly close the net device in cause of
> failure in reset task to avoid restarting of tx queues at the end.
> Also replace the hacky manipulation with IFF_UP flag by device close
> that clears properly both IFF_UP and __LINK_STATE_START flags.
> In these case iavf_close() does not do anything because the adapter
> state is already __IAVF_DOWN.

Thanks for fixing this. I thought we'd removed the last such use of 
incorrect IFF_UP munging in 7d59706dbef8 ("Revert "iavf: Fix deadlock 
occurrence during resetting VF interface""), but apparently we had 
missed one...

Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net] iavf: Do not restart Tx queues after reset task failure
  2022-11-08 10:25 [PATCH net] iavf: Do not restart Tx queues after reset task failure Ivan Vecera
  2022-11-08 16:40 ` Jacob Keller
@ 2022-11-09 18:20 ` Leon Romanovsky
  2022-11-09 20:11   ` Keller, Jacob E
  2022-11-18 14:30 ` [Intel-wired-lan] " Jankowski, Konrad0
  2022-11-18 14:31 ` Jankowski, Konrad0
  3 siblings, 1 reply; 11+ messages in thread
From: Leon Romanovsky @ 2022-11-09 18:20 UTC (permalink / raw)
  To: Ivan Vecera
  Cc: netdev, sassmann, Jacob Keller, Patryk Piotrowski,
	SlawomirX Laba, Jesse Brandeburg, Tony Nguyen, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	moderated list:INTEL ETHERNET DRIVERS, open list

On Tue, Nov 08, 2022 at 11:25:02AM +0100, Ivan Vecera wrote:
> After commit aa626da947e9 ("iavf: Detach device during reset task")
> the device is detached during reset task and re-attached at its end.
> The problem occurs when reset task fails because Tx queues are
> restarted during device re-attach and this leads later to a crash.

<...>

> +	if (netif_running(netdev)) {
> +		/* Close device to ensure that Tx queues will not be started
> +		 * during netif_device_attach() at the end of the reset task.
> +		 */
> +		rtnl_lock();
> +		dev_close(netdev);
> +		rtnl_unlock();
> +	}

Sorry for my naive question, I see this pattern a lot (including RDMA), 
so curious. Everyone checks netif_running() outside of rtnl_lock, while
dev_close() changes state bit __LINK_STATE_START. Shouldn't rtnl_lock()
placed before netif_running()?

Thanks

> +
>  	dev_err(&adapter->pdev->dev, "failed to allocate resources during reinit\n");
>  reset_finish:
>  	rtnl_lock();
> -- 
> 2.37.4
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [PATCH net] iavf: Do not restart Tx queues after reset task failure
  2022-11-09 18:20 ` Leon Romanovsky
@ 2022-11-09 20:11   ` Keller, Jacob E
  2022-11-10  9:17     ` Leon Romanovsky
  2022-11-10 14:51     ` Ivan Vecera
  0 siblings, 2 replies; 11+ messages in thread
From: Keller, Jacob E @ 2022-11-09 20:11 UTC (permalink / raw)
  To: Leon Romanovsky, ivecera
  Cc: netdev, sassmann, Piotrowski, Patryk, SlawomirX Laba, Brandeburg,
	Jesse, Nguyen, Anthony L, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni,
	moderated list:INTEL ETHERNET DRIVERS, open list



> -----Original Message-----
> From: Leon Romanovsky <leon@kernel.org>
> Sent: Wednesday, November 9, 2022 10:21 AM
> To: ivecera <ivecera@redhat.com>
> Cc: netdev@vger.kernel.org; sassmann@redhat.com; Keller, Jacob E
> <jacob.e.keller@intel.com>; Piotrowski, Patryk <patryk.piotrowski@intel.com>;
> SlawomirX Laba <slawomirx.laba@intel.com>; Brandeburg, Jesse
> <jesse.brandeburg@intel.com>; Nguyen, Anthony L
> <anthony.l.nguyen@intel.com>; David S. Miller <davem@davemloft.net>; Eric
> Dumazet <edumazet@google.com>; Jakub Kicinski <kuba@kernel.org>; Paolo
> Abeni <pabeni@redhat.com>; moderated list:INTEL ETHERNET DRIVERS <intel-
> wired-lan@lists.osuosl.org>; open list <linux-kernel@vger.kernel.org>
> Subject: Re: [PATCH net] iavf: Do not restart Tx queues after reset task failure
> 
> On Tue, Nov 08, 2022 at 11:25:02AM +0100, Ivan Vecera wrote:
> > After commit aa626da947e9 ("iavf: Detach device during reset task")
> > the device is detached during reset task and re-attached at its end.
> > The problem occurs when reset task fails because Tx queues are
> > restarted during device re-attach and this leads later to a crash.
> 
> <...>
> 
> > +	if (netif_running(netdev)) {
> > +		/* Close device to ensure that Tx queues will not be started
> > +		 * during netif_device_attach() at the end of the reset task.
> > +		 */
> > +		rtnl_lock();
> > +		dev_close(netdev);
> > +		rtnl_unlock();
> > +	}
> 
> Sorry for my naive question, I see this pattern a lot (including RDMA),
> so curious. Everyone checks netif_running() outside of rtnl_lock, while
> dev_close() changes state bit __LINK_STATE_START. Shouldn't rtnl_lock()
> placed before netif_running()?

Yes I think you're right. A ton of people check it without the lock but I think thats not strictly safe. Is dev_close safe to call when netif_running is false? Why not just remove the check and always call dev_close then.

Thanks,
Jake

> 
> Thanks
> 
> > +
> >  	dev_err(&adapter->pdev->dev, "failed to allocate resources during
> reinit\n");
> >  reset_finish:
> >  	rtnl_lock();
> > --
> > 2.37.4
> >

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net] iavf: Do not restart Tx queues after reset task failure
  2022-11-09 20:11   ` Keller, Jacob E
@ 2022-11-10  9:17     ` Leon Romanovsky
  2022-11-10 14:51     ` Ivan Vecera
  1 sibling, 0 replies; 11+ messages in thread
From: Leon Romanovsky @ 2022-11-10  9:17 UTC (permalink / raw)
  To: Keller, Jacob E
  Cc: ivecera, netdev, sassmann, Piotrowski, Patryk, SlawomirX Laba,
	Brandeburg, Jesse, Nguyen, Anthony L, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	moderated list:INTEL ETHERNET DRIVERS, open list

On Wed, Nov 09, 2022 at 08:11:55PM +0000, Keller, Jacob E wrote:
> 
> 
> > -----Original Message-----
> > From: Leon Romanovsky <leon@kernel.org>
> > Sent: Wednesday, November 9, 2022 10:21 AM
> > To: ivecera <ivecera@redhat.com>
> > Cc: netdev@vger.kernel.org; sassmann@redhat.com; Keller, Jacob E
> > <jacob.e.keller@intel.com>; Piotrowski, Patryk <patryk.piotrowski@intel.com>;
> > SlawomirX Laba <slawomirx.laba@intel.com>; Brandeburg, Jesse
> > <jesse.brandeburg@intel.com>; Nguyen, Anthony L
> > <anthony.l.nguyen@intel.com>; David S. Miller <davem@davemloft.net>; Eric
> > Dumazet <edumazet@google.com>; Jakub Kicinski <kuba@kernel.org>; Paolo
> > Abeni <pabeni@redhat.com>; moderated list:INTEL ETHERNET DRIVERS <intel-
> > wired-lan@lists.osuosl.org>; open list <linux-kernel@vger.kernel.org>
> > Subject: Re: [PATCH net] iavf: Do not restart Tx queues after reset task failure
> > 
> > On Tue, Nov 08, 2022 at 11:25:02AM +0100, Ivan Vecera wrote:
> > > After commit aa626da947e9 ("iavf: Detach device during reset task")
> > > the device is detached during reset task and re-attached at its end.
> > > The problem occurs when reset task fails because Tx queues are
> > > restarted during device re-attach and this leads later to a crash.
> > 
> > <...>
> > 
> > > +	if (netif_running(netdev)) {
> > > +		/* Close device to ensure that Tx queues will not be started
> > > +		 * during netif_device_attach() at the end of the reset task.
> > > +		 */
> > > +		rtnl_lock();
> > > +		dev_close(netdev);
> > > +		rtnl_unlock();
> > > +	}
> > 
> > Sorry for my naive question, I see this pattern a lot (including RDMA),
> > so curious. Everyone checks netif_running() outside of rtnl_lock, while
> > dev_close() changes state bit __LINK_STATE_START. Shouldn't rtnl_lock()
> > placed before netif_running()?
> 
> Yes I think you're right. A ton of people check it without the lock but I think thats not strictly safe. Is dev_close safe to call when netif_running is false? Why not just remove the check and always call dev_close then.

I honestly don't know.

To remove any doubts, this patch is LGTM.

Thanks,
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net] iavf: Do not restart Tx queues after reset task failure
  2022-11-09 20:11   ` Keller, Jacob E
  2022-11-10  9:17     ` Leon Romanovsky
@ 2022-11-10 14:51     ` Ivan Vecera
  2022-11-10 17:07       ` Leon Romanovsky
  1 sibling, 1 reply; 11+ messages in thread
From: Ivan Vecera @ 2022-11-10 14:51 UTC (permalink / raw)
  To: Keller, Jacob E
  Cc: Leon Romanovsky, netdev, sassmann, Piotrowski, Patryk,
	SlawomirX Laba, Brandeburg, Jesse, Nguyen, Anthony L,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	moderated list:INTEL ETHERNET DRIVERS, open list

On Wed, 9 Nov 2022 20:11:55 +0000
"Keller, Jacob E" <jacob.e.keller@intel.com> wrote:

> > Sorry for my naive question, I see this pattern a lot (including RDMA),
> > so curious. Everyone checks netif_running() outside of rtnl_lock, while
> > dev_close() changes state bit __LINK_STATE_START. Shouldn't rtnl_lock()
> > placed before netif_running()?  
> 
> Yes I think you're right. A ton of people check it without the lock but I think thats not strictly safe. Is dev_close safe to call when netif_running is false? Why not just remove the check and always call dev_close then.
> 
> Thanks,
> Jake

Check for a bit value (like netif_runnning()) is much cheaper than unconditionally
taking global lock like RTNL.

Ivan


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net] iavf: Do not restart Tx queues after reset task failure
  2022-11-10 14:51     ` Ivan Vecera
@ 2022-11-10 17:07       ` Leon Romanovsky
       [not found]         ` <20221110122418.32414666@kernel.org>
  0 siblings, 1 reply; 11+ messages in thread
From: Leon Romanovsky @ 2022-11-10 17:07 UTC (permalink / raw)
  To: Ivan Vecera
  Cc: Keller, Jacob E, netdev, sassmann, Piotrowski, Patryk,
	SlawomirX Laba, Brandeburg, Jesse, Nguyen, Anthony L,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	moderated list:INTEL ETHERNET DRIVERS, open list

On Thu, Nov 10, 2022 at 03:51:47PM +0100, Ivan Vecera wrote:
> On Wed, 9 Nov 2022 20:11:55 +0000
> "Keller, Jacob E" <jacob.e.keller@intel.com> wrote:
> 
> > > Sorry for my naive question, I see this pattern a lot (including RDMA),
> > > so curious. Everyone checks netif_running() outside of rtnl_lock, while
> > > dev_close() changes state bit __LINK_STATE_START. Shouldn't rtnl_lock()
> > > placed before netif_running()?  
> > 
> > Yes I think you're right. A ton of people check it without the lock but I think thats not strictly safe. Is dev_close safe to call when netif_running is false? Why not just remove the check and always call dev_close then.
> > 
> > Thanks,
> > Jake
> 
> Check for a bit value (like netif_runnning()) is much cheaper than unconditionally
> taking global lock like RTNL.

This cheap operation is racy and performed in non-performance critical path.

Thanks

> 
> Ivan
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net] iavf: Do not restart Tx queues after reset task failure
       [not found]         ` <20221110122418.32414666@kernel.org>
@ 2022-11-10 21:07           ` Leon Romanovsky
  2022-11-10 21:13             ` Keller, Jacob E
  0 siblings, 1 reply; 11+ messages in thread
From: Leon Romanovsky @ 2022-11-10 21:07 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Ivan Vecera, Keller, Jacob E, netdev, sassmann, Piotrowski,
	Patryk, SlawomirX Laba, Brandeburg, Jesse, Nguyen, Anthony L,
	David S. Miller, Eric Dumazet, Paolo Abeni, intel-wired-lan,
	open list

On Thu, Nov 10, 2022 at 12:24:18PM -0800, Jakub Kicinski wrote:
> On Thu, 10 Nov 2022 19:07:02 +0200 Leon Romanovsky wrote:
> > > > Yes I think you're right. A ton of people check it without the
> > > > lock but I think thats not strictly safe. Is dev_close safe to
> > > > call when netif_running is false? Why not just remove the check
> > > > and always call dev_close then.
> > > 
> > > Check for a bit value (like netif_runnning()) is much cheaper than
> > > unconditionally taking global lock like RTNL.  
> > 
> > This cheap operation is racy and performed in non-performance
> > critical path.
> 
> To be clear - the rtnl_lock around the entire if is still racy 
> in the grand scheme of things, no? What's stopping someone from
> bringing the device right back up after you drop the lock?

I want to believe what there is some sort of state machine that won't
allow simple toggling of dev_close/dev_open. If it doesn't, rtnl_lock
users should audit their code for possible toggling right after that
lock is dropped.

Anyway, this discussion reminds me our devl_lock debate where we had
completely opposite views if rtnl_lock model is the right one.
https://lore.kernel.org/netdev/20211101073259.33406da3@kicinski-fedora-PC1C0HJN/

Let's not start argue again, we had enough back then. :)

Thanks

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [PATCH net] iavf: Do not restart Tx queues after reset task failure
  2022-11-10 21:07           ` Leon Romanovsky
@ 2022-11-10 21:13             ` Keller, Jacob E
  0 siblings, 0 replies; 11+ messages in thread
From: Keller, Jacob E @ 2022-11-10 21:13 UTC (permalink / raw)
  To: Leon Romanovsky, Jakub Kicinski
  Cc: ivecera, netdev, sassmann, Piotrowski, Patryk, SlawomirX Laba,
	Brandeburg, Jesse, Nguyen, Anthony L, David S. Miller,
	Eric Dumazet, Paolo Abeni, intel-wired-lan, open list



> -----Original Message-----
> From: Leon Romanovsky <leon@kernel.org>
> Sent: Thursday, November 10, 2022 1:07 PM
> To: Jakub Kicinski <kuba@kernel.org>
> Cc: ivecera <ivecera@redhat.com>; Keller, Jacob E <jacob.e.keller@intel.com>;
> netdev@vger.kernel.org; sassmann@redhat.com; Piotrowski, Patryk
> <patryk.piotrowski@intel.com>; SlawomirX Laba <slawomirx.laba@intel.com>;
> Brandeburg, Jesse <jesse.brandeburg@intel.com>; Nguyen, Anthony L
> <anthony.l.nguyen@intel.com>; David S. Miller <davem@davemloft.net>; Eric
> Dumazet <edumazet@google.com>; Paolo Abeni <pabeni@redhat.com>; intel-
> wired-lan@lists.osuosl.org; open list <linux-kernel@vger.kernel.org>
> Subject: Re: [PATCH net] iavf: Do not restart Tx queues after reset task failure
> 
> On Thu, Nov 10, 2022 at 12:24:18PM -0800, Jakub Kicinski wrote:
> > On Thu, 10 Nov 2022 19:07:02 +0200 Leon Romanovsky wrote:
> > > > > Yes I think you're right. A ton of people check it without the
> > > > > lock but I think thats not strictly safe. Is dev_close safe to
> > > > > call when netif_running is false? Why not just remove the check
> > > > > and always call dev_close then.
> > > >
> > > > Check for a bit value (like netif_runnning()) is much cheaper than
> > > > unconditionally taking global lock like RTNL.
> > >
> > > This cheap operation is racy and performed in non-performance
> > > critical path.
> >
> > To be clear - the rtnl_lock around the entire if is still racy
> > in the grand scheme of things, no? What's stopping someone from
> > bringing the device right back up after you drop the lock?
> 

I think the reset flow uses netif_device_detach() to detach the device before reset. Is that enough to prevent other calls to dev_close outside the driver?

Also, perhaps we should avoid re-attaching the device if the reset fails...

> I want to believe what there is some sort of state machine that won't
> allow simple toggling of dev_close/dev_open. If it doesn't, rtnl_lock
> users should audit their code for possible toggling right after that
> lock is dropped.
> 

I think the key is that normally dev_open and dev_close are done by iproute2 netlink messages? so if we close it, its possible userspace could immediately open it.. though I think that isn't allowed while the device is detached, so we should stay closed until we re-attach, at which point dev_open can fail by noticing the VF is disabled...


> Anyway, this discussion reminds me our devl_lock debate where we had
> completely opposite views if rtnl_lock model is the right one.
> https://lore.kernel.org/netdev/20211101073259.33406da3@kicinski-fedora-
> PC1C0HJN/
> 
> Let's not start argue again, we had enough back then. :)
> 
> Thanks

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [Intel-wired-lan] [PATCH net] iavf: Do not restart Tx queues after reset task failure
  2022-11-08 10:25 [PATCH net] iavf: Do not restart Tx queues after reset task failure Ivan Vecera
  2022-11-08 16:40 ` Jacob Keller
  2022-11-09 18:20 ` Leon Romanovsky
@ 2022-11-18 14:30 ` Jankowski, Konrad0
  2022-11-18 14:31 ` Jankowski, Konrad0
  3 siblings, 0 replies; 11+ messages in thread
From: Jankowski, Konrad0 @ 2022-11-18 14:30 UTC (permalink / raw)
  To: ivecera, netdev
  Cc: SlawomirX Laba, Eric Dumazet,
	moderated list:INTEL ETHERNET DRIVERS, open list, Piotrowski,
	Patryk, Jakub Kicinski, Paolo Abeni, David S. Miller, sassmann



> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of Ivan
> Vecera
> Sent: Tuesday, November 8, 2022 11:25 AM
> To: netdev@vger.kernel.org
> Cc: SlawomirX Laba <slawomirx.laba@intel.com>; Eric Dumazet
> <edumazet@google.com>; moderated list:INTEL ETHERNET DRIVERS <intel-
> wired-lan@lists.osuosl.org>; open list <linux-kernel@vger.kernel.org>;
> Piotrowski, Patryk <patryk.piotrowski@intel.com>; Jakub Kicinski
> <kuba@kernel.org>; Paolo Abeni <pabeni@redhat.com>; David S. Miller
> <davem@davemloft.net>; sassmann@redhat.com
> Subject: [Intel-wired-lan] [PATCH net] iavf: Do not restart Tx queues after reset
> task failure
> 
> After commit aa626da947e9 ("iavf: Detach device during reset task") the device
> is detached during reset task and re-attached at its end.
> The problem occurs when reset task fails because Tx queues are restarted during
> device re-attach and this leads later to a crash.
> 
> To resolve this issue properly close the net device in cause of failure in reset task
> to avoid restarting of tx queues at the end.
> Also replace the hacky manipulation with IFF_UP flag by device close that clears
> properly both IFF_UP and __LINK_STATE_START flags.
> In these case iavf_close() does not do anything because the adapter state is
> already __IAVF_DOWN.
> 
> Reproducer:
> 1) Run some Tx traffic (e.g. iperf3) over iavf interface
> 2) Set VF trusted / untrusted in loop
> 
> [root@host ~]# cat repro.sh
> #!/bin/sh
> 
> PF=enp65s0f0
> IF=${PF}v0
> 
> ip link set up $IF
> ip addr add 192.168.0.2/24 dev $IF
> sleep 1
> 
> iperf3 -c 192.168.0.1 -t 600 --logfile /dev/null & sleep 2
> 
> while :; do
>         ip link set $PF vf 0 trust on
>         ip link set $PF vf 0 trust off
> done
> [root@host ~]# ./repro.sh
> 
> Result:
> [ 2006.650969] iavf 0000:41:01.0: Failed to init adminq: -53 [ 2006.675662] ice
> 0000:41:00.0: VF 0 is now trusted [ 2006.689997] iavf 0000:41:01.0: Reset task
> did not complete, VF disabled [ 2006.696611] iavf 0000:41:01.0: failed to allocate
> resources during reinit [ 2006.703209] ice 0000:41:00.0: VF 0 is now untrusted [
> 2006.737011] ice 0000:41:00.0: VF 0 is now trusted [ 2006.764536] ice
> 0000:41:00.0: VF 0 is now untrusted [ 2006.768919] BUG: kernel NULL pointer
> dereference, address: 0000000000000b4a [ 2006.776358] #PF: supervisor read
> access in kernel mode [ 2006.781488] #PF: error_code(0x0000) - not-present
> page [ 2006.786620] PGD 0 P4D 0 [ 2006.789152] Oops: 0000 [#1] PREEMPT SMP
> NOPTI [ 2006.792903] ice 0000:41:00.0: VF 0 is now trusted [ 2006.793501] CPU:
> 4 PID: 0 Comm: swapper/4 Kdump: loaded Not tainted 6.1.0-rc3+ #2 [
> 2006.805668] Hardware name: Abacus electric, s.r.o. - servis@abacus.cz Super
> Server/H12SSW-iN, BIOS 2.4 04/13/2022 [ 2006.815915] RIP:
> 0010:iavf_xmit_frame_ring+0x96/0xf70 [iavf] [ 2006.821028] ice 0000:41:00.0:
> VF 0 is now untrusted [ 2006.821572] Code: 48 83 c1 04 48 c1 e1 04 48 01 f9 48
> 83 c0 10 6b 50 f8 55 c1 ea 14 45 8d 64 14 01 48 39 c8 75 eb 41 83 fc 07 0f 8f e9 08
> 00 00 <0f> b7 45 4a 0f b7 55 48 41 8d 74 24 05 31 c9 66 39 d0 0f 86 da 00 [
> 2006.845181] RSP: 0018:ffffb253004bc9e8 EFLAGS: 00010293 [ 2006.850397]
> RAX: ffff9d154de45b00 RBX: ffff9d15497d52e8 RCX: ffff9d154de45b00 [
> 2006.856327] ice 0000:41:00.0: VF 0 is now trusted [ 2006.857523] RDX:
> 0000000000000000 RSI: 00000000000005a8 RDI: ffff9d154de45ac0 [
> 2006.857525] RBP: 0000000000000b00 R08: ffff9d159cb010ac R09:
> 0000000000000001 [ 2006.857526] R10: ffff9d154de45940 R11:
> 0000000000000000 R12: 0000000000000002 [ 2006.883600] R13:
> ffff9d1770838dc0 R14: 0000000000000000 R15: ffffffffc07b8380 [ 2006.885840]
> ice 0000:41:00.0: VF 0 is now untrusted [ 2006.890725] FS:
> 0000000000000000(0000) GS:ffff9d248e900000(0000) knlGS:0000000000000000
> [ 2006.890727] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [
> 2006.909419] CR2: 0000000000000b4a CR3: 0000000c39c10002 CR4:
> 0000000000770ee0 [ 2006.916543] PKRU: 55555554 [ 2006.918254] ice
> 0000:41:00.0: VF 0 is now trusted [ 2006.919248] Call Trace:
> [ 2006.919250]  <IRQ>
> [ 2006.919252]  dev_hard_start_xmit+0x9e/0x1f0 [ 2006.932587]
> sch_direct_xmit+0xa0/0x370 [ 2006.936424]  __dev_queue_xmit+0x7af/0xd00 [
> 2006.940429]  ip_finish_output2+0x26c/0x540 [ 2006.944519]
> ip_output+0x71/0x110 [ 2006.947831]  ? __ip_finish_output+0x2b0/0x2b0 [
> 2006.952180]  __ip_queue_xmit+0x16d/0x400 [ 2006.952721] ice 0000:41:00.0:
> VF 0 is now untrusted [ 2006.956098]  __tcp_transmit_skb+0xa96/0xbf0 [
> 2006.965148]  __tcp_retransmit_skb+0x174/0x860 [ 2006.969499]  ?
> cubictcp_cwnd_event+0x40/0x40 [ 2006.973769]
> tcp_retransmit_skb+0x14/0xb0 ...
> 
> Fixes: aa626da947e9 ("iavf: Detach device during reset task")
> Cc: Jacob Keller <jacob.e.keller@intel.com>
> Cc: Patryk Piotrowski <patryk.piotrowski@intel.com>
> Cc: SlawomirX Laba <slawomirx.laba@intel.com>
> Signed-off-by: Ivan Vecera <ivecera@redhat.com>
> ---
>  drivers/net/ethernet/intel/iavf/iavf_main.c | 16 +++++++++++++++-
>  1 file changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c
> b/drivers/net/ethernet/intel/iavf/iavf_main.c
> index 5abcd66e7c7a..b66f8fa1d83b 100644
> --- a/drivers/net/ethernet/intel/iavf/iavf_main.c
> +++ b/drivers/net/ethernet/intel/iavf/iavf_main.c

Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [Intel-wired-lan] [PATCH net] iavf: Do not restart Tx queues after reset task failure
  2022-11-08 10:25 [PATCH net] iavf: Do not restart Tx queues after reset task failure Ivan Vecera
                   ` (2 preceding siblings ...)
  2022-11-18 14:30 ` [Intel-wired-lan] " Jankowski, Konrad0
@ 2022-11-18 14:31 ` Jankowski, Konrad0
  3 siblings, 0 replies; 11+ messages in thread
From: Jankowski, Konrad0 @ 2022-11-18 14:31 UTC (permalink / raw)
  To: ivecera, netdev
  Cc: SlawomirX Laba, Eric Dumazet,
	moderated list:INTEL ETHERNET DRIVERS, open list, Piotrowski,
	Patryk, Jakub Kicinski, Paolo Abeni, David S. Miller, sassmann



> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of Ivan
> Vecera
> Sent: Tuesday, November 8, 2022 11:25 AM
> To: netdev@vger.kernel.org
> Cc: SlawomirX Laba <slawomirx.laba@intel.com>; Eric Dumazet
> <edumazet@google.com>; moderated list:INTEL ETHERNET DRIVERS <intel-
> wired-lan@lists.osuosl.org>; open list <linux-kernel@vger.kernel.org>;
> Piotrowski, Patryk <patryk.piotrowski@intel.com>; Jakub Kicinski
> <kuba@kernel.org>; Paolo Abeni <pabeni@redhat.com>; David S. Miller
> <davem@davemloft.net>; sassmann@redhat.com
> Subject: [Intel-wired-lan] [PATCH net] iavf: Do not restart Tx queues after reset
> task failure
> 
> After commit aa626da947e9 ("iavf: Detach device during reset task") the device
> is detached during reset task and re-attached at its end.
> The problem occurs when reset task fails because Tx queues are restarted during
> device re-attach and this leads later to a crash.
> 
> To resolve this issue properly close the net device in cause of failure in reset task
> to avoid restarting of tx queues at the end.
> Also replace the hacky manipulation with IFF_UP flag by device close that clears
> properly both IFF_UP and __LINK_STATE_START flags.
> In these case iavf_close() does not do anything because the adapter state is
> already __IAVF_DOWN.
> 
> Reproducer:
> 1) Run some Tx traffic (e.g. iperf3) over iavf interface
> 2) Set VF trusted / untrusted in loop
> 
> [root@host ~]# cat repro.sh
> #!/bin/sh
> 
> PF=enp65s0f0
> IF=${PF}v0
> 
> ip link set up $IF
> ip addr add 192.168.0.2/24 dev $IF
> sleep 1
> 
> iperf3 -c 192.168.0.1 -t 600 --logfile /dev/null & sleep 2
> 
> while :; do
>         ip link set $PF vf 0 trust on
>         ip link set $PF vf 0 trust off
> done
> [root@host ~]# ./repro.sh
> 
> Result:
> [ 2006.650969] iavf 0000:41:01.0: Failed to init adminq: -53 [ 2006.675662] ice
> 0000:41:00.0: VF 0 is now trusted [ 2006.689997] iavf 0000:41:01.0: Reset task
> did not complete, VF disabled [ 2006.696611] iavf 0000:41:01.0: failed to allocate
> resources during reinit [ 2006.703209] ice 0000:41:00.0: VF 0 is now untrusted [
> 2006.737011] ice 0000:41:00.0: VF 0 is now trusted [ 2006.764536] ice
> 0000:41:00.0: VF 0 is now untrusted [ 2006.768919] BUG: kernel NULL pointer
> dereference, address: 0000000000000b4a [ 2006.776358] #PF: supervisor read
> access in kernel mode [ 2006.781488] #PF: error_code(0x0000) - not-present
> page [ 2006.786620] PGD 0 P4D 0 [ 2006.789152] Oops: 0000 [#1] PREEMPT SMP
> NOPTI [ 2006.792903] ice 0000:41:00.0: VF 0 is now trusted [ 2006.793501] CPU:
> 4 PID: 0 Comm: swapper/4 Kdump: loaded Not tainted 6.1.0-rc3+ #2 [
> 2006.805668] Hardware name: Abacus electric, s.r.o. - servis@abacus.cz Super
> Server/H12SSW-iN, BIOS 2.4 04/13/2022 [ 2006.815915] RIP:
> 0010:iavf_xmit_frame_ring+0x96/0xf70 [iavf] [ 2006.821028] ice 0000:41:00.0:
> VF 0 is now untrusted [ 2006.821572] Code: 48 83 c1 04 48 c1 e1 04 48 01 f9 48
> 83 c0 10 6b 50 f8 55 c1 ea 14 45 8d 64 14 01 48 39 c8 75 eb 41 83 fc 07 0f 8f e9 08
> 00 00 <0f> b7 45 4a 0f b7 55 48 41 8d 74 24 05 31 c9 66 39 d0 0f 86 da 00 [
> 2006.845181] RSP: 0018:ffffb253004bc9e8 EFLAGS: 00010293 [ 2006.850397]
> RAX: ffff9d154de45b00 RBX: ffff9d15497d52e8 RCX: ffff9d154de45b00 [
> 2006.856327] ice 0000:41:00.0: VF 0 is now trusted [ 2006.857523] RDX:
> 0000000000000000 RSI: 00000000000005a8 RDI: ffff9d154de45ac0 [
> 2006.857525] RBP: 0000000000000b00 R08: ffff9d159cb010ac R09:
> 0000000000000001 [ 2006.857526] R10: ffff9d154de45940 R11:
> 0000000000000000 R12: 0000000000000002 [ 2006.883600] R13:
> ffff9d1770838dc0 R14: 0000000000000000 R15: ffffffffc07b8380 [ 2006.885840]
> ice 0000:41:00.0: VF 0 is now untrusted [ 2006.890725] FS:
> 0000000000000000(0000) GS:ffff9d248e900000(0000) knlGS:0000000000000000
> [ 2006.890727] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [
> 2006.909419] CR2: 0000000000000b4a CR3: 0000000c39c10002 CR4:
> 0000000000770ee0 [ 2006.916543] PKRU: 55555554 [ 2006.918254] ice
> 0000:41:00.0: VF 0 is now trusted [ 2006.919248] Call Trace:
> [ 2006.919250]  <IRQ>
> [ 2006.919252]  dev_hard_start_xmit+0x9e/0x1f0 [ 2006.932587]
> sch_direct_xmit+0xa0/0x370 [ 2006.936424]  __dev_queue_xmit+0x7af/0xd00 [
> 2006.940429]  ip_finish_output2+0x26c/0x540 [ 2006.944519]
> ip_output+0x71/0x110 [ 2006.947831]  ? __ip_finish_output+0x2b0/0x2b0 [
> 2006.952180]  __ip_queue_xmit+0x16d/0x400 [ 2006.952721] ice 0000:41:00.0:
> VF 0 is now untrusted [ 2006.956098]  __tcp_transmit_skb+0xa96/0xbf0 [
> 2006.965148]  __tcp_retransmit_skb+0x174/0x860 [ 2006.969499]  ?
> cubictcp_cwnd_event+0x40/0x40 [ 2006.973769]
> tcp_retransmit_skb+0x14/0xb0 ...
> 
> Fixes: aa626da947e9 ("iavf: Detach device during reset task")
> Cc: Jacob Keller <jacob.e.keller@intel.com>
> Cc: Patryk Piotrowski <patryk.piotrowski@intel.com>
> Cc: SlawomirX Laba <slawomirx.laba@intel.com>
> Signed-off-by: Ivan Vecera <ivecera@redhat.com>
> ---
>  drivers/net/ethernet/intel/iavf/iavf_main.c | 16 +++++++++++++++-
>  1 file changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c
> b/drivers/net/ethernet/intel/iavf/iavf_main.c
> index 5abcd66e7c7a..b66f8fa1d83b 100644
> --- a/drivers/net/ethernet/intel/iavf/iavf_main.c
> +++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
> @@ -2921,7 +2921,6 @@ static void iavf_disable_vf(struct iavf_adapter

Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2022-11-18 14:32 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-08 10:25 [PATCH net] iavf: Do not restart Tx queues after reset task failure Ivan Vecera
2022-11-08 16:40 ` Jacob Keller
2022-11-09 18:20 ` Leon Romanovsky
2022-11-09 20:11   ` Keller, Jacob E
2022-11-10  9:17     ` Leon Romanovsky
2022-11-10 14:51     ` Ivan Vecera
2022-11-10 17:07       ` Leon Romanovsky
     [not found]         ` <20221110122418.32414666@kernel.org>
2022-11-10 21:07           ` Leon Romanovsky
2022-11-10 21:13             ` Keller, Jacob E
2022-11-18 14:30 ` [Intel-wired-lan] " Jankowski, Konrad0
2022-11-18 14:31 ` Jankowski, Konrad0

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).