All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net] tg3: Fix soft lockup when tg3_reset_task() fails.
@ 2020-09-03 18:28 Michael Chan
  2020-09-03 19:24 ` David Miller
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Michael Chan @ 2020-09-03 18:28 UTC (permalink / raw)
  To: davem; +Cc: netdev, kuba, drc, baptiste

If tg3_reset_task() fails, the device state is left in an inconsistent
state with IFF_RUNNING still set but NAPI state not enabled.  A
subsequent operation, such as ifdown or AER error can cause it to
soft lock up when it tries to disable NAPI state.

Fix it by bringing down the device to !IFF_RUNNING state when
tg3_reset_task() fails.  tg3_reset_task() running from workqueue
will now call tg3_close() when the reset fails.  We need to
modify tg3_reset_task_cancel() slightly to avoid tg3_close()
calling cancel_work_sync() to cancel tg3_reset_task().  Otherwise
cancel_work_sync() will wait forever for tg3_reset_task() to
finish.

Reported-by: David Christensen <drc@linux.vnet.ibm.com>
Reported-by: Baptiste Covolato <baptiste@arista.com>
Fixes: db2199737990 ("tg3: Schedule at most one tg3_reset_task run")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/tg3.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
index ebff1fc..4515804 100644
--- a/drivers/net/ethernet/broadcom/tg3.c
+++ b/drivers/net/ethernet/broadcom/tg3.c
@@ -7221,8 +7221,8 @@ static inline void tg3_reset_task_schedule(struct tg3 *tp)
 
 static inline void tg3_reset_task_cancel(struct tg3 *tp)
 {
-	cancel_work_sync(&tp->reset_task);
-	tg3_flag_clear(tp, RESET_TASK_PENDING);
+	if (test_and_clear_bit(TG3_FLAG_RESET_TASK_PENDING, tp->tg3_flags))
+		cancel_work_sync(&tp->reset_task);
 	tg3_flag_clear(tp, TX_RECOVERY_PENDING);
 }
 
@@ -11209,18 +11209,27 @@ static void tg3_reset_task(struct work_struct *work)
 
 	tg3_halt(tp, RESET_KIND_SHUTDOWN, 0);
 	err = tg3_init_hw(tp, true);
-	if (err)
+	if (err) {
+		tg3_full_unlock(tp);
+		tp->irq_sync = 0;
+		tg3_napi_enable(tp);
+		/* Clear this flag so that tg3_reset_task_cancel() will not
+		 * call cancel_work_sync() and wait forever.
+		 */
+		tg3_flag_clear(tp, RESET_TASK_PENDING);
+		dev_close(tp->dev);
 		goto out;
+	}
 
 	tg3_netif_start(tp);
 
-out:
 	tg3_full_unlock(tp);
 
 	if (!err)
 		tg3_phy_start(tp);
 
 	tg3_flag_clear(tp, RESET_TASK_PENDING);
+out:
 	rtnl_unlock();
 }
 
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH net] tg3: Fix soft lockup when tg3_reset_task() fails.
  2020-09-03 18:28 [PATCH net] tg3: Fix soft lockup when tg3_reset_task() fails Michael Chan
@ 2020-09-03 19:24 ` David Miller
  2020-09-04 23:20 ` Baptiste Covolato
  2020-09-08 16:55 ` David Christensen
  2 siblings, 0 replies; 7+ messages in thread
From: David Miller @ 2020-09-03 19:24 UTC (permalink / raw)
  To: michael.chan; +Cc: netdev, kuba, drc, baptiste

From: Michael Chan <michael.chan@broadcom.com>
Date: Thu,  3 Sep 2020 14:28:54 -0400

> If tg3_reset_task() fails, the device state is left in an inconsistent
> state with IFF_RUNNING still set but NAPI state not enabled.  A
> subsequent operation, such as ifdown or AER error can cause it to
> soft lock up when it tries to disable NAPI state.
> 
> Fix it by bringing down the device to !IFF_RUNNING state when
> tg3_reset_task() fails.  tg3_reset_task() running from workqueue
> will now call tg3_close() when the reset fails.  We need to
> modify tg3_reset_task_cancel() slightly to avoid tg3_close()
> calling cancel_work_sync() to cancel tg3_reset_task().  Otherwise
> cancel_work_sync() will wait forever for tg3_reset_task() to
> finish.
> 
> Reported-by: David Christensen <drc@linux.vnet.ibm.com>
> Reported-by: Baptiste Covolato <baptiste@arista.com>
> Fixes: db2199737990 ("tg3: Schedule at most one tg3_reset_task run")
> Signed-off-by: Michael Chan <michael.chan@broadcom.com>

Applied and queued up for -stable, thanks.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net] tg3: Fix soft lockup when tg3_reset_task() fails.
  2020-09-03 18:28 [PATCH net] tg3: Fix soft lockup when tg3_reset_task() fails Michael Chan
  2020-09-03 19:24 ` David Miller
@ 2020-09-04 23:20 ` Baptiste Covolato
  2020-09-05  9:02   ` Michael Chan
  2020-09-08 16:55 ` David Christensen
  2 siblings, 1 reply; 7+ messages in thread
From: Baptiste Covolato @ 2020-09-04 23:20 UTC (permalink / raw)
  To: Michael Chan; +Cc: davem, Netdev, kuba, David Christensen

Hi Michael,

On Thu, Sep 3, 2020 at 11:29 AM Michael Chan <michael.chan@broadcom.com> wrote:
> If tg3_reset_task() fails, the device state is left in an inconsistent
> state with IFF_RUNNING still set but NAPI state not enabled.  A
> subsequent operation, such as ifdown or AER error can cause it to
> soft lock up when it tries to disable NAPI state.
>
> Fix it by bringing down the device to !IFF_RUNNING state when
> tg3_reset_task() fails.  tg3_reset_task() running from workqueue
> will now call tg3_close() when the reset fails.  We need to
> modify tg3_reset_task_cancel() slightly to avoid tg3_close()
> calling cancel_work_sync() to cancel tg3_reset_task().  Otherwise
> cancel_work_sync() will wait forever for tg3_reset_task() to
> finish.

Thank you for proposing this patch. Unfortunately, it appears to make
things worse on my test setup. The problem is a lot easier to
reproduce, and not related to transmit timeout anymore.

The manifestation of the problem with the new patch starts with a
CmpltTO error on the PCI root port of the CPU:
[11288.471126] tg3 0000:56:00.0: tg3_abort_hw timed out,
TX_MODE_ENABLE will not clear MAC_TX_MODE=ffffffff
[11290.258733] tg3 0000:56:00.0 lc4: No firmware running
[11302.336601] tg3 0000:56:00.0 lc4: Link is down
[11302.336616] pcieport 0000:00:03.0: AER: Uncorrected (Non-Fatal)
error received: 0000:00:03.0
[11302.336621] pcieport 0000:00:03.0: PCIe Bus Error:
severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester
ID)
[11302.470089] pcieport 0000:00:03.0:   device [8086:6f08] error
status/mask=00004000/00000000
[11302.570218] pcieport 0000:00:03.0:    [14] CmpltTO                (First)
[11302.651611] pcieport 0000:00:03.0: broadcast error_detected message
[11305.119349] br1: port 4(lc4) entered disabled state
[11305.119443] br1: port 1(lc4.42) entered disabled state
[11305.119696] device lc4 left promiscuous mode
[11305.119697] br1: port 4(lc4) entered disabled state
[11305.143622] device lc4.42 left promiscuous mode
[11305.143626] br1: port 1(lc4.42) entered disabled state
[11305.219623] iommu: Removing device 0000:56:00.0 from group 52
[11305.219672] tg3 0000:61:00.0 lc5: PCI I/O error detected
[11305.345904] tg3 0000:6c:00.0 lc6: PCI I/O error detected
[11305.472089] pcieport 0000:00:03.0: AER: Device recovery failed
[11305.472092] pcieport 0000:00:03.0: AER: Uncorrected (Non-Fatal)
error received: 0000:00:03.0
[11305.472096] pcieport 0000:00:03.0: PCIe Bus Error:
severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester
ID)
[11305.472142] pcielw 0000:52:0d.0:pcie204: link down processing complete
[11305.605566] pcieport 0000:00:03.0:   device [8086:6f08] error
status/mask=00004000/00000000
[11305.605568] pcieport 0000:00:03.0:    [14] CmpltTO                (First)
[11305.605578] pcieport 0000:00:03.0: broadcast error_detected message
[11305.787386] tg3 0000:61:00.0 lc5: PCI I/O error detected

Once these PCI errors happen, the kernel is dead locked for any
network operations. We shortly see stack traces of task blocked
doing network configuration operations after that:
[11481.166888] INFO: task ntpd:6107 blocked for more than 120 seconds.
[11481.242021]       Tainted: G           OE     4.19.0-9-2-amd64 #1
Debian 4.19.118-2+deb10u1
[11481.342126] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[11481.436018] ntpd            D    0  6107      1 0x00000004
[11481.436020] Call Trace:
[11481.436023]  ? __schedule+0x2a2/0x870
[11481.436025]  schedule+0x28/0x80
[11481.436043]  schedule_preempt_disabled+0xa/0x10
[11481.436045]  __mutex_lock.isra.8+0x2b5/0x4a0
[11481.436048]  ? __netlink_lookup+0xcd/0x130
[11481.436049]  __netlink_dump_start+0x56/0x1e0
[11481.436053]  ? rtnl_fill_ifinfo+0xe90/0xe90
[11481.436059]  rtnetlink_rcv_msg+0x22c/0x360
[11481.436065]  ? rtnl_fill_ifinfo+0xe90/0xe90
[11481.436071]  ? rtnl_calcit.isra.33+0x100/0x100
[11481.436076]  netlink_rcv_skb+0x4c/0x120
[11481.436083]  netlink_unicast+0x181/0x210
[11481.436089]  netlink_sendmsg+0x204/0x3d0
[11481.436095]  sock_sendmsg+0x36/0x40
[11481.436097]  __sys_sendto+0xee/0x160
[11481.436100]  ? __sys_socket+0x93/0xe0
[11481.436102]  __x64_sys_sendto+0x24/0x30
[11481.436105]  do_syscall_64+0x53/0x110
[11481.436107]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[11481.436109] RIP: 0033:0x7fe88125aa02
[11481.436113] Code: Bad RIP value.
[11481.436113] RSP: 002b:00007ffdbdcd8950 EFLAGS: 00000293 ORIG_RAX:
000000000000002c
[11481.436115] RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007fe88125aa02
[11481.436116] RDX: 0000000000000014 RSI: 00007ffdbdcd9a30 RDI: 0000000000000004
[11481.436116] RBP: 0000000000000000 R08: 00007ffdbdcd99f0 R09: 000000000000000c
[11481.436117] R10: 0000000000000000 R11: 0000000000000293 R12: 00007ffdbdcd9a30
[11481.436118] R13: 0000000000000014 R14: 0000000000000000 R15: 00007ffdbdcd99f0

Do you have any idea of where this could be coming from? Is it
possible that this patch introduced a new window where the deadlock
can happen?

Thanks,

--
Baptiste Covolato

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net] tg3: Fix soft lockup when tg3_reset_task() fails.
  2020-09-04 23:20 ` Baptiste Covolato
@ 2020-09-05  9:02   ` Michael Chan
  2020-09-10 23:00     ` Baptiste Covolato
  0 siblings, 1 reply; 7+ messages in thread
From: Michael Chan @ 2020-09-05  9:02 UTC (permalink / raw)
  To: Baptiste Covolato; +Cc: David Miller, Netdev, Jakub Kicinski, David Christensen

On Fri, Sep 4, 2020 at 4:20 PM Baptiste Covolato <baptiste@arista.com> wrote:

> Thank you for proposing this patch. Unfortunately, it appears to make
> things worse on my test setup. The problem is a lot easier to
> reproduce, and not related to transmit timeout anymore.

This patch specifically addresses the issue reported by David
Christensen.  When tg3_reset_task() is unsuccessful, it will bring the
device to a consistent IF_DOWN state to prevent soft lockup.
tg3_reset_task() is usually scheduled from TX timeout, or from a few
other error conditions.  In David's case, it was triggered from TX
timeout.

So if the issue you're reporting has nothing to do with TX timeout or
the other error conditions that trigger tg3_reset_task(), this patch
should have no effect.

>
> The manifestation of the problem with the new patch starts with a
> CmpltTO error on the PCI root port of the CPU:
> [11288.471126] tg3 0000:56:00.0: tg3_abort_hw timed out,
> TX_MODE_ENABLE will not clear MAC_TX_MODE=ffffffff

It is unclear how tg3_abort_hw() is called, but it is encountering an
error.  The TX mode register cannot be cleared.

> [11290.258733] tg3 0000:56:00.0 lc4: No firmware running

Most tg3 NICs have firmware running.  This message about no firmware
running usually means something is wrong.

> [11302.336601] tg3 0000:56:00.0 lc4: Link is down
> [11302.336616] pcieport 0000:00:03.0: AER: Uncorrected (Non-Fatal)
> error received: 0000:00:03.0
> [11302.336621] pcieport 0000:00:03.0: PCIe Bus Error:
> severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester
> ID)
> [11302.470089] pcieport 0000:00:03.0:   device [8086:6f08] error
> status/mask=00004000/00000000
> [11302.570218] pcieport 0000:00:03.0:    [14] CmpltTO                (First)
> [11302.651611] pcieport 0000:00:03.0: broadcast error_detected message
> [11305.119349] br1: port 4(lc4) entered disabled state
> [11305.119443] br1: port 1(lc4.42) entered disabled state
> [11305.119696] device lc4 left promiscuous mode
> [11305.119697] br1: port 4(lc4) entered disabled state
> [11305.143622] device lc4.42 left promiscuous mode
> [11305.143626] br1: port 1(lc4.42) entered disabled state
> [11305.219623] iommu: Removing device 0000:56:00.0 from group 52
> [11305.219672] tg3 0000:61:00.0 lc5: PCI I/O error detected
> [11305.345904] tg3 0000:6c:00.0 lc6: PCI I/O error detected

Now we have AER errors detected on 2 other tg3 devices, not from the
one above with tg3_abort_hw() failure.

I think this issue that you're reporting is not the same as David's
issue of TX timeout happening at about the same time as AER.

Please describe the issue in more detail, in particular how's the
tg3_abort_hw() seen above initiated and how many tg3 devices do you
have.  Also, are you injecting these AER errors?  Please also include
the complete dmesg.  Thanks.

> [11305.472089] pcieport 0000:00:03.0: AER: Device recovery failed
> [11305.472092] pcieport 0000:00:03.0: AER: Uncorrected (Non-Fatal)
> error received: 0000:00:03.0
> [11305.472096] pcieport 0000:00:03.0: PCIe Bus Error:
> severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester
> ID)
> [11305.472142] pcielw 0000:52:0d.0:pcie204: link down processing complete
> [11305.605566] pcieport 0000:00:03.0:   device [8086:6f08] error
> status/mask=00004000/00000000
> [11305.605568] pcieport 0000:00:03.0:    [14] CmpltTO                (First)
> [11305.605578] pcieport 0000:00:03.0: broadcast error_detected message
> [11305.787386] tg3 0000:61:00.0 lc5: PCI I/O error detected
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net] tg3: Fix soft lockup when tg3_reset_task() fails.
  2020-09-03 18:28 [PATCH net] tg3: Fix soft lockup when tg3_reset_task() fails Michael Chan
  2020-09-03 19:24 ` David Miller
  2020-09-04 23:20 ` Baptiste Covolato
@ 2020-09-08 16:55 ` David Christensen
  2 siblings, 0 replies; 7+ messages in thread
From: David Christensen @ 2020-09-08 16:55 UTC (permalink / raw)
  To: Michael Chan, davem; +Cc: netdev, kuba, baptiste

On 9/3/20 11:28 AM, Michael Chan wrote:
> If tg3_reset_task() fails, the device state is left in an inconsistent
> state with IFF_RUNNING still set but NAPI state not enabled.  A
> subsequent operation, such as ifdown or AER error can cause it to
> soft lock up when it tries to disable NAPI state.
> 
> Fix it by bringing down the device to !IFF_RUNNING state when
> tg3_reset_task() fails.  tg3_reset_task() running from workqueue
> will now call tg3_close() when the reset fails.  We need to
> modify tg3_reset_task_cancel() slightly to avoid tg3_close()
> calling cancel_work_sync() to cancel tg3_reset_task().  Otherwise
> cancel_work_sync() will wait forever for tg3_reset_task() to
> finish.
> 
> Reported-by: David Christensen <drc@linux.vnet.ibm.com>
> Reported-by: Baptiste Covolato <baptiste@arista.com>
> Fixes: db2199737990 ("tg3: Schedule at most one tg3_reset_task run")
> Signed-off-by: Michael Chan <michael.chan@broadcom.com>

Thanks for the patch, I'll have some test time scheduled and let you know.

Dave

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net] tg3: Fix soft lockup when tg3_reset_task() fails.
  2020-09-05  9:02   ` Michael Chan
@ 2020-09-10 23:00     ` Baptiste Covolato
  0 siblings, 0 replies; 7+ messages in thread
From: Baptiste Covolato @ 2020-09-10 23:00 UTC (permalink / raw)
  To: Michael Chan; +Cc: David Miller, Netdev, Jakub Kicinski, David Christensen

Hi Michael,

On Sat, Sep 5, 2020 at 2:02 AM Michael Chan <michael.chan@broadcom.com> wrote:
> Now we have AER errors detected on 2 other tg3 devices, not from the
> one above with tg3_abort_hw() failure.
>
> I think this issue that you're reporting is not the same as David's
> issue of TX timeout happening at about the same time as AER.
>
> Please describe the issue in more detail, in particular how's the
> tg3_abort_hw() seen above initiated and how many tg3 devices do you
> have.  Also, are you injecting these AER errors?  Please also include
> the complete dmesg.  Thanks.

Sorry for the delay in response. I have been running some more
experiments with this issue to try to gather as much information as
possible. While running those experiments I noticed some strange
behavior on my system with PCI. I finally narrowed down the issue to
an improper power on sequence. After I fixed this, all my issues with
the tg3 are gone (tested with this patch applied). I think we're all
good now.

I'll let you know if I see any issues in the future.

Thanks

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net] tg3: Fix soft lockup when tg3_reset_task() fails.
@ 2020-10-07 16:54 Tomas Charvat
  0 siblings, 0 replies; 7+ messages in thread
From: Tomas Charvat @ 2020-10-07 16:54 UTC (permalink / raw)
  To: CABb8VeHA8yEmi-iDs3O-eRfOucWqGM+9p6gj87NLdjeQHfJROA; +Cc: netdev

Greetings, 
tg3 with kernel 5.4.69 is able to get into state, that its reseting
link forever. From mii-tools -w XXX I can see, that link is going up
and down every 3-8 seconds. 
Take down/up interface doesnt help.
However there is no any error in dmesg or anywhere else.

Tested kernel is not modular, only reboot helped.
NIC details are:
driver=tg3 driverversion=3.137 firmware=5720-v1.39 NCSI v1.5.1.0

Best regards
-- 
Tomas Charvat <tc@excello.cz>
Excello s.r.o.


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2020-10-07 17:01 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-03 18:28 [PATCH net] tg3: Fix soft lockup when tg3_reset_task() fails Michael Chan
2020-09-03 19:24 ` David Miller
2020-09-04 23:20 ` Baptiste Covolato
2020-09-05  9:02   ` Michael Chan
2020-09-10 23:00     ` Baptiste Covolato
2020-09-08 16:55 ` David Christensen
2020-10-07 16:54 Tomas Charvat

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.