linux-kernel.vger.kernel.org archive mirror
* [3.8-rc] regression: NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out
@ 2013-01-05  9:01 Jörg Otte
  2013-01-05  9:37 ` Francois Romieu
  0 siblings, 1 reply; 8+ messages in thread
From: Jörg Otte @ 2013-01-05  9:01 UTC (permalink / raw)
  To: Linux Kernel Mailing List, David S. Miller, Francois Romieu, netdev

I frequently see the following in the syslog:

[  184.552914] ------------[ cut here ]------------
[  184.552927] WARNING: at /data/kernel/linux/net/sched/sch_generic.c:254 dev_watchdog+0xf2/0x151()
[  184.552929] Hardware name: LIFEBOOK AH532
[  184.552932] NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out
[  184.552937] Pid: 0, comm: swapper/1 Not tainted 3.8.0-rc2-b11-00221-gd1c3ed6 #15
[  184.552939] Call Trace:
[  184.552941]  <IRQ>  [<ffffffff8138d4a2>] ? dev_watchdog+0xf2/0x151
[  184.552953]  [<ffffffff81025c8a>] ? warn_slowpath_common+0x73/0x87
[  184.552956]  [<ffffffff8138d3b0>] ? netif_tx_unlock+0x49/0x49
[  184.552961]  [<ffffffff81025d02>] ? warn_slowpath_fmt+0x45/0x4a
[  184.552967]  [<ffffffff8138d332>] ? netif_tx_lock+0x40/0x75
[  184.552971]  [<ffffffff8138d4a2>] ? dev_watchdog+0xf2/0x151
[  184.552977]  [<ffffffff8102f1a1>] ? call_timer_fn.isra.32+0x1d/0x73
[  184.552981]  [<ffffffff8102f34b>] ? run_timer_softirq+0x154/0x194
[  184.552988]  [<ffffffff8104cb84>] ? timekeeping_get_ns.constprop.6+0xd/0x31
[  184.552992]  [<ffffffff8102b4a5>] ? __do_softirq+0x96/0x139
[  184.552997]  [<ffffffff8146b00c>] ? call_softirq+0x1c/0x26
[  184.553002]  [<ffffffff81003cf4>] ? do_softirq+0x2e/0x62
[  184.553006]  [<ffffffff8102b615>] ? irq_exit+0x3d/0x98
[  184.553011]  [<ffffffff810184ad>] ? smp_apic_timer_interrupt+0x73/0x80
[  184.553018]  [<ffffffff8146aa0a>] ? apic_timer_interrupt+0x6a/0x70
[  184.553020]  <EOI>  [<ffffffff81326f2b>] ? cpuidle_wrap_enter+0x38/0x69
[  184.553033]  [<ffffffff81326f27>] ? cpuidle_wrap_enter+0x34/0x69
[  184.553039]  [<ffffffff81326d81>] ? cpuidle_enter_state+0xa/0x31
[  184.553044]  [<ffffffff81326e41>] ? cpuidle_idle_call+0x99/0xb9
[  184.553050]  [<ffffffff81009059>] ? cpu_idle+0x99/0xe0
[  184.553056]  [<ffffffff8145e3a4>] ? start_secondary+0x1d6/0x1dc
[  184.553059] ---[ end trace 54db26a54b22f673 ]---
[  184.587487] r8169 0000:02:00.0 eth0: link up

It's a regression; it never happened before 3.8-rc.

-- Jörg
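
To judge how often the warning fires, one option is to count the watchdog lines in a saved log. This is only a sketch, assuming a POSIX shell and the log on stdin; the function name count_watchdog is made up for illustration and is not part of any tool.

```shell
#!/bin/sh
# count_watchdog: print how many r8169 transmit-queue timeouts appear in the
# kernel log read from stdin. The pattern follows the message in the report
# above; any interface name and queue number are accepted.
count_watchdog() {
    grep -c 'NETDEV WATCHDOG: .* (r8169): transmit queue [0-9][0-9]* timed out'
}
```

Possible usage, with a hypothetical log path: count_watchdog < /var/log/syslog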


* Re: [3.8-rc] regression: NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out
  2013-01-05  9:01 [3.8-rc] regression: NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out Jörg Otte
@ 2013-01-05  9:37 ` Francois Romieu
  2013-01-05 10:15   ` Jörg Otte
  0 siblings, 1 reply; 8+ messages in thread
From: Francois Romieu @ 2013-01-05  9:37 UTC (permalink / raw)
  To: Jörg Otte; +Cc: Linux Kernel Mailing List, David S. Miller, netdev

Jörg Otte <jrg.otte@gmail.com> :
[...]
> It's a regression, it never happened before 3.8-rc.

Please check that 'dmesg | grep XID' exhibits a 8168evl.

I'll showe and dig it. It's epidemic.

-- 
Ueimor


* Re: [3.8-rc] regression: NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out
  2013-01-05  9:37 ` Francois Romieu
@ 2013-01-05 10:15   ` Jörg Otte
  2013-01-05 16:57     ` Francois Romieu
  0 siblings, 1 reply; 8+ messages in thread
From: Jörg Otte @ 2013-01-05 10:15 UTC (permalink / raw)
  To: Francois Romieu; +Cc: Linux Kernel Mailing List, David S. Miller, netdev

2013/1/5 Francois Romieu <romieu@fr.zoreil.com>:
> Jörg Otte <jrg.otte@gmail.com> :
> [...]
>> It's a regression, it never happened before 3.8-rc.
>
> Please check that 'dmesg | grep XID' exhibits a 8168evl.

jojo@ahorn:~$ dmesg | grep XID
[    1.808847] r8169 0000:02:00.0 eth0: RTL8168evl/8111evl at 0xffffc90000054000, 5c:9a:d8:69:2b:39, XID 0c900800 IRQ 42
jojo@ahorn:~$

>
> I'll showe and dig it. It's epidemic.
>

Thanks, Jörg
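
For anyone repeating the check on another machine, the chip name and XID can be pulled out of that probe line mechanically. The following is a rough sketch, assuming the line has the same layout as the one quoted above; the function name r8169_chip is hypothetical.

```shell
#!/bin/sh
# r8169_chip: read kernel log lines on stdin and print "<chip name> <XID>"
# for every r8169 probe line of the form
#   ... eth0: RTL8168evl/8111evl at 0x..., <mac>, XID 0c900800 IRQ 42
r8169_chip() {
    sed -n 's/^.*: \(RTL[^ ]*\) at .*XID \([0-9a-f][0-9a-f]*\).*$/\1 \2/p'
}
```

Possible usage: dmesg | r8169_chip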


* Re: [3.8-rc] regression: NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out
  2013-01-05 10:15   ` Jörg Otte
@ 2013-01-05 16:57     ` Francois Romieu
  2013-01-06 13:03       ` Jörg Otte
  0 siblings, 1 reply; 8+ messages in thread
From: Francois Romieu @ 2013-01-05 16:57 UTC (permalink / raw)
  To: Jörg Otte; +Cc: Linux Kernel Mailing List, David S. Miller, netdev

Jörg Otte <jrg.otte@gmail.com> :
[...]
> jojo@ahorn:~$ dmesg | grep XID
> [    1.808847] r8169 0000:02:00.0 eth0: RTL8168evl/8111evl at 0xffffc90000054000, 5c:9a:d8:69:2b:39, XID 0c900800 IRQ 42

Can you check whether things improve with v3.8-rc2 after reverting:

1. 9ecb9aabaf634677c77af467f4e3028b09d7bcda 
   r8169: workaround for missing extended GigaMAC registers
2. d64ec841517a25f6d468bde9f67e5b4cffdc67c7
   r8169: enable internal ASPM and clock request settings
3. e0c075577965d1c01b30038d38bf637b027a1df3
   r8169: enable ALDPS for power saving

(you can directly try v3.7 r8169.c with v3.8-rc2 if it worked for you
so far) 
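
One possible way to script the reverts is sketched below. It assumes a git checkout of v3.8-rc2 and that the reverts apply without conflicts; the helper name revert_candidates and the GIT override for dry runs are illustrative only.

```shell
#!/bin/sh
# revert_candidates: revert the given commits one at a time and stop at the
# first one that fails. GIT defaults to git; setting GIT=echo gives a dry run
# that only prints the commands instead of executing them.
revert_candidates() {
    for sha in "$@"; do
        ${GIT:-git} revert --no-edit "$sha" || return 1
    done
}
```

Possible usage: revert_candidates e0c075577965d1c01b30038d38bf637b027a1df3, then rebuild and retest before reverting the next candidate, so that each commit is judged on its own.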

If the regression is still there, please apply the patch below to both
an unpatched v3.8-rc2 and a known working version, then send me the
dmesg from each after running 'ip link set dev eth0 up'.

diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index ed96f30..3d2d2446 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -90,10 +90,28 @@ static const int multicast_filter_limit = 32;
 #define RTL8169_TX_TIMEOUT	(6*HZ)
 #define RTL8169_PHY_TIMEOUT	(10*HZ)
 
+static void rw8(void __iomem *ioaddr, u8 b)
+{
+	printk(KERN_DEBUG PFX "w %p %02x\n", ioaddr, b);
+	writeb(b, ioaddr);
+}
+
+static void rw16(void __iomem *ioaddr, u16 w)
+{
+	printk(KERN_DEBUG PFX "w %p %04x\n", ioaddr, w);
+	writew(w, ioaddr);
+}
+
+static void rw32(void __iomem *ioaddr, u32 d)
+{
+	printk(KERN_DEBUG PFX "w %p %08x\n", ioaddr, d);
+	writel(d, ioaddr);
+}
+
 /* write/read MMIO register */
-#define RTL_W8(reg, val8)	writeb ((val8), ioaddr + (reg))
-#define RTL_W16(reg, val16)	writew ((val16), ioaddr + (reg))
-#define RTL_W32(reg, val32)	writel ((val32), ioaddr + (reg))
+#define RTL_W8(reg, val8)	rw8(ioaddr + (reg), (val8))
+#define RTL_W16(reg, val16)	rw16(ioaddr + (reg), (val16))
+#define RTL_W32(reg, val32)	rw32(ioaddr + (reg), (val32))
 #define RTL_R8(reg)		readb (ioaddr + (reg))
 #define RTL_R16(reg)		readw (ioaddr + (reg))
 #define RTL_R32(reg)		readl (ioaddr + (reg))
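
With this patch applied, every register write ends up in the kernel log as a line ending in "w <address> <value>" (the printk formats above). To compare a working and a failing kernel, the timestamps have to go first. The sketch below is one way to do that; the function name write_trace and the file names in the usage note are assumptions, not something from this thread.

```shell
#!/bin/sh
# write_trace: read kernel log lines on stdin and keep only the register-write
# records produced by the debug patch, printed as "<address> <value>" without
# timestamps or prefix, so that two boots can be diffed line by line.
write_trace() {
    sed -n 's/^.* w \([0-9a-f][0-9a-f]*\) \([0-9a-f][0-9a-f]*\)$/\1 \2/p'
}
```

Possible usage, with hypothetical file names: write_trace < dmesg.good > good.trace; write_trace < dmesg.bad > bad.trace; diff -u good.trace bad.trace. If the MMIO base address differs between the two boots, the addresses will not line up and only the order and values of the writes can be compared.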


* Re: [3.8-rc] regression: NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out
  2013-01-05 16:57     ` Francois Romieu
@ 2013-01-06 13:03       ` Jörg Otte
  2013-02-03 15:34         ` Jörg Otte
  0 siblings, 1 reply; 8+ messages in thread
From: Jörg Otte @ 2013-01-06 13:03 UTC (permalink / raw)
  To: Francois Romieu; +Cc: Linux Kernel Mailing List, David S. Miller, netdev

2013/1/5 Francois Romieu <romieu@fr.zoreil.com>:
> Can you check if things improve with v3.8-rc2 after removing :
>
> 1. 9ecb9aabaf634677c77af467f4e3028b09d7bcda
>    r8169: workaround for missing extended GigaMAC registers
> 2. d64ec841517a25f6d468bde9f67e5b4cffdc67c7
>    r8169: enable internal ASPM and clock request settings

Reverting these two doesn't help with this problem.

However, reverting commit 2 fixes a second issue for me:
In 3.7.1 at startup the link came up after 15 sec.:
grep r8169 dmesg.3.7.1
[    1.956842] r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded
[    1.957059] r8169 0000:02:00.0: irq 42 for MSI/MSI-X
[    1.957161] r8169 0000:02:00.0 eth0: RTL8168evl/8111evl at..
[    1.957163] r8169 0000:02:00.0 eth0: jumbo features [frames..
[   13.575452] r8169 0000:02:00.0 eth0: link down
[   13.575475] r8169 0000:02:00.0 eth0: link down
[   15.181317] r8169 0000:02:00.0 eth0: link up

In 3.8rc the time increased to 24 seconds:
grep r8169 dmesg.3.8.0
[    1.852546] r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded
[    1.852765] r8169 0000:02:00.0: irq 42 for MSI/MSI-X
[    1.852872] r8169 0000:02:00.0 eth0: RTL8168evl/8111evl at...
[    1.852874] r8169 0000:02:00.0 eth0: jumbo features [frames...
[   14.150212] r8169 0000:02:00.0 eth0: link down
[   14.150229] r8169 0000:02:00.0 eth0: link down
[   24.140263] r8169 0000:02:00.0 eth0: link up

But with this revert I get the old performance:
dmesg | grep r8169
[    1.816613] r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded
[    1.816832] r8169 0000:02:00.0: irq 42 for MSI/MSI-X
[    1.816947] r8169 0000:02:00.0 eth0: RTL8168evl/8111evl at...
[    1.816948] r8169 0000:02:00.0 eth0: jumbo features [frames...
[   13.986401] r8169 0000:02:00.0 eth0: link down
[   13.986422] r8169 0000:02:00.0 eth0: link down
[   15.623631] r8169 0000:02:00.0 eth0: link up

Thus I recommend reverting this commit too.

> 3. e0c075577965d1c01b30038d38bf637b027a1df3
>    r8169: enable ALDPS for power saving

That's it! Reverting this commit fixes the watchdog problem for me!


Thanks, Jörg
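
The three dmesg excerpts above can be reduced to a single number each: the time between the first "link down" and the following "link up". A minimal sketch, assuming awk is available and the log lines carry the usual "[ seconds.micros]" prefix; the function name link_up_delay is made up for illustration.

```shell
#!/bin/sh
# link_up_delay: read kernel log lines on stdin and print the seconds between
# the first r8169 "link down" and the first "link up" after it.
link_up_delay() {
    awk '
        # Pull the numeric timestamp out of a leading "[  13.575452]" prefix.
        function stamp(line,    t) {
            if (!match(line, /\[ *[0-9]+\.[0-9]+\]/)) return -1
            t = substr(line, RSTART + 1, RLENGTH - 2)
            return t + 0
        }
        /r8169/ && /link down/ && !seen_down { down = stamp($0); seen_down = 1 }
        /r8169/ && /link up/ && seen_down && !seen_up { up = stamp($0); seen_up = 1 }
        END {
            if (seen_down && seen_up) printf "%.3f\n", up - down
            else exit 1
        }
    '
}
```

Possible usage: link_up_delay < dmesg.3.7.1 and link_up_delay < dmesg.3.8.0. On the excerpts above this gives roughly 1.6 seconds for 3.7.1 and for the reverted kernel, against roughly 10 seconds for 3.8-rc.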


* Re: [3.8-rc] regression: NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out
  2013-01-06 13:03       ` Jörg Otte
@ 2013-02-03 15:34         ` Jörg Otte
  2013-02-06 16:55           ` Jörg Otte
  0 siblings, 1 reply; 8+ messages in thread
From: Jörg Otte @ 2013-02-03 15:34 UTC (permalink / raw)
  To: Francois Romieu; +Cc: Linux Kernel Mailing List, David S. Miller, netdev

2013/1/6 Jörg Otte <jrg.otte@gmail.com>:
> 2013/1/5 Francois Romieu <romieu@fr.zoreil.com>:
>> Can you check if things improve with v3.8-rc2 after removing :
>>
>> 1. 9ecb9aabaf634677c77af467f4e3028b09d7bcda
>>    r8169: workaround for missing extended GigaMAC registers
>> 2. d64ec841517a25f6d468bde9f67e5b4cffdc67c7
>>    r8169: enable internal ASPM and clock request settings
>
> Doesn't help for this problem.
>
> However this fixes a second issue for me:
> In 3.7.1 at startup the link came up after 15 sec.:
> grep r8169 dmesg.3.7.1
> [    1.956842] r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded
> [    1.957059] r8169 0000:02:00.0: irq 42 for MSI/MSI-X
> [    1.957161] r8169 0000:02:00.0 eth0: RTL8168evl/8111evl at..
> [    1.957163] r8169 0000:02:00.0 eth0: jumbo features [frames..
> [   13.575452] r8169 0000:02:00.0 eth0: link down
> [   13.575475] r8169 0000:02:00.0 eth0: link down
> [   15.181317] r8169 0000:02:00.0 eth0: link up
>
> In 3.8rc the time increased to 24 seconds:
> grep r8169 dmesg.3.8.0
> [    1.852546] r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded
> [    1.852765] r8169 0000:02:00.0: irq 42 for MSI/MSI-X
> [    1.852872] r8169 0000:02:00.0 eth0: RTL8168evl/8111evl at...
> [    1.852874] r8169 0000:02:00.0 eth0: jumbo features [frames...
> [   14.150212] r8169 0000:02:00.0 eth0: link down
> [   14.150229] r8169 0000:02:00.0 eth0: link down
> [   24.140263] r8169 0000:02:00.0 eth0: link up
>
> But with this revert I get the old performance:
> dmesg | grep r8169
> [    1.816613] r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded
> [    1.816832] r8169 0000:02:00.0: irq 42 for MSI/MSI-X
> [    1.816947] r8169 0000:02:00.0 eth0: RTL8168evl/8111evl at...
> [    1.816948] r8169 0000:02:00.0 eth0: jumbo features [frames...
> [   13.986401] r8169 0000:02:00.0 eth0: link down
> [   13.986422] r8169 0000:02:00.0 eth0: link down
> [   15.623631] r8169 0000:02:00.0 eth0: link up
>
> Thus I recommend reverting this commit too.
>
>> 3. e0c075577965d1c01b30038d38bf637b027a1df3
>>    r8169: enable ALDPS for power saving
>
> That's it! This fixes the problem for me!
>


We are close to the v3.8 release and I haven't seen a solution
so far.
What is the plan regarding these issues?

Thanks, Jörg


* Re: [3.8-rc] regression: NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out
  2013-02-03 15:34         ` Jörg Otte
@ 2013-02-06 16:55           ` Jörg Otte
  2013-02-07  0:46             ` Francois Romieu
  0 siblings, 1 reply; 8+ messages in thread
From: Jörg Otte @ 2013-02-06 16:55 UTC (permalink / raw)
  To: Francois Romieu
  Cc: Linux Kernel Mailing List, David S. Miller, netdev, Linus Torvalds

2013/2/3 Jörg Otte <jrg.otte@gmail.com>:
> 2013/1/6 Jörg Otte <jrg.otte@gmail.com>:
>> 2013/1/5 Francois Romieu <romieu@fr.zoreil.com>:
>>> Can you check if things improve with v3.8-rc2 after removing :
>>>
>>> 2. d64ec841517a25f6d468bde9f67e5b4cffdc67c7
>>>    r8169: enable internal ASPM and clock request settings
>>
>> this fixes a second issue for me:
>> In 3.7.1 at startup the link came up after 15 sec.:
>> grep r8169 dmesg.3.7.1
>> [    1.956842] r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded
>> [    1.957059] r8169 0000:02:00.0: irq 42 for MSI/MSI-X
>> [    1.957161] r8169 0000:02:00.0 eth0: RTL8168evl/8111evl at..
>> [    1.957163] r8169 0000:02:00.0 eth0: jumbo features [frames..
>> [   13.575452] r8169 0000:02:00.0 eth0: link down
>> [   13.575475] r8169 0000:02:00.0 eth0: link down
>> [   15.181317] r8169 0000:02:00.0 eth0: link up
>>
>> In 3.8rc the time increased to 24 seconds:
>> grep r8169 dmesg.3.8.0
>> [    1.852546] r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded
>> [    1.852765] r8169 0000:02:00.0: irq 42 for MSI/MSI-X
>> [    1.852872] r8169 0000:02:00.0 eth0: RTL8168evl/8111evl at...
>> [    1.852874] r8169 0000:02:00.0 eth0: jumbo features [frames...
>> [   14.150212] r8169 0000:02:00.0 eth0: link down
>> [   14.150229] r8169 0000:02:00.0 eth0: link down
>> [   24.140263] r8169 0000:02:00.0 eth0: link up
>>
>> But with this revert I get the old performance:
>> dmesg | grep r8169
>> [    1.816613] r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded
>> [    1.816832] r8169 0000:02:00.0: irq 42 for MSI/MSI-X
>> [    1.816947] r8169 0000:02:00.0 eth0: RTL8168evl/8111evl at...
>> [    1.816948] r8169 0000:02:00.0 eth0: jumbo features [frames...
>> [   13.986401] r8169 0000:02:00.0 eth0: link down
>> [   13.986422] r8169 0000:02:00.0 eth0: link down
>> [   15.623631] r8169 0000:02:00.0 eth0: link up
>>
>>
>>> 3. e0c075577965d1c01b30038d38bf637b027a1df3
>>>    r8169: enable ALDPS for power saving
>>
>> That's it! This fixes the problem for me!
>>
>> Thanks, Jörg
>
>
> We are close to the v3.8 release and I haven't seen a solution
> so far.
> What is the plan regarding these issues?
>
> Thanks, Jörg

No response so far, so I am adding Linus to Cc:

To summarize: two network regressions were introduced in v3.8-rc (driver r8169):

1) NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out
was introduced by commit
e0c075577965d1c01b30038d38bf637b027a1df3
("r8169: enable ALDPS for power saving")

2) The time until link-up at boot increased from 15 sec (v3.7) to 24 sec (v3.8-rc)
with commit
d64ec841517a25f6d468bde9f67e5b4cffdc67c7
("r8169: enable internal ASPM and clock request settings")

Reverting the commits resolves the problems entirely.

As long as the issues are not fixed, the commits should be reverted.

Thanks, Jörg


* Re: [3.8-rc] regression: NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out
  2013-02-06 16:55           ` Jörg Otte
@ 2013-02-07  0:46             ` Francois Romieu
  0 siblings, 0 replies; 8+ messages in thread
From: Francois Romieu @ 2013-02-07  0:46 UTC (permalink / raw)
  To: Jörg Otte
  Cc: Linux Kernel Mailing List, David S. Miller, netdev, Linus Torvalds

Jörg Otte <jrg.otte@gmail.com> :
[...]
> To summarize: two net-regressions were introduced in v3.8 (driver r8169):
> 
> 1) NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out
> was introduced by commit
> e0c075577965d1c01b30038d38bf637b027a1df3
> ("r8169: enable ALDPS for power saving")

Hayes Wang <hayeswang@realtek.com> authored it. You should ask him
why commit e0c075577965d1c01b30038d38bf637b027a1df3 sometimes chokes
with the 8168evl. 

And you can ask him whether the non-8168evl chips handled by the patch
could (mis)behave the same way.

-- 
Ueimor
