* Re: [Bugme-new] [Bug 13553] New: When NETCONSOLE is enabled inkernel, computer crashes after 120seconds (approx)
@ 2009-07-17 5:55 David Hill
2009-07-17 14:15 ` Neil Horman
0 siblings, 1 reply; 3+ messages in thread
From: David Hill @ 2009-07-17 5:55 UTC (permalink / raw)
To: Neil Horman, Andrew Morton; +Cc: netdev, bugzilla-daemon, bugme-daemon
Hi back,
Look at bug 13219. I'm not sure the bug is related to NETCONSOLE.
It may be in the NIC drivers, in the miidiag/ethtool tools, or somewhere
else.
The behavior of the system is random.
I attached the NMI stack trace. As for the kdump, I need to read a bit
more about it, and I think I'll need to patch the kernel. Will I?
Thanks again,
Dave
----- Original Message -----
From: "David Hill" <hilld@binarystorm.net>
To: "Neil Horman" <nhorman@tuxdriver.com>; "Andrew Morton"
<akpm@linux-foundation.org>
Cc: <netdev@vger.kernel.org>; <bugzilla-daemon@bugzilla.kernel.org>;
<bugme-daemon@bugzilla.kernel.org>
Sent: Thursday, July 16, 2009 1:42 AM
Subject: Re: [Bugme-new] [Bug 13553] New: When NETCONSOLE is enabled
inkernel, computer crashes after 120seconds (approx)
> Will try that in the next few days... sorry for the delay. I was on
> vacation for the last 2 weeks and thus, out of town :D
>
>
>
> ----- Original Message -----
> From: "Neil Horman" <nhorman@tuxdriver.com>
> To: "Andrew Morton" <akpm@linux-foundation.org>
> Cc: <netdev@vger.kernel.org>; <bugzilla-daemon@bugzilla.kernel.org>;
> <bugme-daemon@bugzilla.kernel.org>; <hilld@binarystorm.net>
> Sent: Tuesday, June 23, 2009 9:05 PM
> Subject: Re: [Bugme-new] [Bug 13553] New: When NETCONSOLE is enabled
> inkernel, computer crashes after 120seconds (approx)
>
>
>> On Tue, Jun 23, 2009 at 02:07:43PM -0700, Andrew Morton wrote:
>>>
>>> (switched to email. Please respond via emailed reply-to-all, not via the
>>> bugzilla web interface).
>>>
>>> On Wed, 17 Jun 2009 01:55:54 GMT
>>> bugzilla-daemon@bugzilla.kernel.org wrote:
>>>
>>> > http://bugzilla.kernel.org/show_bug.cgi?id=13553
>>> >
>>> > Summary: When NETCONSOLE is enabled in kernel, computer crashes
>>> > after 120seconds (approx)
>>> > Product: Networking
>>> > Version: 2.5
>>> > Kernel Version: 2.6.29.4, 2.6.30
>>> > Platform: All
>>> > OS/Version: Linux
>>> > Tree: Mainline
>>> > Status: NEW
>>> > Severity: high
>>> > Priority: P1
>>> > Component: Other
>>> > AssignedTo: acme@ghostprotocols.net
>>> > ReportedBy: hilld@binarystorm.net
>>> > Regression: No
>>> >
>>> >
>>>
>>> > 00:00.0 Host bridge: Intel Corporation 440GX - 82443GX Host bridge
>>> > 00:01.0 PCI bridge: Intel Corporation 440GX - 82443GX AGP bridge
>>> > 00:07.0 ISA bridge: Intel Corporation 82371AB/EB/MB PIIX4 ISA (rev 02)
>>> > 00:07.1 IDE interface: Intel Corporation 82371AB/EB/MB PIIX4 IDE (rev 01)
>>> > 00:07.2 USB Controller: Intel Corporation 82371AB/EB/MB PIIX4 USB (rev 01)
>>> > 00:07.3 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 02)
>>> > 00:0b.0 SCSI storage controller: Adaptec AIC-7896U2/7897U2
>>> > 00:0b.1 SCSI storage controller: Adaptec AIC-7896U2/7897U2
>>> > 00:0d.0 Ethernet controller: Intel Corporation 82557/8/9/0/1 Ethernet Pro 100 (rev 08)
>>> > 00:12.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10)
>>> > 01:00.0 VGA compatible controller: ATI Technologies Inc Rage 128 RL/VR AGP
>>> >
>>> > ------- Comment #2 From David Hill 2009-06-17 02:55:56 (-) [reply] -------
>>> >
>>> > With NETCONSOLE enabled, if I type:
>>> > ethtool -s eth1 speed 100 duplex full autoneg on
>>> >
>>> > the computer freezes with kernel 2.6.29.4 and 2.6.30...
>>> >
>>> > I can reproduce it anytime you want.
>>> >
>>>
>>> Interesting. I wonder what the significance is of the 120 seconds. I
>>> see no such timers in e100.c. Does the networking core have timers on
>>> such intervals?
>>>
>> My guess is the 120 seconds has less to do with the driver, and more to do
>> with some other periodic event in the kernel that triggers a message getting
>> written to the console, which in turn triggers whatever deadlock it is that's
>> getting hit here. I imagine we could diagnose it pretty quickly if a stack
>> trace or vmcore could be captured on this. David, can you enable the NMI
>> watchdog on this system to trigger a panic on the system after a deadlock?
>> Then if you could enable a second serial console, or set up kdump to capture
>> a vmcore on this system, we should be able to figure out what's going on. My
>> guess is that in the e100 driver we're taking a lock in the ethtool set path,
>> then calling printk, which winds up recursing into the driver, trying to take
>> the same lock again. A stack trace will tell us for certain.
>>
>> Regards
>> Neil
>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe netdev" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>>
>>
>> --
>> This message has been scanned for viruses and
>> dangerous content by MailScanner, and is
>> believed to be clean.
>>
>>
>>
>
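As an illustration of the hypothesis quoted above (a lock taken in the
ethtool set path, then printk re-entering the driver through netconsole and
trying to take the same lock), here is a minimal userspace sketch in C. It is
not e100 or netconsole code: nic_lock, nic_xmit(), fake_printk() and
ethtool_set_path() are hypothetical names, and an error-checking pthread
mutex is assumed in place of a kernel spinlock so that the nested acquisition
reports EDEADLK instead of hanging.

```c
/* Userspace sketch only: every name here is hypothetical, not e100 code. */
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t nic_lock;   /* stands in for a driver-private lock */
static int relock_error;           /* result of the nested lock attempt */

/* Stand-in for the driver transmit path that a network console would use. */
static void nic_xmit(const char *msg)
{
    /* With a plain kernel spinlock this second acquisition would spin forever. */
    relock_error = pthread_mutex_lock(&nic_lock);
    if (relock_error == 0) {
        fputs(msg, stderr);
        pthread_mutex_unlock(&nic_lock);
    }
}

/* Stand-in for printk() with a network console registered. */
static void fake_printk(const char *msg)
{
    nic_xmit(msg);
}

/* Stand-in for an ethtool "set" handler that logs while holding the lock. */
static int ethtool_set_path(void)
{
    pthread_mutexattr_t attr;
    int rc;

    rc = pthread_mutexattr_init(&attr);
    assert(rc == 0);
    /* ERRORCHECK makes a same-thread relock fail with EDEADLK. */
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    assert(rc == 0);
    rc = pthread_mutex_init(&nic_lock, &attr);
    assert(rc == 0);
    pthread_mutexattr_destroy(&attr);

    relock_error = 0;
    pthread_mutex_lock(&nic_lock);
    fake_printk("link settings changed\n");  /* re-enters nic_xmit() */
    pthread_mutex_unlock(&nic_lock);
    pthread_mutex_destroy(&nic_lock);

    return relock_error;  /* EDEADLK if the nested lock was attempted */
}
```

Calling ethtool_set_path() from a small main() returns EDEADLK in this
sketch. Whether the real driver follows this pattern is exactly what the
requested stack trace would confirm or rule out.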
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [Bugme-new] [Bug 13553] New: When NETCONSOLE is enabled inkernel, computer crashes after 120seconds (approx)
2009-07-17 5:55 [Bugme-new] [Bug 13553] New: When NETCONSOLE is enabled inkernel, computer crashes after 120seconds (approx) David Hill
@ 2009-07-17 14:15 ` Neil Horman
0 siblings, 0 replies; 3+ messages in thread
From: Neil Horman @ 2009-07-17 14:15 UTC (permalink / raw)
To: David Hill; +Cc: Andrew Morton, netdev, bugzilla-daemon, bugme-daemon
On Fri, Jul 17, 2009 at 01:55:44AM -0400, David Hill wrote:
> Hi back,
> Look at bug 13219. I'm not sure the bug is related to NETCONSOLE.
> It may be in the NIC drivers, in the miidiag/ethtool tools, or somewhere
> else.
> The behavior of the system is random.
>
> I attached the NMI stack trace. As for the kdump, I need to read a bit
> more about it, and I think I'll need to patch the kernel. Will I?
>
> Thanks again,
>
> Dave
>
Neither of the logs you attached in the associated bugs seems to have the NMI
lockup backtrace included. As for a kdump, you won't need to patch the kernel,
no, but depending on what kernel you're using, you may need to build the kernel
with CONFIG_CRASH_DUMP and CONFIG_KEXEC turned on.
Neil
>
> ----- Original Message -----
> From: "David Hill" <hilld@binarystorm.net>
> To: "Neil Horman" <nhorman@tuxdriver.com>; "Andrew Morton"
> <akpm@linux-foundation.org>
> Cc: <netdev@vger.kernel.org>; <bugzilla-daemon@bugzilla.kernel.org>;
> <bugme-daemon@bugzilla.kernel.org>
> Sent: Thursday, July 16, 2009 1:42 AM
> Subject: Re: [Bugme-new] [Bug 13553] New: When NETCONSOLE is enabled
> inkernel, computer crashes after 120seconds (approx)
>
>
>> Will try that in the next few days... sorry for the delay. I was on
>> vacation for the last 2 weeks and thus, out of town :D
>>
>>
>>
>> ----- Original Message -----
>> From: "Neil Horman" <nhorman@tuxdriver.com>
>> To: "Andrew Morton" <akpm@linux-foundation.org>
>> Cc: <netdev@vger.kernel.org>; <bugzilla-daemon@bugzilla.kernel.org>;
>> <bugme-daemon@bugzilla.kernel.org>; <hilld@binarystorm.net>
>> Sent: Tuesday, June 23, 2009 9:05 PM
>> Subject: Re: [Bugme-new] [Bug 13553] New: When NETCONSOLE is enabled
>> inkernel, computer crashes after 120seconds (approx)
>>
>>
>>> On Tue, Jun 23, 2009 at 02:07:43PM -0700, Andrew Morton wrote:
>>>>
>>>> (switched to email. Please respond via emailed reply-to-all, not via the
>>>> bugzilla web interface).
>>>>
>>>> On Wed, 17 Jun 2009 01:55:54 GMT
>>>> bugzilla-daemon@bugzilla.kernel.org wrote:
>>>>
>>>> > http://bugzilla.kernel.org/show_bug.cgi?id=13553
>>>> >
>>>> > Summary: When NETCONSOLE is enabled in kernel, computer crashes
>>>> > after 120seconds (approx)
>>>> > Product: Networking
>>>> > Version: 2.5
>>>> > Kernel Version: 2.6.29.4, 2.6.30
>>>> > Platform: All
>>>> > OS/Version: Linux
>>>> > Tree: Mainline
>>>> > Status: NEW
>>>> > Severity: high
>>>> > Priority: P1
>>>> > Component: Other
>>>> > AssignedTo: acme@ghostprotocols.net
>>>> > ReportedBy: hilld@binarystorm.net
>>>> > Regression: No
>>>> >
>>>> >
>>>>
>>>> > 00:00.0 Host bridge: Intel Corporation 440GX - 82443GX Host bridge
>>>> > 00:01.0 PCI bridge: Intel Corporation 440GX - 82443GX AGP bridge
>>>> > 00:07.0 ISA bridge: Intel Corporation 82371AB/EB/MB PIIX4 ISA (rev 02)
>>>> > 00:07.1 IDE interface: Intel Corporation 82371AB/EB/MB PIIX4 IDE (rev 01)
>>>> > 00:07.2 USB Controller: Intel Corporation 82371AB/EB/MB PIIX4 USB (rev 01)
>>>> > 00:07.3 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 02)
>>>> > 00:0b.0 SCSI storage controller: Adaptec AIC-7896U2/7897U2
>>>> > 00:0b.1 SCSI storage controller: Adaptec AIC-7896U2/7897U2
>>>> > 00:0d.0 Ethernet controller: Intel Corporation 82557/8/9/0/1 Ethernet Pro 100 (rev 08)
>>>> > 00:12.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10)
>>>> > 01:00.0 VGA compatible controller: ATI Technologies Inc Rage 128 RL/VR AGP
>>>> >
>>>> > ------- Comment #2 From David Hill 2009-06-17 02:55:56 (-) [reply] -------
>>>> >
>>>> > With NETCONSOLE enabled, if I type:
>>>> > ethtool -s eth1 speed 100 duplex full autoneg on
>>>> >
>>>> > the computer freezes with kernel 2.6.29.4 and 2.6.30...
>>>> >
>>>> > I can reproduce it anytime you want.
>>>> >
>>>>
>>>> Interesting. I wonder what the significance is of the 120 seconds. I
>>>> see no such timers in e100.c. Does the networking core have timers on
>>>> such intervals?
>>>>
>>> My guess is the 120 seconds has less to do with the driver, and more to do
>>> with some other periodic event in the kernel that triggers a message getting
>>> written to the console, which in turn triggers whatever deadlock it is that's
>>> getting hit here. I imagine we could diagnose it pretty quickly if a stack
>>> trace or vmcore could be captured on this. David, can you enable the NMI
>>> watchdog on this system to trigger a panic on the system after a deadlock?
>>> Then if you could enable a second serial console, or set up kdump to capture
>>> a vmcore on this system, we should be able to figure out what's going on. My
>>> guess is that in the e100 driver we're taking a lock in the ethtool set path,
>>> then calling printk, which winds up recursing into the driver, trying to take
>>> the same lock again. A stack trace will tell us for certain.
>>>
>>> Regards
>>> Neil
>>>
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [Bugme-new] [Bug 13553] New: When NETCONSOLE is enabled inkernel, computer crashes after 120seconds (approx)
2009-06-24 1:05 ` Neil Horman
@ 2009-07-16 5:42 ` David Hill
0 siblings, 0 replies; 3+ messages in thread
From: David Hill @ 2009-07-16 5:42 UTC (permalink / raw)
To: Neil Horman, Andrew Morton; +Cc: netdev, bugzilla-daemon, bugme-daemon
Will try that in the next few days... sorry for the delay. I was on
vacation for the last 2 weeks and thus, out of town :D
----- Original Message -----
From: "Neil Horman" <nhorman@tuxdriver.com>
To: "Andrew Morton" <akpm@linux-foundation.org>
Cc: <netdev@vger.kernel.org>; <bugzilla-daemon@bugzilla.kernel.org>;
<bugme-daemon@bugzilla.kernel.org>; <hilld@binarystorm.net>
Sent: Tuesday, June 23, 2009 9:05 PM
Subject: Re: [Bugme-new] [Bug 13553] New: When NETCONSOLE is enabled
inkernel, computer crashes after 120seconds (approx)
> On Tue, Jun 23, 2009 at 02:07:43PM -0700, Andrew Morton wrote:
>>
>> (switched to email. Please respond via emailed reply-to-all, not via the
>> bugzilla web interface).
>>
>> On Wed, 17 Jun 2009 01:55:54 GMT
>> bugzilla-daemon@bugzilla.kernel.org wrote:
>>
>> > http://bugzilla.kernel.org/show_bug.cgi?id=13553
>> >
>> > Summary: When NETCONSOLE is enabled in kernel, computer crashes
>> > after 120seconds (approx)
>> > Product: Networking
>> > Version: 2.5
>> > Kernel Version: 2.6.29.4, 2.6.30
>> > Platform: All
>> > OS/Version: Linux
>> > Tree: Mainline
>> > Status: NEW
>> > Severity: high
>> > Priority: P1
>> > Component: Other
>> > AssignedTo: acme@ghostprotocols.net
>> > ReportedBy: hilld@binarystorm.net
>> > Regression: No
>> >
>> >
>>
>> > 00:00.0 Host bridge: Intel Corporation 440GX - 82443GX Host bridge
>> > 00:01.0 PCI bridge: Intel Corporation 440GX - 82443GX AGP bridge
>> > 00:07.0 ISA bridge: Intel Corporation 82371AB/EB/MB PIIX4 ISA (rev 02)
>> > 00:07.1 IDE interface: Intel Corporation 82371AB/EB/MB PIIX4 IDE (rev 01)
>> > 00:07.2 USB Controller: Intel Corporation 82371AB/EB/MB PIIX4 USB (rev 01)
>> > 00:07.3 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 02)
>> > 00:0b.0 SCSI storage controller: Adaptec AIC-7896U2/7897U2
>> > 00:0b.1 SCSI storage controller: Adaptec AIC-7896U2/7897U2
>> > 00:0d.0 Ethernet controller: Intel Corporation 82557/8/9/0/1 Ethernet Pro 100 (rev 08)
>> > 00:12.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10)
>> > 01:00.0 VGA compatible controller: ATI Technologies Inc Rage 128 RL/VR AGP
>> >
>> > ------- Comment #2 From David Hill 2009-06-17 02:55:56 (-) [reply] -------
>> >
>> > With NETCONSOLE enabled, if I type:
>> > ethtool -s eth1 speed 100 duplex full autoneg on
>> >
>> > the computer freezes with kernel 2.6.29.4 and 2.6.30...
>> >
>> > I can reproduce it anytime you want.
>> >
>>
>> Interesting. I wonder what the significance is of the 120 seconds. I
>> see no such timers in e100.c. Does the networking core have timers on
>> such intervals?
>>
> My guess is the 120 seconds has less to do with the driver, and more to do
> with some other periodic event in the kernel that triggers a message getting
> written to the console, which in turn triggers whatever deadlock it is that's
> getting hit here. I imagine we could diagnose it pretty quickly if a stack
> trace or vmcore could be captured on this. David, can you enable the NMI
> watchdog on this system to trigger a panic on the system after a deadlock?
> Then if you could enable a second serial console, or set up kdump to capture
> a vmcore on this system, we should be able to figure out what's going on. My
> guess is that in the e100 driver we're taking a lock in the ethtool set path,
> then calling printk, which winds up recursing into the driver, trying to take
> the same lock again. A stack trace will tell us for certain.
>
> Regards
> Neil
>
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2009-07-17 14:16 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-07-17 5:55 [Bugme-new] [Bug 13553] New: When NETCONSOLE is enabled inkernel, computer crashes after 120seconds (approx) David Hill
2009-07-17 14:15 ` Neil Horman
[not found] <bug-13553-10286@http.bugzilla.kernel.org/>
2009-06-23 21:07 ` [Bugme-new] [Bug 13553] New: When NETCONSOLE is enabled in kernel, " Andrew Morton
2009-06-24 1:05 ` Neil Horman
2009-07-16 5:42 ` [Bugme-new] [Bug 13553] New: When NETCONSOLE is enabled inkernel, " David Hill