* e1000: "eeprom checksum is not valid" after kexec
@ 2009-04-23 13:36 Jiri Slaby
  2009-04-23 14:10 ` [E1000-devel] " Thadeu Lima de Souza Cascardo
  2009-04-23 15:15 ` Rafael J. Wysocki
  0 siblings, 2 replies; 13+ messages in thread
From: Jiri Slaby @ 2009-04-23 13:36 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: e1000-devel, LKML, Ingo Molnar, Jesse Barnes

Hi,

4a865905f685eaefaedf6ade362323dc52aa703b
(PCI PM: Make pci_set_power_state() handle devices with no PM support)
breaks e1000 after being kexec'ed. These reverts fix the problem:
    Revert "PCI PM: Make pci_set_power_state() handle devices with no PM support"
    Revert "PCI PM: Introduce __pci_[start|complete]_power_transition() (rev. 2)"

I reverted the second one just for an easy revert of the former one,
which is actually the culprit.

The symptoms:
e1000 0000:02:01.0: enabling device (0000 -> 0003)
e1000 0000:02:01.0: PCI INT A -> Link[LNKA] -> GSI 11 (level, low) -> IRQ 11
e1000 0000:02:01.0: setting latency timer to 64
e1000: 0000:02:01.0: e1000_probe: The EEPROM Checksum Is Not Valid
Switched to high resolution mode on CPU 0
/*********************/
Current EEPROM Checksum : 0xffff
Calculated              : 0xbaf9
Offset    Values
========  ======
00000000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
00000010: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
00000020: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
00000030: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
00000040: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
00000050: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
00000060: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
00000070: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
Include this output when contacting your support provider.
This is not a software error! Something bad happened to your hardware or
EEPROM image. Ignoring this problem could result in further problems,
possibly loss of data, corruption or system hangs!
The MAC Address will be reset to 00:00:00:00:00:00, which is invalid
and requires you to set the proper MAC address manually before continuing
to enable this network device.
Please inspect the EEPROM dump and report the issue to your hardware vendor
or Intel Customer Support.
/*********************/
e1000: 0000:02:01.0: e1000_probe: Invalid MAC Address
e1000: 0000:02:01.0: e1000_probe: (PCI-X:33MHz:64-bit) 00:00:00:00:00:00
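
For reference, the "Calculated" value in the dump above is consistent with the usual e1000 rule that the first 64 EEPROM words (0x00..0x3F) must sum to 0xBABA, with word 0x3F holding the checksum. A minimal sketch of that arithmetic, assuming that rule; the function and macro names are illustrative and this is not the driver's own code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EEPROM_WORDS        64      /* words 0x00..0x3F, i.e. the 128 bytes dumped */
#define EEPROM_CHECKSUM_REG 0x3F    /* last word holds the stored checksum */
#define EEPROM_SUM          0xBABA  /* required sum of all 64 words (mod 2^16) */

/* Value that word 0x3F must hold so that all 64 words sum to 0xBABA. */
static uint16_t eeprom_expected_checksum(const uint16_t *words)
{
    uint16_t sum = 0;

    assert(words != NULL);
    /* Sum every word except the checksum word itself, wrapping at 16 bits. */
    for (size_t i = 0; i < EEPROM_CHECKSUM_REG; i++)
        sum = (uint16_t)(sum + words[i]);
    return (uint16_t)(EEPROM_SUM - sum);
}

/* Nonzero if the stored checksum word matches the expected one. */
static int eeprom_checksum_valid(const uint16_t *words)
{
    return words[EEPROM_CHECKSUM_REG] == eeprom_expected_checksum(words);
}
```

With every word read back as 0xffff, as in the dump, the expected checksum works out to 0xbaf9 while the stored word is 0xffff, so the check fails. The same result would appear if the reads themselves returned all ones, regardless of what the EEPROM actually holds.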


* Re: [E1000-devel] e1000: "eeprom checksum is not valid" after kexec
  2009-04-23 13:36 e1000: "eeprom checksum is not valid" after kexec Jiri Slaby
@ 2009-04-23 14:10 ` Thadeu Lima de Souza Cascardo
  2009-04-23 14:30   ` Jiri Slaby
  2009-04-23 15:15 ` Rafael J. Wysocki
  1 sibling, 1 reply; 13+ messages in thread
From: Thadeu Lima de Souza Cascardo @ 2009-04-23 14:10 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: Rafael J. Wysocki, e1000-devel, Ingo Molnar, LKML, Jesse Barnes

On Thu, Apr 23, 2009 at 03:36:43PM +0200, Jiri Slaby wrote:
> Hi,
> 
> 4a865905f685eaefaedf6ade362323dc52aa703b
> (PCI PM: Make pci_set_power_state() handle devices with no PM support)
> breaks e1000 after being kexec'ed. These reverts fix the problem:
>     Revert "PCI PM: Make pci_set_power_state() handle devices with no PM
> support"
>     Revert "PCI PM: Introduce __pci_[start|complete]_power_transition()
> (rev. 2)"
> 
> I reverted the second one just for an easy revert of the former one,
> which is actually the culprit.
> 
> The symptoms:
> e1000 0000:02:01.0: enabling device (0000 -> 0003)
> e1000 0000:02:01.0: PCI INT A -> Link[LNKA] -> GSI 11 (level, low) -> IRQ 11
> e1000 0000:02:01.0: setting latency timer to 64
> e1000: 0000:02:01.0: e1000_probe: The EEPROM Checksum Is Not Valid
> Switched to high resolution mode on CPU 0

Have you tried b43fcd7dc7b, found in v2.6.30-rc3?


* Re: [E1000-devel] e1000: "eeprom checksum is not valid" after kexec
  2009-04-23 14:10 ` [E1000-devel] " Thadeu Lima de Souza Cascardo
@ 2009-04-23 14:30   ` Jiri Slaby
  2009-04-23 14:41     ` Thadeu Lima de Souza Cascardo
  0 siblings, 1 reply; 13+ messages in thread
From: Jiri Slaby @ 2009-04-23 14:30 UTC (permalink / raw)
  To: Thadeu Lima de Souza Cascardo
  Cc: Rafael J. Wysocki, e1000-devel, Ingo Molnar, LKML, Jesse Barnes

On 04/23/2009 04:10 PM, Thadeu Lima de Souza Cascardo wrote:
> On Thu, Apr 23, 2009 at 03:36:43PM +0200, Jiri Slaby wrote:
>> Hi,
>>
>> 4a865905f685eaefaedf6ade362323dc52aa703b
>> (PCI PM: Make pci_set_power_state() handle devices with no PM support)
>> breaks e1000 after being kexec'ed. These reverts fix the problem:
>>     Revert "PCI PM: Make pci_set_power_state() handle devices with no PM
>> support"
>>     Revert "PCI PM: Introduce __pci_[start|complete]_power_transition()
>> (rev. 2)"
>>
>> I reverted the second one just for an easy revert of the former one,
>> which is actually the culprit.
>>
>> The symptoms:
>> e1000 0000:02:01.0: enabling device (0000 -> 0003)
>> e1000 0000:02:01.0: PCI INT A -> Link[LNKA] -> GSI 11 (level, low) -> IRQ 11
>> e1000 0000:02:01.0: setting latency timer to 64
>> e1000: 0000:02:01.0: e1000_probe: The EEPROM Checksum Is Not Valid
>> Switched to high resolution mode on CPU 0
> 
> Have you tried b43fcd7dc7b, found in v2.6.30-rc3?

I've tried 2.6.30-rc3-next-20090423 without success.


* Re: [E1000-devel] e1000: "eeprom checksum is not valid" after kexec
  2009-04-23 14:30   ` Jiri Slaby
@ 2009-04-23 14:41     ` Thadeu Lima de Souza Cascardo
  2009-04-23 20:40       ` Jiri Slaby
  0 siblings, 1 reply; 13+ messages in thread
From: Thadeu Lima de Souza Cascardo @ 2009-04-23 14:41 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: Rafael J. Wysocki, e1000-devel, Ingo Molnar, LKML, Jesse Barnes

On Thu, Apr 23, 2009 at 04:30:01PM +0200, Jiri Slaby wrote:
> On 04/23/2009 04:10 PM, Thadeu Lima de Souza Cascardo wrote:
> > On Thu, Apr 23, 2009 at 03:36:43PM +0200, Jiri Slaby wrote:
> >> Hi,
> >>
> >> 4a865905f685eaefaedf6ade362323dc52aa703b
> >> (PCI PM: Make pci_set_power_state() handle devices with no PM support)
> >> breaks e1000 after being kexec'ed. These reverts fix the problem:
> >>     Revert "PCI PM: Make pci_set_power_state() handle devices with no PM
> >> support"
> >>     Revert "PCI PM: Introduce __pci_[start|complete]_power_transition()
> >> (rev. 2)"
> >>
> >> I reverted the second one just for an easy revert of the former one,
> >> which is actually the culprit.
> >>
> >> The symptoms:
> >> e1000 0000:02:01.0: enabling device (0000 -> 0003)
> >> e1000 0000:02:01.0: PCI INT A -> Link[LNKA] -> GSI 11 (level, low) -> IRQ 11
> >> e1000 0000:02:01.0: setting latency timer to 64
> >> e1000: 0000:02:01.0: e1000_probe: The EEPROM Checksum Is Not Valid
> >> Switched to high resolution mode on CPU 0
> > 
> > Have you tried b43fcd7dc7b, found in v2.6.30-rc3?
> 
> I've tried 2.6.30-rc3-next-20090423 without success.

You mean next-20090423. The patch is really found there.

But then, do you mean reverting these patches in the kernel that is
running, or in the kernel that is being kexec'd?

If b43fcd7dc7b is applied to the running kernel, it fixes the shutdown
issue, and the next loaded kernel probes e1000 fine.

If you are reverting 4a865905f in the kexec'd kernel and the running
kernel does not have b43fcd7dc7b, then I'd like to test the revert for
my case here, which is e100.

Which is it?

Regards,
Cascardo.


* Re: e1000: "eeprom checksum is not valid" after kexec
  2009-04-23 13:36 e1000: "eeprom checksum is not valid" after kexec Jiri Slaby
  2009-04-23 14:10 ` [E1000-devel] " Thadeu Lima de Souza Cascardo
@ 2009-04-23 15:15 ` Rafael J. Wysocki
  2009-04-23 15:36   ` Jiri Slaby
  1 sibling, 1 reply; 13+ messages in thread
From: Rafael J. Wysocki @ 2009-04-23 15:15 UTC (permalink / raw)
  To: Jiri Slaby; +Cc: e1000-devel, LKML, Ingo Molnar, Jesse Barnes

On Thursday 23 April 2009, Jiri Slaby wrote:
> Hi,

Hi,

> 4a865905f685eaefaedf6ade362323dc52aa703b
> (PCI PM: Make pci_set_power_state() handle devices with no PM support)
> breaks e1000 after being kexec'ed. These reverts fix the problem:
>     Revert "PCI PM: Make pci_set_power_state() handle devices with no PM
> support"
>     Revert "PCI PM: Introduce __pci_[start|complete]_power_transition()
> (rev. 2)"
> 
> I reverted the second one

I don't think it can be reverted.

> just for an easy revert of the former one, which is actually the culprit.

Can you just try to revert the changes in pci_raw_set_power_state() and check
if that has any effect (it shouldn't)?

> The symptoms:
> e1000 0000:02:01.0: enabling device (0000 -> 0003)
> e1000 0000:02:01.0: PCI INT A -> Link[LNKA] -> GSI 11 (level, low) -> IRQ 11
> e1000 0000:02:01.0: setting latency timer to 64
> e1000: 0000:02:01.0: e1000_probe: The EEPROM Checksum Is Not Valid
> Switched to high resolution mode on CPU 0
> /*********************/
> Current EEPROM Checksum : 0xffff
> Calculated              : 0xbaf9
> Offset    Values
> ========  ======
> 00000000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> 00000010: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> 00000020: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> 00000030: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> 00000040: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> 00000050: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> 00000060: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> 00000070: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> Include this output when contacting your support provider.
> This is not a software error! Something bad happened to your hardware or
> EEPROM image. Ignoring this problem could result in further problems,
> possibly loss of data, corruption or system hangs!
> The MAC Address will be reset to 00:00:00:00:00:00, which is invalid
> and requires you to set the proper MAC address manually before continuing
> to enable this network device.
> Please inspect the EEPROM dump and report the issue to your hardware vendor
> or Intel Customer Support.
> /*********************/
> e1000: 0000:02:01.0: e1000_probe: Invalid MAC Address
> e1000: 0000:02:01.0: e1000_probe: (PCI-X:33MHz:64-bit) 00:00:00:00:00:00

So this is after kexec?

What happens if you remove just the

	/* Check if we're already there */
	if (dev->current_state == state)
		return 0;

part from pci_set_power_state()?
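
One way to read this question: if the cached dev->current_state already claimed the target state while the hardware was in a different one, the early return would skip the transition. A toy model of that hypothesis, assuming a hypothetical struct toy_pci_dev that tracks the cached and the real state separately (none of these names exist in the kernel, and this is not the real pci_set_power_state()):

```c
#include <assert.h>
#include <stddef.h>

enum toy_power_state { TOY_D0 = 0, TOY_D3HOT = 3 };

struct toy_pci_dev {
    enum toy_power_state current_state; /* what the kernel believes */
    enum toy_power_state hw_state;      /* what the device is really in */
};

/*
 * Returns 1 if the (simulated) hardware was programmed, 0 if the call
 * was short-circuited by the cached-state check.
 */
static int toy_set_power_state(struct toy_pci_dev *dev,
                               enum toy_power_state state,
                               int skip_if_cached)
{
    assert(dev != NULL);

    /* Check if we're already there (according to the cache only) */
    if (skip_if_cached && dev->current_state == state)
        return 0;

    dev->hw_state = state;       /* stands in for the real register write */
    dev->current_state = state;  /* keep the cache in sync */
    return 1;
}
```

With current_state = TOY_D0 and hw_state = TOY_D3HOT, a call with skip_if_cached set returns without touching the hardware; without the check, the device is moved to D0. Later in this thread, removing the check is reported not to fix the e1000 problem, so this sketch illustrates only the hypothesis being tested.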

Rafael


* Re: e1000: "eeprom checksum is not valid" after kexec
  2009-04-23 15:15 ` Rafael J. Wysocki
@ 2009-04-23 15:36   ` Jiri Slaby
  2009-04-23 21:48     ` Rafael J. Wysocki
  0 siblings, 1 reply; 13+ messages in thread
From: Jiri Slaby @ 2009-04-23 15:36 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: e1000-devel, LKML, Ingo Molnar, Jesse Barnes

On 04/23/2009 05:15 PM, Rafael J. Wysocki wrote:
> On Thursday 23 April 2009, Jiri Slaby wrote:
>> 4a865905f685eaefaedf6ade362323dc52aa703b
>> (PCI PM: Make pci_set_power_state() handle devices with no PM support)
>> breaks e1000 after being kexec'ed. These reverts fix the problem:
>>     Revert "PCI PM: Make pci_set_power_state() handle devices with no PM
>> support"
>>     Revert "PCI PM: Introduce __pci_[start|complete]_power_transition()
>> (rev. 2)"
>>
>> I reverted the second one
> 
> I don't think it can be reverted.

But it works :). I'm not saying it's correct to revert them upstream.
It was just confirmation that it causes the problem.

>> just for an easy revert of the former one, which is actually the culprit.
> 
> Can you just try to revert the changes in pci_raw_set_power_state() and check
> if that has any effect (it shouldn't)?

Please send a patch. I'm lost in the changes done there. Let's say
against the top of 20090423 next tree.

>> e1000: 0000:02:01.0: e1000_probe: Invalid MAC Address
>> e1000: 0000:02:01.0: e1000_probe: (PCI-X:33MHz:64-bit) 00:00:00:00:00:00
> 
> So this is after kexec?

yes

> What happens if you remove just the
> 
> 	/* Check if we're already there */
> 	if (dev->current_state == state)
> 		return 0;
> 
> part from pci_set_power_state()?

Will try later.

Thanks.


* Re: [E1000-devel] e1000: "eeprom checksum is not valid" after kexec
  2009-04-23 14:41     ` Thadeu Lima de Souza Cascardo
@ 2009-04-23 20:40       ` Jiri Slaby
  2009-04-23 21:17         ` Thadeu Lima de Souza Cascardo
  0 siblings, 1 reply; 13+ messages in thread
From: Jiri Slaby @ 2009-04-23 20:40 UTC (permalink / raw)
  To: Thadeu Lima de Souza Cascardo
  Cc: Rafael J. Wysocki, e1000-devel, Ingo Molnar, LKML, Jesse Barnes

On 04/23/2009 04:41 PM, Thadeu Lima de Souza Cascardo wrote:
> On Thu, Apr 23, 2009 at 04:30:01PM +0200, Jiri Slaby wrote:
>> On 04/23/2009 04:10 PM, Thadeu Lima de Souza Cascardo wrote:
>>> Have you tried b43fcd7dc7b, found in v2.6.30-rc3?
>> I've tried 2.6.30-rc3-next-20090423 without success.
> 
> You mean next-20090423. The patch is really found there.
> 
> But then, do you mean reverting these patches in the kernel that is
> running, or in the kernel that is being kexec'd?

The latter.

> If b43fcd7dc7b is applied to the running kernel, it fixes the shutdown
> issue, and the next loaded kernel probes e1000 fine.

Makes sense.

> If you are reverting 4a865905f in the kexec'd kernel and the running
> kernel does not have b43fcd7dc7b, then I'd like to test the revert for
> my case here, which is e100.

To make things clear: on that machine, there was stock opensuse 11.1
distro kernel which is 2.6.27-based (no b43fcd7dc7b). I needed to debug
a wireless bug, so I kexec'ed wireless-testing (contains 4a865905f already).

So in fact, 4a865905f from the testing kernel triggered a bug that was
recently fixed by b43fcd7dc7b.

Did the other two e100* drivers suffer from the same problem, and were
they fixed recently? It would render kexec pretty unusable from older
kernels if this is not going to be fixed somehow :(.


* Re: [E1000-devel] e1000: "eeprom checksum is not valid" after kexec
  2009-04-23 20:40       ` Jiri Slaby
@ 2009-04-23 21:17         ` Thadeu Lima de Souza Cascardo
  2009-04-24 16:09           ` Rafael J. Wysocki
  0 siblings, 1 reply; 13+ messages in thread
From: Thadeu Lima de Souza Cascardo @ 2009-04-23 21:17 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: Rafael J. Wysocki, e1000-devel, Ingo Molnar, LKML, Jesse Barnes

On Thu, Apr 23, 2009 at 10:40:14PM +0200, Jiri Slaby wrote:
> On 04/23/2009 04:41 PM, Thadeu Lima de Souza Cascardo wrote:
> > On Thu, Apr 23, 2009 at 04:30:01PM +0200, Jiri Slaby wrote:
> >> On 04/23/2009 04:10 PM, Thadeu Lima de Souza Cascardo wrote:
> >>> Have you tried b43fcd7dc7b, found in v2.6.30-rc3?
> >> I've tried 2.6.30-rc3-next-20090423 without success.
> > 
> > You mean next-20090423. The patch is really found there.
> > 
> > But then, do you mean reverting these patches in the kernel that is
> > running, or in the kernel that is being kexec'd?
> 
> The latter.
> 
> > If b43fcd7dc7b is applied to the running kernel, it fixes the shutdown
> > issue, and the next loaded kernel probes e1000 fine.
> 
> Makes sense.
> 
> > If you are reverting 4a865905f in the kexec'd kernel and the running
> > kernel does not have b43fcd7dc7b, then I'd like to test the revert for
> > my case here, which is e100.
> 
> To make things clear: on that machine, there was stock opensuse 11.1
> distro kernel which is 2.6.27-based (no b43fcd7dc7b). I needed to debug
> a wireless bug, so I kexec'ed wireless-testing (contains 4a865905f already).
> 
> So in fact, 4a865905f from the testing kernel triggered a bug that was
> recently fixed by b43fcd7dc7b.
> 
> Did the other two e100* drivers suffer from the same problem, and were
> they fixed recently? It would render kexec pretty unusable from older
> kernels if this is not going to be fixed somehow :(.

Yes, as well as some other network drivers, it seems. My fix for e100
should be in Jeffrey Kirsher's tree by now and go into netdev and rc4
soon, I expect.

But, since I also thought that it would be good to fix that and allow
people to kexec from earlier kernels, I did a followup to e100-devel,
linux-pci, netdev and Rafael Wysocki. I didn't include linux-kernel,
which I have just fixed, bouncing the message (oops!). I may bounce it
to you too, if you want that.

Your findings shed light on that problem. But I could find it in
very early kernels too for some configurations, and reverting these
commits may only fix the issue for the most common configurations
out there. That is, it was very easy to trigger the shutdown bug with
these patches. But I think there are other bugs out there that will
trigger it, and they are not that easy to bisect, it seems, since only
some very particular configurations trigger it.

I will do some tests with the commits you mention, try to reproduce the
problem using kernels as early as I can, and send the config.

Regards,
Cascardo.


* Re: e1000: "eeprom checksum is not valid" after kexec
  2009-04-23 15:36   ` Jiri Slaby
@ 2009-04-23 21:48     ` Rafael J. Wysocki
  0 siblings, 0 replies; 13+ messages in thread
From: Rafael J. Wysocki @ 2009-04-23 21:48 UTC (permalink / raw)
  To: Jiri Slaby; +Cc: e1000-devel, LKML, Ingo Molnar, Jesse Barnes

On Thursday 23 April 2009, Jiri Slaby wrote:
> On 04/23/2009 05:15 PM, Rafael J. Wysocki wrote:
> > On Thursday 23 April 2009, Jiri Slaby wrote:
> >> 4a865905f685eaefaedf6ade362323dc52aa703b
> >> (PCI PM: Make pci_set_power_state() handle devices with no PM support)
> >> breaks e1000 after being kexec'ed. These reverts fix the problem:
> >>     Revert "PCI PM: Make pci_set_power_state() handle devices with no PM
> >> support"
> >>     Revert "PCI PM: Introduce __pci_[start|complete]_power_transition()
> >> (rev. 2)"
> >>
> >> I reverted the second one
> > 
> > I don't think it can be reverted.
> 
> But it works :). I'm not saying it's correct to revert them upstream.
> It was just confirmation that it causes the problem.
> 
> >> just for an easy revert of the former one, which is actually the culprit.
> > 
> > Can you just try to revert the changes in pci_raw_set_power_state() and check
> > if that has any effect (it shouldn't)?
> 
> Please send a patch. I'm lost in the changes done there. Let's say
> against the top of 20090423 next tree.

I can easily send you a patch against -rc3 if that's sufficient.

> >> e1000: 0000:02:01.0: e1000_probe: Invalid MAC Address
> >> e1000: 0000:02:01.0: e1000_probe: (PCI-X:33MHz:64-bit) 00:00:00:00:00:00
> > 
> > So this is after kexec?
> 
> yes

Do you kexec the same kernel or any other kernel?

> > What happens if you remove just the
> > 
> > 	/* Check if we're already there */
> > 	if (dev->current_state == state)
> > 		return 0;
> > 
> > part from pci_set_power_state()?
> 
> Will try later.

OK, thanks.

Best,
Rafael


* Re: [E1000-devel] e1000: "eeprom checksum is not valid" after kexec
  2009-04-23 21:17         ` Thadeu Lima de Souza Cascardo
@ 2009-04-24 16:09           ` Rafael J. Wysocki
  2009-05-11 14:31             ` Jiri Slaby
  0 siblings, 1 reply; 13+ messages in thread
From: Rafael J. Wysocki @ 2009-04-24 16:09 UTC (permalink / raw)
  To: Thadeu Lima de Souza Cascardo
  Cc: Jiri Slaby, e1000-devel, Ingo Molnar, LKML, Jesse Barnes

On Thursday 23 April 2009, Thadeu Lima de Souza Cascardo wrote:
> On Thu, Apr 23, 2009 at 10:40:14PM +0200, Jiri Slaby wrote:
> > On 04/23/2009 04:41 PM, Thadeu Lima de Souza Cascardo wrote:
> > > On Thu, Apr 23, 2009 at 04:30:01PM +0200, Jiri Slaby wrote:
> > >> On 04/23/2009 04:10 PM, Thadeu Lima de Souza Cascardo wrote:
> > >>> Have you tried b43fcd7dc7b, found in v2.6.30-rc3?
> > >> I've tried 2.6.30-rc3-next-20090423 without success.
> > > 
> > > You mean next-20090423. The patch is really found there.
> > > 
> > > But then, do you mean reverting these patches in the kernel that is
> > > running, or in the kernel that is being kexec'd?
> > 
> > The latter.
> > 
> > > If b43fcd7dc7b is applied to the running kernel, it fixes the shutdown
> > > issue, and the next loaded kernel probes e1000 fine.
> > 
> > Makes sense.
> > 
> > > If you are reverting 4a865905f in the kexec'd kernel and the running
> > > kernel does not have b43fcd7dc7b, then I'd like to test the revert for
> > > my case here, which is e100.
> > 
> > To make things clear: on that machine, there was stock opensuse 11.1
> > distro kernel which is 2.6.27-based (no b43fcd7dc7b). I needed to debug
> > a wireless bug, so I kexec'ed wireless-testing (contains 4a865905f already).
> > 
> > So in fact, 4a865905f from the testing kernel triggered a bug that was
> > recently fixed by b43fcd7dc7b.
> > 
> > Did the other two e100* drivers suffer from the same problem, and were
> > they fixed recently? It would render kexec pretty unusable from older
> > kernels if this is not going to be fixed somehow :(.
> 
> Yes, as well as some other network drivers, it seems. My fix for e100
> should be in Jeffrey Kirsher's tree by now and go into netdev and rc4
> soon, I expect.
> 
> But, since I also thought that it would be good to fix that and allow
> people to kexec from earlier kernels, I did a followup to e100-devel,
> linux-pci, netdev and Rafael Wysocki. I didn't include linux-kernel,
> which I have just fixed, bouncing the message (oops!). I may bounce it
> to you too, if you want that.
> 
> Your findings shed light on that problem. But I could find it in
> very early kernels too for some configurations, and reverting these
> commits may only fix the issue for the most common configurations
> out there. That is, it was very easy to trigger the shutdown bug with
> these patches. But I think there are other bugs out there that will
> trigger it, and they are not that easy to bisect, it seems, since only
> some very particular configurations trigger it.
> 
> I will do some tests with the commits you mention, try to reproduce the
> problem using kernels as early as I can, and send the config.

Cascardo, Jiri, can you tell me please what the status here is?

My understanding is that the commit pointed to by Jiri caused a problem
if the current mainline kernel was kexeced from an older kernel (2.6.27.x from
openSUSE-11.1 in this particular case), because the older kernel didn't
have the recent network driver fixes applied.  Is this correct?

Also, I'm still interested in whether or not removing the following three lines:

        /* Check if we're already there */
        if (dev->current_state == state)
                return 0;

from pci_set_power_state() in the current mainline kernel fixes the problem
in the configuration where it is readily reproducible.

Thanks,
Rafael


* Re: [E1000-devel] e1000: "eeprom checksum is not valid" after kexec
  2009-04-24 16:09           ` Rafael J. Wysocki
@ 2009-05-11 14:31             ` Jiri Slaby
  2009-05-11 15:24               ` Rafael J. Wysocki
  0 siblings, 1 reply; 13+ messages in thread
From: Jiri Slaby @ 2009-05-11 14:31 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Thadeu Lima de Souza Cascardo, e1000-devel, Ingo Molnar, LKML,
	Jesse Barnes

On 04/24/2009 06:09 PM, Rafael J. Wysocki wrote:
> My understanding is that the commit pointed to by Jiri caused a problem
> if the current mainline kernel was kexeced from an older kernel (2.6.27.x from
> openSUSE-11.1 in this particular case), because the older kernel didn't
> have the recent network driver fixes applied.  Is this correct?

Exactly!

> Also, I'm still interested in whether or not removing the following three lines:
> 
>         /* Check if we're already there */
>         if (dev->current_state == state)
>                 return 0;
> 
> from pci_set_power_state() in the current mainline kernel fixes the problem
> in the configuration where it is readily reproducible.

After removing those lines, the problem still persists:
e1000: 0000:02:01.0: e1000_probe: The EEPROM Checksum Is Not Valid





* Re: [E1000-devel] e1000: "eeprom checksum is not valid" after kexec
  2009-05-11 14:31             ` Jiri Slaby
@ 2009-05-11 15:24               ` Rafael J. Wysocki
  2009-05-11 15:31                 ` Rafael J. Wysocki
  0 siblings, 1 reply; 13+ messages in thread
From: Rafael J. Wysocki @ 2009-05-11 15:24 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: Thadeu Lima de Souza Cascardo, e1000-devel, Ingo Molnar, LKML,
	Jesse Barnes

On Monday 11 May 2009, Jiri Slaby wrote:
> On 04/24/2009 06:09 PM, Rafael J. Wysocki wrote:
> > My understanding is that the commit pointed to by Jiri caused a problem
> > if the current mainline kernel was kexeced from an older kernel (2.6.27.x from
> > openSUSE-11.1 in this particular case), because the older kernel didn't
> > have the recent network driver fixes applied.  Is this correct?
> 
> Exactly!
> 
> > Also, I'm still interested in whether or not removing the following three lines:
> > 
> >         /* Check if we're already there */
> >         if (dev->current_state == state)
> >                 return 0;
> > 
> > from pci_set_power_state() in the current mainline kernel fixes the problem
> > in the configuration where it is readily reproducible.
> 
> After removing those lines, the problem still persists:
> e1000: 0000:02:01.0: e1000_probe: The EEPROM Checksum Is Not Valid

So it's more complicated than I thought.  Well ...

What if the driver in question is rmmoded before kexec?


* Re: [E1000-devel] e1000: "eeprom checksum is not valid" after kexec
  2009-05-11 15:24               ` Rafael J. Wysocki
@ 2009-05-11 15:31                 ` Rafael J. Wysocki
  0 siblings, 0 replies; 13+ messages in thread
From: Rafael J. Wysocki @ 2009-05-11 15:31 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: Thadeu Lima de Souza Cascardo, e1000-devel, Ingo Molnar, LKML,
	Jesse Barnes

On Monday 11 May 2009, Rafael J. Wysocki wrote:
> On Monday 11 May 2009, Jiri Slaby wrote:
> > On 04/24/2009 06:09 PM, Rafael J. Wysocki wrote:
> > > My understanding is that the commit pointed to by Jiri caused a problem
> > > if the current mainline kernel was kexeced from an older kernel (2.6.27.x from
> > > openSUSE-11.1 in this particular case), because the older kernel didn't
> > > have the recent network driver fixes applied.  Is this correct?
> > 
> > Exactly!
> > 
> > > Also, I'm still interested in whether or not removing the following three lines:
> > > 
> > >         /* Check if we're already there */
> > >         if (dev->current_state == state)
> > >                 return 0;
> > > 
> > > from pci_set_power_state() in the current mainline kernel fixes the problem
> > > in the configuration where it is readily reproducible.
> > 
> > After removing those lines, the problem still persists:
> > e1000: 0000:02:01.0: e1000_probe: The EEPROM Checksum Is Not Valid
> 
> So it's more complicated than I thought.  Well ...
> 
> What if the driver in question is rmmoded before kexec?

Well, it should be the same, never mind.


Thread overview: 13+ messages
2009-04-23 13:36 e1000: "eeprom checksum is not valid" after kexec Jiri Slaby
2009-04-23 14:10 ` [E1000-devel] " Thadeu Lima de Souza Cascardo
2009-04-23 14:30   ` Jiri Slaby
2009-04-23 14:41     ` Thadeu Lima de Souza Cascardo
2009-04-23 20:40       ` Jiri Slaby
2009-04-23 21:17         ` Thadeu Lima de Souza Cascardo
2009-04-24 16:09           ` Rafael J. Wysocki
2009-05-11 14:31             ` Jiri Slaby
2009-05-11 15:24               ` Rafael J. Wysocki
2009-05-11 15:31                 ` Rafael J. Wysocki
2009-04-23 15:15 ` Rafael J. Wysocki
2009-04-23 15:36   ` Jiri Slaby
2009-04-23 21:48     ` Rafael J. Wysocki
