* kexec boot regression
@ 2009-12-15 11:50 Jens Axboe
  2009-12-15 12:01 ` Yinghai Lu
  0 siblings, 1 reply; 42+ messages in thread
From: Jens Axboe @ 2009-12-15 11:50 UTC (permalink / raw)
  To: Linux Kernel; +Cc: mingo, yinghai, rdreier

Hi,

I have this big box that takes forever to boot, so I use kexec to boot
into new kernels. Works fine, but some time past 2.6.32 it stopped
working. Instead of wasting brain cycles on finding out why, I handed
the problem to my trusty regression friend - git bisect.

This is what it found (sorry Yinghai it's you again, you owe me a beer
for hours of 2.6.32-git bisecting ;-)


99935a7a59eaca0292c1a5880e10bae03f4a5e3d is the first bad commit
commit 99935a7a59eaca0292c1a5880e10bae03f4a5e3d
Author: Yinghai Lu <yinghai@kernel.org>
Date:   Sun Oct 4 21:54:24 2009 -0700

    x86/PCI: read root resources from IOH on Intel
    
    For intel systems with multi IOH, we should read peer root resources
    directly from PCI config space, and don't trust _CRS.


I could not revert this single commit, as a further commit made other
changes. So I reverted 67f241f4 first and then 99935a7a. I confirmed
that this kernel then works fine.

With current -git, I get tons and tons of:

[   16.841724] pci 0000:00:01.0: BAR 7: no parent found for bridge [io 0x6000-0x6fff]
[   16.850368] pci 0000:00:01.0: BAR 7: can't allocate [io 0x6000-0x6fff]
[   16.857821] pci 0000:00:01.0: BAR 8: no parent found for bridge [mem 0x9bc00000-0x9bcfffff]
[   16.867238] pci 0000:00:01.0: BAR 8: can't allocate [mem 0x9bc00000-0x9bcfffff]
[   16.875492] pci 0000:00:02.0: BAR 7: no parent found for bridge [io 0x5000-0x5fff]
[   16.884137] pci 0000:00:02.0: BAR 7: can't allocate [io 0x5000-0x5fff]
[   16.891591] pci 0000:00:02.0: BAR 8: no parent found for bridge [mem 0x9bb00000-0x9bbfffff]
[   16.901010] pci 0000:00:02.0: BAR 8: can't allocate [mem 0x9bb00000-0x9bbfffff]
[   16.909264] pci 0000:00:03.0: BAR 7: no parent found for bridge [io 0x4000-0x4fff]
[   16.917908] pci 0000:00:03.0: BAR 7: can't allocate [io 0x4000-0x4fff]
[...]

I can provide a full log if needed.

-- 
Jens Axboe



* Re: kexec boot regression
  2009-12-15 11:50 kexec boot regression Jens Axboe
@ 2009-12-15 12:01 ` Yinghai Lu
  2009-12-15 12:14   ` Jens Axboe
  0 siblings, 1 reply; 42+ messages in thread
From: Yinghai Lu @ 2009-12-15 12:01 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Linux Kernel, mingo, rdreier

Jens Axboe wrote:
> Hi,
> 
> I have this big box that takes forever to boot, so I use kexec to boot
> into new kernels. Works fine, but some time past 2.6.32 it stopped
> working. Instead of wasting brain cycles on finding out why, I handed
> the problem to my trusty regression friend - git bisect.
> 
> This is what it found (sorry Yinghai it's you again, you owe me a beer
> for hours of 2.6.32-git bisecting ;-)

sure.

> 
> 
> 99935a7a59eaca0292c1a5880e10bae03f4a5e3d is the first bad commit
> commit 99935a7a59eaca0292c1a5880e10bae03f4a5e3d
> Author: Yinghai Lu <yinghai@kernel.org>
> Date:   Sun Oct 4 21:54:24 2009 -0700
> 
>     x86/PCI: read root resources from IOH on Intel
>     
>     For intel systems with multi IOH, we should read peer root resources
>     directly from PCI config space, and don't trust _CRS.
> 
> 
> I could not revert this single commit, as a further commit made other
> changes. So I reverted 67f241f4 first and then 99935a7a. I confirmed
> that this kernel then works fine.
>

Let's see how the BIOS messed it up again!
 
> With current -git, I get tons and tons of:
> 
> [   16.841724] pci 0000:00:01.0: BAR 7: no parent found for bridge [io 0x6000-0x6fff]
> [   16.850368] pci 0000:00:01.0: BAR 7: can't allocate [io 0x6000-0x6fff]
> [   16.857821] pci 0000:00:01.0: BAR 8: no parent found for bridge [mem 0x9bc00000-0x9bcfffff]
> [   16.867238] pci 0000:00:01.0: BAR 8: can't allocate [mem 0x9bc00000-0x9bcfffff]
> [   16.875492] pci 0000:00:02.0: BAR 7: no parent found for bridge [io 0x5000-0x5fff]
> [   16.884137] pci 0000:00:02.0: BAR 7: can't allocate [io 0x5000-0x5fff]
> [   16.891591] pci 0000:00:02.0: BAR 8: no parent found for bridge [mem 0x9bb00000-0x9bbfffff]
> [   16.901010] pci 0000:00:02.0: BAR 8: can't allocate [mem 0x9bb00000-0x9bbfffff]
> [   16.909264] pci 0000:00:03.0: BAR 7: no parent found for bridge [io 0x4000-0x4fff]
> [   16.917908] pci 0000:00:03.0: BAR 7: can't allocate [io 0x4000-0x4fff]
> [...]
> 
> I can provide a full log if needed.

please.

YH


* Re: kexec boot regression
  2009-12-15 12:01 ` Yinghai Lu
@ 2009-12-15 12:14   ` Jens Axboe
  2009-12-15 12:31     ` Yinghai Lu
  0 siblings, 1 reply; 42+ messages in thread
From: Jens Axboe @ 2009-12-15 12:14 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: Linux Kernel, mingo, rdreier


On Tue, Dec 15 2009, Yinghai Lu wrote:
> Jens Axboe wrote:
> > Hi,
> > 
> > I have this big box that takes forever to boot, so I use kexec to boot
> > into new kernels. Works fine, but some time past 2.6.32 it stopped
> > working. Instead of wasting brain cycles on finding out why, I handed
> > the problem to my trusty regression friend - git bisect.
> > 
> > This is what it found (sorry Yinghai it's you again, you owe me a beer
> > for hours of 2.6.32-git bisecting ;-)
> 
> sure.
> 
> > 
> > 
> > 99935a7a59eaca0292c1a5880e10bae03f4a5e3d is the first bad commit
> > commit 99935a7a59eaca0292c1a5880e10bae03f4a5e3d
> > Author: Yinghai Lu <yinghai@kernel.org>
> > Date:   Sun Oct 4 21:54:24 2009 -0700
> > 
> >     x86/PCI: read root resources from IOH on Intel
> >     
> >     For intel systems with multi IOH, we should read peer root resources
> >     directly from PCI config space, and don't trust _CRS.
> > 
> > 
> > I could not revert this single commit, as a further commit made other
> > changes. So I reverted 67f241f4 first and then 99935a7a. I confirmed
> > that this kernel then works fine.
> >
> 
> let see how BIOS mess it up again!

Heh, I had a feeling this was coming :-)

> please.

Please find two logs attached - one from a boot with -git and the two
patches reverted, and one from a boot with -git.

-- 
Jens Axboe


[-- Attachment #2: good-boot.log.gz --]
[-- Type: application/octet-stream, Size: 15915 bytes --]

[-- Attachment #3: bad-boot.log.gz --]
[-- Type: application/octet-stream, Size: 14734 bytes --]


* Re: kexec boot regression
  2009-12-15 12:14   ` Jens Axboe
@ 2009-12-15 12:31     ` Yinghai Lu
  2009-12-15 12:39       ` Jens Axboe
  0 siblings, 1 reply; 42+ messages in thread
From: Yinghai Lu @ 2009-12-15 12:31 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Linux Kernel, mingo, rdreier

Jens Axboe wrote:
> On Tue, Dec 15 2009, Yinghai Lu wrote:
>> Jens Axboe wrote:
>>> Hi,
>>>
>>> I have this big box that takes forever to boot, so I use kexec to boot
>>> into new kernels. Works fine, but some time past 2.6.32 it stopped
>>> working. Instead of wasting brain cycles on finding out why, I handed
>>> the problem to my trusty regression friend - git bisect.
>>>
>>> This is what it found (sorry Yinghai it's you again, you owe me a beer
>>> for hours of 2.6.32-git bisecting ;-)
>> sure.
>>
>>>
>>> 99935a7a59eaca0292c1a5880e10bae03f4a5e3d is the first bad commit
>>> commit 99935a7a59eaca0292c1a5880e10bae03f4a5e3d
>>> Author: Yinghai Lu <yinghai@kernel.org>
>>> Date:   Sun Oct 4 21:54:24 2009 -0700
>>>
>>>     x86/PCI: read root resources from IOH on Intel
>>>     
>>>     For intel systems with multi IOH, we should read peer root resources
>>>     directly from PCI config space, and don't trust _CRS.
>>>
>>>
>>> I could not revert this single commit, as a further commit made other
>>> changes. So I reverted 67f241f4 first and then 99935a7a. I confirmed
>>> that this kernel then works fine.
>>>
>> let see how BIOS mess it up again!
> 
> Heh, I had a feeling this was coming :-)
> 
>> please.
> 
> Please find two logs attached - one from a boot with -git and the two
> patches reverted, and one from a boot with -git.

Please enable CONFIG_PCI_DEBUG and boot with "debug" on the kernel command line.

Thanks

Yinghai


* Re: kexec boot regression
  2009-12-15 12:31     ` Yinghai Lu
@ 2009-12-15 12:39       ` Jens Axboe
  2009-12-15 12:55         ` Yinghai Lu
  0 siblings, 1 reply; 42+ messages in thread
From: Jens Axboe @ 2009-12-15 12:39 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: Linux Kernel, mingo, rdreier

On Tue, Dec 15 2009, Yinghai Lu wrote:
> Jens Axboe wrote:
> > On Tue, Dec 15 2009, Yinghai Lu wrote:
> >> Jens Axboe wrote:
> >>> Hi,
> >>>
> >>> I have this big box that takes forever to boot, so I use kexec to boot
> >>> into new kernels. Works fine, but some time past 2.6.32 it stopped
> >>> working. Instead of wasting brain cycles on finding out why, I handed
> >>> the problem to my trusty regression friend - git bisect.
> >>>
> >>> This is what it found (sorry Yinghai it's you again, you owe me a beer
> >>> for hours of 2.6.32-git bisecting ;-)
> >> sure.
> >>
> >>>
> >>> 99935a7a59eaca0292c1a5880e10bae03f4a5e3d is the first bad commit
> >>> commit 99935a7a59eaca0292c1a5880e10bae03f4a5e3d
> >>> Author: Yinghai Lu <yinghai@kernel.org>
> >>> Date:   Sun Oct 4 21:54:24 2009 -0700
> >>>
> >>>     x86/PCI: read root resources from IOH on Intel
> >>>     
> >>>     For intel systems with multi IOH, we should read peer root resources
> >>>     directly from PCI config space, and don't trust _CRS.
> >>>
> >>>
> >>> I could not revert this single commit, as a further commit made other
> >>> changes. So I reverted 67f241f4 first and then 99935a7a. I confirmed
> >>> that this kernel then works fine.
> >>>
> >> let see how BIOS mess it up again!
> > 
> > Heh, I had a feeling this was coming :-)
> > 
> >> please.
> > 
> > Please find two logs attached - one from a boot with -git and the two
> > patches reverted, and one from a boot with -git.
> 
> please enabled CONFIG_PCI_DEBUG and boot with debug in boot command line.

On the good or bad kernel?

-- 
Jens Axboe



* Re: kexec boot regression
  2009-12-15 12:39       ` Jens Axboe
@ 2009-12-15 12:55         ` Yinghai Lu
  2009-12-15 14:11           ` Jens Axboe
  0 siblings, 1 reply; 42+ messages in thread
From: Yinghai Lu @ 2009-12-15 12:55 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Linux Kernel, mingo, rdreier

Jens Axboe wrote:
> On Tue, Dec 15 2009, Yinghai Lu wrote:
>> Jens Axboe wrote:
>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
>>>> Jens Axboe wrote:
>>>>> Hi,
>>>>>
>>>>> I have this big box that takes forever to boot, so I use kexec to boot
>>>>> into new kernels. Works fine, but some time past 2.6.32 it stopped
>>>>> working. Instead of wasting brain cycles on finding out why, I handed
>>>>> the problem to my trusty regression friend - git bisect.
>>>>>
>>>>> This is what it found (sorry Yinghai it's you again, you owe me a beer
>>>>> for hours of 2.6.32-git bisecting ;-)
>>>> sure.
>>>>
>>>>> 99935a7a59eaca0292c1a5880e10bae03f4a5e3d is the first bad commit
>>>>> commit 99935a7a59eaca0292c1a5880e10bae03f4a5e3d
>>>>> Author: Yinghai Lu <yinghai@kernel.org>
>>>>> Date:   Sun Oct 4 21:54:24 2009 -0700
>>>>>
>>>>>     x86/PCI: read root resources from IOH on Intel
>>>>>     
>>>>>     For intel systems with multi IOH, we should read peer root resources
>>>>>     directly from PCI config space, and don't trust _CRS.
>>>>>
>>>>>
>>>>> I could not revert this single commit, as a further commit made other
>>>>> changes. So I reverted 67f241f4 first and then 99935a7a. I confirmed
>>>>> that this kernel then works fine.
>>>>>
>>>> let see how BIOS mess it up again!
>>> Heh, I had a feeling this was coming :-)
>>>
>>>> please.
>>> Please find two logs attached - one from a boot with -git and the two
>>> patches reverted, and one from a boot with -git.
>> please enabled CONFIG_PCI_DEBUG and boot with debug in boot command line.
> 
> On the good or bad kernel?

both please.

YH


* Re: kexec boot regression
  2009-12-15 12:55         ` Yinghai Lu
@ 2009-12-15 14:11           ` Jens Axboe
  2009-12-15 18:39             ` Yinghai Lu
  0 siblings, 1 reply; 42+ messages in thread
From: Jens Axboe @ 2009-12-15 14:11 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: Linux Kernel, mingo, rdreier


On Tue, Dec 15 2009, Yinghai Lu wrote:
> Jens Axboe wrote:
> > On Tue, Dec 15 2009, Yinghai Lu wrote:
> >> Jens Axboe wrote:
> >>> On Tue, Dec 15 2009, Yinghai Lu wrote:
> >>>> Jens Axboe wrote:
> >>>>> Hi,
> >>>>>
> >>>>> I have this big box that takes forever to boot, so I use kexec to boot
> >>>>> into new kernels. Works fine, but some time past 2.6.32 it stopped
> >>>>> working. Instead of wasting brain cycles on finding out why, I handed
> >>>>> the problem to my trusty regression friend - git bisect.
> >>>>>
> >>>>> This is what it found (sorry Yinghai it's you again, you owe me a beer
> >>>>> for hours of 2.6.32-git bisecting ;-)
> >>>> sure.
> >>>>
> >>>>> 99935a7a59eaca0292c1a5880e10bae03f4a5e3d is the first bad commit
> >>>>> commit 99935a7a59eaca0292c1a5880e10bae03f4a5e3d
> >>>>> Author: Yinghai Lu <yinghai@kernel.org>
> >>>>> Date:   Sun Oct 4 21:54:24 2009 -0700
> >>>>>
> >>>>>     x86/PCI: read root resources from IOH on Intel
> >>>>>     
> >>>>>     For intel systems with multi IOH, we should read peer root resources
> >>>>>     directly from PCI config space, and don't trust _CRS.
> >>>>>
> >>>>>
> >>>>> I could not revert this single commit, as a further commit made other
> >>>>> changes. So I reverted 67f241f4 first and then 99935a7a. I confirmed
> >>>>> that this kernel then works fine.
> >>>>>
> >>>> let see how BIOS mess it up again!
> >>> Heh, I had a feeling this was coming :-)
> >>>
> >>>> please.
> >>> Please find two logs attached - one from a boot with -git and the two
> >>> patches reverted, and one from a boot with -git.
> >> please enabled CONFIG_PCI_DEBUG and boot with debug in boot command line.
> > 
> > On the good or bad kernel?
> 
> both please.

Attached.

-- 
Jens Axboe


[-- Attachment #2: good-log-debug.txt.gz --]
[-- Type: application/octet-stream, Size: 41724 bytes --]

[-- Attachment #3: bad-log-debug.txt.gz --]
[-- Type: application/octet-stream, Size: 38731 bytes --]


* Re: kexec boot regression
  2009-12-15 14:11           ` Jens Axboe
@ 2009-12-15 18:39             ` Yinghai Lu
  2009-12-15 18:47               ` Matthew Wilcox
                                 ` (3 more replies)
  0 siblings, 4 replies; 42+ messages in thread
From: Yinghai Lu @ 2009-12-15 18:39 UTC (permalink / raw)
  To: Jens Axboe, Jesse Barnes
  Cc: Linux Kernel, mingo, rdreier, Suresh Siddha, linux-pci

Jens Axboe wrote:
> On Tue, Dec 15 2009, Yinghai Lu wrote:
>>>>>>>
>>>>>> let see how BIOS mess it up again!
>>>>> Heh, I had a feeling this was coming :-)

[    0.000000] user-defined physical RAM map:
[    0.000000]  user: 0000000000000100 - 0000000000098800 (usable)
[    0.000000]  user: 0000000000098800 - 00000000000a0000 (reserved)
[    0.000000]  user: 00000000000e0000 - 0000000000100000 (reserved)
[    0.000000]  user: 0000000000100000 - 0000000078c63000 (usable)
[    0.000000]  user: 0000000078c63000 - 0000000078e77000 (ACPI NVS)
[    0.000000]  user: 0000000078e77000 - 000000007924e000 (ACPI data)
[    0.000000]  user: 000000007924e000 - 00000000792c2000 (reserved)
[    0.000000]  user: 00000000792c2000 - 00000000792d2000 (ACPI data)
[    0.000000]  user: 00000000792d2000 - 00000000792e7000 (reserved)
[    0.000000]  user: 00000000792e7000 - 0000000079301000 (ACPI data)
[    0.000000]  user: 0000000079301000 - 0000000079303000 (reserved)
[    0.000000]  user: 0000000079303000 - 0000000079305000 (ACPI data)
[    0.000000]  user: 0000000079305000 - 0000000079310000 (reserved)
[    0.000000]  user: 0000000079310000 - 0000000079314000 (ACPI data)
[    0.000000]  user: 0000000079314000 - 0000000079319000 (reserved)
[    0.000000]  user: 0000000079319000 - 0000000079336000 (ACPI data)
[    0.000000]  user: 0000000079336000 - 0000000079358000 (reserved)
[    0.000000]  user: 0000000079358000 - 0000000079388000 (ACPI data)
[    0.000000]  user: 0000000079388000 - 00000000793c9000 (reserved)
[    0.000000]  user: 00000000793c9000 - 000000007968f000 (ACPI data)
[    0.000000]  user: 000000007968f000 - 00000000796bb000 (reserved)
[    0.000000]  user: 00000000796bb000 - 00000000799d8000 (ACPI data)
[    0.000000]  user: 00000000799d8000 - 0000000079bd8000 (ACPI NVS)
[    0.000000]  user: 0000000079bd8000 - 0000000079d87000 (ACPI data)
[    0.000000]  user: 0000000079d87000 - 0000000079d8a000 (reserved)
[    0.000000]  user: 0000000079d8a000 - 0000000079dca000 (ACPI data)
[    0.000000]  user: 0000000079dca000 - 0000000079dcb000 (reserved)
[    0.000000]  user: 0000000079dcb000 - 0000000079e1c000 (ACPI data)
[    0.000000]  user: 0000000079e1c000 - 0000000079e87000 (reserved)
[    0.000000]  user: 0000000079e87000 - 000000007bd5f000 (ACPI data)
[    0.000000]  user: 000000007bd5f000 - 000000007be4f000 (reserved)
[    0.000000]  user: 000000007be4f000 - 000000007bf87000 (ACPI data)
[    0.000000]  user: 0000000100000000 - 0000001080000000 (usable)
...
[    0.000000] SRAT: Node 0 PXM 0 0-80000000
[    0.000000] SRAT: Node 0 PXM 0 100000000-480000000
[    0.000000] SRAT: Node 2 PXM 1 480000000-880000000
[    0.000000] SRAT: Node 1 PXM 2 880000000-c80000000
[    0.000000] SRAT: Node 3 PXM 3 c80000000-1080000000
[    0.000000] ACPI: [SRAT:0x01] ignored 16 entries of 32 found
[    0.000000] NUMA: Using 31 for the hash shift.
[    0.000000] SRAT: PXMs only cover 49035MB of your 65419MB e820 RAM. Not used.
[    0.000000] SRAT: SRAT not used.
[    0.000000] No NUMA configuration found

so SRAT is broken?

        if (max_entries && count > max_entries) {
                printk(KERN_WARNING PREFIX "[%4.4s:0x%02x] ignored %i entries of "
                       "%i found\n", id, entry_id, count - max_entries, count);
        }
...

Or what is your CONFIG_NODES_SHIFT? 3? Can you try setting it to 6?
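
The arithmetic behind that question can be sketched as below. This is only an illustration under a stated assumption: that the cap on SRAT memory-affinity entries is NR_NODE_MEMBLKS, taken here as 2 * MAX_NUMNODES with MAX_NUMNODES = 1 << CONFIG_NODES_SHIFT. The helper name srat_memblk_limit is made up for the example. Under that assumption, a shift of 3 would match the "ignored 16 entries of 32 found" line in the log above.

```shell
#!/bin/sh
# Assumption (not confirmed in this thread): the SRAT memory-affinity cap is
# NR_NODE_MEMBLKS = 2 * MAX_NUMNODES, with MAX_NUMNODES = 1 << CONFIG_NODES_SHIFT.

# srat_memblk_limit SHIFT: hypothetical helper, prints the assumed entry cap.
srat_memblk_limit() {
  echo $(( 2 * (1 << $1) ))
}

found=32   # entry count reported by the "[SRAT:0x01] ignored ..." log line

for shift in 3 6; do
  limit=$(srat_memblk_limit "$shift")
  if [ "$found" -gt "$limit" ]; then
    echo "NODES_SHIFT=$shift: cap $limit, ignored $(( found - limit )) of $found"
  else
    echo "NODES_SHIFT=$shift: cap $limit, all $found entries parsed"
  fi
done
```

With a shift of 6 the assumed cap is well above the 32 entries the firmware reports, which is why raising it is a cheap experiment.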

[   13.018720] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0x80000000-0x8fffffff] (base 0x80000000)
[   13.100724] [Firmware Bug]: PCI: MMCONFIG at [mem 0x80000000-0x8fffffff] not reserved in ACPI motherboard resources
[   13.112475] PCI: not using MMCONFIG
[   13.206650] ACPI: No dock devices found.

So MMCONFIG is not used... <please ask the BIOS vendor to fix it!>

then we get 

[   13.990335] IOH bus: [00, 00]
[   13.993707] IOH bus: 00 index 0 io port: [0, fff]
[   13.999023] IOH bus: 00 index 1 mmio: [0, ffffff]
[   14.004335] IOH bus: 00 index 2 mmio: [0, 3ffffff]

please check

[PATCH] x86/pci: intel ioh bus num reg accessing fix

The bus number register is above offset 0x100, so if mmconf is not enabled, we need to skip it.

Reported-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>

---
 arch/x86/pci/intel_bus.c |    4 ++++
 1 file changed, 4 insertions(+)

Index: linux-2.6/arch/x86/pci/intel_bus.c
===================================================================
--- linux-2.6.orig/arch/x86/pci/intel_bus.c
+++ linux-2.6/arch/x86/pci/intel_bus.c
@@ -49,6 +49,10 @@ static void __devinit pci_root_bus_res(s
 	u64 mmioh_base, mmioh_end;
 	int bus_base, bus_end;
 
+	/* some sys doesn't get mmconf enabled */
+	if (dev->cfg_size < 0x200)
+		return;
+
 	if (pci_root_num >= PCI_ROOT_NR) {
 		printk(KERN_DEBUG "intel_bus.c: PCI_ROOT_NR is too small\n");
 		return;
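
For context on the size check in the hunk above: the kernel's PCI core distinguishes a 256-byte conventional config space (PCI_CFG_SPACE_SIZE) from a 4096-byte extended one (PCI_CFG_SPACE_EXP_SIZE), and registers above offset 0x100 are only reachable through the extended space. The sketch below assumes cfg_size takes one of those two values and simply replays the patch's comparison; the function name ioh_regs_action is hypothetical.

```shell
#!/bin/sh
# Config space sizes as defined by the kernel's PCI core.
PCI_CFG_SPACE_SIZE=256        # conventional config space
PCI_CFG_SPACE_EXP_SIZE=4096   # extended config space

# ioh_regs_action CFG_SIZE: hypothetical helper that mirrors the patch's
# "cfg_size < 0x200" test and prints what pci_root_bus_res() would do.
ioh_regs_action() {
  if [ "$1" -lt $(( 0x200 )) ]; then
    echo "skip"
  else
    echo "read"
  fi
}

echo "cfg_size=$PCI_CFG_SPACE_SIZE: $(ioh_regs_action "$PCI_CFG_SPACE_SIZE")"
echo "cfg_size=$PCI_CFG_SPACE_EXP_SIZE: $(ioh_regs_action "$PCI_CFG_SPACE_EXP_SIZE")"
```

If cfg_size really is only ever one of the two values, any threshold strictly between them behaves the same as 0x200, so a named constant would probably read more clearly.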


* Re: kexec boot regression
  2009-12-15 18:39             ` Yinghai Lu
@ 2009-12-15 18:47               ` Matthew Wilcox
  2009-12-15 18:54               ` Jens Axboe
                                 ` (2 subsequent siblings)
  3 siblings, 0 replies; 42+ messages in thread
From: Matthew Wilcox @ 2009-12-15 18:47 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Jens Axboe, Jesse Barnes, Linux Kernel, mingo, rdreier,
	Suresh Siddha, linux-pci

On Tue, Dec 15, 2009 at 10:39:37AM -0800, Yinghai Lu wrote:
> +	/* some sys doesn't get mmconf enabled */
> +	if (dev->cfg_size < 0x200)
> +		return;

What is the meaning of this mystic 0x200?

-- 
Matthew Wilcox				Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."


* Re: kexec boot regression
  2009-12-15 18:39             ` Yinghai Lu
  2009-12-15 18:47               ` Matthew Wilcox
@ 2009-12-15 18:54               ` Jens Axboe
  2009-12-15 18:59               ` Jens Axboe
  2009-12-15 19:43               ` kexec boot regression Jens Axboe
  3 siblings, 0 replies; 42+ messages in thread
From: Jens Axboe @ 2009-12-15 18:54 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha, linux-pci

On Tue, Dec 15 2009, Yinghai Lu wrote:
> Jens Axboe wrote:
> > On Tue, Dec 15 2009, Yinghai Lu wrote:
> >>>>>>>
> >>>>>> let see how BIOS mess it up again!
> >>>>> Heh, I had a feeling this was coming :-)
> 
> [    0.000000] user-defined physical RAM map:
> [    0.000000]  user: 0000000000000100 - 0000000000098800 (usable)
> [    0.000000]  user: 0000000000098800 - 00000000000a0000 (reserved)
> [    0.000000]  user: 00000000000e0000 - 0000000000100000 (reserved)
> [    0.000000]  user: 0000000000100000 - 0000000078c63000 (usable)
> [    0.000000]  user: 0000000078c63000 - 0000000078e77000 (ACPI NVS)
> [    0.000000]  user: 0000000078e77000 - 000000007924e000 (ACPI data)
> [    0.000000]  user: 000000007924e000 - 00000000792c2000 (reserved)
> [    0.000000]  user: 00000000792c2000 - 00000000792d2000 (ACPI data)
> [    0.000000]  user: 00000000792d2000 - 00000000792e7000 (reserved)
> [    0.000000]  user: 00000000792e7000 - 0000000079301000 (ACPI data)
> [    0.000000]  user: 0000000079301000 - 0000000079303000 (reserved)
> [    0.000000]  user: 0000000079303000 - 0000000079305000 (ACPI data)
> [    0.000000]  user: 0000000079305000 - 0000000079310000 (reserved)
> [    0.000000]  user: 0000000079310000 - 0000000079314000 (ACPI data)
> [    0.000000]  user: 0000000079314000 - 0000000079319000 (reserved)
> [    0.000000]  user: 0000000079319000 - 0000000079336000 (ACPI data)
> [    0.000000]  user: 0000000079336000 - 0000000079358000 (reserved)
> [    0.000000]  user: 0000000079358000 - 0000000079388000 (ACPI data)
> [    0.000000]  user: 0000000079388000 - 00000000793c9000 (reserved)
> [    0.000000]  user: 00000000793c9000 - 000000007968f000 (ACPI data)
> [    0.000000]  user: 000000007968f000 - 00000000796bb000 (reserved)
> [    0.000000]  user: 00000000796bb000 - 00000000799d8000 (ACPI data)
> [    0.000000]  user: 00000000799d8000 - 0000000079bd8000 (ACPI NVS)
> [    0.000000]  user: 0000000079bd8000 - 0000000079d87000 (ACPI data)
> [    0.000000]  user: 0000000079d87000 - 0000000079d8a000 (reserved)
> [    0.000000]  user: 0000000079d8a000 - 0000000079dca000 (ACPI data)
> [    0.000000]  user: 0000000079dca000 - 0000000079dcb000 (reserved)
> [    0.000000]  user: 0000000079dcb000 - 0000000079e1c000 (ACPI data)
> [    0.000000]  user: 0000000079e1c000 - 0000000079e87000 (reserved)
> [    0.000000]  user: 0000000079e87000 - 000000007bd5f000 (ACPI data)
> [    0.000000]  user: 000000007bd5f000 - 000000007be4f000 (reserved)
> [    0.000000]  user: 000000007be4f000 - 000000007bf87000 (ACPI data)
> [    0.000000]  user: 0000000100000000 - 0000001080000000 (usable)
> ...
> [    0.000000] SRAT: Node 0 PXM 0 0-80000000
> [    0.000000] SRAT: Node 0 PXM 0 100000000-480000000
> [    0.000000] SRAT: Node 2 PXM 1 480000000-880000000
> [    0.000000] SRAT: Node 1 PXM 2 880000000-c80000000
> [    0.000000] SRAT: Node 3 PXM 3 c80000000-1080000000
> [    0.000000] ACPI: [SRAT:0x01] ignored 16 entries of 32 found
> [    0.000000] NUMA: Using 31 for the hash shift.
> [    0.000000] SRAT: PXMs only cover 49035MB of your 65419MB e820 RAM. Not used.
> [    0.000000] SRAT: SRAT not used.
> [    0.000000] No NUMA configuration found
> 
> so SRAT is broken?
> 
>         if (max_entries && count > max_entries) {
>                 printk(KERN_WARNING PREFIX "[%4.4s:0x%02x] ignored %i entries of "
>                        "%i found\n", id, entry_id, count - max_entries, count);
>         }
> ...
> 
> or what is your CONFIG_NODES_SHIFT? 3? can you try to set it to 6?

Hmm funky, perhaps the BIOS changed that too. NUMA has otherwise been
working fine, didn't check whether it still did after a BIOS upgrade.
I'll try 6, it is set to 3 iirc.

> [   13.018720] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0x80000000-0x8fffffff] (base 0x80000000)
> [   13.100724] [Firmware Bug]: PCI: MMCONFIG at [mem 0x80000000-0x8fffffff] not reserved in ACPI motherboard resources
> [   13.112475] PCI: not using MMCONFIG
> [   13.206650] ACPI: No dock devices found.
> 
> so mmconf is not used...<ask BIOS fix it please!>

Reported, thanks.

> then we get 
> 
> [   13.990335] IOH bus: [00, 00]
> [   13.993707] IOH bus: 00 index 0 io port: [0, fff]
> [   13.999023] IOH bus: 00 index 1 mmio: [0, ffffff]
> [   14.004335] IOH bus: 00 index 2 mmio: [0, 3ffffff]
> 
> please check
> 
> [PATCH] x86/pci: intel ioh bus num reg accessing fix
> 
> it is above 0x100, so if mmconf is not enable, need to skip it

Will check that now.

-- 
Jens Axboe



* Re: kexec boot regression
  2009-12-15 18:39             ` Yinghai Lu
  2009-12-15 18:47               ` Matthew Wilcox
  2009-12-15 18:54               ` Jens Axboe
@ 2009-12-15 18:59               ` Jens Axboe
  2009-12-15 19:04                 ` Yinghai Lu
  2009-12-15 19:43               ` kexec boot regression Jens Axboe
  3 siblings, 1 reply; 42+ messages in thread
From: Jens Axboe @ 2009-12-15 18:59 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha, linux-pci

On Tue, Dec 15 2009, Yinghai Lu wrote:
> [   13.018720] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0x80000000-0x8fffffff] (base 0x80000000)
> 
> [   13.100724] [Firmware Bug]: PCI: MMCONFIG at [mem 0x80000000-0x8fffffff] not reserved in ACPI motherboard resources

On a "normal" non-kexec boot, I get:

[   12.173583] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0x80000000-0x8fffffff] (base 0x80000000)
[   12.184075] PCI: MMCONFIG at [mem 0x80000000-0x8fffffff] reserved in E820
[   12.216874] PCI: Using configuration type 1 for base access

-- 
Jens Axboe



* Re: kexec boot regression
  2009-12-15 18:59               ` Jens Axboe
@ 2009-12-15 19:04                 ` Yinghai Lu
  2009-12-15 19:11                   ` Jens Axboe
  2009-12-15 21:30                   ` Markus Trippelsdorf
  0 siblings, 2 replies; 42+ messages in thread
From: Yinghai Lu @ 2009-12-15 19:04 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha, linux-pci

Jens Axboe wrote:
> On Tue, Dec 15 2009, Yinghai Lu wrote:
>> [   13.018720] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0x80000000-0x8fffffff] (base 0x80000000)
>>
>> [   13.100724] [Firmware Bug]: PCI: MMCONFIG at [mem 0x80000000-0x8fffffff] not reserved in ACPI motherboard resources
> 
> On a "normal" non-kexec boot, I get:
> 
> [   12.173583] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0x80000000-0x8fffffff] (base 0x80000000)
> [   12.184075] PCI: MMCONFIG at [mem 0x80000000-0x8fffffff] reserved in E820
> [   12.216874] PCI: Using configuration type 1 for base access
> 

Can you run the following script in the first kernel?

cd /sys/firmware/memmap
for dir in * ; do
  start=$(cat "$dir/start")
  end=$(cat "$dir/end")
  type=$(cat "$dir/type")
  # "end" is inclusive in sysfs; print an exclusive end, as the boot log does
  printf "%016x-%016x (%s)\n" "$start" "$(( end + 1 ))" "$type" >> /tmp/memmap.txt
done

and send out /tmp/memmap.txt

What is your kexec-tools version? Could it be too old?

YH



* Re: kexec boot regression
  2009-12-15 19:04                 ` Yinghai Lu
@ 2009-12-15 19:11                   ` Jens Axboe
  2009-12-15 19:17                     ` Yinghai Lu
  2009-12-15 19:44                     ` Yinghai Lu
  2009-12-15 21:30                   ` Markus Trippelsdorf
  1 sibling, 2 replies; 42+ messages in thread
From: Jens Axboe @ 2009-12-15 19:11 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha, linux-pci

On Tue, Dec 15 2009, Yinghai Lu wrote:
> Jens Axboe wrote:
> > On Tue, Dec 15 2009, Yinghai Lu wrote:
> >> [   13.018720] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0x80000000-0x8fffffff] (base 0x80000000)
> >>
> >> [   13.100724] [Firmware Bug]: PCI: MMCONFIG at [mem 0x80000000-0x8fffffff] not reserved in ACPI motherboard resources
> > 
> > On a "normal" non-kexec boot, I get:
> > 
> > [   12.173583] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0x80000000-0x8fffffff] (base 0x80000000)
> > [   12.184075] PCI: MMCONFIG at [mem 0x80000000-0x8fffffff] reserved in E820
> > [   12.216874] PCI: Using configuration type 1 for base access
> > 
> 
> can you run following scripts in first kernel?
> 
> cd /sys/firmware/memmap
> for dir in * ; do
>   start=$(cat $dir/start)
>   end=$(cat $dir/end)
>   type=$(cat $dir/type)
>   printf "%016x-%016x (%s)\n" $start $[ $end +1] "$type" >> /tmp/memmap.txt
> done
> 
> and send out /tmp/memmap.txt

Below.

> what is your kexec tools version? could be too old?

It says:

kexec-tools-testing 20080324 released 24th March 2008


0000000000000000-0000000000098800 (System RAM)
0000000000098800-00000000000a0000 (reserved)
0000000079301000-0000000079303000 (reserved)
0000000079303000-0000000079305000 (ACPI Tables)
0000000079305000-0000000079310000 (reserved)
0000000079310000-0000000079314000 (ACPI Tables)
0000000079314000-0000000079319000 (reserved)
0000000079319000-0000000079336000 (ACPI Tables)
0000000079336000-0000000079358000 (reserved)
0000000079358000-0000000079388000 (ACPI Tables)
0000000079388000-00000000793c9000 (reserved)
00000000793c9000-000000007968f000 (ACPI Tables)
00000000000e0000-0000000000100000 (reserved)
000000007968f000-00000000796bb000 (reserved)
00000000796bb000-00000000799d8000 (ACPI Tables)
00000000799d8000-0000000079bd8000 (ACPI Non-volatile Storage)
0000000079bd8000-0000000079d8b000 (ACPI Tables)
0000000079d8b000-0000000079d8c000 (reserved)
0000000079d8c000-0000000079dc8000 (ACPI Tables)
0000000079dc8000-0000000079dcb000 (reserved)
0000000079dcb000-0000000079e1c000 (ACPI Tables)
0000000079e1c000-0000000079e87000 (reserved)
0000000079e87000-000000007bd5f000 (ACPI Tables)
0000000000100000-0000000078c59000 (System RAM)
000000007bd5f000-000000007be4f000 (reserved)
000000007be4f000-000000007bf87000 (ACPI Tables)
000000007bf87000-000000007bfcf000 (ACPI Non-volatile Storage)
000000007bfcf000-000000007bfff000 (ACPI Tables)
000000007bfff000-0000000090000000 (reserved)
00000000fc000000-00000000fd000000 (reserved)
00000000fed1c000-00000000fed20000 (reserved)
00000000ff000000-0000000100000000 (reserved)
0000000100000000-0000001080000000 (System RAM)
0000000078c59000-0000000078e6d000 (ACPI Non-volatile Storage)
0000000078e6d000-000000007924e000 (ACPI Tables)
000000007924e000-00000000792c2000 (reserved)
00000000792c2000-00000000792d2000 (ACPI Tables)
00000000792d2000-00000000792e7000 (reserved)
00000000792e7000-0000000079301000 (ACPI Tables)
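
One way to compare this listing with the kexec'd kernel's "user-defined physical RAM map" is to ask whether any single entry covers the MMCONFIG window [mem 0x80000000-0x8fffffff]. The sketch below is an illustration only: covering_entry is a made-up helper, and the two sample inputs are a few lines copied from the listing above and from the kexec boot log (the latter rewritten into the same "start-end (type)" form, with exclusive ends).

```shell
#!/bin/sh
# covering_entry START END: hypothetical helper. Reads "start-end (type)"
# lines on stdin (hex without 0x, end exclusive) and prints the first entry
# that fully covers [START, END). Returns 1 if no single entry does.
covering_entry() {
  want_start=$(( $1 ))
  want_end=$(( $2 ))
  while IFS= read -r line; do
    range=${line%% *}              # "start-end" part before the first space
    s=$(( 0x${range%-*} ))
    e=$(( 0x${range#*-} ))
    if [ "$s" -le "$want_start" ] && [ "$e" -ge "$want_end" ]; then
      echo "$line"
      return 0
    fi
  done
  return 1
}

# Three entries copied from the first kernel's /sys/firmware/memmap listing.
first_kernel='000000007bfcf000-000000007bfff000 (ACPI Tables)
000000007bfff000-0000000090000000 (reserved)
00000000fc000000-00000000fd000000 (reserved)'

# Last three entries of the kexec'd kernel's user-defined RAM map.
kexec_kernel='000000007bd5f000-000000007be4f000 (reserved)
000000007be4f000-000000007bf87000 (ACPI data)
0000000100000000-0000001080000000 (usable)'

printf '%s\n' "$first_kernel" | covering_entry 0x80000000 0x90000000 \
  || echo "first kernel: MMCONFIG window not covered"
printf '%s\n' "$kexec_kernel" | covering_entry 0x80000000 0x90000000 \
  || echo "kexec'd kernel: MMCONFIG window not covered"
```

On these samples the first kernel's map has a reserved entry spanning the window, while the map passed to the kexec'd kernel appears to stop at 0x7bf87000 and resume at 0x100000000, which would be consistent with the "not reserved" firmware-bug message seen only after kexec.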

-- 
Jens Axboe



* Re: kexec boot regression
  2009-12-15 19:11                   ` Jens Axboe
@ 2009-12-15 19:17                     ` Yinghai Lu
  2009-12-15 19:22                       ` Jens Axboe
  2009-12-15 19:44                     ` Yinghai Lu
  1 sibling, 1 reply; 42+ messages in thread
From: Yinghai Lu @ 2009-12-15 19:17 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha, linux-pci

Jens Axboe wrote:
> On Tue, Dec 15 2009, Yinghai Lu wrote:
>> Jens Axboe wrote:
>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
>>>> [   13.018720] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0x80000000-0x8fffffff] (base 0x80000000)
>>>>
>>>> [   13.100724] [Firmware Bug]: PCI: MMCONFIG at [mem 0x80000000-0x8fffffff] not reserved in ACPI motherboard resources
>>> On a "normal" non-kexec boot, I get:
>>>
>>> [   12.173583] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0x80000000-0x8fffffff] (base 0x80000000)
>>> [   12.184075] PCI: MMCONFIG at [mem 0x80000000-0x8fffffff] reserved in E820
>>> [   12.216874] PCI: Using configuration type 1 for base access
>>>
>> can you run following scripts in first kernel?
>>
>> cd /sys/firmware/memmap
>> for dir in * ; do
>>   start=$(cat $dir/start)
>>   end=$(cat $dir/end)
>>   type=$(cat $dir/type)
>>   printf "%016x-%016x (%s)\n" $start $[ $end +1] "$type" >> /tmp/memmap.txt
>> done
>>
>> and send out /tmp/memmap.txt
> 
> Below.
> 
>> what is your kexec tools version? could be too old?
> 
> It says:
> 
> kexec-tools-testing 20080324 released 24th March 2008
> 
> 
> 0000000000000000-0000000000098800 (System RAM)
> 0000000000098800-00000000000a0000 (reserved)
> 0000000079301000-0000000079303000 (reserved)
> 0000000079303000-0000000079305000 (ACPI Tables)
> 0000000079305000-0000000079310000 (reserved)
> 0000000079310000-0000000079314000 (ACPI Tables)
> 0000000079314000-0000000079319000 (reserved)
> 0000000079319000-0000000079336000 (ACPI Tables)
> 0000000079336000-0000000079358000 (reserved)
> 0000000079358000-0000000079388000 (ACPI Tables)
> 0000000079388000-00000000793c9000 (reserved)
> 00000000793c9000-000000007968f000 (ACPI Tables)
> 00000000000e0000-0000000000100000 (reserved)
> 000000007968f000-00000000796bb000 (reserved)
> 00000000796bb000-00000000799d8000 (ACPI Tables)
> 00000000799d8000-0000000079bd8000 (ACPI Non-volatile Storage)
> 0000000079bd8000-0000000079d8b000 (ACPI Tables)
> 0000000079d8b000-0000000079d8c000 (reserved)
> 0000000079d8c000-0000000079dc8000 (ACPI Tables)
> 0000000079dc8000-0000000079dcb000 (reserved)
> 0000000079dcb000-0000000079e1c000 (ACPI Tables)
> 0000000079e1c000-0000000079e87000 (reserved)
> 0000000079e87000-000000007bd5f000 (ACPI Tables)
> 0000000000100000-0000000078c59000 (System RAM)
> 000000007bd5f000-000000007be4f000 (reserved)
> 000000007be4f000-000000007bf87000 (ACPI Tables)
> 000000007bf87000-000000007bfcf000 (ACPI Non-volatile Storage)
> 000000007bfcf000-000000007bfff000 (ACPI Tables)
> 000000007bfff000-0000000090000000 (reserved)
> 00000000fc000000-00000000fd000000 (reserved)
> 00000000fed1c000-00000000fed20000 (reserved)
> 00000000ff000000-0000000100000000 (reserved)
> 0000000100000000-0000001080000000 (System RAM)
> 0000000078c59000-0000000078e6d000 (ACPI Non-volatile Storage)
> 0000000078e6d000-000000007924e000 (ACPI Tables)
> 000000007924e000-00000000792c2000 (reserved)
> 00000000792c2000-00000000792d2000 (ACPI Tables)
> 00000000792d2000-00000000792e7000 (reserved)
> 00000000792e7000-0000000079301000 (ACPI Tables)
> 

boot log of first kernel?

YH


* Re: kexec boot regression
  2009-12-15 19:17                     ` Yinghai Lu
@ 2009-12-15 19:22                       ` Jens Axboe
  2009-12-15 19:28                         ` Jens Axboe
  0 siblings, 1 reply; 42+ messages in thread
From: Jens Axboe @ 2009-12-15 19:22 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha, linux-pci

On Tue, Dec 15 2009, Yinghai Lu wrote:
> Jens Axboe wrote:
> > On Tue, Dec 15 2009, Yinghai Lu wrote:
> >> Jens Axboe wrote:
> >>> On Tue, Dec 15 2009, Yinghai Lu wrote:
> >>>> [   13.018720] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0x80000000-0x8fffffff] (base 0x80000000)
> >>>>
> >>>> [   13.100724] [Firmware Bug]: PCI: MMCONFIG at [mem 0x80000000-0x8fffffff] not reserved in ACPI motherboard resources
> >>> On a "normal" non-kexec boot, I get:
> >>>
> >>> [   12.173583] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0x80000000-0x8fffffff] (base 0x80000000)
> >>> [   12.184075] PCI: MMCONFIG at [mem 0x80000000-0x8fffffff] reserved in E820
> >>> [   12.216874] PCI: Using configuration type 1 for base access
> >>>
> >> can you run following scripts in first kernel?
> >>
> >> cd /sys/firmware/memmap
> >> for dir in * ; do
> >>   start=$(cat $dir/start)
> >>   end=$(cat $dir/end)
> >>   type=$(cat $dir/type)
> >>   printf "%016x-%016x (%s)\n" $start $[ $end +1] "$type" >> /tmp/memmap.txt
> >> done
> >>
> >> and send out /tmp/memmap.txt
> > 
> > Below.
> > 
> >> what is your kexec tools version? could be too old?
> > 
> > It says:
> > 
> > kexec-tools-testing 20080324 released 24th March 2008
> > 
> > 
> > 0000000000000000-0000000000098800 (System RAM)
> > 0000000000098800-00000000000a0000 (reserved)
> > 0000000079301000-0000000079303000 (reserved)
> > 0000000079303000-0000000079305000 (ACPI Tables)
> > 0000000079305000-0000000079310000 (reserved)
> > 0000000079310000-0000000079314000 (ACPI Tables)
> > 0000000079314000-0000000079319000 (reserved)
> > 0000000079319000-0000000079336000 (ACPI Tables)
> > 0000000079336000-0000000079358000 (reserved)
> > 0000000079358000-0000000079388000 (ACPI Tables)
> > 0000000079388000-00000000793c9000 (reserved)
> > 00000000793c9000-000000007968f000 (ACPI Tables)
> > 00000000000e0000-0000000000100000 (reserved)
> > 000000007968f000-00000000796bb000 (reserved)
> > 00000000796bb000-00000000799d8000 (ACPI Tables)
> > 00000000799d8000-0000000079bd8000 (ACPI Non-volatile Storage)
> > 0000000079bd8000-0000000079d8b000 (ACPI Tables)
> > 0000000079d8b000-0000000079d8c000 (reserved)
> > 0000000079d8c000-0000000079dc8000 (ACPI Tables)
> > 0000000079dc8000-0000000079dcb000 (reserved)
> > 0000000079dcb000-0000000079e1c000 (ACPI Tables)
> > 0000000079e1c000-0000000079e87000 (reserved)
> > 0000000079e87000-000000007bd5f000 (ACPI Tables)
> > 0000000000100000-0000000078c59000 (System RAM)
> > 000000007bd5f000-000000007be4f000 (reserved)
> > 000000007be4f000-000000007bf87000 (ACPI Tables)
> > 000000007bf87000-000000007bfcf000 (ACPI Non-volatile Storage)
> > 000000007bfcf000-000000007bfff000 (ACPI Tables)
> > 000000007bfff000-0000000090000000 (reserved)
> > 00000000fc000000-00000000fd000000 (reserved)
> > 00000000fed1c000-00000000fed20000 (reserved)
> > 00000000ff000000-0000000100000000 (reserved)
> > 0000000100000000-0000001080000000 (System RAM)
> > 0000000078c59000-0000000078e6d000 (ACPI Non-volatile Storage)
> > 0000000078e6d000-000000007924e000 (ACPI Tables)
> > 000000007924e000-00000000792c2000 (reserved)
> > 00000000792c2000-00000000792d2000 (ACPI Tables)
> > 00000000792d2000-00000000792e7000 (reserved)
> > 00000000792e7000-0000000079301000 (ACPI Tables)
> > 
> 
> boot log of first kernel?

Hmm not completely sure, let me re-do it after a cold boot.

BTW, I just checked, and 2.6.32 has NUMA working fine. Below is the SRAT
and NUMA output from 2.6.32 (kexec'ed kernel). Is the check a newly
introduced one?

[    0.000000] SRAT: PXM 0 -> APIC 0 -> Node 0
[    0.000000] SRAT: PXM 2 -> APIC 64 -> Node 1
[    0.000000] SRAT: PXM 1 -> APIC 32 -> Node 2
[    0.000000] SRAT: PXM 3 -> APIC 96 -> Node 3
[    0.000000] SRAT: PXM 0 -> APIC 2 -> Node 0
[    0.000000] SRAT: PXM 2 -> APIC 66 -> Node 1
[    0.000000] SRAT: PXM 1 -> APIC 34 -> Node 2
[    0.000000] SRAT: PXM 3 -> APIC 98 -> Node 3
[    0.000000] SRAT: PXM 0 -> APIC 4 -> Node 0
[    0.000000] SRAT: PXM 2 -> APIC 68 -> Node 1
[    0.000000] SRAT: PXM 1 -> APIC 36 -> Node 2
[    0.000000] SRAT: PXM 3 -> APIC 100 -> Node 3
[    0.000000] SRAT: PXM 0 -> APIC 6 -> Node 0
[    0.000000] SRAT: PXM 2 -> APIC 70 -> Node 1
[    0.000000] SRAT: PXM 1 -> APIC 38 -> Node 2
[    0.000000] SRAT: PXM 3 -> APIC 102 -> Node 3
[    0.000000] SRAT: PXM 0 -> APIC 16 -> Node 0
[    0.000000] SRAT: PXM 2 -> APIC 80 -> Node 1
[    0.000000] SRAT: PXM 1 -> APIC 48 -> Node 2
[    0.000000] SRAT: PXM 3 -> APIC 112 -> Node 3
[    0.000000] SRAT: PXM 0 -> APIC 18 -> Node 0
[    0.000000] SRAT: PXM 2 -> APIC 82 -> Node 1
[    0.000000] SRAT: PXM 1 -> APIC 50 -> Node 2
[    0.000000] SRAT: PXM 3 -> APIC 114 -> Node 3
[    0.000000] SRAT: PXM 0 -> APIC 20 -> Node 0
[    0.000000] SRAT: PXM 2 -> APIC 84 -> Node 1
[    0.000000] SRAT: PXM 1 -> APIC 52 -> Node 2
[    0.000000] SRAT: PXM 3 -> APIC 116 -> Node 3
[    0.000000] SRAT: PXM 0 -> APIC 22 -> Node 0
[    0.000000] SRAT: PXM 2 -> APIC 86 -> Node 1
[    0.000000] SRAT: PXM 1 -> APIC 54 -> Node 2
[    0.000000] SRAT: PXM 3 -> APIC 118 -> Node 3
[    0.000000] SRAT: PXM 0 -> APIC 1 -> Node 0
[    0.000000] SRAT: PXM 2 -> APIC 65 -> Node 1
[    0.000000] SRAT: PXM 1 -> APIC 33 -> Node 2
[    0.000000] SRAT: PXM 3 -> APIC 97 -> Node 3
[    0.000000] SRAT: PXM 0 -> APIC 3 -> Node 0
[    0.000000] SRAT: PXM 2 -> APIC 67 -> Node 1
[    0.000000] SRAT: PXM 1 -> APIC 35 -> Node 2
[    0.000000] SRAT: PXM 3 -> APIC 99 -> Node 3
[    0.000000] SRAT: PXM 0 -> APIC 5 -> Node 0
[    0.000000] SRAT: PXM 2 -> APIC 69 -> Node 1
[    0.000000] SRAT: PXM 1 -> APIC 37 -> Node 2
[    0.000000] SRAT: PXM 3 -> APIC 101 -> Node 3
[    0.000000] SRAT: PXM 0 -> APIC 7 -> Node 0
[    0.000000] SRAT: PXM 2 -> APIC 71 -> Node 1
[    0.000000] SRAT: PXM 1 -> APIC 39 -> Node 2
[    0.000000] SRAT: PXM 3 -> APIC 103 -> Node 3
[    0.000000] SRAT: PXM 0 -> APIC 17 -> Node 0
[    0.000000] SRAT: PXM 2 -> APIC 81 -> Node 1
[    0.000000] SRAT: PXM 1 -> APIC 49 -> Node 2
[    0.000000] SRAT: PXM 3 -> APIC 113 -> Node 3
[    0.000000] SRAT: PXM 0 -> APIC 19 -> Node 0
[    0.000000] SRAT: PXM 2 -> APIC 83 -> Node 1
[    0.000000] SRAT: PXM 1 -> APIC 51 -> Node 2
[    0.000000] SRAT: PXM 3 -> APIC 115 -> Node 3
[    0.000000] SRAT: PXM 0 -> APIC 21 -> Node 0
[    0.000000] SRAT: PXM 2 -> APIC 85 -> Node 1
[    0.000000] SRAT: PXM 1 -> APIC 53 -> Node 2
[    0.000000] SRAT: PXM 3 -> APIC 117 -> Node 3
[    0.000000] SRAT: PXM 0 -> APIC 23 -> Node 0
[    0.000000] SRAT: PXM 2 -> APIC 87 -> Node 1
[    0.000000] SRAT: PXM 1 -> APIC 55 -> Node 2
[    0.000000] SRAT: PXM 3 -> APIC 119 -> Node 3
[    0.000000] SRAT: Node 0 PXM 0 0-80000000
[    0.000000] SRAT: Node 0 PXM 0 100000000-480000000
[    0.000000] SRAT: Node 2 PXM 1 480000000-880000000
[    0.000000] SRAT: Node 1 PXM 2 880000000-c80000000
[    0.000000] SRAT: Node 3 PXM 3 c80000000-1080000000
[    0.000000] NUMA: Using 31 for the hash shift.
[    0.000000] Bootmem setup node 0 0000000000000000-0000000480000000
[    0.000000]   NODE_DATA [0000000000048000 - 000000000004cfff]
[    0.000000]   bootmap [0000000000100000 -  000000000018ffff] pages 90
[    0.000000] (8 early reservations) ==> bootmem [0000000000 - 0480000000]
[    0.000000]   #0 [0000000000 - 0000001000]   BIOS data page ==> [0000000000 - 0000001000]
[    0.000000]   #1 [0000006000 - 0000008000]       TRAMPOLINE ==> [0000006000 - 0000008000]
[    0.000000]   #2 [0001000000 - 000200f260]    TEXT DATA BSS ==> [0001000000 - 000200f260]
[    0.000000]   #3 [0000098800 - 0000100000]    BIOS reserved ==> [0000098800 - 0000100000]
[    0.000000]   #4 [0002010000 - 000201035c]              BRK ==> [0002010000 - 000201035c]
[    0.000000]   #5 [0000008000 - 000000a000]          PGTABLE ==> [0000008000 - 000000a000]
[    0.000000]   #6 [000000a000 - 0000048000]          PGTABLE ==> [000000a000 - 0000048000]
[    0.000000]   #7 [0000001000 - 000000103c]        ACPI SLIT ==> [0000001000 - 000000103c]
[    0.000000] Bootmem setup node 1 0000000880000000-0000000c80000000
[    0.000000]   NODE_DATA [0000000880000000 - 0000000880004fff]
[    0.000000]   bootmap [0000000880005000 -  0000000880084fff] pages 80
[    0.000000] (8 early reservations) ==> bootmem [0880000000 - 0c80000000]
[    0.000000]   #0 [0000000000 - 0000001000]   BIOS data page
[    0.000000]   #1 [0000006000 - 0000008000]       TRAMPOLINE
[    0.000000]   #2 [0001000000 - 000200f260]    TEXT DATA BSS
[    0.000000]   #3 [0000098800 - 0000100000]    BIOS reserved
[    0.000000]   #4 [0002010000 - 000201035c]              BRK
[    0.000000]   #5 [0000008000 - 000000a000]          PGTABLE
[    0.000000]   #6 [000000a000 - 0000048000]          PGTABLE
[    0.000000]   #7 [0000001000 - 000000103c]        ACPI SLIT
[    0.000000] Bootmem setup node 2 0000000480000000-0000000880000000
[    0.000000]   NODE_DATA [0000000480000000 - 0000000480004fff]
[    0.000000]   bootmap [0000000480005000 -  0000000480084fff] pages 80
[    0.000000] (8 early reservations) ==> bootmem [0480000000 - 0880000000]
[    0.000000]   #0 [0000000000 - 0000001000]   BIOS data page
[    0.000000]   #1 [0000006000 - 0000008000]       TRAMPOLINE
[    0.000000]   #2 [0001000000 - 000200f260]    TEXT DATA BSS
[    0.000000]   #3 [0000098800 - 0000100000]    BIOS reserved
[    0.000000]   #4 [0002010000 - 000201035c]              BRK
[    0.000000]   #5 [0000008000 - 000000a000]          PGTABLE
[    0.000000]   #6 [000000a000 - 0000048000]          PGTABLE
[    0.000000]   #7 [0000001000 - 000000103c]        ACPI SLIT
[    0.000000] Bootmem setup node 3 0000000c80000000-0000001080000000
[    0.000000]   NODE_DATA [0000000c80000000 - 0000000c80004fff]
[    0.000000]   bootmap [0000000c80005000 -  0000000c80084fff] pages 80
[    0.000000] (8 early reservations) ==> bootmem [0c80000000 - 1080000000]
[    0.000000]   #0 [0000000000 - 0000001000]   BIOS data page
[    0.000000]   #1 [0000006000 - 0000008000]       TRAMPOLINE
[    0.000000]   #2 [0001000000 - 000200f260]    TEXT DATA BSS
[    0.000000]   #3 [0000098800 - 0000100000]    BIOS reserved
[    0.000000]   #4 [0002010000 - 000201035c]              BRK
[    0.000000]   #5 [0000008000 - 000000a000]          PGTABLE
[    0.000000]   #6 [000000a000 - 0000048000]          PGTABLE
[    0.000000]   #7 [0000001000 - 000000103c]        ACPI SLIT
[    0.000000] found SMP MP-table at [ffff8800000fddb0] fddb0
[    0.000000]  [ffffea0000000000-ffffea001d3fffff] PMD -> [ffff880028600000-ffff8800425fffff] on node 0
[    0.000000]  [ffffea001d400000-ffffea00373fffff] PMD -> [ffff880480200000-ffff88049a1fffff] on node 2
[    0.000000]  [ffffea0037400000-ffffea003fffffff] PMD -> [ffff880880200000-ffff880888dfffff] on node 1
[    0.000000]  [ffffea0040000000-ffffea00513fffff] PMD -> [ffff880889000000-ffff88089a3fffff] on node 1
[    0.000000]  [ffffea0051400000-ffffea006b3fffff] PMD -> [ffff880c80200000-ffff880c9a1fffff] on node 3
[    0.000000] Zone PFN ranges:
[    0.000000]   DMA      0x00000001 -> 0x00001000
[    0.000000]   DMA32    0x00001000 -> 0x00100000
[    0.000000]   Normal   0x00100000 -> 0x01080000
[    0.000000] Movable zone start PFN for each node
[    0.000000] early_node_map[6] active PFN ranges
[    0.000000]     0: 0x00000001 -> 0x00000098
[    0.000000]     0: 0x00000100 -> 0x00078c59
[    0.000000]     0: 0x00100000 -> 0x00480000
[    0.000000]     2: 0x00480000 -> 0x00880000
[    0.000000]     1: 0x00880000 -> 0x00c80000
[    0.000000]     3: 0x00c80000 -> 0x01080000
[    0.000000] On node 0 totalpages: 4164592
[    0.000000]   DMA zone: 104 pages used for memmap
[    0.000000]   DMA zone: 185 pages reserved
[    0.000000]   DMA zone: 3702 pages, LIFO batch:0
[    0.000000]   DMA32 zone: 26520 pages used for memmap
[    0.000000]   DMA32 zone: 464065 pages, LIFO batch:31
[    0.000000]   Normal zone: 93184 pages used for memmap
[    0.000000]   Normal zone: 3576832 pages, LIFO batch:31
[    0.000000] On node 1 totalpages: 4194304
[    0.000000]   Normal zone: 106496 pages used for memmap
[    0.000000]   Normal zone: 4087808 pages, LIFO batch:31
[    0.000000] On node 2 totalpages: 4194304
[    0.000000]   Normal zone: 106496 pages used for memmap
[    0.000000]   Normal zone: 4087808 pages, LIFO batch:31
[    0.000000] On node 3 totalpages: 4194304
[    0.000000]   Normal zone: 106496 pages used for memmap
[    0.000000]   Normal zone: 4087808 pages, LIFO batch:31
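
As a sanity check on the output above, the per-node page counts line up with the SRAT ranges: node 1 spans 880000000-c80000000, which is 0x400000000 bytes, or 4194304 pages. A small sketch of that arithmetic (the helper name is hypothetical, and the 4 KiB page size is an assumption, the usual x86-64 base page size):

```sh
# Number of pages covered by the physical range [start, end),
# assuming 4 KiB pages. Arguments are numbers with a 0x prefix.
range_pages() {
    echo $(( ($2 - $1) / 4096 ))
}

# Node 1 range taken from the "SRAT: Node 1 PXM 2" line above.
range_pages 0x880000000 0xc80000000
```

This prints 4194304, matching "On node 1 totalpages". Node 0 reports fewer pages (4164592), presumably because its ranges contain the reserved and ACPI holes listed in the memory map.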

-- 
Jens Axboe



* Re: kexec boot regression
  2009-12-15 19:22                       ` Jens Axboe
@ 2009-12-15 19:28                         ` Jens Axboe
  0 siblings, 0 replies; 42+ messages in thread
From: Jens Axboe @ 2009-12-15 19:28 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha, linux-pci

On Tue, Dec 15 2009, Jens Axboe wrote:
> > boot log of first kernel?
> 
> Hmm not completely sure, let me re-do it after a cold boot.

This is from a cold boot of 2.6.32.

0000000000000000-0000000000098800 (System RAM)
0000000000098800-00000000000a0000 (reserved)
0000000079301000-0000000079303000 (reserved)
0000000079303000-0000000079305000 (ACPI Tables)
0000000079305000-0000000079310000 (reserved)
0000000079310000-0000000079314000 (ACPI Tables)
0000000079314000-0000000079319000 (reserved)
0000000079319000-0000000079336000 (ACPI Tables)
0000000079336000-0000000079358000 (reserved)
0000000079358000-0000000079388000 (ACPI Tables)
0000000079388000-00000000793c9000 (reserved)
00000000793c9000-000000007968f000 (ACPI Tables)
00000000000e0000-0000000000100000 (reserved)
000000007968f000-00000000796bb000 (reserved)
00000000796bb000-00000000799d8000 (ACPI Tables)
00000000799d8000-0000000079bd8000 (ACPI Non-volatile Storage)
0000000079bd8000-0000000079d87000 (ACPI Tables)
0000000079d87000-0000000079d8a000 (reserved)
0000000079d8a000-0000000079dca000 (ACPI Tables)
0000000079dca000-0000000079dcb000 (reserved)
0000000079dcb000-0000000079e1c000 (ACPI Tables)
0000000079e1c000-0000000079e87000 (reserved)
0000000079e87000-000000007bd5f000 (ACPI Tables)
0000000000100000-0000000078c63000 (System RAM)
000000007bd5f000-000000007be4f000 (reserved)
000000007be4f000-000000007bf87000 (ACPI Tables)
000000007bf87000-000000007bfcf000 (ACPI Non-volatile Storage)
000000007bfcf000-000000007bfff000 (ACPI Tables)
000000007bfff000-0000000090000000 (reserved)
00000000fc000000-00000000fd000000 (reserved)
00000000fed1c000-00000000fed20000 (reserved)
00000000ff000000-0000000100000000 (reserved)
0000000100000000-0000001080000000 (System RAM)
0000000078c63000-0000000078e77000 (ACPI Non-volatile Storage)
0000000078e77000-000000007924e000 (ACPI Tables)
000000007924e000-00000000792c2000 (reserved)
00000000792c2000-00000000792d2000 (ACPI Tables)
00000000792d2000-00000000792e7000 (reserved)
00000000792e7000-0000000079301000 (ACPI Tables)
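
This cold-boot listing differs from the earlier one in a handful of entries (for example, the low System RAM range ends at 78c63000 here versus 78c59000 before). A rough sketch for listing such differences mechanically, assuming the two listings were saved to separate files (the function and file names are hypothetical):

```sh
# Print entries that appear in only one of two memmap listings.
# Lines found only in the first file are prefixed "< ",
# lines found only in the second file are prefixed "> ".
memmap_diff() {
    a=$(mktemp) && b=$(mktemp) || return 1
    # comm needs sorted input; fixed-width hex sorts fine as bytes.
    LC_ALL=C sort "$1" > "$a"
    LC_ALL=C sort "$2" > "$b"
    LC_ALL=C comm -23 "$a" "$b" | sed 's/^/< /'
    LC_ALL=C comm -13 "$a" "$b" | sed 's/^/> /'
    rm -f "$a" "$b"
}

# Usage (hypothetical file names):
#   memmap_diff memmap-kexec.txt memmap-coldboot.txt
```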

-- 
Jens Axboe



* Re: kexec boot regression
  2009-12-15 18:39             ` Yinghai Lu
                                 ` (2 preceding siblings ...)
  2009-12-15 18:59               ` Jens Axboe
@ 2009-12-15 19:43               ` Jens Axboe
  2009-12-15 19:48                 ` Yinghai Lu
  3 siblings, 1 reply; 42+ messages in thread
From: Jens Axboe @ 2009-12-15 19:43 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha, linux-pci

On Tue, Dec 15 2009, Yinghai Lu wrote:
> [PATCH] x86/pci: intel ioh bus num reg accessing fix
> 
> it is above 0x100, so if mmconf is not enable, need to skip it

This works, it kexecs kernels fine. But since 2.6.32 doesn't have the
mmconf problem to begin with, are we now just working around the issue?
SRAT still reports issues, numa doesn't work.

-- 
Jens Axboe



* Re: kexec boot regression
  2009-12-15 19:11                   ` Jens Axboe
  2009-12-15 19:17                     ` Yinghai Lu
@ 2009-12-15 19:44                     ` Yinghai Lu
  2009-12-15 19:48                       ` Jens Axboe
  1 sibling, 1 reply; 42+ messages in thread
From: Yinghai Lu @ 2009-12-15 19:44 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha, linux-pci

Jens Axboe wrote:
> On Tue, Dec 15 2009, Yinghai Lu wrote:
>> Jens Axboe wrote:
>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
>>>> [   13.018720] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0x80000000-0x8fffffff] (base 0x80000000)
>>>>
>>>> [   13.100724] [Firmware Bug]: PCI: MMCONFIG at [mem 0x80000000-0x8fffffff] not reserved in ACPI motherboard resources
>>> On a "normal" non-kexec boot, I get:
>>>
>>> [   12.173583] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0x80000000-0x8fffffff] (base 0x80000000)
>>> [   12.184075] PCI: MMCONFIG at [mem 0x80000000-0x8fffffff] reserved in E820
>>> [   12.216874] PCI: Using configuration type 1 for base access
>>>
>> can you run following scripts in first kernel?
>>
>> cd /sys/firmware/memmap
>> for dir in * ; do
>>   start=$(cat $dir/start)
>>   end=$(cat $dir/end)
>>   type=$(cat $dir/type)
>>   printf "%016x-%016x (%s)\n" $start $[ $end +1] "$type" >> /tmp/memmap.txt
>> done
>>
>> and send out /tmp/memmap.txt
> 
> Below.
> 
>> what is your kexec tools version? could be too old?
> 
> It says:
> 
> kexec-tools-testing 20080324 released 24th March 2008
> 
> 
> 0000000000000000-0000000000098800 (System RAM)
> 0000000000098800-00000000000a0000 (reserved)
> 0000000079301000-0000000079303000 (reserved)
> 0000000079303000-0000000079305000 (ACPI Tables)
> 0000000079305000-0000000079310000 (reserved)
> 0000000079310000-0000000079314000 (ACPI Tables)
> 0000000079314000-0000000079319000 (reserved)
> 0000000079319000-0000000079336000 (ACPI Tables)
> 0000000079336000-0000000079358000 (reserved)
> 0000000079358000-0000000079388000 (ACPI Tables)
> 0000000079388000-00000000793c9000 (reserved)
> 00000000793c9000-000000007968f000 (ACPI Tables)
> 00000000000e0000-0000000000100000 (reserved)
> 000000007968f000-00000000796bb000 (reserved)
> 00000000796bb000-00000000799d8000 (ACPI Tables)
> 00000000799d8000-0000000079bd8000 (ACPI Non-volatile Storage)
> 0000000079bd8000-0000000079d8b000 (ACPI Tables)
> 0000000079d8b000-0000000079d8c000 (reserved)
> 0000000079d8c000-0000000079dc8000 (ACPI Tables)
> 0000000079dc8000-0000000079dcb000 (reserved)
> 0000000079dcb000-0000000079e1c000 (ACPI Tables)
> 0000000079e1c000-0000000079e87000 (reserved)
> 0000000079e87000-000000007bd5f000 (ACPI Tables)
> 0000000000100000-0000000078c59000 (System RAM)
> 000000007bd5f000-000000007be4f000 (reserved)
> 000000007be4f000-000000007bf87000 (ACPI Tables)

So the following ranges are not passed to the second kernel by kexec?

> 000000007bf87000-000000007bfcf000 (ACPI Non-volatile Storage)
> 000000007bfcf000-000000007bfff000 (ACPI Tables)
> 000000007bfff000-0000000090000000 (reserved)
> 00000000fc000000-00000000fd000000 (reserved)
> 00000000fed1c000-00000000fed20000 (reserved)
> 00000000ff000000-0000000100000000 (reserved)
> 0000000100000000-0000001080000000 (System RAM)
> 0000000078c59000-0000000078e6d000 (ACPI Non-volatile Storage)
> 0000000078e6d000-000000007924e000 (ACPI Tables)
> 000000007924e000-00000000792c2000 (reserved)
> 00000000792c2000-00000000792d2000 (ACPI Tables)
> 00000000792d2000-00000000792e7000 (reserved)
> 00000000792e7000-0000000079301000 (ACPI Tables)
> 

The second kernel only gets:

[    0.000000] BIOS-provided physical RAM map:
[    0.000000]  BIOS-e820: 0000000000000100 - 0000000000098800 (usable)
[    0.000000]  BIOS-e820: 0000000000098800 - 00000000000a0000 (reserved)
[    0.000000]  BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
[    0.000000]  BIOS-e820: 0000000000100000 - 0000000078c63000 (usable)
[    0.000000]  BIOS-e820: 0000000078c63000 - 0000000078e77000 (ACPI NVS)
[    0.000000]  BIOS-e820: 0000000078e77000 - 000000007924e000 (ACPI data)
[    0.000000]  BIOS-e820: 000000007924e000 - 00000000792c2000 (reserved)
[    0.000000]  BIOS-e820: 00000000792c2000 - 00000000792d2000 (ACPI data)
[    0.000000]  BIOS-e820: 00000000792d2000 - 00000000792e7000 (reserved)
[    0.000000]  BIOS-e820: 00000000792e7000 - 0000000079301000 (ACPI data)
[    0.000000]  BIOS-e820: 0000000079301000 - 0000000079303000 (reserved)
[    0.000000]  BIOS-e820: 0000000079303000 - 0000000079305000 (ACPI data)
[    0.000000]  BIOS-e820: 0000000079305000 - 0000000079310000 (reserved)
[    0.000000]  BIOS-e820: 0000000079310000 - 0000000079314000 (ACPI data)
[    0.000000]  BIOS-e820: 0000000079314000 - 0000000079319000 (reserved)
[    0.000000]  BIOS-e820: 0000000079319000 - 0000000079336000 (ACPI data)
[    0.000000]  BIOS-e820: 0000000079336000 - 0000000079358000 (reserved)
[    0.000000]  BIOS-e820: 0000000079358000 - 0000000079388000 (ACPI data)
[    0.000000]  BIOS-e820: 0000000079388000 - 00000000793c9000 (reserved)
[    0.000000]  BIOS-e820: 00000000793c9000 - 000000007968f000 (ACPI data)
[    0.000000]  BIOS-e820: 000000007968f000 - 00000000796bb000 (reserved)
[    0.000000]  BIOS-e820: 00000000796bb000 - 00000000799d8000 (ACPI data)
[    0.000000]  BIOS-e820: 00000000799d8000 - 0000000079bd8000 (ACPI NVS)
[    0.000000]  BIOS-e820: 0000000079bd8000 - 0000000079d87000 (ACPI data)
[    0.000000]  BIOS-e820: 0000000079d87000 - 0000000079d8a000 (reserved)
[    0.000000]  BIOS-e820: 0000000079d8a000 - 0000000079dca000 (ACPI data)
[    0.000000]  BIOS-e820: 0000000079dca000 - 0000000079dcb000 (reserved)
[    0.000000]  BIOS-e820: 0000000079dcb000 - 0000000079e1c000 (ACPI data)
[    0.000000]  BIOS-e820: 0000000079e1c000 - 0000000079e87000 (reserved)
[    0.000000]  BIOS-e820: 0000000079e87000 - 000000007bd5f000 (ACPI data)
[    0.000000]  BIOS-e820: 000000007bd5f000 - 000000007be4f000 (reserved)
[    0.000000]  BIOS-e820: 000000007be4f000 - 000000007bf87000 (ACPI data)

So the mmconf range is not reserved, and some ACPI data in
> 0000000078c59000-0000000078e6d000 (ACPI Non-volatile Storage)
0000000078c59000 - 0000000078c63000 gets corrupted (the second kernel sees that part as usable)...
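
To make the "not reserved" point concrete: the MMCONFIG window from the log is [mem 0x80000000-0x8fffffff], and in the first kernel's listing it falls inside the 7bfff000-90000000 (reserved) entry, whereas the e820 map quoted above stops at 7bf87000. A minimal sketch of that containment check against a listing in the /tmp/memmap.txt format (the function name is hypothetical, and it only looks for a single covering entry):

```sh
# Succeed if [want_start, want_end) lies inside one "(reserved)" entry
# of a memmap listing read from stdin ("START-END (type)" per line).
# Arguments are hex addresses without the 0x prefix; END is exclusive.
range_reserved() {
    want_start=$(( 0x$1 ))
    want_end=$(( 0x$2 ))
    while read -r range type; do
        [ "$type" = "(reserved)" ] || continue
        start=$(( 0x${range%-*} ))   # text before the dash
        end=$(( 0x${range#*-} ))     # text after the dash
        if [ "$start" -le "$want_start" ] && [ "$want_end" -le "$end" ]; then
            return 0
        fi
    done
    return 1
}

# Usage (MMCONFIG window from the log, end made exclusive):
#   range_reserved 80000000 90000000 < /tmp/memmap.txt && echo covered
```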

YH



* Re: kexec boot regression
  2009-12-15 19:44                     ` Yinghai Lu
@ 2009-12-15 19:48                       ` Jens Axboe
  2009-12-15 19:49                         ` Yinghai Lu
  0 siblings, 1 reply; 42+ messages in thread
From: Jens Axboe @ 2009-12-15 19:48 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha, linux-pci

On Tue, Dec 15 2009, Yinghai Lu wrote:
> Jens Axboe wrote:
> > On Tue, Dec 15 2009, Yinghai Lu wrote:
> >> Jens Axboe wrote:
> >>> On Tue, Dec 15 2009, Yinghai Lu wrote:
> >>>> [   13.018720] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0x80000000-0x8fffffff] (base 0x80000000)
> >>>>
> >>>> [   13.100724] [Firmware Bug]: PCI: MMCONFIG at [mem 0x80000000-0x8fffffff] not reserved in ACPI motherboard resources
> >>> On a "normal" non-kexec boot, I get:
> >>>
> >>> [   12.173583] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0x80000000-0x8fffffff] (base 0x80000000)
> >>> [   12.184075] PCI: MMCONFIG at [mem 0x80000000-0x8fffffff] reserved in E820
> >>> [   12.216874] PCI: Using configuration type 1 for base access
> >>>
> >> can you run following scripts in first kernel?
> >>
> >> cd /sys/firmware/memmap
> >> for dir in * ; do
> >>   start=$(cat $dir/start)
> >>   end=$(cat $dir/end)
> >>   type=$(cat $dir/type)
> >>   printf "%016x-%016x (%s)\n" $start $[ $end +1] "$type" >> /tmp/memmap.txt
> >> done
> >>
> >> and send out /tmp/memmap.txt
> > 
> > Below.
> > 
> >> what is your kexec tools version? could be too old?
> > 
> > It says:
> > 
> > kexec-tools-testing 20080324 released 24th March 2008
> > 
> > 
> > 0000000000000000-0000000000098800 (System RAM)
> > 0000000000098800-00000000000a0000 (reserved)
> > 0000000079301000-0000000079303000 (reserved)
> > 0000000079303000-0000000079305000 (ACPI Tables)
> > 0000000079305000-0000000079310000 (reserved)
> > 0000000079310000-0000000079314000 (ACPI Tables)
> > 0000000079314000-0000000079319000 (reserved)
> > 0000000079319000-0000000079336000 (ACPI Tables)
> > 0000000079336000-0000000079358000 (reserved)
> > 0000000079358000-0000000079388000 (ACPI Tables)
> > 0000000079388000-00000000793c9000 (reserved)
> > 00000000793c9000-000000007968f000 (ACPI Tables)
> > 00000000000e0000-0000000000100000 (reserved)
> > 000000007968f000-00000000796bb000 (reserved)
> > 00000000796bb000-00000000799d8000 (ACPI Tables)
> > 00000000799d8000-0000000079bd8000 (ACPI Non-volatile Storage)
> > 0000000079bd8000-0000000079d8b000 (ACPI Tables)
> > 0000000079d8b000-0000000079d8c000 (reserved)
> > 0000000079d8c000-0000000079dc8000 (ACPI Tables)
> > 0000000079dc8000-0000000079dcb000 (reserved)
> > 0000000079dcb000-0000000079e1c000 (ACPI Tables)
> > 0000000079e1c000-0000000079e87000 (reserved)
> > 0000000079e87000-000000007bd5f000 (ACPI Tables)
> > 0000000000100000-0000000078c59000 (System RAM)
> > 000000007bd5f000-000000007be4f000 (reserved)
> > 000000007be4f000-000000007bf87000 (ACPI Tables)
> 
> so following ranges are not passed to second kernel by kexec?

I have the following addition to my kexec kernel command line:

memmap=62G@4G

since that last big 62G RAM entry doesn't show up without it, that's why
you see a user defined e820 map as well in the boot logs. So a kexec'ed
kernel is missing at least that entry.

I just tried with the latest and greatest kexec-tools (2.0.1) and
there's no difference.
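
For reference, memmap=62G@4G means 62G of RAM starting at 4G, which works out to exactly the last System RAM entry in the listing (0000000100000000-0000001080000000). A small sketch of the arithmetic, with a hypothetical helper that only handles whole-GiB values:

```sh
# Expand a "memmap=SIZE@START" pair, both given in GiB, into a hex
# range in the same START-END form as the listings in this thread.
memmap_gib_range() {
    size_gib=$1
    start_gib=$2
    printf '%016x-%016x\n' $(( start_gib << 30 )) $(( (start_gib + size_gib) << 30 ))
}

# memmap=62G@4G
memmap_gib_range 62 4
```

This prints 0000000100000000-0000001080000000.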

-- 
Jens Axboe



* Re: kexec boot regression
  2009-12-15 19:43               ` kexec boot regression Jens Axboe
@ 2009-12-15 19:48                 ` Yinghai Lu
  2009-12-15 19:51                   ` Jens Axboe
  0 siblings, 1 reply; 42+ messages in thread
From: Yinghai Lu @ 2009-12-15 19:48 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha, linux-pci

Jens Axboe wrote:
> On Tue, Dec 15 2009, Yinghai Lu wrote:
>> [PATCH] x86/pci: intel ioh bus num reg accessing fix
>>
>> it is above 0x100, so if mmconf is not enable, need to skip it
> 
> This works, it kexecs kernels fine. But since 2.6.32 doesn't have the
> mmconf problem to begin with, are we now just working around the issue?
> SRAT still reports issues, numa doesn't work.

That patch makes things bulletproof; we need it.

We also still need to figure out why the memmap ranges are not passed properly.

Do you mean that with 2.6.32 kexec'ing 2.6.32, mmconf and NUMA work in the second kernel?

YH


* Re: kexec boot regression
  2009-12-15 19:48                       ` Jens Axboe
@ 2009-12-15 19:49                         ` Yinghai Lu
  2009-12-15 19:57                           ` Jens Axboe
  0 siblings, 1 reply; 42+ messages in thread
From: Yinghai Lu @ 2009-12-15 19:49 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha, linux-pci

Jens Axboe wrote:
> On Tue, Dec 15 2009, Yinghai Lu wrote:
>> Jens Axboe wrote:
>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
>>>> Jens Axboe wrote:
>>>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
>>>>>> [   13.018720] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0x80000000-0x8fffffff] (base 0x80000000)
>>>>>>
>>>>>> [   13.100724] [Firmware Bug]: PCI: MMCONFIG at [mem 0x80000000-0x8fffffff] not reserved in ACPI motherboard resources
>>>>> On a "normal" non-kexec boot, I get:
>>>>>
>>>>> [   12.173583] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0x80000000-0x8fffffff] (base 0x80000000)
>>>>> [   12.184075] PCI: MMCONFIG at [mem 0x80000000-0x8fffffff] reserved in E820
>>>>> [   12.216874] PCI: Using configuration type 1 for base access
>>>>>
>>>> can you run following scripts in first kernel?
>>>>
>>>> cd /sys/firmware/memmap
>>>> for dir in * ; do
>>>>   start=$(cat $dir/start)
>>>>   end=$(cat $dir/end)
>>>>   type=$(cat $dir/type)
>>>>   printf "%016x-%016x (%s)\n" $start $[ $end +1] "$type" >> /tmp/memmap.txt
>>>> done
>>>>
>>>> and send out /tmp/memmap.txt
>>> Below.
>>>
>>>> what is your kexec tools version? could be too old?
>>> It says:
>>>
>>> kexec-tools-testing 20080324 released 24th March 2008
>>>
>>>
>>> 0000000000000000-0000000000098800 (System RAM)
>>> 0000000000098800-00000000000a0000 (reserved)
>>> 0000000079301000-0000000079303000 (reserved)
>>> 0000000079303000-0000000079305000 (ACPI Tables)
>>> 0000000079305000-0000000079310000 (reserved)
>>> 0000000079310000-0000000079314000 (ACPI Tables)
>>> 0000000079314000-0000000079319000 (reserved)
>>> 0000000079319000-0000000079336000 (ACPI Tables)
>>> 0000000079336000-0000000079358000 (reserved)
>>> 0000000079358000-0000000079388000 (ACPI Tables)
>>> 0000000079388000-00000000793c9000 (reserved)
>>> 00000000793c9000-000000007968f000 (ACPI Tables)
>>> 00000000000e0000-0000000000100000 (reserved)
>>> 000000007968f000-00000000796bb000 (reserved)
>>> 00000000796bb000-00000000799d8000 (ACPI Tables)
>>> 00000000799d8000-0000000079bd8000 (ACPI Non-volatile Storage)
>>> 0000000079bd8000-0000000079d8b000 (ACPI Tables)
>>> 0000000079d8b000-0000000079d8c000 (reserved)
>>> 0000000079d8c000-0000000079dc8000 (ACPI Tables)
>>> 0000000079dc8000-0000000079dcb000 (reserved)
>>> 0000000079dcb000-0000000079e1c000 (ACPI Tables)
>>> 0000000079e1c000-0000000079e87000 (reserved)
>>> 0000000079e87000-000000007bd5f000 (ACPI Tables)
>>> 0000000000100000-0000000078c59000 (System RAM)
>>> 000000007bd5f000-000000007be4f000 (reserved)
>>> 000000007be4f000-000000007bf87000 (ACPI Tables)
>> so following ranges are not passed to second kernel by kexec?
> 
> I have the following addition to my kexec kernel command line:
> 
> memmap=62G@4G
> 
> since that last big 62G RAM entry doesn't show up without it, that's why
> you see a user defined e820 map as well in the boot logs. So a kexec'ed
> kernel is missing at least that entry.
> 
> I just tried with the latest and greatest kexec-tools (2.0.1) and
> there's no difference.

Does booting the current kernel and kexec'ing into 2.6.32 get NUMA and mmconf working in the second kernel?

YH

* Re: kexec boot regression
  2009-12-15 19:48                 ` Yinghai Lu
@ 2009-12-15 19:51                   ` Jens Axboe
  2009-12-15 19:56                     ` Yinghai Lu
  2009-12-15 20:14                     ` Yinghai Lu
  0 siblings, 2 replies; 42+ messages in thread
From: Jens Axboe @ 2009-12-15 19:51 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha, linux-pci

On Tue, Dec 15 2009, Yinghai Lu wrote:
> Jens Axboe wrote:
> > On Tue, Dec 15 2009, Yinghai Lu wrote:
> >> [PATCH] x86/pci: intel ioh bus num reg accessing fix
> >>
> >> it is above 0x100, so if mmconf is not enable, need to skip it
> > 
> > This works, it kexecs kernels fine. But since 2.6.32 doesn't have the
> > mmconf problem to begin with, are we now just working around the issue?
> > SRAT still reports issues, numa doesn't work.
> 
> that patch will be bullet proof... we need it.
> 
> also still need to figure out why memmap range is not passed properly.
> 
> do you mean 2.6.32 kexec 2.6.32 it have worked mmconf and numa in
> second kernel?

Yes, 2.6.32 booted and 2.6.32 kexec'ed works just fine, no SRAT
complaints and NUMA works fine.

-- 
Jens Axboe


* Re: kexec boot regression
  2009-12-15 19:51                   ` Jens Axboe
@ 2009-12-15 19:56                     ` Yinghai Lu
  2009-12-15 20:09                       ` Jens Axboe
  2009-12-15 20:14                     ` Yinghai Lu
  1 sibling, 1 reply; 42+ messages in thread
From: Yinghai Lu @ 2009-12-15 19:56 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha, linux-pci

Jens Axboe wrote:
> On Tue, Dec 15 2009, Yinghai Lu wrote:
>> Jens Axboe wrote:
>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
>>>> [PATCH] x86/pci: intel ioh bus num reg accessing fix
>>>>
>>>> it is above 0x100, so if mmconf is not enable, need to skip it
>>> This works, it kexecs kernels fine. But since 2.6.32 doesn't have the
>>> mmconf problem to begin with, are we now just working around the issue?
>>> SRAT still reports issues, numa doesn't work.
>> that patch will be bullet proof... we need it.
>>
>> also still need to figure out why memmap range is not passed properly.
>>
>> do you mean 2.6.32 kexec 2.6.32 it have worked mmconf and numa in
>> second kernel?
> 
> Yes, 2.6.32 booted and 2.6.32 kexec'ed works just fine, no SRAT
> complaints and NUMA works fine.
> 
How about the current kernel booted and 2.6.32 kexec'ed? Does that also
work fine, with no SRAT complaints and working NUMA?

YH 

* Re: kexec boot regression
  2009-12-15 19:49                         ` Yinghai Lu
@ 2009-12-15 19:57                           ` Jens Axboe
  0 siblings, 0 replies; 42+ messages in thread
From: Jens Axboe @ 2009-12-15 19:57 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha, linux-pci

On Tue, Dec 15 2009, Yinghai Lu wrote:
> Jens Axboe wrote:
> > On Tue, Dec 15 2009, Yinghai Lu wrote:
> >> Jens Axboe wrote:
> >>> On Tue, Dec 15 2009, Yinghai Lu wrote:
> >>>> Jens Axboe wrote:
> >>>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
> >>>>>> [   13.018720] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0x80000000-0x8fffffff] (base 0x80000000)
> >>>>>>
> >>>>>> [   13.100724] [Firmware Bug]: PCI: MMCONFIG at [mem 0x80000000-0x8fffffff] not reserved in ACPI motherboard resources
> >>>>> On a "normal" non-kexec boot, I get:
> >>>>>
> >>>>> [   12.173583] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0x80000000-0x8fffffff] (base 0x80000000)
> >>>>> [   12.184075] PCI: MMCONFIG at [mem 0x80000000-0x8fffffff] reserved in E820
> >>>>> [   12.216874] PCI: Using configuration type 1 for base access
> >>>>>
> >>>> can you run following scripts in first kernel?
> >>>>
> >>>> cd /sys/firmware/memmap
> >>>> for dir in * ; do
> >>>>   start=$(cat $dir/start)
> >>>>   end=$(cat $dir/end)
> >>>>   type=$(cat $dir/type)
> >>>>   printf "%016x-%016x (%s)\n" $start $[ $end +1] "$type" >> /tmp/memmap.txt
> >>>> done
> >>>>
> >>>> and send out /tmp/memmap.txt
> >>> Below.
> >>>
> >>>> what is your kexec tools version? could be too old?
> >>> It says:
> >>>
> >>> kexec-tools-testing 20080324 released 24th March 2008
> >>>
> >>>
> >>> 0000000000000000-0000000000098800 (System RAM)
> >>> 0000000000098800-00000000000a0000 (reserved)
> >>> 0000000079301000-0000000079303000 (reserved)
> >>> 0000000079303000-0000000079305000 (ACPI Tables)
> >>> 0000000079305000-0000000079310000 (reserved)
> >>> 0000000079310000-0000000079314000 (ACPI Tables)
> >>> 0000000079314000-0000000079319000 (reserved)
> >>> 0000000079319000-0000000079336000 (ACPI Tables)
> >>> 0000000079336000-0000000079358000 (reserved)
> >>> 0000000079358000-0000000079388000 (ACPI Tables)
> >>> 0000000079388000-00000000793c9000 (reserved)
> >>> 00000000793c9000-000000007968f000 (ACPI Tables)
> >>> 00000000000e0000-0000000000100000 (reserved)
> >>> 000000007968f000-00000000796bb000 (reserved)
> >>> 00000000796bb000-00000000799d8000 (ACPI Tables)
> >>> 00000000799d8000-0000000079bd8000 (ACPI Non-volatile Storage)
> >>> 0000000079bd8000-0000000079d8b000 (ACPI Tables)
> >>> 0000000079d8b000-0000000079d8c000 (reserved)
> >>> 0000000079d8c000-0000000079dc8000 (ACPI Tables)
> >>> 0000000079dc8000-0000000079dcb000 (reserved)
> >>> 0000000079dcb000-0000000079e1c000 (ACPI Tables)
> >>> 0000000079e1c000-0000000079e87000 (reserved)
> >>> 0000000079e87000-000000007bd5f000 (ACPI Tables)
> >>> 0000000000100000-0000000078c59000 (System RAM)
> >>> 000000007bd5f000-000000007be4f000 (reserved)
> >>> 000000007be4f000-000000007bf87000 (ACPI Tables)
> >> so following ranges are not passed to second kernel by kexec?
> > 
> > I have the following addition to my kexec kernel command line:
> > 
> > memmap=62G@4G
> > 
> > since that last big 62G RAM entry doesn't show up without it, that's why
> > you see a user defined e820 map as well in the boot logs. So a kexec'ed
> > kernel is missing at least that entry.
> > 
> > I just tried with the latest and greatest kexec-tools (2.0.1) and
> > there's no difference.
> 
> current kernel kexec 2.6.32 make numa and mmconf working on second kernel?

Just tested that configuration: booting current -git and kexec'ing into
2.6.32 gets me working NUMA, but mmconf still complains:

[   15.669222] PCI: MCFG configuration 0: base 80000000 segment 0 buses 0 - 255
[   15.677166] PCI: Not using MMCONFIG.
[...]
[   15.971448] PCI: MCFG configuration 0: base 80000000 segment 0 buses 0 - 255
[   16.066995] PCI: BIOS Bug: MCFG area at 80000000 is not reserved in ACPI motherboard resources
[   16.076705] PCI: Not using MMCONFIG.
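
For reference, the MCFG entry above (base 80000000, buses 0 - 255) can be
turned into the address window that has to be reserved. This is only a sketch:
it assumes the standard ECAM layout of 1 MiB of config space per bus (4 KiB
per function, 8 functions per device, 32 devices per bus), and the function
name mmconfig_window is made up for illustration.

```shell
# Sketch: compute the MMCONFIG (ECAM) window for a base address and bus range.
# Assumes the standard ECAM layout: 4 KiB per function * 8 functions
# * 32 devices = 1 MiB per bus. mmconfig_window is a hypothetical helper.
mmconfig_window() {
    # $1 = MCFG base address, $2 = first bus, $3 = last bus
    printf '0x%08x-0x%08x\n' \
        "$(( $1 + ($2 << 20) ))" \
        "$(( $1 + (($3 + 1) << 20) - 1 ))"
}

mmconfig_window 0x80000000 0 255   # prints 0x80000000-0x8fffffff
```

The result matches the [mem 0x80000000-0x8fffffff] window reported by the
newer kernels earlier in the thread, i.e. 256 MiB for 256 buses.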

SRAT looks good:

[...]
[    0.000000] SRAT: Node 0 PXM 0 0-80000000
[    0.000000] SRAT: Node 0 PXM 0 100000000-480000000
[    0.000000] SRAT: Node 2 PXM 1 480000000-880000000
[    0.000000] SRAT: Node 1 PXM 2 880000000-c80000000
[    0.000000] SRAT: Node 3 PXM 3 c80000000-1080000000
[    0.000000] NUMA: Using 31 for the hash shift.
[snip same working NUMA config]
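
As a cross-check on the memmap=62G@4G workaround, the range it describes
can be computed and compared with the SRAT lines above. This is a minimal
sketch, assuming only the K/M/G suffixes matter here; the helper names
to_bytes and memmap_range are invented for this example and are not part of
kexec-tools or the kernel.

```shell
# Sketch: expand a memmap=<size>@<start> argument into a hex address range.
# to_bytes and memmap_range are hypothetical helpers for illustration.
to_bytes() {
    # Convert a value with an optional K/M/G suffix to a byte count.
    case "$1" in
        *[Kk]) echo $(( ${1%?} << 10 )) ;;
        *[Mm]) echo $(( ${1%?} << 20 )) ;;
        *[Gg]) echo $(( ${1%?} << 30 )) ;;
        *)     echo $(( $1 )) ;;
    esac
}

memmap_range() {
    local arg size start
    arg="${1#memmap=}"               # strip the "memmap=" prefix
    size=$(to_bytes "${arg%@*}")     # part before '@' is the size
    start=$(to_bytes "${arg#*@}")    # part after '@' is the start address
    printf '%016x-%016x\n' "$start" "$(( start + size ))"
}

memmap_range memmap=62G@4G   # prints 0000000100000000-0000001080000000
```

So the user-defined entry spans 0x100000000 to 0x1080000000, which is the
same span the SRAT node ranges above 4G add up to.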

-- 
Jens Axboe


* Re: kexec boot regression
  2009-12-15 19:56                     ` Yinghai Lu
@ 2009-12-15 20:09                       ` Jens Axboe
  0 siblings, 0 replies; 42+ messages in thread
From: Jens Axboe @ 2009-12-15 20:09 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha, linux-pci

On Tue, Dec 15 2009, Yinghai Lu wrote:
> Jens Axboe wrote:
> > On Tue, Dec 15 2009, Yinghai Lu wrote:
> >> Jens Axboe wrote:
> >>> On Tue, Dec 15 2009, Yinghai Lu wrote:
> >>>> [PATCH] x86/pci: intel ioh bus num reg accessing fix
> >>>>
> >>>> it is above 0x100, so if mmconf is not enable, need to skip it
> >>> This works, it kexecs kernels fine. But since 2.6.32 doesn't have the
> >>> mmconf problem to begin with, are we now just working around the issue?
> >>> SRAT still reports issues, numa doesn't work.
> >> that patch will be bullet proof... we need it.
> >>
> >> also still need to figure out why memmap range is not passed properly.
> >>
> >> do you mean 2.6.32 kexec 2.6.32 it have worked mmconf and numa in
> >> second kernel?
> > 
> > Yes, 2.6.32 booted and 2.6.32 kexec'ed works just fine, no SRAT
> > complaints and NUMA works fine.
> > 
> how about
> 
> current kernel booted and 2.6.32 kexec'ed works just fine, no SRAT
> complaints and NUMA works fine. ?

Yes, that's exactly what happens, see the previous reply I sent. mmconf
still complains, though.

-- 
Jens Axboe


* Re: kexec boot regression
  2009-12-15 19:51                   ` Jens Axboe
  2009-12-15 19:56                     ` Yinghai Lu
@ 2009-12-15 20:14                     ` Yinghai Lu
  2009-12-15 20:19                       ` Jens Axboe
  1 sibling, 1 reply; 42+ messages in thread
From: Yinghai Lu @ 2009-12-15 20:14 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha, linux-pci

Jens Axboe wrote:
> On Tue, Dec 15 2009, Yinghai Lu wrote:
>> Jens Axboe wrote:
>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
>>>> [PATCH] x86/pci: intel ioh bus num reg accessing fix
>>>>
>>>> it is above 0x100, so if mmconf is not enable, need to skip it
>>> This works, it kexecs kernels fine. But since 2.6.32 doesn't have the
>>> mmconf problem to begin with, are we now just working around the issue?
>>> SRAT still reports issues, numa doesn't work.
>> that patch will be bullet proof... we need it.
>>
>> also still need to figure out why memmap range is not passed properly.
>>
>> do you mean 2.6.32 kexec 2.6.32 it have worked mmconf and numa in
>> second kernel?
> 
> Yes, 2.6.32 booted and 2.6.32 kexec'ed works just fine, no SRAT
> complaints and NUMA works fine.

do you need 
memmap=62G@4G
in this case?

YH

* Re: kexec boot regression
  2009-12-15 20:14                     ` Yinghai Lu
@ 2009-12-15 20:19                       ` Jens Axboe
  2009-12-15 20:21                         ` Yinghai Lu
  0 siblings, 1 reply; 42+ messages in thread
From: Jens Axboe @ 2009-12-15 20:19 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha, linux-pci

On Tue, Dec 15 2009, Yinghai Lu wrote:
> Jens Axboe wrote:
> > On Tue, Dec 15 2009, Yinghai Lu wrote:
> >> Jens Axboe wrote:
> >>> On Tue, Dec 15 2009, Yinghai Lu wrote:
> >>>> [PATCH] x86/pci: intel ioh bus num reg accessing fix
> >>>>
> >>>> it is above 0x100, so if mmconf is not enable, need to skip it
> >>> This works, it kexecs kernels fine. But since 2.6.32 doesn't have the
> >>> mmconf problem to begin with, are we now just working around the issue?
> >>> SRAT still reports issues, numa doesn't work.
> >> that patch will be bullet proof... we need it.
> >>
> >> also still need to figure out why memmap range is not passed properly.
> >>
> >> do you mean 2.6.32 kexec 2.6.32 it have worked mmconf and numa in
> >> second kernel?
> > 
> > Yes, 2.6.32 booted and 2.6.32 kexec'ed works just fine, no SRAT
> > complaints and NUMA works fine.
> 
> do you need 
> memmap=62G@4G
> in this case?

Yes, I've needed that always.

-- 
Jens Axboe


* Re: kexec boot regression
  2009-12-15 20:19                       ` Jens Axboe
@ 2009-12-15 20:21                         ` Yinghai Lu
  2009-12-15 20:42                           ` Jens Axboe
  0 siblings, 1 reply; 42+ messages in thread
From: Yinghai Lu @ 2009-12-15 20:21 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha,
	linux-pci, H. Peter Anvin, Huang Ying

Jens Axboe wrote:
> On Tue, Dec 15 2009, Yinghai Lu wrote:
>> Jens Axboe wrote:
>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
>>>> Jens Axboe wrote:
>>>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
>>>>>> [PATCH] x86/pci: intel ioh bus num reg accessing fix
>>>>>>
>>>>>> it is above 0x100, so if mmconf is not enable, need to skip it
>>>>> This works, it kexecs kernels fine. But since 2.6.32 doesn't have the
>>>>> mmconf problem to begin with, are we now just working around the issue?
>>>>> SRAT still reports issues, numa doesn't work.
>>>> that patch will be bullet proof... we need it.
>>>>
>>>> also still need to figure out why memmap range is not passed properly.
>>>>
>>>> do you mean 2.6.32 kexec 2.6.32 it have worked mmconf and numa in
>>>> second kernel?
>>> Yes, 2.6.32 booted and 2.6.32 kexec'ed works just fine, no SRAT
>>> complaints and NUMA works fine.
>> do you need 
>> memmap=62G@4G
>> in this case?
> 
> Yes, I've needed that always.

good,

Can you enable the debug option in kexec to see why it cannot pass all 38(?) ranges to the second kernel?

YH

* Re: kexec boot regression
  2009-12-15 20:21                         ` Yinghai Lu
@ 2009-12-15 20:42                           ` Jens Axboe
  2009-12-15 20:55                             ` Jens Axboe
  0 siblings, 1 reply; 42+ messages in thread
From: Jens Axboe @ 2009-12-15 20:42 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha,
	linux-pci, H. Peter Anvin, Huang Ying

On Tue, Dec 15 2009, Yinghai Lu wrote:
> Jens Axboe wrote:
> > On Tue, Dec 15 2009, Yinghai Lu wrote:
> >> Jens Axboe wrote:
> >>> On Tue, Dec 15 2009, Yinghai Lu wrote:
> >>>> Jens Axboe wrote:
> >>>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
> >>>>>> [PATCH] x86/pci: intel ioh bus num reg accessing fix
> >>>>>>
> >>>>>> it is above 0x100, so if mmconf is not enable, need to skip it
> >>>>> This works, it kexecs kernels fine. But since 2.6.32 doesn't have the
> >>>>> mmconf problem to begin with, are we now just working around the issue?
> >>>>> SRAT still reports issues, numa doesn't work.
> >>>> that patch will be bullet proof... we need it.
> >>>>
> >>>> also still need to figure out why memmap range is not passed properly.
> >>>>
> >>>> do you mean 2.6.32 kexec 2.6.32 it have worked mmconf and numa in
> >>>> second kernel?
> >>> Yes, 2.6.32 booted and 2.6.32 kexec'ed works just fine, no SRAT
> >>> complaints and NUMA works fine.
> >> do you need 
> >> memmap=62G@4G
> >> in this case?
> > 
> > Yes, I've needed that always.
> 
> good,
> 
> can you enable debug option in kexec to see why kexec can not pass
> whole 38? range to second kernel?

Not getting any output so far, -d doesn't do much. Poking around in the
source...

-- 
Jens Axboe


* Re: kexec boot regression
  2009-12-15 20:42                           ` Jens Axboe
@ 2009-12-15 20:55                             ` Jens Axboe
  2009-12-15 21:01                               ` Jens Axboe
  0 siblings, 1 reply; 42+ messages in thread
From: Jens Axboe @ 2009-12-15 20:55 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha,
	linux-pci, H. Peter Anvin, Huang Ying

On Tue, Dec 15 2009, Jens Axboe wrote:
> On Tue, Dec 15 2009, Yinghai Lu wrote:
> > Jens Axboe wrote:
> > > On Tue, Dec 15 2009, Yinghai Lu wrote:
> > >> Jens Axboe wrote:
> > >>> On Tue, Dec 15 2009, Yinghai Lu wrote:
> > >>>> Jens Axboe wrote:
> > >>>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
> > >>>>>> [PATCH] x86/pci: intel ioh bus num reg accessing fix
> > >>>>>>
> > >>>>>> it is above 0x100, so if mmconf is not enable, need to skip it
> > >>>>> This works, it kexecs kernels fine. But since 2.6.32 doesn't have the
> > >>>>> mmconf problem to begin with, are we now just working around the issue?
> > >>>>> SRAT still reports issues, numa doesn't work.
> > >>>> that patch will be bullet proof... we need it.
> > >>>>
> > >>>> also still need to figure out why memmap range is not passed properly.
> > >>>>
> > >>>> do you mean 2.6.32 kexec 2.6.32 it have worked mmconf and numa in
> > >>>> second kernel?
> > >>> Yes, 2.6.32 booted and 2.6.32 kexec'ed works just fine, no SRAT
> > >>> complaints and NUMA works fine.
> > >> do you need 
> > >> memmap=62G@4G
> > >> in this case?
> > > 
> > > Yes, I've needed that always.
> > 
> > good,
> > 
> > can you enable debug option in kexec to see why kexec can not pass
> > whole 38? range to second kernel?
> 
> Not getting any output so far, -d doesn't do much. Poking around in the
> source...

OK, cold boot and kexec 2.0.1 gets all 39 ranges passed properly to
kexec'ed kernels. Since the older kexec stopped at range 30 (31 ranges
total), that smells like just a kexec bug. Retesting -git...
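
For anyone wanting to compare range counts between a cold boot and a kexec'ed
kernel, here is a rough variant of the memmap dump script from earlier in the
thread. It is a sketch, not a tested tool: dump_memmap is a made-up name, and
it assumes the entries under /sys/firmware/memmap are numbered directories.
Sorting numerically avoids the lexical glob order (0, 1, 10, 11, ..., 2, ...)
that makes the earlier listing appear out of address order.

```shell
# Sketch: print firmware memory map entries in numeric entry order.
# dump_memmap is a hypothetical helper; the directory argument defaults to
# /sys/firmware/memmap and can be overridden for testing.
dump_memmap() {
    local base n start end type
    base="${1:-/sys/firmware/memmap}"
    # Entry directories are named 0, 1, 2, ...; sort -n keeps 2 before 10.
    for n in $(ls "$base" | sort -n); do
        start=$(cat "$base/$n/start")
        end=$(cat "$base/$n/end")
        type=$(cat "$base/$n/type")
        # sysfs reports an inclusive end, so add 1 to print an exclusive end.
        printf '%016x-%016x (%s)\n' "$start" "$(( $end + 1 ))" "$type"
    done
}
```

Running "dump_memmap | wc -l" in both the cold-booted and the kexec'ed kernel
gives the number of ranges each one sees.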

-- 
Jens Axboe


* Re: kexec boot regression
  2009-12-15 20:55                             ` Jens Axboe
@ 2009-12-15 21:01                               ` Jens Axboe
  2009-12-15 21:26                                 ` Yinghai Lu
  0 siblings, 1 reply; 42+ messages in thread
From: Jens Axboe @ 2009-12-15 21:01 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha,
	linux-pci, H. Peter Anvin, Huang Ying

On Tue, Dec 15 2009, Jens Axboe wrote:
> On Tue, Dec 15 2009, Jens Axboe wrote:
> > On Tue, Dec 15 2009, Yinghai Lu wrote:
> > > Jens Axboe wrote:
> > > > On Tue, Dec 15 2009, Yinghai Lu wrote:
> > > >> Jens Axboe wrote:
> > > >>> On Tue, Dec 15 2009, Yinghai Lu wrote:
> > > >>>> Jens Axboe wrote:
> > > >>>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
> > > >>>>>> [PATCH] x86/pci: intel ioh bus num reg accessing fix
> > > >>>>>>
> > > >>>>>> it is above 0x100, so if mmconf is not enable, need to skip it
> > > >>>>> This works, it kexecs kernels fine. But since 2.6.32 doesn't have the
> > > >>>>> mmconf problem to begin with, are we now just working around the issue?
> > > >>>>> SRAT still reports issues, numa doesn't work.
> > > >>>> that patch will be bullet proof... we need it.
> > > >>>>
> > > >>>> also still need to figure out why memmap range is not passed properly.
> > > >>>>
> > > >>>> do you mean 2.6.32 kexec 2.6.32 it have worked mmconf and numa in
> > > >>>> second kernel?
> > > >>> Yes, 2.6.32 booted and 2.6.32 kexec'ed works just fine, no SRAT
> > > >>> complaints and NUMA works fine.
> > > >> do you need 
> > > >> memmap=62G@4G
> > > >> in this case?
> > > > 
> > > > Yes, I've needed that always.
> > > 
> > > good,
> > > 
> > > can you enable debug option in kexec to see why kexec can not pass
> > > whole 38? range to second kernel?
> > 
> > Not getting any output so far, -d doesn't do much. Poking around in the
> > source...
> 
> OK, cold boot and kexec 2.0.1 gets all 39 ranges passed properly to
> kexec'ed kernels. Since the older kexec stopped at range 30 (31 ranges
> total), that smells like just a kexec bug. Retesting -git...

Current -git works fine when all the ranges are passed correctly. So, I
think, the only existing regression is the SRAT issue.

-- 
Jens Axboe


* Re: kexec boot regression
  2009-12-15 21:01                               ` Jens Axboe
@ 2009-12-15 21:26                                 ` Yinghai Lu
  2009-12-15 21:30                                   ` Jens Axboe
  0 siblings, 1 reply; 42+ messages in thread
From: Yinghai Lu @ 2009-12-15 21:26 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha,
	linux-pci, H. Peter Anvin, Huang Ying

Jens Axboe wrote:
> On Tue, Dec 15 2009, Jens Axboe wrote:
>> On Tue, Dec 15 2009, Jens Axboe wrote:
>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
>>>> Jens Axboe wrote:
>>>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
>>>>>> Jens Axboe wrote:
>>>>>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
>>>>>>>> Jens Axboe wrote:
>>>>>>>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
>>>>>>>>>> [PATCH] x86/pci: intel ioh bus num reg accessing fix
>>>>>>>>>>
>>>>>>>>>> it is above 0x100, so if mmconf is not enable, need to skip it
>>>>>>>>> This works, it kexecs kernels fine. But since 2.6.32 doesn't have the
>>>>>>>>> mmconf problem to begin with, are we now just working around the issue?
>>>>>>>>> SRAT still reports issues, numa doesn't work.
>>>>>>>> that patch will be bullet proof... we need it.
>>>>>>>>
>>>>>>>> also still need to figure out why memmap range is not passed properly.
>>>>>>>>
>>>>>>>> do you mean 2.6.32 kexec 2.6.32 it have worked mmconf and numa in
>>>>>>>> second kernel?
>>>>>>> Yes, 2.6.32 booted and 2.6.32 kexec'ed works just fine, no SRAT
>>>>>>> complaints and NUMA works fine.
>>>>>> do you need 
>>>>>> memmap=62G@4G
>>>>>> in this case?
>>>>> Yes, I've needed that always.
>>>> good,
>>>>
>>>> can you enable debug option in kexec to see why kexec can not pass
>>>> whole 38? range to second kernel?
>>> Not getting any output so far, -d doesn't do much. Poking around in the
>>> source...
>> OK, cold boot and kexec 2.0.1 gets all 39 ranges passed properly to
>> kexec'ed kernels. Since the older kexec stopped at range 30 (31 ranges
>> total), that smells like just a kexec bug. Retesting -git...
> 
> Current -git works fine when all the ranges are passed correctly. So, I
> think, the only existing regression is the SRAT issue.

did you change node_shift?

YH

* Re: kexec boot regression
  2009-12-15 19:04                 ` Yinghai Lu
  2009-12-15 19:11                   ` Jens Axboe
@ 2009-12-15 21:30                   ` Markus Trippelsdorf
  2009-12-15 23:02                     ` kexec boot regression radeon/kms (bisected) Markus Trippelsdorf
  1 sibling, 1 reply; 42+ messages in thread
From: Markus Trippelsdorf @ 2009-12-15 21:30 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Jens Axboe, Jesse Barnes, Linux Kernel, mingo, rdreier,
	Suresh Siddha, linux-pci

[-- Attachment #1: Type: text/plain, Size: 1318 bytes --]

On Tue, Dec 15, 2009 at 11:04:55AM -0800, Yinghai Lu wrote:
> Jens Axboe wrote:
> > On Tue, Dec 15 2009, Yinghai Lu wrote:
> >> [   13.018720] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0x80000000-0x8fffffff] (base 0x80000000)
> >>
> >> [   13.100724] [Firmware Bug]: PCI: MMCONFIG at [mem 0x80000000-0x8fffffff] not reserved in ACPI motherboard resources
> > 
> > On a "normal" non-kexec boot, I get:
> > 
> > [   12.173583] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0x80000000-0x8fffffff] (base 0x80000000)
> > [   12.184075] PCI: MMCONFIG at [mem 0x80000000-0x8fffffff] reserved in E820
> > [   12.216874] PCI: Using configuration type 1 for base access
> > 
> 
> can you run following scripts in first kernel?
> 
> cd /sys/firmware/memmap
> for dir in * ; do
>   start=$(cat $dir/start)
>   end=$(cat $dir/end)
>   type=$(cat $dir/type)
>   printf "%016x-%016x (%s)\n" $start $[ $end +1] "$type" >> /tmp/memmap.txt
> done
> 
> and send out /tmp/memmap.txt
> 
> what is your kexec tools version? could be too old?

I have the same symptoms on my machine, but the underlying cause must be
different. I once reverted all Radeon related changes since 2.6.32 and 
kexec started working again.

Full dmesg and the output of the script is attached.

kexec-tools 2.0.1 released 13th August 2009

-- 
Markus

[-- Attachment #2: memmap.txt --]
[-- Type: text/plain, Size: 431 bytes --]

0000000000000000-000000000009fc00 (System RAM)
000000000009fc00-00000000000a0000 (reserved)
00000000000e6000-0000000000100000 (reserved)
0000000000100000-00000000cbf90000 (System RAM)
00000000cbf90000-00000000cbfa8000 (ACPI Tables)
00000000cbfa8000-00000000cbfd0000 (ACPI Non-volatile Storage)
00000000cbfd0000-00000000cc000000 (reserved)
00000000fff00000-0000000100000000 (reserved)
0000000100000000-0000000130000000 (System RAM)

[-- Attachment #3: dmesg --]
[-- Type: text/plain, Size: 26916 bytes --]

Linux version 2.6.32-07500-g8bea867-dirty (markus@arch.tripp.de) (gcc version 4.4.2 (GCC) ) #5 SMP Tue Dec 15 21:55:00 CET 2009
Command line: BOOT_IMAGE=/boot/kernel root=/dev/sdb fbcon=rotate:3 quiet
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
 BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000e6000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 00000000cbf90000 (usable)
 BIOS-e820: 00000000cbf90000 - 00000000cbfa8000 (ACPI data)
 BIOS-e820: 00000000cbfa8000 - 00000000cbfd0000 (ACPI NVS)
 BIOS-e820: 00000000cbfd0000 - 00000000cc000000 (reserved)
 BIOS-e820: 00000000fff00000 - 0000000100000000 (reserved)
 BIOS-e820: 0000000100000000 - 0000000130000000 (usable)
NX (Execute Disable) protection: active
DMI present.
last_pfn = 0x130000 max_arch_pfn = 0x400000000
MTRR default type: uncachable
MTRR fixed ranges enabled:
  00000-9FFFF write-back
  A0000-EFFFF uncachable
  F0000-FFFFF write-protect
MTRR variable ranges enabled:
  0 base 000000000000 mask FFFF80000000 write-back
  1 base 000080000000 mask FFFFC0000000 write-back
  2 base 0000C0000000 mask FFFFF8000000 write-back
  3 base 0000C8000000 mask FFFFFC000000 write-back
  4 disabled
  5 disabled
  6 disabled
  7 disabled
TOM2: 0000000130000000 aka 4864M
x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
e820 update range: 00000000cc000000 - 0000000100000000 (usable) ==> (reserved)
last_pfn = 0xcbf90 max_arch_pfn = 0x400000000
initial memory mapped : 0 - 20000000
Using GB pages for direct mapping
init_memory_mapping: 0000000000000000-00000000cbf90000
 0000000000 - 00c0000000 page 1G
 00c0000000 - 00cbe00000 page 2M
 00cbe00000 - 00cbf90000 page 4k
kernel direct mapping tables up to cbf90000 @ 8000-b000
init_memory_mapping: 0000000100000000-0000000130000000
 0100000000 - 0130000000 page 2M
kernel direct mapping tables up to 130000000 @ a000-c000
ACPI: RSDP 00000000000fb880 00024 (v02 ACPIAM)
ACPI: XSDT 00000000cbf90100 00054 (v01 102809 XSDT1549 20091028 MSFT 00000097)
ACPI: FACP 00000000cbf90290 000F4 (v03 102809 FACP1549 20091028 MSFT 00000097)
ACPI Warning: Optional field Pm2ControlBlock has zero address or length: 0000000000000000/1 (20091112/tbfadt-557)
ACPI: DSDT 00000000cbf90440 0E774 (v01  A1152 A1152000 00000000 INTL 20060113)
ACPI: FACS 00000000cbfa8000 00040
ACPI: APIC 00000000cbf90390 0006C (v01 102809 APIC1549 20091028 MSFT 00000097)
ACPI: MCFG 00000000cbf90400 0003C (v01 102809 OEMMCFG  20091028 MSFT 00000097)
ACPI: OEMB 00000000cbfa8040 00072 (v01 102809 OEMB1549 20091028 MSFT 00000097)
ACPI: HPET 00000000cbf9f440 00038 (v01 102809 OEMHPET  20091028 MSFT 00000097)
ACPI: SSDT 00000000cbf9f480 0088C (v01 A M I  POWERNOW 00000001 AMD  00000001)
ACPI: Local APIC address 0xfee00000
(7 early reservations) ==> bootmem [0000000000 - 0130000000]
  #0 [0000000000 - 0000001000]   BIOS data page ==> [0000000000 - 0000001000]
  #1 [0001000000 - 000176e80c]    TEXT DATA BSS ==> [0001000000 - 000176e80c]
  #2 [000009fc00 - 0000100000]    BIOS reserved ==> [000009fc00 - 0000100000]
  #3 [000176f000 - 000176f290]              BRK ==> [000176f000 - 000176f290]
  #4 [0000001000 - 0000003000]       TRAMPOLINE ==> [0000001000 - 0000003000]
  #5 [0000008000 - 000000a000]          PGTABLE ==> [0000008000 - 000000a000]
  #6 [000000a000 - 000000b000]          PGTABLE ==> [000000a000 - 000000b000]
 [ffffea0000000000-ffffea00043fffff] PMD -> [ffff880028600000-ffff88002bffffff] on node 0
Zone PFN ranges:
  DMA      0x00000000 -> 0x00001000
  DMA32    0x00001000 -> 0x00100000
  Normal   0x00100000 -> 0x00130000
Movable zone start PFN for each node
early_node_map[3] active PFN ranges
    0: 0x00000000 -> 0x0000009f
    0: 0x00000100 -> 0x000cbf90
    0: 0x00100000 -> 0x00130000
On node 0 totalpages: 1031983
  DMA zone: 56 pages used for memmap
  DMA zone: 102 pages reserved
  DMA zone: 3841 pages, LIFO batch:0
  DMA32 zone: 14280 pages used for memmap
  DMA32 zone: 817096 pages, LIFO batch:31
  Normal zone: 2688 pages used for memmap
  Normal zone: 193920 pages, LIFO batch:31
ACPI: PM-Timer IO Port: 0x808
ACPI: Local APIC address 0xfee00000
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)
ACPI: LAPIC (acpi_id[0x03] lapic_id[0x02] enabled)
ACPI: LAPIC (acpi_id[0x04] lapic_id[0x03] enabled)
ACPI: IOAPIC (id[0x04] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 4, version 33, address 0xfec00000, GSI 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 low level)
ACPI: IRQ0 used by override.
ACPI: IRQ2 used by override.
ACPI: IRQ9 used by override.
Using ACPI (MADT) for SMP configuration information
ACPI: HPET id: 0x8300 base: 0xfed00000
SMP: Allowing 4 CPUs, 0 hotplug CPUs
nr_irqs_gsi: 24
Allocating PCI resources starting at cc000000 (gap: cc000000:33f00000)
setup_percpu: NR_CPUS:4 nr_cpumask_bits:4 nr_cpu_ids:4 nr_node_ids:1
PERCPU: Embedded 25 pages/cpu @ffff880028200000 s81432 r0 d20968 u524288
pcpu-alloc: s81432 r0 d20968 u524288 alloc=1*2097152
pcpu-alloc: [0] 0 1 2 3 
Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 1014857
Kernel command line: BOOT_IMAGE=/boot/kernel root=/dev/sdb fbcon=rotate:3 quiet
PID hash table entries: 4096 (order: 3, 32768 bytes)
Dentry cache hash table entries: 524288 (order: 10, 4194304 bytes)
Inode-cache hash table entries: 262144 (order: 9, 2097152 bytes)
Memory: 3987284k/4980736k available (3765k kernel code, 852804k absent, 139720k reserved, 2841k data, 416k init)
SLUB: Genslabs=13, HWalign=64, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
Hierarchical RCU implementation.
NR_IRQS:384
Extended CMOS year: 2000
spurious 8259A interrupt: IRQ7.
Console: colour VGA+ 80x25
console [tty0] enabled
hpet clockevent registered
Fast TSC calibration using PIT
Detected 3210.336 MHz processor.
Calibrating delay loop (skipped), value calculated using timer frequency.. 6420.66 BogoMIPS (lpj=3210332)
Mount-cache hash table entries: 256
tseg: 0000000000
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 0
mce: CPU supports 6 MCE banks
using C1E aware idle routine
Performance Events: AMD PMU driver.
... version:                0
... bit width:              48
... generic registers:      4
... value mask:             0000ffffffffffff
... max period:             00007fffffffffff
... fixed-purpose events:   0
... event mask:             000000000000000f
Freeing SMP alternatives: 28k freed
ACPI: Core revision 20091112
Setting APIC routing to flat
..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
CPU0: AMD Phenom(tm) II X4 955 Processor stepping 02
Booting Node   0, Processors  #1
System has AMD C1E enabled
Switch to broadcast mode on CPU1
 #2
Switch to broadcast mode on CPU2
 #3 Ok.
Brought up 4 CPUs
Total of 4 processors activated (25686.38 BogoMIPS).
Switch to broadcast mode on CPU3
Switch to broadcast mode on CPU0
NET: Registered protocol family 16
node 0 link 0: io port [1000, ffffff]
TOM: 00000000d0000000 aka 3328M
Fam 10h mmconf [e0000000, efffffff]
node 0 link 0: mmio [a0000, bffff]
node 0 link 0: mmio [d0000000, efffffff] ==> [d0000000, dfffffff]
node 0 link 0: mmio [f0000000, fbcfffff]
node 0 link 0: mmio [fbd00000, fbefffff]
node 0 link 0: mmio [fbf00000, ffefffff]
TOM2: 0000000130000000 aka 4864M
bus: [00, 07] on node 0 link 0
bus: 00 index 0 io port: [0, ffff]
bus: 00 index 1 mmio: [a0000, bffff]
bus: 00 index 2 mmio: [d0000000, dfffffff]
bus: 00 index 3 mmio: [f0000000, ffffffff]
bus: 00 index 4 mmio: [130000000, fcffffffff]
ACPI: bus type pci registered
PCI: Using configuration type 1 for base access
PCI: Using configuration type 1 for extended access
mtrr: your CPUs had inconsistent fixed MTRR settings
mtrr: probably your BIOS does not setup all CPUs.
mtrr: corrected configuration.
bio: create slab <bio-0> at 0
ACPI: EC: Look up EC in DSDT
ACPI: Executed 3 blocks of module-level executable AML code
ACPI: Interpreter enabled
ACPI: (supports S0 S5)
ACPI: Using IOAPIC for interrupt routing
ACPI Warning: Incorrect checksum in table [OEMB] - B2, should be AA (20091112/tbutils-314)
ACPI: PCI Root Bridge [PCI0] (0000:00)
pci_root PNP0A03:00: ignoring host bridge windows from ACPI; boot with "pci=use_crs" to use them
pci_root PNP0A03:00: host bridge window [io  0x0000-0x0cf7] (ignored)
pci_root PNP0A03:00: host bridge window [io  0x0d00-0xffff] (ignored)
pci_root PNP0A03:00: host bridge window [mem 0x000a0000-0x000bffff] (ignored)
pci_root PNP0A03:00: host bridge window [mem 0x000d0000-0x000dffff] (ignored)
pci_root PNP0A03:00: host bridge window [mem 0xcc000000-0xdfffffff] (ignored)
pci_root PNP0A03:00: host bridge window [mem 0xf0000000-0xfebfffff] (ignored)
pci 0000:00:11.0: reg 10: [io  0xc000-0xc007]
pci 0000:00:11.0: reg 14: [io  0xb000-0xb003]
pci 0000:00:11.0: reg 18: [io  0xa000-0xa007]
pci 0000:00:11.0: reg 1c: [io  0x9000-0x9003]
pci 0000:00:11.0: reg 20: [io  0x8000-0x800f]
pci 0000:00:11.0: reg 24: [mem 0xfbcffc00-0xfbcfffff]
pci 0000:00:11.0: set SATA to AHCI mode
pci 0000:00:12.0: reg 10: [mem 0xfbcfd000-0xfbcfdfff]
pci 0000:00:12.1: reg 10: [mem 0xfbcfe000-0xfbcfefff]
pci 0000:00:12.2: reg 10: [mem 0xfbcff800-0xfbcff8ff]
pci 0000:00:12.2: supports D1 D2
pci 0000:00:12.2: PME# supported from D0 D1 D2 D3hot
pci 0000:00:12.2: PME# disabled
pci 0000:00:13.0: reg 10: [mem 0xfbcfb000-0xfbcfbfff]
pci 0000:00:13.1: reg 10: [mem 0xfbcfc000-0xfbcfcfff]
pci 0000:00:13.2: reg 10: [mem 0xfbcff400-0xfbcff4ff]
pci 0000:00:13.2: supports D1 D2
pci 0000:00:13.2: PME# supported from D0 D1 D2 D3hot
pci 0000:00:13.2: PME# disabled
pci 0000:00:14.1: reg 10: [io  0x0000-0x0007]
pci 0000:00:14.1: reg 14: [io  0x0000-0x0003]
pci 0000:00:14.1: reg 18: [io  0x0000-0x0007]
pci 0000:00:14.1: reg 1c: [io  0x0000-0x0003]
pci 0000:00:14.1: reg 20: [io  0xff00-0xff0f]
pci 0000:00:14.5: reg 10: [mem 0xfbcfa000-0xfbcfafff]
pci 0000:01:05.0: reg 10: [mem 0xd0000000-0xdfffffff pref]
pci 0000:01:05.0: reg 14: [io  0xd000-0xd0ff]
pci 0000:01:05.0: reg 18: [mem 0xfbee0000-0xfbeeffff]
pci 0000:01:05.0: reg 24: [mem 0xfbd00000-0xfbdfffff]
pci 0000:01:05.0: supports D1 D2
pci 0000:01:05.1: reg 10: [mem 0xfbefc000-0xfbefffff]
pci 0000:01:05.1: supports D1 D2
pci 0000:00:01.0: PCI bridge to [bus 01-01]
pci 0000:00:01.0:   bridge window [io  0xd000-0xdfff]
pci 0000:00:01.0:   bridge window [mem 0xfbd00000-0xfbefffff]
pci 0000:00:01.0:   bridge window [mem 0xd0000000-0xdfffffff 64bit pref]
pci 0000:02:05.0: reg 10: [io  0xe800-0xe8ff]
pci 0000:02:05.0: reg 14: [mem 0xfbfffc00-0xfbfffcff]
pci 0000:02:05.0: reg 30: [mem 0xfbfc0000-0xfbfdffff pref]
pci 0000:02:05.0: supports D1 D2
pci 0000:02:05.0: PME# supported from D1 D2 D3hot D3cold
pci 0000:02:05.0: PME# disabled
pci 0000:00:14.4: PCI bridge to [bus 02-02] (subtractive decode)
pci 0000:00:14.4:   bridge window [io  0xe000-0xefff]
pci 0000:00:14.4:   bridge window [mem 0xfbf00000-0xfbffffff]
pci_bus 0000:00: on NUMA node 0
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.P0P1._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.P0PC._PRT]
ACPI: PCI Interrupt Link [LNKA] (IRQs *4 7 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKB] (IRQs 4 *7 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKC] (IRQs 4 7 *10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKD] (IRQs 4 7 10 *11 12 14 15)
ACPI: PCI Interrupt Link [LNKE] (IRQs 4 7 10 *11 12 14 15)
ACPI: PCI Interrupt Link [LNKF] (IRQs 4 7 10 11 12 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNKG] (IRQs 4 7 *10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKH] (IRQs 4 7 10 11 12 14 15) *0, disabled.
vgaarb: device added: PCI:0000:01:05.0,decodes=io+mem,owns=io+mem,locks=none
vgaarb: loaded
SCSI subsystem initialized
libata version 3.00 loaded.
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
PCI: Using ACPI for IRQ routing
PCI: pci_cache_line_size set to 64 bytes
HPET: 4 timers in total, 1 timers will be used for per-cpu timer
hpet0: at MMIO 0xfed00000, IRQs 2, 8, 24, 0
hpet0: 4 comparators, 32-bit 14.318180 MHz counter
hpet: hpet2 irq 24 for MSI
Switching to clocksource tsc
pnp: PnP ACPI init
ACPI: bus type pnp registered
pnp: PnP ACPI: found 13 devices
ACPI: ACPI bus type pnp unregistered
system 00:01: [mem 0xcc000000-0xcfffffff] has been reserved
system 00:07: [mem 0xfec00000-0xfec00fff] could not be reserved
system 00:07: [mem 0xfee00000-0xfee00fff] has been reserved
system 00:08: [io  0x04d0-0x04d1] has been reserved
system 00:08: [io  0x040b] has been reserved
system 00:08: [io  0x04d6] has been reserved
system 00:08: [io  0x0c00-0x0c01] has been reserved
system 00:08: [io  0x0c14] has been reserved
system 00:08: [io  0x0c50-0x0c51] has been reserved
system 00:08: [io  0x0c52] has been reserved
system 00:08: [io  0x0c6c] has been reserved
system 00:08: [io  0x0c6f] has been reserved
system 00:08: [io  0x0cd0-0x0cd1] has been reserved
system 00:08: [io  0x0cd2-0x0cd3] has been reserved
system 00:08: [io  0x0cd4-0x0cd5] has been reserved
system 00:08: [io  0x0cd6-0x0cd7] has been reserved
system 00:08: [io  0x0cd8-0x0cdf] has been reserved
system 00:08: [io  0x0b00-0x0b3f] has been reserved
system 00:08: [io  0x0800-0x089f] has been reserved
system 00:08: [io  0x0b00-0x0b0f] has been reserved
system 00:08: [io  0x0b20-0x0b3f] has been reserved
system 00:08: [io  0x0900-0x090f] has been reserved
system 00:08: [io  0x0910-0x091f] has been reserved
system 00:08: [io  0xfe00-0xfefe] has been reserved
system 00:08: [mem 0xffb80000-0xffbfffff] has been reserved
system 00:08: [mem 0xfec10000-0xfec1001f] has been reserved
system 00:0a: [io  0x0230-0x023f] has been reserved
system 00:0a: [io  0x0290-0x029f] has been reserved
system 00:0a: [io  0x0f40-0x0f4f] has been reserved
system 00:0a: [io  0x0a30-0x0a3f] has been reserved
system 00:0b: [mem 0xe0000000-0xefffffff] has been reserved
system 00:0c: [mem 0x00000000-0x0009ffff] could not be reserved
system 00:0c: [mem 0x000c0000-0x000cffff] has been reserved
system 00:0c: [mem 0x000e0000-0x000fffff] could not be reserved
system 00:0c: [mem 0x00100000-0xcbffffff] could not be reserved
system 00:0c: [mem 0xfec00000-0xffffffff] could not be reserved
pci 0000:00:01.0: PCI bridge to [bus 01-01]
pci 0000:00:01.0:   bridge window [io  0xd000-0xdfff]
pci 0000:00:01.0:   bridge window [mem 0xfbd00000-0xfbefffff]
pci 0000:00:01.0:   bridge window [mem 0xd0000000-0xdfffffff 64bit pref]
pci 0000:00:14.4: PCI bridge to [bus 02-02]
pci 0000:00:14.4:   bridge window [io  0xe000-0xefff]
pci 0000:00:14.4:   bridge window [mem 0xfbf00000-0xfbffffff]
pci 0000:00:14.4:   bridge window [mem pref disabled]
pci_bus 0000:00: resource 0 [io  0x0000-0xffff]
pci_bus 0000:00: resource 1 [mem 0x00000000-0xffffffffffffffff]
pci_bus 0000:01: resource 0 [io  0xd000-0xdfff]
pci_bus 0000:01: resource 1 [mem 0xfbd00000-0xfbefffff]
pci_bus 0000:01: resource 2 [mem 0xd0000000-0xdfffffff 64bit pref]
pci_bus 0000:02: resource 0 [io  0xe000-0xefff]
pci_bus 0000:02: resource 1 [mem 0xfbf00000-0xfbffffff]
pci_bus 0000:02: resource 3 [io  0x0000-0xffff]
pci_bus 0000:02: resource 4 [mem 0x00000000-0xffffffffffffffff]
NET: Registered protocol family 2
IP route cache hash table entries: 131072 (order: 8, 1048576 bytes)
TCP established hash table entries: 262144 (order: 10, 4194304 bytes)
TCP bind hash table entries: 65536 (order: 8, 1048576 bytes)
TCP: Hash tables configured (established 262144 bind 65536)
TCP reno registered
UDP hash table entries: 2048 (order: 4, 65536 bytes)
UDP-Lite hash table entries: 2048 (order: 4, 65536 bytes)
NET: Registered protocol family 1
pci 0000:01:05.0: Boot video device
PCI: CLS 64 bytes, default 64
PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
Placing 64MB software IO TLB between ffff88002c600000 - ffff880030600000
software IO TLB at phys 0x2c600000 - 0x30600000
kvm: Nested Virtualization enabled
kvm: Nested Paging enabled
msgmni has been set to 7789
Block layer SCSI generic (bsg) driver version 0.4 loaded (major 254)
io scheduler noop registered
io scheduler cfq registered (default)
input: Power Button as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0C0C:00/input/input0
ACPI: Power Button [PWRB]
input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input1
ACPI: Power Button [PWRF]
processor LNXCPU:00: registered as cooling_device0
processor LNXCPU:01: registered as cooling_device1
processor LNXCPU:02: registered as cooling_device2
processor LNXCPU:03: registered as cooling_device3
Real Time Clock Driver v1.12b
Linux agpgart interface v0.103
[drm] Initialized drm 1.1.0 20060810
[drm] radeon defaulting to kernel modesetting.
[drm] radeon kernel modesetting enabled.
radeon 0000:01:05.0: PCI INT A -> GSI 18 (level, low) -> IRQ 18
radeon 0000:01:05.0: setting latency timer to 64
[drm] radeon: Initializing kernel modesetting.
[drm] register mmio base: 0xFBEE0000
[drm] register mmio size: 65536
ATOM BIOS: 113
[drm] Clocks initialized !
[drm] Detected VRAM RAM=192M, BAR=256M
[drm] RAM width 32bits DDR
[TTM] Zone  kernel: Available graphics memory: 1994122 kiB.
[drm] radeon: 192M of VRAM memory ready
[drm] radeon: 512M of GTT memory ready.
[drm] radeon: irq initialized.
[drm] GART: num cpu pages 131072, num gpu pages 131072
[drm] Loading RS780 Microcode
platform radeon_cp.0: firmware: using built-in firmware radeon/RS780_pfp.bin
platform radeon_cp.0: firmware: using built-in firmware radeon/RS780_me.bin
platform radeon_cp.0: firmware: using built-in firmware radeon/R600_rlc.bin
[drm] ring test succeeded in 1 usecs
[drm] radeon: ib pool ready.
[drm] ib test succeeded in 0 usecs
[drm] Radeon Display Connectors
[drm] Connector 0:
[drm]   VGA
[drm]   DDC: 0x7e40 0x7e40 0x7e44 0x7e44 0x7e48 0x7e48 0x7e4c 0x7e4c
[drm]   Encoders:
[drm]     CRT1: INTERNAL_KLDSCP_DAC1
[drm] Connector 1:
[drm]   DVI-D
[drm]   HPD3
[drm]   DDC: 0x7e50 0x7e50 0x7e54 0x7e54 0x7e58 0x7e58 0x7e5c 0x7e5c
[drm]   Encoders:
[drm]     DFP3: INTERNAL_KLDSCP_LVTMA
[drm] fb mappable at 0xD0141000
[drm] vram apper at 0xD0000000
[drm] size 7257600
[drm] fb depth is 24
[drm]    pitch is 6912
executing set pll
executing set crtc timing
[drm] TMDS-11: set mode 1680x1050 1d
Console: switching to colour frame buffer device 131x105
fb0: radeondrmfb frame buffer device
registered panic notifier
[drm] Initialized radeon 2.0.0 20080528 for 0000:01:05.0 on minor 0
loop: module loaded
ahci 0000:00:11.0: version 3.0
ahci 0000:00:11.0: PCI INT A -> GSI 22 (level, low) -> IRQ 22
ahci 0000:00:11.0: AHCI 0001.0100 32 slots 4 ports 3 Gbps 0xf impl SATA mode
ahci 0000:00:11.0: flags: 64bit ncq sntf ilck pm led clo pmp pio slum part ccc 
scsi0 : ahci
scsi1 : ahci
scsi2 : ahci
scsi3 : ahci
ata1: SATA max UDMA/133 irq_stat 0x00400000, PHY RDY changed
ata2: SATA max UDMA/133 abar m1024@0xfbcffc00 port 0xfbcffd80 irq 22
ata3: SATA max UDMA/133 abar m1024@0xfbcffc00 port 0xfbcffe00 irq 22
ata4: SATA max UDMA/133 abar m1024@0xfbcffc00 port 0xfbcffe80 irq 22
pata_atiixp 0000:00:14.1: PCI INT A -> GSI 16 (level, low) -> IRQ 16
pata_atiixp 0000:00:14.1: setting latency timer to 64
scsi4 : pata_atiixp
scsi5 : pata_atiixp
ata5: PATA max UDMA/100 cmd 0x1f0 ctl 0x3f6 bmdma 0xff00 irq 14
ata6: PATA max UDMA/100 cmd 0x170 ctl 0x376 bmdma 0xff08 irq 15
tun: Universal TUN/TAP device driver, 1.6
tun: (C) 1999-2004 Max Krasnyansky <maxk@qualcomm.com>
r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded
r8169 0000:02:05.0: PCI INT A -> GSI 20 (level, low) -> IRQ 20
r8169 0000:02:05.0: no PCI Express capability
eth0: RTL8110s at 0xffffc90000454c00, 00:08:54:36:f2:2f, XID 04000000 IRQ 20
ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
ehci_hcd 0000:00:12.2: PCI INT B -> GSI 17 (level, low) -> IRQ 17
ehci_hcd 0000:00:12.2: EHCI Host Controller
ehci_hcd 0000:00:12.2: new USB bus registered, assigned bus number 1
ehci_hcd 0000:00:12.2: applying AMD SB600/SB700 USB freeze workaround
ehci_hcd 0000:00:12.2: debug port 1
ehci_hcd 0000:00:12.2: irq 17, io mem 0xfbcff800
ehci_hcd 0000:00:12.2: USB 2.0 started, EHCI 1.00
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 6 ports detected
ehci_hcd 0000:00:13.2: PCI INT B -> GSI 19 (level, low) -> IRQ 19
ehci_hcd 0000:00:13.2: EHCI Host Controller
ehci_hcd 0000:00:13.2: new USB bus registered, assigned bus number 2
ehci_hcd 0000:00:13.2: applying AMD SB600/SB700 USB freeze workaround
ehci_hcd 0000:00:13.2: debug port 1
ehci_hcd 0000:00:13.2: irq 19, io mem 0xfbcff400
ehci_hcd 0000:00:13.2: USB 2.0 started, EHCI 1.00
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 6 ports detected
ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
ohci_hcd 0000:00:12.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
ohci_hcd 0000:00:12.0: OHCI Host Controller
ohci_hcd 0000:00:12.0: new USB bus registered, assigned bus number 3
ohci_hcd 0000:00:12.0: irq 16, io mem 0xfbcfd000
hub 3-0:1.0: USB hub found
hub 3-0:1.0: 3 ports detected
ohci_hcd 0000:00:12.1: PCI INT A -> GSI 16 (level, low) -> IRQ 16
ohci_hcd 0000:00:12.1: OHCI Host Controller
ohci_hcd 0000:00:12.1: new USB bus registered, assigned bus number 4
ohci_hcd 0000:00:12.1: irq 16, io mem 0xfbcfe000
hub 4-0:1.0: USB hub found
hub 4-0:1.0: 3 ports detected
ohci_hcd 0000:00:13.0: PCI INT A -> GSI 18 (level, low) -> IRQ 18
ohci_hcd 0000:00:13.0: OHCI Host Controller
ohci_hcd 0000:00:13.0: new USB bus registered, assigned bus number 5
ohci_hcd 0000:00:13.0: irq 18, io mem 0xfbcfb000
ata5.00: ATAPI: HL-DT-STDVD-RAM GH22NP20, 1.03, max UDMA/66
ata5.00: configured for UDMA/66
hub 5-0:1.0: USB hub found
hub 5-0:1.0: 3 ports detected
ohci_hcd 0000:00:13.1: PCI INT A -> GSI 18 (level, low) -> IRQ 18
ohci_hcd 0000:00:13.1: OHCI Host Controller
ohci_hcd 0000:00:13.1: new USB bus registered, assigned bus number 6
ohci_hcd 0000:00:13.1: irq 18, io mem 0xfbcfc000
hub 6-0:1.0: USB hub found
hub 6-0:1.0: 3 ports detected
ohci_hcd 0000:00:14.5: PCI INT C -> GSI 18 (level, low) -> IRQ 18
ohci_hcd 0000:00:14.5: OHCI Host Controller
ohci_hcd 0000:00:14.5: new USB bus registered, assigned bus number 7
ohci_hcd 0000:00:14.5: irq 18, io mem 0xfbcfa000
hub 7-0:1.0: USB hub found
hub 7-0:1.0: 2 ports detected
Initializing USB Mass Storage driver...
usbcore: registered new interface driver usb-storage
USB Mass Storage support registered.
PNP: No PS/2 controller found. Probing ports directly.
serio: i8042 KBD port at 0x60,0x64 irq 1
serio: i8042 AUX port at 0x60,0x64 irq 12
mice: PS/2 mouse device common for all mice
i2c /dev entries driver
cpuidle: using governor ladder
cpuidle: using governor menu
usbcore: registered new interface driver usbhid
usbhid: USB HID core driver
Advanced Linux Sound Architecture Driver Version 1.0.21.
usbcore: registered new interface driver snd-usb-audio
ALSA device list:
  No soundcards found.
Netfilter messages via NETLINK v0.30.
nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
ctnetlink v0.93: registering with nfnetlink.
ip_tables: (C) 2000-2006 Netfilter Core Team
TCP cubic registered
NET: Registered protocol family 17
powernow-k8: Found 1 AMD Phenom(tm) II X4 955 Processor processors (4 cpu cores) (version 2.20.00)
powernow-k8:    0 : pstate 0 (3200 MHz)
powernow-k8:    1 : pstate 1 (2500 MHz)
powernow-k8:    2 : pstate 2 (2100 MHz)
powernow-k8:    3 : pstate 3 (800 MHz)
registered taskstats version 1
ata2: SATA link down (SStatus 0 SControl 300)
ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata4: SATA link down (SStatus 0 SControl 300)
ata3.00: ATA-8: OCZ-VERTEX, 1.4, max UDMA/133
ata3.00: 62533296 sectors, multi 1: LBA48 NCQ (depth 31/32), AA
ata3.00: configured for UDMA/133
ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata1.00: ATA-7: SAMSUNG HD103UJ, 1AA01118, max UDMA7
ata1.00: 1953525168 sectors, multi 16: LBA48 NCQ (depth 31/32), AA
usb 4-1: new full speed USB device using ohci_hcd and address 2
ata1.00: configured for UDMA/133
scsi 0:0:0:0: Direct-Access     ATA      SAMSUNG HD103UJ  1AA0 PQ: 0 ANSI: 5
sd 0:0:0:0: [sda] 1953525168 512-byte logical blocks: (1.00 TB/931 GiB)
sd 0:0:0:0: Attached scsi generic sg0 type 0
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
scsi 2:0:0:0: Direct-Access     ATA      OCZ-VERTEX       1.4  PQ: 0 ANSI: 5
 sda:
sd 2:0:0:0: [sdb] 62533296 512-byte logical blocks: (32.0 GB/29.8 GiB)
sd 2:0:0:0: [sdb] Write Protect is off
sd 2:0:0:0: [sdb] Mode Sense: 00 3a 00 00
sd 2:0:0:0: Attached scsi generic sg1 type 0
sd 2:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
 sdb: unknown partition table
sd 2:0:0:0: [sdb] Attached SCSI disk
scsi 4:0:0:0: CD-ROM            HL-DT-ST DVD-RAM GH22NP20 1.03 PQ: 0 ANSI: 5
 sda1 sda2 sda3
sr0: scsi3-mmc drive: 48x/48x writer dvd-ram cd/rw xa/form2 cdda tray
Uniform CD-ROM driver Revision: 3.20
sd 0:0:0:0: [sda] Attached SCSI disk
sr 4:0:0:0: Attached scsi CD-ROM sr0
sr 4:0:0:0: Attached scsi generic sg2 type 5
EXT4-fs (sdb): INFO: recovery required on readonly filesystem
EXT4-fs (sdb): write access will be enabled during recovery
EXT4-fs (sdb): recovery complete
EXT4-fs (sdb): mounted filesystem with ordered data mode
VFS: Mounted root (ext4 filesystem) readonly on device 8:16.
Freeing unused kernel memory: 416k freed
Write protecting the kernel read-only data: 6144k
Freeing unused kernel memory: 324k freed
Freeing unused kernel memory: 496k freed
input: C-Media USB Headphone Set   as /devices/pci0000:00/0000:00:12.1/usb4/4-1/4-1:1.3/input/input2
generic-usb 0003:0D8C:000C.0001: input: USB HID v1.00 Device [C-Media USB Headphone Set  ] on usb-0000:00:12.1-1/input3
udev: starting version 146
usb 3-1: new full speed USB device using ohci_hcd and address 2
input: Logitech USB Receiver as /devices/pci0000:00/0000:00:12.0/usb3/3-1/3-1:1.0/input/input3
generic-usb 0003:046D:C526.0002: input: USB HID v1.11 Mouse [Logitech USB Receiver] on usb-0000:00:12.0-1/input0
input: Logitech USB Receiver as /devices/pci0000:00/0000:00:12.0/usb3/3-1/3-1:1.1/input/input4
generic-usb 0003:046D:C526.0003: input: USB HID v1.11 Device [Logitech USB Receiver] on usb-0000:00:12.0-1/input1
usb 3-2: new low speed USB device using ohci_hcd and address 3
input: HID 046a:0021 as /devices/pci0000:00/0000:00:12.0/usb3/3-2/3-2:1.0/input/input5
generic-usb 0003:046A:0021.0004: input: USB HID v1.11 Keyboard [HID 046a:0021] on usb-0000:00:12.0-2/input0
input: HID 046a:0021 as /devices/pci0000:00/0000:00:12.0/usb3/3-2/3-2:1.1/input/input6
generic-usb 0003:046A:0021.0005: input: USB HID v1.11 Device [HID 046a:0021] on usb-0000:00:12.0-2/input1
EXT4-fs (sda1): mounted filesystem with ordered data mode
EXT4-fs (sda2): mounted filesystem with ordered data mode
EXT4-fs (sda3): mounted filesystem with ordered data mode
Adding 255992k swap on /var/cache/swap/swapfile.  Priority:-1 extents:2 across:354296k 
r8169: eth0: link up
executing set pll
executing set crtc timing
[drm] TMDS-11: set mode 1680x1050 1d


* Re: kexec boot regression
  2009-12-15 21:26                                 ` Yinghai Lu
@ 2009-12-15 21:30                                   ` Jens Axboe
  2009-12-15 21:40                                     ` Jens Axboe
  0 siblings, 1 reply; 42+ messages in thread
From: Jens Axboe @ 2009-12-15 21:30 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha,
	linux-pci, H. Peter Anvin, Huang Ying

On Tue, Dec 15 2009, Yinghai Lu wrote:
> Jens Axboe wrote:
> > On Tue, Dec 15 2009, Jens Axboe wrote:
> >> On Tue, Dec 15 2009, Jens Axboe wrote:
> >>> On Tue, Dec 15 2009, Yinghai Lu wrote:
> >>>> Jens Axboe wrote:
> >>>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
> >>>>>> Jens Axboe wrote:
> >>>>>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
> >>>>>>>> Jens Axboe wrote:
> >>>>>>>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
> >>>>>>>>>> [PATCH] x86/pci: intel ioh bus num reg accessing fix
> >>>>>>>>>>
> >>>>>>>>>> it is above 0x100, so if mmconf is not enable, need to skip it
> >>>>>>>>> This works, it kexecs kernels fine. But since 2.6.32 doesn't have the
> >>>>>>>>> mmconf problem to begin with, are we now just working around the issue?
> >>>>>>>>> SRAT still reports issues, numa doesn't work.
> >>>>>>>> that patch will be bullet proof... we need it.
> >>>>>>>>
> >>>>>>>> also still need to figure out why memmap range is not passed properly.
> >>>>>>>>
> >>>>>>>> do you mean 2.6.32 kexec 2.6.32 it have worked mmconf and numa in
> >>>>>>>> second kernel?
> >>>>>>> Yes, 2.6.32 booted and 2.6.32 kexec'ed works just fine, no SRAT
> >>>>>>> complaints and NUMA works fine.
> >>>>>> do you need 
> >>>>>> memmap=62G@4G
> >>>>>> in this case?
> >>>>> Yes, I've needed that always.
> >>>> good,
> >>>>
> >>>> can you enable debug option in kexec to see why kexec can not pass
> >>>> whole 38? range to second kernel?
> >>> Not getting any output so far, -d doesn't do much. Poking around in the
> >>> source...
> >> OK, cold boot and kexec 2.0.1 gets all 39 ranges passed properly to
> >> kexec'ed kernels. Since the older kexec stopped at range 30 (31 ranges
> >> total), that smells like just a kexec bug. Retesting -git...
> > 
> > Current -git works fine when all the ranges are passed correctly. So, I
> > think, the only existing regression is the SRAT issue.
> 
> did you change node_shift?

Yes:

CONFIG_NODES_SHIFT=6

What I don't get is that 2.6.32 and -git print the same PXM map, and in
both cases it's totalling exactly 64G. Yet it says:

SRAT: PXMs only cover 49035MB of your 65419MB e820 RAM. Not used.

-- 
Jens Axboe



* Re: kexec boot regression
  2009-12-15 21:30                                   ` Jens Axboe
@ 2009-12-15 21:40                                     ` Jens Axboe
  2009-12-15 21:43                                       ` Yinghai Lu
  0 siblings, 1 reply; 42+ messages in thread
From: Jens Axboe @ 2009-12-15 21:40 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha,
	linux-pci, H. Peter Anvin, Huang Ying

On Tue, Dec 15 2009, Jens Axboe wrote:
> On Tue, Dec 15 2009, Yinghai Lu wrote:
> > Jens Axboe wrote:
> > > On Tue, Dec 15 2009, Jens Axboe wrote:
> > >> On Tue, Dec 15 2009, Jens Axboe wrote:
> > >>> On Tue, Dec 15 2009, Yinghai Lu wrote:
> > >>>> Jens Axboe wrote:
> > >>>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
> > >>>>>> Jens Axboe wrote:
> > >>>>>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
> > >>>>>>>> Jens Axboe wrote:
> > >>>>>>>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
> > >>>>>>>>>> [PATCH] x86/pci: intel ioh bus num reg accessing fix
> > >>>>>>>>>>
> > >>>>>>>>>> it is above 0x100, so if mmconf is not enable, need to skip it
> > >>>>>>>>> This works, it kexecs kernels fine. But since 2.6.32 doesn't have the
> > >>>>>>>>> mmconf problem to begin with, are we now just working around the issue?
> > >>>>>>>>> SRAT still reports issues, numa doesn't work.
> > >>>>>>>> that patch will be bullet proof... we need it.
> > >>>>>>>>
> > >>>>>>>> also still need to figure out why memmap range is not passed properly.
> > >>>>>>>>
> > >>>>>>>> do you mean 2.6.32 kexec 2.6.32 it have worked mmconf and numa in
> > >>>>>>>> second kernel?
> > >>>>>>> Yes, 2.6.32 booted and 2.6.32 kexec'ed works just fine, no SRAT
> > >>>>>>> complaints and NUMA works fine.
> > >>>>>> do you need 
> > >>>>>> memmap=62G@4G
> > >>>>>> in this case?
> > >>>>> Yes, I've needed that always.
> > >>>> good,
> > >>>>
> > >>>> can you enable debug option in kexec to see why kexec can not pass
> > >>>> whole 38? range to second kernel?
> > >>> Not getting any output so far, -d doesn't do much. Poking around in the
> > >>> source...
> > >> OK, cold boot and kexec 2.0.1 gets all 39 ranges passed properly to
> > >> kexec'ed kernels. Since the older kexec stopped at range 30 (31 ranges
> > >> total), that smells like just a kexec bug. Retesting -git...
> > > 
> > > Current -git works fine when all the ranges are passed correctly. So, I
> > > think, the only existing regression is the SRAT issue.
> > 
> > did you change node_shift?
> 
> Yes:
> 
> CONFIG_NODES_SHIFT=6
> 
> What I don't get is that 2.6.32 and -git print the same PXM map, and in
> both cases it's totalling exactly 64G. Yet it says:
> 
> SRAT: PXMs only cover 49035MB of your 65419MB e820 RAM. Not used.

Clue:

[    0.000000] SRAT: Node 0 PXM 0 0-80000000
[    0.000000] SRAT: Node 0 PXM 0 100000000-480000000
[    0.000000] SRAT: Node 2 PXM 1 480000000-880000000
[    0.000000] SRAT: Node 1 PXM 2 880000000-c80000000
[    0.000000] SRAT: Node 3 PXM 3 c80000000-1080000000
[    0.000000] NUMA: Using 31 for the hash shift.
[    0.000000] pxm0: 0-480000 (4718592), absent 553990
[    0.000000] pxm1: 880000-c80000 (4194304), absent 0
[    0.000000] pxm2: 480000-880000 (4194304), absent 4194304
[    0.000000] pxm3: c80000-1080000 (4194304), absent 0
[    0.000000] SRAT: PXMs only cover 49035MB of your 65419MB e820 RAM.  Not used.
[    0.000000] SRAT: SRAT not used.

It's essentially disregarding pxm2, claiming all pages are absent.
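The arithmetic behind that message can be checked by hand. Below is a minimal sketch in user-space C, assuming 4 KiB pages and that the covered figure is simply the sum of (span pages - absent pages) over all PXMs, converted to megabytes. The struct and function names are made up for illustration; this is not the kernel's nodes_cover_memory() code, only a reconstruction that reproduces the logged numbers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, as on x86-64. */
#define PAGE_SHIFT 12

/* Hypothetical record mirroring one "pxmN: start-end (pages), absent N" line. */
struct pxm_span {
    uint64_t start_pfn;
    uint64_t end_pfn;
    uint64_t absent;
};

/* Pages in the span that are counted as backed by RAM. */
static uint64_t pxm_present_pages(const struct pxm_span *p)
{
    assert(p->end_pfn >= p->start_pfn);
    assert(p->absent <= p->end_pfn - p->start_pfn);
    return (p->end_pfn - p->start_pfn) - p->absent;
}

/* Sum present pages across all spans and convert to whole megabytes. */
static uint64_t pxm_covered_mb(const struct pxm_span *spans, size_t n)
{
    uint64_t pages = 0;

    for (size_t i = 0; i < n; i++)
        pages += pxm_present_pages(&spans[i]);
    return (pages << PAGE_SHIFT) >> 20;
}

/* Values copied from the log above (PFNs are hexadecimal). */
static const struct pxm_span logged_spans[] = {
    { 0x000000, 0x480000,   553990 },  /* pxm0 */
    { 0x880000, 0xc80000,        0 },  /* pxm1 */
    { 0x480000, 0x880000,  4194304 },  /* pxm2: every page counted as absent */
    { 0xc80000, 0x1080000,       0 },  /* pxm3 */
};
```

With the logged values, pxm_covered_mb(logged_spans, 4) gives 49035, the number in the SRAT message. Setting pxm2's absent count to 0 gives 65419, which matches the e820 figure in the same line, so the missing 16G is consistent with pxm2 alone being miscounted.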

-- 
Jens Axboe



* Re: kexec boot regression
  2009-12-15 21:40                                     ` Jens Axboe
@ 2009-12-15 21:43                                       ` Yinghai Lu
  2009-12-15 21:47                                         ` Jens Axboe
  0 siblings, 1 reply; 42+ messages in thread
From: Yinghai Lu @ 2009-12-15 21:43 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha,
	linux-pci, H. Peter Anvin, Huang Ying

[-- Attachment #1: Type: text/plain, Size: 2992 bytes --]

Jens Axboe wrote:
> On Tue, Dec 15 2009, Jens Axboe wrote:
>> On Tue, Dec 15 2009, Yinghai Lu wrote:
>>> Jens Axboe wrote:
>>>> On Tue, Dec 15 2009, Jens Axboe wrote:
>>>>> On Tue, Dec 15 2009, Jens Axboe wrote:
>>>>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
>>>>>>> Jens Axboe wrote:
>>>>>>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
>>>>>>>>> Jens Axboe wrote:
>>>>>>>>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
>>>>>>>>>>> Jens Axboe wrote:
>>>>>>>>>>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
>>>>>>>>>>>>> [PATCH] x86/pci: intel ioh bus num reg accessing fix
>>>>>>>>>>>>>
>>>>>>>>>>>>> it is above 0x100, so if mmconf is not enable, need to skip it
>>>>>>>>>>>> This works, it kexecs kernels fine. But since 2.6.32 doesn't have the
>>>>>>>>>>>> mmconf problem to begin with, are we now just working around the issue?
>>>>>>>>>>>> SRAT still reports issues, numa doesn't work.
>>>>>>>>>>> that patch will be bullet proof... we need it.
>>>>>>>>>>>
>>>>>>>>>>> also still need to figure out why memmap range is not passed properly.
>>>>>>>>>>>
>>>>>>>>>>> do you mean 2.6.32 kexec 2.6.32 it have worked mmconf and numa in
>>>>>>>>>>> second kernel?
>>>>>>>>>> Yes, 2.6.32 booted and 2.6.32 kexec'ed works just fine, no SRAT
>>>>>>>>>> complaints and NUMA works fine.
>>>>>>>>> do you need 
>>>>>>>>> memmap=62G@4G
>>>>>>>>> in this case?
>>>>>>>> Yes, I've needed that always.
>>>>>>> good,
>>>>>>>
>>>>>>> can you enable debug option in kexec to see why kexec can not pass
>>>>>>> whole 38? range to second kernel?
>>>>>> Not getting any output so far, -d doesn't do much. Poking around in the
>>>>>> source...
>>>>> OK, cold boot and kexec 2.0.1 gets all 39 ranges passed properly to
>>>>> kexec'ed kernels. Since the older kexec stopped at range 30 (31 ranges
>>>>> total), that smells like just a kexec bug. Retesting -git...
>>>> Current -git works fine when all the ranges are passed correctly. So, I
>>>> think, the only existing regression is the SRAT issue.
>>> did you change node_shift?
>> Yes:
>>
>> CONFIG_NODES_SHIFT=6
>>
>> What I don't get is that 2.6.32 and -git print the same PXM map, and in
>> both cases it's totalling exactly 64G. Yet it says:
>>
>> SRAT: PXMs only cover 49035MB of your 65419MB e820 RAM. Not used.
> 
> Clue:
> 
> [    0.000000] SRAT: Node 0 PXM 0 0-80000000
> [    0.000000] SRAT: Node 0 PXM 0 100000000-480000000
> [    0.000000] SRAT: Node 2 PXM 1 480000000-880000000
> [    0.000000] SRAT: Node 1 PXM 2 880000000-c80000000
> [    0.000000] SRAT: Node 3 PXM 3 c80000000-1080000000
> [    0.000000] NUMA: Using 31 for the hash shift.
> [    0.000000] pxm0: 0-480000 (4718592), absent 553990
> [    0.000000] pxm1: 880000-c80000 (4194304), absent 0
> [    0.000000] pxm2: 480000-880000 (4194304), absent 4194304
> [    0.000000] pxm3: c80000-1080000 (4194304), absent 0
> [    0.000000] SRAT: PXMs only cover 49035MB of your 65419MB e820 RAM.  Not used.
> [    0.000000] SRAT: SRAT not used.
> 

Oh, I posted a patch for this last week.

Can you check it?

YH

[-- Attachment #2: Attached Message --]
[-- Type: message/rfc822, Size: 5721 bytes --]

From: Yinghai Lu <yinghai@kernel.org>
To: Ingo Molnar <mingo@elte.hu>, Thomas Gleixner <tglx@linutronix.de>,  "H. Peter Anvin" <hpa@zytor.com>, Andrew Morton <akpm@linux-foundation.org>, Mel Gorman <mel@csn.ul.ie>,  Suresh Siddha <suresh.b.siddha@intel.com>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: [PATCH] x86: fix checking of SRAT when node0 ram is not from 0 -v2
Date: Sun, 13 Dec 2009 15:33:38 -0800
Message-ID: <4B2579D2.3010201@kernel.org>



Found one system that boots from socket1 instead of socket0; its SRAT gets rejected:

[    0.000000] SRAT: Node 1 PXM 0 0-a0000
[    0.000000] SRAT: Node 1 PXM 0 100000-80000000
[    0.000000] SRAT: Node 1 PXM 0 100000000-2080000000
[    0.000000] SRAT: Node 0 PXM 1 2080000000-4080000000
[    0.000000] SRAT: Node 2 PXM 2 4080000000-6080000000
[    0.000000] SRAT: Node 3 PXM 3 6080000000-8080000000
[    0.000000] SRAT: Node 4 PXM 4 8080000000-a080000000
[    0.000000] SRAT: Node 5 PXM 5 a080000000-c080000000
[    0.000000] SRAT: Node 6 PXM 6 c080000000-e080000000
[    0.000000] SRAT: Node 7 PXM 7 e080000000-10080000000
...
[    0.000000] NUMA: Allocated memnodemap from 500000 - 701040
[    0.000000] NUMA: Using 20 for the hash shift.
[    0.000000] Adding active range (0, 0x2080000, 0x4080000) 0 entries of 3200 used
[    0.000000] Adding active range (1, 0x0, 0x96) 1 entries of 3200 used
[    0.000000] Adding active range (1, 0x100, 0x7f750) 2 entries of 3200 used
[    0.000000] Adding active range (1, 0x100000, 0x2080000) 3 entries of 3200 used
[    0.000000] Adding active range (2, 0x4080000, 0x6080000) 4 entries of 3200 used
[    0.000000] Adding active range (3, 0x6080000, 0x8080000) 5 entries of 3200 used
[    0.000000] Adding active range (4, 0x8080000, 0xa080000) 6 entries of 3200 used
[    0.000000] Adding active range (5, 0xa080000, 0xc080000) 7 entries of 3200 used
[    0.000000] Adding active range (6, 0xc080000, 0xe080000) 8 entries of 3200 used
[    0.000000] Adding active range (7, 0xe080000, 0x10080000) 9 entries of 3200 used
[    0.000000] SRAT: PXMs only cover 917504MB of your 1048566MB e820 RAM. Not used.
[    0.000000] SRAT: SRAT not used.

The early_node_map is not sorted, because node0, which has a non-zero start, is registered first.

So sort it right away, after all regions are registered.

-v2: make it more robust, so that it handles interleaved nodes such as node0 [0, 4g), [8g, 12g) and node1 [4g, 8g), [12g, 16g)

Signed-off-by: Yinghai Lu <yinghai@kernel.org>

---
 arch/x86/mm/srat_32.c |    2 ++
 arch/x86/mm/srat_64.c |    4 +++-
 include/linux/mm.h    |    3 +++
 mm/page_alloc.c       |    4 ++--
 4 files changed, 10 insertions(+), 3 deletions(-)

Index: linux-2.6/arch/x86/mm/srat_32.c
===================================================================
--- linux-2.6.orig/arch/x86/mm/srat_32.c
+++ linux-2.6/arch/x86/mm/srat_32.c
@@ -267,6 +267,8 @@ int __init get_memcfg_from_srat(void)
 		e820_register_active_regions(chunk->nid, chunk->start_pfn,
 					     min(chunk->end_pfn, max_pfn));
 	}
+	/* for out of order entries in SRAT */
+	sort_node_map();
 
 	for_each_online_node(nid) {
 		unsigned long start = node_start_pfn[nid];
Index: linux-2.6/arch/x86/mm/srat_64.c
===================================================================
--- linux-2.6.orig/arch/x86/mm/srat_64.c
+++ linux-2.6/arch/x86/mm/srat_64.c
@@ -317,7 +317,7 @@ static int __init nodes_cover_memory(con
 		unsigned long s = nodes[i].start >> PAGE_SHIFT;
 		unsigned long e = nodes[i].end >> PAGE_SHIFT;
 		pxmram += e - s;
-		pxmram -= absent_pages_in_range(s, e);
+		pxmram -= __absent_pages_in_range(i, s, e);
 		if ((long)pxmram < 0)
 			pxmram = 0;
 	}
@@ -373,6 +373,8 @@ int __init acpi_scan_nodes(unsigned long
 	for_each_node_mask(i, nodes_parsed)
 		e820_register_active_regions(i, nodes[i].start >> PAGE_SHIFT,
 						nodes[i].end >> PAGE_SHIFT);
+	/* for out of order entries in SRAT */
+	sort_node_map();
 	if (!nodes_cover_memory(nodes)) {
 		bad_srat();
 		return -1;
Index: linux-2.6/include/linux/mm.h
===================================================================
--- linux-2.6.orig/include/linux/mm.h
+++ linux-2.6/include/linux/mm.h
@@ -1022,6 +1022,9 @@ extern void add_active_range(unsigned in
 extern void remove_active_range(unsigned int nid, unsigned long start_pfn,
 					unsigned long end_pfn);
 extern void remove_all_active_ranges(void);
+void sort_node_map(void);
+unsigned long __absent_pages_in_range(int nid, unsigned long start_pfn,
+						unsigned long end_pfn);
 extern unsigned long absent_pages_in_range(unsigned long start_pfn,
 						unsigned long end_pfn);
 extern void get_pfn_range_for_nid(unsigned int nid,
Index: linux-2.6/mm/page_alloc.c
===================================================================
--- linux-2.6.orig/mm/page_alloc.c
+++ linux-2.6/mm/page_alloc.c
@@ -3573,7 +3573,7 @@ static unsigned long __meminit zone_span
  * Return the number of holes in a range on a node. If nid is MAX_NUMNODES,
  * then all holes in the requested range will be accounted for.
  */
-static unsigned long __meminit __absent_pages_in_range(int nid,
+unsigned long __meminit __absent_pages_in_range(int nid,
 				unsigned long range_start_pfn,
 				unsigned long range_end_pfn)
 {
@@ -4102,7 +4102,7 @@ static int __init cmp_node_active_region
 }
 
 /* sort the node_map by start_pfn */
-static void __init sort_node_map(void)
+void __init sort_node_map(void)
 {
 	sort(early_node_map, (size_t)nr_nodemap_entries,
 			sizeof(struct node_active_region),



* Re: kexec boot regression
  2009-12-15 21:43                                       ` Yinghai Lu
@ 2009-12-15 21:47                                         ` Jens Axboe
  2009-12-15 21:50                                           ` Yinghai Lu
  2009-12-15 21:52                                           ` Jens Axboe
  0 siblings, 2 replies; 42+ messages in thread
From: Jens Axboe @ 2009-12-15 21:47 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha,
	linux-pci, H. Peter Anvin, Huang Ying, rientjes

On Tue, Dec 15 2009, Yinghai Lu wrote:
> Jens Axboe wrote:
> > On Tue, Dec 15 2009, Jens Axboe wrote:
> >> On Tue, Dec 15 2009, Yinghai Lu wrote:
> >>> Jens Axboe wrote:
> >>>> On Tue, Dec 15 2009, Jens Axboe wrote:
> >>>>> On Tue, Dec 15 2009, Jens Axboe wrote:
> >>>>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
> >>>>>>> Jens Axboe wrote:
> >>>>>>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
> >>>>>>>>> Jens Axboe wrote:
> >>>>>>>>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
> >>>>>>>>>>> Jens Axboe wrote:
> >>>>>>>>>>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
> >>>>>>>>>>>>> [PATCH] x86/pci: intel ioh bus num reg accessing fix
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> it is above 0x100, so if mmconf is not enable, need to skip it
> >>>>>>>>>>>> This works, it kexecs kernels fine. But since 2.6.32 doesn't have the
> >>>>>>>>>>>> mmconf problem to begin with, are we now just working around the issue?
> >>>>>>>>>>>> SRAT still reports issues, numa doesn't work.
> >>>>>>>>>>> that patch will be bullet proof... we need it.
> >>>>>>>>>>>
> >>>>>>>>>>> also still need to figure out why memmap range is not passed properly.
> >>>>>>>>>>>
> >>>>>>>>>>> do you mean 2.6.32 kexec 2.6.32 it have worked mmconf and numa in
> >>>>>>>>>>> second kernel?
> >>>>>>>>>> Yes, 2.6.32 booted and 2.6.32 kexec'ed works just fine, no SRAT
> >>>>>>>>>> complaints and NUMA works fine.
> >>>>>>>>> do you need 
> >>>>>>>>> memmap=62G@4G
> >>>>>>>>> in this case?
> >>>>>>>> Yes, I've needed that always.
> >>>>>>> good,
> >>>>>>>
> >>>>>>> can you enable debug option in kexec to see why kexec can not pass
> >>>>>>> whole 38? range to second kernel?
> >>>>>> Not getting any output so far, -d doesn't do much. Poking around in the
> >>>>>> source...
> >>>>> OK, cold boot and kexec 2.0.1 gets all 39 ranges passed properly to
> >>>>> kexec'ed kernels. Since the older kexec stopped at range 30 (31 ranges
> >>>>> total), that smells like just a kexec bug. Retesting -git...
> >>>> Current -git works fine when all the ranges are passed correctly. So, I
> >>>> think, the only existing regression is the SRAT issue.
> >>> did you change node_shift?
> >> Yes:
> >>
> >> CONFIG_NODES_SHIFT=6
> >>
> >> What I don't get is that 2.6.32 and -git print the same PXM map, and in
> >> both cases it's totalling exactly 64G. Yet it says:
> >>
> >> SRAT: PXMs only cover 49035MB of your 65419MB e820 RAM. Not used.
> > 
> > Clue:
> > 
> > [    0.000000] SRAT: Node 0 PXM 0 0-80000000
> > [    0.000000] SRAT: Node 0 PXM 0 100000000-480000000
> > [    0.000000] SRAT: Node 2 PXM 1 480000000-880000000
> > [    0.000000] SRAT: Node 1 PXM 2 880000000-c80000000
> > [    0.000000] SRAT: Node 3 PXM 3 c80000000-1080000000
> > [    0.000000] NUMA: Using 31 for the hash shift.
> > [    0.000000] pxm0: 0-480000 (4718592), absent 553990
> > [    0.000000] pxm1: 880000-c80000 (4194304), absent 0
> > [    0.000000] pxm2: 480000-880000 (4194304), absent 4194304
> > [    0.000000] pxm3: c80000-1080000 (4194304), absent 0
> > [    0.000000] SRAT: PXMs only cover 49035MB of your 65419MB e820 RAM.  Not used.
> > [    0.000000] SRAT: SRAT not used.
> > 
> 
> oh, i post one patch last week, 
> 
> can you check it?

Sure, let me try it. I already found out that commit 8716273c is the
guilty one (x86: Export srat physical topology).

-- 
Jens Axboe



* Re: kexec boot regression
  2009-12-15 21:47                                         ` Jens Axboe
@ 2009-12-15 21:50                                           ` Yinghai Lu
  2009-12-15 21:52                                           ` Jens Axboe
  1 sibling, 0 replies; 42+ messages in thread
From: Yinghai Lu @ 2009-12-15 21:50 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha,
	linux-pci, H. Peter Anvin, Huang Ying, rientjes

Jens Axboe wrote:
> On Tue, Dec 15 2009, Yinghai Lu wrote:
>> Jens Axboe wrote:
>>> On Tue, Dec 15 2009, Jens Axboe wrote:
>>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
>>>>> Jens Axboe wrote:
>>>>>> On Tue, Dec 15 2009, Jens Axboe wrote:
>>>>>>> On Tue, Dec 15 2009, Jens Axboe wrote:
>>>>>>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
>>>>>>>>> Jens Axboe wrote:
>>>>>>>>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
>>>>>>>>>>> Jens Axboe wrote:
>>>>>>>>>>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
>>>>>>>>>>>>> Jens Axboe wrote:
>>>>>>>>>>>>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
>>>>>>>>>>>>>>> [PATCH] x86/pci: intel ioh bus num reg accessing fix
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> it is above 0x100, so if mmconf is not enable, need to skip it
>>>>>>>>>>>>>> This works, it kexecs kernels fine. But since 2.6.32 doesn't have the
>>>>>>>>>>>>>> mmconf problem to begin with, are we now just working around the issue?
>>>>>>>>>>>>>> SRAT still reports issues, numa doesn't work.
>>>>>>>>>>>>> that patch will be bullet proof... we need it.
>>>>>>>>>>>>>
>>>>>>>>>>>>> also still need to figure out why memmap range is not passed properly.
>>>>>>>>>>>>>
>>>>>>>>>>>>> do you mean 2.6.32 kexec 2.6.32 it have worked mmconf and numa in
>>>>>>>>>>>>> second kernel?
>>>>>>>>>>>> Yes, 2.6.32 booted and 2.6.32 kexec'ed works just fine, no SRAT
>>>>>>>>>>>> complaints and NUMA works fine.
>>>>>>>>>>> do you need 
>>>>>>>>>>> memmap=62G@4G
>>>>>>>>>>> in this case?
>>>>>>>>>> Yes, I've needed that always.
>>>>>>>>> good,
>>>>>>>>>
>>>>>>>>> can you enable debug option in kexec to see why kexec can not pass
>>>>>>>>> whole 38? range to second kernel?
>>>>>>>> Not getting any output so far, -d doesn't do much. Poking around in the
>>>>>>>> source...
>>>>>>> OK, cold boot and kexec 2.0.1 gets all 39 ranges passed properly to
>>>>>>> kexec'ed kernels. Since the older kexec stopped at range 30 (31 ranges
>>>>>>> total), that smells like just a kexec bug. Retesting -git...
>>>>>> Current -git works fine when all the ranges are passed correctly. So, I
>>>>>> think, the only existing regression is the SRAT issue.
>>>>> did you change node_shift?
>>>> Yes:
>>>>
>>>> CONFIG_NODES_SHIFT=6
>>>>
>>>> What I don't get is that 2.6.32 and -git print the same PXM map, and in
>>>> both cases it's totalling exactly 64G. Yet it says:
>>>>
>>>> SRAT: PXMs only cover 49035MB of your 65419MB e820 RAM. Not used.
>>> Clue:
>>>
>>> [    0.000000] SRAT: Node 0 PXM 0 0-80000000
>>> [    0.000000] SRAT: Node 0 PXM 0 100000000-480000000
>>> [    0.000000] SRAT: Node 2 PXM 1 480000000-880000000
>>> [    0.000000] SRAT: Node 1 PXM 2 880000000-c80000000
>>> [    0.000000] SRAT: Node 3 PXM 3 c80000000-1080000000
>>> [    0.000000] NUMA: Using 31 for the hash shift.
>>> [    0.000000] pxm0: 0-480000 (4718592), absent 553990
>>> [    0.000000] pxm1: 880000-c80000 (4194304), absent 0
>>> [    0.000000] pxm2: 480000-880000 (4194304), absent 4194304
>>> [    0.000000] pxm3: c80000-1080000 (4194304), absent 0
>>> [    0.000000] SRAT: PXMs only cover 49035MB of your 65419MB e820 RAM.  Not used.
>>> [    0.000000] SRAT: SRAT not used.
>>>
>> oh, i post one patch last week, 
>>
>> can you check it?
> 
> Sure, let me try it. I already found out that commit 8716273c is the
> guilty one (x86: Export srat physical topology).

ok, my patch should fix that.

YH


* Re: kexec boot regression
  2009-12-15 21:47                                         ` Jens Axboe
  2009-12-15 21:50                                           ` Yinghai Lu
@ 2009-12-15 21:52                                           ` Jens Axboe
  2009-12-15 22:24                                             ` Yinghai Lu
  1 sibling, 1 reply; 42+ messages in thread
From: Jens Axboe @ 2009-12-15 21:52 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha,
	linux-pci, H. Peter Anvin, Huang Ying, rientjes

On Tue, Dec 15 2009, Jens Axboe wrote:
> > oh, i post one patch last week, 
> > 
> > can you check it?
> 
> Sure, let me try it. I already found out that commit 8716273c is the
> guilty one (x86: Export srat physical topology).

Confirmed, -git with that patch works as well. So that's all of them I
think, can we please get this expedited in so that -rc1 will work?
Thanks!

-- 
Jens Axboe



* Re: kexec boot regression
  2009-12-15 21:52                                           ` Jens Axboe
@ 2009-12-15 22:24                                             ` Yinghai Lu
  2009-12-16 10:01                                               ` Jens Axboe
  0 siblings, 1 reply; 42+ messages in thread
From: Yinghai Lu @ 2009-12-15 22:24 UTC (permalink / raw)
  To: mingo, H. Peter Anvin, Thomas Gleixner
  Cc: Jens Axboe, Jesse Barnes, Linux Kernel, rdreier, Suresh Siddha,
	linux-pci, Huang Ying, rientjes

Jens Axboe wrote:
> On Tue, Dec 15 2009, Jens Axboe wrote:
>>> oh, i post one patch last week, 
>>>
>>> can you check it?
>> Sure, let me try it. I already found out that commit 8716273c is the
>> guilty one (x86: Export srat physical topology).
> 
> Confirmed, -git with that patch works as well. So that's all of them I
> think, can we please get this expedited in so that -rc1 will work?
> Thanks!

updated version:

[PATCH] x86: fix checking of SRAT when node0 ram is not from 0 -v3

Found one system that boots from socket1 instead of socket0; its SRAT gets rejected...

[    0.000000] SRAT: Node 1 PXM 0 0-a0000
[    0.000000] SRAT: Node 1 PXM 0 100000-80000000
[    0.000000] SRAT: Node 1 PXM 0 100000000-2080000000
[    0.000000] SRAT: Node 0 PXM 1 2080000000-4080000000
[    0.000000] SRAT: Node 2 PXM 2 4080000000-6080000000
[    0.000000] SRAT: Node 3 PXM 3 6080000000-8080000000
[    0.000000] SRAT: Node 4 PXM 4 8080000000-a080000000
[    0.000000] SRAT: Node 5 PXM 5 a080000000-c080000000
[    0.000000] SRAT: Node 6 PXM 6 c080000000-e080000000
[    0.000000] SRAT: Node 7 PXM 7 e080000000-10080000000
...
[    0.000000] NUMA: Allocated memnodemap from 500000 - 701040
[    0.000000] NUMA: Using 20 for the hash shift.
[    0.000000] Adding active range (0, 0x2080000, 0x4080000) 0 entries of 3200 used
[    0.000000] Adding active range (1, 0x0, 0x96) 1 entries of 3200 used
[    0.000000] Adding active range (1, 0x100, 0x7f750) 2 entries of 3200 used
[    0.000000] Adding active range (1, 0x100000, 0x2080000) 3 entries of 3200 used
[    0.000000] Adding active range (2, 0x4080000, 0x6080000) 4 entries of 3200 used
[    0.000000] Adding active range (3, 0x6080000, 0x8080000) 5 entries of 3200 used
[    0.000000] Adding active range (4, 0x8080000, 0xa080000) 6 entries of 3200 used
[    0.000000] Adding active range (5, 0xa080000, 0xc080000) 7 entries of 3200 used
[    0.000000] Adding active range (6, 0xc080000, 0xe080000) 8 entries of 3200 used
[    0.000000] Adding active range (7, 0xe080000, 0x10080000) 9 entries of 3200 used
[    0.000000] SRAT: PXMs only cover 917504MB of your 1048566MB e820 RAM. Not used.
[    0.000000] SRAT: SRAT not used.

The early_node_map is not sorted, because node0, which has a non-zero start, is registered first.

So sort it right away, after all regions are registered.

Also fixes the regression caused by 8716273c (x86: Export srat physical topology).

-v2: make it more robust, so that it handles interleaved nodes such as node0 [0, 4g), [8g, 12g) and node1 [4g, 8g), [12g, 16g)
-v3: update comments.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Tested-by: Jens Axboe <jens.axboe@oracle.com>

---
 arch/x86/mm/srat_32.c |    2 ++
 arch/x86/mm/srat_64.c |    4 +++-
 include/linux/mm.h    |    3 +++
 mm/page_alloc.c       |    4 ++--
 4 files changed, 10 insertions(+), 3 deletions(-)

Index: linux-2.6/arch/x86/mm/srat_32.c
===================================================================
--- linux-2.6.orig/arch/x86/mm/srat_32.c
+++ linux-2.6/arch/x86/mm/srat_32.c
@@ -267,6 +267,8 @@ int __init get_memcfg_from_srat(void)
 		e820_register_active_regions(chunk->nid, chunk->start_pfn,
 					     min(chunk->end_pfn, max_pfn));
 	}
+	/* for out of order entries in SRAT */
+	sort_node_map();
 
 	for_each_online_node(nid) {
 		unsigned long start = node_start_pfn[nid];
Index: linux-2.6/arch/x86/mm/srat_64.c
===================================================================
--- linux-2.6.orig/arch/x86/mm/srat_64.c
+++ linux-2.6/arch/x86/mm/srat_64.c
@@ -317,7 +317,7 @@ static int __init nodes_cover_memory(con
 		unsigned long s = nodes[i].start >> PAGE_SHIFT;
 		unsigned long e = nodes[i].end >> PAGE_SHIFT;
 		pxmram += e - s;
-		pxmram -= absent_pages_in_range(s, e);
+		pxmram -= __absent_pages_in_range(i, s, e);
 		if ((long)pxmram < 0)
 			pxmram = 0;
 	}
@@ -373,6 +373,8 @@ int __init acpi_scan_nodes(unsigned long
 	for_each_node_mask(i, nodes_parsed)
 		e820_register_active_regions(i, nodes[i].start >> PAGE_SHIFT,
 						nodes[i].end >> PAGE_SHIFT);
+	/* for out of order entries in SRAT */
+	sort_node_map();
 	if (!nodes_cover_memory(nodes)) {
 		bad_srat();
 		return -1;
Index: linux-2.6/include/linux/mm.h
===================================================================
--- linux-2.6.orig/include/linux/mm.h
+++ linux-2.6/include/linux/mm.h
@@ -1037,6 +1037,9 @@ extern void add_active_range(unsigned in
 extern void remove_active_range(unsigned int nid, unsigned long start_pfn,
 					unsigned long end_pfn);
 extern void remove_all_active_ranges(void);
+void sort_node_map(void);
+unsigned long __absent_pages_in_range(int nid, unsigned long start_pfn,
+						unsigned long end_pfn);
 extern unsigned long absent_pages_in_range(unsigned long start_pfn,
 						unsigned long end_pfn);
 extern void get_pfn_range_for_nid(unsigned int nid,
Index: linux-2.6/mm/page_alloc.c
===================================================================
--- linux-2.6.orig/mm/page_alloc.c
+++ linux-2.6/mm/page_alloc.c
@@ -3569,7 +3569,7 @@ static unsigned long __meminit zone_span
  * Return the number of holes in a range on a node. If nid is MAX_NUMNODES,
  * then all holes in the requested range will be accounted for.
  */
-static unsigned long __meminit __absent_pages_in_range(int nid,
+unsigned long __meminit __absent_pages_in_range(int nid,
 				unsigned long range_start_pfn,
 				unsigned long range_end_pfn)
 {
@@ -4098,7 +4098,7 @@ static int __init cmp_node_active_region
 }
 
 /* sort the node_map by start_pfn */
-static void __init sort_node_map(void)
+void __init sort_node_map(void)
 {
 	sort(early_node_map, (size_t)nr_nodemap_entries,
 			sizeof(struct node_active_region),


* Re: kexec boot regression radeon/kms (bisected)
  2009-12-15 21:30                   ` Markus Trippelsdorf
@ 2009-12-15 23:02                     ` Markus Trippelsdorf
  0 siblings, 0 replies; 42+ messages in thread
From: Markus Trippelsdorf @ 2009-12-15 23:02 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Jens Axboe, Jesse Barnes, Linux Kernel, mingo, rdreier,
	Suresh Siddha, linux-pci, Alex Deucher, Dave Airlie

On Tue, Dec 15, 2009 at 10:30:21PM +0100, Markus Trippelsdorf wrote:

> I have the same symptoms on my machine, but the underlying cause must be
> different. I once reverted all Radeon related changes since 2.6.32 and 
> kexec started working again.
> 
OK, I bisected this down to:

d8f60cfc93452d0554f6a701aa8e3236cbee4636 is the first bad commit
commit d8f60cfc93452d0554f6a701aa8e3236cbee4636
Author: Alex Deucher <alexdeucher@gmail.com>
Date:   Tue Dec 1 13:43:46 2009 -0500

    drm/radeon/kms: Add support for interrupts on r6xx/r7xx chips (v3)
-- 
Markus


* Re: kexec boot regression
  2009-12-15 22:24                                             ` Yinghai Lu
@ 2009-12-16 10:01                                               ` Jens Axboe
  0 siblings, 0 replies; 42+ messages in thread
From: Jens Axboe @ 2009-12-16 10:01 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: mingo, H. Peter Anvin, Thomas Gleixner, Jesse Barnes,
	Linux Kernel, rdreier, Suresh Siddha, linux-pci, Huang Ying,
	rientjes

On Tue, Dec 15 2009, Yinghai Lu wrote:
> Jens Axboe wrote:
> > On Tue, Dec 15 2009, Jens Axboe wrote:
> >>> oh, i post one patch last week, 
> >>>
> >>> can you check it?
> >> Sure, let me try it. I already found out that commit 8716273c is the
> >> guilty one (x86: Export srat physical topology).
> > 
> > Confirmed, -git with that patch works as well. So that's all of them I
> > think, can we please get this expedited in so that -rc1 will work?
> > Thanks!
> 
> updated version:
> 
> [PATCH] x86: fix checking of SRAT when node0 ram is not from 0 -v3

Verified, this one works fine, too.

-- 
Jens Axboe



Thread overview: 42+ messages
2009-12-15 11:50 kexec boot regression Jens Axboe
2009-12-15 12:01 ` Yinghai Lu
2009-12-15 12:14   ` Jens Axboe
2009-12-15 12:31     ` Yinghai Lu
2009-12-15 12:39       ` Jens Axboe
2009-12-15 12:55         ` Yinghai Lu
2009-12-15 14:11           ` Jens Axboe
2009-12-15 18:39             ` Yinghai Lu
2009-12-15 18:47               ` Matthew Wilcox
2009-12-15 18:54               ` Jens Axboe
2009-12-15 18:59               ` Jens Axboe
2009-12-15 19:04                 ` Yinghai Lu
2009-12-15 19:11                   ` Jens Axboe
2009-12-15 19:17                     ` Yinghai Lu
2009-12-15 19:22                       ` Jens Axboe
2009-12-15 19:28                         ` Jens Axboe
2009-12-15 19:44                     ` Yinghai Lu
2009-12-15 19:48                       ` Jens Axboe
2009-12-15 19:49                         ` Yinghai Lu
2009-12-15 19:57                           ` Jens Axboe
2009-12-15 21:30                   ` Markus Trippelsdorf
2009-12-15 23:02                     ` kexec boot regression radeon/kms (bisected) Markus Trippelsdorf
2009-12-15 19:43               ` kexec boot regression Jens Axboe
2009-12-15 19:48                 ` Yinghai Lu
2009-12-15 19:51                   ` Jens Axboe
2009-12-15 19:56                     ` Yinghai Lu
2009-12-15 20:09                       ` Jens Axboe
2009-12-15 20:14                     ` Yinghai Lu
2009-12-15 20:19                       ` Jens Axboe
2009-12-15 20:21                         ` Yinghai Lu
2009-12-15 20:42                           ` Jens Axboe
2009-12-15 20:55                             ` Jens Axboe
2009-12-15 21:01                               ` Jens Axboe
2009-12-15 21:26                                 ` Yinghai Lu
2009-12-15 21:30                                   ` Jens Axboe
2009-12-15 21:40                                     ` Jens Axboe
2009-12-15 21:43                                       ` Yinghai Lu
2009-12-15 21:47                                         ` Jens Axboe
2009-12-15 21:50                                           ` Yinghai Lu
2009-12-15 21:52                                           ` Jens Axboe
2009-12-15 22:24                                             ` Yinghai Lu
2009-12-16 10:01                                               ` Jens Axboe
