linux-kernel.vger.kernel.org archive mirror
* kexec/kdump kernel fails to start
@ 2012-09-04 17:32 Flavio Leitner
  2012-09-04 19:02 ` Yinghai Lu
  2012-09-05 15:34 ` Cong Wang
  0 siblings, 2 replies; 21+ messages in thread
From: Flavio Leitner @ 2012-09-04 17:32 UTC (permalink / raw)
  To: lkml
  Cc: Ingo Molnar, WANG Cong, Yinghai Lu, Tejun Heo, ianfang.cn, Andrew Morton

[-- Attachment #1: Type: text/plain, Size: 2397 bytes --]

Hi folks,

I have a system that no longer boots the kdump kernel. Basically,

# echo c > /proc/sysrq-trigger

to dump a vmcore doesn't work. It just hangs after showing the usual
panic messages. I've bisected the problem and the commit introducing
the issue is the one below.

Any idea?

commit 722bc6b16771ed80871e1fd81c86d3627dda2ac8
Author: WANG Cong <xiyou.wangcong@gmail.com>  2012-03-05 20:05:13
Committer: Ingo Molnar <mingo@elte.hu>  2012-03-06 05:38:26
Parent: 550cf00dbc8ee402bef71628cb71246493dd4500 (Merge tag 'mmc-fixes-for-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc)
Child:  a6fca40f1d7f3e232c9de27c1cebbb9f787fbc4f (x86, tlb: Switch cr3 in leave_mm() only when needed)
Branches: master, remotes/origin/master
Follows: v3.3-rc6
Precedes: v3.5-rc1

    x86/mm: Fix the size calculation of mapping tables
    
    For machines that enable PSE, the first 2/4M memory region still uses
    4K pages, so needs more PTEs in this case, but
    find_early_table_space() doesn't count this.
    
    This patch fixes it.
    
    The bug was found via code review, no misbehavior of the kernel
    was observed.


Machine details:
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 26
model name      : Intel(R) Core(TM) i7 CPU         920  @ 2.67GHz
stepping        : 5
microcode       : 0x11
cpu MHz         : 1596.000
cache size      : 8192 KB
physical id     : 0
siblings        : 8
core id         : 0
cpu cores       : 4
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm sse4_1 sse4_2 popcnt lahf_lm ida dts tpr_shadow vnmi flexpriority ept vpid
bogomips        : 5333.87
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

# free
             total       used       free     shared    buffers     cached
Mem:      16161684   11749100    4412584          0      10212   11421096
-/+ buffers/cache:     317792   15843892
Swap:     17406420          0   17406420


dmesg is attached.

thanks,
fbl



[-- Attachment #2: dmesg.log.gz --]
[-- Type: application/x-gzip, Size: 18917 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: kexec/kdump kernel fails to start
  2012-09-04 17:32 kexec/kdump kernel fails to start Flavio Leitner
@ 2012-09-04 19:02 ` Yinghai Lu
  2012-09-04 19:17   ` Flavio Leitner
  2012-09-05 15:34 ` Cong Wang
  1 sibling, 1 reply; 21+ messages in thread
From: Yinghai Lu @ 2012-09-04 19:02 UTC (permalink / raw)
  To: Flavio Leitner
  Cc: lkml, Ingo Molnar, WANG Cong, Tejun Heo, ianfang.cn, Andrew Morton

On Tue, Sep 4, 2012 at 10:32 AM, Flavio Leitner <fbl@redhat.com> wrote:
> Hi folks,
>
> I have system that no longer boots kdump kernel. Basically,
>
> # echo c > /proc/sysrq-trigger
>
> to dump a vmcore doesn't work. It just hangs after showing the usual
> panic messages. I've bisected the problem and the commit introducing
> the issue is the one below.
>
> Any idea?
>
> commit 722bc6b16771ed80871e1fd81c86d3627dda2ac8
> Author: WANG Cong <xiyou.wangcong@gmail.com>  2012-03-05 20:05:13
> Committer: Ingo Molnar <mingo@elte.hu>  2012-03-06 05:38:26
> Parent: 550cf00dbc8ee402bef71628cb71246493dd4500 (Merge tag 'mmc-fixes-for-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc)
> Child:  a6fca40f1d7f3e232c9de27c1cebbb9f787fbc4f (x86, tlb: Switch cr3 in leave_mm() only when needed)
> Branches: master, remotes/origin/master
> Follows: v3.3-rc6
> Precedes: v3.5-rc1
>
>     x86/mm: Fix the size calculation of mapping tables
>
>     For machines that enable PSE, the first 2/4M memory region still uses
>     4K pages, so needs more PTEs in this case, but
>     find_early_table_space() doesn't count this.
>
>     This patch fixes it.
>
>     The bug was found via code review, no misbehavior of the kernel
>     was observed.

maybe just revert the offending commit?

Thanks

Yinghai


* Re: kexec/kdump kernel fails to start
  2012-09-04 19:02 ` Yinghai Lu
@ 2012-09-04 19:17   ` Flavio Leitner
  2012-09-04 19:20     ` Yinghai Lu
  0 siblings, 1 reply; 21+ messages in thread
From: Flavio Leitner @ 2012-09-04 19:17 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: lkml, Ingo Molnar, WANG Cong, Tejun Heo, ianfang.cn, Andrew Morton

On Tue, 4 Sep 2012 12:02:00 -0700
Yinghai Lu <yinghai@kernel.org> wrote:

> On Tue, Sep 4, 2012 at 10:32 AM, Flavio Leitner <fbl@redhat.com> wrote:
> > Hi folks,
> >
> > I have system that no longer boots kdump kernel. Basically,
> >
> > # echo c > /proc/sysrq-trigger
> >
> > to dump a vmcore doesn't work. It just hangs after showing the usual
> > panic messages. I've bisected the problem and the commit introducing
> > the issue is the one below.
> >
> > Any idea?
> >
> > commit 722bc6b16771ed80871e1fd81c86d3627dda2ac8
> > Author: WANG Cong <xiyou.wangcong@gmail.com>  2012-03-05 20:05:13
> > Committer: Ingo Molnar <mingo@elte.hu>  2012-03-06 05:38:26
> > Parent: 550cf00dbc8ee402bef71628cb71246493dd4500 (Merge tag 'mmc-fixes-for-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc)
> > Child:  a6fca40f1d7f3e232c9de27c1cebbb9f787fbc4f (x86, tlb: Switch cr3 in leave_mm() only when needed)
> > Branches: master, remotes/origin/master
> > Follows: v3.3-rc6
> > Precedes: v3.5-rc1
> >
> >     x86/mm: Fix the size calculation of mapping tables
> >
> >     For machines that enable PSE, the first 2/4M memory region still uses
> >     4K pages, so needs more PTEs in this case, but
> >     find_early_table_space() doesn't count this.
> >
> >     This patch fixes it.
> >
> >     The bug was found via code review, no misbehavior of the kernel
> >     was observed.
> 
> maybe just revert the offending commit?

I don't know where the 4K pages were noticed. Here is the
dmesg output passing 'debug':

[    0.000000] x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
[    0.000000] last_pfn = 0xbf800 max_arch_pfn = 0x400000000
[    0.000000] initial memory mapped : 0 - 20000000
[    0.000000] Base memory trampoline at [ffff880000098000] 98000 size 20480
[    0.000000] init_memory_mapping: 0000000000000000-00000000bf800000
[    0.000000]  0000000000 - 00bf800000 page 2M
[    0.000000] kernel direct mapping tables up to bf800000 @ 1fa00000-20000000
[    0.000000] init_memory_mapping: 0000000100000000-0000000440000000
[    0.000000]  0100000000 - 0440000000 page 2M
[    0.000000] kernel direct mapping tables up to 440000000 @ bdaab000-bf4bd000
[    0.000000] RAMDISK: 352c8000 - 3695c000

so, it appears that on my system, the pages are 2M.
I will try moving the extra accounting to be inside of CONFIG_X86_32.
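
For reference, a minimal sketch of the arithmetic behind those log lines. It compares the size of the two "kernel direct mapping tables" windows with what a pure 2M mapping of the first range would need in PMD entries. The helper names are made up for illustration, and the 8-byte entry size and the 2M-aligned range are assumptions for x86-64.

```sh
#!/bin/sh
# Size in bytes of a half-open physical range [start, end) given in hex.
range_bytes() {
    echo $(( $2 - $1 ))
}

# Bytes of PMD entries needed to map [start, end) with 2M pages only,
# assuming 8-byte entries (x86-64) and a 2M-aligned range.
pmd_bytes_2m() {
    echo $(( (($2 - $1) >> 21) * 8 ))
}

range_bytes 0x1fa00000 0x20000000   # table window for the first range
range_bytes 0xbdaab000 0xbf4bd000   # table window for the second range
pmd_bytes_2m 0x0 0xbf800000         # PMD entries for 0 - bf800000 at 2M
```

The first window comes out at 6 MiB, while 2M-only PMD entries for 0 - bf800000 would take about 12 KiB, so the reservation looks much larger than a 2M mapping alone would need.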

fbl


* Re: kexec/kdump kernel fails to start
  2012-09-04 19:17   ` Flavio Leitner
@ 2012-09-04 19:20     ` Yinghai Lu
  2012-09-04 20:00       ` Flavio Leitner
  2012-09-04 20:26       ` Flavio Leitner
  0 siblings, 2 replies; 21+ messages in thread
From: Yinghai Lu @ 2012-09-04 19:20 UTC (permalink / raw)
  To: Flavio Leitner
  Cc: lkml, Ingo Molnar, WANG Cong, Tejun Heo, ianfang.cn, Andrew Morton

On Tue, Sep 4, 2012 at 12:17 PM, Flavio Leitner <fbl@redhat.com> wrote:
> On Tue, 4 Sep 2012 12:02:00 -0700
> [    0.000000] x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
> [    0.000000] last_pfn = 0xbf800 max_arch_pfn = 0x400000000
> [    0.000000] initial memory mapped : 0 - 20000000
> [    0.000000] Base memory trampoline at [ffff880000098000] 98000 size 20480
> [    0.000000] init_memory_mapping: 0000000000000000-00000000bf800000
> [    0.000000]  0000000000 - 00bf800000 page 2M
> [    0.000000] kernel direct mapping tables up to bf800000 @ 1fa00000-20000000
> [    0.000000] init_memory_mapping: 0000000100000000-0000000440000000
> [    0.000000]  0100000000 - 0440000000 page 2M
> [    0.000000] kernel direct mapping tables up to 440000000 @ bdaab000-bf4bd000
> [    0.000000] RAMDISK: 352c8000 - 3695c000
>
BTW, can you please try our new init_memory_mapping cleanup at

	git://git.kernel.org/pub/scm/linux/kernel/git/yinghai/linux-yinghai.git for-x86-mm

I hope it gets your kdump working.

Thanks

Yinghai


* Re: kexec/kdump kernel fails to start
  2012-09-04 19:20     ` Yinghai Lu
@ 2012-09-04 20:00       ` Flavio Leitner
  2012-09-04 20:26       ` Flavio Leitner
  1 sibling, 0 replies; 21+ messages in thread
From: Flavio Leitner @ 2012-09-04 20:00 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: lkml, Ingo Molnar, WANG Cong, Tejun Heo, ianfang.cn, Andrew Morton

On Tue, 4 Sep 2012 12:20:14 -0700
Yinghai Lu <yinghai@kernel.org> wrote:

> On Tue, Sep 4, 2012 at 12:17 PM, Flavio Leitner <fbl@redhat.com> wrote:
> > On Tue, 4 Sep 2012 12:02:00 -0700
> > [    0.000000] x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
> > [    0.000000] last_pfn = 0xbf800 max_arch_pfn = 0x400000000
> > [    0.000000] initial memory mapped : 0 - 20000000
> > [    0.000000] Base memory trampoline at [ffff880000098000] 98000 size 20480
> > [    0.000000] init_memory_mapping: 0000000000000000-00000000bf800000
> > [    0.000000]  0000000000 - 00bf800000 page 2M
> > [    0.000000] kernel direct mapping tables up to bf800000 @ 1fa00000-20000000
> > [    0.000000] init_memory_mapping: 0000000100000000-0000000440000000
> > [    0.000000]  0100000000 - 0440000000 page 2M
> > [    0.000000] kernel direct mapping tables up to 440000000 @ bdaab000-bf4bd000
> > [    0.000000] RAMDISK: 352c8000 - 3695c000
> >

Alright, moving the extra accounting inside CONFIG_X86_32 works:

diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index e0e6990..63e6a5c 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -60,10 +60,10 @@ static void __init find_early_table_space(struct map_range *mr, unsigned long en
 		extra = end - ((end>>PMD_SHIFT) << PMD_SHIFT);
 #ifdef CONFIG_X86_32
 		extra += PMD_SIZE;
-#endif
 		/* The first 2/4M doesn't use large pages. */
 		if (mr->start < PMD_SIZE)
 			extra += mr->end - mr->start;
+#endif
 
 		ptes = (extra + PAGE_SIZE - 1) >> PAGE_SHIFT;
 	} else
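
A rough userspace model of the accounting in the hunk above, as a sketch only: it is not the kernel function. It assumes x86-64 values (4K pages, 2M PMD size, 8-byte PTEs), assumes mr->end equals end, and uses a made-up helper name. The third argument switches the first-2M term on or off.

```sh
#!/bin/sh
PAGE_SHIFT=12   # 4K pages
PMD_SHIFT=21    # 2M large pages on x86-64
PTE_BYTES=8     # size of one page-table entry on x86-64

# Bytes of PTEs reserved for one range [start, end).
# $3 = 1 includes the "first 2/4M" term, $3 = 0 leaves it out.
extra_pte_bytes() {
    start=$(( $1 )); end=$(( $2 )); with_first=$3
    pmd_size=$(( 1 << PMD_SHIFT ))
    page_size=$(( 1 << PAGE_SHIFT ))
    # The tail of the range that is not 2M-aligned still needs 4K pages.
    extra=$(( end - ((end >> PMD_SHIFT) << PMD_SHIFT) ))
    # As written in the hunk, this adds the length of the whole range.
    if [ "$with_first" -eq 1 ] && [ "$start" -lt "$pmd_size" ]; then
        extra=$(( extra + end - start ))
    fi
    ptes=$(( (extra + page_size - 1) >> PAGE_SHIFT ))
    echo $(( ptes * PTE_BYTES ))
}

extra_pte_bytes 0x0 0xbf800000 1   # with the first-2M term
extra_pte_bytes 0x0 0xbf800000 0   # without it
```

For the range 0 - bf800000 the model reserves 6275072 bytes of PTEs with the term and 0 without it. That is just under 6 MiB, close to the size of the 1fa00000-20000000 table window in the logs, which may be where the extra space comes from.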

> BTW, can you please try our new init_memory_mapping clean up at
> 
> 	git://git.kernel.org/pub/scm/linux/kernel/git/yinghai/linux-yinghai.git
> for-x86-mm
> 
> hope it could make your kdump working.

I will give it a try.
fbl



* Re: kexec/kdump kernel fails to start
  2012-09-04 19:20     ` Yinghai Lu
  2012-09-04 20:00       ` Flavio Leitner
@ 2012-09-04 20:26       ` Flavio Leitner
  2012-09-04 20:45         ` Yinghai Lu
  1 sibling, 1 reply; 21+ messages in thread
From: Flavio Leitner @ 2012-09-04 20:26 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: lkml, Ingo Molnar, WANG Cong, Tejun Heo, ianfang.cn, Andrew Morton

On Tue, 4 Sep 2012 12:20:14 -0700
Yinghai Lu <yinghai@kernel.org> wrote:

> On Tue, Sep 4, 2012 at 12:17 PM, Flavio Leitner <fbl@redhat.com> wrote:
> > On Tue, 4 Sep 2012 12:02:00 -0700
> > [    0.000000] x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
> > [    0.000000] last_pfn = 0xbf800 max_arch_pfn = 0x400000000
> > [    0.000000] initial memory mapped : 0 - 20000000
> > [    0.000000] Base memory trampoline at [ffff880000098000] 98000 size 20480
> > [    0.000000] init_memory_mapping: 0000000000000000-00000000bf800000
> > [    0.000000]  0000000000 - 00bf800000 page 2M
> > [    0.000000] kernel direct mapping tables up to bf800000 @ 1fa00000-20000000
> > [    0.000000] init_memory_mapping: 0000000100000000-0000000440000000
> > [    0.000000]  0100000000 - 0440000000 page 2M
> > [    0.000000] kernel direct mapping tables up to 440000000 @ bdaab000-bf4bd000
> > [    0.000000] RAMDISK: 352c8000 - 3695c000
> >
> BTW, can you please try our new init_memory_mapping clean up at
> 
> 	git://git.kernel.org/pub/scm/linux/kernel/git/yinghai/linux-yinghai.git
> for-x86-mm
> 
> hope it could make your kdump working.

Sorry, but it didn't work.
The same problem happened. 
fbl


* Re: kexec/kdump kernel fails to start
  2012-09-04 20:26       ` Flavio Leitner
@ 2012-09-04 20:45         ` Yinghai Lu
  2012-09-04 21:37           ` Flavio Leitner
  0 siblings, 1 reply; 21+ messages in thread
From: Yinghai Lu @ 2012-09-04 20:45 UTC (permalink / raw)
  To: Flavio Leitner
  Cc: lkml, Ingo Molnar, WANG Cong, Tejun Heo, ianfang.cn, Andrew Morton

On Tue, Sep 4, 2012 at 1:26 PM, Flavio Leitner <fbl@redhat.com> wrote:
>
> Sorry, but it didn't work.
> The same problem happened.

Can you send the boot log?

Yinghai


* Re: kexec/kdump kernel fails to start
  2012-09-04 20:45         ` Yinghai Lu
@ 2012-09-04 21:37           ` Flavio Leitner
  2012-09-04 22:25             ` Yinghai Lu
  0 siblings, 1 reply; 21+ messages in thread
From: Flavio Leitner @ 2012-09-04 21:37 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: lkml, Ingo Molnar, WANG Cong, Tejun Heo, ianfang.cn, Andrew Morton

On Tue, 4 Sep 2012 13:45:23 -0700
Yinghai Lu <yinghai@kernel.org> wrote:

> On Tue, Sep 4, 2012 at 1:26 PM, Flavio Leitner <fbl@redhat.com> wrote:
> >
> > Sorry, but it didn't work.
> > The same problem happened.
> 
> can you send out boot log ?

sure, there you go:
http://sysclose.org/kdump/dmesg-debug.log
http://sysclose.org/kdump/config.log

fbl


* Re: kexec/kdump kernel fails to start
  2012-09-04 21:37           ` Flavio Leitner
@ 2012-09-04 22:25             ` Yinghai Lu
  2012-09-04 22:40               ` Flavio Leitner
  2012-09-05  0:01               ` Flavio Leitner
  0 siblings, 2 replies; 21+ messages in thread
From: Yinghai Lu @ 2012-09-04 22:25 UTC (permalink / raw)
  To: Flavio Leitner
  Cc: lkml, Ingo Molnar, WANG Cong, Tejun Heo, ianfang.cn, Andrew Morton

On Tue, Sep 4, 2012 at 2:37 PM, Flavio Leitner <fbl@redhat.com> wrote:
> On Tue, 4 Sep 2012 13:45:23 -0700
> Yinghai Lu <yinghai@kernel.org> wrote:
>
>> On Tue, Sep 4, 2012 at 1:26 PM, Flavio Leitner <fbl@redhat.com> wrote:
>> >
>> > Sorry, but it didn't work.
>> > The same problem happened.
>>
>> can you send out boot log ?
>
> sure, there you go:
> http://sysclose.org/kdump/dmesg-debug.log
> http://sysclose.org/kdump/config.log

Looks like you did not use the for-x86-mm branch.

[    0.000000] initial memory mapped: [mem 0x00000000-0x1fffffff]
[    0.000000] Base memory trampoline at [ffff880000097000] 97000 size 24576
[    0.000000] init_memory_mapping: [mem 0x00000000-0xbf7fffff]
[    0.000000]  [mem 0x00000000-0xbf7fffff] page 2M
[    0.000000] kernel direct mapping tables up to 0xbf7fffff @ [mem 0x1fa00000-0x1fffffff]
[    0.000000] init_memory_mapping: [mem 0x100000000-0x43fffffff]
[    0.000000]  [mem 0x100000000-0x43fffffff] page 2M
[    0.000000] kernel direct mapping tables up to 0x43fffffff @ [mem 0xbf7ce000-0xbf7dffff]
[    0.000000] RAMDISK: [mem 0x351d2000-0x368e0fff]
[    0.000000] Reserving 256MB of memory at 592MB for crashkernel (System RAM: 16372MB)

please try:

mkdir linux || exit 1
cd linux

git init

# Add Linus's tree as a remote
git remote add linus git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git

# Add the -tip tree as a remote
git remote add tip git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git

# Add yinghai's tree
git remote add yinghai git://git.kernel.org/pub/scm/linux/kernel/git/yinghai/linux-yinghai.git

# Fetch all three remotes
git remote update

# Create a local branch tracking yinghai's for-x86-mm branch
git checkout -b yinghai-for-x86-mm yinghai/for-x86-mm


* Re: kexec/kdump kernel fails to start
  2012-09-04 22:25             ` Yinghai Lu
@ 2012-09-04 22:40               ` Flavio Leitner
  2012-09-05  0:01               ` Flavio Leitner
  1 sibling, 0 replies; 21+ messages in thread
From: Flavio Leitner @ 2012-09-04 22:40 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: lkml, Ingo Molnar, WANG Cong, Tejun Heo, ianfang.cn, Andrew Morton

On Tue, 4 Sep 2012 15:25:45 -0700
Yinghai Lu <yinghai@kernel.org> wrote:

> On Tue, Sep 4, 2012 at 2:37 PM, Flavio Leitner <fbl@redhat.com> wrote:
> > On Tue, 4 Sep 2012 13:45:23 -0700
> > Yinghai Lu <yinghai@kernel.org> wrote:
> >
> >> On Tue, Sep 4, 2012 at 1:26 PM, Flavio Leitner <fbl@redhat.com> wrote:
> >> >
> >> > Sorry, but it didn't work.
> >> > The same problem happened.
> >>
> >> can you send out boot log ?
> >
> > sure, there you go:
> > http://sysclose.org/kdump/dmesg-debug.log
> > http://sysclose.org/kdump/config.log
> 
> looks like you did not use for-x86-mm branch.

No, I didn't. Sorry about that.
I will test and report back.
fbl

> 
> [    0.000000] initial memory mapped: [mem 0x00000000-0x1fffffff]
> [    0.000000] Base memory trampoline at [ffff880000097000] 97000 size 24576
> [    0.000000] init_memory_mapping: [mem 0x00000000-0xbf7fffff]
> [    0.000000]  [mem 0x00000000-0xbf7fffff] page 2M
> [    0.000000] kernel direct mapping tables up to 0xbf7fffff @ [mem
> 0x1fa00000-0x1fffffff]
> [    0.000000] init_memory_mapping: [mem 0x100000000-0x43fffffff]
> [    0.000000]  [mem 0x100000000-0x43fffffff] page 2M
> [    0.000000] kernel direct mapping tables up to 0x43fffffff @ [mem
> 0xbf7ce000-0xbf7dffff]
> [    0.000000] RAMDISK: [mem 0x351d2000-0x368e0fff]
> [    0.000000] Reserving 256MB of memory at 592MB for crashkernel
> (System RAM: 16372MB)
> 
> please try:
> 
> mkdir linux || exit -1
> cd linux
> 
> git init-db
> 
> # Add Linus's tree as a remote
> git remote add linus
> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
> 
> # Add the -tip tree as a remote
> git remote add tip git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git
> 
> # Add yinghai's tree
> 
> git remote add yinghai
> git://git.kernel.org/pub/scm/linux/kernel/git/yinghai/linux-yinghai.git
> 
> git remote update
> 
> git checkout -b yinghai-for-x86-mm yinghai/for-x86-mm



* Re: kexec/kdump kernel fails to start
  2012-09-04 22:25             ` Yinghai Lu
  2012-09-04 22:40               ` Flavio Leitner
@ 2012-09-05  0:01               ` Flavio Leitner
  2012-09-05  1:15                 ` Yinghai Lu
  1 sibling, 1 reply; 21+ messages in thread
From: Flavio Leitner @ 2012-09-05  0:01 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: lkml, Ingo Molnar, WANG Cong, Tejun Heo, ianfang.cn, Andrew Morton

On Tue, 4 Sep 2012 15:25:45 -0700
Yinghai Lu <yinghai@kernel.org> wrote:

> On Tue, Sep 4, 2012 at 2:37 PM, Flavio Leitner <fbl@redhat.com> wrote:
> > On Tue, 4 Sep 2012 13:45:23 -0700
> > Yinghai Lu <yinghai@kernel.org> wrote:
> >
> >> On Tue, Sep 4, 2012 at 1:26 PM, Flavio Leitner <fbl@redhat.com> wrote:
> >> >
> >> > Sorry, but it didn't work.
> >> > The same problem happened.
> >>
> >> can you send out boot log ?
> >
> > sure, there you go:
> > http://sysclose.org/kdump/dmesg-debug.log
> > http://sysclose.org/kdump/config.log
> 
> looks like you did not use for-x86-mm branch.
> 
> [    0.000000] initial memory mapped: [mem 0x00000000-0x1fffffff]
> [    0.000000] Base memory trampoline at [ffff880000097000] 97000 size 24576
> [    0.000000] init_memory_mapping: [mem 0x00000000-0xbf7fffff]
> [    0.000000]  [mem 0x00000000-0xbf7fffff] page 2M
> [    0.000000] kernel direct mapping tables up to 0xbf7fffff @ [mem
> 0x1fa00000-0x1fffffff]
> [    0.000000] init_memory_mapping: [mem 0x100000000-0x43fffffff]
> [    0.000000]  [mem 0x100000000-0x43fffffff] page 2M
> [    0.000000] kernel direct mapping tables up to 0x43fffffff @ [mem
> 0xbf7ce000-0xbf7dffff]
> [    0.000000] RAMDISK: [mem 0x351d2000-0x368e0fff]
> [    0.000000] Reserving 256MB of memory at 592MB for crashkernel
> (System RAM: 16372MB)
> 
> please try:
> 
> mkdir linux || exit -1
> cd linux
> 
> git init-db
> 
> # Add Linus's tree as a remote
> git remote add linus
> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
> 
> # Add the -tip tree as a remote
> git remote add tip git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git
> 
> # Add yinghai's tree
> 
> git remote add yinghai
> git://git.kernel.org/pub/scm/linux/kernel/git/yinghai/linux-yinghai.git
> 
> git remote update
> 
> git checkout -b yinghai-for-x86-mm yinghai/for-x86-mm

kdump works when using your branch:

[    0.000000] Linux version 3.6.0-rc4-00012-g9389673 (root@f17i7.rh) (gcc version 4.7.0 20120507 (Red Hat 4.7.0-5) (GCC) ) #1 SMP Tue Sep 4 20:36:43 BRT 2012
...
[    0.000000] initial memory mapped: [mem 0x00000000-0x1fffffff]
[    0.000000] Base memory trampoline at [ffff880000097000] 97000 size 24576
[    0.000000] calculate_table_space_size: [mem 0x00000000-0x000fffff]
[    0.000000]  [mem 0x00000000-0x000fffff] page 4k
[    0.000000] calculate_table_space_size: [mem 0x00100000-0xbf4bcfff]
[    0.000000]  [mem 0x00100000-0x001fffff] page 4k
[    0.000000]  [mem 0x00200000-0xbf3fffff] page 2M
[    0.000000]  [mem 0xbf400000-0xbf4bcfff] page 4k
[    0.000000] calculate_table_space_size: [mem 0xbf4bf000-0xbf4c5fff]
[    0.000000]  [mem 0xbf4bf000-0xbf4c5fff] page 4k
[    0.000000] calculate_table_space_size: [mem 0xbf7bf000-0xbf7dffff]
[    0.000000]  [mem 0xbf7bf000-0xbf7dffff] page 4k
[    0.000000] calculate_table_space_size: [mem 0xbf7ff000-0xbf7fffff]
[    0.000000]  [mem 0xbf7ff000-0xbf7fffff] page 4k
[    0.000000] calculate_table_space_size: [mem 0x100000000-0x43fffffff]
[    0.000000]  [mem 0x100000000-0x43fffffff] page 2M
[    0.000000] kernel direct mapping tables up to 0x43fffffff @ [mem 0x43ffe1000-0x43fffffff] prealloc
[    0.000000] init_memory_mapping: [mem 0x00000000-0x000fffff]
[    0.000000]  [mem 0x00000000-0x000fffff] page 4k
[    0.000000] init_memory_mapping: [mem 0x00100000-0xbf4bcfff]
[    0.000000]  [mem 0x00100000-0x001fffff] page 4k
[    0.000000]  [mem 0x00200000-0xbf3fffff] page 2M
[    0.000000]  [mem 0xbf400000-0xbf4bcfff] page 4k
[    0.000000] init_memory_mapping: [mem 0xbf4bf000-0xbf4c5fff]
[    0.000000]  [mem 0xbf4bf000-0xbf4c5fff] page 4k
[    0.000000] init_memory_mapping: [mem 0xbf7bf000-0xbf7dffff]
[    0.000000]  [mem 0xbf7bf000-0xbf7dffff] page 4k
[    0.000000] init_memory_mapping: [mem 0xbf7ff000-0xbf7fffff]
[    0.000000]  [mem 0xbf7ff000-0xbf7fffff] page 4k
[    0.000000] init_memory_mapping: [mem 0x100000000-0x43fffffff]
[    0.000000]  [mem 0x100000000-0x43fffffff] page 2M
[    0.000000] kernel direct mapping tables up to 0x43fffffff @ [mem 0x43ffe1000-0x43fff2fff] final
[    0.000000] RAMDISK: [mem 0x34ffa000-0x367f4fff]
...
http://sysclose.org/kdump/dmesg.log
http://sysclose.org/kdump/config.log

thanks,
fbl


* Re: kexec/kdump kernel fails to start
  2012-09-05  0:01               ` Flavio Leitner
@ 2012-09-05  1:15                 ` Yinghai Lu
  2012-09-05 13:46                   ` Flavio Leitner
  0 siblings, 1 reply; 21+ messages in thread
From: Yinghai Lu @ 2012-09-05  1:15 UTC (permalink / raw)
  To: Flavio Leitner
  Cc: lkml, Ingo Molnar, WANG Cong, Tejun Heo, ianfang.cn, Andrew Morton

On Tue, Sep 4, 2012 at 5:01 PM, Flavio Leitner <fbl@redhat.com> wrote:
> On Tue, 4 Sep 2012 15:25:45 -0700
> kdump works when using your branch:
>
> [    0.000000] Linux version 3.6.0-rc4-00012-g9389673 (root@f17i7.rh) (gcc version 4.7.0 20120507 (Red Hat 4.7.0-5) (GCC) ) #1 SMP Tue Sep 4 20:36:43 BRT 2012
> ...
> [    0.000000] initial memory mapped: [mem 0x00000000-0x1fffffff]
> [    0.000000] Base memory trampoline at [ffff880000097000] 97000 size 24576
> [    0.000000] calculate_table_space_size: [mem 0x00000000-0x000fffff]
> [    0.000000]  [mem 0x00000000-0x000fffff] page 4k
> [    0.000000] calculate_table_space_size: [mem 0x00100000-0xbf4bcfff]
> [    0.000000]  [mem 0x00100000-0x001fffff] page 4k
> [    0.000000]  [mem 0x00200000-0xbf3fffff] page 2M
> [    0.000000]  [mem 0xbf400000-0xbf4bcfff] page 4k
> [    0.000000] calculate_table_space_size: [mem 0xbf4bf000-0xbf4c5fff]
> [    0.000000]  [mem 0xbf4bf000-0xbf4c5fff] page 4k
> [    0.000000] calculate_table_space_size: [mem 0xbf7bf000-0xbf7dffff]
> [    0.000000]  [mem 0xbf7bf000-0xbf7dffff] page 4k
> [    0.000000] calculate_table_space_size: [mem 0xbf7ff000-0xbf7fffff]
> [    0.000000]  [mem 0xbf7ff000-0xbf7fffff] page 4k
> [    0.000000] calculate_table_space_size: [mem 0x100000000-0x43fffffff]
> [    0.000000]  [mem 0x100000000-0x43fffffff] page 2M
> [    0.000000] kernel direct mapping tables up to 0x43fffffff @ [mem 0x43ffe1000-0x43fffffff] prealloc
> [    0.000000] init_memory_mapping: [mem 0x00000000-0x000fffff]
> [    0.000000]  [mem 0x00000000-0x000fffff] page 4k
> [    0.000000] init_memory_mapping: [mem 0x00100000-0xbf4bcfff]
> [    0.000000]  [mem 0x00100000-0x001fffff] page 4k
> [    0.000000]  [mem 0x00200000-0xbf3fffff] page 2M
> [    0.000000]  [mem 0xbf400000-0xbf4bcfff] page 4k
> [    0.000000] init_memory_mapping: [mem 0xbf4bf000-0xbf4c5fff]
> [    0.000000]  [mem 0xbf4bf000-0xbf4c5fff] page 4k
> [    0.000000] init_memory_mapping: [mem 0xbf7bf000-0xbf7dffff]
> [    0.000000]  [mem 0xbf7bf000-0xbf7dffff] page 4k
> [    0.000000] init_memory_mapping: [mem 0xbf7ff000-0xbf7fffff]
> [    0.000000]  [mem 0xbf7ff000-0xbf7fffff] page 4k
> [    0.000000] init_memory_mapping: [mem 0x100000000-0x43fffffff]
> [    0.000000]  [mem 0x100000000-0x43fffffff] page 2M
> [    0.000000] kernel direct mapping tables up to 0x43fffffff @ [mem 0x43ffe1000-0x43fff2fff] final
> [    0.000000] RAMDISK: [mem 0x34ffa000-0x367f4fff]

thanks.

I assume that with the good_end setting for 64 bit, the page table for
[4G, TOMH) ends up just under 512M, and that the later first-2M change
pushes that page table range a little lower, which makes kdump unhappy.

BTW, the first-2M change commit is useless and should be reverted: even
if that region is mapped with a 2M page at first, the kernel later
changes it to 4K pages.

And with the other changes in this patchset, init_memory_mapping(0,
ISA_END_ADDR) will always make sure the first 2M uses 4K pages.

Thanks

Yinghai


* Re: kexec/kdump kernel fails to start
  2012-09-05  1:15                 ` Yinghai Lu
@ 2012-09-05 13:46                   ` Flavio Leitner
  0 siblings, 0 replies; 21+ messages in thread
From: Flavio Leitner @ 2012-09-05 13:46 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: lkml, Ingo Molnar, WANG Cong, Tejun Heo, ianfang.cn, Andrew Morton

On Tue, 4 Sep 2012 18:15:25 -0700
Yinghai Lu <yinghai@kernel.org> wrote:
> assume when we have good_end setting for 64 bit, page table for [4g,
> TOMH) will be just under 512M, and later when first
> first 2M lines changes, will push that page table range a little low,
> and will make kdump not happy.
> 
> BTW the first 2M change commit is useless should be reverted. because
> even it is in 2M page mapping at first, later
> kernel will change to 4k page.
> 
> and with other change in this patchset, init_memory_mapping(0,
> ISA_END_ADDR) will always make sure first 2M use 4K page.

Hm, it's not clear to me. Are you going to push the patch reverting
that commit and then your patchset?

thank you!
fbl


* Re: kexec/kdump kernel fails to start
  2012-09-04 17:32 kexec/kdump kernel fails to start Flavio Leitner
  2012-09-04 19:02 ` Yinghai Lu
@ 2012-09-05 15:34 ` Cong Wang
  2012-09-23 20:27   ` Dan Carpenter
  1 sibling, 1 reply; 21+ messages in thread
From: Cong Wang @ 2012-09-05 15:34 UTC (permalink / raw)
  To: Flavio Leitner
  Cc: lkml, Ingo Molnar, Yinghai Lu, Tejun Heo, ianfang.cn, Andrew Morton

On Wed, Sep 5, 2012 at 1:32 AM, Flavio Leitner <fbl@redhat.com> wrote:
> Hi folks,
>
> I have system that no longer boots kdump kernel. Basically,
>
> # echo c > /proc/sysrq-trigger
>
> to dump a vmcore doesn't work. It just hangs after showing the usual
> panic messages. I've bisected the problem and the commit introducing
> the issue is the one below.
>
> Any idea?
>
> commit 722bc6b16771ed80871e1fd81c86d3627dda2ac8
> Author: WANG Cong <xiyou.wangcong@gmail.com>  2012-03-05 20:05:13
> Committer: Ingo Molnar <mingo@elte.hu>  2012-03-06 05:38:26
> Parent: 550cf00dbc8ee402bef71628cb71246493dd4500 (Merge tag 'mmc-fixes-for-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc)
> Child:  a6fca40f1d7f3e232c9de27c1cebbb9f787fbc4f (x86, tlb: Switch cr3 in leave_mm() only when needed)
> Branches: master, remotes/origin/master
> Follows: v3.3-rc6
> Precedes: v3.5-rc1
>
>     x86/mm: Fix the size calculation of mapping tables

There was an attempt to fix this:
https://patchwork.kernel.org/patch/1195751/

but for some reason it was not accepted.


* Re: kexec/kdump kernel fails to start
  2012-09-05 15:34 ` Cong Wang
@ 2012-09-23 20:27   ` Dan Carpenter
  2012-09-23 20:52     ` Yinghai Lu
  0 siblings, 1 reply; 21+ messages in thread
From: Dan Carpenter @ 2012-09-23 20:27 UTC (permalink / raw)
  To: Cong Wang
  Cc: Flavio Leitner, lkml, Ingo Molnar, Yinghai Lu, Tejun Heo,
	ianfang.cn, Andrew Morton

On Wed, Sep 05, 2012 at 11:34:25PM +0800, Cong Wang wrote:
> On Wed, Sep 5, 2012 at 1:32 AM, Flavio Leitner <fbl@redhat.com> wrote:
> > Hi folks,
> >
> > I have system that no longer boots kdump kernel. Basically,
> >
> > # echo c > /proc/sysrq-trigger
> >
> > to dump a vmcore doesn't work. It just hangs after showing the usual
> > panic messages. I've bisected the problem and the commit introducing
> > the issue is the one below.
> >
> > Any idea?
> >
> > commit 722bc6b16771ed80871e1fd81c86d3627dda2ac8
> > Author: WANG Cong <xiyou.wangcong@gmail.com>  2012-03-05 20:05:13
> > Committer: Ingo Molnar <mingo@elte.hu>  2012-03-06 05:38:26
> > Parent: 550cf00dbc8ee402bef71628cb71246493dd4500 (Merge tag 'mmc-fixes-for-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc)
> > Child:  a6fca40f1d7f3e232c9de27c1cebbb9f787fbc4f (x86, tlb: Switch cr3 in leave_mm() only when needed)
> > Branches: master, remotes/origin/master
> > Follows: v3.3-rc6
> > Precedes: v3.5-rc1
> >
> >     x86/mm: Fix the size calculation of mapping tables
> 
> There was some attempt to fix this:
> https://patchwork.kernel.org/patch/1195751/
> 
> but for some reason it is not accepted.

I filed a bug for this:
https://bugzilla.kernel.org/show_bug.cgi?id=47881

Is it fixed now?

regards,
dan carpenter



* Re: kexec/kdump kernel fails to start
  2012-09-23 20:27   ` Dan Carpenter
@ 2012-09-23 20:52     ` Yinghai Lu
  2012-09-29  7:13       ` Ingo Molnar
  0 siblings, 1 reply; 21+ messages in thread
From: Yinghai Lu @ 2012-09-23 20:52 UTC (permalink / raw)
  To: Dan Carpenter
  Cc: Cong Wang, Flavio Leitner, lkml, Ingo Molnar, Tejun Heo,
	ianfang.cn, Andrew Morton

On Sun, Sep 23, 2012 at 1:27 PM, Dan Carpenter <dan.carpenter@oracle.com> wrote:
> On Wed, Sep 05, 2012 at 11:34:25PM +0800, Cong Wang wrote:
>> On Wed, Sep 5, 2012 at 1:32 AM, Flavio Leitner <fbl@redhat.com> wrote:
>> > Hi folks,
>> >
>> > I have system that no longer boots kdump kernel. Basically,
>> >
>> > # echo c > /proc/sysrq-trigger
>> >
>> > to dump a vmcore doesn't work. It just hangs after showing the usual
>> > panic messages. I've bisected the problem and the commit introducing
>> > the issue is the one below.
>> >
>> > Any idea?
>> >
>> > commit 722bc6b16771ed80871e1fd81c86d3627dda2ac8
>> > Author: WANG Cong <xiyou.wangcong@gmail.com>  2012-03-05 20:05:13
>> > Committer: Ingo Molnar <mingo@elte.hu>  2012-03-06 05:38:26
>> > Parent: 550cf00dbc8ee402bef71628cb71246493dd4500 (Merge tag 'mmc-fixes-for-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc)
>> > Child:  a6fca40f1d7f3e232c9de27c1cebbb9f787fbc4f (x86, tlb: Switch cr3 in leave_mm() only when needed)
>> > Branches: master, remotes/origin/master
>> > Follows: v3.3-rc6
>> > Precedes: v3.5-rc1
>> >
>> >     x86/mm: Fix the size calculation of mapping tables
>>
>> There was some attempt to fix this:
>> https://patchwork.kernel.org/patch/1195751/
>>
>> but for some reason it is not accepted.
>
> I filed a bug for this:
> https://bugzilla.kernel.org/show_bug.cgi?id=47881
>
> Is it fixed now?

that offending patch should be reverted...

722bc6b16771ed80871e1fd81c86d3627dda2ac8

* Re: kexec/kdump kernel fails to start
  2012-09-23 20:52     ` Yinghai Lu
@ 2012-09-29  7:13       ` Ingo Molnar
  2012-10-18  2:16         ` Dave Young
  0 siblings, 1 reply; 21+ messages in thread
From: Ingo Molnar @ 2012-09-29  7:13 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Dan Carpenter, Cong Wang, Flavio Leitner, lkml, Ingo Molnar,
	Tejun Heo, ianfang.cn, Andrew Morton


* Yinghai Lu <yinghai@kernel.org> wrote:

> On Sun, Sep 23, 2012 at 1:27 PM, Dan Carpenter <dan.carpenter@oracle.com> wrote:
> > On Wed, Sep 05, 2012 at 11:34:25PM +0800, Cong Wang wrote:
> >> On Wed, Sep 5, 2012 at 1:32 AM, Flavio Leitner <fbl@redhat.com> wrote:
> >> > Hi folks,
> >> >
> >> > I have system that no longer boots kdump kernel. Basically,
> >> >
> >> > # echo c > /proc/sysrq-trigger
> >> >
> >> > to dump a vmcore doesn't work. It just hangs after showing the usual
> >> > panic messages. I've bisected the problem and the commit introducing
> >> > the issue is the one below.
> >> >
> >> > Any idea?
> >> >
> >> > commit 722bc6b16771ed80871e1fd81c86d3627dda2ac8
> >> > Author: WANG Cong <xiyou.wangcong@gmail.com>  2012-03-05 20:05:13
> >> > Committer: Ingo Molnar <mingo@elte.hu>  2012-03-06 05:38:26
> >> > Parent: 550cf00dbc8ee402bef71628cb71246493dd4500 (Merge tag 'mmc-fixes-for-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc)
> >> > Child:  a6fca40f1d7f3e232c9de27c1cebbb9f787fbc4f (x86, tlb: Switch cr3 in leave_mm() only when needed)
> >> > Branches: master, remotes/origin/master
> >> > Follows: v3.3-rc6
> >> > Precedes: v3.5-rc1
> >> >
> >> >     x86/mm: Fix the size calculation of mapping tables
> >>
> >> There was some attempt to fix this:
> >> https://patchwork.kernel.org/patch/1195751/
> >>
> >> but for some reason it is not accepted.
> >
> > I filed a bug for this:
> > https://bugzilla.kernel.org/show_bug.cgi?id=47881
> >
> > Is it fixed now?
> 
> that offending patch should be reverted...
> 
> 722bc6b16771ed80871e1fd81c86d3627dda2ac8

It does not revert cleanly - could someone send a (kexec 
tested!) patch with a proper description?

Thanks,

	Ingo

* Re: kexec/kdump kernel fails to start
  2012-09-29  7:13       ` Ingo Molnar
@ 2012-10-18  2:16         ` Dave Young
  2012-10-18  6:33           ` Dave Young
  0 siblings, 1 reply; 21+ messages in thread
From: Dave Young @ 2012-10-18  2:16 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Yinghai Lu, Dan Carpenter, Cong Wang, Flavio Leitner, lkml,
	Ingo Molnar, Tejun Heo, ianfang.cn, Andrew Morton, Vivek Goyal

On Sat, Sep 29, 2012 at 3:13 PM, Ingo Molnar <mingo@kernel.org> wrote:
>
> * Yinghai Lu <yinghai@kernel.org> wrote:
>
>> On Sun, Sep 23, 2012 at 1:27 PM, Dan Carpenter <dan.carpenter@oracle.com> wrote:
>> > On Wed, Sep 05, 2012 at 11:34:25PM +0800, Cong Wang wrote:
>> >> On Wed, Sep 5, 2012 at 1:32 AM, Flavio Leitner <fbl@redhat.com> wrote:
>> >> > Hi folks,
>> >> >
>> >> > I have system that no longer boots kdump kernel. Basically,
>> >> >
>> >> > # echo c > /proc/sysrq-trigger
>> >> >
>> >> > to dump a vmcore doesn't work. It just hangs after showing the usual
>> >> > panic messages. I've bisected the problem and the commit introducing
>> >> > the issue is the one below.
>> >> >
>> >> > Any idea?
>> >> >
>> >> > commit 722bc6b16771ed80871e1fd81c86d3627dda2ac8
>> >> > Author: WANG Cong <xiyou.wangcong@gmail.com>  2012-03-05 20:05:13
>> >> > Committer: Ingo Molnar <mingo@elte.hu>  2012-03-06 05:38:26
>> >> > Parent: 550cf00dbc8ee402bef71628cb71246493dd4500 (Merge tag 'mmc-fixes-for-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc)
>> >> > Child:  a6fca40f1d7f3e232c9de27c1cebbb9f787fbc4f (x86, tlb: Switch cr3 in leave_mm() only when needed)
>> >> > Branches: master, remotes/origin/master
>> >> > Follows: v3.3-rc6
>> >> > Precedes: v3.5-rc1
>> >> >
>> >> >     x86/mm: Fix the size calculation of mapping tables
>> >>
>> >> There was some attempt to fix this:
>> >> https://patchwork.kernel.org/patch/1195751/
>> >>
>> >> but for some reason it is not accepted.
>> >
>> > I filed a bug for this:
>> > https://bugzilla.kernel.org/show_bug.cgi?id=47881
>> >
>> > Is it fixed now?
>>
>> that offending patch should be reverted...
>>
>> 722bc6b16771ed80871e1fd81c86d3627dda2ac8
>
> It does not revert cleanly - could someone send a (kexec
> tested!) patch with a proper description?

Hi Ingo,

Besides commit 722bc6b16771ed80871e1fd81c86d3627dda2ac8,
the commit below also needs to be reverted.

commit bd2753b2dda7bb43c7468826de75f49c6a7e8965
Author: Yinghai Lu <yinghai@kernel.org>
Date:   Wed Jun 6 10:55:40 2012 -0700

    x86/mm: Only add extra pages count for the first memory range during pre-allocation early page table space

    Robin found this regression:

    | I just tried to boot an 8TB system.  It fails very early in boot with:
    | Kernel panic - not syncing: Cannot find space for the kernel page tables

    git bisect commit 722bc6b16771ed80871e1fd81c86d3627dda2ac8.

    A git revert of that commit does boot past that point on the 8TB
    configuration.

    That commit will add up extra pages for all memory range even
    above 4g.

    Try to limit that extra page count adding to first entry only.

    Bisected-by: Robin Holt <holt@sgi.com>
    Tested-by: Robin Holt <holt@sgi.com>
    Signed-off-by: Yinghai Lu <yinghai@kernel.org>
    Cc: WANG Cong <xiyou.wangcong@gmail.com>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Andrew Morton <akpm@linux-foundation.org>
    Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Link: http://lkml.kernel.org/r/CAE9FiQUj3wyzQxtq9yzBNc9u220p8JZ1FYHG7t%3DMOzJ%3D9BZMYA@mail.gmail.com
    Signed-off-by: Ingo Molnar <mingo@kernel.org>



OTOH, Jacob and Yinghai have better init_memory_mapping cleanup
patches which are already in tip:x86/mm2. Their patches fix this
issue as well.

Since kdump has not worked for a long time, ever since
722bc6b16771ed80871e1fd81c86d3627dda2ac8, can you or someone else
help to get the init_memory_mapping patches merged?

Or do you still prefer to revert 722bc6b and bd2753b2d? I think
the stable kernels also need a fix.

-- 
Regards
Dave

* Re: kexec/kdump kernel fails to start
  2012-10-18  2:16         ` Dave Young
@ 2012-10-18  6:33           ` Dave Young
  2012-10-18 13:57             ` Cong Wang
  2012-10-18 16:27             ` Flavio Leitner
  0 siblings, 2 replies; 21+ messages in thread
From: Dave Young @ 2012-10-18  6:33 UTC (permalink / raw)
  To: Dave Young
  Cc: Ingo Molnar, Yinghai Lu, Dan Carpenter, Cong Wang,
	Flavio Leitner, lkml, Ingo Molnar, Tejun Heo, ianfang.cn,
	Andrew Morton, Vivek Goyal

On 10/18/2012 10:16 AM, Dave Young wrote:

> On Sat, Sep 29, 2012 at 3:13 PM, Ingo Molnar <mingo@kernel.org> wrote:
>>
>> * Yinghai Lu <yinghai@kernel.org> wrote:
>>
>>> On Sun, Sep 23, 2012 at 1:27 PM, Dan Carpenter <dan.carpenter@oracle.com> wrote:
>>>> On Wed, Sep 05, 2012 at 11:34:25PM +0800, Cong Wang wrote:
>>>>> On Wed, Sep 5, 2012 at 1:32 AM, Flavio Leitner <fbl@redhat.com> wrote:
>>>>>> Hi folks,
>>>>>>
>>>>>> I have system that no longer boots kdump kernel. Basically,
>>>>>>
>>>>>> # echo c > /proc/sysrq-trigger
>>>>>>
>>>>>> to dump a vmcore doesn't work. It just hangs after showing the usual
>>>>>> panic messages. I've bisected the problem and the commit introducing
>>>>>> the issue is the one below.
>>>>>>
>>>>>> Any idea?
>>>>>>
>>>>>> commit 722bc6b16771ed80871e1fd81c86d3627dda2ac8
>>>>>> Author: WANG Cong <xiyou.wangcong@gmail.com>  2012-03-05 20:05:13
>>>>>> Committer: Ingo Molnar <mingo@elte.hu>  2012-03-06 05:38:26
>>>>>> Parent: 550cf00dbc8ee402bef71628cb71246493dd4500 (Merge tag 'mmc-fixes-for-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc)
>>>>>> Child:  a6fca40f1d7f3e232c9de27c1cebbb9f787fbc4f (x86, tlb: Switch cr3 in leave_mm() only when needed)
>>>>>> Branches: master, remotes/origin/master
>>>>>> Follows: v3.3-rc6
>>>>>> Precedes: v3.5-rc1
>>>>>>
>>>>>>     x86/mm: Fix the size calculation of mapping tables
>>>>>
>>>>> There was some attempt to fix this:
>>>>> https://patchwork.kernel.org/patch/1195751/
>>>>>
>>>>> but for some reason it is not accepted.
>>>>
>>>> I filed a bug for this:
>>>> https://bugzilla.kernel.org/show_bug.cgi?id=47881
>>>>
>>>> Is it fixed now?
>>>
>>> that offending patch should be reverted...
>>>
>>> 722bc6b16771ed80871e1fd81c86d3627dda2ac8
>>
>> It does not revert cleanly - could someone send a (kexec
>> tested!) patch with a proper description?
> 
> Hi Ingo,
> 
> Besides commit 722bc6b16771ed80871e1fd81c86d3627dda2ac8,
> the commit below also needs to be reverted.
> 
> commit bd2753b2dda7bb43c7468826de75f49c6a7e8965
> Author: Yinghai Lu <yinghai@kernel.org>
> Date:   Wed Jun 6 10:55:40 2012 -0700
> 
>     x86/mm: Only add extra pages count for the first memory range during pre-allocation early page table space
> 
>     Robin found this regression:
> 
>     | I just tried to boot an 8TB system.  It fails very early in boot with:
>     | Kernel panic - not syncing: Cannot find space for the kernel page tables
> 
>     git bisect commit 722bc6b16771ed80871e1fd81c86d3627dda2ac8.
> 
>     A git revert of that commit does boot past that point on the 8TB
>     configuration.
> 
>     That commit will add up extra pages for all memory range even
>     above 4g.
> 
>     Try to limit that extra page count adding to first entry only.
> 
>     Bisected-by: Robin Holt <holt@sgi.com>
>     Tested-by: Robin Holt <holt@sgi.com>
>     Signed-off-by: Yinghai Lu <yinghai@kernel.org>
>     Cc: WANG Cong <xiyou.wangcong@gmail.com>
>     Cc: Linus Torvalds <torvalds@linux-foundation.org>
>     Cc: Andrew Morton <akpm@linux-foundation.org>
>     Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
>     Link: http://lkml.kernel.org/r/CAE9FiQUj3wyzQxtq9yzBNc9u220p8JZ1FYHG7t%3DMOzJ%3D9BZMYA@mail.gmail.com
>     Signed-off-by: Ingo Molnar <mingo@kernel.org>
> 
> 
> 
> OTOH, Jacob and Yinghai have better init_memory_mapping cleanup
> patches which are already in tip:x86/mm2. Their patches fix this
> issue as well.
> 
> Since kdump has not worked for a long time, ever since
> 722bc6b16771ed80871e1fd81c86d3627dda2ac8, can you or someone else
> help to get the init_memory_mapping patches merged?
> 
> Or do you still prefer to revert 722bc6b and bd2753b2d? I think
> the stable kernels also need a fix.
> 




I just saw Yinghai's comments: the later init_memory_mapping cleanup
will also address the 4k pages in the first 2/4M, so reverting them should be better.
https://lkml.org/lkml/2012/9/4/533

Here is a patch for the revert:
---
x86 mm: Revert find_early_table_space fix

722bc6b16771ed80871e1fd81c86d3627dda2ac8 tried to address the issue that
the first 2/4M should use 4k pages if PSE is enabled, but the extra counts
should only be valid for x86_32. That commit causes a kdump regression: the
kdump kernel hangs with it.

As Yinghai Lu said, this should be reverted. See the post below:
https://lkml.org/lkml/2012/9/4/533

There is also a later fix to the above fix, bd2753b2dda7bb43c7468826de75f49c6a7e8965,
so we need to revert both of these commits.

Tested kdump on physical and virtual machines.

Reverted commits:
commit 722bc6b16771ed80871e1fd81c86d3627dda2ac8
Author: WANG Cong <xiyou.wangcong@gmail.com>
Date:   Mon Mar 5 15:05:13 2012 -0800

    x86/mm: Fix the size calculation of mapping tables
    
    For machines that enable PSE, the first 2/4M memory region still uses
    4K pages, so needs more PTEs in this case, but
    find_early_table_space() doesn't count this.
    
    This patch fixes it.
    
    The bug was found via code review, no misbehavior of the kernel
    was observed.
    
    Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com>
    Cc: Yinghai Lu <yinghai@kernel.org>
    Cc: Tejun Heo <tj@kernel.org>
    Cc: <ianfang.cn@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Link: http://lkml.kernel.org/n/tip-kq6a00qe33h7c7ais2xsywnh@git.kernel.org
    Signed-off-by: Ingo Molnar <mingo@elte.hu>

commit bd2753b2dda7bb43c7468826de75f49c6a7e8965
Author: Yinghai Lu <yinghai@kernel.org>
Date:   Wed Jun 6 10:55:40 2012 -0700

    x86/mm: Only add extra pages count for the first memory range during pre-allocatio
    
    Robin found this regression:
    
    | I just tried to boot an 8TB system.  It fails very early in boot with:
    | Kernel panic - not syncing: Cannot find space for the kernel page tables
    
    git bisect commit 722bc6b16771ed80871e1fd81c86d3627dda2ac8.
    
    A git revert of that commit does boot past that point on the 8TB
    configuration.
    
    That commit will add up extra pages for all memory range even
    above 4g.
    
    Try to limit that extra page count adding to first entry only.
    
    Bisected-by: Robin Holt <holt@sgi.com>
    Tested-by: Robin Holt <holt@sgi.com>
    Signed-off-by: Yinghai Lu <yinghai@kernel.org>
    Cc: WANG Cong <xiyou.wangcong@gmail.com>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Andrew Morton <akpm@linux-foundation.org>
    Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Link: http://lkml.kernel.org/r/CAE9FiQUj3wyzQxtq9yzBNc9u220p8JZ1FYHG7t%3DMOzJ%3D9B
    Signed-off-by: Ingo Molnar <mingo@kernel.org>


Signed-off-by: Dave Young <dyoung@redhat.com>
---
 arch/x86/mm/init.c |   22 +++++++++-------------
 1 file changed, 9 insertions(+), 13 deletions(-)

--- linux-2.6.orig/arch/x86/mm/init.c
+++ linux-2.6/arch/x86/mm/init.c
@@ -29,14 +29,8 @@ int direct_gbpages
 #endif
 ;
 
-struct map_range {
-	unsigned long start;
-	unsigned long end;
-	unsigned page_size_mask;
-};
-
-static void __init find_early_table_space(struct map_range *mr, unsigned long end,
-					  int use_pse, int use_gbpages)
+static void __init find_early_table_space(unsigned long end, int use_pse,
+					  int use_gbpages)
 {
 	unsigned long puds, pmds, ptes, tables, start = 0, good_end = end;
 	phys_addr_t base;
@@ -61,10 +55,6 @@ static void __init find_early_table_spac
 #ifdef CONFIG_X86_32
 		extra += PMD_SIZE;
 #endif
-		/* The first 2/4M doesn't use large pages. */
-		if (mr->start < PMD_SIZE)
-			extra += mr->end - mr->start;
-
 		ptes = (extra + PAGE_SIZE - 1) >> PAGE_SHIFT;
 	} else
 		ptes = (end + PAGE_SIZE - 1) >> PAGE_SHIFT;
@@ -95,6 +85,12 @@ void __init native_pagetable_reserve(u64
 	memblock_reserve(start, end - start);
 }
 
+struct map_range {
+	unsigned long start;
+	unsigned long end;
+	unsigned page_size_mask;
+};
+
 #ifdef CONFIG_X86_32
 #define NR_RANGE_MR 3
 #else /* CONFIG_X86_64 */
@@ -267,7 +263,7 @@ unsigned long __init_refok init_memory_m
 	 * nodes are discovered.
 	 */
 	if (!after_bootmem)
-		find_early_table_space(&mr[0], end, use_pse, use_gbpages);
+		find_early_table_space(end, use_pse, use_gbpages);
 
 	for (i = 0; i < nr_range; i++)
 		ret = kernel_physical_mapping_init(mr[i].start, mr[i].end,
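
For readers following the arithmetic, the PTE accounting touched by this
revert can be modelled in user space. This is only an illustrative sketch,
not kernel code: the constants assume x86_64 with 4 KiB pages and 2 MiB
PMD-level mappings, the names pte_count() and struct range_model are made up
for the example, and the line that computes the unaligned tail of `end` is an
assumption about the surrounding function that the hunks above do not show.

```c
#include <stdbool.h>

/* Assumed x86_64 values: 4 KiB pages, 2 MiB PMD-level large pages. */
#define MODEL_PAGE_SHIFT 12
#define MODEL_PAGE_SIZE  (1ULL << MODEL_PAGE_SHIFT)
#define MODEL_PMD_SHIFT  21
#define MODEL_PMD_SIZE   (1ULL << MODEL_PMD_SHIFT)

/* Hypothetical stand-in for the start/end fields of struct map_range. */
struct range_model {
	unsigned long long start;
	unsigned long long end;
};

/*
 * Number of 4 KiB PTEs accounted for when PSE is in use.
 * count_first_range == true models the check removed by the revert;
 * false models the code after the revert (64-bit, so the
 * CONFIG_X86_32-only term is left out).
 */
static unsigned long long pte_count(const struct range_model *mr,
				    unsigned long long end,
				    bool count_first_range)
{
	/* Assumption: the part of `end` not covered by a whole large page. */
	unsigned long long extra =
		end - ((end >> MODEL_PMD_SHIFT) << MODEL_PMD_SHIFT);

	/* The removed check: the first range starts below PMD_SIZE. */
	if (count_first_range && mr->start < MODEL_PMD_SIZE)
		extra += mr->end - mr->start;

	/* Round up to whole pages, as in the hunk above. */
	return (extra + MODEL_PAGE_SIZE - 1) >> MODEL_PAGE_SHIFT;
}
```

For a hypothetical first range of [0, 1 GiB) with `end` at 1 GiB,
pte_count() gives 262144 with the check and 0 without it. Whether the first
range really looks like that on a given machine is not something this sketch
establishes; it only shows how large the extra term can become when the
whole first range is counted.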

* Re: kexec/kdump kernel fails to start
  2012-10-18  6:33           ` Dave Young
@ 2012-10-18 13:57             ` Cong Wang
  2012-10-18 16:27             ` Flavio Leitner
  1 sibling, 0 replies; 21+ messages in thread
From: Cong Wang @ 2012-10-18 13:57 UTC (permalink / raw)
  To: Dave Young
  Cc: Dave Young, Ingo Molnar, Yinghai Lu, Dan Carpenter,
	Flavio Leitner, lkml, Ingo Molnar, Tejun Heo, ianfang.cn,
	Andrew Morton, Vivek Goyal

On Thu, Oct 18, 2012 at 2:33 PM, Dave Young <dyoung@redhat.com> wrote:
> Here is a patch for the revert:
> ---
> x86 mm: Revert find_early_table_space fix
>
> 722bc6b16771ed80871e1fd81c86d3627dda2ac8 tried to address the issue that
> the first 2/4M should use 4k pages if PSE is enabled, but the extra counts
> should only be valid for x86_32. That commit causes a kdump regression: the
> kdump kernel hangs with it.
>
> As Yinghai Lu said, this should be reverted. See the post below:
> https://lkml.org/lkml/2012/9/4/533
>
> There is also a later fix to the above fix, bd2753b2dda7bb43c7468826de75f49c6a7e8965,
> so we need to revert both of these commits.
>
> Tested kdump on physical and virtual machines.

Looks good to me,

Acked-by: Cong Wang <xiyou.wangcong@gmail.com>

Thanks for the fix!

* Re: kexec/kdump kernel fails to start
  2012-10-18  6:33           ` Dave Young
  2012-10-18 13:57             ` Cong Wang
@ 2012-10-18 16:27             ` Flavio Leitner
  1 sibling, 0 replies; 21+ messages in thread
From: Flavio Leitner @ 2012-10-18 16:27 UTC (permalink / raw)
  To: Dave Young
  Cc: Dave Young, Ingo Molnar, Yinghai Lu, Dan Carpenter, Cong Wang,
	lkml, Ingo Molnar, Tejun Heo, ianfang.cn, Andrew Morton,
	Vivek Goyal

On Thu, 18 Oct 2012 14:33:23 +0800
Dave Young <dyoung@redhat.com> wrote:
[...]
> I just saw Yinghai's comments: the later init_memory_mapping cleanup
> will also address the 4k pages in the first 2/4M, so reverting them should be better.
> https://lkml.org/lkml/2012/9/4/533
> 
> Here is a patch for the revert:
> ---
> x86 mm: Revert find_early_table_space fix
> 
> 722bc6b16771ed80871e1fd81c86d3627dda2ac8 tried to address the issue that
> the first 2/4M should use 4k pages if PSE is enabled, but the extra counts
> should only be valid for x86_32. That commit causes a kdump regression: the
> kdump kernel hangs with it.
> 
> As Yinghai Lu said, this should be reverted. See the post below:
> https://lkml.org/lkml/2012/9/4/533
> 
> There is also a later fix to the above fix, bd2753b2dda7bb43c7468826de75f49c6a7e8965,
> so we need to revert both of these commits.
> 
> Tested kdump on physical and virtual machines.
> 
> Reverted commits:
> commit 722bc6b16771ed80871e1fd81c86d3627dda2ac8
> Author: WANG Cong <xiyou.wangcong@gmail.com>
> Date:   Mon Mar 5 15:05:13 2012 -0800
> 
>     x86/mm: Fix the size calculation of mapping tables
>     
>     For machines that enable PSE, the first 2/4M memory region still uses
>     4K pages, so needs more PTEs in this case, but
>     find_early_table_space() doesn't count this.
>     
>     This patch fixes it.
>     
>     The bug was found via code review, no misbehavior of the kernel
>     was observed.
>     
>     Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com>
>     Cc: Yinghai Lu <yinghai@kernel.org>
>     Cc: Tejun Heo <tj@kernel.org>
>     Cc: <ianfang.cn@gmail.com>
>     Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
>     Link: http://lkml.kernel.org/n/tip-kq6a00qe33h7c7ais2xsywnh@git.kernel.org
>     Signed-off-by: Ingo Molnar <mingo@elte.hu>
> 
> commit bd2753b2dda7bb43c7468826de75f49c6a7e8965
> Author: Yinghai Lu <yinghai@kernel.org>
> Date:   Wed Jun 6 10:55:40 2012 -0700
> 
>     x86/mm: Only add extra pages count for the first memory range during pre-allocatio
>     
>     Robin found this regression:
>     
>     | I just tried to boot an 8TB system.  It fails very early in boot with:
>     | Kernel panic - not syncing: Cannot find space for the kernel page tables
>     
>     git bisect commit 722bc6b16771ed80871e1fd81c86d3627dda2ac8.
>     
>     A git revert of that commit does boot past that point on the 8TB
>     configuration.
>     
>     That commit will add up extra pages for all memory range even
>     above 4g.
>     
>     Try to limit that extra page count adding to first entry only.
>     
>     Bisected-by: Robin Holt <holt@sgi.com>
>     Tested-by: Robin Holt <holt@sgi.com>
>     Signed-off-by: Yinghai Lu <yinghai@kernel.org>
>     Cc: WANG Cong <xiyou.wangcong@gmail.com>
>     Cc: Linus Torvalds <torvalds@linux-foundation.org>
>     Cc: Andrew Morton <akpm@linux-foundation.org>
>     Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
>     Link: http://lkml.kernel.org/r/CAE9FiQUj3wyzQxtq9yzBNc9u220p8JZ1FYHG7t%3DMOzJ%3D9B
>     Signed-off-by: Ingo Molnar <mingo@kernel.org>
> 
> 
> Signed-off-by: Dave Young <dyoung@redhat.com>
> ---
>  arch/x86/mm/init.c |   22 +++++++++-------------
>  1 file changed, 9 insertions(+), 13 deletions(-)

The patch looks good.

I reproduced the issue with the latest upstream
commit 43c422eda99b894f18d1cca17bcd2401efaf7bd0
and confirmed that it works with the patch applied.
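
When repeating this kind of test, it can help to rule out the case where no
crash kernel was loaded at all before triggering the panic. A minimal sketch
of such a check follows; crash_kernel_loaded() is a hypothetical helper
written for this example, and it only assumes that the sysfs attribute
/sys/kernel/kexec_crash_loaded reads "1" once a crash kernel is loaded.

```c
#include <stdio.h>

/* Sysfs attribute that reads "1" when a crash (kdump) kernel is loaded. */
#define KEXEC_CRASH_LOADED_PATH "/sys/kernel/kexec_crash_loaded"

/*
 * Hypothetical helper: returns 1 if the file at `path` starts with '1',
 * 0 if it starts with any other character, and -1 if it cannot be read.
 */
static int crash_kernel_loaded(const char *path)
{
	FILE *f = fopen(path, "r");
	int c;

	if (!f)
		return -1;
	c = fgetc(f);
	fclose(f);
	if (c == EOF)
		return -1;
	return c == '1';
}
```

Calling crash_kernel_loaded(KEXEC_CRASH_LOADED_PATH) before writing 'c' to
/proc/sysrq-trigger separates "kdump was never armed" from "the kdump kernel
hangs during boot", which is the failure discussed in this thread.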

thanks a lot!

Acked-by: Flavio Leitner <fbl@redhat.com>
Tested-by: Flavio Leitner <fbl@redhat.com>

fbl

end of thread, other threads:[~2012-10-18 16:28 UTC | newest]

Thread overview: 21+ messages
2012-09-04 17:32 kexec/kdump kernel fails to start Flavio Leitner
2012-09-04 19:02 ` Yinghai Lu
2012-09-04 19:17   ` Flavio Leitner
2012-09-04 19:20     ` Yinghai Lu
2012-09-04 20:00       ` Flavio Leitner
2012-09-04 20:26       ` Flavio Leitner
2012-09-04 20:45         ` Yinghai Lu
2012-09-04 21:37           ` Flavio Leitner
2012-09-04 22:25             ` Yinghai Lu
2012-09-04 22:40               ` Flavio Leitner
2012-09-05  0:01               ` Flavio Leitner
2012-09-05  1:15                 ` Yinghai Lu
2012-09-05 13:46                   ` Flavio Leitner
2012-09-05 15:34 ` Cong Wang
2012-09-23 20:27   ` Dan Carpenter
2012-09-23 20:52     ` Yinghai Lu
2012-09-29  7:13       ` Ingo Molnar
2012-10-18  2:16         ` Dave Young
2012-10-18  6:33           ` Dave Young
2012-10-18 13:57             ` Cong Wang
2012-10-18 16:27             ` Flavio Leitner
