linux-kernel.vger.kernel.org archive mirror
* crash dump memory reservation regression
       [not found] <5535e246-b7c0-4f69-a27b-1abbadd42db0@zmail14.collab.prod.int.phx2.redhat.com>
@ 2012-03-12  3:00 ` CAI Qian
  2012-03-13  5:31   ` Yinghai Lu
  0 siblings, 1 reply; 11+ messages in thread
From: CAI Qian @ 2012-03-12  3:00 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: Vivek Goyal, H. Peter Anvin, linux-kernel

commit 3661ca66a42e306aaf53246fb75aec1ea01be0f0
x86, memblock: Fix crashkernel allocation

introduced a regression, according to bisection: booting with
crashkernel=512M now fails like this,

crashkernel reservation failed - No suitable area found.
The full dmesg can be found here.

http://people.redhat.com/qcai/dmesg.bad

CAI Qian


* Re: crash dump memory reservation regression
  2012-03-12  3:00 ` crash dump memory reservation regression CAI Qian
@ 2012-03-13  5:31   ` Yinghai Lu
  2012-03-13  5:42     ` H. Peter Anvin
                       ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Yinghai Lu @ 2012-03-13  5:31 UTC (permalink / raw)
  To: CAI Qian, Takashi Iwai, Linus Torvalds, H. Peter Anvin, Ingo Molnar
  Cc: Vivek Goyal, linux-kernel

On Sun, Mar 11, 2012 at 8:00 PM, CAI Qian <caiqian@redhat.com> wrote:
> commit 3661ca66a42e306aaf53246fb75aec1ea01be0f0
> x86, memblock: Fix crashkernel allocation
>
> introduced a regression, according to bisection: booting with
> crashkernel=512M now fails like this,
>
> crashkernel reservation failed - No suitable area found.
> The full dmesg can be found here.
>
> http://people.redhat.com/qcai/dmesg.bad

The reason is: we put the page tables for [0,2g) just below 512M.

Later patches put the page tables for [0,2g) just below 2g. That works
even though we can only access 512M at that time, because we use
early_ioremap to access the page tables.

But that good_end change was reverted by the commit below because it
caused S4 resume failures.

So the page tables sit just below 512M again, and there is no chance
of finding a contiguous 512M below 768M.

Possible solutions:
1.  remove the good_end setting for 64 bit again, and root cause the
    S4 resume failure.
2.  get page low?
3.  fix kdump so that it can take two ranges: a small segment below
    512M, and another part that could be above 4G.

Thanks

Yinghai


commit 8548c84da2f47e71bbbe300f55edb768492575f7
Author: Takashi Iwai <tiwai@suse.de>
Date:   Sun Oct 23 23:19:12 2011 +0200

    x86: Fix S4 regression

    Commit 4b239f458 ("x86-64, mm: Put early page table high") causes a S4
    regression since 2.6.39, namely the machine reboots occasionally at S4
    resume.  It doesn't happen always, overall rate is about 1/20.  But,
    like other bugs, once when this happens, it continues to happen.

    This patch fixes the problem by essentially reverting the memory
    assignment in the older way.

    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Cc: <stable@kernel.org>
    Cc: Rafael J. Wysocki <rjw@sisk.pl>
    Cc: Yinghai Lu <yinghai.lu@oracle.com>
    [ We'll hopefully find the real fix, but that's too late for 3.1 now ]
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index 3032644..87488b9 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -63,9 +63,8 @@ static void __init find_early_table_space(unsigned long end, int use_pse,
 #ifdef CONFIG_X86_32
        /* for fixmap */
        tables += roundup(__end_of_fixed_addresses * sizeof(pte_t), PAGE_SIZE);
-
-       good_end = max_pfn_mapped << PAGE_SHIFT;
 #endif
+       good_end = max_pfn_mapped << PAGE_SHIFT;

        base = memblock_find_in_range(start, good_end, tables, PAGE_SIZE);
        if (base == MEMBLOCK_ERROR)
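
The explanation in this message can be sketched with a toy model. This
is not kernel code: find_top_down(), the 16M low reservation, the 4M
page-table size, and the 16M crashkernel alignment are all assumptions
made up for illustration. Only the 512M request, the 768M ceiling, and
the two placement limits (512M and 2G) come from the message. The
sketch assumes a top-down search, places the page tables first, and
then looks for the crashkernel region.

```c
#include <stdbool.h>
#include <stdint.h>

#define MIB        (1024ULL * 1024ULL)
#define FIND_ERROR UINT64_MAX   /* hypothetical failure marker */

struct range { uint64_t start, end; };  /* half-open: [start, end) */

static bool overlaps(struct range a, struct range b)
{
    return a.start < b.end && b.start < a.end;
}

/*
 * Toy top-down search: highest aligned base in [lo, hi) where
 * [base, base + size) avoids every reserved range.
 */
static uint64_t find_top_down(uint64_t lo, uint64_t hi, uint64_t size,
                              uint64_t align,
                              const struct range *reserved, int nr_reserved)
{
    if (size == 0 || align == 0 || hi < lo || hi - lo < size)
        return FIND_ERROR;

    uint64_t base = ((hi - size) / align) * align;  /* round down */

    for (;;) {
        if (base < lo)
            return FIND_ERROR;

        struct range cand = { base, base + size };
        bool clash = false;
        for (int i = 0; i < nr_reserved; i++)
            if (overlaps(cand, reserved[i]))
                clash = true;
        if (!clash)
            return base;

        if (base < align)       /* cannot step any lower */
            return FIND_ERROR;
        base -= align;
    }
}

/*
 * Place a 4M page table below good_end (assumed size), then try to
 * reserve 512M for the crash kernel below 768M.
 */
static uint64_t crashkernel_base_for(uint64_t good_end)
{
    struct range reserved[2] = { { 0, 16 * MIB } };  /* assumed low area */

    uint64_t table = find_top_down(0, good_end, 4 * MIB, 4096, reserved, 1);
    if (table == FIND_ERROR)
        return FIND_ERROR;
    reserved[1] = (struct range){ table, table + 4 * MIB };

    return find_top_down(0, 768 * MIB, 512 * MIB, 16 * MIB, reserved, 2);
}
```

With these assumed numbers, crashkernel_base_for(512 * MIB) fails
because the table at 508M overlaps every 512M window below 768M, while
crashkernel_base_for(2048 * MIB) finds a region at 256M.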


* Re: crash dump memory reservation regression
  2012-03-13  5:31   ` Yinghai Lu
@ 2012-03-13  5:42     ` H. Peter Anvin
  2012-08-31 15:59       ` Shuah Khan
  2012-03-13 14:26     ` Vivek Goyal
                       ` (2 subsequent siblings)
  3 siblings, 1 reply; 11+ messages in thread
From: H. Peter Anvin @ 2012-03-13  5:42 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: CAI Qian, Takashi Iwai, Linus Torvalds, Ingo Molnar, Vivek Goyal,
	linux-kernel

On 03/12/2012 10:31 PM, Yinghai Lu wrote:
> 
> Possible solutions:
> 1.  remove the good_end setting for 64 bit again, and root cause the
>     S4 resume failure.

This would by far be the best.

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.



* Re: crash dump memory reservation regression
  2012-03-13  5:31   ` Yinghai Lu
  2012-03-13  5:42     ` H. Peter Anvin
@ 2012-03-13 14:26     ` Vivek Goyal
  2012-03-13 21:28       ` Yinghai Lu
  2012-03-21  8:17     ` Dave Young
  2012-03-26 10:32     ` Cong Wang
  3 siblings, 1 reply; 11+ messages in thread
From: Vivek Goyal @ 2012-03-13 14:26 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: CAI Qian, Takashi Iwai, Linus Torvalds, H. Peter Anvin,
	Ingo Molnar, linux-kernel

On Mon, Mar 12, 2012 at 10:31:21PM -0700, Yinghai Lu wrote:

[..]
> 3.  fix kdump so that it can take two ranges: a small segment below
>     512M, and another part that could be above 4G.

I would prefer to avoid supporting a split memory range for kdump memory.
This would make the kdump solution complicated, and we might not have
much to gain.

In general the focus is to reserve as little memory as possible for the
kdump kernel. Currently for x86, we reserve an ad hoc 128M block by
default and scale it up by 64MB per 1TB of physical RAM (the dump
filtering utility requires 2 bits of memory per 4K physical page).

So as long as we can reserve up to 512MB of kdump memory, that should allow
us to support systems of up to 6TB with dump filtering. Hopefully that is
sufficient for quite some time and we don't have to take the path of
supporting non-contiguous memory for kdump.

Thanks
Vivek
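
The sizing rule in this message can be checked with a little
arithmetic. The sketch below is only an illustration: the function
names are hypothetical, and it assumes binary units (1TB taken as 2^40
bytes, 1MB as 2^20 bytes). It computes the filter bitmap cost at 2 bits
per 4K page and the reservation size from the "128M plus 64MB per 1TB"
rule.

```c
#include <assert.h>
#include <stdint.h>

#define MIB (1024ULL * 1024ULL)
#define TIB (1024ULL * 1024ULL * MIB)

/* Bytes of bitmap needed at bits_per_page bits for each page of RAM. */
static uint64_t filter_bitmap_bytes(uint64_t ram_bytes, uint64_t page_size,
                                    unsigned bits_per_page)
{
    assert(page_size > 0);
    uint64_t pages = ram_bytes / page_size;
    return (pages * bits_per_page + 7) / 8;   /* round up to whole bytes */
}

/* Reservation per the rule in the message: 128M base + 64M per TB. */
static uint64_t crashkernel_size_bytes(uint64_t ram_tib)
{
    return 128 * MIB + ram_tib * 64 * MIB;
}

/* Largest RAM size (in TB) whose reservation fits within limit_bytes. */
static uint64_t max_ram_tib_for_limit(uint64_t limit_bytes)
{
    if (limit_bytes < 128 * MIB)
        return 0;
    return (limit_bytes - 128 * MIB) / (64 * MIB);
}
```

Under these assumptions, 2 bits per 4K page over 1TB comes to exactly
64MB, which matches the per-TB step, and a 512MB limit corresponds to
6TB. The same rule applied to a 16TB system gives 1152MB.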


* Re: crash dump memory reservation regression
  2012-03-13 14:26     ` Vivek Goyal
@ 2012-03-13 21:28       ` Yinghai Lu
  0 siblings, 0 replies; 11+ messages in thread
From: Yinghai Lu @ 2012-03-13 21:28 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: CAI Qian, Takashi Iwai, Linus Torvalds, H. Peter Anvin,
	Ingo Molnar, linux-kernel

On Tue, Mar 13, 2012 at 7:26 AM, Vivek Goyal <vgoyal@redhat.com> wrote:
> On Mon, Mar 12, 2012 at 10:31:21PM -0700, Yinghai Lu wrote:
>
>
> So as long as we can reserve up to 512MB of kdump memory, that should allow
> us to support systems of up to 6TB with dump filtering. Hopefully that is
> sufficient for quite some time and we don't have to take the path of
> supporting non-contiguous memory for kdump.

In one or two years, there may be x86 systems with more than 16T?

Yinghai


* Re: crash dump memory reservation regression
  2012-03-13  5:31   ` Yinghai Lu
  2012-03-13  5:42     ` H. Peter Anvin
  2012-03-13 14:26     ` Vivek Goyal
@ 2012-03-21  8:17     ` Dave Young
  2012-03-26 10:32     ` Cong Wang
  3 siblings, 0 replies; 11+ messages in thread
From: Dave Young @ 2012-03-21  8:17 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: CAI Qian, Takashi Iwai, Linus Torvalds, H. Peter Anvin,
	Ingo Molnar, Vivek Goyal, linux-kernel

On Tue, Mar 13, 2012 at 1:31 PM, Yinghai Lu <yinghai@kernel.org> wrote:
> On Sun, Mar 11, 2012 at 8:00 PM, CAI Qian <caiqian@redhat.com> wrote:
>> commit 3661ca66a42e306aaf53246fb75aec1ea01be0f0
>> x86, memblock: Fix crashkernel allocation
>>
>> introduced a regression, according to bisection: booting with
>> crashkernel=512M now fails like this,
>>
>> crashkernel reservation failed - No suitable area found.
>> The full dmesg can be found here.
>>
>> http://people.redhat.com/qcai/dmesg.bad
>
> The reason is: we put the page tables for [0,2g) just below 512M.
>
> Later patches put the page tables for [0,2g) just below 2g. That works
> even though we can only access 512M at that time, because we use
> early_ioremap to access the page tables.
>
> But that good_end change was reverted by the commit below because it
> caused S4 resume failures.
>
> So the page tables sit just below 512M again, and there is no chance
> of finding a contiguous 512M below 768M.
>
> Possible solutions:
> 1.  remove the good_end setting for 64 bit again, and root cause the
>     S4 resume failure.

Takashi, could you check if latest kernel without the good_end patch
below works for you?

> 2.  get page low?
> 3.  fix kdump so that it can take two ranges: a small segment below
>     512M, and another part that could be above 4G.
>
> Thanks
>
> Yinghai
>
>
> commit 8548c84da2f47e71bbbe300f55edb768492575f7
> Author: Takashi Iwai <tiwai@suse.de>
> Date:   Sun Oct 23 23:19:12 2011 +0200
>
>    x86: Fix S4 regression
>
>    Commit 4b239f458 ("x86-64, mm: Put early page table high") causes a S4
>    regression since 2.6.39, namely the machine reboots occasionally at S4
>    resume.  It doesn't happen always, overall rate is about 1/20.  But,
>    like other bugs, once when this happens, it continues to happen.
>
>    This patch fixes the problem by essentially reverting the memory
>    assignment in the older way.
>
>    Signed-off-by: Takashi Iwai <tiwai@suse.de>
>    Cc: <stable@kernel.org>
>    Cc: Rafael J. Wysocki <rjw@sisk.pl>
>    Cc: Yinghai Lu <yinghai.lu@oracle.com>
>    [ We'll hopefully find the real fix, but that's too late for 3.1 now ]
>    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
>
> diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
> index 3032644..87488b9 100644
> --- a/arch/x86/mm/init.c
> +++ b/arch/x86/mm/init.c
> @@ -63,9 +63,8 @@ static void __init find_early_table_space(unsigned long end, int use_pse,
>  #ifdef CONFIG_X86_32
>        /* for fixmap */
>        tables += roundup(__end_of_fixed_addresses * sizeof(pte_t), PAGE_SIZE);
> -
> -       good_end = max_pfn_mapped << PAGE_SHIFT;
>  #endif
> +       good_end = max_pfn_mapped << PAGE_SHIFT;
>
>        base = memblock_find_in_range(start, good_end, tables, PAGE_SIZE);
>        if (base == MEMBLOCK_ERROR)



-- 
Regards
Dave


* Re: crash dump memory reservation regression
  2012-03-13  5:31   ` Yinghai Lu
                       ` (2 preceding siblings ...)
  2012-03-21  8:17     ` Dave Young
@ 2012-03-26 10:32     ` Cong Wang
  3 siblings, 0 replies; 11+ messages in thread
From: Cong Wang @ 2012-03-26 10:32 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: CAI Qian, Takashi Iwai, Linus Torvalds, H. Peter Anvin,
	Ingo Molnar, Vivek Goyal, linux-kernel

On Tue, Mar 13, 2012 at 1:31 PM, Yinghai Lu <yinghai@kernel.org> wrote:
> On Sun, Mar 11, 2012 at 8:00 PM, CAI Qian <caiqian@redhat.com> wrote:
>> commit 3661ca66a42e306aaf53246fb75aec1ea01be0f0
>> x86, memblock: Fix crashkernel allocation
>>
>> introduced a regression, according to bisection: booting with
>> crashkernel=512M now fails like this,
>>
>> crashkernel reservation failed - No suitable area found.
>> The full dmesg can be found here.
>>
>> http://people.redhat.com/qcai/dmesg.bad
>
> The reason is: we put the page tables for [0,2g) just below 512M.
>
> Later patches put the page tables for [0,2g) just below 2g. That works
> even though we can only access 512M at that time, because we use
> early_ioremap to access the page tables.
>
> But that good_end change was reverted by the commit below because it
> caused S4 resume failures.
>
> So the page tables sit just below 512M again, and there is no chance
> of finding a contiguous 512M below 768M.
>
> Possible solutions:
> 1.  remove the good_end setting for 64 bit again, and root cause the
>     S4 resume failure.
> 2.  get page low?
> 3.  fix kdump so that it can take two ranges: a small segment below
>     512M, and another part that could be above 4G.

Is increasing CRASH_KERNEL_ADDR_MAX a 4th solution? I know we need
to fix kexec-tools too, but we will get more benefits...


* Re: crash dump memory reservation regression
  2012-03-13  5:42     ` H. Peter Anvin
@ 2012-08-31 15:59       ` Shuah Khan
  2012-08-31 16:37         ` H. Peter Anvin
  0 siblings, 1 reply; 11+ messages in thread
From: Shuah Khan @ 2012-08-31 15:59 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Yinghai Lu, CAI Qian, Takashi Iwai, Linus Torvalds, Ingo Molnar,
	Vivek Goyal, linux-kernel

On Mon, 2012-03-12 at 22:42 -0700, H. Peter Anvin wrote:
> On 03/12/2012 10:31 PM, Yinghai Lu wrote:
> > 
> > Possible solutions:
> > 1.  remove the good_end setting for 64 bit again, and root cause the
> >     S4 resume failure.
> 
> This would by far be the best.
> 
> 	-hpa
> 

Any resolution on this issue? Has this been fixed? I haven't seen this
message on the systems I am running the upstream kernel on, and would
like to get a commit ID (if any) that fixed the problem.

Thanks,
-- Shuah




* Re: crash dump memory reservation regression
  2012-08-31 15:59       ` Shuah Khan
@ 2012-08-31 16:37         ` H. Peter Anvin
  2012-08-31 16:42           ` Takashi Iwai
  0 siblings, 1 reply; 11+ messages in thread
From: H. Peter Anvin @ 2012-08-31 16:37 UTC (permalink / raw)
  To: shuah.khan
  Cc: Yinghai Lu, CAI Qian, Takashi Iwai, Linus Torvalds, Ingo Molnar,
	Vivek Goyal, linux-kernel

Kernel Summit is this week... people are away.

Shuah Khan <shuah.khan@hp.com> wrote:

>On Mon, 2012-03-12 at 22:42 -0700, H. Peter Anvin wrote:
>> On 03/12/2012 10:31 PM, Yinghai Lu wrote:
>> > 
>> > Possible solutions:
>> > 1.  remove the good_end setting for 64 bit again, and root cause
>> >     the S4 resume failure.
>> 
>> This would by far be the best.
>> 
>> 	-hpa
>> 
>
>Any resolution on this issue? Has this been fixed? I haven't seen this
>message on the systems I am running the upstream kernel on, and would
>like to get a commit ID (if any) that fixed the problem.
>
>Thanks,
>-- Shuah

-- 
Sent from my mobile phone. Please excuse brevity and lack of formatting.


* Re: crash dump memory reservation regression
  2012-08-31 16:37         ` H. Peter Anvin
@ 2012-08-31 16:42           ` Takashi Iwai
  2012-08-31 16:50             ` Shuah Khan
  0 siblings, 1 reply; 11+ messages in thread
From: Takashi Iwai @ 2012-08-31 16:42 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: shuah.khan, Yinghai Lu, CAI Qian, Takashi Iwai, Linus Torvalds,
	Ingo Molnar, Vivek Goyal, linux-kernel

At Fri, 31 Aug 2012 09:37:05 -0700,
H. Peter Anvin wrote:
> 
> Kernel Summit is this week... people are away.

I (as the person who asked for the revert) don't mind setting the
good_end back for 64bit again now.  On 3.5/3.6 kernels, there seem to
be other places breaking hibernation on the machines that hit the
problem at that time.  I need to dig deeper, but it's a horribly
time-consuming and unreliable task, and thus postponed so far.


thanks,

Takashi


> Shuah Khan <shuah.khan@hp.com> wrote:
> 
> >On Mon, 2012-03-12 at 22:42 -0700, H. Peter Anvin wrote:
> >> On 03/12/2012 10:31 PM, Yinghai Lu wrote:
> >> > 
> >> > Possible solutions:
> >> > 1.  remove the good_end setting for 64 bit again, and root cause
> >> >     the S4 resume failure.
> >> 
> >> This would by far be the best.
> >> 
> >> 	-hpa
> >> 
> >
> >Any resolution on this issue? Has this been fixed? I haven't seen this
> >message on the systems I am running the upstream kernel on, and would
> >like to get a commit ID (if any) that fixed the problem.
> >
> >Thanks,
> >-- Shuah
> 
> -- 
> Sent from my mobile phone. Please excuse brevity and lack of formatting.
> 


* Re: crash dump memory reservation regression
  2012-08-31 16:42           ` Takashi Iwai
@ 2012-08-31 16:50             ` Shuah Khan
  0 siblings, 0 replies; 11+ messages in thread
From: Shuah Khan @ 2012-08-31 16:50 UTC (permalink / raw)
  To: Takashi Iwai
  Cc: H. Peter Anvin, Yinghai Lu, CAI Qian, Linus Torvalds,
	Ingo Molnar, Vivek Goyal, linux-kernel, shuah.khan, shuahkhan

On Fri, 2012-08-31 at 18:42 +0200, Takashi Iwai wrote:
> At Fri, 31 Aug 2012 09:37:05 -0700,
> H. Peter Anvin wrote:
> > 
> > Kernel Summit is this week... people are away.
> 
> I (as the person who asked for the revert) don't mind setting the
> good_end back for 64bit again now.  On 3.5/3.6 kernels, there seem to
> be other places breaking hibernation on the machines that hit the
> problem at that time.  I need to dig deeper, but it's a horribly
> time-consuming and unreliable task, and thus postponed so far.
> 

I can work on getting this change in, if you are short on time. I am
seeing this on a few systems and would like to resolve it before 3.6
window closes.

Thanks,
-- Shuah



Thread overview: 11+ messages
     [not found] <5535e246-b7c0-4f69-a27b-1abbadd42db0@zmail14.collab.prod.int.phx2.redhat.com>
2012-03-12  3:00 ` crash dump memory reservation regression CAI Qian
2012-03-13  5:31   ` Yinghai Lu
2012-03-13  5:42     ` H. Peter Anvin
2012-08-31 15:59       ` Shuah Khan
2012-08-31 16:37         ` H. Peter Anvin
2012-08-31 16:42           ` Takashi Iwai
2012-08-31 16:50             ` Shuah Khan
2012-03-13 14:26     ` Vivek Goyal
2012-03-13 21:28       ` Yinghai Lu
2012-03-21  8:17     ` Dave Young
2012-03-26 10:32     ` Cong Wang
