* [PATCH] kdump: force page alignment for per-CPU crash notes.
@ 2012-02-29 17:21 Eugene Surovegin
2012-03-01 1:18 ` Simon Horman
2012-03-01 15:09 ` Vivek Goyal
0 siblings, 2 replies; 11+ messages in thread
From: Eugene Surovegin @ 2012-02-29 17:21 UTC (permalink / raw)
To: linux-kernel; +Cc: Eugene Surovegin, Eric Biederman, Vivek Goyal, kexec-list
Per-CPU allocations are not guaranteed to be physically contiguous.
However, kdump kernel and user-space code assumes that per-CPU
memory, used for saving CPU registers on crash, is.
This can cause corrupted /proc/vmcore in some cases - the main
symptom being huge ELF note section.
Force page alignment for note_buf_t to ensure that this assumption holds.
Signed-off-by: Eugene Surovegin <surovegin@google.com>
CC: Eric Biederman <ebiederm@xmission.com>
CC: Vivek Goyal <vgoyal@redhat.com>
CC: kexec-list <kexec@lists.infradead.org>
---
kernel/kexec.c | 9 +++++++--
1 files changed, 7 insertions(+), 2 deletions(-)
diff --git a/kernel/kexec.c b/kernel/kexec.c
index 7b08867..e641b5c 100644
--- a/kernel/kexec.c
+++ b/kernel/kexec.c
@@ -1232,8 +1232,13 @@ void crash_save_cpu(struct pt_regs *regs, int cpu)
static int __init crash_notes_memory_init(void)
{
- /* Allocate memory for saving cpu registers. */
- crash_notes = alloc_percpu(note_buf_t);
+ /* Allocate memory for saving cpu registers.
+ * Force page alignment to avoid crossing physical page boundary -
+ * kexec-tools and kernel /proc/vmcore handler assume these per-CPU
+ * chunks are physically contiguous.
+ */
+ crash_notes = (note_buf_t __percpu *)__alloc_percpu(sizeof(note_buf_t),
+ PAGE_SIZE);
if (!crash_notes) {
printk("Kexec: Memory allocation for saving cpu register"
" states failed\n");
--
1.7.9.1
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH] kdump: force page alignment for per-CPU crash notes.
2012-02-29 17:21 [PATCH] kdump: force page alignment for per-CPU crash notes Eugene Surovegin
@ 2012-03-01 1:18 ` Simon Horman
[not found] ` <CANrfoUXd4L2bjBO+L8ZBCkc1dfnZahn39sB25zsrUMqGA5zY5A@mail.gmail.com>
2012-03-01 15:09 ` Vivek Goyal
1 sibling, 1 reply; 11+ messages in thread
From: Simon Horman @ 2012-03-01 1:18 UTC (permalink / raw)
To: Eugene Surovegin; +Cc: linux-kernel, kexec-list, Eric Biederman, Vivek Goyal
On Wed, Feb 29, 2012 at 09:21:23AM -0800, Eugene Surovegin wrote:
> Per-CPU allocations are not guaranteed to be physically contiguous.
> However, kdump kernel and user-space code assumes that per-CPU
> memory, used for saving CPU registers on crash, is.
> This can cause corrupted /proc/vmcore in some cases - the main
> symptom being huge ELF note section.
>
> Force page alignment for note_buf_t to ensure that this assumption holds.
Ouch. I'm surprised there is an allocation on crash, perhaps
it could at least be done earlier? And am I right in thinking
that this change increases the likely hood that the allocation
could fail?
>
> Signed-off-by: Eugene Surovegin <surovegin@google.com>
> CC: Eric Biederman <ebiederm@xmission.com>
> CC: Vivek Goyal <vgoyal@redhat.com>
> CC: kexec-list <kexec@lists.infradead.org>
> ---
> kernel/kexec.c | 9 +++++++--
> 1 files changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/kexec.c b/kernel/kexec.c
> index 7b08867..e641b5c 100644
> --- a/kernel/kexec.c
> +++ b/kernel/kexec.c
> @@ -1232,8 +1232,13 @@ void crash_save_cpu(struct pt_regs *regs, int cpu)
>
> static int __init crash_notes_memory_init(void)
> {
> - /* Allocate memory for saving cpu registers. */
> - crash_notes = alloc_percpu(note_buf_t);
> + /* Allocate memory for saving cpu registers.
> + * Force page alignment to avoid crossing physical page boundary -
> + * kexec-tools and kernel /proc/vmcore handler assume these per-CPU
> + * chunks are physically contiguous.
> + */
> + crash_notes = (note_buf_t __percpu *)__alloc_percpu(sizeof(note_buf_t),
> + PAGE_SIZE);
> if (!crash_notes) {
> printk("Kexec: Memory allocation for saving cpu register"
> " states failed\n");
> --
> 1.7.9.1
>
>
> _______________________________________________
> kexec mailing list
> kexec@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] kdump: force page alignment for per-CPU crash notes.
[not found] ` <CANrfoUXd4L2bjBO+L8ZBCkc1dfnZahn39sB25zsrUMqGA5zY5A@mail.gmail.com>
@ 2012-03-01 1:32 ` Simon Horman
2012-03-01 1:39 ` Eugene Surovegin
0 siblings, 1 reply; 11+ messages in thread
From: Simon Horman @ 2012-03-01 1:32 UTC (permalink / raw)
To: Eugene Surovegin; +Cc: linux-kernel, kexec-list, Eric Biederman, Vivek Goyal
On Wed, Feb 29, 2012 at 05:23:10PM -0800, Eugene Surovegin wrote:
> On Wed, Feb 29, 2012 at 5:18 PM, Simon Horman <horms@verge.net.au> wrote:
>
> > On Wed, Feb 29, 2012 at 09:21:23AM -0800, Eugene Surovegin wrote:
> > > Per-CPU allocations are not guaranteed to be physically contiguous.
> > > However, kdump kernel and user-space code assumes that per-CPU
> > > memory, used for saving CPU registers on crash, is.
> > > This can cause corrupted /proc/vmcore in some cases - the main
> > > symptom being huge ELF note section.
> > >
> > > Force page alignment for note_buf_t to ensure that this assumption holds.
> >
> > Ouch. I'm surprised there is an allocation on crash, perhaps
> > it could at least be done earlier? And am I right in thinking
> > that this change increases the likely hood that the allocation
> > could fail?
> >
>
> I'm not following. This allocation is done on start-up, not on crash.
> If you cannot allocate this much memory on system boot, I'm not sure what
> else you can do on this system....
Sorry, my eyes deceived me. You are correct and I agree.
Is it the case that note_buf_t is never larger than PAGE_SIZE?
If so I your patch looks good to me.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] kdump: force page alignment for per-CPU crash notes.
2012-03-01 1:32 ` Simon Horman
@ 2012-03-01 1:39 ` Eugene Surovegin
2012-03-01 1:51 ` HATAYAMA Daisuke
2012-03-01 2:53 ` Simon Horman
0 siblings, 2 replies; 11+ messages in thread
From: Eugene Surovegin @ 2012-03-01 1:39 UTC (permalink / raw)
To: Simon Horman; +Cc: linux-kernel, kexec-list, Eric Biederman, Vivek Goyal
On Wed, Feb 29, 2012 at 5:32 PM, Simon Horman <horms@verge.net.au> wrote:
> On Wed, Feb 29, 2012 at 05:23:10PM -0800, Eugene Surovegin wrote:
>> On Wed, Feb 29, 2012 at 5:18 PM, Simon Horman <horms@verge.net.au> wrote:
>>
>> > On Wed, Feb 29, 2012 at 09:21:23AM -0800, Eugene Surovegin wrote:
>> > > Per-CPU allocations are not guaranteed to be physically contiguous.
>> > > However, kdump kernel and user-space code assumes that per-CPU
>> > > memory, used for saving CPU registers on crash, is.
>> > > This can cause corrupted /proc/vmcore in some cases - the main
>> > > symptom being huge ELF note section.
>> > >
>> > > Force page alignment for note_buf_t to ensure that this assumption holds.
>> >
>> > Ouch. I'm surprised there is an allocation on crash, perhaps
>> > it could at least be done earlier? And am I right in thinking
>> > that this change increases the likely hood that the allocation
>> > could fail?
>> >
>>
>> I'm not following. This allocation is done on start-up, not on crash.
>> If you cannot allocate this much memory on system boot, I'm not sure what
>> else you can do on this system....
>
> Sorry, my eyes deceived me. You are correct and I agree.
>
> Is it the case that note_buf_t is never larger than PAGE_SIZE?
> If so I your patch looks good to me.
Currently, maximum note size is hardcoded in kexec-tools to 1024
(MAX_NOTE_BYTES).
Usually it's way less. IIRC on x86_64 it's 336 bytes.
--
Eugene
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] kdump: force page alignment for per-CPU crash notes.
2012-03-01 1:39 ` Eugene Surovegin
@ 2012-03-01 1:51 ` HATAYAMA Daisuke
2012-03-01 1:56 ` Eugene Surovegin
2012-03-01 2:50 ` Simon Horman
2012-03-01 2:53 ` Simon Horman
1 sibling, 2 replies; 11+ messages in thread
From: HATAYAMA Daisuke @ 2012-03-01 1:51 UTC (permalink / raw)
To: surovegin; +Cc: horms, kexec, linux-kernel, vgoyal, ebiederm
From: Eugene Surovegin <surovegin@google.com>
Subject: Re: [PATCH] kdump: force page alignment for per-CPU crash notes.
Date: Wed, 29 Feb 2012 17:39:55 -0800
> On Wed, Feb 29, 2012 at 5:32 PM, Simon Horman <horms@verge.net.au> wrote:
>> On Wed, Feb 29, 2012 at 05:23:10PM -0800, Eugene Surovegin wrote:
>>> On Wed, Feb 29, 2012 at 5:18 PM, Simon Horman <horms@verge.net.au> wrote:
>>>
>>> > On Wed, Feb 29, 2012 at 09:21:23AM -0800, Eugene Surovegin wrote:
>>> > > Per-CPU allocations are not guaranteed to be physically contiguous.
>>> > > However, kdump kernel and user-space code assumes that per-CPU
>>> > > memory, used for saving CPU registers on crash, is.
>>> > > This can cause corrupted /proc/vmcore in some cases - the main
>>> > > symptom being huge ELF note section.
>>> > >
>>> > > Force page alignment for note_buf_t to ensure that this assumption holds.
>>> >
>>> > Ouch. I'm surprised there is an allocation on crash, perhaps
>>> > it could at least be done earlier? And am I right in thinking
>>> > that this change increases the likely hood that the allocation
>>> > could fail?
>>> >
>>>
>>> I'm not following. This allocation is done on start-up, not on crash.
>>> If you cannot allocate this much memory on system boot, I'm not sure what
>>> else you can do on this system....
>>
>> Sorry, my eyes deceived me. You are correct and I agree.
>>
>> Is it the case that note_buf_t is never larger than PAGE_SIZE?
>> If so I your patch looks good to me.
>
> Currently, maximum note size is hardcoded in kexec-tools to 1024
> (MAX_NOTE_BYTES).
> Usually it's way less. IIRC on x86_64 it's 336 bytes.
>
This is elf_prstatus and I guess it's mostly equal to registers.
crash> p sizeof(struct elf_prstatus)
$3 = 336
crash> ptype struct elf_prstatus
type = struct elf_prstatus {
struct elf_siginfo pr_info;
short int pr_cursig;
long unsigned int pr_sigpend;
long unsigned int pr_sighold;
pid_t pr_pid;
pid_t pr_ppid;
pid_t pr_pgrp;
pid_t pr_sid;
struct timeval pr_utime;
struct timeval pr_stime;
struct timeval pr_cutime;
struct timeval pr_cstime;
elf_gregset_t pr_reg; <-- this
int pr_fpvalid;
}
What kinds of architecture does have so many registers? It's just my
interest. Or possibly other kinds of notes is written here?
Thanks.
HATAYAMA, Daisuke
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] kdump: force page alignment for per-CPU crash notes.
2012-03-01 1:51 ` HATAYAMA Daisuke
@ 2012-03-01 1:56 ` Eugene Surovegin
2012-03-01 2:50 ` Simon Horman
1 sibling, 0 replies; 11+ messages in thread
From: Eugene Surovegin @ 2012-03-01 1:56 UTC (permalink / raw)
To: HATAYAMA Daisuke; +Cc: horms, kexec, linux-kernel, vgoyal, ebiederm
On Wed, Feb 29, 2012 at 5:51 PM, HATAYAMA Daisuke
<d.hatayama@jp.fujitsu.com> wrote:
> From: Eugene Surovegin <surovegin@google.com>
> Subject: Re: [PATCH] kdump: force page alignment for per-CPU crash notes.
> Date: Wed, 29 Feb 2012 17:39:55 -0800
>
>> On Wed, Feb 29, 2012 at 5:32 PM, Simon Horman <horms@verge.net.au> wrote:
>>> On Wed, Feb 29, 2012 at 05:23:10PM -0800, Eugene Surovegin wrote:
>>>> On Wed, Feb 29, 2012 at 5:18 PM, Simon Horman <horms@verge.net.au> wrote:
>>>>
>>>> > On Wed, Feb 29, 2012 at 09:21:23AM -0800, Eugene Surovegin wrote:
>>>> > > Per-CPU allocations are not guaranteed to be physically contiguous.
>>>> > > However, kdump kernel and user-space code assumes that per-CPU
>>>> > > memory, used for saving CPU registers on crash, is.
>>>> > > This can cause corrupted /proc/vmcore in some cases - the main
>>>> > > symptom being huge ELF note section.
>>>> > >
>>>> > > Force page alignment for note_buf_t to ensure that this assumption holds.
>>>> >
>>>> > Ouch. I'm surprised there is an allocation on crash, perhaps
>>>> > it could at least be done earlier? And am I right in thinking
>>>> > that this change increases the likely hood that the allocation
>>>> > could fail?
>>>> >
>>>>
>>>> I'm not following. This allocation is done on start-up, not on crash.
>>>> If you cannot allocate this much memory on system boot, I'm not sure what
>>>> else you can do on this system....
>>>
>>> Sorry, my eyes deceived me. You are correct and I agree.
>>>
>>> Is it the case that note_buf_t is never larger than PAGE_SIZE?
>>> If so I your patch looks good to me.
>>
>> Currently, maximum note size is hardcoded in kexec-tools to 1024
>> (MAX_NOTE_BYTES).
>> Usually it's way less. IIRC on x86_64 it's 336 bytes.
>>
>
> This is elf_prstatus and I guess it's mostly equal to registers.
>
> crash> p sizeof(struct elf_prstatus)
> $3 = 336
> crash> ptype struct elf_prstatus
> type = struct elf_prstatus {
> struct elf_siginfo pr_info;
> short int pr_cursig;
> long unsigned int pr_sigpend;
> long unsigned int pr_sighold;
> pid_t pr_pid;
> pid_t pr_ppid;
> pid_t pr_pgrp;
> pid_t pr_sid;
> struct timeval pr_utime;
> struct timeval pr_stime;
> struct timeval pr_cutime;
> struct timeval pr_cstime;
> elf_gregset_t pr_reg; <-- this
> int pr_fpvalid;
> }
>
> What kinds of architecture does have so many registers? It's just my
> interest. Or possibly other kinds of notes is written here?
I'm not sure about other archs, but we don't write there anything
except for 'elf_prstatus' and sentinel "final" note.
--
Eugene
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] kdump: force page alignment for per-CPU crash notes.
2012-03-01 1:51 ` HATAYAMA Daisuke
2012-03-01 1:56 ` Eugene Surovegin
@ 2012-03-01 2:50 ` Simon Horman
1 sibling, 0 replies; 11+ messages in thread
From: Simon Horman @ 2012-03-01 2:50 UTC (permalink / raw)
To: HATAYAMA Daisuke; +Cc: surovegin, kexec, linux-kernel, vgoyal, ebiederm
On Thu, Mar 01, 2012 at 10:51:26AM +0900, HATAYAMA Daisuke wrote:
> From: Eugene Surovegin <surovegin@google.com>
> Subject: Re: [PATCH] kdump: force page alignment for per-CPU crash notes.
> Date: Wed, 29 Feb 2012 17:39:55 -0800
>
> > On Wed, Feb 29, 2012 at 5:32 PM, Simon Horman <horms@verge.net.au> wrote:
> >> On Wed, Feb 29, 2012 at 05:23:10PM -0800, Eugene Surovegin wrote:
> >>> On Wed, Feb 29, 2012 at 5:18 PM, Simon Horman <horms@verge.net.au> wrote:
> >>>
> >>> > On Wed, Feb 29, 2012 at 09:21:23AM -0800, Eugene Surovegin wrote:
> >>> > > Per-CPU allocations are not guaranteed to be physically contiguous.
> >>> > > However, kdump kernel and user-space code assumes that per-CPU
> >>> > > memory, used for saving CPU registers on crash, is.
> >>> > > This can cause corrupted /proc/vmcore in some cases - the main
> >>> > > symptom being huge ELF note section.
> >>> > >
> >>> > > Force page alignment for note_buf_t to ensure that this assumption holds.
> >>> >
> >>> > Ouch. I'm surprised there is an allocation on crash, perhaps
> >>> > it could at least be done earlier? And am I right in thinking
> >>> > that this change increases the likely hood that the allocation
> >>> > could fail?
> >>> >
> >>>
> >>> I'm not following. This allocation is done on start-up, not on crash.
> >>> If you cannot allocate this much memory on system boot, I'm not sure what
> >>> else you can do on this system....
> >>
> >> Sorry, my eyes deceived me. You are correct and I agree.
> >>
> >> Is it the case that note_buf_t is never larger than PAGE_SIZE?
> >> If so I your patch looks good to me.
> >
> > Currently, maximum note size is hardcoded in kexec-tools to 1024
> > (MAX_NOTE_BYTES).
> > Usually it's way less. IIRC on x86_64 it's 336 bytes.
I presume that MAX_NOTE_BYTES was chosen to be large to
minimise the chance that it would ever need to be changed.
> This is elf_prstatus and I guess it's mostly equal to registers.
>
> crash> p sizeof(struct elf_prstatus)
> $3 = 336
> crash> ptype struct elf_prstatus
> type = struct elf_prstatus {
> struct elf_siginfo pr_info;
> short int pr_cursig;
> long unsigned int pr_sigpend;
> long unsigned int pr_sighold;
> pid_t pr_pid;
> pid_t pr_ppid;
> pid_t pr_pgrp;
> pid_t pr_sid;
> struct timeval pr_utime;
> struct timeval pr_stime;
> struct timeval pr_cutime;
> struct timeval pr_cstime;
> elf_gregset_t pr_reg; <-- this
> int pr_fpvalid;
> }
>
> What kinds of architecture does have so many registers? It's just my
> interest. Or possibly other kinds of notes is written here?
The winner seems to be ia64 with 128.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] kdump: force page alignment for per-CPU crash notes.
2012-03-01 1:39 ` Eugene Surovegin
2012-03-01 1:51 ` HATAYAMA Daisuke
@ 2012-03-01 2:53 ` Simon Horman
1 sibling, 0 replies; 11+ messages in thread
From: Simon Horman @ 2012-03-01 2:53 UTC (permalink / raw)
To: Eugene Surovegin; +Cc: linux-kernel, kexec-list, Eric Biederman, Vivek Goyal
On Wed, Feb 29, 2012 at 05:39:55PM -0800, Eugene Surovegin wrote:
> On Wed, Feb 29, 2012 at 5:32 PM, Simon Horman <horms@verge.net.au> wrote:
> > On Wed, Feb 29, 2012 at 05:23:10PM -0800, Eugene Surovegin wrote:
> >> On Wed, Feb 29, 2012 at 5:18 PM, Simon Horman <horms@verge.net.au> wrote:
> >>
> >> > On Wed, Feb 29, 2012 at 09:21:23AM -0800, Eugene Surovegin wrote:
> >> > > Per-CPU allocations are not guaranteed to be physically contiguous.
> >> > > However, kdump kernel and user-space code assumes that per-CPU
> >> > > memory, used for saving CPU registers on crash, is.
> >> > > This can cause corrupted /proc/vmcore in some cases - the main
> >> > > symptom being huge ELF note section.
> >> > >
> >> > > Force page alignment for note_buf_t to ensure that this assumption holds.
> >> >
> >> > Ouch. I'm surprised there is an allocation on crash, perhaps
> >> > it could at least be done earlier? And am I right in thinking
> >> > that this change increases the likely hood that the allocation
> >> > could fail?
> >> >
> >>
> >> I'm not following. This allocation is done on start-up, not on crash.
> >> If you cannot allocate this much memory on system boot, I'm not sure what
> >> else you can do on this system....
> >
> > Sorry, my eyes deceived me. You are correct and I agree.
> >
> > Is it the case that note_buf_t is never larger than PAGE_SIZE?
> > If so I your patch looks good to me.
>
> Currently, maximum note size is hardcoded in kexec-tools to 1024
> (MAX_NOTE_BYTES).
> Usually it's way less. IIRC on x86_64 it's 336 bytes.
Ok, understood.
Reviewed-by: Simon Horman <horms@verge.net.au>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] kdump: force page alignment for per-CPU crash notes.
2012-02-29 17:21 [PATCH] kdump: force page alignment for per-CPU crash notes Eugene Surovegin
2012-03-01 1:18 ` Simon Horman
@ 2012-03-01 15:09 ` Vivek Goyal
2012-03-01 17:04 ` Eugene Surovegin
1 sibling, 1 reply; 11+ messages in thread
From: Vivek Goyal @ 2012-03-01 15:09 UTC (permalink / raw)
To: Eugene Surovegin; +Cc: linux-kernel, Eric Biederman, kexec-list
On Wed, Feb 29, 2012 at 09:21:23AM -0800, Eugene Surovegin wrote:
> Per-CPU allocations are not guaranteed to be physically contiguous.
> However, kdump kernel and user-space code assumes that per-CPU
> memory, used for saving CPU registers on crash, is.
> This can cause corrupted /proc/vmcore in some cases - the main
> symptom being huge ELF note section.
>
> Force page alignment for note_buf_t to ensure that this assumption holds.
>
Hi Eugene,
Where do we make assumption that crash_notes address is page aligned?
In first kernel we save notes using virtual addresses as returned by
per_cpu_ptr(). So first kernel should be fine (crash_save_cpu()).
In second kernel, fs/proc/vmcore.c code does not seem to be assuming
that notes are stored as page aligned address.
merge_note_headers_elf64()
read_from_oldmem()
offset = (unsigned long)(*ppos % PAGE_SIZE);
pfn = (unsigned long)(*ppos / PAGE_SIZE);
We see to calculate pfn and offset inside the page. Map the pfn and then
access offset.
So as long as whole of the note section is stored on a single physical
page it is not a problem.
Are you referring to the fact that note could be stored on two different
physical pages and then kexec-tools does not know about the physical
address of second page, hence second kernel does not know about it
and we never read that data. That sounds like a problem.
So it make sense to force the page size alignment and if notes ever
grow beyond 1 page, we need to come up with more complex ways of
communicating more than 1 physical address to second kernel.
Thanks
Vivek
> Signed-off-by: Eugene Surovegin <surovegin@google.com>
> CC: Eric Biederman <ebiederm@xmission.com>
> CC: Vivek Goyal <vgoyal@redhat.com>
> CC: kexec-list <kexec@lists.infradead.org>
> ---
> kernel/kexec.c | 9 +++++++--
> 1 files changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/kexec.c b/kernel/kexec.c
> index 7b08867..e641b5c 100644
> --- a/kernel/kexec.c
> +++ b/kernel/kexec.c
> @@ -1232,8 +1232,13 @@ void crash_save_cpu(struct pt_regs *regs, int cpu)
>
> static int __init crash_notes_memory_init(void)
> {
> - /* Allocate memory for saving cpu registers. */
> - crash_notes = alloc_percpu(note_buf_t);
> + /* Allocate memory for saving cpu registers.
> + * Force page alignment to avoid crossing physical page boundary -
> + * kexec-tools and kernel /proc/vmcore handler assume these per-CPU
> + * chunks are physically contiguous.
> + */
> + crash_notes = (note_buf_t __percpu *)__alloc_percpu(sizeof(note_buf_t),
> + PAGE_SIZE);
> if (!crash_notes) {
> printk("Kexec: Memory allocation for saving cpu register"
> " states failed\n");
> --
> 1.7.9.1
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] kdump: force page alignment for per-CPU crash notes.
2012-03-01 15:09 ` Vivek Goyal
@ 2012-03-01 17:04 ` Eugene Surovegin
2012-03-01 18:31 ` Vivek Goyal
0 siblings, 1 reply; 11+ messages in thread
From: Eugene Surovegin @ 2012-03-01 17:04 UTC (permalink / raw)
To: Vivek Goyal; +Cc: linux-kernel, Eric Biederman, kexec-list
On Thu, Mar 1, 2012 at 7:09 AM, Vivek Goyal <vgoyal@redhat.com> wrote:
> On Wed, Feb 29, 2012 at 09:21:23AM -0800, Eugene Surovegin wrote:
>> Per-CPU allocations are not guaranteed to be physically contiguous.
>> However, kdump kernel and user-space code assumes that per-CPU
>> memory, used for saving CPU registers on crash, is.
>> This can cause corrupted /proc/vmcore in some cases - the main
>> symptom being huge ELF note section.
>>
>> Force page alignment for note_buf_t to ensure that this assumption holds.
>>
>
> Hi Eugene,
>
> Where do we make assumption that crash_notes address is page aligned?
I never said that. I said that assumption was that note chunks were
physically contiguous.
[snip]
> Are you referring to the fact that note could be stored on two different
> physical pages and then kexec-tools does not know about the physical
> address of second page, hence second kernel does not know about it
> and we never read that data. That sounds like a problem.
Yes. This is exactly what happens in some cases.
> So it make sense to force the page size alignment and if notes ever
> grow beyond 1 page, we need to come up with more complex ways of
> communicating more than 1 physical address to second kernel.
Yes. Thankfully notes so far are smaller than a page.
And if they do grow larger, than kexec-tools have to be modified
anyways - currently maximum note size (1K) is hardcoded there.
If this happens, I think instead of coming up with a way to
communicate phys to virt mapping, it'd be easier just to change th way
notes are allocated in the kernel so they are phys contiguous.
--
Eugene
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] kdump: force page alignment for per-CPU crash notes.
2012-03-01 17:04 ` Eugene Surovegin
@ 2012-03-01 18:31 ` Vivek Goyal
0 siblings, 0 replies; 11+ messages in thread
From: Vivek Goyal @ 2012-03-01 18:31 UTC (permalink / raw)
To: Eugene Surovegin; +Cc: linux-kernel, Eric Biederman, kexec-list
On Thu, Mar 01, 2012 at 09:04:32AM -0800, Eugene Surovegin wrote:
> On Thu, Mar 1, 2012 at 7:09 AM, Vivek Goyal <vgoyal@redhat.com> wrote:
> > On Wed, Feb 29, 2012 at 09:21:23AM -0800, Eugene Surovegin wrote:
> >> Per-CPU allocations are not guaranteed to be physically contiguous.
> >> However, kdump kernel and user-space code assumes that per-CPU
> >> memory, used for saving CPU registers on crash, is.
> >> This can cause corrupted /proc/vmcore in some cases - the main
> >> symptom being huge ELF note section.
> >>
> >> Force page alignment for note_buf_t to ensure that this assumption holds.
> >>
> >
> > Hi Eugene,
> >
> > Where do we make assumption that crash_notes address is page aligned?
>
> I never said that. I said that assumption was that note chunks were
> physically contiguous.
>
> [snip]
>
> > Are you referring to the fact that note could be stored on two different
> > physical pages and then kexec-tools does not know about the physical
> > address of second page, hence second kernel does not know about it
> > and we never read that data. That sounds like a problem.
>
> Yes. This is exactly what happens in some cases.
Ok. Thanks for clarifying. Patch looks good to me.
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Vivek
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2012-03-01 18:56 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-02-29 17:21 [PATCH] kdump: force page alignment for per-CPU crash notes Eugene Surovegin
2012-03-01 1:18 ` Simon Horman
[not found] ` <CANrfoUXd4L2bjBO+L8ZBCkc1dfnZahn39sB25zsrUMqGA5zY5A@mail.gmail.com>
2012-03-01 1:32 ` Simon Horman
2012-03-01 1:39 ` Eugene Surovegin
2012-03-01 1:51 ` HATAYAMA Daisuke
2012-03-01 1:56 ` Eugene Surovegin
2012-03-01 2:50 ` Simon Horman
2012-03-01 2:53 ` Simon Horman
2012-03-01 15:09 ` Vivek Goyal
2012-03-01 17:04 ` Eugene Surovegin
2012-03-01 18:31 ` Vivek Goyal
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).