linux-kernel.vger.kernel.org archive mirror
* [RFD] kdump, kaslr: how to fix the failure of reservation of crashkernel low memory due to physical kaslr
@ 2019-12-25  4:26 d.hatayama
  2019-12-25  6:59 ` 'bhe@redhat.com'
  0 siblings, 1 reply; 4+ messages in thread
From: d.hatayama @ 2019-12-25  4:26 UTC (permalink / raw)
  To: 'dyoung@redhat.com', 'bhe@redhat.com',
	'vgoyal@redhat.com', 'ebiederm@xmission.com',
	'mingo@kernel.org'
  Cc: 'linux-kernel@vger.kernel.org',
	'kexec@lists.infradead.org'

Currently, reservation of crashkernel low memory sometimes fails
because physical kaslr fragments low memory, with the following
message:

    Cannot reserve 256MB crashkernel low memory, please try smaller size.

Kdump needs low memory, i.e. memory below 4GB, e.g. for swiotlb.
By default it reserves 256MB of low memory. OTOH, physical kaslr
loads the kernel image at a random physical address for
security. Physical kaslr sometimes chooses an address in low memory and
fragments the free space there; as a result, reservation of crashkernel
low memory can fail.

This failure is rare on systems with large memory. For example,
on our system with 128GB, the issue occurs about once in hundreds of
reboots. Although it can be avoided in practice simply by rebooting
the system, it does occur, and once it does, it's difficult for ordinary
users to understand why the reservation failed. I'd like to fix this
behavior.

I've come up with two ideas so far but don't know which is best. Please
discuss how to fix the issue in a better way.

1) Add a kernel parameter to make physical kaslr avoid a specified
   memory area

  This is the simplest idea and the first one I came up with: something
   like kaslr_mem_avoid=4GB-0, with syntax similar to memmap=, which
   asks kaslr not to load the kernel image into the region [0,
   4GB).

  It looks to me that this can be implemented easily by taking
   advantage of the existing mem_avoid mechanism in
   arch/x86/boot/compressed/kaslr.c.

  This mechanism doesn't weaken the security provided by physical kaslr
   if system memory is large enough.

  The drawback is that users need to configure it. An automatic way
   would be better if possible.

2) Add special handling for crashkernel= low in physical kaslr

  The second idea I came up with is to add special handling for
   crashkernel= low in physical kaslr, i.e. physical kaslr recognizes
   crashkernel= in the kernel parameters and keeps enough memory for
   crashkernel.

  To guarantee that the memory area kept by the special handling in
   physical kaslr is used for crashkernel, it is necessary to mark the
   area so that the crashkernel code, which runs after the kernel
   starts, can find it. To implement this, I imagine introducing a new
   memory type, something like E820_CRASHKERNEL_LOW.

  My concern with this idea is whether it's worth implementing such
   special handling in physical kaslr, simply because I don't find such
   code in physical kaslr now.

3) Any other better ideas?
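
A minimal user-space sketch of idea 1, assuming the proposed
kaslr_mem_avoid= range is simply added to a list of regions that a
candidate load address must not overlap. The names struct region,
regions_overlap() and slot_allowed() are hypothetical and only model the
idea; this is not the code in arch/x86/boot/compressed/kaslr.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a physical memory region: [start, start + size). */
struct region {
	uint64_t start;
	uint64_t size;
};

/* True if the two half-open regions share at least one byte. */
bool regions_overlap(struct region a, struct region b)
{
	if (a.start + a.size <= b.start)
		return false;	/* a ends before b begins */
	if (b.start + b.size <= a.start)
		return false;	/* b ends before a begins */
	return true;
}

/*
 * True if a kernel image placed at `image` touches none of the `n`
 * regions in `avoid`. A range given via the proposed parameter would
 * be one more entry in `avoid`.
 */
bool slot_allowed(struct region image, const struct region *avoid, size_t n)
{
	assert(avoid != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (regions_overlap(image, avoid[i]))
			return false;
	}
	return true;
}
```

With an avoid list containing only [0, 4GB), an image of an assumed
50MB at 1GB is rejected, while the same image at 8GB is accepted.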

Thanks.
HATAYAMA, Daisuke



* Re: [RFD] kdump, kaslr: how to fix the failure of reservation of crashkernel low memory due to physical kaslr
  2019-12-25  4:26 [RFD] kdump, kaslr: how to fix the failure of reservation of crashkernel low memory due to physical kaslr d.hatayama
@ 2019-12-25  6:59 ` 'bhe@redhat.com'
  2019-12-26  9:22   ` d.hatayama
  0 siblings, 1 reply; 4+ messages in thread
From: 'bhe@redhat.com' @ 2019-12-25  6:59 UTC (permalink / raw)
  To: d.hatayama
  Cc: 'dyoung@redhat.com', 'vgoyal@redhat.com',
	'ebiederm@xmission.com', 'mingo@kernel.org',
	'linux-kernel@vger.kernel.org',
	'kexec@lists.infradead.org'

Hi HATAYAMA,

On 12/25/19 at 04:26am, d.hatayama@fujitsu.com wrote:
> Currently, reservation of crashkernel low memory sometimes fails due
> to a sparse memory caused by physical kaslr with the following
> message:
> 
>     Cannot reserve 256MB crashkernel low memory, please try smaller size.

I don't understand; I may not be getting your point. KASLR randomizes
the position of the kernel image. However, the kernel image usually takes
up 50M of memory. Within the low 4G of memory, how come it can't reserve
256M of crashkernel low memory? Do you have the boot log of the failed case?

> ...
> 3) Any other better ideas?

Someone once told me that some systems may not have memory below 4G since
they have a hardware iommu. In real life, I have never seen such a system,
and most systems can satisfy a 256M crashkernel low reservation.
Only if you reserve more than 1G below 4G could it fail, because of the
various complicated memory reservations there.

Thanks
Baoquan




* RE: [RFD] kdump, kaslr: how to fix the failure of reservation of crashkernel low memory due to physical kaslr
  2019-12-25  6:59 ` 'bhe@redhat.com'
@ 2019-12-26  9:22   ` d.hatayama
  2019-12-27  2:41     ` 'bhe@redhat.com'
  0 siblings, 1 reply; 4+ messages in thread
From: d.hatayama @ 2019-12-26  9:22 UTC (permalink / raw)
  To: 'bhe@redhat.com'
  Cc: 'dyoung@redhat.com', 'vgoyal@redhat.com',
	'ebiederm@xmission.com', 'mingo@kernel.org',
	'linux-kernel@vger.kernel.org',
	'kexec@lists.infradead.org'


> Hi HATAYAMA,
> 
> On 12/25/19 at 04:26am, d.hatayama@fujitsu.com wrote:
> > Currently, reservation of crashkernel low memory sometimes fails due
> > to a sparse memory caused by physical kaslr with the following
> > message:
> >
> >     Cannot reserve 256MB crashkernel low memory, please try smaller size.
> 
> I don't understand, may not get your point. KASLR will randomize the
> position of kernel image. However, kernel image usually takes up 50M
> memory. Under low 4G memory, how come it can't reserve 256M crashkernel
> low memory. Do you have the boot log of the failed case?

Thanks for your comments and sorry for the insufficient explanation.

Free memory below 4GB on our system is considerably limited. At the time the
kernel attempts to reserve low memory for the crash kernel, the largest
contiguous free physical region is smaller than 512MB. Hence, if physical
kaslr places the kernel image in the middle of that chunk, every remaining
chunk is smaller than 256M, and the reservation fails.
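
The arithmetic can be sketched as follows. This is a simplified model
that assumes a single free chunk and ignores alignment; the function
names are hypothetical, the 500MB chunk used in the note below is an
assumed value under the stated 512MB bound, and the 50M image size is
the figure from the quoted message.

```c
#include <assert.h>
#include <stdint.h>

#define MiB (1024ULL * 1024ULL)

/*
 * Given one free chunk of chunk_size bytes and a kernel image of
 * image_size bytes placed image_off bytes into it, return the size of
 * the larger of the two free pieces left on either side of the image.
 */
uint64_t largest_remaining(uint64_t chunk_size, uint64_t image_off,
			   uint64_t image_size)
{
	/* The image must lie entirely inside the chunk. */
	assert(image_size <= chunk_size);
	assert(image_off <= chunk_size - image_size);

	uint64_t before = image_off;
	uint64_t after = chunk_size - image_off - image_size;

	return before > after ? before : after;
}

/* Nonzero if a contiguous low reservation of `want` bytes still fits. */
int crash_low_fits(uint64_t chunk_size, uint64_t image_off,
		   uint64_t image_size, uint64_t want)
{
	return largest_remaining(chunk_size, image_off, image_size) >= want;
}
```

For a 500MB chunk with a 50MB image placed 225MB into it, 225MB remains
on each side, so a 256MB reservation does not fit. With the image at
offset 0, 450MB remains and the reservation fits.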

> > ...
> > 3) Any other better ideas?
> 
> Someone ever told that some systems may not have low 4G memory since
> they own hardware iommu. In real life, I never see such kind of system,
> and most of them can give 256M crashkernel memory a satisfactory result.
> Unless you reserve more than 1G under low 4G, it could fail because of
> kinds of complicated memory reservations there.

I'm surprised to hear of such a system without memory below 4GB, and I
wonder how it can work, given the restricted memory access range
in early execution modes on x86 such as real mode.

Thanks.
HATAYAMA, Daisuke



* Re: [RFD] kdump, kaslr: how to fix the failure of reservation of crashkernel low memory due to physical kaslr
  2019-12-26  9:22   ` d.hatayama
@ 2019-12-27  2:41     ` 'bhe@redhat.com'
  0 siblings, 0 replies; 4+ messages in thread
From: 'bhe@redhat.com' @ 2019-12-27  2:41 UTC (permalink / raw)
  To: d.hatayama
  Cc: 'kexec@lists.infradead.org',
	'linux-kernel@vger.kernel.org',
	'ebiederm@xmission.com', 'dyoung@redhat.com',
	'mingo@kernel.org', 'vgoyal@redhat.com'

On 12/26/19 at 09:22am, d.hatayama@fujitsu.com wrote:
> 
> > -----Original Message-----
> > 
> > Hi HATAYAMA,
> > 
> > On 12/25/19 at 04:26am, d.hatayama@fujitsu.com wrote:
> > > Currently, reservation of crashkernel low memory sometimes fails due
> > > to a sparse memory caused by physical kaslr with the following
> > > message:
> > >
> > >     Cannot reserve 256MB crashkernel low memory, please try smaller size.
> > 
> > I don't understand, may not get your point. KASLR will randomize the
> > position of kernel image. However, kernel image usually takes up 50M
> > memory. Under low 4G memory, how come it can't reserve 256M crashkernel
> > low memory. Do you have the boot log of the failed case?
> 
> Thanks for your comments and sorry for the insufficient explanation.
> 
> Low 4GB memory in our system is considerably limited. The size of the largest
> contiguous free physical pages at the timing when kernel attempts at
> reserving low memory for crash kernel is less than 512MB. Hence, if physical
> kaslr inserts kernel image into the center of the chunk, every remaining
> chunks have less than 256M size. Then, the failure occurs.

OK, this is truly an extreme case; thanks for sharing.

Then I have several questions about it:

1) Can we use crashkernel=X,high together with crashkernel=Y,low to fix
this issue? I believe the system you describe must be a high-end server,
and should have a hardware iommu deployed. It doesn't actually need 256M
of low memory; maybe 128M, or even 64M, is enough?

I ask because 256M is only the default setting, and the default value
only covers the general case.

2) What if a system is even more extreme, e.g. the largest contiguous
free physical region is smaller than 256M, or even smaller than 128M?

Even if the kernel image is restricted to above 4G, the reservation would
still fail. Whose fault is it then, and how can we fix it in that case?

3) If we really do add a restriction, maybe add kaslr_high to make KASLR
place the kernel image only above 4G?
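
Question 2 can be made concrete with a small sketch: whether a low
reservation can succeed at all depends on the largest free range below
4G, independent of where the kernel image lands. The struct range type,
LOW_LIMIT and largest_low_chunk() are hypothetical names used for
illustration only, not kernel interfaces.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LOW_LIMIT (4ULL << 30)	/* the 4G boundary */

/* Hypothetical free physical range: [start, end). */
struct range {
	uint64_t start;
	uint64_t end;
};

/*
 * Return the size of the largest free range below LOW_LIMIT.
 * Ranges crossing the boundary are clipped to it; ranges entirely
 * above it are ignored.
 */
uint64_t largest_low_chunk(const struct range *free_ranges, size_t n)
{
	uint64_t best = 0;

	assert(free_ranges != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		uint64_t start = free_ranges[i].start;
		uint64_t end = free_ranges[i].end;

		if (end > LOW_LIMIT)
			end = LOW_LIMIT;	/* clip to low memory */
		if (start >= end)
			continue;		/* nothing below 4G */
		if (end - start > best)
			best = end - start;
	}
	return best;
}
```

If the result is already below the requested size, the reservation
fails even when the kernel image is kept above 4G, which is the
situation the question describes.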

 
> > 
...
 
> > Someone ever told that some systems may not have low 4G memory since
> > they own hardware iommu. In real life, I never see such kind of system,
> > and most of them can give 256M crashkernel memory a satisfactory result.
> > Unless you reserve more than 1G under low 4G, it could fail because of
> > kinds of complicated memory reservations there.
> 
> I'm surprised to hear such system without low 4GB memory and I wonder
> how such system works well without restriction of memory access range
> in early runtime mode on x86 such as real mode.

I think they meant almost no low memory, and it was a prototype machine
for experiments. I can't remember the details.

Thanks
Baoquan

