* [PATCH] x86/mm: Fix boot with some memory above MAXMEM
@ 2020-05-11 19:17 Kirill A. Shutemov
2020-05-25 4:49 ` Kirill A. Shutemov
0 siblings, 1 reply; 8+ messages in thread
From: Kirill A. Shutemov @ 2020-05-11 19:17 UTC (permalink / raw)
To: Dave Hansen, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, H. Peter Anvin
Cc: Dan Williams, Tony Luck, x86, linux-mm, linux-kernel,
Kirill A. Shutemov, Dave Hansen, stable
A 5-level paging capable machine can have memory above the 46-bit boundary
in the physical address space. This memory is only addressable in 5-level
paging mode: we don't have enough virtual address space to create a direct
mapping for such memory in 4-level paging mode.
Currently, we fail boot completely with a NULL pointer dereference in
subsection_map_init().
Skip creating a memblock for such memory instead and notify the user that
some memory is not addressable.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Cc: stable@vger.kernel.org # v4.14
---
Tested with a hacked QEMU: https://gist.github.com/kiryl/d45eb54110944ff95e544972d8bdac1d
---
arch/x86/kernel/e820.c | 19 +++++++++++++++++--
1 file changed, 17 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index c5399e80c59c..d320d37d0f95 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -1280,8 +1280,8 @@ void __init e820__memory_setup(void)
 
 void __init e820__memblock_setup(void)
 {
+	u64 size, end, not_addressable = 0;
 	int i;
-	u64 end;
 
 	/*
 	 * The bootstrap memblock region count maximum is 128 entries
@@ -1307,7 +1307,22 @@ void __init e820__memblock_setup(void)
 		if (entry->type != E820_TYPE_RAM && entry->type != E820_TYPE_RESERVED_KERN)
 			continue;
 
-		memblock_add(entry->addr, entry->size);
+		if (entry->addr >= MAXMEM) {
+			not_addressable += entry->size;
+			continue;
+		}
+
+		end = min_t(u64, end, MAXMEM - 1);
+		size = end - entry->addr;
+		not_addressable += entry->size - size;
+		memblock_add(entry->addr, size);
+	}
+
+	if (not_addressable) {
+		pr_err("%lldGB of physical memory is not addressable in the paging mode\n",
+		       not_addressable >> 30);
+		if (!pgtable_l5_enabled())
+			pr_err("Consider enabling 5-level paging\n");
 	}
 
 	/* Throw away partial pages: */
--
2.26.2
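
For anyone who wants to poke at the clamping logic outside the kernel, here is a rough user-space sketch of what the hunk above does per e820 entry. It is not the kernel code: SKETCH_MAXMEM and sketch_clamp_region() are made-up names, and the 2^46 limit is only an assumption standing in for MAXMEM under 4-level paging. It keeps the "MAXMEM - 1" clamp from the patch, so a region crossing the limit comes out one byte short of the boundary.

```c
#include <stdint.h>

/* Hypothetical stand-in for MAXMEM with 4-level paging (assumed 2^46 bytes). */
#define SKETCH_MAXMEM (UINT64_C(1) << 46)

/*
 * Return how many bytes of [addr, addr + size) would be passed on to
 * memblock_add(), and add the rest to *not_addressable.
 * Hypothetical helper mirroring the per-entry logic of the patch.
 */
static uint64_t sketch_clamp_region(uint64_t addr, uint64_t size,
				    uint64_t *not_addressable)
{
	uint64_t end, usable;

	/* The whole region sits above the limit: skip it entirely. */
	if (addr >= SKETCH_MAXMEM) {
		*not_addressable += size;
		return 0;
	}

	/* Clamp the end of the region the same way the patch does. */
	end = addr + size;
	if (end > SKETCH_MAXMEM - 1)
		end = SKETCH_MAXMEM - 1;

	usable = end - addr;
	*not_addressable += size - usable;
	return usable;
}
```

A region entirely below the limit is returned unchanged, a region entirely above it contributes only to the not-addressable counter, and a region straddling the limit is split between the two.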
* Re: [PATCH] x86/mm: Fix boot with some memory above MAXMEM
2020-05-11 19:17 [PATCH] x86/mm: Fix boot with some memory above MAXMEM Kirill A. Shutemov
@ 2020-05-25 4:49 ` Kirill A. Shutemov
2020-05-25 14:59 ` Mike Rapoport
0 siblings, 1 reply; 8+ messages in thread
From: Kirill A. Shutemov @ 2020-05-25 4:49 UTC (permalink / raw)
To: Kirill A. Shutemov
Cc: Dave Hansen, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, H. Peter Anvin, Dan Williams,
Tony Luck, x86, linux-mm, linux-kernel, Dave Hansen, stable
On Mon, May 11, 2020 at 10:17:21PM +0300, Kirill A. Shutemov wrote:
> A 5-level paging capable machine can have memory above 46-bit in the
> physical address space. This memory is only addressable in the 5-level
> paging mode: we don't have enough virtual address space to create direct
> mapping for such memory in the 4-level paging mode.
>
> Currently, we fail boot completely: NULL pointer dereference in
> subsection_map_init().
>
> Skip creating a memblock for such memory instead and notify user that
> some memory is not addressable.
>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Reviewed-by: Dave Hansen <dave.hansen@intel.com>
> Cc: stable@vger.kernel.org # v4.14
> ---
Gentle ping.
It's not urgent, but it's a bug fix. Please consider applying.
> Tested with a hacked QEMU: https://gist.github.com/kiryl/d45eb54110944ff95e544972d8bdac1d
>
> ---
> arch/x86/kernel/e820.c | 19 +++++++++++++++++--
> 1 file changed, 17 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
> index c5399e80c59c..d320d37d0f95 100644
> --- a/arch/x86/kernel/e820.c
> +++ b/arch/x86/kernel/e820.c
> @@ -1280,8 +1280,8 @@ void __init e820__memory_setup(void)
>
> void __init e820__memblock_setup(void)
> {
> + u64 size, end, not_addressable = 0;
> int i;
> - u64 end;
>
> /*
> * The bootstrap memblock region count maximum is 128 entries
> @@ -1307,7 +1307,22 @@ void __init e820__memblock_setup(void)
> if (entry->type != E820_TYPE_RAM && entry->type != E820_TYPE_RESERVED_KERN)
> continue;
>
> - memblock_add(entry->addr, entry->size);
> + if (entry->addr >= MAXMEM) {
> + not_addressable += entry->size;
> + continue;
> + }
> +
> + end = min_t(u64, end, MAXMEM - 1);
> + size = end - entry->addr;
> + not_addressable += entry->size - size;
> + memblock_add(entry->addr, size);
> + }
> +
> + if (not_addressable) {
> + pr_err("%lldGB of physical memory is not addressable in the paging mode\n",
> + not_addressable >> 30);
> + if (!pgtable_l5_enabled())
> + pr_err("Consider enabling 5-level paging\n");
> }
>
> /* Throw away partial pages: */
> --
> 2.26.2
>
>
--
Kirill A. Shutemov
* Re: [PATCH] x86/mm: Fix boot with some memory above MAXMEM
2020-05-25 4:49 ` Kirill A. Shutemov
@ 2020-05-25 14:59 ` Mike Rapoport
2020-05-25 15:08 ` Kirill A. Shutemov
0 siblings, 1 reply; 8+ messages in thread
From: Mike Rapoport @ 2020-05-25 14:59 UTC (permalink / raw)
To: Kirill A. Shutemov
Cc: Kirill A. Shutemov, Dave Hansen, Andy Lutomirski, Peter Zijlstra,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin,
Dan Williams, Tony Luck, x86, linux-mm, linux-kernel,
Dave Hansen, stable
On Mon, May 25, 2020 at 07:49:02AM +0300, Kirill A. Shutemov wrote:
> On Mon, May 11, 2020 at 10:17:21PM +0300, Kirill A. Shutemov wrote:
> > A 5-level paging capable machine can have memory above 46-bit in the
> > physical address space. This memory is only addressable in the 5-level
> > paging mode: we don't have enough virtual address space to create direct
> > mapping for such memory in the 4-level paging mode.
> >
> > Currently, we fail boot completely: NULL pointer dereference in
> > subsection_map_init().
> >
> > Skip creating a memblock for such memory instead and notify user that
> > some memory is not addressable.
> >
> > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > Reviewed-by: Dave Hansen <dave.hansen@intel.com>
> > Cc: stable@vger.kernel.org # v4.14
> > ---
>
> Gentle ping.
>
> It's not urgent, but it's a bug fix. Please consider applying.
>
> > Tested with a hacked QEMU: https://gist.github.com/kiryl/d45eb54110944ff95e544972d8bdac1d
> >
> > ---
> > arch/x86/kernel/e820.c | 19 +++++++++++++++++--
> > 1 file changed, 17 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
> > index c5399e80c59c..d320d37d0f95 100644
> > --- a/arch/x86/kernel/e820.c
> > +++ b/arch/x86/kernel/e820.c
> > @@ -1280,8 +1280,8 @@ void __init e820__memory_setup(void)
> >
> > void __init e820__memblock_setup(void)
> > {
> > + u64 size, end, not_addressable = 0;
> > int i;
> > - u64 end;
> >
> > /*
> > * The bootstrap memblock region count maximum is 128 entries
> > @@ -1307,7 +1307,22 @@ void __init e820__memblock_setup(void)
> > if (entry->type != E820_TYPE_RAM && entry->type != E820_TYPE_RESERVED_KERN)
> > continue;
> >
> > - memblock_add(entry->addr, entry->size);
> > + if (entry->addr >= MAXMEM) {
> > + not_addressable += entry->size;
> > + continue;
> > + }
> > +
> > + end = min_t(u64, end, MAXMEM - 1);
> > + size = end - entry->addr;
> > + not_addressable += entry->size - size;
> > + memblock_add(entry->addr, size);
> > + }
> > +
> > + if (not_addressable) {
> > + pr_err("%lldGB of physical memory is not addressable in the paging mode\n",
> > + not_addressable >> 30);
> > + if (!pgtable_l5_enabled())
> > + pr_err("Consider enabling 5-level paging\n");
Could this happen at all when l5 is enabled?
Does it mean we need kmap() for 64-bit?
> > }
> >
> > /* Throw away partial pages: */
> > --
> > 2.26.2
> >
> >
>
> --
> Kirill A. Shutemov
>
--
Sincerely yours,
Mike.
* Re: [PATCH] x86/mm: Fix boot with some memory above MAXMEM
2020-05-25 14:59 ` Mike Rapoport
@ 2020-05-25 15:08 ` Kirill A. Shutemov
2020-05-25 15:58 ` Mike Rapoport
2020-05-26 14:27 ` Dave Hansen
0 siblings, 2 replies; 8+ messages in thread
From: Kirill A. Shutemov @ 2020-05-25 15:08 UTC (permalink / raw)
To: Mike Rapoport
Cc: Kirill A. Shutemov, Dave Hansen, Andy Lutomirski, Peter Zijlstra,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin,
Dan Williams, Tony Luck, x86, linux-mm, linux-kernel,
Dave Hansen, stable
On Mon, May 25, 2020 at 05:59:43PM +0300, Mike Rapoport wrote:
> On Mon, May 25, 2020 at 07:49:02AM +0300, Kirill A. Shutemov wrote:
> > On Mon, May 11, 2020 at 10:17:21PM +0300, Kirill A. Shutemov wrote:
> > > A 5-level paging capable machine can have memory above 46-bit in the
> > > physical address space. This memory is only addressable in the 5-level
> > > paging mode: we don't have enough virtual address space to create direct
> > > mapping for such memory in the 4-level paging mode.
> > >
> > > Currently, we fail boot completely: NULL pointer dereference in
> > > subsection_map_init().
> > >
> > > Skip creating a memblock for such memory instead and notify user that
> > > some memory is not addressable.
> > >
> > > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > > Reviewed-by: Dave Hansen <dave.hansen@intel.com>
> > > Cc: stable@vger.kernel.org # v4.14
> > > ---
> >
> > Gentle ping.
> >
> > It's not urgent, but it's a bug fix. Please consider applying.
> >
> > > Tested with a hacked QEMU: https://gist.github.com/kiryl/d45eb54110944ff95e544972d8bdac1d
> > >
> > > ---
> > > arch/x86/kernel/e820.c | 19 +++++++++++++++++--
> > > 1 file changed, 17 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
> > > index c5399e80c59c..d320d37d0f95 100644
> > > --- a/arch/x86/kernel/e820.c
> > > +++ b/arch/x86/kernel/e820.c
> > > @@ -1280,8 +1280,8 @@ void __init e820__memory_setup(void)
> > >
> > > void __init e820__memblock_setup(void)
> > > {
> > > + u64 size, end, not_addressable = 0;
> > > int i;
> > > - u64 end;
> > >
> > > /*
> > > * The bootstrap memblock region count maximum is 128 entries
> > > @@ -1307,7 +1307,22 @@ void __init e820__memblock_setup(void)
> > > if (entry->type != E820_TYPE_RAM && entry->type != E820_TYPE_RESERVED_KERN)
> > > continue;
> > >
> > > - memblock_add(entry->addr, entry->size);
> > > + if (entry->addr >= MAXMEM) {
> > > + not_addressable += entry->size;
> > > + continue;
> > > + }
> > > +
> > > + end = min_t(u64, end, MAXMEM - 1);
> > > + size = end - entry->addr;
> > > + not_addressable += entry->size - size;
> > > + memblock_add(entry->addr, size);
> > > + }
> > > +
> > > + if (not_addressable) {
> > > + pr_err("%lldGB of physical memory is not addressable in the paging mode\n",
> > > + not_addressable >> 30);
> > > + if (!pgtable_l5_enabled())
> > > + pr_err("Consider enabling 5-level paging\n");
>
> Could this happen at all when l5 is enabled?
> Does it mean we need kmap() for 64-bit?
It's future-proofing. Who knows what paging modes we would have in the
future.
--
Kirill A. Shutemov
* Re: [PATCH] x86/mm: Fix boot with some memory above MAXMEM
2020-05-25 15:08 ` Kirill A. Shutemov
@ 2020-05-25 15:58 ` Mike Rapoport
2020-05-26 14:27 ` Dave Hansen
1 sibling, 0 replies; 8+ messages in thread
From: Mike Rapoport @ 2020-05-25 15:58 UTC (permalink / raw)
To: Kirill A. Shutemov
Cc: Kirill A. Shutemov, Dave Hansen, Andy Lutomirski, Peter Zijlstra,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin,
Dan Williams, Tony Luck, x86, linux-mm, linux-kernel,
Dave Hansen, stable
On Mon, May 25, 2020 at 06:08:20PM +0300, Kirill A. Shutemov wrote:
> On Mon, May 25, 2020 at 05:59:43PM +0300, Mike Rapoport wrote:
> > On Mon, May 25, 2020 at 07:49:02AM +0300, Kirill A. Shutemov wrote:
> > > On Mon, May 11, 2020 at 10:17:21PM +0300, Kirill A. Shutemov wrote:
> > > > A 5-level paging capable machine can have memory above 46-bit in the
> > > > physical address space. This memory is only addressable in the 5-level
> > > > paging mode: we don't have enough virtual address space to create direct
> > > > mapping for such memory in the 4-level paging mode.
> > > >
> > > > Currently, we fail boot completely: NULL pointer dereference in
> > > > subsection_map_init().
> > > >
> > > > Skip creating a memblock for such memory instead and notify user that
> > > > some memory is not addressable.
> > > >
> > > > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > > > Reviewed-by: Dave Hansen <dave.hansen@intel.com>
> > > > Cc: stable@vger.kernel.org # v4.14
> > > > ---
> > >
> > > Gentle ping.
> > >
> > > It's not urgent, but it's a bug fix. Please consider applying.
> > >
> > > > Tested with a hacked QEMU: https://gist.github.com/kiryl/d45eb54110944ff95e544972d8bdac1d
> > > >
> > > > ---
> > > > arch/x86/kernel/e820.c | 19 +++++++++++++++++--
> > > > 1 file changed, 17 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
> > > > index c5399e80c59c..d320d37d0f95 100644
> > > > --- a/arch/x86/kernel/e820.c
> > > > +++ b/arch/x86/kernel/e820.c
> > > > @@ -1280,8 +1280,8 @@ void __init e820__memory_setup(void)
> > > >
> > > > void __init e820__memblock_setup(void)
> > > > {
> > > > + u64 size, end, not_addressable = 0;
> > > > int i;
> > > > - u64 end;
> > > >
> > > > /*
> > > > * The bootstrap memblock region count maximum is 128 entries
> > > > @@ -1307,7 +1307,22 @@ void __init e820__memblock_setup(void)
> > > > if (entry->type != E820_TYPE_RAM && entry->type != E820_TYPE_RESERVED_KERN)
> > > > continue;
> > > >
> > > > - memblock_add(entry->addr, entry->size);
> > > > + if (entry->addr >= MAXMEM) {
> > > > + not_addressable += entry->size;
> > > > + continue;
> > > > + }
> > > > +
> > > > + end = min_t(u64, end, MAXMEM - 1);
> > > > + size = end - entry->addr;
> > > > + not_addressable += entry->size - size;
> > > > + memblock_add(entry->addr, size);
> > > > + }
> > > > +
> > > > + if (not_addressable) {
> > > > + pr_err("%lldGB of physical memory is not addressable in the paging mode\n",
> > > > + not_addressable >> 30);
> > > > + if (!pgtable_l5_enabled())
> > > > + pr_err("Consider enabling 5-level paging\n");
> >
> > Could this happen at all when l5 is enabled?
> > Does it mean we need kmap() for 64-bit?
>
> It's future-proofing. Who knows what paging modes we would have in the
> future.
Then maybe
	pr_err("%lldGB of physical memory is not addressable in the %s paging mode\n",
	       not_addressable >> 30, pgtable_l5_enabled() ? "5-level" : "4-level");
"the paging mode" on its own sounds a bit awkward to me.
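
To see what the resulting message would look like, the suggestion can be tried in user space. This is only a sketch: sketch_format_warning() is a made-up helper, snprintf() stands in for pr_err(), and the l5_enabled flag stands in for pgtable_l5_enabled().

```c
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format the warning with the paging mode named explicitly.
 * Hypothetical user-space helper: not_addressable is in bytes and
 * l5_enabled is a plain flag replacing pgtable_l5_enabled().
 */
static int sketch_format_warning(char *buf, size_t len,
				 uint64_t not_addressable, bool l5_enabled)
{
	return snprintf(buf, len,
			"%" PRIu64 "GB of physical memory is not addressable in the %s paging mode",
			not_addressable >> 30,
			l5_enabled ? "5-level" : "4-level");
}
```

With 3GB above the limit and 5-level paging off, the buffer would read "3GB of physical memory is not addressable in the 4-level paging mode".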
> --
> Kirill A. Shutemov
--
Sincerely yours,
Mike.
* Re: [PATCH] x86/mm: Fix boot with some memory above MAXMEM
2020-05-25 15:08 ` Kirill A. Shutemov
2020-05-25 15:58 ` Mike Rapoport
@ 2020-05-26 14:27 ` Dave Hansen
2020-06-02 23:18 ` Kirill A. Shutemov
1 sibling, 1 reply; 8+ messages in thread
From: Dave Hansen @ 2020-05-26 14:27 UTC (permalink / raw)
To: Kirill A. Shutemov, Mike Rapoport
Cc: Kirill A. Shutemov, Dave Hansen, Andy Lutomirski, Peter Zijlstra,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin,
Dan Williams, Tony Luck, x86, linux-mm, linux-kernel, stable
On 5/25/20 8:08 AM, Kirill A. Shutemov wrote:
>>>> + if (not_addressable) {
>>>> + pr_err("%lldGB of physical memory is not addressable in the paging mode\n",
>>>> + not_addressable >> 30);
>>>> + if (!pgtable_l5_enabled())
>>>> + pr_err("Consider enabling 5-level paging\n");
>> Could this happen at all when l5 is enabled?
>> Does it mean we need kmap() for 64-bit?
> It's future-proofing. Who knows what paging modes we would have in the
> future.
Future-proofing and firmware-proofing. :)
In any case, are we *really* limited to 52 bits of physical memory with
5-level paging? Previously, we said we were limited to 46 bits, and now
we're saying that the limit is 52 with 5-level paging:
#define MAX_PHYSMEM_BITS (pgtable_l5_enabled() ? 52 : 46)
The 46 was fine with the 48 bits of address space on 4-level paging
systems since we need 1/2 of the address space for userspace, 1/4 for
the direct map and 1/4 for the vmalloc-and-friends area. At 46 bits of
address space, we fill up the direct map.
The hardware designers know this and never enumerated a MAXPHYADDR from
CPUID which was higher than what we could cover with 46 bits. It was
nice and convenient that these two separate things matched:
1. The amount of physical address space addressable in a direct map
consuming 1/4 of the virtual address space.
2. The CPU-enumerated MAXPHYADDR which among other things dictates how
much physical address space is addressable in a PTE.
But, with 5-level paging, things are a little different. The limit in
addressable memory because of running out of the direct map actually
happens at 55 bits (57-2=55, analogous to the 4-level 48-2=46).
So shouldn't it technically be this:
#define MAX_PHYSMEM_BITS (pgtable_l5_enabled() ? 55 : 46)
?
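
The arithmetic above can be written down as a small user-space sketch. The helper names are made up for illustration, and the only assumption is the layout described in this message: the direct map gets 1/4 of the virtual address space, i.e. two bits fewer than the virtual address width.

```c
#include <stdint.h>

/*
 * Width in bits of the physical range a direct map can cover when it is
 * given 1/4 of a va_bits-wide virtual address space (hypothetical helper).
 */
static unsigned int sketch_direct_map_bits(unsigned int va_bits)
{
	return va_bits - 2;
}

/* Size in TiB of a range that is 'bits' wide; valid for 40 <= bits < 104. */
static uint64_t sketch_bits_to_tib(unsigned int bits)
{
	return UINT64_C(1) << (bits - 40);
}
```

For 4-level paging this gives 48 - 2 = 46 bits (64 TiB), for 5-level paging 57 - 2 = 55 bits (32768 TiB), while a 52-bit limit corresponds to 4096 TiB.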
* Re: [PATCH] x86/mm: Fix boot with some memory above MAXMEM
2020-05-26 14:27 ` Dave Hansen
@ 2020-06-02 23:18 ` Kirill A. Shutemov
2020-06-03 19:18 ` Dave Hansen
0 siblings, 1 reply; 8+ messages in thread
From: Kirill A. Shutemov @ 2020-06-02 23:18 UTC (permalink / raw)
To: Dave Hansen
Cc: Mike Rapoport, Kirill A. Shutemov, Dave Hansen, Andy Lutomirski,
Peter Zijlstra, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H. Peter Anvin, Dan Williams, Tony Luck, x86, linux-mm,
linux-kernel, stable
On Tue, May 26, 2020 at 07:27:15AM -0700, Dave Hansen wrote:
> On 5/25/20 8:08 AM, Kirill A. Shutemov wrote:
> >>>> + if (not_addressable) {
> >>>> + pr_err("%lldGB of physical memory is not addressable in the paging mode\n",
> >>>> + not_addressable >> 30);
> >>>> + if (!pgtable_l5_enabled())
> >>>> + pr_err("Consider enabling 5-level paging\n");
> >> Could this happen at all when l5 is enabled?
> >> Does it mean we need kmap() for 64-bit?
> > > It's future-proofing. Who knows what paging modes we would have in the
> > future.
>
> Future-proofing and firmware-proofing. :)
>
> In any case, are we *really* limited to 52 bits of physical memory with
> 5-level paging?
Yes. It's architectural. SDM says "MAXPHYADDR is at most 52" (Vol 3A,
4.1.4).
I guess it can be extended with an opt-in feature and relevant changes to
page table structure. But as of today there's no such thing.
> Previously, we said we were limited to 46 bits, and now
> we're saying that the limit is 52 with 5-level paging:
>
> #define MAX_PHYSMEM_BITS (pgtable_l5_enabled() ? 52 : 46)
>
> The 46 was fine with the 48 bits of address space on 4-level paging
> systems since we need 1/2 of the address space for userspace, 1/4 for
> the direct map and 1/4 for the vmalloc-and-friends area. At 46 bits of
> address space, we fill up the direct map.
>
> The hardware designers know this and never enumerated a MAXPHYADDR from
> CPUID which was higher than what we could cover with 46 bits. It was
> nice and convenient that these two separate things matched:
> 1. The amount of physical address space addressable in a direct map
> consuming 1/4 of the virtual address space.
> 2. The CPU-enumerated MAXPHYADDR which among other things dictates how
> much physical address space is addressable in a PTE.
>
> But, with 5-level paging, things are a little different. The limit in
> addressable memory because of running out of the direct map actually
> happens at 55 bits (57-2=55, analogous to the 4-level 48-2=46).
>
> So shouldn't it technically be this:
>
> #define MAX_PHYSMEM_BITS (pgtable_l5_enabled() ? 55 : 46)
>
> ?
Bits above 52 are ignored in the page table entries and accessible to
software. Some of them got claimed by HW features (XD-bit, protection
keys), but such features require explicit opt-in on software side.
The kernel could claim bits 53-55 for the physical address, but it doesn't
get us anything: if future HW provided such a feature, it would require
opt-in. On the other hand, claiming them now means we cannot use them for
other purposes as SW bits. I don't see a point.
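
To make the bit layout concrete, here is a rough user-space sketch of a 4K page table entry as I read the SDM, assuming MAXPHYADDR is 52 and counting bits from 0. The SKETCH_* names and the helper are made up for illustration; they are not the kernel's definitions.

```c
#include <stdint.h>

/* Physical address field of a 4K PTE with MAXPHYADDR == 52: bits 51:12. */
#define SKETCH_PTE_ADDR_MASK	UINT64_C(0x000ffffffffff000)
/* Bits 58:52: ignored by hardware, free for software use. */
#define SKETCH_PTE_SW_HIGH_MASK	(UINT64_C(0x7f) << 52)
/* Bits 62:59: protection key, only used if software sets CR4.PKE. */
#define SKETCH_PTE_PKEY_MASK	(UINT64_C(0xf) << 59)
/* Bit 63: execute-disable (XD), only used if software sets IA32_EFER.NXE. */
#define SKETCH_PTE_XD		(UINT64_C(1) << 63)

/* Extract the physical address from a PTE value (hypothetical helper). */
static uint64_t sketch_pte_phys_addr(uint64_t pte)
{
	return pte & SKETCH_PTE_ADDR_MASK;
}
```

Widening SKETCH_PTE_ADDR_MASK upwards would eat into SKETCH_PTE_SW_HIGH_MASK, which is the trade-off described above.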
--
Kirill A. Shutemov
* Re: [PATCH] x86/mm: Fix boot with some memory above MAXMEM
2020-06-02 23:18 ` Kirill A. Shutemov
@ 2020-06-03 19:18 ` Dave Hansen
0 siblings, 0 replies; 8+ messages in thread
From: Dave Hansen @ 2020-06-03 19:18 UTC (permalink / raw)
To: Kirill A. Shutemov
Cc: Mike Rapoport, Kirill A. Shutemov, Dave Hansen, Andy Lutomirski,
Peter Zijlstra, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H. Peter Anvin, Dan Williams, Tony Luck, x86, linux-mm,
linux-kernel, stable
On 6/2/20 4:18 PM, Kirill A. Shutemov wrote:
> On Tue, May 26, 2020 at 07:27:15AM -0700, Dave Hansen wrote:
>> On 5/25/20 8:08 AM, Kirill A. Shutemov wrote:
>>>>>> + if (not_addressable) {
>>>>>> + pr_err("%lldGB of physical memory is not addressable in the paging mode\n",
>>>>>> + not_addressable >> 30);
>>>>>> + if (!pgtable_l5_enabled())
>>>>>> + pr_err("Consider enabling 5-level paging\n");
>>>> Could this happen at all when l5 is enabled?
>>>> Does it mean we need kmap() for 64-bit?
>>> It's future-proofing. Who knows what paging modes we would have in the
>>> future.
>>
>> Future-proofing and firmware-proofing. :)
>>
>> In any case, are we *really* limited to 52 bits of physical memory with
>> 5-level paging?
>
> Yes. It's architectural. SDM says "MAXPHYADDR is at most 52" (Vol 3A,
> 4.1.4).
Right you are.
I'm glad it's in the architecture. Makes all of this a lot easier!
>> So shouldn't it technically be this:
>>
>> #define MAX_PHYSMEM_BITS (pgtable_l5_enabled() ? 55 : 46)
>>
>> ?
>
> Bits above 52 are ignored in the page table entries and accessible to
> software. Some of them got claimed by HW features (XD-bit, protection
> keys), but such features require explicit opt-in on software side.
>
> The kernel could claim bits 53-55 for the physical address, but it doesn't
> get us anything: if future HW provided such a feature, it would require
> opt-in. On the other hand, claiming them now means we cannot use them for
> other purposes as SW bits. I don't see a point.
Yep, agreed.