* Re: [PATCH] arm64, vmcoreinfo : Append 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' to vmcoreinfo
@ 2019-01-31 1:48 ` Dave Young
0 siblings, 0 replies; 58+ messages in thread
From: Dave Young @ 2019-01-31 1:48 UTC (permalink / raw)
To: Bhupesh Sharma
Cc: Mark Rutland, Kazuhito Hagio, lijiang, bhe, ard.biesheuvel,
catalin.marinas, kexec, Will Deacon, AKASHI Takahiro,
James Morse, Borislav Petkov, anderson, bhupesh.linux,
linux-arm-kernel
+ more people
On 01/30/19 at 05:53pm, Bhupesh Sharma wrote:
> With ARMv8.2-LVA and LPA architecture extensions, arm64 hardware which
> supports these extensions can support upto 52-bit virtual and 52-bit
> physical addresses respectively.
>
> Since at the moment we enable the support of these extensions via CONFIG
> flags, e.g.
> - LPA via CONFIG_ARM64_PA_BITS_52
>
> there are no clear mechanisms in user-space right now to
> deteremine these CONFIG flag values and also determine the PARange and
> VARange address values.
>
> User-space tools like 'makedumpfile' and 'crash-utility' can instead
> use the 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' values to determine
> the maximum virtual address and physical address (respectively)
> supported by underlying kernel.
>
> A reference 'makedumpfile' implementation which uses this approach to
> determining the maximum physical address is available in [0].
>
> [0]. https://github.com/bhupesh-sharma/makedumpfile/blob/52-bit-pa-support-via-vmcore-v1/arch/arm64.c#L490
I'm not objecting the patch, just want to make sure to make clear about
things and make sure these issues are aware by people, and leave arm
people to review the arm bits.
1. MAX_PHYSMEM_BITS
As we previously found, back to 2014 makedumpfile took a patch to read the
value from vmcore but the kernel patch was not accepted.
So we should first make clear if this is really needed, why other arches
do not need this in makedumpfile.
If we really need it then should it be arm64 only?
If it is arm64 only then the makedumpfile code should read this number
only for arm64.
Also Lianbo added the vmcoreinfo documents, I believe it stays in -tip
tree, need to make sure to document this as well.
2. MAX_USER_VA_BITS
Does makedumpfile care about userspace VA bits? I do not see other code
doing this, Kazu and Dave A should be able to comment.
I tend to doubt about this.
>
> Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Will Deacon <will.deacon@arm.com>
> Cc: James Morse <james.morse@arm.com>
> Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com>
> ---
> arch/arm64/kernel/crash_core.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/arch/arm64/kernel/crash_core.c b/arch/arm64/kernel/crash_core.c
> index ca4c3e12d8c5..ad231be5c0d8 100644
> --- a/arch/arm64/kernel/crash_core.c
> +++ b/arch/arm64/kernel/crash_core.c
> @@ -10,6 +10,8 @@
> void arch_crash_save_vmcoreinfo(void)
> {
> VMCOREINFO_NUMBER(VA_BITS);
> + VMCOREINFO_NUMBER(MAX_USER_VA_BITS);
> + VMCOREINFO_NUMBER(MAX_PHYSMEM_BITS);
> /* Please note VMCOREINFO_NUMBER() uses "%d", not "%x" */
> vmcoreinfo_append_str("NUMBER(kimage_voffset)=0x%llx\n",
> kimage_voffset);
> --
> 2.7.4
>
>
> _______________________________________________
> kexec mailing list
> kexec@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec
Thanks
Dave
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH] arm64, vmcoreinfo : Append 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' to vmcoreinfo
2019-01-31 1:48 ` Dave Young
@ 2019-01-31 10:00 ` Bhupesh Sharma
-1 siblings, 0 replies; 58+ messages in thread
From: Bhupesh Sharma @ 2019-01-31 10:00 UTC (permalink / raw)
To: Dave Young
Cc: Mark Rutland, Kazuhito Hagio, lijiang, bhe, ard.biesheuvel,
catalin.marinas, kexec, Will Deacon, AKASHI Takahiro,
James Morse, Borislav Petkov, anderson, bhupesh.linux,
linux-arm-kernel
On 01/31/2019 07:18 AM, Dave Young wrote:
> + more people
> On 01/30/19 at 05:53pm, Bhupesh Sharma wrote:
>> With ARMv8.2-LVA and LPA architecture extensions, arm64 hardware which
>> supports these extensions can support upto 52-bit virtual and 52-bit
>> physical addresses respectively.
>>
>> Since at the moment we enable the support of these extensions via CONFIG
>> flags, e.g.
>> - LPA via CONFIG_ARM64_PA_BITS_52
>>
>> there are no clear mechanisms in user-space right now to
>> deteremine these CONFIG flag values and also determine the PARange and
>> VARange address values.
>>
>> User-space tools like 'makedumpfile' and 'crash-utility' can instead
>> use the 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' values to determine
>> the maximum virtual address and physical address (respectively)
>> supported by underlying kernel.
>>
>> A reference 'makedumpfile' implementation which uses this approach to
>> determining the maximum physical address is available in [0].
>>
>> [0]. https://github.com/bhupesh-sharma/makedumpfile/blob/52-bit-pa-support-via-vmcore-v1/arch/arm64.c#L490
>
> I'm not objecting the patch, just want to make sure to make clear about
> things and make sure these issues are aware by people, and leave arm
> people to review the arm bits.
>
> 1. MAX_PHYSMEM_BITS
> As we previously found, back to 2014 makedumpfile took a patch to read the
> value from vmcore but the kernel patch was not accepted.
> So we should first make clear if this is really needed, why other arches
> do not need this in makedumpfile.
I explained this a bit in my reply to Suzuki's and James's review
comments yesterday, but let me summarize the same again for better clarity:
Let's take the example of x86. We have CONFIG_X86_5LEVEL config flag to
indicate 5 level page-table support. We export the same in vmcoreinfo
for x86_64 using:
void arch_crash_save_vmcoreinfo(void)
{
<.. snip..>
vmcoreinfo_append_str("NUMBER(pgtable_l5_enabled)=%d\n",
pgtable_l5_enabled());
}
Also a simple grep in makedumpfile and crash for the same indicates that
the user-space code determines the MAX_PHYSMEM_BITS value using the
'pgtable_l5_enabled' value available in vmcoreinfo:
Example from makedumpfile:
-------------------------
int
get_versiondep_info_x86_64(void)
{
/*
* On linux-2.6.26, MAX_PHYSMEM_BITS is changed to 44 from 40.
*/
if (info->kernel_version < KERNEL_VERSION(2, 6, 26))
info->max_physmem_bits = _MAX_PHYSMEM_BITS_ORIG;
else if (info->kernel_version < KERNEL_VERSION(2, 6, 31))
info->max_physmem_bits = _MAX_PHYSMEM_BITS_2_6_26;
else if(check_5level_paging())
info->max_physmem_bits = _MAX_PHYSMEM_BITS_5LEVEL;
else
info->max_physmem_bits = _MAX_PHYSMEM_BITS_2_6_31;
...
}
As we can see above, we use several if-else cases to determine the
'MAX_PHYSMEM_BITS', one of which is setting it to 52-bit, if the
'pgtable_l5_enabled' value is available and TRUE in vmcoreinfo.
So, we determine 'MAX_PHYSMEM_BITS' value in makedumpfile via a
vmcoreinfo export'ed variable (rather than using named
'MAX_PHYSMEM_BITS' its 'pgtable_l5_enabled' that is being used).
Since for arm64, we don't have a single CONFIG flag for 52-bit addresses
spaces (kernel VA, user-space VA and PA), so its better to export the
respective CONFIG flags in the vmcoreinfo directly.
> If we really need it then should it be arm64 only?
See above, since archs like x86 use a single flag: CONFIG_X86_5LEVEL,
whereas arm64 can use the combination of following flags to indicate
combinations of various address spaces:
- 48-bit kernel VA + 48-bit user-space VA + 52-bit PA
- 48-bit kernel VA + 52-bit user-space VA + 52-bit PA
- 52-bit kernel VA + 52-bit user-space VA + 52-bit PA
CONFIG_ARM64_64K_PAGES
CONFIG_ARM64_USER_VA_BITS_52
CONFIG_ARM64_VA_BITS
CONFIG_ARM64_PA_BITS_52
CONFIG_ARM64_PA_BITS
CONFIG_EXPERT, and
CONFIG_ARM64_FORCE_52BIT
so probably its not correct to compare the two cases one-to-one (its
more like an apple and an orange comparison).
> If it is arm64 only then the makedumpfile code should read this number
> only for arm64.
>
> Also Lianbo added the vmcoreinfo documents, I believe it stays in -tip
> tree, need to make sure to document this as well.
Sure, I will send a separate patch to fix the same, once this gets in.
>
> 2. MAX_USER_VA_BITS
> Does makedumpfile care about userspace VA bits?
Yes. Consider the case 48-bit kernel VA and 52-bit user-space VA, which
is a perfectly valid case on arm64. In such cases VA_BITS is set to 48,
whereas MAX_USER_VA_BITS is set to 52, which allows user-space
applications which use 52-bit virtual address to pass a hint to 'mmap'
to get high addresses.
I do not see other code
> doing this, Kazu and Dave A should be able to comment.
I talked to Dave A. yesterday off-list, I think he mentioned that these
changes are useful for crash-utility as well and he was hoping it gets
accepted soon so that kernel-debugging tools can handle increased
address spaces on arm64.
But it will be great to have reviews/ACKs from Dave A and others as well.
Thanks,
Bhupesh
>
> I tend to doubt about this.
>
>>
>> Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
>> Cc: Mark Rutland <mark.rutland@arm.com>
>> Cc: Will Deacon <will.deacon@arm.com>
>> Cc: James Morse <james.morse@arm.com>
>> Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com>
>> ---
>> arch/arm64/kernel/crash_core.c | 2 ++
>> 1 file changed, 2 insertions(+)
>>
>> diff --git a/arch/arm64/kernel/crash_core.c b/arch/arm64/kernel/crash_core.c
>> index ca4c3e12d8c5..ad231be5c0d8 100644
>> --- a/arch/arm64/kernel/crash_core.c
>> +++ b/arch/arm64/kernel/crash_core.c
>> @@ -10,6 +10,8 @@
>> void arch_crash_save_vmcoreinfo(void)
>> {
>> VMCOREINFO_NUMBER(VA_BITS);
>> + VMCOREINFO_NUMBER(MAX_USER_VA_BITS);
>> + VMCOREINFO_NUMBER(MAX_PHYSMEM_BITS);
>> /* Please note VMCOREINFO_NUMBER() uses "%d", not "%x" */
>> vmcoreinfo_append_str("NUMBER(kimage_voffset)=0x%llx\n",
>> kimage_voffset);
>> --
>> 2.7.4
>>
>>
>> _______________________________________________
>> kexec mailing list
>> kexec@lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/kexec
>
> Thanks
> Dave
>
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH] arm64, vmcoreinfo : Append 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' to vmcoreinfo
@ 2019-01-31 10:00 ` Bhupesh Sharma
0 siblings, 0 replies; 58+ messages in thread
From: Bhupesh Sharma @ 2019-01-31 10:00 UTC (permalink / raw)
To: Dave Young
Cc: Mark Rutland, Kazuhito Hagio, lijiang, bhe, ard.biesheuvel,
catalin.marinas, kexec, Will Deacon, AKASHI Takahiro,
James Morse, Borislav Petkov, anderson, bhupesh.linux,
linux-arm-kernel
On 01/31/2019 07:18 AM, Dave Young wrote:
> + more people
> On 01/30/19 at 05:53pm, Bhupesh Sharma wrote:
>> With ARMv8.2-LVA and LPA architecture extensions, arm64 hardware which
>> supports these extensions can support upto 52-bit virtual and 52-bit
>> physical addresses respectively.
>>
>> Since at the moment we enable the support of these extensions via CONFIG
>> flags, e.g.
>> - LPA via CONFIG_ARM64_PA_BITS_52
>>
>> there are no clear mechanisms in user-space right now to
>> deteremine these CONFIG flag values and also determine the PARange and
>> VARange address values.
>>
>> User-space tools like 'makedumpfile' and 'crash-utility' can instead
>> use the 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' values to determine
>> the maximum virtual address and physical address (respectively)
>> supported by underlying kernel.
>>
>> A reference 'makedumpfile' implementation which uses this approach to
>> determining the maximum physical address is available in [0].
>>
>> [0]. https://github.com/bhupesh-sharma/makedumpfile/blob/52-bit-pa-support-via-vmcore-v1/arch/arm64.c#L490
>
> I'm not objecting the patch, just want to make sure to make clear about
> things and make sure these issues are aware by people, and leave arm
> people to review the arm bits.
>
> 1. MAX_PHYSMEM_BITS
> As we previously found, back to 2014 makedumpfile took a patch to read the
> value from vmcore but the kernel patch was not accepted.
> So we should first make clear if this is really needed, why other arches
> do not need this in makedumpfile.
I explained this a bit in my reply to Suzuki's and James's review
comments yesterday, but let me summarize the same again for better clarity:
Let's take the example of x86. We have CONFIG_X86_5LEVEL config flag to
indicate 5 level page-table support. We export the same in vmcoreinfo
for x86_64 using:
void arch_crash_save_vmcoreinfo(void)
{
<.. snip..>
vmcoreinfo_append_str("NUMBER(pgtable_l5_enabled)=%d\n",
pgtable_l5_enabled());
}
Also a simple grep in makedumpfile and crash for the same indicates that
the user-space code determines the MAX_PHYSMEM_BITS value using the
'pgtable_l5_enabled' value available in vmcoreinfo:
Example from makedumpfile:
-------------------------
int
get_versiondep_info_x86_64(void)
{
/*
* On linux-2.6.26, MAX_PHYSMEM_BITS is changed to 44 from 40.
*/
if (info->kernel_version < KERNEL_VERSION(2, 6, 26))
info->max_physmem_bits = _MAX_PHYSMEM_BITS_ORIG;
else if (info->kernel_version < KERNEL_VERSION(2, 6, 31))
info->max_physmem_bits = _MAX_PHYSMEM_BITS_2_6_26;
else if(check_5level_paging())
info->max_physmem_bits = _MAX_PHYSMEM_BITS_5LEVEL;
else
info->max_physmem_bits = _MAX_PHYSMEM_BITS_2_6_31;
...
}
As we can see above, we use several if-else cases to determine the
'MAX_PHYSMEM_BITS', one of which is setting it to 52-bit, if the
'pgtable_l5_enabled' value is available and TRUE in vmcoreinfo.
So, we determine 'MAX_PHYSMEM_BITS' value in makedumpfile via a
vmcoreinfo export'ed variable (rather than using named
'MAX_PHYSMEM_BITS' its 'pgtable_l5_enabled' that is being used).
Since for arm64, we don't have a single CONFIG flag for 52-bit addresses
spaces (kernel VA, user-space VA and PA), so its better to export the
respective CONFIG flags in the vmcoreinfo directly.
> If we really need it then should it be arm64 only?
See above, since archs like x86 use a single flag: CONFIG_X86_5LEVEL,
whereas arm64 can use the combination of following flags to indicate
combinations of various address spaces:
- 48-bit kernel VA + 48-bit user-space VA + 52-bit PA
- 48-bit kernel VA + 52-bit user-space VA + 52-bit PA
- 52-bit kernel VA + 52-bit user-space VA + 52-bit PA
CONFIG_ARM64_64K_PAGES
CONFIG_ARM64_USER_VA_BITS_52
CONFIG_ARM64_VA_BITS
CONFIG_ARM64_PA_BITS_52
CONFIG_ARM64_PA_BITS
CONFIG_EXPERT, and
CONFIG_ARM64_FORCE_52BIT
so probably its not correct to compare the two cases one-to-one (its
more like an apple and an orange comparison).
> If it is arm64 only then the makedumpfile code should read this number
> only for arm64.
>
> Also Lianbo added the vmcoreinfo documents, I believe it stays in -tip
> tree, need to make sure to document this as well.
Sure, I will send a separate patch to fix the same, once this gets in.
>
> 2. MAX_USER_VA_BITS
> Does makedumpfile care about userspace VA bits?
Yes. Consider the case 48-bit kernel VA and 52-bit user-space VA, which
is a perfectly valid case on arm64. In such cases VA_BITS is set to 48,
whereas MAX_USER_VA_BITS is set to 52, which allows user-space
applications which use 52-bit virtual address to pass a hint to 'mmap'
to get high addresses.
I do not see other code
> doing this, Kazu and Dave A should be able to comment.
I talked to Dave A. yesterday off-list, I think he mentioned that these
changes are useful for crash-utility as well and he was hoping it gets
accepted soon so that kernel-debugging tools can handle increased
address spaces on arm64.
But it will be great to have reviews/ACKs from Dave A and others as well.
Thanks,
Bhupesh
>
> I tend to doubt about this.
>
>>
>> Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
>> Cc: Mark Rutland <mark.rutland@arm.com>
>> Cc: Will Deacon <will.deacon@arm.com>
>> Cc: James Morse <james.morse@arm.com>
>> Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com>
>> ---
>> arch/arm64/kernel/crash_core.c | 2 ++
>> 1 file changed, 2 insertions(+)
>>
>> diff --git a/arch/arm64/kernel/crash_core.c b/arch/arm64/kernel/crash_core.c
>> index ca4c3e12d8c5..ad231be5c0d8 100644
>> --- a/arch/arm64/kernel/crash_core.c
>> +++ b/arch/arm64/kernel/crash_core.c
>> @@ -10,6 +10,8 @@
>> void arch_crash_save_vmcoreinfo(void)
>> {
>> VMCOREINFO_NUMBER(VA_BITS);
>> + VMCOREINFO_NUMBER(MAX_USER_VA_BITS);
>> + VMCOREINFO_NUMBER(MAX_PHYSMEM_BITS);
>> /* Please note VMCOREINFO_NUMBER() uses "%d", not "%x" */
>> vmcoreinfo_append_str("NUMBER(kimage_voffset)=0x%llx\n",
>> kimage_voffset);
>> --
>> 2.7.4
>>
>>
>> _______________________________________________
>> kexec mailing list
>> kexec@lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/kexec
>
> Thanks
> Dave
>
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH] arm64, vmcoreinfo : Append 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' to vmcoreinfo
2019-01-31 1:48 ` Dave Young
@ 2019-01-31 14:03 ` Dave Anderson
-1 siblings, 0 replies; 58+ messages in thread
From: Dave Anderson @ 2019-01-31 14:03 UTC (permalink / raw)
To: Dave Young
Cc: Mark Rutland, Kazuhito Hagio, lijiang, bhe, ard biesheuvel,
catalin marinas, Bhupesh Sharma, kexec, Will Deacon,
AKASHI Takahiro, James Morse, Borislav Petkov, bhupesh linux,
linux-arm-kernel
----- Original Message -----
> + more people
> On 01/30/19 at 05:53pm, Bhupesh Sharma wrote:
> > With ARMv8.2-LVA and LPA architecture extensions, arm64 hardware which
> > supports these extensions can support upto 52-bit virtual and 52-bit
> > physical addresses respectively.
> >
> > Since at the moment we enable the support of these extensions via CONFIG
> > flags, e.g.
> > - LPA via CONFIG_ARM64_PA_BITS_52
> >
> > there are no clear mechanisms in user-space right now to
> > deteremine these CONFIG flag values and also determine the PARange and
> > VARange address values.
> >
> > User-space tools like 'makedumpfile' and 'crash-utility' can instead
> > use the 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' values to determine
> > the maximum virtual address and physical address (respectively)
> > supported by underlying kernel.
> >
> > A reference 'makedumpfile' implementation which uses this approach to
> > determining the maximum physical address is available in [0].
> >
> > [0].
> > https://github.com/bhupesh-sharma/makedumpfile/blob/52-bit-pa-support-via-vmcore-v1/arch/arm64.c#L490
>
> I'm not objecting the patch, just want to make sure to make clear about
> things and make sure these issues are aware by people, and leave arm
> people to review the arm bits.
>
> 1. MAX_PHYSMEM_BITS
> As we previously found, back to 2014 makedumpfile took a patch to read the
> value from vmcore but the kernel patch was not accepted.
> So we should first make clear if this is really needed, why other arches
> do not need this in makedumpfile.
>
> If we really need it then should it be arm64 only?
>
> If it is arm64 only then the makedumpfile code should read this number
> only for arm64.
With respect to the crash utility, this is the first time/architecture where the
determination of MAX_PHYSMEM_BITS cannot be determined.
>
> Also Lianbo added the vmcoreinfo documents, I believe it stays in -tip
> tree, need to make sure to document this as well.
>
> 2. MAX_USER_VA_BITS
> Does makedumpfile care about userspace VA bits? I do not see other code
> doing this, Kazu and Dave A should be able to comment.
>
> I tend to doubt about this.
With respect to the crash utility, this won't be required because there is
the exported "vabits_user" variable.
Dave Anderson
>
> >
> > Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
> > Cc: Mark Rutland <mark.rutland@arm.com>
> > Cc: Will Deacon <will.deacon@arm.com>
> > Cc: James Morse <james.morse@arm.com>
> > Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com>
> > ---
> > arch/arm64/kernel/crash_core.c | 2 ++
> > 1 file changed, 2 insertions(+)
> >
> > diff --git a/arch/arm64/kernel/crash_core.c
> > b/arch/arm64/kernel/crash_core.c
> > index ca4c3e12d8c5..ad231be5c0d8 100644
> > --- a/arch/arm64/kernel/crash_core.c
> > +++ b/arch/arm64/kernel/crash_core.c
> > @@ -10,6 +10,8 @@
> > void arch_crash_save_vmcoreinfo(void)
> > {
> > VMCOREINFO_NUMBER(VA_BITS);
> > + VMCOREINFO_NUMBER(MAX_USER_VA_BITS);
> > + VMCOREINFO_NUMBER(MAX_PHYSMEM_BITS);
> > /* Please note VMCOREINFO_NUMBER() uses "%d", not "%x" */
> > vmcoreinfo_append_str("NUMBER(kimage_voffset)=0x%llx\n",
> > kimage_voffset);
> > --
> > 2.7.4
> >
> >
> > _______________________________________________
> > kexec mailing list
> > kexec@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/kexec
>
> Thanks
> Dave
>
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH] arm64, vmcoreinfo : Append 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' to vmcoreinfo
@ 2019-01-31 14:03 ` Dave Anderson
0 siblings, 0 replies; 58+ messages in thread
From: Dave Anderson @ 2019-01-31 14:03 UTC (permalink / raw)
To: Dave Young
Cc: Mark Rutland, Kazuhito Hagio, lijiang, bhe, ard biesheuvel,
catalin marinas, Bhupesh Sharma, kexec, Will Deacon,
AKASHI Takahiro, James Morse, Borislav Petkov, bhupesh linux,
linux-arm-kernel
----- Original Message -----
> + more people
> On 01/30/19 at 05:53pm, Bhupesh Sharma wrote:
> > With ARMv8.2-LVA and LPA architecture extensions, arm64 hardware which
> > supports these extensions can support upto 52-bit virtual and 52-bit
> > physical addresses respectively.
> >
> > Since at the moment we enable the support of these extensions via CONFIG
> > flags, e.g.
> > - LPA via CONFIG_ARM64_PA_BITS_52
> >
> > there are no clear mechanisms in user-space right now to
> > deteremine these CONFIG flag values and also determine the PARange and
> > VARange address values.
> >
> > User-space tools like 'makedumpfile' and 'crash-utility' can instead
> > use the 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' values to determine
> > the maximum virtual address and physical address (respectively)
> > supported by underlying kernel.
> >
> > A reference 'makedumpfile' implementation which uses this approach to
> > determining the maximum physical address is available in [0].
> >
> > [0].
> > https://github.com/bhupesh-sharma/makedumpfile/blob/52-bit-pa-support-via-vmcore-v1/arch/arm64.c#L490
>
> I'm not objecting the patch, just want to make sure to make clear about
> things and make sure these issues are aware by people, and leave arm
> people to review the arm bits.
>
> 1. MAX_PHYSMEM_BITS
> As we previously found, back to 2014 makedumpfile took a patch to read the
> value from vmcore but the kernel patch was not accepted.
> So we should first make clear if this is really needed, why other arches
> do not need this in makedumpfile.
>
> If we really need it then should it be arm64 only?
>
> If it is arm64 only then the makedumpfile code should read this number
> only for arm64.
With respect to the crash utility, this is the first time/architecture where the
determination of MAX_PHYSMEM_BITS cannot be determined.
>
> Also Lianbo added the vmcoreinfo documents, I believe it stays in -tip
> tree, need to make sure to document this as well.
>
> 2. MAX_USER_VA_BITS
> Does makedumpfile care about userspace VA bits? I do not see other code
> doing this, Kazu and Dave A should be able to comment.
>
> I tend to doubt about this.
With respect to the crash utility, this won't be required because there is
the exported "vabits_user" variable.
Dave Anderson
>
> >
> > Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
> > Cc: Mark Rutland <mark.rutland@arm.com>
> > Cc: Will Deacon <will.deacon@arm.com>
> > Cc: James Morse <james.morse@arm.com>
> > Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com>
> > ---
> > arch/arm64/kernel/crash_core.c | 2 ++
> > 1 file changed, 2 insertions(+)
> >
> > diff --git a/arch/arm64/kernel/crash_core.c
> > b/arch/arm64/kernel/crash_core.c
> > index ca4c3e12d8c5..ad231be5c0d8 100644
> > --- a/arch/arm64/kernel/crash_core.c
> > +++ b/arch/arm64/kernel/crash_core.c
> > @@ -10,6 +10,8 @@
> > void arch_crash_save_vmcoreinfo(void)
> > {
> > VMCOREINFO_NUMBER(VA_BITS);
> > + VMCOREINFO_NUMBER(MAX_USER_VA_BITS);
> > + VMCOREINFO_NUMBER(MAX_PHYSMEM_BITS);
> > /* Please note VMCOREINFO_NUMBER() uses "%d", not "%x" */
> > vmcoreinfo_append_str("NUMBER(kimage_voffset)=0x%llx\n",
> > kimage_voffset);
> > --
> > 2.7.4
> >
> >
> > _______________________________________________
> > kexec mailing list
> > kexec@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/kexec
>
> Thanks
> Dave
>
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
^ permalink raw reply [flat|nested] 58+ messages in thread
* RE: [PATCH] arm64, vmcoreinfo : Append 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' to vmcoreinfo
2019-01-31 1:48 ` Dave Young
@ 2019-02-04 16:04 ` Kazuhito Hagio
-1 siblings, 0 replies; 58+ messages in thread
From: Kazuhito Hagio @ 2019-02-04 16:04 UTC (permalink / raw)
To: Dave Young, Bhupesh Sharma
Cc: Mark Rutland, lijiang, bhe, ard.biesheuvel, catalin.marinas,
kexec, Will Deacon, AKASHI Takahiro, James Morse,
Borislav Petkov, anderson, linux-arm-kernel
On 1/30/2019 8:48 PM, Dave Young wrote:
> + more people
> On 01/30/19 at 05:53pm, Bhupesh Sharma wrote:
> > With ARMv8.2-LVA and LPA architecture extensions, arm64 hardware which
> > supports these extensions can support upto 52-bit virtual and 52-bit
> > physical addresses respectively.
> >
> > Since at the moment we enable the support of these extensions via CONFIG
> > flags, e.g.
> > - LPA via CONFIG_ARM64_PA_BITS_52
> >
> > there are no clear mechanisms in user-space right now to
> > deteremine these CONFIG flag values and also determine the PARange and
> > VARange address values.
> >
> > User-space tools like 'makedumpfile' and 'crash-utility' can instead
> > use the 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' values to determine
> > the maximum virtual address and physical address (respectively)
> > supported by underlying kernel.
> >
> > A reference 'makedumpfile' implementation which uses this approach to
> > determining the maximum physical address is available in [0].
> >
> > [0].
> https://github.com/bhupesh-sharma/makedumpfile/blob/52-bit-pa-support-via-vmcore-v1/arch/arm64.c#L490
>
> I'm not objecting the patch, just want to make sure to make clear about
> things and make sure these issues are aware by people, and leave arm
> people to review the arm bits.
>
> 1. MAX_PHYSMEM_BITS
> As we previously found, back to 2014 makedumpfile took a patch to read the
> value from vmcore but the kernel patch was not accepted.
> So we should first make clear if this is really needed, why other arches
> do not need this in makedumpfile.
>
> If we really need it then should it be arm64 only?
>
> If it is arm64 only then the makedumpfile code should read this number
> only for arm64.
Sorry for the delay.
According to the kernel patch, some of arm32 platforms may need it
http://lists.infradead.org/pipermail/kexec/2014-May/011909.html
but except for them (and arm64), makedumpfile can manage with kernel
version and some switches to determine this value so far.
>
> Also Lianbo added the vmcoreinfo documents, I believe it stays in -tip
> tree, need to make sure to document this as well.
>
> 2. MAX_USER_VA_BITS
> Does makedumpfile care about userspace VA bits? I do not see other code
> doing this, Kazu and Dave A should be able to comment.
The mapping makedumpfile uses on arm64 is swapper_pg_dir only, so
unless the config affects its structure or something, makedumpfile
will not need this value.
Thanks,
Kazu
>
> I tend to doubt about this.
>
> >
> > Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
> > Cc: Mark Rutland <mark.rutland@arm.com>
> > Cc: Will Deacon <will.deacon@arm.com>
> > Cc: James Morse <james.morse@arm.com>
> > Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com>
> > ---
> > arch/arm64/kernel/crash_core.c | 2 ++
> > 1 file changed, 2 insertions(+)
> >
> > diff --git a/arch/arm64/kernel/crash_core.c b/arch/arm64/kernel/crash_core.c
> > index ca4c3e12d8c5..ad231be5c0d8 100644
> > --- a/arch/arm64/kernel/crash_core.c
> > +++ b/arch/arm64/kernel/crash_core.c
> > @@ -10,6 +10,8 @@
> > void arch_crash_save_vmcoreinfo(void)
> > {
> > VMCOREINFO_NUMBER(VA_BITS);
> > + VMCOREINFO_NUMBER(MAX_USER_VA_BITS);
> > + VMCOREINFO_NUMBER(MAX_PHYSMEM_BITS);
> > /* Please note VMCOREINFO_NUMBER() uses "%d", not "%x" */
> > vmcoreinfo_append_str("NUMBER(kimage_voffset)=0x%llx\n",
> > kimage_voffset);
> > --
> > 2.7.4
> >
> >
> > _______________________________________________
> > kexec mailing list
> > kexec@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/kexec
>
> Thanks
> Dave
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 58+ messages in thread
* RE: [PATCH] arm64, vmcoreinfo : Append 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' to vmcoreinfo
@ 2019-02-04 16:04 ` Kazuhito Hagio
0 siblings, 0 replies; 58+ messages in thread
From: Kazuhito Hagio @ 2019-02-04 16:04 UTC (permalink / raw)
To: Dave Young, Bhupesh Sharma
Cc: Mark Rutland, lijiang, bhe, ard.biesheuvel, catalin.marinas,
kexec, Will Deacon, AKASHI Takahiro, James Morse,
Borislav Petkov, anderson, linux-arm-kernel
On 1/30/2019 8:48 PM, Dave Young wrote:
> + more people
> On 01/30/19 at 05:53pm, Bhupesh Sharma wrote:
> > With ARMv8.2-LVA and LPA architecture extensions, arm64 hardware which
> > supports these extensions can support upto 52-bit virtual and 52-bit
> > physical addresses respectively.
> >
> > Since at the moment we enable the support of these extensions via CONFIG
> > flags, e.g.
> > - LPA via CONFIG_ARM64_PA_BITS_52
> >
> > there are no clear mechanisms in user-space right now to
> > deteremine these CONFIG flag values and also determine the PARange and
> > VARange address values.
> >
> > User-space tools like 'makedumpfile' and 'crash-utility' can instead
> > use the 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' values to determine
> > the maximum virtual address and physical address (respectively)
> > supported by underlying kernel.
> >
> > A reference 'makedumpfile' implementation which uses this approach to
> > determining the maximum physical address is available in [0].
> >
> > [0].
> https://github.com/bhupesh-sharma/makedumpfile/blob/52-bit-pa-support-via-vmcore-v1/arch/arm64.c#L490
>
> I'm not objecting the patch, just want to make sure to make clear about
> things and make sure these issues are aware by people, and leave arm
> people to review the arm bits.
>
> 1. MAX_PHYSMEM_BITS
> As we previously found, back to 2014 makedumpfile took a patch to read the
> value from vmcore but the kernel patch was not accepted.
> So we should first make clear if this is really needed, why other arches
> do not need this in makedumpfile.
>
> If we really need it then should it be arm64 only?
>
> If it is arm64 only then the makedumpfile code should read this number
> only for arm64.
Sorry for the delay.
According to the kernel patch, some of arm32 platforms may need it
http://lists.infradead.org/pipermail/kexec/2014-May/011909.html
but except for them (and arm64), makedumpfile can manage with kernel
version and some switches to determine this value so far.
>
> Also Lianbo added the vmcoreinfo documents, I believe it stays in -tip
> tree, need to make sure to document this as well.
>
> 2. MAX_USER_VA_BITS
> Does makedumpfile care about userspace VA bits? I do not see other code
> doing this, Kazu and Dave A should be able to comment.
The mapping makedumpfile uses on arm64 is swapper_pg_dir only, so
unless the config affects its structure or something, makedumpfile
will not need this value.
Thanks,
Kazu
>
> I tend to doubt about this.
>
> >
> > Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
> > Cc: Mark Rutland <mark.rutland@arm.com>
> > Cc: Will Deacon <will.deacon@arm.com>
> > Cc: James Morse <james.morse@arm.com>
> > Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com>
> > ---
> > arch/arm64/kernel/crash_core.c | 2 ++
> > 1 file changed, 2 insertions(+)
> >
> > diff --git a/arch/arm64/kernel/crash_core.c b/arch/arm64/kernel/crash_core.c
> > index ca4c3e12d8c5..ad231be5c0d8 100644
> > --- a/arch/arm64/kernel/crash_core.c
> > +++ b/arch/arm64/kernel/crash_core.c
> > @@ -10,6 +10,8 @@
> > void arch_crash_save_vmcoreinfo(void)
> > {
> > VMCOREINFO_NUMBER(VA_BITS);
> > + VMCOREINFO_NUMBER(MAX_USER_VA_BITS);
> > + VMCOREINFO_NUMBER(MAX_PHYSMEM_BITS);
> > /* Please note VMCOREINFO_NUMBER() uses "%d", not "%x" */
> > vmcoreinfo_append_str("NUMBER(kimage_voffset)=0x%llx\n",
> > kimage_voffset);
> > --
> > 2.7.4
> >
> >
> > _______________________________________________
> > kexec mailing list
> > kexec@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/kexec
>
> Thanks
> Dave
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH] arm64, vmcoreinfo : Append 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' to vmcoreinfo
2019-02-04 16:04 ` Kazuhito Hagio
@ 2019-02-12 5:07 ` Bhupesh Sharma
-1 siblings, 0 replies; 58+ messages in thread
From: Bhupesh Sharma @ 2019-02-12 5:07 UTC (permalink / raw)
To: Kazuhito Hagio, Dave Young
Cc: Mark Rutland, lijiang, bhe, ard.biesheuvel, catalin.marinas,
kexec, Will Deacon, AKASHI Takahiro, James Morse,
Borislav Petkov, anderson, linux-arm-kernel
Hi Kazu,
On 02/04/2019 09:34 PM, Kazuhito Hagio wrote:
> On 1/30/2019 8:48 PM, Dave Young wrote:
>> + more people
>> On 01/30/19 at 05:53pm, Bhupesh Sharma wrote:
>>> With ARMv8.2-LVA and LPA architecture extensions, arm64 hardware which
>>> supports these extensions can support upto 52-bit virtual and 52-bit
>>> physical addresses respectively.
>>>
>>> Since at the moment we enable the support of these extensions via CONFIG
>>> flags, e.g.
>>> - LPA via CONFIG_ARM64_PA_BITS_52
>>>
>>> there are no clear mechanisms in user-space right now to
>>> deteremine these CONFIG flag values and also determine the PARange and
>>> VARange address values.
>>>
>>> User-space tools like 'makedumpfile' and 'crash-utility' can instead
>>> use the 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' values to determine
>>> the maximum virtual address and physical address (respectively)
>>> supported by underlying kernel.
>>>
>>> A reference 'makedumpfile' implementation which uses this approach to
>>> determining the maximum physical address is available in [0].
>>>
>>> [0].
>> https://github.com/bhupesh-sharma/makedumpfile/blob/52-bit-pa-support-via-vmcore-v1/arch/arm64.c#L490
>>
>> I'm not objecting the patch, just want to make sure to make clear about
>> things and make sure these issues are aware by people, and leave arm
>> people to review the arm bits.
>>
>> 1. MAX_PHYSMEM_BITS
>> As we previously found, back to 2014 makedumpfile took a patch to read the
>> value from vmcore but the kernel patch was not accepted.
>> So we should first make clear if this is really needed, why other arches
>> do not need this in makedumpfile.
>>
>> If we really need it then should it be arm64 only?
>>
>> If it is arm64 only then the makedumpfile code should read this number
>> only for arm64.
>
> Sorry for the delay.
>
> According to the kernel patch, some of arm32 platforms may need it
> http://lists.infradead.org/pipermail/kexec/2014-May/011909.html
> but except for them (and arm64), makedumpfile can manage with kernel
> version and some switches to determine this value so far.
>
>>
>> Also Lianbo added the vmcoreinfo documents, I believe it stays in -tip
>> tree, need to make sure to document this as well.
>>
>> 2. MAX_USER_VA_BITS
>> Does makedumpfile care about userspace VA bits? I do not see other code
>> doing this, Kazu and Dave A should be able to comment.
>
> The mapping makedumpfile uses on arm64 is swapper_pg_dir only, so
> unless the config affects its structure or something, makedumpfile
> will not need this value.
I captured this case in more details while sending out the makedumpfile
enablement patch for ARMv8.2-LVA (see [0]), but here is a brief summary
on the same:
Since at the moment we enable the support of the ARMv8.2-LVA extension
for 52-bit user-space VA in the kernel via a CONFIG flags
(CONFIG_ARM64_USER_VA_BITS_52), so there are no clear mechanisms in
user-space to determine this CONFIG
flag value and use it to determine the address range values.
Since 'VA_BITS' are already exported via vmcoreinfo, if we export
'MAX_USER_VA_BITS' as well, we can use the same in user-space to check
if the 'MAX_USER_VA_BITS' value is greater than 'VA_BITS'. If yes, then
we are running a use-case where user-space is 52-bit while the
underlying kernel is still 48-bit.
The increased 'PTRS_PER_PGD' value for such cases needs to be then
calculated as is done by the underlying kernel (see
'arch/arm64/include/asm/pgtable-hwdef.h' for details):
#define PTRS_PER_PGD (1 << (MAX_USER_VA_BITS - PGDIR_SHIFT))
Also, note that 'arch/arm64/include/asm/memory.h' defines
'MAX_USER_VA_BITS' as 'VA_BITS' in case 'CONFIG_ARM64_USER_VA_BITS_52'
is set to 'n':
#ifdef CONFIG_ARM64_USER_VA_BITS_52
#define MAX_USER_VA_BITS 52
#else
#define MAX_USER_VA_BITS VA_BITS
#endif
So, makedumpfile will need this symbol exported in vmcore to make the
above determination.
[0]. http://lists.infradead.org/pipermail/kexec/2019-February/022425.html
Thanks,
Bhupesh
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH] arm64, vmcoreinfo : Append 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' to vmcoreinfo
@ 2019-02-12 5:07 ` Bhupesh Sharma
0 siblings, 0 replies; 58+ messages in thread
From: Bhupesh Sharma @ 2019-02-12 5:07 UTC (permalink / raw)
To: Kazuhito Hagio, Dave Young
Cc: Mark Rutland, lijiang, bhe, ard.biesheuvel, catalin.marinas,
kexec, Will Deacon, AKASHI Takahiro, James Morse,
Borislav Petkov, anderson, linux-arm-kernel
Hi Kazu,
On 02/04/2019 09:34 PM, Kazuhito Hagio wrote:
> On 1/30/2019 8:48 PM, Dave Young wrote:
>> + more people
>> On 01/30/19 at 05:53pm, Bhupesh Sharma wrote:
>>> With ARMv8.2-LVA and LPA architecture extensions, arm64 hardware which
>>> supports these extensions can support upto 52-bit virtual and 52-bit
>>> physical addresses respectively.
>>>
>>> Since at the moment we enable the support of these extensions via CONFIG
>>> flags, e.g.
>>> - LPA via CONFIG_ARM64_PA_BITS_52
>>>
>>> there are no clear mechanisms in user-space right now to
>>> deteremine these CONFIG flag values and also determine the PARange and
>>> VARange address values.
>>>
>>> User-space tools like 'makedumpfile' and 'crash-utility' can instead
>>> use the 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' values to determine
>>> the maximum virtual address and physical address (respectively)
>>> supported by underlying kernel.
>>>
>>> A reference 'makedumpfile' implementation which uses this approach to
>>> determining the maximum physical address is available in [0].
>>>
>>> [0].
>> https://github.com/bhupesh-sharma/makedumpfile/blob/52-bit-pa-support-via-vmcore-v1/arch/arm64.c#L490
>>
>> I'm not objecting the patch, just want to make sure to make clear about
>> things and make sure these issues are aware by people, and leave arm
>> people to review the arm bits.
>>
>> 1. MAX_PHYSMEM_BITS
>> As we previously found, back to 2014 makedumpfile took a patch to read the
>> value from vmcore but the kernel patch was not accepted.
>> So we should first make clear if this is really needed, why other arches
>> do not need this in makedumpfile.
>>
>> If we really need it then should it be arm64 only?
>>
>> If it is arm64 only then the makedumpfile code should read this number
>> only for arm64.
>
> Sorry for the delay.
>
> According to the kernel patch, some of arm32 platforms may need it
> http://lists.infradead.org/pipermail/kexec/2014-May/011909.html
> but except for them (and arm64), makedumpfile can manage with kernel
> version and some switches to determine this value so far.
>
>>
>> Also Lianbo added the vmcoreinfo documents, I believe it stays in -tip
>> tree, need to make sure to document this as well.
>>
>> 2. MAX_USER_VA_BITS
>> Does makedumpfile care about userspace VA bits? I do not see other code
>> doing this, Kazu and Dave A should be able to comment.
>
> The mapping makedumpfile uses on arm64 is swapper_pg_dir only, so
> unless the config affects its structure or something, makedumpfile
> will not need this value.
I captured this case in more details while sending out the makedumpfile
enablement patch for ARMv8.2-LVA (see [0]), but here is a brief summary
on the same:
Since at the moment we enable the support of the ARMv8.2-LVA extension
for 52-bit user-space VA in the kernel via a CONFIG flags
(CONFIG_ARM64_USER_VA_BITS_52), so there are no clear mechanisms in
user-space to determine this CONFIG
flag value and use it to determine the address range values.
Since 'VA_BITS' are already exported via vmcoreinfo, if we export
'MAX_USER_VA_BITS' as well, we can use the same in user-space to check
if the 'MAX_USER_VA_BITS' value is greater than 'VA_BITS'. If yes, then
we are running a use-case where user-space is 52-bit while the
underlying kernel is still 48-bit.
The increased 'PTRS_PER_PGD' value for such cases needs to be then
calculated as is done by the underlying kernel (see
'arch/arm64/include/asm/pgtable-hwdef.h' for details):
#define PTRS_PER_PGD (1 << (MAX_USER_VA_BITS - PGDIR_SHIFT))
Also, note that 'arch/arm64/include/asm/memory.h' defines
'MAX_USER_VA_BITS' as 'VA_BITS' in case 'CONFIG_ARM64_USER_VA_BITS_52'
is set to 'n':
#ifdef CONFIG_ARM64_USER_VA_BITS_52
#define MAX_USER_VA_BITS 52
#else
#define MAX_USER_VA_BITS VA_BITS
#endif
So, makedumpfile will need this symbol exported in vmcore to make the
above determination.
[0]. http://lists.infradead.org/pipermail/kexec/2019-February/022425.html
Thanks,
Bhupesh
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH] arm64, vmcoreinfo : Append 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' to vmcoreinfo
2019-02-12 5:07 ` Bhupesh Sharma
@ 2019-02-12 10:44 ` Dave Young
-1 siblings, 0 replies; 58+ messages in thread
From: Dave Young @ 2019-02-12 10:44 UTC (permalink / raw)
To: Bhupesh Sharma
Cc: Mark Rutland, Kazuhito Hagio, lijiang, bhe, ard.biesheuvel,
catalin.marinas, kexec, Will Deacon, AKASHI Takahiro,
James Morse, Borislav Petkov, anderson, linux-arm-kernel
On 02/12/19 at 10:37am, Bhupesh Sharma wrote:
> Hi Kazu,
>
> On 02/04/2019 09:34 PM, Kazuhito Hagio wrote:
> > On 1/30/2019 8:48 PM, Dave Young wrote:
> > > + more people
> > > On 01/30/19 at 05:53pm, Bhupesh Sharma wrote:
> > > > With ARMv8.2-LVA and LPA architecture extensions, arm64 hardware which
> > > > supports these extensions can support upto 52-bit virtual and 52-bit
> > > > physical addresses respectively.
> > > >
> > > > Since at the moment we enable the support of these extensions via CONFIG
> > > > flags, e.g.
> > > > - LPA via CONFIG_ARM64_PA_BITS_52
> > > >
> > > > there are no clear mechanisms in user-space right now to
> > > > deteremine these CONFIG flag values and also determine the PARange and
> > > > VARange address values.
> > > >
> > > > User-space tools like 'makedumpfile' and 'crash-utility' can instead
> > > > use the 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' values to determine
> > > > the maximum virtual address and physical address (respectively)
> > > > supported by underlying kernel.
> > > >
> > > > A reference 'makedumpfile' implementation which uses this approach to
> > > > determining the maximum physical address is available in [0].
> > > >
> > > > [0].
> > > https://github.com/bhupesh-sharma/makedumpfile/blob/52-bit-pa-support-via-vmcore-v1/arch/arm64.c#L490
> > >
> > > I'm not objecting the patch, just want to make sure to make clear about
> > > things and make sure these issues are aware by people, and leave arm
> > > people to review the arm bits.
> > >
> > > 1. MAX_PHYSMEM_BITS
> > > As we previously found, back to 2014 makedumpfile took a patch to read the
> > > value from vmcore but the kernel patch was not accepted.
> > > So we should first make clear if this is really needed, why other arches
> > > do not need this in makedumpfile.
> > >
> > > If we really need it then should it be arm64 only?
> > >
> > > If it is arm64 only then the makedumpfile code should read this number
> > > only for arm64.
> >
> > Sorry for the delay.
> >
> > According to the kernel patch, some of arm32 platforms may need it
> > http://lists.infradead.org/pipermail/kexec/2014-May/011909.html
> > but except for them (and arm64), makedumpfile can manage with kernel
> > version and some switches to determine this value so far.
> >
> > >
> > > Also Lianbo added the vmcoreinfo documents, I believe it stays in -tip
> > > tree, need to make sure to document this as well.
> > >
> > > 2. MAX_USER_VA_BITS
> > > Does makedumpfile care about userspace VA bits? I do not see other code
> > > doing this, Kazu and Dave A should be able to comment.
> >
> > The mapping makedumpfile uses on arm64 is swapper_pg_dir only, so
> > unless the config affects its structure or something, makedumpfile
> > will not need this value.
>
> I captured this case in more details while sending out the makedumpfile
> enablement patch for ARMv8.2-LVA (see [0]), but here is a brief summary on
> the same:
>
> Since at the moment we enable the support of the ARMv8.2-LVA extension for
> 52-bit user-space VA in the kernel via a CONFIG flags
> (CONFIG_ARM64_USER_VA_BITS_52), so there are no clear mechanisms in
> user-space to determine this CONFIG
> flag value and use it to determine the address range values.
>
> Since 'VA_BITS' are already exported via vmcoreinfo, if we export
> 'MAX_USER_VA_BITS' as well, we can use the same in user-space to check if
> the 'MAX_USER_VA_BITS' value is greater than 'VA_BITS'. If yes, then we are
> running a use-case where user-space is 52-bit while the underlying kernel is
> still 48-bit.
Problem is why this is needed, it sounds like you are talking about some
non-exist use case.
>
> The increased 'PTRS_PER_PGD' value for such cases needs to be then
> calculated as is done by the underlying kernel (see
> 'arch/arm64/include/asm/pgtable-hwdef.h' for details):
>
> #define PTRS_PER_PGD (1 << (MAX_USER_VA_BITS - PGDIR_SHIFT))
>
> Also, note that 'arch/arm64/include/asm/memory.h' defines 'MAX_USER_VA_BITS'
> as 'VA_BITS' in case 'CONFIG_ARM64_USER_VA_BITS_52' is set to 'n':
>
> #ifdef CONFIG_ARM64_USER_VA_BITS_52
> #define MAX_USER_VA_BITS 52
> #else
> #define MAX_USER_VA_BITS VA_BITS
> #endif
>
> So, makedumpfile will need this symbol exported in vmcore to make the above
> determination.
>
> [0]. http://lists.infradead.org/pipermail/kexec/2019-February/022425.html
>
> Thanks,
> Bhupesh
Thanks
Dave
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH] arm64, vmcoreinfo : Append 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' to vmcoreinfo
@ 2019-02-12 10:44 ` Dave Young
0 siblings, 0 replies; 58+ messages in thread
From: Dave Young @ 2019-02-12 10:44 UTC (permalink / raw)
To: Bhupesh Sharma
Cc: Mark Rutland, Kazuhito Hagio, lijiang, bhe, ard.biesheuvel,
catalin.marinas, kexec, Will Deacon, AKASHI Takahiro,
James Morse, Borislav Petkov, anderson, linux-arm-kernel
On 02/12/19 at 10:37am, Bhupesh Sharma wrote:
> Hi Kazu,
>
> On 02/04/2019 09:34 PM, Kazuhito Hagio wrote:
> > On 1/30/2019 8:48 PM, Dave Young wrote:
> > > + more people
> > > On 01/30/19 at 05:53pm, Bhupesh Sharma wrote:
> > > > With ARMv8.2-LVA and LPA architecture extensions, arm64 hardware which
> > > > supports these extensions can support upto 52-bit virtual and 52-bit
> > > > physical addresses respectively.
> > > >
> > > > Since at the moment we enable the support of these extensions via CONFIG
> > > > flags, e.g.
> > > > - LPA via CONFIG_ARM64_PA_BITS_52
> > > >
> > > > there are no clear mechanisms in user-space right now to
> > > > deteremine these CONFIG flag values and also determine the PARange and
> > > > VARange address values.
> > > >
> > > > User-space tools like 'makedumpfile' and 'crash-utility' can instead
> > > > use the 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' values to determine
> > > > the maximum virtual address and physical address (respectively)
> > > > supported by underlying kernel.
> > > >
> > > > A reference 'makedumpfile' implementation which uses this approach to
> > > > determining the maximum physical address is available in [0].
> > > >
> > > > [0].
> > > https://github.com/bhupesh-sharma/makedumpfile/blob/52-bit-pa-support-via-vmcore-v1/arch/arm64.c#L490
> > >
> > > I'm not objecting the patch, just want to make sure to make clear about
> > > things and make sure these issues are aware by people, and leave arm
> > > people to review the arm bits.
> > >
> > > 1. MAX_PHYSMEM_BITS
> > > As we previously found, back to 2014 makedumpfile took a patch to read the
> > > value from vmcore but the kernel patch was not accepted.
> > > So we should first make clear if this is really needed, why other arches
> > > do not need this in makedumpfile.
> > >
> > > If we really need it then should it be arm64 only?
> > >
> > > If it is arm64 only then the makedumpfile code should read this number
> > > only for arm64.
> >
> > Sorry for the delay.
> >
> > According to the kernel patch, some of arm32 platforms may need it
> > http://lists.infradead.org/pipermail/kexec/2014-May/011909.html
> > but except for them (and arm64), makedumpfile can manage with kernel
> > version and some switches to determine this value so far.
> >
> > >
> > > Also Lianbo added the vmcoreinfo documents, I believe it stays in -tip
> > > tree, need to make sure to document this as well.
> > >
> > > 2. MAX_USER_VA_BITS
> > > Does makedumpfile care about userspace VA bits? I do not see other code
> > > doing this, Kazu and Dave A should be able to comment.
> >
> > The mapping makedumpfile uses on arm64 is swapper_pg_dir only, so
> > unless the config affects its structure or something, makedumpfile
> > will not need this value.
>
> I captured this case in more details while sending out the makedumpfile
> enablement patch for ARMv8.2-LVA (see [0]), but here is a brief summary on
> the same:
>
> Since at the moment we enable the support of the ARMv8.2-LVA extension for
> 52-bit user-space VA in the kernel via a CONFIG flags
> (CONFIG_ARM64_USER_VA_BITS_52), so there are no clear mechanisms in
> user-space to determine this CONFIG
> flag value and use it to determine the address range values.
>
> Since 'VA_BITS' are already exported via vmcoreinfo, if we export
> 'MAX_USER_VA_BITS' as well, we can use the same in user-space to check if
> the 'MAX_USER_VA_BITS' value is greater than 'VA_BITS'. If yes, then we are
> running a use-case where user-space is 52-bit while the underlying kernel is
> still 48-bit.
Problem is why this is needed, it sounds like you are talking about some
non-exist use case.
>
> The increased 'PTRS_PER_PGD' value for such cases needs to be then
> calculated as is done by the underlying kernel (see
> 'arch/arm64/include/asm/pgtable-hwdef.h' for details):
>
> #define PTRS_PER_PGD (1 << (MAX_USER_VA_BITS - PGDIR_SHIFT))
>
> Also, note that 'arch/arm64/include/asm/memory.h' defines 'MAX_USER_VA_BITS'
> as 'VA_BITS' in case 'CONFIG_ARM64_USER_VA_BITS_52' is set to 'n':
>
> #ifdef CONFIG_ARM64_USER_VA_BITS_52
> #define MAX_USER_VA_BITS 52
> #else
> #define MAX_USER_VA_BITS VA_BITS
> #endif
>
> So, makedumpfile will need this symbol exported in vmcore to make the above
> determination.
>
> [0]. http://lists.infradead.org/pipermail/kexec/2019-February/022425.html
>
> Thanks,
> Bhupesh
Thanks
Dave
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH] arm64, vmcoreinfo : Append 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' to vmcoreinfo
2019-02-12 10:44 ` Dave Young
@ 2019-02-12 19:59 ` Bhupesh Sharma
-1 siblings, 0 replies; 58+ messages in thread
From: Bhupesh Sharma @ 2019-02-12 19:59 UTC (permalink / raw)
To: Dave Young
Cc: Mark Rutland, Kazuhito Hagio, lijiang, bhe, ard.biesheuvel,
catalin.marinas, kexec, Will Deacon, AKASHI Takahiro,
James Morse, Borislav Petkov, anderson, linux-arm-kernel
On Tue, Feb 12, 2019 at 4:14 PM Dave Young <dyoung@redhat.com> wrote:
>
> On 02/12/19 at 10:37am, Bhupesh Sharma wrote:
> > Hi Kazu,
> >
> > On 02/04/2019 09:34 PM, Kazuhito Hagio wrote:
> > > On 1/30/2019 8:48 PM, Dave Young wrote:
> > > > + more people
> > > > On 01/30/19 at 05:53pm, Bhupesh Sharma wrote:
> > > > > With ARMv8.2-LVA and LPA architecture extensions, arm64 hardware which
> > > > > supports these extensions can support upto 52-bit virtual and 52-bit
> > > > > physical addresses respectively.
> > > > >
> > > > > Since at the moment we enable the support of these extensions via CONFIG
> > > > > flags, e.g.
> > > > > - LPA via CONFIG_ARM64_PA_BITS_52
> > > > >
> > > > > there are no clear mechanisms in user-space right now to
> > > > > deteremine these CONFIG flag values and also determine the PARange and
> > > > > VARange address values.
> > > > >
> > > > > User-space tools like 'makedumpfile' and 'crash-utility' can instead
> > > > > use the 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' values to determine
> > > > > the maximum virtual address and physical address (respectively)
> > > > > supported by underlying kernel.
> > > > >
> > > > > A reference 'makedumpfile' implementation which uses this approach to
> > > > > determining the maximum physical address is available in [0].
> > > > >
> > > > > [0].
> > > > https://github.com/bhupesh-sharma/makedumpfile/blob/52-bit-pa-support-via-vmcore-v1/arch/arm64.c#L490
> > > >
> > > > I'm not objecting the patch, just want to make sure to make clear about
> > > > things and make sure these issues are aware by people, and leave arm
> > > > people to review the arm bits.
> > > >
> > > > 1. MAX_PHYSMEM_BITS
> > > > As we previously found, back to 2014 makedumpfile took a patch to read the
> > > > value from vmcore but the kernel patch was not accepted.
> > > > So we should first make clear if this is really needed, why other arches
> > > > do not need this in makedumpfile.
> > > >
> > > > If we really need it then should it be arm64 only?
> > > >
> > > > If it is arm64 only then the makedumpfile code should read this number
> > > > only for arm64.
> > >
> > > Sorry for the delay.
> > >
> > > According to the kernel patch, some of arm32 platforms may need it
> > > http://lists.infradead.org/pipermail/kexec/2014-May/011909.html
> > > but except for them (and arm64), makedumpfile can manage with kernel
> > > version and some switches to determine this value so far.
> > >
> > > >
> > > > Also Lianbo added the vmcoreinfo documents, I believe it stays in -tip
> > > > tree, need to make sure to document this as well.
> > > >
> > > > 2. MAX_USER_VA_BITS
> > > > Does makedumpfile care about userspace VA bits? I do not see other code
> > > > doing this, Kazu and Dave A should be able to comment.
> > >
> > > The mapping makedumpfile uses on arm64 is swapper_pg_dir only, so
> > > unless the config affects its structure or something, makedumpfile
> > > will not need this value.
> >
> > I captured this case in more details while sending out the makedumpfile
> > enablement patch for ARMv8.2-LVA (see [0]), but here is a brief summary on
> > the same:
> >
> > Since at the moment we enable the support of the ARMv8.2-LVA extension for
> > 52-bit user-space VA in the kernel via a CONFIG flags
> > (CONFIG_ARM64_USER_VA_BITS_52), so there are no clear mechanisms in
> > user-space to determine this CONFIG
> > flag value and use it to determine the address range values.
> >
> > Since 'VA_BITS' are already exported via vmcoreinfo, if we export
> > 'MAX_USER_VA_BITS' as well, we can use the same in user-space to check if
> > the 'MAX_USER_VA_BITS' value is greater than 'VA_BITS'. If yes, then we are
> > running a use-case where user-space is 52-bit while the underlying kernel is
> > still 48-bit.
>
> Problem is why this is needed, it sounds like you are talking about some
> non-exist use case.
I already explained this in an earlier reply to your initial comments
(see <http://lists.infradead.org/pipermail/kexec/2019-January/022395.html>).
Perhaps you missed reading it, so here are the architecturally
possible and kernel supported uses cases with ARMv8.2 extensions
(depending on the combination of CONFIG flags and kernel version):
- 48-bit kernel VA + 48-bit user-space VA + 52-bit PA
- 48-bit kernel VA + 52-bit user-space VA + 52-bit PA
- 52-bit kernel VA + 52-bit user-space VA + 52-bit PA
Please see 'config ARM64_USER_VA_BITS_52' help text inside
'arch/arm64/Kconfig' for more details:
config ARM64_USER_VA_BITS_52
bool "52-bit (user)"
depends on ARM64_64K_PAGES && (ARM64_PAN || !ARM64_SW_TTBR0_PAN)
help
Enable 52-bit virtual addressing for userspace when explicitly
requested via a hint to mmap(). The kernel will continue to
use 48-bit virtual addresses for its own mappings.
BTW, in the makedumpfile enablement patch thread for ARMv8.2 LVA
(which I sent out for 52-bit User space VA enablement) (see [0]), Kazu
mentioned that the changes look necessary.
[0]. http://lists.infradead.org/pipermail/kexec/2019-February/022431.html
Thanks,
Bhupesh
> >
> > The increased 'PTRS_PER_PGD' value for such cases needs to be then
> > calculated as is done by the underlying kernel (see
> > 'arch/arm64/include/asm/pgtable-hwdef.h' for details):
> >
> > #define PTRS_PER_PGD (1 << (MAX_USER_VA_BITS - PGDIR_SHIFT))
> >
> > Also, note that 'arch/arm64/include/asm/memory.h' defines 'MAX_USER_VA_BITS'
> > as 'VA_BITS' in case 'CONFIG_ARM64_USER_VA_BITS_52' is set to 'n':
> >
> > #ifdef CONFIG_ARM64_USER_VA_BITS_52
> > #define MAX_USER_VA_BITS 52
> > #else
> > #define MAX_USER_VA_BITS VA_BITS
> > #endif
> >
> > So, makedumpfile will need this symbol exported in vmcore to make the above
> > determination.
> >
> > [0]. http://lists.infradead.org/pipermail/kexec/2019-February/022425.html
> >
> > Thanks,
> > Bhupesh
>
> Thanks
> Dave
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH] arm64, vmcoreinfo : Append 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' to vmcoreinfo
@ 2019-02-12 19:59 ` Bhupesh Sharma
0 siblings, 0 replies; 58+ messages in thread
From: Bhupesh Sharma @ 2019-02-12 19:59 UTC (permalink / raw)
To: Dave Young
Cc: Mark Rutland, Kazuhito Hagio, lijiang, bhe, ard.biesheuvel,
catalin.marinas, kexec, Will Deacon, AKASHI Takahiro,
James Morse, Borislav Petkov, anderson, linux-arm-kernel
On Tue, Feb 12, 2019 at 4:14 PM Dave Young <dyoung@redhat.com> wrote:
>
> On 02/12/19 at 10:37am, Bhupesh Sharma wrote:
> > Hi Kazu,
> >
> > On 02/04/2019 09:34 PM, Kazuhito Hagio wrote:
> > > On 1/30/2019 8:48 PM, Dave Young wrote:
> > > > + more people
> > > > On 01/30/19 at 05:53pm, Bhupesh Sharma wrote:
> > > > > With ARMv8.2-LVA and LPA architecture extensions, arm64 hardware which
> > > > > supports these extensions can support upto 52-bit virtual and 52-bit
> > > > > physical addresses respectively.
> > > > >
> > > > > Since at the moment we enable the support of these extensions via CONFIG
> > > > > flags, e.g.
> > > > > - LPA via CONFIG_ARM64_PA_BITS_52
> > > > >
> > > > > there are no clear mechanisms in user-space right now to
> > > > > deteremine these CONFIG flag values and also determine the PARange and
> > > > > VARange address values.
> > > > >
> > > > > User-space tools like 'makedumpfile' and 'crash-utility' can instead
> > > > > use the 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' values to determine
> > > > > the maximum virtual address and physical address (respectively)
> > > > > supported by underlying kernel.
> > > > >
> > > > > A reference 'makedumpfile' implementation which uses this approach to
> > > > > determining the maximum physical address is available in [0].
> > > > >
> > > > > [0].
> > > > https://github.com/bhupesh-sharma/makedumpfile/blob/52-bit-pa-support-via-vmcore-v1/arch/arm64.c#L490
> > > >
> > > > I'm not objecting the patch, just want to make sure to make clear about
> > > > things and make sure these issues are aware by people, and leave arm
> > > > people to review the arm bits.
> > > >
> > > > 1. MAX_PHYSMEM_BITS
> > > > As we previously found, back to 2014 makedumpfile took a patch to read the
> > > > value from vmcore but the kernel patch was not accepted.
> > > > So we should first make clear if this is really needed, why other arches
> > > > do not need this in makedumpfile.
> > > >
> > > > If we really need it then should it be arm64 only?
> > > >
> > > > If it is arm64 only then the makedumpfile code should read this number
> > > > only for arm64.
> > >
> > > Sorry for the delay.
> > >
> > > According to the kernel patch, some of arm32 platforms may need it
> > > http://lists.infradead.org/pipermail/kexec/2014-May/011909.html
> > > but except for them (and arm64), makedumpfile can manage with kernel
> > > version and some switches to determine this value so far.
> > >
> > > >
> > > > Also Lianbo added the vmcoreinfo documents, I believe it stays in -tip
> > > > tree, need to make sure to document this as well.
> > > >
> > > > 2. MAX_USER_VA_BITS
> > > > Does makedumpfile care about userspace VA bits? I do not see other code
> > > > doing this, Kazu and Dave A should be able to comment.
> > >
> > > The mapping makedumpfile uses on arm64 is swapper_pg_dir only, so
> > > unless the config affects its structure or something, makedumpfile
> > > will not need this value.
> >
> > I captured this case in more details while sending out the makedumpfile
> > enablement patch for ARMv8.2-LVA (see [0]), but here is a brief summary on
> > the same:
> >
> > Since at the moment we enable the support of the ARMv8.2-LVA extension for
> > 52-bit user-space VA in the kernel via a CONFIG flags
> > (CONFIG_ARM64_USER_VA_BITS_52), so there are no clear mechanisms in
> > user-space to determine this CONFIG
> > flag value and use it to determine the address range values.
> >
> > Since 'VA_BITS' are already exported via vmcoreinfo, if we export
> > 'MAX_USER_VA_BITS' as well, we can use the same in user-space to check if
> > the 'MAX_USER_VA_BITS' value is greater than 'VA_BITS'. If yes, then we are
> > running a use-case where user-space is 52-bit while the underlying kernel is
> > still 48-bit.
>
> Problem is why this is needed, it sounds like you are talking about some
> non-exist use case.
I already explained this in an earlier reply to your initial comments
(see <http://lists.infradead.org/pipermail/kexec/2019-January/022395.html>).
Perhaps you missed reading it, so here are the architecturally
possible and kernel supported uses cases with ARMv8.2 extensions
(depending on the combination of CONFIG flags and kernel version):
- 48-bit kernel VA + 48-bit user-space VA + 52-bit PA
- 48-bit kernel VA + 52-bit user-space VA + 52-bit PA
- 52-bit kernel VA + 52-bit user-space VA + 52-bit PA
Please see 'config ARM64_USER_VA_BITS_52' help text inside
'arch/arm64/Kconfig' for more details:
config ARM64_USER_VA_BITS_52
bool "52-bit (user)"
depends on ARM64_64K_PAGES && (ARM64_PAN || !ARM64_SW_TTBR0_PAN)
help
Enable 52-bit virtual addressing for userspace when explicitly
requested via a hint to mmap(). The kernel will continue to
use 48-bit virtual addresses for its own mappings.
BTW, in the makedumpfile enablement patch thread for ARMv8.2 LVA
(which I sent out for 52-bit User space VA enablement) (see [0]), Kazu
mentioned that the changes look necessary.
[0]. http://lists.infradead.org/pipermail/kexec/2019-February/022431.html
Thanks,
Bhupesh
> >
> > The increased 'PTRS_PER_PGD' value for such cases needs to be then
> > calculated as is done by the underlying kernel (see
> > 'arch/arm64/include/asm/pgtable-hwdef.h' for details):
> >
> > #define PTRS_PER_PGD (1 << (MAX_USER_VA_BITS - PGDIR_SHIFT))
> >
> > Also, note that 'arch/arm64/include/asm/memory.h' defines 'MAX_USER_VA_BITS'
> > as 'VA_BITS' in case 'CONFIG_ARM64_USER_VA_BITS_52' is set to 'n':
> >
> > #ifdef CONFIG_ARM64_USER_VA_BITS_52
> > #define MAX_USER_VA_BITS 52
> > #else
> > #define MAX_USER_VA_BITS VA_BITS
> > #endif
> >
> > So, makedumpfile will need this symbol exported in vmcore to make the above
> > determination.
> >
> > [0]. http://lists.infradead.org/pipermail/kexec/2019-February/022425.html
> >
> > Thanks,
> > Bhupesh
>
> Thanks
> Dave
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
^ permalink raw reply [flat|nested] 58+ messages in thread
* RE: [PATCH] arm64, vmcoreinfo : Append 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' to vmcoreinfo
2019-02-12 19:59 ` Bhupesh Sharma
@ 2019-02-12 23:03 ` Kazuhito Hagio
-1 siblings, 0 replies; 58+ messages in thread
From: Kazuhito Hagio @ 2019-02-12 23:03 UTC (permalink / raw)
To: Bhupesh Sharma, Dave Young
Cc: Mark Rutland, lijiang, bhe, ard.biesheuvel, catalin.marinas,
kexec, Will Deacon, AKASHI Takahiro, James Morse,
Borislav Petkov, anderson, linux-arm-kernel
On 2/12/2019 2:59 PM, Bhupesh Sharma wrote:
> BTW, in the makedumpfile enablement patch thread for ARMv8.2 LVA
> (which I sent out for 52-bit User space VA enablement) (see [0]), Kazu
> mentioned that the changes look necessary.
>
> [0]. http://lists.infradead.org/pipermail/kexec/2019-February/022431.html
> > > The increased 'PTRS_PER_PGD' value for such cases needs to be then
> > > calculated as is done by the underlying kernel (see
> > > 'arch/arm64/include/asm/pgtable-hwdef.h' for details):
> > >
> > > #define PTRS_PER_PGD (1 << (MAX_USER_VA_BITS - PGDIR_SHIFT))
Yes, this is the reason why makedumpfile needs the MAX_USER_VA_BITS.
It is used for pgd_index() also in makedumpfile to walk page tables.
/* to find an entry in a page-table-directory */
#define pgd_index(addr) (((addr) >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1))
Thanks,
Kazu
> > >
> > > Also, note that 'arch/arm64/include/asm/memory.h' defines 'MAX_USER_VA_BITS'
> > > as 'VA_BITS' in case 'CONFIG_ARM64_USER_VA_BITS_52' is set to 'n':
> > >
> > > #ifdef CONFIG_ARM64_USER_VA_BITS_52
> > > #define MAX_USER_VA_BITS 52
> > > #else
> > > #define MAX_USER_VA_BITS VA_BITS
> > > #endif
> > >
> > > So, makedumpfile will need this symbol exported in vmcore to make the above
> > > determination.
> > >
> > > [0]. http://lists.infradead.org/pipermail/kexec/2019-February/022425.html
> > >
> > > Thanks,
> > > Bhupesh
> >
> > Thanks
> > Dave
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 58+ messages in thread
* RE: [PATCH] arm64, vmcoreinfo : Append 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' to vmcoreinfo
@ 2019-02-12 23:03 ` Kazuhito Hagio
0 siblings, 0 replies; 58+ messages in thread
From: Kazuhito Hagio @ 2019-02-12 23:03 UTC (permalink / raw)
To: Bhupesh Sharma, Dave Young
Cc: Mark Rutland, lijiang, bhe, ard.biesheuvel, catalin.marinas,
kexec, Will Deacon, AKASHI Takahiro, James Morse,
Borislav Petkov, anderson, linux-arm-kernel
On 2/12/2019 2:59 PM, Bhupesh Sharma wrote:
> BTW, in the makedumpfile enablement patch thread for ARMv8.2 LVA
> (which I sent out for 52-bit User space VA enablement) (see [0]), Kazu
> mentioned that the changes look necessary.
>
> [0]. http://lists.infradead.org/pipermail/kexec/2019-February/022431.html
> > > The increased 'PTRS_PER_PGD' value for such cases needs to be then
> > > calculated as is done by the underlying kernel (see
> > > 'arch/arm64/include/asm/pgtable-hwdef.h' for details):
> > >
> > > #define PTRS_PER_PGD (1 << (MAX_USER_VA_BITS - PGDIR_SHIFT))
Yes, this is the reason why makedumpfile needs the MAX_USER_VA_BITS.
It is used for pgd_index() also in makedumpfile to walk page tables.
/* to find an entry in a page-table-directory */
#define pgd_index(addr) (((addr) >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1))
Thanks,
Kazu
> > >
> > > Also, note that 'arch/arm64/include/asm/memory.h' defines 'MAX_USER_VA_BITS'
> > > as 'VA_BITS' in case 'CONFIG_ARM64_USER_VA_BITS_52' is set to 'n':
> > >
> > > #ifdef CONFIG_ARM64_USER_VA_BITS_52
> > > #define MAX_USER_VA_BITS 52
> > > #else
> > > #define MAX_USER_VA_BITS VA_BITS
> > > #endif
> > >
> > > So, makedumpfile will need this symbol exported in vmcore to make the above
> > > determination.
> > >
> > > [0]. http://lists.infradead.org/pipermail/kexec/2019-February/022425.html
> > >
> > > Thanks,
> > > Bhupesh
> >
> > Thanks
> > Dave
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH] arm64, vmcoreinfo : Append 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' to vmcoreinfo
2019-02-12 23:03 ` Kazuhito Hagio
@ 2019-02-13 11:15 ` Dave Young
-1 siblings, 0 replies; 58+ messages in thread
From: Dave Young @ 2019-02-13 11:15 UTC (permalink / raw)
To: Kazuhito Hagio, anderson
Cc: Mark Rutland, lijiang, bhe, ard.biesheuvel, catalin.marinas,
Bhupesh Sharma, kexec, Will Deacon, AKASHI Takahiro, James Morse,
Borislav Petkov, anderson, linux-arm-kernel
On 02/12/19 at 11:03pm, Kazuhito Hagio wrote:
>
> On 2/12/2019 2:59 PM, Bhupesh Sharma wrote:
> > BTW, in the makedumpfile enablement patch thread for ARMv8.2 LVA
> > (which I sent out for 52-bit User space VA enablement) (see [0]), Kazu
> > mentioned that the changes look necessary.
> >
> > [0]. http://lists.infradead.org/pipermail/kexec/2019-February/022431.html
>
> > > > The increased 'PTRS_PER_PGD' value for such cases needs to be then
> > > > calculated as is done by the underlying kernel (see
> > > > 'arch/arm64/include/asm/pgtable-hwdef.h' for details):
> > > >
> > > > #define PTRS_PER_PGD (1 << (MAX_USER_VA_BITS - PGDIR_SHIFT))
>
> Yes, this is the reason why makedumpfile needs the MAX_USER_VA_BITS.
> It is used for pgd_index() also in makedumpfile to walk page tables.
>
> /* to find an entry in a page-table-directory */
> #define pgd_index(addr) (((addr) >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1))
Since Dave mentioned crash tool does not need it, but crash should also
travel the pg tables.
If this is really necessary it would be good to describe what will
happen without the patch, eg. some user visible error from an actual test etc.
>
> Thanks,
> Kazu
>
> > > >
> > > > Also, note that 'arch/arm64/include/asm/memory.h' defines 'MAX_USER_VA_BITS'
> > > > as 'VA_BITS' in case 'CONFIG_ARM64_USER_VA_BITS_52' is set to 'n':
> > > >
> > > > #ifdef CONFIG_ARM64_USER_VA_BITS_52
> > > > #define MAX_USER_VA_BITS 52
> > > > #else
> > > > #define MAX_USER_VA_BITS VA_BITS
> > > > #endif
> > > >
> > > > So, makedumpfile will need this symbol exported in vmcore to make the above
> > > > determination.
> > > >
> > > > [0]. http://lists.infradead.org/pipermail/kexec/2019-February/022425.html
> > > >
> > > > Thanks,
> > > > Bhupesh
> > >
> > > Thanks
> > > Dave
>
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH] arm64, vmcoreinfo : Append 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' to vmcoreinfo
@ 2019-02-13 11:15 ` Dave Young
0 siblings, 0 replies; 58+ messages in thread
From: Dave Young @ 2019-02-13 11:15 UTC (permalink / raw)
To: Kazuhito Hagio, anderson
Cc: Mark Rutland, lijiang, bhe, ard.biesheuvel, catalin.marinas,
Bhupesh Sharma, kexec, Will Deacon, AKASHI Takahiro, James Morse,
Borislav Petkov, linux-arm-kernel
On 02/12/19 at 11:03pm, Kazuhito Hagio wrote:
>
> On 2/12/2019 2:59 PM, Bhupesh Sharma wrote:
> > BTW, in the makedumpfile enablement patch thread for ARMv8.2 LVA
> > (which I sent out for 52-bit User space VA enablement) (see [0]), Kazu
> > mentioned that the changes look necessary.
> >
> > [0]. http://lists.infradead.org/pipermail/kexec/2019-February/022431.html
>
> > > > The increased 'PTRS_PER_PGD' value for such cases needs to be then
> > > > calculated as is done by the underlying kernel (see
> > > > 'arch/arm64/include/asm/pgtable-hwdef.h' for details):
> > > >
> > > > #define PTRS_PER_PGD (1 << (MAX_USER_VA_BITS - PGDIR_SHIFT))
>
> Yes, this is the reason why makedumpfile needs the MAX_USER_VA_BITS.
> It is used for pgd_index() also in makedumpfile to walk page tables.
>
> /* to find an entry in a page-table-directory */
> #define pgd_index(addr) (((addr) >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1))
Since Dave mentioned crash tool does not need it, but crash should also
travel the pg tables.
If this is really necessary it would be good to describe what will
happen without the patch, eg. some user visible error from an actual test etc.
>
> Thanks,
> Kazu
>
> > > >
> > > > Also, note that 'arch/arm64/include/asm/memory.h' defines 'MAX_USER_VA_BITS'
> > > > as 'VA_BITS' in case 'CONFIG_ARM64_USER_VA_BITS_52' is set to 'n':
> > > >
> > > > #ifdef CONFIG_ARM64_USER_VA_BITS_52
> > > > #define MAX_USER_VA_BITS 52
> > > > #else
> > > > #define MAX_USER_VA_BITS VA_BITS
> > > > #endif
> > > >
> > > > So, makedumpfile will need this symbol exported in vmcore to make the above
> > > > determination.
> > > >
> > > > [0]. http://lists.infradead.org/pipermail/kexec/2019-February/022425.html
> > > >
> > > > Thanks,
> > > > Bhupesh
> > >
> > > Thanks
> > > Dave
>
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH] arm64, vmcoreinfo : Append 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' to vmcoreinfo
2019-02-13 11:15 ` Dave Young
@ 2019-02-13 18:22 ` James Morse
-1 siblings, 0 replies; 58+ messages in thread
From: James Morse @ 2019-02-13 18:22 UTC (permalink / raw)
To: Dave Young
Cc: Mark Rutland, Kazuhito Hagio, lijiang, bhe, ard.biesheuvel,
catalin.marinas, Bhupesh Sharma, kexec, Will Deacon,
AKASHI Takahiro, anderson, Borislav Petkov, linux-arm-kernel
Hi guys,
On 13/02/2019 11:15, Dave Young wrote:
> On 02/12/19 at 11:03pm, Kazuhito Hagio wrote:
>> On 2/12/2019 2:59 PM, Bhupesh Sharma wrote:
>>> BTW, in the makedumpfile enablement patch thread for ARMv8.2 LVA
>>> (which I sent out for 52-bit User space VA enablement) (see [0]), Kazu
>>> mentioned that the changes look necessary.
>>>
>>> [0]. http://lists.infradead.org/pipermail/kexec/2019-February/022431.html
>>
>>>>> The increased 'PTRS_PER_PGD' value for such cases needs to be then
>>>>> calculated as is done by the underlying kernel
Aha! Nothing to do with which-bits-are-pfn in the tables...
You need to know if the top level PGD is 512bytes or bigger. As we use a
kmem-cache the adjacent data could be some else's page tables.
Is this really a problem though? You can't pull the user-space pgd pointers out
of no-where, you must have walked some task_struct and struct_mm's to find them.
In which case you would have the VMAs on hand to tell you if its in the mapped
user range.
It would be good to avoid putting something arch-specific in here if we can at
all help it.
>>>>> (see
>>>>> 'arch/arm64/include/asm/pgtable-hwdef.h' for details):
>>>>>
>>>>> #define PTRS_PER_PGD (1 << (MAX_USER_VA_BITS - PGDIR_SHIFT))
>>
>> Yes, this is the reason why makedumpfile needs the MAX_USER_VA_BITS.
>> It is used for pgd_index() also in makedumpfile to walk page tables.
>>
>> /* to find an entry in a page-table-directory */
>> #define pgd_index(addr) (((addr) >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1))
>
> Since Dave mentioned crash tool does not need it, but crash should also
> travel the pg tables.
>
> If this is really necessary it would be good to describe what will
> happen without the patch, eg. some user visible error from an actual test etc.
Yes please, it would really help if there was a specific example we could discuss.
Thanks,
James
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH] arm64, vmcoreinfo : Append 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' to vmcoreinfo
@ 2019-02-13 18:22 ` James Morse
0 siblings, 0 replies; 58+ messages in thread
From: James Morse @ 2019-02-13 18:22 UTC (permalink / raw)
To: Dave Young
Cc: Mark Rutland, Kazuhito Hagio, lijiang, bhe, ard.biesheuvel,
catalin.marinas, Bhupesh Sharma, kexec, Will Deacon,
AKASHI Takahiro, anderson, Borislav Petkov, linux-arm-kernel
Hi guys,
On 13/02/2019 11:15, Dave Young wrote:
> On 02/12/19 at 11:03pm, Kazuhito Hagio wrote:
>> On 2/12/2019 2:59 PM, Bhupesh Sharma wrote:
>>> BTW, in the makedumpfile enablement patch thread for ARMv8.2 LVA
>>> (which I sent out for 52-bit User space VA enablement) (see [0]), Kazu
>>> mentioned that the changes look necessary.
>>>
>>> [0]. http://lists.infradead.org/pipermail/kexec/2019-February/022431.html
>>
>>>>> The increased 'PTRS_PER_PGD' value for such cases needs to be then
>>>>> calculated as is done by the underlying kernel
Aha! Nothing to do with which-bits-are-pfn in the tables...
You need to know if the top level PGD is 512bytes or bigger. As we use a
kmem-cache the adjacent data could be some else's page tables.
Is this really a problem though? You can't pull the user-space pgd pointers out
of no-where, you must have walked some task_struct and struct_mm's to find them.
In which case you would have the VMAs on hand to tell you if its in the mapped
user range.
It would be good to avoid putting something arch-specific in here if we can at
all help it.
>>>>> (see
>>>>> 'arch/arm64/include/asm/pgtable-hwdef.h' for details):
>>>>>
>>>>> #define PTRS_PER_PGD (1 << (MAX_USER_VA_BITS - PGDIR_SHIFT))
>>
>> Yes, this is the reason why makedumpfile needs the MAX_USER_VA_BITS.
>> It is used for pgd_index() also in makedumpfile to walk page tables.
>>
>> /* to find an entry in a page-table-directory */
>> #define pgd_index(addr) (((addr) >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1))
>
> Since Dave mentioned crash tool does not need it, but crash should also
> travel the pg tables.
>
> If this is really necessary it would be good to describe what will
> happen without the patch, eg. some user visible error from an actual test etc.
Yes please, it would really help if there was a specific example we could discuss.
Thanks,
James
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
^ permalink raw reply [flat|nested] 58+ messages in thread
* RE: [PATCH] arm64, vmcoreinfo : Append 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' to vmcoreinfo
2019-02-13 18:22 ` James Morse
@ 2019-02-13 19:52 ` Kazuhito Hagio
-1 siblings, 0 replies; 58+ messages in thread
From: Kazuhito Hagio @ 2019-02-13 19:52 UTC (permalink / raw)
To: James Morse, Dave Young, Bhupesh Sharma
Cc: Mark Rutland, lijiang, bhe, ard.biesheuvel, catalin.marinas,
kexec, Will Deacon, AKASHI Takahiro, anderson, Borislav Petkov,
linux-arm-kernel
On 2/13/2019 1:22 PM, James Morse wrote:
> Hi guys,
>
> On 13/02/2019 11:15, Dave Young wrote:
> > On 02/12/19 at 11:03pm, Kazuhito Hagio wrote:
> >> On 2/12/2019 2:59 PM, Bhupesh Sharma wrote:
> >>> BTW, in the makedumpfile enablement patch thread for ARMv8.2 LVA
> >>> (which I sent out for 52-bit User space VA enablement) (see [0]), Kazu
> >>> mentioned that the changes look necessary.
> >>>
> >>> [0]. http://lists.infradead.org/pipermail/kexec/2019-February/022431.html
> >>
> >>>>> The increased 'PTRS_PER_PGD' value for such cases needs to be then
> >>>>> calculated as is done by the underlying kernel
>
> Aha! Nothing to do with which-bits-are-pfn in the tables...
>
> You need to know if the top level PGD is 512bytes or bigger. As we use a
> kmem-cache the adjacent data could be some else's page tables.
>
> Is this really a problem though? You can't pull the user-space pgd pointers out
> of no-where, you must have walked some task_struct and struct_mm's to find them.
> In which case you would have the VMAs on hand to tell you if its in the mapped
> user range.
>
> It would be good to avoid putting something arch-specific in here if we can at
> all help it.
>
>
> >>>>> (see
> >>>>> 'arch/arm64/include/asm/pgtable-hwdef.h' for details):
> >>>>>
> >>>>> #define PTRS_PER_PGD (1 << (MAX_USER_VA_BITS - PGDIR_SHIFT))
> >>
> >> Yes, this is the reason why makedumpfile needs the MAX_USER_VA_BITS.
> >> It is used for pgd_index() also in makedumpfile to walk page tables.
> >>
> >> /* to find an entry in a page-table-directory */
> >> #define pgd_index(addr) (((addr) >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1))
> >
> > Since Dave mentioned crash tool does not need it, but crash should also
> > travel the pg tables.
The crash utility is always invoked with vmlinux, so it can read the
vabits_user variable directly from vmcore, but makedumpfile can not.
> > If this is really necessary it would be good to describe what will
> > happen without the patch, eg. some user visible error from an actual test etc.
>
> Yes please, it would really help if there was a specific example we could discuss.
With 52-bit user space and 48-bit kernel space configuration,
makedumpfile will not be able to convert a virtual kernel address
to a physical address, and fail to capture a dumpfile, because the
pgd_index() will return a wrong index.
But I don't have any suitable test system on hand, so have not tried
the kernel configuration actually. If found, I'll try.
Bhupesh, do you have any test result?
Thanks,
Kazu
>
>
> Thanks,
>
> James
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 58+ messages in thread
* RE: [PATCH] arm64, vmcoreinfo : Append 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' to vmcoreinfo
@ 2019-02-13 19:52 ` Kazuhito Hagio
0 siblings, 0 replies; 58+ messages in thread
From: Kazuhito Hagio @ 2019-02-13 19:52 UTC (permalink / raw)
To: James Morse, Dave Young, Bhupesh Sharma
Cc: Mark Rutland, lijiang, bhe, ard.biesheuvel, catalin.marinas,
kexec, Will Deacon, AKASHI Takahiro, anderson, Borislav Petkov,
linux-arm-kernel
On 2/13/2019 1:22 PM, James Morse wrote:
> Hi guys,
>
> On 13/02/2019 11:15, Dave Young wrote:
> > On 02/12/19 at 11:03pm, Kazuhito Hagio wrote:
> >> On 2/12/2019 2:59 PM, Bhupesh Sharma wrote:
> >>> BTW, in the makedumpfile enablement patch thread for ARMv8.2 LVA
> >>> (which I sent out for 52-bit User space VA enablement) (see [0]), Kazu
> >>> mentioned that the changes look necessary.
> >>>
> >>> [0]. http://lists.infradead.org/pipermail/kexec/2019-February/022431.html
> >>
> >>>>> The increased 'PTRS_PER_PGD' value for such cases needs to be then
> >>>>> calculated as is done by the underlying kernel
>
> Aha! Nothing to do with which-bits-are-pfn in the tables...
>
> You need to know if the top level PGD is 512bytes or bigger. As we use a
> kmem-cache the adjacent data could be some else's page tables.
>
> Is this really a problem though? You can't pull the user-space pgd pointers out
> of no-where, you must have walked some task_struct and struct_mm's to find them.
> In which case you would have the VMAs on hand to tell you if its in the mapped
> user range.
>
> It would be good to avoid putting something arch-specific in here if we can at
> all help it.
>
>
> >>>>> (see
> >>>>> 'arch/arm64/include/asm/pgtable-hwdef.h' for details):
> >>>>>
> >>>>> #define PTRS_PER_PGD (1 << (MAX_USER_VA_BITS - PGDIR_SHIFT))
> >>
> >> Yes, this is the reason why makedumpfile needs the MAX_USER_VA_BITS.
> >> It is used for pgd_index() also in makedumpfile to walk page tables.
> >>
> >> /* to find an entry in a page-table-directory */
> >> #define pgd_index(addr) (((addr) >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1))
> >
> > Since Dave mentioned crash tool does not need it, but crash should also
> > travel the pg tables.
The crash utility is always invoked with vmlinux, so it can read the
vabits_user variable directly from vmcore, but makedumpfile can not.
> > If this is really necessary it would be good to describe what will
> > happen without the patch, eg. some user visible error from an actual test etc.
>
> Yes please, it would really help if there was a specific example we could discuss.
With 52-bit user space and 48-bit kernel space configuration,
makedumpfile will not be able to convert a virtual kernel address
to a physical address, and fail to capture a dumpfile, because the
pgd_index() will return a wrong index.
But I don't have any suitable test system on hand, so have not tried
the kernel configuration actually. If found, I'll try.
Bhupesh, do you have any test result?
Thanks,
Kazu
>
>
> Thanks,
>
> James
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH] arm64, vmcoreinfo : Append 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' to vmcoreinfo
2019-02-13 19:52 ` Kazuhito Hagio
@ 2019-02-15 17:34 ` James Morse
-1 siblings, 0 replies; 58+ messages in thread
From: James Morse @ 2019-02-15 17:34 UTC (permalink / raw)
To: Kazuhito Hagio, Bhupesh Sharma
Cc: Mark Rutland, lijiang, bhe, ard.biesheuvel, catalin.marinas,
Steve Capper, kexec, Will Deacon, AKASHI Takahiro, anderson,
Kristina Martsenko, Borislav Petkov, Dave Young,
linux-arm-kernel
Hi guys,
(CC: +Steve, +Kristina) "What's the best way of letting user-space know the MMU
config when 52-bit VA and pointer-auth may be in use?"
On 13/02/2019 19:52, Kazuhito Hagio wrote:
> On 2/13/2019 1:22 PM, James Morse wrote:
>> On 13/02/2019 11:15, Dave Young wrote:
>>> On 02/12/19 at 11:03pm, Kazuhito Hagio wrote:
>>>> On 2/12/2019 2:59 PM, Bhupesh Sharma wrote:
>>>>> BTW, in the makedumpfile enablement patch thread for ARMv8.2 LVA
>>>>> (which I sent out for 52-bit User space VA enablement) (see [0]), Kazu
>>>>> mentioned that the changes look necessary.
>>>>>
>>>>> [0]. http://lists.infradead.org/pipermail/kexec/2019-February/022431.html
>>>>
>>>>>>> The increased 'PTRS_PER_PGD' value for such cases needs to be then
>>>>>>> calculated as is done by the underlying kernel
>>
>> Aha! Nothing to do with which-bits-are-pfn in the tables...
>>
>> You need to know if the top level PGD is 512bytes or bigger. As we use a
>> kmem-cache the adjacent data could be some else's page tables.
>>
>> Is this really a problem though? You can't pull the user-space pgd pointers out
>> of no-where, you must have walked some task_struct and struct_mm's to find them.
>> In which case you would have the VMAs on hand to tell you if its in the mapped
>> user range.
>>
>> It would be good to avoid putting something arch-specific in here if we can at
>> all help it.
>>>>>>> (see
>>>>>>> 'arch/arm64/include/asm/pgtable-hwdef.h' for details):
>>>>>>>
>>>>>>> #define PTRS_PER_PGD (1 << (MAX_USER_VA_BITS - PGDIR_SHIFT))
>>>>
>>>> Yes, this is the reason why makedumpfile needs the MAX_USER_VA_BITS.
>>>> It is used for pgd_index() also in makedumpfile to walk page tables.
>>>>
>>>> /* to find an entry in a page-table-directory */
>>>> #define pgd_index(addr) (((addr) >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1))
>>>
>>> Since Dave mentioned crash tool does not need it, but crash should also
>>> travel the pg tables.
>
> The crash utility is always invoked with vmlinux, so it can read the
> vabits_user variable directly from vmcore, but makedumpfile can not.
(This sounds fragile. That symbol's name may change, it may disappear
completely! ... but I guess crash changes with every kernel release anyway)
>>> If this is really necessary it would be good to describe what will
>>> happen without the patch, eg. some user visible error from an actual test etc.
>>
>> Yes please, it would really help if there was a specific example we could discuss.
>
> With 52-bit user space and 48-bit kernel space configuration,
> makedumpfile will not be able to convert a virtual kernel address
> to a physical address, and fail to capture a dumpfile, because the
> pgd_index() will return a wrong index.
Got it, thanks!
(all this user stuff had me thinking it was user-space you were trying to walk).
Yes, this is because of commit e842dfb5a2d3 ("arm64: mm: Offset TTBR1 to allow
52-bit PTRS_PER_PGD"). The kernel has offset the ttbr1 value, if you try and
walk it without knowing the offset you get junk.
Ideally we tell you the offset with some 'ttbr1_offset=' in vmcoreinfo, but if
the offsetting code disappears, the kernel would still have to provide
'ttbr1_offset=0' for user-space to keep working.
I'd like to find something future-proof that always has an unambiguous meaning,
and isn't a problem if the kernel variable/symbol/kconfig names change.
With pointer-auth in use too you can't guess which bits are address and which
bits are data.
Taking arch-specific to its extreme, we could expose TCR_EL1, but this is a
problem if we ever switch that per task (some new bits may turn up with a new
feature). Some of those bits vary per cpu too, so we'd have to mask them out in
case user-space tries to conclude something from them.
My current best suggestion is to export:
from core code:
* USER_MMAP_END, the maximum value a user-space can try and mmap().
This would normally be TASK_SIZE, but x86 and powerpc also have support for
larger VA space, and its plumbed into mm slightly differently. We should have
one arch-independent property that covers all these. On arm64 this would be the
runtime va bits for user-space's TTBR. (This assumes the value isn't per-task)
arch specific:
* ARM64_TCR.T1SZ, the va bits mapped by the kernel's TTBR. (We can assume we'll
never flip user/kernel space). This has to be arch specific, it will always have
a value and its meaning comes from the ARM-ARM (so linux can't change it in the
future). It should be the same on every CPU.
* ARM64_TTBR1.BADDR, the pa of the kernel page tables, which implicitly has the
offset. Again this always has a value, and its meaning comes from the ARM-ARM.
If we ever get clever with different page-tables/TCR values on different CPUs,
these two should come from the same CPU.
I think this gives you what you need if user/kernel may both be using
pointer-auth and both may be using 52-bit va. I'm pretty sure the 48:52 bits can
be picked at boot time depending on the kernel kconfig and the hardware support.
Does anyone have a better idea? (or a corner where this won't work?)
Thanks,
James
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH] arm64, vmcoreinfo : Append 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' to vmcoreinfo
@ 2019-02-15 17:34 ` James Morse
0 siblings, 0 replies; 58+ messages in thread
From: James Morse @ 2019-02-15 17:34 UTC (permalink / raw)
To: Kazuhito Hagio, Bhupesh Sharma
Cc: Mark Rutland, lijiang, bhe, ard.biesheuvel, catalin.marinas,
Steve Capper, kexec, Will Deacon, AKASHI Takahiro, anderson,
Kristina Martsenko, Borislav Petkov, Dave Young,
linux-arm-kernel
Hi guys,
(CC: +Steve, +Kristina) "What's the best way of letting user-space know the MMU
config when 52-bit VA and pointer-auth may be in use?"
On 13/02/2019 19:52, Kazuhito Hagio wrote:
> On 2/13/2019 1:22 PM, James Morse wrote:
>> On 13/02/2019 11:15, Dave Young wrote:
>>> On 02/12/19 at 11:03pm, Kazuhito Hagio wrote:
>>>> On 2/12/2019 2:59 PM, Bhupesh Sharma wrote:
>>>>> BTW, in the makedumpfile enablement patch thread for ARMv8.2 LVA
>>>>> (which I sent out for 52-bit User space VA enablement) (see [0]), Kazu
>>>>> mentioned that the changes look necessary.
>>>>>
>>>>> [0]. http://lists.infradead.org/pipermail/kexec/2019-February/022431.html
>>>>
>>>>>>> The increased 'PTRS_PER_PGD' value for such cases needs to be then
>>>>>>> calculated as is done by the underlying kernel
>>
>> Aha! Nothing to do with which-bits-are-pfn in the tables...
>>
>> You need to know if the top level PGD is 512bytes or bigger. As we use a
>> kmem-cache the adjacent data could be some else's page tables.
>>
>> Is this really a problem though? You can't pull the user-space pgd pointers out
>> of no-where, you must have walked some task_struct and struct_mm's to find them.
>> In which case you would have the VMAs on hand to tell you if its in the mapped
>> user range.
>>
>> It would be good to avoid putting something arch-specific in here if we can at
>> all help it.
>>>>>>> (see
>>>>>>> 'arch/arm64/include/asm/pgtable-hwdef.h' for details):
>>>>>>>
>>>>>>> #define PTRS_PER_PGD (1 << (MAX_USER_VA_BITS - PGDIR_SHIFT))
>>>>
>>>> Yes, this is the reason why makedumpfile needs the MAX_USER_VA_BITS.
>>>> It is used for pgd_index() also in makedumpfile to walk page tables.
>>>>
>>>> /* to find an entry in a page-table-directory */
>>>> #define pgd_index(addr) (((addr) >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1))
>>>
>>> Since Dave mentioned crash tool does not need it, but crash should also
>>> travel the pg tables.
>
> The crash utility is always invoked with vmlinux, so it can read the
> vabits_user variable directly from vmcore, but makedumpfile can not.
(This sounds fragile. That symbol's name may change, it may disappear
completely! ... but I guess crash changes with every kernel release anyway)
>>> If this is really necessary it would be good to describe what will
>>> happen without the patch, eg. some user visible error from an actual test etc.
>>
>> Yes please, it would really help if there was a specific example we could discuss.
>
> With 52-bit user space and 48-bit kernel space configuration,
> makedumpfile will not be able to convert a virtual kernel address
> to a physical address, and fail to capture a dumpfile, because the
> pgd_index() will return a wrong index.
Got it, thanks!
(all this user stuff had me thinking it was user-space you were trying to walk).
Yes, this is because of commit e842dfb5a2d3 ("arm64: mm: Offset TTBR1 to allow
52-bit PTRS_PER_PGD"). The kernel has offset the ttbr1 value, if you try and
walk it without knowing the offset you get junk.
Ideally we tell you the offset with some 'ttbr1_offset=' in vmcoreinfo, but if
the offsetting code disappears, the kernel would still have to provide
'ttbr1_offset=0' for user-space to keep working.
I'd like to find something future-proof that always has an unambiguous meaning,
and isn't a problem if the kernel variable/symbol/kconfig names change.
With pointer-auth in use too you can't guess which bits are address and which
bits are data.
Taking arch-specific to its extreme, we could expose TCR_EL1, but this is a
problem if we ever switch that per task (some new bits may turn up with a new
feature). Some of those bits vary per cpu too, so we'd have to mask them out in
case user-space tries to conclude something from them.
My current best suggestion is to export:
from core code:
* USER_MMAP_END, the maximum value a user-space can try and mmap().
This would normally be TASK_SIZE, but x86 and powerpc also have support for
larger VA space, and its plumbed into mm slightly differently. We should have
one arch-independent property that covers all these. On arm64 this would be the
runtime va bits for user-space's TTBR. (This assumes the value isn't per-task)
arch specific:
* ARM64_TCR.T1SZ, the va bits mapped by the kernel's TTBR. (We can assume we'll
never flip user/kernel space). This has to be arch specific, it will always have
a value and its meaning comes from the ARM-ARM (so linux can't change it in the
future). It should be the same on every CPU.
* ARM64_TTBR1.BADDR, the pa of the kernel page tables, which implicitly has the
offset. Again this always has a value, and its meaning comes from the ARM-ARM.
If we ever get clever with different page-tables/TCR values on different CPUs,
these two should come from the same CPU.
I think this gives you what you need if user/kernel may both be using
pointer-auth and both may be using 52-bit va. I'm pretty sure the 48:52 bits can
be picked at boot time depending on the kernel kconfig and the hardware support.
Does anyone have a better idea? (or a corner where this won't work?)
Thanks,
James
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH] arm64, vmcoreinfo : Append 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' to vmcoreinfo
2019-02-15 17:34 ` James Morse
@ 2019-02-15 18:01 ` Bhupesh Sharma
-1 siblings, 0 replies; 58+ messages in thread
From: Bhupesh Sharma @ 2019-02-15 18:01 UTC (permalink / raw)
To: James Morse
Cc: Mark Rutland, Kazuhito Hagio, lijiang, bhe, ard.biesheuvel,
catalin.marinas, Steve Capper, kexec, Will Deacon,
AKASHI Takahiro, anderson, Kristina Martsenko, Borislav Petkov,
Dave Young, linux-arm-kernel
Hi James,
On Fri, Feb 15, 2019 at 11:04 PM James Morse <james.morse@arm.com> wrote:
>
> Hi guys,
>
> (CC: +Steve, +Kristina) "What's the best way of letting user-space know the MMU
> config when 52-bit VA and pointer-auth may be in use?"
>
> On 13/02/2019 19:52, Kazuhito Hagio wrote:
> > On 2/13/2019 1:22 PM, James Morse wrote:
> >> On 13/02/2019 11:15, Dave Young wrote:
> >>> On 02/12/19 at 11:03pm, Kazuhito Hagio wrote:
> >>>> On 2/12/2019 2:59 PM, Bhupesh Sharma wrote:
> >>>>> BTW, in the makedumpfile enablement patch thread for ARMv8.2 LVA
> >>>>> (which I sent out for 52-bit User space VA enablement) (see [0]), Kazu
> >>>>> mentioned that the changes look necessary.
> >>>>>
> >>>>> [0]. http://lists.infradead.org/pipermail/kexec/2019-February/022431.html
> >>>>
> >>>>>>> The increased 'PTRS_PER_PGD' value for such cases needs to be then
> >>>>>>> calculated as is done by the underlying kernel
> >>
> >> Aha! Nothing to do with which-bits-are-pfn in the tables...
> >>
> >> You need to know if the top level PGD is 512bytes or bigger. As we use a
> >> kmem-cache the adjacent data could be some else's page tables.
> >>
> >> Is this really a problem though? You can't pull the user-space pgd pointers out
> >> of no-where, you must have walked some task_struct and struct_mm's to find them.
> >> In which case you would have the VMAs on hand to tell you if its in the mapped
> >> user range.
> >>
> >> It would be good to avoid putting something arch-specific in here if we can at
> >> all help it.
>
> >>>>>>> (see
> >>>>>>> 'arch/arm64/include/asm/pgtable-hwdef.h' for details):
> >>>>>>>
> >>>>>>> #define PTRS_PER_PGD (1 << (MAX_USER_VA_BITS - PGDIR_SHIFT))
> >>>>
> >>>> Yes, this is the reason why makedumpfile needs the MAX_USER_VA_BITS.
> >>>> It is used for pgd_index() also in makedumpfile to walk page tables.
> >>>>
> >>>> /* to find an entry in a page-table-directory */
> >>>> #define pgd_index(addr) (((addr) >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1))
> >>>
> >>> Since Dave mentioned crash tool does not need it, but crash should also
> >>> travel the pg tables.
> >
> > The crash utility is always invoked with vmlinux, so it can read the
> > vabits_user variable directly from vmcore, but makedumpfile can not.
>
> (This sounds fragile. That symbol's name may change, it may disappear
> completely! ... but I guess crash changes with every kernel release anyway)
>
>
> >>> If this is really necessary it would be good to describe what will
> >>> happen without the patch, eg. some user visible error from an actual test etc.
> >>
> >> Yes please, it would really help if there was a specific example we could discuss.
> >
> > With 52-bit user space and 48-bit kernel space configuration,
> > makedumpfile will not be able to convert a virtual kernel address
> > to a physical address, and fail to capture a dumpfile, because the
> > pgd_index() will return a wrong index.
>
> Got it, thanks!
> (all this user stuff had me thinking it was user-space you were trying to walk).
>
> Yes, this is because of commit e842dfb5a2d3 ("arm64: mm: Offset TTBR1 to allow
> 52-bit PTRS_PER_PGD"). The kernel has offset the ttbr1 value, if you try and
> walk it without knowing the offset you get junk.
>
> Ideally we tell you the offset with some 'ttbr1_offset=' in vmcoreinfo, but if
> the offsetting code disappears, the kernel would still have to provide
> 'ttbr1_offset=0' for user-space to keep working.
>
> I'd like to find something future-proof that always has an unambiguous meaning,
> and isn't a problem if the kernel variable/symbol/kconfig names change.
>
> With pointer-auth in use too you can't guess which bits are address and which
> bits are data.
>
> Taking arch-specific to its extreme, we could expose TCR_EL1, but this is a
> problem if we ever switch that per task (some new bits may turn up with a new
> feature). Some of those bits vary per cpu too, so we'd have to mask them out in
> case user-space tries to conclude something from them.
>
>
> My current best suggestion is to export:
> from core code:
> * USER_MMAP_END, the maximum value a user-space can try and mmap().
> This would normally be TASK_SIZE, but x86 and powerpc also have support for
> larger VA space, and its plumbed into mm slightly differently. We should have
> one arch-independent property that covers all these. On arm64 this would be the
> runtime va bits for user-space's TTBR. (This assumes the value isn't per-task)
>
> arch specific:
> * ARM64_TCR.T1SZ, the va bits mapped by the kernel's TTBR. (We can assume we'll
> never flip user/kernel space). This has to be arch specific, it will always have
> a value and its meaning comes from the ARM-ARM (so linux can't change it in the
> future). It should be the same on every CPU.
> * ARM64_TTBR1.BADDR, the pa of the kernel page tables, which implicitly has the
> offset. Again this always has a value, and its meaning comes from the ARM-ARM.
> If we ever get clever with different page-tables/TCR values on different CPUs,
> these two should come from the same CPU.
>
>
> I think this gives you what you need if user/kernel may both be using
> pointer-auth and both may be using 52-bit va. I'm pretty sure the 48:52 bits can
> be picked at boot time depending on the kernel kconfig and the hardware support.
>
> Does anyone have a better idea? (or a corner where this won't work?)
I am not sure you got a chance to look at the two regression cases I
reported here:
<http://lists.infradead.org/pipermail/kexec/2019-February/022449.html>
Unfortunately the above suggestion doesn't provide any fix for
ARMv8.2-LPA regression (see text under heading '
(1). Regression Case 1 (ARMv8.2-LPA enabled kernel)')
After going through the regression reports, I think exporting
'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' to vmcoreinfo is sufficient
for the above regressions (without over-complicating the stuff) as
ARM64_TCR.T1SZ and friends seem to arch specific as compared to
VA_BITS + 'MAX_USER_VA_BITS' .
Thanks,
Bhupesh
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH] arm64, vmcoreinfo : Append 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' to vmcoreinfo
@ 2019-02-15 18:01 ` Bhupesh Sharma
0 siblings, 0 replies; 58+ messages in thread
From: Bhupesh Sharma @ 2019-02-15 18:01 UTC (permalink / raw)
To: James Morse
Cc: Mark Rutland, Kazuhito Hagio, lijiang, bhe, ard.biesheuvel,
catalin.marinas, Steve Capper, kexec, Will Deacon,
AKASHI Takahiro, anderson, Kristina Martsenko, Borislav Petkov,
Dave Young, linux-arm-kernel
Hi James,
On Fri, Feb 15, 2019 at 11:04 PM James Morse <james.morse@arm.com> wrote:
>
> Hi guys,
>
> (CC: +Steve, +Kristina) "What's the best way of letting user-space know the MMU
> config when 52-bit VA and pointer-auth may be in use?"
>
> On 13/02/2019 19:52, Kazuhito Hagio wrote:
> > On 2/13/2019 1:22 PM, James Morse wrote:
> >> On 13/02/2019 11:15, Dave Young wrote:
> >>> On 02/12/19 at 11:03pm, Kazuhito Hagio wrote:
> >>>> On 2/12/2019 2:59 PM, Bhupesh Sharma wrote:
> >>>>> BTW, in the makedumpfile enablement patch thread for ARMv8.2 LVA
> >>>>> (which I sent out for 52-bit User space VA enablement) (see [0]), Kazu
> >>>>> mentioned that the changes look necessary.
> >>>>>
> >>>>> [0]. http://lists.infradead.org/pipermail/kexec/2019-February/022431.html
> >>>>
> >>>>>>> The increased 'PTRS_PER_PGD' value for such cases needs to be then
> >>>>>>> calculated as is done by the underlying kernel
> >>
> >> Aha! Nothing to do with which-bits-are-pfn in the tables...
> >>
> >> You need to know if the top level PGD is 512bytes or bigger. As we use a
> >> kmem-cache the adjacent data could be some else's page tables.
> >>
> >> Is this really a problem though? You can't pull the user-space pgd pointers out
> >> of no-where, you must have walked some task_struct and struct_mm's to find them.
> >> In which case you would have the VMAs on hand to tell you if its in the mapped
> >> user range.
> >>
> >> It would be good to avoid putting something arch-specific in here if we can at
> >> all help it.
>
> >>>>>>> (see
> >>>>>>> 'arch/arm64/include/asm/pgtable-hwdef.h' for details):
> >>>>>>>
> >>>>>>> #define PTRS_PER_PGD (1 << (MAX_USER_VA_BITS - PGDIR_SHIFT))
> >>>>
> >>>> Yes, this is the reason why makedumpfile needs the MAX_USER_VA_BITS.
> >>>> It is used for pgd_index() also in makedumpfile to walk page tables.
> >>>>
> >>>> /* to find an entry in a page-table-directory */
> >>>> #define pgd_index(addr) (((addr) >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1))
> >>>
> >>> Since Dave mentioned crash tool does not need it, but crash should also
> >>> travel the pg tables.
> >
> > The crash utility is always invoked with vmlinux, so it can read the
> > vabits_user variable directly from vmcore, but makedumpfile can not.
>
> (This sounds fragile. That symbol's name may change, it may disappear
> completely! ... but I guess crash changes with every kernel release anyway)
>
>
> >>> If this is really necessary it would be good to describe what will
> >>> happen without the patch, eg. some user visible error from an actual test etc.
> >>
> >> Yes please, it would really help if there was a specific example we could discuss.
> >
> > With 52-bit user space and 48-bit kernel space configuration,
> > makedumpfile will not be able to convert a virtual kernel address
> > to a physical address, and fail to capture a dumpfile, because the
> > pgd_index() will return a wrong index.
>
> Got it, thanks!
> (all this user stuff had me thinking it was user-space you were trying to walk).
>
> Yes, this is because of commit e842dfb5a2d3 ("arm64: mm: Offset TTBR1 to allow
> 52-bit PTRS_PER_PGD"). The kernel has offset the ttbr1 value, if you try and
> walk it without knowing the offset you get junk.
>
> Ideally we tell you the offset with some 'ttbr1_offset=' in vmcoreinfo, but if
> the offsetting code disappears, the kernel would still have to provide
> 'ttbr1_offset=0' for user-space to keep working.
>
> I'd like to find something future-proof that always has an unambiguous meaning,
> and isn't a problem if the kernel variable/symbol/kconfig names change.
>
> With pointer-auth in use too you can't guess which bits are address and which
> bits are data.
>
> Taking arch-specific to its extreme, we could expose TCR_EL1, but this is a
> problem if we ever switch that per task (some new bits may turn up with a new
> feature). Some of those bits vary per cpu too, so we'd have to mask them out in
> case user-space tries to conclude something from them.
>
>
> My current best suggestion is to export:
> from core code:
> * USER_MMAP_END, the maximum value a user-space can try and mmap().
> This would normally be TASK_SIZE, but x86 and powerpc also have support for
> larger VA space, and its plumbed into mm slightly differently. We should have
> one arch-independent property that covers all these. On arm64 this would be the
> runtime va bits for user-space's TTBR. (This assumes the value isn't per-task)
>
> arch specific:
> * ARM64_TCR.T1SZ, the va bits mapped by the kernel's TTBR. (We can assume we'll
> never flip user/kernel space). This has to be arch specific, it will always have
> a value and its meaning comes from the ARM-ARM (so linux can't change it in the
> future). It should be the same on every CPU.
> * ARM64_TTBR1.BADDR, the pa of the kernel page tables, which implicitly has the
> offset. Again this always has a value, and its meaning comes from the ARM-ARM.
> If we ever get clever with different page-tables/TCR values on different CPUs,
> these two should come from the same CPU.
>
>
> I think this gives you what you need if user/kernel may both be using
> pointer-auth and both may be using 52-bit va. I'm pretty sure the 48:52 bits can
> be picked at boot time depending on the kernel kconfig and the hardware support.
>
> Does anyone have a better idea? (or a corner where this won't work?)
I am not sure you got a chance to look at the two regression cases I
reported here:
<http://lists.infradead.org/pipermail/kexec/2019-February/022449.html>
Unfortunately the above suggestion doesn't provide any fix for
ARMv8.2-LPA regression (see text under heading '
(1). Regression Case 1 (ARMv8.2-LPA enabled kernel)')
After going through the regression reports, I think exporting
'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' to vmcoreinfo is sufficient
for the above regressions (without over-complicating the stuff) as
ARM64_TCR.T1SZ and friends seem to arch specific as compared to
VA_BITS + 'MAX_USER_VA_BITS' .
Thanks,
Bhupesh
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH] arm64, vmcoreinfo : Append 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' to vmcoreinfo
2019-02-15 18:01 ` Bhupesh Sharma
@ 2019-02-18 15:27 ` Steve Capper
-1 siblings, 0 replies; 58+ messages in thread
From: Steve Capper @ 2019-02-18 15:27 UTC (permalink / raw)
To: Bhupesh Sharma
Cc: Mark Rutland, Kazuhito Hagio, lijiang, bhe, ard.biesheuvel,
Catalin Marinas, kexec, Will Deacon, AKASHI Takahiro,
James Morse, Kristina Martsenko, Borislav Petkov, anderson, nd,
Dave Young, linux-arm-kernel
Hi Bhupesh,
Sorry for joining this thread late...
On Fri, Feb 15, 2019 at 11:31:56PM +0530, Bhupesh Sharma wrote:
> Hi James,
>
> On Fri, Feb 15, 2019 at 11:04 PM James Morse <james.morse@arm.com> wrote:
> >
> > Hi guys,
> >
> > (CC: +Steve, +Kristina) "What's the best way of letting user-space know the MMU
> > config when 52-bit VA and pointer-auth may be in use?"
> >
> > On 13/02/2019 19:52, Kazuhito Hagio wrote:
> > > On 2/13/2019 1:22 PM, James Morse wrote:
> > >> On 13/02/2019 11:15, Dave Young wrote:
> > >>> On 02/12/19 at 11:03pm, Kazuhito Hagio wrote:
> > >>>> On 2/12/2019 2:59 PM, Bhupesh Sharma wrote:
> > >>>>> BTW, in the makedumpfile enablement patch thread for ARMv8.2 LVA
> > >>>>> (which I sent out for 52-bit User space VA enablement) (see [0]), Kazu
> > >>>>> mentioned that the changes look necessary.
> > >>>>>
> > >>>>> [0]. http://lists.infradead.org/pipermail/kexec/2019-February/022431.html
> > >>>>
> > >>>>>>> The increased 'PTRS_PER_PGD' value for such cases needs to be then
> > >>>>>>> calculated as is done by the underlying kernel
> > >>
> > >> Aha! Nothing to do with which-bits-are-pfn in the tables...
> > >>
> > >> You need to know if the top level PGD is 512bytes or bigger. As we use a
> > >> kmem-cache the adjacent data could be some else's page tables.
> > >>
> > >> Is this really a problem though? You can't pull the user-space pgd pointers out
> > >> of no-where, you must have walked some task_struct and struct_mm's to find them.
> > >> In which case you would have the VMAs on hand to tell you if its in the mapped
> > >> user range.
> > >>
> > >> It would be good to avoid putting something arch-specific in here if we can at
> > >> all help it.
> >
> > >>>>>>> (see
> > >>>>>>> 'arch/arm64/include/asm/pgtable-hwdef.h' for details):
> > >>>>>>>
> > >>>>>>> #define PTRS_PER_PGD (1 << (MAX_USER_VA_BITS - PGDIR_SHIFT))
> > >>>>
> > >>>> Yes, this is the reason why makedumpfile needs the MAX_USER_VA_BITS.
> > >>>> It is used for pgd_index() also in makedumpfile to walk page tables.
> > >>>>
> > >>>> /* to find an entry in a page-table-directory */
> > >>>> #define pgd_index(addr) (((addr) >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1))
> > >>>
> > >>> Since Dave mentioned crash tool does not need it, but crash should also
> > >>> travel the pg tables.
> > >
> > > The crash utility is always invoked with vmlinux, so it can read the
> > > vabits_user variable directly from vmcore, but makedumpfile can not.
> >
> > (This sounds fragile. That symbol's name may change, it may disappear
> > completely! ... but I guess crash changes with every kernel release anyway)
> >
> >
> > >>> If this is really necessary it would be good to describe what will
> > >>> happen without the patch, eg. some user visible error from an actual test etc.
> > >>
> > >> Yes please, it would really help if there was a specific example we could discuss.
> > >
> > > With 52-bit user space and 48-bit kernel space configuration,
> > > makedumpfile will not be able to convert a virtual kernel address
> > > to a physical address, and fail to capture a dumpfile, because the
> > > pgd_index() will return a wrong index.
> >
> > Got it, thanks!
> > (all this user stuff had me thinking it was user-space you were trying to walk).
> >
> > Yes, this is because of commit e842dfb5a2d3 ("arm64: mm: Offset TTBR1 to allow
> > 52-bit PTRS_PER_PGD"). The kernel has offset the ttbr1 value, if you try and
> > walk it without knowing the offset you get junk.
> >
> > Ideally we tell you the offset with some 'ttbr1_offset=' in vmcoreinfo, but if
> > the offsetting code disappears, the kernel would still have to provide
> > 'ttbr1_offset=0' for user-space to keep working.
> >
> > I'd like to find something future-proof that always has an unambiguous meaning,
> > and isn't a problem if the kernel variable/symbol/kconfig names change.
> >
> > With pointer-auth in use too you can't guess which bits are address and which
> > bits are data.
> >
> > Taking arch-specific to its extreme, we could expose TCR_EL1, but this is a
> > problem if we ever switch that per task (some new bits may turn up with a new
> > feature). Some of those bits vary per cpu too, so we'd have to mask them out in
> > case user-space tries to conclude something from them.
> >
> >
> > My current best suggestion is to export:
> > from core code:
> > * USER_MMAP_END, the maximum value a user-space can try and mmap().
> > This would normally be TASK_SIZE, but x86 and powerpc also have support for
> > larger VA space, and its plumbed into mm slightly differently. We should have
> > one arch-independent property that covers all these. On arm64 this would be the
> > runtime va bits for user-space's TTBR. (This assumes the value isn't per-task)
> >
> > arch specific:
> > * ARM64_TCR.T1SZ, the va bits mapped by the kernel's TTBR. (We can assume we'll
> > never flip user/kernel space). This has to be arch specific, it will always have
> > a value and its meaning comes from the ARM-ARM (so linux can't change it in the
> > future). It should be the same on every CPU.
> > * ARM64_TTBR1.BADDR, the pa of the kernel page tables, which implicitly has the
> > offset. Again this always has a value, and its meaning comes from the ARM-ARM.
> > If we ever get clever with different page-tables/TCR values on different CPUs,
> > these two should come from the same CPU.
> >
> >
> > I think this gives you what you need if user/kernel may both be using
> > pointer-auth and both may be using 52-bit va. I'm pretty sure the 48:52 bits can
> > be picked at boot time depending on the kernel kconfig and the hardware support.
> >
> > Does anyone have a better idea? (or a corner where this won't work?)
>
> I am not sure you got a chance to look at the two regression cases I
> reported here:
> <http://lists.infradead.org/pipermail/kexec/2019-February/022449.html>
>
> Unfortunately the above suggestion doesn't provide any fix for
> ARMv8.2-LPA regression (see text under heading '
> (1). Regression Case 1 (ARMv8.2-LPA enabled kernel)')
>
> After going through the regression reports, I think exporting
> 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' to vmcoreinfo is sufficient
> for the above regressions (without over-complicating the stuff) as
> ARM64_TCR.T1SZ and friends seem to arch specific as compared to
> VA_BITS + 'MAX_USER_VA_BITS' .
>
For MAX_USER_VA_BITS, IIUC you are just after a value of PTRS_PER_PGD?
Why not just add PTRS_PER_PGD to the vmcoreinfo?
FWIW it is possible in vaddr_to_paddr_arm64 to detect a zero pgd entry
then try again with another ptrs_per_pgd value (granted this is a little
hacky).
Cheers,
--
Steve
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH] arm64, vmcoreinfo : Append 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' to vmcoreinfo
@ 2019-02-18 15:27 ` Steve Capper
0 siblings, 0 replies; 58+ messages in thread
From: Steve Capper @ 2019-02-18 15:27 UTC (permalink / raw)
To: Bhupesh Sharma
Cc: Mark Rutland, Kazuhito Hagio, lijiang, bhe, ard.biesheuvel,
Catalin Marinas, kexec, Will Deacon, AKASHI Takahiro,
James Morse, Kristina Martsenko, Borislav Petkov, anderson, nd,
Dave Young, linux-arm-kernel
Hi Bhupesh,
Sorry for joining this thread late...
On Fri, Feb 15, 2019 at 11:31:56PM +0530, Bhupesh Sharma wrote:
> Hi James,
>
> On Fri, Feb 15, 2019 at 11:04 PM James Morse <james.morse@arm.com> wrote:
> >
> > Hi guys,
> >
> > (CC: +Steve, +Kristina) "What's the best way of letting user-space know the MMU
> > config when 52-bit VA and pointer-auth may be in use?"
> >
> > On 13/02/2019 19:52, Kazuhito Hagio wrote:
> > > On 2/13/2019 1:22 PM, James Morse wrote:
> > >> On 13/02/2019 11:15, Dave Young wrote:
> > >>> On 02/12/19 at 11:03pm, Kazuhito Hagio wrote:
> > >>>> On 2/12/2019 2:59 PM, Bhupesh Sharma wrote:
> > >>>>> BTW, in the makedumpfile enablement patch thread for ARMv8.2 LVA
> > >>>>> (which I sent out for 52-bit User space VA enablement) (see [0]), Kazu
> > >>>>> mentioned that the changes look necessary.
> > >>>>>
> > >>>>> [0]. http://lists.infradead.org/pipermail/kexec/2019-February/022431.html
> > >>>>
> > >>>>>>> The increased 'PTRS_PER_PGD' value for such cases needs to be then
> > >>>>>>> calculated as is done by the underlying kernel
> > >>
> > >> Aha! Nothing to do with which-bits-are-pfn in the tables...
> > >>
> > >> You need to know if the top level PGD is 512bytes or bigger. As we use a
> > >> kmem-cache the adjacent data could be some else's page tables.
> > >>
> > >> Is this really a problem though? You can't pull the user-space pgd pointers out
> > >> of no-where, you must have walked some task_struct and struct_mm's to find them.
> > >> In which case you would have the VMAs on hand to tell you if its in the mapped
> > >> user range.
> > >>
> > >> It would be good to avoid putting something arch-specific in here if we can at
> > >> all help it.
> >
> > >>>>>>> (see
> > >>>>>>> 'arch/arm64/include/asm/pgtable-hwdef.h' for details):
> > >>>>>>>
> > >>>>>>> #define PTRS_PER_PGD (1 << (MAX_USER_VA_BITS - PGDIR_SHIFT))
> > >>>>
> > >>>> Yes, this is the reason why makedumpfile needs the MAX_USER_VA_BITS.
> > >>>> It is used for pgd_index() also in makedumpfile to walk page tables.
> > >>>>
> > >>>> /* to find an entry in a page-table-directory */
> > >>>> #define pgd_index(addr) (((addr) >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1))
> > >>>
> > >>> Since Dave mentioned crash tool does not need it, but crash should also
> > >>> travel the pg tables.
> > >
> > > The crash utility is always invoked with vmlinux, so it can read the
> > > vabits_user variable directly from vmcore, but makedumpfile can not.
> >
> > (This sounds fragile. That symbol's name may change, it may disappear
> > completely! ... but I guess crash changes with every kernel release anyway)
> >
> >
> > >>> If this is really necessary it would be good to describe what will
> > >>> happen without the patch, eg. some user visible error from an actual test etc.
> > >>
> > >> Yes please, it would really help if there was a specific example we could discuss.
> > >
> > > With 52-bit user space and 48-bit kernel space configuration,
> > > makedumpfile will not be able to convert a virtual kernel address
> > > to a physical address, and fail to capture a dumpfile, because the
> > > pgd_index() will return a wrong index.
> >
> > Got it, thanks!
> > (all this user stuff had me thinking it was user-space you were trying to walk).
> >
> > Yes, this is because of commit e842dfb5a2d3 ("arm64: mm: Offset TTBR1 to allow
> > 52-bit PTRS_PER_PGD"). The kernel has offset the ttbr1 value, if you try and
> > walk it without knowing the offset you get junk.
> >
> > Ideally we tell you the offset with some 'ttbr1_offset=' in vmcoreinfo, but if
> > the offsetting code disappears, the kernel would still have to provide
> > 'ttbr1_offset=0' for user-space to keep working.
> >
> > I'd like to find something future-proof that always has an unambiguous meaning,
> > and isn't a problem if the kernel variable/symbol/kconfig names change.
> >
> > With pointer-auth in use too you can't guess which bits are address and which
> > bits are data.
> >
> > Taking arch-specific to its extreme, we could expose TCR_EL1, but this is a
> > problem if we ever switch that per task (some new bits may turn up with a new
> > feature). Some of those bits vary per cpu too, so we'd have to mask them out in
> > case user-space tries to conclude something from them.
> >
> >
> > My current best suggestion is to export:
> > from core code:
> > * USER_MMAP_END, the maximum value a user-space can try and mmap().
> > This would normally be TASK_SIZE, but x86 and powerpc also have support for
> > larger VA space, and its plumbed into mm slightly differently. We should have
> > one arch-independent property that covers all these. On arm64 this would be the
> > runtime va bits for user-space's TTBR. (This assumes the value isn't per-task)
> >
> > arch specific:
> > * ARM64_TCR.T1SZ, the va bits mapped by the kernel's TTBR. (We can assume we'll
> > never flip user/kernel space). This has to be arch specific, it will always have
> > a value and its meaning comes from the ARM-ARM (so linux can't change it in the
> > future). It should be the same on every CPU.
> > * ARM64_TTBR1.BADDR, the pa of the kernel page tables, which implicitly has the
> > offset. Again this always has a value, and its meaning comes from the ARM-ARM.
> > If we ever get clever with different page-tables/TCR values on different CPUs,
> > these two should come from the same CPU.
> >
> >
> > I think this gives you what you need if user/kernel may both be using
> > pointer-auth and both may be using 52-bit va. I'm pretty sure the 48:52 bits can
> > be picked at boot time depending on the kernel kconfig and the hardware support.
> >
> > Does anyone have a better idea? (or a corner where this won't work?)
>
> I am not sure you got a chance to look at the two regression cases I
> reported here:
> <http://lists.infradead.org/pipermail/kexec/2019-February/022449.html>
>
> Unfortunately the above suggestion doesn't provide any fix for
> ARMv8.2-LPA regression (see text under heading '
> (1). Regression Case 1 (ARMv8.2-LPA enabled kernel)')
>
> After going through the regression reports, I think exporting
> 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' to vmcoreinfo is sufficient
> for the above regressions (without over-complicating the stuff) as
> ARM64_TCR.T1SZ and friends seem to arch specific as compared to
> VA_BITS + 'MAX_USER_VA_BITS' .
>
For MAX_USER_VA_BITS, IIUC you are just after a value of PTRS_PER_PGD?
Why not just add PTRS_PER_PGD to the vmcoreinfo?
FWIW it is possible in vaddr_to_paddr_arm64 to detect a zero pgd entry
then try again with another ptrs_per_pgd value (granted this is a little
hacky).
Cheers,
--
Steve
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH] arm64, vmcoreinfo : Append 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' to vmcoreinfo
2019-02-18 15:27 ` Steve Capper
@ 2019-02-21 16:08 ` Bhupesh Sharma
-1 siblings, 0 replies; 58+ messages in thread
From: Bhupesh Sharma @ 2019-02-21 16:08 UTC (permalink / raw)
To: Steve Capper
Cc: Mark Rutland, Kazuhito Hagio, lijiang, bhe, ard.biesheuvel,
Catalin Marinas, kexec, Will Deacon, AKASHI Takahiro,
James Morse, Kristina Martsenko, Borislav Petkov, anderson, nd,
Dave Young, linux-arm-kernel
Hi Steve,
On 02/18/2019 08:57 PM, Steve Capper wrote:
> Hi Bhupesh,
>
> Sorry for joining this thread late...
>
> On Fri, Feb 15, 2019 at 11:31:56PM +0530, Bhupesh Sharma wrote:
>> Hi James,
>>
>> On Fri, Feb 15, 2019 at 11:04 PM James Morse <james.morse@arm.com> wrote:
>>>
>>> Hi guys,
>>>
>>> (CC: +Steve, +Kristina) "What's the best way of letting user-space know the MMU
>>> config when 52-bit VA and pointer-auth may be in use?"
>>>
>>> On 13/02/2019 19:52, Kazuhito Hagio wrote:
>>>> On 2/13/2019 1:22 PM, James Morse wrote:
>>>>> On 13/02/2019 11:15, Dave Young wrote:
>>>>>> On 02/12/19 at 11:03pm, Kazuhito Hagio wrote:
>>>>>>> On 2/12/2019 2:59 PM, Bhupesh Sharma wrote:
>>>>>>>> BTW, in the makedumpfile enablement patch thread for ARMv8.2 LVA
>>>>>>>> (which I sent out for 52-bit User space VA enablement) (see [0]), Kazu
>>>>>>>> mentioned that the changes look necessary.
>>>>>>>>
>>>>>>>> [0]. http://lists.infradead.org/pipermail/kexec/2019-February/022431.html
>>>>>>>
>>>>>>>>>> The increased 'PTRS_PER_PGD' value for such cases needs to be then
>>>>>>>>>> calculated as is done by the underlying kernel
>>>>>
>>>>> Aha! Nothing to do with which-bits-are-pfn in the tables...
>>>>>
>>>>> You need to know if the top level PGD is 512bytes or bigger. As we use a
>>>>> kmem-cache the adjacent data could be some else's page tables.
>>>>>
>>>>> Is this really a problem though? You can't pull the user-space pgd pointers out
>>>>> of no-where, you must have walked some task_struct and struct_mm's to find them.
>>>>> In which case you would have the VMAs on hand to tell you if its in the mapped
>>>>> user range.
>>>>>
>>>>> It would be good to avoid putting something arch-specific in here if we can at
>>>>> all help it.
>>>
>>>>>>>>>> (see
>>>>>>>>>> 'arch/arm64/include/asm/pgtable-hwdef.h' for details):
>>>>>>>>>>
>>>>>>>>>> #define PTRS_PER_PGD (1 << (MAX_USER_VA_BITS - PGDIR_SHIFT))
>>>>>>>
>>>>>>> Yes, this is the reason why makedumpfile needs the MAX_USER_VA_BITS.
>>>>>>> It is used for pgd_index() also in makedumpfile to walk page tables.
>>>>>>>
>>>>>>> /* to find an entry in a page-table-directory */
>>>>>>> #define pgd_index(addr) (((addr) >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1))
>>>>>>
>>>>>> Since Dave mentioned crash tool does not need it, but crash should also
>>>>>> travel the pg tables.
>>>>
>>>> The crash utility is always invoked with vmlinux, so it can read the
>>>> vabits_user variable directly from vmcore, but makedumpfile can not.
>>>
>>> (This sounds fragile. That symbol's name may change, it may disappear
>>> completely! ... but I guess crash changes with every kernel release anyway)
>>>
>>>
>>>>>> If this is really necessary it would be good to describe what will
>>>>>> happen without the patch, eg. some user visible error from an actual test etc.
>>>>>
>>>>> Yes please, it would really help if there was a specific example we could discuss.
>>>>
>>>> With 52-bit user space and 48-bit kernel space configuration,
>>>> makedumpfile will not be able to convert a virtual kernel address
>>>> to a physical address, and fail to capture a dumpfile, because the
>>>> pgd_index() will return a wrong index.
>>>
>>> Got it, thanks!
>>> (all this user stuff had me thinking it was user-space you were trying to walk).
>>>
>>> Yes, this is because of commit e842dfb5a2d3 ("arm64: mm: Offset TTBR1 to allow
>>> 52-bit PTRS_PER_PGD"). The kernel has offset the ttbr1 value, if you try and
>>> walk it without knowing the offset you get junk.
>>>
>>> Ideally we tell you the offset with some 'ttbr1_offset=' in vmcoreinfo, but if
>>> the offsetting code disappears, the kernel would still have to provide
>>> 'ttbr1_offset=0' for user-space to keep working.
>>>
>>> I'd like to find something future-proof that always has an unambiguous meaning,
>>> and isn't a problem if the kernel variable/symbol/kconfig names change.
>>>
>>> With pointer-auth in use too you can't guess which bits are address and which
>>> bits are data.
>>>
>>> Taking arch-specific to its extreme, we could expose TCR_EL1, but this is a
>>> problem if we ever switch that per task (some new bits may turn up with a new
>>> feature). Some of those bits vary per cpu too, so we'd have to mask them out in
>>> case user-space tries to conclude something from them.
>>>
>>>
>>> My current best suggestion is to export:
>>> from core code:
>>> * USER_MMAP_END, the maximum value a user-space can try and mmap().
>>> This would normally be TASK_SIZE, but x86 and powerpc also have support for
>>> larger VA space, and its plumbed into mm slightly differently. We should have
>>> one arch-independent property that covers all these. On arm64 this would be the
>>> runtime va bits for user-space's TTBR. (This assumes the value isn't per-task)
>>>
>>> arch specific:
>>> * ARM64_TCR.T1SZ, the va bits mapped by the kernel's TTBR. (We can assume we'll
>>> never flip user/kernel space). This has to be arch specific, it will always have
>>> a value and its meaning comes from the ARM-ARM (so linux can't change it in the
>>> future). It should be the same on every CPU.
>>> * ARM64_TTBR1.BADDR, the pa of the kernel page tables, which implicitly has the
>>> offset. Again this always has a value, and its meaning comes from the ARM-ARM.
>>> If we ever get clever with different page-tables/TCR values on different CPUs,
>>> these two should come from the same CPU.
>>>
>>>
>>> I think this gives you what you need if user/kernel may both be using
>>> pointer-auth and both may be using 52-bit va. I'm pretty sure the 48:52 bits can
>>> be picked at boot time depending on the kernel kconfig and the hardware support.
>>>
>>> Does anyone have a better idea? (or a corner where this won't work?)
>>
>> I am not sure you got a chance to look at the two regression cases I
>> reported here:
>> <http://lists.infradead.org/pipermail/kexec/2019-February/022449.html>
>>
>> Unfortunately the above suggestion doesn't provide any fix for
>> ARMv8.2-LPA regression (see text under heading '
>> (1). Regression Case 1 (ARMv8.2-LPA enabled kernel)')
>>
>> After going through the regression reports, I think exporting
>> 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' to vmcoreinfo is sufficient
>> for the above regressions (without over-complicating the stuff) as
>> ARM64_TCR.T1SZ and friends seem to arch specific as compared to
>> VA_BITS + 'MAX_USER_VA_BITS' .
>>
>
> For MAX_USER_VA_BITS, IIUC you are just after a value of PTRS_PER_PGD?
> Why not just add PTRS_PER_PGD to the vmcoreinfo?
That's a good suggestion. I will re-spin the v2 with the same.
> FWIW it is possible in vaddr_to_paddr_arm64 to detect a zero pgd entry
> then try again with another ptrs_per_pgd value (granted this is a little
> hacky).
Right, but having this hack replicated across various user-space tools
is perhaps not the ideal portable solution, when we can simply add a
valid hint in the vmcoreinfo itself.
Thanks,
Bhupesh
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH] arm64, vmcoreinfo : Append 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' to vmcoreinfo
@ 2019-02-21 16:08 ` Bhupesh Sharma
0 siblings, 0 replies; 58+ messages in thread
From: Bhupesh Sharma @ 2019-02-21 16:08 UTC (permalink / raw)
To: Steve Capper
Cc: Mark Rutland, Kazuhito Hagio, lijiang, bhe, ard.biesheuvel,
Catalin Marinas, kexec, Will Deacon, AKASHI Takahiro,
James Morse, Kristina Martsenko, Borislav Petkov, anderson, nd,
Dave Young, linux-arm-kernel
Hi Steve,
On 02/18/2019 08:57 PM, Steve Capper wrote:
> Hi Bhupesh,
>
> Sorry for joining this thread late...
>
> On Fri, Feb 15, 2019 at 11:31:56PM +0530, Bhupesh Sharma wrote:
>> Hi James,
>>
>> On Fri, Feb 15, 2019 at 11:04 PM James Morse <james.morse@arm.com> wrote:
>>>
>>> Hi guys,
>>>
>>> (CC: +Steve, +Kristina) "What's the best way of letting user-space know the MMU
>>> config when 52-bit VA and pointer-auth may be in use?"
>>>
>>> On 13/02/2019 19:52, Kazuhito Hagio wrote:
>>>> On 2/13/2019 1:22 PM, James Morse wrote:
>>>>> On 13/02/2019 11:15, Dave Young wrote:
>>>>>> On 02/12/19 at 11:03pm, Kazuhito Hagio wrote:
>>>>>>> On 2/12/2019 2:59 PM, Bhupesh Sharma wrote:
>>>>>>>> BTW, in the makedumpfile enablement patch thread for ARMv8.2 LVA
>>>>>>>> (which I sent out for 52-bit User space VA enablement) (see [0]), Kazu
>>>>>>>> mentioned that the changes look necessary.
>>>>>>>>
>>>>>>>> [0]. http://lists.infradead.org/pipermail/kexec/2019-February/022431.html
>>>>>>>
>>>>>>>>>> The increased 'PTRS_PER_PGD' value for such cases needs to be then
>>>>>>>>>> calculated as is done by the underlying kernel
>>>>>
>>>>> Aha! Nothing to do with which-bits-are-pfn in the tables...
>>>>>
>>>>> You need to know if the top level PGD is 512bytes or bigger. As we use a
>>>>> kmem-cache the adjacent data could be some else's page tables.
>>>>>
>>>>> Is this really a problem though? You can't pull the user-space pgd pointers out
>>>>> of no-where, you must have walked some task_struct and struct_mm's to find them.
>>>>> In which case you would have the VMAs on hand to tell you if its in the mapped
>>>>> user range.
>>>>>
>>>>> It would be good to avoid putting something arch-specific in here if we can at
>>>>> all help it.
>>>
>>>>>>>>>> (see
>>>>>>>>>> 'arch/arm64/include/asm/pgtable-hwdef.h' for details):
>>>>>>>>>>
>>>>>>>>>> #define PTRS_PER_PGD (1 << (MAX_USER_VA_BITS - PGDIR_SHIFT))
>>>>>>>
>>>>>>> Yes, this is the reason why makedumpfile needs the MAX_USER_VA_BITS.
>>>>>>> It is used for pgd_index() also in makedumpfile to walk page tables.
>>>>>>>
>>>>>>> /* to find an entry in a page-table-directory */
>>>>>>> #define pgd_index(addr) (((addr) >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1))
>>>>>>
>>>>>> Since Dave mentioned crash tool does not need it, but crash should also
>>>>>> travel the pg tables.
>>>>
>>>> The crash utility is always invoked with vmlinux, so it can read the
>>>> vabits_user variable directly from vmcore, but makedumpfile can not.
>>>
>>> (This sounds fragile. That symbol's name may change, it may disappear
>>> completely! ... but I guess crash changes with every kernel release anyway)
>>>
>>>
>>>>>> If this is really necessary it would be good to describe what will
>>>>>> happen without the patch, eg. some user visible error from an actual test etc.
>>>>>
>>>>> Yes please, it would really help if there was a specific example we could discuss.
>>>>
>>>> With 52-bit user space and 48-bit kernel space configuration,
>>>> makedumpfile will not be able to convert a virtual kernel address
>>>> to a physical address, and fail to capture a dumpfile, because the
>>>> pgd_index() will return a wrong index.
>>>
>>> Got it, thanks!
>>> (all this user stuff had me thinking it was user-space you were trying to walk).
>>>
>>> Yes, this is because of commit e842dfb5a2d3 ("arm64: mm: Offset TTBR1 to allow
>>> 52-bit PTRS_PER_PGD"). The kernel has offset the ttbr1 value, if you try and
>>> walk it without knowing the offset you get junk.
>>>
>>> Ideally we tell you the offset with some 'ttbr1_offset=' in vmcoreinfo, but if
>>> the offsetting code disappears, the kernel would still have to provide
>>> 'ttbr1_offset=0' for user-space to keep working.
>>>
>>> I'd like to find something future-proof that always has an unambiguous meaning,
>>> and isn't a problem if the kernel variable/symbol/kconfig names change.
>>>
>>> With pointer-auth in use too you can't guess which bits are address and which
>>> bits are data.
>>>
>>> Taking arch-specific to its extreme, we could expose TCR_EL1, but this is a
>>> problem if we ever switch that per task (some new bits may turn up with a new
>>> feature). Some of those bits vary per cpu too, so we'd have to mask them out in
>>> case user-space tries to conclude something from them.
>>>
>>>
>>> My current best suggestion is to export:
>>> from core code:
>>> * USER_MMAP_END, the maximum value a user-space can try and mmap().
>>> This would normally be TASK_SIZE, but x86 and powerpc also have support for
>>> larger VA space, and its plumbed into mm slightly differently. We should have
>>> one arch-independent property that covers all these. On arm64 this would be the
>>> runtime va bits for user-space's TTBR. (This assumes the value isn't per-task)
>>>
>>> arch specific:
>>> * ARM64_TCR.T1SZ, the va bits mapped by the kernel's TTBR. (We can assume we'll
>>> never flip user/kernel space). This has to be arch specific, it will always have
>>> a value and its meaning comes from the ARM-ARM (so linux can't change it in the
>>> future). It should be the same on every CPU.
>>> * ARM64_TTBR1.BADDR, the pa of the kernel page tables, which implicitly has the
>>> offset. Again this always has a value, and its meaning comes from the ARM-ARM.
>>> If we ever get clever with different page-tables/TCR values on different CPUs,
>>> these two should come from the same CPU.
>>>
>>>
>>> I think this gives you what you need if user/kernel may both be using
>>> pointer-auth and both may be using 52-bit va. I'm pretty sure the 48:52 bits can
>>> be picked at boot time depending on the kernel kconfig and the hardware support.
>>>
>>> Does anyone have a better idea? (or a corner where this won't work?)
>>
>> I am not sure you got a chance to look at the two regression cases I
>> reported here:
>> <http://lists.infradead.org/pipermail/kexec/2019-February/022449.html>
>>
>> Unfortunately the above suggestion doesn't provide any fix for
>> ARMv8.2-LPA regression (see text under heading '
>> (1). Regression Case 1 (ARMv8.2-LPA enabled kernel)')
>>
>> After going through the regression reports, I think exporting
>> 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' to vmcoreinfo is sufficient
>> for the above regressions (without over-complicating the stuff) as
>> ARM64_TCR.T1SZ and friends seem to arch specific as compared to
>> VA_BITS + 'MAX_USER_VA_BITS' .
>>
>
> For MAX_USER_VA_BITS, IIUC you are just after a value of PTRS_PER_PGD?
> Why not just add PTRS_PER_PGD to the vmcoreinfo?
That's a good suggestion. I will re-spin the v2 with the same.
> FWIW it is possible in vaddr_to_paddr_arm64 to detect a zero pgd entry
> then try again with another ptrs_per_pgd value (granted this is a little
> hacky).
Right, but having this hack replicated across various user-space tools
is perhaps not the ideal portable solution, when we can simply add a
valid hint in the vmcoreinfo itself.
Thanks,
Bhupesh
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
^ permalink raw reply [flat|nested] 58+ messages in thread
* RE: [PATCH] arm64, vmcoreinfo : Append 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' to vmcoreinfo
2019-02-15 18:01 ` Bhupesh Sharma
@ 2019-02-19 20:47 ` Kazuhito Hagio
-1 siblings, 0 replies; 58+ messages in thread
From: Kazuhito Hagio @ 2019-02-19 20:47 UTC (permalink / raw)
To: Bhupesh Sharma
Cc: Mark Rutland, lijiang, bhe, ard.biesheuvel, catalin.marinas,
Steve Capper, kexec, Will Deacon, AKASHI Takahiro, James Morse,
Kristina Martsenko, Borislav Petkov, anderson, Dave Young,
linux-arm-kernel
Hi Bhupesh,
-----Original Message-----
> I am not sure you got a chance to look at the two regression cases I
> reported here:
> <http://lists.infradead.org/pipermail/kexec/2019-February/022449.html>
>
> Unfortunately the above suggestion doesn't provide any fix for
> ARMv8.2-LPA regression (see text under heading '
> (1). Regression Case 1 (ARMv8.2-LPA enabled kernel)')
As for MAX_PHYSMEM_BITS, I realized that ppc64 makedumpfile can detect
it because there is only one SECTION_SIZE_BITS for ppc64. I think we
can use the same way as set_ppc64_max_physmem_bits() does also for
arm64 for now. I'm going to write it for kernels not having
NUMBER(MAX_PHYSMEM_BITS) in vmcoreinfo.
Thanks,
Kazu
>
> After going through the regression reports, I think exporting
> 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' to vmcoreinfo is sufficient
> for the above regressions (without over-complicating the stuff) as
> ARM64_TCR.T1SZ and friends seem to arch specific as compared to
> VA_BITS + 'MAX_USER_VA_BITS' .
>
> Thanks,
> Bhupesh
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 58+ messages in thread
* RE: [PATCH] arm64, vmcoreinfo : Append 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' to vmcoreinfo
@ 2019-02-19 20:47 ` Kazuhito Hagio
0 siblings, 0 replies; 58+ messages in thread
From: Kazuhito Hagio @ 2019-02-19 20:47 UTC (permalink / raw)
To: Bhupesh Sharma
Cc: Mark Rutland, lijiang, bhe, ard.biesheuvel, catalin.marinas,
Steve Capper, kexec, Will Deacon, AKASHI Takahiro, James Morse,
Kristina Martsenko, Borislav Petkov, anderson, Dave Young,
linux-arm-kernel
Hi Bhupesh,
-----Original Message-----
> I am not sure you got a chance to look at the two regression cases I
> reported here:
> <http://lists.infradead.org/pipermail/kexec/2019-February/022449.html>
>
> Unfortunately the above suggestion doesn't provide any fix for
> ARMv8.2-LPA regression (see text under heading '
> (1). Regression Case 1 (ARMv8.2-LPA enabled kernel)')
As for MAX_PHYSMEM_BITS, I realized that ppc64 makedumpfile can detect
it because there is only one SECTION_SIZE_BITS for ppc64. I think we
can use the same way as set_ppc64_max_physmem_bits() does also for
arm64 for now. I'm going to write it for kernels not having
NUMBER(MAX_PHYSMEM_BITS) in vmcoreinfo.
Thanks,
Kazu
>
> After going through the regression reports, I think exporting
> 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' to vmcoreinfo is sufficient
> for the above regressions (without over-complicating the stuff) as
> ARM64_TCR.T1SZ and friends seem to arch specific as compared to
> VA_BITS + 'MAX_USER_VA_BITS' .
>
> Thanks,
> Bhupesh
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH] arm64, vmcoreinfo : Append 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' to vmcoreinfo
2019-02-19 20:47 ` Kazuhito Hagio
@ 2019-02-21 16:20 ` Bhupesh Sharma
-1 siblings, 0 replies; 58+ messages in thread
From: Bhupesh Sharma @ 2019-02-21 16:20 UTC (permalink / raw)
To: Kazuhito Hagio
Cc: Mark Rutland, lijiang, bhe, ard.biesheuvel, catalin.marinas,
Steve Capper, kexec, Will Deacon, AKASHI Takahiro, James Morse,
Kristina Martsenko, Borislav Petkov, anderson, Dave Young,
linux-arm-kernel
Hi Kazu,
On 02/20/2019 02:17 AM, Kazuhito Hagio wrote:
> Hi Bhupesh,
>
> -----Original Message-----
>> I am not sure you got a chance to look at the two regression cases I
>> reported here:
>> <http://lists.infradead.org/pipermail/kexec/2019-February/022449.html>
>>
>> Unfortunately the above suggestion doesn't provide any fix for
>> ARMv8.2-LPA regression (see text under heading '
>> (1). Regression Case 1 (ARMv8.2-LPA enabled kernel)')
>
> As for MAX_PHYSMEM_BITS, I realized that ppc64 makedumpfile can detect
> it because there is only one SECTION_SIZE_BITS for ppc64. I think we
> can use the same way as set_ppc64_max_physmem_bits() does also for
> arm64 for now. I'm going to write it for kernels not having
> NUMBER(MAX_PHYSMEM_BITS) in vmcoreinfo.
I see two drawbacks with the above approach:
a). This means that other user-space tools like crash-utility would
still be broken and would probably need to find MAX_PHYSMEM_BITS for
arm64 via a similar (hack'ish ?) approach.
b). I am looking at the makedumpfile code for 'MAX_PHYSMEM_BITS'
determination for two archs as an example:
ppc
---
int
set_ppc64_max_physmem_bits(void)
{
long array_len = ARRAY_LENGTH(mem_section);
/*
* The older ppc64 kernels uses _MAX_PHYSMEM_BITS as 42 and the
* newer kernels 3.7 onwards uses 46 bits.
*/
info->max_physmem_bits = _MAX_PHYSMEM_BITS_ORIG ;
if ((array_len == (NR_MEM_SECTIONS() /
_SECTIONS_PER_ROOT_EXTREME()))
|| (array_len == (NR_MEM_SECTIONS() /
_SECTIONS_PER_ROOT())))
return TRUE;
info->max_physmem_bits = _MAX_PHYSMEM_BITS_3_7;
if ((array_len == (NR_MEM_SECTIONS() /
_SECTIONS_PER_ROOT_EXTREME()))
|| (array_len == (NR_MEM_SECTIONS() /
_SECTIONS_PER_ROOT())))
return TRUE;
info->max_physmem_bits = _MAX_PHYSMEM_BITS_4_19;
if ((array_len == (NR_MEM_SECTIONS() /
_SECTIONS_PER_ROOT_EXTREME()))
|| (array_len == (NR_MEM_SECTIONS() /
_SECTIONS_PER_ROOT())))
return TRUE;
info->max_physmem_bits = _MAX_PHYSMEM_BITS_4_20;
if ((array_len == (NR_MEM_SECTIONS() /
_SECTIONS_PER_ROOT_EXTREME()))
|| (array_len == (NR_MEM_SECTIONS() /
_SECTIONS_PER_ROOT())))
return TRUE;
return FALSE;
}
x86_64:
------
int
get_versiondep_info_x86_64(void)
{
/*
* On linux-2.6.26, MAX_PHYSMEM_BITS is changed to 44 from 40.
*/
if (info->kernel_version < KERNEL_VERSION(2, 6, 26))
info->max_physmem_bits = _MAX_PHYSMEM_BITS_ORIG;
else if (info->kernel_version < KERNEL_VERSION(2, 6, 31))
info->max_physmem_bits = _MAX_PHYSMEM_BITS_2_6_26;
else if(check_5level_paging())
info->max_physmem_bits = _MAX_PHYSMEM_BITS_5LEVEL;
else
info->max_physmem_bits = _MAX_PHYSMEM_BITS_2_6_31;
...
}
Looking at the above, two questions come to my mind:
- Do we really need all the above complexity in user-space code, to hoop
across various kernel versions and perform allocations for something
that can be so easily exported via vmcoreinfo? Also we need to see how
portable is the above code for a new kernel version - IMO, it will need
another fix patch when we update to a new kernel version in near future.
- Also do we need to replicate the above implementations across
user-space tools when they can also utilize the vmcoreinfo information
to determine the PA_BITS range without any additional arch/kernel
version specific details as the single point of obtaining this
information from the kernel?
So, in view of the above, I would still advocate that we use a
vmcoreinfo export for 'MAX_PHYSMEM_BITS' as well to have a uniform
interface for the same across all user-land applications.
Thanks,
Bhupesh
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH] arm64, vmcoreinfo : Append 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' to vmcoreinfo
@ 2019-02-21 16:20 ` Bhupesh Sharma
0 siblings, 0 replies; 58+ messages in thread
From: Bhupesh Sharma @ 2019-02-21 16:20 UTC (permalink / raw)
To: Kazuhito Hagio
Cc: Mark Rutland, lijiang, bhe, ard.biesheuvel, catalin.marinas,
Steve Capper, kexec, Will Deacon, AKASHI Takahiro, James Morse,
Kristina Martsenko, Borislav Petkov, anderson, Dave Young,
linux-arm-kernel
Hi Kazu,
On 02/20/2019 02:17 AM, Kazuhito Hagio wrote:
> Hi Bhupesh,
>
> -----Original Message-----
>> I am not sure you got a chance to look at the two regression cases I
>> reported here:
>> <http://lists.infradead.org/pipermail/kexec/2019-February/022449.html>
>>
>> Unfortunately the above suggestion doesn't provide any fix for
>> ARMv8.2-LPA regression (see text under heading '
>> (1). Regression Case 1 (ARMv8.2-LPA enabled kernel)')
>
> As for MAX_PHYSMEM_BITS, I realized that ppc64 makedumpfile can detect
> it because there is only one SECTION_SIZE_BITS for ppc64. I think we
> can use the same way as set_ppc64_max_physmem_bits() does also for
> arm64 for now. I'm going to write it for kernels not having
> NUMBER(MAX_PHYSMEM_BITS) in vmcoreinfo.
I see two drawbacks with the above approach:
a). This means that other user-space tools like crash-utility would
still be broken and would probably need to find MAX_PHYSMEM_BITS for
arm64 via a similar (hack'ish ?) approach.
b). I am looking at the makedumpfile code for 'MAX_PHYSMEM_BITS'
determination for two archs as an example:
ppc
---
int
set_ppc64_max_physmem_bits(void)
{
long array_len = ARRAY_LENGTH(mem_section);
/*
* The older ppc64 kernels uses _MAX_PHYSMEM_BITS as 42 and the
* newer kernels 3.7 onwards uses 46 bits.
*/
info->max_physmem_bits = _MAX_PHYSMEM_BITS_ORIG ;
if ((array_len == (NR_MEM_SECTIONS() /
_SECTIONS_PER_ROOT_EXTREME()))
|| (array_len == (NR_MEM_SECTIONS() /
_SECTIONS_PER_ROOT())))
return TRUE;
info->max_physmem_bits = _MAX_PHYSMEM_BITS_3_7;
if ((array_len == (NR_MEM_SECTIONS() /
_SECTIONS_PER_ROOT_EXTREME()))
|| (array_len == (NR_MEM_SECTIONS() /
_SECTIONS_PER_ROOT())))
return TRUE;
info->max_physmem_bits = _MAX_PHYSMEM_BITS_4_19;
if ((array_len == (NR_MEM_SECTIONS() /
_SECTIONS_PER_ROOT_EXTREME()))
|| (array_len == (NR_MEM_SECTIONS() /
_SECTIONS_PER_ROOT())))
return TRUE;
info->max_physmem_bits = _MAX_PHYSMEM_BITS_4_20;
if ((array_len == (NR_MEM_SECTIONS() /
_SECTIONS_PER_ROOT_EXTREME()))
|| (array_len == (NR_MEM_SECTIONS() /
_SECTIONS_PER_ROOT())))
return TRUE;
return FALSE;
}
x86_64:
------
int
get_versiondep_info_x86_64(void)
{
/*
* On linux-2.6.26, MAX_PHYSMEM_BITS is changed to 44 from 40.
*/
if (info->kernel_version < KERNEL_VERSION(2, 6, 26))
info->max_physmem_bits = _MAX_PHYSMEM_BITS_ORIG;
else if (info->kernel_version < KERNEL_VERSION(2, 6, 31))
info->max_physmem_bits = _MAX_PHYSMEM_BITS_2_6_26;
else if(check_5level_paging())
info->max_physmem_bits = _MAX_PHYSMEM_BITS_5LEVEL;
else
info->max_physmem_bits = _MAX_PHYSMEM_BITS_2_6_31;
...
}
Looking at the above, two questions come to my mind:
- Do we really need all the above complexity in user-space code, to hoop
across various kernel versions and perform allocations for something
that can be so easily exported via vmcoreinfo? Also we need to see how
portable is the above code for a new kernel version - IMO, it will need
another fix patch when we update to a new kernel version in near future.
- Also do we need to replicate the above implementations across
user-space tools when they can also utilize the vmcoreinfo information
to determine the PA_BITS range without any additional arch/kernel
version specific details as the single point of obtaining this
information from the kernel?
So, in view of the above, I would still advocate that we use a
vmcoreinfo export for 'MAX_PHYSMEM_BITS' as well to have a uniform
interface for the same across all user-land applications.
Thanks,
Bhupesh
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH] arm64, vmcoreinfo : Append 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' to vmcoreinfo
2019-02-21 16:20 ` Bhupesh Sharma
@ 2019-02-21 16:42 ` Dave Anderson
-1 siblings, 0 replies; 58+ messages in thread
From: Dave Anderson @ 2019-02-21 16:42 UTC (permalink / raw)
To: Bhupesh Sharma
Cc: Mark Rutland, Kazuhito Hagio, lijiang, bhe, ard biesheuvel,
catalin marinas, Steve Capper, kexec, Will Deacon,
AKASHI Takahiro, James Morse, Kristina Martsenko,
Borislav Petkov, Dave Young, linux-arm-kernel
----- Original Message -----
> Hi Kazu,
>
> On 02/20/2019 02:17 AM, Kazuhito Hagio wrote:
> > Hi Bhupesh,
> >
> > -----Original Message-----
> >> I am not sure you got a chance to look at the two regression cases I
> >> reported here:
> >> <http://lists.infradead.org/pipermail/kexec/2019-February/022449.html>
> >>
> >> Unfortunately the above suggestion doesn't provide any fix for
> >> ARMv8.2-LPA regression (see text under heading '
> >> (1). Regression Case 1 (ARMv8.2-LPA enabled kernel)')
> >
> > As for MAX_PHYSMEM_BITS, I realized that ppc64 makedumpfile can detect
> > it because there is only one SECTION_SIZE_BITS for ppc64. I think we
> > can use the same way as set_ppc64_max_physmem_bits() does also for
> > arm64 for now. I'm going to write it for kernels not having
> > NUMBER(MAX_PHYSMEM_BITS) in vmcoreinfo.
>
> I see two drawbacks with the above approach:
>
> a). This means that other user-space tools like crash-utility would
> still be broken and would probably need to find MAX_PHYSMEM_BITS for
> arm64 via a similar (hack'ish ?) approach.
>
> b). I am looking at the makedumpfile code for 'MAX_PHYSMEM_BITS'
> determination for two archs as an example:
>
> ppc
> ---
>
> int
> set_ppc64_max_physmem_bits(void)
> {
> long array_len = ARRAY_LENGTH(mem_section);
> /*
> * The older ppc64 kernels uses _MAX_PHYSMEM_BITS as 42 and the
> * newer kernels 3.7 onwards uses 46 bits.
> */
>
> info->max_physmem_bits = _MAX_PHYSMEM_BITS_ORIG ;
> if ((array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT_EXTREME()))
> || (array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT())))
> return TRUE;
>
> info->max_physmem_bits = _MAX_PHYSMEM_BITS_3_7;
> if ((array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT_EXTREME()))
> || (array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT())))
> return TRUE;
>
> info->max_physmem_bits = _MAX_PHYSMEM_BITS_4_19;
> if ((array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT_EXTREME()))
> || (array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT())))
> return TRUE;
>
> info->max_physmem_bits = _MAX_PHYSMEM_BITS_4_20;
> if ((array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT_EXTREME()))
> || (array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT())))
> return TRUE;
>
> return FALSE;
> }
>
> x86_64:
> ------
>
> int
> get_versiondep_info_x86_64(void)
> {
> /*
> * On linux-2.6.26, MAX_PHYSMEM_BITS is changed to 44 from 40.
> */
> if (info->kernel_version < KERNEL_VERSION(2, 6, 26))
> info->max_physmem_bits = _MAX_PHYSMEM_BITS_ORIG;
> else if (info->kernel_version < KERNEL_VERSION(2, 6, 31))
> info->max_physmem_bits = _MAX_PHYSMEM_BITS_2_6_26;
> else if(check_5level_paging())
> info->max_physmem_bits = _MAX_PHYSMEM_BITS_5LEVEL;
> else
> info->max_physmem_bits = _MAX_PHYSMEM_BITS_2_6_31;
>
> ...
> }
>
> Looking at the above, two questions come to my mind:
>
> - Do we really need all the above complexity in user-space code, to hoop
> across various kernel versions and perform allocations for something
> that can be so easily exported via vmcoreinfo? Also we need to see how
> portable is the above code for a new kernel version - IMO, it will need
> another fix patch when we update to a new kernel version in near future.
I agree -- not to mention that the "kernel version" way of determining things
does not account for distribution-specific backports.
>
> - Also do we need to replicate the above implementations across
> user-space tools when they can also utilize the vmcoreinfo information
> to determine the PA_BITS range without any additional arch/kernel
> version specific details as the single point of obtaining this
> information from the kernel?
>
> So, in view of the above, I would still advocate that we use a
> vmcoreinfo export for 'MAX_PHYSMEM_BITS' as well to have a uniform
> interface for the same across all user-land applications.
Again, totally agree.
Dave
> Thanks,
> Bhupesh
>
>
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH] arm64, vmcoreinfo : Append 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' to vmcoreinfo
@ 2019-02-21 16:42 ` Dave Anderson
0 siblings, 0 replies; 58+ messages in thread
From: Dave Anderson @ 2019-02-21 16:42 UTC (permalink / raw)
To: Bhupesh Sharma
Cc: Mark Rutland, Kazuhito Hagio, lijiang, bhe, ard biesheuvel,
catalin marinas, Steve Capper, kexec, Will Deacon,
AKASHI Takahiro, James Morse, Kristina Martsenko,
Borislav Petkov, Dave Young, linux-arm-kernel
----- Original Message -----
> Hi Kazu,
>
> On 02/20/2019 02:17 AM, Kazuhito Hagio wrote:
> > Hi Bhupesh,
> >
> > -----Original Message-----
> >> I am not sure you got a chance to look at the two regression cases I
> >> reported here:
> >> <http://lists.infradead.org/pipermail/kexec/2019-February/022449.html>
> >>
> >> Unfortunately the above suggestion doesn't provide any fix for
> >> ARMv8.2-LPA regression (see text under heading '
> >> (1). Regression Case 1 (ARMv8.2-LPA enabled kernel)')
> >
> > As for MAX_PHYSMEM_BITS, I realized that ppc64 makedumpfile can detect
> > it because there is only one SECTION_SIZE_BITS for ppc64. I think we
> > can use the same way as set_ppc64_max_physmem_bits() does also for
> > arm64 for now. I'm going to write it for kernels not having
> > NUMBER(MAX_PHYSMEM_BITS) in vmcoreinfo.
>
> I see two drawbacks with the above approach:
>
> a). This means that other user-space tools like crash-utility would
> still be broken and would probably need to find MAX_PHYSMEM_BITS for
> arm64 via a similar (hack'ish ?) approach.
>
> b). I am looking at the makedumpfile code for 'MAX_PHYSMEM_BITS'
> determination for two archs as an example:
>
> ppc
> ---
>
> int
> set_ppc64_max_physmem_bits(void)
> {
> long array_len = ARRAY_LENGTH(mem_section);
> /*
> * The older ppc64 kernels uses _MAX_PHYSMEM_BITS as 42 and the
> * newer kernels 3.7 onwards uses 46 bits.
> */
>
> info->max_physmem_bits = _MAX_PHYSMEM_BITS_ORIG ;
> if ((array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT_EXTREME()))
> || (array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT())))
> return TRUE;
>
> info->max_physmem_bits = _MAX_PHYSMEM_BITS_3_7;
> if ((array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT_EXTREME()))
> || (array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT())))
> return TRUE;
>
> info->max_physmem_bits = _MAX_PHYSMEM_BITS_4_19;
> if ((array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT_EXTREME()))
> || (array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT())))
> return TRUE;
>
> info->max_physmem_bits = _MAX_PHYSMEM_BITS_4_20;
> if ((array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT_EXTREME()))
> || (array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT())))
> return TRUE;
>
> return FALSE;
> }
>
> x86_64:
> ------
>
> int
> get_versiondep_info_x86_64(void)
> {
> /*
> * On linux-2.6.26, MAX_PHYSMEM_BITS is changed to 44 from 40.
> */
> if (info->kernel_version < KERNEL_VERSION(2, 6, 26))
> info->max_physmem_bits = _MAX_PHYSMEM_BITS_ORIG;
> else if (info->kernel_version < KERNEL_VERSION(2, 6, 31))
> info->max_physmem_bits = _MAX_PHYSMEM_BITS_2_6_26;
> else if(check_5level_paging())
> info->max_physmem_bits = _MAX_PHYSMEM_BITS_5LEVEL;
> else
> info->max_physmem_bits = _MAX_PHYSMEM_BITS_2_6_31;
>
> ...
> }
>
> Looking at the above, two questions come to my mind:
>
> - Do we really need all the above complexity in user-space code, to hoop
> across various kernel versions and perform allocations for something
> that can be so easily exported via vmcoreinfo? Also we need to see how
> portable is the above code for a new kernel version - IMO, it will need
> another fix patch when we update to a new kernel version in near future.
I agree -- not to mention that the "kernel version" way of determining things
does not account for distribution-specific backports.
>
> - Also do we need to replicate the above implementations across
> user-space tools when they can also utilize the vmcoreinfo information
> to determine the PA_BITS range without any additional arch/kernel
> version specific details as the single point of obtaining this
> information from the kernel?
>
> So, in view of the above, I would still advocate that we use a
> vmcoreinfo export for 'MAX_PHYSMEM_BITS' as well to have a uniform
> interface for the same across all user-land applications.
Again, totally agree.
Dave
> Thanks,
> Bhupesh
>
>
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
^ permalink raw reply [flat|nested] 58+ messages in thread
* RE: [PATCH] arm64, vmcoreinfo : Append 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' to vmcoreinfo
2019-02-21 16:42 ` Dave Anderson
@ 2019-02-21 19:02 ` Kazuhito Hagio
-1 siblings, 0 replies; 58+ messages in thread
From: Kazuhito Hagio @ 2019-02-21 19:02 UTC (permalink / raw)
To: Dave Anderson, Bhupesh Sharma
Cc: Mark Rutland, lijiang, bhe, Steve Capper, catalin marinas,
ard biesheuvel, kexec, Will Deacon, AKASHI Takahiro, James Morse,
Kristina Martsenko, Borislav Petkov, Dave Young,
linux-arm-kernel
-----Original Message-----
> ----- Original Message -----
> > Hi Kazu,
> >
> > On 02/20/2019 02:17 AM, Kazuhito Hagio wrote:
> > > Hi Bhupesh,
> > >
> > > -----Original Message-----
> > >> I am not sure you got a chance to look at the two regression cases I
> > >> reported here:
> > >> <http://lists.infradead.org/pipermail/kexec/2019-February/022449.html>
> > >>
> > >> Unfortunately the above suggestion doesn't provide any fix for
> > >> ARMv8.2-LPA regression (see text under heading '
> > >> (1). Regression Case 1 (ARMv8.2-LPA enabled kernel)')
> > >
> > > As for MAX_PHYSMEM_BITS, I realized that ppc64 makedumpfile can detect
> > > it because there is only one SECTION_SIZE_BITS for ppc64. I think we
> > > can use the same way as set_ppc64_max_physmem_bits() does also for
> > > arm64 for now. I'm going to write it for kernels not having
> > > NUMBER(MAX_PHYSMEM_BITS) in vmcoreinfo.
> >
> > I see two drawbacks with the above approach:
> >
> > a). This means that other user-space tools like crash-utility would
> > still be broken and would probably need to find MAX_PHYSMEM_BITS for
> > arm64 via a similar (hack'ish ?) approach.
> >
> > b). I am looking at the makedumpfile code for 'MAX_PHYSMEM_BITS'
> > determination for two archs as an example:
> >
> > ppc
> > ---
> >
> > int
> > set_ppc64_max_physmem_bits(void)
> > {
> > long array_len = ARRAY_LENGTH(mem_section);
> > /*
> > * The older ppc64 kernels uses _MAX_PHYSMEM_BITS as 42 and the
> > * newer kernels 3.7 onwards uses 46 bits.
> > */
> >
> > info->max_physmem_bits = _MAX_PHYSMEM_BITS_ORIG ;
> > if ((array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT_EXTREME()))
> > || (array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT())))
> > return TRUE;
> >
> > info->max_physmem_bits = _MAX_PHYSMEM_BITS_3_7;
> > if ((array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT_EXTREME()))
> > || (array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT())))
> > return TRUE;
> >
> > info->max_physmem_bits = _MAX_PHYSMEM_BITS_4_19;
> > if ((array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT_EXTREME()))
> > || (array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT())))
> > return TRUE;
> >
> > info->max_physmem_bits = _MAX_PHYSMEM_BITS_4_20;
> > if ((array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT_EXTREME()))
> > || (array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT())))
> > return TRUE;
> >
> > return FALSE;
> > }
> >
> > x86_64:
> > ------
> >
> > int
> > get_versiondep_info_x86_64(void)
> > {
> > /*
> > * On linux-2.6.26, MAX_PHYSMEM_BITS is changed to 44 from 40.
> > */
> > if (info->kernel_version < KERNEL_VERSION(2, 6, 26))
> > info->max_physmem_bits = _MAX_PHYSMEM_BITS_ORIG;
> > else if (info->kernel_version < KERNEL_VERSION(2, 6, 31))
> > info->max_physmem_bits = _MAX_PHYSMEM_BITS_2_6_26;
> > else if(check_5level_paging())
> > info->max_physmem_bits = _MAX_PHYSMEM_BITS_5LEVEL;
> > else
> > info->max_physmem_bits = _MAX_PHYSMEM_BITS_2_6_31;
> >
> > ...
> > }
> >
> > Looking at the above, two questions come to my mind:
> >
> > - Do we really need all the above complexity in user-space code, to hoop
> > across various kernel versions and perform allocations for something
> > that can be so easily exported via vmcoreinfo? Also we need to see how
> > portable is the above code for a new kernel version - IMO, it will need
> > another fix patch when we update to a new kernel version in near future.
>
> I agree -- not to mention that the "kernel version" way of determining things
> does not account for distribution-specific backports.
>
> >
> > - Also do we need to replicate the above implementations across
> > user-space tools when they can also utilize the vmcoreinfo information
> > to determine the PA_BITS range without any additional arch/kernel
> > version specific details as the single point of obtaining this
> > information from the kernel?
> >
> > So, in view of the above, I would still advocate that we use a
> > vmcoreinfo export for 'MAX_PHYSMEM_BITS' as well to have a uniform
> > interface for the same across all user-land applications.
>
> Again, totally agree.
I also agree that we should do so. Then it will be better to have
it in kernel core code, not in arch-specific code.
Although makedumpfile may have to have the kludge for kernels that
support 52-bit PA and don't have the exported MAX_PHYSMEM_BITS
sooner or later..
Thanks,
Kazu
>
> Dave
>
>
> > Thanks,
> > Bhupesh
> >
> >
>
> _______________________________________________
> kexec mailing list
> kexec@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 58+ messages in thread
* RE: [PATCH] arm64, vmcoreinfo : Append 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' to vmcoreinfo
@ 2019-02-21 19:02 ` Kazuhito Hagio
0 siblings, 0 replies; 58+ messages in thread
From: Kazuhito Hagio @ 2019-02-21 19:02 UTC (permalink / raw)
To: Dave Anderson, Bhupesh Sharma
Cc: Mark Rutland, lijiang, bhe, Steve Capper, catalin marinas,
ard biesheuvel, kexec, Will Deacon, AKASHI Takahiro, James Morse,
Kristina Martsenko, Borislav Petkov, Dave Young,
linux-arm-kernel
-----Original Message-----
> ----- Original Message -----
> > Hi Kazu,
> >
> > On 02/20/2019 02:17 AM, Kazuhito Hagio wrote:
> > > Hi Bhupesh,
> > >
> > > -----Original Message-----
> > >> I am not sure you got a chance to look at the two regression cases I
> > >> reported here:
> > >> <http://lists.infradead.org/pipermail/kexec/2019-February/022449.html>
> > >>
> > >> Unfortunately the above suggestion doesn't provide any fix for
> > >> ARMv8.2-LPA regression (see text under heading '
> > >> (1). Regression Case 1 (ARMv8.2-LPA enabled kernel)')
> > >
> > > As for MAX_PHYSMEM_BITS, I realized that ppc64 makedumpfile can detect
> > > it because there is only one SECTION_SIZE_BITS for ppc64. I think we
> > > can use the same way as set_ppc64_max_physmem_bits() does also for
> > > arm64 for now. I'm going to write it for kernels not having
> > > NUMBER(MAX_PHYSMEM_BITS) in vmcoreinfo.
> >
> > I see two drawbacks with the above approach:
> >
> > a). This means that other user-space tools like crash-utility would
> > still be broken and would probably need to find MAX_PHYSMEM_BITS for
> > arm64 via a similar (hack'ish ?) approach.
> >
> > b). I am looking at the makedumpfile code for 'MAX_PHYSMEM_BITS'
> > determination for two archs as an example:
> >
> > ppc
> > ---
> >
> > int
> > set_ppc64_max_physmem_bits(void)
> > {
> > long array_len = ARRAY_LENGTH(mem_section);
> > /*
> > * The older ppc64 kernels uses _MAX_PHYSMEM_BITS as 42 and the
> > * newer kernels 3.7 onwards uses 46 bits.
> > */
> >
> > info->max_physmem_bits = _MAX_PHYSMEM_BITS_ORIG ;
> > if ((array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT_EXTREME()))
> > || (array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT())))
> > return TRUE;
> >
> > info->max_physmem_bits = _MAX_PHYSMEM_BITS_3_7;
> > if ((array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT_EXTREME()))
> > || (array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT())))
> > return TRUE;
> >
> > info->max_physmem_bits = _MAX_PHYSMEM_BITS_4_19;
> > if ((array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT_EXTREME()))
> > || (array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT())))
> > return TRUE;
> >
> > info->max_physmem_bits = _MAX_PHYSMEM_BITS_4_20;
> > if ((array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT_EXTREME()))
> > || (array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT())))
> > return TRUE;
> >
> > return FALSE;
> > }
> >
> > x86_64:
> > ------
> >
> > int
> > get_versiondep_info_x86_64(void)
> > {
> > /*
> > * On linux-2.6.26, MAX_PHYSMEM_BITS is changed to 44 from 40.
> > */
> > if (info->kernel_version < KERNEL_VERSION(2, 6, 26))
> > info->max_physmem_bits = _MAX_PHYSMEM_BITS_ORIG;
> > else if (info->kernel_version < KERNEL_VERSION(2, 6, 31))
> > info->max_physmem_bits = _MAX_PHYSMEM_BITS_2_6_26;
> > else if(check_5level_paging())
> > info->max_physmem_bits = _MAX_PHYSMEM_BITS_5LEVEL;
> > else
> > info->max_physmem_bits = _MAX_PHYSMEM_BITS_2_6_31;
> >
> > ...
> > }
> >
> > Looking at the above, two questions come to my mind:
> >
> > - Do we really need all the above complexity in user-space code, to hoop
> > across various kernel versions and perform allocations for something
> > that can be so easily exported via vmcoreinfo? Also we need to see how
> > portable is the above code for a new kernel version - IMO, it will need
> > another fix patch when we update to a new kernel version in near future.
>
> I agree -- not to mention that the "kernel version" way of determining things
> does not account for distribution-specific backports.
>
> >
> > - Also do we need to replicate the above implementations across
> > user-space tools when they can also utilize the vmcoreinfo information
> > to determine the PA_BITS range without any additional arch/kernel
> > version specific details as the single point of obtaining this
> > information from the kernel?
> >
> > So, in view of the above, I would still advocate that we use a
> > vmcoreinfo export for 'MAX_PHYSMEM_BITS' as well to have a uniform
> > interface for the same across all user-land applications.
>
> Again, totally agree.
I also agree that we should do so. Then it will be better to have
it in kernel core code, not in arch-specific code.
Although makedumpfile may have to have the kludge for kernels that
support 52-bit PA and don't have the exported MAX_PHYSMEM_BITS
sooner or later..
Thanks,
Kazu
>
> Dave
>
>
> > Thanks,
> > Bhupesh
> >
> >
>
> _______________________________________________
> kexec mailing list
> kexec@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH] arm64, vmcoreinfo : Append 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' to vmcoreinfo
2019-02-21 19:02 ` Kazuhito Hagio
@ 2019-03-01 4:01 ` Bhupesh Sharma
-1 siblings, 0 replies; 58+ messages in thread
From: Bhupesh Sharma @ 2019-03-01 4:01 UTC (permalink / raw)
To: Kazuhito Hagio, Dave Anderson
Cc: Mark Rutland, lijiang, bhe, ard biesheuvel, catalin marinas,
Steve Capper, kexec, Will Deacon, AKASHI Takahiro, James Morse,
Kristina Martsenko, Borislav Petkov, Dave Young,
linux-arm-kernel
Hi Kazu,
On 02/22/2019 12:32 AM, Kazuhito Hagio wrote:
> -----Original Message-----
>> ----- Original Message -----
>>> Hi Kazu,
>>>
>>> On 02/20/2019 02:17 AM, Kazuhito Hagio wrote:
>>>> Hi Bhupesh,
>>>>
>>>> -----Original Message-----
>>>>> I am not sure you got a chance to look at the two regression cases I
>>>>> reported here:
>>>>> <http://lists.infradead.org/pipermail/kexec/2019-February/022449.html>
>>>>>
>>>>> Unfortunately the above suggestion doesn't provide any fix for
>>>>> ARMv8.2-LPA regression (see text under heading '
>>>>> (1). Regression Case 1 (ARMv8.2-LPA enabled kernel)')
>>>>
>>>> As for MAX_PHYSMEM_BITS, I realized that ppc64 makedumpfile can detect
>>>> it because there is only one SECTION_SIZE_BITS for ppc64. I think we
>>>> can use the same way as set_ppc64_max_physmem_bits() does also for
>>>> arm64 for now. I'm going to write it for kernels not having
>>>> NUMBER(MAX_PHYSMEM_BITS) in vmcoreinfo.
>>>
>>> I see two drawbacks with the above approach:
>>>
>>> a). This means that other user-space tools like crash-utility would
>>> still be broken and would probably need to find MAX_PHYSMEM_BITS for
>>> arm64 via a similar (hack'ish ?) approach.
>>>
>>> b). I am looking at the makedumpfile code for 'MAX_PHYSMEM_BITS'
>>> determination for two archs as an example:
>>>
>>> ppc
>>> ---
>>>
>>> int
>>> set_ppc64_max_physmem_bits(void)
>>> {
>>> long array_len = ARRAY_LENGTH(mem_section);
>>> /*
>>> * The older ppc64 kernels uses _MAX_PHYSMEM_BITS as 42 and the
>>> * newer kernels 3.7 onwards uses 46 bits.
>>> */
>>>
>>> info->max_physmem_bits = _MAX_PHYSMEM_BITS_ORIG ;
>>> if ((array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT_EXTREME()))
>>> || (array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT())))
>>> return TRUE;
>>>
>>> info->max_physmem_bits = _MAX_PHYSMEM_BITS_3_7;
>>> if ((array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT_EXTREME()))
>>> || (array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT())))
>>> return TRUE;
>>>
>>> info->max_physmem_bits = _MAX_PHYSMEM_BITS_4_19;
>>> if ((array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT_EXTREME()))
>>> || (array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT())))
>>> return TRUE;
>>>
>>> info->max_physmem_bits = _MAX_PHYSMEM_BITS_4_20;
>>> if ((array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT_EXTREME()))
>>> || (array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT())))
>>> return TRUE;
>>>
>>> return FALSE;
>>> }
>>>
>>> x86_64:
>>> ------
>>>
>>> int
>>> get_versiondep_info_x86_64(void)
>>> {
>>> /*
>>> * On linux-2.6.26, MAX_PHYSMEM_BITS is changed to 44 from 40.
>>> */
>>> if (info->kernel_version < KERNEL_VERSION(2, 6, 26))
>>> info->max_physmem_bits = _MAX_PHYSMEM_BITS_ORIG;
>>> else if (info->kernel_version < KERNEL_VERSION(2, 6, 31))
>>> info->max_physmem_bits = _MAX_PHYSMEM_BITS_2_6_26;
>>> else if(check_5level_paging())
>>> info->max_physmem_bits = _MAX_PHYSMEM_BITS_5LEVEL;
>>> else
>>> info->max_physmem_bits = _MAX_PHYSMEM_BITS_2_6_31;
>>>
>>> ...
>>> }
>>>
>>> Looking at the above, two questions come to my mind:
>>>
>>> - Do we really need all the above complexity in user-space code, to hoop
>>> across various kernel versions and perform allocations for something
>>> that can be so easily exported via vmcoreinfo? Also we need to see how
>>> portable is the above code for a new kernel version - IMO, it will need
>>> another fix patch when we update to a new kernel version in near future.
>>
>> I agree -- not to mention that the "kernel version" way of determining things
>> does not account for distribution-specific backports.
>>
>>>
>>> - Also do we need to replicate the above implementations across
>>> user-space tools when they can also utilize the vmcoreinfo information
>>> to determine the PA_BITS range without any additional arch/kernel
>>> version specific details as the single point of obtaining this
>>> information from the kernel?
>>>
>>> So, in view of the above, I would still advocate that we use a
>>> vmcoreinfo export for 'MAX_PHYSMEM_BITS' as well to have a uniform
>>> interface for the same across all user-land applications.
>>
>> Again, totally agree.
>
> I also agree that we should do so. Then it will be better to have
> it in kernel core code, not in arch-specific code.
>
> Although makedumpfile may have to have the kludge for kernels that
> support 52-bit PA and don't have the exported MAX_PHYSMEM_BITS
> sooner or later..
I was waiting for maintainers Cc'ed in this thread to share their
opinions on the above suggestion.
We especially need comments from x86 maintainers on this (Boris), since
x86 and ppc user-space utilities are currently not broken (no regression
reported) due to a lacking export of 'MAX_PHYSMEM_BITS' via vmcore,
whereas as I shared earlier arm64 user-space utilities are broken right
now with 52-bit address space changes.
Thanks,
Bhupesh
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH] arm64, vmcoreinfo : Append 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' to vmcoreinfo
@ 2019-03-01 4:01 ` Bhupesh Sharma
0 siblings, 0 replies; 58+ messages in thread
From: Bhupesh Sharma @ 2019-03-01 4:01 UTC (permalink / raw)
To: Kazuhito Hagio, Dave Anderson
Cc: Mark Rutland, lijiang, bhe, ard biesheuvel, catalin marinas,
Steve Capper, kexec, Will Deacon, AKASHI Takahiro, James Morse,
Kristina Martsenko, Borislav Petkov, Dave Young,
linux-arm-kernel
Hi Kazu,
On 02/22/2019 12:32 AM, Kazuhito Hagio wrote:
> -----Original Message-----
>> ----- Original Message -----
>>> Hi Kazu,
>>>
>>> On 02/20/2019 02:17 AM, Kazuhito Hagio wrote:
>>>> Hi Bhupesh,
>>>>
>>>> -----Original Message-----
>>>>> I am not sure you got a chance to look at the two regression cases I
>>>>> reported here:
>>>>> <http://lists.infradead.org/pipermail/kexec/2019-February/022449.html>
>>>>>
>>>>> Unfortunately the above suggestion doesn't provide any fix for
>>>>> ARMv8.2-LPA regression (see text under heading '
>>>>> (1). Regression Case 1 (ARMv8.2-LPA enabled kernel)')
>>>>
>>>> As for MAX_PHYSMEM_BITS, I realized that ppc64 makedumpfile can detect
>>>> it because there is only one SECTION_SIZE_BITS for ppc64. I think we
>>>> can use the same way as set_ppc64_max_physmem_bits() does also for
>>>> arm64 for now. I'm going to write it for kernels not having
>>>> NUMBER(MAX_PHYSMEM_BITS) in vmcoreinfo.
>>>
>>> I see two drawbacks with the above approach:
>>>
>>> a). This means that other user-space tools like crash-utility would
>>> still be broken and would probably need to find MAX_PHYSMEM_BITS for
>>> arm64 via a similar (hack'ish ?) approach.
>>>
>>> b). I am looking at the makedumpfile code for 'MAX_PHYSMEM_BITS'
>>> determination for two archs as an example:
>>>
>>> ppc
>>> ---
>>>
>>> int
>>> set_ppc64_max_physmem_bits(void)
>>> {
>>> long array_len = ARRAY_LENGTH(mem_section);
>>> /*
>>> * The older ppc64 kernels uses _MAX_PHYSMEM_BITS as 42 and the
>>> * newer kernels 3.7 onwards uses 46 bits.
>>> */
>>>
>>> info->max_physmem_bits = _MAX_PHYSMEM_BITS_ORIG ;
>>> if ((array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT_EXTREME()))
>>> || (array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT())))
>>> return TRUE;
>>>
>>> info->max_physmem_bits = _MAX_PHYSMEM_BITS_3_7;
>>> if ((array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT_EXTREME()))
>>> || (array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT())))
>>> return TRUE;
>>>
>>> info->max_physmem_bits = _MAX_PHYSMEM_BITS_4_19;
>>> if ((array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT_EXTREME()))
>>> || (array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT())))
>>> return TRUE;
>>>
>>> info->max_physmem_bits = _MAX_PHYSMEM_BITS_4_20;
>>> if ((array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT_EXTREME()))
>>> || (array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT())))
>>> return TRUE;
>>>
>>> return FALSE;
>>> }
>>>
>>> x86_64:
>>> ------
>>>
>>> int
>>> get_versiondep_info_x86_64(void)
>>> {
>>> /*
>>> * On linux-2.6.26, MAX_PHYSMEM_BITS is changed to 44 from 40.
>>> */
>>> if (info->kernel_version < KERNEL_VERSION(2, 6, 26))
>>> info->max_physmem_bits = _MAX_PHYSMEM_BITS_ORIG;
>>> else if (info->kernel_version < KERNEL_VERSION(2, 6, 31))
>>> info->max_physmem_bits = _MAX_PHYSMEM_BITS_2_6_26;
>>> else if(check_5level_paging())
>>> info->max_physmem_bits = _MAX_PHYSMEM_BITS_5LEVEL;
>>> else
>>> info->max_physmem_bits = _MAX_PHYSMEM_BITS_2_6_31;
>>>
>>> ...
>>> }
>>>
>>> Looking at the above, two questions come to my mind:
>>>
>>> - Do we really need all the above complexity in user-space code, to hoop
>>> across various kernel versions and perform allocations for something
>>> that can be so easily exported via vmcoreinfo? Also we need to see how
>>> portable is the above code for a new kernel version - IMO, it will need
>>> another fix patch when we update to a new kernel version in near future.
>>
>> I agree -- not to mention that the "kernel version" way of determining things
>> does not account for distribution-specific backports.
>>
>>>
>>> - Also do we need to replicate the above implementations across
>>> user-space tools when they can also utilize the vmcoreinfo information
>>> to determine the PA_BITS range without any additional arch/kernel
>>> version specific details as the single point of obtaining this
>>> information from the kernel?
>>>
>>> So, in view of the above, I would still advocate that we use a
>>> vmcoreinfo export for 'MAX_PHYSMEM_BITS' as well to have a uniform
>>> interface for the same across all user-land applications.
>>
>> Again, totally agree.
>
> I also agree that we should do so. Then it will be better to have
> it in kernel core code, not in arch-specific code.
>
> Although makedumpfile may have to have the kludge for kernels that
> support 52-bit PA and don't have the exported MAX_PHYSMEM_BITS
> sooner or later..
I was waiting for maintainers Cc'ed in this thread to share their
opinions on the above suggestion.
We especially need comments from x86 maintainers on this (Boris), since
x86 and ppc user-space utilities are currently not broken (no regression
reported) due to a lacking export of 'MAX_PHYSMEM_BITS' via vmcore,
whereas as I shared earlier arm64 user-space utilities are broken right
now with 52-bit address space changes.
Thanks,
Bhupesh
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH] arm64, vmcoreinfo : Append 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' to vmcoreinfo
2019-02-13 18:22 ` James Morse
@ 2019-02-14 19:30 ` Bhupesh Sharma
-1 siblings, 0 replies; 58+ messages in thread
From: Bhupesh Sharma @ 2019-02-14 19:30 UTC (permalink / raw)
To: James Morse
Cc: Mark Rutland, Kazuhito Hagio, lijiang, bhe, ard.biesheuvel,
catalin.marinas, kexec, Will Deacon, AKASHI Takahiro, anderson,
Borislav Petkov, Dave Young, linux-arm-kernel
Hi James,
On 02/13/2019 11:52 PM, James Morse wrote:
> Hi guys,
>
> On 13/02/2019 11:15, Dave Young wrote:
>> On 02/12/19 at 11:03pm, Kazuhito Hagio wrote:
>>> On 2/12/2019 2:59 PM, Bhupesh Sharma wrote:
>>>> BTW, in the makedumpfile enablement patch thread for ARMv8.2 LVA
>>>> (which I sent out for 52-bit User space VA enablement) (see [0]), Kazu
>>>> mentioned that the changes look necessary.
>>>>
>>>> [0]. http://lists.infradead.org/pipermail/kexec/2019-February/022431.html
>>>
>>>>>> The increased 'PTRS_PER_PGD' value for such cases needs to be then
>>>>>> calculated as is done by the underlying kernel
>
> Aha! Nothing to do with which-bits-are-pfn in the tables...
>
> You need to know if the top level PGD is 512bytes or bigger. As we use a
> kmem-cache the adjacent data could be some else's page tables.
>
> Is this really a problem though? You can't pull the user-space pgd pointers out
> of no-where, you must have walked some task_struct and struct_mm's to find them.
> In which case you would have the VMAs on hand to tell you if its in the mapped
> user range.
>
> It would be good to avoid putting something arch-specific in here if we can at
> all help it.
>
>
>>>>>> (see
>>>>>> 'arch/arm64/include/asm/pgtable-hwdef.h' for details):
>>>>>>
>>>>>> #define PTRS_PER_PGD (1 << (MAX_USER_VA_BITS - PGDIR_SHIFT))
>>>
>>> Yes, this is the reason why makedumpfile needs the MAX_USER_VA_BITS.
>>> It is used for pgd_index() also in makedumpfile to walk page tables.
>>>
>>> /* to find an entry in a page-table-directory */
>>> #define pgd_index(addr) (((addr) >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1))
>>
>> Since Dave mentioned crash tool does not need it, but crash should also
>> travel the pg tables.
>>
>> If this is really necessary it would be good to describe what will
>> happen without the patch, eg. some user visible error from an actual test etc.
>
> Yes please, it would really help if there was a specific example we could discuss.
Sure. Here are two use-cases/regressions reported and which I have been
able to reproduce. Note that I tested them both on a CPU which does not
support ARMv8.2-LPA/LVA and on ARMv8 FVP model (which supports ARMv8.2
extensions).
Environment:
------------
Latest Upstream kernel: sha-id: 1f947a7a011fcceb14cb912f5481a53b18f1879a
("Merge branch 'akpm' (patches from Andrew)")
Latest makedumpfile code: (git://git.code.sf.net/p/makedumpfile/code ,
branch: devel)
crash-utility code: (https://github.com/crash-utility/crash.git, sha-id:
e082c372c7f1a782b058ec359dfbbbee0f0b6aad)
Note that Dave A. has since fixed crash-utility by using a hardcoded
value of 'MAX_PHYSMEM_BITS' (via sha id:
ac5a7889d31bb37aa0687110ecea08837f8a66a8) and determining 'vabits_user'
value via vmlinux (via sha id: 8618ddd817621c40c1f44f0ab6df7c7805234416)
(1). Regression Case 1 (ARMv8.2-LPA enabled kernel):
- Upstream makedumpfile and crash-utility (with sha-id
e082c372c7f1a782b058ec359dfbbbee0f0b6aad) are broken on following kind
of platforms:
a. Upstream Kernel 5.0.0-rc6+ with the following kernel configuration:
CONFIG_ARM64_64K_PAGES=y
# CONFIG_ARM64_VA_BITS_42 is not set
CONFIG_ARM64_VA_BITS_48=y
# CONFIG_ARM64_USER_VA_BITS_52 is not set
CONFIG_ARM64_VA_BITS=48
# CONFIG_ARM64_PA_BITS_48 is not set
CONFIG_ARM64_PA_BITS_52=y
CONFIG_ARM64_PA_BITS=52
b. Both on CPUs which don't support ARMv8.2 LPA extension and on ARMv8
FVP model with ARMv8.2 LPA extensions.
- Error message from makedumpfile:
$ makedumpfile -f --mem-usage /proc/kcore -D
max_mapnr : a00000
kimage_voffset : fffeffff80000000
max_physmem_bits : 30
section_size_bits: 1e
vaddr_to_paddr_arm64: pgda=90d70000, pudv.pgd=ffffff80ffffffd0
vaddr_to_paddr_arm64: puda=90d70000, pudv.pgd=9fffff0003
vaddr_to_paddr_arm64: pmda=9fffff0000, pmdv.pud=9ffffe0003
vaddr_to_paddr_arm64: paddr=911a962c
va_bits : 48
page_offset : ffff800000000000
vaddr_to_paddr_arm64: pgda=90d70000, pudv.pgd=41a474
vaddr_to_paddr_arm64: puda=90d70000, pudv.pgd=9fffff0003
vaddr_to_paddr_arm64: pmda=9fffff0000, pmdv.pud=9ffffe0003
vaddr_to_paddr_arm64: paddr=911a1320
num of NODEs : 1
Memory type : SPARSEMEM
vaddr_to_paddr_arm64: pgda=90d70100, pudv.pgd=a657470206461
vaddr_to_paddr_arm64: puda=90d70100, pudv.pgd=9ffffd0003
vaddr_to_paddr_arm64: pmda=9ffffd27d8, pmdv.pud=9ffeee0003
vaddr_to_paddr_arm64: paddr=9ffeedc600
vaddr_to_paddr_arm64: pgda=90d70100, pudv.pgd=ffff97d98224
vaddr_to_paddr_arm64: puda=90d70100, pudv.pgd=9ffffd0003
vaddr_to_paddr_arm64: pmda=9ffffd27d8, pmdv.pud=9ffeee0003
vaddr_to_paddr_arm64: paddr=9ffeedc600
get_mm_sparsemem: Can't get the address of mem_section.
c. Root Cause Analysis -
- After the PA_BITS changes in arm64 kernel we set:
#define MAX_PHYSMEM_BITS CONFIG_ARM64_PA_BITS
- For SPARSEMEM, this value is used to calculate the bits space required
to store a section:
#define SECTIONS_SHIFT (MAX_PHYSMEM_BITS - SECTION_SIZE_BITS)
#define NR_MEM_SECTIONS (1UL << SECTIONS_SHIFT)
- User-space tools use a similar mechanism to determine the SPARSEMEM
type (extreme or not) using the 'NR_MEM_SECTIONS' value (an example from
makedumpfile code):
int
is_sparsemem_extreme(void)
{
if ((ARRAY_LENGTH(mem_section)
== divideup(NR_MEM_SECTIONS(), _SECTIONS_PER_ROOT_EXTREME()))
|| (ARRAY_LENGTH(mem_section) == NOT_FOUND_STRUCTURE))
return TRUE;
else
return FALSE;
}
- Since MAX_PHYSMEM_BITS are 48 bits for normal cases and are 52 bits
for extended PA address space, the memory type is incorrectly calculated
as SPARSEMEM rather than SPARSEMEM_EX in above case.
- Exporting correct 'MAX_PHYSMEM_BITS' via vmcoreinfo for 52-bit PA
case, fixes the above mentioned issue:
$ makedumpfile -f --mem-usage /proc/kcore -D
<..snip..>
NUMBER(MAX_PHYSMEM_BITS)=52
<..snip..>
max_mapnr : a00000
kimage_voffset : fffeffff80000000
max_physmem_bits : 30
section_size_bits: 1e
vaddr_to_paddr_arm64: pgda=90d70000, pudv.pgd=ffffff80ffffffd0
vaddr_to_paddr_arm64: puda=90d70000, pudv.pgd=9fffff0003
vaddr_to_paddr_arm64: pmda=9fffff0000, pmdv.pud=9ffffe0003
vaddr_to_paddr_arm64: paddr=911a962c
va_bits : 48
page_offset : ffff800000000000
vaddr_to_paddr_arm64: pgda=90d70000, pudv.pgd=41a474
vaddr_to_paddr_arm64: puda=90d70000, pudv.pgd=9fffff0003
vaddr_to_paddr_arm64: pmda=9fffff0000, pmdv.pud=9ffffe0003
vaddr_to_paddr_arm64: paddr=911a1320
num of NODEs : 1
Memory type : SPARSEMEM_EX
<..snip..>
TYPE PAGES EXCLUDABLE DESCRIPTION
----------------------------------------------------------------------
ZERO 2626 yes Pages filled with zero
NON_PRI_CACHE 569 yes Cache pages without private flag
PRI_CACHE 5446 yes Cache pages with private flag
USER 3213 yes User process pages
FREE 2048971 yes Free pages
KERN_DATA 19034 no Dumpable kernel data
page size: 65536
Total pages on system: 2079859
Total size on system: 136305639424 Byte
(2). Regression Case 2 (ARMv8.2-LPA + LVA [52-bit user-space VA] enabled
kernel):
- Upstream makedumpfile and crash-utility (with sha-id
e082c372c7f1a782b058ec359dfbbbee0f0b6aad) are broken on following kind
of platforms:
a. Upstream Kernel 5.0.0-rc6+ with the following kernel configuration:
CONFIG_ARM64_64K_PAGES=y
# CONFIG_ARM64_VA_BITS_42 is not set
# CONFIG_ARM64_VA_BITS_48 is not set
CONFIG_ARM64_USER_VA_BITS_52=y
CONFIG_ARM64_VA_BITS=48
# CONFIG_ARM64_PA_BITS_48 is not set
CONFIG_ARM64_PA_BITS_52=y
CONFIG_ARM64_PA_BITS=52
b. Both on CPUs which don't support ARMv8.2 extensions and on ARMv8 FVP
model with ARMv8.2 extensions.
- Error message from makedumpfile:
$ makedumpfile -f --mem-usage /proc/kcore -D
max_mapnr : a00000
kimage_voffset : fffeffff78000000
max_physmem_bits : 30
section_size_bits: 1e
vaddr_to_paddr_arm64: pgda=90f30000, pudv.pgd=ffffff80ffffffd0
vaddr_to_paddr_arm64: puda=90f30000, pudv.pgd=0
readpage_elf: Attempt to read non-existent page at 0x0.
readmem: type_addr: 1, addr:0, size:8
vaddr_to_paddr_arm64: Can't read pmd
readmem: Can't convert a virtual address(ffff0000093c576c) to physical
address.
readmem: type_addr: 0, addr:ffff0000093c576c, size:390
check_release: Can't get the address of system_utsname.
c. Root Cause Analysis -
- After the 52-bit user-space VA_BIT changes in arm64 kernel we set:
#define PTRS_PER_PGD (1 << (MAX_USER_VA_BITS - PGDIR_SHIFT))
- User-space tools like makedumpfile and crash use the 'PTRS_PER_PGD'
value to calculate the 'pgd_index()' of a vaddr:
#define pgd_index(vaddr) (((vaddr) >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1))
- Since the kernel now defines 'MAX_USER_VA_BITS' as:
#ifdef CONFIG_ARM64_USER_VA_BITS_52
#define MAX_USER_VA_BITS 52
#else
#define MAX_USER_VA_BITS VA_BITS
#endif
so, the user-space also needs this value to calculate the 'PTRS_PER_PGD'
and hence 'pgd_index()' correctly.
- Exporting correct 'MAX_USER_VA_BITS' via vmcoreinfo for the above
case, fixes the above mentioned issue:
$ makedumpfile -f --mem-usage /proc/kcore -D
<..snip..>
max_mapnr : a00000
pa_bits : 52
va_bits : 48 (vmcoreinfo)
max_user_va_bits : 52 (vmcoreinfo)
kimage_voffset : fffeffff78000000
max_physmem_bits : 52
section_size_bits: 30
vaddr_to_paddr_arm64: pgda=90f31e00, pudv.pgd=ffffff80ffffffd0
vaddr_to_paddr_arm64: puda=90f31e00, pudv.pgd=9fffff0803
vaddr_to_paddr_arm64: pmda=9fffff0000, pmdv.pud=9ffffe0803
vaddr_to_paddr_arm64: paddr=913c576c
page_offset : ffff800000000000
vaddr_to_paddr_arm64: pgda=90f31e00, pudv.pgd=16e28e8bed294900
vaddr_to_paddr_arm64: puda=90f31e00, pudv.pgd=9fffff0803
vaddr_to_paddr_arm64: pmda=9fffff0000, pmdv.pud=9ffffe0803
vaddr_to_paddr_arm64: paddr=913bd2f8
num of NODEs : 1
Memory type : SPARSEMEM_EX
<..snip..>
Other important notes
---------------------
1. I have quoted only one makedumpfile use-case failure above (i.e.
calculating --mem-usage on the primary kernel). Other use-cases like
creating a dumpfile using /proc/vmcore or post-processing a vmcore are
also broken similarly and get fixed when a kernel which exports
'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' to vmcoreinfo is used along
with a modified user-space which can read this information from the
vmcoreinfo.
2. I was also going through some of the suggestions on earlier threads
about the PTE calculations for the 52-bit LPA case and discussed them
with some partner arm64 SoC enggs.
The suggestions to convert a page table entry to a physical address
without awareness of 52-bit (with an assumption of 64k page size) can be
risky.
With 64k page and older non-52-bit kernels, while it looks like in the
current checks that bits [15:12] are zero, and we can move the zeros to
bits [51:48] (because the zeros don't affect the overall PA) to generate
the overall 52-bit PA. However, this can cause IMPLEMENTATION SPECIFIC
issues on different platforms while generating a PA and IPA.
Lets see what the ARMv8 architecture reference manual says about the
Bits [15:12] for a 64KB page size:
"Bits [15:12] of each valid translation table descriptor hold Bits
[51:48] of the output address, or of the address
of the translation table to be used for the initial lookup at the next
level of translation. If the implementation
does not support 52-bit physical addresses, then it is IMPLEMENTATION
DEFINED whether non-zero values for
these bits generate an Address size fault. In this case, not generating
an Address Size Fault is deprecated."
As per the vendors, we should not assume that hardware (which does not
support 52-bit physical addresses) would generate an Address size fault
for non-zero values of Bits[15:12], so extending them to bits [51:48]
always can lead to PA address which might cause UNDEFINED behavior on
some SoCs.
Hope the above text clarifies the problem and what I am trying to fix
via this patch. Please let me know if something is missing here.
Thanks,
Bhupesh
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH] arm64, vmcoreinfo : Append 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' to vmcoreinfo
@ 2019-02-14 19:30 ` Bhupesh Sharma
0 siblings, 0 replies; 58+ messages in thread
From: Bhupesh Sharma @ 2019-02-14 19:30 UTC (permalink / raw)
To: James Morse
Cc: Mark Rutland, Kazuhito Hagio, lijiang, bhe, ard.biesheuvel,
catalin.marinas, kexec, Will Deacon, AKASHI Takahiro, anderson,
Borislav Petkov, Dave Young, linux-arm-kernel
Hi James,
On 02/13/2019 11:52 PM, James Morse wrote:
> Hi guys,
>
> On 13/02/2019 11:15, Dave Young wrote:
>> On 02/12/19 at 11:03pm, Kazuhito Hagio wrote:
>>> On 2/12/2019 2:59 PM, Bhupesh Sharma wrote:
>>>> BTW, in the makedumpfile enablement patch thread for ARMv8.2 LVA
>>>> (which I sent out for 52-bit User space VA enablement) (see [0]), Kazu
>>>> mentioned that the changes look necessary.
>>>>
>>>> [0]. http://lists.infradead.org/pipermail/kexec/2019-February/022431.html
>>>
>>>>>> The increased 'PTRS_PER_PGD' value for such cases needs to be then
>>>>>> calculated as is done by the underlying kernel
>
> Aha! Nothing to do with which-bits-are-pfn in the tables...
>
> You need to know if the top level PGD is 512bytes or bigger. As we use a
> kmem-cache the adjacent data could be some else's page tables.
>
> Is this really a problem though? You can't pull the user-space pgd pointers out
> of no-where, you must have walked some task_struct and struct_mm's to find them.
> In which case you would have the VMAs on hand to tell you if its in the mapped
> user range.
>
> It would be good to avoid putting something arch-specific in here if we can at
> all help it.
>
>
>>>>>> (see
>>>>>> 'arch/arm64/include/asm/pgtable-hwdef.h' for details):
>>>>>>
>>>>>> #define PTRS_PER_PGD (1 << (MAX_USER_VA_BITS - PGDIR_SHIFT))
>>>
>>> Yes, this is the reason why makedumpfile needs the MAX_USER_VA_BITS.
>>> It is used for pgd_index() also in makedumpfile to walk page tables.
>>>
>>> /* to find an entry in a page-table-directory */
>>> #define pgd_index(addr) (((addr) >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1))
>>
>> Since Dave mentioned crash tool does not need it, but crash should also
>> travel the pg tables.
>>
>> If this is really necessary it would be good to describe what will
>> happen without the patch, eg. some user visible error from an actual test etc.
>
> Yes please, it would really help if there was a specific example we could discuss.
Sure. Here are two use-cases/regressions reported and which I have been
able to reproduce. Note that I tested them both on a CPU which does not
support ARMv8.2-LPA/LVA and on ARMv8 FVP model (which supports ARMv8.2
extensions).
Environment:
------------
Latest Upstream kernel: sha-id: 1f947a7a011fcceb14cb912f5481a53b18f1879a
("Merge branch 'akpm' (patches from Andrew)")
Latest makedumpfile code: (git://git.code.sf.net/p/makedumpfile/code ,
branch: devel)
crash-utility code: (https://github.com/crash-utility/crash.git, sha-id:
e082c372c7f1a782b058ec359dfbbbee0f0b6aad)
Note that Dave A. has since fixed crash-utility by using a hardcoded
value of 'MAX_PHYSMEM_BITS' (via sha id:
ac5a7889d31bb37aa0687110ecea08837f8a66a8) and determining 'vabits_user'
value via vmlinux (via sha id: 8618ddd817621c40c1f44f0ab6df7c7805234416)
(1). Regression Case 1 (ARMv8.2-LPA enabled kernel):
- Upstream makedumpfile and crash-utility (with sha-id
e082c372c7f1a782b058ec359dfbbbee0f0b6aad) are broken on following kind
of platforms:
a. Upstream Kernel 5.0.0-rc6+ with the following kernel configuration:
CONFIG_ARM64_64K_PAGES=y
# CONFIG_ARM64_VA_BITS_42 is not set
CONFIG_ARM64_VA_BITS_48=y
# CONFIG_ARM64_USER_VA_BITS_52 is not set
CONFIG_ARM64_VA_BITS=48
# CONFIG_ARM64_PA_BITS_48 is not set
CONFIG_ARM64_PA_BITS_52=y
CONFIG_ARM64_PA_BITS=52
b. Both on CPUs which don't support ARMv8.2 LPA extension and on ARMv8
FVP model with ARMv8.2 LPA extensions.
- Error message from makedumpfile:
$ makedumpfile -f --mem-usage /proc/kcore -D
max_mapnr : a00000
kimage_voffset : fffeffff80000000
max_physmem_bits : 30
section_size_bits: 1e
vaddr_to_paddr_arm64: pgda=90d70000, pudv.pgd=ffffff80ffffffd0
vaddr_to_paddr_arm64: puda=90d70000, pudv.pgd=9fffff0003
vaddr_to_paddr_arm64: pmda=9fffff0000, pmdv.pud=9ffffe0003
vaddr_to_paddr_arm64: paddr=911a962c
va_bits : 48
page_offset : ffff800000000000
vaddr_to_paddr_arm64: pgda=90d70000, pudv.pgd=41a474
vaddr_to_paddr_arm64: puda=90d70000, pudv.pgd=9fffff0003
vaddr_to_paddr_arm64: pmda=9fffff0000, pmdv.pud=9ffffe0003
vaddr_to_paddr_arm64: paddr=911a1320
num of NODEs : 1
Memory type : SPARSEMEM
vaddr_to_paddr_arm64: pgda=90d70100, pudv.pgd=a657470206461
vaddr_to_paddr_arm64: puda=90d70100, pudv.pgd=9ffffd0003
vaddr_to_paddr_arm64: pmda=9ffffd27d8, pmdv.pud=9ffeee0003
vaddr_to_paddr_arm64: paddr=9ffeedc600
vaddr_to_paddr_arm64: pgda=90d70100, pudv.pgd=ffff97d98224
vaddr_to_paddr_arm64: puda=90d70100, pudv.pgd=9ffffd0003
vaddr_to_paddr_arm64: pmda=9ffffd27d8, pmdv.pud=9ffeee0003
vaddr_to_paddr_arm64: paddr=9ffeedc600
get_mm_sparsemem: Can't get the address of mem_section.
c. Root Cause Analysis -
- After the PA_BITS changes in arm64 kernel we set:
#define MAX_PHYSMEM_BITS CONFIG_ARM64_PA_BITS
- For SPARSEMEM, this value is used to calculate the bits space required
to store a section:
#define SECTIONS_SHIFT (MAX_PHYSMEM_BITS - SECTION_SIZE_BITS)
#define NR_MEM_SECTIONS (1UL << SECTIONS_SHIFT)
- User-space tools use a similar mechanism to determine the SPARSEMEM
type (extreme or not) using the 'NR_MEM_SECTIONS' value (an example from
makedumpfile code):
int
is_sparsemem_extreme(void)
{
if ((ARRAY_LENGTH(mem_section)
== divideup(NR_MEM_SECTIONS(), _SECTIONS_PER_ROOT_EXTREME()))
|| (ARRAY_LENGTH(mem_section) == NOT_FOUND_STRUCTURE))
return TRUE;
else
return FALSE;
}
- Since MAX_PHYSMEM_BITS are 48 bits for normal cases and are 52 bits
for extended PA address space, the memory type is incorrectly calculated
as SPARSEMEM rather than SPARSEMEM_EX in above case.
- Exporting correct 'MAX_PHYSMEM_BITS' via vmcoreinfo for 52-bit PA
case, fixes the above mentioned issue:
$ makedumpfile -f --mem-usage /proc/kcore -D
<..snip..>
NUMBER(MAX_PHYSMEM_BITS)=52
<..snip..>
max_mapnr : a00000
kimage_voffset : fffeffff80000000
max_physmem_bits : 30
section_size_bits: 1e
vaddr_to_paddr_arm64: pgda=90d70000, pudv.pgd=ffffff80ffffffd0
vaddr_to_paddr_arm64: puda=90d70000, pudv.pgd=9fffff0003
vaddr_to_paddr_arm64: pmda=9fffff0000, pmdv.pud=9ffffe0003
vaddr_to_paddr_arm64: paddr=911a962c
va_bits : 48
page_offset : ffff800000000000
vaddr_to_paddr_arm64: pgda=90d70000, pudv.pgd=41a474
vaddr_to_paddr_arm64: puda=90d70000, pudv.pgd=9fffff0003
vaddr_to_paddr_arm64: pmda=9fffff0000, pmdv.pud=9ffffe0003
vaddr_to_paddr_arm64: paddr=911a1320
num of NODEs : 1
Memory type : SPARSEMEM_EX
<..snip..>
TYPE PAGES EXCLUDABLE DESCRIPTION
----------------------------------------------------------------------
ZERO 2626 yes Pages filled with zero
NON_PRI_CACHE 569 yes Cache pages without private flag
PRI_CACHE 5446 yes Cache pages with private flag
USER 3213 yes User process pages
FREE 2048971 yes Free pages
KERN_DATA 19034 no Dumpable kernel data
page size: 65536
Total pages on system: 2079859
Total size on system: 136305639424 Byte
(2). Regression Case 2 (ARMv8.2-LPA + LVA [52-bit user-space VA] enabled
kernel):
- Upstream makedumpfile and crash-utility (with sha-id
e082c372c7f1a782b058ec359dfbbbee0f0b6aad) are broken on following kind
of platforms:
a. Upstream Kernel 5.0.0-rc6+ with the following kernel configuration:
CONFIG_ARM64_64K_PAGES=y
# CONFIG_ARM64_VA_BITS_42 is not set
# CONFIG_ARM64_VA_BITS_48 is not set
CONFIG_ARM64_USER_VA_BITS_52=y
CONFIG_ARM64_VA_BITS=48
# CONFIG_ARM64_PA_BITS_48 is not set
CONFIG_ARM64_PA_BITS_52=y
CONFIG_ARM64_PA_BITS=52
b. Both on CPUs which don't support ARMv8.2 extensions and on ARMv8 FVP
model with ARMv8.2 extensions.
- Error message from makedumpfile:
$ makedumpfile -f --mem-usage /proc/kcore -D
max_mapnr : a00000
kimage_voffset : fffeffff78000000
max_physmem_bits : 30
section_size_bits: 1e
vaddr_to_paddr_arm64: pgda=90f30000, pudv.pgd=ffffff80ffffffd0
vaddr_to_paddr_arm64: puda=90f30000, pudv.pgd=0
readpage_elf: Attempt to read non-existent page at 0x0.
readmem: type_addr: 1, addr:0, size:8
vaddr_to_paddr_arm64: Can't read pmd
readmem: Can't convert a virtual address(ffff0000093c576c) to physical
address.
readmem: type_addr: 0, addr:ffff0000093c576c, size:390
check_release: Can't get the address of system_utsname.
c. Root Cause Analysis -
- After the 52-bit user-space VA_BIT changes in arm64 kernel we set:
#define PTRS_PER_PGD (1 << (MAX_USER_VA_BITS - PGDIR_SHIFT))
- User-space tools like makedumpfile and crash use the 'PTRS_PER_PGD'
value to calculate the 'pgd_index()' of a vaddr:
#define pgd_index(vaddr) (((vaddr) >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1))
- Since the kernel now defines 'MAX_USER_VA_BITS' as:
#ifdef CONFIG_ARM64_USER_VA_BITS_52
#define MAX_USER_VA_BITS 52
#else
#define MAX_USER_VA_BITS VA_BITS
#endif
so, the user-space also needs this value to calculate the 'PTRS_PER_PGD'
and hence 'pgd_index()' correctly.
- Exporting correct 'MAX_USER_VA_BITS' via vmcoreinfo for the above
case, fixes the above mentioned issue:
$ makedumpfile -f --mem-usage /proc/kcore -D
<..snip..>
max_mapnr : a00000
pa_bits : 52
va_bits : 48 (vmcoreinfo)
max_user_va_bits : 52 (vmcoreinfo)
kimage_voffset : fffeffff78000000
max_physmem_bits : 52
section_size_bits: 30
vaddr_to_paddr_arm64: pgda=90f31e00, pudv.pgd=ffffff80ffffffd0
vaddr_to_paddr_arm64: puda=90f31e00, pudv.pgd=9fffff0803
vaddr_to_paddr_arm64: pmda=9fffff0000, pmdv.pud=9ffffe0803
vaddr_to_paddr_arm64: paddr=913c576c
page_offset : ffff800000000000
vaddr_to_paddr_arm64: pgda=90f31e00, pudv.pgd=16e28e8bed294900
vaddr_to_paddr_arm64: puda=90f31e00, pudv.pgd=9fffff0803
vaddr_to_paddr_arm64: pmda=9fffff0000, pmdv.pud=9ffffe0803
vaddr_to_paddr_arm64: paddr=913bd2f8
num of NODEs : 1
Memory type : SPARSEMEM_EX
<..snip..>
Other important notes
---------------------
1. I have quoted only one makedumpfile use-case failure above (i.e.
calculating --mem-usage on the primary kernel). Other use-cases like
creating a dumpfile using /proc/vmcore or post-processing a vmcore are
also broken similarly and get fixed when a kernel which exports
'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' to vmcoreinfo is used along
with a modified user-space which can read this information from the
vmcoreinfo.
2. I was also going through some of the suggestions on earlier threads
about the PTE calculations for the 52-bit LPA case and discussed them
with some partner arm64 SoC enggs.
The suggestions to convert a page table entry to a physical address
without awareness of 52-bit (with an assumption of 64k page size) can be
risky.
With 64k page and older non-52-bit kernels, while it looks like in the
current checks that bits [15:12] are zero, and we can move the zeros to
bits [51:48] (because the zeros don't affect the overall PA) to generate
the overall 52-bit PA. However, this can cause IMPLEMENTATION SPECIFIC
issues on different platforms while generating a PA and IPA.
Lets see what the ARMv8 architecture reference manual says about the
Bits [15:12] for a 64KB page size:
"Bits [15:12] of each valid translation table descriptor hold Bits
[51:48] of the output address, or of the address
of the translation table to be used for the initial lookup at the next
level of translation. If the implementation
does not support 52-bit physical addresses, then it is IMPLEMENTATION
DEFINED whether non-zero values for
these bits generate an Address size fault. In this case, not generating
an Address Size Fault is deprecated."
As per the vendors, we should not assume that hardware (which does not
support 52-bit physical addresses) would generate an Address size fault
for non-zero values of Bits[15:12], so extending them to bits [51:48]
always can lead to PA address which might cause UNDEFINED behavior on
some SoCs.
Hope the above text clarifies the problem and what I am trying to fix
via this patch. Please let me know if something is missing here.
Thanks,
Bhupesh
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
^ permalink raw reply [flat|nested] 58+ messages in thread