Re: [PATCH 1/2] arm64: Expose address bits (physical/virtual) via cpuinfo

From: James Morse <james.morse@arm.com>
To: Bhupesh Sharma <bhsharma@redhat.com>
Cc: mark.rutland@arm.com, Steve.Capper@arm.com,
	catalin.marinas@arm.com, ard.biesheuvel@linaro.org,
	will.deacon@arm.com, bhupesh.linux@gmail.com,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	kexec@lists.infradead.org, linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH 1/2] arm64: Expose address bits (physical/virtual) via cpuinfo
Date: Mon, 4 Feb 2019 16:54:51 +0000
Message-ID: <be9ad470-6f24-36cc-aeb8-a7cea3ec984a@arm.com>
In-Reply-To: <8659c949-a588-387b-3cb6-1dd8ea86723e@redhat.com>

Hi Bhupesh,

On 30/01/2019 19:48, Bhupesh Sharma wrote:
> On 01/29/2019 03:39 PM, Suzuki K Poulose wrote:
>> On 28/01/2019 20:57, Bhupesh Sharma wrote:
>>> With ARMv8.2-LVA and LPA architecture extensions, arm64 hardware which
>>> supports these extensions can support up to 52-bit virtual and 52-bit
>>> physical addresses respectively.
>>>
>>> Since at the moment we enable the support of these extensions via CONFIG
>>> flags, e.g.
>>>   - LPA via CONFIG_ARM64_PA_BITS_52, and
>>>   - LVA via CONFIG_ARM64_FORCE_52BIT
>>>
>>> The easiest way a user can determine the physical/virtual
>>> addresses supported on the hardware, is via the '/proc/cpuinfo'
>>> interface.
>>
>> Why do we need this information ?
> 
> Sorry for the delay in reply, but I wanted to collect as much information from
> our test teams as possible before replying to this thread.
> 
> So here is a brief list of reasons why we need this information in user-space:
> 
> 1. This information is useful for a non-expert user, using Linux distributions
> (like Fedora) on different arm64 platforms. The default configuration (.config)
> will be the same for a distribution flavor and is supposed to work fine on all
> underlying arm64 platforms.
> 
> a). Now some of these underlying platforms may support the ARMv8.2 extensions
> while others don't.
> 
> b). Users performing performance bench-marking on these platforms run benchmarks
> with different page-sizes and address ranges.

> c). Right now they have no way to know the underlying VARange and PARange
> values other than reading the config file and searching for the flags.

Why do they need to know? What decision can you make with this information that
you can't make without it?

> For example, let's consider the 'pg-table_tests' (See -
> <https://github.com/sanskriti-s/pg-table_tests>), which is used to test and
> verify 5-level page table behavior on x86_64 Linux. It requires determining if
> 5-level page tables are fully supported,

... but we don't have 5-level page tables ...

> for which it uses either the Intel 'la57'
> cpu flag in:
> 
> $ cat /proc/cpuinfo, or
> 
> $ grep CONFIG_X86_5LEVEL /boot/config-$(uname -r)
> CONFIG_X86_5LEVEL=y
> 
> This test suite is easily modifiable for verifying 52-bit ARMv8.2-LVA support.

This looks like a test to check all kernel page-table walkers have been updated
for a fifth level. We don't need to worry about this.

You should just need to remove the arch-specific test. If you provide the hint
on platforms that support it, the mapping should succeed. On platforms that
don't, it won't.

Why does user space need to know in advance of making the hint?

> d). Now when running the above suite and sharing results, it might be that the
> .config file is not available or even in the case it is available the CONFIG
> flag settings in .config file are not intuitive to a non-expert user for arm64
> (the example below is of 64K page size, 48-bit kernel VA, 52-bit User space VA
> and 52-bit PA):

I agree inspecting the Kconfig is an inappropriate way for user-space 'to know'
what the kernel supports.

I can only see a 'supports 52bit va' flag as being useful to a program that
doesn't actually want to use it, but for some bizarre reason wants to know.

For coredumps the question isn't "was it supported", but "was it in use", which
you can tell from the pagetables.

> Also right now there is an absence of a standard ABI between the user-space and
> kernel for exporting this information to the user-space, with two exceptions:
> 
> 1. For vmcoreinfo specific user-space utilities (like makedumpfile and crash) I
> have proposed a couple of CONFIG flags to be added to the vmcoreinfo, so that
> user-space utilities can use the same (See
> <http://lists.infradead.org/pipermail/kexec/2019-January/022387.html> for details).

vmcoreinfo is for things like crash/gdb/makedumpfile to provide kernel-specific
information that they couldn't possibly work without. Like the page size. 52bit
support doesn't fit here as a 52bit-aware walker works regardless of whether
52bit was in use.

> 2. For other user-space utilities (especially those which make a 'mmap' call and
> pass an address hint to get the kernel to provide a high address),

> I can see only two methods to determine the underlying kernel support:
> 
> a). Read the CONFIG flags from .config (as I captured some paragraphs above), or
> 
> b). In absence of .config file on the system, read the system ID registers like
> 'ID_AA64MMFR0_EL1' and 'ID_AA64MMFR2_EL1' (which PATCH 2/2 of this series tries
> to enable from kernel side) and then make a decision on whether to pass a hint
> to 'mmap'.

It seems you're expecting to know whether 52bit-VA is supported without actually
using it. What is this useful for?

The point of the hint is you want to allocate memory, and can work with 52bit-VA
if the platform supports it. If it doesn't, you still want to allocate the
memory. We shouldn't need a hint that the 52bit-va hint is supported.
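
A rough sketch of what I mean, for illustration only. The helper names are
made up, and the hint value (1 << 48, i.e. the first address above the 48-bit
range) is an assumption about what a caller might pass. Without MAP_FIXED the
hint is only a request, so the mapping should come back either way and the
caller can look at the returned address if it really cares:

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: request anonymous memory, hinting that an address
 * above the 48-bit range is acceptable. Returns NULL only if mmap() fails.
 */
static void *map_with_high_hint(size_t len)
{
	/* Assumed hint value: the first address above the 48-bit VA range. */
	void *hint = (void *)(uintptr_t)(UINT64_C(1) << 48);
	void *p = mmap(hint, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical helper: nonzero if the address has any bit at or above bit 48 set. */
static int uses_high_va(const void *p)
{
	return ((uint64_t)(uintptr_t)p >> 48) != 0;
}
```

The caller uses the memory whether or not uses_high_va() is true, which is
why nothing needs to be queried before making the call.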

> It might be that I am missing other standard ABI mechanisms. If so, please point
> me to the same.

We also have HWCAP: Documentation/arm64/elf_hwcaps.txt

These are used for things the program may need to run: like floating point, or
the presence of particular instructions. User-space absolutely has to know about
these in advance, as it will get a SIGILL if support is not present.
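
For comparison, a minimal sketch of how a program would typically check a
HWCAP before relying on a feature. The helper names are made up for
illustration; on arm64 the mask would be one of the HWCAP_* bits from
<asm/hwcap.h> (HWCAP_FP, for instance), which is left as a parameter here so
the sketch is not tied to one architecture:

```c
#include <sys/auxv.h>

/* Hypothetical helper: nonzero if every bit in 'mask' is set in 'hwcaps'. */
static int hwcap_has(unsigned long hwcaps, unsigned long mask)
{
	return (hwcaps & mask) == mask;
}

/*
 * Hypothetical helper: test the running CPU's AT_HWCAP bits, as reported
 * by the kernel through the auxiliary vector, against 'mask'.
 */
static int cpu_has_feature(unsigned long mask)
{
	return hwcap_has(getauxval(AT_HWCAP), mask);
}
```

Here the check has to happen before the feature is used, because the failure
mode is a signal rather than a return value the program can handle.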

52bit VA doesn't fit here: memory is memory. Needing to know implies user-space
is unwilling to use memory if the bits above 48bits aren't set.

Thanks,

James
