Linux-ARM-Kernel Archive on
 help / color / Atom feed
From: James Morse <>
To: Kazuhito Hagio <>,
	Bhupesh Sharma <>
Cc: Mark Rutland <>,
	"" <>,
	"" <>,
	"" <>,
	"" <>,
	Steve Capper <>,
	"" <>,
	Will Deacon <>,
	AKASHI Takahiro <>,
	"" <>,
	Kristina Martsenko <>,
	Borislav Petkov <>, Dave Young <>,
Subject: Re: [PATCH] arm64, vmcoreinfo : Append 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' to vmcoreinfo
Date: Fri, 15 Feb 2019 17:34:12 +0000
Message-ID: <> (raw)
In-Reply-To: <>

Hi guys,

(CC: +Steve, +Kristina) "What's the best way of letting user-space know the MMU
config when 52-bit VA and pointer-auth may be in use?"

On 13/02/2019 19:52, Kazuhito Hagio wrote:
> On 2/13/2019 1:22 PM, James Morse wrote:
>> On 13/02/2019 11:15, Dave Young wrote:
>>> On 02/12/19 at 11:03pm, Kazuhito Hagio wrote:
>>>> On 2/12/2019 2:59 PM, Bhupesh Sharma wrote:
>>>>> BTW, in the makedumpfile enablement patch thread for ARMv8.2 LVA
>>>>> (which I sent out for 52-bit User space VA enablement) (see [0]), Kazu
>>>>> mentioned that the changes look necessary.
>>>>> [0].
>>>>>>> The increased 'PTRS_PER_PGD' value for such cases needs to be then
>>>>>>> calculated as is done by the underlying kernel
>> Aha! Nothing to do with which-bits-are-pfn in the tables...
>> You need to know if the top level PGD is 512bytes or bigger. As we use a
>> kmem-cache the adjacent data could be some else's page tables.
>> Is this really a problem though? You can't pull the user-space pgd pointers out
>> of no-where, you must have walked some task_struct and struct_mm's to find them.
>> In which case you would have the VMAs on hand to tell you if its in the mapped
>> user range.
>> It would be good to avoid putting something arch-specific in here if we can at
>> all help it.

>>>>>>> (see
>>>>>>> 'arch/arm64/include/asm/pgtable-hwdef.h' for details):
>>>>>>> #define PTRS_PER_PGD          (1 << (MAX_USER_VA_BITS - PGDIR_SHIFT))
>>>> Yes, this is the reason why makedumpfile needs the MAX_USER_VA_BITS.
>>>> It is used for pgd_index() also in makedumpfile to walk page tables.
>>>> /* to find an entry in a page-table-directory */
>>>> #define pgd_index(addr)         (((addr) >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1))
>>> Since Dave mentioned crash tool does not need it, but crash should also
>>> travel the pg tables.
> The crash utility is always invoked with vmlinux, so it can read the
> vabits_user variable directly from vmcore, but makedumpfile can not.

(This sounds fragile. That symbol's name may change, it may disappear
completely! ... but I guess crash changes with every kernel release anyway)

>>> If this is really necessary it would be good to describe what will
>>> happen without the patch, eg. some user visible error from an actual test etc.
>> Yes please, it would really help if there was a specific example we could discuss.
> With 52-bit user space and 48-bit kernel space configuration,
> makedumpfile will not be able to convert a virtual kernel address
> to a physical address, and fail to capture a dumpfile, because the
> pgd_index() will return a wrong index.

Got it, thanks!
(all this user stuff had me thinking it was user-space you were trying to walk).

Yes, this is because of commit e842dfb5a2d3 ("arm64: mm: Offset TTBR1 to allow
52-bit PTRS_PER_PGD"). The kernel has offset the ttbr1 value, if you try and
walk it without knowing the offset you get junk.

Ideally we tell you the offset with some 'ttbr1_offset=' in vmcoreinfo, but if
the offsetting code disappears, the kernel would still have to provide
'ttbr1_offset=0' for user-space to keep working.

I'd like to find something future-proof that always has an unambiguous meaning,
and isn't a problem if the kernel variable/symbol/kconfig names change.

With pointer-auth in use too you can't guess which bits are address and which
bits are data.

Taking arch-specific to its extreme, we could expose TCR_EL1, but this is a
problem if we ever switch that per task (some new bits may turn up with a new
feature). Some of those bits vary per cpu too, so we'd have to mask them out in
case user-space tries to conclude something from them.

My current best suggestion is to export:
from core code:
* USER_MMAP_END, the maximum value a user-space can try and mmap().
This would normally be TASK_SIZE, but x86 and powerpc also have support for
larger VA space, and its plumbed into mm slightly differently. We should have
one arch-independent property that covers all these. On arm64 this would be the
runtime va bits for user-space's TTBR. (This assumes the value isn't per-task)

arch specific:
* ARM64_TCR.T1SZ, the va bits mapped by the kernel's TTBR. (We can assume we'll
never flip user/kernel space). This has to be arch specific, it will always have
a value and its meaning comes from the ARM-ARM (so linux can't change it in the
future). It should be the same on every CPU.
* ARM64_TTBR1.BADDR, the pa of the kernel page tables, which implicitly has the
offset. Again this always has a value, and its meaning comes from the ARM-ARM.
If we ever get clever with different page-tables/TCR values on different CPUs,
these two should come from the same CPU.

I think this gives you what you need if user/kernel may both be using
pointer-auth and both may be using 52-bit va. I'm pretty sure the 48:52 bits can
be picked at boot time depending on the kernel kconfig and the hardware support.

Does anyone have a better idea? (or a corner where this won't work?)



linux-arm-kernel mailing list

  reply index

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-30 12:23 Bhupesh Sharma
2019-01-30 15:21 ` James Morse
2019-01-30 21:39   ` Bhupesh Sharma
2019-02-04 14:35     ` Bhupesh Sharma
2019-02-04 15:31       ` Robin Murphy
2019-02-12  4:55         ` Bhupesh Sharma
2019-02-12 10:49           ` Robin Murphy
2019-02-04 16:56       ` James Morse
2019-01-31  1:48 ` Dave Young
2019-01-31 10:00   ` Bhupesh Sharma
2019-01-31 14:03   ` Dave Anderson
2019-02-04 16:04   ` Kazuhito Hagio
2019-02-12  5:07     ` Bhupesh Sharma
2019-02-12 10:44       ` Dave Young
2019-02-12 19:59         ` Bhupesh Sharma
2019-02-12 23:03           ` Kazuhito Hagio
2019-02-13 11:15             ` Dave Young
2019-02-13 18:22               ` James Morse
2019-02-13 19:52                 ` Kazuhito Hagio
2019-02-15 17:34                   ` James Morse [this message]
2019-02-15 18:01                     ` Bhupesh Sharma
2019-02-18 15:27                       ` Steve Capper
2019-02-21 16:08                         ` Bhupesh Sharma
2019-02-19 20:47                       ` Kazuhito Hagio
2019-02-21 16:20                         ` Bhupesh Sharma
2019-02-21 16:42                           ` Dave Anderson
2019-02-21 19:02                             ` Kazuhito Hagio
2019-03-01  4:01                               ` Bhupesh Sharma
2019-02-14 19:30                 ` Bhupesh Sharma

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-ARM-Kernel Archive on

Archives are clonable:
	git clone --mirror linux-arm-kernel/git/0.git
	git clone --mirror linux-arm-kernel/git/1.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-arm-kernel linux-arm-kernel/ \
	public-inbox-index linux-arm-kernel

Newsgroup available over NNTP:

AGPL code for this site: git clone public-inbox