From: Steve Capper <Steve.Capper@arm.com>
To: Bhupesh Sharma <bhsharma@redhat.com>
Cc: Mark Rutland <Mark.Rutland@arm.com>,
Kazuhito Hagio <k-hagio@ab.jp.nec.com>,
"lijiang@redhat.com" <lijiang@redhat.com>,
"bhe@redhat.com" <bhe@redhat.com>,
"ard.biesheuvel@linaro.org" <ard.biesheuvel@linaro.org>,
Catalin Marinas <Catalin.Marinas@arm.com>,
"kexec@lists.infradead.org" <kexec@lists.infradead.org>,
Will Deacon <Will.Deacon@arm.com>,
AKASHI Takahiro <takahiro.akashi@linaro.org>,
James Morse <James.Morse@arm.com>,
Kristina Martsenko <Kristina.Martsenko@arm.com>,
Borislav Petkov <bp@alien8.de>,
"anderson@redhat.com" <anderson@redhat.com>, nd <nd@arm.com>,
Dave Young <dyoung@redhat.com>,
"linux-arm-kernel@lists.infradead.org"
<linux-arm-kernel@lists.infradead.org>
Subject: Re: [PATCH] arm64, vmcoreinfo : Append 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' to vmcoreinfo
Date: Mon, 18 Feb 2019 15:27:03 +0000
Message-ID: <20190218152651.GA14091@capper-debian.cambridge.arm.com>
In-Reply-To: <CACi5LpN6gVy4gCTYLGRiWW1VNGrVc-+Ykn919LKsO2bP-v1NHw@mail.gmail.com>
Hi Bhupesh,
Sorry for joining this thread late...
On Fri, Feb 15, 2019 at 11:31:56PM +0530, Bhupesh Sharma wrote:
> Hi James,
>
> On Fri, Feb 15, 2019 at 11:04 PM James Morse <james.morse@arm.com> wrote:
> >
> > Hi guys,
> >
> > (CC: +Steve, +Kristina) "What's the best way of letting user-space know the MMU
> > config when 52-bit VA and pointer-auth may be in use?"
> >
> > On 13/02/2019 19:52, Kazuhito Hagio wrote:
> > > On 2/13/2019 1:22 PM, James Morse wrote:
> > >> On 13/02/2019 11:15, Dave Young wrote:
> > >>> On 02/12/19 at 11:03pm, Kazuhito Hagio wrote:
> > >>>> On 2/12/2019 2:59 PM, Bhupesh Sharma wrote:
> > >>>>> BTW, in the makedumpfile enablement patch thread for ARMv8.2 LVA
> > >>>>> (which I sent out for 52-bit User space VA enablement) (see [0]), Kazu
> > >>>>> mentioned that the changes look necessary.
> > >>>>>
> > >>>>> [0]. http://lists.infradead.org/pipermail/kexec/2019-February/022431.html
> > >>>>
> > >>>>>>> The increased 'PTRS_PER_PGD' value for such cases needs to be then
> > >>>>>>> calculated as is done by the underlying kernel
> > >>
> > >> Aha! Nothing to do with which-bits-are-pfn in the tables...
> > >>
> > >> You need to know if the top level PGD is 512 bytes or bigger. As we use a
> > >> kmem-cache the adjacent data could be someone else's page tables.
> > >>
> > >> Is this really a problem though? You can't pull the user-space pgd pointers out
> > >> of nowhere, you must have walked some task_struct and mm_structs to find them.
> > >> In which case you would have the VMAs on hand to tell you if it's in the mapped
> > >> user range.
> > >>
> > >> It would be good to avoid putting something arch-specific in here if we can at
> > >> all help it.
> >
> > >>>>>>> (see
> > >>>>>>> 'arch/arm64/include/asm/pgtable-hwdef.h' for details):
> > >>>>>>>
> > >>>>>>> #define PTRS_PER_PGD (1 << (MAX_USER_VA_BITS - PGDIR_SHIFT))
> > >>>>
> > >>>> Yes, this is the reason why makedumpfile needs the MAX_USER_VA_BITS.
> > >>>> It is used for pgd_index() also in makedumpfile to walk page tables.
> > >>>>
> > >>>> /* to find an entry in a page-table-directory */
> > >>>> #define pgd_index(addr) (((addr) >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1))
> > >>>
> > >>> Dave mentioned that the crash tool does not need it, but crash should also
> > >>> walk the page tables.
> > >
> > > The crash utility is always invoked with vmlinux, so it can read the
> > > vabits_user variable directly from vmcore, but makedumpfile can not.
> >
> > (This sounds fragile. That symbol's name may change, it may disappear
> > completely! ... but I guess crash changes with every kernel release anyway)
> >
> >
> > >>> If this is really necessary it would be good to describe what will
> > >>> happen without the patch, eg. some user visible error from an actual test etc.
> > >>
> > >> Yes please, it would really help if there was a specific example we could discuss.
> > >
> > > With 52-bit user space and 48-bit kernel space configuration,
> > > makedumpfile will not be able to convert a virtual kernel address
> > > to a physical address, and fail to capture a dumpfile, because the
> > > pgd_index() will return a wrong index.
> >
> > Got it, thanks!
> > (all this user stuff had me thinking it was user-space you were trying to walk).
> >
> > Yes, this is because of commit e842dfb5a2d3 ("arm64: mm: Offset TTBR1 to allow
> > 52-bit PTRS_PER_PGD"). The kernel has offset the ttbr1 value, if you try and
> > walk it without knowing the offset you get junk.
> >
> > Ideally we tell you the offset with some 'ttbr1_offset=' in vmcoreinfo, but if
> > the offsetting code disappears, the kernel would still have to provide
> > 'ttbr1_offset=0' for user-space to keep working.
> >
> > I'd like to find something future-proof that always has an unambiguous meaning,
> > and isn't a problem if the kernel variable/symbol/kconfig names change.
> >
> > With pointer-auth in use too you can't guess which bits are address and which
> > bits are data.
> >
> > Taking arch-specific to its extreme, we could expose TCR_EL1, but this is a
> > problem if we ever switch that per task (some new bits may turn up with a new
> > feature). Some of those bits vary per cpu too, so we'd have to mask them out in
> > case user-space tries to conclude something from them.
> >
> >
> > My current best suggestion is to export:
> > from core code:
> > * USER_MMAP_END, the maximum value a user-space can try and mmap().
> > This would normally be TASK_SIZE, but x86 and powerpc also have support for
> > larger VA space, and it's plumbed into mm slightly differently. We should have
> > one arch-independent property that covers all these. On arm64 this would be the
> > runtime va bits for user-space's TTBR. (This assumes the value isn't per-task)
> >
> > arch specific:
> > * ARM64_TCR.T1SZ, the va bits mapped by the kernel's TTBR. (We can assume we'll
> > never flip user/kernel space). This has to be arch specific, it will always have
> > a value and its meaning comes from the ARM-ARM (so linux can't change it in the
> > future). It should be the same on every CPU.
> > * ARM64_TTBR1.BADDR, the pa of the kernel page tables, which implicitly has the
> > offset. Again this always has a value, and its meaning comes from the ARM-ARM.
> > If we ever get clever with different page-tables/TCR values on different CPUs,
> > these two should come from the same CPU.
> >
> >
> > I think this gives you what you need if user/kernel may both be using
> > pointer-auth and both may be using 52-bit va. I'm pretty sure the 48:52 bits can
> > be picked at boot time depending on the kernel kconfig and the hardware support.
> >
> > Does anyone have a better idea? (or a corner where this won't work?)
>
> I am not sure you got a chance to look at the two regression cases I
> reported here:
> <http://lists.infradead.org/pipermail/kexec/2019-February/022449.html>
>
> Unfortunately the above suggestion doesn't provide any fix for
> ARMv8.2-LPA regression (see text under heading '
> (1). Regression Case 1 (ARMv8.2-LPA enabled kernel)')
>
> After going through the regression reports, I think exporting
> 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' to vmcoreinfo is sufficient
> for the above regressions (without over-complicating the stuff) as
> ARM64_TCR.T1SZ and friends seem to be arch specific as compared to
> VA_BITS + 'MAX_USER_VA_BITS'.
>
For MAX_USER_VA_BITS, IIUC you are just after a value of PTRS_PER_PGD?
Why not just add PTRS_PER_PGD to the vmcoreinfo?
FWIW it is possible in vaddr_to_paddr_arm64 to detect a zero pgd entry
then try again with another ptrs_per_pgd value (granted this is a little
hacky).
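
To make the retry idea concrete, here is a rough sketch, not makedumpfile's
actual code. It assumes 64K pages with three translation levels (so
PGDIR_SHIFT is 42), and the names pgd_index_for() and lookup_pgd_entry() are
made up for illustration. The index calculation has the same shape as the
kernel's pgd_index(), but takes PTRS_PER_PGD as a parameter so several
candidate values can be tried:

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64K pages, 3 translation levels, hence PGDIR_SHIFT == 42. */
#define PGDIR_SHIFT 42

/* Same shape as pgd_index(), with PTRS_PER_PGD passed in by the caller. */
static inline uint64_t pgd_index_for(uint64_t addr, uint64_t ptrs_per_pgd)
{
	return (addr >> PGDIR_SHIFT) & (ptrs_per_pgd - 1);
}

/*
 * Hypothetical helper: try each candidate PTRS_PER_PGD in turn and return
 * the first non-zero pgd entry found, or 0 if every candidate lands on an
 * empty slot. 'pgd' must hold at least as many entries as the largest
 * candidate. If 'used' is non-NULL it receives the candidate that matched.
 */
static uint64_t lookup_pgd_entry(const uint64_t *pgd, uint64_t addr,
				 const uint64_t *candidates, size_t n,
				 uint64_t *used)
{
	for (size_t i = 0; i < n; i++) {
		uint64_t entry = pgd[pgd_index_for(addr, candidates[i])];

		if (entry != 0) {
			if (used)
				*used = candidates[i];
			return entry;
		}
	}
	return 0;
}
```

Under these assumptions, a kernel address such as 0xffff000000000000 gives
index 0 with a PTRS_PER_PGD of 64 (48-bit) and index 960 with 1024 (52-bit),
so the first attempt reads an empty slot and the second finds the real entry.
The weakness is that a non-zero entry at the wrong index would be accepted
silently, which is why exporting PTRS_PER_PGD directly looks cleaner.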
Cheers,
--
Steve