linux-arm-kernel.lists.infradead.org archive mirror
From: Ard Biesheuvel <ard.biesheuvel@linaro.org>
To: Steve Capper <Steve.Capper@arm.com>
Cc: "crecklin@redhat.com" <crecklin@redhat.com>,
	Marc Zyngier <Marc.Zyngier@arm.com>,
	Catalin Marinas <Catalin.Marinas@arm.com>,
	Will Deacon <Will.Deacon@arm.com>, nd <nd@arm.com>,
	linux-arm-kernel <linux-arm-kernel@lists.infradead.org>
Subject: Re: [PATCH 0/9] 52-bit kernel + user VAs
Date: Tue, 26 Feb 2019 21:17:49 +0100
Message-ID: <CAKv+Gu9J6+4rNBJKF8kVpP96bgANR3ZiMuOcvYLYCPEy-PkUDg@mail.gmail.com>
In-Reply-To: <20190226173007.GA1553@capper-debian.cambridge.arm.com>

On Tue, 26 Feb 2019 at 18:30, Steve Capper <Steve.Capper@arm.com> wrote:
>
> On Tue, Feb 19, 2019 at 05:18:18PM +0100, Ard Biesheuvel wrote:
> > On Tue, 19 Feb 2019 at 14:56, Steve Capper <Steve.Capper@arm.com> wrote:
> > >
> > > On Tue, Feb 19, 2019 at 02:15:26PM +0100, Ard Biesheuvel wrote:
> > > > On Tue, 19 Feb 2019 at 14:01, Will Deacon <will.deacon@arm.com> wrote:
> > > > >
> > > > > On Tue, Feb 19, 2019 at 01:51:51PM +0100, Ard Biesheuvel wrote:
> > > > > > On Tue, 19 Feb 2019 at 13:48, Will Deacon <will.deacon@arm.com> wrote:
> > > > > > >
> > > > > > > On Tue, Feb 19, 2019 at 01:13:32PM +0100, Ard Biesheuvel wrote:
> > > > > > > > On Mon, 18 Feb 2019 at 18:05, Steve Capper <steve.capper@arm.com> wrote:
> > > > > > > > >
> > > > > > > > > This patch series adds support for 52-bit kernel VAs using some of the
> > > > > > > > > machinery already introduced by the 52-bit userspace VA code in 5.0.
> > > > > > > > >
> > > > > > > > > As 52-bit virtual address support is an optional hardware feature,
> > > > > > > > > software support for 52-bit kernel VAs needs to be deduced at early boot
> > > > > > > > > time. If HW support is not available, the kernel falls back to 48-bit.
> > > > > > > > >
> > > > > > > > > A significant proportion of this series focuses on "de-constifying"
> > > > > > > > > VA_BITS related constants.
> > > > > > > > >
> > > > > > > > > In order to allow for a KASAN shadow that changes size at boot time, one
> > > > > > > > > must fix the KASAN_SHADOW_END for both 48 & 52-bit VAs and "grow" the
> > > > > > > > > start address. Also, it is highly desirable to maintain the same
> > > > > > > > > function addresses in the kernel .text between VA sizes. Both of these
> > > > > > > > > > requirements force us to flip the kernel address space halves s.t.
> > > > > > > > > the direct linear map occupies the lower addresses.
> > > > > > > > >
> > > > > > > > > One obvious omission is 52-bit kernel VA + 48-bit userspace VA which I
> > > > > > > > > can add with some more #ifdef'ery if needed.
> > > > > > > > >
> > > > > > > >
> > > > > > > > Hi Steve,
> > > > > > > >
> > > > > > > > Apologies if I am bringing up things that have been addressed
> > > > > > > > internally already. We discussed the 52-bit kernel VA work at
> > > > > > > > > Plumbers at some point, and IIUC, KASAN is the complicating factor
> > > > > > > > when it comes to having compile time constants for VA_BITS_MIN,
> > > > > > > > VA_BITS_MAX and PAGE_OFFSET, right?
> > > > > > > >
> > > > > > > > To clarify what I mean, please refer to the diagram below, which
> > > > > > > > describes a hybrid 48/52 kernel VA arrangement that does not rely on
> > > > > > > > runtime variable quantities. (VA_BITS_MIN == 48, VA_BITS_MAX == 52)
> > > > > > > >
> > > > > > > > +------------------- (~0) -------------------------+
> > > > > > > > |                                                |
> > > > > > > > |            PCI IO / fixmap spaces              |
> > > > > > > > |                                                |
> > > > > > > > +------------------------------------------------+
> > > > > > > > |                                                |
> > > > > > > > |             kernel/vmalloc space               |
> > > > > > > > |                                                |
> > > > > > > > +------------------------------------------------+
> > > > > > > > |                                                |
> > > > > > > > |                module space                    |
> > > > > > > > |                                                |
> > > > > > > > +------------------------------------------------+
> > > > > > > > |                                                |
> > > > > > > > |                BPF space                       |
> > > > > > > > |                                                |
> > > > > > > > +------------------------------------------------+
> > > > > > > > |                                                |
> > > > > > > > |                                                |
> > > > > > > > |   vmemmap space (size based on VA_BITS_MAX)    |
> > > > > > > > |                                                |
> > > > > > > > |                                                |
> > > > > > > > +-- linear/vmalloc split based on VA_BITS_MIN -- +
> > > > > > > > |                                                |
> > > > > > > > |    linear mapping (48 bit addressable region)  |
> > > > > > > > |                                                |
> > > > > > > > +------------------------------------------------+
> > > > > > > > |                                                |
> > > > > > > > |    linear mapping (52 bit addressable region)  |
> > > > > > > > |                                                |
> > > > > > > > +------ PAGE_OFFSET based on VA_BITS_MAX --------+
> > > > > > > >
> > > > > > > > Since KASAN is what is preventing this, would it be acceptable for
> > > > > > > > KASAN to only be supported when you use a true 48 bit or a true 52 bit
> > > > > > > > configuration, and disable it for the 48/52 hybrid configuration?
> > > > > > > >
> > > > > > > > Just thinking out loud (and in ASCII art :-))
> > > > > > >
> > > > > > > TBH, if we end up having support for 52-bit kernel VA, I'd be inclined to
> > > > > > > drop the 48/52 configuration altogether. But Catalin's on holiday at the
> > > > > > > moment, and may have a different opinion ;)
> > > > > > >
> > > > > >
> > > > > > But that implies that you cannot have an image that supports 52-bit
> > > > > > kernel VAs but can still boot on hardware that does not implement
> > > > > > support for it. If that is acceptable, then none of this hoop jumping
> > > > > > that Steve is doing in these patches is necessary to begin with,
> > > > > > right?
> > > > >
> > > > > Sorry, I misunderstood what you meant by a "48/52 hybrid configuration". I
> > > > > thought you were referring to the configuration where userspace is 52-bit
> > > > > and the kernel is 48-bit, which is something I think we can drop if we gain
> > > > > support for 52-bit kernel.
> > > > >
> > > > > Now that I understand what you mean, I think disabling KASAN would be fine
> > > > > as long as it's a runtime thing and the kernel continues to work in every
> > > > > other respect.
> > > > >
> > > >
> > > > No, it would be a limitation of the 52-bit config which also supports
> > > > 48-bit-VA-only-h/w that the address space is laid out in such a way
> > > > that there is simply no room for the KASAN shadow region, since it
> > > > would have to live in the 48-bit addressable area, but be big enough
> > > > to cover 52 bits of VA, which is impossible.
> > > >
> > > > For the vmemmap space, we could live with sizing it statically to
> > > > cover a 52-bit VA linear region, but the KASAN shadow region is simply
> > > > too big.
> > > >
> > > > So if KASAN support in that configuration is a requirement, then I
> > > > agree with Steve's approach, but it does imply that quite a number of
> > > > formerly compile-time constants now get turned into runtime variables.
> > > >
> > > > Steve, do you have any idea what the impact of that is?
> > >
> > > Hi Guys,
> > >
> > > > The KASAN region only really necessitates two things: 1) that we think
> > > > about the end address of the region (which is invariant) rather than the
> > > > start address; and 2) that we flip the kernel VA space. IIUC both these
> > > > changes have a negligible perf impact.
> > >
> > > As for VA_BITS_ACTUAL, we need this in a few places: KVM mapping
> > > support, and the big one phys_to/from_virt. For phys_to/from_virt the
> > > logic is changed s.t. we use a variable lookup for translation but this
> > > is folded into a new variable physvirt_offset (before the patch we used
> > > a single variable read too).
> > >
> > > Again IIUC there should be a minimal perf impact (unless one tries to do
> > > cat /sys/kernel/debug/kernel_page_tables with KASAN enabled - but that
> > > can be optimised later).
> > >
> > > I didn't have the patience for ASCII art ;-), but I have a picture of
> > > what I think it looks like here:
> > > https://s3.amazonaws.com/connect.linaro.org/yvr18/presentations/yvr18-119.pdf
> > > What I've tried to do is have most parts of the kernel VA space
> > > invariant between 48/52 bits. If it's helpful I can type this up into a
> > > document/commit log message?
> > >
> > > For this series I have tried to introduce VA_BITS_MIN in its own patch
> > > and also VA_BITS_ACTUAL into its own patch to make it easier to follow.
> > >
>
> Hi Ard,
>
> Apologies for my late reply, I had been staring at this for a while.
>
> >
> > OK, perhaps I am just rephrasing what you essentially implemented
> > already, but let me try to explain a bit better what I mean:
> >
> > - we flip the VA space in the way you suggest
> > - we limit the size of the top half of the address space to 47 bits
> > - KASAN region grows downwards from (~0) << 47
> > - we define PAGE_OFFSET as (~0) << 52, regardless of whether the h/w
> > supports LVA or not
> > - however, we tweak the phys/virt translation so that memory appears
> > in the 48-bit addressable part of the linear region on non-LVA
> > hardware
> >
> > The latter basically means that the KASAN shadow region will intersect
> > the linear region, but whether we map memory or shadow pages there
> > depends on the h/w config at runtime.
> >
> > The heart of the matter is probably the different placement of the
> > memory inside the linear region, depending on whether the h/w is LVA
> > capable or not, which is also reflected in your physvirt_offset. I am
> > just trying to figure out why we need VA_BITS_ACTUAL to be a runtime
> > variable.
>
> Currently the direct linear map between configurations does not overlap,
> we have:
>
> FFF00000_00000000 - Direct linear map start (52-bit)
> FFF80000_00000000 - Direct linear map end (52-bit)
> FFFF0000_00000000 - Direct linear map start (48-bit)
> FFFF8000_00000000 - Direct linear map end (48-bit)
>
> We *can* make PAGE_OFFSET a constant for both 48/52 bit VA_BITS, if we
> offset it. vmemmap can then be adjusted on early boot to ensure that
> everything points to the right place. However we will get overlap for
> 52-bit configurations between KASAN and the direct linear map.
>
> The question is: are we okay with quite a large overlap?
>
> The KASAN region begins on 0xFFFDA000_00000000 for 52-bit. If we wish to
> employ a "full" 47-bit direct linear map on 48-bit systems we need a
> PAGE_OFFSET of 0xFFF78000_00000000 in order to make the direct linear
> map end addresses "match up" between 48/52 bit configurations.
>
> This doesn't leave us with a lot of room for 52-bit configurations
> though, if KASAN is enabled.
>

OK, so with actual numbers, what I had in mind was


FFF00000_00000000  start of 52-bit addressable linear region | PAGE_OFFSET

FFFD8000_00000000  start of KASAN shadow region | KASAN_SHADOW_OFFSET

FFFF0000_00000000  start of 48-bit addressable linear region

FFFF6000_00000000  start of used KASAN shadow region (48-bit VA)
                   (KASAN_SHADOW_OFFSET + (F0000_00000000 >> 3))

FFFF8000_00000000  start of vmemmap area - end of KASAN shadow region

FFFF8200_00000000  end of vmemmap area - start of bpf/module/etc area


The trick is that the full (52 - 3) bits KASAN shadow space overlaps
with the 48-bit linear region, but since you don't need KASAN shadow
pages for memory that does not exist, the region FFFF0000_00000000 -
FFFF6000_00000000 can be used for mapping the memory in case the h/w
is 48-bit only.
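
The addresses in the table can be cross-checked with a few lines of
arithmetic. This is only a sketch: it assumes the usual KASAN ratio of
one shadow byte per 8 bytes of memory, and it treats the value labelled
KASAN_SHADOW_OFFSET above as the start of the shadow region, so that the
shadow of address A sits at SHADOW_START + ((A - PAGE_OFFSET) >> 3). The
names SHADOW_START, SHADOW_END, LINEAR_48_START and shadow_of() are made
up for illustration and are not taken from the patches.

```python
MASK64 = (1 << 64) - 1

# Values taken from the layout table above.
PAGE_OFFSET = 0xFFF0_0000_0000_0000      # (~0) << 52
LINEAR_48_START = 0xFFFF_0000_0000_0000  # (~0) << 48
SHADOW_START = 0xFFFD_8000_0000_0000     # labelled KASAN_SHADOW_OFFSET above
SHADOW_END = 0xFFFF_8000_0000_0000       # also the start of the vmemmap area

KASAN_SHIFT = 3  # assumption: one shadow byte covers 8 bytes of memory


def shadow_of(addr):
    """Shadow address for a kernel VA, measured from PAGE_OFFSET (hypothetical helper)."""
    return SHADOW_START + ((addr - PAGE_OFFSET) >> KASAN_SHIFT)


# The full shadow covers 52 bits of VA, so it needs (52 - 3) bits of space.
full_shadow_size = SHADOW_END - SHADOW_START

# First shadow byte that is actually needed on 48-bit-only hardware.
used_48_start = shadow_of(LINEAR_48_START)

# Space below that point which 48-bit hardware can use for the linear map.
free_for_linear = used_48_start - LINEAR_48_START

print(hex(full_shadow_size), hex(used_48_start), free_for_linear >> 40, "TiB")
```

Under these assumptions the last byte of the 64-bit address space maps
to SHADOW_END - 1, and the unused part of the shadow on 48-bit hardware
is exactly the FFFF0000_00000000 - FFFF6000_00000000 window mentioned
above.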

So in this case, PAGE_OFFSET and KASAN_SHADOW_OFFSET remain compile
time constants, and as long as we don't attempt to map anything
outside of the 48-bit addressable area on h/w that does not support
it, the fact that those quantities are outside the 48-bit range does
not really matter.
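
To make the phys/virt tweak a bit more concrete, here is a rough sketch
of how the translation could pick the placement of memory at boot while
PAGE_OFFSET itself stays constant. Everything apart from the two
addresses is hypothetical: linear_base(), the lva_supported flag and the
memstart parameter are illustrative names, not the interface used in
Steve's series (which folds this into physvirt_offset).

```python
PAGE_OFFSET = 0xFFF0_0000_0000_0000      # compile-time constant in both cases
LINEAR_48_START = 0xFFFF_0000_0000_0000  # lowest kernel VA on 48-bit-only h/w


def linear_base(lva_supported):
    """Hypothetical: VA at which the first byte of RAM is mapped, decided once at boot."""
    return PAGE_OFFSET if lva_supported else LINEAR_48_START


def phys_to_virt(phys, memstart, lva_supported):
    # Same formula on both kinds of hardware; only the base differs.
    return linear_base(lva_supported) + (phys - memstart)


def virt_to_phys(virt, memstart, lva_supported):
    return virt - linear_base(lva_supported) + memstart


# Example: RAM starting at 2 GiB (made-up value), 48-bit-only hardware.
memstart = 0x8000_0000
va = phys_to_virt(memstart + 0x1000, memstart, lva_supported=False)
print(hex(va))
```

On non-LVA hardware every address this produces is at or above
FFFF0000_00000000, so nothing outside the 48-bit addressable area is
ever mapped even though PAGE_OFFSET lies below it.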
