Linux-Media Archive on lore.kernel.org
 help / color / Atom feed
From: Catalin Marinas <catalin.marinas@arm.com>
To: Khalid Aziz <khalid.aziz@oracle.com>
Cc: Andrew Murray <andrew.murray@arm.com>,
	Andrey Konovalov <andreyknvl@google.com>,
	Mark Rutland <mark.rutland@arm.com>,
	kvm@vger.kernel.org, Szabolcs Nagy <Szabolcs.Nagy@arm.com>,
	Will Deacon <will.deacon@arm.com>,
	dri-devel@lists.freedesktop.org, linux-mm@kvack.org,
	linux-kselftest@vger.kernel.org,
	Felix Kuehling <Felix.Kuehling@amd.com>,
	Vincenzo Frascino <vincenzo.frascino@arm.com>,
	Jacob Bramley <Jacob.Bramley@arm.com>,
	Leon Romanovsky <leon@kernel.org>,
	linux-rdma@vger.kernel.org, amd-gfx@lists.freedesktop.org,
	Dmitry Vyukov <dvyukov@google.com>,
	Dave Martin <Dave.Martin@arm.com>,
	Evgeniy Stepanov <eugenis@google.com>,
	linux-media@vger.kernel.org,
	Kevin Brodsky <kevin.brodsky@arm.com>,
	Kees Cook <keescook@chromium.org>,
	Ruben Ayrapetyan <Ruben.Ayrapetyan@arm.com>,
	Ramana Radhakrishnan <Ramana.Radhakrishnan@arm.com>,
	Alex Williamson <alex.williamson@redhat.com>,
	Mauro Carvalho Chehab <mchehab@kernel.org>,
	linux-arm-kernel@lists.infradead.org,
	Kostya Serebryany <kcc@google.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Yishai Hadas <yishaih@mellanox.com>,
	linux-kernel@vger.kernel.org,
	Jens Wiklander <jens.wiklander@linaro.org>,
	Lee Smith <Lee.Smith@arm.com>,
	Alexander Deucher <Alexander.Deucher@amd.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Robin Murphy <robin.murphy@arm.com>,
	Christian Koenig <Christian.Koenig@amd.com>,
	Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Subject: Re: [PATCH v15 05/17] arms64: untag user pointers passed to memory syscalls
Date: Wed, 29 May 2019 15:20:08 +0100
Message-ID: <20190529142008.5quqv3wskmpwdfbu@mbp> (raw)
In-Reply-To: <11193998209cc6ff34e7d704f081206b8787b174.camel@oracle.com>

Hi Khalid,

On Tue, May 28, 2019 at 05:33:04PM -0600, Khalid Aziz wrote:
> On Tue, 2019-05-28 at 16:40 +0100, Catalin Marinas wrote:
> > I think another aspect is how we define the ABI. Is allowing tags to
> > mlock() for example something specific to arm64 or would sparc ADI
> > need the same? In the absence of other architectures defining such
> > ABI, my preference would be to keep the wrappers in the arch code.
> > 
> > Assuming sparc won't implement untagged_addr(), we can place the
> > macros back in the generic code but, as per the review here, we need
> > to be more restrictive on where we allow tagged addresses. For
> > example, if mmap() gets a tagged address with MAP_FIXED, is it
> > expected to return the tag?
> 
> I would recommend against any ABI differences between ARM64 MTE/TBI and
> sparc ADI unless it simply can not be helped. My understanding of
> MTE/TBI is limited, so I will explain how sparc ADI works. On sparc, a
> tagged address has no meaning until following steps happen:

Before we go into the MTE/ADI similarities or differences, just to
clarify that TBI is something that we supported from the start of the
arm64 kernel port. TBI (top byte ignore) allows a user pointer to have
non-zero top byte and dereference it without causing a fault (the
hardware masks it out). The user/kernel ABI does not allow such tagged
pointers into the kernel, nor would the kernel return any such tagged
addresses.

With MTE (memory tagging extensions), the top-byte meaning is changed
from no longer being ignored to actually being checked against a tag in
the physical RAM (we call it allocation tag).

> 1. set the user mode PSTATE.mcde bit. This acts as the master switch to
> enable ADI for a process.
> 
> 2. set TTE.mcd bit on TLB entries that match the address range ADI is
> being enabled on.

Something close enough for MTE, with the difference that enabling it is
not a PSTATE bit but rather a system control bit (SCTLR_EL1 register),
so only the kernel can turn it on/off for the user.

> 3. Store version tag for the range of addresses userspace wants ADI
> enabled on using "stxa" instruction. These tags are stored in physical
> memory by the memory controller.

Do you have an "ldxa" instruction to load the tags from physical memory?

> Steps 1 and 2 are accomplished by userspace by calling mprotect() with
> PROT_ADI. Tags are set by storing tags in a loop, for example:
> 
>         version = 10;
>         tmp_addr = shmaddr;
>         end = shmaddr + BUFFER_SIZE;
>         while (tmp_addr < end) {
>                 asm volatile(
>                         "stxa %1, [%0]0x90\n\t"
>                         :
>                         : "r" (tmp_addr), "r" (version));
>                 tmp_addr += adi_blksz;
>         }

On arm64, a sequence similar to the above would live in the libc. So a
malloc() call will tag the memory and return the tagged address to the
user.

We were not planning for a PROT_ADI/MTE but rather have MTE enabled for
all user memory ranges. We may revisit this before we upstream the MTE
support (probably some marginal benefit for the hardware not fetching
the tags from memory if we don't need to, e.g. code sections).

Given that we already have the TBI feature and with MTE enabled the top
byte is no longer ignored, we are planning for an explicit opt-in by the
user via prctl() to enable MTE.

> With these semantics, giving mmap() or shamat() a tagged address is
> meaningless since no tags have been stored at the addresses mmap() will
> allocate and one can not store tags before memory range has been
> allocated. If we choose to allow tagged addresses to come into mmap()
> and shmat(), sparc code can strip the tags unconditionally and that may
> help simplify ABI and/or code.

We could say that with TBI (pre-MTE support), the top byte is actually
ignored on mmap(). Now, if you pass a MAP_FIXED with a tagged address,
should the user expect the same tagged address back or stripping the tag
is acceptable? If we want to keep the current mmap() semantics, I'd say
the same tag is returned. However, with MTE this also implies that the
memory was coloured.

> > My thoughts on allowing tags (quick look):
> > 
> > brk - no
> > get_mempolicy - yes
> > madvise - yes
> > mbind - yes
> > mincore - yes
> > mlock, mlock2, munlock - yes
> > mmap - no (we may change this with MTE but not for TBI)
> > mmap_pgoff - not used on arm64
> > mprotect - yes
> > mremap - yes for old_address, no for new_address (on par with mmap)
> > msync - yes
> > munmap - probably no (mmap does not return tagged ptrs)
> > remap_file_pages - no (also deprecated syscall)
> > shmat, shmdt - shall we allow tagged addresses on shared memory?
> > 
> > The above is only about the TBI ABI while ignoring hardware MTE. For
> > the latter, we may want to change the mmap() to allow pre-colouring
> > on page fault which means that munmap()/mprotect() should also
> > support tagged pointers. Possibly mremap() as well but we need to
> > decide whether it should allow re-colouring the page (probably no,
> > in which case old_address and new_address should have the same tag).
> > For some of these we'll end up with arm64 specific wrappers again,
> > unless sparc ADI adopts exactly the same ABI restrictions.
> 
> Let us keep any restrictions common across ARM64 and sparc. pre-
> coloring on sparc in the kernel would mean kernel will have to execute
> stxa instructions in a loop for each page being faulted in.

Since the user can probe the pre-existing colour in a faulted-in page
(either with some 'ldxa' instruction or by performing a tag-checked
access), the kernel should always pre-colour (even if colour 0) any
allocated page. There might not be an obvious security risk but I feel
uneasy about letting colours leak between address spaces (different user
processes or between kernel and user).

Since we already need such loop in the kernel, we might as well allow
user space to require a certain colour. This comes in handy for large
malloc() and another advantage is that the C library won't be stuck
trying to paint the whole range (think GB).

> Not that big a deal but doesn't that assume the entire page has the
> same tag which is dedcued from the upper bits of address? Shouldn't we
> support tags at the same granularity level as what the hardware
> supports?

That's mostly about large malloc() optimisation via mmap(), the latter
working on page granularity already. There is another use-case for
pre-coloured thread stacks, also allocated via anonymous mmap().

> We went through this discussion for sparc and decision was to support
> tags at the same granularity as hardware. That means we can not deduce
> tags from the first address that pioints into an mmap or shmat region.
> Those tags and the upper bytes of colored address could change for
> every cacheline sized block (64-bytes on sparc M7).

It's 16-byte for arm64, so smaller than the cacheline.

> We can try to store tags for an entire region in vma but that is
> expensive, plus on sparc tags are set in userspace with no
> participation from kernel and now we need a way for userspace to
> communicate the tags to kernel.

We can't support finer granularity through the mmap() syscall and, as
you said, the vma is not the right thing to store the individual tags.
With the above extension to mmap(), we'd have to store a colour per vma
and prevent merging if different colours (we could as well use the
pkeys mechanism we already have in the kernel but use a colour per vma
instead of a key).

Of course, the user is allowed to change the in-memory colours at a
finer granularity and the kernel will preserve them during swapping
out/in, page migration etc. The above mmap() proposal is just for the
first fault-in of a page in a given range/vma.

> From sparc point of view, making kernel responsible for assigning tags
> to a page on page fault is full of pitfalls.

This could be just some arm64-specific but if you plan to deploy it more
generically for sparc (at the C library level), you may find this
useful.

-- 
Catalin

  reply index

Thread overview: 99+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-05-06 16:30 [PATCH v15 00/17] arm64: untag user pointers passed to the kernel Andrey Konovalov
2019-05-06 16:30 ` [PATCH v15 01/17] uaccess: add untagged_addr definition for other arches Andrey Konovalov
2019-05-29 14:49   ` Khalid Aziz
2019-05-06 16:30 ` [PATCH v15 02/17] arm64: untag user pointers in access_ok and __uaccess_mask_ptr Andrey Konovalov
2019-05-06 16:30 ` [PATCH v15 03/17] lib, arm64: untag user pointers in strn*_user Andrey Konovalov
2019-05-22 10:41   ` Catalin Marinas
2019-05-06 16:30 ` [PATCH v15 04/17] mm: add ksys_ wrappers to memory syscalls Andrey Konovalov
2019-05-22 10:56   ` Catalin Marinas
2019-05-06 16:30 ` [PATCH v15 05/17] arms64: untag user pointers passed " Andrey Konovalov
2019-05-22 11:49   ` Catalin Marinas
2019-05-22 21:16     ` Evgenii Stepanov
2019-05-23  9:04       ` Catalin Marinas
2019-05-24  4:23         ` Evgenii Stepanov
2019-05-24 15:41   ` Andrew Murray
2019-05-25  9:57   ` Catalin Marinas
2019-05-27  9:42   ` Catalin Marinas
2019-05-27 14:37   ` Catalin Marinas
2019-05-28 14:54     ` Andrew Murray
2019-05-28 15:40       ` Catalin Marinas
2019-05-28 15:56         ` Dave Martin
2019-05-28 16:34           ` Catalin Marinas
2019-05-29 12:42             ` Dave Martin
2019-05-29 13:23               ` Catalin Marinas
2019-05-29 15:18                 ` Dave Martin
2019-05-28 23:33         ` Khalid Aziz
2019-05-29 14:20           ` Catalin Marinas [this message]
2019-05-29 19:16             ` Khalid Aziz
2019-05-30 15:11               ` Catalin Marinas
2019-05-30 16:05                 ` Khalid Aziz
2019-05-30 16:57                   ` Catalin Marinas
2019-05-28 13:05   ` Catalin Marinas
2019-05-06 16:30 ` [PATCH v15 06/17] mm: untag user pointers in do_pages_move Andrey Konovalov
2019-05-22 11:51   ` Catalin Marinas
2019-05-06 16:30 ` [PATCH v15 07/17] mm, arm64: untag user pointers in mm/gup.c Andrey Konovalov
2019-05-22 11:56   ` Catalin Marinas
2019-05-06 16:30 ` [PATCH v15 08/17] mm, arm64: untag user pointers in get_vaddr_frames Andrey Konovalov
2019-05-06 16:30 ` [PATCH v15 09/17] fs, arm64: untag user pointers in copy_mount_options Andrey Konovalov
2019-05-22 12:09   ` Catalin Marinas
2019-05-06 16:30 ` [PATCH v15 10/17] fs, arm64: untag user pointers in fs/userfaultfd.c Andrey Konovalov
2019-05-06 16:30 ` [PATCH v15 11/17] drm/amdgpu, arm64: untag user pointers Andrey Konovalov
2019-05-07 16:43   ` Kuehling, Felix
2019-05-06 16:30 ` [PATCH v15 12/17] drm/radeon, arm64: untag user pointers in radeon_gem_userptr_ioctl Andrey Konovalov
2019-05-07 16:44   ` Kuehling, Felix
2019-05-06 16:30 ` [PATCH v15 13/17] IB, arm64: untag user pointers in ib_uverbs_(re)reg_mr() Andrey Konovalov
2019-05-06 19:50   ` Jason Gunthorpe
2019-05-07  6:33     ` Leon Romanovsky
2019-05-06 16:31 ` [PATCH v15 14/17] media/v4l2-core, arm64: untag user pointers in videobuf_dma_contig_user_get Andrey Konovalov
2019-05-24 13:13   ` Mauro Carvalho Chehab
2019-05-06 16:31 ` [PATCH v15 15/17] tee, arm64: untag user pointers in tee_shm_register Andrey Konovalov
2019-05-06 16:31 ` [PATCH v15 16/17] vfio/type1, arm64: untag user pointers in vaddr_get_pfn Andrey Konovalov
2019-05-06 16:31 ` [PATCH v15 17/17] selftests, arm64: add a selftest for passing tagged pointers to kernel Andrey Konovalov
2019-05-22 14:16   ` Catalin Marinas
2019-05-31 14:21     ` Andrey Konovalov
2019-05-31 16:22       ` Catalin Marinas
2019-05-17 14:49 ` [PATCH v15 00/17] arm64: untag user pointers passed to the kernel Catalin Marinas
2019-05-20 23:53   ` Evgenii Stepanov
2019-05-21 18:29     ` Catalin Marinas
2019-05-22  0:04       ` Kees Cook
2019-05-22 10:11         ` Catalin Marinas
2019-05-22 15:30           ` enh
2019-05-22 16:35             ` Catalin Marinas
2019-05-22 16:58               ` enh
2019-05-23 15:21                 ` Catalin Marinas
2019-05-22 20:47               ` Kees Cook
2019-05-22 23:03                 ` Evgenii Stepanov
2019-05-22 23:09                   ` enh
2019-05-23  7:34                     ` Catalin Marinas
2019-05-23 14:44                 ` Catalin Marinas
2019-05-23 15:44                   ` enh
2019-05-23 17:00                     ` Catalin Marinas
2019-05-23 16:38                   ` Kees Cook
2019-05-23 17:43                     ` Catalin Marinas
2019-05-23 21:31                       ` Kees Cook
2019-05-24 11:20                         ` Catalin Marinas
2019-05-28 17:02                         ` Catalin Marinas
2019-06-02  5:06                           ` Kees Cook
2019-05-22 19:21             ` Kees Cook
2019-05-22 20:15               ` enh
2019-05-23 15:08               ` Catalin Marinas
2019-05-23 17:51         ` Khalid Aziz
2019-05-23 20:11           ` Catalin Marinas
2019-05-23 21:42             ` Khalid Aziz
2019-05-23 21:49             ` Khalid Aziz
2019-05-24 10:11               ` Catalin Marinas
2019-05-24 14:25                 ` Khalid Aziz
2019-05-28 14:14                   ` Andrey Konovalov
2019-05-29  6:11                     ` Christoph Hellwig
2019-05-29 12:12                       ` Catalin Marinas
2019-05-30 17:15                     ` Catalin Marinas
2019-05-31 14:29                       ` Andrey Konovalov
2019-05-31 16:19                         ` Catalin Marinas
2019-05-31 16:24                           ` Andrey Konovalov
2019-05-31 16:46                             ` Catalin Marinas
2019-05-21 18:48   ` Jason Gunthorpe
2019-05-22 13:49     ` Dave Martin
2019-05-23  0:20       ` Jason Gunthorpe
2019-05-23 10:42         ` Dave Martin
2019-05-23 16:57           ` Catalin Marinas
2019-05-24 14:23             ` Dave Martin

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190529142008.5quqv3wskmpwdfbu@mbp \
    --to=catalin.marinas@arm.com \
    --cc=Alexander.Deucher@amd.com \
    --cc=Christian.Koenig@amd.com \
    --cc=Dave.Martin@arm.com \
    --cc=Felix.Kuehling@amd.com \
    --cc=Jacob.Bramley@arm.com \
    --cc=Lee.Smith@arm.com \
    --cc=Ramana.Radhakrishnan@arm.com \
    --cc=Ruben.Ayrapetyan@arm.com \
    --cc=Szabolcs.Nagy@arm.com \
    --cc=akpm@linux-foundation.org \
    --cc=alex.williamson@redhat.com \
    --cc=amd-gfx@lists.freedesktop.org \
    --cc=andrew.murray@arm.com \
    --cc=andreyknvl@google.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=dvyukov@google.com \
    --cc=eugenis@google.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=jens.wiklander@linaro.org \
    --cc=kcc@google.com \
    --cc=keescook@chromium.org \
    --cc=kevin.brodsky@arm.com \
    --cc=khalid.aziz@oracle.com \
    --cc=kvm@vger.kernel.org \
    --cc=leon@kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=linux-media@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=luc.vanoostenryck@gmail.com \
    --cc=mark.rutland@arm.com \
    --cc=mchehab@kernel.org \
    --cc=robin.murphy@arm.com \
    --cc=vincenzo.frascino@arm.com \
    --cc=will.deacon@arm.com \
    --cc=yishaih@mellanox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-Media Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-media/0 linux-media/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-media linux-media/ https://lore.kernel.org/linux-media \
		linux-media@vger.kernel.org linux-media@archiver.kernel.org
	public-inbox-index linux-media


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-media


AGPL code for this site: git clone https://public-inbox.org/ public-inbox