From: Evgenii Stepanov <eugenis@google.com>
To: Kevin Brodsky <kevin.brodsky@arm.com>
Cc: Dave Martin <Dave.Martin@arm.com>,
Catalin Marinas <catalin.marinas@arm.com>,
Mark Rutland <mark.rutland@arm.com>,
Kate Stewart <kstewart@linuxfoundation.org>,
"open list:DOCUMENTATION" <linux-doc@vger.kernel.org>,
Will Deacon <will.deacon@arm.com>,
Linux Memory Management List <linux-mm@kvack.org>,
"open list:KERNEL SELFTEST FRAMEWORK"
<linux-kselftest@vger.kernel.org>,
Chintan Pandya <cpandya@codeaurora.org>,
Vincenzo Frascino <vincenzo.frascino@arm.com>,
Shuah Khan <shuah@kernel.org>, Ingo Molnar <mingo@kernel.org>,
linux-arch <linux-arch@vger.kernel.org>,
Jacob Bramley <Jacob.Bramley@arm.com>,
Dmitry Vyukov <dvyukov@google.com>,
Kees Cook <keescook@chromium.org>,
Ruben Ayrapetyan <Ruben.Ayrapetyan@arm.com>,
Andrey Konovalov <andreyknvl@google.com>,
Lee Smith <Lee.Smith@arm.com>,
Alexander Viro <viro@zeniv.linux.org.uk>,
Linux ARM <linux-arm-kernel@lists.infradead.org>,
Kostya Serebryany <kcc@google.com>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
LKML <linux-kernel@vger.kernel.org>,
"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
Ramana Radhakrishnan <Ramana.Radhakrishnan@arm.com>,
Andrew Morton <akpm@linux-foundation.org>,
Robin Murphy <robin.murphy@arm.com>,
Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Subject: Re: [RFC][PATCH 0/3] arm64 relaxed ABI
Date: Mon, 11 Feb 2019 12:32:55 -0800 [thread overview]
Message-ID: <CAFKCwrhH5R3e5ntX0t-gxcE6zzbCNm06pzeFfYEN2K13c5WLTg@mail.gmail.com> (raw)
In-Reply-To: <9bbacb1b-6237-f0bb-9bec-b4cf8d42bfc5@arm.com>
On Mon, Feb 11, 2019 at 9:28 AM Kevin Brodsky <kevin.brodsky@arm.com> wrote:
>
> On 19/12/2018 12:52, Dave Martin wrote:
> > On Tue, Dec 18, 2018 at 05:59:38PM +0000, Catalin Marinas wrote:
> >> On Tue, Dec 18, 2018 at 04:03:38PM +0100, Andrey Konovalov wrote:
> >>> On Wed, Dec 12, 2018 at 4:02 PM Catalin Marinas<catalin.marinas@arm.com> wrote:
> >>>> The summary of our internal discussions (mostly between kernel
> >>>> developers) is that we can't properly describe a user ABI that covers
> >>>> future syscalls or syscall extensions while not all syscalls accept
> >>>> tagged pointers. So we tweaked the requirements slightly to only allow
> >>>> tagged pointers back into the kernel *if* the originating address is
> >>>> from an anonymous mmap() or below sbrk(0). This should cover some of the
> >>>> ioctls or getsockopt(TCP_ZEROCOPY_RECEIVE) where the user passes a
> >>>> pointer to a buffer obtained via mmap() on the device operations.
> >>>>
> >>>> (sorry for not being clear on what Vincenzo's proposal implies)
> >>> OK, I see. So I need to make the following changes to my patchset AFAIU.
> >>>
> >>> 1. Make sure that we only allow tagged user addresses that originate
> >>> from an anonymous mmap() or below sbrk(0). How exactly should this
> >>> check be performed?
> >> I don't think we should perform such checks. That's rather stating that
> >> the kernel only guarantees that the tagged pointers work if they
> >> originated from these memory ranges.
> > I concur.
> >
> > Really, the kernel should do the expected thing with all "non-weird"
> > memory.
> >
> > In lieu of a proper definition of "non-weird", I think we should have
> > some lists of things that are explicitly included, and also excluded:
> >
> > OK:
> > kernel-allocated process stack
> > brk area
> > MAP_ANONYMOUS | MAP_PRIVATE
> > MAP_PRIVATE mappings of /dev/zero
> >
> > Not OK:
> > MAP_SHARED
> > mmaps of non-memory-like devices
> > mmaps of anything that is not a regular file
> > the VDSO
> > ...
> >
> > In general, userspace can tag memory that it "owns", and we do not assume
> > a transfer of ownership except in the "OK" list above. Otherwise, it's
> > the kernel's memory, or the owner is simply not well defined.
>
> Agreed on the general idea: a process should be able to pass tagged pointers at the
> syscall interface, as long as they point to memory privately owned by the process. I
> think it would be possible to simplify the definition of "non-weird" memory by using
> only this "OK" list:
> - mmap() done by the process itself, where either:
> * flags = MAP_PRIVATE | MAP_ANONYMOUS
> * flags = MAP_PRIVATE and fd refers to a regular file or a well-defined list of
> device files (like /dev/zero)
> - brk() done by the process itself
> - Any memory mapped by the kernel in the new process's address space during execve(),
> with the same restrictions as above ([vdso]/[vvar] are therefore excluded)
>
> > I would also like to see advice for userspace developers, particularly
> > things like (strawman, please challenge!):
>
> To some extent, one could call me a userspace developer, so I'll try to help :)
>
> > * Userspace should set tags at the point of allocation only.
>
> Yes, tags are only supposed to be set at the point of either allocation or
> deallocation/reallocation. However, allocators can in principle be nested, so an
> allocator could take a region allocated by malloc() as input and subdivide it
> (changing tags in the process). That said, this suballocator must not free() that
> region until all the suballocations themselves have been freed (thereby restoring the
> tags initially set by malloc()).
>
> > * If you don't know how an object was allocated, you cannot modify the
> > tag, period.
>
> Agreed, allocators that tag memory can only operate on memory with a well-defined
> provenance (for instance anonymous mmap() or malloc()).
>
> > * A single C object should be accessed using a single, fixed pointer tag
> > throughout its entire lifetime.
>
> Agreed. Allocators themselves may need to be excluded though, depending on how they
> represent their managed memory.
>
> > * Tags can be changed only when there are no outstanding pointers to
> > the affected object or region that may be used to access the object
> > or region (i.e., if the object were allocated from the C heap and
> > is it safe to realloc() it, then it is safe to change the tag; for
> > other types of allocation, analogous arguments can be applied).
>
> Tags can only be changed at the point of deallocation/reallocation. Pointers to the
> object become invalid and cannot be used after it has been deallocated; memory
> tagging allows to catch such invalid usage.
>
> > * When the kernel dereferences a pointer on userspace's behalf, it
> > shall behave equivalently to userspace dereferencing the same pointer,
> > including use of the same tag (where passed by userspace).
> >
> > * Where the pointer tag affects pointer dereference behaviour (i.e.,
> > with hardware memory colouring) the kernel makes no guarantee to
> > honour pointer tags correctly for every location a buffer based on a
> > pointer passed by userspace to the kernel.
> >
> > (This means for example that for a read(fd, buf, size), we can check
> > the tag for a single arbitrary location in *(char (*)[size])buf
> > before passing the buffer to get_user_pages(). Hopefully this could
> > be done in get_user_pages() itself rather than hunting call sites.
> > For userspace, it means that you're on your own if you ask the
> > kernel to operate on a buffer than spans multiple, independently-
> > allocated objects, or a deliberately striped single object.)
>
> I think both points are reasonable. It is very valuable for the kernel to access
> userspace memory using the user-provided tag, because it enables kernel accesses to
> be checked in the same way as user accesses, allowing to detect bugs that are
> potentially hard to find. For instance, if a pointer to an object is passed to the
> kernel after it has been deallocated, this is invalid and should be detected.
> However, you are absolutely right that the kernel cannot *guarantee* that such a
> check is carried out for the entire memory range (or in fact at all); it should be a
> best-effort approach.
It would also be valuable to narrow down the set of "relaxed" (i.e.
not tag-checking) syscalls as reasonably possible. We would want to
provide tag-checking userspace wrappers for any important calls that
are not checked in the kernel. Is it correct to assume that anything
that goes through copy_from_user / copy_to_user is checked?
>
> > * The kernel shall not extend the lifetime of user pointers in ways
> > that are not clear from the specification of the syscall or
> > interface to which the pointer is passed (and in any case shall not
> > extend pointer lifetimes without good reason).
> >
> > So no clever transparent caching between syscalls, unless it _really_
> > is transparent in the presence of tags.
>
> Do you have any particular case in mind? If such caching is really valuable, it is
> always possible to access the object while ignoring the tag. For sure, the
> user-provided tag can only be used during the syscall handling itself, not
> asynchronously later on, unless otherwise specified.
For aio* operations it would be nice if the tag was checked at the
time of the actual userspace read/write, either instead of or in
addition to at the time of the system call.
>
> > * For purposes other than dereference, the kernel shall accept any
> > legitimately tagged pointer (according to the above rules) as
> > identifying the associated memory location.
> >
> > So, mprotect(some_page_aligned_object, ...); is valid irrespective
> > of where page_aligned_object() came from. There is no implicit
> > derefence by the kernel here, hence no tag check.
> >
> > The kernel does not guarantee to work correctly if the wrong tag
> > is used, but there is not always a well-defined "right" tag, so
> > we can't really guarantee to check it. So a pointer derived by
> > any reasonable means by userspace has to be treated as equally
> > valid.
>
> This is a disputed point :) In my opinion, this is the the most reasonable approach.
Yes, it would be nice if the kernel explicitly promised, ex.
mprotect() over a range of differently tagged pages to be allowed
(i.e. address tag should be unchecked).
>
> Cheers,
> Kevin
>
> > We would need to get some cross-arch buy-in for this, otherwise core
> > maintainers might just refuse to maintain the necessary guarantees.
> >
> >
> >>> 2. Allow tagged addressed passed to memory syscalls (as long as (1) is
> >>> satisfied). Do I understand correctly that this means that I need to
> >>> locate all find_vma() callers outside of mm/ and fix them up as well?
> >> Yes (unless anyone as a better idea or objections to this approach).
> > Also, watch out for code that pokes about inside struct vma directly.
> >
> > I'm wondering, could we define an explicit type, say,
> >
> > struct user_vaddr {
> > unsigned long addr;
> > };
> >
> > to replace the unsigned longs in struct vma the mm API? This would
> > turn ad-hoc (unsigned long) casts into build breaks. We could have
> > an explicit conversion functions, say,
> >
> > struct user_vaddr __user_vaddr_unsafe(void __user *);
> > void __user *__user_ptr_unsafe(struct user_vaddr);
> >
> > that we robotically insert in all the relevant places to mark
> > unaudited code.
> >
> > This allows us to keep the kernel buildable, while flagging things
> > that will need review. We would also need to warn the mm folks to
> > reject any new code using these unsafe conversions.
> >
> > Of course, it would be a non-trivial effort...
> >
> >> BTW, I'll be off until the new year, so won't be able to follow up.
> > Cheers
> > ---Dave
> >
> > _______________________________________________
> > linux-arm-kernel mailing list
> > linux-arm-kernel@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>
next prev parent reply other threads:[~2019-02-11 20:33 UTC|newest]
Thread overview: 36+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-12-10 12:50 [PATCH v9 0/8] arm64: untag user pointers passed to the kernel Andrey Konovalov
2018-12-10 12:50 ` [PATCH v9 1/8] arm64: add type casts to untagged_addr macro Andrey Konovalov
2018-12-10 12:50 ` [PATCH v9 2/8] uaccess: add untagged_addr definition for other arches Andrey Konovalov
2018-12-10 12:51 ` [PATCH v9 3/8] arm64: untag user addresses in access_ok and __uaccess_mask_ptr Andrey Konovalov
2018-12-10 12:51 ` [PATCH v9 4/8] mm, arm64: untag user addresses in mm/gup.c Andrey Konovalov
2018-12-10 12:51 ` [PATCH v9 5/8] lib, arm64: untag addrs passed to strncpy_from_user and strnlen_user Andrey Konovalov
2018-12-10 12:51 ` [PATCH v9 6/8] fs, arm64: untag user address in copy_mount_options Andrey Konovalov
2018-12-10 12:51 ` [PATCH v9 7/8] arm64: update Documentation/arm64/tagged-pointers.txt Andrey Konovalov
2018-12-10 12:51 ` [PATCH v9 8/8] selftests, arm64: add a selftest for passing tagged pointers to kernel Andrey Konovalov
2018-12-10 14:30 ` [RFC][PATCH 0/3] arm64 relaxed ABI Vincenzo Frascino
2018-12-10 14:30 ` [RFC][PATCH 1/3] elf: Make AT_FLAGS arch configurable Vincenzo Frascino
2018-12-10 14:30 ` [RFC][PATCH 2/3] arm64: Define Documentation/arm64/elf_at_flags.txt Vincenzo Frascino
2018-12-12 17:34 ` Dave Martin
2019-01-09 13:05 ` Vincenzo Frascino
2018-12-10 14:30 ` [RFC][PATCH 3/3] arm64: elf: Advertise relaxed ABI Vincenzo Frascino
2018-12-12 14:23 ` [RFC][PATCH 0/3] arm64 " Andrey Konovalov
2018-12-12 15:02 ` Catalin Marinas
2018-12-18 15:03 ` Andrey Konovalov
2018-12-18 17:59 ` Catalin Marinas
2018-12-19 12:52 ` Dave Martin
2019-02-11 17:28 ` Kevin Brodsky
2019-02-11 20:32 ` Evgenii Stepanov [this message]
2019-02-12 18:02 ` Catalin Marinas
2019-02-13 14:58 ` Dave Martin
2019-02-13 16:42 ` Kevin Brodsky
2019-02-13 17:43 ` Dave Martin
2019-02-13 21:41 ` Evgenii Stepanov
2019-02-14 11:22 ` Kevin Brodsky
2019-02-19 18:38 ` Szabolcs Nagy
2019-02-25 16:57 ` Catalin Marinas
2019-02-25 18:02 ` Szabolcs Nagy
2019-02-26 17:30 ` Kevin Brodsky
2018-12-12 17:01 ` [PATCH v9 0/8] arm64: untag user pointers passed to the kernel Dave Martin
2018-12-18 17:17 ` Andrey Konovalov
2019-02-11 11:35 ` Catalin Marinas
2019-02-11 17:02 ` Dave Martin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAFKCwrhH5R3e5ntX0t-gxcE6zzbCNm06pzeFfYEN2K13c5WLTg@mail.gmail.com \
--to=eugenis@google.com \
--cc=Dave.Martin@arm.com \
--cc=Jacob.Bramley@arm.com \
--cc=Lee.Smith@arm.com \
--cc=Ramana.Radhakrishnan@arm.com \
--cc=Ruben.Ayrapetyan@arm.com \
--cc=akpm@linux-foundation.org \
--cc=andreyknvl@google.com \
--cc=catalin.marinas@arm.com \
--cc=cpandya@codeaurora.org \
--cc=dvyukov@google.com \
--cc=gregkh@linuxfoundation.org \
--cc=kcc@google.com \
--cc=keescook@chromium.org \
--cc=kevin.brodsky@arm.com \
--cc=kirill.shutemov@linux.intel.com \
--cc=kstewart@linuxfoundation.org \
--cc=linux-arch@vger.kernel.org \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-doc@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-kselftest@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=luc.vanoostenryck@gmail.com \
--cc=mark.rutland@arm.com \
--cc=mingo@kernel.org \
--cc=robin.murphy@arm.com \
--cc=shuah@kernel.org \
--cc=vincenzo.frascino@arm.com \
--cc=viro@zeniv.linux.org.uk \
--cc=will.deacon@arm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).