LinuxPPC-Dev Archive on lore.kernel.org
 help / color / Atom feed
From: Al Viro <viro@zeniv.linux.org.uk>
To: Aleksa Sarai <cyphar@cyphar.com>
Cc: linux-ia64@vger.kernel.org, linux-sh@vger.kernel.org,
	Peter Zijlstra <peterz@infradead.org>,
	Rasmus Villemoes <linux@rasmusvillemoes.dk>,
	Alexei Starovoitov <ast@kernel.org>,
	linux-kernel@vger.kernel.org, David Howells <dhowells@redhat.com>,
	linux-kselftest@vger.kernel.org, sparclinux@vger.kernel.org,
	Christian Brauner <christian.brauner@ubuntu.com>,
	Shuah Khan <shuah@kernel.org>,
	linux-arch@vger.kernel.org, linux-s390@vger.kernel.org,
	Tycho Andersen <tycho@tycho.ws>, Aleksa Sarai <asarai@suse.de>,
	Jiri Olsa <jolsa@redhat.com>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Ingo Molnar <mingo@redhat.com>,
	linux-arm-kernel@lists.infradead.org, linux-mips@vger.kernel.org,
	linux-xtensa@linux-xtensa.org, Kees Cook <keescook@chromium.org>,
	Arnd Bergmann <arnd@arndb.de>, Jann Horn <jannh@google.com>,
	linuxppc-dev@lists.ozlabs.org, linux-m68k@lists.linux-m68k.org,
	Andy Lutomirski <luto@kernel.org>,
	Shuah Khan <skhan@linuxfoundation.org>,
	Namhyung Kim <namhyung@kernel.org>,
	David Drysdale <drysdale@google.com>,
	Christian Brauner <christian@brauner.io>,
	"J. Bruce Fields" <bfields@fieldses.org>,
	linux-parisc@vger.kernel.org, linux-api@vger.kernel.org,
	Chanho Min <chanho.min@lge.com>, Jeff Layton <jlayton@kernel.org>,
	Oleg Nesterov <oleg@redhat.com>,
	Eric Biederman <ebiederm@xmission.com>,
	linux-alpha@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	containers@lists.linux-foundation.org
Subject: Re: [PATCH v12 01/12] lib: introduce copy_struct_{to,from}_user helpers
Date: Thu, 5 Sep 2019 23:31:48 +0100
Message-ID: <20190905223148.GS1131@ZenIV.linux.org.uk> (raw)
In-Reply-To: <20190905195618.pwzgvuzadkfpznfz@yavin.dot.cyphar.com>

On Fri, Sep 06, 2019 at 05:56:18AM +1000, Aleksa Sarai wrote:
> On 2019-09-05, Al Viro <viro@zeniv.linux.org.uk> wrote:
> > On Thu, Sep 05, 2019 at 08:23:03PM +0200, Christian Brauner wrote:
> > 
> > > Because every caller of that function right now has that limit set
> > > anyway iirc. So we can either remove it from here and place it back for
> > > the individual callers or leave it in the helper.
> > > Also, I'm really asking, why not? Is it unreasonable to have an upper
> > > bound on the size (for a long time probably) or are you disagreeing with
> > > PAGE_SIZE being used? PAGE_SIZE limit is currently used by sched, perf,
> > > bpf, and clone3 and in a few other places.
> > 
> > For a primitive that can be safely used with any size (OK, any within
> > the usual 2Gb limit)?  Why push the random policy into the place where
> > it doesn't belong?
> > 
> > Seriously, what's the point?  If they want to have a large chunk of
> > userland memory zeroed or checked for non-zeroes - why would that
> > be a problem?
> 
> Thinking about it some more, there isn't really any r/w amplification --
> so there isn't much to gain by passing giant structs. Though, if we are
> going to permit 2GB buffers, isn't that also an argument to use
> memchr_inv()? :P

I'm not sure I understand the last bit.  If you look at what copy_from_user()
does on misaligned source/destination, especially on architectures that
really, really do not like unaligned access...

Case in point: alpha (and it's not unusual in that respect).  What it boils
down to is
	copy bytes until the destination is aligned
	if source and destination are both aligned
		copy word by word
	else
		read word by word, storing the mix of two adjacent words
	copy the rest byte by byte

The unpleasant case (to and from having different remainders modulo 8) is
basically

	if (count >= 8) {
		u64 *aligned = (u64 *)(from & ~7);
		u64 *dest = (u64 *)to;
		int bitshift = (from & 7) * 8;
		u64 prev, next;

		prev = aligned[0];
		do {   
			next = aligned[1];
			prev <<= bitshift;
			prev |= next >> (64 - bitshift);
			*dest++ = prev;
			aligned++;  
			prev = next;
			from += 8;
			to += 8;
			count -= 8;
		} while (count >= 8);
	}

Now, mix that with "... and do memchr_inv() on the copy to find if we'd
copied any non-zeroes, nevermind where" and it starts looking really
ridiculous.

We should just read the fscking source, aligned down to word boundary
and check each word being read.  The first and the last ones - masked.
All there is to it.  On almost all architectures that'll work well
enough; s390 might want something more elaborate (there even word-by-word
copies are costly, but I'd suggest talking to them for details).

Something like bool all_zeroes_user(const void __user *p, size_t count)
would probably be a sane API...

  reply index

Thread overview: 65+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-04 20:19 [PATCH v12 00/12] namei: openat2(2) path resolution restrictions Aleksa Sarai
2019-09-04 20:19 ` [PATCH v12 01/12] lib: introduce copy_struct_{to,from}_user helpers Aleksa Sarai
2019-09-04 20:48   ` [PATCH v12 01/12] lib: introduce copy_struct_{to, from}_user helpers Linus Torvalds
2019-09-04 21:00   ` [PATCH v12 01/12] lib: introduce copy_struct_{to,from}_user helpers Randy Dunlap
2019-09-05  7:32   ` Peter Zijlstra
2019-09-05  9:26     ` Aleksa Sarai
2019-09-05  9:43       ` Peter Zijlstra
2019-09-05 10:57         ` Peter Zijlstra
2019-09-11 10:37           ` Aleksa Sarai
2019-09-05 13:35         ` Aleksa Sarai
2019-09-05 17:01         ` Aleksa Sarai
2019-09-05  8:43   ` Rasmus Villemoes
2019-09-05  9:50     ` Aleksa Sarai
2019-09-05 10:45       ` Christian Brauner
2019-09-05  9:09   ` [PATCH v12 01/12] lib: introduce copy_struct_{to, from}_user helpers Andreas Schwab
2019-09-05 10:13     ` Gabriel Paubert
2019-09-05 11:05   ` [PATCH v12 01/12] lib: introduce copy_struct_{to,from}_user helpers Christian Brauner
2019-09-05 11:17     ` Rasmus Villemoes
2019-09-05 11:29       ` Christian Brauner
2019-09-05 13:40     ` Aleksa Sarai
2019-09-05 11:09   ` Christian Brauner
2019-09-05 11:27     ` Aleksa Sarai
2019-09-05 11:40       ` Christian Brauner
2019-09-05 18:07   ` Al Viro
2019-09-05 18:23     ` Christian Brauner
2019-09-05 18:28       ` Al Viro
2019-09-05 18:35         ` Christian Brauner
2019-09-05 19:56         ` Aleksa Sarai
2019-09-05 22:31           ` Al Viro [this message]
2019-09-06  7:00           ` Christian Brauner
2019-09-05 23:00     ` Aleksa Sarai
2019-09-05 23:49       ` Al Viro
2019-09-06  0:09         ` Aleksa Sarai
2019-09-06  0:14         ` Al Viro
2019-09-04 20:19 ` [PATCH v12 02/12] clone3: switch to copy_struct_from_user() Aleksa Sarai
2019-09-04 20:19 ` [PATCH v12 03/12] sched_setattr: switch to copy_struct_{to, from}_user() Aleksa Sarai
2019-09-04 20:19 ` [PATCH v12 04/12] perf_event_open: switch to copy_struct_from_user() Aleksa Sarai
2019-09-04 20:19 ` [PATCH v12 05/12] namei: obey trailing magic-link DAC permissions Aleksa Sarai
2019-09-17 21:30   ` Jann Horn
2019-09-18 13:51     ` Aleksa Sarai
2019-09-18 15:46       ` Aleksa Sarai
2019-09-04 20:19 ` [PATCH v12 06/12] procfs: switch magic-link modes to be more sane Aleksa Sarai
2019-09-04 20:19 ` [PATCH v12 07/12] open: O_EMPTYPATH: procfs-less file descriptor re-opening Aleksa Sarai
2019-09-04 20:19 ` [PATCH v12 08/12] namei: O_BENEATH-style path resolution flags Aleksa Sarai
2019-09-04 20:19 ` [PATCH v12 09/12] namei: LOOKUP_IN_ROOT: chroot-like path resolution Aleksa Sarai
2019-09-04 20:19 ` [PATCH v12 10/12] namei: aggressively check for nd->root escape on ".." resolution Aleksa Sarai
2019-09-04 21:09   ` Linus Torvalds
2019-09-04 21:35     ` Linus Torvalds
2019-09-04 21:36       ` Linus Torvalds
2019-09-04 21:48     ` Aleksa Sarai
2019-09-04 22:16       ` Linus Torvalds
2019-09-04 22:31       ` David Howells
2019-09-04 22:38         ` Linus Torvalds
2019-09-04 23:29           ` Al Viro
2019-09-04 23:44             ` Linus Torvalds
2019-09-04 20:19 ` [PATCH v12 11/12] open: openat2(2) syscall Aleksa Sarai
2019-09-04 21:00   ` Randy Dunlap
2019-09-07 12:40   ` Jeff Layton
2019-09-07 16:58     ` Linus Torvalds
2019-09-07 17:42       ` Andy Lutomirski
2019-09-07 17:45         ` Linus Torvalds
2019-09-07 18:15           ` Andy Lutomirski
2019-09-10  6:35           ` Ingo Molnar
2019-09-08 16:24     ` Aleksa Sarai
2019-09-04 20:19 ` [PATCH v12 12/12] selftests: add openat2(2) selftests Aleksa Sarai

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190905223148.GS1131@ZenIV.linux.org.uk \
    --to=viro@zeniv.linux.org.uk \
    --cc=akpm@linux-foundation.org \
    --cc=alexander.shishkin@linux.intel.com \
    --cc=arnd@arndb.de \
    --cc=asarai@suse.de \
    --cc=ast@kernel.org \
    --cc=bfields@fieldses.org \
    --cc=chanho.min@lge.com \
    --cc=christian.brauner@ubuntu.com \
    --cc=christian@brauner.io \
    --cc=containers@lists.linux-foundation.org \
    --cc=cyphar@cyphar.com \
    --cc=dhowells@redhat.com \
    --cc=drysdale@google.com \
    --cc=ebiederm@xmission.com \
    --cc=jannh@google.com \
    --cc=jlayton@kernel.org \
    --cc=jolsa@redhat.com \
    --cc=keescook@chromium.org \
    --cc=linux-alpha@vger.kernel.org \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-ia64@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=linux-m68k@lists.linux-m68k.org \
    --cc=linux-mips@vger.kernel.org \
    --cc=linux-parisc@vger.kernel.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=linux-sh@vger.kernel.org \
    --cc=linux-xtensa@linux-xtensa.org \
    --cc=linux@rasmusvillemoes.dk \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=luto@kernel.org \
    --cc=mingo@redhat.com \
    --cc=namhyung@kernel.org \
    --cc=oleg@redhat.com \
    --cc=peterz@infradead.org \
    --cc=shuah@kernel.org \
    --cc=skhan@linuxfoundation.org \
    --cc=sparclinux@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=tycho@tycho.ws \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

LinuxPPC-Dev Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linuxppc-dev/0 linuxppc-dev/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linuxppc-dev linuxppc-dev/ https://lore.kernel.org/linuxppc-dev \
		linuxppc-dev@lists.ozlabs.org linuxppc-dev@ozlabs.org linuxppc-dev@archiver.kernel.org
	public-inbox-index linuxppc-dev

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.ozlabs.lists.linuxppc-dev


AGPL code for this site: git clone https://public-inbox.org/ public-inbox