From mboxrd@z Thu Jan  1 00:00:00 1970
From: Al Viro
Date: Thu, 05 Sep 2019 22:31:48 +0000
Subject: Re: [PATCH v12 01/12] lib: introduce copy_struct_{to,from}_user helpers
Message-Id: <20190905223148.GS1131@ZenIV.linux.org.uk>
References: <20190904201933.10736-1-cyphar@cyphar.com>
 <20190904201933.10736-2-cyphar@cyphar.com>
 <20190905180750.GQ1131@ZenIV.linux.org.uk>
 <20190905182303.7f6bxpa2enbgcegv@wittgenstein>
 <20190905182801.GR1131@ZenIV.linux.org.uk>
 <20190905195618.pwzgvuzadkfpznfz@yavin.dot.cyphar.com>
In-Reply-To: <20190905195618.pwzgvuzadkfpznfz@yavin.dot.cyphar.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: Aleksa Sarai
Cc: Christian Brauner, Jeff Layton, "J. Bruce Fields", Arnd Bergmann,
 David Howells, Shuah Khan, Shuah Khan, Ingo Molnar, Peter Zijlstra,
 Christian Brauner, Rasmus Villemoes, Eric Biederman, Andy Lutomirski,
 Andrew Morton, Alexei Starovoitov, Kees Cook, Jann Horn, Tycho Andersen,
 David Drysdale, Chanho Min

On Fri, Sep 06, 2019 at 05:56:18AM +1000, Aleksa Sarai wrote:
> On 2019-09-05, Al Viro wrote:
> > On Thu, Sep 05, 2019 at 08:23:03PM +0200, Christian Brauner wrote:
> >
> > > Because every caller of that function right now has that limit set
> > > anyway iirc. So we can either remove it from here and place it back for
> > > the individual callers or leave it in the helper.
> > > Also, I'm really asking, why not? Is it unreasonable to have an upper
> > > bound on the size (for a long time probably) or are you disagreeing with
> > > PAGE_SIZE being used? PAGE_SIZE limit is currently used by sched, perf,
> > > bpf, and clone3 and in a few other places.
> >
> > For a primitive that can be safely used with any size (OK, any within
> > the usual 2Gb limit)? Why push the random policy into the place where
> > it doesn't belong?
> >
> > Seriously, what's the point? If they want to have a large chunk of
> > userland memory zeroed or checked for non-zeroes - why would that
> > be a problem?
>
> Thinking about it some more, there isn't really any r/w amplification --
> so there isn't much to gain by passing giant structs. Though, if we are
> going to permit 2GB buffers, isn't that also an argument to use
> memchr_inv()? :P

I'm not sure I understand the last bit. If you look at what
copy_from_user() does on misaligned source/destination, especially on
architectures that really, really do not like unaligned access...

Case in point: alpha (and it's not unusual in that respect). What it
boils down to is

	copy bytes until the destination is aligned
	if source and destination are both aligned
		copy word by word
	else
		read word by word, storing the mix of two adjacent words
	copy the rest byte by byte

The unpleasant case (to and from having different remainders modulo 8)
is basically

	if (count >= 8) {
		u64 *aligned = (u64 *)(from & ~7);
		u64 *dest = (u64 *)to;
		int bitshift = (from & 7) * 8;
		u64 prev, next;

		prev = aligned[0];
		do {
			next = aligned[1];
			prev <<= bitshift;
			prev |= next >> (64 - bitshift);
			*dest++ = prev;
			aligned++;
			prev = next;
			from += 8;
			to += 8;
			count -= 8;
		} while (count >= 8);
	}

Now, mix that with "... and do memchr_inv() on the copy to find if we'd
copied any non-zeroes, nevermind where" and it starts looking really
ridiculous. We should just read the fscking source, aligned down to
word boundary, and check each word being read. The first and the last
ones - masked. All there is to it.
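Something along these lines - a sketch only, completely untested. It
assumes a 64-bit little-endian box, uses plain get_user() where a real
implementation would batch the accesses with user_access_begin() +
unsafe_get_user(), folds faults into the "not all zeroes" answer
instead of reporting -EFAULT, and borrows the signature suggested
below:

	#include <linux/types.h>
	#include <linux/uaccess.h>

	static bool all_zeroes_user(const void __user *p, size_t count)
	{
		unsigned long addr = (unsigned long)p;
		unsigned long offset = addr & 7;  /* bytes below p in the first word */
		u64 word;

		if (!count)
			return true;

		addr &= ~7UL;		/* align down to word boundary */
		count += offset;	/* bytes are counted from addr now */

		if (get_user(word, (u64 __user *)addr))
			return false;
		word &= ~0ULL << (offset * 8);	/* mask bytes before p (LE) */

		while (count > 8) {
			if (word)
				return false;
			addr += 8;
			count -= 8;
			if (get_user(word, (u64 __user *)addr))
				return false;
		}
		if (count < 8)		/* mask bytes past the end (LE) */
			word &= ~0ULL >> ((8 - count) * 8);
		return word == 0;
	}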
On almost all architectures that'll work well enough; s390 might want
something more elaborate (there even word-by-word copies are costly,
but I'd suggest talking to them for details). Something like

	bool all_zeroes_user(const void __user *p, size_t count)

would probably be a sane API...
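FWIW, a copy_struct_from_user()-style caller would then, hypothetically,
boil down to something like this (usize being the userland-supplied
size, ksize the kernel's sizeof; the wrapper name and parameters here
are illustrative only):

	/* hypothetical caller, mirroring the helper from this series */
	static int hypothetical_get_struct(void *dst, size_t ksize,
					   const void __user *src, size_t usize)
	{
		/* trailing bits we don't understand must be zero */
		if (usize > ksize &&
		    !all_zeroes_user(src + ksize, usize - ksize))
			return -E2BIG;
		if (copy_from_user(dst, src, min(usize, ksize)))
			return -EFAULT;
		if (usize < ksize)	/* older, smaller userland struct */
			memset(dst + usize, 0, ksize - usize);
		return 0;
	}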