From: Al Viro <viro@ZenIV.linux.org.uk>
To: Russell King - ARM Linux <linux@armlinux.org.uk>
Cc: linux-arch@vger.kernel.org, linux-kernel@vger.kernel.org,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Richard Henderson <rth@twiddle.net>,
	Will Deacon <will.deacon@arm.com>,
	Haavard Skinnemoen <hskinnemoen@gmail.com>,
	Vineet Gupta <vgupta@synopsys.com>,
	Steven Miao <realmz6@gmail.com>,
	Jesper Nilsson <jesper.nilsson@axis.com>,
	Mark Salter <msalter@redhat.com>,
	Yoshinori Sato <ysato@users.sourceforge.jp>,
	Richard Kuo <rkuo@codeaurora.org>,
	Tony Luck <tony.luck@intel.com>,
	Geert Uytterhoeven <geert@linux-m68k.org>,
	James Hogan <james.hogan@imgtec.com>,
	Michal Simek <monstr@monstr.eu>,
	David Howells <dhowells@redhat.com>,
	Ley Foon Tan <lftan@altera.com>,
	Jonas Bonn <jonas@southpole.se>,
	Helge Deller <deller@gmx.de>,
	Martin Schwidefsky <schwidefsky@de.ibm.com>,
	Ralf Baechle <ralf@linux-mips.org>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Chen Liqin <liqin.linux@gmail.com>,
	"David S. Miller" <davem@davemloft.net>,
	Chris Metcalf <cmetcalf@mellanox.com>,
	Richard Weinberger <richard@nod.at>,
	Guan Xuetao <gxt@mprc.pku.edu.cn>,
	Thomas Gleixner <tglx@linutronix.de>,
	Chris Zankel <chris@zankel.net>
Subject: Re: [RFC][CFT][PATCHSET v1] uaccess unification
Date: Thu, 30 Mar 2017 17:43:42 +0100
Message-ID: <20170330164342.GR29622@ZenIV.linux.org.uk>
In-Reply-To: <20170330162241.GG7909@n2100.armlinux.org.uk>

On Thu, Mar 30, 2017 at 05:22:41PM +0100, Russell King - ARM Linux wrote:
> On Wed, Mar 29, 2017 at 06:57:06AM +0100, Al Viro wrote:
> > Comments, review, testing, replacement patches, etc. are very welcome.
>
> I've given this a spin, and it appears to work (in that the box boots).
>
> Kernel size wise:
>
>    text    data      bss      dec     hex filename
> 8020229 3014220 10243276 21277725 144ac1d vmlinux.orig
> 8034741 3014388 10243276 21292405 144e575 vmlinux.uaccess
> 7976719 3014324 10243276 21234319 144028f vmlinux.noinline
>
> Performance using hdparm -T (cached reads) to evaluate against a SSD
> gives me the following results:
>
> * original:
>  Timing cached reads: 580 MB in 2.00 seconds = 289.64 MB/sec
>  Timing cached reads: 580 MB in 2.00 seconds = 290.06 MB/sec
>  Timing cached reads: 580 MB in 2.00 seconds = 289.65 MB/sec
>  Timing cached reads: 582 MB in 2.00 seconds = 290.82 MB/sec
>  Timing cached reads: 578 MB in 2.00 seconds = 289.07 MB/sec
>
> Average = 289.85MB/s
>
> * uaccess:
>  Timing cached reads: 578 MB in 2.00 seconds = 288.36 MB/sec
>  Timing cached reads: 534 MB in 2.00 seconds = 266.68 MB/sec
>  Timing cached reads: 534 MB in 2.00 seconds = 267.07 MB/sec
>  Timing cached reads: 552 MB in 2.00 seconds = 275.45 MB/sec
>  Timing cached reads: 532 MB in 2.00 seconds = 266.08 MB/sec
>
> Average = 272.73 MB/sec
>
> * noinline:
>  Timing cached reads: 548 MB in 2.00 seconds = 274.16 MB/sec
>  Timing cached reads: 574 MB in 2.00 seconds = 287.19 MB/sec
>  Timing cached reads: 574 MB in 2.00 seconds = 286.47 MB/sec
>  Timing cached reads: 572 MB in 2.00 seconds = 286.20 MB/sec
>  Timing cached reads: 578 MB in 2.00 seconds = 288.86 MB/sec
>
> Average = 284.58 MB/sec
>
> I've run the test twice, and there's definitely a reproducable drop in
> performance for some reason when switching between current and Al's
> uaccess patches, which is partly recovered by switching to the out of
> line versions.
>
> The only difference that I can identify that could explain this are
> the extra might_fault() checks in Al's version but which are missing
> from the ARM version.

How would the following affect things?

diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index e68604ae3ced..d24d338f0682 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -184,7 +184,7 @@ static size_t copy_page_to_iter_iovec(struct page *page, size_t offset, size_t b
 
 	kaddr = kmap(page);
 	from = kaddr + offset;
-	left = __copy_to_user(buf, from, copy);
+	left = __copy_to_user_inatomic(buf, from, copy);
 	copy -= left;
 	skip += copy;
 	from += copy;
@@ -193,7 +193,7 @@ static size_t copy_page_to_iter_iovec(struct page *page, size_t offset, size_t b
 		iov++;
 		buf = iov->iov_base;
 		copy = min(bytes, iov->iov_len);
-		left = __copy_to_user(buf, from, copy);
+		left = __copy_to_user_inatomic(buf, from, copy);
 		copy -= left;
 		skip = copy;
 		from += copy;
@@ -267,7 +267,7 @@ static size_t copy_page_from_iter_iovec(struct page *page, size_t offset, size_t
 
 	kaddr = kmap(page);
 	to = kaddr + offset;
-	left = __copy_from_user(to, buf, copy);
+	left = __copy_from_user_inatomic(to, buf, copy);
 	copy -= left;
 	skip += copy;
 	to += copy;
@@ -276,7 +276,7 @@ static size_t copy_page_from_iter_iovec(struct page *page, size_t offset, size_t
 		iov++;
 		buf = iov->iov_base;
 		copy = min(bytes, iov->iov_len);
-		left = __copy_from_user(to, buf, copy);
+		left = __copy_from_user_inatomic(to, buf, copy);
 		copy -= left;
 		skip = copy;
 		to += copy;
@@ -541,7 +541,7 @@ size_t copy_to_iter(const void *addr, size_t bytes, struct iov_iter *i)
 	if (unlikely(i->type & ITER_PIPE))
 		return copy_pipe_to_iter(addr, bytes, i);
 	iterate_and_advance(i, bytes, v,
-		__copy_to_user(v.iov_base, (from += v.iov_len) - v.iov_len,
+		__copy_to_user_inatomic(v.iov_base, (from += v.iov_len) - v.iov_len,
 			       v.iov_len),
 		memcpy_to_page(v.bv_page, v.bv_offset,
 			       (from += v.bv_len) - v.bv_len, v.bv_len),
@@ -560,7 +560,7 @@ size_t copy_from_iter(void *addr, size_t bytes, struct iov_iter *i)
 		return 0;
 	}
 	iterate_and_advance(i, bytes, v,
-		__copy_from_user((to += v.iov_len) - v.iov_len, v.iov_base,
+		__copy_from_user_inatomic((to += v.iov_len) - v.iov_len, v.iov_base,
 				 v.iov_len),
 		memcpy_from_page((to += v.bv_len) - v.bv_len, v.bv_page,
 				 v.bv_offset, v.bv_len),
@@ -582,7 +582,7 @@ bool copy_from_iter_full(void *addr, size_t bytes, struct iov_iter *i)
 		return false;
 
 	iterate_all_kinds(i, bytes, v, ({
-		if (__copy_from_user((to += v.iov_len) - v.iov_len,
+		if (__copy_from_user_inatomic((to += v.iov_len) - v.iov_len,
 				      v.iov_base, v.iov_len))
 			return false;
 		0;}),
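
For readers following the thread, here is a rough userspace model of what
the diff above is probing. It assumes the only difference between
__copy_to_user() and __copy_to_user_inatomic() is the might_fault() call
that the quoted report singles out. Every model_* name is hypothetical and
exists only for this sketch; none of it is kernel code, and the real
helpers may perform further checks that are not modelled here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Counts how often the modelled debug hook ran; illustration only. */
static unsigned long model_might_fault_calls;

/* Hypothetical stand-in for might_fault(): here it only bumps a counter. */
static void model_might_fault(void)
{
	model_might_fault_calls++;
}

/*
 * Hypothetical stand-in for the arch-level raw copy. Like the kernel
 * helpers, it returns the number of bytes NOT copied (0 means success).
 */
static unsigned long model_raw_copy(void *to, const void *from, unsigned long n)
{
	assert(to != NULL && from != NULL);
	memcpy(to, from, n);
	return 0;
}

/* Models __copy_to_user(): debug hook first, then the raw copy. */
static unsigned long model_copy_to_user(void *to, const void *from,
					unsigned long n)
{
	model_might_fault();
	return model_raw_copy(to, from, n);
}

/* Models __copy_to_user_inatomic(): the raw copy with no debug hook. */
static unsigned long model_copy_to_user_inatomic(void *to, const void *from,
						 unsigned long n)
{
	return model_raw_copy(to, from, n);
}
```

Calling model_copy_to_user() once and model_copy_to_user_inatomic() once
copies the same bytes both times but leaves model_might_fault_calls at 1.
That is the per-call overhead the diff removes from the iov_iter paths, so
that the hdparm numbers can be compared again.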