All of lore.kernel.org
 help / color / mirror / Atom feed
From: Alexey Dobriyan <adobriyan@gmail.com>
To: Borislav Petkov <bp@alien8.de>
Cc: linux-kernel@vger.kernel.org,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Mark Hemment <markhemm@googlemail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	the arch/x86 maintainers <x86@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	patrice.chotard@foss.st.com,
	Mikulas Patocka <mpatocka@redhat.com>,
	Lukas Czerner <lczerner@redhat.com>,
	Christoph Hellwig <hch@lst.de>,
	"Darrick J. Wong" <djwong@kernel.org>,
	Chuck Lever <chuck.lever@oracle.com>,
	Hugh Dickins <hughd@google.com>,
	patches@lists.linux.dev, Linux-MM <linux-mm@kvack.org>,
	mm-commits@vger.kernel.org, Mel Gorman <mgorman@suse.de>
Subject: Re: [PATCH -final] x86/clear_user: Make it faster
Date: Tue, 12 Jul 2022 15:32:27 +0300	[thread overview]
Message-ID: <Ys1p27uWqjWlcaa1@localhost.localdomain> (raw)
In-Reply-To: <Ysv8cAa7wcDmQlpm@zn.tnic>

On Mon, Jul 11, 2022 at 12:33:20PM +0200, Borislav Petkov wrote:
> On Wed, Jul 06, 2022 at 12:24:12PM +0300, Alexey Dobriyan wrote:
> > On Tue, Jul 05, 2022 at 07:01:06PM +0200, Borislav Petkov wrote:
> > 
> > > +	asm volatile(
> > > +		"1:\n\t"
> > > +		ALTERNATIVE_3("rep stosb",
> > > +			      "call clear_user_erms",	  ALT_NOT(X86_FEATURE_FSRM),
> > > +			      "call clear_user_rep_good", ALT_NOT(X86_FEATURE_ERMS),
> > > +			      "call clear_user_original", ALT_NOT(X86_FEATURE_REP_GOOD))
> > > +		"2:\n"
> > > +	       _ASM_EXTABLE_UA(1b, 2b)
> > > +	       : "+&c" (size), "+&D" (addr), ASM_CALL_CONSTRAINT
> > > +	       : "a" (0)
> > > +		/* rep_good clobbers %rdx */
> > > +	       : "rdx");
> > 
> > "+c" and "+D" should be enough for 1 instruction assembly?
> 
> I'm looking at
> 
>   e0a96129db57 ("x86: use early clobbers in usercopy*.c")
> 
> which introduced the early clobbers and I'm thinking we want them
> because "this operand is an earlyclobber operand, which is written
> before the instruction is finished using the input operands" and we have
> exception handling.
> 
> But maybe you need to be more verbose as to what you mean exactly...

This is the original code:

-#define __do_strncpy_from_user(dst,src,count,res)                         \
-do {                                                                      \
-       long __d0, __d1, __d2;                                             \
-       might_fault();                                                     \
-       __asm__ __volatile__(                                              \
-               "       testq %1,%1\n"                                     \
-               "       jz 2f\n"                                           \
-               "0:     lodsb\n"                                           \
-               "       stosb\n"                                           \
-               "       testb %%al,%%al\n"                                 \
-               "       jz 1f\n"                                           \
-               "       decq %1\n"                                         \
-               "       jnz 0b\n"                                          \
-               "1:     subq %1,%0\n"                                      \
-               "2:\n"                                                     \
-               ".section .fixup,\"ax\"\n"                                 \
-               "3:     movq %5,%0\n"                                      \
-               "       jmp 2b\n"                                          \
-               ".previous\n"                                              \
-               _ASM_EXTABLE(0b,3b)                                        \
-               : "=&r"(res), "=&c"(count), "=&a" (__d0), "=&S" (__d1),    \
-                 "=&D" (__d2)                                             \
-               : "i"(-EFAULT), "0"(count), "1"(count), "3"(src), "4"(dst) \
-               : "memory");                                               \
-} while (0)

I meant to say that earlyclobber is necessary only because the asm body
is more than 1 instruction so there is possibility of writing to some
outputs before all inputs are consumed.

If asm body is 1 insn there is no such possibility at all.
Now "rep stosb" is 1 instruction and two alterantive functions masquarade
as single instruction.

  reply	other threads:[~2022-07-12 12:32 UTC|newest]

Thread overview: 82+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-15  2:12 incoming Andrew Morton
2022-04-15  2:13 ` [patch 01/14] MAINTAINERS: Broadcom internal lists aren't maintainers Andrew Morton
2022-04-15  2:13   ` Andrew Morton
2022-04-15  2:13 ` [patch 02/14] tmpfs: fix regressions from wider use of ZERO_PAGE Andrew Morton
2022-04-15  2:13   ` Andrew Morton
2022-04-15 22:10   ` Linus Torvalds
2022-04-15 22:21     ` Matthew Wilcox
2022-04-15 22:41     ` Hugh Dickins
2022-04-16  6:36     ` Borislav Petkov
2022-04-16 14:07       ` Mark Hemment
2022-04-16 17:28         ` Borislav Petkov
2022-04-16 17:42           ` Linus Torvalds
2022-04-16 21:15             ` Borislav Petkov
2022-04-17 19:41               ` Borislav Petkov
2022-04-17 20:56                 ` Linus Torvalds
2022-04-18 10:15                   ` Borislav Petkov
2022-04-18 17:10                     ` Linus Torvalds
2022-04-19  9:17                       ` Borislav Petkov
2022-04-19 16:41                         ` Linus Torvalds
2022-04-19 17:48                           ` Borislav Petkov
2022-04-21 15:06                             ` Borislav Petkov
2022-04-21 16:50                               ` Linus Torvalds
2022-04-21 17:22                                 ` Linus Torvalds
2022-04-24 19:37                                   ` Borislav Petkov
2022-04-24 19:54                                     ` Linus Torvalds
2022-04-24 20:24                                       ` Linus Torvalds
2022-04-27  0:14                                       ` Borislav Petkov
2022-04-27  1:29                                         ` Linus Torvalds
2022-04-27 10:41                                           ` Borislav Petkov
2022-04-27 16:00                                             ` Linus Torvalds
2022-05-04 18:56                                               ` Borislav Petkov
2022-05-04 19:22                                                 ` Linus Torvalds
2022-05-04 20:18                                                   ` Borislav Petkov
2022-05-04 20:40                                                     ` Linus Torvalds
2022-05-04 21:01                                                       ` Borislav Petkov
2022-05-04 21:09                                                         ` Linus Torvalds
2022-05-10  9:31                                                           ` clear_user (was: [patch 02/14] tmpfs: fix regressions from wider use of ZERO_PAGE) Borislav Petkov
2022-05-10 17:17                                                             ` Linus Torvalds
2022-05-10 17:28                                                             ` Linus Torvalds
2022-05-10 18:10                                                               ` Borislav Petkov
2022-05-10 18:57                                                                 ` Borislav Petkov
2022-05-24 12:32                                                                   ` [PATCH] x86/clear_user: Make it faster Borislav Petkov
2022-05-24 16:51                                                                     ` Linus Torvalds
2022-05-24 17:30                                                                       ` Borislav Petkov
2022-05-25 12:11                                                                     ` Mark Hemment
2022-05-27 11:28                                                                       ` Borislav Petkov
2022-05-27 11:10                                                                     ` Ingo Molnar
2022-06-22 14:21                                                                     ` Borislav Petkov
2022-06-22 15:06                                                                       ` Linus Torvalds
2022-06-22 20:14                                                                         ` Borislav Petkov
2022-06-22 21:07                                                                           ` Linus Torvalds
2022-06-23  9:41                                                                             ` Borislav Petkov
2022-07-05 17:01                                                                               ` [PATCH -final] " Borislav Petkov
2022-07-06  9:24                                                                                 ` Alexey Dobriyan
2022-07-11 10:33                                                                                   ` Borislav Petkov
2022-07-12 12:32                                                                                     ` Alexey Dobriyan [this message]
2022-08-06 12:49                                                                                       ` Borislav Petkov
2022-08-18 10:44     ` [tip: x86/cpu] " tip-bot2 for Borislav Petkov
2022-04-15  2:13 ` [patch 03/14] mm/secretmem: fix panic when growing a memfd_secret Andrew Morton
2022-04-15  2:13   ` Andrew Morton
2022-04-15  2:13 ` [patch 04/14] irq_work: use kasan_record_aux_stack_noalloc() record callstack Andrew Morton
2022-04-15  2:13   ` Andrew Morton
2022-04-15  2:13 ` [patch 05/14] kasan: fix hw tags enablement when KUNIT tests are disabled Andrew Morton
2022-04-15  2:13   ` Andrew Morton
2022-04-15  2:13 ` [patch 06/14] mm, kfence: support kmem_dump_obj() for KFENCE objects Andrew Morton
2022-04-15  2:13   ` Andrew Morton
2022-04-15  2:13 ` [patch 07/14] mm, page_alloc: fix build_zonerefs_node() Andrew Morton
2022-04-15  2:13   ` Andrew Morton
2022-04-15  2:13 ` [patch 08/14] mm: fix unexpected zeroed page mapping with zram swap Andrew Morton
2022-04-15  2:13   ` Andrew Morton
2022-04-15  2:13 ` [patch 09/14] mm: compaction: fix compiler warning when CONFIG_COMPACTION=n Andrew Morton
2022-04-15  2:13   ` Andrew Morton
2022-04-15  2:13 ` [patch 10/14] hugetlb: do not demote poisoned hugetlb pages Andrew Morton
2022-04-15  2:13   ` Andrew Morton
2022-04-15  2:13 ` [patch 11/14] revert "fs/binfmt_elf: fix PT_LOAD p_align values for loaders" Andrew Morton
2022-04-15  2:13   ` Andrew Morton
2022-04-15  2:13 ` [patch 12/14] revert "fs/binfmt_elf: use PT_LOAD p_align values for static PIE" Andrew Morton
2022-04-15  2:13   ` Andrew Morton
2022-04-15  2:14 ` [patch 13/14] mm/vmalloc: fix spinning drain_vmap_work after reading from /proc/vmcore Andrew Morton
2022-04-15  2:14   ` Andrew Morton
2022-04-15  2:14 ` [patch 14/14] mm: kmemleak: take a full lowmem check in kmemleak_*_phys() Andrew Morton
2022-04-15  2:14   ` Andrew Morton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Ys1p27uWqjWlcaa1@localhost.localdomain \
    --to=adobriyan@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=bp@alien8.de \
    --cc=chuck.lever@oracle.com \
    --cc=djwong@kernel.org \
    --cc=hch@lst.de \
    --cc=hughd@google.com \
    --cc=lczerner@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=markhemm@googlemail.com \
    --cc=mgorman@suse.de \
    --cc=mm-commits@vger.kernel.org \
    --cc=mpatocka@redhat.com \
    --cc=patches@lists.linux.dev \
    --cc=patrice.chotard@foss.st.com \
    --cc=peterz@infradead.org \
    --cc=torvalds@linux-foundation.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.