From: Borislav Petkov <bp@alien8.de>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Hemment <markhemm@googlemail.com>,
Andrew Morton <akpm@linux-foundation.org>,
the arch/x86 maintainers <x86@kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
patrice.chotard@foss.st.com,
Mikulas Patocka <mpatocka@redhat.com>,
Lukas Czerner <lczerner@redhat.com>,
Christoph Hellwig <hch@lst.de>,
"Darrick J. Wong" <djwong@kernel.org>,
Chuck Lever <chuck.lever@oracle.com>,
Hugh Dickins <hughd@google.com>,
patches@lists.linux.dev, Linux-MM <linux-mm@kvack.org>,
mm-commits@vger.kernel.org
Subject: Re: [patch 02/14] tmpfs: fix regressions from wider use of ZERO_PAGE
Date: Wed, 27 Apr 2022 02:14:36 +0200
Message-ID: <YmiK7Bos+zLAvL0t@zn.tnic>
In-Reply-To: <CAHk-=wgFnTbbeR0NAsGGsoBBThXt9Zh5_acN47r4CF0PdgSNeA@mail.gmail.com>
On Sun, Apr 24, 2022 at 12:54:57PM -0700, Linus Torvalds wrote:
> I suspect it's a %rax vs %rcx confusion again, but with your "patch on
> top of earlier patch" I didn't go and sort it out.
Finally had some quiet time to stare at this.
So when we enter the function, we shift %rcx to get the number of
qword-sized quantities to zero:

SYM_FUNC_START(clear_user_original)
	mov %rcx,%rax
	shr $3,%rcx		# qwords <---

and we zero in qword quantities merrily:

	# do the qwords first
	.p2align 4
0:	movq $0,(%rdi)
	lea 8(%rdi),%rdi
	dec %rcx
	jnz 0b

but when we fault here, we return *%rcx*, i.e., the number of *qwords*
left, and not %rcx << 3, which is the number of *bytes* left and what we
*actually* need to return when we encounter the #PF.

So we need to shift back when we fault during the qword-sized zeroing,
i.e., full function below, see label 3 there.
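
To make the arithmetic concrete, here is a minimal userspace sketch in C. It is a model of the qword loop only, not kernel code: model_qword_fault(), writable and shift_back are made-up names for illustration, and "writable" simply stands in for the offset at which the #PF would hit.

```c
#include <stdint.h>

/*
 * Userspace model of the qword loop (hypothetical helper, not kernel code).
 *
 * count      - byte count passed in %rcx
 * writable   - number of bytes that can be stored before the "fault"
 * shift_back - 0: return %rcx as-is on a fault, 1: return %rcx << 3
 *
 * Returns the value the caller would see in %rcx. The rest bytes
 * (count & 7) are deliberately not modeled here.
 */
static uint64_t model_qword_fault(uint64_t count, uint64_t writable,
				  int shift_back)
{
	uint64_t rcx = count >> 3;	/* shr $3,%rcx: qwords */
	uint64_t off = 0;		/* how far %rdi has advanced */

	while (rcx) {
		/* movq $0,(%rdi) faults if the qword is not fully writable */
		if (off + 8 > writable)
			return shift_back ? rcx << 3 : rcx;

		off += 8;		/* lea 8(%rdi),%rdi */
		rcx--;			/* dec %rcx */
	}

	return 0;			/* all qwords cleared */
}
```

With a 64-byte request and only 16 writable bytes, the model returns 6 without the shift (qwords left) and 48 with it (bytes left), so only the shifted value lets a caller compute 64 - 48 = 16 bytes of progress.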
With that, strace looks good too:
openat(AT_FDCWD, "/dev/zero", O_RDONLY) = 3
mmap(NULL, 196608, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7ffff7dc5000
munmap(0x7ffff7dd5000, 65536) = 0
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 65536) = 16
exit_group(16) = ?
+++ exited with 16 +++
As to the byte-exact deal, I'll put it on my TODO to play with it later
and see how much asm we can shed from this simplification so thanks for
the pointers!

/*
 * Default clear user-space.
 * Input:
 *   rdi   destination
 *   rcx   count
 *
 * Output:
 *   rcx   uncleared bytes or 0 if successful.
 */
SYM_FUNC_START(clear_user_original)
	mov %rcx,%rax
	shr $3,%rcx		# qwords
	and $7,%rax		# rest bytes
	test %rcx,%rcx
	jz 1f

	# do the qwords first
	.p2align 4
0:	movq $0,(%rdi)
	lea 8(%rdi),%rdi
	dec %rcx
	jnz 0b

1:	test %rax,%rax
	jz 3f

	# now do the rest bytes
2:	movb $0,(%rdi)
	inc %rdi
	decl %eax
	jnz 2b

3:
	# convert qwords back into bytes to return to caller
	shl $3, %rcx

4:
	xorl %eax,%eax
	RET

	_ASM_EXTABLE_UA(0b, 3b)

	/*
	 * The %rcx value gets fixed up with EX_TYPE_UCOPY_LEN8 (which basically ends
	 * up doing "%rcx = %rcx*8 + %rax" in ex_handler_ucopy_len() for the exception
	 * case). That is, we use %rax above at label 2: for simpler asm but the number
	 * of uncleared bytes will land in %rcx, as expected by the caller.
	 *
	 * %rax at label 4: still needs to be cleared in the exception case because this
	 * is called from inline asm and the compiler expects %rax to be zero when exiting
	 * the inline asm, in case it might reuse it somewhere.
	 */
	_ASM_EXTABLE_TYPE_REG(2b, 4b, EX_TYPE_UCOPY_LEN8, %rax)

Btw, I'm wondering if using descriptive label names would make this function even more
understandable:

/*
 * Default clear user-space.
 * Input:
 *   rdi   destination
 *   rcx   count
 *
 * Output:
 *   rcx   uncleared bytes or 0 if successful.
 */
SYM_FUNC_START(clear_user_original)
	mov %rcx,%rax
	shr $3,%rcx		# qwords
	and $7,%rax		# rest bytes
	test %rcx,%rcx
	jz .Lrest_bytes

	# do the qwords first
	.p2align 4
.Lqwords:
	movq $0,(%rdi)
	lea 8(%rdi),%rdi
	dec %rcx
	jnz .Lqwords

.Lrest_bytes:
	test %rax,%rax
	jz .Lexit

	# now do the rest bytes
.Lbytes:
	movb $0,(%rdi)
	inc %rdi
	decl %eax
	jnz .Lbytes

.Lqwords_exit:
	# convert qwords back into bytes to return to caller
	shl $3, %rcx

.Lexit:
	xorl %eax,%eax
	RET

	_ASM_EXTABLE_UA(.Lqwords, .Lqwords_exit)

	/*
	 * The %rcx value gets fixed up with EX_TYPE_UCOPY_LEN8 (which basically ends
	 * up doing "%rcx = %rcx*8 + %rax" in ex_handler_ucopy_len() for the exception
	 * case). That is, we use %rax above at .Lbytes for simpler asm but the number
	 * of uncleared bytes will land in %rcx, as expected by the caller.
	 *
	 * %rax at .Lexit still needs to be cleared in the exception case because this
	 * is called from inline asm and the compiler expects %rax to be zero when exiting
	 * the inline asm, in case it might reuse it somewhere.
	 */
	_ASM_EXTABLE_TYPE_REG(.Lbytes, .Lexit, EX_TYPE_UCOPY_LEN8, %rax)
SYM_FUNC_END(clear_user_original)
EXPORT_SYMBOL(clear_user_original)
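
To double-check both exit paths, here is a rough C model of the listing above. It is a sketch under assumptions, not kernel code: model_clear_user() and writable are made-up names, a fault is modeled as a store which crosses the "writable" offset, and the byte-loop fix-up is taken from the comment above ("%rcx = %rcx*8 + %rax").

```c
#include <stdint.h>

/*
 * Userspace model of clear_user_original as posted (hypothetical helper).
 *
 * count    - byte count passed in %rcx
 * writable - number of bytes that can be stored before the "fault"
 *
 * Returns the value left in %rcx for the caller.
 */
static uint64_t model_clear_user(uint64_t count, uint64_t writable)
{
	uint64_t rcx = count >> 3;	/* shr $3,%rcx: qwords */
	uint64_t rax = count & 7;	/* and $7,%rax: rest bytes */
	uint64_t off = 0;		/* how far %rdi has advanced */

	/* .Lqwords */
	for (; rcx; rcx--, off += 8) {
		if (off + 8 > writable)
			return rcx << 3;	/* .Lqwords_exit */
	}

	/* .Lbytes */
	for (; rax; rax--, off++) {
		if (off + 1 > writable)
			return rcx * 8 + rax;	/* UCOPY_LEN8 fix-up, %rcx is 0 here */
	}

	return rcx << 3;		/* success: %rcx is 0 */
}
```

For a 13-byte request with 10 writable bytes, the model returns 3, i.e. the fault in the byte loop is reported exactly. For the same request with only 4 writable bytes, it returns 8: if I'm reading the listing right, a fault in the qword loop reports only the remaining qwords and does not add the count & 7 rest bytes still sitting in %rax. That makes no difference for the strace case above because the requested size there is a multiple of 8.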
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette