LKML Archive on lore.kernel.org
* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()
@ 2019-10-06 22:20 Guenter Roeck
  2019-10-06 23:06 ` Linus Torvalds
                   ` (2 more replies)
  0 siblings, 3 replies; 71+ messages in thread
From: Guenter Roeck @ 2019-10-06 22:20 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, Alexander Viro, linux-fsdevel

On Sat, May 21, 2016 at 09:59:07PM -0700, Linus Torvalds wrote:
> We really should avoid the "__{get,put}_user()" functions entirely,
> because they can easily be mis-used and the original intent of being
> used for simple direct user accesses no longer holds in a post-SMAP/PAN
> world.
> 
> Manually optimizing away the user access range check makes no sense any
> more, when the range check is generally much cheaper than the "enable
> user accesses" code that the __{get,put}_user() functions still need.
> 
> So instead of __put_user(), use the unsafe_put_user() interface with
> user_access_{begin,end}() that really does generate better code these
> days, and which is generally a nicer interface.  Under some loads, the
> multiple user writes that filldir() does are actually quite noticeable.
> 
> This also makes the dirent name copy use unsafe_put_user() with a couple
> of macros.  We do not want to make function calls with SMAP/PAN
> disabled, and the code this generates is quite good when the
> architecture uses "asm goto" for unsafe_put_user() like x86 does.
> 
> Note that this doesn't bother with the legacy cases.  Nobody should use
> them anyway, so performance doesn't really matter there.
> 
> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Linus,

this patch causes all my sparc64 emulations to stall during boot. It causes
all alpha emulations to crash with [1a] and [1b] when booting from a virtual
disk, and one of the xtensa emulations to crash with [2].

Reverting this patch fixes the problem.

Guenter

---
[1a]

Unable to handle kernel paging request at virtual address 0000000000000004
rcS(47): Oops -1
pc = [<0000000000000004>]  ra = [<fffffc00004512e4>]  ps = 0000    Not tainted
pc is at 0x4
ra is at filldir64+0x64/0x320
v0 = 0000000000000000  t0 = 0000000000000000  t1 = 0000000120117e8b
t2 = 646e617275303253  t3 = 646e617275300000  t4 = 0000000000007fe8
t5 = 0000000120117e78  t6 = 0000000000000000  t7 = fffffc0007ec8000
s0 = fffffc0007dbca56  s1 = 000000000000000a  s2 = 0000000000000020
s3 = fffffc0007ecbec8  s4 = 0000000000000008  s5 = 0000000000000021
s6 = 1cd2631fe897bf5a
a0 = fffffc0007dbca56  a1 = 2f2f2f2f2f2f2f2f  a2 = 000000000000000a
a3 = 1cd2631fe897bf5a  a4 = 0000000000000021  a5 = 0000000000000008
t8 = 0000000000000020  t9 = 0000000000000000  t10= fffffc0007dbca60
t11= 0000000000000001  pv = fffffc0000b9a810  at = 0000000000000001
gp = fffffc0000f03930  sp = (____ptrval____)
Disabling lock debugging due to kernel taint
Trace:
[<fffffc00004e7a08>] call_filldir+0xe8/0x1b0
[<fffffc00004e8684>] ext4_readdir+0x924/0xa70
[<fffffc0000ba3088>] _raw_spin_unlock+0x18/0x30
[<fffffc00003f751c>] __handle_mm_fault+0x9fc/0xc30
[<fffffc0000450c68>] iterate_dir+0x198/0x240
[<fffffc0000450b2c>] iterate_dir+0x5c/0x240
[<fffffc00004518b8>] ksys_getdents64+0xa8/0x160
[<fffffc0000451990>] sys_getdents64+0x20/0x40
[<fffffc0000451280>] filldir64+0x0/0x320
[<fffffc0000311634>] entSys+0xa4/0xc0

---
[1b]

Unable to handle kernel paging request at virtual address 0000000000000004
reboot(50): Oops -1
pc = [<0000000000000004>]  ra = [<fffffc00004512e4>]  ps = 0000    Tainted: G      D
pc is at 0x4
ra is at filldir64+0x64/0x320
v0 = 0000000000000000  t0 = 0000000067736d6b  t1 = 000000012011445b
t2 = 0000000000000000  t3 = 0000000000000000  t4 = 0000000000007ef8
t5 = 0000000120114448  t6 = 0000000000000000  t7 = fffffc0007eec000
s0 = fffffc000792b5c3  s1 = 0000000000000004  s2 = 0000000000000018
s3 = fffffc0007eefec8  s4 = 0000000000000008  s5 = 00000000f00000a3
s6 = 000000000000000b
a0 = fffffc000792b5c3  a1 = 2f2f2f2f2f2f2f2f  a2 = 0000000000000004
a3 = 000000000000000b  a4 = 00000000f00000a3  a5 = 0000000000000008
t8 = 0000000000000018  t9 = 0000000000000000  t10= 0000000022e1d02a
t11= 000000011f8fd3b8  pv = fffffc0000b9a810  at = 0000000022e1ccf8
gp = fffffc0000f03930  sp = (____ptrval____)
Trace:
[<fffffc00004ccba0>] proc_readdir_de+0x170/0x300
[<fffffc0000451280>] filldir64+0x0/0x320
[<fffffc00004c565c>] proc_root_readdir+0x3c/0x80
[<fffffc0000450c68>] iterate_dir+0x198/0x240
[<fffffc00004518b8>] ksys_getdents64+0xa8/0x160
[<fffffc0000451990>] sys_getdents64+0x20/0x40
[<fffffc0000451280>] filldir64+0x0/0x320
[<fffffc0000311634>] entSys+0xa4/0xc0

---
[2]

Unable to handle kernel paging request at virtual address 0000000000000004
reboot(50): Oops -1
pc = [<0000000000000004>]  ra = [<fffffc00004512e4>]  ps = 0000    Tainted: G      D
pc is at 0x4
ra is at filldir64+0x64/0x320
v0 = 0000000000000000  t0 = 0000000067736d6b  t1 = 000000012011445b
t2 = 0000000000000000  t3 = 0000000000000000  t4 = 0000000000007ef8
t5 = 0000000120114448  t6 = 0000000000000000  t7 = fffffc0007eec000
s0 = fffffc000792b5c3  s1 = 0000000000000004  s2 = 0000000000000018
s3 = fffffc0007eefec8  s4 = 0000000000000008  s5 = 00000000f00000a3
s6 = 000000000000000b
a0 = fffffc000792b5c3  a1 = 2f2f2f2f2f2f2f2f  a2 = 0000000000000004
a3 = 000000000000000b  a4 = 00000000f00000a3  a5 = 0000000000000008
t8 = 0000000000000018  t9 = 0000000000000000  t10= 0000000022e1d02a
t11= 000000011fd6f3b8  pv = fffffc0000b9a810  at = 0000000022e1ccf8
gp = fffffc0000f03930  sp = (____ptrval____)
Trace:
[<fffffc00004ccba0>] proc_readdir_de+0x170/0x300
[<fffffc0000451280>] filldir64+0x0/0x320
[<fffffc00004c565c>] proc_root_readdir+0x3c/0x80
[<fffffc0000450c68>] iterate_dir+0x198/0x240
[<fffffc00004518b8>] ksys_getdents64+0xa8/0x160
[<fffffc0000451990>] sys_getdents64+0x20/0x40
[<fffffc0000451280>] filldir64+0x0/0x320
[<fffffc0000311634>] entSys+0xa4/0xc0

Code:
 00000000
 00063301
 000007a3
 00001111
 00003f64

Segmentation fault


^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()
  2019-10-06 22:20 [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user() Guenter Roeck
@ 2019-10-06 23:06 ` Linus Torvalds
  2019-10-06 23:35   ` Linus Torvalds
  2019-10-07  0:23   ` Guenter Roeck
  2019-10-07  4:04 ` Max Filippov
  2019-10-07 19:21 ` Linus Torvalds
  2 siblings, 2 replies; 71+ messages in thread
From: Linus Torvalds @ 2019-10-06 23:06 UTC (permalink / raw)
  To: Guenter Roeck; +Cc: Linux Kernel Mailing List, Alexander Viro, linux-fsdevel

On Sun, Oct 6, 2019 at 3:20 PM Guenter Roeck <linux@roeck-us.net> wrote:
>
> this patch causes all my sparc64 emulations to stall during boot. It causes
> all alpha emulations to crash with [1a] and [1b] when booting from a virtual
> disk, and one of the xtensa emulations to crash with [2].

Ho humm. I've run variations of that patch over a few years on x86,
but obviously not on alpha/sparc.

At least I should still be able to read alpha assembly, even after all
these years. Would you mind sending me the result of

    make fs/readdir.s

on alpha with the broken config? I'd hope that the sparc issue is the same.

Actually, could you also do

    make fs/readdir.o

and then send me the "objdump --disassemble" of that? That way I get
the instruction offsets without having to count by hand.

> Unable to handle kernel paging request at virtual address 0000000000000004
> rcS(47): Oops -1
> pc = [<0000000000000004>]  ra = [<fffffc00004512e4>]  ps = 0000    Not tainted
> pc is at 0x4

That is _funky_. I'm not seeing how it could possibly jump to 0x4, but
it clearly does.

That said, are you sure it's _that_ commit? Because this pattern:

> a0 = fffffc0007dbca56  a1 = 2f2f2f2f2f2f2f2f  a2 = 000000000000000a

implicates the memchr('/') call in the next one. That's a word full of
'/' characters.

Of course, it could just be left-over register contents from that
memchr(), but it makes me wonder. Particularly since it seems to
happen early in filldir64():

> ra is at filldir64+0x64/0x320

which is just a fairly small handful of instructions in, and I
wouldn't be shocked if that's the return address for the call to
memchr.

              Linus


* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()
  2019-10-06 23:06 ` Linus Torvalds
@ 2019-10-06 23:35   ` Linus Torvalds
  2019-10-07  0:04     ` Guenter Roeck
  2019-10-07  0:23   ` Guenter Roeck
  1 sibling, 1 reply; 71+ messages in thread
From: Linus Torvalds @ 2019-10-06 23:35 UTC (permalink / raw)
  To: Guenter Roeck; +Cc: Linux Kernel Mailing List, Alexander Viro, linux-fsdevel

[-- Attachment #1: Type: text/plain, Size: 2192 bytes --]

On Sun, Oct 6, 2019 at 4:06 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> Ho humm. I've run variations of that patch over a few years on x86,
> but obviously not on alpha/sparc.

Oooh.

I wonder... This may be the name string copy loop. And it's special in
that the result may not be aligned.

Now, a "__put_user()" with an unaligned address _should_ work - it's
very easy to trigger that from user space by just giving an unaligned
address to any system call that then writes a single word.

But alpha does

  #define __put_user_32(x, addr)                                  \
  __asm__ __volatile__("1: stl %r2,%1\n"                          \
          "2:\n"                                                  \
          EXC(1b,2b,$31,%0)                                       \
                  : "=r"(__pu_err)                                \
                  : "m"(__m(addr)), "rJ"(x), "0"(__pu_err))

iow it implements that 32-bit __put_user() as a 'stl'.

Which will trap if it's not aligned.

And I wonder how much testing that has ever gotten. Nobody really does
unaligned accesses on alpha.

We need to do that memcpy unrolling on x86, because x86 actually uses
"user_access_begin()" and we have magic rules about what is inside
that region.

But on alpha (and sparc) it might be better to just do "__copy_to_user()".

Anyway, this does look like a possible latent bug where the alpha
unaligned trap doesn't then handle the case of exceptions. I know it
_tries_, but I doubt it's gotten a whole lot of testing.

Anyway, let me think about this, but just for testing, does the
attached patch make any difference? It's not the right thing in
general (and most definitely not on x86), but for testing whether this
is about unaligned accesses it might work.

It's entirely untested, and in fact on x86 it should cause objtool to
complain about a function call with AC set. But I think that on alpha
and sparc, using __copy_to_user() for the name copy should work, and
would work around the unaligned issue.

That said, if it *is* the unaligned issue, then that just means that
we have a serious bug elsewhere in the alpha port. Maybe nobody cares.

              Linus

[-- Attachment #2: patch.diff --]
[-- Type: text/x-patch, Size: 658 bytes --]

 fs/readdir.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/fs/readdir.c b/fs/readdir.c
index 19bea591c3f1..d49c4e2c66a8 100644
--- a/fs/readdir.c
+++ b/fs/readdir.c
@@ -76,6 +76,15 @@
 	unsafe_put_user(0, dst, label);				\
 } while (0)
 
+/* Alpha (and sparc?) test patch! */
+#undef unsafe_copy_dirent_name
+#define unsafe_copy_dirent_name(_dst, _src, _len, label) do {	\
+	char __user *dst = (_dst);				\
+	const char *src = (_src);				\
+	size_t len = (_len);					\
+	if (__copy_to_user(dst, src, len)) goto label;		\
+	unsafe_put_user(0, dst+len, label);			\
+} while (0)
 
 int iterate_dir(struct file *file, struct dir_context *ctx)
 {


* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()
  2019-10-06 23:35   ` Linus Torvalds
@ 2019-10-07  0:04     ` Guenter Roeck
  2019-10-07  1:17       ` Linus Torvalds
  0 siblings, 1 reply; 71+ messages in thread
From: Guenter Roeck @ 2019-10-07  0:04 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Linux Kernel Mailing List, Alexander Viro, linux-fsdevel

On 10/6/19 4:35 PM, Linus Torvalds wrote:
[ ... ]

> Anyway, let me think about this, but just for testing, does the
> attached patch make any difference? It's not the right thing in
> general (and most definitely not on x86), but for testing whether this
> is about unaligned accesses it might work.
> 

All my alpha, sparc64, and xtensa tests pass with the attached patch
applied on top of v5.4-rc2. I didn't test any others.

I'll (try to) send you some disassembly next.

Guenter


* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()
  2019-10-06 23:06 ` Linus Torvalds
  2019-10-06 23:35   ` Linus Torvalds
@ 2019-10-07  0:23   ` Guenter Roeck
  1 sibling, 0 replies; 71+ messages in thread
From: Guenter Roeck @ 2019-10-07  0:23 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Linux Kernel Mailing List, Alexander Viro, linux-fsdevel

[-- Attachment #1: Type: text/plain, Size: 1532 bytes --]

On Sun, Oct 06, 2019 at 04:06:16PM -0700, Linus Torvalds wrote:
> On Sun, Oct 6, 2019 at 3:20 PM Guenter Roeck <linux@roeck-us.net> wrote:
> >
> > this patch causes all my sparc64 emulations to stall during boot. It causes
> > all alpha emulations to crash with [1a] and [1b] when booting from a virtual
> > disk, and one of the xtensa emulations to crash with [2].
> 
> Ho humm. I've run variations of that patch over a few years on x86,
> but obviously not on alpha/sparc.
> 
> At least I should still be able to read alpha assembly, even after all
> these years. Would you mind sending me the result of
> 
>     make fs/readdir.s
> 
> on alpha with the broken config? I'd hope that the sparc issue is the same.
> 
> Actually, could you also do
> 
>     make fs/readdir.o
> 
> and then send me the "objdump --disassemble" of that? That way I get
> the instruction offsets without having to count by hand.
> 

Both attached for alpha.

> > Unable to handle kernel paging request at virtual address 0000000000000004
> > rcS(47): Oops -1
> > pc = [<0000000000000004>]  ra = [<fffffc00004512e4>]  ps = 0000    Not tainted
> > pc is at 0x4
> 
> That is _funky_. I'm not seeing how it could possibly jump to 0x4, but
> it clearly does.
> 
> That said, are you sure it's _that_ commit? Because this pattern:
> 
Bisect on sparc pointed to this commit, and re-running the tests with
the commit reverted passed for all architectures. I didn't check any
further.

Please let me know if you need anything else at this point.

Thanks,
Guenter

[-- Attachment #2: readdir.s --]
[-- Type: text/plain, Size: 65439 bytes --]

	.set noreorder
	.set volatile
	.set noat
	.set nomacro
	.arch ev5
 # GNU C89 (GCC) version 9.2.0 (alpha-linux)
 #	compiled by GNU C version 6.5.0 20181026, GMP version 6.1.0, MPFR version 3.1.4, MPC version 1.0.3, isl version none
 # warning: GMP header version 6.1.0 differs from library version 6.1.2.
 # warning: MPC header version 1.0.3 differs from library version 1.1.0.
 # GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
 # options passed:  -nostdinc -I ./arch/alpha/include
 # -I ./arch/alpha/include/generated -I ./include
 # -I ./arch/alpha/include/uapi -I ./arch/alpha/include/generated/uapi
 # -I ./include/uapi -I ./include/generated/uapi
 # -iprefix /opt/kernel/gcc-9.2.0-nolibc/alpha-linux/bin/../lib/gcc/alpha-linux/9.2.0/
 # -D __KERNEL__ -D KBUILD_BASENAME="readdir" -D KBUILD_MODNAME="readdir"
 # -isystem /opt/kernel/gcc-9.2.0-nolibc/alpha-linux/bin/../lib/gcc/alpha-linux/9.2.0/include
 # -include ./include/linux/kconfig.h
 # -include ./include/linux/compiler_types.h -MD fs/.readdir.s.d
 # fs/readdir.c -mno-fp-regs -mcpu=ev5 -auxbase-strip fs/readdir.s -O2
 # -Wall -Wundef -Werror=strict-prototypes -Wno-trigraphs
 # -Werror=implicit-function-declaration -Werror=implicit-int
 # -Wno-format-security -Wno-frame-address -Wformat-truncation=0
 # -Wformat-overflow=0 -Wno-address-of-packed-member
 # -Wframe-larger-than=2048 -Wno-unused-but-set-variable
 # -Wimplicit-fallthrough=3 -Wunused-const-variable=0
 # -Wdeclaration-after-statement -Wvla -Wno-pointer-sign
 # -Wno-stringop-truncation -Werror=date-time
 # -Werror=incompatible-pointer-types -Werror=designated-init
 # -Wno-packed-not-aligned -std=gnu90 -fno-strict-aliasing -fno-common
 # -fshort-wchar -fno-PIE -ffixed-8 -fno-jump-tables
 # -fno-delete-null-pointer-checks -fno-stack-protector
 # -fomit-frame-pointer -fno-strict-overflow -fno-merge-all-constants
 # -fmerge-constants -fstack-check=no -fconserve-stack
 # -fmacro-prefix-map=./= -fverbose-asm --param allow-store-data-races=0
 # options enabled:  -faggressive-loop-optimizations -falign-functions
 # -falign-jumps -falign-labels -falign-loops -fassume-phsa -fauto-inc-dec
 # -fbranch-count-reg -fcaller-saves -fcode-hoisting
 # -fcombine-stack-adjustments -fcompare-elim -fcprop-registers
 # -fcrossjumping -fcse-follow-jumps -fdefer-pop -fdevirtualize
 # -fdevirtualize-speculatively -fdwarf2-cfi-asm -fearly-inlining
 # -feliminate-unused-debug-types -fexpensive-optimizations
 # -fforward-propagate -ffp-int-builtin-inexact -ffunction-cse -fgcse
 # -fgcse-lm -fgnu-runtime -fgnu-unique -fguess-branch-probability
 # -fhoist-adjacent-loads -fident -fif-conversion -fif-conversion2
 # -findirect-inlining -finline -finline-atomics
 # -finline-functions-called-once -finline-small-functions -fipa-bit-cp
 # -fipa-cp -fipa-icf -fipa-icf-functions -fipa-icf-variables -fipa-profile
 # -fipa-pure-const -fipa-ra -fipa-reference -fipa-reference-addressable
 # -fipa-sra -fipa-stack-alignment -fipa-vrp -fira-hoist-pressure
 # -fira-share-save-slots -fira-share-spill-slots
 # -fisolate-erroneous-paths-dereference -fivopts -fkeep-static-consts
 # -fleading-underscore -flifetime-dse -flra-remat -flto-odr-type-merging
 # -fmath-errno -fmerge-constants -fmerge-debug-strings
 # -fmove-loop-invariants -fomit-frame-pointer -foptimize-sibling-calls
 # -foptimize-strlen -fpartial-inlining -fpcc-struct-return -fpeephole
 # -fpeephole2 -fplt -fprefetch-loop-arrays -free -freorder-blocks
 # -freorder-functions -frerun-cse-after-loop
 # -fsched-critical-path-heuristic -fsched-dep-count-heuristic
 # -fsched-group-heuristic -fsched-interblock -fsched-last-insn-heuristic
 # -fsched-rank-heuristic -fsched-spec -fsched-spec-insn-heuristic
 # -fsched-stalled-insns-dep -fschedule-fusion -fschedule-insns
 # -fschedule-insns2 -fsemantic-interposition -fshow-column -fshrink-wrap
 # -fshrink-wrap-separate -fsigned-zeros -fsplit-ivs-in-unroller
 # -fsplit-wide-types -fssa-backprop -fssa-phiopt -fstdarg-opt
 # -fstore-merging -fstrict-volatile-bitfields -fsync-libcalls
 # -fthread-jumps -ftoplevel-reorder -ftrapping-math -ftree-bit-ccp
 # -ftree-builtin-call-dce -ftree-ccp -ftree-ch -ftree-coalesce-vars
 # -ftree-copy-prop -ftree-cselim -ftree-dce -ftree-dominator-opts
 # -ftree-dse -ftree-forwprop -ftree-fre -ftree-loop-if-convert
 # -ftree-loop-im -ftree-loop-ivcanon -ftree-loop-optimize
 # -ftree-parallelize-loops= -ftree-phiprop -ftree-pre -ftree-pta
 # -ftree-reassoc -ftree-scev-cprop -ftree-sink -ftree-slsr -ftree-sra
 # -ftree-switch-conversion -ftree-tail-merge -ftree-ter -ftree-vrp
 # -funit-at-a-time -funwind-tables -fverbose-asm -fwrapv -fwrapv-pointer
 # -fzero-initialized-in-bss -mexplicit-relocs -mfloat-ieee -mglibc
 # -mlarge-data -mlarge-text -mlong-double-64 -msoft-float

	.text
	.align 2
	.align 4
	.globl iterate_dir
	.ent iterate_dir
iterate_dir:
	.frame $30,64,$26,0
	.mask 0x400fe00,-64
$LFB3537:
	.cfi_startproc
	ldah $29,0($27)		!gpdisp!1	 #,,
	lda $29,0($29)		!gpdisp!1	 #,,
$iterate_dir..ng:
	lda $30,-64($30)	 #,,
	.cfi_def_cfa_offset 64
	bis $31,$31,$31
	stq $9,8($30)	 #,
	.cfi_offset 9, -56
	mov $16,$9	 # tmp144, file
	stq $11,24($30)	 #,
	.cfi_offset 11, -40
	mov $17,$11	 # tmp145, ctx
	stq $26,0($30)	 #,
	stq $10,16($30)	 #,
	stq $12,32($30)	 #,
	stq $13,40($30)	 #,
	stq $14,48($30)	 #,
	stq $15,56($30)	 #,
	.cfi_offset 26, -64
	.cfi_offset 10, -48
	.cfi_offset 12, -32
	.cfi_offset 13, -24
	.cfi_offset 14, -16
	.cfi_offset 15, -8
	.prologue 1
 # fs/readdir.c:85: 	if (file->f_op->iterate_shared)
	ldq $1,40($16)	 # file_23(D)->f_op, _1
 # ./include/linux/fs.h:1318: 	return f->f_inode;
	ldq $12,32($16)	 # MEM[(const struct file *)file_23(D)].f_inode, _26
 # fs/readdir.c:85: 	if (file->f_op->iterate_shared)
	ldq $2,64($1)	 # _1->iterate_shared, _1->iterate_shared
	beq $2,$L20	 #, _1->iterate_shared,
 # fs/readdir.c:95: 		res = down_read_killable(&inode->i_rwsem);
	lda $13,160($12)	 # pretmp_38,, _26
	ldq $27,down_read_killable($29)		!literal!14	 #,,,
	mov $13,$16	 # pretmp_38,
 # fs/readdir.c:86: 		shared = true;
	lda $14,1($31)	 # shared,
 # fs/readdir.c:95: 		res = down_read_killable(&inode->i_rwsem);
	jsr $26,($27),down_read_killable		!lituse_jsr!14	 #,,
	ldah $29,0($26)		!gpdisp!15	 #
	lda $29,0($29)		!gpdisp!15	 #,,
	mov $0,$10	 # tmp146, <retval>
$L5:
 # fs/readdir.c:98: 	if (res)
	ldq_u $31,0($30)
	bne $10,$L3	 #, <retval>,
 # fs/readdir.c:102: 	if (!IS_DEADDIR(inode)) {
	ldl $1,12($12)	 #, _26->i_flags
 # fs/readdir.c:101: 	res = -ENOENT;
	lda $10,-2($31)	 # <retval>,
 # fs/readdir.c:102: 	if (!IS_DEADDIR(inode)) {
	and $1,16,$1	 # _26->i_flags,, tmp112
	bne $1,$L6	 #, tmp112,
 # fs/readdir.c:103: 		ctx->pos = file->f_pos;
	ldq $1,152($9)	 # file_23(D)->f_pos, _8
 # fs/readdir.c:105: 			res = file->f_op->iterate_shared(file, ctx);
	mov $11,$17	 # ctx,
	mov $9,$16	 # file,
 # fs/readdir.c:103: 		ctx->pos = file->f_pos;
	stq $1,8($11)	 # ctx_31(D)->pos, _8
 # fs/readdir.c:105: 			res = file->f_op->iterate_shared(file, ctx);
	ldq $1,40($9)	 # file_23(D)->f_op, file_23(D)->f_op
 # fs/readdir.c:104: 		if (shared)
	bne $14,$L21	 #, shared,
 # fs/readdir.c:107: 			res = file->f_op->iterate(file, ctx);
	ldq $27,56($1)	 # _11->iterate, _11->iterate
	jsr $26,($27),0	 # _11->iterate
	ldah $29,0($26)		!gpdisp!16
	lda $29,0($29)		!gpdisp!16
	mov $0,$10	 # tmp149, <retval>
	bis $31,$31,$31
$L8:
 # fs/readdir.c:108: 		file->f_pos = ctx->pos;
	ldq $1,8($11)	 # ctx_31(D)->pos, _13
 # ./include/linux/fs.h:1318: 	return f->f_inode;
	ldq $12,32($9)	 # MEM[(const struct file *)file_23(D)].f_inode, _47
 # ./include/linux/fsnotify.h:239: 	if (!(file->f_mode & FMODE_NONOTIFY))
	ldl $2,92($9)	 #, file_23(D)->f_mode
 # ./include/linux/fsnotify.h:237: 		mask |= FS_ISDIR;
	ldah $18,16384($31)	 # tmp100,
 # fs/readdir.c:108: 		file->f_pos = ctx->pos;
	stq $1,152($9)	 # file_23(D)->f_pos, _13
 # ./include/linux/fsnotify.h:237: 		mask |= FS_ISDIR;
	lda $18,1($18)	 # tmp143,, tmp100
 # ./include/linux/fsnotify.h:236: 	if (S_ISDIR(inode->i_mode))
	ldl $1,0($12)	 #,* _47
 # ./include/linux/fsnotify.h:237: 		mask |= FS_ISDIR;
	lda $11,1($31)	 # mask,
 # ./include/linux/fsnotify.h:239: 	if (!(file->f_mode & FMODE_NONOTIFY))
	srl $2,26,$2	 # file_23(D)->f_mode,, tmp131
 # ./include/linux/fsnotify.h:232: 	const struct path *path = &file->f_path;
	lda $15,16($9)	 # path,, file
 # ./include/linux/fsnotify.h:236: 	if (S_ISDIR(inode->i_mode))
	extwl $1,0,$3	 #, tmp122,, tmp121
	lda $1,-4096($31)	 # tmp124,
	and $1,$3,$1	 # tmp124, tmp121, tmp125
	lda $1,-16384($1)	 # tmp126,, tmp125
 # ./include/linux/fsnotify.h:237: 		mask |= FS_ISDIR;
	cmoveq $1,$18,$11	 #, tmp126, tmp143, mask
 # ./include/linux/fsnotify.h:239: 	if (!(file->f_mode & FMODE_NONOTIFY))
	blbc $2,$L22	 # tmp131,
$L11:
 # ./include/linux/fs.h:2201: 	if (!(file->f_flags & O_NOATIME))
	ldl $1,88($9)	 #, file_23(D)->f_flags
 # ./include/linux/fs.h:2201: 	if (!(file->f_flags & O_NOATIME))
	srl $1,20,$1	 # file_23(D)->f_flags,, tmp139
	ldq_u $31,0($30)
	blbs $1,$L6	 # tmp139,
 # ./include/linux/fs.h:2202: 		touch_atime(&file->f_path);
	ldq $27,touch_atime($29)		!literal!6	 #,,,
	mov $15,$16	 # path,
	jsr $26,($27),touch_atime		!lituse_jsr!6	 #,,
	ldah $29,0($26)		!gpdisp!7	 #
	lda $29,0($29)		!gpdisp!7	 #,,
	.align 4
$L6:
 # ./include/linux/fs.h:806: 	up_read(&inode->i_rwsem);
	mov $13,$16	 # pretmp_38,
 # fs/readdir.c:112: 	if (shared)
	beq $14,$L13	 #, shared,
 # ./include/linux/fs.h:806: 	up_read(&inode->i_rwsem);
	ldq $27,up_read($29)		!literal!4	 #,,,
	jsr $26,($27),up_read		!lituse_jsr!4	 #,,
	ldah $29,0($26)		!gpdisp!5	 #
	lda $29,0($29)		!gpdisp!5	 #,,
$L3:
 # fs/readdir.c:118: }
	mov $10,$0	 # <retval>,
	ldq $26,0($30)	 #,
	ldq $9,8($30)	 #,
	ldq $10,16($30)	 #,
	ldq $11,24($30)	 #,
	ldq $12,32($30)	 #,
	ldq $13,40($30)	 #,
	ldq $14,48($30)	 #,
	ldq $15,56($30)	 #,
	bis $31,$31,$31
	lda $30,64($30)	 #,,
	.cfi_remember_state
	.cfi_restore 15
	.cfi_restore 14
	.cfi_restore 13
	.cfi_restore 12
	.cfi_restore 11
	.cfi_restore 10
	.cfi_restore 9
	.cfi_restore 26
	.cfi_def_cfa_offset 0
	ret $31,($26),1
	.align 4
$L13:
	.cfi_restore_state
 # ./include/linux/fs.h:796: 	up_write(&inode->i_rwsem);
	ldq $27,up_write($29)		!literal!2	 #,,,
	bis $31,$31,$31
	jsr $26,($27),up_write		!lituse_jsr!2	 #,,
	ldah $29,0($26)		!gpdisp!3	 #
	lda $29,0($29)		!gpdisp!3	 #,,
 # ./include/linux/fs.h:797: }
	br $31,$L3	 #
	.align 4
$L20:
 # fs/readdir.c:87: 	else if (!file->f_op->iterate)
	ldq $1,56($1)	 # _1->iterate, _1->iterate
 # fs/readdir.c:84: 	int res = -ENOTDIR;
	lda $10,-20($31)	 # <retval>,
 # fs/readdir.c:87: 	else if (!file->f_op->iterate)
	ldq_u $31,0($30)
	beq $1,$L3	 #, _1->iterate,
 # fs/readdir.c:97: 		res = down_write_killable(&inode->i_rwsem);
	lda $13,160($12)	 # pretmp_38,, _26
	ldq $27,down_write_killable($29)		!literal!12	 #,,,
	mov $13,$16	 # pretmp_38,
 # fs/readdir.c:83: 	bool shared = false;
	mov $31,$14	 #, shared
 # fs/readdir.c:97: 		res = down_write_killable(&inode->i_rwsem);
	jsr $26,($27),down_write_killable		!lituse_jsr!12	 #,,
	ldah $29,0($26)		!gpdisp!13	 #
	lda $29,0($29)		!gpdisp!13	 #,,
	mov $0,$10	 # tmp147, <retval>
	br $31,$L5	 #
	.align 4
$L21:
 # fs/readdir.c:105: 			res = file->f_op->iterate_shared(file, ctx);
	ldq $27,64($1)	 # _9->iterate_shared, _9->iterate_shared
	jsr $26,($27),0	 # _9->iterate_shared
	ldah $29,0($26)		!gpdisp!17
	lda $29,0($29)		!gpdisp!17
	mov $0,$10	 # tmp148, <retval>
	br $31,$L8	 #
	.align 4
$L22:
 # ./include/linux/fsnotify.h:40: 	return __fsnotify_parent(path, dentry, mask);
	ldq $17,24($9)	 # MEM[(const struct path *)file_23(D) + 16B].dentry,
	ldq $27,__fsnotify_parent($29)		!literal!10	 #,,,
	mov $11,$18	 # mask,
	mov $15,$16	 # path,
	jsr $26,($27),__fsnotify_parent		!lituse_jsr!10	 #,,
	ldah $29,0($26)		!gpdisp!11	 #
	lda $29,0($29)		!gpdisp!11	 #,,
 # ./include/linux/fsnotify.h:52: 	if (ret)
	bne $0,$L11	 #, tmp150,
 # ./include/linux/fsnotify.h:54: 	return fsnotify(inode, mask, path, FSNOTIFY_EVENT_PATH, NULL, 0);
	ldq $27,fsnotify($29)		!literal!8	 #,,,
	mov $31,$21	 #,
	mov $31,$20	 #,
	lda $19,1($31)	 #,
	mov $15,$18	 # path,
	mov $11,$17	 # mask,
	mov $12,$16	 # _47,
	jsr $26,($27),fsnotify		!lituse_jsr!8	 #,,
	ldah $29,0($26)		!gpdisp!9	 #
	lda $29,0($29)		!gpdisp!9	 #,,
	br $31,$L11	 #
	.cfi_endproc
$LFE3537:
	.end iterate_dir
	.align 2
	.align 4
	.ent fillonedir
fillonedir:
	.frame $30,48,$26,0
	.mask 0x4001e00,-48
$LFB3539:
	.cfi_startproc
	ldah $29,0($27)		!gpdisp!18	 #,,
	lda $29,0($29)		!gpdisp!18	 #,,
$fillonedir..ng:
	lda $30,-48($30)	 #,,
	.cfi_def_cfa_offset 48
	bis $31,$31,$31
	stq $11,24($30)	 #,
	.cfi_offset 11, -24
	mov $16,$11	 # tmp141, ctx
	stq $12,32($30)	 #,
	.cfi_offset 12, -16
	mov $18,$12	 # namlen, tmp142
	stq $26,0($30)	 #,
	stq $9,8($30)	 #,
	stq $10,16($30)	 #,
	.cfi_offset 26, -48
	.cfi_offset 9, -40
	.cfi_offset 10, -32
	.prologue 1
 # fs/readdir.c:187: 	if (buf->result)
	ldl $10,24($16)	 # <retval>, MEM[(struct readdir_callback *)ctx_33(D)].result
 # fs/readdir.c:187: 	if (buf->result)
	bne $10,$L27	 #, <retval>,
 # fs/readdir.c:195: 	dirent = buf->dirent;
	ldq $2,16($16)	 # MEM[(struct readdir_callback *)ctx_33(D)].dirent, dirent
 # fs/readdir.c:196: 	if (!access_ok(dirent,
	lda $1,1($18)	 # tmp113,, namlen
 # fs/readdir.c:194: 	buf->result++;
	lda $3,1($31)	 # tmp112,
	stl $3,24($16)	 # tmp112, MEM[(struct readdir_callback *)ctx_33(D)].result
 # fs/readdir.c:196: 	if (!access_ok(dirent,
	lda $9,18($2)	 # _2,, dirent
	addq $9,$1,$1	 # _2, tmp113, _6
	ldq $4,80($8)	 # __current_thread_info.3_9->addr_limit.seg, __current_thread_info.3_9->addr_limit.seg
	subq $1,$2,$3	 # _6, dirent, __ao_b
	cmpult $31,$3,$5	 # __ao_b, tmp115
	bis $2,$3,$3	 # dirent, __ao_b, tmp117
	subq $1,$5,$1	 # _6, tmp115, __ao_end
	bis $1,$3,$1	 # __ao_end, tmp117, tmp118
	and $1,$4,$1	 # tmp118, __current_thread_info.3_9->addr_limit.seg, tmp119
 # fs/readdir.c:196: 	if (!access_ok(dirent,
	bne $1,$L26	 #, tmp119,
 # fs/readdir.c:200: 	if (	__put_user(d_ino, &dirent->d_ino) ||
	mov $10,$1	 # <retval>, __pu_err
	.set	macro
 # 200 "fs/readdir.c" 1
	1: stq $20,0($2)	 # ino, MEM[(struct __large_struct *)_16]
2:
.section __ex_table,"a"
	.long 1b-.
	lda $31,2b-1b($1)	 # __pu_err
.previous

 # 0 "" 2
 # fs/readdir.c:200: 	if (	__put_user(d_ino, &dirent->d_ino) ||
	.set	nomacro
	bne $1,$L26	 #, __pu_err,
 # fs/readdir.c:201: 		__put_user(offset, &dirent->d_offset) ||
	mov $10,$1	 # <retval>, __pu_err
	.set	macro
 # 201 "fs/readdir.c" 1
	1: stq $19,8($2)	 # offset, MEM[(struct __large_struct *)_19]
2:
.section __ex_table,"a"
	.long 1b-.
	lda $31,2b-1b($1)	 # __pu_err
.previous

 # 0 "" 2
 # fs/readdir.c:200: 	if (	__put_user(d_ino, &dirent->d_ino) ||
	.set	nomacro
	bne $1,$L26	 #, __pu_err,
 # fs/readdir.c:202: 		__put_user(namlen, &dirent->d_namlen) ||
	.align 3 #realign	 #
	lda $2,16($2)	 # tmp131,, dirent
	mov $10,$1	 # <retval>, __pu_err
	zapnot $18,3,$3	 # namlen, namlen
	.set	macro
 # 202 "fs/readdir.c" 1
	1:	ldq_u $5,1($2)	 # __pu_tmp2, tmp131
2:	ldq_u $4,0($2)	 # __pu_tmp1, tmp131
	inswh $3,$2,$7	 # namlen, tmp131, __pu_tmp4
	inswl $3,$2,$6	 # namlen, tmp131, __pu_tmp3
	mskwh $5,$2,$5	 # __pu_tmp2, tmp131
	mskwl $4,$2,$4	 # __pu_tmp1, tmp131
	or $5,$7,$5	 # __pu_tmp2, __pu_tmp4
	or $4,$6,$4	 # __pu_tmp1, __pu_tmp3
3:	stq_u $5,1($2)	 # __pu_tmp2, tmp131
4:	stq_u $4,0($2)	 # __pu_tmp1, tmp131
5:
.section __ex_table,"a"
	.long 1b-.
	lda $31,5b-1b($1)	 # __pu_err
.previous
.section __ex_table,"a"
	.long 2b-.
	lda $31,5b-2b($1)	 # __pu_err
.previous
.section __ex_table,"a"
	.long 3b-.
	lda $31,5b-3b($1)	 # __pu_err
.previous
.section __ex_table,"a"
	.long 4b-.
	lda $31,5b-4b($1)	 # __pu_err
.previous

 # 0 "" 2
 # fs/readdir.c:201: 		__put_user(offset, &dirent->d_offset) ||
	.set	nomacro
	bne $1,$L26	 #, __pu_err,
 # ./arch/alpha/include/asm/uaccess.h:314: 	return __copy_user((__force void *)to, from, len);
	.align 3 #realign	 #
	ldq $27,__copy_user($29)		!literal!19	 #,,,
	mov $9,$16	 # _2,
	jsr $26,($27),__copy_user		!lituse_jsr!19	 #,,
	ldah $29,0($26)		!gpdisp!20	 #
	lda $29,0($29)		!gpdisp!20	 #,,
 # fs/readdir.c:202: 		__put_user(namlen, &dirent->d_namlen) ||
	bne $0,$L26	 #, tmp145,
 # fs/readdir.c:204: 		__put_user(0, dirent->d_name + namlen))
	addq $9,$12,$9	 # _2, namlen, tmp138
	mov $10,$1	 # <retval>, __pu_err
	.set	macro
 # 204 "fs/readdir.c" 1
	1:	ldq_u $2,0($9)	 # __pu_tmp1, tmp138
	insbl $10,$9,$3	 # __pu_err, tmp138, __pu_tmp2
	mskbl $2,$9,$2	 # __pu_tmp1, tmp138
	or $2,$3,$2	 # __pu_tmp1, __pu_tmp2
2:	stq_u $2,0($9)	 # __pu_tmp1, tmp138
3:
.section __ex_table,"a"
	.long 1b-.
	lda $31,3b-1b($1)	 # __pu_err
.previous
.section __ex_table,"a"
	.long 2b-.
	lda $31,3b-2b($1)	 # __pu_err
.previous

 # 0 "" 2
 # fs/readdir.c:203: 		__copy_to_user(dirent->d_name, name, namlen) ||
	.set	nomacro
	bne $1,$L26	 #, __pu_err,
	.align 3 #realign	 #
$L24:
 # fs/readdir.c:210: }
	mov $10,$0	 # <retval>,
	ldq $26,0($30)	 #,
	ldq $9,8($30)	 #,
	ldq $10,16($30)	 #,
	ldq $11,24($30)	 #,
	ldq $12,32($30)	 #,
	lda $30,48($30)	 #,,
	.cfi_remember_state
	.cfi_restore 12
	.cfi_restore 11
	.cfi_restore 10
	.cfi_restore 9
	.cfi_restore 26
	.cfi_def_cfa_offset 0
	ret $31,($26),1
	.align 4
$L26:
	.cfi_restore_state
 # fs/readdir.c:208: 	buf->result = -EFAULT;
	lda $1,-14($31)	 # tmp121,
 # fs/readdir.c:209: 	return -EFAULT;
	lda $10,-14($31)	 # <retval>,
 # fs/readdir.c:208: 	buf->result = -EFAULT;
	stl $1,24($11)	 # tmp121, MEM[(struct readdir_callback *)ctx_33(D)].result
 # fs/readdir.c:209: 	return -EFAULT;
	br $31,$L24	 #
	.align 4
$L27:
 # fs/readdir.c:188: 		return -EINVAL;
	lda $10,-22($31)	 # <retval>,
	br $31,$L24	 #
	.cfi_endproc
$LFE3539:
	.end fillonedir
	.section	.rodata.str1.1,"aMS",@progbits,1
$LC0:
	.string	"fs/readdir.c"
	.text
	.align 2
	.align 4
	.ent verify_dirent_name
verify_dirent_name:
	.frame $30,32,$26,0
	.mask 0x4000000,-32
$LFB3538:
	.cfi_startproc
	ldah $29,0($27)		!gpdisp!21	 #,,
	lda $29,0($29)		!gpdisp!21	 #,,
$verify_dirent_name..ng:
	lda $30,-32($30)	 #,,
	.cfi_def_cfa_offset 32
	mov $17,$18	 # tmp106, len
	stq $26,0($30)	 #,
	.cfi_offset 26, -32
	.prologue 1
 # fs/readdir.c:148: 	if (WARN_ON_ONCE(!len))
	beq $17,$L36	 #, len,
 # fs/readdir.c:150: 	if (WARN_ON_ONCE(memchr(name, '/', len)))
	ldq $27,memchr($29)		!literal!22	 #,,,
	lda $17,47($31)	 #,
	jsr $26,($27),memchr		!lituse_jsr!22	 #,,
	ldah $29,0($26)		!gpdisp!23	 #
	lda $29,0($29)		!gpdisp!23	 #,,
	bne $0,$L31	 #, tmp107,
$L34:
 # fs/readdir.c:153: }
	ldq $26,0($30)	 #,
	lda $30,32($30)	 #,,
	.cfi_remember_state
	.cfi_restore 26
	.cfi_def_cfa_offset 0
	ret $31,($26),1
	.align 4
$L36:
	.cfi_restore_state
 # fs/readdir.c:148: 	if (WARN_ON_ONCE(!len))
	ldah $1,__warned.38909($29)		!gprelhigh	 # tmp77,,
 # fs/readdir.c:149: 		return -EIO;
	lda $0,-5($31)	 # <retval>,
 # fs/readdir.c:148: 	if (WARN_ON_ONCE(!len))
	ldq_u $2,__warned.38909($1)		!gprellow	 #, tmp81
	lda $3,__warned.38909($1)		!gprellow	 # tmp82,, tmp77
	extbl $2,$3,$4	 #, tmp81, tmp82, tmp78
	bne $4,$L34	 #, tmp78,
 # fs/readdir.c:148: 	if (WARN_ON_ONCE(!len))
	lda $4,1($31)	 # tmp84,
	mov $31,$19	 #,
	mskbl $2,$3,$2	 #, tmp81, tmp82, tmp86
	lda $18,9($31)	 #,
	insbl $4,$3,$3	 # tmp84, tmp82, tmp87
	lda $17,148($31)	 #,
	bis $3,$2,$3	 # tmp87, tmp86, tmp87
	stq_u $3,__warned.38909($1)		!gprellow	 #, tmp87
$L35:
 # fs/readdir.c:150: 	if (WARN_ON_ONCE(memchr(name, '/', len)))
	ldah $16,$LC0($29)		!gprelhigh	 # tmp102,,
	ldq $27,warn_slowpath_fmt($29)		!literal!24	 #,,,
	stq $0,16($30)	 #,
	lda $16,$LC0($16)		!gprellow	 #,, tmp102
	jsr $26,($27),warn_slowpath_fmt		!lituse_jsr!24	 #,,
	ldah $29,0($26)		!gpdisp!25	 #
	lda $29,0($29)		!gpdisp!25	 #,,
	ldq $0,16($30)	 #,
	br $31,$L34	 #
	.align 4
$L31:
 # fs/readdir.c:150: 	if (WARN_ON_ONCE(memchr(name, '/', len)))
	ldah $1,__warned.38914($29)		!gprelhigh	 # tmp90,,
 # fs/readdir.c:149: 		return -EIO;
	lda $0,-5($31)	 # <retval>,
 # fs/readdir.c:150: 	if (WARN_ON_ONCE(memchr(name, '/', len)))
	ldq_u $2,__warned.38914($1)		!gprellow	 #, tmp94
	lda $3,__warned.38914($1)		!gprellow	 # tmp95,, tmp90
	extbl $2,$3,$4	 #, tmp94, tmp95, tmp91
	bne $4,$L34	 #, tmp91,
 # fs/readdir.c:150: 	if (WARN_ON_ONCE(memchr(name, '/', len)))
	lda $4,1($31)	 # tmp97,
	mov $31,$19	 #,
	mskbl $2,$3,$2	 #, tmp94, tmp95, tmp99
	lda $18,9($31)	 #,
	insbl $4,$3,$3	 # tmp97, tmp95, tmp100
	lda $17,150($31)	 #,
	bis $3,$2,$3	 # tmp100, tmp99, tmp100
	stq_u $3,__warned.38914($1)		!gprellow	 #, tmp100
	br $31,$L35	 #
	.cfi_endproc
$LFE3538:
	.end verify_dirent_name
	.align 2
	.align 4
	.ent filldir
filldir:
	.frame $30,64,$26,0
	.mask 0x400fe00,-64
$LFB3542:
	.cfi_startproc
	ldah $29,0($27)		!gpdisp!26	 #,,
	lda $29,0($29)		!gpdisp!26	 #,,
$filldir..ng:
	lda $30,-64($30)	 #,,
	.cfi_def_cfa_offset 64
	bis $31,$31,$31
	stq $9,8($30)	 #,
	.cfi_offset 9, -56
	mov $17,$9	 # tmp243, name
	stq $13,40($30)	 #,
	.cfi_offset 13, -24
 # fs/readdir.c:261: 	int reclen = ALIGN(offsetof(struct linux_dirent, d_name) + namlen + 2,
	addl $18,27,$13	 # namlen,, tmp143
 # fs/readdir.c:256: {
	stq $11,24($30)	 #,
 # fs/readdir.c:261: 	int reclen = ALIGN(offsetof(struct linux_dirent, d_name) + namlen + 2,
	bic $13,7,$13	 # tmp143,, tmp144
	.cfi_offset 11, -40
 # fs/readdir.c:256: {
	mov $16,$11	 # ctx, tmp242
 # fs/readdir.c:264: 	buf->error = verify_dirent_name(name, namlen);
	mov $18,$17	 # namlen,
	mov $9,$16	 # name,
 # fs/readdir.c:261: 	int reclen = ALIGN(offsetof(struct linux_dirent, d_name) + namlen + 2,
	addl $31,$13,$13	 # tmp144, reclen
 # fs/readdir.c:256: {
	stq $10,16($30)	 #,
	.cfi_offset 10, -48
	mov $18,$10	 # tmp244, namlen
	stq $12,32($30)	 #,
	.cfi_offset 12, -32
	mov $21,$12	 # tmp247, d_type
	stq $14,48($30)	 #,
	.cfi_offset 14, -16
	mov $20,$14	 # tmp246, ino
	stq $15,56($30)	 #,
	.cfi_offset 15, -8
	mov $19,$15	 # tmp245, offset
	stq $26,0($30)	 #,
	.cfi_offset 26, -64
	.prologue 1
 # fs/readdir.c:264: 	buf->error = verify_dirent_name(name, namlen);
	ldq $27,verify_dirent_name($29)		!literal!27	 #
	jsr $26,($27),0		!lituse_jsr!27
	ldah $29,0($26)		!gpdisp!28
	lda $29,0($29)		!gpdisp!28
 # fs/readdir.c:265: 	if (unlikely(buf->error))
	bne $0,$L60	 #, <retval>,
 # fs/readdir.c:267: 	buf->error = -EINVAL;	/* only used if we fail.. */
	lda $1,-22($31)	 # tmp147,
 # fs/readdir.c:268: 	if (reclen > buf->count)
	ldl $5,32($11)	 # _8, MEM[(struct getdents_callback *)ctx_55(D)].count
 # fs/readdir.c:267: 	buf->error = -EINVAL;	/* only used if we fail.. */
	stl $1,36($11)	 # tmp147, MEM[(struct getdents_callback *)ctx_55(D)].error
 # fs/readdir.c:268: 	if (reclen > buf->count)
	bis $31,$31,$31
	cmplt $5,$13,$1	 #, _8, reclen, tmp148
	bne $1,$L51	 #, tmp148,
 # fs/readdir.c:275: 	dirent = buf->previous;
	ldq $2,24($11)	 # MEM[(struct getdents_callback *)ctx_55(D)].previous, dirent
 # fs/readdir.c:276: 	if (dirent && signal_pending(current))
	beq $2,$L40	 #, dirent,
 # ./include/linux/sched.h:1737: 	return test_ti_thread_flag(task_thread_info(tsk), flag);
	ldq $1,64($8)	 # __current_thread_info.16_9->task, __current_thread_info.16_9->task
 # ./arch/alpha/include/asm/bitops.h:289: 	return (1UL & (((const int *) addr)[nr >> 5] >> (nr & 31))) != 0UL;
	ldq $1,8($1)	 # _10->stack, _10->stack
 # ./arch/alpha/include/asm/bitops.h:289: 	return (1UL & (((const int *) addr)[nr >> 5] >> (nr & 31))) != 0UL;
	ldl $1,72($1)	 # MEM[(const int *)_118 + 72B], MEM[(const int *)_118 + 72B]
 # fs/readdir.c:276: 	if (dirent && signal_pending(current))
	and $1,4,$1	 # MEM[(const int *)_118 + 72B],, tmp154
	cmpult $31,$1,$1	 # tmp154, tmp154
	bne $1,$L61	 #, tmp154,
 # fs/readdir.c:283: 	if (!user_access_begin(dirent, sizeof(*dirent)))
	lda $1,23($2)	 # __ao_end,, dirent
	ldq $3,80($8)	 # __current_thread_info.16_9->addr_limit.seg, __current_thread_info.16_9->addr_limit.seg
	bis $1,$2,$1	 # __ao_end, dirent, tmp236
	bis $1,24,$1	 # tmp236,, tmp237
	and $1,$3,$1	 # tmp237, __current_thread_info.16_9->addr_limit.seg, tmp238
 # fs/readdir.c:283: 	if (!user_access_begin(dirent, sizeof(*dirent)))
	bne $1,$L48	 #, tmp238,
 # fs/readdir.c:286: 		unsafe_put_user(offset, &dirent->d_off, efault_end);
	mov $0,$1	 # <retval>, __pu_err
	.set	macro
 # 286 "fs/readdir.c" 1
	1: stq $15,8($2)	 # offset, MEM[(struct __large_struct *)_19]
2:
.section __ex_table,"a"
	.long 1b-.
	lda $31,2b-1b($1)	 # __pu_err
.previous

 # 0 "" 2
	.set	nomacro
	bne $1,$L48	 #, __pu_err,
	.align 3 #realign	 #
$L50:
 # fs/readdir.c:287: 	dirent = buf->current_dir;
	ldq $6,16($11)	 # MEM[(struct getdents_callback *)ctx_55(D)].current_dir, dirent
 # fs/readdir.c:288: 	unsafe_put_user(d_ino, &dirent->d_ino, efault_end);
	mov $31,$1	 #, __pu_err
	.set	macro
 # 288 "fs/readdir.c" 1
	1: stq $14,0($6)	 # ino, MEM[(struct __large_struct *)_25]
2:
.section __ex_table,"a"
	.long 1b-.
	lda $31,2b-1b($1)	 # __pu_err
.previous

 # 0 "" 2
	.set	nomacro
	bne $1,$L48	 #, __pu_err,
 # fs/readdir.c:289: 	unsafe_put_user(reclen, &dirent->d_reclen, efault_end);
	.align 3 #realign	 #
	zapnot $13,3,$3	 # reclen, reclen
	lda $2,16($6)	 # tmp166,, dirent
	.set	macro
 # 289 "fs/readdir.c" 1
	1:	ldq_u $7,1($2)	 # __pu_tmp2, tmp166
2:	ldq_u $4,0($2)	 # __pu_tmp1, tmp166
	inswh $3,$2,$23	 # reclen, tmp166, __pu_tmp4
	inswl $3,$2,$22	 # reclen, tmp166, __pu_tmp3
	mskwh $7,$2,$7	 # __pu_tmp2, tmp166
	mskwl $4,$2,$4	 # __pu_tmp1, tmp166
	or $7,$23,$7	 # __pu_tmp2, __pu_tmp4
	or $4,$22,$4	 # __pu_tmp1, __pu_tmp3
3:	stq_u $7,1($2)	 # __pu_tmp2, tmp166
4:	stq_u $4,0($2)	 # __pu_tmp1, tmp166
5:
.section __ex_table,"a"
	.long 1b-.
	lda $31,5b-1b($1)	 # __pu_err
.previous
.section __ex_table,"a"
	.long 2b-.
	lda $31,5b-2b($1)	 # __pu_err
.previous
.section __ex_table,"a"
	.long 3b-.
	lda $31,5b-3b($1)	 # __pu_err
.previous
.section __ex_table,"a"
	.long 4b-.
	lda $31,5b-4b($1)	 # __pu_err
.previous

 # 0 "" 2
	.set	nomacro
	bne $1,$L48	 #, __pu_err,
 # fs/readdir.c:290: 	unsafe_put_user(d_type, (char __user *) dirent + reclen - 1, efault_end);
	.align 3 #realign	 #
	sll $12,56,$12	 # d_type,, tmp175
	lda $2,-1($13)	 # tmp176,, reclen
	sra $12,56,$12	 # tmp175,, tmp173
	addq $6,$2,$2	 # dirent, tmp176, tmp177
	.set	macro
 # 290 "fs/readdir.c" 1
	1:	ldq_u $3,0($2)	 # __pu_tmp1, tmp177
	insbl $12,$2,$4	 # tmp173, tmp177, __pu_tmp2
	mskbl $3,$2,$3	 # __pu_tmp1, tmp177
	or $3,$4,$3	 # __pu_tmp1, __pu_tmp2
2:	stq_u $3,0($2)	 # __pu_tmp1, tmp177
3:
.section __ex_table,"a"
	.long 1b-.
	lda $31,3b-1b($1)	 # __pu_err
.previous
.section __ex_table,"a"
	.long 2b-.
	lda $31,3b-2b($1)	 # __pu_err
.previous

 # 0 "" 2
	.set	nomacro
	bne $1,$L48	 #, __pu_err,
 # fs/readdir.c:291: 	unsafe_copy_dirent_name(dirent->d_name, name, namlen, efault_end);
	.align 3 #realign	 #
	mov $10,$18	 # namlen, len
	lda $2,18($6)	 # dst,, dirent
	cmpule $10,7,$1	 #, len,, tmp179
	bne $1,$L43	 #, tmp179,
	mov $31,$7	 #, tmp187
	br $31,$L44	 #
	.align 4
$L62:
 # fs/readdir.c:291: 	unsafe_copy_dirent_name(dirent->d_name, name, namlen, efault_end);
	lda $2,8($2)	 # dst,, dst
	lda $9,8($9)	 # name,, name
	cmpule $18,7,$1	 #, len,, tmp188
	bne $1,$L43	 #, tmp188,
$L44:
 # ./include/linux/unaligned/packed_struct.h:25: 	return ptr->x;
	ldq_u $1,0($9)	 #, tmp182
	ldq_u $4,7($9)	 #, tmp183
 # fs/readdir.c:291: 	unsafe_copy_dirent_name(dirent->d_name, name, namlen, efault_end);
	mov $7,$3	 # tmp187, __pu_err
 # ./include/linux/unaligned/packed_struct.h:25: 	return ptr->x;
	extql $1,$9,$1	 #, tmp182, name, tmp185
	extqh $4,$9,$4	 # tmp183, name, tmp186
	bis $1,$4,$1	 # tmp185, tmp186, tmp181
 # fs/readdir.c:291: 	unsafe_copy_dirent_name(dirent->d_name, name, namlen, efault_end);
	.set	macro
 # 291 "fs/readdir.c" 1
	1: stq $1,0($2)	 # tmp181, MEM[(struct __large_struct *)dst_154]
2:
.section __ex_table,"a"
	.long 1b-.
	lda $31,2b-1b($3)	 # __pu_err
.previous

 # 0 "" 2
	.set	nomacro
	.align 3 #realign	 #
	lda $18,-8($18)	 # len,, len
	beq $3,$L62	 #, __pu_err,
$L42:
$L48:
 # fs/readdir.c:302: 	buf->error = -EFAULT;
	lda $1,-14($31)	 # tmp233,
 # fs/readdir.c:303: 	return -EFAULT;
	lda $0,-14($31)	 # <retval>,
 # fs/readdir.c:302: 	buf->error = -EFAULT;
	stl $1,36($11)	 # tmp233, MEM[(struct getdents_callback *)ctx_55(D)].error
	bis $31,$31,$31
$L57:
 # fs/readdir.c:304: }
	ldq $26,0($30)	 #,
	ldq $9,8($30)	 #,
	ldq $10,16($30)	 #,
	ldq $11,24($30)	 #,
	ldq $12,32($30)	 #,
	ldq $13,40($30)	 #,
	ldq $14,48($30)	 #,
	ldq $15,56($30)	 #,
	lda $30,64($30)	 #,,
	.cfi_remember_state
	.cfi_restore 15
	.cfi_restore 14
	.cfi_restore 13
	.cfi_restore 12
	.cfi_restore 11
	.cfi_restore 10
	.cfi_restore 9
	.cfi_restore 26
	.cfi_def_cfa_offset 0
	ret $31,($26),1
	.align 4
$L40:
	.cfi_restore_state
 # fs/readdir.c:283: 	if (!user_access_begin(dirent, sizeof(*dirent)))
	ldq $1,80($8)	 # __current_thread_info.17_165->addr_limit.seg, __current_thread_info.17_165->addr_limit.seg
	and $1,31,$1	 # __current_thread_info.17_165->addr_limit.seg,, tmp240
 # fs/readdir.c:283: 	if (!user_access_begin(dirent, sizeof(*dirent)))
	beq $1,$L50	 #, tmp240,
	br $31,$L48	 #
	.align 4
$L43:
 # fs/readdir.c:291: 	unsafe_copy_dirent_name(dirent->d_name, name, namlen, efault_end);
	cmpule $18,3,$1	 #, len,, tmp189
	bne $1,$L45	 #, tmp189,
 # ./include/linux/unaligned/packed_struct.h:19: 	return ptr->x;
	ldq_u $1,0($9)	 #, tmp192
	ldq_u $4,3($9)	 #, tmp193
 # fs/readdir.c:291: 	unsafe_copy_dirent_name(dirent->d_name, name, namlen, efault_end);
	mov $31,$3	 #, __pu_err
 # ./include/linux/unaligned/packed_struct.h:19: 	return ptr->x;
	extll $1,$9,$1	 #, tmp192, name, tmp195
	extlh $4,$9,$4	 # tmp193, name, tmp196
	bis $1,$4,$1	 # tmp195, tmp196, tmp191
 # fs/readdir.c:291: 	unsafe_copy_dirent_name(dirent->d_name, name, namlen, efault_end);
	.set	macro
 # 291 "fs/readdir.c" 1
	1: stl $1,0($2)	 # tmp191, MEM[(struct __large_struct *)dst_147]
2:
.section __ex_table,"a"
	.long 1b-.
	lda $31,2b-1b($3)	 # __pu_err
.previous

 # 0 "" 2
	.set	nomacro
	bne $3,$L48	 #, __pu_err,
 # fs/readdir.c:291: 	unsafe_copy_dirent_name(dirent->d_name, name, namlen, efault_end);
	.align 3 #realign	 #
	lda $2,4($2)	 # dst,, dst
	lda $9,4($9)	 # name,, name
	lda $18,-4($18)	 # len,, len
	bis $31,$31,$31
$L45:
 # fs/readdir.c:291: 	unsafe_copy_dirent_name(dirent->d_name, name, namlen, efault_end);
	cmpule $18,1,$1	 #, len,, tmp201
	bne $1,$L46	 #, tmp201,
 # ./include/linux/unaligned/packed_struct.h:13: 	return ptr->x;
	ldq_u $1,0($9)	 #, tmp208
	ldq_u $4,1($9)	 #, tmp209
 # fs/readdir.c:291: 	unsafe_copy_dirent_name(dirent->d_name, name, namlen, efault_end);
	mov $31,$3	 #, __pu_err
 # ./include/linux/unaligned/packed_struct.h:13: 	return ptr->x;
	extwl $1,$9,$1	 #, tmp208, name, tmp211
	extwh $4,$9,$4	 # tmp209, name, tmp212
	bis $1,$4,$1	 # tmp211, tmp212, tmp207
 # fs/readdir.c:291: 	unsafe_copy_dirent_name(dirent->d_name, name, namlen, efault_end);
	zapnot $1,3,$1	 # tmp207, tmp216
	.set	macro
 # 291 "fs/readdir.c" 1
	1:	ldq_u $7,1($2)	 # __pu_tmp2, dst
2:	ldq_u $4,0($2)	 # __pu_tmp1, dst
	inswh $1,$2,$23	 # tmp216, dst, __pu_tmp4
	inswl $1,$2,$22	 # tmp216, dst, __pu_tmp3
	mskwh $7,$2,$7	 # __pu_tmp2, dst
	mskwl $4,$2,$4	 # __pu_tmp1, dst
	or $7,$23,$7	 # __pu_tmp2, __pu_tmp4
	or $4,$22,$4	 # __pu_tmp1, __pu_tmp3
3:	stq_u $7,1($2)	 # __pu_tmp2, dst
4:	stq_u $4,0($2)	 # __pu_tmp1, dst
5:
.section __ex_table,"a"
	.long 1b-.
	lda $31,5b-1b($3)	 # __pu_err
.previous
.section __ex_table,"a"
	.long 2b-.
	lda $31,5b-2b($3)	 # __pu_err
.previous
.section __ex_table,"a"
	.long 3b-.
	lda $31,5b-3b($3)	 # __pu_err
.previous
.section __ex_table,"a"
	.long 4b-.
	lda $31,5b-4b($3)	 # __pu_err
.previous

 # 0 "" 2
	.set	nomacro
	bne $3,$L48	 #, __pu_err,
 # fs/readdir.c:291: 	unsafe_copy_dirent_name(dirent->d_name, name, namlen, efault_end);
	.align 3 #realign	 #
	lda $2,2($2)	 # dst,, dst
	lda $9,2($9)	 # name,, name
	lda $18,-2($18)	 # len,, len
$L46:
 # fs/readdir.c:291: 	unsafe_copy_dirent_name(dirent->d_name, name, namlen, efault_end);
	beq $18,$L47	 #, len,
	ldq_u $3,0($9)	 #, tmp223
	mov $31,$1	 #, __pu_err
	extbl $3,$9,$9	 #, tmp223, name, tmp221
	.set	macro
 # 291 "fs/readdir.c" 1
	1:	ldq_u $3,0($2)	 # __pu_tmp1, dst
	insbl $9,$2,$4	 # tmp221, dst, __pu_tmp2
	mskbl $3,$2,$3	 # __pu_tmp1, dst
	or $3,$4,$3	 # __pu_tmp1, __pu_tmp2
2:	stq_u $3,0($2)	 # __pu_tmp1, dst
3:
.section __ex_table,"a"
	.long 1b-.
	lda $31,3b-1b($1)	 # __pu_err
.previous
.section __ex_table,"a"
	.long 2b-.
	lda $31,3b-2b($1)	 # __pu_err
.previous

 # 0 "" 2
	.set	nomacro
	bne $1,$L48	 #, __pu_err,
 # fs/readdir.c:291: 	unsafe_copy_dirent_name(dirent->d_name, name, namlen, efault_end);
	lda $2,1($2)	 # dst,, dst
$L47:
 # fs/readdir.c:291: 	unsafe_copy_dirent_name(dirent->d_name, name, namlen, efault_end);
	mov $31,$1	 #, __pu_err
	.set	macro
 # 291 "fs/readdir.c" 1
	1:	ldq_u $3,0($2)	 # __pu_tmp1, dst
	insbl $1,$2,$4	 # __pu_err, dst, __pu_tmp2
	mskbl $3,$2,$3	 # __pu_tmp1, dst
	or $3,$4,$3	 # __pu_tmp1, __pu_tmp2
2:	stq_u $3,0($2)	 # __pu_tmp1, dst
3:
.section __ex_table,"a"
	.long 1b-.
	lda $31,3b-1b($1)	 # __pu_err
.previous
.section __ex_table,"a"
	.long 2b-.
	lda $31,3b-2b($1)	 # __pu_err
.previous

 # 0 "" 2
	.set	nomacro
	bne $1,$L48	 #, __pu_err,
 # fs/readdir.c:295: 	dirent = (void __user *)dirent + reclen;
	.align 3 #realign	 #
	addq $6,$13,$1	 # dirent, reclen, dirent
 # fs/readdir.c:297: 	buf->count -= reclen;
	subl $5,$13,$5	 # _8, reclen, tmp232
 # fs/readdir.c:294: 	buf->previous = dirent;
	stq $6,24($11)	 # MEM[(struct getdents_callback *)ctx_55(D)].previous, dirent
 # fs/readdir.c:296: 	buf->current_dir = dirent;
	stq $1,16($11)	 # MEM[(struct getdents_callback *)ctx_55(D)].current_dir, dirent
 # fs/readdir.c:297: 	buf->count -= reclen;
	stl $5,32($11)	 # tmp232, MEM[(struct getdents_callback *)ctx_55(D)].count
 # fs/readdir.c:298: 	return 0;
	br $31,$L57	 #
	.align 4
$L60:
 # fs/readdir.c:264: 	buf->error = verify_dirent_name(name, namlen);
	stl $0,36($11)	 # <retval>, MEM[(struct getdents_callback *)ctx_55(D)].error
	br $31,$L57	 #
	.align 4
$L61:
 # fs/readdir.c:277: 		return -EINTR;
	lda $0,-4($31)	 # <retval>,
	br $31,$L57	 #
$L51:
 # fs/readdir.c:269: 		return -EINVAL;
	lda $0,-22($31)	 # <retval>,
	br $31,$L57	 #
	.cfi_endproc
$LFE3542:
	.end filldir
	.align 2
	.align 4
	.ent filldir64
filldir64:
	.frame $30,64,$26,0
	.mask 0x400fe00,-64
$LFB3545:
	.cfi_startproc
	ldah $29,0($27)		!gpdisp!29	 #,,
	lda $29,0($29)		!gpdisp!29	 #,,
$filldir64..ng:
	lda $30,-64($30)	 #,,
	.cfi_def_cfa_offset 64
	bis $31,$31,$31
	stq $9,8($30)	 #,
	.cfi_offset 9, -56
	mov $17,$9	 # tmp238, name
	stq $12,32($30)	 #,
	.cfi_offset 12, -32
 # fs/readdir.c:353: 	int reclen = ALIGN(offsetof(struct linux_dirent64, d_name) + namlen + 1,
	addl $18,27,$12	 # namlen,, tmp141
 # fs/readdir.c:349: {
	stq $11,24($30)	 #,
 # fs/readdir.c:353: 	int reclen = ALIGN(offsetof(struct linux_dirent64, d_name) + namlen + 1,
	bic $12,7,$12	 # tmp141,, tmp142
	.cfi_offset 11, -40
 # fs/readdir.c:349: {
	mov $16,$11	 # ctx, tmp237
 # fs/readdir.c:356: 	buf->error = verify_dirent_name(name, namlen);
	mov $18,$17	 # namlen,
	mov $9,$16	 # name,
 # fs/readdir.c:353: 	int reclen = ALIGN(offsetof(struct linux_dirent64, d_name) + namlen + 1,
	addl $31,$12,$12	 # tmp142, reclen
 # fs/readdir.c:349: {
	stq $10,16($30)	 #,
	.cfi_offset 10, -48
	mov $18,$10	 # tmp239, namlen
	stq $13,40($30)	 #,
	.cfi_offset 13, -24
	mov $21,$13	 # tmp242, d_type
	stq $14,48($30)	 #,
	.cfi_offset 14, -16
	mov $20,$14	 # tmp241, ino
	stq $15,56($30)	 #,
	.cfi_offset 15, -8
	mov $19,$15	 # tmp240, offset
	stq $26,0($30)	 #,
	.cfi_offset 26, -64
	.prologue 1
 # fs/readdir.c:356: 	buf->error = verify_dirent_name(name, namlen);
	ldq $27,verify_dirent_name($29)		!literal!30	 #
	jsr $26,($27),0		!lituse_jsr!30
	ldah $29,0($26)		!gpdisp!31
	lda $29,0($29)		!gpdisp!31
 # fs/readdir.c:357: 	if (unlikely(buf->error))
	bne $0,$L86	 #, <retval>,
 # fs/readdir.c:359: 	buf->error = -EINVAL;	/* only used if we fail.. */
	lda $1,-22($31)	 # tmp145,
 # fs/readdir.c:360: 	if (reclen > buf->count)
	ldl $5,32($11)	 # _8, MEM[(struct getdents_callback64 *)ctx_45(D)].count
 # fs/readdir.c:359: 	buf->error = -EINVAL;	/* only used if we fail.. */
	stl $1,36($11)	 # tmp145, MEM[(struct getdents_callback64 *)ctx_45(D)].error
 # fs/readdir.c:360: 	if (reclen > buf->count)
	bis $31,$31,$31
	cmplt $5,$12,$1	 #, _8, reclen, tmp146
	bne $1,$L77	 #, tmp146,
 # fs/readdir.c:362: 	dirent = buf->previous;
	ldq $2,24($11)	 # MEM[(struct getdents_callback64 *)ctx_45(D)].previous, dirent
 # fs/readdir.c:363: 	if (dirent && signal_pending(current))
	beq $2,$L66	 #, dirent,
 # ./include/linux/sched.h:1737: 	return test_ti_thread_flag(task_thread_info(tsk), flag);
	ldq $1,64($8)	 # __current_thread_info.30_9->task, __current_thread_info.30_9->task
 # ./arch/alpha/include/asm/bitops.h:289: 	return (1UL & (((const int *) addr)[nr >> 5] >> (nr & 31))) != 0UL;
	ldq $1,8($1)	 # _10->stack, _10->stack
 # ./arch/alpha/include/asm/bitops.h:289: 	return (1UL & (((const int *) addr)[nr >> 5] >> (nr & 31))) != 0UL;
	ldl $1,72($1)	 # MEM[(const int *)_115 + 72B], MEM[(const int *)_115 + 72B]
 # fs/readdir.c:363: 	if (dirent && signal_pending(current))
	and $1,4,$1	 # MEM[(const int *)_115 + 72B],, tmp152
	cmpult $31,$1,$1	 # tmp152, tmp152
	bne $1,$L87	 #, tmp152,
 # fs/readdir.c:370: 	if (!user_access_begin(dirent, sizeof(*dirent)))
	lda $1,23($2)	 # __ao_end,, dirent
	ldq $3,80($8)	 # __current_thread_info.30_9->addr_limit.seg, __current_thread_info.30_9->addr_limit.seg
	bis $1,$2,$1	 # __ao_end, dirent, tmp231
	bis $1,24,$1	 # tmp231,, tmp232
	and $1,$3,$1	 # tmp232, __current_thread_info.30_9->addr_limit.seg, tmp233
 # fs/readdir.c:370: 	if (!user_access_begin(dirent, sizeof(*dirent)))
	bne $1,$L74	 #, tmp233,
 # fs/readdir.c:373: 		unsafe_put_user(offset, &dirent->d_off, efault_end);
	mov $0,$1	 # <retval>, __pu_err
	.set	macro
 # 373 "fs/readdir.c" 1
	1: stq $15,8($2)	 # offset, MEM[(struct __large_struct *)_18]
2:
.section __ex_table,"a"
	.long 1b-.
	lda $31,2b-1b($1)	 # __pu_err
.previous

 # 0 "" 2
	.set	nomacro
	bne $1,$L74	 #, __pu_err,
	.align 3 #realign	 #
$L76:
 # fs/readdir.c:374: 	dirent = buf->current_dir;
	ldq $6,16($11)	 # MEM[(struct getdents_callback64 *)ctx_45(D)].current_dir, dirent
 # fs/readdir.c:375: 	unsafe_put_user(ino, &dirent->d_ino, efault_end);
	mov $31,$1	 #, __pu_err
	.set	macro
 # 375 "fs/readdir.c" 1
	1: stq $14,0($6)	 # ino, MEM[(struct __large_struct *)_24]
2:
.section __ex_table,"a"
	.long 1b-.
	lda $31,2b-1b($1)	 # __pu_err
.previous

 # 0 "" 2
	.set	nomacro
	bne $1,$L74	 #, __pu_err,
 # fs/readdir.c:376: 	unsafe_put_user(reclen, &dirent->d_reclen, efault_end);
	.align 3 #realign	 #
	zapnot $12,3,$3	 # reclen, reclen
	lda $2,16($6)	 # tmp164,, dirent
	.set	macro
 # 376 "fs/readdir.c" 1
	1:	ldq_u $7,1($2)	 # __pu_tmp2, tmp164
2:	ldq_u $4,0($2)	 # __pu_tmp1, tmp164
	inswh $3,$2,$23	 # reclen, tmp164, __pu_tmp4
	inswl $3,$2,$22	 # reclen, tmp164, __pu_tmp3
	mskwh $7,$2,$7	 # __pu_tmp2, tmp164
	mskwl $4,$2,$4	 # __pu_tmp1, tmp164
	or $7,$23,$7	 # __pu_tmp2, __pu_tmp4
	or $4,$22,$4	 # __pu_tmp1, __pu_tmp3
3:	stq_u $7,1($2)	 # __pu_tmp2, tmp164
4:	stq_u $4,0($2)	 # __pu_tmp1, tmp164
5:
.section __ex_table,"a"
	.long 1b-.
	lda $31,5b-1b($1)	 # __pu_err
.previous
.section __ex_table,"a"
	.long 2b-.
	lda $31,5b-2b($1)	 # __pu_err
.previous
.section __ex_table,"a"
	.long 3b-.
	lda $31,5b-3b($1)	 # __pu_err
.previous
.section __ex_table,"a"
	.long 4b-.
	lda $31,5b-4b($1)	 # __pu_err
.previous

 # 0 "" 2
	.set	nomacro
	bne $1,$L74	 #, __pu_err,
 # fs/readdir.c:377: 	unsafe_put_user(d_type, &dirent->d_type, efault_end);
	.align 3 #realign	 #
	and $13,0xff,$13	 # d_type, d_type
	lda $2,18($6)	 # tmp172,, dirent
	.set	macro
 # 377 "fs/readdir.c" 1
	1:	ldq_u $3,0($2)	 # __pu_tmp1, tmp172
	insbl $13,$2,$4	 # d_type, tmp172, __pu_tmp2
	mskbl $3,$2,$3	 # __pu_tmp1, tmp172
	or $3,$4,$3	 # __pu_tmp1, __pu_tmp2
2:	stq_u $3,0($2)	 # __pu_tmp1, tmp172
3:
.section __ex_table,"a"
	.long 1b-.
	lda $31,3b-1b($1)	 # __pu_err
.previous
.section __ex_table,"a"
	.long 2b-.
	lda $31,3b-2b($1)	 # __pu_err
.previous

 # 0 "" 2
	.set	nomacro
	bne $1,$L74	 #, __pu_err,
 # fs/readdir.c:378: 	unsafe_copy_dirent_name(dirent->d_name, name, namlen, efault_end);
	.align 3 #realign	 #
	mov $10,$18	 # namlen, len
	lda $2,19($6)	 # dst,, dirent
	cmpule $10,7,$1	 #, len,, tmp174
	bne $1,$L69	 #, tmp174,
	mov $31,$7	 #, tmp182
	br $31,$L70	 #
	.align 4
$L88:
 # fs/readdir.c:378: 	unsafe_copy_dirent_name(dirent->d_name, name, namlen, efault_end);
	lda $2,8($2)	 # dst,, dst
	lda $9,8($9)	 # name,, name
	cmpule $18,7,$1	 #, len,, tmp183
	bne $1,$L69	 #, tmp183,
$L70:
 # ./include/linux/unaligned/packed_struct.h:25: 	return ptr->x;
	ldq_u $1,0($9)	 #, tmp177
	ldq_u $4,7($9)	 #, tmp178
 # fs/readdir.c:378: 	unsafe_copy_dirent_name(dirent->d_name, name, namlen, efault_end);
	mov $7,$3	 # tmp182, __pu_err
 # ./include/linux/unaligned/packed_struct.h:25: 	return ptr->x;
	extql $1,$9,$1	 #, tmp177, name, tmp180
	extqh $4,$9,$4	 # tmp178, name, tmp181
	bis $1,$4,$1	 # tmp180, tmp181, tmp176
 # fs/readdir.c:378: 	unsafe_copy_dirent_name(dirent->d_name, name, namlen, efault_end);
	.set	macro
 # 378 "fs/readdir.c" 1
	1: stq $1,0($2)	 # tmp176, MEM[(struct __large_struct *)dst_152]
2:
.section __ex_table,"a"
	.long 1b-.
	lda $31,2b-1b($3)	 # __pu_err
.previous

 # 0 "" 2
	.set	nomacro
	.align 3 #realign	 #
	lda $18,-8($18)	 # len,, len
	beq $3,$L88	 #, __pu_err,
$L68:
$L74:
 # fs/readdir.c:389: 	buf->error = -EFAULT;
	lda $1,-14($31)	 # tmp228,
 # fs/readdir.c:390: 	return -EFAULT;
	lda $0,-14($31)	 # <retval>,
 # fs/readdir.c:389: 	buf->error = -EFAULT;
	stl $1,36($11)	 # tmp228, MEM[(struct getdents_callback64 *)ctx_45(D)].error
	bis $31,$31,$31
$L83:
 # fs/readdir.c:391: }
	ldq $26,0($30)	 #,
	ldq $9,8($30)	 #,
	ldq $10,16($30)	 #,
	ldq $11,24($30)	 #,
	ldq $12,32($30)	 #,
	ldq $13,40($30)	 #,
	ldq $14,48($30)	 #,
	ldq $15,56($30)	 #,
	lda $30,64($30)	 #,,
	.cfi_remember_state
	.cfi_restore 15
	.cfi_restore 14
	.cfi_restore 13
	.cfi_restore 12
	.cfi_restore 11
	.cfi_restore 10
	.cfi_restore 9
	.cfi_restore 26
	.cfi_def_cfa_offset 0
	ret $31,($26),1
	.align 4
$L66:
	.cfi_restore_state
 # fs/readdir.c:370: 	if (!user_access_begin(dirent, sizeof(*dirent)))
	ldq $1,80($8)	 # __current_thread_info.31_163->addr_limit.seg, __current_thread_info.31_163->addr_limit.seg
	and $1,31,$1	 # __current_thread_info.31_163->addr_limit.seg,, tmp235
 # fs/readdir.c:370: 	if (!user_access_begin(dirent, sizeof(*dirent)))
	beq $1,$L76	 #, tmp235,
	br $31,$L74	 #
	.align 4
$L69:
 # fs/readdir.c:378: 	unsafe_copy_dirent_name(dirent->d_name, name, namlen, efault_end);
	cmpule $18,3,$1	 #, len,, tmp184
	bne $1,$L71	 #, tmp184,
 # ./include/linux/unaligned/packed_struct.h:19: 	return ptr->x;
	ldq_u $1,0($9)	 #, tmp187
	ldq_u $4,3($9)	 #, tmp188
 # fs/readdir.c:378: 	unsafe_copy_dirent_name(dirent->d_name, name, namlen, efault_end);
	mov $31,$3	 #, __pu_err
 # ./include/linux/unaligned/packed_struct.h:19: 	return ptr->x;
	extll $1,$9,$1	 #, tmp187, name, tmp190
	extlh $4,$9,$4	 # tmp188, name, tmp191
	bis $1,$4,$1	 # tmp190, tmp191, tmp186
 # fs/readdir.c:378: 	unsafe_copy_dirent_name(dirent->d_name, name, namlen, efault_end);
	.set	macro
 # 378 "fs/readdir.c" 1
	1: stl $1,0($2)	 # tmp186, MEM[(struct __large_struct *)dst_145]
2:
.section __ex_table,"a"
	.long 1b-.
	lda $31,2b-1b($3)	 # __pu_err
.previous

 # 0 "" 2
	.set	nomacro
	bne $3,$L74	 #, __pu_err,
 # fs/readdir.c:378: 	unsafe_copy_dirent_name(dirent->d_name, name, namlen, efault_end);
	.align 3 #realign	 #
	lda $2,4($2)	 # dst,, dst
	lda $9,4($9)	 # name,, name
	lda $18,-4($18)	 # len,, len
	bis $31,$31,$31
$L71:
 # fs/readdir.c:378: 	unsafe_copy_dirent_name(dirent->d_name, name, namlen, efault_end);
	cmpule $18,1,$1	 #, len,, tmp196
	bne $1,$L72	 #, tmp196,
 # ./include/linux/unaligned/packed_struct.h:13: 	return ptr->x;
	ldq_u $1,0($9)	 #, tmp203
	ldq_u $4,1($9)	 #, tmp204
 # fs/readdir.c:378: 	unsafe_copy_dirent_name(dirent->d_name, name, namlen, efault_end);
	mov $31,$3	 #, __pu_err
 # ./include/linux/unaligned/packed_struct.h:13: 	return ptr->x;
	extwl $1,$9,$1	 #, tmp203, name, tmp206
	extwh $4,$9,$4	 # tmp204, name, tmp207
	bis $1,$4,$1	 # tmp206, tmp207, tmp202
 # fs/readdir.c:378: 	unsafe_copy_dirent_name(dirent->d_name, name, namlen, efault_end);
	zapnot $1,3,$1	 # tmp202, tmp211
	.set	macro
 # 378 "fs/readdir.c" 1
	1:	ldq_u $7,1($2)	 # __pu_tmp2, dst
2:	ldq_u $4,0($2)	 # __pu_tmp1, dst
	inswh $1,$2,$23	 # tmp211, dst, __pu_tmp4
	inswl $1,$2,$22	 # tmp211, dst, __pu_tmp3
	mskwh $7,$2,$7	 # __pu_tmp2, dst
	mskwl $4,$2,$4	 # __pu_tmp1, dst
	or $7,$23,$7	 # __pu_tmp2, __pu_tmp4
	or $4,$22,$4	 # __pu_tmp1, __pu_tmp3
3:	stq_u $7,1($2)	 # __pu_tmp2, dst
4:	stq_u $4,0($2)	 # __pu_tmp1, dst
5:
.section __ex_table,"a"
	.long 1b-.
	lda $31,5b-1b($3)	 # __pu_err
.previous
.section __ex_table,"a"
	.long 2b-.
	lda $31,5b-2b($3)	 # __pu_err
.previous
.section __ex_table,"a"
	.long 3b-.
	lda $31,5b-3b($3)	 # __pu_err
.previous
.section __ex_table,"a"
	.long 4b-.
	lda $31,5b-4b($3)	 # __pu_err
.previous

 # 0 "" 2
	.set	nomacro
	bne $3,$L74	 #, __pu_err,
 # fs/readdir.c:378: 	unsafe_copy_dirent_name(dirent->d_name, name, namlen, efault_end);
	.align 3 #realign	 #
	lda $2,2($2)	 # dst,, dst
	lda $9,2($9)	 # name,, name
	lda $18,-2($18)	 # len,, len
$L72:
 # fs/readdir.c:378: 	unsafe_copy_dirent_name(dirent->d_name, name, namlen, efault_end);
	beq $18,$L73	 #, len,
	ldq_u $3,0($9)	 #, tmp218
	mov $31,$1	 #, __pu_err
	extbl $3,$9,$9	 #, tmp218, name, tmp216
	.set	macro
 # 378 "fs/readdir.c" 1
	1:	ldq_u $3,0($2)	 # __pu_tmp1, dst
	insbl $9,$2,$4	 # tmp216, dst, __pu_tmp2
	mskbl $3,$2,$3	 # __pu_tmp1, dst
	or $3,$4,$3	 # __pu_tmp1, __pu_tmp2
2:	stq_u $3,0($2)	 # __pu_tmp1, dst
3:
.section __ex_table,"a"
	.long 1b-.
	lda $31,3b-1b($1)	 # __pu_err
.previous
.section __ex_table,"a"
	.long 2b-.
	lda $31,3b-2b($1)	 # __pu_err
.previous

 # 0 "" 2
	.set	nomacro
	bne $1,$L74	 #, __pu_err,
 # fs/readdir.c:378: 	unsafe_copy_dirent_name(dirent->d_name, name, namlen, efault_end);
	lda $2,1($2)	 # dst,, dst
$L73:
 # fs/readdir.c:378: 	unsafe_copy_dirent_name(dirent->d_name, name, namlen, efault_end);
	mov $31,$1	 #, __pu_err
	.set	macro
 # 378 "fs/readdir.c" 1
	1:	ldq_u $3,0($2)	 # __pu_tmp1, dst
	insbl $1,$2,$4	 # __pu_err, dst, __pu_tmp2
	mskbl $3,$2,$3	 # __pu_tmp1, dst
	or $3,$4,$3	 # __pu_tmp1, __pu_tmp2
2:	stq_u $3,0($2)	 # __pu_tmp1, dst
3:
.section __ex_table,"a"
	.long 1b-.
	lda $31,3b-1b($1)	 # __pu_err
.previous
.section __ex_table,"a"
	.long 2b-.
	lda $31,3b-2b($1)	 # __pu_err
.previous

 # 0 "" 2
	.set	nomacro
	bne $1,$L74	 #, __pu_err,
 # fs/readdir.c:382: 	dirent = (void __user *)dirent + reclen;
	.align 3 #realign	 #
	addq $6,$12,$1	 # dirent, reclen, dirent
 # fs/readdir.c:384: 	buf->count -= reclen;
	subl $5,$12,$5	 # _8, reclen, tmp227
 # fs/readdir.c:381: 	buf->previous = dirent;
	stq $6,24($11)	 # MEM[(struct getdents_callback64 *)ctx_45(D)].previous, dirent
 # fs/readdir.c:383: 	buf->current_dir = dirent;
	stq $1,16($11)	 # MEM[(struct getdents_callback64 *)ctx_45(D)].current_dir, dirent
 # fs/readdir.c:384: 	buf->count -= reclen;
	stl $5,32($11)	 # tmp227, MEM[(struct getdents_callback64 *)ctx_45(D)].count
 # fs/readdir.c:385: 	return 0;
	br $31,$L83	 #
	.align 4
$L86:
 # fs/readdir.c:356: 	buf->error = verify_dirent_name(name, namlen);
	stl $0,36($11)	 # <retval>, MEM[(struct getdents_callback64 *)ctx_45(D)].error
	br $31,$L83	 #
	.align 4
$L87:
 # fs/readdir.c:364: 		return -EINTR;
	lda $0,-4($31)	 # <retval>,
	br $31,$L83	 #
$L77:
 # fs/readdir.c:361: 		return -EINVAL;
	lda $0,-22($31)	 # <retval>,
	br $31,$L83	 #
	.cfi_endproc
$LFE3545:
	.end filldir64
	.align 2
	.align 4
	.globl __se_sys_old_readdir
	.ent __se_sys_old_readdir
__se_sys_old_readdir:
	.frame $30,64,$26,0
	.mask 0x4000e00,-64
$LFB3540:
	.cfi_startproc
	ldah $29,0($27)		!gpdisp!32	 #,,
	lda $29,0($29)		!gpdisp!32	 #,,
$__se_sys_old_readdir..ng:
	lda $30,-64($30)	 #,,
	.cfi_def_cfa_offset 64
 # ./include/linux/file.h:72: 	return __to_fd(__fdget_pos(fd));
	ldq $27,__fdget_pos($29)		!literal!37	 #,,,
 # fs/readdir.c:212: SYSCALL_DEFINE3(old_readdir, unsigned int, fd,
	stq $9,8($30)	 #,
 # ./include/linux/file.h:72: 	return __to_fd(__fdget_pos(fd));
	addl $31,$16,$16	 # tmp100,
 # fs/readdir.c:212: SYSCALL_DEFINE3(old_readdir, unsigned int, fd,
	stq $10,16($30)	 #,
	.cfi_offset 9, -56
	.cfi_offset 10, -48
	mov $17,$10	 # tmp101, dirent
	stq $11,24($30)	 #,
	stq $26,0($30)	 #,
	.cfi_offset 11, -40
	.cfi_offset 26, -64
	.prologue 1
 # ./include/linux/file.h:72: 	return __to_fd(__fdget_pos(fd));
	jsr $26,($27),__fdget_pos		!lituse_jsr!37	 #,,
	ldah $29,0($26)		!gpdisp!38	 #
 # fs/readdir.c:217: 	struct readdir_callback buf = {
	stq $31,40($30)	 # MEM[(struct readdir_callback *)&buf + 8B],
 # ./include/linux/file.h:72: 	return __to_fd(__fdget_pos(fd));
	lda $29,0($29)		!gpdisp!38	 #,,
 # ./include/linux/file.h:57: 	return (struct fd){(struct file *)(v & ~3),v & 3};
	bic $0,3,$11	 # _9,, _11
 # fs/readdir.c:217: 	struct readdir_callback buf = {
	ldq_u $31,0($30)
	ldah $1,fillonedir($29)		!gprelhigh	 # tmp88,,
 # ./include/linux/file.h:57: 	return (struct fd){(struct file *)(v & ~3),v & 3};
	addl $31,$0,$9	 # _9, _12
 # fs/readdir.c:217: 	struct readdir_callback buf = {
	lda $1,fillonedir($1)		!gprellow	 # tmp87,, tmp88
 # fs/readdir.c:223: 		return -EBADF;
	lda $0,-9($31)	 # <retval>,
 # fs/readdir.c:217: 	struct readdir_callback buf = {
	stq $31,56($30)	 # MEM[(struct readdir_callback *)&buf + 8B],
	stq $1,32($30)	 # buf.ctx.actor, tmp87
	stq $10,48($30)	 # buf.dirent, dirent
 # fs/readdir.c:222: 	if (!f.file)
	beq $11,$L89	 #, _11,
 # fs/readdir.c:225: 	error = iterate_dir(f.file, &buf.ctx);
	lda $17,32($30)	 #,,
	mov $11,$16	 # _11,
	ldq $27,iterate_dir($29)		!literal!39	 #
	jsr $26,($27),0		!lituse_jsr!39
	ldah $29,0($26)		!gpdisp!40
	lda $29,0($29)		!gpdisp!40
	mov $0,$10	 #, tmp103
 # fs/readdir.c:226: 	if (buf.result)
	ldl $0,56($30)	 # _15, buf.result
 # ./include/linux/file.h:77: 	if (f.flags & FDPUT_POS_UNLOCK)
	and $9,2,$1	 # _12,, tmp94
 # fs/readdir.c:226: 	if (buf.result)
	cmovne $0,$0,$10	 #, _15, _15, error
 # ./include/linux/file.h:77: 	if (f.flags & FDPUT_POS_UNLOCK)
	bne $1,$L104	 #, tmp94,
 # ./include/linux/file.h:43: 	if (fd.flags & FDPUT_FPUT)
	blbs $9,$L105	 # _12,
$L93:
 # fs/readdir.c:230: 	return error;
	mov $10,$0	 # error, <retval>
$L89:
 # fs/readdir.c:212: SYSCALL_DEFINE3(old_readdir, unsigned int, fd,
	ldq $26,0($30)	 #,
	ldq $9,8($30)	 #,
	bis $31,$31,$31
	ldq $10,16($30)	 #,
	ldq $11,24($30)	 #,
	lda $30,64($30)	 #,,
	.cfi_remember_state
	.cfi_restore 11
	.cfi_restore 10
	.cfi_restore 9
	.cfi_restore 26
	.cfi_def_cfa_offset 0
	ret $31,($26),1
	.align 4
$L105:
	.cfi_restore_state
 # ./include/linux/file.h:44: 		fput(fd.file);
	ldq $27,fput($29)		!literal!33	 #,,,
	mov $11,$16	 # _11,
	jsr $26,($27),fput		!lituse_jsr!33	 #,,
	ldah $29,0($26)		!gpdisp!34	 #
	lda $29,0($29)		!gpdisp!34	 #,,
	br $31,$L93	 #
	.align 4
$L104:
 # ./include/linux/file.h:78: 		__f_unlock_pos(f.file);
	ldq $27,__f_unlock_pos($29)		!literal!35	 #,,,
	mov $11,$16	 # _11,
	jsr $26,($27),__f_unlock_pos		!lituse_jsr!35	 #,,
	ldah $29,0($26)		!gpdisp!36	 #
	lda $29,0($29)		!gpdisp!36	 #,,
 # ./include/linux/file.h:43: 	if (fd.flags & FDPUT_FPUT)
	blbc $9,$L93	 # _12,
	br $31,$L105	 #
	.cfi_endproc
$LFE3540:
	.end __se_sys_old_readdir
	.globl sys_old_readdir
$sys_old_readdir..ng = $__se_sys_old_readdir..ng
sys_old_readdir = __se_sys_old_readdir
	.align 2
	.align 4
	.globl __se_sys_getdents
	.ent __se_sys_getdents
__se_sys_getdents:
	.frame $30,96,$26,0
	.mask 0x4001e00,-96
$LFB3543:
	.cfi_startproc
	ldah $29,0($27)		!gpdisp!41	 #,,
	lda $29,0($29)		!gpdisp!41	 #,,
$__se_sys_getdents..ng:
 # fs/readdir.c:318: 	if (!access_ok(dirent, count))
	zapnot $18,15,$2	 # count,, __ao_b
 # fs/readdir.c:306: SYSCALL_DEFINE3(getdents, unsigned int, fd,
	lda $30,-96($30)	 #,,
	.cfi_def_cfa_offset 96
 # fs/readdir.c:318: 	if (!access_ok(dirent, count))
	cmpult $31,$2,$3	 # __ao_b, tmp119
	addq $17,$2,$1	 # dirent, __ao_b, tmp117
	subq $1,$3,$1	 # tmp117, tmp119, __ao_end
	bis $17,$2,$2	 # dirent, __ao_b, tmp121
 # fs/readdir.c:306: SYSCALL_DEFINE3(getdents, unsigned int, fd,
	stq $10,16($30)	 #,
 # fs/readdir.c:318: 	if (!access_ok(dirent, count))
	bis $1,$2,$1	 # __ao_end, tmp121, tmp122
 # fs/readdir.c:306: SYSCALL_DEFINE3(getdents, unsigned int, fd,
	stq $11,24($30)	 #,
 # fs/readdir.c:311: 	struct getdents_callback buf = {
	ldah $2,filldir($29)		!gprelhigh	 # tmp116,,
 # fs/readdir.c:306: SYSCALL_DEFINE3(getdents, unsigned int, fd,
	stq $26,0($30)	 #,
 # fs/readdir.c:311: 	struct getdents_callback buf = {
	lda $2,filldir($2)		!gprellow	 # tmp115,, tmp116
 # fs/readdir.c:306: SYSCALL_DEFINE3(getdents, unsigned int, fd,
	stq $9,8($30)	 #,
	.cfi_offset 10, -80
	.cfi_offset 11, -72
	.cfi_offset 26, -96
	.cfi_offset 9, -88
	mov $18,$10	 # tmp152, count
	stq $12,32($30)	 #,
	.cfi_offset 12, -64
	.prologue 1
 # fs/readdir.c:306: SYSCALL_DEFINE3(getdents, unsigned int, fd,
	addl $31,$16,$16	 # tmp150, _1
 # fs/readdir.c:311: 	struct getdents_callback buf = {
	stq $31,80($30)	 # MEM[(struct getdents_callback *)&buf + 8B],
 # fs/readdir.c:319: 		return -EFAULT;
	lda $11,-14($31)	 # <retval>,
 # fs/readdir.c:318: 	if (!access_ok(dirent, count))
	ldq $3,80($8)	 # __current_thread_info.12_17->addr_limit.seg, __current_thread_info.12_17->addr_limit.seg
 # fs/readdir.c:311: 	struct getdents_callback buf = {
	stq $31,56($30)	 # MEM[(struct getdents_callback *)&buf + 8B],
	stq $31,72($30)	 # MEM[(struct getdents_callback *)&buf + 8B],
 # fs/readdir.c:318: 	if (!access_ok(dirent, count))
	and $1,$3,$1	 # tmp122, __current_thread_info.12_17->addr_limit.seg, tmp123
 # fs/readdir.c:311: 	struct getdents_callback buf = {
	stq $2,48($30)	 # buf.ctx.actor, tmp115
	stq $17,64($30)	 # buf.current_dir, dirent
	stl $18,80($30)	 # count, buf.count
 # fs/readdir.c:318: 	if (!access_ok(dirent, count))
	bne $1,$L106	 #, tmp123,
 # ./include/linux/file.h:72: 	return __to_fd(__fdget_pos(fd));
	ldq $27,__fdget_pos($29)		!literal!46	 #,,,
	bis $31,$31,$31
	jsr $26,($27),__fdget_pos		!lituse_jsr!46	 #,,
	ldah $29,0($26)		!gpdisp!47	 #
	lda $29,0($29)		!gpdisp!47	 #,,
 # ./include/linux/file.h:57: 	return (struct fd){(struct file *)(v & ~3),v & 3};
	bic $0,3,$12	 # _22,, _24
 # ./include/linux/file.h:57: 	return (struct fd){(struct file *)(v & ~3),v & 3};
	addl $31,$0,$9	 # _22, _25
 # fs/readdir.c:322: 	if (!f.file)
	beq $12,$L114	 #, _24,
 # fs/readdir.c:325: 	error = iterate_dir(f.file, &buf.ctx);
	lda $17,48($30)	 #,,
	mov $12,$16	 # _24,
	ldq $27,iterate_dir($29)		!literal!48	 #
	jsr $26,($27),0		!lituse_jsr!48
	ldah $29,0($26)		!gpdisp!49
	lda $29,0($29)		!gpdisp!49
 # fs/readdir.c:327: 		error = buf.error;
	ldl $1,84($30)	 # buf.error, buf.error
 # fs/readdir.c:328: 	lastdirent = buf.previous;
	ldq $3,72($30)	 # buf.previous, lastdirent
 # fs/readdir.c:327: 		error = buf.error;
	cmovge $0,$1,$0	 #, tmp154, buf.error, error
 # fs/readdir.c:329: 	if (lastdirent) {
	beq $3,$L123	 #, lastdirent,
 # fs/readdir.c:330: 		if (put_user(buf.ctx.pos, &lastdirent->d_off))
	lda $2,15($3)	 # __ao_end,, lastdirent
	lda $1,8($3)	 # __pu_addr,, lastdirent
	bis $2,$1,$1	 # __ao_end, __pu_addr, tmp130
	ldq $2,80($8)	 # __current_thread_info.14_33->addr_limit.seg, __current_thread_info.14_33->addr_limit.seg
	bis $1,8,$1	 # tmp130,, tmp131
	bis $31,$31,$31
	and $1,$2,$1	 # tmp131, __current_thread_info.14_33->addr_limit.seg, tmp132
	bne $1,$L110	 #, tmp132,
	ldq $2,56($30)	 # buf.ctx.pos, buf.ctx.pos
	.set	macro
 # 330 "fs/readdir.c" 1
	1: stq $2,8($3)	 # buf.ctx.pos, MEM[(struct __large_struct *)__pu_addr_30]
2:
.section __ex_table,"a"
	.long 1b-.
	lda $31,2b-1b($1)	 # __pu_err
.previous

 # 0 "" 2
 # fs/readdir.c:330: 		if (put_user(buf.ctx.pos, &lastdirent->d_off))
	.set	nomacro
	beq $1,$L124	 #, __pu_err,
	.align 3 #realign	 #
$L110:
 # ./include/linux/file.h:77: 	if (f.flags & FDPUT_POS_UNLOCK)
	and $9,2,$1	 # _25,, tmp142
	bne $1,$L125	 #, tmp142,
$L111:
 # ./include/linux/file.h:43: 	if (fd.flags & FDPUT_FPUT)
	bis $31,$31,$31
	blbs $9,$L126	 # _25,
$L106:
 # fs/readdir.c:306: SYSCALL_DEFINE3(getdents, unsigned int, fd,
	mov $11,$0	 # <retval>,
	ldq $26,0($30)	 #,
	ldq $9,8($30)	 #,
	ldq $10,16($30)	 #,
	ldq $11,24($30)	 #,
	ldq $12,32($30)	 #,
	lda $30,96($30)	 #,,
	.cfi_remember_state
	.cfi_restore 12
	.cfi_restore 11
	.cfi_restore 10
	.cfi_restore 9
	.cfi_restore 26
	.cfi_def_cfa_offset 0
	ret $31,($26),1
	.align 4
$L125:
	.cfi_restore_state
 # ./include/linux/file.h:78: 		__f_unlock_pos(f.file);
	ldq $27,__f_unlock_pos($29)		!literal!44	 #,,,
	mov $12,$16	 # _24,
	jsr $26,($27),__f_unlock_pos		!lituse_jsr!44	 #,,
	ldah $29,0($26)		!gpdisp!45	 #
	lda $29,0($29)		!gpdisp!45	 #,,
 # ./include/linux/file.h:43: 	if (fd.flags & FDPUT_FPUT)
	blbc $9,$L106	 # _25,
$L126:
 # ./include/linux/file.h:44: 		fput(fd.file);
	ldq $27,fput($29)		!literal!42	 #,,,
	mov $12,$16	 # _24,
	jsr $26,($27),fput		!lituse_jsr!42	 #,,
	ldah $29,0($26)		!gpdisp!43	 #
 # fs/readdir.c:306: SYSCALL_DEFINE3(getdents, unsigned int, fd,
	ldq $9,8($30)	 #,
	bis $31,$31,$31
	mov $11,$0	 # <retval>,
	ldq $26,0($30)	 #,
	ldq $10,16($30)	 #,
	ldq $11,24($30)	 #,
	ldq $12,32($30)	 #,
 # ./include/linux/file.h:44: 		fput(fd.file);
	lda $29,0($29)		!gpdisp!43	 #,,
 # fs/readdir.c:306: SYSCALL_DEFINE3(getdents, unsigned int, fd,
	lda $30,96($30)	 #,,
	.cfi_remember_state
	.cfi_restore 12
	.cfi_restore 11
	.cfi_restore 10
	.cfi_restore 9
	.cfi_restore 26
	.cfi_def_cfa_offset 0
	ret $31,($26),1
	.align 4
$L123:
	.cfi_restore_state
	mov $0,$11	 # error, <retval>
 # ./include/linux/file.h:77: 	if (f.flags & FDPUT_POS_UNLOCK)
	and $9,2,$1	 # _25,, tmp142
	beq $1,$L111	 #, tmp142,
	br $31,$L125	 #
	.align 4
$L124:
 # fs/readdir.c:333: 			error = count - buf.count;
	ldl $11,80($30)	 #, buf.count
 # ./include/linux/file.h:77: 	if (f.flags & FDPUT_POS_UNLOCK)
	and $9,2,$1	 # _25,, tmp142
	subl $10,$11,$11	 # count, buf.count, <retval>
	beq $1,$L111	 #, tmp142,
	br $31,$L125	 #
$L114:
 # fs/readdir.c:323: 		return -EBADF;
	lda $11,-9($31)	 # <retval>,
 # fs/readdir.c:306: SYSCALL_DEFINE3(getdents, unsigned int, fd,
	br $31,$L106	 #
	.cfi_endproc
$LFE3543:
	.end __se_sys_getdents
	.globl sys_getdents
$sys_getdents..ng = $__se_sys_getdents..ng
sys_getdents = __se_sys_getdents
	.align 2
	.align 4
	.globl ksys_getdents64
	.ent ksys_getdents64
ksys_getdents64:
	.frame $30,96,$26,0
	.mask 0x4003e00,-96
$LFB3546:
	.cfi_startproc
	ldah $29,0($27)		!gpdisp!50	 #,,
	lda $29,0($29)		!gpdisp!50	 #,,
$ksys_getdents64..ng:
 # fs/readdir.c:405: 	if (!access_ok(dirent, count))
	zapnot $18,15,$1	 # count, __ao_b
 # fs/readdir.c:395: {
	lda $30,-96($30)	 #,,
	.cfi_def_cfa_offset 96
	stq $9,8($30)	 #,
 # fs/readdir.c:405: 	if (!access_ok(dirent, count))
	cmpult $31,$1,$2	 # __ao_b, tmp105
 # fs/readdir.c:395: {
	stq $10,16($30)	 #,
	.cfi_offset 9, -88
	.cfi_offset 10, -80
 # fs/readdir.c:405: 	if (!access_ok(dirent, count))
	addq $17,$1,$9	 # dirent, __ao_b, tmp103
	subq $9,$2,$9	 # tmp103, tmp105, __ao_end
	bis $17,$1,$1	 # dirent, __ao_b, tmp107
 # fs/readdir.c:395: {
	stq $12,32($30)	 #,
 # fs/readdir.c:405: 	if (!access_ok(dirent, count))
	bis $9,$1,$9	 # __ao_end, tmp107, tmp108
 # fs/readdir.c:395: {
	stq $26,0($30)	 #,
 # fs/readdir.c:398: 	struct getdents_callback64 buf = {
	ldah $1,filldir64($29)		!gprelhigh	 # tmp102,,
 # fs/readdir.c:395: {
	stq $11,24($30)	 #,
 # fs/readdir.c:398: 	struct getdents_callback64 buf = {
	lda $1,filldir64($1)		!gprellow	 # tmp101,, tmp102
 # fs/readdir.c:395: {
	stq $13,40($30)	 #,
	.cfi_offset 12, -64
	.cfi_offset 26, -96
	.cfi_offset 11, -72
	.cfi_offset 13, -56
	.prologue 1
 # fs/readdir.c:395: {
	mov $18,$12	 # tmp132, count
 # fs/readdir.c:398: 	struct getdents_callback64 buf = {
	stq $31,80($30)	 # MEM[(struct getdents_callback64 *)&buf + 8B],
 # fs/readdir.c:406: 		return -EFAULT;
	lda $10,-14($31)	 # <retval>,
 # fs/readdir.c:405: 	if (!access_ok(dirent, count))
	ldq $2,80($8)	 # __current_thread_info.25_5->addr_limit.seg, __current_thread_info.25_5->addr_limit.seg
 # fs/readdir.c:398: 	struct getdents_callback64 buf = {
	stq $31,56($30)	 # MEM[(struct getdents_callback64 *)&buf + 8B],
	stq $31,72($30)	 # MEM[(struct getdents_callback64 *)&buf + 8B],
 # fs/readdir.c:405: 	if (!access_ok(dirent, count))
	and $9,$2,$9	 # tmp108, __current_thread_info.25_5->addr_limit.seg, tmp109
 # fs/readdir.c:398: 	struct getdents_callback64 buf = {
	stq $1,48($30)	 # buf.ctx.actor, tmp101
	stq $17,64($30)	 # buf.current_dir, dirent
	stl $18,80($30)	 # count, buf.count
 # fs/readdir.c:405: 	if (!access_ok(dirent, count))
	bne $9,$L128	 #, tmp109,
 # ./include/linux/file.h:72: 	return __to_fd(__fdget_pos(fd));
	ldq $27,__fdget_pos($29)		!literal!55	 #,,,
 # fs/readdir.c:410: 		return -EBADF;
	lda $10,-9($31)	 # <retval>,
 # ./include/linux/file.h:72: 	return __to_fd(__fdget_pos(fd));
	jsr $26,($27),__fdget_pos		!lituse_jsr!55	 #,,
	ldah $29,0($26)		!gpdisp!56	 #
	lda $29,0($29)		!gpdisp!56	 #,,
 # ./include/linux/file.h:57: 	return (struct fd){(struct file *)(v & ~3),v & 3};
	bic $0,3,$13	 # _38,, _40
 # ./include/linux/file.h:57: 	return (struct fd){(struct file *)(v & ~3),v & 3};
	addl $31,$0,$11	 # _38, _41
 # fs/readdir.c:409: 	if (!f.file)
	beq $13,$L128	 #, _40,
 # fs/readdir.c:412: 	error = iterate_dir(f.file, &buf.ctx);
	lda $17,48($30)	 #,,
	mov $13,$16	 # _40,
	ldq $27,iterate_dir($29)		!literal!57	 #
	jsr $26,($27),0		!lituse_jsr!57
	ldah $29,0($26)		!gpdisp!58
	lda $29,0($29)		!gpdisp!58
 # fs/readdir.c:414: 		error = buf.error;
	ldl $10,84($30)	 # buf.error, buf.error
 # fs/readdir.c:415: 	lastdirent = buf.previous;
	ldq $1,72($30)	 # buf.previous, lastdirent
 # fs/readdir.c:414: 		error = buf.error;
	cmovlt $0,$0,$10	 #, tmp134, tmp134, <retval>
 # fs/readdir.c:416: 	if (lastdirent) {
	beq $1,$L130	 #, lastdirent,
 # fs/readdir.c:418: 		if (__put_user(d_off, &lastdirent->d_off))
	ldq $2,56($30)	 # buf.ctx.pos, buf.ctx.pos
	.set	macro
 # 418 "fs/readdir.c" 1
	1: stq $2,8($1)	 # buf.ctx.pos, MEM[(struct __large_struct *)_15]
2:
.section __ex_table,"a"
	.long 1b-.
	lda $31,2b-1b($9)	 # __pu_err
.previous

 # 0 "" 2
 # fs/readdir.c:419: 			error = -EFAULT;
	.set	nomacro
	.align 3 #realign	 #
	lda $10,-14($31)	 # <retval>,
 # fs/readdir.c:418: 		if (__put_user(d_off, &lastdirent->d_off))
	beq $9,$L144	 #, __pu_err,
$L130:
 # ./include/linux/file.h:77: 	if (f.flags & FDPUT_POS_UNLOCK)
	and $11,2,$1	 # _41,, tmp123
	bne $1,$L145	 #, tmp123,
$L131:
 # ./include/linux/file.h:43: 	if (fd.flags & FDPUT_FPUT)
	bis $31,$31,$31
	blbs $11,$L146	 # _41,
$L128:
 # fs/readdir.c:425: }
	mov $10,$0	 # <retval>,
	ldq $26,0($30)	 #,
	ldq $9,8($30)	 #,
	ldq $10,16($30)	 #,
	ldq $11,24($30)	 #,
	ldq $12,32($30)	 #,
	ldq $13,40($30)	 #,
	bis $31,$31,$31
	lda $30,96($30)	 #,,
	.cfi_remember_state
	.cfi_restore 13
	.cfi_restore 12
	.cfi_restore 11
	.cfi_restore 10
	.cfi_restore 9
	.cfi_restore 26
	.cfi_def_cfa_offset 0
	ret $31,($26),1
	.align 4
$L144:
	.cfi_restore_state
 # fs/readdir.c:421: 			error = count - buf.count;
	ldl $10,80($30)	 #, buf.count
 # ./include/linux/file.h:77: 	if (f.flags & FDPUT_POS_UNLOCK)
	and $11,2,$1	 # _41,, tmp123
 # fs/readdir.c:421: 			error = count - buf.count;
	subl $12,$10,$10	 # count, buf.count, <retval>
 # ./include/linux/file.h:77: 	if (f.flags & FDPUT_POS_UNLOCK)
	beq $1,$L131	 #, tmp123,
$L145:
 # ./include/linux/file.h:78: 		__f_unlock_pos(f.file);
	ldq $27,__f_unlock_pos($29)		!literal!53	 #,,,
	mov $13,$16	 # _40,
	jsr $26,($27),__f_unlock_pos		!lituse_jsr!53	 #,,
	ldah $29,0($26)		!gpdisp!54	 #
	lda $29,0($29)		!gpdisp!54	 #,,
 # ./include/linux/file.h:43: 	if (fd.flags & FDPUT_FPUT)
	blbc $11,$L128	 # _41,
$L146:
 # ./include/linux/file.h:44: 		fput(fd.file);
	ldq $27,fput($29)		!literal!51	 #,,,
	mov $13,$16	 # _40,
	jsr $26,($27),fput		!lituse_jsr!51	 #,,
	ldah $29,0($26)		!gpdisp!52	 #
	lda $29,0($29)		!gpdisp!52	 #,,
	br $31,$L128	 #
	.cfi_endproc
$LFE3546:
	.end ksys_getdents64
	.align 2
	.align 4
	.globl __se_sys_getdents64
	.ent __se_sys_getdents64
__se_sys_getdents64:
	.frame $30,16,$26,0
	.mask 0x4000000,-16
$LFB3547:
	.cfi_startproc
	ldah $29,0($27)		!gpdisp!59	 #,,
	lda $29,0($29)		!gpdisp!59	 #,,
$__se_sys_getdents64..ng:
	lda $30,-16($30)	 #,,
	.cfi_def_cfa_offset 16
 # fs/readdir.c:431: 	return ksys_getdents64(fd, dirent, count);
	addl $31,$18,$18	 # tmp84,
 # fs/readdir.c:428: SYSCALL_DEFINE3(getdents64, unsigned int, fd,
	stq $26,0($30)	 #,
	.cfi_offset 26, -16
	.prologue 1
 # fs/readdir.c:431: 	return ksys_getdents64(fd, dirent, count);
	addl $31,$16,$16	 # tmp83,
	ldq $27,ksys_getdents64($29)		!literal!60	 #
	jsr $26,($27),0		!lituse_jsr!60
	ldah $29,0($26)		!gpdisp!61
	lda $29,0($29)		!gpdisp!61
 # fs/readdir.c:428: SYSCALL_DEFINE3(getdents64, unsigned int, fd,
	ldq $26,0($30)	 #,
	bis $31,$31,$31
	lda $30,16($30)	 #,,
	.cfi_restore 26
	.cfi_def_cfa_offset 0
	ret $31,($26),1
	.cfi_endproc
$LFE3547:
	.end __se_sys_getdents64
	.globl sys_getdents64
$sys_getdents64..ng = $__se_sys_getdents64..ng
sys_getdents64 = __se_sys_getdents64
	.section	.data.once,"aw"
	.type	__warned.38914, @object
	.size	__warned.38914, 1
__warned.38914:
	.zero	1
	.type	__warned.38909, @object
	.size	__warned.38909, 1
__warned.38909:
	.zero	1
	.section	___ksymtab+iterate_dir,"a"
	.align 3
	.type	__ksymtab_iterate_dir, @object
	.size	__ksymtab_iterate_dir, 24
__ksymtab_iterate_dir:
 # value:
	.quad	iterate_dir
 # name:
	.quad	__kstrtab_iterate_dir
 # namespace:
	.quad	0
	.section	__ksymtab_strings,"a"
	.type	__kstrtab_iterate_dir, @object
	.size	__kstrtab_iterate_dir, 12
__kstrtab_iterate_dir:
	.string	"iterate_dir"
	.ident	"GCC: (GNU) 9.2.0"
	.section	.note.GNU-stack,"",@progbits

[-- Attachment #3: readdir.s.objdump --]
[-- Type: text/plain, Size: 32015 bytes --]


fs/readdir.o:     file format elf64-alpha


Disassembly of section .text:

0000000000000000 <iterate_dir>:
   0:	00 00 bb 27 	ldah	gp,0(t12)
   4:	00 00 bd 23 	lda	gp,0(gp)
   8:	c0 ff de 23 	lda	sp,-64(sp)
   c:	1f 04 ff 47 	nop	
  10:	08 00 3e b5 	stq	s0,8(sp)
  14:	09 04 f0 47 	mov	a0,s0
  18:	18 00 7e b5 	stq	s2,24(sp)
  1c:	0b 04 f1 47 	mov	a1,s2
  20:	00 00 5e b7 	stq	ra,0(sp)
  24:	10 00 5e b5 	stq	s1,16(sp)
  28:	20 00 9e b5 	stq	s3,32(sp)
  2c:	28 00 be b5 	stq	s4,40(sp)
  30:	30 00 de b5 	stq	s5,48(sp)
  34:	38 00 fe b5 	stq	fp,56(sp)
  38:	28 00 30 a4 	ldq	t0,40(a0)
  3c:	20 00 90 a5 	ldq	s3,32(a0)
  40:	40 00 41 a4 	ldq	t1,64(t0)
  44:	52 00 40 e4 	beq	t1,190 <iterate_dir+0x190>
  48:	a0 00 ac 21 	lda	s4,160(s3)
  4c:	00 00 7d a7 	ldq	t12,0(gp)
  50:	10 04 ed 47 	mov	s4,a0
  54:	01 00 df 21 	lda	s5,1
  58:	00 40 5b 6b 	jsr	ra,(t12),5c <iterate_dir+0x5c>
  5c:	00 00 ba 27 	ldah	gp,0(ra)
  60:	00 00 bd 23 	lda	gp,0(gp)
  64:	0a 04 e0 47 	mov	v0,s1
  68:	00 00 fe 2f 	unop	
  6c:	32 00 40 f5 	bne	s1,138 <iterate_dir+0x138>
  70:	0c 00 2c a0 	ldl	t0,12(s3)
  74:	fe ff 5f 21 	lda	s1,-2
  78:	01 10 22 44 	and	t0,0x10,t0
  7c:	28 00 20 f4 	bne	t0,120 <iterate_dir+0x120>
  80:	98 00 29 a4 	ldq	t0,152(s0)
  84:	11 04 eb 47 	mov	s2,a1
  88:	10 04 e9 47 	mov	s0,a0
  8c:	08 00 2b b4 	stq	t0,8(s2)
  90:	28 00 29 a4 	ldq	t0,40(s0)
  94:	4e 00 c0 f5 	bne	s5,1d0 <iterate_dir+0x1d0>
  98:	38 00 61 a7 	ldq	t12,56(t0)
  9c:	00 40 5b 6b 	jsr	ra,(t12),a0 <iterate_dir+0xa0>
  a0:	00 00 ba 27 	ldah	gp,0(ra)
  a4:	00 00 bd 23 	lda	gp,0(gp)
  a8:	0a 04 e0 47 	mov	v0,s1
  ac:	1f 04 ff 47 	nop	
  b0:	08 00 2b a4 	ldq	t0,8(s2)
  b4:	20 00 89 a5 	ldq	s3,32(s0)
  b8:	5c 00 49 a0 	ldl	t1,92(s0)
  bc:	00 40 5f 26 	ldah	a2,16384
  c0:	98 00 29 b4 	stq	t0,152(s0)
  c4:	01 00 52 22 	lda	a2,1(a2)
  c8:	00 00 2c a0 	ldl	t0,0(s3)
  cc:	01 00 7f 21 	lda	s2,1
  d0:	82 56 43 48 	srl	t1,0x1a,t1
  d4:	10 00 e9 21 	lda	fp,16(s0)
  d8:	c3 12 20 48 	extwl	t0,0,t2
  dc:	00 f0 3f 20 	lda	t0,-4096
  e0:	01 00 23 44 	and	t0,t2,t0
  e4:	00 c0 21 20 	lda	t0,-16384(t0)
  e8:	8b 04 32 44 	cmoveq	t0,a2,s2
  ec:	40 00 40 e0 	blbc	t1,1f0 <iterate_dir+0x1f0>
  f0:	58 00 29 a0 	ldl	t0,88(s0)
  f4:	81 96 22 48 	srl	t0,0x14,t0
  f8:	00 00 fe 2f 	unop	
  fc:	08 00 20 f0 	blbs	t0,120 <iterate_dir+0x120>
 100:	00 00 7d a7 	ldq	t12,0(gp)
 104:	10 04 ef 47 	mov	fp,a0
 108:	00 40 5b 6b 	jsr	ra,(t12),10c <iterate_dir+0x10c>
 10c:	00 00 ba 27 	ldah	gp,0(ra)
 110:	00 00 bd 23 	lda	gp,0(gp)
 114:	00 00 fe 2f 	unop	
 118:	1f 04 ff 47 	nop	
 11c:	00 00 fe 2f 	unop	
 120:	10 04 ed 47 	mov	s4,a0
 124:	12 00 c0 e5 	beq	s5,170 <iterate_dir+0x170>
 128:	00 00 7d a7 	ldq	t12,0(gp)
 12c:	00 40 5b 6b 	jsr	ra,(t12),130 <iterate_dir+0x130>
 130:	00 00 ba 27 	ldah	gp,0(ra)
 134:	00 00 bd 23 	lda	gp,0(gp)
 138:	00 04 ea 47 	mov	s1,v0
 13c:	00 00 5e a7 	ldq	ra,0(sp)
 140:	08 00 3e a5 	ldq	s0,8(sp)
 144:	10 00 5e a5 	ldq	s1,16(sp)
 148:	18 00 7e a5 	ldq	s2,24(sp)
 14c:	20 00 9e a5 	ldq	s3,32(sp)
 150:	28 00 be a5 	ldq	s4,40(sp)
 154:	30 00 de a5 	ldq	s5,48(sp)
 158:	38 00 fe a5 	ldq	fp,56(sp)
 15c:	1f 04 ff 47 	nop	
 160:	40 00 de 23 	lda	sp,64(sp)
 164:	01 80 fa 6b 	ret
 168:	1f 04 ff 47 	nop	
 16c:	00 00 fe 2f 	unop	
 170:	00 00 7d a7 	ldq	t12,0(gp)
 174:	1f 04 ff 47 	nop	
 178:	00 40 5b 6b 	jsr	ra,(t12),17c <iterate_dir+0x17c>
 17c:	00 00 ba 27 	ldah	gp,0(ra)
 180:	00 00 bd 23 	lda	gp,0(gp)
 184:	ec ff ff c3 	br	138 <iterate_dir+0x138>
 188:	1f 04 ff 47 	nop	
 18c:	00 00 fe 2f 	unop	
 190:	38 00 21 a4 	ldq	t0,56(t0)
 194:	ec ff 5f 21 	lda	s1,-20
 198:	00 00 fe 2f 	unop	
 19c:	e6 ff 3f e4 	beq	t0,138 <iterate_dir+0x138>
 1a0:	a0 00 ac 21 	lda	s4,160(s3)
 1a4:	00 00 7d a7 	ldq	t12,0(gp)
 1a8:	10 04 ed 47 	mov	s4,a0
 1ac:	0e 04 ff 47 	clr	s5
 1b0:	00 40 5b 6b 	jsr	ra,(t12),1b4 <iterate_dir+0x1b4>
 1b4:	00 00 ba 27 	ldah	gp,0(ra)
 1b8:	00 00 bd 23 	lda	gp,0(gp)
 1bc:	0a 04 e0 47 	mov	v0,s1
 1c0:	a9 ff ff c3 	br	68 <iterate_dir+0x68>
 1c4:	00 00 fe 2f 	unop	
 1c8:	1f 04 ff 47 	nop	
 1cc:	00 00 fe 2f 	unop	
 1d0:	40 00 61 a7 	ldq	t12,64(t0)
 1d4:	00 40 5b 6b 	jsr	ra,(t12),1d8 <iterate_dir+0x1d8>
 1d8:	00 00 ba 27 	ldah	gp,0(ra)
 1dc:	00 00 bd 23 	lda	gp,0(gp)
 1e0:	0a 04 e0 47 	mov	v0,s1
 1e4:	b2 ff ff c3 	br	b0 <iterate_dir+0xb0>
 1e8:	1f 04 ff 47 	nop	
 1ec:	00 00 fe 2f 	unop	
 1f0:	18 00 29 a6 	ldq	a1,24(s0)
 1f4:	00 00 7d a7 	ldq	t12,0(gp)
 1f8:	12 04 eb 47 	mov	s2,a2
 1fc:	10 04 ef 47 	mov	fp,a0
 200:	00 40 5b 6b 	jsr	ra,(t12),204 <iterate_dir+0x204>
 204:	00 00 ba 27 	ldah	gp,0(ra)
 208:	00 00 bd 23 	lda	gp,0(gp)
 20c:	b8 ff 1f f4 	bne	v0,f0 <iterate_dir+0xf0>
 210:	00 00 7d a7 	ldq	t12,0(gp)
 214:	15 04 ff 47 	clr	a5
 218:	14 04 ff 47 	clr	a4
 21c:	01 00 7f 22 	lda	a3,1
 220:	12 04 ef 47 	mov	fp,a2
 224:	11 04 eb 47 	mov	s2,a1
 228:	10 04 ec 47 	mov	s3,a0
 22c:	00 40 5b 6b 	jsr	ra,(t12),230 <iterate_dir+0x230>
 230:	00 00 ba 27 	ldah	gp,0(ra)
 234:	00 00 bd 23 	lda	gp,0(gp)
 238:	ad ff ff c3 	br	f0 <iterate_dir+0xf0>
 23c:	00 00 fe 2f 	unop	

0000000000000240 <fillonedir>:
 240:	00 00 bb 27 	ldah	gp,0(t12)
 244:	00 00 bd 23 	lda	gp,0(gp)
 248:	d0 ff de 23 	lda	sp,-48(sp)
 24c:	1f 04 ff 47 	nop	
 250:	18 00 7e b5 	stq	s2,24(sp)
 254:	0b 04 f0 47 	mov	a0,s2
 258:	20 00 9e b5 	stq	s3,32(sp)
 25c:	0c 04 f2 47 	mov	a2,s3
 260:	00 00 5e b7 	stq	ra,0(sp)
 264:	08 00 3e b5 	stq	s0,8(sp)
 268:	10 00 5e b5 	stq	s1,16(sp)
 26c:	18 00 50 a1 	ldl	s1,24(a0)
 270:	3f 00 40 f5 	bne	s1,370 <fillonedir+0x130>
 274:	10 00 50 a4 	ldq	t1,16(a0)
 278:	01 00 32 20 	lda	t0,1(a2)
 27c:	01 00 7f 20 	lda	t2,1
 280:	18 00 70 b0 	stl	t2,24(a0)
 284:	12 00 22 21 	lda	s0,18(t1)
 288:	01 04 21 41 	addq	s0,t0,t0
 28c:	50 00 88 a4 	ldq	t3,80(t7)
 290:	23 05 22 40 	subq	t0,t1,t2
 294:	a5 03 e3 43 	cmpult	zero,t2,t4
 298:	03 04 43 44 	or	t1,t2,t2
 29c:	21 05 25 40 	subq	t0,t4,t0
 2a0:	01 04 23 44 	or	t0,t2,t0
 2a4:	01 00 24 44 	and	t0,t3,t0
 2a8:	2d 00 20 f4 	bne	t0,360 <fillonedir+0x120>
 2ac:	01 04 ea 47 	mov	s1,t0
 2b0:	00 00 82 b6 	stq	a4,0(t1)
 2b4:	2a 00 20 f4 	bne	t0,360 <fillonedir+0x120>
 2b8:	01 04 ea 47 	mov	s1,t0
 2bc:	08 00 62 b6 	stq	a3,8(t1)
 2c0:	27 00 20 f4 	bne	t0,360 <fillonedir+0x120>
 2c4:	00 00 fe 2f 	unop	
 2c8:	10 00 42 20 	lda	t1,16(t1)
 2cc:	01 04 ea 47 	mov	s1,t0
 2d0:	23 76 40 4a 	zapnot	a2,0x3,t2
 2d4:	01 00 a2 2c 	ldq_u	t4,1(t1)
 2d8:	00 00 82 2c 	ldq_u	t3,0(t1)
 2dc:	e7 0a 62 48 	inswh	t2,t1,t6
 2e0:	66 03 62 48 	inswl	t2,t1,t5
 2e4:	45 0a a2 48 	mskwh	t4,t1,t4
 2e8:	44 02 82 48 	mskwl	t3,t1,t3
 2ec:	05 04 a7 44 	or	t4,t6,t4
 2f0:	04 04 86 44 	or	t3,t5,t3
 2f4:	01 00 a2 3c 	stq_u	t4,1(t1)
 2f8:	00 00 82 3c 	stq_u	t3,0(t1)
 2fc:	18 00 20 f4 	bne	t0,360 <fillonedir+0x120>
 300:	00 00 7d a7 	ldq	t12,0(gp)
 304:	10 04 e9 47 	mov	s0,a0
 308:	00 40 5b 6b 	jsr	ra,(t12),30c <fillonedir+0xcc>
 30c:	00 00 ba 27 	ldah	gp,0(ra)
 310:	00 00 bd 23 	lda	gp,0(gp)
 314:	12 00 00 f4 	bne	v0,360 <fillonedir+0x120>
 318:	09 04 2c 41 	addq	s0,s3,s0
 31c:	01 04 ea 47 	mov	s1,t0
 320:	00 00 49 2c 	ldq_u	t1,0(s0)
 324:	63 01 49 49 	insbl	s1,s0,t2
 328:	42 00 49 48 	mskbl	t1,s0,t1
 32c:	02 04 43 44 	or	t1,t2,t1
 330:	00 00 49 3c 	stq_u	t1,0(s0)
 334:	0a 00 20 f4 	bne	t0,360 <fillonedir+0x120>
 338:	00 04 ea 47 	mov	s1,v0
 33c:	00 00 5e a7 	ldq	ra,0(sp)
 340:	08 00 3e a5 	ldq	s0,8(sp)
 344:	10 00 5e a5 	ldq	s1,16(sp)
 348:	18 00 7e a5 	ldq	s2,24(sp)
 34c:	20 00 9e a5 	ldq	s3,32(sp)
 350:	30 00 de 23 	lda	sp,48(sp)
 354:	01 80 fa 6b 	ret
 358:	1f 04 ff 47 	nop	
 35c:	00 00 fe 2f 	unop	
 360:	f2 ff 3f 20 	lda	t0,-14
 364:	f2 ff 5f 21 	lda	s1,-14
 368:	18 00 2b b0 	stl	t0,24(s2)
 36c:	f2 ff ff c3 	br	338 <fillonedir+0xf8>
 370:	ea ff 5f 21 	lda	s1,-22
 374:	f0 ff ff c3 	br	338 <fillonedir+0xf8>
 378:	1f 04 ff 47 	nop	
 37c:	00 00 fe 2f 	unop	

0000000000000380 <verify_dirent_name>:
 380:	00 00 bb 27 	ldah	gp,0(t12)
 384:	00 00 bd 23 	lda	gp,0(gp)
 388:	e0 ff de 23 	lda	sp,-32(sp)
 38c:	12 04 f1 47 	mov	a1,a2
 390:	00 00 5e b7 	stq	ra,0(sp)
 394:	0a 00 20 e6 	beq	a1,3c0 <verify_dirent_name+0x40>
 398:	00 00 7d a7 	ldq	t12,0(gp)
 39c:	2f 00 3f 22 	lda	a1,47
 3a0:	00 40 5b 6b 	jsr	ra,(t12),3a4 <verify_dirent_name+0x24>
 3a4:	00 00 ba 27 	ldah	gp,0(ra)
 3a8:	00 00 bd 23 	lda	gp,0(gp)
 3ac:	1c 00 00 f4 	bne	v0,420 <verify_dirent_name+0xa0>
 3b0:	00 00 5e a7 	ldq	ra,0(sp)
 3b4:	20 00 de 23 	lda	sp,32(sp)
 3b8:	01 80 fa 6b 	ret
 3bc:	00 00 fe 2f 	unop	
 3c0:	00 00 3d 24 	ldah	t0,0(gp)
 3c4:	fb ff 1f 20 	lda	v0,-5
 3c8:	00 00 41 2c 	ldq_u	t1,0(t0)
 3cc:	00 00 61 20 	lda	t2,0(t0)
 3d0:	c4 00 43 48 	extbl	t1,t2,t3
 3d4:	f6 ff 9f f4 	bne	t3,3b0 <verify_dirent_name+0x30>
 3d8:	01 00 9f 20 	lda	t3,1
 3dc:	13 04 ff 47 	clr	a3
 3e0:	42 00 43 48 	mskbl	t1,t2,t1
 3e4:	09 00 5f 22 	lda	a2,9
 3e8:	63 01 83 48 	insbl	t3,t2,t2
 3ec:	94 00 3f 22 	lda	a1,148
 3f0:	03 04 62 44 	or	t2,t1,t2
 3f4:	00 00 61 3c 	stq_u	t2,0(t0)
 3f8:	00 00 1d 26 	ldah	a0,0(gp)
 3fc:	00 00 7d a7 	ldq	t12,0(gp)
 400:	10 00 1e b4 	stq	v0,16(sp)
 404:	00 00 10 22 	lda	a0,0(a0)
 408:	00 40 5b 6b 	jsr	ra,(t12),40c <verify_dirent_name+0x8c>
 40c:	00 00 ba 27 	ldah	gp,0(ra)
 410:	00 00 bd 23 	lda	gp,0(gp)
 414:	10 00 1e a4 	ldq	v0,16(sp)
 418:	e5 ff ff c3 	br	3b0 <verify_dirent_name+0x30>
 41c:	00 00 fe 2f 	unop	
 420:	00 00 3d 24 	ldah	t0,0(gp)
 424:	fb ff 1f 20 	lda	v0,-5
 428:	00 00 41 2c 	ldq_u	t1,0(t0)
 42c:	00 00 61 20 	lda	t2,0(t0)
 430:	c4 00 43 48 	extbl	t1,t2,t3
 434:	de ff 9f f4 	bne	t3,3b0 <verify_dirent_name+0x30>
 438:	01 00 9f 20 	lda	t3,1
 43c:	13 04 ff 47 	clr	a3
 440:	42 00 43 48 	mskbl	t1,t2,t1
 444:	09 00 5f 22 	lda	a2,9
 448:	63 01 83 48 	insbl	t3,t2,t2
 44c:	96 00 3f 22 	lda	a1,150
 450:	03 04 62 44 	or	t2,t1,t2
 454:	00 00 61 3c 	stq_u	t2,0(t0)
 458:	e7 ff ff c3 	br	3f8 <verify_dirent_name+0x78>
 45c:	00 00 fe 2f 	unop	

0000000000000460 <filldir>:
 460:	00 00 bb 27 	ldah	gp,0(t12)
 464:	00 00 bd 23 	lda	gp,0(gp)
 468:	c0 ff de 23 	lda	sp,-64(sp)
 46c:	1f 04 ff 47 	nop	
 470:	08 00 3e b5 	stq	s0,8(sp)
 474:	09 04 f1 47 	mov	a1,s0
 478:	28 00 be b5 	stq	s4,40(sp)
 47c:	0d 70 43 42 	addl	a2,0x1b,s4
 480:	18 00 7e b5 	stq	s2,24(sp)
 484:	0d f1 a0 45 	andnot	s4,0x7,s4
 488:	0b 04 f0 47 	mov	a0,s2
 48c:	11 04 f2 47 	mov	a2,a1
 490:	10 04 e9 47 	mov	s0,a0
 494:	0d 00 ed 43 	sextl	s4,s4
 498:	10 00 5e b5 	stq	s1,16(sp)
 49c:	0a 04 f2 47 	mov	a2,s1
 4a0:	20 00 9e b5 	stq	s3,32(sp)
 4a4:	0c 04 f5 47 	mov	a5,s3
 4a8:	30 00 de b5 	stq	s5,48(sp)
 4ac:	0e 04 f4 47 	mov	a4,s5
 4b0:	38 00 fe b5 	stq	fp,56(sp)
 4b4:	0f 04 f3 47 	mov	a3,fp
 4b8:	00 00 5e b7 	stq	ra,0(sp)
 4bc:	00 00 7d a7 	ldq	t12,0(gp)
 4c0:	00 40 5b 6b 	jsr	ra,(t12),4c4 <filldir+0x64>
 4c4:	00 00 ba 27 	ldah	gp,0(ra)
 4c8:	00 00 bd 23 	lda	gp,0(gp)
 4cc:	9c 00 00 f4 	bne	v0,740 <filldir+0x2e0>
 4d0:	ea ff 3f 20 	lda	t0,-22
 4d4:	20 00 ab a0 	ldl	t4,32(s2)
 4d8:	24 00 2b b0 	stl	t0,36(s2)
 4dc:	1f 04 ff 47 	nop	
 4e0:	a1 09 ad 40 	cmplt	t4,s4,t0
 4e4:	9c 00 20 f4 	bne	t0,758 <filldir+0x2f8>
 4e8:	18 00 4b a4 	ldq	t1,24(s2)
 4ec:	50 00 40 e4 	beq	t1,630 <filldir+0x1d0>
 4f0:	40 00 28 a4 	ldq	t0,64(t7)
 4f4:	08 00 21 a4 	ldq	t0,8(t0)
 4f8:	48 00 21 a0 	ldl	t0,72(t0)
 4fc:	01 90 20 44 	and	t0,0x4,t0
 500:	a1 03 e1 43 	cmpult	zero,t0,t0
 504:	92 00 20 f4 	bne	t0,750 <filldir+0x2f0>
 508:	17 00 22 20 	lda	t0,23(t1)
 50c:	50 00 68 a4 	ldq	t2,80(t7)
 510:	01 04 22 44 	or	t0,t1,t0
 514:	01 14 23 44 	or	t0,0x18,t0
 518:	01 00 23 44 	and	t0,t2,t0
 51c:	36 00 20 f4 	bne	t0,5f8 <filldir+0x198>
 520:	01 04 e0 47 	mov	v0,t0
 524:	08 00 e2 b5 	stq	fp,8(t1)
 528:	33 00 20 f4 	bne	t0,5f8 <filldir+0x198>
 52c:	00 00 fe 2f 	unop	
 530:	10 00 cb a4 	ldq	t5,16(s2)
 534:	01 04 ff 47 	clr	t0
 538:	00 00 c6 b5 	stq	s5,0(t5)
 53c:	2e 00 20 f4 	bne	t0,5f8 <filldir+0x198>
 540:	23 76 a0 49 	zapnot	s4,0x3,t2
 544:	10 00 46 20 	lda	t1,16(t5)
 548:	01 00 e2 2c 	ldq_u	t6,1(t1)
 54c:	00 00 82 2c 	ldq_u	t3,0(t1)
 550:	f7 0a 62 48 	inswh	t2,t1,t9
 554:	76 03 62 48 	inswl	t2,t1,t8
 558:	47 0a e2 48 	mskwh	t6,t1,t6
 55c:	44 02 82 48 	mskwl	t3,t1,t3
 560:	07 04 f7 44 	or	t6,t9,t6
 564:	04 04 96 44 	or	t3,t8,t3
 568:	01 00 e2 3c 	stq_u	t6,1(t1)
 56c:	00 00 82 3c 	stq_u	t3,0(t1)
 570:	21 00 20 f4 	bne	t0,5f8 <filldir+0x198>
 574:	00 00 fe 2f 	unop	
 578:	2c 17 87 49 	sll	s3,0x38,s3
 57c:	ff ff 4d 20 	lda	t1,-1(s4)
 580:	8c 17 87 49 	sra	s3,0x38,s3
 584:	02 04 c2 40 	addq	t5,t1,t1
 588:	00 00 62 2c 	ldq_u	t2,0(t1)
 58c:	64 01 82 49 	insbl	s3,t1,t3
 590:	43 00 62 48 	mskbl	t2,t1,t2
 594:	03 04 64 44 	or	t2,t3,t2
 598:	00 00 62 3c 	stq_u	t2,0(t1)
 59c:	16 00 20 f4 	bne	t0,5f8 <filldir+0x198>
 5a0:	12 04 ea 47 	mov	s1,a2
 5a4:	12 00 46 20 	lda	t1,18(t5)
 5a8:	a1 f7 40 41 	cmpule	s1,0x7,t0
 5ac:	24 00 20 f4 	bne	t0,640 <filldir+0x1e0>
 5b0:	07 04 ff 47 	clr	t6
 5b4:	06 00 e0 c3 	br	5d0 <filldir+0x170>
 5b8:	1f 04 ff 47 	nop	
 5bc:	00 00 fe 2f 	unop	
 5c0:	08 00 42 20 	lda	t1,8(t1)
 5c4:	08 00 29 21 	lda	s0,8(s0)
 5c8:	a1 f7 40 42 	cmpule	a2,0x7,t0
 5cc:	1c 00 20 f4 	bne	t0,640 <filldir+0x1e0>
 5d0:	00 00 29 2c 	ldq_u	t0,0(s0)
 5d4:	07 00 89 2c 	ldq_u	t3,7(s0)
 5d8:	03 04 e7 47 	mov	t6,t2
 5dc:	c1 06 29 48 	extql	t0,s0,t0
 5e0:	44 0f 89 48 	extqh	t3,s0,t3
 5e4:	01 04 24 44 	or	t0,t3,t0
 5e8:	00 00 22 b4 	stq	t0,0(t1)
 5ec:	00 00 fe 2f 	unop	
 5f0:	f8 ff 52 22 	lda	a2,-8(a2)
 5f4:	f2 ff 7f e4 	beq	t2,5c0 <filldir+0x160>
 5f8:	f2 ff 3f 20 	lda	t0,-14
 5fc:	f2 ff 1f 20 	lda	v0,-14
 600:	24 00 2b b0 	stl	t0,36(s2)
 604:	1f 04 ff 47 	nop	
 608:	00 00 5e a7 	ldq	ra,0(sp)
 60c:	08 00 3e a5 	ldq	s0,8(sp)
 610:	10 00 5e a5 	ldq	s1,16(sp)
 614:	18 00 7e a5 	ldq	s2,24(sp)
 618:	20 00 9e a5 	ldq	s3,32(sp)
 61c:	28 00 be a5 	ldq	s4,40(sp)
 620:	30 00 de a5 	ldq	s5,48(sp)
 624:	38 00 fe a5 	ldq	fp,56(sp)
 628:	40 00 de 23 	lda	sp,64(sp)
 62c:	01 80 fa 6b 	ret
 630:	50 00 28 a4 	ldq	t0,80(t7)
 634:	01 f0 23 44 	and	t0,0x1f,t0
 638:	bd ff 3f e4 	beq	t0,530 <filldir+0xd0>
 63c:	ee ff ff c3 	br	5f8 <filldir+0x198>
 640:	a1 77 40 42 	cmpule	a2,0x3,t0
 644:	0c 00 20 f4 	bne	t0,678 <filldir+0x218>
 648:	00 00 29 2c 	ldq_u	t0,0(s0)
 64c:	03 00 89 2c 	ldq_u	t3,3(s0)
 650:	03 04 ff 47 	clr	t2
 654:	c1 04 29 48 	extll	t0,s0,t0
 658:	44 0d 89 48 	extlh	t3,s0,t3
 65c:	01 04 24 44 	or	t0,t3,t0
 660:	00 00 22 b0 	stl	t0,0(t1)
 664:	e4 ff 7f f4 	bne	t2,5f8 <filldir+0x198>
 668:	04 00 42 20 	lda	t1,4(t1)
 66c:	04 00 29 21 	lda	s0,4(s0)
 670:	fc ff 52 22 	lda	a2,-4(a2)
 674:	1f 04 ff 47 	nop	
 678:	a1 37 40 42 	cmpule	a2,0x1,t0
 67c:	15 00 20 f4 	bne	t0,6d4 <filldir+0x274>
 680:	00 00 29 2c 	ldq_u	t0,0(s0)
 684:	01 00 89 2c 	ldq_u	t3,1(s0)
 688:	03 04 ff 47 	clr	t2
 68c:	c1 02 29 48 	extwl	t0,s0,t0
 690:	44 0b 89 48 	extwh	t3,s0,t3
 694:	01 04 24 44 	or	t0,t3,t0
 698:	21 76 20 48 	zapnot	t0,0x3,t0
 69c:	01 00 e2 2c 	ldq_u	t6,1(t1)
 6a0:	00 00 82 2c 	ldq_u	t3,0(t1)
 6a4:	f7 0a 22 48 	inswh	t0,t1,t9
 6a8:	76 03 22 48 	inswl	t0,t1,t8
 6ac:	47 0a e2 48 	mskwh	t6,t1,t6
 6b0:	44 02 82 48 	mskwl	t3,t1,t3
 6b4:	07 04 f7 44 	or	t6,t9,t6
 6b8:	04 04 96 44 	or	t3,t8,t3
 6bc:	01 00 e2 3c 	stq_u	t6,1(t1)
 6c0:	00 00 82 3c 	stq_u	t3,0(t1)
 6c4:	cc ff 7f f4 	bne	t2,5f8 <filldir+0x198>
 6c8:	02 00 42 20 	lda	t1,2(t1)
 6cc:	02 00 29 21 	lda	s0,2(s0)
 6d0:	fe ff 52 22 	lda	a2,-2(a2)
 6d4:	0a 00 40 e6 	beq	a2,700 <filldir+0x2a0>
 6d8:	00 00 69 2c 	ldq_u	t2,0(s0)
 6dc:	01 04 ff 47 	clr	t0
 6e0:	c9 00 69 48 	extbl	t2,s0,s0
 6e4:	00 00 62 2c 	ldq_u	t2,0(t1)
 6e8:	64 01 22 49 	insbl	s0,t1,t3
 6ec:	43 00 62 48 	mskbl	t2,t1,t2
 6f0:	03 04 64 44 	or	t2,t3,t2
 6f4:	00 00 62 3c 	stq_u	t2,0(t1)
 6f8:	bf ff 3f f4 	bne	t0,5f8 <filldir+0x198>
 6fc:	01 00 42 20 	lda	t1,1(t1)
 700:	01 04 ff 47 	clr	t0
 704:	00 00 62 2c 	ldq_u	t2,0(t1)
 708:	64 01 22 48 	insbl	t0,t1,t3
 70c:	43 00 62 48 	mskbl	t2,t1,t2
 710:	03 04 64 44 	or	t2,t3,t2
 714:	00 00 62 3c 	stq_u	t2,0(t1)
 718:	b7 ff 3f f4 	bne	t0,5f8 <filldir+0x198>
 71c:	00 00 fe 2f 	unop	
 720:	01 04 cd 40 	addq	t5,s4,t0
 724:	25 01 ad 40 	subl	t4,s4,t4
 728:	18 00 cb b4 	stq	t5,24(s2)
 72c:	10 00 2b b4 	stq	t0,16(s2)
 730:	20 00 ab b0 	stl	t4,32(s2)
 734:	b4 ff ff c3 	br	608 <filldir+0x1a8>
 738:	1f 04 ff 47 	nop	
 73c:	00 00 fe 2f 	unop	
 740:	24 00 0b b0 	stl	v0,36(s2)
 744:	b0 ff ff c3 	br	608 <filldir+0x1a8>
 748:	1f 04 ff 47 	nop	
 74c:	00 00 fe 2f 	unop	
 750:	fc ff 1f 20 	lda	v0,-4
 754:	ac ff ff c3 	br	608 <filldir+0x1a8>
 758:	ea ff 1f 20 	lda	v0,-22
 75c:	aa ff ff c3 	br	608 <filldir+0x1a8>

0000000000000760 <filldir64>:
 760:	00 00 bb 27 	ldah	gp,0(t12)
 764:	00 00 bd 23 	lda	gp,0(gp)
 768:	c0 ff de 23 	lda	sp,-64(sp)
 76c:	1f 04 ff 47 	nop	
 770:	08 00 3e b5 	stq	s0,8(sp)
 774:	09 04 f1 47 	mov	a1,s0
 778:	20 00 9e b5 	stq	s3,32(sp)
 77c:	0c 70 43 42 	addl	a2,0x1b,s3
 780:	18 00 7e b5 	stq	s2,24(sp)
 784:	0c f1 80 45 	andnot	s3,0x7,s3
 788:	0b 04 f0 47 	mov	a0,s2
 78c:	11 04 f2 47 	mov	a2,a1
 790:	10 04 e9 47 	mov	s0,a0
 794:	0c 00 ec 43 	sextl	s3,s3
 798:	10 00 5e b5 	stq	s1,16(sp)
 79c:	0a 04 f2 47 	mov	a2,s1
 7a0:	28 00 be b5 	stq	s4,40(sp)
 7a4:	0d 04 f5 47 	mov	a5,s4
 7a8:	30 00 de b5 	stq	s5,48(sp)
 7ac:	0e 04 f4 47 	mov	a4,s5
 7b0:	38 00 fe b5 	stq	fp,56(sp)
 7b4:	0f 04 f3 47 	mov	a3,fp
 7b8:	00 00 5e b7 	stq	ra,0(sp)
 7bc:	00 00 7d a7 	ldq	t12,0(gp)
 7c0:	00 40 5b 6b 	jsr	ra,(t12),7c4 <filldir64+0x64>
 7c4:	00 00 ba 27 	ldah	gp,0(ra)
 7c8:	00 00 bd 23 	lda	gp,0(gp)
 7cc:	98 00 00 f4 	bne	v0,a30 <filldir64+0x2d0>
 7d0:	ea ff 3f 20 	lda	t0,-22
 7d4:	20 00 ab a0 	ldl	t4,32(s2)
 7d8:	24 00 2b b0 	stl	t0,36(s2)
 7dc:	1f 04 ff 47 	nop	
 7e0:	a1 09 ac 40 	cmplt	t4,s3,t0
 7e4:	98 00 20 f4 	bne	t0,a48 <filldir64+0x2e8>
 7e8:	18 00 4b a4 	ldq	t1,24(s2)
 7ec:	4c 00 40 e4 	beq	t1,920 <filldir64+0x1c0>
 7f0:	40 00 28 a4 	ldq	t0,64(t7)
 7f4:	08 00 21 a4 	ldq	t0,8(t0)
 7f8:	48 00 21 a0 	ldl	t0,72(t0)
 7fc:	01 90 20 44 	and	t0,0x4,t0
 800:	a1 03 e1 43 	cmpult	zero,t0,t0
 804:	8e 00 20 f4 	bne	t0,a40 <filldir64+0x2e0>
 808:	17 00 22 20 	lda	t0,23(t1)
 80c:	50 00 68 a4 	ldq	t2,80(t7)
 810:	01 04 22 44 	or	t0,t1,t0
 814:	01 14 23 44 	or	t0,0x18,t0
 818:	01 00 23 44 	and	t0,t2,t0
 81c:	32 00 20 f4 	bne	t0,8e8 <filldir64+0x188>
 820:	01 04 e0 47 	mov	v0,t0
 824:	08 00 e2 b5 	stq	fp,8(t1)
 828:	2f 00 20 f4 	bne	t0,8e8 <filldir64+0x188>
 82c:	00 00 fe 2f 	unop	
 830:	10 00 cb a4 	ldq	t5,16(s2)
 834:	01 04 ff 47 	clr	t0
 838:	00 00 c6 b5 	stq	s5,0(t5)
 83c:	2a 00 20 f4 	bne	t0,8e8 <filldir64+0x188>
 840:	23 76 80 49 	zapnot	s3,0x3,t2
 844:	10 00 46 20 	lda	t1,16(t5)
 848:	01 00 e2 2c 	ldq_u	t6,1(t1)
 84c:	00 00 82 2c 	ldq_u	t3,0(t1)
 850:	f7 0a 62 48 	inswh	t2,t1,t9
 854:	76 03 62 48 	inswl	t2,t1,t8
 858:	47 0a e2 48 	mskwh	t6,t1,t6
 85c:	44 02 82 48 	mskwl	t3,t1,t3
 860:	07 04 f7 44 	or	t6,t9,t6
 864:	04 04 96 44 	or	t3,t8,t3
 868:	01 00 e2 3c 	stq_u	t6,1(t1)
 86c:	00 00 82 3c 	stq_u	t3,0(t1)
 870:	1d 00 20 f4 	bne	t0,8e8 <filldir64+0x188>
 874:	00 00 fe 2f 	unop	
 878:	0d f0 bf 45 	and	s4,0xff,s4
 87c:	12 00 46 20 	lda	t1,18(t5)
 880:	00 00 62 2c 	ldq_u	t2,0(t1)
 884:	64 01 a2 49 	insbl	s4,t1,t3
 888:	43 00 62 48 	mskbl	t2,t1,t2
 88c:	03 04 64 44 	or	t2,t3,t2
 890:	00 00 62 3c 	stq_u	t2,0(t1)
 894:	14 00 20 f4 	bne	t0,8e8 <filldir64+0x188>
 898:	12 04 ea 47 	mov	s1,a2
 89c:	13 00 46 20 	lda	t1,19(t5)
 8a0:	a1 f7 40 41 	cmpule	s1,0x7,t0
 8a4:	22 00 20 f4 	bne	t0,930 <filldir64+0x1d0>
 8a8:	07 04 ff 47 	clr	t6
 8ac:	04 00 e0 c3 	br	8c0 <filldir64+0x160>
 8b0:	08 00 42 20 	lda	t1,8(t1)
 8b4:	08 00 29 21 	lda	s0,8(s0)
 8b8:	a1 f7 40 42 	cmpule	a2,0x7,t0
 8bc:	1c 00 20 f4 	bne	t0,930 <filldir64+0x1d0>
 8c0:	00 00 29 2c 	ldq_u	t0,0(s0)
 8c4:	07 00 89 2c 	ldq_u	t3,7(s0)
 8c8:	03 04 e7 47 	mov	t6,t2
 8cc:	c1 06 29 48 	extql	t0,s0,t0
 8d0:	44 0f 89 48 	extqh	t3,s0,t3
 8d4:	01 04 24 44 	or	t0,t3,t0
 8d8:	00 00 22 b4 	stq	t0,0(t1)
 8dc:	00 00 fe 2f 	unop	
 8e0:	f8 ff 52 22 	lda	a2,-8(a2)
 8e4:	f2 ff 7f e4 	beq	t2,8b0 <filldir64+0x150>
 8e8:	f2 ff 3f 20 	lda	t0,-14
 8ec:	f2 ff 1f 20 	lda	v0,-14
 8f0:	24 00 2b b0 	stl	t0,36(s2)
 8f4:	1f 04 ff 47 	nop	
 8f8:	00 00 5e a7 	ldq	ra,0(sp)
 8fc:	08 00 3e a5 	ldq	s0,8(sp)
 900:	10 00 5e a5 	ldq	s1,16(sp)
 904:	18 00 7e a5 	ldq	s2,24(sp)
 908:	20 00 9e a5 	ldq	s3,32(sp)
 90c:	28 00 be a5 	ldq	s4,40(sp)
 910:	30 00 de a5 	ldq	s5,48(sp)
 914:	38 00 fe a5 	ldq	fp,56(sp)
 918:	40 00 de 23 	lda	sp,64(sp)
 91c:	01 80 fa 6b 	ret
 920:	50 00 28 a4 	ldq	t0,80(t7)
 924:	01 f0 23 44 	and	t0,0x1f,t0
 928:	c1 ff 3f e4 	beq	t0,830 <filldir64+0xd0>
 92c:	ee ff ff c3 	br	8e8 <filldir64+0x188>
 930:	a1 77 40 42 	cmpule	a2,0x3,t0
 934:	0c 00 20 f4 	bne	t0,968 <filldir64+0x208>
 938:	00 00 29 2c 	ldq_u	t0,0(s0)
 93c:	03 00 89 2c 	ldq_u	t3,3(s0)
 940:	03 04 ff 47 	clr	t2
 944:	c1 04 29 48 	extll	t0,s0,t0
 948:	44 0d 89 48 	extlh	t3,s0,t3
 94c:	01 04 24 44 	or	t0,t3,t0
 950:	00 00 22 b0 	stl	t0,0(t1)
 954:	e4 ff 7f f4 	bne	t2,8e8 <filldir64+0x188>
 958:	04 00 42 20 	lda	t1,4(t1)
 95c:	04 00 29 21 	lda	s0,4(s0)
 960:	fc ff 52 22 	lda	a2,-4(a2)
 964:	1f 04 ff 47 	nop	
 968:	a1 37 40 42 	cmpule	a2,0x1,t0
 96c:	15 00 20 f4 	bne	t0,9c4 <filldir64+0x264>
 970:	00 00 29 2c 	ldq_u	t0,0(s0)
 974:	01 00 89 2c 	ldq_u	t3,1(s0)
 978:	03 04 ff 47 	clr	t2
 97c:	c1 02 29 48 	extwl	t0,s0,t0
 980:	44 0b 89 48 	extwh	t3,s0,t3
 984:	01 04 24 44 	or	t0,t3,t0
 988:	21 76 20 48 	zapnot	t0,0x3,t0
 98c:	01 00 e2 2c 	ldq_u	t6,1(t1)
 990:	00 00 82 2c 	ldq_u	t3,0(t1)
 994:	f7 0a 22 48 	inswh	t0,t1,t9
 998:	76 03 22 48 	inswl	t0,t1,t8
 99c:	47 0a e2 48 	mskwh	t6,t1,t6
 9a0:	44 02 82 48 	mskwl	t3,t1,t3
 9a4:	07 04 f7 44 	or	t6,t9,t6
 9a8:	04 04 96 44 	or	t3,t8,t3
 9ac:	01 00 e2 3c 	stq_u	t6,1(t1)
 9b0:	00 00 82 3c 	stq_u	t3,0(t1)
 9b4:	cc ff 7f f4 	bne	t2,8e8 <filldir64+0x188>
 9b8:	02 00 42 20 	lda	t1,2(t1)
 9bc:	02 00 29 21 	lda	s0,2(s0)
 9c0:	fe ff 52 22 	lda	a2,-2(a2)
 9c4:	0a 00 40 e6 	beq	a2,9f0 <filldir64+0x290>
 9c8:	00 00 69 2c 	ldq_u	t2,0(s0)
 9cc:	01 04 ff 47 	clr	t0
 9d0:	c9 00 69 48 	extbl	t2,s0,s0
 9d4:	00 00 62 2c 	ldq_u	t2,0(t1)
 9d8:	64 01 22 49 	insbl	s0,t1,t3
 9dc:	43 00 62 48 	mskbl	t2,t1,t2
 9e0:	03 04 64 44 	or	t2,t3,t2
 9e4:	00 00 62 3c 	stq_u	t2,0(t1)
 9e8:	bf ff 3f f4 	bne	t0,8e8 <filldir64+0x188>
 9ec:	01 00 42 20 	lda	t1,1(t1)
 9f0:	01 04 ff 47 	clr	t0
 9f4:	00 00 62 2c 	ldq_u	t2,0(t1)
 9f8:	64 01 22 48 	insbl	t0,t1,t3
 9fc:	43 00 62 48 	mskbl	t2,t1,t2
 a00:	03 04 64 44 	or	t2,t3,t2
 a04:	00 00 62 3c 	stq_u	t2,0(t1)
 a08:	b7 ff 3f f4 	bne	t0,8e8 <filldir64+0x188>
 a0c:	00 00 fe 2f 	unop	
 a10:	01 04 cc 40 	addq	t5,s3,t0
 a14:	25 01 ac 40 	subl	t4,s3,t4
 a18:	18 00 cb b4 	stq	t5,24(s2)
 a1c:	10 00 2b b4 	stq	t0,16(s2)
 a20:	20 00 ab b0 	stl	t4,32(s2)
 a24:	b4 ff ff c3 	br	8f8 <filldir64+0x198>
 a28:	1f 04 ff 47 	nop	
 a2c:	00 00 fe 2f 	unop	
 a30:	24 00 0b b0 	stl	v0,36(s2)
 a34:	b0 ff ff c3 	br	8f8 <filldir64+0x198>
 a38:	1f 04 ff 47 	nop	
 a3c:	00 00 fe 2f 	unop	
 a40:	fc ff 1f 20 	lda	v0,-4
 a44:	ac ff ff c3 	br	8f8 <filldir64+0x198>
 a48:	ea ff 1f 20 	lda	v0,-22
 a4c:	aa ff ff c3 	br	8f8 <filldir64+0x198>

0000000000000a50 <__se_sys_old_readdir>:
 a50:	00 00 bb 27 	ldah	gp,0(t12)
 a54:	00 00 bd 23 	lda	gp,0(gp)
 a58:	c0 ff de 23 	lda	sp,-64(sp)
 a5c:	00 00 7d a7 	ldq	t12,0(gp)
 a60:	08 00 3e b5 	stq	s0,8(sp)
 a64:	10 00 f0 43 	sextl	a0,a0
 a68:	10 00 5e b5 	stq	s1,16(sp)
 a6c:	0a 04 f1 47 	mov	a1,s1
 a70:	18 00 7e b5 	stq	s2,24(sp)
 a74:	00 00 5e b7 	stq	ra,0(sp)
 a78:	00 40 5b 6b 	jsr	ra,(t12),a7c <__se_sys_old_readdir+0x2c>
 a7c:	00 00 ba 27 	ldah	gp,0(ra)
 a80:	28 00 fe b7 	stq	zero,40(sp)
 a84:	00 00 bd 23 	lda	gp,0(gp)
 a88:	0b 71 00 44 	andnot	v0,0x3,s2
 a8c:	00 00 fe 2f 	unop	
 a90:	00 00 3d 24 	ldah	t0,0(gp)
 a94:	09 00 e0 43 	sextl	v0,s0
 a98:	00 00 21 20 	lda	t0,0(t0)
 a9c:	f7 ff 1f 20 	lda	v0,-9
 aa0:	38 00 fe b7 	stq	zero,56(sp)
 aa4:	20 00 3e b4 	stq	t0,32(sp)
 aa8:	30 00 5e b5 	stq	s1,48(sp)
 aac:	0d 00 60 e5 	beq	s2,ae4 <__se_sys_old_readdir+0x94>
 ab0:	20 00 3e 22 	lda	a1,32(sp)
 ab4:	10 04 eb 47 	mov	s2,a0
 ab8:	00 00 7d a7 	ldq	t12,0(gp)
 abc:	00 40 5b 6b 	jsr	ra,(t12),ac0 <__se_sys_old_readdir+0x70>
 ac0:	00 00 ba 27 	ldah	gp,0(ra)
 ac4:	00 00 bd 23 	lda	gp,0(gp)
 ac8:	0a 04 e0 47 	mov	v0,s1
 acc:	38 00 1e a0 	ldl	v0,56(sp)
 ad0:	01 50 20 45 	and	s0,0x2,t0
 ad4:	ca 04 00 44 	cmovne	v0,v0,s1
 ad8:	11 00 20 f4 	bne	t0,b20 <__se_sys_old_readdir+0xd0>
 adc:	08 00 20 f1 	blbs	s0,b00 <__se_sys_old_readdir+0xb0>
 ae0:	00 04 ea 47 	mov	s1,v0
 ae4:	00 00 5e a7 	ldq	ra,0(sp)
 ae8:	08 00 3e a5 	ldq	s0,8(sp)
 aec:	1f 04 ff 47 	nop	
 af0:	10 00 5e a5 	ldq	s1,16(sp)
 af4:	18 00 7e a5 	ldq	s2,24(sp)
 af8:	40 00 de 23 	lda	sp,64(sp)
 afc:	01 80 fa 6b 	ret
 b00:	00 00 7d a7 	ldq	t12,0(gp)
 b04:	10 04 eb 47 	mov	s2,a0
 b08:	00 40 5b 6b 	jsr	ra,(t12),b0c <__se_sys_old_readdir+0xbc>
 b0c:	00 00 ba 27 	ldah	gp,0(ra)
 b10:	00 00 bd 23 	lda	gp,0(gp)
 b14:	f2 ff ff c3 	br	ae0 <__se_sys_old_readdir+0x90>
 b18:	1f 04 ff 47 	nop	
 b1c:	00 00 fe 2f 	unop	
 b20:	00 00 7d a7 	ldq	t12,0(gp)
 b24:	10 04 eb 47 	mov	s2,a0
 b28:	00 40 5b 6b 	jsr	ra,(t12),b2c <__se_sys_old_readdir+0xdc>
 b2c:	00 00 ba 27 	ldah	gp,0(ra)
 b30:	00 00 bd 23 	lda	gp,0(gp)
 b34:	ea ff 3f e1 	blbc	s0,ae0 <__se_sys_old_readdir+0x90>
 b38:	f1 ff ff c3 	br	b00 <__se_sys_old_readdir+0xb0>
 b3c:	00 00 fe 2f 	unop	

0000000000000b40 <__se_sys_getdents>:
 b40:	00 00 bb 27 	ldah	gp,0(t12)
 b44:	00 00 bd 23 	lda	gp,0(gp)
 b48:	22 f6 41 4a 	zapnot	a2,0xf,t1
 b4c:	a0 ff de 23 	lda	sp,-96(sp)
 b50:	a3 03 e2 43 	cmpult	zero,t1,t2
 b54:	01 04 22 42 	addq	a1,t1,t0
 b58:	21 05 23 40 	subq	t0,t2,t0
 b5c:	02 04 22 46 	or	a1,t1,t1
 b60:	10 00 5e b5 	stq	s1,16(sp)
 b64:	01 04 22 44 	or	t0,t1,t0
 b68:	18 00 7e b5 	stq	s2,24(sp)
 b6c:	00 00 5d 24 	ldah	t1,0(gp)
 b70:	00 00 5e b7 	stq	ra,0(sp)
 b74:	00 00 42 20 	lda	t1,0(t1)
 b78:	08 00 3e b5 	stq	s0,8(sp)
 b7c:	0a 04 f2 47 	mov	a2,s1
 b80:	20 00 9e b5 	stq	s3,32(sp)
 b84:	10 00 f0 43 	sextl	a0,a0
 b88:	50 00 fe b7 	stq	zero,80(sp)
 b8c:	f2 ff 7f 21 	lda	s2,-14
 b90:	50 00 68 a4 	ldq	t2,80(t7)
 b94:	38 00 fe b7 	stq	zero,56(sp)
 b98:	48 00 fe b7 	stq	zero,72(sp)
 b9c:	01 00 23 44 	and	t0,t2,t0
 ba0:	30 00 5e b4 	stq	t1,48(sp)
 ba4:	40 00 3e b6 	stq	a1,64(sp)
 ba8:	50 00 5e b2 	stl	a2,80(sp)
 bac:	22 00 20 f4 	bne	t0,c38 <__se_sys_getdents+0xf8>
 bb0:	00 00 7d a7 	ldq	t12,0(gp)
 bb4:	1f 04 ff 47 	nop	
 bb8:	00 40 5b 6b 	jsr	ra,(t12),bbc <__se_sys_getdents+0x7c>
 bbc:	00 00 ba 27 	ldah	gp,0(ra)
 bc0:	00 00 bd 23 	lda	gp,0(gp)
 bc4:	0c 71 00 44 	andnot	v0,0x3,s3
 bc8:	09 00 e0 43 	sextl	v0,s0
 bcc:	41 00 80 e5 	beq	s3,cd4 <__se_sys_getdents+0x194>
 bd0:	30 00 3e 22 	lda	a1,48(sp)
 bd4:	10 04 ec 47 	mov	s3,a0
 bd8:	00 00 7d a7 	ldq	t12,0(gp)
 bdc:	00 40 5b 6b 	jsr	ra,(t12),be0 <__se_sys_getdents+0xa0>
 be0:	00 00 ba 27 	ldah	gp,0(ra)
 be4:	00 00 bd 23 	lda	gp,0(gp)
 be8:	54 00 3e a0 	ldl	t0,84(sp)
 bec:	48 00 7e a4 	ldq	t2,72(sp)
 bf0:	c0 08 01 44 	cmovge	v0,t0,v0
 bf4:	2e 00 60 e4 	beq	t2,cb0 <__se_sys_getdents+0x170>
 bf8:	0f 00 43 20 	lda	t1,15(t2)
 bfc:	08 00 23 20 	lda	t0,8(t2)
 c00:	01 04 41 44 	or	t1,t0,t0
 c04:	50 00 48 a4 	ldq	t1,80(t7)
 c08:	01 14 21 44 	or	t0,0x8,t0
 c0c:	1f 04 ff 47 	nop	
 c10:	01 00 22 44 	and	t0,t1,t0
 c14:	04 00 20 f4 	bne	t0,c28 <__se_sys_getdents+0xe8>
 c18:	38 00 5e a4 	ldq	t1,56(sp)
 c1c:	08 00 43 b4 	stq	t1,8(t2)
 c20:	27 00 20 e4 	beq	t0,cc0 <__se_sys_getdents+0x180>
 c24:	00 00 fe 2f 	unop	
 c28:	01 50 20 45 	and	s0,0x2,t0
 c2c:	0c 00 20 f4 	bne	t0,c60 <__se_sys_getdents+0x120>
 c30:	1f 04 ff 47 	nop	
 c34:	10 00 20 f1 	blbs	s0,c78 <__se_sys_getdents+0x138>
 c38:	00 04 eb 47 	mov	s2,v0
 c3c:	00 00 5e a7 	ldq	ra,0(sp)
 c40:	08 00 3e a5 	ldq	s0,8(sp)
 c44:	10 00 5e a5 	ldq	s1,16(sp)
 c48:	18 00 7e a5 	ldq	s2,24(sp)
 c4c:	20 00 9e a5 	ldq	s3,32(sp)
 c50:	60 00 de 23 	lda	sp,96(sp)
 c54:	01 80 fa 6b 	ret
 c58:	1f 04 ff 47 	nop	
 c5c:	00 00 fe 2f 	unop	
 c60:	00 00 7d a7 	ldq	t12,0(gp)
 c64:	10 04 ec 47 	mov	s3,a0
 c68:	00 40 5b 6b 	jsr	ra,(t12),c6c <__se_sys_getdents+0x12c>
 c6c:	00 00 ba 27 	ldah	gp,0(ra)
 c70:	00 00 bd 23 	lda	gp,0(gp)
 c74:	f0 ff 3f e1 	blbc	s0,c38 <__se_sys_getdents+0xf8>
 c78:	00 00 7d a7 	ldq	t12,0(gp)
 c7c:	10 04 ec 47 	mov	s3,a0
 c80:	00 40 5b 6b 	jsr	ra,(t12),c84 <__se_sys_getdents+0x144>
 c84:	00 00 ba 27 	ldah	gp,0(ra)
 c88:	08 00 3e a5 	ldq	s0,8(sp)
 c8c:	1f 04 ff 47 	nop	
 c90:	00 04 eb 47 	mov	s2,v0
 c94:	00 00 5e a7 	ldq	ra,0(sp)
 c98:	10 00 5e a5 	ldq	s1,16(sp)
 c9c:	18 00 7e a5 	ldq	s2,24(sp)
 ca0:	20 00 9e a5 	ldq	s3,32(sp)
 ca4:	00 00 bd 23 	lda	gp,0(gp)
 ca8:	60 00 de 23 	lda	sp,96(sp)
 cac:	01 80 fa 6b 	ret
 cb0:	0b 04 e0 47 	mov	v0,s2
 cb4:	01 50 20 45 	and	s0,0x2,t0
 cb8:	dd ff 3f e4 	beq	t0,c30 <__se_sys_getdents+0xf0>
 cbc:	e8 ff ff c3 	br	c60 <__se_sys_getdents+0x120>
 cc0:	50 00 7e a1 	ldl	s2,80(sp)
 cc4:	01 50 20 45 	and	s0,0x2,t0
 cc8:	2b 01 4b 41 	subl	s1,s2,s2
 ccc:	d8 ff 3f e4 	beq	t0,c30 <__se_sys_getdents+0xf0>
 cd0:	e3 ff ff c3 	br	c60 <__se_sys_getdents+0x120>
 cd4:	f7 ff 7f 21 	lda	s2,-9
 cd8:	d7 ff ff c3 	br	c38 <__se_sys_getdents+0xf8>
 cdc:	00 00 fe 2f 	unop	

0000000000000ce0 <ksys_getdents64>:
 ce0:	00 00 bb 27 	ldah	gp,0(t12)
 ce4:	00 00 bd 23 	lda	gp,0(gp)
 ce8:	21 f6 41 4a 	zapnot	a2,0xf,t0
 cec:	a0 ff de 23 	lda	sp,-96(sp)
 cf0:	08 00 3e b5 	stq	s0,8(sp)
 cf4:	a2 03 e1 43 	cmpult	zero,t0,t1
 cf8:	10 00 5e b5 	stq	s1,16(sp)
 cfc:	09 04 21 42 	addq	a1,t0,s0
 d00:	29 05 22 41 	subq	s0,t1,s0
 d04:	01 04 21 46 	or	a1,t0,t0
 d08:	20 00 9e b5 	stq	s3,32(sp)
 d0c:	09 04 21 45 	or	s0,t0,s0
 d10:	00 00 5e b7 	stq	ra,0(sp)
 d14:	00 00 3d 24 	ldah	t0,0(gp)
 d18:	18 00 7e b5 	stq	s2,24(sp)
 d1c:	00 00 21 20 	lda	t0,0(t0)
 d20:	28 00 be b5 	stq	s4,40(sp)
 d24:	0c 04 f2 47 	mov	a2,s3
 d28:	50 00 fe b7 	stq	zero,80(sp)
 d2c:	f2 ff 5f 21 	lda	s1,-14
 d30:	50 00 48 a4 	ldq	t1,80(t7)
 d34:	38 00 fe b7 	stq	zero,56(sp)
 d38:	48 00 fe b7 	stq	zero,72(sp)
 d3c:	09 00 22 45 	and	s0,t1,s0
 d40:	30 00 3e b4 	stq	t0,48(sp)
 d44:	40 00 3e b6 	stq	a1,64(sp)
 d48:	50 00 5e b2 	stl	a2,80(sp)
 d4c:	1a 00 20 f5 	bne	s0,db8 <ksys_getdents64+0xd8>
 d50:	00 00 7d a7 	ldq	t12,0(gp)
 d54:	f7 ff 5f 21 	lda	s1,-9
 d58:	00 40 5b 6b 	jsr	ra,(t12),d5c <ksys_getdents64+0x7c>
 d5c:	00 00 ba 27 	ldah	gp,0(ra)
 d60:	00 00 bd 23 	lda	gp,0(gp)
 d64:	0d 71 00 44 	andnot	v0,0x3,s4
 d68:	0b 00 e0 43 	sextl	v0,s2
 d6c:	12 00 a0 e5 	beq	s4,db8 <ksys_getdents64+0xd8>
 d70:	30 00 3e 22 	lda	a1,48(sp)
 d74:	10 04 ed 47 	mov	s4,a0
 d78:	00 00 7d a7 	ldq	t12,0(gp)
 d7c:	00 40 5b 6b 	jsr	ra,(t12),d80 <ksys_getdents64+0xa0>
 d80:	00 00 ba 27 	ldah	gp,0(ra)
 d84:	00 00 bd 23 	lda	gp,0(gp)
 d88:	54 00 5e a1 	ldl	s1,84(sp)
 d8c:	48 00 3e a4 	ldq	t0,72(sp)
 d90:	8a 08 00 44 	cmovlt	v0,v0,s1
 d94:	04 00 20 e4 	beq	t0,da8 <ksys_getdents64+0xc8>
 d98:	38 00 5e a4 	ldq	t1,56(sp)
 d9c:	08 00 41 b4 	stq	t1,8(t0)
 da0:	f2 ff 5f 21 	lda	s1,-14
 da4:	0e 00 20 e5 	beq	s0,de0 <ksys_getdents64+0x100>
 da8:	01 50 60 45 	and	s2,0x2,t0
 dac:	10 00 20 f4 	bne	t0,df0 <ksys_getdents64+0x110>
 db0:	1f 04 ff 47 	nop	
 db4:	14 00 60 f1 	blbs	s2,e08 <ksys_getdents64+0x128>
 db8:	00 04 ea 47 	mov	s1,v0
 dbc:	00 00 5e a7 	ldq	ra,0(sp)
 dc0:	08 00 3e a5 	ldq	s0,8(sp)
 dc4:	10 00 5e a5 	ldq	s1,16(sp)
 dc8:	18 00 7e a5 	ldq	s2,24(sp)
 dcc:	20 00 9e a5 	ldq	s3,32(sp)
 dd0:	28 00 be a5 	ldq	s4,40(sp)
 dd4:	1f 04 ff 47 	nop	
 dd8:	60 00 de 23 	lda	sp,96(sp)
 ddc:	01 80 fa 6b 	ret
 de0:	50 00 5e a1 	ldl	s1,80(sp)
 de4:	01 50 60 45 	and	s2,0x2,t0
 de8:	2a 01 8a 41 	subl	s3,s1,s1
 dec:	f0 ff 3f e4 	beq	t0,db0 <ksys_getdents64+0xd0>
 df0:	00 00 7d a7 	ldq	t12,0(gp)
 df4:	10 04 ed 47 	mov	s4,a0
 df8:	00 40 5b 6b 	jsr	ra,(t12),dfc <ksys_getdents64+0x11c>
 dfc:	00 00 ba 27 	ldah	gp,0(ra)
 e00:	00 00 bd 23 	lda	gp,0(gp)
 e04:	ec ff 7f e1 	blbc	s2,db8 <ksys_getdents64+0xd8>
 e08:	00 00 7d a7 	ldq	t12,0(gp)
 e0c:	10 04 ed 47 	mov	s4,a0
 e10:	00 40 5b 6b 	jsr	ra,(t12),e14 <ksys_getdents64+0x134>
 e14:	00 00 ba 27 	ldah	gp,0(ra)
 e18:	00 00 bd 23 	lda	gp,0(gp)
 e1c:	e6 ff ff c3 	br	db8 <ksys_getdents64+0xd8>

0000000000000e20 <__se_sys_getdents64>:
 e20:	00 00 bb 27 	ldah	gp,0(t12)
 e24:	00 00 bd 23 	lda	gp,0(gp)
 e28:	f0 ff de 23 	lda	sp,-16(sp)
 e2c:	12 00 f2 43 	sextl	a2,a2
 e30:	00 00 5e b7 	stq	ra,0(sp)
 e34:	10 00 f0 43 	sextl	a0,a0
 e38:	00 00 7d a7 	ldq	t12,0(gp)
 e3c:	00 40 5b 6b 	jsr	ra,(t12),e40 <__se_sys_getdents64+0x20>
 e40:	00 00 ba 27 	ldah	gp,0(ra)
 e44:	00 00 bd 23 	lda	gp,0(gp)
 e48:	00 00 5e a7 	ldq	ra,0(sp)
 e4c:	1f 04 ff 47 	nop	
 e50:	10 00 de 23 	lda	sp,16(sp)
 e54:	01 80 fa 6b 	ret
 e58:	1f 04 ff 47 	nop	
 e5c:	00 00 fe 2f 	unop	

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()
  2019-10-07  0:04     ` Guenter Roeck
@ 2019-10-07  1:17       ` Linus Torvalds
  2019-10-07  1:24         ` Al Viro
  2019-10-07  2:30         ` Guenter Roeck
  0 siblings, 2 replies; 71+ messages in thread
From: Linus Torvalds @ 2019-10-07  1:17 UTC (permalink / raw)
  To: Guenter Roeck; +Cc: Linux Kernel Mailing List, Alexander Viro, linux-fsdevel

On Sun, Oct 6, 2019 at 5:04 PM Guenter Roeck <linux@roeck-us.net> wrote:
>
> All my alpha, sparc64, and xtensa tests pass with the attached patch
> applied on top of v5.4-rc2. I didn't test any others.

Okay... I really wish my guess had been wrong.

Because fixing filldir64 isn't the problem. I can come up with
multiple ways to avoid the unaligned issues if that was the problem.

But it does look to me like the fundamental problem is that unaligned
__put_user() calls might just be broken on alpha (and likely sparc
too). Because that looks to be the only difference between the
__copy_to_user() approach and using unsafe_put_user() in a loop.

Now, I should have handled unaligned things differently in the first
place, and in that sense I think commit 9f79b78ef744 ("Convert
filldir[64]() from __put_user() to unsafe_put_user()") really is
non-optimal on architectures with alignment issues.

And I'll fix it.

But at the same time, the fact that "non-optimal" turns into "doesn't
work" is a fairly nasty issue.

> I'll (try to) send you some disassembly next.

Thanks, verified.

The "ra is at filldir64+0x64/0x320" is indeed right at the return
point of the "jsr verify_dirent_name".

But the problem isn't there - that's just left-over state. I'm pretty
sure that function worked fine, and returned.

The problem is that "pc is at 0x4" and the page fault that then
happens at that address as a result.

And that seems to be due to this:

 8c0:   00 00 29 2c     ldq_u   t0,0(s0)
 8c4:   07 00 89 2c     ldq_u   t3,7(s0)
 8c8:   03 04 e7 47     mov     t6,t2
 8cc:   c1 06 29 48     extql   t0,s0,t0
 8d0:   44 0f 89 48     extqh   t3,s0,t3
 8d4:   01 04 24 44     or      t0,t3,t0
 8d8:   00 00 22 b4     stq     t0,0(t1)

that's the "get_unaligned((type *)src)" (the first six instructions)
followed by the "unsafe_put_user()" done with a single "stq". That's
the guts of the unsafe_copy_loop() as part of
unsafe_copy_dirent_name().

And what I think happens is that it is writing to user memory that is

 (a) unaligned

 (b) not currently mapped in user space

so then the do_entUna() function tries to handle the unaligned trap,
but then it takes an exception while doing that (due to the unmapped
page), and then something in that nested exception mess causes it to
mess up badly and corrupt the register contents on stack, and it
returns with garbage in 'pc', and then you finally die with that

   Unable to handle kernel paging request at virtual address 0000000000000004
   pc is at 0x4

thing.

And yes, I'll fix that name copy loop in filldir to align the
destination first, *but* if I'm right, it means that something like
this should also likely cause issues:

  #define _GNU_SOURCE
  #include <unistd.h>
  #include <sys/mman.h>

  int main(int argc, char **argv)
  {
        void *mymap;
        uid_t *bad_ptr = (void *) 0x01;

        /* Create unpopulated memory area */
        mymap = mmap(NULL, 16384, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

        /* Unaligned uidpointer in that memory area */
        bad_ptr = mymap+1;

        /* Make the kernel do put_user() on it */
        return getresuid(bad_ptr, bad_ptr+1, bad_ptr+2);
  }

because that simple user mode program should cause that same "page
fault on unaligned put_user()" behavior as far as I can tell.

Mind humoring me and trying that on your alpha machine (or emulator,
or whatever)?

               Linus


* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()
  2019-10-07  1:17       ` Linus Torvalds
@ 2019-10-07  1:24         ` Al Viro
  2019-10-07  2:06           ` Linus Torvalds
  2019-10-07  2:30         ` Guenter Roeck
  1 sibling, 1 reply; 71+ messages in thread
From: Al Viro @ 2019-10-07  1:24 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Guenter Roeck, Linux Kernel Mailing List, linux-fsdevel

On Sun, Oct 06, 2019 at 06:17:02PM -0700, Linus Torvalds wrote:
> On Sun, Oct 6, 2019 at 5:04 PM Guenter Roeck <linux@roeck-us.net> wrote:
> >
> > All my alpha, sparc64, and xtensa tests pass with the attached patch
> > applied on top of v5.4-rc2. I didn't test any others.
> 
> Okay... I really wish my guess had been wrong.
> 
> Because fixing filldir64 isn't the problem. I can come up with
> multiple ways to avoid the unaligned issues if that was the problem.
> 
> But it does look to me like the fundamental problem is that unaligned
> __put_user() calls might just be broken on alpha (and likely sparc
> too). Because that looks to be the only difference between the
> __copy_to_user() approach and using unsafe_put_user() in a loop.
> 
> Now, I should have handled unaligned things differently in the first
> place, and in that sense I think commit 9f79b78ef744 ("Convert
> filldir[64]() from __put_user() to unsafe_put_user()") really is
> non-optimal on architectures with alignment issues.
> 
> And I'll fix it.

Ugh...  I wonder if it would be better to lift STAC/CLAC out of
raw_copy_to_user(), rather than trying to reinvent its guts
in readdir.c...


* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()
  2019-10-07  1:24         ` Al Viro
@ 2019-10-07  2:06           ` Linus Torvalds
  2019-10-07  2:50             ` Al Viro
  0 siblings, 1 reply; 71+ messages in thread
From: Linus Torvalds @ 2019-10-07  2:06 UTC (permalink / raw)
  To: Al Viro; +Cc: Guenter Roeck, Linux Kernel Mailing List, linux-fsdevel

On Sun, Oct 6, 2019 at 6:24 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> Ugh...  I wonder if it would be better to lift STAC/CLAC out of
> raw_copy_to_user(), rather than trying to reinvent its guts
> in readdir.c...

Yeah, I suspect that's the best option.

Do something like

 - lift STAC/CLAC out of raw_copy_to_user

 - rename it to unsafe_copy_to_user

 - create a new raw_copy_to_user that is just unsafe_copy_to_user()
with the STAC/CLAC around it.

and the end result would actually be cleaner than what we have now
(which duplicates that STAC/CLAC for each size case etc).

And then for the "architecture doesn't have user_access_begin/end()"
fallback case, we just do

   #define unsafe_copy_to_user raw_copy_to_user

and the only slight pain point is that we need to deal with that
copy_user_generic() case too.

We'd have to mark it uaccess_safe in objtool (but we already have that
for __memcpy_mcsafe and csum_partial_copy_generic, so it all makes sense),
and we'd have to make all the other copy_user_generic() cases then do
the CLAC/STAC dance too or something.

ANYWAY.  As mentioned, I'm not actually all that worried about this all.

I could easily also just see the filldir() copy do an extra

#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
    if (len && (1 & (uintptr_t)dst)) .. copy byte ..
    if (len > 1 && (2 & (uintptr_t)dst)) .. copy word ..
    if (len > 3 && (4 & (uintptr_t)dst) && sizeof(unsigned long) > 4) .. copy dword ..
#endif

at the start to align the destination.

The filldir code is actually somewhat unusual in that it deals with
pretty small strings on average, so just doing this might be more
efficient anyway.

So that doesn't worry me. Multiple ways to solve that part.

The "uhhuh, unaligned accesses cause more than performance problems" -
that's what worries me.

            Linus


* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()
  2019-10-07  1:17       ` Linus Torvalds
  2019-10-07  1:24         ` Al Viro
@ 2019-10-07  2:30         ` Guenter Roeck
  2019-10-07  3:12           ` Linus Torvalds
  1 sibling, 1 reply; 71+ messages in thread
From: Guenter Roeck @ 2019-10-07  2:30 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Linux Kernel Mailing List, Alexander Viro, linux-fsdevel

On 10/6/19 6:17 PM, Linus Torvalds wrote:
> On Sun, Oct 6, 2019 at 5:04 PM Guenter Roeck <linux@roeck-us.net> wrote:
[ ... ]
> And yes, I'll fix that name copy loop in filldir to align the
> destination first, *but* if I'm right, it means that something like
> this should also likely cause issues:
> 
>    #define _GNU_SOURCE
>    #include <unistd.h>
>    #include <sys/mman.h>
> 
>    int main(int argc, char **argv)
>    {
>          void *mymap;
>          uid_t *bad_ptr = (void *) 0x01;
> 
>          /* Create unpopulated memory area */
>          mymap = mmap(NULL, 16384, PROT_READ | PROT_WRITE,
>                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
> 
>          /* Unaligned uidpointer in that memory area */
>          bad_ptr = mymap+1;
> 
>          /* Make the kernel do put_user() on it */
>          return getresuid(bad_ptr, bad_ptr+1, bad_ptr+2);
>    }
> 
> because that simple user mode program should cause that same "page
> fault on unaligned put_user()" behavior as far as I can tell.
> 
> Mind humoring me and trying that on your alpha machine (or emulator,
> or whatever)?
> 

Here you are. This is with v5.4-rc2 and your previous patch applied
on top.

/ # ./mmtest
Unable to handle kernel paging request at virtual address 0000000000000004
mmtest(75): Oops -1
pc = [<0000000000000004>]  ra = [<fffffc0000311584>]  ps = 0000    Not tainted
pc is at 0x4
ra is at entSys+0xa4/0xc0
v0 = fffffffffffffff2  t0 = 0000000000000000  t1 = 0000000000000000
t2 = 0000000000000000  t3 = 0000000000000000  t4 = 0000000000000000
t5 = 000000000000fffe  t6 = 0000000000000000  t7 = fffffc0007edc000
s0 = 0000000000000000  s1 = 00000001200006f0  s2 = 00000001200df19f
s3 = 00000001200ea0b9  s4 = 0000000120114630  s5 = 00000001201145d8
s6 = 000000011f955c50
a0 = 000002000002c001  a1 = 000002000002c005  a2 = 000002000002c009
a3 = 0000000000000000  a4 = ffffffffffffffff  a5 = 0000000000000000
t8 = 0000000000000000  t9 = fffffc0000000000  t10= 0000000000000000
t11= 000000011f955788  pv = fffffc0000349450  at = 00000000f8db54d3
gp = fffffc0000f2a160  sp = 00000000ab237c72
Disabling lock debugging due to kernel taint
Trace:

Code:
  00000000
  00063301
  000007b6
  00001111
  00003f8d

Segmentation fault

Guenter


* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()
  2019-10-07  2:06           ` Linus Torvalds
@ 2019-10-07  2:50             ` Al Viro
  2019-10-07  3:11               ` Linus Torvalds
  0 siblings, 1 reply; 71+ messages in thread
From: Al Viro @ 2019-10-07  2:50 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Guenter Roeck, Linux Kernel Mailing List, linux-fsdevel

On Sun, Oct 06, 2019 at 07:06:19PM -0700, Linus Torvalds wrote:
> On Sun, Oct 6, 2019 at 6:24 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
> >
> > Ugh...  I wonder if it would be better to lift STAC/CLAC out of
> > raw_copy_to_user(), rather than trying to reinvent its guts
> > in readdir.c...
> 
> Yeah, I suspect that's the best option.
> 
> Do something like
> 
>  - lift STAC/CLAC out of raw_copy_to_user
> 
>  - rename it to unsafe_copy_to_user
> 
>  - create a new raw_copy_to_user that is just unsafe_copy_to_user()
> with the STAC/CLAC around it.
> 
> and the end result would actually be cleanert than what we have now
> (which duplicates that STAC/CLAC for each size case etc).
> 
> And then for the "architecture doesn't have user_access_begin/end()"
> fallback case, we just do
> 
>    #define unsafe_copy_to_user raw_copy_to_user

Callers of raw_copy_to_user():
arch/hexagon/mm/uaccess.c:27:           uncleared = raw_copy_to_user(dest, &empty_zero_page, PAGE_SIZE);
arch/hexagon/mm/uaccess.c:34:           count = raw_copy_to_user(dest, &empty_zero_page, count);
arch/powerpc/kvm/book3s_64_mmu_radix.c:68:              ret = raw_copy_to_user(to, from, n);
arch/s390/include/asm/uaccess.h:150:    size = raw_copy_to_user(ptr, x, size);
include/asm-generic/uaccess.h:145:      return unlikely(raw_copy_to_user(ptr, x, size)) ? -EFAULT : 0;
include/linux/uaccess.h:93:     return raw_copy_to_user(to, from, n);
include/linux/uaccess.h:102:    return raw_copy_to_user(to, from, n);
include/linux/uaccess.h:131:            n = raw_copy_to_user(to, from, n);
lib/iov_iter.c:142:             n = raw_copy_to_user(to, from, n);
lib/usercopy.c:28:              n = raw_copy_to_user(to, from, n);


Out of those, only __copy_to_user_inatomic(), __copy_to_user(),
_copy_to_user() and iov_iter.c:copyout() can be called on
any architecture.

The last two should just do user_access_begin()/user_access_end()
instead of access_ok().  __copy_to_user_inatomic() has very few callers as well:

arch/mips/kernel/unaligned.c:1307:                      res = __copy_to_user_inatomic(addr, fpr, sizeof(*fpr));
drivers/gpu/drm/i915/i915_gem.c:345:    unwritten = __copy_to_user_inatomic(user_data,
lib/test_kasan.c:471:   unused = __copy_to_user_inatomic(usermem, kmem, size + 1);
mm/maccess.c:98:        ret = __copy_to_user_inatomic((__force void __user *)dst, src, size);

So few, in fact, that I wonder if we want to keep it at all; the only
thing stopping me from "let's remove it" is that I don't understand
the i915 side of things.  Where does it do an equivalent of access_ok()?

And mm/maccess.c one is __probe_kernel_write(), so presumably we don't
want stac/clac there at all...

So do we want to bother with separation between raw_copy_to_user() and
unsafe_copy_to_user()?  After all, __copy_to_user() also has only few
callers, most of them in arch/*

I'll take a look into that tomorrow - half-asleep right now...


* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()
  2019-10-07  2:50             ` Al Viro
@ 2019-10-07  3:11               ` Linus Torvalds
  2019-10-07 15:40                 ` David Laight
                                   ` (2 more replies)
  0 siblings, 3 replies; 71+ messages in thread
From: Linus Torvalds @ 2019-10-07  3:11 UTC (permalink / raw)
  To: Al Viro; +Cc: Guenter Roeck, Linux Kernel Mailing List, linux-fsdevel

On Sun, Oct 6, 2019 at 7:50 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> Out of those, only __copy_to_user_inatomic(), __copy_to_user(),
> _copy_to_user() and iov_iter.c:copyout() can be called on
> any architecture.
>
> The last two should just do user_access_begin()/user_access_end()
> instead of access_ok().  __copy_to_user_inatomic() has very few callers as well:

Yeah, good points.

It looks like it would be better to just change over semantics
entirely to the unsafe_copy_user() model.

> So few, in fact, that I wonder if we want to keep it at all; the only
> thing stopping me from "let's remove it" is that I don't understand
> the i915 side of things.  Where does it do an equivalent of access_ok()?

Honestly, if you have to ask, I think the answer is: just add one.

Every single time we've had people who optimized things to try to
avoid the access_ok(), they just caused bugs and problems.

In this case, I think it's done a few callers up in i915_gem_pread_ioctl():

        if (!access_ok(u64_to_user_ptr(args->data_ptr),
                       args->size))
                return -EFAULT;

but honestly, trying to optimize away another "access_ok()" is just
not worth it. I'd rather have an extra one than miss one.

> And mm/maccess.c one is __probe_kernel_write(), so presumably we don't
> want stac/clac there at all...

Yup.

> So do we want to bother with separation between raw_copy_to_user() and
> unsafe_copy_to_user()?  After all, __copy_to_user() also has only a few
> callers, most of them in arch/*

No, you're right. Just switch over.

> I'll take a look into that tomorrow - half-asleep right now...

Thanks. No huge hurry.

             Linus

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()
  2019-10-07  2:30         ` Guenter Roeck
@ 2019-10-07  3:12           ` Linus Torvalds
  0 siblings, 0 replies; 71+ messages in thread
From: Linus Torvalds @ 2019-10-07  3:12 UTC (permalink / raw)
  To: Guenter Roeck; +Cc: Linux Kernel Mailing List, Alexander Viro, linux-fsdevel

On Sun, Oct 6, 2019 at 7:30 PM Guenter Roeck <linux@roeck-us.net> wrote:
>
> > Mind humoring me and trying that on your alpha machine (or emulator,
> > or whatever)?
>
> Here you are. This is with v5.4-rc2 and your previous patch applied
> on top.
>
> / # ./mmtest
> Unable to handle kernel paging request at virtual address 0000000000000004

Oookay.

Well, that's what I expected, but it's good to just have it confirmed.

Well, not "good" in this case. Bad bad bad.

The fs/readdir.c changes clearly exposed a pre-existing bug on alpha.
Not making excuses for it, but at least it explains why code that
_looks_ correct ends up causing that kind of issue.

I guess the other 'strict alignment' architectures should be checking
that test program too. I'll post my test program to the arch
maintainers list.

             Linus

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()
  2019-10-06 22:20 [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user() Guenter Roeck
  2019-10-06 23:06 ` Linus Torvalds
@ 2019-10-07  4:04 ` Max Filippov
  2019-10-07 12:16   ` Guenter Roeck
  2019-10-07 19:21 ` Linus Torvalds
  2 siblings, 1 reply; 71+ messages in thread
From: Max Filippov @ 2019-10-07  4:04 UTC (permalink / raw)
  To: Guenter Roeck; +Cc: Linus Torvalds, LKML, Alexander Viro, linux-fsdevel

On Sun, Oct 6, 2019 at 3:25 PM Guenter Roeck <linux@roeck-us.net> wrote:
> this patch causes all my sparc64 emulations to stall during boot. It causes
> all alpha emulations to crash with [1a] and [1b] when booting from a virtual
> disk, and one of the xtensa emulations to crash with [2].

[...]

> [2]
>
> Unable to handle kernel paging request at virtual address 0000000000000004
> reboot(50): Oops -1
> pc = [<0000000000000004>]  ra = [<fffffc00004512e4>]  ps = 0000    Tainted: G      D
> pc is at 0x4
> ra is at filldir64+0x64/0x320
> v0 = 0000000000000000  t0 = 0000000067736d6b  t1 = 000000012011445b
> t2 = 0000000000000000  t3 = 0000000000000000  t4 = 0000000000007ef8
> t5 = 0000000120114448  t6 = 0000000000000000  t7 = fffffc0007eec000
> s0 = fffffc000792b5c3  s1 = 0000000000000004  s2 = 0000000000000018
> s3 = fffffc0007eefec8  s4 = 0000000000000008  s5 = 00000000f00000a3
> s6 = 000000000000000b
> a0 = fffffc000792b5c3  a1 = 2f2f2f2f2f2f2f2f  a2 = 0000000000000004
> a3 = 000000000000000b  a4 = 00000000f00000a3  a5 = 0000000000000008
> t8 = 0000000000000018  t9 = 0000000000000000  t10= 0000000022e1d02a
> t11= 000000011fd6f3b8  pv = fffffc0000b9a810  at = 0000000022e1ccf8
> gp = fffffc0000f03930  sp = (____ptrval____)
> Trace:
> [<fffffc00004ccba0>] proc_readdir_de+0x170/0x300
> [<fffffc0000451280>] filldir64+0x0/0x320
> [<fffffc00004c565c>] proc_root_readdir+0x3c/0x80
> [<fffffc0000450c68>] iterate_dir+0x198/0x240
> [<fffffc00004518b8>] ksys_getdents64+0xa8/0x160
> [<fffffc0000451990>] sys_getdents64+0x20/0x40
> [<fffffc0000451280>] filldir64+0x0/0x320
> [<fffffc0000311634>] entSys+0xa4/0xc0

This doesn't look like a dump from an xtensa core.
v5.4-rc2 kernel doesn't crash for me on xtensa, but the userspace
doesn't work well, because all directories appear to be empty.

__put_user/__get_user don't do unaligned access on xtensa,
they check address alignment and return -EFAULT if it's bad.

-- 
Thanks.
-- Max

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()
  2019-10-07  4:04 ` Max Filippov
@ 2019-10-07 12:16   ` Guenter Roeck
  0 siblings, 0 replies; 71+ messages in thread
From: Guenter Roeck @ 2019-10-07 12:16 UTC (permalink / raw)
  To: Max Filippov; +Cc: Linus Torvalds, LKML, Alexander Viro, linux-fsdevel

Hi Max,

On 10/6/19 9:04 PM, Max Filippov wrote:
> On Sun, Oct 6, 2019 at 3:25 PM Guenter Roeck <linux@roeck-us.net> wrote:
>> this patch causes all my sparc64 emulations to stall during boot. It causes
>> all alpha emulations to crash with [1a] and [1b] when booting from a virtual
>> disk, and one of the xtensa emulations to crash with [2].
> 
> [...]
> 
>> [2]
>>
>> Unable to handle kernel paging request at virtual address 0000000000000004
>> reboot(50): Oops -1
>> pc = [<0000000000000004>]  ra = [<fffffc00004512e4>]  ps = 0000    Tainted: G      D
>> pc is at 0x4
>> ra is at filldir64+0x64/0x320
>> v0 = 0000000000000000  t0 = 0000000067736d6b  t1 = 000000012011445b
>> t2 = 0000000000000000  t3 = 0000000000000000  t4 = 0000000000007ef8
>> t5 = 0000000120114448  t6 = 0000000000000000  t7 = fffffc0007eec000
>> s0 = fffffc000792b5c3  s1 = 0000000000000004  s2 = 0000000000000018
>> s3 = fffffc0007eefec8  s4 = 0000000000000008  s5 = 00000000f00000a3
>> s6 = 000000000000000b
>> a0 = fffffc000792b5c3  a1 = 2f2f2f2f2f2f2f2f  a2 = 0000000000000004
>> a3 = 000000000000000b  a4 = 00000000f00000a3  a5 = 0000000000000008
>> t8 = 0000000000000018  t9 = 0000000000000000  t10= 0000000022e1d02a
>> t11= 000000011fd6f3b8  pv = fffffc0000b9a810  at = 0000000022e1ccf8
>> gp = fffffc0000f03930  sp = (____ptrval____)
>> Trace:
>> [<fffffc00004ccba0>] proc_readdir_de+0x170/0x300
>> [<fffffc0000451280>] filldir64+0x0/0x320
>> [<fffffc00004c565c>] proc_root_readdir+0x3c/0x80
>> [<fffffc0000450c68>] iterate_dir+0x198/0x240
>> [<fffffc00004518b8>] ksys_getdents64+0xa8/0x160
>> [<fffffc0000451990>] sys_getdents64+0x20/0x40
>> [<fffffc0000451280>] filldir64+0x0/0x320
>> [<fffffc0000311634>] entSys+0xa4/0xc0
> 
> This doesn't look like a dump from an xtensa core.
> v5.4-rc2 kernel doesn't crash for me on xtensa, but the userspace
> doesn't work well, because all directories appear to be empty.
> 
> __put_user/__get_user don't do unaligned access on xtensa,
> they check address alignment and return -EFAULT if it's bad.
> 
You are right, sorry; I must have mixed that up. xtensa doesn't crash.
The boot stalls, similar to sparc64. This is only seen with my nommu
test (de212:kc705-nommu:nommu_kc705_defconfig). xtensa mmu tests are fine,
at least for me, but then I only run tests with initrd (which for some
reason doesn't crash on alpha either).

Guenter

^ permalink raw reply	[flat|nested] 71+ messages in thread

* RE: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()
  2019-10-07  3:11               ` Linus Torvalds
@ 2019-10-07 15:40                 ` David Laight
  2019-10-07 18:11                   ` Linus Torvalds
  2019-10-07 17:34                 ` Al Viro
  2019-10-07 18:26                 ` Linus Torvalds
  2 siblings, 1 reply; 71+ messages in thread
From: David Laight @ 2019-10-07 15:40 UTC (permalink / raw)
  To: Linus Torvalds, Al Viro
  Cc: Guenter Roeck, Linux Kernel Mailing List, linux-fsdevel

> From: Linus Torvalds
> Sent: 07 October 2019 04:12
...
> In this case, I think it's done a few callers up in i915_gem_pread_ioctl():
> 
>         if (!access_ok(u64_to_user_ptr(args->data_ptr),
>                        args->size))
>                 return -EFAULT;
> 
> but honestly, trying to optimize away another "access_ok()" is just
> not worth it. I'd rather have an extra one than miss one.

You don't really want an extra access_ok() for every 'word' of a copy.
Some copies have to be done a word at a time.

And the checks someone added to copy_to/from_user() to detect kernel
buffer overruns must kill performance when the buffers are way down the stack
or in kmalloc()ed space.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()
  2019-10-07  3:11               ` Linus Torvalds
  2019-10-07 15:40                 ` David Laight
@ 2019-10-07 17:34                 ` Al Viro
  2019-10-07 18:13                   ` Linus Torvalds
  2019-10-07 18:26                 ` Linus Torvalds
  2 siblings, 1 reply; 71+ messages in thread
From: Al Viro @ 2019-10-07 17:34 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Guenter Roeck, Linux Kernel Mailing List, linux-fsdevel

On Sun, Oct 06, 2019 at 08:11:42PM -0700, Linus Torvalds wrote:

> > So do we want to bother with separation between raw_copy_to_user() and
> > unsafe_copy_to_user()?  After all, __copy_to_user() also has only a few
> > callers, most of them in arch/*
> 
> No, you're right. Just switch over.
> 
> > I'll take a look into that tomorrow - half-asleep right now...
> 
> Thanks. No huge hurry.

Tangentially related: copy_regset_to_user() and copy_regset_from_user().
That's where we do access_ok(), followed by calls of ->get() and
->set() resp.  Those tend to either use user_regset_copy{out,in}(),
or open-code those.  The former variant tends to lead to few calls
of __copy_{to,from}_user(); the latter...  On x86 it ends up doing
this:
static int genregs_get(struct task_struct *target,
                       const struct user_regset *regset,
                       unsigned int pos, unsigned int count,
                       void *kbuf, void __user *ubuf)
{
        if (kbuf) {
                unsigned long *k = kbuf;
                while (count >= sizeof(*k)) {
                        *k++ = getreg(target, pos);
                        count -= sizeof(*k);
                        pos += sizeof(*k);
                }
        } else {
                unsigned long __user *u = ubuf;
                while (count >= sizeof(*u)) {
                        if (__put_user(getreg(target, pos), u++))
                                return -EFAULT;
                        count -= sizeof(*u);
                        pos += sizeof(*u);
                }
        }

        return 0;
}

Potentially doing arseloads of stac/clac as it goes.  OTOH, getreg()
(and setreg()) in there are not entirely trivial, so blanket
user_access_begin()/user_access_end() over the entire loop might be
a bad idea...

How hot is that codepath?  I know that arch/um used to rely on it
(== PTRACE_[GS]ETREGS) quite a bit...

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()
  2019-10-07 15:40                 ` David Laight
@ 2019-10-07 18:11                   ` Linus Torvalds
  2019-10-08  9:58                     ` David Laight
  0 siblings, 1 reply; 71+ messages in thread
From: Linus Torvalds @ 2019-10-07 18:11 UTC (permalink / raw)
  To: David Laight
  Cc: Al Viro, Guenter Roeck, Linux Kernel Mailing List, linux-fsdevel

On Mon, Oct 7, 2019 at 8:40 AM David Laight <David.Laight@aculab.com> wrote:
>
> You don't really want an extra access_ok() for every 'word' of a copy.

Yes you do.

> Some copies have to be done a word at a time.

Completely immaterial. If you can't see the access_ok() close to the
__get/put_user(), you have a bug.

Plus the access_ok() is cheap. The real cost is the STAC/CLAC.

So stop with the access_ok() "optimizations". They are broken garbage.

Really.

I've been very close to just removing __get_user/__put_user several
times, exactly because people do completely the wrong thing with them
- not speeding code up, but making it unsafe and buggy.

The new "user_access_begin/end()" model is much better, but it also
has actual STATIC checking that there are no function calls etc inside
the region, so it forces you to do the loop properly and tightly, and
not the incorrect "I checked the range somewhere else, now I'm doing
an unsafe copy".

And it actually speeds things up, unlike the access_ok() games.
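(The pattern being described can be mocked in plain userspace C. This is a
sketch of the shape only - mock_user_access_begin()/mock_unsafe_put_user()
are inventions here; the real kernel macros use STAC/CLAC and asm-goto
fault handling that cannot be reproduced outside the kernel:)

```c
#include <stddef.h>

/*
 * Userspace mock of the user_access_begin/end() model: one range check
 * up front, one begin/end bracket, and goto-label accessors inside the
 * tight loop.  A global flag stands in for the CPU's "user access
 * enabled" state (SMAP/PAN).
 */
static int region_open;

#define mock_user_access_begin()	(region_open = 1)
#define mock_user_access_end()		(region_open = 0)
#define mock_unsafe_put_user(x, ptr, label) do {	\
	if (!region_open)	/* access outside the bracketed region */ \
		goto label;				\
	*(ptr) = (x);					\
} while (0)

/* Copy 'len' longs to 'dst' under a single begin/end bracket. */
static int copy_longs(long *dst, const long *src, size_t len)
{
	size_t i;

	mock_user_access_begin();
	for (i = 0; i < len; i++)
		mock_unsafe_put_user(src[i], &dst[i], efault);
	mock_user_access_end();
	return 0;

efault:
	mock_user_access_end();
	return -14;	/* -EFAULT */
}
```

In the real kernel the error label is reached via the fault handler rather
than an explicit flag test, but the control-flow shape - no function calls
inside the bracket, one error exit that closes the region - is the same.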

               Linus

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()
  2019-10-07 17:34                 ` Al Viro
@ 2019-10-07 18:13                   ` Linus Torvalds
  2019-10-07 18:22                     ` Al Viro
  0 siblings, 1 reply; 71+ messages in thread
From: Linus Torvalds @ 2019-10-07 18:13 UTC (permalink / raw)
  To: Al Viro; +Cc: Guenter Roeck, Linux Kernel Mailing List, linux-fsdevel

On Mon, Oct 7, 2019 at 10:34 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> Tangentially related: copy_regset_to_user() and copy_regset_from_user().

Not a worry. It's not performance-critical code, and if it ever is, it
needs to be rewritten anyway.


> The former variant tends to lead to few calls
> of __copy_{to,from}_user(); the latter...  On x86 it ends up doing
> this:

Just replace the __put_user() with a put_user() and be done with it.
That code isn't acceptable, and if somebody ever complains about
performance it's not the lack of __put_user that is the problem.
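(The suggested change amounts to using the checking put_user() on every
store in the user-side branch. A userspace sketch of that loop shape -
mock_put_user(), mock_getreg() and the fake register values are inventions
for illustration; only the structure matches the kernel code:)

```c
#include <stddef.h>

/*
 * Userspace sketch of the genregs_get()-style user loop after the
 * s/__put_user/put_user/ change: every store goes through an accessor
 * that performs its own validity check.  A NULL test stands in for the
 * real access_ok() range check.
 */
static unsigned long mock_getreg(unsigned int pos)
{
	return 0x1000 + pos;	/* fake register file contents */
}

static int mock_put_user(unsigned long val, unsigned long *uptr)
{
	if (uptr == NULL)	/* stand-in for the per-call range check */
		return -14;	/* -EFAULT */
	*uptr = val;
	return 0;
}

static int mock_genregs_get_user(unsigned long *ubuf, unsigned int count)
{
	unsigned int pos = 0;
	unsigned long *u = ubuf;

	while (count >= sizeof(*u)) {
		if (mock_put_user(mock_getreg(pos), u++))
			return -14;
		count -= sizeof(*u);
		pos += sizeof(*u);
	}
	return 0;
}
```

The per-call check is redundant work, which is exactly the trade-off being
accepted here: correctness by construction on a path that is not
performance-critical.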

           Linus

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()
  2019-10-07 18:13                   ` Linus Torvalds
@ 2019-10-07 18:22                     ` Al Viro
  0 siblings, 0 replies; 71+ messages in thread
From: Al Viro @ 2019-10-07 18:22 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Guenter Roeck, Linux Kernel Mailing List, linux-fsdevel

On Mon, Oct 07, 2019 at 11:13:27AM -0700, Linus Torvalds wrote:
> On Mon, Oct 7, 2019 at 10:34 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
> >
> > Tangentially related: copy_regset_to_user() and copy_regset_from_user().
> 
> Not a worry. It's not performance-critical code, and if it ever is, it
> needs to be rewritten anyway.
> 
> > The former variant tends to lead to few calls
> > of __copy_{to,from}_user(); the latter...  On x86 it ends up doing
> > this:
> 
> Just replace the __put_user() with a put_user() and be done with it.
> That code isn't acceptable, and if somebody ever complains about
> performance it's not the lack of __put_user that is the problem.

I wonder if it would be better off switching to several "copy in bulk"
like e.g. ppc does.  That boilerplate with parallel "to/from kernel"
and "to/from userland" loops is asking for bugs - the calling
conventions like "pass kbuf and ubuf; exactly one must be NULL"
tend to be trouble, IME; I'm not saying we should just pass
struct iov_iter * instead of count+pos+kbuf+ubuf to ->get() and
->set(), but it might clean things up nicely.

Let me look into that zoo a bit more...

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()
  2019-10-07  3:11               ` Linus Torvalds
  2019-10-07 15:40                 ` David Laight
  2019-10-07 17:34                 ` Al Viro
@ 2019-10-07 18:26                 ` Linus Torvalds
  2019-10-07 18:36                   ` Tony Luck
                                     ` (2 more replies)
  2 siblings, 3 replies; 71+ messages in thread
From: Linus Torvalds @ 2019-10-07 18:26 UTC (permalink / raw)
  To: Al Viro; +Cc: Guenter Roeck, Linux Kernel Mailing List, linux-fsdevel

On Sun, Oct 6, 2019 at 8:11 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> >
> > The last two should just do user_access_begin()/user_access_end()
> > instead of access_ok().  __copy_to_user_inatomic() has very few callers as well:
>
> Yeah, good points.

Looking at it some more this morning, I think it's actually pretty painful.

The good news is that right now x86 is the only architecture that does
that user_access_begin(), so we don't need to worry about anything
else. Apparently the ARM people haven't had enough performance
problems with the PAN bit for them to care.

We can have a fallback wrapper for unsafe_copy_to_user() for other
architectures that just does the __copy_to_user().

But on x86, if we move the STAC/CLAC out of the low-level copy
routines and into the callers, we'll have a _lot_ of churn. I thought
it would be mostly a "teach objtool" thing, but we have lots of
different versions of it. Not just the 32-bit vs 64-bit, it's embedded
in all the low-level asm implementations.

And we don't want the regular "copy_to/from_user()" to then have to
add the STAC/CLAC at the call-site. So then we'd want to un-inline
copy_to_user() entirely.

Which all sounds like a really good idea, don't get me wrong. I think
we inline it way too aggressively now. But it's a _big_ job.

So we probably _should_

 - remove INLINE_COPY_TO/FROM_USER

 - remove all the "small constant size special cases".

 - make "raw_copy_to/from_user()" have the "unsafe" semantics and make
the out-of-line copy in lib/usercopy.c be the only real interface

 - get rid of a _lot_ of oddities

but looking at just how much churn this is, I suspect that for 5.4
it's a bit late to do quite that much cleanup.

I hope you prove me wrong. But I'll look at a smaller change to just
make x86 use the current special copy loop (as
"unsafe_copy_to_user()") and have everybody else do the trivial
wrapper.

Because we definitely should do that cleanup (it also fixes the whole
"atomic copy in kernel space" issue that you pointed to that doesn't
actually want STAC/CLAC at all), but it just looks fairly massive to
me.

            Linus

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()
  2019-10-07 18:26                 ` Linus Torvalds
@ 2019-10-07 18:36                   ` Tony Luck
  2019-10-07 19:08                     ` Linus Torvalds
  2019-10-08  3:29                   ` Al Viro
  2019-10-08 19:58                   ` Al Viro
  2 siblings, 1 reply; 71+ messages in thread
From: Tony Luck @ 2019-10-07 18:36 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Al Viro, Guenter Roeck, Linux Kernel Mailing List, linux-fsdevel

On Mon, Oct 7, 2019 at 11:28 AM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Sun, Oct 6, 2019 at 8:11 PM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > >
> > > The last two should just do user_access_begin()/user_access_end()
> > > instead of access_ok().  __copy_to_user_inatomic() has very few callers as well:
> >
> > Yeah, good points.
>
> Looking at it some more this morning, I think it's actually pretty painful.

Late to this party ... but my ia64 console today is full of:

irqbalance(5244): unaligned access to 0x2000000800042f9b, ip=0xa0000001002fef90
irqbalance(5244): unaligned access to 0x2000000800042fbb, ip=0xa0000001002fef90
irqbalance(5244): unaligned access to 0x2000000800042fdb, ip=0xa0000001002fef90
irqbalance(5244): unaligned access to 0x2000000800042ffb, ip=0xa0000001002fef90
irqbalance(5244): unaligned access to 0x200000080004301b, ip=0xa0000001002fef90
ia64_handle_unaligned: 95 callbacks suppressed
irqbalance(5244): unaligned access to 0x2000000800042f9b, ip=0xa0000001002fef90
irqbalance(5244): unaligned access to 0x2000000800042fbb, ip=0xa0000001002fef90
irqbalance(5244): unaligned access to 0x2000000800042fdb, ip=0xa0000001002fef90
irqbalance(5244): unaligned access to 0x2000000800042ffb, ip=0xa0000001002fef90
irqbalance(5244): unaligned access to 0x200000080004301b, ip=0xa0000001002fef90
ia64_handle_unaligned: 95 callbacks suppressed

Those ip's point into filldir64()

-Tony

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()
  2019-10-07 18:36                   ` Tony Luck
@ 2019-10-07 19:08                     ` Linus Torvalds
  2019-10-07 19:49                       ` Tony Luck
  0 siblings, 1 reply; 71+ messages in thread
From: Linus Torvalds @ 2019-10-07 19:08 UTC (permalink / raw)
  To: Tony Luck
  Cc: Al Viro, Guenter Roeck, Linux Kernel Mailing List, linux-fsdevel

[-- Attachment #1: Type: text/plain, Size: 425 bytes --]

On Mon, Oct 7, 2019 at 11:36 AM Tony Luck <tony.luck@gmail.com> wrote:
>
> Late to this party ... but my ia64 console today is full of:

Hmm? I thought ia64 did unaligneds ok.

But regardless, this is my current (as yet untested) patch.  This is
not the big user access cleanup that I hope Al will do, this is just a
"ok, x86 is the only one who wants a special unsafe_copy_to_user()
right now" patch.

                Linus

[-- Attachment #2: patch.diff --]
[-- Type: text/x-patch, Size: 4604 bytes --]

 arch/x86/include/asm/uaccess.h | 23 ++++++++++++++++++++++
 fs/readdir.c                   | 44 ++----------------------------------------
 include/linux/uaccess.h        |  6 ++++--
 3 files changed, 29 insertions(+), 44 deletions(-)

diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index 35c225ede0e4..61d93f062a36 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -734,5 +734,28 @@ do {										\
 	if (unlikely(__gu_err)) goto err_label;					\
 } while (0)
 
+/*
+ * We want the unsafe accessors to always be inlined and use
+ * the error labels - thus the macro games.
+ */
+#define unsafe_copy_loop(dst, src, len, type, label)			\
+	while (len >= sizeof(type)) {					\
+		unsafe_put_user(*(type *)src,(type __user *)dst,label);	\
+		dst += sizeof(type);					\
+		src += sizeof(type);					\
+		len -= sizeof(type);					\
+	}
+
+#define unsafe_copy_to_user(_dst,_src,_len,label)			\
+do {									\
+	char __user *__ucu_dst = (_dst);				\
+	const char *__ucu_src = (_src);					\
+	size_t __ucu_len = (_len);					\
+	unsafe_copy_loop(__ucu_dst, __ucu_src, __ucu_len, u64, label);	\
+	unsafe_copy_loop(__ucu_dst, __ucu_src, __ucu_len, u32, label);	\
+	unsafe_copy_loop(__ucu_dst, __ucu_src, __ucu_len, u16, label);	\
+	unsafe_copy_loop(__ucu_dst, __ucu_src, __ucu_len, u8, label);	\
+} while (0)
+
 #endif /* _ASM_X86_UACCESS_H */
 
diff --git a/fs/readdir.c b/fs/readdir.c
index 19bea591c3f1..6e2623e57b2e 100644
--- a/fs/readdir.c
+++ b/fs/readdir.c
@@ -27,53 +27,13 @@
 /*
  * Note the "unsafe_put_user() semantics: we goto a
  * label for errors.
- *
- * Also note how we use a "while()" loop here, even though
- * only the biggest size needs to loop. The compiler (well,
- * at least gcc) is smart enough to turn the smaller sizes
- * into just if-statements, and this way we don't need to
- * care whether 'u64' or 'u32' is the biggest size.
- */
-#define unsafe_copy_loop(dst, src, len, type, label) 		\
-	while (len >= sizeof(type)) {				\
-		unsafe_put_user(get_unaligned((type *)src),	\
-			(type __user *)dst, label);		\
-		dst += sizeof(type);				\
-		src += sizeof(type);				\
-		len -= sizeof(type);				\
-	}
-
-/*
- * We avoid doing 64-bit copies on 32-bit architectures. They
- * might be better, but the component names are mostly small,
- * and the 64-bit cases can end up being much more complex and
- * put much more register pressure on the code, so it's likely
- * not worth the pain of unaligned accesses etc.
- *
- * So limit the copies to "unsigned long" size. I did verify
- * that at least the x86-32 case is ok without this limiting,
- * but I worry about random other legacy 32-bit cases that
- * might not do as well.
- */
-#define unsafe_copy_type(dst, src, len, type, label) do {	\
-	if (sizeof(type) <= sizeof(unsigned long))		\
-		unsafe_copy_loop(dst, src, len, type, label);	\
-} while (0)
-
-/*
- * Copy the dirent name to user space, and NUL-terminate
- * it. This should not be a function call, since we're doing
- * the copy inside a "user_access_begin/end()" section.
  */
 #define unsafe_copy_dirent_name(_dst, _src, _len, label) do {	\
 	char __user *dst = (_dst);				\
 	const char *src = (_src);				\
 	size_t len = (_len);					\
-	unsafe_copy_type(dst, src, len, u64, label);	 	\
-	unsafe_copy_type(dst, src, len, u32, label);		\
-	unsafe_copy_type(dst, src, len, u16, label);		\
-	unsafe_copy_type(dst, src, len, u8,  label);		\
-	unsafe_put_user(0, dst, label);				\
+	unsafe_put_user(0, dst+len, label);			\
+	unsafe_copy_to_user(dst, src, len, label);		\
 } while (0)
 
 
diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h
index e47d0522a1f4..d4ee6e942562 100644
--- a/include/linux/uaccess.h
+++ b/include/linux/uaccess.h
@@ -355,8 +355,10 @@ extern long strnlen_unsafe_user(const void __user *unsafe_addr, long count);
 #ifndef user_access_begin
 #define user_access_begin(ptr,len) access_ok(ptr, len)
 #define user_access_end() do { } while (0)
-#define unsafe_get_user(x, ptr, err) do { if (unlikely(__get_user(x, ptr))) goto err; } while (0)
-#define unsafe_put_user(x, ptr, err) do { if (unlikely(__put_user(x, ptr))) goto err; } while (0)
+#define unsafe_op_wrap(op, err) do { if (unlikely(op)) goto err; } while (0)
+#define unsafe_get_user(x,p,e) unsafe_op_wrap(__get_user(x,p),e)
+#define unsafe_put_user(x,p,e) unsafe_op_wrap(__put_user(x,p),e)
+#define unsafe_copy_to_user(d,s,l,e) unsafe_op_wrap(__copy_to_user(d,s,l),e)
 static inline unsigned long user_access_save(void) { return 0UL; }
 static inline void user_access_restore(unsigned long flags) { }
 #endif
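(The descending-size copy in the x86 hunk above can be exercised in plain
userspace by swapping unsafe_put_user() for a direct store. This is a
sketch, not the kernel macro: note that the unaligned wide stores it
performs are fine on x86 but are precisely what faults or traps on
strict-alignment machines like alpha and ia64, which is why the fs/readdir.c
version went back to get_unaligned() / byte-wise behavior elsewhere:)

```c
#include <stdint.h>
#include <stddef.h>

/*
 * Userspace rendition of the patch's copy loop: try u64 chunks, then
 * u32, u16, u8.  Plain stores replace unsafe_put_user(), so there is
 * no fault label; the macro structure otherwise mirrors the patch.
 * The casts perform potentially unaligned accesses - acceptable on
 * x86, the whole point of contention on strict-alignment arches.
 */
#define copy_loop(dst, src, len, type)			\
	while (len >= sizeof(type)) {			\
		*(type *)dst = *(const type *)src;	\
		dst += sizeof(type);			\
		src += sizeof(type);			\
		len -= sizeof(type);			\
	}

static void copy_descending(char *_dst, const char *_src, size_t _len)
{
	char *dst = _dst;
	const char *src = _src;
	size_t len = _len;

	copy_loop(dst, src, len, uint64_t);
	copy_loop(dst, src, len, uint32_t);
	copy_loop(dst, src, len, uint16_t);
	copy_loop(dst, src, len, uint8_t);
}
```

As the removed fs/readdir.c comment noted, the compiler turns the smaller
"loops" into single if-statements, since each can execute at most once after
the larger sizes have consumed their multiples.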

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()
  2019-10-06 22:20 [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user() Guenter Roeck
  2019-10-06 23:06 ` Linus Torvalds
  2019-10-07  4:04 ` Max Filippov
@ 2019-10-07 19:21 ` Linus Torvalds
  2019-10-07 20:29   ` Guenter Roeck
  2019-10-07 23:27   ` Guenter Roeck
  2 siblings, 2 replies; 71+ messages in thread
From: Linus Torvalds @ 2019-10-07 19:21 UTC (permalink / raw)
  To: Guenter Roeck, Michael Cree
  Cc: Linux Kernel Mailing List, Alexander Viro, linux-fsdevel, linux-arch

[-- Attachment #1: Type: text/plain, Size: 905 bytes --]

On Sun, Oct 6, 2019 at 3:20 PM Guenter Roeck <linux@roeck-us.net> wrote:
>
> this patch causes all my sparc64 emulations to stall during boot. It causes
> all alpha emulations to crash with [1a] and [1b] when booting from a virtual
> disk, and one of the xtensa emulations to crash with [2].

So I think your alpha emulation environment may be broken, because
Michael Cree reports that it works for him on real hardware, but he
does see the kernel unaligned count being high.

But regardless, this is my current fairly minimal patch that I think
should fix the unaligned issue, while still giving the behavior we
want on x86. I hope Al can do something nicer, but I think this is
"acceptable".

I'm running this now on x86, and I verified that x86-32 code
generation looks sane too, but it would be good to verify that this
makes the alignment issue go away on other architectures.

                Linus

[-- Attachment #2: patch.diff --]
[-- Type: text/x-patch, Size: 4604 bytes --]

 arch/x86/include/asm/uaccess.h | 23 ++++++++++++++++++++++
 fs/readdir.c                   | 44 ++----------------------------------------
 include/linux/uaccess.h        |  6 ++++--
 3 files changed, 29 insertions(+), 44 deletions(-)

diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index 35c225ede0e4..61d93f062a36 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -734,5 +734,28 @@ do {										\
 	if (unlikely(__gu_err)) goto err_label;					\
 } while (0)
 
+/*
+ * We want the unsafe accessors to always be inlined and use
+ * the error labels - thus the macro games.
+ */
+#define unsafe_copy_loop(dst, src, len, type, label)			\
+	while (len >= sizeof(type)) {					\
+		unsafe_put_user(*(type *)src,(type __user *)dst,label);	\
+		dst += sizeof(type);					\
+		src += sizeof(type);					\
+		len -= sizeof(type);					\
+	}
+
+#define unsafe_copy_to_user(_dst,_src,_len,label)			\
+do {									\
+	char __user *__ucu_dst = (_dst);				\
+	const char *__ucu_src = (_src);					\
+	size_t __ucu_len = (_len);					\
+	unsafe_copy_loop(__ucu_dst, __ucu_src, __ucu_len, u64, label);	\
+	unsafe_copy_loop(__ucu_dst, __ucu_src, __ucu_len, u32, label);	\
+	unsafe_copy_loop(__ucu_dst, __ucu_src, __ucu_len, u16, label);	\
+	unsafe_copy_loop(__ucu_dst, __ucu_src, __ucu_len, u8, label);	\
+} while (0)
+
 #endif /* _ASM_X86_UACCESS_H */
 
diff --git a/fs/readdir.c b/fs/readdir.c
index 19bea591c3f1..6e2623e57b2e 100644
--- a/fs/readdir.c
+++ b/fs/readdir.c
@@ -27,53 +27,13 @@
 /*
  * Note the "unsafe_put_user() semantics: we goto a
  * label for errors.
- *
- * Also note how we use a "while()" loop here, even though
- * only the biggest size needs to loop. The compiler (well,
- * at least gcc) is smart enough to turn the smaller sizes
- * into just if-statements, and this way we don't need to
- * care whether 'u64' or 'u32' is the biggest size.
- */
-#define unsafe_copy_loop(dst, src, len, type, label) 		\
-	while (len >= sizeof(type)) {				\
-		unsafe_put_user(get_unaligned((type *)src),	\
-			(type __user *)dst, label);		\
-		dst += sizeof(type);				\
-		src += sizeof(type);				\
-		len -= sizeof(type);				\
-	}
-
-/*
- * We avoid doing 64-bit copies on 32-bit architectures. They
- * might be better, but the component names are mostly small,
- * and the 64-bit cases can end up being much more complex and
- * put much more register pressure on the code, so it's likely
- * not worth the pain of unaligned accesses etc.
- *
- * So limit the copies to "unsigned long" size. I did verify
- * that at least the x86-32 case is ok without this limiting,
- * but I worry about random other legacy 32-bit cases that
- * might not do as well.
- */
-#define unsafe_copy_type(dst, src, len, type, label) do {	\
-	if (sizeof(type) <= sizeof(unsigned long))		\
-		unsafe_copy_loop(dst, src, len, type, label);	\
-} while (0)
-
-/*
- * Copy the dirent name to user space, and NUL-terminate
- * it. This should not be a function call, since we're doing
- * the copy inside a "user_access_begin/end()" section.
  */
 #define unsafe_copy_dirent_name(_dst, _src, _len, label) do {	\
 	char __user *dst = (_dst);				\
 	const char *src = (_src);				\
 	size_t len = (_len);					\
-	unsafe_copy_type(dst, src, len, u64, label);	 	\
-	unsafe_copy_type(dst, src, len, u32, label);		\
-	unsafe_copy_type(dst, src, len, u16, label);		\
-	unsafe_copy_type(dst, src, len, u8,  label);		\
-	unsafe_put_user(0, dst, label);				\
+	unsafe_put_user(0, dst+len, label);			\
+	unsafe_copy_to_user(dst, src, len, label);		\
 } while (0)
 
 
diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h
index e47d0522a1f4..d4ee6e942562 100644
--- a/include/linux/uaccess.h
+++ b/include/linux/uaccess.h
@@ -355,8 +355,10 @@ extern long strnlen_unsafe_user(const void __user *unsafe_addr, long count);
 #ifndef user_access_begin
 #define user_access_begin(ptr,len) access_ok(ptr, len)
 #define user_access_end() do { } while (0)
-#define unsafe_get_user(x, ptr, err) do { if (unlikely(__get_user(x, ptr))) goto err; } while (0)
-#define unsafe_put_user(x, ptr, err) do { if (unlikely(__put_user(x, ptr))) goto err; } while (0)
+#define unsafe_op_wrap(op, err) do { if (unlikely(op)) goto err; } while (0)
+#define unsafe_get_user(x,p,e) unsafe_op_wrap(__get_user(x,p),e)
+#define unsafe_put_user(x,p,e) unsafe_op_wrap(__put_user(x,p),e)
+#define unsafe_copy_to_user(d,s,l,e) unsafe_op_wrap(__copy_to_user(d,s,l),e)
 static inline unsigned long user_access_save(void) { return 0UL; }
 static inline void user_access_restore(unsigned long flags) { }
 #endif

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()
  2019-10-07 19:08                     ` Linus Torvalds
@ 2019-10-07 19:49                       ` Tony Luck
  2019-10-07 20:04                         ` Linus Torvalds
  0 siblings, 1 reply; 71+ messages in thread
From: Tony Luck @ 2019-10-07 19:49 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Al Viro, Guenter Roeck, Linux Kernel Mailing List, linux-fsdevel

On Mon, Oct 7, 2019 at 12:09 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> Hmm? I thought ia64 did unaligneds ok.

If PSR.ac is set, we trap. If it isn't set, behaviour is model specific
(though all implementations will trap for an unaligned access that
crosses a 4K boundary).

Linux sets PSR.ac. Applications can use prctl(PR_SET_UNALIGN) to choose whether
they want the kernel to silently fix things or to send SIGBUS.
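As a hedged illustration of that knob (the constants are real prctl(2) values, but the call only succeeds on architectures with configurable unaligned handling such as ia64 or parisc; elsewhere it fails with EINVAL):

```c
#include <assert.h>
#include <errno.h>
#include <sys/prctl.h>

/* Ask the kernel to fix up unaligned accesses silently instead of
 * printing/sending SIGBUS.  Returns 0 on success; on architectures
 * without the knob (e.g. x86) it returns -1 with errno == EINVAL. */
static int set_unalign_silent(void)
{
	return prctl(PR_SET_UNALIGN, PR_UNALIGN_NOPRINT, 0, 0, 0);
}
```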

The kernel always noisily (rate limited) fixes up unaligned accesses.

Your patch does make all the messages go away.

Tested-by: Tony Luck <tony.luck@intel.com>


* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()
  2019-10-07 19:49                       ` Tony Luck
@ 2019-10-07 20:04                         ` Linus Torvalds
  0 siblings, 0 replies; 71+ messages in thread
From: Linus Torvalds @ 2019-10-07 20:04 UTC (permalink / raw)
  To: Tony Luck
  Cc: Al Viro, Guenter Roeck, Linux Kernel Mailing List, linux-fsdevel

On Mon, Oct 7, 2019 at 12:49 PM Tony Luck <tony.luck@gmail.com> wrote:
>
> If PSR.ac is set, we trap. If it isn't set, then model specific
> (though all implementations will
> trap for an unaligned access that crosses a 4K boundary).

Ok. At that point, setting AC unconditionally is the better model just
to get test coverage for "it will trap occasionally anyway".

Odd "almost-but-not-quite x86" both in naming and in behavior (AC was
a no-op in kernel-mode until SMAP).

> Your patch does make all the messages go away.
>
> Tested-by: Tony Luck <tony.luck@intel.com>

Ok, I'll commit it, and we'll see what Al can come up with that might
be a bigger cleanup.

             Linus


* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()
  2019-10-07 19:21 ` Linus Torvalds
@ 2019-10-07 20:29   ` Guenter Roeck
  2019-10-07 23:27   ` Guenter Roeck
  1 sibling, 0 replies; 71+ messages in thread
From: Guenter Roeck @ 2019-10-07 20:29 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Michael Cree, Linux Kernel Mailing List, Alexander Viro,
	linux-fsdevel, linux-arch

On Mon, Oct 07, 2019 at 12:21:25PM -0700, Linus Torvalds wrote:
> On Sun, Oct 6, 2019 at 3:20 PM Guenter Roeck <linux@roeck-us.net> wrote:
> >
> > this patch causes all my sparc64 emulations to stall during boot. It causes
> > all alpha emulations to crash with [1a] and [1b] when booting from a virtual
> > disk, and one of the xtensa emulations to crash with [2].
> 
> So I think your alpha emulation environment may be broken, because
> Michael Cree reports that it works for him on real hardware, but he
> does see the kernel unaligned count being high.
> 
Yes, that possibility always exists, unfortunately.

> But regardless, this is my current fairly minimal patch that I think
> should fix the unaligned issue, while still giving the behavior we
> want on x86. I hope Al can do something nicer, but I think this is
> "acceptable".
> 
> I'm running this now on x86, and I verified that x86-32 code
> generation looks sane too, but it woudl be good to verify that this
> makes the alignment issue go away on other architectures.
> 
>                 Linus

I started a complete test run with the patch applied. I'll let you know
how it went after it is complete - it should be done in a couple of hours.

Guenter


* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()
  2019-10-07 19:21 ` Linus Torvalds
  2019-10-07 20:29   ` Guenter Roeck
@ 2019-10-07 23:27   ` Guenter Roeck
  2019-10-08  6:28     ` Geert Uytterhoeven
  1 sibling, 1 reply; 71+ messages in thread
From: Guenter Roeck @ 2019-10-07 23:27 UTC (permalink / raw)
  To: Linus Torvalds, Michael Cree
  Cc: Linux Kernel Mailing List, Alexander Viro, linux-fsdevel, linux-arch

On 10/7/19 12:21 PM, Linus Torvalds wrote:
> On Sun, Oct 6, 2019 at 3:20 PM Guenter Roeck <linux@roeck-us.net> wrote:
>>
>> this patch causes all my sparc64 emulations to stall during boot. It causes
>> all alpha emulations to crash with [1a] and [1b] when booting from a virtual
>> disk, and one of the xtensa emulations to crash with [2].
> 
> So I think your alpha emulation environment may be broken, because
> Michael Cree reports that it works for him on real hardware, but he
> does see the kernel unaligned count being high.
> 
> But regardless, this is my current fairly minimal patch that I think
> should fix the unaligned issue, while still giving the behavior we
> want on x86. I hope Al can do something nicer, but I think this is
> "acceptable".
> 
> I'm running this now on x86, and I verified that x86-32 code
> generation looks sane too, but it would be good to verify that this
> makes the alignment issue go away on other architectures.
> 
>                  Linus
> 

Test results look good. Feel free to add
Tested-by: Guenter Roeck <linux@roeck-us.net>
to your patch.

Build results:
	total: 158 pass: 154 fail: 4
Failed builds:
	arm:allmodconfig
	m68k:defconfig
	mips:allmodconfig
	sparc64:allmodconfig
Qemu test results:
	total: 391 pass: 390 fail: 1
Failed tests:
	ppc64:mpc8544ds:ppc64_e5500_defconfig:nosmp:initrd

This is with "regulator: fixed: Prevent NULL pointer dereference when !CONFIG_OF"
applied as well. The other failures are unrelated.

arm:

arch/arm/crypto/aes-ce-core.S:299: Error:
	selected processor does not support `movw ip,:lower16:.Lcts_permute_table' in ARM mode

Fix is pending in crypto tree.

m68k:

c2p_iplan2.c:(.text+0x98): undefined reference to `c2p_unsupported'

I don't know the status.

mips:

drivers/staging/octeon/ethernet-defines.h:30:38: error:
		'CONFIG_CAVIUM_OCTEON_CVMSEG_SIZE' undeclared
and other similar errors. I don't know the status.

ppc64:

powerpc64-linux-ld: mm/page_alloc.o:(.toc+0x18): undefined reference to `node_reclaim_distance'

Reported against offending patch earlier today.

sparc64:

drivers/watchdog/cpwd.c:500:19: error: 'compat_ptr_ioctl' undeclared here

Oops. I'll need to look into that. Looks like the patch to use a new
infrastructure made it into the kernel but the infrastructure itself
didn't make it after all.

Guenter


* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()
  2019-10-07 18:26                 ` Linus Torvalds
  2019-10-07 18:36                   ` Tony Luck
@ 2019-10-08  3:29                   ` Al Viro
  2019-10-08  4:09                     ` Linus Torvalds
  2019-10-08 19:58                   ` Al Viro
  2 siblings, 1 reply; 71+ messages in thread
From: Al Viro @ 2019-10-08  3:29 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Guenter Roeck, Linux Kernel Mailing List, linux-fsdevel

On Mon, Oct 07, 2019 at 11:26:35AM -0700, Linus Torvalds wrote:

> But on x86, if we move the STAC/CLAC out of the low-level copy
> routines and into the callers, we'll have a _lot_ of churn. I thought
> it would be mostly a "teach objtool" thing, but we have lots of
> different versions of it. Not just the 32-bit vs 64-bit, it's embedded
> in all the low-level asm implementations.
> 
> And we don't want the regular "copy_to/from_user()" to then have to
> add the STAC/CLAC at the call-site. So then we'd want to un-inline
> copy_to_user() entirely.

For x86?  Sure, why not...  Note, BTW, that for short constant-sized
copies we *do* STAC/CLAC at the call site - see those
		__uaccess_begin_nospec();
in raw_copy_{from,to}_user() in the switches...

> Which all sounds like a really good idea, don't get me wrong. I think
> we inline it way too aggressively now. But it's a _big_ job.
> 
> So we probably _should_
> 
>  - remove INLINE_COPY_TO/FROM_USER
> 
>  - remove all the "small constant size special cases".
> 
>  - make "raw_copy_to/from_user()" have the "unsafe" semantics and make
> the out-of-line copy in lib/usercopy.c be the only real interface
> 
>  - get rid of a _lot_ of oddities

Not that many, really.  All we need is a temporary cross-architecture
__uaccess_begin_nospec(), so that __copy_{to,from}_user() could have
that used, instead of having it done in (x86) raw_copy_..._...().

Other callers of raw_copy_...() would simply wrap it into user_access_begin()/
user_access_end() pairs; this kludge is needed only in __copy_from_user()
and __copy_to_user(), and only until we kill their callers outside of
arch/*.  Which we can do, in a cycle or two.  _ANY_ use of
that temporary kludge outside of those two functions will be grepped
for and LARTed into the ground.

> I hope you prove me wrong. But I'll look at a smaller change to just
> make x86 use the current special copy loop (as
> "unsafe_copy_to_user()") and have everybody else do the trivial
> wrapper.
> 
> Because we definitely should do that cleanup (it also fixes the whole
> "atomic copy in kernel space" issue that you pointed to that doesn't
> actually want STAC/CLAC at all), but it just looks fairly massive to
> me.

AFAICS, it splits nicely.

1) cross-architecture user_access_begin_dont_use(): on everything
except x86 it's empty, on x86 - __uaccess_begin_nospec().

2) stac/clac lifted into x86 raw_copy_..._user() out of
copy_user_generic_unrolled(), copy_user_generic_string() and
copy_user_enhanced_fast_string().  Similar lift out of
__copy_user_nocache().

3) lifting that thing as user_access_begin_dont_use() into
__copy_..._user...() and as user_access_begin() into other
generic callers, consuming access_ok() in the latter.
__copy_to_user_inatomic() can die at the same stage.

4) possibly uninlining on x86 (and yes, killing the special
size handling off).  We don't need to touch the inlining
decisions for any other architectures.

At that point raw_copy_to_user() is available for e.g.
readdir.c to play with.

And up to that point only x86 sees any kind of code changes,
so we don't have to worry about other architectures.

5) kill the __copy_...() users outside of arch/*, along with
quite a few other weird shits in there.  A cycle or two,
with the final goal being to kill the damn things off.

6) arch/* users get arch-by-arch treatment - mostly
it's sigframe handling.  Won't affect the generic code
and would be independent for different architectures.
Can happen in parallel with (5), actually.

7) ... at that point user_access_begin_dont_use() gets
removed and thrown into the pile of mangled fingers of
those who'd ignored all warnings and used it somewhere
else.

I don't believe that (5) would be doable entirely in
this cycle, but quite a few bits might be.

On a somewhat related note, do you see any problems with

void *copy_mount_options(const void __user * data)
{
        unsigned offs, size;
        char *copy;

        if (!data)
                return NULL;

        copy = kmalloc(PAGE_SIZE, GFP_KERNEL);
        if (!copy)
                return ERR_PTR(-ENOMEM);

        offs = (unsigned long)untagged_addr(data) & (PAGE_SIZE - 1);

        if (copy_from_user(copy, data, PAGE_SIZE - offs)) {
                kfree(copy);
                return ERR_PTR(-EFAULT);
        }
        if (offs) {
                if (copy_from_user(copy + PAGE_SIZE - offs, data + PAGE_SIZE - offs, offs))
                        memset(copy + PAGE_SIZE - offs, 0, offs);
        }
        return copy;
}

on the theory that any fault halfway through a page means a race with
munmap/mprotect/etc. and we can just pretend we'd lost the race entirely.
And to hell with exact_copy_from_user(), byte-by-byte copying, etc.
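The two-copy split above hinges on one piece of arithmetic: how many bytes remain before the next page boundary. A tiny sketch of just that calculation (PAGE_SIZE hardcoded to the common 4K for illustration):

```c
#include <assert.h>

#define PAGE_SIZE 4096ul

/* Bytes from addr up to the end of its page: this is the
 * 'PAGE_SIZE - offs' length of the first copy_from_user() above, and
 * it equals PAGE_SIZE exactly when addr is page-aligned (in which case
 * offs == 0 and the second copy is skipped). */
static unsigned long bytes_to_page_end(unsigned long addr)
{
	return PAGE_SIZE - (addr & (PAGE_SIZE - 1));
}
```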


* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()
  2019-10-08  3:29                   ` Al Viro
@ 2019-10-08  4:09                     ` Linus Torvalds
  2019-10-08  4:14                       ` Linus Torvalds
                                         ` (2 more replies)
  0 siblings, 3 replies; 71+ messages in thread
From: Linus Torvalds @ 2019-10-08  4:09 UTC (permalink / raw)
  To: Al Viro; +Cc: Guenter Roeck, Linux Kernel Mailing List, linux-fsdevel

[-- Attachment #1: Type: text/plain, Size: 2505 bytes --]

On Mon, Oct 7, 2019 at 8:29 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> For x86?  Sure, why not...  Note, BTW, that for short constant-sized
> copies we *do* STAC/CLAC at the call site - see those
>                 __uaccess_begin_nospec();
> in raw_copy_{from,to}_user() in the switches...

Yeah, and that code almost never actually triggers in practice. The
code is pointless and dead.

The thing is, it's only ever used for the double-underscore versions,
and the ones that do have it almost never pass constant sizes in the
first place.

And yes, there's like a couple of cases in the whole kernel.

Just remove those constant size cases. They are pointless and just
complicate our headers and slow down the compile for no good reason.

Try the attached patch, and then count the number of "rorx"
instructions in the kernel. Hint: not many. On my personal config,
this triggers 15 times in the whole kernel build (not counting
modules).

It's not worth it. The "speedup" from using __copy_{to,from}_user()
with the fancy inlining is negligible. All the cost is in the
STAC/CLAC anyway, so the code might as well be deleted.

> 1) cross-architecture user_access_begin_dont_use(): on everything
> except x86 it's empty, on x86 - __uaccess_begin_nospec().

No, just do a proper range check, and use user_access_begin()

Stop trying to optimize that range check away. It's a couple of fast
instructions.

The only ones who don't want the range check are the actual kernel
copy ones, but they don't want the user_access_begin() either.
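The range check being defended here really is a couple of compares; a sketch of an access_ok()-style check, written to dodge the classic addr+size wraparound (TASK_LIMIT is a made-up stand-in for the real user/kernel split, not a kernel constant):

```c
#include <assert.h>
#include <stdint.h>

#define TASK_LIMIT 0x0000800000000000ull	/* hypothetical top of user space */

/* Overflow-safe range check: never computes addr + size directly, so a
 * huge size or an addr near the top of the address space cannot wrap
 * around and slip past the limit. */
static int range_ok(uint64_t addr, uint64_t size)
{
	return size <= TASK_LIMIT && addr <= TASK_LIMIT - size;
}
```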

> void *copy_mount_options(const void __user * data)
> {
>         unsigned offs, size;
>         char *copy;
>
>         if (!data)
>                 return NULL;
>
>         copy = kmalloc(PAGE_SIZE, GFP_KERNEL);
>         if (!copy)
>                 return ERR_PTR(-ENOMEM);
>
>         offs = (unsigned long)untagged_addr(data) & (PAGE_SIZE - 1);
>
>         if (copy_from_user(copy, data, PAGE_SIZE - offs)) {
>                 kfree(copy);
>                 return ERR_PTR(-EFAULT);
>         }
>         if (offs) {
>                 if (copy_from_user(copy, data + PAGE_SIZE - offs, offs))
>                         memset(copy + PAGE_SIZE - offs, 0, offs);
>         }
>         return copy;
> }
>
> on the theory that any fault halfway through a page means a race with
> munmap/mprotect/etc. and we can just pretend we'd lost the race entirely.
> And to hell with exact_copy_from_user(), byte-by-byte copying, etc.

Looks reasonable.

              Linus

[-- Attachment #2: patch.diff --]
[-- Type: text/x-patch, Size: 2965 bytes --]

diff --git a/arch/x86/include/asm/uaccess_64.h b/arch/x86/include/asm/uaccess_64.h
index 5cd1caa8bc65..db58c4436ce3 100644
--- a/arch/x86/include/asm/uaccess_64.h
+++ b/arch/x86/include/asm/uaccess_64.h
@@ -62,6 +62,8 @@ copy_to_user_mcsafe(void *to, const void *from, unsigned len)
 	return ret;
 }
 
+#define marker(x) asm volatile("rorx $" #x ",%rax,%rdx")
+
 static __always_inline __must_check unsigned long
 raw_copy_from_user(void *dst, const void __user *src, unsigned long size)
 {
@@ -72,30 +74,35 @@ raw_copy_from_user(void *dst, const void __user *src, unsigned long size)
 	switch (size) {
 	case 1:
 		__uaccess_begin_nospec();
+		marker(1);
 		__get_user_asm_nozero(*(u8 *)dst, (u8 __user *)src,
 			      ret, "b", "b", "=q", 1);
 		__uaccess_end();
 		return ret;
 	case 2:
 		__uaccess_begin_nospec();
+		marker(2);
 		__get_user_asm_nozero(*(u16 *)dst, (u16 __user *)src,
 			      ret, "w", "w", "=r", 2);
 		__uaccess_end();
 		return ret;
 	case 4:
 		__uaccess_begin_nospec();
+		marker(4);
 		__get_user_asm_nozero(*(u32 *)dst, (u32 __user *)src,
 			      ret, "l", "k", "=r", 4);
 		__uaccess_end();
 		return ret;
 	case 8:
 		__uaccess_begin_nospec();
+		marker(8);
 		__get_user_asm_nozero(*(u64 *)dst, (u64 __user *)src,
 			      ret, "q", "", "=r", 8);
 		__uaccess_end();
 		return ret;
 	case 10:
 		__uaccess_begin_nospec();
+		marker(10);
 		__get_user_asm_nozero(*(u64 *)dst, (u64 __user *)src,
 			       ret, "q", "", "=r", 10);
 		if (likely(!ret))
@@ -106,6 +113,7 @@ raw_copy_from_user(void *dst, const void __user *src, unsigned long size)
 		return ret;
 	case 16:
 		__uaccess_begin_nospec();
+		marker(16);
 		__get_user_asm_nozero(*(u64 *)dst, (u64 __user *)src,
 			       ret, "q", "", "=r", 16);
 		if (likely(!ret))
@@ -129,30 +137,35 @@ raw_copy_to_user(void __user *dst, const void *src, unsigned long size)
 	switch (size) {
 	case 1:
 		__uaccess_begin();
+		marker(51);
 		__put_user_asm(*(u8 *)src, (u8 __user *)dst,
 			      ret, "b", "b", "iq", 1);
 		__uaccess_end();
 		return ret;
 	case 2:
 		__uaccess_begin();
+		marker(52);
 		__put_user_asm(*(u16 *)src, (u16 __user *)dst,
 			      ret, "w", "w", "ir", 2);
 		__uaccess_end();
 		return ret;
 	case 4:
 		__uaccess_begin();
+		marker(54);
 		__put_user_asm(*(u32 *)src, (u32 __user *)dst,
 			      ret, "l", "k", "ir", 4);
 		__uaccess_end();
 		return ret;
 	case 8:
 		__uaccess_begin();
+		marker(58);
 		__put_user_asm(*(u64 *)src, (u64 __user *)dst,
 			      ret, "q", "", "er", 8);
 		__uaccess_end();
 		return ret;
 	case 10:
 		__uaccess_begin();
+		marker(60);
 		__put_user_asm(*(u64 *)src, (u64 __user *)dst,
 			       ret, "q", "", "er", 10);
 		if (likely(!ret)) {
@@ -164,6 +177,7 @@ raw_copy_to_user(void __user *dst, const void *src, unsigned long size)
 		return ret;
 	case 16:
 		__uaccess_begin();
+		marker(66);
 		__put_user_asm(*(u64 *)src, (u64 __user *)dst,
 			       ret, "q", "", "er", 16);
 		if (likely(!ret)) {


* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()
  2019-10-08  4:09                     ` Linus Torvalds
@ 2019-10-08  4:14                       ` Linus Torvalds
  2019-10-08  5:02                         ` Al Viro
  2019-10-08  4:24                       ` Linus Torvalds
  2019-10-08  4:57                       ` [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user() Al Viro
  2 siblings, 1 reply; 71+ messages in thread
From: Linus Torvalds @ 2019-10-08  4:14 UTC (permalink / raw)
  To: Al Viro; +Cc: Guenter Roeck, Linux Kernel Mailing List, linux-fsdevel

[-- Attachment #1: Type: text/plain, Size: 761 bytes --]

On Mon, Oct 7, 2019 at 9:09 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> Try the attached patch, and then count the number of "rorx"
> instructions in the kernel. Hint: not many. On my personal config,
> this triggers 15 times in the whole kernel build (not counting
> modules).

So here's a serious patch that doesn't just mark things for counting -
it just removes the cases entirely.

Doesn't this look nice:

  2 files changed, 2 insertions(+), 133 deletions(-)

and it is one less thing to worry about when doing further cleanup.

Seriously, if any of those __copy_{to,from}_user() constant cases were
a big deal, we can turn them into get_user/put_user calls. But only
after they show up as an actual performance issue.

            Linus

[-- Attachment #2: patch.diff --]
[-- Type: text/x-patch, Size: 4900 bytes --]

 arch/x86/include/asm/uaccess_32.h |  27 ----------
 arch/x86/include/asm/uaccess_64.h | 108 +-------------------------------------
 2 files changed, 2 insertions(+), 133 deletions(-)

diff --git a/arch/x86/include/asm/uaccess_32.h b/arch/x86/include/asm/uaccess_32.h
index ba2dc1930630..388a40660c7b 100644
--- a/arch/x86/include/asm/uaccess_32.h
+++ b/arch/x86/include/asm/uaccess_32.h
@@ -23,33 +23,6 @@ raw_copy_to_user(void __user *to, const void *from, unsigned long n)
 static __always_inline unsigned long
 raw_copy_from_user(void *to, const void __user *from, unsigned long n)
 {
-	if (__builtin_constant_p(n)) {
-		unsigned long ret;
-
-		switch (n) {
-		case 1:
-			ret = 0;
-			__uaccess_begin_nospec();
-			__get_user_asm_nozero(*(u8 *)to, from, ret,
-					      "b", "b", "=q", 1);
-			__uaccess_end();
-			return ret;
-		case 2:
-			ret = 0;
-			__uaccess_begin_nospec();
-			__get_user_asm_nozero(*(u16 *)to, from, ret,
-					      "w", "w", "=r", 2);
-			__uaccess_end();
-			return ret;
-		case 4:
-			ret = 0;
-			__uaccess_begin_nospec();
-			__get_user_asm_nozero(*(u32 *)to, from, ret,
-					      "l", "k", "=r", 4);
-			__uaccess_end();
-			return ret;
-		}
-	}
 	return __copy_user_ll(to, (__force const void *)from, n);
 }
 
diff --git a/arch/x86/include/asm/uaccess_64.h b/arch/x86/include/asm/uaccess_64.h
index 5cd1caa8bc65..bc10e3dc64fe 100644
--- a/arch/x86/include/asm/uaccess_64.h
+++ b/arch/x86/include/asm/uaccess_64.h
@@ -65,117 +65,13 @@ copy_to_user_mcsafe(void *to, const void *from, unsigned len)
 static __always_inline __must_check unsigned long
 raw_copy_from_user(void *dst, const void __user *src, unsigned long size)
 {
-	int ret = 0;
-
-	if (!__builtin_constant_p(size))
-		return copy_user_generic(dst, (__force void *)src, size);
-	switch (size) {
-	case 1:
-		__uaccess_begin_nospec();
-		__get_user_asm_nozero(*(u8 *)dst, (u8 __user *)src,
-			      ret, "b", "b", "=q", 1);
-		__uaccess_end();
-		return ret;
-	case 2:
-		__uaccess_begin_nospec();
-		__get_user_asm_nozero(*(u16 *)dst, (u16 __user *)src,
-			      ret, "w", "w", "=r", 2);
-		__uaccess_end();
-		return ret;
-	case 4:
-		__uaccess_begin_nospec();
-		__get_user_asm_nozero(*(u32 *)dst, (u32 __user *)src,
-			      ret, "l", "k", "=r", 4);
-		__uaccess_end();
-		return ret;
-	case 8:
-		__uaccess_begin_nospec();
-		__get_user_asm_nozero(*(u64 *)dst, (u64 __user *)src,
-			      ret, "q", "", "=r", 8);
-		__uaccess_end();
-		return ret;
-	case 10:
-		__uaccess_begin_nospec();
-		__get_user_asm_nozero(*(u64 *)dst, (u64 __user *)src,
-			       ret, "q", "", "=r", 10);
-		if (likely(!ret))
-			__get_user_asm_nozero(*(u16 *)(8 + (char *)dst),
-				       (u16 __user *)(8 + (char __user *)src),
-				       ret, "w", "w", "=r", 2);
-		__uaccess_end();
-		return ret;
-	case 16:
-		__uaccess_begin_nospec();
-		__get_user_asm_nozero(*(u64 *)dst, (u64 __user *)src,
-			       ret, "q", "", "=r", 16);
-		if (likely(!ret))
-			__get_user_asm_nozero(*(u64 *)(8 + (char *)dst),
-				       (u64 __user *)(8 + (char __user *)src),
-				       ret, "q", "", "=r", 8);
-		__uaccess_end();
-		return ret;
-	default:
-		return copy_user_generic(dst, (__force void *)src, size);
-	}
+	return copy_user_generic(dst, (__force void *)src, size);
 }
 
 static __always_inline __must_check unsigned long
 raw_copy_to_user(void __user *dst, const void *src, unsigned long size)
 {
-	int ret = 0;
-
-	if (!__builtin_constant_p(size))
-		return copy_user_generic((__force void *)dst, src, size);
-	switch (size) {
-	case 1:
-		__uaccess_begin();
-		__put_user_asm(*(u8 *)src, (u8 __user *)dst,
-			      ret, "b", "b", "iq", 1);
-		__uaccess_end();
-		return ret;
-	case 2:
-		__uaccess_begin();
-		__put_user_asm(*(u16 *)src, (u16 __user *)dst,
-			      ret, "w", "w", "ir", 2);
-		__uaccess_end();
-		return ret;
-	case 4:
-		__uaccess_begin();
-		__put_user_asm(*(u32 *)src, (u32 __user *)dst,
-			      ret, "l", "k", "ir", 4);
-		__uaccess_end();
-		return ret;
-	case 8:
-		__uaccess_begin();
-		__put_user_asm(*(u64 *)src, (u64 __user *)dst,
-			      ret, "q", "", "er", 8);
-		__uaccess_end();
-		return ret;
-	case 10:
-		__uaccess_begin();
-		__put_user_asm(*(u64 *)src, (u64 __user *)dst,
-			       ret, "q", "", "er", 10);
-		if (likely(!ret)) {
-			asm("":::"memory");
-			__put_user_asm(4[(u16 *)src], 4 + (u16 __user *)dst,
-				       ret, "w", "w", "ir", 2);
-		}
-		__uaccess_end();
-		return ret;
-	case 16:
-		__uaccess_begin();
-		__put_user_asm(*(u64 *)src, (u64 __user *)dst,
-			       ret, "q", "", "er", 16);
-		if (likely(!ret)) {
-			asm("":::"memory");
-			__put_user_asm(1[(u64 *)src], 1 + (u64 __user *)dst,
-				       ret, "q", "", "er", 8);
-		}
-		__uaccess_end();
-		return ret;
-	default:
-		return copy_user_generic((__force void *)dst, src, size);
-	}
+	return copy_user_generic((__force void *)dst, src, size);
 }
 
 static __always_inline __must_check


* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()
  2019-10-08  4:09                     ` Linus Torvalds
  2019-10-08  4:14                       ` Linus Torvalds
@ 2019-10-08  4:24                       ` Linus Torvalds
  2019-10-10 19:55                         ` Al Viro
  2019-10-08  4:57                       ` [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user() Al Viro
  2 siblings, 1 reply; 71+ messages in thread
From: Linus Torvalds @ 2019-10-08  4:24 UTC (permalink / raw)
  To: Al Viro; +Cc: Guenter Roeck, Linux Kernel Mailing List, linux-fsdevel

On Mon, Oct 7, 2019 at 9:09 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> Try the attached patch, and then count the number of "rorx"
> instructions in the kernel. Hint: not many. On my personal config,
> this triggers 15 times in the whole kernel build (not counting
> modules).

.. and four of them are in perf_callchain_user(), and are due to those
"__copy_from_user_nmi()" with either 4-byte or 8-byte copies.

It might as well just use __get_user() instead.

The point being that the silly code in the header files is just
pointless. We shouldn't do it.

            Linus


* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()
  2019-10-08  4:09                     ` Linus Torvalds
  2019-10-08  4:14                       ` Linus Torvalds
  2019-10-08  4:24                       ` Linus Torvalds
@ 2019-10-08  4:57                       ` Al Viro
  2019-10-08 13:14                         ` Greg KH
  2 siblings, 1 reply; 71+ messages in thread
From: Al Viro @ 2019-10-08  4:57 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Guenter Roeck, Linux Kernel Mailing List, linux-fsdevel

On Mon, Oct 07, 2019 at 09:09:14PM -0700, Linus Torvalds wrote:

> > 1) cross-architecture user_access_begin_dont_use(): on everything
> > except x86 it's empty, on x86 - __uaccess_begin_nospec().
> 
> No, just do a proper range check, and use user_access_begin()
> 
> Stop trying to optimize that range check away. It's a couple of fast
> instructions.
> 
> The only ones who don't want the range check are the actual kernel
> copy ones, but they don't want the user_access_begin() either.

Not at the first step.  Sure, in the end we want exactly that, and
we want it ASAP.  However, the main reason it grows into a tangled
mess that would be over the top for this cycle is the impact on an
arseload of places all over arch/*.

That way we can untangle those.  The initial segment that would
allow us to use raw_copy_to_user() cleanly in readdir.c et al.
could be done with provably zero impact on anything in arch/*
outside of arch/x86 usercopy-related code.

Moreover, it will be fairly small.  And after that the rest can
be done in any order, independent from each other.  I want to
kill __copy_... completely, and I believe we'll be able to do
just that in a cycle or two.

Once that is done, the helper disappears along with __copy_...().
And it will be documented, from the very beginning, as a temporary
kludge: don't use it anywhere else, no matter what.  For all the
couple of cycles it'll take.

I'm serious about getting rid of __copy_...() in that timeframe.
There's not that much left.

The reason I don't want to do a blanket search-and-replace turning
them all into copy_...() is simply that their use is a good indicator
of code in need of serious beating^Wamount of careful review.

And hell, we might end up doing just that on case-by-case basis.
Often enough we will, by what I'd seen there...

Again, this kludge is only a splitup aid - by the end of the series
it's gone.  All it allows is to keep it easier to review.

Note, BTW, that bits and pieces converting a given pointless use
of __copy_...() to copy_...() can be reordered freely at any point
of the sequence - I've already got several.  _Some_ of (5) will
be conversions a-la readdir.c one and that has to follow (4), but
most of it won't be like that.

> > void *copy_mount_options(const void __user * data)
> > {
> >         unsigned offs, size;
> >         char *copy;
> >
> >         if (!data)
> >                 return NULL;
> >
> >         copy = kmalloc(PAGE_SIZE, GFP_KERNEL);
> >         if (!copy)
> >                 return ERR_PTR(-ENOMEM);
> >
> >         offs = (unsigned long)untagged_addr(data) & (PAGE_SIZE - 1);
> >
> >         if (copy_from_user(copy, data, PAGE_SIZE - offs)) {
> >                 kfree(copy);
> >                 return ERR_PTR(-EFAULT);
> >         }
> >         if (offs) {
> >                 if (copy_from_user(copy + PAGE_SIZE - offs, data + PAGE_SIZE - offs, offs))
> >                         memset(copy + PAGE_SIZE - offs, 0, offs);
> >         }
> >         return copy;
> > }
> >
> > on the theory that any fault halfway through a page means a race with
> > munmap/mprotect/etc. and we can just pretend we'd lost the race entirely.
> > And to hell with exact_copy_from_user(), byte-by-byte copying, etc.
> 
> Looks reasonable.

	OK...  BTW, do you agree that the use of access_ok() in
drivers/tty/n_hdlc.c:n_hdlc_tty_read() is wrong?  It's used as an early
cutoff, so we don't bother waiting if the user has passed an obviously
bogus address.  copy_to_user() is used for the actual copying there...

	There are other places following that pattern and IMO they are all
wrong.  Another variety is a half-arsed filter trying to prevent warnings
from a too-large (and user-controllable) kmalloc() of the buffer we'll be
copying to.  Which is worth very little, since kmalloc() will scream and
fail well before the access_ok() limits trip.  Those need explicit capping
of the size, IMO...
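The explicit capping being asked for is a one-liner; a sketch of clamping a user-controlled length before allocating (the 64K ceiling is an arbitrary illustration, not a kernel constant):

```c
#include <assert.h>
#include <stddef.h>

#define READ_CAP 65536u		/* hypothetical per-call ceiling */

/* Clamp a user-supplied length so a later allocation sized from it can
 * never be absurdly large; unlike an access_ok() check, this bounds the
 * allocation argument itself rather than the user address range. */
static size_t clamp_read_len(size_t user_len)
{
	return user_len > READ_CAP ? READ_CAP : user_len;
}
```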


* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()
  2019-10-08  4:14                       ` Linus Torvalds
@ 2019-10-08  5:02                         ` Al Viro
  0 siblings, 0 replies; 71+ messages in thread
From: Al Viro @ 2019-10-08  5:02 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Guenter Roeck, Linux Kernel Mailing List, linux-fsdevel

On Mon, Oct 07, 2019 at 09:14:51PM -0700, Linus Torvalds wrote:
> On Mon, Oct 7, 2019 at 9:09 PM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > Try the attached patch, and then count the number of "rorx"
> > instructions in the kernel. Hint: not many. On my personal config,
> > this triggers 15 times in the whole kernel build (not counting
> > modules).
> 
> So here's a serious patch that doesn't just mark things for counting -
> it just removes the cases entirely.
> 
> Doesn't this look nice:
> 
>   2 files changed, 2 insertions(+), 133 deletions(-)
> 
> and it is one less thing to worry about when doing further cleanup.
> 
> Seriously, if any of those __copy_{to,from}_user() constant cases were
> a big deal, we can turn them into get_user/put_user calls. But only
> after they show up as an actual performance issue.

Makes sense.  I'm not arguing against doing that.  Moreover, I suspect
that other architectures will be similar, at least once the
sigframe-related code for given architecture is dealt with.  But that's
more of a "let's look at that later" thing (hopefully with maintainers
of architectures getting involved).


* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()
  2019-10-07 23:27   ` Guenter Roeck
@ 2019-10-08  6:28     ` Geert Uytterhoeven
  0 siblings, 0 replies; 71+ messages in thread
From: Geert Uytterhoeven @ 2019-10-08  6:28 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Linus Torvalds, Michael Cree, Linux Kernel Mailing List,
	Alexander Viro, linux-fsdevel, linux-arch,
	Bartlomiej Zolnierkiewicz

On Tue, Oct 8, 2019 at 1:30 AM Guenter Roeck <linux@roeck-us.net> wrote:
> m68k:
>
> c2p_iplan2.c:(.text+0x98): undefined reference to `c2p_unsupported'
>
> I don't know the status.

Fall-out from the (non)inline optimization.  Patch available:
https://lore.kernel.org/lkml/20190927094708.11563-1-geert@linux-m68k.org/

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds


* RE: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()
  2019-10-07 18:11                   ` Linus Torvalds
@ 2019-10-08  9:58                     ` David Laight
  0 siblings, 0 replies; 71+ messages in thread
From: David Laight @ 2019-10-08  9:58 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Al Viro, Guenter Roeck, Linux Kernel Mailing List, linux-fsdevel

From: Linus Torvalds <torvalds@linux-foundation.org>
> Sent: 07 October 2019 19:11
...
> I've been very close to just removing __get_user/__put_user several
> times, exactly because people do completely the wrong thing with them
> - not speeding code up, but making it unsafe and buggy.

They could do the very simple check that 'user_ptr+size < kernel_base'
rather than the full window check under the assumption that access_ok()
has been called and that the likely errors are just overruns.

> The new "user_access_begin/end()" model is much better, but it also
> has actual STATIC checking that there are no function calls etc inside
> the region, so it forces you to do the loop properly and tightly, and
> not the incorrect "I checked the range somewhere else, now I'm doing
> an unsafe copy".
> 
> And it actually speeds things up, unlike the access_ok() games.

I've code that does:
	if (!access_ok(...))
		return -EFAULT;
	...
	for (...) {
		if (__get_user(tmp_u64, user_ptr++))
			return -EFAULT;
		writeq(tmp_u64, io_ptr++);
	}
(Although the code is more complex because not all transfers are multiples of 8 bytes.)

With user_access_begin/end() I'd probably want to put the copy loop
inside a function (which will probably get inlined) to avoid convoluted
error processing.
So you end up with:
	if (!user_access_ok())
		return _EFAULT;
	user_access_begin();
	rval = do_copy_code(...);
	user_access_end();
	return rval;
Which, at the source level (at least) breaks your 'no function calls' rule.
The writeq() might also break it as well.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()
  2019-10-08  4:57                       ` [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user() Al Viro
@ 2019-10-08 13:14                         ` Greg KH
  2019-10-08 15:29                           ` Al Viro
  0 siblings, 1 reply; 71+ messages in thread
From: Greg KH @ 2019-10-08 13:14 UTC (permalink / raw)
  To: Al Viro
  Cc: Linus Torvalds, Guenter Roeck, Linux Kernel Mailing List, linux-fsdevel

On Tue, Oct 08, 2019 at 05:57:12AM +0100, Al Viro wrote:
> 
> 	OK...  BTW, do you agree that the use of access_ok() in
> drivers/tty/n_hdlc.c:n_hdlc_tty_read() is wrong?  It's used as an early
> cutoff, so we don't bother waiting if user has passed an obviously bogus
> address.  copy_to_user() is used for actual copying there...

Yes, it's wrong, and not needed.  I'll go rip it out unless you want to?

thanks,

greg k-h


* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()
  2019-10-08 13:14                         ` Greg KH
@ 2019-10-08 15:29                           ` Al Viro
  2019-10-08 15:38                             ` Greg KH
  0 siblings, 1 reply; 71+ messages in thread
From: Al Viro @ 2019-10-08 15:29 UTC (permalink / raw)
  To: Greg KH
  Cc: Linus Torvalds, Guenter Roeck, Linux Kernel Mailing List, linux-fsdevel

On Tue, Oct 08, 2019 at 03:14:16PM +0200, Greg KH wrote:
> On Tue, Oct 08, 2019 at 05:57:12AM +0100, Al Viro wrote:
> > 
> > 	OK...  BTW, do you agree that the use of access_ok() in
> > drivers/tty/n_hdlc.c:n_hdlc_tty_read() is wrong?  It's used as an early
> > cutoff, so we don't bother waiting if user has passed an obviously bogus
> > address.  copy_to_user() is used for actual copying there...
> 
> Yes, it's wrong, and not needed.  I'll go rip it out unless you want to?

I'll throw it into misc queue for now; it has no prereqs and nothing is going
to depend upon it.

While looking for more of the same pattern: usb_device_read().  Frankly,
usb_device_dump() calling conventions look ugly - it smells like it
would be much happier as seq_file.  Iterator would take some massage,
but that seems to be doable.  Anyway, that's a separate story...


* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()
  2019-10-08 15:29                           ` Al Viro
@ 2019-10-08 15:38                             ` Greg KH
  2019-10-08 17:06                               ` Al Viro
  0 siblings, 1 reply; 71+ messages in thread
From: Greg KH @ 2019-10-08 15:38 UTC (permalink / raw)
  To: Al Viro
  Cc: Linus Torvalds, Guenter Roeck, Linux Kernel Mailing List, linux-fsdevel

On Tue, Oct 08, 2019 at 04:29:00PM +0100, Al Viro wrote:
> On Tue, Oct 08, 2019 at 03:14:16PM +0200, Greg KH wrote:
> > On Tue, Oct 08, 2019 at 05:57:12AM +0100, Al Viro wrote:
> > > 
> > > 	OK...  BTW, do you agree that the use of access_ok() in
> > > drivers/tty/n_hdlc.c:n_hdlc_tty_read() is wrong?  It's used as an early
> > > cutoff, so we don't bother waiting if user has passed an obviously bogus
> > > address.  copy_to_user() is used for actual copying there...
> > 
> > Yes, it's wrong, and not needed.  I'll go rip it out unless you want to?
> 
> I'll throw it into misc queue for now; it has no prereqs and nothing is going
> to depend upon it.

Great, thanks.

> While looking for more of the same pattern: usb_device_read().  Frankly,
> usb_device_dump() calling conventions look ugly - it smells like it
> would be much happier as seq_file.  Iterator would take some massage,
> but that seems to be doable.  Anyway, that's a separate story...

That's just a debugfs file, and yes, it should be moved to seq_file.  I
think I tried it a long time ago, but given it's just a debugging thing,
I gave up as it wasn't worth it.

But yes, the access_ok() there also seems odd, and should be dropped.

thanks,

greg k-h


* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()
  2019-10-08 15:38                             ` Greg KH
@ 2019-10-08 17:06                               ` Al Viro
  0 siblings, 0 replies; 71+ messages in thread
From: Al Viro @ 2019-10-08 17:06 UTC (permalink / raw)
  To: Greg KH
  Cc: Linus Torvalds, Guenter Roeck, Linux Kernel Mailing List, linux-fsdevel

On Tue, Oct 08, 2019 at 05:38:31PM +0200, Greg KH wrote:
> On Tue, Oct 08, 2019 at 04:29:00PM +0100, Al Viro wrote:
> > On Tue, Oct 08, 2019 at 03:14:16PM +0200, Greg KH wrote:
> > > On Tue, Oct 08, 2019 at 05:57:12AM +0100, Al Viro wrote:
> > > > 
> > > > 	OK...  BTW, do you agree that the use of access_ok() in
> > > > drivers/tty/n_hdlc.c:n_hdlc_tty_read() is wrong?  It's used as an early
> > > > cutoff, so we don't bother waiting if user has passed an obviously bogus
> > > > address.  copy_to_user() is used for actual copying there...
> > > 
> > > Yes, it's wrong, and not needed.  I'll go rip it out unless you want to?
> > 
> > I'll throw it into misc queue for now; it has no prereqs and nothing is going
> > to depend upon it.
> 
> Great, thanks.
> 
> > While looking for more of the same pattern: usb_device_read().  Frankly,
> > usb_device_dump() calling conventions look ugly - it smells like it
> > would be much happier as seq_file.  Iterator would take some massage,
> > but that seems to be doable.  Anyway, that's a separate story...
> 
> That's just a debugfs file, and yes, it should be moved to seq_file.  I
> think I tried it a long time ago, but given it's just a debugging thing,
> I gave up as it wasn't worth it.
> 
> But yes, the access_ok() there also seems odd, and should be dropped.

I'm almost tempted to keep it there as a reminder/grep fodder ;-)

Seriously, though, it might be useful to have a way of marking the places
in need of gentle repair of retrocranial inversions _without_ attracting
the "checkpatch warning of the week" crowd...


* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()
  2019-10-07 18:26                 ` Linus Torvalds
  2019-10-07 18:36                   ` Tony Luck
  2019-10-08  3:29                   ` Al Viro
@ 2019-10-08 19:58                   ` Al Viro
  2019-10-08 20:16                     ` Al Viro
  2019-10-08 20:34                     ` Al Viro
  2 siblings, 2 replies; 71+ messages in thread
From: Al Viro @ 2019-10-08 19:58 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Guenter Roeck, Linux Kernel Mailing List, linux-fsdevel

On Mon, Oct 07, 2019 at 11:26:35AM -0700, Linus Torvalds wrote:

> The good news is that right now x86 is the only architecture that does
> that user_access_begin(), so we don't need to worry about anything
> else. Apparently the ARM people haven't had enough performance
> problems with the PAN bit for them to care.

Take a look at this:
static inline unsigned long raw_copy_from_user(void *to,
                const void __user *from, unsigned long n)
{
        unsigned long ret;
        if (__builtin_constant_p(n) && (n <= 8)) {
                ret = 1;

                switch (n) {
                case 1:
                        barrier_nospec();
                        __get_user_size(*(u8 *)to, from, 1, ret);
                        break;
                case 2:
                        barrier_nospec();
                        __get_user_size(*(u16 *)to, from, 2, ret);
                        break;
                case 4:
                        barrier_nospec();
                        __get_user_size(*(u32 *)to, from, 4, ret);
                        break;
                case 8:
                        barrier_nospec();
                        __get_user_size(*(u64 *)to, from, 8, ret);
                        break;
                }
                if (ret == 0)
                        return 0;
        }

        barrier_nospec();
        allow_read_from_user(from, n);
        ret = __copy_tofrom_user((__force void __user *)to, from, n);
        prevent_read_from_user(from, n);
        return ret;
}

That's powerpc.  And while the constant-sized bits are probably pretty
useless there as well, note the allow_read_from_user()/prevent_read_from_user()
part.  Looks suspiciously similar to user_access_begin()/user_access_end()...

The difference is, they have separate "for read" and "for write" primitives
and they want the range in their user_access_end() analogue.  Separating
the read and write isn't a problem for callers (we want them close to
the actual memory accesses).  Passing the range to user_access_end() just
might be tolerable, unless it makes you throw up...


* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()
  2019-10-08 19:58                   ` Al Viro
@ 2019-10-08 20:16                     ` Al Viro
  2019-10-08 20:34                     ` Al Viro
  1 sibling, 0 replies; 71+ messages in thread
From: Al Viro @ 2019-10-08 20:16 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Guenter Roeck, Linux Kernel Mailing List, linux-fsdevel

On Tue, Oct 08, 2019 at 08:58:58PM +0100, Al Viro wrote:

> That's powerpc.  And while the constant-sized bits are probably pretty
> useless there as well, note the allow_read_from_user()/prevent_read_from_user()
> part.  Looks suspiciously similar to user_access_begin()/user_access_end()...
> 
> The difference is, they have separate "for read" and "for write" primitives
> and they want the range in their user_access_end() analogue.  Separating
> the read and write isn't a problem for callers (we want them close to
> the actual memory accesses).  Passing the range to user_access_end() just
> might be tolerable, unless it makes you throw up...

	BTW, another related cleanup is futex_atomic_op_inuser() and
arch_futex_atomic_op_inuser().  In the former we have
        if (!access_ok(uaddr, sizeof(u32)))
                return -EFAULT;

        ret = arch_futex_atomic_op_inuser(op, oparg, &oldval, uaddr);
        if (ret)
                return ret;
and in the latter we've got STAC/CLAC pairs stuck into inlined bits
on x86.  As well as allow_write_to_user(uaddr, sizeof(*uaddr)) on
ppc...

I don't see anything in the x86 one that objtool would've barfed on if we pulled
STAC/CLAC out and turned access_ok() into user_access_begin(),
with matching user_access_end() right after the call of 
arch_futex_atomic_op_inuser().  Everything is inlined there and
no scary memory accesses would get into the scope (well, we do
have
        if (!ret)
                *oval = oldval;
in the very end of arch_futex_atomic_op_inuser() there, but oval
is the address of a local variable in the sole caller; if we run
with kernel stack on ring 3 page, we are deeply fucked *and*
wouldn't have survived that far into futex_atomic_op_inuser() anyway ;-)


* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()
  2019-10-08 19:58                   ` Al Viro
  2019-10-08 20:16                     ` Al Viro
@ 2019-10-08 20:34                     ` Al Viro
  1 sibling, 0 replies; 71+ messages in thread
From: Al Viro @ 2019-10-08 20:34 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Guenter Roeck, Linux Kernel Mailing List, linux-fsdevel

On Tue, Oct 08, 2019 at 08:58:58PM +0100, Al Viro wrote:

> The difference is, they have separate "for read" and "for write" primitives
> and they want the range in their user_access_end() analogue.  Separating
> the read and write isn't a problem for callers (we want them close to
> the actual memory accesses).  Passing the range to user_access_end() just
> might be tolerable, unless it makes you throw up...

NOTE: I'm *NOT* suggesting to bring back the VERIFY_READ/VERIFY_WRITE
argument to access_ok().  We'd gotten rid of it, and for a very good
reason (and decades overdue).

The main difference between access_ok() and user_access_begin() is that
the latter is right next to actual memory access, with user_access_end()
on the other side, also very close.  And most of those guys would be
concentrated in a few functions, where we bloody well know which
direction we are copying.

Even if we try and map ppc allow_..._to_user() on user_access_begin(),
access_ok() remains as it is (and I hope we'll get rid of the majority
of its callers in the process).


* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()
  2019-10-08  4:24                       ` Linus Torvalds
@ 2019-10-10 19:55                         ` Al Viro
  2019-10-10 22:12                           ` Linus Torvalds
  0 siblings, 1 reply; 71+ messages in thread
From: Al Viro @ 2019-10-10 19:55 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Guenter Roeck, Linux Kernel Mailing List, linux-fsdevel

On Mon, Oct 07, 2019 at 09:24:17PM -0700, Linus Torvalds wrote:
> On Mon, Oct 7, 2019 at 9:09 PM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > Try the attached patch, and then count the number of "rorx"
> > instructions in the kernel. Hint: not many. On my personal config,
> > this triggers 15 times in the whole kernel build (not counting
> > modules).
> 
> .. and four of them are in perf_callchain_user(), and are due to those
> "__copy_from_user_nmi()" with either 4-byte or 8-byte copies.
> 
> It might as well just use __get_user() instead.
> 
> The point being that the silly code in the header files is just
> pointless. We shouldn't do it.

FWIW, the one that looks the most potentially sensitive in that bunch is
arch/x86/kvm/paging_tmpl.h:388:         if (unlikely(__copy_from_user(&pte, ptep_user, sizeof(pte))))
in the bowels of KVM page fault handling.  I would be very surprised if
the rest would be detectable...

Anyway, another question your way: what do you think of try/catch approaches
to __get_user() blocks, like e.g. restore_sigcontext() is doing?

Should that be available outside of arch/*?  For that matter, would
it be a good idea to convert get_user_ex() users in arch/x86 to
unsafe_get_user()?


* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()
  2019-10-10 19:55                         ` Al Viro
@ 2019-10-10 22:12                           ` Linus Torvalds
  2019-10-11  0:11                             ` Al Viro
  0 siblings, 1 reply; 71+ messages in thread
From: Linus Torvalds @ 2019-10-10 22:12 UTC (permalink / raw)
  To: Al Viro; +Cc: Guenter Roeck, Linux Kernel Mailing List, linux-fsdevel

On Thu, Oct 10, 2019 at 12:55 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> Anyway, another question your way: what do you think of try/catch approaches
> to __get_user() blocks, like e.g. restore_sigcontext() is doing?

I'd rather have them converted to our unsafe_get/put_user() instead.

We don't generate great code for the "get" case (because of how gcc
doesn't allow us to mix "asm goto" and outputs), but I really despise
the x86-specific "{get,put}_user_ex()" machinery. It's not actually
doing a real try/catch at all, and will just keep taking faults if one
happens.

But I've not gotten around to rewriting those disgusting sequences to
the unsafe_get/put_user() model. I did look at it, and it requires
some changes exactly *because* the _ex() functions are broken and
continue, but also because the current code ends up also doing other
things inside the try/catch region that you're not supposed to do in a
user_access_begin/end() region.

> Should that be available outside of arch/*?  For that matter, would
> it be a good idea to convert get_user_ex() users in arch/x86 to
> unsafe_get_user()?

See above: yes, it would be a good idea to convert to
unsafe_get/put_user(), and no, we don't want to expose the horrid
*_ex() model to other architectures.

          Linus


* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()
  2019-10-10 22:12                           ` Linus Torvalds
@ 2019-10-11  0:11                             ` Al Viro
  2019-10-11  0:31                               ` Linus Torvalds
  0 siblings, 1 reply; 71+ messages in thread
From: Al Viro @ 2019-10-11  0:11 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Guenter Roeck, Linux Kernel Mailing List, linux-fsdevel

On Thu, Oct 10, 2019 at 03:12:49PM -0700, Linus Torvalds wrote:
> On Thu, Oct 10, 2019 at 12:55 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
> >
> > Anyway, another question your way: what do you think of try/catch approaches
> > to __get_user() blocks, like e.g. restore_sigcontext() is doing?
> 
> I'd rather have them converted to our unsafe_get/put_user() instead.
> 
> We don't generate great code for the "get" case (because of how gcc
> doesn't allow us to mix "asm goto" and outputs), but I really despise
> the x86-specific "{get,put}_user_ex()" machinery. It's not actually
> doing a real try/catch at all, and will just keep taking faults if one
> happens.
> 
> But I've not gotten around to rewriting those disgusting sequences to
> the unsafe_get/put_user() model. I did look at it, and it requires
> some changes exactly *because* the _ex() functions are broken and
> continue, but also because the current code ends up also doing other
> things inside the try/catch region that you're not supposed to do in a
> user_access_begin/end() region.

Hmm...  Which one was that?  AFAICS, we have
	do_sys_vm86: only get_user_ex()
	restore_sigcontext(): get_user_ex(), set_user_gs()
	ia32_restore_sigcontext(): get_user_ex()

So at least get_user_try/get_user_ex/get_user_catch should be killable.
The other side...
	save_v86_state(): put_user_ex()
	setup_sigcontext(): put_user_ex()
	__setup_rt_frame(): put_user_ex(), static_cpu_has()
	another one in __setup_rt_frame(): put_user_ex()
	x32_setup_rt_frame(): put_user_ex()
	ia32_setup_sigcontext(): put_user_ex()
	ia32_setup_frame(): put_user_ex()
	another one in ia32_setup_frame(): put_user_ex(), static_cpu_has()

IDGI...  Is static_cpu_has() not allowed in there?  Looks like it's all inlines
and doesn't do any potentially risky memory accesses...  What am I missing?

As for the try/catch model...  How about
	if (!user_access_begin())
		sod off
	...
	unsafe_get_user(..., l);
	...
	unsafe_get_user_nojump();
	...
	unsafe_get_user_nojump();
	...
	if (user_access_did_fail())
		goto l;

	user_access_end()
	...
	return 0;
l:
	...
	user_access_end()
	return -EFAULT;

making it clear that we are delaying the check for failures until it's
more convenient.  And *not* trying to trick C parser into enforcing
anything - let objtool do it and to hell with do { and } while (0) in
magic macros.  Could be mixed with the normal unsafe_..._user() without
any problems, AFAICS...


* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()
  2019-10-11  0:11                             ` Al Viro
@ 2019-10-11  0:31                               ` Linus Torvalds
  2019-10-13 18:13                                 ` Al Viro
  0 siblings, 1 reply; 71+ messages in thread
From: Linus Torvalds @ 2019-10-11  0:31 UTC (permalink / raw)
  To: Al Viro; +Cc: Guenter Roeck, Linux Kernel Mailing List, linux-fsdevel

[-- Attachment #1: Type: text/plain, Size: 3071 bytes --]

On Thu, Oct 10, 2019 at 5:11 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> On Thu, Oct 10, 2019 at 03:12:49PM -0700, Linus Torvalds wrote:
>
> > But I've not gotten around to rewriting those disgusting sequences to
> > the unsafe_get/put_user() model. I did look at it, and it requires
> > some changes exactly *because* the _ex() functions are broken and
> > continue, but also because the current code ends up also doing other
> > things inside the try/catch region that you're not supposed to do in a
> > user_access_begin/end() region.
>
> Hmm...  Which one was that?  AFAICS, we have
>         do_sys_vm86: only get_user_ex()
>         restore_sigcontext(): get_user_ex(), set_user_gs()
>         ia32_restore_sigcontext(): get_user_ex()

Try this patch.

It works fine (well, it worked fine the last time I tried this; I
might have screwed something up just now: I re-created the patch since
I hadn't saved it).

It's nice and clean, and does

 1 file changed, 9 insertions(+), 91 deletions(-)

by just deleting all the nasty *_ex() macros entirely, replacing them
with unsafe_get/put_user() calls.

And now those try/catch regions actually work like try/catch regions,
and a fault branches to the catch.

BUT.

It does change semantics, and you get warnings like

  arch/x86/ia32/ia32_signal.c: In function ‘ia32_restore_sigcontext’:
  arch/x86/ia32/ia32_signal.c:114:9: warning: ‘buf’ may be used
uninitialized in this function [-Wmaybe-uninitialized]
    114 |  err |= fpu__restore_sig(buf, 1);
        |         ^~~~~~~~~~~~~~~~~~~~~~~~
  arch/x86/ia32/ia32_signal.c:64:27: warning: ‘ds’ may be used
uninitialized in this function [-Wmaybe-uninitialized]
     64 |  unsigned int pre = (seg) | 3;  \
        |                           ^
  arch/x86/ia32/ia32_signal.c:74:18: note: ‘ds’ was declared here
...
  arch/x86/kernel/signal.c: In function ‘restore_sigcontext’:
  arch/x86/kernel/signal.c:152:9: warning: ‘buf’ may be used
uninitialized in this function [-Wmaybe-uninitialized]
    152 |  err |= fpu__restore_sig(buf, IS_ENABLED(CONFIG_X86_32));
        |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

because it's true: those things really may not be initialized, because
the catch thing could have jumped out.

So the code actually needs to properly return the error early, or
initialize the segments that didn't get loaded to 0, or something.

And when I posted that, Luto said "just get rid of the get_user_ex()
entirely, instead of changing semantics of the existing ones to be
sane".

Which is probably right. There aren't that many.

I *thought* there were also cases of us doing some questionable things
inside the get_user_try sections, but those seem to have gotten fixed
already independently, so it's really just the "make try/catch really
try/catch" change that needs some editing of our current broken stuff
that depends on it not actually *catching* exceptions, but on just
continuing on to the next one.

                Linus

[-- Attachment #2: patch.diff --]
[-- Type: text/x-patch, Size: 5469 bytes --]

 arch/x86/include/asm/uaccess.h | 100 ++++-------------------------------------
 1 file changed, 9 insertions(+), 91 deletions(-)

diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index 61d93f062a36..e87d8911dc53 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -193,23 +193,12 @@ __typeof__(__builtin_choose_expr(sizeof(x) > sizeof(0UL), 0ULL, 0UL))
 		     : : "A" (x), "r" (addr)			\
 		     : : label)
 
-#define __put_user_asm_ex_u64(x, addr)					\
-	asm volatile("\n"						\
-		     "1:	movl %%eax,0(%1)\n"			\
-		     "2:	movl %%edx,4(%1)\n"			\
-		     "3:"						\
-		     _ASM_EXTABLE_EX(1b, 2b)				\
-		     _ASM_EXTABLE_EX(2b, 3b)				\
-		     : : "A" (x), "r" (addr))
-
 #define __put_user_x8(x, ptr, __ret_pu)				\
 	asm volatile("call __put_user_8" : "=a" (__ret_pu)	\
 		     : "A" ((typeof(*(ptr)))(x)), "c" (ptr) : "ebx")
 #else
 #define __put_user_goto_u64(x, ptr, label) \
 	__put_user_goto(x, ptr, "q", "", "er", label)
-#define __put_user_asm_ex_u64(x, addr)	\
-	__put_user_asm_ex(x, addr, "q", "", "er")
 #define __put_user_x8(x, ptr, __ret_pu) __put_user_x(8, x, ptr, __ret_pu)
 #endif
 
@@ -289,31 +278,6 @@ do {									\
 	}								\
 } while (0)
 
-/*
- * This doesn't do __uaccess_begin/end - the exception handling
- * around it must do that.
- */
-#define __put_user_size_ex(x, ptr, size)				\
-do {									\
-	__chk_user_ptr(ptr);						\
-	switch (size) {							\
-	case 1:								\
-		__put_user_asm_ex(x, ptr, "b", "b", "iq");		\
-		break;							\
-	case 2:								\
-		__put_user_asm_ex(x, ptr, "w", "w", "ir");		\
-		break;							\
-	case 4:								\
-		__put_user_asm_ex(x, ptr, "l", "k", "ir");		\
-		break;							\
-	case 8:								\
-		__put_user_asm_ex_u64((__typeof__(*ptr))(x), ptr);	\
-		break;							\
-	default:							\
-		__put_user_bad();					\
-	}								\
-} while (0)
-
 #ifdef CONFIG_X86_32
 #define __get_user_asm_u64(x, ptr, retval, errret)			\
 ({									\
@@ -334,13 +298,9 @@ do {									\
 		     : "m" (__m(__ptr)), "m" __m(((u32 __user *)(__ptr)) + 1),	\
 		       "i" (errret), "0" (retval));			\
 })
-
-#define __get_user_asm_ex_u64(x, ptr)			(x) = __get_user_bad()
 #else
 #define __get_user_asm_u64(x, ptr, retval, errret) \
 	 __get_user_asm(x, ptr, retval, "q", "", "=r", errret)
-#define __get_user_asm_ex_u64(x, ptr) \
-	 __get_user_asm_ex(x, ptr, "q", "", "=r")
 #endif
 
 #define __get_user_size(x, ptr, size, retval, errret)			\
@@ -390,41 +350,6 @@ do {									\
 		     : "=r" (err), ltype(x)				\
 		     : "m" (__m(addr)), "i" (errret), "0" (err))
 
-/*
- * This doesn't do __uaccess_begin/end - the exception handling
- * around it must do that.
- */
-#define __get_user_size_ex(x, ptr, size)				\
-do {									\
-	__chk_user_ptr(ptr);						\
-	switch (size) {							\
-	case 1:								\
-		__get_user_asm_ex(x, ptr, "b", "b", "=q");		\
-		break;							\
-	case 2:								\
-		__get_user_asm_ex(x, ptr, "w", "w", "=r");		\
-		break;							\
-	case 4:								\
-		__get_user_asm_ex(x, ptr, "l", "k", "=r");		\
-		break;							\
-	case 8:								\
-		__get_user_asm_ex_u64(x, ptr);				\
-		break;							\
-	default:							\
-		(x) = __get_user_bad();					\
-	}								\
-} while (0)
-
-#define __get_user_asm_ex(x, addr, itype, rtype, ltype)			\
-	asm volatile("1:	mov"itype" %1,%"rtype"0\n"		\
-		     "2:\n"						\
-		     ".section .fixup,\"ax\"\n"				\
-                     "3:xor"itype" %"rtype"0,%"rtype"0\n"		\
-		     "  jmp 2b\n"					\
-		     ".previous\n"					\
-		     _ASM_EXTABLE_EX(1b, 3b)				\
-		     : ltype(x) : "m" (__m(addr)))
-
 #define __put_user_nocheck(x, ptr, size)			\
 ({								\
 	__label__ __pu_label;					\
@@ -480,27 +405,25 @@ struct __large_struct { unsigned long buf[100]; };
 	retval = __put_user_failed(x, addr, itype, rtype, ltype, errret);	\
 } while (0)
 
-#define __put_user_asm_ex(x, addr, itype, rtype, ltype)			\
-	asm volatile("1:	mov"itype" %"rtype"0,%1\n"		\
-		     "2:\n"						\
-		     _ASM_EXTABLE_EX(1b, 2b)				\
-		     : : ltype(x), "m" (__m(addr)))
-
 /*
  * uaccess_try and catch
  */
 #define uaccess_try	do {						\
-	current->thread.uaccess_err = 0;				\
+	__label__ __uaccess_catch_efault;				\
 	__uaccess_begin();						\
 	barrier();
 
 #define uaccess_try_nospec do {						\
-	current->thread.uaccess_err = 0;				\
+	__label__ __uaccess_catch_efault;				\
 	__uaccess_begin_nospec();					\
 
 #define uaccess_catch(err)						\
 	__uaccess_end();						\
-	(err) |= (current->thread.uaccess_err ? -EFAULT : 0);		\
+	(err) = 0;							\
+	break;								\
+__uaccess_catch_efault:							\
+	__uaccess_end();						\
+	(err) = -EFAULT;						\
 } while (0)
 
 /**
@@ -562,17 +485,12 @@ struct __large_struct { unsigned long buf[100]; };
 #define get_user_try		uaccess_try_nospec
 #define get_user_catch(err)	uaccess_catch(err)
 
-#define get_user_ex(x, ptr)	do {					\
-	unsigned long __gue_val;					\
-	__get_user_size_ex((__gue_val), (ptr), (sizeof(*(ptr))));	\
-	(x) = (__force __typeof__(*(ptr)))__gue_val;			\
-} while (0)
+#define get_user_ex(x, ptr)	unsafe_get_user(x, ptr, __uaccess_catch_efault)
 
 #define put_user_try		uaccess_try
 #define put_user_catch(err)	uaccess_catch(err)
 
-#define put_user_ex(x, ptr)						\
-	__put_user_size_ex((__typeof__(*(ptr)))(x), (ptr), sizeof(*(ptr)))
+#define put_user_ex(x, ptr)	unsafe_put_user(x, ptr, __uaccess_catch_efault)
 
 extern unsigned long
 copy_from_user_nmi(void *to, const void __user *from, unsigned long n);


* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()
  2019-10-11  0:31                               ` Linus Torvalds
@ 2019-10-13 18:13                                 ` Al Viro
  2019-10-13 18:43                                   ` Linus Torvalds
  0 siblings, 1 reply; 71+ messages in thread
From: Al Viro @ 2019-10-13 18:13 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Guenter Roeck, Linux Kernel Mailing List, linux-fsdevel

On Thu, Oct 10, 2019 at 05:31:13PM -0700, Linus Torvalds wrote:

> So the code actually needs to properly return the error early, or
> initialize the segments that didn't get loaded to 0, or something.
> 
> And when I posted that, Luto said "just get rid of the get_user_ex()
> entirely, instead of changing semantics of the existing ones to be
> sane.
> 
> Which is probably right. There aren't that many.
> 
> I *thought* there were also cases of us doing some questionably things
> inside the get_user_try sections, but those seem to have gotten fixed
> already independently, so it's really just the "make try/catch really
> try/catch" change that needs some editing of our current broken stuff
> that depends on it not actually *catching* exceptions, but on just
> continuing on to the next one.

Umm...  TBH, I wonder if we would be better off if restore_sigcontext()
(i.e. sigreturn()/rt_sigreturn()) would flat-out copy_from_user() the
entire[*] struct sigcontext into a local variable and then copied fields
to pt_regs...  The thing is small enough for not blowing the stack (256
bytes max. and it's on a shallow stack) and big enough to make "fancy
memcpy + let the compiler think how to combine in-kernel copies"
potentially better than hardwired sequence of 64bit loads/stores...

[*] OK, sans ->reserved part in the very end on 64bit.  192 bytes to
copy.

Same for do_sys_vm86(), perhaps - we want regs/flags/cpu_type and
screen_bitmap there, i.e. the beginning of struct vm86plus_struct
and of struct vm86_struct...  24*32bit.  IOW, 96-byte memcpy +
gcc-visible field-by-field copying vs. hardwired sequence of
32bit loads (with some 16bit ones thrown in, for extra fun) and
compiler told not to reorder anything.

And these (32bit and 64bit restore_sigcontext() and do_sys_vm86())
are the only get_user_ex() users anywhere...

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()
  2019-10-13 18:13                                 ` Al Viro
@ 2019-10-13 18:43                                   ` Linus Torvalds
  2019-10-13 19:10                                     ` Al Viro
  0 siblings, 1 reply; 71+ messages in thread
From: Linus Torvalds @ 2019-10-13 18:43 UTC (permalink / raw)
  To: Al Viro; +Cc: Guenter Roeck, Linux Kernel Mailing List, linux-fsdevel

On Sun, Oct 13, 2019 at 11:13 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> Umm...  TBH, I wonder if we would be better off if restore_sigcontext()
> (i.e. sigreturn()/rt_sigreturn()) would flat-out copy_from_user() the
> entire[*] struct sigcontext into a local variable and then copied fields
> to pt_regs...

Probably ok. We've generally tried to avoid state that big on the
stack, but you're right that it's shallow.

> Same for do_sys_vm86(), perhaps.
>
> And these (32bit and 64bit restore_sigcontext() and do_sys_vm86())
> are the only get_user_ex() users anywhere...

Yeah, that sounds like a solid strategy for getting rid of them.

Particularly since we can't really make get_user_ex() generate
particularly good code (at least for now).

Now, put_user_ex() is a different thing - converting it to
unsafe_put_user() actually does make it generate very good code - much
better than copying data twice.

               Linus

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()
  2019-10-13 18:43                                   ` Linus Torvalds
@ 2019-10-13 19:10                                     ` Al Viro
  2019-10-13 19:22                                       ` Linus Torvalds
  0 siblings, 1 reply; 71+ messages in thread
From: Al Viro @ 2019-10-13 19:10 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Guenter Roeck, Linux Kernel Mailing List, linux-fsdevel

On Sun, Oct 13, 2019 at 11:43:57AM -0700, Linus Torvalds wrote:
> On Sun, Oct 13, 2019 at 11:13 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
> >
> > Umm...  TBH, I wonder if we would be better off if restore_sigcontext()
> > (i.e. sigreturn()/rt_sigreturn()) would flat-out copy_from_user() the
> > entire[*] struct sigcontext into a local variable and then copied fields
> > to pt_regs...
> 
> Probably ok. We've generally tried to avoid state that big on the
> stack, but you're right that it's shallow.
>
> > Same for do_sys_vm86(), perhaps.
> >
> > And these (32bit and 64bit restore_sigcontext() and do_sys_vm86())
> > are the only get_user_ex() users anywhere...
> 
> Yeah, that sounds like a solid strategy for getting rid of them.
> 
> Particularly since we can't really make get_user_ex() generate
> particularly good code (at least for now).
> 
> Now, put_user_ex() is a different thing - converting it to
> unsafe_put_user() actually does make it generate very good code - much
> better than copying data twice.

No arguments re put_user_ex side of things...  Below is a completely
untested patch for get_user_ex elimination (it seems to build, but that's
it); in any case, I would really like to see comments from x86 folks
before it goes anywhere.

diff --git a/arch/x86/ia32/ia32_signal.c b/arch/x86/ia32/ia32_signal.c
index 1cee10091b9f..28a32ccc32de 100644
--- a/arch/x86/ia32/ia32_signal.c
+++ b/arch/x86/ia32/ia32_signal.c
@@ -46,59 +46,38 @@
 #define get_user_seg(seg)	({ unsigned int v; savesegment(seg, v); v; })
 #define set_user_seg(seg, v)	loadsegment_##seg(v)
 
-#define COPY(x)			{		\
-	get_user_ex(regs->x, &sc->x);		\
-}
-
-#define GET_SEG(seg)		({			\
-	unsigned short tmp;				\
-	get_user_ex(tmp, &sc->seg);			\
-	tmp;						\
-})
+#define COPY(x) regs->x = sc.x
 
-#define COPY_SEG_CPL3(seg)	do {			\
-	regs->seg = GET_SEG(seg) | 3;			\
-} while (0)
+#define COPY_SEG_CPL3(seg) regs->seg = sc.seg | 3
 
 #define RELOAD_SEG(seg)		{		\
-	unsigned int pre = (seg) | 3;		\
+	unsigned int pre = sc.seg | 3;		\
 	unsigned int cur = get_user_seg(seg);	\
 	if (pre != cur)				\
 		set_user_seg(seg, pre);		\
 }
 
 static int ia32_restore_sigcontext(struct pt_regs *regs,
-				   struct sigcontext_32 __user *sc)
+				   struct sigcontext_32 __user *usc)
 {
-	unsigned int tmpflags, err = 0;
-	u16 gs, fs, es, ds;
-	void __user *buf;
-	u32 tmp;
+	struct sigcontext_32 sc;
 
 	/* Always make any pending restarted system calls return -EINTR */
 	current->restart_block.fn = do_no_restart_syscall;
 
-	get_user_try {
-		gs = GET_SEG(gs);
-		fs = GET_SEG(fs);
-		ds = GET_SEG(ds);
-		es = GET_SEG(es);
-
-		COPY(di); COPY(si); COPY(bp); COPY(sp); COPY(bx);
-		COPY(dx); COPY(cx); COPY(ip); COPY(ax);
-		/* Don't touch extended registers */
+	if (unlikely(__copy_from_user(&sc, usc, sizeof(sc))))
+		goto Efault;
 
-		COPY_SEG_CPL3(cs);
-		COPY_SEG_CPL3(ss);
+	COPY(di); COPY(si); COPY(bp); COPY(sp); COPY(bx);
+	COPY(dx); COPY(cx); COPY(ip); COPY(ax);
+	/* Don't touch extended registers */
 
-		get_user_ex(tmpflags, &sc->flags);
-		regs->flags = (regs->flags & ~FIX_EFLAGS) | (tmpflags & FIX_EFLAGS);
-		/* disable syscall checks */
-		regs->orig_ax = -1;
+	COPY_SEG_CPL3(cs);
+	COPY_SEG_CPL3(ss);
 
-		get_user_ex(tmp, &sc->fpstate);
-		buf = compat_ptr(tmp);
-	} get_user_catch(err);
+	regs->flags = (regs->flags & ~FIX_EFLAGS) | (sc.flags & FIX_EFLAGS);
+	/* disable syscall checks */
+	regs->orig_ax = -1;
 
 	/*
 	 * Reload fs and gs if they have changed in the signal
@@ -111,11 +90,15 @@ static int ia32_restore_sigcontext(struct pt_regs *regs,
 	RELOAD_SEG(ds);
 	RELOAD_SEG(es);
 
-	err |= fpu__restore_sig(buf, 1);
+	if (unlikely(fpu__restore_sig(compat_ptr(sc.fpstate), 1)))
+		goto Efault;
 
 	force_iret();
+	return 0;
 
-	return err;
+Efault:
+	force_iret();
+	return -EFAULT;
 }
 
 asmlinkage long sys32_sigreturn(void)
diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index 61d93f062a36..ac81f06f8358 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -335,12 +335,9 @@ do {									\
 		       "i" (errret), "0" (retval));			\
 })
 
-#define __get_user_asm_ex_u64(x, ptr)			(x) = __get_user_bad()
 #else
 #define __get_user_asm_u64(x, ptr, retval, errret) \
 	 __get_user_asm(x, ptr, retval, "q", "", "=r", errret)
-#define __get_user_asm_ex_u64(x, ptr) \
-	 __get_user_asm_ex(x, ptr, "q", "", "=r")
 #endif
 
 #define __get_user_size(x, ptr, size, retval, errret)			\
@@ -390,41 +387,6 @@ do {									\
 		     : "=r" (err), ltype(x)				\
 		     : "m" (__m(addr)), "i" (errret), "0" (err))
 
-/*
- * This doesn't do __uaccess_begin/end - the exception handling
- * around it must do that.
- */
-#define __get_user_size_ex(x, ptr, size)				\
-do {									\
-	__chk_user_ptr(ptr);						\
-	switch (size) {							\
-	case 1:								\
-		__get_user_asm_ex(x, ptr, "b", "b", "=q");		\
-		break;							\
-	case 2:								\
-		__get_user_asm_ex(x, ptr, "w", "w", "=r");		\
-		break;							\
-	case 4:								\
-		__get_user_asm_ex(x, ptr, "l", "k", "=r");		\
-		break;							\
-	case 8:								\
-		__get_user_asm_ex_u64(x, ptr);				\
-		break;							\
-	default:							\
-		(x) = __get_user_bad();					\
-	}								\
-} while (0)
-
-#define __get_user_asm_ex(x, addr, itype, rtype, ltype)			\
-	asm volatile("1:	mov"itype" %1,%"rtype"0\n"		\
-		     "2:\n"						\
-		     ".section .fixup,\"ax\"\n"				\
-                     "3:xor"itype" %"rtype"0,%"rtype"0\n"		\
-		     "  jmp 2b\n"					\
-		     ".previous\n"					\
-		     _ASM_EXTABLE_EX(1b, 3b)				\
-		     : ltype(x) : "m" (__m(addr)))
-
 #define __put_user_nocheck(x, ptr, size)			\
 ({								\
 	__label__ __pu_label;					\
@@ -552,22 +514,6 @@ struct __large_struct { unsigned long buf[100]; };
 #define __put_user(x, ptr)						\
 	__put_user_nocheck((__typeof__(*(ptr)))(x), (ptr), sizeof(*(ptr)))
 
-/*
- * {get|put}_user_try and catch
- *
- * get_user_try {
- *	get_user_ex(...);
- * } get_user_catch(err)
- */
-#define get_user_try		uaccess_try_nospec
-#define get_user_catch(err)	uaccess_catch(err)
-
-#define get_user_ex(x, ptr)	do {					\
-	unsigned long __gue_val;					\
-	__get_user_size_ex((__gue_val), (ptr), (sizeof(*(ptr))));	\
-	(x) = (__force __typeof__(*(ptr)))__gue_val;			\
-} while (0)
-
 #define put_user_try		uaccess_try
 #define put_user_catch(err)	uaccess_catch(err)
 
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index 8eb7193e158d..301d34b256c6 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -47,23 +47,9 @@
 #include <asm/sigframe.h>
 #include <asm/signal.h>
 
-#define COPY(x)			do {			\
-	get_user_ex(regs->x, &sc->x);			\
-} while (0)
-
-#define GET_SEG(seg)		({			\
-	unsigned short tmp;				\
-	get_user_ex(tmp, &sc->seg);			\
-	tmp;						\
-})
-
-#define COPY_SEG(seg)		do {			\
-	regs->seg = GET_SEG(seg);			\
-} while (0)
-
-#define COPY_SEG_CPL3(seg)	do {			\
-	regs->seg = GET_SEG(seg) | 3;			\
-} while (0)
+#define COPY(x)	regs->x = sc.x
+#define COPY_SEG(seg) regs->seg = sc.seg
+#define COPY_SEG_CPL3(seg) regs->seg = sc.seg | 3
 
 #ifdef CONFIG_X86_64
 /*
@@ -95,50 +81,53 @@ static void force_valid_ss(struct pt_regs *regs)
 #endif
 
 static int restore_sigcontext(struct pt_regs *regs,
-			      struct sigcontext __user *sc,
+			      struct sigcontext __user *usc,
 			      unsigned long uc_flags)
 {
-	unsigned long buf_val;
 	void __user *buf;
-	unsigned int tmpflags;
-	unsigned int err = 0;
+	struct sigcontext sc;
+	enum {
+#ifdef CONFIG_X86_32
+		To_copy = sizeof(struct sigcontext),
+#else
+		To_copy = offsetof(struct sigcontext, reserved1),
+#endif
+	};
 
 	/* Always make any pending restarted system calls return -EINTR */
 	current->restart_block.fn = do_no_restart_syscall;
 
-	get_user_try {
+	if (unlikely(__copy_from_user(&sc, usc, To_copy)))
+		goto Efault;
 
 #ifdef CONFIG_X86_32
-		set_user_gs(regs, GET_SEG(gs));
-		COPY_SEG(fs);
-		COPY_SEG(es);
-		COPY_SEG(ds);
+	set_user_gs(regs, sc.gs);
+	COPY_SEG(fs);
+	COPY_SEG(es);
+	COPY_SEG(ds);
 #endif /* CONFIG_X86_32 */
 
-		COPY(di); COPY(si); COPY(bp); COPY(sp); COPY(bx);
-		COPY(dx); COPY(cx); COPY(ip); COPY(ax);
+	COPY(di); COPY(si); COPY(bp); COPY(sp); COPY(bx);
+	COPY(dx); COPY(cx); COPY(ip); COPY(ax);
 
 #ifdef CONFIG_X86_64
-		COPY(r8);
-		COPY(r9);
-		COPY(r10);
-		COPY(r11);
-		COPY(r12);
-		COPY(r13);
-		COPY(r14);
-		COPY(r15);
+	COPY(r8);
+	COPY(r9);
+	COPY(r10);
+	COPY(r11);
+	COPY(r12);
+	COPY(r13);
+	COPY(r14);
+	COPY(r15);
 #endif /* CONFIG_X86_64 */
 
-		COPY_SEG_CPL3(cs);
-		COPY_SEG_CPL3(ss);
+	COPY_SEG_CPL3(cs);
+	COPY_SEG_CPL3(ss);
 
-		get_user_ex(tmpflags, &sc->flags);
-		regs->flags = (regs->flags & ~FIX_EFLAGS) | (tmpflags & FIX_EFLAGS);
-		regs->orig_ax = -1;		/* disable syscall checks */
+	regs->flags = (regs->flags & ~FIX_EFLAGS) | (sc.flags & FIX_EFLAGS);
+	regs->orig_ax = -1;		/* disable syscall checks */
 
-		get_user_ex(buf_val, &sc->fpstate);
-		buf = (void __user *)buf_val;
-	} get_user_catch(err);
+	buf = (void __user *)sc.fpstate;
 
 #ifdef CONFIG_X86_64
 	/*
@@ -149,11 +138,14 @@ static int restore_sigcontext(struct pt_regs *regs,
 		force_valid_ss(regs);
 #endif
 
-	err |= fpu__restore_sig(buf, IS_ENABLED(CONFIG_X86_32));
-
+	if (unlikely(fpu__restore_sig(buf, IS_ENABLED(CONFIG_X86_32))))
+		goto Efault;
 	force_iret();
+	return 0;
 
-	return err;
+Efault:
+	force_iret();
+	return -EFAULT;
 }
 
 int setup_sigcontext(struct sigcontext __user *sc, void __user *fpstate,
diff --git a/arch/x86/kernel/vm86_32.c b/arch/x86/kernel/vm86_32.c
index a76c12b38e92..2b5183f8eb48 100644
--- a/arch/x86/kernel/vm86_32.c
+++ b/arch/x86/kernel/vm86_32.c
@@ -242,6 +242,7 @@ static long do_sys_vm86(struct vm86plus_struct __user *user_vm86, bool plus)
 	struct vm86 *vm86 = tsk->thread.vm86;
 	struct kernel_vm86_regs vm86regs;
 	struct pt_regs *regs = current_pt_regs();
+	struct vm86_struct v;
 	unsigned long err = 0;
 
 	err = security_mmap_addr(0);
@@ -283,34 +284,32 @@ static long do_sys_vm86(struct vm86plus_struct __user *user_vm86, bool plus)
 		       sizeof(struct vm86plus_struct)))
 		return -EFAULT;
 
+	if (unlikely(__copy_from_user(&v, user_vm86,
+			offsetof(struct vm86_struct, int_revectored))))
+		return -EFAULT;
+
 	memset(&vm86regs, 0, sizeof(vm86regs));
-	get_user_try {
-		unsigned short seg;
-		get_user_ex(vm86regs.pt.bx, &user_vm86->regs.ebx);
-		get_user_ex(vm86regs.pt.cx, &user_vm86->regs.ecx);
-		get_user_ex(vm86regs.pt.dx, &user_vm86->regs.edx);
-		get_user_ex(vm86regs.pt.si, &user_vm86->regs.esi);
-		get_user_ex(vm86regs.pt.di, &user_vm86->regs.edi);
-		get_user_ex(vm86regs.pt.bp, &user_vm86->regs.ebp);
-		get_user_ex(vm86regs.pt.ax, &user_vm86->regs.eax);
-		get_user_ex(vm86regs.pt.ip, &user_vm86->regs.eip);
-		get_user_ex(seg, &user_vm86->regs.cs);
-		vm86regs.pt.cs = seg;
-		get_user_ex(vm86regs.pt.flags, &user_vm86->regs.eflags);
-		get_user_ex(vm86regs.pt.sp, &user_vm86->regs.esp);
-		get_user_ex(seg, &user_vm86->regs.ss);
-		vm86regs.pt.ss = seg;
-		get_user_ex(vm86regs.es, &user_vm86->regs.es);
-		get_user_ex(vm86regs.ds, &user_vm86->regs.ds);
-		get_user_ex(vm86regs.fs, &user_vm86->regs.fs);
-		get_user_ex(vm86regs.gs, &user_vm86->regs.gs);
-
-		get_user_ex(vm86->flags, &user_vm86->flags);
-		get_user_ex(vm86->screen_bitmap, &user_vm86->screen_bitmap);
-		get_user_ex(vm86->cpu_type, &user_vm86->cpu_type);
-	} get_user_catch(err);
-	if (err)
-		return err;
+
+	vm86regs.pt.bx = v.regs.ebx;
+	vm86regs.pt.cx = v.regs.ecx;
+	vm86regs.pt.dx = v.regs.edx;
+	vm86regs.pt.si = v.regs.esi;
+	vm86regs.pt.di = v.regs.edi;
+	vm86regs.pt.bp = v.regs.ebp;
+	vm86regs.pt.ax = v.regs.eax;
+	vm86regs.pt.ip = v.regs.eip;
+	vm86regs.pt.cs = v.regs.cs;
+	vm86regs.pt.flags = v.regs.eflags;
+	vm86regs.pt.sp = v.regs.esp;
+	vm86regs.pt.ss = v.regs.ss;
+	vm86regs.es = v.regs.es;
+	vm86regs.ds = v.regs.ds;
+	vm86regs.fs = v.regs.fs;
+	vm86regs.gs = v.regs.gs;
+
+	vm86->flags = v.flags;
+	vm86->screen_bitmap = v.screen_bitmap;
+	vm86->cpu_type = v.cpu_type;
 
 	if (copy_from_user(&vm86->int_revectored,
 			   &user_vm86->int_revectored,

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()
  2019-10-13 19:10                                     ` Al Viro
@ 2019-10-13 19:22                                       ` Linus Torvalds
  2019-10-13 19:59                                         ` Al Viro
  2019-10-16 20:25                                         ` [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user() Al Viro
  0 siblings, 2 replies; 71+ messages in thread
From: Linus Torvalds @ 2019-10-13 19:22 UTC (permalink / raw)
  To: Al Viro; +Cc: Guenter Roeck, Linux Kernel Mailing List, linux-fsdevel

On Sun, Oct 13, 2019 at 12:10 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> No arguments re put_user_ex side of things...  Below is a completely
> untested patch for get_user_ex elimination (it seems to build, but that's
> it); in any case, I would really like to see comments from x86 folks
> before it goes anywhere.

Please don't do this:

> +       if (unlikely(__copy_from_user(&sc, usc, sizeof(sc))))
> +               goto Efault;

Why would you use __copy_from_user()? Just don't.

> +       if (unlikely(__copy_from_user(&v, user_vm86,
> +                       offsetof(struct vm86_struct, int_revectored))))

Same here.

There's no excuse for __copy_from_user().

           Linus

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()
  2019-10-13 19:22                                       ` Linus Torvalds
@ 2019-10-13 19:59                                         ` Al Viro
  2019-10-13 20:20                                           ` Linus Torvalds
  2019-10-15 18:08                                           ` Al Viro
  2019-10-16 20:25                                         ` [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user() Al Viro
  1 sibling, 2 replies; 71+ messages in thread
From: Al Viro @ 2019-10-13 19:59 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Guenter Roeck, Linux Kernel Mailing List, linux-fsdevel

On Sun, Oct 13, 2019 at 12:22:38PM -0700, Linus Torvalds wrote:
> On Sun, Oct 13, 2019 at 12:10 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
> >
> > No arguments re put_user_ex side of things...  Below is a completely
> > untested patch for get_user_ex elimination (it seems to build, but that's
> > it); in any case, I would really like to see comments from x86 folks
> > before it goes anywhere.
> 
> Please don't do this:
> 
> > +       if (unlikely(__copy_from_user(&sc, usc, sizeof(sc))))
> > +               goto Efault;
> 
> Why would you use __copy_from_user()? Just don't.
> 
> > +       if (unlikely(__copy_from_user(&v, user_vm86,
> > +                       offsetof(struct vm86_struct, int_revectored))))
> 
> Same here.
> 
> There's no excuse for __copy_from_user().

Probably...  Said that, the vm86 one is preceded by
        if (!access_ok(user_vm86, plus ?
                       sizeof(struct vm86_struct) :
                       sizeof(struct vm86plus_struct)))
                return -EFAULT;
so I didn't want to bother.  We'll need to eliminate most of
access_ok() anyway, and I figured that conversion to plain copy_from_user()
would go there as well.

Again, this is not a patch submission - just an illustration of what I meant
re getting rid of get_user_ex().  IOW, the whole thing is still in the
plotting stage.

Re plotting: how strongly would you object against passing the range to
user_access_end()?  Powerpc folks have a very close analogue of stac/clac,
currently buried inside their __get_user()/__put_user()/etc. - the same
places where x86 does, including futex.h and friends.

And there it's even costlier than on x86.  It would obviously be nice
to lift it at least out of unsafe_get_user()/unsafe_put_user() and
move into user_access_begin()/user_access_end(); unfortunately, in
one subarchitecture they really want the range on the user_access_end()
side as well.  That's obviously not fatal (they can bloody well save those
into thread_info at user_access_begin()), but right now we have relatively
few user_access_end() callers, so the interface changes are still possible.

Other architectures with similar stuff are riscv (no arguments, same
as for stac/clac), arm (uaccess_save_and_enable() on the way in,
return value passed to uaccess_restore() on the way out) and s390
(similar to arm, but there it's needed only to deal with nesting,
and I'm not sure it actually can happen).

It would be nice to settle the API while there are not too many users
outside of arch/x86; changing it later will be a PITA and we definitely
have architectures that do potentially costly things around the userland
memory access; user_access_begin()/user_access_end() is in the right
place to try and see if they fit there...

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()
  2019-10-13 19:59                                         ` Al Viro
@ 2019-10-13 20:20                                           ` Linus Torvalds
  2019-10-15  3:46                                             ` Michael Ellerman
  2019-10-15 18:08                                           ` Al Viro
  1 sibling, 1 reply; 71+ messages in thread
From: Linus Torvalds @ 2019-10-13 20:20 UTC (permalink / raw)
  To: Al Viro; +Cc: Guenter Roeck, Linux Kernel Mailing List, linux-fsdevel

On Sun, Oct 13, 2019 at 12:59 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> Re plotting: how strongly would you object against passing the range to
> user_access_end()?  Powerpc folks have a very close analogue of stac/clac,
> currently buried inside their __get_user()/__put_user()/etc. - the same
> places where x86 does, including futex.h and friends.
>
> And there it's even costlier than on x86.  It would obviously be nice
> to lift it at least out of unsafe_get_user()/unsafe_put_user() and
> move into user_access_begin()/user_access_end(); unfortunately, in
> one subarchitecture they really want the range on the user_access_end()
> side as well.

Hmm. I'm ok with that.

Do they want the actual range, or would they prefer some kind of opaque
cookie that user_access_begin() returns (where 0 would mean "failure"
of course)?

I'm thinking like a local_irq_save/restore thing, which might be the
case on yet other architectures.

         Linus

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()
  2019-10-13 20:20                                           ` Linus Torvalds
@ 2019-10-15  3:46                                             ` Michael Ellerman
  0 siblings, 0 replies; 71+ messages in thread
From: Michael Ellerman @ 2019-10-15  3:46 UTC (permalink / raw)
  To: Linus Torvalds, Al Viro
  Cc: Guenter Roeck, Linux Kernel Mailing List, linux-fsdevel

Linus Torvalds <torvalds@linux-foundation.org> writes:
> On Sun, Oct 13, 2019 at 12:59 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
>> Re plotting: how strongly would you object against passing the range to
>> user_access_end()?  Powerpc folks have a very close analogue of stac/clac,
>> currently buried inside their __get_user()/__put_user()/etc. - the same
>> places where x86 does, including futex.h and friends.
>>
>> And there it's even costlier than on x86.  It would obviously be nice
>> to lift it at least out of unsafe_get_user()/unsafe_put_user() and
>> move into user_access_begin()/user_access_end(); unfortunately, in
>> one subarchitecture they really want the range on the user_access_end()
>> side as well.
>
> Hmm. I'm ok with that.
>
> Do they want the actual range, or would it prefer some kind of opaque
> cookie that user_access_begin() returns (where 0 would mean "failure"
> of course)?

The code does want the actual range, or at least the range rounded to a
segment boundary (256MB).

But it can get that already from a value it stashes in current->thread,
it was just more natural to pass the addr/size with the way the code is
currently structured.

It seems to generate slightly better code to pass addr/size vs loading
it from current->thread, but it's probably in the noise vs everything
else that's going on.

So a cookie would work fine, we could return the encoded addr/size in
the cookie and that might generate better code than loading it back from
current->thread. Equally we could just use the value in current->thread
and not have any cookie at all.

cheers

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()
  2019-10-13 19:59                                         ` Al Viro
  2019-10-13 20:20                                           ` Linus Torvalds
@ 2019-10-15 18:08                                           ` Al Viro
  2019-10-15 19:00                                             ` Linus Torvalds
  2019-10-16 12:12                                             ` [RFC] change of calling conventions for arch_futex_atomic_op_inuser() Al Viro
  1 sibling, 2 replies; 71+ messages in thread
From: Al Viro @ 2019-10-15 18:08 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Guenter Roeck, Linux Kernel Mailing List, linux-fsdevel,
	Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Darren Hart,
	linux-arch

[futex folks and linux-arch Cc'd]

On Sun, Oct 13, 2019 at 08:59:49PM +0100, Al Viro wrote:

> Re plotting: how strongly would you object against passing the range to
> user_access_end()?  Powerpc folks have a very close analogue of stac/clac,
> currently buried inside their __get_user()/__put_user()/etc. - the same
> places where x86 does, including futex.h and friends.
> 
> And there it's even costlier than on x86.  It would obviously be nice
> to lift it at least out of unsafe_get_user()/unsafe_put_user() and
> move into user_access_begin()/user_access_end(); unfortunately, in
> one subarchitecture they really want the range on the user_access_end()
> side as well.  That's obviously not fatal (they can bloody well save those
> into thread_info at user_access_begin()), but right now we have relatively
> few user_access_end() callers, so the interface changes are still possible.
> 
> Other architectures with similar stuff are riscv (no arguments, same
> as for stac/clac), arm (uaccess_save_and_enable() on the way in,
> return value passed to uaccess_restore() on the way out) and s390
> (similar to arm, but there it's needed only to deal with nesting,
> and I'm not sure it actually can happen).
> 
> It would be nice to settle the API while there are not too many users
> outside of arch/x86; changing it later will be a PITA and we definitely
> have architectures that do potentially costly things around the userland
> memory access; user_access_begin()/user_access_end() is in the right
> place to try and see if they fit there...

Another question: right now we have
        if (!access_ok(uaddr, sizeof(u32)))
                return -EFAULT;

        ret = arch_futex_atomic_op_inuser(op, oparg, &oldval, uaddr);
        if (ret)
                return ret;
in kernel/futex.c.  Would there be any objections to moving access_ok()
inside the instances and moving pagefault_disable()/pagefault_enable() outside?

Reasons:
	* on x86 that would allow folding access_ok() with STAC into
user_access_begin().  The same would be doable on other usual suspects
(arm, arm64, ppc, riscv, s390), bringing access_ok() next to their
STAC counterparts.
	* pagefault_disable()/pagefault_enable() pair is universal on
all architectures, really meant to be by the nature of the beast, and
lifting it into kernel/futex.c would get the same situation as with
futex_atomic_cmpxchg_inatomic().  Which also does access_ok() inside
the primitive (also foldable into user_access_begin(), at that).
	* access_ok() would be closer to actual memory access (and
out of the generic code).

Comments?

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()
  2019-10-15 18:08                                           ` Al Viro
@ 2019-10-15 19:00                                             ` Linus Torvalds
  2019-10-15 19:40                                               ` Al Viro
  2019-10-16 12:12                                             ` [RFC] change of calling conventions for arch_futex_atomic_op_inuser() Al Viro
  1 sibling, 1 reply; 71+ messages in thread
From: Linus Torvalds @ 2019-10-15 19:00 UTC (permalink / raw)
  To: Al Viro
  Cc: Guenter Roeck, Linux Kernel Mailing List, linux-fsdevel,
	Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Darren Hart,
	linux-arch

On Tue, Oct 15, 2019 at 11:08 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> Another question: right now we have
>         if (!access_ok(uaddr, sizeof(u32)))
>                 return -EFAULT;
>
>         ret = arch_futex_atomic_op_inuser(op, oparg, &oldval, uaddr);
>         if (ret)
>                 return ret;
> in kernel/futex.c.  Would there be any objections to moving access_ok()
> inside the instances and moving pagefault_disable()/pagefault_enable() outside?

I think we should remove all the "atomic" versions, and just make the
rule be that if you want atomic, you surround it with
pagefault_disable()/pagefault_enable().

That covers not just the futex ops (where "atomic" is actually
somewhat ambiguous - the ops themselves are atomic too, so the naming
might stay, although arguably the "futex" part makes that pointless
too), but also copy_to_user_inatomic() and the powerpc version of
__get_user_inatomic().

So we'd aim to get rid of all the "inatomic" ones entirely.

Same ultimately probably goes for the NMI versions. We should just
make it be a rule that we can use all of the user access functions
with pagefault_{dis,en}able() around them, and they'll be "safe" to
use in atomic context.

One issue with the NMI versions is that they actually want to avoid
the current value of set_fs().  So copy_from_user_nmi() (at least on
x86) is special in that it does

        if (__range_not_ok(from, n, TASK_SIZE))
                return n;

instead of access_ok() because of that issue.

NMI also has some other issues (nmi_uaccess_okay() on x86, at least),
but those *probably* could be handled at page fault time instead.

Anyway, NMI is so special that I'd suggest leaving it for later, but
the non-NMI atomic accesses I would suggest you clean up at the same
time.

I think the *only* reason we have the "inatomic()" versions is that
the regular ones do that "might_fault()" testing unconditionally, and
might_fault() _used_ to be just a might_sleep() - so it's not about
functionality per se, it's about "we have this sanity check that we
need to undo".

We've already made "might_fault()" look at pagefault_disabled(), so I
think a lot of the reasons for inatomic are entirely historical.

                Linus

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()
  2019-10-15 19:00                                             ` Linus Torvalds
@ 2019-10-15 19:40                                               ` Al Viro
  2019-10-15 20:18                                                 ` Al Viro
  0 siblings, 1 reply; 71+ messages in thread
From: Al Viro @ 2019-10-15 19:40 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Guenter Roeck, Linux Kernel Mailing List, linux-fsdevel,
	Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Darren Hart,
	linux-arch

On Tue, Oct 15, 2019 at 12:00:34PM -0700, Linus Torvalds wrote:
> On Tue, Oct 15, 2019 at 11:08 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
> >
> > Another question: right now we have
> >         if (!access_ok(uaddr, sizeof(u32)))
> >                 return -EFAULT;
> >
> >         ret = arch_futex_atomic_op_inuser(op, oparg, &oldval, uaddr);
> >         if (ret)
> >                 return ret;
> > in kernel/futex.c.  Would there be any objections to moving access_ok()
> > inside the instances and moving pagefault_disable()/pagefault_enable() outside?
> 
> I think we should remove all the "atomic" versions, and just make the
> rule be that if you want atomic, you surround it with
> pagefault_disable()/pagefault_enable().

Umm...  I thought about that, but ended up with "it documents the intent" -
pagefault_disable() might be implicit (e.g. done by kmap_atomic()) or
several levels up the call chain.  Not sure.

> That covers not just the futex ops (where "atomic" is actually
> somewhat ambiguous - the ops themselves are atomic too, so the naming
> might stay, although arguably the "futex" part makes that pointless
> too), but also copy_to_user_inatomic() and the powerpc version of
> __get_user_inatomic().

Eh?  copy_to_user_inatomic() doesn't exist; __copy_to_user_inatomic()
does, but...

arch/mips/kernel/unaligned.c:1307:                      res = __copy_to_user_inatomic(addr, fpr, sizeof(*fpr));
drivers/gpu/drm/i915/i915_gem.c:313:    unwritten = __copy_to_user_inatomic(user_data,
lib/test_kasan.c:510:   unused = __copy_to_user_inatomic(usermem, kmem, size + 1);
mm/maccess.c:98:        ret = __copy_to_user_inatomic((__force void __user *)dst, src, size);

these are all callers it has left anywhere and I'm certainly going to kill it.
Now, __copy_from_user_inatomic() has a lot more callers left...  Frankly,
the messier part of the API is the nocache side of things.  Consider e.g. this:
/* platform specific: cacheless copy */
static void cacheless_memcpy(void *dst, void *src, size_t n)
{
        /* 
         * Use the only available X64 cacheless copy.  Add a __user cast
         * to quiet sparse.  The src argument is already in the kernel so
         * there are no security issues.  The extra fault recovery machinery
         * is not invoked.
         */
        __copy_user_nocache(dst, (void __user *)src, n, 0);
}
or this
static void ntb_memcpy_tx(struct ntb_queue_entry *entry, void __iomem *offset)
{
#ifdef ARCH_HAS_NOCACHE_UACCESS
        /*
         * Using non-temporal mov to improve performance on non-cached
         * writes, even though we aren't actually copying from user space.
         */
        __copy_from_user_inatomic_nocache(offset, entry->buf, entry->len);
#else   
        memcpy_toio(offset, entry->buf, entry->len);
#endif

        /* Ensure that the data is fully copied out before setting the flags */
        wmb();

        ntb_tx_copy_callback(entry, NULL);
}
"user" part is bollocks in both cases; moreover, I really wonder about that
ifdef in ntb one - ARCH_HAS_NOCACHE_UACCESS is x86-only *at* *the* *moment*
and it just so happens that ..._toio() doesn't require anything special on
x86.  Have e.g. arm grow nocache stuff and the things will suddenly break,
won't they?


* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()
  2019-10-15 19:40                                               ` Al Viro
@ 2019-10-15 20:18                                                 ` Al Viro
  0 siblings, 0 replies; 71+ messages in thread
From: Al Viro @ 2019-10-15 20:18 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Guenter Roeck, Linux Kernel Mailing List, linux-fsdevel,
	Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Darren Hart,
	linux-arch

On Tue, Oct 15, 2019 at 08:40:12PM +0100, Al Viro wrote:
> or this
> static void ntb_memcpy_tx(struct ntb_queue_entry *entry, void __iomem *offset)
> {
> #ifdef ARCH_HAS_NOCACHE_UACCESS
>         /*
>          * Using non-temporal mov to improve performance on non-cached
>          * writes, even though we aren't actually copying from user space.
>          */
>         __copy_from_user_inatomic_nocache(offset, entry->buf, entry->len);
> #else   
>         memcpy_toio(offset, entry->buf, entry->len);
> #endif
> 
>         /* Ensure that the data is fully copied out before setting the flags */
>         wmb();
> 
>         ntb_tx_copy_callback(entry, NULL);
> }
> "user" part is bollocks in both cases; moreover, I really wonder about that
> ifdef in ntb one - ARCH_HAS_NOCACHE_UACCESS is x86-only *at* *the* *moment*
> and it just so happens that ..._toio() doesn't require anything special on
> x86.  Have e.g. arm grow nocache stuff and the things will suddenly break,
> won't they?

Incidentally, there are two callers of __copy_from_user_inatomic_nocache() in
generic code:
lib/iov_iter.c:792:             __copy_from_user_inatomic_nocache((to += v.iov_len) - v.iov_len,
lib/iov_iter.c:849:             if (__copy_from_user_inatomic_nocache((to += v.iov_len) - v.iov_len,

Neither is done under pagefault_disable(), AFAICS.  This one
drivers/gpu/drm/qxl/qxl_ioctl.c:189:    unwritten = __copy_from_user_inatomic_nocache
probably is - it has something called qxl_bo_kmap_atomic_page() called just prior,
which would seem to imply kmap_atomic() somewhere in it.
The same goes for
drivers/gpu/drm/i915/i915_gem.c:500:    unwritten = __copy_from_user_inatomic_nocache((void __force *)vaddr + offset,

So we have 5 callers anywhere.  Two are not "inatomic" in any sense; source is
in userspace and we want nocache behaviour.  Two _are_ done into a page that
had been fed through kmap_atomic(); the source is, again, in userland.  And
the last one is complete BS - it wants memcpy_toio_nocache() and abuses this
thing.

Incidentally, in case of fault i915 caller ends up unmapping the page,
mapping it non-atomic (with kmap?) and doing plain copy_from_user(),
nocache be damned.  qxl, OTOH, whines and fails all the way to userland...


* [RFC] change of calling conventions for arch_futex_atomic_op_inuser()
  2019-10-15 18:08                                           ` Al Viro
  2019-10-15 19:00                                             ` Linus Torvalds
@ 2019-10-16 12:12                                             ` Al Viro
  2019-10-16 12:24                                               ` Thomas Gleixner
  1 sibling, 1 reply; 71+ messages in thread
From: Al Viro @ 2019-10-16 12:12 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Guenter Roeck, Linux Kernel Mailing List, linux-fsdevel,
	Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Darren Hart,
	linux-arch

On Tue, Oct 15, 2019 at 07:08:46PM +0100, Al Viro wrote:
> [futex folks and linux-arch Cc'd]

> Another question: right now we have
>         if (!access_ok(uaddr, sizeof(u32)))
>                 return -EFAULT;
> 
>         ret = arch_futex_atomic_op_inuser(op, oparg, &oldval, uaddr);
>         if (ret)
>                 return ret;
> in kernel/futex.c.  Would there be any objections to moving access_ok()
> inside the instances and moving pagefault_disable()/pagefault_enable() outside?
> 
> Reasons:
> 	* on x86 that would allow folding access_ok() with STAC into
> user_access_begin().  The same would be doable on other usual suspects
> (arm, arm64, ppc, riscv, s390), bringing access_ok() next to their
> STAC counterparts.
> 	* pagefault_disable()/pagefault_enable() pair is universal on
> > all architectures, really meant to be there by the nature of the beast, and
> lifting it into kernel/futex.c would get the same situation as with
> futex_atomic_cmpxchg_inatomic().  Which also does access_ok() inside
> the primitive (also foldable into user_access_begin(), at that).
> 	* access_ok() would be closer to actual memory access (and
> out of the generic code).
> 
> Comments?

FWIW, completely untested patch follows; just the (semimechanical) conversion
of calling conventions, no per-architecture followups included.  Could futex
folks ACK/NAK that in principle?
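
To make the proposed convention concrete before the patch itself, here is a hedged userspace model of the change (everything below is a stand-in stub, not the real kernel primitives; the op is a pretend FUTEX_OP_ADD): the per-arch instance gains the access_ok() check, while the generic caller takes over the pagefault bracketing.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

static int pf_disabled;                 /* models pagefault-disable depth */
static void pagefault_disable(void) { pf_disabled++; }
static void pagefault_enable(void)  { pf_disabled--; }
/* stand-in range check: "NULL is the only bad address" */
static int access_ok(const void *p, size_t n) { (void)n; return p != NULL; }

/* new-style arch instance: does its own access_ok(), relies on the
 * caller having already done pagefault_disable() */
static int arch_futex_atomic_op_inuser(int op, int oparg, int *oval,
                                       uint32_t *uaddr)
{
	(void)op;
	if (!access_ok(uaddr, sizeof(uint32_t)))
		return -14;             /* -EFAULT */
	assert(pf_disabled > 0);        /* the caller's job now */
	*oval = (int)*uaddr;
	*uaddr += (uint32_t)oparg;      /* pretend FUTEX_OP_ADD */
	return 0;
}

/* generic caller, matching the kernel/futex.c hunk of the patch */
static int futex_atomic_op_inuser(int op, int oparg, int *oval,
                                  uint32_t *uaddr)
{
	int ret;

	pagefault_disable();
	ret = arch_futex_atomic_op_inuser(op, oparg, oval, uaddr);
	pagefault_enable();
	return ret;
}
```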

commit 7babb6ad28cb3e80977fb6bd0405e3f81a943161
Author: Al Viro <viro@zeniv.linux.org.uk>
Date:   Tue Oct 15 16:54:41 2019 -0400

    arch_futex_atomic_op_inuser(): move access_ok() in and pagefault_disable() - out
    
    Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

diff --git a/arch/alpha/include/asm/futex.h b/arch/alpha/include/asm/futex.h
index bfd3c01038f8..da67afd578fd 100644
--- a/arch/alpha/include/asm/futex.h
+++ b/arch/alpha/include/asm/futex.h
@@ -31,7 +31,8 @@ static inline int arch_futex_atomic_op_inuser(int op, int oparg, int *oval,
 {
 	int oldval = 0, ret;
 
-	pagefault_disable();
+	if (!access_ok(uaddr, sizeof(u32)))
+		return -EFAULT;
 
 	switch (op) {
 	case FUTEX_OP_SET:
@@ -53,8 +54,6 @@ static inline int arch_futex_atomic_op_inuser(int op, int oparg, int *oval,
 		ret = -ENOSYS;
 	}
 
-	pagefault_enable();
-
 	if (!ret)
 		*oval = oldval;
 
diff --git a/arch/arc/include/asm/futex.h b/arch/arc/include/asm/futex.h
index 9d0d070e6c22..607d1c16d4dd 100644
--- a/arch/arc/include/asm/futex.h
+++ b/arch/arc/include/asm/futex.h
@@ -75,10 +75,12 @@ static inline int arch_futex_atomic_op_inuser(int op, int oparg, int *oval,
 {
 	int oldval = 0, ret;
 
+	if (!access_ok(uaddr, sizeof(u32)))
+		return -EFAULT;
+
 #ifndef CONFIG_ARC_HAS_LLSC
 	preempt_disable();	/* to guarantee atomic r-m-w of futex op */
 #endif
-	pagefault_disable();
 
 	switch (op) {
 	case FUTEX_OP_SET:
@@ -101,7 +103,6 @@ static inline int arch_futex_atomic_op_inuser(int op, int oparg, int *oval,
 		ret = -ENOSYS;
 	}
 
-	pagefault_enable();
 #ifndef CONFIG_ARC_HAS_LLSC
 	preempt_enable();
 #endif
diff --git a/arch/arm/include/asm/futex.h b/arch/arm/include/asm/futex.h
index 83c391b597d4..e133da303a98 100644
--- a/arch/arm/include/asm/futex.h
+++ b/arch/arm/include/asm/futex.h
@@ -134,10 +134,12 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr)
 {
 	int oldval = 0, ret, tmp;
 
+	if (!access_ok(uaddr, sizeof(u32)))
+		return -EFAULT;
+
 #ifndef CONFIG_SMP
 	preempt_disable();
 #endif
-	pagefault_disable();
 
 	switch (op) {
 	case FUTEX_OP_SET:
@@ -159,7 +161,6 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr)
 		ret = -ENOSYS;
 	}
 
-	pagefault_enable();
 #ifndef CONFIG_SMP
 	preempt_enable();
 #endif
diff --git a/arch/arm64/include/asm/futex.h b/arch/arm64/include/asm/futex.h
index 6cc26a127819..97f6a63810ec 100644
--- a/arch/arm64/include/asm/futex.h
+++ b/arch/arm64/include/asm/futex.h
@@ -48,7 +48,8 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *_uaddr)
 	int oldval = 0, ret, tmp;
 	u32 __user *uaddr = __uaccess_mask_ptr(_uaddr);
 
-	pagefault_disable();
+	if (!access_ok(_uaddr, sizeof(u32)))
+		return -EFAULT;
 
 	switch (op) {
 	case FUTEX_OP_SET:
@@ -75,8 +76,6 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *_uaddr)
 		ret = -ENOSYS;
 	}
 
-	pagefault_enable();
-
 	if (!ret)
 		*oval = oldval;
 
diff --git a/arch/hexagon/include/asm/futex.h b/arch/hexagon/include/asm/futex.h
index cb635216a732..8693dc5ae9ec 100644
--- a/arch/hexagon/include/asm/futex.h
+++ b/arch/hexagon/include/asm/futex.h
@@ -36,7 +36,8 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr)
 {
 	int oldval = 0, ret;
 
-	pagefault_disable();
+	if (!access_ok(uaddr, sizeof(u32)))
+		return -EFAULT;
 
 	switch (op) {
 	case FUTEX_OP_SET:
@@ -62,8 +63,6 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr)
 		ret = -ENOSYS;
 	}
 
-	pagefault_enable();
-
 	if (!ret)
 		*oval = oldval;
 
diff --git a/arch/ia64/include/asm/futex.h b/arch/ia64/include/asm/futex.h
index 2e106d462196..1db26b432d8c 100644
--- a/arch/ia64/include/asm/futex.h
+++ b/arch/ia64/include/asm/futex.h
@@ -50,7 +50,8 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr)
 {
 	int oldval = 0, ret;
 
-	pagefault_disable();
+	if (!access_ok(uaddr, sizeof(u32)))
+		return -EFAULT;
 
 	switch (op) {
 	case FUTEX_OP_SET:
@@ -74,8 +75,6 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr)
 		ret = -ENOSYS;
 	}
 
-	pagefault_enable();
-
 	if (!ret)
 		*oval = oldval;
 
diff --git a/arch/microblaze/include/asm/futex.h b/arch/microblaze/include/asm/futex.h
index 8c90357e5983..86131ed84c9a 100644
--- a/arch/microblaze/include/asm/futex.h
+++ b/arch/microblaze/include/asm/futex.h
@@ -34,7 +34,8 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr)
 {
 	int oldval = 0, ret;
 
-	pagefault_disable();
+	if (!access_ok(uaddr, sizeof(u32)))
+		return -EFAULT;
 
 	switch (op) {
 	case FUTEX_OP_SET:
@@ -56,8 +57,6 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr)
 		ret = -ENOSYS;
 	}
 
-	pagefault_enable();
-
 	if (!ret)
 		*oval = oldval;
 
diff --git a/arch/mips/include/asm/futex.h b/arch/mips/include/asm/futex.h
index b83b0397462d..86f224548651 100644
--- a/arch/mips/include/asm/futex.h
+++ b/arch/mips/include/asm/futex.h
@@ -88,7 +88,8 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr)
 {
 	int oldval = 0, ret;
 
-	pagefault_disable();
+	if (!access_ok(uaddr, sizeof(u32)))
+		return -EFAULT;
 
 	switch (op) {
 	case FUTEX_OP_SET:
@@ -115,8 +116,6 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr)
 		ret = -ENOSYS;
 	}
 
-	pagefault_enable();
-
 	if (!ret)
 		*oval = oldval;
 
diff --git a/arch/nds32/include/asm/futex.h b/arch/nds32/include/asm/futex.h
index 5213c65c2e0b..60b7ab74ed92 100644
--- a/arch/nds32/include/asm/futex.h
+++ b/arch/nds32/include/asm/futex.h
@@ -66,8 +66,9 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr)
 {
 	int oldval = 0, ret;
 
+	if (!access_ok(uaddr, sizeof(u32)))
+		return -EFAULT;
 
-	pagefault_disable();
 	switch (op) {
 	case FUTEX_OP_SET:
 		__futex_atomic_op("move	%0, %3", ret, oldval, tmp, uaddr,
@@ -93,8 +94,6 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr)
 		ret = -ENOSYS;
 	}
 
-	pagefault_enable();
-
 	if (!ret)
 		*oval = oldval;
 
diff --git a/arch/openrisc/include/asm/futex.h b/arch/openrisc/include/asm/futex.h
index fe894e6331ae..865e9cd0d97b 100644
--- a/arch/openrisc/include/asm/futex.h
+++ b/arch/openrisc/include/asm/futex.h
@@ -35,7 +35,8 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr)
 {
 	int oldval = 0, ret;
 
-	pagefault_disable();
+	if (!access_ok(uaddr, sizeof(u32)))
+		return -EFAULT;
 
 	switch (op) {
 	case FUTEX_OP_SET:
@@ -57,8 +58,6 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr)
 		ret = -ENOSYS;
 	}
 
-	pagefault_enable();
-
 	if (!ret)
 		*oval = oldval;
 
diff --git a/arch/parisc/include/asm/futex.h b/arch/parisc/include/asm/futex.h
index 50662b6cb605..6e2e4d10e3c8 100644
--- a/arch/parisc/include/asm/futex.h
+++ b/arch/parisc/include/asm/futex.h
@@ -40,11 +40,10 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr)
 	u32 tmp;
 
 	_futex_spin_lock_irqsave(uaddr, &flags);
-	pagefault_disable();
 
 	ret = -EFAULT;
 	if (unlikely(get_user(oldval, uaddr) != 0))
-		goto out_pagefault_enable;
+		goto out_unlock;
 
 	ret = 0;
 	tmp = oldval;
@@ -72,8 +71,7 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr)
 	if (ret == 0 && unlikely(put_user(tmp, uaddr) != 0))
 		ret = -EFAULT;
 
-out_pagefault_enable:
-	pagefault_enable();
+out_unlock:
 	_futex_spin_unlock_irqrestore(uaddr, &flags);
 
 	if (!ret)
diff --git a/arch/powerpc/include/asm/futex.h b/arch/powerpc/include/asm/futex.h
index eea28ca679db..d6e32b32f452 100644
--- a/arch/powerpc/include/asm/futex.h
+++ b/arch/powerpc/include/asm/futex.h
@@ -35,8 +35,9 @@ static inline int arch_futex_atomic_op_inuser(int op, int oparg, int *oval,
 {
 	int oldval = 0, ret;
 
+	if (!access_ok(uaddr, sizeof(u32)))
+		return -EFAULT;
 	allow_write_to_user(uaddr, sizeof(*uaddr));
-	pagefault_disable();
 
 	switch (op) {
 	case FUTEX_OP_SET:
@@ -58,8 +59,6 @@ static inline int arch_futex_atomic_op_inuser(int op, int oparg, int *oval,
 		ret = -ENOSYS;
 	}
 
-	pagefault_enable();
-
 	*oval = oldval;
 
 	prevent_write_to_user(uaddr, sizeof(*uaddr));
diff --git a/arch/riscv/include/asm/futex.h b/arch/riscv/include/asm/futex.h
index 4ad6409c4647..84574acfb927 100644
--- a/arch/riscv/include/asm/futex.h
+++ b/arch/riscv/include/asm/futex.h
@@ -40,7 +40,8 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr)
 {
 	int oldval = 0, ret = 0;
 
-	pagefault_disable();
+	if (!access_ok(uaddr, sizeof(u32)))
+		return -EFAULT;
 
 	switch (op) {
 	case FUTEX_OP_SET:
@@ -67,8 +68,6 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr)
 		ret = -ENOSYS;
 	}
 
-	pagefault_enable();
-
 	if (!ret)
 		*oval = oldval;
 
diff --git a/arch/s390/include/asm/futex.h b/arch/s390/include/asm/futex.h
index 5e97a4353147..3c18a48baf44 100644
--- a/arch/s390/include/asm/futex.h
+++ b/arch/s390/include/asm/futex.h
@@ -28,8 +28,10 @@ static inline int arch_futex_atomic_op_inuser(int op, int oparg, int *oval,
 	int oldval = 0, newval, ret;
 	mm_segment_t old_fs;
 
+	if (!access_ok(uaddr, sizeof(u32)))
+		return -EFAULT;
+
 	old_fs = enable_sacf_uaccess();
-	pagefault_disable();
 	switch (op) {
 	case FUTEX_OP_SET:
 		__futex_atomic_op("lr %2,%5\n",
@@ -54,7 +56,6 @@ static inline int arch_futex_atomic_op_inuser(int op, int oparg, int *oval,
 	default:
 		ret = -ENOSYS;
 	}
-	pagefault_enable();
 	disable_sacf_uaccess(old_fs);
 
 	if (!ret)
diff --git a/arch/sh/include/asm/futex.h b/arch/sh/include/asm/futex.h
index 3190ec89df81..b39cda09fb95 100644
--- a/arch/sh/include/asm/futex.h
+++ b/arch/sh/include/asm/futex.h
@@ -34,8 +34,6 @@ static inline int arch_futex_atomic_op_inuser(int op, u32 oparg, int *oval,
 	u32 oldval, newval, prev;
 	int ret;
 
-	pagefault_disable();
-
 	do {
 		ret = get_user(oldval, uaddr);
 
@@ -67,8 +65,6 @@ static inline int arch_futex_atomic_op_inuser(int op, u32 oparg, int *oval,
 		ret = futex_atomic_cmpxchg_inatomic(&prev, uaddr, oldval, newval);
 	} while (!ret && prev != oldval);
 
-	pagefault_enable();
-
 	if (!ret)
 		*oval = oldval;
 
diff --git a/arch/sparc/include/asm/futex_64.h b/arch/sparc/include/asm/futex_64.h
index 0865ce77ec00..72de967318d7 100644
--- a/arch/sparc/include/asm/futex_64.h
+++ b/arch/sparc/include/asm/futex_64.h
@@ -38,8 +38,6 @@ static inline int arch_futex_atomic_op_inuser(int op, int oparg, int *oval,
 	if (unlikely((((unsigned long) uaddr) & 0x3UL)))
 		return -EINVAL;
 
-	pagefault_disable();
-
 	switch (op) {
 	case FUTEX_OP_SET:
 		__futex_cas_op("mov\t%4, %1", ret, oldval, uaddr, oparg);
@@ -60,8 +58,6 @@ static inline int arch_futex_atomic_op_inuser(int op, int oparg, int *oval,
 		ret = -ENOSYS;
 	}
 
-	pagefault_enable();
-
 	if (!ret)
 		*oval = oldval;
 
diff --git a/arch/x86/include/asm/futex.h b/arch/x86/include/asm/futex.h
index 13c83fe97988..6bcd1c1486d9 100644
--- a/arch/x86/include/asm/futex.h
+++ b/arch/x86/include/asm/futex.h
@@ -47,7 +47,8 @@ static inline int arch_futex_atomic_op_inuser(int op, int oparg, int *oval,
 {
 	int oldval = 0, ret, tem;
 
-	pagefault_disable();
+	if (!access_ok(uaddr, sizeof(u32)))
+		return -EFAULT;
 
 	switch (op) {
 	case FUTEX_OP_SET:
@@ -70,8 +71,6 @@ static inline int arch_futex_atomic_op_inuser(int op, int oparg, int *oval,
 		ret = -ENOSYS;
 	}
 
-	pagefault_enable();
-
 	if (!ret)
 		*oval = oldval;
 
diff --git a/arch/xtensa/include/asm/futex.h b/arch/xtensa/include/asm/futex.h
index 0c4457ca0a85..271cfcf8a841 100644
--- a/arch/xtensa/include/asm/futex.h
+++ b/arch/xtensa/include/asm/futex.h
@@ -72,7 +72,8 @@ static inline int arch_futex_atomic_op_inuser(int op, int oparg, int *oval,
 #if XCHAL_HAVE_S32C1I || XCHAL_HAVE_EXCLUSIVE
 	int oldval = 0, ret;
 
-	pagefault_disable();
+	if (!access_ok(uaddr, sizeof(u32)))
+		return -EFAULT;
 
 	switch (op) {
 	case FUTEX_OP_SET:
@@ -99,8 +100,6 @@ static inline int arch_futex_atomic_op_inuser(int op, int oparg, int *oval,
 		ret = -ENOSYS;
 	}
 
-	pagefault_enable();
-
 	if (!ret)
 		*oval = oldval;
 
diff --git a/include/asm-generic/futex.h b/include/asm-generic/futex.h
index 02970b11f71f..f4c3470480c7 100644
--- a/include/asm-generic/futex.h
+++ b/include/asm-generic/futex.h
@@ -34,7 +34,6 @@ arch_futex_atomic_op_inuser(int op, u32 oparg, int *oval, u32 __user *uaddr)
 	u32 tmp;
 
 	preempt_disable();
-	pagefault_disable();
 
 	ret = -EFAULT;
 	if (unlikely(get_user(oldval, uaddr) != 0))
@@ -67,7 +66,6 @@ arch_futex_atomic_op_inuser(int op, u32 oparg, int *oval, u32 __user *uaddr)
 		ret = -EFAULT;
 
 out_pagefault_enable:
-	pagefault_enable();
 	preempt_enable();
 
 	if (ret == 0)
diff --git a/kernel/futex.c b/kernel/futex.c
index bd18f60e4c6c..2cc8a35109da 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -1662,10 +1662,9 @@ static int futex_atomic_op_inuser(unsigned int encoded_op, u32 __user *uaddr)
 		oparg = 1 << oparg;
 	}
 
-	if (!access_ok(uaddr, sizeof(u32)))
-		return -EFAULT;
-
+	pagefault_disable();
 	ret = arch_futex_atomic_op_inuser(op, oparg, &oldval, uaddr);
+	pagefault_enable();
 	if (ret)
 		return ret;
 


* Re: [RFC] change of calling conventions for arch_futex_atomic_op_inuser()
  2019-10-16 12:12                                             ` [RFC] change of calling conventions for arch_futex_atomic_op_inuser() Al Viro
@ 2019-10-16 12:24                                               ` Thomas Gleixner
  0 siblings, 0 replies; 71+ messages in thread
From: Thomas Gleixner @ 2019-10-16 12:24 UTC (permalink / raw)
  To: Al Viro
  Cc: Linus Torvalds, Guenter Roeck, Linux Kernel Mailing List,
	linux-fsdevel, Ingo Molnar, Peter Zijlstra, Darren Hart,
	linux-arch

On Wed, 16 Oct 2019, Al Viro wrote:
> On Tue, Oct 15, 2019 at 07:08:46PM +0100, Al Viro wrote:
> > [futex folks and linux-arch Cc'd]
> 
> > Another question: right now we have
> >         if (!access_ok(uaddr, sizeof(u32)))
> >                 return -EFAULT;
> > 
> >         ret = arch_futex_atomic_op_inuser(op, oparg, &oldval, uaddr);
> >         if (ret)
> >                 return ret;
> > in kernel/futex.c.  Would there be any objections to moving access_ok()
> > inside the instances and moving pagefault_disable()/pagefault_enable() outside?
> > 
> > Reasons:
> > 	* on x86 that would allow folding access_ok() with STAC into
> > user_access_begin().  The same would be doable on other usual suspects
> > (arm, arm64, ppc, riscv, s390), bringing access_ok() next to their
> > STAC counterparts.
> > 	* pagefault_disable()/pagefault_enable() pair is universal on
> > all architectures, really meant to be there by the nature of the beast, and
> > lifting it into kernel/futex.c would get the same situation as with
> > futex_atomic_cmpxchg_inatomic().  Which also does access_ok() inside
> > the primitive (also foldable into user_access_begin(), at that).
> > 	* access_ok() would be closer to actual memory access (and
> > out of the generic code).
> > 
> > Comments?
> 
> FWIW, completely untested patch follows; just the (semimechanical) conversion
> of calling conventions, no per-architecture followups included.  Could futex
> folks ACK/NAK that in principle?

Makes sense and does not change any of the futex semantics. Go wild.

Thanks,

	tglx


* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()
  2019-10-13 19:22                                       ` Linus Torvalds
  2019-10-13 19:59                                         ` Al Viro
@ 2019-10-16 20:25                                         ` Al Viro
  2019-10-17 19:36                                           ` [RFC][PATCHES] drivers/scsi/sg.c uaccess cleanups/fixes Al Viro
  2019-10-18  0:27                                           ` [RFC] csum_and_copy_from_user() semantics Al Viro
  1 sibling, 2 replies; 71+ messages in thread
From: Al Viro @ 2019-10-16 20:25 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Guenter Roeck, Linux Kernel Mailing List, linux-fsdevel

On Sun, Oct 13, 2019 at 12:22:38PM -0700, Linus Torvalds wrote:
> On Sun, Oct 13, 2019 at 12:10 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
> >
> > No arguments re put_user_ex side of things...  Below is a completely
> > untested patch for get_user_ex elimination (it seems to build, but that's
> > it); in any case, I would really like to see comments from x86 folks
> > before it goes anywhere.
> 
> Please don't do this:
> 
> > +       if (unlikely(__copy_from_user(&sc, usc, sizeof(sc))))
> > +               goto Efault;
> 
> Why would you use __copy_from_user()? Just don't.
> 
> > +       if (unlikely(__copy_from_user(&v, user_vm86,
> > +                       offsetof(struct vm86_struct, int_revectored))))
> 
> Same here.
> 
> There's no excuse for __copy_from_user().

FWIW, callers of __copy_from_user() remaining in the generic code:

1) regset.h:user_regset_copyin().  Switch to copy_from_user(); the calling
conventions of regset ->set() (as well as the method name) are atrocious,
but there are too many instances to mix any work in that direction into
this series.  Yes, nominally it's an inline, but IRL it's too large and
has many callers in the same file(s), so any optimizations of inlining
__copy_from_user() will be lost and there's more than enough work done
there to make access_ok() a noise.  And in this case it doesn't pay
to try and lift user_access_begin() into the callers - the work done
between the calls is often too non-trivial to be done in such area.
The same goes for other regset.h stuff; eventually we might want to
try and come up with saner API, but that's a separate story.

2) default csum_partial_copy_from_user().  What we need to do is
turn it into default csum_and_copy_from_user().  This
#ifndef _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
static inline
__wsum csum_and_copy_from_user (const void __user *src, void *dst,
                                      int len, __wsum sum, int *err_ptr)
{
        if (access_ok(src, len))
                return csum_partial_copy_from_user(src, dst, len, sum, err_ptr);

        if (len)
                *err_ptr = -EFAULT;

        return sum;
}
#endif
in checksum.h is the only thing that calls that sucker and we can bloody
well combine them and make the users of lib/checksum.h define
_HAVE_ARCH_COPY_AND_CSUM_FROM_USER.  That puts us reasonably close
to having _HAVE_ARCH_COPY_AND_CSUM_FROM_USER unconditional and in any
case, __copy_from_user() in lib/checksum.h turns into copy_from_user().
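
The point of combining the two is that the range check, the copy and the checksum collapse into a single pass over the data.  A minimal userspace sketch of that one-pass shape (RFC 1071-style 16-bit ones'-complement sum; the kernel's __wsum/err_ptr plumbing is omitted and the names here are illustrative):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <stddef.h>

/* One traversal of src: copy to dst and accumulate a 16-bit
 * ones'-complement sum (RFC 1071 style).  Models what a combined
 * csum_and_copy_from_user() does in a single pass, instead of
 * copy_from_user() followed by a separate checksum pass. */
static uint32_t copy_and_csum(void *dst, const void *src, size_t len,
			      uint32_t sum)
{
	const uint8_t *s = src;
	uint8_t *d = dst;
	size_t i;

	for (i = 0; i + 1 < len; i += 2) {
		d[i] = s[i];
		d[i + 1] = s[i + 1];
		sum += (uint32_t)s[i] << 8 | s[i + 1];
	}
	if (i < len) {                  /* trailing odd byte */
		d[i] = s[i];
		sum += (uint32_t)s[i] << 8;
	}
	while (sum >> 16)               /* fold the carries back in */
		sum = (sum & 0xffff) + (sum >> 16);
	return sum;
}
```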

3) firewire ioctl_queue_iso().  Convert to copy_from_user(), lose the
access_ok() before the loop.  Definitely not an unsafe_... situation
(we call fw_iso_context_queue() after each chunk; _not_ something
we want under user_access_begin()/user_access_end()) and it's really
not worth trying to save on access_ok() checks there.

4) pstore persistent_ram_update_user().  Obvious copy_from_user(); definitely
lose access_ok() in the caller (persistent_ram_write_user()), along with
the one in write_pmsg() (several calls back by the callchain).

5) test_kasan: lose the function, lose the tests...

6) drivers/scsi/sg.c nest: sg_read() ones are memdup_user() in disguise
(i.e. fold with immediately preceding kmalloc()s).  sg_new_write() -
fold with access_ok() into copy_from_user() (for both call sites).
sg_write() - lose access_ok(), use copy_from_user() (both call sites)
and get_user() (instead of the solitary __get_user() there).
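
For illustration, the "fold with immediately preceding kmalloc()s" shape is the memdup_user() pattern; a hedged userspace analogue (malloc()/memcpy() standing in for kmalloc()/copy_from_user(), which in the kernel can fail and makes the helper return ERR_PTR):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Userspace analogue of memdup_user(): allocate and copy in one
 * helper so callers can't get the error handling wrong, instead of
 * open-coding an allocation followed by a copy. */
static void *memdup(const void *src, size_t len)
{
	void *p = malloc(len);

	if (!p)
		return NULL;
	memcpy(p, src, len);
	return p;
}
```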

7) i915 ones are, frankly, terrifying.  Consider e.g. this one:
                relocs = kvmalloc_array(size, 1, GFP_KERNEL);
                if (!relocs) {
                        err = -ENOMEM;
                        goto err;
                }

                /* copy_from_user is limited to < 4GiB */
                copied = 0;
                do {
                        unsigned int len =
                                min_t(u64, BIT_ULL(31), size - copied);

                        if (__copy_from_user((char *)relocs + copied,
                                             (char __user *)urelocs + copied,
                                             len))
                                goto end;

                        copied += len;
                } while (copied < size);
Is that for real?  Are they *really* trying to allocate and copy >2Gb of
userland data?  That's eb_copy_relocations() and that crap is itself in
a loop.  Sizes come from user-supplied data.  WTF?  It's some weird
kmemdup lookalike and I'd rather heard from maintainers of that thing
before doing anything with it.
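
To make the chunking arithmetic above concrete, here is a runnable sketch of just the loop's size math (BIT_ULL(31), i.e. 2 GiB chunks, as in eb_copy_relocations(); the copy itself is elided and the helper name is made up):

```c
#include <assert.h>
#include <stdint.h>

#define CHUNK (1ULL << 31)	/* BIT_ULL(31): the per-call limit the
				 * i915 loop works around */

/* Count the <=2GiB chunks the loop would issue for a given size and
 * report the length of the final (possibly short) chunk. */
static unsigned int chunk_count(uint64_t size, uint64_t *last_len)
{
	uint64_t copied = 0;
	unsigned int n = 0;

	*last_len = 0;
	while (copied < size) {
		uint64_t len = size - copied < CHUNK ? size - copied : CHUNK;

		copied += len;
		*last_len = len;
		n++;
	}
	return n;
}
```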

8) vhost_copy_from_user().  Need comments from mst - it's been a while since I crawled
through that code and I'd need his ACK anyway.  The logic of the positioning of
access_ok() in there is non-trivial and I'm not sure how much of that serves
as early input validation and how much can be taken out and replaced by use of
plain copy_from_user() and friends.

9) KVM.  There I'm not sure that access_ok() would be the right thing to
do.  kvm_is_error_hva() tends to serve as the range check in that and similar
places; it's not the same situation as with NMI, but...

And that's it - everything else is in arch/*.  Looking at arch/x86, we have
	* insanity in math_emu (unchecked return value, for example)
	* a bunch of sigframe-related code.  Some want to use unsafe_...
(or raw_...) variant, some should probably go for copy_from_user().
FPU-related stuff is particularly interesting in that respect - there
we have several inline functions nearby that contain nothing but
stac + instruction + clac + exception handling.  And in quite a few
cases it would've been cleaner to lift stac/clac into the callers, since
they combine nicely.
	* regset_tls_set(): use copy_from_user().
	* one in kvm walk_addr_generic stuff.  If nothing else,
that one smells like __get_user() - we seem to be copying a single
PTE.  And again, it's using kvm_is_error_hva().


* [RFC][PATCHES] drivers/scsi/sg.c uaccess cleanups/fixes
  2019-10-16 20:25                                         ` [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user() Al Viro
@ 2019-10-17 19:36                                           ` Al Viro
  2019-10-17 19:39                                             ` [RFC PATCH 1/8] sg_ioctl(): fix copyout handling Al Viro
  2019-10-17 21:44                                             ` [RFC][PATCHES] drivers/scsi/sg.c uaccess cleanups/fixes Douglas Gilbert
  2019-10-18  0:27                                           ` [RFC] csum_and_copy_from_user() semantics Al Viro
  1 sibling, 2 replies; 71+ messages in thread
From: Al Viro @ 2019-10-17 19:36 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-scsi, linux-kernel

On Wed, Oct 16, 2019 at 09:25:40PM +0100, Al Viro wrote:

> FWIW, callers of __copy_from_user() remaining in the generic code:

> 6) drivers/scsi/sg.c nest: sg_read() ones are memdup_user() in disguise
> (i.e. fold with immediately preceding kmalloc()s).  sg_new_write() -
> fold with access_ok() into copy_from_user() (for both call sites).
> sg_write() - lose access_ok(), use copy_from_user() (both call sites)
> and get_user() (instead of the solitary __get_user() there).

Turns out that there'd been outright redundant access_ok() calls (not
even warranted by __copy_...) *and* several __put_user()/__get_user()
with no checking of return value (access_ok() was there, handling of
unmapped addresses wasn't).  The latter go back at least to 2.1.early...
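
As a hedged sketch of why those ignored return values matter (userspace stand-ins only; put_field() models a __put_user() that returns -EFAULT for an unmapped destination):

```c
#include <assert.h>

/* Models __put_user(): fails for an "unmapped" destination. */
static int put_field(int val, int *dst)
{
	if (!dst)
		return -14;	/* -EFAULT */
	*dst = val;
	return 0;
}

/* Buggy shape: several unchecked puts - a fault in any of them is
 * silently lost, which is what the old sg.c code could do. */
static int fill_unchecked(int *a, int *b)
{
	put_field(1, a);
	put_field(2, b);
	return 0;		/* always "success" */
}

/* Fixed shape: propagate the first failure to the caller. */
static int fill_checked(int *a, int *b)
{
	int ret = put_field(1, a);

	if (!ret)
		ret = put_field(2, b);
	return ret;
}
```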

I've got a series that presumably fixes and cleans the things up
in that area; it didn't get any serious testing (the kernel builds
and boots, smartctl works as well as it used to, but that's not
worth much - all it says is that SG_IO doesn't fail terribly;
I don't have any test setup for really working with /dev/sg*).

IOW, it needs more review and testing - this is _not_ a pull request.
It's in vfs.git#work.sg; individual patches are in followups.
Shortlog/diffstat:
Al Viro (8):
      sg_ioctl(): fix copyout handling
      sg_new_write(): replace access_ok() + __copy_from_user() with copy_from_user()
      sg_write(): __get_user() can fail...
      sg_read(): simplify reading ->pack_id of userland sg_io_hdr_t
      sg_new_write(): don't bother with access_ok
      sg_read(): get rid of access_ok()/__copy_..._user()
      sg_write(): get rid of access_ok()/__copy_from_user()/__get_user()
      SG_IO: get rid of access_ok()

 drivers/scsi/sg.c | 98 ++++++++++++++++++++++++++++++++----------------------------------------------------------------
 1 file changed, 32 insertions(+), 66 deletions(-)



* [RFC PATCH 1/8] sg_ioctl(): fix copyout handling
  2019-10-17 19:36                                           ` [RFC][PATCHES] drivers/scsi/sg.c uaccess cleanups/fixes Al Viro
@ 2019-10-17 19:39                                             ` Al Viro
  2019-10-17 19:39                                               ` [RFC PATCH 2/8] sg_new_write(): replace access_ok() + __copy_from_user() with copy_from_user() Al Viro
                                                                 ` (6 more replies)
  2019-10-17 21:44                                             ` [RFC][PATCHES] drivers/scsi/sg.c uaccess cleanups/fixes Douglas Gilbert
  1 sibling, 7 replies; 71+ messages in thread
From: Al Viro @ 2019-10-17 19:39 UTC (permalink / raw)
  To: linux-scsi; +Cc: Linus Torvalds, linux-kernel, Al Viro

From: Al Viro <viro@zeniv.linux.org.uk>

First of all, __put_user() can fail with access_ok() succeeding.
And access_ok() + __copy_to_user() is spelled copy_to_user()...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 drivers/scsi/sg.c | 43 ++++++++++++++++---------------------------
 1 file changed, 16 insertions(+), 27 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index cce757506383..634460421ce4 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -963,26 +963,21 @@ sg_ioctl(struct file *filp, unsigned int cmd_in, unsigned long arg)
 	case SG_GET_LOW_DMA:
 		return put_user((int) sdp->device->host->unchecked_isa_dma, ip);
 	case SG_GET_SCSI_ID:
-		if (!access_ok(p, sizeof (sg_scsi_id_t)))
-			return -EFAULT;
-		else {
-			sg_scsi_id_t __user *sg_idp = p;
+		{
+			sg_scsi_id_t v;
 
 			if (atomic_read(&sdp->detaching))
 				return -ENODEV;
-			__put_user((int) sdp->device->host->host_no,
-				   &sg_idp->host_no);
-			__put_user((int) sdp->device->channel,
-				   &sg_idp->channel);
-			__put_user((int) sdp->device->id, &sg_idp->scsi_id);
-			__put_user((int) sdp->device->lun, &sg_idp->lun);
-			__put_user((int) sdp->device->type, &sg_idp->scsi_type);
-			__put_user((short) sdp->device->host->cmd_per_lun,
-				   &sg_idp->h_cmd_per_lun);
-			__put_user((short) sdp->device->queue_depth,
-				   &sg_idp->d_queue_depth);
-			__put_user(0, &sg_idp->unused[0]);
-			__put_user(0, &sg_idp->unused[1]);
+			memset(&v, 0, sizeof(v));
+			v.host_no = sdp->device->host->host_no;
+			v.channel = sdp->device->channel;
+			v.scsi_id = sdp->device->id;
+			v.lun = sdp->device->lun;
+			v.scsi_type = sdp->device->type;
+			v.h_cmd_per_lun = sdp->device->host->cmd_per_lun;
+			v.d_queue_depth = sdp->device->queue_depth;
+			if (copy_to_user(p, &v, sizeof(sg_scsi_id_t)))
+				return -EFAULT;
 			return 0;
 		}
 	case SG_SET_FORCE_PACK_ID:
@@ -992,20 +987,16 @@ sg_ioctl(struct file *filp, unsigned int cmd_in, unsigned long arg)
 		sfp->force_packid = val ? 1 : 0;
 		return 0;
 	case SG_GET_PACK_ID:
-		if (!access_ok(ip, sizeof (int)))
-			return -EFAULT;
 		read_lock_irqsave(&sfp->rq_list_lock, iflags);
 		list_for_each_entry(srp, &sfp->rq_list, entry) {
 			if ((1 == srp->done) && (!srp->sg_io_owned)) {
 				read_unlock_irqrestore(&sfp->rq_list_lock,
 						       iflags);
-				__put_user(srp->header.pack_id, ip);
-				return 0;
+				return put_user(srp->header.pack_id, ip);
 			}
 		}
 		read_unlock_irqrestore(&sfp->rq_list_lock, iflags);
-		__put_user(-1, ip);
-		return 0;
+		return put_user(-1, ip);
 	case SG_GET_NUM_WAITING:
 		read_lock_irqsave(&sfp->rq_list_lock, iflags);
 		val = 0;
@@ -1073,9 +1064,7 @@ sg_ioctl(struct file *filp, unsigned int cmd_in, unsigned long arg)
 		val = (sdp->device ? 1 : 0);
 		return put_user(val, ip);
 	case SG_GET_REQUEST_TABLE:
-		if (!access_ok(p, SZ_SG_REQ_INFO * SG_MAX_QUEUE))
-			return -EFAULT;
-		else {
+		{
 			sg_req_info_t *rinfo;
 
 			rinfo = kcalloc(SG_MAX_QUEUE, SZ_SG_REQ_INFO,
@@ -1085,7 +1074,7 @@ sg_ioctl(struct file *filp, unsigned int cmd_in, unsigned long arg)
 			read_lock_irqsave(&sfp->rq_list_lock, iflags);
 			sg_fill_request_table(sfp, rinfo);
 			read_unlock_irqrestore(&sfp->rq_list_lock, iflags);
-			result = __copy_to_user(p, rinfo,
+			result = copy_to_user(p, rinfo,
 						SZ_SG_REQ_INFO * SG_MAX_QUEUE);
 			result = result ? -EFAULT : 0;
 			kfree(rinfo);
-- 
2.11.0
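The copyout pattern in the patch above - fill a kernel-side struct completely, then push it to userland with one checked copy_to_user() instead of a series of unchecked __put_user() calls - can be sketched as a plain userspace model. Everything below is illustrative: copy_to_user() is stubbed with memcpy, and the struct is a simplified stand-in for sg_scsi_id_t, not the real layout.

```c
#include <assert.h>
#include <string.h>

/* simplified stand-in for sg_scsi_id_t -- not the real kernel layout */
struct scsi_id_model {
	int host_no, channel, scsi_id, lun, scsi_type;
	short h_cmd_per_lun, d_queue_depth;
	int unused[2];
};

/* stub: the real copy_to_user() also validates the range and returns
 * the number of bytes it could NOT copy */
static unsigned long copy_to_user_stub(void *udst, const void *src,
				       unsigned long n)
{
	memcpy(udst, src, n);
	return 0;
}

/* one checked transfer instead of nine unchecked __put_user() calls */
static int fill_and_copy(struct scsi_id_model *ubuf)
{
	struct scsi_id_model v;

	memset(&v, 0, sizeof(v));	/* zeroes unused[] and any padding */
	v.host_no = 2;
	v.channel = 0;
	v.scsi_id = 3;
	v.lun = 0;
	v.scsi_type = 0;
	if (copy_to_user_stub(ubuf, &v, sizeof(v)))
		return -14;		/* -EFAULT */
	return 0;
}
```

A single failure point also means the -EFAULT path is handled once, rather than being silently ignored nine times as in the old code.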



* [RFC PATCH 2/8] sg_new_write(): replace access_ok() + __copy_from_user() with copy_from_user()
  2019-10-17 19:39                                             ` [RFC PATCH 1/8] sg_ioctl(): fix copyout handling Al Viro
@ 2019-10-17 19:39                                               ` Al Viro
  2019-10-17 19:39                                               ` [RFC PATCH 3/8] sg_write(): __get_user() can fail Al Viro
                                                                 ` (5 subsequent siblings)
  6 siblings, 0 replies; 71+ messages in thread
From: Al Viro @ 2019-10-17 19:39 UTC (permalink / raw)
  To: linux-scsi; +Cc: Linus Torvalds, linux-kernel, Al Viro

From: Al Viro <viro@zeniv.linux.org.uk>

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 drivers/scsi/sg.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 634460421ce4..026628aa556d 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -763,11 +763,7 @@ sg_new_write(Sg_fd *sfp, struct file *file, const char __user *buf,
 		sg_remove_request(sfp, srp);
 		return -EMSGSIZE;
 	}
-	if (!access_ok(hp->cmdp, hp->cmd_len)) {
-		sg_remove_request(sfp, srp);
-		return -EFAULT;	/* protects following copy_from_user()s + get_user()s */
-	}
-	if (__copy_from_user(cmnd, hp->cmdp, hp->cmd_len)) {
+	if (copy_from_user(cmnd, hp->cmdp, hp->cmd_len)) {
 		sg_remove_request(sfp, srp);
 		return -EFAULT;
 	}
-- 
2.11.0



* [RFC PATCH 3/8] sg_write(): __get_user() can fail...
  2019-10-17 19:39                                             ` [RFC PATCH 1/8] sg_ioctl(): fix copyout handling Al Viro
  2019-10-17 19:39                                               ` [RFC PATCH 2/8] sg_new_write(): replace access_ok() + __copy_from_user() with copy_from_user() Al Viro
@ 2019-10-17 19:39                                               ` Al Viro
  2019-10-17 19:39                                               ` [RFC PATCH 4/8] sg_read(): simplify reading ->pack_id of userland sg_io_hdr_t Al Viro
                                                                 ` (4 subsequent siblings)
  6 siblings, 0 replies; 71+ messages in thread
From: Al Viro @ 2019-10-17 19:39 UTC (permalink / raw)
  To: linux-scsi; +Cc: Linus Torvalds, linux-kernel, Al Viro

From: Al Viro <viro@zeniv.linux.org.uk>

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 drivers/scsi/sg.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 026628aa556d..4c62237cdf37 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -640,13 +640,15 @@ sg_write(struct file *filp, const char __user *buf, size_t count, loff_t * ppos)
 	if (count < (SZ_SG_HEADER + 6))
 		return -EIO;	/* The minimum scsi command length is 6 bytes. */
 
+	buf += SZ_SG_HEADER;
+	if (__get_user(opcode, buf))
+		return -EFAULT;
+
 	if (!(srp = sg_add_request(sfp))) {
 		SCSI_LOG_TIMEOUT(1, sg_printk(KERN_INFO, sdp,
 					      "sg_write: queue full\n"));
 		return -EDOM;
 	}
-	buf += SZ_SG_HEADER;
-	__get_user(opcode, buf);
 	mutex_lock(&sfp->f_mutex);
 	if (sfp->next_cmd_len > 0) {
 		cmd_size = sfp->next_cmd_len;
-- 
2.11.0
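The reordering in the patch above follows a general pattern: do the user fetch that can fail *before* allocating the request, so an -EFAULT return needs no teardown path. A userspace model of that control flow, with get_user() and the request pool stubbed (all names here are illustrative, not kernel API):

```c
#include <assert.h>

#define EFAULT_MODEL 14

static int requests_outstanding;	/* stands in for the sg request list */

/* stub: the real get_user() returns -EFAULT on an unmapped address */
static int get_user_stub(unsigned char *x, const unsigned char *uptr)
{
	if (!uptr)			/* model a faulting user pointer */
		return -EFAULT_MODEL;
	*x = *uptr;
	return 0;
}

static void add_request(void)
{
	requests_outstanding++;
}

/* fetch first, allocate second: a fault leaves no request to clean up */
static int write_cmd(const unsigned char *ubuf, unsigned char *opcode)
{
	if (get_user_stub(opcode, ubuf))
		return -EFAULT_MODEL;
	add_request();
	return 0;
}
```

In the old ordering, a faulting __get_user() after sg_add_request() would have needed an sg_remove_request() on the error path (and in fact its return value was never checked at all).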



* [RFC PATCH 4/8] sg_read(): simplify reading ->pack_id of userland sg_io_hdr_t
  2019-10-17 19:39                                             ` [RFC PATCH 1/8] sg_ioctl(): fix copyout handling Al Viro
  2019-10-17 19:39                                               ` [RFC PATCH 2/8] sg_new_write(): replace access_ok() + __copy_from_user() with copy_from_user() Al Viro
  2019-10-17 19:39                                               ` [RFC PATCH 3/8] sg_write(): __get_user() can fail Al Viro
@ 2019-10-17 19:39                                               ` Al Viro
  2019-10-17 19:39                                               ` [RFC PATCH 5/8] sg_new_write(): don't bother with access_ok Al Viro
                                                                 ` (3 subsequent siblings)
  6 siblings, 0 replies; 71+ messages in thread
From: Al Viro @ 2019-10-17 19:39 UTC (permalink / raw)
  To: linux-scsi; +Cc: Linus Torvalds, linux-kernel, Al Viro

From: Al Viro <viro@zeniv.linux.org.uk>

We don't need to allocate a temporary buffer and read the entire
structure into it, only to fetch a single field and free what we'd
allocated.  Just use get_user() and be done with it...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 drivers/scsi/sg.c | 13 ++-----------
 1 file changed, 2 insertions(+), 11 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 4c62237cdf37..2d30e89075e9 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -441,17 +441,8 @@ sg_read(struct file *filp, char __user *buf, size_t count, loff_t * ppos)
 		}
 		if (old_hdr->reply_len < 0) {
 			if (count >= SZ_SG_IO_HDR) {
-				sg_io_hdr_t *new_hdr;
-				new_hdr = kmalloc(SZ_SG_IO_HDR, GFP_KERNEL);
-				if (!new_hdr) {
-					retval = -ENOMEM;
-					goto free_old_hdr;
-				}
-				retval =__copy_from_user
-				    (new_hdr, buf, SZ_SG_IO_HDR);
-				req_pack_id = new_hdr->pack_id;
-				kfree(new_hdr);
-				if (retval) {
+				sg_io_hdr_t __user *p = (void __user *)buf;
+				if (get_user(req_pack_id, &p->pack_id)) {
 					retval = -EFAULT;
 					goto free_old_hdr;
 				}
-- 
2.11.0
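The single-field fetch above can be modeled in userspace. The struct below is hypothetical and only mirrors the idea that pack_id sits somewhere inside a much larger header; get_user() is stubbed with a plain dereference.

```c
#include <assert.h>

struct io_hdr_model {			/* hypothetical, not sg_io_hdr_t */
	int interface_id;
	int pack_id;
	char rest_of_header[100];	/* the bulk we no longer copy */
};

/* stub: the real get_user() fetches one scalar, may return -EFAULT */
#define get_user_stub(x, uptr)	((x) = *(uptr), 0)

/* fetch just ->pack_id instead of kmalloc() + copy of the whole header */
static int read_pack_id(const struct io_hdr_model *uhdr)
{
	int id;

	if (get_user_stub(id, &uhdr->pack_id))
		return -1;
	return id;
}
```

Besides dropping an allocation that could fail with -ENOMEM, this touches 4 user bytes instead of the whole header.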



* [RFC PATCH 5/8] sg_new_write(): don't bother with access_ok
  2019-10-17 19:39                                             ` [RFC PATCH 1/8] sg_ioctl(): fix copyout handling Al Viro
                                                                 ` (2 preceding siblings ...)
  2019-10-17 19:39                                               ` [RFC PATCH 4/8] sg_read(): simplify reading ->pack_id of userland sg_io_hdr_t Al Viro
@ 2019-10-17 19:39                                               ` Al Viro
  2019-10-17 19:39                                               ` [RFC PATCH 6/8] sg_read(): get rid of access_ok()/__copy_..._user() Al Viro
                                                                 ` (2 subsequent siblings)
  6 siblings, 0 replies; 71+ messages in thread
From: Al Viro @ 2019-10-17 19:39 UTC (permalink / raw)
  To: linux-scsi; +Cc: Linus Torvalds, linux-kernel, Al Viro

From: Al Viro <viro@zeniv.linux.org.uk>

... just use copy_from_user().  We copy only SZ_SG_IO_HDR bytes,
so that would, strictly speaking, loosen the check.  However,
for call chains via ->write() the caller has actually checked
the entire range and SG_IO passes exactly SZ_SG_IO_HDR for count.
So no visible behaviour changes happen if we check only what
we really need for copyin.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 drivers/scsi/sg.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 2d30e89075e9..3702f66493f7 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -717,8 +717,6 @@ sg_new_write(Sg_fd *sfp, struct file *file, const char __user *buf,
 
 	if (count < SZ_SG_IO_HDR)
 		return -EINVAL;
-	if (!access_ok(buf, count))
-		return -EFAULT; /* protects following copy_from_user()s + get_user()s */
 
 	sfp->cmd_q = 1;	/* when sg_io_hdr seen, set command queuing on */
 	if (!(srp = sg_add_request(sfp))) {
@@ -728,7 +726,7 @@ sg_new_write(Sg_fd *sfp, struct file *file, const char __user *buf,
 	}
 	srp->sg_io_owned = sg_io_owned;
 	hp = &srp->header;
-	if (__copy_from_user(hp, buf, SZ_SG_IO_HDR)) {
+	if (copy_from_user(hp, buf, SZ_SG_IO_HDR)) {
 		sg_remove_request(sfp, srp);
 		return -EFAULT;
 	}
-- 
2.11.0



* [RFC PATCH 6/8] sg_read(): get rid of access_ok()/__copy_..._user()
  2019-10-17 19:39                                             ` [RFC PATCH 1/8] sg_ioctl(): fix copyout handling Al Viro
                                                                 ` (3 preceding siblings ...)
  2019-10-17 19:39                                               ` [RFC PATCH 5/8] sg_new_write(): don't bother with access_ok Al Viro
@ 2019-10-17 19:39                                               ` Al Viro
  2019-10-17 19:39                                               ` [RFC PATCH 7/8] sg_write(): get rid of access_ok()/__copy_from_user()/__get_user() Al Viro
  2019-10-17 19:39                                               ` [RFC PATCH 8/8] SG_IO: get rid of access_ok() Al Viro
  6 siblings, 0 replies; 71+ messages in thread
From: Al Viro @ 2019-10-17 19:39 UTC (permalink / raw)
  To: linux-scsi; +Cc: Linus Torvalds, linux-kernel, Al Viro

From: Al Viro <viro@zeniv.linux.org.uk>

Use copy_..._user() instead, both in sg_read() and in sg_read_oxfer().
And don't open-code memdup_user()...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 drivers/scsi/sg.c | 18 ++++++------------
 1 file changed, 6 insertions(+), 12 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 3702f66493f7..9f6534a025cd 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -429,16 +429,10 @@ sg_read(struct file *filp, char __user *buf, size_t count, loff_t * ppos)
 	SCSI_LOG_TIMEOUT(3, sg_printk(KERN_INFO, sdp,
 				      "sg_read: count=%d\n", (int) count));
 
-	if (!access_ok(buf, count))
-		return -EFAULT;
 	if (sfp->force_packid && (count >= SZ_SG_HEADER)) {
-		old_hdr = kmalloc(SZ_SG_HEADER, GFP_KERNEL);
-		if (!old_hdr)
-			return -ENOMEM;
-		if (__copy_from_user(old_hdr, buf, SZ_SG_HEADER)) {
-			retval = -EFAULT;
-			goto free_old_hdr;
-		}
+		old_hdr = memdup_user(buf, SZ_SG_HEADER);
+		if (IS_ERR(old_hdr))
+			return PTR_ERR(old_hdr);
 		if (old_hdr->reply_len < 0) {
 			if (count >= SZ_SG_IO_HDR) {
 				sg_io_hdr_t __user *p = (void __user *)buf;
@@ -529,7 +523,7 @@ sg_read(struct file *filp, char __user *buf, size_t count, loff_t * ppos)
 
 	/* Now copy the result back to the user buffer.  */
 	if (count >= SZ_SG_HEADER) {
-		if (__copy_to_user(buf, old_hdr, SZ_SG_HEADER)) {
+		if (copy_to_user(buf, old_hdr, SZ_SG_HEADER)) {
 			retval = -EFAULT;
 			goto free_old_hdr;
 		}
@@ -1960,12 +1954,12 @@ sg_read_oxfer(Sg_request * srp, char __user *outp, int num_read_xfer)
 	num = 1 << (PAGE_SHIFT + schp->page_order);
 	for (k = 0; k < schp->k_use_sg && schp->pages[k]; k++) {
 		if (num > num_read_xfer) {
-			if (__copy_to_user(outp, page_address(schp->pages[k]),
+			if (copy_to_user(outp, page_address(schp->pages[k]),
 					   num_read_xfer))
 				return -EFAULT;
 			break;
 		} else {
-			if (__copy_to_user(outp, page_address(schp->pages[k]),
+			if (copy_to_user(outp, page_address(schp->pages[k]),
 					   num))
 				return -EFAULT;
 			num_read_xfer -= num;
-- 
2.11.0
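The open-coded kmalloc() + __copy_from_user() pair that the patch above folds into memdup_user() behaves roughly like this userspace model, with malloc standing in for kmalloc and the kernel's ERR_PTR convention reproduced in miniature (the real memdup_user() returns ERR_PTR(-EFAULT) when copy_from_user() faults):

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define ENOMEM_MODEL 12

/* miniature ERR_PTR/IS_ERR: errors live in the top 4095 addresses */
#define ERR_PTR_MODEL(e)	((void *)(uintptr_t)(e))
#define IS_ERR_MODEL(p)		((uintptr_t)(p) >= (uintptr_t)-4095)

/* model of memdup_user(): allocate + copy in, or hand back an error
 * pointer -- the caller never sees a half-initialized buffer */
static void *memdup_user_model(const void *usrc, size_t len)
{
	void *p = malloc(len);

	if (!p)
		return ERR_PTR_MODEL(-ENOMEM_MODEL);
	memcpy(p, usrc, len);		/* stub for copy_from_user() */
	return p;
}
```

The caller then collapses to `p = memdup_user(buf, len); if (IS_ERR(p)) return PTR_ERR(p);` - one branch instead of separate -ENOMEM and -EFAULT cleanup paths.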



* [RFC PATCH 7/8] sg_write(): get rid of access_ok()/__copy_from_user()/__get_user()
  2019-10-17 19:39                                             ` [RFC PATCH 1/8] sg_ioctl(): fix copyout handling Al Viro
                                                                 ` (4 preceding siblings ...)
  2019-10-17 19:39                                               ` [RFC PATCH 6/8] sg_read(): get rid of access_ok()/__copy_..._user() Al Viro
@ 2019-10-17 19:39                                               ` Al Viro
  2019-10-17 19:39                                               ` [RFC PATCH 8/8] SG_IO: get rid of access_ok() Al Viro
  6 siblings, 0 replies; 71+ messages in thread
From: Al Viro @ 2019-10-17 19:39 UTC (permalink / raw)
  To: linux-scsi; +Cc: Linus Torvalds, linux-kernel, Al Viro

From: Al Viro <viro@zeniv.linux.org.uk>

Just use plain copy_from_user() and get_user().  Note that while
a buf-derived pointer gets stored into ->dxferp, all places that
actually use the resulting value feed it either to import_iovec()
or to import_single_range(), and both will do validation.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 drivers/scsi/sg.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 9f6534a025cd..f3d090b93cdf 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -612,11 +612,9 @@ sg_write(struct file *filp, const char __user *buf, size_t count, loff_t * ppos)
 	      scsi_block_when_processing_errors(sdp->device)))
 		return -ENXIO;
 
-	if (!access_ok(buf, count))
-		return -EFAULT;	/* protects following copy_from_user()s + get_user()s */
 	if (count < SZ_SG_HEADER)
 		return -EIO;
-	if (__copy_from_user(&old_hdr, buf, SZ_SG_HEADER))
+	if (copy_from_user(&old_hdr, buf, SZ_SG_HEADER))
 		return -EFAULT;
 	blocking = !(filp->f_flags & O_NONBLOCK);
 	if (old_hdr.reply_len < 0)
@@ -626,7 +624,7 @@ sg_write(struct file *filp, const char __user *buf, size_t count, loff_t * ppos)
 		return -EIO;	/* The minimum scsi command length is 6 bytes. */
 
 	buf += SZ_SG_HEADER;
-	if (__get_user(opcode, buf))
+	if (get_user(opcode, buf))
 		return -EFAULT;
 
 	if (!(srp = sg_add_request(sfp))) {
@@ -676,7 +674,7 @@ sg_write(struct file *filp, const char __user *buf, size_t count, loff_t * ppos)
 	hp->flags = input_size;	/* structure abuse ... */
 	hp->pack_id = old_hdr.pack_id;
 	hp->usr_ptr = NULL;
-	if (__copy_from_user(cmnd, buf, cmd_size))
+	if (copy_from_user(cmnd, buf, cmd_size))
 		return -EFAULT;
 	/*
 	 * SG_DXFER_TO_FROM_DEV is functionally equivalent to SG_DXFER_FROM_DEV,
-- 
2.11.0



* [RFC PATCH 8/8] SG_IO: get rid of access_ok()
  2019-10-17 19:39                                             ` [RFC PATCH 1/8] sg_ioctl(): fix copyout handling Al Viro
                                                                 ` (5 preceding siblings ...)
  2019-10-17 19:39                                               ` [RFC PATCH 7/8] sg_write(): get rid of access_ok()/__copy_from_user()/__get_user() Al Viro
@ 2019-10-17 19:39                                               ` Al Viro
  6 siblings, 0 replies; 71+ messages in thread
From: Al Viro @ 2019-10-17 19:39 UTC (permalink / raw)
  To: linux-scsi; +Cc: Linus Torvalds, linux-kernel, Al Viro

From: Al Viro <viro@zeniv.linux.org.uk>

simply not needed there - neither sg_new_read() nor sg_new_write() need
it.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 drivers/scsi/sg.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index f3d090b93cdf..0940abd91d3c 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -896,8 +896,6 @@ sg_ioctl(struct file *filp, unsigned int cmd_in, unsigned long arg)
 			return -ENODEV;
 		if (!scsi_block_when_processing_errors(sdp->device))
 			return -ENXIO;
-		if (!access_ok(p, SZ_SG_IO_HDR))
-			return -EFAULT;
 		result = sg_new_write(sfp, filp, p, SZ_SG_IO_HDR,
 				 1, read_only, 1, &srp);
 		if (result < 0)
-- 
2.11.0



* Re: [RFC][PATCHES] drivers/scsi/sg.c uaccess cleanups/fixes
  2019-10-17 19:36                                           ` [RFC][PATCHES] drivers/scsi/sg.c uaccess cleanups/fixes Al Viro
  2019-10-17 19:39                                             ` [RFC PATCH 1/8] sg_ioctl(): fix copyout handling Al Viro
@ 2019-10-17 21:44                                             ` Douglas Gilbert
  1 sibling, 0 replies; 71+ messages in thread
From: Douglas Gilbert @ 2019-10-17 21:44 UTC (permalink / raw)
  To: Al Viro, Linus Torvalds; +Cc: linux-scsi, linux-kernel

On 2019-10-17 9:36 p.m., Al Viro wrote:
> On Wed, Oct 16, 2019 at 09:25:40PM +0100, Al Viro wrote:
> 
>> FWIW, callers of __copy_from_user() remaining in the generic code:
> 
>> 6) drivers/scsi/sg.c nest: sg_read() ones are memdup_user() in disguise
>> (i.e. fold with immediately preceding kmalloc()s).  sg_new_write() -
>> fold with access_ok() into copy_from_user() (for both call sites).
>> sg_write() - lose access_ok(), use copy_from_user() (both call sites)
>> and get_user() (instead of the solitary __get_user() there).
> 
> Turns out that there'd been outright redundant access_ok() calls (not
> even warranted by __copy_...) *and* several __put_user()/__get_user()
> with no checking of return value (access_ok() was there, handling of
> unmapped addresses wasn't).  The latter go back at least to 2.1.early...
> 
> I've got a series that presumably fixes and cleans the things up
> in that area; it didn't get any serious testing (the kernel builds
> and boots, smartctl works as well as it used to, but that's not
> worth much - all it says is that SG_IO doesn't fail terribly;
> I don't have any test setup for really working with /dev/sg*).
> 
> IOW, it needs more review and testing - this is _not_ a pull request.
> It's in vfs.git#work.sg; individual patches are in followups.
> Shortlog/diffstat:
> Al Viro (8):
>        sg_ioctl(): fix copyout handling
>        sg_new_write(): replace access_ok() + __copy_from_user() with copy_from_user()
>        sg_write(): __get_user() can fail...
>        sg_read(): simplify reading ->pack_id of userland sg_io_hdr_t
>        sg_new_write(): don't bother with access_ok
>        sg_read(): get rid of access_ok()/__copy_..._user()
>        sg_write(): get rid of access_ok()/__copy_from_user()/__get_user()
>        SG_IO: get rid of access_ok()
> 
>   drivers/scsi/sg.c | 98 ++++++++++++++++++++++++++++++++----------------------------------------------------------------
>   1 file changed, 32 insertions(+), 66 deletions(-)

Al,
I am aware of these and have a 23 part patchset on the linux-scsi list
for review (see https://marc.info/?l=linux-scsi&m=157052102631490&w=2 )
that amongst other things fixes all of these. It also re-adds the
functionality removed from the bsg driver last year. Unfortunately that
review process is going very slowly, so I have no objections if you
apply these now.

It is unlikely that these changes will introduce any bugs (they didn't in
my testing). If you want to do more testing you may find the sg3_utils
package helpful, especially in the testing directory:
     https://github.com/hreinecke/sg3_utils

Doug Gilbert




* [RFC] csum_and_copy_from_user() semantics
  2019-10-16 20:25                                         ` [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user() Al Viro
  2019-10-17 19:36                                           ` [RFC][PATCHES] drivers/scsi/sg.c uaccess cleanups/fixes Al Viro
@ 2019-10-18  0:27                                           ` Al Viro
  1 sibling, 0 replies; 71+ messages in thread
From: Al Viro @ 2019-10-18  0:27 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Guenter Roeck, Linux Kernel Mailing List, linux-fsdevel, Anton Blanchard

On Wed, Oct 16, 2019 at 09:25:40PM +0100, Al Viro wrote:

> 2) default csum_partial_copy_from_user().  What we need to do is
> turn it into default csum_and_copy_from_user().  This
> #ifndef _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
> static inline
> __wsum csum_and_copy_from_user (const void __user *src, void *dst,
>                                       int len, __wsum sum, int *err_ptr)
> {
>         if (access_ok(src, len))
>                 return csum_partial_copy_from_user(src, dst, len, sum, err_ptr);
> 
>         if (len)
>                 *err_ptr = -EFAULT;
> 
>         return sum;
> }
> #endif
> in checksum.h is the only thing that calls that sucker and we can bloody
> well combine them and make the users of lib/checksum.h define
> _HAVE_ARCH_COPY_AND_CSUM_FROM_USER.  That puts us reasonably close
> to having _HAVE_ARCH_COPY_AND_CSUM_FROM_USER unconditional and in any
> case, __copy_from_user() in lib/checksum.h turns into copy_from_user().

Actually, that gets interesting.  First of all, csum_partial_copy_from_user()
has almost no callers other than csum_and_copy_from_user() - the only
exceptions are alpha and itanic, where csum_partial_copy_nocheck() instances
are using it.

Everything else goes through csum_and_copy_from_user().  And _that_ has
only two callers -  csum_and_copy_from_iter() and csum_and_copy_from_iter_full().
Both treat any failures as "discard the thing", for a good reason.  Namely,
neither csum_and_copy_from_user() nor csum_partial_copy_from_user() have any
means to tell the caller *where* has the fault happened.  So anything
that calls them has to treat a fault as "nothing copied".  That, of course,
goes both for data and csum.
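That "fault means nothing copied" contract can be modeled in userspace: the checksum-and-copy primitive reports only success or failure through an error pointer, never *where* it faulted, so the caller has to throw away both the data and the sum on any error. The primitives below are simplified stand-ins (a toy byte sum, not the real ones'-complement csum_partial()):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* toy checksum -- stands in for csum_partial() */
static unsigned int csum_model(const unsigned char *p, size_t len,
			       unsigned int sum)
{
	while (len--)
		sum += *p++;
	return sum;
}

/* stand-in for csum_and_copy_from_user(): on fault all it can say is
 * "failed" via *err -- not how much was copied before the fault */
static unsigned int csum_and_copy_model(const unsigned char *usrc,
					unsigned char *dst, size_t len,
					unsigned int sum, int *err)
{
	if (!usrc) {			/* model a faulting user address */
		if (len)
			*err = -14;	/* -EFAULT */
		return sum;
	}
	memcpy(dst, usrc, len);
	return csum_model(dst, len, sum);
}

/* caller: any fault discards the entire destination as garbage */
static int recv_model(const unsigned char *usrc, unsigned char *dst,
		      size_t len, unsigned int *sum_out)
{
	int err = 0;
	unsigned int sum = csum_and_copy_model(usrc, dst, len, 0, &err);

	if (err) {
		memset(dst, 0, len);	/* treat as "nothing copied" */
		return err;
	}
	*sum_out = sum;
	return 0;
}
```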

Moreover, behaviour of instances on different architectures differs -
some zero the uncopied-over part of destination, some do not, some
just keep going treating every failed fetch as "got zero" (and returning
the error in the end).

We could, in theory, teach that thing to report the exact amount
copied, so that new users (when and if such appear) could make use
of that.  However, it means a lot of unpleasant work on e.g. sparc.
For raw_copy_from_user() we had to do that, but here I don't see
the point.

As it is, it's only suitable for "discard if anything fails, treat
the entire destination area as garbage in such case" uses.  Which is
all we have for it at the moment.

IOW, it might make sense to get rid of all the "memset the tail to
zero on failure" logics in there - it's not consistently done and
the callers have no way to make use of it anyway.

In any case, there's no point keeping csum_and_copy_from_user()
separate from csum_partial_copy_from_user().  As it is, the
only real difference is that the former does access_ok(), while
the latter might not (some instances do, in which case there's
no difference at all).

Questions from reviewing the instances:
	* mips csum_partial_copy_from_user() tries to check
if we are under KERNEL_DS, in which case it goes for kernel-to-kernel
copy.  That's pointless - the callers are reading from an
iovec-backed iov_iter, which can't be created under KERNEL_DS.
So we would have to have set iovec-backed iov_iter while under
USER_DS, then do set_fs(KERNEL_DS), then pass that iov_iter to
->sendmsg().  Which doesn't happen.  IOW, the calls of
__csum_partial_copy_kernel() never happen - neither for
csum_and_copy_from_kernel() nor for csum_and_copy_to_kernel().

	* ppc does something odd:
        csum = csum_partial_copy_generic((void __force *)src, dst,
                                         len, sum, err_ptr, NULL);

        if (unlikely(*err_ptr)) {
                int missing = __copy_from_user(dst, src, len);

                if (missing) {
                        memset(dst + len - missing, 0, missing);
                        *err_ptr = -EFAULT;
                } else {
                        *err_ptr = 0;
                }

                csum = csum_partial(dst, len, sum);
        }
and since that happens under their stac equivalent, we get it nested -
__copy_from_user() takes and drops it.  I would've said "don't bother
trying to be smart on failures", if I'd been certain that it's not
a fallback for e.g. csum_partial_copy_from_user() in the misaligned
case.  Could ppc folks clarify that?


Thread overview: 71+ messages
2019-10-06 22:20 [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user() Guenter Roeck
2019-10-06 23:06 ` Linus Torvalds
2019-10-06 23:35   ` Linus Torvalds
2019-10-07  0:04     ` Guenter Roeck
2019-10-07  1:17       ` Linus Torvalds
2019-10-07  1:24         ` Al Viro
2019-10-07  2:06           ` Linus Torvalds
2019-10-07  2:50             ` Al Viro
2019-10-07  3:11               ` Linus Torvalds
2019-10-07 15:40                 ` David Laight
2019-10-07 18:11                   ` Linus Torvalds
2019-10-08  9:58                     ` David Laight
2019-10-07 17:34                 ` Al Viro
2019-10-07 18:13                   ` Linus Torvalds
2019-10-07 18:22                     ` Al Viro
2019-10-07 18:26                 ` Linus Torvalds
2019-10-07 18:36                   ` Tony Luck
2019-10-07 19:08                     ` Linus Torvalds
2019-10-07 19:49                       ` Tony Luck
2019-10-07 20:04                         ` Linus Torvalds
2019-10-08  3:29                   ` Al Viro
2019-10-08  4:09                     ` Linus Torvalds
2019-10-08  4:14                       ` Linus Torvalds
2019-10-08  5:02                         ` Al Viro
2019-10-08  4:24                       ` Linus Torvalds
2019-10-10 19:55                         ` Al Viro
2019-10-10 22:12                           ` Linus Torvalds
2019-10-11  0:11                             ` Al Viro
2019-10-11  0:31                               ` Linus Torvalds
2019-10-13 18:13                                 ` Al Viro
2019-10-13 18:43                                   ` Linus Torvalds
2019-10-13 19:10                                     ` Al Viro
2019-10-13 19:22                                       ` Linus Torvalds
2019-10-13 19:59                                         ` Al Viro
2019-10-13 20:20                                           ` Linus Torvalds
2019-10-15  3:46                                             ` Michael Ellerman
2019-10-15 18:08                                           ` Al Viro
2019-10-15 19:00                                             ` Linus Torvalds
2019-10-15 19:40                                               ` Al Viro
2019-10-15 20:18                                                 ` Al Viro
2019-10-16 12:12                                             ` [RFC] change of calling conventions for arch_futex_atomic_op_inuser() Al Viro
2019-10-16 12:24                                               ` Thomas Gleixner
2019-10-16 20:25                                         ` [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user() Al Viro
2019-10-17 19:36                                           ` [RFC][PATCHES] drivers/scsi/sg.c uaccess cleanups/fixes Al Viro
2019-10-17 19:39                                             ` [RFC PATCH 1/8] sg_ioctl(): fix copyout handling Al Viro
2019-10-17 19:39                                               ` [RFC PATCH 2/8] sg_new_write(): replace access_ok() + __copy_from_user() with copy_from_user() Al Viro
2019-10-17 19:39                                               ` [RFC PATCH 3/8] sg_write(): __get_user() can fail Al Viro
2019-10-17 19:39                                               ` [RFC PATCH 4/8] sg_read(): simplify reading ->pack_id of userland sg_io_hdr_t Al Viro
2019-10-17 19:39                                               ` [RFC PATCH 5/8] sg_new_write(): don't bother with access_ok Al Viro
2019-10-17 19:39                                               ` [RFC PATCH 6/8] sg_read(): get rid of access_ok()/__copy_..._user() Al Viro
2019-10-17 19:39                                               ` [RFC PATCH 7/8] sg_write(): get rid of access_ok()/__copy_from_user()/__get_user() Al Viro
2019-10-17 19:39                                               ` [RFC PATCH 8/8] SG_IO: get rid of access_ok() Al Viro
2019-10-17 21:44                                             ` [RFC][PATCHES] drivers/scsi/sg.c uaccess cleanups/fixes Douglas Gilbert
2019-10-18  0:27                                           ` [RFC] csum_and_copy_from_user() semantics Al Viro
2019-10-08  4:57                       ` [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user() Al Viro
2019-10-08 13:14                         ` Greg KH
2019-10-08 15:29                           ` Al Viro
2019-10-08 15:38                             ` Greg KH
2019-10-08 17:06                               ` Al Viro
2019-10-08 19:58                   ` Al Viro
2019-10-08 20:16                     ` Al Viro
2019-10-08 20:34                     ` Al Viro
2019-10-07  2:30         ` Guenter Roeck
2019-10-07  3:12           ` Linus Torvalds
2019-10-07  0:23   ` Guenter Roeck
2019-10-07  4:04 ` Max Filippov
2019-10-07 12:16   ` Guenter Roeck
2019-10-07 19:21 ` Linus Torvalds
2019-10-07 20:29   ` Guenter Roeck
2019-10-07 23:27   ` Guenter Roeck
2019-10-08  6:28     ` Geert Uytterhoeven
