linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Linux 4.11
@ 2017-05-01  3:02 Linus Torvalds
  2017-05-01  3:45 ` [git pull] uaccess-related bits of vfs.git Al Viro
  0 siblings, 1 reply; 27+ messages in thread
From: Linus Torvalds @ 2017-05-01  3:02 UTC (permalink / raw)
  To: Linux Kernel Mailing List

So after that extra week with an rc8, things were pretty calm, and I'm
much happier releasing a final 4.11 now.

We still had various smaller fixes the last week, but nothing that
made me go "hmm..". Shortlog appended for people who want to peruse
the details, but it's a mix all over, with about half being drivers
(networking dominates, but some sound fixlets too), with the rest
being soem arch updates, generic networking, and filesystem (nfs[d])
fixes. But it's all really small, which is what I like to see the last
week of the release cycle.

And with this, the merge window is obviously open. I already have two
pull request for 4.12 in my inbox, I expect that overnight I'll get a
lot more.

                    Linus

---

Al Viro (4):
      orangefs_bufmap_copy_from_iovec(): fix EFAULT handling
      p9_client_readdir() fix
      fix nfs O_DIRECT advancing iov_iter too much
      fix a braino in ITER_PIPE iov_iter_revert()

Alexander Kochetkov (1):
      net: phy: fix auto-negotiation stall due to unavailable interrupt

Alexander Potapenko (1):
      net/packet: check length in getsockopt() called with PACKET_HDRLEN

Andreas Kemnade (2):
      net: hso: fix module unloading
      net: hso: register netdev later to avoid a race condition

Ansis Atteka (1):
      udp: disable inner UDP checksum offloads in IPsec case

Arnaud Pouliquen (1):
      ASoC: STI: Fix null ptr deference in IRQ handler

Arnd Bergmann (2):
      clk: sunxi-ng: always select CCU_GATE
      cpsw/netcp: refine cpts dependency

Bert Kenward (1):
      sfc: tx ring can only have 2048 entries for all EF10 NICs

Dan Carpenter (2):
      net: tc35815: move free after the dereference
      ravb: Double free on error in ravb_start_xmit()

David Ahern (2):
      net: ipv6: send unsolicited NA if enabled for all interfaces
      net: ipv6: regenerate host route if moved to gc list

David Howells (1):
      statx: Kill fd-with-NULL-path support in favour of AT_EMPTY_PATH

David S. Miller (3):
      sparc64: Fill in rest of HAVE_REGS_AND_STACK_ACCESS_API
      sparc: Update syscall tables.
      Revert "phy: micrel: Disable auto negotiation on startup"

David Sterba (1):
      btrfs: qgroup: move noisy underflow warning to debugging build

Dmitry Torokhov (1):
      Input: i8042 - add Clevo P650RS to the i8042 reset list

Dmitry V. Levin (1):
      uapi: change the type of struct statx_timestamp.tv_nsec to unsigned

Eric Dumazet (2):
      tcp: do not underestimate skb->truesize in tcp_trim_head()
      net: adjust skb->truesize in ___pskb_trim()

Eugenia Emantayev (1):
      net/mlx5e: Fix small packet threshold

Florian Fainelli (3):
      net: dsa: b53: Include IMP/CPU port in dumb forwarding mode
      net: dsa: b53: Implement software reset for 58xx devices
      net: dsa: b53: Fix CPU port for 58xx devices

Frederic Weisbecker (1):
      sched/cputime: Fix ksoftirqd cputime accounting regression

Herbert Xu (1):
      macvlan: Fix device ref leak when purging bc_queue

Ilan Tayari (1):
      net/mlx5e: Fix ETHTOOL_GRXCLSRLALL handling

J. Bruce Fields (3):
      nfsd: check for oversized NFSv2/v3 arguments
      nfsd4: minor NFSv2/v3 write decoding cleanup
      nfsd: stricter decoding of write-like NFSv2/v3 ops

James Cowgill (2):
      MIPS: Fix modversioning of _mcount symbol
      MIPS: Avoid BUG warning in arch_check_elf

James Hogan (2):
      MIPS: cevt-r4k: Fix out-of-bounds array access
      MIPS: KGDB: Use kernel context for sleeping threads

Jamie Bainbridge (1):
      ipv6: check raw payload size correctly in ioctl

Janakarajan Natarajan (1):
      Prevent timer value 0 for MWAITX

Jason A. Donenfeld (2):
      macsec: avoid heap overflow in skb_to_sgvec
      macsec: dynamically allocate space for sglist

Johannes Thumshirn (1):
      scsi: return correct blkprep status code in case scsi_init_io() fails.

Josh Poimboeuf (2):
      ftrace/x86: Fix triple fault with graph tracing and suspend-to-ram
      x86/build: convert function graph '-Os' error to warning

Linus Torvalds (1):
      Linux 4.11

Maksim Salau (1):
      net: can: usb: gs_usb: Fix buffer on stack

Maor Gottlieb (1):
      net/mlx5: Fix UAR memory leak

Marcin Nowakowski (1):
      MIPS: generic: fix out-of-tree defconfig target builds

Martin KaFai Lau (1):
      net/mlx5e: Fix race in mlx5e_sw_stats and mlx5e_vport_stats

Mathias Kresin (1):
      MIPS: PCI: add controllers before the specified head

Matt Redfearn (3):
      MIPS: Malta: Fix i8259 irqchip setup
      MIPS: KASLR: Add missing header files
      MIPS: smp-cps: Fix potentially uninitialised value of core

Michael Kerrisk (man-pages) (1):
      statx: correct error handling of NULL pathname

Mohamad Haj Yahia (1):
      net/mlx5: Fix driver load bad flow when having fw initializing timeout

Mousumi Jana (1):
      ASoC: topology: Fix to store enum text values

Myungho Jung (1):
      net: core: Prevent from dereferencing null pointer when releasing SKB

Noam Camus (1):
      ARC: [plat-eznps] Fix build error

Or Gerlitz (3):
      net/mlx5: E-Switch, Correctly deal with inline mode on ConnectX-5
      net/mlx5e: Make sure the FW max encap size is enough for ipv4 tunnels
      net/mlx5e: Make sure the FW max encap size is enough for ipv6 tunnels

Pan Bian (1):
      team: fix memory leaks

Paolo Abeni (2):
      ipv6: move stub initialization after ipv6 setup completion
      bonding: avoid defaulting hard_header_len to ETH_HLEN on slave removal

Parthasarathy Bhuvaragan (5):
      tipc: fix socket flow control accounting error at tipc_send_stream
      tipc: fix socket flow control accounting error at tipc_recv_stream
      tipc: Fix missing connection request handling
      tipc: improve error validations for sockets in CONNECTING state
      tipc: close the connection if protocol messages contain errors

Rabin Vincent (1):
      MIPS: perf: fix deadlock

Robert Shearman (1):
      ipv4: Avoid caching l3mdev dst on mismatched local route

Roman Spychała (1):
      usb: plusb: Add support for PL-27A1

Sabrina Dubroca (2):
      ipv6: fix source routing
      xfrm: fix GRO for !CONFIG_NETFILTER

Steffen Klassert (1):
      ipv4: Don't pass IP fragments to upper layer GRO handlers.

Stephane Grosjean (2):
      can: usb: Add support of PCAN-Chip USB stamp module
      can: usb: Kconfig: Add PCAN-USB X6 device in help text

Takashi Iwai (2):
      ALSA: seq: Don't break snd_use_lock_sync() loop by timeout
      ASoC: intel: Fix PM and non-atomic crash in bytcr drivers

Takashi Sakamoto (2):
      ALSA: oxfw: fix regression to handle Stanton SCS.1m/1d
      ALSA: firewire-lib: fix inappropriate assignment between
signed/unsigned type

Vineet Gupta (1):
      ARCv2: entry: save Accumulator register pair (r58:59) if present

WANG Cong (1):
      ipv6: check skb->protocol before lookup for nexthop

Wei Wang (1):
      tcp: memset ca_priv data to 0 properly

Xin Long (2):
      bridge: move bridge multicast cleanup to ndo_uninit
      xfrm: do the garbage collection after flushing policy

Yan, Zheng (1):
      ceph: fix recursion between ceph_set_acl() and __ceph_setattr()

stephen hemminger (1):
      netvsc: fix calculation of available send sections

sudarsana.kalluru@cavium.com (1):
      qed: Fix error in the dcbx app meta data initialization.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [git pull] uaccess-related bits of vfs.git
  2017-05-01  3:02 Linux 4.11 Linus Torvalds
@ 2017-05-01  3:45 ` Al Viro
  2017-05-13  1:00   ` Linus Torvalds
  0 siblings, 1 reply; 27+ messages in thread
From: Al Viro @ 2017-05-01  3:45 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Linux Kernel Mailing List, linux-arch

	uaccess unification pile.  It's _not_ the end of uaccess work, but
the next batch of that will go into the next cycle.  This one mostly takes
copy_from_user() and friends out of arch/* and gets the zero-padding behaviour
in sync for all architectures.
	Dealing with the nocache/writethrough mess is for the next cycle;
fortunately, that's x86-only.  Same for cleanups in iov_iter.c (I am sold on
access_ok() in there, BTW; just not in this pile), same for reducing __copy_...
callsites, strn*... stuff, etc. - there will be a pile about as large as this
one in the next merge window.
	This one sat in -next for weeks.  -3KLoC.  One trivial conflict in
arch/sparc/Kconfig - this series takes HAVE_ARCH_HARDENED_USERCOPY out,
and mainline has renaming of PROVE_LOCKING_SMALL to LOCKDEP_SMALL done in
the next line.

	Please, pull.  There's other stuff in vfs.git, but this is the
most massive one this cycle, so other pull requests will be smaller.
I apologize for the topology of this one - it's a common stem +
per-architecture branches + merge joining those + short tail after that.
Ugly, but I wanted to make it less painful in terms of conflicts with
arch trees...

The following changes since commit b884a190afcecdbef34ca508ea5ee88bb7c77861:

  metag/usercopy: Add missing fixups (2017-04-05 15:25:07 +0100)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git work.uaccess

for you to fetch changes up to 2fefc97b2180518bac923fba3f79fdca1f41dc15:

  HAVE_ARCH_HARDENED_USERCOPY is unconditional now (2017-04-26 12:11:06 -0400)

----------------------------------------------------------------
Al Viro (99):
      uaccess: move VERIFY_{READ,WRITE} definitions to linux/uaccess.h
      uaccess: drop duplicate includes from asm/uaccess.h
      uaccess: drop pointless ifdefs
      add asm-generic/extable.h
      new helper: uaccess_kernel()
      asm-generic/uaccess.h: don't mess with __copy_{to,from}_user
      asm-generic: zero in __get_user(), not __get_user_fn()
      generic ...copy_..._user primitives
      alpha: switch __copy_user() and __do_clean_user() to normal calling conventions
      alpha: add asm/extable.h
      alpha: get rid of 'segment' argument of __{get,put}_user_check()
      alpha: don't bother with __access_ok() in traps.c
      alpha: kill the 'segment' argument of __access_ok()
      alpha: add a helper for emitting exception table entries
      alpha: switch to RAW_COPY_USER
      arc: get rid of unused declaration
      arm: switch to generic extable.h
      arm: switch to RAW_COPY_USER
      arm64: add extable.h
      avr32: switch to generic extable.h
      arm64: switch to RAW_COPY_USER
      avr32: switch to RAW_COPY_USER
      blackfin: switch to generic extable.h
      bfin: switch to RAW_COPY_USER
      c6x: remove duplicate definition of __access_ok
      c6x: switch to RAW_COPY_USER
      cris: switch to generic extable.h
      cris: don't rely upon __copy_user_zeroing() zeroing the tail
      cris: get rid of zeroing in __asm_copy_from_user_N for N > 4
      cris: get rid of zeroing
      cris: rename __copy_user_zeroing to __copy_user_in
      cris: switch to RAW_COPY_USER
      frv: switch to use of fixup_exception()
      frv: switch to RAW_COPY_USER
      8300: switch to RAW_COPY_USER
      m32r: switch to generic extable.h
      m32r: get rid of zeroing
      m68k: switch to generic extable.h
      m68k: get rid of zeroing
      m68k: switch to RAW_COPY_USER
      metag: switch to generic extable.h
      metag: kill verify_area()
      microblaze: switch to generic extable.h
      mn10300: switch to generic extable.h
      mn10300: get rid of zeroing
      mn10300: switch to RAW_COPY_USER
      nios2: switch to generic extable.h
      nios2: switch to RAW_COPY_USER
      openrisc: switch to generic extable.h
      openrisc: switch to RAW_COPY_USER
      powerpc: switch to extable.h
      s390: switch to extable.h
      score: switch to generic extable.h
      score: it's "VERIFY_WRITE", not "VERFITY_WRITE"...
      score: switch to RAW_COPY_USER
      sh: switch to extable.h
      sh: switch to RAW_COPY_USER
      sparc32: kill __ret_efault()
      tile: switch to generic extable.h
      tile: get rid of zeroing, switch to RAW_COPY_USER
      um: switch to RAW_COPY_USER
      amd64: get rid of zeroing
      unicore32: get rid of zeroing and switch to RAW_COPY_USER
      kill __copy_from_user_nocache()
      xtensa: switch to generic extable.h
      xtensa: get rid of zeroing, use RAW_COPY_USER
      arc: switch to RAW_COPY_USER
      x86: don't wank with magical size in __copy_in_user()
      x86: switch to RAW_COPY_USER
      s390: get rid of zeroing, switch to RAW_COPY_USER
      Merge branch 'parisc-4.11-3' of git://git.kernel.org/.../deller/parisc-linux into uaccess.parisc
      parisc: switch to RAW_COPY_USER
      sparc: switch to RAW_COPY_USER
      Merge branch 'fixes' of git://git.kernel.org/.../jhogan/metag into uaccess.metag
      Merge commit 'fc69910f329d' into uaccess.mips
      mips: sanitize __access_ok()
      mips: consolidate __invoke_... wrappers
      mips: clean and reorder the forest of macros...
      mips: make copy_from_user() zero tail explicitly
      mips: get rid of tail-zeroing in primitives
      mips: switch to RAW_COPY_USER
      don't open-code kernel_setsockopt()
      alpha: fix stack smashing in old_adjtimex(2)
      esas2r: don't open-code memdup_user()
      Merge commit 'a7d2475af7aedcb9b5c6343989a8bfadbf84429b' into uaccess.powerpc
      powerpc: get rid of zeroing, switch to RAW_COPY_USER
      Merge commit 'b4fb8f66f1ae2e167d06c12d018025a8d4d3ba7e' into uaccess.ia64
      ia64: add extable.h
      ia64: get rid of 'segment' argument of __{get,put}_user_check()
      ia64: get rid of 'segment' argument of __do_{get,put}_user()
      ia64: sanitize __access_ok()
      ia64: get rid of copy_in_user()
      get rid of padding, switch to RAW_COPY_USER
      microblaze: switch to RAW_COPY_USER
      hexagon: switch to RAW_COPY_USER
      m32r: switch to RAW_COPY_USER
      Merge branches 'uaccess.alpha', 'uaccess.arc', 'uaccess.arm', 'uaccess.arm64', 'uaccess.avr32', 'uaccess.bfin', 'uaccess.c6x', 'uaccess.cris', 'uaccess.frv', 'uaccess.h8300', 'uaccess.hexagon', 'uaccess.ia64', 'uaccess.m32r', 'uaccess.m68k', 'uaccess.metag', 'uaccess.microblaze', 'uaccess.mips', 'uaccess.mn10300', 'uaccess.nios2', 'uaccess.openrisc', 'uaccess.parisc', 'uaccess.powerpc', 'uaccess.s390', 'uaccess.score', 'uaccess.sh', 'uaccess.sparc', 'uaccess.tile', 'uaccess.um', 'uaccess.unicore32', 'uaccess.x86' and 'uaccess.xtensa' into work.uaccess
      CONFIG_ARCH_HAS_RAW_COPY_USER is unconditional now
      HAVE_ARCH_HARDENED_USERCOPY is unconditional now

James Hogan (1):
      metag/usercopy: Switch to RAW_COPY_USER

Max Filippov (1):
      xtensa: fix prefetch in the raw_copy_to_user

Vineet Gupta (1):
      ARC: uaccess: enable INLINE_COPY_{TO,FROM}_USER ...

 arch/alpha/include/asm/extable.h          |  55 ++++
 arch/alpha/include/asm/futex.h            |  16 +-
 arch/alpha/include/asm/uaccess.h          | 305 +++++---------------
 arch/alpha/kernel/traps.c                 | 152 +++-------
 arch/alpha/lib/clear_user.S               |  66 ++---
 arch/alpha/lib/copy_user.S                |  82 +++---
 arch/alpha/lib/csum_partial_copy.c        |  10 +-
 arch/alpha/lib/ev6-clear_user.S           |  84 +++---
 arch/alpha/lib/ev6-copy_user.S            | 104 +++----
 arch/arc/include/asm/Kbuild               |   1 +
 arch/arc/include/asm/uaccess.h            |  25 +-
 arch/arc/mm/extable.c                     |  14 -
 arch/arm/Kconfig                          |   1 -
 arch/arm/include/asm/Kbuild               |   1 +
 arch/arm/include/asm/uaccess.h            |  87 ++----
 arch/arm/lib/uaccess_with_memcpy.c        |   4 +-
 arch/arm64/Kconfig                        |   1 -
 arch/arm64/include/asm/extable.h          |  25 ++
 arch/arm64/include/asm/uaccess.h          |  83 +-----
 arch/arm64/kernel/arm64ksyms.c            |   2 +-
 arch/arm64/lib/copy_in_user.S             |   4 +-
 arch/avr32/include/asm/Kbuild             |   1 +
 arch/avr32/include/asm/uaccess.h          |  39 +--
 arch/avr32/kernel/avr32_ksyms.c           |   2 -
 arch/avr32/lib/copy_user.S                |  15 -
 arch/blackfin/include/asm/Kbuild          |   1 +
 arch/blackfin/include/asm/uaccess.h       |  47 +---
 arch/blackfin/kernel/process.c            |   2 +-
 arch/c6x/include/asm/Kbuild               |   1 +
 arch/c6x/include/asm/uaccess.h            |  19 +-
 arch/c6x/kernel/sys_c6x.c                 |   2 +-
 arch/cris/arch-v10/lib/usercopy.c         |  31 +--
 arch/cris/arch-v32/lib/usercopy.c         |  30 +-
 arch/cris/include/arch-v10/arch/uaccess.h |  46 ++-
 arch/cris/include/arch-v32/arch/uaccess.h |  54 ++--
 arch/cris/include/asm/Kbuild              |   1 +
 arch/cris/include/asm/uaccess.h           |  77 +----
 arch/frv/include/asm/Kbuild               |   1 +
 arch/frv/include/asm/uaccess.h            |  84 ++----
 arch/frv/kernel/traps.c                   |   7 +-
 arch/frv/mm/extable.c                     |  27 +-
 arch/frv/mm/fault.c                       |   6 +-
 arch/h8300/include/asm/Kbuild             |   2 +-
 arch/h8300/include/asm/uaccess.h          |  54 ++++
 arch/hexagon/include/asm/Kbuild           |   1 +
 arch/hexagon/include/asm/uaccess.h        |  26 +-
 arch/hexagon/kernel/hexagon_ksyms.c       |   4 +-
 arch/hexagon/mm/copy_from_user.S          |   2 +-
 arch/hexagon/mm/copy_to_user.S            |   2 +-
 arch/ia64/Kconfig                         |   1 -
 arch/ia64/include/asm/extable.h           |  11 +
 arch/ia64/include/asm/uaccess.h           | 102 ++-----
 arch/ia64/lib/memcpy_mck.S                |  13 +-
 arch/ia64/mm/extable.c                    |   5 +-
 arch/m32r/include/asm/Kbuild              |   1 +
 arch/m32r/include/asm/uaccess.h           | 189 +------------
 arch/m32r/kernel/m32r_ksyms.c             |   2 -
 arch/m32r/lib/usercopy.c                  |  21 --
 arch/m68k/include/asm/Kbuild              |   1 +
 arch/m68k/include/asm/processor.h         |  10 -
 arch/m68k/include/asm/uaccess.h           |   1 +
 arch/m68k/include/asm/uaccess_mm.h        | 103 ++++---
 arch/m68k/include/asm/uaccess_no.h        |  43 +--
 arch/m68k/kernel/signal.c                 |   2 +-
 arch/m68k/kernel/traps.c                  |   9 +-
 arch/m68k/lib/uaccess.c                   |  12 +-
 arch/m68k/mm/fault.c                      |   2 +-
 arch/metag/include/asm/Kbuild             |   1 +
 arch/metag/include/asm/uaccess.h          |  60 +---
 arch/metag/lib/usercopy.c                 |   6 +-
 arch/microblaze/include/asm/Kbuild        |   1 +
 arch/microblaze/include/asm/uaccess.h     |  64 +----
 arch/mips/Kconfig                         |   1 -
 arch/mips/cavium-octeon/octeon-memcpy.S   |  31 +--
 arch/mips/include/asm/checksum.h          |   4 +-
 arch/mips/include/asm/r4kcache.h          |   4 +-
 arch/mips/include/asm/uaccess.h           | 449 ++++--------------------------
 arch/mips/kernel/mips-r2-to-r6-emul.c     |  24 +-
 arch/mips/kernel/syscall.c                |   2 +-
 arch/mips/kernel/unaligned.c              |  10 +-
 arch/mips/lib/memcpy.S                    |  49 ----
 arch/mips/oprofile/backtrace.c            |   2 +-
 arch/mn10300/include/asm/Kbuild           |   1 +
 arch/mn10300/include/asm/uaccess.h        | 187 +------------
 arch/mn10300/kernel/mn10300_ksyms.c       |   2 -
 arch/mn10300/lib/usercopy.c               |  18 --
 arch/nios2/include/asm/Kbuild             |   1 +
 arch/nios2/include/asm/uaccess.h          |  55 +---
 arch/nios2/mm/uaccess.c                   |  16 +-
 arch/openrisc/include/asm/Kbuild          |   1 +
 arch/openrisc/include/asm/uaccess.h       |  53 +---
 arch/parisc/Kconfig                       |   1 -
 arch/parisc/include/asm/futex.h           |   2 +-
 arch/parisc/include/asm/uaccess.h         |  69 +----
 arch/parisc/lib/memcpy.c                  |  16 +-
 arch/powerpc/Kconfig                      |   1 -
 arch/powerpc/include/asm/extable.h        |  29 ++
 arch/powerpc/include/asm/uaccess.h        |  96 +------
 arch/powerpc/lib/Makefile                 |   2 +-
 arch/powerpc/lib/copy_32.S                |  14 -
 arch/powerpc/lib/copyuser_64.S            |  35 +--
 arch/powerpc/lib/usercopy_64.c            |  41 ---
 arch/s390/Kconfig                         |   1 -
 arch/s390/include/asm/extable.h           |  28 ++
 arch/s390/include/asm/uaccess.h           | 153 +---------
 arch/s390/lib/uaccess.c                   |  68 ++---
 arch/score/include/asm/Kbuild             |   1 +
 arch/score/include/asm/extable.h          |  11 -
 arch/score/include/asm/uaccess.h          |  59 +---
 arch/sh/include/asm/extable.h             |  10 +
 arch/sh/include/asm/uaccess.h             |  64 +----
 arch/sparc/Kconfig                        |   1 -
 arch/sparc/include/asm/uaccess.h          |   2 +-
 arch/sparc/include/asm/uaccess_32.h       |  44 +--
 arch/sparc/include/asm/uaccess_64.h       |  44 +--
 arch/sparc/kernel/head_32.S               |   7 -
 arch/sparc/lib/GENcopy_from_user.S        |   2 +-
 arch/sparc/lib/GENcopy_to_user.S          |   2 +-
 arch/sparc/lib/GENpatch.S                 |   4 +-
 arch/sparc/lib/NG2copy_from_user.S        |   2 +-
 arch/sparc/lib/NG2copy_to_user.S          |   2 +-
 arch/sparc/lib/NG2patch.S                 |   4 +-
 arch/sparc/lib/NG4copy_from_user.S        |   2 +-
 arch/sparc/lib/NG4copy_to_user.S          |   2 +-
 arch/sparc/lib/NG4patch.S                 |   4 +-
 arch/sparc/lib/NGcopy_from_user.S         |   2 +-
 arch/sparc/lib/NGcopy_to_user.S           |   2 +-
 arch/sparc/lib/NGpatch.S                  |   4 +-
 arch/sparc/lib/U1copy_from_user.S         |   4 +-
 arch/sparc/lib/U1copy_to_user.S           |   4 +-
 arch/sparc/lib/U3copy_to_user.S           |   2 +-
 arch/sparc/lib/U3patch.S                  |   4 +-
 arch/sparc/lib/copy_in_user.S             |   6 +-
 arch/sparc/lib/copy_user.S                |  16 +-
 arch/tile/include/asm/Kbuild              |   1 +
 arch/tile/include/asm/uaccess.h           | 166 +----------
 arch/tile/lib/exports.c                   |   7 +-
 arch/tile/lib/memcpy_32.S                 |  41 +--
 arch/tile/lib/memcpy_user_64.c            |  15 +-
 arch/um/include/asm/Kbuild                |   1 +
 arch/um/include/asm/uaccess.h             |  13 +-
 arch/um/kernel/skas/uaccess.c             |  18 +-
 arch/unicore32/include/asm/Kbuild         |   1 +
 arch/unicore32/include/asm/uaccess.h      |  15 +-
 arch/unicore32/kernel/ksyms.c             |   4 +-
 arch/unicore32/kernel/process.c           |   2 +-
 arch/unicore32/lib/copy_from_user.S       |  16 +-
 arch/unicore32/lib/copy_to_user.S         |   6 +-
 arch/x86/Kconfig                          |   1 -
 arch/x86/include/asm/uaccess.h            |  70 +----
 arch/x86/include/asm/uaccess_32.h         | 127 +--------
 arch/x86/include/asm/uaccess_64.h         | 128 +--------
 arch/x86/lib/usercopy.c                   |  54 +---
 arch/x86/lib/usercopy_32.c                | 288 +------------------
 arch/x86/lib/usercopy_64.c                |  13 -
 arch/xtensa/include/asm/Kbuild            |   1 +
 arch/xtensa/include/asm/asm-uaccess.h     |   3 -
 arch/xtensa/include/asm/uaccess.h         |  67 +----
 arch/xtensa/lib/usercopy.S                | 116 ++++----
 block/bsg.c                               |   2 +-
 drivers/scsi/esas2r/esas2r_ioctl.c        |  25 +-
 drivers/scsi/sg.c                         |   2 +-
 fs/ocfs2/cluster/tcp.c                    |  25 +-
 include/asm-generic/extable.h             |  26 ++
 include/asm-generic/uaccess.h             | 135 +--------
 include/linux/uaccess.h                   | 197 ++++++++++++-
 include/rdma/ib.h                         |   2 +-
 kernel/trace/bpf_trace.c                  |   2 +-
 lib/Makefile                              |   2 +-
 lib/iov_iter.c                            |   6 +-
 lib/usercopy.c                            |  26 ++
 mm/memory.c                               |   2 +-
 net/rds/tcp.c                             |   5 +-
 net/rds/tcp_send.c                        |   8 +-
 security/Kconfig                          |   9 -
 security/tomoyo/network.c                 |   2 +-
 176 files changed, 1458 insertions(+), 4335 deletions(-)

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [git pull] uaccess-related bits of vfs.git
  2017-05-01  3:45 ` [git pull] uaccess-related bits of vfs.git Al Viro
@ 2017-05-13  1:00   ` Linus Torvalds
  2017-05-13  6:57     ` Al Viro
  0 siblings, 1 reply; 27+ messages in thread
From: Linus Torvalds @ 2017-05-13  1:00 UTC (permalink / raw)
  To: Al Viro; +Cc: Linux Kernel Mailing List, linux-arch

[-- Attachment #1: Type: text/plain, Size: 5157 bytes --]

So I should have asked earlier, but I was feeling rather busy during
the early merge window..

On Sun, Apr 30, 2017 at 8:45 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>         uaccess unification pile.  It's _not_ the end of uaccess work, but
> the next batch of that will go into the next cycle.  This one mostly takes
> copy_from_user() and friends out of arch/* and gets the zero-padding behaviour
> in sync for all architectures.
>         Dealing with the nocache/writethrough mess is for the next cycle;
> fortunately, that's x86-only.

So I'm actually more interested to hear if you have any pending work
on cleaning up the __get/put_user() mess we have. Is that on your
radar at all?

In particular, right now, both __get_user() and __put_user() are nasty
and broken interfaces.

It *used* to be that they were designed to just generate a single
instruction. That's not what they currently do at all, due to the
whole SMAP/PAN on x86 and arm.

For example, on x86, right now a single __put_user() call ends up
generating something like

    #APP
    # 58 "./arch/x86/include/asm/smap.h" 1
            661:

    662:
    .skip -(((6651f-6641f)-(662b-661b)) > 0) * ((6651f-6641f)-(662b-661b)),0x90
    663:
    .pushsection .altinstructions,"a"
     .long 661b - .
     .long 6641f - .
     .word ( 9*32+20)
     .byte 663b-661b
     .byte 6651f-6641f
     .byte 663b-662b
    .popsection
    .pushsection .altinstr_replacement, "ax"
    6641:
            .byte 0x0f,0x01,0xcb
    6651:
            .popsection
    # 0 "" 2
    #NO_APP
            xorl        %eax, %eax      # __pu_err
    #APP
    # 269 "fs/readdir.c" 1

    1:  movq %rcx,8(%rdi)       # offset, MEM[(struct __large_struct *)_16]
    2:
    .section .fixup,"ax"
    3:  mov $-14,%eax   #, __pu_err
            jmp 2b
    .previous
     .pushsection "__ex_table","a"
     .balign 4
     .long (1b) - .
     .long (3b) - .
     .long (ex_handler_default) - .
     .popsection

    # 0 "" 2
    # 52 "./arch/x86/include/asm/smap.h" 1
            661:

    662:
    .skip -(((6651f-6641f)-(662b-661b)) > 0) * ((6651f-6641f)-(662b-661b)),0x90
    663:
    .pushsection .altinstructions,"a"
     .long 661b - .
     .long 6641f - .
     .word ( 9*32+20)
     .byte 663b-661b
     .byte 6651f-6641f
     .byte 663b-662b
    .popsection
    .pushsection .altinstr_replacement, "ax"
    6641:
            .byte 0x0f,0x01,0xca
    6651:
            .popsection
    # 0 "" 2
    #NO_APP

and while much of that is out-of-line and in different sections, it
basically means that it's insane how we inline these things.

Yes, yes, the inlined part of the above horror-story ends up being just

        clac
        xor    %eax,%eax
        mov    %rcx,0x8(%rdi)
        stac

and everything else is just boiler-plate for the alt-instruction
handling and the exception handling.

But even that small thing is rather debatable from a performance
angle: the actual cost of __put_user() these days is not the function
call, but the clac/stac (on modern x86) and the PAN bit games (on
arm).

So I'm actually inclined to just say "We should make
__get_user/__put_user() just be aliases for the normal
get_user/put_user() model".

The *correct* way to do fast user space accesses when you hoist error
checking outside is to use

        if (!access_ok(..))
                goto efault;
        user_access_begin();
        ..
        ... loop over or otherwise do multiple "raw" accessess ...
        unsafe_get/put_user(c, ptr, got_fault);
        unsafe_get/put_user(c, ptr, got_fault);
        ...
        user_access_end();
        return 0;

got_fault:
        user_access_end();
efault:
        return -EFAULT;

which is more boilerplate, but ends up generating much better code.
And for "unsafe_put_user()" in particular, we actually could teach gcc
to use "asm goto" to really improve code generation. I have patches
for that if you want to look.

I'm attaching an example patch for "filldir()" that does that modern
thing. It almost had the right form as-is anyway, and under some loads
(eg "find") filldir actually shows up in profiles too.\

And just from a code generation standpoint, look at what the attached
patch does:

[torvalds@i7 linux]$ diff -u fs/readdir.s ~/readdir.s | diffstat
 readdir.s |  548 ++++++++++++++++++++++----------------------------------------
 1 file changed, 201 insertions(+), 347 deletions(-)

just from getting rid of that crazy repeated SMAP overhead.

Yeah, yeah, doing diffstat on generated assembly files is all kinds of
crazy, but it's an example of what kind of odd garbage we currently
generate.

But the *first* thing I'd like to do would be to just get rid of
__get_user/__put_user as a thing. It really does generate nasty code,
and we might as well just make it do that function call.

Because what we do now isn't right. If we care about performance, the
"__xyz_user" versions are wrong (use unsafe_xyz_user() instead). And
if you don't care about performance, you should use the regular
xyz_user() functions that do an ok job by just calling the right-sized
helper function instead of inlining the crud.

Hmm?

                          Linus

[-- Attachment #2: patch.diff --]
[-- Type: text/plain, Size: 1378 bytes --]

 fs/readdir.c | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/fs/readdir.c b/fs/readdir.c
index 9d0212c374d6..03324f54c0e9 100644
--- a/fs/readdir.c
+++ b/fs/readdir.c
@@ -184,25 +184,27 @@ static int filldir(struct dir_context *ctx, const char *name, int namlen,
 	if (dirent) {
 		if (signal_pending(current))
 			return -EINTR;
-		if (__put_user(offset, &dirent->d_off))
-			goto efault;
 	}
+
+	user_access_begin();
+	if (dirent)
+		unsafe_put_user(offset, &dirent->d_off, efault_end);
 	dirent = buf->current_dir;
-	if (__put_user(d_ino, &dirent->d_ino))
-		goto efault;
-	if (__put_user(reclen, &dirent->d_reclen))
-		goto efault;
+	unsafe_put_user(d_ino, &dirent->d_ino, efault_end);
+	unsafe_put_user(reclen, &dirent->d_reclen, efault_end);
+	unsafe_put_user(0, dirent->d_name + namlen, efault_end);
+	unsafe_put_user(d_type, (char __user *) dirent + reclen - 1, efault_end);
+	user_access_end();
+
 	if (copy_to_user(dirent->d_name, name, namlen))
 		goto efault;
-	if (__put_user(0, dirent->d_name + namlen))
-		goto efault;
-	if (__put_user(d_type, (char __user *) dirent + reclen - 1))
-		goto efault;
 	buf->previous = dirent;
 	dirent = (void __user *)dirent + reclen;
 	buf->current_dir = dirent;
 	buf->count -= reclen;
 	return 0;
+efault_end:
+	user_access_end();
 efault:
 	buf->error = -EFAULT;
 	return -EFAULT;

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [git pull] uaccess-related bits of vfs.git
  2017-05-13  1:00   ` Linus Torvalds
@ 2017-05-13  6:57     ` Al Viro
  2017-05-13 12:05       ` Adam Borowski
  2017-05-13 16:15       ` Linus Torvalds
  0 siblings, 2 replies; 27+ messages in thread
From: Al Viro @ 2017-05-13  6:57 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Linux Kernel Mailing List, linux-arch

On Fri, May 12, 2017 at 06:00:44PM -0700, Linus Torvalds wrote:
> So I should have asked earlier, but I was feeling rather busy during
> the early merge window..

> So I'm actually more interested to hear if you have any pending work
> on cleaning up the __get/put_user() mess we have. Is that on your
> radar at all?

Yes, it is.

> In particular, right now, both __get_user() and __put_user() are nasty
> and broken interfaces.
>
> It *used* to be that they were designed to just generate a single
> instruction. That's not what they currently do at all, due to the
> whole SMAP/PAN on x86 and arm.
> 
> For example, on x86, right now a single __put_user() call ends up
> generating something like

[snip]
 
> But even that small thing is rather debatable from a performance
> angle: the actual cost of __put_user() these days is not the function
> call, but the clac/stac (on modern x86) and the PAN bit games (on
> arm).
> 
> So I'm actually inclined to just say "We should make
> __get_user/__put_user() just be aliases for the normal
> get_user/put_user() model".

> which is more boilerplate, but ends up generating much better code.
> And for "unsafe_put_user()" in particular, we actually could teach gcc
> to use "asm goto" to really improve code generation. I have patches
> for that if you want to look.
> 
> I'm attaching an example patch for "filldir()" that does that modern
> thing. It almost had the right form as-is anyway, and under some loads
> (eg "find") filldir actually shows up in profiles too.\

> But the *first* thing I'd like to do would be to just get rid of
> __get_user/__put_user as a thing. It really does generate nasty code,
> and we might as well just make it do that function call.
> 
> Because what we do now isn't right. If we care about performance, the
> "__xyz_user" versions are wrong (use unsafe_xyz_user() instead). And
> if you don't care about performance, you should use the regular
> xyz_user() functions that do an ok job by just calling the right-sized
> helper function instead of inlining the crud.
> 
> Hmm?

First, some stats: there's a thousand-odd callers of __get_user().  Out of
those, about 70% are in arch/, mostly in sigframe-related code.  Take
a look at the output of
git grep -n -w __get_user|grep -v '^arch' | sed -e 's/:.*//'|uniq -c|sort -nr
     55 drivers/gpu/drm/drm_ioc32.c
     43 drivers/staging/comedi/comedi_compat32.c
     35 kernel/compat.c
     27 net/compat.c
     27 block/compat_ioctl.c
     15 drivers/usb/core/devio.c
     13 drivers/char/ipmi/ipmi_devintf.c
     11 kernel/signal.c
     10 fs/fcntl.c
      8 ipc/compat.c
      8 drivers/video/fbdev/sbuslib.c
      7 net/socket.c
      7 drivers/gpu/drm/mga/mga_ioc32.c
      6 fs/select.c
      6 drivers/tty/vt/vt_ioctl.c
      5 drivers/tty/vt/selection.c
      5 drivers/spi/spidev.c
      5 drivers/pci/proc.c
      4 kernel/ptrace.c
      4 ipc/compat_mq.c
      4 drivers/tty/vt/consolemap.c
      3 sound/oss/sys_timer.c
      3 drivers/media/usb/uvc/uvc_v4l2.c
      3 drivers/macintosh/ans-lcd.c
and then we have a smallish bunch of files with one or two callers.  For
__put_user() we have ~1800 callers total, ~1100 of them in arch/* and
the rest goes like this:
     73 drivers/gpu/drm/drm_ioc32.c
     66 ipc/compat.c
     58 drivers/gpu/drm/radeon/radeon_ioc32.c
     55 block/compat_ioctl.c
     52 kernel/compat.c
     49 kernel/signal.c
     43 drivers/staging/comedi/comedi_compat32.c
     31 drivers/gpu/drm/r128/r128_ioc32.c
     30 drivers/gpu/drm/mga/mga_ioc32.c
     27 fs/signalfd.c
     25 fs/readdir.c
     24 fs/statfs.c
     19 kernel/sys.c
     17 net/compat.c
     11 drivers/scsi/sg.c
     10 fs/fcntl.c
      8 sound/core/pcm_native.c
      8 fs/binfmt_flat.c
      7 fs/binfmt_elf_fdpic.c
      6 net/socket.c
      6 drivers/char/ipmi/ipmi_devintf.c
      5 sound/oss/sys_timer.c
      5 fs/binfmt_elf.c
      5 drivers/video/fbdev/sbuslib.c
      5 drivers/tty/vt/consolemap.c
      5 drivers/spi/spidev.c
      5 drivers/pci/proc.c
      4 kernel/ptrace.c
      4 ipc/compat_mq.c
      3 fs/select.c
      3 drivers/tty/vt/vt.c
      ...

IOW, we have
	* most of users in arch/* (heavily dominated by signal-related code,
both loads and stores).  Those need careful massage; maybe unsafe-based
solution, maybe something else, but it's obviously per-architecture work
and these paths are sensitive.
	* few places outside of arch/* where these are abused; absolute
majority are in ioctl compat code and they are _bad_.  Bad on x86, bad on
s390, etc.  I'm not sure if switch to unsafe is the right solution for
those, actually - depends on how we end up dealing with compat ioctls of
that sort.  Might be better to do bulk copy to/from userland, combined with
conversion from 32bit to native kernel-side and passing pointers to kernel
objects to functions doing actual work.  Really depends upon the details.
	* some places in fairly common codepaths where __get_user() and
__put_user() are being played with.  And I certainly agree that they are
not good.

But IMO the first step is not to convert __get_user()/__put_user() to
aliases of get_user()/put_user() - it's to get rid of their infestations
outside of arch/*.  They are concentrated in relatively few places, so
we can deal with those individually and then just fucking ban their use
outside of arch/*.  That's easy enough to watch for - simple git grep will do,
so anything creeping back will be immediately spotted.  In -next, with
subsequent explanation of the reasons Not To Do That Or Else(tm) to the
people responsible.

As for fs/readdir.c...  4 back-to-back stores like
        if (__put_user(ino, &dirent->d_ino))
                goto efault;
        if (__put_user(0, &dirent->d_off))
                goto efault;
        if (__put_user(reclen, &dirent->d_reclen))
                goto efault;
        if (__put_user(d_type, &dirent->d_type))
                goto efault;
might be asking for copy_to_user().  Maybe that one is too small for that
to be a win (on s390 it almost certainly would be a win, judging by what
Martin and Heiko are saying); fs/statfs.c ones almost certainly are better
off with 'convert, then copy_to_user()' approach.

Another couple of places worth looking into are copy_siginfo_to_user() and
signalfd_copyinfo().  Maybe unsafe is the best approach there as well,
maybe not, but it's in the 'large enough for copy_to_user() be interesting,
not too large for auto variable' range.

I certainly want that shit gone from the common paths, no arguments here,
but I want to avoid having to crawl through every architecture's sigframe
handling first.  Fortunately, we have that crap outside of arch/* concentrated
in a few places and they are far enough from each other to be dealt with
independently.  There are interesting interplays with set_fs() stuff, BTW...

There will be more than enough crap that _will_ require crawls through
arch/* ;-/  I'm plotting that pile right now (basically, what pieces should
go into never-rebased stem, so that per-architecture branches had what they
need).  get_user()/put_user() are part of it, it's just that I don't want
to bring arch/*/kernel/signal*.c into the mix ;-/

Re asm-goto - which architectures would be able to use that?  IOW, which
gcc versions are stable enough in that area?  I'd obviously like to see
your patches...

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [git pull] uaccess-related bits of vfs.git
  2017-05-13  6:57     ` Al Viro
@ 2017-05-13 12:05       ` Adam Borowski
  2017-05-13 13:46         ` Brian Gerst
  2017-05-13 16:46         ` Al Viro
  2017-05-13 16:15       ` Linus Torvalds
  1 sibling, 2 replies; 27+ messages in thread
From: Adam Borowski @ 2017-05-13 12:05 UTC (permalink / raw)
  To: Al Viro; +Cc: Linus Torvalds, Linux Kernel Mailing List, linux-arch

On Sat, May 13, 2017 at 07:57:45AM +0100, Al Viro wrote:
> On Fri, May 12, 2017 at 06:00:44PM -0700, Linus Torvalds wrote:
> > But the *first* thing I'd like to do would be to just get rid of
> > __get_user/__put_user as a thing. It really does generate nasty code,
> > and we might as well just make it do that function call.
> 
> But IMO the first step is not to convert __get_user()/__put_user() to
> aliases of get_user()/put_user() - it's to get rid of their infestations
> outside of arch/*.  They are concentrated in relatively few places, so
> we can deal with those individually and then just fucking ban their use
> outside of arch/*.  That's easy enough to watch for - simple git grep will do,
> so anything creeping back will be immediately spotted.

As someone from the peanuts gallery, I took a look for __put_user() in my
usual haunt, drivers/tty/vt/

* use 1: con_[gs]et_trans_*():
  Copies a linear array of 256 bytes/shorts, one by one.
  The obvious patch has 9 insertions(+), 22 deletions(-).

* use 2: con_[gs]et_unimap():
  Ditto, up to 65535*2 shorts, also in a nice linear array.

* use 3: tioclinux():
  Does a __put into a place that was checked only for read.  This got me
  frightened as it initially looked like something that can allow an user to
  write where they shouldn't.  Fortunately, it turns out the first argument
  to access_ok() is ignored on every single architecture -- why does it even
  exist then?  I imagined it's there for some odd arch that allows writes
  when in privileged mode, but unlike what the docs say, VERIFY_WRITE is
  exactly same as VERIFY_READ.

Ie, every use in this sample is wrong.  I suspect the rest of the kernel
should be similar.


Meow!
-- 
Don't be racist.  White, amber or black, all beers should be judged based
solely on their merits.  Heck, even if occasionally a cider applies for a
beer's job, why not?
On the other hand, corpo lager is not a race.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [git pull] uaccess-related bits of vfs.git
  2017-05-13 12:05       ` Adam Borowski
@ 2017-05-13 13:46         ` Brian Gerst
  2017-05-13 16:46         ` Al Viro
  1 sibling, 0 replies; 27+ messages in thread
From: Brian Gerst @ 2017-05-13 13:46 UTC (permalink / raw)
  To: Adam Borowski
  Cc: Al Viro, Linus Torvalds, Linux Kernel Mailing List, linux-arch

On Sat, May 13, 2017 at 8:05 AM, Adam Borowski <kilobyte@angband.pl> wrote:
> On Sat, May 13, 2017 at 07:57:45AM +0100, Al Viro wrote:
>> On Fri, May 12, 2017 at 06:00:44PM -0700, Linus Torvalds wrote:
>> > But the *first* thing I'd like to do would be to just get rid of
>> > __get_user/__put_user as a thing. It really does generate nasty code,
>> > and we might as well just make it do that function call.
>>
>> But IMO the first step is not to convert __get_user()/__put_user() to
>> aliases of get_user()/put_user() - it's to get rid of their infestations
>> outside of arch/*.  They are concentrated in relatively few places, so
>> we can deal with those individually and then just fucking ban their use
>> outside of arch/*.  That's easy enough to watch for - simple git grep will do,
>> so anything creeping back will be immediately spotted.
>
> As someone from the peanuts gallery, I took a look for __put_user() in my
> usual haunt, drivers/tty/vt/
>
> * use 1: con_[gs]et_trans_*():
>   Copies a linear array of 256 bytes/shorts, one by one.
>   The obvious patch has 9 insertions(+), 22 deletions(-).
>
> * use 2: con_[gs]et_unimap():
>   Ditto, up to 65535*2 shorts, also in a nice linear array.
>
> * use 3: tioclinux():
>   Does a __put into a place that was checked only for read.  This got me
>   frightened as it initially looked like something that can allow an user to
>   write where they shouldn't.  Fortunately, it turns out the first argument
>   to access_ok() is ignored on every single architecture -- why does it even
>   exist then?  I imagined it's there for some odd arch that allows writes
>   when in privileged mode, but unlike what the docs say, VERIFY_WRITE is
>   exactly same as VERIFY_READ.

The original i386 (no longer supported) ignored the write-protect bit
when in supervisor mode.

--
Brian Gerst

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [git pull] uaccess-related bits of vfs.git
  2017-05-13  6:57     ` Al Viro
  2017-05-13 12:05       ` Adam Borowski
@ 2017-05-13 16:15       ` Linus Torvalds
  2017-05-13 16:17         ` Linus Torvalds
                           ` (2 more replies)
  1 sibling, 3 replies; 27+ messages in thread
From: Linus Torvalds @ 2017-05-13 16:15 UTC (permalink / raw)
  To: Al Viro; +Cc: Linux Kernel Mailing List, linux-arch

On Fri, May 12, 2017 at 11:57 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> First, some stats: there's a thousand-odd callers of __get_user().  Out of
> those, about 70% are in arch/, mostly in sigframe-related code.

Sure. And they can be trivially converted, and none of them should care at all.

> IOW, we have
>         * most of users in arch/* (heavily dominated by signal-related code,
> both loads and stores).  Those need careful massage; maybe unsafe-based
> solution, maybe something else, but it's obviously per-architecture work
> and these paths are sensitive.

Why are they sensitive?

Why not just do this:

  git grep -l '\<__\(\(get\)\|\(put\)\)_user(' -- arch/x86
:^arch/x86/include/asm/uaccess.h
        | xargs sed -i 's/__\(\(\(get\)\|\(put\)\)_user(\)/\1/g'

which converts all the x86 uses in one go.

Anybody who *relies* on not checking the address_limit is so broken as
to be not even funny. And anything that is so performance-sensitive
that anybody can even measure the effect of the above we can convert
later.

Sure, do it in pieces (eg each architecture separately, then
"drivers", then "networking", then whatever, until all done), just so
that *if* something actually depends on semantics (and that really
shouldn't be the case), we have at least some information from a
bisect.

But I don't see the excuse for not just doing it. If nobody notices,
it's an obvious improvement. And if somebody *does* notice, we know
how to do it properly with unsafe_xyz_user(), because "__xyz_user()"
most definitely isn't it.

An example of something that *should* be converted is
"csum_partial_copy_from_user()", but it really does need to use
"user_access_begin()" and friends, because right now it's using
stac/clac for each 16-bit word. That's *expensive*, and it's expensive
whether you use __get_user() or get_user().

Adding x86 people just to see how they react to the above "patch".

For me, in my fairly minimal personal config, the above cleanup patch
shrinks the text size of the resulting kernel by 1.7kB. Yes, most of
it is the out-of-line code, but still..

Interestingly, the signal handling code didn't change at all. It looks
like only the 32-bit code uses __put_user() for the frame setup. The
regular code uses put_user_try/put_user_catch, which is the
x86-specific early try at the unsafe versions, but it would actually
be improved by using "unsafe_put_user()" and my patch to make that use
"asm goto".

                           Linus

PS. That "patch" depends on modern git - with older versions of git
you need to do the path negation with ":!pattern", and then you need
to quote it too since '!' is special for shell.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [git pull] uaccess-related bits of vfs.git
  2017-05-13 16:15       ` Linus Torvalds
@ 2017-05-13 16:17         ` Linus Torvalds
  2017-05-13 17:00         ` Al Viro
  2017-05-14 18:13         ` Ingo Molnar
  2 siblings, 0 replies; 27+ messages in thread
From: Linus Torvalds @ 2017-05-13 16:17 UTC (permalink / raw)
  To: Al Viro, the arch/x86 maintainers; +Cc: Linux Kernel Mailing List, linux-arch

Oops.

*Really* adding the x86 guys now.

Blush.

            Linus

On Sat, May 13, 2017 at 9:15 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Fri, May 12, 2017 at 11:57 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>>
>> First, some stats: there's a thousand-odd callers of __get_user().  Out of
>> those, about 70% are in arch/, mostly in sigframe-related code.
>
> Sure. And they can be trivially converted, and none of them should care at all.
>
>> IOW, we have
>>         * most of users in arch/* (heavily dominated by signal-related code,
>> both loads and stores).  Those need careful massage; maybe unsafe-based
>> solution, maybe something else, but it's obviously per-architecture work
>> and these paths are sensitive.
>
> Why are they sensitive?
>
> Why not just do this:
>
>   git grep -l '\<__\(\(get\)\|\(put\)\)_user(' -- arch/x86
> :^arch/x86/include/asm/uaccess.h
>         | xargs sed -i 's/__\(\(\(get\)\|\(put\)\)_user(\)/\1/g'
>
> which converts all the x86 uses in one go.
>
> Anybody who *relies* on not checking the address_limit is so broken as
> to be not even funny. And anything that is so performance-sensitive
> that anybody can even measure the effect of the above we can convert
> later.
>
> Sure, do it in pieces (eg each architecture separately, then
> "drivers", then "networking", then whatever, until all done), just so
> that *if* something actually depends on semantics (and that really
> shouldn't be the case), we have at least some information from a
> bisect.
>
> But I don't see the excuse for not just doing it. If nobody notices,
> it's an obvious improvement. And if somebody *does* notice, we know
> how to do it properly with unsafe_xyz_user(), because "__xyz_user()"
> most definitely isn't it.
>
> An example of something that *should* be converted is
> "csum_partial_copy_from_user()", but it really does need to use
> "user_access_begin()" and friends, because right now it's using
> stac/clac for each 16-bit word. That's *expensive*, and it's expensive
> whether you use __get_user() or get_user().
>
> Adding x86 people just to see how they react to the above "patch".
>
> For me, in my fairly minimal personal config, the above cleanup patch
> shrinks the text size of the resulting kernel by 1.7kB. Yes, most of
> it is the out-of-line code, but still..
>
> Interestingly, the signal handling code didn't change at all. It looks
> like only the 32-bit code uses __put_user() for the frame setup. The
> regular code uses put_user_try/put_user_catch, which is the
> x86-specific early try at the unsafe versions, but it would actually
> be improved by using "unsafe_put_user()" and my patch to make that use
> "asm goto".
>
>                            Linus
>
> PS. That "patch" depends on modern git - with older versions of git
> you need to do the path negation with ":!pattern", and then you need
> to quote it too since '!' is special for shell.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [git pull] uaccess-related bits of vfs.git
  2017-05-13 12:05       ` Adam Borowski
  2017-05-13 13:46         ` Brian Gerst
@ 2017-05-13 16:46         ` Al Viro
  1 sibling, 0 replies; 27+ messages in thread
From: Al Viro @ 2017-05-13 16:46 UTC (permalink / raw)
  To: Adam Borowski; +Cc: Linus Torvalds, Linux Kernel Mailing List, linux-arch

On Sat, May 13, 2017 at 02:05:27PM +0200, Adam Borowski wrote:

> As someone from the peanuts gallery, I took a look for __put_user() in my
> usual haunt, drivers/tty/vt/
> 
> * use 1: con_[gs]et_trans_*():
>   Copies a linear array of 256 bytes/shorts, one by one.
>   The obvious patch has 9 insertions(+), 22 deletions(-).
> 
> * use 2: con_[gs]et_unimap():
>   Ditto, up to 65535*2 shorts, also in a nice linear array.
> 
> * use 3: tioclinux():
>   Does a __put into a place that was checked only for read.  This got me
>   frightened as it initially looked like something that can allow an user to
>   write where they shouldn't.  Fortunately, it turns out the first argument
>   to access_ok() is ignored on every single architecture -- why does it even
>   exist then?  I imagined it's there for some odd arch that allows writes
>   when in privileged mode, but unlike what the docs say, VERIFY_WRITE is
>   exactly same as VERIFY_READ.

It's a remnant of old kludge that never properly worked in the first place.
access_ok() should have been called userland_range() - that's all it
checks and that's all it *can* check.

As it is, each of those __get_user() can bloody well yield -EFAULT.  Despite
having "passed" access_ok().  Again, the only thing access_ok() checks is
that (on architecture with shared address space for userland and kernel)
the addresses given are on the userland side.  That's _it_ - they can
be unmapped, mmapped to broken floppy, whatever; you'll find out when
you try to actually copy bytes from from it.

What the kludge used to attempt was "let's check that we are not trying
to copy into read-only mmapped area - 80386 MMU is fucked in head and won't
fault on such stores in ring 0".  It had always been racy.  Look:
thread A: going to copy something to user-supplied address.  Do access_ok().
thread A: looks like it's mapped writable, let's go ahead and copy
thread A: do something blocking before actually doing __put_user() or
__copy_to_user() or whatever it's going to be.
thread B: munmap(), mmap() something read-only here.
thread A: get to actual __put_user()/__copy_to_user().
80386: hey, it's ring 0, no need to look at the write protect bit in page
tables.  What can go wrong, anyway?

You can't move any non-static checks to access_ok().  On any architecture.
Anything that could change between access_ok() and actual copying can't
be checked in access_ok().  As it is, access_ok() has actively misleading
calling conventions:
	* the name implies that having passed access_ok() you don't have
to worry about EFAULT
	* 'direction' argument of that thing reinforces that impression
*and* has to be produced by the caller.  Most simply pass a constant,
which immediately gets dropped (as an aside, take a look at 4b4554f6d - it's
amusing), but in some cases it's calculated elsewhere and carefully passed
through several levels of call chain.  Only to be discarded by access_ok(),
of course...

> Ie, every use in this sample is wrong.  I suspect the rest of the kernel
> should be similar.

Looking through vt...

* con_set_trans_old(): copy_from_user() + loop for doing or with
  UNI_DIRECT_BASE.  Almost certainly will be faster that way - on *any*
  architecture.
* con_get_trans_old(): copy_to_user() would be an obvious optimization.
* con_set_trans_new(): copy_from_user().
* con_get_trans_new(): copy_to_user().
* con_set_unimap(): memdup_user() instead of the entire kmalloc_array +
loop copying the sucker member by member.  With ushort ct you don't need
overflow-related logics of kmalloc_array() anyway...
* con_get_unimap(): copy_to_user() + put_user() (for uct)
* set_selection(): just copy_from_user() into local struct tiocl_selection.
* tioclinux(): use put_user().  Yes, you will repeat the same check twice.
Once per ioctl(), that involves enough work to make that "recalculate
access_ok() once for no reason" non-issue on any architecture.
* vt_ioctl(): turn that ushort ll,cc,vlin,clin,vcol,ccol; into
  struct vt_consize size; or something like that and use copy_from_user()

And AFAICS you can lose each and every access_ok() call in there.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [git pull] uaccess-related bits of vfs.git
  2017-05-13 16:15       ` Linus Torvalds
  2017-05-13 16:17         ` Linus Torvalds
@ 2017-05-13 17:00         ` Al Viro
  2017-05-13 17:12           ` Al Viro
  2017-05-13 17:18           ` Linus Torvalds
  2017-05-14 18:13         ` Ingo Molnar
  2 siblings, 2 replies; 27+ messages in thread
From: Al Viro @ 2017-05-13 17:00 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Linux Kernel Mailing List, linux-arch

On Sat, May 13, 2017 at 09:15:07AM -0700, Linus Torvalds wrote:
> > IOW, we have
> >         * most of users in arch/* (heavily dominated by signal-related code,
> > both loads and stores).  Those need careful massage; maybe unsafe-based
> > solution, maybe something else, but it's obviously per-architecture work
> > and these paths are sensitive.
> 
> Why are they sensitive?

Because there we do have tons of back-to-back __get_user() (on sigreturn())
or __put_user() (on signal setup).

> Why not just do this:
> 
>   git grep -l '\<__\(\(get\)\|\(put\)\)_user(' -- arch/x86
> :^arch/x86/include/asm/uaccess.h
>         | xargs sed -i 's/__\(\(\(get\)\|\(put\)\)_user(\)/\1/g'
> 
> which converts all the x86 uses in one go.

x86 actually tends to use get_user_ex/put_user_ex instead.

> Anybody who *relies* on not checking the address_limit is so broken as
> to be not even funny.

Except for open-coded probe_kernel_read() somewhere in arch/*; I have not
read through those 700+ callers, so I don't know if there's any such.
Sure, those won't be performance-sensitive.

> And anything that is so performance-sensitive
> that anybody can even measure the effect of the above we can convert
> later.

> Sure, do it in pieces (eg each architecture separately, then
> "drivers", then "networking", then whatever, until all done), just so
> that *if* something actually depends on semantics (and that really
> shouldn't be the case), we have at least some information from a
> bisect.
>
> But I don't see the excuse for not just doing it. If nobody notices,
> it's an obvious improvement. And if somebody *does* notice, we know
> how to do it properly with unsafe_xyz_user(), because "__xyz_user()"
> most definitely isn't it.

I think we ought to actually look through those places - there are few
enough of them (outside of arch/, that is) and stac/clac overhead is
not the only problem they tend to have.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [git pull] uaccess-related bits of vfs.git
  2017-05-13 17:00         ` Al Viro
@ 2017-05-13 17:12           ` Al Viro
  2017-05-13 17:18           ` Linus Torvalds
  1 sibling, 0 replies; 27+ messages in thread
From: Al Viro @ 2017-05-13 17:12 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Linux Kernel Mailing List, linux-arch

On Sat, May 13, 2017 at 06:00:56PM +0100, Al Viro wrote:

> > But I don't see the excuse for not just doing it. If nobody notices,
> > it's an obvious improvement. And if somebody *does* notice, we know
> > how to do it properly with unsafe_xyz_user(), because "__xyz_user()"
> > most definitely isn't it.
> 
> I think we ought to actually look through those places - there are few
> enough of them (outside of arch/, that is) and stac/clac overhead is
> not the only problem they tend to have.

PS: just to make it clear - I do _not_ propose to keep that shit around
indefinitely; I want __get_user()/__put_user() gone by the end of that
work.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [git pull] uaccess-related bits of vfs.git
  2017-05-13 17:00         ` Al Viro
  2017-05-13 17:12           ` Al Viro
@ 2017-05-13 17:18           ` Linus Torvalds
  2017-05-13 18:04             ` Al Viro
  1 sibling, 1 reply; 27+ messages in thread
From: Linus Torvalds @ 2017-05-13 17:18 UTC (permalink / raw)
  To: Al Viro; +Cc: Linux Kernel Mailing List, linux-arch

On Sat, May 13, 2017 at 10:00 AM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> On Sat, May 13, 2017 at 09:15:07AM -0700, Linus Torvalds wrote:
>> > IOW, we have
>> >         * most of users in arch/* (heavily dominated by signal-related code,
>> > both loads and stores).  Those need careful massage; maybe unsafe-based
>> > solution, maybe something else, but it's obviously per-architecture work
>> > and these paths are sensitive.
>>
>> Why are they sensitive?
>
> Because there we do have tons of back-to-back __get_user() (on sigreturn())
> or __put_user() (on signal setup).

It doesn't actually matter. Regular "put_user()" doesn't generate
noticeably worse code.

It's not a normal function call, it's still an inline asm - so it
basically generates the exact same code as __xyz_user(), except it's a
call to a fixed copy of the code.

So the only difference tends to be
 (a) the extra call/ret instructions, possibly frame pointers
 (b) fixed registers
 (c) the added addr_limit checks

but (a) is cheap, and (b) isn't a big deal since with "asm volatile"
you can't re-order those things against each other anyway. So (b) ends
up being approximately the cost of "one extra lea instruction" for the
address generation that would otherwise be in the load/store
instruction.

And (c) is actually a reason *for* doing it. We've had bugs due to
people not getting it right.

So there really is basically no performance downside.  Even with
consecutive uses.  The code that uses a function call might even be
smaller (ie the 1.7kB savings isn't necessarily all in the out-of-line
exception handling cases: the stac/clac instructions are six bytes for
each use, so it more than makes up for the call instruction).

> x86 actually tends to use get_user_ex/put_user_ex instead.

Yes. Because anybody who *really* cared about performance will have
converted already. The real cost has been stac/clac for a few years
now.

So we pretty much know that none of the remaining users are really all
that critical. Certainly not "five cycles" kind of critical.

>> Anybody who *relies* on not checking the address_limit is so broken as
>> to be not even funny.
>
> Except for open-coded probe_kernel_read() somewhere in arch/*; I have not
> read through those 700+ callers, so I don't know if there's any such.

.. and those need to be fixed.

But I'm not seeing the point in wasting valuable human time in reading
through thousands of cases, when we can just automate it and see if
anything breaks.

Because it will break in a *safe* direction. You'll get an error from
bad uses that shouldn't have worked to begin with.

And some of the existing cases are just disgusting. There's no excuse
for compat code or for ioctl to use the "__" versions. That seems to
be the bulk of those uses too.

                     Linus

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [git pull] uaccess-related bits of vfs.git
  2017-05-13 17:18           ` Linus Torvalds
@ 2017-05-13 18:04             ` Al Viro
  2017-05-13 18:26               ` Al Viro
  2017-05-13 19:00               ` Linus Torvalds
  0 siblings, 2 replies; 27+ messages in thread
From: Al Viro @ 2017-05-13 18:04 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Linux Kernel Mailing List, linux-arch

On Sat, May 13, 2017 at 10:18:22AM -0700, Linus Torvalds wrote:

> > x86 actually tends to use get_user_ex/put_user_ex instead.
> 
> Yes. Because anybody who *really* cared about performance will have
> converted already. The real cost has been stac/clac for a few years
> now.

On x86.  Which has (not counting arch/x86/um, which definitely needs
a careful look - there __..._user() is the same as ..._user() and costly
as hell) all of 12 callers of __get_user() and 31 callers of __put_user().
More than a half of the latter - in cp_stat64() and I would at least try
and see if "convert on stack, then copy_to_user" is better for that one.
Other than that, all we have is:

arch/x86/entry/common.c:372:            __get_user(*(u32 *)&regs->bp,
arch/x86/ia32/ia32_signal.c:124:        if (__get_user(set.sig[0], &frame->sc.oldmask)
arch/x86/ia32/ia32_signal.c:275:        if (__put_user(sig, &frame->sig))
arch/x86/include/asm/xen/page.h:86:     return __put_user(val, (unsigned long __user *)addr);
arch/x86/include/asm/xen/page.h:91:     return __get_user(*val, (unsigned long __user *)addr);
arch/x86/kernel/fpu/signal.c:100:       err |= __get_user(xfeatures, (__u32 *)&x->header.xfeatures);
arch/x86/kernel/fpu/signal.c:115:       err |= __put_user(xfeatures, (__u32 *)&x->header.xfeatures);
arch/x86/kernel/fpu/signal.c:46:        if (__get_user(magic2, (__u32 __user *)(fpstate + fx_sw->xstate_size))
arch/x86/kernel/fpu/signal.c:66:                    __put_user(xsave->i387.swd, &fp->status) ||
arch/x86/kernel/fpu/signal.c:67:                    __put_user(X86_FXSR_MAGIC, &fp->magic))
arch/x86/kernel/fpu/signal.c:72:                if (__get_user(swd, &fp->swd) || __put_user(swd, &fp->status))
arch/x86/kernel/fpu/signal.c:72:                if (__get_user(swd, &fp->swd) || __put_user(swd, &fp->status))
arch/x86/kernel/fpu/signal.c:93:        err |= __put_user(FP_XSTATE_MAGIC2,
arch/x86/kernel/ptrace.c:1043:                  if (__put_user(word, u++))
arch/x86/kernel/ptrace.c:1070:                  ret = __get_user(word, u++);
arch/x86/kernel/ptrace.c:472:                   if (__put_user(getreg(target, pos), u++))
arch/x86/kernel/ptrace.c:499:                   ret = __get_user(word, u++);
arch/x86/kernel/signal.c:326:   if (__put_user(sig, &frame->sig))
arch/x86/kernel/signal.c:347:   err |= __put_user(restorer, &frame->pretcode);
arch/x86/kernel/signal.c:356:   err |= __put_user(*((u64 *)&retcode), (u64 *)frame->retcode);
arch/x86/kernel/signal.c:613:   if (__get_user(set.sig[0], &frame->sc.oldmask) || (_NSIG_WORDS > 1
arch/x86/kernel/signal.c:647:   if (__get_user(uc_flags, &frame->uc.uc_flags))
arch/x86/kernel/signal.c:870:   if (__get_user(uc_flags, &frame->uc.uc_flags))
arch/x86/lib/csum-wrappers_64.c:103:                    *errp = __put_user(val16, (__u16 __user *)dst);
arch/x86/lib/csum-wrappers_64.c:45:                     if (__get_user(val16, (const __u16 __user *)src))

A bunch of strays in signal.c (extending ..._ex use might be a good idea)
xen_safe_{write,read}_ulong() (might very well break from adding access_ok() -
or be security holes; I'm not familiar enough with that code to tell) and
csum_partial_copy_{to,from}_user() - those would need to extend stac/clac
pairs already there and switch to unsafe_...

Plus several loops in ptrace - might or might not be sensitive, no idea.

Plus do_fast_syscall_32().  Might be sensitive, Andy would be the guy to
talk to, AFAICS.

> And some of the existing cases are just disgusting. There's no excuse
> for compat code or for ioctl to use the "__" versions. That seems to
> be the bulk of those uses too.

Sure.

My point is, this stuff needs looking at.  Even this quick look in arch/x86
has shown several fairly different classes of that stuff, probably needing
different approaches.  And that - on an architecture that had tons of TLC
around signal delivery; I'm not saying that result is optimal (asm-goto sounds
potentially useful there), but it had a lot of attention given to it...

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [git pull] uaccess-related bits of vfs.git
  2017-05-13 18:04             ` Al Viro
@ 2017-05-13 18:26               ` Al Viro
  2017-05-13 19:11                 ` Al Viro
  2017-05-13 19:00               ` Linus Torvalds
  1 sibling, 1 reply; 27+ messages in thread
From: Al Viro @ 2017-05-13 18:26 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Linux Kernel Mailing List, linux-arch

On Sat, May 13, 2017 at 07:04:13PM +0100, Al Viro wrote:

> My point is, this stuff needs looking at.  Even this quick look in arch/x86
> has shown several fairly different classes of that stuff, probably needing
> different approaches.  And that - on an architecture that had tons of TLC
> around signal delivery; I'm not saying that result is optimal (asm-goto sounds
> potentially useful there), but it had a lot of attention given to it...

BTW, even in arch/* they tend to nest.  E.g. arch/alpha has 133 callers
total.  Distribution by files:
     35 arch/alpha/kernel/osf_sys.c
     92 arch/alpha/kernel/signal.c
      1 arch/alpha/kernel/traps.c
      4 arch/alpha/lib/csum_partial_copy.c
      1 arch/alpha/mm/fault.c
Distribution by functions:
      1 osf_getdomainname()	[1]
      2 osf_sigstack()
      2 get_tv32()
      2 put_tv32()
      4 get_it32()
      4 put_it32()
      2 osf_select()
     18 osf_wait4()		[2]
      6 osf_sigaction()
     34 restore_sigcontext()
      1 do_sigreturn()
     42 setup_sigcontext()
      3 setup_frame()
      6 setup_rt_frame()
      1 dik_show_code()		[3]
      2 csum_partial_cfu_aligned()
      2 csum_partial_cfu_src_aligned()
      1 do_page_fault()		[4]

[1] insane, BTW - should be strnlen() + copy_to_user(); should report -EFAULT
on failure, while we are at it.
[2] with fairly disgusting use of set_fs() in the mix.
[3] would break with get_user() - it's oopser fetching code to printk.
[4] this:
        /* As of EV6, a load into $31/$f31 is a prefetch, and never faults
           (or is suppressed by the PALcode).  Support that for older CPUs
           by ignoring such an instruction.  */
        if (cause == 0) {
                unsigned int insn;
                __get_user(insn, (unsigned int __user *)regs->pc);
                if ((insn >> 21 & 0x1f) == 0x1f &&
                    /* ldq ldl ldt lds ldg ldf ldwu ldbu */
                    (1ul << (insn >> 26) & 0x30f00001400ul)) {
                        regs->pc += 4;
                        return;
                }
        }

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [git pull] uaccess-related bits of vfs.git
  2017-05-13 18:04             ` Al Viro
  2017-05-13 18:26               ` Al Viro
@ 2017-05-13 19:00               ` Linus Torvalds
  2017-05-13 19:17                 ` Al Viro
                                   ` (2 more replies)
  1 sibling, 3 replies; 27+ messages in thread
From: Linus Torvalds @ 2017-05-13 19:00 UTC (permalink / raw)
  To: Al Viro; +Cc: Linux Kernel Mailing List, linux-arch

On Sat, May 13, 2017 at 11:04 AM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>
>
> My point is, this stuff needs looking at.

No.

You don't have a point. I've tried to explain that there's no
performance difference, and there is no way in hell that the current
"__" versions are better.

So what's the point in looking? All you are ever going to come up with
is theoretical "this might" cases.

The only sane forward is to just get rid of them. At that point, you
*may* end up actually noticing something, but if you do, it's not a
"you might" issue.

There's literally no upside to this "needs looking at". There are only
downsides.

And one major downside of your "needs looking at". is this one:

From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Tue, 24 Mar 2015 10:42:18 -0700
>
> So I'd suggest we should just do a wholesale replacement of
> __copy_to/from_user() with the non-underlined cases. Then, we could
> switch insividual ones back - with reasoning of why they matter, and
> with pointers to how it does access_ok() two lines before.
>
> We should probably even consider looking at __get_user/__put_user().
> Few of them are actually performance-critical.

Look at that date. It's over two years ago. In the intervening two
years, how many of those conversions have happened?

Here's a hint: it's a very very round number.

                 Linus

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [git pull] uaccess-related bits of vfs.git
  2017-05-13 18:26               ` Al Viro
@ 2017-05-13 19:11                 ` Al Viro
  2017-05-13 19:34                   ` Al Viro
  0 siblings, 1 reply; 27+ messages in thread
From: Al Viro @ 2017-05-13 19:11 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Linux Kernel Mailing List, linux-arch

On Sat, May 13, 2017 at 07:26:56PM +0100, Al Viro wrote:
> BTW, even in arch/* they tend to nest.  E.g. arch/alpha has 133 callers
> total.  Distribution by files:
>      35 arch/alpha/kernel/osf_sys.c
>      92 arch/alpha/kernel/signal.c
>       1 arch/alpha/kernel/traps.c
>       4 arch/alpha/lib/csum_partial_copy.c
>       1 arch/alpha/mm/fault.c

Another large pile is on sparc (208 callers).  Except that on sparc64
access_ok() is constant 1, which reduces it to 42 callers.
        3 arch/sparc/kernel/ptrace_32.c (all in arch_ptrace())
       31 arch/sparc/kernel/signal_32.c (5 in do_sigreturn(),
					 6 in do_rt_sigreturn(),
					 8 in setup_frame(),
					 11 in setup_rt_frame(),
					 1 in do_sys_sigstack())
	7 arch/sparc/kernel/sigutil_32.c (2 in save_fpu_state(),
					  2 in restore_fpu_state(),
					  2 in save_rwin_state(),
					  1 in restore_rwin_state()
	1 arch/sparc/mm/fault_32.c (in compute_si_addr())

The last one ignores -EFAULT, BTW - not sure what should it have done
in that case, though.  It's calculation of ->si_addr on SIGSEGV and SIGBUS
in case of data access SIGSEGV or SIGBUS; it wants to peek into insn to
figure out the effective address.  arch_ptrace() one is zeroing several
fields in userland struct fps for PTRACE_GETFPREGS.  Could've been
put_user() (or clear_user(), for that matter); we'd just done
copy_regset_to_user() on adjacent addresses, so the lack of access_ok() is not
a security hole there).  Everything else is in sigframe handling...

Other large piles are on mips, ppc and itanic.  parisc is next, but there
access_ok() is constant 1 as well.  Same for s390 and m68k.  nios2 and
unicore32 are a bit under 80 callers each and the next are tile (~60),
sh (~50) and then it drops off fast.

It's not impossible to review; the problem is in figuring out which codepaths
are hot enough to worry about them.  And I'm even more convinced that
bulk search-and-replace is the right approach here; we probably _can_ get
rid of that shit this cycle (or, at least, reduce its use to very few places
in arch/*), but that'll require some cooperation from architecture maintainers.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [git pull] uaccess-related bits of vfs.git
  2017-05-13 19:00               ` Linus Torvalds
@ 2017-05-13 19:17                 ` Al Viro
  2017-05-13 19:56                 ` Al Viro
  2017-05-13 20:37                 ` Al Viro
  2 siblings, 0 replies; 27+ messages in thread
From: Al Viro @ 2017-05-13 19:17 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Linux Kernel Mailing List, linux-arch

On Sat, May 13, 2017 at 12:00:10PM -0700, Linus Torvalds wrote:
> On Sat, May 13, 2017 at 11:04 AM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> >
> >
> > My point is, this stuff needs looking at.
> 
> No.
> 
> You don't have a point. I've tried to explain that there's no
> performance difference, and there is no way in hell that the current
> "__" versions are better.
> 
> So what's the point in looking? All you are ever going to come up with
> is theoretical "this might" cases.

Umm...  Just during this subthread - a couple of likely breakages in
xen (which would need looking into, right?) and "oopsen on alpha would
stop actually printing code", which is better to spot early.  The first
one might be theoretical, but it's worth giving heads-up to xen folks;
I'm fairly sure that this spot will break.  The second is not theoretical
at all - it *will* break.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [git pull] uaccess-related bits of vfs.git
  2017-05-13 19:11                 ` Al Viro
@ 2017-05-13 19:34                   ` Al Viro
  0 siblings, 0 replies; 27+ messages in thread
From: Al Viro @ 2017-05-13 19:34 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Linux Kernel Mailing List, linux-arch

On Sat, May 13, 2017 at 08:11:42PM +0100, Al Viro wrote:
> It's not impossible to review; the problem is in figuring out which codepaths
> are hot enough to worry about them.  And I'm even more convinced that
> bulk search-and-replace is the right approach here; we probably _can_ get
			    ^- not
apologies for sloppy editing...

> rid of that shit this cycle (or, at least, reduce its use to very few places
> in arch/*), but that'll require some cooperation from architecture maintainers.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [git pull] uaccess-related bits of vfs.git
  2017-05-13 19:00               ` Linus Torvalds
  2017-05-13 19:17                 ` Al Viro
@ 2017-05-13 19:56                 ` Al Viro
  2017-05-13 20:08                   ` Al Viro
  2017-05-13 20:37                 ` Al Viro
  2 siblings, 1 reply; 27+ messages in thread
From: Al Viro @ 2017-05-13 19:56 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Linux Kernel Mailing List, linux-arch

On Sat, May 13, 2017 at 12:00:10PM -0700, Linus Torvalds wrote:
> > We should probably even consider looking at __get_user/__put_user().
> > Few of them are actually performance-critical.
> 
> Look at that date. It's over two years ago. In the intervening two
> years, how many of those conversions have happened?
> 
> Here's a hint: it's a very very round number.

FWIW, just this cycle (this one I remembered off-hand, there might be
more):

commit edb88cef0570914375d461107759cf0d6d677ed5
Author: Arnd Bergmann <arnd@arndb.de>
Date:   Sat Apr 22 00:02:31 2017 +0200

    scsi: pmcraid: use normal copy_from_user
    
    As pointed out by Al Viro for my previous series, the driver has no need
    to call access_ok() and __copy_from_user()/__copy_to_user(). Changing
    it to regular copy_from_user()/copy_to_user() simplifies the code without
    any real downsides, making it less error-prone at best.
    
    This patch by itself also addresses the warning about the access_ok()
    macro on MIPS, but both fixes improve the code, so ideally we apply
    them both.
    
    Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [git pull] uaccess-related bits of vfs.git
  2017-05-13 19:56                 ` Al Viro
@ 2017-05-13 20:08                   ` Al Viro
  2017-05-13 20:32                     ` Geert Uytterhoeven
  0 siblings, 1 reply; 27+ messages in thread
From: Al Viro @ 2017-05-13 20:08 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Linux Kernel Mailing List, linux-arch

On Sat, May 13, 2017 at 08:56:59PM +0100, Al Viro wrote:

> FWIW, just this cycle (this one I remembered off-hand, there might be
> more):

And looking through my queue (will be pushed to -next as soon as -rc1 goes
out):

commit 87fb4c8c103a4cdf17fead4aba58e96940a19a09
Author: Al Viro <viro@zeniv.linux.org.uk>
Date:   Thu Apr 20 15:47:34 2017 -0400

    spidev: quit messing with access_ok()
    
    Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

diff --git a/drivers/spi/spidev.c b/drivers/spi/spidev.c
index 9e2e099baf8c..8dd22de5e3b5 100644
--- a/drivers/spi/spidev.c
+++ b/drivers/spi/spidev.c
@@ -254,10 +254,6 @@ static int spidev_message(struct spidev_data *spidev,
 				goto done;
 			}
 			k_tmp->rx_buf = rx_buf;
-			if (!access_ok(VERIFY_WRITE, (u8 __user *)
-						(uintptr_t) u_tmp->rx_buf,
-						u_tmp->len))
-				goto done;
 			rx_buf += k_tmp->len;
 		}
 		if (u_tmp->tx_buf) {
@@ -305,7 +301,7 @@ static int spidev_message(struct spidev_data *spidev,
 	rx_buf = spidev->rx_buffer;
 	for (n = n_xfers, u_tmp = u_xfers; n; n--, u_tmp++) {
 		if (u_tmp->rx_buf) {
-			if (__copy_to_user((u8 __user *)
+			if (copy_to_user((u8 __user *)
 					(uintptr_t) u_tmp->rx_buf, rx_buf,
 					u_tmp->len)) {
 				status = -EFAULT;
@@ -325,8 +321,7 @@ static struct spi_ioc_transfer *
 spidev_get_ioc_message(unsigned int cmd, struct spi_ioc_transfer __user *u_ioc,
 		unsigned *n_ioc)
 {
-	struct spi_ioc_transfer	*ioc;
-	u32	tmp;
+	u32	size;
 
 	/* Check type, command number and direction */
 	if (_IOC_TYPE(cmd) != SPI_IOC_MAGIC
@@ -334,22 +329,15 @@ spidev_get_ioc_message(unsigned int cmd, struct spi_ioc_transfer __user *u_ioc,
 			|| _IOC_DIR(cmd) != _IOC_WRITE)
 		return ERR_PTR(-ENOTTY);
 
-	tmp = _IOC_SIZE(cmd);
+	size = _IOC_SIZE(cmd);
 	if ((tmp % sizeof(struct spi_ioc_transfer)) != 0)
 		return ERR_PTR(-EINVAL);
-	*n_ioc = tmp / sizeof(struct spi_ioc_transfer);
+	*n_ioc = size / sizeof(struct spi_ioc_transfer);
 	if (*n_ioc == 0)
 		return NULL;
 
 	/* copy into scratch area */
-	ioc = kmalloc(tmp, GFP_KERNEL);
-	if (!ioc)
-		return ERR_PTR(-ENOMEM);
-	if (__copy_from_user(ioc, u_ioc, tmp)) {
-		kfree(ioc);
-		return ERR_PTR(-EFAULT);
-	}
-	return ioc;
+	return memdup_user(u_ioc, size);
 }
 
 static long
@@ -367,19 +355,6 @@ spidev_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
 	if (_IOC_TYPE(cmd) != SPI_IOC_MAGIC)
 		return -ENOTTY;
 
-	/* Check access direction once here; don't repeat below.
-	 * IOC_DIR is from the user perspective, while access_ok is
-	 * from the kernel perspective; so they look reversed.
-	 */
-	if (_IOC_DIR(cmd) & _IOC_READ)
-		err = !access_ok(VERIFY_WRITE,
-				(void __user *)arg, _IOC_SIZE(cmd));
-	if (err == 0 && _IOC_DIR(cmd) & _IOC_WRITE)
-		err = !access_ok(VERIFY_READ,
-				(void __user *)arg, _IOC_SIZE(cmd));
-	if (err)
-		return -EFAULT;
-
 	/* guard against device removal before, or while,
 	 * we issue this ioctl.
 	 */
@@ -402,31 +377,31 @@ spidev_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
 	switch (cmd) {
 	/* read requests */
 	case SPI_IOC_RD_MODE:
-		retval = __put_user(spi->mode & SPI_MODE_MASK,
+		retval = put_user(spi->mode & SPI_MODE_MASK,
 					(__u8 __user *)arg);
 		break;
 	case SPI_IOC_RD_MODE32:
-		retval = __put_user(spi->mode & SPI_MODE_MASK,
+		retval = put_user(spi->mode & SPI_MODE_MASK,
 					(__u32 __user *)arg);
 		break;
 	case SPI_IOC_RD_LSB_FIRST:
-		retval = __put_user((spi->mode & SPI_LSB_FIRST) ?  1 : 0,
+		retval = put_user((spi->mode & SPI_LSB_FIRST) ?  1 : 0,
 					(__u8 __user *)arg);
 		break;
 	case SPI_IOC_RD_BITS_PER_WORD:
-		retval = __put_user(spi->bits_per_word, (__u8 __user *)arg);
+		retval = put_user(spi->bits_per_word, (__u8 __user *)arg);
 		break;
 	case SPI_IOC_RD_MAX_SPEED_HZ:
-		retval = __put_user(spidev->speed_hz, (__u32 __user *)arg);
+		retval = put_user(spidev->speed_hz, (__u32 __user *)arg);
 		break;
 
 	/* write requests */
 	case SPI_IOC_WR_MODE:
 	case SPI_IOC_WR_MODE32:
 		if (cmd == SPI_IOC_WR_MODE)
-			retval = __get_user(tmp, (u8 __user *)arg);
+			retval = get_user(tmp, (u8 __user *)arg);
 		else
-			retval = __get_user(tmp, (u32 __user *)arg);
+			retval = get_user(tmp, (u32 __user *)arg);
 		if (retval == 0) {
 			u32	save = spi->mode;
 
@@ -445,7 +420,7 @@ spidev_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
 		}
 		break;
 	case SPI_IOC_WR_LSB_FIRST:
-		retval = __get_user(tmp, (__u8 __user *)arg);
+		retval = get_user(tmp, (__u8 __user *)arg);
 		if (retval == 0) {
 			u32	save = spi->mode;
 
@@ -462,7 +437,7 @@ spidev_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
 		}
 		break;
 	case SPI_IOC_WR_BITS_PER_WORD:
-		retval = __get_user(tmp, (__u8 __user *)arg);
+		retval = get_user(tmp, (__u8 __user *)arg);
 		if (retval == 0) {
 			u8	save = spi->bits_per_word;
 
@@ -475,7 +450,7 @@ spidev_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
 		}
 		break;
 	case SPI_IOC_WR_MAX_SPEED_HZ:
-		retval = __get_user(tmp, (__u32 __user *)arg);
+		retval = get_user(tmp, (__u32 __user *)arg);
 		if (retval == 0) {
 			u32	save = spi->max_speed_hz;
 
@@ -525,8 +500,6 @@ spidev_compat_ioc_message(struct file *filp, unsigned int cmd,
 	struct spi_ioc_transfer		*ioc;
 
 	u_ioc = (struct spi_ioc_transfer __user *) compat_ptr(arg);
-	if (!access_ok(VERIFY_READ, u_ioc, _IOC_SIZE(cmd)))
-		return -EFAULT;
 
 	/* guard against device removal before, or while,
 	 * we issue this ioctl.

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [git pull] uaccess-related bits of vfs.git
  2017-05-13 20:08                   ` Al Viro
@ 2017-05-13 20:32                     ` Geert Uytterhoeven
  2017-05-13 20:45                       ` Al Viro
  0 siblings, 1 reply; 27+ messages in thread
From: Geert Uytterhoeven @ 2017-05-13 20:32 UTC (permalink / raw)
  To: Al Viro; +Cc: Linus Torvalds, Linux Kernel Mailing List, linux-arch

Hi Al,

On Sat, May 13, 2017 at 10:08 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> On Sat, May 13, 2017 at 08:56:59PM +0100, Al Viro wrote:
>
>> FWIW, just this cycle (this one I remembered off-hand, there might be
>> more):
>
> And looking through my queue (will be pushed to -next as soon as -rc1 goes
> out):
>
> commit 87fb4c8c103a4cdf17fead4aba58e96940a19a09
> Author: Al Viro <viro@zeniv.linux.org.uk>
> Date:   Thu Apr 20 15:47:34 2017 -0400
>
>     spidev: quit messing with access_ok()
>
>     Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
>
> diff --git a/drivers/spi/spidev.c b/drivers/spi/spidev.c
> index 9e2e099baf8c..8dd22de5e3b5 100644
> --- a/drivers/spi/spidev.c
> +++ b/drivers/spi/spidev.c

You better run that one through linux-spi, to avoid conflicts, cfr.
https://patchwork.kernel.org/patch/9714993/

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [git pull] uaccess-related bits of vfs.git
  2017-05-13 19:00               ` Linus Torvalds
  2017-05-13 19:17                 ` Al Viro
  2017-05-13 19:56                 ` Al Viro
@ 2017-05-13 20:37                 ` Al Viro
  2017-05-13 20:52                   ` Linus Torvalds
  2 siblings, 1 reply; 27+ messages in thread
From: Al Viro @ 2017-05-13 20:37 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Linux Kernel Mailing List, linux-arch

On Sat, May 13, 2017 at 12:00:10PM -0700, Linus Torvalds wrote:
> From: Linus Torvalds <torvalds@linux-foundation.org>
> Date: Tue, 24 Mar 2015 10:42:18 -0700
> >
> > So I'd suggest we should just do a wholesale replacement of
> > __copy_to/from_user() with the non-underlined cases. Then, we could
> > switch insividual ones back - with reasoning of why they matter, and
> > with pointers to how it does access_ok() two lines before.
> >
> > We should probably even consider looking at __get_user/__put_user().
> > Few of them are actually performance-critical.
> 
> Look at that date. It's over two years ago. In the intervening two
> years, how many of those conversions have happened?

Speaking of killing that kind of crap off: there was a question left from the
last cycle that hadn't been sorted out.

SCTP does this in a couple of places:
        /* Check the user passed a healthy pointer.  */
        if (unlikely(!access_ok(VERIFY_READ, addrs, addrs_size)))
                return -EFAULT;

        /* Alloc space for the address array in kernel memory.  */
        kaddrs = kmalloc(addrs_size, GFP_USER | __GFP_NOWARN);
        if (unlikely(!kaddrs))
                return -ENOMEM;

        if (__copy_from_user(kaddrs, addrs, addrs_size)) {
                kfree(kaddrs);
                return -EFAULT;
        }
instead of memdup_user().  Part of the rationale is pretty weak (access_ok()
as sanity check to prevent user-triggerable attempts to allocate too much -
it still can trivially trigger 2G, so it's not worth much), part is more
interesting.  Namely, that whining into the syslog shouldn't be that easy
to trigger.

That's a valid point and it might apply to memdup_user() callers out there.
Potential variants:
	* add an explicit upper bound on the size and turn that into
memdup_user() (and check that all memdup_user() callers are bounded).
	* have memdup_user() itself pass __GFP_NOWARN.
	* add kvmemdup_user() that would use kvmalloc() (with its callers
expected to use kvfree()); see who else might benefit from conversion.

Preferences?

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [git pull] uaccess-related bits of vfs.git
  2017-05-13 20:32                     ` Geert Uytterhoeven
@ 2017-05-13 20:45                       ` Al Viro
  0 siblings, 0 replies; 27+ messages in thread
From: Al Viro @ 2017-05-13 20:45 UTC (permalink / raw)
  To: Geert Uytterhoeven; +Cc: Linus Torvalds, Linux Kernel Mailing List, linux-arch

On Sat, May 13, 2017 at 10:32:31PM +0200, Geert Uytterhoeven wrote:

> You better run that one through linux-spi, to avoid conflicts, cfr.
> https://patchwork.kernel.org/patch/9714993/

What I'm going to do is never-rebased #for-spi (well, never after -rc1)
merged into #work.uaccess and proposed for pull to linux-spi.

The local tree is a mess right now - bloody lot of branches, huge unsorted
pile, etc.  This was from #unsorted, actually - should've picked the one
from #for-spi (a couple of brainos fixed in the version there)...

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [git pull] uaccess-related bits of vfs.git
  2017-05-13 20:37                 ` Al Viro
@ 2017-05-13 20:52                   ` Linus Torvalds
  2017-05-13 21:25                     ` Al Viro
  0 siblings, 1 reply; 27+ messages in thread
From: Linus Torvalds @ 2017-05-13 20:52 UTC (permalink / raw)
  To: Al Viro; +Cc: Linux Kernel Mailing List, linux-arch

On Sat, May 13, 2017 at 1:37 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> That's a valid point and it might apply to memdup_user() callers out there.
> Potential variants:
>         * add an explicit upper bound on the size and turn that into
> memdup_user() (and check that all memdup_user() callers are bounded).
>         * have memdup_user() itself pass __GFP_NOWARN.
>         * add kvmemdup_user() that would use kvmalloc() (with its callers
> expected to use kvfree()); see who else might benefit from conversion.

All of the above sound reasonable.

I wouldn't change the existing "memdup_user()" interface itself, but
if there really are users that can validly pass in a maxbyte value,
why not add a new helper:

  void *memdup_user_limit(userptr, nmember, nsize, maxsize);

and then have

  #define memdup_user(ptr,size) memdup_user_limit(ptr, size, 1, -1)

or something. I definitely see a couple of memdup_user() people who do
that "num*size" multiplication by hand, and it's very easy to get
wrong and have an overflow.

And for a kvmalloc/kvfree() interface, you *definitely* want that
maxsize thing, since it absolutely needs an upper limit.

             Linus

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [git pull] uaccess-related bits of vfs.git
  2017-05-13 20:52                   ` Linus Torvalds
@ 2017-05-13 21:25                     ` Al Viro
  0 siblings, 0 replies; 27+ messages in thread
From: Al Viro @ 2017-05-13 21:25 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Linux Kernel Mailing List, linux-arch

On Sat, May 13, 2017 at 01:52:29PM -0700, Linus Torvalds wrote:
> I wouldn't change the existing "memdup_user()" interface itself, but
> if there really are users that can validly pass in a maxbyte value,
> why not add a new helper:
> 
>   void *memdup_user_limit(userptr, nmember, nsize, maxsize);
> 
> and then have
> 
>   #define memdup_user(ptr,size) memdup_user_limit(ptr, size, 1, -1)
> 
> or something. I definitely see a couple of memdup_user() people who do
> that "num*size" multiplication by hand, and it's very easy to get
> wrong and have an overflow.
> 
> And for a kvmalloc/kvfree() interface, you *definitely* want that
> maxsize thing, since it absolutely needs an upper limit.

*nod*

Speaking of insanities around open-coded memdup_user()...  Enjoy:

                ias_opt = kmalloc(sizeof(struct irda_ias_set), GFP_ATOMIC);
                if (ias_opt == NULL) {
                        err = -ENOMEM;
                        goto out;
                }

                /* Copy query to the driver. */
                if (copy_from_user(ias_opt, optval, optlen)) {

Can't have it block, sir, has to be GFP_ATOMIC...  Whaddya mean, "what
if copy_from_user() blocks?"  As far as I can see, that came in circa
2.4.6, when local sturct irda_ias_set got switched to dynamic allocation...

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [git pull] uaccess-related bits of vfs.git
  2017-05-13 16:15       ` Linus Torvalds
  2017-05-13 16:17         ` Linus Torvalds
  2017-05-13 17:00         ` Al Viro
@ 2017-05-14 18:13         ` Ingo Molnar
  2017-05-14 18:57           ` Al Viro
  2 siblings, 1 reply; 27+ messages in thread
From: Ingo Molnar @ 2017-05-14 18:13 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Al Viro, Linux Kernel Mailing List, linux-arch


* Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Fri, May 12, 2017 at 11:57 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> >
> > First, some stats: there's a thousand-odd callers of __get_user().  Out of
> > those, about 70% are in arch/, mostly in sigframe-related code.
> 
> Sure. And they can be trivially converted, and none of them should care at all.
> 
> > IOW, we have
> >         * most of users in arch/* (heavily dominated by signal-related code,
> > both loads and stores).  Those need careful massage; maybe unsafe-based
> > solution, maybe something else, but it's obviously per-architecture work
> > and these paths are sensitive.
> 
> Why are they sensitive?
> 
> Why not just do this:
> 
>   git grep -l '\<__\(\(get\)\|\(put\)\)_user(' -- arch/x86
> :^arch/x86/include/asm/uaccess.h
>         | xargs sed -i 's/__\(\(\(get\)\|\(put\)\)_user(\)/\1/g'
> 
> which converts all the x86 uses in one go.
> 
> Anybody who *relies* on not checking the address_limit is so broken as
> to be not even funny. And anything that is so performance-sensitive
> that anybody can even measure the effect of the above we can convert
> later.

I'd say that the CLAC/STAC addition pretty much killed any argument in favor of 
"optimized" __get_user() code, so I'd be very happy to see these interfaces gone 
altogether.

So as far as x86 usage goes:

  Acked-by: Ingo Molnar <mingo@kernel.org>

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [git pull] uaccess-related bits of vfs.git
  2017-05-14 18:13         ` Ingo Molnar
@ 2017-05-14 18:57           ` Al Viro
  0 siblings, 0 replies; 27+ messages in thread
From: Al Viro @ 2017-05-14 18:57 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Linus Torvalds, Linux Kernel Mailing List, linux-arch

On Sun, May 14, 2017 at 08:13:56PM +0200, Ingo Molnar wrote:

> I'd say that the CLAC/STAC addition pretty much killed any argument in favor of 
> "optimized" __get_user() code, so I'd be very happy to see these interfaces gone 
> altogether.

You and everybody else - these interfaces suck.  If anything, we want paired
brackets around a series of accesses instead of a single check in front of it.

> So as far as x86 usage goes:
> 
>   Acked-by: Ingo Molnar <mingo@kernel.org>

Umm...  Could you elaborate the situation with xen/page.h stuff?  I don't
see any obvious reasons that would guaratee that addresses passed to
__get_user() and __put_user() there would match the set_fs() state.

It might very well be true, but it's not obvious from that code...

BTW, does anybody have a suggestion regarding a test load that would hit
wait4/waitid as hard as possible?  I've turned sys_wait4/sys_waitid into
long kernel_wait4(pid_t upid, int *stat_addr, int options, struct rusage *ru)
and
static long kernel_waitid(int which, pid_t upid, struct waitid_info *infop,
                int options, struct rusage *ru)
(with struct waitid_info {
        pid_t pid;
        uid_t uid;
        int status;   
        int why;
};), so that all copying to userland is done in sys_wait4() and friends.
It seems to survive testing without any noticable slowdowns, but that's
just LTP and xfstests - and a bug in my earlier version of that was _not_
caught by the LTP side; xfstests caught it...  So any extra tests (both
for correctness and timing) would be very much appreciated...

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2017-05-14 18:57 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-05-01  3:02 Linux 4.11 Linus Torvalds
2017-05-01  3:45 ` [git pull] uaccess-related bits of vfs.git Al Viro
2017-05-13  1:00   ` Linus Torvalds
2017-05-13  6:57     ` Al Viro
2017-05-13 12:05       ` Adam Borowski
2017-05-13 13:46         ` Brian Gerst
2017-05-13 16:46         ` Al Viro
2017-05-13 16:15       ` Linus Torvalds
2017-05-13 16:17         ` Linus Torvalds
2017-05-13 17:00         ` Al Viro
2017-05-13 17:12           ` Al Viro
2017-05-13 17:18           ` Linus Torvalds
2017-05-13 18:04             ` Al Viro
2017-05-13 18:26               ` Al Viro
2017-05-13 19:11                 ` Al Viro
2017-05-13 19:34                   ` Al Viro
2017-05-13 19:00               ` Linus Torvalds
2017-05-13 19:17                 ` Al Viro
2017-05-13 19:56                 ` Al Viro
2017-05-13 20:08                   ` Al Viro
2017-05-13 20:32                     ` Geert Uytterhoeven
2017-05-13 20:45                       ` Al Viro
2017-05-13 20:37                 ` Al Viro
2017-05-13 20:52                   ` Linus Torvalds
2017-05-13 21:25                     ` Al Viro
2017-05-14 18:13         ` Ingo Molnar
2017-05-14 18:57           ` Al Viro

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).