linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 3.16 00/63] 3.16.58-rc1 review
@ 2018-09-22  0:15 Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 51/63] xfs: catch inode allocation state mismatch corruption Ben Hutchings
                   ` (63 more replies)
  0 siblings, 64 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: torvalds, Guenter Roeck, akpm

This is the start of the stable review cycle for the 3.16.58 release.
There are 63 patches in this series, which will be posted as responses
to this one.  If anyone has any issues with these being applied, please
let me know.

Responses should be made by Mon Sep 24 00:15:41 UTC 2018.
Anything received after that time might be too late.

All the patches have also been committed to the linux-3.16.y-rc branch of
https://git.kernel.org/pub/scm/linux/kernel/git/bwh/linux-stable-rc.git .
A shortlog and diffstat can be found below.

Ben.

-------------

Alexander Potapenko (1):
      scsi: sg: allocate with __GFP_ZERO in sg_build_indirect()
         [a45b599ad808c3c982fdcdc12b0b8611c2f92824]

Alexey Khoroshilov (1):
      usbip: fix error handling in stub_probe()
         [3ff67445750a84de67faaf52c6e1895cb09f2c56]

Andy Lutomirski (1):
      x86/entry/64: Remove %ebx handling from error_entry/exit
         [b3681dd548d06deb2e1573890829dff4b15abf46]

Ben Hutchings (2):
      Revert "vti4: Don't override MTU passed on link creation via IFLA_MTU"
         [not upstream; the reverted commit was correct for upstream]
      x86/fpu: Default eagerfpu if FPU and FXSR are enabled
         [58122bf1d856a4ea9581d62a07c557d997d46a19]

Borislav Petkov (1):
      x86/cpu/AMD: Fix erratum 1076 (CPB bit)
         [f7f3dc00f61261cdc9ccd8b886f21bc4dffd6fd9]

Christoph Paasch (1):
      net: Set sk_prot_creator when cloning sockets to the right proto
         [9d538fa60bad4f7b23193c89e843797a1cf71ef3]

Cong Wang (1):
      infiniband: fix a possible use-after-free bug
         [cb2595c1393b4a5211534e6f0a0fbad369e21ad8]

Dave Chinner (2):
      xfs: catch inode allocation state mismatch corruption
         [ee457001ed6c6f31ddad69c24c1da8f377d8472d]
      xfs: validate cached inodes are free when allocated
         [afca6c5b2595fc44383919fba740c194b0b76aff]

Eric Sandeen (2):
      xfs: don't call xfs_da_shrink_inode with NULL bp
         [bb3d48dcf86a97dc25fe9fc2c11938e19cb4399a]
      xfs: set format back to extents if xfs_bmap_extents_to_btree
         [2c4306f719b083d17df2963bc761777576b8ad1b]

Ernesto A . Fernández (1):
      hfsplus: fix NULL dereference in hfsplus_lookup()
         [a7ec7a4193a2eb3b5341243fc0b621c1ac9e4ec4]

Ingo Molnar (2):
      x86/fpu: Fix the 'nofxsr' boot parameter to also clear X86_FEATURE_FXSR_OPT
         [d364a7656c1855c940dfa4baf4ebcc3c6a9e6fd2]
      x86/speculation: Clean up various Spectre related details
         [21e433bdb95bdf3aa48226fd3d33af608437f293]

Jann Horn (1):
      USB: yurex: fix out-of-bounds uaccess in read handler
         [f1e255d60ae66a9f672ff9a207ee6cd8e33d2679]

Jason Yan (1):
      scsi: libsas: defer ata device eh commands to libata
         [318aaf34f1179b39fa9c30fa0f3288b645beee39]

Jens Axboe (1):
      sr: pass down correctly sized SCSI sense buffer
         [f7068114d45ec55996b9040e98111afa56e010fe]

Jiri Kosina (1):
      x86/speculation: Protect against userspace-userspace spectreRSB
         [fdf82a7856b32d905c39afc85e34364491e46346]

Kees Cook (5):
      seccomp: add "seccomp" syscall
         [48dc92b9fc3926844257316e75ba11eb5c742b2c]
      seccomp: create internal mode-setting function
         [d78ab02c2c194257a03355fbb79eb721b381d105]
      seccomp: extract check/assign mode helpers
         [1f41b450416e689b9b7c8bfb750a98604f687a9b]
      seccomp: split mode setting routines
         [3b23dd12846215eff4afb073366b80c0c4d7543e]
      video: uvesafb: Fix integer overflow in allocation
         [9f645bcc566a1e9f921bdae7528a01ced5bc3713]

Kyle Huey (2):
      x86/process: Correct and optimize TIF_BLOCKSTEP switch
         [b9894a2f5bd18b1691cb6872c9afe32b148d0132]
      x86/process: Optimize TIF checks in __switch_to_xtra()
         [af8b3cd3934ec60f4c2a420d19a9d416554f140b]

Linus Torvalds (2):
      Fix up non-directory creation in SGID directories
         [0fa3ecd87848c9c93c2c828ef4c3a8ca36ce46c7]
      mm: get rid of vmacache_flush_all() entirely
         [7a9cdebdcc17e426fb5287e4a82db1dfe86339b2]

Mark Salyzyn (1):
      Bluetooth: hidp: buffer overflow in hidp_process_report
         [7992c18810e568b95c869b227137a2215702a805]

Mel Gorman (2):
      futex: Remove requirement for lock_page() in get_futex_key()
         [65d8fc777f6dcfee12785c057a6b57f679641c90]
      futex: Remove unnecessary warning from get_futex_key
         [48fb6f4db940e92cfb16cd878cddd59ea6120d06]

Nadav Amit (1):
      KVM: x86: Emulator ignores LDTR/TR extended base on LLDT/LTR
         [e37a75a13cdae5deaa2ea2cbf8d55b5dd08638b6]

Paolo Bonzini (4):
      KVM: x86: introduce linear_{read,write}_system
         [79367a65743975e5cac8d24d08eccc7fdae832b0]
      KVM: x86: introduce num_emulated_msrs
         [62ef68bb4d00f1a662e487f3fc44ce8521c416aa]
      KVM: x86: pass kvm_vcpu to kvm_read_guest_virt and kvm_write_guest_virt_system
         [ce14e868a54edeb2e30cb7a7b104a2fc4b9d76ca]
      kvm: x86: use correct privilege level for sgdt/sidt/fxsave/fxrstor  access
         [3c9fa24ca7c9c47605672916491f79e8ccacb9e6]

Peter Zijlstra (1):
      x86/paravirt: Fix spectre-v2 mitigations for paravirt guests
         [5800dc5c19f34e6e03b5adab1282535cb102fafd]

Piotr Luc (1):
      x86/cpu/intel: Add Knights Mill to Intel family
         [0047f59834e5947d45f34f5f12eb330d158f700b]

Qu Wenruo (1):
      btrfs: relocation: Only remove reloc rb_trees if reloc control has been initialized
         [389305b2aa68723c754f88d9dbd268a400e10664]

Sanjeev Sharma (1):
      uas: replace WARN_ON_ONCE() with lockdep_assert_held()
         [ab945eff8396bc3329cc97274320e8d2c6585077]

Scott Bauer (1):
      cdrom: Fix info leak/OOB read in cdrom_ioctl_drive_status
         [8f3fafc9c2f0ece10832c25f7ffcb07c97a32ad4]

Shankara Pailoor (1):
      jfs: Fix inconsistency between memory allocation and ea_buf->max_size
         [92d34134193e5b129dc24f8d79cb9196626e8d7a]

Shuah Khan (6):
      usbip: usbip_host: delete device from busid_table after rebind
         [1e180f167d4e413afccbbb4a421b48b2de832549]
      usbip: usbip_host: fix NULL-ptr deref and use-after-free errors
         [22076557b07c12086eeb16b8ce2b0b735f7a27e7]
      usbip: usbip_host: fix bad unlock balance during stub_probe()
         [c171654caa875919be3c533d3518da8be5be966e]
      usbip: usbip_host: fix to hold parent lock for device_attach() calls
         [4bfb141bc01312a817d36627cc47c93f801c216d]
      usbip: usbip_host: refine probe and disconnect debug msgs to be  useful
         [28b68acc4a88dcf91fd1dcf2577371dc9bf574cc]
      usbip: usbip_host: run rebind from exit when module is removed
         [7510df3f29d44685bab7b1918b61a8ccd57126a9]

Takashi Iwai (1):
      ALSA: rawmidi: Change resized buffers atomically
         [39675f7a7c7e7702f7d5341f1e0d01db746543a0]

Theodore Ts'o (14):
      ext4: add corruption check in ext4_xattr_set_entry()
         [5369a762c882c0b6e9599e4ebbb3a9ba9eee7e2d]
      ext4: add more inode number paranoia checks
         [c37e9e013469521d9adb932d17a1795c139b36db]
      ext4: always check block group bounds in ext4_init_block_bitmap()
         [819b23f1c501b17b9694325471789e6b5cc2d0d2]
      ext4: always verify the magic number in xattr blocks
         [513f86d73855ce556ea9522b6bfd79f87356dc3a]
      ext4: avoid running out of journal credits when appending to an inline file
         [8bc1379b82b8e809eef77a9fedbb75c6c297be19]
      ext4: clear i_data in ext4_inode_info when removing inline data
         [6e8ab72a812396996035a37e5ca4b3b99b5d214b]
      ext4: don't allow r/w mounts if metadata blocks overlap the superblock
         [18db4b4e6fc31eda838dd1c1296d67dbcb3dc957]
      ext4: fix check to prevent initializing reserved inodes
         [5012284700775a4e6e3fbe7eac4c543c4874b559]
      ext4: fix false negatives *and* false positives in ext4_check_descriptors()
         [44de022c4382541cebdd6de4465d1f4f465ff1dd]
      ext4: make sure bitmaps and the inode table don't overlap with bg descriptors
         [77260807d1170a8cf35dbb06e07461a655f67eee]
      ext4: never move the system.data xattr out of the inode body
         [8cdb5240ec5928b20490a2bb34cb87e9a5f40226]
      ext4: only look at the bg_flags field if it is valid
         [8844618d8aa7a9973e7b527d038a2a589665002c]
      ext4: verify the depth of extent tree in ext4_find_extent()
         [bc890a60247171294acc0bd67d211fa4b88d40ba]
      jbd2: don't mark block as modified if the handle is out of credits
         [e09463f220ca9a1a1ecfda84fcda658f99a1f12a]

 Makefile                              |   4 +-
 arch/Kconfig                          |   1 +
 arch/x86/include/asm/intel-family.h   |   1 +
 arch/x86/include/asm/kvm_emulate.h    |   6 +-
 arch/x86/include/uapi/asm/msr-index.h |   1 +
 arch/x86/kernel/cpu/amd.c             |  13 +++
 arch/x86/kernel/cpu/bugs.c            |  59 ++++----------
 arch/x86/kernel/cpu/common.c          |  17 ++--
 arch/x86/kernel/entry_64.S            |  13 +--
 arch/x86/kernel/i387.c                |  24 ++++++
 arch/x86/kernel/paravirt.c            |  14 +++-
 arch/x86/kernel/process.c             |  62 +++++++++------
 arch/x86/kernel/xsave.c               |  24 +-----
 arch/x86/kvm/emulate.c                |  76 ++++++++++--------
 arch/x86/kvm/vmx.c                    |  20 +++--
 arch/x86/kvm/x86.c                    |  91 ++++++++++++++-------
 arch/x86/kvm/x86.h                    |   4 +-
 arch/x86/syscalls/syscall_32.tbl      |   1 +
 arch/x86/syscalls/syscall_64.tbl      |   1 +
 drivers/cdrom/cdrom.c                 |   2 +-
 drivers/infiniband/core/ucma.c        |   6 +-
 drivers/scsi/libsas/sas_scsi_host.c   |  33 +++-----
 drivers/scsi/sg.c                     |   2 +-
 drivers/scsi/sr_ioctl.c               |  21 ++---
 drivers/staging/usbip/stub.h          |   2 +
 drivers/staging/usbip/stub_dev.c      |  69 +++++++++-------
 drivers/staging/usbip/stub_main.c     | 100 +++++++++++++++++++++--
 drivers/usb/misc/yurex.c              |  23 ++----
 drivers/usb/storage/uas.c             |   8 +-
 drivers/video/fbdev/uvesafb.c         |   3 +-
 fs/btrfs/relocation.c                 |  23 +++---
 fs/ext4/balloc.c                      |  21 +++--
 fs/ext4/ext4.h                        |   8 --
 fs/ext4/ext4_extents.h                |   1 +
 fs/ext4/extents.c                     |   6 ++
 fs/ext4/ialloc.c                      |  19 ++++-
 fs/ext4/inline.c                      |  39 +--------
 fs/ext4/inode.c                       |   3 +-
 fs/ext4/mballoc.c                     |   6 +-
 fs/ext4/super.c                       |  41 +++++++++-
 fs/ext4/xattr.c                       |  49 ++++++------
 fs/hfsplus/dir.c                      |   4 +-
 fs/inode.c                            |   6 ++
 fs/jbd2/transaction.c                 |   2 +-
 fs/jfs/xattr.c                        |  10 ++-
 fs/xfs/xfs_attr_leaf.c                |   5 +-
 fs/xfs/xfs_bmap.c                     |   2 +
 fs/xfs/xfs_icache.c                   |  58 ++++++++++++--
 include/linux/mm_types.h              |   2 +-
 include/linux/sched.h                 |   2 +-
 include/linux/syscalls.h              |   2 +
 include/linux/vmacache.h              |   5 --
 include/uapi/asm-generic/unistd.h     |   4 +-
 include/uapi/linux/seccomp.h          |   4 +
 kernel/futex.c                        |  99 +++++++++++++++++++++--
 kernel/seccomp.c                      | 146 ++++++++++++++++++++++++++++------
 kernel/sys_ni.c                       |   3 +
 mm/vmacache.c                         |  36 ---------
 net/bluetooth/hidp/core.c             |   4 +-
 net/core/sock.c                       |   2 +
 net/ipv4/ip_vti.c                     |   1 +
 sound/core/rawmidi.c                  |  20 +++--
 62 files changed, 847 insertions(+), 487 deletions(-)

-- 
Ben Hutchings
Any sufficiently advanced bug is indistinguishable from a feature.


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 14/63] KVM: x86: Emulator ignores LDTR/TR extended base on LLDT/LTR
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (28 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 25/63] ext4: fix check to prevent initializing reserved inodes Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 15/63] KVM: x86: introduce linear_{read,write}_system Ben Hutchings
                   ` (33 subsequent siblings)
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: akpm, Nadav Amit, Paolo Bonzini

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Nadav Amit <namit@cs.technion.ac.il>

commit e37a75a13cdae5deaa2ea2cbf8d55b5dd08638b6 upstream.

The current implementation ignores the LDTR/TR base high 32-bits on long-mode.
As a result the loaded segment descriptor may be incorrect.

Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[bwh: Backported to 3.16: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 arch/x86/kvm/emulate.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -1467,6 +1467,7 @@ static int __load_segment_descriptor(str
 	ulong desc_addr;
 	int ret;
 	u16 dummy;
+	u32 base3 = 0;
 
 	memset(&seg_desc, 0, sizeof seg_desc);
 
@@ -1597,9 +1598,14 @@ static int __load_segment_descriptor(str
 		ret = write_segment_descriptor(ctxt, selector, &seg_desc);
 		if (ret != X86EMUL_CONTINUE)
 			return ret;
+	} else if (ctxt->mode == X86EMUL_MODE_PROT64) {
+		ret = ctxt->ops->read_std(ctxt, desc_addr+8, &base3,
+				sizeof(base3), &ctxt->exception);
+		if (ret != X86EMUL_CONTINUE)
+			return ret;
 	}
 load:
-	ctxt->ops->set_segment(ctxt, selector, &seg_desc, 0, seg);
+	ctxt->ops->set_segment(ctxt, selector, &seg_desc, base3, seg);
 	if (desc)
 		*desc = seg_desc;
 	return X86EMUL_CONTINUE;


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 13/63] futex: Remove unnecessary warning from get_futex_key
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (12 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 56/63] seccomp: split mode setting routines Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 21/63] Bluetooth: hidp: buffer overflow in hidp_process_report Ben Hutchings
                   ` (49 subsequent siblings)
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: akpm, Peter Zijlstra (Intel), Mel Gorman, Linus Torvalds

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Mel Gorman <mgorman@suse.de>

commit 48fb6f4db940e92cfb16cd878cddd59ea6120d06 upstream.

Commit 65d8fc777f6d ("futex: Remove requirement for lock_page() in
get_futex_key()") removed an unnecessary lock_page() with the
side-effect that page->mapping needed to be treated very carefully.

Two defensive warnings were added in case any assumption was missed and
the first warning assumed a correct application would not alter a
mapping backing a futex key.  Since merging, it has not triggered for
any unexpected case but Mark Rutland reported the following bug
triggering due to the first warning.

  kernel BUG at kernel/futex.c:679!
  Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
  Modules linked in:
  CPU: 0 PID: 3695 Comm: syz-executor1 Not tainted 4.13.0-rc3-00020-g307fec773ba3 #3
  Hardware name: linux,dummy-virt (DT)
  task: ffff80001e271780 task.stack: ffff000010908000
  PC is at get_futex_key+0x6a4/0xcf0 kernel/futex.c:679
  LR is at get_futex_key+0x6a4/0xcf0 kernel/futex.c:679
  pc : [<ffff00000821ac14>] lr : [<ffff00000821ac14>] pstate: 80000145

The fact that it's a bug instead of a warning was due to an unrelated
arm64 problem, but the warning itself triggered because the underlying
mapping changed.

This is an application issue but from a kernel perspective it's a
recoverable situation and the warning is unnecessary so this patch
removes the warning.  The warning may potentially be triggered with the
following test program from Mark although it may be necessary to adjust
NR_FUTEX_THREADS to be a value smaller than the number of CPUs in the
system.

    #include <linux/futex.h>
    #include <pthread.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/mman.h>
    #include <sys/syscall.h>
    #include <sys/time.h>
    #include <unistd.h>

    #define NR_FUTEX_THREADS 16
    pthread_t threads[NR_FUTEX_THREADS];

    void *mem;

    #define MEM_PROT  (PROT_READ | PROT_WRITE)
    #define MEM_SIZE  65536

    static int futex_wrapper(int *uaddr, int op, int val,
                             const struct timespec *timeout,
                             int *uaddr2, int val3)
    {
        syscall(SYS_futex, uaddr, op, val, timeout, uaddr2, val3);
    }

    void *poll_futex(void *unused)
    {
        for (;;) {
            futex_wrapper(mem, FUTEX_CMP_REQUEUE_PI, 1, NULL, mem + 4, 1);
        }
    }

    int main(int argc, char *argv[])
    {
        int i;

        mem = mmap(NULL, MEM_SIZE, MEM_PROT,
               MAP_SHARED | MAP_ANONYMOUS, -1, 0);

        printf("Mapping @ %p\n", mem);

        printf("Creating futex threads...\n");

        for (i = 0; i < NR_FUTEX_THREADS; i++)
            pthread_create(&threads[i], NULL, poll_futex, NULL);

        printf("Flipping mapping...\n");
        for (;;) {
            mmap(mem, MEM_SIZE, MEM_PROT,
                 MAP_FIXED | MAP_SHARED | MAP_ANONYMOUS, -1, 0);
        }

        return 0;
    }

Reported-and-tested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 kernel/futex.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -583,13 +583,14 @@ again:
 		 * this reference was taken by ihold under the page lock
 		 * pinning the inode in place so i_lock was unnecessary. The
 		 * only way for this check to fail is if the inode was
-		 * truncated in parallel so warn for now if this happens.
+		 * truncated in parallel which is almost certainly an
+		 * application bug. In such a case, just retry.
 		 *
 		 * We are not calling into get_futex_key_refs() in file-backed
 		 * cases, therefore a successful atomic_inc return below will
 		 * guarantee that get_futex_key() will still imply smp_mb(); (B).
 		 */
-		if (WARN_ON_ONCE(!atomic_inc_not_zero(&inode->i_count))) {
+		if (!atomic_inc_not_zero(&inode->i_count)) {
 			rcu_read_unlock();
 			put_page(page_head);
 


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 55/63] seccomp: extract check/assign mode helpers
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (22 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 33/63] ext4: never move the system.data xattr out of the inode body Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 16/63] KVM: x86: pass kvm_vcpu to kvm_read_guest_virt and kvm_write_guest_virt_system Ben Hutchings
                   ` (39 subsequent siblings)
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: akpm, Andy Lutomirski, Oleg Nesterov, Kees Cook

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Kees Cook <keescook@chromium.org>

commit 1f41b450416e689b9b7c8bfb750a98604f687a9b upstream.

To support splitting mode 1 from mode 2, extract the mode checking and
assignment logic into common functions.

Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 kernel/seccomp.c | 22 ++++++++++++++++++----
 1 file changed, 18 insertions(+), 4 deletions(-)

--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -194,7 +194,23 @@ static u32 seccomp_run_filters(int sysca
 	}
 	return ret;
 }
+#endif /* CONFIG_SECCOMP_FILTER */
 
+static inline bool seccomp_may_assign_mode(unsigned long seccomp_mode)
+{
+	if (current->seccomp.mode && current->seccomp.mode != seccomp_mode)
+		return false;
+
+	return true;
+}
+
+static inline void seccomp_assign_mode(unsigned long seccomp_mode)
+{
+	current->seccomp.mode = seccomp_mode;
+	set_tsk_thread_flag(current, TIF_SECCOMP);
+}
+
+#ifdef CONFIG_SECCOMP_FILTER
 /**
  * seccomp_attach_filter: Attaches a seccomp filter to current.
  * @fprog: BPF program to install
@@ -490,8 +506,7 @@ static long seccomp_set_mode(unsigned lo
 {
 	long ret = -EINVAL;
 
-	if (current->seccomp.mode &&
-	    current->seccomp.mode != seccomp_mode)
+	if (!seccomp_may_assign_mode(seccomp_mode))
 		goto out;
 
 	switch (seccomp_mode) {
@@ -512,8 +527,7 @@ static long seccomp_set_mode(unsigned lo
 		goto out;
 	}
 
-	current->seccomp.mode = seccomp_mode;
-	set_thread_flag(TIF_SECCOMP);
+	seccomp_assign_mode(seccomp_mode);
 out:
 	return ret;
 }


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 38/63] Fix up non-directory creation in SGID directories
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (10 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 26/63] ext4: verify the depth of extent tree in ext4_find_extent() Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 56/63] seccomp: split mode setting routines Ben Hutchings
                   ` (51 subsequent siblings)
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: akpm, Al Viro, Jann Horn, Linus Torvalds, Andy Lutomirski

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Linus Torvalds <torvalds@linux-foundation.org>

commit 0fa3ecd87848c9c93c2c828ef4c3a8ca36ce46c7 upstream.

sgid directories have special semantics, making newly created files in
the directory belong to the group of the directory, and newly created
subdirectories will also become sgid.  This is historically used for
group-shared directories.

But group directories writable by non-group members should not imply
that such non-group members can magically join the group, so make sure
to clear the sgid bit on non-directories for non-members (but remember
that sgid without group execute means "mandatory locking", just to
confuse things even more).

Reported-by: Jann Horn <jannh@google.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 fs/inode.c | 6 ++++++
 1 file changed, 6 insertions(+)

--- a/fs/inode.c
+++ b/fs/inode.c
@@ -1827,8 +1827,14 @@ void inode_init_owner(struct inode *inod
 	inode->i_uid = current_fsuid();
 	if (dir && dir->i_mode & S_ISGID) {
 		inode->i_gid = dir->i_gid;
+
+		/* Directories are special, and always inherit S_ISGID */
 		if (S_ISDIR(mode))
 			mode |= S_ISGID;
+		else if ((mode & (S_ISGID | S_IXGRP)) == (S_ISGID | S_IXGRP) &&
+			 !in_group_p(inode->i_gid) &&
+			 !capable_wrt_inode_uidgid(dir, CAP_FSETID))
+			mode &= ~S_ISGID;
 	} else
 		inode->i_gid = current_fsgid();
 	inode->i_mode = mode;


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 21/63] Bluetooth: hidp: buffer overflow in hidp_process_report
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (13 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 13/63] futex: Remove unnecessary warning from get_futex_key Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 32/63] ext4: always verify the magic number in xattr blocks Ben Hutchings
                   ` (48 subsequent siblings)
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: akpm, netdev, Benjamin Tissoires, Kees Cook, David S. Miller,
	linux-bluetooth, kernel-team, security, Johan Hedberg,
	Marcel Holtmann, Mark Salyzyn

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Mark Salyzyn <salyzyn@android.com>

commit 7992c18810e568b95c869b227137a2215702a805 upstream.

CVE-2018-9363

The buffer length is unsigned at all layers, but gets cast to int and
checked in hidp_process_report and can lead to a buffer overflow.
Switch len parameter to unsigned int to resolve issue.

This affects 3.18 and newer kernels.

Signed-off-by: Mark Salyzyn <salyzyn@android.com>
Fixes: a4b1b5877b514b276f0f31efe02388a9c2836728 ("HID: Bluetooth: hidp: make sure input buffers are big enough")
Cc: Marcel Holtmann <marcel@holtmann.org>
Cc: Johan Hedberg <johan.hedberg@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Kees Cook <keescook@chromium.org>
Cc: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Cc: linux-bluetooth@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: security@kernel.org
Cc: kernel-team@android.com
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 net/bluetooth/hidp/core.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/net/bluetooth/hidp/core.c
+++ b/net/bluetooth/hidp/core.c
@@ -429,8 +429,8 @@ static void hidp_del_timer(struct hidp_s
 		del_timer(&session->timer);
 }
 
-static void hidp_process_report(struct hidp_session *session,
-				int type, const u8 *data, int len, int intr)
+static void hidp_process_report(struct hidp_session *session, int type,
+				const u8 *data, unsigned int len, int intr)
 {
 	if (len > HID_MAX_BUFFER_SIZE)
 		len = HID_MAX_BUFFER_SIZE;


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 22/63] scsi: libsas: defer ata device eh commands to libata
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (47 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 19/63] jfs: Fix inconsistency between memory allocation and ea_buf->max_size Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 05/63] usbip: fix error handling in stub_probe() Ben Hutchings
                   ` (14 subsequent siblings)
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: akpm, Dan Williams, Xiaofei Tan, Martin K. Petersen, John Garry,
	Jason Yan

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Jason Yan <yanaijie@huawei.com>

commit 318aaf34f1179b39fa9c30fa0f3288b645beee39 upstream.

When ata device doing EH, some commands still attached with tasks are
not passed to libata when abort failed or recover failed, so libata did
not handle these commands. After these commands done, sas task is freed,
but ata qc is not freed. This will cause ata qc leak and trigger a
warning like below:

WARNING: CPU: 0 PID: 28512 at drivers/ata/libata-eh.c:4037
ata_eh_finish+0xb4/0xcc
CPU: 0 PID: 28512 Comm: kworker/u32:2 Tainted: G     W  OE 4.14.0#1
......
Call trace:
[<ffff0000088b7bd0>] ata_eh_finish+0xb4/0xcc
[<ffff0000088b8420>] ata_do_eh+0xc4/0xd8
[<ffff0000088b8478>] ata_std_error_handler+0x44/0x8c
[<ffff0000088b8068>] ata_scsi_port_error_handler+0x480/0x694
[<ffff000008875fc4>] async_sas_ata_eh+0x4c/0x80
[<ffff0000080f6be8>] async_run_entry_fn+0x4c/0x170
[<ffff0000080ebd70>] process_one_work+0x144/0x390
[<ffff0000080ec100>] worker_thread+0x144/0x418
[<ffff0000080f2c98>] kthread+0x10c/0x138
[<ffff0000080855dc>] ret_from_fork+0x10/0x18

If ata qc leaked too many, ata tag allocation will fail and io blocked
for ever.

As suggested by Dan Williams, defer ata device commands to libata and
merge sas_eh_finish_cmd() with sas_eh_defer_cmd(). libata will handle
ata qcs correctly after this.

Signed-off-by: Jason Yan <yanaijie@huawei.com>
CC: Xiaofei Tan <tanxiaofei@huawei.com>
CC: John Garry <john.garry@huawei.com>
CC: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 drivers/scsi/libsas/sas_scsi_host.c | 33 ++++++++++++-----------------
 1 file changed, 13 insertions(+), 20 deletions(-)

--- a/drivers/scsi/libsas/sas_scsi_host.c
+++ b/drivers/scsi/libsas/sas_scsi_host.c
@@ -250,6 +250,7 @@ out_done:
 static void sas_eh_finish_cmd(struct scsi_cmnd *cmd)
 {
 	struct sas_ha_struct *sas_ha = SHOST_TO_SAS_HA(cmd->device->host);
+	struct domain_device *dev = cmd_to_domain_dev(cmd);
 	struct sas_task *task = TO_SAS_TASK(cmd);
 
 	/* At this point, we only get called following an actual abort
@@ -258,6 +259,14 @@ static void sas_eh_finish_cmd(struct scs
 	 */
 	sas_end_task(cmd, task);
 
+	if (dev_is_sata(dev)) {
+		/* defer commands to libata so that libata EH can
+		 * handle ata qcs correctly
+		 */
+		list_move_tail(&cmd->eh_entry, &sas_ha->eh_ata_q);
+		return;
+	}
+
 	/* now finish the command and move it on to the error
 	 * handler done list, this also takes it off the
 	 * error handler pending list.
@@ -265,22 +274,6 @@ static void sas_eh_finish_cmd(struct scs
 	scsi_eh_finish_cmd(cmd, &sas_ha->eh_done_q);
 }
 
-static void sas_eh_defer_cmd(struct scsi_cmnd *cmd)
-{
-	struct domain_device *dev = cmd_to_domain_dev(cmd);
-	struct sas_ha_struct *ha = dev->port->ha;
-	struct sas_task *task = TO_SAS_TASK(cmd);
-
-	if (!dev_is_sata(dev)) {
-		sas_eh_finish_cmd(cmd);
-		return;
-	}
-
-	/* report the timeout to libata */
-	sas_end_task(cmd, task);
-	list_move_tail(&cmd->eh_entry, &ha->eh_ata_q);
-}
-
 static void sas_scsi_clear_queue_lu(struct list_head *error_q, struct scsi_cmnd *my_cmd)
 {
 	struct scsi_cmnd *cmd, *n;
@@ -288,7 +281,7 @@ static void sas_scsi_clear_queue_lu(stru
 	list_for_each_entry_safe(cmd, n, error_q, eh_entry) {
 		if (cmd->device->sdev_target == my_cmd->device->sdev_target &&
 		    cmd->device->lun == my_cmd->device->lun)
-			sas_eh_defer_cmd(cmd);
+			sas_eh_finish_cmd(cmd);
 	}
 }
 
@@ -677,12 +670,12 @@ static void sas_eh_handle_sas_errors(str
 		case TASK_IS_DONE:
 			SAS_DPRINTK("%s: task 0x%p is done\n", __func__,
 				    task);
-			sas_eh_defer_cmd(cmd);
+			sas_eh_finish_cmd(cmd);
 			continue;
 		case TASK_IS_ABORTED:
 			SAS_DPRINTK("%s: task 0x%p is aborted\n",
 				    __func__, task);
-			sas_eh_defer_cmd(cmd);
+			sas_eh_finish_cmd(cmd);
 			continue;
 		case TASK_IS_AT_LU:
 			SAS_DPRINTK("task 0x%p is at LU: lu recover\n", task);
@@ -693,7 +686,7 @@ static void sas_eh_handle_sas_errors(str
 					    "recovered\n",
 					    SAS_ADDR(task->dev),
 					    cmd->device->lun);
-				sas_eh_defer_cmd(cmd);
+				sas_eh_finish_cmd(cmd);
 				sas_scsi_clear_queue_lu(work_q, cmd);
 				goto Again;
 			}


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 20/63] scsi: sg: allocate with __GFP_ZERO in sg_build_indirect()
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (19 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 57/63] seccomp: add "seccomp" syscall Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:19   ` syzbot
  2018-09-22  0:15 ` [PATCH 3.16 04/63] net: Set sk_prot_creator when cloning sockets to the right proto Ben Hutchings
                   ` (42 subsequent siblings)
  63 siblings, 1 reply; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: akpm, syzbot+7d26fc1eea198488deab, Johannes Thumshirn,
	Alexander Potapenko, Douglas Gilbert, Martin K. Petersen

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Alexander Potapenko <glider@google.com>

commit a45b599ad808c3c982fdcdc12b0b8611c2f92824 upstream.

This shall help avoid copying uninitialized memory to the userspace when
calling ioctl(fd, SG_IO) with an empty command.

Reported-by: syzbot+7d26fc1eea198488deab@syzkaller.appspotmail.com
Signed-off-by: Alexander Potapenko <glider@google.com>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 drivers/scsi/sg.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -1825,7 +1825,7 @@ retry:
 		num = (rem_sz > scatter_elem_sz_prev) ?
 			scatter_elem_sz_prev : rem_sz;
 
-		schp->pages[k] = alloc_pages(gfp_mask, order);
+		schp->pages[k] = alloc_pages(gfp_mask | __GFP_ZERO, order);
 		if (!schp->pages[k])
 			goto out;
 


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 25/63] ext4: fix check to prevent initializing reserved inodes
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (27 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 27/63] ext4: always check block group bounds in ext4_init_block_bitmap() Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 14/63] KVM: x86: Emulator ignores LDTR/TR extended base on LLDT/LTR Ben Hutchings
                   ` (34 subsequent siblings)
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: akpm, Theodore Ts'o, Eric Whitney

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Theodore Ts'o <tytso@mit.edu>

commit 5012284700775a4e6e3fbe7eac4c543c4874b559 upstream.

Commit 8844618d8aa7: "ext4: only look at the bg_flags field if it is
valid" will complain if block group zero does not have the
EXT4_BG_INODE_ZEROED flag set.  Unfortunately, this is not correct,
since a freshly created file system has this flag cleared.  It gets
almost immediately after the file system is mounted read-write --- but
the following somewhat unlikely sequence will end up triggering a
false positive report of a corrupted file system:

   mkfs.ext4 /dev/vdc
   mount -o ro /dev/vdc /vdc
   mount -o remount,rw /dev/vdc

Instead, when initializing the inode table for block group zero, test
to make sure that itable_unused count is not too large, since that is
the case that will result in some or all of the reserved inodes
getting cleared.

This fixes the failures reported by Eric Whiteney when running
generic/230 and generic/231 in the the nojournal test case.

Fixes: 8844618d8aa7 ("ext4: only look at the bg_flags field if it is valid")
Reported-by: Eric Whitney <enwlinux@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
[bwh: Backported to 3.16: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 fs/ext4/ialloc.c | 5 ++++-
 fs/ext4/super.c  | 8 +-------
 2 files changed, 5 insertions(+), 8 deletions(-)

--- a/fs/ext4/ialloc.c
+++ b/fs/ext4/ialloc.c
@@ -1289,7 +1289,10 @@ int ext4_init_inode_table(struct super_b
 			    ext4_itable_unused_count(sb, gdp)),
 			    sbi->s_inodes_per_block);
 
-	if ((used_blks < 0) || (used_blks > sbi->s_itb_per_group)) {
+	if ((used_blks < 0) || (used_blks > sbi->s_itb_per_group) ||
+	    ((group == 0) && ((EXT4_INODES_PER_GROUP(sb) -
+			       ext4_itable_unused_count(sb, gdp)) <
+			      EXT4_FIRST_INO(sb)))) {
 		ext4_error(sb, "Something is wrong with group %u: "
 			   "used itable blocks: %d; "
 			   "itable unused count: %u",
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -3088,14 +3088,8 @@ static ext4_group_t ext4_has_uninit_itab
 		if (!gdp)
 			continue;
 
-		if (gdp->bg_flags & cpu_to_le16(EXT4_BG_INODE_ZEROED))
-			continue;
-		if (group != 0)
+		if (!(gdp->bg_flags & cpu_to_le16(EXT4_BG_INODE_ZEROED)))
 			break;
-		ext4_error(sb, "Inode table for bg 0 marked as "
-			   "needing zeroing");
-		if (sb->s_flags & MS_RDONLY)
-			return ngroups;
 	}
 
 	return group;


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 23/63] xfs: set format back to extents if xfs_bmap_extents_to_btree
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (15 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 32/63] ext4: always verify the magic number in xattr blocks Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 08/63] usbip: usbip_host: delete device from busid_table after rebind Ben Hutchings
                   ` (46 subsequent siblings)
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: akpm, Eric Sandeen, Darrick J. Wong, Christoph Hellwig

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Sandeen <sandeen@redhat.com>

commit 2c4306f719b083d17df2963bc761777576b8ad1b upstream.

If xfs_bmap_extents_to_btree fails in a mode where we call
xfs_iroot_realloc(-1) to de-allocate the root, set the
format back to extents.

Otherwise we can assume we can dereference ifp->if_broot
based on the XFS_DINODE_FMT_BTREE format, and crash.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=199423
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
[bwh: Backported to 3.16:
 - Only one failure path needs to be patched
 - Adjust filename]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 fs/xfs/xfs_bmap.c | 4 ++++
 1 file changed, 4 insertions(+)

--- a/fs/xfs/xfs_bmap.c
+++ b/fs/xfs/xfs_bmap.c
@@ -822,6 +822,8 @@ xfs_bmap_extents_to_btree(
 	*logflagsp = 0;
 	if ((error = xfs_alloc_vextent(&args))) {
 		xfs_iroot_realloc(ip, -1, whichfork);
+		ASSERT(ifp->if_broot == NULL);
+		XFS_IFORK_FMT_SET(ip, whichfork, XFS_DINODE_FMT_EXTENTS);
 		xfs_btree_del_cursor(cur, XFS_BTREE_ERROR);
 		return error;
 	}


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 19/63] jfs: Fix inconsistency between memory allocation and ea_buf->max_size
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (46 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 48/63] video: uvesafb: Fix integer overflow in allocation Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 22/63] scsi: libsas: defer ata device eh commands to libata Ben Hutchings
                   ` (15 subsequent siblings)
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: akpm, Dave Kleikamp, Shankara Pailoor

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Shankara Pailoor <shankarapailoor@gmail.com>

commit 92d34134193e5b129dc24f8d79cb9196626e8d7a upstream.

The code is assuming the buffer is max_size length, but we weren't
allocating enough space for it.

Signed-off-by: Shankara Pailoor <shankarapailoor@gmail.com>
Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 fs/jfs/xattr.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

--- a/fs/jfs/xattr.c
+++ b/fs/jfs/xattr.c
@@ -493,15 +493,17 @@ static int ea_get(struct inode *inode, s
 	if (size > PSIZE) {
 		/*
 		 * To keep the rest of the code simple.  Allocate a
-		 * contiguous buffer to work with
+		 * contiguous buffer to work with. Make the buffer large
+		 * enough to make use of the whole extent.
 		 */
-		ea_buf->xattr = kmalloc(size, GFP_KERNEL);
+		ea_buf->max_size = (size + sb->s_blocksize - 1) &
+		    ~(sb->s_blocksize - 1);
+
+		ea_buf->xattr = kmalloc(ea_buf->max_size, GFP_KERNEL);
 		if (ea_buf->xattr == NULL)
 			return -ENOMEM;
 
 		ea_buf->flag = EA_MALLOC;
-		ea_buf->max_size = (size + sb->s_blocksize - 1) &
-		    ~(sb->s_blocksize - 1);
 
 		if (ea_size == 0)
 			return 0;


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 24/63] ext4: only look at the bg_flags field if it is valid
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (40 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 49/63] btrfs: relocation: Only remove reloc rb_trees if reloc control has been initialized Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 53/63] xfs: don't call xfs_da_shrink_inode with NULL bp Ben Hutchings
                   ` (21 subsequent siblings)
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: akpm, Theodore Ts'o

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Theodore Ts'o <tytso@mit.edu>

commit 8844618d8aa7a9973e7b527d038a2a589665002c upstream.

The bg_flags field in the block group descripts is only valid if the
uninit_bg or metadata_csum feature is enabled.  We were not
consistently looking at this field; fix this.

Also block group #0 must never have uninitialized allocation bitmaps,
or need to be zeroed, since that's where the root inode, and other
special inodes are set up.  Check for these conditions and mark the
file system as corrupted if they are detected.

This addresses CVE-2018-10876.

https://bugzilla.kernel.org/show_bug.cgi?id=199403

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
[bwh: Backported to 3.16:
 - ext4_read_block_bitmap_nowait() and ext4_read_inode_bitmap() return
   a pointer (NULL on error) instead of an error code
 - Open-code sb_rdonly()
 - Adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 fs/ext4/balloc.c  | 11 ++++++++++-
 fs/ext4/ialloc.c  | 14 ++++++++++++--
 fs/ext4/mballoc.c |  6 ++++--
 fs/ext4/super.c   | 11 ++++++++++-
 4 files changed, 36 insertions(+), 6 deletions(-)

--- a/fs/ext4/balloc.c
+++ b/fs/ext4/balloc.c
@@ -453,9 +453,18 @@ ext4_read_block_bitmap_nowait(struct sup
 		goto verify;
 	}
 	ext4_lock_group(sb, block_group);
-	if (desc->bg_flags & cpu_to_le16(EXT4_BG_BLOCK_UNINIT)) {
+	if (ext4_has_group_desc_csum(sb) &&
+	    (desc->bg_flags & cpu_to_le16(EXT4_BG_BLOCK_UNINIT))) {
 		int err;
 
+		if (block_group == 0) {
+			ext4_unlock_group(sb, block_group);
+			unlock_buffer(bh);
+			ext4_error(sb, "Block bitmap for bg 0 marked "
+				   "uninitialized");
+			put_bh(bh);
+			return NULL;
+		}
 		err = ext4_init_block_bitmap(sb, bh, block_group, desc);
 		set_bitmap_uptodate(bh);
 		set_buffer_uptodate(bh);
--- a/fs/ext4/ialloc.c
+++ b/fs/ext4/ialloc.c
@@ -156,7 +156,16 @@ ext4_read_inode_bitmap(struct super_bloc
 	}
 
 	ext4_lock_group(sb, block_group);
-	if (desc->bg_flags & cpu_to_le16(EXT4_BG_INODE_UNINIT)) {
+	if (ext4_has_group_desc_csum(sb) &&
+	    (desc->bg_flags & cpu_to_le16(EXT4_BG_INODE_UNINIT))) {
+		if (block_group == 0) {
+			ext4_unlock_group(sb, block_group);
+			unlock_buffer(bh);
+			ext4_error(sb, "Inode bitmap for bg 0 marked "
+				   "uninitialized");
+			put_bh(bh);
+			return NULL;
+		}
 		ext4_init_inode_bitmap(sb, bh, block_group, desc);
 		set_bitmap_uptodate(bh);
 		set_buffer_uptodate(bh);
@@ -910,7 +919,8 @@ got:
 
 		/* recheck and clear flag under lock if we still need to */
 		ext4_lock_group(sb, group);
-		if (gdp->bg_flags & cpu_to_le16(EXT4_BG_BLOCK_UNINIT)) {
+		if (ext4_has_group_desc_csum(sb) &&
+		    (gdp->bg_flags & cpu_to_le16(EXT4_BG_BLOCK_UNINIT))) {
 			gdp->bg_flags &= cpu_to_le16(~EXT4_BG_BLOCK_UNINIT);
 			ext4_free_group_clusters_set(sb, gdp,
 				ext4_free_clusters_after_init(sb, group, gdp));
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -2418,7 +2418,8 @@ int ext4_mb_add_groupinfo(struct super_b
 	 * initialize bb_free to be able to skip
 	 * empty groups without initialization
 	 */
-	if (desc->bg_flags & cpu_to_le16(EXT4_BG_BLOCK_UNINIT)) {
+	if (ext4_has_group_desc_csum(sb) &&
+	    (desc->bg_flags & cpu_to_le16(EXT4_BG_BLOCK_UNINIT))) {
 		meta_group_info[i]->bb_free =
 			ext4_free_clusters_after_init(sb, group, desc);
 	} else {
@@ -2943,7 +2944,8 @@ ext4_mb_mark_diskspace_used(struct ext4_
 #endif
 	ext4_set_bits(bitmap_bh->b_data, ac->ac_b_ex.fe_start,
 		      ac->ac_b_ex.fe_len);
-	if (gdp->bg_flags & cpu_to_le16(EXT4_BG_BLOCK_UNINIT)) {
+	if (ext4_has_group_desc_csum(sb) &&
+	    (gdp->bg_flags & cpu_to_le16(EXT4_BG_BLOCK_UNINIT))) {
 		gdp->bg_flags &= cpu_to_le16(~EXT4_BG_BLOCK_UNINIT);
 		ext4_free_group_clusters_set(sb, gdp,
 					     ext4_free_clusters_after_init(sb,
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -3080,13 +3080,22 @@ static ext4_group_t ext4_has_uninit_itab
 	ext4_group_t group, ngroups = EXT4_SB(sb)->s_groups_count;
 	struct ext4_group_desc *gdp = NULL;
 
+	if (!ext4_has_group_desc_csum(sb))
+		return ngroups;
+
 	for (group = 0; group < ngroups; group++) {
 		gdp = ext4_get_group_desc(sb, group, NULL);
 		if (!gdp)
 			continue;
 
-		if (!(gdp->bg_flags & cpu_to_le16(EXT4_BG_INODE_ZEROED)))
+		if (gdp->bg_flags & cpu_to_le16(EXT4_BG_INODE_ZEROED))
+			continue;
+		if (group != 0)
 			break;
+		ext4_error(sb, "Inode table for bg 0 marked as "
+			   "needing zeroing");
+		if (sb->s_flags & MS_RDONLY)
+			return ngroups;
 	}
 
 	return group;


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 35/63] ext4: add more inode number paranoia checks
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (24 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 16/63] KVM: x86: pass kvm_vcpu to kvm_read_guest_virt and kvm_write_guest_virt_system Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 42/63] ALSA: rawmidi: Change resized buffers atomically Ben Hutchings
                   ` (37 subsequent siblings)
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: akpm, Theodore Ts'o

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Theodore Ts'o <tytso@mit.edu>

commit c37e9e013469521d9adb932d17a1795c139b36db upstream.

If there is a directory entry pointing to a system inode (such as a
journal inode), complain and declare the file system to be corrupted.

Also, if the superblock's first inode number field is too small,
refuse to mount the file system.

This addresses CVE-2018-10882.

https://bugzilla.kernel.org/show_bug.cgi?id=200069

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
[bwh: Backported to 3.16: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 fs/ext4/ext4.h  | 5 -----
 fs/ext4/inode.c | 3 ++-
 fs/ext4/super.c | 5 +++++
 3 files changed, 7 insertions(+), 6 deletions(-)

--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1422,11 +1422,6 @@ static inline struct timespec ext4_curre
 static inline int ext4_valid_inum(struct super_block *sb, unsigned long ino)
 {
 	return ino == EXT4_ROOT_INO ||
-		ino == EXT4_USR_QUOTA_INO ||
-		ino == EXT4_GRP_QUOTA_INO ||
-		ino == EXT4_BOOT_LOADER_INO ||
-		ino == EXT4_JOURNAL_INO ||
-		ino == EXT4_RESIZE_INO ||
 		(ino >= EXT4_FIRST_INO(sb) &&
 		 ino <= le32_to_cpu(EXT4_SB(sb)->s_es->s_inodes_count));
 }
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3957,7 +3957,8 @@ static int __ext4_get_inode_loc(struct i
 	int			inodes_per_block, inode_offset;
 
 	iloc->bh = NULL;
-	if (!ext4_valid_inum(sb, inode->i_ino))
+	if (inode->i_ino < EXT4_ROOT_INO ||
+	    inode->i_ino > le32_to_cpu(EXT4_SB(sb)->s_es->s_inodes_count))
 		return -EIO;
 
 	iloc->block_group = (inode->i_ino - 1) / EXT4_INODES_PER_GROUP(sb);
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -3771,6 +3771,11 @@ static int ext4_fill_super(struct super_
 	} else {
 		sbi->s_inode_size = le16_to_cpu(es->s_inode_size);
 		sbi->s_first_ino = le32_to_cpu(es->s_first_ino);
+		if (sbi->s_first_ino < EXT4_GOOD_OLD_FIRST_INO) {
+			ext4_msg(sb, KERN_ERR, "invalid first ino: %u",
+				 sbi->s_first_ino);
+			goto failed_mount;
+		}
 		if ((sbi->s_inode_size < EXT4_GOOD_OLD_INODE_SIZE) ||
 		    (!is_power_of_2(sbi->s_inode_size)) ||
 		    (sbi->s_inode_size > blocksize)) {


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 40/63] infiniband: fix a possible use-after-free bug
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (31 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 61/63] x86/cpu/intel: Add Knights Mill to Intel family Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 09/63] usbip: usbip_host: run rebind from exit when module is removed Ben Hutchings
                   ` (30 subsequent siblings)
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: akpm, Noam Rathaus, Cong Wang, Jason Gunthorpe

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Cong Wang <xiyou.wangcong@gmail.com>

commit cb2595c1393b4a5211534e6f0a0fbad369e21ad8 upstream.

ucma_process_join() will free the new allocated "mc" struct,
if there is any error after that, especially the copy_to_user().

But in parallel, ucma_leave_multicast() could find this "mc"
through idr_find() before ucma_process_join() frees it, since it
is already published.

So "mc" could be used in ucma_leave_multicast() after it is been
allocated and freed in ucma_process_join(), since we don't refcnt
it.

Fix this by separating "publish" from ID allocation, so that we
can get an ID first and publish it later after copy_to_user().

Fixes: c8f6a362bf3e ("RDMA/cma: Add multicast communication support")
Reported-by: Noam Rathaus <noamr@beyondsecurity.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 drivers/infiniband/core/ucma.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

--- a/drivers/infiniband/core/ucma.c
+++ b/drivers/infiniband/core/ucma.c
@@ -180,7 +180,7 @@ static struct ucma_multicast* ucma_alloc
 		return NULL;
 
 	mutex_lock(&mut);
-	mc->id = idr_alloc(&multicast_idr, mc, 0, 0, GFP_KERNEL);
+	mc->id = idr_alloc(&multicast_idr, NULL, 0, 0, GFP_KERNEL);
 	mutex_unlock(&mut);
 	if (mc->id < 0)
 		goto error;
@@ -1285,6 +1285,10 @@ static ssize_t ucma_process_join(struct
 		goto err3;
 	}
 
+	mutex_lock(&mut);
+	idr_replace(&multicast_idr, mc, mc->id);
+	mutex_unlock(&mut);
+
 	mutex_unlock(&file->mut);
 	ucma_put_ctx(ctx);
 	return 0;


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 42/63] ALSA: rawmidi: Change resized buffers atomically
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (25 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 35/63] ext4: add more inode number paranoia checks Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 27/63] ext4: always check block group bounds in ext4_init_block_bitmap() Ben Hutchings
                   ` (36 subsequent siblings)
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: akpm, syzbot+52f83f0ea8df16932f7f, Takashi Iwai

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Takashi Iwai <tiwai@suse.de>

commit 39675f7a7c7e7702f7d5341f1e0d01db746543a0 upstream.

The SNDRV_RAWMIDI_IOCTL_PARAMS ioctl may resize the buffers and the
current code is racy.  For example, the sequencer client may write to
buffer while it being resized.

As a simple workaround, let's switch to the resized buffer inside the
stream runtime lock.

Reported-by: syzbot+52f83f0ea8df16932f7f@syzkaller.appspotmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 sound/core/rawmidi.c | 20 ++++++++++++++------
 1 file changed, 14 insertions(+), 6 deletions(-)

--- a/sound/core/rawmidi.c
+++ b/sound/core/rawmidi.c
@@ -645,7 +645,7 @@ static int snd_rawmidi_info_select_user(
 int snd_rawmidi_output_params(struct snd_rawmidi_substream *substream,
 			      struct snd_rawmidi_params * params)
 {
-	char *newbuf;
+	char *newbuf, *oldbuf;
 	struct snd_rawmidi_runtime *runtime = substream->runtime;
 	
 	if (substream->append && substream->use_count > 1)
@@ -658,13 +658,17 @@ int snd_rawmidi_output_params(struct snd
 		return -EINVAL;
 	}
 	if (params->buffer_size != runtime->buffer_size) {
-		newbuf = krealloc(runtime->buffer, params->buffer_size,
-				  GFP_KERNEL);
+		newbuf = kmalloc(params->buffer_size, GFP_KERNEL);
 		if (!newbuf)
 			return -ENOMEM;
+		spin_lock_irq(&runtime->lock);
+		oldbuf = runtime->buffer;
 		runtime->buffer = newbuf;
 		runtime->buffer_size = params->buffer_size;
 		runtime->avail = runtime->buffer_size;
+		runtime->appl_ptr = runtime->hw_ptr = 0;
+		spin_unlock_irq(&runtime->lock);
+		kfree(oldbuf);
 	}
 	runtime->avail_min = params->avail_min;
 	substream->active_sensing = !params->no_active_sensing;
@@ -675,7 +679,7 @@ EXPORT_SYMBOL(snd_rawmidi_output_params)
 int snd_rawmidi_input_params(struct snd_rawmidi_substream *substream,
 			     struct snd_rawmidi_params * params)
 {
-	char *newbuf;
+	char *newbuf, *oldbuf;
 	struct snd_rawmidi_runtime *runtime = substream->runtime;
 
 	snd_rawmidi_drain_input(substream);
@@ -686,12 +690,16 @@ int snd_rawmidi_input_params(struct snd_
 		return -EINVAL;
 	}
 	if (params->buffer_size != runtime->buffer_size) {
-		newbuf = krealloc(runtime->buffer, params->buffer_size,
-				  GFP_KERNEL);
+		newbuf = kmalloc(params->buffer_size, GFP_KERNEL);
 		if (!newbuf)
 			return -ENOMEM;
+		spin_lock_irq(&runtime->lock);
+		oldbuf = runtime->buffer;
 		runtime->buffer = newbuf;
 		runtime->buffer_size = params->buffer_size;
+		runtime->appl_ptr = runtime->hw_ptr = 0;
+		spin_unlock_irq(&runtime->lock);
+		kfree(oldbuf);
 	}
 	runtime->avail_min = params->avail_min;
 	return 0;


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 32/63] ext4: always verify the magic number in xattr blocks
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (14 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 21/63] Bluetooth: hidp: buffer overflow in hidp_process_report Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 23/63] xfs: set format back to extents if xfs_bmap_extents_to_btree Ben Hutchings
                   ` (47 subsequent siblings)
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: akpm, Andreas Dilger, Theodore Ts'o

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Theodore Ts'o <tytso@mit.edu>

commit 513f86d73855ce556ea9522b6bfd79f87356dc3a upstream.

If there an inode points to a block which is also some other type of
metadata block (such as a block allocation bitmap), the
buffer_verified flag can be set when it was validated as that other
metadata block type; however, it would make a really terrible external
attribute block.  The reason why we use the verified flag is to avoid
constantly reverifying the block.  However, it doesn't take much
overhead to make sure the magic number of the xattr block is correct,
and this will avoid potential crashes.

This addresses CVE-2018-10879.

https://bugzilla.kernel.org/show_bug.cgi?id=200001

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
[bwh: Backported to 3.16: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 fs/ext4/xattr.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- a/fs/ext4/xattr.c
+++ b/fs/ext4/xattr.c
@@ -213,12 +213,12 @@ ext4_xattr_check_block(struct inode *ino
 {
 	int error;
 
-	if (buffer_verified(bh))
-		return 0;
-
 	if (BHDR(bh)->h_magic != cpu_to_le32(EXT4_XATTR_MAGIC) ||
 	    BHDR(bh)->h_blocks != cpu_to_le32(1))
 		return -EIO;
+	if (buffer_verified(bh))
+		return 0;
+
 	if (!ext4_xattr_block_csum_verify(inode, bh))
 		return -EIO;
 	error = ext4_xattr_check_names(BFIRST(bh), bh->b_data + bh->b_size,


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 39/63] x86/entry/64: Remove %ebx handling from error_entry/exit
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (58 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 31/63] ext4: add corruption check in ext4_xattr_set_entry() Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 59/63] x86/process: Correct and optimize TIF_BLOCKSTEP switch Ben Hutchings
                   ` (3 subsequent siblings)
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: akpm, H. Peter Anvin, Dave Hansen, Greg KH, Ingo Molnar,
	Andy Lutomirski, Brian Gerst, Josh Poimboeuf, Juergen Gross,
	Peter Zijlstra, Denys Vlasenko, Dominik Brodowski,
	Boris Ostrovsky, xen-devel, Thomas Gleixner, Linus Torvalds,
	Borislav Petkov

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <luto@kernel.org>

commit b3681dd548d06deb2e1573890829dff4b15abf46 upstream.

error_entry and error_exit communicate the user vs. kernel status of
the frame using %ebx.  This is unnecessary -- the information is in
regs->cs.  Just use regs->cs.

This makes error_entry simpler and makes error_exit more robust.

It also fixes a nasty bug.  Before all the Spectre nonsense, the
xen_failsafe_callback entry point returned like this:

        ALLOC_PT_GPREGS_ON_STACK
        SAVE_C_REGS
        SAVE_EXTRA_REGS
        ENCODE_FRAME_POINTER
        jmp     error_exit

And it did not go through error_entry.  This was bogus: RBX
contained garbage, and error_exit expected a flag in RBX.

Fortunately, it generally contained *nonzero* garbage, so the
correct code path was used.  As part of the Spectre fixes, code was
added to clear RBX to mitigate certain speculation attacks.  Now,
depending on kernel configuration, RBX got zeroed and, when running
some Wine workloads, the kernel crashes.  This was introduced by:

    commit 3ac6d8c787b8 ("x86/entry/64: Clear registers for exceptions/interrupts, to reduce speculation attack surface")

With this patch applied, RBX is no longer needed as a flag, and the
problem goes away.

I suspect that malicious userspace could use this bug to crash the
kernel even without the offending patch applied, though.

[ Historical note: I wrote this patch as a cleanup before I was aware
  of the bug it fixed. ]

[ Note to stable maintainers: this should probably get applied to all
  kernels.  If you're nervous about that, a more conservative fix to
  add xorl %ebx,%ebx; incl %ebx before the jump to error_exit should
  also fix the problem. ]

Reported-and-tested-by: M. Vefa Bicakci <m.v.b@runbox.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: xen-devel@lists.xenproject.org
Fixes: 3ac6d8c787b8 ("x86/entry/64: Clear registers for exceptions/interrupts, to reduce speculation attack surface")
Link: http://lkml.kernel.org/r/b5010a090d3586b2d6e06c7ad3ec5542d1241c45.1532282627.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
[bwh: Backported to 3.16:
 - error_exit moved EBX to EAX before testing it, so delete both instructions
 - error_exit does RESTORE_REST earlier, so adjust the offset to saved CS
   accordingly
 - Drop inapplicable comment changes
 - Adjust filename, context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -1135,7 +1135,7 @@ ENTRY(\sym)
 	.if \paranoid
 	jmp paranoid_exit		/* %ebx: no swapgs flag */
 	.else
-	jmp error_exit			/* %ebx: no swapgs flag */
+	jmp error_exit
 	.endif
 
 	CFI_ENDPROC
@@ -1411,7 +1411,6 @@ END(paranoid_exit)
 
 /*
  * Exception entry point. This expects an error code/orig_rax on the stack.
- * returns in "no swapgs flag" in %ebx.
  */
 ENTRY(error_entry)
 	XCPT_FRAME
@@ -1440,7 +1439,6 @@ ENTRY(error_entry)
 	 * the kernel CR3 here.
 	 */
 	SWITCH_KERNEL_CR3
-	xorl %ebx,%ebx
 	testl $3,CS+8(%rsp)
 	je error_kernelspace
 error_swapgs:
@@ -1456,7 +1454,6 @@ error_sti:
  * for these here too.
  */
 error_kernelspace:
-	incl %ebx
 	leaq native_irq_return_iret(%rip),%rcx
 	cmpq %rcx,RIP+8(%rsp)
 	je error_bad_iret
@@ -1477,22 +1474,18 @@ error_bad_iret:
 	mov %rsp,%rdi
 	call fixup_bad_iret
 	mov %rax,%rsp
-	decl %ebx	/* Return to usergs */
 	jmp error_sti
 	CFI_ENDPROC
 END(error_entry)
 
-
-/* ebx:	no swapgs flag (1: don't need swapgs, 0: need it) */
 ENTRY(error_exit)
 	DEFAULT_FRAME
-	movl %ebx,%eax
 	RESTORE_REST
 	DISABLE_INTERRUPTS(CLBR_NONE)
 	TRACE_IRQS_OFF
 	GET_THREAD_INFO(%rcx)
-	testl %eax,%eax
-	jne retint_kernel
+	testb	$3, CS-ARGOFFSET(%rsp)
+	jz	retint_kernel
 	LOCKDEP_SYS_EXIT_IRQ
 	movl TI_flags(%rcx),%edx
 	movl $_TIF_WORK_MASK,%edi


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 34/63] ext4: clear i_data in ext4_inode_info when removing inline data
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (53 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 03/63] Revert "vti4: Don't override MTU passed on link creation via IFLA_MTU" Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 01/63] x86/fpu: Fix the 'nofxsr' boot parameter to also clear X86_FEATURE_FXSR_OPT Ben Hutchings
                   ` (8 subsequent siblings)
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: akpm, Theodore Ts'o

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Theodore Ts'o <tytso@mit.edu>

commit 6e8ab72a812396996035a37e5ca4b3b99b5d214b upstream.

When converting from an inode from storing the data in-line to a data
block, ext4_destroy_inline_data_nolock() was only clearing the on-disk
copy of the i_blocks[] array.  It was not clearing copy of the
i_blocks[] in ext4_inode_info, in i_data[], which is the copy actually
used by ext4_map_blocks().

This didn't matter much if we are using extents, since the extents
header would be invalid and thus the extents could would re-initialize
the extents tree.  But if we are using indirect blocks, the previous
contents of the i_blocks array will be treated as block numbers, with
potentially catastrophic results to the file system integrity and/or
user data.

This gets worse if the file system is using a 1k block size and
s_first_data is zero, but even without this, the file system can get
quite badly corrupted.

This addresses CVE-2018-10881.

https://bugzilla.kernel.org/show_bug.cgi?id=200015

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 fs/ext4/inline.c | 1 +
 1 file changed, 1 insertion(+)

--- a/fs/ext4/inline.c
+++ b/fs/ext4/inline.c
@@ -438,6 +438,7 @@ static int ext4_destroy_inline_data_nolo
 
 	memset((void *)ext4_raw_inode(&is.iloc)->i_block,
 		0, EXT4_MIN_INLINE_DATA_SIZE);
+	memset(ei->i_data, 0, EXT4_MIN_INLINE_DATA_SIZE);
 
 	if (EXT4_HAS_INCOMPAT_FEATURE(inode->i_sb,
 				      EXT4_FEATURE_INCOMPAT_EXTENTS)) {


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 26/63] ext4: verify the depth of extent tree in ext4_find_extent()
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (9 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 63/63] mm: get rid of vmacache_flush_all() entirely Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 38/63] Fix up non-directory creation in SGID directories Ben Hutchings
                   ` (52 subsequent siblings)
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: akpm, Theodore Ts'o

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Theodore Ts'o <tytso@mit.edu>

commit bc890a60247171294acc0bd67d211fa4b88d40ba upstream.

If there is a corupted file system where the claimed depth of the
extent tree is -1, this can cause a massive buffer overrun leading to
sadness.

This addresses CVE-2018-10877.

https://bugzilla.kernel.org/show_bug.cgi?id=199417

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
[bwh: Backported to 3.16: return -EIO instead of -EFSCORRUPTED]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 fs/ext4/ext4_extents.h | 1 +
 fs/ext4/extents.c      | 6 ++++++
 2 files changed, 7 insertions(+)

--- a/fs/ext4/ext4_extents.h
+++ b/fs/ext4/ext4_extents.h
@@ -103,6 +103,7 @@ struct ext4_extent_header {
 };
 
 #define EXT4_EXT_MAGIC		cpu_to_le16(0xf30a)
+#define EXT4_MAX_EXTENT_DEPTH 5
 
 #define EXT4_EXTENT_TAIL_OFFSET(hdr) \
 	(sizeof(struct ext4_extent_header) + \
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -851,6 +851,12 @@ ext4_ext_find_extent(struct inode *inode
 
 	eh = ext_inode_hdr(inode);
 	depth = ext_depth(inode);
+	if (depth < 0 || depth > EXT4_MAX_EXTENT_DEPTH) {
+		EXT4_ERROR_INODE(inode, "inode has invalid extent depth: %d",
+				 depth);
+		ret = -EIO;
+		goto err;
+	}
 
 	/* account possible depth increase */
 	if (!path) {


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 31/63] ext4: add corruption check in ext4_xattr_set_entry()
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (57 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 58/63] x86/process: Optimize TIF checks in __switch_to_xtra() Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 39/63] x86/entry/64: Remove %ebx handling from error_entry/exit Ben Hutchings
                   ` (4 subsequent siblings)
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: akpm, Andreas Dilger, Theodore Ts'o

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Theodore Ts'o <tytso@mit.edu>

commit 5369a762c882c0b6e9599e4ebbb3a9ba9eee7e2d upstream.

In theory this should have been caught earlier when the xattr list was
verified, but in case it got missed, it's simple enough to add check
to make sure we don't overrun the xattr buffer.

This addresses CVE-2018-10879.

https://bugzilla.kernel.org/show_bug.cgi?id=200001

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
[bwh: Backported to 3.16:
 - Add inode parameter to ext4_xattr_set_entry() and update callers
 - Return -EIO instead of -EFSCORRUPTED on error
 - Adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
--- a/fs/ext4/xattr.c
+++ b/fs/ext4/xattr.c
@@ -610,14 +610,20 @@ static size_t ext4_xattr_free_space(stru
 }
 
 static int
-ext4_xattr_set_entry(struct ext4_xattr_info *i, struct ext4_xattr_search *s)
+ext4_xattr_set_entry(struct ext4_xattr_info *i, struct ext4_xattr_search *s,
+		     struct inode *inode)
 {
-	struct ext4_xattr_entry *last;
+	struct ext4_xattr_entry *last, *next;
 	size_t free, min_offs = s->end - s->base, name_len = strlen(i->name);
 
 	/* Compute min_offs and last. */
 	last = s->first;
-	for (; !IS_LAST_ENTRY(last); last = EXT4_XATTR_NEXT(last)) {
+	for (; !IS_LAST_ENTRY(last); last = next) {
+		next = EXT4_XATTR_NEXT(last);
+		if ((void *)next >= s->end) {
+			EXT4_ERROR_INODE(inode, "corrupted xattr entries");
+			return -EIO;
+		}
 		if (!last->e_value_block && last->e_value_size) {
 			size_t offs = le16_to_cpu(last->e_value_offs);
 			if (offs < min_offs)
@@ -798,7 +804,7 @@ ext4_xattr_block_set(handle_t *handle, s
 				ce = NULL;
 			}
 			ea_bdebug(bs->bh, "modifying in-place");
-			error = ext4_xattr_set_entry(i, s);
+			error = ext4_xattr_set_entry(i, s, inode);
 			if (!error) {
 				if (!IS_LAST_ENTRY(s->first))
 					ext4_xattr_rehash(header(s->base),
@@ -851,7 +857,7 @@ ext4_xattr_block_set(handle_t *handle, s
 		s->end = s->base + sb->s_blocksize;
 	}
 
-	error = ext4_xattr_set_entry(i, s);
+	error = ext4_xattr_set_entry(i, s, inode);
 	if (error == -EIO)
 		goto bad_block;
 	if (error)
@@ -1021,7 +1027,7 @@ int ext4_xattr_ibody_inline_set(handle_t
 
 	if (EXT4_I(inode)->i_extra_isize == 0)
 		return -ENOSPC;
-	error = ext4_xattr_set_entry(i, s);
+	error = ext4_xattr_set_entry(i, s, inode);
 	if (error) {
 		if (error == -ENOSPC &&
 		    ext4_has_inline_data(inode)) {
@@ -1033,7 +1039,7 @@ int ext4_xattr_ibody_inline_set(handle_t
 			error = ext4_xattr_ibody_find(inode, i, is);
 			if (error)
 				return error;
-			error = ext4_xattr_set_entry(i, s);
+			error = ext4_xattr_set_entry(i, s, inode);
 		}
 		if (error)
 			return error;
@@ -1059,7 +1065,7 @@ static int ext4_xattr_ibody_set(handle_t
 
 	if (EXT4_I(inode)->i_extra_isize == 0)
 		return -ENOSPC;
-	error = ext4_xattr_set_entry(i, s);
+	error = ext4_xattr_set_entry(i, s, inode);
 	if (error)
 		return error;
 	header = IHDR(inode, ext4_raw_inode(&is->iloc));


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 30/63] ext4: fix false negatives *and* false positives in ext4_check_descriptors()
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (35 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 12/63] futex: Remove requirement for lock_page() in get_futex_key() Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 37/63] ext4: avoid running out of journal credits when appending to an inline file Ben Hutchings
                   ` (26 subsequent siblings)
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: akpm, Theodore Ts'o

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Theodore Ts'o <tytso@mit.edu>

commit 44de022c4382541cebdd6de4465d1f4f465ff1dd upstream.

Ext4_check_descriptors() was getting called before s_gdb_count was
initialized.  So for file systems w/o the meta_bg feature, allocation
bitmaps could overlap the block group descriptors and ext4 wouldn't
notice.

For file systems with the meta_bg feature enabled, there was a
fencepost error which would cause the ext4_check_descriptors() to
incorrectly believe that the block allocation bitmap overlaps with the
block group descriptor blocks, and it would reject the mount.

Fix both of these problems.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
[bwh: Backported to 3.16: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 fs/ext4/super.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -2086,7 +2086,7 @@ static int ext4_check_descriptors(struct
 	struct ext4_sb_info *sbi = EXT4_SB(sb);
 	ext4_fsblk_t first_block = le32_to_cpu(sbi->s_es->s_first_data_block);
 	ext4_fsblk_t last_block;
-	ext4_fsblk_t last_bg_block = sb_block + ext4_bg_num_gdb(sb, 0) + 1;
+	ext4_fsblk_t last_bg_block = sb_block + ext4_bg_num_gdb(sb, 0);
 	ext4_fsblk_t block_bitmap;
 	ext4_fsblk_t inode_bitmap;
 	ext4_fsblk_t inode_table;
@@ -3987,6 +3987,7 @@ static int ext4_fill_super(struct super_
 			goto failed_mount2;
 		}
 	}
+	sbi->s_gdb_count = db_count;
 	if (!ext4_check_descriptors(sb, logical_sb_block, &first_not_zeroed)) {
 		ext4_msg(sb, KERN_ERR, "group descriptors corrupted!");
 		goto failed_mount2;
@@ -3999,7 +4000,6 @@ static int ext4_fill_super(struct super_
 			goto failed_mount2;
 		}
 
-	sbi->s_gdb_count = db_count;
 	get_random_bytes(&sbi->s_next_generation, sizeof(u32));
 	spin_lock_init(&sbi->s_next_gen_lock);
 


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 27/63] ext4: always check block group bounds in ext4_init_block_bitmap()
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (26 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 42/63] ALSA: rawmidi: Change resized buffers atomically Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 25/63] ext4: fix check to prevent initializing reserved inodes Ben Hutchings
                   ` (35 subsequent siblings)
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: akpm, Theodore Ts'o

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Theodore Ts'o <tytso@mit.edu>

commit 819b23f1c501b17b9694325471789e6b5cc2d0d2 upstream.

Regardless of whether the flex_bg feature is set, we should always
check to make sure the bits we are setting in the block bitmap are
within the block group bounds.

https://bugzilla.kernel.org/show_bug.cgi?id=199865

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
[bwh: Backported to 3.16: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 fs/ext4/balloc.c | 10 +++-------
 1 file changed, 3 insertions(+), 7 deletions(-)

--- a/fs/ext4/balloc.c
+++ b/fs/ext4/balloc.c
@@ -184,7 +184,6 @@ static int ext4_init_block_bitmap(struct
 	unsigned int bit, bit_max;
 	struct ext4_sb_info *sbi = EXT4_SB(sb);
 	ext4_fsblk_t start, tmp;
-	int flex_bg = 0;
 	struct ext4_group_info *grp;
 
 	J_ASSERT_BH(bh, buffer_locked(bh));
@@ -217,22 +216,19 @@ static int ext4_init_block_bitmap(struct
 
 	start = ext4_group_first_block_no(sb, block_group);
 
-	if (EXT4_HAS_INCOMPAT_FEATURE(sb, EXT4_FEATURE_INCOMPAT_FLEX_BG))
-		flex_bg = 1;
-
 	/* Set bits for block and inode bitmaps, and inode table */
 	tmp = ext4_block_bitmap(sb, gdp);
-	if (!flex_bg || ext4_block_in_group(sb, tmp, block_group))
+	if (ext4_block_in_group(sb, tmp, block_group))
 		ext4_set_bit(EXT4_B2C(sbi, tmp - start), bh->b_data);
 
 	tmp = ext4_inode_bitmap(sb, gdp);
-	if (!flex_bg || ext4_block_in_group(sb, tmp, block_group))
+	if (ext4_block_in_group(sb, tmp, block_group))
 		ext4_set_bit(EXT4_B2C(sbi, tmp - start), bh->b_data);
 
 	tmp = ext4_inode_table(sb, gdp);
 	for (; tmp < ext4_inode_table(sb, gdp) +
 		     sbi->s_itb_per_group; tmp++) {
-		if (!flex_bg || ext4_block_in_group(sb, tmp, block_group))
+		if (ext4_block_in_group(sb, tmp, block_group))
 			ext4_set_bit(EXT4_B2C(sbi, tmp - start), bh->b_data);
 	}
 


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 28/63] ext4: don't allow r/w mounts if metadata blocks overlap the superblock
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (51 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 46/63] cdrom: Fix info leak/OOB read in cdrom_ioctl_drive_status Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 03/63] Revert "vti4: Don't override MTU passed on link creation via IFLA_MTU" Ben Hutchings
                   ` (10 subsequent siblings)
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: akpm, Harsh Shandilya, Theodore Ts'o, Greg Kroah-Hartman

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Theodore Ts'o <tytso@mit.edu>

commit 18db4b4e6fc31eda838dd1c1296d67dbcb3dc957 upstream.

If some metadata block, such as an allocation bitmap, overlaps the
superblock, it's very likely that if the file system is mounted
read/write, the results will not be pretty.  So disallow r/w mounts
for file systems corrupted in this particular way.

Backport notes:
3.18.y is missing bc98a42c1f7d ("VFS: Convert sb->s_flags & MS_RDONLY
to sb_rdonly(sb)") and e462ec50cb5f ("VFS: Differentiate mount flags
(MS_*) from internal superblock flags") so we simply use the sb
MS_RDONLY check from pre bc98a42c1f7d in place of the sb_rdonly
function used in the upstream variant of the patch.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Harsh Shandilya <harsh@prjkt.io>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 fs/ext4/super.c | 6 ++++++
 1 file changed, 6 insertions(+)

--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -2115,6 +2115,8 @@ static int ext4_check_descriptors(struct
 			ext4_msg(sb, KERN_ERR, "ext4_check_descriptors: "
 				 "Block bitmap for group %u overlaps "
 				 "superblock", i);
+			if (!(sb->s_flags & MS_RDONLY))
+				return 0;
 		}
 		if (block_bitmap < first_block || block_bitmap > last_block) {
 			ext4_msg(sb, KERN_ERR, "ext4_check_descriptors: "
@@ -2127,6 +2129,8 @@ static int ext4_check_descriptors(struct
 			ext4_msg(sb, KERN_ERR, "ext4_check_descriptors: "
 				 "Inode bitmap for group %u overlaps "
 				 "superblock", i);
+			if (!(sb->s_flags & MS_RDONLY))
+				return 0;
 		}
 		if (inode_bitmap < first_block || inode_bitmap > last_block) {
 			ext4_msg(sb, KERN_ERR, "ext4_check_descriptors: "
@@ -2139,6 +2143,8 @@ static int ext4_check_descriptors(struct
 			ext4_msg(sb, KERN_ERR, "ext4_check_descriptors: "
 				 "Inode table for group %u overlaps "
 				 "superblock", i);
+			if (!(sb->s_flags & MS_RDONLY))
+				return 0;
 		}
 		if (inode_table < first_block ||
 		    inode_table + sbi->s_itb_per_group - 1 > last_block) {


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 15/63] KVM: x86: introduce linear_{read,write}_system
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (29 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 14/63] KVM: x86: Emulator ignores LDTR/TR extended base on LLDT/LTR Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 61/63] x86/cpu/intel: Add Knights Mill to Intel family Ben Hutchings
                   ` (32 subsequent siblings)
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: akpm, Paolo Bonzini

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Paolo Bonzini <pbonzini@redhat.com>

commit 79367a65743975e5cac8d24d08eccc7fdae832b0 upstream.

Wrap the common invocation of ctxt->ops->read_std and ctxt->ops->write_std, so
as to have a smaller patch when the functions grow another argument.

Fixes: 129a72a0d3c8 ("KVM: x86: Introduce segmented_write_std", 2017-01-12)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[bwh: Backported to 3.16: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 arch/x86/kvm/emulate.c | 64 +++++++++++++++++++++---------------------
 1 file changed, 32 insertions(+), 32 deletions(-)

--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -731,6 +731,19 @@ static int linearize(struct x86_emulate_
 }
 
 
+static int linear_read_system(struct x86_emulate_ctxt *ctxt, ulong linear,
+			      void *data, unsigned size)
+{
+	return ctxt->ops->read_std(ctxt, linear, data, size, &ctxt->exception);
+}
+
+static int linear_write_system(struct x86_emulate_ctxt *ctxt,
+			       ulong linear, void *data,
+			       unsigned int size)
+{
+	return ctxt->ops->write_std(ctxt, linear, data, size, &ctxt->exception);
+}
+
 static int segmented_read_std(struct x86_emulate_ctxt *ctxt,
 			      struct segmented_address addr,
 			      void *data,
@@ -1394,8 +1407,7 @@ static int read_interrupt_descriptor(str
 		return emulate_gp(ctxt, index << 3 | 0x2);
 
 	addr = dt.address + index * 8;
-	return ctxt->ops->read_std(ctxt, addr, desc, sizeof *desc,
-				   &ctxt->exception);
+	return linear_read_system(ctxt, addr, desc, sizeof *desc);
 }
 
 static void get_descriptor_table_ptr(struct x86_emulate_ctxt *ctxt,
@@ -1432,8 +1444,7 @@ static int read_segment_descriptor(struc
 		return emulate_gp(ctxt, selector & 0xfffc);
 
 	*desc_addr_p = addr = dt.address + index * 8;
-	return ctxt->ops->read_std(ctxt, addr, desc, sizeof *desc,
-				   &ctxt->exception);
+	return linear_read_system(ctxt, addr, desc, sizeof(*desc));
 }
 
 /* allowed just for 8 bytes segments */
@@ -1450,8 +1461,7 @@ static int write_segment_descriptor(stru
 		return emulate_gp(ctxt, selector & 0xfffc);
 
 	addr = dt.address + index * 8;
-	return ctxt->ops->write_std(ctxt, addr, desc, sizeof *desc,
-				    &ctxt->exception);
+	return linear_write_system(ctxt, addr, desc, sizeof *desc);
 }
 
 static int __load_segment_descriptor(struct x86_emulate_ctxt *ctxt,
@@ -1599,8 +1609,7 @@ static int __load_segment_descriptor(str
 		if (ret != X86EMUL_CONTINUE)
 			return ret;
 	} else if (ctxt->mode == X86EMUL_MODE_PROT64) {
-		ret = ctxt->ops->read_std(ctxt, desc_addr+8, &base3,
-				sizeof(base3), &ctxt->exception);
+		ret = linear_read_system(ctxt, desc_addr+8, &base3, sizeof(base3));
 		if (ret != X86EMUL_CONTINUE)
 			return ret;
 	}
@@ -1917,11 +1926,11 @@ static int __emulate_int_real(struct x86
 	eip_addr = dt.address + (irq << 2);
 	cs_addr = dt.address + (irq << 2) + 2;
 
-	rc = ops->read_std(ctxt, cs_addr, &cs, 2, &ctxt->exception);
+	rc = linear_read_system(ctxt, cs_addr, &cs, 2);
 	if (rc != X86EMUL_CONTINUE)
 		return rc;
 
-	rc = ops->read_std(ctxt, eip_addr, &eip, 2, &ctxt->exception);
+	rc = linear_read_system(ctxt, eip_addr, &eip, 2);
 	if (rc != X86EMUL_CONTINUE)
 		return rc;
 
@@ -2573,27 +2582,23 @@ static int task_switch_16(struct x86_emu
 			  u16 tss_selector, u16 old_tss_sel,
 			  ulong old_tss_base, struct desc_struct *new_desc)
 {
-	const struct x86_emulate_ops *ops = ctxt->ops;
 	struct tss_segment_16 tss_seg;
 	int ret;
 	u32 new_tss_base = get_desc_base(new_desc);
 
-	ret = ops->read_std(ctxt, old_tss_base, &tss_seg, sizeof tss_seg,
-			    &ctxt->exception);
+	ret = linear_read_system(ctxt, old_tss_base, &tss_seg, sizeof tss_seg);
 	if (ret != X86EMUL_CONTINUE)
 		/* FIXME: need to provide precise fault address */
 		return ret;
 
 	save_state_to_tss16(ctxt, &tss_seg);
 
-	ret = ops->write_std(ctxt, old_tss_base, &tss_seg, sizeof tss_seg,
-			     &ctxt->exception);
+	ret = linear_write_system(ctxt, old_tss_base, &tss_seg, sizeof tss_seg);
 	if (ret != X86EMUL_CONTINUE)
 		/* FIXME: need to provide precise fault address */
 		return ret;
 
-	ret = ops->read_std(ctxt, new_tss_base, &tss_seg, sizeof tss_seg,
-			    &ctxt->exception);
+	ret = linear_read_system(ctxt, new_tss_base, &tss_seg, sizeof tss_seg);
 	if (ret != X86EMUL_CONTINUE)
 		/* FIXME: need to provide precise fault address */
 		return ret;
@@ -2601,10 +2606,9 @@ static int task_switch_16(struct x86_emu
 	if (old_tss_sel != 0xffff) {
 		tss_seg.prev_task_link = old_tss_sel;
 
-		ret = ops->write_std(ctxt, new_tss_base,
-				     &tss_seg.prev_task_link,
-				     sizeof tss_seg.prev_task_link,
-				     &ctxt->exception);
+		ret = linear_write_system(ctxt, new_tss_base,
+					  &tss_seg.prev_task_link,
+					  sizeof tss_seg.prev_task_link);
 		if (ret != X86EMUL_CONTINUE)
 			/* FIXME: need to provide precise fault address */
 			return ret;
@@ -2723,15 +2727,13 @@ static int task_switch_32(struct x86_emu
 			  u16 tss_selector, u16 old_tss_sel,
 			  ulong old_tss_base, struct desc_struct *new_desc)
 {
-	const struct x86_emulate_ops *ops = ctxt->ops;
 	struct tss_segment_32 tss_seg;
 	int ret;
 	u32 new_tss_base = get_desc_base(new_desc);
 	u32 eip_offset = offsetof(struct tss_segment_32, eip);
 	u32 ldt_sel_offset = offsetof(struct tss_segment_32, ldt_selector);
 
-	ret = ops->read_std(ctxt, old_tss_base, &tss_seg, sizeof tss_seg,
-			    &ctxt->exception);
+	ret = linear_read_system(ctxt, old_tss_base, &tss_seg, sizeof tss_seg);
 	if (ret != X86EMUL_CONTINUE)
 		/* FIXME: need to provide precise fault address */
 		return ret;
@@ -2739,14 +2741,13 @@ static int task_switch_32(struct x86_emu
 	save_state_to_tss32(ctxt, &tss_seg);
 
 	/* Only GP registers and segment selectors are saved */
-	ret = ops->write_std(ctxt, old_tss_base + eip_offset, &tss_seg.eip,
-			     ldt_sel_offset - eip_offset, &ctxt->exception);
+	ret = linear_write_system(ctxt, old_tss_base + eip_offset, &tss_seg.eip,
+				  ldt_sel_offset - eip_offset);
 	if (ret != X86EMUL_CONTINUE)
 		/* FIXME: need to provide precise fault address */
 		return ret;
 
-	ret = ops->read_std(ctxt, new_tss_base, &tss_seg, sizeof tss_seg,
-			    &ctxt->exception);
+	ret = linear_read_system(ctxt, new_tss_base, &tss_seg, sizeof tss_seg);
 	if (ret != X86EMUL_CONTINUE)
 		/* FIXME: need to provide precise fault address */
 		return ret;
@@ -2754,10 +2755,9 @@ static int task_switch_32(struct x86_emu
 	if (old_tss_sel != 0xffff) {
 		tss_seg.prev_task_link = old_tss_sel;
 
-		ret = ops->write_std(ctxt, new_tss_base,
-				     &tss_seg.prev_task_link,
-				     sizeof tss_seg.prev_task_link,
-				     &ctxt->exception);
+		ret = linear_write_system(ctxt, new_tss_base,
+					  &tss_seg.prev_task_link,
+					  sizeof tss_seg.prev_task_link);
 		if (ret != X86EMUL_CONTINUE)
 			/* FIXME: need to provide precise fault address */
 			return ret;


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 41/63] USB: yurex: fix out-of-bounds uaccess in read handler
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 51/63] xfs: catch inode allocation state mismatch corruption Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 07/63] usbip: usbip_host: refine probe and disconnect debug msgs to be useful Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 54/63] seccomp: create internal mode-setting function Ben Hutchings
                   ` (60 subsequent siblings)
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: akpm, Greg Kroah-Hartman, Jann Horn

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Jann Horn <jannh@google.com>

commit f1e255d60ae66a9f672ff9a207ee6cd8e33d2679 upstream.

In general, accessing userspace memory beyond the length of the supplied
buffer in VFS read/write handlers can lead to both kernel memory corruption
(via kernel_read()/kernel_write(), which can e.g. be triggered via
sys_splice()) and privilege escalation inside userspace.

Fix it by using simple_read_from_buffer() instead of custom logic.

Fixes: 6bc235a2e24a ("USB: add driver for Meywa-Denki & Kayac YUREX")
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 drivers/usb/misc/yurex.c | 23 ++++++-----------------
 1 file changed, 6 insertions(+), 17 deletions(-)

--- a/drivers/usb/misc/yurex.c
+++ b/drivers/usb/misc/yurex.c
@@ -413,8 +413,7 @@ static int yurex_release(struct inode *i
 static ssize_t yurex_read(struct file *file, char *buffer, size_t count, loff_t *ppos)
 {
 	struct usb_yurex *dev;
-	int retval = 0;
-	int bytes_read = 0;
+	int len = 0;
 	char in_buffer[20];
 	unsigned long flags;
 
@@ -422,26 +421,16 @@ static ssize_t yurex_read(struct file *f
 
 	mutex_lock(&dev->io_mutex);
 	if (!dev->interface) {		/* already disconnected */
-		retval = -ENODEV;
-		goto exit;
+		mutex_unlock(&dev->io_mutex);
+		return -ENODEV;
 	}
 
 	spin_lock_irqsave(&dev->lock, flags);
-	bytes_read = snprintf(in_buffer, 20, "%lld\n", dev->bbu);
+	len = snprintf(in_buffer, 20, "%lld\n", dev->bbu);
 	spin_unlock_irqrestore(&dev->lock, flags);
-
-	if (*ppos < bytes_read) {
-		if (copy_to_user(buffer, in_buffer + *ppos, bytes_read - *ppos))
-			retval = -EFAULT;
-		else {
-			retval = bytes_read - *ppos;
-			*ppos += bytes_read;
-		}
-	}
-
-exit:
 	mutex_unlock(&dev->io_mutex);
-	return retval;
+
+	return simple_read_from_buffer(buffer, count, ppos, in_buffer, len);
 }
 
 static ssize_t yurex_write(struct file *file, const char *user_buffer, size_t count, loff_t *ppos)


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 29/63] ext4: make sure bitmaps and the inode table don't overlap with bg descriptors
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (7 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 06/63] usbip: usbip_host: fix to hold parent lock for device_attach() calls Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 63/63] mm: get rid of vmacache_flush_all() entirely Ben Hutchings
                   ` (54 subsequent siblings)
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: akpm, Theodore Ts'o

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Theodore Ts'o <tytso@mit.edu>

commit 77260807d1170a8cf35dbb06e07461a655f67eee upstream.

It's really bad when the allocation bitmaps and the inode table
overlap with the block group descriptors, since it causes random
corruption of the bg descriptors.  So we really want to head those off
at the pass.

https://bugzilla.kernel.org/show_bug.cgi?id=199865

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
[bwh: Backported to 3.16: Open-code sb_rdonly()]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 fs/ext4/super.c | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -2086,6 +2086,7 @@ static int ext4_check_descriptors(struct
 	struct ext4_sb_info *sbi = EXT4_SB(sb);
 	ext4_fsblk_t first_block = le32_to_cpu(sbi->s_es->s_first_data_block);
 	ext4_fsblk_t last_block;
+	ext4_fsblk_t last_bg_block = sb_block + ext4_bg_num_gdb(sb, 0) + 1;
 	ext4_fsblk_t block_bitmap;
 	ext4_fsblk_t inode_bitmap;
 	ext4_fsblk_t inode_table;
@@ -2118,6 +2119,14 @@ static int ext4_check_descriptors(struct
 			if (!(sb->s_flags & MS_RDONLY))
 				return 0;
 		}
+		if (block_bitmap >= sb_block + 1 &&
+		    block_bitmap <= last_bg_block) {
+			ext4_msg(sb, KERN_ERR, "ext4_check_descriptors: "
+				 "Block bitmap for group %u overlaps "
+				 "block group descriptors", i);
+			if (!(sb->s_flags & MS_RDONLY))
+				return 0;
+		}
 		if (block_bitmap < first_block || block_bitmap > last_block) {
 			ext4_msg(sb, KERN_ERR, "ext4_check_descriptors: "
 			       "Block bitmap for group %u not in group "
@@ -2132,6 +2141,14 @@ static int ext4_check_descriptors(struct
 			if (!(sb->s_flags & MS_RDONLY))
 				return 0;
 		}
+		if (inode_bitmap >= sb_block + 1 &&
+		    inode_bitmap <= last_bg_block) {
+			ext4_msg(sb, KERN_ERR, "ext4_check_descriptors: "
+				 "Inode bitmap for group %u overlaps "
+				 "block group descriptors", i);
+			if (!(sb->s_flags & MS_RDONLY))
+				return 0;
+		}
 		if (inode_bitmap < first_block || inode_bitmap > last_block) {
 			ext4_msg(sb, KERN_ERR, "ext4_check_descriptors: "
 			       "Inode bitmap for group %u not in group "
@@ -2146,6 +2163,14 @@ static int ext4_check_descriptors(struct
 			if (!(sb->s_flags & MS_RDONLY))
 				return 0;
 		}
+		if (inode_table >= sb_block + 1 &&
+		    inode_table <= last_bg_block) {
+			ext4_msg(sb, KERN_ERR, "ext4_check_descriptors: "
+				 "Inode table for group %u overlaps "
+				 "block group descriptors", i);
+			if (!(sb->s_flags & MS_RDONLY))
+				return 0;
+		}
 		if (inode_table < first_block ||
 		    inode_table + sbi->s_itb_per_group - 1 > last_block) {
 			ext4_msg(sb, KERN_ERR, "ext4_check_descriptors: "


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 33/63] ext4: never move the system.data xattr out of the inode body
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (21 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 04/63] net: Set sk_prot_creator when cloning sockets to the right proto Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 55/63] seccomp: extract check/assign mode helpers Ben Hutchings
                   ` (40 subsequent siblings)
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: akpm, Theodore Ts'o

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Theodore Ts'o <tytso@mit.edu>

commit 8cdb5240ec5928b20490a2bb34cb87e9a5f40226 upstream.

When expanding the extra isize space, we must never move the
system.data xattr out of the inode body.  For performance reasons, it
doesn't make any sense, and the inline data implementation assumes
that system.data xattr is never in the external xattr block.

This addresses CVE-2018-10880

https://bugzilla.kernel.org/show_bug.cgi?id=200005

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
[bwh: Backported to 3.16: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 fs/ext4/xattr.c | 5 +++++
 1 file changed, 5 insertions(+)

--- a/fs/ext4/xattr.c
+++ b/fs/ext4/xattr.c
@@ -1370,6 +1370,11 @@ retry:
 		/* Find the entry best suited to be pushed into EA block */
 		entry = NULL;
 		for (; !IS_LAST_ENTRY(last); last = EXT4_XATTR_NEXT(last)) {
+			/* never move system.data out of the inode */
+			if ((last->e_name_len == 4) &&
+			    (last->e_name_index == EXT4_XATTR_INDEX_SYSTEM) &&
+			    !memcmp(last->e_name, "data", 4))
+				continue;
 			total_size =
 			EXT4_XATTR_SIZE(le32_to_cpu(last->e_value_size)) +
 					EXT4_XATTR_LEN(last->e_name_len);


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 37/63] ext4: avoid running out of journal credits when appending to an inline file
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (36 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 30/63] ext4: fix false negatives *and* false positives in ext4_check_descriptors() Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 47/63] uas: replace WARN_ON_ONCE() with lockdep_assert_held() Ben Hutchings
                   ` (25 subsequent siblings)
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: akpm, Theodore Ts'o

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Theodore Ts'o <tytso@mit.edu>

commit 8bc1379b82b8e809eef77a9fedbb75c6c297be19 upstream.

Use a separate journal transaction if it turns out that we need to
convert an inline file to use an data block.  Otherwise we could end
up failing due to not having journal credits.

This addresses CVE-2018-10883.

https://bugzilla.kernel.org/show_bug.cgi?id=200071

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
[bwh: Backported to 3.16: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -2701,9 +2701,6 @@ extern struct buffer_head *ext4_get_firs
 extern int ext4_inline_data_fiemap(struct inode *inode,
 				   struct fiemap_extent_info *fieinfo,
 				   int *has_inline);
-extern int ext4_try_to_evict_inline_data(handle_t *handle,
-					 struct inode *inode,
-					 int needed);
 extern void ext4_inline_data_truncate(struct inode *inode, int *has_inline);
 
 extern int ext4_convert_inline_data(struct inode *inode);
--- a/fs/ext4/inline.c
+++ b/fs/ext4/inline.c
@@ -877,11 +877,11 @@ retry_journal:
 	}
 
 	if (ret == -ENOSPC) {
+		ext4_journal_stop(handle);
 		ret = ext4_da_convert_inline_data_to_extent(mapping,
 							    inode,
 							    flags,
 							    fsdata);
-		ext4_journal_stop(handle);
 		if (ret == -ENOSPC &&
 		    ext4_should_retry_alloc(inode->i_sb, &retries))
 			goto retry_journal;
@@ -1839,42 +1839,6 @@ out:
 	return (error < 0 ? error : 0);
 }
 
-/*
- * Called during xattr set, and if we can sparse space 'needed',
- * just create the extent tree evict the data to the outer block.
- *
- * We use jbd2 instead of page cache to move data to the 1st block
- * so that the whole transaction can be committed as a whole and
- * the data isn't lost because of the delayed page cache write.
- */
-int ext4_try_to_evict_inline_data(handle_t *handle,
-				  struct inode *inode,
-				  int needed)
-{
-	int error;
-	struct ext4_xattr_entry *entry;
-	struct ext4_inode *raw_inode;
-	struct ext4_iloc iloc;
-
-	error = ext4_get_inode_loc(inode, &iloc);
-	if (error)
-		return error;
-
-	raw_inode = ext4_raw_inode(&iloc);
-	entry = (struct ext4_xattr_entry *)((void *)raw_inode +
-					    EXT4_I(inode)->i_inline_off);
-	if (EXT4_XATTR_LEN(entry->e_name_len) +
-	    EXT4_XATTR_SIZE(le32_to_cpu(entry->e_value_size)) < needed) {
-		error = -ENOSPC;
-		goto out;
-	}
-
-	error = ext4_convert_inline_data_nolock(handle, inode, &iloc);
-out:
-	brelse(iloc.bh);
-	return error;
-}
-
 void ext4_inline_data_truncate(struct inode *inode, int *has_inline)
 {
 	handle_t *handle;
--- a/fs/ext4/xattr.c
+++ b/fs/ext4/xattr.c
@@ -1028,22 +1028,8 @@ int ext4_xattr_ibody_inline_set(handle_t
 	if (EXT4_I(inode)->i_extra_isize == 0)
 		return -ENOSPC;
 	error = ext4_xattr_set_entry(i, s, inode);
-	if (error) {
-		if (error == -ENOSPC &&
-		    ext4_has_inline_data(inode)) {
-			error = ext4_try_to_evict_inline_data(handle, inode,
-					EXT4_XATTR_LEN(strlen(i->name) +
-					EXT4_XATTR_SIZE(i->value_len)));
-			if (error)
-				return error;
-			error = ext4_xattr_ibody_find(inode, i, is);
-			if (error)
-				return error;
-			error = ext4_xattr_set_entry(i, s, inode);
-		}
-		if (error)
-			return error;
-	}
+	if (error)
+		return error;
 	header = IHDR(inode, ext4_raw_inode(&is->iloc));
 	if (!IS_LAST_ENTRY(s->first)) {
 		header->h_magic = cpu_to_le32(EXT4_XATTR_MAGIC);


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 36/63] jbd2: don't mark block as modified if the handle is out of credits
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (4 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 45/63] x86/paravirt: Fix spectre-v2 mitigations for paravirt guests Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 18/63] sr: pass down correctly sized SCSI sense buffer Ben Hutchings
                   ` (57 subsequent siblings)
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: akpm, Theodore Ts'o

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Theodore Ts'o <tytso@mit.edu>

commit e09463f220ca9a1a1ecfda84fcda658f99a1f12a upstream.

Do not set the b_modified flag in block's journal head should not
until after we're sure that jbd2_journal_dirty_metadat() will not
abort with an error due to there not being enough space reserved in
the jbd2 handle.

Otherwise, future attempts to modify the buffer may lead a large
number of spurious errors and warnings.

This addresses CVE-2018-10883.

https://bugzilla.kernel.org/show_bug.cgi?id=200071

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
[bwh: Backported to 3.16: Drop the added logging statement, as it's on
 a code path that doesn't exist here]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
--- a/fs/jbd2/transaction.c
+++ b/fs/jbd2/transaction.c
@@ -1288,11 +1288,11 @@ int jbd2_journal_dirty_metadata(handle_t
 		 * of the transaction. This needs to be done
 		 * once a transaction -bzzz
 		 */
-		jh->b_modified = 1;
 		if (handle->h_buffer_credits <= 0) {
 			ret = -ENOSPC;
 			goto out_unlock_bh;
 		}
+		jh->b_modified = 1;
 		handle->h_buffer_credits--;
 	}
 


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 44/63] x86/speculation: Protect against userspace-userspace spectreRSB
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (38 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 47/63] uas: replace WARN_ON_ONCE() with lockdep_assert_held() Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 49/63] btrfs: relocation: Only remove reloc rb_trees if reloc control has been initialized Ben Hutchings
                   ` (23 subsequent siblings)
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: akpm, Tim Chen, Thomas Gleixner, Linus Torvalds, David Woodhouse,
	Jiri Kosina, Peter Zijlstra, Konrad Rzeszutek Wilk,
	Josh Poimboeuf, Borislav Petkov

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Jiri Kosina <jkosina@suse.cz>

commit fdf82a7856b32d905c39afc85e34364491e46346 upstream.

The article "Spectre Returns! Speculation Attacks using the Return Stack
Buffer" [1] describes two new (sub-)variants of spectrev2-like attacks,
making use solely of the RSB contents even on CPUs that don't fallback to
BTB on RSB underflow (Skylake+).

Mitigate userspace-userspace attacks by always unconditionally filling RSB on
context switch when the generic spectrev2 mitigation has been enabled.

[1] https://arxiv.org/pdf/1807.07940.pdf

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lkml.kernel.org/r/nycvar.YFH.7.76.1807261308190.997@cbobk.fhfr.pm
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 arch/x86/kernel/cpu/bugs.c | 38 +++++++-------------------------------
 1 file changed, 7 insertions(+), 31 deletions(-)

--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -263,23 +263,6 @@ static enum spectre_v2_mitigation_cmd __
 	return cmd;
 }
 
-/* Check for Skylake-like CPUs (for RSB handling) */
-static bool __init is_skylake_era(void)
-{
-	if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL &&
-	    boot_cpu_data.x86 == 6) {
-		switch (boot_cpu_data.x86_model) {
-		case INTEL_FAM6_SKYLAKE_MOBILE:
-		case INTEL_FAM6_SKYLAKE_DESKTOP:
-		case INTEL_FAM6_SKYLAKE_X:
-		case INTEL_FAM6_KABYLAKE_MOBILE:
-		case INTEL_FAM6_KABYLAKE_DESKTOP:
-			return true;
-		}
-	}
-	return false;
-}
-
 static void __init spectre_v2_select_mitigation(void)
 {
 	enum spectre_v2_mitigation_cmd cmd = spectre_v2_parse_cmdline();
@@ -340,22 +323,15 @@ retpoline_auto:
 	pr_info("%s\n", spectre_v2_strings[mode]);
 
 	/*
-	 * If neither SMEP nor PTI are available, there is a risk of
-	 * hitting userspace addresses in the RSB after a context switch
-	 * from a shallow call stack to a deeper one. To prevent this fill
-	 * the entire RSB, even when using IBRS.
+	 * If spectre v2 protection has been enabled, unconditionally fill
+	 * RSB during a context switch; this protects against two independent
+	 * issues:
 	 *
-	 * Skylake era CPUs have a separate issue with *underflow* of the
-	 * RSB, when they will predict 'ret' targets from the generic BTB.
-	 * The proper mitigation for this is IBRS. If IBRS is not supported
-	 * or deactivated in favour of retpolines the RSB fill on context
-	 * switch is required.
+	 *	- RSB underflow (and switch to BTB) on Skylake+
+	 *	- SpectreRSB variant of spectre v2 on X86_BUG_SPECTRE_V2 CPUs
 	 */
-	if ((!boot_cpu_has(X86_FEATURE_KAISER) &&
-	     !boot_cpu_has(X86_FEATURE_SMEP)) || is_skylake_era()) {
-		setup_force_cpu_cap(X86_FEATURE_RSB_CTXSW);
-		pr_info("Spectre v2 mitigation: Filling RSB on context switch\n");
-	}
+	setup_force_cpu_cap(X86_FEATURE_RSB_CTXSW);
+	pr_info("Spectre v2 / SpectreRSB mitigation: Filling RSB on context switch\n");
 
 	/* Initialize Indirect Branch Prediction Barrier if supported */
 	if (boot_cpu_has(X86_FEATURE_IBPB)) {


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 43/63] x86/speculation: Clean up various Spectre related details
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (43 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 17/63] kvm: x86: use correct privilege level for sgdt/sidt/fxsave/fxrstor access Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 52/63] xfs: validate cached inodes are free when allocated Ben Hutchings
                   ` (18 subsequent siblings)
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: akpm, Greg Kroah-Hartman, Dave Hansen, Ingo Molnar,
	Josh Poimboeuf, Andy Lutomirski, Peter Zijlstra, David Woodhouse,
	David Woodhouse, Arjan van de Ven, Dan Williams, Borislav Petkov,
	Linus Torvalds, Thomas Gleixner

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Ingo Molnar <mingo@kernel.org>

commit 21e433bdb95bdf3aa48226fd3d33af608437f293 upstream.

Harmonize all the Spectre messages so that a:

    dmesg | grep -i spectre

... gives us most Spectre related kernel boot messages.

Also fix a few other details:

 - clarify a comment about firmware speculation control

 - s/KPTI/PTI

 - remove various line-breaks that made the code uglier

Acked-by: David Woodhouse <dwmw@amazon.co.uk>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[bwh: Backported to 3.16: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -224,8 +224,7 @@ static enum spectre_v2_mitigation_cmd __
 	if (cmdline_find_option_bool(boot_command_line, "nospectre_v2"))
 		return SPECTRE_V2_CMD_NONE;
 	else {
-		ret = cmdline_find_option(boot_command_line, "spectre_v2", arg,
-					  sizeof(arg));
+		ret = cmdline_find_option(boot_command_line, "spectre_v2", arg, sizeof(arg));
 		if (ret < 0)
 			return SPECTRE_V2_CMD_AUTO;
 
@@ -246,8 +245,7 @@ static enum spectre_v2_mitigation_cmd __
 	     cmd == SPECTRE_V2_CMD_RETPOLINE_AMD ||
 	     cmd == SPECTRE_V2_CMD_RETPOLINE_GENERIC) &&
 	    !IS_ENABLED(CONFIG_RETPOLINE)) {
-		pr_err("%s selected but not compiled in. Switching to AUTO select\n",
-		       mitigation_options[i].option);
+		pr_err("%s selected but not compiled in. Switching to AUTO select\n", mitigation_options[i].option);
 		return SPECTRE_V2_CMD_AUTO;
 	}
 
@@ -317,14 +315,14 @@ static void __init spectre_v2_select_mit
 			goto retpoline_auto;
 		break;
 	}
-	pr_err("kernel not compiled with retpoline; no mitigation available!");
+	pr_err("Spectre mitigation: kernel not compiled with retpoline; no mitigation available!");
 	return;
 
 retpoline_auto:
 	if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD) {
 	retpoline_amd:
 		if (!boot_cpu_has(X86_FEATURE_LFENCE_RDTSC)) {
-			pr_err("LFENCE not serializing. Switching to generic retpoline\n");
+			pr_err("Spectre mitigation: LFENCE not serializing, switching to generic retpoline\n");
 			goto retpoline_generic;
 		}
 		mode = retp_compiler() ? SPECTRE_V2_RETPOLINE_AMD :
@@ -342,7 +340,7 @@ retpoline_auto:
 	pr_info("%s\n", spectre_v2_strings[mode]);
 
 	/*
-	 * If neither SMEP or KPTI are available, there is a risk of
+	 * If neither SMEP nor PTI are available, there is a risk of
 	 * hitting userspace addresses in the RSB after a context switch
 	 * from a shallow call stack to a deeper one. To prevent this fill
 	 * the entire RSB, even when using IBRS.
@@ -356,13 +354,13 @@ retpoline_auto:
 	if ((!boot_cpu_has(X86_FEATURE_KAISER) &&
 	     !boot_cpu_has(X86_FEATURE_SMEP)) || is_skylake_era()) {
 		setup_force_cpu_cap(X86_FEATURE_RSB_CTXSW);
-		pr_info("Filling RSB on context switch\n");
+		pr_info("Spectre v2 mitigation: Filling RSB on context switch\n");
 	}
 
 	/* Initialize Indirect Branch Prediction Barrier if supported */
 	if (boot_cpu_has(X86_FEATURE_IBPB)) {
 		setup_force_cpu_cap(X86_FEATURE_USE_IBPB);
-		pr_info("Enabling Indirect Branch Prediction Barrier\n");
+		pr_info("Spectre v2 mitigation: Enabling Indirect Branch Prediction Barrier\n");
 	}
 
 	/*
@@ -378,8 +376,7 @@ retpoline_auto:
 #undef pr_fmt
 
 #ifdef CONFIG_SYSFS
-ssize_t cpu_show_meltdown(struct device *dev,
-			  struct device_attribute *attr, char *buf)
+ssize_t cpu_show_meltdown(struct device *dev, struct device_attribute *attr, char *buf)
 {
 	if (!boot_cpu_has_bug(X86_BUG_CPU_MELTDOWN))
 		return sprintf(buf, "Not affected\n");
@@ -388,16 +385,14 @@ ssize_t cpu_show_meltdown(struct device
 	return sprintf(buf, "Vulnerable\n");
 }
 
-ssize_t cpu_show_spectre_v1(struct device *dev,
-			    struct device_attribute *attr, char *buf)
+ssize_t cpu_show_spectre_v1(struct device *dev, struct device_attribute *attr, char *buf)
 {
 	if (!boot_cpu_has_bug(X86_BUG_SPECTRE_V1))
 		return sprintf(buf, "Not affected\n");
 	return sprintf(buf, "Mitigation: __user pointer sanitization\n");
 }
 
-ssize_t cpu_show_spectre_v2(struct device *dev,
-			    struct device_attribute *attr, char *buf)
+ssize_t cpu_show_spectre_v2(struct device *dev, struct device_attribute *attr, char *buf)
 {
 	if (!boot_cpu_has_bug(X86_BUG_SPECTRE_V2))
 		return sprintf(buf, "Not affected\n");


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 58/63] x86/process: Optimize TIF checks in __switch_to_xtra()
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (56 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 02/63] x86/fpu: Default eagerfpu if FPU and FXSR are enabled Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 31/63] ext4: add corruption check in ext4_xattr_set_entry() Ben Hutchings
                   ` (5 subsequent siblings)
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: akpm, Kyle Huey, Andy Lutomirski, Kyle Huey, Peter Zijlstra,
	Thomas Gleixner

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Kyle Huey <me@kylehuey.com>

commit af8b3cd3934ec60f4c2a420d19a9d416554f140b upstream.

Help the compiler to avoid reevaluating the thread flags for each checked
bit by reordering the bit checks and providing an explicit xor for
evaluation.

With default defconfigs for each arch,

x86_64: arch/x86/kernel/process.o
text       data     bss     dec     hex
3056       8577      16   11649    2d81	Before
3024	   8577      16	  11617	   2d61	After

i386: arch/x86/kernel/process.o
text       data     bss     dec     hex
2957	   8673	      8	  11638	   2d76	Before
2925	   8673       8	  11606	   2d56	After

Originally-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Kyle Huey <khuey@kylehuey.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Link: http://lkml.kernel.org/r/20170214081104.9244-2-khuey@kylehuey.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
[bwh: Backported to 3.16:
 - We don't do refresh_tr_limit() here
 - Use ACCESS_ONCE() instead of READ_ONCE()]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 arch/x86/kernel/process.c | 54 ++++++++++++++++++++++-----------------
 1 file changed, 31 insertions(+), 23 deletions(-)

--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -196,48 +196,56 @@ int set_tsc_mode(unsigned int val)
 	return 0;
 }
 
+static inline void switch_to_bitmap(struct tss_struct *tss,
+				    struct thread_struct *prev,
+				    struct thread_struct *next,
+				    unsigned long tifp, unsigned long tifn)
+{
+	if (tifn & _TIF_IO_BITMAP) {
+		/*
+		 * Copy the relevant range of the IO bitmap.
+		 * Normally this is 128 bytes or less:
+		 */
+		memcpy(tss->io_bitmap, next->io_bitmap_ptr,
+		       max(prev->io_bitmap_max, next->io_bitmap_max));
+	} else if (tifp & _TIF_IO_BITMAP) {
+		/*
+		 * Clear any possible leftover bits:
+		 */
+		memset(tss->io_bitmap, 0xff, prev->io_bitmap_max);
+	}
+}
+
 void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p,
 		      struct tss_struct *tss)
 {
 	struct thread_struct *prev, *next;
+	unsigned long tifp, tifn;
 
 	prev = &prev_p->thread;
 	next = &next_p->thread;
 
-	if (test_tsk_thread_flag(prev_p, TIF_BLOCKSTEP) ^
-	    test_tsk_thread_flag(next_p, TIF_BLOCKSTEP)) {
+	tifn = ACCESS_ONCE(task_thread_info(next_p)->flags);
+	tifp = ACCESS_ONCE(task_thread_info(prev_p)->flags);
+	switch_to_bitmap(tss, prev, next, tifp, tifn);
+
+	propagate_user_return_notify(prev_p, next_p);
+
+	if ((tifp ^ tifn) & _TIF_BLOCKSTEP) {
 		unsigned long debugctl = get_debugctlmsr();
 
 		debugctl &= ~DEBUGCTLMSR_BTF;
-		if (test_tsk_thread_flag(next_p, TIF_BLOCKSTEP))
+		if (tifn & _TIF_BLOCKSTEP)
 			debugctl |= DEBUGCTLMSR_BTF;
-
 		update_debugctlmsr(debugctl);
 	}
 
-	if (test_tsk_thread_flag(prev_p, TIF_NOTSC) ^
-	    test_tsk_thread_flag(next_p, TIF_NOTSC)) {
-		/* prev and next are different */
-		if (test_tsk_thread_flag(next_p, TIF_NOTSC))
+	if ((tifp ^ tifn) & _TIF_NOTSC) {
+		if (tifn & _TIF_NOTSC)
 			hard_disable_TSC();
 		else
 			hard_enable_TSC();
 	}
-
-	if (test_tsk_thread_flag(next_p, TIF_IO_BITMAP)) {
-		/*
-		 * Copy the relevant range of the IO bitmap.
-		 * Normally this is 128 bytes or less:
-		 */
-		memcpy(tss->io_bitmap, next->io_bitmap_ptr,
-		       max(prev->io_bitmap_max, next->io_bitmap_max));
-	} else if (test_tsk_thread_flag(prev_p, TIF_IO_BITMAP)) {
-		/*
-		 * Clear any possible leftover bits:
-		 */
-		memset(tss->io_bitmap, 0xff, prev->io_bitmap_max);
-	}
-	propagate_user_return_notify(prev_p, next_p);
 }
 
 /*


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 46/63] cdrom: Fix info leak/OOB read in cdrom_ioctl_drive_status
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (50 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 10/63] usbip: usbip_host: fix NULL-ptr deref and use-after-free errors Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 28/63] ext4: don't allow r/w mounts if metadata blocks overlap the superblock Ben Hutchings
                   ` (11 subsequent siblings)
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: akpm, Scott Bauer, Scott Bauer, Jens Axboe

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Scott Bauer <scott.bauer@intel.com>

commit 8f3fafc9c2f0ece10832c25f7ffcb07c97a32ad4 upstream.

Like d88b6d04: "cdrom: information leak in cdrom_ioctl_media_changed()"

There is another cast from unsigned long to int which causes
a bounds check to fail with specially crafted input. The value is
then used as an index in the slot array in cdrom_slot_status().

Signed-off-by: Scott Bauer <scott.bauer@intel.com>
Signed-off-by: Scott Bauer <sbauer@plzdonthack.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 drivers/cdrom/cdrom.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/cdrom/cdrom.c
+++ b/drivers/cdrom/cdrom.c
@@ -2528,7 +2528,7 @@ static int cdrom_ioctl_drive_status(stru
 	if (!CDROM_CAN(CDC_SELECT_DISC) ||
 	    (arg == CDSL_CURRENT || arg == CDSL_NONE))
 		return cdi->ops->drive_status(cdi, CDSL_CURRENT);
-	if (((int)arg >= cdi->capacity))
+	if (arg >= cdi->capacity)
 		return -EINVAL;
 	return cdrom_slot_status(cdi, arg);
 }


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 51/63] xfs: catch inode allocation state mismatch corruption
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  5:25   ` Dave Chinner
  2018-09-22  0:15 ` [PATCH 3.16 07/63] usbip: usbip_host: refine probe and disconnect debug msgs to be useful Ben Hutchings
                   ` (62 subsequent siblings)
  63 siblings, 1 reply; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: akpm, Darrick J. Wong, Dave Chinner, Carlos Maiolino

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Dave Chinner <dchinner@redhat.com>

commit ee457001ed6c6f31ddad69c24c1da8f377d8472d upstream.

We recently came across a V4 filesystem causing memory corruption
due to a newly allocated inode being setup twice and being added to
the superblock inode list twice. From code inspection, the only way
this could happen is if a newly allocated inode was not marked as
free on disk (i.e. di_mode wasn't zero).

Running the metadump on an upstream debug kernel fails during inode
allocation like so:

XFS: Assertion failed: ip->i_d.di_nblocks == 0, file: fs/xfs/xfs_inod=
e.c, line: 838
 ------------[ cut here ]------------
kernel BUG at fs/xfs/xfs_message.c:114!
invalid opcode: 0000 [#1] PREEMPT SMP
CPU: 11 PID: 3496 Comm: mkdir Not tainted 4.16.0-rc5-dgc #442
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/0=
1/2014
RIP: 0010:assfail+0x28/0x30
RSP: 0018:ffffc9000236fc80 EFLAGS: 00010202
RAX: 00000000ffffffea RBX: 0000000000004000 RCX: 0000000000000000
RDX: 00000000ffffffc0 RSI: 000000000000000a RDI: ffffffff8227211b
RBP: ffffc9000236fce8 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000bec R11: f000000000000000 R12: ffffc9000236fd30
R13: ffff8805c76bab80 R14: ffff8805c77ac800 R15: ffff88083fb12e10
FS:  00007fac8cbff040(0000) GS:ffff88083fd00000(0000) knlGS:0000000000000=
000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fffa6783ff8 CR3: 00000005c6e2b003 CR4: 00000000000606e0
Call Trace:
 xfs_ialloc+0x383/0x570
 xfs_dir_ialloc+0x6a/0x2a0
 xfs_create+0x412/0x670
 xfs_generic_create+0x1f7/0x2c0
 ? capable_wrt_inode_uidgid+0x3f/0x50
 vfs_mkdir+0xfb/0x1b0
 SyS_mkdir+0xcf/0xf0
 do_syscall_64+0x73/0x1a0
 entry_SYSCALL_64_after_hwframe+0x42/0xb7

Extracting the inode number we crashed on from an event trace and
looking at it with xfs_db:

xfs_db> inode 184452204
xfs_db> p
core.magic = 0x494e
core.mode = 0100644
core.version = 2
core.format = 2 (extents)
core.nlinkv2 = 1
core.onlink = 0
.....

Confirms that it is not a free inode on disk. xfs_repair
also trips over this inode:

.....
zero length extent (off = 0, fsbno = 0) in ino 184452204
correcting nextents for inode 184452204
bad attribute fork in inode 184452204, would clear attr fork
bad nblocks 1 for inode 184452204, would reset to 0
bad anextents 1 for inode 184452204, would reset to 0
imap claims in-use inode 184452204 is free, would correct imap
would have cleared inode 184452204
.....
disconnected inode 184452204, would move to lost+found

And so we have a situation where the directory structure and the
inobt thinks the inode is free, but the inode on disk thinks it is
still in use. Where this corruption came from is not possible to
diagnose, but we can detect it and prevent the kernel from oopsing
on lookup. The reproducer now results in:

$ sudo mkdir /mnt/scratch/{0,1,2,3,4,5}{0,1,2,3,4,5}
mkdir: cannot create directory =E2=80=98/mnt/scratch/00=E2=80=99: File ex=
ists
mkdir: cannot create directory =E2=80=98/mnt/scratch/01=E2=80=99: File ex=
ists
mkdir: cannot create directory =E2=80=98/mnt/scratch/03=E2=80=99: Structu=
re needs cleaning
mkdir: cannot create directory =E2=80=98/mnt/scratch/04=E2=80=99: Input/o=
utput error
mkdir: cannot create directory =E2=80=98/mnt/scratch/05=E2=80=99: Input/o=
utput error
....

And this corruption shutdown:

[   54.843517] XFS (loop0): Corruption detected! Free inode 0xafe846c not=
 marked free on disk
[   54.845885] XFS (loop0): Internal error xfs_trans_cancel at line 1023 =
of file fs/xfs/xfs_trans.c.  Caller xfs_create+0x425/0x670
[   54.848994] CPU: 10 PID: 3541 Comm: mkdir Not tainted 4.16.0-rc5-dgc #=
443
[   54.850753] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIO=
S 1.10.2-1 04/01/2014
[   54.852859] Call Trace:
[   54.853531]  dump_stack+0x85/0xc5
[   54.854385]  xfs_trans_cancel+0x197/0x1c0
[   54.855421]  xfs_create+0x425/0x670
[   54.856314]  xfs_generic_create+0x1f7/0x2c0
[   54.857390]  ? capable_wrt_inode_uidgid+0x3f/0x50
[   54.858586]  vfs_mkdir+0xfb/0x1b0
[   54.859458]  SyS_mkdir+0xcf/0xf0
[   54.860254]  do_syscall_64+0x73/0x1a0
[   54.861193]  entry_SYSCALL_64_after_hwframe+0x42/0xb7
[   54.862492] RIP: 0033:0x7fb73bddf547
[   54.863358] RSP: 002b:00007ffdaa553338 EFLAGS: 00000246 ORIG_RAX: 0000=
000000000053
[   54.865133] RAX: ffffffffffffffda RBX: 00007ffdaa55449a RCX: 00007fb73=
bddf547
[   54.866766] RDX: 0000000000000001 RSI: 00000000000001ff RDI: 00007ffda=
a55449a
[   54.868432] RBP: 00007ffdaa55449a R08: 00000000000001ff R09: 00005623a=
8670dd0
[   54.870110] R10: 00007fb73be72d5b R11: 0000000000000246 R12: 000000000=
00001ff
[   54.871752] R13: 00007ffdaa5534b0 R14: 0000000000000000 R15: 00007ffda=
a553500
[   54.873429] XFS (loop0): xfs_do_force_shutdown(0x8) called from line 1=
024 of file fs/xfs/xfs_trans.c.  Return address = ffffffff814cd050
[   54.882790] XFS (loop0): Corruption of in-memory data detected.  Shutt=
ing down filesystem
[   54.884597] XFS (loop0): Please umount the filesystem and rectify the =
problem(s)

Note that this crash is only possible on v4 filesystemsi or v5
filesystems mounted with the ikeep mount option. For all other V5
filesystems, this problem cannot occur because we don't read inodes
we are allocating from disk - we simply overwrite them with the new
inode information.

Signed-Off-By: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Tested-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
[bwh: Backported to 3.16:
 - Look up mode in XFS inode, not VFS inode
 - Use positive error codes, and EIO instead of EFSCORRUPTED]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 fs/xfs/xfs_icache.c | 23 ++++++++++++++++++++++-
 1 file changed, 22 insertions(+), 1 deletion(-)

--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -293,7 +293,28 @@ xfs_iget_cache_miss(
 
 	trace_xfs_iget_miss(ip);
 
-	if ((ip->i_d.di_mode == 0) && !(flags & XFS_IGET_CREATE)) {
+
+	/*
+	 * If we are allocating a new inode, then check what was returned is
+	 * actually a free, empty inode. If we are not allocating an inode,
+	 * the check we didn't find a free inode.
+	 */
+	if (flags & XFS_IGET_CREATE) {
+		if (ip->i_d.di_mode != 0) {
+			xfs_warn(mp,
+"Corruption detected! Free inode 0x%llx not marked free on disk",
+				ino);
+			error = EIO;
+			goto out_destroy;
+		}
+		if (ip->i_d.di_nblocks != 0) {
+			xfs_warn(mp,
+"Corruption detected! Free inode 0x%llx has blocks allocated!",
+				ino);
+			error = EIO;
+			goto out_destroy;
+		}
+	} else if (ip->i_d.di_mode == 0) {
 		error = ENOENT;
 		goto out_destroy;
 	}


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 63/63] mm: get rid of vmacache_flush_all() entirely
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (8 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 29/63] ext4: make sure bitmaps and the inode table don't overlap with bg descriptors Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 26/63] ext4: verify the depth of extent tree in ext4_find_extent() Ben Hutchings
                   ` (53 subsequent siblings)
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: akpm, Will Deacon, Jann Horn, Greg Kroah-Hartman, Oleg Nesterov,
	Davidlohr Bueso, Linus Torvalds

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Linus Torvalds <torvalds@linux-foundation.org>

commit 7a9cdebdcc17e426fb5287e4a82db1dfe86339b2 upstream.

Jann Horn points out that the vmacache_flush_all() function is not only
potentially expensive, it's buggy too.  It also happens to be entirely
unnecessary, because the sequence number overflow case can be avoided by
simply making the sequence number be 64-bit.  That doesn't even grow the
data structures in question, because the other adjacent fields are
already 64-bit.

So simplify the whole thing by just making the sequence number overflow
case go away entirely, which gets rid of all the complications and makes
the code faster too.  Win-win.

[ Oleg Nesterov points out that the VMACACHE_FULL_FLUSHES statistics
  also just goes away entirely with this ]

Reported-by: Jann Horn <jannh@google.com>
Suggested-by: Will Deacon <will.deacon@arm.com>
Acked-by: Davidlohr Bueso <dave@stgolabs.net>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[bwh: Backported to 3.16: drop changes to mm debug code]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -345,7 +345,7 @@ struct kioctx_table;
 struct mm_struct {
 	struct vm_area_struct *mmap;		/* list of VMAs */
 	struct rb_root mm_rb;
-	u32 vmacache_seqnum;                   /* per-thread vmacache */
+	u64 vmacache_seqnum;                   /* per-thread vmacache */
 #ifdef CONFIG_MMU
 	unsigned long (*get_unmapped_area) (struct file *filp,
 				unsigned long addr, unsigned long len,
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1291,7 +1291,7 @@ struct task_struct {
 	unsigned brk_randomized:1;
 #endif
 	/* per-thread vma caching */
-	u32 vmacache_seqnum;
+	u64 vmacache_seqnum;
 	struct vm_area_struct *vmacache[VMACACHE_SIZE];
 #if defined(SPLIT_RSS_COUNTING)
 	struct task_rss_stat	rss_stat;
--- a/include/linux/vmacache.h
+++ b/include/linux/vmacache.h
@@ -15,7 +15,6 @@ static inline void vmacache_flush(struct
 	memset(tsk->vmacache, 0, sizeof(tsk->vmacache));
 }
 
-extern void vmacache_flush_all(struct mm_struct *mm);
 extern void vmacache_update(unsigned long addr, struct vm_area_struct *newvma);
 extern struct vm_area_struct *vmacache_find(struct mm_struct *mm,
 						    unsigned long addr);
@@ -29,10 +28,6 @@ extern struct vm_area_struct *vmacache_f
 static inline void vmacache_invalidate(struct mm_struct *mm)
 {
 	mm->vmacache_seqnum++;
-
-	/* deal with overflows */
-	if (unlikely(mm->vmacache_seqnum == 0))
-		vmacache_flush_all(mm);
 }
 
 #endif /* __LINUX_VMACACHE_H */
--- a/mm/vmacache.c
+++ b/mm/vmacache.c
@@ -6,42 +6,6 @@
 #include <linux/vmacache.h>
 
 /*
- * Flush vma caches for threads that share a given mm.
- *
- * The operation is safe because the caller holds the mmap_sem
- * exclusively and other threads accessing the vma cache will
- * have mmap_sem held at least for read, so no extra locking
- * is required to maintain the vma cache.
- */
-void vmacache_flush_all(struct mm_struct *mm)
-{
-	struct task_struct *g, *p;
-
-	/*
-	 * Single threaded tasks need not iterate the entire
-	 * list of process. We can avoid the flushing as well
-	 * since the mm's seqnum was increased and don't have
-	 * to worry about other threads' seqnum. Current's
-	 * flush will occur upon the next lookup.
-	 */
-	if (atomic_read(&mm->mm_users) == 1)
-		return;
-
-	rcu_read_lock();
-	for_each_process_thread(g, p) {
-		/*
-		 * Only flush the vmacache pointers as the
-		 * mm seqnum is already set and curr's will
-		 * be set upon invalidation when the next
-		 * lookup is done.
-		 */
-		if (mm == p->mm)
-			vmacache_flush(p);
-	}
-	rcu_read_unlock();
-}
-
-/*
  * This task may be accessing a foreign mm via (for example)
  * get_user_pages()->find_vma().  The vmacache is task-local and this
  * task's vmacache pertains to a different mm (ie, its own).  There is


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 45/63] x86/paravirt: Fix spectre-v2 mitigations for paravirt guests
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (3 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 54/63] seccomp: create internal mode-setting function Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 36/63] jbd2: don't mark block as modified if the handle is out of credits Ben Hutchings
                   ` (58 subsequent siblings)
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: akpm, Konrad Rzeszutek Wilk, David Woodhouse, Peter Zijlstra,
	Juergen Gross, Boris Ostrovsky, Nadav Amit, Thomas Gleixner

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Peter Zijlstra <peterz@infradead.org>

commit 5800dc5c19f34e6e03b5adab1282535cb102fafd upstream.

Nadav reported that on guests we're failing to rewrite the indirect
calls to CALLEE_SAVE paravirt functions. In particular the
pv_queued_spin_unlock() call is left unpatched and that is all over the
place. This obviously wrecks Spectre-v2 mitigation (for paravirt
guests) which relies on not actually having indirect calls around.

The reason is an incorrect clobber test in paravirt_patch_call(); this
function rewrites an indirect call with a direct call to the _SAME_
function, there is no possible way the clobbers can be different
because of this.

Therefore remove this clobber check. Also put WARNs on the other patch
failure case (not enough room for the instruction) which I've not seen
trigger in my (limited) testing.

Three live kernel image disassemblies for lock_sock_nested (as a small
function that illustrates the problem nicely). PRE is the current
situation for guests, POST is with this patch applied and NATIVE is with
or without the patch for !guests.

PRE:

(gdb) disassemble lock_sock_nested
Dump of assembler code for function lock_sock_nested:
   0xffffffff817be970 <+0>:     push   %rbp
   0xffffffff817be971 <+1>:     mov    %rdi,%rbp
   0xffffffff817be974 <+4>:     push   %rbx
   0xffffffff817be975 <+5>:     lea    0x88(%rbp),%rbx
   0xffffffff817be97c <+12>:    callq  0xffffffff819f7160 <_cond_resched>
   0xffffffff817be981 <+17>:    mov    %rbx,%rdi
   0xffffffff817be984 <+20>:    callq  0xffffffff819fbb00 <_raw_spin_lock_bh>
   0xffffffff817be989 <+25>:    mov    0x8c(%rbp),%eax
   0xffffffff817be98f <+31>:    test   %eax,%eax
   0xffffffff817be991 <+33>:    jne    0xffffffff817be9ba <lock_sock_nested+74>
   0xffffffff817be993 <+35>:    movl   $0x1,0x8c(%rbp)
   0xffffffff817be99d <+45>:    mov    %rbx,%rdi
   0xffffffff817be9a0 <+48>:    callq  *0xffffffff822299e8
   0xffffffff817be9a7 <+55>:    pop    %rbx
   0xffffffff817be9a8 <+56>:    pop    %rbp
   0xffffffff817be9a9 <+57>:    mov    $0x200,%esi
   0xffffffff817be9ae <+62>:    mov    $0xffffffff817be993,%rdi
   0xffffffff817be9b5 <+69>:    jmpq   0xffffffff81063ae0 <__local_bh_enable_ip>
   0xffffffff817be9ba <+74>:    mov    %rbp,%rdi
   0xffffffff817be9bd <+77>:    callq  0xffffffff817be8c0 <__lock_sock>
   0xffffffff817be9c2 <+82>:    jmp    0xffffffff817be993 <lock_sock_nested+35>
End of assembler dump.

POST:

(gdb) disassemble lock_sock_nested
Dump of assembler code for function lock_sock_nested:
   0xffffffff817be970 <+0>:     push   %rbp
   0xffffffff817be971 <+1>:     mov    %rdi,%rbp
   0xffffffff817be974 <+4>:     push   %rbx
   0xffffffff817be975 <+5>:     lea    0x88(%rbp),%rbx
   0xffffffff817be97c <+12>:    callq  0xffffffff819f7160 <_cond_resched>
   0xffffffff817be981 <+17>:    mov    %rbx,%rdi
   0xffffffff817be984 <+20>:    callq  0xffffffff819fbb00 <_raw_spin_lock_bh>
   0xffffffff817be989 <+25>:    mov    0x8c(%rbp),%eax
   0xffffffff817be98f <+31>:    test   %eax,%eax
   0xffffffff817be991 <+33>:    jne    0xffffffff817be9ba <lock_sock_nested+74>
   0xffffffff817be993 <+35>:    movl   $0x1,0x8c(%rbp)
   0xffffffff817be99d <+45>:    mov    %rbx,%rdi
   0xffffffff817be9a0 <+48>:    callq  0xffffffff810a0c20 <__raw_callee_save___pv_queued_spin_unlock>
   0xffffffff817be9a5 <+53>:    xchg   %ax,%ax
   0xffffffff817be9a7 <+55>:    pop    %rbx
   0xffffffff817be9a8 <+56>:    pop    %rbp
   0xffffffff817be9a9 <+57>:    mov    $0x200,%esi
   0xffffffff817be9ae <+62>:    mov    $0xffffffff817be993,%rdi
   0xffffffff817be9b5 <+69>:    jmpq   0xffffffff81063aa0 <__local_bh_enable_ip>
   0xffffffff817be9ba <+74>:    mov    %rbp,%rdi
   0xffffffff817be9bd <+77>:    callq  0xffffffff817be8c0 <__lock_sock>
   0xffffffff817be9c2 <+82>:    jmp    0xffffffff817be993 <lock_sock_nested+35>
End of assembler dump.

NATIVE:

(gdb) disassemble lock_sock_nested
Dump of assembler code for function lock_sock_nested:
   0xffffffff817be970 <+0>:     push   %rbp
   0xffffffff817be971 <+1>:     mov    %rdi,%rbp
   0xffffffff817be974 <+4>:     push   %rbx
   0xffffffff817be975 <+5>:     lea    0x88(%rbp),%rbx
   0xffffffff817be97c <+12>:    callq  0xffffffff819f7160 <_cond_resched>
   0xffffffff817be981 <+17>:    mov    %rbx,%rdi
   0xffffffff817be984 <+20>:    callq  0xffffffff819fbb00 <_raw_spin_lock_bh>
   0xffffffff817be989 <+25>:    mov    0x8c(%rbp),%eax
   0xffffffff817be98f <+31>:    test   %eax,%eax
   0xffffffff817be991 <+33>:    jne    0xffffffff817be9ba <lock_sock_nested+74>
   0xffffffff817be993 <+35>:    movl   $0x1,0x8c(%rbp)
   0xffffffff817be99d <+45>:    mov    %rbx,%rdi
   0xffffffff817be9a0 <+48>:    movb   $0x0,(%rdi)
   0xffffffff817be9a3 <+51>:    nopl   0x0(%rax)
   0xffffffff817be9a7 <+55>:    pop    %rbx
   0xffffffff817be9a8 <+56>:    pop    %rbp
   0xffffffff817be9a9 <+57>:    mov    $0x200,%esi
   0xffffffff817be9ae <+62>:    mov    $0xffffffff817be993,%rdi
   0xffffffff817be9b5 <+69>:    jmpq   0xffffffff81063ae0 <__local_bh_enable_ip>
   0xffffffff817be9ba <+74>:    mov    %rbp,%rdi
   0xffffffff817be9bd <+77>:    callq  0xffffffff817be8c0 <__lock_sock>
   0xffffffff817be9c2 <+82>:    jmp    0xffffffff817be993 <lock_sock_nested+35>
End of assembler dump.


Fixes: 63f70270ccd9 ("[PATCH] i386: PARAVIRT: add common patching machinery")
Fixes: 3010a0663fd9 ("x86/paravirt, objtool: Annotate indirect calls")
Reported-by: Nadav Amit <namit@vmware.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 arch/x86/kernel/paravirt.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -97,10 +97,12 @@ unsigned paravirt_patch_call(void *insnb
 	struct branch *b = insnbuf;
 	unsigned long delta = (unsigned long)target - (addr+5);
 
-	if (tgt_clobbers & ~site_clobbers)
-		return len;	/* target would clobber too much for this site */
-	if (len < 5)
+	if (len < 5) {
+#ifdef CONFIG_RETPOLINE
+		WARN_ONCE("Failing to patch indirect CALL in %ps\n", (void *)addr);
+#endif
 		return len;	/* call too long for patch site */
+	}
 
 	b->opcode = 0xe8; /* call */
 	b->delta = delta;
@@ -115,8 +117,12 @@ unsigned paravirt_patch_jmp(void *insnbu
 	struct branch *b = insnbuf;
 	unsigned long delta = (unsigned long)target - (addr+5);
 
-	if (len < 5)
+	if (len < 5) {
+#ifdef CONFIG_RETPOLINE
+		WARN_ONCE("Failing to patch indirect JMP in %ps\n", (void *)addr);
+#endif
 		return len;	/* call too long for patch site */
+	}
 
 	b->opcode = 0xe9;	/* jmp */
 	b->delta = delta;


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 59/63] x86/process: Correct and optimize TIF_BLOCKSTEP switch
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (59 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 39/63] x86/entry/64: Remove %ebx handling from error_entry/exit Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 50/63] hfsplus: fix NULL dereference in hfsplus_lookup() Ben Hutchings
                   ` (2 subsequent siblings)
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: akpm, Peter Zijlstra, David Woodhouse, Thomas Gleixner,
	Greg Kroah-Hartman, Kyle Huey, Andy Lutomirski, Kyle Huey

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Kyle Huey <me@kylehuey.com>

commit b9894a2f5bd18b1691cb6872c9afe32b148d0132 upstream.

The debug control MSR is "highly magical" as the blockstep bit can be
cleared by hardware under not well documented circumstances.

So a task switch relying on the bit set by the previous task (according to
the previous tasks thread flags) can trip over this and not update the flag
for the next task.

To fix this its required to handle DEBUGCTLMSR_BTF when either the previous
or the next or both tasks have the TIF_BLOCKSTEP flag set.

While at it avoid branching within the TIF_BLOCKSTEP case and evaluating
boot_cpu_data twice in kernels without CONFIG_X86_DEBUGCTLMSR.

x86_64: arch/x86/kernel/process.o
text	data	bss	dec	 hex
3024    8577    16      11617    2d61	Before
3008	8577	16	11601	 2d51	After

i386: No change

[ tglx: Made the shift value explicit, use a local variable to make the
code readable and massaged changelog]

Originally-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Kyle Huey <khuey@kylehuey.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Link: http://lkml.kernel.org/r/20170214081104.9244-3-khuey@kylehuey.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[bwh: Backported to 3.16: adjust filename]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 arch/x86/include/uapi/asm/msr-index.h |  1 +
 arch/x86/kernel/process.c             | 12 +++++++-----
 2 files changed, 8 insertions(+), 5 deletions(-)

--- a/arch/x86/include/uapi/asm/msr-index.h
+++ b/arch/x86/include/uapi/asm/msr-index.h
@@ -109,6 +109,7 @@
 
 /* DEBUGCTLMSR bits (others vary by model): */
 #define DEBUGCTLMSR_LBR			(1UL <<  0) /* last branch recording */
+#define DEBUGCTLMSR_BTF_SHIFT		1
 #define DEBUGCTLMSR_BTF			(1UL <<  1) /* single-step on branches */
 #define DEBUGCTLMSR_TR			(1UL <<  6)
 #define DEBUGCTLMSR_BTS			(1UL <<  7)
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -231,13 +231,15 @@ void __switch_to_xtra(struct task_struct
 
 	propagate_user_return_notify(prev_p, next_p);
 
-	if ((tifp ^ tifn) & _TIF_BLOCKSTEP) {
-		unsigned long debugctl = get_debugctlmsr();
+	if ((tifp & _TIF_BLOCKSTEP || tifn & _TIF_BLOCKSTEP) &&
+	    arch_has_block_step()) {
+		unsigned long debugctl, msk;
 
+		rdmsrl(MSR_IA32_DEBUGCTLMSR, debugctl);
 		debugctl &= ~DEBUGCTLMSR_BTF;
-		if (tifn & _TIF_BLOCKSTEP)
-			debugctl |= DEBUGCTLMSR_BTF;
-		update_debugctlmsr(debugctl);
+		msk = tifn & _TIF_BLOCKSTEP;
+		debugctl |= (msk >> TIF_BLOCKSTEP) << DEBUGCTLMSR_BTF_SHIFT;
+		wrmsrl(MSR_IA32_DEBUGCTLMSR, debugctl);
 	}
 
 	if ((tifp ^ tifn) & _TIF_NOTSC) {


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 52/63] xfs: validate cached inodes are free when allocated
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (44 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 43/63] x86/speculation: Clean up various Spectre related details Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  5:26   ` Dave Chinner
  2018-09-22  0:15 ` [PATCH 3.16 48/63] video: uvesafb: Fix integer overflow in allocation Ben Hutchings
                   ` (17 subsequent siblings)
  63 siblings, 1 reply; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: akpm, Dave Chinner, Christoph Hellwig, Wen Xu, Carlos Maiolino,
	Darrick J. Wong

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Dave Chinner <dchinner@redhat.com>

commit afca6c5b2595fc44383919fba740c194b0b76aff upstream.

A recent fuzzed filesystem image cached random dcache corruption
when the reproducer was run. This often showed up as panics in
lookup_slow() on a null inode->i_ops pointer when doing pathwalks.

BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
....
Call Trace:
 lookup_slow+0x44/0x60
 walk_component+0x3dd/0x9f0
 link_path_walk+0x4a7/0x830
 path_lookupat+0xc1/0x470
 filename_lookup+0x129/0x270
 user_path_at_empty+0x36/0x40
 path_listxattr+0x98/0x110
 SyS_listxattr+0x13/0x20
 do_syscall_64+0xf5/0x280
 entry_SYSCALL_64_after_hwframe+0x42/0xb7

but had many different failure modes including deadlocks trying to
lock the inode that was just allocated or KASAN reports of
use-after-free violations.

The cause of the problem was a corrupt INOBT on a v4 fs where the
root inode was marked as free in the inobt record. Hence when we
allocated an inode, it chose the root inode to allocate, found it in
the cache and re-initialised it.

We recently fixed a similar inode allocation issue caused by inobt
record corruption problem in xfs_iget_cache_miss() in commit
ee457001ed6c ("xfs: catch inode allocation state mismatch
corruption"). This change adds similar checks to the cache-hit path
to catch it, and turns the reproducer into a corruption shutdown
situation.

Reported-by: Wen Xu <wen.xu@gatech.edu>
Signed-Off-By: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
[darrick: fix typos in comment]
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
[bwh: Backported to 3.16:
 - Look up mode in XFS inode, not VFS inode
 - Use positive error codes, and EIO instead of EFSCORRUPTED]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 fs/xfs/xfs_icache.c | 73 +++++++++++++++++++++++++++++----------------
 1 file changed, 48 insertions(+), 25 deletions(-)

--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -133,6 +133,46 @@ xfs_inode_free(
 }
 
 /*
+ * If we are allocating a new inode, then check what was returned is
+ * actually a free, empty inode. If we are not allocating an inode,
+ * then check we didn't find a free inode.
+ *
+ * Returns:
+ *	0		if the inode free state matches the lookup context
+ *	ENOENT		if the inode is free and we are not allocating
+ *	EFSCORRUPTED	if there is any state mismatch at all
+ */
+static int
+xfs_iget_check_free_state(
+	struct xfs_inode	*ip,
+	int			flags)
+{
+	if (flags & XFS_IGET_CREATE) {
+		/* should be a free inode */
+		if (ip->i_d.di_mode != 0) {
+			xfs_warn(ip->i_mount,
+"Corruption detected! Free inode 0x%llx not marked free! (mode 0x%x)",
+				ip->i_ino, ip->i_d.di_mode);
+			return EIO;
+		}
+
+		if (ip->i_d.di_nblocks != 0) {
+			xfs_warn(ip->i_mount,
+"Corruption detected! Free inode 0x%llx has blocks allocated!",
+				ip->i_ino);
+			return EIO;
+		}
+		return 0;
+	}
+
+	/* should be an allocated inode */
+	if (ip->i_d.di_mode == 0)
+		return ENOENT;
+
+	return 0;
+}
+
+/*
  * Check the validity of the inode we just found it the cache
  */
 static int
@@ -181,12 +221,12 @@ xfs_iget_cache_hit(
 	}
 
 	/*
-	 * If lookup is racing with unlink return an error immediately.
+	 * Check the inode free state is valid. This also detects lookup
+	 * racing with unlinks.
 	 */
-	if (ip->i_d.di_mode == 0 && !(flags & XFS_IGET_CREATE)) {
-		error = ENOENT;
+	error = xfs_iget_check_free_state(ip, flags);
+	if (error)
 		goto out_error;
-	}
 
 	/*
 	 * If IRECLAIMABLE is set, we've torn down the VFS inode already.
@@ -295,29 +335,12 @@ xfs_iget_cache_miss(
 
 
 	/*
-	 * If we are allocating a new inode, then check what was returned is
-	 * actually a free, empty inode. If we are not allocating an inode,
-	 * the check we didn't find a free inode.
+	 * Check the inode free state is valid. This also detects lookup
+	 * racing with unlinks.
 	 */
-	if (flags & XFS_IGET_CREATE) {
-		if (ip->i_d.di_mode != 0) {
-			xfs_warn(mp,
-"Corruption detected! Free inode 0x%llx not marked free on disk",
-				ino);
-			error = EIO;
-			goto out_destroy;
-		}
-		if (ip->i_d.di_nblocks != 0) {
-			xfs_warn(mp,
-"Corruption detected! Free inode 0x%llx has blocks allocated!",
-				ino);
-			error = EIO;
-			goto out_destroy;
-		}
-	} else if (ip->i_d.di_mode == 0) {
-		error = ENOENT;
+	error = xfs_iget_check_free_state(ip, flags);
+	if (error)
 		goto out_destroy;
-	}
 
 	/*
 	 * Preload the radix tree so we can insert safely under the


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 49/63] btrfs: relocation: Only remove reloc rb_trees if reloc control has been initialized
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (39 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 44/63] x86/speculation: Protect against userspace-userspace spectreRSB Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 24/63] ext4: only look at the bg_flags field if it is valid Ben Hutchings
                   ` (22 subsequent siblings)
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: akpm, Qu Wenruo, David Sterba, Gu Jinxiang, Xu Wen

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Qu Wenruo <wqu@suse.com>

commit 389305b2aa68723c754f88d9dbd268a400e10664 upstream.

Invalid reloc tree can cause kernel NULL pointer dereference when btrfs
does some cleanup of the reloc roots.

It turns out that fs_info::reloc_ctl can be NULL in
btrfs_recover_relocation() as we allocate relocation control after all
reloc roots have been verified.
So when we hit: note, we haven't called set_reloc_control() thus
fs_info::reloc_ctl is still NULL.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=199833
Reported-by: Xu Wen <wen.xu@gatech.edu>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Tested-by: Gu Jinxiang <gujx@cn.fujitsu.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 fs/btrfs/relocation.c | 23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)

--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -1311,18 +1311,19 @@ static void __del_reloc_root(struct btrf
 	struct mapping_node *node = NULL;
 	struct reloc_control *rc = root->fs_info->reloc_ctl;
 
-	spin_lock(&rc->reloc_root_tree.lock);
-	rb_node = tree_search(&rc->reloc_root_tree.rb_root,
-			      root->node->start);
-	if (rb_node) {
-		node = rb_entry(rb_node, struct mapping_node, rb_node);
-		rb_erase(&node->rb_node, &rc->reloc_root_tree.rb_root);
+	if (rc) {
+		spin_lock(&rc->reloc_root_tree.lock);
+		rb_node = tree_search(&rc->reloc_root_tree.rb_root,
+				      root->node->start);
+		if (rb_node) {
+			node = rb_entry(rb_node, struct mapping_node, rb_node);
+			rb_erase(&node->rb_node, &rc->reloc_root_tree.rb_root);
+		}
+		spin_unlock(&rc->reloc_root_tree.lock);
+		if (!node)
+			return;
+		BUG_ON((struct btrfs_root *)node->data != root);
 	}
-	spin_unlock(&rc->reloc_root_tree.lock);
-
-	if (!node)
-		return;
-	BUG_ON((struct btrfs_root *)node->data != root);
 
 	spin_lock(&root->fs_info->trans_lock);
 	list_del_init(&root->root_list);


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 50/63] hfsplus: fix NULL dereference in hfsplus_lookup()
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (60 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 59/63] x86/process: Correct and optimize TIF_BLOCKSTEP switch Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 62/63] KVM: x86: introduce num_emulated_msrs Ben Hutchings
  2018-09-22 12:28 ` [PATCH 3.16 00/63] 3.16.58-rc1 review Guenter Roeck
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: akpm, Wen Xu, Ernesto A. Fernández, Viacheslav Dubeyko,
	Linus Torvalds

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Ernesto A. Fernández
 <ernesto.mnd.fernandez@gmail.com>

commit a7ec7a4193a2eb3b5341243fc0b621c1ac9e4ec4 upstream.

An HFS+ filesystem can be mounted read-only without having a metadata
directory, which is needed to support hardlinks.  But if the catalog
data is corrupted, a directory lookup may still find dentries claiming
to be hardlinks.

hfsplus_lookup() does check that ->hidden_dir is not NULL in such a
situation, but mistakenly does so after dereferencing it for the first
time.  Reorder this check to prevent a crash.

This happens when looking up corrupted catalog data (dentry) on a
filesystem with no metadata directory (this could only ever happen on a
read-only mount).  Wen Xu sent the replication steps in detail to the
fsdevel list: https://bugzilla.kernel.org/show_bug.cgi?id=200297

Link: http://lkml.kernel.org/r/20180712215344.q44dyrhymm4ajkao@eaf
Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com>
Reported-by: Wen Xu <wen.xu@gatech.edu>
Cc: Viacheslav Dubeyko <slava@dubeyko.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[bwh: Backported to 3.16: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 fs/hfsplus/dir.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/fs/hfsplus/dir.c
+++ b/fs/hfsplus/dir.c
@@ -74,13 +74,13 @@ again:
 				cpu_to_be32(HFSP_HARDLINK_TYPE) &&
 				entry.file.user_info.fdCreator ==
 				cpu_to_be32(HFSP_HFSPLUS_CREATOR) &&
+				HFSPLUS_SB(sb)->hidden_dir &&
 				(entry.file.create_date ==
 					HFSPLUS_I(HFSPLUS_SB(sb)->hidden_dir)->
 						create_date ||
 				entry.file.create_date ==
 					HFSPLUS_I(sb->s_root->d_inode)->
-						create_date) &&
-				HFSPLUS_SB(sb)->hidden_dir) {
+						create_date)) {
 			struct qstr str;
 			char name[32];
 


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 47/63] uas: replace WARN_ON_ONCE() with lockdep_assert_held()
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (37 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 37/63] ext4: avoid running out of journal credits when appending to an inline file Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 44/63] x86/speculation: Protect against userspace-userspace spectreRSB Ben Hutchings
                   ` (24 subsequent siblings)
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: akpm, Sanjeev Sharma, Greg Kroah-Hartman, Sanjeev Sharma, Hans de Goede

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Sanjeev Sharma <sanjeev_sharma@mentor.com>

commit ab945eff8396bc3329cc97274320e8d2c6585077 upstream.

on some architecture spin_is_locked() always return false in
uniprocessor configuration and therefore it would be advise
to replace with lockdep_assert_held().

Signed-off-by: Sanjeev Sharma <Sanjeev_Sharma@mentor.com>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 drivers/usb/storage/uas.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

--- a/drivers/usb/storage/uas.c
+++ b/drivers/usb/storage/uas.c
@@ -158,7 +158,7 @@ static void uas_mark_cmd_dead(struct uas
 	struct scsi_cmnd *cmnd = container_of(scp, struct scsi_cmnd, SCp);
 
 	uas_log_cmd_state(cmnd, caller);
-	WARN_ON_ONCE(!spin_is_locked(&devinfo->lock));
+	lockdep_assert_held(&devinfo->lock);
 	WARN_ON_ONCE(cmdinfo->state & COMMAND_ABORTED);
 	cmdinfo->state |= COMMAND_ABORTED;
 	cmdinfo->state &= ~IS_IN_WORK_LIST;
@@ -185,7 +185,7 @@ static void uas_add_work(struct uas_cmd_
 	struct scsi_cmnd *cmnd = container_of(scp, struct scsi_cmnd, SCp);
 	struct uas_dev_info *devinfo = cmnd->device->hostdata;
 
-	WARN_ON_ONCE(!spin_is_locked(&devinfo->lock));
+	lockdep_assert_held(&devinfo->lock);
 	cmdinfo->state |= IS_IN_WORK_LIST;
 	schedule_work(&devinfo->work);
 }
@@ -287,7 +287,7 @@ static int uas_try_complete(struct scsi_
 	struct uas_cmd_info *cmdinfo = (void *)&cmnd->SCp;
 	struct uas_dev_info *devinfo = (void *)cmnd->device->hostdata;
 
-	WARN_ON_ONCE(!spin_is_locked(&devinfo->lock));
+	lockdep_assert_held(&devinfo->lock);
 	if (cmdinfo->state & (COMMAND_INFLIGHT |
 			      DATA_IN_URB_INFLIGHT |
 			      DATA_OUT_URB_INFLIGHT |
@@ -626,7 +626,7 @@ static int uas_submit_urbs(struct scsi_c
 	struct urb *urb;
 	int err;
 
-	WARN_ON_ONCE(!spin_is_locked(&devinfo->lock));
+	lockdep_assert_held(&devinfo->lock);
 	if (cmdinfo->state & SUBMIT_STATUS_URB) {
 		urb = uas_submit_sense_urb(cmnd, gfp, cmdinfo->stream);
 		if (!urb)


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 53/63] xfs: don't call xfs_da_shrink_inode with NULL bp
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (41 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 24/63] ext4: only look at the bg_flags field if it is valid Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 17/63] kvm: x86: use correct privilege level for sgdt/sidt/fxsave/fxrstor access Ben Hutchings
                   ` (20 subsequent siblings)
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: akpm, Eric Sandeen, Darrick J. Wong, Eric Sandeen, Xu, Wen

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Sandeen <sandeen@sandeen.net>

commit bb3d48dcf86a97dc25fe9fc2c11938e19cb4399a upstream.

xfs_attr3_leaf_create may have errored out before instantiating a buffer,
for example if the blkno is out of range.  In that case there is no work
to do to remove it, and in fact xfs_da_shrink_inode will lead to an oops
if we try.

This also seems to fix a flaw where the original error from
xfs_attr3_leaf_create gets overwritten in the cleanup case, and it
removes a pointless assignment to bp which isn't used after this.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=199969
Reported-by: Xu, Wen <wen.xu@gatech.edu>
Tested-by: Xu, Wen <wen.xu@gatech.edu>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
[bwh: Backported to 3.16: adjust filename]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 fs/xfs/xfs_attr_leaf.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

--- a/fs/xfs/xfs_attr_leaf.c
+++ b/fs/xfs/xfs_attr_leaf.c
@@ -701,9 +701,8 @@ xfs_attr_shortform_to_leaf(xfs_da_args_t
 	ASSERT(blkno == 0);
 	error = xfs_attr3_leaf_create(args, blkno, &bp);
 	if (error) {
-		error = xfs_da_shrink_inode(args, 0, bp);
-		bp = NULL;
-		if (error)
+		/* xfs_attr3_leaf_create may not have instantiated a block */
+		if (bp && (xfs_da_shrink_inode(args, 0, bp) != 0))
 			goto out;
 		xfs_idata_realloc(dp, size, XFS_ATTR_FORK);	/* try to put */
 		memcpy(ifp->if_u1.if_data, tmpbuffer, size);	/* it back */


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 60/63] x86/cpu/AMD: Fix erratum 1076 (CPB bit)
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (17 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 08/63] usbip: usbip_host: delete device from busid_table after rebind Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 57/63] seccomp: add "seccomp" syscall Ben Hutchings
                   ` (44 subsequent siblings)
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: akpm, Linus Torvalds, Thomas Gleixner, Sherry Hurwitz,
	Peter Zijlstra, David Woodhouse, Ingo Molnar, Greg Kroah-Hartman,
	Borislav Petkov

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Borislav Petkov <bp@suse.de>

commit f7f3dc00f61261cdc9ccd8b886f21bc4dffd6fd9 upstream.

CPUID Fn8000_0007_EDX[CPB] is wrongly 0 on models up to B1. But they do
support CPB (AMD's Core Performance Boosting cpufreq CPU feature), so fix that.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sherry Hurwitz <sherry.hurwitz@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170907170821.16021-1-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[bwh: Backported to 3.16:
 - Change the added case into an if-statement
 - s/x86_stepping/x86_mask/]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 arch/x86/kernel/cpu/amd.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -533,6 +533,16 @@ static void init_amd_ln(struct cpuinfo_x
 	msr_set_bit(MSR_AMD64_DE_CFG, 31);
 }
 
+static void init_amd_zn(struct cpuinfo_x86 *c)
+{
+	/*
+	 * Fix erratum 1076: CPB feature bit not being set in CPUID. It affects
+	 * all up to and including B1.
+	 */
+	if (c->x86_model <= 1 && c->x86_mask <= 1)
+		set_cpu_cap(c, X86_FEATURE_CPB);
+}
+
 static void init_amd(struct cpuinfo_x86 *c)
 {
 	u32 dummy;
@@ -611,6 +621,9 @@ static void init_amd(struct cpuinfo_x86
 		clear_cpu_cap(c, X86_FEATURE_MCE);
 #endif
 
+	if (c->x86 == 0x17)
+		init_amd_zn(c);
+
 	/* Enable workaround for FXSAVE leak */
 	if (c->x86 >= 6)
 		set_cpu_cap(c, X86_FEATURE_FXSAVE_LEAK);


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 54/63] seccomp: create internal mode-setting function
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (2 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 41/63] USB: yurex: fix out-of-bounds uaccess in read handler Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 45/63] x86/paravirt: Fix spectre-v2 mitigations for paravirt guests Ben Hutchings
                   ` (59 subsequent siblings)
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: akpm, Oleg Nesterov, Kees Cook, Andy Lutomirski

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Kees Cook <keescook@chromium.org>

commit d78ab02c2c194257a03355fbb79eb721b381d105 upstream.

In preparation for having other callers of the seccomp mode setting
logic, split the prctl entry point away from the core logic that performs
seccomp mode setting.

Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 kernel/seccomp.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -473,7 +473,7 @@ long prctl_get_seccomp(void)
 }
 
 /**
- * prctl_set_seccomp: configures current->seccomp.mode
+ * seccomp_set_mode: internal function for setting seccomp mode
  * @seccomp_mode: requested mode to use
  * @filter: optional struct sock_fprog for use with SECCOMP_MODE_FILTER
  *
@@ -486,7 +486,7 @@ long prctl_get_seccomp(void)
  *
  * Returns 0 on success or -EINVAL on failure.
  */
-long prctl_set_seccomp(unsigned long seccomp_mode, char __user *filter)
+static long seccomp_set_mode(unsigned long seccomp_mode, char __user *filter)
 {
 	long ret = -EINVAL;
 
@@ -517,3 +517,15 @@ long prctl_set_seccomp(unsigned long sec
 out:
 	return ret;
 }
+
+/**
+ * prctl_set_seccomp: configures current->seccomp.mode
+ * @seccomp_mode: requested mode to use
+ * @filter: optional struct sock_fprog for use with SECCOMP_MODE_FILTER
+ *
+ * Returns 0 on success or -EINVAL on failure.
+ */
+long prctl_set_seccomp(unsigned long seccomp_mode, char __user *filter)
+{
+	return seccomp_set_mode(seccomp_mode, filter);
+}


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 62/63] KVM: x86: introduce num_emulated_msrs
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (61 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 50/63] hfsplus: fix NULL dereference in hfsplus_lookup() Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22 12:28 ` [PATCH 3.16 00/63] 3.16.58-rc1 review Guenter Roeck
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: akpm, Paolo Bonzini, Radim Krčmář

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Paolo Bonzini <pbonzini@redhat.com>

commit 62ef68bb4d00f1a662e487f3fc44ce8521c416aa upstream.

We will want to filter away MSR_IA32_SMBASE from the emulated_msrs if
the host CPU does not support SMM virtualization.  Introduce the
logic to do that, and also move paravirt MSRs to emulated_msrs for
simplicity and to get rid of KVM_SAVE_MSRS_BEGIN.

Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 arch/x86/kvm/x86.c | 40 +++++++++++++++++++++++++++-------------
 1 file changed, 27 insertions(+), 13 deletions(-)

--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -876,17 +876,11 @@ EXPORT_SYMBOL_GPL(kvm_rdpmc);
  *
  * This list is modified at module load time to reflect the
  * capabilities of the host cpu. This capabilities test skips MSRs that are
- * kvm-specific. Those are put in the beginning of the list.
+ * kvm-specific. Those are put in emulated_msrs; filtering of emulated_msrs
+ * may depend on host virtualization features rather than host cpu features.
  */
 
-#define KVM_SAVE_MSRS_BEGIN	12
 static u32 msrs_to_save[] = {
-	MSR_KVM_SYSTEM_TIME, MSR_KVM_WALL_CLOCK,
-	MSR_KVM_SYSTEM_TIME_NEW, MSR_KVM_WALL_CLOCK_NEW,
-	HV_X64_MSR_GUEST_OS_ID, HV_X64_MSR_HYPERCALL,
-	HV_X64_MSR_TIME_REF_COUNT, HV_X64_MSR_REFERENCE_TSC,
-	HV_X64_MSR_APIC_ASSIST_PAGE, MSR_KVM_ASYNC_PF_EN, MSR_KVM_STEAL_TIME,
-	MSR_KVM_PV_EOI_EN,
 	MSR_IA32_SYSENTER_CS, MSR_IA32_SYSENTER_ESP, MSR_IA32_SYSENTER_EIP,
 	MSR_STAR,
 #ifdef CONFIG_X86_64
@@ -899,7 +893,14 @@ static u32 msrs_to_save[] = {
 
 static unsigned num_msrs_to_save;
 
-static const u32 emulated_msrs[] = {
+static u32 emulated_msrs[] = {
+	MSR_KVM_SYSTEM_TIME, MSR_KVM_WALL_CLOCK,
+	MSR_KVM_SYSTEM_TIME_NEW, MSR_KVM_WALL_CLOCK_NEW,
+	HV_X64_MSR_GUEST_OS_ID, HV_X64_MSR_HYPERCALL,
+	HV_X64_MSR_TIME_REF_COUNT, HV_X64_MSR_REFERENCE_TSC,
+	HV_X64_MSR_APIC_ASSIST_PAGE, MSR_KVM_ASYNC_PF_EN, MSR_KVM_STEAL_TIME,
+	MSR_KVM_PV_EOI_EN,
+
 	MSR_IA32_TSC_ADJUST,
 	MSR_IA32_TSCDEADLINE,
 	MSR_IA32_MISC_ENABLE,
@@ -907,6 +908,8 @@ static const u32 emulated_msrs[] = {
 	MSR_IA32_MCG_CTL,
 };
 
+static unsigned num_emulated_msrs;
+
 bool kvm_valid_efer(struct kvm_vcpu *vcpu, u64 efer)
 {
 	if (efer & efer_reserved_bits)
@@ -2774,7 +2777,7 @@ long kvm_arch_dev_ioctl(struct file *fil
 		if (copy_from_user(&msr_list, user_msr_list, sizeof msr_list))
 			goto out;
 		n = msr_list.nmsrs;
-		msr_list.nmsrs = num_msrs_to_save + ARRAY_SIZE(emulated_msrs);
+		msr_list.nmsrs = num_msrs_to_save + num_emulated_msrs;
 		if (copy_to_user(user_msr_list, &msr_list, sizeof msr_list))
 			goto out;
 		r = -E2BIG;
@@ -2786,7 +2789,7 @@ long kvm_arch_dev_ioctl(struct file *fil
 			goto out;
 		if (copy_to_user(user_msr_list->indices + num_msrs_to_save,
 				 &emulated_msrs,
-				 ARRAY_SIZE(emulated_msrs) * sizeof(u32)))
+				 num_emulated_msrs * sizeof(u32)))
 			goto out;
 		r = 0;
 		break;
@@ -4009,8 +4012,7 @@ static void kvm_init_msr_list(void)
 	u32 dummy[2];
 	unsigned i, j;
 
-	/* skip the first msrs in the list. KVM-specific */
-	for (i = j = KVM_SAVE_MSRS_BEGIN; i < ARRAY_SIZE(msrs_to_save); i++) {
+	for (i = j = 0; i < ARRAY_SIZE(msrs_to_save); i++) {
 		if (rdmsr_safe(msrs_to_save[i], &dummy[0], &dummy[1]) < 0)
 			continue;
 
@@ -4035,6 +4037,18 @@ static void kvm_init_msr_list(void)
 		j++;
 	}
 	num_msrs_to_save = j;
+
+	for (i = j = 0; i < ARRAY_SIZE(emulated_msrs); i++) {
+		switch (emulated_msrs[i]) {
+		default:
+			break;
+		}
+
+		if (j < i)
+			emulated_msrs[j] = emulated_msrs[i];
+		j++;
+	}
+	num_emulated_msrs = j;
 }
 
 static int vcpu_mmio_write(struct kvm_vcpu *vcpu, gpa_t addr, int len,


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 01/63] x86/fpu: Fix the 'nofxsr' boot parameter to also clear X86_FEATURE_FXSR_OPT
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (54 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 34/63] ext4: clear i_data in ext4_inode_info when removing inline data Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 02/63] x86/fpu: Default eagerfpu if FPU and FXSR are enabled Ben Hutchings
                   ` (7 subsequent siblings)
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: akpm, Ingo Molnar, Linus Torvalds, Borislav Petkov,
	Thomas Gleixner, Fenghua Yu, Oleg Nesterov, Peter Zijlstra,
	H. Peter Anvin, Andy Lutomirski, Dave Hansen

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Ingo Molnar <mingo@kernel.org>

commit d364a7656c1855c940dfa4baf4ebcc3c6a9e6fd2 upstream.

I tried to simulate an ancient CPU via this option, and
found that it still has fxsr_opt enabled, confusing the
FPU code.

Make the 'nofxsr' option also clear FXSR_OPT flag.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 arch/x86/kernel/cpu/common.c | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -199,6 +199,15 @@ static int __init x86_noinvpcid_setup(ch
 }
 early_param("noinvpcid", x86_noinvpcid_setup);
 
+static int __init x86_fxsr_setup(char *s)
+{
+	setup_clear_cpu_cap(X86_FEATURE_FXSR);
+	setup_clear_cpu_cap(X86_FEATURE_FXSR_OPT);
+	setup_clear_cpu_cap(X86_FEATURE_XMM);
+	return 1;
+}
+__setup("nofxsr", x86_fxsr_setup);
+
 #ifdef CONFIG_X86_32
 static int cachesize_override = -1;
 static int disable_x86_serial_nr = 1;
@@ -210,14 +219,6 @@ static int __init cachesize_setup(char *
 }
 __setup("cachesize=", cachesize_setup);
 
-static int __init x86_fxsr_setup(char *s)
-{
-	setup_clear_cpu_cap(X86_FEATURE_FXSR);
-	setup_clear_cpu_cap(X86_FEATURE_XMM);
-	return 1;
-}
-__setup("nofxsr", x86_fxsr_setup);
-
 static int __init x86_sep_setup(char *s)
 {
 	setup_clear_cpu_cap(X86_FEATURE_SEP);


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 57/63] seccomp: add "seccomp" syscall
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (18 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 60/63] x86/cpu/AMD: Fix erratum 1076 (CPB bit) Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 20/63] scsi: sg: allocate with __GFP_ZERO in sg_build_indirect() Ben Hutchings
                   ` (43 subsequent siblings)
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: akpm, Andy Lutomirski, Oleg Nesterov, Kees Cook

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Kees Cook <keescook@chromium.org>

commit 48dc92b9fc3926844257316e75ba11eb5c742b2c upstream.

This adds the new "seccomp" syscall with both an "operation" and "flags"
parameter for future expansion. The third argument is a pointer value,
used with the SECCOMP_SET_MODE_FILTER operation. Currently, flags must
be 0. This is functionally equivalent to prctl(PR_SET_SECCOMP, ...).

In addition to the TSYNC flag later in this patch series, there is a
non-zero chance that this syscall could be used for configuring a fixed
argument area for seccomp-tracer-aware processes to pass syscall arguments
in the future. Hence, the use of "seccomp" not simply "seccomp_add_filter"
for this syscall. Additionally, this syscall uses operation, flags,
and user pointer for arguments because strictly passing arguments via
a user pointer would mean seccomp itself would be unable to trivially
filter the seccomp syscall itself.

Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 arch/Kconfig                      |  1 +
 arch/x86/syscalls/syscall_32.tbl  |  1 +
 arch/x86/syscalls/syscall_64.tbl  |  1 +
 include/linux/syscalls.h          |  2 ++
 include/uapi/asm-generic/unistd.h |  4 ++-
 include/uapi/linux/seccomp.h      |  4 +++
 kernel/seccomp.c                  | 55 ++++++++++++++++++++++++++++---
 kernel/sys_ni.c                   |  3 ++
 8 files changed, 65 insertions(+), 6 deletions(-)

--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -321,6 +321,7 @@ config HAVE_ARCH_SECCOMP_FILTER
 	  - secure_computing is called from a ptrace_event()-safe context
 	  - secure_computing return value is checked and a return value of -1
 	    results in the system call being skipped immediately.
+	  - seccomp syscall wired up
 
 config SECCOMP_FILTER
 	def_bool y
--- a/arch/x86/syscalls/syscall_32.tbl
+++ b/arch/x86/syscalls/syscall_32.tbl
@@ -360,3 +360,4 @@
 351	i386	sched_setattr		sys_sched_setattr
 352	i386	sched_getattr		sys_sched_getattr
 353	i386	renameat2		sys_renameat2
+354	i386	seccomp			sys_seccomp
--- a/arch/x86/syscalls/syscall_64.tbl
+++ b/arch/x86/syscalls/syscall_64.tbl
@@ -323,6 +323,7 @@
 314	common	sched_setattr		sys_sched_setattr
 315	common	sched_getattr		sys_sched_getattr
 316	common	renameat2		sys_renameat2
+317	common	seccomp			sys_seccomp
 
 #
 # x32-specific system call numbers start at 512 to avoid cache impact
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -866,4 +866,6 @@ asmlinkage long sys_process_vm_writev(pi
 asmlinkage long sys_kcmp(pid_t pid1, pid_t pid2, int type,
 			 unsigned long idx1, unsigned long idx2);
 asmlinkage long sys_finit_module(int fd, const char __user *uargs, int flags);
+asmlinkage long sys_seccomp(unsigned int op, unsigned int flags,
+			    const char __user *uargs);
 #endif
--- a/include/uapi/asm-generic/unistd.h
+++ b/include/uapi/asm-generic/unistd.h
@@ -699,9 +699,11 @@ __SYSCALL(__NR_sched_setattr, sys_sched_
 __SYSCALL(__NR_sched_getattr, sys_sched_getattr)
 #define __NR_renameat2 276
 __SYSCALL(__NR_renameat2, sys_renameat2)
+#define __NR_seccomp 277
+__SYSCALL(__NR_seccomp, sys_seccomp)
 
 #undef __NR_syscalls
-#define __NR_syscalls 277
+#define __NR_syscalls 278
 
 /*
  * All syscalls below here should go away really,
--- a/include/uapi/linux/seccomp.h
+++ b/include/uapi/linux/seccomp.h
@@ -10,6 +10,10 @@
 #define SECCOMP_MODE_STRICT	1 /* uses hard-coded filter. */
 #define SECCOMP_MODE_FILTER	2 /* uses user-supplied filter. */
 
+/* Valid operations for seccomp syscall. */
+#define SECCOMP_SET_MODE_STRICT	0
+#define SECCOMP_SET_MODE_FILTER	1
+
 /*
  * All BPF programs must return a 32-bit value.
  * The bottom 16-bits are for optional return data.
--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -18,6 +18,7 @@
 #include <linux/compat.h>
 #include <linux/sched.h>
 #include <linux/seccomp.h>
+#include <linux/syscalls.h>
 
 /* #define SECCOMP_DEBUG 1 */
 
@@ -314,7 +315,7 @@ free_prog:
  *
  * Returns 0 on success and non-zero otherwise.
  */
-static long seccomp_attach_user_filter(char __user *user_filter)
+static long seccomp_attach_user_filter(const char __user *user_filter)
 {
 	struct sock_fprog fprog;
 	long ret = -EFAULT;
@@ -517,6 +518,7 @@ out:
 #ifdef CONFIG_SECCOMP_FILTER
 /**
  * seccomp_set_mode_filter: internal function for setting seccomp filter
+ * @flags:  flags to change filter behavior
  * @filter: struct sock_fprog containing filter
  *
  * This function may be called repeatedly to install additional filters.
@@ -527,11 +529,16 @@ out:
  *
  * Returns 0 on success or -EINVAL on failure.
  */
-static long seccomp_set_mode_filter(char __user *filter)
+static long seccomp_set_mode_filter(unsigned int flags,
+				    const char __user *filter)
 {
 	const unsigned long seccomp_mode = SECCOMP_MODE_FILTER;
 	long ret = -EINVAL;
 
+	/* Validate flags. */
+	if (flags != 0)
+		goto out;
+
 	if (!seccomp_may_assign_mode(seccomp_mode))
 		goto out;
 
@@ -544,12 +551,35 @@ out:
 	return ret;
 }
 #else
-static inline long seccomp_set_mode_filter(char __user *filter)
+static inline long seccomp_set_mode_filter(unsigned int flags,
+					   const char __user *filter)
 {
 	return -EINVAL;
 }
 #endif
 
+/* Common entry point for both prctl and syscall. */
+static long do_seccomp(unsigned int op, unsigned int flags,
+		       const char __user *uargs)
+{
+	switch (op) {
+	case SECCOMP_SET_MODE_STRICT:
+		if (flags != 0 || uargs != NULL)
+			return -EINVAL;
+		return seccomp_set_mode_strict();
+	case SECCOMP_SET_MODE_FILTER:
+		return seccomp_set_mode_filter(flags, uargs);
+	default:
+		return -EINVAL;
+	}
+}
+
+SYSCALL_DEFINE3(seccomp, unsigned int, op, unsigned int, flags,
+			 const char __user *, uargs)
+{
+	return do_seccomp(op, flags, uargs);
+}
+
 /**
  * prctl_set_seccomp: configures current->seccomp.mode
  * @seccomp_mode: requested mode to use
@@ -559,12 +589,27 @@ static inline long seccomp_set_mode_filt
  */
 long prctl_set_seccomp(unsigned long seccomp_mode, char __user *filter)
 {
+	unsigned int op;
+	char __user *uargs;
+
 	switch (seccomp_mode) {
 	case SECCOMP_MODE_STRICT:
-		return seccomp_set_mode_strict();
+		op = SECCOMP_SET_MODE_STRICT;
+		/*
+		 * Setting strict mode through prctl always ignored filter,
+		 * so make sure it is always NULL here to pass the internal
+		 * check in do_seccomp().
+		 */
+		uargs = NULL;
+		break;
 	case SECCOMP_MODE_FILTER:
-		return seccomp_set_mode_filter(filter);
+		op = SECCOMP_SET_MODE_FILTER;
+		uargs = filter;
+		break;
 	default:
 		return -EINVAL;
 	}
+
+	/* prctl interface doesn't have flags, so they are always zero. */
+	return do_seccomp(op, 0, uargs);
 }
--- a/kernel/sys_ni.c
+++ b/kernel/sys_ni.c
@@ -213,3 +213,6 @@ cond_syscall(compat_sys_open_by_handle_a
 
 /* compare kernel pointers */
 cond_syscall(sys_kcmp);
+
+/* operate on Secure Computing state */
+cond_syscall(sys_seccomp);


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 12/63] futex: Remove requirement for lock_page() in get_futex_key()
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (34 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 11/63] usbip: usbip_host: fix bad unlock balance during stub_probe() Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 30/63] ext4: fix false negatives *and* false positives in ext4_check_descriptors() Ben Hutchings
                   ` (27 subsequent siblings)
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: akpm, Mel Gorman, Sebastian Andrzej Siewior, Hugh Dickins, dave,
	Greg Kroah-Hartman, Chris Mason, Peter Zijlstra, Thomas Gleixner,
	Darren Hart, Linus Torvalds, Mel Gorman, Chenbo Feng,
	Ingo Molnar, Davidlohr Bueso

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Mel Gorman <mgorman@suse.de>

commit 65d8fc777f6dcfee12785c057a6b57f679641c90 upstream.

When dealing with key handling for shared futexes, we can drastically reduce
the usage/need of the page lock. 1) For anonymous pages, the associated futex
object is the mm_struct which does not require the page lock. 2) For inode
based, keys, we can check under RCU read lock if the page mapping is still
valid and take reference to the inode. This just leaves one rare race that
requires the page lock in the slow path when examining the swapcache.

Additionally realtime users currently have a problem with the page lock being
contended for unbounded periods of time during futex operations.

Task A
     get_futex_key()
     lock_page()
    ---> preempted

Now any other task trying to lock that page will have to wait until
task A gets scheduled back in, which is an unbound time.

With this patch, we pretty much have a lockless futex_get_key().

Experiments show that this patch can boost/speedup the hashing of shared
futexes with the perf futex benchmarks (which is good for measuring such
change) by up to 45% when there are high (> 100) thread counts on a 60 core
Westmere. Lower counts are pretty much in the noise range or less than 10%,
but mid range can be seen at over 30% overall throughput (hash ops/sec).
This makes anon-mem shared futexes much closer to its private counterpart.

Signed-off-by: Mel Gorman <mgorman@suse.de>
[ Ported on top of thp refcount rework, changelog, comments, fixes. ]
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Chris Mason <clm@fb.com>
Cc: Darren Hart <dvhart@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: dave@stgolabs.net
Link: http://lkml.kernel.org/r/1455045314-8305-3-git-send-email-dave@stgolabs.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Chenbo Feng <fengc@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[bwh: Backported to 3.16: s/READ_ONCE/ACCESS_ONCE/]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 kernel/futex.c | 98 ++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 91 insertions(+), 7 deletions(-)

--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -394,6 +394,7 @@ get_futex_key(u32 __user *uaddr, int fsh
 	unsigned long address = (unsigned long)uaddr;
 	struct mm_struct *mm = current->mm;
 	struct page *page, *page_head;
+	struct address_space *mapping;
 	int err, ro = 0;
 
 	/*
@@ -472,7 +473,19 @@ again:
 	}
 #endif
 
-	lock_page(page_head);
+	/*
+	 * The treatment of mapping from this point on is critical. The page
+	 * lock protects many things but in this context the page lock
+	 * stabilizes mapping, prevents inode freeing in the shared
+	 * file-backed region case and guards against movement to swap cache.
+	 *
+	 * Strictly speaking the page lock is not needed in all cases being
+	 * considered here and page lock forces unnecessarily serialization
+	 * From this point on, mapping will be re-verified if necessary and
+	 * page lock will be acquired only if it is unavoidable
+	 */
+
+	mapping = ACCESS_ONCE(page_head->mapping);
 
 	/*
 	 * If page_head->mapping is NULL, then it cannot be a PageAnon
@@ -489,18 +502,31 @@ again:
 	 * shmem_writepage move it from filecache to swapcache beneath us:
 	 * an unlikely race, but we do need to retry for page_head->mapping.
 	 */
-	if (!page_head->mapping) {
-		int shmem_swizzled = PageSwapCache(page_head);
+	if (unlikely(!mapping)) {
+		int shmem_swizzled;
+
+		/*
+		 * Page lock is required to identify which special case above
+		 * applies. If this is really a shmem page then the page lock
+		 * will prevent unexpected transitions.
+		 */
+		lock_page(page);
+		shmem_swizzled = PageSwapCache(page) || page->mapping;
 		unlock_page(page_head);
 		put_page(page_head);
+
 		if (shmem_swizzled)
 			goto again;
+
 		return -EFAULT;
 	}
 
 	/*
 	 * Private mappings are handled in a simple way.
 	 *
+	 * If the futex key is stored on an anonymous page, then the associated
+	 * object is the mm which is implicitly pinned by the calling process.
+	 *
 	 * NOTE: When userspace waits on a MAP_SHARED mapping, even if
 	 * it's a read-only handle, it's expected that futexes attach to
 	 * the object not the particular process.
@@ -518,16 +544,74 @@ again:
 		key->both.offset |= FUT_OFF_MMSHARED; /* ref taken on mm */
 		key->private.mm = mm;
 		key->private.address = address;
+
+		get_futex_key_refs(key); /* implies smp_mb(); (B) */
+
 	} else {
+		struct inode *inode;
+
+		/*
+		 * The associated futex object in this case is the inode and
+		 * the page->mapping must be traversed. Ordinarily this should
+		 * be stabilised under page lock but it's not strictly
+		 * necessary in this case as we just want to pin the inode, not
+		 * update the radix tree or anything like that.
+		 *
+		 * The RCU read lock is taken as the inode is finally freed
+		 * under RCU. If the mapping still matches expectations then the
+		 * mapping->host can be safely accessed as being a valid inode.
+		 */
+		rcu_read_lock();
+
+		if (ACCESS_ONCE(page_head->mapping) != mapping) {
+			rcu_read_unlock();
+			put_page(page_head);
+
+			goto again;
+		}
+
+		inode = ACCESS_ONCE(mapping->host);
+		if (!inode) {
+			rcu_read_unlock();
+			put_page(page_head);
+
+			goto again;
+		}
+
+		/*
+		 * Take a reference unless it is about to be freed. Previously
+		 * this reference was taken by ihold under the page lock
+		 * pinning the inode in place so i_lock was unnecessary. The
+		 * only way for this check to fail is if the inode was
+		 * truncated in parallel so warn for now if this happens.
+		 *
+		 * We are not calling into get_futex_key_refs() in file-backed
+		 * cases, therefore a successful atomic_inc return below will
+		 * guarantee that get_futex_key() will still imply smp_mb(); (B).
+		 */
+		if (WARN_ON_ONCE(!atomic_inc_not_zero(&inode->i_count))) {
+			rcu_read_unlock();
+			put_page(page_head);
+
+			goto again;
+		}
+
+		/* Should be impossible but lets be paranoid for now */
+		if (WARN_ON_ONCE(inode->i_mapping != mapping)) {
+			err = -EFAULT;
+			rcu_read_unlock();
+			iput(inode);
+
+			goto out;
+		}
+
 		key->both.offset |= FUT_OFF_INODE; /* inode-based key */
-		key->shared.inode = page_head->mapping->host;
+		key->shared.inode = inode;
 		key->shared.pgoff = basepage_index(page);
+		rcu_read_unlock();
 	}
 
-	get_futex_key_refs(key); /* implies MB (B) */
-
 out:
-	unlock_page(page_head);
 	put_page(page_head);
 	return err;
 }


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 61/63] x86/cpu/intel: Add Knights Mill to Intel family
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (30 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 15/63] KVM: x86: introduce linear_{read,write}_system Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 40/63] infiniband: fix a possible use-after-free bug Ben Hutchings
                   ` (31 subsequent siblings)
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: akpm, Linus Torvalds, Borislav Petkov, Thomas Gleixner,
	Denys Vlasenko, Dave Hansen, Peter Zijlstra, Andy Lutomirski,
	Josh Poimboeuf, Brian Gerst, Ingo Molnar, Piotr Luc,
	H. Peter Anvin

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Piotr Luc <piotr.luc@intel.com>

commit 0047f59834e5947d45f34f5f12eb330d158f700b upstream.

Add CPUID of Knights Mill (KNM) processor to Intel family list.

Signed-off-by: Piotr Luc <piotr.luc@intel.com>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20161012180520.30976-1-piotr.luc@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 arch/x86/include/asm/intel-family.h | 1 +
 1 file changed, 1 insertion(+)

--- a/arch/x86/include/asm/intel-family.h
+++ b/arch/x86/include/asm/intel-family.h
@@ -67,5 +67,6 @@
 /* Xeon Phi */
 
 #define INTEL_FAM6_XEON_PHI_KNL		0x57 /* Knights Landing */
+#define INTEL_FAM6_XEON_PHI_KNM		0x85 /* Knights Mill */
 
 #endif /* _ASM_X86_INTEL_FAMILY_H */


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 48/63] video: uvesafb: Fix integer overflow in allocation
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (45 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 52/63] xfs: validate cached inodes are free when allocated Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 19/63] jfs: Fix inconsistency between memory allocation and ea_buf->max_size Ben Hutchings
                   ` (16 subsequent siblings)
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: akpm, Kees Cook, Dr Silvio Cesare of InfoSect

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Kees Cook <keescook@chromium.org>

commit 9f645bcc566a1e9f921bdae7528a01ced5bc3713 upstream.

cmap->len can get close to INT_MAX/2, allowing for an integer overflow in
allocation. This uses kmalloc_array() instead to catch the condition.

Reported-by: Dr Silvio Cesare of InfoSect <silvio.cesare@gmail.com>
Fixes: 8bdb3a2d7df48 ("uvesafb: the driver core")
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 drivers/video/fbdev/uvesafb.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/drivers/video/fbdev/uvesafb.c
+++ b/drivers/video/fbdev/uvesafb.c
@@ -1059,7 +1059,8 @@ static int uvesafb_setcmap(struct fb_cma
 		    info->cmap.len || cmap->start < info->cmap.start)
 			return -EINVAL;
 
-		entries = kmalloc(sizeof(*entries) * cmap->len, GFP_KERNEL);
+		entries = kmalloc_array(cmap->len, sizeof(*entries),
+					GFP_KERNEL);
 		if (!entries)
 			return -ENOMEM;
 


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 56/63] seccomp: split mode setting routines
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (11 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 38/63] Fix up non-directory creation in SGID directories Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 13/63] futex: Remove unnecessary warning from get_futex_key Ben Hutchings
                   ` (50 subsequent siblings)
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: akpm, Andy Lutomirski, Oleg Nesterov, Kees Cook

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Kees Cook <keescook@chromium.org>

commit 3b23dd12846215eff4afb073366b80c0c4d7543e upstream.

Separates the two mode setting paths to make things more readable with
fewer #ifdefs within function bodies.

Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 kernel/seccomp.c | 71 ++++++++++++++++++++++++++++++++----------------
 1 file changed, 48 insertions(+), 23 deletions(-)

--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -489,48 +489,66 @@ long prctl_get_seccomp(void)
 }
 
 /**
- * seccomp_set_mode: internal function for setting seccomp mode
- * @seccomp_mode: requested mode to use
- * @filter: optional struct sock_fprog for use with SECCOMP_MODE_FILTER
- *
- * This function may be called repeatedly with a @seccomp_mode of
- * SECCOMP_MODE_FILTER to install additional filters.  Every filter
- * successfully installed will be evaluated (in reverse order) for each system
- * call the task makes.
+ * seccomp_set_mode_strict: internal function for setting strict seccomp
  *
  * Once current->seccomp.mode is non-zero, it may not be changed.
  *
  * Returns 0 on success or -EINVAL on failure.
  */
-static long seccomp_set_mode(unsigned long seccomp_mode, char __user *filter)
+static long seccomp_set_mode_strict(void)
 {
+	const unsigned long seccomp_mode = SECCOMP_MODE_STRICT;
 	long ret = -EINVAL;
 
 	if (!seccomp_may_assign_mode(seccomp_mode))
 		goto out;
 
-	switch (seccomp_mode) {
-	case SECCOMP_MODE_STRICT:
-		ret = 0;
 #ifdef TIF_NOTSC
-		disable_TSC();
+	disable_TSC();
 #endif
-		break;
+	seccomp_assign_mode(seccomp_mode);
+	ret = 0;
+
+out:
+
+	return ret;
+}
+
 #ifdef CONFIG_SECCOMP_FILTER
-	case SECCOMP_MODE_FILTER:
-		ret = seccomp_attach_user_filter(filter);
-		if (ret)
-			goto out;
-		break;
-#endif
-	default:
+/**
+ * seccomp_set_mode_filter: internal function for setting seccomp filter
+ * @filter: struct sock_fprog containing filter
+ *
+ * This function may be called repeatedly to install additional filters.
+ * Every filter successfully installed will be evaluated (in reverse order)
+ * for each system call the task makes.
+ *
+ * Once current->seccomp.mode is non-zero, it may not be changed.
+ *
+ * Returns 0 on success or -EINVAL on failure.
+ */
+static long seccomp_set_mode_filter(char __user *filter)
+{
+	const unsigned long seccomp_mode = SECCOMP_MODE_FILTER;
+	long ret = -EINVAL;
+
+	if (!seccomp_may_assign_mode(seccomp_mode))
+		goto out;
+
+	ret = seccomp_attach_user_filter(filter);
+	if (ret)
 		goto out;
-	}
 
 	seccomp_assign_mode(seccomp_mode);
 out:
 	return ret;
 }
+#else
+static inline long seccomp_set_mode_filter(char __user *filter)
+{
+	return -EINVAL;
+}
+#endif
 
 /**
  * prctl_set_seccomp: configures current->seccomp.mode
@@ -541,5 +559,12 @@ out:
  */
 long prctl_set_seccomp(unsigned long seccomp_mode, char __user *filter)
 {
-	return seccomp_set_mode(seccomp_mode, filter);
+	switch (seccomp_mode) {
+	case SECCOMP_MODE_STRICT:
+		return seccomp_set_mode_strict();
+	case SECCOMP_MODE_FILTER:
+		return seccomp_set_mode_filter(filter);
+	default:
+		return -EINVAL;
+	}
 }


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 05/63] usbip: fix error handling in stub_probe()
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (48 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 22/63] scsi: libsas: defer ata device eh commands to libata Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 10/63] usbip: usbip_host: fix NULL-ptr deref and use-after-free errors Ben Hutchings
                   ` (13 subsequent siblings)
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: akpm, Alexey Khoroshilov, Greg Kroah-Hartman, Valentina Manea

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Alexey Khoroshilov <khoroshilov@ispras.ru>

commit 3ff67445750a84de67faaf52c6e1895cb09f2c56 upstream.

If usb_hub_claim_port() fails, no resources are deallocated and
if stub_add_files() fails, port is not released.

The patch fixes these issues and rearranges error handling code.

Found by Linux Driver Verification project (linuxtesting.org).

Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Acked-by: Valentina Manea <valentina.manea.m@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[bwh: Backported to 3.16: adjust filename]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 drivers/staging/usbip/stub_dev.c | 26 +++++++++++++++-----------
 1 file changed, 15 insertions(+), 11 deletions(-)

--- a/drivers/staging/usbip/stub_dev.c
+++ b/drivers/staging/usbip/stub_dev.c
@@ -340,7 +340,6 @@ static int stub_probe(struct usb_device
 {
 	struct stub_device *sdev = NULL;
 	const char *udev_busid = dev_name(&udev->dev);
-	int err = 0;
 	struct bus_id_priv *busid_priv;
 	int rc;
 
@@ -401,23 +400,28 @@ static int stub_probe(struct usb_device
 			(struct usb_dev_state *) udev);
 	if (rc) {
 		dev_dbg(&udev->dev, "unable to claim port\n");
-		return rc;
+		goto err_port;
 	}
 
-	err = stub_add_files(&udev->dev);
-	if (err) {
+	rc = stub_add_files(&udev->dev);
+	if (rc) {
 		dev_err(&udev->dev, "stub_add_files for %s\n", udev_busid);
-		dev_set_drvdata(&udev->dev, NULL);
-		usb_put_dev(udev);
-		kthread_stop_put(sdev->ud.eh);
-
-		busid_priv->sdev = NULL;
-		stub_device_free(sdev);
-		return err;
+		goto err_files;
 	}
 	busid_priv->status = STUB_BUSID_ALLOC;
 
 	return 0;
+err_files:
+	usb_hub_release_port(udev->parent, udev->portnum,
+			     (struct usb_dev_state *) udev);
+err_port:
+	dev_set_drvdata(&udev->dev, NULL);
+	usb_put_dev(udev);
+	kthread_stop_put(sdev->ud.eh);
+
+	busid_priv->sdev = NULL;
+	stub_device_free(sdev);
+	return rc;
 }
 
 static void shutdown_busid(struct bus_id_priv *busid_priv)


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 04/63] net: Set sk_prot_creator when cloning sockets to the right proto
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (20 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 20/63] scsi: sg: allocate with __GFP_ZERO in sg_build_indirect() Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 33/63] ext4: never move the system.data xattr out of the inode body Ben Hutchings
                   ` (41 subsequent siblings)
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: akpm, Thomas King, David S. Miller, Christoph Paasch, Eric Dumazet

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Christoph Paasch <cpaasch@apple.com>

commit 9d538fa60bad4f7b23193c89e843797a1cf71ef3 upstream.

sk->sk_prot and sk->sk_prot_creator can differ when the app uses
IPV6_ADDRFORM (transforming an IPv6-socket to an IPv4-one).
Which is why sk_prot_creator is there to make sure that sk_prot_free()
does the kmem_cache_free() on the right kmem_cache slab.

Now, if such a socket gets transformed back to a listening socket (using
connect() with AF_UNSPEC) we will allocate an IPv4 tcp_sock through
sk_clone_lock() when a new connection comes in. But sk_prot_creator will
still point to the IPv6 kmem_cache (as everything got copied in
sk_clone_lock()). When freeing, we will thus put this
memory back into the IPv6 kmem_cache although it was allocated in the
IPv4 cache. I have seen memory corruption happening because of this.

With slub-debugging and MEMCG_KMEM enabled this gives the warning
	"cache_from_obj: Wrong slab cache. TCPv6 but object is from TCP"

A C-program to trigger this:

void main(void)
{
        int fd = socket(AF_INET6, SOCK_STREAM, IPPROTO_TCP);
        int new_fd, newest_fd, client_fd;
        struct sockaddr_in6 bind_addr;
        struct sockaddr_in bind_addr4, client_addr1, client_addr2;
        struct sockaddr unsp;
        int val;

        memset(&bind_addr, 0, sizeof(bind_addr));
        bind_addr.sin6_family = AF_INET6;
        bind_addr.sin6_port = ntohs(42424);

        memset(&client_addr1, 0, sizeof(client_addr1));
        client_addr1.sin_family = AF_INET;
        client_addr1.sin_port = ntohs(42424);
        client_addr1.sin_addr.s_addr = inet_addr("127.0.0.1");

        memset(&client_addr2, 0, sizeof(client_addr2));
        client_addr2.sin_family = AF_INET;
        client_addr2.sin_port = ntohs(42421);
        client_addr2.sin_addr.s_addr = inet_addr("127.0.0.1");

        memset(&unsp, 0, sizeof(unsp));
        unsp.sa_family = AF_UNSPEC;

        bind(fd, (struct sockaddr *)&bind_addr, sizeof(bind_addr));

        listen(fd, 5);

        client_fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
        connect(client_fd, (struct sockaddr *)&client_addr1, sizeof(client_addr1));
        new_fd = accept(fd, NULL, NULL);
        close(fd);

        val = AF_INET;
        setsockopt(new_fd, SOL_IPV6, IPV6_ADDRFORM, &val, sizeof(val));

        connect(new_fd, &unsp, sizeof(unsp));

        memset(&bind_addr4, 0, sizeof(bind_addr4));
        bind_addr4.sin_family = AF_INET;
        bind_addr4.sin_port = ntohs(42421);
        bind(new_fd, (struct sockaddr *)&bind_addr4, sizeof(bind_addr4));

        listen(new_fd, 5);

        client_fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
        connect(client_fd, (struct sockaddr *)&client_addr2, sizeof(client_addr2));

        newest_fd = accept(new_fd, NULL, NULL);
        close(new_fd);

        close(client_fd);
        close(new_fd);
}

As far as I can see, this bug has been there since the beginning of the
git-days.

Signed-off-by: Christoph Paasch <cpaasch@apple.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Thomas King <thomaskingnew@gmail.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 net/core/sock.c | 2 ++
 1 file changed, 2 insertions(+)

--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1512,6 +1512,8 @@ struct sock *sk_clone_lock(const struct
 
 		sock_copy(newsk, sk);
 
+		newsk->sk_prot_creator = sk->sk_prot;
+
 		/* SANITY */
 		get_net(sock_net(newsk));
 		sk_node_init(&newsk->sk_node);


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 02/63] x86/fpu: Default eagerfpu if FPU and FXSR are enabled
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (55 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 01/63] x86/fpu: Fix the 'nofxsr' boot parameter to also clear X86_FEATURE_FXSR_OPT Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 58/63] x86/process: Optimize TIF checks in __switch_to_xtra() Ben Hutchings
                   ` (6 subsequent siblings)
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: akpm, x86, Andy Lutomirski

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Ben Hutchings <ben@decadent.org.uk>

This is a limited version of commit 58122bf1d856 "x86/fpu: Default
eagerfpu=on on all CPUs".  That commit revealed bugs in the use of
eagerfpu together with math emulation or without the FXSR feature.
Although those bugs have been fixed upstream, the fixes do not seem to
be practical to backport to 3.16.

The security issue that motivates using eagerfpu (CVE-2018-3665) is an
information leak through speculative execution, and most CPUs lacking
the FXSR feature also don't implement speculative execution.  The
exceptions I am aware of are the Intel Pentium Pro and AMD K6 family,
which will remain vulnerable to this issue.

Move the eagerfpu variable and associated initialisation into
fpu_init(), since xstate_enable_boot_cpu() won't be called at all if
XSAVE is disabled.

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: x86@kernel.org
---
--- a/arch/x86/kernel/xsave.c
+++ b/arch/x86/kernel/xsave.c
@@ -509,19 +509,6 @@ static void __init setup_init_fpu_buf(vo
 	xsave_state(init_xstate_buf, -1);
 }
 
-static enum { AUTO, ENABLE, DISABLE } eagerfpu = AUTO;
-static int __init eager_fpu_setup(char *s)
-{
-	if (!strcmp(s, "on"))
-		eagerfpu = ENABLE;
-	else if (!strcmp(s, "off"))
-		eagerfpu = DISABLE;
-	else if (!strcmp(s, "auto"))
-		eagerfpu = AUTO;
-	return 1;
-}
-__setup("eagerfpu=", eager_fpu_setup);
-
 /*
  * Enable and initialize the xsave feature.
  */
@@ -560,17 +547,11 @@ static void __init xstate_enable_boot_cp
 	prepare_fx_sw_frame();
 	setup_init_fpu_buf();
 
-	/* Auto enable eagerfpu for xsaveopt */
-	if (cpu_has_xsaveopt && eagerfpu != DISABLE)
-		eagerfpu = ENABLE;
-
 	if (pcntxt_mask & XSTATE_EAGER) {
-		if (eagerfpu == DISABLE) {
+		if (!boot_cpu_has(X86_FEATURE_EAGER_FPU)) {
 			pr_err("eagerfpu not present, disabling some xstate features: 0x%llx\n",
 					pcntxt_mask & XSTATE_EAGER);
 			pcntxt_mask &= ~XSTATE_EAGER;
-		} else {
-			eagerfpu = ENABLE;
 		}
 	}
 
@@ -613,9 +594,6 @@ void eager_fpu_init(void)
 	clear_used_math();
 	current_thread_info()->status = 0;
 
-	if (eagerfpu == ENABLE)
-		setup_force_cpu_cap(X86_FEATURE_EAGER_FPU);
-
 	if (!cpu_has_eager_fpu) {
 		stts();
 		return;
--- a/arch/x86/kernel/i387.c
+++ b/arch/x86/kernel/i387.c
@@ -159,6 +159,19 @@ static void init_thread_xstate(void)
 		xstate_size = sizeof(struct i387_fsave_struct);
 }
 
+static enum { AUTO, ENABLE, DISABLE } eagerfpu = AUTO;
+static int __init eager_fpu_setup(char *s)
+{
+	if (!strcmp(s, "on"))
+		eagerfpu = ENABLE;
+	else if (!strcmp(s, "off"))
+		eagerfpu = DISABLE;
+	else if (!strcmp(s, "auto"))
+		eagerfpu = AUTO;
+	return 1;
+}
+__setup("eagerfpu=", eager_fpu_setup);
+
 /*
  * Called at bootup to set up the initial FPU state that is later cloned
  * into all processes.
@@ -197,6 +210,17 @@ void fpu_init(void)
 	if (xstate_size == 0)
 		init_thread_xstate();
 
+	/*
+	 * We should always enable eagerfpu, but it doesn't work properly
+	 * here without fpu and fxsr.
+	 */
+	if (eagerfpu == AUTO)
+		eagerfpu = (boot_cpu_has(X86_FEATURE_FPU) &&
+			    boot_cpu_has(X86_FEATURE_FXSR)) ?
+			ENABLE : DISABLE;
+	if (eagerfpu == ENABLE)
+		setup_force_cpu_cap(X86_FEATURE_EAGER_FPU);
+
 	mxcsr_feature_mask_init();
 	xsave_init();
 	eager_fpu_init();


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 09/63] usbip: usbip_host: run rebind from exit when module is removed
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (32 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 40/63] infiniband: fix a possible use-after-free bug Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 11/63] usbip: usbip_host: fix bad unlock balance during stub_probe() Ben Hutchings
                   ` (29 subsequent siblings)
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: akpm, Greg Kroah-Hartman, Shuah Khan (Samsung OSG)

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: "Shuah Khan (Samsung OSG)" <shuah@kernel.org>

commit 7510df3f29d44685bab7b1918b61a8ccd57126a9 upstream.

After removing usbip_host module, devices it releases are left without
a driver. For example, when a keyboard or a mass storage device are
bound to usbip_host when it is removed, these devices are no longer
bound to any driver.

Fix it to run device_attach() from the module exit routine to restore
the devices to their original drivers. This includes cleanup changes
and moving device_attach() code to a common routine to be called from
rebind_store() and usbip_host_exit().

Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[bwh: Backported to 3.16: adjust filenames]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 drivers/staging/usbip/stub_dev.c  |  6 +---
 drivers/staging/usbip/stub_main.c | 60 +++++++++++++++++++++++++++++------
 2 files changed, 52 insertions(+), 14 deletions(-)

--- a/drivers/staging/usbip/stub_dev.c
+++ b/drivers/staging/usbip/stub_dev.c
@@ -490,12 +490,8 @@ static void stub_disconnect(struct usb_d
 	busid_priv->sdev = NULL;
 	stub_device_free(sdev);
 
-	if (busid_priv->status == STUB_BUSID_ALLOC) {
+	if (busid_priv->status == STUB_BUSID_ALLOC)
 		busid_priv->status = STUB_BUSID_ADDED;
-	} else {
-		busid_priv->status = STUB_BUSID_OTHER;
-		del_match_busid((char *)udev_busid);
-	}
 }
 
 #ifdef CONFIG_PM
--- a/drivers/staging/usbip/stub_main.c
+++ b/drivers/staging/usbip/stub_main.c
@@ -28,6 +28,7 @@
 #define DRIVER_DESC "USB/IP Host Driver"
 
 struct kmem_cache *stub_priv_cache;
+
 /*
  * busid_tables defines matching busids that usbip can grab. A user can change
  * dynamically what device is locally used and what device is exported to a
@@ -188,6 +189,51 @@ static ssize_t store_match_busid(struct
 static DRIVER_ATTR(match_busid, S_IRUSR | S_IWUSR, show_match_busid,
 		   store_match_busid);
 
+static int do_rebind(char *busid, struct bus_id_priv *busid_priv)
+{
+	int ret;
+
+	/* device_attach() callers should hold parent lock for USB */
+	if (busid_priv->udev->dev.parent)
+		device_lock(busid_priv->udev->dev.parent);
+	ret = device_attach(&busid_priv->udev->dev);
+	if (busid_priv->udev->dev.parent)
+		device_unlock(busid_priv->udev->dev.parent);
+	if (ret < 0) {
+		dev_err(&busid_priv->udev->dev, "rebind failed\n");
+		return ret;
+	}
+	return 0;
+}
+
+static void stub_device_rebind(void)
+{
+#if IS_MODULE(CONFIG_USBIP_HOST)
+	struct bus_id_priv *busid_priv;
+	int i;
+
+	/* update status to STUB_BUSID_OTHER so probe ignores the device */
+	spin_lock(&busid_table_lock);
+	for (i = 0; i < MAX_BUSID; i++) {
+		if (busid_table[i].name[0] &&
+		    busid_table[i].shutdown_busid) {
+			busid_priv = &(busid_table[i]);
+			busid_priv->status = STUB_BUSID_OTHER;
+		}
+	}
+	spin_unlock(&busid_table_lock);
+
+	/* now run rebind */
+	for (i = 0; i < MAX_BUSID; i++) {
+		if (busid_table[i].name[0] &&
+		    busid_table[i].shutdown_busid) {
+			busid_priv = &(busid_table[i]);
+			do_rebind(busid_table[i].name, busid_priv);
+		}
+	}
+#endif
+}
+
 static ssize_t rebind_store(struct device_driver *dev, const char *buf,
 				 size_t count)
 {
@@ -208,16 +254,9 @@ static ssize_t rebind_store(struct devic
 	/* mark the device for deletion so probe ignores it during rescan */
 	bid->status = STUB_BUSID_OTHER;
 
-	/* device_attach() callers should hold parent lock for USB */
-	if (bid->udev->dev.parent)
-		device_lock(bid->udev->dev.parent);
-	ret = device_attach(&bid->udev->dev);
-	if (bid->udev->dev.parent)
-		device_unlock(bid->udev->dev.parent);
-	if (ret < 0) {
-		dev_err(&bid->udev->dev, "rebind failed\n");
+	ret = do_rebind((char *) buf, bid);
+	if (ret < 0)
 		return ret;
-	}
 
 	/* delete device from busid_table */
 	del_match_busid((char *) buf);
@@ -343,6 +382,9 @@ static void __exit usbip_host_exit(void)
 	 */
 	usb_deregister_device_driver(&stub_driver);
 
+	/* initiate scan to attach devices */
+	stub_device_rebind();
+
 	kmem_cache_destroy(stub_priv_cache);
 }
 


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 11/63] usbip: usbip_host: fix bad unlock balance during stub_probe()
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (33 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 09/63] usbip: usbip_host: run rebind from exit when module is removed Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 12/63] futex: Remove requirement for lock_page() in get_futex_key() Ben Hutchings
                   ` (28 subsequent siblings)
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: akpm, Greg Kroah-Hartman, Shuah Khan (Samsung OSG)

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: "Shuah Khan (Samsung OSG)" <shuah@kernel.org>

commit c171654caa875919be3c533d3518da8be5be966e upstream.

stub_probe() calls put_busid_priv() in an error path when device isn't
found in the busid_table. Fix it by making put_busid_priv() safe to be
called with null struct bus_id_priv pointer.

This problem happens when "usbip bind" is run without loading usbip_host
driver and then running modprobe. The first failed bind attempt unbinds
the device from the original driver and when usbip_host is modprobed,
stub_probe() runs and doesn't find the device in its busid table and calls
put_busid_priv(0 with null bus_id_priv pointer.

usbip-host 3-10.2: 3-10.2 is not in match_busid table...  skip!

[  367.359679] =====================================
[  367.359681] WARNING: bad unlock balance detected!
[  367.359683] 4.17.0-rc4+ #5 Not tainted
[  367.359685] -------------------------------------
[  367.359688] modprobe/2768 is trying to release lock (
[  367.359689]
==================================================================
[  367.359696] BUG: KASAN: null-ptr-deref in print_unlock_imbalance_bug+0x99/0x110
[  367.359699] Read of size 8 at addr 0000000000000058 by task modprobe/2768

[  367.359705] CPU: 4 PID: 2768 Comm: modprobe Not tainted 4.17.0-rc4+ #5

Fixes: 22076557b07c ("usbip: usbip_host: fix NULL-ptr deref and use-after-free errors") in usb-linus
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[bwh: Backported to 3.16: adjust filename]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 drivers/staging/usbip/stub_main.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/drivers/staging/usbip/stub_main.c
+++ b/drivers/staging/usbip/stub_main.c
@@ -96,7 +96,8 @@ struct bus_id_priv *get_busid_priv(const
 
 void put_busid_priv(struct bus_id_priv *bid)
 {
-	spin_unlock(&bid->busid_lock);
+	if (bid)
+		spin_unlock(&bid->busid_lock);
 }
 
 static int add_match_busid(char *busid)


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 16/63] KVM: x86: pass kvm_vcpu to kvm_read_guest_virt and kvm_write_guest_virt_system
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (23 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 55/63] seccomp: extract check/assign mode helpers Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 35/63] ext4: add more inode number paranoia checks Ben Hutchings
                   ` (38 subsequent siblings)
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: akpm, Paolo Bonzini

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Paolo Bonzini <pbonzini@redhat.com>

commit ce14e868a54edeb2e30cb7a7b104a2fc4b9d76ca upstream.

Int the next patch the emulator's .read_std and .write_std callbacks will
grow another argument, which is not needed in kvm_read_guest_virt and
kvm_write_guest_virt_system's callers.  Since we have to make separate
functions, let's give the currently existing names a nicer interface, too.

Fixes: 129a72a0d3c8 ("KVM: x86: Introduce segmented_write_std", 2017-01-12)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[bwh: Backported to 3.16:
 - Drop change to handle_invvpid()
 - Adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -6027,8 +6027,7 @@ static int nested_vmx_check_vmptr(struct
 			vmcs_read32(VMX_INSTRUCTION_INFO), &gva))
 		return 1;
 
-	if (kvm_read_guest_virt(&vcpu->arch.emulate_ctxt, gva, &vmptr,
-				sizeof(vmptr), &e)) {
+	if (kvm_read_guest_virt(vcpu, gva, &vmptr, sizeof(vmptr), &e)) {
 		kvm_inject_page_fault(vcpu, &e);
 		return 1;
 	}
@@ -6539,8 +6538,8 @@ static int handle_vmread(struct kvm_vcpu
 				vmx_instruction_info, &gva))
 			return 1;
 		/* _system ok, as nested_vmx_check_permission verified cpl=0 */
-		kvm_write_guest_virt_system(&vcpu->arch.emulate_ctxt, gva,
-			     &field_value, (is_long_mode(vcpu) ? 8 : 4), NULL);
+		kvm_write_guest_virt_system(vcpu, gva, &field_value,
+					    (is_long_mode(vcpu) ? 8 : 4), NULL);
 	}
 
 	nested_vmx_succeed(vcpu);
@@ -6575,8 +6574,8 @@ static int handle_vmwrite(struct kvm_vcp
 		if (get_vmx_mem_address(vcpu, exit_qualification,
 				vmx_instruction_info, &gva))
 			return 1;
-		if (kvm_read_guest_virt(&vcpu->arch.emulate_ctxt, gva,
-			   &field_value, (is_long_mode(vcpu) ? 8 : 4), &e)) {
+		if (kvm_read_guest_virt(vcpu, gva, &field_value,
+					(is_long_mode(vcpu) ? 8 : 4), &e)) {
 			kvm_inject_page_fault(vcpu, &e);
 			return 1;
 		}
@@ -6669,9 +6668,9 @@ static int handle_vmptrst(struct kvm_vcp
 			vmx_instruction_info, &vmcs_gva))
 		return 1;
 	/* ok to use *_system, as nested_vmx_check_permission verified cpl=0 */
-	if (kvm_write_guest_virt_system(&vcpu->arch.emulate_ctxt, vmcs_gva,
-				 (void *)&to_vmx(vcpu)->nested.current_vmptr,
-				 sizeof(u64), &e)) {
+	if (kvm_write_guest_virt_system(vcpu, vmcs_gva,
+					(void *)&to_vmx(vcpu)->nested.current_vmptr,
+					sizeof(u64), &e)) {
 		kvm_inject_page_fault(vcpu, &e);
 		return 1;
 	}
@@ -6723,8 +6722,7 @@ static int handle_invept(struct kvm_vcpu
 	if (get_vmx_mem_address(vcpu, vmcs_readl(EXIT_QUALIFICATION),
 			vmx_instruction_info, &gva))
 		return 1;
-	if (kvm_read_guest_virt(&vcpu->arch.emulate_ctxt, gva, &operand,
-				sizeof(operand), &e)) {
+	if (kvm_read_guest_virt(vcpu, gva, &operand, sizeof(operand), &e)) {
 		kvm_inject_page_fault(vcpu, &e);
 		return 1;
 	}
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4178,11 +4178,10 @@ static int kvm_fetch_guest_virt(struct x
 					  exception);
 }
 
-int kvm_read_guest_virt(struct x86_emulate_ctxt *ctxt,
+int kvm_read_guest_virt(struct kvm_vcpu *vcpu,
 			       gva_t addr, void *val, unsigned int bytes,
 			       struct x86_exception *exception)
 {
-	struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
 	u32 access = (kvm_x86_ops->get_cpl(vcpu) == 3) ? PFERR_USER_MASK : 0;
 
 	return kvm_read_guest_virt_helper(addr, val, bytes, vcpu, access,
@@ -4190,26 +4189,24 @@ int kvm_read_guest_virt(struct x86_emula
 }
 EXPORT_SYMBOL_GPL(kvm_read_guest_virt);
 
-static int kvm_read_guest_virt_system(struct x86_emulate_ctxt *ctxt,
-				      gva_t addr, void *val, unsigned int bytes,
-				      struct x86_exception *exception)
+static int emulator_read_std(struct x86_emulate_ctxt *ctxt,
+			     gva_t addr, void *val, unsigned int bytes,
+			     struct x86_exception *exception)
 {
 	struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
 	return kvm_read_guest_virt_helper(addr, val, bytes, vcpu, 0, exception);
 }
 
-int kvm_write_guest_virt_system(struct x86_emulate_ctxt *ctxt,
-				       gva_t addr, void *val,
-				       unsigned int bytes,
-				       struct x86_exception *exception)
+static int kvm_write_guest_virt_helper(gva_t addr, void *val, unsigned int bytes,
+				      struct kvm_vcpu *vcpu, u32 access,
+				      struct x86_exception *exception)
 {
-	struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
 	void *data = val;
 	int r = X86EMUL_CONTINUE;
 
 	while (bytes) {
 		gpa_t gpa =  vcpu->arch.walk_mmu->gva_to_gpa(vcpu, addr,
-							     PFERR_WRITE_MASK,
+							     access,
 							     exception);
 		unsigned offset = addr & (PAGE_SIZE-1);
 		unsigned towrite = min(bytes, (unsigned)PAGE_SIZE - offset);
@@ -4230,6 +4227,22 @@ int kvm_write_guest_virt_system(struct x
 out:
 	return r;
 }
+
+static int emulator_write_std(struct x86_emulate_ctxt *ctxt, gva_t addr, void *val,
+			      unsigned int bytes, struct x86_exception *exception)
+{
+	struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
+
+	return kvm_write_guest_virt_helper(addr, val, bytes, vcpu,
+					   PFERR_WRITE_MASK, exception);
+}
+
+int kvm_write_guest_virt_system(struct kvm_vcpu *vcpu, gva_t addr, void *val,
+				unsigned int bytes, struct x86_exception *exception)
+{
+	return kvm_write_guest_virt_helper(addr, val, bytes, vcpu,
+					   PFERR_WRITE_MASK, exception);
+}
 EXPORT_SYMBOL_GPL(kvm_write_guest_virt_system);
 
 static int vcpu_mmio_gva_to_gpa(struct kvm_vcpu *vcpu, unsigned long gva,
@@ -4902,8 +4915,8 @@ static void emulator_write_gpr(struct x8
 static const struct x86_emulate_ops emulate_ops = {
 	.read_gpr            = emulator_read_gpr,
 	.write_gpr           = emulator_write_gpr,
-	.read_std            = kvm_read_guest_virt_system,
-	.write_std           = kvm_write_guest_virt_system,
+	.read_std            = emulator_read_std,
+	.write_std           = emulator_write_std,
 	.fetch               = kvm_fetch_guest_virt,
 	.read_emulated       = emulator_read_emulated,
 	.write_emulated      = emulator_write_emulated,
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -124,11 +124,11 @@ int kvm_inject_realmode_interrupt(struct
 
 void kvm_write_tsc(struct kvm_vcpu *vcpu, struct msr_data *msr);
 
-int kvm_read_guest_virt(struct x86_emulate_ctxt *ctxt,
+int kvm_read_guest_virt(struct kvm_vcpu *vcpu,
 	gva_t addr, void *val, unsigned int bytes,
 	struct x86_exception *exception);
 
-int kvm_write_guest_virt_system(struct x86_emulate_ctxt *ctxt,
+int kvm_write_guest_virt_system(struct kvm_vcpu *vcpu,
 	gva_t addr, void *val, unsigned int bytes,
 	struct x86_exception *exception);
 


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 17/63] kvm: x86: use correct privilege level for sgdt/sidt/fxsave/fxrstor  access
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (42 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 53/63] xfs: don't call xfs_da_shrink_inode with NULL bp Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 43/63] x86/speculation: Clean up various Spectre related details Ben Hutchings
                   ` (19 subsequent siblings)
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: akpm, Paolo Bonzini

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Paolo Bonzini <pbonzini@redhat.com>

commit 3c9fa24ca7c9c47605672916491f79e8ccacb9e6 upstream.

The functions that were used in the emulation of fxrstor, fxsave, sgdt and
sidt were originally meant for task switching, and as such they did not
check privilege levels.  This is very bad when the same functions are used
in the emulation of unprivileged instructions.  This is CVE-2018-10853.

The obvious fix is to add a new argument to ops->read_std and ops->write_std,
which decides whether the access is a "system" access or should use the
processor's CPL.

Fixes: 129a72a0d3c8 ("KVM: x86: Introduce segmented_write_std", 2017-01-12)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[bwh: Backported to 3.16: Drop change in handle_ud()]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
--- a/arch/x86/include/asm/kvm_emulate.h
+++ b/arch/x86/include/asm/kvm_emulate.h
@@ -104,11 +104,12 @@ struct x86_emulate_ops {
 	 *  @addr:  [IN ] Linear address from which to read.
 	 *  @val:   [OUT] Value read from memory, zero-extended to 'u_long'.
 	 *  @bytes: [IN ] Number of bytes to read from memory.
+	 *  @system:[IN ] Whether the access is forced to be at CPL0.
 	 */
 	int (*read_std)(struct x86_emulate_ctxt *ctxt,
 			unsigned long addr, void *val,
 			unsigned int bytes,
-			struct x86_exception *fault);
+			struct x86_exception *fault, bool system);
 
 	/*
 	 * write_std: Write bytes of standard (non-emulated/special) memory.
@@ -116,10 +117,11 @@ struct x86_emulate_ops {
 	 *  @addr:  [IN ] Linear address to which to write.
 	 *  @val:   [OUT] Value write to memory, zero-extended to 'u_long'.
 	 *  @bytes: [IN ] Number of bytes to write to memory.
+	 *  @system:[IN ] Whether the access is forced to be at CPL0.
 	 */
 	int (*write_std)(struct x86_emulate_ctxt *ctxt,
 			 unsigned long addr, void *val, unsigned int bytes,
-			 struct x86_exception *fault);
+			 struct x86_exception *fault, bool system);
 	/*
 	 * fetch: Read bytes of standard (non-emulated/special) memory.
 	 *        Used for instruction fetch.
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -734,14 +734,14 @@ static int linearize(struct x86_emulate_
 static int linear_read_system(struct x86_emulate_ctxt *ctxt, ulong linear,
 			      void *data, unsigned size)
 {
-	return ctxt->ops->read_std(ctxt, linear, data, size, &ctxt->exception);
+	return ctxt->ops->read_std(ctxt, linear, data, size, &ctxt->exception, true);
 }
 
 static int linear_write_system(struct x86_emulate_ctxt *ctxt,
 			       ulong linear, void *data,
 			       unsigned int size)
 {
-	return ctxt->ops->write_std(ctxt, linear, data, size, &ctxt->exception);
+	return ctxt->ops->write_std(ctxt, linear, data, size, &ctxt->exception, true);
 }
 
 static int segmented_read_std(struct x86_emulate_ctxt *ctxt,
@@ -755,7 +755,7 @@ static int segmented_read_std(struct x86
 	rc = linearize(ctxt, addr, size, false, &linear);
 	if (rc != X86EMUL_CONTINUE)
 		return rc;
-	return ctxt->ops->read_std(ctxt, linear, data, size, &ctxt->exception);
+	return ctxt->ops->read_std(ctxt, linear, data, size, &ctxt->exception, false);
 }
 
 static int segmented_write_std(struct x86_emulate_ctxt *ctxt,
@@ -769,7 +769,7 @@ static int segmented_write_std(struct x8
 	rc = linearize(ctxt, addr, size, true, &linear);
 	if (rc != X86EMUL_CONTINUE)
 		return rc;
-	return ctxt->ops->write_std(ctxt, linear, data, size, &ctxt->exception);
+	return ctxt->ops->write_std(ctxt, linear, data, size, &ctxt->exception, false);
 }
 
 /*
@@ -2472,12 +2472,12 @@ static bool emulator_io_port_access_allo
 #ifdef CONFIG_X86_64
 	base |= ((u64)base3) << 32;
 #endif
-	r = ops->read_std(ctxt, base + 102, &io_bitmap_ptr, 2, NULL);
+	r = ops->read_std(ctxt, base + 102, &io_bitmap_ptr, 2, NULL, true);
 	if (r != X86EMUL_CONTINUE)
 		return false;
 	if (io_bitmap_ptr + port/8 > desc_limit_scaled(&tr_seg))
 		return false;
-	r = ops->read_std(ctxt, base + io_bitmap_ptr + port/8, &perm, 2, NULL);
+	r = ops->read_std(ctxt, base + io_bitmap_ptr + port/8, &perm, 2, NULL, true);
 	if (r != X86EMUL_CONTINUE)
 		return false;
 	if ((perm >> bit_idx) & mask)
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4191,10 +4191,15 @@ EXPORT_SYMBOL_GPL(kvm_read_guest_virt);
 
 static int emulator_read_std(struct x86_emulate_ctxt *ctxt,
 			     gva_t addr, void *val, unsigned int bytes,
-			     struct x86_exception *exception)
+			     struct x86_exception *exception, bool system)
 {
 	struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
-	return kvm_read_guest_virt_helper(addr, val, bytes, vcpu, 0, exception);
+	u32 access = 0;
+
+	if (!system && kvm_x86_ops->get_cpl(vcpu) == 3)
+		access |= PFERR_USER_MASK;
+
+	return kvm_read_guest_virt_helper(addr, val, bytes, vcpu, access, exception);
 }
 
 static int kvm_write_guest_virt_helper(gva_t addr, void *val, unsigned int bytes,
@@ -4229,12 +4234,17 @@ out:
 }
 
 static int emulator_write_std(struct x86_emulate_ctxt *ctxt, gva_t addr, void *val,
-			      unsigned int bytes, struct x86_exception *exception)
+			      unsigned int bytes, struct x86_exception *exception,
+			      bool system)
 {
 	struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
+	u32 access = PFERR_WRITE_MASK;
+
+	if (!system && kvm_x86_ops->get_cpl(vcpu) == 3)
+		access |= PFERR_USER_MASK;
 
 	return kvm_write_guest_virt_helper(addr, val, bytes, vcpu,
-					   PFERR_WRITE_MASK, exception);
+					   access, exception);
 }
 
 int kvm_write_guest_virt_system(struct kvm_vcpu *vcpu, gva_t addr, void *val,


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 07/63] usbip: usbip_host: refine probe and disconnect debug msgs to be  useful
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 51/63] xfs: catch inode allocation state mismatch corruption Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 41/63] USB: yurex: fix out-of-bounds uaccess in read handler Ben Hutchings
                   ` (61 subsequent siblings)
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: akpm, Shuah Khan, Greg Kroah-Hartman

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Shuah Khan <shuahkh@osg.samsung.com>

commit 28b68acc4a88dcf91fd1dcf2577371dc9bf574cc upstream.

Refine probe and disconnect debug msgs to be useful and say what is
in progress.

Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[bwh: Backported to 3.16: adjust filename]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 drivers/staging/usbip/stub_dev.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/staging/usbip/stub_dev.c
+++ b/drivers/staging/usbip/stub_dev.c
@@ -343,7 +343,7 @@ static int stub_probe(struct usb_device
 	struct bus_id_priv *busid_priv;
 	int rc;
 
-	dev_dbg(&udev->dev, "Enter\n");
+	dev_dbg(&udev->dev, "Enter probe\n");
 
 	/* check we should claim or not by busid_table */
 	busid_priv = get_busid_priv(udev_busid);
@@ -446,7 +446,7 @@ static void stub_disconnect(struct usb_d
 	struct bus_id_priv *busid_priv;
 	int rc;
 
-	dev_dbg(&udev->dev, "Enter\n");
+	dev_dbg(&udev->dev, "Enter disconnect\n");
 
 	busid_priv = get_busid_priv(udev_busid);
 	if (!busid_priv) {


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 06/63] usbip: usbip_host: fix to hold parent lock for device_attach() calls
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (6 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 18/63] sr: pass down correctly sized SCSI sense buffer Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 29/63] ext4: make sure bitmaps and the inode table don't overlap with bg descriptors Ben Hutchings
                   ` (55 subsequent siblings)
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: akpm, Greg Kroah-Hartman, Shuah Khan

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Shuah Khan <shuahkh@osg.samsung.com>

commit 4bfb141bc01312a817d36627cc47c93f801c216d upstream.

usbip_host calls device_attach() without holding dev->parent lock.
Fix it.

Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[bwh: Backported to 3.16: adjust filename]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 drivers/staging/usbip/stub_main.c | 5 +++++
 1 file changed, 5 insertions(+)

--- a/drivers/staging/usbip/stub_main.c
+++ b/drivers/staging/usbip/stub_main.c
@@ -205,7 +205,12 @@ static ssize_t rebind_store(struct devic
 	if (!bid)
 		return -ENODEV;
 
+	/* device_attach() callers should hold parent lock for USB */
+	if (bid->udev->dev.parent)
+		device_lock(bid->udev->dev.parent);
 	ret = device_attach(&bid->udev->dev);
+	if (bid->udev->dev.parent)
+		device_unlock(bid->udev->dev.parent);
 	if (ret < 0) {
 		dev_err(&bid->udev->dev, "rebind failed\n");
 		return ret;


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 10/63] usbip: usbip_host: fix NULL-ptr deref and use-after-free errors
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (49 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 05/63] usbip: fix error handling in stub_probe() Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 46/63] cdrom: Fix info leak/OOB read in cdrom_ioctl_drive_status Ben Hutchings
                   ` (12 subsequent siblings)
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: akpm, Greg Kroah-Hartman, Shuah Khan (Samsung OSG), Jakub

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: "Shuah Khan (Samsung OSG)" <shuah@kernel.org>

commit 22076557b07c12086eeb16b8ce2b0b735f7a27e7 upstream.

usbip_host updates device status without holding lock from stub probe,
disconnect and rebind code paths. When multiple requests to import a
device are received, these unprotected code paths step all over each
other and drive fails with NULL-ptr deref and use-after-free errors.

The driver uses a table lock to protect the busid array for adding and
deleting busids to the table. However, the probe, disconnect and rebind
paths get the busid table entry and update the status without holding
the busid table lock. Add a new finer grain lock to protect the busid
entry. This new lock will be held to search and update the busid entry
fields from get_busid_idx(), add_match_busid() and del_match_busid().

match_busid_show() does the same to access the busid entry fields.

get_busid_priv() changed to return the pointer to the busid entry holding
the busid lock. stub_probe(), stub_disconnect() and stub_device_rebind()
call put_busid_priv() to release the busid lock before returning. This
changes fixes the unprotected code paths eliminating the race conditions
in updating the busid entries.

Reported-by: Jakub Jirasek
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[bwh: Backported to 3.16: adjust filenames, context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 drivers/staging/usbip/stub.h      |  2 ++
 drivers/staging/usbip/stub_dev.c  | 33 ++++++++++++++++++++---------
 drivers/staging/usbip/stub_main.c | 40 ++++++++++++++++++++++++++++++-----
 3 files changed, 60 insertions(+), 15 deletions(-)

--- a/drivers/staging/usbip/stub.h
+++ b/drivers/staging/usbip/stub.h
@@ -87,6 +87,7 @@ struct bus_id_priv {
 	struct stub_device *sdev;
 	struct usb_device *udev;
 	char shutdown_busid;
+	spinlock_t busid_lock;
 };
 
 /* stub_priv is allocated from stub_priv_cache */
@@ -97,6 +98,7 @@ extern struct usb_device_driver stub_dri
 
 /* stub_main.c */
 struct bus_id_priv *get_busid_priv(const char *busid);
+void put_busid_priv(struct bus_id_priv *bid);
 int del_match_busid(char *busid);
 void stub_device_cleanup_urbs(struct stub_device *sdev);
 
--- a/drivers/staging/usbip/stub_dev.c
+++ b/drivers/staging/usbip/stub_dev.c
@@ -341,7 +341,7 @@ static int stub_probe(struct usb_device
 	struct stub_device *sdev = NULL;
 	const char *udev_busid = dev_name(&udev->dev);
 	struct bus_id_priv *busid_priv;
-	int rc;
+	int rc = 0;
 
 	dev_dbg(&udev->dev, "Enter probe\n");
 
@@ -358,13 +358,15 @@ static int stub_probe(struct usb_device
 		 * other matched drivers by the driver core.
 		 * See driver_probe_device() in driver/base/dd.c
 		 */
-		return -ENODEV;
+		rc = -ENODEV;
+		goto call_put_busid_priv;
 	}
 
 	if (udev->descriptor.bDeviceClass == USB_CLASS_HUB) {
 		dev_dbg(&udev->dev, "%s is a usb hub device... skip!\n",
 			 udev_busid);
-		return -ENODEV;
+		rc = -ENODEV;
+		goto call_put_busid_priv;
 	}
 
 	if (!strcmp(udev->bus->bus_name, "vhci_hcd")) {
@@ -372,13 +374,16 @@ static int stub_probe(struct usb_device
 			"%s is attached on vhci_hcd... skip!\n",
 			udev_busid);
 
-		return -ENODEV;
+		rc = -ENODEV;
+		goto call_put_busid_priv;
 	}
 
 	/* ok, this is my device */
 	sdev = stub_device_alloc(udev);
-	if (!sdev)
-		return -ENOMEM;
+	if (!sdev) {
+		rc = -ENOMEM;
+		goto call_put_busid_priv;
+	}
 
 	dev_info(&udev->dev,
 		"usbip-host: register new device (bus %u dev %u)\n",
@@ -410,7 +415,9 @@ static int stub_probe(struct usb_device
 	}
 	busid_priv->status = STUB_BUSID_ALLOC;
 
-	return 0;
+	rc = 0;
+	goto call_put_busid_priv;
+
 err_files:
 	usb_hub_release_port(udev->parent, udev->portnum,
 			     (struct usb_dev_state *) udev);
@@ -421,6 +428,9 @@ err_port:
 
 	busid_priv->sdev = NULL;
 	stub_device_free(sdev);
+
+call_put_busid_priv:
+	put_busid_priv(busid_priv);
 	return rc;
 }
 
@@ -459,7 +469,7 @@ static void stub_disconnect(struct usb_d
 	/* get stub_device */
 	if (!sdev) {
 		dev_err(&udev->dev, "could not get device");
-		return;
+		goto call_put_busid_priv;
 	}
 
 	dev_set_drvdata(&udev->dev, NULL);
@@ -474,12 +484,12 @@ static void stub_disconnect(struct usb_d
 				  (struct usb_dev_state *) udev);
 	if (rc) {
 		dev_dbg(&udev->dev, "unable to release port\n");
-		return;
+		goto call_put_busid_priv;
 	}
 
 	/* If usb reset is called from event handler */
 	if (busid_priv->sdev->ud.eh == current)
-		return;
+		goto call_put_busid_priv;
 
 	/* shutdown the current connection */
 	shutdown_busid(busid_priv);
@@ -492,6 +502,9 @@ static void stub_disconnect(struct usb_d
 
 	if (busid_priv->status == STUB_BUSID_ALLOC)
 		busid_priv->status = STUB_BUSID_ADDED;
+
+call_put_busid_priv:
+	put_busid_priv(busid_priv);
 }
 
 #ifdef CONFIG_PM
--- a/drivers/staging/usbip/stub_main.c
+++ b/drivers/staging/usbip/stub_main.c
@@ -40,6 +40,8 @@ static spinlock_t busid_table_lock;
 
 static void init_busid_table(void)
 {
+	int i;
+
 	/*
 	 * This also sets the bus_table[i].status to
 	 * STUB_BUSID_OTHER, which is 0.
@@ -47,6 +49,9 @@ static void init_busid_table(void)
 	memset(busid_table, 0, sizeof(busid_table));
 
 	spin_lock_init(&busid_table_lock);
+
+	for (i = 0; i < MAX_BUSID; i++)
+		spin_lock_init(&busid_table[i].busid_lock);
 }
 
 /*
@@ -58,15 +63,20 @@ static int get_busid_idx(const char *bus
 	int i;
 	int idx = -1;
 
-	for (i = 0; i < MAX_BUSID; i++)
+	for (i = 0; i < MAX_BUSID; i++) {
+		spin_lock(&busid_table[i].busid_lock);
 		if (busid_table[i].name[0])
 			if (!strncmp(busid_table[i].name, busid, BUSID_SIZE)) {
 				idx = i;
+				spin_unlock(&busid_table[i].busid_lock);
 				break;
 			}
+		spin_unlock(&busid_table[i].busid_lock);
+	}
 	return idx;
 }
 
+/* Returns holding busid_lock. Should call put_busid_priv() to unlock */
 struct bus_id_priv *get_busid_priv(const char *busid)
 {
 	int idx;
@@ -74,13 +84,21 @@ struct bus_id_priv *get_busid_priv(const
 
 	spin_lock(&busid_table_lock);
 	idx = get_busid_idx(busid);
-	if (idx >= 0)
+	if (idx >= 0) {
 		bid = &(busid_table[idx]);
+		/* get busid_lock before returning */
+		spin_lock(&bid->busid_lock);
+	}
 	spin_unlock(&busid_table_lock);
 
 	return bid;
 }
 
+void put_busid_priv(struct bus_id_priv *bid)
+{
+	spin_unlock(&bid->busid_lock);
+}
+
 static int add_match_busid(char *busid)
 {
 	int i;
@@ -93,15 +111,19 @@ static int add_match_busid(char *busid)
 		goto out;
 	}
 
-	for (i = 0; i < MAX_BUSID; i++)
+	for (i = 0; i < MAX_BUSID; i++) {
+		spin_lock(&busid_table[i].busid_lock);
 		if (!busid_table[i].name[0]) {
 			strncpy(busid_table[i].name, busid, BUSID_SIZE);
 			if ((busid_table[i].status != STUB_BUSID_ALLOC) &&
 			    (busid_table[i].status != STUB_BUSID_REMOV))
 				busid_table[i].status = STUB_BUSID_ADDED;
 			ret = 0;
+			spin_unlock(&busid_table[i].busid_lock);
 			break;
 		}
+		spin_unlock(&busid_table[i].busid_lock);
+	}
 
 out:
 	spin_unlock(&busid_table_lock);
@@ -122,6 +144,8 @@ int del_match_busid(char *busid)
 	/* found */
 	ret = 0;
 
+	spin_lock(&busid_table[idx].busid_lock);
+
 	if (busid_table[idx].status == STUB_BUSID_OTHER)
 		memset(busid_table[idx].name, 0, BUSID_SIZE);
 
@@ -129,6 +153,7 @@ int del_match_busid(char *busid)
 	    (busid_table[idx].status != STUB_BUSID_ADDED))
 		busid_table[idx].status = STUB_BUSID_REMOV;
 
+	spin_unlock(&busid_table[idx].busid_lock);
 out:
 	spin_unlock(&busid_table_lock);
 
@@ -141,9 +166,12 @@ static ssize_t show_match_busid(struct d
 	char *out = buf;
 
 	spin_lock(&busid_table_lock);
-	for (i = 0; i < MAX_BUSID; i++)
+	for (i = 0; i < MAX_BUSID; i++) {
+		spin_lock(&busid_table[i].busid_lock);
 		if (busid_table[i].name[0])
 			out += sprintf(out, "%s ", busid_table[i].name);
+		spin_unlock(&busid_table[i].busid_lock);
+	}
 	spin_unlock(&busid_table_lock);
 	out += sprintf(out, "\n");
 
@@ -223,7 +251,7 @@ static void stub_device_rebind(void)
 	}
 	spin_unlock(&busid_table_lock);
 
-	/* now run rebind */
+	/* now run rebind - no need to hold locks. driver files are removed */
 	for (i = 0; i < MAX_BUSID; i++) {
 		if (busid_table[i].name[0] &&
 		    busid_table[i].shutdown_busid) {
@@ -253,6 +281,8 @@ static ssize_t rebind_store(struct devic
 
 	/* mark the device for deletion so probe ignores it during rescan */
 	bid->status = STUB_BUSID_OTHER;
+	/* release the busid lock */
+	put_busid_priv(bid);
 
 	ret = do_rebind((char *) buf, bid);
 	if (ret < 0)


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 18/63] sr: pass down correctly sized SCSI sense buffer
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (5 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 36/63] jbd2: don't mark block as modified if the handle is out of credits Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 06/63] usbip: usbip_host: fix to hold parent lock for device_attach() calls Ben Hutchings
                   ` (56 subsequent siblings)
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: akpm, Daniel Shapira, Kees Cook, Piotr Gabriel Kosinski, Jens Axboe

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Jens Axboe <axboe@kernel.dk>

commit f7068114d45ec55996b9040e98111afa56e010fe upstream.

We're casting the CDROM layer request_sense to the SCSI sense
buffer, but the former is 64 bytes and the latter is 96 bytes.
As we generally allocate these on the stack, we end up blowing
up the stack.

Fix this by wrapping the scsi_execute() call with a properly
sized sense buffer, and copying back the bits for the CDROM
layer.

Reported-by: Piotr Gabriel Kosinski <pg.kosinski@gmail.com>
Reported-by: Daniel Shapira <daniel@twistlock.com>
Tested-by: Kees Cook <keescook@chromium.org>
Fixes: 82ed4db499b8 ("block: split scsi_request out of struct request")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
[bwh: Despite what the "Fixes" field says, a buffer overrun was already
 possible if the sense data was really > 64 bytes long.
 Backported to 3.16:
 - We always need to allocate a sense buffer in order to call
   scsi_normalize_sense()
 - Remove the existing conditional heap-allocation of the sense buffer]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 drivers/scsi/sr_ioctl.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

--- a/drivers/scsi/sr_ioctl.c
+++ b/drivers/scsi/sr_ioctl.c
@@ -188,30 +188,25 @@ int sr_do_ioctl(Scsi_CD *cd, struct pack
 	struct scsi_device *SDev;
 	struct scsi_sense_hdr sshdr;
 	int result, err = 0, retries = 0;
-	struct request_sense *sense = cgc->sense;
+	unsigned char sense_buffer[SCSI_SENSE_BUFFERSIZE];
 
 	SDev = cd->device;
 
-	if (!sense) {
-		sense = kmalloc(SCSI_SENSE_BUFFERSIZE, GFP_KERNEL);
-		if (!sense) {
-			err = -ENOMEM;
-			goto out;
-		}
-	}
-
       retry:
 	if (!scsi_block_when_processing_errors(SDev)) {
 		err = -ENODEV;
 		goto out;
 	}
 
-	memset(sense, 0, sizeof(*sense));
+	memset(sense_buffer, 0, sizeof(sense_buffer));
 	result = scsi_execute(SDev, cgc->cmd, cgc->data_direction,
-			      cgc->buffer, cgc->buflen, (char *)sense,
+			      cgc->buffer, cgc->buflen, sense_buffer,
 			      cgc->timeout, IOCTL_RETRIES, 0, NULL);
 
-	scsi_normalize_sense((char *)sense, sizeof(*sense), &sshdr);
+	scsi_normalize_sense(sense_buffer, sizeof(sense_buffer), &sshdr);
+
+	if (cgc->sense)
+		memcpy(cgc->sense, sense_buffer, sizeof(*cgc->sense));
 
 	/* Minimal error checking.  Ignore cases we know about, and report the rest. */
 	if (driver_byte(result) != 0) {
@@ -268,8 +263,6 @@ int sr_do_ioctl(Scsi_CD *cd, struct pack
 
 	/* Wake up a process waiting for device */
       out:
-	if (!cgc->sense)
-		kfree(sense);
 	cgc->stat = err;
 	return err;
 }


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 3.16 03/63] Revert "vti4: Don't override MTU passed on link creation via IFLA_MTU"
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (52 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 28/63] ext4: don't allow r/w mounts if metadata blocks overlap the superblock Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 34/63] ext4: clear i_data in ext4_inode_info when removing inline data Ben Hutchings
                   ` (9 subsequent siblings)
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: akpm, Alistair Strachan

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Ben Hutchings <ben@decadent.org.uk>

This reverts commit 5a79e43ffa5014c020e0d0f4e383205f87b10111, which
was commit 03080e5ec72740c1a62e6730f2a5f3f114f11b19 upstream, as it
causes test failures.  It should not have been backported to anything
older than 4.16.  Thanks to Alistair Strachan for debugging this.

Cc: Alistair Strachan <astrachan@google.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 net/ipv4/ip_vti.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/ipv4/ip_vti.c b/net/ipv4/ip_vti.c
index b4fc9e710308..778ee1bf40ad 100644
--- a/net/ipv4/ip_vti.c
+++ b/net/ipv4/ip_vti.c
@@ -359,6 +359,7 @@ static int vti_tunnel_init(struct net_device *dev)
 	memcpy(dev->dev_addr, &iph->saddr, 4);
 	memcpy(dev->broadcast, &iph->daddr, 4);
 
+	dev->mtu		= ETH_DATA_LEN;
 	dev->flags		= IFF_NOARP;
 	dev->iflink		= 0;
 	dev->addr_len		= 4;


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH 3.16 08/63] usbip: usbip_host: delete device from busid_table after rebind
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (16 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 23/63] xfs: set format back to extents if xfs_bmap_extents_to_btree Ben Hutchings
@ 2018-09-22  0:15 ` Ben Hutchings
  2018-09-22  0:15 ` [PATCH 3.16 60/63] x86/cpu/AMD: Fix erratum 1076 (CPB bit) Ben Hutchings
                   ` (45 subsequent siblings)
  63 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22  0:15 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: akpm, Shuah Khan (Samsung OSG), Greg Kroah-Hartman

3.16.58-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: "Shuah Khan (Samsung OSG)" <shuah@kernel.org>

commit 1e180f167d4e413afccbbb4a421b48b2de832549 upstream.

Device is left in the busid_table after unbind and rebind. Rebind
initiates usb bus scan and the original driver claims the device.
After rescan the device should be deleted from the busid_table as
it no longer belongs to usbip_host.

Fix it to delete the device after device_attach() succeeds.

Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[bwh: Backported to 3.16: adjust filename]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 drivers/staging/usbip/stub_main.c | 6 ++++++
 1 file changed, 6 insertions(+)

--- a/drivers/staging/usbip/stub_main.c
+++ b/drivers/staging/usbip/stub_main.c
@@ -205,6 +205,9 @@ static ssize_t rebind_store(struct devic
 	if (!bid)
 		return -ENODEV;
 
+	/* mark the device for deletion so probe ignores it during rescan */
+	bid->status = STUB_BUSID_OTHER;
+
 	/* device_attach() callers should hold parent lock for USB */
 	if (bid->udev->dev.parent)
 		device_lock(bid->udev->dev.parent);
@@ -216,6 +219,9 @@ static ssize_t rebind_store(struct devic
 		return ret;
 	}
 
+	/* delete device from busid_table */
+	del_match_busid((char *) buf);
+
 	return count;
 }
 


^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 3.16 20/63] scsi: sg: allocate with __GFP_ZERO in sg_build_indirect()
  2018-09-22  0:15 ` [PATCH 3.16 20/63] scsi: sg: allocate with __GFP_ZERO in sg_build_indirect() Ben Hutchings
@ 2018-09-22  0:19   ` syzbot
  0 siblings, 0 replies; 71+ messages in thread
From: syzbot @ 2018-09-22  0:19 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: akpm, ben, dgilbert, glider, jthumshirn, linux-kernel,
	martin.petersen, stable

> 3.16.58-rc1 review patch.  If anyone has any objections, please let me  
> know.

> ------------------

> From: Alexander Potapenko <glider@google.com>

> commit a45b599ad808c3c982fdcdc12b0b8611c2f92824 upstream.

> This shall help avoid copying uninitialized memory to the userspace when
> calling ioctl(fd, SG_IO) with an empty command.

> Reported-by: syzbot+7d26fc1eea198488deab@syzkaller.appspotmail.com
> Signed-off-by: Alexander Potapenko <glider@google.com>
> Acked-by: Douglas Gilbert <dgilbert@interlog.com>
> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
> ---
>   drivers/scsi/sg.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)

> --- a/drivers/scsi/sg.c
> +++ b/drivers/scsi/sg.c
> @@ -1825,7 +1825,7 @@ retry:
>   		num = (rem_sz > scatter_elem_sz_prev) ?
>   			scatter_elem_sz_prev : rem_sz;

> -		schp->pages[k] = alloc_pages(gfp_mask, order);
> +		schp->pages[k] = alloc_pages(gfp_mask | __GFP_ZERO, order);
>   		if (!schp->pages[k])
>   			goto out;



Can't find the corresponding bug.


^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 3.16 51/63] xfs: catch inode allocation state mismatch corruption
  2018-09-22  0:15 ` [PATCH 3.16 51/63] xfs: catch inode allocation state mismatch corruption Ben Hutchings
@ 2018-09-22  5:25   ` Dave Chinner
  2018-09-22 20:57     ` Ben Hutchings
  0 siblings, 1 reply; 71+ messages in thread
From: Dave Chinner @ 2018-09-22  5:25 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: linux-kernel, stable, akpm, Darrick J. Wong, Carlos Maiolino

On Sat, Sep 22, 2018 at 01:15:42AM +0100, Ben Hutchings wrote:
> 3.16.58-rc1 review patch.  If anyone has any objections, please let me know.
> 
> ------------------
> 
> From: Dave Chinner <dchinner@redhat.com>
> 
> commit ee457001ed6c6f31ddad69c24c1da8f377d8472d upstream.
> 
> We recently came across a V4 filesystem causing memory corruption
> due to a newly allocated inode being setup twice and being added to
> the superblock inode list twice. From code inspection, the only way
> this could happen is if a newly allocated inode was not marked as
> free on disk (i.e. di_mode wasn't zero).
....
> Signed-Off-By: Dave Chinner <dchinner@redhat.com>
> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
> Tested-by: Carlos Maiolino <cmaiolino@redhat.com>
> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> [bwh: Backported to 3.16:
>  - Look up mode in XFS inode, not VFS inode
>  - Use positive error codes, and EIO instead of EFSCORRUPTED]

Why EIO?

Cheers,

Dave.
-- 
Dave Chinner
dchinner@redhat.com

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 3.16 52/63] xfs: validate cached inodes are free when allocated
  2018-09-22  0:15 ` [PATCH 3.16 52/63] xfs: validate cached inodes are free when allocated Ben Hutchings
@ 2018-09-22  5:26   ` Dave Chinner
  2018-09-22 20:57     ` Ben Hutchings
  0 siblings, 1 reply; 71+ messages in thread
From: Dave Chinner @ 2018-09-22  5:26 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: linux-kernel, stable, akpm, Christoph Hellwig, Wen Xu,
	Carlos Maiolino, Darrick J. Wong

On Sat, Sep 22, 2018 at 01:15:42AM +0100, Ben Hutchings wrote:
> 3.16.58-rc1 review patch.  If anyone has any objections, please let me know.
> 
> ------------------
> 
> From: Dave Chinner <dchinner@redhat.com>
> 
> commit afca6c5b2595fc44383919fba740c194b0b76aff upstream.
> 
> A recent fuzzed filesystem image cached random dcache corruption
> when the reproducer was run. This often showed up as panics in
> lookup_slow() on a null inode->i_ops pointer when doing pathwalks.
.....
> [bwh: Backported to 3.16:
>  - Look up mode in XFS inode, not VFS inode
>  - Use positive error codes, and EIO instead of EFSCORRUPTED]

Again, why EIO?

And ....
> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
> ---
>  fs/xfs/xfs_icache.c | 73 +++++++++++++++++++++++++++++----------------
>  1 file changed, 48 insertions(+), 25 deletions(-)
> 
> --- a/fs/xfs/xfs_icache.c
> +++ b/fs/xfs/xfs_icache.c
> @@ -133,6 +133,46 @@ xfs_inode_free(
>  }
>  
>  /*
> + * If we are allocating a new inode, then check what was returned is
> + * actually a free, empty inode. If we are not allocating an inode,
> + * then check we didn't find a free inode.
> + *
> + * Returns:
> + *	0		if the inode free state matches the lookup context
> + *	ENOENT		if the inode is free and we are not allocating
> + *	EFSCORRUPTED	if there is any state mismatch at all

You changed the code but not the comment.

Cheers,

Dave.
-- 
Dave Chinner
dchinner@redhat.com

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 3.16 00/63] 3.16.58-rc1 review
  2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
                   ` (62 preceding siblings ...)
  2018-09-22  0:15 ` [PATCH 3.16 62/63] KVM: x86: introduce num_emulated_msrs Ben Hutchings
@ 2018-09-22 12:28 ` Guenter Roeck
  2018-09-22 21:03   ` Ben Hutchings
  63 siblings, 1 reply; 71+ messages in thread
From: Guenter Roeck @ 2018-09-22 12:28 UTC (permalink / raw)
  To: Ben Hutchings, linux-kernel, stable; +Cc: torvalds, akpm

On 09/21/2018 05:15 PM, Ben Hutchings wrote:
> This is the start of the stable review cycle for the 3.16.58 release.
> There are 63 patches in this series, which will be posted as responses
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Mon Sep 24 00:15:41 UTC 2018.
> Anything received after that time might be too late.
> 

Build results:
	total: 139 pass: 139 fail: 0
Qemu test results:
	total: 217 pass: 217 fail: 0

Details are available at https://kerneltests.org/builders/.

Guenter

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 3.16 51/63] xfs: catch inode allocation state mismatch corruption
  2018-09-22  5:25   ` Dave Chinner
@ 2018-09-22 20:57     ` Ben Hutchings
  0 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22 20:57 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-kernel, stable, akpm, Darrick J. Wong, Carlos Maiolino

[-- Attachment #1: Type: text/plain, Size: 1459 bytes --]

On Sat, 2018-09-22 at 15:25 +1000, Dave Chinner wrote:
> On Sat, Sep 22, 2018 at 01:15:42AM +0100, Ben Hutchings wrote:
> > 3.16.58-rc1 review patch.  If anyone has any objections, please let
> > me know.
> > 
> > ------------------
> > 
> > From: Dave Chinner <dchinner@redhat.com>
> > 
> > commit ee457001ed6c6f31ddad69c24c1da8f377d8472d upstream.
> > 
> > We recently came across a V4 filesystem causing memory corruption
> > due to a newly allocated inode being setup twice and being added to
> > the superblock inode list twice. From code inspection, the only way
> > this could happen is if a newly allocated inode was not marked as
> > free on disk (i.e. di_mode wasn't zero).
> 
> ....
> > Signed-Off-By: Dave Chinner <dchinner@redhat.com>
> > Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
> > Tested-by: Carlos Maiolino <cmaiolino@redhat.com>
> > Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > [bwh: Backported to 3.16:
> >  - Look up mode in XFS inode, not VFS inode
> >  - Use positive error codes, and EIO instead of EFSCORRUPTED]
> 
> Why EIO?

I believe EIO was the usual error code used for filesystem errors
before EFSCORRUPTED was added.  But now I see xfs had its own private
definition of EFSCORRUPTED.  I'll change this back.

Ben.

-- 
Ben Hutchings
Any sufficiently advanced bug is indistinguishable from a feature.



[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 3.16 52/63] xfs: validate cached inodes are free when allocated
  2018-09-22  5:26   ` Dave Chinner
@ 2018-09-22 20:57     ` Ben Hutchings
  0 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22 20:57 UTC (permalink / raw)
  To: Dave Chinner
  Cc: linux-kernel, stable, akpm, Christoph Hellwig, Wen Xu,
	Carlos Maiolino, Darrick J. Wong

[-- Attachment #1: Type: text/plain, Size: 1731 bytes --]

On Sat, 2018-09-22 at 15:26 +1000, Dave Chinner wrote:
> On Sat, Sep 22, 2018 at 01:15:42AM +0100, Ben Hutchings wrote:
> > 3.16.58-rc1 review patch.  If anyone has any objections, please let me know.
> > 
> > ------------------
> > 
> > From: Dave Chinner <dchinner@redhat.com>
> > 
> > commit afca6c5b2595fc44383919fba740c194b0b76aff upstream.
> > 
> > A recent fuzzed filesystem image cached random dcache corruption
> > when the reproducer was run. This often showed up as panics in
> > lookup_slow() on a null inode->i_ops pointer when doing pathwalks.
> 
> .....
> > [bwh: Backported to 3.16:
> >  - Look up mode in XFS inode, not VFS inode
> >  - Use positive error codes, and EIO instead of EFSCORRUPTED]
> 
> Again, why EIO?

I'll change this back to EFSCORRUPTED.

Ben.

> And ....
> > Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
> > ---
> >  fs/xfs/xfs_icache.c | 73 +++++++++++++++++++++++++++++----------------
> >  1 file changed, 48 insertions(+), 25 deletions(-)
> > 
> > --- a/fs/xfs/xfs_icache.c
> > +++ b/fs/xfs/xfs_icache.c
> > @@ -133,6 +133,46 @@ xfs_inode_free(
> >  }
> >  
> >  /*
> > + * If we are allocating a new inode, then check what was returned is
> > + * actually a free, empty inode. If we are not allocating an inode,
> > + * then check we didn't find a free inode.
> > + *
> > + * Returns:
> > + *	0		if the inode free state matches the lookup context
> > + *	ENOENT		if the inode is free and we are not allocating
> > + *	EFSCORRUPTED	if there is any state mismatch at all
> 
> You changed the code but not the comment.
> 
> Cheers,
> 
> Dave.
-- 
Ben Hutchings
Any sufficiently advanced bug is indistinguishable from a feature.



[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 3.16 00/63] 3.16.58-rc1 review
  2018-09-22 12:28 ` [PATCH 3.16 00/63] 3.16.58-rc1 review Guenter Roeck
@ 2018-09-22 21:03   ` Ben Hutchings
  0 siblings, 0 replies; 71+ messages in thread
From: Ben Hutchings @ 2018-09-22 21:03 UTC (permalink / raw)
  To: Guenter Roeck, linux-kernel, stable; +Cc: torvalds, akpm

[-- Attachment #1: Type: text/plain, Size: 787 bytes --]

On Sat, 2018-09-22 at 05:28 -0700, Guenter Roeck wrote:
> On 09/21/2018 05:15 PM, Ben Hutchings wrote:
> > This is the start of the stable review cycle for the 3.16.58 release.
> > There are 63 patches in this series, which will be posted as responses
> > to this one.  If anyone has any issues with these being applied, please
> > let me know.
> > 
> > Responses should be made by Mon Sep 24 00:15:41 UTC 2018.
> > Anything received after that time might be too late.
> > 
> 
> Build results:
> 	total: 139 pass: 139 fail: 0
> Qemu test results:
> 	total: 217 pass: 217 fail: 0
> 
> Details are available at https://kerneltests.org/builders/.

Thanks for checking.

Ben.

-- 
Ben Hutchings
Any sufficiently advanced bug is indistinguishable from a feature.



[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 71+ messages in thread

end of thread, other threads:[~2018-09-22 21:03 UTC | newest]

Thread overview: 71+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-09-22  0:15 [PATCH 3.16 00/63] 3.16.58-rc1 review Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 51/63] xfs: catch inode allocation state mismatch corruption Ben Hutchings
2018-09-22  5:25   ` Dave Chinner
2018-09-22 20:57     ` Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 07/63] usbip: usbip_host: refine probe and disconnect debug msgs to be useful Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 41/63] USB: yurex: fix out-of-bounds uaccess in read handler Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 54/63] seccomp: create internal mode-setting function Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 45/63] x86/paravirt: Fix spectre-v2 mitigations for paravirt guests Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 36/63] jbd2: don't mark block as modified if the handle is out of credits Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 18/63] sr: pass down correctly sized SCSI sense buffer Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 06/63] usbip: usbip_host: fix to hold parent lock for device_attach() calls Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 29/63] ext4: make sure bitmaps and the inode table don't overlap with bg descriptors Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 63/63] mm: get rid of vmacache_flush_all() entirely Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 26/63] ext4: verify the depth of extent tree in ext4_find_extent() Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 38/63] Fix up non-directory creation in SGID directories Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 56/63] seccomp: split mode setting routines Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 13/63] futex: Remove unnecessary warning from get_futex_key Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 21/63] Bluetooth: hidp: buffer overflow in hidp_process_report Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 32/63] ext4: always verify the magic number in xattr blocks Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 23/63] xfs: set format back to extents if xfs_bmap_extents_to_btree Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 08/63] usbip: usbip_host: delete device from busid_table after rebind Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 60/63] x86/cpu/AMD: Fix erratum 1076 (CPB bit) Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 57/63] seccomp: add "seccomp" syscall Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 20/63] scsi: sg: allocate with __GFP_ZERO in sg_build_indirect() Ben Hutchings
2018-09-22  0:19   ` syzbot
2018-09-22  0:15 ` [PATCH 3.16 04/63] net: Set sk_prot_creator when cloning sockets to the right proto Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 33/63] ext4: never move the system.data xattr out of the inode body Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 55/63] seccomp: extract check/assign mode helpers Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 16/63] KVM: x86: pass kvm_vcpu to kvm_read_guest_virt and kvm_write_guest_virt_system Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 35/63] ext4: add more inode number paranoia checks Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 42/63] ALSA: rawmidi: Change resized buffers atomically Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 27/63] ext4: always check block group bounds in ext4_init_block_bitmap() Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 25/63] ext4: fix check to prevent initializing reserved inodes Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 14/63] KVM: x86: Emulator ignores LDTR/TR extended base on LLDT/LTR Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 15/63] KVM: x86: introduce linear_{read,write}_system Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 61/63] x86/cpu/intel: Add Knights Mill to Intel family Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 40/63] infiniband: fix a possible use-after-free bug Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 09/63] usbip: usbip_host: run rebind from exit when module is removed Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 11/63] usbip: usbip_host: fix bad unlock balance during stub_probe() Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 12/63] futex: Remove requirement for lock_page() in get_futex_key() Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 30/63] ext4: fix false negatives *and* false positives in ext4_check_descriptors() Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 37/63] ext4: avoid running out of journal credits when appending to an inline file Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 47/63] uas: replace WARN_ON_ONCE() with lockdep_assert_held() Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 44/63] x86/speculation: Protect against userspace-userspace spectreRSB Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 49/63] btrfs: relocation: Only remove reloc rb_trees if reloc control has been initialized Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 24/63] ext4: only look at the bg_flags field if it is valid Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 53/63] xfs: don't call xfs_da_shrink_inode with NULL bp Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 17/63] kvm: x86: use correct privilege level for sgdt/sidt/fxsave/fxrstor access Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 43/63] x86/speculation: Clean up various Spectre related details Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 52/63] xfs: validate cached inodes are free when allocated Ben Hutchings
2018-09-22  5:26   ` Dave Chinner
2018-09-22 20:57     ` Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 48/63] video: uvesafb: Fix integer overflow in allocation Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 19/63] jfs: Fix inconsistency between memory allocation and ea_buf->max_size Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 22/63] scsi: libsas: defer ata device eh commands to libata Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 05/63] usbip: fix error handling in stub_probe() Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 10/63] usbip: usbip_host: fix NULL-ptr deref and use-after-free errors Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 46/63] cdrom: Fix info leak/OOB read in cdrom_ioctl_drive_status Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 28/63] ext4: don't allow r/w mounts if metadata blocks overlap the superblock Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 03/63] Revert "vti4: Don't override MTU passed on link creation via IFLA_MTU" Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 34/63] ext4: clear i_data in ext4_inode_info when removing inline data Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 01/63] x86/fpu: Fix the 'nofxsr' boot parameter to also clear X86_FEATURE_FXSR_OPT Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 02/63] x86/fpu: Default eagerfpu if FPU and FXSR are enabled Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 58/63] x86/process: Optimize TIF checks in __switch_to_xtra() Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 31/63] ext4: add corruption check in ext4_xattr_set_entry() Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 39/63] x86/entry/64: Remove %ebx handling from error_entry/exit Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 59/63] x86/process: Correct and optimize TIF_BLOCKSTEP switch Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 50/63] hfsplus: fix NULL dereference in hfsplus_lookup() Ben Hutchings
2018-09-22  0:15 ` [PATCH 3.16 62/63] KVM: x86: introduce num_emulated_msrs Ben Hutchings
2018-09-22 12:28 ` [PATCH 3.16 00/63] 3.16.58-rc1 review Guenter Roeck
2018-09-22 21:03   ` Ben Hutchings

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).