* [PATCH 4.9 00/66] 4.9.95-stable review
From: Greg Kroah-Hartman @ 2018-04-17 15:58 UTC
  To: linux-kernel
  Cc: Greg Kroah-Hartman, torvalds, akpm, linux, shuahkh, patches,
	ben.hutchings, lkft-triage, stable

This is the start of the stable review cycle for the 4.9.95 release.
There are 66 patches in this series, all of which will be posted as
responses to this one.  If anyone has any issues with these being
applied, please let me know.

Responses should be made by Thu Apr 19 15:56:27 UTC 2018.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:
	https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.95-rc1.gz
or in the git tree and branch at:
	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y
and the diffstat can be found below.

thanks,

greg k-h

-------------
Pseudo-Shortlog of commits:

Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Linux 4.9.95-rc1

Phil Elwell <phil@raspberrypi.org>
    lan78xx: Correctly indicate invalid OTP

Stefan Hajnoczi <stefanha@redhat.com>
    vhost: fix vhost_vq_access_ok() log check

Tejaswi Tanikella <tejaswit@codeaurora.org>
    slip: Check if rstate is initialized before uncompressing

Ka-Cheong Poon <ka-cheong.poon@oracle.com>
    rds: MP-RDS may use an invalid c_path

Bassem Boubaker <bassem.boubaker@actia.fr>
    cdc_ether: flag the Cinterion AHS8 modem by gemalto as WWAN

Marek Szyprowski <m.szyprowski@samsung.com>
    hwmon: (ina2xx) Fix access to uninitialized mutex

Sudhir Sreedharan <ssreedharan@mvista.com>
    rtl8187: Fix NULL pointer dereference in priv->conf_mutex

Szymon Janc <szymon.janc@codecoup.pl>
    Bluetooth: Fix connection if directed advertising and privacy is used

Al Viro <viro@zeniv.linux.org.uk>
    getname_kernel() needs to make sure that ->name != ->iname in long case

Vasily Gorbik <gor@linux.ibm.com>
    s390/ipl: ensure loadparm valid flag is set

Julian Wiedmann <jwi@linux.vnet.ibm.com>
    s390/qdio: don't merge ERROR output buffers

Julian Wiedmann <jwi@linux.vnet.ibm.com>
    s390/qdio: don't retry EQBS after CCQ 96

Dan Williams <dan.j.williams@intel.com>
    nfit: fix region registration vs block-data-window ranges

Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
    block/loop: fix deadlock after loop_set_status

Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Revert "perf tests: Decompress kernel module before objdump"

Eric Biggers <ebiggers@google.com>
    sunrpc: remove incorrect HMAC request initialization

Mark Rutland <mark.rutland@arm.com>
    arm64: Kill PSCI_GET_VERSION as a variant-2 workaround

Mark Rutland <mark.rutland@arm.com>
    arm64: Add ARM_SMCCC_ARCH_WORKAROUND_1 BP hardening support

Mark Rutland <mark.rutland@arm.com>
    arm/arm64: smccc: Implement SMCCC v1.1 inline primitive

Mark Rutland <mark.rutland@arm.com>
    arm/arm64: smccc: Make function identifiers an unsigned quantity

Mark Rutland <mark.rutland@arm.com>
    firmware/psci: Expose SMCCC version through psci_ops

Mark Rutland <mark.rutland@arm.com>
    firmware/psci: Expose PSCI conduit

Mark Rutland <mark.rutland@arm.com>
    arm64: KVM: Add SMCCC_ARCH_WORKAROUND_1 fast handling

Mark Rutland <mark.rutland@arm.com>
    arm64: KVM: Report SMCCC_ARCH_WORKAROUND_1 BP hardening support

Mark Rutland <mark.rutland@arm.com>
    arm/arm64: KVM: Turn kvm_psci_version into a static inline

Mark Rutland <mark.rutland@arm.com>
    arm64: KVM: Make PSCI_VERSION a fast path

Mark Rutland <mark.rutland@arm.com>
    arm/arm64: KVM: Advertise SMCCC v1.1

Mark Rutland <mark.rutland@arm.com>
    arm/arm64: KVM: Implement PSCI 1.0 support

Mark Rutland <mark.rutland@arm.com>
    arm/arm64: KVM: Add smccc accessors to PSCI code

Mark Rutland <mark.rutland@arm.com>
    arm/arm64: KVM: Add PSCI_VERSION helper

Mark Rutland <mark.rutland@arm.com>
    arm/arm64: KVM: Consolidate the PSCI include files

Mark Rutland <mark.rutland@arm.com>
    arm64: KVM: Increment PC after handling an SMC trap

Mark Rutland <mark.rutland@arm.com>
    arm64: Branch predictor hardening for Cavium ThunderX2

Mark Rutland <mark.rutland@arm.com>
    arm64: Implement branch predictor hardening for affected Cortex-A CPUs

Mark Rutland <mark.rutland@arm.com>
    arm64: cpu_errata: Allow an erratum to be match for all revisions of a core

Mark Rutland <mark.rutland@arm.com>
    arm64: cputype: Add missing MIDR values for Cortex-A72 and Cortex-A75

Mark Rutland <mark.rutland@arm.com>
    arm64: entry: Apply BP hardening for suspicious interrupts from EL0

Mark Rutland <mark.rutland@arm.com>
    arm64: entry: Apply BP hardening for high-priority synchronous exceptions

Mark Rutland <mark.rutland@arm.com>
    arm64: KVM: Use per-CPU vector when BP hardening is enabled

Mark Rutland <mark.rutland@arm.com>
    mm: Introduce lm_alias

Mark Rutland <mark.rutland@arm.com>
    arm64: Move BP hardening to check_and_switch_context

Mark Rutland <mark.rutland@arm.com>
    arm64: Add skeleton to harden the branch predictor against aliasing attacks

Mark Rutland <mark.rutland@arm.com>
    arm64: Move post_ttbr_update_workaround to C code

Mark Rutland <mark.rutland@arm.com>
    arm64: Factor out TTBR0_EL1 post-update workaround into a specific asm macro

Mark Rutland <mark.rutland@arm.com>
    drivers/firmware: Expose psci_get_version through psci_ops structure

Mark Rutland <mark.rutland@arm.com>
    arm64: cpufeature: Pass capability structure to ->enable callback

Mark Rutland <mark.rutland@arm.com>
    arm64: Run enable method for errata work arounds on late CPUs

Mark Rutland <mark.rutland@arm.com>
    arm64: cpufeature: __this_cpu_has_cap() shouldn't stop early

Mark Rutland <mark.rutland@arm.com>
    arm64: uaccess: Mask __user pointers for __arch_{clear, copy_*}_user

Mark Rutland <mark.rutland@arm.com>
    arm64: uaccess: Don't bother eliding access_ok checks in __{get, put}_user

Mark Rutland <mark.rutland@arm.com>
    arm64: uaccess: Prevent speculative use of the current addr_limit

Mark Rutland <mark.rutland@arm.com>
    arm64: entry: Ensure branch through syscall table is bounded under speculation

Mark Rutland <mark.rutland@arm.com>
    arm64: Use pointer masking to limit uaccess speculation

Mark Rutland <mark.rutland@arm.com>
    arm64: Make USER_DS an inclusive limit

Mark Rutland <mark.rutland@arm.com>
    arm64: move TASK_* definitions to <asm/processor.h>

Mark Rutland <mark.rutland@arm.com>
    arm64: Implement array_index_mask_nospec()

Mark Rutland <mark.rutland@arm.com>
    arm64: barrier: Add CSDB macros to control data-value prediction

Arnd Bergmann <arnd@arndb.de>
    radeon: hide pointless #warning when compile testing

Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
    perf/core: Fix use-after-free in uprobe_perf_close()

Adrian Hunter <adrian.hunter@intel.com>
    perf intel-pt: Fix timestamp following overflow

Adrian Hunter <adrian.hunter@intel.com>
    perf intel-pt: Fix error recovery from missing TIP packet

Adrian Hunter <adrian.hunter@intel.com>
    perf intel-pt: Fix sync_switch

Adrian Hunter <adrian.hunter@intel.com>
    perf intel-pt: Fix overlap detection to identify consecutive buffers correctly

Dexuan Cui <decui@microsoft.com>
    Drivers: hv: vmbus: do not mark HV_PCIE as perf_device

Helge Deller <deller@gmx.de>
    parisc: Fix out of array access in match_pci_device()

Mauro Carvalho Chehab <mchehab@kernel.org>
    media: v4l2-compat-ioctl32: don't oops on overlay


-------------

Diffstat:

 Makefile                                           |    4 +-
 arch/arm/include/asm/kvm_host.h                    |    6 +
 arch/arm/include/asm/kvm_mmu.h                     |   10 +
 arch/arm/include/asm/kvm_psci.h                    |   27 -
 arch/arm/kvm/arm.c                                 |   11 +-
 arch/arm/kvm/handle_exit.c                         |    4 +-
 arch/arm/kvm/psci.c                                |  143 +-
 arch/arm64/Kconfig                                 |   17 +
 arch/arm64/crypto/sha256-core.S                    | 2061 ++++++++++++++++++++
 arch/arm64/crypto/sha512-core.S                    | 1085 +++++++++++
 arch/arm64/include/asm/assembler.h                 |   19 +
 arch/arm64/include/asm/barrier.h                   |   23 +
 arch/arm64/include/asm/cpucaps.h                   |    3 +-
 arch/arm64/include/asm/cputype.h                   |    6 +
 arch/arm64/include/asm/kvm_host.h                  |    5 +
 arch/arm64/include/asm/kvm_mmu.h                   |   38 +
 arch/arm64/include/asm/kvm_psci.h                  |   27 -
 arch/arm64/include/asm/memory.h                    |   15 -
 arch/arm64/include/asm/mmu.h                       |   39 +
 arch/arm64/include/asm/processor.h                 |   24 +
 arch/arm64/include/asm/sysreg.h                    |    2 +
 arch/arm64/include/asm/uaccess.h                   |  153 +-
 arch/arm64/kernel/Makefile                         |    4 +
 arch/arm64/kernel/arm64ksyms.c                     |    4 +-
 arch/arm64/kernel/bpi.S                            |   75 +
 arch/arm64/kernel/cpu_errata.c                     |  189 +-
 arch/arm64/kernel/cpufeature.c                     |   10 +-
 arch/arm64/kernel/entry.S                          |   25 +-
 arch/arm64/kvm/handle_exit.c                       |   16 +-
 arch/arm64/kvm/hyp/hyp-entry.S                     |   20 +-
 arch/arm64/kvm/hyp/switch.c                        |    5 +-
 arch/arm64/lib/clear_user.S                        |    6 +-
 arch/arm64/lib/copy_in_user.S                      |    4 +-
 arch/arm64/mm/context.c                            |   12 +
 arch/arm64/mm/fault.c                              |   34 +-
 arch/arm64/mm/proc.S                               |    7 +-
 arch/parisc/kernel/drivers.c                       |    4 +
 arch/s390/kernel/ipl.c                             |    1 +
 drivers/acpi/nfit/core.c                           |   22 +-
 drivers/block/loop.c                               |   12 +-
 drivers/firmware/psci.c                            |   57 +-
 drivers/gpu/drm/radeon/radeon_object.c             |    3 +-
 drivers/hv/channel_mgmt.c                          |    2 +-
 drivers/hwmon/ina2xx.c                             |    3 +-
 drivers/media/v4l2-core/v4l2-compat-ioctl32.c      |    4 +-
 drivers/net/slip/slhc.c                            |    5 +
 drivers/net/usb/cdc_ether.c                        |    6 +
 drivers/net/usb/lan78xx.c                          |    3 +-
 drivers/net/wireless/realtek/rtl818x/rtl8187/dev.c |    2 +-
 drivers/s390/cio/qdio_main.c                       |   42 +-
 drivers/vhost/vhost.c                              |    8 +-
 fs/namei.c                                         |    3 +-
 include/kvm/arm_psci.h                             |   51 +
 include/linux/arm-smccc.h                          |  165 +-
 include/linux/mm.h                                 |    4 +
 include/linux/psci.h                               |   14 +
 include/net/bluetooth/hci_core.h                   |    2 +-
 include/net/slhc_vj.h                              |    1 +
 include/uapi/linux/psci.h                          |    3 +
 kernel/events/core.c                               |    6 +
 net/bluetooth/hci_conn.c                           |   29 +-
 net/bluetooth/hci_event.c                          |   15 +-
 net/bluetooth/l2cap_core.c                         |    2 +-
 net/rds/send.c                                     |   15 +-
 net/sunrpc/auth_gss/gss_krb5_crypto.c              |    3 -
 tools/perf/tests/code-reading.c                    |   20 +-
 .../perf/util/intel-pt-decoder/intel-pt-decoder.c  |   64 +-
 .../perf/util/intel-pt-decoder/intel-pt-decoder.h  |    2 +-
 tools/perf/util/intel-pt.c                         |   37 +-
 69 files changed, 4423 insertions(+), 320 deletions(-)


* [PATCH 4.9 01/66] media: v4l2-compat-ioctl32: dont oops on overlay
From: Greg Kroah-Hartman @ 2018-04-17 15:58 UTC
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Mauro Carvalho Chehab, Sakari Ailus,
	Hans Verkuil

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mauro Carvalho Chehab <mchehab@s-opensource.com>

commit 85ea29f19eab56ec16ec6b92bc67305998706afa upstream.

At put_v4l2_window32(), the code tries to access kp->clips. However,
kp is a userspace pointer, so the field should be obtained via
get_user(); otherwise it can OOPS:

 vivid-000: ==================  END STATUS  ==================
 BUG: unable to handle kernel paging request at 00000000fffb18e0
 IP: [<ffffffffc05468d9>] __put_v4l2_format32+0x169/0x220 [videodev]
 PGD 3f5776067 PUD 3f576f067 PMD 3f5769067 PTE 800000042548f067
 Oops: 0001 [#1] SMP
 Modules linked in: vivid videobuf2_vmalloc videobuf2_memops v4l2_dv_timings videobuf2_core v4l2_common videodev media xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables bluetooth rfkill binfmt_misc snd_hda_codec_hdmi i915 snd_hda_intel snd_hda_controller snd_hda_codec intel_rapl x86_pkg_temp_thermal snd_hwdep intel_powerclamp snd_pcm coretemp snd_seq_midi kvm_intel kvm snd_seq_midi_event snd_rawmidi i2c_algo_bit drm_kms_helper snd_seq drm crct10dif_pclmul e1000e snd_seq_device crc32_pclmul snd_timer ghash_clmulni_intel snd mei_me mei ptp pps_core soundcore lpc_ich video crc32c_intel [last unloaded: media]
 CPU: 2 PID: 28332 Comm: v4l2-compliance Not tainted 3.18.102+ #107
 Hardware name:                  /NUC5i7RYB, BIOS RYBDWi35.86A.0364.2017.0511.0949 05/11/2017
 task: ffff8804293f8000 ti: ffff8803f5640000 task.ti: ffff8803f5640000
 RIP: 0010:[<ffffffffc05468d9>]  [<ffffffffc05468d9>] __put_v4l2_format32+0x169/0x220 [videodev]
 RSP: 0018:ffff8803f5643e28  EFLAGS: 00010246
 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00000000fffb1ab4
 RDX: 00000000fffb1a68 RSI: 00000000fffb18d8 RDI: 00000000fffb1aa8
 RBP: ffff8803f5643e48 R08: 0000000000000001 R09: ffff8803f54b0378
 R10: 0000000000000000 R11: 0000000000000168 R12: 00000000fffb18c0
 R13: 00000000fffb1a94 R14: 00000000fffb18c8 R15: 0000000000000000
 FS:  0000000000000000(0000) GS:ffff880456d00000(0063) knlGS:00000000f7100980
 CS:  0010 DS: 002b ES: 002b CR0: 0000000080050033
 CR2: 00000000fffb18e0 CR3: 00000003f552b000 CR4: 00000000003407e0
 Stack:
  00000000fffb1a94 00000000c0cc5640 0000000000000056 ffff8804274f3600
  ffff8803f5643ed0 ffffffffc0547e16 0000000000000003 ffff8803f5643eb0
  ffffffff81301460 ffff88009db44b01 ffff880441942520 ffff8800c0d05640
 Call Trace:
  [<ffffffffc0547e16>] v4l2_compat_ioctl32+0x12d6/0x1b1d [videodev]
  [<ffffffff81301460>] ? file_has_perm+0x70/0xc0
  [<ffffffff81252a2c>] compat_SyS_ioctl+0xec/0x1200
  [<ffffffff8173241a>] sysenter_dispatch+0x7/0x21
 Code: 00 00 48 8b 80 48 c0 ff ff 48 83 e8 38 49 39 c6 0f 87 2b ff ff ff 49 8d 45 1c e8 a3 ce e3 c0 85 c0 0f 85 1a ff ff ff 41 8d 40 ff <4d> 8b 64 24 20 41 89 d5 48 8d 44 40 03 4d 8d 34 c4 eb 15 0f 1f
 RIP  [<ffffffffc05468d9>] __put_v4l2_format32+0x169/0x220 [videodev]
 RSP <ffff8803f5643e28>
 CR2: 00000000fffb18e0

Tested with the vivid driver on kernel v3.18.102.

The same bug happens upstream too:

 BUG: KASAN: user-memory-access in __put_v4l2_format32+0x98/0x4d0 [videodev]
 Read of size 8 at addr 00000000ffe48400 by task v4l2-compliance/8713

 CPU: 0 PID: 8713 Comm: v4l2-compliance Not tainted 4.16.0-rc4+ #108
 Hardware name:  /NUC5i7RYB, BIOS RYBDWi35.86A.0364.2017.0511.0949 05/11/2017
 Call Trace:
  dump_stack+0x5c/0x7c
  kasan_report+0x164/0x380
  ? __put_v4l2_format32+0x98/0x4d0 [videodev]
  __put_v4l2_format32+0x98/0x4d0 [videodev]
  v4l2_compat_ioctl32+0x1aec/0x27a0 [videodev]
  ? __fsnotify_inode_delete+0x20/0x20
  ? __put_v4l2_format32+0x4d0/0x4d0 [videodev]
  compat_SyS_ioctl+0x646/0x14d0
  ? do_ioctl+0x30/0x30
  do_fast_syscall_32+0x191/0x3f4
  entry_SYSENTER_compat+0x6b/0x7a
 ==================================================================
 Disabling lock debugging due to kernel taint
 BUG: unable to handle kernel paging request at 00000000ffe48400
 IP: __put_v4l2_format32+0x98/0x4d0 [videodev]
 PGD 3a22fb067 P4D 3a22fb067 PUD 39b6f0067 PMD 39b6f1067 PTE 80000003256af067
 Oops: 0001 [#1] SMP KASAN
 Modules linked in: vivid videobuf2_vmalloc videobuf2_dma_contig videobuf2_memops v4l2_tpg v4l2_dv_timings videobuf2_v4l2 videobuf2_common v4l2_common videodev xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack libcrc32c tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables bluetooth rfkill ecdh_generic binfmt_misc snd_hda_codec_hdmi intel_rapl x86_pkg_temp_thermal intel_powerclamp i915 coretemp snd_hda_intel snd_hda_codec kvm_intel snd_hwdep snd_hda_core kvm snd_pcm irqbypass crct10dif_pclmul crc32_pclmul snd_seq_midi ghash_clmulni_intel snd_seq_midi_event i2c_algo_bit intel_cstate snd_rawmidi intel_uncore snd_seq drm_kms_helper e1000e snd_seq_device snd_timer intel_rapl_perf
  drm ptp snd mei_me mei lpc_ich pps_core soundcore video crc32c_intel
 CPU: 0 PID: 8713 Comm: v4l2-compliance Tainted: G    B            4.16.0-rc4+ #108
 Hardware name:  /NUC5i7RYB, BIOS RYBDWi35.86A.0364.2017.0511.0949 05/11/2017
 RIP: 0010:__put_v4l2_format32+0x98/0x4d0 [videodev]
 RSP: 0018:ffff8803b9be7d30 EFLAGS: 00010282
 RAX: 0000000000000000 RBX: ffff8803ac983e80 RCX: ffffffff8cd929f2
 RDX: 1ffffffff1d0a149 RSI: 0000000000000297 RDI: 0000000000000297
 RBP: 00000000ffe485c0 R08: fffffbfff1cf5123 R09: ffffffff8e7a8948
 R10: 0000000000000001 R11: fffffbfff1cf5122 R12: 00000000ffe483e0
 R13: 00000000ffe485c4 R14: ffff8803ac985918 R15: 00000000ffe483e8
 FS:  0000000000000000(0000) GS:ffff880407400000(0063) knlGS:00000000f7a46980
 CS:  0010 DS: 002b ES: 002b CR0: 0000000080050033
 CR2: 00000000ffe48400 CR3: 00000003a83f2003 CR4: 00000000003606f0
 Call Trace:
  v4l2_compat_ioctl32+0x1aec/0x27a0 [videodev]
  ? __fsnotify_inode_delete+0x20/0x20
  ? __put_v4l2_format32+0x4d0/0x4d0 [videodev]
  compat_SyS_ioctl+0x646/0x14d0
  ? do_ioctl+0x30/0x30
  do_fast_syscall_32+0x191/0x3f4
  entry_SYSENTER_compat+0x6b/0x7a
 Code: 4c 89 f7 4d 8d 7c 24 08 e8 e6 a4 69 cb 48 8b 83 98 1a 00 00 48 83 e8 10 49 39 c7 0f 87 9d 01 00 00 49 8d 7c 24 20 e8 c8 a4 69 cb <4d> 8b 74 24 20 4c 89 ef 4c 89 fe ba 10 00 00 00 e8 23 d9 08 cc
 RIP: __put_v4l2_format32+0x98/0x4d0 [videodev] RSP: ffff8803b9be7d30
 CR2: 00000000ffe48400

cc: stable@vger.kernel.org
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Reviewed-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/media/v4l2-core/v4l2-compat-ioctl32.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/drivers/media/v4l2-core/v4l2-compat-ioctl32.c
+++ b/drivers/media/v4l2-core/v4l2-compat-ioctl32.c
@@ -101,7 +101,7 @@ static int get_v4l2_window32(struct v4l2
 static int put_v4l2_window32(struct v4l2_window __user *kp,
 			     struct v4l2_window32 __user *up)
 {
-	struct v4l2_clip __user *kclips = kp->clips;
+	struct v4l2_clip __user *kclips;
 	struct v4l2_clip32 __user *uclips;
 	compat_caddr_t p;
 	u32 clipcount;
@@ -116,6 +116,8 @@ static int put_v4l2_window32(struct v4l2
 	if (!clipcount)
 		return 0;
 
+	if (get_user(kclips, &kp->clips))
+		return -EFAULT;
 	if (get_user(p, &up->clips))
 		return -EFAULT;
 	uclips = compat_ptr(p);

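For reference, the rule the fix applies, reduced to a minimal sketch (the
function below is hypothetical; only the get_user() pattern matters): a
field of a __user-annotated structure must be fetched through the uaccess
API, never dereferenced directly.

	static int fetch_clips(struct v4l2_window __user *kp,
			       struct v4l2_clip __user **clipsp)
	{
		struct v4l2_clip __user *kclips;

		/* WRONG: kp is a userspace pointer; *kp can fault/oops. */
		/* kclips = kp->clips; */

		/* RIGHT: fetch the pointer field via get_user(). */
		if (get_user(kclips, &kp->clips))
			return -EFAULT;

		/* kclips is itself a userspace pointer, so its contents
		 * must in turn be read with copy_from_user(). */
		*clipsp = kclips;
		return 0;
	}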

* [PATCH 4.9 02/66] parisc: Fix out of array access in match_pci_device()
From: Greg Kroah-Hartman @ 2018-04-17 15:58 UTC
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Helge Deller

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Helge Deller <deller@gmx.de>

commit 615b2665fd20c327b631ff1e79426775de748094 upstream.

As found by the ubsan checker, the value of the 'index' variable can be
out of range for the bc[] array:

UBSAN: Undefined behaviour in arch/parisc/kernel/drivers.c:655:21
index 6 is out of range for type 'char [6]'
Backtrace:
 [<104fa850>] __ubsan_handle_out_of_bounds+0x68/0x80
 [<1019d83c>] check_parent+0xc0/0x170
 [<1019d91c>] descend_children+0x30/0x6c
 [<1059e164>] device_for_each_child+0x60/0x98
 [<1019cd54>] parse_tree_node+0x40/0x54
 [<1019d86c>] check_parent+0xf0/0x170
 [<1019d91c>] descend_children+0x30/0x6c
 [<1059e164>] device_for_each_child+0x60/0x98
 [<1019d938>] descend_children+0x4c/0x6c
 [<1059e164>] device_for_each_child+0x60/0x98
 [<1019cd54>] parse_tree_node+0x40/0x54
 [<1019cffc>] hwpath_to_device+0xa4/0xc4

Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/parisc/kernel/drivers.c |    4 ++++
 1 file changed, 4 insertions(+)

--- a/arch/parisc/kernel/drivers.c
+++ b/arch/parisc/kernel/drivers.c
@@ -648,6 +648,10 @@ static int match_pci_device(struct devic
 					(modpath->mod == PCI_FUNC(devfn)));
 	}
 
+	/* index might be out of bounds for bc[] */
+	if (index >= 6)
+		return 0;
+
 	id = PCI_SLOT(pdev->devfn) | (PCI_FUNC(pdev->devfn) << 5);
 	return (modpath->bc[index] == id);
 }

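The class of bug is easy to reproduce in isolation. A minimal userspace
demonstration (hypothetical, for illustration only) that UBSan flags the
same way as the report above:

	/* gcc -fsanitize=bounds demo.c && ./demo */
	#include <stdio.h>

	static char bc[6];

	int main(int argc, char **argv)
	{
		int index = argc + 5;	/* 6 when run with no arguments */

		(void)argv;
		/* runtime error: index 6 out of range for type 'char [6]' */
		printf("%d\n", bc[index]);
		return 0;
	}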

* [PATCH 4.9 03/66] Drivers: hv: vmbus: do not mark HV_PCIE as perf_device
From: Greg Kroah-Hartman @ 2018-04-17 15:58 UTC
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Dexuan Cui, Stephen Hemminger,
	K. Y. Srinivasan

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Dexuan Cui <decui@microsoft.com>

commit 238064f13d057390a8c5e1a6a80f4f0a0ec46499 upstream.

The pci-hyperv driver's channel callback hv_pci_onchannelcallback() is not
really a hot path, so we don't need to mark it as a perf_device; with this
patch, all HV_PCIE channels' target_cpu will be CPU0.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Cc: stable@vger.kernel.org
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/hv/channel_mgmt.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ -70,7 +70,7 @@ static const struct vmbus_device vmbus_d
 	/* PCIE */
 	{ .dev_type = HV_PCIE,
 	  HV_PCIE_GUID,
-	  .perf_device = true,
+	  .perf_device = false,
 	},
 
 	/* Synthetic Frame Buffer */


* [PATCH 4.9 04/66] perf intel-pt: Fix overlap detection to identify consecutive buffers correctly
From: Greg Kroah-Hartman @ 2018-04-17 15:58 UTC
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Adrian Hunter, Jiri Olsa,
	Arnaldo Carvalho de Melo

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Adrian Hunter <adrian.hunter@intel.com>

commit 117db4b27bf08dba412faf3924ba55fe970c57b8 upstream.

Overlap detection was not updating the buffer's 'consecutive' flag.
Marking buffers consecutive has the advantage that decoding begins from
the start of the buffer instead of the first PSB. Fix overlap detection
to identify consecutive buffers correctly.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1520431349-30689-2-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 tools/perf/util/intel-pt-decoder/intel-pt-decoder.c |   62 +++++++++-----------
 tools/perf/util/intel-pt-decoder/intel-pt-decoder.h |    2 
 tools/perf/util/intel-pt.c                          |    5 +
 3 files changed, 34 insertions(+), 35 deletions(-)

--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
@@ -2182,14 +2182,6 @@ const struct intel_pt_state *intel_pt_de
 	return &decoder->state;
 }
 
-static bool intel_pt_at_psb(unsigned char *buf, size_t len)
-{
-	if (len < INTEL_PT_PSB_LEN)
-		return false;
-	return memmem(buf, INTEL_PT_PSB_LEN, INTEL_PT_PSB_STR,
-		      INTEL_PT_PSB_LEN);
-}
-
 /**
  * intel_pt_next_psb - move buffer pointer to the start of the next PSB packet.
  * @buf: pointer to buffer pointer
@@ -2278,6 +2270,7 @@ static unsigned char *intel_pt_last_psb(
  * @buf: buffer
  * @len: size of buffer
  * @tsc: TSC value returned
+ * @rem: returns remaining size when TSC is found
  *
  * Find a TSC packet in @buf and return the TSC value.  This function assumes
  * that @buf starts at a PSB and that PSB+ will contain TSC and so stops if a
@@ -2285,7 +2278,8 @@ static unsigned char *intel_pt_last_psb(
  *
  * Return: %true if TSC is found, false otherwise.
  */
-static bool intel_pt_next_tsc(unsigned char *buf, size_t len, uint64_t *tsc)
+static bool intel_pt_next_tsc(unsigned char *buf, size_t len, uint64_t *tsc,
+			      size_t *rem)
 {
 	struct intel_pt_pkt packet;
 	int ret;
@@ -2296,6 +2290,7 @@ static bool intel_pt_next_tsc(unsigned c
 			return false;
 		if (packet.type == INTEL_PT_TSC) {
 			*tsc = packet.payload;
+			*rem = len;
 			return true;
 		}
 		if (packet.type == INTEL_PT_PSBEND)
@@ -2346,6 +2341,8 @@ static int intel_pt_tsc_cmp(uint64_t tsc
  * @len_a: size of first buffer
  * @buf_b: second buffer
  * @len_b: size of second buffer
+ * @consecutive: returns true if there is data in buf_b that is consecutive
+ *               to buf_a
  *
  * If the trace contains TSC we can look at the last TSC of @buf_a and the
  * first TSC of @buf_b in order to determine if the buffers overlap, and then
@@ -2358,33 +2355,41 @@ static int intel_pt_tsc_cmp(uint64_t tsc
 static unsigned char *intel_pt_find_overlap_tsc(unsigned char *buf_a,
 						size_t len_a,
 						unsigned char *buf_b,
-						size_t len_b)
+						size_t len_b, bool *consecutive)
 {
 	uint64_t tsc_a, tsc_b;
 	unsigned char *p;
-	size_t len;
+	size_t len, rem_a, rem_b;
 
 	p = intel_pt_last_psb(buf_a, len_a);
 	if (!p)
 		return buf_b; /* No PSB in buf_a => no overlap */
 
 	len = len_a - (p - buf_a);
-	if (!intel_pt_next_tsc(p, len, &tsc_a)) {
+	if (!intel_pt_next_tsc(p, len, &tsc_a, &rem_a)) {
 		/* The last PSB+ in buf_a is incomplete, so go back one more */
 		len_a -= len;
 		p = intel_pt_last_psb(buf_a, len_a);
 		if (!p)
 			return buf_b; /* No full PSB+ => assume no overlap */
 		len = len_a - (p - buf_a);
-		if (!intel_pt_next_tsc(p, len, &tsc_a))
+		if (!intel_pt_next_tsc(p, len, &tsc_a, &rem_a))
 			return buf_b; /* No TSC in buf_a => assume no overlap */
 	}
 
 	while (1) {
 		/* Ignore PSB+ with no TSC */
-		if (intel_pt_next_tsc(buf_b, len_b, &tsc_b) &&
-		    intel_pt_tsc_cmp(tsc_a, tsc_b) < 0)
-			return buf_b; /* tsc_a < tsc_b => no overlap */
+		if (intel_pt_next_tsc(buf_b, len_b, &tsc_b, &rem_b)) {
+			int cmp = intel_pt_tsc_cmp(tsc_a, tsc_b);
+
+			/* Same TSC, so buffers are consecutive */
+			if (!cmp && rem_b >= rem_a) {
+				*consecutive = true;
+				return buf_b + len_b - (rem_b - rem_a);
+			}
+			if (cmp < 0)
+				return buf_b; /* tsc_a < tsc_b => no overlap */
+		}
 
 		if (!intel_pt_step_psb(&buf_b, &len_b))
 			return buf_b + len_b; /* No PSB in buf_b => no data */
@@ -2398,6 +2403,8 @@ static unsigned char *intel_pt_find_over
  * @buf_b: second buffer
  * @len_b: size of second buffer
  * @have_tsc: can use TSC packets to detect overlap
+ * @consecutive: returns true if there is data in buf_b that is consecutive
+ *               to buf_a
  *
  * When trace samples or snapshots are recorded there is the possibility that
  * the data overlaps.  Note that, for the purposes of decoding, data is only
@@ -2408,7 +2415,7 @@ static unsigned char *intel_pt_find_over
  */
 unsigned char *intel_pt_find_overlap(unsigned char *buf_a, size_t len_a,
 				     unsigned char *buf_b, size_t len_b,
-				     bool have_tsc)
+				     bool have_tsc, bool *consecutive)
 {
 	unsigned char *found;
 
@@ -2420,7 +2427,8 @@ unsigned char *intel_pt_find_overlap(uns
 		return buf_b; /* No overlap */
 
 	if (have_tsc) {
-		found = intel_pt_find_overlap_tsc(buf_a, len_a, buf_b, len_b);
+		found = intel_pt_find_overlap_tsc(buf_a, len_a, buf_b, len_b,
+						  consecutive);
 		if (found)
 			return found;
 	}
@@ -2435,28 +2443,16 @@ unsigned char *intel_pt_find_overlap(uns
 	}
 
 	/* Now len_b >= len_a */
-	if (len_b > len_a) {
-		/* The leftover buffer 'b' must start at a PSB */
-		while (!intel_pt_at_psb(buf_b + len_a, len_b - len_a)) {
-			if (!intel_pt_step_psb(&buf_a, &len_a))
-				return buf_b; /* No overlap */
-		}
-	}
-
 	while (1) {
 		/* Potential overlap so check the bytes */
 		found = memmem(buf_a, len_a, buf_b, len_a);
-		if (found)
+		if (found) {
+			*consecutive = true;
 			return buf_b + len_a;
+		}
 
 		/* Try again at next PSB in buffer 'a' */
 		if (!intel_pt_step_psb(&buf_a, &len_a))
 			return buf_b; /* No overlap */
-
-		/* The leftover buffer 'b' must start at a PSB */
-		while (!intel_pt_at_psb(buf_b + len_a, len_b - len_a)) {
-			if (!intel_pt_step_psb(&buf_a, &len_a))
-				return buf_b; /* No overlap */
-		}
 	}
 }
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h
@@ -103,7 +103,7 @@ const struct intel_pt_state *intel_pt_de
 
 unsigned char *intel_pt_find_overlap(unsigned char *buf_a, size_t len_a,
 				     unsigned char *buf_b, size_t len_b,
-				     bool have_tsc);
+				     bool have_tsc, bool *consecutive);
 
 int intel_pt__strerror(int code, char *buf, size_t buflen);
 
--- a/tools/perf/util/intel-pt.c
+++ b/tools/perf/util/intel-pt.c
@@ -194,14 +194,17 @@ static void intel_pt_dump_event(struct i
 static int intel_pt_do_fix_overlap(struct intel_pt *pt, struct auxtrace_buffer *a,
 				   struct auxtrace_buffer *b)
 {
+	bool consecutive = false;
 	void *start;
 
 	start = intel_pt_find_overlap(a->data, a->size, b->data, b->size,
-				      pt->have_tsc);
+				      pt->have_tsc, &consecutive);
 	if (!start)
 		return -EINVAL;
 	b->use_size = b->data + b->size - start;
 	b->use_data = start;
+	if (b->use_size && consecutive)
+		b->consecutive = true;
 	return 0;
 }
 


* [PATCH 4.9 05/66] perf intel-pt: Fix sync_switch
From: Greg Kroah-Hartman @ 2018-04-17 15:58 UTC
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Adrian Hunter, Jiri Olsa,
	Arnaldo Carvalho de Melo

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Adrian Hunter <adrian.hunter@intel.com>

commit 63d8e38f6ae6c36dd5b5ba0e8c112e8861532ea2 upstream.

sync_switch is a facility to synchronize decoding more closely with the
point in the kernel when the context actually switched.

The flag indicating whether sync_switch is enabled was global to the
decoding, whereas it is really specific to the CPU.

The trace data for different CPUs is put on different queues, so add
sync_switch to the intel_pt_queue structure and use that in preference
to the global setting in the intel_pt structure.

That fixes cases where decoding one CPU's trace went wrong because
sync_switch had been disabled on a different CPU's queue.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1520431349-30689-3-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 tools/perf/util/intel-pt.c |   32 +++++++++++++++++++++++++-------
 1 file changed, 25 insertions(+), 7 deletions(-)

--- a/tools/perf/util/intel-pt.c
+++ b/tools/perf/util/intel-pt.c
@@ -131,6 +131,7 @@ struct intel_pt_queue {
 	bool stop;
 	bool step_through_buffers;
 	bool use_buffer_pid_tid;
+	bool sync_switch;
 	pid_t pid, tid;
 	int cpu;
 	int switch_state;
@@ -931,10 +932,12 @@ static int intel_pt_setup_queue(struct i
 			if (pt->timeless_decoding || !pt->have_sched_switch)
 				ptq->use_buffer_pid_tid = true;
 		}
+
+		ptq->sync_switch = pt->sync_switch;
 	}
 
 	if (!ptq->on_heap &&
-	    (!pt->sync_switch ||
+	    (!ptq->sync_switch ||
 	     ptq->switch_state != INTEL_PT_SS_EXPECTING_SWITCH_EVENT)) {
 		const struct intel_pt_state *state;
 		int ret;
@@ -1336,7 +1339,7 @@ static int intel_pt_sample(struct intel_
 	if (pt->synth_opts.last_branch)
 		intel_pt_update_last_branch_rb(ptq);
 
-	if (!pt->sync_switch)
+	if (!ptq->sync_switch)
 		return 0;
 
 	if (intel_pt_is_switch_ip(ptq, state->to_ip)) {
@@ -1417,6 +1420,21 @@ static u64 intel_pt_switch_ip(struct int
 	return switch_ip;
 }
 
+static void intel_pt_enable_sync_switch(struct intel_pt *pt)
+{
+	unsigned int i;
+
+	pt->sync_switch = true;
+
+	for (i = 0; i < pt->queues.nr_queues; i++) {
+		struct auxtrace_queue *queue = &pt->queues.queue_array[i];
+		struct intel_pt_queue *ptq = queue->priv;
+
+		if (ptq)
+			ptq->sync_switch = true;
+	}
+}
+
 static int intel_pt_run_decoder(struct intel_pt_queue *ptq, u64 *timestamp)
 {
 	const struct intel_pt_state *state = ptq->state;
@@ -1433,7 +1451,7 @@ static int intel_pt_run_decoder(struct i
 			if (pt->switch_ip) {
 				intel_pt_log("switch_ip: %"PRIx64" ptss_ip: %"PRIx64"\n",
 					     pt->switch_ip, pt->ptss_ip);
-				pt->sync_switch = true;
+				intel_pt_enable_sync_switch(pt);
 			}
 		}
 	}
@@ -1449,9 +1467,9 @@ static int intel_pt_run_decoder(struct i
 		if (state->err) {
 			if (state->err == INTEL_PT_ERR_NODATA)
 				return 1;
-			if (pt->sync_switch &&
+			if (ptq->sync_switch &&
 			    state->from_ip >= pt->kernel_start) {
-				pt->sync_switch = false;
+				ptq->sync_switch = false;
 				intel_pt_next_tid(pt, ptq);
 			}
 			if (pt->synth_opts.errors) {
@@ -1477,7 +1495,7 @@ static int intel_pt_run_decoder(struct i
 				     state->timestamp, state->est_timestamp);
 			ptq->timestamp = state->est_timestamp;
 		/* Use estimated TSC in unknown switch state */
-		} else if (pt->sync_switch &&
+		} else if (ptq->sync_switch &&
 			   ptq->switch_state == INTEL_PT_SS_UNKNOWN &&
 			   intel_pt_is_switch_ip(ptq, state->to_ip) &&
 			   ptq->next_tid == -1) {
@@ -1624,7 +1642,7 @@ static int intel_pt_sync_switch(struct i
 		return 1;
 
 	ptq = intel_pt_cpu_to_ptq(pt, cpu);
-	if (!ptq)
+	if (!ptq || !ptq->sync_switch)
 		return 1;
 
 	switch (ptq->switch_state) {


* [PATCH 4.9 06/66] perf intel-pt: Fix error recovery from missing TIP packet
From: Greg Kroah-Hartman @ 2018-04-17 15:58 UTC
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Adrian Hunter, Jiri Olsa,
	Arnaldo Carvalho de Melo

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Adrian Hunter <adrian.hunter@intel.com>

commit 1c196a6c771c47a2faa63d38d913e03284f73a16 upstream.

When a TIP packet is expected but a different packet arrives, it is an
error. However, the unexpected packet might be something important, like a
TSC packet, so after the error it is necessary to continue decoding from
that packet rather than from the next one. That is achieved by setting
pkt_step to zero.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1520431349-30689-4-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 tools/perf/util/intel-pt-decoder/intel-pt-decoder.c |    1 +
 1 file changed, 1 insertion(+)

--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
@@ -1522,6 +1522,7 @@ static int intel_pt_walk_fup_tip(struct
 		case INTEL_PT_PSBEND:
 			intel_pt_log("ERROR: Missing TIP after FUP\n");
 			decoder->pkt_state = INTEL_PT_STATE_ERR3;
+			decoder->pkt_step = 0;
 			return -ENOENT;
 
 		case INTEL_PT_OVF:


* [PATCH 4.9 07/66] perf intel-pt: Fix timestamp following overflow
From: Greg Kroah-Hartman @ 2018-04-17 15:58 UTC
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Adrian Hunter, Jiri Olsa,
	Arnaldo Carvalho de Melo

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Adrian Hunter <adrian.hunter@intel.com>

commit 91d29b288aed3406caf7c454bf2b898c96cfd177 upstream.

timestamp_insn_cnt is used to estimate the timestamp based on the number of
instructions since the last known timestamp.

If the estimate is not accurate enough, decoding might not be correctly
synchronized with side-band events, causing more trace errors.

However, there are always timestamps following an overflow, so the
estimate is not needed there and can indeed result in more errors.

Suppress the estimate by setting timestamp_insn_cnt to zero.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1520431349-30689-5-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 tools/perf/util/intel-pt-decoder/intel-pt-decoder.c |    1 +
 1 file changed, 1 insertion(+)

--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
@@ -1300,6 +1300,7 @@ static int intel_pt_overflow(struct inte
 	intel_pt_clear_tx_flags(decoder);
 	decoder->have_tma = false;
 	decoder->cbr = 0;
+	decoder->timestamp_insn_cnt = 0;
 	decoder->pkt_state = INTEL_PT_STATE_ERR_RESYNC;
 	decoder->overflow = true;
 	return -EOVERFLOW;


* [PATCH 4.9 08/66] perf/core: Fix use-after-free in uprobe_perf_close()
From: Greg Kroah-Hartman @ 2018-04-17 15:58 UTC
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Prashant Bhole, Oleg Nesterov,
	Peter Zijlstra (Intel),
	stable, Alexander Shishkin, Arnaldo Carvalho de Melo, Jiri Olsa,
	Linus Torvalds, Namhyung Kim, Thomas Gleixner, Ingo Molnar

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>

commit 621b6d2ea297d0fb6030452c5bcd221f12165fcf upstream.

A use-after-free bug was caught by KASAN while running usdt-related
code (BCC project, bcc/tests/python/test_usdt2.py):

	==================================================================
	BUG: KASAN: use-after-free in uprobe_perf_close+0x222/0x3b0
	Read of size 4 at addr ffff880384f9b4a4 by task test_usdt2.py/870

	CPU: 4 PID: 870 Comm: test_usdt2.py Tainted: G        W         4.16.0-next-20180409 #215
	Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
	Call Trace:
	 dump_stack+0xc7/0x15b
	 ? show_regs_print_info+0x5/0x5
	 ? printk+0x9c/0xc3
	 ? kmsg_dump_rewind_nolock+0x6e/0x6e
	 ? uprobe_perf_close+0x222/0x3b0
	 print_address_description+0x83/0x3a0
	 ? uprobe_perf_close+0x222/0x3b0
	 kasan_report+0x1dd/0x460
	 ? uprobe_perf_close+0x222/0x3b0
	 uprobe_perf_close+0x222/0x3b0
	 ? probes_open+0x180/0x180
	 ? free_filters_list+0x290/0x290
	 trace_uprobe_register+0x1bb/0x500
	 ? perf_event_attach_bpf_prog+0x310/0x310
	 ? probe_event_disable+0x4e0/0x4e0
	 perf_uprobe_destroy+0x63/0xd0
	 _free_event+0x2bc/0xbd0
	 ? lockdep_rcu_suspicious+0x100/0x100
	 ? ring_buffer_attach+0x550/0x550
	 ? kvm_sched_clock_read+0x1a/0x30
	 ? perf_event_release_kernel+0x3e4/0xc00
	 ? __mutex_unlock_slowpath+0x12e/0x540
	 ? wait_for_completion+0x430/0x430
	 ? lock_downgrade+0x3c0/0x3c0
	 ? lock_release+0x980/0x980
	 ? do_raw_spin_trylock+0x118/0x150
	 ? do_raw_spin_unlock+0x121/0x210
	 ? do_raw_spin_trylock+0x150/0x150
	 perf_event_release_kernel+0x5d4/0xc00
	 ? put_event+0x30/0x30
	 ? fsnotify+0xd2d/0xea0
	 ? sched_clock_cpu+0x18/0x1a0
	 ? __fsnotify_update_child_dentry_flags.part.0+0x1b0/0x1b0
	 ? pvclock_clocksource_read+0x152/0x2b0
	 ? pvclock_read_flags+0x80/0x80
	 ? kvm_sched_clock_read+0x1a/0x30
	 ? sched_clock_cpu+0x18/0x1a0
	 ? pvclock_clocksource_read+0x152/0x2b0
	 ? locks_remove_file+0xec/0x470
	 ? pvclock_read_flags+0x80/0x80
	 ? fcntl_setlk+0x880/0x880
	 ? ima_file_free+0x8d/0x390
	 ? lockdep_rcu_suspicious+0x100/0x100
	 ? ima_file_check+0x110/0x110
	 ? fsnotify+0xea0/0xea0
	 ? kvm_sched_clock_read+0x1a/0x30
	 ? rcu_note_context_switch+0x600/0x600
	 perf_release+0x21/0x40
	 __fput+0x264/0x620
	 ? fput+0xf0/0xf0
	 ? do_raw_spin_unlock+0x121/0x210
	 ? do_raw_spin_trylock+0x150/0x150
	 ? SyS_fchdir+0x100/0x100
	 ? fsnotify+0xea0/0xea0
	 task_work_run+0x14b/0x1e0
	 ? task_work_cancel+0x1c0/0x1c0
	 ? copy_fd_bitmaps+0x150/0x150
	 ? vfs_read+0xe5/0x260
	 exit_to_usermode_loop+0x17b/0x1b0
	 ? trace_event_raw_event_sys_exit+0x1a0/0x1a0
	 do_syscall_64+0x3f6/0x490
	 ? syscall_return_slowpath+0x2c0/0x2c0
	 ? lockdep_sys_exit+0x1f/0xaa
	 ? syscall_return_slowpath+0x1a3/0x2c0
	 ? lockdep_sys_exit+0x1f/0xaa
	 ? prepare_exit_to_usermode+0x11c/0x1e0
	 ? enter_from_user_mode+0x30/0x30
	random: crng init done
	 ? __put_user_4+0x1c/0x30
	 entry_SYSCALL_64_after_hwframe+0x3d/0xa2
	RIP: 0033:0x7f41d95f9340
	RSP: 002b:00007fffe71e4268 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
	RAX: 0000000000000000 RBX: 000000000000000d RCX: 00007f41d95f9340
	RDX: 0000000000000000 RSI: 0000000000002401 RDI: 000000000000000d
	RBP: 0000000000000000 R08: 00007f41ca8ff700 R09: 00007f41d996dd1f
	R10: 00007fffe71e41e0 R11: 0000000000000246 R12: 00007fffe71e4330
	R13: 0000000000000000 R14: fffffffffffffffc R15: 00007fffe71e4290

	Allocated by task 870:
	 kasan_kmalloc+0xa0/0xd0
	 kmem_cache_alloc_node+0x11a/0x430
	 copy_process.part.19+0x11a0/0x41c0
	 _do_fork+0x1be/0xa20
	 do_syscall_64+0x198/0x490
	 entry_SYSCALL_64_after_hwframe+0x3d/0xa2

	Freed by task 0:
	 __kasan_slab_free+0x12e/0x180
	 kmem_cache_free+0x102/0x4d0
	 free_task+0xfe/0x160
	 __put_task_struct+0x189/0x290
	 delayed_put_task_struct+0x119/0x250
	 rcu_process_callbacks+0xa6c/0x1b60
	 __do_softirq+0x238/0x7ae

	The buggy address belongs to the object at ffff880384f9b480
	 which belongs to the cache task_struct of size 12928

It occurs because the task_struct is freed before the perf_event that
refers to it, and the task's flags are checked during teardown of the
event. perf_event_alloc() assigns the task_struct to the perf_event's
hw.target, but takes no reference on it.

As a fix, call get_task_struct() in perf_event_alloc() at the
above-mentioned assignment, and put_task_struct() in _free_event().

Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: <stable@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 63b6da39bb38e8f1a1ef3180d32a39d6 ("perf: Fix perf_event_exit_task() race")
Link: http://lkml.kernel.org/r/20180409100346.6416-1-bhole_prashant_q7@lab.ntt.co.jp
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 kernel/events/core.c |    6 ++++++
 1 file changed, 6 insertions(+)

--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -4091,6 +4091,9 @@ static void _free_event(struct perf_even
 	if (event->ctx)
 		put_ctx(event->ctx);
 
+	if (event->hw.target)
+		put_task_struct(event->hw.target);
+
 	exclusive_event_destroy(event);
 	module_put(event->pmu->module);
 
@@ -9214,6 +9217,7 @@ perf_event_alloc(struct perf_event_attr
 		 * and we cannot use the ctx information because we need the
 		 * pmu before we get a ctx.
 		 */
+		get_task_struct(task);
 		event->hw.target = task;
 	}
 
@@ -9331,6 +9335,8 @@ err_ns:
 		perf_detach_cgroup(event);
 	if (event->ns)
 		put_pid_ns(event->ns);
+	if (event->hw.target)
+		put_task_struct(event->hw.target);
 	kfree(event);
 
 	return ERR_PTR(err);

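The underlying rule is generic: whenever an object stores a long-lived
pointer to a reference-counted structure, it must take its own reference
at the assignment and drop it on teardown. A standalone sketch of that
pattern (hypothetical types; not the kernel's task_struct/perf_event API):

	#include <stdatomic.h>
	#include <stdlib.h>

	struct task {
		atomic_int refcount;		/* starts at 1 for the creator */
	};

	struct event {
		struct task *target;
	};

	static void get_task(struct task *t)
	{
		atomic_fetch_add(&t->refcount, 1);
	}

	static void put_task(struct task *t)
	{
		if (atomic_fetch_sub(&t->refcount, 1) == 1)
			free(t);		/* last reference gone */
	}

	static struct event *event_alloc(struct task *t)
	{
		struct event *e = malloc(sizeof(*e));

		if (!e)
			return NULL;
		get_task(t);			/* pin the task... */
		e->target = t;			/* ...while we hold a pointer to it */
		return e;
	}

	static void event_free(struct event *e)
	{
		put_task(e->target);		/* target cannot have been freed */
		free(e);
	}

	int main(void)
	{
		struct task *t = malloc(sizeof(*t));
		struct event *e;

		if (!t)
			return 1;
		atomic_init(&t->refcount, 1);
		e = event_alloc(t);
		put_task(t);			/* creator drops its reference */
		event_free(e);			/* event's reference kept t alive until here */
		return 0;
	}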

* [PATCH 4.9 09/66] radeon: hide pointless #warning when compile testing
From: Greg Kroah-Hartman @ 2018-04-17 15:58 UTC
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Arnd Bergmann, Michel Dänzer,
	Alex Deucher

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Arnd Bergmann <arnd@arndb.de>

commit c02216acf4177c4411d33735c81cad687790fa59 upstream.

In randconfig testing, we sometimes get this warning:

drivers/gpu/drm/radeon/radeon_object.c: In function 'radeon_bo_create':
drivers/gpu/drm/radeon/radeon_object.c:242:2: error: #warning Please enable CONFIG_MTRR and CONFIG_X86_PAT for better performance thanks to write-combining [-Werror=cpp]
 #warning Please enable CONFIG_MTRR and CONFIG_X86_PAT for better performance \

This is rather annoying since almost all other code produces no build-time
output unless we have found a real bug. We already fixed this in the
amdgpu driver in commit 31bb90f1cd08 ("drm/amdgpu: shut up #warning for
compile testing") by adding a CONFIG_COMPILE_TEST check last year and
agreed to do the same here, but both Michel and I then forgot about it
until I came across the issue again now.

This should also go to stable kernels, as it is one of the very few
remaining randconfig warnings in 4.14.

Cc: stable@vger.kernel.org
Link: https://patchwork.kernel.org/patch/9550009/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/gpu/drm/radeon/radeon_object.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/drivers/gpu/drm/radeon/radeon_object.c
+++ b/drivers/gpu/drm/radeon/radeon_object.c
@@ -238,9 +238,10 @@ int radeon_bo_create(struct radeon_devic
 	 * may be slow
 	 * See https://bugs.freedesktop.org/show_bug.cgi?id=88758
 	 */
-
+#ifndef CONFIG_COMPILE_TEST
 #warning Please enable CONFIG_MTRR and CONFIG_X86_PAT for better performance \
 	 thanks to write-combining
+#endif
 
 	if (bo->flags & RADEON_GEM_GTT_WC)
 		DRM_INFO_ONCE("Please enable CONFIG_MTRR and CONFIG_X86_PAT for "


* [PATCH 4.9 10/66] arm64: barrier: Add CSDB macros to control data-value prediction
From: Greg Kroah-Hartman @ 2018-04-17 15:58 UTC
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Mark Rutland, Will Deacon, Catalin Marinas,
	Greg Hackmann

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mark Rutland <mark.rutland@arm.com>


From: Will Deacon <will.deacon@arm.com>

commit 669474e772b952b14f4de4845a1558fd4c0414a4 upstream.

For CPUs capable of data value prediction, CSDB waits for any outstanding
predictions to architecturally resolve before allowing speculative execution
to continue. Provide macros to expose it to the arch code.

Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com> [v4.9 backport]
Tested-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/arm64/include/asm/assembler.h |    7 +++++++
 arch/arm64/include/asm/barrier.h   |    2 ++
 2 files changed, 9 insertions(+)

--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -87,6 +87,13 @@
 	.endm
 
 /*
+ * Value prediction barrier
+ */
+	.macro	csdb
+	hint	#20
+	.endm
+
+/*
  * NOP sequence
  */
 	.macro	nops, num
--- a/arch/arm64/include/asm/barrier.h
+++ b/arch/arm64/include/asm/barrier.h
@@ -31,6 +31,8 @@
 #define dmb(opt)	asm volatile("dmb " #opt : : : "memory")
 #define dsb(opt)	asm volatile("dsb " #opt : : : "memory")
 
+#define csdb()		asm volatile("hint #20" : : : "memory")
+
 #define mb()		dsb(sy)
 #define rmb()		dsb(ld)
 #define wmb()		dsb(st)


* [PATCH 4.9 11/66] arm64: Implement array_index_mask_nospec()
From: Greg Kroah-Hartman @ 2018-04-17 15:58 UTC
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Mark Rutland, Robin Murphy, Will Deacon,
	Catalin Marinas, Greg Hackmann

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mark Rutland <mark.rutland@arm.com>


From: Robin Murphy <robin.murphy@arm.com>

commit 022620eed3d0bc4bf2027326f599f5ad71c2ea3f upstream.

Provide an optimised, assembly implementation of array_index_mask_nospec()
for arm64 so that the compiler is not in a position to transform the code
in ways which affect its ability to inhibit speculation (e.g. by introducing
conditional branches).

This is similar to the sequence used by x86, modulo architectural differences
in the carry/borrow flags.

Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com> [v4.9 backport]
Tested-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/arm64/include/asm/barrier.h |   21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

--- a/arch/arm64/include/asm/barrier.h
+++ b/arch/arm64/include/asm/barrier.h
@@ -40,6 +40,27 @@
 #define dma_rmb()	dmb(oshld)
 #define dma_wmb()	dmb(oshst)
 
+/*
+ * Generate a mask for array_index__nospec() that is ~0UL when 0 <= idx < sz
+ * and 0 otherwise.
+ */
+#define array_index_mask_nospec array_index_mask_nospec
+static inline unsigned long array_index_mask_nospec(unsigned long idx,
+						    unsigned long sz)
+{
+	unsigned long mask;
+
+	asm volatile(
+	"	cmp	%1, %2\n"
+	"	sbc	%0, xzr, xzr\n"
+	: "=r" (mask)
+	: "r" (idx), "Ir" (sz)
+	: "cc");
+
+	csdb();
+	return mask;
+}
+
 #define __smp_mb()	dmb(ish)
 #define __smp_rmb()	dmb(ishld)
 #define __smp_wmb()	dmb(ishst)

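The effect of the cmp/sbc sequence can be modelled in plain C (minus the
csdb() barrier, which has no C equivalent). A hypothetical userspace
sketch of the mask computation and its intended use:

	#include <stdio.h>

	/* ~0UL when idx < sz, 0 otherwise.  Assumes arithmetic right
	 * shift and idx within LONG_MAX -- fine as a model; the asm
	 * version uses the full 64-bit carry flag instead. */
	static unsigned long mask_nospec(unsigned long idx, unsigned long sz)
	{
		return (unsigned long)((long)(idx - sz) >> (sizeof(long) * 8 - 1));
	}

	int main(void)
	{
		int array[8] = { 1, 2, 3, 4, 5, 6, 7, 8 };
		unsigned long idx = 9;		/* attacker-influenced, out of range */

		idx &= mask_nospec(idx, 8);	/* forced to 0 when out of bounds */
		printf("%d\n", array[idx]);	/* cannot read past the array */
		return 0;
	}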

* [PATCH 4.9 12/66] arm64: move TASK_* definitions to <asm/processor.h>
From: Greg Kroah-Hartman @ 2018-04-17 15:58 UTC
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Will Deacon, Laura Abbott, Ard Biesheuvel,
	Catalin Marinas, James Morse, Mark Rutland, Yury Norov,
	Greg Hackmann

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mark Rutland <mark.rutland@arm.com>


From: Yury Norov <ynorov@caviumnetworks.com>

commit eef94a3d09aab437c8c254de942d8b1aa76455e2 upstream.

The ILP32 series [1] introduces a dependency on <asm/is_compat.h> for
the TASK_SIZE macro. That header in turn requires <asm/thread_info.h>, and
<asm/thread_info.h> includes <asm/memory.h>, giving a circular dependency,
because TASK_SIZE is currently located in <asm/memory.h>.

On other architectures, TASK_SIZE is defined in <asm/processor.h>, and
moving TASK_SIZE there fixes the problem.

Discussion: https://patchwork.kernel.org/patch/9929107/

[1] https://github.com/norov/linux/tree/ilp32-next
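
Spelled out, the cycle looks like this (each arrow reads "needs"):

	<asm/memory.h>           /* currently defines TASK_SIZE */
	  -> <asm/is_compat.h>   /* wanted by the ILP32 TASK_SIZE */
	  -> <asm/thread_info.h>
	  -> <asm/memory.h>      /* back where we started */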

CC: Will Deacon <will.deacon@arm.com>
CC: Laura Abbott <labbott@redhat.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
[v4.9: necessary for making USER_DS an inclusive limit]
Signed-off-by: Mark Rutland <mark.rutland@arm.com> [v4.9 backport]
Tested-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/arm64/include/asm/memory.h    |   15 ---------------
 arch/arm64/include/asm/processor.h |   21 +++++++++++++++++++++
 arch/arm64/kernel/entry.S          |    1 +
 3 files changed, 22 insertions(+), 15 deletions(-)

--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -60,8 +60,6 @@
  * KIMAGE_VADDR - the virtual address of the start of the kernel image
  * VA_BITS - the maximum number of bits for virtual addresses.
  * VA_START - the first kernel virtual address.
- * TASK_SIZE - the maximum size of a user space task.
- * TASK_UNMAPPED_BASE - the lower boundary of the mmap VM area.
  */
 #define VA_BITS			(CONFIG_ARM64_VA_BITS)
 #define VA_START		(UL(0xffffffffffffffff) - \
@@ -76,19 +74,6 @@
 #define PCI_IO_END		(VMEMMAP_START - SZ_2M)
 #define PCI_IO_START		(PCI_IO_END - PCI_IO_SIZE)
 #define FIXADDR_TOP		(PCI_IO_START - SZ_2M)
-#define TASK_SIZE_64		(UL(1) << VA_BITS)
-
-#ifdef CONFIG_COMPAT
-#define TASK_SIZE_32		UL(0x100000000)
-#define TASK_SIZE		(test_thread_flag(TIF_32BIT) ? \
-				TASK_SIZE_32 : TASK_SIZE_64)
-#define TASK_SIZE_OF(tsk)	(test_tsk_thread_flag(tsk, TIF_32BIT) ? \
-				TASK_SIZE_32 : TASK_SIZE_64)
-#else
-#define TASK_SIZE		TASK_SIZE_64
-#endif /* CONFIG_COMPAT */
-
-#define TASK_UNMAPPED_BASE	(PAGE_ALIGN(TASK_SIZE / 4))
 
 #define KERNEL_START      _text
 #define KERNEL_END        _end
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -19,6 +19,10 @@
 #ifndef __ASM_PROCESSOR_H
 #define __ASM_PROCESSOR_H
 
+#define TASK_SIZE_64		(UL(1) << VA_BITS)
+
+#ifndef __ASSEMBLY__
+
 /*
  * Default implementation of macro that returns current
  * instruction pointer ("program counter").
@@ -37,6 +41,22 @@
 #include <asm/ptrace.h>
 #include <asm/types.h>
 
+/*
+ * TASK_SIZE - the maximum size of a user space task.
+ * TASK_UNMAPPED_BASE - the lower boundary of the mmap VM area.
+ */
+#ifdef CONFIG_COMPAT
+#define TASK_SIZE_32		UL(0x100000000)
+#define TASK_SIZE		(test_thread_flag(TIF_32BIT) ? \
+				TASK_SIZE_32 : TASK_SIZE_64)
+#define TASK_SIZE_OF(tsk)	(test_tsk_thread_flag(tsk, TIF_32BIT) ? \
+				TASK_SIZE_32 : TASK_SIZE_64)
+#else
+#define TASK_SIZE		TASK_SIZE_64
+#endif /* CONFIG_COMPAT */
+
+#define TASK_UNMAPPED_BASE	(PAGE_ALIGN(TASK_SIZE / 4))
+
 #define STACK_TOP_MAX		TASK_SIZE_64
 #ifdef CONFIG_COMPAT
 #define AARCH32_VECTORS_BASE	0xffff0000
@@ -192,4 +212,5 @@ int cpu_enable_pan(void *__unused);
 int cpu_enable_uao(void *__unused);
 int cpu_enable_cache_maint_trap(void *__unused);
 
+#endif /* __ASSEMBLY__ */
 #endif /* __ASM_PROCESSOR_H */
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -30,6 +30,7 @@
 #include <asm/irq.h>
 #include <asm/memory.h>
 #include <asm/mmu.h>
+#include <asm/processor.h>
 #include <asm/thread_info.h>
 #include <asm/asm-uaccess.h>
 #include <asm/unistd.h>

* [PATCH 4.9 13/66] arm64: Make USER_DS an inclusive limit
  2018-04-17 15:58 [PATCH 4.9 00/66] 4.9.95-stable review Greg Kroah-Hartman
                   ` (11 preceding siblings ...)
  2018-04-17 15:58 ` [PATCH 4.9 12/66] arm64: move TASK_* definitions to <asm/processor.h> Greg Kroah-Hartman
@ 2018-04-17 15:58 ` Greg Kroah-Hartman
  2018-04-17 15:58 ` [PATCH 4.9 14/66] arm64: Use pointer masking to limit uaccess speculation Greg Kroah-Hartman
                   ` (56 subsequent siblings)
  69 siblings, 0 replies; 82+ messages in thread
From: Greg Kroah-Hartman @ 2018-04-17 15:58 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Robin Murphy, Will Deacon, Catalin Marinas,
	Greg Hackmann, Mark Rutland

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mark Rutland <mark.rutland@arm.com>


From: Robin Murphy <robin.murphy@arm.com>

commit 51369e398d0d33e8f524314e672b07e8cf870e79 upstream.

Currently, USER_DS represents an exclusive limit while KERNEL_DS is
inclusive. In order to do some clever trickery for speculation-safe
masking, we need them both to behave equivalently - there aren't enough
bits to make KERNEL_DS exclusive, so we have precisely one option. This
also happens to correct a longstanding false negative for a range
ending on the very top byte of kernel memory.

Mark Rutland points out that we've actually got the semantics of
addresses vs. segments muddled up in most of the places we need to
amend, so shuffle the {USER,KERNEL}_DS definitions around such that we
can correct those properly instead of just pasting "-1"s everywhere.
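
Expressed in plain C, the new inclusive check amounts to the following
(a sketch for cross-checking the asm below, not kernel code; range_ok()
is a hypothetical name):

	/* (u65)addr + (u65)size <= (u65)limit + 1, tracked via carry */
	static int range_ok(unsigned long addr, unsigned long size,
			    unsigned long limit)
	{
		unsigned long sum = addr + size;

		if (sum < addr)		/* the 64-bit addition carried */
			return sum == 0 && limit == ~0UL;
		return sum == 0 || sum - 1 <= limit;	/* sum <= limit + 1 */
	}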

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
[v4.9: avoid dependence on TTBR0 SW PAN and THREAD_INFO_IN_TASK_STRUCT]
Signed-off-by: Mark Rutland <mark.rutland@arm.com> [v4.9 backport]
Tested-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/arm64/include/asm/processor.h |    3 ++
 arch/arm64/include/asm/uaccess.h   |   46 +++++++++++++++++++++----------------
 arch/arm64/kernel/entry.S          |    4 +--
 arch/arm64/mm/fault.c              |    2 -
 4 files changed, 33 insertions(+), 22 deletions(-)

--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -21,6 +21,9 @@
 
 #define TASK_SIZE_64		(UL(1) << VA_BITS)
 
+#define KERNEL_DS	UL(-1)
+#define USER_DS		(TASK_SIZE_64 - 1)
+
 #ifndef __ASSEMBLY__
 
 /*
--- a/arch/arm64/include/asm/uaccess.h
+++ b/arch/arm64/include/asm/uaccess.h
@@ -28,6 +28,7 @@
 
 #include <asm/alternative.h>
 #include <asm/cpufeature.h>
+#include <asm/processor.h>
 #include <asm/ptrace.h>
 #include <asm/sysreg.h>
 #include <asm/errno.h>
@@ -59,10 +60,7 @@ struct exception_table_entry
 
 extern int fixup_exception(struct pt_regs *regs);
 
-#define KERNEL_DS	(-1UL)
 #define get_ds()	(KERNEL_DS)
-
-#define USER_DS		TASK_SIZE_64
 #define get_fs()	(current_thread_info()->addr_limit)
 
 static inline void set_fs(mm_segment_t fs)
@@ -87,22 +85,32 @@ static inline void set_fs(mm_segment_t f
  * Returns 1 if the range is valid, 0 otherwise.
  *
  * This is equivalent to the following test:
- * (u65)addr + (u65)size <= current->addr_limit
- *
- * This needs 65-bit arithmetic.
+ * (u65)addr + (u65)size <= (u65)current->addr_limit + 1
  */
-#define __range_ok(addr, size)						\
-({									\
-	unsigned long __addr = (unsigned long __force)(addr);		\
-	unsigned long flag, roksum;					\
-	__chk_user_ptr(addr);						\
-	asm("adds %1, %1, %3; ccmp %1, %4, #2, cc; cset %0, ls"		\
-		: "=&r" (flag), "=&r" (roksum)				\
-		: "1" (__addr), "Ir" (size),				\
-		  "r" (current_thread_info()->addr_limit)		\
-		: "cc");						\
-	flag;								\
-})
+static inline unsigned long __range_ok(unsigned long addr, unsigned long size)
+{
+	unsigned long limit = current_thread_info()->addr_limit;
+
+	__chk_user_ptr(addr);
+	asm volatile(
+	// A + B <= C + 1 for all A,B,C, in four easy steps:
+	// 1: X = A + B; X' = X % 2^64
+	"	adds	%0, %0, %2\n"
+	// 2: Set C = 0 if X > 2^64, to guarantee X' > C in step 4
+	"	csel	%1, xzr, %1, hi\n"
+	// 3: Set X' = ~0 if X >= 2^64. For X == 2^64, this decrements X'
+	//    to compensate for the carry flag being set in step 4. For
+	//    X > 2^64, X' merely has to remain nonzero, which it does.
+	"	csinv	%0, %0, xzr, cc\n"
+	// 4: For X < 2^64, this gives us X' - C - 1 <= 0, where the -1
+	//    comes from the carry in being clear. Otherwise, we are
+	//    testing X' - C == 0, subject to the previous adjustments.
+	"	sbcs	xzr, %0, %1\n"
+	"	cset	%0, ls\n"
+	: "+r" (addr), "+r" (limit) : "Ir" (size) : "cc");
+
+	return addr;
+}
 
 /*
  * When dealing with data aborts, watchpoints, or instruction traps we may end
@@ -111,7 +119,7 @@ static inline void set_fs(mm_segment_t f
  */
 #define untagged_addr(addr)		sign_extend64(addr, 55)
 
-#define access_ok(type, addr, size)	__range_ok(addr, size)
+#define access_ok(type, addr, size)	__range_ok((unsigned long)(addr), size)
 #define user_addr_max			get_fs
 
 #define _ASM_EXTABLE(from, to)						\
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -126,10 +126,10 @@ alternative_else_nop_endif
 	.else
 	add	x21, sp, #S_FRAME_SIZE
 	get_thread_info tsk
-	/* Save the task's original addr_limit and set USER_DS (TASK_SIZE_64) */
+	/* Save the task's original addr_limit and set USER_DS */
 	ldr	x20, [tsk, #TI_ADDR_LIMIT]
 	str	x20, [sp, #S_ORIG_ADDR_LIMIT]
-	mov	x20, #TASK_SIZE_64
+	mov	x20, #USER_DS
 	str	x20, [tsk, #TI_ADDR_LIMIT]
 	/* No need to reset PSTATE.UAO, hardware's already set it to 0 for us */
 	.endif /* \el == 0 */
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -332,7 +332,7 @@ static int __kprobes do_page_fault(unsig
 		mm_flags |= FAULT_FLAG_WRITE;
 	}
 
-	if (is_permission_fault(esr) && (addr < USER_DS)) {
+	if (is_permission_fault(esr) && (addr < TASK_SIZE)) {
 		/* regs->orig_addr_limit may be 0 if we entered from EL0 */
 		if (regs->orig_addr_limit == KERNEL_DS)
 			die("Accessing user space memory with fs=KERNEL_DS", regs, esr);

* [PATCH 4.9 14/66] arm64: Use pointer masking to limit uaccess speculation
  2018-04-17 15:58 [PATCH 4.9 00/66] 4.9.95-stable review Greg Kroah-Hartman
                   ` (12 preceding siblings ...)
  2018-04-17 15:58 ` [PATCH 4.9 13/66] arm64: Make USER_DS an inclusive limit Greg Kroah-Hartman
@ 2018-04-17 15:58 ` Greg Kroah-Hartman
  2018-04-17 15:58 ` [PATCH 4.9 15/66] arm64: entry: Ensure branch through syscall table is bounded under speculation Greg Kroah-Hartman
                   ` (55 subsequent siblings)
  69 siblings, 0 replies; 82+ messages in thread
From: Greg Kroah-Hartman @ 2018-04-17 15:58 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Robin Murphy, Will Deacon, Catalin Marinas,
	Greg Hackmann, Mark Rutland

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mark Rutland <mark.rutland@arm.com>


From: Robin Murphy <robin.murphy@arm.com>

commit 4d8efc2d5ee4c9ccfeb29ee8afd47a8660d0c0ce upstream.

Similarly to x86, mitigate speculation past an access_ok() check by
masking the pointer against the address limit before use.

Even if we don't expect speculative writes per se, it is plausible that
a CPU may still speculate at least as far as fetching a cache line for
writing, hence we also harden put_user() and clear_user() for peace of
mind.
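
The semantics of the masking, rendered as plain C for illustration
(assuming an inclusive, all-ones-style addr_limit as set up by the
previous patch; the branchless csel/csdb sequence below is what makes
it speculation-safe, not this sketch):

	/* A pointer with any bit above the limit collapses to NULL. */
	static void *mask_ptr_sketch(void *ptr, unsigned long limit)
	{
		return ((unsigned long)ptr & ~limit) ? NULL : ptr;
	}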

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com> [v4.9 backport]
Tested-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/arm64/include/asm/uaccess.h |   26 +++++++++++++++++++++++---
 1 file changed, 23 insertions(+), 3 deletions(-)

--- a/arch/arm64/include/asm/uaccess.h
+++ b/arch/arm64/include/asm/uaccess.h
@@ -129,6 +129,26 @@ static inline unsigned long __range_ok(u
 	"	.popsection\n"
 
 /*
+ * Sanitise a uaccess pointer such that it becomes NULL if above the
+ * current addr_limit.
+ */
+#define uaccess_mask_ptr(ptr) (__typeof__(ptr))__uaccess_mask_ptr(ptr)
+static inline void __user *__uaccess_mask_ptr(const void __user *ptr)
+{
+	void __user *safe_ptr;
+
+	asm volatile(
+	"	bics	xzr, %1, %2\n"
+	"	csel	%0, %1, xzr, eq\n"
+	: "=&r" (safe_ptr)
+	: "r" (ptr), "r" (current_thread_info()->addr_limit)
+	: "cc");
+
+	csdb();
+	return safe_ptr;
+}
+
+/*
  * The "__xxx" versions of the user access functions do not verify the address
  * space - it must have been done previously with a separate "access_ok()"
  * call.
@@ -202,7 +222,7 @@ do {									\
 	__typeof__(*(ptr)) __user *__p = (ptr);				\
 	might_fault();							\
 	access_ok(VERIFY_READ, __p, sizeof(*__p)) ?			\
-		__get_user((x), __p) :					\
+		__p = uaccess_mask_ptr(__p), __get_user((x), __p) :	\
 		((x) = 0, -EFAULT);					\
 })
 
@@ -270,7 +290,7 @@ do {									\
 	__typeof__(*(ptr)) __user *__p = (ptr);				\
 	might_fault();							\
 	access_ok(VERIFY_WRITE, __p, sizeof(*__p)) ?			\
-		__put_user((x), __p) :					\
+		__p = uaccess_mask_ptr(__p), __put_user((x), __p) :	\
 		-EFAULT;						\
 })
 
@@ -331,7 +351,7 @@ static inline unsigned long __must_check
 static inline unsigned long __must_check clear_user(void __user *to, unsigned long n)
 {
 	if (access_ok(VERIFY_WRITE, to, n))
-		n = __clear_user(to, n);
+		n = __clear_user(__uaccess_mask_ptr(to), n);
 	return n;
 }
 

* [PATCH 4.9 15/66] arm64: entry: Ensure branch through syscall table is bounded under speculation
  2018-04-17 15:58 [PATCH 4.9 00/66] 4.9.95-stable review Greg Kroah-Hartman
                   ` (13 preceding siblings ...)
  2018-04-17 15:58 ` [PATCH 4.9 14/66] arm64: Use pointer masking to limit uaccess speculation Greg Kroah-Hartman
@ 2018-04-17 15:58 ` Greg Kroah-Hartman
  2018-04-17 15:58 ` [PATCH 4.9 16/66] arm64: uaccess: Prevent speculative use of the current addr_limit Greg Kroah-Hartman
                   ` (54 subsequent siblings)
  69 siblings, 0 replies; 82+ messages in thread
From: Greg Kroah-Hartman @ 2018-04-17 15:58 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Mark Rutland, Will Deacon, Catalin Marinas,
	Greg Hackmann

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mark Rutland <mark.rutland@arm.com>


From: Will Deacon <will.deacon@arm.com>

commit 6314d90e64936c584f300a52ef173603fb2461b5 upstream.

In a similar manner to array_index_mask_nospec, this patch introduces an
assembly macro (mask_nospec64) which can be used to bound a value under
speculation. This macro is then used to ensure that the indirect branch
through the syscall table is bounded under speculation, with out-of-range
addresses speculating as calls to sys_io_setup (0).
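
For reference, the macro's effect expressed as C (a hypothetical
rendering that mirrors the sub/bic/asr sequence in the diff):

	/* Returns idx when 0 <= idx < limit, and 0 otherwise. */
	static unsigned long mask_nospec64_sketch(unsigned long idx,
						  unsigned long limit)
	{
		unsigned long tmp = (idx - limit) & ~idx;

		/* asr #63 smears the sign bit of tmp into a 0/~0 mask */
		return idx & (unsigned long)((long)tmp >> 63);
	}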

Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
[v4.9: use existing scno & sc_nr definitions]
Signed-off-by: Mark Rutland <mark.rutland@arm.com> [v4.9 backport]
Tested-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/arm64/include/asm/assembler.h |   11 +++++++++++
 arch/arm64/kernel/entry.S          |    1 +
 2 files changed, 12 insertions(+)

--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -94,6 +94,17 @@
 	.endm
 
 /*
+ * Sanitise a 64-bit bounded index wrt speculation, returning zero if out
+ * of bounds.
+ */
+	.macro	mask_nospec64, idx, limit, tmp
+	sub	\tmp, \idx, \limit
+	bic	\tmp, \tmp, \idx
+	and	\idx, \idx, \tmp, asr #63
+	csdb
+	.endm
+
+/*
  * NOP sequence
  */
 	.macro	nops, num
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -795,6 +795,7 @@ el0_svc_naked:					// compat entry point
 	b.ne	__sys_trace
 	cmp     scno, sc_nr                     // check upper syscall limit
 	b.hs	ni_sys
+	mask_nospec64 scno, sc_nr, x19	// enforce bounds for syscall number
 	ldr	x16, [stbl, scno, lsl #3]	// address in the syscall table
 	blr	x16				// call sys_* routine
 	b	ret_fast_syscall

* [PATCH 4.9 16/66] arm64: uaccess: Prevent speculative use of the current addr_limit
  2018-04-17 15:58 [PATCH 4.9 00/66] 4.9.95-stable review Greg Kroah-Hartman
                   ` (14 preceding siblings ...)
  2018-04-17 15:58 ` [PATCH 4.9 15/66] arm64: entry: Ensure branch through syscall table is bounded under speculation Greg Kroah-Hartman
@ 2018-04-17 15:58 ` Greg Kroah-Hartman
  2018-04-17 15:58 ` [PATCH 4.9 17/66] arm64: uaccess: Dont bother eliding access_ok checks in __{get, put}_user Greg Kroah-Hartman
                   ` (53 subsequent siblings)
  69 siblings, 0 replies; 82+ messages in thread
From: Greg Kroah-Hartman @ 2018-04-17 15:58 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Mark Rutland, Will Deacon, Catalin Marinas,
	Greg Hackmann

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mark Rutland <mark.rutland@arm.com>


From: Will Deacon <will.deacon@arm.com>

commit c2f0ad4fc089cff81cef6a13d04b399980ecbfcc upstream.

A mispredicted conditional call to set_fs could result in the wrong
addr_limit being forwarded under speculation to a subsequent access_ok
check, potentially forming part of a spectre-v1 attack using uaccess
routines.

This patch prevents this forwarding from taking place by putting heavy
barriers in set_fs after writing the addr_limit.
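
A hypothetical caller pattern showing the hazard being closed (the
helper and its flag are invented for illustration):

	static long probe_read(const void __user *ptr, unsigned long size,
			       int from_kernel)
	{
		if (from_kernel)
			set_fs(KERNEL_DS);	/* dsb(nsh) + isb land here */

		/*
		 * Without the barriers, a mispredicted branch above could
		 * let a stale addr_limit reach this check speculatively.
		 */
		if (!access_ok(VERIFY_READ, ptr, size))
			return -EFAULT;
		return 0;
	}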

Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com> [v4.9 backport]
Tested-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/arm64/include/asm/uaccess.h |    7 +++++++
 1 file changed, 7 insertions(+)

--- a/arch/arm64/include/asm/uaccess.h
+++ b/arch/arm64/include/asm/uaccess.h
@@ -68,6 +68,13 @@ static inline void set_fs(mm_segment_t f
 	current_thread_info()->addr_limit = fs;
 
 	/*
+	 * Prevent a mispredicted conditional call to set_fs from forwarding
+	 * the wrong address limit to access_ok under speculation.
+	 */
+	dsb(nsh);
+	isb();
+
+	/*
 	 * Enable/disable UAO so that copy_to_user() etc can access
 	 * kernel memory with the unprivileged instructions.
 	 */

* [PATCH 4.9 17/66] arm64: uaccess: Dont bother eliding access_ok checks in __{get, put}_user
  2018-04-17 15:58 [PATCH 4.9 00/66] 4.9.95-stable review Greg Kroah-Hartman
                   ` (15 preceding siblings ...)
  2018-04-17 15:58 ` [PATCH 4.9 16/66] arm64: uaccess: Prevent speculative use of the current addr_limit Greg Kroah-Hartman
@ 2018-04-17 15:58 ` Greg Kroah-Hartman
  2018-04-17 15:58 ` [PATCH 4.9 18/66] arm64: uaccess: Mask __user pointers for __arch_{clear, copy_*}_user Greg Kroah-Hartman
                   ` (52 subsequent siblings)
  69 siblings, 0 replies; 82+ messages in thread
From: Greg Kroah-Hartman @ 2018-04-17 15:58 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Robin Murphy, Will Deacon, Catalin Marinas,
	Greg Hackmann, Mark Rutland

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mark Rutland <mark.rutland@arm.com>


From: Will Deacon <will.deacon@arm.com>

commit 84624087dd7e3b482b7b11c170ebc1f329b3a218 upstream.

access_ok isn't an expensive operation once the addr_limit for the current
thread has been loaded into the cache. Given that the initial access_ok
check preceding a sequence of __{get,put}_user operations will take
the brunt of the miss, we can make the __* variants identical to the
full-fat versions, which brings with it the benefits of address masking.

The likely cost in these sequences will be from toggling PAN/UAO, which
we can address later by implementing the *_unsafe versions.
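
The pattern in question, for illustration (read_pair() is a hypothetical
helper): one access_ok() covering a run of __get_user() calls, which
after this patch also mask the pointer:

	static int read_pair(u32 __user *p, u32 *a, u32 *b)
	{
		if (!access_ok(VERIFY_READ, p, 2 * sizeof(u32)))
			return -EFAULT;
		/* the addr_limit load above has warmed the cache */
		return __get_user(*a, p) ?: __get_user(*b, p + 1);
	}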

Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com> [v4.9 backport]
Tested-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/arm64/include/asm/uaccess.h |   62 ++++++++++++++++++++++-----------------
 1 file changed, 36 insertions(+), 26 deletions(-)

--- a/arch/arm64/include/asm/uaccess.h
+++ b/arch/arm64/include/asm/uaccess.h
@@ -209,30 +209,35 @@ do {									\
 			CONFIG_ARM64_PAN));				\
 } while (0)
 
-#define __get_user(x, ptr)						\
+#define __get_user_check(x, ptr, err)					\
 ({									\
-	int __gu_err = 0;						\
-	__get_user_err((x), (ptr), __gu_err);				\
-	__gu_err;							\
+	__typeof__(*(ptr)) __user *__p = (ptr);				\
+	might_fault();							\
+	if (access_ok(VERIFY_READ, __p, sizeof(*__p))) {		\
+		__p = uaccess_mask_ptr(__p);				\
+		__get_user_err((x), __p, (err));			\
+	} else {							\
+		(x) = 0; (err) = -EFAULT;				\
+	}								\
 })
 
 #define __get_user_error(x, ptr, err)					\
 ({									\
-	__get_user_err((x), (ptr), (err));				\
+	__get_user_check((x), (ptr), (err));				\
 	(void)0;							\
 })
 
-#define __get_user_unaligned __get_user
-
-#define get_user(x, ptr)						\
+#define __get_user(x, ptr)						\
 ({									\
-	__typeof__(*(ptr)) __user *__p = (ptr);				\
-	might_fault();							\
-	access_ok(VERIFY_READ, __p, sizeof(*__p)) ?			\
-		__p = uaccess_mask_ptr(__p), __get_user((x), __p) :	\
-		((x) = 0, -EFAULT);					\
+	int __gu_err = 0;						\
+	__get_user_check((x), (ptr), __gu_err);				\
+	__gu_err;							\
 })
 
+#define __get_user_unaligned __get_user
+
+#define get_user	__get_user
+
 #define __put_user_asm(instr, alt_instr, reg, x, addr, err, feature)	\
 	asm volatile(							\
 	"1:"ALTERNATIVE(instr "     " reg "1, [%2]\n",			\
@@ -277,30 +282,35 @@ do {									\
 			CONFIG_ARM64_PAN));				\
 } while (0)
 
-#define __put_user(x, ptr)						\
+#define __put_user_check(x, ptr, err)					\
 ({									\
-	int __pu_err = 0;						\
-	__put_user_err((x), (ptr), __pu_err);				\
-	__pu_err;							\
+	__typeof__(*(ptr)) __user *__p = (ptr);				\
+	might_fault();							\
+	if (access_ok(VERIFY_WRITE, __p, sizeof(*__p))) {		\
+		__p = uaccess_mask_ptr(__p);				\
+		__put_user_err((x), __p, (err));			\
+	} else	{							\
+		(err) = -EFAULT;					\
+	}								\
 })
 
 #define __put_user_error(x, ptr, err)					\
 ({									\
-	__put_user_err((x), (ptr), (err));				\
+	__put_user_check((x), (ptr), (err));				\
 	(void)0;							\
 })
 
-#define __put_user_unaligned __put_user
-
-#define put_user(x, ptr)						\
+#define __put_user(x, ptr)						\
 ({									\
-	__typeof__(*(ptr)) __user *__p = (ptr);				\
-	might_fault();							\
-	access_ok(VERIFY_WRITE, __p, sizeof(*__p)) ?			\
-		__p = uaccess_mask_ptr(__p), __put_user((x), __p) :	\
-		-EFAULT;						\
+	int __pu_err = 0;						\
+	__put_user_check((x), (ptr), __pu_err);				\
+	__pu_err;							\
 })
 
+#define __put_user_unaligned __put_user
+
+#define put_user	__put_user
+
 extern unsigned long __must_check __arch_copy_from_user(void *to, const void __user *from, unsigned long n);
 extern unsigned long __must_check __arch_copy_to_user(void __user *to, const void *from, unsigned long n);
 extern unsigned long __must_check __copy_in_user(void __user *to, const void __user *from, unsigned long n);

* [PATCH 4.9 18/66] arm64: uaccess: Mask __user pointers for __arch_{clear, copy_*}_user
  2018-04-17 15:58 [PATCH 4.9 00/66] 4.9.95-stable review Greg Kroah-Hartman
                   ` (16 preceding siblings ...)
  2018-04-17 15:58 ` [PATCH 4.9 17/66] arm64: uaccess: Dont bother eliding access_ok checks in __{get, put}_user Greg Kroah-Hartman
@ 2018-04-17 15:58 ` Greg Kroah-Hartman
  2018-04-17 15:58 ` [PATCH 4.9 19/66] arm64: cpufeature: __this_cpu_has_cap() shouldnt stop early Greg Kroah-Hartman
                   ` (51 subsequent siblings)
  69 siblings, 0 replies; 82+ messages in thread
From: Greg Kroah-Hartman @ 2018-04-17 15:58 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Will Deacon, Catalin Marinas, Greg Hackmann,
	Mark Rutland

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mark Rutland <mark.rutland@arm.com>


From: Will Deacon <will.deacon@arm.com>

commit f71c2ffcb20dd8626880747557014bb9a61eb90e upstream.

Like we've done for get_user and put_user, ensure that user pointers
are masked before invoking the underlying __arch_{clear,copy_*}_user
operations.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
[v4.9: fixup for v4.9-style uaccess primitives]
Signed-off-by: Mark Rutland <mark.rutland@arm.com> [v4.9 backport]
Tested-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/arm64/include/asm/uaccess.h |   18 ++++++++++--------
 arch/arm64/kernel/arm64ksyms.c   |    4 ++--
 arch/arm64/lib/clear_user.S      |    6 +++---
 arch/arm64/lib/copy_in_user.S    |    4 ++--
 4 files changed, 17 insertions(+), 15 deletions(-)

--- a/arch/arm64/include/asm/uaccess.h
+++ b/arch/arm64/include/asm/uaccess.h
@@ -313,21 +313,20 @@ do {									\
 
 extern unsigned long __must_check __arch_copy_from_user(void *to, const void __user *from, unsigned long n);
 extern unsigned long __must_check __arch_copy_to_user(void __user *to, const void *from, unsigned long n);
-extern unsigned long __must_check __copy_in_user(void __user *to, const void __user *from, unsigned long n);
-extern unsigned long __must_check __clear_user(void __user *addr, unsigned long n);
+extern unsigned long __must_check __arch_copy_in_user(void __user *to, const void __user *from, unsigned long n);
 
 static inline unsigned long __must_check __copy_from_user(void *to, const void __user *from, unsigned long n)
 {
 	kasan_check_write(to, n);
 	check_object_size(to, n, false);
-	return __arch_copy_from_user(to, from, n);
+	return __arch_copy_from_user(to, __uaccess_mask_ptr(from), n);
 }
 
 static inline unsigned long __must_check __copy_to_user(void __user *to, const void *from, unsigned long n)
 {
 	kasan_check_read(from, n);
 	check_object_size(from, n, true);
-	return __arch_copy_to_user(to, from, n);
+	return __arch_copy_to_user(__uaccess_mask_ptr(to), from, n);
 }
 
 static inline unsigned long __must_check copy_from_user(void *to, const void __user *from, unsigned long n)
@@ -355,22 +354,25 @@ static inline unsigned long __must_check
 	return n;
 }
 
-static inline unsigned long __must_check copy_in_user(void __user *to, const void __user *from, unsigned long n)
+static inline unsigned long __must_check __copy_in_user(void __user *to, const void __user *from, unsigned long n)
 {
 	if (access_ok(VERIFY_READ, from, n) && access_ok(VERIFY_WRITE, to, n))
-		n = __copy_in_user(to, from, n);
+		n = __arch_copy_in_user(__uaccess_mask_ptr(to), __uaccess_mask_ptr(from), n);
 	return n;
 }
+#define copy_in_user __copy_in_user
 
 #define __copy_to_user_inatomic __copy_to_user
 #define __copy_from_user_inatomic __copy_from_user
 
-static inline unsigned long __must_check clear_user(void __user *to, unsigned long n)
+extern unsigned long __must_check __arch_clear_user(void __user *to, unsigned long n);
+static inline unsigned long __must_check __clear_user(void __user *to, unsigned long n)
 {
 	if (access_ok(VERIFY_WRITE, to, n))
-		n = __clear_user(__uaccess_mask_ptr(to), n);
+		n = __arch_clear_user(__uaccess_mask_ptr(to), n);
 	return n;
 }
+#define clear_user	__clear_user
 
 extern long strncpy_from_user(char *dest, const char __user *src, long count);
 
--- a/arch/arm64/kernel/arm64ksyms.c
+++ b/arch/arm64/kernel/arm64ksyms.c
@@ -37,8 +37,8 @@ EXPORT_SYMBOL(clear_page);
 	/* user mem (segment) */
 EXPORT_SYMBOL(__arch_copy_from_user);
 EXPORT_SYMBOL(__arch_copy_to_user);
-EXPORT_SYMBOL(__clear_user);
-EXPORT_SYMBOL(__copy_in_user);
+EXPORT_SYMBOL(__arch_clear_user);
+EXPORT_SYMBOL(__arch_copy_in_user);
 
 	/* physical memory */
 EXPORT_SYMBOL(memstart_addr);
--- a/arch/arm64/lib/clear_user.S
+++ b/arch/arm64/lib/clear_user.S
@@ -24,7 +24,7 @@
 
 	.text
 
-/* Prototype: int __clear_user(void *addr, size_t sz)
+/* Prototype: int __arch_clear_user(void *addr, size_t sz)
  * Purpose  : clear some user memory
  * Params   : addr - user memory address to clear
  *          : sz   - number of bytes to clear
@@ -32,7 +32,7 @@
  *
  * Alignment fixed up by hardware.
  */
-ENTRY(__clear_user)
+ENTRY(__arch_clear_user)
 ALTERNATIVE("nop", __stringify(SET_PSTATE_PAN(0)), ARM64_ALT_PAN_NOT_UAO, \
 	    CONFIG_ARM64_PAN)
 	mov	x2, x1			// save the size for fixup return
@@ -57,7 +57,7 @@ uao_user_alternative 9f, strb, sttrb, wz
 ALTERNATIVE("nop", __stringify(SET_PSTATE_PAN(1)), ARM64_ALT_PAN_NOT_UAO, \
 	    CONFIG_ARM64_PAN)
 	ret
-ENDPROC(__clear_user)
+ENDPROC(__arch_clear_user)
 
 	.section .fixup,"ax"
 	.align	2
--- a/arch/arm64/lib/copy_in_user.S
+++ b/arch/arm64/lib/copy_in_user.S
@@ -67,7 +67,7 @@
 	.endm
 
 end	.req	x5
-ENTRY(__copy_in_user)
+ENTRY(__arch_copy_in_user)
 ALTERNATIVE("nop", __stringify(SET_PSTATE_PAN(0)), ARM64_ALT_PAN_NOT_UAO, \
 	    CONFIG_ARM64_PAN)
 	add	end, x0, x2
@@ -76,7 +76,7 @@ ALTERNATIVE("nop", __stringify(SET_PSTAT
 	    CONFIG_ARM64_PAN)
 	mov	x0, #0
 	ret
-ENDPROC(__copy_in_user)
+ENDPROC(__arch_copy_in_user)
 
 	.section .fixup,"ax"
 	.align	2

* [PATCH 4.9 19/66] arm64: cpufeature: __this_cpu_has_cap() shouldnt stop early
  2018-04-17 15:58 [PATCH 4.9 00/66] 4.9.95-stable review Greg Kroah-Hartman
                   ` (17 preceding siblings ...)
  2018-04-17 15:58 ` [PATCH 4.9 18/66] arm64: uaccess: Mask __user pointers for __arch_{clear, copy_*}_user Greg Kroah-Hartman
@ 2018-04-17 15:58 ` Greg Kroah-Hartman
  2018-04-17 15:58 ` [PATCH 4.9 20/66] arm64: Run enable method for errata work arounds on late CPUs Greg Kroah-Hartman
                   ` (50 subsequent siblings)
  69 siblings, 0 replies; 82+ messages in thread
From: Greg Kroah-Hartman @ 2018-04-17 15:58 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Suzuki K Poulose, Marc Zyngier, James Morse,
	Catalin Marinas, Greg Hackmann, Mark Rutland

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mark Rutland <mark.rutland@arm.com>


From: James Morse <james.morse@arm.com>

commit edf298cfce47ab7279d03b5203ae2ef3a58e49db upstream.

this_cpu_has_cap() tests caps->desc not caps->matches, so it stops
walking the list when it finds a 'silent' feature, instead of
walking to the end of the list.

Prior to v4.6's 644c2ae198412 ("arm64: cpufeature: Test 'matches' pointer
to find the end of the list") we always tested desc to find the end of
a capability list. This was changed for dubious things like PAN_NOT_UAO.
v4.7's e3661b128e53e ("arm64: Allow a capability to be checked on
single CPU") added this_cpu_has_cap() using the old desc style test.
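
To make the failure mode concrete, consider a hypothetical capability
array (entries and match functions invented for illustration):

	/* The middle entry is 'silent': .desc == NULL, .matches valid. */
	static const struct arm64_cpu_capabilities demo_caps[] = {
		{ .desc = "first", .capability = 1, .matches = match_first },
		{ .desc = NULL,    .capability = 2, .matches = match_silent },
		{ .desc = "third", .capability = 3, .matches = match_third },
		{ /* terminator: .matches == NULL */ },
	};

A walk terminated on caps->desc stops at the silent entry and never
tests "third"; terminating on caps->matches reaches every real entry.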

CC: Suzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com> [v4.9 backport]
Tested-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/arm64/kernel/cpufeature.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -1024,9 +1024,8 @@ static bool __this_cpu_has_cap(const str
 	if (WARN_ON(preemptible()))
 		return false;
 
-	for (caps = cap_array; caps->desc; caps++)
+	for (caps = cap_array; caps->matches; caps++)
 		if (caps->capability == cap &&
-		    caps->matches &&
 		    caps->matches(caps, SCOPE_LOCAL_CPU))
 			return true;
 	return false;

* [PATCH 4.9 20/66] arm64: Run enable method for errata work arounds on late CPUs
  2018-04-17 15:58 [PATCH 4.9 00/66] 4.9.95-stable review Greg Kroah-Hartman
                   ` (18 preceding siblings ...)
  2018-04-17 15:58 ` [PATCH 4.9 19/66] arm64: cpufeature: __this_cpu_has_cap() shouldnt stop early Greg Kroah-Hartman
@ 2018-04-17 15:58 ` Greg Kroah-Hartman
  2018-04-17 15:58 ` [PATCH 4.9 21/66] arm64: cpufeature: Pass capability structure to ->enable callback Greg Kroah-Hartman
                   ` (49 subsequent siblings)
  69 siblings, 0 replies; 82+ messages in thread
From: Greg Kroah-Hartman @ 2018-04-17 15:58 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Will Deacon, Mark Rutland, Andre Przywara,
	Dave Martin, Suzuki K Poulose, Catalin Marinas, Greg Hackmann

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mark Rutland <mark.rutland@arm.com>


From: Suzuki K Poulose <suzuki.poulose@arm.com>

commit 55b35d070c2534dfb714b883f3c3ae05d02032da upstream.

When a CPU is brought up after we have finalised the system-wide
capabilities (i.e., features and errata), we make sure the new CPU
doesn't need a new errata workaround which has not already been
detected. However, we don't run the enable() method on the new CPU
for the errata workarounds already detected. This could leave the
new CPU running without the potential workarounds.
It is up to the "enable()" method to decide if this CPU should
do something about the errata.

Fixes: commit 6a6efbb45b7d95c84 ("arm64: Verify CPU errata work arounds on hotplugged CPU")
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Dave Martin <dave.martin@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com> [v4.9 backport]
Tested-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/arm64/kernel/cpu_errata.c |    9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -143,15 +143,18 @@ void verify_local_cpu_errata_workarounds
 {
 	const struct arm64_cpu_capabilities *caps = arm64_errata;
 
-	for (; caps->matches; caps++)
-		if (!cpus_have_cap(caps->capability) &&
-			caps->matches(caps, SCOPE_LOCAL_CPU)) {
+	for (; caps->matches; caps++) {
+		if (cpus_have_cap(caps->capability)) {
+			if (caps->enable)
+				caps->enable((void *)caps);
+		} else if (caps->matches(caps, SCOPE_LOCAL_CPU)) {
 			pr_crit("CPU%d: Requires work around for %s, not detected"
 					" at boot time\n",
 				smp_processor_id(),
 				caps->desc ? : "an erratum");
 			cpu_die_early();
 		}
+	}
 }
 
 void update_cpu_errata_workarounds(void)

* [PATCH 4.9 21/66] arm64: cpufeature: Pass capability structure to ->enable callback
  2018-04-17 15:58 [PATCH 4.9 00/66] 4.9.95-stable review Greg Kroah-Hartman
                   ` (19 preceding siblings ...)
  2018-04-17 15:58 ` [PATCH 4.9 20/66] arm64: Run enable method for errata work arounds on late CPUs Greg Kroah-Hartman
@ 2018-04-17 15:58 ` Greg Kroah-Hartman
  2018-04-17 15:58 ` [PATCH 4.9 22/66] drivers/firmware: Expose psci_get_version through psci_ops structure Greg Kroah-Hartman
                   ` (48 subsequent siblings)
  69 siblings, 0 replies; 82+ messages in thread
From: Greg Kroah-Hartman @ 2018-04-17 15:58 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Suzuki K Poulose, Will Deacon,
	Catalin Marinas, Greg Hackmann, Mark Rutland

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mark Rutland <mark.rutland@arm.com>


From: Will Deacon <will.deacon@arm.com>

commit 0a0d111d40fd1dc588cc590fab6b55d86ddc71d3 upstream.

In order to invoke the CPU capability ->matches callback from the ->enable
callback for applying local-CPU workarounds, we need a handle on the
capability structure.

This patch passes a pointer to the capability structure to the ->enable
callback.
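
A sketch of an enable() callback making use of the new argument (the
callback and fixup names are hypothetical):

	static int enable_erratum_fix(void *data)
	{
		const struct arm64_cpu_capabilities *cap = data;

		/* only act on CPUs that actually match the erratum */
		if (cap->matches(cap, SCOPE_LOCAL_CPU))
			apply_erratum_fixup();
		return 0;
	}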

Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com> [v4.9 backport]
Tested-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/arm64/kernel/cpufeature.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -1058,7 +1058,7 @@ void __init enable_cpu_capabilities(cons
 			 * uses an IPI, giving us a PSTATE that disappears when
 			 * we return.
 			 */
-			stop_machine(caps->enable, NULL, cpu_online_mask);
+			stop_machine(caps->enable, (void *)caps, cpu_online_mask);
 }
 
 /*
@@ -1115,7 +1115,7 @@ verify_local_cpu_features(const struct a
 			cpu_die_early();
 		}
 		if (caps->enable)
-			caps->enable(NULL);
+			caps->enable((void *)caps);
 	}
 }
 

* [PATCH 4.9 22/66] drivers/firmware: Expose psci_get_version through psci_ops structure
  2018-04-17 15:58 [PATCH 4.9 00/66] 4.9.95-stable review Greg Kroah-Hartman
                   ` (20 preceding siblings ...)
  2018-04-17 15:58 ` [PATCH 4.9 21/66] arm64: cpufeature: Pass capability structure to ->enable callback Greg Kroah-Hartman
@ 2018-04-17 15:58 ` Greg Kroah-Hartman
  2018-04-17 15:58 ` [PATCH 4.9 23/66] arm64: Factor out TTBR0_EL1 post-update workaround into a specific asm macro Greg Kroah-Hartman
                   ` (47 subsequent siblings)
  69 siblings, 0 replies; 82+ messages in thread
From: Greg Kroah-Hartman @ 2018-04-17 15:58 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Lorenzo Pieralisi, Will Deacon,
	Catalin Marinas, Greg Hackmann, Mark Rutland

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mark Rutland <mark.rutland@arm.com>


From: Will Deacon <will.deacon@arm.com>

commit d68e3ba5303f7e1099f51fdcd155f5263da8569b upstream.

Entry into recent versions of ARM Trusted Firmware will invalidate the CPU
branch predictor state in order to protect against aliasing attacks.

This patch exposes the PSCI "VERSION" function via psci_ops, so that it
can be invoked outside of the PSCI driver where necessary.
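
A hypothetical caller of the newly exposed hook:

	static void report_psci_version(void)
	{
		u32 ver;

		if (!psci_ops.get_version)
			return;

		ver = psci_ops.get_version();
		pr_info("PSCI version %u.%u\n",
			PSCI_VERSION_MAJOR(ver), PSCI_VERSION_MINOR(ver));
	}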

Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com> [v4.9 backport]
Tested-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/firmware/psci.c |    2 ++
 include/linux/psci.h    |    1 +
 2 files changed, 3 insertions(+)

--- a/drivers/firmware/psci.c
+++ b/drivers/firmware/psci.c
@@ -496,6 +496,8 @@ static void __init psci_init_migrate(voi
 static void __init psci_0_2_set_functions(void)
 {
 	pr_info("Using standard PSCI v0.2 function IDs\n");
+	psci_ops.get_version = psci_get_version;
+
 	psci_function_id[PSCI_FN_CPU_SUSPEND] =
 					PSCI_FN_NATIVE(0_2, CPU_SUSPEND);
 	psci_ops.cpu_suspend = psci_cpu_suspend;
--- a/include/linux/psci.h
+++ b/include/linux/psci.h
@@ -26,6 +26,7 @@ int psci_cpu_init_idle(unsigned int cpu)
 int psci_cpu_suspend_enter(unsigned long index);
 
 struct psci_operations {
+	u32 (*get_version)(void);
 	int (*cpu_suspend)(u32 state, unsigned long entry_point);
 	int (*cpu_off)(u32 state);
 	int (*cpu_on)(unsigned long cpuid, unsigned long entry_point);

* [PATCH 4.9 23/66] arm64: Factor out TTBR0_EL1 post-update workaround into a specific asm macro
  2018-04-17 15:58 [PATCH 4.9 00/66] 4.9.95-stable review Greg Kroah-Hartman
                   ` (21 preceding siblings ...)
  2018-04-17 15:58 ` [PATCH 4.9 22/66] drivers/firmware: Expose psci_get_version through psci_ops structure Greg Kroah-Hartman
@ 2018-04-17 15:58 ` Greg Kroah-Hartman
  2018-04-17 15:58 ` [PATCH 4.9 24/66] arm64: Move post_ttbr_update_workaround to C code Greg Kroah-Hartman
                   ` (46 subsequent siblings)
  69 siblings, 0 replies; 82+ messages in thread
From: Greg Kroah-Hartman @ 2018-04-17 15:58 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Will Deacon, James Morse, Kees Cook,
	Mark Rutland, Catalin Marinas, Greg Hackmann

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mark Rutland <mark.rutland@arm.com>


From: Catalin Marinas <catalin.marinas@arm.com>

commit f33bcf03e6079668da6bf4eec4a7dcf9289131d0 upstream.

This patch takes the errata workaround code out of cpu_do_switch_mm into
a dedicated post_ttbr0_update_workaround macro which will be reused in a
subsequent patch.

Cc: Will Deacon <will.deacon@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Kees Cook <keescook@chromium.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com> [v4.9 backport]
Tested-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/arm64/include/asm/assembler.h |   14 ++++++++++++++
 arch/arm64/mm/proc.S               |    6 +-----
 2 files changed, 15 insertions(+), 5 deletions(-)

--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -434,4 +434,18 @@ alternative_endif
 	.macro	pte_to_phys, phys, pte
 	and	\phys, \pte, #(((1 << (48 - PAGE_SHIFT)) - 1) << PAGE_SHIFT)
 	.endm
+
+/*
+ * Errata workaround post TTBR0_EL1 update.
+ */
+	.macro	post_ttbr0_update_workaround
+#ifdef CONFIG_CAVIUM_ERRATUM_27456
+alternative_if ARM64_WORKAROUND_CAVIUM_27456
+	ic	iallu
+	dsb	nsh
+	isb
+alternative_else_nop_endif
+#endif
+	.endm
+
 #endif	/* __ASM_ASSEMBLER_H */
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -139,11 +139,7 @@ ENTRY(cpu_do_switch_mm)
 	isb
 	msr	ttbr0_el1, x0			// now update TTBR0
 	isb
-alternative_if ARM64_WORKAROUND_CAVIUM_27456
-	ic	iallu
-	dsb	nsh
-	isb
-alternative_else_nop_endif
+	post_ttbr0_update_workaround
 	ret
 ENDPROC(cpu_do_switch_mm)
 

* [PATCH 4.9 24/66] arm64: Move post_ttbr_update_workaround to C code
  2018-04-17 15:58 [PATCH 4.9 00/66] 4.9.95-stable review Greg Kroah-Hartman
                   ` (22 preceding siblings ...)
  2018-04-17 15:58 ` [PATCH 4.9 23/66] arm64: Factor out TTBR0_EL1 post-update workaround into a specific asm macro Greg Kroah-Hartman
@ 2018-04-17 15:58 ` Greg Kroah-Hartman
  2018-04-17 15:58 ` [PATCH 4.9 25/66] arm64: Add skeleton to harden the branch predictor against aliasing attacks Greg Kroah-Hartman
                   ` (45 subsequent siblings)
  69 siblings, 0 replies; 82+ messages in thread
From: Greg Kroah-Hartman @ 2018-04-17 15:58 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Marc Zyngier, Will Deacon, Catalin Marinas,
	Greg Hackmann, Mark Rutland

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mark Rutland <mark.rutland@arm.com>


From: Marc Zyngier <marc.zyngier@arm.com>

commit 95e3de3590e3f2358bb13f013911bc1bfa5d3f53 upstream.

We will soon need to invoke a CPU-specific function pointer after changing
page tables, so move post_ttbr_update_workaround out into C code to make
this possible.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com> [v4.9 backport]
Tested-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/arm64/include/asm/assembler.h |   13 -------------
 arch/arm64/mm/context.c            |    9 +++++++++
 arch/arm64/mm/proc.S               |    3 +--
 3 files changed, 10 insertions(+), 15 deletions(-)

--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -435,17 +435,4 @@ alternative_endif
 	and	\phys, \pte, #(((1 << (48 - PAGE_SHIFT)) - 1) << PAGE_SHIFT)
 	.endm
 
-/*
- * Errata workaround post TTBR0_EL1 update.
- */
-	.macro	post_ttbr0_update_workaround
-#ifdef CONFIG_CAVIUM_ERRATUM_27456
-alternative_if ARM64_WORKAROUND_CAVIUM_27456
-	ic	iallu
-	dsb	nsh
-	isb
-alternative_else_nop_endif
-#endif
-	.endm
-
 #endif	/* __ASM_ASSEMBLER_H */
--- a/arch/arm64/mm/context.c
+++ b/arch/arm64/mm/context.c
@@ -233,6 +233,15 @@ switch_mm_fastpath:
 	cpu_switch_mm(mm->pgd, mm);
 }
 
+/* Errata workaround post TTBRx_EL1 update. */
+asmlinkage void post_ttbr_update_workaround(void)
+{
+	asm(ALTERNATIVE("nop; nop; nop",
+			"ic iallu; dsb nsh; isb",
+			ARM64_WORKAROUND_CAVIUM_27456,
+			CONFIG_CAVIUM_ERRATUM_27456));
+}
+
 static int asids_init(void)
 {
 	asid_bits = get_cpu_asid_bits();
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -139,8 +139,7 @@ ENTRY(cpu_do_switch_mm)
 	isb
 	msr	ttbr0_el1, x0			// now update TTBR0
 	isb
-	post_ttbr0_update_workaround
-	ret
+	b	post_ttbr_update_workaround	// Back to C code...
 ENDPROC(cpu_do_switch_mm)
 
 	.pushsection ".idmap.text", "awx"

* [PATCH 4.9 25/66] arm64: Add skeleton to harden the branch predictor against aliasing attacks
  2018-04-17 15:58 [PATCH 4.9 00/66] 4.9.95-stable review Greg Kroah-Hartman
                   ` (23 preceding siblings ...)
  2018-04-17 15:58 ` [PATCH 4.9 24/66] arm64: Move post_ttbr_update_workaround to C code Greg Kroah-Hartman
@ 2018-04-17 15:58 ` Greg Kroah-Hartman
  2018-04-17 15:58 ` [PATCH 4.9 26/66] arm64: Move BP hardening to check_and_switch_context Greg Kroah-Hartman
                   ` (44 subsequent siblings)
  69 siblings, 0 replies; 82+ messages in thread
From: Greg Kroah-Hartman @ 2018-04-17 15:58 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Will Deacon, Catalin Marinas, Greg Hackmann,
	Mark Rutland

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mark Rutland <mark.rutland@arm.com>


From: Will Deacon <will.deacon@arm.com>

commit 0f15adbb2861ce6f75ccfc5a92b19eae0ef327d0 upstream.

Aliasing attacks against CPU branch predictors can allow an attacker to
redirect speculative control flow on some CPUs and potentially divulge
information from one context to another.

This patch adds initial skeleton code behind a new Kconfig option to
enable implementation-specific mitigations against these attacks for
CPUs that are affected.
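
A sketch of how an implementation-specific mitigation would hook into
the new skeleton (CPU, callback, and vector-symbol names are invented;
MIDR matching is elided):

	static void qfoo_bp_inval(void)
	{
		/* CPU-specific predictor invalidation sequence */
	}

	static int enable_qfoo_hardening(void *data)
	{
		const struct arm64_cpu_capabilities *entry = data;

		install_bp_hardening_cb(entry, qfoo_bp_inval,
					__qfoo_hyp_vecs_start,
					__qfoo_hyp_vecs_end);
		return 0;
	}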

Co-developed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
[v4.9: copy bp hardening cb via text mapping]
Signed-off-by: Mark Rutland <mark.rutland@arm.com> [v4.9 backport]
Tested-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/arm64/Kconfig               |   17 ++++++++
 arch/arm64/include/asm/cpucaps.h |    3 +
 arch/arm64/include/asm/mmu.h     |   39 ++++++++++++++++++++
 arch/arm64/include/asm/sysreg.h  |    2 +
 arch/arm64/kernel/Makefile       |    4 ++
 arch/arm64/kernel/bpi.S          |   55 ++++++++++++++++++++++++++++
 arch/arm64/kernel/cpu_errata.c   |   74 +++++++++++++++++++++++++++++++++++++++
 arch/arm64/kernel/cpufeature.c   |    3 +
 arch/arm64/kernel/entry.S        |    8 ++--
 arch/arm64/mm/context.c          |    2 +
 arch/arm64/mm/fault.c            |   17 ++++++++
 11 files changed, 219 insertions(+), 5 deletions(-)
 create mode 100644 arch/arm64/kernel/bpi.S

--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -745,6 +745,23 @@ config UNMAP_KERNEL_AT_EL0
 
 	  If unsure, say Y.
 
+config HARDEN_BRANCH_PREDICTOR
+	bool "Harden the branch predictor against aliasing attacks" if EXPERT
+	default y
+	help
+	  Speculation attacks against some high-performance processors rely on
+	  being able to manipulate the branch predictor for a victim context by
+	  executing aliasing branches in the attacker context.  Such attacks
+	  can be partially mitigated against by clearing internal branch
+	  predictor state and limiting the prediction logic in some situations.
+
+	  This config option will take CPU-specific actions to harden the
+	  branch predictor against aliasing attacks and may rely on specific
+	  instruction sequences or control bits being set by the system
+	  firmware.
+
+	  If unsure, say Y.
+
 menuconfig ARMV8_DEPRECATED
 	bool "Emulate deprecated/obsolete ARMv8 instructions"
 	depends on COMPAT
--- a/arch/arm64/include/asm/cpucaps.h
+++ b/arch/arm64/include/asm/cpucaps.h
@@ -35,7 +35,8 @@
 #define ARM64_HYP_OFFSET_LOW			14
 #define ARM64_MISMATCHED_CACHE_LINE_SIZE	15
 #define ARM64_UNMAP_KERNEL_AT_EL0		16
+#define ARM64_HARDEN_BRANCH_PREDICTOR		17
 
-#define ARM64_NCAPS				17
+#define ARM64_NCAPS				18
 
 #endif /* __ASM_CPUCAPS_H */
--- a/arch/arm64/include/asm/mmu.h
+++ b/arch/arm64/include/asm/mmu.h
@@ -20,6 +20,8 @@
 
 #ifndef __ASSEMBLY__
 
+#include <linux/percpu.h>
+
 typedef struct {
 	atomic64_t	id;
 	void		*vdso;
@@ -38,6 +40,43 @@ static inline bool arm64_kernel_unmapped
 	       cpus_have_cap(ARM64_UNMAP_KERNEL_AT_EL0);
 }
 
+typedef void (*bp_hardening_cb_t)(void);
+
+struct bp_hardening_data {
+	int			hyp_vectors_slot;
+	bp_hardening_cb_t	fn;
+};
+
+#ifdef CONFIG_HARDEN_BRANCH_PREDICTOR
+extern char __bp_harden_hyp_vecs_start[], __bp_harden_hyp_vecs_end[];
+
+DECLARE_PER_CPU_READ_MOSTLY(struct bp_hardening_data, bp_hardening_data);
+
+static inline struct bp_hardening_data *arm64_get_bp_hardening_data(void)
+{
+	return this_cpu_ptr(&bp_hardening_data);
+}
+
+static inline void arm64_apply_bp_hardening(void)
+{
+	struct bp_hardening_data *d;
+
+	if (!cpus_have_cap(ARM64_HARDEN_BRANCH_PREDICTOR))
+		return;
+
+	d = arm64_get_bp_hardening_data();
+	if (d->fn)
+		d->fn();
+}
+#else
+static inline struct bp_hardening_data *arm64_get_bp_hardening_data(void)
+{
+	return NULL;
+}
+
+static inline void arm64_apply_bp_hardening(void)	{ }
+#endif	/* CONFIG_HARDEN_BRANCH_PREDICTOR */
+
 extern void paging_init(void);
 extern void bootmem_init(void);
 extern void __iomem *early_io_map(phys_addr_t phys, unsigned long virt);
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -118,6 +118,8 @@
 
 /* id_aa64pfr0 */
 #define ID_AA64PFR0_CSV3_SHIFT		60
+#define ID_AA64PFR0_CSV2_SHIFT		56
+#define ID_AA64PFR0_SVE_SHIFT		32
 #define ID_AA64PFR0_GIC_SHIFT		24
 #define ID_AA64PFR0_ASIMD_SHIFT		20
 #define ID_AA64PFR0_FP_SHIFT		16
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -51,6 +51,10 @@ arm64-obj-$(CONFIG_HIBERNATION)		+= hibe
 arm64-obj-$(CONFIG_KEXEC)		+= machine_kexec.o relocate_kernel.o	\
 					   cpu-reset.o
 
+ifeq ($(CONFIG_KVM),y)
+arm64-obj-$(CONFIG_HARDEN_BRANCH_PREDICTOR)	+= bpi.o
+endif
+
 obj-y					+= $(arm64-obj-y) vdso/ probes/
 obj-m					+= $(arm64-obj-m)
 head-y					:= head.o
--- /dev/null
+++ b/arch/arm64/kernel/bpi.S
@@ -0,0 +1,55 @@
+/*
+ * Contains CPU specific branch predictor invalidation sequences
+ *
+ * Copyright (C) 2018 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/linkage.h>
+
+.macro ventry target
+	.rept 31
+	nop
+	.endr
+	b	\target
+.endm
+
+.macro vectors target
+	ventry \target + 0x000
+	ventry \target + 0x080
+	ventry \target + 0x100
+	ventry \target + 0x180
+
+	ventry \target + 0x200
+	ventry \target + 0x280
+	ventry \target + 0x300
+	ventry \target + 0x380
+
+	ventry \target + 0x400
+	ventry \target + 0x480
+	ventry \target + 0x500
+	ventry \target + 0x580
+
+	ventry \target + 0x600
+	ventry \target + 0x680
+	ventry \target + 0x700
+	ventry \target + 0x780
+.endm
+
+	.align	11
+ENTRY(__bp_harden_hyp_vecs_start)
+	.rept 4
+	vectors __kvm_hyp_vector
+	.endr
+ENTRY(__bp_harden_hyp_vecs_end)
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -46,6 +46,80 @@ static int cpu_enable_trap_ctr_access(vo
 	return 0;
 }
 
+#ifdef CONFIG_HARDEN_BRANCH_PREDICTOR
+#include <asm/mmu_context.h>
+#include <asm/cacheflush.h>
+
+DEFINE_PER_CPU_READ_MOSTLY(struct bp_hardening_data, bp_hardening_data);
+
+#ifdef CONFIG_KVM
+static void __copy_hyp_vect_bpi(int slot, const char *hyp_vecs_start,
+				const char *hyp_vecs_end)
+{
+	void *dst = __bp_harden_hyp_vecs_start + slot * SZ_2K;
+	int i;
+
+	for (i = 0; i < SZ_2K; i += 0x80)
+		memcpy(dst + i, hyp_vecs_start, hyp_vecs_end - hyp_vecs_start);
+
+	flush_icache_range((uintptr_t)dst, (uintptr_t)dst + SZ_2K);
+}
+
+static void __install_bp_hardening_cb(bp_hardening_cb_t fn,
+				      const char *hyp_vecs_start,
+				      const char *hyp_vecs_end)
+{
+	static int last_slot = -1;
+	static DEFINE_SPINLOCK(bp_lock);
+	int cpu, slot = -1;
+
+	spin_lock(&bp_lock);
+	for_each_possible_cpu(cpu) {
+		if (per_cpu(bp_hardening_data.fn, cpu) == fn) {
+			slot = per_cpu(bp_hardening_data.hyp_vectors_slot, cpu);
+			break;
+		}
+	}
+
+	if (slot == -1) {
+		last_slot++;
+		BUG_ON(((__bp_harden_hyp_vecs_end - __bp_harden_hyp_vecs_start)
+			/ SZ_2K) <= last_slot);
+		slot = last_slot;
+		__copy_hyp_vect_bpi(slot, hyp_vecs_start, hyp_vecs_end);
+	}
+
+	__this_cpu_write(bp_hardening_data.hyp_vectors_slot, slot);
+	__this_cpu_write(bp_hardening_data.fn, fn);
+	spin_unlock(&bp_lock);
+}
+#else
+static void __install_bp_hardening_cb(bp_hardening_cb_t fn,
+				      const char *hyp_vecs_start,
+				      const char *hyp_vecs_end)
+{
+	__this_cpu_write(bp_hardening_data.fn, fn);
+}
+#endif	/* CONFIG_KVM */
+
+static void  install_bp_hardening_cb(const struct arm64_cpu_capabilities *entry,
+				     bp_hardening_cb_t fn,
+				     const char *hyp_vecs_start,
+				     const char *hyp_vecs_end)
+{
+	u64 pfr0;
+
+	if (!entry->matches(entry, SCOPE_LOCAL_CPU))
+		return;
+
+	pfr0 = read_cpuid(ID_AA64PFR0_EL1);
+	if (cpuid_feature_extract_unsigned_field(pfr0, ID_AA64PFR0_CSV2_SHIFT))
+		return;
+
+	__install_bp_hardening_cb(fn, hyp_vecs_start, hyp_vecs_end);
+}
+#endif	/* CONFIG_HARDEN_BRANCH_PREDICTOR */
+
 #define MIDR_RANGE(model, min, max) \
 	.def_scope = SCOPE_LOCAL_CPU, \
 	.matches = is_affected_midr_range, \
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -94,7 +94,8 @@ static const struct arm64_ftr_bits ftr_i
 
 static const struct arm64_ftr_bits ftr_id_aa64pfr0[] = {
 	ARM64_FTR_BITS(FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64PFR0_CSV3_SHIFT, 4, 0),
-	ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, 32, 28, 0),
+	ARM64_FTR_BITS(FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64PFR0_CSV2_SHIFT, 4, 0),
+	ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, 32, 24, 0),
 	ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, 28, 4, 0),
 	ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, ID_AA64PFR0_GIC_SHIFT, 4, 0),
 	S_ARM64_FTR_BITS(FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_ASIMD_SHIFT, 4, ID_AA64PFR0_ASIMD_NI),
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -589,13 +589,15 @@ el0_ia:
 	 * Instruction abort handling
 	 */
 	mrs	x26, far_el1
-	// enable interrupts before calling the main handler
-	enable_dbg_and_irq
+	msr     daifclr, #(8 | 4 | 1)
+#ifdef CONFIG_TRACE_IRQFLAGS
+	bl	trace_hardirqs_off
+#endif
 	ct_user_exit
 	mov	x0, x26
 	mov	x1, x25
 	mov	x2, sp
-	bl	do_mem_abort
+	bl	do_el0_ia_bp_hardening
 	b	ret_to_user
 el0_fpsimd_acc:
 	/*
--- a/arch/arm64/mm/context.c
+++ b/arch/arm64/mm/context.c
@@ -240,6 +240,8 @@ asmlinkage void post_ttbr_update_workaro
 			"ic iallu; dsb nsh; isb",
 			ARM64_WORKAROUND_CAVIUM_27456,
 			CONFIG_CAVIUM_ERRATUM_27456));
+
+	arm64_apply_bp_hardening();
 }
 
 static int asids_init(void)
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -590,6 +590,23 @@ asmlinkage void __exception do_mem_abort
 	arm64_notify_die("", regs, &info, esr);
 }
 
+asmlinkage void __exception do_el0_ia_bp_hardening(unsigned long addr,
+						   unsigned int esr,
+						   struct pt_regs *regs)
+{
+	/*
+	 * We've taken an instruction abort from userspace and not yet
+	 * re-enabled IRQs. If the address is a kernel address, apply
+	 * BP hardening prior to enabling IRQs and pre-emption.
+	 */
+	if (addr > TASK_SIZE)
+		arm64_apply_bp_hardening();
+
+	local_irq_enable();
+	do_mem_abort(addr, esr, regs);
+}
+
+
 /*
  * Handle stack alignment exceptions.
  */
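
For context, the arm64_apply_bp_hardening() calls used throughout this
series dispatch to the per-CPU callback installed above. A condensed
sketch of the <asm/mmu.h> helper (modulo the exact cpus_have_cap()
variant used by the 4.9 backport):

	static inline void arm64_apply_bp_hardening(void)
	{
		struct bp_hardening_data *d;

		if (!cpus_have_cap(ARM64_HARDEN_BRANCH_PREDICTOR))
			return;

		d = arm64_get_bp_hardening_data();
		if (d->fn)
			d->fn();
	}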


* [PATCH 4.9 26/66] arm64: Move BP hardening to check_and_switch_context
From: Greg Kroah-Hartman @ 2018-04-17 15:58 UTC
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Will Deacon, Marc Zyngier, Catalin Marinas,
	Greg Hackmann, Mark Rutland

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mark Rutland <mark.rutland@arm.com>


From: Marc Zyngier <marc.zyngier@arm.com>

commit a8e4c0a919ae310944ed2c9ace11cf3ccd8a609b upstream.

We call arm64_apply_bp_hardening() from post_ttbr_update_workaround,
which has the unexpected consequence of being triggered on every
exception return to userspace when ARM64_SW_TTBR0_PAN is selected,
even if no context switch actually occurred.

This is a bit suboptimal, and it would be more logical to only
invalidate the branch predictor when we actually switch to
a different mm.

In order to solve this, move the call to arm64_apply_bp_hardening()
into check_and_switch_context(), where we're guaranteed to pick
a different mm context.

Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com> [v4.9 backport]
Tested-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/arm64/mm/context.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

--- a/arch/arm64/mm/context.c
+++ b/arch/arm64/mm/context.c
@@ -230,6 +230,9 @@ void check_and_switch_context(struct mm_
 	raw_spin_unlock_irqrestore(&cpu_asid_lock, flags);
 
 switch_mm_fastpath:
+
+	arm64_apply_bp_hardening();
+
 	cpu_switch_mm(mm->pgd, mm);
 }
 
@@ -240,8 +243,6 @@ asmlinkage void post_ttbr_update_workaro
 			"ic iallu; dsb nsh; isb",
 			ARM64_WORKAROUND_CAVIUM_27456,
 			CONFIG_CAVIUM_ERRATUM_27456));
-
-	arm64_apply_bp_hardening();
 }
 
 static int asids_init(void)


* [PATCH 4.9 27/66] mm: Introduce lm_alias
From: Greg Kroah-Hartman @ 2018-04-17 15:59 UTC
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Mark Rutland, Laura Abbott, Will Deacon,
	Greg Hackmann

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mark Rutland <mark.rutland@arm.com>


From: Laura Abbott <labbott@redhat.com>

commit 568c5fe5a54f2654f5a4c599c45b8a62ed9a2013 upstream.

Certain architectures may have the kernel image mapped separately to
alias the linear map. Introduce a macro lm_alias to translate a kernel
image symbol into its linear alias. This is used in part with work to
add CONFIG_DEBUG_VIRTUAL support for arm64.

Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com> [v4.9 backport]
Tested-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/linux/mm.h |    4 ++++
 1 file changed, 4 insertions(+)

--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -76,6 +76,10 @@ extern int mmap_rnd_compat_bits __read_m
 #define page_to_virt(x)	__va(PFN_PHYS(page_to_pfn(x)))
 #endif
 
+#ifndef lm_alias
+#define lm_alias(x)	__va(__pa_symbol(x))
+#endif
+
 /*
  * To prevent common memory management code establishing
  * a zero page mapping on a read fault.
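
As a usage illustration (hypothetical snippet, not part of the patch),
translating a kernel-image symbol such as _text into its linear-map
alias looks like:

	extern char _text[];	/* a kernel image symbol */

	/* lm_alias(_text) expands to __va(__pa_symbol(_text)): the
	 * same bytes, addressed through the linear mapping. */
	void *linear = lm_alias(_text);

The arm64 KVM code uses exactly this to reach the hardened vectors
through the linear map in the next patch.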


* [PATCH 4.9 28/66] arm64: KVM: Use per-CPU vector when BP hardening is enabled
From: Greg Kroah-Hartman @ 2018-04-17 15:59 UTC
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Marc Zyngier, Will Deacon, Catalin Marinas,
	Greg Hackmann, Mark Rutland

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mark Rutland <mark.rutland@arm.com>


From: Marc Zyngier <marc.zyngier@arm.com>

commit 6840bdd73d07216ab4bc46f5a8768c37ea519038 upstream.

Now that we have per-CPU vectors, let's plug them into the KVM/arm64 code.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
[v4.9: account for files moved to virt/ upstream, use cpus_have_cap()]
Signed-off-by: Mark Rutland <mark.rutland@arm.com> [v4.9 backport]
Tested-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/arm/include/asm/kvm_mmu.h   |   10 ++++++++++
 arch/arm/kvm/arm.c               |    9 ++++++++-
 arch/arm64/include/asm/kvm_mmu.h |   38 ++++++++++++++++++++++++++++++++++++++
 arch/arm64/kvm/hyp/switch.c      |    2 +-
 4 files changed, 57 insertions(+), 2 deletions(-)

--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -223,6 +223,16 @@ static inline unsigned int kvm_get_vmid_
 	return 8;
 }
 
+static inline void *kvm_get_hyp_vector(void)
+{
+	return kvm_ksym_ref(__kvm_hyp_vector);
+}
+
+static inline int kvm_map_vectors(void)
+{
+	return 0;
+}
+
 #endif	/* !__ASSEMBLY__ */
 
 #endif /* __ARM_KVM_MMU_H__ */
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -1088,7 +1088,7 @@ static void cpu_init_hyp_mode(void *dumm
 	pgd_ptr = kvm_mmu_get_httbr();
 	stack_page = __this_cpu_read(kvm_arm_hyp_stack_page);
 	hyp_stack_ptr = stack_page + PAGE_SIZE;
-	vector_ptr = (unsigned long)kvm_ksym_ref(__kvm_hyp_vector);
+	vector_ptr = (unsigned long)kvm_get_hyp_vector();
 
 	__cpu_init_hyp_mode(pgd_ptr, hyp_stack_ptr, vector_ptr);
 	__cpu_init_stage2();
@@ -1345,6 +1345,13 @@ static int init_hyp_mode(void)
 		goto out_err;
 	}
 
+
+	err = kvm_map_vectors();
+	if (err) {
+		kvm_err("Cannot map vectors\n");
+		goto out_err;
+	}
+
 	/*
 	 * Map the Hyp stack pages
 	 */
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -313,5 +313,43 @@ static inline unsigned int kvm_get_vmid_
 	return (cpuid_feature_extract_unsigned_field(reg, ID_AA64MMFR1_VMIDBITS_SHIFT) == 2) ? 16 : 8;
 }
 
+#ifdef CONFIG_HARDEN_BRANCH_PREDICTOR
+#include <asm/mmu.h>
+
+static inline void *kvm_get_hyp_vector(void)
+{
+	struct bp_hardening_data *data = arm64_get_bp_hardening_data();
+	void *vect = kvm_ksym_ref(__kvm_hyp_vector);
+
+	if (data->fn) {
+		vect = __bp_harden_hyp_vecs_start +
+		       data->hyp_vectors_slot * SZ_2K;
+
+		if (!cpus_have_cap(ARM64_HAS_VIRT_HOST_EXTN))
+			vect = lm_alias(vect);
+	}
+
+	return vect;
+}
+
+static inline int kvm_map_vectors(void)
+{
+	return create_hyp_mappings(kvm_ksym_ref(__bp_harden_hyp_vecs_start),
+				   kvm_ksym_ref(__bp_harden_hyp_vecs_end),
+				   PAGE_HYP_EXEC);
+}
+
+#else
+static inline void *kvm_get_hyp_vector(void)
+{
+	return kvm_ksym_ref(__kvm_hyp_vector);
+}
+
+static inline int kvm_map_vectors(void)
+{
+	return 0;
+}
+#endif
+
 #endif /* __ASSEMBLY__ */
 #endif /* __ARM64_KVM_MMU_H__ */
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -50,7 +50,7 @@ static void __hyp_text __activate_traps_
 	val &= ~CPACR_EL1_FPEN;
 	write_sysreg(val, cpacr_el1);
 
-	write_sysreg(__kvm_hyp_vector, vbar_el1);
+	write_sysreg(kvm_get_hyp_vector(), vbar_el1);
 }
 
 static void __hyp_text __activate_traps_nvhe(void)
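
Note the lm_alias() conversion in kvm_get_hyp_vector(): on non-VHE
systems the hyp VA layout is derived from the kernel's linear map, so
the kernel-image address of a vector slot must be translated before
EL2 can use it. A minimal sketch of the address computation, reusing
the names above (the has_vhe condition is illustrative):

	/* slot n starts n * 2KB into the replicated vector area */
	void *vect = __bp_harden_hyp_vecs_start + slot * SZ_2K;

	if (!has_vhe)
		vect = lm_alias(vect);	/* image address -> linear map */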


* [PATCH 4.9 29/66] arm64: entry: Apply BP hardening for high-priority synchronous exceptions
From: Greg Kroah-Hartman @ 2018-04-17 15:59 UTC
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Dan Hettena, Marc Zyngier, Will Deacon,
	Catalin Marinas, Greg Hackmann, Mark Rutland

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mark Rutland <mark.rutland@arm.com>


From: Will Deacon <will.deacon@arm.com>

commit 5dfc6ed27710c42cbc15db5c0d4475699991da0a upstream.

Software-step and PC alignment fault exceptions have higher priority than
instruction abort exceptions, so apply the BP hardening hooks there too
if the user PC appears to reside in kernel space.

Reported-by: Dan Hettena <dhettena@nvidia.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com> [v4.9 backport]
Tested-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/arm64/kernel/entry.S |    6 ++++--
 arch/arm64/mm/fault.c     |    9 +++++++++
 2 files changed, 13 insertions(+), 2 deletions(-)

--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -624,8 +624,10 @@ el0_sp_pc:
 	 * Stack or PC alignment exception handling
 	 */
 	mrs	x26, far_el1
-	// enable interrupts before calling the main handler
-	enable_dbg_and_irq
+	enable_dbg
+#ifdef CONFIG_TRACE_IRQFLAGS
+	bl	trace_hardirqs_off
+#endif
 	ct_user_exit
 	mov	x0, x26
 	mov	x1, x25
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -617,6 +617,12 @@ asmlinkage void __exception do_sp_pc_abo
 	struct siginfo info;
 	struct task_struct *tsk = current;
 
+	if (user_mode(regs)) {
+		if (instruction_pointer(regs) > TASK_SIZE)
+			arm64_apply_bp_hardening();
+		local_irq_enable();
+	}
+
 	if (show_unhandled_signals && unhandled_signal(tsk, SIGBUS))
 		pr_info_ratelimited("%s[%d]: %s exception: pc=%p sp=%p\n",
 				    tsk->comm, task_pid_nr(tsk),
@@ -676,6 +682,9 @@ asmlinkage int __exception do_debug_exce
 	if (interrupts_enabled(regs))
 		trace_hardirqs_off();
 
+	if (user_mode(regs) && instruction_pointer(regs) > TASK_SIZE)
+		arm64_apply_bp_hardening();
+
 	if (!inf->fn(addr, esr, regs)) {
 		rv = 1;
 	} else {


* [PATCH 4.9 30/66] arm64: entry: Apply BP hardening for suspicious interrupts from EL0
From: Greg Kroah-Hartman @ 2018-04-17 15:59 UTC
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Dan Hettena, Marc Zyngier, Will Deacon,
	Catalin Marinas, Greg Hackmann, Mark Rutland

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mark Rutland <mark.rutland@arm.com>


From: Will Deacon <will.deacon@arm.com>

commit 30d88c0e3ace625a92eead9ca0ad94093a8f59fe upstream.

It is possible to take an IRQ from EL0 following a branch to a kernel
address in such a way that the IRQ is prioritised over the instruction
abort. Whilst an attacker would need to get the stars to align here,
it might be achievable with enough calibration, so perform BP hardening
in the rare case that we see a kernel address in the ELR when handling
an IRQ from EL0.

Reported-by: Dan Hettena <dhettena@nvidia.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com> [v4.9 backport]
Tested-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/arm64/kernel/entry.S |    5 +++++
 arch/arm64/mm/fault.c     |    6 ++++++
 2 files changed, 11 insertions(+)

--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -686,6 +686,11 @@ el0_irq_naked:
 #endif
 
 	ct_user_exit
+#ifdef CONFIG_HARDEN_BRANCH_PREDICTOR
+	tbz	x22, #55, 1f
+	bl	do_el0_irq_bp_hardening
+1:
+#endif
 	irq_handler
 
 #ifdef CONFIG_TRACE_IRQFLAGS
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -590,6 +590,12 @@ asmlinkage void __exception do_mem_abort
 	arm64_notify_die("", regs, &info, esr);
 }
 
+asmlinkage void __exception do_el0_irq_bp_hardening(void)
+{
+	/* PC has already been checked in entry.S */
+	arm64_apply_bp_hardening();
+}
+
 asmlinkage void __exception do_el0_ia_bp_hardening(unsigned long addr,
 						   unsigned int esr,
 						   struct pt_regs *regs)
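
The tbz in the entry path tests bit 55 of the ELR saved in x22: with
arm64's split VA space (and top-byte-ignore), bit 55 selects the TTBR1
(kernel) half, so it is a cheap way to spot a kernel address without
loading TASK_SIZE. A rough C equivalent (illustration only; elr is a
hypothetical variable holding the saved ELR_EL1):

	if (elr & (1UL << 55))	/* kernel (TTBR1) address */
		do_el0_irq_bp_hardening();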


* [PATCH 4.9 31/66] arm64: cputype: Add missing MIDR values for Cortex-A72 and Cortex-A75
From: Greg Kroah-Hartman @ 2018-04-17 15:59 UTC
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Will Deacon, Catalin Marinas, Greg Hackmann,
	Mark Rutland

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mark Rutland <mark.rutland@arm.com>


From: Will Deacon <will.deacon@arm.com>

commit a65d219fe5dc7887fd5ca04c2ac3e9a34feb8dfc upstream.

Hook up MIDR values for the Cortex-A72 and Cortex-A75 CPUs, since they
will soon need MIDR matches for hardening the branch predictor.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com> [v4.9 backport]
Tested-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/arm64/include/asm/cputype.h |    6 ++++++
 1 file changed, 6 insertions(+)

--- a/arch/arm64/include/asm/cputype.h
+++ b/arch/arm64/include/asm/cputype.h
@@ -75,7 +75,10 @@
 #define ARM_CPU_PART_AEM_V8		0xD0F
 #define ARM_CPU_PART_FOUNDATION		0xD00
 #define ARM_CPU_PART_CORTEX_A57		0xD07
+#define ARM_CPU_PART_CORTEX_A72		0xD08
 #define ARM_CPU_PART_CORTEX_A53		0xD03
+#define ARM_CPU_PART_CORTEX_A73		0xD09
+#define ARM_CPU_PART_CORTEX_A75		0xD0A
 
 #define APM_CPU_PART_POTENZA		0x000
 
@@ -87,6 +90,9 @@
 
 #define MIDR_CORTEX_A53 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A53)
 #define MIDR_CORTEX_A57 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A57)
+#define MIDR_CORTEX_A72 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A72)
+#define MIDR_CORTEX_A73 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A73)
+#define MIDR_CORTEX_A75 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A75)
 #define MIDR_THUNDERX	MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX)
 #define MIDR_THUNDERX_81XX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX_81XX)
 #define MIDR_CAVIUM_THUNDERX2 MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX2)
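
For reference, MIDR_CPU_MODEL() assembles the MIDR_EL1 layout:
implementer at bit 24, architecture (0xf) at bit 16, part number at
bit 4. A worked value, assuming the standard field encodings:

	/* Cortex-A72: implementer 0x41 (Arm), part number 0xD08   */
	/* (0x41 << 24) | (0xf << 16) | (0xD08 << 4) == 0x410fd080 */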


* [PATCH 4.9 32/66] arm64: cpu_errata: Allow an erratum to be matched for all revisions of a core
From: Greg Kroah-Hartman @ 2018-04-17 15:59 UTC
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Thomas Gleixner, Mark Rutland,
	Daniel Lezcano, Suzuki K Poulose, Marc Zyngier, Greg Hackmann

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mark Rutland <mark.rutland@arm.com>


From: Marc Zyngier <marc.zyngier@arm.com>

commit 06f1494f837da8997d670a1ba87add7963b08922 upstream.

Some minor errata may not be fixed in further revisions of a core,
leading to a situation where the workaround needs to be updated each
time an updated core is released.

Introduce a MIDR_ALL_VERSIONS match helper that will work for all
versions of that MIDR, once and for all.

Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com> [v4.9 backport]
Tested-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/arm64/kernel/cpu_errata.c |    7 +++++++
 1 file changed, 7 insertions(+)

--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -127,6 +127,13 @@ static void  install_bp_hardening_cb(con
 	.midr_range_min = min, \
 	.midr_range_max = max
 
+#define MIDR_ALL_VERSIONS(model) \
+	.def_scope = SCOPE_LOCAL_CPU, \
+	.matches = is_affected_midr_range, \
+	.midr_model = model, \
+	.midr_range_min = 0, \
+	.midr_range_max = (MIDR_VARIANT_MASK | MIDR_REVISION_MASK)
+
 const struct arm64_cpu_capabilities arm64_errata[] = {
 #if	defined(CONFIG_ARM64_ERRATUM_826319) || \
 	defined(CONFIG_ARM64_ERRATUM_827319) || \


* [PATCH 4.9 33/66] arm64: Implement branch predictor hardening for affected Cortex-A CPUs
From: Greg Kroah-Hartman @ 2018-04-17 15:59 UTC
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Will Deacon, Catalin Marinas, Greg Hackmann,
	Mark Rutland

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mark Rutland <mark.rutland@arm.com>


From: Will Deacon <will.deacon@arm.com>

commit aa6acde65e03186b5add8151e1ffe36c3c62639b upstream.

Cortex-A57, A72, A73 and A75 are susceptible to branch predictor aliasing
and can theoretically be attacked by malicious code.

This patch implements a PSCI-based mitigation for these CPUs when available.
The call into firmware will invalidate the branch predictor state, preventing
any malicious entries from affecting other victim contexts.

Co-developed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com> [v4.9 backport]
Tested-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/arm64/kernel/bpi.S        |   24 +++++++++++++++++++++++
 arch/arm64/kernel/cpu_errata.c |   42 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 66 insertions(+)

--- a/arch/arm64/kernel/bpi.S
+++ b/arch/arm64/kernel/bpi.S
@@ -53,3 +53,27 @@ ENTRY(__bp_harden_hyp_vecs_start)
 	vectors __kvm_hyp_vector
 	.endr
 ENTRY(__bp_harden_hyp_vecs_end)
+ENTRY(__psci_hyp_bp_inval_start)
+	sub	sp, sp, #(8 * 18)
+	stp	x16, x17, [sp, #(16 * 0)]
+	stp	x14, x15, [sp, #(16 * 1)]
+	stp	x12, x13, [sp, #(16 * 2)]
+	stp	x10, x11, [sp, #(16 * 3)]
+	stp	x8, x9, [sp, #(16 * 4)]
+	stp	x6, x7, [sp, #(16 * 5)]
+	stp	x4, x5, [sp, #(16 * 6)]
+	stp	x2, x3, [sp, #(16 * 7)]
+	stp	x0, x1, [sp, #(16 * 8)]
+	mov	x0, #0x84000000
+	smc	#0
+	ldp	x16, x17, [sp, #(16 * 0)]
+	ldp	x14, x15, [sp, #(16 * 1)]
+	ldp	x12, x13, [sp, #(16 * 2)]
+	ldp	x10, x11, [sp, #(16 * 3)]
+	ldp	x8, x9, [sp, #(16 * 4)]
+	ldp	x6, x7, [sp, #(16 * 5)]
+	ldp	x4, x5, [sp, #(16 * 6)]
+	ldp	x2, x3, [sp, #(16 * 7)]
+	ldp	x0, x1, [sp, #(16 * 8)]
+	add	sp, sp, #(8 * 18)
+ENTRY(__psci_hyp_bp_inval_end)
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -53,6 +53,8 @@ static int cpu_enable_trap_ctr_access(vo
 DEFINE_PER_CPU_READ_MOSTLY(struct bp_hardening_data, bp_hardening_data);
 
 #ifdef CONFIG_KVM
+extern char __psci_hyp_bp_inval_start[], __psci_hyp_bp_inval_end[];
+
 static void __copy_hyp_vect_bpi(int slot, const char *hyp_vecs_start,
 				const char *hyp_vecs_end)
 {
@@ -94,6 +96,9 @@ static void __install_bp_hardening_cb(bp
 	spin_unlock(&bp_lock);
 }
 #else
+#define __psci_hyp_bp_inval_start	NULL
+#define __psci_hyp_bp_inval_end		NULL
+
 static void __install_bp_hardening_cb(bp_hardening_cb_t fn,
 				      const char *hyp_vecs_start,
 				      const char *hyp_vecs_end)
@@ -118,6 +123,21 @@ static void  install_bp_hardening_cb(con
 
 	__install_bp_hardening_cb(fn, hyp_vecs_start, hyp_vecs_end);
 }
+
+#include <linux/psci.h>
+
+static int enable_psci_bp_hardening(void *data)
+{
+	const struct arm64_cpu_capabilities *entry = data;
+
+	if (psci_ops.get_version)
+		install_bp_hardening_cb(entry,
+				       (bp_hardening_cb_t)psci_ops.get_version,
+				       __psci_hyp_bp_inval_start,
+				       __psci_hyp_bp_inval_end);
+
+	return 0;
+}
 #endif	/* CONFIG_HARDEN_BRANCH_PREDICTOR */
 
 #define MIDR_RANGE(model, min, max) \
@@ -211,6 +231,28 @@ const struct arm64_cpu_capabilities arm6
 		.def_scope = SCOPE_LOCAL_CPU,
 		.enable = cpu_enable_trap_ctr_access,
 	},
+#ifdef CONFIG_HARDEN_BRANCH_PREDICTOR
+	{
+		.capability = ARM64_HARDEN_BRANCH_PREDICTOR,
+		MIDR_ALL_VERSIONS(MIDR_CORTEX_A57),
+		.enable = enable_psci_bp_hardening,
+	},
+	{
+		.capability = ARM64_HARDEN_BRANCH_PREDICTOR,
+		MIDR_ALL_VERSIONS(MIDR_CORTEX_A72),
+		.enable = enable_psci_bp_hardening,
+	},
+	{
+		.capability = ARM64_HARDEN_BRANCH_PREDICTOR,
+		MIDR_ALL_VERSIONS(MIDR_CORTEX_A73),
+		.enable = enable_psci_bp_hardening,
+	},
+	{
+		.capability = ARM64_HARDEN_BRANCH_PREDICTOR,
+		MIDR_ALL_VERSIONS(MIDR_CORTEX_A75),
+		.enable = enable_psci_bp_hardening,
+	},
+#endif
 	{
 	}
 };
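
The immediate loaded into x0 in __psci_hyp_bp_inval_start, 0x84000000,
is PSCI_0_2_FN_PSCI_VERSION; the firmware call is made purely for its
side effect of invalidating the branch predictor, and its return value
is discarded. A compile-time cross-check one could drop into any
function (sketch):

	#include <linux/bug.h>
	#include <uapi/linux/psci.h>

	/* PSCI_0_2_FN_BASE is 0x84000000; PSCI_VERSION is function 0 */
	BUILD_BUG_ON(PSCI_0_2_FN_PSCI_VERSION != 0x84000000);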


* [PATCH 4.9 34/66] arm64: Branch predictor hardening for Cavium ThunderX2
From: Greg Kroah-Hartman @ 2018-04-17 15:59 UTC
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Will Deacon, Jayachandran C, Catalin Marinas,
	Greg Hackmann, Mark Rutland

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mark Rutland <mark.rutland@arm.com>


From: Jayachandran C <jnair@caviumnetworks.com>

commit f3d795d9b360523beca6d13ba64c2c532f601149 upstream.

Use a PSCI-based mitigation for speculative execution attacks targeting
the branch predictor. We use the same mechanism as the one used for
Cortex-A CPUs; we expect the PSCI version call to have the side effect
of clearing the BTBs.

Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Jayachandran C <jnair@caviumnetworks.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com> [v4.9 backport]
Tested-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/arm64/kernel/cpu_errata.c |   10 ++++++++++
 1 file changed, 10 insertions(+)

--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -252,6 +252,16 @@ const struct arm64_cpu_capabilities arm6
 		MIDR_ALL_VERSIONS(MIDR_CORTEX_A75),
 		.enable = enable_psci_bp_hardening,
 	},
+	{
+		.capability = ARM64_HARDEN_BRANCH_PREDICTOR,
+		MIDR_ALL_VERSIONS(MIDR_BRCM_VULCAN),
+		.enable = enable_psci_bp_hardening,
+	},
+	{
+		.capability = ARM64_HARDEN_BRANCH_PREDICTOR,
+		MIDR_ALL_VERSIONS(MIDR_CAVIUM_THUNDERX2),
+		.enable = enable_psci_bp_hardening,
+	},
 #endif
 	{
 	}


* [PATCH 4.9 35/66] arm64: KVM: Increment PC after handling an SMC trap
From: Greg Kroah-Hartman @ 2018-04-17 15:59 UTC
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Christoffer Dall, Ard Biesheuvel,
	Marc Zyngier, Catalin Marinas, Greg Hackmann, Mark Rutland

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mark Rutland <mark.rutland@arm.com>


From: Marc Zyngier <marc.zyngier@arm.com>

commit f5115e8869e1dfafac0e414b4f1664f3a84a4683 upstream.

When handling an SMC trap, the "preferred return address" is set
to that of the SMC, and not the next PC (which is a departure from
the behaviour of an SMC that isn't trapped).

Increment PC in the handler, as the guest is otherwise forever
stuck...

Cc: stable@vger.kernel.org
Fixes: acfb3b883f6d ("arm64: KVM: Fix SMCCC handling of unimplemented SMC/HVC calls")
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com> [v4.9 backport]
Tested-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/arm64/kvm/handle_exit.c |    9 +++++++++
 1 file changed, 9 insertions(+)

--- a/arch/arm64/kvm/handle_exit.c
+++ b/arch/arm64/kvm/handle_exit.c
@@ -53,7 +53,16 @@ static int handle_hvc(struct kvm_vcpu *v
 
 static int handle_smc(struct kvm_vcpu *vcpu, struct kvm_run *run)
 {
+	/*
+	 * "If an SMC instruction executed at Non-secure EL1 is
+	 * trapped to EL2 because HCR_EL2.TSC is 1, the exception is a
+	 * Trap exception, not a Secure Monitor Call exception [...]"
+	 *
+	 * We need to advance the PC after the trap, as it would
+	 * otherwise return to the same address...
+	 */
 	vcpu_set_reg(vcpu, 0, ~0UL);
+	kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));
 	return 1;
 }
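
kvm_skip_instr() consumes the ESR_EL2.IL bit to decide how far to
advance the guest PC: 4 bytes for a 32-bit instruction, 2 for a 16-bit
Thumb one. Roughly (a sketch of the effect, not the actual helper
body):

	*vcpu_pc(vcpu) += kvm_vcpu_trap_il_is32bit(vcpu) ? 4 : 2;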
 


* [PATCH 4.9 36/66] arm/arm64: KVM: Consolidate the PSCI include files
From: Greg Kroah-Hartman @ 2018-04-17 15:59 UTC
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Christoffer Dall, Ard Biesheuvel,
	Marc Zyngier, Catalin Marinas, Greg Hackmann, Mark Rutland

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mark Rutland <mark.rutland@arm.com>


From: Marc Zyngier <marc.zyngier@arm.com>

commit 1a2fb94e6a771ff94f4afa22497a4695187b820c upstream.

As we're about to update the PSCI support, and because I'm lazy,
let's move the PSCI include file to include/kvm so that both
ARM architectures can find it.

Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
[v4.9: account for files moved to virt/ upstream]
Signed-off-by: Mark Rutland <mark.rutland@arm.com> [v4.9 backport]
Tested-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/arm/include/asm/kvm_psci.h   |   27 ---------------------------
 arch/arm/kvm/arm.c                |    2 +-
 arch/arm/kvm/handle_exit.c        |    2 +-
 arch/arm/kvm/psci.c               |    3 ++-
 arch/arm64/include/asm/kvm_psci.h |   27 ---------------------------
 arch/arm64/kvm/handle_exit.c      |    5 ++++-
 include/kvm/arm_psci.h            |   27 +++++++++++++++++++++++++++
 7 files changed, 35 insertions(+), 58 deletions(-)
 delete mode 100644 arch/arm/include/asm/kvm_psci.h
 rename arch/arm64/include/asm/kvm_psci.h => include/kvm/arm_psci.h (89%)

--- a/arch/arm/include/asm/kvm_psci.h
+++ /dev/null
@@ -1,27 +0,0 @@
-/*
- * Copyright (C) 2012 - ARM Ltd
- * Author: Marc Zyngier <marc.zyngier@arm.com>
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program.  If not, see <http://www.gnu.org/licenses/>.
- */
-
-#ifndef __ARM_KVM_PSCI_H__
-#define __ARM_KVM_PSCI_H__
-
-#define KVM_ARM_PSCI_0_1	1
-#define KVM_ARM_PSCI_0_2	2
-
-int kvm_psci_version(struct kvm_vcpu *vcpu);
-int kvm_psci_call(struct kvm_vcpu *vcpu);
-
-#endif /* __ARM_KVM_PSCI_H__ */
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -29,6 +29,7 @@
 #include <linux/kvm.h>
 #include <trace/events/kvm.h>
 #include <kvm/arm_pmu.h>
+#include <kvm/arm_psci.h>
 
 #define CREATE_TRACE_POINTS
 #include "trace.h"
@@ -44,7 +45,6 @@
 #include <asm/kvm_mmu.h>
 #include <asm/kvm_emulate.h>
 #include <asm/kvm_coproc.h>
-#include <asm/kvm_psci.h>
 #include <asm/sections.h>
 
 #ifdef REQUIRES_VIRT
--- a/arch/arm/kvm/handle_exit.c
+++ b/arch/arm/kvm/handle_exit.c
@@ -21,7 +21,7 @@
 #include <asm/kvm_emulate.h>
 #include <asm/kvm_coproc.h>
 #include <asm/kvm_mmu.h>
-#include <asm/kvm_psci.h>
+#include <kvm/arm_psci.h>
 #include <trace/events/kvm.h>
 
 #include "trace.h"
--- a/arch/arm/kvm/psci.c
+++ b/arch/arm/kvm/psci.c
@@ -21,9 +21,10 @@
 
 #include <asm/cputype.h>
 #include <asm/kvm_emulate.h>
-#include <asm/kvm_psci.h>
 #include <asm/kvm_host.h>
 
+#include <kvm/arm_psci.h>
+
 #include <uapi/linux/psci.h>
 
 /*
--- a/arch/arm64/include/asm/kvm_psci.h
+++ /dev/null
@@ -1,27 +0,0 @@
-/*
- * Copyright (C) 2012,2013 - ARM Ltd
- * Author: Marc Zyngier <marc.zyngier@arm.com>
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program.  If not, see <http://www.gnu.org/licenses/>.
- */
-
-#ifndef __ARM64_KVM_PSCI_H__
-#define __ARM64_KVM_PSCI_H__
-
-#define KVM_ARM_PSCI_0_1	1
-#define KVM_ARM_PSCI_0_2	2
-
-int kvm_psci_version(struct kvm_vcpu *vcpu);
-int kvm_psci_call(struct kvm_vcpu *vcpu);
-
-#endif /* __ARM64_KVM_PSCI_H__ */
--- a/arch/arm64/kvm/handle_exit.c
+++ b/arch/arm64/kvm/handle_exit.c
@@ -22,12 +22,15 @@
 #include <linux/kvm.h>
 #include <linux/kvm_host.h>
 
+#include <kvm/arm_psci.h>
+
 #include <asm/esr.h>
 #include <asm/kvm_asm.h>
 #include <asm/kvm_coproc.h>
 #include <asm/kvm_emulate.h>
 #include <asm/kvm_mmu.h>
-#include <asm/kvm_psci.h>
+#include <asm/debug-monitors.h>
+#include <asm/traps.h>
 
 #define CREATE_TRACE_POINTS
 #include "trace.h"
--- /dev/null
+++ b/include/kvm/arm_psci.h
@@ -0,0 +1,27 @@
+/*
+ * Copyright (C) 2012,2013 - ARM Ltd
+ * Author: Marc Zyngier <marc.zyngier@arm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef __KVM_ARM_PSCI_H__
+#define __KVM_ARM_PSCI_H__
+
+#define KVM_ARM_PSCI_0_1	1
+#define KVM_ARM_PSCI_0_2	2
+
+int kvm_psci_version(struct kvm_vcpu *vcpu);
+int kvm_psci_call(struct kvm_vcpu *vcpu);
+
+#endif /* __KVM_ARM_PSCI_H__ */


* [PATCH 4.9 37/66] arm/arm64: KVM: Add PSCI_VERSION helper
From: Greg Kroah-Hartman @ 2018-04-17 15:59 UTC
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Christoffer Dall, Ard Biesheuvel,
	Marc Zyngier, Catalin Marinas, Greg Hackmann, Mark Rutland

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mark Rutland <mark.rutland@arm.com>


From: Marc Zyngier <marc.zyngier@arm.com>

commit d0a144f12a7ca8368933eae6583c096c363ec506 upstream.

As we're about to trigger a PSCI version explosion, it doesn't
hurt to introduce a PSCI_VERSION helper that is going to be
used everywhere.

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
[v4.9: account for files moved to virt/ upstream]
Signed-off-by: Mark Rutland <mark.rutland@arm.com> [v4.9 backport]
Tested-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/arm/kvm/psci.c       |    4 +---
 include/kvm/arm_psci.h    |    6 ++++--
 include/uapi/linux/psci.h |    3 +++
 3 files changed, 8 insertions(+), 5 deletions(-)

--- a/arch/arm/kvm/psci.c
+++ b/arch/arm/kvm/psci.c
@@ -25,8 +25,6 @@
 
 #include <kvm/arm_psci.h>
 
-#include <uapi/linux/psci.h>
-
 /*
  * This is an implementation of the Power State Coordination Interface
  * as described in ARM document number ARM DEN 0022A.
@@ -220,7 +218,7 @@ static int kvm_psci_0_2_call(struct kvm_
 		 * Bits[31:16] = Major Version = 0
 		 * Bits[15:0] = Minor Version = 2
 		 */
-		val = 2;
+		val = KVM_ARM_PSCI_0_2;
 		break;
 	case PSCI_0_2_FN_CPU_SUSPEND:
 	case PSCI_0_2_FN64_CPU_SUSPEND:
--- a/include/kvm/arm_psci.h
+++ b/include/kvm/arm_psci.h
@@ -18,8 +18,10 @@
 #ifndef __KVM_ARM_PSCI_H__
 #define __KVM_ARM_PSCI_H__
 
-#define KVM_ARM_PSCI_0_1	1
-#define KVM_ARM_PSCI_0_2	2
+#include <uapi/linux/psci.h>
+
+#define KVM_ARM_PSCI_0_1	PSCI_VERSION(0, 1)
+#define KVM_ARM_PSCI_0_2	PSCI_VERSION(0, 2)
 
 int kvm_psci_version(struct kvm_vcpu *vcpu);
 int kvm_psci_call(struct kvm_vcpu *vcpu);
--- a/include/uapi/linux/psci.h
+++ b/include/uapi/linux/psci.h
@@ -87,6 +87,9 @@
 		(((ver) & PSCI_VERSION_MAJOR_MASK) >> PSCI_VERSION_MAJOR_SHIFT)
 #define PSCI_VERSION_MINOR(ver)			\
 		((ver) & PSCI_VERSION_MINOR_MASK)
+#define PSCI_VERSION(maj, min)						\
+	((((maj) << PSCI_VERSION_MAJOR_SHIFT) & PSCI_VERSION_MAJOR_MASK) | \
+	 ((min) & PSCI_VERSION_MINOR_MASK))
 
 /* PSCI features decoding (>=1.0) */
 #define PSCI_1_0_FEATURES_CPU_SUSPEND_PF_SHIFT	1
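
With the 16-bit major/minor split above, the helper yields, for
example (worked values):

	PSCI_VERSION(0, 2)	/* == 0x00000002 */
	PSCI_VERSION(1, 0)	/* == 0x00010000 */

so the KVM_ARM_PSCI_* constants keep their natural ordering and stay
directly comparable.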


* [PATCH 4.9 38/66] arm/arm64: KVM: Add smccc accessors to PSCI code
From: Greg Kroah-Hartman @ 2018-04-17 15:59 UTC
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Christoffer Dall, Ard Biesheuvel,
	Marc Zyngier, Catalin Marinas, Greg Hackmann, Mark Rutland

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mark Rutland <mark.rutland@arm.com>


From: Marc Zyngier <marc.zyngier@arm.com>

commit 84684fecd7ea381824a96634a027b7719587fb77 upstream.

Instead of open coding the accesses to the various registers,
let's add explicit SMCCC accessors.

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
[v4.9: account for files moved to virt/ upstream]
Signed-off-by: Mark Rutland <mark.rutland@arm.com> [v4.9 backport]
Tested-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/arm/kvm/psci.c |   52 ++++++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 42 insertions(+), 10 deletions(-)

--- a/arch/arm/kvm/psci.c
+++ b/arch/arm/kvm/psci.c
@@ -32,6 +32,38 @@
 
 #define AFFINITY_MASK(level)	~((0x1UL << ((level) * MPIDR_LEVEL_BITS)) - 1)
 
+static u32 smccc_get_function(struct kvm_vcpu *vcpu)
+{
+	return vcpu_get_reg(vcpu, 0);
+}
+
+static unsigned long smccc_get_arg1(struct kvm_vcpu *vcpu)
+{
+	return vcpu_get_reg(vcpu, 1);
+}
+
+static unsigned long smccc_get_arg2(struct kvm_vcpu *vcpu)
+{
+	return vcpu_get_reg(vcpu, 2);
+}
+
+static unsigned long smccc_get_arg3(struct kvm_vcpu *vcpu)
+{
+	return vcpu_get_reg(vcpu, 3);
+}
+
+static void smccc_set_retval(struct kvm_vcpu *vcpu,
+			     unsigned long a0,
+			     unsigned long a1,
+			     unsigned long a2,
+			     unsigned long a3)
+{
+	vcpu_set_reg(vcpu, 0, a0);
+	vcpu_set_reg(vcpu, 1, a1);
+	vcpu_set_reg(vcpu, 2, a2);
+	vcpu_set_reg(vcpu, 3, a3);
+}
+
 static unsigned long psci_affinity_mask(unsigned long affinity_level)
 {
 	if (affinity_level <= 3)
@@ -74,7 +106,7 @@ static unsigned long kvm_psci_vcpu_on(st
 	unsigned long context_id;
 	phys_addr_t target_pc;
 
-	cpu_id = vcpu_get_reg(source_vcpu, 1) & MPIDR_HWID_BITMASK;
+	cpu_id = smccc_get_arg1(source_vcpu) & MPIDR_HWID_BITMASK;
 	if (vcpu_mode_is_32bit(source_vcpu))
 		cpu_id &= ~((u32) 0);
 
@@ -93,8 +125,8 @@ static unsigned long kvm_psci_vcpu_on(st
 			return PSCI_RET_INVALID_PARAMS;
 	}
 
-	target_pc = vcpu_get_reg(source_vcpu, 2);
-	context_id = vcpu_get_reg(source_vcpu, 3);
+	target_pc = smccc_get_arg2(source_vcpu);
+	context_id = smccc_get_arg3(source_vcpu);
 
 	kvm_reset_vcpu(vcpu);
 
@@ -113,7 +145,7 @@ static unsigned long kvm_psci_vcpu_on(st
 	 * NOTE: We always update r0 (or x0) because for PSCI v0.1
 	 * the general puspose registers are undefined upon CPU_ON.
 	 */
-	vcpu_set_reg(vcpu, 0, context_id);
+	smccc_set_retval(vcpu, context_id, 0, 0, 0);
 	vcpu->arch.power_off = false;
 	smp_mb();		/* Make sure the above is visible */
 
@@ -133,8 +165,8 @@ static unsigned long kvm_psci_vcpu_affin
 	struct kvm *kvm = vcpu->kvm;
 	struct kvm_vcpu *tmp;
 
-	target_affinity = vcpu_get_reg(vcpu, 1);
-	lowest_affinity_level = vcpu_get_reg(vcpu, 2);
+	target_affinity = smccc_get_arg1(vcpu);
+	lowest_affinity_level = smccc_get_arg2(vcpu);
 
 	/* Determine target affinity mask */
 	target_affinity_mask = psci_affinity_mask(lowest_affinity_level);
@@ -208,7 +240,7 @@ int kvm_psci_version(struct kvm_vcpu *vc
 static int kvm_psci_0_2_call(struct kvm_vcpu *vcpu)
 {
 	struct kvm *kvm = vcpu->kvm;
-	unsigned long psci_fn = vcpu_get_reg(vcpu, 0) & ~((u32) 0);
+	unsigned long psci_fn = smccc_get_function(vcpu);
 	unsigned long val;
 	int ret = 1;
 
@@ -275,14 +307,14 @@ static int kvm_psci_0_2_call(struct kvm_
 		break;
 	}
 
-	vcpu_set_reg(vcpu, 0, val);
+	smccc_set_retval(vcpu, val, 0, 0, 0);
 	return ret;
 }
 
 static int kvm_psci_0_1_call(struct kvm_vcpu *vcpu)
 {
 	struct kvm *kvm = vcpu->kvm;
-	unsigned long psci_fn = vcpu_get_reg(vcpu, 0) & ~((u32) 0);
+	unsigned long psci_fn = smccc_get_function(vcpu);
 	unsigned long val;
 
 	switch (psci_fn) {
@@ -300,7 +332,7 @@ static int kvm_psci_0_1_call(struct kvm_
 		break;
 	}
 
-	vcpu_set_reg(vcpu, 0, val);
+	smccc_set_retval(vcpu, val, 0, 0, 0);
 	return 1;
 }
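
The accessors simply map the SMCCC argument and return slots onto the
first four GP registers (r0-r3/x0-x3). A minimal consumer could look
like this (hypothetical handler, sketch only):

	static int example_hvc_handler(struct kvm_vcpu *vcpu)
	{
		u32 fn = smccc_get_function(vcpu);	/* from r0/x0 */
		unsigned long a1 = smccc_get_arg1(vcpu);

		/* ... decode fn and a1 here ... */
		smccc_set_retval(vcpu, PSCI_RET_NOT_SUPPORTED, 0, 0, 0);
		return 1;	/* handled, resume the guest */
	}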
 


* [PATCH 4.9 39/66] arm/arm64: KVM: Implement PSCI 1.0 support
From: Greg Kroah-Hartman @ 2018-04-17 15:59 UTC
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Christoffer Dall, Ard Biesheuvel,
	Marc Zyngier, Catalin Marinas, Greg Hackmann, Mark Rutland

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mark Rutland <mark.rutland@arm.com>


From: Marc Zyngier <marc.zyngier@arm.com>

commit 58e0b2239a4d997094ba63986ef4de29ddc91d87 upstream.

PSCI 1.0 can be trivially implemented by providing the FEATURES
call on top of PSCI 0.2 and returning 1.0 as the PSCI version.

We happily ignore everything else, as the remaining features are
either optional or clarifications that do not require any additional
change.

PSCI 1.0 is now the default until we decide to add a userspace
selection API.

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
[v4.9: account for files moved to virt/ upstream]
Signed-off-by: Mark Rutland <mark.rutland@arm.com> [v4.9 backport]
Tested-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/arm/kvm/psci.c    |   45 ++++++++++++++++++++++++++++++++++++++++++++-
 include/kvm/arm_psci.h |    3 +++
 2 files changed, 47 insertions(+), 1 deletion(-)

--- a/arch/arm/kvm/psci.c
+++ b/arch/arm/kvm/psci.c
@@ -232,7 +232,7 @@ static void kvm_psci_system_reset(struct
 int kvm_psci_version(struct kvm_vcpu *vcpu)
 {
 	if (test_bit(KVM_ARM_VCPU_PSCI_0_2, vcpu->arch.features))
-		return KVM_ARM_PSCI_0_2;
+		return KVM_ARM_PSCI_LATEST;
 
 	return KVM_ARM_PSCI_0_1;
 }
@@ -311,6 +311,47 @@ static int kvm_psci_0_2_call(struct kvm_
 	return ret;
 }
 
+static int kvm_psci_1_0_call(struct kvm_vcpu *vcpu)
+{
+	u32 psci_fn = smccc_get_function(vcpu);
+	u32 feature;
+	unsigned long val;
+	int ret = 1;
+
+	switch(psci_fn) {
+	case PSCI_0_2_FN_PSCI_VERSION:
+		val = KVM_ARM_PSCI_1_0;
+		break;
+	case PSCI_1_0_FN_PSCI_FEATURES:
+		feature = smccc_get_arg1(vcpu);
+		switch(feature) {
+		case PSCI_0_2_FN_PSCI_VERSION:
+		case PSCI_0_2_FN_CPU_SUSPEND:
+		case PSCI_0_2_FN64_CPU_SUSPEND:
+		case PSCI_0_2_FN_CPU_OFF:
+		case PSCI_0_2_FN_CPU_ON:
+		case PSCI_0_2_FN64_CPU_ON:
+		case PSCI_0_2_FN_AFFINITY_INFO:
+		case PSCI_0_2_FN64_AFFINITY_INFO:
+		case PSCI_0_2_FN_MIGRATE_INFO_TYPE:
+		case PSCI_0_2_FN_SYSTEM_OFF:
+		case PSCI_0_2_FN_SYSTEM_RESET:
+		case PSCI_1_0_FN_PSCI_FEATURES:
+			val = 0;
+			break;
+		default:
+			val = PSCI_RET_NOT_SUPPORTED;
+			break;
+		}
+		break;
+	default:
+		return kvm_psci_0_2_call(vcpu);
+	}
+
+	smccc_set_retval(vcpu, val, 0, 0, 0);
+	return ret;
+}
+
 static int kvm_psci_0_1_call(struct kvm_vcpu *vcpu)
 {
 	struct kvm *kvm = vcpu->kvm;
@@ -353,6 +394,8 @@ static int kvm_psci_0_1_call(struct kvm_
 int kvm_psci_call(struct kvm_vcpu *vcpu)
 {
 	switch (kvm_psci_version(vcpu)) {
+	case KVM_ARM_PSCI_1_0:
+		return kvm_psci_1_0_call(vcpu);
 	case KVM_ARM_PSCI_0_2:
 		return kvm_psci_0_2_call(vcpu);
 	case KVM_ARM_PSCI_0_1:
--- a/include/kvm/arm_psci.h
+++ b/include/kvm/arm_psci.h
@@ -22,6 +22,9 @@
 
 #define KVM_ARM_PSCI_0_1	PSCI_VERSION(0, 1)
 #define KVM_ARM_PSCI_0_2	PSCI_VERSION(0, 2)
+#define KVM_ARM_PSCI_1_0	PSCI_VERSION(1, 0)
+
+#define KVM_ARM_PSCI_LATEST	KVM_ARM_PSCI_1_0
 
 int kvm_psci_version(struct kvm_vcpu *vcpu);
 int kvm_psci_call(struct kvm_vcpu *vcpu);
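
From the guest's point of view the contract is: query PSCI_VERSION
(now reporting 1.0), then probe individual calls through PSCI_FEATURES
before relying on them. A guest-side probe might look like this
(invoke_psci_fn standing in for the guest's SMC/HVC conduit):

	int ret = invoke_psci_fn(PSCI_1_0_FN_PSCI_FEATURES,
				 PSCI_0_2_FN_CPU_ON, 0, 0);
	/* ret == 0: supported; PSCI_RET_NOT_SUPPORTED: absent */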


* [PATCH 4.9 40/66] arm/arm64: KVM: Advertise SMCCC v1.1
From: Greg Kroah-Hartman @ 2018-04-17 15:59 UTC
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Ard Biesheuvel, Christoffer Dall,
	Marc Zyngier, Catalin Marinas, Greg Hackmann, Mark Rutland

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mark Rutland <mark.rutland@arm.com>


From: Marc Zyngier <marc.zyngier@arm.com>

commit 09e6be12effdb33bf7210c8867bbd213b66a499e upstream.

The new SMC Calling Convention (v1.1) allows for a reduced overhead
when calling into the firmware, and provides a new feature discovery
mechanism.

Make it visible to KVM guests.

Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
[v4.9: account for files moved to virt/ upstream]
Signed-off-by: Mark Rutland <mark.rutland@arm.com> [v4.9 backport]
Tested-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/arm/kvm/handle_exit.c   |    2 +-
 arch/arm/kvm/psci.c          |   24 +++++++++++++++++++++++-
 arch/arm64/kvm/handle_exit.c |    2 +-
 include/kvm/arm_psci.h       |    2 +-
 include/linux/arm-smccc.h    |   13 +++++++++++++
 5 files changed, 39 insertions(+), 4 deletions(-)

--- a/arch/arm/kvm/handle_exit.c
+++ b/arch/arm/kvm/handle_exit.c
@@ -36,7 +36,7 @@ static int handle_hvc(struct kvm_vcpu *v
 		      kvm_vcpu_hvc_get_imm(vcpu));
 	vcpu->stat.hvc_exit_stat++;
 
-	ret = kvm_psci_call(vcpu);
+	ret = kvm_hvc_call_handler(vcpu);
 	if (ret < 0) {
 		vcpu_set_reg(vcpu, 0, ~0UL);
 		return 1;
--- a/arch/arm/kvm/psci.c
+++ b/arch/arm/kvm/psci.c
@@ -15,6 +15,7 @@
  * along with this program.  If not, see <http://www.gnu.org/licenses/>.
  */
 
+#include <linux/arm-smccc.h>
 #include <linux/preempt.h>
 #include <linux/kvm_host.h>
 #include <linux/wait.h>
@@ -337,6 +338,7 @@ static int kvm_psci_1_0_call(struct kvm_
 		case PSCI_0_2_FN_SYSTEM_OFF:
 		case PSCI_0_2_FN_SYSTEM_RESET:
 		case PSCI_1_0_FN_PSCI_FEATURES:
+		case ARM_SMCCC_VERSION_FUNC_ID:
 			val = 0;
 			break;
 		default:
@@ -391,7 +393,7 @@ static int kvm_psci_0_1_call(struct kvm_
  * Errors:
  * -EINVAL: Unrecognized PSCI function
  */
-int kvm_psci_call(struct kvm_vcpu *vcpu)
+static int kvm_psci_call(struct kvm_vcpu *vcpu)
 {
 	switch (kvm_psci_version(vcpu)) {
 	case KVM_ARM_PSCI_1_0:
@@ -404,3 +406,23 @@ int kvm_psci_call(struct kvm_vcpu *vcpu)
 		return -EINVAL;
 	};
 }
+
+int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
+{
+	u32 func_id = smccc_get_function(vcpu);
+	u32 val = PSCI_RET_NOT_SUPPORTED;
+
+	switch (func_id) {
+	case ARM_SMCCC_VERSION_FUNC_ID:
+		val = ARM_SMCCC_VERSION_1_1;
+		break;
+	case ARM_SMCCC_ARCH_FEATURES_FUNC_ID:
+		/* Nothing supported yet */
+		break;
+	default:
+		return kvm_psci_call(vcpu);
+	}
+
+	smccc_set_retval(vcpu, val, 0, 0, 0);
+	return 1;
+}
--- a/arch/arm64/kvm/handle_exit.c
+++ b/arch/arm64/kvm/handle_exit.c
@@ -45,7 +45,7 @@ static int handle_hvc(struct kvm_vcpu *v
 			    kvm_vcpu_hvc_get_imm(vcpu));
 	vcpu->stat.hvc_exit_stat++;
 
-	ret = kvm_psci_call(vcpu);
+	ret = kvm_hvc_call_handler(vcpu);
 	if (ret < 0) {
 		vcpu_set_reg(vcpu, 0, ~0UL);
 		return 1;
--- a/include/kvm/arm_psci.h
+++ b/include/kvm/arm_psci.h
@@ -27,6 +27,6 @@
 #define KVM_ARM_PSCI_LATEST	KVM_ARM_PSCI_1_0
 
 int kvm_psci_version(struct kvm_vcpu *vcpu);
-int kvm_psci_call(struct kvm_vcpu *vcpu);
+int kvm_hvc_call_handler(struct kvm_vcpu *vcpu);
 
 #endif /* __KVM_ARM_PSCI_H__ */
--- a/include/linux/arm-smccc.h
+++ b/include/linux/arm-smccc.h
@@ -60,6 +60,19 @@
 #define ARM_SMCCC_QUIRK_NONE		0
 #define ARM_SMCCC_QUIRK_QCOM_A6		1 /* Save/restore register a6 */
 
+#define ARM_SMCCC_VERSION_1_0		0x10000
+#define ARM_SMCCC_VERSION_1_1		0x10001
+
+#define ARM_SMCCC_VERSION_FUNC_ID					\
+	ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL,				\
+			   ARM_SMCCC_SMC_32,				\
+			   0, 0)
+
+#define ARM_SMCCC_ARCH_FEATURES_FUNC_ID					\
+	ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL,				\
+			   ARM_SMCCC_SMC_32,				\
+			   0, 1)
+
 #ifndef __ASSEMBLY__
 
 #include <linux/linkage.h>
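
Expanding ARM_SMCCC_CALL_VAL() for these two IDs (fast-call bit 31
set, 32-bit convention, owner 0 for the architecture range) gives:

	ARM_SMCCC_VERSION_FUNC_ID	/* == 0x80000000 */
	ARM_SMCCC_ARCH_FEATURES_FUNC_ID	/* == 0x80000001 */

which is why a v1.1-aware guest can cheaply probe for the workaround
before issuing it.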


* [PATCH 4.9 41/66] arm64: KVM: Make PSCI_VERSION a fast path
From: Greg Kroah-Hartman @ 2018-04-17 15:59 UTC
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Marc Zyngier, Will Deacon, Catalin Marinas,
	Greg Hackmann, Mark Rutland

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mark Rutland <mark.rutland@arm.com>


From: Marc Zyngier <marc.zyngier@arm.com>

commit 90348689d500410ca7a55624c667f956771dce7f upstream.

For those CPUs that require PSCI to perform a BP invalidation,
going all the way to the PSCI code for not much is a waste of
precious cycles. Let's terminate that call as early as possible.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com> [v4.9 backport]
Tested-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/arm64/kvm/hyp/switch.c |   13 +++++++++++++
 1 file changed, 13 insertions(+)

--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -17,6 +17,7 @@
 
 #include <linux/types.h>
 #include <linux/jump_label.h>
+#include <uapi/linux/psci.h>
 
 #include <asm/kvm_asm.h>
 #include <asm/kvm_emulate.h>
@@ -308,6 +309,18 @@ again:
 	if (exit_code == ARM_EXCEPTION_TRAP && !__populate_fault_info(vcpu))
 		goto again;
 
+	if (exit_code == ARM_EXCEPTION_TRAP &&
+	    (kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_HVC64 ||
+	     kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_HVC32) &&
+	    vcpu_get_reg(vcpu, 0) == PSCI_0_2_FN_PSCI_VERSION) {
+		u64 val = PSCI_RET_NOT_SUPPORTED;
+		if (test_bit(KVM_ARM_VCPU_PSCI_0_2, vcpu->arch.features))
+			val = 2;
+
+		vcpu_set_reg(vcpu, 0, val);
+		goto again;
+	}
+
 	if (static_branch_unlikely(&vgic_v2_cpuif_trap) &&
 	    exit_code == ARM_EXCEPTION_TRAP) {
 		bool valid;


* [PATCH 4.9 42/66] arm/arm64: KVM: Turn kvm_psci_version into a static inline
From: Greg Kroah-Hartman @ 2018-04-17 15:59 UTC
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Ard Biesheuvel, Christoffer Dall,
	Marc Zyngier, Catalin Marinas, Greg Hackmann, Mark Rutland

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mark Rutland <mark.rutland@arm.com>


From: Marc Zyngier <marc.zyngier@arm.com>

commit a4097b351118e821841941a79ec77d3ce3f1c5d9 upstream.

We're about to need kvm_psci_version in HYP too. So let's turn it
into a static inline, and pass the kvm structure as a second
parameter (so that HYP can do a kern_hyp_va on it).

Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
[v4.9: account for files moved to virt/ upstream]
Signed-off-by: Mark Rutland <mark.rutland@arm.com> [v4.9 backport]
Tested-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/arm/kvm/psci.c         |   12 ++----------
 arch/arm64/kvm/hyp/switch.c |   18 +++++++++++-------
 include/kvm/arm_psci.h      |   21 ++++++++++++++++++++-
 3 files changed, 33 insertions(+), 18 deletions(-)

--- a/arch/arm/kvm/psci.c
+++ b/arch/arm/kvm/psci.c
@@ -120,7 +120,7 @@ static unsigned long kvm_psci_vcpu_on(st
 	if (!vcpu)
 		return PSCI_RET_INVALID_PARAMS;
 	if (!vcpu->arch.power_off) {
-		if (kvm_psci_version(source_vcpu) != KVM_ARM_PSCI_0_1)
+		if (kvm_psci_version(source_vcpu, kvm) != KVM_ARM_PSCI_0_1)
 			return PSCI_RET_ALREADY_ON;
 		else
 			return PSCI_RET_INVALID_PARAMS;
@@ -230,14 +230,6 @@ static void kvm_psci_system_reset(struct
 	kvm_prepare_system_event(vcpu, KVM_SYSTEM_EVENT_RESET);
 }
 
-int kvm_psci_version(struct kvm_vcpu *vcpu)
-{
-	if (test_bit(KVM_ARM_VCPU_PSCI_0_2, vcpu->arch.features))
-		return KVM_ARM_PSCI_LATEST;
-
-	return KVM_ARM_PSCI_0_1;
-}
-
 static int kvm_psci_0_2_call(struct kvm_vcpu *vcpu)
 {
 	struct kvm *kvm = vcpu->kvm;
@@ -395,7 +387,7 @@ static int kvm_psci_0_1_call(struct kvm_
  */
 static int kvm_psci_call(struct kvm_vcpu *vcpu)
 {
-	switch (kvm_psci_version(vcpu)) {
+	switch (kvm_psci_version(vcpu, vcpu->kvm)) {
 	case KVM_ARM_PSCI_1_0:
 		return kvm_psci_1_0_call(vcpu);
 	case KVM_ARM_PSCI_0_2:
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -19,6 +19,8 @@
 #include <linux/jump_label.h>
 #include <uapi/linux/psci.h>
 
+#include <kvm/arm_psci.h>
+
 #include <asm/kvm_asm.h>
 #include <asm/kvm_emulate.h>
 #include <asm/kvm_hyp.h>
@@ -311,14 +313,16 @@ again:
 
 	if (exit_code == ARM_EXCEPTION_TRAP &&
 	    (kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_HVC64 ||
-	     kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_HVC32) &&
-	    vcpu_get_reg(vcpu, 0) == PSCI_0_2_FN_PSCI_VERSION) {
-		u64 val = PSCI_RET_NOT_SUPPORTED;
-		if (test_bit(KVM_ARM_VCPU_PSCI_0_2, vcpu->arch.features))
-			val = 2;
+	     kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_HVC32)) {
+		u32 val = vcpu_get_reg(vcpu, 0);
 
-		vcpu_set_reg(vcpu, 0, val);
-		goto again;
+		if (val == PSCI_0_2_FN_PSCI_VERSION) {
+			val = kvm_psci_version(vcpu, kern_hyp_va(vcpu->kvm));
+			if (unlikely(val == KVM_ARM_PSCI_0_1))
+				val = PSCI_RET_NOT_SUPPORTED;
+			vcpu_set_reg(vcpu, 0, val);
+			goto again;
+		}
 	}
 
 	if (static_branch_unlikely(&vgic_v2_cpuif_trap) &&
--- a/include/kvm/arm_psci.h
+++ b/include/kvm/arm_psci.h
@@ -18,6 +18,7 @@
 #ifndef __KVM_ARM_PSCI_H__
 #define __KVM_ARM_PSCI_H__
 
+#include <linux/kvm_host.h>
 #include <uapi/linux/psci.h>
 
 #define KVM_ARM_PSCI_0_1	PSCI_VERSION(0, 1)
@@ -26,7 +27,25 @@
 
 #define KVM_ARM_PSCI_LATEST	KVM_ARM_PSCI_1_0
 
-int kvm_psci_version(struct kvm_vcpu *vcpu);
+/*
+ * We need the KVM pointer independently from the vcpu as we can call
+ * this from HYP, and need to apply kern_hyp_va on it...
+ */
+static inline int kvm_psci_version(struct kvm_vcpu *vcpu, struct kvm *kvm)
+{
+	/*
+	 * Our PSCI implementation stays the same across versions from
+	 * v0.2 onward, only adding the few mandatory functions (such
+	 * as FEATURES with 1.0) that are required by newer
+	 * revisions. It is thus safe to return the latest.
+	 */
+	if (test_bit(KVM_ARM_VCPU_PSCI_0_2, vcpu->arch.features))
+		return KVM_ARM_PSCI_LATEST;
+
+	return KVM_ARM_PSCI_0_1;
+}
+
+
 int kvm_hvc_call_handler(struct kvm_vcpu *vcpu);
 
 #endif /* __KVM_ARM_PSCI_H__ */
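
The point of the second parameter, sketched with hypothetical callers
(both real call sites appear in the diff above):

	/* From the host-side dispatcher: the kernel VA is usable as-is. */
	static int version_from_host(struct kvm_vcpu *vcpu)
	{
		return kvm_psci_version(vcpu, vcpu->kvm);
	}

	/* From HYP: the pointer must be translated first. */
	static int version_from_hyp(struct kvm_vcpu *vcpu)
	{
		return kvm_psci_version(vcpu, kern_hyp_va(vcpu->kvm));
	}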


* [PATCH 4.9 43/66] arm64: KVM: Report SMCCC_ARCH_WORKAROUND_1 BP hardening support
  2018-04-17 15:58 [PATCH 4.9 00/66] 4.9.95-stable review Greg Kroah-Hartman
                   ` (41 preceding siblings ...)
  2018-04-17 15:59 ` [PATCH 4.9 42/66] arm/arm64: KVM: Turn kvm_psci_version into a static inline Greg Kroah-Hartman
@ 2018-04-17 15:59 ` Greg Kroah-Hartman
  2018-04-17 15:59 ` [PATCH 4.9 44/66] arm64: KVM: Add SMCCC_ARCH_WORKAROUND_1 fast handling Greg Kroah-Hartman
                   ` (26 subsequent siblings)
  69 siblings, 0 replies; 82+ messages in thread
From: Greg Kroah-Hartman @ 2018-04-17 15:59 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Ard Biesheuvel, Christoffer Dall,
	Marc Zyngier, Catalin Marinas, Greg Hackmann, Mark Rutland

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mark Rutland <mark.rutland@arm.com>


From: Marc Zyngier <marc.zyngier@arm.com>

commit 6167ec5c9145cdf493722dfd80a5d48bafc4a18a upstream.

A new feature of SMCCC 1.1 is that it offers firmware-based CPU
workarounds. In particular, SMCCC_ARCH_WORKAROUND_1 provides
BP hardening for CVE-2017-5715.

If the host has some mitigation for this issue, report that
we deal with it using SMCCC_ARCH_WORKAROUND_1, as we apply the
host workaround on every guest exit.

Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
[v4.9: account for files moved to virt/ upstream]
Signed-off-by: Mark Rutland <mark.rutland@arm.com> [v4.9 backport]
Tested-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/arm/include/asm/kvm_host.h   |    6 ++++++
 arch/arm/kvm/psci.c               |    9 ++++++++-
 arch/arm64/include/asm/kvm_host.h |    5 +++++
 include/linux/arm-smccc.h         |    5 +++++
 4 files changed, 24 insertions(+), 1 deletion(-)

--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -318,4 +318,10 @@ static inline int kvm_arm_vcpu_arch_has_
 	return -ENXIO;
 }
 
+static inline bool kvm_arm_harden_branch_predictor(void)
+{
+	/* No way to detect it yet, pretend it is not there. */
+	return false;
+}
+
 #endif /* __ARM_KVM_HOST_H__ */
--- a/arch/arm/kvm/psci.c
+++ b/arch/arm/kvm/psci.c
@@ -403,13 +403,20 @@ int kvm_hvc_call_handler(struct kvm_vcpu
 {
 	u32 func_id = smccc_get_function(vcpu);
 	u32 val = PSCI_RET_NOT_SUPPORTED;
+	u32 feature;
 
 	switch (func_id) {
 	case ARM_SMCCC_VERSION_FUNC_ID:
 		val = ARM_SMCCC_VERSION_1_1;
 		break;
 	case ARM_SMCCC_ARCH_FEATURES_FUNC_ID:
-		/* Nothing supported yet */
+		feature = smccc_get_arg1(vcpu);
+		switch(feature) {
+		case ARM_SMCCC_ARCH_WORKAROUND_1:
+			if (kvm_arm_harden_branch_predictor())
+				val = 0;
+			break;
+		}
 		break;
 	default:
 		return kvm_psci_call(vcpu);
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -393,4 +393,9 @@ static inline void __cpu_init_stage2(voi
 		  "PARange is %d bits, unsupported configuration!", parange);
 }
 
+static inline bool kvm_arm_harden_branch_predictor(void)
+{
+	return cpus_have_cap(ARM64_HARDEN_BRANCH_PREDICTOR);
+}
+
 #endif /* __ARM64_KVM_HOST_H__ */
--- a/include/linux/arm-smccc.h
+++ b/include/linux/arm-smccc.h
@@ -73,6 +73,11 @@
 			   ARM_SMCCC_SMC_32,				\
 			   0, 1)
 
+#define ARM_SMCCC_ARCH_WORKAROUND_1					\
+	ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL,				\
+			   ARM_SMCCC_SMC_32,				\
+			   0, 0x8000)
+
 #ifndef __ASSEMBLY__
 
 #include <linux/linkage.h>
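
From the guest's side, discovery of the workaround advertised here
looks roughly like the sketch below (not part of this patch, and using
the SMCCC v1.1 helpers added elsewhere in this series):

	static bool guest_has_bp_hardening(void)
	{
		struct arm_smccc_res res;

		arm_smccc_1_1_hvc(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
				  ARM_SMCCC_ARCH_WORKAROUND_1, &res);
		/* 0 from the handler above; negative means unsupported */
		return (int)res.a0 >= 0;
	}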


* [PATCH 4.9 44/66] arm64: KVM: Add SMCCC_ARCH_WORKAROUND_1 fast handling
  2018-04-17 15:58 [PATCH 4.9 00/66] 4.9.95-stable review Greg Kroah-Hartman
                   ` (42 preceding siblings ...)
  2018-04-17 15:59 ` [PATCH 4.9 43/66] arm64: KVM: Report SMCCC_ARCH_WORKAROUND_1 BP hardening support Greg Kroah-Hartman
@ 2018-04-17 15:59 ` Greg Kroah-Hartman
  2018-04-17 15:59 ` [PATCH 4.9 45/66] firmware/psci: Expose PSCI conduit Greg Kroah-Hartman
                   ` (25 subsequent siblings)
  69 siblings, 0 replies; 82+ messages in thread
From: Greg Kroah-Hartman @ 2018-04-17 15:59 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Ard Biesheuvel, Christoffer Dall,
	Marc Zyngier, Catalin Marinas, Greg Hackmann, Mark Rutland

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mark Rutland <mark.rutland@arm.com>


From: Marc Zyngier <marc.zyngier@arm.com>

commit f72af90c3783d924337624659b43e2d36f1b36b4 upstream.

We want SMCCC_ARCH_WORKAROUND_1 to be fast. As fast as possible.
So let's intercept it as early as we can by testing for the
function call number as soon as we've identified an HVC call
coming from the guest.

Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com> [v4.9 backport]
Tested-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/arm64/kvm/hyp/hyp-entry.S |   20 ++++++++++++++++++--
 1 file changed, 18 insertions(+), 2 deletions(-)

--- a/arch/arm64/kvm/hyp/hyp-entry.S
+++ b/arch/arm64/kvm/hyp/hyp-entry.S
@@ -15,6 +15,7 @@
  * along with this program.  If not, see <http://www.gnu.org/licenses/>.
  */
 
+#include <linux/arm-smccc.h>
 #include <linux/linkage.h>
 
 #include <asm/alternative.h>
@@ -79,10 +80,11 @@ alternative_endif
 	lsr	x0, x1, #ESR_ELx_EC_SHIFT
 
 	cmp	x0, #ESR_ELx_EC_HVC64
+	ccmp	x0, #ESR_ELx_EC_HVC32, #4, ne
 	b.ne	el1_trap
 
-	mrs	x1, vttbr_el2		// If vttbr is valid, the 64bit guest
-	cbnz	x1, el1_trap		// called HVC
+	mrs	x1, vttbr_el2		// If vttbr is valid, the guest
+	cbnz	x1, el1_hvc_guest	// called HVC
 
 	/* Here, we're pretty sure the host called HVC. */
 	ldp	x0, x1, [sp], #16
@@ -101,6 +103,20 @@ alternative_endif
 
 2:	eret
 
+el1_hvc_guest:
+	/*
+	 * Fastest possible path for ARM_SMCCC_ARCH_WORKAROUND_1.
+	 * The workaround has already been applied on the host,
+	 * so let's quickly get back to the guest. We don't bother
+	 * restoring x1, as it can be clobbered anyway.
+	 */
+	ldr	x1, [sp]				// Guest's x0
+	eor	w1, w1, #ARM_SMCCC_ARCH_WORKAROUND_1
+	cbnz	w1, el1_trap
+	mov	x0, x1
+	add	sp, sp, #16
+	eret
+
 el1_trap:
 	/*
 	 * x0: ESR_EC
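
In C terms, the new el1_hvc_guest path amounts to the following sketch
(names made up; the real thing is the assembly above):

	/*
	 * Returns true if the HVC was handled inline, with the guest's
	 * new x0 in *ret. Only the low 32 bits are compared, matching
	 * the w-register eor above.
	 */
	static bool fast_handle_guest_hvc(unsigned long guest_x0, unsigned long *ret)
	{
		if ((u32)guest_x0 != ARM_SMCCC_ARCH_WORKAROUND_1)
			return false;	/* fall through to el1_trap */

		*ret = 0;	/* workaround already ran on host entry */
		return true;
	}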


* [PATCH 4.9 45/66] firmware/psci: Expose PSCI conduit
  2018-04-17 15:58 [PATCH 4.9 00/66] 4.9.95-stable review Greg Kroah-Hartman
                   ` (43 preceding siblings ...)
  2018-04-17 15:59 ` [PATCH 4.9 44/66] arm64: KVM: Add SMCCC_ARCH_WORKAROUND_1 fast handling Greg Kroah-Hartman
@ 2018-04-17 15:59 ` Greg Kroah-Hartman
  2018-04-17 15:59 ` [PATCH 4.9 46/66] firmware/psci: Expose SMCCC version through psci_ops Greg Kroah-Hartman
                   ` (24 subsequent siblings)
  69 siblings, 0 replies; 82+ messages in thread
From: Greg Kroah-Hartman @ 2018-04-17 15:59 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Lorenzo Pieralisi, Robin Murphy,
	Ard Biesheuvel, Marc Zyngier, Catalin Marinas, Greg Hackmann,
	Mark Rutland

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mark Rutland <mark.rutland@arm.com>


From: Marc Zyngier <marc.zyngier@arm.com>

commit 09a8d6d48499f93e2abde691f5800081cd858726 upstream.

In order to call into the firmware to apply workarounds, it is
useful to find out whether we're using HVC or SMC. Let's expose
this through the psci_ops.

Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com> [v4.9 backport]
Tested-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/firmware/psci.c |   28 +++++++++++++++++++++++-----
 include/linux/psci.h    |    7 +++++++
 2 files changed, 30 insertions(+), 5 deletions(-)

--- a/drivers/firmware/psci.c
+++ b/drivers/firmware/psci.c
@@ -59,7 +59,9 @@ bool psci_tos_resident_on(int cpu)
 	return cpu == resident_cpu;
 }
 
-struct psci_operations psci_ops;
+struct psci_operations psci_ops = {
+	.conduit = PSCI_CONDUIT_NONE,
+};
 
 typedef unsigned long (psci_fn)(unsigned long, unsigned long,
 				unsigned long, unsigned long);
@@ -210,6 +212,22 @@ static unsigned long psci_migrate_info_u
 			      0, 0, 0);
 }
 
+static void set_conduit(enum psci_conduit conduit)
+{
+	switch (conduit) {
+	case PSCI_CONDUIT_HVC:
+		invoke_psci_fn = __invoke_psci_fn_hvc;
+		break;
+	case PSCI_CONDUIT_SMC:
+		invoke_psci_fn = __invoke_psci_fn_smc;
+		break;
+	default:
+		WARN(1, "Unexpected PSCI conduit %d\n", conduit);
+	}
+
+	psci_ops.conduit = conduit;
+}
+
 static int get_set_conduit_method(struct device_node *np)
 {
 	const char *method;
@@ -222,9 +240,9 @@ static int get_set_conduit_method(struct
 	}
 
 	if (!strcmp("hvc", method)) {
-		invoke_psci_fn = __invoke_psci_fn_hvc;
+		set_conduit(PSCI_CONDUIT_HVC);
 	} else if (!strcmp("smc", method)) {
-		invoke_psci_fn = __invoke_psci_fn_smc;
+		set_conduit(PSCI_CONDUIT_SMC);
 	} else {
 		pr_warn("invalid \"method\" property: %s\n", method);
 		return -EINVAL;
@@ -654,9 +672,9 @@ int __init psci_acpi_init(void)
 	pr_info("probing for conduit method from ACPI.\n");
 
 	if (acpi_psci_use_hvc())
-		invoke_psci_fn = __invoke_psci_fn_hvc;
+		set_conduit(PSCI_CONDUIT_HVC);
 	else
-		invoke_psci_fn = __invoke_psci_fn_smc;
+		set_conduit(PSCI_CONDUIT_SMC);
 
 	return psci_probe();
 }
--- a/include/linux/psci.h
+++ b/include/linux/psci.h
@@ -25,6 +25,12 @@ bool psci_tos_resident_on(int cpu);
 int psci_cpu_init_idle(unsigned int cpu);
 int psci_cpu_suspend_enter(unsigned long index);
 
+enum psci_conduit {
+	PSCI_CONDUIT_NONE,
+	PSCI_CONDUIT_SMC,
+	PSCI_CONDUIT_HVC,
+};
+
 struct psci_operations {
 	u32 (*get_version)(void);
 	int (*cpu_suspend)(u32 state, unsigned long entry_point);
@@ -34,6 +40,7 @@ struct psci_operations {
 	int (*affinity_info)(unsigned long target_affinity,
 			unsigned long lowest_affinity_level);
 	int (*migrate_info_type)(void);
+	enum psci_conduit conduit;
 };
 
 extern struct psci_operations psci_ops;
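
The sort of consumer this enables, sketched below with the v1.1
helpers and workaround ID from elsewhere in this series (the function
name is made up):

	static void firmware_bp_workaround(void)
	{
		switch (psci_ops.conduit) {
		case PSCI_CONDUIT_HVC:
			arm_smccc_1_1_hvc(ARM_SMCCC_ARCH_WORKAROUND_1, NULL);
			break;
		case PSCI_CONDUIT_SMC:
			arm_smccc_1_1_smc(ARM_SMCCC_ARCH_WORKAROUND_1, NULL);
			break;
		default:
			break;	/* PSCI_CONDUIT_NONE: no firmware to call */
		}
	}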


* [PATCH 4.9 46/66] firmware/psci: Expose SMCCC version through psci_ops
  2018-04-17 15:58 [PATCH 4.9 00/66] 4.9.95-stable review Greg Kroah-Hartman
                   ` (44 preceding siblings ...)
  2018-04-17 15:59 ` [PATCH 4.9 45/66] firmware/psci: Expose PSCI conduit Greg Kroah-Hartman
@ 2018-04-17 15:59 ` Greg Kroah-Hartman
  2018-04-17 15:59 ` [PATCH 4.9 47/66] arm/arm64: smccc: Make function identifiers an unsigned quantity Greg Kroah-Hartman
                   ` (23 subsequent siblings)
  69 siblings, 0 replies; 82+ messages in thread
From: Greg Kroah-Hartman @ 2018-04-17 15:59 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Lorenzo Pieralisi, Robin Murphy,
	Ard Biesheuvel, Marc Zyngier, Catalin Marinas, Greg Hackmann,
	Mark Rutland

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mark Rutland <mark.rutland@arm.com>


From: Marc Zyngier <marc.zyngier@arm.com>

commit e78eef554a912ef6c1e0bbf97619dafbeae3339f upstream.

Since PSCI 1.0 allows the SMCCC version to be (indirectly) probed,
let's do that at boot time, and expose the version of the calling
convention as part of the psci_ops structure.

Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com> [v4.9 backport]
Tested-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/firmware/psci.c |   27 +++++++++++++++++++++++++++
 include/linux/psci.h    |    6 ++++++
 2 files changed, 33 insertions(+)

--- a/drivers/firmware/psci.c
+++ b/drivers/firmware/psci.c
@@ -61,6 +61,7 @@ bool psci_tos_resident_on(int cpu)
 
 struct psci_operations psci_ops = {
 	.conduit = PSCI_CONDUIT_NONE,
+	.smccc_version = SMCCC_VERSION_1_0,
 };
 
 typedef unsigned long (psci_fn)(unsigned long, unsigned long,
@@ -511,6 +512,31 @@ static void __init psci_init_migrate(voi
 	pr_info("Trusted OS resident on physical CPU 0x%lx\n", cpuid);
 }
 
+static void __init psci_init_smccc(void)
+{
+	u32 ver = ARM_SMCCC_VERSION_1_0;
+	int feature;
+
+	feature = psci_features(ARM_SMCCC_VERSION_FUNC_ID);
+
+	if (feature != PSCI_RET_NOT_SUPPORTED) {
+		u32 ret;
+		ret = invoke_psci_fn(ARM_SMCCC_VERSION_FUNC_ID, 0, 0, 0);
+		if (ret == ARM_SMCCC_VERSION_1_1) {
+			psci_ops.smccc_version = SMCCC_VERSION_1_1;
+			ver = ret;
+		}
+	}
+
+	/*
+	 * Conveniently, the SMCCC and PSCI versions are encoded the
+	 * same way. No, this isn't accidental.
+	 */
+	pr_info("SMC Calling Convention v%d.%d\n",
+		PSCI_VERSION_MAJOR(ver), PSCI_VERSION_MINOR(ver));
+
+}
+
 static void __init psci_0_2_set_functions(void)
 {
 	pr_info("Using standard PSCI v0.2 function IDs\n");
@@ -559,6 +585,7 @@ static int __init psci_probe(void)
 	psci_init_migrate();
 
 	if (PSCI_VERSION_MAJOR(ver) >= 1) {
+		psci_init_smccc();
 		psci_init_cpu_suspend();
 		psci_init_system_suspend();
 	}
--- a/include/linux/psci.h
+++ b/include/linux/psci.h
@@ -31,6 +31,11 @@ enum psci_conduit {
 	PSCI_CONDUIT_HVC,
 };
 
+enum smccc_version {
+	SMCCC_VERSION_1_0,
+	SMCCC_VERSION_1_1,
+};
+
 struct psci_operations {
 	u32 (*get_version)(void);
 	int (*cpu_suspend)(u32 state, unsigned long entry_point);
@@ -41,6 +46,7 @@ struct psci_operations {
 			unsigned long lowest_affinity_level);
 	int (*migrate_info_type)(void);
 	enum psci_conduit conduit;
+	enum smccc_version smccc_version;
 };
 
 extern struct psci_operations psci_ops;
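
Callers that want the lightweight v1.1 calls are expected to gate on
the probed version, along these lines (illustrative only):

	static bool smccc_1_1_available(void)
	{
		/* v1.0 firmware clobbers more registers, so don't use
		 * the inline primitives against it. */
		return psci_ops.smccc_version >= SMCCC_VERSION_1_1;
	}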


* [PATCH 4.9 47/66] arm/arm64: smccc: Make function identifiers an unsigned quantity
  2018-04-17 15:58 [PATCH 4.9 00/66] 4.9.95-stable review Greg Kroah-Hartman
                   ` (45 preceding siblings ...)
  2018-04-17 15:59 ` [PATCH 4.9 46/66] firmware/psci: Expose SMCCC version through psci_ops Greg Kroah-Hartman
@ 2018-04-17 15:59 ` Greg Kroah-Hartman
  2018-04-17 15:59 ` [PATCH 4.9 48/66] arm/arm64: smccc: Implement SMCCC v1.1 inline primitive Greg Kroah-Hartman
                   ` (22 subsequent siblings)
  69 siblings, 0 replies; 82+ messages in thread
From: Greg Kroah-Hartman @ 2018-04-17 15:59 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Ard Biesheuvel, Robin Murphy, Marc Zyngier,
	Catalin Marinas, Greg Hackmann, Mark Rutland

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mark Rutland <mark.rutland@arm.com>


From: Marc Zyngier <marc.zyngier@arm.com>

commit ded4c39e93f3b72968fdb79baba27f3b83dad34c upstream.

Function identifiers are a 32-bit, unsigned quantity, but we never
tell the compiler so, resulting in the following:

 4ac:   b26187e0        mov     x0, #0xffffffff80000001

We thus rely on the firmware narrowing it for us, which is not
always a reasonable expectation.

Cc: stable@vger.kernel.org
Reported-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com> [v4.9 backport]
Tested-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/linux/arm-smccc.h |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

--- a/include/linux/arm-smccc.h
+++ b/include/linux/arm-smccc.h
@@ -14,14 +14,16 @@
 #ifndef __LINUX_ARM_SMCCC_H
 #define __LINUX_ARM_SMCCC_H
 
+#include <uapi/linux/const.h>
+
 /*
  * This file provides common defines for ARM SMC Calling Convention as
  * specified in
  * http://infocenter.arm.com/help/topic/com.arm.doc.den0028a/index.html
  */
 
-#define ARM_SMCCC_STD_CALL		0
-#define ARM_SMCCC_FAST_CALL		1
+#define ARM_SMCCC_STD_CALL	        _AC(0,U)
+#define ARM_SMCCC_FAST_CALL	        _AC(1,U)
 #define ARM_SMCCC_TYPE_SHIFT		31
 
 #define ARM_SMCCC_SMC_32		0
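
The effect is easy to reproduce in a standalone program (plain
userspace C, not kernel code):

	#include <stdio.h>

	int main(void)
	{
		/* 1 << 31 is a signed int, so widening to a 64-bit
		 * quantity sign-extends, as in the disassembly above. */
		long bad  = (1 << 31) | 1;
		/* making the constant unsigned, as _AC(1,U) does,
		 * keeps the value at 0x80000001. */
		long good = (1U << 31) | 1;

		printf("bad  = %#lx\n", bad);	/* 0xffffffff80000001 */
		printf("good = %#lx\n", good);	/* 0x80000001 */
		return 0;
	}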


* [PATCH 4.9 48/66] arm/arm64: smccc: Implement SMCCC v1.1 inline primitive
  2018-04-17 15:58 [PATCH 4.9 00/66] 4.9.95-stable review Greg Kroah-Hartman
                   ` (46 preceding siblings ...)
  2018-04-17 15:59 ` [PATCH 4.9 47/66] arm/arm64: smccc: Make function identifiers an unsigned quantity Greg Kroah-Hartman
@ 2018-04-17 15:59 ` Greg Kroah-Hartman
  2018-04-17 15:59 ` [PATCH 4.9 49/66] arm64: Add ARM_SMCCC_ARCH_WORKAROUND_1 BP hardening support Greg Kroah-Hartman
                   ` (21 subsequent siblings)
  69 siblings, 0 replies; 82+ messages in thread
From: Greg Kroah-Hartman @ 2018-04-17 15:59 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Robin Murphy, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Greg Hackmann, Mark Rutland

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mark Rutland <mark.rutland@arm.com>


From: Marc Zyngier <marc.zyngier@arm.com>

commit f2d3b2e8759a5833df6f022e42df2d581e6d843c upstream.

One of the major improvements of SMCCC v1.1 is that it clobbers only
the first 4 registers, on both 32-bit and 64-bit. This means that it
becomes very easy to provide an inline version of the SMC call
primitive, and to avoid performing a function call to stash the
registers that would otherwise be clobbered by SMCCC v1.0.

Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com> [v4.9 backport]
Tested-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/linux/arm-smccc.h |  141 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 141 insertions(+)

--- a/include/linux/arm-smccc.h
+++ b/include/linux/arm-smccc.h
@@ -150,5 +150,146 @@ asmlinkage void __arm_smccc_hvc(unsigned
 
 #define arm_smccc_hvc_quirk(...) __arm_smccc_hvc(__VA_ARGS__)
 
+/* SMCCC v1.1 implementation madness follows */
+#ifdef CONFIG_ARM64
+
+#define SMCCC_SMC_INST	"smc	#0"
+#define SMCCC_HVC_INST	"hvc	#0"
+
+#elif defined(CONFIG_ARM)
+#include <asm/opcodes-sec.h>
+#include <asm/opcodes-virt.h>
+
+#define SMCCC_SMC_INST	__SMC(0)
+#define SMCCC_HVC_INST	__HVC(0)
+
+#endif
+
+#define ___count_args(_0, _1, _2, _3, _4, _5, _6, _7, _8, x, ...) x
+
+#define __count_args(...)						\
+	___count_args(__VA_ARGS__, 7, 6, 5, 4, 3, 2, 1, 0)
+
+#define __constraint_write_0						\
+	"+r" (r0), "=&r" (r1), "=&r" (r2), "=&r" (r3)
+#define __constraint_write_1						\
+	"+r" (r0), "+r" (r1), "=&r" (r2), "=&r" (r3)
+#define __constraint_write_2						\
+	"+r" (r0), "+r" (r1), "+r" (r2), "=&r" (r3)
+#define __constraint_write_3						\
+	"+r" (r0), "+r" (r1), "+r" (r2), "+r" (r3)
+#define __constraint_write_4	__constraint_write_3
+#define __constraint_write_5	__constraint_write_4
+#define __constraint_write_6	__constraint_write_5
+#define __constraint_write_7	__constraint_write_6
+
+#define __constraint_read_0
+#define __constraint_read_1
+#define __constraint_read_2
+#define __constraint_read_3
+#define __constraint_read_4	"r" (r4)
+#define __constraint_read_5	__constraint_read_4, "r" (r5)
+#define __constraint_read_6	__constraint_read_5, "r" (r6)
+#define __constraint_read_7	__constraint_read_6, "r" (r7)
+
+#define __declare_arg_0(a0, res)					\
+	struct arm_smccc_res   *___res = res;				\
+	register u32           r0 asm("r0") = a0;			\
+	register unsigned long r1 asm("r1");				\
+	register unsigned long r2 asm("r2");				\
+	register unsigned long r3 asm("r3")
+
+#define __declare_arg_1(a0, a1, res)					\
+	struct arm_smccc_res   *___res = res;				\
+	register u32           r0 asm("r0") = a0;			\
+	register typeof(a1)    r1 asm("r1") = a1;			\
+	register unsigned long r2 asm("r2");				\
+	register unsigned long r3 asm("r3")
+
+#define __declare_arg_2(a0, a1, a2, res)				\
+	struct arm_smccc_res   *___res = res;				\
+	register u32           r0 asm("r0") = a0;			\
+	register typeof(a1)    r1 asm("r1") = a1;			\
+	register typeof(a2)    r2 asm("r2") = a2;			\
+	register unsigned long r3 asm("r3")
+
+#define __declare_arg_3(a0, a1, a2, a3, res)				\
+	struct arm_smccc_res   *___res = res;				\
+	register u32           r0 asm("r0") = a0;			\
+	register typeof(a1)    r1 asm("r1") = a1;			\
+	register typeof(a2)    r2 asm("r2") = a2;			\
+	register typeof(a3)    r3 asm("r3") = a3
+
+#define __declare_arg_4(a0, a1, a2, a3, a4, res)			\
+	__declare_arg_3(a0, a1, a2, a3, res);				\
+	register typeof(a4) r4 asm("r4") = a4
+
+#define __declare_arg_5(a0, a1, a2, a3, a4, a5, res)			\
+	__declare_arg_4(a0, a1, a2, a3, a4, res);			\
+	register typeof(a5) r5 asm("r5") = a5
+
+#define __declare_arg_6(a0, a1, a2, a3, a4, a5, a6, res)		\
+	__declare_arg_5(a0, a1, a2, a3, a4, a5, res);			\
+	register typeof(a6) r6 asm("r6") = a6
+
+#define __declare_arg_7(a0, a1, a2, a3, a4, a5, a6, a7, res)		\
+	__declare_arg_6(a0, a1, a2, a3, a4, a5, a6, res);		\
+	register typeof(a7) r7 asm("r7") = a7
+
+#define ___declare_args(count, ...) __declare_arg_ ## count(__VA_ARGS__)
+#define __declare_args(count, ...)  ___declare_args(count, __VA_ARGS__)
+
+#define ___constraints(count)						\
+	: __constraint_write_ ## count					\
+	: __constraint_read_ ## count					\
+	: "memory"
+#define __constraints(count)	___constraints(count)
+
+/*
+ * We have an output list that is not necessarily used, and GCC feels
+ * entitled to optimise the whole sequence away. "volatile" is what
+ * makes it stick.
+ */
+#define __arm_smccc_1_1(inst, ...)					\
+	do {								\
+		__declare_args(__count_args(__VA_ARGS__), __VA_ARGS__);	\
+		asm volatile(inst "\n"					\
+			     __constraints(__count_args(__VA_ARGS__)));	\
+		if (___res)						\
+			*___res = (typeof(*___res)){r0, r1, r2, r3};	\
+	} while (0)
+
+/*
+ * arm_smccc_1_1_smc() - make an SMCCC v1.1 compliant SMC call
+ *
+ * This is a variadic macro taking one to eight source arguments, and
+ * an optional return structure.
+ *
+ * @a0-a7: arguments passed in registers 0 to 7
+ * @res: result values from registers 0 to 3
+ *
+ * This macro is used to make SMC calls following SMC Calling Convention v1.1.
+ * The content of the supplied parameters is copied to registers 0 to 7 prior
+ * to the SMC instruction. The return values are updated with the content
+ * of registers 0 to 3 on return from the SMC instruction, if @res is not NULL.
+ */
+#define arm_smccc_1_1_smc(...)	__arm_smccc_1_1(SMCCC_SMC_INST, __VA_ARGS__)
+
+/*
+ * arm_smccc_1_1_hvc() - make an SMCCC v1.1 compliant HVC call
+ *
+ * This is a variadic macro taking one to eight source arguments, and
+ * an optional return structure.
+ *
+ * @a0-a7: arguments passed in registers 0 to 7
+ * @res: result values from registers 0 to 3
+ *
+ * This macro is used to make HVC calls following SMC Calling Convention v1.1.
+ * The content of the supplied parameters is copied to registers 0 to 7 prior
+ * to the HVC instruction. The return values are updated with the content
+ * of registers 0 to 3 on return from the HVC instruction, if @res is not NULL.
+ */
+#define arm_smccc_1_1_hvc(...)	__arm_smccc_1_1(SMCCC_HVC_INST, __VA_ARGS__)
+
 #endif /*__ASSEMBLY__*/
 #endif /*__LINUX_ARM_SMCCC_H*/
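
A minimal usage sketch (not part of the patch): query the SMCCC
version over SMC, with the optional result structure:

	static u32 probe_smccc_version(void)
	{
		struct arm_smccc_res res;

		arm_smccc_1_1_smc(ARM_SMCCC_VERSION_FUNC_ID, &res);
		return res.a0;	/* e.g. ARM_SMCCC_VERSION_1_1 */
	}

Passing NULL instead of &res discards the return values, which is what
the WORKAROUND_1 callers elsewhere in this series do.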


* [PATCH 4.9 49/66] arm64: Add ARM_SMCCC_ARCH_WORKAROUND_1 BP hardening support
  2018-04-17 15:58 [PATCH 4.9 00/66] 4.9.95-stable review Greg Kroah-Hartman
                   ` (47 preceding siblings ...)
  2018-04-17 15:59 ` [PATCH 4.9 48/66] arm/arm64: smccc: Implement SMCCC v1.1 inline primitive Greg Kroah-Hartman
@ 2018-04-17 15:59 ` Greg Kroah-Hartman
  2018-04-17 15:59 ` [PATCH 4.9 50/66] arm64: Kill PSCI_GET_VERSION as a variant-2 workaround Greg Kroah-Hartman
                   ` (20 subsequent siblings)
  69 siblings, 0 replies; 82+ messages in thread
From: Greg Kroah-Hartman @ 2018-04-17 15:59 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Greg Hackmann, Mark Rutland

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mark Rutland <mark.rutland@arm.com>


From: Marc Zyngier <marc.zyngier@arm.com>

commit b092201e0020614127f495c092e0a12d26a2116e upstream.

Add the detection and runtime code for ARM_SMCCC_ARCH_WORKAROUND_1.
It is lovely. Really.

Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com> [v4.9 backport]
Tested-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/arm64/crypto/sha256-core.S | 2061 ++++++++++++++++++++++++++++++++++++++++
 arch/arm64/crypto/sha512-core.S | 1085 +++++++++++++++++++++
 arch/arm64/kernel/bpi.S         |   20 
 arch/arm64/kernel/cpu_errata.c  |   72 +
 4 files changed, 3235 insertions(+), 3 deletions(-)
 create mode 100644 arch/arm64/crypto/sha256-core.S
 create mode 100644 arch/arm64/crypto/sha512-core.S

--- /dev/null
+++ b/arch/arm64/crypto/sha256-core.S
@@ -0,0 +1,2061 @@
+// Copyright 2014-2016 The OpenSSL Project Authors. All Rights Reserved.
+//
+// Licensed under the OpenSSL license (the "License").  You may not use
+// this file except in compliance with the License.  You can obtain a copy
+// in the file LICENSE in the source distribution or at
+// https://www.openssl.org/source/license.html
+
+// ====================================================================
+// Written by Andy Polyakov <appro@openssl.org> for the OpenSSL
+// project. The module is, however, dual licensed under OpenSSL and
+// CRYPTOGAMS licenses depending on where you obtain it. For further
+// details see http://www.openssl.org/~appro/cryptogams/.
+//
+// Permission to use under GPLv2 terms is granted.
+// ====================================================================
+//
+// SHA256/512 for ARMv8.
+//
+// Performance in cycles per processed byte and improvement coefficient
+// over code generated with "default" compiler:
+//
+//		SHA256-hw	SHA256(*)	SHA512
+// Apple A7	1.97		10.5 (+33%)	6.73 (-1%(**))
+// Cortex-A53	2.38		15.5 (+115%)	10.0 (+150%(***))
+// Cortex-A57	2.31		11.6 (+86%)	7.51 (+260%(***))
+// Denver	2.01		10.5 (+26%)	6.70 (+8%)
+// X-Gene			20.0 (+100%)	12.8 (+300%(***))
+// Mongoose	2.36		13.0 (+50%)	8.36 (+33%)
+//
+// (*)	Software SHA256 results are of lesser relevance, presented
+//	mostly for informational purposes.
+// (**)	The result is a trade-off: it's possible to improve it by
+//	10% (or by 1 cycle per round), but at the cost of 20% loss
+//	on Cortex-A53 (or by 4 cycles per round).
+// (***)	Super-impressive coefficients over gcc-generated code are
+//	indication of some compiler "pathology", most notably code
+//	generated with -mgeneral-regs-only is significantly faster
+//	and the gap is only 40-90%.
+//
+// October 2016.
+//
+// Originally it was reckoned that it makes no sense to implement NEON
+// version of SHA256 for 64-bit processors. This is because performance
+// improvement on most wide-spread Cortex-A5x processors was observed
+// to be marginal, same on Cortex-A53 and ~10% on A57. But then it was
+// observed that 32-bit NEON SHA256 performs significantly better than
+// 64-bit scalar version on *some* of the more recent processors. As a
+// result, a 64-bit NEON version of SHA256 was added to provide the best
+// all-round performance. For example it executes ~30% faster on X-Gene
+// and Mongoose. [For reference, NEON version of SHA512 is bound to
+// deliver much less improvement, likely *negative* on Cortex-A5x.
+// Which is why NEON support is limited to SHA256.]
+
+#ifndef	__KERNEL__
+# include "arm_arch.h"
+#endif
+
+.text
+
+.extern	OPENSSL_armcap_P
+.globl	sha256_block_data_order
+.type	sha256_block_data_order,%function
+.align	6
+sha256_block_data_order:
+#ifndef	__KERNEL__
+# ifdef	__ILP32__
+	ldrsw	x16,.LOPENSSL_armcap_P
+# else
+	ldr	x16,.LOPENSSL_armcap_P
+# endif
+	adr	x17,.LOPENSSL_armcap_P
+	add	x16,x16,x17
+	ldr	w16,[x16]
+	tst	w16,#ARMV8_SHA256
+	b.ne	.Lv8_entry
+	tst	w16,#ARMV7_NEON
+	b.ne	.Lneon_entry
+#endif
+	stp	x29,x30,[sp,#-128]!
+	add	x29,sp,#0
+
+	stp	x19,x20,[sp,#16]
+	stp	x21,x22,[sp,#32]
+	stp	x23,x24,[sp,#48]
+	stp	x25,x26,[sp,#64]
+	stp	x27,x28,[sp,#80]
+	sub	sp,sp,#4*4
+
+	ldp	w20,w21,[x0]				// load context
+	ldp	w22,w23,[x0,#2*4]
+	ldp	w24,w25,[x0,#4*4]
+	add	x2,x1,x2,lsl#6	// end of input
+	ldp	w26,w27,[x0,#6*4]
+	adr	x30,.LK256
+	stp	x0,x2,[x29,#96]
+
+.Loop:
+	ldp	w3,w4,[x1],#2*4
+	ldr	w19,[x30],#4			// *K++
+	eor	w28,w21,w22				// magic seed
+	str	x1,[x29,#112]
+#ifndef	__AARCH64EB__
+	rev	w3,w3			// 0
+#endif
+	ror	w16,w24,#6
+	add	w27,w27,w19			// h+=K[i]
+	eor	w6,w24,w24,ror#14
+	and	w17,w25,w24
+	bic	w19,w26,w24
+	add	w27,w27,w3			// h+=X[i]
+	orr	w17,w17,w19			// Ch(e,f,g)
+	eor	w19,w20,w21			// a^b, b^c in next round
+	eor	w16,w16,w6,ror#11	// Sigma1(e)
+	ror	w6,w20,#2
+	add	w27,w27,w17			// h+=Ch(e,f,g)
+	eor	w17,w20,w20,ror#9
+	add	w27,w27,w16			// h+=Sigma1(e)
+	and	w28,w28,w19			// (b^c)&=(a^b)
+	add	w23,w23,w27			// d+=h
+	eor	w28,w28,w21			// Maj(a,b,c)
+	eor	w17,w6,w17,ror#13	// Sigma0(a)
+	add	w27,w27,w28			// h+=Maj(a,b,c)
+	ldr	w28,[x30],#4		// *K++, w19 in next round
+	//add	w27,w27,w17			// h+=Sigma0(a)
+#ifndef	__AARCH64EB__
+	rev	w4,w4			// 1
+#endif
+	ldp	w5,w6,[x1],#2*4
+	add	w27,w27,w17			// h+=Sigma0(a)
+	ror	w16,w23,#6
+	add	w26,w26,w28			// h+=K[i]
+	eor	w7,w23,w23,ror#14
+	and	w17,w24,w23
+	bic	w28,w25,w23
+	add	w26,w26,w4			// h+=X[i]
+	orr	w17,w17,w28			// Ch(e,f,g)
+	eor	w28,w27,w20			// a^b, b^c in next round
+	eor	w16,w16,w7,ror#11	// Sigma1(e)
+	ror	w7,w27,#2
+	add	w26,w26,w17			// h+=Ch(e,f,g)
+	eor	w17,w27,w27,ror#9
+	add	w26,w26,w16			// h+=Sigma1(e)
+	and	w19,w19,w28			// (b^c)&=(a^b)
+	add	w22,w22,w26			// d+=h
+	eor	w19,w19,w20			// Maj(a,b,c)
+	eor	w17,w7,w17,ror#13	// Sigma0(a)
+	add	w26,w26,w19			// h+=Maj(a,b,c)
+	ldr	w19,[x30],#4		// *K++, w28 in next round
+	//add	w26,w26,w17			// h+=Sigma0(a)
+#ifndef	__AARCH64EB__
+	rev	w5,w5			// 2
+#endif
+	add	w26,w26,w17			// h+=Sigma0(a)
+	ror	w16,w22,#6
+	add	w25,w25,w19			// h+=K[i]
+	eor	w8,w22,w22,ror#14
+	and	w17,w23,w22
+	bic	w19,w24,w22
+	add	w25,w25,w5			// h+=X[i]
+	orr	w17,w17,w19			// Ch(e,f,g)
+	eor	w19,w26,w27			// a^b, b^c in next round
+	eor	w16,w16,w8,ror#11	// Sigma1(e)
+	ror	w8,w26,#2
+	add	w25,w25,w17			// h+=Ch(e,f,g)
+	eor	w17,w26,w26,ror#9
+	add	w25,w25,w16			// h+=Sigma1(e)
+	and	w28,w28,w19			// (b^c)&=(a^b)
+	add	w21,w21,w25			// d+=h
+	eor	w28,w28,w27			// Maj(a,b,c)
+	eor	w17,w8,w17,ror#13	// Sigma0(a)
+	add	w25,w25,w28			// h+=Maj(a,b,c)
+	ldr	w28,[x30],#4		// *K++, w19 in next round
+	//add	w25,w25,w17			// h+=Sigma0(a)
+#ifndef	__AARCH64EB__
+	rev	w6,w6			// 3
+#endif
+	ldp	w7,w8,[x1],#2*4
+	add	w25,w25,w17			// h+=Sigma0(a)
+	ror	w16,w21,#6
+	add	w24,w24,w28			// h+=K[i]
+	eor	w9,w21,w21,ror#14
+	and	w17,w22,w21
+	bic	w28,w23,w21
+	add	w24,w24,w6			// h+=X[i]
+	orr	w17,w17,w28			// Ch(e,f,g)
+	eor	w28,w25,w26			// a^b, b^c in next round
+	eor	w16,w16,w9,ror#11	// Sigma1(e)
+	ror	w9,w25,#2
+	add	w24,w24,w17			// h+=Ch(e,f,g)
+	eor	w17,w25,w25,ror#9
+	add	w24,w24,w16			// h+=Sigma1(e)
+	and	w19,w19,w28			// (b^c)&=(a^b)
+	add	w20,w20,w24			// d+=h
+	eor	w19,w19,w26			// Maj(a,b,c)
+	eor	w17,w9,w17,ror#13	// Sigma0(a)
+	add	w24,w24,w19			// h+=Maj(a,b,c)
+	ldr	w19,[x30],#4		// *K++, w28 in next round
+	//add	w24,w24,w17			// h+=Sigma0(a)
+#ifndef	__AARCH64EB__
+	rev	w7,w7			// 4
+#endif
+	add	w24,w24,w17			// h+=Sigma0(a)
+	ror	w16,w20,#6
+	add	w23,w23,w19			// h+=K[i]
+	eor	w10,w20,w20,ror#14
+	and	w17,w21,w20
+	bic	w19,w22,w20
+	add	w23,w23,w7			// h+=X[i]
+	orr	w17,w17,w19			// Ch(e,f,g)
+	eor	w19,w24,w25			// a^b, b^c in next round
+	eor	w16,w16,w10,ror#11	// Sigma1(e)
+	ror	w10,w24,#2
+	add	w23,w23,w17			// h+=Ch(e,f,g)
+	eor	w17,w24,w24,ror#9
+	add	w23,w23,w16			// h+=Sigma1(e)
+	and	w28,w28,w19			// (b^c)&=(a^b)
+	add	w27,w27,w23			// d+=h
+	eor	w28,w28,w25			// Maj(a,b,c)
+	eor	w17,w10,w17,ror#13	// Sigma0(a)
+	add	w23,w23,w28			// h+=Maj(a,b,c)
+	ldr	w28,[x30],#4		// *K++, w19 in next round
+	//add	w23,w23,w17			// h+=Sigma0(a)
+#ifndef	__AARCH64EB__
+	rev	w8,w8			// 5
+#endif
+	ldp	w9,w10,[x1],#2*4
+	add	w23,w23,w17			// h+=Sigma0(a)
+	ror	w16,w27,#6
+	add	w22,w22,w28			// h+=K[i]
+	eor	w11,w27,w27,ror#14
+	and	w17,w20,w27
+	bic	w28,w21,w27
+	add	w22,w22,w8			// h+=X[i]
+	orr	w17,w17,w28			// Ch(e,f,g)
+	eor	w28,w23,w24			// a^b, b^c in next round
+	eor	w16,w16,w11,ror#11	// Sigma1(e)
+	ror	w11,w23,#2
+	add	w22,w22,w17			// h+=Ch(e,f,g)
+	eor	w17,w23,w23,ror#9
+	add	w22,w22,w16			// h+=Sigma1(e)
+	and	w19,w19,w28			// (b^c)&=(a^b)
+	add	w26,w26,w22			// d+=h
+	eor	w19,w19,w24			// Maj(a,b,c)
+	eor	w17,w11,w17,ror#13	// Sigma0(a)
+	add	w22,w22,w19			// h+=Maj(a,b,c)
+	ldr	w19,[x30],#4		// *K++, w28 in next round
+	//add	w22,w22,w17			// h+=Sigma0(a)
+#ifndef	__AARCH64EB__
+	rev	w9,w9			// 6
+#endif
+	add	w22,w22,w17			// h+=Sigma0(a)
+	ror	w16,w26,#6
+	add	w21,w21,w19			// h+=K[i]
+	eor	w12,w26,w26,ror#14
+	and	w17,w27,w26
+	bic	w19,w20,w26
+	add	w21,w21,w9			// h+=X[i]
+	orr	w17,w17,w19			// Ch(e,f,g)
+	eor	w19,w22,w23			// a^b, b^c in next round
+	eor	w16,w16,w12,ror#11	// Sigma1(e)
+	ror	w12,w22,#2
+	add	w21,w21,w17			// h+=Ch(e,f,g)
+	eor	w17,w22,w22,ror#9
+	add	w21,w21,w16			// h+=Sigma1(e)
+	and	w28,w28,w19			// (b^c)&=(a^b)
+	add	w25,w25,w21			// d+=h
+	eor	w28,w28,w23			// Maj(a,b,c)
+	eor	w17,w12,w17,ror#13	// Sigma0(a)
+	add	w21,w21,w28			// h+=Maj(a,b,c)
+	ldr	w28,[x30],#4		// *K++, w19 in next round
+	//add	w21,w21,w17			// h+=Sigma0(a)
+#ifndef	__AARCH64EB__
+	rev	w10,w10			// 7
+#endif
+	ldp	w11,w12,[x1],#2*4
+	add	w21,w21,w17			// h+=Sigma0(a)
+	ror	w16,w25,#6
+	add	w20,w20,w28			// h+=K[i]
+	eor	w13,w25,w25,ror#14
+	and	w17,w26,w25
+	bic	w28,w27,w25
+	add	w20,w20,w10			// h+=X[i]
+	orr	w17,w17,w28			// Ch(e,f,g)
+	eor	w28,w21,w22			// a^b, b^c in next round
+	eor	w16,w16,w13,ror#11	// Sigma1(e)
+	ror	w13,w21,#2
+	add	w20,w20,w17			// h+=Ch(e,f,g)
+	eor	w17,w21,w21,ror#9
+	add	w20,w20,w16			// h+=Sigma1(e)
+	and	w19,w19,w28			// (b^c)&=(a^b)
+	add	w24,w24,w20			// d+=h
+	eor	w19,w19,w22			// Maj(a,b,c)
+	eor	w17,w13,w17,ror#13	// Sigma0(a)
+	add	w20,w20,w19			// h+=Maj(a,b,c)
+	ldr	w19,[x30],#4		// *K++, w28 in next round
+	//add	w20,w20,w17			// h+=Sigma0(a)
+#ifndef	__AARCH64EB__
+	rev	w11,w11			// 8
+#endif
+	add	w20,w20,w17			// h+=Sigma0(a)
+	ror	w16,w24,#6
+	add	w27,w27,w19			// h+=K[i]
+	eor	w14,w24,w24,ror#14
+	and	w17,w25,w24
+	bic	w19,w26,w24
+	add	w27,w27,w11			// h+=X[i]
+	orr	w17,w17,w19			// Ch(e,f,g)
+	eor	w19,w20,w21			// a^b, b^c in next round
+	eor	w16,w16,w14,ror#11	// Sigma1(e)
+	ror	w14,w20,#2
+	add	w27,w27,w17			// h+=Ch(e,f,g)
+	eor	w17,w20,w20,ror#9
+	add	w27,w27,w16			// h+=Sigma1(e)
+	and	w28,w28,w19			// (b^c)&=(a^b)
+	add	w23,w23,w27			// d+=h
+	eor	w28,w28,w21			// Maj(a,b,c)
+	eor	w17,w14,w17,ror#13	// Sigma0(a)
+	add	w27,w27,w28			// h+=Maj(a,b,c)
+	ldr	w28,[x30],#4		// *K++, w19 in next round
+	//add	w27,w27,w17			// h+=Sigma0(a)
+#ifndef	__AARCH64EB__
+	rev	w12,w12			// 9
+#endif
+	ldp	w13,w14,[x1],#2*4
+	add	w27,w27,w17			// h+=Sigma0(a)
+	ror	w16,w23,#6
+	add	w26,w26,w28			// h+=K[i]
+	eor	w15,w23,w23,ror#14
+	and	w17,w24,w23
+	bic	w28,w25,w23
+	add	w26,w26,w12			// h+=X[i]
+	orr	w17,w17,w28			// Ch(e,f,g)
+	eor	w28,w27,w20			// a^b, b^c in next round
+	eor	w16,w16,w15,ror#11	// Sigma1(e)
+	ror	w15,w27,#2
+	add	w26,w26,w17			// h+=Ch(e,f,g)
+	eor	w17,w27,w27,ror#9
+	add	w26,w26,w16			// h+=Sigma1(e)
+	and	w19,w19,w28			// (b^c)&=(a^b)
+	add	w22,w22,w26			// d+=h
+	eor	w19,w19,w20			// Maj(a,b,c)
+	eor	w17,w15,w17,ror#13	// Sigma0(a)
+	add	w26,w26,w19			// h+=Maj(a,b,c)
+	ldr	w19,[x30],#4		// *K++, w28 in next round
+	//add	w26,w26,w17			// h+=Sigma0(a)
+#ifndef	__AARCH64EB__
+	rev	w13,w13			// 10
+#endif
+	add	w26,w26,w17			// h+=Sigma0(a)
+	ror	w16,w22,#6
+	add	w25,w25,w19			// h+=K[i]
+	eor	w0,w22,w22,ror#14
+	and	w17,w23,w22
+	bic	w19,w24,w22
+	add	w25,w25,w13			// h+=X[i]
+	orr	w17,w17,w19			// Ch(e,f,g)
+	eor	w19,w26,w27			// a^b, b^c in next round
+	eor	w16,w16,w0,ror#11	// Sigma1(e)
+	ror	w0,w26,#2
+	add	w25,w25,w17			// h+=Ch(e,f,g)
+	eor	w17,w26,w26,ror#9
+	add	w25,w25,w16			// h+=Sigma1(e)
+	and	w28,w28,w19			// (b^c)&=(a^b)
+	add	w21,w21,w25			// d+=h
+	eor	w28,w28,w27			// Maj(a,b,c)
+	eor	w17,w0,w17,ror#13	// Sigma0(a)
+	add	w25,w25,w28			// h+=Maj(a,b,c)
+	ldr	w28,[x30],#4		// *K++, w19 in next round
+	//add	w25,w25,w17			// h+=Sigma0(a)
+#ifndef	__AARCH64EB__
+	rev	w14,w14			// 11
+#endif
+	ldp	w15,w0,[x1],#2*4
+	add	w25,w25,w17			// h+=Sigma0(a)
+	str	w6,[sp,#12]
+	ror	w16,w21,#6
+	add	w24,w24,w28			// h+=K[i]
+	eor	w6,w21,w21,ror#14
+	and	w17,w22,w21
+	bic	w28,w23,w21
+	add	w24,w24,w14			// h+=X[i]
+	orr	w17,w17,w28			// Ch(e,f,g)
+	eor	w28,w25,w26			// a^b, b^c in next round
+	eor	w16,w16,w6,ror#11	// Sigma1(e)
+	ror	w6,w25,#2
+	add	w24,w24,w17			// h+=Ch(e,f,g)
+	eor	w17,w25,w25,ror#9
+	add	w24,w24,w16			// h+=Sigma1(e)
+	and	w19,w19,w28			// (b^c)&=(a^b)
+	add	w20,w20,w24			// d+=h
+	eor	w19,w19,w26			// Maj(a,b,c)
+	eor	w17,w6,w17,ror#13	// Sigma0(a)
+	add	w24,w24,w19			// h+=Maj(a,b,c)
+	ldr	w19,[x30],#4		// *K++, w28 in next round
+	//add	w24,w24,w17			// h+=Sigma0(a)
+#ifndef	__AARCH64EB__
+	rev	w15,w15			// 12
+#endif
+	add	w24,w24,w17			// h+=Sigma0(a)
+	str	w7,[sp,#0]
+	ror	w16,w20,#6
+	add	w23,w23,w19			// h+=K[i]
+	eor	w7,w20,w20,ror#14
+	and	w17,w21,w20
+	bic	w19,w22,w20
+	add	w23,w23,w15			// h+=X[i]
+	orr	w17,w17,w19			// Ch(e,f,g)
+	eor	w19,w24,w25			// a^b, b^c in next round
+	eor	w16,w16,w7,ror#11	// Sigma1(e)
+	ror	w7,w24,#2
+	add	w23,w23,w17			// h+=Ch(e,f,g)
+	eor	w17,w24,w24,ror#9
+	add	w23,w23,w16			// h+=Sigma1(e)
+	and	w28,w28,w19			// (b^c)&=(a^b)
+	add	w27,w27,w23			// d+=h
+	eor	w28,w28,w25			// Maj(a,b,c)
+	eor	w17,w7,w17,ror#13	// Sigma0(a)
+	add	w23,w23,w28			// h+=Maj(a,b,c)
+	ldr	w28,[x30],#4		// *K++, w19 in next round
+	//add	w23,w23,w17			// h+=Sigma0(a)
+#ifndef	__AARCH64EB__
+	rev	w0,w0			// 13
+#endif
+	ldp	w1,w2,[x1]
+	add	w23,w23,w17			// h+=Sigma0(a)
+	str	w8,[sp,#4]
+	ror	w16,w27,#6
+	add	w22,w22,w28			// h+=K[i]
+	eor	w8,w27,w27,ror#14
+	and	w17,w20,w27
+	bic	w28,w21,w27
+	add	w22,w22,w0			// h+=X[i]
+	orr	w17,w17,w28			// Ch(e,f,g)
+	eor	w28,w23,w24			// a^b, b^c in next round
+	eor	w16,w16,w8,ror#11	// Sigma1(e)
+	ror	w8,w23,#2
+	add	w22,w22,w17			// h+=Ch(e,f,g)
+	eor	w17,w23,w23,ror#9
+	add	w22,w22,w16			// h+=Sigma1(e)
+	and	w19,w19,w28			// (b^c)&=(a^b)
+	add	w26,w26,w22			// d+=h
+	eor	w19,w19,w24			// Maj(a,b,c)
+	eor	w17,w8,w17,ror#13	// Sigma0(a)
+	add	w22,w22,w19			// h+=Maj(a,b,c)
+	ldr	w19,[x30],#4		// *K++, w28 in next round
+	//add	w22,w22,w17			// h+=Sigma0(a)
+#ifndef	__AARCH64EB__
+	rev	w1,w1			// 14
+#endif
+	ldr	w6,[sp,#12]
+	add	w22,w22,w17			// h+=Sigma0(a)
+	str	w9,[sp,#8]
+	ror	w16,w26,#6
+	add	w21,w21,w19			// h+=K[i]
+	eor	w9,w26,w26,ror#14
+	and	w17,w27,w26
+	bic	w19,w20,w26
+	add	w21,w21,w1			// h+=X[i]
+	orr	w17,w17,w19			// Ch(e,f,g)
+	eor	w19,w22,w23			// a^b, b^c in next round
+	eor	w16,w16,w9,ror#11	// Sigma1(e)
+	ror	w9,w22,#2
+	add	w21,w21,w17			// h+=Ch(e,f,g)
+	eor	w17,w22,w22,ror#9
+	add	w21,w21,w16			// h+=Sigma1(e)
+	and	w28,w28,w19			// (b^c)&=(a^b)
+	add	w25,w25,w21			// d+=h
+	eor	w28,w28,w23			// Maj(a,b,c)
+	eor	w17,w9,w17,ror#13	// Sigma0(a)
+	add	w21,w21,w28			// h+=Maj(a,b,c)
+	ldr	w28,[x30],#4		// *K++, w19 in next round
+	//add	w21,w21,w17			// h+=Sigma0(a)
+#ifndef	__AARCH64EB__
+	rev	w2,w2			// 15
+#endif
+	ldr	w7,[sp,#0]
+	add	w21,w21,w17			// h+=Sigma0(a)
+	str	w10,[sp,#12]
+	ror	w16,w25,#6
+	add	w20,w20,w28			// h+=K[i]
+	ror	w9,w4,#7
+	and	w17,w26,w25
+	ror	w8,w1,#17
+	bic	w28,w27,w25
+	ror	w10,w21,#2
+	add	w20,w20,w2			// h+=X[i]
+	eor	w16,w16,w25,ror#11
+	eor	w9,w9,w4,ror#18
+	orr	w17,w17,w28			// Ch(e,f,g)
+	eor	w28,w21,w22			// a^b, b^c in next round
+	eor	w16,w16,w25,ror#25	// Sigma1(e)
+	eor	w10,w10,w21,ror#13
+	add	w20,w20,w17			// h+=Ch(e,f,g)
+	and	w19,w19,w28			// (b^c)&=(a^b)
+	eor	w8,w8,w1,ror#19
+	eor	w9,w9,w4,lsr#3	// sigma0(X[i+1])
+	add	w20,w20,w16			// h+=Sigma1(e)
+	eor	w19,w19,w22			// Maj(a,b,c)
+	eor	w17,w10,w21,ror#22	// Sigma0(a)
+	eor	w8,w8,w1,lsr#10	// sigma1(X[i+14])
+	add	w3,w3,w12
+	add	w24,w24,w20			// d+=h
+	add	w20,w20,w19			// h+=Maj(a,b,c)
+	ldr	w19,[x30],#4		// *K++, w28 in next round
+	add	w3,w3,w9
+	add	w20,w20,w17			// h+=Sigma0(a)
+	add	w3,w3,w8
+.Loop_16_xx:
+	ldr	w8,[sp,#4]
+	str	w11,[sp,#0]
+	ror	w16,w24,#6
+	add	w27,w27,w19			// h+=K[i]
+	ror	w10,w5,#7
+	and	w17,w25,w24
+	ror	w9,w2,#17
+	bic	w19,w26,w24
+	ror	w11,w20,#2
+	add	w27,w27,w3			// h+=X[i]
+	eor	w16,w16,w24,ror#11
+	eor	w10,w10,w5,ror#18
+	orr	w17,w17,w19			// Ch(e,f,g)
+	eor	w19,w20,w21			// a^b, b^c in next round
+	eor	w16,w16,w24,ror#25	// Sigma1(e)
+	eor	w11,w11,w20,ror#13
+	add	w27,w27,w17			// h+=Ch(e,f,g)
+	and	w28,w28,w19			// (b^c)&=(a^b)
+	eor	w9,w9,w2,ror#19
+	eor	w10,w10,w5,lsr#3	// sigma0(X[i+1])
+	add	w27,w27,w16			// h+=Sigma1(e)
+	eor	w28,w28,w21			// Maj(a,b,c)
+	eor	w17,w11,w20,ror#22	// Sigma0(a)
+	eor	w9,w9,w2,lsr#10	// sigma1(X[i+14])
+	add	w4,w4,w13
+	add	w23,w23,w27			// d+=h
+	add	w27,w27,w28			// h+=Maj(a,b,c)
+	ldr	w28,[x30],#4		// *K++, w19 in next round
+	add	w4,w4,w10
+	add	w27,w27,w17			// h+=Sigma0(a)
+	add	w4,w4,w9
+	ldr	w9,[sp,#8]
+	str	w12,[sp,#4]
+	ror	w16,w23,#6
+	add	w26,w26,w28			// h+=K[i]
+	ror	w11,w6,#7
+	and	w17,w24,w23
+	ror	w10,w3,#17
+	bic	w28,w25,w23
+	ror	w12,w27,#2
+	add	w26,w26,w4			// h+=X[i]
+	eor	w16,w16,w23,ror#11
+	eor	w11,w11,w6,ror#18
+	orr	w17,w17,w28			// Ch(e,f,g)
+	eor	w28,w27,w20			// a^b, b^c in next round
+	eor	w16,w16,w23,ror#25	// Sigma1(e)
+	eor	w12,w12,w27,ror#13
+	add	w26,w26,w17			// h+=Ch(e,f,g)
+	and	w19,w19,w28			// (b^c)&=(a^b)
+	eor	w10,w10,w3,ror#19
+	eor	w11,w11,w6,lsr#3	// sigma0(X[i+1])
+	add	w26,w26,w16			// h+=Sigma1(e)
+	eor	w19,w19,w20			// Maj(a,b,c)
+	eor	w17,w12,w27,ror#22	// Sigma0(a)
+	eor	w10,w10,w3,lsr#10	// sigma1(X[i+14])
+	add	w5,w5,w14
+	add	w22,w22,w26			// d+=h
+	add	w26,w26,w19			// h+=Maj(a,b,c)
+	ldr	w19,[x30],#4		// *K++, w28 in next round
+	add	w5,w5,w11
+	add	w26,w26,w17			// h+=Sigma0(a)
+	add	w5,w5,w10
+	ldr	w10,[sp,#12]
+	str	w13,[sp,#8]
+	ror	w16,w22,#6
+	add	w25,w25,w19			// h+=K[i]
+	ror	w12,w7,#7
+	and	w17,w23,w22
+	ror	w11,w4,#17
+	bic	w19,w24,w22
+	ror	w13,w26,#2
+	add	w25,w25,w5			// h+=X[i]
+	eor	w16,w16,w22,ror#11
+	eor	w12,w12,w7,ror#18
+	orr	w17,w17,w19			// Ch(e,f,g)
+	eor	w19,w26,w27			// a^b, b^c in next round
+	eor	w16,w16,w22,ror#25	// Sigma1(e)
+	eor	w13,w13,w26,ror#13
+	add	w25,w25,w17			// h+=Ch(e,f,g)
+	and	w28,w28,w19			// (b^c)&=(a^b)
+	eor	w11,w11,w4,ror#19
+	eor	w12,w12,w7,lsr#3	// sigma0(X[i+1])
+	add	w25,w25,w16			// h+=Sigma1(e)
+	eor	w28,w28,w27			// Maj(a,b,c)
+	eor	w17,w13,w26,ror#22	// Sigma0(a)
+	eor	w11,w11,w4,lsr#10	// sigma1(X[i+14])
+	add	w6,w6,w15
+	add	w21,w21,w25			// d+=h
+	add	w25,w25,w28			// h+=Maj(a,b,c)
+	ldr	w28,[x30],#4		// *K++, w19 in next round
+	add	w6,w6,w12
+	add	w25,w25,w17			// h+=Sigma0(a)
+	add	w6,w6,w11
+	ldr	w11,[sp,#0]
+	str	w14,[sp,#12]
+	ror	w16,w21,#6
+	add	w24,w24,w28			// h+=K[i]
+	ror	w13,w8,#7
+	and	w17,w22,w21
+	ror	w12,w5,#17
+	bic	w28,w23,w21
+	ror	w14,w25,#2
+	add	w24,w24,w6			// h+=X[i]
+	eor	w16,w16,w21,ror#11
+	eor	w13,w13,w8,ror#18
+	orr	w17,w17,w28			// Ch(e,f,g)
+	eor	w28,w25,w26			// a^b, b^c in next round
+	eor	w16,w16,w21,ror#25	// Sigma1(e)
+	eor	w14,w14,w25,ror#13
+	add	w24,w24,w17			// h+=Ch(e,f,g)
+	and	w19,w19,w28			// (b^c)&=(a^b)
+	eor	w12,w12,w5,ror#19
+	eor	w13,w13,w8,lsr#3	// sigma0(X[i+1])
+	add	w24,w24,w16			// h+=Sigma1(e)
+	eor	w19,w19,w26			// Maj(a,b,c)
+	eor	w17,w14,w25,ror#22	// Sigma0(a)
+	eor	w12,w12,w5,lsr#10	// sigma1(X[i+14])
+	add	w7,w7,w0
+	add	w20,w20,w24			// d+=h
+	add	w24,w24,w19			// h+=Maj(a,b,c)
+	ldr	w19,[x30],#4		// *K++, w28 in next round
+	add	w7,w7,w13
+	add	w24,w24,w17			// h+=Sigma0(a)
+	add	w7,w7,w12
+	ldr	w12,[sp,#4]
+	str	w15,[sp,#0]
+	ror	w16,w20,#6
+	add	w23,w23,w19			// h+=K[i]
+	ror	w14,w9,#7
+	and	w17,w21,w20
+	ror	w13,w6,#17
+	bic	w19,w22,w20
+	ror	w15,w24,#2
+	add	w23,w23,w7			// h+=X[i]
+	eor	w16,w16,w20,ror#11
+	eor	w14,w14,w9,ror#18
+	orr	w17,w17,w19			// Ch(e,f,g)
+	eor	w19,w24,w25			// a^b, b^c in next round
+	eor	w16,w16,w20,ror#25	// Sigma1(e)
+	eor	w15,w15,w24,ror#13
+	add	w23,w23,w17			// h+=Ch(e,f,g)
+	and	w28,w28,w19			// (b^c)&=(a^b)
+	eor	w13,w13,w6,ror#19
+	eor	w14,w14,w9,lsr#3	// sigma0(X[i+1])
+	add	w23,w23,w16			// h+=Sigma1(e)
+	eor	w28,w28,w25			// Maj(a,b,c)
+	eor	w17,w15,w24,ror#22	// Sigma0(a)
+	eor	w13,w13,w6,lsr#10	// sigma1(X[i+14])
+	add	w8,w8,w1
+	add	w27,w27,w23			// d+=h
+	add	w23,w23,w28			// h+=Maj(a,b,c)
+	ldr	w28,[x30],#4		// *K++, w19 in next round
+	add	w8,w8,w14
+	add	w23,w23,w17			// h+=Sigma0(a)
+	add	w8,w8,w13
+	ldr	w13,[sp,#8]
+	str	w0,[sp,#4]
+	ror	w16,w27,#6
+	add	w22,w22,w28			// h+=K[i]
+	ror	w15,w10,#7
+	and	w17,w20,w27
+	ror	w14,w7,#17
+	bic	w28,w21,w27
+	ror	w0,w23,#2
+	add	w22,w22,w8			// h+=X[i]
+	eor	w16,w16,w27,ror#11
+	eor	w15,w15,w10,ror#18
+	orr	w17,w17,w28			// Ch(e,f,g)
+	eor	w28,w23,w24			// a^b, b^c in next round
+	eor	w16,w16,w27,ror#25	// Sigma1(e)
+	eor	w0,w0,w23,ror#13
+	add	w22,w22,w17			// h+=Ch(e,f,g)
+	and	w19,w19,w28			// (b^c)&=(a^b)
+	eor	w14,w14,w7,ror#19
+	eor	w15,w15,w10,lsr#3	// sigma0(X[i+1])
+	add	w22,w22,w16			// h+=Sigma1(e)
+	eor	w19,w19,w24			// Maj(a,b,c)
+	eor	w17,w0,w23,ror#22	// Sigma0(a)
+	eor	w14,w14,w7,lsr#10	// sigma1(X[i+14])
+	add	w9,w9,w2
+	add	w26,w26,w22			// d+=h
+	add	w22,w22,w19			// h+=Maj(a,b,c)
+	ldr	w19,[x30],#4		// *K++, w28 in next round
+	add	w9,w9,w15
+	add	w22,w22,w17			// h+=Sigma0(a)
+	add	w9,w9,w14
+	ldr	w14,[sp,#12]
+	str	w1,[sp,#8]
+	ror	w16,w26,#6
+	add	w21,w21,w19			// h+=K[i]
+	ror	w0,w11,#7
+	and	w17,w27,w26
+	ror	w15,w8,#17
+	bic	w19,w20,w26
+	ror	w1,w22,#2
+	add	w21,w21,w9			// h+=X[i]
+	eor	w16,w16,w26,ror#11
+	eor	w0,w0,w11,ror#18
+	orr	w17,w17,w19			// Ch(e,f,g)
+	eor	w19,w22,w23			// a^b, b^c in next round
+	eor	w16,w16,w26,ror#25	// Sigma1(e)
+	eor	w1,w1,w22,ror#13
+	add	w21,w21,w17			// h+=Ch(e,f,g)
+	and	w28,w28,w19			// (b^c)&=(a^b)
+	eor	w15,w15,w8,ror#19
+	eor	w0,w0,w11,lsr#3	// sigma0(X[i+1])
+	add	w21,w21,w16			// h+=Sigma1(e)
+	eor	w28,w28,w23			// Maj(a,b,c)
+	eor	w17,w1,w22,ror#22	// Sigma0(a)
+	eor	w15,w15,w8,lsr#10	// sigma1(X[i+14])
+	add	w10,w10,w3
+	add	w25,w25,w21			// d+=h
+	add	w21,w21,w28			// h+=Maj(a,b,c)
+	ldr	w28,[x30],#4		// *K++, w19 in next round
+	add	w10,w10,w0
+	add	w21,w21,w17			// h+=Sigma0(a)
+	add	w10,w10,w15
+	ldr	w15,[sp,#0]
+	str	w2,[sp,#12]
+	ror	w16,w25,#6
+	add	w20,w20,w28			// h+=K[i]
+	ror	w1,w12,#7
+	and	w17,w26,w25
+	ror	w0,w9,#17
+	bic	w28,w27,w25
+	ror	w2,w21,#2
+	add	w20,w20,w10			// h+=X[i]
+	eor	w16,w16,w25,ror#11
+	eor	w1,w1,w12,ror#18
+	orr	w17,w17,w28			// Ch(e,f,g)
+	eor	w28,w21,w22			// a^b, b^c in next round
+	eor	w16,w16,w25,ror#25	// Sigma1(e)
+	eor	w2,w2,w21,ror#13
+	add	w20,w20,w17			// h+=Ch(e,f,g)
+	and	w19,w19,w28			// (b^c)&=(a^b)
+	eor	w0,w0,w9,ror#19
+	eor	w1,w1,w12,lsr#3	// sigma0(X[i+1])
+	add	w20,w20,w16			// h+=Sigma1(e)
+	eor	w19,w19,w22			// Maj(a,b,c)
+	eor	w17,w2,w21,ror#22	// Sigma0(a)
+	eor	w0,w0,w9,lsr#10	// sigma1(X[i+14])
+	add	w11,w11,w4
+	add	w24,w24,w20			// d+=h
+	add	w20,w20,w19			// h+=Maj(a,b,c)
+	ldr	w19,[x30],#4		// *K++, w28 in next round
+	add	w11,w11,w1
+	add	w20,w20,w17			// h+=Sigma0(a)
+	add	w11,w11,w0
+	ldr	w0,[sp,#4]
+	str	w3,[sp,#0]
+	ror	w16,w24,#6
+	add	w27,w27,w19			// h+=K[i]
+	ror	w2,w13,#7
+	and	w17,w25,w24
+	ror	w1,w10,#17
+	bic	w19,w26,w24
+	ror	w3,w20,#2
+	add	w27,w27,w11			// h+=X[i]
+	eor	w16,w16,w24,ror#11
+	eor	w2,w2,w13,ror#18
+	orr	w17,w17,w19			// Ch(e,f,g)
+	eor	w19,w20,w21			// a^b, b^c in next round
+	eor	w16,w16,w24,ror#25	// Sigma1(e)
+	eor	w3,w3,w20,ror#13
+	add	w27,w27,w17			// h+=Ch(e,f,g)
+	and	w28,w28,w19			// (b^c)&=(a^b)
+	eor	w1,w1,w10,ror#19
+	eor	w2,w2,w13,lsr#3	// sigma0(X[i+1])
+	add	w27,w27,w16			// h+=Sigma1(e)
+	eor	w28,w28,w21			// Maj(a,b,c)
+	eor	w17,w3,w20,ror#22	// Sigma0(a)
+	eor	w1,w1,w10,lsr#10	// sigma1(X[i+14])
+	add	w12,w12,w5
+	add	w23,w23,w27			// d+=h
+	add	w27,w27,w28			// h+=Maj(a,b,c)
+	ldr	w28,[x30],#4		// *K++, w19 in next round
+	add	w12,w12,w2
+	add	w27,w27,w17			// h+=Sigma0(a)
+	add	w12,w12,w1
+	ldr	w1,[sp,#8]
+	str	w4,[sp,#4]
+	ror	w16,w23,#6
+	add	w26,w26,w28			// h+=K[i]
+	ror	w3,w14,#7
+	and	w17,w24,w23
+	ror	w2,w11,#17
+	bic	w28,w25,w23
+	ror	w4,w27,#2
+	add	w26,w26,w12			// h+=X[i]
+	eor	w16,w16,w23,ror#11
+	eor	w3,w3,w14,ror#18
+	orr	w17,w17,w28			// Ch(e,f,g)
+	eor	w28,w27,w20			// a^b, b^c in next round
+	eor	w16,w16,w23,ror#25	// Sigma1(e)
+	eor	w4,w4,w27,ror#13
+	add	w26,w26,w17			// h+=Ch(e,f,g)
+	and	w19,w19,w28			// (b^c)&=(a^b)
+	eor	w2,w2,w11,ror#19
+	eor	w3,w3,w14,lsr#3	// sigma0(X[i+1])
+	add	w26,w26,w16			// h+=Sigma1(e)
+	eor	w19,w19,w20			// Maj(a,b,c)
+	eor	w17,w4,w27,ror#22	// Sigma0(a)
+	eor	w2,w2,w11,lsr#10	// sigma1(X[i+14])
+	add	w13,w13,w6
+	add	w22,w22,w26			// d+=h
+	add	w26,w26,w19			// h+=Maj(a,b,c)
+	ldr	w19,[x30],#4		// *K++, w28 in next round
+	add	w13,w13,w3
+	add	w26,w26,w17			// h+=Sigma0(a)
+	add	w13,w13,w2
+	ldr	w2,[sp,#12]
+	str	w5,[sp,#8]
+	ror	w16,w22,#6
+	add	w25,w25,w19			// h+=K[i]
+	ror	w4,w15,#7
+	and	w17,w23,w22
+	ror	w3,w12,#17
+	bic	w19,w24,w22
+	ror	w5,w26,#2
+	add	w25,w25,w13			// h+=X[i]
+	eor	w16,w16,w22,ror#11
+	eor	w4,w4,w15,ror#18
+	orr	w17,w17,w19			// Ch(e,f,g)
+	eor	w19,w26,w27			// a^b, b^c in next round
+	eor	w16,w16,w22,ror#25	// Sigma1(e)
+	eor	w5,w5,w26,ror#13
+	add	w25,w25,w17			// h+=Ch(e,f,g)
+	and	w28,w28,w19			// (b^c)&=(a^b)
+	eor	w3,w3,w12,ror#19
+	eor	w4,w4,w15,lsr#3	// sigma0(X[i+1])
+	add	w25,w25,w16			// h+=Sigma1(e)
+	eor	w28,w28,w27			// Maj(a,b,c)
+	eor	w17,w5,w26,ror#22	// Sigma0(a)
+	eor	w3,w3,w12,lsr#10	// sigma1(X[i+14])
+	add	w14,w14,w7
+	add	w21,w21,w25			// d+=h
+	add	w25,w25,w28			// h+=Maj(a,b,c)
+	ldr	w28,[x30],#4		// *K++, w19 in next round
+	add	w14,w14,w4
+	add	w25,w25,w17			// h+=Sigma0(a)
+	add	w14,w14,w3
+	ldr	w3,[sp,#0]
+	str	w6,[sp,#12]
+	ror	w16,w21,#6
+	add	w24,w24,w28			// h+=K[i]
+	ror	w5,w0,#7
+	and	w17,w22,w21
+	ror	w4,w13,#17
+	bic	w28,w23,w21
+	ror	w6,w25,#2
+	add	w24,w24,w14			// h+=X[i]
+	eor	w16,w16,w21,ror#11
+	eor	w5,w5,w0,ror#18
+	orr	w17,w17,w28			// Ch(e,f,g)
+	eor	w28,w25,w26			// a^b, b^c in next round
+	eor	w16,w16,w21,ror#25	// Sigma1(e)
+	eor	w6,w6,w25,ror#13
+	add	w24,w24,w17			// h+=Ch(e,f,g)
+	and	w19,w19,w28			// (b^c)&=(a^b)
+	eor	w4,w4,w13,ror#19
+	eor	w5,w5,w0,lsr#3	// sigma0(X[i+1])
+	add	w24,w24,w16			// h+=Sigma1(e)
+	eor	w19,w19,w26			// Maj(a,b,c)
+	eor	w17,w6,w25,ror#22	// Sigma0(a)
+	eor	w4,w4,w13,lsr#10	// sigma1(X[i+14])
+	add	w15,w15,w8
+	add	w20,w20,w24			// d+=h
+	add	w24,w24,w19			// h+=Maj(a,b,c)
+	ldr	w19,[x30],#4		// *K++, w28 in next round
+	add	w15,w15,w5
+	add	w24,w24,w17			// h+=Sigma0(a)
+	add	w15,w15,w4
+	ldr	w4,[sp,#4]
+	str	w7,[sp,#0]
+	ror	w16,w20,#6
+	add	w23,w23,w19			// h+=K[i]
+	ror	w6,w1,#7
+	and	w17,w21,w20
+	ror	w5,w14,#17
+	bic	w19,w22,w20
+	ror	w7,w24,#2
+	add	w23,w23,w15			// h+=X[i]
+	eor	w16,w16,w20,ror#11
+	eor	w6,w6,w1,ror#18
+	orr	w17,w17,w19			// Ch(e,f,g)
+	eor	w19,w24,w25			// a^b, b^c in next round
+	eor	w16,w16,w20,ror#25	// Sigma1(e)
+	eor	w7,w7,w24,ror#13
+	add	w23,w23,w17			// h+=Ch(e,f,g)
+	and	w28,w28,w19			// (b^c)&=(a^b)
+	eor	w5,w5,w14,ror#19
+	eor	w6,w6,w1,lsr#3	// sigma0(X[i+1])
+	add	w23,w23,w16			// h+=Sigma1(e)
+	eor	w28,w28,w25			// Maj(a,b,c)
+	eor	w17,w7,w24,ror#22	// Sigma0(a)
+	eor	w5,w5,w14,lsr#10	// sigma1(X[i+14])
+	add	w0,w0,w9
+	add	w27,w27,w23			// d+=h
+	add	w23,w23,w28			// h+=Maj(a,b,c)
+	ldr	w28,[x30],#4		// *K++, w19 in next round
+	add	w0,w0,w6
+	add	w23,w23,w17			// h+=Sigma0(a)
+	add	w0,w0,w5
+	ldr	w5,[sp,#8]
+	str	w8,[sp,#4]
+	ror	w16,w27,#6
+	add	w22,w22,w28			// h+=K[i]
+	ror	w7,w2,#7
+	and	w17,w20,w27
+	ror	w6,w15,#17
+	bic	w28,w21,w27
+	ror	w8,w23,#2
+	add	w22,w22,w0			// h+=X[i]
+	eor	w16,w16,w27,ror#11
+	eor	w7,w7,w2,ror#18
+	orr	w17,w17,w28			// Ch(e,f,g)
+	eor	w28,w23,w24			// a^b, b^c in next round
+	eor	w16,w16,w27,ror#25	// Sigma1(e)
+	eor	w8,w8,w23,ror#13
+	add	w22,w22,w17			// h+=Ch(e,f,g)
+	and	w19,w19,w28			// (b^c)&=(a^b)
+	eor	w6,w6,w15,ror#19
+	eor	w7,w7,w2,lsr#3	// sigma0(X[i+1])
+	add	w22,w22,w16			// h+=Sigma1(e)
+	eor	w19,w19,w24			// Maj(a,b,c)
+	eor	w17,w8,w23,ror#22	// Sigma0(a)
+	eor	w6,w6,w15,lsr#10	// sigma1(X[i+14])
+	add	w1,w1,w10
+	add	w26,w26,w22			// d+=h
+	add	w22,w22,w19			// h+=Maj(a,b,c)
+	ldr	w19,[x30],#4		// *K++, w28 in next round
+	add	w1,w1,w7
+	add	w22,w22,w17			// h+=Sigma0(a)
+	add	w1,w1,w6
+	ldr	w6,[sp,#12]
+	str	w9,[sp,#8]
+	ror	w16,w26,#6
+	add	w21,w21,w19			// h+=K[i]
+	ror	w8,w3,#7
+	and	w17,w27,w26
+	ror	w7,w0,#17
+	bic	w19,w20,w26
+	ror	w9,w22,#2
+	add	w21,w21,w1			// h+=X[i]
+	eor	w16,w16,w26,ror#11
+	eor	w8,w8,w3,ror#18
+	orr	w17,w17,w19			// Ch(e,f,g)
+	eor	w19,w22,w23			// a^b, b^c in next round
+	eor	w16,w16,w26,ror#25	// Sigma1(e)
+	eor	w9,w9,w22,ror#13
+	add	w21,w21,w17			// h+=Ch(e,f,g)
+	and	w28,w28,w19			// (b^c)&=(a^b)
+	eor	w7,w7,w0,ror#19
+	eor	w8,w8,w3,lsr#3	// sigma0(X[i+1])
+	add	w21,w21,w16			// h+=Sigma1(e)
+	eor	w28,w28,w23			// Maj(a,b,c)
+	eor	w17,w9,w22,ror#22	// Sigma0(a)
+	eor	w7,w7,w0,lsr#10	// sigma1(X[i+14])
+	add	w2,w2,w11
+	add	w25,w25,w21			// d+=h
+	add	w21,w21,w28			// h+=Maj(a,b,c)
+	ldr	w28,[x30],#4		// *K++, w19 in next round
+	add	w2,w2,w8
+	add	w21,w21,w17			// h+=Sigma0(a)
+	add	w2,w2,w7
+	ldr	w7,[sp,#0]
+	str	w10,[sp,#12]
+	ror	w16,w25,#6
+	add	w20,w20,w28			// h+=K[i]
+	ror	w9,w4,#7
+	and	w17,w26,w25
+	ror	w8,w1,#17
+	bic	w28,w27,w25
+	ror	w10,w21,#2
+	add	w20,w20,w2			// h+=X[i]
+	eor	w16,w16,w25,ror#11
+	eor	w9,w9,w4,ror#18
+	orr	w17,w17,w28			// Ch(e,f,g)
+	eor	w28,w21,w22			// a^b, b^c in next round
+	eor	w16,w16,w25,ror#25	// Sigma1(e)
+	eor	w10,w10,w21,ror#13
+	add	w20,w20,w17			// h+=Ch(e,f,g)
+	and	w19,w19,w28			// (b^c)&=(a^b)
+	eor	w8,w8,w1,ror#19
+	eor	w9,w9,w4,lsr#3	// sigma0(X[i+1])
+	add	w20,w20,w16			// h+=Sigma1(e)
+	eor	w19,w19,w22			// Maj(a,b,c)
+	eor	w17,w10,w21,ror#22	// Sigma0(a)
+	eor	w8,w8,w1,lsr#10	// sigma1(X[i+14])
+	add	w3,w3,w12
+	add	w24,w24,w20			// d+=h
+	add	w20,w20,w19			// h+=Maj(a,b,c)
+	ldr	w19,[x30],#4		// *K++, w28 in next round
+	add	w3,w3,w9
+	add	w20,w20,w17			// h+=Sigma0(a)
+	add	w3,w3,w8
+	cbnz	w19,.Loop_16_xx
+
+	ldp	x0,x2,[x29,#96]
+	ldr	x1,[x29,#112]
+	sub	x30,x30,#260		// rewind
+
+	ldp	w3,w4,[x0]
+	ldp	w5,w6,[x0,#2*4]
+	add	x1,x1,#14*4			// advance input pointer
+	ldp	w7,w8,[x0,#4*4]
+	add	w20,w20,w3
+	ldp	w9,w10,[x0,#6*4]
+	add	w21,w21,w4
+	add	w22,w22,w5
+	add	w23,w23,w6
+	stp	w20,w21,[x0]
+	add	w24,w24,w7
+	add	w25,w25,w8
+	stp	w22,w23,[x0,#2*4]
+	add	w26,w26,w9
+	add	w27,w27,w10
+	cmp	x1,x2
+	stp	w24,w25,[x0,#4*4]
+	stp	w26,w27,[x0,#6*4]
+	b.ne	.Loop
+
+	ldp	x19,x20,[x29,#16]
+	add	sp,sp,#4*4
+	ldp	x21,x22,[x29,#32]
+	ldp	x23,x24,[x29,#48]
+	ldp	x25,x26,[x29,#64]
+	ldp	x27,x28,[x29,#80]
+	ldp	x29,x30,[sp],#128
+	ret
+.size	sha256_block_data_order,.-sha256_block_data_order
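Each unrolled block in the loop above is one round of the FIPS 180-4 SHA-256 compression function; the register comments (Ch, Maj, Sigma0, Sigma1) map onto it directly, with Maj computed as ((a^b)&(b^c))^b so the a^b term can be reused as the next round's b^c. A plain-C sketch of the same round, illustrative only and not part of the patch (ror32 and sha256_round are hypothetical helper names):

#include <stdint.h>

static inline uint32_t ror32(uint32_t x, int n)
{
	return (x >> n) | (x << (32 - n));
}

/* One SHA-256 round: s[0..7] = a..h, Ki = round constant, Xi = message word */
static void sha256_round(uint32_t s[8], uint32_t Ki, uint32_t Xi)
{
	uint32_t a = s[0], b = s[1], c = s[2], d = s[3];
	uint32_t e = s[4], f = s[5], g = s[6], h = s[7];
	uint32_t S1  = ror32(e, 6) ^ ror32(e, 11) ^ ror32(e, 25);	/* Sigma1(e) */
	uint32_t S0  = ror32(a, 2) ^ ror32(a, 13) ^ ror32(a, 22);	/* Sigma0(a) */
	uint32_t ch  = (e & f) | (~e & g);		/* the and/bic/orr triple */
	uint32_t maj = ((a ^ b) & (b ^ c)) ^ b;		/* the (b^c)&=(a^b) trick */
	uint32_t t1  = h + S1 + ch + Ki + Xi;

	s[7] = g; s[6] = f; s[5] = e;
	s[4] = d + t1;					/* d+=h */
	s[3] = c; s[2] = b; s[1] = a;
	s[0] = t1 + S0 + maj;				/* h+=Sigma0(a), h+=Maj(a,b,c) */
}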
+
+.align	6
+.type	.LK256,%object
+.LK256:
+	.long	0x428a2f98,0x71374491,0xb5c0fbcf,0xe9b5dba5
+	.long	0x3956c25b,0x59f111f1,0x923f82a4,0xab1c5ed5
+	.long	0xd807aa98,0x12835b01,0x243185be,0x550c7dc3
+	.long	0x72be5d74,0x80deb1fe,0x9bdc06a7,0xc19bf174
+	.long	0xe49b69c1,0xefbe4786,0x0fc19dc6,0x240ca1cc
+	.long	0x2de92c6f,0x4a7484aa,0x5cb0a9dc,0x76f988da
+	.long	0x983e5152,0xa831c66d,0xb00327c8,0xbf597fc7
+	.long	0xc6e00bf3,0xd5a79147,0x06ca6351,0x14292967
+	.long	0x27b70a85,0x2e1b2138,0x4d2c6dfc,0x53380d13
+	.long	0x650a7354,0x766a0abb,0x81c2c92e,0x92722c85
+	.long	0xa2bfe8a1,0xa81a664b,0xc24b8b70,0xc76c51a3
+	.long	0xd192e819,0xd6990624,0xf40e3585,0x106aa070
+	.long	0x19a4c116,0x1e376c08,0x2748774c,0x34b0bcb5
+	.long	0x391c0cb3,0x4ed8aa4a,0x5b9cca4f,0x682e6ff3
+	.long	0x748f82ee,0x78a5636f,0x84c87814,0x8cc70208
+	.long	0x90befffa,0xa4506ceb,0xbef9a3f7,0xc67178f2
+	.long	0	// terminator
+.size	.LK256,.-.LK256
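The .LK256 table holds the standard SHA-256 round constants: the first 32 bits of the fractional parts of the cube roots of the first 64 primes, plus a zero terminator that the NEON path below tests with cmp w12,#0. (The .LK512 table in sha512-core.S is built the same way from the 64-bit fractions of the cube roots of the first 80 primes.) A small illustrative check, not part of the patch (build with -lm):

#include <math.h>
#include <stdint.h>
#include <stdio.h>

/* Recompute the first few K256 words; double precision carries enough
 * fraction bits for 32-bit constants of such small cube roots. */
int main(void)
{
	const int primes[] = { 2, 3, 5, 7, 11, 13, 17, 19 };
	for (unsigned i = 0; i < sizeof(primes) / sizeof(primes[0]); i++) {
		double r = cbrt((double)primes[i]);
		uint32_t k = (uint32_t)((r - floor(r)) * 4294967296.0);
		printf("0x%08x\n", k);	/* 0x428a2f98, 0x71374491, ... */
	}
	return 0;
}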
+#ifndef	__KERNEL__
+.align	3
+.LOPENSSL_armcap_P:
+# ifdef	__ILP32__
+	.long	OPENSSL_armcap_P-.
+# else
+	.quad	OPENSSL_armcap_P-.
+# endif
+#endif
+.asciz	"SHA256 block transform for ARMv8, CRYPTOGAMS by <appro@openssl.org>"
+.align	2
+#ifndef	__KERNEL__
+.type	sha256_block_armv8,%function
+.align	6
+sha256_block_armv8:
+.Lv8_entry:
+	stp		x29,x30,[sp,#-16]!
+	add		x29,sp,#0
+
+	ld1		{v0.4s,v1.4s},[x0]
+	adr		x3,.LK256
+
+.Loop_hw:
+	ld1		{v4.16b-v7.16b},[x1],#64
+	sub		x2,x2,#1
+	ld1		{v16.4s},[x3],#16
+	rev32		v4.16b,v4.16b
+	rev32		v5.16b,v5.16b
+	rev32		v6.16b,v6.16b
+	rev32		v7.16b,v7.16b
+	orr		v18.16b,v0.16b,v0.16b		// offload
+	orr		v19.16b,v1.16b,v1.16b
+	ld1		{v17.4s},[x3],#16
+	add		v16.4s,v16.4s,v4.4s
+	.inst	0x5e2828a4	//sha256su0 v4.16b,v5.16b
+	orr		v2.16b,v0.16b,v0.16b
+	.inst	0x5e104020	//sha256h v0.16b,v1.16b,v16.4s
+	.inst	0x5e105041	//sha256h2 v1.16b,v2.16b,v16.4s
+	.inst	0x5e0760c4	//sha256su1 v4.16b,v6.16b,v7.16b
+	ld1		{v16.4s},[x3],#16
+	add		v17.4s,v17.4s,v5.4s
+	.inst	0x5e2828c5	//sha256su0 v5.16b,v6.16b
+	orr		v2.16b,v0.16b,v0.16b
+	.inst	0x5e114020	//sha256h v0.16b,v1.16b,v17.4s
+	.inst	0x5e115041	//sha256h2 v1.16b,v2.16b,v17.4s
+	.inst	0x5e0460e5	//sha256su1 v5.16b,v7.16b,v4.16b
+	ld1		{v17.4s},[x3],#16
+	add		v16.4s,v16.4s,v6.4s
+	.inst	0x5e2828e6	//sha256su0 v6.16b,v7.16b
+	orr		v2.16b,v0.16b,v0.16b
+	.inst	0x5e104020	//sha256h v0.16b,v1.16b,v16.4s
+	.inst	0x5e105041	//sha256h2 v1.16b,v2.16b,v16.4s
+	.inst	0x5e056086	//sha256su1 v6.16b,v4.16b,v5.16b
+	ld1		{v16.4s},[x3],#16
+	add		v17.4s,v17.4s,v7.4s
+	.inst	0x5e282887	//sha256su0 v7.16b,v4.16b
+	orr		v2.16b,v0.16b,v0.16b
+	.inst	0x5e114020	//sha256h v0.16b,v1.16b,v17.4s
+	.inst	0x5e115041	//sha256h2 v1.16b,v2.16b,v17.4s
+	.inst	0x5e0660a7	//sha256su1 v7.16b,v5.16b,v6.16b
+	ld1		{v17.4s},[x3],#16
+	add		v16.4s,v16.4s,v4.4s
+	.inst	0x5e2828a4	//sha256su0 v4.16b,v5.16b
+	orr		v2.16b,v0.16b,v0.16b
+	.inst	0x5e104020	//sha256h v0.16b,v1.16b,v16.4s
+	.inst	0x5e105041	//sha256h2 v1.16b,v2.16b,v16.4s
+	.inst	0x5e0760c4	//sha256su1 v4.16b,v6.16b,v7.16b
+	ld1		{v16.4s},[x3],#16
+	add		v17.4s,v17.4s,v5.4s
+	.inst	0x5e2828c5	//sha256su0 v5.16b,v6.16b
+	orr		v2.16b,v0.16b,v0.16b
+	.inst	0x5e114020	//sha256h v0.16b,v1.16b,v17.4s
+	.inst	0x5e115041	//sha256h2 v1.16b,v2.16b,v17.4s
+	.inst	0x5e0460e5	//sha256su1 v5.16b,v7.16b,v4.16b
+	ld1		{v17.4s},[x3],#16
+	add		v16.4s,v16.4s,v6.4s
+	.inst	0x5e2828e6	//sha256su0 v6.16b,v7.16b
+	orr		v2.16b,v0.16b,v0.16b
+	.inst	0x5e104020	//sha256h v0.16b,v1.16b,v16.4s
+	.inst	0x5e105041	//sha256h2 v1.16b,v2.16b,v16.4s
+	.inst	0x5e056086	//sha256su1 v6.16b,v4.16b,v5.16b
+	ld1		{v16.4s},[x3],#16
+	add		v17.4s,v17.4s,v7.4s
+	.inst	0x5e282887	//sha256su0 v7.16b,v4.16b
+	orr		v2.16b,v0.16b,v0.16b
+	.inst	0x5e114020	//sha256h v0.16b,v1.16b,v17.4s
+	.inst	0x5e115041	//sha256h2 v1.16b,v2.16b,v17.4s
+	.inst	0x5e0660a7	//sha256su1 v7.16b,v5.16b,v6.16b
+	ld1		{v17.4s},[x3],#16
+	add		v16.4s,v16.4s,v4.4s
+	.inst	0x5e2828a4	//sha256su0 v4.16b,v5.16b
+	orr		v2.16b,v0.16b,v0.16b
+	.inst	0x5e104020	//sha256h v0.16b,v1.16b,v16.4s
+	.inst	0x5e105041	//sha256h2 v1.16b,v2.16b,v16.4s
+	.inst	0x5e0760c4	//sha256su1 v4.16b,v6.16b,v7.16b
+	ld1		{v16.4s},[x3],#16
+	add		v17.4s,v17.4s,v5.4s
+	.inst	0x5e2828c5	//sha256su0 v5.16b,v6.16b
+	orr		v2.16b,v0.16b,v0.16b
+	.inst	0x5e114020	//sha256h v0.16b,v1.16b,v17.4s
+	.inst	0x5e115041	//sha256h2 v1.16b,v2.16b,v17.4s
+	.inst	0x5e0460e5	//sha256su1 v5.16b,v7.16b,v4.16b
+	ld1		{v17.4s},[x3],#16
+	add		v16.4s,v16.4s,v6.4s
+	.inst	0x5e2828e6	//sha256su0 v6.16b,v7.16b
+	orr		v2.16b,v0.16b,v0.16b
+	.inst	0x5e104020	//sha256h v0.16b,v1.16b,v16.4s
+	.inst	0x5e105041	//sha256h2 v1.16b,v2.16b,v16.4s
+	.inst	0x5e056086	//sha256su1 v6.16b,v4.16b,v5.16b
+	ld1		{v16.4s},[x3],#16
+	add		v17.4s,v17.4s,v7.4s
+	.inst	0x5e282887	//sha256su0 v7.16b,v4.16b
+	orr		v2.16b,v0.16b,v0.16b
+	.inst	0x5e114020	//sha256h v0.16b,v1.16b,v17.4s
+	.inst	0x5e115041	//sha256h2 v1.16b,v2.16b,v17.4s
+	.inst	0x5e0660a7	//sha256su1 v7.16b,v5.16b,v6.16b
+	ld1		{v17.4s},[x3],#16
+	add		v16.4s,v16.4s,v4.4s
+	orr		v2.16b,v0.16b,v0.16b
+	.inst	0x5e104020	//sha256h v0.16b,v1.16b,v16.4s
+	.inst	0x5e105041	//sha256h2 v1.16b,v2.16b,v16.4s
+
+	ld1		{v16.4s},[x3],#16
+	add		v17.4s,v17.4s,v5.4s
+	orr		v2.16b,v0.16b,v0.16b
+	.inst	0x5e114020	//sha256h v0.16b,v1.16b,v17.4s
+	.inst	0x5e115041	//sha256h2 v1.16b,v2.16b,v17.4s
+
+	ld1		{v17.4s},[x3]
+	add		v16.4s,v16.4s,v6.4s
+	sub		x3,x3,#64*4-16	// rewind
+	orr		v2.16b,v0.16b,v0.16b
+	.inst	0x5e104020	//sha256h v0.16b,v1.16b,v16.4s
+	.inst	0x5e105041	//sha256h2 v1.16b,v2.16b,v16.4s
+
+	add		v17.4s,v17.4s,v7.4s
+	orr		v2.16b,v0.16b,v0.16b
+	.inst	0x5e114020	//sha256h v0.16b,v1.16b,v17.4s
+	.inst	0x5e115041	//sha256h2 v1.16b,v2.16b,v17.4s
+
+	add		v0.4s,v0.4s,v18.4s
+	add		v1.4s,v1.4s,v19.4s
+
+	cbnz		x2,.Loop_hw
+
+	st1		{v0.4s,v1.4s},[x0]
+
+	ldr		x29,[sp],#16
+	ret
+.size	sha256_block_armv8,.-sha256_block_armv8
+#endif
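The .inst words in sha256_block_armv8 are hand-encoded ARMv8 Crypto Extensions instructions (sha256h, sha256h2, sha256su0, sha256su1), emitted as raw opcodes so the file assembles even with binutils that predate the crypto extension. With a crypto-aware compiler, one quad-round of the same pattern can be written with the ACLE intrinsics from arm_neon.h; a hedged sketch, not taken from this patch (sha256_quad_round is a hypothetical name; build with -march=armv8-a+crypto):

#include <arm_neon.h>

/* One "quad round" matching the .Loop_hw body: wk = W[i..i+3] + K[i..i+3],
 * w0..w3 = the four in-flight message-schedule vectors. */
static void sha256_quad_round(uint32x4_t *abcd, uint32x4_t *efgh, uint32x4_t wk,
			      uint32x4_t *w0, uint32x4_t w1,
			      uint32x4_t w2, uint32x4_t w3)
{
	uint32x4_t tmp = *abcd;				/* the "offload" orr above */

	*w0   = vsha256su0q_u32(*w0, w1);		/* sha256su0 */
	*abcd = vsha256hq_u32(*abcd, *efgh, wk);	/* sha256h   */
	*efgh = vsha256h2q_u32(*efgh, tmp, wk);		/* sha256h2  */
	*w0   = vsha256su1q_u32(*w0, w2, w3);		/* sha256su1 */
}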
+#ifdef	__KERNEL__
+.globl	sha256_block_neon
+#endif
+.type	sha256_block_neon,%function
+.align	4
+sha256_block_neon:
+.Lneon_entry:
+	stp	x29, x30, [sp, #-16]!
+	mov	x29, sp
+	sub	sp,sp,#16*4
+
+	adr	x16,.LK256
+	add	x2,x1,x2,lsl#6	// len to point at the end of inp
+
+	ld1	{v0.16b},[x1], #16
+	ld1	{v1.16b},[x1], #16
+	ld1	{v2.16b},[x1], #16
+	ld1	{v3.16b},[x1], #16
+	ld1	{v4.4s},[x16], #16
+	ld1	{v5.4s},[x16], #16
+	ld1	{v6.4s},[x16], #16
+	ld1	{v7.4s},[x16], #16
+	rev32	v0.16b,v0.16b		// yes, even on
+	rev32	v1.16b,v1.16b		// big-endian
+	rev32	v2.16b,v2.16b
+	rev32	v3.16b,v3.16b
+	mov	x17,sp
+	add	v4.4s,v4.4s,v0.4s
+	add	v5.4s,v5.4s,v1.4s
+	add	v6.4s,v6.4s,v2.4s
+	st1	{v4.4s-v5.4s},[x17], #32
+	add	v7.4s,v7.4s,v3.4s
+	st1	{v6.4s-v7.4s},[x17]
+	sub	x17,x17,#32
+
+	ldp	w3,w4,[x0]
+	ldp	w5,w6,[x0,#8]
+	ldp	w7,w8,[x0,#16]
+	ldp	w9,w10,[x0,#24]
+	ldr	w12,[sp,#0]
+	mov	w13,wzr
+	eor	w14,w4,w5
+	mov	w15,wzr
+	b	.L_00_48
+
+.align	4
+.L_00_48:
+	ext	v4.16b,v0.16b,v1.16b,#4
+	add	w10,w10,w12
+	add	w3,w3,w15
+	and	w12,w8,w7
+	bic	w15,w9,w7
+	ext	v7.16b,v2.16b,v3.16b,#4
+	eor	w11,w7,w7,ror#5
+	add	w3,w3,w13
+	mov	d19,v3.d[1]
+	orr	w12,w12,w15
+	eor	w11,w11,w7,ror#19
+	ushr	v6.4s,v4.4s,#7
+	eor	w15,w3,w3,ror#11
+	ushr	v5.4s,v4.4s,#3
+	add	w10,w10,w12
+	add	v0.4s,v0.4s,v7.4s
+	ror	w11,w11,#6
+	sli	v6.4s,v4.4s,#25
+	eor	w13,w3,w4
+	eor	w15,w15,w3,ror#20
+	ushr	v7.4s,v4.4s,#18
+	add	w10,w10,w11
+	ldr	w12,[sp,#4]
+	and	w14,w14,w13
+	eor	v5.16b,v5.16b,v6.16b
+	ror	w15,w15,#2
+	add	w6,w6,w10
+	sli	v7.4s,v4.4s,#14
+	eor	w14,w14,w4
+	ushr	v16.4s,v19.4s,#17
+	add	w9,w9,w12
+	add	w10,w10,w15
+	and	w12,w7,w6
+	eor	v5.16b,v5.16b,v7.16b
+	bic	w15,w8,w6
+	eor	w11,w6,w6,ror#5
+	sli	v16.4s,v19.4s,#15
+	add	w10,w10,w14
+	orr	w12,w12,w15
+	ushr	v17.4s,v19.4s,#10
+	eor	w11,w11,w6,ror#19
+	eor	w15,w10,w10,ror#11
+	ushr	v7.4s,v19.4s,#19
+	add	w9,w9,w12
+	ror	w11,w11,#6
+	add	v0.4s,v0.4s,v5.4s
+	eor	w14,w10,w3
+	eor	w15,w15,w10,ror#20
+	sli	v7.4s,v19.4s,#13
+	add	w9,w9,w11
+	ldr	w12,[sp,#8]
+	and	w13,w13,w14
+	eor	v17.16b,v17.16b,v16.16b
+	ror	w15,w15,#2
+	add	w5,w5,w9
+	eor	w13,w13,w3
+	eor	v17.16b,v17.16b,v7.16b
+	add	w8,w8,w12
+	add	w9,w9,w15
+	and	w12,w6,w5
+	add	v0.4s,v0.4s,v17.4s
+	bic	w15,w7,w5
+	eor	w11,w5,w5,ror#5
+	add	w9,w9,w13
+	ushr	v18.4s,v0.4s,#17
+	orr	w12,w12,w15
+	ushr	v19.4s,v0.4s,#10
+	eor	w11,w11,w5,ror#19
+	eor	w15,w9,w9,ror#11
+	sli	v18.4s,v0.4s,#15
+	add	w8,w8,w12
+	ushr	v17.4s,v0.4s,#19
+	ror	w11,w11,#6
+	eor	w13,w9,w10
+	eor	v19.16b,v19.16b,v18.16b
+	eor	w15,w15,w9,ror#20
+	add	w8,w8,w11
+	sli	v17.4s,v0.4s,#13
+	ldr	w12,[sp,#12]
+	and	w14,w14,w13
+	ror	w15,w15,#2
+	ld1	{v4.4s},[x16], #16
+	add	w4,w4,w8
+	eor	v19.16b,v19.16b,v17.16b
+	eor	w14,w14,w10
+	eor	v17.16b,v17.16b,v17.16b
+	add	w7,w7,w12
+	add	w8,w8,w15
+	and	w12,w5,w4
+	mov	v17.d[1],v19.d[0]
+	bic	w15,w6,w4
+	eor	w11,w4,w4,ror#5
+	add	w8,w8,w14
+	add	v0.4s,v0.4s,v17.4s
+	orr	w12,w12,w15
+	eor	w11,w11,w4,ror#19
+	eor	w15,w8,w8,ror#11
+	add	v4.4s,v4.4s,v0.4s
+	add	w7,w7,w12
+	ror	w11,w11,#6
+	eor	w14,w8,w9
+	eor	w15,w15,w8,ror#20
+	add	w7,w7,w11
+	ldr	w12,[sp,#16]
+	and	w13,w13,w14
+	ror	w15,w15,#2
+	add	w3,w3,w7
+	eor	w13,w13,w9
+	st1	{v4.4s},[x17], #16
+	ext	v4.16b,v1.16b,v2.16b,#4
+	add	w6,w6,w12
+	add	w7,w7,w15
+	and	w12,w4,w3
+	bic	w15,w5,w3
+	ext	v7.16b,v3.16b,v0.16b,#4
+	eor	w11,w3,w3,ror#5
+	add	w7,w7,w13
+	mov	d19,v0.d[1]
+	orr	w12,w12,w15
+	eor	w11,w11,w3,ror#19
+	ushr	v6.4s,v4.4s,#7
+	eor	w15,w7,w7,ror#11
+	ushr	v5.4s,v4.4s,#3
+	add	w6,w6,w12
+	add	v1.4s,v1.4s,v7.4s
+	ror	w11,w11,#6
+	sli	v6.4s,v4.4s,#25
+	eor	w13,w7,w8
+	eor	w15,w15,w7,ror#20
+	ushr	v7.4s,v4.4s,#18
+	add	w6,w6,w11
+	ldr	w12,[sp,#20]
+	and	w14,w14,w13
+	eor	v5.16b,v5.16b,v6.16b
+	ror	w15,w15,#2
+	add	w10,w10,w6
+	sli	v7.4s,v4.4s,#14
+	eor	w14,w14,w8
+	ushr	v16.4s,v19.4s,#17
+	add	w5,w5,w12
+	add	w6,w6,w15
+	and	w12,w3,w10
+	eor	v5.16b,v5.16b,v7.16b
+	bic	w15,w4,w10
+	eor	w11,w10,w10,ror#5
+	sli	v16.4s,v19.4s,#15
+	add	w6,w6,w14
+	orr	w12,w12,w15
+	ushr	v17.4s,v19.4s,#10
+	eor	w11,w11,w10,ror#19
+	eor	w15,w6,w6,ror#11
+	ushr	v7.4s,v19.4s,#19
+	add	w5,w5,w12
+	ror	w11,w11,#6
+	add	v1.4s,v1.4s,v5.4s
+	eor	w14,w6,w7
+	eor	w15,w15,w6,ror#20
+	sli	v7.4s,v19.4s,#13
+	add	w5,w5,w11
+	ldr	w12,[sp,#24]
+	and	w13,w13,w14
+	eor	v17.16b,v17.16b,v16.16b
+	ror	w15,w15,#2
+	add	w9,w9,w5
+	eor	w13,w13,w7
+	eor	v17.16b,v17.16b,v7.16b
+	add	w4,w4,w12
+	add	w5,w5,w15
+	and	w12,w10,w9
+	add	v1.4s,v1.4s,v17.4s
+	bic	w15,w3,w9
+	eor	w11,w9,w9,ror#5
+	add	w5,w5,w13
+	ushr	v18.4s,v1.4s,#17
+	orr	w12,w12,w15
+	ushr	v19.4s,v1.4s,#10
+	eor	w11,w11,w9,ror#19
+	eor	w15,w5,w5,ror#11
+	sli	v18.4s,v1.4s,#15
+	add	w4,w4,w12
+	ushr	v17.4s,v1.4s,#19
+	ror	w11,w11,#6
+	eor	w13,w5,w6
+	eor	v19.16b,v19.16b,v18.16b
+	eor	w15,w15,w5,ror#20
+	add	w4,w4,w11
+	sli	v17.4s,v1.4s,#13
+	ldr	w12,[sp,#28]
+	and	w14,w14,w13
+	ror	w15,w15,#2
+	ld1	{v4.4s},[x16], #16
+	add	w8,w8,w4
+	eor	v19.16b,v19.16b,v17.16b
+	eor	w14,w14,w6
+	eor	v17.16b,v17.16b,v17.16b
+	add	w3,w3,w12
+	add	w4,w4,w15
+	and	w12,w9,w8
+	mov	v17.d[1],v19.d[0]
+	bic	w15,w10,w8
+	eor	w11,w8,w8,ror#5
+	add	w4,w4,w14
+	add	v1.4s,v1.4s,v17.4s
+	orr	w12,w12,w15
+	eor	w11,w11,w8,ror#19
+	eor	w15,w4,w4,ror#11
+	add	v4.4s,v4.4s,v1.4s
+	add	w3,w3,w12
+	ror	w11,w11,#6
+	eor	w14,w4,w5
+	eor	w15,w15,w4,ror#20
+	add	w3,w3,w11
+	ldr	w12,[sp,#32]
+	and	w13,w13,w14
+	ror	w15,w15,#2
+	add	w7,w7,w3
+	eor	w13,w13,w5
+	st1	{v4.4s},[x17], #16
+	ext	v4.16b,v2.16b,v3.16b,#4
+	add	w10,w10,w12
+	add	w3,w3,w15
+	and	w12,w8,w7
+	bic	w15,w9,w7
+	ext	v7.16b,v0.16b,v1.16b,#4
+	eor	w11,w7,w7,ror#5
+	add	w3,w3,w13
+	mov	d19,v1.d[1]
+	orr	w12,w12,w15
+	eor	w11,w11,w7,ror#19
+	ushr	v6.4s,v4.4s,#7
+	eor	w15,w3,w3,ror#11
+	ushr	v5.4s,v4.4s,#3
+	add	w10,w10,w12
+	add	v2.4s,v2.4s,v7.4s
+	ror	w11,w11,#6
+	sli	v6.4s,v4.4s,#25
+	eor	w13,w3,w4
+	eor	w15,w15,w3,ror#20
+	ushr	v7.4s,v4.4s,#18
+	add	w10,w10,w11
+	ldr	w12,[sp,#36]
+	and	w14,w14,w13
+	eor	v5.16b,v5.16b,v6.16b
+	ror	w15,w15,#2
+	add	w6,w6,w10
+	sli	v7.4s,v4.4s,#14
+	eor	w14,w14,w4
+	ushr	v16.4s,v19.4s,#17
+	add	w9,w9,w12
+	add	w10,w10,w15
+	and	w12,w7,w6
+	eor	v5.16b,v5.16b,v7.16b
+	bic	w15,w8,w6
+	eor	w11,w6,w6,ror#5
+	sli	v16.4s,v19.4s,#15
+	add	w10,w10,w14
+	orr	w12,w12,w15
+	ushr	v17.4s,v19.4s,#10
+	eor	w11,w11,w6,ror#19
+	eor	w15,w10,w10,ror#11
+	ushr	v7.4s,v19.4s,#19
+	add	w9,w9,w12
+	ror	w11,w11,#6
+	add	v2.4s,v2.4s,v5.4s
+	eor	w14,w10,w3
+	eor	w15,w15,w10,ror#20
+	sli	v7.4s,v19.4s,#13
+	add	w9,w9,w11
+	ldr	w12,[sp,#40]
+	and	w13,w13,w14
+	eor	v17.16b,v17.16b,v16.16b
+	ror	w15,w15,#2
+	add	w5,w5,w9
+	eor	w13,w13,w3
+	eor	v17.16b,v17.16b,v7.16b
+	add	w8,w8,w12
+	add	w9,w9,w15
+	and	w12,w6,w5
+	add	v2.4s,v2.4s,v17.4s
+	bic	w15,w7,w5
+	eor	w11,w5,w5,ror#5
+	add	w9,w9,w13
+	ushr	v18.4s,v2.4s,#17
+	orr	w12,w12,w15
+	ushr	v19.4s,v2.4s,#10
+	eor	w11,w11,w5,ror#19
+	eor	w15,w9,w9,ror#11
+	sli	v18.4s,v2.4s,#15
+	add	w8,w8,w12
+	ushr	v17.4s,v2.4s,#19
+	ror	w11,w11,#6
+	eor	w13,w9,w10
+	eor	v19.16b,v19.16b,v18.16b
+	eor	w15,w15,w9,ror#20
+	add	w8,w8,w11
+	sli	v17.4s,v2.4s,#13
+	ldr	w12,[sp,#44]
+	and	w14,w14,w13
+	ror	w15,w15,#2
+	ld1	{v4.4s},[x16], #16
+	add	w4,w4,w8
+	eor	v19.16b,v19.16b,v17.16b
+	eor	w14,w14,w10
+	eor	v17.16b,v17.16b,v17.16b
+	add	w7,w7,w12
+	add	w8,w8,w15
+	and	w12,w5,w4
+	mov	v17.d[1],v19.d[0]
+	bic	w15,w6,w4
+	eor	w11,w4,w4,ror#5
+	add	w8,w8,w14
+	add	v2.4s,v2.4s,v17.4s
+	orr	w12,w12,w15
+	eor	w11,w11,w4,ror#19
+	eor	w15,w8,w8,ror#11
+	add	v4.4s,v4.4s,v2.4s
+	add	w7,w7,w12
+	ror	w11,w11,#6
+	eor	w14,w8,w9
+	eor	w15,w15,w8,ror#20
+	add	w7,w7,w11
+	ldr	w12,[sp,#48]
+	and	w13,w13,w14
+	ror	w15,w15,#2
+	add	w3,w3,w7
+	eor	w13,w13,w9
+	st1	{v4.4s},[x17], #16
+	ext	v4.16b,v3.16b,v0.16b,#4
+	add	w6,w6,w12
+	add	w7,w7,w15
+	and	w12,w4,w3
+	bic	w15,w5,w3
+	ext	v7.16b,v1.16b,v2.16b,#4
+	eor	w11,w3,w3,ror#5
+	add	w7,w7,w13
+	mov	d19,v2.d[1]
+	orr	w12,w12,w15
+	eor	w11,w11,w3,ror#19
+	ushr	v6.4s,v4.4s,#7
+	eor	w15,w7,w7,ror#11
+	ushr	v5.4s,v4.4s,#3
+	add	w6,w6,w12
+	add	v3.4s,v3.4s,v7.4s
+	ror	w11,w11,#6
+	sli	v6.4s,v4.4s,#25
+	eor	w13,w7,w8
+	eor	w15,w15,w7,ror#20
+	ushr	v7.4s,v4.4s,#18
+	add	w6,w6,w11
+	ldr	w12,[sp,#52]
+	and	w14,w14,w13
+	eor	v5.16b,v5.16b,v6.16b
+	ror	w15,w15,#2
+	add	w10,w10,w6
+	sli	v7.4s,v4.4s,#14
+	eor	w14,w14,w8
+	ushr	v16.4s,v19.4s,#17
+	add	w5,w5,w12
+	add	w6,w6,w15
+	and	w12,w3,w10
+	eor	v5.16b,v5.16b,v7.16b
+	bic	w15,w4,w10
+	eor	w11,w10,w10,ror#5
+	sli	v16.4s,v19.4s,#15
+	add	w6,w6,w14
+	orr	w12,w12,w15
+	ushr	v17.4s,v19.4s,#10
+	eor	w11,w11,w10,ror#19
+	eor	w15,w6,w6,ror#11
+	ushr	v7.4s,v19.4s,#19
+	add	w5,w5,w12
+	ror	w11,w11,#6
+	add	v3.4s,v3.4s,v5.4s
+	eor	w14,w6,w7
+	eor	w15,w15,w6,ror#20
+	sli	v7.4s,v19.4s,#13
+	add	w5,w5,w11
+	ldr	w12,[sp,#56]
+	and	w13,w13,w14
+	eor	v17.16b,v17.16b,v16.16b
+	ror	w15,w15,#2
+	add	w9,w9,w5
+	eor	w13,w13,w7
+	eor	v17.16b,v17.16b,v7.16b
+	add	w4,w4,w12
+	add	w5,w5,w15
+	and	w12,w10,w9
+	add	v3.4s,v3.4s,v17.4s
+	bic	w15,w3,w9
+	eor	w11,w9,w9,ror#5
+	add	w5,w5,w13
+	ushr	v18.4s,v3.4s,#17
+	orr	w12,w12,w15
+	ushr	v19.4s,v3.4s,#10
+	eor	w11,w11,w9,ror#19
+	eor	w15,w5,w5,ror#11
+	sli	v18.4s,v3.4s,#15
+	add	w4,w4,w12
+	ushr	v17.4s,v3.4s,#19
+	ror	w11,w11,#6
+	eor	w13,w5,w6
+	eor	v19.16b,v19.16b,v18.16b
+	eor	w15,w15,w5,ror#20
+	add	w4,w4,w11
+	sli	v17.4s,v3.4s,#13
+	ldr	w12,[sp,#60]
+	and	w14,w14,w13
+	ror	w15,w15,#2
+	ld1	{v4.4s},[x16], #16
+	add	w8,w8,w4
+	eor	v19.16b,v19.16b,v17.16b
+	eor	w14,w14,w6
+	eor	v17.16b,v17.16b,v17.16b
+	add	w3,w3,w12
+	add	w4,w4,w15
+	and	w12,w9,w8
+	mov	v17.d[1],v19.d[0]
+	bic	w15,w10,w8
+	eor	w11,w8,w8,ror#5
+	add	w4,w4,w14
+	add	v3.4s,v3.4s,v17.4s
+	orr	w12,w12,w15
+	eor	w11,w11,w8,ror#19
+	eor	w15,w4,w4,ror#11
+	add	v4.4s,v4.4s,v3.4s
+	add	w3,w3,w12
+	ror	w11,w11,#6
+	eor	w14,w4,w5
+	eor	w15,w15,w4,ror#20
+	add	w3,w3,w11
+	ldr	w12,[x16]
+	and	w13,w13,w14
+	ror	w15,w15,#2
+	add	w7,w7,w3
+	eor	w13,w13,w5
+	st1	{v4.4s},[x17], #16
+	cmp	w12,#0				// check for K256 terminator
+	ldr	w12,[sp,#0]
+	sub	x17,x17,#64
+	bne	.L_00_48
+
+	sub	x16,x16,#256		// rewind x16
+	cmp	x1,x2
+	mov	x17, #64
+	csel	x17, x17, xzr, eq	// x17 = 64 only on the final block
+	sub	x1,x1,x17			// rewind input so the preloads below stay in bounds (avoid SEGV)
+	mov	x17,sp
+	add	w10,w10,w12
+	add	w3,w3,w15
+	and	w12,w8,w7
+	ld1	{v0.16b},[x1],#16
+	bic	w15,w9,w7
+	eor	w11,w7,w7,ror#5
+	ld1	{v4.4s},[x16],#16
+	add	w3,w3,w13
+	orr	w12,w12,w15
+	eor	w11,w11,w7,ror#19
+	eor	w15,w3,w3,ror#11
+	rev32	v0.16b,v0.16b
+	add	w10,w10,w12
+	ror	w11,w11,#6
+	eor	w13,w3,w4
+	eor	w15,w15,w3,ror#20
+	add	v4.4s,v4.4s,v0.4s
+	add	w10,w10,w11
+	ldr	w12,[sp,#4]
+	and	w14,w14,w13
+	ror	w15,w15,#2
+	add	w6,w6,w10
+	eor	w14,w14,w4
+	add	w9,w9,w12
+	add	w10,w10,w15
+	and	w12,w7,w6
+	bic	w15,w8,w6
+	eor	w11,w6,w6,ror#5
+	add	w10,w10,w14
+	orr	w12,w12,w15
+	eor	w11,w11,w6,ror#19
+	eor	w15,w10,w10,ror#11
+	add	w9,w9,w12
+	ror	w11,w11,#6
+	eor	w14,w10,w3
+	eor	w15,w15,w10,ror#20
+	add	w9,w9,w11
+	ldr	w12,[sp,#8]
+	and	w13,w13,w14
+	ror	w15,w15,#2
+	add	w5,w5,w9
+	eor	w13,w13,w3
+	add	w8,w8,w12
+	add	w9,w9,w15
+	and	w12,w6,w5
+	bic	w15,w7,w5
+	eor	w11,w5,w5,ror#5
+	add	w9,w9,w13
+	orr	w12,w12,w15
+	eor	w11,w11,w5,ror#19
+	eor	w15,w9,w9,ror#11
+	add	w8,w8,w12
+	ror	w11,w11,#6
+	eor	w13,w9,w10
+	eor	w15,w15,w9,ror#20
+	add	w8,w8,w11
+	ldr	w12,[sp,#12]
+	and	w14,w14,w13
+	ror	w15,w15,#2
+	add	w4,w4,w8
+	eor	w14,w14,w10
+	add	w7,w7,w12
+	add	w8,w8,w15
+	and	w12,w5,w4
+	bic	w15,w6,w4
+	eor	w11,w4,w4,ror#5
+	add	w8,w8,w14
+	orr	w12,w12,w15
+	eor	w11,w11,w4,ror#19
+	eor	w15,w8,w8,ror#11
+	add	w7,w7,w12
+	ror	w11,w11,#6
+	eor	w14,w8,w9
+	eor	w15,w15,w8,ror#20
+	add	w7,w7,w11
+	ldr	w12,[sp,#16]
+	and	w13,w13,w14
+	ror	w15,w15,#2
+	add	w3,w3,w7
+	eor	w13,w13,w9
+	st1	{v4.4s},[x17], #16
+	add	w6,w6,w12
+	add	w7,w7,w15
+	and	w12,w4,w3
+	ld1	{v1.16b},[x1],#16
+	bic	w15,w5,w3
+	eor	w11,w3,w3,ror#5
+	ld1	{v4.4s},[x16],#16
+	add	w7,w7,w13
+	orr	w12,w12,w15
+	eor	w11,w11,w3,ror#19
+	eor	w15,w7,w7,ror#11
+	rev32	v1.16b,v1.16b
+	add	w6,w6,w12
+	ror	w11,w11,#6
+	eor	w13,w7,w8
+	eor	w15,w15,w7,ror#20
+	add	v4.4s,v4.4s,v1.4s
+	add	w6,w6,w11
+	ldr	w12,[sp,#20]
+	and	w14,w14,w13
+	ror	w15,w15,#2
+	add	w10,w10,w6
+	eor	w14,w14,w8
+	add	w5,w5,w12
+	add	w6,w6,w15
+	and	w12,w3,w10
+	bic	w15,w4,w10
+	eor	w11,w10,w10,ror#5
+	add	w6,w6,w14
+	orr	w12,w12,w15
+	eor	w11,w11,w10,ror#19
+	eor	w15,w6,w6,ror#11
+	add	w5,w5,w12
+	ror	w11,w11,#6
+	eor	w14,w6,w7
+	eor	w15,w15,w6,ror#20
+	add	w5,w5,w11
+	ldr	w12,[sp,#24]
+	and	w13,w13,w14
+	ror	w15,w15,#2
+	add	w9,w9,w5
+	eor	w13,w13,w7
+	add	w4,w4,w12
+	add	w5,w5,w15
+	and	w12,w10,w9
+	bic	w15,w3,w9
+	eor	w11,w9,w9,ror#5
+	add	w5,w5,w13
+	orr	w12,w12,w15
+	eor	w11,w11,w9,ror#19
+	eor	w15,w5,w5,ror#11
+	add	w4,w4,w12
+	ror	w11,w11,#6
+	eor	w13,w5,w6
+	eor	w15,w15,w5,ror#20
+	add	w4,w4,w11
+	ldr	w12,[sp,#28]
+	and	w14,w14,w13
+	ror	w15,w15,#2
+	add	w8,w8,w4
+	eor	w14,w14,w6
+	add	w3,w3,w12
+	add	w4,w4,w15
+	and	w12,w9,w8
+	bic	w15,w10,w8
+	eor	w11,w8,w8,ror#5
+	add	w4,w4,w14
+	orr	w12,w12,w15
+	eor	w11,w11,w8,ror#19
+	eor	w15,w4,w4,ror#11
+	add	w3,w3,w12
+	ror	w11,w11,#6
+	eor	w14,w4,w5
+	eor	w15,w15,w4,ror#20
+	add	w3,w3,w11
+	ldr	w12,[sp,#32]
+	and	w13,w13,w14
+	ror	w15,w15,#2
+	add	w7,w7,w3
+	eor	w13,w13,w5
+	st1	{v4.4s},[x17], #16
+	add	w10,w10,w12
+	add	w3,w3,w15
+	and	w12,w8,w7
+	ld1	{v2.16b},[x1],#16
+	bic	w15,w9,w7
+	eor	w11,w7,w7,ror#5
+	ld1	{v4.4s},[x16],#16
+	add	w3,w3,w13
+	orr	w12,w12,w15
+	eor	w11,w11,w7,ror#19
+	eor	w15,w3,w3,ror#11
+	rev32	v2.16b,v2.16b
+	add	w10,w10,w12
+	ror	w11,w11,#6
+	eor	w13,w3,w4
+	eor	w15,w15,w3,ror#20
+	add	v4.4s,v4.4s,v2.4s
+	add	w10,w10,w11
+	ldr	w12,[sp,#36]
+	and	w14,w14,w13
+	ror	w15,w15,#2
+	add	w6,w6,w10
+	eor	w14,w14,w4
+	add	w9,w9,w12
+	add	w10,w10,w15
+	and	w12,w7,w6
+	bic	w15,w8,w6
+	eor	w11,w6,w6,ror#5
+	add	w10,w10,w14
+	orr	w12,w12,w15
+	eor	w11,w11,w6,ror#19
+	eor	w15,w10,w10,ror#11
+	add	w9,w9,w12
+	ror	w11,w11,#6
+	eor	w14,w10,w3
+	eor	w15,w15,w10,ror#20
+	add	w9,w9,w11
+	ldr	w12,[sp,#40]
+	and	w13,w13,w14
+	ror	w15,w15,#2
+	add	w5,w5,w9
+	eor	w13,w13,w3
+	add	w8,w8,w12
+	add	w9,w9,w15
+	and	w12,w6,w5
+	bic	w15,w7,w5
+	eor	w11,w5,w5,ror#5
+	add	w9,w9,w13
+	orr	w12,w12,w15
+	eor	w11,w11,w5,ror#19
+	eor	w15,w9,w9,ror#11
+	add	w8,w8,w12
+	ror	w11,w11,#6
+	eor	w13,w9,w10
+	eor	w15,w15,w9,ror#20
+	add	w8,w8,w11
+	ldr	w12,[sp,#44]
+	and	w14,w14,w13
+	ror	w15,w15,#2
+	add	w4,w4,w8
+	eor	w14,w14,w10
+	add	w7,w7,w12
+	add	w8,w8,w15
+	and	w12,w5,w4
+	bic	w15,w6,w4
+	eor	w11,w4,w4,ror#5
+	add	w8,w8,w14
+	orr	w12,w12,w15
+	eor	w11,w11,w4,ror#19
+	eor	w15,w8,w8,ror#11
+	add	w7,w7,w12
+	ror	w11,w11,#6
+	eor	w14,w8,w9
+	eor	w15,w15,w8,ror#20
+	add	w7,w7,w11
+	ldr	w12,[sp,#48]
+	and	w13,w13,w14
+	ror	w15,w15,#2
+	add	w3,w3,w7
+	eor	w13,w13,w9
+	st1	{v4.4s},[x17], #16
+	add	w6,w6,w12
+	add	w7,w7,w15
+	and	w12,w4,w3
+	ld1	{v3.16b},[x1],#16
+	bic	w15,w5,w3
+	eor	w11,w3,w3,ror#5
+	ld1	{v4.4s},[x16],#16
+	add	w7,w7,w13
+	orr	w12,w12,w15
+	eor	w11,w11,w3,ror#19
+	eor	w15,w7,w7,ror#11
+	rev32	v3.16b,v3.16b
+	add	w6,w6,w12
+	ror	w11,w11,#6
+	eor	w13,w7,w8
+	eor	w15,w15,w7,ror#20
+	add	v4.4s,v4.4s,v3.4s
+	add	w6,w6,w11
+	ldr	w12,[sp,#52]
+	and	w14,w14,w13
+	ror	w15,w15,#2
+	add	w10,w10,w6
+	eor	w14,w14,w8
+	add	w5,w5,w12
+	add	w6,w6,w15
+	and	w12,w3,w10
+	bic	w15,w4,w10
+	eor	w11,w10,w10,ror#5
+	add	w6,w6,w14
+	orr	w12,w12,w15
+	eor	w11,w11,w10,ror#19
+	eor	w15,w6,w6,ror#11
+	add	w5,w5,w12
+	ror	w11,w11,#6
+	eor	w14,w6,w7
+	eor	w15,w15,w6,ror#20
+	add	w5,w5,w11
+	ldr	w12,[sp,#56]
+	and	w13,w13,w14
+	ror	w15,w15,#2
+	add	w9,w9,w5
+	eor	w13,w13,w7
+	add	w4,w4,w12
+	add	w5,w5,w15
+	and	w12,w10,w9
+	bic	w15,w3,w9
+	eor	w11,w9,w9,ror#5
+	add	w5,w5,w13
+	orr	w12,w12,w15
+	eor	w11,w11,w9,ror#19
+	eor	w15,w5,w5,ror#11
+	add	w4,w4,w12
+	ror	w11,w11,#6
+	eor	w13,w5,w6
+	eor	w15,w15,w5,ror#20
+	add	w4,w4,w11
+	ldr	w12,[sp,#60]
+	and	w14,w14,w13
+	ror	w15,w15,#2
+	add	w8,w8,w4
+	eor	w14,w14,w6
+	add	w3,w3,w12
+	add	w4,w4,w15
+	and	w12,w9,w8
+	bic	w15,w10,w8
+	eor	w11,w8,w8,ror#5
+	add	w4,w4,w14
+	orr	w12,w12,w15
+	eor	w11,w11,w8,ror#19
+	eor	w15,w4,w4,ror#11
+	add	w3,w3,w12
+	ror	w11,w11,#6
+	eor	w14,w4,w5
+	eor	w15,w15,w4,ror#20
+	add	w3,w3,w11
+	and	w13,w13,w14
+	ror	w15,w15,#2
+	add	w7,w7,w3
+	eor	w13,w13,w5
+	st1	{v4.4s},[x17], #16
+	add	w3,w3,w15			// h+=Sigma0(a) from the past
+	ldp	w11,w12,[x0,#0]
+	add	w3,w3,w13			// h+=Maj(a,b,c) from the past
+	ldp	w13,w14,[x0,#8]
+	add	w3,w3,w11			// accumulate
+	add	w4,w4,w12
+	ldp	w11,w12,[x0,#16]
+	add	w5,w5,w13
+	add	w6,w6,w14
+	ldp	w13,w14,[x0,#24]
+	add	w7,w7,w11
+	add	w8,w8,w12
+	 ldr	w12,[sp,#0]
+	stp	w3,w4,[x0,#0]
+	add	w9,w9,w13
+	 mov	w13,wzr
+	stp	w5,w6,[x0,#8]
+	add	w10,w10,w14
+	stp	w7,w8,[x0,#16]
+	 eor	w14,w4,w5
+	stp	w9,w10,[x0,#24]
+	 mov	w15,wzr
+	 mov	x17,sp
+	b.ne	.L_00_48
+
+	ldr	x29,[x29]
+	add	sp,sp,#16*4+16
+	ret
+.size	sha256_block_neon,.-sha256_block_neon
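A note on the NEON schedule above: AArch64 NEON has no 32-bit vector rotate, so every ror in sigma0/sigma1 is synthesized from an ushr/sli pair (shift right, then shift-left-and-insert). An intrinsics sketch of the sigma0 step, illustrative only (sigma0_vec is a hypothetical name):

#include <arm_neon.h>

/* sigma0(x) = ror(x,7) ^ ror(x,18) ^ (x >> 3), four lanes at a time.
 * vsliq_n_u32(a, x, n) keeps the low n bits of a and fills the rest with
 * x << n, completing the rotate that the preceding ushr started. */
static uint32x4_t sigma0_vec(uint32x4_t x)
{
	uint32x4_t r7  = vsliq_n_u32(vshrq_n_u32(x, 7),  x, 25);	/* ror 7  */
	uint32x4_t r18 = vsliq_n_u32(vshrq_n_u32(x, 18), x, 14);	/* ror 18 */
	return veorq_u32(veorq_u32(r7, r18), vshrq_n_u32(x, 3));
}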
+#ifndef	__KERNEL__
+.comm	OPENSSL_armcap_P,4,4
+#endif
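For reference, all three entry points in this file share one calling convention, visible in each prologue (x0 = digest state, x1 = input, x2 = number of 64-byte blocks; note add x2,x1,x2,lsl#6 turning the count into an end pointer). The kernel glue presumably declares them along these lines (a sketch; the exact declarations live in the arch/arm64/crypto glue sources):

#include <linux/linkage.h>
#include <linux/types.h>

/* Assumed prototypes matching the x0/x1/x2 register usage above. */
asmlinkage void sha256_block_data_order(u32 *digest, const void *data,
					unsigned int num_blks);
asmlinkage void sha256_block_neon(u32 *digest, const void *data,
				  unsigned int num_blks);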
--- /dev/null
+++ b/arch/arm64/crypto/sha512-core.S
@@ -0,0 +1,1085 @@
+// Copyright 2014-2016 The OpenSSL Project Authors. All Rights Reserved.
+//
+// Licensed under the OpenSSL license (the "License").  You may not use
+// this file except in compliance with the License.  You can obtain a copy
+// in the file LICENSE in the source distribution or at
+// https://www.openssl.org/source/license.html
+
+// ====================================================================
+// Written by Andy Polyakov <appro@openssl.org> for the OpenSSL
+// project. The module is, however, dual licensed under OpenSSL and
+// CRYPTOGAMS licenses depending on where you obtain it. For further
+// details see http://www.openssl.org/~appro/cryptogams/.
+//
+// Permission to use under GPLv2 terms is granted.
+// ====================================================================
+//
+// SHA256/512 for ARMv8.
+//
+// Performance in cycles per processed byte and improvement coefficient
+// over code generated with "default" compiler:
+//
+//		SHA256-hw	SHA256(*)	SHA512
+// Apple A7	1.97		10.5 (+33%)	6.73 (-1%(**))
+// Cortex-A53	2.38		15.5 (+115%)	10.0 (+150%(***))
+// Cortex-A57	2.31		11.6 (+86%)	7.51 (+260%(***))
+// Denver	2.01		10.5 (+26%)	6.70 (+8%)
+// X-Gene			20.0 (+100%)	12.8 (+300%(***))
+// Mongoose	2.36		13.0 (+50%)	8.36 (+33%)
+//
+// (*)	Software SHA256 results are of lesser relevance, presented
+//	mostly for informational purposes.
+// (**)	The result is a trade-off: it's possible to improve it by
+//	10% (or by 1 cycle per round), but at the cost of a 20% loss
+//	on Cortex-A53 (or by 4 cycles per round).
+// (***)	Super-impressive coefficients over gcc-generated code are
+//	an indication of some compiler "pathology"; most notably, code
+//	generated with -mgeneral-regs-only is significantly faster
+//	and the gap is only 40-90%.
+//
+// October 2016.
+//
+// Originally it was reckoned that it makes no sense to implement a NEON
+// version of SHA256 for 64-bit processors, because the performance
+// improvement on the most widespread Cortex-A5x processors was observed
+// to be marginal: about the same on Cortex-A53 and ~10% on A57. But it
+// was then observed that 32-bit NEON SHA256 performs significantly
+// better than the 64-bit scalar version on *some* of the more recent
+// processors. As a result, a 64-bit NEON version of SHA256 was added to
+// provide the best all-round performance; for example, it executes ~30%
+// faster on X-Gene and Mongoose. [For reference, a NEON version of
+// SHA512 is bound to deliver much less improvement, likely *negative*
+// on Cortex-A5x, which is why NEON support is limited to SHA256.]
+
+#ifndef	__KERNEL__
+# include "arm_arch.h"
+#endif
+
+.text
+
+.extern	OPENSSL_armcap_P
+.globl	sha512_block_data_order
+.type	sha512_block_data_order,%function
+.align	6
+sha512_block_data_order:
+	stp	x29,x30,[sp,#-128]!
+	add	x29,sp,#0
+
+	stp	x19,x20,[sp,#16]
+	stp	x21,x22,[sp,#32]
+	stp	x23,x24,[sp,#48]
+	stp	x25,x26,[sp,#64]
+	stp	x27,x28,[sp,#80]
+	sub	sp,sp,#4*8
+
+	ldp	x20,x21,[x0]				// load context
+	ldp	x22,x23,[x0,#2*8]
+	ldp	x24,x25,[x0,#4*8]
+	add	x2,x1,x2,lsl#7	// end of input
+	ldp	x26,x27,[x0,#6*8]
+	adr	x30,.LK512
+	stp	x0,x2,[x29,#96]
+
+.Loop:
+	ldp	x3,x4,[x1],#2*8
+	ldr	x19,[x30],#8			// *K++
+	eor	x28,x21,x22				// magic seed
+	str	x1,[x29,#112]
+#ifndef	__AARCH64EB__
+	rev	x3,x3			// 0
+#endif
+	ror	x16,x24,#14
+	add	x27,x27,x19			// h+=K[i]
+	eor	x6,x24,x24,ror#23
+	and	x17,x25,x24
+	bic	x19,x26,x24
+	add	x27,x27,x3			// h+=X[i]
+	orr	x17,x17,x19			// Ch(e,f,g)
+	eor	x19,x20,x21			// a^b, b^c in next round
+	eor	x16,x16,x6,ror#18	// Sigma1(e)
+	ror	x6,x20,#28
+	add	x27,x27,x17			// h+=Ch(e,f,g)
+	eor	x17,x20,x20,ror#5
+	add	x27,x27,x16			// h+=Sigma1(e)
+	and	x28,x28,x19			// (b^c)&=(a^b)
+	add	x23,x23,x27			// d+=h
+	eor	x28,x28,x21			// Maj(a,b,c)
+	eor	x17,x6,x17,ror#34	// Sigma0(a)
+	add	x27,x27,x28			// h+=Maj(a,b,c)
+	ldr	x28,[x30],#8		// *K++, x19 in next round
+	//add	x27,x27,x17			// h+=Sigma0(a)
+#ifndef	__AARCH64EB__
+	rev	x4,x4			// 1
+#endif
+	ldp	x5,x6,[x1],#2*8
+	add	x27,x27,x17			// h+=Sigma0(a)
+	ror	x16,x23,#14
+	add	x26,x26,x28			// h+=K[i]
+	eor	x7,x23,x23,ror#23
+	and	x17,x24,x23
+	bic	x28,x25,x23
+	add	x26,x26,x4			// h+=X[i]
+	orr	x17,x17,x28			// Ch(e,f,g)
+	eor	x28,x27,x20			// a^b, b^c in next round
+	eor	x16,x16,x7,ror#18	// Sigma1(e)
+	ror	x7,x27,#28
+	add	x26,x26,x17			// h+=Ch(e,f,g)
+	eor	x17,x27,x27,ror#5
+	add	x26,x26,x16			// h+=Sigma1(e)
+	and	x19,x19,x28			// (b^c)&=(a^b)
+	add	x22,x22,x26			// d+=h
+	eor	x19,x19,x20			// Maj(a,b,c)
+	eor	x17,x7,x17,ror#34	// Sigma0(a)
+	add	x26,x26,x19			// h+=Maj(a,b,c)
+	ldr	x19,[x30],#8		// *K++, x28 in next round
+	//add	x26,x26,x17			// h+=Sigma0(a)
+#ifndef	__AARCH64EB__
+	rev	x5,x5			// 2
+#endif
+	add	x26,x26,x17			// h+=Sigma0(a)
+	ror	x16,x22,#14
+	add	x25,x25,x19			// h+=K[i]
+	eor	x8,x22,x22,ror#23
+	and	x17,x23,x22
+	bic	x19,x24,x22
+	add	x25,x25,x5			// h+=X[i]
+	orr	x17,x17,x19			// Ch(e,f,g)
+	eor	x19,x26,x27			// a^b, b^c in next round
+	eor	x16,x16,x8,ror#18	// Sigma1(e)
+	ror	x8,x26,#28
+	add	x25,x25,x17			// h+=Ch(e,f,g)
+	eor	x17,x26,x26,ror#5
+	add	x25,x25,x16			// h+=Sigma1(e)
+	and	x28,x28,x19			// (b^c)&=(a^b)
+	add	x21,x21,x25			// d+=h
+	eor	x28,x28,x27			// Maj(a,b,c)
+	eor	x17,x8,x17,ror#34	// Sigma0(a)
+	add	x25,x25,x28			// h+=Maj(a,b,c)
+	ldr	x28,[x30],#8		// *K++, x19 in next round
+	//add	x25,x25,x17			// h+=Sigma0(a)
+#ifndef	__AARCH64EB__
+	rev	x6,x6			// 3
+#endif
+	ldp	x7,x8,[x1],#2*8
+	add	x25,x25,x17			// h+=Sigma0(a)
+	ror	x16,x21,#14
+	add	x24,x24,x28			// h+=K[i]
+	eor	x9,x21,x21,ror#23
+	and	x17,x22,x21
+	bic	x28,x23,x21
+	add	x24,x24,x6			// h+=X[i]
+	orr	x17,x17,x28			// Ch(e,f,g)
+	eor	x28,x25,x26			// a^b, b^c in next round
+	eor	x16,x16,x9,ror#18	// Sigma1(e)
+	ror	x9,x25,#28
+	add	x24,x24,x17			// h+=Ch(e,f,g)
+	eor	x17,x25,x25,ror#5
+	add	x24,x24,x16			// h+=Sigma1(e)
+	and	x19,x19,x28			// (b^c)&=(a^b)
+	add	x20,x20,x24			// d+=h
+	eor	x19,x19,x26			// Maj(a,b,c)
+	eor	x17,x9,x17,ror#34	// Sigma0(a)
+	add	x24,x24,x19			// h+=Maj(a,b,c)
+	ldr	x19,[x30],#8		// *K++, x28 in next round
+	//add	x24,x24,x17			// h+=Sigma0(a)
+#ifndef	__AARCH64EB__
+	rev	x7,x7			// 4
+#endif
+	add	x24,x24,x17			// h+=Sigma0(a)
+	ror	x16,x20,#14
+	add	x23,x23,x19			// h+=K[i]
+	eor	x10,x20,x20,ror#23
+	and	x17,x21,x20
+	bic	x19,x22,x20
+	add	x23,x23,x7			// h+=X[i]
+	orr	x17,x17,x19			// Ch(e,f,g)
+	eor	x19,x24,x25			// a^b, b^c in next round
+	eor	x16,x16,x10,ror#18	// Sigma1(e)
+	ror	x10,x24,#28
+	add	x23,x23,x17			// h+=Ch(e,f,g)
+	eor	x17,x24,x24,ror#5
+	add	x23,x23,x16			// h+=Sigma1(e)
+	and	x28,x28,x19			// (b^c)&=(a^b)
+	add	x27,x27,x23			// d+=h
+	eor	x28,x28,x25			// Maj(a,b,c)
+	eor	x17,x10,x17,ror#34	// Sigma0(a)
+	add	x23,x23,x28			// h+=Maj(a,b,c)
+	ldr	x28,[x30],#8		// *K++, x19 in next round
+	//add	x23,x23,x17			// h+=Sigma0(a)
+#ifndef	__AARCH64EB__
+	rev	x8,x8			// 5
+#endif
+	ldp	x9,x10,[x1],#2*8
+	add	x23,x23,x17			// h+=Sigma0(a)
+	ror	x16,x27,#14
+	add	x22,x22,x28			// h+=K[i]
+	eor	x11,x27,x27,ror#23
+	and	x17,x20,x27
+	bic	x28,x21,x27
+	add	x22,x22,x8			// h+=X[i]
+	orr	x17,x17,x28			// Ch(e,f,g)
+	eor	x28,x23,x24			// a^b, b^c in next round
+	eor	x16,x16,x11,ror#18	// Sigma1(e)
+	ror	x11,x23,#28
+	add	x22,x22,x17			// h+=Ch(e,f,g)
+	eor	x17,x23,x23,ror#5
+	add	x22,x22,x16			// h+=Sigma1(e)
+	and	x19,x19,x28			// (b^c)&=(a^b)
+	add	x26,x26,x22			// d+=h
+	eor	x19,x19,x24			// Maj(a,b,c)
+	eor	x17,x11,x17,ror#34	// Sigma0(a)
+	add	x22,x22,x19			// h+=Maj(a,b,c)
+	ldr	x19,[x30],#8		// *K++, x28 in next round
+	//add	x22,x22,x17			// h+=Sigma0(a)
+#ifndef	__AARCH64EB__
+	rev	x9,x9			// 6
+#endif
+	add	x22,x22,x17			// h+=Sigma0(a)
+	ror	x16,x26,#14
+	add	x21,x21,x19			// h+=K[i]
+	eor	x12,x26,x26,ror#23
+	and	x17,x27,x26
+	bic	x19,x20,x26
+	add	x21,x21,x9			// h+=X[i]
+	orr	x17,x17,x19			// Ch(e,f,g)
+	eor	x19,x22,x23			// a^b, b^c in next round
+	eor	x16,x16,x12,ror#18	// Sigma1(e)
+	ror	x12,x22,#28
+	add	x21,x21,x17			// h+=Ch(e,f,g)
+	eor	x17,x22,x22,ror#5
+	add	x21,x21,x16			// h+=Sigma1(e)
+	and	x28,x28,x19			// (b^c)&=(a^b)
+	add	x25,x25,x21			// d+=h
+	eor	x28,x28,x23			// Maj(a,b,c)
+	eor	x17,x12,x17,ror#34	// Sigma0(a)
+	add	x21,x21,x28			// h+=Maj(a,b,c)
+	ldr	x28,[x30],#8		// *K++, x19 in next round
+	//add	x21,x21,x17			// h+=Sigma0(a)
+#ifndef	__AARCH64EB__
+	rev	x10,x10			// 7
+#endif
+	ldp	x11,x12,[x1],#2*8
+	add	x21,x21,x17			// h+=Sigma0(a)
+	ror	x16,x25,#14
+	add	x20,x20,x28			// h+=K[i]
+	eor	x13,x25,x25,ror#23
+	and	x17,x26,x25
+	bic	x28,x27,x25
+	add	x20,x20,x10			// h+=X[i]
+	orr	x17,x17,x28			// Ch(e,f,g)
+	eor	x28,x21,x22			// a^b, b^c in next round
+	eor	x16,x16,x13,ror#18	// Sigma1(e)
+	ror	x13,x21,#28
+	add	x20,x20,x17			// h+=Ch(e,f,g)
+	eor	x17,x21,x21,ror#5
+	add	x20,x20,x16			// h+=Sigma1(e)
+	and	x19,x19,x28			// (b^c)&=(a^b)
+	add	x24,x24,x20			// d+=h
+	eor	x19,x19,x22			// Maj(a,b,c)
+	eor	x17,x13,x17,ror#34	// Sigma0(a)
+	add	x20,x20,x19			// h+=Maj(a,b,c)
+	ldr	x19,[x30],#8		// *K++, x28 in next round
+	//add	x20,x20,x17			// h+=Sigma0(a)
+#ifndef	__AARCH64EB__
+	rev	x11,x11			// 8
+#endif
+	add	x20,x20,x17			// h+=Sigma0(a)
+	ror	x16,x24,#14
+	add	x27,x27,x19			// h+=K[i]
+	eor	x14,x24,x24,ror#23
+	and	x17,x25,x24
+	bic	x19,x26,x24
+	add	x27,x27,x11			// h+=X[i]
+	orr	x17,x17,x19			// Ch(e,f,g)
+	eor	x19,x20,x21			// a^b, b^c in next round
+	eor	x16,x16,x14,ror#18	// Sigma1(e)
+	ror	x14,x20,#28
+	add	x27,x27,x17			// h+=Ch(e,f,g)
+	eor	x17,x20,x20,ror#5
+	add	x27,x27,x16			// h+=Sigma1(e)
+	and	x28,x28,x19			// (b^c)&=(a^b)
+	add	x23,x23,x27			// d+=h
+	eor	x28,x28,x21			// Maj(a,b,c)
+	eor	x17,x14,x17,ror#34	// Sigma0(a)
+	add	x27,x27,x28			// h+=Maj(a,b,c)
+	ldr	x28,[x30],#8		// *K++, x19 in next round
+	//add	x27,x27,x17			// h+=Sigma0(a)
+#ifndef	__AARCH64EB__
+	rev	x12,x12			// 9
+#endif
+	ldp	x13,x14,[x1],#2*8
+	add	x27,x27,x17			// h+=Sigma0(a)
+	ror	x16,x23,#14
+	add	x26,x26,x28			// h+=K[i]
+	eor	x15,x23,x23,ror#23
+	and	x17,x24,x23
+	bic	x28,x25,x23
+	add	x26,x26,x12			// h+=X[i]
+	orr	x17,x17,x28			// Ch(e,f,g)
+	eor	x28,x27,x20			// a^b, b^c in next round
+	eor	x16,x16,x15,ror#18	// Sigma1(e)
+	ror	x15,x27,#28
+	add	x26,x26,x17			// h+=Ch(e,f,g)
+	eor	x17,x27,x27,ror#5
+	add	x26,x26,x16			// h+=Sigma1(e)
+	and	x19,x19,x28			// (b^c)&=(a^b)
+	add	x22,x22,x26			// d+=h
+	eor	x19,x19,x20			// Maj(a,b,c)
+	eor	x17,x15,x17,ror#34	// Sigma0(a)
+	add	x26,x26,x19			// h+=Maj(a,b,c)
+	ldr	x19,[x30],#8		// *K++, x28 in next round
+	//add	x26,x26,x17			// h+=Sigma0(a)
+#ifndef	__AARCH64EB__
+	rev	x13,x13			// 10
+#endif
+	add	x26,x26,x17			// h+=Sigma0(a)
+	ror	x16,x22,#14
+	add	x25,x25,x19			// h+=K[i]
+	eor	x0,x22,x22,ror#23
+	and	x17,x23,x22
+	bic	x19,x24,x22
+	add	x25,x25,x13			// h+=X[i]
+	orr	x17,x17,x19			// Ch(e,f,g)
+	eor	x19,x26,x27			// a^b, b^c in next round
+	eor	x16,x16,x0,ror#18	// Sigma1(e)
+	ror	x0,x26,#28
+	add	x25,x25,x17			// h+=Ch(e,f,g)
+	eor	x17,x26,x26,ror#5
+	add	x25,x25,x16			// h+=Sigma1(e)
+	and	x28,x28,x19			// (b^c)&=(a^b)
+	add	x21,x21,x25			// d+=h
+	eor	x28,x28,x27			// Maj(a,b,c)
+	eor	x17,x0,x17,ror#34	// Sigma0(a)
+	add	x25,x25,x28			// h+=Maj(a,b,c)
+	ldr	x28,[x30],#8		// *K++, x19 in next round
+	//add	x25,x25,x17			// h+=Sigma0(a)
+#ifndef	__AARCH64EB__
+	rev	x14,x14			// 11
+#endif
+	ldp	x15,x0,[x1],#2*8
+	add	x25,x25,x17			// h+=Sigma0(a)
+	str	x6,[sp,#24]
+	ror	x16,x21,#14
+	add	x24,x24,x28			// h+=K[i]
+	eor	x6,x21,x21,ror#23
+	and	x17,x22,x21
+	bic	x28,x23,x21
+	add	x24,x24,x14			// h+=X[i]
+	orr	x17,x17,x28			// Ch(e,f,g)
+	eor	x28,x25,x26			// a^b, b^c in next round
+	eor	x16,x16,x6,ror#18	// Sigma1(e)
+	ror	x6,x25,#28
+	add	x24,x24,x17			// h+=Ch(e,f,g)
+	eor	x17,x25,x25,ror#5
+	add	x24,x24,x16			// h+=Sigma1(e)
+	and	x19,x19,x28			// (b^c)&=(a^b)
+	add	x20,x20,x24			// d+=h
+	eor	x19,x19,x26			// Maj(a,b,c)
+	eor	x17,x6,x17,ror#34	// Sigma0(a)
+	add	x24,x24,x19			// h+=Maj(a,b,c)
+	ldr	x19,[x30],#8		// *K++, x28 in next round
+	//add	x24,x24,x17			// h+=Sigma0(a)
+#ifndef	__AARCH64EB__
+	rev	x15,x15			// 12
+#endif
+	add	x24,x24,x17			// h+=Sigma0(a)
+	str	x7,[sp,#0]
+	ror	x16,x20,#14
+	add	x23,x23,x19			// h+=K[i]
+	eor	x7,x20,x20,ror#23
+	and	x17,x21,x20
+	bic	x19,x22,x20
+	add	x23,x23,x15			// h+=X[i]
+	orr	x17,x17,x19			// Ch(e,f,g)
+	eor	x19,x24,x25			// a^b, b^c in next round
+	eor	x16,x16,x7,ror#18	// Sigma1(e)
+	ror	x7,x24,#28
+	add	x23,x23,x17			// h+=Ch(e,f,g)
+	eor	x17,x24,x24,ror#5
+	add	x23,x23,x16			// h+=Sigma1(e)
+	and	x28,x28,x19			// (b^c)&=(a^b)
+	add	x27,x27,x23			// d+=h
+	eor	x28,x28,x25			// Maj(a,b,c)
+	eor	x17,x7,x17,ror#34	// Sigma0(a)
+	add	x23,x23,x28			// h+=Maj(a,b,c)
+	ldr	x28,[x30],#8		// *K++, x19 in next round
+	//add	x23,x23,x17			// h+=Sigma0(a)
+#ifndef	__AARCH64EB__
+	rev	x0,x0			// 13
+#endif
+	ldp	x1,x2,[x1]
+	add	x23,x23,x17			// h+=Sigma0(a)
+	str	x8,[sp,#8]
+	ror	x16,x27,#14
+	add	x22,x22,x28			// h+=K[i]
+	eor	x8,x27,x27,ror#23
+	and	x17,x20,x27
+	bic	x28,x21,x27
+	add	x22,x22,x0			// h+=X[i]
+	orr	x17,x17,x28			// Ch(e,f,g)
+	eor	x28,x23,x24			// a^b, b^c in next round
+	eor	x16,x16,x8,ror#18	// Sigma1(e)
+	ror	x8,x23,#28
+	add	x22,x22,x17			// h+=Ch(e,f,g)
+	eor	x17,x23,x23,ror#5
+	add	x22,x22,x16			// h+=Sigma1(e)
+	and	x19,x19,x28			// (b^c)&=(a^b)
+	add	x26,x26,x22			// d+=h
+	eor	x19,x19,x24			// Maj(a,b,c)
+	eor	x17,x8,x17,ror#34	// Sigma0(a)
+	add	x22,x22,x19			// h+=Maj(a,b,c)
+	ldr	x19,[x30],#8		// *K++, x28 in next round
+	//add	x22,x22,x17			// h+=Sigma0(a)
+#ifndef	__AARCH64EB__
+	rev	x1,x1			// 14
+#endif
+	ldr	x6,[sp,#24]
+	add	x22,x22,x17			// h+=Sigma0(a)
+	str	x9,[sp,#16]
+	ror	x16,x26,#14
+	add	x21,x21,x19			// h+=K[i]
+	eor	x9,x26,x26,ror#23
+	and	x17,x27,x26
+	bic	x19,x20,x26
+	add	x21,x21,x1			// h+=X[i]
+	orr	x17,x17,x19			// Ch(e,f,g)
+	eor	x19,x22,x23			// a^b, b^c in next round
+	eor	x16,x16,x9,ror#18	// Sigma1(e)
+	ror	x9,x22,#28
+	add	x21,x21,x17			// h+=Ch(e,f,g)
+	eor	x17,x22,x22,ror#5
+	add	x21,x21,x16			// h+=Sigma1(e)
+	and	x28,x28,x19			// (b^c)&=(a^b)
+	add	x25,x25,x21			// d+=h
+	eor	x28,x28,x23			// Maj(a,b,c)
+	eor	x17,x9,x17,ror#34	// Sigma0(a)
+	add	x21,x21,x28			// h+=Maj(a,b,c)
+	ldr	x28,[x30],#8		// *K++, x19 in next round
+	//add	x21,x21,x17			// h+=Sigma0(a)
+#ifndef	__AARCH64EB__
+	rev	x2,x2			// 15
+#endif
+	ldr	x7,[sp,#0]
+	add	x21,x21,x17			// h+=Sigma0(a)
+	str	x10,[sp,#24]
+	ror	x16,x25,#14
+	add	x20,x20,x28			// h+=K[i]
+	ror	x9,x4,#1
+	and	x17,x26,x25
+	ror	x8,x1,#19
+	bic	x28,x27,x25
+	ror	x10,x21,#28
+	add	x20,x20,x2			// h+=X[i]
+	eor	x16,x16,x25,ror#18
+	eor	x9,x9,x4,ror#8
+	orr	x17,x17,x28			// Ch(e,f,g)
+	eor	x28,x21,x22			// a^b, b^c in next round
+	eor	x16,x16,x25,ror#41	// Sigma1(e)
+	eor	x10,x10,x21,ror#34
+	add	x20,x20,x17			// h+=Ch(e,f,g)
+	and	x19,x19,x28			// (b^c)&=(a^b)
+	eor	x8,x8,x1,ror#61
+	eor	x9,x9,x4,lsr#7	// sigma0(X[i+1])
+	add	x20,x20,x16			// h+=Sigma1(e)
+	eor	x19,x19,x22			// Maj(a,b,c)
+	eor	x17,x10,x21,ror#39	// Sigma0(a)
+	eor	x8,x8,x1,lsr#6	// sigma1(X[i+14])
+	add	x3,x3,x12
+	add	x24,x24,x20			// d+=h
+	add	x20,x20,x19			// h+=Maj(a,b,c)
+	ldr	x19,[x30],#8		// *K++, x28 in next round
+	add	x3,x3,x9
+	add	x20,x20,x17			// h+=Sigma0(a)
+	add	x3,x3,x8
+.Loop_16_xx:
+	ldr	x8,[sp,#8]
+	str	x11,[sp,#0]
+	ror	x16,x24,#14
+	add	x27,x27,x19			// h+=K[i]
+	ror	x10,x5,#1
+	and	x17,x25,x24
+	ror	x9,x2,#19
+	bic	x19,x26,x24
+	ror	x11,x20,#28
+	add	x27,x27,x3			// h+=X[i]
+	eor	x16,x16,x24,ror#18
+	eor	x10,x10,x5,ror#8
+	orr	x17,x17,x19			// Ch(e,f,g)
+	eor	x19,x20,x21			// a^b, b^c in next round
+	eor	x16,x16,x24,ror#41	// Sigma1(e)
+	eor	x11,x11,x20,ror#34
+	add	x27,x27,x17			// h+=Ch(e,f,g)
+	and	x28,x28,x19			// (b^c)&=(a^b)
+	eor	x9,x9,x2,ror#61
+	eor	x10,x10,x5,lsr#7	// sigma0(X[i+1])
+	add	x27,x27,x16			// h+=Sigma1(e)
+	eor	x28,x28,x21			// Maj(a,b,c)
+	eor	x17,x11,x20,ror#39	// Sigma0(a)
+	eor	x9,x9,x2,lsr#6	// sigma1(X[i+14])
+	add	x4,x4,x13
+	add	x23,x23,x27			// d+=h
+	add	x27,x27,x28			// h+=Maj(a,b,c)
+	ldr	x28,[x30],#8		// *K++, x19 in next round
+	add	x4,x4,x10
+	add	x27,x27,x17			// h+=Sigma0(a)
+	add	x4,x4,x9
+	ldr	x9,[sp,#16]
+	str	x12,[sp,#8]
+	ror	x16,x23,#14
+	add	x26,x26,x28			// h+=K[i]
+	ror	x11,x6,#1
+	and	x17,x24,x23
+	ror	x10,x3,#19
+	bic	x28,x25,x23
+	ror	x12,x27,#28
+	add	x26,x26,x4			// h+=X[i]
+	eor	x16,x16,x23,ror#18
+	eor	x11,x11,x6,ror#8
+	orr	x17,x17,x28			// Ch(e,f,g)
+	eor	x28,x27,x20			// a^b, b^c in next round
+	eor	x16,x16,x23,ror#41	// Sigma1(e)
+	eor	x12,x12,x27,ror#34
+	add	x26,x26,x17			// h+=Ch(e,f,g)
+	and	x19,x19,x28			// (b^c)&=(a^b)
+	eor	x10,x10,x3,ror#61
+	eor	x11,x11,x6,lsr#7	// sigma0(X[i+1])
+	add	x26,x26,x16			// h+=Sigma1(e)
+	eor	x19,x19,x20			// Maj(a,b,c)
+	eor	x17,x12,x27,ror#39	// Sigma0(a)
+	eor	x10,x10,x3,lsr#6	// sigma1(X[i+14])
+	add	x5,x5,x14
+	add	x22,x22,x26			// d+=h
+	add	x26,x26,x19			// h+=Maj(a,b,c)
+	ldr	x19,[x30],#8		// *K++, x28 in next round
+	add	x5,x5,x11
+	add	x26,x26,x17			// h+=Sigma0(a)
+	add	x5,x5,x10
+	ldr	x10,[sp,#24]
+	str	x13,[sp,#16]
+	ror	x16,x22,#14
+	add	x25,x25,x19			// h+=K[i]
+	ror	x12,x7,#1
+	and	x17,x23,x22
+	ror	x11,x4,#19
+	bic	x19,x24,x22
+	ror	x13,x26,#28
+	add	x25,x25,x5			// h+=X[i]
+	eor	x16,x16,x22,ror#18
+	eor	x12,x12,x7,ror#8
+	orr	x17,x17,x19			// Ch(e,f,g)
+	eor	x19,x26,x27			// a^b, b^c in next round
+	eor	x16,x16,x22,ror#41	// Sigma1(e)
+	eor	x13,x13,x26,ror#34
+	add	x25,x25,x17			// h+=Ch(e,f,g)
+	and	x28,x28,x19			// (b^c)&=(a^b)
+	eor	x11,x11,x4,ror#61
+	eor	x12,x12,x7,lsr#7	// sigma0(X[i+1])
+	add	x25,x25,x16			// h+=Sigma1(e)
+	eor	x28,x28,x27			// Maj(a,b,c)
+	eor	x17,x13,x26,ror#39	// Sigma0(a)
+	eor	x11,x11,x4,lsr#6	// sigma1(X[i+14])
+	add	x6,x6,x15
+	add	x21,x21,x25			// d+=h
+	add	x25,x25,x28			// h+=Maj(a,b,c)
+	ldr	x28,[x30],#8		// *K++, x19 in next round
+	add	x6,x6,x12
+	add	x25,x25,x17			// h+=Sigma0(a)
+	add	x6,x6,x11
+	ldr	x11,[sp,#0]
+	str	x14,[sp,#24]
+	ror	x16,x21,#14
+	add	x24,x24,x28			// h+=K[i]
+	ror	x13,x8,#1
+	and	x17,x22,x21
+	ror	x12,x5,#19
+	bic	x28,x23,x21
+	ror	x14,x25,#28
+	add	x24,x24,x6			// h+=X[i]
+	eor	x16,x16,x21,ror#18
+	eor	x13,x13,x8,ror#8
+	orr	x17,x17,x28			// Ch(e,f,g)
+	eor	x28,x25,x26			// a^b, b^c in next round
+	eor	x16,x16,x21,ror#41	// Sigma1(e)
+	eor	x14,x14,x25,ror#34
+	add	x24,x24,x17			// h+=Ch(e,f,g)
+	and	x19,x19,x28			// (b^c)&=(a^b)
+	eor	x12,x12,x5,ror#61
+	eor	x13,x13,x8,lsr#7	// sigma0(X[i+1])
+	add	x24,x24,x16			// h+=Sigma1(e)
+	eor	x19,x19,x26			// Maj(a,b,c)
+	eor	x17,x14,x25,ror#39	// Sigma0(a)
+	eor	x12,x12,x5,lsr#6	// sigma1(X[i+14])
+	add	x7,x7,x0
+	add	x20,x20,x24			// d+=h
+	add	x24,x24,x19			// h+=Maj(a,b,c)
+	ldr	x19,[x30],#8		// *K++, x28 in next round
+	add	x7,x7,x13
+	add	x24,x24,x17			// h+=Sigma0(a)
+	add	x7,x7,x12
+	ldr	x12,[sp,#8]
+	str	x15,[sp,#0]
+	ror	x16,x20,#14
+	add	x23,x23,x19			// h+=K[i]
+	ror	x14,x9,#1
+	and	x17,x21,x20
+	ror	x13,x6,#19
+	bic	x19,x22,x20
+	ror	x15,x24,#28
+	add	x23,x23,x7			// h+=X[i]
+	eor	x16,x16,x20,ror#18
+	eor	x14,x14,x9,ror#8
+	orr	x17,x17,x19			// Ch(e,f,g)
+	eor	x19,x24,x25			// a^b, b^c in next round
+	eor	x16,x16,x20,ror#41	// Sigma1(e)
+	eor	x15,x15,x24,ror#34
+	add	x23,x23,x17			// h+=Ch(e,f,g)
+	and	x28,x28,x19			// (b^c)&=(a^b)
+	eor	x13,x13,x6,ror#61
+	eor	x14,x14,x9,lsr#7	// sigma0(X[i+1])
+	add	x23,x23,x16			// h+=Sigma1(e)
+	eor	x28,x28,x25			// Maj(a,b,c)
+	eor	x17,x15,x24,ror#39	// Sigma0(a)
+	eor	x13,x13,x6,lsr#6	// sigma1(X[i+14])
+	add	x8,x8,x1
+	add	x27,x27,x23			// d+=h
+	add	x23,x23,x28			// h+=Maj(a,b,c)
+	ldr	x28,[x30],#8		// *K++, x19 in next round
+	add	x8,x8,x14
+	add	x23,x23,x17			// h+=Sigma0(a)
+	add	x8,x8,x13
+	ldr	x13,[sp,#16]
+	str	x0,[sp,#8]
+	ror	x16,x27,#14
+	add	x22,x22,x28			// h+=K[i]
+	ror	x15,x10,#1
+	and	x17,x20,x27
+	ror	x14,x7,#19
+	bic	x28,x21,x27
+	ror	x0,x23,#28
+	add	x22,x22,x8			// h+=X[i]
+	eor	x16,x16,x27,ror#18
+	eor	x15,x15,x10,ror#8
+	orr	x17,x17,x28			// Ch(e,f,g)
+	eor	x28,x23,x24			// a^b, b^c in next round
+	eor	x16,x16,x27,ror#41	// Sigma1(e)
+	eor	x0,x0,x23,ror#34
+	add	x22,x22,x17			// h+=Ch(e,f,g)
+	and	x19,x19,x28			// (b^c)&=(a^b)
+	eor	x14,x14,x7,ror#61
+	eor	x15,x15,x10,lsr#7	// sigma0(X[i+1])
+	add	x22,x22,x16			// h+=Sigma1(e)
+	eor	x19,x19,x24			// Maj(a,b,c)
+	eor	x17,x0,x23,ror#39	// Sigma0(a)
+	eor	x14,x14,x7,lsr#6	// sigma1(X[i+14])
+	add	x9,x9,x2
+	add	x26,x26,x22			// d+=h
+	add	x22,x22,x19			// h+=Maj(a,b,c)
+	ldr	x19,[x30],#8		// *K++, x28 in next round
+	add	x9,x9,x15
+	add	x22,x22,x17			// h+=Sigma0(a)
+	add	x9,x9,x14
+	ldr	x14,[sp,#24]
+	str	x1,[sp,#16]
+	ror	x16,x26,#14
+	add	x21,x21,x19			// h+=K[i]
+	ror	x0,x11,#1
+	and	x17,x27,x26
+	ror	x15,x8,#19
+	bic	x19,x20,x26
+	ror	x1,x22,#28
+	add	x21,x21,x9			// h+=X[i]
+	eor	x16,x16,x26,ror#18
+	eor	x0,x0,x11,ror#8
+	orr	x17,x17,x19			// Ch(e,f,g)
+	eor	x19,x22,x23			// a^b, b^c in next round
+	eor	x16,x16,x26,ror#41	// Sigma1(e)
+	eor	x1,x1,x22,ror#34
+	add	x21,x21,x17			// h+=Ch(e,f,g)
+	and	x28,x28,x19			// (b^c)&=(a^b)
+	eor	x15,x15,x8,ror#61
+	eor	x0,x0,x11,lsr#7	// sigma0(X[i+1])
+	add	x21,x21,x16			// h+=Sigma1(e)
+	eor	x28,x28,x23			// Maj(a,b,c)
+	eor	x17,x1,x22,ror#39	// Sigma0(a)
+	eor	x15,x15,x8,lsr#6	// sigma1(X[i+14])
+	add	x10,x10,x3
+	add	x25,x25,x21			// d+=h
+	add	x21,x21,x28			// h+=Maj(a,b,c)
+	ldr	x28,[x30],#8		// *K++, x19 in next round
+	add	x10,x10,x0
+	add	x21,x21,x17			// h+=Sigma0(a)
+	add	x10,x10,x15
+	ldr	x15,[sp,#0]
+	str	x2,[sp,#24]
+	ror	x16,x25,#14
+	add	x20,x20,x28			// h+=K[i]
+	ror	x1,x12,#1
+	and	x17,x26,x25
+	ror	x0,x9,#19
+	bic	x28,x27,x25
+	ror	x2,x21,#28
+	add	x20,x20,x10			// h+=X[i]
+	eor	x16,x16,x25,ror#18
+	eor	x1,x1,x12,ror#8
+	orr	x17,x17,x28			// Ch(e,f,g)
+	eor	x28,x21,x22			// a^b, b^c in next round
+	eor	x16,x16,x25,ror#41	// Sigma1(e)
+	eor	x2,x2,x21,ror#34
+	add	x20,x20,x17			// h+=Ch(e,f,g)
+	and	x19,x19,x28			// (b^c)&=(a^b)
+	eor	x0,x0,x9,ror#61
+	eor	x1,x1,x12,lsr#7	// sigma0(X[i+1])
+	add	x20,x20,x16			// h+=Sigma1(e)
+	eor	x19,x19,x22			// Maj(a,b,c)
+	eor	x17,x2,x21,ror#39	// Sigma0(a)
+	eor	x0,x0,x9,lsr#6	// sigma1(X[i+14])
+	add	x11,x11,x4
+	add	x24,x24,x20			// d+=h
+	add	x20,x20,x19			// h+=Maj(a,b,c)
+	ldr	x19,[x30],#8		// *K++, x28 in next round
+	add	x11,x11,x1
+	add	x20,x20,x17			// h+=Sigma0(a)
+	add	x11,x11,x0
+	ldr	x0,[sp,#8]
+	str	x3,[sp,#0]
+	ror	x16,x24,#14
+	add	x27,x27,x19			// h+=K[i]
+	ror	x2,x13,#1
+	and	x17,x25,x24
+	ror	x1,x10,#19
+	bic	x19,x26,x24
+	ror	x3,x20,#28
+	add	x27,x27,x11			// h+=X[i]
+	eor	x16,x16,x24,ror#18
+	eor	x2,x2,x13,ror#8
+	orr	x17,x17,x19			// Ch(e,f,g)
+	eor	x19,x20,x21			// a^b, b^c in next round
+	eor	x16,x16,x24,ror#41	// Sigma1(e)
+	eor	x3,x3,x20,ror#34
+	add	x27,x27,x17			// h+=Ch(e,f,g)
+	and	x28,x28,x19			// (b^c)&=(a^b)
+	eor	x1,x1,x10,ror#61
+	eor	x2,x2,x13,lsr#7	// sigma0(X[i+1])
+	add	x27,x27,x16			// h+=Sigma1(e)
+	eor	x28,x28,x21			// Maj(a,b,c)
+	eor	x17,x3,x20,ror#39	// Sigma0(a)
+	eor	x1,x1,x10,lsr#6	// sigma1(X[i+14])
+	add	x12,x12,x5
+	add	x23,x23,x27			// d+=h
+	add	x27,x27,x28			// h+=Maj(a,b,c)
+	ldr	x28,[x30],#8		// *K++, x19 in next round
+	add	x12,x12,x2
+	add	x27,x27,x17			// h+=Sigma0(a)
+	add	x12,x12,x1
+	ldr	x1,[sp,#16]
+	str	x4,[sp,#8]
+	ror	x16,x23,#14
+	add	x26,x26,x28			// h+=K[i]
+	ror	x3,x14,#1
+	and	x17,x24,x23
+	ror	x2,x11,#19
+	bic	x28,x25,x23
+	ror	x4,x27,#28
+	add	x26,x26,x12			// h+=X[i]
+	eor	x16,x16,x23,ror#18
+	eor	x3,x3,x14,ror#8
+	orr	x17,x17,x28			// Ch(e,f,g)
+	eor	x28,x27,x20			// a^b, b^c in next round
+	eor	x16,x16,x23,ror#41	// Sigma1(e)
+	eor	x4,x4,x27,ror#34
+	add	x26,x26,x17			// h+=Ch(e,f,g)
+	and	x19,x19,x28			// (b^c)&=(a^b)
+	eor	x2,x2,x11,ror#61
+	eor	x3,x3,x14,lsr#7	// sigma0(X[i+1])
+	add	x26,x26,x16			// h+=Sigma1(e)
+	eor	x19,x19,x20			// Maj(a,b,c)
+	eor	x17,x4,x27,ror#39	// Sigma0(a)
+	eor	x2,x2,x11,lsr#6	// sigma1(X[i+14])
+	add	x13,x13,x6
+	add	x22,x22,x26			// d+=h
+	add	x26,x26,x19			// h+=Maj(a,b,c)
+	ldr	x19,[x30],#8		// *K++, x28 in next round
+	add	x13,x13,x3
+	add	x26,x26,x17			// h+=Sigma0(a)
+	add	x13,x13,x2
+	ldr	x2,[sp,#24]
+	str	x5,[sp,#16]
+	ror	x16,x22,#14
+	add	x25,x25,x19			// h+=K[i]
+	ror	x4,x15,#1
+	and	x17,x23,x22
+	ror	x3,x12,#19
+	bic	x19,x24,x22
+	ror	x5,x26,#28
+	add	x25,x25,x13			// h+=X[i]
+	eor	x16,x16,x22,ror#18
+	eor	x4,x4,x15,ror#8
+	orr	x17,x17,x19			// Ch(e,f,g)
+	eor	x19,x26,x27			// a^b, b^c in next round
+	eor	x16,x16,x22,ror#41	// Sigma1(e)
+	eor	x5,x5,x26,ror#34
+	add	x25,x25,x17			// h+=Ch(e,f,g)
+	and	x28,x28,x19			// (b^c)&=(a^b)
+	eor	x3,x3,x12,ror#61
+	eor	x4,x4,x15,lsr#7	// sigma0(X[i+1])
+	add	x25,x25,x16			// h+=Sigma1(e)
+	eor	x28,x28,x27			// Maj(a,b,c)
+	eor	x17,x5,x26,ror#39	// Sigma0(a)
+	eor	x3,x3,x12,lsr#6	// sigma1(X[i+14])
+	add	x14,x14,x7
+	add	x21,x21,x25			// d+=h
+	add	x25,x25,x28			// h+=Maj(a,b,c)
+	ldr	x28,[x30],#8		// *K++, x19 in next round
+	add	x14,x14,x4
+	add	x25,x25,x17			// h+=Sigma0(a)
+	add	x14,x14,x3
+	ldr	x3,[sp,#0]
+	str	x6,[sp,#24]
+	ror	x16,x21,#14
+	add	x24,x24,x28			// h+=K[i]
+	ror	x5,x0,#1
+	and	x17,x22,x21
+	ror	x4,x13,#19
+	bic	x28,x23,x21
+	ror	x6,x25,#28
+	add	x24,x24,x14			// h+=X[i]
+	eor	x16,x16,x21,ror#18
+	eor	x5,x5,x0,ror#8
+	orr	x17,x17,x28			// Ch(e,f,g)
+	eor	x28,x25,x26			// a^b, b^c in next round
+	eor	x16,x16,x21,ror#41	// Sigma1(e)
+	eor	x6,x6,x25,ror#34
+	add	x24,x24,x17			// h+=Ch(e,f,g)
+	and	x19,x19,x28			// (b^c)&=(a^b)
+	eor	x4,x4,x13,ror#61
+	eor	x5,x5,x0,lsr#7	// sigma0(X[i+1])
+	add	x24,x24,x16			// h+=Sigma1(e)
+	eor	x19,x19,x26			// Maj(a,b,c)
+	eor	x17,x6,x25,ror#39	// Sigma0(a)
+	eor	x4,x4,x13,lsr#6	// sigma1(X[i+14])
+	add	x15,x15,x8
+	add	x20,x20,x24			// d+=h
+	add	x24,x24,x19			// h+=Maj(a,b,c)
+	ldr	x19,[x30],#8		// *K++, x28 in next round
+	add	x15,x15,x5
+	add	x24,x24,x17			// h+=Sigma0(a)
+	add	x15,x15,x4
+	ldr	x4,[sp,#8]
+	str	x7,[sp,#0]
+	ror	x16,x20,#14
+	add	x23,x23,x19			// h+=K[i]
+	ror	x6,x1,#1
+	and	x17,x21,x20
+	ror	x5,x14,#19
+	bic	x19,x22,x20
+	ror	x7,x24,#28
+	add	x23,x23,x15			// h+=X[i]
+	eor	x16,x16,x20,ror#18
+	eor	x6,x6,x1,ror#8
+	orr	x17,x17,x19			// Ch(e,f,g)
+	eor	x19,x24,x25			// a^b, b^c in next round
+	eor	x16,x16,x20,ror#41	// Sigma1(e)
+	eor	x7,x7,x24,ror#34
+	add	x23,x23,x17			// h+=Ch(e,f,g)
+	and	x28,x28,x19			// (b^c)&=(a^b)
+	eor	x5,x5,x14,ror#61
+	eor	x6,x6,x1,lsr#7	// sigma0(X[i+1])
+	add	x23,x23,x16			// h+=Sigma1(e)
+	eor	x28,x28,x25			// Maj(a,b,c)
+	eor	x17,x7,x24,ror#39	// Sigma0(a)
+	eor	x5,x5,x14,lsr#6	// sigma1(X[i+14])
+	add	x0,x0,x9
+	add	x27,x27,x23			// d+=h
+	add	x23,x23,x28			// h+=Maj(a,b,c)
+	ldr	x28,[x30],#8		// *K++, x19 in next round
+	add	x0,x0,x6
+	add	x23,x23,x17			// h+=Sigma0(a)
+	add	x0,x0,x5
+	ldr	x5,[sp,#16]
+	str	x8,[sp,#8]
+	ror	x16,x27,#14
+	add	x22,x22,x28			// h+=K[i]
+	ror	x7,x2,#1
+	and	x17,x20,x27
+	ror	x6,x15,#19
+	bic	x28,x21,x27
+	ror	x8,x23,#28
+	add	x22,x22,x0			// h+=X[i]
+	eor	x16,x16,x27,ror#18
+	eor	x7,x7,x2,ror#8
+	orr	x17,x17,x28			// Ch(e,f,g)
+	eor	x28,x23,x24			// a^b, b^c in next round
+	eor	x16,x16,x27,ror#41	// Sigma1(e)
+	eor	x8,x8,x23,ror#34
+	add	x22,x22,x17			// h+=Ch(e,f,g)
+	and	x19,x19,x28			// (b^c)&=(a^b)
+	eor	x6,x6,x15,ror#61
+	eor	x7,x7,x2,lsr#7	// sigma0(X[i+1])
+	add	x22,x22,x16			// h+=Sigma1(e)
+	eor	x19,x19,x24			// Maj(a,b,c)
+	eor	x17,x8,x23,ror#39	// Sigma0(a)
+	eor	x6,x6,x15,lsr#6	// sigma1(X[i+14])
+	add	x1,x1,x10
+	add	x26,x26,x22			// d+=h
+	add	x22,x22,x19			// h+=Maj(a,b,c)
+	ldr	x19,[x30],#8		// *K++, x28 in next round
+	add	x1,x1,x7
+	add	x22,x22,x17			// h+=Sigma0(a)
+	add	x1,x1,x6
+	ldr	x6,[sp,#24]
+	str	x9,[sp,#16]
+	ror	x16,x26,#14
+	add	x21,x21,x19			// h+=K[i]
+	ror	x8,x3,#1
+	and	x17,x27,x26
+	ror	x7,x0,#19
+	bic	x19,x20,x26
+	ror	x9,x22,#28
+	add	x21,x21,x1			// h+=X[i]
+	eor	x16,x16,x26,ror#18
+	eor	x8,x8,x3,ror#8
+	orr	x17,x17,x19			// Ch(e,f,g)
+	eor	x19,x22,x23			// a^b, b^c in next round
+	eor	x16,x16,x26,ror#41	// Sigma1(e)
+	eor	x9,x9,x22,ror#34
+	add	x21,x21,x17			// h+=Ch(e,f,g)
+	and	x28,x28,x19			// (b^c)&=(a^b)
+	eor	x7,x7,x0,ror#61
+	eor	x8,x8,x3,lsr#7	// sigma0(X[i+1])
+	add	x21,x21,x16			// h+=Sigma1(e)
+	eor	x28,x28,x23			// Maj(a,b,c)
+	eor	x17,x9,x22,ror#39	// Sigma0(a)
+	eor	x7,x7,x0,lsr#6	// sigma1(X[i+14])
+	add	x2,x2,x11
+	add	x25,x25,x21			// d+=h
+	add	x21,x21,x28			// h+=Maj(a,b,c)
+	ldr	x28,[x30],#8		// *K++, x19 in next round
+	add	x2,x2,x8
+	add	x21,x21,x17			// h+=Sigma0(a)
+	add	x2,x2,x7
+	ldr	x7,[sp,#0]
+	str	x10,[sp,#24]
+	ror	x16,x25,#14
+	add	x20,x20,x28			// h+=K[i]
+	ror	x9,x4,#1
+	and	x17,x26,x25
+	ror	x8,x1,#19
+	bic	x28,x27,x25
+	ror	x10,x21,#28
+	add	x20,x20,x2			// h+=X[i]
+	eor	x16,x16,x25,ror#18
+	eor	x9,x9,x4,ror#8
+	orr	x17,x17,x28			// Ch(e,f,g)
+	eor	x28,x21,x22			// a^b, b^c in next round
+	eor	x16,x16,x25,ror#41	// Sigma1(e)
+	eor	x10,x10,x21,ror#34
+	add	x20,x20,x17			// h+=Ch(e,f,g)
+	and	x19,x19,x28			// (b^c)&=(a^b)
+	eor	x8,x8,x1,ror#61
+	eor	x9,x9,x4,lsr#7	// sigma0(X[i+1])
+	add	x20,x20,x16			// h+=Sigma1(e)
+	eor	x19,x19,x22			// Maj(a,b,c)
+	eor	x17,x10,x21,ror#39	// Sigma0(a)
+	eor	x8,x8,x1,lsr#6	// sigma1(X[i+14])
+	add	x3,x3,x12
+	add	x24,x24,x20			// d+=h
+	add	x20,x20,x19			// h+=Maj(a,b,c)
+	ldr	x19,[x30],#8		// *K++, x28 in next round
+	add	x3,x3,x9
+	add	x20,x20,x17			// h+=Sigma0(a)
+	add	x3,x3,x8
+	cbnz	x19,.Loop_16_xx
+
+	ldp	x0,x2,[x29,#96]
+	ldr	x1,[x29,#112]
+	sub	x30,x30,#648		// rewind
+
+	ldp	x3,x4,[x0]
+	ldp	x5,x6,[x0,#2*8]
+	add	x1,x1,#14*8			// advance input pointer
+	ldp	x7,x8,[x0,#4*8]
+	add	x20,x20,x3
+	ldp	x9,x10,[x0,#6*8]
+	add	x21,x21,x4
+	add	x22,x22,x5
+	add	x23,x23,x6
+	stp	x20,x21,[x0]
+	add	x24,x24,x7
+	add	x25,x25,x8
+	stp	x22,x23,[x0,#2*8]
+	add	x26,x26,x9
+	add	x27,x27,x10
+	cmp	x1,x2
+	stp	x24,x25,[x0,#4*8]
+	stp	x26,x27,[x0,#6*8]
+	b.ne	.Loop
+
+	ldp	x19,x20,[x29,#16]
+	add	sp,sp,#4*8
+	ldp	x21,x22,[x29,#32]
+	ldp	x23,x24,[x29,#48]
+	ldp	x25,x26,[x29,#64]
+	ldp	x27,x28,[x29,#80]
+	ldp	x29,x30,[sp],#128
+	ret
+.size	sha512_block_data_order,.-sha512_block_data_order
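The SHA-512 rounds above use the same structure as the SHA-256 code earlier in this patch, only with 64-bit lanes and different rotate counts; note how the scalar code folds two rotations into one, e.g. eor x6,x24,x24,ror#23 followed by eor x16,x16,x6,ror#18 yields ror18 ^ ror41 with a single extra eor. The underlying FIPS 180-4 functions, as an illustrative C sketch (ror64 is a hypothetical helper name):

#include <stdint.h>

static inline uint64_t ror64(uint64_t x, int n)
{
	return (x >> n) | (x << (64 - n));
}

/* The four SHA-512 mixing functions named in the comments above. */
static inline uint64_t Sigma0(uint64_t a) { return ror64(a, 28) ^ ror64(a, 34) ^ ror64(a, 39); }
static inline uint64_t Sigma1(uint64_t e) { return ror64(e, 14) ^ ror64(e, 18) ^ ror64(e, 41); }
static inline uint64_t sigma0(uint64_t x) { return ror64(x, 1)  ^ ror64(x, 8)  ^ (x >> 7); }
static inline uint64_t sigma1(uint64_t x) { return ror64(x, 19) ^ ror64(x, 61) ^ (x >> 6); }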
+
+.align	6
+.type	.LK512,%object
+.LK512:
+	.quad	0x428a2f98d728ae22,0x7137449123ef65cd
+	.quad	0xb5c0fbcfec4d3b2f,0xe9b5dba58189dbbc
+	.quad	0x3956c25bf348b538,0x59f111f1b605d019
+	.quad	0x923f82a4af194f9b,0xab1c5ed5da6d8118
+	.quad	0xd807aa98a3030242,0x12835b0145706fbe
+	.quad	0x243185be4ee4b28c,0x550c7dc3d5ffb4e2
+	.quad	0x72be5d74f27b896f,0x80deb1fe3b1696b1
+	.quad	0x9bdc06a725c71235,0xc19bf174cf692694
+	.quad	0xe49b69c19ef14ad2,0xefbe4786384f25e3
+	.quad	0x0fc19dc68b8cd5b5,0x240ca1cc77ac9c65
+	.quad	0x2de92c6f592b0275,0x4a7484aa6ea6e483
+	.quad	0x5cb0a9dcbd41fbd4,0x76f988da831153b5
+	.quad	0x983e5152ee66dfab,0xa831c66d2db43210
+	.quad	0xb00327c898fb213f,0xbf597fc7beef0ee4
+	.quad	0xc6e00bf33da88fc2,0xd5a79147930aa725
+	.quad	0x06ca6351e003826f,0x142929670a0e6e70
+	.quad	0x27b70a8546d22ffc,0x2e1b21385c26c926
+	.quad	0x4d2c6dfc5ac42aed,0x53380d139d95b3df
+	.quad	0x650a73548baf63de,0x766a0abb3c77b2a8
+	.quad	0x81c2c92e47edaee6,0x92722c851482353b
+	.quad	0xa2bfe8a14cf10364,0xa81a664bbc423001
+	.quad	0xc24b8b70d0f89791,0xc76c51a30654be30
+	.quad	0xd192e819d6ef5218,0xd69906245565a910
+	.quad	0xf40e35855771202a,0x106aa07032bbd1b8
+	.quad	0x19a4c116b8d2d0c8,0x1e376c085141ab53
+	.quad	0x2748774cdf8eeb99,0x34b0bcb5e19b48a8
+	.quad	0x391c0cb3c5c95a63,0x4ed8aa4ae3418acb
+	.quad	0x5b9cca4f7763e373,0x682e6ff3d6b2b8a3
+	.quad	0x748f82ee5defb2fc,0x78a5636f43172f60
+	.quad	0x84c87814a1f0ab72,0x8cc702081a6439ec
+	.quad	0x90befffa23631e28,0xa4506cebde82bde9
+	.quad	0xbef9a3f7b2c67915,0xc67178f2e372532b
+	.quad	0xca273eceea26619c,0xd186b8c721c0c207
+	.quad	0xeada7dd6cde0eb1e,0xf57d4f7fee6ed178
+	.quad	0x06f067aa72176fba,0x0a637dc5a2c898a6
+	.quad	0x113f9804bef90dae,0x1b710b35131c471b
+	.quad	0x28db77f523047d84,0x32caab7b40c72493
+	.quad	0x3c9ebe0a15c9bebc,0x431d67c49c100d4c
+	.quad	0x4cc5d4becb3e42b6,0x597f299cfc657e2a
+	.quad	0x5fcb6fab3ad6faec,0x6c44198c4a475817
+	.quad	0	// terminator
+.size	.LK512,.-.LK512
+#ifndef	__KERNEL__
+.align	3
+.LOPENSSL_armcap_P:
+# ifdef	__ILP32__
+	.long	OPENSSL_armcap_P-.
+# else
+	.quad	OPENSSL_armcap_P-.
+# endif
+#endif
+.asciz	"SHA512 block transform for ARMv8, CRYPTOGAMS by <appro@openssl.org>"
+.align	2
+#ifndef	__KERNEL__
+.comm	OPENSSL_armcap_P,4,4
+#endif
--- a/arch/arm64/kernel/bpi.S
+++ b/arch/arm64/kernel/bpi.S
@@ -17,6 +17,7 @@
  */
 
 #include <linux/linkage.h>
+#include <linux/arm-smccc.h>
 
 .macro ventry target
 	.rept 31
@@ -77,3 +78,22 @@ ENTRY(__psci_hyp_bp_inval_start)
 	ldp	x0, x1, [sp, #(16 * 8)]
 	add	sp, sp, #(8 * 18)
 ENTRY(__psci_hyp_bp_inval_end)
+
+.macro smccc_workaround_1 inst
+	sub	sp, sp, #(8 * 4)
+	stp	x2, x3, [sp, #(8 * 0)]
+	stp	x0, x1, [sp, #(8 * 2)]
+	mov	w0, #ARM_SMCCC_ARCH_WORKAROUND_1
+	\inst	#0
+	ldp	x2, x3, [sp, #(8 * 0)]
+	ldp	x0, x1, [sp, #(8 * 2)]
+	add	sp, sp, #(8 * 4)
+.endm
+
+ENTRY(__smccc_workaround_1_smc_start)
+	smccc_workaround_1	smc
+ENTRY(__smccc_workaround_1_smc_end)
+
+ENTRY(__smccc_workaround_1_hvc_start)
+	smccc_workaround_1	hvc
+ENTRY(__smccc_workaround_1_hvc_end)
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -54,6 +54,10 @@ DEFINE_PER_CPU_READ_MOSTLY(struct bp_har
 
 #ifdef CONFIG_KVM
 extern char __psci_hyp_bp_inval_start[], __psci_hyp_bp_inval_end[];
+extern char __smccc_workaround_1_smc_start[];
+extern char __smccc_workaround_1_smc_end[];
+extern char __smccc_workaround_1_hvc_start[];
+extern char __smccc_workaround_1_hvc_end[];
 
 static void __copy_hyp_vect_bpi(int slot, const char *hyp_vecs_start,
 				const char *hyp_vecs_end)
@@ -96,8 +100,12 @@ static void __install_bp_hardening_cb(bp
 	spin_unlock(&bp_lock);
 }
 #else
-#define __psci_hyp_bp_inval_start	NULL
-#define __psci_hyp_bp_inval_end		NULL
+#define __psci_hyp_bp_inval_start		NULL
+#define __psci_hyp_bp_inval_end			NULL
+#define __smccc_workaround_1_smc_start		NULL
+#define __smccc_workaround_1_smc_end		NULL
+#define __smccc_workaround_1_hvc_start		NULL
+#define __smccc_workaround_1_hvc_end		NULL
 
 static void __install_bp_hardening_cb(bp_hardening_cb_t fn,
 				      const char *hyp_vecs_start,
@@ -124,17 +132,75 @@ static void  install_bp_hardening_cb(con
 	__install_bp_hardening_cb(fn, hyp_vecs_start, hyp_vecs_end);
 }
 
+#include <uapi/linux/psci.h>
+#include <linux/arm-smccc.h>
 #include <linux/psci.h>
 
+static void call_smc_arch_workaround_1(void)
+{
+	arm_smccc_1_1_smc(ARM_SMCCC_ARCH_WORKAROUND_1, NULL);
+}
+
+static void call_hvc_arch_workaround_1(void)
+{
+	arm_smccc_1_1_hvc(ARM_SMCCC_ARCH_WORKAROUND_1, NULL);
+}
+
+static bool check_smccc_arch_workaround_1(const struct arm64_cpu_capabilities *entry)
+{
+	bp_hardening_cb_t cb;
+	void *smccc_start, *smccc_end;
+	struct arm_smccc_res res;
+
+	if (!entry->matches(entry, SCOPE_LOCAL_CPU))
+		return false;
+
+	if (psci_ops.smccc_version == SMCCC_VERSION_1_0)
+		return false;
+
+	switch (psci_ops.conduit) {
+	case PSCI_CONDUIT_HVC:
+		arm_smccc_1_1_hvc(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
+				  ARM_SMCCC_ARCH_WORKAROUND_1, &res);
+		if (res.a0)
+			return false;
+		cb = call_hvc_arch_workaround_1;
+		smccc_start = __smccc_workaround_1_hvc_start;
+		smccc_end = __smccc_workaround_1_hvc_end;
+		break;
+
+	case PSCI_CONDUIT_SMC:
+		arm_smccc_1_1_smc(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
+				  ARM_SMCCC_ARCH_WORKAROUND_1, &res);
+		if (res.a0)
+			return false;
+		cb = call_smc_arch_workaround_1;
+		smccc_start = __smccc_workaround_1_smc_start;
+		smccc_end = __smccc_workaround_1_smc_end;
+		break;
+
+	default:
+		return false;
+	}
+
+	install_bp_hardening_cb(entry, cb, smccc_start, smccc_end);
+
+	return true;
+}
+
 static int enable_psci_bp_hardening(void *data)
 {
 	const struct arm64_cpu_capabilities *entry = data;
 
-	if (psci_ops.get_version)
+	if (psci_ops.get_version) {
+		if (check_smccc_arch_workaround_1(entry))
+			return 0;
+
 		install_bp_hardening_cb(entry,
 				       (bp_hardening_cb_t)psci_ops.get_version,
 				       __psci_hyp_bp_inval_start,
 				       __psci_hyp_bp_inval_end);
+	}
 
 	return 0;
 }


* [PATCH 4.9 50/66] arm64: Kill PSCI_GET_VERSION as a variant-2 workaround
  2018-04-17 15:58 [PATCH 4.9 00/66] 4.9.95-stable review Greg Kroah-Hartman
                   ` (48 preceding siblings ...)
  2018-04-17 15:59 ` [PATCH 4.9 49/66] arm64: Add ARM_SMCCC_ARCH_WORKAROUND_1 BP hardening support Greg Kroah-Hartman
@ 2018-04-17 15:59 ` Greg Kroah-Hartman
  2018-04-17 15:59 ` [PATCH 4.9 51/66] sunrpc: remove incorrect HMAC request initialization Greg Kroah-Hartman
                   ` (19 subsequent siblings)
  69 siblings, 0 replies; 82+ messages in thread
From: Greg Kroah-Hartman @ 2018-04-17 15:59 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Ard Biesheuvel, Marc Zyngier,
	Catalin Marinas, Greg Hackmann, Mark Rutland

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mark Rutland <mark.rutland@arm.com>


From: Marc Zyngier <marc.zyngier@arm.com>

commit 3a0a397ff5ff8b56ca9f7908b75dee6bf0b5fabb upstream.

Now that we've standardised on SMCCC v1.1 to perform the branch
prediction invalidation, let's drop the previous band-aid.
If vendors haven't updated their firmware to do SMCCC 1.1, they
haven't updated PSCI either, so we don't lose anything.
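
As a rough sketch (not part of the patch), the SMCCC v1.1 sequence this
series standardises on looks like this, using the function IDs from
include/linux/arm-smccc.h and assuming the SMC conduit:

    struct arm_smccc_res res;

    /* ask the firmware whether it implements the workaround */
    arm_smccc_1_1_smc(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
                      ARM_SMCCC_ARCH_WORKAROUND_1, &res);
    if (res.a0 == 0)
        /* supported: one call invalidates the branch predictor */
        arm_smccc_1_1_smc(ARM_SMCCC_ARCH_WORKAROUND_1, NULL);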

Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com> [v4.9 backport]
Tested-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/arm64/kernel/bpi.S        |   24 ---------------------
 arch/arm64/kernel/cpu_errata.c |   45 +++++++++++------------------------------
 arch/arm64/kvm/hyp/switch.c    |   14 ------------
 3 files changed, 13 insertions(+), 70 deletions(-)

--- a/arch/arm64/kernel/bpi.S
+++ b/arch/arm64/kernel/bpi.S
@@ -54,30 +54,6 @@ ENTRY(__bp_harden_hyp_vecs_start)
 	vectors __kvm_hyp_vector
 	.endr
 ENTRY(__bp_harden_hyp_vecs_end)
-ENTRY(__psci_hyp_bp_inval_start)
-	sub	sp, sp, #(8 * 18)
-	stp	x16, x17, [sp, #(16 * 0)]
-	stp	x14, x15, [sp, #(16 * 1)]
-	stp	x12, x13, [sp, #(16 * 2)]
-	stp	x10, x11, [sp, #(16 * 3)]
-	stp	x8, x9, [sp, #(16 * 4)]
-	stp	x6, x7, [sp, #(16 * 5)]
-	stp	x4, x5, [sp, #(16 * 6)]
-	stp	x2, x3, [sp, #(16 * 7)]
-	stp	x0, x1, [sp, #(16 * 8)]
-	mov	x0, #0x84000000
-	smc	#0
-	ldp	x16, x17, [sp, #(16 * 0)]
-	ldp	x14, x15, [sp, #(16 * 1)]
-	ldp	x12, x13, [sp, #(16 * 2)]
-	ldp	x10, x11, [sp, #(16 * 3)]
-	ldp	x8, x9, [sp, #(16 * 4)]
-	ldp	x6, x7, [sp, #(16 * 5)]
-	ldp	x4, x5, [sp, #(16 * 6)]
-	ldp	x2, x3, [sp, #(16 * 7)]
-	ldp	x0, x1, [sp, #(16 * 8)]
-	add	sp, sp, #(8 * 18)
-ENTRY(__psci_hyp_bp_inval_end)
 
 .macro smccc_workaround_1 inst
 	sub	sp, sp, #(8 * 4)
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -53,7 +53,6 @@ static int cpu_enable_trap_ctr_access(vo
 DEFINE_PER_CPU_READ_MOSTLY(struct bp_hardening_data, bp_hardening_data);
 
 #ifdef CONFIG_KVM
-extern char __psci_hyp_bp_inval_start[], __psci_hyp_bp_inval_end[];
 extern char __smccc_workaround_1_smc_start[];
 extern char __smccc_workaround_1_smc_end[];
 extern char __smccc_workaround_1_hvc_start[];
@@ -100,8 +99,6 @@ static void __install_bp_hardening_cb(bp
 	spin_unlock(&bp_lock);
 }
 #else
-#define __psci_hyp_bp_inval_start		NULL
-#define __psci_hyp_bp_inval_end			NULL
 #define __smccc_workaround_1_smc_start		NULL
 #define __smccc_workaround_1_smc_end		NULL
 #define __smccc_workaround_1_hvc_start		NULL
@@ -146,24 +143,25 @@ static void call_hvc_arch_workaround_1(v
 	arm_smccc_1_1_hvc(ARM_SMCCC_ARCH_WORKAROUND_1, NULL);
 }
 
-static bool check_smccc_arch_workaround_1(const struct arm64_cpu_capabilities *entry)
+static int enable_smccc_arch_workaround_1(void *data)
 {
+	const struct arm64_cpu_capabilities *entry = data;
 	bp_hardening_cb_t cb;
 	void *smccc_start, *smccc_end;
 	struct arm_smccc_res res;
 
 	if (!entry->matches(entry, SCOPE_LOCAL_CPU))
-		return false;
+		return 0;
 
 	if (psci_ops.smccc_version == SMCCC_VERSION_1_0)
-		return false;
+		return 0;
 
 	switch (psci_ops.conduit) {
 	case PSCI_CONDUIT_HVC:
 		arm_smccc_1_1_hvc(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
 				  ARM_SMCCC_ARCH_WORKAROUND_1, &res);
 		if (res.a0)
-			return false;
+			return 0;
 		cb = call_hvc_arch_workaround_1;
 		smccc_start = __smccc_workaround_1_hvc_start;
 		smccc_end = __smccc_workaround_1_hvc_end;
@@ -173,35 +171,18 @@ static bool check_smccc_arch_workaround_
 		arm_smccc_1_1_smc(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
 				  ARM_SMCCC_ARCH_WORKAROUND_1, &res);
 		if (res.a0)
-			return false;
+			return 0;
 		cb = call_smc_arch_workaround_1;
 		smccc_start = __smccc_workaround_1_smc_start;
 		smccc_end = __smccc_workaround_1_smc_end;
 		break;
 
 	default:
-		return false;
+		return 0;
 	}
 
 	install_bp_hardening_cb(entry, cb, smccc_start, smccc_end);
 
-	return true;
-}
-
-static int enable_psci_bp_hardening(void *data)
-{
-	const struct arm64_cpu_capabilities *entry = data;
-
-	if (psci_ops.get_version) {
-		if (check_smccc_arch_workaround_1(entry))
-			return 0;
-
-		install_bp_hardening_cb(entry,
-				       (bp_hardening_cb_t)psci_ops.get_version,
-				       __psci_hyp_bp_inval_start,
-				       __psci_hyp_bp_inval_end);
-	}
-
 	return 0;
 }
 #endif	/* CONFIG_HARDEN_BRANCH_PREDICTOR */
@@ -301,32 +282,32 @@ const struct arm64_cpu_capabilities arm6
 	{
 		.capability = ARM64_HARDEN_BRANCH_PREDICTOR,
 		MIDR_ALL_VERSIONS(MIDR_CORTEX_A57),
-		.enable = enable_psci_bp_hardening,
+		.enable = enable_smccc_arch_workaround_1,
 	},
 	{
 		.capability = ARM64_HARDEN_BRANCH_PREDICTOR,
 		MIDR_ALL_VERSIONS(MIDR_CORTEX_A72),
-		.enable = enable_psci_bp_hardening,
+		.enable = enable_smccc_arch_workaround_1,
 	},
 	{
 		.capability = ARM64_HARDEN_BRANCH_PREDICTOR,
 		MIDR_ALL_VERSIONS(MIDR_CORTEX_A73),
-		.enable = enable_psci_bp_hardening,
+		.enable = enable_smccc_arch_workaround_1,
 	},
 	{
 		.capability = ARM64_HARDEN_BRANCH_PREDICTOR,
 		MIDR_ALL_VERSIONS(MIDR_CORTEX_A75),
-		.enable = enable_psci_bp_hardening,
+		.enable = enable_smccc_arch_workaround_1,
 	},
 	{
 		.capability = ARM64_HARDEN_BRANCH_PREDICTOR,
 		MIDR_ALL_VERSIONS(MIDR_BRCM_VULCAN),
-		.enable = enable_psci_bp_hardening,
+		.enable = enable_smccc_arch_workaround_1,
 	},
 	{
 		.capability = ARM64_HARDEN_BRANCH_PREDICTOR,
 		MIDR_ALL_VERSIONS(MIDR_CAVIUM_THUNDERX2),
-		.enable = enable_psci_bp_hardening,
+		.enable = enable_smccc_arch_workaround_1,
 	},
 #endif
 	{
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -311,20 +311,6 @@ again:
 	if (exit_code == ARM_EXCEPTION_TRAP && !__populate_fault_info(vcpu))
 		goto again;
 
-	if (exit_code == ARM_EXCEPTION_TRAP &&
-	    (kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_HVC64 ||
-	     kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_HVC32)) {
-		u32 val = vcpu_get_reg(vcpu, 0);
-
-		if (val == PSCI_0_2_FN_PSCI_VERSION) {
-			val = kvm_psci_version(vcpu, kern_hyp_va(vcpu->kvm));
-			if (unlikely(val == KVM_ARM_PSCI_0_1))
-				val = PSCI_RET_NOT_SUPPORTED;
-			vcpu_set_reg(vcpu, 0, val);
-			goto again;
-		}
-	}
-
 	if (static_branch_unlikely(&vgic_v2_cpuif_trap) &&
 	    exit_code == ARM_EXCEPTION_TRAP) {
 		bool valid;


* [PATCH 4.9 51/66] sunrpc: remove incorrect HMAC request initialization
  2018-04-17 15:58 [PATCH 4.9 00/66] 4.9.95-stable review Greg Kroah-Hartman
                   ` (49 preceding siblings ...)
  2018-04-17 15:59 ` [PATCH 4.9 50/66] arm64: Kill PSCI_GET_VERSION as a variant-2 workaround Greg Kroah-Hartman
@ 2018-04-17 15:59 ` Greg Kroah-Hartman
  2018-04-17 15:59 ` [PATCH 4.9 52/66] Revert "perf tests: Decompress kernel module before objdump" Greg Kroah-Hartman
                   ` (18 subsequent siblings)
  69 siblings, 0 replies; 82+ messages in thread
From: Greg Kroah-Hartman @ 2018-04-17 15:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Michael Young, Eric Biggers, J. Bruce Fields

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Biggers <ebiggers@google.com>

commit f3aefb6a7066e24bfea7fcf1b07907576de69d63 upstream.

make_checksum_hmac_md5() is allocating an HMAC transform and doing
crypto API calls in the following order:

    crypto_ahash_init()
    crypto_ahash_setkey()
    crypto_ahash_digest()

This is wrong because it makes no sense to init() the request before a
key has been set, given that the initial state depends on the key.  And
digest() is short for init() + update() + final(), so in this case
there's no need to explicitly call init() at all.
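
As an illustrative sketch only (names as in make_checksum_hmac_md5(),
error handling elided), the correct ordering is simply:

    err = crypto_ahash_setkey(hmac_md5, cksumkey, kctx->gk5e->keylength);
    err = crypto_ahash_digest(req);  /* digest() = init() + update() + final() */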

Before commit 9fa68f620041 ("crypto: hash - prevent using keyed hashes
without setting key") the extra init() had no real effect, at least for
the software HMAC implementation.  (There are also hardware drivers that
implement HMAC-MD5, and it's not immediately obvious how gracefully they
handle init() before setkey().)  But now the crypto API detects this
incorrect initialization and returns -ENOKEY.  This is breaking NFS
mounts in some cases.

Fix it by removing the incorrect call to crypto_ahash_init().

Reported-by: Michael Young <m.a.young@durham.ac.uk>
Fixes: 9fa68f620041 ("crypto: hash - prevent using keyed hashes without setting key")
Fixes: fffdaef2eb4a ("gss_krb5: Add support for rc4-hmac encryption")
Cc: stable@vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 net/sunrpc/auth_gss/gss_krb5_crypto.c |    3 ---
 1 file changed, 3 deletions(-)

--- a/net/sunrpc/auth_gss/gss_krb5_crypto.c
+++ b/net/sunrpc/auth_gss/gss_krb5_crypto.c
@@ -237,9 +237,6 @@ make_checksum_hmac_md5(struct krb5_ctx *
 
 	ahash_request_set_callback(req, CRYPTO_TFM_REQ_MAY_SLEEP, NULL, NULL);
 
-	err = crypto_ahash_init(req);
-	if (err)
-		goto out;
 	err = crypto_ahash_setkey(hmac_md5, cksumkey, kctx->gk5e->keylength);
 	if (err)
 		goto out;


* [PATCH 4.9 52/66] Revert "perf tests: Decompress kernel module before objdump"
  2018-04-17 15:58 [PATCH 4.9 00/66] 4.9.95-stable review Greg Kroah-Hartman
                   ` (50 preceding siblings ...)
  2018-04-17 15:59 ` [PATCH 4.9 51/66] sunrpc: remove incorrect HMAC request initialization Greg Kroah-Hartman
@ 2018-04-17 15:59 ` Greg Kroah-Hartman
  2018-09-05 18:50   ` Florian Fainelli
  2018-04-17 15:59 ` [PATCH 4.9 53/66] block/loop: fix deadlock after loop_set_status Greg Kroah-Hartman
                   ` (17 subsequent siblings)
  69 siblings, 1 reply; 82+ messages in thread
From: Greg Kroah-Hartman @ 2018-04-17 15:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Pavlos Parissis, Lei Chen,
	Maxime Hadjinlian, Namhyung Kim, Adrian Hunter, Jiri Olsa,
	David Ahern, Peter Zijlstra, Wang Nan, kernel-team,
	Arnaldo Carvalho de Melo, Sasha Levin

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

This reverts commit 7525a238be8f46617cdda29d1be5b85ffe3b042d which is
commit 94df1040b1e6aacd8dec0ba3c61d7e77cd695f26 upstream.

It breaks the build of perf on 4.9.y, so I'm dropping it.

Reported-by: Pavlos Parissis <pavlos.parissis@gmail.com>
Reported-by: Lei Chen <chenl.lei@gmail.com>
Reported-by: Maxime Hadjinlian <maxime.hadjinlian@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: kernel-team@lge.com
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 tools/perf/tests/code-reading.c |   20 +-------------------
 1 file changed, 1 insertion(+), 19 deletions(-)

--- a/tools/perf/tests/code-reading.c
+++ b/tools/perf/tests/code-reading.c
@@ -224,8 +224,6 @@ static int read_object_code(u64 addr, si
 	unsigned char buf2[BUFSZ];
 	size_t ret_len;
 	u64 objdump_addr;
-	const char *objdump_name;
-	char decomp_name[KMOD_DECOMP_LEN];
 	int ret;
 
 	pr_debug("Reading object code for memory address: %#"PRIx64"\n", addr);
@@ -286,25 +284,9 @@ static int read_object_code(u64 addr, si
 		state->done[state->done_cnt++] = al.map->start;
 	}
 
-	objdump_name = al.map->dso->long_name;
-	if (dso__needs_decompress(al.map->dso)) {
-		if (dso__decompress_kmodule_path(al.map->dso, objdump_name,
-						 decomp_name,
-						 sizeof(decomp_name)) < 0) {
-			pr_debug("decompression failed\n");
-			return -1;
-		}
-
-		objdump_name = decomp_name;
-	}
-
 	/* Read the object code using objdump */
 	objdump_addr = map__rip_2objdump(al.map, al.addr);
-	ret = read_via_objdump(objdump_name, objdump_addr, buf2, len);
-
-	if (dso__needs_decompress(al.map->dso))
-		unlink(objdump_name);
-
+	ret = read_via_objdump(al.map->dso->long_name, objdump_addr, buf2, len);
 	if (ret > 0) {
 		/*
 		 * The kernel maps are inaccurate - assume objdump is right in


* [PATCH 4.9 53/66] block/loop: fix deadlock after loop_set_status
  2018-04-17 15:58 [PATCH 4.9 00/66] 4.9.95-stable review Greg Kroah-Hartman
                   ` (51 preceding siblings ...)
  2018-04-17 15:59 ` [PATCH 4.9 52/66] Revert "perf tests: Decompress kernel module before objdump" Greg Kroah-Hartman
@ 2018-04-17 15:59 ` Greg Kroah-Hartman
  2018-04-17 15:59 ` [PATCH 4.9 54/66] nfit: fix region registration vs block-data-window ranges Greg Kroah-Hartman
                   ` (16 subsequent siblings)
  69 siblings, 0 replies; 82+ messages in thread
From: Greg Kroah-Hartman @ 2018-04-17 15:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Tetsuo Handa, syzbot, Ming Lei,
	Dmitry Vyukov, Jens Axboe, Jens Axboe

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>

commit 1e047eaab3bb5564f25b41e9cd3a053009f4e789 upstream.

syzbot is reporting deadlocks at __blkdev_get() [1].

----------------------------------------
[   92.493919] systemd-udevd   D12696   525      1 0x00000000
[   92.495891] Call Trace:
[   92.501560]  schedule+0x23/0x80
[   92.502923]  schedule_preempt_disabled+0x5/0x10
[   92.504645]  __mutex_lock+0x416/0x9e0
[   92.510760]  __blkdev_get+0x73/0x4f0
[   92.512220]  blkdev_get+0x12e/0x390
[   92.518151]  do_dentry_open+0x1c3/0x2f0
[   92.519815]  path_openat+0x5d9/0xdc0
[   92.521437]  do_filp_open+0x7d/0xf0
[   92.527365]  do_sys_open+0x1b8/0x250
[   92.528831]  do_syscall_64+0x6e/0x270
[   92.530341]  entry_SYSCALL_64_after_hwframe+0x42/0xb7

[   92.931922] 1 lock held by systemd-udevd/525:
[   92.933642]  #0: 00000000a2849e25 (&bdev->bd_mutex){+.+.}, at: __blkdev_get+0x73/0x4f0
----------------------------------------

The reason for the deadlock turned out to be that wait_event_interruptible()
in blk_queue_enter() got stuck, with bdev->bd_mutex held at __blkdev_put(),
due to q->mq_freeze_depth == 1.

----------------------------------------
[   92.787172] a.out           S12584   634    633 0x80000002
[   92.789120] Call Trace:
[   92.796693]  schedule+0x23/0x80
[   92.797994]  blk_queue_enter+0x3cb/0x540
[   92.803272]  generic_make_request+0xf0/0x3d0
[   92.807970]  submit_bio+0x67/0x130
[   92.810928]  submit_bh_wbc+0x15e/0x190
[   92.812461]  __block_write_full_page+0x218/0x460
[   92.815792]  __writepage+0x11/0x50
[   92.817209]  write_cache_pages+0x1ae/0x3d0
[   92.825585]  generic_writepages+0x5a/0x90
[   92.831865]  do_writepages+0x43/0xd0
[   92.836972]  __filemap_fdatawrite_range+0xc1/0x100
[   92.838788]  filemap_write_and_wait+0x24/0x70
[   92.840491]  __blkdev_put+0x69/0x1e0
[   92.841949]  blkdev_close+0x16/0x20
[   92.843418]  __fput+0xda/0x1f0
[   92.844740]  task_work_run+0x87/0xb0
[   92.846215]  do_exit+0x2f5/0xba0
[   92.850528]  do_group_exit+0x34/0xb0
[   92.852018]  SyS_exit_group+0xb/0x10
[   92.853449]  do_syscall_64+0x6e/0x270
[   92.854944]  entry_SYSCALL_64_after_hwframe+0x42/0xb7

[   92.943530] 1 lock held by a.out/634:
[   92.945105]  #0: 00000000a2849e25 (&bdev->bd_mutex){+.+.}, at: __blkdev_put+0x3c/0x1e0
----------------------------------------

The reason for q->mq_freeze_depth == 1 turned out to be that
loop_set_status() forgot to call blk_mq_unfreeze_queue() on the error
paths for the info->lo_encrypt_type != NULL case.

----------------------------------------
[   37.509497] CPU: 2 PID: 634 Comm: a.out Tainted: G        W        4.16.0+ #457
[   37.513608] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/19/2017
[   37.518832] RIP: 0010:blk_freeze_queue_start+0x17/0x40
[   37.521778] RSP: 0018:ffffb0c2013e7c60 EFLAGS: 00010246
[   37.524078] RAX: 0000000000000000 RBX: ffff8b07b1519798 RCX: 0000000000000000
[   37.527015] RDX: 0000000000000002 RSI: ffffb0c2013e7cc0 RDI: ffff8b07b1519798
[   37.529934] RBP: ffffb0c2013e7cc0 R08: 0000000000000008 R09: 47a189966239b898
[   37.532684] R10: dad78b99b278552f R11: 9332dca72259d5ef R12: ffff8b07acd73678
[   37.535452] R13: 0000000000004c04 R14: 0000000000000000 R15: ffff8b07b841e940
[   37.538186] FS:  00007fede33b9740(0000) GS:ffff8b07b8e80000(0000) knlGS:0000000000000000
[   37.541168] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   37.543590] CR2: 00000000206fdf18 CR3: 0000000130b30006 CR4: 00000000000606e0
[   37.546410] Call Trace:
[   37.547902]  blk_freeze_queue+0x9/0x30
[   37.549968]  loop_set_status+0x67/0x3c0 [loop]
[   37.549975]  loop_set_status64+0x3b/0x70 [loop]
[   37.549986]  lo_ioctl+0x223/0x810 [loop]
[   37.549995]  blkdev_ioctl+0x572/0x980
[   37.550003]  block_ioctl+0x34/0x40
[   37.550006]  do_vfs_ioctl+0xa7/0x6d0
[   37.550017]  ksys_ioctl+0x6b/0x80
[   37.573076]  SyS_ioctl+0x5/0x10
[   37.574831]  do_syscall_64+0x6e/0x270
[   37.576769]  entry_SYSCALL_64_after_hwframe+0x42/0xb7
----------------------------------------

[1] https://syzkaller.appspot.com/bug?id=cd662bc3f6022c0979d01a262c318fab2ee9b56f
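
For reference, a minimal sketch of the pattern the fix restores (names as
in drivers/block/loop.c; "exit" is assumed to be the label in
loop_set_status() that leads to the common unfreeze path):

    blk_mq_freeze_queue(lo->lo_queue);
    ...
    if (type >= MAX_LO_CRYPT) {
        err = -EINVAL;
        goto exit;  /* was: return -EINVAL, leaving the queue frozen */
    }
    ...
 exit:
    blk_mq_unfreeze_queue(lo->lo_queue);
    return err;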

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reported-by: syzbot <bot+48594378e9851eab70bcd6f99327c7db58c5a28a@syzkaller.appspotmail.com>
Fixes: ecdd09597a572513 ("block/loop: fix race between I/O and set_status")
Cc: Ming Lei <tom.leiming@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: stable <stable@vger.kernel.org>
Cc: Jens Axboe <axboe@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/block/loop.c |   12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -1110,11 +1110,15 @@ loop_set_status(struct loop_device *lo,
 	if (info->lo_encrypt_type) {
 		unsigned int type = info->lo_encrypt_type;
 
-		if (type >= MAX_LO_CRYPT)
-			return -EINVAL;
+		if (type >= MAX_LO_CRYPT) {
+			err = -EINVAL;
+			goto exit;
+		}
 		xfer = xfer_funcs[type];
-		if (xfer == NULL)
-			return -EINVAL;
+		if (xfer == NULL) {
+			err = -EINVAL;
+			goto exit;
+		}
 	} else
 		xfer = NULL;
 


* [PATCH 4.9 54/66] nfit: fix region registration vs block-data-window ranges
  2018-04-17 15:58 [PATCH 4.9 00/66] 4.9.95-stable review Greg Kroah-Hartman
                   ` (52 preceding siblings ...)
  2018-04-17 15:59 ` [PATCH 4.9 53/66] block/loop: fix deadlock after loop_set_status Greg Kroah-Hartman
@ 2018-04-17 15:59 ` Greg Kroah-Hartman
  2018-04-17 15:59 ` [PATCH 4.9 55/66] s390/qdio: dont retry EQBS after CCQ 96 Greg Kroah-Hartman
                   ` (15 subsequent siblings)
  69 siblings, 0 replies; 82+ messages in thread
From: Greg Kroah-Hartman @ 2018-04-17 15:59 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Dave Jiang, Dan Williams

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Dan Williams <dan.j.williams@intel.com>

commit 8d0d8ed3356aa9ed43b819aaedd39b08ca453007 upstream.

Commit 1cf03c00e7c1 "nfit: scrub and register regions in a workqueue"
mistakenly attempts to register a region per BLK aperture. There is
nothing to register for individual apertures as they belong as a set to
a BLK aperture group that is registered with a corresponding
DIMM-control-region. Filter them out during registration to prevent
needless devm_kzalloc() allocations.

Cc: <stable@vger.kernel.org>
Fixes: 1cf03c00e7c1 ("nfit: scrub and register regions in a workqueue")
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/acpi/nfit/core.c |   22 ++++++++++++++--------
 1 file changed, 14 insertions(+), 8 deletions(-)

--- a/drivers/acpi/nfit/core.c
+++ b/drivers/acpi/nfit/core.c
@@ -2547,15 +2547,21 @@ static void acpi_nfit_scrub(struct work_
 static int acpi_nfit_register_regions(struct acpi_nfit_desc *acpi_desc)
 {
 	struct nfit_spa *nfit_spa;
-	int rc;
 
-	list_for_each_entry(nfit_spa, &acpi_desc->spas, list)
-		if (nfit_spa_type(nfit_spa->spa) == NFIT_SPA_DCR) {
-			/* BLK regions don't need to wait for ars results */
-			rc = acpi_nfit_register_region(acpi_desc, nfit_spa);
-			if (rc)
-				return rc;
-		}
+	list_for_each_entry(nfit_spa, &acpi_desc->spas, list) {
+		int rc, type = nfit_spa_type(nfit_spa->spa);
+
+		/* PMEM and VMEM will be registered by the ARS workqueue */
+		if (type == NFIT_SPA_PM || type == NFIT_SPA_VOLATILE)
+			continue;
+		/* BLK apertures belong to BLK region registration below */
+		if (type == NFIT_SPA_BDW)
+			continue;
+		/* BLK regions don't need to wait for ARS results */
+		rc = acpi_nfit_register_region(acpi_desc, nfit_spa);
+		if (rc)
+			return rc;
+	}
 
 	queue_work(nfit_wq, &acpi_desc->work);
 	return 0;


* [PATCH 4.9 55/66] s390/qdio: dont retry EQBS after CCQ 96
  2018-04-17 15:58 [PATCH 4.9 00/66] 4.9.95-stable review Greg Kroah-Hartman
                   ` (53 preceding siblings ...)
  2018-04-17 15:59 ` [PATCH 4.9 54/66] nfit: fix region registration vs block-data-window ranges Greg Kroah-Hartman
@ 2018-04-17 15:59 ` Greg Kroah-Hartman
  2018-04-17 15:59 ` [PATCH 4.9 56/66] s390/qdio: dont merge ERROR output buffers Greg Kroah-Hartman
                   ` (14 subsequent siblings)
  69 siblings, 0 replies; 82+ messages in thread
From: Greg Kroah-Hartman @ 2018-04-17 15:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Julian Wiedmann, Benjamin Block,
	Martin Schwidefsky

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Julian Wiedmann <jwi@linux.vnet.ibm.com>

commit dae55b6fef58530c13df074bcc182c096609339e upstream.

Immediate retry of EQBS after CCQ 96 means that we potentially misreport
the state of buffers inspected during the first EQBS call.

This occurs when
1. the first EQBS finds all inspected buffers still in the initial state
   set by the driver (i.e. INPUT EMPTY or OUTPUT PRIMED),
2. the EQBS terminates early with CCQ 96, and
3. by the time that the second EQBS comes around, the state of those
   previously inspected buffers has changed.

If the state reported by the second EQBS is 'driver-owned', all we know
is that the previous buffers are driver-owned now as well. But we can't
tell if they all have the same state. So for instance
- the second EQBS reports OUTPUT EMPTY, but any number of the previous
  buffers could be OUTPUT ERROR by now,
- the second EQBS reports OUTPUT ERROR, but any number of the previous
  buffers could be OUTPUT EMPTY by now.

Effectively, this can result in both over- and underreporting of errors.

If the state reported by the second EQBS is 'HW-owned', that doesn't
guarantee that the previous buffers have not been switched to
driver-owned in the mean time. So for instance
- the second EQBS reports INPUT EMPTY, but any number of the previous
  buffers could be INPUT PRIMED (or INPUT ERROR) by now.

This would result in failure to process pending work on the queue. If
it's the final check before yielding initiative, this can cause
a (temporary) queue stall due to IRQ avoidance.

Fixes: 25f269f17316 ("[S390] qdio: EQBS retry after CCQ 96")
Cc: <stable@vger.kernel.org> #v3.2+
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/s390/cio/qdio_main.c |   11 ++---------
 1 file changed, 2 insertions(+), 9 deletions(-)

--- a/drivers/s390/cio/qdio_main.c
+++ b/drivers/s390/cio/qdio_main.c
@@ -126,7 +126,7 @@ static inline int qdio_check_ccq(struct
 static int qdio_do_eqbs(struct qdio_q *q, unsigned char *state,
 			int start, int count, int auto_ack)
 {
-	int rc, tmp_count = count, tmp_start = start, nr = q->nr, retried = 0;
+	int rc, tmp_count = count, tmp_start = start, nr = q->nr;
 	unsigned int ccq = 0;
 
 	qperf_inc(q, eqbs);
@@ -149,14 +149,7 @@ again:
 		qperf_inc(q, eqbs_partial);
 		DBF_DEV_EVENT(DBF_WARN, q->irq_ptr, "EQBS part:%02x",
 			tmp_count);
-		/*
-		 * Retry once, if that fails bail out and process the
-		 * extracted buffers before trying again.
-		 */
-		if (!retried++)
-			goto again;
-		else
-			return count - tmp_count;
+		return count - tmp_count;
 	}
 
 	DBF_ERROR("%4x EQBS ERROR", SCH_NO(q));


* [PATCH 4.9 56/66] s390/qdio: dont merge ERROR output buffers
  2018-04-17 15:58 [PATCH 4.9 00/66] 4.9.95-stable review Greg Kroah-Hartman
                   ` (54 preceding siblings ...)
  2018-04-17 15:59 ` [PATCH 4.9 55/66] s390/qdio: dont retry EQBS after CCQ 96 Greg Kroah-Hartman
@ 2018-04-17 15:59 ` Greg Kroah-Hartman
  2018-04-17 15:59 ` [PATCH 4.9 57/66] s390/ipl: ensure loadparm valid flag is set Greg Kroah-Hartman
                   ` (13 subsequent siblings)
  69 siblings, 0 replies; 82+ messages in thread
From: Greg Kroah-Hartman @ 2018-04-17 15:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Julian Wiedmann, Ursula Braun,
	Benjamin Block, Martin Schwidefsky

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Julian Wiedmann <jwi@linux.vnet.ibm.com>

commit 0cf1e05157b9e5530dcc3ca9fec9bf617fc93375 upstream.

On an Output queue, both EMPTY and PENDING buffer states imply that the
buffer is ready for completion-processing by the upper-layer drivers.

So for a non-QEBSM Output queue, get_buf_states() merges mixed
batches of PENDING and EMPTY buffers into one large batch of EMPTY
buffers. The upper-layer driver (i.e. qeth) later distinguishes PENDING
from EMPTY by inspecting the slsb_state for
QDIO_OUTBUF_STATE_FLAG_PENDING.

But the merge logic in get_buf_states() contains a bug that causes us to
erroneously also merge ERROR buffers into such a batch of EMPTY buffers
(ERROR is 0xaf, EMPTY is 0xa1; so ERROR & EMPTY == EMPTY).
Effectively, most outbound ERROR buffers are currently discarded
silently and processed as if they had succeeded.

Note that this affects _all_ non-QEBSM device types, not just IQD with CQ.

Fix it by explicitly spelling out the exact conditions for merging.
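
The arithmetic is easy to check (state values as quoted above; the macro
names are assumed from asm/qdio.h):

    SLSB_P_OUTPUT_EMPTY = 0xa1   /* 1010 0001 */
    SLSB_P_OUTPUT_ERROR = 0xaf   /* 1010 1111 */

    /* old merge test: (q->slsb.val[bufnr] & __state) != __state */
    0xaf & 0xa1 == 0xa1   /* so an ERROR buffer "matched" an EMPTY batch */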

Extracting the "get initial state" part out of the loop relies
on the fact that get_buf_states() is never called with a count of 0. The
QEBSM path already strictly requires this, and the two callers with
variable 'count' make sure of it.

Fixes: 104ea556ee7f ("qdio: support asynchronous delivery of storage blocks")
Cc: <stable@vger.kernel.org> #v3.2+
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Reviewed-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/s390/cio/qdio_main.c |   31 ++++++++++++++++++++-----------
 1 file changed, 20 insertions(+), 11 deletions(-)

--- a/drivers/s390/cio/qdio_main.c
+++ b/drivers/s390/cio/qdio_main.c
@@ -205,7 +205,10 @@ again:
 	return 0;
 }
 
-/* returns number of examined buffers and their common state in *state */
+/*
+ * Returns number of examined buffers and their common state in *state.
+ * Requested number of buffers-to-examine must be > 0.
+ */
 static inline int get_buf_states(struct qdio_q *q, unsigned int bufnr,
 				 unsigned char *state, unsigned int count,
 				 int auto_ack, int merge_pending)
@@ -216,17 +219,23 @@ static inline int get_buf_states(struct
 	if (is_qebsm(q))
 		return qdio_do_eqbs(q, state, bufnr, count, auto_ack);
 
-	for (i = 0; i < count; i++) {
-		if (!__state) {
-			__state = q->slsb.val[bufnr];
-			if (merge_pending && __state == SLSB_P_OUTPUT_PENDING)
-				__state = SLSB_P_OUTPUT_EMPTY;
-		} else if (merge_pending) {
-			if ((q->slsb.val[bufnr] & __state) != __state)
-				break;
-		} else if (q->slsb.val[bufnr] != __state)
-			break;
+	/* get initial state: */
+	__state = q->slsb.val[bufnr];
+	if (merge_pending && __state == SLSB_P_OUTPUT_PENDING)
+		__state = SLSB_P_OUTPUT_EMPTY;
+
+	for (i = 1; i < count; i++) {
 		bufnr = next_buf(bufnr);
+
+		/* merge PENDING into EMPTY: */
+		if (merge_pending &&
+		    q->slsb.val[bufnr] == SLSB_P_OUTPUT_PENDING &&
+		    __state == SLSB_P_OUTPUT_EMPTY)
+			continue;
+
+		/* stop if next state differs from initial state: */
+		if (q->slsb.val[bufnr] != __state)
+			break;
 	}
 	*state = __state;
 	return i;


* [PATCH 4.9 57/66] s390/ipl: ensure loadparm valid flag is set
  2018-04-17 15:58 [PATCH 4.9 00/66] 4.9.95-stable review Greg Kroah-Hartman
                   ` (55 preceding siblings ...)
  2018-04-17 15:59 ` [PATCH 4.9 56/66] s390/qdio: dont merge ERROR output buffers Greg Kroah-Hartman
@ 2018-04-17 15:59 ` Greg Kroah-Hartman
  2018-04-17 15:59 ` [PATCH 4.9 58/66] getname_kernel() needs to make sure that ->name != ->iname in long case Greg Kroah-Hartman
                   ` (12 subsequent siblings)
  69 siblings, 0 replies; 82+ messages in thread
From: Greg Kroah-Hartman @ 2018-04-17 15:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Heiko Carstens, Vasily Gorbik,
	Martin Schwidefsky

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Vasily Gorbik <gor@linux.ibm.com>

commit 15deb080a6087b73089139569558965750e69d67 upstream.

When loadparm is set in reipl parm block, the kernel should also set
DIAG308_FLAGS_LP_VALID flag.

This fixes loadparm being ignored during z/VM fcp -> ccw reipl and kvm
direct boot -> ccw reipl.

Cc: <stable@vger.kernel.org>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/s390/kernel/ipl.c |    1 +
 1 file changed, 1 insertion(+)

--- a/arch/s390/kernel/ipl.c
+++ b/arch/s390/kernel/ipl.c
@@ -798,6 +798,7 @@ static ssize_t reipl_generic_loadparm_st
 	/* copy and convert to ebcdic */
 	memcpy(ipb->hdr.loadparm, buf, lp_len);
 	ASCEBC(ipb->hdr.loadparm, LOADPARM_LEN);
+	ipb->hdr.flags |= DIAG308_FLAGS_LP_VALID;
 	return len;
 }
 


* [PATCH 4.9 58/66] getname_kernel() needs to make sure that ->name != ->iname in long case
  2018-04-17 15:58 [PATCH 4.9 00/66] 4.9.95-stable review Greg Kroah-Hartman
                   ` (56 preceding siblings ...)
  2018-04-17 15:59 ` [PATCH 4.9 57/66] s390/ipl: ensure loadparm valid flag is set Greg Kroah-Hartman
@ 2018-04-17 15:59 ` Greg Kroah-Hartman
  2018-04-17 15:59 ` [PATCH 4.9 59/66] Bluetooth: Fix connection if directed advertising and privacy is used Greg Kroah-Hartman
                   ` (11 subsequent siblings)
  69 siblings, 0 replies; 82+ messages in thread
From: Greg Kroah-Hartman @ 2018-04-17 15:59 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Al Viro

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Al Viro <viro@zeniv.linux.org.uk>

commit 30ce4d1903e1d8a7ccd110860a5eef3c638ed8be upstream.

missed it in "kill struct filename.separate" several years ago.
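
For background (a sketch of the rationale, not text from the commit):
struct filename ends in a flexible iname[] array that holds short names
inline, and putname() decides whether ->name was allocated separately by
testing ->name != ->iname. With a bare kmalloc(sizeof(*tmp)), iname sits
exactly at the end of the object, so a separately allocated long name
could, by coincidence of heap layout, start at that very address.
Reserving a single byte of iname storage rules this out:

    const size_t size = offsetof(struct filename, iname[1]);

    tmp = kmalloc(size, GFP_KERNEL);  /* ->iname now points inside tmp,
                                       * so ->name (elsewhere) != ->iname */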

Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/namei.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/fs/namei.c
+++ b/fs/namei.c
@@ -221,9 +221,10 @@ getname_kernel(const char * filename)
 	if (len <= EMBEDDED_NAME_MAX) {
 		result->name = (char *)result->iname;
 	} else if (len <= PATH_MAX) {
+		const size_t size = offsetof(struct filename, iname[1]);
 		struct filename *tmp;
 
-		tmp = kmalloc(sizeof(*tmp), GFP_KERNEL);
+		tmp = kmalloc(size, GFP_KERNEL);
 		if (unlikely(!tmp)) {
 			__putname(result);
 			return ERR_PTR(-ENOMEM);


* [PATCH 4.9 59/66] Bluetooth: Fix connection if directed advertising and privacy is used
  2018-04-17 15:58 [PATCH 4.9 00/66] 4.9.95-stable review Greg Kroah-Hartman
                   ` (57 preceding siblings ...)
  2018-04-17 15:59 ` [PATCH 4.9 58/66] getname_kernel() needs to make sure that ->name != ->iname in long case Greg Kroah-Hartman
@ 2018-04-17 15:59 ` Greg Kroah-Hartman
  2018-04-17 15:59 ` [PATCH 4.9 60/66] rtl8187: Fix NULL pointer dereference in priv->conf_mutex Greg Kroah-Hartman
                   ` (10 subsequent siblings)
  69 siblings, 0 replies; 82+ messages in thread
From: Greg Kroah-Hartman @ 2018-04-17 15:59 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Szymon Janc, Marcel Holtmann

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Szymon Janc <szymon.janc@codecoup.pl>

commit 082f2300cfa1a3d9d5221c38c5eba85d4ab98bd8 upstream.

The local random address needs to be updated before creating a connection
if an RPA from an LE Direct Advertising Report was resolved in the host.
Otherwise the remote device might ignore the connection request due to an
address mismatch.

This was affecting the following qualification test cases:
GAP/CONN/SCEP/BV-03-C, GAP/CONN/GCEP/BV-05-C, GAP/CONN/DCEP/BV-05-C

Before patch:
< HCI Command: LE Set Random Address (0x08|0x0005) plen 6          #11350 [hci0] 84680.231216
        Address: 56:BC:E8:24:11:68 (Resolvable)
          Identity type: Random (0x01)
          Identity: F2:F1:06:3D:9C:42 (Static)
> HCI Event: Command Complete (0x0e) plen 4                        #11351 [hci0] 84680.246022
      LE Set Random Address (0x08|0x0005) ncmd 1
        Status: Success (0x00)
< HCI Command: LE Set Scan Parameters (0x08|0x000b) plen 7         #11352 [hci0] 84680.246417
        Type: Passive (0x00)
        Interval: 60.000 msec (0x0060)
        Window: 30.000 msec (0x0030)
        Own address type: Random (0x01)
        Filter policy: Accept all advertisement, inc. directed unresolved RPA (0x02)
> HCI Event: Command Complete (0x0e) plen 4                        #11353 [hci0] 84680.248854
      LE Set Scan Parameters (0x08|0x000b) ncmd 1
        Status: Success (0x00)
< HCI Command: LE Set Scan Enable (0x08|0x000c) plen 2             #11354 [hci0] 84680.249466
        Scanning: Enabled (0x01)
        Filter duplicates: Enabled (0x01)
> HCI Event: Command Complete (0x0e) plen 4                        #11355 [hci0] 84680.253222
      LE Set Scan Enable (0x08|0x000c) ncmd 1
        Status: Success (0x00)
> HCI Event: LE Meta Event (0x3e) plen 18                          #11356 [hci0] 84680.458387
      LE Direct Advertising Report (0x0b)
        Num reports: 1
        Event type: Connectable directed - ADV_DIRECT_IND (0x01)
        Address type: Random (0x01)
        Address: 53:38:DA:46:8C:45 (Resolvable)
          Identity type: Public (0x00)
          Identity: 11:22:33:44:55:66 (OUI 11-22-33)
        Direct address type: Random (0x01)
        Direct address: 7C:D6:76:8C:DF:82 (Resolvable)
          Identity type: Random (0x01)
          Identity: F2:F1:06:3D:9C:42 (Static)
        RSSI: -74 dBm (0xb6)
< HCI Command: LE Set Scan Enable (0x08|0x000c) plen 2             #11357 [hci0] 84680.458737
        Scanning: Disabled (0x00)
        Filter duplicates: Disabled (0x00)
> HCI Event: Command Complete (0x0e) plen 4                        #11358 [hci0] 84680.469982
      LE Set Scan Enable (0x08|0x000c) ncmd 1
        Status: Success (0x00)
< HCI Command: LE Create Connection (0x08|0x000d) plen 25          #11359 [hci0] 84680.470444
        Scan interval: 60.000 msec (0x0060)
        Scan window: 60.000 msec (0x0060)
        Filter policy: White list is not used (0x00)
        Peer address type: Random (0x01)
        Peer address: 53:38:DA:46:8C:45 (Resolvable)
          Identity type: Public (0x00)
          Identity: 11:22:33:44:55:66 (OUI 11-22-33)
        Own address type: Random (0x01)
        Min connection interval: 30.00 msec (0x0018)
        Max connection interval: 50.00 msec (0x0028)
        Connection latency: 0 (0x0000)
        Supervision timeout: 420 msec (0x002a)
        Min connection length: 0.000 msec (0x0000)
        Max connection length: 0.000 msec (0x0000)
> HCI Event: Command Status (0x0f) plen 4                          #11360 [hci0] 84680.474971
      LE Create Connection (0x08|0x000d) ncmd 1
        Status: Success (0x00)
< HCI Command: LE Create Connection Cancel (0x08|0x000e) plen 0    #11361 [hci0] 84682.545385
> HCI Event: Command Complete (0x0e) plen 4                        #11362 [hci0] 84682.551014
      LE Create Connection Cancel (0x08|0x000e) ncmd 1
        Status: Success (0x00)
> HCI Event: LE Meta Event (0x3e) plen 19                          #11363 [hci0] 84682.551074
      LE Connection Complete (0x01)
        Status: Unknown Connection Identifier (0x02)
        Handle: 0
        Role: Master (0x00)
        Peer address type: Public (0x00)
        Peer address: 00:00:00:00:00:00 (OUI 00-00-00)
        Connection interval: 0.00 msec (0x0000)
        Connection latency: 0 (0x0000)
        Supervision timeout: 0 msec (0x0000)
        Master clock accuracy: 0x00

After patch:
< HCI Command: LE Set Scan Parameters (0x08|0x000b) plen 7    #210 [hci0] 667.152459
        Type: Passive (0x00)
        Interval: 60.000 msec (0x0060)
        Window: 30.000 msec (0x0030)
        Own address type: Random (0x01)
        Filter policy: Accept all advertisement, inc. directed unresolved RPA (0x02)
> HCI Event: Command Complete (0x0e) plen 4                   #211 [hci0] 667.153613
      LE Set Scan Parameters (0x08|0x000b) ncmd 1
        Status: Success (0x00)
< HCI Command: LE Set Scan Enable (0x08|0x000c) plen 2        #212 [hci0] 667.153704
        Scanning: Enabled (0x01)
        Filter duplicates: Enabled (0x01)
> HCI Event: Command Complete (0x0e) plen 4                   #213 [hci0] 667.154584
      LE Set Scan Enable (0x08|0x000c) ncmd 1
        Status: Success (0x00)
> HCI Event: LE Meta Event (0x3e) plen 18                     #214 [hci0] 667.182619
      LE Direct Advertising Report (0x0b)
        Num reports: 1
        Event type: Connectable directed - ADV_DIRECT_IND (0x01)
        Address type: Random (0x01)
        Address: 50:52:D9:A6:48:A0 (Resolvable)
          Identity type: Public (0x00)
          Identity: 11:22:33:44:55:66 (OUI 11-22-33)
        Direct address type: Random (0x01)
        Direct address: 7C:C1:57:A5:B7:A8 (Resolvable)
          Identity type: Random (0x01)
          Identity: F4:28:73:5D:38:B0 (Static)
        RSSI: -70 dBm (0xba)
< HCI Command: LE Set Scan Enable (0x08|0x000c) plen 2       #215 [hci0] 667.182704
        Scanning: Disabled (0x00)
        Filter duplicates: Disabled (0x00)
> HCI Event: Command Complete (0x0e) plen 4                  #216 [hci0] 667.183599
      LE Set Scan Enable (0x08|0x000c) ncmd 1
        Status: Success (0x00)
< HCI Command: LE Set Random Address (0x08|0x0005) plen 6    #217 [hci0] 667.183645
        Address: 7C:C1:57:A5:B7:A8 (Resolvable)
          Identity type: Random (0x01)
          Identity: F4:28:73:5D:38:B0 (Static)
> HCI Event: Command Complete (0x0e) plen 4                  #218 [hci0] 667.184590
      LE Set Random Address (0x08|0x0005) ncmd 1
        Status: Success (0x00)
< HCI Command: LE Create Connection (0x08|0x000d) plen 25    #219 [hci0] 667.184613
        Scan interval: 60.000 msec (0x0060)
        Scan window: 60.000 msec (0x0060)
        Filter policy: White list is not used (0x00)
        Peer address type: Random (0x01)
        Peer address: 50:52:D9:A6:48:A0 (Resolvable)
          Identity type: Public (0x00)
          Identity: 11:22:33:44:55:66 (OUI 11-22-33)
        Own address type: Random (0x01)
        Min connection interval: 30.00 msec (0x0018)
        Max connection interval: 50.00 msec (0x0028)
        Connection latency: 0 (0x0000)
        Supervision timeout: 420 msec (0x002a)
        Min connection length: 0.000 msec (0x0000)
        Max connection length: 0.000 msec (0x0000)
> HCI Event: Command Status (0x0f) plen 4                    #220 [hci0] 667.186558
      LE Create Connection (0x08|0x000d) ncmd 1
        Status: Success (0x00)
> HCI Event: LE Meta Event (0x3e) plen 19                    #221 [hci0] 667.485824
      LE Connection Complete (0x01)
        Status: Success (0x00)
        Handle: 0
        Role: Master (0x00)
        Peer address type: Random (0x01)
        Peer address: 50:52:D9:A6:48:A0 (Resolvable)
          Identity type: Public (0x00)
          Identity: 11:22:33:44:55:66 (OUI 11-22-33)
        Connection interval: 50.00 msec (0x0028)
        Connection latency: 0 (0x0000)
        Supervision timeout: 420 msec (0x002a)
        Master clock accuracy: 0x07
@ MGMT Event: Device Connected (0x000b) plen 13          {0x0002} [hci0] 667.485996
        LE Address: 11:22:33:44:55:66 (OUI 11-22-33)
        Flags: 0x00000000
        Data length: 0

Signed-off-by: Szymon Janc <szymon.janc@codecoup.pl>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 include/net/bluetooth/hci_core.h |    2 +-
 net/bluetooth/hci_conn.c         |   29 +++++++++++++++++++++--------
 net/bluetooth/hci_event.c        |   15 +++++++++++----
 net/bluetooth/l2cap_core.c       |    2 +-
 4 files changed, 34 insertions(+), 14 deletions(-)

--- a/include/net/bluetooth/hci_core.h
+++ b/include/net/bluetooth/hci_core.h
@@ -893,7 +893,7 @@ struct hci_conn *hci_connect_le_scan(str
 				     u16 conn_timeout);
 struct hci_conn *hci_connect_le(struct hci_dev *hdev, bdaddr_t *dst,
 				u8 dst_type, u8 sec_level, u16 conn_timeout,
-				u8 role);
+				u8 role, bdaddr_t *direct_rpa);
 struct hci_conn *hci_connect_acl(struct hci_dev *hdev, bdaddr_t *dst,
 				 u8 sec_level, u8 auth_type);
 struct hci_conn *hci_connect_sco(struct hci_dev *hdev, int type, bdaddr_t *dst,
--- a/net/bluetooth/hci_conn.c
+++ b/net/bluetooth/hci_conn.c
@@ -749,18 +749,31 @@ static bool conn_use_rpa(struct hci_conn
 }
 
 static void hci_req_add_le_create_conn(struct hci_request *req,
-				       struct hci_conn *conn)
+				       struct hci_conn *conn,
+				       bdaddr_t *direct_rpa)
 {
 	struct hci_cp_le_create_conn cp;
 	struct hci_dev *hdev = conn->hdev;
 	u8 own_addr_type;
 
-	/* Update random address, but set require_privacy to false so
-	 * that we never connect with an non-resolvable address.
+	/* If direct address was provided we use it instead of current
+	 * address.
 	 */
-	if (hci_update_random_address(req, false, conn_use_rpa(conn),
-				      &own_addr_type))
-		return;
+	if (direct_rpa) {
+		if (bacmp(&req->hdev->random_addr, direct_rpa))
+			hci_req_add(req, HCI_OP_LE_SET_RANDOM_ADDR, 6,
+								direct_rpa);
+
+		/* direct address is always RPA */
+		own_addr_type = ADDR_LE_DEV_RANDOM;
+	} else {
+		/* Update random address, but set require_privacy to false so
+		 * that we never connect with an non-resolvable address.
+		 */
+		if (hci_update_random_address(req, false, conn_use_rpa(conn),
+					      &own_addr_type))
+			return;
+	}
 
 	memset(&cp, 0, sizeof(cp));
 
@@ -825,7 +838,7 @@ static void hci_req_directed_advertising
 
 struct hci_conn *hci_connect_le(struct hci_dev *hdev, bdaddr_t *dst,
 				u8 dst_type, u8 sec_level, u16 conn_timeout,
-				u8 role)
+				u8 role, bdaddr_t *direct_rpa)
 {
 	struct hci_conn_params *params;
 	struct hci_conn *conn;
@@ -940,7 +953,7 @@ struct hci_conn *hci_connect_le(struct h
 		hci_dev_set_flag(hdev, HCI_LE_SCAN_INTERRUPTED);
 	}
 
-	hci_req_add_le_create_conn(&req, conn);
+	hci_req_add_le_create_conn(&req, conn, direct_rpa);
 
 create_conn:
 	err = hci_req_run(&req, create_le_conn_complete);
--- a/net/bluetooth/hci_event.c
+++ b/net/bluetooth/hci_event.c
@@ -4646,7 +4646,8 @@ static void hci_le_conn_update_complete_
 /* This function requires the caller holds hdev->lock */
 static struct hci_conn *check_pending_le_conn(struct hci_dev *hdev,
 					      bdaddr_t *addr,
-					      u8 addr_type, u8 adv_type)
+					      u8 addr_type, u8 adv_type,
+					      bdaddr_t *direct_rpa)
 {
 	struct hci_conn *conn;
 	struct hci_conn_params *params;
@@ -4697,7 +4698,8 @@ static struct hci_conn *check_pending_le
 	}
 
 	conn = hci_connect_le(hdev, addr, addr_type, BT_SECURITY_LOW,
-			      HCI_LE_AUTOCONN_TIMEOUT, HCI_ROLE_MASTER);
+			      HCI_LE_AUTOCONN_TIMEOUT, HCI_ROLE_MASTER,
+			      direct_rpa);
 	if (!IS_ERR(conn)) {
 		/* If HCI_AUTO_CONN_EXPLICIT is set, conn is already owned
 		 * by higher layer that tried to connect, if no then
@@ -4807,8 +4809,13 @@ static void process_adv_report(struct hc
 		bdaddr_type = irk->addr_type;
 	}
 
-	/* Check if we have been requested to connect to this device */
-	conn = check_pending_le_conn(hdev, bdaddr, bdaddr_type, type);
+	/* Check if we have been requested to connect to this device.
+	 *
+	 * direct_addr is set only for directed advertising reports (it is NULL
+	 * for advertising reports) and is already verified to be RPA above.
+	 */
+	conn = check_pending_le_conn(hdev, bdaddr, bdaddr_type, type,
+								direct_addr);
 	if (conn && type == LE_ADV_IND) {
 		/* Store report for later inclusion by
 		 * mgmt_device_connected
--- a/net/bluetooth/l2cap_core.c
+++ b/net/bluetooth/l2cap_core.c
@@ -7148,7 +7148,7 @@ int l2cap_chan_connect(struct l2cap_chan
 			hcon = hci_connect_le(hdev, dst, dst_type,
 					      chan->sec_level,
 					      HCI_LE_CONN_TIMEOUT,
-					      HCI_ROLE_SLAVE);
+					      HCI_ROLE_SLAVE, NULL);
 		else
 			hcon = hci_connect_le_scan(hdev, dst, dst_type,
 						   chan->sec_level,


* [PATCH 4.9 60/66] rtl8187: Fix NULL pointer dereference in priv->conf_mutex
  2018-04-17 15:58 [PATCH 4.9 00/66] 4.9.95-stable review Greg Kroah-Hartman
                   ` (58 preceding siblings ...)
  2018-04-17 15:59 ` [PATCH 4.9 59/66] Bluetooth: Fix connection if directed advertising and privacy is used Greg Kroah-Hartman
@ 2018-04-17 15:59 ` Greg Kroah-Hartman
  2018-04-17 15:59 ` [PATCH 4.9 61/66] hwmon: (ina2xx) Fix access to uninitialized mutex Greg Kroah-Hartman
                   ` (9 subsequent siblings)
  69 siblings, 0 replies; 82+ messages in thread
From: Greg Kroah-Hartman @ 2018-04-17 15:59 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Sudhir Sreedharan, Kalle Valo

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Sudhir Sreedharan <ssreedharan@mvista.com>

commit 7972326a26b5bf8dc2adac575c4e03ee7e9d193a upstream.

This can be reproduced by binding/unbinding the driver multiple times
on an AM3517 board.

Analysis revealed that rtl8187_start() was invoked before probe
finished (i.e. before the mutex was initialized).

 INFO: trying to register non-static key.
 the code is fine but needs lockdep annotation.
 turning off the locking correctness validator.
 CPU: 0 PID: 821 Comm: wpa_supplicant Not tainted 4.9.80-dirty #250
 Hardware name: Generic AM3517 (Flattened Device Tree)
 [<c010e0d8>] (unwind_backtrace) from [<c010beac>] (show_stack+0x10/0x14)
 [<c010beac>] (show_stack) from [<c017401c>] (register_lock_class+0x4f4/0x55c)
 [<c017401c>] (register_lock_class) from [<c0176fe0>] (__lock_acquire+0x74/0x1938)
 [<c0176fe0>] (__lock_acquire) from [<c0178cfc>] (lock_acquire+0xfc/0x23c)
 [<c0178cfc>] (lock_acquire) from [<c08aa2f8>] (mutex_lock_nested+0x50/0x3b0)
 [<c08aa2f8>] (mutex_lock_nested) from [<c05f5bf8>] (rtl8187_start+0x2c/0xd54)
 [<c05f5bf8>] (rtl8187_start) from [<c082dea0>] (drv_start+0xa8/0x320)
 [<c082dea0>] (drv_start) from [<c084d1d4>] (ieee80211_do_open+0x2bc/0x8e4)
 [<c084d1d4>] (ieee80211_do_open) from [<c069be94>] (__dev_open+0xb8/0x120)
 [<c069be94>] (__dev_open) from [<c069c11c>] (__dev_change_flags+0x88/0x14c)
 [<c069c11c>] (__dev_change_flags) from [<c069c1f8>] (dev_change_flags+0x18/0x48)
 [<c069c1f8>] (dev_change_flags) from [<c0710b08>] (devinet_ioctl+0x738/0x840)
 [<c0710b08>] (devinet_ioctl) from [<c067925c>] (sock_ioctl+0x164/0x2f4)
 [<c067925c>] (sock_ioctl) from [<c02883f8>] (do_vfs_ioctl+0x8c/0x9d0)
 [<c02883f8>] (do_vfs_ioctl) from [<c0288da8>] (SyS_ioctl+0x6c/0x7c)
 [<c0288da8>] (SyS_ioctl) from [<c0107760>] (ret_fast_syscall+0x0/0x1c)
 Unable to handle kernel NULL pointer dereference at virtual address 00000000
 pgd = cd1ec000
 [00000000] *pgd=8d1de831, *pte=00000000, *ppte=00000000
 Internal error: Oops: 817 [#1] PREEMPT ARM
 Modules linked in:
 CPU: 0 PID: 821 Comm: wpa_supplicant Not tainted 4.9.80-dirty #250
 Hardware name: Generic AM3517 (Flattened Device Tree)
 task: ce73eec0 task.stack: cd1ea000
 PC is at mutex_lock_nested+0xe8/0x3b0
 LR is at mutex_lock_nested+0xd0/0x3b0
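
The fix (sketched; names as in rtl8187_probe()) is to initialise the lock
before anything that can expose the device to mac80211 callbacks:

    mutex_init(&priv->io_mutex);
    mutex_init(&priv->conf_mutex);  /* before ieee80211_register_hw(),
                                     * after which rtl8187_start() may run */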

Cc: stable@vger.kernel.org
Signed-off-by: Sudhir Sreedharan <ssreedharan@mvista.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/net/wireless/realtek/rtl818x/rtl8187/dev.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/net/wireless/realtek/rtl818x/rtl8187/dev.c
+++ b/drivers/net/wireless/realtek/rtl818x/rtl8187/dev.c
@@ -1454,6 +1454,7 @@ static int rtl8187_probe(struct usb_inte
 		goto err_free_dev;
 	}
 	mutex_init(&priv->io_mutex);
+	mutex_init(&priv->conf_mutex);
 
 	SET_IEEE80211_DEV(dev, &intf->dev);
 	usb_set_intfdata(intf, dev);
@@ -1627,7 +1628,6 @@ static int rtl8187_probe(struct usb_inte
 		printk(KERN_ERR "rtl8187: Cannot register device\n");
 		goto err_free_dmabuf;
 	}
-	mutex_init(&priv->conf_mutex);
 	skb_queue_head_init(&priv->b_tx_status.queue);
 
 	wiphy_info(dev->wiphy, "hwaddr %pM, %s V%d + %s, rfkill mask %d\n",


* [PATCH 4.9 61/66] hwmon: (ina2xx) Fix access to uninitialized mutex
  2018-04-17 15:58 [PATCH 4.9 00/66] 4.9.95-stable review Greg Kroah-Hartman
                   ` (59 preceding siblings ...)
  2018-04-17 15:59 ` [PATCH 4.9 60/66] rtl8187: Fix NULL pointer dereference in priv->conf_mutex Greg Kroah-Hartman
@ 2018-04-17 15:59 ` Greg Kroah-Hartman
  2018-04-17 15:59 ` [PATCH 4.9 62/66] cdc_ether: flag the Cinterion AHS8 modem by gemalto as WWAN Greg Kroah-Hartman
                   ` (8 subsequent siblings)
  69 siblings, 0 replies; 82+ messages in thread
From: Greg Kroah-Hartman @ 2018-04-17 15:59 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Marek Szyprowski, Guenter Roeck

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Marek Szyprowski <m.szyprowski@samsung.com>

commit 0c4c5860e9983eb3da7a3d73ca987643c3ed034b upstream.

Initialize data->config_lock mutex before it is used by the driver code.

This fixes the following warning on Odroid XU3 boards:

INFO: trying to register non-static key.
the code is fine but needs lockdep annotation.
turning off the locking correctness validator.
CPU: 5 PID: 1 Comm: swapper/0 Not tainted 4.15.0-rc7-next-20180115-00001-gb75575dee3f2 #107
Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
[<c0111504>] (unwind_backtrace) from [<c010dbec>] (show_stack+0x10/0x14)
[<c010dbec>] (show_stack) from [<c09b3f74>] (dump_stack+0x90/0xc8)
[<c09b3f74>] (dump_stack) from [<c0179528>] (register_lock_class+0x1c0/0x59c)
[<c0179528>] (register_lock_class) from [<c017bd1c>] (__lock_acquire+0x78/0x1850)
[<c017bd1c>] (__lock_acquire) from [<c017de30>] (lock_acquire+0xc8/0x2b8)
[<c017de30>] (lock_acquire) from [<c09ca59c>] (__mutex_lock+0x60/0xa0c)
[<c09ca59c>] (__mutex_lock) from [<c09cafd0>] (mutex_lock_nested+0x1c/0x24)
[<c09cafd0>] (mutex_lock_nested) from [<c068b0d0>] (ina2xx_set_shunt+0x70/0xb0)
[<c068b0d0>] (ina2xx_set_shunt) from [<c068b218>] (ina2xx_probe+0x88/0x1b0)
[<c068b218>] (ina2xx_probe) from [<c0673d90>] (i2c_device_probe+0x1e0/0x2d0)
[<c0673d90>] (i2c_device_probe) from [<c053a268>] (driver_probe_device+0x2b8/0x4a0)
[<c053a268>] (driver_probe_device) from [<c053a54c>] (__driver_attach+0xfc/0x120)
[<c053a54c>] (__driver_attach) from [<c05384cc>] (bus_for_each_dev+0x58/0x7c)
[<c05384cc>] (bus_for_each_dev) from [<c0539590>] (bus_add_driver+0x174/0x250)
[<c0539590>] (bus_add_driver) from [<c053b5e0>] (driver_register+0x78/0xf4)
[<c053b5e0>] (driver_register) from [<c0675ef0>] (i2c_register_driver+0x38/0xa8)
[<c0675ef0>] (i2c_register_driver) from [<c0102b40>] (do_one_initcall+0x48/0x18c)
[<c0102b40>] (do_one_initcall) from [<c0e00df0>] (kernel_init_freeable+0x110/0x1d4)
[<c0e00df0>] (kernel_init_freeable) from [<c09c8120>] (kernel_init+0x8/0x114)
[<c09c8120>] (kernel_init) from [<c01010b4>] (ret_from_fork+0x14/0x20)
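
What lockdep is flagging can be sketched in plain C (illustrative
only; the magic field stands in for lockdep's real lock-class key and
fake_mutex is not the kernel's struct mutex):

#include <assert.h>
#include <stdio.h>

#define LOCK_MAGIC 0x1badcafe

struct fake_mutex {
	unsigned int magic;	/* set only by fake_mutex_init() */
};

static void fake_mutex_init(struct fake_mutex *m)
{
	m->magic = LOCK_MAGIC;
}

static void fake_mutex_lock(struct fake_mutex *m)
{
	/* a zeroed, never-initialized mutex trips this check */
	assert(m->magic == LOCK_MAGIC);
}

struct ina2xx_data {
	struct fake_mutex config_lock;
	int shunt;
};

static void set_shunt(struct ina2xx_data *d, int val)
{
	fake_mutex_lock(&d->config_lock);	/* needs an initialized lock */
	d->shunt = val;
}

int main(void)
{
	struct ina2xx_data data = { 0 };

	fake_mutex_init(&data.config_lock);	/* must come first in probe */
	set_shunt(&data, 10000);
	printf("shunt set to %d\n", data.shunt);
	return 0;
}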

Fixes: 5d389b125186 ("hwmon: (ina2xx) Make calibration register value fixed")
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
[backport to v4.4.y/v4.9.y: context changes]
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/hwmon/ina2xx.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

--- a/drivers/hwmon/ina2xx.c
+++ b/drivers/hwmon/ina2xx.c
@@ -447,6 +447,7 @@ static int ina2xx_probe(struct i2c_clien
 
 	/* set the device type */
 	data->config = &ina2xx_config[id->driver_data];
+	mutex_init(&data->config_lock);
 
 	if (of_property_read_u32(dev->of_node, "shunt-resistor", &val) < 0) {
 		struct ina2xx_platform_data *pdata = dev_get_platdata(dev);
@@ -473,8 +474,6 @@ static int ina2xx_probe(struct i2c_clien
 		return -ENODEV;
 	}
 
-	mutex_init(&data->config_lock);
-
 	data->groups[group++] = &ina2xx_group;
 	if (id->driver_data == ina226)
 		data->groups[group++] = &ina226_group;

^ permalink raw reply	[flat|nested] 82+ messages in thread

* [PATCH 4.9 62/66] cdc_ether: flag the Cinterion AHS8 modem by gemalto as WWAN
  2018-04-17 15:58 [PATCH 4.9 00/66] 4.9.95-stable review Greg Kroah-Hartman
                   ` (60 preceding siblings ...)
  2018-04-17 15:59 ` [PATCH 4.9 61/66] hwmon: (ina2xx) Fix access to uninitialized mutex Greg Kroah-Hartman
@ 2018-04-17 15:59 ` Greg Kroah-Hartman
  2018-04-17 15:59 ` [PATCH 4.9 63/66] rds: MP-RDS may use an invalid c_path Greg Kroah-Hartman
                   ` (7 subsequent siblings)
  69 siblings, 0 replies; 82+ messages in thread
From: Greg Kroah-Hartman @ 2018-04-17 15:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Bassem Boubaker, Oliver Neukum,
	David S. Miller

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Bassem Boubaker <bassem.boubaker@actia.fr>


[ Upstream commit 53765341ee821c0a0f1dec41adc89c9096ad694c ]

The Cinterion AHS8 is a 3G device with one embedded WWAN interface
using cdc_ether as a driver.

The modem is controlled via AT commands through the exposed TTYs.

The AT+CGDCONT write command can be used to activate or deactivate a
WWAN connection for a PDP context defined with the same command. The
UE supports one WWAN adapter.

Signed-off-by: Bassem Boubaker <bassem.boubaker@actia.fr>
Acked-by: Oliver Neukum <oneukum@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/usb/cdc_ether.c |    6 ++++++
 1 file changed, 6 insertions(+)

--- a/drivers/net/usb/cdc_ether.c
+++ b/drivers/net/usb/cdc_ether.c
@@ -774,6 +774,12 @@ static const struct usb_device_id	produc
 				      USB_CDC_PROTO_NONE),
 	.driver_info = (unsigned long)&wwan_info,
 }, {
+	/* Cinterion AHS3 modem by GEMALTO */
+	USB_DEVICE_AND_INTERFACE_INFO(0x1e2d, 0x0055, USB_CLASS_COMM,
+				      USB_CDC_SUBCLASS_ETHERNET,
+				      USB_CDC_PROTO_NONE),
+	.driver_info = (unsigned long)&wwan_info,
+}, {
 	/* Telit modules */
 	USB_VENDOR_AND_INTERFACE_INFO(0x1bc7, USB_CLASS_COMM,
 			USB_CDC_SUBCLASS_ETHERNET, USB_CDC_PROTO_NONE),

^ permalink raw reply	[flat|nested] 82+ messages in thread

* [PATCH 4.9 63/66] rds: MP-RDS may use an invalid c_path
  2018-04-17 15:58 [PATCH 4.9 00/66] 4.9.95-stable review Greg Kroah-Hartman
                   ` (61 preceding siblings ...)
  2018-04-17 15:59 ` [PATCH 4.9 62/66] cdc_ether: flag the Cinterion AHS8 modem by gemalto as WWAN Greg Kroah-Hartman
@ 2018-04-17 15:59 ` Greg Kroah-Hartman
  2018-04-17 15:59 ` [PATCH 4.9 64/66] slip: Check if rstate is initialized before uncompressing Greg Kroah-Hartman
                   ` (6 subsequent siblings)
  69 siblings, 0 replies; 82+ messages in thread
From: Greg Kroah-Hartman @ 2018-04-17 15:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Ka-Cheong Poon, Santosh Shilimkar,
	David S. Miller

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Ka-Cheong Poon <ka-cheong.poon@oracle.com>


[ Upstream commit a43cced9a348901f9015f4730b70b69e7c41a9c9 ]

rds_sendmsg() calls rds_send_mprds_hash() to find a c_path to use to
send a message.  Suppose the RDS connection is not yet up.  In
rds_send_mprds_hash(), it does

	if (conn->c_npaths == 0)
		wait_event_interruptible(conn->c_hs_waitq,
					 (conn->c_npaths != 0));

If it is interrupted before the connection is set up,
rds_send_mprds_hash() will return a non-zero hash value.  Hence
rds_sendmsg() will use a non-zero c_path to send the message.  But if
the RDS connection ends up being non-MP capable, the message will be
lost as only the zero c_path can be used.
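
The fixed control flow can be sketched in userspace C (illustrative
only; wait_interruptible() stands in for wait_event_interruptible()
and returns non-zero when the sleep is interrupted by a signal):

#include <stdio.h>

static int conn_npaths;			/* 0 until the handshake completes */

static int wait_interruptible(void)
{
	return -1;			/* pretend a signal arrived first */
}

static int send_mprds_hash(int hash)
{
	if (conn_npaths == 0 && hash != 0) {
		/* ping the peer to start the handshake, then wait */
		if (conn_npaths == 0)
			if (wait_interruptible())
				hash = 0;	/* interrupted: path 0 is always safe */
		if (conn_npaths == 1)
			hash = 0;	/* peer is not MP capable */
	}
	return hash;
}

int main(void)
{
	printf("chosen c_path: %d\n", send_mprds_hash(3));	/* prints 0 */
	return 0;
}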

Signed-off-by: Ka-Cheong Poon <ka-cheong.poon@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/rds/send.c |   15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

--- a/net/rds/send.c
+++ b/net/rds/send.c
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2006 Oracle.  All rights reserved.
+ * Copyright (c) 2006, 2018 Oracle and/or its affiliates. All rights reserved.
  *
  * This software is available to you under a choice of one of two
  * licenses.  You may choose to be licensed under the terms of the GNU
@@ -983,10 +983,15 @@ static int rds_send_mprds_hash(struct rd
 	if (conn->c_npaths == 0 && hash != 0) {
 		rds_send_ping(conn);
 
-		if (conn->c_npaths == 0) {
-			wait_event_interruptible(conn->c_hs_waitq,
-						 (conn->c_npaths != 0));
-		}
+		/* The underlying connection is not up yet.  Need to wait
+		 * until it is up to be sure that the non-zero c_path can be
+		 * used.  But if we are interrupted, we have to use the zero
+		 * c_path in case the connection ends up being non-MP capable.
+		 */
+		if (conn->c_npaths == 0)
+			if (wait_event_interruptible(conn->c_hs_waitq,
+						     conn->c_npaths != 0))
+				hash = 0;
 		if (conn->c_npaths == 1)
 			hash = 0;
 	}

^ permalink raw reply	[flat|nested] 82+ messages in thread

* [PATCH 4.9 64/66] slip: Check if rstate is initialized before uncompressing
  2018-04-17 15:58 [PATCH 4.9 00/66] 4.9.95-stable review Greg Kroah-Hartman
                   ` (62 preceding siblings ...)
  2018-04-17 15:59 ` [PATCH 4.9 63/66] rds: MP-RDS may use an invalid c_path Greg Kroah-Hartman
@ 2018-04-17 15:59 ` Greg Kroah-Hartman
  2018-04-17 15:59 ` [PATCH 4.9 65/66] vhost: fix vhost_vq_access_ok() log check Greg Kroah-Hartman
                   ` (5 subsequent siblings)
  69 siblings, 0 replies; 82+ messages in thread
From: Greg Kroah-Hartman @ 2018-04-17 15:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Tejaswi Tanikella, David S. Miller

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Tejaswi Tanikella <tejaswit@codeaurora.org>


[ Upstream commit 3f01ddb962dc506916c243f9524e8bef97119b77 ]

On receiving a packet, the state index points to the rstate which must
be used to fill in the IP and TCP headers. But if the state index
points to an rstate which is uninitialized, i.e. filled with zeros, it
gets stuck in an infinite loop inside ip_fast_csum() trying to compute
the IP checksum of a header with zero length.

89.666953:   <2> [<ffffff9dd3e94d38>] slhc_uncompress+0x464/0x468
89.666965:   <2> [<ffffff9dd3e87d88>] ppp_receive_nonmp_frame+0x3b4/0x65c
89.666978:   <2> [<ffffff9dd3e89dd4>] ppp_receive_frame+0x64/0x7e0
89.666991:   <2> [<ffffff9dd3e8a708>] ppp_input+0x104/0x198
89.667005:   <2> [<ffffff9dd3e93868>] pppopns_recv_core+0x238/0x370
89.667027:   <2> [<ffffff9dd4428fc8>] __sk_receive_skb+0xdc/0x250
89.667040:   <2> [<ffffff9dd3e939e4>] pppopns_recv+0x44/0x60
89.667053:   <2> [<ffffff9dd4426848>] __sock_queue_rcv_skb+0x16c/0x24c
89.667065:   <2> [<ffffff9dd4426954>] sock_queue_rcv_skb+0x2c/0x38
89.667085:   <2> [<ffffff9dd44f7358>] raw_rcv+0x124/0x154
89.667098:   <2> [<ffffff9dd44f7568>] raw_local_deliver+0x1e0/0x22c
89.667117:   <2> [<ffffff9dd44c8ba0>] ip_local_deliver_finish+0x70/0x24c
89.667131:   <2> [<ffffff9dd44c92f4>] ip_local_deliver+0x100/0x10c

./scripts/faddr2line vmlinux slhc_uncompress+0x464/0x468 output:
 ip_fast_csum at arch/arm64/include/asm/checksum.h:40
 (inlined by) slhc_uncompress at drivers/net/slip/slhc.c:615

Add a variable to indicate whether the current rstate is initialized.
If a packet arrives for an uninitialized rstate, move to the toss
state.
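
A minimal plain-C sketch of the guard (illustrative only; the names
below are not the slhc code):

#include <stdbool.h>
#include <stdio.h>

#define NSLOTS 16

struct cstate {
	bool initialized;	/* set by the "remember" path */
	/* ... cached IP/TCP header would live here ... */
};

static struct cstate rstate[NSLOTS];	/* zeroed: nothing initialized yet */

static int uncompress(unsigned int slot)
{
	if (slot >= NSLOTS)
		return -1;		/* bad index */
	if (!rstate[slot].initialized)
		return -1;		/* toss: no header cached for this slot */
	return 0;			/* safe to rebuild headers from the slot */
}

int main(void)
{
	printf("slot 3 before remember: %d\n", uncompress(3));	/* -1 */
	rstate[3].initialized = true;				/* remember path */
	printf("slot 3 after remember:  %d\n", uncompress(3));	/* 0 */
	return 0;
}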

Signed-off-by: Tejaswi Tanikella <tejaswit@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/slip/slhc.c |    5 +++++
 include/net/slhc_vj.h   |    1 +
 2 files changed, 6 insertions(+)

--- a/drivers/net/slip/slhc.c
+++ b/drivers/net/slip/slhc.c
@@ -509,6 +509,10 @@ slhc_uncompress(struct slcompress *comp,
 		if(x < 0 || x > comp->rslot_limit)
 			goto bad;
 
+		/* Check if the cstate is initialized */
+		if (!comp->rstate[x].initialized)
+			goto bad;
+
 		comp->flags &=~ SLF_TOSS;
 		comp->recv_current = x;
 	} else {
@@ -673,6 +677,7 @@ slhc_remember(struct slcompress *comp, u
 	if (cs->cs_tcp.doff > 5)
 	  memcpy(cs->cs_tcpopt, icp + ihl*4 + sizeof(struct tcphdr), (cs->cs_tcp.doff - 5) * 4);
 	cs->cs_hsize = ihl*2 + cs->cs_tcp.doff*2;
+	cs->initialized = true;
 	/* Put headers back on packet
 	 * Neither header checksum is recalculated
 	 */
--- a/include/net/slhc_vj.h
+++ b/include/net/slhc_vj.h
@@ -127,6 +127,7 @@ typedef __u32 int32;
  */
 struct cstate {
 	byte_t	cs_this;	/* connection id number (xmit) */
+	bool	initialized;	/* true if initialized */
 	struct cstate *next;	/* next in ring (xmit) */
 	struct iphdr cs_ip;	/* ip/tcp hdr from most recent packet */
 	struct tcphdr cs_tcp;

^ permalink raw reply	[flat|nested] 82+ messages in thread

* [PATCH 4.9 65/66] vhost: fix vhost_vq_access_ok() log check
  2018-04-17 15:58 [PATCH 4.9 00/66] 4.9.95-stable review Greg Kroah-Hartman
                   ` (63 preceding siblings ...)
  2018-04-17 15:59 ` [PATCH 4.9 64/66] slip: Check if rstate is initialized before uncompressing Greg Kroah-Hartman
@ 2018-04-17 15:59 ` Greg Kroah-Hartman
  2018-04-17 15:59 ` [PATCH 4.9 66/66] lan78xx: Correctly indicate invalid OTP Greg Kroah-Hartman
                   ` (4 subsequent siblings)
  69 siblings, 0 replies; 82+ messages in thread
From: Greg Kroah-Hartman @ 2018-04-17 15:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, syzbot+65a84dde0214b0387ccd,
	Jason Wang, Stefan Hajnoczi, Michael S. Tsirkin, David S. Miller

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Stefan Hajnoczi <stefanha@redhat.com>


[ Upstream commit d14d2b78090c7de0557362b26a4ca591aa6a9faa ]

Commit d65026c6c62e7d9616c8ceb5a53b68bcdc050525 ("vhost: validate log
when IOTLB is enabled") introduced a regression.  The logic was
originally:

  if (vq->iotlb)
      return 1;
  return A && B;

After the patch the short-circuit logic for A was inverted:

  if (A || vq->iotlb)
      return A;
  return B;

This patch fixes the regression by rewriting the checks in the obvious
way, no longer returning A when vq->iotlb is non-NULL (which is hard to
understand).
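
The inverted short-circuit is easy to see in a plain-C truth table
(illustrative only; a and b stand for the two checks named above):

#include <stdio.h>

static int original(int a, int b, int iotlb)
{
	if (iotlb)
		return 1;
	return a && b;
}

static int regressed(int a, int b, int iotlb)
{
	int ret = a;

	if (ret || iotlb)
		return ret;	/* b is never checked when a is true */
	return b;
}

static int fixed(int a, int b, int iotlb)
{
	if (!a)
		return 0;	/* the log check must hold in all modes */
	if (iotlb)
		return 1;	/* ring validation happens at prefetch time */
	return b;
}

int main(void)
{
	for (int a = 0; a <= 1; a++)
		for (int b = 0; b <= 1; b++)
			for (int iotlb = 0; iotlb <= 1; iotlb++)
				printf("a=%d b=%d iotlb=%d: orig=%d bad=%d fixed=%d\n",
				       a, b, iotlb,
				       original(a, b, iotlb),
				       regressed(a, b, iotlb),
				       fixed(a, b, iotlb));
	return 0;
}

The a=1 b=0 iotlb=0 row shows the regression: the broken version
returns 1 without ever checking b.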

Reported-by: syzbot+65a84dde0214b0387ccd@syzkaller.appspotmail.com
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/vhost/vhost.c |    8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -1175,10 +1175,12 @@ static int vq_log_access_ok(struct vhost
 /* Caller should have vq mutex and device mutex */
 int vhost_vq_access_ok(struct vhost_virtqueue *vq)
 {
-	int ret = vq_log_access_ok(vq, vq->log_base);
+	if (!vq_log_access_ok(vq, vq->log_base))
+		return 0;
 
-	if (ret || vq->iotlb)
-		return ret;
+	/* Access validation occurs at prefetch time with IOTLB */
+	if (vq->iotlb)
+		return 1;
 
 	return vq_access_ok(vq, vq->num, vq->desc, vq->avail, vq->used);
 }

^ permalink raw reply	[flat|nested] 82+ messages in thread

* [PATCH 4.9 66/66] lan78xx: Correctly indicate invalid OTP
  2018-04-17 15:58 [PATCH 4.9 00/66] 4.9.95-stable review Greg Kroah-Hartman
                   ` (64 preceding siblings ...)
  2018-04-17 15:59 ` [PATCH 4.9 65/66] vhost: fix vhost_vq_access_ok() log check Greg Kroah-Hartman
@ 2018-04-17 15:59 ` Greg Kroah-Hartman
  2018-04-17 21:04 ` [PATCH 4.9 00/66] 4.9.95-stable review Shuah Khan
                   ` (3 subsequent siblings)
  69 siblings, 0 replies; 82+ messages in thread
From: Greg Kroah-Hartman @ 2018-04-17 15:59 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Phil Elwell, David S. Miller

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Phil Elwell <phil@raspberrypi.org>


[ Upstream commit 4bfc33807a9a02764bdd1e42e794b3b401240f27 ]

lan78xx_read_otp tries to return -EINVAL in the event of invalid OTP
content, but the value gets overwritten before it is returned and the
read goes ahead anyway. Make the read conditional as it should be
and preserve the error code.
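
The overwrite pattern and its fix can be sketched in plain C
(illustrative only; fake_read_raw() stands in for
lan78xx_read_raw_otp()):

#include <errno.h>
#include <stdio.h>

static int fake_read_raw(void)
{
	return 0;			/* the raw read itself succeeds */
}

static int read_otp_buggy(int signature_ok)
{
	int ret = 0;

	if (!signature_ok)
		ret = -EINVAL;		/* error noted here... */
	ret = fake_read_raw();		/* ...but unconditionally overwritten */
	return ret;
}

static int read_otp_fixed(int signature_ok)
{
	int ret = 0;

	if (!signature_ok)
		ret = -EINVAL;
	if (!ret)			/* read only if the signature was valid */
		ret = fake_read_raw();
	return ret;
}

int main(void)
{
	printf("buggy: %d, fixed: %d\n",
	       read_otp_buggy(0), read_otp_fixed(0));	/* buggy: 0, fixed: -22 */
	return 0;
}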

Fixes: 55d7de9de6c3 ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet device driver")
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/usb/lan78xx.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/drivers/net/usb/lan78xx.c
+++ b/drivers/net/usb/lan78xx.c
@@ -873,7 +873,8 @@ static int lan78xx_read_otp(struct lan78
 			offset += 0x100;
 		else
 			ret = -EINVAL;
-		ret = lan78xx_read_raw_otp(dev, offset, length, data);
+		if (!ret)
+			ret = lan78xx_read_raw_otp(dev, offset, length, data);
 	}
 
 	return ret;

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH 4.9 00/66] 4.9.95-stable review
  2018-04-17 15:58 [PATCH 4.9 00/66] 4.9.95-stable review Greg Kroah-Hartman
                   ` (65 preceding siblings ...)
  2018-04-17 15:59 ` [PATCH 4.9 66/66] lan78xx: Correctly indicate invalid OTP Greg Kroah-Hartman
@ 2018-04-17 21:04 ` Shuah Khan
  2018-04-17 23:03 ` kernelci.org bot
                   ` (2 subsequent siblings)
  69 siblings, 0 replies; 82+ messages in thread
From: Shuah Khan @ 2018-04-17 21:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman, linux-kernel
  Cc: torvalds, akpm, linux, patches, ben.hutchings, lkft-triage,
	stable, Shuah Khan

On 04/17/2018 09:58 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.9.95 release.
> There are 66 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Thu Apr 19 15:56:27 UTC 2018.
> Anything received after that time might be too late.
> 
> The whole patch series can be found in one patch at:
> 	https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.95-rc1.gz
> or in the git tree and branch at:
> 	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y
> and the diffstat can be found below.
> 
> thanks,
> 
> greg k-h
> 


Compiled and booted on my test system. No dmesg regressions.

thanks,
-- Shuah

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH 4.9 00/66] 4.9.95-stable review
  2018-04-17 15:58 [PATCH 4.9 00/66] 4.9.95-stable review Greg Kroah-Hartman
                   ` (66 preceding siblings ...)
  2018-04-17 21:04 ` [PATCH 4.9 00/66] 4.9.95-stable review Shuah Khan
@ 2018-04-17 23:03 ` kernelci.org bot
  2018-04-18 15:38 ` Guenter Roeck
  2018-04-18 17:42 ` Dan Rue
  69 siblings, 0 replies; 82+ messages in thread
From: kernelci.org bot @ 2018-04-17 23:03 UTC (permalink / raw)
  To: Greg Kroah-Hartman, linux-kernel
  Cc: Greg Kroah-Hartman, torvalds, akpm, linux, shuahkh, patches,
	ben.hutchings, lkft-triage, stable

stable-rc/linux-4.9.y boot: 52 boots: 1 failed, 48 passed with 3 untried/unknown (v4.9.94-67-g148bd16b9893)

Full Boot Summary: https://kernelci.org/boot/all/job/stable-rc/branch/linux-4.9.y/kernel/v4.9.94-67-g148bd16b9893/
Full Build Summary: https://kernelci.org/build/stable-rc/branch/linux-4.9.y/kernel/v4.9.94-67-g148bd16b9893/

Tree: stable-rc
Branch: linux-4.9.y
Git Describe: v4.9.94-67-g148bd16b9893
Git Commit: 148bd16b989355ed27e01a6480c2548d0bb795e1
Git URL: http://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
Tested: 26 unique boards, 11 SoC families, 15 builds out of 182

Boot Regressions Detected:

arm64:

    defconfig:
        r8a7795-salvator-x:
            lab-baylibre: failing since 7 days (last pass: v4.9.93 - first fail: v4.9.93-278-g1614c07609f0)

Boot Failure Detected:

arm64:

    defconfig
        r8a7795-salvator-x: 1 failed lab

---
For more info write to <info@kernelci.org>

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH 4.9 00/66] 4.9.95-stable review
  2018-04-17 15:58 [PATCH 4.9 00/66] 4.9.95-stable review Greg Kroah-Hartman
                   ` (67 preceding siblings ...)
  2018-04-17 23:03 ` kernelci.org bot
@ 2018-04-18 15:38 ` Guenter Roeck
  2018-04-18 17:42 ` Dan Rue
  69 siblings, 0 replies; 82+ messages in thread
From: Guenter Roeck @ 2018-04-18 15:38 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, torvalds, akpm, shuahkh, patches, ben.hutchings,
	lkft-triage, stable

On Tue, Apr 17, 2018 at 05:58:33PM +0200, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.9.95 release.
> There are 66 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Thu Apr 19 15:56:27 UTC 2018.
> Anything received after that time might be too late.
> 

Build results:
	total: 145 pass: 145 fail: 0
Qemu test results:
	total: 137 pass: 137 fail: 0

Details are available at http://kerneltests.org/builders.

Guenter

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH 4.9 00/66] 4.9.95-stable review
  2018-04-17 15:58 [PATCH 4.9 00/66] 4.9.95-stable review Greg Kroah-Hartman
                   ` (68 preceding siblings ...)
  2018-04-18 15:38 ` Guenter Roeck
@ 2018-04-18 17:42 ` Dan Rue
  2018-04-19  7:56   ` Greg Kroah-Hartman
  69 siblings, 1 reply; 82+ messages in thread
From: Dan Rue @ 2018-04-18 17:42 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, torvalds, akpm, linux, shuahkh, patches,
	ben.hutchings, lkft-triage, stable, Ka-Cheong Poon, Mark Rutland

On Tue, Apr 17, 2018 at 05:58:33PM +0200, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.9.95 release.
> There are 66 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Thu Apr 19 15:56:27 UTC 2018.
> Anything received after that time might be too late.
> 
> The whole patch series can be found in one patch at:
> 	https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.95-rc1.gz
> or in the git tree and branch at:
> 	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y
> and the diffstat can be found below.

We've noticed a regression on arm32 in sendto() and send() system calls,
causing them to hang forever.

The easiest way we have found to reproduce it is to simply run 'ip link'.
When running 'strace -T ip link' we see:

    ...
    getsockname(3, {sa_family=AF_NETLINK, nl_pid=364, nl_groups=00000000}, [12]) = 0 <0.000079>
    .... long wait (10 minute) and then eventually:
    send(3, {{len=40, type=0x12 /* NLMSG_??? */, flags=NLM_F_REQUEST|0x300, seq=1522961259, pid=0}, \"\21\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\10\0\35\0\1\0\0\0\"}, 40, 0[  807.868633] random: crng init done

Ignore the 'random: crng init done' - I believe it causes the dmesg
buffer to print, which is what gets us our send() output.

We saw a similar strace hang on sendto() with the ltp 'gethostid01' test.

We've also observed the following kernel log during boot on the x15/arm32
device:

    [   13.555030] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
    [   13.580352] net eth0: initializing cpsw version 1.15 (0)
    [   13.708220] Unable to handle kernel NULL pointer dereference at virtual address 00000008
    [   13.716375] pgd = ed724f40
    [   13.719126] [00000008] *pgd=ac173003, *pmd=fb03d003
    [   13.724068] Internal error: Oops: 207 [#1] SMP ARM
    [   13.728882] Modules linked in: snd_soc_simple_card snd_soc_simple_card_utils snd_soc_core snd_pcm_dmaengine snd_pcm snd_timer snd soundcore ac97_bus
    [   13.742337] CPU: 0 PID: 243 Comm: NetworkManager Not tainted 4.9.95-rc1 #1
    [   13.749241] Hardware name: Generic DRA74X (Flattened Device Tree)
    [   13.755360] task: ed6a6e40 task.stack: ec114000
    [   13.759916] PC is at kszphy_config_reset+0x1c/0x150
    [   13.764815] LR is at kszphy_resume+0x24/0x64
    [   13.769104] pc : [<c0b6f680>]    lr : [<c0b6f940>]    psr: 600e0113
    [   13.769104] sp : ec115920  ip : ec115940  fp : ec11593c
    [   13.780630] r10: 00000000  r9 : 00000007  r8 : 00000000
    [   13.785877] r7 : ed049800  r6 : 00000000  r5 : ee1d3800  r4 : ed049c00
    [   13.792431] r3 : 00000001  r2 : 00000000  r1 : 00000110  r0 : ed049c00
    [   13.798987] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
    [   13.806152] Control: 30c5387d  Table: ad724f40  DAC: fffffffd
    [   13.811923] Process NetworkManager (pid: 243, stack limit = 0xec114220)
    [   13.818564] Stack: (0xec115920 to 0xec116000)
    [   13.822941] 5920: ed049c00 ee1d3800 00000000 ed049800 ec115954 ec115940 c0b6f940 c0b6f670
    [   13.831154] 5940: ed049c00 ee1d3800 ec115984 ec115958 c0b693f8 c0b6f928 00000007 ed049c00
    [   13.839368] 5960: ed049c00 c0c11cec c0c11cec 00000007 ee1d3e0c 00000000 ec1159a4 ec115988
    [   13.847582] 5980: c0b69600 c0b69340 ee6af854 ed049c00 ee1d3800 c0c11cec ec1159cc ec1159a8
    [   13.855795] 59a0: c0b6968c c0b695e8 ed040310 ee1d3e00 00000000 ee4fbc10 ee4fbc10 ee1d3e0c
    [   13.864009] 59c0: ec115a04 ec1159d0 c0c0e47c c0b69644 00000001 00000000 c04d5160 ee4fbc10
    [   13.872222] 59e0: ee1d3e00 ee1d3800 ee4fbc10 00000000 00000001 ed3aad80 ec115aa4 ec115a08
    [   13.880436] 5a00: c0c11634 c0c0e258 00000000 c0476cec c0ef92e0 c0476cec c0df67b8 fffffff4
    [   13.888647] 5a20: ec115aac fffffff3 ec115aac 0000000d 00000000 00000000 ec115a6c ec115a48
    [   13.896861] 5a40: c0476cec c0eba8ac ec115aac 0000000d ee1d3800 00001002 00000000 ed5a7810
    [   13.905074] 5a60: ec115a84 ec115a70 c0476ed0 c0476ca4 00000000 c0de20fc ec115aa4 ee1d3800
    [   13.913287] 5a80: 00000000 c1193410 ee1d3830 00000000 ed5a7810 ed3aad80 ec115acc ec115aa8
    [   13.921500] 5aa0: c0deeb64 c0c1118c ec115acc ee1d3800 ee1d3800 00001003 00000001 00001002
    [   13.929713] 5ac0: ec115af4 ec115ad0 c0deee4c c0deeab4 ee1d3800 00001002 00000000 ee1d394c
    [   13.937925] 5ae0: 00000000 ed5a7810 ec115b1c ec115af8 c0deef24 c0deedb4 ee1d3800 ec115c28
    [   13.946137] 5b00: 00000000 c1193410 00000000 ed5a7810 ec115b8c ec115b20 c0e025f0 c0deef08
    [   13.954351] 5b20: c0491510 c04aa8d4 00000000 ec115b38 00000000 00000003 00000000 00000000
    [   13.962563] 5b40: 00000000 c230d40c ed6a7338 00000001 00000002 c1a58e78 c1a80870 ed6a6e40
    [   13.970775] 5b60: ec115bfc 00000000 ee1d3800 00000000 00000000 00000000 00000000 ed5a7800
    [   13.978988] 5b80: ec115d04 ec115b90 c0e04554 c0e0230c ec115bc4 00000000 ed3aad80 c11cb12c
    [   13.987202] 5ba0: ed5a7820 c1b68b40 00000000 c1b69b50 ec115b90 ed5a7810 ec115c54 ec115bc8
    [   13.995416] 5bc0: c04adef0 c04acd00 ec115c64 ec115bd8 c04adef0 00000000 00000000 00000000
    [   14.003629] 5be0: 00000000 00000000 00000000 c230d400 ed6a7318 00000000 00000001 c230d40c
    [   14.011842] 5c00: ed6a72f8 00000001 00000000 c1b69b84 c1a80870 ed6a6e40 ec115cb4 ec115c28
    [   14.020054] 5c20: c04adef0 c04acdd4 00000000 00000000 00000000 00000000 00000000 00000000
    [   14.028266] 5c40: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
    [   14.036479] 5c60: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
    [   14.044691] 5c80: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
    [   14.052903] 5ca0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
    [   14.061115] 5cc0: 00000000 00000000 00000000 00000000 00000000 00000000 c0e194b0 c0e040a4
    [   14.069328] 5ce0: ed5a7800 c236d9e4 ed3aad80 ec115d84 00000000 00000000 ec115d44 ec115d08
    [   14.077540] 5d00: c0e04908 c0e040b0 c0f458a8 c04ad82c 00000001 00000000 c0e01c0c 00000000
    [   14.085753] 5d20: ed5a7800 c0e04828 ed3aad80 ed3aad80 ec115d84 00000000 ec115d64 ec115d48
    [   14.093966] 5d40: c0e1e538 c0e04834 ed3aad80 ec212000 00000020 ed3aad80 ec115d7c ec115d68
    [   14.102179] 5d60: c0e01c1c c0e1e494 ee2f0400 ec212000 ec115dac ec115d80 c0e1de70 c0e01bf0
    [   14.110391] 5d80: ec115e24 7fffffff ec115f48 ec212000 ed3aad80 00000000 00000020 00000000
    [   14.118603] 5da0: ec115e0c ec115db0 c0e1e29c c0e1dcfc ec115e24 ec115e48 0000000c 00000001
    [   14.126817] 5dc0: be8be8b4 00000008 00000000 ed73bd00 00000000 000000f3 00000000 00000000
    [   14.135031] 5de0: ec115e24 ec115f48 00000000 00000000 eddf6d80 ec115e28 00000000 00000000
    [   14.143244] 5e00: ec115e1c ec115e10 c0dc8304 c0e1dfe4 ec115f34 ec115e20 c0dc8a70 c0dc82ec
    [   14.151457] 5e20: ec115e84 00000000 c04afaa8 c04adb9c 00000001 00000000 600b0013 c1b7a9a4
    [   14.159670] 5e40: 00000001 c1a4405c 001fa4f0 00000020 c04b00cc c04c87a0 c04afaa8 c04adb9c
    [   14.167884] 5e60: 00000000 00000000 ffffe000 00000000 600b0013 00004000 00000000 c05fed1c
    [   14.176097] 5e80: 600b0013 c04aa828 00000010 00000000 00000000 00000008 00004000 ed45ea00
    [   14.184308] 5ea0: c1b79b7b c13c1e6c ec115efc ec115eb8 c05fed48 c04afc5c 00000000 00000000
    [   14.192522] 5ec0: c05febb8 ec115ed0 c13e1ce0 ed45ea90 00000000 ec115f44 ec115f40 00000008
    [   14.200735] 5ee0: 00000128 c0408e04 ec114000 00000000 ec115f0c ec115f00 c05fee84 c05febc4
    [   14.208949] 5f00: ec115f1c ec115f10 c05fef04 be8be89c 00000000 eddf6d80 00000128 c0408e04
    [   14.217162] 5f20: ec114000 00000000 ec115f94 ec115f38 c0dc9884 c0dc889c 00000000 00000000
    [   14.225375] 5f40: 00000001 fffffff7 ec115e88 0000000c 00000001 00000000 00000000 ec115e50
    [   14.233588] 5f60: 00000000 00000006 00000000 00000000 00000000 00000000 ec115f94 00000000
    [   14.241801] 5f80: be8be89c 00000008 ec115fa4 ec115f98 c0dc98c8 c0dc9840 00000000 ec115fa8
    [   14.250013] 5fa0: c0408c80 c0dc98bc 00000000 be8be89c 00000008 be8be89c 00000000 00000000
    [   14.258225] 5fc0: 00000000 be8be89c 00000008 00000128 b69d9390 be8be900 00000000 001a93e0
    [   14.266438] 5fe0: b6f316e0 be8be840 00000000 b699a4a4 800b0010 00000008 00000000 00000000
    [   14.274661] [<c0b6f680>] (kszphy_config_reset) from [<c0b6f940>] (kszphy_resume+0x24/0x64)
    [   14.282966] [<c0b6f940>] (kszphy_resume) from [<c0b693f8>] (phy_attach_direct+0xc4/0x1cc)
    [   14.291183] [<c0b693f8>] (phy_attach_direct) from [<c0b69600>] (phy_connect_direct+0x24/0x5c)
    [   14.299747] [<c0b69600>] (phy_connect_direct) from [<c0b6968c>] (phy_connect+0x54/0x88)
    [   14.307792] [<c0b6968c>] (phy_connect) from [<c0c0e47c>] (cpsw_slave_open+0x230/0x294)
    [   14.315749] [<c0c0e47c>] (cpsw_slave_open) from [<c0c11634>] (cpsw_ndo_open+0x4b4/0x618)
    [   14.323883] [<c0c11634>] (cpsw_ndo_open) from [<c0deeb64>] (__dev_open+0xbc/0x124)
    [   14.331498] [<c0deeb64>] (__dev_open) from [<c0deee4c>] (__dev_change_flags+0xa4/0x154)
    [   14.331507] [<c0deee4c>] (__dev_change_flags) from [<c0deef24>] (dev_change_flags+0x28/0x58)
    [   14.331522] [<c0deef24>] (dev_change_flags) from [<c0e025f0>] (do_setlink+0x2f0/0x890)
    [   14.331531] [<c0e025f0>] (do_setlink) from [<c0e04554>] (rtnl_newlink+0x4b0/0x784)
    [   14.331539] [<c0e04554>] (rtnl_newlink) from [<c0e04908>] (rtnetlink_rcv_msg+0xe0/0x1fc)
    [   14.331550] [<c0e04908>] (rtnetlink_rcv_msg) from [<c0e1e538>] (netlink_rcv_skb+0xb0/0xcc)
    [   14.331559] [<c0e1e538>] (netlink_rcv_skb) from [<c0e01c1c>] (rtnetlink_rcv+0x38/0x40)
    [   14.331567] [<c0e01c1c>] (rtnetlink_rcv) from [<c0e1de70>] (netlink_unicast+0x180/0x210)
    [   14.331575] [<c0e1de70>] (netlink_unicast) from [<c0e1e29c>] (netlink_sendmsg+0x2c4/0x380)
    [   14.331584] [<c0e1e29c>] (netlink_sendmsg) from [<c0dc8304>] (sock_sendmsg+0x24/0x34)
    [   14.331593] [<c0dc8304>] (sock_sendmsg) from [<c0dc8a70>] (___sys_sendmsg+0x1e0/0x1f0)
    [   14.331601] [<c0dc8a70>] (___sys_sendmsg) from [<c0dc9884>] (__sys_sendmsg+0x50/0x7c)
    [   14.331610] [<c0dc9884>] (__sys_sendmsg) from [<c0dc98c8>] (SyS_sendmsg+0x18/0x1c)
    [   14.331620] [<c0dc98c8>] (SyS_sendmsg) from [<c0408c80>] (ret_fast_syscall+0x0/0x1c)
    [   14.331629] Code: e52de004 e8bd4000 e59062dc e1a04000 (e5d63008) 
    [   14.331751] ---[ end trace 7927abed2565423d ]---

Full output of a lava job that shows both of these symptoms can be seen
at https://lkft.validation.linaro.org/scheduler/job/187562.

Looking through the patches, it's not obvious to me where the issue may
be. We haven't bisected it yet but will begin. I noticed quite a few arm
changes from Mark Rutland, and a change to net/rds/send.c from Ka-Cheong
Poon. Both added to CC.

Dan

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH 4.9 00/66] 4.9.95-stable review
  2018-04-18 17:42 ` Dan Rue
@ 2018-04-19  7:56   ` Greg Kroah-Hartman
  2018-04-19 11:12     ` Naresh Kamboju
  0 siblings, 1 reply; 82+ messages in thread
From: Greg Kroah-Hartman @ 2018-04-19  7:56 UTC (permalink / raw)
  To: linux-kernel, torvalds, akpm, linux, shuahkh, patches,
	ben.hutchings, lkft-triage, stable, Ka-Cheong Poon, Mark Rutland

On Wed, Apr 18, 2018 at 12:42:44PM -0500, Dan Rue wrote:
> On Tue, Apr 17, 2018 at 05:58:33PM +0200, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 4.9.95 release.
> > There are 66 patches in this series, all will be posted as a response
> > to this one.  If anyone has any issues with these being applied, please
> > let me know.
> > 
> > Responses should be made by Thu Apr 19 15:56:27 UTC 2018.
> > Anything received after that time might be too late.
> > 
> > The whole patch series can be found in one patch at:
> > 	https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.95-rc1.gz
> > or in the git tree and branch at:
> > 	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y
> > and the diffstat can be found below.
> 
> We've noticed a regression on arm32 in sendto() and send() system calls,
> causing them to hang forever.
> 
> The easiest way we have found to reproduce it is to simply run 'ip link'.
> When running 'strace -T ip link' we see:
> 
>     ...
>     getsockname(3, {sa_family=AF_NETLINK, nl_pid=364, nl_groups=00000000}, [12]) = 0 <0.000079>
>     .... long wait (10 minute) and then eventually:
>     send(3, {{len=40, type=0x12 /* NLMSG_??? */, flags=NLM_F_REQUEST|0x300, seq=1522961259, pid=0}, \"\21\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\10\0\35\0\1\0\0\0\"}, 40, 0[  807.868633] random: crng init done
> 
> Ignore the 'random: crng init done' - I believe it causes the dmesg
> buffer to print, which is what gets us our send() output.
> 
> We saw a similar strace hang on sendto() with the ltp 'gethostid01' test.
> 
> We've also observed the following kernel log during boot on the x15/arm32
> device:
> 
>     [... oops and stack dump snipped; quoted in full in Dan's report above ...]
> 
> Full output of a lava job that shows both of these symptoms can be seen
> at https://lkft.validation.linaro.org/scheduler/job/187562.
> 
> Looking through the patches, it's not obvious to me where the issue may
> be. We haven't bisected it yet but will begin. I noticed quite a few arm
> changes from Mark Rutland, and a change to net/rds/send.c from Ka-Cheong
> Poon. Both added to CC.

Can you try 'git bisect'?  I'll hold off on releasing 4.9.y until this
gets figured out.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH 4.9 00/66] 4.9.95-stable review
  2018-04-19  7:56   ` Greg Kroah-Hartman
@ 2018-04-19 11:12     ` Naresh Kamboju
  2018-04-19 12:09       ` Ben Hutchings
  2018-04-19 14:03       ` Greg Kroah-Hartman
  0 siblings, 2 replies; 82+ messages in thread
From: Naresh Kamboju @ 2018-04-19 11:12 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: open list, Linus Torvalds, Andrew Morton, Guenter Roeck,
	Shuah Khan, patches, Ben Hutchings, lkft-triage, linux- stable,
	Ka-Cheong Poon, Mark Rutland

>
> Can you try 'git bisect'?  I'll hold off on releasing 4.9.y until this
> gets figured out.

After reverting this patch, the network started working on the arm32 x15 device.
d7ba3c00047d ("net: phy: micrel: Restore led_mode and clk_sel on resume")

- Naresh

>
> thanks,
>
> greg k-h

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH 4.9 00/66] 4.9.95-stable review
  2018-04-19 11:12     ` Naresh Kamboju
@ 2018-04-19 12:09       ` Ben Hutchings
  2018-04-19 12:30         ` Naresh Kamboju
  2018-04-19 14:03       ` Greg Kroah-Hartman
  1 sibling, 1 reply; 82+ messages in thread
From: Ben Hutchings @ 2018-04-19 12:09 UTC (permalink / raw)
  To: Naresh Kamboju, Greg Kroah-Hartman
  Cc: open list, Linus Torvalds, Andrew Morton, Guenter Roeck,
	Shuah Khan, patches, lkft-triage, linux- stable, Ka-Cheong Poon,
	Mark Rutland

On Thu, 2018-04-19 at 16:42 +0530, Naresh Kamboju wrote:
> > 
> > Can you try 'git bisect'?  I'll hold off on releasing 4.9.y until this
> > gets figured out.
> 
> After reverting this patch, the network started working on the arm32 x15 device.
> d7ba3c00047d ("net: phy: micrel: Restore led_mode and clk_sel on resume")

Does that also fix the hang during 'ip l'?  (If the hang only occurred
on the same systems that oops'd in the micrel driver, it could well be
because the task that oops'd was holding the rtnetlink lock.)

Ben.

-- 
Ben Hutchings
Software Developer, Codethink Ltd.

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH 4.9 00/66] 4.9.95-stable review
  2018-04-19 12:09       ` Ben Hutchings
@ 2018-04-19 12:30         ` Naresh Kamboju
  2018-04-19 13:21           ` Dan Rue
  0 siblings, 1 reply; 82+ messages in thread
From: Naresh Kamboju @ 2018-04-19 12:30 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: Greg Kroah-Hartman, open list, Linus Torvalds, Andrew Morton,
	Guenter Roeck, Shuah Khan, patches, lkft-triage, linux- stable,
	Ka-Cheong Poon, Mark Rutland

On 19 April 2018 at 17:39, Ben Hutchings <ben.hutchings@codethink.co.uk> wrote:
> On Thu, 2018-04-19 at 16:42 +0530, Naresh Kamboju wrote:
>> >
>> > Can you try 'git bisect'?  I'll hold off on releasing 4.9.y until this
>> > gets figured out.
>>
>> After reverting this patch, the network started working on the arm32 x15 device.
>> d7ba3c00047d ("net: phy: micrel: Restore led_mode and clk_sel on resume")
>
> Does that also fix the hang during 'ip l'?  (If the hang only occurred
> on the same systems that oops'd in the micrel driver, it could well be
> because the task that oops'd was holding the rtnetlink lock.)

Now 'ip l' works and the oops message is gone.

- Naresh

>
> Ben.
>
> --
> Ben Hutchings
> Software Developer, Codethink Ltd.
>

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH 4.9 00/66] 4.9.95-stable review
  2018-04-19 12:30         ` Naresh Kamboju
@ 2018-04-19 13:21           ` Dan Rue
  0 siblings, 0 replies; 82+ messages in thread
From: Dan Rue @ 2018-04-19 13:21 UTC (permalink / raw)
  To: Naresh Kamboju
  Cc: Ben Hutchings, Greg Kroah-Hartman, open list, Linus Torvalds,
	Andrew Morton, Guenter Roeck, Shuah Khan, patches, lkft-triage,
	linux- stable, Ka-Cheong Poon, Mark Rutland

On Thu, Apr 19, 2018 at 06:00:52PM +0530, Naresh Kamboju wrote:
> On 19 April 2018 at 17:39, Ben Hutchings <ben.hutchings@codethink.co.uk> wrote:
> > On Thu, 2018-04-19 at 16:42 +0530, Naresh Kamboju wrote:
> >> >
> >> > Can you try 'git bisect'?  I'll hold off on releasing 4.9.y until this
> >> > gets figured out.
> >>
> >> After reverting this patch, the network started working on the arm32 x15 device.
> >> d7ba3c00047d ("net: phy: micrel: Restore led_mode and clk_sel on resume")
> >
> > Does that also fix the hang during 'ip l'?  (If the hang only occurred
> > on the same systems that oops'd in the micrel driver, it could well be
> > because the task that oops'd was holding the rtnetlink lock.)
> 
> Now 'ip l' works and the oops message is gone.
> 
> - Naresh

There may be some confusion since this patch was in 4.9.94 and we did
not report this issue then. The way these failures presented themselves
made them a bit subtle (we didn't get any hard failures, just some
incomplete jobs - not an unusual occurrence on embedded boards) so we
didn't notice them in the last release. It was only after Naresh
manually investigated that this was discovered.

We'll think about how we can improve our process to make these much more
obvious in an automated way.

Dan

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH 4.9 00/66] 4.9.95-stable review
  2018-04-19 11:12     ` Naresh Kamboju
  2018-04-19 12:09       ` Ben Hutchings
@ 2018-04-19 14:03       ` Greg Kroah-Hartman
  2018-04-19 20:04         ` Dan Rue
  1 sibling, 1 reply; 82+ messages in thread
From: Greg Kroah-Hartman @ 2018-04-19 14:03 UTC (permalink / raw)
  To: Naresh Kamboju
  Cc: open list, Linus Torvalds, Andrew Morton, Guenter Roeck,
	Shuah Khan, patches, Ben Hutchings, lkft-triage, linux- stable,
	Ka-Cheong Poon, Mark Rutland

On Thu, Apr 19, 2018 at 04:42:56PM +0530, Naresh Kamboju wrote:
> >
> > Can you try 'git bisect'?  I'll hold off on releasing 4.9.y until this
> > gets figured out.
> 
> After reverting this patch, the network started working on the arm32 x15 device.
> d7ba3c00047d ("net: phy: micrel: Restore led_mode and clk_sel on resume")

Thanks for letting me know, I've now reverted that commit.

greg k-h

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH 4.9 00/66] 4.9.95-stable review
  2018-04-19 14:03       ` Greg Kroah-Hartman
@ 2018-04-19 20:04         ` Dan Rue
  2018-04-20  6:27           ` Greg Kroah-Hartman
  0 siblings, 1 reply; 82+ messages in thread
From: Dan Rue @ 2018-04-19 20:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Naresh Kamboju, open list, Linus Torvalds, Andrew Morton,
	Guenter Roeck, Shuah Khan, patches, Ben Hutchings, lkft-triage,
	linux- stable, Ka-Cheong Poon, Mark Rutland

On Thu, Apr 19, 2018 at 04:03:05PM +0200, Greg Kroah-Hartman wrote:
> On Thu, Apr 19, 2018 at 04:42:56PM +0530, Naresh Kamboju wrote:
> > >
> > > Can you try 'git bisect'?  I'll hold off on releasing 4.9.y until this
> > > gets figured out.
> > 
> > After reverting this patch, the network started working on the arm32 x15 device.
> > d7ba3c00047d ("net: phy: micrel: Restore led_mode and clk_sel on resume")
> 
> Thanks for letting me know, I've now reverted that commit.

Alright here we go.

Results from Linaro’s test farm.
No regressions on arm64, arm and x86_64.

A few notes about these results. Since yesterday, we have enabled qemu
arm 32 and qemu arm 64 testing, and also upgraded kselftest to 4.16 from
4.15. These changes have caused some new failures and noise that will be
triaged over the next couple of days. These aren't considered
regressions because they're a result of upgrading the tests and
introducing new environments, not of a change in the kernel. I removed
some of these false regressions from the email report below, but you
would see them if you looked at qa-reports.

The x15 issue that was previously reported is resolved, but we have
learned that running 'ethtool --phy-statistics eth0' causes an oops.
This isn't a new issue, and we'll be tracking it going forward until
it's resolved.

Summary
------------------------------------------------------------------------

kernel: 4.9.95-rc3
git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
git branch: linux-4.9.y
git commit: 0afd4120faa7df778d227670fc91f49234be202b
git describe: v4.9.94-69-g0afd4120faa7
Test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-4.9-oe/build/v4.9.94-69-g0afd4120faa7


No regressions (compared to build v4.9.94-67-g148bd16b9893)

Boards, architectures and test suites:
-------------------------------------

dragonboard-410c
* boot - pass: 20,
* kselftest - pass: 35, skip: 27, fail: 4
* libhugetlbfs - pass: 90, skip: 1,
* ltp-cap_bounds-tests - pass: 2,
* ltp-containers-tests - pass: 64, skip: 17,
* ltp-fcntl-locktests-tests - pass: 2,
* ltp-filecaps-tests - pass: 2,
* ltp-fs-tests - pass: 57, skip: 6,
* ltp-fs_bind-tests - pass: 2,
* ltp-fs_perms_simple-tests - pass: 19,
* ltp-fsx-tests - pass: 2,
* ltp-hugetlb-tests - pass: 21, skip: 1,
* ltp-io-tests - pass: 3,
* ltp-ipc-tests - pass: 9,
* ltp-math-tests - pass: 11,
* ltp-nptl-tests - pass: 2,
* ltp-pty-tests - pass: 4,
* ltp-sched-tests - pass: 14,
* ltp-securebits-tests - pass: 4,
* ltp-syscalls-tests - pass: 1015, skip: 135,
* ltp-timers-tests - pass: 13,

hi6220-hikey - arm64
* boot - pass: 20,
* kselftest - pass: 38, skip: 24, fail: 4
* libhugetlbfs - pass: 90, skip: 1,
* ltp-cap_bounds-tests - pass: 2,
* ltp-containers-tests - pass: 64, skip: 17,
* ltp-fcntl-locktests-tests - pass: 2,
* ltp-filecaps-tests - pass: 2,
* ltp-fs-tests - pass: 57, skip: 6,
* ltp-fs_bind-tests - pass: 2,
* ltp-fs_perms_simple-tests - pass: 19,
* ltp-fsx-tests - pass: 2,
* ltp-hugetlb-tests - pass: 21, skip: 1,
* ltp-io-tests - pass: 3,
* ltp-ipc-tests - pass: 9,
* ltp-math-tests - pass: 11,
* ltp-nptl-tests - pass: 2,
* ltp-pty-tests - pass: 4,
* ltp-sched-tests - pass: 10, skip: 4,
* ltp-securebits-tests - pass: 4,
* ltp-syscalls-tests - pass: 1014, skip: 136,
* ltp-timers-tests - pass: 13,

juno-r2 - arm64
* boot - pass: 20,
* kselftest - pass: 38, skip: 24, fail: 4
* libhugetlbfs - pass: 90, skip: 1,
* ltp-cap_bounds-tests - pass: 2,
* ltp-containers-tests - pass: 64, skip: 17,
* ltp-fcntl-locktests-tests - pass: 2,
* ltp-filecaps-tests - pass: 2,
* ltp-fs-tests - pass: 57, skip: 6,
* ltp-fs_bind-tests - pass: 2,
* ltp-fs_perms_simple-tests - pass: 19,
* ltp-fsx-tests - pass: 2,
* ltp-hugetlb-tests - pass: 22,
* ltp-io-tests - pass: 3,
* ltp-ipc-tests - pass: 9,
* ltp-math-tests - pass: 11,
* ltp-nptl-tests - pass: 2,
* ltp-pty-tests - pass: 4,
* ltp-sched-tests - pass: 10, skip: 4,
* ltp-securebits-tests - pass: 4,
* ltp-syscalls-tests - pass: 1015, skip: 135,
* ltp-timers-tests - pass: 13,

qemu_arm
* boot - fail: 20
^ This is the first run for qemu_arm and this boot problem is due to an
error in our job template.

qemu_arm64
* boot - pass: 20,
* kselftest - pass: 37, skip: 27, fail: 4
* libhugetlbfs - pass: 90, skip: 1,
* ltp-cap_bounds-tests - pass: 2,
* ltp-containers-tests - pass: 64, skip: 17,
* ltp-fcntl-locktests-tests - pass: 2,
* ltp-filecaps-tests - pass: 2,
* ltp-fs-tests - pass: 57, skip: 6,
* ltp-fs_bind-tests - pass: 2,
* ltp-fs_perms_simple-tests - pass: 19,
* ltp-fsx-tests - pass: 2,
* ltp-hugetlb-tests - pass: 22,
* ltp-io-tests - pass: 3,
* ltp-ipc-tests - pass: 9,
* ltp-math-tests - pass: 11,
* ltp-nptl-tests - pass: 2,
* ltp-pty-tests - pass: 4,
* ltp-securebits-tests - pass: 4,
* ltp-syscalls-tests - pass: 592, skip: 74, fail: 5
* ltp-timers-tests - pass: 13,

qemu_x86_64
* boot - pass: 22,
* kselftest - pass: 49, skip: 27, fail: 4
* kselftest-vsyscall-mode-native - pass: 49, skip: 27, fail: 4
* kselftest-vsyscall-mode-none - pass: 49, skip: 27, fail: 4
* libhugetlbfs - pass: 90, skip: 1,
* ltp-cap_bounds-tests - pass: 2,
* ltp-containers-tests - pass: 64, skip: 17,
* ltp-fcntl-locktests-tests - pass: 2,
* ltp-filecaps-tests - pass: 2,
* ltp-fs-tests - pass: 57, skip: 6,
* ltp-fs_bind-tests - pass: 2,
* ltp-fs_perms_simple-tests - pass: 19,
* ltp-fsx-tests - pass: 2,
* ltp-hugetlb-tests - pass: 22,
* ltp-io-tests - pass: 3,
* ltp-ipc-tests - pass: 9,
* ltp-math-tests - pass: 11,
* ltp-nptl-tests - pass: 2,
* ltp-pty-tests - pass: 4,
* ltp-sched-tests - pass: 13, skip: 1,
* ltp-securebits-tests - pass: 4,
* ltp-syscalls-tests - pass: 1003, skip: 147,
* ltp-timers-tests - pass: 13,

x15 - arm
* boot - pass: 20,
* kselftest - pass: 37, skip: 24, fail: 4
* libhugetlbfs - pass: 87, skip: 1,
* ltp-cap_bounds-tests - pass: 2,
* ltp-containers-tests - pass: 63, skip: 18,
* ltp-fcntl-locktests-tests - pass: 2,
* ltp-filecaps-tests - pass: 2,
* ltp-fs-tests - pass: 58, skip: 5,
* ltp-fs_bind-tests - pass: 2,
* ltp-fs_perms_simple-tests - pass: 19,
* ltp-fsx-tests - pass: 2,
* ltp-hugetlb-tests - pass: 20, skip: 2,
* ltp-io-tests - pass: 3,
* ltp-ipc-tests - pass: 9,
* ltp-math-tests - pass: 11,
* ltp-nptl-tests - pass: 2,
* ltp-pty-tests - pass: 4,
* ltp-sched-tests - pass: 13, skip: 1,
* ltp-securebits-tests - pass: 4,
* ltp-syscalls-tests - pass: 1075, skip: 75,
* ltp-timers-tests - pass: 13,

x86_64
* boot - pass: 22,
* kselftest - pass: 51, skip: 25, fail: 4
* kselftest-vsyscall-mode-native - pass: 50, skip: 25, fail: 5
* kselftest-vsyscall-mode-none - pass: 51, skip: 25, fail: 4
* libhugetlbfs - pass: 90, skip: 1,
* ltp-cap_bounds-tests - pass: 2,
* ltp-containers-tests - pass: 64, skip: 17,
* ltp-fcntl-locktests-tests - pass: 2,
* ltp-filecaps-tests - pass: 2,
* ltp-fs-tests - pass: 58, skip: 5,
* ltp-fs_bind-tests - pass: 2,
* ltp-fs_perms_simple-tests - pass: 19,
* ltp-fsx-tests - pass: 2,
* ltp-hugetlb-tests - pass: 22,
* ltp-io-tests - pass: 3,
* ltp-ipc-tests - pass: 9,
* ltp-math-tests - pass: 11,
* ltp-nptl-tests - pass: 2,
* ltp-pty-tests - pass: 4,
* ltp-sched-tests - pass: 9, skip: 5,
* ltp-securebits-tests - pass: 4,
* ltp-syscalls-tests - pass: 1034, skip: 116,
* ltp-timers-tests - pass: 13,

-- 
Linaro QA (beta)
https://qa-reports.linaro.org

* Re: [PATCH 4.9 00/66] 4.9.95-stable review
  2018-04-19 20:04         ` Dan Rue
@ 2018-04-20  6:27           ` Greg Kroah-Hartman
  0 siblings, 0 replies; 82+ messages in thread
From: Greg Kroah-Hartman @ 2018-04-20  6:27 UTC (permalink / raw)
  To: Naresh Kamboju, open list, Linus Torvalds, Andrew Morton,
	Guenter Roeck, Shuah Khan, patches, Ben Hutchings, lkft-triage,
	linux-stable, Ka-Cheong Poon, Mark Rutland

On Thu, Apr 19, 2018 at 03:04:05PM -0500, Dan Rue wrote:
> On Thu, Apr 19, 2018 at 04:03:05PM +0200, Greg Kroah-Hartman wrote:
> > On Thu, Apr 19, 2018 at 04:42:56PM +0530, Naresh Kamboju wrote:
> > > >
> > > > Can you try 'git bisect'?  I'll hold off on releasing 4.9.y until this
> > > > gets figured out.
> > > 
> > > After reverting this patch, the network started working on the
> > > arm32 x15 device.
> > > d7ba3c00047d ("net: phy: micrel: Restore led_mode and clk_sel on resume")
> > 
> > Thanks for letting me know, I've now reverted that commit.
> 
> Alright, here we go.
> 
> Results from Linaro’s test farm.
> No regressions on arm64, arm and x86_64.

Great, thanks for testing and letting me know.

greg k-h
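
For readers following along, this is roughly what the bisect run
suggested above looks like. The branch and tag names are the ones from
this review cycle; the build/boot/test step is illustrative only:

    git bisect start
    git bisect bad linux-4.9.y   # stable-rc tip where x15 networking broke
    git bisect good v4.9.94      # last release known to be good
    # for each commit git checks out: build, boot the board, test the
    # network, then answer with "git bisect good" or "git bisect bad"
    git bisect reset             # clean up once the first bad commit is shown

In this thread the offending commit was identified directly by reverting
d7ba3c00047d, so a full bisect turned out not to be needed.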

* Re: [PATCH 4.9 52/66] Revert "perf tests: Decompress kernel module before objdump"
  2018-04-17 15:59 ` [PATCH 4.9 52/66] Revert "perf tests: Decompress kernel module before objdump" Greg Kroah-Hartman
@ 2018-09-05 18:50   ` Florian Fainelli
  2018-09-05 19:29     ` Greg Kroah-Hartman
  0 siblings, 1 reply; 82+ messages in thread
From: Florian Fainelli @ 2018-09-05 18:50 UTC (permalink / raw)
  To: Greg Kroah-Hartman, linux-kernel, Sasha Levin
  Cc: stable, Pavlos Parissis, Lei Chen, Maxime Hadjinlian,
	Namhyung Kim, Adrian Hunter, Jiri Olsa, David Ahern,
	Peter Zijlstra, Wang Nan, kernel-team, Arnaldo Carvalho de Melo

On 04/17/2018 08:59 AM, Greg Kroah-Hartman wrote:
> 4.9-stable review patch.  If anyone has any objections, please let me know.
> 
> ------------------
> 
> From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> 
> This reverts commit 7525a238be8f46617cdda29d1be5b85ffe3b042d which is
> commit 94df1040b1e6aacd8dec0ba3c61d7e77cd695f26 upstream.
> 
> It breaks the build of perf on 4.9.y, so I'm dropping it.

Sorry to hijack this thread; I was not able to find the original email
from when the offending patch was included in 4.1.52. Kernel 4.1.52 also
has the same problem, so can you push a 4.1.53 tag with that patch
reverted as well?

Thank you!

> 
> Reported-by: Pavlos Parissis <pavlos.parissis@gmail.com>
> Reported-by: Lei Chen <chenl.lei@gmail.com>
> Reported-by: Maxime Hadjinlian <maxime.hadjinlian@gmail.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Adrian Hunter <adrian.hunter@intel.com>
> Cc: Jiri Olsa <jolsa@kernel.org>
> Cc: David Ahern <dsahern@gmail.com>
> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Cc: Wang Nan <wangnan0@huawei.com>
> Cc: kernel-team@lge.com
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Cc: Sasha Levin <alexander.levin@microsoft.com>
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> ---
>  tools/perf/tests/code-reading.c |   20 +-------------------
>  1 file changed, 1 insertion(+), 19 deletions(-)
> 
> --- a/tools/perf/tests/code-reading.c
> +++ b/tools/perf/tests/code-reading.c
> @@ -224,8 +224,6 @@ static int read_object_code(u64 addr, si
>  	unsigned char buf2[BUFSZ];
>  	size_t ret_len;
>  	u64 objdump_addr;
> -	const char *objdump_name;
> -	char decomp_name[KMOD_DECOMP_LEN];
>  	int ret;
>  
>  	pr_debug("Reading object code for memory address: %#"PRIx64"\n", addr);
> @@ -286,25 +284,9 @@ static int read_object_code(u64 addr, si
>  		state->done[state->done_cnt++] = al.map->start;
>  	}
>  
> -	objdump_name = al.map->dso->long_name;
> -	if (dso__needs_decompress(al.map->dso)) {
> -		if (dso__decompress_kmodule_path(al.map->dso, objdump_name,
> -						 decomp_name,
> -						 sizeof(decomp_name)) < 0) {
> -			pr_debug("decompression failed\n");
> -			return -1;
> -		}
> -
> -		objdump_name = decomp_name;
> -	}
> -
>  	/* Read the object code using objdump */
>  	objdump_addr = map__rip_2objdump(al.map, al.addr);
> -	ret = read_via_objdump(objdump_name, objdump_addr, buf2, len);
> -
> -	if (dso__needs_decompress(al.map->dso))
> -		unlink(objdump_name);
> -
> +	ret = read_via_objdump(al.map->dso->long_name, objdump_addr, buf2, len);
>  	if (ret > 0) {
>  		/*
>  		 * The kernel maps are inaccurate - assume objdump is right in
> 
> 


-- 
Florian

* Re: [PATCH 4.9 52/66] Revert "perf tests: Decompress kernel module before objdump"
  2018-09-05 18:50   ` Florian Fainelli
@ 2018-09-05 19:29     ` Greg Kroah-Hartman
  2018-09-05 20:08       ` Florian Fainelli
  0 siblings, 1 reply; 82+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-05 19:29 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: linux-kernel, Sasha Levin, stable, Pavlos Parissis, Lei Chen,
	Maxime Hadjinlian, Namhyung Kim, Adrian Hunter, Jiri Olsa,
	David Ahern, Peter Zijlstra, Wang Nan, kernel-team,
	Arnaldo Carvalho de Melo

On Wed, Sep 05, 2018 at 11:50:02AM -0700, Florian Fainelli wrote:
> On 04/17/2018 08:59 AM, Greg Kroah-Hartman wrote:
> > 4.9-stable review patch.  If anyone has any objections, please let me know.
> > 
> > ------------------
> > 
> > From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > 
> > This reverts commit 7525a238be8f46617cdda29d1be5b85ffe3b042d which is
> > commit 94df1040b1e6aacd8dec0ba3c61d7e77cd695f26 upstream.
> > 
> > It breaks the build of perf on 4.9.y, so I'm dropping it.
> 
> Sorry to hijack this thread; I was not able to find the original email
> from when the offending patch was included in 4.1.52. Kernel 4.1.52 also
> has the same problem, so can you push a 4.1.53 tag with that patch
> reverted as well?

4.1 is long end-of-life now; I'm not going to go back and update a
"dead" kernel for a perf build bug.

You shouldn't be using this kernel either :)

thanks,

greg k-h

* Re: [PATCH 4.9 52/66] Revert "perf tests: Decompress kernel module before objdump"
  2018-09-05 19:29     ` Greg Kroah-Hartman
@ 2018-09-05 20:08       ` Florian Fainelli
  0 siblings, 0 replies; 82+ messages in thread
From: Florian Fainelli @ 2018-09-05 20:08 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, Sasha Levin, stable, Pavlos Parissis, Lei Chen,
	Maxime Hadjinlian, Namhyung Kim, Adrian Hunter, Jiri Olsa,
	David Ahern, Peter Zijlstra, Wang Nan, kernel-team,
	Arnaldo Carvalho de Melo

On 09/05/2018 12:29 PM, Greg Kroah-Hartman wrote:
> On Wed, Sep 05, 2018 at 11:50:02AM -0700, Florian Fainelli wrote:
>> On 04/17/2018 08:59 AM, Greg Kroah-Hartman wrote:
>>> 4.9-stable review patch.  If anyone has any objections, please let me know.
>>>
>>> ------------------
>>>
>>> From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>>>
>>> This reverts commit 7525a238be8f46617cdda29d1be5b85ffe3b042d which is
>>> commit 94df1040b1e6aacd8dec0ba3c61d7e77cd695f26 upstream.
>>>
>>> It breaks the build of perf on 4.9.y, so I'm dropping it.
>>
>> Sorry to hijack this thread; I was not able to find the original email
>> from when the offending patch was included in 4.1.52. Kernel 4.1.52 also
>> has the same problem, so can you push a 4.1.53 tag with that patch
>> reverted as well?
> 
> 4.1 is long end-of-life now; I'm not going to go back and update a
> "dead" kernel for a perf build bug.

Meh, fair enough; I reverted the offending commit. It still points to a
more fundamental problem: the tools shipped with the kernel do not
always get build-tested, unfortunately (see the sketch below).

> 
> You shouldn't be using this kernel either :)

Fortunately 4.9 is what we are mostly using these days, though we are
still getting occasional 4.1 kernel support requests, unfortunately...
-- 
Florian
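
As a rough illustration of the gap Florian describes, the tools bundled
with the kernel can be smoke-built straight from the top of a stable
tree before a release is tagged; perf is the build that broke here, and
the error handling is illustrative only:

    # fail the release job if the bundled perf tool no longer builds
    make -C tools/perf || exit 1

A one-line check like this in the pre-release scripts would likely have
caught the breakage before 4.9.95-rc1 and 4.1.52 went out.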

end of thread, other threads:[~2018-09-05 20:08 UTC | newest]

Thread overview: 82+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-04-17 15:58 [PATCH 4.9 00/66] 4.9.95-stable review Greg Kroah-Hartman
2018-04-17 15:58 ` [PATCH 4.9 01/66] media: v4l2-compat-ioctl32: dont oops on overlay Greg Kroah-Hartman
2018-04-17 15:58 ` [PATCH 4.9 02/66] parisc: Fix out of array access in match_pci_device() Greg Kroah-Hartman
2018-04-17 15:58 ` [PATCH 4.9 03/66] Drivers: hv: vmbus: do not mark HV_PCIE as perf_device Greg Kroah-Hartman
2018-04-17 15:58 ` [PATCH 4.9 04/66] perf intel-pt: Fix overlap detection to identify consecutive buffers correctly Greg Kroah-Hartman
2018-04-17 15:58 ` [PATCH 4.9 05/66] perf intel-pt: Fix sync_switch Greg Kroah-Hartman
2018-04-17 15:58 ` [PATCH 4.9 06/66] perf intel-pt: Fix error recovery from missing TIP packet Greg Kroah-Hartman
2018-04-17 15:58 ` [PATCH 4.9 07/66] perf intel-pt: Fix timestamp following overflow Greg Kroah-Hartman
2018-04-17 15:58 ` [PATCH 4.9 08/66] perf/core: Fix use-after-free in uprobe_perf_close() Greg Kroah-Hartman
2018-04-17 15:58 ` [PATCH 4.9 09/66] radeon: hide pointless #warning when compile testing Greg Kroah-Hartman
2018-04-17 15:58 ` [PATCH 4.9 10/66] arm64: barrier: Add CSDB macros to control data-value prediction Greg Kroah-Hartman
2018-04-17 15:58 ` [PATCH 4.9 11/66] arm64: Implement array_index_mask_nospec() Greg Kroah-Hartman
2018-04-17 15:58 ` [PATCH 4.9 12/66] arm64: move TASK_* definitions to <asm/processor.h> Greg Kroah-Hartman
2018-04-17 15:58 ` [PATCH 4.9 13/66] arm64: Make USER_DS an inclusive limit Greg Kroah-Hartman
2018-04-17 15:58 ` [PATCH 4.9 14/66] arm64: Use pointer masking to limit uaccess speculation Greg Kroah-Hartman
2018-04-17 15:58 ` [PATCH 4.9 15/66] arm64: entry: Ensure branch through syscall table is bounded under speculation Greg Kroah-Hartman
2018-04-17 15:58 ` [PATCH 4.9 16/66] arm64: uaccess: Prevent speculative use of the current addr_limit Greg Kroah-Hartman
2018-04-17 15:58 ` [PATCH 4.9 17/66] arm64: uaccess: Dont bother eliding access_ok checks in __{get, put}_user Greg Kroah-Hartman
2018-04-17 15:58 ` [PATCH 4.9 18/66] arm64: uaccess: Mask __user pointers for __arch_{clear, copy_*}_user Greg Kroah-Hartman
2018-04-17 15:58 ` [PATCH 4.9 19/66] arm64: cpufeature: __this_cpu_has_cap() shouldnt stop early Greg Kroah-Hartman
2018-04-17 15:58 ` [PATCH 4.9 20/66] arm64: Run enable method for errata work arounds on late CPUs Greg Kroah-Hartman
2018-04-17 15:58 ` [PATCH 4.9 21/66] arm64: cpufeature: Pass capability structure to ->enable callback Greg Kroah-Hartman
2018-04-17 15:58 ` [PATCH 4.9 22/66] drivers/firmware: Expose psci_get_version through psci_ops structure Greg Kroah-Hartman
2018-04-17 15:58 ` [PATCH 4.9 23/66] arm64: Factor out TTBR0_EL1 post-update workaround into a specific asm macro Greg Kroah-Hartman
2018-04-17 15:58 ` [PATCH 4.9 24/66] arm64: Move post_ttbr_update_workaround to C code Greg Kroah-Hartman
2018-04-17 15:58 ` [PATCH 4.9 25/66] arm64: Add skeleton to harden the branch predictor against aliasing attacks Greg Kroah-Hartman
2018-04-17 15:58 ` [PATCH 4.9 26/66] arm64: Move BP hardening to check_and_switch_context Greg Kroah-Hartman
2018-04-17 15:59 ` [PATCH 4.9 27/66] mm: Introduce lm_alias Greg Kroah-Hartman
2018-04-17 15:59 ` [PATCH 4.9 28/66] arm64: KVM: Use per-CPU vector when BP hardening is enabled Greg Kroah-Hartman
2018-04-17 15:59 ` [PATCH 4.9 29/66] arm64: entry: Apply BP hardening for high-priority synchronous exceptions Greg Kroah-Hartman
2018-04-17 15:59 ` [PATCH 4.9 30/66] arm64: entry: Apply BP hardening for suspicious interrupts from EL0 Greg Kroah-Hartman
2018-04-17 15:59 ` [PATCH 4.9 31/66] arm64: cputype: Add missing MIDR values for Cortex-A72 and Cortex-A75 Greg Kroah-Hartman
2018-04-17 15:59 ` [PATCH 4.9 32/66] arm64: cpu_errata: Allow an erratum to be match for all revisions of a core Greg Kroah-Hartman
2018-04-17 15:59 ` [PATCH 4.9 33/66] arm64: Implement branch predictor hardening for affected Cortex-A CPUs Greg Kroah-Hartman
2018-04-17 15:59 ` [PATCH 4.9 34/66] arm64: Branch predictor hardening for Cavium ThunderX2 Greg Kroah-Hartman
2018-04-17 15:59 ` [PATCH 4.9 35/66] arm64: KVM: Increment PC after handling an SMC trap Greg Kroah-Hartman
2018-04-17 15:59 ` [PATCH 4.9 36/66] arm/arm64: KVM: Consolidate the PSCI include files Greg Kroah-Hartman
2018-04-17 15:59 ` [PATCH 4.9 37/66] arm/arm64: KVM: Add PSCI_VERSION helper Greg Kroah-Hartman
2018-04-17 15:59 ` [PATCH 4.9 38/66] arm/arm64: KVM: Add smccc accessors to PSCI code Greg Kroah-Hartman
2018-04-17 15:59 ` [PATCH 4.9 39/66] arm/arm64: KVM: Implement PSCI 1.0 support Greg Kroah-Hartman
2018-04-17 15:59 ` [PATCH 4.9 40/66] arm/arm64: KVM: Advertise SMCCC v1.1 Greg Kroah-Hartman
2018-04-17 15:59 ` [PATCH 4.9 41/66] arm64: KVM: Make PSCI_VERSION a fast path Greg Kroah-Hartman
2018-04-17 15:59 ` [PATCH 4.9 42/66] arm/arm64: KVM: Turn kvm_psci_version into a static inline Greg Kroah-Hartman
2018-04-17 15:59 ` [PATCH 4.9 43/66] arm64: KVM: Report SMCCC_ARCH_WORKAROUND_1 BP hardening support Greg Kroah-Hartman
2018-04-17 15:59 ` [PATCH 4.9 44/66] arm64: KVM: Add SMCCC_ARCH_WORKAROUND_1 fast handling Greg Kroah-Hartman
2018-04-17 15:59 ` [PATCH 4.9 45/66] firmware/psci: Expose PSCI conduit Greg Kroah-Hartman
2018-04-17 15:59 ` [PATCH 4.9 46/66] firmware/psci: Expose SMCCC version through psci_ops Greg Kroah-Hartman
2018-04-17 15:59 ` [PATCH 4.9 47/66] arm/arm64: smccc: Make function identifiers an unsigned quantity Greg Kroah-Hartman
2018-04-17 15:59 ` [PATCH 4.9 48/66] arm/arm64: smccc: Implement SMCCC v1.1 inline primitive Greg Kroah-Hartman
2018-04-17 15:59 ` [PATCH 4.9 49/66] arm64: Add ARM_SMCCC_ARCH_WORKAROUND_1 BP hardening support Greg Kroah-Hartman
2018-04-17 15:59 ` [PATCH 4.9 50/66] arm64: Kill PSCI_GET_VERSION as a variant-2 workaround Greg Kroah-Hartman
2018-04-17 15:59 ` [PATCH 4.9 51/66] sunrpc: remove incorrect HMAC request initialization Greg Kroah-Hartman
2018-04-17 15:59 ` [PATCH 4.9 52/66] Revert "perf tests: Decompress kernel module before objdump" Greg Kroah-Hartman
2018-09-05 18:50   ` Florian Fainelli
2018-09-05 19:29     ` Greg Kroah-Hartman
2018-09-05 20:08       ` Florian Fainelli
2018-04-17 15:59 ` [PATCH 4.9 53/66] block/loop: fix deadlock after loop_set_status Greg Kroah-Hartman
2018-04-17 15:59 ` [PATCH 4.9 54/66] nfit: fix region registration vs block-data-window ranges Greg Kroah-Hartman
2018-04-17 15:59 ` [PATCH 4.9 55/66] s390/qdio: dont retry EQBS after CCQ 96 Greg Kroah-Hartman
2018-04-17 15:59 ` [PATCH 4.9 56/66] s390/qdio: dont merge ERROR output buffers Greg Kroah-Hartman
2018-04-17 15:59 ` [PATCH 4.9 57/66] s390/ipl: ensure loadparm valid flag is set Greg Kroah-Hartman
2018-04-17 15:59 ` [PATCH 4.9 58/66] getname_kernel() needs to make sure that ->name != ->iname in long case Greg Kroah-Hartman
2018-04-17 15:59 ` [PATCH 4.9 59/66] Bluetooth: Fix connection if directed advertising and privacy is used Greg Kroah-Hartman
2018-04-17 15:59 ` [PATCH 4.9 60/66] rtl8187: Fix NULL pointer dereference in priv->conf_mutex Greg Kroah-Hartman
2018-04-17 15:59 ` [PATCH 4.9 61/66] hwmon: (ina2xx) Fix access to uninitialized mutex Greg Kroah-Hartman
2018-04-17 15:59 ` [PATCH 4.9 62/66] cdc_ether: flag the Cinterion AHS8 modem by gemalto as WWAN Greg Kroah-Hartman
2018-04-17 15:59 ` [PATCH 4.9 63/66] rds: MP-RDS may use an invalid c_path Greg Kroah-Hartman
2018-04-17 15:59 ` [PATCH 4.9 64/66] slip: Check if rstate is initialized before uncompressing Greg Kroah-Hartman
2018-04-17 15:59 ` [PATCH 4.9 65/66] vhost: fix vhost_vq_access_ok() log check Greg Kroah-Hartman
2018-04-17 15:59 ` [PATCH 4.9 66/66] lan78xx: Correctly indicate invalid OTP Greg Kroah-Hartman
2018-04-17 21:04 ` [PATCH 4.9 00/66] 4.9.95-stable review Shuah Khan
2018-04-17 23:03 ` kernelci.org bot
2018-04-18 15:38 ` Guenter Roeck
2018-04-18 17:42 ` Dan Rue
2018-04-19  7:56   ` Greg Kroah-Hartman
2018-04-19 11:12     ` Naresh Kamboju
2018-04-19 12:09       ` Ben Hutchings
2018-04-19 12:30         ` Naresh Kamboju
2018-04-19 13:21           ` Dan Rue
2018-04-19 14:03       ` Greg Kroah-Hartman
2018-04-19 20:04         ` Dan Rue
2018-04-20  6:27           ` Greg Kroah-Hartman
