* [PATCH 4.14 000/159] 4.14.9-stable review
@ 2017-12-22  8:44 Greg Kroah-Hartman
  2017-12-22  8:44 ` [PATCH 4.14 001/159] x86/asm: Remove unnecessary \n\t in front of CC_SET() from asm templates Greg Kroah-Hartman
                   ` (165 more replies)
  0 siblings, 166 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:44 UTC
  To: linux-kernel
  Cc: Greg Kroah-Hartman, torvalds, akpm, linux, shuahkh, patches,
	ben.hutchings, lkft-triage, stable

This is the start of the stable review cycle for the 4.14.9 release.
There are 159 patches in this series; all will be posted as a response
to this one.  If anyone has any issues with these being applied, please
let me know.

Responses should be made by Sun Dec 24 08:45:36 UTC 2017.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:
	kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.9-rc1.gz
or in the git tree and branch at:
  git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y
and the diffstat can be found below.

thanks,

greg k-h

-------------
Pseudo-Shortlog of commits:

Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Linux 4.14.9-rc1

Peter Hutterer <peter.hutterer@who-t.net>
    platform/x86: asus-wireless: send an EV_SYN/SYN_REPORT between state changes

Daniel Lezcano <daniel.lezcano@linaro.org>
    thermal/drivers/hisi: Fix multiple alarm interrupts firing

Daniel Lezcano <daniel.lezcano@linaro.org>
    thermal/drivers/hisi: Simplify the temperature/step computation

Daniel Lezcano <daniel.lezcano@linaro.org>
    thermal/drivers/hisi: Fix kernel panic on alarm interrupt

Daniel Lezcano <daniel.lezcano@linaro.org>
    thermal/drivers/hisi: Fix missing interrupt enablement

Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
    IB/opa_vnic: Properly return the total MACs in UC MAC list

Scott Franco <safranco@intel.com>
    IB/opa_vnic: Properly clear Mac Table Digest

Eric Anholt <eric@anholt.net>
    drm/vc4: Avoid using vrefresh==0 mode in DSI htotal math.

Nicholas Piggin <npiggin@gmail.com>
    cpuidle: fix broadcast control when broadcast can not be entered

Alexandre Belloni <alexandre.belloni@free-electrons.com>
    rtc: set the alarm to the next expiring timer

Hoang Tran <tranviethoang.vn@gmail.com>
    tcp: fix under-evaluated ssthresh in TCP Vegas

Chen-Yu Tsai <wens@csie.org>
    clk: sunxi-ng: sun6i: Rename HDMI DDC clock to avoid name collision

Arvind Yadav <arvind.yadav.cs@gmail.com>
    staging: greybus: light: Release memory obtained by kasprintf

Wei Hu(Xavier) <xavier.huwei@huawei.com>
    RDMA/hns: Avoid NULL pointer exception

Mike Manning <mmanning@brocade.com>
    net: ipv6: send NS for DAD when link operationally up

Mick Tarsel <mjtarsel@linux.vnet.ibm.com>
    ibmvnic: Set state UP

Jacob Keller <jacob.e.keller@intel.com>
    fm10k: ensure we process SM mbx when processing VF mbx

Marek Szyprowski <m.szyprowski@samsung.com>
    ARM: exynos_defconfig: Enable UAS support for Odroid HC1 board

Alex Williamson <alex.williamson@redhat.com>
    vfio/pci: Virtualize Maximum Payload Size

Alan Brady <alan.brady@intel.com>
    i40e: fix client notify of VF reset

Dick Kennedy <dick.kennedy@broadcom.com>
    scsi: lpfc: Fix warning messages when NVME_TARGET_FC not defined

Dick Kennedy <dick.kennedy@broadcom.com>
    scsi: lpfc: PLOGI failures during NPIV testing

Dick Kennedy <dick.kennedy@broadcom.com>
    scsi: lpfc: Fix secure firmware updates

Jacob Keller <jacob.e.keller@intel.com>
    fm10k: fix mis-ordered parameters in declaration for .ndo_set_vf_bw

Nicolas Dechesne <nicolas.dechesne@linaro.org>
    ASoC: codecs: msm8916-wcd-analog: fix module autoload

Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
    sctp: silence warns on sctp_stream_init allocations

Nicholas Piggin <npiggin@gmail.com>
    powerpc/watchdog: Do not trigger SMP crash from touch_nmi_watchdog

Nicholas Piggin <npiggin@gmail.com>
    powerpc/xmon: Avoid tripping SMP hardlockup watchdog

Ed Blake <ed.blake@sondrel.com>
    ASoC: img-parallel-out: Add pm_runtime_get/put to set_fmt callback

Jean-François Têtu <jean-francois.tetu@savoirfairelinux.com>
    ASoC: codecs: msm8916-wcd-analog: fix micbias level

Tom Zanussi <tom.zanussi@linux.intel.com>
    tracing: Exclude 'generic fields' from histograms

Gabriele Paoloni <gabriele.paoloni@huawei.com>
    PCI/AER: Report non-fatal errors only to the affected endpoint

Jacob Keller <jacob.e.keller@intel.com>
    i40e/i40evf: spread CPU affinity hints across online CPUs only

Hans de Goede <hdegoede@redhat.com>
    Bluetooth: hci_bcm: Fix setting of irq trigger type

Hans de Goede <hdegoede@redhat.com>
    Bluetooth: hci_uart_set_flow_control: Fix NULL deref when using serdev

Andrew Jeffery <andrew@aj.id.au>
    leds: pca955x: Don't invert requested value in pca955x_gpio_set_value()

Wei Wang <weiwan@google.com>
    ipv6: grab rt->rt6i_ref before allocating pcpu rt

William Tu <u9012063@gmail.com>
    ip_gre: check packet length and mtu correctly in erspan tx

Guoqing Jiang <gqjiang@suse.com>
    md: always set THREAD_WAKEUP and wake up wqueue if thread existed

Luca Miccio <lucmiccio@gmail.com>
    block,bfq: Disable writeback throttling

Colin Ian King <colin.king@canonical.com>
    IB/rxe: check for allocation failure on elem

Emil Tantilov <emil.s.tantilov@intel.com>
    ixgbe: fix use of uninitialized padding

Lorenzo Bianconi <lorenzo.bianconi83@gmail.com>
    iio: st_sensors: add register mask for status register

Lihong Yang <lihong.yang@intel.com>
    i40e: use the safe hash table iterator when deleting mac filters

Christophe JAILLET <christophe.jaillet@wanadoo.fr>
    igb: check memory allocation failure

Fabio Estevam <fabio.estevam@nxp.com>
    PM / OPP: Move error message to debug level

Stuart Hayes <stuart.w.hayes@gmail.com>
    PCI: Create SR-IOV virtfn/physfn links before attaching driver

Sreekanth Reddy <sreekanth.reddy@broadcom.com>
    scsi: mpt3sas: Fix IO error occurs on pulling out a drive from RAID1 volume created on two SATA drive

Varun Prakash <varun@chelsio.com>
    scsi: cxgb4i: fix Tx skb leak

David Daney <david.daney@cavium.com>
    PCI: Avoid bus reset if bridge itself is broken

Dan Murphy <dmurphy@ti.com>
    net: phy: at803x: Change error to EINVAL for invalid MAC

Shakeel Butt <shakeelb@google.com>
    kvm, mm: account kvm related kmem slabs to kmemcg

Russell King <rmk+kernel@armlinux.org.uk>
    rtc: pl031: make interrupt optional

Christophe Jaillet <christophe.jaillet@wanadoo.fr>
    crypto: lrw - Fix an error handling path in 'create()'

Christian Lamparter <chunkeey@gmail.com>
    crypto: crypto4xx - increase context and scatter ring buffer elements

Chen-Yu Tsai <wens@csie.org>
    clk: sunxi-ng: sun5i: Fix bit offset of audio PLL post-divider

Chen-Yu Tsai <wens@csie.org>
    clk: sunxi-ng: nm: Check if requested rate is supported by fractional clock

Shashank Sharma <shashank.sharma@intel.com>
    drm: Add retries for lspcon mode detection

Derek Basehore <dbasehore@chromium.org>
    backlight: pwm_bl: Fix overflow condition

Jens Wiklander <jens.wiklander@linaro.org>
    optee: fix invalid of_node_put() in optee_driver_init()

Thomas Gleixner <tglx@linutronix.de>
    x86/cpufeatures: Make CPU bugs sticky

Thomas Gleixner <tglx@linutronix.de>
    x86/paravirt: Provide a way to check for hypervisors

Thomas Gleixner <tglx@linutronix.de>
    x86/paravirt: Dont patch flush_tlb_single

Andy Lutomirski <luto@kernel.org>
    x86/entry/64: Make cpu_entry_area.tss read-only

Andy Lutomirski <luto@kernel.org>
    x86/entry: Clean up the SYSENTER_stack code

Andy Lutomirski <luto@kernel.org>
    x86/entry/64: Remove the SYSENTER stack canary

Andy Lutomirski <luto@kernel.org>
    x86/entry/64: Move the IST stacks into struct cpu_entry_area

Andy Lutomirski <luto@kernel.org>
    x86/entry/64: Create a per-CPU SYSCALL entry trampoline

Andy Lutomirski <luto@kernel.org>
    x86/entry/64: Return to userspace from the trampoline stack

Andy Lutomirski <luto@kernel.org>
    x86/entry/64: Use a per-CPU trampoline stack for IDT entries

Andy Lutomirski <luto@kernel.org>
    x86/espfix/64: Stop assuming that pt_regs is on the entry stack

Andy Lutomirski <luto@kernel.org>
    x86/entry/64: Separate cpu_current_top_of_stack from TSS.sp0

Andy Lutomirski <luto@kernel.org>
    x86/entry: Remap the TSS into the CPU entry area

Andy Lutomirski <luto@kernel.org>
    x86/entry: Move SYSENTER_stack to the beginning of struct tss_struct

Andy Lutomirski <luto@kernel.org>
    x86/dumpstack: Handle stack overflow on all stacks

Andy Lutomirski <luto@kernel.org>
    x86/entry: Fix assumptions that the HW TSS is at the beginning of cpu_tss

Andy Lutomirski <luto@kernel.org>
    x86/kasan/64: Teach KASAN about the cpu_entry_area

Andy Lutomirski <luto@kernel.org>
    x86/mm/fixmap: Generalize the GDT fixmap mechanism, introduce struct cpu_entry_area

Andy Lutomirski <luto@kernel.org>
    x86/entry/gdt: Put per-CPU GDT remaps in ascending order

Andy Lutomirski <luto@kernel.org>
    x86/dumpstack: Add get_stack_info() support for the SYSENTER stack

Andy Lutomirski <luto@kernel.org>
    x86/entry/64: Allocate and enable the SYSENTER stack

Andy Lutomirski <luto@kernel.org>
    x86/irq/64: Print the offending IP in the stack overflow warning

Andy Lutomirski <luto@kernel.org>
    x86/irq: Remove an old outdated comment about context tracking races

Josh Poimboeuf <jpoimboe@redhat.com>
    x86/unwinder: Handle stack overflows more gracefully

Andy Lutomirski <luto@kernel.org>
    x86/unwinder/orc: Dont bail on stack overflow

Boris Ostrovsky <boris.ostrovsky@oracle.com>
    x86/entry/64/paravirt: Use paravirt-safe macro to access eflags

Andrey Ryabinin <aryabinin@virtuozzo.com>
    x86/mm/kasan: Don't use vmemmap_populate() to initialize shadow

Will Deacon <will.deacon@arm.com>
    locking/barriers: Convert users of lockless_dereference() to READ_ONCE()

Will Deacon <will.deacon@arm.com>
    locking/barriers: Add implicit smp_read_barrier_depends() to READ_ONCE()

Daniel Borkmann <daniel@iogearbox.net>
    bpf: fix build issues on um due to mising bpf_perf_event.h

Andi Kleen <ak@linux.intel.com>
    perf/x86: Enable free running PEBS for REGS_USER/INTR

Rudolf Marek <r.marek@assembler.cz>
    x86: Make X86_BUG_FXSAVE_LEAK detectable in CPUID on AMD

Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
    x86/cpufeature: Add User-Mode Instruction Prevention definitions

Ingo Molnar <mingo@kernel.org>
    drivers/misc/intel/pti: Rename the header file to free up the namespace

Juergen Gross <jgross@suse.com>
    x86/virt: Add enum for hypervisors to replace x86_hyper

Juergen Gross <jgross@suse.com>
    x86/virt, x86/platform: Merge 'struct x86_hyper' into 'struct x86_platform' and 'struct x86_init'

James Morse <james.morse@arm.com>
    ACPI / APEI: Replace ioremap_page_range() with fixmap

Andy Lutomirski <luto@kernel.org>
    selftests/x86/ldt_gdt: Run most existing LDT test cases against the GDT as well

Andy Lutomirski <luto@kernel.org>
    selftests/x86/ldt_gdt: Add infrastructure to test set_thread_area()

Ingo Molnar <mingo@kernel.org>
    x86/cpufeatures: Fix various details in the feature definitions

Ingo Molnar <mingo@kernel.org>
    x86/cpufeatures: Re-tabulate the X86_FEATURE definitions

Borislav Petkov <bp@suse.de>
    x86/mm: Define _PAGE_TABLE using _KERNPG_TABLE

Thomas Gleixner <tglx@linutronix.de>
    bitops: Revert cbe96375025e ("bitops: Add clear/set_bit32() to linux/bitops.h")

Thomas Gleixner <tglx@linutronix.de>
    x86/cpuid: Replace set/clear_bit32()

Borislav Petkov <bp@suse.de>
    x86/entry/64: Shorten TEST instructions

Andy Lutomirski <luto@kernel.org>
    x86/traps: Use a new on_thread_stack() helper to clean up an assertion

Andy Lutomirski <luto@kernel.org>
    x86/entry/64: Remove thread_struct::sp0

Andy Lutomirski <luto@kernel.org>
    x86/entry/32: Fix cpu_current_top_of_stack initialization at boot

Andy Lutomirski <luto@kernel.org>
    x86/entry/64: Remove all remaining direct thread_struct::sp0 reads

Andy Lutomirski <luto@kernel.org>
    x86/entry/64: Stop initializing TSS.sp0 at boot

Andy Lutomirski <luto@kernel.org>
    x86/xen/64, x86/entry/64: Clean up SP code in cpu_initialize_context()

Andy Lutomirski <luto@kernel.org>
    x86/entry: Add task_top_of_stack() to find the top of a task's stack

Andy Lutomirski <luto@kernel.org>
    x86/entry/64: Pass SP0 directly to load_sp0()

Andy Lutomirski <luto@kernel.org>
    x86/entry/32: Pull the MSR_IA32_SYSENTER_CS update code out of native_load_sp0()

Andy Lutomirski <luto@kernel.org>
    x86/entry/64: De-Xen-ify our NMI code

Juergen Gross <jgross@suse.com>
    xen, x86/entry/64: Add xen NMI trap entry

Andy Lutomirski <luto@kernel.org>
    x86/entry/64: Remove the RESTORE_..._REGS infrastructure

Andy Lutomirski <luto@kernel.org>
    x86/entry/64: Use POP instead of MOV to restore regs on NMI return

Andy Lutomirski <luto@kernel.org>
    x86/entry/64: Merge the fast and slow SYSRET paths

Andy Lutomirski <luto@kernel.org>
    x86/entry/64: Use pop instead of movq in syscall_return_via_sysret

Andy Lutomirski <luto@kernel.org>
    x86/entry/64: Shrink paranoid_exit_restore and make labels local

Andy Lutomirski <luto@kernel.org>
    x86/entry/64: Simplify reg restore code in the standard IRET paths

Andy Lutomirski <luto@kernel.org>
    x86/entry/64: Move SWAPGS into the common IRET-to-usermode path

Andy Lutomirski <luto@kernel.org>
    x86/entry/64: Split the IRET-to-user and IRET-to-kernel paths

Andy Lutomirski <luto@kernel.org>
    x86/entry/64: Remove the restore_c_regs_and_iret label

Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
    ptrace,x86: Make user_64bit_mode() available to 32-bit builds

Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
    x86/boot: Relocate definition of the initial state of CR0

Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
    x86/mm: Relocate page fault error codes to traps.h

Gayatri Kammela <gayatri.kammela@intel.com>
    x86/cpufeatures: Enable new SSE/AVX/AVX512 CPU features

Baoquan He <bhe@redhat.com>
    x86/mm/64: Rename the register_page_bootmem_memmap() 'size' parameter to 'nr_pages'

Masahiro Yamada <yamada.masahiro@socionext.com>
    x86/build: Beautify build log of syscall headers

Josh Poimboeuf <jpoimboe@redhat.com>
    x86/asm: Don't use the confusing '.ifeq' directive

Dongjiu Geng <gengdongjiu@huawei.com>
    ACPI / APEI: remove the unused dead-code for SEA/NMI notification type

Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
    x86/xen: Drop 5-level paging support code from the XEN_PV code

Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
    x86/xen: Provide pre-built page tables only for CONFIG_XEN_PV=y and CONFIG_XEN_PVH=y

Andrey Ryabinin <aryabinin@virtuozzo.com>
    x86/kasan: Use the same shadow offset for 4- and 5-level paging

Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
    mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y

Thomas Gleixner <tglx@linutronix.de>
    x86/cpuid: Prevent out of bound access in do_clear_cpu_cap()

Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
    objtool: Print top level commands on incorrect usage

Kees Cook <keescook@chromium.org>
    x86/platform/UV: Convert timers to use timer_setup()

Andi Kleen <ak@linux.intel.com>
    x86/fpu: Remove the explicit clearing of XSAVE dependent features

Andi Kleen <ak@linux.intel.com>
    x86/fpu: Make XSAVE check the base CPUID features before enabling

Andi Kleen <ak@linux.intel.com>
    x86/fpu: Parse clearcpuid= as early XSAVE argument

Andi Kleen <ak@linux.intel.com>
    x86/cpuid: Add generic table for CPUID dependencies

Andi Kleen <ak@linux.intel.com>
    bitops: Add clear/set_bit32() to linux/bitops.h

Josh Poimboeuf <jpoimboe@redhat.com>
    x86/unwind: Make CONFIG_UNWINDER_ORC=y the default in kconfig for 64-bit

Josh Poimboeuf <jpoimboe@redhat.com>
    x86/unwind: Rename unwinder config options to 'CONFIG_UNWINDER_*'

Steven Rostedt (VMware) <rostedt@goodmis.org>
    x86/fpu/debug: Remove unused 'x86_fpu_state' and 'x86_fpu_deactivate_state' tracepoints

Ingo Molnar <mingo@kernel.org>
    x86/unwinder: Make CONFIG_UNWINDER_ORC=y the default in the 64-bit defconfig

Jan Beulich <JBeulich@suse.com>
    ACPI / APEI: adjust a local variable type in ghes_ioremap_pfn_irq()

Josh Poimboeuf <jpoimboe@redhat.com>
    x86/head: Add unwind hint annotations

Josh Poimboeuf <jpoimboe@redhat.com>
    x86/xen: Add unwind hint annotations

Josh Poimboeuf <jpoimboe@redhat.com>
    x86/xen: Fix xen head ELF annotations

Josh Poimboeuf <jpoimboe@redhat.com>
    x86/boot: Annotate verify_cpu() as a callable function

Josh Poimboeuf <jpoimboe@redhat.com>
    x86/head: Fix head ELF function annotations

Josh Poimboeuf <jpoimboe@redhat.com>
    x86/head: Remove unused 'bad_address' code

Josh Poimboeuf <jpoimboe@redhat.com>
    x86/head: Remove confusing comment

Josh Poimboeuf <jpoimboe@redhat.com>
    objtool: Don't report end of section error after an empty unwind hint

Uros Bizjak <ubizjak@gmail.com>
    x86/asm: Remove unnecessary \n\t in front of CC_SET() from asm templates


-------------

Diffstat:

 Documentation/x86/orc-unwinder.txt                 |   2 +-
 Documentation/x86/x86_64/mm.txt                    |   2 +-
 Makefile                                           |   8 +-
 arch/arm/configs/exynos_defconfig                  |   2 +-
 arch/arm64/include/asm/fixmap.h                    |   7 +
 arch/powerpc/kernel/watchdog.c                     |   7 +-
 arch/powerpc/xmon/xmon.c                           |  17 +-
 arch/um/include/asm/Kbuild                         |   1 +
 arch/x86/Kconfig                                   |   5 +-
 arch/x86/Kconfig.debug                             |  39 +-
 arch/x86/configs/tiny.config                       |   4 +-
 arch/x86/configs/x86_64_defconfig                  |   1 +
 arch/x86/entry/calling.h                           |  69 +--
 arch/x86/entry/entry_32.S                          |   6 +-
 arch/x86/entry/entry_64.S                          | 322 +++++++++---
 arch/x86/entry/entry_64_compat.S                   |  10 +-
 arch/x86/entry/syscalls/Makefile                   |   4 +-
 arch/x86/events/core.c                             |   2 +-
 arch/x86/events/intel/core.c                       |   4 +
 arch/x86/events/perf_event.h                       |  24 +-
 arch/x86/hyperv/hv_init.c                          |   2 +-
 arch/x86/include/asm/archrandom.h                  |   8 +-
 arch/x86/include/asm/bitops.h                      |  10 +-
 arch/x86/include/asm/compat.h                      |   1 +
 arch/x86/include/asm/cpufeature.h                  |  11 +-
 arch/x86/include/asm/cpufeatures.h                 | 538 +++++++++++----------
 arch/x86/include/asm/desc.h                        |  11 +-
 arch/x86/include/asm/fixmap.h                      |  74 ++-
 arch/x86/include/asm/hypervisor.h                  |  53 +-
 arch/x86/include/asm/irqflags.h                    |   3 +
 arch/x86/include/asm/kdebug.h                      |   1 +
 arch/x86/include/asm/mmu_context.h                 |   4 +-
 arch/x86/include/asm/module.h                      |   2 +-
 arch/x86/include/asm/paravirt.h                    |  14 +-
 arch/x86/include/asm/paravirt_types.h              |   2 +-
 arch/x86/include/asm/percpu.h                      |   2 +-
 arch/x86/include/asm/pgtable_types.h               |   3 +-
 arch/x86/include/asm/processor.h                   | 109 +++--
 arch/x86/include/asm/ptrace.h                      |   6 +-
 arch/x86/include/asm/rmwcc.h                       |   2 +-
 arch/x86/include/asm/stacktrace.h                  |   3 +
 arch/x86/include/asm/switch_to.h                   |  26 +
 arch/x86/include/asm/thread_info.h                 |   2 +-
 arch/x86/include/asm/trace/fpu.h                   |  10 -
 arch/x86/include/asm/traps.h                       |  21 +-
 arch/x86/include/asm/unwind.h                      |  15 +-
 arch/x86/include/asm/x86_init.h                    |  24 +
 arch/x86/include/uapi/asm/processor-flags.h        |   3 +
 arch/x86/kernel/Makefile                           |  10 +-
 arch/x86/kernel/apic/apic.c                        |   2 +-
 arch/x86/kernel/apic/x2apic_uv_x.c                 |   5 +-
 arch/x86/kernel/asm-offsets.c                      |   6 +
 arch/x86/kernel/asm-offsets_32.c                   |   9 +-
 arch/x86/kernel/asm-offsets_64.c                   |   4 +
 arch/x86/kernel/cpu/Makefile                       |   1 +
 arch/x86/kernel/cpu/amd.c                          |   7 +-
 arch/x86/kernel/cpu/common.c                       | 195 +++++---
 arch/x86/kernel/cpu/cpuid-deps.c                   | 121 +++++
 arch/x86/kernel/cpu/hypervisor.c                   |  64 +--
 arch/x86/kernel/cpu/mshyperv.c                     |   6 +-
 arch/x86/kernel/cpu/vmware.c                       |   8 +-
 arch/x86/kernel/doublefault.c                      |  36 +-
 arch/x86/kernel/dumpstack.c                        |  74 ++-
 arch/x86/kernel/dumpstack_32.c                     |   6 +
 arch/x86/kernel/dumpstack_64.c                     |   6 +
 arch/x86/kernel/fpu/init.c                         |  11 +
 arch/x86/kernel/fpu/xstate.c                       |  43 +-
 arch/x86/kernel/head_32.S                          |   5 +-
 arch/x86/kernel/head_64.S                          |  45 +-
 arch/x86/kernel/ioport.c                           |   2 +-
 arch/x86/kernel/irq.c                              |  12 -
 arch/x86/kernel/irq_64.c                           |   4 +-
 arch/x86/kernel/kvm.c                              |   6 +-
 arch/x86/kernel/ldt.c                              |   2 +-
 arch/x86/kernel/paravirt_patch_64.c                |   2 -
 arch/x86/kernel/process.c                          |  27 +-
 arch/x86/kernel/process_32.c                       |   8 +-
 arch/x86/kernel/process_64.c                       |  19 +-
 arch/x86/kernel/smpboot.c                          |   3 +-
 arch/x86/kernel/traps.c                            |  72 +--
 arch/x86/kernel/unwind_orc.c                       |  88 ++--
 arch/x86/kernel/verify_cpu.S                       |   3 +-
 arch/x86/kernel/vm86_32.c                          |  20 +-
 arch/x86/kernel/vmlinux.lds.S                      |   9 +
 arch/x86/kernel/x86_init.c                         |   9 +
 arch/x86/kvm/mmu.c                                 |   4 +-
 arch/x86/kvm/vmx.c                                 |   2 +-
 arch/x86/lib/delay.c                               |   4 +-
 arch/x86/mm/fault.c                                |  88 ++--
 arch/x86/mm/init.c                                 |   2 +-
 arch/x86/mm/init_64.c                              |  10 +-
 arch/x86/mm/kasan_init_64.c                        | 262 ++++++++--
 arch/x86/power/cpu.c                               |  16 +-
 arch/x86/xen/enlighten_hvm.c                       |  12 +-
 arch/x86/xen/enlighten_pv.c                        |  15 +-
 arch/x86/xen/mmu_pv.c                              | 161 +++---
 arch/x86/xen/smp_pv.c                              |  17 +-
 arch/x86/xen/xen-asm_64.S                          |   2 +-
 arch/x86/xen/xen-head.S                            |  11 +-
 block/bfq-iosched.c                                |   3 +-
 block/blk-wbt.c                                    |   2 +-
 crypto/lrw.c                                       |   6 +-
 drivers/acpi/apei/ghes.c                           |  78 +--
 drivers/base/power/opp/core.c                      |   2 +-
 drivers/bluetooth/hci_bcm.c                        |  23 +-
 drivers/bluetooth/hci_ldisc.c                      |   7 +
 drivers/clk/sunxi-ng/ccu-sun5i.c                   |   4 +-
 drivers/clk/sunxi-ng/ccu-sun6i-a31.c               |   2 +-
 drivers/clk/sunxi-ng/ccu_nm.c                      |   3 +
 drivers/cpuidle/cpuidle.c                          |   1 +
 drivers/crypto/amcc/crypto4xx_core.h               |  10 +-
 drivers/gpu/drm/drm_dp_dual_mode_helper.c          |  16 +-
 drivers/gpu/drm/vc4/vc4_dsi.c                      |   3 +-
 drivers/hv/vmbus_drv.c                             |   2 +-
 drivers/iio/accel/st_accel_core.c                  |  35 +-
 drivers/iio/common/st_sensors/st_sensors_core.c    |   2 +-
 drivers/iio/common/st_sensors/st_sensors_trigger.c |  16 +-
 drivers/iio/gyro/st_gyro_core.c                    |  15 +-
 drivers/iio/magnetometer/st_magn_core.c            |  10 +-
 drivers/iio/pressure/st_pressure_core.c            |  15 +-
 drivers/infiniband/hw/hns/hns_roce_hw_v1.c         |   5 +
 drivers/infiniband/sw/rxe/rxe_pool.c               |   2 +
 drivers/infiniband/ulp/opa_vnic/opa_vnic_encap.c   |   1 +
 .../infiniband/ulp/opa_vnic/opa_vnic_vema_iface.c  |   8 +-
 drivers/input/mouse/vmmouse.c                      |  10 +-
 drivers/leds/leds-pca955x.c                        |  17 +-
 drivers/md/dm-mpath.c                              |  20 +-
 drivers/md/md.c                                    |   4 +-
 drivers/misc/pti.c                                 |   2 +-
 drivers/misc/vmw_balloon.c                         |   2 +-
 drivers/net/ethernet/ibm/ibmvnic.c                 |   2 +
 drivers/net/ethernet/intel/fm10k/fm10k.h           |   4 +-
 drivers/net/ethernet/intel/fm10k/fm10k_iov.c       |  12 +-
 drivers/net/ethernet/intel/i40e/i40e_main.c        |  16 +-
 drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c |   7 +-
 drivers/net/ethernet/intel/i40evf/i40evf_main.c    |   9 +-
 drivers/net/ethernet/intel/igb/igb_main.c          |   2 +
 drivers/net/ethernet/intel/ixgbe/ixgbe_common.c    |   4 +-
 drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c      |   2 +
 drivers/net/phy/at803x.c                           |   2 +-
 drivers/pci/iov.c                                  |   3 +-
 drivers/pci/pci.c                                  |   4 +
 drivers/pci/pcie/aer/aerdrv_core.c                 |   9 +-
 drivers/platform/x86/asus-wireless.c               |   1 +
 drivers/rtc/interface.c                            |   2 +-
 drivers/rtc/rtc-pl031.c                            |  14 +-
 drivers/scsi/cxgbi/cxgb4i/cxgb4i.c                 |   1 +
 drivers/scsi/lpfc/lpfc_hbadisc.c                   |   3 +-
 drivers/scsi/lpfc/lpfc_hw4.h                       |   2 +-
 drivers/scsi/lpfc/lpfc_nvmet.c                     |   2 +
 drivers/scsi/mpt3sas/mpt3sas_scsih.c               |   5 +
 drivers/staging/greybus/light.c                    |   2 +
 drivers/tee/optee/core.c                           |   1 -
 drivers/thermal/hisi_thermal.c                     |  74 ++-
 drivers/vfio/pci/vfio_pci_config.c                 |   6 +-
 drivers/video/backlight/pwm_bl.c                   |   7 +-
 fs/dcache.c                                        |   4 +-
 fs/overlayfs/ovl_entry.h                           |   2 +-
 fs/overlayfs/readdir.c                             |   2 +-
 include/asm-generic/vmlinux.lds.h                  |   2 +-
 include/linux/compiler.h                           |   1 +
 include/linux/hypervisor.h                         |   8 +-
 include/linux/iio/common/st_sensors.h              |   7 +-
 include/linux/{pti.h => intel-pti.h}               |   6 +-
 include/linux/mm.h                                 |   2 +-
 include/linux/mmzone.h                             |   6 +-
 include/linux/rculist.h                            |   4 +-
 include/linux/rcupdate.h                           |   4 +-
 kernel/events/core.c                               |   4 +-
 kernel/seccomp.c                                   |   2 +-
 kernel/task_work.c                                 |   2 +-
 kernel/trace/trace_events_hist.c                   |   4 +-
 lib/Kconfig.debug                                  |   2 +-
 mm/page_alloc.c                                    |  10 +
 mm/slab.h                                          |   2 +-
 mm/sparse.c                                        |  17 +-
 net/ipv4/ip_gre.c                                  |   8 +-
 net/ipv4/tcp_vegas.c                               |   2 +-
 net/ipv6/addrconf.c                                |  12 +-
 net/ipv6/route.c                                   |  58 +--
 net/sctp/stream.c                                  |   8 +-
 scripts/Makefile.build                             |   2 +-
 sound/soc/codecs/msm8916-wcd-analog.c              |   9 +-
 sound/soc/img/img-parallel-out.c                   |   2 +
 tools/objtool/check.c                              |   7 +-
 tools/objtool/objtool.c                            |   6 +-
 tools/testing/selftests/x86/ldt_gdt.c              |  61 ++-
 virt/kvm/kvm_main.c                                |   2 +-
 188 files changed, 2414 insertions(+), 1428 deletions(-)


* [PATCH 4.14 001/159] x86/asm: Remove unnecessary \n\t in front of CC_SET() from asm templates
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
@ 2017-12-22  8:44 ` Greg Kroah-Hartman
  2017-12-22  8:44 ` [PATCH 4.14 002/159] objtool: Dont report end of section error after an empty unwind hint Greg Kroah-Hartman
                   ` (164 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:44 UTC
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Uros Bizjak, Linus Torvalds,
	Peter Zijlstra, Thomas Gleixner, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Uros Bizjak <ubizjak@gmail.com>

commit 3c52b5c64326d9dcfee4e10611c53ec1b1b20675 upstream.

There is no need for \n\t in front of CC_SET(), as the macro already includes these two.

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170906151808.5634-1-ubizjak@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/include/asm/archrandom.h |    8 ++++----
 arch/x86/include/asm/bitops.h     |   10 +++++-----
 arch/x86/include/asm/percpu.h     |    2 +-
 arch/x86/include/asm/rmwcc.h      |    2 +-
 4 files changed, 11 insertions(+), 11 deletions(-)

--- a/arch/x86/include/asm/archrandom.h
+++ b/arch/x86/include/asm/archrandom.h
@@ -45,7 +45,7 @@ static inline bool rdrand_long(unsigned
 	bool ok;
 	unsigned int retry = RDRAND_RETRY_LOOPS;
 	do {
-		asm volatile(RDRAND_LONG "\n\t"
+		asm volatile(RDRAND_LONG
 			     CC_SET(c)
 			     : CC_OUT(c) (ok), "=a" (*v));
 		if (ok)
@@ -59,7 +59,7 @@ static inline bool rdrand_int(unsigned i
 	bool ok;
 	unsigned int retry = RDRAND_RETRY_LOOPS;
 	do {
-		asm volatile(RDRAND_INT "\n\t"
+		asm volatile(RDRAND_INT
 			     CC_SET(c)
 			     : CC_OUT(c) (ok), "=a" (*v));
 		if (ok)
@@ -71,7 +71,7 @@ static inline bool rdrand_int(unsigned i
 static inline bool rdseed_long(unsigned long *v)
 {
 	bool ok;
-	asm volatile(RDSEED_LONG "\n\t"
+	asm volatile(RDSEED_LONG
 		     CC_SET(c)
 		     : CC_OUT(c) (ok), "=a" (*v));
 	return ok;
@@ -80,7 +80,7 @@ static inline bool rdseed_long(unsigned
 static inline bool rdseed_int(unsigned int *v)
 {
 	bool ok;
-	asm volatile(RDSEED_INT "\n\t"
+	asm volatile(RDSEED_INT
 		     CC_SET(c)
 		     : CC_OUT(c) (ok), "=a" (*v));
 	return ok;
--- a/arch/x86/include/asm/bitops.h
+++ b/arch/x86/include/asm/bitops.h
@@ -143,7 +143,7 @@ static __always_inline void __clear_bit(
 static __always_inline bool clear_bit_unlock_is_negative_byte(long nr, volatile unsigned long *addr)
 {
 	bool negative;
-	asm volatile(LOCK_PREFIX "andb %2,%1\n\t"
+	asm volatile(LOCK_PREFIX "andb %2,%1"
 		CC_SET(s)
 		: CC_OUT(s) (negative), ADDR
 		: "ir" ((char) ~(1 << nr)) : "memory");
@@ -246,7 +246,7 @@ static __always_inline bool __test_and_s
 {
 	bool oldbit;
 
-	asm("bts %2,%1\n\t"
+	asm("bts %2,%1"
 	    CC_SET(c)
 	    : CC_OUT(c) (oldbit), ADDR
 	    : "Ir" (nr));
@@ -286,7 +286,7 @@ static __always_inline bool __test_and_c
 {
 	bool oldbit;
 
-	asm volatile("btr %2,%1\n\t"
+	asm volatile("btr %2,%1"
 		     CC_SET(c)
 		     : CC_OUT(c) (oldbit), ADDR
 		     : "Ir" (nr));
@@ -298,7 +298,7 @@ static __always_inline bool __test_and_c
 {
 	bool oldbit;
 
-	asm volatile("btc %2,%1\n\t"
+	asm volatile("btc %2,%1"
 		     CC_SET(c)
 		     : CC_OUT(c) (oldbit), ADDR
 		     : "Ir" (nr) : "memory");
@@ -329,7 +329,7 @@ static __always_inline bool variable_tes
 {
 	bool oldbit;
 
-	asm volatile("bt %2,%1\n\t"
+	asm volatile("bt %2,%1"
 		     CC_SET(c)
 		     : CC_OUT(c) (oldbit)
 		     : "m" (*(unsigned long *)addr), "Ir" (nr));
--- a/arch/x86/include/asm/percpu.h
+++ b/arch/x86/include/asm/percpu.h
@@ -526,7 +526,7 @@ static inline bool x86_this_cpu_variable
 {
 	bool oldbit;
 
-	asm volatile("bt "__percpu_arg(2)",%1\n\t"
+	asm volatile("bt "__percpu_arg(2)",%1"
 			CC_SET(c)
 			: CC_OUT(c) (oldbit)
 			: "m" (*(unsigned long __percpu *)addr), "Ir" (nr));
--- a/arch/x86/include/asm/rmwcc.h
+++ b/arch/x86/include/asm/rmwcc.h
@@ -29,7 +29,7 @@ cc_label:								\
 #define __GEN_RMWcc(fullop, var, cc, clobbers, ...)			\
 do {									\
 	bool c;								\
-	asm volatile (fullop ";" CC_SET(cc)				\
+	asm volatile (fullop CC_SET(cc)					\
 			: [counter] "+m" (var), CC_OUT(cc) (c)		\
 			: __VA_ARGS__ : clobbers);			\
 	return c;							\


* [PATCH 4.14 002/159] objtool: Don't report end of section error after an empty unwind hint
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
  2017-12-22  8:44 ` [PATCH 4.14 001/159] x86/asm: Remove unnecessary \n\t in front of CC_SET() from asm templates Greg Kroah-Hartman
@ 2017-12-22  8:44 ` Greg Kroah-Hartman
  2017-12-22  8:44 ` [PATCH 4.14 003/159] x86/head: Remove confusing comment Greg Kroah-Hartman
                   ` (163 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:44 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Josh Poimboeuf, Andy Lutomirski,
	Boris Ostrovsky, Jiri Slaby, Juergen Gross, Linus Torvalds,
	Peter Zijlstra, Thomas Gleixner, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Josh Poimboeuf <jpoimboe@redhat.com>

commit 00d96180dc38ef872ac471c2d3e14b067cbd895d upstream.

If asm code specifies an UNWIND_HINT_EMPTY hint, don't warn if the
section ends unexpectedly.  This can happen with the xen-head.S code
because the hypercall_page is "text" but it's all zeros.
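
For illustration only (not part of the patch): a minimal, self-contained
sketch of the decision described above. The type and function names
(insn_sketch, falls_off_section, CFA_BASE_UNDEFINED) are hypothetical
stand-ins, not objtool's real definitions, and the CFA state is stored
per instruction here as a simplification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for objtool's types; not the real definitions. */
enum cfa_base { CFA_BASE_UNDEFINED, CFA_BASE_SP };

struct insn_sketch {
	bool dead_end;                  /* e.g. ret or an unconditional jump */
	enum cfa_base cfa_base;         /* UNDEFINED models an empty unwind hint */
	const struct insn_sketch *next; /* NULL at the end of the section */
};

/* Returns 1 if "unexpected end of section" should be reported, else 0. */
int falls_off_section(const struct insn_sketch *insn)
{
	assert(insn != NULL);

	for (;;) {
		if (insn->dead_end)
			return 0;

		if (!insn->next) {
			/* Ran out of instructions: stay quiet after an empty hint. */
			return insn->cfa_base != CFA_BASE_UNDEFINED;
		}

		insn = insn->next;
	}
}
```

In this sketch a block that runs off the end of its section is only
reported when unwind state is defined, which models the all-zeros
hypercall_page case.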

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/ddafe199dd8797e40e3c2777373347eba1d65572.1505764066.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 tools/objtool/check.c |    7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -1757,11 +1757,14 @@ static int validate_branch(struct objtoo
 		if (insn->dead_end)
 			return 0;
 
-		insn = next_insn;
-		if (!insn) {
+		if (!next_insn) {
+			if (state.cfa.base == CFI_UNDEFINED)
+				return 0;
 			WARN("%s: unexpected end of section", sec->name);
 			return 1;
 		}
+
+		insn = next_insn;
 	}
 
 	return 0;


* [PATCH 4.14 003/159] x86/head: Remove confusing comment
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
  2017-12-22  8:44 ` [PATCH 4.14 001/159] x86/asm: Remove unnecessary \n\t in front of CC_SET() from asm templates Greg Kroah-Hartman
  2017-12-22  8:44 ` [PATCH 4.14 002/159] objtool: Don't report end of section error after an empty unwind hint Greg Kroah-Hartman
@ 2017-12-22  8:44 ` Greg Kroah-Hartman
  2017-12-22  8:44 ` [PATCH 4.14 004/159] x86/head: Remove unused bad_address code Greg Kroah-Hartman
                   ` (162 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:44 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Josh Poimboeuf, Andy Lutomirski,
	Boris Ostrovsky, Jiri Slaby, Juergen Gross, Linus Torvalds,
	Peter Zijlstra, Thomas Gleixner, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Josh Poimboeuf <jpoimboe@redhat.com>

commit 17270717e80de33a884ad328fea5f407d87f6d6a upstream.

This comment is actively wrong and confusing.  It refers to the
registers' stack offsets after the pt_regs has been constructed on the
stack, but this code is *before* that.

At this point the stack just has the standard iret frame, for which no
comment should be needed.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/a3c267b770fc56c9b86df9c11c552848248aace2.1505764066.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/kernel/head_64.S |    4 ----
 1 file changed, 4 deletions(-)

--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -271,10 +271,6 @@ bad_address:
 
 	__INIT
 ENTRY(early_idt_handler_array)
-	# 104(%rsp) %rflags
-	#  96(%rsp) %cs
-	#  88(%rsp) %rip
-	#  80(%rsp) error code
 	i = 0
 	.rept NUM_EXCEPTION_VECTORS
 	.ifeq (EXCEPTION_ERRCODE_MASK >> i) & 1


* [PATCH 4.14 004/159] x86/head: Remove unused bad_address code
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (2 preceding siblings ...)
  2017-12-22  8:44 ` [PATCH 4.14 003/159] x86/head: Remove confusing comment Greg Kroah-Hartman
@ 2017-12-22  8:44 ` Greg Kroah-Hartman
  2017-12-22  8:44 ` [PATCH 4.14 005/159] x86/head: Fix head ELF function annotations Greg Kroah-Hartman
                   ` (161 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:44 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Josh Poimboeuf, Andy Lutomirski,
	Boris Ostrovsky, Jiri Slaby, Juergen Gross, Linus Torvalds,
	Peter Zijlstra, Thomas Gleixner, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Josh Poimboeuf <jpoimboe@redhat.com>

commit a8b88e84d124bc92c4808e72b8b8c0e0bb538630 upstream.

It's no longer possible for this code to be executed, so remove it.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/32a46fe92d2083700599b36872b26e7dfd7b7965.1505764066.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/kernel/head_64.S |    3 ---
 1 file changed, 3 deletions(-)

--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -266,9 +266,6 @@ ENDPROC(start_cpu0)
 	.quad  init_thread_union + THREAD_SIZE - SIZEOF_PTREGS
 	__FINITDATA
 
-bad_address:
-	jmp bad_address
-
 	__INIT
 ENTRY(early_idt_handler_array)
 	i = 0


* [PATCH 4.14 005/159] x86/head: Fix head ELF function annotations
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (3 preceding siblings ...)
  2017-12-22  8:44 ` [PATCH 4.14 004/159] x86/head: Remove unused bad_address code Greg Kroah-Hartman
@ 2017-12-22  8:44 ` Greg Kroah-Hartman
  2017-12-22  8:44 ` [PATCH 4.14 006/159] x86/boot: Annotate verify_cpu() as a callable function Greg Kroah-Hartman
                   ` (160 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:44 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Josh Poimboeuf, Andy Lutomirski,
	Boris Ostrovsky, Jiri Slaby, Juergen Gross, Linus Torvalds,
	Peter Zijlstra, Thomas Gleixner, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Josh Poimboeuf <jpoimboe@redhat.com>

commit 015a2ea5478680fc5216d56b7ff306f2a74efaf9 upstream.

These functions aren't callable C-type functions, so don't annotate them
as such.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/36eb182738c28514f8bf95e403d89b6413a88883.1505764066.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/kernel/head_64.S |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -235,7 +235,7 @@ ENTRY(secondary_startup_64)
 	pushq	%rax		# target address in negative space
 	lretq
 .Lafter_lret:
-ENDPROC(secondary_startup_64)
+END(secondary_startup_64)
 
 #include "verify_cpu.S"
 
@@ -278,7 +278,7 @@ ENTRY(early_idt_handler_array)
 	i = i + 1
 	.fill early_idt_handler_array + i*EARLY_IDT_HANDLER_SIZE - ., 1, 0xcc
 	.endr
-ENDPROC(early_idt_handler_array)
+END(early_idt_handler_array)
 
 early_idt_handler_common:
 	/*
@@ -321,7 +321,7 @@ early_idt_handler_common:
 20:
 	decl early_recursion_flag(%rip)
 	jmp restore_regs_and_iret
-ENDPROC(early_idt_handler_common)
+END(early_idt_handler_common)
 
 	__INITDATA
 


* [PATCH 4.14 006/159] x86/boot: Annotate verify_cpu() as a callable function
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (4 preceding siblings ...)
  2017-12-22  8:44 ` [PATCH 4.14 005/159] x86/head: Fix head ELF function annotations Greg Kroah-Hartman
@ 2017-12-22  8:44 ` Greg Kroah-Hartman
  2017-12-22  8:44 ` [PATCH 4.14 007/159] x86/xen: Fix xen head ELF annotations Greg Kroah-Hartman
                   ` (159 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:44 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Josh Poimboeuf, Andy Lutomirski,
	Boris Ostrovsky, Jiri Slaby, Juergen Gross, Linus Torvalds,
	Peter Zijlstra, Thomas Gleixner, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Josh Poimboeuf <jpoimboe@redhat.com>

commit e93db75a0054b23a874a12c63376753544f3fe9e upstream.

verify_cpu() is a callable function.  Annotate it as such.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/293024b8a080832075312f38c07ccc970fc70292.1505764066.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/kernel/verify_cpu.S |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/arch/x86/kernel/verify_cpu.S
+++ b/arch/x86/kernel/verify_cpu.S
@@ -33,7 +33,7 @@
 #include <asm/cpufeatures.h>
 #include <asm/msr-index.h>
 
-verify_cpu:
+ENTRY(verify_cpu)
 	pushf				# Save caller passed flags
 	push	$0			# Kill any dangerous flags
 	popf
@@ -139,3 +139,4 @@ verify_cpu:
 	popf				# Restore caller passed flags
 	xorl %eax, %eax
 	ret
+ENDPROC(verify_cpu)


* [PATCH 4.14 007/159] x86/xen: Fix xen head ELF annotations
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (5 preceding siblings ...)
  2017-12-22  8:44 ` [PATCH 4.14 006/159] x86/boot: Annotate verify_cpu() as a callable function Greg Kroah-Hartman
@ 2017-12-22  8:44 ` Greg Kroah-Hartman
  2017-12-22  8:44 ` [PATCH 4.14 008/159] x86/xen: Add unwind hint annotations Greg Kroah-Hartman
                   ` (158 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:44 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Josh Poimboeuf, Andy Lutomirski,
	Boris Ostrovsky, Jiri Slaby, Juergen Gross, Linus Torvalds,
	Peter Zijlstra, Thomas Gleixner, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Josh Poimboeuf <jpoimboe@redhat.com>

commit 2582d3df95c76d3b686453baf90b64d57e87d1e8 upstream.

Mark the ends of the startup_xen and hypercall_page code sections.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/3a80a394d30af43d9cefa1a29628c45ed8420c97.1505764066.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/xen/xen-head.S |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/arch/x86/xen/xen-head.S
+++ b/arch/x86/xen/xen-head.S
@@ -34,7 +34,7 @@ ENTRY(startup_xen)
 	mov $init_thread_union+THREAD_SIZE, %_ASM_SP
 
 	jmp xen_start_kernel
-
+END(startup_xen)
 	__FINIT
 #endif
 
@@ -48,7 +48,7 @@ ENTRY(hypercall_page)
 	.type xen_hypercall_##n, @function; .size xen_hypercall_##n, 32
 #include <asm/xen-hypercalls.h>
 #undef HYPERCALL
-
+END(hypercall_page)
 .popsection
 
 	ELFNOTE(Xen, XEN_ELFNOTE_GUEST_OS,       .asciz "linux")


* [PATCH 4.14 008/159] x86/xen: Add unwind hint annotations
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (6 preceding siblings ...)
  2017-12-22  8:44 ` [PATCH 4.14 007/159] x86/xen: Fix xen head ELF annotations Greg Kroah-Hartman
@ 2017-12-22  8:44 ` Greg Kroah-Hartman
  2017-12-22  8:44 ` [PATCH 4.14 009/159] x86/head: " Greg Kroah-Hartman
                   ` (157 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:44 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Josh Poimboeuf, Andy Lutomirski,
	Boris Ostrovsky, Jiri Slaby, Juergen Gross, Linus Torvalds,
	Peter Zijlstra, Thomas Gleixner, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Josh Poimboeuf <jpoimboe@redhat.com>

commit abbe1cac6214d81d2f4e149aba64a8760703144e upstream.

Add unwind hint annotations to the xen head code so the ORC unwinder can
read head_64.o.

hypercall_page needs empty annotations at 32-byte intervals to match the
'xen_hypercall_*' ELF functions at those locations.
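
For illustration only (not part of the patch): a small sketch of the
layout arithmetic, assuming 4 KiB pages. The names SKETCH_PAGE_SIZE,
HYPERCALL_SLOT_SIZE and the helper functions are hypothetical and exist
only for this example.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE    4096u /* assumption: 4 KiB pages */
#define HYPERCALL_SLOT_SIZE 32u   /* each xen_hypercall_* stub spans 32 bytes */

/* Number of 32-byte slots, and therefore of empty hints, in the page. */
static inline size_t hypercall_slot_count(void)
{
	return SKETCH_PAGE_SIZE / HYPERCALL_SLOT_SIZE;
}

/* Byte offset of hypercall number nr within hypercall_page. */
size_t hypercall_slot_offset(unsigned int nr)
{
	assert(nr < hypercall_slot_count());
	return (size_t)nr * HYPERCALL_SLOT_SIZE;
}

/* True if an empty unwind hint is emitted at exactly this page offset. */
int offset_has_hint(size_t offset)
{
	return offset % HYPERCALL_SLOT_SIZE == 0;
}
```

Under these assumptions the page holds 128 slots, and every slot start
returned by hypercall_slot_offset() coincides with a hint, so each
'xen_hypercall_*' symbol begins at an annotated address.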

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/70ed2eb516fe9266be766d953f93c2571bca88cc.1505764066.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/xen/xen-head.S |    7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

--- a/arch/x86/xen/xen-head.S
+++ b/arch/x86/xen/xen-head.S
@@ -10,6 +10,7 @@
 #include <asm/boot.h>
 #include <asm/asm.h>
 #include <asm/page_types.h>
+#include <asm/unwind_hints.h>
 
 #include <xen/interface/elfnote.h>
 #include <xen/interface/features.h>
@@ -20,6 +21,7 @@
 #ifdef CONFIG_XEN_PV
 	__INIT
 ENTRY(startup_xen)
+	UNWIND_HINT_EMPTY
 	cld
 
 	/* Clear .bss */
@@ -41,7 +43,10 @@ END(startup_xen)
 .pushsection .text
 	.balign PAGE_SIZE
 ENTRY(hypercall_page)
-	.skip PAGE_SIZE
+	.rept (PAGE_SIZE / 32)
+		UNWIND_HINT_EMPTY
+		.skip 32
+	.endr
 
 #define HYPERCALL(n) \
 	.equ xen_hypercall_##n, hypercall_page + __HYPERVISOR_##n * 32; \


* [PATCH 4.14 009/159] x86/head: Add unwind hint annotations
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (7 preceding siblings ...)
  2017-12-22  8:44 ` [PATCH 4.14 008/159] x86/xen: Add unwind hint annotations Greg Kroah-Hartman
@ 2017-12-22  8:44 ` Greg Kroah-Hartman
  2017-12-22  8:44 ` [PATCH 4.14 010/159] ACPI / APEI: adjust a local variable type in ghes_ioremap_pfn_irq() Greg Kroah-Hartman
                   ` (156 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:44 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Jiri Slaby, Josh Poimboeuf,
	Andy Lutomirski, Boris Ostrovsky, Juergen Gross, Linus Torvalds,
	Peter Zijlstra, Thomas Gleixner, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Josh Poimboeuf <jpoimboe@redhat.com>

commit 2704fbb672d0d9a19414907fda7949283dcef6a1 upstream.

Jiri Slaby reported an ORC issue when unwinding from an idle task.  The
stack was:

    ffffffff811083c2 do_idle+0x142/0x1e0
    ffffffff8110861d cpu_startup_entry+0x5d/0x60
    ffffffff82715f58 start_kernel+0x3ff/0x407
    ffffffff827153e8 x86_64_start_kernel+0x14e/0x15d
    ffffffff810001bf secondary_startup_64+0x9f/0xa0

The ORC unwinder errored out at secondary_startup_64 because the head
code isn't annotated yet so there wasn't a corresponding ORC entry.

Fix that and any other head-related unwinding issues by adding unwind
hints to the head code.
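
For illustration only (not part of the patch): the early_idt_handler_array
hunk in the diff picks a different hint per vector, depending on whether
the CPU has already pushed an error code. The sketch below models that
choice. The mask is built from the vectors that architecturally push an
error code (#DF, #TS, #NP, #SS, #GP, #PF, #AC); that it matches the
kernel's EXCEPTION_ERRCODE_MASK is an assumption, and all names are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_NUM_VECTORS 32u

/* Assumed set of vectors for which the CPU pushes an error code. */
#define SKETCH_ERRCODE_MASK \
	((1u << 8) | (1u << 10) | (1u << 11) | (1u << 12) | \
	 (1u << 13) | (1u << 14) | (1u << 17))

/* True if the hardware pushed an error code before the stub runs. */
bool vector_has_errcode(unsigned int vector)
{
	assert(vector < SKETCH_NUM_VECTORS);
	return (SKETCH_ERRCODE_MASK >> vector) & 1u;
}

/*
 * Offset from the stack pointer to the iret frame at stub entry:
 * 8 bytes when an error code sits on top of it, 0 otherwise (the stub
 * then pushes a dummy 0 to make the frame uniform).
 */
unsigned int stub_entry_hint_offset(unsigned int vector)
{
	return vector_has_errcode(vector) ? 8u : 0u;
}
```

For example, vector 14 (page fault) would get the offset=8 variant,
while vector 3 (breakpoint) would get the plain iret-frame hint followed
by the dummy push.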

Reported-by: Jiri Slaby <jslaby@suse.cz>
Tested-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/78ef000a2f68f545d6eef44ee912edceaad82ccf.1505764066.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/kernel/Makefile  |    1 -
 arch/x86/kernel/head_64.S |   14 ++++++++++++--
 2 files changed, 12 insertions(+), 3 deletions(-)

--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -27,7 +27,6 @@ KASAN_SANITIZE_dumpstack.o				:= n
 KASAN_SANITIZE_dumpstack_$(BITS).o			:= n
 KASAN_SANITIZE_stacktrace.o := n
 
-OBJECT_FILES_NON_STANDARD_head_$(BITS).o		:= y
 OBJECT_FILES_NON_STANDARD_relocate_kernel_$(BITS).o	:= y
 OBJECT_FILES_NON_STANDARD_ftrace_$(BITS).o		:= y
 OBJECT_FILES_NON_STANDARD_test_nx.o			:= y
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -50,6 +50,7 @@ L3_START_KERNEL = pud_index(__START_KERN
 	.code64
 	.globl startup_64
 startup_64:
+	UNWIND_HINT_EMPTY
 	/*
 	 * At this point the CPU runs in 64bit mode CS.L = 1 CS.D = 0,
 	 * and someone has loaded an identity mapped page table
@@ -89,6 +90,7 @@ startup_64:
 	addq	$(early_top_pgt - __START_KERNEL_map), %rax
 	jmp 1f
 ENTRY(secondary_startup_64)
+	UNWIND_HINT_EMPTY
 	/*
 	 * At this point the CPU runs in 64bit mode CS.L = 1 CS.D = 0,
 	 * and someone has loaded a mapped page table.
@@ -133,6 +135,7 @@ ENTRY(secondary_startup_64)
 	movq	$1f, %rax
 	jmp	*%rax
 1:
+	UNWIND_HINT_EMPTY
 
 	/* Check if nx is implemented */
 	movl	$0x80000001, %eax
@@ -247,6 +250,7 @@ END(secondary_startup_64)
  */
 ENTRY(start_cpu0)
 	movq	initial_stack(%rip), %rsp
+	UNWIND_HINT_EMPTY
 	jmp	.Ljump_to_C_code
 ENDPROC(start_cpu0)
 #endif
@@ -271,13 +275,18 @@ ENTRY(early_idt_handler_array)
 	i = 0
 	.rept NUM_EXCEPTION_VECTORS
 	.ifeq (EXCEPTION_ERRCODE_MASK >> i) & 1
-	pushq $0		# Dummy error code, to make stack frame uniform
+		UNWIND_HINT_IRET_REGS
+		pushq $0	# Dummy error code, to make stack frame uniform
+	.else
+		UNWIND_HINT_IRET_REGS offset=8
 	.endif
 	pushq $i		# 72(%rsp) Vector number
 	jmp early_idt_handler_common
+	UNWIND_HINT_IRET_REGS
 	i = i + 1
 	.fill early_idt_handler_array + i*EARLY_IDT_HANDLER_SIZE - ., 1, 0xcc
 	.endr
+	UNWIND_HINT_IRET_REGS offset=16
 END(early_idt_handler_array)
 
 early_idt_handler_common:
@@ -306,6 +315,7 @@ early_idt_handler_common:
 	pushq %r13				/* pt_regs->r13 */
 	pushq %r14				/* pt_regs->r14 */
 	pushq %r15				/* pt_regs->r15 */
+	UNWIND_HINT_REGS
 
 	cmpq $14,%rsi		/* Page fault? */
 	jnz 10f
@@ -428,7 +438,7 @@ ENTRY(phys_base)
 EXPORT_SYMBOL(phys_base)
 
 #include "../../x86/xen/xen-head.S"
-	
+
 	__PAGE_ALIGNED_BSS
 NEXT_PAGE(empty_zero_page)
 	.skip PAGE_SIZE


* [PATCH 4.14 010/159] ACPI / APEI: adjust a local variable type in ghes_ioremap_pfn_irq()
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (8 preceding siblings ...)
  2017-12-22  8:44 ` [PATCH 4.14 009/159] x86/head: " Greg Kroah-Hartman
@ 2017-12-22  8:44 ` Greg Kroah-Hartman
  2017-12-22  8:44 ` [PATCH 4.14 011/159] x86/unwinder: Make CONFIG_UNWINDER_ORC=y the default in the 64-bit defconfig Greg Kroah-Hartman
                   ` (155 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:44 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Jan Beulich, Borislav Petkov,
	Rafael J. Wysocki

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Jan Beulich <JBeulich@suse.com>

commit 095f613c6b386a1704b73a549e9ba66c1d5381ae upstream.

Match up with what 7edda0886b ("acpi: apei: handle SEA notification
type for ARMv8") did for ghes_ioremap_pfn_nmi().
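
For illustration only (not part of the patch): the message does not
spell out why the type matters. One plausible reading, stated here as an
assumption, is that on configurations where physical addresses are wider
than unsigned long, shifting a page frame number into an unsigned long
loses the high bits. The sketch models that with fixed-width types; the
names and the 12-bit page shift are hypothetical.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Models a 32-bit unsigned long: the upper address bits are dropped. */
uint32_t paddr_narrow(uint64_t pfn)
{
	return (uint32_t)(pfn << SKETCH_PAGE_SHIFT);
}

/* Models a 64-bit phys_addr_t: the full physical address is kept. */
uint64_t paddr_wide(uint64_t pfn)
{
	return pfn << SKETCH_PAGE_SHIFT;
}
```

With pfn 0x100000 (the frame at 4 GiB), the narrow variant yields 0
while the wide variant yields 0x100000000.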

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/acpi/apei/ghes.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -174,7 +174,8 @@ static void __iomem *ghes_ioremap_pfn_nm
 
 static void __iomem *ghes_ioremap_pfn_irq(u64 pfn)
 {
-	unsigned long vaddr, paddr;
+	unsigned long vaddr;
+	phys_addr_t paddr;
 	pgprot_t prot;
 
 	vaddr = (unsigned long)GHES_IOREMAP_IRQ_PAGE(ghes_ioremap_area->addr);


* [PATCH 4.14 011/159] x86/unwinder: Make CONFIG_UNWINDER_ORC=y the default in the 64-bit defconfig
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (9 preceding siblings ...)
  2017-12-22  8:44 ` [PATCH 4.14 010/159] ACPI / APEI: adjust a local variable type in ghes_ioremap_pfn_irq() Greg Kroah-Hartman
@ 2017-12-22  8:44 ` Greg Kroah-Hartman
  2017-12-22  8:44 ` [PATCH 4.14 012/159] x86/fpu/debug: Remove unused x86_fpu_state and x86_fpu_deactivate_state tracepoints Greg Kroah-Hartman
                   ` (154 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:44 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Josh Poimboeuf, Linus Torvalds,
	Peter Zijlstra, Thomas Gleixner, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Ingo Molnar <mingo@kernel.org>

commit 1e4078f0bba46ad61b69548abe6a6faf63b89380 upstream.

Increase testing coverage by turning on the primary x86 unwinder for
the 64-bit defconfig.

Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/configs/x86_64_defconfig |    1 +
 1 file changed, 1 insertion(+)

--- a/arch/x86/configs/x86_64_defconfig
+++ b/arch/x86/configs/x86_64_defconfig
@@ -299,6 +299,7 @@ CONFIG_DEBUG_STACKOVERFLOW=y
 # CONFIG_DEBUG_RODATA_TEST is not set
 CONFIG_DEBUG_BOOT_PARAMS=y
 CONFIG_OPTIMIZE_INLINING=y
+CONFIG_ORC_UNWINDER=y
 CONFIG_SECURITY=y
 CONFIG_SECURITY_NETWORK=y
 CONFIG_SECURITY_SELINUX=y


* [PATCH 4.14 012/159] x86/fpu/debug: Remove unused x86_fpu_state and x86_fpu_deactivate_state tracepoints
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (10 preceding siblings ...)
  2017-12-22  8:44 ` [PATCH 4.14 011/159] x86/unwinder: Make CONFIG_UNWINDER_ORC=y the default in the 64-bit defconfig Greg Kroah-Hartman
@ 2017-12-22  8:44 ` Greg Kroah-Hartman
  2017-12-22  8:44 ` [PATCH 4.14 013/159] x86/unwind: Rename unwinder config options to CONFIG_UNWINDER_* Greg Kroah-Hartman
                   ` (153 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:44 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Steven Rostedt (VMware),
	Dave Hansen, Linus Torvalds, Peter Zijlstra, Thomas Gleixner,
	Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Steven Rostedt (VMware) <rostedt@goodmis.org>

commit 127a1bea40f7f2a36bc7207ea4d51bb6b4e936fa upstream.

Commit:

  d1898b733619 ("x86/fpu: Add tracepoints to dump FPU state at key points")

... added the 'x86_fpu_state' and 'x86_fpu_deactivate_state' trace points,
but never used them. Today they are still not used. As they take up
and waste memory, remove them.

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20171012180619.670b68b6@gandalf.local.home
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/include/asm/trace/fpu.h |   10 ----------
 1 file changed, 10 deletions(-)

--- a/arch/x86/include/asm/trace/fpu.h
+++ b/arch/x86/include/asm/trace/fpu.h
@@ -34,11 +34,6 @@ DECLARE_EVENT_CLASS(x86_fpu,
 	)
 );
 
-DEFINE_EVENT(x86_fpu, x86_fpu_state,
-	TP_PROTO(struct fpu *fpu),
-	TP_ARGS(fpu)
-);
-
 DEFINE_EVENT(x86_fpu, x86_fpu_before_save,
 	TP_PROTO(struct fpu *fpu),
 	TP_ARGS(fpu)
@@ -73,11 +68,6 @@ DEFINE_EVENT(x86_fpu, x86_fpu_activate_s
 	TP_PROTO(struct fpu *fpu),
 	TP_ARGS(fpu)
 );
-
-DEFINE_EVENT(x86_fpu, x86_fpu_deactivate_state,
-	TP_PROTO(struct fpu *fpu),
-	TP_ARGS(fpu)
-);
 
 DEFINE_EVENT(x86_fpu, x86_fpu_init_state,
 	TP_PROTO(struct fpu *fpu),


* [PATCH 4.14 013/159] x86/unwind: Rename unwinder config options to CONFIG_UNWINDER_*
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (11 preceding siblings ...)
  2017-12-22  8:44 ` [PATCH 4.14 012/159] x86/fpu/debug: Remove unused x86_fpu_state and x86_fpu_deactivate_state tracepoints Greg Kroah-Hartman
@ 2017-12-22  8:44 ` Greg Kroah-Hartman
  2017-12-22  8:44 ` [PATCH 4.14 014/159] x86/unwind: Make CONFIG_UNWINDER_ORC=y the default in kconfig for 64-bit Greg Kroah-Hartman
                   ` (152 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:44 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Ingo Molnar, Josh Poimboeuf,
	Linus Torvalds, Peter Zijlstra, Thomas Gleixner

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Josh Poimboeuf <jpoimboe@redhat.com>

commit 11af847446ed0d131cf24d16a7ef3d5ea7a49554 upstream.

Rename the unwinder config options from:

  CONFIG_ORC_UNWINDER
  CONFIG_FRAME_POINTER_UNWINDER
  CONFIG_GUESS_UNWINDER

to:

  CONFIG_UNWINDER_ORC
  CONFIG_UNWINDER_FRAME_POINTER
  CONFIG_UNWINDER_GUESS

... in order to give them a more logical config namespace.

Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/73972fc7e2762e91912c6b9584582703d6f1b8cc.1507924831.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 Documentation/x86/orc-unwinder.txt |    2 +-
 Makefile                           |    4 ++--
 arch/x86/Kconfig                   |    2 +-
 arch/x86/Kconfig.debug             |   10 +++++-----
 arch/x86/configs/tiny.config       |    4 ++--
 arch/x86/configs/x86_64_defconfig  |    2 +-
 arch/x86/include/asm/module.h      |    2 +-
 arch/x86/include/asm/unwind.h      |    8 ++++----
 arch/x86/kernel/Makefile           |    6 +++---
 include/asm-generic/vmlinux.lds.h  |    2 +-
 lib/Kconfig.debug                  |    2 +-
 scripts/Makefile.build             |    2 +-
 12 files changed, 23 insertions(+), 23 deletions(-)

--- a/Documentation/x86/orc-unwinder.txt
+++ b/Documentation/x86/orc-unwinder.txt
@@ -4,7 +4,7 @@ ORC unwinder
 Overview
 --------
 
-The kernel CONFIG_ORC_UNWINDER option enables the ORC unwinder, which is
+The kernel CONFIG_UNWINDER_ORC option enables the ORC unwinder, which is
 similar in concept to a DWARF unwinder.  The difference is that the
 format of the ORC data is much simpler than DWARF, which in turn allows
 the ORC unwinder to be much simpler and faster.
--- a/Makefile
+++ b/Makefile
@@ -935,8 +935,8 @@ ifdef CONFIG_STACK_VALIDATION
   ifeq ($(has_libelf),1)
     objtool_target := tools/objtool FORCE
   else
-    ifdef CONFIG_ORC_UNWINDER
-      $(error "Cannot generate ORC metadata for CONFIG_ORC_UNWINDER=y, please install libelf-dev, libelf-devel or elfutils-libelf-devel")
+    ifdef CONFIG_UNWINDER_ORC
+      $(error "Cannot generate ORC metadata for CONFIG_UNWINDER_ORC=y, please install libelf-dev, libelf-devel or elfutils-libelf-devel")
     else
       $(warning "Cannot use CONFIG_STACK_VALIDATION=y, please install libelf-dev, libelf-devel or elfutils-libelf-devel")
     endif
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -171,7 +171,7 @@ config X86
 	select HAVE_PERF_USER_STACK_DUMP
 	select HAVE_RCU_TABLE_FREE
 	select HAVE_REGS_AND_STACK_ACCESS_API
-	select HAVE_RELIABLE_STACKTRACE		if X86_64 && FRAME_POINTER_UNWINDER && STACK_VALIDATION
+	select HAVE_RELIABLE_STACKTRACE		if X86_64 && UNWINDER_FRAME_POINTER && STACK_VALIDATION
 	select HAVE_STACK_VALIDATION		if X86_64
 	select HAVE_SYSCALL_TRACEPOINTS
 	select HAVE_UNSTABLE_SCHED_CLOCK
--- a/arch/x86/Kconfig.debug
+++ b/arch/x86/Kconfig.debug
@@ -359,13 +359,13 @@ config PUNIT_ATOM_DEBUG
 
 choice
 	prompt "Choose kernel unwinder"
-	default FRAME_POINTER_UNWINDER
+	default UNWINDER_FRAME_POINTER
 	---help---
 	  This determines which method will be used for unwinding kernel stack
 	  traces for panics, oopses, bugs, warnings, perf, /proc/<pid>/stack,
 	  livepatch, lockdep, and more.
 
-config FRAME_POINTER_UNWINDER
+config UNWINDER_FRAME_POINTER
 	bool "Frame pointer unwinder"
 	select FRAME_POINTER
 	---help---
@@ -380,7 +380,7 @@ config FRAME_POINTER_UNWINDER
 	  consistency model, as this is currently the only way to get a
 	  reliable stack trace (CONFIG_HAVE_RELIABLE_STACKTRACE).
 
-config ORC_UNWINDER
+config UNWINDER_ORC
 	bool "ORC unwinder"
 	depends on X86_64
 	select STACK_VALIDATION
@@ -396,7 +396,7 @@ config ORC_UNWINDER
 	  Enabling this option will increase the kernel's runtime memory usage
 	  by roughly 2-4MB, depending on your kernel config.
 
-config GUESS_UNWINDER
+config UNWINDER_GUESS
 	bool "Guess unwinder"
 	depends on EXPERT
 	---help---
@@ -411,7 +411,7 @@ config GUESS_UNWINDER
 endchoice
 
 config FRAME_POINTER
-	depends on !ORC_UNWINDER && !GUESS_UNWINDER
+	depends on !UNWINDER_ORC && !UNWINDER_GUESS
 	bool
 
 endmenu
--- a/arch/x86/configs/tiny.config
+++ b/arch/x86/configs/tiny.config
@@ -1,5 +1,5 @@
 CONFIG_NOHIGHMEM=y
 # CONFIG_HIGHMEM4G is not set
 # CONFIG_HIGHMEM64G is not set
-CONFIG_GUESS_UNWINDER=y
-# CONFIG_FRAME_POINTER_UNWINDER is not set
+CONFIG_UNWINDER_GUESS=y
+# CONFIG_UNWINDER_FRAME_POINTER is not set
--- a/arch/x86/configs/x86_64_defconfig
+++ b/arch/x86/configs/x86_64_defconfig
@@ -299,7 +299,7 @@ CONFIG_DEBUG_STACKOVERFLOW=y
 # CONFIG_DEBUG_RODATA_TEST is not set
 CONFIG_DEBUG_BOOT_PARAMS=y
 CONFIG_OPTIMIZE_INLINING=y
-CONFIG_ORC_UNWINDER=y
+CONFIG_UNWINDER_ORC=y
 CONFIG_SECURITY=y
 CONFIG_SECURITY_NETWORK=y
 CONFIG_SECURITY_SELINUX=y
--- a/arch/x86/include/asm/module.h
+++ b/arch/x86/include/asm/module.h
@@ -6,7 +6,7 @@
 #include <asm/orc_types.h>
 
 struct mod_arch_specific {
-#ifdef CONFIG_ORC_UNWINDER
+#ifdef CONFIG_UNWINDER_ORC
 	unsigned int num_orcs;
 	int *orc_unwind_ip;
 	struct orc_entry *orc_unwind;
--- a/arch/x86/include/asm/unwind.h
+++ b/arch/x86/include/asm/unwind.h
@@ -13,11 +13,11 @@ struct unwind_state {
 	struct task_struct *task;
 	int graph_idx;
 	bool error;
-#if defined(CONFIG_ORC_UNWINDER)
+#if defined(CONFIG_UNWINDER_ORC)
 	bool signal, full_regs;
 	unsigned long sp, bp, ip;
 	struct pt_regs *regs;
-#elif defined(CONFIG_FRAME_POINTER_UNWINDER)
+#elif defined(CONFIG_UNWINDER_FRAME_POINTER)
 	bool got_irq;
 	unsigned long *bp, *orig_sp, ip;
 	struct pt_regs *regs;
@@ -51,7 +51,7 @@ void unwind_start(struct unwind_state *s
 	__unwind_start(state, task, regs, first_frame);
 }
 
-#if defined(CONFIG_ORC_UNWINDER) || defined(CONFIG_FRAME_POINTER_UNWINDER)
+#if defined(CONFIG_UNWINDER_ORC) || defined(CONFIG_UNWINDER_FRAME_POINTER)
 static inline struct pt_regs *unwind_get_entry_regs(struct unwind_state *state)
 {
 	if (unwind_done(state))
@@ -66,7 +66,7 @@ static inline struct pt_regs *unwind_get
 }
 #endif
 
-#ifdef CONFIG_ORC_UNWINDER
+#ifdef CONFIG_UNWINDER_ORC
 void unwind_init(void);
 void unwind_module_init(struct module *mod, void *orc_ip, size_t orc_ip_size,
 			void *orc, size_t orc_size);
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -127,9 +127,9 @@ obj-$(CONFIG_PERF_EVENTS)		+= perf_regs.
 obj-$(CONFIG_TRACING)			+= tracepoint.o
 obj-$(CONFIG_SCHED_MC_PRIO)		+= itmt.o
 
-obj-$(CONFIG_ORC_UNWINDER)		+= unwind_orc.o
-obj-$(CONFIG_FRAME_POINTER_UNWINDER)	+= unwind_frame.o
-obj-$(CONFIG_GUESS_UNWINDER)		+= unwind_guess.o
+obj-$(CONFIG_UNWINDER_ORC)		+= unwind_orc.o
+obj-$(CONFIG_UNWINDER_FRAME_POINTER)	+= unwind_frame.o
+obj-$(CONFIG_UNWINDER_GUESS)		+= unwind_guess.o
 
 ###
 # 64 bit specific files
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -688,7 +688,7 @@
 #define BUG_TABLE
 #endif
 
-#ifdef CONFIG_ORC_UNWINDER
+#ifdef CONFIG_UNWINDER_ORC
 #define ORC_UNWIND_TABLE						\
 	. = ALIGN(4);							\
 	.orc_unwind_ip : AT(ADDR(.orc_unwind_ip) - LOAD_OFFSET) {	\
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -376,7 +376,7 @@ config STACK_VALIDATION
 	  that runtime stack traces are more reliable.
 
 	  This is also a prerequisite for generation of ORC unwind data, which
-	  is needed for CONFIG_ORC_UNWINDER.
+	  is needed for CONFIG_UNWINDER_ORC.
 
 	  For more information, see
 	  tools/objtool/Documentation/stack-validation.txt.
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -259,7 +259,7 @@ ifneq ($(SKIP_STACK_VALIDATION),1)
 
 __objtool_obj := $(objtree)/tools/objtool/objtool
 
-objtool_args = $(if $(CONFIG_ORC_UNWINDER),orc generate,check)
+objtool_args = $(if $(CONFIG_UNWINDER_ORC),orc generate,check)
 
 ifndef CONFIG_FRAME_POINTER
 objtool_args += --no-fp


* [PATCH 4.14 014/159] x86/unwind: Make CONFIG_UNWINDER_ORC=y the default in kconfig for 64-bit
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (12 preceding siblings ...)
  2017-12-22  8:44 ` [PATCH 4.14 013/159] x86/unwind: Rename unwinder config options to CONFIG_UNWINDER_* Greg Kroah-Hartman
@ 2017-12-22  8:44 ` Greg Kroah-Hartman
  2017-12-22  8:45 ` [PATCH 4.14 015/159] bitops: Add clear/set_bit32() to linux/bitops.h Greg Kroah-Hartman
                   ` (151 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:44 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Ingo Molnar, Josh Poimboeuf,
	Linus Torvalds, Peter Zijlstra, Thomas Gleixner

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Josh Poimboeuf <jpoimboe@redhat.com>

commit fc72ae40e30327aa24eb88a24b9c7058f938bd36 upstream.

The ORC unwinder has been stable in testing so far.  Give it much wider
testing by making it the default in kconfig for x86_64.  It's not yet
supported for 32-bit, so leave frame pointers as the default there.

Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/9b1237bbe7244ed9cdf8db2dcb1253e37e1c341e.1507924831.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/Kconfig.debug |   33 +++++++++++++++++----------------
 1 file changed, 17 insertions(+), 16 deletions(-)

--- a/arch/x86/Kconfig.debug
+++ b/arch/x86/Kconfig.debug
@@ -359,27 +359,13 @@ config PUNIT_ATOM_DEBUG
 
 choice
 	prompt "Choose kernel unwinder"
-	default UNWINDER_FRAME_POINTER
+	default UNWINDER_ORC if X86_64
+	default UNWINDER_FRAME_POINTER if X86_32
 	---help---
 	  This determines which method will be used for unwinding kernel stack
 	  traces for panics, oopses, bugs, warnings, perf, /proc/<pid>/stack,
 	  livepatch, lockdep, and more.
 
-config UNWINDER_FRAME_POINTER
-	bool "Frame pointer unwinder"
-	select FRAME_POINTER
-	---help---
-	  This option enables the frame pointer unwinder for unwinding kernel
-	  stack traces.
-
-	  The unwinder itself is fast and it uses less RAM than the ORC
-	  unwinder, but the kernel text size will grow by ~3% and the kernel's
-	  overall performance will degrade by roughly 5-10%.
-
-	  This option is recommended if you want to use the livepatch
-	  consistency model, as this is currently the only way to get a
-	  reliable stack trace (CONFIG_HAVE_RELIABLE_STACKTRACE).
-
 config UNWINDER_ORC
 	bool "ORC unwinder"
 	depends on X86_64
@@ -396,6 +382,21 @@ config UNWINDER_ORC
 	  Enabling this option will increase the kernel's runtime memory usage
 	  by roughly 2-4MB, depending on your kernel config.
 
+config UNWINDER_FRAME_POINTER
+	bool "Frame pointer unwinder"
+	select FRAME_POINTER
+	---help---
+	  This option enables the frame pointer unwinder for unwinding kernel
+	  stack traces.
+
+	  The unwinder itself is fast and it uses less RAM than the ORC
+	  unwinder, but the kernel text size will grow by ~3% and the kernel's
+	  overall performance will degrade by roughly 5-10%.
+
+	  This option is recommended if you want to use the livepatch
+	  consistency model, as this is currently the only way to get a
+	  reliable stack trace (CONFIG_HAVE_RELIABLE_STACKTRACE).
+
 config UNWINDER_GUESS
 	bool "Guess unwinder"
 	depends on EXPERT


* [PATCH 4.14 015/159] bitops: Add clear/set_bit32() to linux/bitops.h
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (13 preceding siblings ...)
  2017-12-22  8:44 ` [PATCH 4.14 014/159] x86/unwind: Make CONFIG_UNWINDER_ORC=y the default in kconfig for 64-bit Greg Kroah-Hartman
@ 2017-12-22  8:45 ` Greg Kroah-Hartman
  2017-12-26 21:41   ` Ben Hutchings
  2017-12-22  8:45 ` [PATCH 4.14 016/159] x86/cpuid: Add generic table for CPUID dependencies Greg Kroah-Hartman
                   ` (150 subsequent siblings)
  165 siblings, 1 reply; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Andi Kleen, Thomas Gleixner,
	Linus Torvalds, Peter Zijlstra, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andi Kleen <ak@linux.intel.com>

commit cbe96375025e14fc76f9ed42ee5225120d7210f8 upstream.

Add two simple wrappers around set_bit/clear_bit() that accept
the common case of a u32 array. This avoids writing
casts in all callers.
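
As a rough userspace illustration of what a u32-typed bit helper buys the caller, the sketch below indexes an array of 32-bit words directly. The `sketch_*` names are hypothetical, and unlike the kernel's set_bit()/clear_bit() these helpers are not atomic and do not cast to `unsigned long *`; they only show the call-site shape without casts.

```c
#include <stdint.h>

/*
 * Hypothetical userspace stand-ins, not the kernel helpers:
 * plain (non-atomic) bit operations on an array of 32-bit words.
 */
static inline void sketch_set_bit32(unsigned int nr, uint32_t *addr)
{
	/* Word index is nr / 32, bit within the word is nr % 32. */
	addr[nr / 32] |= UINT32_C(1) << (nr % 32);
}

static inline void sketch_clear_bit32(unsigned int nr, uint32_t *addr)
{
	addr[nr / 32] &= ~(UINT32_C(1) << (nr % 32));
}

static inline int sketch_test_bit32(unsigned int nr, const uint32_t *addr)
{
	return (addr[nr / 32] >> (nr % 32)) & 1;
}
```

A caller holding a `uint32_t caps[N]` array can pass it as-is, which is the convenience the commit message describes.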

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20171013215645.23166-2-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 include/linux/bitops.h |   26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

--- a/include/linux/bitops.h
+++ b/include/linux/bitops.h
@@ -228,6 +228,32 @@ static inline unsigned long __ffs64(u64
 	return __ffs((unsigned long)word);
 }
 
+/*
+ * clear_bit32 - Clear a bit in memory for u32 array
+ * @nr: Bit to clear
+ * @addr: u32 * address of bitmap
+ *
+ * Same as clear_bit, but avoids needing casts for u32 arrays.
+ */
+
+static __always_inline void clear_bit32(long nr, volatile u32 *addr)
+{
+	clear_bit(nr, (volatile unsigned long *)addr);
+}
+
+/*
+ * set_bit32 - Set a bit in memory for u32 array
+ * @nr: Bit to clear
+ * @addr: u32 * address of bitmap
+ *
+ * Same as set_bit, but avoids needing casts for u32 arrays.
+ */
+
+static __always_inline void set_bit32(long nr, volatile u32 *addr)
+{
+	set_bit(nr, (volatile unsigned long *)addr);
+}
+
 #ifdef __KERNEL__
 
 #ifndef set_mask_bits


* [PATCH 4.14 016/159] x86/cpuid: Add generic table for CPUID dependencies
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (14 preceding siblings ...)
  2017-12-22  8:45 ` [PATCH 4.14 015/159] bitops: Add clear/set_bit32() to linux/bitops.h Greg Kroah-Hartman
@ 2017-12-22  8:45 ` Greg Kroah-Hartman
  2017-12-22  8:45 ` [PATCH 4.14 017/159] x86/fpu: Parse clearcpuid= as early XSAVE argument Greg Kroah-Hartman
                   ` (149 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Andi Kleen, Thomas Gleixner,
	Jonathan McDowell, Linus Torvalds, Peter Zijlstra, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andi Kleen <ak@linux.intel.com>

commit 0b00de857a648dafe7020878c7a27cf776f5edf4 upstream.

Some CPUID features depend on other features. Currently it's
possible to clear dependent features, but not clear the base features,
which can cause various interesting problems.

This patch implements a generic table to describe dependencies
between CPUID features, to be used by all code that clears
CPUID.

Some subsystems (like XSAVE) had their own implementation of this,
but it's better to do it all in a single place for everyone.

Then clear_cpu_cap and setup_clear_cpu_cap always look up
this table and clear all dependencies too.

This is intended to be a practical table: only for features
that make sense to clear. If someone for example clears FPU,
or other features that are essentially part of the required
base feature set, not much is going to work. Handling
that is right now out of scope. We're only handling
features which can be usefully cleared.
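
The table walk described above can be sketched in userspace as a fixed-point loop. This is a minimal illustration under stated assumptions: the `F_*` feature ids, the three-entry table, and `sketch_clear_with_deps()` are all hypothetical, and a plain bool array stands in for the capability bitmaps. As in the patch, a zero `feature` field terminates the table, so the made-up ids start at 1.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical feature ids; 0 is reserved as the table terminator. */
enum { F_XSAVE = 1, F_AVX, F_AVX2, F_AVX512F, F_FXSR, F_NFEATURES };

struct sketch_dep {
	unsigned int feature;	/* the dependent feature */
	unsigned int depends;	/* the feature it requires */
};

/* Deliberately listed out of order so that one pass is not enough. */
static const struct sketch_dep sketch_deps[] = {
	{ F_AVX2,	F_AVX   },
	{ F_AVX512F,	F_AVX   },
	{ F_AVX,	F_XSAVE },
	{ 0, 0 }
};

/* Mark 'feature' and everything that transitively depends on it. */
static void sketch_clear_with_deps(bool disabled[F_NFEATURES],
				   unsigned int feature)
{
	const struct sketch_dep *d;
	bool changed;

	disabled[feature] = true;

	/* Repeat until a full pass over the table changes nothing. */
	do {
		changed = false;
		for (d = sketch_deps; d->feature; d++) {
			if (!disabled[d->depends] || disabled[d->feature])
				continue;
			disabled[d->feature] = true;
			changed = true;
		}
	} while (changed);
}
```

Iterating to a stable state means the table does not have to be topologically sorted; the cost is a few extra passes over a short array.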

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jonathan McDowell <noodles@earth.li>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20171013215645.23166-3-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/include/asm/cpufeature.h  |    9 +-
 arch/x86/include/asm/cpufeatures.h |    5 +
 arch/x86/kernel/cpu/Makefile       |    1 
 arch/x86/kernel/cpu/cpuid-deps.c   |  113 +++++++++++++++++++++++++++++++++++++
 4 files changed, 123 insertions(+), 5 deletions(-)

--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -126,11 +126,10 @@ extern const char * const x86_bug_flags[
 #define boot_cpu_has(bit)	cpu_has(&boot_cpu_data, bit)
 
 #define set_cpu_cap(c, bit)	set_bit(bit, (unsigned long *)((c)->x86_capability))
-#define clear_cpu_cap(c, bit)	clear_bit(bit, (unsigned long *)((c)->x86_capability))
-#define setup_clear_cpu_cap(bit) do { \
-	clear_cpu_cap(&boot_cpu_data, bit);	\
-	set_bit(bit, (unsigned long *)cpu_caps_cleared); \
-} while (0)
+
+extern void setup_clear_cpu_cap(unsigned int bit);
+extern void clear_cpu_cap(struct cpuinfo_x86 *c, unsigned int bit);
+
 #define setup_force_cpu_cap(bit) do { \
 	set_cpu_cap(&boot_cpu_data, bit);	\
 	set_bit(bit, (unsigned long *)cpu_caps_set);	\
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -22,6 +22,11 @@
  * this feature bit is not displayed in /proc/cpuinfo at all.
  */
 
+/*
+ * When adding new features here that depend on other features,
+ * please update the table in kernel/cpu/cpuid-deps.c
+ */
+
 /* Intel-defined CPU features, CPUID level 0x00000001 (edx), word 0 */
 #define X86_FEATURE_FPU		( 0*32+ 0) /* Onboard FPU */
 #define X86_FEATURE_VME		( 0*32+ 1) /* Virtual Mode Extensions */
--- a/arch/x86/kernel/cpu/Makefile
+++ b/arch/x86/kernel/cpu/Makefile
@@ -23,6 +23,7 @@ obj-y			+= rdrand.o
 obj-y			+= match.o
 obj-y			+= bugs.o
 obj-$(CONFIG_CPU_FREQ)	+= aperfmperf.o
+obj-y			+= cpuid-deps.o
 
 obj-$(CONFIG_PROC_FS)	+= proc.o
 obj-$(CONFIG_X86_FEATURE_NAMES) += capflags.o powerflags.o
--- /dev/null
+++ b/arch/x86/kernel/cpu/cpuid-deps.c
@@ -0,0 +1,113 @@
+/* Declare dependencies between CPUIDs */
+#include <linux/kernel.h>
+#include <linux/init.h>
+#include <linux/module.h>
+#include <asm/cpufeature.h>
+
+struct cpuid_dep {
+	unsigned int	feature;
+	unsigned int	depends;
+};
+
+/*
+ * Table of CPUID features that depend on others.
+ *
+ * This only includes dependencies that can be usefully disabled, not
+ * features part of the base set (like FPU).
+ *
+ * Note this all is not __init / __initdata because it can be
+ * called from cpu hotplug. It shouldn't do anything in this case,
+ * but it's difficult to tell that to the init reference checker.
+ */
+const static struct cpuid_dep cpuid_deps[] = {
+	{ X86_FEATURE_XSAVEOPT,		X86_FEATURE_XSAVE     },
+	{ X86_FEATURE_XSAVEC,		X86_FEATURE_XSAVE     },
+	{ X86_FEATURE_XSAVES,		X86_FEATURE_XSAVE     },
+	{ X86_FEATURE_AVX,		X86_FEATURE_XSAVE     },
+	{ X86_FEATURE_PKU,		X86_FEATURE_XSAVE     },
+	{ X86_FEATURE_MPX,		X86_FEATURE_XSAVE     },
+	{ X86_FEATURE_XGETBV1,		X86_FEATURE_XSAVE     },
+	{ X86_FEATURE_FXSR_OPT,		X86_FEATURE_FXSR      },
+	{ X86_FEATURE_XMM,		X86_FEATURE_FXSR      },
+	{ X86_FEATURE_XMM2,		X86_FEATURE_XMM       },
+	{ X86_FEATURE_XMM3,		X86_FEATURE_XMM2      },
+	{ X86_FEATURE_XMM4_1,		X86_FEATURE_XMM2      },
+	{ X86_FEATURE_XMM4_2,		X86_FEATURE_XMM2      },
+	{ X86_FEATURE_XMM3,		X86_FEATURE_XMM2      },
+	{ X86_FEATURE_PCLMULQDQ,	X86_FEATURE_XMM2      },
+	{ X86_FEATURE_SSSE3,		X86_FEATURE_XMM2,     },
+	{ X86_FEATURE_F16C,		X86_FEATURE_XMM2,     },
+	{ X86_FEATURE_AES,		X86_FEATURE_XMM2      },
+	{ X86_FEATURE_SHA_NI,		X86_FEATURE_XMM2      },
+	{ X86_FEATURE_FMA,		X86_FEATURE_AVX       },
+	{ X86_FEATURE_AVX2,		X86_FEATURE_AVX,      },
+	{ X86_FEATURE_AVX512F,		X86_FEATURE_AVX,      },
+	{ X86_FEATURE_AVX512IFMA,	X86_FEATURE_AVX512F   },
+	{ X86_FEATURE_AVX512PF,		X86_FEATURE_AVX512F   },
+	{ X86_FEATURE_AVX512ER,		X86_FEATURE_AVX512F   },
+	{ X86_FEATURE_AVX512CD,		X86_FEATURE_AVX512F   },
+	{ X86_FEATURE_AVX512DQ,		X86_FEATURE_AVX512F   },
+	{ X86_FEATURE_AVX512BW,		X86_FEATURE_AVX512F   },
+	{ X86_FEATURE_AVX512VL,		X86_FEATURE_AVX512F   },
+	{ X86_FEATURE_AVX512VBMI,	X86_FEATURE_AVX512F   },
+	{ X86_FEATURE_AVX512_4VNNIW,	X86_FEATURE_AVX512F   },
+	{ X86_FEATURE_AVX512_4FMAPS,	X86_FEATURE_AVX512F   },
+	{ X86_FEATURE_AVX512_VPOPCNTDQ, X86_FEATURE_AVX512F   },
+	{}
+};
+
+static inline void __clear_cpu_cap(struct cpuinfo_x86 *c, unsigned int bit)
+{
+	clear_bit32(bit, c->x86_capability);
+}
+
+static inline void __setup_clear_cpu_cap(unsigned int bit)
+{
+	clear_cpu_cap(&boot_cpu_data, bit);
+	set_bit32(bit, cpu_caps_cleared);
+}
+
+static inline void clear_feature(struct cpuinfo_x86 *c, unsigned int feature)
+{
+	if (!c)
+		__setup_clear_cpu_cap(feature);
+	else
+		__clear_cpu_cap(c, feature);
+}
+
+static void do_clear_cpu_cap(struct cpuinfo_x86 *c, unsigned int feature)
+{
+	bool changed;
+	DECLARE_BITMAP(disable, NCAPINTS * sizeof(u32) * 8);
+	const struct cpuid_dep *d;
+
+	clear_feature(c, feature);
+
+	/* Collect all features to disable, handling dependencies */
+	memset(disable, 0, sizeof(disable));
+	__set_bit(feature, disable);
+
+	/* Loop until we get a stable state. */
+	do {
+		changed = false;
+		for (d = cpuid_deps; d->feature; d++) {
+			if (!test_bit(d->depends, disable))
+				continue;
+			if (__test_and_set_bit(d->feature, disable))
+				continue;
+
+			changed = true;
+			clear_feature(c, d->feature);
+		}
+	} while (changed);
+}
+
+void clear_cpu_cap(struct cpuinfo_x86 *c, unsigned int feature)
+{
+	do_clear_cpu_cap(c, feature);
+}
+
+void setup_clear_cpu_cap(unsigned int feature)
+{
+	do_clear_cpu_cap(NULL, feature);
+}


* [PATCH 4.14 017/159] x86/fpu: Parse clearcpuid= as early XSAVE argument
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (15 preceding siblings ...)
  2017-12-22  8:45 ` [PATCH 4.14 016/159] x86/cpuid: Add generic table for CPUID dependencies Greg Kroah-Hartman
@ 2017-12-22  8:45 ` Greg Kroah-Hartman
  2017-12-22  8:45 ` [PATCH 4.14 018/159] x86/fpu: Make XSAVE check the base CPUID features before enabling Greg Kroah-Hartman
                   ` (148 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Andi Kleen, Thomas Gleixner,
	Linus Torvalds, Peter Zijlstra, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andi Kleen <ak@linux.intel.com>

commit 0c2a3913d6f50503f7c59d83a6219e39508cc898 upstream.

With a follow-on patch we want to make clearcpuid affect the XSAVE
configuration. But xsave is currently initialized before arguments
are parsed. Move the clearcpuid= parsing into the special
early xsave argument parsing code.

Since clearcpuid= contains a = we need to keep the old __setup
around as a dummy, otherwise it would end up as an environment
variable in init's environment.
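
For readers unfamiliar with early option parsing, here is a simplified userspace sketch of pulling a bounded bit number out of a command-line string. It is not the kernel's cmdline_find_option()/get_option() code: `sketch_parse_clearcpuid()` and `SKETCH_NBITS` are hypothetical, the limit is a placeholder rather than the real `NCAPINTS * 32`, and only decimal values separated by spaces are handled.

```c
#include <stdlib.h>
#include <string.h>

/* Placeholder capacity, not the kernel's NCAPINTS * 32. */
#define SKETCH_NBITS 608

/* Return the requested bit number, or -1 if absent or out of range. */
static int sketch_parse_clearcpuid(const char *cmdline)
{
	static const char key[] = "clearcpuid=";
	const char *p = cmdline;
	char *end;
	long bit;

	/* Find the key at the start of a space-separated word. */
	while ((p = strstr(p, key)) != NULL) {
		if (p == cmdline || p[-1] == ' ')
			break;
		p += sizeof(key) - 1;
	}
	if (!p)
		return -1;

	p += sizeof(key) - 1;
	bit = strtol(p, &end, 10);

	/* Reject empty values and anything outside the bitmap. */
	if (end == p || bit < 0 || bit >= SKETCH_NBITS)
		return -1;

	return (int)bit;
}
```

The range check mirrors the intent of the `bit >= 0 && bit < NCAPINTS * 32` test in the diff: an out-of-range value is ignored rather than used as a bitmap index.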

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20171013215645.23166-4-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/kernel/cpu/common.c |   16 +++++++---------
 arch/x86/kernel/fpu/init.c   |   11 +++++++++++
 2 files changed, 18 insertions(+), 9 deletions(-)

--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1301,18 +1301,16 @@ void print_cpu_info(struct cpuinfo_x86 *
 		pr_cont(")\n");
 }
 
-static __init int setup_disablecpuid(char *arg)
+/*
+ * clearcpuid= was already parsed in fpu__init_parse_early_param.
+ * But we need to keep a dummy __setup around otherwise it would
+ * show up as an environment variable for init.
+ */
+static __init int setup_clearcpuid(char *arg)
 {
-	int bit;
-
-	if (get_option(&arg, &bit) && bit >= 0 && bit < NCAPINTS * 32)
-		setup_clear_cpu_cap(bit);
-	else
-		return 0;
-
 	return 1;
 }
-__setup("clearcpuid=", setup_disablecpuid);
+__setup("clearcpuid=", setup_clearcpuid);
 
 #ifdef CONFIG_X86_64
 DEFINE_PER_CPU_FIRST(union irq_stack_union,
--- a/arch/x86/kernel/fpu/init.c
+++ b/arch/x86/kernel/fpu/init.c
@@ -249,6 +249,10 @@ static void __init fpu__init_system_ctx_
  */
 static void __init fpu__init_parse_early_param(void)
 {
+	char arg[32];
+	char *argptr = arg;
+	int bit;
+
 	if (cmdline_find_option_bool(boot_command_line, "no387"))
 		setup_clear_cpu_cap(X86_FEATURE_FPU);
 
@@ -266,6 +270,13 @@ static void __init fpu__init_parse_early
 
 	if (cmdline_find_option_bool(boot_command_line, "noxsaves"))
 		setup_clear_cpu_cap(X86_FEATURE_XSAVES);
+
+	if (cmdline_find_option(boot_command_line, "clearcpuid", arg,
+				sizeof(arg)) &&
+	    get_option(&argptr, &bit) &&
+	    bit >= 0 &&
+	    bit < NCAPINTS * 32)
+		setup_clear_cpu_cap(bit);
 }
 
 /*


* [PATCH 4.14 018/159] x86/fpu: Make XSAVE check the base CPUID features before enabling
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (16 preceding siblings ...)
  2017-12-22  8:45 ` [PATCH 4.14 017/159] x86/fpu: Parse clearcpuid= as early XSAVE argument Greg Kroah-Hartman
@ 2017-12-22  8:45 ` Greg Kroah-Hartman
  2017-12-22  8:45 ` [PATCH 4.14 019/159] x86/fpu: Remove the explicit clearing of XSAVE dependent features Greg Kroah-Hartman
                   ` (147 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Andi Kleen, Thomas Gleixner,
	Linus Torvalds, Peter Zijlstra, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andi Kleen <ak@linux.intel.com>

commit ccb18db2ab9d923df07e7495123fe5fb02329713 upstream.

Before enabling XSAVE, not only check the XSAVE specific CPUID bits,
but also the base CPUID features of the respective XSAVE feature.
This allows disabling individual XSAVE states using the existing
clearcpuid= option, which can be useful for performance testing
and debugging, and also in general avoids inconsistencies.
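
The check can be pictured as a table indexed by state-component bit, where each entry names the base feature that bit requires. The userspace sketch below uses hypothetical `SK_*` ids, a three-entry table, and a bool array in place of boot_cpu_has(); it only illustrates the masking step, not the real xstate layout.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical base features for the sketch. */
enum sketch_feature { SK_FPU, SK_SSE, SK_AVX, SK_NFEATURES };

/* Entry i is the base feature required by state-component bit i. */
static const enum sketch_feature sketch_base_feature[] = {
	SK_FPU,
	SK_SSE,
	SK_AVX,
};

/* Drop every state bit whose base feature is not present. */
static uint64_t sketch_filter_xfeatures(uint64_t mask,
					const bool has[SK_NFEATURES])
{
	size_t n = sizeof(sketch_base_feature) / sizeof(sketch_base_feature[0]);
	size_t i;

	for (i = 0; i < n; i++) {
		if (!has[sketch_base_feature[i]])
			mask &= ~(UINT64_C(1) << i);
	}
	return mask;
}
```

With AVX marked absent, a mask of 0x7 is reduced to 0x3, which is the effect the commit message describes for a feature cleared via clearcpuid=.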

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20171013215645.23166-5-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/kernel/fpu/xstate.c |   23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -15,6 +15,7 @@
 #include <asm/fpu/xstate.h>
 
 #include <asm/tlbflush.h>
+#include <asm/cpufeature.h>
 
 /*
  * Although we spell it out in here, the Processor Trace
@@ -36,6 +37,19 @@ static const char *xfeature_names[] =
 	"unknown xstate feature"	,
 };
 
+static short xsave_cpuid_features[] __initdata = {
+	X86_FEATURE_FPU,
+	X86_FEATURE_XMM,
+	X86_FEATURE_AVX,
+	X86_FEATURE_MPX,
+	X86_FEATURE_MPX,
+	X86_FEATURE_AVX512F,
+	X86_FEATURE_AVX512F,
+	X86_FEATURE_AVX512F,
+	X86_FEATURE_INTEL_PT,
+	X86_FEATURE_PKU,
+};
+
 /*
  * Mask of xstate features supported by the CPU and the kernel:
  */
@@ -726,6 +740,7 @@ void __init fpu__init_system_xstate(void
 	unsigned int eax, ebx, ecx, edx;
 	static int on_boot_cpu __initdata = 1;
 	int err;
+	int i;
 
 	WARN_ON_FPU(!on_boot_cpu);
 	on_boot_cpu = 0;
@@ -759,6 +774,14 @@ void __init fpu__init_system_xstate(void
 		goto out_disable;
 	}
 
+	/*
+	 * Clear XSAVE features that are disabled in the normal CPUID.
+	 */
+	for (i = 0; i < ARRAY_SIZE(xsave_cpuid_features); i++) {
+		if (!boot_cpu_has(xsave_cpuid_features[i]))
+			xfeatures_mask &= ~BIT(i);
+	}
+
 	xfeatures_mask &= fpu__get_supported_xfeatures_mask();
 
 	/* Enable xstate instructions to be able to continue with initialization: */


* [PATCH 4.14 019/159] x86/fpu: Remove the explicit clearing of XSAVE dependent features
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (17 preceding siblings ...)
  2017-12-22  8:45 ` [PATCH 4.14 018/159] x86/fpu: Make XSAVE check the base CPUID features before enabling Greg Kroah-Hartman
@ 2017-12-22  8:45 ` Greg Kroah-Hartman
  2017-12-22  8:45 ` [PATCH 4.14 020/159] x86/platform/UV: Convert timers to use timer_setup() Greg Kroah-Hartman
                   ` (146 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Andi Kleen, Thomas Gleixner,
	Linus Torvalds, Peter Zijlstra, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andi Kleen <ak@linux.intel.com>

commit 73e3a7d2a7c3be29a5a22b85026f6cfa5664267f upstream.

Clearing a CPU feature with setup_clear_cpu_cap() clears all features
which depend on it. Expressing feature dependencies in one place is
easier to maintain than keeping functions like
fpu__xstate_clear_all_cpu_caps() up to date.

The features which depend on XSAVE have their dependency expressed in the
dependency table, so it's sufficient to clear X86_FEATURE_XSAVE.

Remove the explicit clearing of XSAVE dependent features.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20171013215645.23166-6-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/kernel/fpu/xstate.c |   20 --------------------
 1 file changed, 20 deletions(-)

--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -73,26 +73,6 @@ unsigned int fpu_user_xstate_size;
 void fpu__xstate_clear_all_cpu_caps(void)
 {
 	setup_clear_cpu_cap(X86_FEATURE_XSAVE);
-	setup_clear_cpu_cap(X86_FEATURE_XSAVEOPT);
-	setup_clear_cpu_cap(X86_FEATURE_XSAVEC);
-	setup_clear_cpu_cap(X86_FEATURE_XSAVES);
-	setup_clear_cpu_cap(X86_FEATURE_AVX);
-	setup_clear_cpu_cap(X86_FEATURE_AVX2);
-	setup_clear_cpu_cap(X86_FEATURE_AVX512F);
-	setup_clear_cpu_cap(X86_FEATURE_AVX512IFMA);
-	setup_clear_cpu_cap(X86_FEATURE_AVX512PF);
-	setup_clear_cpu_cap(X86_FEATURE_AVX512ER);
-	setup_clear_cpu_cap(X86_FEATURE_AVX512CD);
-	setup_clear_cpu_cap(X86_FEATURE_AVX512DQ);
-	setup_clear_cpu_cap(X86_FEATURE_AVX512BW);
-	setup_clear_cpu_cap(X86_FEATURE_AVX512VL);
-	setup_clear_cpu_cap(X86_FEATURE_MPX);
-	setup_clear_cpu_cap(X86_FEATURE_XGETBV1);
-	setup_clear_cpu_cap(X86_FEATURE_AVX512VBMI);
-	setup_clear_cpu_cap(X86_FEATURE_PKU);
-	setup_clear_cpu_cap(X86_FEATURE_AVX512_4VNNIW);
-	setup_clear_cpu_cap(X86_FEATURE_AVX512_4FMAPS);
-	setup_clear_cpu_cap(X86_FEATURE_AVX512_VPOPCNTDQ);
 }
 
 /*


* [PATCH 4.14 020/159] x86/platform/UV: Convert timers to use timer_setup()
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (18 preceding siblings ...)
  2017-12-22  8:45 ` [PATCH 4.14 019/159] x86/fpu: Remove the explicit clearing of XSAVE dependent features Greg Kroah-Hartman
@ 2017-12-22  8:45 ` Greg Kroah-Hartman
  2017-12-22  8:45 ` [PATCH 4.14 021/159] objtool: Print top level commands on incorrect usage Greg Kroah-Hartman
                   ` (145 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Kees Cook, Thomas Gleixner,
	Dimitri Sivanich, Russ Anderson, Mike Travis

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Kees Cook <keescook@chromium.org>

commit 376f3bcebdc999cc737d9052109cc33b573b3a8b upstream.

In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.
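
The general pattern behind passing the timer pointer to the callback is that the callback recovers its enclosing structure from the embedded timer, which is what from_timer() does via container_of(). The userspace sketch below mimics that with hypothetical `sketch_*` types and a hand-rolled container_of; it is not the kernel timer API, and the callback is invoked directly instead of by a timer wheel.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct timer_list. */
struct sketch_timer {
	void (*function)(struct sketch_timer *);
};

/* A structure that embeds its timer, as many drivers do. */
struct sketch_heartbeat {
	unsigned char state;
	struct sketch_timer timer;
};

/* Recover the enclosing structure from a pointer to one of its members. */
#define sketch_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* The callback receives the timer itself, not an opaque unsigned long. */
static void sketch_heartbeat_cb(struct sketch_timer *t)
{
	struct sketch_heartbeat *hb =
		sketch_container_of(t, struct sketch_heartbeat, timer);

	hb->state ^= 1;	/* flip the heartbeat bit */
}

static void sketch_timer_setup(struct sketch_timer *t,
			       void (*fn)(struct sketch_timer *))
{
	t->function = fn;
}
```

Because the callback argument is typed, no integer-to-pointer cast is needed to reach the per-object state.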

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Dimitri Sivanich <sivanich@hpe.com>
Cc: Russ Anderson <rja@hpe.com>
Cc: Mike Travis <mike.travis@hpe.com>
Link: https://lkml.kernel.org/r/20171016232231.GA100493@beast
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/kernel/apic/x2apic_uv_x.c |    5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

--- a/arch/x86/kernel/apic/x2apic_uv_x.c
+++ b/arch/x86/kernel/apic/x2apic_uv_x.c
@@ -920,9 +920,8 @@ static __init void uv_rtc_init(void)
 /*
  * percpu heartbeat timer
  */
-static void uv_heartbeat(unsigned long ignored)
+static void uv_heartbeat(struct timer_list *timer)
 {
-	struct timer_list *timer = &uv_scir_info->timer;
 	unsigned char bits = uv_scir_info->state;
 
 	/* Flip heartbeat bit: */
@@ -947,7 +946,7 @@ static int uv_heartbeat_enable(unsigned
 		struct timer_list *timer = &uv_cpu_scir_info(cpu)->timer;
 
 		uv_set_cpu_scir_bits(cpu, SCIR_CPU_HEARTBEAT|SCIR_CPU_ACTIVITY);
-		setup_pinned_timer(timer, uv_heartbeat, cpu);
+		timer_setup(timer, uv_heartbeat, TIMER_PINNED);
 		timer->expires = jiffies + SCIR_CPU_HB_INTERVAL;
 		add_timer_on(timer, cpu);
 		uv_cpu_scir_info(cpu)->enabled = 1;


* [PATCH 4.14 021/159] objtool: Print top level commands on incorrect usage
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (19 preceding siblings ...)
  2017-12-22  8:45 ` [PATCH 4.14 020/159] x86/platform/UV: Convert timers to use timer_setup() Greg Kroah-Hartman
@ 2017-12-22  8:45 ` Greg Kroah-Hartman
  2017-12-22  8:45 ` [PATCH 4.14 022/159] x86/cpuid: Prevent out of bound access in do_clear_cpu_cap() Greg Kroah-Hartman
                   ` (144 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Kamalesh Babulal, Josh Poimboeuf,
	Linus Torvalds, Peter Zijlstra, Thomas Gleixner, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>

commit 6a93bb7e4a7d6670677d5b0eb980936eb9cc5d2e upstream.

Print the top-level objtool commands along with the error on incorrect
command-line usage. The objtool command-line parser exits with code 129
on incorrect usage. Convert the cmd_usage() exit code as well, to maintain
consistency across objtool.

After the patch:

  $ ./objtool -j

  Unknown option: -j

  usage: objtool COMMAND [ARGS]

  Commands:
     check   Perform stack metadata validation on an object file
     orc     Generate in-place ORC unwind tables for an object file

  $ echo $?
  129
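
The behaviour shown above can be sketched as a small standalone C fragment. The function names and the accepted `--verbose` option are hypothetical; the real objtool calls exit() directly, whereas this sketch returns the status so it can be exercised from a test.

```c
#include <stdio.h>
#include <string.h>

#define USAGE_EXIT_CODE 129	/* exit status used for incorrect usage */

/* Print the command list and return the status the caller should exit with. */
static int demo_cmd_usage(FILE *out)
{
	fprintf(out, "\n usage: objtool COMMAND [ARGS]\n\n Commands:\n");
	fprintf(out, "   check   Perform stack metadata validation on an object file\n");
	fprintf(out, "   orc     Generate in-place ORC unwind tables for an object file\n\n");
	return USAGE_EXIT_CODE;
}

/* Return 0 for the one option this sketch accepts, the usage status otherwise. */
static int demo_handle_option(const char *arg, FILE *err)
{
	if (!strcmp(arg, "--verbose"))	/* hypothetical accepted option */
		return 0;

	fprintf(err, "Unknown option: %s\n", arg);
	return demo_cmd_usage(err);	/* single place that decides the status */
}
```

Routing the unknown-option path through the usage helper keeps the exit status in one place, which is the consistency the patch is after.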

Signed-off-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1507992474-16142-1-git-send-email-kamalesh@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 tools/objtool/objtool.c |    6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

--- a/tools/objtool/objtool.c
+++ b/tools/objtool/objtool.c
@@ -70,7 +70,7 @@ static void cmd_usage(void)
 
 	printf("\n");
 
-	exit(1);
+	exit(129);
 }
 
 static void handle_options(int *argc, const char ***argv)
@@ -86,9 +86,7 @@ static void handle_options(int *argc, co
 			break;
 		} else {
 			fprintf(stderr, "Unknown option: %s\n", cmd);
-			fprintf(stderr, "\n Usage: %s\n",
-				objtool_usage_string);
-			exit(1);
+			cmd_usage();
 		}
 
 		(*argv)++;

^ permalink raw reply	[flat|nested] 349+ messages in thread

* [PATCH 4.14 022/159] x86/cpuid: Prevent out of bound access in do_clear_cpu_cap()
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (20 preceding siblings ...)
  2017-12-22  8:45 ` [PATCH 4.14 021/159] objtool: Print top level commands on incorrect usage Greg Kroah-Hartman
@ 2017-12-22  8:45 ` Greg Kroah-Hartman
  2017-12-22  8:45   ` Greg Kroah-Hartman
                   ` (143 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, kernel test robot, Thomas Gleixner,
	Andi Kleen, Borislav Petkov

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Thomas Gleixner <tglx@linutronix.de>

commit 57b8b1a1856adaa849d02d547411a553a531022b upstream.

do_clear_cpu_cap() allocates a bitmap to keep track of disabled feature
dependencies. That bitmap is sized NCAPINTS * sizeof(u32) * 8 bits. The
range of 'features' which can be handed in is larger than this, because
after the capabilities the bug 'feature' bits occupy another 32 bits. Not
really obvious...

So clearing any of the misfeature bits, as 32-bit does for the F00F bug,
accesses that bitmap out of bounds, thereby corrupting the stack.

Size the bitmap properly and add a sanity check to catch accidental
out-of-bounds access.
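
A userspace sketch of the indexing problem, under stated assumptions: the NCAPINTS and NBUGINTS values below are picked for illustration, and `demo_set_feature()` is a hypothetical helper, not kernel code. It shows that the first bug bit lands one word past a bitmap sized for the capability words alone, and that a bounds check against the combined size rejects anything larger.

```c
#include <stdint.h>

#define NCAPINTS 18	/* assumed value, for illustration only */
#define NBUGINTS 1	/* assumed value, for illustration only */

/* Bug bits are numbered directly after the capability words. */
#define X86_BUG(x) (NCAPINTS * 32 + (x))

/* Take the capabilities and the bug bits into account. */
#define MAX_FEATURE_BITS ((NCAPINTS + NBUGINTS) * sizeof(uint32_t) * 8)

/*
 * Set a bit in a bitmap made of 32-bit words.
 * Returns -1 instead of writing past the end of the bitmap.
 */
static int demo_set_feature(uint32_t *map, unsigned int feature)
{
	if (feature >= MAX_FEATURE_BITS)
		return -1;

	map[feature / 32] |= UINT32_C(1) << (feature % 32);
	return 0;
}
```

With a map of `NCAPINTS + NBUGINTS` words, `demo_set_feature(map, X86_BUG(0))` touches word index NCAPINTS, which would not exist in a map sized for NCAPINTS words only.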

Fixes: 0b00de857a64 ("x86/cpuid: Add generic table for CPUID dependencies")
Reported-by: kernel test robot <xiaolong.ye@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Link: https://lkml.kernel.org/r/20171018022023.GA12058@yexl-desktop
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/kernel/cpu/cpuid-deps.c |   10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

--- a/arch/x86/kernel/cpu/cpuid-deps.c
+++ b/arch/x86/kernel/cpu/cpuid-deps.c
@@ -75,11 +75,17 @@ static inline void clear_feature(struct
 		__clear_cpu_cap(c, feature);
 }
 
+/* Take the capabilities and the BUG bits into account */
+#define MAX_FEATURE_BITS ((NCAPINTS + NBUGINTS) * sizeof(u32) * 8)
+
 static void do_clear_cpu_cap(struct cpuinfo_x86 *c, unsigned int feature)
 {
-	bool changed;
-	DECLARE_BITMAP(disable, NCAPINTS * sizeof(u32) * 8);
+	DECLARE_BITMAP(disable, MAX_FEATURE_BITS);
 	const struct cpuid_dep *d;
+	bool changed;
+
+	if (WARN_ON(feature >= MAX_FEATURE_BITS))
+		return;
 
 	clear_feature(c, feature);
 

^ permalink raw reply	[flat|nested] 349+ messages in thread

* [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
@ 2017-12-22  8:45   ` Greg Kroah-Hartman
  2017-12-22  8:44 ` [PATCH 4.14 002/159] objtool: Dont report end of section error after an empty unwind hint Greg Kroah-Hartman
                     ` (164 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Kirill A. Shutemov, Andrew Morton,
	Andy Lutomirski, Borislav Petkov, Cyrill Gorcunov,
	Linus Torvalds, Peter Zijlstra, Thomas Gleixner, linux-mm,
	Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

commit 83e3c48729d9ebb7af5a31a504f3fd6aff0348c4 upstream.

The size of the mem_section[] array depends on the size of the physical address space.

In preparation for boot-time switching between paging modes on x86-64
we need to make the allocation of mem_section[] dynamic, because otherwise
we waste a lot of RAM: with CONFIG_NODE_SHIFT=10, mem_section[] size is 32kB
for 4-level paging and 2MB for 5-level paging mode.
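
The two sizes quoted above can be reproduced with a little arithmetic. Every constant in this sketch is an assumption chosen for illustration (128 MiB sections, 4 KiB pages, a 32-byte struct mem_section, 8-byte pointers, and 46 or 52 physical address bits for 4- and 5-level paging); the patch itself does not state them.

```c
#include <stdint.h>

#define DEMO_SECTION_SIZE_BITS	27	/* assumed: 128 MiB per section */
#define DEMO_PAGE_SIZE		4096	/* assumed page size */
#define DEMO_MEM_SECTION_SIZE	32	/* assumed sizeof(struct mem_section) */
#define DEMO_POINTER_SIZE	8	/* assumed 64-bit pointers */

/* Bytes needed for the array of root pointers at a given physical width. */
static uint64_t demo_root_array_bytes(unsigned int max_physmem_bits)
{
	uint64_t nr_sections =
		UINT64_C(1) << (max_physmem_bits - DEMO_SECTION_SIZE_BITS);
	uint64_t sections_per_root = DEMO_PAGE_SIZE / DEMO_MEM_SECTION_SIZE;
	uint64_t nr_roots = nr_sections / sections_per_root;

	return nr_roots * DEMO_POINTER_SIZE;
}
```

Under these assumptions, 46 physical address bits give 32 KiB and 52 bits give 2 MiB, a 64-fold difference that a statically sized array would always pay for.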

The patch allocates the array on the first call to sparse_memory_present_with_active_regions().
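
A minimal userspace sketch of the allocate-on-first-use pattern, assuming hypothetical names throughout (`demo_roots`, `demo_roots_init()`, `demo_root_lookup()`) and using calloc() in place of the kernel's boot-time allocator. The point it illustrates is that once the table becomes a pointer, every lookup has to tolerate the table not existing yet.

```c
#include <stdlib.h>

#define DEMO_NR_ROOTS 8	/* illustrative size, not a kernel value */

struct demo_section {
	unsigned long map;
};

/* Root table: NULL until the first call to demo_roots_init(). */
static struct demo_section **demo_roots;

/* Allocate the root pointer array once; returns 0 on success, -1 on failure. */
static int demo_roots_init(void)
{
	if (demo_roots)
		return 0;	/* already allocated by an earlier call */

	demo_roots = calloc(DEMO_NR_ROOTS, sizeof(*demo_roots));
	return demo_roots ? 0 : -1;
}

/* Lookup that is safe to call before the table has been allocated. */
static struct demo_section *demo_root_lookup(unsigned long root)
{
	if (!demo_roots || root >= DEMO_NR_ROOTS)
		return NULL;

	return demo_roots[root];
}
```

Note that the sketch sizes each slot as a pointer (`sizeof(*demo_roots)`), since the table holds pointers to roots rather than the sections themselves.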

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@suse.de>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/20170929140821.37654-2-kirill.shutemov@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 include/linux/mmzone.h |    6 +++++-
 mm/page_alloc.c        |   10 ++++++++++
 mm/sparse.c            |   17 +++++++++++------
 3 files changed, 26 insertions(+), 7 deletions(-)

--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1152,13 +1152,17 @@ struct mem_section {
 #define SECTION_ROOT_MASK	(SECTIONS_PER_ROOT - 1)
 
 #ifdef CONFIG_SPARSEMEM_EXTREME
-extern struct mem_section *mem_section[NR_SECTION_ROOTS];
+extern struct mem_section **mem_section;
 #else
 extern struct mem_section mem_section[NR_SECTION_ROOTS][SECTIONS_PER_ROOT];
 #endif
 
 static inline struct mem_section *__nr_to_section(unsigned long nr)
 {
+#ifdef CONFIG_SPARSEMEM_EXTREME
+	if (!mem_section)
+		return NULL;
+#endif
 	if (!mem_section[SECTION_NR_TO_ROOT(nr)])
 		return NULL;
 	return &mem_section[SECTION_NR_TO_ROOT(nr)][nr & SECTION_ROOT_MASK];
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5651,6 +5651,16 @@ void __init sparse_memory_present_with_a
 	unsigned long start_pfn, end_pfn;
 	int i, this_nid;
 
+#ifdef CONFIG_SPARSEMEM_EXTREME
+	if (!mem_section) {
+		unsigned long size, align;
+
+		size = sizeof(struct mem_section) * NR_SECTION_ROOTS;
+		align = 1 << (INTERNODE_CACHE_SHIFT);
+		mem_section = memblock_virt_alloc(size, align);
+	}
+#endif
+
 	for_each_mem_pfn_range(i, nid, &start_pfn, &end_pfn, &this_nid)
 		memory_present(this_nid, start_pfn, end_pfn);
 }
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -23,8 +23,7 @@
  * 1) mem_section	- memory sections, mem_map's for valid memory
  */
 #ifdef CONFIG_SPARSEMEM_EXTREME
-struct mem_section *mem_section[NR_SECTION_ROOTS]
-	____cacheline_internodealigned_in_smp;
+struct mem_section **mem_section;
 #else
 struct mem_section mem_section[NR_SECTION_ROOTS][SECTIONS_PER_ROOT]
 	____cacheline_internodealigned_in_smp;
@@ -101,7 +100,7 @@ static inline int sparse_index_init(unsi
 int __section_nr(struct mem_section* ms)
 {
 	unsigned long root_nr;
-	struct mem_section* root;
+	struct mem_section *root = NULL;
 
 	for (root_nr = 0; root_nr < NR_SECTION_ROOTS; root_nr++) {
 		root = __nr_to_section(root_nr * SECTIONS_PER_ROOT);
@@ -112,7 +111,7 @@ int __section_nr(struct mem_section* ms)
 		     break;
 	}
 
-	VM_BUG_ON(root_nr == NR_SECTION_ROOTS);
+	VM_BUG_ON(!root);
 
 	return (root_nr * SECTIONS_PER_ROOT) + (ms - root);
 }
@@ -330,11 +329,17 @@ again:
 static void __init check_usemap_section_nr(int nid, unsigned long *usemap)
 {
 	unsigned long usemap_snr, pgdat_snr;
-	static unsigned long old_usemap_snr = NR_MEM_SECTIONS;
-	static unsigned long old_pgdat_snr = NR_MEM_SECTIONS;
+	static unsigned long old_usemap_snr;
+	static unsigned long old_pgdat_snr;
 	struct pglist_data *pgdat = NODE_DATA(nid);
 	int usemap_nid;
 
+	/* First call */
+	if (!old_usemap_snr) {
+		old_usemap_snr = NR_MEM_SECTIONS;
+		old_pgdat_snr = NR_MEM_SECTIONS;
+	}
+
 	usemap_snr = pfn_to_section_nr(__pa(usemap) >> PAGE_SHIFT);
 	pgdat_snr = pfn_to_section_nr(__pa(pgdat) >> PAGE_SHIFT);
 	if (usemap_snr == pgdat_snr)

^ permalink raw reply	[flat|nested] 349+ messages in thread

* [PATCH 4.14 024/159] x86/kasan: Use the same shadow offset for 4- and 5-level paging
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
@ 2017-12-22  8:45   ` Greg Kroah-Hartman
  2017-12-22  8:44 ` [PATCH 4.14 002/159] objtool: Dont report end of section error after an empty unwind hint Greg Kroah-Hartman
                     ` (164 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Andrey Ryabinin, Kirill A. Shutemov,
	Andrew Morton, Andy Lutomirski, Borislav Petkov, Cyrill Gorcunov,
	Linus Torvalds, Peter Zijlstra, Thomas Gleixner, linux-mm,
	Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andrey Ryabinin <aryabinin@virtuozzo.com>

commit 12a8cc7fcf54a8575f094be1e99032ec38aa045c upstream.

We are going to support boot-time switching between 4- and 5-level
paging. For KASAN it means we cannot have different KASAN_SHADOW_OFFSET
for different paging modes: the constant is passed to gcc to generate
code and cannot be changed at runtime.

This patch changes KASAN code to use 0xdffffc0000000000 as shadow offset
for both 4- and 5-level paging.

For 5-level paging it means that the shadow memory region is no longer
aligned to a PGD boundary, so we have to handle the unaligned parts of
the region properly.
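
The alignment consequence can be checked with a short calculation. This sketch assumes the usual KASAN mapping of one shadow byte per 8 bytes of memory, a kernel half that starts at -1 << 47 (4-level) or -1 << 56 (5-level), and PGD shifts of 39 and 48; the helper names are hypothetical.

```c
#include <stdint.h>

#define DEMO_KASAN_SHADOW_OFFSET	UINT64_C(0xdffffc0000000000)
#define DEMO_KASAN_SHADOW_SCALE_SHIFT	3	/* one shadow byte per 8 bytes */

/* Lowest shadow address for a kernel half that starts at -1 << virt_shift. */
static uint64_t demo_shadow_start(unsigned int virt_shift)
{
	uint64_t kernel_start = ~UINT64_C(0) << virt_shift;

	return DEMO_KASAN_SHADOW_OFFSET +
	       (kernel_start >> DEMO_KASAN_SHADOW_SCALE_SHIFT);
}

/* Nonzero when addr sits on a boundary of 1 << pgdir_shift bytes. */
static int demo_pgd_aligned(uint64_t addr, unsigned int pgdir_shift)
{
	return (addr & ((UINT64_C(1) << pgdir_shift) - 1)) == 0;
}
```

Under these assumptions the 4-level start is 0xffffec0000000000, which is PGD aligned, while the 5-level start is 0xffdffc0000000000, which is not. Rounding the latter down to a PGD boundary gives 0xffdf000000000000, the start address listed in the mm.txt hunk of this patch.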

In addition, we have to exclude paravirt code from KASAN instrumentation
as we now use set_pgd() before KASAN is fully ready.

[kirill.shutemov@linux.intel.com: cleanup, changelog message]
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@suse.de>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/20170929140821.37654-4-kirill.shutemov@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 Documentation/x86/x86_64/mm.txt |    2 
 arch/x86/Kconfig                |    1 
 arch/x86/kernel/Makefile        |    3 -
 arch/x86/mm/kasan_init_64.c     |  101 +++++++++++++++++++++++++++++++---------
 4 files changed, 83 insertions(+), 24 deletions(-)

--- a/Documentation/x86/x86_64/mm.txt
+++ b/Documentation/x86/x86_64/mm.txt
@@ -34,7 +34,7 @@ ff92000000000000 - ffd1ffffffffffff (=54
 ffd2000000000000 - ffd3ffffffffffff (=49 bits) hole
 ffd4000000000000 - ffd5ffffffffffff (=49 bits) virtual memory map (512TB)
 ... unused hole ...
-ffd8000000000000 - fff7ffffffffffff (=53 bits) kasan shadow memory (8PB)
+ffdf000000000000 - fffffc0000000000 (=53 bits) kasan shadow memory (8PB)
 ... unused hole ...
 ffffff0000000000 - ffffff7fffffffff (=39 bits) %esp fixup stacks
 ... unused hole ...
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -303,7 +303,6 @@ config ARCH_SUPPORTS_DEBUG_PAGEALLOC
 config KASAN_SHADOW_OFFSET
 	hex
 	depends on KASAN
-	default 0xdff8000000000000 if X86_5LEVEL
 	default 0xdffffc0000000000
 
 config HAVE_INTEL_TXT
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -25,7 +25,8 @@ endif
 KASAN_SANITIZE_head$(BITS).o				:= n
 KASAN_SANITIZE_dumpstack.o				:= n
 KASAN_SANITIZE_dumpstack_$(BITS).o			:= n
-KASAN_SANITIZE_stacktrace.o := n
+KASAN_SANITIZE_stacktrace.o				:= n
+KASAN_SANITIZE_paravirt.o				:= n
 
 OBJECT_FILES_NON_STANDARD_relocate_kernel_$(BITS).o	:= y
 OBJECT_FILES_NON_STANDARD_ftrace_$(BITS).o		:= y
--- a/arch/x86/mm/kasan_init_64.c
+++ b/arch/x86/mm/kasan_init_64.c
@@ -16,6 +16,8 @@
 
 extern struct range pfn_mapped[E820_MAX_ENTRIES];
 
+static p4d_t tmp_p4d_table[PTRS_PER_P4D] __initdata __aligned(PAGE_SIZE);
+
 static int __init map_range(struct range *range)
 {
 	unsigned long start;
@@ -31,8 +33,10 @@ static void __init clear_pgds(unsigned l
 			unsigned long end)
 {
 	pgd_t *pgd;
+	/* See comment in kasan_init() */
+	unsigned long pgd_end = end & PGDIR_MASK;
 
-	for (; start < end; start += PGDIR_SIZE) {
+	for (; start < pgd_end; start += PGDIR_SIZE) {
 		pgd = pgd_offset_k(start);
 		/*
 		 * With folded p4d, pgd_clear() is nop, use p4d_clear()
@@ -43,29 +47,61 @@ static void __init clear_pgds(unsigned l
 		else
 			pgd_clear(pgd);
 	}
+
+	pgd = pgd_offset_k(start);
+	for (; start < end; start += P4D_SIZE)
+		p4d_clear(p4d_offset(pgd, start));
+}
+
+static inline p4d_t *early_p4d_offset(pgd_t *pgd, unsigned long addr)
+{
+	unsigned long p4d;
+
+	if (!IS_ENABLED(CONFIG_X86_5LEVEL))
+		return (p4d_t *)pgd;
+
+	p4d = __pa_nodebug(pgd_val(*pgd)) & PTE_PFN_MASK;
+	p4d += __START_KERNEL_map - phys_base;
+	return (p4d_t *)p4d + p4d_index(addr);
+}
+
+static void __init kasan_early_p4d_populate(pgd_t *pgd,
+		unsigned long addr,
+		unsigned long end)
+{
+	pgd_t pgd_entry;
+	p4d_t *p4d, p4d_entry;
+	unsigned long next;
+
+	if (pgd_none(*pgd)) {
+		pgd_entry = __pgd(_KERNPG_TABLE | __pa_nodebug(kasan_zero_p4d));
+		set_pgd(pgd, pgd_entry);
+	}
+
+	p4d = early_p4d_offset(pgd, addr);
+	do {
+		next = p4d_addr_end(addr, end);
+
+		if (!p4d_none(*p4d))
+			continue;
+
+		p4d_entry = __p4d(_KERNPG_TABLE | __pa_nodebug(kasan_zero_pud));
+		set_p4d(p4d, p4d_entry);
+	} while (p4d++, addr = next, addr != end && p4d_none(*p4d));
 }
 
 static void __init kasan_map_early_shadow(pgd_t *pgd)
 {
-	int i;
-	unsigned long start = KASAN_SHADOW_START;
+	/* See comment in kasan_init() */
+	unsigned long addr = KASAN_SHADOW_START & PGDIR_MASK;
 	unsigned long end = KASAN_SHADOW_END;
+	unsigned long next;
 
-	for (i = pgd_index(start); start < end; i++) {
-		switch (CONFIG_PGTABLE_LEVELS) {
-		case 4:
-			pgd[i] = __pgd(__pa_nodebug(kasan_zero_pud) |
-					_KERNPG_TABLE);
-			break;
-		case 5:
-			pgd[i] = __pgd(__pa_nodebug(kasan_zero_p4d) |
-					_KERNPG_TABLE);
-			break;
-		default:
-			BUILD_BUG();
-		}
-		start += PGDIR_SIZE;
-	}
+	pgd += pgd_index(addr);
+	do {
+		next = pgd_addr_end(addr, end);
+		kasan_early_p4d_populate(pgd, addr, next);
+	} while (pgd++, addr = next, addr != end);
 }
 
 #ifdef CONFIG_KASAN_INLINE
@@ -102,7 +138,7 @@ void __init kasan_early_init(void)
 	for (i = 0; i < PTRS_PER_PUD; i++)
 		kasan_zero_pud[i] = __pud(pud_val);
 
-	for (i = 0; CONFIG_PGTABLE_LEVELS >= 5 && i < PTRS_PER_P4D; i++)
+	for (i = 0; IS_ENABLED(CONFIG_X86_5LEVEL) && i < PTRS_PER_P4D; i++)
 		kasan_zero_p4d[i] = __p4d(p4d_val);
 
 	kasan_map_early_shadow(early_top_pgt);
@@ -118,12 +154,35 @@ void __init kasan_init(void)
 #endif
 
 	memcpy(early_top_pgt, init_top_pgt, sizeof(early_top_pgt));
+
+	/*
+	 * We use the same shadow offset for 4- and 5-level paging to
+	 * facilitate boot-time switching between paging modes.
+	 * As result in 5-level paging mode KASAN_SHADOW_START and
+	 * KASAN_SHADOW_END are not aligned to PGD boundary.
+	 *
+	 * KASAN_SHADOW_START doesn't share PGD with anything else.
+	 * We claim whole PGD entry to make things easier.
+	 *
+	 * KASAN_SHADOW_END lands in the last PGD entry and it collides with
+	 * bunch of things like kernel code, modules, EFI mapping, etc.
+	 * We need to take extra steps to not overwrite them.
+	 */
+	if (IS_ENABLED(CONFIG_X86_5LEVEL)) {
+		void *ptr;
+
+		ptr = (void *)pgd_page_vaddr(*pgd_offset_k(KASAN_SHADOW_END));
+		memcpy(tmp_p4d_table, (void *)ptr, sizeof(tmp_p4d_table));
+		set_pgd(&early_top_pgt[pgd_index(KASAN_SHADOW_END)],
+				__pgd(__pa(tmp_p4d_table) | _KERNPG_TABLE));
+	}
+
 	load_cr3(early_top_pgt);
 	__flush_tlb_all();
 
-	clear_pgds(KASAN_SHADOW_START, KASAN_SHADOW_END);
+	clear_pgds(KASAN_SHADOW_START & PGDIR_MASK, KASAN_SHADOW_END);
 
-	kasan_populate_zero_shadow((void *)KASAN_SHADOW_START,
+	kasan_populate_zero_shadow((void *)(KASAN_SHADOW_START & PGDIR_MASK),
 			kasan_mem_to_shadow((void *)PAGE_OFFSET));
 
 	for (i = 0; i < E820_MAX_ENTRIES; i++) {

^ permalink raw reply	[flat|nested] 349+ messages in thread

* [PATCH 4.14 025/159] x86/xen: Provide pre-built page tables only for CONFIG_XEN_PV=y and CONFIG_XEN_PVH=y
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
@ 2017-12-22  8:45   ` Greg Kroah-Hartman
  2017-12-22  8:44 ` [PATCH 4.14 002/159] objtool: Dont report end of section error after an empty unwind hint Greg Kroah-Hartman
                     ` (164 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Kirill A. Shutemov, Juergen Gross,
	Andrew Morton, Andy Lutomirski, Borislav Petkov, Cyrill Gorcunov,
	Linus Torvalds, Peter Zijlstra, Thomas Gleixner, linux-mm,
	Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

commit 4375c29985f155d7eb2346615d84e62d1b673682 upstream.

Looks like we only need pre-built page tables in the CONFIG_XEN_PV=y and
CONFIG_XEN_PVH=y cases.

Let's not provide them for other configurations.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@suse.de>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/20170929140821.37654-5-kirill.shutemov@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/kernel/head_64.S |   11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -38,11 +38,12 @@
  *
  */
 
-#define p4d_index(x)	(((x) >> P4D_SHIFT) & (PTRS_PER_P4D-1))
 #define pud_index(x)	(((x) >> PUD_SHIFT) & (PTRS_PER_PUD-1))
 
+#if defined(CONFIG_XEN_PV) || defined(CONFIG_XEN_PVH)
 PGD_PAGE_OFFSET = pgd_index(__PAGE_OFFSET_BASE)
 PGD_START_KERNEL = pgd_index(__START_KERNEL_map)
+#endif
 L3_START_KERNEL = pud_index(__START_KERNEL_map)
 
 	.text
@@ -365,10 +366,7 @@ NEXT_PAGE(early_dynamic_pgts)
 
 	.data
 
-#ifndef CONFIG_XEN
-NEXT_PAGE(init_top_pgt)
-	.fill	512,8,0
-#else
+#if defined(CONFIG_XEN_PV) || defined(CONFIG_XEN_PVH)
 NEXT_PAGE(init_top_pgt)
 	.quad   level3_ident_pgt - __START_KERNEL_map + _KERNPG_TABLE_NOENC
 	.org    init_top_pgt + PGD_PAGE_OFFSET*8, 0
@@ -385,6 +383,9 @@ NEXT_PAGE(level2_ident_pgt)
 	 * Don't set NX because code runs from these pages.
 	 */
 	PMDS(0, __PAGE_KERNEL_IDENT_LARGE_EXEC, PTRS_PER_PMD)
+#else
+NEXT_PAGE(init_top_pgt)
+	.fill	512,8,0
 #endif
 
 #ifdef CONFIG_X86_5LEVEL

^ permalink raw reply	[flat|nested] 349+ messages in thread

* [PATCH 4.14 026/159] x86/xen: Drop 5-level paging support code from the XEN_PV code
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
@ 2017-12-22  8:45   ` Greg Kroah-Hartman
  2017-12-22  8:44 ` [PATCH 4.14 002/159] objtool: Dont report end of section error after an empty unwind hint Greg Kroah-Hartman
                     ` (164 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Juergen Gross, Kirill A. Shutemov,
	Andrew Morton, Andy Lutomirski, Borislav Petkov, Cyrill Gorcunov,
	Linus Torvalds, Peter Zijlstra, Thomas Gleixner, linux-mm,
	Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

commit 773dd2fca581b0a80e5a33332cc8ee67e5a79cba upstream.

It was decided 5-level paging is not going to be supported in XEN_PV.

Let's drop the dead code from the XEN_PV code.
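Why the per-entry p4d loops collapse to a single check can be sketched in plain C. This is only an illustration, assuming the folded 4-level values PTRS_PER_P4D = 1 and P4D_SHIFT = 39 (neither value appears in this patch); old_loop_bound() is a hypothetical helper that mirrors the "nr" computation removed from xen_p4d_walk().

```c
#include <stdint.h>

/* Assumed 4-level (folded p4d) values; not taken from this patch. */
#define P4D_SHIFT     39
#define PTRS_PER_P4D  1

/* Same shape as the kernel's p4d_index() computation. */
static uint64_t p4d_index(uint64_t addr)
{
	return (addr >> P4D_SHIFT) & (PTRS_PER_P4D - 1);
}

/* Upper bound of the old "for (i = 0; i < nr; i++)" loop (hypothetical helper). */
static uint64_t old_loop_bound(int last, uint64_t limit)
{
	return last ? p4d_index(limit) + 1 : PTRS_PER_P4D;
}
```

With a single p4d entry the index is masked with zero, so the old loop bound is 1 whatever "last" and "limit" are, and testing *p4d directly is equivalent.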

Tested-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@suse.de>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/20170929140821.37654-6-kirill.shutemov@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/xen/mmu_pv.c |  159 ++++++++++++++++++--------------------------------
 1 file changed, 60 insertions(+), 99 deletions(-)

--- a/arch/x86/xen/mmu_pv.c
+++ b/arch/x86/xen/mmu_pv.c
@@ -449,7 +449,7 @@ __visible pmd_t xen_make_pmd(pmdval_t pm
 }
 PV_CALLEE_SAVE_REGS_THUNK(xen_make_pmd);
 
-#if CONFIG_PGTABLE_LEVELS == 4
+#ifdef CONFIG_X86_64
 __visible pudval_t xen_pud_val(pud_t pud)
 {
 	return pte_mfn_to_pfn(pud.pud);
@@ -538,7 +538,7 @@ static void xen_set_p4d(p4d_t *ptr, p4d_
 
 	xen_mc_issue(PARAVIRT_LAZY_MMU);
 }
-#endif	/* CONFIG_PGTABLE_LEVELS == 4 */
+#endif	/* CONFIG_X86_64 */
 
 static int xen_pmd_walk(struct mm_struct *mm, pmd_t *pmd,
 		int (*func)(struct mm_struct *mm, struct page *, enum pt_level),
@@ -580,21 +580,17 @@ static int xen_p4d_walk(struct mm_struct
 		int (*func)(struct mm_struct *mm, struct page *, enum pt_level),
 		bool last, unsigned long limit)
 {
-	int i, nr, flush = 0;
+	int flush = 0;
+	pud_t *pud;
 
-	nr = last ? p4d_index(limit) + 1 : PTRS_PER_P4D;
-	for (i = 0; i < nr; i++) {
-		pud_t *pud;
 
-		if (p4d_none(p4d[i]))
-			continue;
+	if (p4d_none(*p4d))
+		return flush;
 
-		pud = pud_offset(&p4d[i], 0);
-		if (PTRS_PER_PUD > 1)
-			flush |= (*func)(mm, virt_to_page(pud), PT_PUD);
-		flush |= xen_pud_walk(mm, pud, func,
-				last && i == nr - 1, limit);
-	}
+	pud = pud_offset(p4d, 0);
+	if (PTRS_PER_PUD > 1)
+		flush |= (*func)(mm, virt_to_page(pud), PT_PUD);
+	flush |= xen_pud_walk(mm, pud, func, last, limit);
 	return flush;
 }
 
@@ -644,8 +640,6 @@ static int __xen_pgd_walk(struct mm_stru
 			continue;
 
 		p4d = p4d_offset(&pgd[i], 0);
-		if (PTRS_PER_P4D > 1)
-			flush |= (*func)(mm, virt_to_page(p4d), PT_P4D);
 		flush |= xen_p4d_walk(mm, p4d, func, i == nr - 1, limit);
 	}
 
@@ -1176,22 +1170,14 @@ static void __init xen_cleanmfnmap(unsig
 {
 	pgd_t *pgd;
 	p4d_t *p4d;
-	unsigned int i;
 	bool unpin;
 
 	unpin = (vaddr == 2 * PGDIR_SIZE);
 	vaddr &= PMD_MASK;
 	pgd = pgd_offset_k(vaddr);
 	p4d = p4d_offset(pgd, 0);
-	for (i = 0; i < PTRS_PER_P4D; i++) {
-		if (p4d_none(p4d[i]))
-			continue;
-		xen_cleanmfnmap_p4d(p4d + i, unpin);
-	}
-	if (IS_ENABLED(CONFIG_X86_5LEVEL)) {
-		set_pgd(pgd, __pgd(0));
-		xen_cleanmfnmap_free_pgtbl(p4d, unpin);
-	}
+	if (!p4d_none(*p4d))
+		xen_cleanmfnmap_p4d(p4d, unpin);
 }
 
 static void __init xen_pagetable_p2m_free(void)
@@ -1692,7 +1678,7 @@ static void xen_release_pmd(unsigned lon
 	xen_release_ptpage(pfn, PT_PMD);
 }
 
-#if CONFIG_PGTABLE_LEVELS >= 4
+#ifdef CONFIG_X86_64
 static void xen_alloc_pud(struct mm_struct *mm, unsigned long pfn)
 {
 	xen_alloc_ptpage(mm, pfn, PT_PUD);
@@ -2029,13 +2015,12 @@ static phys_addr_t __init xen_early_virt
  */
 void __init xen_relocate_p2m(void)
 {
-	phys_addr_t size, new_area, pt_phys, pmd_phys, pud_phys, p4d_phys;
+	phys_addr_t size, new_area, pt_phys, pmd_phys, pud_phys;
 	unsigned long p2m_pfn, p2m_pfn_end, n_frames, pfn, pfn_end;
-	int n_pte, n_pt, n_pmd, n_pud, n_p4d, idx_pte, idx_pt, idx_pmd, idx_pud, idx_p4d;
+	int n_pte, n_pt, n_pmd, n_pud, idx_pte, idx_pt, idx_pmd, idx_pud;
 	pte_t *pt;
 	pmd_t *pmd;
 	pud_t *pud;
-	p4d_t *p4d = NULL;
 	pgd_t *pgd;
 	unsigned long *new_p2m;
 	int save_pud;
@@ -2045,11 +2030,7 @@ void __init xen_relocate_p2m(void)
 	n_pt = roundup(size, PMD_SIZE) >> PMD_SHIFT;
 	n_pmd = roundup(size, PUD_SIZE) >> PUD_SHIFT;
 	n_pud = roundup(size, P4D_SIZE) >> P4D_SHIFT;
-	if (PTRS_PER_P4D > 1)
-		n_p4d = roundup(size, PGDIR_SIZE) >> PGDIR_SHIFT;
-	else
-		n_p4d = 0;
-	n_frames = n_pte + n_pt + n_pmd + n_pud + n_p4d;
+	n_frames = n_pte + n_pt + n_pmd + n_pud;
 
 	new_area = xen_find_free_area(PFN_PHYS(n_frames));
 	if (!new_area) {
@@ -2065,76 +2046,56 @@ void __init xen_relocate_p2m(void)
 	 * To avoid any possible virtual address collision, just use
 	 * 2 * PUD_SIZE for the new area.
 	 */
-	p4d_phys = new_area;
-	pud_phys = p4d_phys + PFN_PHYS(n_p4d);
+	pud_phys = new_area;
 	pmd_phys = pud_phys + PFN_PHYS(n_pud);
 	pt_phys = pmd_phys + PFN_PHYS(n_pmd);
 	p2m_pfn = PFN_DOWN(pt_phys) + n_pt;
 
 	pgd = __va(read_cr3_pa());
 	new_p2m = (unsigned long *)(2 * PGDIR_SIZE);
-	idx_p4d = 0;
 	save_pud = n_pud;
-	do {
-		if (n_p4d > 0) {
-			p4d = early_memremap(p4d_phys, PAGE_SIZE);
-			clear_page(p4d);
-			n_pud = min(save_pud, PTRS_PER_P4D);
-		}
-		for (idx_pud = 0; idx_pud < n_pud; idx_pud++) {
-			pud = early_memremap(pud_phys, PAGE_SIZE);
-			clear_page(pud);
-			for (idx_pmd = 0; idx_pmd < min(n_pmd, PTRS_PER_PUD);
-				 idx_pmd++) {
-				pmd = early_memremap(pmd_phys, PAGE_SIZE);
-				clear_page(pmd);
-				for (idx_pt = 0; idx_pt < min(n_pt, PTRS_PER_PMD);
-					 idx_pt++) {
-					pt = early_memremap(pt_phys, PAGE_SIZE);
-					clear_page(pt);
-					for (idx_pte = 0;
-						 idx_pte < min(n_pte, PTRS_PER_PTE);
-						 idx_pte++) {
-						set_pte(pt + idx_pte,
-								pfn_pte(p2m_pfn, PAGE_KERNEL));
-						p2m_pfn++;
-					}
-					n_pte -= PTRS_PER_PTE;
-					early_memunmap(pt, PAGE_SIZE);
-					make_lowmem_page_readonly(__va(pt_phys));
-					pin_pagetable_pfn(MMUEXT_PIN_L1_TABLE,
-							PFN_DOWN(pt_phys));
-					set_pmd(pmd + idx_pt,
-							__pmd(_PAGE_TABLE | pt_phys));
-					pt_phys += PAGE_SIZE;
+	for (idx_pud = 0; idx_pud < n_pud; idx_pud++) {
+		pud = early_memremap(pud_phys, PAGE_SIZE);
+		clear_page(pud);
+		for (idx_pmd = 0; idx_pmd < min(n_pmd, PTRS_PER_PUD);
+				idx_pmd++) {
+			pmd = early_memremap(pmd_phys, PAGE_SIZE);
+			clear_page(pmd);
+			for (idx_pt = 0; idx_pt < min(n_pt, PTRS_PER_PMD);
+					idx_pt++) {
+				pt = early_memremap(pt_phys, PAGE_SIZE);
+				clear_page(pt);
+				for (idx_pte = 0;
+						idx_pte < min(n_pte, PTRS_PER_PTE);
+						idx_pte++) {
+					set_pte(pt + idx_pte,
+							pfn_pte(p2m_pfn, PAGE_KERNEL));
+					p2m_pfn++;
 				}
-				n_pt -= PTRS_PER_PMD;
-				early_memunmap(pmd, PAGE_SIZE);
-				make_lowmem_page_readonly(__va(pmd_phys));
-				pin_pagetable_pfn(MMUEXT_PIN_L2_TABLE,
-						PFN_DOWN(pmd_phys));
-				set_pud(pud + idx_pmd, __pud(_PAGE_TABLE | pmd_phys));
-				pmd_phys += PAGE_SIZE;
+				n_pte -= PTRS_PER_PTE;
+				early_memunmap(pt, PAGE_SIZE);
+				make_lowmem_page_readonly(__va(pt_phys));
+				pin_pagetable_pfn(MMUEXT_PIN_L1_TABLE,
+						PFN_DOWN(pt_phys));
+				set_pmd(pmd + idx_pt,
+						__pmd(_PAGE_TABLE | pt_phys));
+				pt_phys += PAGE_SIZE;
 			}
-			n_pmd -= PTRS_PER_PUD;
-			early_memunmap(pud, PAGE_SIZE);
-			make_lowmem_page_readonly(__va(pud_phys));
-			pin_pagetable_pfn(MMUEXT_PIN_L3_TABLE, PFN_DOWN(pud_phys));
-			if (n_p4d > 0)
-				set_p4d(p4d + idx_pud, __p4d(_PAGE_TABLE | pud_phys));
-			else
-				set_pgd(pgd + 2 + idx_pud, __pgd(_PAGE_TABLE | pud_phys));
-			pud_phys += PAGE_SIZE;
-		}
-		if (n_p4d > 0) {
-			save_pud -= PTRS_PER_P4D;
-			early_memunmap(p4d, PAGE_SIZE);
-			make_lowmem_page_readonly(__va(p4d_phys));
-			pin_pagetable_pfn(MMUEXT_PIN_L4_TABLE, PFN_DOWN(p4d_phys));
-			set_pgd(pgd + 2 + idx_p4d, __pgd(_PAGE_TABLE | p4d_phys));
-			p4d_phys += PAGE_SIZE;
+			n_pt -= PTRS_PER_PMD;
+			early_memunmap(pmd, PAGE_SIZE);
+			make_lowmem_page_readonly(__va(pmd_phys));
+			pin_pagetable_pfn(MMUEXT_PIN_L2_TABLE,
+					PFN_DOWN(pmd_phys));
+			set_pud(pud + idx_pmd, __pud(_PAGE_TABLE | pmd_phys));
+			pmd_phys += PAGE_SIZE;
 		}
-	} while (++idx_p4d < n_p4d);
+		n_pmd -= PTRS_PER_PUD;
+		early_memunmap(pud, PAGE_SIZE);
+		make_lowmem_page_readonly(__va(pud_phys));
+		pin_pagetable_pfn(MMUEXT_PIN_L3_TABLE, PFN_DOWN(pud_phys));
+		set_pgd(pgd + 2 + idx_pud, __pgd(_PAGE_TABLE | pud_phys));
+		pud_phys += PAGE_SIZE;
+	}
 
 	/* Now copy the old p2m info to the new area. */
 	memcpy(new_p2m, xen_p2m_addr, size);
@@ -2361,7 +2322,7 @@ static void __init xen_post_allocator_in
 	pv_mmu_ops.set_pte = xen_set_pte;
 	pv_mmu_ops.set_pmd = xen_set_pmd;
 	pv_mmu_ops.set_pud = xen_set_pud;
-#if CONFIG_PGTABLE_LEVELS >= 4
+#ifdef CONFIG_X86_64
 	pv_mmu_ops.set_p4d = xen_set_p4d;
 #endif
 
@@ -2371,7 +2332,7 @@ static void __init xen_post_allocator_in
 	pv_mmu_ops.alloc_pmd = xen_alloc_pmd;
 	pv_mmu_ops.release_pte = xen_release_pte;
 	pv_mmu_ops.release_pmd = xen_release_pmd;
-#if CONFIG_PGTABLE_LEVELS >= 4
+#ifdef CONFIG_X86_64
 	pv_mmu_ops.alloc_pud = xen_alloc_pud;
 	pv_mmu_ops.release_pud = xen_release_pud;
 #endif
@@ -2435,14 +2396,14 @@ static const struct pv_mmu_ops xen_mmu_o
 	.make_pmd = PV_CALLEE_SAVE(xen_make_pmd),
 	.pmd_val = PV_CALLEE_SAVE(xen_pmd_val),
 
-#if CONFIG_PGTABLE_LEVELS >= 4
+#ifdef CONFIG_X86_64
 	.pud_val = PV_CALLEE_SAVE(xen_pud_val),
 	.make_pud = PV_CALLEE_SAVE(xen_make_pud),
 	.set_p4d = xen_set_p4d_hyper,
 
 	.alloc_pud = xen_alloc_pmd_init,
 	.release_pud = xen_release_pmd_init,
-#endif	/* CONFIG_PGTABLE_LEVELS == 4 */
+#endif	/* CONFIG_X86_64 */
 
 	.activate_mm = xen_activate_mm,
 	.dup_mmap = xen_dup_mmap,

^ permalink raw reply	[flat|nested] 349+ messages in thread

* [PATCH 4.14 027/159] ACPI / APEI: remove the unused dead-code for SEA/NMI notification type
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (25 preceding siblings ...)
  2017-12-22  8:45   ` Greg Kroah-Hartman
@ 2017-12-22  8:45 ` Greg Kroah-Hartman
  2017-12-22  8:45 ` [PATCH 4.14 028/159] x86/asm: Dont use the confusing .ifeq directive Greg Kroah-Hartman
                   ` (138 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Dongjiu Geng, Tyler Baicar,
	Borislav Petkov, Rafael J. Wysocki

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Dongjiu Geng <gengdongjiu@huawei.com>

commit c49870e89f4d2c21c76ebe90568246bb0f3572b7 upstream.

For the SEA notification, the two functions ghes_sea_add() and
ghes_sea_remove() are only called when CONFIG_ACPI_APEI_SEA
is defined. If it is not, ghes_probe() returns an error and does
not continue. If the probe fails, ghes_sea_remove() also has no
chance to be called. Hence, remove the unnecessary handling for
the case where CONFIG_ACPI_APEI_SEA is not defined.

For the NMI notification, it has the same issue as SEA notification,
so also remove the unused dead-code for it.
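The argument can be sketched in plain C. All names below (HAVE_SEA, NOTIFY_SEA, sea_add(), probe()) are hypothetical stand-ins rather than the ghes.c code: the probe routine rejects an unsupported notification type before any add helper is reached, so the fallback helper can be an empty stub.

```c
/* Hypothetical stand-ins for the notification types. */
enum notify_type { NOTIFY_POLLED, NOTIFY_SEA };

static int sea_added;	/* counts calls to the real helper */

#ifdef HAVE_SEA		/* stand-in for CONFIG_ACPI_APEI_SEA */
static void sea_add(void) { sea_added++; }
static const int sea_supported = 1;
#else
/* Never reached: probe() bails out first, so an empty stub is enough. */
static inline void sea_add(void) { }
static const int sea_supported = 0;
#endif

/* Returns 0 on success, -1 when the notification type is unsupported. */
static int probe(enum notify_type type)
{
	if (type == NOTIFY_SEA && !sea_supported)
		return -1;	/* reject before sea_add() can be called */

	if (type == NOTIFY_SEA)
		sea_add();
	return 0;
}
```

Built without HAVE_SEA, probe(NOTIFY_SEA) fails and the stub is never executed, which is why an error message inside it would be dead code.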

Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
Tested-by: Tyler Baicar <tbaicar@codeaurora.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/acpi/apei/ghes.c |   33 +++++----------------------------
 1 file changed, 5 insertions(+), 28 deletions(-)

--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -852,17 +852,8 @@ static void ghes_sea_remove(struct ghes
 	synchronize_rcu();
 }
 #else /* CONFIG_ACPI_APEI_SEA */
-static inline void ghes_sea_add(struct ghes *ghes)
-{
-	pr_err(GHES_PFX "ID: %d, trying to add SEA notification which is not supported\n",
-	       ghes->generic->header.source_id);
-}
-
-static inline void ghes_sea_remove(struct ghes *ghes)
-{
-	pr_err(GHES_PFX "ID: %d, trying to remove SEA notification which is not supported\n",
-	       ghes->generic->header.source_id);
-}
+static inline void ghes_sea_add(struct ghes *ghes) { }
+static inline void ghes_sea_remove(struct ghes *ghes) { }
 #endif /* CONFIG_ACPI_APEI_SEA */
 
 #ifdef CONFIG_HAVE_ACPI_APEI_NMI
@@ -1064,23 +1055,9 @@ static void ghes_nmi_init_cxt(void)
 	init_irq_work(&ghes_proc_irq_work, ghes_proc_in_irq);
 }
 #else /* CONFIG_HAVE_ACPI_APEI_NMI */
-static inline void ghes_nmi_add(struct ghes *ghes)
-{
-	pr_err(GHES_PFX "ID: %d, trying to add NMI notification which is not supported!\n",
-	       ghes->generic->header.source_id);
-	BUG();
-}
-
-static inline void ghes_nmi_remove(struct ghes *ghes)
-{
-	pr_err(GHES_PFX "ID: %d, trying to remove NMI notification which is not supported!\n",
-	       ghes->generic->header.source_id);
-	BUG();
-}
-
-static inline void ghes_nmi_init_cxt(void)
-{
-}
+static inline void ghes_nmi_add(struct ghes *ghes) { }
+static inline void ghes_nmi_remove(struct ghes *ghes) { }
+static inline void ghes_nmi_init_cxt(void) { }
 #endif /* CONFIG_HAVE_ACPI_APEI_NMI */
 
 static int ghes_probe(struct platform_device *ghes_dev)

^ permalink raw reply	[flat|nested] 349+ messages in thread

* [PATCH 4.14 028/159] x86/asm: Dont use the confusing .ifeq directive
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (26 preceding siblings ...)
  2017-12-22  8:45 ` [PATCH 4.14 027/159] ACPI / APEI: remove the unused dead-code for SEA/NMI notification type Greg Kroah-Hartman
@ 2017-12-22  8:45 ` Greg Kroah-Hartman
  2017-12-22  8:45 ` [PATCH 4.14 029/159] x86/build: Beautify build log of syscall headers Greg Kroah-Hartman
                   ` (137 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Josh Poimboeuf, Andrei Vagin,
	Andy Lutomirski, Linus Torvalds, Peter Zijlstra, Thomas Gleixner,
	Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Josh Poimboeuf <jpoimboe@redhat.com>

commit 82c62fa0c49aa305104013cee4468772799bb391 upstream.

I find the '.ifeq <expression>' directive to be confusing.  Reading it
quickly seems to suggest its opposite meaning, or that it's missing an
argument.

Improve readability by replacing all of its x86 uses with
'.if <expression> == 0'.
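For readers unfamiliar with the directive: in GNU as, '.ifeq <expression>' assembles the following block when the expression is zero, which is exactly what the replacement spells out. A minimal C sketch of the per-vector decision in early_idt_handler_array follows, assuming the values EXCEPTION_ERRCODE_MASK = 0x00027d00 and NUM_EXCEPTION_VECTORS = 32 (neither is shown in this patch); the function names are hypothetical.

```c
#include <stdbool.h>

/* Assumed values; not taken from this patch. */
#define NUM_EXCEPTION_VECTORS   32
#define EXCEPTION_ERRCODE_MASK  0x00027d00u

/*
 * Models ".if ((EXCEPTION_ERRCODE_MASK >> i) & 1) == 0":
 * true when the mask bit for vector i is clear, i.e. the stub
 * pushes a dummy error code to keep the stack frame uniform.
 */
static bool needs_dummy_error_code(unsigned int i)
{
	return ((EXCEPTION_ERRCODE_MASK >> i) & 1u) == 0;
}

/* Counts the vectors for which a dummy error code is pushed. */
static unsigned int count_dummy_error_codes(void)
{
	unsigned int i, n = 0;

	for (i = 0; i < NUM_EXCEPTION_VECTORS; i++)
		if (needs_dummy_error_code(i))
			n++;
	return n;
}
```

Written this way the comparison against zero is explicit, which is the readability point the patch makes for the assembler source.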

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrei Vagin <avagin@virtuozzo.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/757da028e802c7e98d23fbab8d234b1063e161cf.1508516398.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/entry/entry_64.S |    2 +-
 arch/x86/kernel/head_32.S |    2 +-
 arch/x86/kernel/head_64.S |    2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -818,7 +818,7 @@ ENTRY(\sym)
 
 	ASM_CLAC
 
-	.ifeq \has_error_code
+	.if \has_error_code == 0
 	pushq	$-1				/* ORIG_RAX: no syscall to restart */
 	.endif
 
--- a/arch/x86/kernel/head_32.S
+++ b/arch/x86/kernel/head_32.S
@@ -402,7 +402,7 @@ ENTRY(early_idt_handler_array)
 	# 24(%rsp) error code
 	i = 0
 	.rept NUM_EXCEPTION_VECTORS
-	.ifeq (EXCEPTION_ERRCODE_MASK >> i) & 1
+	.if ((EXCEPTION_ERRCODE_MASK >> i) & 1) == 0
 	pushl $0		# Dummy error code, to make stack frame uniform
 	.endif
 	pushl $i		# 20(%esp) Vector number
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -275,7 +275,7 @@ ENDPROC(start_cpu0)
 ENTRY(early_idt_handler_array)
 	i = 0
 	.rept NUM_EXCEPTION_VECTORS
-	.ifeq (EXCEPTION_ERRCODE_MASK >> i) & 1
+	.if ((EXCEPTION_ERRCODE_MASK >> i) & 1) == 0
 		UNWIND_HINT_IRET_REGS
 		pushq $0	# Dummy error code, to make stack frame uniform
 	.else

* [PATCH 4.14 029/159] x86/build: Beautify build log of syscall headers
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (27 preceding siblings ...)
  2017-12-22  8:45 ` [PATCH 4.14 028/159] x86/asm: Dont use the confusing .ifeq directive Greg Kroah-Hartman
@ 2017-12-22  8:45 ` Greg Kroah-Hartman
  2017-12-22  8:45 ` [PATCH 4.14 030/159] x86/mm/64: Rename the register_page_bootmem_memmap() size parameter to nr_pages Greg Kroah-Hartman
                   ` (136 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Masahiro Yamada, Thomas Gleixner,
	Linus Torvalds, Peter Zijlstra, H. Peter Anvin, linux-kbuild,
	Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Masahiro Yamada <yamada.masahiro@socionext.com>

commit af8e947079a7dab0480b5d6db6b093fd04b86fc9 upstream.

This makes the build log look nicer.

Before:
  SYSTBL  arch/x86/entry/syscalls/../../include/generated/asm/syscalls_32.h
  SYSHDR  arch/x86/entry/syscalls/../../include/generated/asm/unistd_32_ia32.h
  SYSHDR  arch/x86/entry/syscalls/../../include/generated/asm/unistd_64_x32.h
  SYSTBL  arch/x86/entry/syscalls/../../include/generated/asm/syscalls_64.h
  SYSHDR  arch/x86/entry/syscalls/../../include/generated/uapi/asm/unistd_32.h
  SYSHDR  arch/x86/entry/syscalls/../../include/generated/uapi/asm/unistd_64.h
  SYSHDR  arch/x86/entry/syscalls/../../include/generated/uapi/asm/unistd_x32.h

After:
  SYSTBL  arch/x86/include/generated/asm/syscalls_32.h
  SYSHDR  arch/x86/include/generated/asm/unistd_32_ia32.h
  SYSHDR  arch/x86/include/generated/asm/unistd_64_x32.h
  SYSTBL  arch/x86/include/generated/asm/syscalls_64.h
  SYSHDR  arch/x86/include/generated/uapi/asm/unistd_32.h
  SYSHDR  arch/x86/include/generated/uapi/asm/unistd_64.h
  SYSHDR  arch/x86/include/generated/uapi/asm/unistd_x32.h

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: linux-kbuild@vger.kernel.org
Link: http://lkml.kernel.org/r/1509077470-2735-1-git-send-email-yamada.masahiro@socionext.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/entry/syscalls/Makefile |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/arch/x86/entry/syscalls/Makefile
+++ b/arch/x86/entry/syscalls/Makefile
@@ -1,6 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0
-out := $(obj)/../../include/generated/asm
-uapi := $(obj)/../../include/generated/uapi/asm
+out := arch/$(SRCARCH)/include/generated/asm
+uapi := arch/$(SRCARCH)/include/generated/uapi/asm
 
 # Create output directory if not already present
 _dummy := $(shell [ -d '$(out)' ] || mkdir -p '$(out)') \

* [PATCH 4.14 030/159] x86/mm/64: Rename the register_page_bootmem_memmap() size parameter to nr_pages
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (28 preceding siblings ...)
  2017-12-22  8:45 ` [PATCH 4.14 029/159] x86/build: Beautify build log of syscall headers Greg Kroah-Hartman
@ 2017-12-22  8:45 ` Greg Kroah-Hartman
  2017-12-22  8:45 ` [PATCH 4.14 031/159] x86/cpufeatures: Enable new SSE/AVX/AVX512 CPU features Greg Kroah-Hartman
                   ` (135 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Baoquan He, Thomas Gleixner,
	Linus Torvalds, Peter Zijlstra, akpm, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Baoquan He <bhe@redhat.com>

commit 15670bfe19905b1dcbb63137f40d718b59d84479 upstream.

register_page_bootmem_memmap()'s 3rd 'size' parameter is named
in a somewhat misleading fashion - rename it to 'nr_pages' which
makes the units of it much clearer.

Meanwhile rename the existing local variable 'nr_pages' to
'nr_pmd_pages', a more expressive name, to avoid conflict with
new function parameter 'nr_pages'.

(Also clean up the unnecessary parentheses in which get_order() is called.)
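
The unit question can be sketched in userspace C as follows. Everything here
is a stand-in: struct page_sketch, memmap_end() and get_order_sketch() are
hypothetical names, and the 4 KiB page / 2 MiB PMD sizes are assumed x86-64
defaults rather than values taken from this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed x86-64 values: 4 KiB base pages and 2 MiB PMD mappings. */
#define PAGE_SHIFT	12
#define PMD_SIZE	(1UL << 21)

/* Hypothetical stand-in for struct page; only its size matters here. */
struct page_sketch {
	unsigned long words[8];
};

/*
 * 'nr_pages' counts struct page entries, so the end address comes from
 * pointer arithmetic on the array, not from adding a byte count.
 */
uintptr_t memmap_end(struct page_sketch *start_page, unsigned long nr_pages)
{
	return (uintptr_t)(start_page + nr_pages);
}

/* Simplified get_order(): smallest order with (1 << order) pages >= size. */
int get_order_sketch(unsigned long size)
{
	unsigned long pages;
	int order = 0;

	assert(size > 0);
	pages = (size + (1UL << PAGE_SHIFT) - 1) >> PAGE_SHIFT;
	while ((1UL << order) < pages)
		order++;
	return order;
}

/* Number of base pages backing one PMD-sized mapping. */
unsigned int nr_pmd_pages(void)
{
	return 1U << get_order_sketch(PMD_SIZE);
}
```

Under these assumptions the two counters measure different things, which is
why the local variable gets its own name instead of sharing 'nr_pages' with
the parameter.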

Signed-off-by: Baoquan He <bhe@redhat.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: akpm@linux-foundation.org
Link: http://lkml.kernel.org/r/1509154238-23250-1-git-send-email-bhe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/mm/init_64.c |   10 +++++-----
 include/linux/mm.h    |    2 +-
 2 files changed, 6 insertions(+), 6 deletions(-)

--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -1426,16 +1426,16 @@ int __meminit vmemmap_populate(unsigned
 
 #if defined(CONFIG_MEMORY_HOTPLUG_SPARSE) && defined(CONFIG_HAVE_BOOTMEM_INFO_NODE)
 void register_page_bootmem_memmap(unsigned long section_nr,
-				  struct page *start_page, unsigned long size)
+				  struct page *start_page, unsigned long nr_pages)
 {
 	unsigned long addr = (unsigned long)start_page;
-	unsigned long end = (unsigned long)(start_page + size);
+	unsigned long end = (unsigned long)(start_page + nr_pages);
 	unsigned long next;
 	pgd_t *pgd;
 	p4d_t *p4d;
 	pud_t *pud;
 	pmd_t *pmd;
-	unsigned int nr_pages;
+	unsigned int nr_pmd_pages;
 	struct page *page;
 
 	for (; addr < end; addr = next) {
@@ -1482,9 +1482,9 @@ void register_page_bootmem_memmap(unsign
 			if (pmd_none(*pmd))
 				continue;
 
-			nr_pages = 1 << (get_order(PMD_SIZE));
+			nr_pmd_pages = 1 << get_order(PMD_SIZE);
 			page = pmd_page(*pmd);
-			while (nr_pages--)
+			while (nr_pmd_pages--)
 				get_page_bootmem(section_nr, page++,
 						 SECTION_INFO);
 		}
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2510,7 +2510,7 @@ void vmemmap_populate_print_last(void);
 void vmemmap_free(unsigned long start, unsigned long end);
 #endif
 void register_page_bootmem_memmap(unsigned long section_nr, struct page *map,
-				  unsigned long size);
+				  unsigned long nr_pages);
 
 enum mf_flags {
 	MF_COUNT_INCREASED = 1 << 0,

* [PATCH 4.14 031/159] x86/cpufeatures: Enable new SSE/AVX/AVX512 CPU features
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (29 preceding siblings ...)
  2017-12-22  8:45 ` [PATCH 4.14 030/159] x86/mm/64: Rename the register_page_bootmem_memmap() size parameter to nr_pages Greg Kroah-Hartman
@ 2017-12-22  8:45 ` Greg Kroah-Hartman
  2017-12-22  8:45 ` [PATCH 4.14 032/159] x86/mm: Relocate page fault error codes to traps.h Greg Kroah-Hartman
                   ` (134 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Gayatri Kammela, Thomas Gleixner,
	Andi Kleen, Fenghua Yu, Linus Torvalds, Peter Zijlstra,
	Ravi Shankar, Ricardo Neri, Yang Zhong, bp, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Gayatri Kammela <gayatri.kammela@intel.com>

commit c128dbfa0f879f8ce7b79054037889b0b2240728 upstream.

Add a few new SSE/AVX/AVX512 instruction groups/features for enumeration
in /proc/cpuinfo: AVX512_VBMI2, GFNI, VAES, VPCLMULQDQ, AVX512_VNNI,
AVX512_BITALG.

 CPUID.(EAX=7,ECX=0):ECX[bit 6]  AVX512_VBMI2
 CPUID.(EAX=7,ECX=0):ECX[bit 8]  GFNI
 CPUID.(EAX=7,ECX=0):ECX[bit 9]  VAES
 CPUID.(EAX=7,ECX=0):ECX[bit 10] VPCLMULQDQ
 CPUID.(EAX=7,ECX=0):ECX[bit 11] AVX512_VNNI
 CPUID.(EAX=7,ECX=0):ECX[bit 12] AVX512_BITALG

Detailed information of CPUID bits for these features can be found
in the Intel Architecture Instruction Set Extensions and Future Features
Programming Interface document (refer to Table 1-1. and Table 1-2.).
A copy of this document is available at
https://bugzilla.kernel.org/show_bug.cgi?id=197239
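
As a rough illustration of the bit layout listed above, the sketch below
decodes a CPUID.(EAX=7,ECX=0):ECX value that the caller has already
obtained. The table and leaf7_ecx_has() are hypothetical helpers for this
example only; the kernel itself uses the X86_FEATURE_* definitions added by
the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Bit positions in CPUID.(EAX=7,ECX=0):ECX, as listed in the changelog. */
struct leaf7_ecx_bit {
	unsigned int bit;
	const char *name;
};

static const struct leaf7_ecx_bit leaf7_ecx_bits[] = {
	{  6, "AVX512_VBMI2"  },
	{  8, "GFNI"          },
	{  9, "VAES"          },
	{ 10, "VPCLMULQDQ"    },
	{ 11, "AVX512_VNNI"   },
	{ 12, "AVX512_BITALG" },
};

/* Return 1 if 'name' is one of the new features and its bit is set in ecx. */
int leaf7_ecx_has(uint32_t ecx, const char *name)
{
	size_t i;

	assert(name != NULL);
	for (i = 0; i < sizeof(leaf7_ecx_bits) / sizeof(leaf7_ecx_bits[0]); i++)
		if (strcmp(leaf7_ecx_bits[i].name, name) == 0)
			return (ecx >> leaf7_ecx_bits[i].bit) & 1;
	return 0;	/* unknown name */
}
```

Obtaining the ECX value (by executing CPUID with EAX=7 and ECX=0) is left
out on purpose so the sketch stays architecture-independent.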

Signed-off-by: Gayatri Kammela <gayatri.kammela@intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Cc: Ricardo Neri <ricardo.neri@intel.com>
Cc: Yang Zhong <yang.zhong@intel.com>
Cc: bp@alien8.de
Link: http://lkml.kernel.org/r/1509412829-23380-1-git-send-email-gayatri.kammela@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/include/asm/cpufeatures.h |    6 ++++++
 arch/x86/kernel/cpu/cpuid-deps.c   |    6 ++++++
 2 files changed, 12 insertions(+)

--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -300,6 +300,12 @@
 #define X86_FEATURE_AVX512VBMI  (16*32+ 1) /* AVX512 Vector Bit Manipulation instructions*/
 #define X86_FEATURE_PKU		(16*32+ 3) /* Protection Keys for Userspace */
 #define X86_FEATURE_OSPKE	(16*32+ 4) /* OS Protection Keys Enable */
+#define X86_FEATURE_AVX512_VBMI2 (16*32+ 6) /* Additional AVX512 Vector Bit Manipulation Instructions */
+#define X86_FEATURE_GFNI	(16*32+ 8) /* Galois Field New Instructions */
+#define X86_FEATURE_VAES	(16*32+ 9) /* Vector AES */
+#define X86_FEATURE_VPCLMULQDQ	(16*32+ 10) /* Carry-Less Multiplication Double Quadword */
+#define X86_FEATURE_AVX512_VNNI (16*32+ 11) /* Vector Neural Network Instructions */
+#define X86_FEATURE_AVX512_BITALG (16*32+12) /* Support for VPOPCNT[B,W] and VPSHUF-BITQMB */
 #define X86_FEATURE_AVX512_VPOPCNTDQ (16*32+14) /* POPCNT for vectors of DW/QW */
 #define X86_FEATURE_LA57	(16*32+16) /* 5-level page tables */
 #define X86_FEATURE_RDPID	(16*32+22) /* RDPID instruction */
--- a/arch/x86/kernel/cpu/cpuid-deps.c
+++ b/arch/x86/kernel/cpu/cpuid-deps.c
@@ -50,6 +50,12 @@ const static struct cpuid_dep cpuid_deps
 	{ X86_FEATURE_AVX512BW,		X86_FEATURE_AVX512F   },
 	{ X86_FEATURE_AVX512VL,		X86_FEATURE_AVX512F   },
 	{ X86_FEATURE_AVX512VBMI,	X86_FEATURE_AVX512F   },
+	{ X86_FEATURE_AVX512_VBMI2,	X86_FEATURE_AVX512VL  },
+	{ X86_FEATURE_GFNI,		X86_FEATURE_AVX512VL  },
+	{ X86_FEATURE_VAES,		X86_FEATURE_AVX512VL  },
+	{ X86_FEATURE_VPCLMULQDQ,	X86_FEATURE_AVX512VL  },
+	{ X86_FEATURE_AVX512_VNNI,	X86_FEATURE_AVX512VL  },
+	{ X86_FEATURE_AVX512_BITALG,	X86_FEATURE_AVX512VL  },
 	{ X86_FEATURE_AVX512_4VNNIW,	X86_FEATURE_AVX512F   },
 	{ X86_FEATURE_AVX512_4FMAPS,	X86_FEATURE_AVX512F   },
 	{ X86_FEATURE_AVX512_VPOPCNTDQ, X86_FEATURE_AVX512F   },

* [PATCH 4.14 032/159] x86/mm: Relocate page fault error codes to traps.h
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (30 preceding siblings ...)
  2017-12-22  8:45 ` [PATCH 4.14 031/159] x86/cpufeatures: Enable new SSE/AVX/AVX512 CPU features Greg Kroah-Hartman
@ 2017-12-22  8:45 ` Greg Kroah-Hartman
  2017-12-22  8:45   ` Greg Kroah-Hartman
                   ` (133 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Ricardo Neri, Thomas Gleixner,
	Borislav Petkov, Andy Lutomirski, Michael S. Tsirkin,
	Peter Zijlstra, Dave Hansen, ricardo.neri, Paul Gortmaker,
	Huang Rui, Shuah Khan, Jonathan Corbet, Jiri Slaby,
	Ravi V. Shankar, Chris Metcalf, Brian Gerst, Josh Poimboeuf,
	Chen Yucong, Vlastimil Babka, Masami Hiramatsu, Paolo Bonzini,
	Andrew Morton, Kirill A. Shutemov

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>

commit 1067f030994c69ca1fba8c607437c8895dcf8509 upstream.

Up to this point, only fault.c used the definitions of the page fault error
codes. Thus, it made sense to keep them within such file. Other portions of
code might be interested in those definitions too. For instance, the User-
Mode Instruction Prevention emulation code will use such definitions to
emulate a page fault when it is unable to successfully copy the results
of the emulated instructions to user space.

While relocating the error code enumeration, the prefix X86_ is used to
make it consistent with the rest of the definitions in traps.h. Of course,
code using the enumeration had to be updated as well. No functional changes
were performed.
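
A minimal sketch of how code outside fault.c might consume the relocated
enumeration is given below. The enum values are the ones this patch adds to
traps.h; pf_from_kernel_mode() and pf_spurious_candidate() are hypothetical
helpers that only mirror checks visible in the fault.c hunks, not functions
that exist in the kernel.

```c
#include <assert.h>

/* Page fault error code bits, as defined in traps.h by this patch. */
enum x86_pf_error_code {
	X86_PF_PROT	= 1 << 0,
	X86_PF_WRITE	= 1 << 1,
	X86_PF_USER	= 1 << 2,
	X86_PF_RSVD	= 1 << 3,
	X86_PF_INSTR	= 1 << 4,
	X86_PF_PK	= 1 << 5,
};

/* The "(error_code & 9) == 0" comment in __do_page_fault() means these two. */
static_assert((X86_PF_RSVD | X86_PF_PROT) == 9,
	      "RSVD | PROT must match the literal 9 in the fault.c comment");

/* Mirrors the "!(error_code & X86_PF_USER)" tests: fault came from kernel mode. */
int pf_from_kernel_mode(unsigned long error_code)
{
	return !(error_code & X86_PF_USER);
}

/*
 * Mirrors the first filter in spurious_fault(): only an exact
 * write-protection or exec-protection fault is considered further.
 */
int pf_spurious_candidate(unsigned long error_code)
{
	return error_code == (unsigned long)(X86_PF_WRITE | X86_PF_PROT) ||
	       error_code == (unsigned long)(X86_PF_INSTR | X86_PF_PROT);
}
```

The real spurious_fault() goes on to walk the page tables after this
filter; that part is omitted here.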

Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: ricardo.neri@intel.com
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: "Ravi V. Shankar" <ravi.v.shankar@intel.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Chen Yucong <slaoub@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Link: https://lkml.kernel.org/r/1509135945-13762-2-git-send-email-ricardo.neri-calderon@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/include/asm/traps.h |   18 ++++++++
 arch/x86/mm/fault.c          |   88 ++++++++++++++++---------------------------
 2 files changed, 52 insertions(+), 54 deletions(-)

--- a/arch/x86/include/asm/traps.h
+++ b/arch/x86/include/asm/traps.h
@@ -145,4 +145,22 @@ enum {
 	X86_TRAP_IRET = 32,	/* 32, IRET Exception */
 };
 
+/*
+ * Page fault error code bits:
+ *
+ *   bit 0 ==	 0: no page found	1: protection fault
+ *   bit 1 ==	 0: read access		1: write access
+ *   bit 2 ==	 0: kernel-mode access	1: user-mode access
+ *   bit 3 ==				1: use of reserved bit detected
+ *   bit 4 ==				1: fault was an instruction fetch
+ *   bit 5 ==				1: protection keys block access
+ */
+enum x86_pf_error_code {
+	X86_PF_PROT	=		1 << 0,
+	X86_PF_WRITE	=		1 << 1,
+	X86_PF_USER	=		1 << 2,
+	X86_PF_RSVD	=		1 << 3,
+	X86_PF_INSTR	=		1 << 4,
+	X86_PF_PK	=		1 << 5,
+};
 #endif /* _ASM_X86_TRAPS_H */
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -30,26 +30,6 @@
 #include <asm/trace/exceptions.h>
 
 /*
- * Page fault error code bits:
- *
- *   bit 0 ==	 0: no page found	1: protection fault
- *   bit 1 ==	 0: read access		1: write access
- *   bit 2 ==	 0: kernel-mode access	1: user-mode access
- *   bit 3 ==				1: use of reserved bit detected
- *   bit 4 ==				1: fault was an instruction fetch
- *   bit 5 ==				1: protection keys block access
- */
-enum x86_pf_error_code {
-
-	PF_PROT		=		1 << 0,
-	PF_WRITE	=		1 << 1,
-	PF_USER		=		1 << 2,
-	PF_RSVD		=		1 << 3,
-	PF_INSTR	=		1 << 4,
-	PF_PK		=		1 << 5,
-};
-
-/*
  * Returns 0 if mmiotrace is disabled, or if the fault is not
  * handled by mmiotrace:
  */
@@ -150,7 +130,7 @@ is_prefetch(struct pt_regs *regs, unsign
 	 * If it was a exec (instruction fetch) fault on NX page, then
 	 * do not ignore the fault:
 	 */
-	if (error_code & PF_INSTR)
+	if (error_code & X86_PF_INSTR)
 		return 0;
 
 	instr = (void *)convert_ip_to_linear(current, regs);
@@ -180,7 +160,7 @@ is_prefetch(struct pt_regs *regs, unsign
  * siginfo so userspace can discover which protection key was set
  * on the PTE.
  *
- * If we get here, we know that the hardware signaled a PF_PK
+ * If we get here, we know that the hardware signaled a X86_PF_PK
  * fault and that there was a VMA once we got in the fault
  * handler.  It does *not* guarantee that the VMA we find here
  * was the one that we faulted on.
@@ -205,7 +185,7 @@ static void fill_sig_info_pkey(int si_co
 	/*
 	 * force_sig_info_fault() is called from a number of
 	 * contexts, some of which have a VMA and some of which
-	 * do not.  The PF_PK handing happens after we have a
+	 * do not.  The X86_PF_PK handing happens after we have a
 	 * valid VMA, so we should never reach this without a
 	 * valid VMA.
 	 */
@@ -698,7 +678,7 @@ show_fault_oops(struct pt_regs *regs, un
 	if (!oops_may_print())
 		return;
 
-	if (error_code & PF_INSTR) {
+	if (error_code & X86_PF_INSTR) {
 		unsigned int level;
 		pgd_t *pgd;
 		pte_t *pte;
@@ -780,7 +760,7 @@ no_context(struct pt_regs *regs, unsigne
 		 */
 		if (current->thread.sig_on_uaccess_err && signal) {
 			tsk->thread.trap_nr = X86_TRAP_PF;
-			tsk->thread.error_code = error_code | PF_USER;
+			tsk->thread.error_code = error_code | X86_PF_USER;
 			tsk->thread.cr2 = address;
 
 			/* XXX: hwpoison faults will set the wrong code. */
@@ -898,7 +878,7 @@ __bad_area_nosemaphore(struct pt_regs *r
 	struct task_struct *tsk = current;
 
 	/* User mode accesses just cause a SIGSEGV */
-	if (error_code & PF_USER) {
+	if (error_code & X86_PF_USER) {
 		/*
 		 * It's possible to have interrupts off here:
 		 */
@@ -919,7 +899,7 @@ __bad_area_nosemaphore(struct pt_regs *r
 		 * Instruction fetch faults in the vsyscall page might need
 		 * emulation.
 		 */
-		if (unlikely((error_code & PF_INSTR) &&
+		if (unlikely((error_code & X86_PF_INSTR) &&
 			     ((address & ~0xfff) == VSYSCALL_ADDR))) {
 			if (emulate_vsyscall(regs, address))
 				return;
@@ -932,7 +912,7 @@ __bad_area_nosemaphore(struct pt_regs *r
 		 * are always protection faults.
 		 */
 		if (address >= TASK_SIZE_MAX)
-			error_code |= PF_PROT;
+			error_code |= X86_PF_PROT;
 
 		if (likely(show_unhandled_signals))
 			show_signal_msg(regs, error_code, address, tsk);
@@ -993,11 +973,11 @@ static inline bool bad_area_access_from_
 
 	if (!boot_cpu_has(X86_FEATURE_OSPKE))
 		return false;
-	if (error_code & PF_PK)
+	if (error_code & X86_PF_PK)
 		return true;
 	/* this checks permission keys on the VMA: */
-	if (!arch_vma_access_permitted(vma, (error_code & PF_WRITE),
-				(error_code & PF_INSTR), foreign))
+	if (!arch_vma_access_permitted(vma, (error_code & X86_PF_WRITE),
+				       (error_code & X86_PF_INSTR), foreign))
 		return true;
 	return false;
 }
@@ -1025,7 +1005,7 @@ do_sigbus(struct pt_regs *regs, unsigned
 	int code = BUS_ADRERR;
 
 	/* Kernel mode? Handle exceptions or die: */
-	if (!(error_code & PF_USER)) {
+	if (!(error_code & X86_PF_USER)) {
 		no_context(regs, error_code, address, SIGBUS, BUS_ADRERR);
 		return;
 	}
@@ -1053,14 +1033,14 @@ static noinline void
 mm_fault_error(struct pt_regs *regs, unsigned long error_code,
 	       unsigned long address, u32 *pkey, unsigned int fault)
 {
-	if (fatal_signal_pending(current) && !(error_code & PF_USER)) {
+	if (fatal_signal_pending(current) && !(error_code & X86_PF_USER)) {
 		no_context(regs, error_code, address, 0, 0);
 		return;
 	}
 
 	if (fault & VM_FAULT_OOM) {
 		/* Kernel mode? Handle exceptions or die: */
-		if (!(error_code & PF_USER)) {
+		if (!(error_code & X86_PF_USER)) {
 			no_context(regs, error_code, address,
 				   SIGSEGV, SEGV_MAPERR);
 			return;
@@ -1085,16 +1065,16 @@ mm_fault_error(struct pt_regs *regs, uns
 
 static int spurious_fault_check(unsigned long error_code, pte_t *pte)
 {
-	if ((error_code & PF_WRITE) && !pte_write(*pte))
+	if ((error_code & X86_PF_WRITE) && !pte_write(*pte))
 		return 0;
 
-	if ((error_code & PF_INSTR) && !pte_exec(*pte))
+	if ((error_code & X86_PF_INSTR) && !pte_exec(*pte))
 		return 0;
 	/*
 	 * Note: We do not do lazy flushing on protection key
-	 * changes, so no spurious fault will ever set PF_PK.
+	 * changes, so no spurious fault will ever set X86_PF_PK.
 	 */
-	if ((error_code & PF_PK))
+	if ((error_code & X86_PF_PK))
 		return 1;
 
 	return 1;
@@ -1140,8 +1120,8 @@ spurious_fault(unsigned long error_code,
 	 * change, so user accesses are not expected to cause spurious
 	 * faults.
 	 */
-	if (error_code != (PF_WRITE | PF_PROT)
-	    && error_code != (PF_INSTR | PF_PROT))
+	if (error_code != (X86_PF_WRITE | X86_PF_PROT) &&
+	    error_code != (X86_PF_INSTR | X86_PF_PROT))
 		return 0;
 
 	pgd = init_mm.pgd + pgd_index(address);
@@ -1201,19 +1181,19 @@ access_error(unsigned long error_code, s
 	 * always an unconditional error and can never result in
 	 * a follow-up action to resolve the fault, like a COW.
 	 */
-	if (error_code & PF_PK)
+	if (error_code & X86_PF_PK)
 		return 1;
 
 	/*
 	 * Make sure to check the VMA so that we do not perform
-	 * faults just to hit a PF_PK as soon as we fill in a
+	 * faults just to hit a X86_PF_PK as soon as we fill in a
 	 * page.
 	 */
-	if (!arch_vma_access_permitted(vma, (error_code & PF_WRITE),
-				(error_code & PF_INSTR), foreign))
+	if (!arch_vma_access_permitted(vma, (error_code & X86_PF_WRITE),
+				       (error_code & X86_PF_INSTR), foreign))
 		return 1;
 
-	if (error_code & PF_WRITE) {
+	if (error_code & X86_PF_WRITE) {
 		/* write, present and write, not present: */
 		if (unlikely(!(vma->vm_flags & VM_WRITE)))
 			return 1;
@@ -1221,7 +1201,7 @@ access_error(unsigned long error_code, s
 	}
 
 	/* read, present: */
-	if (unlikely(error_code & PF_PROT))
+	if (unlikely(error_code & X86_PF_PROT))
 		return 1;
 
 	/* read, not present: */
@@ -1244,7 +1224,7 @@ static inline bool smap_violation(int er
 	if (!static_cpu_has(X86_FEATURE_SMAP))
 		return false;
 
-	if (error_code & PF_USER)
+	if (error_code & X86_PF_USER)
 		return false;
 
 	if (!user_mode(regs) && (regs->flags & X86_EFLAGS_AC))
@@ -1297,7 +1277,7 @@ __do_page_fault(struct pt_regs *regs, un
 	 * protection error (error_code & 9) == 0.
 	 */
 	if (unlikely(fault_in_kernel_space(address))) {
-		if (!(error_code & (PF_RSVD | PF_USER | PF_PROT))) {
+		if (!(error_code & (X86_PF_RSVD | X86_PF_USER | X86_PF_PROT))) {
 			if (vmalloc_fault(address) >= 0)
 				return;
 
@@ -1325,7 +1305,7 @@ __do_page_fault(struct pt_regs *regs, un
 	if (unlikely(kprobes_fault(regs)))
 		return;
 
-	if (unlikely(error_code & PF_RSVD))
+	if (unlikely(error_code & X86_PF_RSVD))
 		pgtable_bad(regs, error_code, address);
 
 	if (unlikely(smap_violation(error_code, regs))) {
@@ -1351,7 +1331,7 @@ __do_page_fault(struct pt_regs *regs, un
 	 */
 	if (user_mode(regs)) {
 		local_irq_enable();
-		error_code |= PF_USER;
+		error_code |= X86_PF_USER;
 		flags |= FAULT_FLAG_USER;
 	} else {
 		if (regs->flags & X86_EFLAGS_IF)
@@ -1360,9 +1340,9 @@ __do_page_fault(struct pt_regs *regs, un
 
 	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
 
-	if (error_code & PF_WRITE)
+	if (error_code & X86_PF_WRITE)
 		flags |= FAULT_FLAG_WRITE;
-	if (error_code & PF_INSTR)
+	if (error_code & X86_PF_INSTR)
 		flags |= FAULT_FLAG_INSTRUCTION;
 
 	/*
@@ -1382,7 +1362,7 @@ __do_page_fault(struct pt_regs *regs, un
 	 * space check, thus avoiding the deadlock:
 	 */
 	if (unlikely(!down_read_trylock(&mm->mmap_sem))) {
-		if ((error_code & PF_USER) == 0 &&
+		if (!(error_code & X86_PF_USER) &&
 		    !search_exception_tables(regs->ip)) {
 			bad_area_nosemaphore(regs, error_code, address, NULL);
 			return;
@@ -1409,7 +1389,7 @@ retry:
 		bad_area(regs, error_code, address);
 		return;
 	}
-	if (error_code & PF_USER) {
+	if (error_code & X86_PF_USER) {
 		/*
 		 * Accessing the stack below %sp is always a bug.
 		 * The large cushion allows instructions like enter

* [PATCH 4.14 033/159] x86/boot: Relocate definition of the initial state of CR0
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
  2017-12-22  8:44 ` [PATCH 4.14 001/159] x86/asm: Remove unnecessary \n\t in front of CC_SET() from asm templates Greg Kroah-Hartman
@ 2017-12-22  8:45   ` Greg Kroah-Hartman
  2017-12-22  8:44 ` [PATCH 4.14 003/159] x86/head: Remove confusing comment Greg Kroah-Hartman
                     ` (163 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Borislav Petkov, Ricardo Neri,
	Thomas Gleixner, Borislav Petkov, Andy Lutomirski,
	Michael S. Tsirkin, Peter Zijlstra, Dave Hansen, ricardo.neri,
	linux-mm, Paul Gortmaker, Huang Rui, Shuah Khan, linux-arch,
	Jonathan Corbet, Jiri Slaby, Ravi V. Shankar, Denys Vlasenko,
	Chris Metcalf, Brian Gerst, Josh Poimboeuf, Chen Yucong,
	Vlastimil Babka, Dave Hansen, Andy Lutomirski, Masami Hiramatsu,
	Paolo Bonzini, Andrew Morton, Linus Torvalds

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>

commit b0ce5b8c95c83a7b98c679b117e3d6ae6f97154b upstream.

Both head_32.S and head_64.S utilize the same value to initialize the
control register CR0. Also, other parts of the kernel might want to access
this initial definition (e.g., emulation code for User-Mode Instruction
Prevention uses this state to provide a sane dummy value for CR0 when
emulating the smsw instruction). Thus, relocate this definition to a
header file from which it can be conveniently accessed.
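
To make the shared value concrete, the sketch below evaluates CR0_STATE in
plain C. The bit positions are the architectural CR0 bits that
processor-flags.h names; initial_cr0() is a hypothetical helper used only to
show the two variants that the boot code loads (head_32.S masks out
X86_CR0_PG, head_64.S uses the full value).

```c
#include <assert.h>
#include <stdint.h>

/* Architectural CR0 bits, as named in processor-flags.h. */
#define X86_CR0_PE	(UINT32_C(1) << 0)	/* Protection Enable */
#define X86_CR0_MP	(UINT32_C(1) << 1)	/* Monitor Coprocessor */
#define X86_CR0_ET	(UINT32_C(1) << 4)	/* Extension Type */
#define X86_CR0_NE	(UINT32_C(1) << 5)	/* Numeric Error */
#define X86_CR0_WP	(UINT32_C(1) << 16)	/* Write Protect */
#define X86_CR0_AM	(UINT32_C(1) << 18)	/* Alignment Mask */
#define X86_CR0_PG	(UINT32_C(1) << 31)	/* Paging */

#define CR0_STATE	(X86_CR0_PE | X86_CR0_MP | X86_CR0_ET | \
			 X86_CR0_NE | X86_CR0_WP | X86_CR0_AM | \
			 X86_CR0_PG)

static_assert(CR0_STATE == UINT32_C(0x80050033),
	      "CR0_STATE should evaluate to 0x80050033");

/* Value loaded into CR0 with or without paging enabled. */
uint32_t initial_cr0(int paging_enabled)
{
	return paging_enabled ? CR0_STATE : (CR0_STATE & ~X86_CR0_PG);
}
```

An emulator that needs a plausible CR0 for smsw could start from the same
constant instead of duplicating the bit list.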

Suggested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: ricardo.neri@intel.com
Cc: linux-mm@kvack.org
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: linux-arch@vger.kernel.org
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: "Ravi V. Shankar" <ravi.v.shankar@intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Chen Yucong <slaoub@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lkml.kernel.org/r/1509135945-13762-3-git-send-email-ricardo.neri-calderon@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/include/uapi/asm/processor-flags.h |    3 +++
 arch/x86/kernel/head_32.S                   |    3 ---
 arch/x86/kernel/head_64.S                   |    3 ---
 3 files changed, 3 insertions(+), 6 deletions(-)

--- a/arch/x86/include/uapi/asm/processor-flags.h
+++ b/arch/x86/include/uapi/asm/processor-flags.h
@@ -152,5 +152,8 @@
 #define CX86_ARR_BASE	0xc4
 #define CX86_RCR_BASE	0xdc
 
+#define CR0_STATE	(X86_CR0_PE | X86_CR0_MP | X86_CR0_ET | \
+			 X86_CR0_NE | X86_CR0_WP | X86_CR0_AM | \
+			 X86_CR0_PG)
 
 #endif /* _UAPI_ASM_X86_PROCESSOR_FLAGS_H */
--- a/arch/x86/kernel/head_32.S
+++ b/arch/x86/kernel/head_32.S
@@ -212,9 +212,6 @@ ENTRY(startup_32_smp)
 #endif
 
 .Ldefault_entry:
-#define CR0_STATE	(X86_CR0_PE | X86_CR0_MP | X86_CR0_ET | \
-			 X86_CR0_NE | X86_CR0_WP | X86_CR0_AM | \
-			 X86_CR0_PG)
 	movl $(CR0_STATE & ~X86_CR0_PG),%eax
 	movl %eax,%cr0
 
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -154,9 +154,6 @@ ENTRY(secondary_startup_64)
 1:	wrmsr				/* Make changes effective */
 
 	/* Setup cr0 */
-#define CR0_STATE	(X86_CR0_PE | X86_CR0_MP | X86_CR0_ET | \
-			 X86_CR0_NE | X86_CR0_WP | X86_CR0_AM | \
-			 X86_CR0_PG)
 	movl	$CR0_STATE, %eax
 	/* Make changes effective */
 	movq	%rax, %cr0

* [PATCH 4.14 033/159] x86/boot: Relocate definition of the initial state of CR0
@ 2017-12-22  8:45   ` Greg Kroah-Hartman
  0 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Borislav Petkov, Ricardo Neri,
	Thomas Gleixner, Borislav Petkov, Andy Lutomirski,
	Michael S. Tsirkin, Peter Zijlstra, Dave Hansen, ricardo.neri,
	linux-mm, Paul Gortmaker, Huang Rui, Shuah Khan, linux-arch,
	Jonathan Corbet, Jiri Slaby, Ravi V. Shankar, Denys Vlasenko,
	Chris Metcalf, Brian Gerst, Josh Poimboeuf, Chen Yucong,
	Vlastimil Babka, Dave Hansen, Andy Lutomirski, Masami Hiramatsu,
	Paolo Bonzini, Andrew Morton, Linus Torvalds

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>

commit b0ce5b8c95c83a7b98c679b117e3d6ae6f97154b upstream.

Both head_32.S and head_64.S utilize the same value to initialize the
control register CR0. Also, other parts of the kernel might want to access
this initial definition (e.g., emulation code for User-Mode Instruction
Prevention uses this state to provide a sane dummy value for CR0 when
emulating the smsw instruction). Thus, relocate this definition to a
header file from which it can be conveniently accessed.

Suggested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: ricardo.neri@intel.com
Cc: linux-mm@kvack.org
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: linux-arch@vger.kernel.org
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: "Ravi V. Shankar" <ravi.v.shankar@intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Chen Yucong <slaoub@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lkml.kernel.org/r/1509135945-13762-3-git-send-email-ricardo.neri-calderon@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/include/uapi/asm/processor-flags.h |    3 +++
 arch/x86/kernel/head_32.S                   |    3 ---
 arch/x86/kernel/head_64.S                   |    3 ---
 3 files changed, 3 insertions(+), 6 deletions(-)

--- a/arch/x86/include/uapi/asm/processor-flags.h
+++ b/arch/x86/include/uapi/asm/processor-flags.h
@@ -152,5 +152,8 @@
 #define CX86_ARR_BASE	0xc4
 #define CX86_RCR_BASE	0xdc
 
+#define CR0_STATE	(X86_CR0_PE | X86_CR0_MP | X86_CR0_ET | \
+			 X86_CR0_NE | X86_CR0_WP | X86_CR0_AM | \
+			 X86_CR0_PG)
 
 #endif /* _UAPI_ASM_X86_PROCESSOR_FLAGS_H */
--- a/arch/x86/kernel/head_32.S
+++ b/arch/x86/kernel/head_32.S
@@ -212,9 +212,6 @@ ENTRY(startup_32_smp)
 #endif
 
 .Ldefault_entry:
-#define CR0_STATE	(X86_CR0_PE | X86_CR0_MP | X86_CR0_ET | \
-			 X86_CR0_NE | X86_CR0_WP | X86_CR0_AM | \
-			 X86_CR0_PG)
 	movl $(CR0_STATE & ~X86_CR0_PG),%eax
 	movl %eax,%cr0
 
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -154,9 +154,6 @@ ENTRY(secondary_startup_64)
 1:	wrmsr				/* Make changes effective */
 
 	/* Setup cr0 */
-#define CR0_STATE	(X86_CR0_PE | X86_CR0_MP | X86_CR0_ET | \
-			 X86_CR0_NE | X86_CR0_WP | X86_CR0_AM | \
-			 X86_CR0_PG)
 	movl	$CR0_STATE, %eax
 	/* Make changes effective */
 	movq	%rax, %cr0

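As a reading aid, the value that CR0_STATE expands to can be worked out in plain C. This is only a user-space sketch: the bit positions are the architectural CR0 bit numbers, and the demo_* helper names are hypothetical, not kernel symbols.

```c
#include <assert.h>
#include <stdint.h>

/* Architectural CR0 bit positions; the names mirror the kernel's X86_CR0_*. */
#define X86_CR0_PE	(UINT32_C(1) << 0)	/* Protection Enable */
#define X86_CR0_MP	(UINT32_C(1) << 1)	/* Monitor Coprocessor */
#define X86_CR0_ET	(UINT32_C(1) << 4)	/* Extension Type */
#define X86_CR0_NE	(UINT32_C(1) << 5)	/* Numeric Error */
#define X86_CR0_WP	(UINT32_C(1) << 16)	/* Write Protect */
#define X86_CR0_AM	(UINT32_C(1) << 18)	/* Alignment Mask */
#define X86_CR0_PG	(UINT32_C(1) << 31)	/* Paging */

/* Same expression as the definition relocated by the patch. */
#define CR0_STATE	(X86_CR0_PE | X86_CR0_MP | X86_CR0_ET | \
			 X86_CR0_NE | X86_CR0_WP | X86_CR0_AM | \
			 X86_CR0_PG)

/* Hypothetical helper: the value head_64.S moves into CR0. */
uint32_t demo_cr0_with_paging(void)
{
	return CR0_STATE;
}

/* Hypothetical helper: the value head_32.S loads, with paging masked off. */
uint32_t demo_cr0_without_paging(void)
{
	return CR0_STATE & ~X86_CR0_PG;
}

/* Sanity check of the arithmetic above. */
void demo_cr0_selfcheck(void)
{
	assert(demo_cr0_with_paging() == UINT32_C(0x80050033));
	assert(demo_cr0_without_paging() == UINT32_C(0x00050033));
}
```

Calling demo_cr0_selfcheck() from a small test program confirms both constants. Having one shared definition means the 32-bit and 64-bit boot paths, and any later user of the header, all derive from the same mask.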


^ permalink raw reply	[flat|nested] 349+ messages in thread

* [PATCH 4.14 034/159] ptrace,x86: Make user_64bit_mode() available to 32-bit builds
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (32 preceding siblings ...)
  2017-12-22  8:45   ` Greg Kroah-Hartman
@ 2017-12-22  8:45 ` Greg Kroah-Hartman
  2017-12-22  8:45 ` [PATCH 4.14 035/159] x86/entry/64: Remove the restore_c_regs_and_iret label Greg Kroah-Hartman
                   ` (131 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Borislav Petkov, Ricardo Neri,
	Thomas Gleixner, Michael S. Tsirkin, Peter Zijlstra, Dave Hansen,
	ricardo.neri, Adrian Hunter, Paul Gortmaker, Huang Rui,
	Qiaowei Ren, Shuah Khan, Kees Cook, Jonathan Corbet, Jiri Slaby,
	Dmitry Vyukov, Ravi V. Shankar, Chris Metcalf, Brian Gerst,
	Arnaldo Carvalho de Melo, Andy Lutomirski, Colin Ian King,
	Chen Yucong, Adam Buchbinder, Vlastimil Babka, Lorenzo Stoakes,
	Masami Hiramatsu, Paolo Bonzini, Andrew Morton, Thomas Garnier

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>

commit e27c310af5c05cf876d9cad006928076c27f54d4 upstream.

In its current form, user_64bit_mode() can only be used when CONFIG_X86_64
is selected. This implies that code built with CONFIG_X86_64=n cannot use
it. If a piece of code needs to be built for both CONFIG_X86_64=y and
CONFIG_X86_64=n and wants to use this function, it needs to wrap it in
an #ifdef/#endif; potentially, in multiple places.

This can be easily avoided with a single #ifdef/#endif pair within
user_64bit_mode() itself.

Suggested-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: ricardo.neri@intel.com
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: "Ravi V. Shankar" <ravi.v.shankar@intel.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Chen Yucong <slaoub@gmail.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Garnier <thgarnie@google.com>
Link: https://lkml.kernel.org/r/1509135945-13762-4-git-send-email-ricardo.neri-calderon@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/include/asm/ptrace.h |    6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

--- a/arch/x86/include/asm/ptrace.h
+++ b/arch/x86/include/asm/ptrace.h
@@ -136,9 +136,9 @@ static inline int v8086_mode(struct pt_r
 #endif
 }
 
-#ifdef CONFIG_X86_64
 static inline bool user_64bit_mode(struct pt_regs *regs)
 {
+#ifdef CONFIG_X86_64
 #ifndef CONFIG_PARAVIRT
 	/*
 	 * On non-paravirt systems, this is the only long mode CPL 3
@@ -149,8 +149,12 @@ static inline bool user_64bit_mode(struc
 	/* Headers are too twisted for this to go in paravirt.h. */
 	return regs->cs == __USER_CS || regs->cs == pv_info.extra_user_64bit_cs;
 #endif
+#else /* !CONFIG_X86_64 */
+	return false;
+#endif
 }
 
+#ifdef CONFIG_X86_64
 #define current_user_stack_pointer()	current_pt_regs()->sp
 #define compat_user_stack_pointer()	current_pt_regs()->sp
 #endif

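The pattern the changelog describes (one #ifdef inside the helper rather than one around every caller) can be sketched in stand-alone C. Everything here is a hypothetical stand-in: DEMO_X86_64 plays the role of CONFIG_X86_64, struct demo_regs is not the kernel's pt_regs, and the paravirt branch is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the kernel's definitions. */
struct demo_regs {
	unsigned long cs;
};
#define DEMO_USER_CS	0x33UL

/*
 * The config guard lives inside the body, so the function exists in
 * both configurations and simply reports "false" on a 32-bit build.
 */
static inline bool demo_user_64bit_mode(const struct demo_regs *regs)
{
#ifdef DEMO_X86_64
	return regs->cs == DEMO_USER_CS;
#else /* !DEMO_X86_64 */
	(void)regs;
	return false;
#endif
}

/* A caller that builds unchanged either way, with no #ifdef of its own. */
int demo_operand_bytes(const struct demo_regs *regs)
{
	assert(regs != NULL);
	return demo_user_64bit_mode(regs) ? 8 : 4;
}
```

Built without -DDEMO_X86_64, demo_operand_bytes() always returns 4; with it, a register set whose cs equals DEMO_USER_CS yields 8. The caller's source is identical in both cases, which is the point of the change.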
^ permalink raw reply	[flat|nested] 349+ messages in thread

* [PATCH 4.14 035/159] x86/entry/64: Remove the restore_c_regs_and_iret label
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (33 preceding siblings ...)
  2017-12-22  8:45 ` [PATCH 4.14 034/159] ptrace,x86: Make user_64bit_mode() available to 32-bit builds Greg Kroah-Hartman
@ 2017-12-22  8:45 ` Greg Kroah-Hartman
  2017-12-22  8:45 ` [PATCH 4.14 036/159] x86/entry/64: Split the IRET-to-user and IRET-to-kernel paths Greg Kroah-Hartman
                   ` (130 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Andy Lutomirski, Borislav Petkov,
	Borislav Petkov, Brian Gerst, Dave Hansen, Linus Torvalds,
	Peter Zijlstra, Thomas Gleixner, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <luto@kernel.org>

commit 9da78ba6b47b46428cfdfc0851511ab29c869798 upstream.

The only user was the 64-bit opportunistic SYSRET failure path, and
that path didn't really need it.  This change makes the
opportunistic SYSRET code a bit more straightforward and gets rid of
the label.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/be3006a7ad3326e3458cf1cc55d416252cbe1986.1509609304.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/entry/entry_64.S |    5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -246,7 +246,6 @@ entry_SYSCALL64_slow_path:
 	call	do_syscall_64		/* returns with IRQs disabled */
 
 return_from_SYSCALL_64:
-	RESTORE_EXTRA_REGS
 	TRACE_IRQS_IRETQ		/* we're about to change IF */
 
 	/*
@@ -315,6 +314,7 @@ return_from_SYSCALL_64:
 	 */
 syscall_return_via_sysret:
 	/* rcx and r11 are already restored (see code above) */
+	RESTORE_EXTRA_REGS
 	RESTORE_C_REGS_EXCEPT_RCX_R11
 	movq	RSP(%rsp), %rsp
 	UNWIND_HINT_EMPTY
@@ -322,7 +322,7 @@ syscall_return_via_sysret:
 
 opportunistic_sysret_failed:
 	SWAPGS
-	jmp	restore_c_regs_and_iret
+	jmp	restore_regs_and_iret
 END(entry_SYSCALL_64)
 
 ENTRY(stub_ptregs_64)
@@ -639,7 +639,6 @@ retint_kernel:
  */
 GLOBAL(restore_regs_and_iret)
 	RESTORE_EXTRA_REGS
-restore_c_regs_and_iret:
 	RESTORE_C_REGS
 	REMOVE_PT_GPREGS_FROM_STACK 8
 	INTERRUPT_RETURN

^ permalink raw reply	[flat|nested] 349+ messages in thread

* [PATCH 4.14 036/159] x86/entry/64: Split the IRET-to-user and IRET-to-kernel paths
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (34 preceding siblings ...)
  2017-12-22  8:45 ` [PATCH 4.14 035/159] x86/entry/64: Remove the restore_c_regs_and_iret label Greg Kroah-Hartman
@ 2017-12-22  8:45 ` Greg Kroah-Hartman
  2017-12-22  8:45 ` [PATCH 4.14 037/159] x86/entry/64: Move SWAPGS into the common IRET-to-usermode path Greg Kroah-Hartman
                   ` (129 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Andy Lutomirski, Borislav Petkov,
	Brian Gerst, Dave Hansen, Linus Torvalds, Peter Zijlstra,
	Thomas Gleixner, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <luto@kernel.org>

commit 26c4ef9c49d8a0341f6d97ce2cfdd55d1236ed29 upstream.

These code paths will diverge soon.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/dccf8c7b3750199b4b30383c812d4e2931811509.1509609304.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/entry/entry_64.S        |   34 +++++++++++++++++++++++++---------
 arch/x86/entry/entry_64_compat.S |    2 +-
 arch/x86/kernel/head_64.S        |    2 +-
 3 files changed, 27 insertions(+), 11 deletions(-)

--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -322,7 +322,7 @@ syscall_return_via_sysret:
 
 opportunistic_sysret_failed:
 	SWAPGS
-	jmp	restore_regs_and_iret
+	jmp	restore_regs_and_return_to_usermode
 END(entry_SYSCALL_64)
 
 ENTRY(stub_ptregs_64)
@@ -424,7 +424,7 @@ ENTRY(ret_from_fork)
 	call	syscall_return_slowpath	/* returns with IRQs disabled */
 	TRACE_IRQS_ON			/* user mode is traced as IRQS on */
 	SWAPGS
-	jmp	restore_regs_and_iret
+	jmp	restore_regs_and_return_to_usermode
 
 1:
 	/* kernel thread */
@@ -613,7 +613,20 @@ GLOBAL(retint_user)
 	call	prepare_exit_to_usermode
 	TRACE_IRQS_IRETQ
 	SWAPGS
-	jmp	restore_regs_and_iret
+
+GLOBAL(restore_regs_and_return_to_usermode)
+#ifdef CONFIG_DEBUG_ENTRY
+	/* Assert that pt_regs indicates user mode. */
+	testl	$3, CS(%rsp)
+	jnz	1f
+	ud2
+1:
+#endif
+	RESTORE_EXTRA_REGS
+	RESTORE_C_REGS
+	REMOVE_PT_GPREGS_FROM_STACK 8
+	INTERRUPT_RETURN
+
 
 /* Returning to kernel space */
 retint_kernel:
@@ -633,11 +646,14 @@ retint_kernel:
 	 */
 	TRACE_IRQS_IRETQ
 
-/*
- * At this label, code paths which return to kernel and to user,
- * which come from interrupts/exception and from syscalls, merge.
- */
-GLOBAL(restore_regs_and_iret)
+GLOBAL(restore_regs_and_return_to_kernel)
+#ifdef CONFIG_DEBUG_ENTRY
+	/* Assert that pt_regs indicates kernel mode. */
+	testl	$3, CS(%rsp)
+	jz	1f
+	ud2
+1:
+#endif
 	RESTORE_EXTRA_REGS
 	RESTORE_C_REGS
 	REMOVE_PT_GPREGS_FROM_STACK 8
@@ -1328,7 +1344,7 @@ ENTRY(nmi)
 	 * work, because we don't want to enable interrupts.
 	 */
 	SWAPGS
-	jmp	restore_regs_and_iret
+	jmp	restore_regs_and_return_to_usermode
 
 .Lnmi_from_kernel:
 	/*
--- a/arch/x86/entry/entry_64_compat.S
+++ b/arch/x86/entry/entry_64_compat.S
@@ -338,7 +338,7 @@ ENTRY(entry_INT80_compat)
 	/* Go back to user mode. */
 	TRACE_IRQS_ON
 	SWAPGS
-	jmp	restore_regs_and_iret
+	jmp	restore_regs_and_return_to_usermode
 END(entry_INT80_compat)
 
 ENTRY(stub32_clone)
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -328,7 +328,7 @@ early_idt_handler_common:
 
 20:
 	decl early_recursion_flag(%rip)
-	jmp restore_regs_and_iret
+	jmp restore_regs_and_return_to_kernel
 END(early_idt_handler_common)
 
 	__INITDATA

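The two CONFIG_DEBUG_ENTRY assertions added above both test the low two bits of the saved CS, i.e. the privilege level of the context being returned to. A rough C model of that check follows; the demo_* names are hypothetical, and the selector values used in the usage note (0x33 for 64-bit user code, 0x10 for kernel code) are assumptions about the usual x86-64 Linux GDT layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_RPL_MASK	UINT64_C(3)	/* low two selector bits, as in "testl $3" */

enum demo_path {
	DEMO_TO_USERMODE,	/* restore_regs_and_return_to_usermode */
	DEMO_TO_KERNEL		/* restore_regs_and_return_to_kernel */
};

/* Nonzero privilege bits in the saved CS mean the frame belongs to user mode. */
bool demo_cs_is_user(uint64_t saved_cs)
{
	return (saved_cs & DEMO_RPL_MASK) != 0;
}

/* True when the debug assertion at the given label would reach ud2. */
bool demo_hits_ud2(enum demo_path path, uint64_t saved_cs)
{
	assert(path == DEMO_TO_USERMODE || path == DEMO_TO_KERNEL);

	if (path == DEMO_TO_USERMODE)
		return !demo_cs_is_user(saved_cs);	/* jnz 1f not taken */
	return demo_cs_is_user(saved_cs);		/* jz 1f not taken */
}
```

With the assumed selectors, demo_hits_ud2(DEMO_TO_USERMODE, 0x10) and demo_hits_ud2(DEMO_TO_KERNEL, 0x33) are true, while the matching combinations are false. That is how a caller jumping to the wrong label would be caught on a debug build.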
^ permalink raw reply	[flat|nested] 349+ messages in thread

* [PATCH 4.14 037/159] x86/entry/64: Move SWAPGS into the common IRET-to-usermode path
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (35 preceding siblings ...)
  2017-12-22  8:45 ` [PATCH 4.14 036/159] x86/entry/64: Split the IRET-to-user and IRET-to-kernel paths Greg Kroah-Hartman
@ 2017-12-22  8:45 ` Greg Kroah-Hartman
  2017-12-22  8:45 ` [PATCH 4.14 038/159] x86/entry/64: Simplify reg restore code in the standard IRET paths Greg Kroah-Hartman
                   ` (128 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Andy Lutomirski, Borislav Petkov,
	Brian Gerst, Dave Hansen, Linus Torvalds, Peter Zijlstra,
	Thomas Gleixner, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <luto@kernel.org>

commit 8a055d7f411d41755ce30db5bb65b154777c4b78 upstream.

All of the code paths that ended up doing IRET to usermode did
SWAPGS immediately beforehand.  Move the SWAPGS into the common
code.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/27fd6f45b7cd640de38fb9066fd0349bcd11f8e1.1509609304.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/entry/entry_64.S        |   32 ++++++++++++++------------------
 arch/x86/entry/entry_64_compat.S |    3 +--
 2 files changed, 15 insertions(+), 20 deletions(-)

--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -250,12 +250,14 @@ return_from_SYSCALL_64:
 
 	/*
 	 * Try to use SYSRET instead of IRET if we're returning to
-	 * a completely clean 64-bit userspace context.
+	 * a completely clean 64-bit userspace context.  If we're not,
+	 * go to the slow exit path.
 	 */
 	movq	RCX(%rsp), %rcx
 	movq	RIP(%rsp), %r11
-	cmpq	%rcx, %r11			/* RCX == RIP */
-	jne	opportunistic_sysret_failed
+
+	cmpq	%rcx, %r11	/* SYSRET requires RCX == RIP */
+	jne	swapgs_restore_regs_and_return_to_usermode
 
 	/*
 	 * On Intel CPUs, SYSRET with non-canonical RCX/RIP will #GP
@@ -273,14 +275,14 @@ return_from_SYSCALL_64:
 
 	/* If this changed %rcx, it was not canonical */
 	cmpq	%rcx, %r11
-	jne	opportunistic_sysret_failed
+	jne	swapgs_restore_regs_and_return_to_usermode
 
 	cmpq	$__USER_CS, CS(%rsp)		/* CS must match SYSRET */
-	jne	opportunistic_sysret_failed
+	jne	swapgs_restore_regs_and_return_to_usermode
 
 	movq	R11(%rsp), %r11
 	cmpq	%r11, EFLAGS(%rsp)		/* R11 == RFLAGS */
-	jne	opportunistic_sysret_failed
+	jne	swapgs_restore_regs_and_return_to_usermode
 
 	/*
 	 * SYSCALL clears RF when it saves RFLAGS in R11 and SYSRET cannot
@@ -301,12 +303,12 @@ return_from_SYSCALL_64:
 	 * would never get past 'stuck_here'.
 	 */
 	testq	$(X86_EFLAGS_RF|X86_EFLAGS_TF), %r11
-	jnz	opportunistic_sysret_failed
+	jnz	swapgs_restore_regs_and_return_to_usermode
 
 	/* nothing to check for RSP */
 
 	cmpq	$__USER_DS, SS(%rsp)		/* SS must match SYSRET */
-	jne	opportunistic_sysret_failed
+	jne	swapgs_restore_regs_and_return_to_usermode
 
 	/*
 	 * We win! This label is here just for ease of understanding
@@ -319,10 +321,6 @@ syscall_return_via_sysret:
 	movq	RSP(%rsp), %rsp
 	UNWIND_HINT_EMPTY
 	USERGS_SYSRET64
-
-opportunistic_sysret_failed:
-	SWAPGS
-	jmp	restore_regs_and_return_to_usermode
 END(entry_SYSCALL_64)
 
 ENTRY(stub_ptregs_64)
@@ -423,8 +421,7 @@ ENTRY(ret_from_fork)
 	movq	%rsp, %rdi
 	call	syscall_return_slowpath	/* returns with IRQs disabled */
 	TRACE_IRQS_ON			/* user mode is traced as IRQS on */
-	SWAPGS
-	jmp	restore_regs_and_return_to_usermode
+	jmp	swapgs_restore_regs_and_return_to_usermode
 
 1:
 	/* kernel thread */
@@ -612,9 +609,8 @@ GLOBAL(retint_user)
 	mov	%rsp,%rdi
 	call	prepare_exit_to_usermode
 	TRACE_IRQS_IRETQ
-	SWAPGS
 
-GLOBAL(restore_regs_and_return_to_usermode)
+GLOBAL(swapgs_restore_regs_and_return_to_usermode)
 #ifdef CONFIG_DEBUG_ENTRY
 	/* Assert that pt_regs indicates user mode. */
 	testl	$3, CS(%rsp)
@@ -622,6 +618,7 @@ GLOBAL(restore_regs_and_return_to_usermo
 	ud2
 1:
 #endif
+	SWAPGS
 	RESTORE_EXTRA_REGS
 	RESTORE_C_REGS
 	REMOVE_PT_GPREGS_FROM_STACK 8
@@ -1343,8 +1340,7 @@ ENTRY(nmi)
 	 * Return back to user mode.  We must *not* do the normal exit
 	 * work, because we don't want to enable interrupts.
 	 */
-	SWAPGS
-	jmp	restore_regs_and_return_to_usermode
+	jmp	swapgs_restore_regs_and_return_to_usermode
 
 .Lnmi_from_kernel:
 	/*
--- a/arch/x86/entry/entry_64_compat.S
+++ b/arch/x86/entry/entry_64_compat.S
@@ -337,8 +337,7 @@ ENTRY(entry_INT80_compat)
 
 	/* Go back to user mode. */
 	TRACE_IRQS_ON
-	SWAPGS
-	jmp	restore_regs_and_return_to_usermode
+	jmp	swapgs_restore_regs_and_return_to_usermode
 END(entry_INT80_compat)
 
 ENTRY(stub32_clone)

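The chain of compare-and-branch instructions in return_from_SYSCALL_64 amounts to a single predicate: use SYSRET only if every check passes, otherwise take the IRET path. A C sketch of that predicate is below. The struct, the demo_* names, the selector values (0x33 and 0x2b) and the 48-bit canonical-address rule are assumptions for illustration; the real code works on pt_regs and derives the shift from the kernel configuration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_USER_CS	UINT64_C(0x33)		/* assumed __USER_CS */
#define DEMO_USER_DS	UINT64_C(0x2b)		/* assumed __USER_DS */
#define DEMO_EFLAGS_TF	(UINT64_C(1) << 8)	/* trap flag */
#define DEMO_EFLAGS_RF	(UINT64_C(1) << 16)	/* resume flag */

/* Hypothetical subset of the saved register frame. */
struct demo_frame {
	uint64_t r11, rcx, rip, cs, eflags, rsp, ss;
};

/* Canonical (assuming 48-bit virtual addresses): bits 63..47 all equal. */
static bool demo_is_canonical(uint64_t addr)
{
	uint64_t top = addr >> 47;

	return top == 0 || top == UINT64_C(0x1ffff);
}

/* False corresponds to a jump to swapgs_restore_regs_and_return_to_usermode. */
bool demo_can_use_sysret(const struct demo_frame *f)
{
	assert(f != NULL);

	if (f->rcx != f->rip)				/* SYSRET requires RCX == RIP */
		return false;
	if (!demo_is_canonical(f->rip))			/* non-canonical RIP would #GP */
		return false;
	if (f->cs != DEMO_USER_CS)			/* CS must match SYSRET */
		return false;
	if (f->r11 != f->eflags)			/* R11 == RFLAGS */
		return false;
	if (f->eflags & (DEMO_EFLAGS_RF | DEMO_EFLAGS_TF))
		return false;
	if (f->ss != DEMO_USER_DS)			/* SS must match SYSRET */
		return false;
	return true;
}
```

Since every failing branch now lands on one label that performs SWAPGS itself, the separate opportunistic_sysret_failed stub is no longer needed.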
^ permalink raw reply	[flat|nested] 349+ messages in thread

* [PATCH 4.14 038/159] x86/entry/64: Simplify reg restore code in the standard IRET paths
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (36 preceding siblings ...)
  2017-12-22  8:45 ` [PATCH 4.14 037/159] x86/entry/64: Move SWAPGS into the common IRET-to-usermode path Greg Kroah-Hartman
@ 2017-12-22  8:45 ` Greg Kroah-Hartman
  2017-12-22  8:45 ` [PATCH 4.14 039/159] x86/entry/64: Shrink paranoid_exit_restore and make labels local Greg Kroah-Hartman
                   ` (127 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Andy Lutomirski, Borislav Petkov,
	Brian Gerst, Dave Hansen, Linus Torvalds, Peter Zijlstra,
	Thomas Gleixner, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <luto@kernel.org>

commit e872045bfd9c465a8555bab4b8567d56a4d2d3bb upstream.

The old code restored all the registers with movq instead of pop.

In theory, this was done because some CPUs have higher movq
throughput, but any gain there would be tiny and is almost certainly
outweighed by the higher text size.

This saves 96 bytes of text.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/ad82520a207ccd851b04ba613f4f752b33ac05f7.1509609304.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/entry/calling.h  |   21 +++++++++++++++++++++
 arch/x86/entry/entry_64.S |   12 ++++++------
 2 files changed, 27 insertions(+), 6 deletions(-)

--- a/arch/x86/entry/calling.h
+++ b/arch/x86/entry/calling.h
@@ -152,6 +152,27 @@ For 32-bit we have the following convent
 	UNWIND_HINT_REGS offset=\offset extra=0
 	.endm
 
+	.macro POP_EXTRA_REGS
+	popq %r15
+	popq %r14
+	popq %r13
+	popq %r12
+	popq %rbp
+	popq %rbx
+	.endm
+
+	.macro POP_C_REGS
+	popq %r11
+	popq %r10
+	popq %r9
+	popq %r8
+	popq %rax
+	popq %rcx
+	popq %rdx
+	popq %rsi
+	popq %rdi
+	.endm
+
 	.macro RESTORE_C_REGS_HELPER rstor_rax=1, rstor_rcx=1, rstor_r11=1, rstor_r8910=1, rstor_rdx=1
 	.if \rstor_r11
 	movq 6*8(%rsp), %r11
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -619,9 +619,9 @@ GLOBAL(swapgs_restore_regs_and_return_to
 1:
 #endif
 	SWAPGS
-	RESTORE_EXTRA_REGS
-	RESTORE_C_REGS
-	REMOVE_PT_GPREGS_FROM_STACK 8
+	POP_EXTRA_REGS
+	POP_C_REGS
+	addq	$8, %rsp	/* skip regs->orig_ax */
 	INTERRUPT_RETURN
 
 
@@ -651,9 +651,9 @@ GLOBAL(restore_regs_and_return_to_kernel
 	ud2
 1:
 #endif
-	RESTORE_EXTRA_REGS
-	RESTORE_C_REGS
-	REMOVE_PT_GPREGS_FROM_STACK 8
+	POP_EXTRA_REGS
+	POP_C_REGS
+	addq	$8, %rsp	/* skip regs->orig_ax */
 	INTERRUPT_RETURN
 
 ENTRY(native_iret)

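To see why a plain run of pops followed by "addq $8, %rsp" leaves RSP at the IRET frame, the stack layout can be modelled as a C struct. The field order below is taken from the POP_EXTRA_REGS and POP_C_REGS macros in this patch plus the hardware IRET frame; struct demo_pt_regs and the helper are hypothetical names, not the kernel's struct pt_regs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout, lowest address first (the order the pops consume it). */
struct demo_pt_regs {
	uint64_t r15, r14, r13, r12, bp, bx;		/* POP_EXTRA_REGS */
	uint64_t r11, r10, r9, r8, ax, cx, dx, si, di;	/* POP_C_REGS */
	uint64_t orig_ax;				/* skipped by addq $8 */
	uint64_t ip, cs, flags, sp, ss;			/* hardware IRET frame */
};

enum {
	DEMO_EXTRA_REGS = 6,
	DEMO_C_REGS = 9
};

/* Offset of RSP from the start of the frame just before INTERRUPT_RETURN. */
size_t demo_rsp_offset_before_iret(void)
{
	size_t off = 0;

	off += DEMO_EXTRA_REGS * sizeof(uint64_t);	/* six popq */
	off += DEMO_C_REGS * sizeof(uint64_t);		/* nine popq */
	assert(off == offsetof(struct demo_pt_regs, orig_ax));

	off += sizeof(uint64_t);			/* addq $8, %rsp */
	return off;
}
```

The result equals offsetof(struct demo_pt_regs, ip), i.e. 128 bytes, so IRET finds its frame without any explicit displacement arithmetic. Each popq also encodes in fewer bytes than a movq with a stack displacement, which is where the text-size saving comes from.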
^ permalink raw reply	[flat|nested] 349+ messages in thread

* [PATCH 4.14 039/159] x86/entry/64: Shrink paranoid_exit_restore and make labels local
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (37 preceding siblings ...)
  2017-12-22  8:45 ` [PATCH 4.14 038/159] x86/entry/64: Simplify reg restore code in the standard IRET paths Greg Kroah-Hartman
@ 2017-12-22  8:45 ` Greg Kroah-Hartman
  2017-12-22  8:45 ` [PATCH 4.14 040/159] x86/entry/64: Use pop instead of movq in syscall_return_via_sysret Greg Kroah-Hartman
                   ` (126 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Andy Lutomirski, Borislav Petkov,
	Brian Gerst, Dave Hansen, Linus Torvalds, Peter Zijlstra,
	Thomas Gleixner, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <luto@kernel.org>

commit e53178328c9b96fbdbc719e78c93b5687ee007c3 upstream.

paranoid_exit_restore was a copy of restore_regs_and_return_to_kernel.
Merge them and make the paranoid_exit internal labels local.

Keeping .Lparanoid_exit makes the code a bit shorter because it
allows a 2-byte jnz instead of a 5-byte jnz.

Saves 96 bytes of text.

( This is still a bit suboptimal in a non-CONFIG_TRACE_IRQFLAGS
  kernel, but fixing that would make the code rather messy. )

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/510d66a1895cda9473c84b1086f0bb974f22de6a.1509609304.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/entry/entry_64.S |   13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -1124,17 +1124,14 @@ ENTRY(paranoid_exit)
 	DISABLE_INTERRUPTS(CLBR_ANY)
 	TRACE_IRQS_OFF_DEBUG
 	testl	%ebx, %ebx			/* swapgs needed? */
-	jnz	paranoid_exit_no_swapgs
+	jnz	.Lparanoid_exit_no_swapgs
 	TRACE_IRQS_IRETQ
 	SWAPGS_UNSAFE_STACK
-	jmp	paranoid_exit_restore
-paranoid_exit_no_swapgs:
+	jmp	.Lparanoid_exit_restore
+.Lparanoid_exit_no_swapgs:
 	TRACE_IRQS_IRETQ_DEBUG
-paranoid_exit_restore:
-	RESTORE_EXTRA_REGS
-	RESTORE_C_REGS
-	REMOVE_PT_GPREGS_FROM_STACK 8
-	INTERRUPT_RETURN
+.Lparanoid_exit_restore:
+	jmp restore_regs_and_return_to_kernel
 END(paranoid_exit)
 
 /*

^ permalink raw reply	[flat|nested] 349+ messages in thread

* [PATCH 4.14 040/159] x86/entry/64: Use pop instead of movq in syscall_return_via_sysret
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (38 preceding siblings ...)
  2017-12-22  8:45 ` [PATCH 4.14 039/159] x86/entry/64: Shrink paranoid_exit_restore and make labels local Greg Kroah-Hartman
@ 2017-12-22  8:45 ` Greg Kroah-Hartman
  2017-12-22  8:45 ` [PATCH 4.14 041/159] x86/entry/64: Merge the fast and slow SYSRET paths Greg Kroah-Hartman
                   ` (125 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Andy Lutomirski, Borislav Petkov,
	Borislav Petkov, Brian Gerst, Dave Hansen, Linus Torvalds,
	Peter Zijlstra, Thomas Gleixner, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <luto@kernel.org>

commit 4fbb39108f972437c44e5ffa781b56635d496826 upstream.

Saves 64 bytes.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/6609b7f74ab31c36604ad746e019ea8495aec76c.1509609304.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/entry/entry_64.S |   14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -316,10 +316,18 @@ return_from_SYSCALL_64:
 	 */
 syscall_return_via_sysret:
 	/* rcx and r11 are already restored (see code above) */
-	RESTORE_EXTRA_REGS
-	RESTORE_C_REGS_EXCEPT_RCX_R11
-	movq	RSP(%rsp), %rsp
 	UNWIND_HINT_EMPTY
+	POP_EXTRA_REGS
+	popq	%rsi	/* skip r11 */
+	popq	%r10
+	popq	%r9
+	popq	%r8
+	popq	%rax
+	popq	%rsi	/* skip rcx */
+	popq	%rdx
+	popq	%rsi
+	popq	%rdi
+	movq	RSP-ORIG_RAX(%rsp), %rsp
 	USERGS_SYSRET64
 END(entry_SYSCALL_64)
 

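Two details of the new sequence are easy to miss: %rsi is used as a scratch target for the r11 and rcx slots (it receives its real value a few pops later), and after the last pop RSP points at orig_rax, so the user stack pointer sits RSP-ORIG_RAX bytes further up. The following C model walks the same steps over an array of stack slots; the slot enum and the demo_* names are hypothetical and only mirror the order implied by the pops in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Slot indices, lowest address first; one slot is 8 bytes. */
enum demo_slot {
	DEMO_R15, DEMO_R14, DEMO_R13, DEMO_R12, DEMO_RBP, DEMO_RBX,
	DEMO_R11, DEMO_R10, DEMO_R9, DEMO_R8, DEMO_RAX, DEMO_RCX,
	DEMO_RDX, DEMO_RSI, DEMO_RDI,
	DEMO_ORIG_RAX, DEMO_RIP, DEMO_CS, DEMO_EFLAGS, DEMO_RSP, DEMO_SS,
	DEMO_SLOTS
};

/* The two registers this model tracks. */
struct demo_sysret_state {
	uint64_t rsi;
	uint64_t rsp;
};

struct demo_sysret_state demo_sysret_pops(const uint64_t frame[DEMO_SLOTS])
{
	struct demo_sysret_state st = { 0, 0 };
	unsigned int sp = 0;		/* index of the slot RSP points at */

	sp += 6;			/* POP_EXTRA_REGS: r15 .. rbx */
	st.rsi = frame[sp++];		/* popq %rsi: r11 slot, scratch only */
	sp += 4;			/* popq r10, r9, r8, rax */
	st.rsi = frame[sp++];		/* popq %rsi: rcx slot, scratch only */
	sp += 1;			/* popq %rdx */
	st.rsi = frame[sp++];		/* popq %rsi: the real rsi value */
	sp += 1;			/* popq %rdi */

	assert(sp == DEMO_ORIG_RAX);	/* RSP now points at orig_rax */

	/* movq RSP-ORIG_RAX(%rsp), %rsp */
	st.rsp = frame[sp + (DEMO_RSP - DEMO_ORIG_RAX)];
	return st;
}
```

Filling the array with distinct values and calling demo_sysret_pops() shows that rsi ends up with the DEMO_RSI slot and rsp with the DEMO_RSP slot, four slots (32 bytes) above orig_rax.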
^ permalink raw reply	[flat|nested] 349+ messages in thread

* [PATCH 4.14 041/159] x86/entry/64: Merge the fast and slow SYSRET paths
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (39 preceding siblings ...)
  2017-12-22  8:45 ` [PATCH 4.14 040/159] x86/entry/64: Use pop instead of movq in syscall_return_via_sysret Greg Kroah-Hartman
@ 2017-12-22  8:45 ` Greg Kroah-Hartman
  2017-12-22  8:45 ` [PATCH 4.14 042/159] x86/entry/64: Use POP instead of MOV to restore regs on NMI return Greg Kroah-Hartman
                   ` (124 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Andy Lutomirski, Borislav Petkov,
	Brian Gerst, Dave Hansen, Linus Torvalds, Peter Zijlstra,
	Thomas Gleixner, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <luto@kernel.org>

commit a512210643da8082cb44181dba8b18e752bd68f0 upstream.

They did almost the same thing.  Remove a bunch of pointless
instructions (mostly hidden in macros) and reduce cognitive load by
merging them.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1204e20233fcab9130a1ba80b3b1879b5db3fc1f.1509609304.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/entry/entry_64.S |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -221,10 +221,9 @@ entry_SYSCALL_64_fastpath:
 	TRACE_IRQS_ON		/* user mode is traced as IRQs on */
 	movq	RIP(%rsp), %rcx
 	movq	EFLAGS(%rsp), %r11
-	RESTORE_C_REGS_EXCEPT_RCX_R11
-	movq	RSP(%rsp), %rsp
+	addq	$6*8, %rsp	/* skip extra regs -- they were preserved */
 	UNWIND_HINT_EMPTY
-	USERGS_SYSRET64
+	jmp	.Lpop_c_regs_except_rcx_r11_and_sysret
 
 1:
 	/*
@@ -318,6 +317,7 @@ syscall_return_via_sysret:
 	/* rcx and r11 are already restored (see code above) */
 	UNWIND_HINT_EMPTY
 	POP_EXTRA_REGS
+.Lpop_c_regs_except_rcx_r11_and_sysret:
 	popq	%rsi	/* skip r11 */
 	popq	%r10
 	popq	%r9

^ permalink raw reply	[flat|nested] 349+ messages in thread

* [PATCH 4.14 042/159] x86/entry/64: Use POP instead of MOV to restore regs on NMI return
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (40 preceding siblings ...)
  2017-12-22  8:45 ` [PATCH 4.14 041/159] x86/entry/64: Merge the fast and slow SYSRET paths Greg Kroah-Hartman
@ 2017-12-22  8:45 ` Greg Kroah-Hartman
  2017-12-22  8:45 ` [PATCH 4.14 043/159] x86/entry/64: Remove the RESTORE_..._REGS infrastructure Greg Kroah-Hartman
                   ` (123 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Andy Lutomirski, Borislav Petkov,
	Brian Gerst, Dave Hansen, Linus Torvalds, Peter Zijlstra,
	Thomas Gleixner, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <luto@kernel.org>

commit 471ee4832209e986029b9fabdaad57b1eecb856b upstream.

This gets rid of the last user of the old RESTORE_..._REGS infrastructure.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/652a260f17a160789bc6a41d997f98249b73e2ab.1509609304.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/entry/entry_64.S |   11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -1560,11 +1560,14 @@ end_repeat_nmi:
 nmi_swapgs:
 	SWAPGS_UNSAFE_STACK
 nmi_restore:
-	RESTORE_EXTRA_REGS
-	RESTORE_C_REGS
+	POP_EXTRA_REGS
+	POP_C_REGS
 
-	/* Point RSP at the "iret" frame. */
-	REMOVE_PT_GPREGS_FROM_STACK 6*8
+	/*
+	 * Skip orig_ax and the "outermost" frame to point RSP at the "iret"
+	 * frame.
+	 */
+	addq	$6*8, %rsp
 
 	/*
 	 * Clear "NMI executing".  Set DF first so that we can easily
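
Note on the stack arithmetic in this hunk: the old and new sequences should move RSP by the same total amount. The sketch below checks that in C. It assumes the old macro bodies were as listed in calling.h (MOV-based restores that leave RSP untouched, then "subq $-(15*8+addskip), %rsp") and that POP_EXTRA_REGS and POP_C_REGS pop six and nine registers. The function names are hypothetical.

```c
#include <assert.h>

enum {
	WORD          = 8,        /* bytes per stack slot */
	NR_EXTRA_REGS = 6,        /* r15 r14 r13 r12 rbp rbx */
	NR_C_REGS     = 9,        /* r11 r10 r9 r8 rax rcx rdx rsi rdi */
	ADDSKIP       = 6 * WORD, /* argument used at the old call site */
};

/*
 * Old sequence: RESTORE_EXTRA_REGS and RESTORE_C_REGS use movq and do not
 * move RSP; REMOVE_PT_GPREGS_FROM_STACK then subtracts a negative constant.
 */
static inline long old_rsp_delta(void)
{
	return -(-(15 * WORD + ADDSKIP));
}

/* New sequence: every popq adds 8 to RSP, then addq $6*8 skips the rest. */
static inline long new_rsp_delta(void)
{
	return (NR_EXTRA_REGS + NR_C_REGS) * WORD + 6 * WORD;
}
```

Both come to 21 slots (168 bytes), which is why the addq can replace the macro without changing where the "iret" frame is found.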

* [PATCH 4.14 043/159] x86/entry/64: Remove the RESTORE_..._REGS infrastructure
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (41 preceding siblings ...)
  2017-12-22  8:45 ` [PATCH 4.14 042/159] x86/entry/64: Use POP instead of MOV to restore regs on NMI return Greg Kroah-Hartman
@ 2017-12-22  8:45 ` Greg Kroah-Hartman
  2017-12-22  8:45 ` [PATCH 4.14 044/159] xen, x86/entry/64: Add xen NMI trap entry Greg Kroah-Hartman
                   ` (122 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Andy Lutomirski, Borislav Petkov,
	Brian Gerst, Dave Hansen, Linus Torvalds, Peter Zijlstra,
	Thomas Gleixner, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <luto@kernel.org>

commit c39858de696f0cc160a544455e8403d663d577e9 upstream.

All users of RESTORE_EXTRA_REGS, RESTORE_C_REGS and such, and
REMOVE_PT_GPREGS_FROM_STACK are gone.  Delete the macros.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/c32672f6e47c561893316d48e06c7656b1039a36.1509609304.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/entry/calling.h |   52 -----------------------------------------------
 1 file changed, 52 deletions(-)

--- a/arch/x86/entry/calling.h
+++ b/arch/x86/entry/calling.h
@@ -142,16 +142,6 @@ For 32-bit we have the following convent
 	UNWIND_HINT_REGS offset=\offset
 	.endm
 
-	.macro RESTORE_EXTRA_REGS offset=0
-	movq 0*8+\offset(%rsp), %r15
-	movq 1*8+\offset(%rsp), %r14
-	movq 2*8+\offset(%rsp), %r13
-	movq 3*8+\offset(%rsp), %r12
-	movq 4*8+\offset(%rsp), %rbp
-	movq 5*8+\offset(%rsp), %rbx
-	UNWIND_HINT_REGS offset=\offset extra=0
-	.endm
-
 	.macro POP_EXTRA_REGS
 	popq %r15
 	popq %r14
@@ -173,48 +163,6 @@ For 32-bit we have the following convent
 	popq %rdi
 	.endm
 
-	.macro RESTORE_C_REGS_HELPER rstor_rax=1, rstor_rcx=1, rstor_r11=1, rstor_r8910=1, rstor_rdx=1
-	.if \rstor_r11
-	movq 6*8(%rsp), %r11
-	.endif
-	.if \rstor_r8910
-	movq 7*8(%rsp), %r10
-	movq 8*8(%rsp), %r9
-	movq 9*8(%rsp), %r8
-	.endif
-	.if \rstor_rax
-	movq 10*8(%rsp), %rax
-	.endif
-	.if \rstor_rcx
-	movq 11*8(%rsp), %rcx
-	.endif
-	.if \rstor_rdx
-	movq 12*8(%rsp), %rdx
-	.endif
-	movq 13*8(%rsp), %rsi
-	movq 14*8(%rsp), %rdi
-	UNWIND_HINT_IRET_REGS offset=16*8
-	.endm
-	.macro RESTORE_C_REGS
-	RESTORE_C_REGS_HELPER 1,1,1,1,1
-	.endm
-	.macro RESTORE_C_REGS_EXCEPT_RAX
-	RESTORE_C_REGS_HELPER 0,1,1,1,1
-	.endm
-	.macro RESTORE_C_REGS_EXCEPT_RCX
-	RESTORE_C_REGS_HELPER 1,0,1,1,1
-	.endm
-	.macro RESTORE_C_REGS_EXCEPT_R11
-	RESTORE_C_REGS_HELPER 1,1,0,1,1
-	.endm
-	.macro RESTORE_C_REGS_EXCEPT_RCX_R11
-	RESTORE_C_REGS_HELPER 1,0,0,1,1
-	.endm
-
-	.macro REMOVE_PT_GPREGS_FROM_STACK addskip=0
-	subq $-(15*8+\addskip), %rsp
-	.endm
-
 	.macro icebp
 	.byte 0xf1
 	.endm

* [PATCH 4.14 044/159] xen, x86/entry/64: Add xen NMI trap entry
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (42 preceding siblings ...)
  2017-12-22  8:45 ` [PATCH 4.14 043/159] x86/entry/64: Remove the RESTORE_..._REGS infrastructure Greg Kroah-Hartman
@ 2017-12-22  8:45 ` Greg Kroah-Hartman
  2017-12-22  8:45 ` [PATCH 4.14 045/159] x86/entry/64: De-Xen-ify our NMI code Greg Kroah-Hartman
                   ` (121 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Juergen Gross, Andy Lutomirski,
	Borislav Petkov, Brian Gerst, Dave Hansen, Linus Torvalds,
	Peter Zijlstra, Thomas Gleixner, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Juergen Gross <jgross@suse.com>

commit 43e4111086a70c78bedb6ad990bee97f17b27a6e upstream.

Instead of trying to execute any NMI via the bare metal's NMI trap
handler use a Xen specific one for PV domains, like we do for e.g.
debug traps. As in a PV domain the NMI is handled via the normal
kernel stack this is the correct thing to do.

This will enable us to get rid of the very fragile and questionable
dependencies between the bare metal NMI handler and Xen assumptions
believed to be broken anyway.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/5baf5c0528d58402441550c5770b98e7961e7680.1509609304.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/entry/entry_64.S    |    2 +-
 arch/x86/include/asm/traps.h |    2 +-
 arch/x86/xen/enlighten_pv.c  |    2 +-
 arch/x86/xen/xen-asm_64.S    |    2 +-
 4 files changed, 4 insertions(+), 4 deletions(-)

--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -1079,6 +1079,7 @@ idtentry int3			do_int3			has_error_code
 idtentry stack_segment		do_stack_segment	has_error_code=1
 
 #ifdef CONFIG_XEN
+idtentry xennmi			do_nmi			has_error_code=0
 idtentry xendebug		do_debug		has_error_code=0
 idtentry xenint3		do_int3			has_error_code=0
 #endif
@@ -1241,7 +1242,6 @@ ENTRY(error_exit)
 END(error_exit)
 
 /* Runs on exception stack */
-/* XXX: broken on Xen PV */
 ENTRY(nmi)
 	UNWIND_HINT_IRET_REGS
 	/*
--- a/arch/x86/include/asm/traps.h
+++ b/arch/x86/include/asm/traps.h
@@ -38,9 +38,9 @@ asmlinkage void simd_coprocessor_error(v
 
 #if defined(CONFIG_X86_64) && defined(CONFIG_XEN_PV)
 asmlinkage void xen_divide_error(void);
+asmlinkage void xen_xennmi(void);
 asmlinkage void xen_xendebug(void);
 asmlinkage void xen_xenint3(void);
-asmlinkage void xen_nmi(void);
 asmlinkage void xen_overflow(void);
 asmlinkage void xen_bounds(void);
 asmlinkage void xen_invalid_op(void);
--- a/arch/x86/xen/enlighten_pv.c
+++ b/arch/x86/xen/enlighten_pv.c
@@ -601,7 +601,7 @@ static struct trap_array_entry trap_arra
 #ifdef CONFIG_X86_MCE
 	{ machine_check,               xen_machine_check,               true },
 #endif
-	{ nmi,                         xen_nmi,                         true },
+	{ nmi,                         xen_xennmi,                      true },
 	{ overflow,                    xen_overflow,                    false },
 #ifdef CONFIG_IA32_EMULATION
 	{ entry_INT80_compat,          xen_entry_INT80_compat,          false },
--- a/arch/x86/xen/xen-asm_64.S
+++ b/arch/x86/xen/xen-asm_64.S
@@ -30,7 +30,7 @@ xen_pv_trap debug
 xen_pv_trap xendebug
 xen_pv_trap int3
 xen_pv_trap xenint3
-xen_pv_trap nmi
+xen_pv_trap xennmi
 xen_pv_trap overflow
 xen_pv_trap bounds
 xen_pv_trap invalid_op
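
Note on the trap_array change: the table pairs each native entry point with the Xen PV entry point that should replace it, so changing the nmi row redirects NMIs to the new xennmi stub. Below is a rough userspace sketch of such a lookup. The struct layout, the fall-through behaviour for unknown handlers and every name ending in _model are assumptions made for illustration, not the code in enlighten_pv.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef void (*handler_t)(void);

/* Stand-ins for the real assembly entry points. */
static void nmi_model(void) {}
static void xen_xennmi_model(void) {}
static void overflow_model(void) {}
static void xen_overflow_model(void) {}

/* Hypothetical mirror of one trap_array row: native, Xen, IST flag. */
struct trap_entry_model {
	handler_t orig;
	handler_t xen;
	bool ist_okay;
};

static const struct trap_entry_model trap_table_model[] = {
	{ nmi_model,      xen_xennmi_model,   true  },
	{ overflow_model, xen_overflow_model, false },
};

/* Return the Xen replacement for a native handler (assumed: else unchanged). */
static handler_t xen_handler_for(handler_t native)
{
	size_t n = sizeof(trap_table_model) / sizeof(trap_table_model[0]);

	for (size_t i = 0; i < n; i++)
		if (trap_table_model[i].orig == native)
			return trap_table_model[i].xen;
	return native;
}
```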

* [PATCH 4.14 045/159] x86/entry/64: De-Xen-ify our NMI code
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (43 preceding siblings ...)
  2017-12-22  8:45 ` [PATCH 4.14 044/159] xen, x86/entry/64: Add xen NMI trap entry Greg Kroah-Hartman
@ 2017-12-22  8:45 ` Greg Kroah-Hartman
  2017-12-22  8:45 ` [PATCH 4.14 046/159] x86/entry/32: Pull the MSR_IA32_SYSENTER_CS update code out of native_load_sp0() Greg Kroah-Hartman
                   ` (120 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Andy Lutomirski, Borislav Petkov,
	Juergen Gross, Boris Ostrovsky, Borislav Petkov, Brian Gerst,
	Dave Hansen, Linus Torvalds, Peter Zijlstra, Thomas Gleixner,
	Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <luto@kernel.org>

commit 929bacec21478a72c78e4f29f98fb799bd00105a upstream.

Xen PV is fundamentally incompatible with our fancy NMI code: it
doesn't use IST at all, and Xen entries clobber two stack slots
below the hardware frame.

Drop Xen PV support from our NMI code entirely.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Acked-by: Juergen Gross <jgross@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/bfbe711b5ae03f672f8848999a8eb2711efc7f98.1509609304.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/entry/entry_64.S |   30 ++++++++++++++++++------------
 1 file changed, 18 insertions(+), 12 deletions(-)

--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -1241,9 +1241,13 @@ ENTRY(error_exit)
 	jmp	retint_user
 END(error_exit)
 
-/* Runs on exception stack */
+/*
+ * Runs on exception stack.  Xen PV does not go through this path at all,
+ * so we can use real assembly here.
+ */
 ENTRY(nmi)
 	UNWIND_HINT_IRET_REGS
+
 	/*
 	 * We allow breakpoints in NMIs. If a breakpoint occurs, then
 	 * the iretq it performs will take us out of NMI context.
@@ -1301,7 +1305,7 @@ ENTRY(nmi)
 	 * stacks lest we corrupt the "NMI executing" variable.
 	 */
 
-	SWAPGS_UNSAFE_STACK
+	swapgs
 	cld
 	movq	%rsp, %rdx
 	movq	PER_CPU_VAR(cpu_current_top_of_stack), %rsp
@@ -1466,7 +1470,7 @@ nested_nmi_out:
 	popq	%rdx
 
 	/* We are returning to kernel mode, so this cannot result in a fault. */
-	INTERRUPT_RETURN
+	iretq
 
 first_nmi:
 	/* Restore rdx. */
@@ -1497,7 +1501,7 @@ first_nmi:
 	pushfq			/* RFLAGS */
 	pushq	$__KERNEL_CS	/* CS */
 	pushq	$1f		/* RIP */
-	INTERRUPT_RETURN	/* continues at repeat_nmi below */
+	iretq			/* continues at repeat_nmi below */
 	UNWIND_HINT_IRET_REGS
 1:
 #endif
@@ -1572,20 +1576,22 @@ nmi_restore:
 	/*
 	 * Clear "NMI executing".  Set DF first so that we can easily
 	 * distinguish the remaining code between here and IRET from
-	 * the SYSCALL entry and exit paths.  On a native kernel, we
-	 * could just inspect RIP, but, on paravirt kernels,
-	 * INTERRUPT_RETURN can translate into a jump into a
-	 * hypercall page.
+	 * the SYSCALL entry and exit paths.
+	 *
+	 * We arguably should just inspect RIP instead, but I (Andy) wrote
+	 * this code when I had the misapprehension that Xen PV supported
+	 * NMIs, and Xen PV would break that approach.
 	 */
 	std
 	movq	$0, 5*8(%rsp)		/* clear "NMI executing" */
 
 	/*
-	 * INTERRUPT_RETURN reads the "iret" frame and exits the NMI
-	 * stack in a single instruction.  We are returning to kernel
-	 * mode, so this cannot result in a fault.
+	 * iretq reads the "iret" frame and exits the NMI stack in a
+	 * single instruction.  We are returning to kernel mode, so this
+	 * cannot result in a fault.  Similarly, we don't need to worry
+	 * about espfix64 on the way back to kernel mode.
 	 */
-	INTERRUPT_RETURN
+	iretq
 END(nmi)
 
 ENTRY(ignore_sysret)

* [PATCH 4.14 046/159] x86/entry/32: Pull the MSR_IA32_SYSENTER_CS update code out of native_load_sp0()
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (44 preceding siblings ...)
  2017-12-22  8:45 ` [PATCH 4.14 045/159] x86/entry/64: De-Xen-ify our NMI code Greg Kroah-Hartman
@ 2017-12-22  8:45 ` Greg Kroah-Hartman
  2017-12-22  8:45 ` [PATCH 4.14 047/159] x86/entry/64: Pass SP0 directly to load_sp0() Greg Kroah-Hartman
                   ` (119 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Andy Lutomirski, Borislav Petkov,
	Brian Gerst, Dave Hansen, Linus Torvalds, Peter Zijlstra,
	Thomas Gleixner, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <luto@kernel.org>

commit bd7dc5a6afac719d8ce4092391eef2c7e83c2a75 upstream.

This causes the MSR_IA32_SYSENTER_CS write to move out of the
paravirt callback.  This shouldn't affect Xen PV: Xen already ignores
MSR_IA32_SYSENTER_ESP writes.  In any event, Xen doesn't support
vm86() in a useful way.

Note to any potential backporters: This patch won't break lguest, as
lguest didn't have any SYSENTER support at all.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/75cf09fe03ae778532d0ca6c65aa58e66bc2f90c.1509609304.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/include/asm/processor.h |    7 -------
 arch/x86/include/asm/switch_to.h |   12 ++++++++++++
 arch/x86/kernel/process_32.c     |    4 +++-
 arch/x86/kernel/process_64.c     |    2 +-
 arch/x86/kernel/vm86_32.c        |    6 +++++-
 5 files changed, 21 insertions(+), 10 deletions(-)

--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -521,13 +521,6 @@ static inline void
 native_load_sp0(struct tss_struct *tss, struct thread_struct *thread)
 {
 	tss->x86_tss.sp0 = thread->sp0;
-#ifdef CONFIG_X86_32
-	/* Only happens when SEP is enabled, no need to test "SEP"arately: */
-	if (unlikely(tss->x86_tss.ss1 != thread->sysenter_cs)) {
-		tss->x86_tss.ss1 = thread->sysenter_cs;
-		wrmsr(MSR_IA32_SYSENTER_CS, thread->sysenter_cs, 0);
-	}
-#endif
 }
 
 static inline void native_swapgs(void)
--- a/arch/x86/include/asm/switch_to.h
+++ b/arch/x86/include/asm/switch_to.h
@@ -73,4 +73,16 @@ do {									\
 	((last) = __switch_to_asm((prev), (next)));			\
 } while (0)
 
+#ifdef CONFIG_X86_32
+static inline void refresh_sysenter_cs(struct thread_struct *thread)
+{
+	/* Only happens when SEP is enabled, no need to test "SEP"arately: */
+	if (unlikely(this_cpu_read(cpu_tss.x86_tss.ss1) == thread->sysenter_cs))
+		return;
+
+	this_cpu_write(cpu_tss.x86_tss.ss1, thread->sysenter_cs);
+	wrmsr(MSR_IA32_SYSENTER_CS, thread->sysenter_cs, 0);
+}
+#endif
+
 #endif /* _ASM_X86_SWITCH_TO_H */
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -284,9 +284,11 @@ __switch_to(struct task_struct *prev_p,
 
 	/*
 	 * Reload esp0 and cpu_current_top_of_stack.  This changes
-	 * current_thread_info().
+	 * current_thread_info().  Refresh the SYSENTER configuration in
+	 * case prev or next is vm86.
 	 */
 	load_sp0(tss, next);
+	refresh_sysenter_cs(next);
 	this_cpu_write(cpu_current_top_of_stack,
 		       (unsigned long)task_stack_page(next_p) +
 		       THREAD_SIZE);
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -464,7 +464,7 @@ __switch_to(struct task_struct *prev_p,
 	 */
 	this_cpu_write(current_task, next_p);
 
-	/* Reload esp0 and ss1.  This changes current_thread_info(). */
+	/* Reload sp0. */
 	load_sp0(tss, next);
 
 	/*
--- a/arch/x86/kernel/vm86_32.c
+++ b/arch/x86/kernel/vm86_32.c
@@ -55,6 +55,7 @@
 #include <asm/irq.h>
 #include <asm/traps.h>
 #include <asm/vm86.h>
+#include <asm/switch_to.h>
 
 /*
  * Known problems:
@@ -150,6 +151,7 @@ void save_v86_state(struct kernel_vm86_r
 	tsk->thread.sp0 = vm86->saved_sp0;
 	tsk->thread.sysenter_cs = __KERNEL_CS;
 	load_sp0(tss, &tsk->thread);
+	refresh_sysenter_cs(&tsk->thread);
 	vm86->saved_sp0 = 0;
 	put_cpu();
 
@@ -369,8 +371,10 @@ static long do_sys_vm86(struct vm86plus_
 	/* make room for real-mode segments */
 	tsk->thread.sp0 += 16;
 
-	if (static_cpu_has(X86_FEATURE_SEP))
+	if (static_cpu_has(X86_FEATURE_SEP)) {
 		tsk->thread.sysenter_cs = 0;
+		refresh_sysenter_cs(&tsk->thread);
+	}
 
 	load_sp0(tss, &tsk->thread);
 	put_cpu();
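
Note on refresh_sysenter_cs(): the helper keeps the last value written in the per-CPU TSS ss1 field and skips the MSR write when nothing changed. A minimal userspace model of that caching pattern follows. The MSR write is replaced by a counter, MODEL_KERNEL_CS is a placeholder rather than the real __KERNEL_CS, and all names ending in _model are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_KERNEL_CS 0x60u	/* placeholder selector value */

static uint32_t cached_ss1;	/* stands in for cpu_tss.x86_tss.ss1 */
static unsigned int msr_writes;	/* counts simulated wrmsr() calls */

struct thread_model {
	uint32_t sysenter_cs;
};

/* Simulated MSR write: only records that a write happened. */
static void wrmsr_sysenter_cs_model(uint32_t val)
{
	(void)val;
	msr_writes++;
}

/* Write the MSR only when the cached selector differs from the wanted one. */
static void refresh_sysenter_cs_model(const struct thread_model *thread)
{
	if (cached_ss1 == thread->sysenter_cs)
		return;

	cached_ss1 = thread->sysenter_cs;
	wrmsr_sysenter_cs_model(thread->sysenter_cs);
}
```

In this model, switching between two ordinary tasks costs no MSR write. Only entering or leaving vm86 mode, where sysenter_cs flips between 0 and the kernel selector, triggers one.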

* [PATCH 4.14 047/159] x86/entry/64: Pass SP0 directly to load_sp0()
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (45 preceding siblings ...)
  2017-12-22  8:45 ` [PATCH 4.14 046/159] x86/entry/32: Pull the MSR_IA32_SYSENTER_CS update code out of native_load_sp0() Greg Kroah-Hartman
@ 2017-12-22  8:45 ` Greg Kroah-Hartman
  2017-12-22  8:45 ` [PATCH 4.14 048/159] x86/entry: Add task_top_of_stack() to find the top of a tasks stack Greg Kroah-Hartman
                   ` (118 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Andy Lutomirski, Borislav Petkov,
	Borislav Petkov, Brian Gerst, Dave Hansen, Linus Torvalds,
	Peter Zijlstra, Thomas Gleixner, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <luto@kernel.org>

commit da51da189a24bb9b7e2d5a123be096e51a4695a5 upstream.

load_sp0() had an odd signature:

  void load_sp0(struct tss_struct *tss, struct thread_struct *thread);

Simplify it to:

  void load_sp0(unsigned long sp0);

Also simplify a few get_cpu()/put_cpu() sequences to
preempt_disable()/preempt_enable().

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/2655d8b42ed940aa384fe18ee1129bbbcf730a08.1509609304.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/include/asm/paravirt.h       |    5 ++---
 arch/x86/include/asm/paravirt_types.h |    2 +-
 arch/x86/include/asm/processor.h      |    9 ++++-----
 arch/x86/kernel/cpu/common.c          |    4 ++--
 arch/x86/kernel/process_32.c          |    2 +-
 arch/x86/kernel/process_64.c          |    2 +-
 arch/x86/kernel/vm86_32.c             |   14 ++++++--------
 arch/x86/xen/enlighten_pv.c           |    7 +++----
 8 files changed, 20 insertions(+), 25 deletions(-)

--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -16,10 +16,9 @@
 #include <linux/cpumask.h>
 #include <asm/frame.h>
 
-static inline void load_sp0(struct tss_struct *tss,
-			     struct thread_struct *thread)
+static inline void load_sp0(unsigned long sp0)
 {
-	PVOP_VCALL2(pv_cpu_ops.load_sp0, tss, thread);
+	PVOP_VCALL1(pv_cpu_ops.load_sp0, sp0);
 }
 
 /* The paravirtualized CPUID instruction. */
--- a/arch/x86/include/asm/paravirt_types.h
+++ b/arch/x86/include/asm/paravirt_types.h
@@ -134,7 +134,7 @@ struct pv_cpu_ops {
 	void (*alloc_ldt)(struct desc_struct *ldt, unsigned entries);
 	void (*free_ldt)(struct desc_struct *ldt, unsigned entries);
 
-	void (*load_sp0)(struct tss_struct *tss, struct thread_struct *t);
+	void (*load_sp0)(unsigned long sp0);
 
 	void (*set_iopl_mask)(unsigned mask);
 
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -518,9 +518,9 @@ static inline void native_set_iopl_mask(
 }
 
 static inline void
-native_load_sp0(struct tss_struct *tss, struct thread_struct *thread)
+native_load_sp0(unsigned long sp0)
 {
-	tss->x86_tss.sp0 = thread->sp0;
+	this_cpu_write(cpu_tss.x86_tss.sp0, sp0);
 }
 
 static inline void native_swapgs(void)
@@ -545,10 +545,9 @@ static inline unsigned long current_top_
 #else
 #define __cpuid			native_cpuid
 
-static inline void load_sp0(struct tss_struct *tss,
-			    struct thread_struct *thread)
+static inline void load_sp0(unsigned long sp0)
 {
-	native_load_sp0(tss, thread);
+	native_load_sp0(sp0);
 }
 
 #define set_iopl_mask native_set_iopl_mask
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1570,7 +1570,7 @@ void cpu_init(void)
 	initialize_tlbstate_and_flush();
 	enter_lazy_tlb(&init_mm, me);
 
-	load_sp0(t, &current->thread);
+	load_sp0(current->thread.sp0);
 	set_tss_desc(cpu, t);
 	load_TR_desc();
 	load_mm_ldt(&init_mm);
@@ -1625,7 +1625,7 @@ void cpu_init(void)
 	initialize_tlbstate_and_flush();
 	enter_lazy_tlb(&init_mm, curr);
 
-	load_sp0(t, thread);
+	load_sp0(thread->sp0);
 	set_tss_desc(cpu, t);
 	load_TR_desc();
 	load_mm_ldt(&init_mm);
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -287,7 +287,7 @@ __switch_to(struct task_struct *prev_p,
 	 * current_thread_info().  Refresh the SYSENTER configuration in
 	 * case prev or next is vm86.
 	 */
-	load_sp0(tss, next);
+	load_sp0(next->sp0);
 	refresh_sysenter_cs(next);
 	this_cpu_write(cpu_current_top_of_stack,
 		       (unsigned long)task_stack_page(next_p) +
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -465,7 +465,7 @@ __switch_to(struct task_struct *prev_p,
 	this_cpu_write(current_task, next_p);
 
 	/* Reload sp0. */
-	load_sp0(tss, next);
+	load_sp0(next->sp0);
 
 	/*
 	 * Now maybe reload the debug registers and handle I/O bitmaps
--- a/arch/x86/kernel/vm86_32.c
+++ b/arch/x86/kernel/vm86_32.c
@@ -95,7 +95,6 @@
 
 void save_v86_state(struct kernel_vm86_regs *regs, int retval)
 {
-	struct tss_struct *tss;
 	struct task_struct *tsk = current;
 	struct vm86plus_struct __user *user;
 	struct vm86 *vm86 = current->thread.vm86;
@@ -147,13 +146,13 @@ void save_v86_state(struct kernel_vm86_r
 		do_exit(SIGSEGV);
 	}
 
-	tss = &per_cpu(cpu_tss, get_cpu());
+	preempt_disable();
 	tsk->thread.sp0 = vm86->saved_sp0;
 	tsk->thread.sysenter_cs = __KERNEL_CS;
-	load_sp0(tss, &tsk->thread);
+	load_sp0(tsk->thread.sp0);
 	refresh_sysenter_cs(&tsk->thread);
 	vm86->saved_sp0 = 0;
-	put_cpu();
+	preempt_enable();
 
 	memcpy(&regs->pt, &vm86->regs32, sizeof(struct pt_regs));
 
@@ -239,7 +238,6 @@ SYSCALL_DEFINE2(vm86, unsigned long, cmd
 
 static long do_sys_vm86(struct vm86plus_struct __user *user_vm86, bool plus)
 {
-	struct tss_struct *tss;
 	struct task_struct *tsk = current;
 	struct vm86 *vm86 = tsk->thread.vm86;
 	struct kernel_vm86_regs vm86regs;
@@ -367,8 +365,8 @@ static long do_sys_vm86(struct vm86plus_
 	vm86->saved_sp0 = tsk->thread.sp0;
 	lazy_save_gs(vm86->regs32.gs);
 
-	tss = &per_cpu(cpu_tss, get_cpu());
 	/* make room for real-mode segments */
+	preempt_disable();
 	tsk->thread.sp0 += 16;
 
 	if (static_cpu_has(X86_FEATURE_SEP)) {
@@ -376,8 +374,8 @@ static long do_sys_vm86(struct vm86plus_
 		refresh_sysenter_cs(&tsk->thread);
 	}
 
-	load_sp0(tss, &tsk->thread);
-	put_cpu();
+	load_sp0(tsk->thread.sp0);
+	preempt_enable();
 
 	if (vm86->flags & VM86_SCREEN_BITMAP)
 		mark_screen_rdonly(tsk->mm);
--- a/arch/x86/xen/enlighten_pv.c
+++ b/arch/x86/xen/enlighten_pv.c
@@ -811,15 +811,14 @@ static void __init xen_write_gdt_entry_b
 	}
 }
 
-static void xen_load_sp0(struct tss_struct *tss,
-			 struct thread_struct *thread)
+static void xen_load_sp0(unsigned long sp0)
 {
 	struct multicall_space mcs;
 
 	mcs = xen_mc_entry(0);
-	MULTI_stack_switch(mcs.mc, __KERNEL_DS, thread->sp0);
+	MULTI_stack_switch(mcs.mc, __KERNEL_DS, sp0);
 	xen_mc_issue(PARAVIRT_LAZY_CPU);
-	tss->x86_tss.sp0 = thread->sp0;
+	this_cpu_write(cpu_tss.x86_tss.sp0, sp0);
 }
 
 void xen_set_iopl_mask(unsigned mask)
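
Note on the signature change: the old form made every caller look up the TSS and pass a thread_struct, although only one field was ever read. A small userspace sketch of the two shapes is below. The per-CPU TSS is modelled as a single global, and all names ending in _model, _old or _new are hypothetical.

```c
#include <assert.h>

struct tss_model {
	unsigned long sp0;
};

struct thread_model {
	unsigned long sp0;
};

/* Stands in for this CPU's cpu_tss; real code uses per-CPU accessors. */
static struct tss_model this_cpu_tss_model;

/* Old shape: caller supplies both the TSS and the thread. */
static void load_sp0_old(struct tss_model *tss, struct thread_model *thread)
{
	tss->sp0 = thread->sp0;
}

/* New shape: caller supplies only the value; the callee finds the TSS. */
static void load_sp0_new(unsigned long sp0)
{
	this_cpu_tss_model.sp0 = sp0;
}
```

Since the callee now locates the TSS itself, callers such as save_v86_state() no longer need get_cpu() to find it. They only have to keep preemption disabled around the update, which is what the vm86_32.c hunks do.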

* [PATCH 4.14 048/159] x86/entry: Add task_top_of_stack() to find the top of a tasks stack
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (46 preceding siblings ...)
  2017-12-22  8:45 ` [PATCH 4.14 047/159] x86/entry/64: Pass SP0 directly to load_sp0() Greg Kroah-Hartman
@ 2017-12-22  8:45 ` Greg Kroah-Hartman
  2017-12-22  8:45 ` [PATCH 4.14 049/159] x86/xen/64, x86/entry/64: Clean up SP code in cpu_initialize_context() Greg Kroah-Hartman
                   ` (117 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Andy Lutomirski, Borislav Petkov,
	Brian Gerst, Dave Hansen, Linus Torvalds, Peter Zijlstra,
	Thomas Gleixner, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <luto@kernel.org>

commit 3500130b84a3cdc5b6796eba1daf178944935efe upstream.

This will let us get rid of a few places that hardcode accesses to
thread.sp0.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/b49b3f95a8ff858c40c9b0f5b32be0355324327d.1509609304.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/include/asm/processor.h |    2 ++
 1 file changed, 2 insertions(+)

--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -796,6 +796,8 @@ static inline void spin_lock_prefetch(co
 #define TOP_OF_INIT_STACK ((unsigned long)&init_stack + sizeof(init_stack) - \
 			   TOP_OF_KERNEL_STACK_PADDING)
 
+#define task_top_of_stack(task) ((unsigned long)(task_pt_regs(task) + 1))
+
 #ifdef CONFIG_X86_32
 /*
  * User space process size: 3GB (default).
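
Note on task_top_of_stack(): the macro relies on pointer arithmetic. task_pt_regs() points at the pt_regs block at the top of the task's stack, so adding 1 to that pointer gives the address just past it. The userspace sketch below models this under stated assumptions: a 16 KiB stack, zero top-of-stack padding and a 21-word pt_regs (the 64-bit layout). All names ending in _model are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_THREAD_SIZE   (16 * 1024)	/* assumed stack size */
#define MODEL_STACK_PADDING 0		/* assumed TOP_OF_KERNEL_STACK_PADDING */

/* 15 general-purpose registers + orig_ax + 5-word hardware frame. */
struct pt_regs_model {
	uint64_t regs[21];
};

struct task_model {
	unsigned char *stack;	/* lowest address of the task's stack */
};

/* pt_regs occupies the last sizeof(struct pt_regs) bytes below the padding. */
static struct pt_regs_model *task_pt_regs_model(const struct task_model *t)
{
	unsigned char *top = t->stack + MODEL_THREAD_SIZE - MODEL_STACK_PADDING;

	return (struct pt_regs_model *)top - 1;
}

/* Same shape as the new macro: one element past pt_regs is the stack top. */
#define task_top_of_stack_model(t) \
	((uintptr_t)(task_pt_regs_model(t) + 1))
```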

* [PATCH 4.14 049/159] x86/xen/64, x86/entry/64: Clean up SP code in cpu_initialize_context()
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (47 preceding siblings ...)
  2017-12-22  8:45 ` [PATCH 4.14 048/159] x86/entry: Add task_top_of_stack() to find the top of a tasks stack Greg Kroah-Hartman
@ 2017-12-22  8:45 ` Greg Kroah-Hartman
  2017-12-22  8:45 ` [PATCH 4.14 050/159] x86/entry/64: Stop initializing TSS.sp0 at boot Greg Kroah-Hartman
                   ` (116 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Andy Lutomirski, Juergen Gross,
	Boris Ostrovsky, Borislav Petkov, Brian Gerst, Dave Hansen,
	Linus Torvalds, Peter Zijlstra, Thomas Gleixner, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <luto@kernel.org>

commit f16b3da1dc936c0f8121741d0a1731bf242f2f56 upstream.

I'm removing thread_struct::sp0, and Xen's usage of it is slightly
dubious and unnecessary.  Use appropriate helpers instead.

While we're at it, reorder the code slightly to make it more obvious
what's going on.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/d5b9a3da2b47c68325bd2bbe8f82d9554dee0d0f.1509609304.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/xen/smp_pv.c |   17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

--- a/arch/x86/xen/smp_pv.c
+++ b/arch/x86/xen/smp_pv.c
@@ -14,6 +14,7 @@
  * single-threaded.
  */
 #include <linux/sched.h>
+#include <linux/sched/task_stack.h>
 #include <linux/err.h>
 #include <linux/slab.h>
 #include <linux/smp.h>
@@ -294,12 +295,19 @@ cpu_initialize_context(unsigned int cpu,
 #endif
 	memset(&ctxt->fpu_ctxt, 0, sizeof(ctxt->fpu_ctxt));
 
+	/*
+	 * Bring up the CPU in cpu_bringup_and_idle() with the stack
+	 * pointing just below where pt_regs would be if it were a normal
+	 * kernel entry.
+	 */
 	ctxt->user_regs.eip = (unsigned long)cpu_bringup_and_idle;
 	ctxt->flags = VGCF_IN_KERNEL;
 	ctxt->user_regs.eflags = 0x1000; /* IOPL_RING1 */
 	ctxt->user_regs.ds = __USER_DS;
 	ctxt->user_regs.es = __USER_DS;
 	ctxt->user_regs.ss = __KERNEL_DS;
+	ctxt->user_regs.cs = __KERNEL_CS;
+	ctxt->user_regs.esp = (unsigned long)task_pt_regs(idle);
 
 	xen_copy_trap_info(ctxt->trap_ctxt);
 
@@ -314,8 +322,13 @@ cpu_initialize_context(unsigned int cpu,
 	ctxt->gdt_frames[0] = gdt_mfn;
 	ctxt->gdt_ents      = GDT_ENTRIES;
 
+	/*
+	 * Set SS:SP that Xen will use when entering guest kernel mode
+	 * from guest user mode.  Subsequent calls to load_sp0() can
+	 * change this value.
+	 */
 	ctxt->kernel_ss = __KERNEL_DS;
-	ctxt->kernel_sp = idle->thread.sp0;
+	ctxt->kernel_sp = task_top_of_stack(idle);
 
 #ifdef CONFIG_X86_32
 	ctxt->event_callback_cs     = __KERNEL_CS;
@@ -327,10 +340,8 @@ cpu_initialize_context(unsigned int cpu,
 		(unsigned long)xen_hypervisor_callback;
 	ctxt->failsafe_callback_eip =
 		(unsigned long)xen_failsafe_callback;
-	ctxt->user_regs.cs = __KERNEL_CS;
 	per_cpu(xen_cr3, cpu) = __pa(swapper_pg_dir);
 
-	ctxt->user_regs.esp = idle->thread.sp0 - sizeof(struct pt_regs);
 	ctxt->ctrlreg[3] = xen_pfn_to_cr3(virt_to_gfn(swapper_pg_dir));
 	if (HYPERVISOR_vcpu_op(VCPUOP_initialise, xen_vcpu_nr(cpu), ctxt))
 		BUG();


* [PATCH 4.14 050/159] x86/entry/64: Stop initializing TSS.sp0 at boot
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (48 preceding siblings ...)
  2017-12-22  8:45 ` [PATCH 4.14 049/159] x86/xen/64, x86/entry/64: Clean up SP code in cpu_initialize_context() Greg Kroah-Hartman
@ 2017-12-22  8:45 ` Greg Kroah-Hartman
  2017-12-22  8:45 ` [PATCH 4.14 051/159] x86/entry/64: Remove all remaining direct thread_struct::sp0 reads Greg Kroah-Hartman
                   ` (115 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Andy Lutomirski, Borislav Petkov,
	Brian Gerst, Dave Hansen, Linus Torvalds, Peter Zijlstra,
	Thomas Gleixner, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <luto@kernel.org>

commit 20bb83443ea79087b5e5f8dab4e9d80bb9bf7acb upstream.

In my quest to get rid of thread_struct::sp0, I want to clean up or
remove all of its readers.  Two of them are in cpu_init() (32-bit and
64-bit), and they aren't needed.  This is because we never enter
userspace at all on the threads that CPUs are initialized in.

Poison the initial TSS.sp0 and stop initializing it on CPU init.

The comment text mostly comes from Dave Hansen.  Thanks!

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/ee4a00540ad28c6cff475fbcc7769a4460acc861.1509609304.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/kernel/cpu/common.c |   13 ++++++++++---
 arch/x86/kernel/process.c    |    8 +++++++-
 2 files changed, 17 insertions(+), 4 deletions(-)

--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1570,9 +1570,13 @@ void cpu_init(void)
 	initialize_tlbstate_and_flush();
 	enter_lazy_tlb(&init_mm, me);
 
-	load_sp0(current->thread.sp0);
+	/*
+	 * Initialize the TSS.  Don't bother initializing sp0, as the initial
+	 * task never enters user mode.
+	 */
 	set_tss_desc(cpu, t);
 	load_TR_desc();
+
 	load_mm_ldt(&init_mm);
 
 	clear_all_debug_regs();
@@ -1594,7 +1598,6 @@ void cpu_init(void)
 	int cpu = smp_processor_id();
 	struct task_struct *curr = current;
 	struct tss_struct *t = &per_cpu(cpu_tss, cpu);
-	struct thread_struct *thread = &curr->thread;
 
 	wait_for_master_cpu(cpu);
 
@@ -1625,9 +1628,13 @@ void cpu_init(void)
 	initialize_tlbstate_and_flush();
 	enter_lazy_tlb(&init_mm, curr);
 
-	load_sp0(thread->sp0);
+	/*
+	 * Initialize the TSS.  Don't bother initializing sp0, as the initial
+	 * task never enters user mode.
+	 */
 	set_tss_desc(cpu, t);
 	load_TR_desc();
+
 	load_mm_ldt(&init_mm);
 
 	t->x86_tss.io_bitmap_base = offsetof(struct tss_struct, io_bitmap);
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -49,7 +49,13 @@
  */
 __visible DEFINE_PER_CPU_SHARED_ALIGNED(struct tss_struct, cpu_tss) = {
 	.x86_tss = {
-		.sp0 = TOP_OF_INIT_STACK,
+		/*
+		 * .sp0 is only used when entering ring 0 from a lower
+		 * privilege level.  Since the init task never runs anything
+		 * but ring 0 code, there is no need for a valid value here.
+		 * Poison it.
+		 */
+		.sp0 = (1UL << (BITS_PER_LONG-1)) + 1,
 #ifdef CONFIG_X86_32
 		.ss0 = __KERNEL_DS,
 		.ss1 = __KERNEL_CS,
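
The poison value in the last hunk can be inspected in user space. The sketch below is an illustration only, not kernel code: `POISONED_SP0` and `is_canonical_48()` are hypothetical names, `BITS_PER_LONG` is re-derived from `sizeof(unsigned long)`, and the 48-bit canonical-address check is an assumption about why such a value is useful (the commit message says only "Poison it"). The value has both its top bit and its low bit set, so it is odd and, with a 64-bit long, non-canonical under 48-bit virtual addressing.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* User-space stand-in for the kernel's BITS_PER_LONG. */
#define BITS_PER_LONG	((int)(sizeof(unsigned long) * CHAR_BIT))

/* Same expression as the new .sp0 initializer in the hunk above. */
#define POISONED_SP0	((1UL << (BITS_PER_LONG - 1)) + 1)

/* The low bit is set, so the value is never a properly aligned stack top. */
static_assert((POISONED_SP0 & 1UL) == 1UL, "poisoned sp0 is odd");

/*
 * Hypothetical helper: with 48-bit virtual addresses, an address is
 * canonical when bits 63..47 are all zeros or all ones.
 */
static bool is_canonical_48(unsigned long long addr)
{
	unsigned long long top = addr >> 47;

	return top == 0 || top == 0x1ffffULL;
}
```

If something did load this value as a stack pointer on a 64-bit kernel, the first access through it would presumably fault instead of silently landing on a valid stack.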


* [PATCH 4.14 051/159] x86/entry/64: Remove all remaining direct thread_struct::sp0 reads
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (49 preceding siblings ...)
  2017-12-22  8:45 ` [PATCH 4.14 050/159] x86/entry/64: Stop initializing TSS.sp0 at boot Greg Kroah-Hartman
@ 2017-12-22  8:45 ` Greg Kroah-Hartman
  2017-12-22  8:45 ` [PATCH 4.14 052/159] x86/entry/32: Fix cpu_current_top_of_stack initialization at boot Greg Kroah-Hartman
                   ` (114 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Andy Lutomirski, Borislav Petkov,
	Borislav Petkov, Brian Gerst, Dave Hansen, Linus Torvalds,
	Peter Zijlstra, Thomas Gleixner, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <luto@kernel.org>

commit 46f5a10a721ce8dce8cc8fe55279b49e1c6b3288 upstream.

The only remaining readers are in context switch code or vm86(), and
they all just want to update TSS.sp0 to match the current task.
Replace them all with a new helper update_sp0().

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/2d231687f4ff288c9d9e98d7861b7df374246ac3.1509609304.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/include/asm/switch_to.h |    6 ++++++
 arch/x86/kernel/process_32.c     |    2 +-
 arch/x86/kernel/process_64.c     |    2 +-
 arch/x86/kernel/vm86_32.c        |    4 ++--
 4 files changed, 10 insertions(+), 4 deletions(-)

--- a/arch/x86/include/asm/switch_to.h
+++ b/arch/x86/include/asm/switch_to.h
@@ -85,4 +85,10 @@ static inline void refresh_sysenter_cs(s
 }
 #endif
 
+/* This is used when switching tasks or entering/exiting vm86 mode. */
+static inline void update_sp0(struct task_struct *task)
+{
+	load_sp0(task->thread.sp0);
+}
+
 #endif /* _ASM_X86_SWITCH_TO_H */
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -287,7 +287,7 @@ __switch_to(struct task_struct *prev_p,
 	 * current_thread_info().  Refresh the SYSENTER configuration in
 	 * case prev or next is vm86.
 	 */
-	load_sp0(next->sp0);
+	update_sp0(next_p);
 	refresh_sysenter_cs(next);
 	this_cpu_write(cpu_current_top_of_stack,
 		       (unsigned long)task_stack_page(next_p) +
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -465,7 +465,7 @@ __switch_to(struct task_struct *prev_p,
 	this_cpu_write(current_task, next_p);
 
 	/* Reload sp0. */
-	load_sp0(next->sp0);
+	update_sp0(next_p);
 
 	/*
 	 * Now maybe reload the debug registers and handle I/O bitmaps
--- a/arch/x86/kernel/vm86_32.c
+++ b/arch/x86/kernel/vm86_32.c
@@ -149,7 +149,7 @@ void save_v86_state(struct kernel_vm86_r
 	preempt_disable();
 	tsk->thread.sp0 = vm86->saved_sp0;
 	tsk->thread.sysenter_cs = __KERNEL_CS;
-	load_sp0(tsk->thread.sp0);
+	update_sp0(tsk);
 	refresh_sysenter_cs(&tsk->thread);
 	vm86->saved_sp0 = 0;
 	preempt_enable();
@@ -374,7 +374,7 @@ static long do_sys_vm86(struct vm86plus_
 		refresh_sysenter_cs(&tsk->thread);
 	}
 
-	load_sp0(tsk->thread.sp0);
+	update_sp0(tsk);
 	preempt_enable();
 
 	if (vm86->flags & VM86_SCREEN_BITMAP)
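
The shape of this refactoring can be modelled outside the kernel. The sketch below is a simplified mock, not the real code: the two structs contain only the field of interest, and `mock_tss_sp0` is a hypothetical stand-in for the per-CPU TSS slot that the real `load_sp0()` writes. It shows that once every call site goes through `update_sp0()`, only the helper needs to know where sp0 comes from.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures; not the real layouts. */
struct thread_struct {
	unsigned long sp0;
};

struct task_struct {
	struct thread_struct thread;
};

/* Hypothetical mock of the per-CPU TSS slot that load_sp0() updates. */
static unsigned long mock_tss_sp0;

static void load_sp0(unsigned long sp0)
{
	mock_tss_sp0 = sp0;
}

/* Mirrors the helper added by the patch: the single reader of thread.sp0. */
static void update_sp0(struct task_struct *task)
{
	assert(task != NULL);
	load_sp0(task->thread.sp0);
}
```

A call site such as the one in `__switch_to()` then reduces to `update_sp0(next_p);`.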


* [PATCH 4.14 052/159] x86/entry/32: Fix cpu_current_top_of_stack initialization at boot
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (50 preceding siblings ...)
  2017-12-22  8:45 ` [PATCH 4.14 051/159] x86/entry/64: Remove all remaining direct thread_struct::sp0 reads Greg Kroah-Hartman
@ 2017-12-22  8:45 ` Greg Kroah-Hartman
  2017-12-22  8:45 ` [PATCH 4.14 053/159] x86/entry/64: Remove thread_struct::sp0 Greg Kroah-Hartman
                   ` (113 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Andy Lutomirski, Borislav Petkov,
	Borislav Petkov, Brian Gerst, Dave Hansen, Linus Torvalds,
	Peter Zijlstra, Thomas Gleixner, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <luto@kernel.org>

commit cd493a6deb8b78eca280d05f7fa73fd69403ae29 upstream.

cpu_current_top_of_stack's initialization forgot about
TOP_OF_KERNEL_STACK_PADDING.  This bug didn't matter because the
idle threads never enter user mode.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/e5e370a7e6e4fddd1c4e4cf619765d96bb874b21.1509609304.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/kernel/smpboot.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -962,8 +962,7 @@ void common_cpu_up(unsigned int cpu, str
 #ifdef CONFIG_X86_32
 	/* Stack for startup_32 can be just as for start_secondary onwards */
 	irq_ctx_init(cpu);
-	per_cpu(cpu_current_top_of_stack, cpu) =
-		(unsigned long)task_stack_page(idle) + THREAD_SIZE;
+	per_cpu(cpu_current_top_of_stack, cpu) = task_top_of_stack(idle);
 #else
 	initial_gs = per_cpu_offset(cpu);
 #endif
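
The size of the discrepancy can be worked out with plain arithmetic. This is a sketch under stated assumptions: `THREAD_SIZE` of 8192 is an illustrative 32-bit value, the padding of 8 bytes is taken from the `TOP_OF_KERNEL_STACK_PADDING` comment quoted elsewhere in this series, and both function names are hypothetical. `task_top_of_stack()` is modelled from its definition as `task_pt_regs(task) + 1`.

```c
#include <assert.h>

/* Illustrative values; the real ones depend on the kernel configuration. */
#define THREAD_SIZE			8192UL
#define TOP_OF_KERNEL_STACK_PADDING	8UL

static_assert(TOP_OF_KERNEL_STACK_PADDING < THREAD_SIZE,
	      "padding must fit inside the stack");

/* What common_cpu_up() stored before the fix. */
static unsigned long old_top_of_stack(unsigned long stack_page)
{
	return stack_page + THREAD_SIZE;
}

/* What task_top_of_stack(idle) evaluates to: the padding is subtracted. */
static unsigned long fixed_top_of_stack(unsigned long stack_page)
{
	return stack_page + THREAD_SIZE - TOP_OF_KERNEL_STACK_PADDING;
}
```

With these values the old initialization was off by exactly the padding, 8 bytes.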


* [PATCH 4.14 053/159] x86/entry/64: Remove thread_struct::sp0
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (51 preceding siblings ...)
  2017-12-22  8:45 ` [PATCH 4.14 052/159] x86/entry/32: Fix cpu_current_top_of_stack initialization at boot Greg Kroah-Hartman
@ 2017-12-22  8:45 ` Greg Kroah-Hartman
  2017-12-22  8:45 ` [PATCH 4.14 054/159] x86/traps: Use a new on_thread_stack() helper to clean up an assertion Greg Kroah-Hartman
                   ` (112 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Andy Lutomirski, Borislav Petkov,
	Brian Gerst, Dave Hansen, Linus Torvalds, Peter Zijlstra,
	Thomas Gleixner, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <luto@kernel.org>

commit d375cf1530595e33961a8844192cddab913650e3 upstream.

On x86_64, we can easily calculate sp0 when needed instead of
storing it in thread_struct.

On x86_32, a similar cleanup would be possible, but it would require
cleaning up the vm86 code first, and that can wait for a later
cleanup series.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/719cd9c66c548c4350d98a90f050aee8b17f8919.1509609304.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/include/asm/compat.h    |    1 +
 arch/x86/include/asm/processor.h |   28 +++++++++-------------------
 arch/x86/include/asm/switch_to.h |    6 ++++++
 arch/x86/kernel/process_64.c     |    1 -
 4 files changed, 16 insertions(+), 20 deletions(-)

--- a/arch/x86/include/asm/compat.h
+++ b/arch/x86/include/asm/compat.h
@@ -7,6 +7,7 @@
  */
 #include <linux/types.h>
 #include <linux/sched.h>
+#include <linux/sched/task_stack.h>
 #include <asm/processor.h>
 #include <asm/user32.h>
 #include <asm/unistd.h>
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -431,7 +431,9 @@ typedef struct {
 struct thread_struct {
 	/* Cached TLS descriptors: */
 	struct desc_struct	tls_array[GDT_ENTRY_TLS_ENTRIES];
+#ifdef CONFIG_X86_32
 	unsigned long		sp0;
+#endif
 	unsigned long		sp;
 #ifdef CONFIG_X86_32
 	unsigned long		sysenter_cs;
@@ -798,6 +800,13 @@ static inline void spin_lock_prefetch(co
 
 #define task_top_of_stack(task) ((unsigned long)(task_pt_regs(task) + 1))
 
+#define task_pt_regs(task) \
+({									\
+	unsigned long __ptr = (unsigned long)task_stack_page(task);	\
+	__ptr += THREAD_SIZE - TOP_OF_KERNEL_STACK_PADDING;		\
+	((struct pt_regs *)__ptr) - 1;					\
+})
+
 #ifdef CONFIG_X86_32
 /*
  * User space process size: 3GB (default).
@@ -817,23 +826,6 @@ static inline void spin_lock_prefetch(co
 	.addr_limit		= KERNEL_DS,				  \
 }
 
-/*
- * TOP_OF_KERNEL_STACK_PADDING reserves 8 bytes on top of the ring0 stack.
- * This is necessary to guarantee that the entire "struct pt_regs"
- * is accessible even if the CPU haven't stored the SS/ESP registers
- * on the stack (interrupt gate does not save these registers
- * when switching to the same priv ring).
- * Therefore beware: accessing the ss/esp fields of the
- * "struct pt_regs" is possible, but they may contain the
- * completely wrong values.
- */
-#define task_pt_regs(task) \
-({									\
-	unsigned long __ptr = (unsigned long)task_stack_page(task);	\
-	__ptr += THREAD_SIZE - TOP_OF_KERNEL_STACK_PADDING;		\
-	((struct pt_regs *)__ptr) - 1;					\
-})
-
 #define KSTK_ESP(task)		(task_pt_regs(task)->sp)
 
 #else
@@ -867,11 +859,9 @@ static inline void spin_lock_prefetch(co
 #define STACK_TOP_MAX		TASK_SIZE_MAX
 
 #define INIT_THREAD  {						\
-	.sp0			= TOP_OF_INIT_STACK,		\
 	.addr_limit		= KERNEL_DS,			\
 }
 
-#define task_pt_regs(tsk)	((struct pt_regs *)(tsk)->thread.sp0 - 1)
 extern unsigned long KSTK_ESP(struct task_struct *task);
 
 #endif /* CONFIG_X86_64 */
--- a/arch/x86/include/asm/switch_to.h
+++ b/arch/x86/include/asm/switch_to.h
@@ -2,6 +2,8 @@
 #ifndef _ASM_X86_SWITCH_TO_H
 #define _ASM_X86_SWITCH_TO_H
 
+#include <linux/sched/task_stack.h>
+
 struct task_struct; /* one of the stranger aspects of C forward declarations */
 
 struct task_struct *__switch_to_asm(struct task_struct *prev,
@@ -88,7 +90,11 @@ static inline void refresh_sysenter_cs(s
 /* This is used when switching tasks or entering/exiting vm86 mode. */
 static inline void update_sp0(struct task_struct *task)
 {
+#ifdef CONFIG_X86_32
 	load_sp0(task->thread.sp0);
+#else
+	load_sp0(task_top_of_stack(task));
+#endif
 }
 
 #endif /* _ASM_X86_SWITCH_TO_H */
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -274,7 +274,6 @@ int copy_thread_tls(unsigned long clone_
 	struct inactive_task_frame *frame;
 	struct task_struct *me = current;
 
-	p->thread.sp0 = (unsigned long)task_stack_page(p) + THREAD_SIZE;
 	childregs = task_pt_regs(p);
 	fork_frame = container_of(childregs, struct fork_frame, regs);
 	frame = &fork_frame->frame;
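
The claim that sp0 is easy to calculate on x86_64 can be checked with a small user-space model of the `task_pt_regs()` and `task_top_of_stack()` macros. Everything here is an assumption made for illustration: `THREAD_SIZE` of 16384, a padding of 0 on 64-bit (the value is not shown in this patch), a 21-word stand-in for `struct pt_regs`, and the `model_` function names. The model works on integer addresses rather than real pointers.

```c
#include <assert.h>

#define THREAD_SIZE			16384UL	/* assumed for illustration */
#define TOP_OF_KERNEL_STACK_PADDING	0UL	/* assumed 64-bit value */

/* Stand-in for struct pt_regs; only its size matters here. */
struct pt_regs {
	unsigned long regs[21];
};

static_assert(sizeof(struct pt_regs) < THREAD_SIZE,
	      "pt_regs must fit inside the stack");

/* Address of the pt_regs slot at the top of the stack. */
static unsigned long model_task_pt_regs(unsigned long stack_page)
{
	unsigned long ptr = stack_page;

	ptr += THREAD_SIZE - TOP_OF_KERNEL_STACK_PADDING;
	/* The macro's "((struct pt_regs *)ptr) - 1" steps back one struct. */
	return ptr - sizeof(struct pt_regs);
}

/* task_pt_regs(task) + 1: the first byte past the pt_regs slot. */
static unsigned long model_task_top_of_stack(unsigned long stack_page)
{
	return model_task_pt_regs(stack_page) + sizeof(struct pt_regs);
}
```

With a padding of 0, the modelled top of stack equals `stack_page + THREAD_SIZE`, which is the value the removed assignment in copy_thread_tls() used to store in `p->thread.sp0`.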


* [PATCH 4.14 054/159] x86/traps: Use a new on_thread_stack() helper to clean up an assertion
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (52 preceding siblings ...)
  2017-12-22  8:45 ` [PATCH 4.14 053/159] x86/entry/64: Remove thread_struct::sp0 Greg Kroah-Hartman
@ 2017-12-22  8:45 ` Greg Kroah-Hartman
  2017-12-22  8:45 ` [PATCH 4.14 055/159] x86/entry/64: Shorten TEST instructions Greg Kroah-Hartman
                   ` (111 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Andy Lutomirski, Borislav Petkov,
	Borislav Petkov, Brian Gerst, Dave Hansen, Linus Torvalds,
	Peter Zijlstra, Thomas Gleixner, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <luto@kernel.org>

commit 3383642c2f9d4f5b4fa37436db4a109a1a10018c upstream.

Let's keep the stack-related logic together rather than open-coding
a comparison in an assertion in the traps code.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/856b15bee1f55017b8f79d3758b0d51c48a08cf8.1509609304.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/include/asm/processor.h |    6 ++++++
 arch/x86/kernel/traps.c          |    3 +--
 2 files changed, 7 insertions(+), 2 deletions(-)

--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -542,6 +542,12 @@ static inline unsigned long current_top_
 #endif
 }
 
+static inline bool on_thread_stack(void)
+{
+	return (unsigned long)(current_top_of_stack() -
+			       current_stack_pointer) < THREAD_SIZE;
+}
+
 #ifdef CONFIG_PARAVIRT
 #include <asm/paravirt.h>
 #else
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -141,8 +141,7 @@ void ist_begin_non_atomic(struct pt_regs
 	 * will catch asm bugs and any attempt to use ist_preempt_enable
 	 * from double_fault.
 	 */
-	BUG_ON((unsigned long)(current_top_of_stack() -
-			       current_stack_pointer) >= THREAD_SIZE);
+	BUG_ON(!on_thread_stack());
 
 	preempt_enable_no_resched();
 }
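
The comparison inside on_thread_stack() relies on unsigned wraparound to test both bounds at once. A minimal user-space sketch follows; `model_on_thread_stack()` is a hypothetical name that takes the two values as parameters instead of reading `current_top_of_stack()` and `current_stack_pointer`, and `THREAD_SIZE` of 16384 is assumed for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define THREAD_SIZE	16384UL	/* assumed for illustration */

/*
 * True when sp lies in (top - THREAD_SIZE, top].  If sp is above top,
 * the unsigned subtraction wraps to a huge value and the test fails.
 */
static bool model_on_thread_stack(unsigned long top_of_stack, unsigned long sp)
{
	return (unsigned long)(top_of_stack - sp) < THREAD_SIZE;
}

/* Worked cases, callable from a test harness or from main(). */
void on_thread_stack_examples(void)
{
	unsigned long top = 0x100000UL;

	assert(model_on_thread_stack(top, top - 64));		/* inside */
	assert(model_on_thread_stack(top, top));		/* empty stack */
	assert(!model_on_thread_stack(top, top - THREAD_SIZE));	/* too low */
	assert(!model_on_thread_stack(top, top + 8));		/* above: wraps */
}
```

A single unsigned compare therefore replaces the pair of checks `sp <= top && sp > top - THREAD_SIZE`.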


* [PATCH 4.14 055/159] x86/entry/64: Shorten TEST instructions
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (53 preceding siblings ...)
  2017-12-22  8:45 ` [PATCH 4.14 054/159] x86/traps: Use a new on_thread_stack() helper to clean up an assertion Greg Kroah-Hartman
@ 2017-12-22  8:45 ` Greg Kroah-Hartman
  2017-12-22  8:45 ` [PATCH 4.14 056/159] x86/cpuid: Replace set/clear_bit32() Greg Kroah-Hartman
                   ` (110 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Borislav Petkov, Andy Lutomirski,
	Brian Gerst, Dave Hansen, Linus Torvalds, Peter Zijlstra,
	Thomas Gleixner, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Borislav Petkov <bp@suse.de>

commit 1e4c4f610f774df6088d7c065b2dd4d22adba698 upstream.

Convert TESTL to TESTB and save 3 bytes per callsite.

No functionality change.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20171102120926.4srwerqrr7g72e2k@pd.tnic
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/entry/entry_64.S |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -621,7 +621,7 @@ GLOBAL(retint_user)
 GLOBAL(swapgs_restore_regs_and_return_to_usermode)
 #ifdef CONFIG_DEBUG_ENTRY
 	/* Assert that pt_regs indicates user mode. */
-	testl	$3, CS(%rsp)
+	testb	$3, CS(%rsp)
 	jnz	1f
 	ud2
 1:
@@ -654,7 +654,7 @@ retint_kernel:
 GLOBAL(restore_regs_and_return_to_kernel)
 #ifdef CONFIG_DEBUG_ENTRY
 	/* Assert that pt_regs indicates kernel mode. */
-	testl	$3, CS(%rsp)
+	testb	$3, CS(%rsp)
 	jz	1f
 	ud2
 1:
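
The byte-wide test is equivalent because the mask $3 only touches the two lowest bits, and on little-endian x86 the byte at CS(%rsp) is the low byte of the saved selector. The saving comes from the immediate: TESTL carries a 4-byte immediate, TESTB a 1-byte one. The sketch below models both forms in C; the function names are hypothetical, and the selector values 0x10 and 0x33 used in the usage note are illustrative kernel and user CS values, not taken from this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Model of "testl $3, CS(%rsp)": AND the 32-bit value with 3. */
static bool rpl_nonzero_dword(uint32_t cs)
{
	return (cs & 3u) != 0;
}

/* Model of "testb $3, CS(%rsp)": AND only the lowest byte with 3. */
static bool rpl_nonzero_byte(uint32_t cs)
{
	return ((uint8_t)cs & 3u) != 0;
}

/* Exhaustive check over every 16-bit selector value. */
void check_test_widths_agree(void)
{
	for (uint32_t cs = 0; cs <= 0xffffu; cs++)
		assert(rpl_nonzero_dword(cs) == rpl_nonzero_byte(cs));
}
```

For example, a selector of 0x10 yields zero in both forms (kernel mode), while 0x33 yields non-zero (user mode).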


* [PATCH 4.14 056/159] x86/cpuid: Replace set/clear_bit32()
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (54 preceding siblings ...)
  2017-12-22  8:45 ` [PATCH 4.14 055/159] x86/entry/64: Shorten TEST instructions Greg Kroah-Hartman
@ 2017-12-22  8:45 ` Greg Kroah-Hartman
  2017-12-22  8:45 ` [PATCH 4.14 057/159] bitops: Revert cbe96375025e ("bitops: Add clear/set_bit32() to linux/bitops.h") Greg Kroah-Hartman
                   ` (109 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Peter Zijlstra, Thomas Gleixner, Andi Kleen

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Thomas Gleixner <tglx@linutronix.de>

commit 06dd688ddda5819025e014b79aea9af6ab475fa2 upstream.

Peter pointed out that the set/clear_bit32() variants are broken in various
aspects.

Replace them with open coded set/clear_bit() and type cast
cpu_info::x86_capability as it's done in all other places throughout x86.

Fixes: 0b00de857a64 ("x86/cpuid: Add generic table for CPUID dependencies")
Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/kernel/cpu/cpuid-deps.c |   26 +++++++++++---------------
 1 file changed, 11 insertions(+), 15 deletions(-)

--- a/arch/x86/kernel/cpu/cpuid-deps.c
+++ b/arch/x86/kernel/cpu/cpuid-deps.c
@@ -62,23 +62,19 @@ const static struct cpuid_dep cpuid_deps
 	{}
 };
 
-static inline void __clear_cpu_cap(struct cpuinfo_x86 *c, unsigned int bit)
-{
-	clear_bit32(bit, c->x86_capability);
-}
-
-static inline void __setup_clear_cpu_cap(unsigned int bit)
-{
-	clear_cpu_cap(&boot_cpu_data, bit);
-	set_bit32(bit, cpu_caps_cleared);
-}
-
 static inline void clear_feature(struct cpuinfo_x86 *c, unsigned int feature)
 {
-	if (!c)
-		__setup_clear_cpu_cap(feature);
-	else
-		__clear_cpu_cap(c, feature);
+	/*
+	 * Note: This could use the non atomic __*_bit() variants, but the
+	 * rest of the cpufeature code uses atomics as well, so keep it for
+	 * consistency. Cleanup all of it separately.
+	 */
+	if (!c) {
+		clear_cpu_cap(&boot_cpu_data, feature);
+		set_bit(feature, (unsigned long *)cpu_caps_cleared);
+	} else {
+		clear_bit(feature, (unsigned long *)c->x86_capability);
+	}
 }
 
 /* Take the capabilities and the BUG bits into account */


* [PATCH 4.14 057/159] bitops: Revert cbe96375025e ("bitops: Add clear/set_bit32() to linux/bitops.h")
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (55 preceding siblings ...)
  2017-12-22  8:45 ` [PATCH 4.14 056/159] x86/cpuid: Replace set/clear_bit32() Greg Kroah-Hartman
@ 2017-12-22  8:45 ` Greg Kroah-Hartman
  2017-12-22  8:45 ` [PATCH 4.14 058/159] x86/mm: Define _PAGE_TABLE using _KERNPG_TABLE Greg Kroah-Hartman
                   ` (108 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Peter Zijlstra, Thomas Gleixner, Andi Kleen

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Thomas Gleixner <tglx@linutronix.de>

commit 1943dc07b45e347c52c1bfdd4a37e04a86e399aa upstream.

These ops are not endian safe and may break on architectures which have
alignment requirements.

Reverts: cbe96375025e ("bitops: Add clear/set_bit32() to linux/bitops.h")
Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 include/linux/bitops.h |   26 --------------------------
 1 file changed, 26 deletions(-)

--- a/include/linux/bitops.h
+++ b/include/linux/bitops.h
@@ -228,32 +228,6 @@ static inline unsigned long __ffs64(u64
 	return __ffs((unsigned long)word);
 }
 
-/*
- * clear_bit32 - Clear a bit in memory for u32 array
- * @nr: Bit to clear
- * @addr: u32 * address of bitmap
- *
- * Same as clear_bit, but avoids needing casts for u32 arrays.
- */
-
-static __always_inline void clear_bit32(long nr, volatile u32 *addr)
-{
-	clear_bit(nr, (volatile unsigned long *)addr);
-}
-
-/*
- * set_bit32 - Set a bit in memory for u32 array
- * @nr: Bit to clear
- * @addr: u32 * address of bitmap
- *
- * Same as set_bit, but avoids needing casts for u32 arrays.
- */
-
-static __always_inline void set_bit32(long nr, volatile u32 *addr)
-{
-	set_bit(nr, (volatile unsigned long *)addr);
-}
-
 #ifdef __KERNEL__
 
 #ifndef set_mask_bits
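
The endianness problem can be shown without any real bit operations by computing which u32 element a given bit number lands in when the array is accessed through an `unsigned long *` on a 64-bit machine. This is a sketch with hypothetical function names; it models a 64-bit long only, and it does not model the separate alignment issue (a u32 array may be only 4-byte aligned while the cast implies 8-byte accesses).

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Which u32 element is touched when bit nr is set through an
 * unsigned long pointer on a 64-bit machine, as set_bit32() did?
 */
static unsigned int u32_index_via_long(unsigned int nr, bool big_endian)
{
	unsigned int long_index = nr / 64;		/* which 64-bit word */
	unsigned int upper_half = (nr % 64) / 32;	/* 0: bits 0..31 */

	/* Little-endian stores the low half first, big-endian second. */
	if (big_endian)
		return long_index * 2 + (1 - upper_half);
	return long_index * 2 + upper_half;
}

/* Which element a u32 bitmap is expected to use for bit nr. */
static unsigned int u32_index_expected(unsigned int nr)
{
	return nr / 32;
}

/* Little-endian always agrees; 64-bit big-endian never does. */
void check_bit32_mapping(void)
{
	for (unsigned int nr = 0; nr < 256; nr++) {
		assert(u32_index_via_long(nr, false) == u32_index_expected(nr));
		assert(u32_index_via_long(nr, true) != u32_index_expected(nr));
	}
}
```

On a 64-bit big-endian machine, bit 0 would end up in element 1 instead of element 0, which is presumably why a generic helper in linux/bitops.h is unsafe even though the same cast is tolerated in x86-only code.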


* [PATCH 4.14 058/159] x86/mm: Define _PAGE_TABLE using _KERNPG_TABLE
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (56 preceding siblings ...)
  2017-12-22  8:45 ` [PATCH 4.14 057/159] bitops: Revert cbe96375025e ("bitops: Add clear/set_bit32() to linux/bitops.h") Greg Kroah-Hartman
@ 2017-12-22  8:45 ` Greg Kroah-Hartman
  2017-12-22  8:45 ` [PATCH 4.14 059/159] x86/cpufeatures: Re-tabulate the X86_FEATURE definitions Greg Kroah-Hartman
                   ` (107 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Borislav Petkov, Linus Torvalds,
	Peter Zijlstra, Thomas Gleixner, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Borislav Petkov <bp@suse.de>

commit c7da092a1f243bfd1bfb4124f538e69e941882da upstream.

... so that the difference is obvious.

No functionality change.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20171103102028.20284-1-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/include/asm/pgtable_types.h |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

--- a/arch/x86/include/asm/pgtable_types.h
+++ b/arch/x86/include/asm/pgtable_types.h
@@ -200,10 +200,9 @@ enum page_cache_mode {
 
 #define _PAGE_ENC	(_AT(pteval_t, sme_me_mask))
 
-#define _PAGE_TABLE	(_PAGE_PRESENT | _PAGE_RW | _PAGE_USER |	\
-			 _PAGE_ACCESSED | _PAGE_DIRTY | _PAGE_ENC)
 #define _KERNPG_TABLE	(_PAGE_PRESENT | _PAGE_RW | _PAGE_ACCESSED |	\
 			 _PAGE_DIRTY | _PAGE_ENC)
+#define _PAGE_TABLE	(_KERNPG_TABLE | _PAGE_USER)
 
 #define __PAGE_KERNEL_ENC	(__PAGE_KERNEL | _PAGE_ENC)
 #define __PAGE_KERNEL_ENC_WP	(__PAGE_KERNEL_WP | _PAGE_ENC)
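
That the two definitions are equivalent can be verified at compile time. The sketch below is a user-space illustration under stated assumptions: the flag bit positions are the usual x86 page-table bits and are not shown in this patch, `PAGE_ENC` is fixed at 0 because the kernel's `_PAGE_ENC` is a runtime mask, and the names drop the leading underscore to stay out of C's reserved namespace.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t pteval_t;

/* Assumed x86 page-table flag bits (not part of this patch). */
#define PAGE_PRESENT	((pteval_t)1 << 0)
#define PAGE_RW		((pteval_t)1 << 1)
#define PAGE_USER	((pteval_t)1 << 2)
#define PAGE_ACCESSED	((pteval_t)1 << 5)
#define PAGE_DIRTY	((pteval_t)1 << 6)
#define PAGE_ENC	((pteval_t)0)	/* runtime sme_me_mask in the kernel */

/* Definition removed by the patch. */
#define PAGE_TABLE_OLD	(PAGE_PRESENT | PAGE_RW | PAGE_USER |	\
			 PAGE_ACCESSED | PAGE_DIRTY | PAGE_ENC)

/* Definitions after the patch. */
#define KERNPG_TABLE	(PAGE_PRESENT | PAGE_RW | PAGE_ACCESSED |	\
			 PAGE_DIRTY | PAGE_ENC)
#define PAGE_TABLE_NEW	(KERNPG_TABLE | PAGE_USER)

static_assert(PAGE_TABLE_OLD == PAGE_TABLE_NEW, "no functional change");
static_assert((PAGE_TABLE_NEW ^ KERNPG_TABLE) == PAGE_USER,
	      "the two differ only in the USER bit");
```

The second assertion states the point of the patch: the only difference between the user and kernel page-table flags is the USER bit.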


* [PATCH 4.14 059/159] x86/cpufeatures: Re-tabulate the X86_FEATURE definitions
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (57 preceding siblings ...)
  2017-12-22  8:45 ` [PATCH 4.14 058/159] x86/mm: Define _PAGE_TABLE using _KERNPG_TABLE Greg Kroah-Hartman
@ 2017-12-22  8:45 ` Greg Kroah-Hartman
  2017-12-22  8:45 ` [PATCH 4.14 060/159] x86/cpufeatures: Fix various details in the feature definitions Greg Kroah-Hartman
                   ` (106 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Andrew Morton, Andy Lutomirski,
	Andy Lutomirski, Borislav Petkov, Brian Gerst, Denys Vlasenko,
	Josh Poimboeuf, Linus Torvalds, Peter Zijlstra, Thomas Gleixner,
	Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Ingo Molnar <mingo@kernel.org>

commit acbc845ffefd9fb70466182cd8555a26189462b2 upstream.

Over the years asm/cpufeatures.h has become somewhat of a mess: the original
tabulation style was too narrow, while x86 feature names also kept growing
in length, creating frequent field width overflows.

Re-tabulate it to make it wider and easier to read/modify. Also harmonize
the tabulation of the other defines in this file to match it.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20171031121723.28524-3-mingo@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/include/asm/cpufeatures.h |  512 ++++++++++++++++++-------------------
 1 file changed, 256 insertions(+), 256 deletions(-)

--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -13,8 +13,8 @@
 /*
  * Defines x86 CPU feature bits
  */
-#define NCAPINTS	18	/* N 32-bit words worth of info */
-#define NBUGINTS	1	/* N 32-bit bug flags */
+#define NCAPINTS			18	   /* N 32-bit words worth of info */
+#define NBUGINTS			1	   /* N 32-bit bug flags */
 
 /*
  * Note: If the comment begins with a quoted string, that string is used
@@ -28,163 +28,163 @@
  */
 
 /* Intel-defined CPU features, CPUID level 0x00000001 (edx), word 0 */
-#define X86_FEATURE_FPU		( 0*32+ 0) /* Onboard FPU */
-#define X86_FEATURE_VME		( 0*32+ 1) /* Virtual Mode Extensions */
-#define X86_FEATURE_DE		( 0*32+ 2) /* Debugging Extensions */
-#define X86_FEATURE_PSE		( 0*32+ 3) /* Page Size Extensions */
-#define X86_FEATURE_TSC		( 0*32+ 4) /* Time Stamp Counter */
-#define X86_FEATURE_MSR		( 0*32+ 5) /* Model-Specific Registers */
-#define X86_FEATURE_PAE		( 0*32+ 6) /* Physical Address Extensions */
-#define X86_FEATURE_MCE		( 0*32+ 7) /* Machine Check Exception */
-#define X86_FEATURE_CX8		( 0*32+ 8) /* CMPXCHG8 instruction */
-#define X86_FEATURE_APIC	( 0*32+ 9) /* Onboard APIC */
-#define X86_FEATURE_SEP		( 0*32+11) /* SYSENTER/SYSEXIT */
-#define X86_FEATURE_MTRR	( 0*32+12) /* Memory Type Range Registers */
-#define X86_FEATURE_PGE		( 0*32+13) /* Page Global Enable */
-#define X86_FEATURE_MCA		( 0*32+14) /* Machine Check Architecture */
-#define X86_FEATURE_CMOV	( 0*32+15) /* CMOV instructions */
+#define X86_FEATURE_FPU			( 0*32+ 0) /* Onboard FPU */
+#define X86_FEATURE_VME			( 0*32+ 1) /* Virtual Mode Extensions */
+#define X86_FEATURE_DE			( 0*32+ 2) /* Debugging Extensions */
+#define X86_FEATURE_PSE			( 0*32+ 3) /* Page Size Extensions */
+#define X86_FEATURE_TSC			( 0*32+ 4) /* Time Stamp Counter */
+#define X86_FEATURE_MSR			( 0*32+ 5) /* Model-Specific Registers */
+#define X86_FEATURE_PAE			( 0*32+ 6) /* Physical Address Extensions */
+#define X86_FEATURE_MCE			( 0*32+ 7) /* Machine Check Exception */
+#define X86_FEATURE_CX8			( 0*32+ 8) /* CMPXCHG8 instruction */
+#define X86_FEATURE_APIC		( 0*32+ 9) /* Onboard APIC */
+#define X86_FEATURE_SEP			( 0*32+11) /* SYSENTER/SYSEXIT */
+#define X86_FEATURE_MTRR		( 0*32+12) /* Memory Type Range Registers */
+#define X86_FEATURE_PGE			( 0*32+13) /* Page Global Enable */
+#define X86_FEATURE_MCA			( 0*32+14) /* Machine Check Architecture */
+#define X86_FEATURE_CMOV		( 0*32+15) /* CMOV instructions */
 					  /* (plus FCMOVcc, FCOMI with FPU) */
-#define X86_FEATURE_PAT		( 0*32+16) /* Page Attribute Table */
-#define X86_FEATURE_PSE36	( 0*32+17) /* 36-bit PSEs */
-#define X86_FEATURE_PN		( 0*32+18) /* Processor serial number */
-#define X86_FEATURE_CLFLUSH	( 0*32+19) /* CLFLUSH instruction */
-#define X86_FEATURE_DS		( 0*32+21) /* "dts" Debug Store */
-#define X86_FEATURE_ACPI	( 0*32+22) /* ACPI via MSR */
-#define X86_FEATURE_MMX		( 0*32+23) /* Multimedia Extensions */
-#define X86_FEATURE_FXSR	( 0*32+24) /* FXSAVE/FXRSTOR, CR4.OSFXSR */
-#define X86_FEATURE_XMM		( 0*32+25) /* "sse" */
-#define X86_FEATURE_XMM2	( 0*32+26) /* "sse2" */
-#define X86_FEATURE_SELFSNOOP	( 0*32+27) /* "ss" CPU self snoop */
-#define X86_FEATURE_HT		( 0*32+28) /* Hyper-Threading */
-#define X86_FEATURE_ACC		( 0*32+29) /* "tm" Automatic clock control */
-#define X86_FEATURE_IA64	( 0*32+30) /* IA-64 processor */
-#define X86_FEATURE_PBE		( 0*32+31) /* Pending Break Enable */
+#define X86_FEATURE_PAT			( 0*32+16) /* Page Attribute Table */
+#define X86_FEATURE_PSE36		( 0*32+17) /* 36-bit PSEs */
+#define X86_FEATURE_PN			( 0*32+18) /* Processor serial number */
+#define X86_FEATURE_CLFLUSH		( 0*32+19) /* CLFLUSH instruction */
+#define X86_FEATURE_DS			( 0*32+21) /* "dts" Debug Store */
+#define X86_FEATURE_ACPI		( 0*32+22) /* ACPI via MSR */
+#define X86_FEATURE_MMX			( 0*32+23) /* Multimedia Extensions */
+#define X86_FEATURE_FXSR		( 0*32+24) /* FXSAVE/FXRSTOR, CR4.OSFXSR */
+#define X86_FEATURE_XMM			( 0*32+25) /* "sse" */
+#define X86_FEATURE_XMM2		( 0*32+26) /* "sse2" */
+#define X86_FEATURE_SELFSNOOP		( 0*32+27) /* "ss" CPU self snoop */
+#define X86_FEATURE_HT			( 0*32+28) /* Hyper-Threading */
+#define X86_FEATURE_ACC			( 0*32+29) /* "tm" Automatic clock control */
+#define X86_FEATURE_IA64		( 0*32+30) /* IA-64 processor */
+#define X86_FEATURE_PBE			( 0*32+31) /* Pending Break Enable */
 
 /* AMD-defined CPU features, CPUID level 0x80000001, word 1 */
 /* Don't duplicate feature flags which are redundant with Intel! */
-#define X86_FEATURE_SYSCALL	( 1*32+11) /* SYSCALL/SYSRET */
-#define X86_FEATURE_MP		( 1*32+19) /* MP Capable. */
-#define X86_FEATURE_NX		( 1*32+20) /* Execute Disable */
-#define X86_FEATURE_MMXEXT	( 1*32+22) /* AMD MMX extensions */
-#define X86_FEATURE_FXSR_OPT	( 1*32+25) /* FXSAVE/FXRSTOR optimizations */
-#define X86_FEATURE_GBPAGES	( 1*32+26) /* "pdpe1gb" GB pages */
-#define X86_FEATURE_RDTSCP	( 1*32+27) /* RDTSCP */
-#define X86_FEATURE_LM		( 1*32+29) /* Long Mode (x86-64) */
-#define X86_FEATURE_3DNOWEXT	( 1*32+30) /* AMD 3DNow! extensions */
-#define X86_FEATURE_3DNOW	( 1*32+31) /* 3DNow! */
+#define X86_FEATURE_SYSCALL		( 1*32+11) /* SYSCALL/SYSRET */
+#define X86_FEATURE_MP			( 1*32+19) /* MP Capable. */
+#define X86_FEATURE_NX			( 1*32+20) /* Execute Disable */
+#define X86_FEATURE_MMXEXT		( 1*32+22) /* AMD MMX extensions */
+#define X86_FEATURE_FXSR_OPT		( 1*32+25) /* FXSAVE/FXRSTOR optimizations */
+#define X86_FEATURE_GBPAGES		( 1*32+26) /* "pdpe1gb" GB pages */
+#define X86_FEATURE_RDTSCP		( 1*32+27) /* RDTSCP */
+#define X86_FEATURE_LM			( 1*32+29) /* Long Mode (x86-64) */
+#define X86_FEATURE_3DNOWEXT		( 1*32+30) /* AMD 3DNow! extensions */
+#define X86_FEATURE_3DNOW		( 1*32+31) /* 3DNow! */
 
 /* Transmeta-defined CPU features, CPUID level 0x80860001, word 2 */
-#define X86_FEATURE_RECOVERY	( 2*32+ 0) /* CPU in recovery mode */
-#define X86_FEATURE_LONGRUN	( 2*32+ 1) /* Longrun power control */
-#define X86_FEATURE_LRTI	( 2*32+ 3) /* LongRun table interface */
+#define X86_FEATURE_RECOVERY		( 2*32+ 0) /* CPU in recovery mode */
+#define X86_FEATURE_LONGRUN		( 2*32+ 1) /* Longrun power control */
+#define X86_FEATURE_LRTI		( 2*32+ 3) /* LongRun table interface */
 
 /* Other features, Linux-defined mapping, word 3 */
 /* This range is used for feature bits which conflict or are synthesized */
-#define X86_FEATURE_CXMMX	( 3*32+ 0) /* Cyrix MMX extensions */
-#define X86_FEATURE_K6_MTRR	( 3*32+ 1) /* AMD K6 nonstandard MTRRs */
-#define X86_FEATURE_CYRIX_ARR	( 3*32+ 2) /* Cyrix ARRs (= MTRRs) */
-#define X86_FEATURE_CENTAUR_MCR	( 3*32+ 3) /* Centaur MCRs (= MTRRs) */
+#define X86_FEATURE_CXMMX		( 3*32+ 0) /* Cyrix MMX extensions */
+#define X86_FEATURE_K6_MTRR		( 3*32+ 1) /* AMD K6 nonstandard MTRRs */
+#define X86_FEATURE_CYRIX_ARR		( 3*32+ 2) /* Cyrix ARRs (= MTRRs) */
+#define X86_FEATURE_CENTAUR_MCR		( 3*32+ 3) /* Centaur MCRs (= MTRRs) */
 /* cpu types for specific tunings: */
-#define X86_FEATURE_K8		( 3*32+ 4) /* "" Opteron, Athlon64 */
-#define X86_FEATURE_K7		( 3*32+ 5) /* "" Athlon */
-#define X86_FEATURE_P3		( 3*32+ 6) /* "" P3 */
-#define X86_FEATURE_P4		( 3*32+ 7) /* "" P4 */
-#define X86_FEATURE_CONSTANT_TSC ( 3*32+ 8) /* TSC ticks at a constant rate */
-#define X86_FEATURE_UP		( 3*32+ 9) /* smp kernel running on up */
-#define X86_FEATURE_ART		( 3*32+10) /* Platform has always running timer (ART) */
-#define X86_FEATURE_ARCH_PERFMON ( 3*32+11) /* Intel Architectural PerfMon */
-#define X86_FEATURE_PEBS	( 3*32+12) /* Precise-Event Based Sampling */
-#define X86_FEATURE_BTS		( 3*32+13) /* Branch Trace Store */
-#define X86_FEATURE_SYSCALL32	( 3*32+14) /* "" syscall in ia32 userspace */
-#define X86_FEATURE_SYSENTER32	( 3*32+15) /* "" sysenter in ia32 userspace */
-#define X86_FEATURE_REP_GOOD	( 3*32+16) /* rep microcode works well */
-#define X86_FEATURE_MFENCE_RDTSC ( 3*32+17) /* "" Mfence synchronizes RDTSC */
-#define X86_FEATURE_LFENCE_RDTSC ( 3*32+18) /* "" Lfence synchronizes RDTSC */
-#define X86_FEATURE_ACC_POWER	( 3*32+19) /* AMD Accumulated Power Mechanism */
-#define X86_FEATURE_NOPL	( 3*32+20) /* The NOPL (0F 1F) instructions */
-#define X86_FEATURE_ALWAYS	( 3*32+21) /* "" Always-present feature */
-#define X86_FEATURE_XTOPOLOGY	( 3*32+22) /* cpu topology enum extensions */
-#define X86_FEATURE_TSC_RELIABLE ( 3*32+23) /* TSC is known to be reliable */
-#define X86_FEATURE_NONSTOP_TSC	( 3*32+24) /* TSC does not stop in C states */
-#define X86_FEATURE_CPUID	( 3*32+25) /* CPU has CPUID instruction itself */
-#define X86_FEATURE_EXTD_APICID	( 3*32+26) /* has extended APICID (8 bits) */
-#define X86_FEATURE_AMD_DCM     ( 3*32+27) /* multi-node processor */
-#define X86_FEATURE_APERFMPERF	( 3*32+28) /* APERFMPERF */
-#define X86_FEATURE_NONSTOP_TSC_S3 ( 3*32+30) /* TSC doesn't stop in S3 state */
-#define X86_FEATURE_TSC_KNOWN_FREQ ( 3*32+31) /* TSC has known frequency */
+#define X86_FEATURE_K8			( 3*32+ 4) /* "" Opteron, Athlon64 */
+#define X86_FEATURE_K7			( 3*32+ 5) /* "" Athlon */
+#define X86_FEATURE_P3			( 3*32+ 6) /* "" P3 */
+#define X86_FEATURE_P4			( 3*32+ 7) /* "" P4 */
+#define X86_FEATURE_CONSTANT_TSC	( 3*32+ 8) /* TSC ticks at a constant rate */
+#define X86_FEATURE_UP			( 3*32+ 9) /* smp kernel running on up */
+#define X86_FEATURE_ART			( 3*32+10) /* Platform has always running timer (ART) */
+#define X86_FEATURE_ARCH_PERFMON	( 3*32+11) /* Intel Architectural PerfMon */
+#define X86_FEATURE_PEBS		( 3*32+12) /* Precise-Event Based Sampling */
+#define X86_FEATURE_BTS			( 3*32+13) /* Branch Trace Store */
+#define X86_FEATURE_SYSCALL32		( 3*32+14) /* "" syscall in ia32 userspace */
+#define X86_FEATURE_SYSENTER32		( 3*32+15) /* "" sysenter in ia32 userspace */
+#define X86_FEATURE_REP_GOOD		( 3*32+16) /* rep microcode works well */
+#define X86_FEATURE_MFENCE_RDTSC	( 3*32+17) /* "" Mfence synchronizes RDTSC */
+#define X86_FEATURE_LFENCE_RDTSC	( 3*32+18) /* "" Lfence synchronizes RDTSC */
+#define X86_FEATURE_ACC_POWER		( 3*32+19) /* AMD Accumulated Power Mechanism */
+#define X86_FEATURE_NOPL		( 3*32+20) /* The NOPL (0F 1F) instructions */
+#define X86_FEATURE_ALWAYS		( 3*32+21) /* "" Always-present feature */
+#define X86_FEATURE_XTOPOLOGY		( 3*32+22) /* cpu topology enum extensions */
+#define X86_FEATURE_TSC_RELIABLE	( 3*32+23) /* TSC is known to be reliable */
+#define X86_FEATURE_NONSTOP_TSC		( 3*32+24) /* TSC does not stop in C states */
+#define X86_FEATURE_CPUID		( 3*32+25) /* CPU has CPUID instruction itself */
+#define X86_FEATURE_EXTD_APICID		( 3*32+26) /* has extended APICID (8 bits) */
+#define X86_FEATURE_AMD_DCM		( 3*32+27) /* multi-node processor */
+#define X86_FEATURE_APERFMPERF		( 3*32+28) /* APERFMPERF */
+#define X86_FEATURE_NONSTOP_TSC_S3	( 3*32+30) /* TSC doesn't stop in S3 state */
+#define X86_FEATURE_TSC_KNOWN_FREQ	( 3*32+31) /* TSC has known frequency */
 
 /* Intel-defined CPU features, CPUID level 0x00000001 (ecx), word 4 */
-#define X86_FEATURE_XMM3	( 4*32+ 0) /* "pni" SSE-3 */
-#define X86_FEATURE_PCLMULQDQ	( 4*32+ 1) /* PCLMULQDQ instruction */
-#define X86_FEATURE_DTES64	( 4*32+ 2) /* 64-bit Debug Store */
-#define X86_FEATURE_MWAIT	( 4*32+ 3) /* "monitor" Monitor/Mwait support */
-#define X86_FEATURE_DSCPL	( 4*32+ 4) /* "ds_cpl" CPL Qual. Debug Store */
-#define X86_FEATURE_VMX		( 4*32+ 5) /* Hardware virtualization */
-#define X86_FEATURE_SMX		( 4*32+ 6) /* Safer mode */
-#define X86_FEATURE_EST		( 4*32+ 7) /* Enhanced SpeedStep */
-#define X86_FEATURE_TM2		( 4*32+ 8) /* Thermal Monitor 2 */
-#define X86_FEATURE_SSSE3	( 4*32+ 9) /* Supplemental SSE-3 */
-#define X86_FEATURE_CID		( 4*32+10) /* Context ID */
-#define X86_FEATURE_SDBG	( 4*32+11) /* Silicon Debug */
-#define X86_FEATURE_FMA		( 4*32+12) /* Fused multiply-add */
-#define X86_FEATURE_CX16	( 4*32+13) /* CMPXCHG16B */
-#define X86_FEATURE_XTPR	( 4*32+14) /* Send Task Priority Messages */
-#define X86_FEATURE_PDCM	( 4*32+15) /* Performance Capabilities */
-#define X86_FEATURE_PCID	( 4*32+17) /* Process Context Identifiers */
-#define X86_FEATURE_DCA		( 4*32+18) /* Direct Cache Access */
-#define X86_FEATURE_XMM4_1	( 4*32+19) /* "sse4_1" SSE-4.1 */
-#define X86_FEATURE_XMM4_2	( 4*32+20) /* "sse4_2" SSE-4.2 */
-#define X86_FEATURE_X2APIC	( 4*32+21) /* x2APIC */
-#define X86_FEATURE_MOVBE	( 4*32+22) /* MOVBE instruction */
-#define X86_FEATURE_POPCNT      ( 4*32+23) /* POPCNT instruction */
+#define X86_FEATURE_XMM3		( 4*32+ 0) /* "pni" SSE-3 */
+#define X86_FEATURE_PCLMULQDQ		( 4*32+ 1) /* PCLMULQDQ instruction */
+#define X86_FEATURE_DTES64		( 4*32+ 2) /* 64-bit Debug Store */
+#define X86_FEATURE_MWAIT		( 4*32+ 3) /* "monitor" Monitor/Mwait support */
+#define X86_FEATURE_DSCPL		( 4*32+ 4) /* "ds_cpl" CPL Qual. Debug Store */
+#define X86_FEATURE_VMX			( 4*32+ 5) /* Hardware virtualization */
+#define X86_FEATURE_SMX			( 4*32+ 6) /* Safer mode */
+#define X86_FEATURE_EST			( 4*32+ 7) /* Enhanced SpeedStep */
+#define X86_FEATURE_TM2			( 4*32+ 8) /* Thermal Monitor 2 */
+#define X86_FEATURE_SSSE3		( 4*32+ 9) /* Supplemental SSE-3 */
+#define X86_FEATURE_CID			( 4*32+10) /* Context ID */
+#define X86_FEATURE_SDBG		( 4*32+11) /* Silicon Debug */
+#define X86_FEATURE_FMA			( 4*32+12) /* Fused multiply-add */
+#define X86_FEATURE_CX16		( 4*32+13) /* CMPXCHG16B */
+#define X86_FEATURE_XTPR		( 4*32+14) /* Send Task Priority Messages */
+#define X86_FEATURE_PDCM		( 4*32+15) /* Performance Capabilities */
+#define X86_FEATURE_PCID		( 4*32+17) /* Process Context Identifiers */
+#define X86_FEATURE_DCA			( 4*32+18) /* Direct Cache Access */
+#define X86_FEATURE_XMM4_1		( 4*32+19) /* "sse4_1" SSE-4.1 */
+#define X86_FEATURE_XMM4_2		( 4*32+20) /* "sse4_2" SSE-4.2 */
+#define X86_FEATURE_X2APIC		( 4*32+21) /* x2APIC */
+#define X86_FEATURE_MOVBE		( 4*32+22) /* MOVBE instruction */
+#define X86_FEATURE_POPCNT		( 4*32+23) /* POPCNT instruction */
 #define X86_FEATURE_TSC_DEADLINE_TIMER	( 4*32+24) /* Tsc deadline timer */
-#define X86_FEATURE_AES		( 4*32+25) /* AES instructions */
-#define X86_FEATURE_XSAVE	( 4*32+26) /* XSAVE/XRSTOR/XSETBV/XGETBV */
-#define X86_FEATURE_OSXSAVE	( 4*32+27) /* "" XSAVE enabled in the OS */
-#define X86_FEATURE_AVX		( 4*32+28) /* Advanced Vector Extensions */
-#define X86_FEATURE_F16C	( 4*32+29) /* 16-bit fp conversions */
-#define X86_FEATURE_RDRAND	( 4*32+30) /* The RDRAND instruction */
-#define X86_FEATURE_HYPERVISOR	( 4*32+31) /* Running on a hypervisor */
+#define X86_FEATURE_AES			( 4*32+25) /* AES instructions */
+#define X86_FEATURE_XSAVE		( 4*32+26) /* XSAVE/XRSTOR/XSETBV/XGETBV */
+#define X86_FEATURE_OSXSAVE		( 4*32+27) /* "" XSAVE enabled in the OS */
+#define X86_FEATURE_AVX			( 4*32+28) /* Advanced Vector Extensions */
+#define X86_FEATURE_F16C		( 4*32+29) /* 16-bit fp conversions */
+#define X86_FEATURE_RDRAND		( 4*32+30) /* The RDRAND instruction */
+#define X86_FEATURE_HYPERVISOR		( 4*32+31) /* Running on a hypervisor */
 
 /* VIA/Cyrix/Centaur-defined CPU features, CPUID level 0xC0000001, word 5 */
-#define X86_FEATURE_XSTORE	( 5*32+ 2) /* "rng" RNG present (xstore) */
-#define X86_FEATURE_XSTORE_EN	( 5*32+ 3) /* "rng_en" RNG enabled */
-#define X86_FEATURE_XCRYPT	( 5*32+ 6) /* "ace" on-CPU crypto (xcrypt) */
-#define X86_FEATURE_XCRYPT_EN	( 5*32+ 7) /* "ace_en" on-CPU crypto enabled */
-#define X86_FEATURE_ACE2	( 5*32+ 8) /* Advanced Cryptography Engine v2 */
-#define X86_FEATURE_ACE2_EN	( 5*32+ 9) /* ACE v2 enabled */
-#define X86_FEATURE_PHE		( 5*32+10) /* PadLock Hash Engine */
-#define X86_FEATURE_PHE_EN	( 5*32+11) /* PHE enabled */
-#define X86_FEATURE_PMM		( 5*32+12) /* PadLock Montgomery Multiplier */
-#define X86_FEATURE_PMM_EN	( 5*32+13) /* PMM enabled */
+#define X86_FEATURE_XSTORE		( 5*32+ 2) /* "rng" RNG present (xstore) */
+#define X86_FEATURE_XSTORE_EN		( 5*32+ 3) /* "rng_en" RNG enabled */
+#define X86_FEATURE_XCRYPT		( 5*32+ 6) /* "ace" on-CPU crypto (xcrypt) */
+#define X86_FEATURE_XCRYPT_EN		( 5*32+ 7) /* "ace_en" on-CPU crypto enabled */
+#define X86_FEATURE_ACE2		( 5*32+ 8) /* Advanced Cryptography Engine v2 */
+#define X86_FEATURE_ACE2_EN		( 5*32+ 9) /* ACE v2 enabled */
+#define X86_FEATURE_PHE			( 5*32+10) /* PadLock Hash Engine */
+#define X86_FEATURE_PHE_EN		( 5*32+11) /* PHE enabled */
+#define X86_FEATURE_PMM			( 5*32+12) /* PadLock Montgomery Multiplier */
+#define X86_FEATURE_PMM_EN		( 5*32+13) /* PMM enabled */
 
 /* More extended AMD flags: CPUID level 0x80000001, ecx, word 6 */
-#define X86_FEATURE_LAHF_LM	( 6*32+ 0) /* LAHF/SAHF in long mode */
-#define X86_FEATURE_CMP_LEGACY	( 6*32+ 1) /* If yes HyperThreading not valid */
-#define X86_FEATURE_SVM		( 6*32+ 2) /* Secure virtual machine */
-#define X86_FEATURE_EXTAPIC	( 6*32+ 3) /* Extended APIC space */
-#define X86_FEATURE_CR8_LEGACY	( 6*32+ 4) /* CR8 in 32-bit mode */
-#define X86_FEATURE_ABM		( 6*32+ 5) /* Advanced bit manipulation */
-#define X86_FEATURE_SSE4A	( 6*32+ 6) /* SSE-4A */
-#define X86_FEATURE_MISALIGNSSE ( 6*32+ 7) /* Misaligned SSE mode */
-#define X86_FEATURE_3DNOWPREFETCH ( 6*32+ 8) /* 3DNow prefetch instructions */
-#define X86_FEATURE_OSVW	( 6*32+ 9) /* OS Visible Workaround */
-#define X86_FEATURE_IBS		( 6*32+10) /* Instruction Based Sampling */
-#define X86_FEATURE_XOP		( 6*32+11) /* extended AVX instructions */
-#define X86_FEATURE_SKINIT	( 6*32+12) /* SKINIT/STGI instructions */
-#define X86_FEATURE_WDT		( 6*32+13) /* Watchdog timer */
-#define X86_FEATURE_LWP		( 6*32+15) /* Light Weight Profiling */
-#define X86_FEATURE_FMA4	( 6*32+16) /* 4 operands MAC instructions */
-#define X86_FEATURE_TCE		( 6*32+17) /* translation cache extension */
-#define X86_FEATURE_NODEID_MSR	( 6*32+19) /* NodeId MSR */
-#define X86_FEATURE_TBM		( 6*32+21) /* trailing bit manipulations */
-#define X86_FEATURE_TOPOEXT	( 6*32+22) /* topology extensions CPUID leafs */
-#define X86_FEATURE_PERFCTR_CORE ( 6*32+23) /* core performance counter extensions */
-#define X86_FEATURE_PERFCTR_NB  ( 6*32+24) /* NB performance counter extensions */
-#define X86_FEATURE_BPEXT	(6*32+26) /* data breakpoint extension */
-#define X86_FEATURE_PTSC	( 6*32+27) /* performance time-stamp counter */
-#define X86_FEATURE_PERFCTR_LLC	( 6*32+28) /* Last Level Cache performance counter extensions */
-#define X86_FEATURE_MWAITX	( 6*32+29) /* MWAIT extension (MONITORX/MWAITX) */
+#define X86_FEATURE_LAHF_LM		( 6*32+ 0) /* LAHF/SAHF in long mode */
+#define X86_FEATURE_CMP_LEGACY		( 6*32+ 1) /* If yes HyperThreading not valid */
+#define X86_FEATURE_SVM			( 6*32+ 2) /* Secure virtual machine */
+#define X86_FEATURE_EXTAPIC		( 6*32+ 3) /* Extended APIC space */
+#define X86_FEATURE_CR8_LEGACY		( 6*32+ 4) /* CR8 in 32-bit mode */
+#define X86_FEATURE_ABM			( 6*32+ 5) /* Advanced bit manipulation */
+#define X86_FEATURE_SSE4A		( 6*32+ 6) /* SSE-4A */
+#define X86_FEATURE_MISALIGNSSE		( 6*32+ 7) /* Misaligned SSE mode */
+#define X86_FEATURE_3DNOWPREFETCH	( 6*32+ 8) /* 3DNow prefetch instructions */
+#define X86_FEATURE_OSVW		( 6*32+ 9) /* OS Visible Workaround */
+#define X86_FEATURE_IBS			( 6*32+10) /* Instruction Based Sampling */
+#define X86_FEATURE_XOP			( 6*32+11) /* extended AVX instructions */
+#define X86_FEATURE_SKINIT		( 6*32+12) /* SKINIT/STGI instructions */
+#define X86_FEATURE_WDT			( 6*32+13) /* Watchdog timer */
+#define X86_FEATURE_LWP			( 6*32+15) /* Light Weight Profiling */
+#define X86_FEATURE_FMA4		( 6*32+16) /* 4 operands MAC instructions */
+#define X86_FEATURE_TCE			( 6*32+17) /* translation cache extension */
+#define X86_FEATURE_NODEID_MSR		( 6*32+19) /* NodeId MSR */
+#define X86_FEATURE_TBM			( 6*32+21) /* trailing bit manipulations */
+#define X86_FEATURE_TOPOEXT		( 6*32+22) /* topology extensions CPUID leafs */
+#define X86_FEATURE_PERFCTR_CORE	( 6*32+23) /* core performance counter extensions */
+#define X86_FEATURE_PERFCTR_NB		( 6*32+24) /* NB performance counter extensions */
+#define X86_FEATURE_BPEXT		(6*32+26) /* data breakpoint extension */
+#define X86_FEATURE_PTSC		( 6*32+27) /* performance time-stamp counter */
+#define X86_FEATURE_PERFCTR_LLC		( 6*32+28) /* Last Level Cache performance counter extensions */
+#define X86_FEATURE_MWAITX		( 6*32+29) /* MWAIT extension (MONITORX/MWAITX) */
 
 /*
  * Auxiliary flags: Linux defined - For features scattered in various
@@ -192,152 +192,152 @@
  *
  * Reuse free bits when adding new feature flags!
  */
-#define X86_FEATURE_RING3MWAIT	( 7*32+ 0) /* Ring 3 MONITOR/MWAIT */
-#define X86_FEATURE_CPUID_FAULT ( 7*32+ 1) /* Intel CPUID faulting */
-#define X86_FEATURE_CPB		( 7*32+ 2) /* AMD Core Performance Boost */
-#define X86_FEATURE_EPB		( 7*32+ 3) /* IA32_ENERGY_PERF_BIAS support */
-#define X86_FEATURE_CAT_L3	( 7*32+ 4) /* Cache Allocation Technology L3 */
-#define X86_FEATURE_CAT_L2	( 7*32+ 5) /* Cache Allocation Technology L2 */
-#define X86_FEATURE_CDP_L3	( 7*32+ 6) /* Code and Data Prioritization L3 */
-
-#define X86_FEATURE_HW_PSTATE	( 7*32+ 8) /* AMD HW-PState */
-#define X86_FEATURE_PROC_FEEDBACK ( 7*32+ 9) /* AMD ProcFeedbackInterface */
-#define X86_FEATURE_SME		( 7*32+10) /* AMD Secure Memory Encryption */
-
-#define X86_FEATURE_INTEL_PPIN	( 7*32+14) /* Intel Processor Inventory Number */
-#define X86_FEATURE_INTEL_PT	( 7*32+15) /* Intel Processor Trace */
-#define X86_FEATURE_AVX512_4VNNIW (7*32+16) /* AVX-512 Neural Network Instructions */
-#define X86_FEATURE_AVX512_4FMAPS (7*32+17) /* AVX-512 Multiply Accumulation Single precision */
+#define X86_FEATURE_RING3MWAIT		( 7*32+ 0) /* Ring 3 MONITOR/MWAIT */
+#define X86_FEATURE_CPUID_FAULT		( 7*32+ 1) /* Intel CPUID faulting */
+#define X86_FEATURE_CPB			( 7*32+ 2) /* AMD Core Performance Boost */
+#define X86_FEATURE_EPB			( 7*32+ 3) /* IA32_ENERGY_PERF_BIAS support */
+#define X86_FEATURE_CAT_L3		( 7*32+ 4) /* Cache Allocation Technology L3 */
+#define X86_FEATURE_CAT_L2		( 7*32+ 5) /* Cache Allocation Technology L2 */
+#define X86_FEATURE_CDP_L3		( 7*32+ 6) /* Code and Data Prioritization L3 */
+
+#define X86_FEATURE_HW_PSTATE		( 7*32+ 8) /* AMD HW-PState */
+#define X86_FEATURE_PROC_FEEDBACK	( 7*32+ 9) /* AMD ProcFeedbackInterface */
+#define X86_FEATURE_SME			( 7*32+10) /* AMD Secure Memory Encryption */
+
+#define X86_FEATURE_INTEL_PPIN		( 7*32+14) /* Intel Processor Inventory Number */
+#define X86_FEATURE_INTEL_PT		( 7*32+15) /* Intel Processor Trace */
+#define X86_FEATURE_AVX512_4VNNIW	(7*32+16) /* AVX-512 Neural Network Instructions */
+#define X86_FEATURE_AVX512_4FMAPS	(7*32+17) /* AVX-512 Multiply Accumulation Single precision */
 
-#define X86_FEATURE_MBA         ( 7*32+18) /* Memory Bandwidth Allocation */
+#define X86_FEATURE_MBA			( 7*32+18) /* Memory Bandwidth Allocation */
 
 /* Virtualization flags: Linux defined, word 8 */
-#define X86_FEATURE_TPR_SHADOW  ( 8*32+ 0) /* Intel TPR Shadow */
-#define X86_FEATURE_VNMI        ( 8*32+ 1) /* Intel Virtual NMI */
-#define X86_FEATURE_FLEXPRIORITY ( 8*32+ 2) /* Intel FlexPriority */
-#define X86_FEATURE_EPT         ( 8*32+ 3) /* Intel Extended Page Table */
-#define X86_FEATURE_VPID        ( 8*32+ 4) /* Intel Virtual Processor ID */
+#define X86_FEATURE_TPR_SHADOW		( 8*32+ 0) /* Intel TPR Shadow */
+#define X86_FEATURE_VNMI		( 8*32+ 1) /* Intel Virtual NMI */
+#define X86_FEATURE_FLEXPRIORITY	( 8*32+ 2) /* Intel FlexPriority */
+#define X86_FEATURE_EPT			( 8*32+ 3) /* Intel Extended Page Table */
+#define X86_FEATURE_VPID		( 8*32+ 4) /* Intel Virtual Processor ID */
 
-#define X86_FEATURE_VMMCALL     ( 8*32+15) /* Prefer vmmcall to vmcall */
-#define X86_FEATURE_XENPV       ( 8*32+16) /* "" Xen paravirtual guest */
+#define X86_FEATURE_VMMCALL		( 8*32+15) /* Prefer vmmcall to vmcall */
+#define X86_FEATURE_XENPV		( 8*32+16) /* "" Xen paravirtual guest */
 
 
 /* Intel-defined CPU features, CPUID level 0x00000007:0 (ebx), word 9 */
-#define X86_FEATURE_FSGSBASE	( 9*32+ 0) /* {RD/WR}{FS/GS}BASE instructions*/
-#define X86_FEATURE_TSC_ADJUST	( 9*32+ 1) /* TSC adjustment MSR 0x3b */
-#define X86_FEATURE_BMI1	( 9*32+ 3) /* 1st group bit manipulation extensions */
-#define X86_FEATURE_HLE		( 9*32+ 4) /* Hardware Lock Elision */
-#define X86_FEATURE_AVX2	( 9*32+ 5) /* AVX2 instructions */
-#define X86_FEATURE_SMEP	( 9*32+ 7) /* Supervisor Mode Execution Protection */
-#define X86_FEATURE_BMI2	( 9*32+ 8) /* 2nd group bit manipulation extensions */
-#define X86_FEATURE_ERMS	( 9*32+ 9) /* Enhanced REP MOVSB/STOSB */
-#define X86_FEATURE_INVPCID	( 9*32+10) /* Invalidate Processor Context ID */
-#define X86_FEATURE_RTM		( 9*32+11) /* Restricted Transactional Memory */
-#define X86_FEATURE_CQM		( 9*32+12) /* Cache QoS Monitoring */
-#define X86_FEATURE_MPX		( 9*32+14) /* Memory Protection Extension */
-#define X86_FEATURE_RDT_A	( 9*32+15) /* Resource Director Technology Allocation */
-#define X86_FEATURE_AVX512F	( 9*32+16) /* AVX-512 Foundation */
-#define X86_FEATURE_AVX512DQ	( 9*32+17) /* AVX-512 DQ (Double/Quad granular) Instructions */
-#define X86_FEATURE_RDSEED	( 9*32+18) /* The RDSEED instruction */
-#define X86_FEATURE_ADX		( 9*32+19) /* The ADCX and ADOX instructions */
-#define X86_FEATURE_SMAP	( 9*32+20) /* Supervisor Mode Access Prevention */
-#define X86_FEATURE_AVX512IFMA  ( 9*32+21) /* AVX-512 Integer Fused Multiply-Add instructions */
-#define X86_FEATURE_CLFLUSHOPT	( 9*32+23) /* CLFLUSHOPT instruction */
-#define X86_FEATURE_CLWB	( 9*32+24) /* CLWB instruction */
-#define X86_FEATURE_AVX512PF	( 9*32+26) /* AVX-512 Prefetch */
-#define X86_FEATURE_AVX512ER	( 9*32+27) /* AVX-512 Exponential and Reciprocal */
-#define X86_FEATURE_AVX512CD	( 9*32+28) /* AVX-512 Conflict Detection */
-#define X86_FEATURE_SHA_NI	( 9*32+29) /* SHA1/SHA256 Instruction Extensions */
-#define X86_FEATURE_AVX512BW	( 9*32+30) /* AVX-512 BW (Byte/Word granular) Instructions */
-#define X86_FEATURE_AVX512VL	( 9*32+31) /* AVX-512 VL (128/256 Vector Length) Extensions */
+#define X86_FEATURE_FSGSBASE		( 9*32+ 0) /* {RD/WR}{FS/GS}BASE instructions*/
+#define X86_FEATURE_TSC_ADJUST		( 9*32+ 1) /* TSC adjustment MSR 0x3b */
+#define X86_FEATURE_BMI1		( 9*32+ 3) /* 1st group bit manipulation extensions */
+#define X86_FEATURE_HLE			( 9*32+ 4) /* Hardware Lock Elision */
+#define X86_FEATURE_AVX2		( 9*32+ 5) /* AVX2 instructions */
+#define X86_FEATURE_SMEP		( 9*32+ 7) /* Supervisor Mode Execution Protection */
+#define X86_FEATURE_BMI2		( 9*32+ 8) /* 2nd group bit manipulation extensions */
+#define X86_FEATURE_ERMS		( 9*32+ 9) /* Enhanced REP MOVSB/STOSB */
+#define X86_FEATURE_INVPCID		( 9*32+10) /* Invalidate Processor Context ID */
+#define X86_FEATURE_RTM			( 9*32+11) /* Restricted Transactional Memory */
+#define X86_FEATURE_CQM			( 9*32+12) /* Cache QoS Monitoring */
+#define X86_FEATURE_MPX			( 9*32+14) /* Memory Protection Extension */
+#define X86_FEATURE_RDT_A		( 9*32+15) /* Resource Director Technology Allocation */
+#define X86_FEATURE_AVX512F		( 9*32+16) /* AVX-512 Foundation */
+#define X86_FEATURE_AVX512DQ		( 9*32+17) /* AVX-512 DQ (Double/Quad granular) Instructions */
+#define X86_FEATURE_RDSEED		( 9*32+18) /* The RDSEED instruction */
+#define X86_FEATURE_ADX			( 9*32+19) /* The ADCX and ADOX instructions */
+#define X86_FEATURE_SMAP		( 9*32+20) /* Supervisor Mode Access Prevention */
+#define X86_FEATURE_AVX512IFMA		( 9*32+21) /* AVX-512 Integer Fused Multiply-Add instructions */
+#define X86_FEATURE_CLFLUSHOPT		( 9*32+23) /* CLFLUSHOPT instruction */
+#define X86_FEATURE_CLWB		( 9*32+24) /* CLWB instruction */
+#define X86_FEATURE_AVX512PF		( 9*32+26) /* AVX-512 Prefetch */
+#define X86_FEATURE_AVX512ER		( 9*32+27) /* AVX-512 Exponential and Reciprocal */
+#define X86_FEATURE_AVX512CD		( 9*32+28) /* AVX-512 Conflict Detection */
+#define X86_FEATURE_SHA_NI		( 9*32+29) /* SHA1/SHA256 Instruction Extensions */
+#define X86_FEATURE_AVX512BW		( 9*32+30) /* AVX-512 BW (Byte/Word granular) Instructions */
+#define X86_FEATURE_AVX512VL		( 9*32+31) /* AVX-512 VL (128/256 Vector Length) Extensions */
 
 /* Extended state features, CPUID level 0x0000000d:1 (eax), word 10 */
-#define X86_FEATURE_XSAVEOPT	(10*32+ 0) /* XSAVEOPT */
-#define X86_FEATURE_XSAVEC	(10*32+ 1) /* XSAVEC */
-#define X86_FEATURE_XGETBV1	(10*32+ 2) /* XGETBV with ECX = 1 */
-#define X86_FEATURE_XSAVES	(10*32+ 3) /* XSAVES/XRSTORS */
+#define X86_FEATURE_XSAVEOPT		(10*32+ 0) /* XSAVEOPT */
+#define X86_FEATURE_XSAVEC		(10*32+ 1) /* XSAVEC */
+#define X86_FEATURE_XGETBV1		(10*32+ 2) /* XGETBV with ECX = 1 */
+#define X86_FEATURE_XSAVES		(10*32+ 3) /* XSAVES/XRSTORS */
 
 /* Intel-defined CPU QoS Sub-leaf, CPUID level 0x0000000F:0 (edx), word 11 */
-#define X86_FEATURE_CQM_LLC	(11*32+ 1) /* LLC QoS if 1 */
+#define X86_FEATURE_CQM_LLC		(11*32+ 1) /* LLC QoS if 1 */
 
 /* Intel-defined CPU QoS Sub-leaf, CPUID level 0x0000000F:1 (edx), word 12 */
-#define X86_FEATURE_CQM_OCCUP_LLC (12*32+ 0) /* LLC occupancy monitoring if 1 */
-#define X86_FEATURE_CQM_MBM_TOTAL (12*32+ 1) /* LLC Total MBM monitoring */
-#define X86_FEATURE_CQM_MBM_LOCAL (12*32+ 2) /* LLC Local MBM monitoring */
+#define X86_FEATURE_CQM_OCCUP_LLC	(12*32+ 0) /* LLC occupancy monitoring if 1 */
+#define X86_FEATURE_CQM_MBM_TOTAL	(12*32+ 1) /* LLC Total MBM monitoring */
+#define X86_FEATURE_CQM_MBM_LOCAL	(12*32+ 2) /* LLC Local MBM monitoring */
 
 /* AMD-defined CPU features, CPUID level 0x80000008 (ebx), word 13 */
-#define X86_FEATURE_CLZERO	(13*32+0) /* CLZERO instruction */
-#define X86_FEATURE_IRPERF	(13*32+1) /* Instructions Retired Count */
+#define X86_FEATURE_CLZERO		(13*32+0) /* CLZERO instruction */
+#define X86_FEATURE_IRPERF		(13*32+1) /* Instructions Retired Count */
 
 /* Thermal and Power Management Leaf, CPUID level 0x00000006 (eax), word 14 */
-#define X86_FEATURE_DTHERM	(14*32+ 0) /* Digital Thermal Sensor */
-#define X86_FEATURE_IDA		(14*32+ 1) /* Intel Dynamic Acceleration */
-#define X86_FEATURE_ARAT	(14*32+ 2) /* Always Running APIC Timer */
-#define X86_FEATURE_PLN		(14*32+ 4) /* Intel Power Limit Notification */
-#define X86_FEATURE_PTS		(14*32+ 6) /* Intel Package Thermal Status */
-#define X86_FEATURE_HWP		(14*32+ 7) /* Intel Hardware P-states */
-#define X86_FEATURE_HWP_NOTIFY	(14*32+ 8) /* HWP Notification */
-#define X86_FEATURE_HWP_ACT_WINDOW (14*32+ 9) /* HWP Activity Window */
-#define X86_FEATURE_HWP_EPP	(14*32+10) /* HWP Energy Perf. Preference */
-#define X86_FEATURE_HWP_PKG_REQ (14*32+11) /* HWP Package Level Request */
+#define X86_FEATURE_DTHERM		(14*32+ 0) /* Digital Thermal Sensor */
+#define X86_FEATURE_IDA			(14*32+ 1) /* Intel Dynamic Acceleration */
+#define X86_FEATURE_ARAT		(14*32+ 2) /* Always Running APIC Timer */
+#define X86_FEATURE_PLN			(14*32+ 4) /* Intel Power Limit Notification */
+#define X86_FEATURE_PTS			(14*32+ 6) /* Intel Package Thermal Status */
+#define X86_FEATURE_HWP			(14*32+ 7) /* Intel Hardware P-states */
+#define X86_FEATURE_HWP_NOTIFY		(14*32+ 8) /* HWP Notification */
+#define X86_FEATURE_HWP_ACT_WINDOW	(14*32+ 9) /* HWP Activity Window */
+#define X86_FEATURE_HWP_EPP		(14*32+10) /* HWP Energy Perf. Preference */
+#define X86_FEATURE_HWP_PKG_REQ		(14*32+11) /* HWP Package Level Request */
 
 /* AMD SVM Feature Identification, CPUID level 0x8000000a (edx), word 15 */
-#define X86_FEATURE_NPT		(15*32+ 0) /* Nested Page Table support */
-#define X86_FEATURE_LBRV	(15*32+ 1) /* LBR Virtualization support */
-#define X86_FEATURE_SVML	(15*32+ 2) /* "svm_lock" SVM locking MSR */
-#define X86_FEATURE_NRIPS	(15*32+ 3) /* "nrip_save" SVM next_rip save */
-#define X86_FEATURE_TSCRATEMSR  (15*32+ 4) /* "tsc_scale" TSC scaling support */
-#define X86_FEATURE_VMCBCLEAN   (15*32+ 5) /* "vmcb_clean" VMCB clean bits support */
-#define X86_FEATURE_FLUSHBYASID (15*32+ 6) /* flush-by-ASID support */
-#define X86_FEATURE_DECODEASSISTS (15*32+ 7) /* Decode Assists support */
-#define X86_FEATURE_PAUSEFILTER (15*32+10) /* filtered pause intercept */
-#define X86_FEATURE_PFTHRESHOLD (15*32+12) /* pause filter threshold */
-#define X86_FEATURE_AVIC	(15*32+13) /* Virtual Interrupt Controller */
-#define X86_FEATURE_V_VMSAVE_VMLOAD (15*32+15) /* Virtual VMSAVE VMLOAD */
-#define X86_FEATURE_VGIF	(15*32+16) /* Virtual GIF */
+#define X86_FEATURE_NPT			(15*32+ 0) /* Nested Page Table support */
+#define X86_FEATURE_LBRV		(15*32+ 1) /* LBR Virtualization support */
+#define X86_FEATURE_SVML		(15*32+ 2) /* "svm_lock" SVM locking MSR */
+#define X86_FEATURE_NRIPS		(15*32+ 3) /* "nrip_save" SVM next_rip save */
+#define X86_FEATURE_TSCRATEMSR		(15*32+ 4) /* "tsc_scale" TSC scaling support */
+#define X86_FEATURE_VMCBCLEAN		(15*32+ 5) /* "vmcb_clean" VMCB clean bits support */
+#define X86_FEATURE_FLUSHBYASID		(15*32+ 6) /* flush-by-ASID support */
+#define X86_FEATURE_DECODEASSISTS	(15*32+ 7) /* Decode Assists support */
+#define X86_FEATURE_PAUSEFILTER		(15*32+10) /* filtered pause intercept */
+#define X86_FEATURE_PFTHRESHOLD		(15*32+12) /* pause filter threshold */
+#define X86_FEATURE_AVIC		(15*32+13) /* Virtual Interrupt Controller */
+#define X86_FEATURE_V_VMSAVE_VMLOAD	(15*32+15) /* Virtual VMSAVE VMLOAD */
+#define X86_FEATURE_VGIF		(15*32+16) /* Virtual GIF */
 
 /* Intel-defined CPU features, CPUID level 0x00000007:0 (ecx), word 16 */
-#define X86_FEATURE_AVX512VBMI  (16*32+ 1) /* AVX512 Vector Bit Manipulation instructions*/
-#define X86_FEATURE_PKU		(16*32+ 3) /* Protection Keys for Userspace */
-#define X86_FEATURE_OSPKE	(16*32+ 4) /* OS Protection Keys Enable */
-#define X86_FEATURE_AVX512_VBMI2 (16*32+ 6) /* Additional AVX512 Vector Bit Manipulation Instructions */
-#define X86_FEATURE_GFNI	(16*32+ 8) /* Galois Field New Instructions */
-#define X86_FEATURE_VAES	(16*32+ 9) /* Vector AES */
-#define X86_FEATURE_VPCLMULQDQ	(16*32+ 10) /* Carry-Less Multiplication Double Quadword */
-#define X86_FEATURE_AVX512_VNNI (16*32+ 11) /* Vector Neural Network Instructions */
-#define X86_FEATURE_AVX512_BITALG (16*32+12) /* Support for VPOPCNT[B,W] and VPSHUF-BITQMB */
-#define X86_FEATURE_AVX512_VPOPCNTDQ (16*32+14) /* POPCNT for vectors of DW/QW */
-#define X86_FEATURE_LA57	(16*32+16) /* 5-level page tables */
-#define X86_FEATURE_RDPID	(16*32+22) /* RDPID instruction */
+#define X86_FEATURE_AVX512VBMI		(16*32+ 1) /* AVX512 Vector Bit Manipulation instructions*/
+#define X86_FEATURE_PKU			(16*32+ 3) /* Protection Keys for Userspace */
+#define X86_FEATURE_OSPKE		(16*32+ 4) /* OS Protection Keys Enable */
+#define X86_FEATURE_AVX512_VBMI2	(16*32+ 6) /* Additional AVX512 Vector Bit Manipulation Instructions */
+#define X86_FEATURE_GFNI		(16*32+ 8) /* Galois Field New Instructions */
+#define X86_FEATURE_VAES		(16*32+ 9) /* Vector AES */
+#define X86_FEATURE_VPCLMULQDQ		(16*32+ 10) /* Carry-Less Multiplication Double Quadword */
+#define X86_FEATURE_AVX512_VNNI		(16*32+ 11) /* Vector Neural Network Instructions */
+#define X86_FEATURE_AVX512_BITALG	(16*32+12) /* Support for VPOPCNT[B,W] and VPSHUF-BITQMB */
+#define X86_FEATURE_AVX512_VPOPCNTDQ	(16*32+14) /* POPCNT for vectors of DW/QW */
+#define X86_FEATURE_LA57		(16*32+16) /* 5-level page tables */
+#define X86_FEATURE_RDPID		(16*32+22) /* RDPID instruction */
 
 /* AMD-defined CPU features, CPUID level 0x80000007 (ebx), word 17 */
-#define X86_FEATURE_OVERFLOW_RECOV (17*32+0) /* MCA overflow recovery support */
-#define X86_FEATURE_SUCCOR	(17*32+1) /* Uncorrectable error containment and recovery */
-#define X86_FEATURE_SMCA	(17*32+3) /* Scalable MCA */
+#define X86_FEATURE_OVERFLOW_RECOV	(17*32+0) /* MCA overflow recovery support */
+#define X86_FEATURE_SUCCOR		(17*32+1) /* Uncorrectable error containment and recovery */
+#define X86_FEATURE_SMCA		(17*32+3) /* Scalable MCA */
 
 /*
  * BUG word(s)
  */
-#define X86_BUG(x)		(NCAPINTS*32 + (x))
+#define X86_BUG(x)			(NCAPINTS*32 + (x))
 
-#define X86_BUG_F00F		X86_BUG(0) /* Intel F00F */
-#define X86_BUG_FDIV		X86_BUG(1) /* FPU FDIV */
-#define X86_BUG_COMA		X86_BUG(2) /* Cyrix 6x86 coma */
-#define X86_BUG_AMD_TLB_MMATCH	X86_BUG(3) /* "tlb_mmatch" AMD Erratum 383 */
-#define X86_BUG_AMD_APIC_C1E	X86_BUG(4) /* "apic_c1e" AMD Erratum 400 */
-#define X86_BUG_11AP		X86_BUG(5) /* Bad local APIC aka 11AP */
-#define X86_BUG_FXSAVE_LEAK	X86_BUG(6) /* FXSAVE leaks FOP/FIP/FOP */
-#define X86_BUG_CLFLUSH_MONITOR	X86_BUG(7) /* AAI65, CLFLUSH required before MONITOR */
-#define X86_BUG_SYSRET_SS_ATTRS	X86_BUG(8) /* SYSRET doesn't fix up SS attrs */
+#define X86_BUG_F00F			X86_BUG(0) /* Intel F00F */
+#define X86_BUG_FDIV			X86_BUG(1) /* FPU FDIV */
+#define X86_BUG_COMA			X86_BUG(2) /* Cyrix 6x86 coma */
+#define X86_BUG_AMD_TLB_MMATCH		X86_BUG(3) /* "tlb_mmatch" AMD Erratum 383 */
+#define X86_BUG_AMD_APIC_C1E		X86_BUG(4) /* "apic_c1e" AMD Erratum 400 */
+#define X86_BUG_11AP			X86_BUG(5) /* Bad local APIC aka 11AP */
+#define X86_BUG_FXSAVE_LEAK		X86_BUG(6) /* FXSAVE leaks FOP/FIP/FOP */
+#define X86_BUG_CLFLUSH_MONITOR		X86_BUG(7) /* AAI65, CLFLUSH required before MONITOR */
+#define X86_BUG_SYSRET_SS_ATTRS		X86_BUG(8) /* SYSRET doesn't fix up SS attrs */
 #ifdef CONFIG_X86_32
 /*
  * 64-bit kernels don't use X86_BUG_ESPFIX.  Make the define conditional
  * to avoid confusion.
  */
-#define X86_BUG_ESPFIX		X86_BUG(9) /* "" IRET to 16-bit SS corrupts ESP/RSP high bits */
+#define X86_BUG_ESPFIX			X86_BUG(9) /* "" IRET to 16-bit SS corrupts ESP/RSP high bits */
 #endif
-#define X86_BUG_NULL_SEG	X86_BUG(10) /* Nulling a selector preserves the base */
-#define X86_BUG_SWAPGS_FENCE	X86_BUG(11) /* SWAPGS without input dep on GS */
-#define X86_BUG_MONITOR		X86_BUG(12) /* IPI required to wake up remote CPU */
-#define X86_BUG_AMD_E400	X86_BUG(13) /* CPU is among the affected by Erratum 400 */
+#define X86_BUG_NULL_SEG		X86_BUG(10) /* Nulling a selector preserves the base */
+#define X86_BUG_SWAPGS_FENCE		X86_BUG(11) /* SWAPGS without input dep on GS */
+#define X86_BUG_MONITOR			X86_BUG(12) /* IPI required to wake up remote CPU */
+#define X86_BUG_AMD_E400		X86_BUG(13) /* CPU is among the affected by Erratum 400 */
 #endif /* _ASM_X86_CPUFEATURES_H */
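
Reader's note: each definition above packs a capability-word index and a bit position into one integer (word*32 + bit), and X86_BUG(x) places the bug bits after the last feature word. The following is a minimal sketch of how such a value can be decoded, not the kernel's own accessor code. feature_word(), feature_bit(), set_feature() and has_feature() are hypothetical helpers, and NCAPINTS is assumed to be 18 to match words 0..17 shown in the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 18 capability words (0..17), the highest word in the diff is 17. */
#define NCAPINTS		18

/* Values copied from the definitions in the diff above. */
#define X86_FEATURE_RDPID	(16*32+22)
#define X86_FEATURE_SMCA	(17*32+ 3)
#define X86_BUG(x)		(NCAPINTS*32 + (x))
#define X86_BUG_FDIV		X86_BUG(1)

/* Hypothetical helpers for illustration only, not kernel API. */
static inline unsigned int feature_word(unsigned int f) { return f / 32; }
static inline unsigned int feature_bit(unsigned int f)  { return f % 32; }

/* Set one flag in a caller-supplied array of 32-bit capability words. */
static inline void set_feature(uint32_t *caps, unsigned int f)
{
	assert(caps != NULL);
	caps[feature_word(f)] |= UINT32_C(1) << feature_bit(f);
}

/* Test one flag in the same array; returns 1 if set, 0 otherwise. */
static inline int has_feature(const uint32_t *caps, unsigned int f)
{
	assert(caps != NULL);
	return (int)((caps[feature_word(f)] >> feature_bit(f)) & 1u);
}
```

With these helpers, X86_FEATURE_RDPID decodes to word 16, bit 22, and X86_BUG_FDIV lands in the first word after the feature words. The caller's array needs NCAPINTS entries plus however many bug words are in use.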

* [PATCH 4.14 060/159] x86/cpufeatures: Fix various details in the feature definitions
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (58 preceding siblings ...)
  2017-12-22  8:45 ` [PATCH 4.14 059/159] x86/cpufeatures: Re-tabulate the X86_FEATURE definitions Greg Kroah-Hartman
@ 2017-12-22  8:45 ` Greg Kroah-Hartman
  2017-12-22  8:45 ` [PATCH 4.14 061/159] selftests/x86/ldt_gdt: Add infrastructure to test set_thread_area() Greg Kroah-Hartman
                   ` (105 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Andrew Morton, Andy Lutomirski,
	Andy Lutomirski, Borislav Petkov, Brian Gerst, Denys Vlasenko,
	Josh Poimboeuf, Linus Torvalds, Peter Zijlstra, Thomas Gleixner,
	Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Ingo Molnar <mingo@kernel.org>

commit f3a624e901c633593156f7b00ca743a6204a29bc upstream.

Kept this commit separate from the re-tabulation changes, to make
the changes easier to review:

 - add better explanation for entries with no explanation
 - fix/enhance the text of some of the entries
 - fix the vertical alignment of some of the feature number definitions
 - fix inconsistent capitalization
 - ... and lots of other small details

i.e. make it all more of a coherent unit, instead of a patchwork of years of additions.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20171031121723.28524-4-mingo@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/include/asm/cpufeatures.h |  149 ++++++++++++++++++-------------------
 1 file changed, 74 insertions(+), 75 deletions(-)

--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -20,14 +20,12 @@
  * Note: If the comment begins with a quoted string, that string is used
  * in /proc/cpuinfo instead of the macro name.  If the string is "",
  * this feature bit is not displayed in /proc/cpuinfo at all.
- */
-
-/*
+ *
  * When adding new features here that depend on other features,
- * please update the table in kernel/cpu/cpuid-deps.c
+ * please update the table in kernel/cpu/cpuid-deps.c as well.
  */
 
-/* Intel-defined CPU features, CPUID level 0x00000001 (edx), word 0 */
+/* Intel-defined CPU features, CPUID level 0x00000001 (EDX), word 0 */
 #define X86_FEATURE_FPU			( 0*32+ 0) /* Onboard FPU */
 #define X86_FEATURE_VME			( 0*32+ 1) /* Virtual Mode Extensions */
 #define X86_FEATURE_DE			( 0*32+ 2) /* Debugging Extensions */
@@ -42,8 +40,7 @@
 #define X86_FEATURE_MTRR		( 0*32+12) /* Memory Type Range Registers */
 #define X86_FEATURE_PGE			( 0*32+13) /* Page Global Enable */
 #define X86_FEATURE_MCA			( 0*32+14) /* Machine Check Architecture */
-#define X86_FEATURE_CMOV		( 0*32+15) /* CMOV instructions */
-					  /* (plus FCMOVcc, FCOMI with FPU) */
+#define X86_FEATURE_CMOV		( 0*32+15) /* CMOV instructions (plus FCMOVcc, FCOMI with FPU) */
 #define X86_FEATURE_PAT			( 0*32+16) /* Page Attribute Table */
 #define X86_FEATURE_PSE36		( 0*32+17) /* 36-bit PSEs */
 #define X86_FEATURE_PN			( 0*32+18) /* Processor serial number */
@@ -63,15 +60,15 @@
 /* AMD-defined CPU features, CPUID level 0x80000001, word 1 */
 /* Don't duplicate feature flags which are redundant with Intel! */
 #define X86_FEATURE_SYSCALL		( 1*32+11) /* SYSCALL/SYSRET */
-#define X86_FEATURE_MP			( 1*32+19) /* MP Capable. */
+#define X86_FEATURE_MP			( 1*32+19) /* MP Capable */
 #define X86_FEATURE_NX			( 1*32+20) /* Execute Disable */
 #define X86_FEATURE_MMXEXT		( 1*32+22) /* AMD MMX extensions */
 #define X86_FEATURE_FXSR_OPT		( 1*32+25) /* FXSAVE/FXRSTOR optimizations */
 #define X86_FEATURE_GBPAGES		( 1*32+26) /* "pdpe1gb" GB pages */
 #define X86_FEATURE_RDTSCP		( 1*32+27) /* RDTSCP */
-#define X86_FEATURE_LM			( 1*32+29) /* Long Mode (x86-64) */
-#define X86_FEATURE_3DNOWEXT		( 1*32+30) /* AMD 3DNow! extensions */
-#define X86_FEATURE_3DNOW		( 1*32+31) /* 3DNow! */
+#define X86_FEATURE_LM			( 1*32+29) /* Long Mode (x86-64, 64-bit support) */
+#define X86_FEATURE_3DNOWEXT		( 1*32+30) /* AMD 3DNow extensions */
+#define X86_FEATURE_3DNOW		( 1*32+31) /* 3DNow */
 
 /* Transmeta-defined CPU features, CPUID level 0x80860001, word 2 */
 #define X86_FEATURE_RECOVERY		( 2*32+ 0) /* CPU in recovery mode */
@@ -84,66 +81,67 @@
 #define X86_FEATURE_K6_MTRR		( 3*32+ 1) /* AMD K6 nonstandard MTRRs */
 #define X86_FEATURE_CYRIX_ARR		( 3*32+ 2) /* Cyrix ARRs (= MTRRs) */
 #define X86_FEATURE_CENTAUR_MCR		( 3*32+ 3) /* Centaur MCRs (= MTRRs) */
-/* cpu types for specific tunings: */
+
+/* CPU types for specific tunings: */
 #define X86_FEATURE_K8			( 3*32+ 4) /* "" Opteron, Athlon64 */
 #define X86_FEATURE_K7			( 3*32+ 5) /* "" Athlon */
 #define X86_FEATURE_P3			( 3*32+ 6) /* "" P3 */
 #define X86_FEATURE_P4			( 3*32+ 7) /* "" P4 */
 #define X86_FEATURE_CONSTANT_TSC	( 3*32+ 8) /* TSC ticks at a constant rate */
-#define X86_FEATURE_UP			( 3*32+ 9) /* smp kernel running on up */
-#define X86_FEATURE_ART			( 3*32+10) /* Platform has always running timer (ART) */
+#define X86_FEATURE_UP			( 3*32+ 9) /* SMP kernel running on UP */
+#define X86_FEATURE_ART			( 3*32+10) /* Always running timer (ART) */
 #define X86_FEATURE_ARCH_PERFMON	( 3*32+11) /* Intel Architectural PerfMon */
 #define X86_FEATURE_PEBS		( 3*32+12) /* Precise-Event Based Sampling */
 #define X86_FEATURE_BTS			( 3*32+13) /* Branch Trace Store */
-#define X86_FEATURE_SYSCALL32		( 3*32+14) /* "" syscall in ia32 userspace */
-#define X86_FEATURE_SYSENTER32		( 3*32+15) /* "" sysenter in ia32 userspace */
-#define X86_FEATURE_REP_GOOD		( 3*32+16) /* rep microcode works well */
-#define X86_FEATURE_MFENCE_RDTSC	( 3*32+17) /* "" Mfence synchronizes RDTSC */
-#define X86_FEATURE_LFENCE_RDTSC	( 3*32+18) /* "" Lfence synchronizes RDTSC */
+#define X86_FEATURE_SYSCALL32		( 3*32+14) /* "" syscall in IA32 userspace */
+#define X86_FEATURE_SYSENTER32		( 3*32+15) /* "" sysenter in IA32 userspace */
+#define X86_FEATURE_REP_GOOD		( 3*32+16) /* REP microcode works well */
+#define X86_FEATURE_MFENCE_RDTSC	( 3*32+17) /* "" MFENCE synchronizes RDTSC */
+#define X86_FEATURE_LFENCE_RDTSC	( 3*32+18) /* "" LFENCE synchronizes RDTSC */
 #define X86_FEATURE_ACC_POWER		( 3*32+19) /* AMD Accumulated Power Mechanism */
 #define X86_FEATURE_NOPL		( 3*32+20) /* The NOPL (0F 1F) instructions */
 #define X86_FEATURE_ALWAYS		( 3*32+21) /* "" Always-present feature */
-#define X86_FEATURE_XTOPOLOGY		( 3*32+22) /* cpu topology enum extensions */
+#define X86_FEATURE_XTOPOLOGY		( 3*32+22) /* CPU topology enum extensions */
 #define X86_FEATURE_TSC_RELIABLE	( 3*32+23) /* TSC is known to be reliable */
 #define X86_FEATURE_NONSTOP_TSC		( 3*32+24) /* TSC does not stop in C states */
 #define X86_FEATURE_CPUID		( 3*32+25) /* CPU has CPUID instruction itself */
-#define X86_FEATURE_EXTD_APICID		( 3*32+26) /* has extended APICID (8 bits) */
-#define X86_FEATURE_AMD_DCM		( 3*32+27) /* multi-node processor */
-#define X86_FEATURE_APERFMPERF		( 3*32+28) /* APERFMPERF */
+#define X86_FEATURE_EXTD_APICID		( 3*32+26) /* Extended APICID (8 bits) */
+#define X86_FEATURE_AMD_DCM		( 3*32+27) /* AMD multi-node processor */
+#define X86_FEATURE_APERFMPERF		( 3*32+28) /* P-State hardware coordination feedback capability (APERF/MPERF MSRs) */
 #define X86_FEATURE_NONSTOP_TSC_S3	( 3*32+30) /* TSC doesn't stop in S3 state */
 #define X86_FEATURE_TSC_KNOWN_FREQ	( 3*32+31) /* TSC has known frequency */
 
-/* Intel-defined CPU features, CPUID level 0x00000001 (ecx), word 4 */
+/* Intel-defined CPU features, CPUID level 0x00000001 (ECX), word 4 */
 #define X86_FEATURE_XMM3		( 4*32+ 0) /* "pni" SSE-3 */
 #define X86_FEATURE_PCLMULQDQ		( 4*32+ 1) /* PCLMULQDQ instruction */
 #define X86_FEATURE_DTES64		( 4*32+ 2) /* 64-bit Debug Store */
-#define X86_FEATURE_MWAIT		( 4*32+ 3) /* "monitor" Monitor/Mwait support */
-#define X86_FEATURE_DSCPL		( 4*32+ 4) /* "ds_cpl" CPL Qual. Debug Store */
+#define X86_FEATURE_MWAIT		( 4*32+ 3) /* "monitor" MONITOR/MWAIT support */
+#define X86_FEATURE_DSCPL		( 4*32+ 4) /* "ds_cpl" CPL-qualified (filtered) Debug Store */
 #define X86_FEATURE_VMX			( 4*32+ 5) /* Hardware virtualization */
-#define X86_FEATURE_SMX			( 4*32+ 6) /* Safer mode */
+#define X86_FEATURE_SMX			( 4*32+ 6) /* Safer Mode eXtensions */
 #define X86_FEATURE_EST			( 4*32+ 7) /* Enhanced SpeedStep */
 #define X86_FEATURE_TM2			( 4*32+ 8) /* Thermal Monitor 2 */
 #define X86_FEATURE_SSSE3		( 4*32+ 9) /* Supplemental SSE-3 */
 #define X86_FEATURE_CID			( 4*32+10) /* Context ID */
 #define X86_FEATURE_SDBG		( 4*32+11) /* Silicon Debug */
 #define X86_FEATURE_FMA			( 4*32+12) /* Fused multiply-add */
-#define X86_FEATURE_CX16		( 4*32+13) /* CMPXCHG16B */
+#define X86_FEATURE_CX16		( 4*32+13) /* CMPXCHG16B instruction */
 #define X86_FEATURE_XTPR		( 4*32+14) /* Send Task Priority Messages */
-#define X86_FEATURE_PDCM		( 4*32+15) /* Performance Capabilities */
+#define X86_FEATURE_PDCM		( 4*32+15) /* Perf/Debug Capabilities MSR */
 #define X86_FEATURE_PCID		( 4*32+17) /* Process Context Identifiers */
 #define X86_FEATURE_DCA			( 4*32+18) /* Direct Cache Access */
 #define X86_FEATURE_XMM4_1		( 4*32+19) /* "sse4_1" SSE-4.1 */
 #define X86_FEATURE_XMM4_2		( 4*32+20) /* "sse4_2" SSE-4.2 */
-#define X86_FEATURE_X2APIC		( 4*32+21) /* x2APIC */
+#define X86_FEATURE_X2APIC		( 4*32+21) /* X2APIC */
 #define X86_FEATURE_MOVBE		( 4*32+22) /* MOVBE instruction */
 #define X86_FEATURE_POPCNT		( 4*32+23) /* POPCNT instruction */
-#define X86_FEATURE_TSC_DEADLINE_TIMER	( 4*32+24) /* Tsc deadline timer */
+#define X86_FEATURE_TSC_DEADLINE_TIMER	( 4*32+24) /* TSC deadline timer */
 #define X86_FEATURE_AES			( 4*32+25) /* AES instructions */
-#define X86_FEATURE_XSAVE		( 4*32+26) /* XSAVE/XRSTOR/XSETBV/XGETBV */
-#define X86_FEATURE_OSXSAVE		( 4*32+27) /* "" XSAVE enabled in the OS */
+#define X86_FEATURE_XSAVE		( 4*32+26) /* XSAVE/XRSTOR/XSETBV/XGETBV instructions */
+#define X86_FEATURE_OSXSAVE		( 4*32+27) /* "" XSAVE instruction enabled in the OS */
 #define X86_FEATURE_AVX			( 4*32+28) /* Advanced Vector Extensions */
-#define X86_FEATURE_F16C		( 4*32+29) /* 16-bit fp conversions */
-#define X86_FEATURE_RDRAND		( 4*32+30) /* The RDRAND instruction */
+#define X86_FEATURE_F16C		( 4*32+29) /* 16-bit FP conversions */
+#define X86_FEATURE_RDRAND		( 4*32+30) /* RDRAND instruction */
 #define X86_FEATURE_HYPERVISOR		( 4*32+31) /* Running on a hypervisor */
 
 /* VIA/Cyrix/Centaur-defined CPU features, CPUID level 0xC0000001, word 5 */
@@ -158,10 +156,10 @@
 #define X86_FEATURE_PMM			( 5*32+12) /* PadLock Montgomery Multiplier */
 #define X86_FEATURE_PMM_EN		( 5*32+13) /* PMM enabled */
 
-/* More extended AMD flags: CPUID level 0x80000001, ecx, word 6 */
+/* More extended AMD flags: CPUID level 0x80000001, ECX, word 6 */
 #define X86_FEATURE_LAHF_LM		( 6*32+ 0) /* LAHF/SAHF in long mode */
 #define X86_FEATURE_CMP_LEGACY		( 6*32+ 1) /* If yes HyperThreading not valid */
-#define X86_FEATURE_SVM			( 6*32+ 2) /* Secure virtual machine */
+#define X86_FEATURE_SVM			( 6*32+ 2) /* Secure Virtual Machine */
 #define X86_FEATURE_EXTAPIC		( 6*32+ 3) /* Extended APIC space */
 #define X86_FEATURE_CR8_LEGACY		( 6*32+ 4) /* CR8 in 32-bit mode */
 #define X86_FEATURE_ABM			( 6*32+ 5) /* Advanced bit manipulation */
@@ -175,16 +173,16 @@
 #define X86_FEATURE_WDT			( 6*32+13) /* Watchdog timer */
 #define X86_FEATURE_LWP			( 6*32+15) /* Light Weight Profiling */
 #define X86_FEATURE_FMA4		( 6*32+16) /* 4 operands MAC instructions */
-#define X86_FEATURE_TCE			( 6*32+17) /* translation cache extension */
+#define X86_FEATURE_TCE			( 6*32+17) /* Translation Cache Extension */
 #define X86_FEATURE_NODEID_MSR		( 6*32+19) /* NodeId MSR */
-#define X86_FEATURE_TBM			( 6*32+21) /* trailing bit manipulations */
-#define X86_FEATURE_TOPOEXT		( 6*32+22) /* topology extensions CPUID leafs */
-#define X86_FEATURE_PERFCTR_CORE	( 6*32+23) /* core performance counter extensions */
+#define X86_FEATURE_TBM			( 6*32+21) /* Trailing Bit Manipulations */
+#define X86_FEATURE_TOPOEXT		( 6*32+22) /* Topology extensions CPUID leafs */
+#define X86_FEATURE_PERFCTR_CORE	( 6*32+23) /* Core performance counter extensions */
 #define X86_FEATURE_PERFCTR_NB		( 6*32+24) /* NB performance counter extensions */
-#define X86_FEATURE_BPEXT		(6*32+26) /* data breakpoint extension */
-#define X86_FEATURE_PTSC		( 6*32+27) /* performance time-stamp counter */
+#define X86_FEATURE_BPEXT		( 6*32+26) /* Data breakpoint extension */
+#define X86_FEATURE_PTSC		( 6*32+27) /* Performance time-stamp counter */
 #define X86_FEATURE_PERFCTR_LLC		( 6*32+28) /* Last Level Cache performance counter extensions */
-#define X86_FEATURE_MWAITX		( 6*32+29) /* MWAIT extension (MONITORX/MWAITX) */
+#define X86_FEATURE_MWAITX		( 6*32+29) /* MWAIT extension (MONITORX/MWAITX instructions) */
 
 /*
  * Auxiliary flags: Linux defined - For features scattered in various
@@ -192,7 +190,7 @@
  *
  * Reuse free bits when adding new feature flags!
  */
-#define X86_FEATURE_RING3MWAIT		( 7*32+ 0) /* Ring 3 MONITOR/MWAIT */
+#define X86_FEATURE_RING3MWAIT		( 7*32+ 0) /* Ring 3 MONITOR/MWAIT instructions */
 #define X86_FEATURE_CPUID_FAULT		( 7*32+ 1) /* Intel CPUID faulting */
 #define X86_FEATURE_CPB			( 7*32+ 2) /* AMD Core Performance Boost */
 #define X86_FEATURE_EPB			( 7*32+ 3) /* IA32_ENERGY_PERF_BIAS support */
@@ -206,8 +204,8 @@
 
 #define X86_FEATURE_INTEL_PPIN		( 7*32+14) /* Intel Processor Inventory Number */
 #define X86_FEATURE_INTEL_PT		( 7*32+15) /* Intel Processor Trace */
-#define X86_FEATURE_AVX512_4VNNIW	(7*32+16) /* AVX-512 Neural Network Instructions */
-#define X86_FEATURE_AVX512_4FMAPS	(7*32+17) /* AVX-512 Multiply Accumulation Single precision */
+#define X86_FEATURE_AVX512_4VNNIW	( 7*32+16) /* AVX-512 Neural Network Instructions */
+#define X86_FEATURE_AVX512_4FMAPS	( 7*32+17) /* AVX-512 Multiply Accumulation Single precision */
 
 #define X86_FEATURE_MBA			( 7*32+18) /* Memory Bandwidth Allocation */
 
@@ -218,19 +216,19 @@
 #define X86_FEATURE_EPT			( 8*32+ 3) /* Intel Extended Page Table */
 #define X86_FEATURE_VPID		( 8*32+ 4) /* Intel Virtual Processor ID */
 
-#define X86_FEATURE_VMMCALL		( 8*32+15) /* Prefer vmmcall to vmcall */
+#define X86_FEATURE_VMMCALL		( 8*32+15) /* Prefer VMMCALL to VMCALL */
 #define X86_FEATURE_XENPV		( 8*32+16) /* "" Xen paravirtual guest */
 
 
-/* Intel-defined CPU features, CPUID level 0x00000007:0 (ebx), word 9 */
-#define X86_FEATURE_FSGSBASE		( 9*32+ 0) /* {RD/WR}{FS/GS}BASE instructions*/
-#define X86_FEATURE_TSC_ADJUST		( 9*32+ 1) /* TSC adjustment MSR 0x3b */
+/* Intel-defined CPU features, CPUID level 0x00000007:0 (EBX), word 9 */
+#define X86_FEATURE_FSGSBASE		( 9*32+ 0) /* RDFSBASE, WRFSBASE, RDGSBASE, WRGSBASE instructions*/
+#define X86_FEATURE_TSC_ADJUST		( 9*32+ 1) /* TSC adjustment MSR 0x3B */
 #define X86_FEATURE_BMI1		( 9*32+ 3) /* 1st group bit manipulation extensions */
 #define X86_FEATURE_HLE			( 9*32+ 4) /* Hardware Lock Elision */
 #define X86_FEATURE_AVX2		( 9*32+ 5) /* AVX2 instructions */
 #define X86_FEATURE_SMEP		( 9*32+ 7) /* Supervisor Mode Execution Protection */
 #define X86_FEATURE_BMI2		( 9*32+ 8) /* 2nd group bit manipulation extensions */
-#define X86_FEATURE_ERMS		( 9*32+ 9) /* Enhanced REP MOVSB/STOSB */
+#define X86_FEATURE_ERMS		( 9*32+ 9) /* Enhanced REP MOVSB/STOSB instructions */
 #define X86_FEATURE_INVPCID		( 9*32+10) /* Invalidate Processor Context ID */
 #define X86_FEATURE_RTM			( 9*32+11) /* Restricted Transactional Memory */
 #define X86_FEATURE_CQM			( 9*32+12) /* Cache QoS Monitoring */
@@ -238,8 +236,8 @@
 #define X86_FEATURE_RDT_A		( 9*32+15) /* Resource Director Technology Allocation */
 #define X86_FEATURE_AVX512F		( 9*32+16) /* AVX-512 Foundation */
 #define X86_FEATURE_AVX512DQ		( 9*32+17) /* AVX-512 DQ (Double/Quad granular) Instructions */
-#define X86_FEATURE_RDSEED		( 9*32+18) /* The RDSEED instruction */
-#define X86_FEATURE_ADX			( 9*32+19) /* The ADCX and ADOX instructions */
+#define X86_FEATURE_RDSEED		( 9*32+18) /* RDSEED instruction */
+#define X86_FEATURE_ADX			( 9*32+19) /* ADCX and ADOX instructions */
 #define X86_FEATURE_SMAP		( 9*32+20) /* Supervisor Mode Access Prevention */
 #define X86_FEATURE_AVX512IFMA		( 9*32+21) /* AVX-512 Integer Fused Multiply-Add instructions */
 #define X86_FEATURE_CLFLUSHOPT		( 9*32+23) /* CLFLUSHOPT instruction */
@@ -251,25 +249,25 @@
 #define X86_FEATURE_AVX512BW		( 9*32+30) /* AVX-512 BW (Byte/Word granular) Instructions */
 #define X86_FEATURE_AVX512VL		( 9*32+31) /* AVX-512 VL (128/256 Vector Length) Extensions */
 
-/* Extended state features, CPUID level 0x0000000d:1 (eax), word 10 */
-#define X86_FEATURE_XSAVEOPT		(10*32+ 0) /* XSAVEOPT */
-#define X86_FEATURE_XSAVEC		(10*32+ 1) /* XSAVEC */
-#define X86_FEATURE_XGETBV1		(10*32+ 2) /* XGETBV with ECX = 1 */
-#define X86_FEATURE_XSAVES		(10*32+ 3) /* XSAVES/XRSTORS */
+/* Extended state features, CPUID level 0x0000000d:1 (EAX), word 10 */
+#define X86_FEATURE_XSAVEOPT		(10*32+ 0) /* XSAVEOPT instruction */
+#define X86_FEATURE_XSAVEC		(10*32+ 1) /* XSAVEC instruction */
+#define X86_FEATURE_XGETBV1		(10*32+ 2) /* XGETBV with ECX = 1 instruction */
+#define X86_FEATURE_XSAVES		(10*32+ 3) /* XSAVES/XRSTORS instructions */
 
-/* Intel-defined CPU QoS Sub-leaf, CPUID level 0x0000000F:0 (edx), word 11 */
+/* Intel-defined CPU QoS Sub-leaf, CPUID level 0x0000000F:0 (EDX), word 11 */
 #define X86_FEATURE_CQM_LLC		(11*32+ 1) /* LLC QoS if 1 */
 
-/* Intel-defined CPU QoS Sub-leaf, CPUID level 0x0000000F:1 (edx), word 12 */
-#define X86_FEATURE_CQM_OCCUP_LLC	(12*32+ 0) /* LLC occupancy monitoring if 1 */
+/* Intel-defined CPU QoS Sub-leaf, CPUID level 0x0000000F:1 (EDX), word 12 */
+#define X86_FEATURE_CQM_OCCUP_LLC	(12*32+ 0) /* LLC occupancy monitoring */
 #define X86_FEATURE_CQM_MBM_TOTAL	(12*32+ 1) /* LLC Total MBM monitoring */
 #define X86_FEATURE_CQM_MBM_LOCAL	(12*32+ 2) /* LLC Local MBM monitoring */
 
-/* AMD-defined CPU features, CPUID level 0x80000008 (ebx), word 13 */
-#define X86_FEATURE_CLZERO		(13*32+0) /* CLZERO instruction */
-#define X86_FEATURE_IRPERF		(13*32+1) /* Instructions Retired Count */
+/* AMD-defined CPU features, CPUID level 0x80000008 (EBX), word 13 */
+#define X86_FEATURE_CLZERO		(13*32+ 0) /* CLZERO instruction */
+#define X86_FEATURE_IRPERF		(13*32+ 1) /* Instructions Retired Count */
 
-/* Thermal and Power Management Leaf, CPUID level 0x00000006 (eax), word 14 */
+/* Thermal and Power Management Leaf, CPUID level 0x00000006 (EAX), word 14 */
 #define X86_FEATURE_DTHERM		(14*32+ 0) /* Digital Thermal Sensor */
 #define X86_FEATURE_IDA			(14*32+ 1) /* Intel Dynamic Acceleration */
 #define X86_FEATURE_ARAT		(14*32+ 2) /* Always Running APIC Timer */
@@ -281,7 +279,7 @@
 #define X86_FEATURE_HWP_EPP		(14*32+10) /* HWP Energy Perf. Preference */
 #define X86_FEATURE_HWP_PKG_REQ		(14*32+11) /* HWP Package Level Request */
 
-/* AMD SVM Feature Identification, CPUID level 0x8000000a (edx), word 15 */
+/* AMD SVM Feature Identification, CPUID level 0x8000000a (EDX), word 15 */
 #define X86_FEATURE_NPT			(15*32+ 0) /* Nested Page Table support */
 #define X86_FEATURE_LBRV		(15*32+ 1) /* LBR Virtualization support */
 #define X86_FEATURE_SVML		(15*32+ 2) /* "svm_lock" SVM locking MSR */
@@ -296,24 +294,24 @@
 #define X86_FEATURE_V_VMSAVE_VMLOAD	(15*32+15) /* Virtual VMSAVE VMLOAD */
 #define X86_FEATURE_VGIF		(15*32+16) /* Virtual GIF */
 
-/* Intel-defined CPU features, CPUID level 0x00000007:0 (ecx), word 16 */
+/* Intel-defined CPU features, CPUID level 0x00000007:0 (ECX), word 16 */
 #define X86_FEATURE_AVX512VBMI		(16*32+ 1) /* AVX512 Vector Bit Manipulation instructions*/
 #define X86_FEATURE_PKU			(16*32+ 3) /* Protection Keys for Userspace */
 #define X86_FEATURE_OSPKE		(16*32+ 4) /* OS Protection Keys Enable */
 #define X86_FEATURE_AVX512_VBMI2	(16*32+ 6) /* Additional AVX512 Vector Bit Manipulation Instructions */
 #define X86_FEATURE_GFNI		(16*32+ 8) /* Galois Field New Instructions */
 #define X86_FEATURE_VAES		(16*32+ 9) /* Vector AES */
-#define X86_FEATURE_VPCLMULQDQ		(16*32+ 10) /* Carry-Less Multiplication Double Quadword */
-#define X86_FEATURE_AVX512_VNNI		(16*32+ 11) /* Vector Neural Network Instructions */
-#define X86_FEATURE_AVX512_BITALG	(16*32+12) /* Support for VPOPCNT[B,W] and VPSHUF-BITQMB */
+#define X86_FEATURE_VPCLMULQDQ		(16*32+10) /* Carry-Less Multiplication Double Quadword */
+#define X86_FEATURE_AVX512_VNNI		(16*32+11) /* Vector Neural Network Instructions */
+#define X86_FEATURE_AVX512_BITALG	(16*32+12) /* Support for VPOPCNT[B,W] and VPSHUF-BITQMB instructions */
 #define X86_FEATURE_AVX512_VPOPCNTDQ	(16*32+14) /* POPCNT for vectors of DW/QW */
 #define X86_FEATURE_LA57		(16*32+16) /* 5-level page tables */
 #define X86_FEATURE_RDPID		(16*32+22) /* RDPID instruction */
 
-/* AMD-defined CPU features, CPUID level 0x80000007 (ebx), word 17 */
-#define X86_FEATURE_OVERFLOW_RECOV	(17*32+0) /* MCA overflow recovery support */
-#define X86_FEATURE_SUCCOR		(17*32+1) /* Uncorrectable error containment and recovery */
-#define X86_FEATURE_SMCA		(17*32+3) /* Scalable MCA */
+/* AMD-defined CPU features, CPUID level 0x80000007 (EBX), word 17 */
+#define X86_FEATURE_OVERFLOW_RECOV	(17*32+ 0) /* MCA overflow recovery support */
+#define X86_FEATURE_SUCCOR		(17*32+ 1) /* Uncorrectable error containment and recovery */
+#define X86_FEATURE_SMCA		(17*32+ 3) /* Scalable MCA */
 
 /*
  * BUG word(s)
@@ -340,4 +338,5 @@
 #define X86_BUG_SWAPGS_FENCE		X86_BUG(11) /* SWAPGS without input dep on GS */
 #define X86_BUG_MONITOR			X86_BUG(12) /* IPI required to wake up remote CPU */
 #define X86_BUG_AMD_E400		X86_BUG(13) /* CPU is among the affected by Erratum 400 */
+
 #endif /* _ASM_X86_CPUFEATURES_H */
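
Reader's note: the header comment in this file says that a quoted string at the start of a definition's comment replaces the macro name in /proc/cpuinfo, and that "" hides the flag entirely. A rough sketch of that rule is below. cpuinfo_name() is a hypothetical helper, not the kernel's generator script, and the lower-casing of unquoted macro names is an assumption about how the default name is formed.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: derive the /proc/cpuinfo name for one definition.
 * `macro` is the macro name without the X86_FEATURE_ prefix and `comment`
 * is the text inside the trailing comment.
 * Returns 1 and fills `out` if the flag is displayed, 0 if it is hidden.
 */
static inline int cpuinfo_name(const char *macro, const char *comment,
			       char *out, size_t outlen)
{
	const char *end = NULL;
	size_t i = 0;

	assert(out != NULL && outlen > 0);

	/* Skip leading blanks inside the comment. */
	while (*comment == ' ' || *comment == '\t')
		comment++;

	/* A leading quoted string overrides the macro name. */
	if (*comment == '"')
		end = strchr(comment + 1, '"');

	if (end) {
		if (end == comment + 1)
			return 0;	/* "" means: do not display */
		for (const char *p = comment + 1; p < end && i + 1 < outlen; p++)
			out[i++] = *p;
		out[i] = '\0';
		return 1;
	}

	/* Assumption: unquoted entries show the lower-cased macro name. */
	for (; macro[i] != '\0' && i + 1 < outlen; i++)
		out[i] = (char)tolower((unsigned char)macro[i]);
	out[i] = '\0';
	return 1;
}
```

Applied to entries from the diff, GBPAGES with its "pdpe1gb" comment would be shown as pdpe1gb, while K8 with its "" comment would not be shown at all.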

* [PATCH 4.14 061/159] selftests/x86/ldt_gdt: Add infrastructure to test set_thread_area()
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (59 preceding siblings ...)
  2017-12-22  8:45 ` [PATCH 4.14 060/159] x86/cpufeatures: Fix various details in the feature definitions Greg Kroah-Hartman
@ 2017-12-22  8:45 ` Greg Kroah-Hartman
  2017-12-22  8:45 ` [PATCH 4.14 062/159] selftests/x86/ldt_gdt: Run most existing LDT test cases against the GDT as well Greg Kroah-Hartman
                   ` (104 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Andy Lutomirski, Borislav Petkov,
	Linus Torvalds, Peter Zijlstra, Thomas Gleixner, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <luto@kernel.org>

commit d744dcad39094c9187075e274d1cdef79c57c8b5 upstream.

Much of the test design could apply to set_thread_area() (i.e. GDT),
not just modify_ldt().  Add set_thread_area() to the
install_valid_mode() helper.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/02c23f8fba5547007f741dc24c3926e5284ede02.1509794321.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 tools/testing/selftests/x86/ldt_gdt.c |   53 +++++++++++++++++++++++-----------
 1 file changed, 37 insertions(+), 16 deletions(-)

--- a/tools/testing/selftests/x86/ldt_gdt.c
+++ b/tools/testing/selftests/x86/ldt_gdt.c
@@ -137,30 +137,51 @@ static void check_valid_segment(uint16_t
 	}
 }
 
-static bool install_valid_mode(const struct user_desc *desc, uint32_t ar,
-			       bool oldmode)
+static bool install_valid_mode(const struct user_desc *d, uint32_t ar,
+			       bool oldmode, bool ldt)
 {
-	int ret = syscall(SYS_modify_ldt, oldmode ? 1 : 0x11,
-			  desc, sizeof(*desc));
-	if (ret < -1)
-		errno = -ret;
+	struct user_desc desc = *d;
+	int ret;
+
+	if (!ldt) {
+#ifndef __i386__
+		/* No point testing set_thread_area in a 64-bit build */
+		return false;
+#endif
+		if (!gdt_entry_num)
+			return false;
+		desc.entry_number = gdt_entry_num;
+
+		ret = syscall(SYS_set_thread_area, &desc);
+	} else {
+		ret = syscall(SYS_modify_ldt, oldmode ? 1 : 0x11,
+			      &desc, sizeof(desc));
+
+		if (ret < -1)
+			errno = -ret;
+
+		if (ret != 0 && errno == ENOSYS) {
+			printf("[OK]\tmodify_ldt returned -ENOSYS\n");
+			return false;
+		}
+	}
+
 	if (ret == 0) {
-		uint32_t limit = desc->limit;
-		if (desc->limit_in_pages)
+		uint32_t limit = desc.limit;
+		if (desc.limit_in_pages)
 			limit = (limit << 12) + 4095;
-		check_valid_segment(desc->entry_number, 1, ar, limit, true);
+		check_valid_segment(desc.entry_number, ldt, ar, limit, true);
 		return true;
-	} else if (errno == ENOSYS) {
-		printf("[OK]\tmodify_ldt returned -ENOSYS\n");
-		return false;
 	} else {
-		if (desc->seg_32bit) {
-			printf("[FAIL]\tUnexpected modify_ldt failure %d\n",
+		if (desc.seg_32bit) {
+			printf("[FAIL]\tUnexpected %s failure %d\n",
+			       ldt ? "modify_ldt" : "set_thread_area",
 			       errno);
 			nerrs++;
 			return false;
 		} else {
-			printf("[OK]\tmodify_ldt rejected 16 bit segment\n");
+			printf("[OK]\t%s rejected 16 bit segment\n",
+			       ldt ? "modify_ldt" : "set_thread_area");
 			return false;
 		}
 	}
@@ -168,7 +189,7 @@ static bool install_valid_mode(const str
 
 static bool install_valid(const struct user_desc *desc, uint32_t ar)
 {
-	return install_valid_mode(desc, ar, false);
+	return install_valid_mode(desc, ar, false, true);
 }
 
 static void install_invalid(const struct user_desc *desc, bool oldmode)
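
Reader's note: on success, install_valid_mode() converts the descriptor's limit to a byte limit before checking the segment. When limit_in_pages is set, the limit counts 4 KiB pages, so the last valid byte offset is (limit << 12) + 4095. A small sketch of that arithmetic is below. effective_limit() is a hypothetical name, and the 20-bit bound on page-granular limits is an assumption based on the x86 descriptor format rather than something stated in the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper mirroring the limit computation in the test:
 * a byte-granular limit is used as-is, a page-granular limit is
 * scaled by 4096 and extended to the last byte of the final page.
 */
static inline uint32_t effective_limit(uint32_t limit, int limit_in_pages)
{
	if (!limit_in_pages)
		return limit;

	/* Assumption: page-granular limits fit in the 20-bit descriptor field. */
	assert(limit <= 0xfffffu);
	return (limit << 12) + 4095u;
}
```

For example, a page-granular limit of 0xfffff covers the whole 4 GiB address space, and a page-granular limit of 0 still covers one full page.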

* [PATCH 4.14 062/159] selftests/x86/ldt_gdt: Run most existing LDT test cases against the GDT as well
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (60 preceding siblings ...)
  2017-12-22  8:45 ` [PATCH 4.14 061/159] selftests/x86/ldt_gdt: Add infrastructure to test set_thread_area() Greg Kroah-Hartman
@ 2017-12-22  8:45 ` Greg Kroah-Hartman
  2017-12-22  8:45 ` [PATCH 4.14 063/159] ACPI / APEI: Replace ioremap_page_range() with fixmap Greg Kroah-Hartman
                   ` (103 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Andy Lutomirski, Borislav Petkov,
	Linus Torvalds, Peter Zijlstra, Thomas Gleixner, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <luto@kernel.org>

commit adedf2893c192dd09b1cc2f2dcfdd7cad99ec49d upstream.

Now that the main test infrastructure supports the GDT, run tests
that will pass the kernel's GDT permission tests against the GDT.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/686a1eda63414da38fcecc2412db8dba1ae40581.1509794321.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 tools/testing/selftests/x86/ldt_gdt.c |   10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

--- a/tools/testing/selftests/x86/ldt_gdt.c
+++ b/tools/testing/selftests/x86/ldt_gdt.c
@@ -189,7 +189,15 @@ static bool install_valid_mode(const str
 
 static bool install_valid(const struct user_desc *desc, uint32_t ar)
 {
-	return install_valid_mode(desc, ar, false, true);
+	bool ret = install_valid_mode(desc, ar, false, true);
+
+	if (desc->contents <= 1 && desc->seg_32bit &&
+	    !desc->seg_not_present) {
+		/* Should work in the GDT, too. */
+		install_valid_mode(desc, ar, false, false);
+	}
+
+	return ret;
 }
 
 static void install_invalid(const struct user_desc *desc, bool oldmode)

^ permalink raw reply	[flat|nested] 349+ messages in thread

* [PATCH 4.14 063/159] ACPI / APEI: Replace ioremap_page_range() with fixmap
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (61 preceding siblings ...)
  2017-12-22  8:45 ` [PATCH 4.14 062/159] selftests/x86/ldt_gdt: Run most existing LDT test cases against the GDT as well Greg Kroah-Hartman
@ 2017-12-22  8:45 ` Greg Kroah-Hartman
  2017-12-22  8:45   ` Greg Kroah-Hartman
                   ` (102 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Fengguang Wu, Linus Torvalds,
	James Morse, Borislav Petkov, Tyler Baicar, Toshi Kani,
	Will Deacon, Ingo Molnar, Rafael J. Wysocki

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: James Morse <james.morse@arm.com>

commit 4f89fa286f6729312e227e7c2d764e8e7b9d340e upstream.

Replace ghes_io{re,un}map_pfn_{nmi,irq}()'s use of ioremap_page_range()
with __set_fixmap() as ioremap_page_range() may sleep to allocate a new
level of page-table, even if it's passed an existing final-address to
use in the mapping.
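
To make the difference concrete, here is a rough userspace model of why a fixmap slot is safe in atomic context. It is a sketch, not kernel code: every sketch_ name and both constants are invented for illustration, and the address arithmetic is only modelled on the generic fix_to_virt() helper. The point is that the slot index fixes the virtual address at build time and mapping merely overwrites an entry that already exists, so nothing has to be allocated.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative constants, not the values of any real architecture. */
#define SKETCH_PAGE_SHIFT   12
#define SKETCH_FIXADDR_TOP  UINT64_C(0xffffffffff5ff000)

/* Slot indices are fixed at build time, like enum fixed_addresses. */
enum sketch_fixed_addresses {
	SKETCH_FIX_GHES_IRQ,
	SKETCH_FIX_GHES_NMI,
	SKETCH_FIX_END,
};

/* One preallocated "page-table entry" per slot; 0 means unmapped. */
static uint64_t sketch_pte[SKETCH_FIX_END];

/* Slot index to virtual address: pure arithmetic, so it cannot sleep. */
static uint64_t sketch_fix_to_virt(unsigned int idx)
{
	assert(idx < SKETCH_FIX_END);
	return SKETCH_FIXADDR_TOP - ((uint64_t)idx << SKETCH_PAGE_SHIFT);
}

/* Mapping only overwrites an entry that already exists. */
static void sketch_set_fixmap(unsigned int idx, uint64_t paddr)
{
	assert(idx < SKETCH_FIX_END);
	sketch_pte[idx] = paddr | 1;	/* low bit stands in for "present" */
}

/* Unmapping resets the entry; a real kernel would also flush the TLB. */
static void sketch_clear_fixmap(unsigned int idx)
{
	assert(idx < SKETCH_FIX_END);
	sketch_pte[idx] = 0;
}
```

In this model the IRQ and NMI paths each own one slot, which is why the driver still needs the two spinlocks: the slots are never allocated or freed, only reused.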

The GHES driver can only be enabled for architectures that select
HAVE_ACPI_APEI: Add fixmap entries to both x86 and arm64.

clear_fixmap() does the TLB invalidation in __set_fixmap() for arm64
and __set_pte_vaddr() for x86. In each case it's the same as the
respective arch_apei_flush_tlb_one().

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Tested-by: Tyler Baicar <tbaicar@codeaurora.org>
Tested-by: Toshi Kani <toshi.kani@hpe.com>
[ For the arm64 bits: ]
Acked-by: Will Deacon <will.deacon@arm.com>
[ For the x86 bits: ]
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/arm64/include/asm/fixmap.h |    7 ++++++
 arch/x86/include/asm/fixmap.h   |    6 +++++
 drivers/acpi/apei/ghes.c        |   44 ++++++++++++----------------------------
 3 files changed, 27 insertions(+), 30 deletions(-)

--- a/arch/arm64/include/asm/fixmap.h
+++ b/arch/arm64/include/asm/fixmap.h
@@ -51,6 +51,13 @@ enum fixed_addresses {
 
 	FIX_EARLYCON_MEM_BASE,
 	FIX_TEXT_POKE0,
+
+#ifdef CONFIG_ACPI_APEI_GHES
+	/* Used for GHES mapping from assorted contexts */
+	FIX_APEI_GHES_IRQ,
+	FIX_APEI_GHES_NMI,
+#endif /* CONFIG_ACPI_APEI_GHES */
+
 	__end_of_permanent_fixed_addresses,
 
 	/*
--- a/arch/x86/include/asm/fixmap.h
+++ b/arch/x86/include/asm/fixmap.h
@@ -104,6 +104,12 @@ enum fixed_addresses {
 	FIX_GDT_REMAP_BEGIN,
 	FIX_GDT_REMAP_END = FIX_GDT_REMAP_BEGIN + NR_CPUS - 1,
 
+#ifdef CONFIG_ACPI_APEI_GHES
+	/* Used for GHES mapping from assorted contexts */
+	FIX_APEI_GHES_IRQ,
+	FIX_APEI_GHES_NMI,
+#endif
+
 	__end_of_permanent_fixed_addresses,
 
 	/*
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -51,6 +51,7 @@
 #include <acpi/actbl1.h>
 #include <acpi/ghes.h>
 #include <acpi/apei.h>
+#include <asm/fixmap.h>
 #include <asm/tlbflush.h>
 #include <ras/ras_event.h>
 
@@ -112,7 +113,7 @@ static DEFINE_MUTEX(ghes_list_mutex);
  * Because the memory area used to transfer hardware error information
  * from BIOS to Linux can be determined only in NMI, IRQ or timer
  * handler, but general ioremap can not be used in atomic context, so
- * a special version of atomic ioremap is implemented for that.
+ * the fixmap is used instead.
  */
 
 /*
@@ -126,8 +127,8 @@ static DEFINE_MUTEX(ghes_list_mutex);
 /* virtual memory area for atomic ioremap */
 static struct vm_struct *ghes_ioremap_area;
 /*
- * These 2 spinlock is used to prevent atomic ioremap virtual memory
- * area from being mapped simultaneously.
+ * These 2 spinlocks are used to prevent the fixmap entries from being used
+ * simultaneously.
  */
 static DEFINE_RAW_SPINLOCK(ghes_ioremap_lock_nmi);
 static DEFINE_SPINLOCK(ghes_ioremap_lock_irq);
@@ -159,53 +160,36 @@ static void ghes_ioremap_exit(void)
 
 static void __iomem *ghes_ioremap_pfn_nmi(u64 pfn)
 {
-	unsigned long vaddr;
 	phys_addr_t paddr;
 	pgprot_t prot;
 
-	vaddr = (unsigned long)GHES_IOREMAP_NMI_PAGE(ghes_ioremap_area->addr);
-
 	paddr = pfn << PAGE_SHIFT;
 	prot = arch_apei_get_mem_attribute(paddr);
-	ioremap_page_range(vaddr, vaddr + PAGE_SIZE, paddr, prot);
+	__set_fixmap(FIX_APEI_GHES_NMI, paddr, prot);
 
-	return (void __iomem *)vaddr;
+	return (void __iomem *) fix_to_virt(FIX_APEI_GHES_NMI);
 }
 
 static void __iomem *ghes_ioremap_pfn_irq(u64 pfn)
 {
-	unsigned long vaddr;
 	phys_addr_t paddr;
 	pgprot_t prot;
 
-	vaddr = (unsigned long)GHES_IOREMAP_IRQ_PAGE(ghes_ioremap_area->addr);
-
 	paddr = pfn << PAGE_SHIFT;
 	prot = arch_apei_get_mem_attribute(paddr);
+	__set_fixmap(FIX_APEI_GHES_IRQ, paddr, prot);
 
-	ioremap_page_range(vaddr, vaddr + PAGE_SIZE, paddr, prot);
-
-	return (void __iomem *)vaddr;
+	return (void __iomem *) fix_to_virt(FIX_APEI_GHES_IRQ);
 }
 
-static void ghes_iounmap_nmi(void __iomem *vaddr_ptr)
+static void ghes_iounmap_nmi(void)
 {
-	unsigned long vaddr = (unsigned long __force)vaddr_ptr;
-	void *base = ghes_ioremap_area->addr;
-
-	BUG_ON(vaddr != (unsigned long)GHES_IOREMAP_NMI_PAGE(base));
-	unmap_kernel_range_noflush(vaddr, PAGE_SIZE);
-	arch_apei_flush_tlb_one(vaddr);
+	clear_fixmap(FIX_APEI_GHES_NMI);
 }
 
-static void ghes_iounmap_irq(void __iomem *vaddr_ptr)
+static void ghes_iounmap_irq(void)
 {
-	unsigned long vaddr = (unsigned long __force)vaddr_ptr;
-	void *base = ghes_ioremap_area->addr;
-
-	BUG_ON(vaddr != (unsigned long)GHES_IOREMAP_IRQ_PAGE(base));
-	unmap_kernel_range_noflush(vaddr, PAGE_SIZE);
-	arch_apei_flush_tlb_one(vaddr);
+	clear_fixmap(FIX_APEI_GHES_IRQ);
 }
 
 static int ghes_estatus_pool_init(void)
@@ -361,10 +345,10 @@ static void ghes_copy_tofrom_phys(void *
 		paddr += trunk;
 		buffer += trunk;
 		if (in_nmi) {
-			ghes_iounmap_nmi(vaddr);
+			ghes_iounmap_nmi();
 			raw_spin_unlock(&ghes_ioremap_lock_nmi);
 		} else {
-			ghes_iounmap_irq(vaddr);
+			ghes_iounmap_irq();
 			spin_unlock_irqrestore(&ghes_ioremap_lock_irq, flags);
 		}
 	}

^ permalink raw reply	[flat|nested] 349+ messages in thread

* [PATCH 4.14 064/159] x86/virt, x86/platform: Merge struct x86_hyper into struct x86_platform and struct x86_init
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
@ 2017-12-22  8:45   ` Greg Kroah-Hartman
  2017-12-22  8:44 ` [PATCH 4.14 002/159] objtool: Dont report end of section error after an empty unwind hint Greg Kroah-Hartman
                     ` (164 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Ingo Molnar, Juergen Gross,
	Thomas Gleixner, Linus Torvalds, Peter Zijlstra, akataria,
	boris.ostrovsky, devel, haiyangz, kvm, kys, pbonzini, rkrcmar,
	rusty, sthemmin, virtualization, xen-devel

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Juergen Gross <jgross@suse.com>

commit f72e38e8ec8869ac0ba5a75d7d2f897d98a1454e upstream.

Instead of x86_hyper being either NULL on bare metal or a pointer to a
struct hypervisor_x86 in case of the kernel running as a guest, merge
the struct into x86_platform and x86_init.

This will remove the need for wrappers, which make it hard to find out
what is being called. With dummy functions added for all callbacks,
testing for a NULL function pointer can be removed, too.
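
The pattern described above can be tried out in userspace. The following is a sketch only: all sketch_ names are invented, the table is merely shaped like struct x86_hyper_init, and unlike the patch's copy_array() it copies each slot with memcpy() rather than through a void * array. It assumes every function pointer in the table has the same size and representation and that the struct holds nothing else.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef void (*generic_fn)(void);

/* Hypothetical callback table, shaped like struct x86_hyper_init. */
struct sketch_hyper_init {
	void (*init_platform)(void);
	bool (*x2apic_available)(void);
	void (*init_mem_mapping)(void);
};

static void sketch_noop(void) { }
static bool sketch_bool_noop(void) { return false; }
static bool sketch_guest_x2apic(void) { return true; }

/* Bare-metal defaults: every slot is callable, so callers need no NULL test. */
static struct sketch_hyper_init sketch_defaults = {
	.init_platform    = sketch_noop,
	.x2apic_available = sketch_bool_noop,
	.init_mem_mapping = sketch_noop,
};

/* A guest that overrides one callback and leaves the others NULL. */
static const struct sketch_hyper_init sketch_guest = {
	.x2apic_available = sketch_guest_x2apic,
};

/* Copy only the callbacks that the source table actually sets. */
static void sketch_copy_nonnull(const void *src, void *dst, size_t size)
{
	size_t i, n = size / sizeof(generic_fn);

	assert(size % sizeof(generic_fn) == 0);
	for (i = 0; i < n; i++) {
		generic_fn fn;

		memcpy(&fn, (const char *)src + i * sizeof(fn), sizeof(fn));
		if (fn)
			memcpy((char *)dst + i * sizeof(fn), &fn, sizeof(fn));
	}
}
```

After sketch_copy_nonnull(&sketch_guest, &sketch_defaults, sizeof(sketch_guest)), a call such as sketch_defaults.x2apic_available() reaches the guest's function while the two untouched slots still point at the no-op, so call sites can invoke any slot unconditionally.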

Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: akataria@vmware.com
Cc: boris.ostrovsky@oracle.com
Cc: devel@linuxdriverproject.org
Cc: haiyangz@microsoft.com
Cc: kvm@vger.kernel.org
Cc: kys@microsoft.com
Cc: pbonzini@redhat.com
Cc: rkrcmar@redhat.com
Cc: rusty@rustcorp.com.au
Cc: sthemmin@microsoft.com
Cc: virtualization@lists.linux-foundation.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/20171109132739.23465-2-jgross@suse.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/include/asm/hypervisor.h |   25 +++-------------
 arch/x86/include/asm/x86_init.h   |   24 ++++++++++++++++
 arch/x86/kernel/apic/apic.c       |    2 -
 arch/x86/kernel/cpu/hypervisor.c  |   56 ++++++++++++++++++--------------------
 arch/x86/kernel/cpu/mshyperv.c    |    2 -
 arch/x86/kernel/cpu/vmware.c      |    4 +-
 arch/x86/kernel/kvm.c             |    2 -
 arch/x86/kernel/x86_init.c        |    9 ++++++
 arch/x86/mm/init.c                |    2 -
 arch/x86/xen/enlighten_hvm.c      |    8 ++---
 arch/x86/xen/enlighten_pv.c       |    2 -
 include/linux/hypervisor.h        |    8 ++++-
 12 files changed, 82 insertions(+), 62 deletions(-)

--- a/arch/x86/include/asm/hypervisor.h
+++ b/arch/x86/include/asm/hypervisor.h
@@ -23,6 +23,7 @@
 #ifdef CONFIG_HYPERVISOR_GUEST
 
 #include <asm/kvm_para.h>
+#include <asm/x86_init.h>
 #include <asm/xen/hypervisor.h>
 
 /*
@@ -35,17 +36,11 @@ struct hypervisor_x86 {
 	/* Detection routine */
 	uint32_t	(*detect)(void);
 
-	/* Platform setup (run once per boot) */
-	void		(*init_platform)(void);
+	/* init time callbacks */
+	struct x86_hyper_init init;
 
-	/* X2APIC detection (run once per boot) */
-	bool		(*x2apic_available)(void);
-
-	/* pin current vcpu to specified physical cpu (run rarely) */
-	void		(*pin_vcpu)(int);
-
-	/* called during init_mem_mapping() to setup early mappings. */
-	void		(*init_mem_mapping)(void);
+	/* runtime callbacks */
+	struct x86_hyper_runtime runtime;
 };
 
 extern const struct hypervisor_x86 *x86_hyper;
@@ -58,17 +53,7 @@ extern const struct hypervisor_x86 x86_h
 extern const struct hypervisor_x86 x86_hyper_kvm;
 
 extern void init_hypervisor_platform(void);
-extern bool hypervisor_x2apic_available(void);
-extern void hypervisor_pin_vcpu(int cpu);
-
-static inline void hypervisor_init_mem_mapping(void)
-{
-	if (x86_hyper && x86_hyper->init_mem_mapping)
-		x86_hyper->init_mem_mapping();
-}
 #else
 static inline void init_hypervisor_platform(void) { }
-static inline bool hypervisor_x2apic_available(void) { return false; }
-static inline void hypervisor_init_mem_mapping(void) { }
 #endif /* CONFIG_HYPERVISOR_GUEST */
 #endif /* _ASM_X86_HYPERVISOR_H */
--- a/arch/x86/include/asm/x86_init.h
+++ b/arch/x86/include/asm/x86_init.h
@@ -115,6 +115,18 @@ struct x86_init_pci {
 };
 
 /**
+ * struct x86_hyper_init - x86 hypervisor init functions
+ * @init_platform:		platform setup
+ * @x2apic_available:		X2APIC detection
+ * @init_mem_mapping:		setup early mappings during init_mem_mapping()
+ */
+struct x86_hyper_init {
+	void (*init_platform)(void);
+	bool (*x2apic_available)(void);
+	void (*init_mem_mapping)(void);
+};
+
+/**
  * struct x86_init_ops - functions for platform specific setup
  *
  */
@@ -127,6 +139,7 @@ struct x86_init_ops {
 	struct x86_init_timers		timers;
 	struct x86_init_iommu		iommu;
 	struct x86_init_pci		pci;
+	struct x86_hyper_init		hyper;
 };
 
 /**
@@ -200,6 +213,15 @@ struct x86_legacy_features {
 };
 
 /**
+ * struct x86_hyper_runtime - x86 hypervisor specific runtime callbacks
+ *
+ * @pin_vcpu:		pin current vcpu to specified physical cpu (run rarely)
+ */
+struct x86_hyper_runtime {
+	void (*pin_vcpu)(int cpu);
+};
+
+/**
  * struct x86_platform_ops - platform specific runtime functions
  * @calibrate_cpu:		calibrate CPU
  * @calibrate_tsc:		calibrate TSC, if different from CPU
@@ -218,6 +240,7 @@ struct x86_legacy_features {
  * 				possible in x86_early_init_platform_quirks() by
  * 				only using the current x86_hardware_subarch
  * 				semantics.
+ * @hyper:			x86 hypervisor specific runtime callbacks
  */
 struct x86_platform_ops {
 	unsigned long (*calibrate_cpu)(void);
@@ -233,6 +256,7 @@ struct x86_platform_ops {
 	void (*apic_post_init)(void);
 	struct x86_legacy_features legacy;
 	void (*set_legacy_features)(void);
+	struct x86_hyper_runtime hyper;
 };
 
 struct pci_dev;
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -1645,7 +1645,7 @@ static __init void try_to_enable_x2apic(
 		 * under KVM
 		 */
 		if (max_physical_apicid > 255 ||
-		    !hypervisor_x2apic_available()) {
+		    !x86_init.hyper.x2apic_available()) {
 			pr_info("x2apic: IRQ remapping doesn't support X2APIC mode\n");
 			x2apic_disable();
 			return;
--- a/arch/x86/kernel/cpu/hypervisor.c
+++ b/arch/x86/kernel/cpu/hypervisor.c
@@ -44,51 +44,49 @@ static const __initconst struct hypervis
 const struct hypervisor_x86 *x86_hyper;
 EXPORT_SYMBOL(x86_hyper);
 
-static inline void __init
+static inline const struct hypervisor_x86 * __init
 detect_hypervisor_vendor(void)
 {
-	const struct hypervisor_x86 *h, * const *p;
+	const struct hypervisor_x86 *h = NULL, * const *p;
 	uint32_t pri, max_pri = 0;
 
 	for (p = hypervisors; p < hypervisors + ARRAY_SIZE(hypervisors); p++) {
-		h = *p;
-		pri = h->detect();
-		if (pri != 0 && pri > max_pri) {
+		pri = (*p)->detect();
+		if (pri > max_pri) {
 			max_pri = pri;
-			x86_hyper = h;
+			h = *p;
 		}
 	}
 
-	if (max_pri)
-		pr_info("Hypervisor detected: %s\n", x86_hyper->name);
-}
-
-void __init init_hypervisor_platform(void)
-{
-
-	detect_hypervisor_vendor();
+	if (h)
+		pr_info("Hypervisor detected: %s\n", h->name);
 
-	if (!x86_hyper)
-		return;
-
-	if (x86_hyper->init_platform)
-		x86_hyper->init_platform();
+	return h;
 }
 
-bool __init hypervisor_x2apic_available(void)
+static void __init copy_array(const void *src, void *target, unsigned int size)
 {
-	return x86_hyper                   &&
-	       x86_hyper->x2apic_available &&
-	       x86_hyper->x2apic_available();
+	unsigned int i, n = size / sizeof(void *);
+	const void * const *from = (const void * const *)src;
+	const void **to = (const void **)target;
+
+	for (i = 0; i < n; i++)
+		if (from[i])
+			to[i] = from[i];
 }
 
-void hypervisor_pin_vcpu(int cpu)
+void __init init_hypervisor_platform(void)
 {
-	if (!x86_hyper)
+	const struct hypervisor_x86 *h;
+
+	h = detect_hypervisor_vendor();
+
+	if (!h)
 		return;
 
-	if (x86_hyper->pin_vcpu)
-		x86_hyper->pin_vcpu(cpu);
-	else
-		WARN_ONCE(1, "vcpu pinning requested but not supported!\n");
+	copy_array(&h->init, &x86_init.hyper, sizeof(h->init));
+	copy_array(&h->runtime, &x86_platform.hyper, sizeof(h->runtime));
+
+	x86_hyper = h;
+	x86_init.hyper.init_platform();
 }
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -257,6 +257,6 @@ static void __init ms_hyperv_init_platfo
 const __refconst struct hypervisor_x86 x86_hyper_ms_hyperv = {
 	.name			= "Microsoft Hyper-V",
 	.detect			= ms_hyperv_platform,
-	.init_platform		= ms_hyperv_init_platform,
+	.init.init_platform	= ms_hyperv_init_platform,
 };
 EXPORT_SYMBOL(x86_hyper_ms_hyperv);
--- a/arch/x86/kernel/cpu/vmware.c
+++ b/arch/x86/kernel/cpu/vmware.c
@@ -208,7 +208,7 @@ static bool __init vmware_legacy_x2apic_
 const __refconst struct hypervisor_x86 x86_hyper_vmware = {
 	.name			= "VMware",
 	.detect			= vmware_platform,
-	.init_platform		= vmware_platform_setup,
-	.x2apic_available	= vmware_legacy_x2apic_available,
+	.init.init_platform	= vmware_platform_setup,
+	.init.x2apic_available	= vmware_legacy_x2apic_available,
 };
 EXPORT_SYMBOL(x86_hyper_vmware);
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -547,7 +547,7 @@ static uint32_t __init kvm_detect(void)
 const struct hypervisor_x86 x86_hyper_kvm __refconst = {
 	.name			= "KVM",
 	.detect			= kvm_detect,
-	.x2apic_available	= kvm_para_available,
+	.init.x2apic_available	= kvm_para_available,
 };
 EXPORT_SYMBOL_GPL(x86_hyper_kvm);
 
--- a/arch/x86/kernel/x86_init.c
+++ b/arch/x86/kernel/x86_init.c
@@ -28,6 +28,8 @@ void x86_init_noop(void) { }
 void __init x86_init_uint_noop(unsigned int unused) { }
 int __init iommu_init_noop(void) { return 0; }
 void iommu_shutdown_noop(void) { }
+bool __init bool_x86_init_noop(void) { return false; }
+void x86_op_int_noop(int cpu) { }
 
 /*
  * The platform setup functions are preset with the default functions
@@ -81,6 +83,12 @@ struct x86_init_ops x86_init __initdata
 		.init_irq		= x86_default_pci_init_irq,
 		.fixup_irqs		= x86_default_pci_fixup_irqs,
 	},
+
+	.hyper = {
+		.init_platform		= x86_init_noop,
+		.x2apic_available	= bool_x86_init_noop,
+		.init_mem_mapping	= x86_init_noop,
+	},
 };
 
 struct x86_cpuinit_ops x86_cpuinit = {
@@ -101,6 +109,7 @@ struct x86_platform_ops x86_platform __r
 	.get_nmi_reason			= default_get_nmi_reason,
 	.save_sched_clock_state 	= tsc_save_sched_clock_state,
 	.restore_sched_clock_state 	= tsc_restore_sched_clock_state,
+	.hyper.pin_vcpu			= x86_op_int_noop,
 };
 
 EXPORT_SYMBOL_GPL(x86_platform);
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -671,7 +671,7 @@ void __init init_mem_mapping(void)
 	load_cr3(swapper_pg_dir);
 	__flush_tlb_all();
 
-	hypervisor_init_mem_mapping();
+	x86_init.hyper.init_mem_mapping();
 
 	early_memtest(0, max_pfn_mapped << PAGE_SHIFT);
 }
--- a/arch/x86/xen/enlighten_hvm.c
+++ b/arch/x86/xen/enlighten_hvm.c
@@ -229,9 +229,9 @@ static uint32_t __init xen_platform_hvm(
 const struct hypervisor_x86 x86_hyper_xen_hvm = {
 	.name                   = "Xen HVM",
 	.detect                 = xen_platform_hvm,
-	.init_platform          = xen_hvm_guest_init,
-	.pin_vcpu               = xen_pin_vcpu,
-	.x2apic_available       = xen_x2apic_para_available,
-	.init_mem_mapping	= xen_hvm_init_mem_mapping,
+	.init.init_platform     = xen_hvm_guest_init,
+	.init.x2apic_available  = xen_x2apic_para_available,
+	.init.init_mem_mapping	= xen_hvm_init_mem_mapping,
+	.runtime.pin_vcpu       = xen_pin_vcpu,
 };
 EXPORT_SYMBOL(x86_hyper_xen_hvm);
--- a/arch/x86/xen/enlighten_pv.c
+++ b/arch/x86/xen/enlighten_pv.c
@@ -1462,6 +1462,6 @@ static uint32_t __init xen_platform_pv(v
 const struct hypervisor_x86 x86_hyper_xen_pv = {
 	.name                   = "Xen PV",
 	.detect                 = xen_platform_pv,
-	.pin_vcpu               = xen_pin_vcpu,
+	.runtime.pin_vcpu       = xen_pin_vcpu,
 };
 EXPORT_SYMBOL(x86_hyper_xen_pv);
--- a/include/linux/hypervisor.h
+++ b/include/linux/hypervisor.h
@@ -7,8 +7,12 @@
  *		Juergen Gross <jgross@suse.com>
  */
 
-#ifdef CONFIG_HYPERVISOR_GUEST
-#include <asm/hypervisor.h>
+#ifdef CONFIG_X86
+#include <asm/x86_init.h>
+static inline void hypervisor_pin_vcpu(int cpu)
+{
+	x86_platform.hyper.pin_vcpu(cpu);
+}
 #else
 static inline void hypervisor_pin_vcpu(int cpu)
 {

^ permalink raw reply	[flat|nested] 349+ messages in thread

* [PATCH 4.14 064/159] x86/virt, x86/platform: Merge struct x86_hyper into struct x86_platform and struct x86_init
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (63 preceding siblings ...)
  2017-12-22  8:45   ` Greg Kroah-Hartman
@ 2017-12-22  8:45 ` Greg Kroah-Hartman
  2017-12-22  8:45 ` [PATCH 4.14 065/159] x86/virt: Add enum for hypervisors to replace x86_hyper Greg Kroah-Hartman
                   ` (100 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Juergen Gross, sthemmin, xen-devel, kvm, rkrcmar, Peter Zijlstra,
	Greg Kroah-Hartman, boris.ostrovsky, rusty, akataria, stable,
	virtualization, haiyangz, pbonzini, devel, Thomas Gleixner, kys,
	Linus Torvalds, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Juergen Gross <jgross@suse.com>

commit f72e38e8ec8869ac0ba5a75d7d2f897d98a1454e upstream.

Instead of x86_hyper being either NULL on bare metal or a pointer to a
struct hypervisor_x86 when the kernel is running as a guest, merge the
struct into x86_platform and x86_init.

This removes the need for wrappers, which make it hard to find out what
is being called. With dummy functions added for all callbacks, the tests
for a NULL function pointer can be removed, too.

Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: akataria@vmware.com
Cc: boris.ostrovsky@oracle.com
Cc: devel@linuxdriverproject.org
Cc: haiyangz@microsoft.com
Cc: kvm@vger.kernel.org
Cc: kys@microsoft.com
Cc: pbonzini@redhat.com
Cc: rkrcmar@redhat.com
Cc: rusty@rustcorp.com.au
Cc: sthemmin@microsoft.com
Cc: virtualization@lists.linux-foundation.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/20171109132739.23465-2-jgross@suse.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
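Note (not part of the upstream commit): the pattern described above can
be sketched in plain userspace C. This is only an illustration under
stated assumptions. All names below (hyper_init, hyper_runtime,
cur_init, cur_runtime, demo_hyper, OVERRIDE, apply_hypervisor) are
hypothetical stand-ins, and the per-field OVERRIDE macro is used here
in place of the patch's copy_array() helper, which walks the structs as
arrays of pointers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for x86_hyper_init / x86_hyper_runtime. */
struct hyper_init {
	void (*init_platform)(void);
	bool (*x2apic_available)(void);
};

struct hyper_runtime {
	void (*pin_vcpu)(int cpu);
};

struct hypervisor {
	const char *name;
	struct hyper_init init;
	struct hyper_runtime runtime;
};

/* Dummy callbacks: every slot is always callable. */
static void init_noop(void) { }
static bool bool_noop(void) { return false; }
static void int_noop(int cpu) { (void)cpu; }

/* Global ops tables, preset with the dummies (bare-metal behaviour). */
static struct hyper_init cur_init = {
	.init_platform		= init_noop,
	.x2apic_available	= bool_noop,
};

static struct hyper_runtime cur_runtime = {
	.pin_vcpu		= int_noop,
};

/* Example guest support code that fills in only some callbacks. */
static int pinned_cpu = -1;
static void demo_pin_vcpu(int cpu) { pinned_cpu = cpu; }
static bool demo_x2apic(void) { return true; }

static const struct hypervisor demo_hyper = {
	.name			= "demo",
	.init.x2apic_available	= demo_x2apic,
	.runtime.pin_vcpu	= demo_pin_vcpu,
	/* .init.init_platform stays NULL, so the dummy is kept. */
};

/* Take over a callback only if the hypervisor provides one. */
#define OVERRIDE(dst, src, field)				\
	do {							\
		if ((src)->field)				\
			(dst)->field = (src)->field;		\
	} while (0)

static void apply_hypervisor(const struct hypervisor *h)
{
	OVERRIDE(&cur_init, &h->init, init_platform);
	OVERRIDE(&cur_init, &h->init, x2apic_available);
	OVERRIDE(&cur_runtime, &h->runtime, pin_vcpu);
}
```

After apply_hypervisor(&demo_hyper), callers can invoke
cur_init.init_platform() or cur_runtime.pin_vcpu(cpu) unconditionally.
A callback the hypervisor left unset simply runs the dummy.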
 arch/x86/include/asm/hypervisor.h |   25 +++-------------
 arch/x86/include/asm/x86_init.h   |   24 ++++++++++++++++
 arch/x86/kernel/apic/apic.c       |    2 -
 arch/x86/kernel/cpu/hypervisor.c  |   56 ++++++++++++++++++--------------------
 arch/x86/kernel/cpu/mshyperv.c    |    2 -
 arch/x86/kernel/cpu/vmware.c      |    4 +-
 arch/x86/kernel/kvm.c             |    2 -
 arch/x86/kernel/x86_init.c        |    9 ++++++
 arch/x86/mm/init.c                |    2 -
 arch/x86/xen/enlighten_hvm.c      |    8 ++---
 arch/x86/xen/enlighten_pv.c       |    2 -
 include/linux/hypervisor.h        |    8 ++++-
 12 files changed, 82 insertions(+), 62 deletions(-)

--- a/arch/x86/include/asm/hypervisor.h
+++ b/arch/x86/include/asm/hypervisor.h
@@ -23,6 +23,7 @@
 #ifdef CONFIG_HYPERVISOR_GUEST
 
 #include <asm/kvm_para.h>
+#include <asm/x86_init.h>
 #include <asm/xen/hypervisor.h>
 
 /*
@@ -35,17 +36,11 @@ struct hypervisor_x86 {
 	/* Detection routine */
 	uint32_t	(*detect)(void);
 
-	/* Platform setup (run once per boot) */
-	void		(*init_platform)(void);
+	/* init time callbacks */
+	struct x86_hyper_init init;
 
-	/* X2APIC detection (run once per boot) */
-	bool		(*x2apic_available)(void);
-
-	/* pin current vcpu to specified physical cpu (run rarely) */
-	void		(*pin_vcpu)(int);
-
-	/* called during init_mem_mapping() to setup early mappings. */
-	void		(*init_mem_mapping)(void);
+	/* runtime callbacks */
+	struct x86_hyper_runtime runtime;
 };
 
 extern const struct hypervisor_x86 *x86_hyper;
@@ -58,17 +53,7 @@ extern const struct hypervisor_x86 x86_h
 extern const struct hypervisor_x86 x86_hyper_kvm;
 
 extern void init_hypervisor_platform(void);
-extern bool hypervisor_x2apic_available(void);
-extern void hypervisor_pin_vcpu(int cpu);
-
-static inline void hypervisor_init_mem_mapping(void)
-{
-	if (x86_hyper && x86_hyper->init_mem_mapping)
-		x86_hyper->init_mem_mapping();
-}
 #else
 static inline void init_hypervisor_platform(void) { }
-static inline bool hypervisor_x2apic_available(void) { return false; }
-static inline void hypervisor_init_mem_mapping(void) { }
 #endif /* CONFIG_HYPERVISOR_GUEST */
 #endif /* _ASM_X86_HYPERVISOR_H */
--- a/arch/x86/include/asm/x86_init.h
+++ b/arch/x86/include/asm/x86_init.h
@@ -115,6 +115,18 @@ struct x86_init_pci {
 };
 
 /**
+ * struct x86_hyper_init - x86 hypervisor init functions
+ * @init_platform:		platform setup
+ * @x2apic_available:		X2APIC detection
+ * @init_mem_mapping:		setup early mappings during init_mem_mapping()
+ */
+struct x86_hyper_init {
+	void (*init_platform)(void);
+	bool (*x2apic_available)(void);
+	void (*init_mem_mapping)(void);
+};
+
+/**
  * struct x86_init_ops - functions for platform specific setup
  *
  */
@@ -127,6 +139,7 @@ struct x86_init_ops {
 	struct x86_init_timers		timers;
 	struct x86_init_iommu		iommu;
 	struct x86_init_pci		pci;
+	struct x86_hyper_init		hyper;
 };
 
 /**
@@ -200,6 +213,15 @@ struct x86_legacy_features {
 };
 
 /**
+ * struct x86_hyper_runtime - x86 hypervisor specific runtime callbacks
+ *
+ * @pin_vcpu:		pin current vcpu to specified physical cpu (run rarely)
+ */
+struct x86_hyper_runtime {
+	void (*pin_vcpu)(int cpu);
+};
+
+/**
  * struct x86_platform_ops - platform specific runtime functions
  * @calibrate_cpu:		calibrate CPU
  * @calibrate_tsc:		calibrate TSC, if different from CPU
@@ -218,6 +240,7 @@ struct x86_legacy_features {
  * 				possible in x86_early_init_platform_quirks() by
  * 				only using the current x86_hardware_subarch
  * 				semantics.
+ * @hyper:			x86 hypervisor specific runtime callbacks
  */
 struct x86_platform_ops {
 	unsigned long (*calibrate_cpu)(void);
@@ -233,6 +256,7 @@ struct x86_platform_ops {
 	void (*apic_post_init)(void);
 	struct x86_legacy_features legacy;
 	void (*set_legacy_features)(void);
+	struct x86_hyper_runtime hyper;
 };
 
 struct pci_dev;
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -1645,7 +1645,7 @@ static __init void try_to_enable_x2apic(
 		 * under KVM
 		 */
 		if (max_physical_apicid > 255 ||
-		    !hypervisor_x2apic_available()) {
+		    !x86_init.hyper.x2apic_available()) {
 			pr_info("x2apic: IRQ remapping doesn't support X2APIC mode\n");
 			x2apic_disable();
 			return;
--- a/arch/x86/kernel/cpu/hypervisor.c
+++ b/arch/x86/kernel/cpu/hypervisor.c
@@ -44,51 +44,49 @@ static const __initconst struct hypervis
 const struct hypervisor_x86 *x86_hyper;
 EXPORT_SYMBOL(x86_hyper);
 
-static inline void __init
+static inline const struct hypervisor_x86 * __init
 detect_hypervisor_vendor(void)
 {
-	const struct hypervisor_x86 *h, * const *p;
+	const struct hypervisor_x86 *h = NULL, * const *p;
 	uint32_t pri, max_pri = 0;
 
 	for (p = hypervisors; p < hypervisors + ARRAY_SIZE(hypervisors); p++) {
-		h = *p;
-		pri = h->detect();
-		if (pri != 0 && pri > max_pri) {
+		pri = (*p)->detect();
+		if (pri > max_pri) {
 			max_pri = pri;
-			x86_hyper = h;
+			h = *p;
 		}
 	}
 
-	if (max_pri)
-		pr_info("Hypervisor detected: %s\n", x86_hyper->name);
-}
-
-void __init init_hypervisor_platform(void)
-{
-
-	detect_hypervisor_vendor();
+	if (h)
+		pr_info("Hypervisor detected: %s\n", h->name);
 
-	if (!x86_hyper)
-		return;
-
-	if (x86_hyper->init_platform)
-		x86_hyper->init_platform();
+	return h;
 }
 
-bool __init hypervisor_x2apic_available(void)
+static void __init copy_array(const void *src, void *target, unsigned int size)
 {
-	return x86_hyper                   &&
-	       x86_hyper->x2apic_available &&
-	       x86_hyper->x2apic_available();
+	unsigned int i, n = size / sizeof(void *);
+	const void * const *from = (const void * const *)src;
+	const void **to = (const void **)target;
+
+	for (i = 0; i < n; i++)
+		if (from[i])
+			to[i] = from[i];
 }
 
-void hypervisor_pin_vcpu(int cpu)
+void __init init_hypervisor_platform(void)
 {
-	if (!x86_hyper)
+	const struct hypervisor_x86 *h;
+
+	h = detect_hypervisor_vendor();
+
+	if (!h)
 		return;
 
-	if (x86_hyper->pin_vcpu)
-		x86_hyper->pin_vcpu(cpu);
-	else
-		WARN_ONCE(1, "vcpu pinning requested but not supported!\n");
+	copy_array(&h->init, &x86_init.hyper, sizeof(h->init));
+	copy_array(&h->runtime, &x86_platform.hyper, sizeof(h->runtime));
+
+	x86_hyper = h;
+	x86_init.hyper.init_platform();
 }
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -257,6 +257,6 @@ static void __init ms_hyperv_init_platfo
 const __refconst struct hypervisor_x86 x86_hyper_ms_hyperv = {
 	.name			= "Microsoft Hyper-V",
 	.detect			= ms_hyperv_platform,
-	.init_platform		= ms_hyperv_init_platform,
+	.init.init_platform	= ms_hyperv_init_platform,
 };
 EXPORT_SYMBOL(x86_hyper_ms_hyperv);
--- a/arch/x86/kernel/cpu/vmware.c
+++ b/arch/x86/kernel/cpu/vmware.c
@@ -208,7 +208,7 @@ static bool __init vmware_legacy_x2apic_
 const __refconst struct hypervisor_x86 x86_hyper_vmware = {
 	.name			= "VMware",
 	.detect			= vmware_platform,
-	.init_platform		= vmware_platform_setup,
-	.x2apic_available	= vmware_legacy_x2apic_available,
+	.init.init_platform	= vmware_platform_setup,
+	.init.x2apic_available	= vmware_legacy_x2apic_available,
 };
 EXPORT_SYMBOL(x86_hyper_vmware);
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -547,7 +547,7 @@ static uint32_t __init kvm_detect(void)
 const struct hypervisor_x86 x86_hyper_kvm __refconst = {
 	.name			= "KVM",
 	.detect			= kvm_detect,
-	.x2apic_available	= kvm_para_available,
+	.init.x2apic_available	= kvm_para_available,
 };
 EXPORT_SYMBOL_GPL(x86_hyper_kvm);
 
--- a/arch/x86/kernel/x86_init.c
+++ b/arch/x86/kernel/x86_init.c
@@ -28,6 +28,8 @@ void x86_init_noop(void) { }
 void __init x86_init_uint_noop(unsigned int unused) { }
 int __init iommu_init_noop(void) { return 0; }
 void iommu_shutdown_noop(void) { }
+bool __init bool_x86_init_noop(void) { return false; }
+void x86_op_int_noop(int cpu) { }
 
 /*
  * The platform setup functions are preset with the default functions
@@ -81,6 +83,12 @@ struct x86_init_ops x86_init __initdata
 		.init_irq		= x86_default_pci_init_irq,
 		.fixup_irqs		= x86_default_pci_fixup_irqs,
 	},
+
+	.hyper = {
+		.init_platform		= x86_init_noop,
+		.x2apic_available	= bool_x86_init_noop,
+		.init_mem_mapping	= x86_init_noop,
+	},
 };
 
 struct x86_cpuinit_ops x86_cpuinit = {
@@ -101,6 +109,7 @@ struct x86_platform_ops x86_platform __r
 	.get_nmi_reason			= default_get_nmi_reason,
 	.save_sched_clock_state 	= tsc_save_sched_clock_state,
 	.restore_sched_clock_state 	= tsc_restore_sched_clock_state,
+	.hyper.pin_vcpu			= x86_op_int_noop,
 };
 
 EXPORT_SYMBOL_GPL(x86_platform);
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -671,7 +671,7 @@ void __init init_mem_mapping(void)
 	load_cr3(swapper_pg_dir);
 	__flush_tlb_all();
 
-	hypervisor_init_mem_mapping();
+	x86_init.hyper.init_mem_mapping();
 
 	early_memtest(0, max_pfn_mapped << PAGE_SHIFT);
 }
--- a/arch/x86/xen/enlighten_hvm.c
+++ b/arch/x86/xen/enlighten_hvm.c
@@ -229,9 +229,9 @@ static uint32_t __init xen_platform_hvm(
 const struct hypervisor_x86 x86_hyper_xen_hvm = {
 	.name                   = "Xen HVM",
 	.detect                 = xen_platform_hvm,
-	.init_platform          = xen_hvm_guest_init,
-	.pin_vcpu               = xen_pin_vcpu,
-	.x2apic_available       = xen_x2apic_para_available,
-	.init_mem_mapping	= xen_hvm_init_mem_mapping,
+	.init.init_platform     = xen_hvm_guest_init,
+	.init.x2apic_available  = xen_x2apic_para_available,
+	.init.init_mem_mapping	= xen_hvm_init_mem_mapping,
+	.runtime.pin_vcpu       = xen_pin_vcpu,
 };
 EXPORT_SYMBOL(x86_hyper_xen_hvm);
--- a/arch/x86/xen/enlighten_pv.c
+++ b/arch/x86/xen/enlighten_pv.c
@@ -1462,6 +1462,6 @@ static uint32_t __init xen_platform_pv(v
 const struct hypervisor_x86 x86_hyper_xen_pv = {
 	.name                   = "Xen PV",
 	.detect                 = xen_platform_pv,
-	.pin_vcpu               = xen_pin_vcpu,
+	.runtime.pin_vcpu       = xen_pin_vcpu,
 };
 EXPORT_SYMBOL(x86_hyper_xen_pv);
--- a/include/linux/hypervisor.h
+++ b/include/linux/hypervisor.h
@@ -7,8 +7,12 @@
  *		Juergen Gross <jgross@suse.com>
  */
 
-#ifdef CONFIG_HYPERVISOR_GUEST
-#include <asm/hypervisor.h>
+#ifdef CONFIG_X86
+#include <asm/x86_init.h>
+static inline void hypervisor_pin_vcpu(int cpu)
+{
+	x86_platform.hyper.pin_vcpu(cpu);
+}
 #else
 static inline void hypervisor_pin_vcpu(int cpu)
 {




* [PATCH 4.14 065/159] x86/virt: Add enum for hypervisors to replace x86_hyper
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
@ 2017-12-22  8:45   ` Greg Kroah-Hartman
  2017-12-22  8:44 ` [PATCH 4.14 002/159] objtool: Dont report end of section error after an empty unwind hint Greg Kroah-Hartman
                     ` (164 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Juergen Gross, Thomas Gleixner,
	Xavier Deguillard, Linus Torvalds, Peter Zijlstra, akataria,
	arnd, boris.ostrovsky, devel, dmitry.torokhov, haiyangz, kvm,
	kys, linux-graphics-maintainer, linux-input, moltmann, pbonzini,
	pv-drivers, rkrcmar, sthemmin, virtualization, xen-devel,
	Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Juergen Gross <jgross@suse.com>

commit 03b2a320b19f1424e9ac9c21696be9c60b6d0d93 upstream.

The x86_hyper pointer is only used for checking whether a virtual
device supports the hypervisor the system is running on.

Use an enum for that purpose instead and drop the x86_hyper pointer.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Xavier Deguillard <xdeguillard@vmware.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: akataria@vmware.com
Cc: arnd@arndb.de
Cc: boris.ostrovsky@oracle.com
Cc: devel@linuxdriverproject.org
Cc: dmitry.torokhov@gmail.com
Cc: gregkh@linuxfoundation.org
Cc: haiyangz@microsoft.com
Cc: kvm@vger.kernel.org
Cc: kys@microsoft.com
Cc: linux-graphics-maintainer@vmware.com
Cc: linux-input@vger.kernel.org
Cc: moltmann@vmware.com
Cc: pbonzini@redhat.com
Cc: pv-drivers@vmware.com
Cc: rkrcmar@redhat.com
Cc: sthemmin@microsoft.com
Cc: virtualization@lists.linux-foundation.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/20171109132739.23465-3-jgross@suse.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
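Note (not part of the upstream commit): a minimal userspace sketch of
the enum-based check, loosely modelled on the vmmouse hunk in this
patch. The names hypervisor_type, HYPER_*, supported and
device_supported are hypothetical. The kernel code compares against the
global x86_hyper_type instead of taking a parameter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mirror of enum x86_hypervisor_type; 0 means bare metal. */
enum hypervisor_type {
	HYPER_NATIVE = 0,
	HYPER_VMWARE,
	HYPER_MS_HYPERV,
	HYPER_XEN_PV,
	HYPER_XEN_HVM,
	HYPER_KVM,
};

/* Hypervisors this (made-up) virtual device works under. */
static const enum hypervisor_type supported[] = {
	HYPER_VMWARE,
	HYPER_KVM,
};

/* Return true if the device supports the hypervisor we run on. */
static bool device_supported(enum hypervisor_type running)
{
	size_t i;

	for (i = 0; i < sizeof(supported) / sizeof(supported[0]); i++)
		if (supported[i] == running)
			return true;

	return false;
}
```

Because the list holds plain enum values rather than addresses of
per-hypervisor structs, it needs neither exported symbols nor #ifdefs
for hypervisors that are not configured in.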
 arch/x86/hyperv/hv_init.c         |    2 +-
 arch/x86/include/asm/hypervisor.h |   23 ++++++++++++++---------
 arch/x86/kernel/cpu/hypervisor.c  |   12 +++++++++---
 arch/x86/kernel/cpu/mshyperv.c    |    4 ++--
 arch/x86/kernel/cpu/vmware.c      |    4 ++--
 arch/x86/kernel/kvm.c             |    4 ++--
 arch/x86/xen/enlighten_hvm.c      |    4 ++--
 arch/x86/xen/enlighten_pv.c       |    4 ++--
 drivers/hv/vmbus_drv.c            |    2 +-
 drivers/input/mouse/vmmouse.c     |   10 ++++------
 drivers/misc/vmw_balloon.c        |    2 +-
 11 files changed, 40 insertions(+), 31 deletions(-)

--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -113,7 +113,7 @@ void hyperv_init(void)
 	u64 guest_id;
 	union hv_x64_msr_hypercall_contents hypercall_msr;
 
-	if (x86_hyper != &x86_hyper_ms_hyperv)
+	if (x86_hyper_type != X86_HYPER_MS_HYPERV)
 		return;
 
 	/* Allocate percpu VP index */
--- a/arch/x86/include/asm/hypervisor.h
+++ b/arch/x86/include/asm/hypervisor.h
@@ -29,6 +29,16 @@
 /*
  * x86 hypervisor information
  */
+
+enum x86_hypervisor_type {
+	X86_HYPER_NATIVE = 0,
+	X86_HYPER_VMWARE,
+	X86_HYPER_MS_HYPERV,
+	X86_HYPER_XEN_PV,
+	X86_HYPER_XEN_HVM,
+	X86_HYPER_KVM,
+};
+
 struct hypervisor_x86 {
 	/* Hypervisor name */
 	const char	*name;
@@ -36,6 +46,9 @@ struct hypervisor_x86 {
 	/* Detection routine */
 	uint32_t	(*detect)(void);
 
+	/* Hypervisor type */
+	enum x86_hypervisor_type type;
+
 	/* init time callbacks */
 	struct x86_hyper_init init;
 
@@ -43,15 +56,7 @@ struct hypervisor_x86 {
 	struct x86_hyper_runtime runtime;
 };
 
-extern const struct hypervisor_x86 *x86_hyper;
-
-/* Recognized hypervisors */
-extern const struct hypervisor_x86 x86_hyper_vmware;
-extern const struct hypervisor_x86 x86_hyper_ms_hyperv;
-extern const struct hypervisor_x86 x86_hyper_xen_pv;
-extern const struct hypervisor_x86 x86_hyper_xen_hvm;
-extern const struct hypervisor_x86 x86_hyper_kvm;
-
+extern enum x86_hypervisor_type x86_hyper_type;
 extern void init_hypervisor_platform(void);
 #else
 static inline void init_hypervisor_platform(void) { }
--- a/arch/x86/kernel/cpu/hypervisor.c
+++ b/arch/x86/kernel/cpu/hypervisor.c
@@ -26,6 +26,12 @@
 #include <asm/processor.h>
 #include <asm/hypervisor.h>
 
+extern const struct hypervisor_x86 x86_hyper_vmware;
+extern const struct hypervisor_x86 x86_hyper_ms_hyperv;
+extern const struct hypervisor_x86 x86_hyper_xen_pv;
+extern const struct hypervisor_x86 x86_hyper_xen_hvm;
+extern const struct hypervisor_x86 x86_hyper_kvm;
+
 static const __initconst struct hypervisor_x86 * const hypervisors[] =
 {
 #ifdef CONFIG_XEN_PV
@@ -41,8 +47,8 @@ static const __initconst struct hypervis
 #endif
 };
 
-const struct hypervisor_x86 *x86_hyper;
-EXPORT_SYMBOL(x86_hyper);
+enum x86_hypervisor_type x86_hyper_type;
+EXPORT_SYMBOL(x86_hyper_type);
 
 static inline const struct hypervisor_x86 * __init
 detect_hypervisor_vendor(void)
@@ -87,6 +93,6 @@ void __init init_hypervisor_platform(voi
 	copy_array(&h->init, &x86_init.hyper, sizeof(h->init));
 	copy_array(&h->runtime, &x86_platform.hyper, sizeof(h->runtime));
 
-	x86_hyper = h;
+	x86_hyper_type = h->type;
 	x86_init.hyper.init_platform();
 }
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -254,9 +254,9 @@ static void __init ms_hyperv_init_platfo
 #endif
 }
 
-const __refconst struct hypervisor_x86 x86_hyper_ms_hyperv = {
+const __initconst struct hypervisor_x86 x86_hyper_ms_hyperv = {
 	.name			= "Microsoft Hyper-V",
 	.detect			= ms_hyperv_platform,
+	.type			= X86_HYPER_MS_HYPERV,
 	.init.init_platform	= ms_hyperv_init_platform,
 };
-EXPORT_SYMBOL(x86_hyper_ms_hyperv);
--- a/arch/x86/kernel/cpu/vmware.c
+++ b/arch/x86/kernel/cpu/vmware.c
@@ -205,10 +205,10 @@ static bool __init vmware_legacy_x2apic_
 	       (eax & (1 << VMWARE_PORT_CMD_LEGACY_X2APIC)) != 0;
 }
 
-const __refconst struct hypervisor_x86 x86_hyper_vmware = {
+const __initconst struct hypervisor_x86 x86_hyper_vmware = {
 	.name			= "VMware",
 	.detect			= vmware_platform,
+	.type			= X86_HYPER_VMWARE,
 	.init.init_platform	= vmware_platform_setup,
 	.init.x2apic_available	= vmware_legacy_x2apic_available,
 };
-EXPORT_SYMBOL(x86_hyper_vmware);
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -544,12 +544,12 @@ static uint32_t __init kvm_detect(void)
 	return kvm_cpuid_base();
 }
 
-const struct hypervisor_x86 x86_hyper_kvm __refconst = {
+const __initconst struct hypervisor_x86 x86_hyper_kvm = {
 	.name			= "KVM",
 	.detect			= kvm_detect,
+	.type			= X86_HYPER_KVM,
 	.init.x2apic_available	= kvm_para_available,
 };
-EXPORT_SYMBOL_GPL(x86_hyper_kvm);
 
 static __init int activate_jump_labels(void)
 {
--- a/arch/x86/xen/enlighten_hvm.c
+++ b/arch/x86/xen/enlighten_hvm.c
@@ -226,12 +226,12 @@ static uint32_t __init xen_platform_hvm(
 	return xen_cpuid_base();
 }
 
-const struct hypervisor_x86 x86_hyper_xen_hvm = {
+const __initconst struct hypervisor_x86 x86_hyper_xen_hvm = {
 	.name                   = "Xen HVM",
 	.detect                 = xen_platform_hvm,
+	.type			= X86_HYPER_XEN_HVM,
 	.init.init_platform     = xen_hvm_guest_init,
 	.init.x2apic_available  = xen_x2apic_para_available,
 	.init.init_mem_mapping	= xen_hvm_init_mem_mapping,
 	.runtime.pin_vcpu       = xen_pin_vcpu,
 };
-EXPORT_SYMBOL(x86_hyper_xen_hvm);
--- a/arch/x86/xen/enlighten_pv.c
+++ b/arch/x86/xen/enlighten_pv.c
@@ -1459,9 +1459,9 @@ static uint32_t __init xen_platform_pv(v
 	return 0;
 }
 
-const struct hypervisor_x86 x86_hyper_xen_pv = {
+const __initconst struct hypervisor_x86 x86_hyper_xen_pv = {
 	.name                   = "Xen PV",
 	.detect                 = xen_platform_pv,
+	.type			= X86_HYPER_XEN_PV,
 	.runtime.pin_vcpu       = xen_pin_vcpu,
 };
-EXPORT_SYMBOL(x86_hyper_xen_pv);
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -1534,7 +1534,7 @@ static int __init hv_acpi_init(void)
 {
 	int ret, t;
 
-	if (x86_hyper != &x86_hyper_ms_hyperv)
+	if (x86_hyper_type != X86_HYPER_MS_HYPERV)
 		return -ENODEV;
 
 	init_completion(&probe_event);
--- a/drivers/input/mouse/vmmouse.c
+++ b/drivers/input/mouse/vmmouse.c
@@ -316,11 +316,9 @@ static int vmmouse_enable(struct psmouse
 /*
  * Array of supported hypervisors.
  */
-static const struct hypervisor_x86 *vmmouse_supported_hypervisors[] = {
-	&x86_hyper_vmware,
-#ifdef CONFIG_KVM_GUEST
-	&x86_hyper_kvm,
-#endif
+static enum x86_hypervisor_type vmmouse_supported_hypervisors[] = {
+	X86_HYPER_VMWARE,
+	X86_HYPER_KVM,
 };
 
 /**
@@ -331,7 +329,7 @@ static bool vmmouse_check_hypervisor(voi
 	int i;
 
 	for (i = 0; i < ARRAY_SIZE(vmmouse_supported_hypervisors); i++)
-		if (vmmouse_supported_hypervisors[i] == x86_hyper)
+		if (vmmouse_supported_hypervisors[i] == x86_hyper_type)
 			return true;
 
 	return false;
--- a/drivers/misc/vmw_balloon.c
+++ b/drivers/misc/vmw_balloon.c
@@ -1271,7 +1271,7 @@ static int __init vmballoon_init(void)
 	 * Check if we are running on VMware's hypervisor and bail out
 	 * if we are not.
 	 */
-	if (x86_hyper != &x86_hyper_vmware)
+	if (x86_hyper_type != X86_HYPER_VMWARE)
 		return -ENODEV;
 
 	for (is_2m_pages = 0; is_2m_pages < VMW_BALLOON_NUM_PAGE_SIZES;


* [PATCH 4.14 065/159] x86/virt: Add enum for hypervisors to replace x86_hyper
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (64 preceding siblings ...)
  2017-12-22  8:45 ` Greg Kroah-Hartman
@ 2017-12-22  8:45 ` Greg Kroah-Hartman
  2017-12-22  8:45 ` Greg Kroah-Hartman
                   ` (99 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: kvm, rkrcmar, pv-drivers, akataria, virtualization,
	Thomas Gleixner, sthemmin, moltmann, Ingo Molnar, Peter Zijlstra,
	linux-graphics-maintainer, linux-input, xen-devel, arnd,
	Xavier Deguillard, haiyangz, devel, boris.ostrovsky,
	Juergen Gross, Greg Kroah-Hartman, dmitry.torokhov, stable,
	pbonzini, Linus Torvalds

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Juergen Gross <jgross@suse.com>

commit 03b2a320b19f1424e9ac9c21696be9c60b6d0d93 upstream.

The x86_hyper pointer is only used for checking whether a virtual
device supports the hypervisor the system is running on.

Use an enum for that purpose instead and drop the x86_hyper pointer.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Xavier Deguillard <xdeguillard@vmware.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: akataria@vmware.com
Cc: arnd@arndb.de
Cc: boris.ostrovsky@oracle.com
Cc: devel@linuxdriverproject.org
Cc: dmitry.torokhov@gmail.com
Cc: gregkh@linuxfoundation.org
Cc: haiyangz@microsoft.com
Cc: kvm@vger.kernel.org
Cc: kys@microsoft.com
Cc: linux-graphics-maintainer@vmware.com
Cc: linux-input@vger.kernel.org
Cc: moltmann@vmware.com
Cc: pbonzini@redhat.com
Cc: pv-drivers@vmware.com
Cc: rkrcmar@redhat.com
Cc: sthemmin@microsoft.com
Cc: virtualization@lists.linux-foundation.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/20171109132739.23465-3-jgross@suse.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/hyperv/hv_init.c         |    2 +-
 arch/x86/include/asm/hypervisor.h |   23 ++++++++++++++---------
 arch/x86/kernel/cpu/hypervisor.c  |   12 +++++++++---
 arch/x86/kernel/cpu/mshyperv.c    |    4 ++--
 arch/x86/kernel/cpu/vmware.c      |    4 ++--
 arch/x86/kernel/kvm.c             |    4 ++--
 arch/x86/xen/enlighten_hvm.c      |    4 ++--
 arch/x86/xen/enlighten_pv.c       |    4 ++--
 drivers/hv/vmbus_drv.c            |    2 +-
 drivers/input/mouse/vmmouse.c     |   10 ++++------
 drivers/misc/vmw_balloon.c        |    2 +-
 11 files changed, 40 insertions(+), 31 deletions(-)
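
Note for reviewers: the comparison pattern this patch moves to can be tried
outside the kernel with the sketch below. It is a user-space illustration
only. The enum values are copied from the hunk for asm/hypervisor.h, while
demo_set_hypervisor(), demo_check_hypervisor() and demo_selfcheck() are
hypothetical names that do not exist in the tree. One visible effect in the
vmmouse hunk is that the support table no longer needs the
CONFIG_KVM_GUEST guard, presumably because an enum constant is always
defined whereas the x86_hyper_kvm object is only built with that option.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Copied from the asm/hypervisor.h hunk of this patch. */
enum x86_hypervisor_type {
	X86_HYPER_NATIVE = 0,
	X86_HYPER_VMWARE,
	X86_HYPER_MS_HYPERV,
	X86_HYPER_XEN_PV,
	X86_HYPER_XEN_HVM,
	X86_HYPER_KVM,
};

/* Stand-in for the exported kernel variable; zero means bare metal. */
static enum x86_hypervisor_type x86_hyper_type;

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Same shape as the table in drivers/input/mouse/vmmouse.c after the patch. */
static const enum x86_hypervisor_type demo_supported_hypervisors[] = {
	X86_HYPER_VMWARE,
	X86_HYPER_KVM,
};

/* Hypothetical setter, standing in for init_hypervisor_platform(). */
static void demo_set_hypervisor(enum x86_hypervisor_type type)
{
	x86_hyper_type = type;
}

/* Hypothetical check modelled on vmmouse_check_hypervisor(). */
static bool demo_check_hypervisor(void)
{
	size_t i;

	for (i = 0; i < ARRAY_SIZE(demo_supported_hypervisors); i++)
		if (demo_supported_hypervisors[i] == x86_hyper_type)
			return true;

	return false;
}

/* Usage example: only VMware and KVM are accepted by the table above. */
static void demo_selfcheck(void)
{
	demo_set_hypervisor(X86_HYPER_VMWARE);
	assert(demo_check_hypervisor());

	demo_set_hypervisor(X86_HYPER_MS_HYPERV);
	assert(!demo_check_hypervisor());
}
```

Because the check now compares plain integers, the hypervisor_x86
descriptors are no longer referenced from drivers, which is consistent with
the hunks above dropping their EXPORT_SYMBOL lines and marking them
__initconst.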

--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -113,7 +113,7 @@ void hyperv_init(void)
 	u64 guest_id;
 	union hv_x64_msr_hypercall_contents hypercall_msr;
 
-	if (x86_hyper != &x86_hyper_ms_hyperv)
+	if (x86_hyper_type != X86_HYPER_MS_HYPERV)
 		return;
 
 	/* Allocate percpu VP index */
--- a/arch/x86/include/asm/hypervisor.h
+++ b/arch/x86/include/asm/hypervisor.h
@@ -29,6 +29,16 @@
 /*
  * x86 hypervisor information
  */
+
+enum x86_hypervisor_type {
+	X86_HYPER_NATIVE = 0,
+	X86_HYPER_VMWARE,
+	X86_HYPER_MS_HYPERV,
+	X86_HYPER_XEN_PV,
+	X86_HYPER_XEN_HVM,
+	X86_HYPER_KVM,
+};
+
 struct hypervisor_x86 {
 	/* Hypervisor name */
 	const char	*name;
@@ -36,6 +46,9 @@ struct hypervisor_x86 {
 	/* Detection routine */
 	uint32_t	(*detect)(void);
 
+	/* Hypervisor type */
+	enum x86_hypervisor_type type;
+
 	/* init time callbacks */
 	struct x86_hyper_init init;
 
@@ -43,15 +56,7 @@ struct hypervisor_x86 {
 	struct x86_hyper_runtime runtime;
 };
 
-extern const struct hypervisor_x86 *x86_hyper;
-
-/* Recognized hypervisors */
-extern const struct hypervisor_x86 x86_hyper_vmware;
-extern const struct hypervisor_x86 x86_hyper_ms_hyperv;
-extern const struct hypervisor_x86 x86_hyper_xen_pv;
-extern const struct hypervisor_x86 x86_hyper_xen_hvm;
-extern const struct hypervisor_x86 x86_hyper_kvm;
-
+extern enum x86_hypervisor_type x86_hyper_type;
 extern void init_hypervisor_platform(void);
 #else
 static inline void init_hypervisor_platform(void) { }
--- a/arch/x86/kernel/cpu/hypervisor.c
+++ b/arch/x86/kernel/cpu/hypervisor.c
@@ -26,6 +26,12 @@
 #include <asm/processor.h>
 #include <asm/hypervisor.h>
 
+extern const struct hypervisor_x86 x86_hyper_vmware;
+extern const struct hypervisor_x86 x86_hyper_ms_hyperv;
+extern const struct hypervisor_x86 x86_hyper_xen_pv;
+extern const struct hypervisor_x86 x86_hyper_xen_hvm;
+extern const struct hypervisor_x86 x86_hyper_kvm;
+
 static const __initconst struct hypervisor_x86 * const hypervisors[] =
 {
 #ifdef CONFIG_XEN_PV
@@ -41,8 +47,8 @@ static const __initconst struct hypervis
 #endif
 };
 
-const struct hypervisor_x86 *x86_hyper;
-EXPORT_SYMBOL(x86_hyper);
+enum x86_hypervisor_type x86_hyper_type;
+EXPORT_SYMBOL(x86_hyper_type);
 
 static inline const struct hypervisor_x86 * __init
 detect_hypervisor_vendor(void)
@@ -87,6 +93,6 @@ void __init init_hypervisor_platform(voi
 	copy_array(&h->init, &x86_init.hyper, sizeof(h->init));
 	copy_array(&h->runtime, &x86_platform.hyper, sizeof(h->runtime));
 
-	x86_hyper = h;
+	x86_hyper_type = h->type;
 	x86_init.hyper.init_platform();
 }
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -254,9 +254,9 @@ static void __init ms_hyperv_init_platfo
 #endif
 }
 
-const __refconst struct hypervisor_x86 x86_hyper_ms_hyperv = {
+const __initconst struct hypervisor_x86 x86_hyper_ms_hyperv = {
 	.name			= "Microsoft Hyper-V",
 	.detect			= ms_hyperv_platform,
+	.type			= X86_HYPER_MS_HYPERV,
 	.init.init_platform	= ms_hyperv_init_platform,
 };
-EXPORT_SYMBOL(x86_hyper_ms_hyperv);
--- a/arch/x86/kernel/cpu/vmware.c
+++ b/arch/x86/kernel/cpu/vmware.c
@@ -205,10 +205,10 @@ static bool __init vmware_legacy_x2apic_
 	       (eax & (1 << VMWARE_PORT_CMD_LEGACY_X2APIC)) != 0;
 }
 
-const __refconst struct hypervisor_x86 x86_hyper_vmware = {
+const __initconst struct hypervisor_x86 x86_hyper_vmware = {
 	.name			= "VMware",
 	.detect			= vmware_platform,
+	.type			= X86_HYPER_VMWARE,
 	.init.init_platform	= vmware_platform_setup,
 	.init.x2apic_available	= vmware_legacy_x2apic_available,
 };
-EXPORT_SYMBOL(x86_hyper_vmware);
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -544,12 +544,12 @@ static uint32_t __init kvm_detect(void)
 	return kvm_cpuid_base();
 }
 
-const struct hypervisor_x86 x86_hyper_kvm __refconst = {
+const __initconst struct hypervisor_x86 x86_hyper_kvm = {
 	.name			= "KVM",
 	.detect			= kvm_detect,
+	.type			= X86_HYPER_KVM,
 	.init.x2apic_available	= kvm_para_available,
 };
-EXPORT_SYMBOL_GPL(x86_hyper_kvm);
 
 static __init int activate_jump_labels(void)
 {
--- a/arch/x86/xen/enlighten_hvm.c
+++ b/arch/x86/xen/enlighten_hvm.c
@@ -226,12 +226,12 @@ static uint32_t __init xen_platform_hvm(
 	return xen_cpuid_base();
 }
 
-const struct hypervisor_x86 x86_hyper_xen_hvm = {
+const __initconst struct hypervisor_x86 x86_hyper_xen_hvm = {
 	.name                   = "Xen HVM",
 	.detect                 = xen_platform_hvm,
+	.type			= X86_HYPER_XEN_HVM,
 	.init.init_platform     = xen_hvm_guest_init,
 	.init.x2apic_available  = xen_x2apic_para_available,
 	.init.init_mem_mapping	= xen_hvm_init_mem_mapping,
 	.runtime.pin_vcpu       = xen_pin_vcpu,
 };
-EXPORT_SYMBOL(x86_hyper_xen_hvm);
--- a/arch/x86/xen/enlighten_pv.c
+++ b/arch/x86/xen/enlighten_pv.c
@@ -1459,9 +1459,9 @@ static uint32_t __init xen_platform_pv(v
 	return 0;
 }
 
-const struct hypervisor_x86 x86_hyper_xen_pv = {
+const __initconst struct hypervisor_x86 x86_hyper_xen_pv = {
 	.name                   = "Xen PV",
 	.detect                 = xen_platform_pv,
+	.type			= X86_HYPER_XEN_PV,
 	.runtime.pin_vcpu       = xen_pin_vcpu,
 };
-EXPORT_SYMBOL(x86_hyper_xen_pv);
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -1534,7 +1534,7 @@ static int __init hv_acpi_init(void)
 {
 	int ret, t;
 
-	if (x86_hyper != &x86_hyper_ms_hyperv)
+	if (x86_hyper_type != X86_HYPER_MS_HYPERV)
 		return -ENODEV;
 
 	init_completion(&probe_event);
--- a/drivers/input/mouse/vmmouse.c
+++ b/drivers/input/mouse/vmmouse.c
@@ -316,11 +316,9 @@ static int vmmouse_enable(struct psmouse
 /*
  * Array of supported hypervisors.
  */
-static const struct hypervisor_x86 *vmmouse_supported_hypervisors[] = {
-	&x86_hyper_vmware,
-#ifdef CONFIG_KVM_GUEST
-	&x86_hyper_kvm,
-#endif
+static enum x86_hypervisor_type vmmouse_supported_hypervisors[] = {
+	X86_HYPER_VMWARE,
+	X86_HYPER_KVM,
 };
 
 /**
@@ -331,7 +329,7 @@ static bool vmmouse_check_hypervisor(voi
 	int i;
 
 	for (i = 0; i < ARRAY_SIZE(vmmouse_supported_hypervisors); i++)
-		if (vmmouse_supported_hypervisors[i] == x86_hyper)
+		if (vmmouse_supported_hypervisors[i] == x86_hyper_type)
 			return true;
 
 	return false;
--- a/drivers/misc/vmw_balloon.c
+++ b/drivers/misc/vmw_balloon.c
@@ -1271,7 +1271,7 @@ static int __init vmballoon_init(void)
 	 * Check if we are running on VMware's hypervisor and bail out
 	 * if we are not.
 	 */
-	if (x86_hyper != &x86_hyper_vmware)
+	if (x86_hyper_type != X86_HYPER_VMWARE)
 		return -ENODEV;
 
 	for (is_2m_pages = 0; is_2m_pages < VMW_BALLOON_NUM_PAGE_SIZES;

^ permalink raw reply	[flat|nested] 349+ messages in thread

* [PATCH 4.14 066/159] drivers/misc/intel/pti: Rename the header file to free up the namespace
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (67 preceding siblings ...)
  2017-12-22  8:45   ` Greg Kroah-Hartman
@ 2017-12-22  8:45 ` Greg Kroah-Hartman
  2017-12-22  8:45 ` [PATCH 4.14 067/159] x86/cpufeature: Add User-Mode Instruction Prevention definitions Greg Kroah-Hartman
                   ` (96 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Peter Zijlstra, Thomas Gleixner,
	J Freyensee, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Ingo Molnar <mingo@kernel.org>

commit 1784f9144b143a1e8b19fe94083b040aa559182b upstream.

We'd like to use the 'PTI' acronym for 'Page Table Isolation' - free up the
namespace by renaming the <linux/pti.h> driver header to <linux/intel-pti.h>.

(Also standardize the header guard name while at it.)

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: J Freyensee <james_p_freyensee@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/misc/pti.c        |    2 +-
 include/linux/intel-pti.h |   43 +++++++++++++++++++++++++++++++++++++++++++
 include/linux/pti.h       |   43 -------------------------------------------
 3 files changed, 44 insertions(+), 44 deletions(-)

--- a/drivers/misc/pti.c
+++ b/drivers/misc/pti.c
@@ -32,7 +32,7 @@
 #include <linux/pci.h>
 #include <linux/mutex.h>
 #include <linux/miscdevice.h>
-#include <linux/pti.h>
+#include <linux/intel-pti.h>
 #include <linux/slab.h>
 #include <linux/uaccess.h>
 
--- /dev/null
+++ b/include/linux/intel-pti.h
@@ -0,0 +1,43 @@
+/*
+ *  Copyright (C) Intel 2011
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ *
+ * The PTI (Parallel Trace Interface) driver directs trace data routed from
+ * various parts in the system out through the Intel Penwell PTI port and
+ * out of the mobile device for analysis with a debugging tool
+ * (Lauterbach, Fido). This is part of a solution for the MIPI P1149.7,
+ * compact JTAG, standard.
+ *
+ * This header file will allow other parts of the OS to use the
+ * interface to write out it's contents for debugging a mobile system.
+ */
+
+#ifndef LINUX_INTEL_PTI_H_
+#define LINUX_INTEL_PTI_H_
+
+/* offset for last dword of any PTI message. Part of MIPI P1149.7 */
+#define PTI_LASTDWORD_DTS	0x30
+
+/* basic structure used as a write address to the PTI HW */
+struct pti_masterchannel {
+	u8 master;
+	u8 channel;
+};
+
+/* the following functions are defined in misc/pti.c */
+void pti_writedata(struct pti_masterchannel *mc, u8 *buf, int count);
+struct pti_masterchannel *pti_request_masterchannel(u8 type,
+						    const char *thread_name);
+void pti_release_masterchannel(struct pti_masterchannel *mc);
+
+#endif /* LINUX_INTEL_PTI_H_ */
--- a/include/linux/pti.h
+++ /dev/null
@@ -1,43 +0,0 @@
-/*
- *  Copyright (C) Intel 2011
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- *
- * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- *
- * The PTI (Parallel Trace Interface) driver directs trace data routed from
- * various parts in the system out through the Intel Penwell PTI port and
- * out of the mobile device for analysis with a debugging tool
- * (Lauterbach, Fido). This is part of a solution for the MIPI P1149.7,
- * compact JTAG, standard.
- *
- * This header file will allow other parts of the OS to use the
- * interface to write out it's contents for debugging a mobile system.
- */
-
-#ifndef PTI_H_
-#define PTI_H_
-
-/* offset for last dword of any PTI message. Part of MIPI P1149.7 */
-#define PTI_LASTDWORD_DTS	0x30
-
-/* basic structure used as a write address to the PTI HW */
-struct pti_masterchannel {
-	u8 master;
-	u8 channel;
-};
-
-/* the following functions are defined in misc/pti.c */
-void pti_writedata(struct pti_masterchannel *mc, u8 *buf, int count);
-struct pti_masterchannel *pti_request_masterchannel(u8 type,
-						    const char *thread_name);
-void pti_release_masterchannel(struct pti_masterchannel *mc);
-
-#endif /*PTI_H_*/

^ permalink raw reply	[flat|nested] 349+ messages in thread

* [PATCH 4.14 067/159] x86/cpufeature: Add User-Mode Instruction Prevention definitions
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (68 preceding siblings ...)
  2017-12-22  8:45 ` [PATCH 4.14 066/159] drivers/misc/intel/pti: Rename the header file to free up the namespace Greg Kroah-Hartman
@ 2017-12-22  8:45 ` Greg Kroah-Hartman
  2017-12-22  8:45 ` [PATCH 4.14 068/159] x86: Make X86_BUG_FXSAVE_LEAK detectable in CPUID on AMD Greg Kroah-Hartman
                   ` (95 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Ricardo Neri, Thomas Gleixner,
	Borislav Petkov, Andrew Morton, Andy Lutomirski, Borislav Petkov,
	Brian Gerst, Chen Yucong, Chris Metcalf, Dave Hansen,
	Denys Vlasenko, Fenghua Yu, H. Peter Anvin, Huang Rui,
	Jiri Slaby, Jonathan Corbet, Josh Poimboeuf, Linus Torvalds,
	Masami Hiramatsu, Michael S. Tsirkin, Paolo Bonzini,
	Paul Gortmaker, Peter Zijlstra, Ravi V. Shankar, Shuah Khan,
	Tony Luck, Vlastimil Babka, ricardo.neri, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>

commit a8b4db562e7283a1520f9e9730297ecaab7622ea upstream.

[ Note, this is a Git cherry-pick of the following commit: (limited to the cpufeatures.h file)

    3522c2a6a4f3 ("x86/cpufeature: Add User-Mode Instruction Prevention definitions")

  ... for easier x86 PTI code testing and back-porting. ]

User-Mode Instruction Prevention is a security feature present in new
Intel processors that, when set, prevents the execution of a subset of
instructions if such instructions are executed in user mode (CPL > 0).
Attempting to execute such instructions causes a general protection
exception.

The subset of instructions comprises:

 * SGDT - Store Global Descriptor Table
 * SIDT - Store Interrupt Descriptor Table
 * SLDT - Store Local Descriptor Table
 * SMSW - Store Machine Status Word
 * STR  - Store Task Register

This feature is also added to the list of disabled-features to allow
a cleaner handling of build-time configuration.

Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chen Yucong <slaoub@gmail.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: ricardo.neri@intel.com
Link: http://lkml.kernel.org/r/1509935277-22138-7-git-send-email-ricardo.neri-calderon@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/include/asm/cpufeatures.h |    1 +
 1 file changed, 1 insertion(+)
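
Note for reviewers: the new define encodes a (word, bit) pair as
word*32 + bit, so (16*32+ 2) means bit 2 of capability word 16, which the
comment in the hunk ties to CPUID level 0x00000007:0 (ECX). The sketch below
only illustrates that arithmetic in user space. The helpers
demo_feature_word(), demo_feature_bit(), demo_cpu_has() and demo_selfcheck()
are hypothetical names, and the 19-word array size is an arbitrary choice
for the example, not the kernel's NCAPINTS.

```c
#include <assert.h>
#include <stdint.h>

/* Value taken from the cpufeatures.h hunk of this patch. */
#define X86_FEATURE_UMIP	(16*32 + 2)

/* Index of the 32-bit capability word that holds the feature. */
static unsigned int demo_feature_word(unsigned int feature)
{
	return feature / 32;
}

/* Bit position of the feature inside its capability word. */
static unsigned int demo_feature_bit(unsigned int feature)
{
	return feature % 32;
}

/* Hypothetical lookup in an array of capability words. */
static int demo_cpu_has(const uint32_t *caps, unsigned int nwords,
			unsigned int feature)
{
	assert(demo_feature_word(feature) < nwords);

	return (caps[demo_feature_word(feature)] >>
		demo_feature_bit(feature)) & 1u;
}

/* Usage example: set ECX bit 2 of leaf 7 in word 16 and look it up. */
static void demo_selfcheck(void)
{
	uint32_t caps[19] = { 0 };

	assert(demo_feature_word(X86_FEATURE_UMIP) == 16);
	assert(demo_feature_bit(X86_FEATURE_UMIP) == 2);

	caps[16] = 1u << 2;
	assert(demo_cpu_has(caps, 19, X86_FEATURE_UMIP) == 1);
}
```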

--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -296,6 +296,7 @@
 
 /* Intel-defined CPU features, CPUID level 0x00000007:0 (ECX), word 16 */
 #define X86_FEATURE_AVX512VBMI		(16*32+ 1) /* AVX512 Vector Bit Manipulation instructions*/
+#define X86_FEATURE_UMIP		(16*32+ 2) /* User Mode Instruction Protection */
 #define X86_FEATURE_PKU			(16*32+ 3) /* Protection Keys for Userspace */
 #define X86_FEATURE_OSPKE		(16*32+ 4) /* OS Protection Keys Enable */
 #define X86_FEATURE_AVX512_VBMI2	(16*32+ 6) /* Additional AVX512 Vector Bit Manipulation Instructions */

^ permalink raw reply	[flat|nested] 349+ messages in thread

* [PATCH 4.14 068/159] x86: Make X86_BUG_FXSAVE_LEAK detectable in CPUID on AMD
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (69 preceding siblings ...)
  2017-12-22  8:45 ` [PATCH 4.14 067/159] x86/cpufeature: Add User-Mode Instruction Prevention definitions Greg Kroah-Hartman
@ 2017-12-22  8:45 ` Greg Kroah-Hartman
  2017-12-22  8:45 ` [PATCH 4.14 069/159] perf/x86: Enable free running PEBS for REGS_USER/INTR Greg Kroah-Hartman
                   ` (94 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Rudolf Marek, Thomas Gleixner,
	Borislav Petkov, Andy Lutomirski, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Rudolf Marek <r.marek@assembler.cz>

commit f2dbad36c55e5d3a91dccbde6e8cae345fe5632f upstream.

[ Note, this is a Git cherry-pick of the following commit:

    2b67799bdf25 ("x86: Make X86_BUG_FXSAVE_LEAK detectable in CPUID on AMD")

  ... for easier x86 PTI code testing and back-porting. ]

The latest AMD AMD64 Architecture Programmer's Manual
adds a CPUID feature XSaveErPtr (CPUID_Fn80000008_EBX[2]).

If this feature is set, the FXSAVE, XSAVE, FXSAVEOPT, XSAVEC, XSAVES
/ FXRSTOR, XRSTOR, XRSTORS always save/restore error pointers,
thus making the X86_BUG_FXSAVE_LEAK workaround obsolete on such CPUs.

Signed-Off-By: Rudolf Marek <r.marek@assembler.cz>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Tested-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Link: https://lkml.kernel.org/r/bdcebe90-62c5-1f05-083c-eba7f08b2540@assembler.cz
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/include/asm/cpufeatures.h |    1 +
 arch/x86/kernel/cpu/amd.c          |    7 +++++--
 2 files changed, 6 insertions(+), 2 deletions(-)
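
Note for reviewers: the decision made in the amd.c hunk can be written as a
small predicate, sketched below in user space. The bit position comes from
the changelog (CPUID_Fn80000008_EBX[2]); the function names and the idea of
passing the raw EBX value are assumptions made for the illustration, since
the kernel code tests the cached feature bit with cpu_has() instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 2 of CPUID Fn8000_0008 EBX, as stated in the changelog. */
#define DEMO_XSAVEERPTR_EBX_BIT	2

/* Hypothetical helper: does the raw EBX value advertise XSaveErPtr? */
static bool demo_has_xsaveerptr(uint32_t cpuid_80000008_ebx)
{
	return (cpuid_80000008_ebx >> DEMO_XSAVEERPTR_EBX_BIT) & 1u;
}

/*
 * Mirrors the patched condition: the workaround stays enabled for
 * family >= 6 unless the CPU reports XSaveErPtr.
 */
static bool demo_needs_fxsave_leak_workaround(unsigned int family,
					       uint32_t cpuid_80000008_ebx)
{
	return family >= 6 && !demo_has_xsaveerptr(cpuid_80000008_ebx);
}

/* Usage example covering the three interesting cases. */
static void demo_selfcheck(void)
{
	/* Family 0x17 without the feature bit: workaround still needed. */
	assert(demo_needs_fxsave_leak_workaround(0x17, 0));

	/* Family 0x17 with the feature bit: workaround is skipped. */
	assert(!demo_needs_fxsave_leak_workaround(0x17, 1u << 2));

	/* Families below 6 were never covered by the workaround. */
	assert(!demo_needs_fxsave_leak_workaround(5, 0));
}
```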

--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -266,6 +266,7 @@
 /* AMD-defined CPU features, CPUID level 0x80000008 (EBX), word 13 */
 #define X86_FEATURE_CLZERO		(13*32+ 0) /* CLZERO instruction */
 #define X86_FEATURE_IRPERF		(13*32+ 1) /* Instructions Retired Count */
+#define X86_FEATURE_XSAVEERPTR		(13*32+ 2) /* Always save/restore FP error pointers */
 
 /* Thermal and Power Management Leaf, CPUID level 0x00000006 (EAX), word 14 */
 #define X86_FEATURE_DTHERM		(14*32+ 0) /* Digital Thermal Sensor */
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -804,8 +804,11 @@ static void init_amd(struct cpuinfo_x86
 	case 0x17: init_amd_zn(c); break;
 	}
 
-	/* Enable workaround for FXSAVE leak */
-	if (c->x86 >= 6)
+	/*
+	 * Enable workaround for FXSAVE leak on CPUs
+	 * without a XSaveErPtr feature
+	 */
+	if ((c->x86 >= 6) && (!cpu_has(c, X86_FEATURE_XSAVEERPTR)))
 		set_cpu_bug(c, X86_BUG_FXSAVE_LEAK);
 
 	cpu_detect_cache_sizes(c);

^ permalink raw reply	[flat|nested] 349+ messages in thread

* [PATCH 4.14 069/159] perf/x86: Enable free running PEBS for REGS_USER/INTR
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (70 preceding siblings ...)
  2017-12-22  8:45 ` [PATCH 4.14 068/159] x86: Make X86_BUG_FXSAVE_LEAK detectable in CPUID on AMD Greg Kroah-Hartman
@ 2017-12-22  8:45 ` Greg Kroah-Hartman
  2017-12-22  8:45 ` [PATCH 4.14 070/159] bpf: fix build issues on um due to mising bpf_perf_event.h Greg Kroah-Hartman
                   ` (93 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Andi Kleen, Peter Zijlstra (Intel),
	Linus Torvalds, Thomas Gleixner, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andi Kleen <ak@linux.intel.com>

commit 2fe1bc1f501d55e5925b4035bcd85781adc76c63 upstream.

[ Note, this is a Git cherry-pick of the following commit:

    a47ba4d77e12 ("perf/x86: Enable free running PEBS for REGS_USER/INTR")

  ... for easier x86 PTI code testing and back-porting. ]

Currently free running PEBS is disabled when user or interrupt
registers are requested. Most of the registers are actually
available in the PEBS record and can be supported.

So we just need to check that only supported registers are requested
and then allow it: everything is supported except the segment registers.

For user registers this only works when the counter is limited
to ring 3 only, so this also needs to be checked.
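
The check described above can be sketched in plain C roughly as follows.
This is only an illustration under stated assumptions: every sketch_* /
SKETCH_* name and every bit value is hypothetical and not the kernel's,
and the supported-register mask is written as one bit per register index.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for perf sample-type bits (values illustrative). */
enum {
	SKETCH_SAMPLE_IP        = 1u << 0,
	SKETCH_SAMPLE_REGS_USER = 1u << 1,
	SKETCH_SAMPLE_REGS_INTR = 1u << 2,
};

/* Hypothetical register indices; CS stands in for the segment registers. */
enum { SKETCH_REG_AX, SKETCH_REG_BX, SKETCH_REG_SP, SKETCH_REG_IP, SKETCH_REG_CS };

/* Registers assumed to be present in the PEBS record: all but the segment one. */
#define SKETCH_PEBS_REGS \
	((1ull << SKETCH_REG_AX) | (1ull << SKETCH_REG_BX) | \
	 (1ull << SKETCH_REG_SP) | (1ull << SKETCH_REG_IP))

struct sketch_attr {
	uint32_t sample_type;      /* what the event asks to sample */
	uint64_t sample_regs_user; /* bit mask of requested registers */
	bool exclude_kernel;       /* true: counter limited to ring 3 */
};

/* Start from everything that can free-run, then drop what this event rules out. */
static uint32_t sketch_free_running_flags(const struct sketch_attr *attr)
{
	uint32_t flags = SKETCH_SAMPLE_IP | SKETCH_SAMPLE_REGS_USER |
			 SKETCH_SAMPLE_REGS_INTR;

	/* User registers are only meaningful if the counter is ring 3 only. */
	if (!attr->exclude_kernel)
		flags &= ~SKETCH_SAMPLE_REGS_USER;
	/* Any register outside the supported set disables both register modes. */
	if (attr->sample_regs_user & ~SKETCH_PEBS_REGS)
		flags &= ~(SKETCH_SAMPLE_REGS_USER | SKETCH_SAMPLE_REGS_INTR);
	return flags;
}

/* The event can free-run only if it requests nothing outside the allowed flags. */
static bool sketch_can_free_run(const struct sketch_attr *attr)
{
	return !(attr->sample_type & ~sketch_free_running_flags(attr));
}
```

In this sketch an event that samples user registers with exclude_kernel
set and only AX requested can free-run; clearing exclude_kernel or
requesting the segment register makes sketch_can_free_run() return false.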

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170831214630.21892-1-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/events/intel/core.c |    4 ++++
 arch/x86/events/perf_event.h |   24 +++++++++++++++++++++++-
 2 files changed, 27 insertions(+), 1 deletion(-)

--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -2958,6 +2958,10 @@ static unsigned long intel_pmu_free_runn
 
 	if (event->attr.use_clockid)
 		flags &= ~PERF_SAMPLE_TIME;
+	if (!event->attr.exclude_kernel)
+		flags &= ~PERF_SAMPLE_REGS_USER;
+	if (event->attr.sample_regs_user & ~PEBS_REGS)
+		flags &= ~(PERF_SAMPLE_REGS_USER | PERF_SAMPLE_REGS_INTR);
 	return flags;
 }
 
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -85,13 +85,15 @@ struct amd_nb {
  * Flags PEBS can handle without an PMI.
  *
  * TID can only be handled by flushing at context switch.
+ * REGS_USER can be handled for events limited to ring 3.
  *
  */
 #define PEBS_FREERUNNING_FLAGS \
 	(PERF_SAMPLE_IP | PERF_SAMPLE_TID | PERF_SAMPLE_ADDR | \
 	PERF_SAMPLE_ID | PERF_SAMPLE_CPU | PERF_SAMPLE_STREAM_ID | \
 	PERF_SAMPLE_DATA_SRC | PERF_SAMPLE_IDENTIFIER | \
-	PERF_SAMPLE_TRANSACTION | PERF_SAMPLE_PHYS_ADDR)
+	PERF_SAMPLE_TRANSACTION | PERF_SAMPLE_PHYS_ADDR | \
+	PERF_SAMPLE_REGS_INTR | PERF_SAMPLE_REGS_USER)
 
 /*
  * A debug store configuration.
@@ -110,6 +112,26 @@ struct debug_store {
 	u64	pebs_event_reset[MAX_PEBS_EVENTS];
 };
 
+#define PEBS_REGS \
+	(PERF_REG_X86_AX | \
+	 PERF_REG_X86_BX | \
+	 PERF_REG_X86_CX | \
+	 PERF_REG_X86_DX | \
+	 PERF_REG_X86_DI | \
+	 PERF_REG_X86_SI | \
+	 PERF_REG_X86_SP | \
+	 PERF_REG_X86_BP | \
+	 PERF_REG_X86_IP | \
+	 PERF_REG_X86_FLAGS | \
+	 PERF_REG_X86_R8 | \
+	 PERF_REG_X86_R9 | \
+	 PERF_REG_X86_R10 | \
+	 PERF_REG_X86_R11 | \
+	 PERF_REG_X86_R12 | \
+	 PERF_REG_X86_R13 | \
+	 PERF_REG_X86_R14 | \
+	 PERF_REG_X86_R15)
+
 /*
  * Per register state.
  */


* [PATCH 4.14 070/159] bpf: fix build issues on um due to mising bpf_perf_event.h
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (71 preceding siblings ...)
  2017-12-22  8:45 ` [PATCH 4.14 069/159] perf/x86: Enable free running PEBS for REGS_USER/INTR Greg Kroah-Hartman
@ 2017-12-22  8:45 ` Greg Kroah-Hartman
  2017-12-22  8:45 ` [PATCH 4.14 071/159] locking/barriers: Add implicit smp_read_barrier_depends() to READ_ONCE() Greg Kroah-Hartman
                   ` (92 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Randy Dunlap, Richard Weinberger,
	Daniel Borkmann, Hendrik Brueckner, Alexei Starovoitov,
	Richard Weinberger, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Daniel Borkmann <daniel@iogearbox.net>

commit ab95477e7cb35557ecfc837687007b646bab9a9f upstream.

[ Note, this is a Git cherry-pick of the following commit:

    a23f06f06dbe ("bpf: fix build issues on um due to mising bpf_perf_event.h")

  ... for easier x86 PTI code testing and back-porting. ]

Since c895f6f703ad ("bpf: correct broken uapi for
BPF_PROG_TYPE_PERF_EVENT program type") um (uml) won't build
on i386 or x86_64:

  [...]
    CC      init/main.o
  In file included from ../include/linux/perf_event.h:18:0,
                   from ../include/linux/trace_events.h:10,
                   from ../include/trace/syscall.h:7,
                   from ../include/linux/syscalls.h:82,
                   from ../init/main.c:20:
  ../include/uapi/linux/bpf_perf_event.h:11:32: fatal error: asm/bpf_perf_event.h: No such file or directory
   #include <asm/bpf_perf_event.h>
  [...]

Let's add the missing bpf_perf_event.h to the um arch as well. This
seems to be the only one still missing.

Fixes: c895f6f703ad ("bpf: correct broken uapi for BPF_PROG_TYPE_PERF_EVENT program type")
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Suggested-by: Richard Weinberger <richard@sigma-star.at>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Richard Weinberger <richard@sigma-star.at>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/um/include/asm/Kbuild |    1 +
 1 file changed, 1 insertion(+)

--- a/arch/um/include/asm/Kbuild
+++ b/arch/um/include/asm/Kbuild
@@ -1,4 +1,5 @@
 generic-y += barrier.h
+generic-y += bpf_perf_event.h
 generic-y += bug.h
 generic-y += clkdev.h
 generic-y += current.h


* [PATCH 4.14 071/159] locking/barriers: Add implicit smp_read_barrier_depends() to READ_ONCE()
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (72 preceding siblings ...)
  2017-12-22  8:45 ` [PATCH 4.14 070/159] bpf: fix build issues on um due to mising bpf_perf_event.h Greg Kroah-Hartman
@ 2017-12-22  8:45 ` Greg Kroah-Hartman
  2017-12-22  8:45 ` [PATCH 4.14 072/159] locking/barriers: Convert users of lockless_dereference() " Greg Kroah-Hartman
                   ` (91 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Will Deacon, Linus Torvalds,
	Paul E. McKenney, Peter Zijlstra, Thomas Gleixner, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Will Deacon <will.deacon@arm.com>

commit c2bc66082e1048c7573d72e62f597bdc5ce13fea upstream.

[ Note, this is a Git cherry-pick of the following commit:

    76ebbe78f739 ("locking/barriers: Add implicit smp_read_barrier_depends() to READ_ONCE()")

  ... for easier x86 PTI code testing and back-porting. ]

In preparation for the removal of lockless_dereference(), which is the
same as READ_ONCE() on all architectures other than Alpha, add an
implicit smp_read_barrier_depends() to READ_ONCE() so that it can be
used to head dependency chains on all architectures.
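
As a rough userspace illustration of the idea (not the kernel macro), a
read-once load followed by a dependency barrier might look like the
sketch below. It assumes GCC or Clang (statement expressions and
__typeof__), and sketch_read_barrier_depends() is a hypothetical
stand-in that is only a compiler barrier here; all SKETCH_* / sketch_*
names are made up for this example.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for smp_read_barrier_depends(). Here it is only
 * a compiler barrier; an architecture that needs a real barrier between
 * dependent loads would emit an instruction instead.
 */
#define sketch_read_barrier_depends() __asm__ __volatile__("" ::: "memory")

/* Load x exactly once through a volatile access, then order dependent loads. */
#define SKETCH_READ_ONCE(x) ({					\
	__typeof__(x) val_ = *(volatile __typeof__(x) *)&(x);	\
	sketch_read_barrier_depends();				\
	val_;							\
})

struct sketch_node {
	int value;
};

/* Pointer published by some other thread in a real program. */
static struct sketch_node *sketch_shared;

/* Head of a dependency chain: the load of ->value depends on the pointer load. */
static int sketch_read_value(void)
{
	struct sketch_node *n = SKETCH_READ_ONCE(sketch_shared);

	return n ? n->value : -1;
}
```

The point of folding the barrier into the macro is that callers such as
sketch_read_value() no longer need a separate dereference helper to get
the ordering between the pointer load and the dependent field load.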

Signed-off-by: Will Deacon <will.deacon@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1508840570-22169-3-git-send-email-will.deacon@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 include/linux/compiler.h |    1 +
 1 file changed, 1 insertion(+)

--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -341,6 +341,7 @@ static __always_inline void __write_once
 		__read_once_size(&(x), __u.__c, sizeof(x));		\
 	else								\
 		__read_once_size_nocheck(&(x), __u.__c, sizeof(x));	\
+	smp_read_barrier_depends(); /* Enforce dependency ordering from x */ \
 	__u.__val;							\
 })
 #define READ_ONCE(x) __READ_ONCE(x, 1)


* [PATCH 4.14 072/159] locking/barriers: Convert users of lockless_dereference() to READ_ONCE()
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (73 preceding siblings ...)
  2017-12-22  8:45 ` [PATCH 4.14 071/159] locking/barriers: Add implicit smp_read_barrier_depends() to READ_ONCE() Greg Kroah-Hartman
@ 2017-12-22  8:45 ` Greg Kroah-Hartman
  2017-12-22  8:45 ` [PATCH 4.14 073/159] x86/mm/kasan: Dont use vmemmap_populate() to initialize shadow Greg Kroah-Hartman
                   ` (90 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Will Deacon, Linus Torvalds,
	Paul E. McKenney, Peter Zijlstra, Thomas Gleixner, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Will Deacon <will.deacon@arm.com>

commit 3382290ed2d5e275429cef510ab21889d3ccd164 upstream.

[ Note, this is a Git cherry-pick of the following commit:

    506458efaf15 ("locking/barriers: Convert users of lockless_dereference() to READ_ONCE()")

  ... for easier x86 PTI code testing and back-porting. ]

READ_ONCE() now has an implicit smp_read_barrier_depends() call, so it
can be used instead of lockless_dereference() without any change in
semantics.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1508840570-22169-4-git-send-email-will.deacon@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/events/core.c             |    2 +-
 arch/x86/include/asm/mmu_context.h |    4 ++--
 arch/x86/kernel/ldt.c              |    2 +-
 drivers/md/dm-mpath.c              |   20 ++++++++++----------
 fs/dcache.c                        |    4 ++--
 fs/overlayfs/ovl_entry.h           |    2 +-
 fs/overlayfs/readdir.c             |    2 +-
 include/linux/rculist.h            |    4 ++--
 include/linux/rcupdate.h           |    4 ++--
 kernel/events/core.c               |    4 ++--
 kernel/seccomp.c                   |    2 +-
 kernel/task_work.c                 |    2 +-
 mm/slab.h                          |    2 +-
 13 files changed, 27 insertions(+), 27 deletions(-)

--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -2371,7 +2371,7 @@ static unsigned long get_segment_base(un
 		struct ldt_struct *ldt;
 
 		/* IRQs are off, so this synchronizes with smp_store_release */
-		ldt = lockless_dereference(current->active_mm->context.ldt);
+		ldt = READ_ONCE(current->active_mm->context.ldt);
 		if (!ldt || idx >= ldt->nr_entries)
 			return 0;
 
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -73,8 +73,8 @@ static inline void load_mm_ldt(struct mm
 #ifdef CONFIG_MODIFY_LDT_SYSCALL
 	struct ldt_struct *ldt;
 
-	/* lockless_dereference synchronizes with smp_store_release */
-	ldt = lockless_dereference(mm->context.ldt);
+	/* READ_ONCE synchronizes with smp_store_release */
+	ldt = READ_ONCE(mm->context.ldt);
 
 	/*
 	 * Any change to mm->context.ldt is followed by an IPI to all
--- a/arch/x86/kernel/ldt.c
+++ b/arch/x86/kernel/ldt.c
@@ -103,7 +103,7 @@ static void finalize_ldt_struct(struct l
 static void install_ldt(struct mm_struct *current_mm,
 			struct ldt_struct *ldt)
 {
-	/* Synchronizes with lockless_dereference in load_mm_ldt. */
+	/* Synchronizes with READ_ONCE in load_mm_ldt. */
 	smp_store_release(&current_mm->context.ldt, ldt);
 
 	/* Activate the LDT for all CPUs using current_mm. */
--- a/drivers/md/dm-mpath.c
+++ b/drivers/md/dm-mpath.c
@@ -366,7 +366,7 @@ static struct pgpath *choose_path_in_pg(
 
 	pgpath = path_to_pgpath(path);
 
-	if (unlikely(lockless_dereference(m->current_pg) != pg)) {
+	if (unlikely(READ_ONCE(m->current_pg) != pg)) {
 		/* Only update current_pgpath if pg changed */
 		spin_lock_irqsave(&m->lock, flags);
 		m->current_pgpath = pgpath;
@@ -390,7 +390,7 @@ static struct pgpath *choose_pgpath(stru
 	}
 
 	/* Were we instructed to switch PG? */
-	if (lockless_dereference(m->next_pg)) {
+	if (READ_ONCE(m->next_pg)) {
 		spin_lock_irqsave(&m->lock, flags);
 		pg = m->next_pg;
 		if (!pg) {
@@ -406,7 +406,7 @@ static struct pgpath *choose_pgpath(stru
 
 	/* Don't change PG until it has no remaining paths */
 check_current_pg:
-	pg = lockless_dereference(m->current_pg);
+	pg = READ_ONCE(m->current_pg);
 	if (pg) {
 		pgpath = choose_path_in_pg(m, pg, nr_bytes);
 		if (!IS_ERR_OR_NULL(pgpath))
@@ -473,7 +473,7 @@ static int multipath_clone_and_map(struc
 	struct request *clone;
 
 	/* Do we need to select a new pgpath? */
-	pgpath = lockless_dereference(m->current_pgpath);
+	pgpath = READ_ONCE(m->current_pgpath);
 	if (!pgpath || !test_bit(MPATHF_QUEUE_IO, &m->flags))
 		pgpath = choose_pgpath(m, nr_bytes);
 
@@ -533,7 +533,7 @@ static int __multipath_map_bio(struct mu
 	bool queue_io;
 
 	/* Do we need to select a new pgpath? */
-	pgpath = lockless_dereference(m->current_pgpath);
+	pgpath = READ_ONCE(m->current_pgpath);
 	queue_io = test_bit(MPATHF_QUEUE_IO, &m->flags);
 	if (!pgpath || !queue_io)
 		pgpath = choose_pgpath(m, nr_bytes);
@@ -1802,7 +1802,7 @@ static int multipath_prepare_ioctl(struc
 	struct pgpath *current_pgpath;
 	int r;
 
-	current_pgpath = lockless_dereference(m->current_pgpath);
+	current_pgpath = READ_ONCE(m->current_pgpath);
 	if (!current_pgpath)
 		current_pgpath = choose_pgpath(m, 0);
 
@@ -1824,7 +1824,7 @@ static int multipath_prepare_ioctl(struc
 	}
 
 	if (r == -ENOTCONN) {
-		if (!lockless_dereference(m->current_pg)) {
+		if (!READ_ONCE(m->current_pg)) {
 			/* Path status changed, redo selection */
 			(void) choose_pgpath(m, 0);
 		}
@@ -1893,9 +1893,9 @@ static int multipath_busy(struct dm_targ
 		return (m->queue_mode != DM_TYPE_MQ_REQUEST_BASED);
 
 	/* Guess which priority_group will be used at next mapping time */
-	pg = lockless_dereference(m->current_pg);
-	next_pg = lockless_dereference(m->next_pg);
-	if (unlikely(!lockless_dereference(m->current_pgpath) && next_pg))
+	pg = READ_ONCE(m->current_pg);
+	next_pg = READ_ONCE(m->next_pg);
+	if (unlikely(!READ_ONCE(m->current_pgpath) && next_pg))
 		pg = next_pg;
 
 	if (!pg) {
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -231,7 +231,7 @@ static inline int dentry_cmp(const struc
 {
 	/*
 	 * Be careful about RCU walk racing with rename:
-	 * use 'lockless_dereference' to fetch the name pointer.
+	 * use 'READ_ONCE' to fetch the name pointer.
 	 *
 	 * NOTE! Even if a rename will mean that the length
 	 * was not loaded atomically, we don't care. The
@@ -245,7 +245,7 @@ static inline int dentry_cmp(const struc
 	 * early because the data cannot match (there can
 	 * be no NUL in the ct/tcount data)
 	 */
-	const unsigned char *cs = lockless_dereference(dentry->d_name.name);
+	const unsigned char *cs = READ_ONCE(dentry->d_name.name);
 
 	return dentry_string_cmp(cs, ct, tcount);
 }
--- a/fs/overlayfs/ovl_entry.h
+++ b/fs/overlayfs/ovl_entry.h
@@ -77,5 +77,5 @@ static inline struct ovl_inode *OVL_I(st
 
 static inline struct dentry *ovl_upperdentry_dereference(struct ovl_inode *oi)
 {
-	return lockless_dereference(oi->__upperdentry);
+	return READ_ONCE(oi->__upperdentry);
 }
--- a/fs/overlayfs/readdir.c
+++ b/fs/overlayfs/readdir.c
@@ -757,7 +757,7 @@ static int ovl_dir_fsync(struct file *fi
 	if (!od->is_upper && OVL_TYPE_UPPER(ovl_path_type(dentry))) {
 		struct inode *inode = file_inode(file);
 
-		realfile = lockless_dereference(od->upperfile);
+		realfile = READ_ONCE(od->upperfile);
 		if (!realfile) {
 			struct path upperpath;
 
--- a/include/linux/rculist.h
+++ b/include/linux/rculist.h
@@ -275,7 +275,7 @@ static inline void list_splice_tail_init
  * primitives such as list_add_rcu() as long as it's guarded by rcu_read_lock().
  */
 #define list_entry_rcu(ptr, type, member) \
-	container_of(lockless_dereference(ptr), type, member)
+	container_of(READ_ONCE(ptr), type, member)
 
 /*
  * Where are list_empty_rcu() and list_first_entry_rcu()?
@@ -368,7 +368,7 @@ static inline void list_splice_tail_init
  * example is when items are added to the list, but never deleted.
  */
 #define list_entry_lockless(ptr, type, member) \
-	container_of((typeof(ptr))lockless_dereference(ptr), type, member)
+	container_of((typeof(ptr))READ_ONCE(ptr), type, member)
 
 /**
  * list_for_each_entry_lockless - iterate over rcu list of given type
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -346,7 +346,7 @@ static inline void rcu_preempt_sleep_che
 #define __rcu_dereference_check(p, c, space) \
 ({ \
 	/* Dependency order vs. p above. */ \
-	typeof(*p) *________p1 = (typeof(*p) *__force)lockless_dereference(p); \
+	typeof(*p) *________p1 = (typeof(*p) *__force)READ_ONCE(p); \
 	RCU_LOCKDEP_WARN(!(c), "suspicious rcu_dereference_check() usage"); \
 	rcu_dereference_sparse(p, space); \
 	((typeof(*p) __force __kernel *)(________p1)); \
@@ -360,7 +360,7 @@ static inline void rcu_preempt_sleep_che
 #define rcu_dereference_raw(p) \
 ({ \
 	/* Dependency order vs. p above. */ \
-	typeof(p) ________p1 = lockless_dereference(p); \
+	typeof(p) ________p1 = READ_ONCE(p); \
 	((typeof(*p) __force __kernel *)(________p1)); \
 })
 
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -4233,7 +4233,7 @@ static void perf_remove_from_owner(struc
 	 * indeed free this event, otherwise we need to serialize on
 	 * owner->perf_event_mutex.
 	 */
-	owner = lockless_dereference(event->owner);
+	owner = READ_ONCE(event->owner);
 	if (owner) {
 		/*
 		 * Since delayed_put_task_struct() also drops the last
@@ -4330,7 +4330,7 @@ again:
 		 * Cannot change, child events are not migrated, see the
 		 * comment with perf_event_ctx_lock_nested().
 		 */
-		ctx = lockless_dereference(child->ctx);
+		ctx = READ_ONCE(child->ctx);
 		/*
 		 * Since child_mutex nests inside ctx::mutex, we must jump
 		 * through hoops. We start by grabbing a reference on the ctx.
--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -190,7 +190,7 @@ static u32 seccomp_run_filters(const str
 	u32 ret = SECCOMP_RET_ALLOW;
 	/* Make sure cross-thread synced filter points somewhere sane. */
 	struct seccomp_filter *f =
-			lockless_dereference(current->seccomp.filter);
+			READ_ONCE(current->seccomp.filter);
 
 	/* Ensure unexpected behavior doesn't result in failing open. */
 	if (unlikely(WARN_ON(f == NULL)))
--- a/kernel/task_work.c
+++ b/kernel/task_work.c
@@ -68,7 +68,7 @@ task_work_cancel(struct task_struct *tas
 	 * we raced with task_work_run(), *pprev == NULL/exited.
 	 */
 	raw_spin_lock_irqsave(&task->pi_lock, flags);
-	while ((work = lockless_dereference(*pprev))) {
+	while ((work = READ_ONCE(*pprev))) {
 		if (work->func != func)
 			pprev = &work->next;
 		else if (cmpxchg(pprev, work, work->next) == work)
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -259,7 +259,7 @@ cache_from_memcg_idx(struct kmem_cache *
 	 * memcg_caches issues a write barrier to match this (see
 	 * memcg_create_kmem_cache()).
 	 */
-	cachep = lockless_dereference(arr->entries[idx]);
+	cachep = READ_ONCE(arr->entries[idx]);
 	rcu_read_unlock();
 
 	return cachep;


* [PATCH 4.14 073/159] x86/mm/kasan: Dont use vmemmap_populate() to initialize shadow
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (74 preceding siblings ...)
  2017-12-22  8:45 ` [PATCH 4.14 072/159] locking/barriers: Convert users of lockless_dereference() " Greg Kroah-Hartman
@ 2017-12-22  8:45 ` Greg Kroah-Hartman
  2017-12-22  8:45   ` Greg Kroah-Hartman
                   ` (89 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Andrey Ryabinin, Pavel Tatashin,
	Andy Lutomirski, Steven Sistare, Daniel Jordan, Bob Picco,
	Michal Hocko, Alexander Potapenko, Ard Biesheuvel,
	Catalin Marinas, Christian Borntraeger, David S. Miller,
	Dmitry Vyukov, Heiko Carstens, H. Peter Anvin, Ingo Molnar,
	Mark Rutland, Matthew Wilcox, Mel Gorman, Michal Hocko,
	Sam Ravnborg, Thomas Gleixner, Will Deacon, Andrew Morton,
	Linus Torvalds, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andrey Ryabinin <aryabinin@virtuozzo.com>

commit 2aeb07365bcd489620f71390a7d2031cd4dfb83e upstream.

[ Note, this is a Git cherry-pick of the following commit:

    d17a1d97dc20 ("x86/mm/kasan: don't use vmemmap_populate() to initialize shadow")

  ... for easier x86 PTI code testing and back-porting. ]

The KASAN shadow is currently mapped using vmemmap_populate() since that
provides a semi-convenient way to map pages into init_top_pgt.  However,
since that no longer zeroes the mapped pages, it is not suitable for
KASAN, which requires zeroed shadow memory.

Add kasan_populate_shadow() interface and use it instead of
vmemmap_populate().  Besides, this allows us to take advantage of
gigantic pages and use them to populate the shadow, which should save us
some memory wasted on page tables and reduce TLB pressure.
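
The way the range walk can pick large pages is sketched below as a
userspace simulation. It only counts how many mappings of each size a
range would get; it does not allocate anything, ignores already-populated
entries and CPU feature checks, and assumes addresses do not wrap. All
sketch_* / SKETCH_* names are hypothetical, and the sizes are the usual
x86-64 4K / 2M / 1G values.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE (1ull << 12)	/* 4K */
#define SKETCH_PMD_SIZE  (1ull << 21)	/* 2M */
#define SKETCH_PUD_SIZE  (1ull << 30)	/* 1G */

struct sketch_counts {
	unsigned long pud_pages;	/* 1G mappings */
	unsigned long pmd_pages;	/* 2M mappings */
	unsigned long pte_pages;	/* 4K mappings */
};

/* Next size-aligned boundary after addr, clamped to end. */
static uint64_t sketch_next_boundary(uint64_t addr, uint64_t end, uint64_t size)
{
	uint64_t next = (addr + size) & ~(size - 1);

	return next < end ? next : end;
}

static void sketch_populate_pmd(uint64_t addr, uint64_t end,
				struct sketch_counts *c)
{
	/* A whole, aligned 2M slot can be mapped with one large page. */
	if (end - addr == SKETCH_PMD_SIZE && (addr & (SKETCH_PMD_SIZE - 1)) == 0) {
		c->pmd_pages++;
		return;
	}
	/* Otherwise fall back to individual 4K pages. */
	for (; addr != end; addr += SKETCH_PAGE_SIZE)
		c->pte_pages++;
}

static void sketch_populate_pud(uint64_t addr, uint64_t end,
				struct sketch_counts *c)
{
	uint64_t next;

	/* A whole, aligned 1G slot can be mapped with one gigantic page. */
	if (end - addr == SKETCH_PUD_SIZE && (addr & (SKETCH_PUD_SIZE - 1)) == 0) {
		c->pud_pages++;
		return;
	}
	do {
		next = sketch_next_boundary(addr, end, SKETCH_PMD_SIZE);
		sketch_populate_pmd(addr, next, c);
		addr = next;
	} while (addr != end);
}

/* Round the range out to page boundaries, then walk it 1G slot by 1G slot. */
static struct sketch_counts sketch_populate_shadow(uint64_t addr, uint64_t end)
{
	struct sketch_counts c = { 0, 0, 0 };
	uint64_t next;

	addr &= ~(SKETCH_PAGE_SIZE - 1);
	end = (end + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
	while (addr != end) {
		next = sketch_next_boundary(addr, end, SKETCH_PUD_SIZE);
		sketch_populate_pud(addr, next, &c);
		addr = next;
	}
	return c;
}
```

With these assumptions, an aligned 1G range is covered by a single
gigantic mapping, while a range of 2M plus two pages starting on a 2M
boundary gets one 2M mapping and two 4K mappings, which is where the
page-table and TLB savings would come from.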

Link: http://lkml.kernel.org/r/20171103185147.2688-2-pasha.tatashin@oracle.com
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Steven Sistare <steven.sistare@oracle.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Bob Picco <bob.picco@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/Kconfig            |    2 
 arch/x86/mm/kasan_init_64.c |  143 +++++++++++++++++++++++++++++++++++++++++---
 2 files changed, 137 insertions(+), 8 deletions(-)

--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -108,7 +108,7 @@ config X86
 	select HAVE_ARCH_AUDITSYSCALL
 	select HAVE_ARCH_HUGE_VMAP		if X86_64 || X86_PAE
 	select HAVE_ARCH_JUMP_LABEL
-	select HAVE_ARCH_KASAN			if X86_64 && SPARSEMEM_VMEMMAP
+	select HAVE_ARCH_KASAN			if X86_64
 	select HAVE_ARCH_KGDB
 	select HAVE_ARCH_KMEMCHECK
 	select HAVE_ARCH_MMAP_RND_BITS		if MMU
--- a/arch/x86/mm/kasan_init_64.c
+++ b/arch/x86/mm/kasan_init_64.c
@@ -4,12 +4,14 @@
 #include <linux/bootmem.h>
 #include <linux/kasan.h>
 #include <linux/kdebug.h>
+#include <linux/memblock.h>
 #include <linux/mm.h>
 #include <linux/sched.h>
 #include <linux/sched/task.h>
 #include <linux/vmalloc.h>
 
 #include <asm/e820/types.h>
+#include <asm/pgalloc.h>
 #include <asm/tlbflush.h>
 #include <asm/sections.h>
 #include <asm/pgtable.h>
@@ -18,7 +20,134 @@ extern struct range pfn_mapped[E820_MAX_
 
 static p4d_t tmp_p4d_table[PTRS_PER_P4D] __initdata __aligned(PAGE_SIZE);
 
-static int __init map_range(struct range *range)
+static __init void *early_alloc(size_t size, int nid)
+{
+	return memblock_virt_alloc_try_nid_nopanic(size, size,
+		__pa(MAX_DMA_ADDRESS), BOOTMEM_ALLOC_ACCESSIBLE, nid);
+}
+
+static void __init kasan_populate_pmd(pmd_t *pmd, unsigned long addr,
+				      unsigned long end, int nid)
+{
+	pte_t *pte;
+
+	if (pmd_none(*pmd)) {
+		void *p;
+
+		if (boot_cpu_has(X86_FEATURE_PSE) &&
+		    ((end - addr) == PMD_SIZE) &&
+		    IS_ALIGNED(addr, PMD_SIZE)) {
+			p = early_alloc(PMD_SIZE, nid);
+			if (p && pmd_set_huge(pmd, __pa(p), PAGE_KERNEL))
+				return;
+			else if (p)
+				memblock_free(__pa(p), PMD_SIZE);
+		}
+
+		p = early_alloc(PAGE_SIZE, nid);
+		pmd_populate_kernel(&init_mm, pmd, p);
+	}
+
+	pte = pte_offset_kernel(pmd, addr);
+	do {
+		pte_t entry;
+		void *p;
+
+		if (!pte_none(*pte))
+			continue;
+
+		p = early_alloc(PAGE_SIZE, nid);
+		entry = pfn_pte(PFN_DOWN(__pa(p)), PAGE_KERNEL);
+		set_pte_at(&init_mm, addr, pte, entry);
+	} while (pte++, addr += PAGE_SIZE, addr != end);
+}
+
+static void __init kasan_populate_pud(pud_t *pud, unsigned long addr,
+				      unsigned long end, int nid)
+{
+	pmd_t *pmd;
+	unsigned long next;
+
+	if (pud_none(*pud)) {
+		void *p;
+
+		if (boot_cpu_has(X86_FEATURE_GBPAGES) &&
+		    ((end - addr) == PUD_SIZE) &&
+		    IS_ALIGNED(addr, PUD_SIZE)) {
+			p = early_alloc(PUD_SIZE, nid);
+			if (p && pud_set_huge(pud, __pa(p), PAGE_KERNEL))
+				return;
+			else if (p)
+				memblock_free(__pa(p), PUD_SIZE);
+		}
+
+		p = early_alloc(PAGE_SIZE, nid);
+		pud_populate(&init_mm, pud, p);
+	}
+
+	pmd = pmd_offset(pud, addr);
+	do {
+		next = pmd_addr_end(addr, end);
+		if (!pmd_large(*pmd))
+			kasan_populate_pmd(pmd, addr, next, nid);
+	} while (pmd++, addr = next, addr != end);
+}
+
+static void __init kasan_populate_p4d(p4d_t *p4d, unsigned long addr,
+				      unsigned long end, int nid)
+{
+	pud_t *pud;
+	unsigned long next;
+
+	if (p4d_none(*p4d)) {
+		void *p = early_alloc(PAGE_SIZE, nid);
+
+		p4d_populate(&init_mm, p4d, p);
+	}
+
+	pud = pud_offset(p4d, addr);
+	do {
+		next = pud_addr_end(addr, end);
+		if (!pud_large(*pud))
+			kasan_populate_pud(pud, addr, next, nid);
+	} while (pud++, addr = next, addr != end);
+}
+
+static void __init kasan_populate_pgd(pgd_t *pgd, unsigned long addr,
+				      unsigned long end, int nid)
+{
+	void *p;
+	p4d_t *p4d;
+	unsigned long next;
+
+	if (pgd_none(*pgd)) {
+		p = early_alloc(PAGE_SIZE, nid);
+		pgd_populate(&init_mm, pgd, p);
+	}
+
+	p4d = p4d_offset(pgd, addr);
+	do {
+		next = p4d_addr_end(addr, end);
+		kasan_populate_p4d(p4d, addr, next, nid);
+	} while (p4d++, addr = next, addr != end);
+}
+
+static void __init kasan_populate_shadow(unsigned long addr, unsigned long end,
+					 int nid)
+{
+	pgd_t *pgd;
+	unsigned long next;
+
+	addr = addr & PAGE_MASK;
+	end = round_up(end, PAGE_SIZE);
+	pgd = pgd_offset_k(addr);
+	do {
+		next = pgd_addr_end(addr, end);
+		kasan_populate_pgd(pgd, addr, next, nid);
+	} while (pgd++, addr = next, addr != end);
+}
+
+static void __init map_range(struct range *range)
 {
 	unsigned long start;
 	unsigned long end;
@@ -26,7 +155,7 @@ static int __init map_range(struct range
 	start = (unsigned long)kasan_mem_to_shadow(pfn_to_kaddr(range->start));
 	end = (unsigned long)kasan_mem_to_shadow(pfn_to_kaddr(range->end));
 
-	return vmemmap_populate(start, end, NUMA_NO_NODE);
+	kasan_populate_shadow(start, end, early_pfn_to_nid(range->start));
 }
 
 static void __init clear_pgds(unsigned long start,
@@ -189,16 +318,16 @@ void __init kasan_init(void)
 		if (pfn_mapped[i].end == 0)
 			break;
 
-		if (map_range(&pfn_mapped[i]))
-			panic("kasan: unable to allocate shadow!");
+		map_range(&pfn_mapped[i]);
 	}
+
 	kasan_populate_zero_shadow(
 		kasan_mem_to_shadow((void *)PAGE_OFFSET + MAXMEM),
 		kasan_mem_to_shadow((void *)__START_KERNEL_map));
 
-	vmemmap_populate((unsigned long)kasan_mem_to_shadow(_stext),
-			(unsigned long)kasan_mem_to_shadow(_end),
-			NUMA_NO_NODE);
+	kasan_populate_shadow((unsigned long)kasan_mem_to_shadow(_stext),
+			      (unsigned long)kasan_mem_to_shadow(_end),
+			      early_pfn_to_nid(__pa(_stext)));
 
 	kasan_populate_zero_shadow(kasan_mem_to_shadow((void *)MODULES_END),
 			(void *)KASAN_SHADOW_END);


* [PATCH 4.14 074/159] x86/entry/64/paravirt: Use paravirt-safe macro to access eflags
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
@ 2017-12-22  8:45   ` Greg Kroah-Hartman
  2017-12-22  8:44 ` [PATCH 4.14 002/159] objtool: Dont report end of section error after an empty unwind hint Greg Kroah-Hartman
                     ` (164 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Boris Ostrovsky, Thomas Gleixner,
	Juergen Gross, Andy Lutomirski, Borislav Petkov, Borislav Petkov,
	Brian Gerst, Dave Hansen, Dave Hansen, David Laight,
	Denys Vlasenko, Eduardo Valentin, H. Peter Anvin, Josh Poimboeuf,
	Linus Torvalds, Peter Zijlstra, Rik van Riel, Will Deacon,
	aliguori, daniel.gruss, hughd, keescook, xen-devel, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Boris Ostrovsky <boris.ostrovsky@oracle.com>

commit e17f8234538d1ff708673f287a42457c4dee720d upstream.

Commit 1d3e53e8624a ("x86/entry/64: Refactor IRQ stacks and make them
NMI-safe") added the DEBUG_ENTRY_ASSERT_IRQS_OFF macro, which accesses
eflags using the 'pushfq' instruction when testing for the IF bit. On PV
Xen guests, looking at the IF flag directly will always see it set,
resulting in 'ud2'.

Introduce a SAVE_FLAGS() macro that uses the appropriate save_fl pv op
when running paravirt.
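
The difference between reading the flags directly and going through a
pv op can be modelled in C along these lines. This is a toy model, not
kernel code: the sketch_* names are hypothetical, the "hardware" flags
and the virtual interrupt mask are plain variables, and the assumption
that a PV guest tracks interrupt masking in a software flag is stated
here for illustration only. The IF bit is bit 9 of eflags (0x200).

```c
#include <stdbool.h>

#define SKETCH_EFLAGS_IF 0x200ul	/* interrupt flag, bit 9 of eflags */

/* Hypothetical ops table with a single save_fl hook. */
struct sketch_irq_ops {
	unsigned long (*save_fl)(void);
};

/* What a direct pushfq would observe in this model: IF always set. */
static unsigned long sketch_hw_flags = SKETCH_EFLAGS_IF;

/* Assumed software interrupt mask maintained for the PV guest. */
static bool sketch_virt_masked = true;

/* Native variant: report the hardware flags as they are. */
static unsigned long sketch_native_save_fl(void)
{
	return sketch_hw_flags;
}

/* PV variant: synthesize IF from the virtual mask instead. */
static unsigned long sketch_pv_save_fl(void)
{
	return sketch_virt_masked ? 0 : SKETCH_EFLAGS_IF;
}

/* The assertion only holds if the flags are read through the right op. */
static bool sketch_irqs_off(const struct sketch_irq_ops *ops)
{
	return !(ops->save_fl() & SKETCH_EFLAGS_IF);
}
```

In this model, checking through the native hook reports interrupts as
enabled even though the guest has them masked, while checking through
the PV hook gives the answer the debug assertion actually wants.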

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Laight <David.Laight@aculab.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Eduardo Valentin <eduval@amazon.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: aliguori@amazon.com
Cc: daniel.gruss@iaik.tugraz.at
Cc: hughd@google.com
Cc: keescook@google.com
Cc: xen-devel@lists.xenproject.org
Link: https://lkml.kernel.org/r/20171204150604.899457242@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/entry/entry_64.S        |    7 ++++---
 arch/x86/include/asm/irqflags.h  |    3 +++
 arch/x86/include/asm/paravirt.h  |    9 +++++++++
 arch/x86/kernel/asm-offsets_64.c |    3 +++
 4 files changed, 19 insertions(+), 3 deletions(-)

--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -462,12 +462,13 @@ END(irq_entries_start)
 
 .macro DEBUG_ENTRY_ASSERT_IRQS_OFF
 #ifdef CONFIG_DEBUG_ENTRY
-	pushfq
-	testl $X86_EFLAGS_IF, (%rsp)
+	pushq %rax
+	SAVE_FLAGS(CLBR_RAX)
+	testl $X86_EFLAGS_IF, %eax
 	jz .Lokay_\@
 	ud2
 .Lokay_\@:
-	addq $8, %rsp
+	popq %rax
 #endif
 .endm
 
--- a/arch/x86/include/asm/irqflags.h
+++ b/arch/x86/include/asm/irqflags.h
@@ -142,6 +142,9 @@ static inline notrace unsigned long arch
 	swapgs;					\
 	sysretl
 
+#ifdef CONFIG_DEBUG_ENTRY
+#define SAVE_FLAGS(x)		pushfq; popq %rax
+#endif
 #else
 #define INTERRUPT_RETURN		iret
 #define ENABLE_INTERRUPTS_SYSEXIT	sti; sysexit
--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -927,6 +927,15 @@ extern void default_banner(void);
 	PARA_SITE(PARA_PATCH(pv_cpu_ops, PV_CPU_usergs_sysret64),	\
 		  CLBR_NONE,						\
 		  jmp PARA_INDIRECT(pv_cpu_ops+PV_CPU_usergs_sysret64))
+
+#ifdef CONFIG_DEBUG_ENTRY
+#define SAVE_FLAGS(clobbers)                                        \
+	PARA_SITE(PARA_PATCH(pv_irq_ops, PV_IRQ_save_fl), clobbers, \
+		  PV_SAVE_REGS(clobbers | CLBR_CALLEE_SAVE);        \
+		  call PARA_INDIRECT(pv_irq_ops+PV_IRQ_save_fl);    \
+		  PV_RESTORE_REGS(clobbers | CLBR_CALLEE_SAVE);)
+#endif
+
 #endif	/* CONFIG_X86_32 */
 
 #endif /* __ASSEMBLY__ */
--- a/arch/x86/kernel/asm-offsets_64.c
+++ b/arch/x86/kernel/asm-offsets_64.c
@@ -23,6 +23,9 @@ int main(void)
 #ifdef CONFIG_PARAVIRT
 	OFFSET(PV_CPU_usergs_sysret64, pv_cpu_ops, usergs_sysret64);
 	OFFSET(PV_CPU_swapgs, pv_cpu_ops, swapgs);
+#ifdef CONFIG_DEBUG_ENTRY
+	OFFSET(PV_IRQ_save_fl, pv_irq_ops, save_fl);
+#endif
 	BLANK();
 #endif
 

^ permalink raw reply	[flat|nested] 349+ messages in thread

* [PATCH 4.14 074/159] x86/entry/64/paravirt: Use paravirt-safe macro to access eflags
@ 2017-12-22  8:45   ` Greg Kroah-Hartman
  0 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Peter Zijlstra, Dave Hansen, Will Deacon, Dave Hansen,
	H. Peter Anvin, Thomas Gleixner, Eduardo Valentin, hughd,
	Ingo Molnar, aliguori, xen-devel, Rik van Riel, Denys Vlasenko,
	daniel.gruss, Brian Gerst, Borislav Petkov, Andy Lutomirski,
	Josh Poimboeuf, Boris Ostrovsky, Borislav Petkov, Juergen Gross,
	Greg Kroah-Hartman, stable, David Laight, keescook,
	Linus Torvalds

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Boris Ostrovsky <boris.ostrovsky@oracle.com>

commit e17f8234538d1ff708673f287a42457c4dee720d upstream.

Commit 1d3e53e8624a ("x86/entry/64: Refactor IRQ stacks and make them
NMI-safe") added the DEBUG_ENTRY_ASSERT_IRQS_OFF macro, which accesses
eflags using the 'pushfq' instruction when testing for the IF bit. On PV
Xen guests, looking at the IF flag directly will always see it set,
resulting in 'ud2'.

Introduce a SAVE_FLAGS() macro that uses the appropriate save_fl pv op when
running paravirt.
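
For readers who want the idea without the assembler macros, here is a
minimal user-space sketch of why the indirection matters. Everything
prefixed with sketch_ is hypothetical and exists only for illustration;
only X86_EFLAGS_IF (bit 9) and the save_fl name come from the kernel. It
assumes a PV guest derives its "interrupts enabled" state from a
per-vCPU mask rather than from the hardware flag.

```c
#include <assert.h>
#include <stdbool.h>

#define X86_EFLAGS_IF 0x200UL	/* interrupt-enable flag, bit 9 of EFLAGS */

/* Hypothetical stand-ins for state the real kernel reads elsewhere. */
static unsigned long sketch_hw_eflags = X86_EFLAGS_IF;	/* what pushfq would report */
static unsigned char sketch_upcall_mask;	/* nonzero: PV guest has events masked */

/* Native case: report the hardware flags as they are. */
static unsigned long sketch_native_save_fl(void)
{
	return sketch_hw_eflags;
}

/* PV case: synthesize IF from the per-vCPU mask, not from hardware. */
static unsigned long sketch_pv_save_fl(void)
{
	return sketch_upcall_mask ? 0 : X86_EFLAGS_IF;
}

/* Simplified model of the ops table that SAVE_FLAGS() dispatches through. */
struct sketch_irq_ops {
	unsigned long (*save_fl)(void);
};

/* Models DEBUG_ENTRY_ASSERT_IRQS_OFF: true means the assertion passes. */
static bool sketch_irqs_off(const struct sketch_irq_ops *ops)
{
	assert(ops && ops->save_fl);
	return !(ops->save_fl() & X86_EFLAGS_IF);
}
```

With the mask set and the hardware flag still reporting IF=1, the
native-style read fails the check while the pv-op read passes, which is
the spurious 'ud2' described above.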

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Laight <David.Laight@aculab.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Eduardo Valentin <eduval@amazon.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: aliguori@amazon.com
Cc: daniel.gruss@iaik.tugraz.at
Cc: hughd@google.com
Cc: keescook@google.com
Cc: xen-devel@lists.xenproject.org
Link: https://lkml.kernel.org/r/20171204150604.899457242@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/entry/entry_64.S        |    7 ++++---
 arch/x86/include/asm/irqflags.h  |    3 +++
 arch/x86/include/asm/paravirt.h  |    9 +++++++++
 arch/x86/kernel/asm-offsets_64.c |    3 +++
 4 files changed, 19 insertions(+), 3 deletions(-)

--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -462,12 +462,13 @@ END(irq_entries_start)
 
 .macro DEBUG_ENTRY_ASSERT_IRQS_OFF
 #ifdef CONFIG_DEBUG_ENTRY
-	pushfq
-	testl $X86_EFLAGS_IF, (%rsp)
+	pushq %rax
+	SAVE_FLAGS(CLBR_RAX)
+	testl $X86_EFLAGS_IF, %eax
 	jz .Lokay_\@
 	ud2
 .Lokay_\@:
-	addq $8, %rsp
+	popq %rax
 #endif
 .endm
 
--- a/arch/x86/include/asm/irqflags.h
+++ b/arch/x86/include/asm/irqflags.h
@@ -142,6 +142,9 @@ static inline notrace unsigned long arch
 	swapgs;					\
 	sysretl
 
+#ifdef CONFIG_DEBUG_ENTRY
+#define SAVE_FLAGS(x)		pushfq; popq %rax
+#endif
 #else
 #define INTERRUPT_RETURN		iret
 #define ENABLE_INTERRUPTS_SYSEXIT	sti; sysexit
--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -927,6 +927,15 @@ extern void default_banner(void);
 	PARA_SITE(PARA_PATCH(pv_cpu_ops, PV_CPU_usergs_sysret64),	\
 		  CLBR_NONE,						\
 		  jmp PARA_INDIRECT(pv_cpu_ops+PV_CPU_usergs_sysret64))
+
+#ifdef CONFIG_DEBUG_ENTRY
+#define SAVE_FLAGS(clobbers)                                        \
+	PARA_SITE(PARA_PATCH(pv_irq_ops, PV_IRQ_save_fl), clobbers, \
+		  PV_SAVE_REGS(clobbers | CLBR_CALLEE_SAVE);        \
+		  call PARA_INDIRECT(pv_irq_ops+PV_IRQ_save_fl);    \
+		  PV_RESTORE_REGS(clobbers | CLBR_CALLEE_SAVE);)
+#endif
+
 #endif	/* CONFIG_X86_32 */
 
 #endif /* __ASSEMBLY__ */
--- a/arch/x86/kernel/asm-offsets_64.c
+++ b/arch/x86/kernel/asm-offsets_64.c
@@ -23,6 +23,9 @@ int main(void)
 #ifdef CONFIG_PARAVIRT
 	OFFSET(PV_CPU_usergs_sysret64, pv_cpu_ops, usergs_sysret64);
 	OFFSET(PV_CPU_swapgs, pv_cpu_ops, swapgs);
+#ifdef CONFIG_DEBUG_ENTRY
+	OFFSET(PV_IRQ_save_fl, pv_irq_ops, save_fl);
+#endif
 	BLANK();
 #endif
 




^ permalink raw reply	[flat|nested] 349+ messages in thread

* [PATCH 4.14 075/159] x86/unwinder/orc: Dont bail on stack overflow
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (76 preceding siblings ...)
  2017-12-22  8:45   ` Greg Kroah-Hartman
@ 2017-12-22  8:46 ` Greg Kroah-Hartman
  2017-12-22  8:46 ` [PATCH 4.14 076/159] x86/unwinder: Handle stack overflows more gracefully Greg Kroah-Hartman
                   ` (87 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Andy Lutomirski, Thomas Gleixner,
	Boris Ostrovsky, Borislav Petkov, Borislav Petkov, Brian Gerst,
	Dave Hansen, Dave Hansen, David Laight, Denys Vlasenko,
	Eduardo Valentin, H. Peter Anvin, Josh Poimboeuf, Juergen Gross,
	Linus Torvalds, Peter Zijlstra, Rik van Riel, Will Deacon,
	aliguori, daniel.gruss, hughd, keescook, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <luto@kernel.org>

commit d3a09104018cf2ad5973dfa8a9c138ef9f5015a3 upstream.

If the stack overflows into a guard page, the ORC unwinder should still
work well: by construction, there can't be any meaningful data in the guard
page because no writes to the guard page will have succeeded.

But there is a bug that prevents unwinding from working correctly: if the
starting register state has RSP pointing into a stack guard page, the ORC
unwinder bails out immediately.

Instead of bailing out immediately, check whether the next page up is a
valid stack page and, if so, analyze that. As a result the ORC unwinder
will start the unwind.

Tested by intentionally overflowing the task stack.  The result is an
accurate call trace instead of a trace consisting purely of '?' entries.

There are a few other bugs that are triggered if the unwinder encounters a
stack overflow after the first step, but they are outside the scope of this
fix.
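
As a rough illustration of the fallback, the following user-space sketch
models a stack as an address range with an unmapped guard page directly
below it. The sketch_ names and the 4096-byte page size are assumptions
made for the example; the real code calls get_stack_info() and
PAGE_ALIGN() as shown in the diff below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096UL
/* Round up to the next page boundary (aligned addresses stay put). */
#define SKETCH_PAGE_ALIGN(addr) \
	(((addr) + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1))

/* Hypothetical stack descriptor: valid addresses are [begin, end). */
struct sketch_stack {
	uintptr_t begin;
	uintptr_t end;
};

static bool sketch_on_stack(const struct sketch_stack *s, uintptr_t addr)
{
	return addr >= s->begin && addr < s->end;
}

/*
 * Returns true if an unwind can start from sp: either sp is on the
 * stack, or sp sits in the page just below it (the guard page) so that
 * the next page up is the stack itself.
 */
static bool sketch_unwind_can_start(const struct sketch_stack *s, uintptr_t sp)
{
	assert(s && s->begin < s->end);
	if (sketch_on_stack(s, sp))
		return true;
	return sketch_on_stack(s, SKETCH_PAGE_ALIGN(sp));
}
```

A stack pointer that has run a few bytes into the guard page rounds up
to the first byte of the real stack, so the unwind proceeds; a pointer
two pages away still fails.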

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Laight <David.Laight@aculab.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Eduardo Valentin <eduval@amazon.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: aliguori@amazon.com
Cc: daniel.gruss@iaik.tugraz.at
Cc: hughd@google.com
Cc: keescook@google.com
Link: https://lkml.kernel.org/r/20171204150604.991389777@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/kernel/unwind_orc.c |   14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

--- a/arch/x86/kernel/unwind_orc.c
+++ b/arch/x86/kernel/unwind_orc.c
@@ -553,8 +553,18 @@ void __unwind_start(struct unwind_state
 	}
 
 	if (get_stack_info((unsigned long *)state->sp, state->task,
-			   &state->stack_info, &state->stack_mask))
-		return;
+			   &state->stack_info, &state->stack_mask)) {
+		/*
+		 * We weren't on a valid stack.  It's possible that
+		 * we overflowed a valid stack into a guard page.
+		 * See if the next page up is valid so that we can
+		 * generate some kind of backtrace if this happens.
+		 */
+		void *next_page = (void *)PAGE_ALIGN((unsigned long)state->sp);
+		if (get_stack_info(next_page, state->task, &state->stack_info,
+				   &state->stack_mask))
+			return;
+	}
 
 	/*
 	 * The caller can provide the address of the first frame directly

^ permalink raw reply	[flat|nested] 349+ messages in thread

* [PATCH 4.14 076/159] x86/unwinder: Handle stack overflows more gracefully
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (77 preceding siblings ...)
  2017-12-22  8:46 ` [PATCH 4.14 075/159] x86/unwinder/orc: Dont bail on stack overflow Greg Kroah-Hartman
@ 2017-12-22  8:46 ` Greg Kroah-Hartman
  2017-12-22  8:46 ` [PATCH 4.14 077/159] x86/irq: Remove an old outdated comment about context tracking races Greg Kroah-Hartman
                   ` (86 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Josh Poimboeuf, Thomas Gleixner,
	Borislav Petkov, Andy Lutomirski, Boris Ostrovsky,
	Borislav Petkov, Brian Gerst, Dave Hansen, Dave Hansen,
	David Laight, Denys Vlasenko, Eduardo Valentin, H. Peter Anvin,
	Juergen Gross, Linus Torvalds, Peter Zijlstra, Rik van Riel,
	Will Deacon, aliguori, daniel.gruss, hughd, keescook,
	Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Josh Poimboeuf <jpoimboe@redhat.com>

commit b02fcf9ba1211097754b286043cd87a8b4907e75 upstream.

There are at least two unwinder bugs hindering the debugging of
stack-overflow crashes:

- It doesn't deal gracefully with the case where the stack overflows and
  the stack pointer itself isn't on a valid stack but the
  to-be-dereferenced data *is*.

- The ORC oops dump code doesn't know how to print partial pt_regs, for the
  case where we get an interrupt/exception in *early* entry code before
  the full pt_regs have been saved.

Fix both issues.
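
To make the "partial pt_regs" case concrete, here is a small user-space
sketch of the decision that show_regs_safe() makes in the diff below.
The struct is a mock laid out like x86-64 pt_regs with fixed 64-bit
fields, and the sketch_ names are hypothetical; the two macros mirror
the ones this patch adds to unwind.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mock of x86-64 pt_regs; the last five fields are the iret frame. */
struct sketch_pt_regs {
	uint64_t r15, r14, r13, r12, bp, bx;
	uint64_t r11, r10, r9, r8, ax, cx, dx, si, di;
	uint64_t orig_ax;
	uint64_t ip, cs, flags, sp, ss;
};

#define IRET_FRAME_OFFSET (offsetof(struct sketch_pt_regs, ip))
#define IRET_FRAME_SIZE   (sizeof(struct sketch_pt_regs) - IRET_FRAME_OFFSET)

/* Hypothetical stack descriptor: valid addresses are [begin, end). */
struct sketch_stack {
	uintptr_t begin, end;
};

enum sketch_regs_kind { SKETCH_REGS_NONE, SKETCH_REGS_IRET, SKETCH_REGS_FULL };

static int sketch_on_stack(const struct sketch_stack *s, uintptr_t addr,
			   size_t len)
{
	return addr >= s->begin && addr < s->end && addr + len <= s->end;
}

/* Decide how much of the register block at 'regs' is safe to print. */
static enum sketch_regs_kind sketch_regs_safe(const struct sketch_stack *s,
					      uintptr_t regs)
{
	assert(s && s->begin < s->end);
	if (sketch_on_stack(s, regs, sizeof(struct sketch_pt_regs)))
		return SKETCH_REGS_FULL;
	if (sketch_on_stack(s, regs + IRET_FRAME_OFFSET, IRET_FRAME_SIZE))
		return SKETCH_REGS_IRET;
	return SKETCH_REGS_NONE;
}
```

With this mock layout the iret frame starts 128 bytes into the structure
and is 40 bytes long, so a register block whose first 128 bytes fall
below the stack can still have its RIP/CS/EFLAGS/RSP/SS printed.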

http://lkml.kernel.org/r/20171126024031.uxi4numpbjm5rlbr@treble

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bpetkov@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Laight <David.Laight@aculab.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Eduardo Valentin <eduval@amazon.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: aliguori@amazon.com
Cc: daniel.gruss@iaik.tugraz.at
Cc: hughd@google.com
Cc: keescook@google.com
Link: https://lkml.kernel.org/r/20171204150605.071425003@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/include/asm/kdebug.h |    1 
 arch/x86/include/asm/unwind.h |    7 +++
 arch/x86/kernel/dumpstack.c   |   32 ++++++++++++++---
 arch/x86/kernel/process_64.c  |   11 ++----
 arch/x86/kernel/unwind_orc.c  |   76 ++++++++++++++----------------------------
 5 files changed, 66 insertions(+), 61 deletions(-)

--- a/arch/x86/include/asm/kdebug.h
+++ b/arch/x86/include/asm/kdebug.h
@@ -26,6 +26,7 @@ extern void die(const char *, struct pt_
 extern int __must_check __die(const char *, struct pt_regs *, long);
 extern void show_stack_regs(struct pt_regs *regs);
 extern void __show_regs(struct pt_regs *regs, int all);
+extern void show_iret_regs(struct pt_regs *regs);
 extern unsigned long oops_begin(void);
 extern void oops_end(unsigned long, struct pt_regs *, int signr);
 
--- a/arch/x86/include/asm/unwind.h
+++ b/arch/x86/include/asm/unwind.h
@@ -7,6 +7,9 @@
 #include <asm/ptrace.h>
 #include <asm/stacktrace.h>
 
+#define IRET_FRAME_OFFSET (offsetof(struct pt_regs, ip))
+#define IRET_FRAME_SIZE   (sizeof(struct pt_regs) - IRET_FRAME_OFFSET)
+
 struct unwind_state {
 	struct stack_info stack_info;
 	unsigned long stack_mask;
@@ -52,6 +55,10 @@ void unwind_start(struct unwind_state *s
 }
 
 #if defined(CONFIG_UNWINDER_ORC) || defined(CONFIG_UNWINDER_FRAME_POINTER)
+/*
+ * WARNING: The entire pt_regs may not be safe to dereference.  In some cases,
+ * only the iret frame registers are accessible.  Use with caution!
+ */
 static inline struct pt_regs *unwind_get_entry_regs(struct unwind_state *state)
 {
 	if (unwind_done(state))
--- a/arch/x86/kernel/dumpstack.c
+++ b/arch/x86/kernel/dumpstack.c
@@ -50,6 +50,28 @@ static void printk_stack_address(unsigne
 	printk("%s %s%pB\n", log_lvl, reliable ? "" : "? ", (void *)address);
 }
 
+void show_iret_regs(struct pt_regs *regs)
+{
+	printk(KERN_DEFAULT "RIP: %04x:%pS\n", (int)regs->cs, (void *)regs->ip);
+	printk(KERN_DEFAULT "RSP: %04x:%016lx EFLAGS: %08lx", (int)regs->ss,
+		regs->sp, regs->flags);
+}
+
+static void show_regs_safe(struct stack_info *info, struct pt_regs *regs)
+{
+	if (on_stack(info, regs, sizeof(*regs)))
+		__show_regs(regs, 0);
+	else if (on_stack(info, (void *)regs + IRET_FRAME_OFFSET,
+			  IRET_FRAME_SIZE)) {
+		/*
+		 * When an interrupt or exception occurs in entry code, the
+		 * full pt_regs might not have been saved yet.  In that case
+		 * just print the iret frame.
+		 */
+		show_iret_regs(regs);
+	}
+}
+
 void show_trace_log_lvl(struct task_struct *task, struct pt_regs *regs,
 			unsigned long *stack, char *log_lvl)
 {
@@ -94,8 +116,8 @@ void show_trace_log_lvl(struct task_stru
 		if (stack_name)
 			printk("%s <%s>\n", log_lvl, stack_name);
 
-		if (regs && on_stack(&stack_info, regs, sizeof(*regs)))
-			__show_regs(regs, 0);
+		if (regs)
+			show_regs_safe(&stack_info, regs);
 
 		/*
 		 * Scan the stack, printing any text addresses we find.  At the
@@ -119,7 +141,7 @@ void show_trace_log_lvl(struct task_stru
 
 			/*
 			 * Don't print regs->ip again if it was already printed
-			 * by __show_regs() below.
+			 * by show_regs_safe() below.
 			 */
 			if (regs && stack == &regs->ip)
 				goto next;
@@ -155,8 +177,8 @@ next:
 
 			/* if the frame has entry regs, print them */
 			regs = unwind_get_entry_regs(&state);
-			if (regs && on_stack(&stack_info, regs, sizeof(*regs)))
-				__show_regs(regs, 0);
+			if (regs)
+				show_regs_safe(&stack_info, regs);
 		}
 
 		if (stack_name)
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -69,9 +69,8 @@ void __show_regs(struct pt_regs *regs, i
 	unsigned int fsindex, gsindex;
 	unsigned int ds, cs, es;
 
-	printk(KERN_DEFAULT "RIP: %04lx:%pS\n", regs->cs, (void *)regs->ip);
-	printk(KERN_DEFAULT "RSP: %04lx:%016lx EFLAGS: %08lx", regs->ss,
-		regs->sp, regs->flags);
+	show_iret_regs(regs);
+
 	if (regs->orig_ax != -1)
 		pr_cont(" ORIG_RAX: %016lx\n", regs->orig_ax);
 	else
@@ -88,6 +87,9 @@ void __show_regs(struct pt_regs *regs, i
 	printk(KERN_DEFAULT "R13: %016lx R14: %016lx R15: %016lx\n",
 	       regs->r13, regs->r14, regs->r15);
 
+	if (!all)
+		return;
+
 	asm("movl %%ds,%0" : "=r" (ds));
 	asm("movl %%cs,%0" : "=r" (cs));
 	asm("movl %%es,%0" : "=r" (es));
@@ -98,9 +100,6 @@ void __show_regs(struct pt_regs *regs, i
 	rdmsrl(MSR_GS_BASE, gs);
 	rdmsrl(MSR_KERNEL_GS_BASE, shadowgs);
 
-	if (!all)
-		return;
-
 	cr0 = read_cr0();
 	cr2 = read_cr2();
 	cr3 = __read_cr3();
--- a/arch/x86/kernel/unwind_orc.c
+++ b/arch/x86/kernel/unwind_orc.c
@@ -253,22 +253,15 @@ unsigned long *unwind_get_return_address
 	return NULL;
 }
 
-static bool stack_access_ok(struct unwind_state *state, unsigned long addr,
+static bool stack_access_ok(struct unwind_state *state, unsigned long _addr,
 			    size_t len)
 {
 	struct stack_info *info = &state->stack_info;
+	void *addr = (void *)_addr;
 
-	/*
-	 * If the address isn't on the current stack, switch to the next one.
-	 *
-	 * We may have to traverse multiple stacks to deal with the possibility
-	 * that info->next_sp could point to an empty stack and the address
-	 * could be on a subsequent stack.
-	 */
-	while (!on_stack(info, (void *)addr, len))
-		if (get_stack_info(info->next_sp, state->task, info,
-				   &state->stack_mask))
-			return false;
+	if (!on_stack(info, addr, len) &&
+	    (get_stack_info(addr, state->task, info, &state->stack_mask)))
+		return false;
 
 	return true;
 }
@@ -283,42 +276,32 @@ static bool deref_stack_reg(struct unwin
 	return true;
 }
 
-#define REGS_SIZE (sizeof(struct pt_regs))
-#define SP_OFFSET (offsetof(struct pt_regs, sp))
-#define IRET_REGS_SIZE (REGS_SIZE - offsetof(struct pt_regs, ip))
-#define IRET_SP_OFFSET (SP_OFFSET - offsetof(struct pt_regs, ip))
-
 static bool deref_stack_regs(struct unwind_state *state, unsigned long addr,
-			     unsigned long *ip, unsigned long *sp, bool full)
+			     unsigned long *ip, unsigned long *sp)
 {
-	size_t regs_size = full ? REGS_SIZE : IRET_REGS_SIZE;
-	size_t sp_offset = full ? SP_OFFSET : IRET_SP_OFFSET;
-	struct pt_regs *regs = (struct pt_regs *)(addr + regs_size - REGS_SIZE);
-
-	if (IS_ENABLED(CONFIG_X86_64)) {
-		if (!stack_access_ok(state, addr, regs_size))
-			return false;
-
-		*ip = regs->ip;
-		*sp = regs->sp;
+	struct pt_regs *regs = (struct pt_regs *)addr;
 
-		return true;
-	}
+	/* x86-32 support will be more complicated due to the &regs->sp hack */
+	BUILD_BUG_ON(IS_ENABLED(CONFIG_X86_32));
 
-	if (!stack_access_ok(state, addr, sp_offset))
+	if (!stack_access_ok(state, addr, sizeof(struct pt_regs)))
 		return false;
 
 	*ip = regs->ip;
+	*sp = regs->sp;
+	return true;
+}
 
-	if (user_mode(regs)) {
-		if (!stack_access_ok(state, addr + sp_offset,
-				     REGS_SIZE - SP_OFFSET))
-			return false;
-
-		*sp = regs->sp;
-	} else
-		*sp = (unsigned long)&regs->sp;
+static bool deref_stack_iret_regs(struct unwind_state *state, unsigned long addr,
+				  unsigned long *ip, unsigned long *sp)
+{
+	struct pt_regs *regs = (void *)addr - IRET_FRAME_OFFSET;
 
+	if (!stack_access_ok(state, addr, IRET_FRAME_SIZE))
+		return false;
+
+	*ip = regs->ip;
+	*sp = regs->sp;
 	return true;
 }
 
@@ -327,7 +310,6 @@ bool unwind_next_frame(struct unwind_sta
 	unsigned long ip_p, sp, orig_ip, prev_sp = state->sp;
 	enum stack_type prev_type = state->stack_info.type;
 	struct orc_entry *orc;
-	struct pt_regs *ptregs;
 	bool indirect = false;
 
 	if (unwind_done(state))
@@ -435,7 +417,7 @@ bool unwind_next_frame(struct unwind_sta
 		break;
 
 	case ORC_TYPE_REGS:
-		if (!deref_stack_regs(state, sp, &state->ip, &state->sp, true)) {
+		if (!deref_stack_regs(state, sp, &state->ip, &state->sp)) {
 			orc_warn("can't dereference registers at %p for ip %pB\n",
 				 (void *)sp, (void *)orig_ip);
 			goto done;
@@ -447,20 +429,14 @@ bool unwind_next_frame(struct unwind_sta
 		break;
 
 	case ORC_TYPE_REGS_IRET:
-		if (!deref_stack_regs(state, sp, &state->ip, &state->sp, false)) {
+		if (!deref_stack_iret_regs(state, sp, &state->ip, &state->sp)) {
 			orc_warn("can't dereference iret registers at %p for ip %pB\n",
 				 (void *)sp, (void *)orig_ip);
 			goto done;
 		}
 
-		ptregs = container_of((void *)sp, struct pt_regs, ip);
-		if ((unsigned long)ptregs >= prev_sp &&
-		    on_stack(&state->stack_info, ptregs, REGS_SIZE)) {
-			state->regs = ptregs;
-			state->full_regs = false;
-		} else
-			state->regs = NULL;
-
+		state->regs = (void *)sp - IRET_FRAME_OFFSET;
+		state->full_regs = false;
 		state->signal = true;
 		break;
 

^ permalink raw reply	[flat|nested] 349+ messages in thread

* [PATCH 4.14 077/159] x86/irq: Remove an old outdated comment about context tracking races
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (78 preceding siblings ...)
  2017-12-22  8:46 ` [PATCH 4.14 076/159] x86/unwinder: Handle stack overflows more gracefully Greg Kroah-Hartman
@ 2017-12-22  8:46 ` Greg Kroah-Hartman
  2017-12-22  8:46 ` [PATCH 4.14 078/159] x86/irq/64: Print the offending IP in the stack overflow warning Greg Kroah-Hartman
                   ` (85 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Andy Lutomirski, Thomas Gleixner,
	Borislav Petkov, Boris Ostrovsky, Borislav Petkov,
	Borislav Petkov, Brian Gerst, Dave Hansen, Dave Hansen,
	David Laight, Denys Vlasenko, Eduardo Valentin, H. Peter Anvin,
	Josh Poimboeuf, Juergen Gross, Linus Torvalds, Peter Zijlstra,
	Rik van Riel, Will Deacon, aliguori, daniel.gruss, hughd,
	keescook, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <luto@kernel.org>

commit 6669a692605547892a026445e460bf233958bd7f upstream.

That race has been fixed and code cleaned up for a while now.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Laight <David.Laight@aculab.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Eduardo Valentin <eduval@amazon.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: aliguori@amazon.com
Cc: daniel.gruss@iaik.tugraz.at
Cc: hughd@google.com
Cc: keescook@google.com
Link: https://lkml.kernel.org/r/20171204150605.150551639@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/kernel/irq.c |   12 ------------
 1 file changed, 12 deletions(-)

--- a/arch/x86/kernel/irq.c
+++ b/arch/x86/kernel/irq.c
@@ -219,18 +219,6 @@ __visible unsigned int __irq_entry do_IR
 	/* high bit used in ret_from_ code  */
 	unsigned vector = ~regs->orig_ax;
 
-	/*
-	 * NB: Unlike exception entries, IRQ entries do not reliably
-	 * handle context tracking in the low-level entry code.  This is
-	 * because syscall entries execute briefly with IRQs on before
-	 * updating context tracking state, so we can take an IRQ from
-	 * kernel mode with CONTEXT_USER.  The low-level entry code only
-	 * updates the context if we came from user mode, so we won't
-	 * switch to CONTEXT_KERNEL.  We'll fix that once the syscall
-	 * code is cleaned up enough that we can cleanly defer enabling
-	 * IRQs.
-	 */
-
 	entering_irq();
 
 	/* entering_irq() tells RCU that we're not quiescent.  Check it. */

^ permalink raw reply	[flat|nested] 349+ messages in thread

* [PATCH 4.14 078/159] x86/irq/64: Print the offending IP in the stack overflow warning
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (79 preceding siblings ...)
  2017-12-22  8:46 ` [PATCH 4.14 077/159] x86/irq: Remove an old outdated comment about context tracking races Greg Kroah-Hartman
@ 2017-12-22  8:46 ` Greg Kroah-Hartman
  2017-12-22  8:46 ` [PATCH 4.14 079/159] x86/entry/64: Allocate and enable the SYSENTER stack Greg Kroah-Hartman
                   ` (84 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Andy Lutomirski, Thomas Gleixner,
	Borislav Petkov, Boris Ostrovsky, Borislav Petkov,
	Borislav Petkov, Brian Gerst, Dave Hansen, Dave Hansen,
	David Laight, Denys Vlasenko, Eduardo Valentin, H. Peter Anvin,
	Josh Poimboeuf, Juergen Gross, Linus Torvalds, Peter Zijlstra,
	Rik van Riel, Will Deacon, aliguori, daniel.gruss, hughd,
	keescook, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <luto@kernel.org>

commit 4f3789e792296e21405f708cf3cb409d7c7d5683 upstream.

In case something goes wrong with unwind (not unlikely in case of
overflow), print the offending IP where we detected the overflow.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Laight <David.Laight@aculab.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Eduardo Valentin <eduval@amazon.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: aliguori@amazon.com
Cc: daniel.gruss@iaik.tugraz.at
Cc: hughd@google.com
Cc: keescook@google.com
Link: https://lkml.kernel.org/r/20171204150605.231677119@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/kernel/irq_64.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/arch/x86/kernel/irq_64.c
+++ b/arch/x86/kernel/irq_64.c
@@ -57,10 +57,10 @@ static inline void stack_overflow_check(
 	if (regs->sp >= estack_top && regs->sp <= estack_bottom)
 		return;
 
-	WARN_ONCE(1, "do_IRQ(): %s has overflown the kernel stack (cur:%Lx,sp:%lx,irq stk top-bottom:%Lx-%Lx,exception stk top-bottom:%Lx-%Lx)\n",
+	WARN_ONCE(1, "do_IRQ(): %s has overflown the kernel stack (cur:%Lx,sp:%lx,irq stk top-bottom:%Lx-%Lx,exception stk top-bottom:%Lx-%Lx,ip:%pF)\n",
 		current->comm, curbase, regs->sp,
 		irq_stack_top, irq_stack_bottom,
-		estack_top, estack_bottom);
+		estack_top, estack_bottom, (void *)regs->ip);
 
 	if (sysctl_panic_on_stackoverflow)
 		panic("low stack detected by irq handler - check messages\n");

^ permalink raw reply	[flat|nested] 349+ messages in thread

* [PATCH 4.14 079/159] x86/entry/64: Allocate and enable the SYSENTER stack
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (80 preceding siblings ...)
  2017-12-22  8:46 ` [PATCH 4.14 078/159] x86/irq/64: Print the offending IP in the stack overflow warning Greg Kroah-Hartman
@ 2017-12-22  8:46 ` Greg Kroah-Hartman
  2017-12-22  8:46 ` [PATCH 4.14 080/159] x86/dumpstack: Add get_stack_info() support for " Greg Kroah-Hartman
                   ` (83 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Andy Lutomirski, Thomas Gleixner,
	Borislav Petkov, Boris Ostrovsky, Borislav Petkov,
	Borislav Petkov, Brian Gerst, Dave Hansen, Dave Hansen,
	David Laight, Denys Vlasenko, Eduardo Valentin, H. Peter Anvin,
	Josh Poimboeuf, Juergen Gross, Linus Torvalds, Peter Zijlstra,
	Rik van Riel, Will Deacon, aliguori, daniel.gruss, hughd,
	keescook, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <luto@kernel.org>

commit 1a79797b58cddfa948420a7553241c79c013e3ca upstream.

This will simplify future changes that want scratch variables early in
the SYSENTER handler -- they'll be able to spill registers to the
stack.  It also lets us get rid of a SWAPGS_UNSAFE_STACK user.

This does not depend on CONFIG_IA32_EMULATION=y because we'll want the
stack space even without IA32 emulation.

As far as I can tell, the reason that this wasn't done from day 1 is
that we use IST for #DB and #BP, which is IMO rather nasty and causes
a lot more problems than it solves.  But, since #DB uses IST, we don't
actually need a real stack for SYSENTER (because SYSENTER with TF set
will invoke #DB on the IST stack rather than the SYSENTER stack).

I want to remove IST usage from these vectors some day, and this patch
is a prerequisite for that as well.
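
For anyone unfamiliar with the offsetofend() idiom used for the MSR
value in the diff below, here is a user-space sketch. The struct is a
cut-down mock (the leading placeholder array stands in for the hardware
TSS and I/O bitmap, and its size is arbitrary), and the sketch_ names
are hypothetical. Since x86 stacks grow down, the initial stack pointer
is the address one past the end of the SYSENTER_stack array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Offset of the first byte after MEMBER within TYPE. */
#define sketch_offsetofend(TYPE, MEMBER) \
	(offsetof(TYPE, MEMBER) + sizeof(((TYPE *)0)->MEMBER))

/* Cut-down mock of tss_struct; only the tail matters for this sketch. */
struct sketch_tss {
	unsigned long placeholder[16];		/* stands in for x86_tss + io_bitmap */
	unsigned long SYSENTER_stack_canary;	/* sits just below the stack */
	unsigned long SYSENTER_stack[64];
};

/* Value that would be programmed as the SYSENTER stack pointer. */
static uintptr_t sketch_sysenter_sp(const struct sketch_tss *tss)
{
	assert(tss);
	return (uintptr_t)tss +
	       sketch_offsetofend(struct sketch_tss, SYSENTER_stack);
}
```

Because the canary is the word immediately below the lowest stack slot,
an overrun of the 64-word stack lands on it first, which is what the
STACK_END_MAGIC check in do_debug() relies on.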

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Laight <David.Laight@aculab.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Eduardo Valentin <eduval@amazon.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: aliguori@amazon.com
Cc: daniel.gruss@iaik.tugraz.at
Cc: hughd@google.com
Cc: keescook@google.com
Link: https://lkml.kernel.org/r/20171204150605.312726423@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/entry/entry_64_compat.S |    2 +-
 arch/x86/include/asm/processor.h |    3 ---
 arch/x86/kernel/asm-offsets.c    |    5 +++++
 arch/x86/kernel/asm-offsets_32.c |    5 -----
 arch/x86/kernel/cpu/common.c     |    4 +++-
 arch/x86/kernel/process.c        |    2 --
 arch/x86/kernel/traps.c          |    3 +--
 7 files changed, 10 insertions(+), 14 deletions(-)

--- a/arch/x86/entry/entry_64_compat.S
+++ b/arch/x86/entry/entry_64_compat.S
@@ -48,7 +48,7 @@
  */
 ENTRY(entry_SYSENTER_compat)
 	/* Interrupts are off on entry. */
-	SWAPGS_UNSAFE_STACK
+	SWAPGS
 	movq	PER_CPU_VAR(cpu_current_top_of_stack), %rsp
 
 	/*
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -339,14 +339,11 @@ struct tss_struct {
 	 */
 	unsigned long		io_bitmap[IO_BITMAP_LONGS + 1];
 
-#ifdef CONFIG_X86_32
 	/*
 	 * Space for the temporary SYSENTER stack.
 	 */
 	unsigned long		SYSENTER_stack_canary;
 	unsigned long		SYSENTER_stack[64];
-#endif
-
 } ____cacheline_aligned;
 
 DECLARE_PER_CPU_SHARED_ALIGNED(struct tss_struct, cpu_tss);
--- a/arch/x86/kernel/asm-offsets.c
+++ b/arch/x86/kernel/asm-offsets.c
@@ -93,4 +93,9 @@ void common(void) {
 
 	BLANK();
 	DEFINE(PTREGS_SIZE, sizeof(struct pt_regs));
+
+	/* Offset from cpu_tss to SYSENTER_stack */
+	OFFSET(CPU_TSS_SYSENTER_stack, tss_struct, SYSENTER_stack);
+	/* Size of SYSENTER_stack */
+	DEFINE(SIZEOF_SYSENTER_stack, sizeof(((struct tss_struct *)0)->SYSENTER_stack));
 }
--- a/arch/x86/kernel/asm-offsets_32.c
+++ b/arch/x86/kernel/asm-offsets_32.c
@@ -50,11 +50,6 @@ void foo(void)
 	DEFINE(TSS_sysenter_sp0, offsetof(struct tss_struct, x86_tss.sp0) -
 	       offsetofend(struct tss_struct, SYSENTER_stack));
 
-	/* Offset from cpu_tss to SYSENTER_stack */
-	OFFSET(CPU_TSS_SYSENTER_stack, tss_struct, SYSENTER_stack);
-	/* Size of SYSENTER_stack */
-	DEFINE(SIZEOF_SYSENTER_stack, sizeof(((struct tss_struct *)0)->SYSENTER_stack));
-
 #ifdef CONFIG_CC_STACKPROTECTOR
 	BLANK();
 	OFFSET(stack_canary_offset, stack_canary, canary);
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1361,7 +1361,9 @@ void syscall_init(void)
 	 * AMD doesn't allow SYSENTER in long mode (either 32- or 64-bit).
 	 */
 	wrmsrl_safe(MSR_IA32_SYSENTER_CS, (u64)__KERNEL_CS);
-	wrmsrl_safe(MSR_IA32_SYSENTER_ESP, 0ULL);
+	wrmsrl_safe(MSR_IA32_SYSENTER_ESP,
+		    (unsigned long)this_cpu_ptr(&cpu_tss) +
+		    offsetofend(struct tss_struct, SYSENTER_stack));
 	wrmsrl_safe(MSR_IA32_SYSENTER_EIP, (u64)entry_SYSENTER_compat);
 #else
 	wrmsrl(MSR_CSTAR, (unsigned long)ignore_sysret);
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -71,9 +71,7 @@ __visible DEFINE_PER_CPU_SHARED_ALIGNED(
 	  */
 	.io_bitmap		= { [0 ... IO_BITMAP_LONGS] = ~0 },
 #endif
-#ifdef CONFIG_X86_32
 	.SYSENTER_stack_canary	= STACK_END_MAGIC,
-#endif
 };
 EXPORT_PER_CPU_SYMBOL(cpu_tss);
 
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -794,14 +794,13 @@ dotraplinkage void do_debug(struct pt_re
 	debug_stack_usage_dec();
 
 exit:
-#if defined(CONFIG_X86_32)
 	/*
 	 * This is the most likely code path that involves non-trivial use
 	 * of the SYSENTER stack.  Check that we haven't overrun it.
 	 */
 	WARN(this_cpu_read(cpu_tss.SYSENTER_stack_canary) != STACK_END_MAGIC,
 	     "Overran or corrupted SYSENTER stack\n");
-#endif
+
 	ist_exit(regs);
 }
 NOKPROBE_SYMBOL(do_debug);

* [PATCH 4.14 080/159] x86/dumpstack: Add get_stack_info() support for the SYSENTER stack
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (81 preceding siblings ...)
  2017-12-22  8:46 ` [PATCH 4.14 079/159] x86/entry/64: Allocate and enable the SYSENTER stack Greg Kroah-Hartman
@ 2017-12-22  8:46 ` Greg Kroah-Hartman
  2017-12-22  8:46 ` [PATCH 4.14 081/159] x86/entry/gdt: Put per-CPU GDT remaps in ascending order Greg Kroah-Hartman
                   ` (82 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Andy Lutomirski, Thomas Gleixner,
	Borislav Petkov, Boris Ostrovsky, Borislav Petkov,
	Borislav Petkov, Brian Gerst, Dave Hansen, Dave Hansen,
	David Laight, Denys Vlasenko, Eduardo Valentin, H. Peter Anvin,
	Josh Poimboeuf, Juergen Gross, Linus Torvalds, Peter Zijlstra,
	Rik van Riel, Will Deacon, aliguori, daniel.gruss, hughd,
	keescook, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <luto@kernel.org>

commit 33a2f1a6c4d7c0a02d1c006fb0379cc5ca3b96bb upstream.

get_stack_info() doesn't currently know about the SYSENTER stack, so
unwinding will fail if we entered the kernel on the SYSENTER stack
and haven't fully switched off.  Teach get_stack_info() about the
SYSENTER stack.

With future patches applied that run part of the entry code on the
SYSENTER stack and introduce an intentional BUG(), I would get:

  PANIC: double fault, error_code: 0x0
  ...
  RIP: 0010:do_error_trap+0x33/0x1c0
  ...
  Call Trace:
  Code: ...

With this patch, I get:

  PANIC: double fault, error_code: 0x0
  ...
  Call Trace:
   <SYSENTER>
   ? async_page_fault+0x36/0x60
   ? invalid_op+0x22/0x40
   ? async_page_fault+0x36/0x60
   ? sync_regs+0x3c/0x40
   ? sync_regs+0x2e/0x40
   ? error_entry+0x6c/0xd0
   ? async_page_fault+0x36/0x60
   </SYSENTER>
  Code: ...

which is a lot more informative.
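
The range check that get_stack_info() needs here can be sketched in
user space roughly as follows.  This is only an illustration, not the
kernel code: struct tss_sketch, struct stack_info_sketch and
in_sysenter_stack_sketch() are hypothetical stand-ins, and the layout
(one canary word directly below a 64-word stack) is assumed from the
SYSENTER fields of struct tss_struct in this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the SYSENTER-related fields of tss_struct. */
struct tss_sketch {
	unsigned long canary;		/* sits directly below the stack */
	unsigned long stack[64];
};

/* Hypothetical stand-in for struct stack_info. */
struct stack_info_sketch {
	const void *begin;
	const void *end;
};

/*
 * Return true and fill in *info if sp lies in [canary, end of stack).
 * The canary is treated as part of the stack for unwinding purposes.
 */
static bool in_sysenter_stack_sketch(const struct tss_sketch *tss,
				     const void *sp,
				     struct stack_info_sketch *info)
{
	uintptr_t begin = (uintptr_t)&tss->canary;
	uintptr_t end = (uintptr_t)tss->stack + sizeof(tss->stack);
	uintptr_t addr = (uintptr_t)sp;

	assert(tss != NULL && info != NULL);

	if (addr < begin || addr >= end)
		return false;

	info->begin = (const void *)begin;
	info->end = (const void *)end;
	return true;
}
```

A pointer to the canary or to any stack word is reported as being on
the stack; the first byte past the array is not.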

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Laight <David.Laight@aculab.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Eduardo Valentin <eduval@amazon.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: aliguori@amazon.com
Cc: daniel.gruss@iaik.tugraz.at
Cc: hughd@google.com
Cc: keescook@google.com
Link: https://lkml.kernel.org/r/20171204150605.392711508@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

diff --git a/arch/x86/include/asm/stacktrace.h b/arch/x86/include/asm/stacktrace.h
index 8da111b3c342..f8062bfd43a0 100644
--- a/arch/x86/include/asm/stacktrace.h
+++ b/arch/x86/include/asm/stacktrace.h
@@ -16,6 +16,7 @@ enum stack_type {
 	STACK_TYPE_TASK,
 	STACK_TYPE_IRQ,
 	STACK_TYPE_SOFTIRQ,
+	STACK_TYPE_SYSENTER,
 	STACK_TYPE_EXCEPTION,
 	STACK_TYPE_EXCEPTION_LAST = STACK_TYPE_EXCEPTION + N_EXCEPTION_STACKS-1,
 };
@@ -28,6 +29,8 @@ struct stack_info {
 bool in_task_stack(unsigned long *stack, struct task_struct *task,
 		   struct stack_info *info);
 
+bool in_sysenter_stack(unsigned long *stack, struct stack_info *info);
+
 int get_stack_info(unsigned long *stack, struct task_struct *task,
 		   struct stack_info *info, unsigned long *visit_mask);
 
diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
index 0bc95be5c638..a33a1373a252 100644
--- a/arch/x86/kernel/dumpstack.c
+++ b/arch/x86/kernel/dumpstack.c
@@ -43,6 +43,25 @@ bool in_task_stack(unsigned long *stack, struct task_struct *task,
 	return true;
 }
 
+bool in_sysenter_stack(unsigned long *stack, struct stack_info *info)
+{
+	struct tss_struct *tss = this_cpu_ptr(&cpu_tss);
+
+	/* Treat the canary as part of the stack for unwinding purposes. */
+	void *begin = &tss->SYSENTER_stack_canary;
+	void *end = (void *)&tss->SYSENTER_stack + sizeof(tss->SYSENTER_stack);
+
+	if ((void *)stack < begin || (void *)stack >= end)
+		return false;
+
+	info->type	= STACK_TYPE_SYSENTER;
+	info->begin	= begin;
+	info->end	= end;
+	info->next_sp	= NULL;
+
+	return true;
+}
+
 static void printk_stack_address(unsigned long address, int reliable,
 				 char *log_lvl)
 {
diff --git a/arch/x86/kernel/dumpstack_32.c b/arch/x86/kernel/dumpstack_32.c
index daefae83a3aa..5ff13a6b3680 100644
--- a/arch/x86/kernel/dumpstack_32.c
+++ b/arch/x86/kernel/dumpstack_32.c
@@ -26,6 +26,9 @@ const char *stack_type_name(enum stack_type type)
 	if (type == STACK_TYPE_SOFTIRQ)
 		return "SOFTIRQ";
 
+	if (type == STACK_TYPE_SYSENTER)
+		return "SYSENTER";
+
 	return NULL;
 }
 
@@ -93,6 +96,9 @@ int get_stack_info(unsigned long *stack, struct task_struct *task,
 	if (task != current)
 		goto unknown;
 
+	if (in_sysenter_stack(stack, info))
+		goto recursion_check;
+
 	if (in_hardirq_stack(stack, info))
 		goto recursion_check;
 
diff --git a/arch/x86/kernel/dumpstack_64.c b/arch/x86/kernel/dumpstack_64.c
index 88ce2ffdb110..abc828f8c297 100644
--- a/arch/x86/kernel/dumpstack_64.c
+++ b/arch/x86/kernel/dumpstack_64.c
@@ -37,6 +37,9 @@ const char *stack_type_name(enum stack_type type)
 	if (type == STACK_TYPE_IRQ)
 		return "IRQ";
 
+	if (type == STACK_TYPE_SYSENTER)
+		return "SYSENTER";
+
 	if (type >= STACK_TYPE_EXCEPTION && type <= STACK_TYPE_EXCEPTION_LAST)
 		return exception_stack_names[type - STACK_TYPE_EXCEPTION];
 
@@ -115,6 +118,9 @@ int get_stack_info(unsigned long *stack, struct task_struct *task,
 	if (in_irq_stack(stack, info))
 		goto recursion_check;
 
+	if (in_sysenter_stack(stack, info))
+		goto recursion_check;
+
 	goto unknown;
 
 recursion_check:

* [PATCH 4.14 081/159] x86/entry/gdt: Put per-CPU GDT remaps in ascending order
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (82 preceding siblings ...)
  2017-12-22  8:46 ` [PATCH 4.14 080/159] x86/dumpstack: Add get_stack_info() support for " Greg Kroah-Hartman
@ 2017-12-22  8:46 ` Greg Kroah-Hartman
  2017-12-22  8:46 ` [PATCH 4.14 082/159] x86/mm/fixmap: Generalize the GDT fixmap mechanism, introduce struct cpu_entry_area Greg Kroah-Hartman
                   ` (81 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Andy Lutomirski, Thomas Gleixner,
	Borislav Petkov, Boris Ostrovsky, Borislav Petkov,
	Borislav Petkov, Brian Gerst, Dave Hansen, Dave Hansen,
	David Laight, Denys Vlasenko, Eduardo Valentin, H. Peter Anvin,
	Josh Poimboeuf, Juergen Gross, Linus Torvalds, Peter Zijlstra,
	Rik van Riel, Will Deacon, aliguori, daniel.gruss, hughd,
	keescook, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <luto@kernel.org>

commit aaeed3aeb39c1ba69f0a49baec8cb728121d0a91 upstream.

We currently have CPU 0's GDT at the top of the GDT range and
higher-numbered CPUs at lower addresses.  This happens because the
fixmap is upside down (index 0 is the top of the fixmap).

Flip it so that GDTs are in ascending order by virtual address.
This will simplify a future patch that will generalize the GDT
remap to contain multiple pages.
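
To see why the index has to be flipped, here is a small user-space
sketch of the fixmap arithmetic.  The constants (FIXADDR_TOP_SKETCH,
the first index 10, four CPUs) are made up for illustration; only the
"index 0 is the highest address" rule is taken from the description
above.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT_SKETCH		12
/* Hypothetical values, chosen only for illustration. */
#define FIXADDR_TOP_SKETCH		UINT64_C(0xffffffffff5ff000)
#define NR_CPUS_SKETCH			4
#define FIX_GDT_REMAP_BEGIN_SKETCH	10
#define FIX_GDT_REMAP_END_SKETCH \
	(FIX_GDT_REMAP_BEGIN_SKETCH + NR_CPUS_SKETCH - 1)

/* Fixmap index 0 is the highest address; larger indices go downward. */
static uint64_t fix_to_virt_sketch(unsigned int idx)
{
	return FIXADDR_TOP_SKETCH - ((uint64_t)idx << PAGE_SHIFT_SKETCH);
}

/* Old scheme: CPU 0 gets the lowest index, hence the highest address. */
static unsigned int gdt_index_old(unsigned int cpu)
{
	return FIX_GDT_REMAP_BEGIN_SKETCH + cpu;
}

/* New scheme: CPU 0 gets the highest index, hence the lowest address. */
static unsigned int gdt_index_new(unsigned int cpu)
{
	return FIX_GDT_REMAP_END_SKETCH - cpu;
}
```

With the new scheme the GDT of CPU n+1 sits exactly one page above the
GDT of CPU n.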

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Laight <David.Laight@aculab.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Eduardo Valentin <eduval@amazon.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: aliguori@amazon.com
Cc: daniel.gruss@iaik.tugraz.at
Cc: hughd@google.com
Cc: keescook@google.com
Link: https://lkml.kernel.org/r/20171204150605.471561421@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/include/asm/desc.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/x86/include/asm/desc.h
+++ b/arch/x86/include/asm/desc.h
@@ -63,7 +63,7 @@ static inline struct desc_struct *get_cu
 /* Get the fixmap index for a specific processor */
 static inline unsigned int get_cpu_gdt_ro_index(int cpu)
 {
-	return FIX_GDT_REMAP_BEGIN + cpu;
+	return FIX_GDT_REMAP_END - cpu;
 }
 
 /* Provide the fixmap address of the remapped GDT */

* [PATCH 4.14 082/159] x86/mm/fixmap: Generalize the GDT fixmap mechanism, introduce struct cpu_entry_area
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (83 preceding siblings ...)
  2017-12-22  8:46 ` [PATCH 4.14 081/159] x86/entry/gdt: Put per-CPU GDT remaps in ascending order Greg Kroah-Hartman
@ 2017-12-22  8:46 ` Greg Kroah-Hartman
  2017-12-22  8:46 ` [PATCH 4.14 083/159] x86/kasan/64: Teach KASAN about the cpu_entry_area Greg Kroah-Hartman
                   ` (80 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Andy Lutomirski, Thomas Gleixner,
	Borislav Petkov, Boris Ostrovsky, Borislav Petkov,
	Borislav Petkov, Brian Gerst, Dave Hansen, Dave Hansen,
	David Laight, Denys Vlasenko, Eduardo Valentin, H. Peter Anvin,
	Josh Poimboeuf, Juergen Gross, Linus Torvalds, Peter Zijlstra,
	Rik van Riel, Will Deacon, aliguori, daniel.gruss, hughd,
	keescook, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <luto@kernel.org>

commit ef8813ab280507972bb57e4b1b502811ad4411e9 upstream.

Currently, the GDT is an ad-hoc array of pages, one per CPU, in the
fixmap.  Generalize it to be an array of a new 'struct cpu_entry_area'
so that we can cleanly add new things to it.
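
The index arithmetic this generalization relies on can be sketched in
user space as below.  The second field ("extra"), the fixmap constants
and all *_sketch names are hypothetical; they are only there to show
that fields inside one area ascend in virtual address and that the
areas of consecutive CPUs are contiguous.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_SKETCH	4096u
#define FIXADDR_TOP_SKETCH	UINT64_C(0xffffffffff5ff000)	/* hypothetical */
#define NR_CPUS_SKETCH		4u

struct cpu_entry_area_sketch {
	char gdt[PAGE_SIZE_SKETCH];
	char extra[PAGE_SIZE_SKETCH];	/* hypothetical second field */
};

#define AREA_PAGES_SKETCH \
	(sizeof(struct cpu_entry_area_sketch) / PAGE_SIZE_SKETCH)

enum {
	FIX_AREA_TOP_SKETCH = 10,	/* hypothetical first index */
	FIX_AREA_BOTTOM_SKETCH =
		FIX_AREA_TOP_SKETCH + AREA_PAGES_SKETCH * NR_CPUS_SKETCH - 1,
};

/* Fixmap index 0 is the highest address; larger indices go downward. */
static uint64_t fix_to_virt_sketch(unsigned int idx)
{
	return FIXADDR_TOP_SKETCH - (uint64_t)idx * PAGE_SIZE_SKETCH;
}

/* Fixmap index of page 'page' inside the area that belongs to 'cpu'. */
static unsigned int area_page_index(unsigned int cpu, unsigned int page)
{
	return (unsigned int)(FIX_AREA_BOTTOM_SKETCH -
			      cpu * AREA_PAGES_SKETCH - page);
}

/* Virtual address of the byte at 'offset' inside the area of 'cpu'. */
static uint64_t area_field_addr(unsigned int cpu, size_t offset)
{
	return fix_to_virt_sketch(area_page_index(cpu, 0)) + offset;
}
```

Because page 0 of an area has the largest index, casting its address to
a struct pointer makes ordinary member offsets line up with the
per-page fixmap slots.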

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Laight <David.Laight@aculab.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Eduardo Valentin <eduval@amazon.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: aliguori@amazon.com
Cc: daniel.gruss@iaik.tugraz.at
Cc: hughd@google.com
Cc: keescook@google.com
Link: https://lkml.kernel.org/r/20171204150605.563271721@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/include/asm/desc.h   |    9 +--------
 arch/x86/include/asm/fixmap.h |   37 +++++++++++++++++++++++++++++++++++--
 arch/x86/kernel/cpu/common.c  |   14 +++++++-------
 arch/x86/xen/mmu_pv.c         |    2 +-
 4 files changed, 44 insertions(+), 18 deletions(-)

--- a/arch/x86/include/asm/desc.h
+++ b/arch/x86/include/asm/desc.h
@@ -60,17 +60,10 @@ static inline struct desc_struct *get_cu
 	return this_cpu_ptr(&gdt_page)->gdt;
 }
 
-/* Get the fixmap index for a specific processor */
-static inline unsigned int get_cpu_gdt_ro_index(int cpu)
-{
-	return FIX_GDT_REMAP_END - cpu;
-}
-
 /* Provide the fixmap address of the remapped GDT */
 static inline struct desc_struct *get_cpu_gdt_ro(int cpu)
 {
-	unsigned int idx = get_cpu_gdt_ro_index(cpu);
-	return (struct desc_struct *)__fix_to_virt(idx);
+	return (struct desc_struct *)&get_cpu_entry_area(cpu)->gdt;
 }
 
 /* Provide the current read-only GDT */
--- a/arch/x86/include/asm/fixmap.h
+++ b/arch/x86/include/asm/fixmap.h
@@ -44,6 +44,19 @@ extern unsigned long __FIXADDR_TOP;
 			 PAGE_SIZE)
 #endif
 
+/*
+ * cpu_entry_area is a percpu region in the fixmap that contains things
+ * needed by the CPU and early entry/exit code.  Real types aren't used
+ * for all fields here to avoid circular header dependencies.
+ *
+ * Every field is a virtual alias of some other allocated backing store.
+ * There is no direct allocation of a struct cpu_entry_area.
+ */
+struct cpu_entry_area {
+	char gdt[PAGE_SIZE];
+};
+
+#define CPU_ENTRY_AREA_PAGES (sizeof(struct cpu_entry_area) / PAGE_SIZE)
 
 /*
  * Here we define all the compile-time 'special' virtual
@@ -101,8 +114,8 @@ enum fixed_addresses {
 	FIX_LNW_VRTC,
 #endif
 	/* Fixmap entries to remap the GDTs, one per processor. */
-	FIX_GDT_REMAP_BEGIN,
-	FIX_GDT_REMAP_END = FIX_GDT_REMAP_BEGIN + NR_CPUS - 1,
+	FIX_CPU_ENTRY_AREA_TOP,
+	FIX_CPU_ENTRY_AREA_BOTTOM = FIX_CPU_ENTRY_AREA_TOP + (CPU_ENTRY_AREA_PAGES * NR_CPUS) - 1,
 
 #ifdef CONFIG_ACPI_APEI_GHES
 	/* Used for GHES mapping from assorted contexts */
@@ -191,5 +204,25 @@ void __init *early_memremap_decrypted_wp
 void __early_set_fixmap(enum fixed_addresses idx,
 			phys_addr_t phys, pgprot_t flags);
 
+static inline unsigned int __get_cpu_entry_area_page_index(int cpu, int page)
+{
+	BUILD_BUG_ON(sizeof(struct cpu_entry_area) % PAGE_SIZE != 0);
+
+	return FIX_CPU_ENTRY_AREA_BOTTOM - cpu*CPU_ENTRY_AREA_PAGES - page;
+}
+
+#define __get_cpu_entry_area_offset_index(cpu, offset) ({		\
+	BUILD_BUG_ON(offset % PAGE_SIZE != 0);				\
+	__get_cpu_entry_area_page_index(cpu, offset / PAGE_SIZE);	\
+	})
+
+#define get_cpu_entry_area_index(cpu, field)				\
+	__get_cpu_entry_area_offset_index((cpu), offsetof(struct cpu_entry_area, field))
+
+static inline struct cpu_entry_area *get_cpu_entry_area(int cpu)
+{
+	return (struct cpu_entry_area *)__fix_to_virt(__get_cpu_entry_area_page_index(cpu, 0));
+}
+
 #endif /* !__ASSEMBLY__ */
 #endif /* _ASM_X86_FIXMAP_H */
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -466,12 +466,12 @@ void load_percpu_segment(int cpu)
 	load_stack_canary_segment();
 }
 
-/* Setup the fixmap mapping only once per-processor */
-static inline void setup_fixmap_gdt(int cpu)
+/* Setup the fixmap mappings only once per-processor */
+static inline void setup_cpu_entry_area(int cpu)
 {
 #ifdef CONFIG_X86_64
 	/* On 64-bit systems, we use a read-only fixmap GDT. */
-	pgprot_t prot = PAGE_KERNEL_RO;
+	pgprot_t gdt_prot = PAGE_KERNEL_RO;
 #else
 	/*
 	 * On native 32-bit systems, the GDT cannot be read-only because
@@ -482,11 +482,11 @@ static inline void setup_fixmap_gdt(int
 	 * On Xen PV, the GDT must be read-only because the hypervisor requires
 	 * it.
 	 */
-	pgprot_t prot = boot_cpu_has(X86_FEATURE_XENPV) ?
+	pgprot_t gdt_prot = boot_cpu_has(X86_FEATURE_XENPV) ?
 		PAGE_KERNEL_RO : PAGE_KERNEL;
 #endif
 
-	__set_fixmap(get_cpu_gdt_ro_index(cpu), get_cpu_gdt_paddr(cpu), prot);
+	__set_fixmap(get_cpu_entry_area_index(cpu, gdt), get_cpu_gdt_paddr(cpu), gdt_prot);
 }
 
 /* Load the original GDT from the per-cpu structure */
@@ -1589,7 +1589,7 @@ void cpu_init(void)
 	if (is_uv_system())
 		uv_cpu_init();
 
-	setup_fixmap_gdt(cpu);
+	setup_cpu_entry_area(cpu);
 	load_fixmap_gdt(cpu);
 }
 
@@ -1651,7 +1651,7 @@ void cpu_init(void)
 
 	fpu__init_cpu();
 
-	setup_fixmap_gdt(cpu);
+	setup_cpu_entry_area(cpu);
 	load_fixmap_gdt(cpu);
 }
 #endif
--- a/arch/x86/xen/mmu_pv.c
+++ b/arch/x86/xen/mmu_pv.c
@@ -2272,7 +2272,7 @@ static void xen_set_fixmap(unsigned idx,
 #endif
 	case FIX_TEXT_POKE0:
 	case FIX_TEXT_POKE1:
-	case FIX_GDT_REMAP_BEGIN ... FIX_GDT_REMAP_END:
+	case FIX_CPU_ENTRY_AREA_TOP ... FIX_CPU_ENTRY_AREA_BOTTOM:
 		/* All local page mappings */
 		pte = pfn_pte(phys, prot);
 		break;

* [PATCH 4.14 083/159] x86/kasan/64: Teach KASAN about the cpu_entry_area
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (84 preceding siblings ...)
  2017-12-22  8:46 ` [PATCH 4.14 082/159] x86/mm/fixmap: Generalize the GDT fixmap mechanism, introduce struct cpu_entry_area Greg Kroah-Hartman
@ 2017-12-22  8:46 ` Greg Kroah-Hartman
  2017-12-22  8:46 ` [PATCH 4.14 084/159] x86/entry: Fix assumptions that the HW TSS is at the beginning of cpu_tss Greg Kroah-Hartman
                   ` (79 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Andy Lutomirski, Andrey Ryabinin,
	Thomas Gleixner, Alexander Potapenko, Boris Ostrovsky,
	Borislav Petkov, Borislav Petkov, Brian Gerst, Dave Hansen,
	Dave Hansen, David Laight, Denys Vlasenko, Dmitry Vyukov,
	Eduardo Valentin, H. Peter Anvin, Josh Poimboeuf, Juergen Gross,
	Linus Torvalds, Peter Zijlstra, Rik van Riel, Will Deacon,
	aliguori, daniel.gruss, hughd, kasan-dev, keescook, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <luto@kernel.org>

commit 21506525fb8ddb0342f2a2370812d47f6a1f3833 upstream.

The cpu_entry_area will contain stacks.  Make sure that KASAN has
appropriate shadow mappings for them.
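
As a rough illustration of what "appropriate shadow mappings" means,
the sketch below computes the page-aligned shadow range that covers a
given memory range.  It assumes the usual one-shadow-byte-per-eight-bytes
scaling and a shadow offset of 0xdffffc0000000000; both values are
assumptions for this example, and the *_sketch names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE_SKETCH		UINT64_C(4096)
#define KASAN_SHADOW_SCALE_SHIFT_SKETCH	3
/* Assumed shadow offset; treat it as illustrative. */
#define KASAN_SHADOW_OFFSET_SKETCH	UINT64_C(0xdffffc0000000000)

struct shadow_range_sketch {
	uint64_t begin;		/* inclusive, page aligned */
	uint64_t end;		/* exclusive, page aligned */
};

/* One shadow byte describes eight bytes of real memory. */
static uint64_t mem_to_shadow_sketch(uint64_t addr)
{
	return (addr >> KASAN_SHADOW_SCALE_SHIFT_SKETCH) +
	       KASAN_SHADOW_OFFSET_SKETCH;
}

/*
 * Shadow pages needed for [begin, end): round the start down and the
 * end up so that whole pages can be populated.
 */
static struct shadow_range_sketch shadow_range_for(uint64_t begin, uint64_t end)
{
	struct shadow_range_sketch r;

	r.begin = mem_to_shadow_sketch(begin) & ~(PAGE_SIZE_SKETCH - 1);
	r.end = (mem_to_shadow_sketch(end) + PAGE_SIZE_SKETCH - 1) &
		~(PAGE_SIZE_SKETCH - 1);
	return r;
}
```

Everything outside the returned range can stay on the shared zero
shadow; only the range itself needs real backing pages.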

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Alexander Potapenko <glider@google.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Laight <David.Laight@aculab.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Eduardo Valentin <eduval@amazon.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: aliguori@amazon.com
Cc: daniel.gruss@iaik.tugraz.at
Cc: hughd@google.com
Cc: kasan-dev@googlegroups.com
Cc: keescook@google.com
Link: https://lkml.kernel.org/r/20171204150605.642806442@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/mm/kasan_init_64.c |   18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

--- a/arch/x86/mm/kasan_init_64.c
+++ b/arch/x86/mm/kasan_init_64.c
@@ -277,6 +277,7 @@ void __init kasan_early_init(void)
 void __init kasan_init(void)
 {
 	int i;
+	void *shadow_cpu_entry_begin, *shadow_cpu_entry_end;
 
 #ifdef CONFIG_KASAN_INLINE
 	register_die_notifier(&kasan_die_notifier);
@@ -329,8 +330,23 @@ void __init kasan_init(void)
 			      (unsigned long)kasan_mem_to_shadow(_end),
 			      early_pfn_to_nid(__pa(_stext)));
 
+	shadow_cpu_entry_begin = (void *)__fix_to_virt(FIX_CPU_ENTRY_AREA_BOTTOM);
+	shadow_cpu_entry_begin = kasan_mem_to_shadow(shadow_cpu_entry_begin);
+	shadow_cpu_entry_begin = (void *)round_down((unsigned long)shadow_cpu_entry_begin,
+						PAGE_SIZE);
+
+	shadow_cpu_entry_end = (void *)(__fix_to_virt(FIX_CPU_ENTRY_AREA_TOP) + PAGE_SIZE);
+	shadow_cpu_entry_end = kasan_mem_to_shadow(shadow_cpu_entry_end);
+	shadow_cpu_entry_end = (void *)round_up((unsigned long)shadow_cpu_entry_end,
+					PAGE_SIZE);
+
 	kasan_populate_zero_shadow(kasan_mem_to_shadow((void *)MODULES_END),
-			(void *)KASAN_SHADOW_END);
+				   shadow_cpu_entry_begin);
+
+	kasan_populate_shadow((unsigned long)shadow_cpu_entry_begin,
+			      (unsigned long)shadow_cpu_entry_end, 0);
+
+	kasan_populate_zero_shadow(shadow_cpu_entry_end, (void *)KASAN_SHADOW_END);
 
 	load_cr3(init_top_pgt);
 	__flush_tlb_all();

* [PATCH 4.14 084/159] x86/entry: Fix assumptions that the HW TSS is at the beginning of cpu_tss
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (85 preceding siblings ...)
  2017-12-22  8:46 ` [PATCH 4.14 083/159] x86/kasan/64: Teach KASAN about the cpu_entry_area Greg Kroah-Hartman
@ 2017-12-22  8:46 ` Greg Kroah-Hartman
  2017-12-22  8:46 ` [PATCH 4.14 085/159] x86/dumpstack: Handle stack overflow on all stacks Greg Kroah-Hartman
                   ` (78 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Andy Lutomirski, Thomas Gleixner,
	Borislav Petkov, Dave Hansen, Boris Ostrovsky, Borislav Petkov,
	Borislav Petkov, Brian Gerst, Dave Hansen, David Laight,
	Denys Vlasenko, Eduardo Valentin, H. Peter Anvin, Josh Poimboeuf,
	Juergen Gross, Linus Torvalds, Peter Zijlstra, Rik van Riel,
	Will Deacon, aliguori, daniel.gruss, hughd, keescook,
	Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <luto@kernel.org>

commit 7fb983b4dd569e08564134a850dfd4eb1c63d9b8 upstream.

A future patch will move SYSENTER_stack to the beginning of cpu_tss
to help detect overflow.  Before this can happen, fix several code
paths that hardcode assumptions about the old layout.
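
One of those assumptions is the I/O bitmap offset: the CPU interprets
io_bitmap_base relative to the base of the hardware TSS, not relative
to the enclosing structure.  A user-space sketch, with heavily trimmed
and hypothetical structures (struct hw_tss_sketch, struct tss_sketch),
shows the difference once something is placed ahead of the hardware
TSS:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily trimmed stand-in for struct x86_hw_tss. */
struct hw_tss_sketch {
	uint64_t sp0;
	uint16_t io_bitmap_base;
};

/* Hypothetical layout with a stack placed ahead of the hardware TSS. */
struct tss_sketch {
	unsigned long stack[64];
	struct hw_tss_sketch x86_tss;
	unsigned long io_bitmap[8];
};

/* Offset of the bitmap as seen from the hardware TSS base. */
#define IO_BITMAP_OFFSET_SKETCH \
	(offsetof(struct tss_sketch, io_bitmap) - \
	 offsetof(struct tss_sketch, x86_tss))

static void init_tss_sketch(struct tss_sketch *t)
{
	t->x86_tss.io_bitmap_base = (uint16_t)IO_BITMAP_OFFSET_SKETCH;
}

/* The address that would be handed to the CPU as the TSS base. */
static const void *hw_tss_base(const struct tss_sketch *t)
{
	return &t->x86_tss;
}
```

The same reasoning applies to every place that passes a TSS address to
the hardware: it has to be the address of the x86_tss member, not of
the enclosing structure.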

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Laight <David.Laight@aculab.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Eduardo Valentin <eduval@amazon.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: aliguori@amazon.com
Cc: daniel.gruss@iaik.tugraz.at
Cc: hughd@google.com
Cc: keescook@google.com
Link: https://lkml.kernel.org/r/20171204150605.722425540@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/include/asm/desc.h      |    2 +-
 arch/x86/include/asm/processor.h |    9 +++++++--
 arch/x86/kernel/cpu/common.c     |    8 ++++----
 arch/x86/kernel/doublefault.c    |   32 +++++++++++++++-----------------
 arch/x86/kvm/vmx.c               |    2 +-
 arch/x86/power/cpu.c             |   13 +++++++------
 6 files changed, 35 insertions(+), 31 deletions(-)

--- a/arch/x86/include/asm/desc.h
+++ b/arch/x86/include/asm/desc.h
@@ -178,7 +178,7 @@ static inline void set_tssldt_descriptor
 #endif
 }
 
-static inline void __set_tss_desc(unsigned cpu, unsigned int entry, void *addr)
+static inline void __set_tss_desc(unsigned cpu, unsigned int entry, struct x86_hw_tss *addr)
 {
 	struct desc_struct *d = get_cpu_gdt_rw(cpu);
 	tss_desc tss;
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -162,7 +162,7 @@ enum cpuid_regs_idx {
 extern struct cpuinfo_x86	boot_cpu_data;
 extern struct cpuinfo_x86	new_cpu_data;
 
-extern struct tss_struct	doublefault_tss;
+extern struct x86_hw_tss	doublefault_tss;
 extern __u32			cpu_caps_cleared[NCAPINTS];
 extern __u32			cpu_caps_set[NCAPINTS];
 
@@ -252,6 +252,11 @@ static inline void load_cr3(pgd_t *pgdir
 	write_cr3(__sme_pa(pgdir));
 }
 
+/*
+ * Note that while the legacy 'TSS' name comes from 'Task State Segment',
+ * on modern x86 CPUs the TSS also holds information important to 64-bit mode,
+ * unrelated to the task-switch mechanism:
+ */
 #ifdef CONFIG_X86_32
 /* This is the TSS defined by the hardware. */
 struct x86_hw_tss {
@@ -322,7 +327,7 @@ struct x86_hw_tss {
 #define IO_BITMAP_BITS			65536
 #define IO_BITMAP_BYTES			(IO_BITMAP_BITS/8)
 #define IO_BITMAP_LONGS			(IO_BITMAP_BYTES/sizeof(long))
-#define IO_BITMAP_OFFSET		offsetof(struct tss_struct, io_bitmap)
+#define IO_BITMAP_OFFSET		(offsetof(struct tss_struct, io_bitmap) - offsetof(struct tss_struct, x86_tss))
 #define INVALID_IO_BITMAP_OFFSET	0x8000
 
 struct tss_struct {
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1557,7 +1557,7 @@ void cpu_init(void)
 		}
 	}
 
-	t->x86_tss.io_bitmap_base = offsetof(struct tss_struct, io_bitmap);
+	t->x86_tss.io_bitmap_base = IO_BITMAP_OFFSET;
 
 	/*
 	 * <= is required because the CPU will access up to
@@ -1576,7 +1576,7 @@ void cpu_init(void)
 	 * Initialize the TSS.  Don't bother initializing sp0, as the initial
 	 * task never enters user mode.
 	 */
-	set_tss_desc(cpu, t);
+	set_tss_desc(cpu, &t->x86_tss);
 	load_TR_desc();
 
 	load_mm_ldt(&init_mm);
@@ -1634,12 +1634,12 @@ void cpu_init(void)
 	 * Initialize the TSS.  Don't bother initializing sp0, as the initial
 	 * task never enters user mode.
 	 */
-	set_tss_desc(cpu, t);
+	set_tss_desc(cpu, &t->x86_tss);
 	load_TR_desc();
 
 	load_mm_ldt(&init_mm);
 
-	t->x86_tss.io_bitmap_base = offsetof(struct tss_struct, io_bitmap);
+	t->x86_tss.io_bitmap_base = IO_BITMAP_OFFSET;
 
 #ifdef CONFIG_DOUBLEFAULT
 	/* Set up doublefault TSS pointer in the GDT */
--- a/arch/x86/kernel/doublefault.c
+++ b/arch/x86/kernel/doublefault.c
@@ -50,25 +50,23 @@ static void doublefault_fn(void)
 		cpu_relax();
 }
 
-struct tss_struct doublefault_tss __cacheline_aligned = {
-	.x86_tss = {
-		.sp0		= STACK_START,
-		.ss0		= __KERNEL_DS,
-		.ldt		= 0,
-		.io_bitmap_base	= INVALID_IO_BITMAP_OFFSET,
+struct x86_hw_tss doublefault_tss __cacheline_aligned = {
+	.sp0		= STACK_START,
+	.ss0		= __KERNEL_DS,
+	.ldt		= 0,
+	.io_bitmap_base	= INVALID_IO_BITMAP_OFFSET,
 
-		.ip		= (unsigned long) doublefault_fn,
-		/* 0x2 bit is always set */
-		.flags		= X86_EFLAGS_SF | 0x2,
-		.sp		= STACK_START,
-		.es		= __USER_DS,
-		.cs		= __KERNEL_CS,
-		.ss		= __KERNEL_DS,
-		.ds		= __USER_DS,
-		.fs		= __KERNEL_PERCPU,
+	.ip		= (unsigned long) doublefault_fn,
+	/* 0x2 bit is always set */
+	.flags		= X86_EFLAGS_SF | 0x2,
+	.sp		= STACK_START,
+	.es		= __USER_DS,
+	.cs		= __KERNEL_CS,
+	.ss		= __KERNEL_DS,
+	.ds		= __USER_DS,
+	.fs		= __KERNEL_PERCPU,
 
-		.__cr3		= __pa_nodebug(swapper_pg_dir),
-	}
+	.__cr3		= __pa_nodebug(swapper_pg_dir),
 };
 
 /* dummy for do_double_fault() call */
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2295,7 +2295,7 @@ static void vmx_vcpu_load(struct kvm_vcp
 		 * processors.  See 22.2.4.
 		 */
 		vmcs_writel(HOST_TR_BASE,
-			    (unsigned long)this_cpu_ptr(&cpu_tss));
+			    (unsigned long)this_cpu_ptr(&cpu_tss.x86_tss));
 		vmcs_writel(HOST_GDTR_BASE, (unsigned long)gdt);   /* 22.2.4 */
 
 		/*
--- a/arch/x86/power/cpu.c
+++ b/arch/x86/power/cpu.c
@@ -165,12 +165,13 @@ static void fix_processor_context(void)
 	struct desc_struct *desc = get_cpu_gdt_rw(cpu);
 	tss_desc tss;
 #endif
-	set_tss_desc(cpu, t);	/*
-				 * This just modifies memory; should not be
-				 * necessary. But... This is necessary, because
-				 * 386 hardware has concept of busy TSS or some
-				 * similar stupidity.
-				 */
+
+	/*
+	 * This just modifies memory; should not be necessary. But... This is
+	 * necessary, because 386 hardware has concept of busy TSS or some
+	 * similar stupidity.
+	 */
+	set_tss_desc(cpu, &t->x86_tss);
 
 #ifdef CONFIG_X86_64
 	memcpy(&tss, &desc[GDT_ENTRY_TSS], sizeof(tss_desc));

* [PATCH 4.14 085/159] x86/dumpstack: Handle stack overflow on all stacks
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (86 preceding siblings ...)
  2017-12-22  8:46 ` [PATCH 4.14 084/159] x86/entry: Fix assumptions that the HW TSS is at the beginning of cpu_tss Greg Kroah-Hartman
@ 2017-12-22  8:46 ` Greg Kroah-Hartman
  2017-12-22  8:46 ` [PATCH 4.14 086/159] x86/entry: Move SYSENTER_stack to the beginning of struct tss_struct Greg Kroah-Hartman
                   ` (77 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Andy Lutomirski, Thomas Gleixner,
	Borislav Petkov, Boris Ostrovsky, Borislav Petkov,
	Borislav Petkov, Brian Gerst, Dave Hansen, Dave Hansen,
	David Laight, Denys Vlasenko, Eduardo Valentin, H. Peter Anvin,
	Josh Poimboeuf, Juergen Gross, Linus Torvalds, Peter Zijlstra,
	Rik van Riel, Will Deacon, aliguori, daniel.gruss, hughd,
	keescook, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <luto@kernel.org>

commit 6e60e583426c2f8751c22c2dfe5c207083b4483a upstream.

We currently special-case stack overflow on the task stack.  We're
going to start putting special stacks in the fixmap with a custom
layout, so they'll have guard pages, too.  Teach the unwinder to be
able to unwind an overflow of any of the stacks.
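
As a rough illustration of the fallback (not the kernel code itself),
the retry can be modelled in plain C.  All sketch_* names and SKETCH_*
constants are made up for this example, and a flat array of address
ranges stands in for get_stack_info():

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE ((uintptr_t)4096)
/* Round addr up to the next page boundary, in the spirit of PAGE_ALIGN(). */
#define SKETCH_PAGE_ALIGN(addr) \
	(((addr) + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1))

/* Hypothetical stand-in for one known stack: [begin, end). */
struct sketch_stack {
	uintptr_t begin;	/* lowest usable address */
	uintptr_t end;		/* one past the highest usable address */
};

/* Return the stack containing addr, or NULL if addr is on no known stack. */
const struct sketch_stack *
sketch_find_stack(const struct sketch_stack *stacks, size_t n, uintptr_t addr)
{
	for (size_t i = 0; i < n; i++)
		if (addr >= stacks[i].begin && addr < stacks[i].end)
			return &stacks[i];
	return NULL;
}

/*
 * If *addr is not on a valid stack, assume it may sit in the guard page
 * just below one: round it up to the next page boundary and look again.
 * On success *addr is left pointing at the adjusted location.
 */
const struct sketch_stack *
sketch_find_stack_or_overflowed(const struct sketch_stack *stacks, size_t n,
				uintptr_t *addr)
{
	const struct sketch_stack *s = sketch_find_stack(stacks, n, *addr);

	if (s)
		return s;

	*addr = SKETCH_PAGE_ALIGN(*addr);
	return sketch_find_stack(stacks, n, *addr);
}
```

The lookup is retried only once, so an address that is more than a page
below every stack still ends the walk.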

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Laight <David.Laight@aculab.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Eduardo Valentin <eduval@amazon.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: aliguori@amazon.com
Cc: daniel.gruss@iaik.tugraz.at
Cc: hughd@google.com
Cc: keescook@google.com
Link: https://lkml.kernel.org/r/20171204150605.802057305@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/kernel/dumpstack.c |   24 ++++++++++++++----------
 1 file changed, 14 insertions(+), 10 deletions(-)

--- a/arch/x86/kernel/dumpstack.c
+++ b/arch/x86/kernel/dumpstack.c
@@ -112,24 +112,28 @@ void show_trace_log_lvl(struct task_stru
 	 * - task stack
 	 * - interrupt stack
 	 * - HW exception stacks (double fault, nmi, debug, mce)
+	 * - SYSENTER stack
 	 *
-	 * x86-32 can have up to three stacks:
+	 * x86-32 can have up to four stacks:
 	 * - task stack
 	 * - softirq stack
 	 * - hardirq stack
+	 * - SYSENTER stack
 	 */
 	for (regs = NULL; stack; stack = PTR_ALIGN(stack_info.next_sp, sizeof(long))) {
 		const char *stack_name;
 
-		/*
-		 * If we overflowed the task stack into a guard page, jump back
-		 * to the bottom of the usable stack.
-		 */
-		if (task_stack_page(task) - (void *)stack < PAGE_SIZE)
-			stack = task_stack_page(task);
-
-		if (get_stack_info(stack, task, &stack_info, &visit_mask))
-			break;
+		if (get_stack_info(stack, task, &stack_info, &visit_mask)) {
+			/*
+			 * We weren't on a valid stack.  It's possible that
+			 * we overflowed a valid stack into a guard page.
+			 * See if the next page up is valid so that we can
+			 * generate some kind of backtrace if this happens.
+			 */
+			stack = (unsigned long *)PAGE_ALIGN((unsigned long)stack);
+			if (get_stack_info(stack, task, &stack_info, &visit_mask))
+				break;
+		}
 
 		stack_name = stack_type_name(stack_info.type);
 		if (stack_name)


* [PATCH 4.14 086/159] x86/entry: Move SYSENTER_stack to the beginning of struct tss_struct
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (87 preceding siblings ...)
  2017-12-22  8:46 ` [PATCH 4.14 085/159] x86/dumpstack: Handle stack overflow on all stacks Greg Kroah-Hartman
@ 2017-12-22  8:46 ` Greg Kroah-Hartman
  2017-12-22  8:46 ` [PATCH 4.14 087/159] x86/entry: Remap the TSS into the CPU entry area Greg Kroah-Hartman
                   ` (76 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Andy Lutomirski, Thomas Gleixner,
	Borislav Petkov, Boris Ostrovsky, Borislav Petkov,
	Borislav Petkov, Brian Gerst, Dave Hansen, Dave Hansen,
	David Laight, Denys Vlasenko, Eduardo Valentin, H. Peter Anvin,
	Josh Poimboeuf, Juergen Gross, Linus Torvalds, Peter Zijlstra,
	Rik van Riel, Will Deacon, aliguori, daniel.gruss, hughd,
	keescook, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <luto@kernel.org>

commit 1a935bc3d4ea61556461a9e92a68ca3556232efd upstream.

SYSENTER_stack should have reliable overflow detection, which
means that it needs to be at the bottom of a page, not the top.
Move it to the beginning of struct tss_struct and page-align it.

Also add an assertion to make sure that the fixed hardware TSS
doesn't cross a page boundary.
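
For readers unfamiliar with the XOR idiom behind that assertion, here
is a minimal userspace sketch.  The struct layout, the sketch_* names
and the 4096-byte page size are assumptions made for the example; only
the shape of the check mirrors the patch:

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE ((size_t)4096)
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))

/* Offset of the first byte after a member, like the kernel's offsetofend(). */
#define sketch_offsetofend(type, member) \
	(offsetof(type, member) + sizeof(((type *)0)->member))

/*
 * Two offsets lie in the same page exactly when their page-number bits
 * agree, i.e. when XORing them leaves nothing above the page offset.
 */
#define SKETCH_CROSSES_PAGE(start, end) \
	((((size_t)(start)) ^ ((size_t)(end))) & SKETCH_PAGE_MASK)

/* Hypothetical stand-in for the 104-byte hardware TSS. */
struct sketch_hw_tss {
	unsigned char bytes[104];
};

/* Hypothetical layout: canary and stack first, hardware part after. */
struct sketch_tss {
	unsigned long canary;
	unsigned long stack[64];
	struct sketch_hw_tss hw;
};

/* Compile-time check in the spirit of the BUILD_BUG_ON() in this patch. */
_Static_assert(!SKETCH_CROSSES_PAGE(offsetof(struct sketch_tss, hw),
				    sketch_offsetofend(struct sketch_tss, hw)),
	       "hardware TSS must not cross a page boundary");

/* Runtime form of the same test, for experimenting with offsets. */
int sketch_crosses_page(size_t start, size_t end)
{
	return SKETCH_CROSSES_PAGE(start, end) != 0;
}
```

Because the end offset is exclusive, a member that ends exactly on a
page boundary is also flagged.  That errs on the safe side.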

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Laight <David.Laight@aculab.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Eduardo Valentin <eduval@amazon.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: aliguori@amazon.com
Cc: daniel.gruss@iaik.tugraz.at
Cc: hughd@google.com
Cc: keescook@google.com
Link: https://lkml.kernel.org/r/20171204150605.881827433@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/include/asm/processor.h |   21 ++++++++++++---------
 arch/x86/kernel/cpu/common.c     |   21 +++++++++++++++++++++
 2 files changed, 33 insertions(+), 9 deletions(-)

--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -332,7 +332,16 @@ struct x86_hw_tss {
 
 struct tss_struct {
 	/*
-	 * The hardware state:
+	 * Space for the temporary SYSENTER stack, used for SYSENTER
+	 * and the entry trampoline as well.
+	 */
+	unsigned long		SYSENTER_stack_canary;
+	unsigned long		SYSENTER_stack[64];
+
+	/*
+	 * The fixed hardware portion.  This must not cross a page boundary
+	 * at risk of violating the SDM's advice and potentially triggering
+	 * errata.
 	 */
 	struct x86_hw_tss	x86_tss;
 
@@ -343,15 +352,9 @@ struct tss_struct {
 	 * be within the limit.
 	 */
 	unsigned long		io_bitmap[IO_BITMAP_LONGS + 1];
+} __aligned(PAGE_SIZE);
 
-	/*
-	 * Space for the temporary SYSENTER stack.
-	 */
-	unsigned long		SYSENTER_stack_canary;
-	unsigned long		SYSENTER_stack[64];
-} ____cacheline_aligned;
-
-DECLARE_PER_CPU_SHARED_ALIGNED(struct tss_struct, cpu_tss);
+DECLARE_PER_CPU_PAGE_ALIGNED(struct tss_struct, cpu_tss);
 
 /*
  * sizeof(unsigned long) coming from an extra "long" at the end
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -487,6 +487,27 @@ static inline void setup_cpu_entry_area(
 #endif
 
 	__set_fixmap(get_cpu_entry_area_index(cpu, gdt), get_cpu_gdt_paddr(cpu), gdt_prot);
+
+	/*
+	 * The Intel SDM says (Volume 3, 7.2.1):
+	 *
+	 *  Avoid placing a page boundary in the part of the TSS that the
+	 *  processor reads during a task switch (the first 104 bytes). The
+	 *  processor may not correctly perform address translations if a
+	 *  boundary occurs in this area. During a task switch, the processor
+	 *  reads and writes into the first 104 bytes of each TSS (using
+	 *  contiguous physical addresses beginning with the physical address
+	 *  of the first byte of the TSS). So, after TSS access begins, if
+	 *  part of the 104 bytes is not physically contiguous, the processor
+	 *  will access incorrect information without generating a page-fault
+	 *  exception.
+	 *
+	 * There are also a lot of errata involving the TSS spanning a page
+	 * boundary.  Assert that we're not doing that.
+	 */
+	BUILD_BUG_ON((offsetof(struct tss_struct, x86_tss) ^
+		      offsetofend(struct tss_struct, x86_tss)) & PAGE_MASK);
+
 }
 
 /* Load the original GDT from the per-cpu structure */


* [PATCH 4.14 087/159] x86/entry: Remap the TSS into the CPU entry area
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (88 preceding siblings ...)
  2017-12-22  8:46 ` [PATCH 4.14 086/159] x86/entry: Move SYSENTER_stack to the beginning of struct tss_struct Greg Kroah-Hartman
@ 2017-12-22  8:46 ` Greg Kroah-Hartman
  2017-12-22  8:46 ` [PATCH 4.14 088/159] x86/entry/64: Separate cpu_current_top_of_stack from TSS.sp0 Greg Kroah-Hartman
                   ` (75 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Andy Lutomirski, Thomas Gleixner,
	Borislav Petkov, Boris Ostrovsky, Borislav Petkov, Brian Gerst,
	Dave Hansen, Dave Hansen, David Laight, Denys Vlasenko,
	Eduardo Valentin, H. Peter Anvin, Josh Poimboeuf, Juergen Gross,
	Linus Torvalds, Peter Zijlstra, Rik van Riel, Will Deacon,
	aliguori, daniel.gruss, hughd, keescook, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <luto@kernel.org>

commit 72f5e08dbba2d01aa90b592cf76c378ea233b00b upstream.

Remapping the TSS into the CPU entry area has a secondary purpose: it
puts the entry stack into a region with a well-controlled layout.  A
subsequent patch will take advantage of this to let the SYSCALL entry
code find the entry stack more easily.
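
The multi-page mapping helper in the diff below walks the fixmap index
downwards while the source pointer walks upwards.  The following sketch
shows why, assuming the usual x86 convention that a larger fixmap index
corresponds to a lower virtual address.  SKETCH_FIXADDR_TOP is an
arbitrary placeholder and the sketch_* helpers are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT	12
/* Placeholder value; the real top-of-fixmap address is configuration-specific. */
#define SKETCH_FIXADDR_TOP	((uintptr_t)0xfffff000u)

/* Fixmap slots grow downwards from the top address. */
uintptr_t sketch_fix_to_virt(unsigned int idx)
{
	return SKETCH_FIXADDR_TOP - ((uintptr_t)idx << SKETCH_PAGE_SHIFT);
}

/*
 * Record the virtual address that page i of a multi-page object would
 * get when its first page uses slot first_index and each following
 * page uses the next lower slot.
 */
void sketch_fixmap_pages(unsigned int first_index, int pages, uintptr_t *out)
{
	for (int i = 0; i < pages; i++)
		out[i] = sketch_fix_to_virt(first_index - i);
}
```

Decrementing the index therefore yields ascending, contiguous virtual
pages, which is what a structure spanning several pages needs.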

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bpetkov@suse.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Laight <David.Laight@aculab.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Eduardo Valentin <eduval@amazon.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: aliguori@amazon.com
Cc: daniel.gruss@iaik.tugraz.at
Cc: hughd@google.com
Cc: keescook@google.com
Link: https://lkml.kernel.org/r/20171204150605.962042855@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/entry/entry_32.S     |    6 ++++--
 arch/x86/include/asm/fixmap.h |    7 +++++++
 arch/x86/kernel/asm-offsets.c |    3 +++
 arch/x86/kernel/cpu/common.c  |   41 +++++++++++++++++++++++++++++++++++------
 arch/x86/kernel/dumpstack.c   |    3 ++-
 arch/x86/kvm/vmx.c            |    2 +-
 arch/x86/power/cpu.c          |   11 ++++++-----
 7 files changed, 58 insertions(+), 15 deletions(-)

--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -941,7 +941,8 @@ ENTRY(debug)
 	movl	%esp, %eax			# pt_regs pointer
 
 	/* Are we currently on the SYSENTER stack? */
-	PER_CPU(cpu_tss + CPU_TSS_SYSENTER_stack + SIZEOF_SYSENTER_stack, %ecx)
+	movl	PER_CPU_VAR(cpu_entry_area), %ecx
+	addl	$CPU_ENTRY_AREA_tss + CPU_TSS_SYSENTER_stack + SIZEOF_SYSENTER_stack, %ecx
 	subl	%eax, %ecx	/* ecx = (end of SYSENTER_stack) - esp */
 	cmpl	$SIZEOF_SYSENTER_stack, %ecx
 	jb	.Ldebug_from_sysenter_stack
@@ -984,7 +985,8 @@ ENTRY(nmi)
 	movl	%esp, %eax			# pt_regs pointer
 
 	/* Are we currently on the SYSENTER stack? */
-	PER_CPU(cpu_tss + CPU_TSS_SYSENTER_stack + SIZEOF_SYSENTER_stack, %ecx)
+	movl	PER_CPU_VAR(cpu_entry_area), %ecx
+	addl	$CPU_ENTRY_AREA_tss + CPU_TSS_SYSENTER_stack + SIZEOF_SYSENTER_stack, %ecx
 	subl	%eax, %ecx	/* ecx = (end of SYSENTER_stack) - esp */
 	cmpl	$SIZEOF_SYSENTER_stack, %ecx
 	jb	.Lnmi_from_sysenter_stack
--- a/arch/x86/include/asm/fixmap.h
+++ b/arch/x86/include/asm/fixmap.h
@@ -54,6 +54,13 @@ extern unsigned long __FIXADDR_TOP;
  */
 struct cpu_entry_area {
 	char gdt[PAGE_SIZE];
+
+	/*
+	 * The GDT is just below cpu_tss and thus serves (on x86_64) as
+	 * a read-only guard page for the SYSENTER stack at the bottom
+	 * of the TSS region.
+	 */
+	struct tss_struct tss;
 };
 
 #define CPU_ENTRY_AREA_PAGES (sizeof(struct cpu_entry_area) / PAGE_SIZE)
--- a/arch/x86/kernel/asm-offsets.c
+++ b/arch/x86/kernel/asm-offsets.c
@@ -98,4 +98,7 @@ void common(void) {
 	OFFSET(CPU_TSS_SYSENTER_stack, tss_struct, SYSENTER_stack);
 	/* Size of SYSENTER_stack */
 	DEFINE(SIZEOF_SYSENTER_stack, sizeof(((struct tss_struct *)0)->SYSENTER_stack));
+
+	/* Layout info for cpu_entry_area */
+	OFFSET(CPU_ENTRY_AREA_tss, cpu_entry_area, tss);
 }
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -466,6 +466,22 @@ void load_percpu_segment(int cpu)
 	load_stack_canary_segment();
 }
 
+static void set_percpu_fixmap_pages(int fixmap_index, void *ptr,
+				    int pages, pgprot_t prot)
+{
+	int i;
+
+	for (i = 0; i < pages; i++) {
+		__set_fixmap(fixmap_index - i,
+			     per_cpu_ptr_to_phys(ptr + i * PAGE_SIZE), prot);
+	}
+}
+
+#ifdef CONFIG_X86_32
+/* The 32-bit entry code needs to find cpu_entry_area. */
+DEFINE_PER_CPU(struct cpu_entry_area *, cpu_entry_area);
+#endif
+
 /* Setup the fixmap mappings only once per-processor */
 static inline void setup_cpu_entry_area(int cpu)
 {
@@ -507,7 +523,15 @@ static inline void setup_cpu_entry_area(
 	 */
 	BUILD_BUG_ON((offsetof(struct tss_struct, x86_tss) ^
 		      offsetofend(struct tss_struct, x86_tss)) & PAGE_MASK);
+	BUILD_BUG_ON(sizeof(struct tss_struct) % PAGE_SIZE != 0);
+	set_percpu_fixmap_pages(get_cpu_entry_area_index(cpu, tss),
+				&per_cpu(cpu_tss, cpu),
+				sizeof(struct tss_struct) / PAGE_SIZE,
+				PAGE_KERNEL);
 
+#ifdef CONFIG_X86_32
+	this_cpu_write(cpu_entry_area, get_cpu_entry_area(cpu));
+#endif
 }
 
 /* Load the original GDT from the per-cpu structure */
@@ -1257,7 +1281,8 @@ void enable_sep_cpu(void)
 	wrmsr(MSR_IA32_SYSENTER_CS, tss->x86_tss.ss1, 0);
 
 	wrmsr(MSR_IA32_SYSENTER_ESP,
-	      (unsigned long)tss + offsetofend(struct tss_struct, SYSENTER_stack),
+	      (unsigned long)&get_cpu_entry_area(cpu)->tss +
+	      offsetofend(struct tss_struct, SYSENTER_stack),
 	      0);
 
 	wrmsr(MSR_IA32_SYSENTER_EIP, (unsigned long)entry_SYSENTER_32, 0);
@@ -1370,6 +1395,8 @@ static DEFINE_PER_CPU_PAGE_ALIGNED(char,
 /* May not be marked __init: used by software suspend */
 void syscall_init(void)
 {
+	int cpu = smp_processor_id();
+
 	wrmsr(MSR_STAR, 0, (__USER32_CS << 16) | __KERNEL_CS);
 	wrmsrl(MSR_LSTAR, (unsigned long)entry_SYSCALL_64);
 
@@ -1383,7 +1410,7 @@ void syscall_init(void)
 	 */
 	wrmsrl_safe(MSR_IA32_SYSENTER_CS, (u64)__KERNEL_CS);
 	wrmsrl_safe(MSR_IA32_SYSENTER_ESP,
-		    (unsigned long)this_cpu_ptr(&cpu_tss) +
+		    (unsigned long)&get_cpu_entry_area(cpu)->tss +
 		    offsetofend(struct tss_struct, SYSENTER_stack));
 	wrmsrl_safe(MSR_IA32_SYSENTER_EIP, (u64)entry_SYSENTER_compat);
 #else
@@ -1593,11 +1620,13 @@ void cpu_init(void)
 	initialize_tlbstate_and_flush();
 	enter_lazy_tlb(&init_mm, me);
 
+	setup_cpu_entry_area(cpu);
+
 	/*
 	 * Initialize the TSS.  Don't bother initializing sp0, as the initial
 	 * task never enters user mode.
 	 */
-	set_tss_desc(cpu, &t->x86_tss);
+	set_tss_desc(cpu, &get_cpu_entry_area(cpu)->tss.x86_tss);
 	load_TR_desc();
 
 	load_mm_ldt(&init_mm);
@@ -1610,7 +1639,6 @@ void cpu_init(void)
 	if (is_uv_system())
 		uv_cpu_init();
 
-	setup_cpu_entry_area(cpu);
 	load_fixmap_gdt(cpu);
 }
 
@@ -1651,11 +1679,13 @@ void cpu_init(void)
 	initialize_tlbstate_and_flush();
 	enter_lazy_tlb(&init_mm, curr);
 
+	setup_cpu_entry_area(cpu);
+
 	/*
 	 * Initialize the TSS.  Don't bother initializing sp0, as the initial
 	 * task never enters user mode.
 	 */
-	set_tss_desc(cpu, &t->x86_tss);
+	set_tss_desc(cpu, &get_cpu_entry_area(cpu)->tss.x86_tss);
 	load_TR_desc();
 
 	load_mm_ldt(&init_mm);
@@ -1672,7 +1702,6 @@ void cpu_init(void)
 
 	fpu__init_cpu();
 
-	setup_cpu_entry_area(cpu);
 	load_fixmap_gdt(cpu);
 }
 #endif
--- a/arch/x86/kernel/dumpstack.c
+++ b/arch/x86/kernel/dumpstack.c
@@ -45,7 +45,8 @@ bool in_task_stack(unsigned long *stack,
 
 bool in_sysenter_stack(unsigned long *stack, struct stack_info *info)
 {
-	struct tss_struct *tss = this_cpu_ptr(&cpu_tss);
+	int cpu = smp_processor_id();
+	struct tss_struct *tss = &get_cpu_entry_area(cpu)->tss;
 
 	/* Treat the canary as part of the stack for unwinding purposes. */
 	void *begin = &tss->SYSENTER_stack_canary;
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2295,7 +2295,7 @@ static void vmx_vcpu_load(struct kvm_vcp
 		 * processors.  See 22.2.4.
 		 */
 		vmcs_writel(HOST_TR_BASE,
-			    (unsigned long)this_cpu_ptr(&cpu_tss.x86_tss));
+			    (unsigned long)&get_cpu_entry_area(cpu)->tss.x86_tss);
 		vmcs_writel(HOST_GDTR_BASE, (unsigned long)gdt);   /* 22.2.4 */
 
 		/*
--- a/arch/x86/power/cpu.c
+++ b/arch/x86/power/cpu.c
@@ -160,18 +160,19 @@ static void do_fpu_end(void)
 static void fix_processor_context(void)
 {
 	int cpu = smp_processor_id();
-	struct tss_struct *t = &per_cpu(cpu_tss, cpu);
 #ifdef CONFIG_X86_64
 	struct desc_struct *desc = get_cpu_gdt_rw(cpu);
 	tss_desc tss;
 #endif
 
 	/*
-	 * This just modifies memory; should not be necessary. But... This is
-	 * necessary, because 386 hardware has concept of busy TSS or some
-	 * similar stupidity.
+	 * We need to reload TR, which requires that we change the
+	 * GDT entry to indicate "available" first.
+	 *
+	 * XXX: This could probably all be replaced by a call to
+	 * force_reload_TR().
 	 */
-	set_tss_desc(cpu, &t->x86_tss);
+	set_tss_desc(cpu, &get_cpu_entry_area(cpu)->tss.x86_tss);
 
 #ifdef CONFIG_X86_64
 	memcpy(&tss, &desc[GDT_ENTRY_TSS], sizeof(tss_desc));


* [PATCH 4.14 088/159] x86/entry/64: Separate cpu_current_top_of_stack from TSS.sp0
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (89 preceding siblings ...)
  2017-12-22  8:46 ` [PATCH 4.14 087/159] x86/entry: Remap the TSS into the CPU entry area Greg Kroah-Hartman
@ 2017-12-22  8:46 ` Greg Kroah-Hartman
  2017-12-22  8:46 ` [PATCH 4.14 089/159] x86/espfix/64: Stop assuming that pt_regs is on the entry stack Greg Kroah-Hartman
                   ` (74 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Andy Lutomirski, Thomas Gleixner,
	Borislav Petkov, Boris Ostrovsky, Borislav Petkov,
	Borislav Petkov, Brian Gerst, Dave Hansen, Dave Hansen,
	David Laight, Denys Vlasenko, Eduardo Valentin, H. Peter Anvin,
	Josh Poimboeuf, Juergen Gross, Linus Torvalds, Peter Zijlstra,
	Rik van Riel, Will Deacon, aliguori, daniel.gruss, hughd,
	keescook, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <luto@kernel.org>

commit 9aaefe7b59ae00605256a7d6bd1c1456432495fc upstream.

On 64-bit kernels, we used to assume that TSS.sp0 was the current
top of stack.  With the addition of an entry trampoline, this will
no longer be the case.  Store the current top of stack in TSS.sp1,
which is otherwise unused but shares the same cacheline.
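
A toy model of the idea, assuming sp0 will eventually point at an entry
trampoline stack as the message anticipates.  The types and sketch_*
functions are invented for illustration and do not correspond to kernel
interfaces:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical cut-down view of the ring stack pointers in the TSS. */
struct sketch_hw_tss {
	uint64_t sp0;	/* stack the CPU loads on entry from user mode */
	uint64_t sp1;	/* unused by hardware here; holds the task stack top */
	uint64_t sp2;
};

/* Hypothetical task with a kernel stack at [stack_base, stack_base + stack_size). */
struct sketch_task {
	uint64_t stack_base;
	uint64_t stack_size;
};

uint64_t sketch_task_top_of_stack(const struct sketch_task *t)
{
	return t->stack_base + t->stack_size;
}

/*
 * On a context switch, record the next task's stack top in sp1.  sp0 is
 * set independently, so it may point somewhere else entirely.
 */
void sketch_switch_to(struct sketch_hw_tss *tss, const struct sketch_task *next,
		      uint64_t entry_stack_top)
{
	tss->sp1 = sketch_task_top_of_stack(next);
	tss->sp0 = entry_stack_top;
}

/* Readers of "current top of stack" consult sp1, never sp0. */
uint64_t sketch_current_top_of_stack(const struct sketch_hw_tss *tss)
{
	return tss->sp1;
}
```

Keeping the value in the same structure as sp0 is what lets the two
fields share a cacheline, as the message notes.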

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Laight <David.Laight@aculab.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Eduardo Valentin <eduval@amazon.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: aliguori@amazon.com
Cc: daniel.gruss@iaik.tugraz.at
Cc: hughd@google.com
Cc: keescook@google.com
Link: https://lkml.kernel.org/r/20171204150606.050864668@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/include/asm/processor.h   |   18 +++++++++++++-----
 arch/x86/include/asm/thread_info.h |    2 +-
 arch/x86/kernel/asm-offsets_64.c   |    1 +
 arch/x86/kernel/process.c          |   10 ++++++++++
 arch/x86/kernel/process_64.c       |    1 +
 5 files changed, 26 insertions(+), 6 deletions(-)

--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -309,7 +309,13 @@ struct x86_hw_tss {
 struct x86_hw_tss {
 	u32			reserved1;
 	u64			sp0;
+
+	/*
+	 * We store cpu_current_top_of_stack in sp1 so it's always accessible.
+	 * Linux does not use ring 1, so sp1 is not otherwise needed.
+	 */
 	u64			sp1;
+
 	u64			sp2;
 	u64			reserved2;
 	u64			ist[7];
@@ -368,6 +374,8 @@ DECLARE_PER_CPU_PAGE_ALIGNED(struct tss_
 
 #ifdef CONFIG_X86_32
 DECLARE_PER_CPU(unsigned long, cpu_current_top_of_stack);
+#else
+#define cpu_current_top_of_stack cpu_tss.x86_tss.sp1
 #endif
 
 /*
@@ -539,12 +547,12 @@ static inline void native_swapgs(void)
 
 static inline unsigned long current_top_of_stack(void)
 {
-#ifdef CONFIG_X86_64
-	return this_cpu_read_stable(cpu_tss.x86_tss.sp0);
-#else
-	/* sp0 on x86_32 is special in and around vm86 mode. */
+	/*
+	 *  We can't read directly from tss.sp0: sp0 on x86_32 is special in
+	 *  and around vm86 mode and sp0 on x86_64 is special because of the
+	 *  entry trampoline.
+	 */
 	return this_cpu_read_stable(cpu_current_top_of_stack);
-#endif
 }
 
 static inline bool on_thread_stack(void)
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -207,7 +207,7 @@ static inline int arch_within_stack_fram
 #else /* !__ASSEMBLY__ */
 
 #ifdef CONFIG_X86_64
-# define cpu_current_top_of_stack (cpu_tss + TSS_sp0)
+# define cpu_current_top_of_stack (cpu_tss + TSS_sp1)
 #endif
 
 #endif
--- a/arch/x86/kernel/asm-offsets_64.c
+++ b/arch/x86/kernel/asm-offsets_64.c
@@ -66,6 +66,7 @@ int main(void)
 
 	OFFSET(TSS_ist, tss_struct, x86_tss.ist);
 	OFFSET(TSS_sp0, tss_struct, x86_tss.sp0);
+	OFFSET(TSS_sp1, tss_struct, x86_tss.sp1);
 	BLANK();
 
 #ifdef CONFIG_CC_STACKPROTECTOR
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -56,6 +56,16 @@ __visible DEFINE_PER_CPU_SHARED_ALIGNED(
 		 * Poison it.
 		 */
 		.sp0 = (1UL << (BITS_PER_LONG-1)) + 1,
+
+#ifdef CONFIG_X86_64
+		/*
+		 * .sp1 is cpu_current_top_of_stack.  The init task never
+		 * runs user code, but cpu_current_top_of_stack should still
+		 * be well defined before the first context switch.
+		 */
+		.sp1 = TOP_OF_INIT_STACK,
+#endif
+
 #ifdef CONFIG_X86_32
 		.ss0 = __KERNEL_DS,
 		.ss1 = __KERNEL_CS,
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -461,6 +461,7 @@ __switch_to(struct task_struct *prev_p,
 	 * Switch the PDA and FPU contexts.
 	 */
 	this_cpu_write(current_task, next_p);
+	this_cpu_write(cpu_current_top_of_stack, task_top_of_stack(next_p));
 
 	/* Reload sp0. */
 	update_sp0(next_p);


* [PATCH 4.14 089/159] x86/espfix/64: Stop assuming that pt_regs is on the entry stack
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (90 preceding siblings ...)
  2017-12-22  8:46 ` [PATCH 4.14 088/159] x86/entry/64: Separate cpu_current_top_of_stack from TSS.sp0 Greg Kroah-Hartman
@ 2017-12-22  8:46 ` Greg Kroah-Hartman
  2017-12-22  8:46 ` [PATCH 4.14 090/159] x86/entry/64: Use a per-CPU trampoline stack for IDT entries Greg Kroah-Hartman
                   ` (73 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Andy Lutomirski, Thomas Gleixner,
	Borislav Petkov, Boris Ostrovsky, Borislav Petkov,
	Borislav Petkov, Brian Gerst, Dave Hansen, Dave Hansen,
	David Laight, Denys Vlasenko, Eduardo Valentin, H. Peter Anvin,
	Josh Poimboeuf, Juergen Gross, Linus Torvalds, Peter Zijlstra,
	Rik van Riel, Will Deacon, aliguori, daniel.gruss, hughd,
	keescook, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <luto@kernel.org>

commit 6d9256f0a89eaff97fca6006100bcaea8d1d8bdb upstream.

When we start using an entry trampoline, a #GP from userspace will
be delivered on the entry stack, not on the task stack.  Fix the
espfix64 #DF fixup to set up #GP according to TSS.SP0, rather than
assuming that pt_regs + 1 == SP0.  This won't change anything
without an entry stack, but it will make the code continue to work
when an entry stack is added.
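
The pointer arithmetic involved can be sketched in userspace as
follows.  struct sketch_pt_regs is a simplified stand-in for the 64-bit
pt_regs layout (15 general-purpose registers, orig_ax, then the
five-word IRET frame), and sketch_fake_gp_frame() is a hypothetical
helper, not a kernel function:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified register frame: GPRs, error-code slot, then the IRET frame. */
struct sketch_pt_regs {
	uint64_t gprs[15];
	uint64_t orig_ax;
	uint64_t ip, cs, flags, sp, ss;
};

/*
 * Build a fake #GP(0) frame directly below stack_top (the analogue of
 * TSS.sp0) from a five-word IRET frame laid out as ip, cs, flags, sp, ss.
 */
struct sketch_pt_regs *sketch_fake_gp_frame(void *stack_top,
					    const uint64_t iret_frame[5])
{
	/* "sp0 - 1" in units of whole register frames. */
	struct sketch_pt_regs *gpregs = (struct sketch_pt_regs *)stack_top - 1;

	/* Copy ip through ss in one go; they are contiguous in the frame. */
	memmove((unsigned char *)gpregs + offsetof(struct sketch_pt_regs, ip),
		iret_frame, 5 * sizeof(uint64_t));

	/* The real #GP error code was lost, so use 0. */
	gpregs->orig_ax = 0;

	return gpregs;
}
```

In the patch, the handler then points its own return frame at the #GP
vector with the stack pointer set to the address of orig_ax in the fake
frame.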

While we're at it, improve the comments to explain what's actually
going on.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Laight <David.Laight@aculab.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Eduardo Valentin <eduval@amazon.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: aliguori@amazon.com
Cc: daniel.gruss@iaik.tugraz.at
Cc: hughd@google.com
Cc: keescook@google.com
Link: https://lkml.kernel.org/r/20171204150606.130778051@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/kernel/traps.c |   37 ++++++++++++++++++++++++++++---------
 1 file changed, 28 insertions(+), 9 deletions(-)

--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -348,9 +348,15 @@ dotraplinkage void do_double_fault(struc
 
 	/*
 	 * If IRET takes a non-IST fault on the espfix64 stack, then we
-	 * end up promoting it to a doublefault.  In that case, modify
-	 * the stack to make it look like we just entered the #GP
-	 * handler from user space, similar to bad_iret.
+	 * end up promoting it to a doublefault.  In that case, take
+	 * advantage of the fact that we're not using the normal (TSS.sp0)
+	 * stack right now.  We can write a fake #GP(0) frame at TSS.sp0
+	 * and then modify our own IRET frame so that, when we return,
+	 * we land directly at the #GP(0) vector with the stack already
+	 * set up according to its expectations.
+	 *
+	 * The net result is that our #GP handler will think that we
+	 * entered from usermode with the bad user context.
 	 *
 	 * No need for ist_enter here because we don't use RCU.
 	 */
@@ -358,13 +364,26 @@ dotraplinkage void do_double_fault(struc
 		regs->cs == __KERNEL_CS &&
 		regs->ip == (unsigned long)native_irq_return_iret)
 	{
-		struct pt_regs *normal_regs = task_pt_regs(current);
+		struct pt_regs *gpregs = (struct pt_regs *)this_cpu_read(cpu_tss.x86_tss.sp0) - 1;
 
-		/* Fake a #GP(0) from userspace. */
-		memmove(&normal_regs->ip, (void *)regs->sp, 5*8);
-		normal_regs->orig_ax = 0;  /* Missing (lost) #GP error code */
+		/*
+		 * regs->sp points to the failing IRET frame on the
+		 * ESPFIX64 stack.  Copy it to the entry stack.  This fills
+		 * in gpregs->ss through gpregs->ip.
+		 *
+		 */
+		memmove(&gpregs->ip, (void *)regs->sp, 5*8);
+		gpregs->orig_ax = 0;  /* Missing (lost) #GP error code */
+
+		/*
+		 * Adjust our frame so that we return straight to the #GP
+		 * vector with the expected RSP value.  This is safe because
+		 * we won't enable interrupts or schedule before we invoke
+		 * general_protection, so nothing will clobber the stack
+		 * frame we just set up.
+		 */
 		regs->ip = (unsigned long)general_protection;
-		regs->sp = (unsigned long)&normal_regs->orig_ax;
+		regs->sp = (unsigned long)&gpregs->orig_ax;
 
 		return;
 	}
@@ -389,7 +408,7 @@ dotraplinkage void do_double_fault(struc
 	 *
 	 *   Processors update CR2 whenever a page fault is detected. If a
 	 *   second page fault occurs while an earlier page fault is being
-	 *   deliv- ered, the faulting linear address of the second fault will
+	 *   delivered, the faulting linear address of the second fault will
 	 *   overwrite the contents of CR2 (replacing the previous
 	 *   address). These updates to CR2 occur even if the page fault
 	 *   results in a double fault or occurs during the delivery of a


* [PATCH 4.14 090/159] x86/entry/64: Use a per-CPU trampoline stack for IDT entries
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (91 preceding siblings ...)
  2017-12-22  8:46 ` [PATCH 4.14 089/159] x86/espfix/64: Stop assuming that pt_regs is on the entry stack Greg Kroah-Hartman
@ 2017-12-22  8:46 ` Greg Kroah-Hartman
  2017-12-22  8:46 ` [PATCH 4.14 091/159] x86/entry/64: Return to userspace from the trampoline stack Greg Kroah-Hartman
                   ` (72 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Andy Lutomirski, Thomas Gleixner,
	Borislav Petkov, Boris Ostrovsky, Borislav Petkov,
	Borislav Petkov, Brian Gerst, Dave Hansen, Dave Hansen,
	David Laight, Denys Vlasenko, Eduardo Valentin, H. Peter Anvin,
	Josh Poimboeuf, Juergen Gross, Linus Torvalds, Peter Zijlstra,
	Rik van Riel, Will Deacon, aliguori, daniel.gruss, hughd,
	keescook, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <luto@kernel.org>

commit 7f2590a110b837af5679d08fc25c6227c5a8c497 upstream.

Historically, IDT entries from usermode have always gone directly
to the running task's kernel stack.  Rearrange it so that we enter on
a per-CPU trampoline stack and then manually switch to the task's stack.
This touches a couple of extra cachelines, but it gives us a chance
to run some code before we touch the kernel stack.

The asm isn't exactly beautiful, but I think that fully refactoring
it can wait.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Laight <David.Laight@aculab.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Eduardo Valentin <eduval@amazon.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: aliguori@amazon.com
Cc: daniel.gruss@iaik.tugraz.at
Cc: hughd@google.com
Cc: keescook@google.com
Link: https://lkml.kernel.org/r/20171204150606.225330557@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/entry/entry_64.S        |   67 +++++++++++++++++++++++++++++----------
 arch/x86/entry/entry_64_compat.S |    5 ++
 arch/x86/include/asm/switch_to.h |    4 +-
 arch/x86/include/asm/traps.h     |    1 
 arch/x86/kernel/cpu/common.c     |    6 ++-
 arch/x86/kernel/traps.c          |   21 ++++++------
 6 files changed, 72 insertions(+), 32 deletions(-)

--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -560,6 +560,13 @@ END(irq_entries_start)
 /* 0(%rsp): ~(interrupt number) */
 	.macro interrupt func
 	cld
+
+	testb	$3, CS-ORIG_RAX(%rsp)
+	jz	1f
+	SWAPGS
+	call	switch_to_thread_stack
+1:
+
 	ALLOC_PT_GPREGS_ON_STACK
 	SAVE_C_REGS
 	SAVE_EXTRA_REGS
@@ -569,12 +576,8 @@ END(irq_entries_start)
 	jz	1f
 
 	/*
-	 * IRQ from user mode.  Switch to kernel gsbase and inform context
-	 * tracking that we're in kernel mode.
-	 */
-	SWAPGS
-
-	/*
+	 * IRQ from user mode.
+	 *
 	 * We need to tell lockdep that IRQs are off.  We can't do this until
 	 * we fix gsbase, and we should do it before enter_from_user_mode
 	 * (which can take locks).  Since TRACE_IRQS_OFF idempotent,
@@ -828,6 +831,32 @@ apicinterrupt IRQ_WORK_VECTOR			irq_work
  */
 #define CPU_TSS_IST(x) PER_CPU_VAR(cpu_tss) + (TSS_ist + ((x) - 1) * 8)
 
+/*
+ * Switch to the thread stack.  This is called with the IRET frame and
+ * orig_ax on the stack.  (That is, RDI..R12 are not on the stack and
+ * space has not been allocated for them.)
+ */
+ENTRY(switch_to_thread_stack)
+	UNWIND_HINT_FUNC
+
+	pushq	%rdi
+	movq	%rsp, %rdi
+	movq	PER_CPU_VAR(cpu_current_top_of_stack), %rsp
+	UNWIND_HINT sp_offset=16 sp_reg=ORC_REG_DI
+
+	pushq	7*8(%rdi)		/* regs->ss */
+	pushq	6*8(%rdi)		/* regs->rsp */
+	pushq	5*8(%rdi)		/* regs->eflags */
+	pushq	4*8(%rdi)		/* regs->cs */
+	pushq	3*8(%rdi)		/* regs->ip */
+	pushq	2*8(%rdi)		/* regs->orig_ax */
+	pushq	8(%rdi)			/* return address */
+	UNWIND_HINT_FUNC
+
+	movq	(%rdi), %rdi
+	ret
+END(switch_to_thread_stack)
+
 .macro idtentry sym do_sym has_error_code:req paranoid=0 shift_ist=-1
 ENTRY(\sym)
 	UNWIND_HINT_IRET_REGS offset=\has_error_code*8
@@ -845,11 +874,12 @@ ENTRY(\sym)
 
 	ALLOC_PT_GPREGS_ON_STACK
 
-	.if \paranoid
-	.if \paranoid == 1
+	.if \paranoid < 2
 	testb	$3, CS(%rsp)			/* If coming from userspace, switch stacks */
-	jnz	1f
+	jnz	.Lfrom_usermode_switch_stack_\@
 	.endif
+
+	.if \paranoid
 	call	paranoid_entry
 	.else
 	call	error_entry
@@ -891,20 +921,15 @@ ENTRY(\sym)
 	jmp	error_exit
 	.endif
 
-	.if \paranoid == 1
+	.if \paranoid < 2
 	/*
-	 * Paranoid entry from userspace.  Switch stacks and treat it
+	 * Entry from userspace.  Switch stacks and treat it
 	 * as a normal entry.  This means that paranoid handlers
 	 * run in real process context if user_mode(regs).
 	 */
-1:
+.Lfrom_usermode_switch_stack_\@:
 	call	error_entry
 
-
-	movq	%rsp, %rdi			/* pt_regs pointer */
-	call	sync_regs
-	movq	%rax, %rsp			/* switch stack */
-
 	movq	%rsp, %rdi			/* pt_regs pointer */
 
 	.if \has_error_code
@@ -1165,6 +1190,14 @@ ENTRY(error_entry)
 	SWAPGS
 
 .Lerror_entry_from_usermode_after_swapgs:
+	/* Put us onto the real thread stack. */
+	popq	%r12				/* save return addr in %12 */
+	movq	%rsp, %rdi			/* arg0 = pt_regs pointer */
+	call	sync_regs
+	movq	%rax, %rsp			/* switch stack */
+	ENCODE_FRAME_POINTER
+	pushq	%r12
+
 	/*
 	 * We need to tell lockdep that IRQs are off.  We can't do this until
 	 * we fix gsbase, and we should do it before enter_from_user_mode
--- a/arch/x86/entry/entry_64_compat.S
+++ b/arch/x86/entry/entry_64_compat.S
@@ -306,8 +306,11 @@ ENTRY(entry_INT80_compat)
 	 */
 	movl	%eax, %eax
 
-	/* Construct struct pt_regs on stack (iret frame is already on stack) */
 	pushq	%rax			/* pt_regs->orig_ax */
+
+	/* switch to thread stack expects orig_ax to be pushed */
+	call	switch_to_thread_stack
+
 	pushq	%rdi			/* pt_regs->di */
 	pushq	%rsi			/* pt_regs->si */
 	pushq	%rdx			/* pt_regs->dx */
--- a/arch/x86/include/asm/switch_to.h
+++ b/arch/x86/include/asm/switch_to.h
@@ -90,10 +90,12 @@ static inline void refresh_sysenter_cs(s
 /* This is used when switching tasks or entering/exiting vm86 mode. */
 static inline void update_sp0(struct task_struct *task)
 {
+	/* On x86_64, sp0 always points to the entry trampoline stack, which is constant: */
 #ifdef CONFIG_X86_32
 	load_sp0(task->thread.sp0);
 #else
-	load_sp0(task_top_of_stack(task));
+	if (static_cpu_has(X86_FEATURE_XENPV))
+		load_sp0(task_top_of_stack(task));
 #endif
 }
 
--- a/arch/x86/include/asm/traps.h
+++ b/arch/x86/include/asm/traps.h
@@ -75,7 +75,6 @@ dotraplinkage void do_segment_not_presen
 dotraplinkage void do_stack_segment(struct pt_regs *, long);
 #ifdef CONFIG_X86_64
 dotraplinkage void do_double_fault(struct pt_regs *, long);
-asmlinkage struct pt_regs *sync_regs(struct pt_regs *);
 #endif
 dotraplinkage void do_general_protection(struct pt_regs *, long);
 dotraplinkage void do_page_fault(struct pt_regs *, unsigned long);
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1623,11 +1623,13 @@ void cpu_init(void)
 	setup_cpu_entry_area(cpu);
 
 	/*
-	 * Initialize the TSS.  Don't bother initializing sp0, as the initial
-	 * task never enters user mode.
+	 * Initialize the TSS.  sp0 points to the entry trampoline stack
+	 * regardless of what task is running.
 	 */
 	set_tss_desc(cpu, &get_cpu_entry_area(cpu)->tss.x86_tss);
 	load_TR_desc();
+	load_sp0((unsigned long)&get_cpu_entry_area(cpu)->tss +
+		 offsetofend(struct tss_struct, SYSENTER_stack));
 
 	load_mm_ldt(&init_mm);
 
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -619,14 +619,15 @@ NOKPROBE_SYMBOL(do_int3);
 
 #ifdef CONFIG_X86_64
 /*
- * Help handler running on IST stack to switch off the IST stack if the
- * interrupted code was in user mode. The actual stack switch is done in
- * entry_64.S
+ * Help handler running on a per-cpu (IST or entry trampoline) stack
+ * to switch to the normal thread stack if the interrupted code was in
+ * user mode. The actual stack switch is done in entry_64.S
  */
 asmlinkage __visible notrace struct pt_regs *sync_regs(struct pt_regs *eregs)
 {
-	struct pt_regs *regs = task_pt_regs(current);
-	*regs = *eregs;
+	struct pt_regs *regs = (struct pt_regs *)this_cpu_read(cpu_current_top_of_stack) - 1;
+	if (regs != eregs)
+		*regs = *eregs;
 	return regs;
 }
 NOKPROBE_SYMBOL(sync_regs);
@@ -642,13 +643,13 @@ struct bad_iret_stack *fixup_bad_iret(st
 	/*
 	 * This is called from entry_64.S early in handling a fault
 	 * caused by a bad iret to user mode.  To handle the fault
-	 * correctly, we want move our stack frame to task_pt_regs
-	 * and we want to pretend that the exception came from the
-	 * iret target.
+	 * correctly, we want to move our stack frame to where it would
+	 * be had we entered directly on the entry stack (rather than
+	 * just below the IRET frame) and we want to pretend that the
+	 * exception came from the IRET target.
 	 */
 	struct bad_iret_stack *new_stack =
-		container_of(task_pt_regs(current),
-			     struct bad_iret_stack, regs);
+		(struct bad_iret_stack *)this_cpu_read(cpu_tss.x86_tss.sp0) - 1;
 
 	/* Copy the IRET target to the new stack. */
 	memmove(&new_stack->regs.ip, (void *)s->regs.sp, 5*8);

* [PATCH 4.14 091/159] x86/entry/64: Return to userspace from the trampoline stack
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (92 preceding siblings ...)
  2017-12-22  8:46 ` [PATCH 4.14 090/159] x86/entry/64: Use a per-CPU trampoline stack for IDT entries Greg Kroah-Hartman
@ 2017-12-22  8:46 ` Greg Kroah-Hartman
  2017-12-22  8:46 ` [PATCH 4.14 092/159] x86/entry/64: Create a per-CPU SYSCALL entry trampoline Greg Kroah-Hartman
                   ` (71 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Andy Lutomirski, Thomas Gleixner,
	Borislav Petkov, Boris Ostrovsky, Borislav Petkov,
	Borislav Petkov, Brian Gerst, Dave Hansen, Dave Hansen,
	David Laight, Denys Vlasenko, Eduardo Valentin, H. Peter Anvin,
	Josh Poimboeuf, Juergen Gross, Linus Torvalds, Peter Zijlstra,
	Rik van Riel, Will Deacon, aliguori, daniel.gruss, hughd,
	keescook, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <luto@kernel.org>

commit 3e3b9293d392c577b62e24e4bc9982320438e749 upstream.

By itself, this is useless.  It gives us the ability to run some final code
before exit that cannot run on the kernel stack.  This could include a CR3
switch a la PAGE_TABLE_ISOLATION or some kernel stack erasing, for
example.  (Or even weird things like *changing* which kernel stack gets
used as an ASLR-strengthening mechanism.)

The SYSRET32 path is not covered yet.  It could be in the future or
we could just ignore it and force the slow path if needed.
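
As a rough illustration of the IRET exit path in this patch, the
following C sketch models the copy onto the trampoline stack.  The slot
order follows the comment in the hunk (user RDI, orig_ax, RIP, CS,
EFLAGS, RSP, SS); push() and copy_exit_frame() are hypothetical helpers
written for this example, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Word offsets from the old stack pointer, as listed in the patch comment. */
enum {
	SLOT_RDI, SLOT_ORIG_AX, SLOT_RIP, SLOT_CS,
	SLOT_EFLAGS, SLOT_RSP, SLOT_SS, NSLOTS
};

/* Model of pushq: the stack grows down, so decrement first, then store. */
static uint64_t *push(uint64_t *sp, uint64_t val)
{
	*--sp = val;
	return sp;
}

/*
 * Copy the IRET frame and the saved user RDI from the old (thread)
 * stack to the trampoline stack.  orig_ax is deliberately not copied.
 * Returns the new stack pointer on the trampoline stack.
 */
static uint64_t *copy_exit_frame(const uint64_t *old_sp, uint64_t *tramp_top)
{
	uint64_t *sp = tramp_top;

	assert(old_sp && tramp_top);

	sp = push(sp, old_sp[SLOT_SS]);
	sp = push(sp, old_sp[SLOT_RSP]);
	sp = push(sp, old_sp[SLOT_EFLAGS]);
	sp = push(sp, old_sp[SLOT_CS]);
	sp = push(sp, old_sp[SLOT_RIP]);
	sp = push(sp, old_sp[SLOT_RDI]);
	return sp;
}
```

After the copy, the word at the new stack pointer is user RDI, and the
five words above it are in the order IRET expects.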

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Laight <David.Laight@aculab.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Eduardo Valentin <eduval@amazon.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: aliguori@amazon.com
Cc: daniel.gruss@iaik.tugraz.at
Cc: hughd@google.com
Cc: keescook@google.com
Link: https://lkml.kernel.org/r/20171204150606.306546484@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/entry/entry_64.S |   55 ++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 51 insertions(+), 4 deletions(-)

--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -326,8 +326,24 @@ syscall_return_via_sysret:
 	popq	%rsi	/* skip rcx */
 	popq	%rdx
 	popq	%rsi
+
+	/*
+	 * Now all regs are restored except RSP and RDI.
+	 * Save old stack pointer and switch to trampoline stack.
+	 */
+	movq	%rsp, %rdi
+	movq	PER_CPU_VAR(cpu_tss + TSS_sp0), %rsp
+
+	pushq	RSP-RDI(%rdi)	/* RSP */
+	pushq	(%rdi)		/* RDI */
+
+	/*
+	 * We are on the trampoline stack.  All regs except RDI are live.
+	 * We can do future final exit work right here.
+	 */
+
 	popq	%rdi
-	movq	RSP-ORIG_RAX(%rsp), %rsp
+	popq	%rsp
 	USERGS_SYSRET64
 END(entry_SYSCALL_64)
 
@@ -630,10 +646,41 @@ GLOBAL(swapgs_restore_regs_and_return_to
 	ud2
 1:
 #endif
-	SWAPGS
 	POP_EXTRA_REGS
-	POP_C_REGS
-	addq	$8, %rsp	/* skip regs->orig_ax */
+	popq	%r11
+	popq	%r10
+	popq	%r9
+	popq	%r8
+	popq	%rax
+	popq	%rcx
+	popq	%rdx
+	popq	%rsi
+
+	/*
+	 * The stack is now user RDI, orig_ax, RIP, CS, EFLAGS, RSP, SS.
+	 * Save old stack pointer and switch to trampoline stack.
+	 */
+	movq	%rsp, %rdi
+	movq	PER_CPU_VAR(cpu_tss + TSS_sp0), %rsp
+
+	/* Copy the IRET frame to the trampoline stack. */
+	pushq	6*8(%rdi)	/* SS */
+	pushq	5*8(%rdi)	/* RSP */
+	pushq	4*8(%rdi)	/* EFLAGS */
+	pushq	3*8(%rdi)	/* CS */
+	pushq	2*8(%rdi)	/* RIP */
+
+	/* Push user RDI on the trampoline stack. */
+	pushq	(%rdi)
+
+	/*
+	 * We are on the trampoline stack.  All regs except RDI are live.
+	 * We can do future final exit work right here.
+	 */
+
+	/* Restore RDI. */
+	popq	%rdi
+	SWAPGS
 	INTERRUPT_RETURN
 
 

* [PATCH 4.14 092/159] x86/entry/64: Create a per-CPU SYSCALL entry trampoline
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (93 preceding siblings ...)
  2017-12-22  8:46 ` [PATCH 4.14 091/159] x86/entry/64: Return to userspace from the trampoline stack Greg Kroah-Hartman
@ 2017-12-22  8:46 ` Greg Kroah-Hartman
  2017-12-22  8:46 ` [PATCH 4.14 093/159] x86/entry/64: Move the IST stacks into struct cpu_entry_area Greg Kroah-Hartman
                   ` (70 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Andy Lutomirski, Thomas Gleixner,
	Borislav Petkov, Boris Ostrovsky, Borislav Petkov, Brian Gerst,
	Dave Hansen, Dave Hansen, David Laight, Denys Vlasenko,
	Eduardo Valentin, H. Peter Anvin, Josh Poimboeuf, Juergen Gross,
	Linus Torvalds, Peter Zijlstra, Rik van Riel, Will Deacon,
	aliguori, daniel.gruss, hughd, keescook, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <luto@kernel.org>

commit 3386bc8aed825e9f1f65ce38df4b109b2019b71a upstream.

Handling SYSCALL is tricky: the SYSCALL handler is entered with every
single register (except FLAGS), including RSP, live.  It somehow needs
to set RSP to point to a valid stack, which means it needs to save the
user RSP somewhere and find its own stack pointer.  The canonical way
to do this is with SWAPGS, which lets us access percpu data using the
%gs prefix.

With PAGE_TABLE_ISOLATION-like pagetable switching, this is
problematic.  Without a scratch register, switching CR3 is impossible, so
%gs-based percpu memory would need to be mapped in the user pagetables.
Doing that without information leaks is difficult or impossible.

Instead, use a different sneaky trick.  Map a copy of the first part
of the SYSCALL asm at a different address for each CPU.  Now RIP
varies depending on the CPU, so we can use RIP-relative memory access
to access percpu memory.  By putting the relevant information (one
scratch slot and the stack address) at a constant offset relative to
RIP, we can make SYSCALL work without relying on %gs.
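
The address arithmetic behind this trick can be sketched in plain C.
The layout below is a hypothetical, heavily reduced model of struct
cpu_entry_area (one scratch slot, one stack pointer, one trampoline
page); the field names and helper functions are assumptions made for
this example only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096

/* Hypothetical per-CPU area: data at fixed offsets from the trampoline. */
struct area {
	uint64_t scratch;	/* one scratch slot for the user RSP */
	uint64_t stack_top;	/* kernel stack to switch to */
	char entry_trampoline[MODEL_PAGE_SIZE];
};

/*
 * What the CPU_ENTRY_AREA macro expresses: knowing where this CPU's
 * copy of the trampoline starts, subtract the constant field offset to
 * reach the start of the per-CPU area.
 */
static struct area *area_from_trampoline(char *trampoline_start)
{
	return (struct area *)(trampoline_start -
			       offsetof(struct area, entry_trampoline));
}

/*
 * What syscall_init() computes for MSR_LSTAR: this CPU's trampoline
 * mapping plus the entry point's offset inside the trampoline section.
 */
static uintptr_t syscall_entry_for(struct area *a, uintptr_t entry_offset)
{
	assert(entry_offset < MODEL_PAGE_SIZE);
	return (uintptr_t)a->entry_trampoline + entry_offset;
}
```

Because each CPU's area sits at a different address, the same code
bytes resolve to different data depending on where they execute.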

A nice thing about this approach is that we can easily switch it on
and off if we want pagetable switching to be configurable.

The compat variant of SYSCALL doesn't have this problem in the first
place -- there are plenty of scratch registers, since we don't care
about preserving r8-r15.  This patch therefore doesn't touch SYSCALL32
at all.

This patch actually seems to be a small speedup.  With this patch,
SYSCALL touches an extra cache line and an extra virtual page, but
the pipeline no longer stalls waiting for SWAPGS.  It seems that, at
least in a tight loop, the latter outweighs the former.

Thanks to David Laight for an optimization tip.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bpetkov@suse.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Laight <David.Laight@aculab.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Eduardo Valentin <eduval@amazon.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: aliguori@amazon.com
Cc: daniel.gruss@iaik.tugraz.at
Cc: hughd@google.com
Cc: keescook@google.com
Link: https://lkml.kernel.org/r/20171204150606.403607157@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/entry/entry_64.S     |   58 ++++++++++++++++++++++++++++++++++++++++++
 arch/x86/include/asm/fixmap.h |    2 +
 arch/x86/kernel/asm-offsets.c |    1 
 arch/x86/kernel/cpu/common.c  |   15 ++++++++++
 arch/x86/kernel/vmlinux.lds.S |    9 ++++++
 5 files changed, 84 insertions(+), 1 deletion(-)

--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -136,6 +136,64 @@ END(native_usergs_sysret64)
  * with them due to bugs in both AMD and Intel CPUs.
  */
 
+	.pushsection .entry_trampoline, "ax"
+
+/*
+ * The code in here gets remapped into cpu_entry_area's trampoline.  This means
+ * that the assembler and linker have the wrong idea as to where this code
+ * lives (and, in fact, it's mapped more than once, so it's not even at a
+ * fixed address).  So we can't reference any symbols outside the entry
+ * trampoline and expect it to work.
+ *
+ * Instead, we carefully abuse %rip-relative addressing.
+ * _entry_trampoline(%rip) refers to the start of the remapped) entry
+ * trampoline.  We can thus find cpu_entry_area with this macro:
+ */
+
+#define CPU_ENTRY_AREA \
+	_entry_trampoline - CPU_ENTRY_AREA_entry_trampoline(%rip)
+
+/* The top word of the SYSENTER stack is hot and is usable as scratch space. */
+#define RSP_SCRATCH	CPU_ENTRY_AREA_tss + CPU_TSS_SYSENTER_stack + \
+			SIZEOF_SYSENTER_stack - 8 + CPU_ENTRY_AREA
+
+ENTRY(entry_SYSCALL_64_trampoline)
+	UNWIND_HINT_EMPTY
+	swapgs
+
+	/* Stash the user RSP. */
+	movq	%rsp, RSP_SCRATCH
+
+	/* Load the top of the task stack into RSP */
+	movq	CPU_ENTRY_AREA_tss + TSS_sp1 + CPU_ENTRY_AREA, %rsp
+
+	/* Start building the simulated IRET frame. */
+	pushq	$__USER_DS			/* pt_regs->ss */
+	pushq	RSP_SCRATCH			/* pt_regs->sp */
+	pushq	%r11				/* pt_regs->flags */
+	pushq	$__USER_CS			/* pt_regs->cs */
+	pushq	%rcx				/* pt_regs->ip */
+
+	/*
+	 * x86 lacks a near absolute jump, and we can't jump to the real
+	 * entry text with a relative jump.  We could push the target
+	 * address and then use retq, but this destroys the pipeline on
+	 * many CPUs (wasting over 20 cycles on Sandy Bridge).  Instead,
+	 * spill RDI and restore it in a second-stage trampoline.
+	 */
+	pushq	%rdi
+	movq	$entry_SYSCALL_64_stage2, %rdi
+	jmp	*%rdi
+END(entry_SYSCALL_64_trampoline)
+
+	.popsection
+
+ENTRY(entry_SYSCALL_64_stage2)
+	UNWIND_HINT_EMPTY
+	popq	%rdi
+	jmp	entry_SYSCALL_64_after_hwframe
+END(entry_SYSCALL_64_stage2)
+
 ENTRY(entry_SYSCALL_64)
 	UNWIND_HINT_EMPTY
 	/*
--- a/arch/x86/include/asm/fixmap.h
+++ b/arch/x86/include/asm/fixmap.h
@@ -61,6 +61,8 @@ struct cpu_entry_area {
 	 * of the TSS region.
 	 */
 	struct tss_struct tss;
+
+	char entry_trampoline[PAGE_SIZE];
 };
 
 #define CPU_ENTRY_AREA_PAGES (sizeof(struct cpu_entry_area) / PAGE_SIZE)
--- a/arch/x86/kernel/asm-offsets.c
+++ b/arch/x86/kernel/asm-offsets.c
@@ -101,4 +101,5 @@ void common(void) {
 
 	/* Layout info for cpu_entry_area */
 	OFFSET(CPU_ENTRY_AREA_tss, cpu_entry_area, tss);
+	OFFSET(CPU_ENTRY_AREA_entry_trampoline, cpu_entry_area, entry_trampoline);
 }
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -486,6 +486,8 @@ DEFINE_PER_CPU(struct cpu_entry_area *,
 static inline void setup_cpu_entry_area(int cpu)
 {
 #ifdef CONFIG_X86_64
+	extern char _entry_trampoline[];
+
 	/* On 64-bit systems, we use a read-only fixmap GDT. */
 	pgprot_t gdt_prot = PAGE_KERNEL_RO;
 #else
@@ -532,6 +534,11 @@ static inline void setup_cpu_entry_area(
 #ifdef CONFIG_X86_32
 	this_cpu_write(cpu_entry_area, get_cpu_entry_area(cpu));
 #endif
+
+#ifdef CONFIG_X86_64
+	__set_fixmap(get_cpu_entry_area_index(cpu, entry_trampoline),
+		     __pa_symbol(_entry_trampoline), PAGE_KERNEL_RX);
+#endif
 }
 
 /* Load the original GDT from the per-cpu structure */
@@ -1395,10 +1402,16 @@ static DEFINE_PER_CPU_PAGE_ALIGNED(char,
 /* May not be marked __init: used by software suspend */
 void syscall_init(void)
 {
+	extern char _entry_trampoline[];
+	extern char entry_SYSCALL_64_trampoline[];
+
 	int cpu = smp_processor_id();
+	unsigned long SYSCALL64_entry_trampoline =
+		(unsigned long)get_cpu_entry_area(cpu)->entry_trampoline +
+		(entry_SYSCALL_64_trampoline - _entry_trampoline);
 
 	wrmsr(MSR_STAR, 0, (__USER32_CS << 16) | __KERNEL_CS);
-	wrmsrl(MSR_LSTAR, (unsigned long)entry_SYSCALL_64);
+	wrmsrl(MSR_LSTAR, SYSCALL64_entry_trampoline);
 
 #ifdef CONFIG_IA32_EMULATION
 	wrmsrl(MSR_CSTAR, (unsigned long)entry_SYSCALL_compat);
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -107,6 +107,15 @@ SECTIONS
 		SOFTIRQENTRY_TEXT
 		*(.fixup)
 		*(.gnu.warning)
+
+#ifdef CONFIG_X86_64
+		. = ALIGN(PAGE_SIZE);
+		_entry_trampoline = .;
+		*(.entry_trampoline)
+		. = ALIGN(PAGE_SIZE);
+		ASSERT(. - _entry_trampoline == PAGE_SIZE, "entry trampoline is too big");
+#endif
+
 		/* End of text section */
 		_etext = .;
 	} :text = 0x9090

* [PATCH 4.14 093/159] x86/entry/64: Move the IST stacks into struct cpu_entry_area
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (94 preceding siblings ...)
  2017-12-22  8:46 ` [PATCH 4.14 092/159] x86/entry/64: Create a per-CPU SYSCALL entry trampoline Greg Kroah-Hartman
@ 2017-12-22  8:46 ` Greg Kroah-Hartman
  2017-12-22  8:46 ` [PATCH 4.14 094/159] x86/entry/64: Remove the SYSENTER stack canary Greg Kroah-Hartman
                   ` (69 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Andy Lutomirski, Thomas Gleixner,
	Borislav Petkov, Boris Ostrovsky, Borislav Petkov,
	Borislav Petkov, Brian Gerst, Dave Hansen, Dave Hansen,
	David Laight, Denys Vlasenko, Eduardo Valentin, H. Peter Anvin,
	Josh Poimboeuf, Juergen Gross, Linus Torvalds, Peter Zijlstra,
	Rik van Riel, Will Deacon, aliguori, daniel.gruss, hughd,
	keescook, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <luto@kernel.org>

commit 40e7f949e0d9a33968ebde5d67f7e3a47c97742a upstream.

The IST stacks are needed when an IST exception occurs and are accessed
before any kernel code at all runs.  Move them into struct cpu_entry_area.

The IST stacks are unlike the rest of cpu_entry_area: they're used even for
entries from kernel mode.  This means that they should be set up before we
load the final IDT.  Move cpu_entry_area setup to trap_init() for the boot
CPU and set it up for all possible CPUs at once in native_smp_prepare_cpus().
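
To make the layout concrete, here is a small C sketch of how the
per-stack top addresses fall out of one contiguous exception_stacks
array, mirroring the "estacks += exception_stack_sizes[v]" loop in
cpu_init().  The constants (four stacks, 4K each, 8K for the debug
stack at index 2) are assumptions chosen for the example, and
ist_tops() is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration only. */
enum {
	MODEL_N_STACKS    = 4,
	MODEL_STKSZ       = 4096,
	MODEL_DEBUG_STKSZ = 2 * MODEL_STKSZ,
};

static const unsigned int model_stack_sizes[MODEL_N_STACKS] = {
	MODEL_STKSZ, MODEL_STKSZ, MODEL_DEBUG_STKSZ, MODEL_STKSZ,
};

/*
 * Stacks grow down, so each IST slot holds the address just past the
 * end of its region: advance by the size first, then record the top.
 */
static void ist_tops(uintptr_t base, uintptr_t tops[MODEL_N_STACKS])
{
	uintptr_t p = base;
	int v;

	for (v = 0; v < MODEL_N_STACKS; v++) {
		p += model_stack_sizes[v];
		tops[v] = p;
	}

	/* The regions must exactly fill the backing array. */
	assert(p - base ==
	       (MODEL_N_STACKS - 1) * MODEL_STKSZ + MODEL_DEBUG_STKSZ);
}
```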

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Laight <David.Laight@aculab.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Eduardo Valentin <eduval@amazon.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: aliguori@amazon.com
Cc: daniel.gruss@iaik.tugraz.at
Cc: hughd@google.com
Cc: keescook@google.com
Link: https://lkml.kernel.org/r/20171204150606.480598743@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/include/asm/fixmap.h |   12 ++++++
 arch/x86/kernel/cpu/common.c  |   74 +++++++++++++++++++++++-------------------
 arch/x86/kernel/traps.c       |    3 +
 3 files changed, 57 insertions(+), 32 deletions(-)

--- a/arch/x86/include/asm/fixmap.h
+++ b/arch/x86/include/asm/fixmap.h
@@ -63,10 +63,22 @@ struct cpu_entry_area {
 	struct tss_struct tss;
 
 	char entry_trampoline[PAGE_SIZE];
+
+#ifdef CONFIG_X86_64
+	/*
+	 * Exception stacks used for IST entries.
+	 *
+	 * In the future, this should have a separate slot for each stack
+	 * with guard pages between them.
+	 */
+	char exception_stacks[(N_EXCEPTION_STACKS - 1) * EXCEPTION_STKSZ + DEBUG_STKSZ];
+#endif
 };
 
 #define CPU_ENTRY_AREA_PAGES (sizeof(struct cpu_entry_area) / PAGE_SIZE)
 
+extern void setup_cpu_entry_areas(void);
+
 /*
  * Here we define all the compile-time 'special' virtual
  * addresses. The point is to have a constant address at
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -466,24 +466,36 @@ void load_percpu_segment(int cpu)
 	load_stack_canary_segment();
 }
 
-static void set_percpu_fixmap_pages(int fixmap_index, void *ptr,
-				    int pages, pgprot_t prot)
-{
-	int i;
-
-	for (i = 0; i < pages; i++) {
-		__set_fixmap(fixmap_index - i,
-			     per_cpu_ptr_to_phys(ptr + i * PAGE_SIZE), prot);
-	}
-}
-
 #ifdef CONFIG_X86_32
 /* The 32-bit entry code needs to find cpu_entry_area. */
 DEFINE_PER_CPU(struct cpu_entry_area *, cpu_entry_area);
 #endif
 
+#ifdef CONFIG_X86_64
+/*
+ * Special IST stacks which the CPU switches to when it calls
+ * an IST-marked descriptor entry. Up to 7 stacks (hardware
+ * limit), all of them are 4K, except the debug stack which
+ * is 8K.
+ */
+static const unsigned int exception_stack_sizes[N_EXCEPTION_STACKS] = {
+	  [0 ... N_EXCEPTION_STACKS - 1]	= EXCEPTION_STKSZ,
+	  [DEBUG_STACK - 1]			= DEBUG_STKSZ
+};
+
+static DEFINE_PER_CPU_PAGE_ALIGNED(char, exception_stacks
+	[(N_EXCEPTION_STACKS - 1) * EXCEPTION_STKSZ + DEBUG_STKSZ]);
+#endif
+
+static void __init
+set_percpu_fixmap_pages(int idx, void *ptr, int pages, pgprot_t prot)
+{
+	for ( ; pages; pages--, idx--, ptr += PAGE_SIZE)
+		__set_fixmap(idx, per_cpu_ptr_to_phys(ptr), prot);
+}
+
 /* Setup the fixmap mappings only once per-processor */
-static inline void setup_cpu_entry_area(int cpu)
+static void __init setup_cpu_entry_area(int cpu)
 {
 #ifdef CONFIG_X86_64
 	extern char _entry_trampoline[];
@@ -532,15 +544,31 @@ static inline void setup_cpu_entry_area(
 				PAGE_KERNEL);
 
 #ifdef CONFIG_X86_32
-	this_cpu_write(cpu_entry_area, get_cpu_entry_area(cpu));
+	per_cpu(cpu_entry_area, cpu) = get_cpu_entry_area(cpu);
 #endif
 
 #ifdef CONFIG_X86_64
+	BUILD_BUG_ON(sizeof(exception_stacks) % PAGE_SIZE != 0);
+	BUILD_BUG_ON(sizeof(exception_stacks) !=
+		     sizeof(((struct cpu_entry_area *)0)->exception_stacks));
+	set_percpu_fixmap_pages(get_cpu_entry_area_index(cpu, exception_stacks),
+				&per_cpu(exception_stacks, cpu),
+				sizeof(exception_stacks) / PAGE_SIZE,
+				PAGE_KERNEL);
+
 	__set_fixmap(get_cpu_entry_area_index(cpu, entry_trampoline),
 		     __pa_symbol(_entry_trampoline), PAGE_KERNEL_RX);
 #endif
 }
 
+void __init setup_cpu_entry_areas(void)
+{
+	unsigned int cpu;
+
+	for_each_possible_cpu(cpu)
+		setup_cpu_entry_area(cpu);
+}
+
 /* Load the original GDT from the per-cpu structure */
 void load_direct_gdt(int cpu)
 {
@@ -1385,20 +1413,6 @@ DEFINE_PER_CPU(unsigned int, irq_count)
 DEFINE_PER_CPU(int, __preempt_count) = INIT_PREEMPT_COUNT;
 EXPORT_PER_CPU_SYMBOL(__preempt_count);
 
-/*
- * Special IST stacks which the CPU switches to when it calls
- * an IST-marked descriptor entry. Up to 7 stacks (hardware
- * limit), all of them are 4K, except the debug stack which
- * is 8K.
- */
-static const unsigned int exception_stack_sizes[N_EXCEPTION_STACKS] = {
-	  [0 ... N_EXCEPTION_STACKS - 1]	= EXCEPTION_STKSZ,
-	  [DEBUG_STACK - 1]			= DEBUG_STKSZ
-};
-
-static DEFINE_PER_CPU_PAGE_ALIGNED(char, exception_stacks
-	[(N_EXCEPTION_STACKS - 1) * EXCEPTION_STKSZ + DEBUG_STKSZ]);
-
 /* May not be marked __init: used by software suspend */
 void syscall_init(void)
 {
@@ -1607,7 +1621,7 @@ void cpu_init(void)
 	 * set up and load the per-CPU TSS
 	 */
 	if (!oist->ist[0]) {
-		char *estacks = per_cpu(exception_stacks, cpu);
+		char *estacks = get_cpu_entry_area(cpu)->exception_stacks;
 
 		for (v = 0; v < N_EXCEPTION_STACKS; v++) {
 			estacks += exception_stack_sizes[v];
@@ -1633,8 +1647,6 @@ void cpu_init(void)
 	initialize_tlbstate_and_flush();
 	enter_lazy_tlb(&init_mm, me);
 
-	setup_cpu_entry_area(cpu);
-
 	/*
 	 * Initialize the TSS.  sp0 points to the entry trampoline stack
 	 * regardless of what task is running.
@@ -1694,8 +1706,6 @@ void cpu_init(void)
 	initialize_tlbstate_and_flush();
 	enter_lazy_tlb(&init_mm, curr);
 
-	setup_cpu_entry_area(cpu);
-
 	/*
 	 * Initialize the TSS.  Don't bother initializing sp0, as the initial
 	 * task never enters user mode.
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -947,6 +947,9 @@ dotraplinkage void do_iret_error(struct
 
 void __init trap_init(void)
 {
+	/* Init cpu_entry_area before IST entries are set up */
+	setup_cpu_entry_areas();
+
 	idt_setup_traps();
 
 	/*

* [PATCH 4.14 094/159] x86/entry/64: Remove the SYSENTER stack canary
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (95 preceding siblings ...)
  2017-12-22  8:46 ` [PATCH 4.14 093/159] x86/entry/64: Move the IST stacks into struct cpu_entry_area Greg Kroah-Hartman
@ 2017-12-22  8:46 ` Greg Kroah-Hartman
  2017-12-22  8:46 ` [PATCH 4.14 095/159] x86/entry: Clean up the SYSENTER_stack code Greg Kroah-Hartman
                   ` (68 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Andy Lutomirski, Thomas Gleixner,
	Borislav Petkov, Boris Ostrovsky, Borislav Petkov,
	Borislav Petkov, Brian Gerst, Dave Hansen, Dave Hansen,
	David Laight, Denys Vlasenko, Eduardo Valentin, H. Peter Anvin,
	Josh Poimboeuf, Juergen Gross, Linus Torvalds, Peter Zijlstra,
	Rik van Riel, Will Deacon, aliguori, daniel.gruss, hughd,
	keescook, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <luto@kernel.org>

commit 7fbbd5cbebf118a9e09f5453f686656a167c3d1c upstream.

Now that the SYSENTER stack has a guard page, there's no need for a canary
to detect overflow after the fact.
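
For context, a canary of the kind being removed works roughly like the
C sketch below: a magic word sits just past the low end of the stack,
and a later check notices if it was overwritten.  The structure, the
magic value, and the helper names are hypothetical and exist only for
this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_STACK_WORDS 64
#define MODEL_MAGIC 0xC0FFEEUL	/* arbitrary value for the example */

struct model_stack {
	/* words[0] is the canary; words[1..64] are the usable stack. */
	unsigned long words[1 + MODEL_STACK_WORDS];
	size_t sp;		/* index of the most recently pushed word */
};

static void stack_init(struct model_stack *s)
{
	s->words[0] = MODEL_MAGIC;
	s->sp = 1 + MODEL_STACK_WORDS;	/* empty: points past the top */
}

/* Nothing stops a push from running into the canary slot. */
static void stack_push(struct model_stack *s, unsigned long val)
{
	assert(s->sp > 0);
	s->words[--s->sp] = val;
}

/* Detection happens only when somebody looks, after the damage. */
static bool canary_intact(const struct model_stack *s)
{
	return s->words[0] == MODEL_MAGIC;
}
```

A guard page, by contrast, faults on the offending write itself, which
is why the after-the-fact check is no longer needed.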

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Laight <David.Laight@aculab.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Eduardo Valentin <eduval@amazon.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: aliguori@amazon.com
Cc: daniel.gruss@iaik.tugraz.at
Cc: hughd@google.com
Cc: keescook@google.com
Link: https://lkml.kernel.org/r/20171204150606.572577316@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/include/asm/processor.h |    1 -
 arch/x86/kernel/dumpstack.c      |    3 +--
 arch/x86/kernel/process.c        |    1 -
 arch/x86/kernel/traps.c          |    7 -------
 4 files changed, 1 insertion(+), 11 deletions(-)

--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -341,7 +341,6 @@ struct tss_struct {
 	 * Space for the temporary SYSENTER stack, used for SYSENTER
 	 * and the entry trampoline as well.
 	 */
-	unsigned long		SYSENTER_stack_canary;
 	unsigned long		SYSENTER_stack[64];
 
 	/*
--- a/arch/x86/kernel/dumpstack.c
+++ b/arch/x86/kernel/dumpstack.c
@@ -48,8 +48,7 @@ bool in_sysenter_stack(unsigned long *st
 	int cpu = smp_processor_id();
 	struct tss_struct *tss = &get_cpu_entry_area(cpu)->tss;
 
-	/* Treat the canary as part of the stack for unwinding purposes. */
-	void *begin = &tss->SYSENTER_stack_canary;
+	void *begin = &tss->SYSENTER_stack;
 	void *end = (void *)&tss->SYSENTER_stack + sizeof(tss->SYSENTER_stack);
 
 	if ((void *)stack < begin || (void *)stack >= end)
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -81,7 +81,6 @@ __visible DEFINE_PER_CPU_SHARED_ALIGNED(
 	  */
 	.io_bitmap		= { [0 ... IO_BITMAP_LONGS] = ~0 },
 #endif
-	.SYSENTER_stack_canary	= STACK_END_MAGIC,
 };
 EXPORT_PER_CPU_SYMBOL(cpu_tss);
 
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -814,13 +814,6 @@ dotraplinkage void do_debug(struct pt_re
 	debug_stack_usage_dec();
 
 exit:
-	/*
-	 * This is the most likely code path that involves non-trivial use
-	 * of the SYSENTER stack.  Check that we haven't overrun it.
-	 */
-	WARN(this_cpu_read(cpu_tss.SYSENTER_stack_canary) != STACK_END_MAGIC,
-	     "Overran or corrupted SYSENTER stack\n");
-
 	ist_exit(regs);
 }
 NOKPROBE_SYMBOL(do_debug);

^ permalink raw reply	[flat|nested] 349+ messages in thread
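
Note on 094/159 (not part of the patch): the following is a minimal
user-space sketch of what a stack canary can and cannot detect.  All
names (demo_canary_stack, demo_push, DEMO_CANARY) are hypothetical, and
the magic value is only a placeholder for this sketch.  The canary word
sits directly below the stack and is compared against its magic value at
some later point, so an overrun is only noticed after the write has
already happened.  A guard page faults on the first out-of-bounds write
instead, which is presumably why the commit message calls the canary
unnecessary.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_STACK_WORDS	64
#define DEMO_CANARY		0x57AC6E9DUL	/* placeholder magic for this sketch */

/*
 * mem[0] is the canary, mem[1..DEMO_STACK_WORDS] is the stack.
 * The stack grows down, so the word below the stack is the canary.
 */
struct demo_canary_stack {
	unsigned long	mem[1 + DEMO_STACK_WORDS];
	size_t		sp;	/* index of the most recently pushed word */
};

void demo_init(struct demo_canary_stack *s)
{
	s->mem[0] = DEMO_CANARY;
	s->sp = 1 + DEMO_STACK_WORDS;	/* empty stack */
}

/* Push without a bounds check against the stack, as the hardware would. */
void demo_push(struct demo_canary_stack *s, unsigned long v)
{
	assert(s->sp > 0);	/* keep the sketch itself inside mem[] */
	s->mem[--s->sp] = v;
}

/* After-the-fact check, in the spirit of the WARN() removed from do_debug(). */
int demo_canary_intact(const struct demo_canary_stack *s)
{
	return s->mem[0] == DEMO_CANARY;
}
```

In this sketch, 64 pushes leave the canary intact and the 65th push
overwrites it; the check reports the problem only when it is next run.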

* [PATCH 4.14 095/159] x86/entry: Clean up the SYSENTER_stack code
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (96 preceding siblings ...)
  2017-12-22  8:46 ` [PATCH 4.14 094/159] x86/entry/64: Remove the SYSENTER stack canary Greg Kroah-Hartman
@ 2017-12-22  8:46 ` Greg Kroah-Hartman
  2017-12-22  8:46 ` [PATCH 4.14 096/159] x86/entry/64: Make cpu_entry_area.tss read-only Greg Kroah-Hartman
                   ` (67 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Andy Lutomirski, Thomas Gleixner,
	Borislav Petkov, Boris Ostrovsky, Borislav Petkov, Brian Gerst,
	Dave Hansen, Dave Hansen, David Laight, Denys Vlasenko,
	Eduardo Valentin, H. Peter Anvin, Josh Poimboeuf, Juergen Gross,
	Linus Torvalds, Peter Zijlstra, Rik van Riel, Will Deacon,
	aliguori, daniel.gruss, hughd, keescook, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <luto@kernel.org>

commit 0f9a48100fba3f189724ae88a450c2261bf91c80 upstream.

The existing code was a mess, mainly because C arrays are nasty.  Turn
SYSENTER_stack into a struct, add a helper to find it, and do all the
obvious cleanups this enables.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bpetkov@suse.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Laight <David.Laight@aculab.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Eduardo Valentin <eduval@amazon.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: aliguori@amazon.com
Cc: daniel.gruss@iaik.tugraz.at
Cc: hughd@google.com
Cc: keescook@google.com
Link: https://lkml.kernel.org/r/20171204150606.653244723@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/entry/entry_32.S        |    4 ++--
 arch/x86/entry/entry_64.S        |    2 +-
 arch/x86/include/asm/fixmap.h    |    5 +++++
 arch/x86/include/asm/processor.h |    6 +++++-
 arch/x86/kernel/asm-offsets.c    |    6 ++----
 arch/x86/kernel/cpu/common.c     |   14 +++-----------
 arch/x86/kernel/dumpstack.c      |    7 +++----
 7 files changed, 21 insertions(+), 23 deletions(-)

--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -942,7 +942,7 @@ ENTRY(debug)
 
 	/* Are we currently on the SYSENTER stack? */
 	movl	PER_CPU_VAR(cpu_entry_area), %ecx
-	addl	$CPU_ENTRY_AREA_tss + CPU_TSS_SYSENTER_stack + SIZEOF_SYSENTER_stack, %ecx
+	addl	$CPU_ENTRY_AREA_tss + TSS_STRUCT_SYSENTER_stack + SIZEOF_SYSENTER_stack, %ecx
 	subl	%eax, %ecx	/* ecx = (end of SYSENTER_stack) - esp */
 	cmpl	$SIZEOF_SYSENTER_stack, %ecx
 	jb	.Ldebug_from_sysenter_stack
@@ -986,7 +986,7 @@ ENTRY(nmi)
 
 	/* Are we currently on the SYSENTER stack? */
 	movl	PER_CPU_VAR(cpu_entry_area), %ecx
-	addl	$CPU_ENTRY_AREA_tss + CPU_TSS_SYSENTER_stack + SIZEOF_SYSENTER_stack, %ecx
+	addl	$CPU_ENTRY_AREA_tss + TSS_STRUCT_SYSENTER_stack + SIZEOF_SYSENTER_stack, %ecx
 	subl	%eax, %ecx	/* ecx = (end of SYSENTER_stack) - esp */
 	cmpl	$SIZEOF_SYSENTER_stack, %ecx
 	jb	.Lnmi_from_sysenter_stack
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -154,7 +154,7 @@ END(native_usergs_sysret64)
 	_entry_trampoline - CPU_ENTRY_AREA_entry_trampoline(%rip)
 
 /* The top word of the SYSENTER stack is hot and is usable as scratch space. */
-#define RSP_SCRATCH	CPU_ENTRY_AREA_tss + CPU_TSS_SYSENTER_stack + \
+#define RSP_SCRATCH	CPU_ENTRY_AREA_tss + TSS_STRUCT_SYSENTER_stack + \
 			SIZEOF_SYSENTER_stack - 8 + CPU_ENTRY_AREA
 
 ENTRY(entry_SYSCALL_64_trampoline)
--- a/arch/x86/include/asm/fixmap.h
+++ b/arch/x86/include/asm/fixmap.h
@@ -245,5 +245,10 @@ static inline struct cpu_entry_area *get
 	return (struct cpu_entry_area *)__fix_to_virt(__get_cpu_entry_area_page_index(cpu, 0));
 }
 
+static inline struct SYSENTER_stack *cpu_SYSENTER_stack(int cpu)
+{
+	return &get_cpu_entry_area(cpu)->tss.SYSENTER_stack;
+}
+
 #endif /* !__ASSEMBLY__ */
 #endif /* _ASM_X86_FIXMAP_H */
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -336,12 +336,16 @@ struct x86_hw_tss {
 #define IO_BITMAP_OFFSET		(offsetof(struct tss_struct, io_bitmap) - offsetof(struct tss_struct, x86_tss))
 #define INVALID_IO_BITMAP_OFFSET	0x8000
 
+struct SYSENTER_stack {
+	unsigned long		words[64];
+};
+
 struct tss_struct {
 	/*
 	 * Space for the temporary SYSENTER stack, used for SYSENTER
 	 * and the entry trampoline as well.
 	 */
-	unsigned long		SYSENTER_stack[64];
+	struct SYSENTER_stack	SYSENTER_stack;
 
 	/*
 	 * The fixed hardware portion.  This must not cross a page boundary
--- a/arch/x86/kernel/asm-offsets.c
+++ b/arch/x86/kernel/asm-offsets.c
@@ -94,10 +94,8 @@ void common(void) {
 	BLANK();
 	DEFINE(PTREGS_SIZE, sizeof(struct pt_regs));
 
-	/* Offset from cpu_tss to SYSENTER_stack */
-	OFFSET(CPU_TSS_SYSENTER_stack, tss_struct, SYSENTER_stack);
-	/* Size of SYSENTER_stack */
-	DEFINE(SIZEOF_SYSENTER_stack, sizeof(((struct tss_struct *)0)->SYSENTER_stack));
+	OFFSET(TSS_STRUCT_SYSENTER_stack, tss_struct, SYSENTER_stack);
+	DEFINE(SIZEOF_SYSENTER_stack, sizeof(struct SYSENTER_stack));
 
 	/* Layout info for cpu_entry_area */
 	OFFSET(CPU_ENTRY_AREA_tss, cpu_entry_area, tss);
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1314,12 +1314,7 @@ void enable_sep_cpu(void)
 
 	tss->x86_tss.ss1 = __KERNEL_CS;
 	wrmsr(MSR_IA32_SYSENTER_CS, tss->x86_tss.ss1, 0);
-
-	wrmsr(MSR_IA32_SYSENTER_ESP,
-	      (unsigned long)&get_cpu_entry_area(cpu)->tss +
-	      offsetofend(struct tss_struct, SYSENTER_stack),
-	      0);
-
+	wrmsr(MSR_IA32_SYSENTER_ESP, (unsigned long)(cpu_SYSENTER_stack(cpu) + 1), 0);
 	wrmsr(MSR_IA32_SYSENTER_EIP, (unsigned long)entry_SYSENTER_32, 0);
 
 	put_cpu();
@@ -1436,9 +1431,7 @@ void syscall_init(void)
 	 * AMD doesn't allow SYSENTER in long mode (either 32- or 64-bit).
 	 */
 	wrmsrl_safe(MSR_IA32_SYSENTER_CS, (u64)__KERNEL_CS);
-	wrmsrl_safe(MSR_IA32_SYSENTER_ESP,
-		    (unsigned long)&get_cpu_entry_area(cpu)->tss +
-		    offsetofend(struct tss_struct, SYSENTER_stack));
+	wrmsrl_safe(MSR_IA32_SYSENTER_ESP, (unsigned long)(cpu_SYSENTER_stack(cpu) + 1));
 	wrmsrl_safe(MSR_IA32_SYSENTER_EIP, (u64)entry_SYSENTER_compat);
 #else
 	wrmsrl(MSR_CSTAR, (unsigned long)ignore_sysret);
@@ -1653,8 +1646,7 @@ void cpu_init(void)
 	 */
 	set_tss_desc(cpu, &get_cpu_entry_area(cpu)->tss.x86_tss);
 	load_TR_desc();
-	load_sp0((unsigned long)&get_cpu_entry_area(cpu)->tss +
-		 offsetofend(struct tss_struct, SYSENTER_stack));
+	load_sp0((unsigned long)(cpu_SYSENTER_stack(cpu) + 1));
 
 	load_mm_ldt(&init_mm);
 
--- a/arch/x86/kernel/dumpstack.c
+++ b/arch/x86/kernel/dumpstack.c
@@ -45,11 +45,10 @@ bool in_task_stack(unsigned long *stack,
 
 bool in_sysenter_stack(unsigned long *stack, struct stack_info *info)
 {
-	int cpu = smp_processor_id();
-	struct tss_struct *tss = &get_cpu_entry_area(cpu)->tss;
+	struct SYSENTER_stack *ss = cpu_SYSENTER_stack(smp_processor_id());
 
-	void *begin = &tss->SYSENTER_stack;
-	void *end = (void *)&tss->SYSENTER_stack + sizeof(tss->SYSENTER_stack);
+	void *begin = ss;
+	void *end = ss + 1;
 
 	if ((void *)stack < begin || (void *)stack >= end)
 		return false;

^ permalink raw reply	[flat|nested] 349+ messages in thread
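
Note on 095/159 (not part of the patch): a minimal user-space sketch of
the two idioms this cleanup relies on.  The names demo_stack,
demo_stack_end, demo_in_stack and demo_on_stack_asm_style are
hypothetical.  First, once the array is wrapped in a struct, "one past
the end" is simply ss + 1, which is what the patch uses for
MSR_IA32_SYSENTER_ESP, load_sp0() and in_sysenter_stack().  Second, the
entry_32.S hunks test "is esp on the SYSENTER stack?" with a single
unsigned comparison of (end - esp) against the stack size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct SYSENTER_stack as introduced by the patch. */
struct demo_stack {
	unsigned long words[64];
};

/* s + 1 advances by sizeof(struct demo_stack): the address just past the stack. */
void *demo_stack_end(struct demo_stack *s)
{
	return s + 1;
}

/* Same shape as the begin/end test in in_sysenter_stack(): begin <= p < end. */
int demo_in_stack(struct demo_stack *s, const void *p)
{
	const char *begin = (const char *)s;
	const char *end = (const char *)demo_stack_end(s);
	const char *q = p;

	assert(s != NULL);
	return q >= begin && q < end;
}

/*
 * C rendering of the assembly check: ecx = end - esp; cmpl size, ecx; jb.
 * Unsigned wraparound makes any sp above end produce a huge difference,
 * so one comparison covers both bounds.  The accepted range is
 * begin < sp <= end, i.e. an empty stack (sp == end) counts as "on it".
 */
int demo_on_stack_asm_style(uintptr_t end, uintptr_t sp, uintptr_t size)
{
	return end - sp < size;
}
```

Note that the two checks differ at the boundaries: the C helper accepts
the lowest word and rejects the end address, while the assembly-style
check accepts the end address and rejects the lowest address.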

* [PATCH 4.14 096/159] x86/entry/64: Make cpu_entry_area.tss read-only
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (97 preceding siblings ...)
  2017-12-22  8:46 ` [PATCH 4.14 095/159] x86/entry: Clean up the SYSENTER_stack code Greg Kroah-Hartman
@ 2017-12-22  8:46 ` Greg Kroah-Hartman
  2017-12-22  8:46   ` Greg Kroah-Hartman
                   ` (66 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Andy Lutomirski, Thomas Gleixner,
	Borislav Petkov, Boris Ostrovsky, Borislav Petkov, Brian Gerst,
	Dave Hansen, Dave Hansen, David Laight, Denys Vlasenko,
	Eduardo Valentin, H. Peter Anvin, Josh Poimboeuf, Juergen Gross,
	Kees Cook, Linus Torvalds, Peter Zijlstra, Rik van Riel,
	Will Deacon, aliguori, daniel.gruss, hughd, keescook,
	Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <luto@kernel.org>

commit c482feefe1aeb150156248ba0fd3e029bc886605 upstream.

The TSS is a fairly juicy target for exploits, and, now that the TSS
is in the cpu_entry_area, it's no longer protected by kASLR.  Make it
read-only on x86_64.

On x86_32, it can't be RO because it's written by the CPU during task
switches, and we use a task gate for double faults.  I'd also be
nervous about errata if we tried to make it RO even on configurations
without double fault handling.

[ tglx: AMD confirmed that there is no problem on 64-bit with TSS RO.  So
  	it's probably safe to assume that it's a non issue, though Intel
  	might have been creative in that area. Still waiting for
  	confirmation. ]

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bpetkov@suse.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Laight <David.Laight@aculab.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Eduardo Valentin <eduval@amazon.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: aliguori@amazon.com
Cc: daniel.gruss@iaik.tugraz.at
Cc: hughd@google.com
Cc: keescook@google.com
Link: https://lkml.kernel.org/r/20171204150606.733700132@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/entry/entry_32.S          |    4 ++--
 arch/x86/entry/entry_64.S          |    8 ++++----
 arch/x86/include/asm/fixmap.h      |   13 +++++++++----
 arch/x86/include/asm/processor.h   |   17 ++++++++---------
 arch/x86/include/asm/switch_to.h   |    4 ++--
 arch/x86/include/asm/thread_info.h |    2 +-
 arch/x86/kernel/asm-offsets.c      |    5 ++---
 arch/x86/kernel/asm-offsets_32.c   |    4 ++--
 arch/x86/kernel/cpu/common.c       |   29 +++++++++++++++++++----------
 arch/x86/kernel/ioport.c           |    2 +-
 arch/x86/kernel/process.c          |    6 +++---
 arch/x86/kernel/process_32.c       |    2 +-
 arch/x86/kernel/process_64.c       |    2 +-
 arch/x86/kernel/traps.c            |    4 ++--
 arch/x86/lib/delay.c               |    4 ++--
 arch/x86/xen/enlighten_pv.c        |    2 +-
 16 files changed, 60 insertions(+), 48 deletions(-)

--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -942,7 +942,7 @@ ENTRY(debug)
 
 	/* Are we currently on the SYSENTER stack? */
 	movl	PER_CPU_VAR(cpu_entry_area), %ecx
-	addl	$CPU_ENTRY_AREA_tss + TSS_STRUCT_SYSENTER_stack + SIZEOF_SYSENTER_stack, %ecx
+	addl	$CPU_ENTRY_AREA_SYSENTER_stack + SIZEOF_SYSENTER_stack, %ecx
 	subl	%eax, %ecx	/* ecx = (end of SYSENTER_stack) - esp */
 	cmpl	$SIZEOF_SYSENTER_stack, %ecx
 	jb	.Ldebug_from_sysenter_stack
@@ -986,7 +986,7 @@ ENTRY(nmi)
 
 	/* Are we currently on the SYSENTER stack? */
 	movl	PER_CPU_VAR(cpu_entry_area), %ecx
-	addl	$CPU_ENTRY_AREA_tss + TSS_STRUCT_SYSENTER_stack + SIZEOF_SYSENTER_stack, %ecx
+	addl	$CPU_ENTRY_AREA_SYSENTER_stack + SIZEOF_SYSENTER_stack, %ecx
 	subl	%eax, %ecx	/* ecx = (end of SYSENTER_stack) - esp */
 	cmpl	$SIZEOF_SYSENTER_stack, %ecx
 	jb	.Lnmi_from_sysenter_stack
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -154,7 +154,7 @@ END(native_usergs_sysret64)
 	_entry_trampoline - CPU_ENTRY_AREA_entry_trampoline(%rip)
 
 /* The top word of the SYSENTER stack is hot and is usable as scratch space. */
-#define RSP_SCRATCH	CPU_ENTRY_AREA_tss + TSS_STRUCT_SYSENTER_stack + \
+#define RSP_SCRATCH	CPU_ENTRY_AREA_SYSENTER_stack + \
 			SIZEOF_SYSENTER_stack - 8 + CPU_ENTRY_AREA
 
 ENTRY(entry_SYSCALL_64_trampoline)
@@ -390,7 +390,7 @@ syscall_return_via_sysret:
 	 * Save old stack pointer and switch to trampoline stack.
 	 */
 	movq	%rsp, %rdi
-	movq	PER_CPU_VAR(cpu_tss + TSS_sp0), %rsp
+	movq	PER_CPU_VAR(cpu_tss_rw + TSS_sp0), %rsp
 
 	pushq	RSP-RDI(%rdi)	/* RSP */
 	pushq	(%rdi)		/* RDI */
@@ -719,7 +719,7 @@ GLOBAL(swapgs_restore_regs_and_return_to
 	 * Save old stack pointer and switch to trampoline stack.
 	 */
 	movq	%rsp, %rdi
-	movq	PER_CPU_VAR(cpu_tss + TSS_sp0), %rsp
+	movq	PER_CPU_VAR(cpu_tss_rw + TSS_sp0), %rsp
 
 	/* Copy the IRET frame to the trampoline stack. */
 	pushq	6*8(%rdi)	/* SS */
@@ -934,7 +934,7 @@ apicinterrupt IRQ_WORK_VECTOR			irq_work
 /*
  * Exception entry points.
  */
-#define CPU_TSS_IST(x) PER_CPU_VAR(cpu_tss) + (TSS_ist + ((x) - 1) * 8)
+#define CPU_TSS_IST(x) PER_CPU_VAR(cpu_tss_rw) + (TSS_ist + ((x) - 1) * 8)
 
 /*
  * Switch to the thread stack.  This is called with the IRET frame and
--- a/arch/x86/include/asm/fixmap.h
+++ b/arch/x86/include/asm/fixmap.h
@@ -56,9 +56,14 @@ struct cpu_entry_area {
 	char gdt[PAGE_SIZE];
 
 	/*
-	 * The GDT is just below cpu_tss and thus serves (on x86_64) as a
-	 * a read-only guard page for the SYSENTER stack at the bottom
-	 * of the TSS region.
+	 * The GDT is just below SYSENTER_stack and thus serves (on x86_64) as
+	 * a a read-only guard page.
+	 */
+	struct SYSENTER_stack_page SYSENTER_stack_page;
+
+	/*
+	 * On x86_64, the TSS is mapped RO.  On x86_32, it's mapped RW because
+	 * we need task switches to work, and task switches write to the TSS.
 	 */
 	struct tss_struct tss;
 
@@ -247,7 +252,7 @@ static inline struct cpu_entry_area *get
 
 static inline struct SYSENTER_stack *cpu_SYSENTER_stack(int cpu)
 {
-	return &get_cpu_entry_area(cpu)->tss.SYSENTER_stack;
+	return &get_cpu_entry_area(cpu)->SYSENTER_stack_page.stack;
 }
 
 #endif /* !__ASSEMBLY__ */
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -340,13 +340,11 @@ struct SYSENTER_stack {
 	unsigned long		words[64];
 };
 
-struct tss_struct {
-	/*
-	 * Space for the temporary SYSENTER stack, used for SYSENTER
-	 * and the entry trampoline as well.
-	 */
-	struct SYSENTER_stack	SYSENTER_stack;
+struct SYSENTER_stack_page {
+	struct SYSENTER_stack stack;
+} __aligned(PAGE_SIZE);
 
+struct tss_struct {
 	/*
 	 * The fixed hardware portion.  This must not cross a page boundary
 	 * at risk of violating the SDM's advice and potentially triggering
@@ -363,7 +361,7 @@ struct tss_struct {
 	unsigned long		io_bitmap[IO_BITMAP_LONGS + 1];
 } __aligned(PAGE_SIZE);
 
-DECLARE_PER_CPU_PAGE_ALIGNED(struct tss_struct, cpu_tss);
+DECLARE_PER_CPU_PAGE_ALIGNED(struct tss_struct, cpu_tss_rw);
 
 /*
  * sizeof(unsigned long) coming from an extra "long" at the end
@@ -378,7 +376,8 @@ DECLARE_PER_CPU_PAGE_ALIGNED(struct tss_
 #ifdef CONFIG_X86_32
 DECLARE_PER_CPU(unsigned long, cpu_current_top_of_stack);
 #else
-#define cpu_current_top_of_stack cpu_tss.x86_tss.sp1
+/* The RO copy can't be accessed with this_cpu_xyz(), so use the RW copy. */
+#define cpu_current_top_of_stack cpu_tss_rw.x86_tss.sp1
 #endif
 
 /*
@@ -538,7 +537,7 @@ static inline void native_set_iopl_mask(
 static inline void
 native_load_sp0(unsigned long sp0)
 {
-	this_cpu_write(cpu_tss.x86_tss.sp0, sp0);
+	this_cpu_write(cpu_tss_rw.x86_tss.sp0, sp0);
 }
 
 static inline void native_swapgs(void)
--- a/arch/x86/include/asm/switch_to.h
+++ b/arch/x86/include/asm/switch_to.h
@@ -79,10 +79,10 @@ do {									\
 static inline void refresh_sysenter_cs(struct thread_struct *thread)
 {
 	/* Only happens when SEP is enabled, no need to test "SEP"arately: */
-	if (unlikely(this_cpu_read(cpu_tss.x86_tss.ss1) == thread->sysenter_cs))
+	if (unlikely(this_cpu_read(cpu_tss_rw.x86_tss.ss1) == thread->sysenter_cs))
 		return;
 
-	this_cpu_write(cpu_tss.x86_tss.ss1, thread->sysenter_cs);
+	this_cpu_write(cpu_tss_rw.x86_tss.ss1, thread->sysenter_cs);
 	wrmsr(MSR_IA32_SYSENTER_CS, thread->sysenter_cs, 0);
 }
 #endif
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -207,7 +207,7 @@ static inline int arch_within_stack_fram
 #else /* !__ASSEMBLY__ */
 
 #ifdef CONFIG_X86_64
-# define cpu_current_top_of_stack (cpu_tss + TSS_sp1)
+# define cpu_current_top_of_stack (cpu_tss_rw + TSS_sp1)
 #endif
 
 #endif
--- a/arch/x86/kernel/asm-offsets.c
+++ b/arch/x86/kernel/asm-offsets.c
@@ -94,10 +94,9 @@ void common(void) {
 	BLANK();
 	DEFINE(PTREGS_SIZE, sizeof(struct pt_regs));
 
-	OFFSET(TSS_STRUCT_SYSENTER_stack, tss_struct, SYSENTER_stack);
-	DEFINE(SIZEOF_SYSENTER_stack, sizeof(struct SYSENTER_stack));
-
 	/* Layout info for cpu_entry_area */
 	OFFSET(CPU_ENTRY_AREA_tss, cpu_entry_area, tss);
 	OFFSET(CPU_ENTRY_AREA_entry_trampoline, cpu_entry_area, entry_trampoline);
+	OFFSET(CPU_ENTRY_AREA_SYSENTER_stack, cpu_entry_area, SYSENTER_stack_page);
+	DEFINE(SIZEOF_SYSENTER_stack, sizeof(struct SYSENTER_stack));
 }
--- a/arch/x86/kernel/asm-offsets_32.c
+++ b/arch/x86/kernel/asm-offsets_32.c
@@ -47,8 +47,8 @@ void foo(void)
 	BLANK();
 
 	/* Offset from the sysenter stack to tss.sp0 */
-	DEFINE(TSS_sysenter_sp0, offsetof(struct tss_struct, x86_tss.sp0) -
-	       offsetofend(struct tss_struct, SYSENTER_stack));
+	DEFINE(TSS_sysenter_sp0, offsetof(struct cpu_entry_area, tss.x86_tss.sp0) -
+	       offsetofend(struct cpu_entry_area, SYSENTER_stack_page.stack));
 
 #ifdef CONFIG_CC_STACKPROTECTOR
 	BLANK();
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -487,6 +487,9 @@ static DEFINE_PER_CPU_PAGE_ALIGNED(char,
 	[(N_EXCEPTION_STACKS - 1) * EXCEPTION_STKSZ + DEBUG_STKSZ]);
 #endif
 
+static DEFINE_PER_CPU_PAGE_ALIGNED(struct SYSENTER_stack_page,
+				   SYSENTER_stack_storage);
+
 static void __init
 set_percpu_fixmap_pages(int idx, void *ptr, int pages, pgprot_t prot)
 {
@@ -500,23 +503,29 @@ static void __init setup_cpu_entry_area(
 #ifdef CONFIG_X86_64
 	extern char _entry_trampoline[];
 
-	/* On 64-bit systems, we use a read-only fixmap GDT. */
+	/* On 64-bit systems, we use a read-only fixmap GDT and TSS. */
 	pgprot_t gdt_prot = PAGE_KERNEL_RO;
+	pgprot_t tss_prot = PAGE_KERNEL_RO;
 #else
 	/*
 	 * On native 32-bit systems, the GDT cannot be read-only because
 	 * our double fault handler uses a task gate, and entering through
-	 * a task gate needs to change an available TSS to busy.  If the GDT
-	 * is read-only, that will triple fault.
+	 * a task gate needs to change an available TSS to busy.  If the
+	 * GDT is read-only, that will triple fault.  The TSS cannot be
+	 * read-only because the CPU writes to it on task switches.
 	 *
-	 * On Xen PV, the GDT must be read-only because the hypervisor requires
-	 * it.
+	 * On Xen PV, the GDT must be read-only because the hypervisor
+	 * requires it.
 	 */
 	pgprot_t gdt_prot = boot_cpu_has(X86_FEATURE_XENPV) ?
 		PAGE_KERNEL_RO : PAGE_KERNEL;
+	pgprot_t tss_prot = PAGE_KERNEL;
 #endif
 
 	__set_fixmap(get_cpu_entry_area_index(cpu, gdt), get_cpu_gdt_paddr(cpu), gdt_prot);
+	set_percpu_fixmap_pages(get_cpu_entry_area_index(cpu, SYSENTER_stack_page),
+				per_cpu_ptr(&SYSENTER_stack_storage, cpu), 1,
+				PAGE_KERNEL);
 
 	/*
 	 * The Intel SDM says (Volume 3, 7.2.1):
@@ -539,9 +548,9 @@ static void __init setup_cpu_entry_area(
 		      offsetofend(struct tss_struct, x86_tss)) & PAGE_MASK);
 	BUILD_BUG_ON(sizeof(struct tss_struct) % PAGE_SIZE != 0);
 	set_percpu_fixmap_pages(get_cpu_entry_area_index(cpu, tss),
-				&per_cpu(cpu_tss, cpu),
+				&per_cpu(cpu_tss_rw, cpu),
 				sizeof(struct tss_struct) / PAGE_SIZE,
-				PAGE_KERNEL);
+				tss_prot);
 
 #ifdef CONFIG_X86_32
 	per_cpu(cpu_entry_area, cpu) = get_cpu_entry_area(cpu);
@@ -1305,7 +1314,7 @@ void enable_sep_cpu(void)
 		return;
 
 	cpu = get_cpu();
-	tss = &per_cpu(cpu_tss, cpu);
+	tss = &per_cpu(cpu_tss_rw, cpu);
 
 	/*
 	 * We cache MSR_IA32_SYSENTER_CS's value in the TSS's ss1 field --
@@ -1575,7 +1584,7 @@ void cpu_init(void)
 	if (cpu)
 		load_ucode_ap();
 
-	t = &per_cpu(cpu_tss, cpu);
+	t = &per_cpu(cpu_tss_rw, cpu);
 	oist = &per_cpu(orig_ist, cpu);
 
 #ifdef CONFIG_NUMA
@@ -1667,7 +1676,7 @@ void cpu_init(void)
 {
 	int cpu = smp_processor_id();
 	struct task_struct *curr = current;
-	struct tss_struct *t = &per_cpu(cpu_tss, cpu);
+	struct tss_struct *t = &per_cpu(cpu_tss_rw, cpu);
 
 	wait_for_master_cpu(cpu);
 
--- a/arch/x86/kernel/ioport.c
+++ b/arch/x86/kernel/ioport.c
@@ -67,7 +67,7 @@ asmlinkage long sys_ioperm(unsigned long
 	 * because the ->io_bitmap_max value must match the bitmap
 	 * contents:
 	 */
-	tss = &per_cpu(cpu_tss, get_cpu());
+	tss = &per_cpu(cpu_tss_rw, get_cpu());
 
 	if (turn_on)
 		bitmap_clear(t->io_bitmap_ptr, from, num);
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -47,7 +47,7 @@
  * section. Since TSS's are completely CPU-local, we want them
  * on exact cacheline boundaries, to eliminate cacheline ping-pong.
  */
-__visible DEFINE_PER_CPU_SHARED_ALIGNED(struct tss_struct, cpu_tss) = {
+__visible DEFINE_PER_CPU_SHARED_ALIGNED(struct tss_struct, cpu_tss_rw) = {
 	.x86_tss = {
 		/*
 		 * .sp0 is only used when entering ring 0 from a lower
@@ -82,7 +82,7 @@ __visible DEFINE_PER_CPU_SHARED_ALIGNED(
 	.io_bitmap		= { [0 ... IO_BITMAP_LONGS] = ~0 },
 #endif
 };
-EXPORT_PER_CPU_SYMBOL(cpu_tss);
+EXPORT_PER_CPU_SYMBOL(cpu_tss_rw);
 
 DEFINE_PER_CPU(bool, __tss_limit_invalid);
 EXPORT_PER_CPU_SYMBOL_GPL(__tss_limit_invalid);
@@ -111,7 +111,7 @@ void exit_thread(struct task_struct *tsk
 	struct fpu *fpu = &t->fpu;
 
 	if (bp) {
-		struct tss_struct *tss = &per_cpu(cpu_tss, get_cpu());
+		struct tss_struct *tss = &per_cpu(cpu_tss_rw, get_cpu());
 
 		t->io_bitmap_ptr = NULL;
 		clear_thread_flag(TIF_IO_BITMAP);
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -234,7 +234,7 @@ __switch_to(struct task_struct *prev_p,
 	struct fpu *prev_fpu = &prev->fpu;
 	struct fpu *next_fpu = &next->fpu;
 	int cpu = smp_processor_id();
-	struct tss_struct *tss = &per_cpu(cpu_tss, cpu);
+	struct tss_struct *tss = &per_cpu(cpu_tss_rw, cpu);
 
 	/* never put a printk in __switch_to... printk() calls wake_up*() indirectly */
 
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -399,7 +399,7 @@ __switch_to(struct task_struct *prev_p,
 	struct fpu *prev_fpu = &prev->fpu;
 	struct fpu *next_fpu = &next->fpu;
 	int cpu = smp_processor_id();
-	struct tss_struct *tss = &per_cpu(cpu_tss, cpu);
+	struct tss_struct *tss = &per_cpu(cpu_tss_rw, cpu);
 
 	WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_ENTRY) &&
 		     this_cpu_read(irq_count) != -1);
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -364,7 +364,7 @@ dotraplinkage void do_double_fault(struc
 		regs->cs == __KERNEL_CS &&
 		regs->ip == (unsigned long)native_irq_return_iret)
 	{
-		struct pt_regs *gpregs = (struct pt_regs *)this_cpu_read(cpu_tss.x86_tss.sp0) - 1;
+		struct pt_regs *gpregs = (struct pt_regs *)this_cpu_read(cpu_tss_rw.x86_tss.sp0) - 1;
 
 		/*
 		 * regs->sp points to the failing IRET frame on the
@@ -649,7 +649,7 @@ struct bad_iret_stack *fixup_bad_iret(st
 	 * exception came from the IRET target.
 	 */
 	struct bad_iret_stack *new_stack =
-		(struct bad_iret_stack *)this_cpu_read(cpu_tss.x86_tss.sp0) - 1;
+		(struct bad_iret_stack *)this_cpu_read(cpu_tss_rw.x86_tss.sp0) - 1;
 
 	/* Copy the IRET target to the new stack. */
 	memmove(&new_stack->regs.ip, (void *)s->regs.sp, 5*8);
--- a/arch/x86/lib/delay.c
+++ b/arch/x86/lib/delay.c
@@ -107,10 +107,10 @@ static void delay_mwaitx(unsigned long _
 		delay = min_t(u64, MWAITX_MAX_LOOPS, loops);
 
 		/*
-		 * Use cpu_tss as a cacheline-aligned, seldomly
+		 * Use cpu_tss_rw as a cacheline-aligned, seldomly
 		 * accessed per-cpu variable as the monitor target.
 		 */
-		__monitorx(raw_cpu_ptr(&cpu_tss), 0, 0);
+		__monitorx(raw_cpu_ptr(&cpu_tss_rw), 0, 0);
 
 		/*
 		 * AMD, like Intel, supports the EAX hint and EAX=0xf
--- a/arch/x86/xen/enlighten_pv.c
+++ b/arch/x86/xen/enlighten_pv.c
@@ -818,7 +818,7 @@ static void xen_load_sp0(unsigned long s
 	mcs = xen_mc_entry(0);
 	MULTI_stack_switch(mcs.mc, __KERNEL_DS, sp0);
 	xen_mc_issue(PARAVIRT_LAZY_CPU);
-	this_cpu_write(cpu_tss.x86_tss.sp0, sp0);
+	this_cpu_write(cpu_tss_rw.x86_tss.sp0, sp0);
 }
 
 void xen_set_iopl_mask(unsigned mask)

^ permalink raw reply	[flat|nested] 349+ messages in thread
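
Note on 096/159 (not part of the patch): a minimal user-space sketch of
the layout change, assuming a 4096-byte page and GCC/Clang attribute
syntax.  All demo_* names are hypothetical, and demo_tss is a drastic
simplification of struct tss_struct.  The point is that the SYSENTER
stack now lives in its own page-aligned struct between the GDT page and
the TSS, so the stack page can stay writable (PAGE_KERNEL in the patch)
while the TSS pages get a separate protection (PAGE_KERNEL_RO on
x86_64).

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_PAGE_SIZE	4096

/* Same idea as the kernel's offsetofend(): offset just past a member. */
#define DEMO_OFFSETOFEND(type, member) \
	(offsetof(type, member) + sizeof(((type *)0)->member))

struct demo_stack {
	unsigned long words[64];
};

/* Page alignment pads the struct to a full page, as SYSENTER_stack_page does. */
struct demo_stack_page {
	struct demo_stack stack;
} __attribute__((aligned(DEMO_PAGE_SIZE)));

/* Simplified stand-in; the real tss_struct holds x86_tss and the I/O bitmap. */
struct demo_tss {
	unsigned long sp0;
} __attribute__((aligned(DEMO_PAGE_SIZE)));

/* Mirrors the member order in struct cpu_entry_area after this patch. */
struct demo_entry_area {
	char			gdt[DEMO_PAGE_SIZE];
	struct demo_stack_page	stack_page;
	struct demo_tss		tss;
};

static_assert(sizeof(struct demo_stack_page) == DEMO_PAGE_SIZE,
	      "stack page must occupy exactly one page");

/* Lowest stack address: directly above the GDT page, which acts as the guard. */
size_t demo_stack_bottom(void)
{
	return offsetof(struct demo_entry_area, stack_page);
}

/* Initial stack pointer: end of the stack words, not the end of the page. */
size_t demo_stack_top(void)
{
	return DEMO_OFFSETOFEND(struct demo_entry_area, stack_page.stack);
}

/* The TSS starts on the next page boundary and can be mapped separately. */
size_t demo_tss_start(void)
{
	return offsetof(struct demo_entry_area, tss);
}
```

Because the stack no longer sits inside tss_struct, offsets such as
TSS_sysenter_sp0 have to be computed relative to struct cpu_entry_area,
which is what the asm-offsets_32.c hunk does.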

* [PATCH 4.14 097/159] x86/paravirt: Dont patch flush_tlb_single
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
@ 2017-12-22  8:46   ` Greg Kroah-Hartman
  2017-12-22  8:44 ` [PATCH 4.14 002/159] objtool: Dont report end of section error after an empty unwind hint Greg Kroah-Hartman
                     ` (164 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Thomas Gleixner, Josh Poimboeuf,
	Juergen Gross, Peter Zijlstra, Andy Lutomirski, Boris Ostrovsky,
	Borislav Petkov, Borislav Petkov, Brian Gerst, Dave Hansen,
	Dave Hansen, David Laight, Denys Vlasenko, Eduardo Valentin,
	H. Peter Anvin, Linus Torvalds, Rik van Riel, Will Deacon,
	aliguori, daniel.gruss, hughd, keescook, linux-mm,
	michael.schwarz, moritz.lipp, richard.fellner, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Thomas Gleixner <tglx@linutronix.de>

commit a035795499ca1c2bd1928808d1a156eda1420383 upstream.

native_flush_tlb_single() will be changed by the upcoming
PAGE_TABLE_ISOLATION feature, which requires it to contain more code
than a bare INVLPG.

Remove the paravirt patching for it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Laight <David.Laight@aculab.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Eduardo Valentin <eduval@amazon.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: aliguori@amazon.com
Cc: daniel.gruss@iaik.tugraz.at
Cc: hughd@google.com
Cc: keescook@google.com
Cc: linux-mm@kvack.org
Cc: michael.schwarz@iaik.tugraz.at
Cc: moritz.lipp@iaik.tugraz.at
Cc: richard.fellner@student.tugraz.at
Link: https://lkml.kernel.org/r/20171204150606.828111617@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/kernel/paravirt_patch_64.c |    2 --
 1 file changed, 2 deletions(-)

--- a/arch/x86/kernel/paravirt_patch_64.c
+++ b/arch/x86/kernel/paravirt_patch_64.c
@@ -10,7 +10,6 @@ DEF_NATIVE(pv_irq_ops, save_fl, "pushfq;
 DEF_NATIVE(pv_mmu_ops, read_cr2, "movq %cr2, %rax");
 DEF_NATIVE(pv_mmu_ops, read_cr3, "movq %cr3, %rax");
 DEF_NATIVE(pv_mmu_ops, write_cr3, "movq %rdi, %cr3");
-DEF_NATIVE(pv_mmu_ops, flush_tlb_single, "invlpg (%rdi)");
 DEF_NATIVE(pv_cpu_ops, wbinvd, "wbinvd");
 
 DEF_NATIVE(pv_cpu_ops, usergs_sysret64, "swapgs; sysretq");
@@ -60,7 +59,6 @@ unsigned native_patch(u8 type, u16 clobb
 		PATCH_SITE(pv_mmu_ops, read_cr2);
 		PATCH_SITE(pv_mmu_ops, read_cr3);
 		PATCH_SITE(pv_mmu_ops, write_cr3);
-		PATCH_SITE(pv_mmu_ops, flush_tlb_single);
 		PATCH_SITE(pv_cpu_ops, wbinvd);
 #if defined(CONFIG_PARAVIRT_SPINLOCKS)
 		case PARAVIRT_PATCH(pv_lock_ops.queued_spin_unlock):

^ permalink raw reply	[flat|nested] 349+ messages in thread

* [PATCH 4.14 097/159] x86/paravirt: Dont patch flush_tlb_single
@ 2017-12-22  8:46   ` Greg Kroah-Hartman
  0 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Thomas Gleixner, Josh Poimboeuf,
	Juergen Gross, Peter Zijlstra, Andy Lutomirski, Boris Ostrovsky,
	Borislav Petkov, Borislav Petkov, Brian Gerst, Dave Hansen,
	Dave Hansen, David Laight, Denys Vlasenko, Eduardo Valentin,
	H. Peter Anvin, Linus Torvalds, Rik van Riel, Will Deacon,
	aliguori, daniel.gruss, hughd, keescook, linux-mm,
	michael.schwarz, moritz.lipp, richard.fellner, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Thomas Gleixner <tglx@linutronix.de>

commit a035795499ca1c2bd1928808d1a156eda1420383 upstream.

native_flush_tlb_single() will be changed by the upcoming
PAGE_TABLE_ISOLATION feature, which requires it to contain more code
than a bare INVLPG.

Remove the paravirt patching for it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Laight <David.Laight@aculab.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Eduardo Valentin <eduval@amazon.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: aliguori@amazon.com
Cc: daniel.gruss@iaik.tugraz.at
Cc: hughd@google.com
Cc: keescook@google.com
Cc: linux-mm@kvack.org
Cc: michael.schwarz@iaik.tugraz.at
Cc: moritz.lipp@iaik.tugraz.at
Cc: richard.fellner@student.tugraz.at
Link: https://lkml.kernel.org/r/20171204150606.828111617@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/kernel/paravirt_patch_64.c |    2 --
 1 file changed, 2 deletions(-)

--- a/arch/x86/kernel/paravirt_patch_64.c
+++ b/arch/x86/kernel/paravirt_patch_64.c
@@ -10,7 +10,6 @@ DEF_NATIVE(pv_irq_ops, save_fl, "pushfq;
 DEF_NATIVE(pv_mmu_ops, read_cr2, "movq %cr2, %rax");
 DEF_NATIVE(pv_mmu_ops, read_cr3, "movq %cr3, %rax");
 DEF_NATIVE(pv_mmu_ops, write_cr3, "movq %rdi, %cr3");
-DEF_NATIVE(pv_mmu_ops, flush_tlb_single, "invlpg (%rdi)");
 DEF_NATIVE(pv_cpu_ops, wbinvd, "wbinvd");
 
 DEF_NATIVE(pv_cpu_ops, usergs_sysret64, "swapgs; sysretq");
@@ -60,7 +59,6 @@ unsigned native_patch(u8 type, u16 clobb
 		PATCH_SITE(pv_mmu_ops, read_cr2);
 		PATCH_SITE(pv_mmu_ops, read_cr3);
 		PATCH_SITE(pv_mmu_ops, write_cr3);
-		PATCH_SITE(pv_mmu_ops, flush_tlb_single);
 		PATCH_SITE(pv_cpu_ops, wbinvd);
 #if defined(CONFIG_PARAVIRT_SPINLOCKS)
 		case PARAVIRT_PATCH(pv_lock_ops.queued_spin_unlock):


^ permalink raw reply	[flat|nested] 349+ messages in thread

* [PATCH 4.14 098/159] x86/paravirt: Provide a way to check for hypervisors
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (99 preceding siblings ...)
  2017-12-22  8:46   ` Greg Kroah-Hartman
@ 2017-12-22  8:46 ` Greg Kroah-Hartman
  2017-12-22  8:46 ` [PATCH 4.14 099/159] x86/cpufeatures: Make CPU bugs sticky Greg Kroah-Hartman
                   ` (64 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Thomas Gleixner, Juergen Gross,
	Andy Lutomirski, Boris Ostrovsky, Borislav Petkov,
	Borislav Petkov, Brian Gerst, Dave Hansen, Dave Hansen,
	David Laight, Denys Vlasenko, Eduardo Valentin, H. Peter Anvin,
	Josh Poimboeuf, Linus Torvalds, Peter Zijlstra, Rik van Riel,
	Will Deacon, aliguori, daniel.gruss, hughd, keescook,
	Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Thomas Gleixner <tglx@linutronix.de>

commit 79cc74155218316b9a5d28577c7077b2adba8e58 upstream.

There is no generic way to test whether a kernel is running on a specific
hypervisor. But that's required to prevent the upcoming user address space
separation feature in certain guest modes.

Make the hypervisor type enum unconditionally available and provide a
helper function which allows testing for a specific type.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Laight <David.Laight@aculab.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Eduardo Valentin <eduval@amazon.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: aliguori@amazon.com
Cc: daniel.gruss@iaik.tugraz.at
Cc: hughd@google.com
Cc: keescook@google.com
Link: https://lkml.kernel.org/r/20171204150606.912938129@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/include/asm/hypervisor.h |   25 +++++++++++++++----------
 1 file changed, 15 insertions(+), 10 deletions(-)

--- a/arch/x86/include/asm/hypervisor.h
+++ b/arch/x86/include/asm/hypervisor.h
@@ -20,16 +20,7 @@
 #ifndef _ASM_X86_HYPERVISOR_H
 #define _ASM_X86_HYPERVISOR_H
 
-#ifdef CONFIG_HYPERVISOR_GUEST
-
-#include <asm/kvm_para.h>
-#include <asm/x86_init.h>
-#include <asm/xen/hypervisor.h>
-
-/*
- * x86 hypervisor information
- */
-
+/* x86 hypervisor types  */
 enum x86_hypervisor_type {
 	X86_HYPER_NATIVE = 0,
 	X86_HYPER_VMWARE,
@@ -39,6 +30,12 @@ enum x86_hypervisor_type {
 	X86_HYPER_KVM,
 };
 
+#ifdef CONFIG_HYPERVISOR_GUEST
+
+#include <asm/kvm_para.h>
+#include <asm/x86_init.h>
+#include <asm/xen/hypervisor.h>
+
 struct hypervisor_x86 {
 	/* Hypervisor name */
 	const char	*name;
@@ -58,7 +55,15 @@ struct hypervisor_x86 {
 
 extern enum x86_hypervisor_type x86_hyper_type;
 extern void init_hypervisor_platform(void);
+static inline bool hypervisor_is_type(enum x86_hypervisor_type type)
+{
+	return x86_hyper_type == type;
+}
 #else
 static inline void init_hypervisor_platform(void) { }
+static inline bool hypervisor_is_type(enum x86_hypervisor_type type)
+{
+	return type == X86_HYPER_NATIVE;
+}
 #endif /* CONFIG_HYPERVISOR_GUEST */
 #endif /* _ASM_X86_HYPERVISOR_H */
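
The helper pattern added by this patch can be sketched in user space as
follows. This is only an illustration, not the kernel header: the enum is
abbreviated, x86_hyper_type is a plain global standing in for the kernel
variable, and hypervisor_is_type_noguest() is a hypothetical name for the
variant that the real header selects with #ifdef CONFIG_HYPERVISOR_GUEST.

```c
#include <assert.h>
#include <stdbool.h>

/* Abbreviated copy of the enum; the real header lists more types. */
enum x86_hypervisor_type {
	X86_HYPER_NATIVE = 0,
	X86_HYPER_VMWARE,
	X86_HYPER_KVM,
};

/* Stand-in for the kernel global (assumption: set by detection code). */
static enum x86_hypervisor_type x86_hyper_type = X86_HYPER_NATIVE;

/* Guest-capable build: compare against the detected type. */
static inline bool hypervisor_is_type(enum x86_hypervisor_type type)
{
	return x86_hyper_type == type;
}

/* Build without guest support: only "native" can ever match. */
static inline bool hypervisor_is_type_noguest(enum x86_hypervisor_type type)
{
	return type == X86_HYPER_NATIVE;
}
```

Because the enum is defined outside the #ifdef, callers can name a type
such as X86_HYPER_KVM in either configuration.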

^ permalink raw reply	[flat|nested] 349+ messages in thread

* [PATCH 4.14 099/159] x86/cpufeatures: Make CPU bugs sticky
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (100 preceding siblings ...)
  2017-12-22  8:46 ` [PATCH 4.14 098/159] x86/paravirt: Provide a way to check for hypervisors Greg Kroah-Hartman
@ 2017-12-22  8:46 ` Greg Kroah-Hartman
  2017-12-22  8:46 ` [PATCH 4.14 100/159] optee: fix invalid of_node_put() in optee_driver_init() Greg Kroah-Hartman
                   ` (63 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Thomas Gleixner, Borislav Petkov,
	Andy Lutomirski, Boris Ostrovsky, Borislav Petkov,
	Borislav Petkov, Brian Gerst, Dave Hansen, Dave Hansen,
	David Laight, Denys Vlasenko, Eduardo Valentin, H. Peter Anvin,
	Josh Poimboeuf, Juergen Gross, Linus Torvalds, Peter Zijlstra,
	Rik van Riel, Will Deacon, aliguori, daniel.gruss, hughd,
	keescook, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Thomas Gleixner <tglx@linutronix.de>

commit 6cbd2171e89b13377261d15e64384df60ecb530e upstream.

There is currently no way to force CPU bug bits like CPU feature bits. That
makes it impossible to set a bug bit once at boot and have it stick for all
upcoming CPUs.

Extend the force set/clear arrays to handle bug bits as well.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Laight <David.Laight@aculab.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Eduardo Valentin <eduval@amazon.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: aliguori@amazon.com
Cc: daniel.gruss@iaik.tugraz.at
Cc: hughd@google.com
Cc: keescook@google.com
Link: https://lkml.kernel.org/r/20171204150606.992156574@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/include/asm/cpufeature.h |    2 ++
 arch/x86/include/asm/processor.h  |    4 ++--
 arch/x86/kernel/cpu/common.c      |    6 +++---
 3 files changed, 7 insertions(+), 5 deletions(-)

--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -135,6 +135,8 @@ extern void clear_cpu_cap(struct cpuinfo
 	set_bit(bit, (unsigned long *)cpu_caps_set);	\
 } while (0)
 
+#define setup_force_cpu_bug(bit) setup_force_cpu_cap(bit)
+
 #if defined(CC_HAVE_ASM_GOTO) && defined(CONFIG_X86_FAST_FEATURE_TESTS)
 /*
  * Static testing of CPU features.  Used the same as boot_cpu_has().
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -163,8 +163,8 @@ extern struct cpuinfo_x86	boot_cpu_data;
 extern struct cpuinfo_x86	new_cpu_data;
 
 extern struct x86_hw_tss	doublefault_tss;
-extern __u32			cpu_caps_cleared[NCAPINTS];
-extern __u32			cpu_caps_set[NCAPINTS];
+extern __u32			cpu_caps_cleared[NCAPINTS + NBUGINTS];
+extern __u32			cpu_caps_set[NCAPINTS + NBUGINTS];
 
 #ifdef CONFIG_SMP
 DECLARE_PER_CPU_READ_MOSTLY(struct cpuinfo_x86, cpu_info);
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -452,8 +452,8 @@ static const char *table_lookup_model(st
 	return NULL;		/* Not found */
 }
 
-__u32 cpu_caps_cleared[NCAPINTS];
-__u32 cpu_caps_set[NCAPINTS];
+__u32 cpu_caps_cleared[NCAPINTS + NBUGINTS];
+__u32 cpu_caps_set[NCAPINTS + NBUGINTS];
 
 void load_percpu_segment(int cpu)
 {
@@ -812,7 +812,7 @@ static void apply_forced_caps(struct cpu
 {
 	int i;
 
-	for (i = 0; i < NCAPINTS; i++) {
+	for (i = 0; i < NCAPINTS + NBUGINTS; i++) {
 		c->x86_capability[i] &= ~cpu_caps_cleared[i];
 		c->x86_capability[i] |= cpu_caps_set[i];
 	}
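
A minimal user-space sketch of why the arrays and the loop bound have to
grow together. The sizes are illustrative rather than the kernel's values,
and force_set() and BUG_BIT() are hypothetical helpers that only mimic the
idea of placing bug bits in the words after the feature words.

```c
#include <assert.h>
#include <stdint.h>

#define NCAPINTS 2	/* illustrative size, not the kernel's value */
#define NBUGINTS 1	/* illustrative size, not the kernel's value */
#define NWORDS   (NCAPINTS + NBUGINTS)

/* Bug bits live in the words that follow the feature words. */
#define BUG_BIT(n) (NCAPINTS * 32 + (n))

static uint32_t cpu_caps_cleared[NWORDS];
static uint32_t cpu_caps_set[NWORDS];

/* Record a bit that every CPU must report as set (hypothetical helper). */
static void force_set(unsigned int bit)
{
	cpu_caps_set[bit / 32] |= UINT32_C(1) << (bit % 32);
}

/* Apply the forced bits to one CPU's capability words. */
static void apply_forced_caps(uint32_t caps[NWORDS])
{
	int i;

	/* Stopping at NCAPINTS would leave the bug words untouched. */
	for (i = 0; i < NWORDS; i++) {
		caps[i] &= ~cpu_caps_cleared[i];
		caps[i] |= cpu_caps_set[i];
	}
}
```

With arrays of only NCAPINTS words, a forced bug bit would index past the
end of the array, which is why the declarations change as well as the loop.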

^ permalink raw reply	[flat|nested] 349+ messages in thread

* [PATCH 4.14 100/159] optee: fix invalid of_node_put() in optee_driver_init()
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (101 preceding siblings ...)
  2017-12-22  8:46 ` [PATCH 4.14 099/159] x86/cpufeatures: Make CPU bugs sticky Greg Kroah-Hartman
@ 2017-12-22  8:46 ` Greg Kroah-Hartman
  2017-12-22  8:46 ` [PATCH 4.14 101/159] backlight: pwm_bl: Fix overflow condition Greg Kroah-Hartman
                   ` (62 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:46 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Alex Shi, Jens Wiklander, andi

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Jens Wiklander <jens.wiklander@linaro.org>

commit f044113113dd95ba73916bde10e804d3cdfa2662 upstream.

The first node supplied to of_find_matching_node() has its reference
counter decreased as part of the call to that function. In
optee_driver_init(), after calling of_find_matching_node(), it is invalid
to call of_node_put() on the supplied node again.

So remove the invalid call to of_node_put().

Reported-by: Alex Shi <alex.shi@linaro.org>
Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
Cc: <andi@linux-stable.l.notmuch.email>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/tee/optee/core.c |    1 -
 1 file changed, 1 deletion(-)

--- a/drivers/tee/optee/core.c
+++ b/drivers/tee/optee/core.c
@@ -590,7 +590,6 @@ static int __init optee_driver_init(void
 		return -ENODEV;
 
 	np = of_find_matching_node(fw_np, optee_match);
-	of_node_put(fw_np);
 	if (!np)
 		return -ENODEV;
 

^ permalink raw reply	[flat|nested] 349+ messages in thread

* [PATCH 4.14 101/159] backlight: pwm_bl: Fix overflow condition
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (102 preceding siblings ...)
  2017-12-22  8:46 ` [PATCH 4.14 100/159] optee: fix invalid of_node_put() in optee_driver_init() Greg Kroah-Hartman
@ 2017-12-22  8:46 ` Greg Kroah-Hartman
  2017-12-22  8:46   ` Greg Kroah-Hartman
                   ` (61 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Derek Basehore, Thierry Reding,
	Brian Norris, Lee Jones, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Derek Basehore <dbasehore@chromium.org>


[ Upstream commit 5d0c49acebc9488e37db95f1d4a55644e545ffe7 ]

Fix an overflow condition that can happen in compute_duty_cycle() with
high max brightness and period values, by using a 64-bit variable for
computing the duty cycle.

Signed-off-by: Derek Basehore <dbasehore@chromium.org>
Acked-by: Thierry Reding <thierry.reding@gmail.com>
Reviewed-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/video/backlight/pwm_bl.c |    7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

--- a/drivers/video/backlight/pwm_bl.c
+++ b/drivers/video/backlight/pwm_bl.c
@@ -79,14 +79,17 @@ static void pwm_backlight_power_off(stru
 static int compute_duty_cycle(struct pwm_bl_data *pb, int brightness)
 {
 	unsigned int lth = pb->lth_brightness;
-	int duty_cycle;
+	u64 duty_cycle;
 
 	if (pb->levels)
 		duty_cycle = pb->levels[brightness];
 	else
 		duty_cycle = brightness;
 
-	return (duty_cycle * (pb->period - lth) / pb->scale) + lth;
+	duty_cycle *= pb->period - lth;
+	do_div(duty_cycle, pb->scale);
+
+	return duty_cycle + lth;
 }
 
 static int pwm_backlight_update_status(struct backlight_device *bl)
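
The overflow can be reproduced in user space with a short sketch. The
function names and the sample values are assumptions chosen for
illustration. The sketch uses unsigned 32-bit arithmetic so that the
wraparound is well defined, whereas the original code used a signed int.

```c
#include <assert.h>
#include <stdint.h>

/* Old arithmetic: the product is formed in 32 bits and can wrap. */
static uint32_t duty_32(uint32_t level, uint32_t period, uint32_t lth,
			uint32_t scale)
{
	return (level * (period - lth) / scale) + lth;
}

/* Fixed arithmetic: widen to 64 bits before multiplying. */
static uint32_t duty_64(uint32_t level, uint32_t period, uint32_t lth,
			uint32_t scale)
{
	uint64_t duty = level;

	duty *= period - lth;
	duty /= scale;	/* the kernel uses do_div() for this step */

	return (uint32_t)duty + lth;
}
```

For example, with level = scale = 65535 and period = 1000000 the product
exceeds 2^32, so duty_32() returns a wrapped value while duty_64() returns
the full period.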

^ permalink raw reply	[flat|nested] 349+ messages in thread

* [PATCH 4.14 102/159] drm: Add retries for lspcon mode detection
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
@ 2017-12-22  8:46   ` Greg Kroah-Hartman
  2017-12-22  8:44 ` [PATCH 4.14 002/159] objtool: Dont report end of section error after an empty unwind hint Greg Kroah-Hartman
                     ` (164 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Ville Syrjala, Imre Deak,
	Jani Nikula, Dave Airlie, Shashank Sharma, Jani Nikula,
	Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Shashank Sharma <shashank.sharma@intel.com>


[ Upstream commit f687e25a7a245952349f1f9f9cc238ac5a3be258 ]

From the CI builds, it's been observed that during a driver
reload/insert, the dp dual mode read function sometimes fails to
read from the LSPCON device over the i2c-over-aux channel.

This patch:
- adds a delay and a few retries, giving these devices time to
  settle down and respond.
- changes one error log's level from ERROR to DEBUG, as we want
  to call it an error only after all the retries are exhausted.

V2: Addressed review comments from Jani (for loop for retry)
V3: Addressed review comments from Imre (break on partial read too)
V3: Addressed review comments from Ville/Imre (Add the retries
    exclusively for LSPCON, not for all dp_dual_mode devices)
V4: Added r-b from Imre, sending it to dri-devel (Jani)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102294
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102295
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102359
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103186
Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Acked-by: Dave Airlie <airlied@gmail.com>
Signed-off-by: Shashank Sharma <shashank.sharma@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1507826408-19322-1-git-send-email-shashank.sharma@intel.com
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/gpu/drm/drm_dp_dual_mode_helper.c |   16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

--- a/drivers/gpu/drm/drm_dp_dual_mode_helper.c
+++ b/drivers/gpu/drm/drm_dp_dual_mode_helper.c
@@ -410,6 +410,7 @@ int drm_lspcon_get_mode(struct i2c_adapt
 {
 	u8 data;
 	int ret = 0;
+	int retry;
 
 	if (!mode) {
 		DRM_ERROR("NULL input\n");
@@ -417,10 +418,19 @@ int drm_lspcon_get_mode(struct i2c_adapt
 	}
 
 	/* Read Status: i2c over aux */
-	ret = drm_dp_dual_mode_read(adapter, DP_DUAL_MODE_LSPCON_CURRENT_MODE,
-				    &data, sizeof(data));
+	for (retry = 0; retry < 6; retry++) {
+		if (retry)
+			usleep_range(500, 1000);
+
+		ret = drm_dp_dual_mode_read(adapter,
+					    DP_DUAL_MODE_LSPCON_CURRENT_MODE,
+					    &data, sizeof(data));
+		if (!ret)
+			break;
+	}
+
 	if (ret < 0) {
-		DRM_ERROR("LSPCON read(0x80, 0x41) failed\n");
+		DRM_DEBUG_KMS("LSPCON read(0x80, 0x41) failed\n");
 		return -EFAULT;
 	}
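
The retry loop has the following shape when sketched in user space. The
callback types, read_with_retry() and the flaky_read() test double are
hypothetical names, not DRM API. A real caller would sleep in the delay
hook, as the patch does with usleep_range().

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TRIES 6	/* same bound as the loop in the patch */

/* Hypothetical callback type: returns 0 on success, negative on error. */
typedef int (*read_fn)(void *ctx, unsigned char *data);

/* Hypothetical delay hook; a real caller would sleep here. */
typedef void (*delay_fn)(void);

static int read_with_retry(read_fn read, delay_fn delay, void *ctx,
			   unsigned char *data)
{
	int ret = -1;
	int retry;

	for (retry = 0; retry < MAX_TRIES; retry++) {
		/* No delay before the first attempt. */
		if (retry && delay)
			delay();

		ret = read(ctx, data);
		if (!ret)
			break;
	}

	return ret;	/* 0, or the error from the last attempt */
}

/* Test double: fails until the counter behind ctx reaches zero. */
static int flaky_read(void *ctx, unsigned char *data)
{
	int *failures_left = ctx;

	if (*failures_left > 0) {
		(*failures_left)--;
		return -1;
	}
	*data = 0x41;	/* arbitrary payload */
	return 0;
}
```

Only the last attempt's result is reported, so intermediate failures stay
silent.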
 

^ permalink raw reply	[flat|nested] 349+ messages in thread

* [PATCH 4.14 103/159] clk: sunxi-ng: nm: Check if requested rate is supported by fractional clock
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (104 preceding siblings ...)
  2017-12-22  8:46   ` Greg Kroah-Hartman
@ 2017-12-22  8:46 ` Greg Kroah-Hartman
  2017-12-22  8:46 ` [PATCH 4.14 104/159] clk: sunxi-ng: sun5i: Fix bit offset of audio PLL post-divider Greg Kroah-Hartman
                   ` (59 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Chen-Yu Tsai, Maxime Ripard, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Chen-Yu Tsai <wens@csie.org>


[ Upstream commit 4cdbc40d64d4b8303a97e29a52862e4d99502beb ]

The round_rate callback for N-M-factor style clocks does not check if
the requested clock rate is supported by the fractional clock mode.
While this doesn't affect usage in practice, since the clock rates
are also supported through N-M factors, it does not match the set_rate
code.

Add a check to the round_rate callback so it matches the set_rate
callback.

Fixes: 6174a1e24b0d ("clk: sunxi-ng: Add N-M-factor clock support")
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/clk/sunxi-ng/ccu_nm.c |    3 +++
 1 file changed, 3 insertions(+)

--- a/drivers/clk/sunxi-ng/ccu_nm.c
+++ b/drivers/clk/sunxi-ng/ccu_nm.c
@@ -99,6 +99,9 @@ static long ccu_nm_round_rate(struct clk
 	struct ccu_nm *nm = hw_to_ccu_nm(hw);
 	struct _ccu_nm _nm;
 
+	if (ccu_frac_helper_has_rate(&nm->common, &nm->frac, rate))
+		return rate;
+
 	_nm.min_n = nm->n.min ?: 1;
 	_nm.max_n = nm->n.max ?: 1 << nm->n.width;
 	_nm.min_m = 1;

^ permalink raw reply	[flat|nested] 349+ messages in thread

* [PATCH 4.14 104/159] clk: sunxi-ng: sun5i: Fix bit offset of audio PLL post-divider
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (105 preceding siblings ...)
  2017-12-22  8:46 ` [PATCH 4.14 103/159] clk: sunxi-ng: nm: Check if requested rate is supported by fractional clock Greg Kroah-Hartman
@ 2017-12-22  8:46 ` Greg Kroah-Hartman
  2017-12-22  8:46 ` [PATCH 4.14 105/159] crypto: crypto4xx - increase context and scatter ring buffer elements Greg Kroah-Hartman
                   ` (58 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Chen-Yu Tsai, Maxime Ripard, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Chen-Yu Tsai <wens@csie.org>


[ Upstream commit d51fe3ba9773c8b6fc79f82bbe75d64baf604292 ]

The post-divider for the audio PLL is in bits [29:26], as specified
in the user manual, not [19:16] as currently programmed in the code.
The post-divider has a default register value of 2, i.e. a divider
of 3. This means the clock rate fed to the audio codec would be off.

This was discovered when porting sigma-delta modulation for the PLL
to sun5i, which needs the post-divider to be 1.

Fix the bit offset, so we do actually force the post-divider to a
certain value.

Fixes: 5e73761786d6 ("clk: sunxi-ng: Add sun5i CCU driver")
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/clk/sunxi-ng/ccu-sun5i.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/clk/sunxi-ng/ccu-sun5i.c
+++ b/drivers/clk/sunxi-ng/ccu-sun5i.c
@@ -982,8 +982,8 @@ static void __init sun5i_ccu_init(struct
 
 	/* Force the PLL-Audio-1x divider to 4 */
 	val = readl(reg + SUN5I_PLL_AUDIO_REG);
-	val &= ~GENMASK(19, 16);
-	writel(val | (3 << 16), reg + SUN5I_PLL_AUDIO_REG);
+	val &= ~GENMASK(29, 26);
+	writel(val | (3 << 26), reg + SUN5I_PLL_AUDIO_REG);
 
 	/*
 	 * Use the peripheral PLL as the AHB parent, instead of CPU /
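
A user-space sketch of the read-modify-write on a bit field, to show what
the wrong offset does. GENMASK32() is a local 32-bit stand-in for the
kernel's GENMASK(), and set_field() is a hypothetical helper. The register
value is made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* 32-bit equivalent of the kernel's GENMASK(h, l): bits l..h set. */
#define GENMASK32(h, l) \
	((~UINT32_C(0) << (l)) & (~UINT32_C(0) >> (31 - (h))))

/* Replace the field in bits [hi:lo] of val with field. */
static uint32_t set_field(uint32_t val, unsigned int hi, unsigned int lo,
			  uint32_t field)
{
	val &= ~GENMASK32(hi, lo);
	return val | ((field << lo) & GENMASK32(hi, lo));
}
```

Writing 3 into bits [19:16] leaves the default value 2 in bits [29:26], so
the post-divider is never actually forced. Writing into [29:26] replaces
it.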

^ permalink raw reply	[flat|nested] 349+ messages in thread

* [PATCH 4.14 105/159] crypto: crypto4xx - increase context and scatter ring buffer elements
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (106 preceding siblings ...)
  2017-12-22  8:46 ` [PATCH 4.14 104/159] clk: sunxi-ng: sun5i: Fix bit offset of audio PLL post-divider Greg Kroah-Hartman
@ 2017-12-22  8:46 ` Greg Kroah-Hartman
  2017-12-22  8:46 ` [PATCH 4.14 106/159] crypto: lrw - Fix an error handling path in create() Greg Kroah-Hartman
                   ` (57 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Christian Lamparter, Herbert Xu, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Christian Lamparter <chunkeey@gmail.com>


[ Upstream commit 778f81d6cdb7d25360f082ac0384d5103f04eca5 ]

If crypto4xx is used in conjunction with dm-crypt, the available
ring buffer elements are not enough to handle the load properly.

On an aes-cbc-essiv:sha256 encrypted swap partition the read
performance is abysmal: (tested with hdparm -t)

/dev/mapper/swap_crypt:
 Timing buffered disk reads:  14 MB in  3.68 seconds =   3.81 MB/sec

The patch increases both PPC4XX_NUM_SD and PPC4XX_NUM_PD to 256.
This improves the performance considerably:

/dev/mapper/swap_crypt:
 Timing buffered disk reads: 104 MB in  3.03 seconds =  34.31 MB/sec

Furthermore, PPC4XX_LAST_SD, PPC4XX_LAST_GD and PPC4XX_LAST_PD
can be easily calculated from their respective PPC4XX_NUM_*
constant.

Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/crypto/amcc/crypto4xx_core.h |   10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

--- a/drivers/crypto/amcc/crypto4xx_core.h
+++ b/drivers/crypto/amcc/crypto4xx_core.h
@@ -34,12 +34,12 @@
 #define PPC405EX_CE_RESET                       0x00000008
 
 #define CRYPTO4XX_CRYPTO_PRIORITY		300
-#define PPC4XX_LAST_PD				63
-#define PPC4XX_NUM_PD				64
-#define PPC4XX_LAST_GD				1023
+#define PPC4XX_NUM_PD				256
+#define PPC4XX_LAST_PD				(PPC4XX_NUM_PD - 1)
 #define PPC4XX_NUM_GD				1024
-#define PPC4XX_LAST_SD				63
-#define PPC4XX_NUM_SD				64
+#define PPC4XX_LAST_GD				(PPC4XX_NUM_GD - 1)
+#define PPC4XX_NUM_SD				256
+#define PPC4XX_LAST_SD				(PPC4XX_NUM_SD - 1)
 #define PPC4XX_SD_BUFFER_SIZE			2048
 
 #define PD_ENTRY_INUSE				1

^ permalink raw reply	[flat|nested] 349+ messages in thread

* [PATCH 4.14 106/159] crypto: lrw - Fix an error handling path in create()
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (107 preceding siblings ...)
  2017-12-22  8:46 ` [PATCH 4.14 105/159] crypto: crypto4xx - increase context and scatter ring buffer elements Greg Kroah-Hartman
@ 2017-12-22  8:46 ` Greg Kroah-Hartman
  2017-12-22  8:46 ` [PATCH 4.14 107/159] rtc: pl031: make interrupt optional Greg Kroah-Hartman
                   ` (56 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Christophe JAILLET, Herbert Xu, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Christophe Jaillet <christophe.jaillet@wanadoo.fr>


[ Upstream commit 616129cc6e75fb4da6681c16c981fa82dfe5e4c7 ]

All error handling paths 'goto err_drop_spawn' except this one.
To avoid a resource leak, do the same here.

Fixes: 700cb3f5fe75 ("crypto: lrw - Convert to skcipher")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 crypto/lrw.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

--- a/crypto/lrw.c
+++ b/crypto/lrw.c
@@ -610,8 +610,10 @@ static int create(struct crypto_template
 		ecb_name[len - 1] = 0;
 
 		if (snprintf(inst->alg.base.cra_name, CRYPTO_MAX_ALG_NAME,
-			     "lrw(%s)", ecb_name) >= CRYPTO_MAX_ALG_NAME)
-			return -ENAMETOOLONG;
+			     "lrw(%s)", ecb_name) >= CRYPTO_MAX_ALG_NAME) {
+			err = -ENAMETOOLONG;
+			goto err_drop_spawn;
+		}
 	}
 
 	inst->alg.base.cra_flags = alg->base.cra_flags & CRYPTO_ALG_ASYNC;
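
The pattern behind this fix can be sketched in user space: check the
snprintf() return value for truncation and leave through the cleanup label
instead of returning directly. make_name(), MAX_NAME and the live_allocs
counter are hypothetical and exist only to make a leak observable. Unlike
create(), this sketch also runs the cleanup on success.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_NAME 16	/* illustrative limit, stands in for CRYPTO_MAX_ALG_NAME */

/* Counts live allocations so a leak on an error path is observable. */
static int live_allocs;

static int make_name(const char *inner, char *out)
{
	char *scratch;
	int err;

	scratch = malloc(MAX_NAME);
	if (!scratch)
		return -ENOMEM;
	live_allocs++;

	/* snprintf() returns the untruncated length, so >= means overflow. */
	if (snprintf(scratch, MAX_NAME, "lrw(%s)", inner) >= MAX_NAME) {
		err = -ENAMETOOLONG;
		goto err_free;	/* a bare return here would leak scratch */
	}

	strcpy(out, scratch);	/* fits: scratch holds fewer than MAX_NAME bytes */
	err = 0;

err_free:
	free(scratch);
	live_allocs--;
	return err;
}
```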

^ permalink raw reply	[flat|nested] 349+ messages in thread

* [PATCH 4.14 107/159] rtc: pl031: make interrupt optional
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (108 preceding siblings ...)
  2017-12-22  8:46 ` [PATCH 4.14 106/159] crypto: lrw - Fix an error handling path in create() Greg Kroah-Hartman
@ 2017-12-22  8:46 ` Greg Kroah-Hartman
  2017-12-22  8:46 ` [PATCH 4.14 108/159] kvm, mm: account kvm related kmem slabs to kmemcg Greg Kroah-Hartman
                   ` (55 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Linus Walleij, Russell King,
	Alexandre Belloni, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Russell King <rmk+kernel@armlinux.org.uk>


[ Upstream commit 5b64a2965dfdfca8039e93303c64e2b15c19ff0c ]

On some platforms, the interrupt for the PL031 is optional.  Avoid
trying to claim the interrupt if it's not specified.

Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/rtc/rtc-pl031.c |   14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

--- a/drivers/rtc/rtc-pl031.c
+++ b/drivers/rtc/rtc-pl031.c
@@ -308,7 +308,8 @@ static int pl031_remove(struct amba_devi
 
 	dev_pm_clear_wake_irq(&adev->dev);
 	device_init_wakeup(&adev->dev, false);
-	free_irq(adev->irq[0], ldata);
+	if (adev->irq[0])
+		free_irq(adev->irq[0], ldata);
 	rtc_device_unregister(ldata->rtc);
 	iounmap(ldata->base);
 	kfree(ldata);
@@ -381,12 +382,13 @@ static int pl031_probe(struct amba_devic
 		goto out_no_rtc;
 	}
 
-	if (request_irq(adev->irq[0], pl031_interrupt,
-			vendor->irqflags, "rtc-pl031", ldata)) {
-		ret = -EIO;
-		goto out_no_irq;
+	if (adev->irq[0]) {
+		ret = request_irq(adev->irq[0], pl031_interrupt,
+				  vendor->irqflags, "rtc-pl031", ldata);
+		if (ret)
+			goto out_no_irq;
+		dev_pm_set_wake_irq(&adev->dev, adev->irq[0]);
 	}
-	dev_pm_set_wake_irq(&adev->dev, adev->irq[0]);
 	return 0;
 
 out_no_irq:
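
The guard above can be illustrated outside the kernel. The following is a minimal userspace sketch, not the driver code: struct toy_rtc, toy_request_irq(), toy_probe() and toy_remove() are hypothetical names, and an irq value of 0 is assumed to mean "no interrupt specified", as adev->irq[0] does in the patch.

```c
#include <errno.h>
#include <stdbool.h>

struct toy_rtc {
	int irq;		/* 0 means "no interrupt wired up" (assumption) */
	bool irq_claimed;
};

/* Hypothetical stand-in for request_irq(): rejects negative numbers. */
static int toy_request_irq(struct toy_rtc *rtc)
{
	if (rtc->irq < 0)
		return -EINVAL;
	rtc->irq_claimed = true;
	return 0;
}

/* Claim the interrupt only if one was specified. */
static int toy_probe(struct toy_rtc *rtc)
{
	int ret;

	if (rtc->irq) {
		ret = toy_request_irq(rtc);
		if (ret)
			return ret;	/* pass the real error on, not a fixed -EIO */
	}
	return 0;
}

/* Release the interrupt only if probe could have claimed it. */
static void toy_remove(struct toy_rtc *rtc)
{
	if (rtc->irq)
		rtc->irq_claimed = false;
}
```

The point of the sketch is the symmetry: the same condition guards both the claim in probe and the release in remove.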

* [PATCH 4.14 108/159] kvm, mm: account kvm related kmem slabs to kmemcg
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (109 preceding siblings ...)
  2017-12-22  8:46 ` [PATCH 4.14 107/159] rtc: pl031: make interrupt optional Greg Kroah-Hartman
@ 2017-12-22  8:46 ` Greg Kroah-Hartman
  2017-12-22  9:34   ` Michal Hocko
  2017-12-22  8:46 ` [PATCH 4.14 109/159] net: phy: at803x: Change error to EINVAL for invalid MAC Greg Kroah-Hartman
                   ` (54 subsequent siblings)
  165 siblings, 1 reply; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Shakeel Butt, Paolo Bonzini, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Shakeel Butt <shakeelb@google.com>


[ Upstream commit 46bea48ac241fe0b413805952dda74dd0c09ba8b ]

The kvm slabs can consume a significant amount of system memory.
In our production environment we have observed many machines
spending a significant amount of memory on them, too much to be
written off as system memory overhead. Allocations from these slabs
can also be triggered directly by user space applications that have
access to kvm, so a buggy application can leak such memory. These
caches should therefore be accounted to kmemcg.

Signed-off-by: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/kvm/mmu.c  |    4 ++--
 virt/kvm/kvm_main.c |    2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -5476,13 +5476,13 @@ int kvm_mmu_module_init(void)
 
 	pte_list_desc_cache = kmem_cache_create("pte_list_desc",
 					    sizeof(struct pte_list_desc),
-					    0, 0, NULL);
+					    0, SLAB_ACCOUNT, NULL);
 	if (!pte_list_desc_cache)
 		goto nomem;
 
 	mmu_page_header_cache = kmem_cache_create("kvm_mmu_page_header",
 						  sizeof(struct kvm_mmu_page),
-						  0, 0, NULL);
+						  0, SLAB_ACCOUNT, NULL);
 	if (!mmu_page_header_cache)
 		goto nomem;
 
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -4018,7 +4018,7 @@ int kvm_init(void *opaque, unsigned vcpu
 	if (!vcpu_align)
 		vcpu_align = __alignof__(struct kvm_vcpu);
 	kvm_vcpu_cache = kmem_cache_create("kvm_vcpu", vcpu_size, vcpu_align,
-					   0, NULL);
+					   SLAB_ACCOUNT, NULL);
 	if (!kvm_vcpu_cache) {
 		r = -ENOMEM;
 		goto out_free_3;

* [PATCH 4.14 109/159] net: phy: at803x: Change error to EINVAL for invalid MAC
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (110 preceding siblings ...)
  2017-12-22  8:46 ` [PATCH 4.14 108/159] kvm, mm: account kvm related kmem slabs to kmemcg Greg Kroah-Hartman
@ 2017-12-22  8:46 ` Greg Kroah-Hartman
  2017-12-22  8:46 ` [PATCH 4.14 110/159] PCI: Avoid bus reset if bridge itself is broken Greg Kroah-Hartman
                   ` (53 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Dan Murphy, David S. Miller, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Dan Murphy <dmurphy@ti.com>


[ Upstream commit fc7556877d1748ac00958822a0a3bba1d4bd9e0d ]

Change the return error code to EINVAL if the MAC
address is not valid in the set_wol function.

Signed-off-by: Dan Murphy <dmurphy@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/phy/at803x.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/net/phy/at803x.c
+++ b/drivers/net/phy/at803x.c
@@ -167,7 +167,7 @@ static int at803x_set_wol(struct phy_dev
 		mac = (const u8 *) ndev->dev_addr;
 
 		if (!is_valid_ether_addr(mac))
-			return -EFAULT;
+			return -EINVAL;
 
 		for (i = 0; i < 3; i++) {
 			phy_write(phydev, AT803X_MMD_ACCESS_CONTROL,

* [PATCH 4.14 110/159] PCI: Avoid bus reset if bridge itself is broken
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (111 preceding siblings ...)
  2017-12-22  8:46 ` [PATCH 4.14 109/159] net: phy: at803x: Change error to EINVAL for invalid MAC Greg Kroah-Hartman
@ 2017-12-22  8:46 ` Greg Kroah-Hartman
  2017-12-22  8:46 ` [PATCH 4.14 111/159] scsi: cxgb4i: fix Tx skb leak Greg Kroah-Hartman
                   ` (52 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, David Daney, Jan Glauber,
	Bjorn Helgaas, Alex Williamson, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: David Daney <david.daney@cavium.com>


[ Upstream commit 357027786f3523d26f42391aa4c075b8495e5d28 ]

When checking to see if a PCI bus can safely be reset, we previously
checked to see if any of the children had their PCI_DEV_FLAGS_NO_BUS_RESET
flag set.  Children marked with that flag are known not to behave well
after a bus reset.

Some PCIe root port bridges also do not behave well after a bus reset,
sometimes causing the devices behind the bridge to become unusable.

Add a check for PCI_DEV_FLAGS_NO_BUS_RESET being set in the bridge device
to allow these bridges to be flagged, and prevent their secondary buses
from being reset.

Signed-off-by: David Daney <david.daney@cavium.com>
[jglauber@cavium.com: fixed typo]
Signed-off-by: Jan Glauber <jglauber@cavium.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>

Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/pci/pci.c |    4 ++++
 1 file changed, 4 insertions(+)

--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4356,6 +4356,10 @@ static bool pci_bus_resetable(struct pci
 {
 	struct pci_dev *dev;
 
+
+	if (bus->self && (bus->self->dev_flags & PCI_DEV_FLAGS_NO_BUS_RESET))
+		return false;
+
 	list_for_each_entry(dev, &bus->devices, bus_list) {
 		if (dev->dev_flags & PCI_DEV_FLAGS_NO_BUS_RESET ||
 		    (dev->subordinate && !pci_bus_resetable(dev->subordinate)))
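
As an illustration of the check order, here is a small userspace sketch under stated assumptions: struct toy_bus, struct toy_dev and NO_BUS_RESET are hypothetical stand-ins for struct pci_bus, struct pci_dev and PCI_DEV_FLAGS_NO_BUS_RESET, and the device list is a plain singly linked list rather than a kernel list_head.

```c
#include <stdbool.h>
#include <stddef.h>

#define NO_BUS_RESET 0x1u	/* hypothetical stand-in flag */

struct toy_bus;

struct toy_dev {
	unsigned int flags;
	struct toy_bus *subordinate;	/* secondary bus if this is a bridge */
	struct toy_dev *next;		/* next device on the same bus */
};

struct toy_bus {
	struct toy_dev *self;		/* bridge leading to this bus, or NULL */
	struct toy_dev *devices;	/* devices sitting on this bus */
};

static bool toy_bus_resetable(const struct toy_bus *bus)
{
	const struct toy_dev *dev;

	/* The bridge itself may be flagged as not surviving a reset. */
	if (bus->self && (bus->self->flags & NO_BUS_RESET))
		return false;

	/* Any flagged child, or any non-resetable bus below, vetoes too. */
	for (dev = bus->devices; dev; dev = dev->next) {
		if ((dev->flags & NO_BUS_RESET) ||
		    (dev->subordinate && !toy_bus_resetable(dev->subordinate)))
			return false;
	}
	return true;
}
```

With this shape, flagging a bridge is enough to protect its secondary bus even when no device behind it carries the flag.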

* [PATCH 4.14 111/159] scsi: cxgb4i: fix Tx skb leak
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (112 preceding siblings ...)
  2017-12-22  8:46 ` [PATCH 4.14 110/159] PCI: Avoid bus reset if bridge itself is broken Greg Kroah-Hartman
@ 2017-12-22  8:46 ` Greg Kroah-Hartman
  2017-12-22  8:46 ` [PATCH 4.14 112/159] scsi: mpt3sas: Fix IO error occurs on pulling out a drive from RAID1 volume created on two SATA drive Greg Kroah-Hartman
                   ` (51 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Varun Prakash, Martin K. Petersen,
	Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Varun Prakash <varun@chelsio.com>


[ Upstream commit 9b3a081fb62158b50bcc90522ca2423017544367 ]

On a connection reset, the Tx skb queue can still hold skbs that were
never transmitted. Purge the Tx skb queue in release_offload_resources()
to avoid leaking them.

Signed-off-by: Varun Prakash <varun@chelsio.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/scsi/cxgbi/cxgb4i/cxgb4i.c |    1 +
 1 file changed, 1 insertion(+)

--- a/drivers/scsi/cxgbi/cxgb4i/cxgb4i.c
+++ b/drivers/scsi/cxgbi/cxgb4i/cxgb4i.c
@@ -1575,6 +1575,7 @@ static void release_offload_resources(st
 		csk, csk->state, csk->flags, csk->tid);
 
 	cxgbi_sock_free_cpl_skbs(csk);
+	cxgbi_sock_purge_write_queue(csk);
 	if (csk->wr_cred != csk->wr_max_cred) {
 		cxgbi_sock_purge_wr_queue(csk);
 		cxgbi_sock_reset_wr_list(csk);

* [PATCH 4.14 112/159] scsi: mpt3sas: Fix IO error occurs on pulling out a drive from RAID1 volume created on two SATA drive
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (113 preceding siblings ...)
  2017-12-22  8:46 ` [PATCH 4.14 111/159] scsi: cxgb4i: fix Tx skb leak Greg Kroah-Hartman
@ 2017-12-22  8:46 ` Greg Kroah-Hartman
  2017-12-22  8:46 ` [PATCH 4.14 113/159] PCI: Create SR-IOV virtfn/physfn links before attaching driver Greg Kroah-Hartman
                   ` (50 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Sreekanth Reddy, Tomas Henzl,
	Martin K. Petersen, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Sreekanth Reddy <sreekanth.reddy@broadcom.com>


[ Upstream commit 2ce9a3645299ba1752873d333d73f67620f4550b ]

Whenever an I/O for a RAID volume fails with IOCStatus
MPI2_IOCSTATUS_SCSI_IOC_TERMINATED and SCSIState equal to
(MPI2_SCSI_STATE_TERMINATED | MPI2_SCSI_STATE_NO_SCSI_STATUS), return
the I/O to the SCSI midlayer with "DID_RESET" set in the host byte
(i.e. retry the I/O indefinitely).

Previously, the driver completed the I/O with "DID_SOFT_ERROR",
which causes the I/O to be retried quickly. However, the firmware needed
more time, and hence the I/Os were failing.

Signed-off-by: Sreekanth Reddy <Sreekanth.Reddy@broadcom.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/scsi/mpt3sas/mpt3sas_scsih.c |    5 +++++
 1 file changed, 5 insertions(+)

--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -4804,6 +4804,11 @@ _scsih_io_done(struct MPT3SAS_ADAPTER *i
 		} else if (log_info == VIRTUAL_IO_FAILED_RETRY) {
 			scmd->result = DID_RESET << 16;
 			break;
+		} else if ((scmd->device->channel == RAID_CHANNEL) &&
+		   (scsi_state == (MPI2_SCSI_STATE_TERMINATED |
+		   MPI2_SCSI_STATE_NO_SCSI_STATUS))) {
+			scmd->result = DID_RESET << 16;
+			break;
 		}
 		scmd->result = DID_SOFT_ERROR << 16;
 		break;

* [PATCH 4.14 113/159] PCI: Create SR-IOV virtfn/physfn links before attaching driver
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (114 preceding siblings ...)
  2017-12-22  8:46 ` [PATCH 4.14 112/159] scsi: mpt3sas: Fix IO error occurs on pulling out a drive from RAID1 volume created on two SATA drive Greg Kroah-Hartman
@ 2017-12-22  8:46 ` Greg Kroah-Hartman
  2017-12-22  8:46 ` [PATCH 4.14 114/159] PM / OPP: Move error message to debug level Greg Kroah-Hartman
                   ` (49 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Stuart Hayes, Bjorn Helgaas, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Stuart Hayes <stuart.w.hayes@gmail.com>


[ Upstream commit 27d6162944b9b34c32cd5841acd21786637ee743 ]

When creating virtual functions, create the "virtfn%u" and "physfn" links
in sysfs *before* attaching the driver instead of after.  When we attach
the driver to the new virtual network interface first, there is a race:
the driver attaches and sends out an "add" udev event, and the network
interface naming software (biosdevname or systemd, for example) tries to
look at these links before they exist.

Signed-off-by: Stuart Hayes <stuart.w.hayes@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/pci/iov.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -162,7 +162,6 @@ int pci_iov_add_virtfn(struct pci_dev *d
 
 	pci_device_add(virtfn, virtfn->bus);
 
-	pci_bus_add_device(virtfn);
 	sprintf(buf, "virtfn%u", id);
 	rc = sysfs_create_link(&dev->dev.kobj, &virtfn->dev.kobj, buf);
 	if (rc)
@@ -173,6 +172,8 @@ int pci_iov_add_virtfn(struct pci_dev *d
 
 	kobject_uevent(&virtfn->dev.kobj, KOBJ_CHANGE);
 
+	pci_bus_add_device(virtfn);
+
 	return 0;
 
 failed2:

* [PATCH 4.14 114/159] PM / OPP: Move error message to debug level
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (115 preceding siblings ...)
  2017-12-22  8:46 ` [PATCH 4.14 113/159] PCI: Create SR-IOV virtfn/physfn links before attaching driver Greg Kroah-Hartman
@ 2017-12-22  8:46 ` Greg Kroah-Hartman
  2017-12-22  8:46 ` [PATCH 4.14 115/159] igb: check memory allocation failure Greg Kroah-Hartman
                   ` (48 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Fabio Estevam, Rafael J. Wysocki,
	Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Fabio Estevam <fabio.estevam@nxp.com>


[ Upstream commit 035ed07208dc501d023873447113f3f178592156 ]

On some i.MX6 platforms that do not have a speed grading
check, the OPP table is not created in platform code,
so the cpufreq driver prints the following error message:

cpu cpu0: dev_pm_opp_get_opp_count: OPP table not found (-19)

However, this is not really an error in this case: the
imx6q-cpufreq driver first calls dev_pm_opp_get_opp_count(),
and if that fails, it means that platform code does not provide
OPPs, so dev_pm_opp_of_add_table() is called.

To avoid this confusing error message, move it to
debug level.

It is up to the caller of dev_pm_opp_get_opp_count() to check its
return value and decide if it will print an error or not.

Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/base/power/opp/core.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/base/power/opp/core.c
+++ b/drivers/base/power/opp/core.c
@@ -296,7 +296,7 @@ int dev_pm_opp_get_opp_count(struct devi
 	opp_table = _find_opp_table(dev);
 	if (IS_ERR(opp_table)) {
 		count = PTR_ERR(opp_table);
-		dev_err(dev, "%s: OPP table not found (%d)\n",
+		dev_dbg(dev, "%s: OPP table not found (%d)\n",
 			__func__, count);
 		return count;
 	}

* [PATCH 4.14 115/159] igb: check memory allocation failure
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (116 preceding siblings ...)
  2017-12-22  8:46 ` [PATCH 4.14 114/159] PM / OPP: Move error message to debug level Greg Kroah-Hartman
@ 2017-12-22  8:46 ` Greg Kroah-Hartman
  2017-12-22  8:46 ` [PATCH 4.14 116/159] i40e: use the safe hash table iterator when deleting mac filters Greg Kroah-Hartman
                   ` (47 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Christophe JAILLET, PJ Waskiewicz,
	Jeff Kirsher, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Christophe JAILLET <christophe.jaillet@wanadoo.fr>


[ Upstream commit 18eb86362a52f0af933cc0fd5e37027317eb2d1c ]

Check memory allocation failures and return -ENOMEM in such cases, as
already done for other memory allocations in this function.

This avoids a NULL pointer dereference.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Acked-by: PJ Waskiewicz <peter.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/ethernet/intel/igb/igb_main.c |    2 ++
 1 file changed, 2 insertions(+)

--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -3162,6 +3162,8 @@ static int igb_sw_init(struct igb_adapte
 	/* Setup and initialize a copy of the hw vlan table array */
 	adapter->shadow_vfta = kcalloc(E1000_VLAN_FILTER_TBL_SIZE, sizeof(u32),
 				       GFP_ATOMIC);
+	if (!adapter->shadow_vfta)
+		return -ENOMEM;
 
 	/* This call may decrease the number of queues */
 	if (igb_init_interrupt_scheme(adapter, true)) {

* [PATCH 4.14 116/159] i40e: use the safe hash table iterator when deleting mac filters
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (117 preceding siblings ...)
  2017-12-22  8:46 ` [PATCH 4.14 115/159] igb: check memory allocation failure Greg Kroah-Hartman
@ 2017-12-22  8:46 ` Greg Kroah-Hartman
  2017-12-22  8:46 ` [PATCH 4.14 117/159] iio: st_sensors: add register mask for status register Greg Kroah-Hartman
                   ` (46 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Lihong Yang, Andrew Bowers,
	Jeff Kirsher, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Lihong Yang <lihong.yang@intel.com>


[ Upstream commit 784548c40d6f43eff2297220ad7800dc04be03c6 ]

This patch replaces hash_for_each with hash_for_each_safe
when calling __i40e_del_filter. hash_for_each_safe is the right
iterator to use when removing entries while walking a hash table.
Otherwise, incorrect values may be read from freed memory.

Detected by CoverityScan, CID 1402048 Read from pointer after free

Signed-off-by: Lihong Yang <lihong.yang@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -2779,6 +2779,7 @@ int i40e_ndo_set_vf_mac(struct net_devic
 	struct i40e_mac_filter *f;
 	struct i40e_vf *vf;
 	int ret = 0;
+	struct hlist_node *h;
 	int bkt;
 
 	/* validate the request */
@@ -2817,7 +2818,7 @@ int i40e_ndo_set_vf_mac(struct net_devic
 	/* Delete all the filters for this VSI - we're going to kill it
 	 * anyway.
 	 */
-	hash_for_each(vsi->mac_filter_hash, bkt, f, hlist)
+	hash_for_each_safe(vsi->mac_filter_hash, bkt, h, f, hlist)
 		__i40e_del_filter(vsi, f);
 
 	spin_unlock_bh(&vsi->mac_filter_hash_lock);
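
The reason the extra cursor matters can be shown with a plain linked list. This is a userspace sketch, not the hashtable API: struct node, push() and delete_all() are hypothetical, and the 'tmp' pointer plays the role that 'h' plays in hash_for_each_safe().

```c
#include <stdlib.h>

struct node {
	int key;
	struct node *next;
};

/* Push a node at the head; returns the old head if allocation fails. */
static struct node *push(struct node *head, int key)
{
	struct node *n = malloc(sizeof(*n));

	if (!n)
		return head;
	n->key = key;
	n->next = head;
	return n;
}

/*
 * Free every node and return how many were freed. The successor is
 * saved in 'tmp' before the current node is freed, so the loop never
 * reads n->next from memory that has already been released.
 */
static size_t delete_all(struct node **head)
{
	struct node *n, *tmp;
	size_t freed = 0;

	for (n = *head; n; n = tmp) {
		tmp = n->next;	/* read the link while n is still valid */
		free(n);
		freed++;
	}
	*head = NULL;
	return freed;
}
```

A loop that advanced with n = n->next after free(n) would be the use-after-free pattern that the Coverity report describes.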

* [PATCH 4.14 117/159] iio: st_sensors: add register mask for status register
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (118 preceding siblings ...)
  2017-12-22  8:46 ` [PATCH 4.14 116/159] i40e: use the safe hash table iterator when deleting mac filters Greg Kroah-Hartman
@ 2017-12-22  8:46 ` Greg Kroah-Hartman
  2017-12-22  8:46 ` [PATCH 4.14 118/159] ixgbe: fix use of uninitialized padding Greg Kroah-Hartman
                   ` (45 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Lorenzo Bianconi, Linus Walleij,
	Jonathan Cameron, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Lorenzo Bianconi <lorenzo.bianconi83@gmail.com>


[ Upstream commit e72a060151e5bb673af24993665e270fc4f674a7 ]

Introduce a register mask for the data-ready status register. Pressure
sensors (e.g. LPS22HB) export just two channels (BIT(0) and BIT(1)) and
BIT(2) is marked reserved, yet st_sensors_new_samples_available() masks
the value read from the status register with 0x7.
Moreover, do not mask the status register with active_scan_mask: the
status value is now properly masked, and if the result is not zero the
interrupt has to be consumed by the driver. This fixes an issue on LPS25H
and LPS331AP, where the channel definition is swapped with respect to the
status register.
Furthermore, this change allows proper support for new devices
(e.g. LIS2DW12) that report just the ZYXDA (data-ready) field in the
status register to figure out if the interrupt has been generated by the
device.

Fixes: 97865fe41322 (iio: st_sensors: verify interrupt event to status)
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@st.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/iio/accel/st_accel_core.c                  |   35 ++++++++++++++++-----
 drivers/iio/common/st_sensors/st_sensors_core.c    |    2 -
 drivers/iio/common/st_sensors/st_sensors_trigger.c |   16 ++-------
 drivers/iio/gyro/st_gyro_core.c                    |   15 +++++++--
 drivers/iio/magnetometer/st_magn_core.c            |   10 ++++--
 drivers/iio/pressure/st_pressure_core.c            |   15 +++++++--
 include/linux/iio/common/st_sensors.h              |    7 +++-
 7 files changed, 70 insertions(+), 30 deletions(-)

--- a/drivers/iio/accel/st_accel_core.c
+++ b/drivers/iio/accel/st_accel_core.c
@@ -164,7 +164,10 @@ static const struct st_sensor_settings s
 			.mask_int2 = 0x00,
 			.addr_ihl = 0x25,
 			.mask_ihl = 0x02,
-			.addr_stat_drdy = ST_SENSORS_DEFAULT_STAT_ADDR,
+			.stat_drdy = {
+				.addr = ST_SENSORS_DEFAULT_STAT_ADDR,
+				.mask = 0x07,
+			},
 		},
 		.sim = {
 			.addr = 0x23,
@@ -236,7 +239,10 @@ static const struct st_sensor_settings s
 			.mask_ihl = 0x80,
 			.addr_od = 0x22,
 			.mask_od = 0x40,
-			.addr_stat_drdy = ST_SENSORS_DEFAULT_STAT_ADDR,
+			.stat_drdy = {
+				.addr = ST_SENSORS_DEFAULT_STAT_ADDR,
+				.mask = 0x07,
+			},
 		},
 		.sim = {
 			.addr = 0x23,
@@ -318,7 +324,10 @@ static const struct st_sensor_settings s
 			.mask_int2 = 0x00,
 			.addr_ihl = 0x23,
 			.mask_ihl = 0x40,
-			.addr_stat_drdy = ST_SENSORS_DEFAULT_STAT_ADDR,
+			.stat_drdy = {
+				.addr = ST_SENSORS_DEFAULT_STAT_ADDR,
+				.mask = 0x07,
+			},
 			.ig1 = {
 				.en_addr = 0x23,
 				.en_mask = 0x08,
@@ -389,7 +398,10 @@ static const struct st_sensor_settings s
 		.drdy_irq = {
 			.addr = 0x21,
 			.mask_int1 = 0x04,
-			.addr_stat_drdy = ST_SENSORS_DEFAULT_STAT_ADDR,
+			.stat_drdy = {
+				.addr = ST_SENSORS_DEFAULT_STAT_ADDR,
+				.mask = 0x07,
+			},
 		},
 		.sim = {
 			.addr = 0x21,
@@ -451,7 +463,10 @@ static const struct st_sensor_settings s
 			.mask_ihl = 0x80,
 			.addr_od = 0x22,
 			.mask_od = 0x40,
-			.addr_stat_drdy = ST_SENSORS_DEFAULT_STAT_ADDR,
+			.stat_drdy = {
+				.addr = ST_SENSORS_DEFAULT_STAT_ADDR,
+				.mask = 0x07,
+			},
 		},
 		.sim = {
 			.addr = 0x21,
@@ -569,7 +584,10 @@ static const struct st_sensor_settings s
 		.drdy_irq = {
 			.addr = 0x21,
 			.mask_int1 = 0x04,
-			.addr_stat_drdy = ST_SENSORS_DEFAULT_STAT_ADDR,
+			.stat_drdy = {
+				.addr = ST_SENSORS_DEFAULT_STAT_ADDR,
+				.mask = 0x07,
+			},
 		},
 		.sim = {
 			.addr = 0x21,
@@ -640,7 +658,10 @@ static const struct st_sensor_settings s
 			.mask_int2 = 0x00,
 			.addr_ihl = 0x25,
 			.mask_ihl = 0x02,
-			.addr_stat_drdy = ST_SENSORS_DEFAULT_STAT_ADDR,
+			.stat_drdy = {
+				.addr = ST_SENSORS_DEFAULT_STAT_ADDR,
+				.mask = 0x07,
+			},
 		},
 		.sim = {
 			.addr = 0x23,
--- a/drivers/iio/common/st_sensors/st_sensors_core.c
+++ b/drivers/iio/common/st_sensors/st_sensors_core.c
@@ -470,7 +470,7 @@ int st_sensors_set_dataready_irq(struct
 		 * different one. Take into account irq status register
 		 * to understand if irq trigger can be properly supported
 		 */
-		if (sdata->sensor_settings->drdy_irq.addr_stat_drdy)
+		if (sdata->sensor_settings->drdy_irq.stat_drdy.addr)
 			sdata->hw_irq_trigger = enable;
 		return 0;
 	}
--- a/drivers/iio/common/st_sensors/st_sensors_trigger.c
+++ b/drivers/iio/common/st_sensors/st_sensors_trigger.c
@@ -31,7 +31,7 @@ static int st_sensors_new_samples_availa
 	int ret;
 
 	/* How would I know if I can't check it? */
-	if (!sdata->sensor_settings->drdy_irq.addr_stat_drdy)
+	if (!sdata->sensor_settings->drdy_irq.stat_drdy.addr)
 		return -EINVAL;
 
 	/* No scan mask, no interrupt */
@@ -39,23 +39,15 @@ static int st_sensors_new_samples_availa
 		return 0;
 
 	ret = sdata->tf->read_byte(&sdata->tb, sdata->dev,
-			sdata->sensor_settings->drdy_irq.addr_stat_drdy,
+			sdata->sensor_settings->drdy_irq.stat_drdy.addr,
 			&status);
 	if (ret < 0) {
 		dev_err(sdata->dev,
 			"error checking samples available\n");
 		return ret;
 	}
-	/*
-	 * the lower bits of .active_scan_mask[0] is directly mapped
-	 * to the channels on the sensor: either bit 0 for
-	 * one-dimensional sensors, or e.g. x,y,z for accelerometers,
-	 * gyroscopes or magnetometers. No sensor use more than 3
-	 * channels, so cut the other status bits here.
-	 */
-	status &= 0x07;
 
-	if (status & (u8)indio_dev->active_scan_mask[0])
+	if (status & sdata->sensor_settings->drdy_irq.stat_drdy.mask)
 		return 1;
 
 	return 0;
@@ -212,7 +204,7 @@ int st_sensors_allocate_trigger(struct i
 	 * it was "our" interrupt.
 	 */
 	if (sdata->int_pin_open_drain &&
-	    sdata->sensor_settings->drdy_irq.addr_stat_drdy)
+	    sdata->sensor_settings->drdy_irq.stat_drdy.addr)
 		irq_trig |= IRQF_SHARED;
 
 	err = request_threaded_irq(sdata->get_irq_data_ready(indio_dev),
--- a/drivers/iio/gyro/st_gyro_core.c
+++ b/drivers/iio/gyro/st_gyro_core.c
@@ -118,7 +118,10 @@ static const struct st_sensor_settings s
 			 * drain settings, but only for INT1 and not
 			 * for the DRDY line on INT2.
 			 */
-			.addr_stat_drdy = ST_SENSORS_DEFAULT_STAT_ADDR,
+			.stat_drdy = {
+				.addr = ST_SENSORS_DEFAULT_STAT_ADDR,
+				.mask = 0x07,
+			},
 		},
 		.multi_read_bit = true,
 		.bootime = 2,
@@ -188,7 +191,10 @@ static const struct st_sensor_settings s
 			 * drain settings, but only for INT1 and not
 			 * for the DRDY line on INT2.
 			 */
-			.addr_stat_drdy = ST_SENSORS_DEFAULT_STAT_ADDR,
+			.stat_drdy = {
+				.addr = ST_SENSORS_DEFAULT_STAT_ADDR,
+				.mask = 0x07,
+			},
 		},
 		.multi_read_bit = true,
 		.bootime = 2,
@@ -253,7 +259,10 @@ static const struct st_sensor_settings s
 			 * drain settings, but only for INT1 and not
 			 * for the DRDY line on INT2.
 			 */
-			.addr_stat_drdy = ST_SENSORS_DEFAULT_STAT_ADDR,
+			.stat_drdy = {
+				.addr = ST_SENSORS_DEFAULT_STAT_ADDR,
+				.mask = 0x07,
+			},
 		},
 		.multi_read_bit = true,
 		.bootime = 2,
--- a/drivers/iio/magnetometer/st_magn_core.c
+++ b/drivers/iio/magnetometer/st_magn_core.c
@@ -317,7 +317,10 @@ static const struct st_sensor_settings s
 		},
 		.drdy_irq = {
 			/* drdy line is routed drdy pin */
-			.addr_stat_drdy = ST_SENSORS_DEFAULT_STAT_ADDR,
+			.stat_drdy = {
+				.addr = ST_SENSORS_DEFAULT_STAT_ADDR,
+				.mask = 0x07,
+			},
 		},
 		.multi_read_bit = true,
 		.bootime = 2,
@@ -361,7 +364,10 @@ static const struct st_sensor_settings s
 		.drdy_irq = {
 			.addr = 0x62,
 			.mask_int1 = 0x01,
-			.addr_stat_drdy = 0x67,
+			.stat_drdy = {
+				.addr = 0x67,
+				.mask = 0x07,
+			},
 		},
 		.multi_read_bit = false,
 		.bootime = 2,
--- a/drivers/iio/pressure/st_pressure_core.c
+++ b/drivers/iio/pressure/st_pressure_core.c
@@ -287,7 +287,10 @@ static const struct st_sensor_settings s
 			.mask_ihl = 0x80,
 			.addr_od = 0x22,
 			.mask_od = 0x40,
-			.addr_stat_drdy = ST_SENSORS_DEFAULT_STAT_ADDR,
+			.stat_drdy = {
+				.addr = ST_SENSORS_DEFAULT_STAT_ADDR,
+				.mask = 0x03,
+			},
 		},
 		.multi_read_bit = true,
 		.bootime = 2,
@@ -395,7 +398,10 @@ static const struct st_sensor_settings s
 			.mask_ihl = 0x80,
 			.addr_od = 0x22,
 			.mask_od = 0x40,
-			.addr_stat_drdy = ST_SENSORS_DEFAULT_STAT_ADDR,
+			.stat_drdy = {
+				.addr = ST_SENSORS_DEFAULT_STAT_ADDR,
+				.mask = 0x03,
+			},
 		},
 		.multi_read_bit = true,
 		.bootime = 2,
@@ -454,7 +460,10 @@ static const struct st_sensor_settings s
 			.mask_ihl = 0x80,
 			.addr_od = 0x12,
 			.mask_od = 0x40,
-			.addr_stat_drdy = ST_SENSORS_DEFAULT_STAT_ADDR,
+			.stat_drdy = {
+				.addr = ST_SENSORS_DEFAULT_STAT_ADDR,
+				.mask = 0x03,
+			},
 		},
 		.multi_read_bit = false,
 		.bootime = 2,
--- a/include/linux/iio/common/st_sensors.h
+++ b/include/linux/iio/common/st_sensors.h
@@ -139,7 +139,7 @@ struct st_sensor_das {
  * @mask_ihl: mask to enable/disable active low on the INT lines.
  * @addr_od: address to enable/disable Open Drain on the INT lines.
  * @mask_od: mask to enable/disable Open Drain on the INT lines.
- * @addr_stat_drdy: address to read status of DRDY (data ready) interrupt
+ * struct stat_drdy - status register of DRDY (data ready) interrupt.
  * struct ig1 - represents the Interrupt Generator 1 of sensors.
  * @en_addr: address of the enable ig1 register.
  * @en_mask: mask to write the on/off value for enable.
@@ -152,7 +152,10 @@ struct st_sensor_data_ready_irq {
 	u8 mask_ihl;
 	u8 addr_od;
 	u8 mask_od;
-	u8 addr_stat_drdy;
+	struct {
+		u8 addr;
+		u8 mask;
+	} stat_drdy;
 	struct {
 		u8 en_addr;
 		u8 en_mask;
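
The effect of the per-sensor mask can be sketched in userspace C. This is only an illustration under stated assumptions: struct toy_stat_drdy and toy_new_samples_available() are hypothetical names, and the status byte is passed in directly instead of being read over the bus.

```c
#include <errno.h>
#include <stdint.h>

struct toy_stat_drdy {
	uint8_t addr;	/* status register address, 0 if there is none */
	uint8_t mask;	/* data-ready bits that are valid for this sensor */
};

/*
 * Returns 1 if the status byte reports new samples, 0 if it does not,
 * and -EINVAL if the sensor has no status register to check.
 */
static int toy_new_samples_available(const struct toy_stat_drdy *s,
				     uint8_t status)
{
	if (!s->addr)
		return -EINVAL;

	/* Only bits covered by the sensor's own mask count. */
	return (status & s->mask) ? 1 : 0;
}
```

With a mask of 0x03, as used for the pressure sensors in this patch, a set reserved bit 2 no longer counts as data ready, whereas the old fixed 0x07 mask let it through.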

* [PATCH 4.14 118/159] ixgbe: fix use of uninitialized padding
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (119 preceding siblings ...)
  2017-12-22  8:46 ` [PATCH 4.14 117/159] iio: st_sensors: add register mask for status register Greg Kroah-Hartman
@ 2017-12-22  8:46 ` Greg Kroah-Hartman
  2017-12-22  8:46 ` [PATCH 4.14 119/159] IB/rxe: check for allocation failure on elem Greg Kroah-Hartman
                   ` (44 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Emil Tantilov, Andrew Bowers,
	Jeff Kirsher, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Emil Tantilov <emil.s.tantilov@intel.com>


[ Upstream commit dcfd6b839c998bc9838e2a47f44f37afbdf3099c ]

This patch is resolving Coverity hits where padding in a structure could
be used uninitialized.

- Initialize fwd_cmd.pad/2 before ixgbe_calculate_checksum()

- Initialize buffer.pad2/3 before ixgbe_hic_unlocked()

Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_common.c |    4 ++--
 drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c   |    2 ++
 2 files changed, 4 insertions(+), 2 deletions(-)
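
Note: a minimal user-space sketch of why the padding has to be zeroed
before the checksum is taken. The struct, the helper names and the
"sum of all bytes is zero" verification are assumptions made for
illustration; they are not the ixgbe definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 4-byte command; not the real ixgbe layout. */
struct demo_cmd {
	uint8_t cmd;
	uint8_t buf_len;
	uint8_t checksum;
	uint8_t pad;
};

/* Two's-complement byte checksum: the bytes plus the result sum to 0. */
static uint8_t demo_checksum(const uint8_t *buf, size_t len)
{
	uint8_t sum = 0;

	for (size_t i = 0; i < len; i++)
		sum += buf[i];
	return (uint8_t)(0 - sum);
}

/* Returns 1 if all bytes, checksum included, sum to 0 modulo 256. */
static int demo_verify(const struct demo_cmd *c)
{
	const uint8_t *p = (const uint8_t *)c;
	uint8_t sum = 0;

	for (size_t i = 0; i < sizeof(*c); i++)
		sum += p[i];
	return sum == 0;
}

/* Old order: checksum taken while pad still holds a stale value. */
static void demo_fill_old(struct demo_cmd *c, uint8_t stale)
{
	c->cmd = 0xDD;
	c->buf_len = 1;
	c->checksum = 0;
	c->pad = stale;		/* stands in for uninitialized stack data */
	c->checksum = demo_checksum((const uint8_t *)c, sizeof(*c));
	c->pad = 0;
}

/* New order: pad is zeroed first, then the checksum is taken. */
static void demo_fill_new(struct demo_cmd *c, uint8_t stale)
{
	c->cmd = 0xDD;
	c->buf_len = 1;
	c->checksum = 0;
	c->pad = stale;
	c->pad = 0;
	c->checksum = demo_checksum((const uint8_t *)c, sizeof(*c));
}
```

With a non-zero stale byte, demo_fill_old() produces a command that
fails demo_verify(), while demo_fill_new() always passes.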

--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_common.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_common.c
@@ -3781,10 +3781,10 @@ s32 ixgbe_set_fw_drv_ver_generic(struct
 	fw_cmd.ver_build = build;
 	fw_cmd.ver_sub = sub;
 	fw_cmd.hdr.checksum = 0;
-	fw_cmd.hdr.checksum = ixgbe_calculate_checksum((u8 *)&fw_cmd,
-				(FW_CEM_HDR_LEN + fw_cmd.hdr.buf_len));
 	fw_cmd.pad = 0;
 	fw_cmd.pad2 = 0;
+	fw_cmd.hdr.checksum = ixgbe_calculate_checksum((u8 *)&fw_cmd,
+				(FW_CEM_HDR_LEN + fw_cmd.hdr.buf_len));
 
 	for (i = 0; i <= FW_CEM_MAX_RETRIES; i++) {
 		ret_val = ixgbe_host_interface_command(hw, &fw_cmd,
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c
@@ -900,6 +900,8 @@ static s32 ixgbe_read_ee_hostif_buffer_X
 		/* convert offset from words to bytes */
 		buffer.address = cpu_to_be32((offset + current_word) * 2);
 		buffer.length = cpu_to_be16(words_to_read * 2);
+		buffer.pad2 = 0;
+		buffer.pad3 = 0;
 
 		status = ixgbe_hic_unlocked(hw, (u32 *)&buffer, sizeof(buffer),
 					    IXGBE_HI_COMMAND_TIMEOUT);

^ permalink raw reply	[flat|nested] 349+ messages in thread

* [PATCH 4.14 119/159] IB/rxe: check for allocation failure on elem
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (120 preceding siblings ...)
  2017-12-22  8:46 ` [PATCH 4.14 118/159] ixgbe: fix use of uninitialized padding Greg Kroah-Hartman
@ 2017-12-22  8:46 ` Greg Kroah-Hartman
  2017-12-22  8:46 ` [PATCH 4.14 120/159] block,bfq: Disable writeback throttling Greg Kroah-Hartman
                   ` (43 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Colin Ian King, Doug Ledford, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Colin Ian King <colin.king@canonical.com>


[ Upstream commit 4831ca9e4a8e48cb27e0a792f73250390827a228 ]

The allocation for elem may fail (especially because we're using
GFP_ATOMIC) so best to check for a null return.  This fixes a potential
null pointer dereference when assigning elem->pool.

Detected by CoverityScan CID#1357507 ("Dereference null return value")

Fixes: 8700e3e7c485 ("Soft RoCE driver")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/infiniband/sw/rxe/rxe_pool.c |    2 ++
 1 file changed, 2 insertions(+)

--- a/drivers/infiniband/sw/rxe/rxe_pool.c
+++ b/drivers/infiniband/sw/rxe/rxe_pool.c
@@ -404,6 +404,8 @@ void *rxe_alloc(struct rxe_pool *pool)
 	elem = kmem_cache_zalloc(pool_cache(pool),
 				 (pool->flags & RXE_POOL_ATOMIC) ?
 				 GFP_ATOMIC : GFP_KERNEL);
+	if (!elem)
+		return NULL;
 
 	elem->pool = pool;
 	kref_init(&elem->ref_cnt);

^ permalink raw reply	[flat|nested] 349+ messages in thread

* [PATCH 4.14 120/159] block,bfq: Disable writeback throttling
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (121 preceding siblings ...)
  2017-12-22  8:46 ` [PATCH 4.14 119/159] IB/rxe: check for allocation failure on elem Greg Kroah-Hartman
@ 2017-12-22  8:46 ` Greg Kroah-Hartman
  2017-12-22  8:46 ` [PATCH 4.14 121/159] md: always set THREAD_WAKEUP and wake up wqueue if thread existed Greg Kroah-Hartman
                   ` (42 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Luca Miccio, Paolo Valente,
	Oleksandr Natalenko, Lee Tibbert, Jens Axboe, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Luca Miccio <lucmiccio@gmail.com>


[ Upstream commit b5dc5d4d1f4ff9032eb6c21a3c571a1317dc9289 ]

Similarly to CFQ, BFQ has its write-throttling heuristics, and it
is better not to combine them with further write-throttling
heuristics of a different nature.
So this commit disables write-back throttling for a device if BFQ
is used as I/O scheduler for that device.

Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Lee Tibbert <lee.tibbert@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 block/bfq-iosched.c |    3 ++-
 block/blk-wbt.c     |    2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)

--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -108,6 +108,7 @@
 #include "blk-mq-tag.h"
 #include "blk-mq-sched.h"
 #include "bfq-iosched.h"
+#include "blk-wbt.h"
 
 #define BFQ_BFQQ_FNS(name)						\
 void bfq_mark_bfqq_##name(struct bfq_queue *bfqq)			\
@@ -4775,7 +4776,7 @@ static int bfq_init_queue(struct request
 	bfq_init_root_group(bfqd->root_group, bfqd);
 	bfq_init_entity(&bfqd->oom_bfqq.entity, bfqd->root_group);
 
-
+	wbt_disable_default(q);
 	return 0;
 
 out_free:
--- a/block/blk-wbt.c
+++ b/block/blk-wbt.c
@@ -654,7 +654,7 @@ void wbt_set_write_cache(struct rq_wb *r
 }
 
 /*
- * Disable wbt, if enabled by default. Only called from CFQ.
+ * Disable wbt, if enabled by default.
  */
 void wbt_disable_default(struct request_queue *q)
 {

^ permalink raw reply	[flat|nested] 349+ messages in thread

* [PATCH 4.14 121/159] md: always set THREAD_WAKEUP and wake up wqueue if thread existed
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (122 preceding siblings ...)
  2017-12-22  8:46 ` [PATCH 4.14 120/159] block,bfq: Disable writeback throttling Greg Kroah-Hartman
@ 2017-12-22  8:46 ` Greg Kroah-Hartman
  2017-12-22  8:46 ` [PATCH 4.14 122/159] ip_gre: check packet length and mtu correctly in erspan tx Greg Kroah-Hartman
                   ` (41 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Guoqing Jiang, Shaohua Li, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Guoqing Jiang <gqjiang@suse.com>


[ Upstream commit d1d90147c9680aaec4a5757932c2103c42c9c23b ]

Since commit 4ad23a976413 ("MD: use per-cpu counter for writes_pending"),
the wait queue is only woken up if THREAD_WAKEUP was not already set.

With the above change, I can see that process_metadata_update could hang
forever on the wait queue, because mddev->thread can stay in 'D' state
and the THREAD_WAKEUP flag is never cleared, since there are lots of
places that wake up mddev->thread. A deadlock then happens as follows:

linux175:~ # ps aux|grep md|grep D
root    20117   0.0 0.0         0   0 ? D   03:45   0:00 [md0_raid1]
root    20125   0.0 0.0         0   0 ? D   03:45   0:00 [md0_cluster_rec]
linux175:~ # cat /proc/20117/stack
[<ffffffffa0635604>] dlm_lock_sync+0x94/0xd0 [md_cluster]
[<ffffffffa0635674>] lock_token+0x34/0xd0 [md_cluster]
[<ffffffffa0635804>] metadata_update_start+0x64/0x110 [md_cluster]
[<ffffffffa04d985b>] md_update_sb.part.58+0x9b/0x860 [md_mod]
[<ffffffffa04da035>] md_update_sb+0x15/0x30 [md_mod]
[<ffffffffa04dc066>] md_check_recovery+0x266/0x490 [md_mod]
[<ffffffffa06450e2>] raid1d+0x42/0x810 [raid1]
[<ffffffffa04d2252>] md_thread+0x122/0x150 [md_mod]
[<ffffffff81091741>] kthread+0x101/0x140
linux175:~ # cat /proc/20125/stack
[<ffffffffa0636679>] recv_daemon+0x3f9/0x5c0 [md_cluster]
[<ffffffffa04d2252>] md_thread+0x122/0x150 [md_mod]
[<ffffffff81091741>] kthread+0x101/0x140

So let's revert that part of the commit to resolve the problem, since
the previous change does not bring much benefit.

Fixes: 4ad23a976413 ("MD: use per-cpu counter for writes_pending")
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/md/md.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
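
Note: a small user-space sketch of the difference between the two
wakeup variants, using C11 atomics. The struct and the wake_calls
counter are hypothetical; the counter only stands in for
wake_up(&thread->wqueue) and nothing here is the md implementation.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct md_thread. */
struct demo_thread {
	atomic_bool wakeup_flag;	/* models THREAD_WAKEUP */
	int wake_calls;			/* counts modelled wake_up() calls */
};

/* Pre-patch behaviour: wake only if the flag was previously clear. */
static void demo_wakeup_conditional(struct demo_thread *t)
{
	if (!atomic_exchange(&t->wakeup_flag, true))
		t->wake_calls++;
}

/* Post-patch behaviour: always set the flag and always wake. */
static void demo_wakeup_always(struct demo_thread *t)
{
	atomic_store(&t->wakeup_flag, true);
	t->wake_calls++;
}
```

If the flag is never cleared between calls, the conditional variant
issues a single wakeup no matter how many callers ask for one, which
is the situation the changelog describes.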

--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -7468,8 +7468,8 @@ void md_wakeup_thread(struct md_thread *
 {
 	if (thread) {
 		pr_debug("md: waking up MD thread %s.\n", thread->tsk->comm);
-		if (!test_and_set_bit(THREAD_WAKEUP, &thread->flags))
-			wake_up(&thread->wqueue);
+		set_bit(THREAD_WAKEUP, &thread->flags);
+		wake_up(&thread->wqueue);
 	}
 }
 EXPORT_SYMBOL(md_wakeup_thread);

^ permalink raw reply	[flat|nested] 349+ messages in thread

* [PATCH 4.14 122/159] ip_gre: check packet length and mtu correctly in erspan tx
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (123 preceding siblings ...)
  2017-12-22  8:46 ` [PATCH 4.14 121/159] md: always set THREAD_WAKEUP and wake up wqueue if thread existed Greg Kroah-Hartman
@ 2017-12-22  8:46 ` Greg Kroah-Hartman
  2017-12-22  8:46 ` [PATCH 4.14 123/159] ipv6: grab rt->rt6i_ref before allocating pcpu rt Greg Kroah-Hartman
                   ` (40 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, William Tu, Xin Long, David Laight,
	David S. Miller, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: William Tu <u9012063@gmail.com>


[ Upstream commit f192970de860d3ab90aa9e2a22853201a57bde78 ]

As in the earlier patch for erspan_xmit(), for an ARPHRD_ETHER device
skb->len is the length of the whole Ethernet packet.  So
dev->hard_header_len has to be taken into account when comparing
skb->len with the MTU and when trimming the packet.

Fixes: 1a66a836da63 ("gre: add collect_md mode to ERSPAN tunnel")
Fixes: 84e54fe0a5ea ("gre: introduce native tunnel support for ERSPAN")
Signed-off-by: William Tu <u9012063@gmail.com>
Cc: Xin Long <lucien.xin@gmail.com>
Cc: David Laight <David.Laight@aculab.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/ipv4/ip_gre.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)
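
Note: a worked example of the length arithmetic, as a user-space
sketch. The struct and function names are made up for illustration;
each function returns the packet length that would remain after the
corresponding check and pskb_trim() call in the diff below.

```c
/* Hypothetical stand-in for the two net_device fields involved. */
struct demo_dev {
	unsigned int mtu;
	unsigned int hard_header_len;
};

/* Old erspan_fb_xmit(): whole-frame length compared against the MTU. */
static unsigned int demo_len_old_fb(unsigned int len,
				    const struct demo_dev *d)
{
	return len > d->mtu ? d->mtu : len;
}

/*
 * Old erspan_xmit(): the check subtracts the header, but the trim
 * target does not include it. Callers must pass len >= hard_header_len,
 * otherwise the unsigned subtraction wraps around.
 */
static unsigned int demo_len_old_xmit(unsigned int len,
				      const struct demo_dev *d)
{
	return (len - d->hard_header_len > d->mtu) ? d->mtu : len;
}

/* Patched code: check and trim both use mtu + hard_header_len. */
static unsigned int demo_len_new(unsigned int len,
				 const struct demo_dev *d)
{
	unsigned int max = d->mtu + d->hard_header_len;

	return len > max ? max : len;
}
```

With mtu = 1500 and hard_header_len = 14, a full-size 1514-byte frame
is cut to 1500 bytes by the old erspan_fb_xmit() check, and an
oversized 1600-byte frame is cut to 1500 instead of 1514 by the old
erspan_xmit() trim. The patched arithmetic leaves the 1514-byte frame
alone and trims the 1600-byte one to 1514.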

--- a/net/ipv4/ip_gre.c
+++ b/net/ipv4/ip_gre.c
@@ -579,8 +579,8 @@ static void erspan_fb_xmit(struct sk_buf
 	if (gre_handle_offloads(skb, false))
 		goto err_free_rt;
 
-	if (skb->len > dev->mtu) {
-		pskb_trim(skb, dev->mtu);
+	if (skb->len > dev->mtu + dev->hard_header_len) {
+		pskb_trim(skb, dev->mtu + dev->hard_header_len);
 		truncate = true;
 	}
 
@@ -731,8 +731,8 @@ static netdev_tx_t erspan_xmit(struct sk
 	if (skb_cow_head(skb, dev->needed_headroom))
 		goto free_skb;
 
-	if (skb->len - dev->hard_header_len > dev->mtu) {
-		pskb_trim(skb, dev->mtu);
+	if (skb->len > dev->mtu + dev->hard_header_len) {
+		pskb_trim(skb, dev->mtu + dev->hard_header_len);
 		truncate = true;
 	}
 

^ permalink raw reply	[flat|nested] 349+ messages in thread

* [PATCH 4.14 123/159] ipv6: grab rt->rt6i_ref before allocating pcpu rt
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (124 preceding siblings ...)
  2017-12-22  8:46 ` [PATCH 4.14 122/159] ip_gre: check packet length and mtu correctly in erspan tx Greg Kroah-Hartman
@ 2017-12-22  8:46 ` Greg Kroah-Hartman
  2017-12-22  8:46 ` [PATCH 4.14 125/159] Bluetooth: hci_uart_set_flow_control: Fix NULL deref when using serdev Greg Kroah-Hartman
                   ` (39 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Wei Wang, Martin KaFai Lau,
	Eric Dumazet, David S. Miller, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Wei Wang <weiwan@google.com>


[ Upstream commit a94b9367e044ba672c9f4105eb1516ff6ff4948a ]

After rwlock is replaced with rcu and spinlock, ip6_pol_route() will be
called with only rcu held. That means rt6 route deletion could happen
simultaneously with rt6_make_pcpu_rt(). This could potentially cause a
memory leak if rt6_release() is called right before rt6_make_pcpu_rt()
on the same route.

This patch grabs rt->rt6i_ref safely before calling rt6_make_pcpu_rt()
to make sure rt6_release() will not get triggered while
rt6_make_pcpu_rt() is in progress. And rt6_release() is called after
rt6_make_pcpu_rt() is finished.

Note: As we are incrementing rt->rt6i_ref in ip6_pol_route(), there is a
very slim chance that fib6_purge_rt() will be triggered unnecessarily
when deleting a route if ip6_pol_route() running on another thread picks
this route as well and tries to make pcpu cache for it.

Signed-off-by: Wei Wang <weiwan@google.com>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/ipv6/route.c |   58 +++++++++++++++++++++++++++----------------------------
 1 file changed, 29 insertions(+), 29 deletions(-)
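
Note: a user-space sketch of the "increment unless zero" pattern that
atomic_inc_not_zero() provides, written with C11 atomics. The function
name is hypothetical and this is not the kernel's implementation; it
only illustrates why a reference can be taken safely on an object that
may be concurrently released.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Take a reference only if the count is still non-zero. Returns true
 * when the reference was taken, false when the object was already on
 * its way to being freed (count == 0).
 */
static bool demo_inc_not_zero(atomic_int *ref)
{
	int old = atomic_load(ref);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(ref, &old, old + 1))
			return true;
	}
	return false;
}
```

In the diff below, the failure case corresponds to the branch that
treats rt as already removed from the tree and falls back to
ip6_null_entry.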

--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -1055,7 +1055,6 @@ static struct rt6_info *rt6_get_pcpu_rou
 
 static struct rt6_info *rt6_make_pcpu_route(struct rt6_info *rt)
 {
-	struct fib6_table *table = rt->rt6i_table;
 	struct rt6_info *pcpu_rt, *prev, **p;
 
 	pcpu_rt = ip6_rt_pcpu_alloc(rt);
@@ -1066,28 +1065,20 @@ static struct rt6_info *rt6_make_pcpu_ro
 		return net->ipv6.ip6_null_entry;
 	}
 
-	read_lock_bh(&table->tb6_lock);
-	if (rt->rt6i_pcpu) {
-		p = this_cpu_ptr(rt->rt6i_pcpu);
-		prev = cmpxchg(p, NULL, pcpu_rt);
-		if (prev) {
-			/* If someone did it before us, return prev instead */
-			dst_release_immediate(&pcpu_rt->dst);
-			pcpu_rt = prev;
-		}
-	} else {
-		/* rt has been removed from the fib6 tree
-		 * before we have a chance to acquire the read_lock.
-		 * In this case, don't brother to create a pcpu rt
-		 * since rt is going away anyway.  The next
-		 * dst_check() will trigger a re-lookup.
-		 */
+	dst_hold(&pcpu_rt->dst);
+	p = this_cpu_ptr(rt->rt6i_pcpu);
+	prev = cmpxchg(p, NULL, pcpu_rt);
+	if (prev) {
+		/* If someone did it before us, return prev instead */
+		/* release refcnt taken by ip6_rt_pcpu_alloc() */
+		dst_release_immediate(&pcpu_rt->dst);
+		/* release refcnt taken by above dst_hold() */
 		dst_release_immediate(&pcpu_rt->dst);
-		pcpu_rt = rt;
+		dst_hold(&prev->dst);
+		pcpu_rt = prev;
 	}
-	dst_hold(&pcpu_rt->dst);
+
 	rt6_dst_from_metrics_check(pcpu_rt);
-	read_unlock_bh(&table->tb6_lock);
 	return pcpu_rt;
 }
 
@@ -1177,19 +1168,28 @@ redo_rt6_select:
 		if (pcpu_rt) {
 			read_unlock_bh(&table->tb6_lock);
 		} else {
-			/* We have to do the read_unlock first
-			 * because rt6_make_pcpu_route() may trigger
-			 * ip6_dst_gc() which will take the write_lock.
-			 */
-			dst_hold(&rt->dst);
-			read_unlock_bh(&table->tb6_lock);
-			pcpu_rt = rt6_make_pcpu_route(rt);
-			dst_release(&rt->dst);
+			/* atomic_inc_not_zero() is needed when using rcu */
+			if (atomic_inc_not_zero(&rt->rt6i_ref)) {
+				/* We have to do the read_unlock first
+				 * because rt6_make_pcpu_route() may trigger
+				 * ip6_dst_gc() which will take the write_lock.
+				 *
+				 * No dst_hold() on rt is needed because grabbing
+				 * rt->rt6i_ref makes sure rt can't be released.
+				 */
+				read_unlock_bh(&table->tb6_lock);
+				pcpu_rt = rt6_make_pcpu_route(rt);
+				rt6_release(rt);
+			} else {
+				/* rt is already removed from tree */
+				read_unlock_bh(&table->tb6_lock);
+				pcpu_rt = net->ipv6.ip6_null_entry;
+				dst_hold(&pcpu_rt->dst);
+			}
 		}
 
 		trace_fib6_table_lookup(net, pcpu_rt, table->tb6_id, fl6);
 		return pcpu_rt;
-
 	}
 }
 EXPORT_SYMBOL_GPL(ip6_pol_route);

^ permalink raw reply	[flat|nested] 349+ messages in thread

* [PATCH 4.14 125/159] Bluetooth: hci_uart_set_flow_control: Fix NULL deref when using serdev
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (125 preceding siblings ...)
  2017-12-22  8:46 ` [PATCH 4.14 123/159] ipv6: grab rt->rt6i_ref before allocating pcpu rt Greg Kroah-Hartman
@ 2017-12-22  8:46 ` Greg Kroah-Hartman
  2017-12-22  8:46 ` [PATCH 4.14 126/159] Bluetooth: hci_bcm: Fix setting of irq trigger type Greg Kroah-Hartman
                   ` (38 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Hans de Goede, Marcel Holtmann, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Hans de Goede <hdegoede@redhat.com>


[ Upstream commit 7841d554809b518a22349e7e39b6b63f8a48d0fb ]

Fix a NULL pointer deref (hu->tty) when calling hci_uart_set_flow_control
on hci_uart-s using serdev.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/bluetooth/hci_ldisc.c |    7 +++++++
 1 file changed, 7 insertions(+)

--- a/drivers/bluetooth/hci_ldisc.c
+++ b/drivers/bluetooth/hci_ldisc.c
@@ -41,6 +41,7 @@
 #include <linux/ioctl.h>
 #include <linux/skbuff.h>
 #include <linux/firmware.h>
+#include <linux/serdev.h>
 
 #include <net/bluetooth/bluetooth.h>
 #include <net/bluetooth/hci_core.h>
@@ -298,6 +299,12 @@ void hci_uart_set_flow_control(struct hc
 	unsigned int set = 0;
 	unsigned int clear = 0;
 
+	if (hu->serdev) {
+		serdev_device_set_flow_control(hu->serdev, !enable);
+		serdev_device_set_rts(hu->serdev, !enable);
+		return;
+	}
+
 	if (enable) {
 		/* Disable hardware flow control */
 		ktermios = tty->termios;

^ permalink raw reply	[flat|nested] 349+ messages in thread

* [PATCH 4.14 126/159] Bluetooth: hci_bcm: Fix setting of irq trigger type
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (126 preceding siblings ...)
  2017-12-22  8:46 ` [PATCH 4.14 125/159] Bluetooth: hci_uart_set_flow_control: Fix NULL deref when using serdev Greg Kroah-Hartman
@ 2017-12-22  8:46 ` Greg Kroah-Hartman
  2017-12-22  8:46 ` [PATCH 4.14 127/159] i40e/i40evf: spread CPU affinity hints across online CPUs only Greg Kroah-Hartman
                   ` (37 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Hans de Goede, Marcel Holtmann, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Hans de Goede <hdegoede@redhat.com>


[ Upstream commit 227630cccdbb8f8a1b24ac26517b75079c9a69c9 ]

This commit fixes 2 issues with host-wake irq trigger type handling
in hci_bcm:

1) bcm_setup_sleep sets sleep_params.host_wake_active based on
bcm_device.irq_polarity, but bcm_request_irq was always requesting
IRQF_TRIGGER_RISING as trigger type independent of irq_polarity.

This was a problem when the irq is described as a GpioInt rather than
an Interrupt in the DSDT as for GpioInt-s the value passed to request_irq
is honored. This commit fixes this by requesting the correct trigger
type depending on bcm_device.irq_polarity.

2) bcm_device.irq_polarity was used to directly store an ACPI polarity
value (ACPI_ACTIVE_*). This is undesirable because hci_bcm is also
used with device-tree and checking for something like ACPI_ACTIVE_LOW
in a non ACPI specific function like bcm_request_irq feels wrong.

This commit fixes this by renaming irq_polarity to irq_active_low
and changing its type to a bool.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/bluetooth/hci_bcm.c |   23 ++++++++++-------------
 1 file changed, 10 insertions(+), 13 deletions(-)

--- a/drivers/bluetooth/hci_bcm.c
+++ b/drivers/bluetooth/hci_bcm.c
@@ -68,7 +68,7 @@ struct bcm_device {
 	u32			init_speed;
 	u32			oper_speed;
 	int			irq;
-	u8			irq_polarity;
+	bool			irq_active_low;
 
 #ifdef CONFIG_PM
 	struct hci_uart		*hu;
@@ -213,7 +213,9 @@ static int bcm_request_irq(struct bcm_da
 	}
 
 	err = devm_request_irq(&bdev->pdev->dev, bdev->irq, bcm_host_wake,
-			       IRQF_TRIGGER_RISING, "host_wake", bdev);
+			       bdev->irq_active_low ? IRQF_TRIGGER_FALLING :
+						      IRQF_TRIGGER_RISING,
+			       "host_wake", bdev);
 	if (err)
 		goto unlock;
 
@@ -253,7 +255,7 @@ static int bcm_setup_sleep(struct hci_ua
 	struct sk_buff *skb;
 	struct bcm_set_sleep_mode sleep_params = default_sleep_params;
 
-	sleep_params.host_wake_active = !bcm->dev->irq_polarity;
+	sleep_params.host_wake_active = !bcm->dev->irq_active_low;
 
 	skb = __hci_cmd_sync(hu->hdev, 0xfc27, sizeof(sleep_params),
 			     &sleep_params, HCI_INIT_TIMEOUT);
@@ -690,10 +692,8 @@ static const struct acpi_gpio_mapping ac
 };
 
 #ifdef CONFIG_ACPI
-static u8 acpi_active_low = ACPI_ACTIVE_LOW;
-
 /* IRQ polarity of some chipsets are not defined correctly in ACPI table. */
-static const struct dmi_system_id bcm_wrong_irq_dmi_table[] = {
+static const struct dmi_system_id bcm_active_low_irq_dmi_table[] = {
 	{
 		.ident = "Asus T100TA",
 		.matches = {
@@ -701,7 +701,6 @@ static const struct dmi_system_id bcm_wr
 					"ASUSTeK COMPUTER INC."),
 			DMI_EXACT_MATCH(DMI_PRODUCT_NAME, "T100TA"),
 		},
-		.driver_data = &acpi_active_low,
 	},
 	{
 		.ident = "Asus T100CHI",
@@ -710,7 +709,6 @@ static const struct dmi_system_id bcm_wr
 					"ASUSTeK COMPUTER INC."),
 			DMI_EXACT_MATCH(DMI_PRODUCT_NAME, "T100CHI"),
 		},
-		.driver_data = &acpi_active_low,
 	},
 	{	/* Handle ThinkPad 8 tablets with BCM2E55 chipset ACPI ID */
 		.ident = "Lenovo ThinkPad 8",
@@ -718,7 +716,6 @@ static const struct dmi_system_id bcm_wr
 			DMI_EXACT_MATCH(DMI_SYS_VENDOR, "LENOVO"),
 			DMI_EXACT_MATCH(DMI_PRODUCT_VERSION, "ThinkPad 8"),
 		},
-		.driver_data = &acpi_active_low,
 	},
 	{ }
 };
@@ -733,13 +730,13 @@ static int bcm_resource(struct acpi_reso
 	switch (ares->type) {
 	case ACPI_RESOURCE_TYPE_EXTENDED_IRQ:
 		irq = &ares->data.extended_irq;
-		dev->irq_polarity = irq->polarity;
+		dev->irq_active_low = irq->polarity == ACPI_ACTIVE_LOW;
 		break;
 
 	case ACPI_RESOURCE_TYPE_GPIO:
 		gpio = &ares->data.gpio;
 		if (gpio->connection_type == ACPI_RESOURCE_GPIO_TYPE_INT)
-			dev->irq_polarity = gpio->polarity;
+			dev->irq_active_low = gpio->polarity == ACPI_ACTIVE_LOW;
 		break;
 
 	case ACPI_RESOURCE_TYPE_SERIAL_BUS:
@@ -834,11 +831,11 @@ static int bcm_acpi_probe(struct bcm_dev
 		return ret;
 	acpi_dev_free_resource_list(&resources);
 
-	dmi_id = dmi_first_match(bcm_wrong_irq_dmi_table);
+	dmi_id = dmi_first_match(bcm_active_low_irq_dmi_table);
 	if (dmi_id) {
 		bt_dev_warn(dev, "%s: Overwriting IRQ polarity to active low",
 			    dmi_id->ident);
-		dev->irq_polarity = *(u8 *)dmi_id->driver_data;
+		dev->irq_active_low = true;
 	}
 
 	return 0;

^ permalink raw reply	[flat|nested] 349+ messages in thread

* [PATCH 4.14 127/159] i40e/i40evf: spread CPU affinity hints across online CPUs only
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (127 preceding siblings ...)
  2017-12-22  8:46 ` [PATCH 4.14 126/159] Bluetooth: hci_bcm: Fix setting of irq trigger type Greg Kroah-Hartman
@ 2017-12-22  8:46 ` Greg Kroah-Hartman
  2017-12-22  8:46 ` [PATCH 4.14 128/159] PCI/AER: Report non-fatal errors only to the affected endpoint Greg Kroah-Hartman
                   ` (36 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Jacob Keller, Andrew Bowers,
	Jeff Kirsher, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Jacob Keller <jacob.e.keller@intel.com>


[ Upstream commit be664cbefc50977aaefc868ba6a1109ec9b7449d ]

Currently, when setting up the IRQ for a q_vector, we set an affinity
hint based on the v_idx of that q_vector. That is, a loop iterates on
v_idx, which is an incremental value, and the cpumask is created based
on this value.

This is a problem in systems with multiple logical CPUs per core (like in
simultaneous multithreading (SMT) scenarios). If we disable some logical
CPUs, by turning SMT off for example, we will end up with a sparse
cpu_online_mask, i.e., only the first CPU in a core is online, and
incremental filling in q_vector cpumask might lead to multiple offline
CPUs being assigned to q_vectors.

Example: if we have a system with 8 cores each one containing 8 logical
CPUs (SMT == 8 in this case), we have 64 CPUs in total. But if SMT is
disabled, only the 1st CPU in each core remains online, so the
cpu_online_mask in this case would have only 8 bits set, in a sparse way.

In the general case, when SMT is off the cpu_online_mask has only C bits
set: 0, 1*N, 2*N, ..., (C-1)*N  where
C == # of cores;
N == # of logical CPUs per core.
In our example, only bits 0, 8, 16, 24, 32, 40, 48, 56 would be set.

Instead, we should only assign hints for CPUs which are online. Even
better, the kernel already provides a function, cpumask_local_spread()
which takes an index and returns a CPU, spreading the interrupts across
local NUMA nodes first, and then remote ones if necessary.

Since we generally have a 1:1 mapping between vectors and CPUs, there
is no real advantage to spreading vectors to local CPUs first. In order
to avoid mismatch of the default XPS hints, we'll pass -1 so that it
spreads across all CPUs without regard to the node locality.

Note that we don't need to change the q_vector->affinity_mask as this is
initialized to cpu_possible_mask, until an actual affinity is set and
then notified back to us.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/ethernet/intel/i40e/i40e_main.c     |   16 +++++++++++-----
 drivers/net/ethernet/intel/i40evf/i40evf_main.c |    9 ++++++---
 2 files changed, 17 insertions(+), 8 deletions(-)
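
Note: a user-space sketch of the idea behind picking the i-th online
CPU instead of CPU number i. The function name and the 64-bit mask are
assumptions for illustration; the wrap-around on the index is assumed
to mirror what cpumask_local_spread() does with node == -1, and this
is not the kernel implementation.

```c
#include <stdint.h>

/*
 * Return the idx-th set bit of online_mask (bit n == CPU n is online),
 * wrapping idx around the number of online CPUs. Returns -1 if no CPU
 * is online.
 */
static int demo_nth_online_cpu(uint64_t online_mask, unsigned int idx)
{
	unsigned int online = 0;
	int cpu;

	for (cpu = 0; cpu < 64; cpu++)
		if ((online_mask >> cpu) & 1u)
			online++;
	if (online == 0)
		return -1;

	idx %= online;
	for (cpu = 0; cpu < 64; cpu++) {
		if (!((online_mask >> cpu) & 1u))
			continue;
		if (idx == 0)
			return cpu;
		idx--;
	}
	return -1;
}
```

For the 8-core, SMT-off example from the changelog the mask is
0x0101010101010101 (bits 0, 8, ..., 56). Vector index 1 then maps to
CPU 8 rather than the offline CPU 1, and index 8 wraps back to CPU 0.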

--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -2874,14 +2874,15 @@ static void i40e_vsi_free_rx_resources(s
 static void i40e_config_xps_tx_ring(struct i40e_ring *ring)
 {
 	struct i40e_vsi *vsi = ring->vsi;
+	int cpu;
 
 	if (!ring->q_vector || !ring->netdev)
 		return;
 
 	if ((vsi->tc_config.numtc <= 1) &&
 	    !test_and_set_bit(__I40E_TX_XPS_INIT_DONE, &ring->state)) {
-		netif_set_xps_queue(ring->netdev,
-				    get_cpu_mask(ring->q_vector->v_idx),
+		cpu = cpumask_local_spread(ring->q_vector->v_idx, -1);
+		netif_set_xps_queue(ring->netdev, get_cpu_mask(cpu),
 				    ring->queue_index);
 	}
 
@@ -3471,6 +3472,7 @@ static int i40e_vsi_request_irq_msix(str
 	int tx_int_idx = 0;
 	int vector, err;
 	int irq_num;
+	int cpu;
 
 	for (vector = 0; vector < q_vectors; vector++) {
 		struct i40e_q_vector *q_vector = vsi->q_vectors[vector];
@@ -3506,10 +3508,14 @@ static int i40e_vsi_request_irq_msix(str
 		q_vector->affinity_notify.notify = i40e_irq_affinity_notify;
 		q_vector->affinity_notify.release = i40e_irq_affinity_release;
 		irq_set_affinity_notifier(irq_num, &q_vector->affinity_notify);
-		/* get_cpu_mask returns a static constant mask with
-		 * a permanent lifetime so it's ok to use here.
+		/* Spread affinity hints out across online CPUs.
+		 *
+		 * get_cpu_mask returns a static constant mask with
+		 * a permanent lifetime so it's ok to pass to
+		 * irq_set_affinity_hint without making a copy.
 		 */
-		irq_set_affinity_hint(irq_num, get_cpu_mask(q_vector->v_idx));
+		cpu = cpumask_local_spread(q_vector->v_idx, -1);
+		irq_set_affinity_hint(irq_num, get_cpu_mask(cpu));
 	}
 
 	vsi->irqs_ready = true;
--- a/drivers/net/ethernet/intel/i40evf/i40evf_main.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
@@ -546,6 +546,7 @@ i40evf_request_traffic_irqs(struct i40ev
 	unsigned int vector, q_vectors;
 	unsigned int rx_int_idx = 0, tx_int_idx = 0;
 	int irq_num, err;
+	int cpu;
 
 	i40evf_irq_disable(adapter);
 	/* Decrement for Other and TCP Timer vectors */
@@ -584,10 +585,12 @@ i40evf_request_traffic_irqs(struct i40ev
 		q_vector->affinity_notify.release =
 						   i40evf_irq_affinity_release;
 		irq_set_affinity_notifier(irq_num, &q_vector->affinity_notify);
-		/* get_cpu_mask returns a static constant mask with
-		 * a permanent lifetime so it's ok to use here.
+		/* Spread the IRQ affinity hints across online CPUs. Note that
+		 * get_cpu_mask returns a mask with a permanent lifetime so
+		 * it's safe to use as a hint for irq_set_affinity_hint.
 		 */
-		irq_set_affinity_hint(irq_num, get_cpu_mask(q_vector->v_idx));
+		cpu = cpumask_local_spread(q_vector->v_idx, -1);
+		irq_set_affinity_hint(irq_num, get_cpu_mask(cpu));
 	}
 
 	return 0;

^ permalink raw reply	[flat|nested] 349+ messages in thread

* [PATCH 4.14 128/159] PCI/AER: Report non-fatal errors only to the affected endpoint
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (128 preceding siblings ...)
  2017-12-22  8:46 ` [PATCH 4.14 127/159] i40e/i40evf: spread CPU affinity hints across online CPUs only Greg Kroah-Hartman
@ 2017-12-22  8:46 ` Greg Kroah-Hartman
  2017-12-22  8:46 ` [PATCH 4.14 129/159] tracing: Exclude generic fields from histograms Greg Kroah-Hartman
                   ` (35 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Gabriele Paoloni, Dongdong Liu,
	Bjorn Helgaas, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Gabriele Paoloni <gabriele.paoloni@huawei.com>


[ Upstream commit 86acc790717fb60fb51ea3095084e331d8711c74 ]

Previously, if a non-fatal error was reported by an endpoint, we
called report_error_detected() for the endpoint, every sibling on the
bus, and their descendants.  If any of them did not implement the
.error_detected() method, do_recovery() failed, leaving all these
devices unrecovered.

For example, the system described in the bugzilla below has two devices:

  0000:74:02.0 [19e5:a230] SAS controller, driver has .error_detected()
  0000:74:03.0 [19e5:a235] SATA controller, driver lacks .error_detected()

When a device such as 74:02.0 reported a non-fatal error, do_recovery()
failed because 74:03.0 lacked an .error_detected() method.  But per PCIe
r3.1, sec 6.2.2.2.2, such an error does not compromise the Link and
does not affect 74:03.0:

  Non-fatal errors are uncorrectable errors which cause a particular
  transaction to be unreliable but the Link is otherwise fully functional.
  Isolating Non-fatal from Fatal errors provides Requester/Receiver logic
  in a device or system management software the opportunity to recover from
  the error without resetting the components on the Link and disturbing
  other transactions in progress.  Devices not associated with the
  transaction in error are not impacted by the error.

Report non-fatal errors only to the endpoint that reported them.  We really
want to check for AER_NONFATAL here, but the current code structure doesn't
allow that.  Looking for pci_channel_io_normal is the best we can do now.
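
The dispatch rule described above can be sketched as a small user-space
model. This is only an illustration under stated assumptions: the model_*
names are hypothetical and are not kernel APIs, and the "bus" is reduced
to a flat array of sibling devices.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
enum model_state { MODEL_IO_NORMAL, MODEL_IO_FROZEN };

struct model_dev {
	const char *name;
	bool has_error_detected;	/* driver implements .error_detected() */
	bool notified;			/* set when the callback visits the device */
};

/* Visit one device; returns false if its driver cannot handle the error. */
static bool model_report(struct model_dev *dev)
{
	dev->notified = true;
	return dev->has_error_detected;
}

/*
 * Returns true when recovery can proceed. 'bus' holds the reporting
 * endpoint and its siblings; 'reporter' indexes the endpoint that
 * logged the error.
 */
static bool model_broadcast(struct model_dev *bus, size_t n, size_t reporter,
			    enum model_state state)
{
	bool ok = true;
	size_t i;

	assert(reporter < n);

	/* Non-fatal: the link is fine, so only the reporter is told. */
	if (state == MODEL_IO_NORMAL)
		return model_report(&bus[reporter]);

	/* Otherwise every device on the bus is visited, as before. */
	for (i = 0; i < n; i++)
		if (!model_report(&bus[i]))
			ok = false;
	return ok;
}
```

With the two devices from the bugzilla example, a non-fatal report from
74:02.0 no longer touches 74:03.0, so the missing .error_detected() method
on the SATA controller cannot fail the recovery.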

Link: https://bugzilla.kernel.org/show_bug.cgi?id=197055
Fixes: 6c2b374d7485 ("PCI-Express AER implemetation: AER core and aerdriver")
Signed-off-by: Gabriele Paoloni <gabriele.paoloni@huawei.com>
Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
[bhelgaas: changelog]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/pci/pcie/aer/aerdrv_core.c |    9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

--- a/drivers/pci/pcie/aer/aerdrv_core.c
+++ b/drivers/pci/pcie/aer/aerdrv_core.c
@@ -390,7 +390,14 @@ static pci_ers_result_t broadcast_error_
 		 * If the error is reported by an end point, we think this
 		 * error is related to the upstream link of the end point.
 		 */
-		pci_walk_bus(dev->bus, cb, &result_data);
+		if (state == pci_channel_io_normal)
+			/*
+			 * the error is non fatal so the bus is ok, just invoke
+			 * the callback for the function that logged the error.
+			 */
+			cb(dev, &result_data);
+		else
+			pci_walk_bus(dev->bus, cb, &result_data);
 	}
 
 	return result_data.result;


* [PATCH 4.14 129/159] tracing: Exclude generic fields from histograms
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (129 preceding siblings ...)
  2017-12-22  8:46 ` [PATCH 4.14 128/159] PCI/AER: Report non-fatal errors only to the affected endpoint Greg Kroah-Hartman
@ 2017-12-22  8:46 ` Greg Kroah-Hartman
  2017-12-22  8:46 ` [PATCH 4.14 131/159] ASoC: img-parallel-out: Add pm_runtime_get/put to set_fmt callback Greg Kroah-Hartman
                   ` (34 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Tom Zanussi, Steven Rostedt (VMware),
	Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Tom Zanussi <tom.zanussi@linux.intel.com>


[ Upstream commit a15f7fc20389a8827d5859907568b201234d4b79 ]

There are a small number of 'generic fields' (comm/COMM/cpu/CPU) that
are found by trace_find_event_field() but are only meant for
filtering.  Specifically, unlike normal fields, they have a size
of 0 and thus wreak havoc when used as a histogram key.

Exclude these (return -EINVAL) when used as histogram keys.
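
A minimal user-space sketch of the check, assuming a made-up field table;
struct model_field and the model_* helpers are hypothetical stand-ins, not
the kernel's structures:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a trace event field; not the kernel's struct. */
struct model_field {
	const char *name;
	int size;	/* 0 marks a generic, filter-only field */
};

/* Made-up field table for illustration. */
static const struct model_field model_fields[] = {
	{ "pid",  4 },
	{ "comm", 0 },
	{ "cpu",  0 },
};

static const struct model_field *model_find_field(const char *name)
{
	size_t i;

	assert(name != NULL);
	for (i = 0; i < sizeof(model_fields) / sizeof(model_fields[0]); i++)
		if (strcmp(model_fields[i].name, name) == 0)
			return &model_fields[i];
	return NULL;
}

/* Returns 0 if 'name' is usable in a histogram, else -EINVAL. */
static int model_check_hist_field(const char *name)
{
	const struct model_field *field = model_find_field(name);

	/* A field that is found but has size 0 is rejected too. */
	if (!field || !field->size)
		return -EINVAL;
	return 0;
}
```

In this model "pid" is accepted, while "comm" is rejected with the same
error as a field that does not exist at all.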

Link: http://lkml.kernel.org/r/956154cbc3e8a4f0633d619b886c97f0f0edf7b4.1506105045.git.tom.zanussi@linux.intel.com

Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 kernel/trace/trace_events_hist.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/kernel/trace/trace_events_hist.c
+++ b/kernel/trace/trace_events_hist.c
@@ -450,7 +450,7 @@ static int create_val_field(struct hist_
 	}
 
 	field = trace_find_event_field(file->event_call, field_name);
-	if (!field) {
+	if (!field || !field->size) {
 		ret = -EINVAL;
 		goto out;
 	}
@@ -548,7 +548,7 @@ static int create_key_field(struct hist_
 		}
 
 		field = trace_find_event_field(file->event_call, field_name);
-		if (!field) {
+		if (!field || !field->size) {
 			ret = -EINVAL;
 			goto out;
 		}


* [PATCH 4.14 131/159] ASoC: img-parallel-out: Add pm_runtime_get/put to set_fmt callback
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (130 preceding siblings ...)
  2017-12-22  8:46 ` [PATCH 4.14 129/159] tracing: Exclude generic fields from histograms Greg Kroah-Hartman
@ 2017-12-22  8:46 ` Greg Kroah-Hartman
  2017-12-22  8:46 ` [PATCH 4.14 132/159] powerpc/xmon: Avoid tripping SMP hardlockup watchdog Greg Kroah-Hartman
                   ` (33 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Ed Blake, Mark Brown, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Ed Blake <ed.blake@sondrel.com>


[ Upstream commit c70458890ff15d858bd347fa9f563818bcd6e457 ]

Add pm_runtime_get_sync and pm_runtime_put calls to the set_fmt callback
function. This fixes a bus error during boot when CONFIG_SUSPEND is
defined: the function can be called while the device is runtime
disabled, so device registers are accessed while the clock is disabled.
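
The get/put bracket can be sketched with a user-space model in which a
register access counts as a bus error whenever the usage count is zero.
All model_* names are hypothetical; this is not the runtime PM API, only
an illustration of why the accesses need to sit between the two calls.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical device model: the clock is on only while usage > 0. */
struct model_dev {
	int usage;		/* stands in for the runtime PM usage count */
	uint32_t ctl;		/* stands in for the control register */
	int bus_errors;		/* accesses made while the clock was off */
};

static void model_pm_get_sync(struct model_dev *d)
{
	d->usage++;
}

static void model_pm_put(struct model_dev *d)
{
	assert(d->usage > 0);
	d->usage--;
}

static uint32_t model_readl(struct model_dev *d)
{
	if (d->usage == 0) {
		d->bus_errors++;
		return 0;
	}
	return d->ctl;
}

static void model_writel(struct model_dev *d, uint32_t val)
{
	if (d->usage == 0) {
		d->bus_errors++;
		return;
	}
	d->ctl = val;
}

/* Read-modify-write of the control register, bracketed by get/put. */
static void model_set_fmt(struct model_dev *d, uint32_t edge_mask,
			  uint32_t control_set)
{
	uint32_t reg;

	model_pm_get_sync(d);
	reg = model_readl(d);
	reg = (reg & ~edge_mask) | control_set;
	model_writel(d, reg);
	model_pm_put(d);
}
```

Calling model_set_fmt() on an idle device updates the register without
recording a bus error, whereas a bare model_readl() on the same idle
device does record one.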

Signed-off-by: Ed Blake <ed.blake@sondrel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 sound/soc/img/img-parallel-out.c |    2 ++
 1 file changed, 2 insertions(+)

--- a/sound/soc/img/img-parallel-out.c
+++ b/sound/soc/img/img-parallel-out.c
@@ -164,9 +164,11 @@ static int img_prl_out_set_fmt(struct sn
 		return -EINVAL;
 	}
 
+	pm_runtime_get_sync(prl->dev);
 	reg = img_prl_out_readl(prl, IMG_PRL_OUT_CTL);
 	reg = (reg & ~IMG_PRL_OUT_CTL_EDGE_MASK) | control_set;
 	img_prl_out_writel(prl, reg, IMG_PRL_OUT_CTL);
+	pm_runtime_put(prl->dev);
 
 	return 0;
 }


* [PATCH 4.14 132/159] powerpc/xmon: Avoid tripping SMP hardlockup watchdog
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (131 preceding siblings ...)
  2017-12-22  8:46 ` [PATCH 4.14 131/159] ASoC: img-parallel-out: Add pm_runtime_get/put to set_fmt callback Greg Kroah-Hartman
@ 2017-12-22  8:46 ` Greg Kroah-Hartman
  2017-12-22  8:46 ` [PATCH 4.14 133/159] powerpc/watchdog: Do not trigger SMP crash from touch_nmi_watchdog Greg Kroah-Hartman
                   ` (32 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Nicholas Piggin, Michael Ellerman,
	Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Nicholas Piggin <npiggin@gmail.com>


[ Upstream commit 064996d62a33ffe10264b5af5dca92d54f60f806 ]

The SMP hardlockup watchdog cross-checks other CPUs for lockups, which
causes xmon headaches because it's assuming interrupts hard disabled
means no watchdog troubles. Try to improve that by calling
touch_nmi_watchdog() in obvious places where secondaries are spinning.

Also annotate these spin loops with spin_begin/end calls.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/powerpc/xmon/xmon.c |   17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -530,14 +530,19 @@ static int xmon_core(struct pt_regs *reg
 
  waiting:
 	secondary = 1;
+	spin_begin();
 	while (secondary && !xmon_gate) {
 		if (in_xmon == 0) {
-			if (fromipi)
+			if (fromipi) {
+				spin_end();
 				goto leave;
+			}
 			secondary = test_and_set_bit(0, &in_xmon);
 		}
-		barrier();
+		spin_cpu_relax();
+		touch_nmi_watchdog();
 	}
+	spin_end();
 
 	if (!secondary && !xmon_gate) {
 		/* we are the first cpu to come in */
@@ -568,21 +573,25 @@ static int xmon_core(struct pt_regs *reg
 		mb();
 		xmon_gate = 1;
 		barrier();
+		touch_nmi_watchdog();
 	}
 
  cmdloop:
 	while (in_xmon) {
 		if (secondary) {
+			spin_begin();
 			if (cpu == xmon_owner) {
 				if (!test_and_set_bit(0, &xmon_taken)) {
 					secondary = 0;
+					spin_end();
 					continue;
 				}
 				/* missed it */
 				while (cpu == xmon_owner)
-					barrier();
+					spin_cpu_relax();
 			}
-			barrier();
+			spin_cpu_relax();
+			touch_nmi_watchdog();
 		} else {
 			cmd = cmds(regs);
 			if (cmd != 0) {


* [PATCH 4.14 133/159] powerpc/watchdog: Do not trigger SMP crash from touch_nmi_watchdog
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (132 preceding siblings ...)
  2017-12-22  8:46 ` [PATCH 4.14 132/159] powerpc/xmon: Avoid tripping SMP hardlockup watchdog Greg Kroah-Hartman
@ 2017-12-22  8:46 ` Greg Kroah-Hartman
  2017-12-22  8:46 ` [PATCH 4.14 134/159] sctp: silence warns on sctp_stream_init allocations Greg Kroah-Hartman
                   ` (31 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Nicholas Piggin, Michael Ellerman,
	Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Nicholas Piggin <npiggin@gmail.com>


[ Upstream commit 80e4d70b06863e0104e5a0dc78aa3710297fbd4b ]

In xmon, touch_nmi_watchdog() is not expected to be checking that
other CPUs have not touched the watchdog, so the code will just call
touch_nmi_watchdog() once before re-enabling hard interrupts.

Just update our CPU's state, and ignore apparently stuck SMP threads.

Arguably touch_nmi_watchdog should check for SMP lockups, and callers
should be fixed, but that's not trivial for the input code of xmon.
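
A simplified user-space sketch of the new behaviour follows. It is an
assumption-laden model: struct model_wd and model_touch() are hypothetical,
and the pending-CPU bookkeeping is reduced to a plain bitmask, which is far
simpler than what the real watchdog does.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_NR_CPUS 4

/* Hypothetical, much-simplified watchdog state. */
struct model_wd {
	uint64_t timer_tb[MODEL_NR_CPUS];	/* last touch time per CPU */
	unsigned int pending;			/* CPUs not yet checked in */
	uint64_t period_ticks;			/* watchdog period in ticks */
};

/*
 * Touch the watchdog from 'cpu' at time 'tb'. Only the calling CPU's
 * own state is refreshed; other CPUs are never examined here.
 */
static void model_touch(struct model_wd *wd, int cpu, uint64_t tb)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);

	if (tb - wd->timer_tb[cpu] >= wd->period_ticks) {
		wd->timer_tb[cpu] = tb;
		wd->pending &= ~(1u << cpu);
	}
}
```

The point of the sketch is the absence of any cross-CPU check: a touch
either refreshes the caller's timestamp and pending bit or does nothing.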

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/powerpc/kernel/watchdog.c |    7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

--- a/arch/powerpc/kernel/watchdog.c
+++ b/arch/powerpc/kernel/watchdog.c
@@ -276,9 +276,12 @@ void arch_touch_nmi_watchdog(void)
 {
 	unsigned long ticks = tb_ticks_per_usec * wd_timer_period_ms * 1000;
 	int cpu = smp_processor_id();
+	u64 tb = get_tb();
 
-	if (get_tb() - per_cpu(wd_timer_tb, cpu) >= ticks)
-		watchdog_timer_interrupt(cpu);
+	if (tb - per_cpu(wd_timer_tb, cpu) >= ticks) {
+		per_cpu(wd_timer_tb, cpu) = tb;
+		wd_smp_clear_cpu_pending(cpu, tb);
+	}
 }
 EXPORT_SYMBOL(arch_touch_nmi_watchdog);
 


* [PATCH 4.14 134/159] sctp: silence warns on sctp_stream_init allocations
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (133 preceding siblings ...)
  2017-12-22  8:46 ` [PATCH 4.14 133/159] powerpc/watchdog: Do not trigger SMP crash from touch_nmi_watchdog Greg Kroah-Hartman
@ 2017-12-22  8:46 ` Greg Kroah-Hartman
  2017-12-22  8:47 ` [PATCH 4.14 135/159] ASoC: codecs: msm8916-wcd-analog: fix module autoload Greg Kroah-Hartman
                   ` (30 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Xin Long, Marcelo Ricardo Leitner,
	David S. Miller, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>


[ Upstream commit 1ae2eaaa229bc350b6f38fbf4ab9c873532aecfb ]

As SCTP supports up to 65535 streams, that can lead to very large
allocations in sctp_stream_init(). As Xin Long noticed, systems with
small amounts of memory are more prone to not have enough memory and
dump warnings on dmesg initiated by user actions. Thus, silence them.

Also, if the reallocation of stream->out is not necessary, skip it and
keep the memory we already have.
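
The "skip the reallocation when the count is unchanged" part can be
sketched in user space as below. The model_* names are hypothetical, and
__GFP_NOWARN has no user-space analogue, so only the reuse logic is shown.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

enum { MODEL_STREAM_CLOSED = 0, MODEL_STREAM_OPEN = 1 };

/* Hypothetical stand-in for the outbound half of a stream table. */
struct model_stream {
	int *out;		/* per-stream state */
	unsigned int outcnt;	/* number of entries in 'out' */
};

/* (Re)initialise the outbound table for 'outcnt' streams. */
static int model_stream_init(struct model_stream *s, unsigned int outcnt)
{
	unsigned int i;

	assert(s != NULL && outcnt > 0);

	/* Same size as before: keep the memory we already have. */
	if (outcnt == s->outcnt)
		return 0;

	free(s->out);
	s->out = calloc(outcnt, sizeof(*s->out));
	if (!s->out) {
		s->outcnt = 0;
		return -ENOMEM;
	}

	s->outcnt = outcnt;
	for (i = 0; i < s->outcnt; i++)
		s->out[i] = MODEL_STREAM_OPEN;
	return 0;
}
```

Calling model_stream_init() twice with the same count leaves the 'out'
pointer untouched, and a different count triggers a fresh allocation.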

Reported-by: Xin Long <lucien.xin@gmail.com>
Tested-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/sctp/stream.c |    8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

--- a/net/sctp/stream.c
+++ b/net/sctp/stream.c
@@ -40,9 +40,14 @@ int sctp_stream_init(struct sctp_stream
 {
 	int i;
 
+	gfp |= __GFP_NOWARN;
+
 	/* Initial stream->out size may be very big, so free it and alloc
-	 * a new one with new outcnt to save memory.
+	 * a new one with new outcnt to save memory if needed.
 	 */
+	if (outcnt == stream->outcnt)
+		goto in;
+
 	kfree(stream->out);
 
 	stream->out = kcalloc(outcnt, sizeof(*stream->out), gfp);
@@ -53,6 +58,7 @@ int sctp_stream_init(struct sctp_stream
 	for (i = 0; i < stream->outcnt; i++)
 		stream->out[i].state = SCTP_STREAM_OPEN;
 
+in:
 	if (!incnt)
 		return 0;
 


* [PATCH 4.14 135/159] ASoC: codecs: msm8916-wcd-analog: fix module autoload
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (134 preceding siblings ...)
  2017-12-22  8:46 ` [PATCH 4.14 134/159] sctp: silence warns on sctp_stream_init allocations Greg Kroah-Hartman
@ 2017-12-22  8:47 ` Greg Kroah-Hartman
  2017-12-22  8:47 ` [PATCH 4.14 136/159] fm10k: fix mis-ordered parameters in declaration for .ndo_set_vf_bw Greg Kroah-Hartman
                   ` (29 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:47 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Nicolas Dechesne, Mark Brown, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Nicolas Dechesne <nicolas.dechesne@linaro.org>


[ Upstream commit 46d69e141d479585c105a4d5b2337cd2ce6967e5 ]

If the driver is built as a module, autoload won't work because the module
alias information is not filled. So user-space can't match the registered
device with the corresponding module.

Export the module alias information using the MODULE_DEVICE_TABLE() macro.

Before this patch:

$ modinfo snd_soc_msm8916_analog | grep alias
$

After this patch:

$ modinfo snd_soc_msm8916_analog | grep alias
alias:          of:N*T*Cqcom,pm8916-wcd-analog-codecC*
alias:          of:N*T*Cqcom,pm8916-wcd-analog-codec
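
As an illustration of how such an alias is used, the sketch below matches
the exported patterns against a device modalias string with POSIX
fnmatch(). This rests on assumptions: that user space compares aliases as
glob patterns, and that the example modalias (node name "codec", type
"<NULL>") resembles what a real device would report. model_alias_matches()
is a hypothetical helper.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>

/*
 * Returns non-zero when 'modalias' (reported for a device) matches
 * the glob-style 'alias' exported by a module.
 */
static int model_alias_matches(const char *alias, const char *modalias)
{
	assert(alias != NULL && modalias != NULL);
	return fnmatch(alias, modalias, 0) == 0;
}
```

Without MODULE_DEVICE_TABLE() the module exports no pattern at all, so
there is nothing for a comparison like this to succeed against.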

Signed-off-by: Nicolas Dechesne <nicolas.dechesne@linaro.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 sound/soc/codecs/msm8916-wcd-analog.c |    2 ++
 1 file changed, 2 insertions(+)

--- a/sound/soc/codecs/msm8916-wcd-analog.c
+++ b/sound/soc/codecs/msm8916-wcd-analog.c
@@ -1242,6 +1242,8 @@ static const struct of_device_id pm8916_
 	{ }
 };
 
+MODULE_DEVICE_TABLE(of, pm8916_wcd_analog_spmi_match_table);
+
 static struct platform_driver pm8916_wcd_analog_spmi_driver = {
 	.driver = {
 		   .name = "qcom,pm8916-wcd-spmi-codec",


* [PATCH 4.14 136/159] fm10k: fix mis-ordered parameters in declaration for .ndo_set_vf_bw
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (135 preceding siblings ...)
  2017-12-22  8:47 ` [PATCH 4.14 135/159] ASoC: codecs: msm8916-wcd-analog: fix module autoload Greg Kroah-Hartman
@ 2017-12-22  8:47 ` Greg Kroah-Hartman
  2017-12-22  8:47 ` [PATCH 4.14 137/159] scsi: lpfc: Fix secure firmware updates Greg Kroah-Hartman
                   ` (28 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:47 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Jacob Keller, Krishneil Singh,
	Jeff Kirsher, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Jacob Keller <jacob.e.keller@intel.com>


[ Upstream commit 3e256ac5b1ec307e5dd5a4c99fbdbc651446c738 ]

We've had support for setting both a minimum and maximum bandwidth via
.ndo_set_vf_bw since commit 883a9ccbae56 ("fm10k: Add support for SR-IOV
to driver", 2014-09-20).

Likely because we do not support minimum rates, the declaration
mis-ordered the "unused" parameter, which causes warnings when analyzed
with cppcheck.

Fix this warning by properly declaring the min_rate and max_rate
variables in the declaration and definition (rather than using
"unused"). Also rename "rate" to max_rate so as to clarify that we only
support setting the maximum rate.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/ethernet/intel/fm10k/fm10k.h     |    4 ++--
 drivers/net/ethernet/intel/fm10k/fm10k_iov.c |    9 +++++----
 2 files changed, 7 insertions(+), 6 deletions(-)

--- a/drivers/net/ethernet/intel/fm10k/fm10k.h
+++ b/drivers/net/ethernet/intel/fm10k/fm10k.h
@@ -526,8 +526,8 @@ s32 fm10k_iov_update_pvid(struct fm10k_i
 int fm10k_ndo_set_vf_mac(struct net_device *netdev, int vf_idx, u8 *mac);
 int fm10k_ndo_set_vf_vlan(struct net_device *netdev,
 			  int vf_idx, u16 vid, u8 qos, __be16 vlan_proto);
-int fm10k_ndo_set_vf_bw(struct net_device *netdev, int vf_idx, int rate,
-			int unused);
+int fm10k_ndo_set_vf_bw(struct net_device *netdev, int vf_idx,
+			int __always_unused min_rate, int max_rate);
 int fm10k_ndo_get_vf_config(struct net_device *netdev,
 			    int vf_idx, struct ifla_vf_info *ivi);
 
--- a/drivers/net/ethernet/intel/fm10k/fm10k_iov.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_iov.c
@@ -482,7 +482,7 @@ int fm10k_ndo_set_vf_vlan(struct net_dev
 }
 
 int fm10k_ndo_set_vf_bw(struct net_device *netdev, int vf_idx,
-			int __always_unused unused, int rate)
+			int __always_unused min_rate, int max_rate)
 {
 	struct fm10k_intfc *interface = netdev_priv(netdev);
 	struct fm10k_iov_data *iov_data = interface->iov_data;
@@ -493,14 +493,15 @@ int fm10k_ndo_set_vf_bw(struct net_devic
 		return -EINVAL;
 
 	/* rate limit cannot be less than 10Mbs or greater than link speed */
-	if (rate && ((rate < FM10K_VF_TC_MIN) || rate > FM10K_VF_TC_MAX))
+	if (max_rate &&
+	    (max_rate < FM10K_VF_TC_MIN || max_rate > FM10K_VF_TC_MAX))
 		return -EINVAL;
 
 	/* store values */
-	iov_data->vf_info[vf_idx].rate = rate;
+	iov_data->vf_info[vf_idx].rate = max_rate;
 
 	/* update hardware configuration */
-	hw->iov.ops.configure_tc(hw, vf_idx, rate);
+	hw->iov.ops.configure_tc(hw, vf_idx, max_rate);
 
 	return 0;
 }


* [PATCH 4.14 137/159] scsi: lpfc: Fix secure firmware updates
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (136 preceding siblings ...)
  2017-12-22  8:47 ` [PATCH 4.14 136/159] fm10k: fix mis-ordered parameters in declaration for .ndo_set_vf_bw Greg Kroah-Hartman
@ 2017-12-22  8:47 ` Greg Kroah-Hartman
  2017-12-22  8:47 ` [PATCH 4.14 138/159] scsi: lpfc: PLOGI failures during NPIV testing Greg Kroah-Hartman
                   ` (27 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:47 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Dick Kennedy, James Smart,
	Johannes Thumshirn, Martin K. Petersen, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Dick Kennedy <dick.kennedy@broadcom.com>


[ Upstream commit 184fc2b9a8bcbda9c14d0a1e7fbecfc028c7702e ]

Firmware update fails with: status x17 add_status x56 on the final write

If multiple DMA buffers are used for the download, some firmware revs
have difficulty with signatures and crcs split across the dma buffer
boundaries.  Resolve by making all writes be a single 4k page in length.
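
The effect on the number of write commands is simple arithmetic, sketched
below under the assumption that each buffer descriptor covers one 4 KiB
page. model_write_count() and MODEL_PAGE_SIZE are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096u	/* assumed size of one DMA buffer */

/*
 * Number of write commands needed to download 'image_len' bytes when
 * each command may carry up to 'max_bde' page-sized buffers.
 */
static size_t model_write_count(size_t image_len, unsigned int max_bde)
{
	size_t per_write;

	assert(max_bde > 0);

	per_write = (size_t)max_bde * MODEL_PAGE_SIZE;
	return (image_len + per_write - 1) / per_write;	/* round up */
}
```

For a 40960-byte image, eight buffers per command would need 2 commands
and one buffer per command needs 10, but no command then spans a buffer
boundary.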

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/scsi/lpfc/lpfc_hw4.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/scsi/lpfc/lpfc_hw4.h
+++ b/drivers/scsi/lpfc/lpfc_hw4.h
@@ -3636,7 +3636,7 @@ struct lpfc_mbx_get_port_name {
 #define MB_CEQ_STATUS_QUEUE_FLUSHING		0x4
 #define MB_CQE_STATUS_DMA_FAILED		0x5
 
-#define LPFC_MBX_WR_CONFIG_MAX_BDE		8
+#define LPFC_MBX_WR_CONFIG_MAX_BDE		1
 struct lpfc_mbx_wr_object {
 	struct mbox_header header;
 	union {


* [PATCH 4.14 138/159] scsi: lpfc: PLOGI failures during NPIV testing
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (137 preceding siblings ...)
  2017-12-22  8:47 ` [PATCH 4.14 137/159] scsi: lpfc: Fix secure firmware updates Greg Kroah-Hartman
@ 2017-12-22  8:47 ` Greg Kroah-Hartman
  2017-12-22  8:47 ` [PATCH 4.14 139/159] scsi: lpfc: Fix warning messages when NVME_TARGET_FC not defined Greg Kroah-Hartman
                   ` (26 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:47 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Dick Kennedy, James Smart,
	Johannes Thumshirn, Martin K. Petersen, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Dick Kennedy <dick.kennedy@broadcom.com>


[ Upstream commit e8bcf0ae4c0346fdc78ebefe0eefcaa6a6622d38 ]

Local Reject/Invalid RPI errors seen during discovery.

Temporary RPI cleanup was occurring regardless of SLI rev. It's only
necessary on SLI-4.

Adjust the test for whether cleanup is necessary.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/scsi/lpfc/lpfc_hbadisc.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/drivers/scsi/lpfc/lpfc_hbadisc.c
+++ b/drivers/scsi/lpfc/lpfc_hbadisc.c
@@ -4983,7 +4983,8 @@ lpfc_nlp_remove(struct lpfc_vport *vport
 	lpfc_cancel_retry_delay_tmo(vport, ndlp);
 	if ((ndlp->nlp_flag & NLP_DEFER_RM) &&
 	    !(ndlp->nlp_flag & NLP_REG_LOGIN_SEND) &&
-	    !(ndlp->nlp_flag & NLP_RPI_REGISTERED)) {
+	    !(ndlp->nlp_flag & NLP_RPI_REGISTERED) &&
+	    phba->sli_rev != LPFC_SLI_REV4) {
 		/* For this case we need to cleanup the default rpi
 		 * allocated by the firmware.
 		 */


* [PATCH 4.14 139/159] scsi: lpfc: Fix warning messages when NVME_TARGET_FC not defined
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (138 preceding siblings ...)
  2017-12-22  8:47 ` [PATCH 4.14 138/159] scsi: lpfc: PLOGI failures during NPIV testing Greg Kroah-Hartman
@ 2017-12-22  8:47 ` Greg Kroah-Hartman
  2017-12-22  8:47 ` [PATCH 4.14 140/159] i40e: fix client notify of VF reset Greg Kroah-Hartman
                   ` (25 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:47 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Dick Kennedy, James Smart,
	Stephen Rothwell, Johannes Thumshirn, Martin K. Petersen,
	Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Dick Kennedy <dick.kennedy@broadcom.com>


[ Upstream commit 2299e4323d2bf6e0728fdc6b9e8e9704978d2dd7 ]

Warning messages when NVME_TARGET_FC not defined on ppc builds

The lpfc_nvmet_replenish_context() function is only meaningful when NVME
target mode is enabled. Surround the function body with ifdefs for target
mode enablement.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/scsi/lpfc/lpfc_nvmet.c |    2 ++
 1 file changed, 2 insertions(+)

--- a/drivers/scsi/lpfc/lpfc_nvmet.c
+++ b/drivers/scsi/lpfc/lpfc_nvmet.c
@@ -1464,6 +1464,7 @@ static struct lpfc_nvmet_ctxbuf *
 lpfc_nvmet_replenish_context(struct lpfc_hba *phba,
 			     struct lpfc_nvmet_ctx_info *current_infop)
 {
+#if (IS_ENABLED(CONFIG_NVME_TARGET_FC))
 	struct lpfc_nvmet_ctxbuf *ctx_buf = NULL;
 	struct lpfc_nvmet_ctx_info *get_infop;
 	int i;
@@ -1511,6 +1512,7 @@ lpfc_nvmet_replenish_context(struct lpfc
 		get_infop = get_infop->nvmet_ctx_next_cpu;
 	}
 
+#endif
 	/* Nothing found, all contexts for the MRQ are in-flight */
 	return NULL;
 }


* [PATCH 4.14 140/159] i40e: fix client notify of VF reset
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (139 preceding siblings ...)
  2017-12-22  8:47 ` [PATCH 4.14 139/159] scsi: lpfc: Fix warning messages when NVME_TARGET_FC not defined Greg Kroah-Hartman
@ 2017-12-22  8:47 ` Greg Kroah-Hartman
  2017-12-22  8:47 ` [PATCH 4.14 141/159] vfio/pci: Virtualize Maximum Payload Size Greg Kroah-Hartman
                   ` (24 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:47 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Alan Brady, Andrew Bowers,
	Jeff Kirsher, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Alan Brady <alan.brady@intel.com>


[ Upstream commit c53d11f669c0e7d0daf46a717b6712ad0b09de99 ]

Currently there is a bug in which the PF driver fails to inform clients
of a VF reset which then causes clients to leak resources.  The bug
exists because we were incorrectly checking the I40E_VF_STATE_PRE_ENABLE
bit.

When a VF is first initialized, we go through a reset to initialize
variables and allocate resources.  We don't want to inform clients of
this first reset, since the client isn't fully enabled yet, so we set a
state bit signifying we're in a "pre-enabled" client state.  The first
reset should clear the bit, and every following reset, finding the bit
not set, should notify the client.  This patch fixes the issue by
negating the 'test_and_clear_bit' check to accurately reflect the
behavior we want.

Signed-off-by: Alan Brady <alan.brady@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -1008,8 +1008,8 @@ static void i40e_cleanup_reset_vf(struct
 		set_bit(I40E_VF_STATE_ACTIVE, &vf->vf_states);
 		clear_bit(I40E_VF_STATE_DISABLED, &vf->vf_states);
 		/* Do not notify the client during VF init */
-		if (test_and_clear_bit(I40E_VF_STATE_PRE_ENABLE,
-				       &vf->vf_states))
+		if (!test_and_clear_bit(I40E_VF_STATE_PRE_ENABLE,
+					&vf->vf_states))
 			i40e_notify_client_of_vf_reset(pf, abs_vf_id);
 		vf->num_vlan = 0;
 	}


* [PATCH 4.14 141/159] vfio/pci: Virtualize Maximum Payload Size
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (140 preceding siblings ...)
  2017-12-22  8:47 ` [PATCH 4.14 140/159] i40e: fix client notify of VF reset Greg Kroah-Hartman
@ 2017-12-22  8:47 ` Greg Kroah-Hartman
  2017-12-22  8:47 ` [PATCH 4.14 142/159] ARM: exynos_defconfig: Enable UAS support for Odroid HC1 board Greg Kroah-Hartman
                   ` (23 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:47 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Alex Williamson, Eric Auger, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Alex Williamson <alex.williamson@redhat.com>


[ Upstream commit 523184972b282cd9ca17a76f6ca4742394856818 ]

With virtual PCI-Express chipsets, we now see userspace/guest drivers
trying to match the physical MPS setting to a virtual downstream port.
Of course a lone physical device surrounded by virtual interconnects
cannot make a correct decision for a proper MPS setting.  Instead,
let's virtualize the MPS control register so that writes through to
hardware are disallowed.  Userspace drivers like QEMU assume they can
write anything to the device and we'll filter out anything dangerous.
Since mismatched MPS can lead to AER and other faults, let's add it
to the kernel side rather than relying on userspace virtualization to
handle it.
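
A simplified sketch of what virtualizing the MPS bits means for a guest
write follows. The mask values mirror the PCI_EXP_DEVCTL_* register
definitions, but struct model_reg and model_config_write() are
hypothetical, and FLR handling through a dedicated write function is left
out.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_DEVCTL_PAYLOAD	0x00e0	/* Max Payload Size field */
#define MODEL_DEVCTL_PHANTOM	0x0200	/* Phantom Functions Enable */
#define MODEL_DEVCTL_BCR_FLR	0x8000	/* Function Level Reset */

/* Hypothetical split view of one 16-bit config register. */
struct model_reg {
	uint16_t hw;		/* value held by the physical device */
	uint16_t vconfig;	/* value kept in the virtual config space */
};

/*
 * Apply a guest write of 'val'. Bits in 'virt' land only in vconfig;
 * the remaining bits in 'write' reach hardware; all others are dropped.
 */
static void model_config_write(struct model_reg *r, uint16_t val,
			       uint16_t virt, uint16_t write)
{
	uint16_t v = (uint16_t)(virt & write);
	uint16_t p = (uint16_t)(write & ~virt);

	r->vconfig = (uint16_t)((r->vconfig & ~v) | (val & v));
	r->hw = (uint16_t)((r->hw & ~p) | (val & p));
}
```

With the masks from the patch, a guest write that changes the MPS field is
recorded in the virtual copy only, while an ordinary control bit in the
same write still reaches the device.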

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/vfio/pci/vfio_pci_config.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

--- a/drivers/vfio/pci/vfio_pci_config.c
+++ b/drivers/vfio/pci/vfio_pci_config.c
@@ -849,11 +849,13 @@ static int __init init_pci_cap_exp_perm(
 
 	/*
 	 * Allow writes to device control fields, except devctl_phantom,
-	 * which could confuse IOMMU, and the ARI bit in devctl2, which
+	 * which could confuse IOMMU, MPS, which can break communication
+	 * with other physical devices, and the ARI bit in devctl2, which
 	 * is set at probe time.  FLR gets virtualized via our writefn.
 	 */
 	p_setw(perm, PCI_EXP_DEVCTL,
-	       PCI_EXP_DEVCTL_BCR_FLR, ~PCI_EXP_DEVCTL_PHANTOM);
+	       PCI_EXP_DEVCTL_BCR_FLR | PCI_EXP_DEVCTL_PAYLOAD,
+	       ~PCI_EXP_DEVCTL_PHANTOM);
 	p_setw(perm, PCI_EXP_DEVCTL2, NO_VIRT, ~PCI_EXP_DEVCTL2_ARI);
 	return 0;
 }

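For illustration only, here is a minimal userspace sketch of how a per-register virt/write permission pair can keep a guest's MPS write away from hardware. The struct, the helper name model_write() and the routing logic are assumptions made for this note, not the vfio code itself; the bit values are quoted from memory of pci_regs.h and should be checked against the header.

```c
#include <assert.h>
#include <stdint.h>

/* Bit values as recalled from include/uapi/linux/pci_regs.h (assumption). */
#define DEVCTL_PAYLOAD 0x00e0u /* Max_Payload_Size field */
#define DEVCTL_PHANTOM 0x0200u /* Phantom Functions Enable */
#define DEVCTL_BCR_FLR 0x8000u /* Bridge Config Retry / FLR */

/* Hypothetical model of one 16-bit config register and its permissions. */
struct reg_model {
	uint16_t virt;    /* bits held only in the virtual copy */
	uint16_t write;   /* bits userspace is allowed to write */
	uint16_t vconfig; /* value the guest reads back for virt bits */
	uint16_t hw;      /* value in the physical register */
};

/* Route each writable bit either to the virtual copy or to hardware. */
static void model_write(struct reg_model *r, uint16_t val)
{
	uint16_t to_virt, to_hw;

	assert(r); /* caller must pass a valid model */
	to_virt = r->write & r->virt;
	to_hw = r->write & (uint16_t)~r->virt;

	/* Virtualized bits only ever land in vconfig. */
	r->vconfig = (uint16_t)((r->vconfig & ~to_virt) | (val & to_virt));
	/* Everything else that is writable goes through to hardware. */
	r->hw = (uint16_t)((r->hw & ~to_hw) | (val & to_hw));
}
```

With virt = DEVCTL_BCR_FLR | DEVCTL_PAYLOAD and write = ~DEVCTL_PHANTOM, a guest write that changes the payload field updates only the virtual copy, so the physical MPS stays at whatever the host programmed.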

* [PATCH 4.14 142/159] ARM: exynos_defconfig: Enable UAS support for Odroid HC1 board
From: Greg Kroah-Hartman @ 2017-12-22  8:47 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Marek Szyprowski,
	Krzysztof Kozlowski, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Marek Szyprowski <m.szyprowski@samsung.com>


[ Upstream commit a99897f550de96841aecb811455a67ad7a4e39a7 ]

The Odroid HC1 board has a built-in JMicron USB-to-SATA bridge that supports
the UAS protocol. Build UAS support into the kernel (instead of as a module)
to make sure that all built-in storage devices are available for the rootfs.
The bridge also supports falling back to the standard USB Mass Storage
protocol, but the USB Mass Storage class driver doesn't bind to it when UAS
is built as a module and modules are not (yet) available.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/arm/configs/exynos_defconfig |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/arm/configs/exynos_defconfig
+++ b/arch/arm/configs/exynos_defconfig
@@ -244,7 +244,7 @@ CONFIG_USB_STORAGE_ONETOUCH=m
 CONFIG_USB_STORAGE_KARMA=m
 CONFIG_USB_STORAGE_CYPRESS_ATACB=m
 CONFIG_USB_STORAGE_ENE_UB6250=m
-CONFIG_USB_UAS=m
+CONFIG_USB_UAS=y
 CONFIG_USB_DWC3=y
 CONFIG_USB_DWC2=y
 CONFIG_USB_HSIC_USB3503=y


* [PATCH 4.14 143/159] fm10k: ensure we process SM mbx when processing VF mbx
From: Greg Kroah-Hartman @ 2017-12-22  8:47 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Jacob Keller, Krishneil Singh,
	Jeff Kirsher, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Jacob Keller <jacob.e.keller@intel.com>


[ Upstream commit 17a91809942ca32c70026d2d5ba3348a2c4fdf8f ]

When we process VF mailboxes, the driver is likely going to also queue
up messages to the switch manager. This process merely queues up the
FIFO, but doesn't actually begin the transmission process. Because we
hold the mailbox lock during this VF processing, the PF<->SM mailbox is
not getting processed at this time. Ensure that we actually process the
PF<->SM mailbox in between each PF<->VF mailbox.

This should ensure prompt transmission of the messages queued up after
each VF message is received and handled.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/ethernet/intel/fm10k/fm10k_iov.c |    3 +++
 1 file changed, 3 insertions(+)

--- a/drivers/net/ethernet/intel/fm10k/fm10k_iov.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_iov.c
@@ -126,6 +126,9 @@ process_mbx:
 		struct fm10k_mbx_info *mbx = &vf_info->mbx;
 		u16 glort = vf_info->glort;
 
+		/* process the SM mailbox first to drain outgoing messages */
+		hw->mbx.ops.process(hw, &hw->mbx);
+
 		/* verify port mapping is valid, if not reset port */
 		if (vf_info->vf_flags && !fm10k_glort_valid_pf(hw, glort))
 			hw->iov.ops.reset_lport(hw, vf_info);


* [PATCH 4.14 144/159] ibmvnic: Set state UP
From: Greg Kroah-Hartman @ 2017-12-22  8:47 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Mick Tarsel, David S. Miller, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mick Tarsel <mjtarsel@linux.vnet.ibm.com>


[ Upstream commit e876a8a7e9dd89dc88c12ca2e81beb478dbe9897 ]

The state is initially reported as UNKNOWN. Call netif_carrier_off() before
registering the netdev. Once the device is opened, call netif_carrier_on()
to set the state to UP.

Signed-off-by: Mick Tarsel <mjtarsel@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/ethernet/ibm/ibmvnic.c |    2 ++
 1 file changed, 2 insertions(+)

--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -927,6 +927,7 @@ static int ibmvnic_open(struct net_devic
 	}
 
 	rc = __ibmvnic_open(netdev);
+	netif_carrier_on(netdev);
 	mutex_unlock(&adapter->reset_lock);
 
 	return rc;
@@ -3899,6 +3900,7 @@ static int ibmvnic_probe(struct vio_dev
 	if (rc)
 		goto ibmvnic_init_fail;
 
+	netif_carrier_off(netdev);
 	rc = register_netdev(netdev);
 	if (rc) {
 		dev_err(&dev->dev, "failed to register netdev rc=%d\n", rc);


* [PATCH 4.14 145/159] net: ipv6: send NS for DAD when link operationally up
From: Greg Kroah-Hartman @ 2017-12-22  8:47 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Mike Manning, David S. Miller, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mike Manning <mmanning@brocade.com>


[ Upstream commit 1f372c7bfb23286d2bf4ce0423ab488e86b74bb2 ]

The NS for DAD are sent on admin up as long as a valid qdisc is found.
A race condition exists whereby these packets do not egress the
interface if the operational state of the lower device is not yet up.
The solution is to delay DAD until the link is operationally up
according to RFC2863. Rather than only doing this, follow the existing
code checks by deferring IPv6 device initialization altogether. The fix
allows DAD on devices, such as tunnels, that are controlled by a userspace
control plane. The fix has no impact on regular deployments, but with
port-based network access control there is no IPv6 connectivity until
the port has been opened, which should be desirable.

Signed-off-by: Mike Manning <mmanning@brocade.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/ipv6/addrconf.c |   12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -303,10 +303,10 @@ static struct ipv6_devconf ipv6_devconf_
 	.disable_policy		= 0,
 };
 
-/* Check if a valid qdisc is available */
-static inline bool addrconf_qdisc_ok(const struct net_device *dev)
+/* Check if link is ready: is it up and is a valid qdisc available */
+static inline bool addrconf_link_ready(const struct net_device *dev)
 {
-	return !qdisc_tx_is_noop(dev);
+	return netif_oper_up(dev) && !qdisc_tx_is_noop(dev);
 }
 
 static void addrconf_del_rs_timer(struct inet6_dev *idev)
@@ -451,7 +451,7 @@ static struct inet6_dev *ipv6_add_dev(st
 
 	ndev->token = in6addr_any;
 
-	if (netif_running(dev) && addrconf_qdisc_ok(dev))
+	if (netif_running(dev) && addrconf_link_ready(dev))
 		ndev->if_flags |= IF_READY;
 
 	ipv6_mc_init_dev(ndev);
@@ -3404,7 +3404,7 @@ static int addrconf_notify(struct notifi
 			/* restore routes for permanent addresses */
 			addrconf_permanent_addr(dev);
 
-			if (!addrconf_qdisc_ok(dev)) {
+			if (!addrconf_link_ready(dev)) {
 				/* device is not ready yet. */
 				pr_info("ADDRCONF(NETDEV_UP): %s: link is not ready\n",
 					dev->name);
@@ -3419,7 +3419,7 @@ static int addrconf_notify(struct notifi
 				run_pending = 1;
 			}
 		} else if (event == NETDEV_CHANGE) {
-			if (!addrconf_qdisc_ok(dev)) {
+			if (!addrconf_link_ready(dev)) {
 				/* device is still not ready. */
 				break;
 			}

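As a rough illustration of the race described in the commit message, the sketch below contrasts the old and new readiness predicates on a made-up device model. struct fake_netdev and both helper names are hypothetical stand-ins for netif_oper_up() and qdisc_tx_is_noop(); this is not the kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the device state the kernel helpers inspect. */
struct fake_netdev {
	bool oper_up;    /* models netif_oper_up(dev) */
	bool qdisc_noop; /* models qdisc_tx_is_noop(dev) */
};

/* Old predicate: a usable qdisc was enough. */
static bool qdisc_ok(const struct fake_netdev *dev)
{
	assert(dev);
	return !dev->qdisc_noop;
}

/* New predicate: the link must also be operationally up. */
static bool link_ready(const struct fake_netdev *dev)
{
	assert(dev);
	return dev->oper_up && !dev->qdisc_noop;
}
```

A device that has a qdisc attached but whose lower link is still down passes qdisc_ok() and fails link_ready(), which is the window in which the DAD probes would have been lost.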

* [PATCH 4.14 146/159] RDMA/hns: Avoid NULL pointer exception
From: Greg Kroah-Hartman @ 2017-12-22  8:47 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Wei Hu (Xavier),
	Lijun Ou, Shaobo Xu, Doug Ledford, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: "Wei Hu(Xavier)" <xavier.huwei@huawei.com>


[ Upstream commit 5e437b1d7e8d31ff9a4b8e898eb3a6cee309edd9 ]

After the loop in hns_roce_v1_mr_free_work_fn function, it is possible that
all qps will have been freed (in which case ne will be 0).  If that
happens, then later in the function when we dereference hr_qp we will
get an exception.  Check ne is not 0 to make sure we actually have an
hr_qp left to work on.

This patch fixes the smatch error as below:
drivers/infiniband/hw/hns/hns_roce_hw_v1.c:1009 hns_roce_v1_mr_free_work_fn()
error: we previously assumed 'hr_qp' could be null

Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Shaobo Xu <xushaobo2@huawei.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/infiniband/hw/hns/hns_roce_hw_v1.c |    5 +++++
 1 file changed, 5 insertions(+)

--- a/drivers/infiniband/hw/hns/hns_roce_hw_v1.c
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v1.c
@@ -1001,6 +1001,11 @@ static void hns_roce_v1_mr_free_work_fn(
 		}
 	}
 
+	if (!ne) {
+		dev_err(dev, "Reseved loop qp is absent!\n");
+		goto free_work;
+	}
+
 	do {
 		ret = hns_roce_v1_poll_cq(&mr_free_cq->ib_cq, ne, wc);
 		if (ret < 0) {

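The pattern behind this fix can be sketched in plain C as follows. This is a simplified, hypothetical model (struct fake_qp and last_usable_qpn() are invented for this note) of a loop that may leave both the count and the pointer empty, followed by the guard that prevents the dereference.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a queue pair. */
struct fake_qp {
	int qpn;
};

/*
 * Return the qpn of the last usable QP in the array, or -1 when every
 * slot has already been freed. 'ne' counts the surviving entries.
 */
static int last_usable_qpn(struct fake_qp **qps, size_t n)
{
	const struct fake_qp *hr_qp = NULL;
	size_t ne = 0;

	assert(qps || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (!qps[i])
			continue; /* slot already freed */
		hr_qp = qps[i];
		ne++;
	}

	if (!ne)
		return -1; /* guard: hr_qp is still NULL here */

	return hr_qp->qpn;
}
```

Without the !ne check, the final dereference would run with hr_qp == NULL whenever all slots were empty, which is the case smatch flagged.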

* [PATCH 4.14 147/159] staging: greybus: light: Release memory obtained by kasprintf
From: Greg Kroah-Hartman @ 2017-12-22  8:47 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Arvind Yadav, Rui Miguel Silva, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Arvind Yadav <arvind.yadav.cs@gmail.com>


[ Upstream commit 04820da21050b35eed68aa046115d810163ead0c ]

Free the memory region if gb_lights_channel_config() is not successful.

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Reviewed-by: Rui Miguel Silva <rmfrfs@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/staging/greybus/light.c |    2 ++
 1 file changed, 2 insertions(+)

--- a/drivers/staging/greybus/light.c
+++ b/drivers/staging/greybus/light.c
@@ -925,6 +925,8 @@ static void __gb_lights_led_unregister(s
 		return;
 
 	led_classdev_unregister(cdev);
+	kfree(cdev->name);
+	cdev->name = NULL;
 	channel->led = NULL;
 }
 

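A userspace analogue of the ownership rule this hunk enforces might look like the sketch below. format_name() is a hypothetical stand-in for kasprintf() and struct fake_cdev is invented for the example; the point is only that a heap-formatted name must be released by whoever tears the object down.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for an LED class device that owns its name. */
struct fake_cdev {
	char *name;
};

/* Stand-in for kasprintf(): returns a heap string the caller must free. */
static char *format_name(const char *base, int id)
{
	int len = snprintf(NULL, 0, "%s:%d", base, id);
	char *buf;

	if (len < 0)
		return NULL;
	buf = malloc((size_t)len + 1);
	if (buf)
		snprintf(buf, (size_t)len + 1, "%s:%d", base, id);
	return buf;
}

/* Teardown releases the name and clears the pointer. */
static void fake_unregister(struct fake_cdev *cdev)
{
	assert(cdev);
	free(cdev->name);  /* free(NULL) is a no-op, like kfree(NULL) */
	cdev->name = NULL; /* a repeated call cannot double free */
}
```

Clearing the pointer after the free mirrors the cdev->name = NULL line in the patch and makes the teardown safe to call twice.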

* [PATCH 4.14 148/159] clk: sunxi-ng: sun6i: Rename HDMI DDC clock to avoid name collision
From: Greg Kroah-Hartman @ 2017-12-22  8:47 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Chen-Yu Tsai, Maxime Ripard, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Chen-Yu Tsai <wens@csie.org>


[ Upstream commit 7f3ed79188f2f094d0ee366fa858857fb7f511ba ]

The HDMI DDC clock found in the CCU is the parent of the actual DDC
clock within the HDMI controller. That clock is also named "hdmi-ddc".

Rename the one in the CCU to "ddc". This makes more sense than renaming
the one in the HDMI controller to something else.

Fixes: c6e6c96d8fa6 ("clk: sunxi-ng: Add A31/A31s clocks")
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/clk/sunxi-ng/ccu-sun6i-a31.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/clk/sunxi-ng/ccu-sun6i-a31.c
+++ b/drivers/clk/sunxi-ng/ccu-sun6i-a31.c
@@ -608,7 +608,7 @@ static SUNXI_CCU_M_WITH_MUX_GATE(hdmi_cl
 				 0x150, 0, 4, 24, 2, BIT(31),
 				 CLK_SET_RATE_PARENT);
 
-static SUNXI_CCU_GATE(hdmi_ddc_clk, "hdmi-ddc", "osc24M", 0x150, BIT(30), 0);
+static SUNXI_CCU_GATE(hdmi_ddc_clk, "ddc", "osc24M", 0x150, BIT(30), 0);
 
 static SUNXI_CCU_GATE(ps_clk, "ps", "lcd1-ch1", 0x140, BIT(31), 0);
 


* [PATCH 4.14 149/159] tcp: fix under-evaluated ssthresh in TCP Vegas
From: Greg Kroah-Hartman @ 2017-12-22  8:47 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Hoang Tran, David S. Miller, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Hoang Tran <tranviethoang.vn@gmail.com>


[ Upstream commit cf5d74b85ef40c202c76d90959db4d850f301b95 ]

Since commit 76174004a0f19785 ("tcp: do not slow start when cwnd equals
ssthresh"), comparing against the reduced cwnd (snd_cwnd - 1) in
tcp_vegas_ssthresh() under-evaluates the ssthresh.

Signed-off-by: Hoang Tran <hoang.tran@uclouvain.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/ipv4/tcp_vegas.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/net/ipv4/tcp_vegas.c
+++ b/net/ipv4/tcp_vegas.c
@@ -158,7 +158,7 @@ EXPORT_SYMBOL_GPL(tcp_vegas_cwnd_event);
 
 static inline u32 tcp_vegas_ssthresh(struct tcp_sock *tp)
 {
-	return  min(tp->snd_ssthresh, tp->snd_cwnd-1);
+	return  min(tp->snd_ssthresh, tp->snd_cwnd);
 }
 
 static void tcp_vegas_cong_avoid(struct sock *sk, u32 ack, u32 acked)

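To make the off-by-one concrete, here is a small standalone sketch of the two variants. The function names are invented for the comparison and take plain integers rather than a struct tcp_sock. My reading of the referenced commit is that slow start now requires cwnd to be strictly below ssthresh, so the extra "- 1" is no longer needed to leave slow start; treat that as an assumption.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t min_u32(uint32_t a, uint32_t b)
{
	return a < b ? a : b;
}

/* Before the patch: clamp against cwnd - 1. */
static uint32_t vegas_ssthresh_old(uint32_t ssthresh, uint32_t cwnd)
{
	assert(cwnd >= 1); /* cwnd - 1 would wrap around at zero */
	return min_u32(ssthresh, cwnd - 1);
}

/* After the patch: clamp against cwnd itself. */
static uint32_t vegas_ssthresh_new(uint32_t ssthresh, uint32_t cwnd)
{
	return min_u32(ssthresh, cwnd);
}
```

For ssthresh = 20 and cwnd = 10 the old variant yields 9 and the new one yields 10; when ssthresh is already below cwnd both return ssthresh unchanged.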

* [PATCH 4.14 150/159] rtc: set the alarm to the next expiring timer
From: Greg Kroah-Hartman @ 2017-12-22  8:47 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Alexandre Belloni, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Alexandre Belloni <alexandre.belloni@free-electrons.com>


[ Upstream commit 74717b28cb32e1ad3c1042cafd76b264c8c0f68d ]

If there is any non-expired timer in the queue, the RTC alarm is never set.
This is an issue when adding a timer that expires before the next
non-expired timer.

Ensure the RTC alarm is set in that case.

Fixes: 2b2f5ff00f63 ("rtc: interface: ignore expired timers when enqueuing new timers")
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/rtc/interface.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/rtc/interface.c
+++ b/drivers/rtc/interface.c
@@ -779,7 +779,7 @@ static int rtc_timer_enqueue(struct rtc_
 	}
 
 	timerqueue_add(&rtc->timerqueue, &timer->node);
-	if (!next) {
+	if (!next || ktime_before(timer->node.expires, next->expires)) {
 		struct rtc_wkalrm alarm;
 		int err;
 		alarm.time = rtc_ktime_to_tm(timer->node.expires);

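The changed condition can be modelled in isolation as below. This is a sketch with hypothetical helper names; expiry times are plain integers in arbitrary ticks rather than ktime_t, and a NULL pointer stands for "no non-expired timer is queued".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* next: expiry of the earliest non-expired queued timer, or NULL if none. */
static bool program_alarm_old(const int64_t *next)
{
	return next == NULL;
}

/* New rule: also reprogram when the new timer expires before 'next'. */
static bool program_alarm_new(const int64_t *next, int64_t new_expires)
{
	return next == NULL || new_expires < *next;
}

/* Illustrative checks: a timer queued at tick 100, new ones at 50 and 150. */
static void example(void)
{
	const int64_t queued = 100;

	assert(!program_alarm_old(&queued));      /* alarm never updated */
	assert(program_alarm_new(&queued, 50));   /* earlier timer wins */
	assert(!program_alarm_new(&queued, 150)); /* later timer waits */
}
```

The first check in example() is the bug from the commit message: with the old rule, a timer that should fire at tick 50 never gets the hardware alarm programmed for it.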

* [PATCH 4.14 151/159] cpuidle: fix broadcast control when broadcast can not be entered
From: Greg Kroah-Hartman @ 2017-12-22  8:47 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Nicholas Piggin, Thomas Gleixner,
	Rafael J. Wysocki, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Nicholas Piggin <npiggin@gmail.com>


[ Upstream commit f187851b9b4a76952b1158b86434563dd2031103 ]

When failing to enter broadcast timer mode for an idle state that
requires it, a new state is selected that does not require broadcast,
but the broadcast variable remains set. This causes
tick_broadcast_exit to be called despite not having entered broadcast
mode.

This triggers the WARN_ON_ONCE(!irqs_disabled()) in some cases. It does
not appear to cause problems for today's code, but it seems to violate
the interface, so it should be fixed.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/cpuidle/cpuidle.c |    1 +
 1 file changed, 1 insertion(+)

--- a/drivers/cpuidle/cpuidle.c
+++ b/drivers/cpuidle/cpuidle.c
@@ -208,6 +208,7 @@ int cpuidle_enter_state(struct cpuidle_d
 			return -EBUSY;
 		}
 		target_state = &drv->states[index];
+		broadcast = false;
 	}
 
 	/* Take note of the planned idle state. */

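The control flow at issue can be reduced to a small model. The function below is a hypothetical simplification written for this note (it is not cpuidle_enter_state()); it only tracks whether the exit hook would run, given whether the state needs broadcast, whether entering broadcast fails, and whether the flag is cleared on fallback.

```c
#include <assert.h>
#include <stdbool.h>

/* Returns true if the modelled broadcast exit hook would run. */
static bool exit_hook_runs(bool needs_broadcast, bool enter_fails,
			   bool clear_on_fallback)
{
	bool broadcast = needs_broadcast;

	if (broadcast && enter_fails) {
		/* Fall back to a state that keeps the local timer running. */
		if (clear_on_fallback)
			broadcast = false;
	}

	/* ... the idle state would be entered and left here ... */

	return broadcast; /* the exit hook is gated on this flag */
}

/* Illustrative checks for the three interesting paths. */
static void example(void)
{
	assert(exit_hook_runs(true, true, false)); /* before: unbalanced exit */
	assert(!exit_hook_runs(true, true, true)); /* after: no exit call */
	assert(exit_hook_runs(true, false, true)); /* normal path unchanged */
}
```

The one-line fix corresponds to passing clear_on_fallback = true: enter and exit stay paired even when the first choice of idle state is abandoned.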

* [PATCH 4.14 152/159] drm/vc4: Avoid using vrefresh==0 mode in DSI htotal math.
From: Greg Kroah-Hartman @ 2017-12-22  8:47 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Eric Anholt, Andrzej Hajda, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Anholt <eric@anholt.net>


[ Upstream commit af2eca53206c59ce9308a4f5f46c4a104a179b6b ]

The incoming mode might be missing its vrefresh field if it came from
drmModeSetCrtc(); the kernel is supposed to calculate the value using
drm_mode_vrefresh().  We could either use that or the adjusted_mode's
original vrefresh value.

However, we can maintain a more exact vrefresh value (not just the
integer approximation), by scaling by the ratio of our clocks.

v2: Use math suggested by Andrzej Hajda instead.
v3: Simplify math now that adjusted_mode->clock isn't padded.
v4: Drop some parens.

Signed-off-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20170815234722.20700-2-eric@anholt.net
Reviewed-by: Andrzej Hajda <a.hajda@samsung.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/gpu/drm/vc4/vc4_dsi.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/drivers/gpu/drm/vc4/vc4_dsi.c
+++ b/drivers/gpu/drm/vc4/vc4_dsi.c
@@ -866,7 +866,8 @@ static bool vc4_dsi_encoder_mode_fixup(s
 	adjusted_mode->clock = pixel_clock_hz / 1000 + 1;
 
 	/* Given the new pixel clock, adjust HFP to keep vrefresh the same. */
-	adjusted_mode->htotal = pixel_clock_hz / (mode->vrefresh * mode->vtotal);
+	adjusted_mode->htotal = adjusted_mode->clock * mode->htotal /
+				mode->clock;
 	adjusted_mode->hsync_end += adjusted_mode->htotal - mode->htotal;
 	adjusted_mode->hsync_start += adjusted_mode->htotal - mode->htotal;
 

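The new htotal computation is plain ratio scaling, which the sketch below works through on made-up numbers. scaled_htotal() is a hypothetical helper for this note; unlike the kernel expression it widens the product to long long, purely to keep the example safe from overflow.

```c
#include <assert.h>

/*
 * Scale htotal by the ratio of the adjusted pixel clock to the requested
 * one. Clocks are in kHz, as in struct drm_display_mode.
 */
static int scaled_htotal(int adjusted_clock, int htotal, int clock)
{
	assert(clock > 0); /* the requested mode must carry a pixel clock */
	return (int)((long long)adjusted_clock * htotal / clock);
}
```

With an invented mode of clock = 30000 kHz and htotal = 900, an adjusted clock of 30721 kHz gives 30721 * 900 / 30000 = 921 after integer truncation, so the horizontal front porch grows by 21 pixels. Nothing in this expression depends on vrefresh, so a mode with vrefresh == 0 no longer leads to a division by zero.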

* [PATCH 4.14 153/159] IB/opa_vnic: Properly clear Mac Table Digest
From: Greg Kroah-Hartman @ 2017-12-22  8:47 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Niranjana Vishwanathapura,
	Scott Franco, Dennis Dalessandro, Doug Ledford, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Scott Franco <safranco@intel.com>


[ Upstream commit 4bbdfe25600c1909c26747d0b5c39fd0e409bb87 ]

Clear the MAC table digest when the MAC table is freed.

Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Scott Franco <safranco@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/infiniband/ulp/opa_vnic/opa_vnic_encap.c |    1 +
 1 file changed, 1 insertion(+)

--- a/drivers/infiniband/ulp/opa_vnic/opa_vnic_encap.c
+++ b/drivers/infiniband/ulp/opa_vnic/opa_vnic_encap.c
@@ -139,6 +139,7 @@ void opa_vnic_release_mac_tbl(struct opa
 	rcu_assign_pointer(adapter->mactbl, NULL);
 	synchronize_rcu();
 	opa_vnic_free_mac_tbl(mactbl);
+	adapter->info.vport.mac_tbl_digest = 0;
 	mutex_unlock(&adapter->mactbl_lock);
 }
 


* [PATCH 4.14 154/159] IB/opa_vnic: Properly return the total MACs in UC MAC list
From: Greg Kroah-Hartman @ 2017-12-22  8:47 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Sudeep Dutt,
	Niranjana Vishwanathapura, Dennis Dalessandro, Doug Ledford,
	Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>


[ Upstream commit b77eb45e0d9c324245d165656ab3b38b6f386436 ]

Do not include the EM-specified MAC address in the total MAC count of
the UC MAC list.

Reviewed-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/infiniband/ulp/opa_vnic/opa_vnic_vema_iface.c |    8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

--- a/drivers/infiniband/ulp/opa_vnic/opa_vnic_vema_iface.c
+++ b/drivers/infiniband/ulp/opa_vnic/opa_vnic_vema_iface.c
@@ -348,7 +348,7 @@ void opa_vnic_query_mcast_macs(struct op
 void opa_vnic_query_ucast_macs(struct opa_vnic_adapter *adapter,
 			       struct opa_veswport_iface_macs *macs)
 {
-	u16 start_idx, tot_macs, num_macs, idx = 0, count = 0;
+	u16 start_idx, tot_macs, num_macs, idx = 0, count = 0, em_macs = 0;
 	struct netdev_hw_addr *ha;
 
 	start_idx = be16_to_cpu(macs->start_idx);
@@ -359,8 +359,10 @@ void opa_vnic_query_ucast_macs(struct op
 
 		/* Do not include EM specified MAC address */
 		if (!memcmp(adapter->info.vport.base_mac_addr, ha->addr,
-			    ARRAY_SIZE(adapter->info.vport.base_mac_addr)))
+			    ARRAY_SIZE(adapter->info.vport.base_mac_addr))) {
+			em_macs++;
 			continue;
+		}
 
 		if (start_idx > idx++)
 			continue;
@@ -383,7 +385,7 @@ void opa_vnic_query_ucast_macs(struct op
 	}
 
 	tot_macs = netdev_hw_addr_list_count(&adapter->netdev->dev_addrs) +
-		   netdev_uc_count(adapter->netdev);
+		   netdev_uc_count(adapter->netdev) - em_macs;
 	macs->tot_macs_in_lst = cpu_to_be16(tot_macs);
 	macs->num_macs_in_msg = cpu_to_be16(count);
 	macs->gen_count = cpu_to_be16(adapter->info.vport.uc_macs_gen_count);

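The accounting change amounts to subtracting the skipped entries from the reported total. A minimal sketch, assuming a flat array of addresses instead of the kernel's netdev address lists (struct mac and reportable_macs() are invented for this note):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one 6-byte hardware address. */
struct mac {
	unsigned char b[6];
};

/* Count list entries, excluding those equal to the EM-assigned base MAC. */
static size_t reportable_macs(const struct mac *list, size_t n,
			      const struct mac *base)
{
	size_t em_macs = 0;

	assert(base && (list || n == 0));
	for (size_t i = 0; i < n; i++) {
		if (!memcmp(list[i].b, base->b, sizeof(base->b)))
			em_macs++; /* skipped in the message, so skip in the total */
	}
	return n - em_macs;
}
```

Before the fix, the total still counted entries that the loop had deliberately left out of the message, so the two numbers reported to the peer could disagree.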

* [PATCH 4.14 155/159] thermal/drivers/hisi: Fix missing interrupt enablement
From: Greg Kroah-Hartman @ 2017-12-22  8:47 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Daniel Lezcano, Leo Yan,
	Eduardo Valentin, Kevin Wangtao

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Daniel Lezcano <daniel.lezcano@linaro.org>

commit c176b10b025acee4dc8f2ab1cd64eb73b5ccef53 upstream.

The interrupt for the temperature threshold is not enabled at the end of the
probe function; enable it after the setup is complete.

On the other hand, irq_enabled is not set correctly: we check whether the
interrupt is masked, where 'yes' means irq_enabled=false.

	irq_get_irqchip_state(data->irq, IRQCHIP_STATE_MASKED,
				&data->irq_enabled);

As we are always enabling the interrupt, it is pointless to check whether
the interrupt is masked; just set irq_enabled to 'true'.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Tested-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Kevin Wangtao <kevin.wangtao@hisilicon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/thermal/hisi_thermal.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

--- a/drivers/thermal/hisi_thermal.c
+++ b/drivers/thermal/hisi_thermal.c
@@ -345,8 +345,7 @@ static int hisi_thermal_probe(struct pla
 	}
 
 	hisi_thermal_enable_bind_irq_sensor(data);
-	irq_get_irqchip_state(data->irq, IRQCHIP_STATE_MASKED,
-			      &data->irq_enabled);
+	data->irq_enabled = true;
 
 	for (i = 0; i < HISI_MAX_SENSORS; ++i) {
 		ret = hisi_thermal_register_sensor(pdev, data,
@@ -358,6 +357,8 @@ static int hisi_thermal_probe(struct pla
 			hisi_thermal_toggle_sensor(&data->sensors[i], true);
 	}
 
+	enable_irq(data->irq);
+
 	return 0;
 }
 

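The inverted flag described in the commit message can be shown with a toy model. query_masked() below is a hypothetical stand-in for asking the irqchip about the MASKED state; it is not the kernel API and only illustrates why storing its result in an "enabled" flag is backwards.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in: report whether the interrupt line is masked. */
static void query_masked(bool line_is_masked, bool *state)
{
	assert(state);
	*state = line_is_masked;
}

/* Old behaviour: the "masked" answer is stored as if it meant "enabled". */
static bool irq_enabled_old(bool line_is_masked)
{
	bool irq_enabled;

	query_masked(line_is_masked, &irq_enabled);
	return irq_enabled;
}

/* New behaviour: probe always enables the interrupt, so record that. */
static bool irq_enabled_new(void)
{
	return true;
}
```

In this model an unmasked (that is, live) line is recorded as irq_enabled == false by the old code, which is the opposite of what the rest of the driver expects.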

* [PATCH 4.14 156/159] thermal/drivers/hisi: Fix kernel panic on alarm interrupt
From: Greg Kroah-Hartman @ 2017-12-22  8:47 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Daniel Lezcano, Leo Yan,
	Eduardo Valentin, Kevin Wangtao

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Daniel Lezcano <daniel.lezcano@linaro.org>

commit 2cb4de785c40d4a2132cfc13e63828f5a28c3351 upstream.

The threaded interrupt for the alarm is requested before the temperature
controller is set up. The controller can fire an interrupt immediately,
leading to a kernel panic as the sensor data is not initialized.

To prevent that, request the threaded irq after the Tsensor is set up.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Tested-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Kevin Wangtao <kevin.wangtao@hisilicon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/thermal/hisi_thermal.c |   18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

--- a/drivers/thermal/hisi_thermal.c
+++ b/drivers/thermal/hisi_thermal.c
@@ -317,15 +317,6 @@ static int hisi_thermal_probe(struct pla
 	if (data->irq < 0)
 		return data->irq;
 
-	ret = devm_request_threaded_irq(&pdev->dev, data->irq,
-					hisi_thermal_alarm_irq,
-					hisi_thermal_alarm_irq_thread,
-					0, "hisi_thermal", data);
-	if (ret < 0) {
-		dev_err(&pdev->dev, "failed to request alarm irq: %d\n", ret);
-		return ret;
-	}
-
 	platform_set_drvdata(pdev, data);
 
 	data->clk = devm_clk_get(&pdev->dev, "thermal_clk");
@@ -357,6 +348,15 @@ static int hisi_thermal_probe(struct pla
 			hisi_thermal_toggle_sensor(&data->sensors[i], true);
 	}
 
+	ret = devm_request_threaded_irq(&pdev->dev, data->irq,
+					hisi_thermal_alarm_irq,
+					hisi_thermal_alarm_irq_thread,
+					0, "hisi_thermal", data);
+	if (ret < 0) {
+		dev_err(&pdev->dev, "failed to request alarm irq: %d\n", ret);
+		return ret;
+	}
+
 	enable_irq(data->irq);
 
 	return 0;

^ permalink raw reply	[flat|nested] 349+ messages in thread

* [PATCH 4.14 157/159] thermal/drivers/hisi: Simplify the temperature/step computation
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (156 preceding siblings ...)
  2017-12-22  8:47 ` [PATCH 4.14 156/159] thermal/drivers/hisi: Fix kernel panic on alarm interrupt Greg Kroah-Hartman
@ 2017-12-22  8:47 ` Greg Kroah-Hartman
  2017-12-22  8:47 ` [PATCH 4.14 158/159] thermal/drivers/hisi: Fix multiple alarm interrupts firing Greg Kroah-Hartman
                   ` (7 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:47 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Daniel Lezcano, Leo Yan,
	Eduardo Valentin, Kevin Wangtao

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Daniel Lezcano <daniel.lezcano@linaro.org>

commit 48880b979cdc9ef5a70af020f42b8ba1e51dbd34 upstream.

The step and the base temperature are fixed values, so we can simplify the
computation by converting the base temperature to millicelsius and using a
pre-computed step value. That saves a lot of needless multiplications and
divisions at runtime.

Also take the opportunity to change the function names to be consistent with
the rest of the code.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Tested-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Kevin Wangtao <kevin.wangtao@hisilicon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/thermal/hisi_thermal.c |   41 ++++++++++++++++++++++++++++-------------
 1 file changed, 28 insertions(+), 13 deletions(-)

--- a/drivers/thermal/hisi_thermal.c
+++ b/drivers/thermal/hisi_thermal.c
@@ -35,8 +35,9 @@
 #define TEMP0_RST_MSK			(0x1C)
 #define TEMP0_VALUE			(0x28)
 
-#define HISI_TEMP_BASE			(-60)
+#define HISI_TEMP_BASE			(-60000)
 #define HISI_TEMP_RESET			(100000)
+#define HISI_TEMP_STEP			(784)
 
 #define HISI_MAX_SENSORS		4
 
@@ -61,19 +62,32 @@ struct hisi_thermal_data {
 	void __iomem *regs;
 };
 
-/* in millicelsius */
-static inline int _step_to_temp(int step)
+/*
+ * The temperature computation on the tsensor is as follow:
+ *	Unit: millidegree Celsius
+ *	Step: 255/200 (0.7843)
+ *	Temperature base: -60°C
+ *
+ * The register is programmed in temperature steps, every step is 784
+ * millidegree and begins at -60 000 m°C
+ *
+ * The temperature from the steps:
+ *
+ *	Temp = TempBase + (steps x 784)
+ *
+ * and the steps from the temperature:
+ *
+ *	steps = (Temp - TempBase) / 784
+ *
+ */
+static inline int hisi_thermal_step_to_temp(int step)
 {
-	/*
-	 * Every step equals (1 * 200) / 255 celsius, and finally
-	 * need convert to millicelsius.
-	 */
-	return (HISI_TEMP_BASE * 1000 + (step * 200000 / 255));
+	return HISI_TEMP_BASE + (step * HISI_TEMP_STEP);
 }
 
-static inline long _temp_to_step(long temp)
+static inline long hisi_thermal_temp_to_step(long temp)
 {
-	return ((temp - HISI_TEMP_BASE * 1000) * 255) / 200000;
+	return (temp - HISI_TEMP_BASE) / HISI_TEMP_STEP;
 }
 
 static long hisi_thermal_get_sensor_temp(struct hisi_thermal_data *data,
@@ -99,7 +113,7 @@ static long hisi_thermal_get_sensor_temp
 	usleep_range(3000, 5000);
 
 	val = readl(data->regs + TEMP0_VALUE);
-	val = _step_to_temp(val);
+	val = hisi_thermal_step_to_temp(val);
 
 	mutex_unlock(&data->thermal_lock);
 
@@ -126,10 +140,11 @@ static void hisi_thermal_enable_bind_irq
 	writel((sensor->id << 12), data->regs + TEMP0_CFG);
 
 	/* enable for interrupt */
-	writel(_temp_to_step(sensor->thres_temp) | 0x0FFFFFF00,
+	writel(hisi_thermal_temp_to_step(sensor->thres_temp) | 0x0FFFFFF00,
 	       data->regs + TEMP0_TH);
 
-	writel(_temp_to_step(HISI_TEMP_RESET), data->regs + TEMP0_RST_TH);
+	writel(hisi_thermal_temp_to_step(HISI_TEMP_RESET),
+	       data->regs + TEMP0_RST_TH);
 
 	/* enable module */
 	writel(0x1, data->regs + TEMP0_RST_MSK);
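
As a worked check of the simplification above, the old and new conversions
can be compared in user space. The constants and formulas are copied from
the patch; the demo_* function names and the sample values are chosen here
for illustration only. Since 200000 / 255 is roughly 784.3, the integer step
of 784 drifts slightly at high step counts.

```c
#include <assert.h>

#define HISI_TEMP_BASE	(-60000)	/* millidegrees Celsius, from the patch */
#define HISI_TEMP_STEP	(784)		/* millidegrees per step, from the patch */

/* New conversions, as added by the patch. */
static int demo_step_to_temp_new(int step)
{
	return HISI_TEMP_BASE + (step * HISI_TEMP_STEP);
}

static long demo_temp_to_step_new(long temp)
{
	return (temp - HISI_TEMP_BASE) / HISI_TEMP_STEP;
}

/* Old conversions, as removed by the patch (base was -60 degrees). */
static int demo_step_to_temp_old(int step)
{
	return (-60 * 1000 + (step * 200000 / 255));
}

static long demo_temp_to_step_old(long temp)
{
	return ((temp - (-60) * 1000) * 255) / 200000;
}

static void demo_conversion_example(void)
{
	/* Step 0 is the base temperature in both forms. */
	assert(demo_step_to_temp_old(0) == -60000);
	assert(demo_step_to_temp_new(0) == -60000);

	/* At step 255 the pre-computed step is 80 millidegrees lower. */
	assert(demo_step_to_temp_old(255) == 140000);
	assert(demo_step_to_temp_new(255) == 139920);

	/* Both forms map 65000 and HISI_TEMP_RESET (100000) to the same step. */
	assert(demo_temp_to_step_old(65000) == 159);
	assert(demo_temp_to_step_new(65000) == 159);
	assert(demo_temp_to_step_old(100000) == 204);
	assert(demo_temp_to_step_new(100000) == 204);
}
```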

^ permalink raw reply	[flat|nested] 349+ messages in thread

* [PATCH 4.14 158/159] thermal/drivers/hisi: Fix multiple alarm interrupts firing
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (157 preceding siblings ...)
  2017-12-22  8:47 ` [PATCH 4.14 157/159] thermal/drivers/hisi: Simplify the temperature/step computation Greg Kroah-Hartman
@ 2017-12-22  8:47 ` Greg Kroah-Hartman
  2017-12-22  8:47 ` [PATCH 4.14 159/159] platform/x86: asus-wireless: send an EV_SYN/SYN_REPORT between state changes Greg Kroah-Hartman
                   ` (6 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:47 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Daniel Lezcano, Leo Yan,
	Eduardo Valentin, Kevin Wangtao

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Daniel Lezcano <daniel.lezcano@linaro.org>

commit db2b0332608c8e648ea1e44727d36ad37cdb56cb upstream.

The DT specifies a threshold of 65000, but we set up the register with a
value in the controller's temperature resolution, 64656.

When we reach 64656, the interrupt fires and is disabled. Then the irq
thread runs and calls thermal_zone_device_update(), which in turn calls
hisi_thermal_get_temp().

That function checks whether the temperature has decreased, assuming it was
above 65000, but that is not the case because the current temperature is
64656 (because of the rounding when setting the threshold). This condition
being true, we re-enable the interrupt, which fires immediately after
exiting the irq thread. That happens again and again until the temperature
rises above 65000.

This can potentially become an interrupt storm if the temperature
stabilizes at this value. A very unlikely case, but possible.

In any case, it does not make sense to handle dozens of alarm interrupts for
nothing.

Fix this by rounding the threshold value to the controller resolution so the
check against the threshold is consistent with the one set in the controller.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Tested-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Kevin Wangtao <kevin.wangtao@hisilicon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/thermal/hisi_thermal.c |   10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

--- a/drivers/thermal/hisi_thermal.c
+++ b/drivers/thermal/hisi_thermal.c
@@ -90,6 +90,12 @@ static inline long hisi_thermal_temp_to_
 	return (temp - HISI_TEMP_BASE) / HISI_TEMP_STEP;
 }
 
+static inline long hisi_thermal_round_temp(int temp)
+{
+	return hisi_thermal_step_to_temp(
+		hisi_thermal_temp_to_step(temp));
+}
+
 static long hisi_thermal_get_sensor_temp(struct hisi_thermal_data *data,
 					 struct hisi_thermal_sensor *sensor)
 {
@@ -245,7 +251,7 @@ static irqreturn_t hisi_thermal_alarm_ir
 	sensor = &data->sensors[data->irq_bind_sensor];
 
 	dev_crit(&data->pdev->dev, "THERMAL ALARM: T > %d\n",
-		 sensor->thres_temp / 1000);
+		 sensor->thres_temp);
 	mutex_unlock(&data->thermal_lock);
 
 	for (i = 0; i < HISI_MAX_SENSORS; i++) {
@@ -284,7 +290,7 @@ static int hisi_thermal_register_sensor(
 
 	for (i = 0; i < of_thermal_get_ntrips(sensor->tzd); i++) {
 		if (trip[i].type == THERMAL_TRIP_PASSIVE) {
-			sensor->thres_temp = trip[i].temperature;
+			sensor->thres_temp = hisi_thermal_round_temp(trip[i].temperature);
 			break;
 		}
 	}
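
The arithmetic behind the 65000 versus 64656 mismatch can be reproduced in
user space. The constants and the rounding helper follow the patch; the
demo_* names are illustrative, and demo_should_reenable_irq() is only an
assumed model of the "temperature decreased" check in
hisi_thermal_get_temp(), which is not shown in this diff.

```c
#include <assert.h>
#include <stdbool.h>

#define HISI_TEMP_BASE	(-60000)
#define HISI_TEMP_STEP	(784)

static int demo_step_to_temp(int step)
{
	return HISI_TEMP_BASE + (step * HISI_TEMP_STEP);
}

static long demo_temp_to_step(long temp)
{
	return (temp - HISI_TEMP_BASE) / HISI_TEMP_STEP;
}

/* Same shape as hisi_thermal_round_temp() in the patch. */
static long demo_round_temp(int temp)
{
	return demo_step_to_temp(demo_temp_to_step(temp));
}

/* Assumed model: re-enable the alarm once the reading is below threshold. */
static bool demo_should_reenable_irq(long cur_temp, long thres_temp)
{
	return cur_temp < thres_temp;
}

static void demo_rounding_example(void)
{
	/* (65000 + 60000) / 784 = 159 steps, and 159 * 784 - 60000 = 64656. */
	assert(demo_temp_to_step(65000) == 159);
	assert(demo_round_temp(65000) == 64656);

	/* Unrounded threshold: the alarm is re-armed while still at 64656. */
	assert(demo_should_reenable_irq(64656, 65000));

	/* Rounded threshold: the same reading no longer re-arms the alarm. */
	assert(!demo_should_reenable_irq(64656, demo_round_temp(65000)));
}
```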

^ permalink raw reply	[flat|nested] 349+ messages in thread

* [PATCH 4.14 159/159] platform/x86: asus-wireless: send an EV_SYN/SYN_REPORT between state changes
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (158 preceding siblings ...)
  2017-12-22  8:47 ` [PATCH 4.14 158/159] thermal/drivers/hisi: Fix multiple alarm interrupts firing Greg Kroah-Hartman
@ 2017-12-22  8:47 ` Greg Kroah-Hartman
  2017-12-22 15:08 ` [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (5 subsequent siblings)
  165 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22  8:47 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Peter Hutterer, Darren Hart (VMware)

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Peter Hutterer <peter.hutterer@who-t.net>

commit bff5bf9db1c9453ffd0a78abed3e2d040c092fd9 upstream.

Sending the switch state change twice within the same frame is invalid
evdev protocol and only works if the client handles keys immediately.
Processing events immediately is incorrect: it forces a fake order of
events that does not exist on the device.

Recent versions of libinput changed to only process the device state at
SYN_REPORT time, so now the key event is lost.

https://bugs.freedesktop.org/show_bug.cgi?id=104041

Signed-off-by: Peter Hutterer <peter.hutterer@who-t.net>
Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/platform/x86/asus-wireless.c |    1 +
 1 file changed, 1 insertion(+)

--- a/drivers/platform/x86/asus-wireless.c
+++ b/drivers/platform/x86/asus-wireless.c
@@ -118,6 +118,7 @@ static void asus_wireless_notify(struct
 		return;
 	}
 	input_report_key(data->idev, KEY_RFKILL, 1);
+	input_sync(data->idev);
 	input_report_key(data->idev, KEY_RFKILL, 0);
 	input_sync(data->idev);
 }
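
Why the extra input_sync() matters can be sketched with a toy event stream.
This is a user-space model, not kernel or libinput code: the demo_* types
and demo_count_presses() are hypothetical, and the client is assumed to
inspect key state only when a SYN_REPORT arrives, as the commit message
describes for recent libinput.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum demo_ev_type { DEMO_EV_KEY, DEMO_EV_SYN };

struct demo_event {
	enum demo_ev_type type;
	int value;	/* key state for DEMO_EV_KEY, unused for DEMO_EV_SYN */
};

/* Counts the presses seen by a client that samples state at each SYN. */
static int demo_count_presses(const struct demo_event *ev, size_t n)
{
	bool seen = false;	/* state as of the previous SYN */
	bool cur = false;	/* state accumulated in the current frame */
	int presses = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (ev[i].type == DEMO_EV_KEY) {
			cur = ev[i].value != 0;
		} else {
			if (cur && !seen)
				presses++;
			seen = cur;
		}
	}
	return presses;
}

static void demo_frame_example(void)
{
	/* Before the patch: press and release share one frame. */
	const struct demo_event one_frame[] = {
		{ DEMO_EV_KEY, 1 }, { DEMO_EV_KEY, 0 }, { DEMO_EV_SYN, 0 },
	};
	/* After the patch: a SYN separates press from release. */
	const struct demo_event two_frames[] = {
		{ DEMO_EV_KEY, 1 }, { DEMO_EV_SYN, 0 },
		{ DEMO_EV_KEY, 0 }, { DEMO_EV_SYN, 0 },
	};

	assert(demo_count_presses(one_frame, 3) == 0);	/* key press lost */
	assert(demo_count_presses(two_frames, 4) == 1);	/* key press seen */
}
```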

^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 108/159] kvm, mm: account kvm related kmem slabs to kmemcg
  2017-12-22  8:46 ` [PATCH 4.14 108/159] kvm, mm: account kvm related kmem slabs to kmemcg Greg Kroah-Hartman
@ 2017-12-22  9:34   ` Michal Hocko
  2017-12-22 12:41     ` Greg Kroah-Hartman
  0 siblings, 1 reply; 349+ messages in thread
From: Michal Hocko @ 2017-12-22  9:34 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, stable, Shakeel Butt, Paolo Bonzini, Sasha Levin

On Fri 22-12-17 09:46:33, Greg KH wrote:
> 4.14-stable review patch.  If anyone has any objections, please let me know.
> 
> ------------------
> 
> From: Shakeel Butt <shakeelb@google.com>
> 
> 
> [ Upstream commit 46bea48ac241fe0b413805952dda74dd0c09ba8b ]
> 
> The kvm slabs can consume a significant amount of system memory
> and indeed in our production environment we have observed that
> a lot of machines are spending significant amount of memory that
> can not be left as system memory overhead. Also the allocations
> from these slabs can be triggered directly by user space applications
> which has access to kvm and thus a buggy application can leak
> such memory. So, these caches should be accounted to kmemcg.
> 
> Signed-off-by: Shakeel Butt <shakeelb@google.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

The patch is not marked for stable, nor does it fix an existing bug.
It is a nice-to-have for sure, but I am wondering how this got
through the stable filter.

> ---
>  arch/x86/kvm/mmu.c  |    4 ++--
>  virt/kvm/kvm_main.c |    2 +-
>  2 files changed, 3 insertions(+), 3 deletions(-)
> 
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -5476,13 +5476,13 @@ int kvm_mmu_module_init(void)
>  
>  	pte_list_desc_cache = kmem_cache_create("pte_list_desc",
>  					    sizeof(struct pte_list_desc),
> -					    0, 0, NULL);
> +					    0, SLAB_ACCOUNT, NULL);
>  	if (!pte_list_desc_cache)
>  		goto nomem;
>  
>  	mmu_page_header_cache = kmem_cache_create("kvm_mmu_page_header",
>  						  sizeof(struct kvm_mmu_page),
> -						  0, 0, NULL);
> +						  0, SLAB_ACCOUNT, NULL);
>  	if (!mmu_page_header_cache)
>  		goto nomem;
>  
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -4018,7 +4018,7 @@ int kvm_init(void *opaque, unsigned vcpu
>  	if (!vcpu_align)
>  		vcpu_align = __alignof__(struct kvm_vcpu);
>  	kvm_vcpu_cache = kmem_cache_create("kvm_vcpu", vcpu_size, vcpu_align,
> -					   0, NULL);
> +					   SLAB_ACCOUNT, NULL);
>  	if (!kvm_vcpu_cache) {
>  		r = -ENOMEM;
>  		goto out_free_3;
> 

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 108/159] kvm, mm: account kvm related kmem slabs to kmemcg
  2017-12-22  9:34   ` Michal Hocko
@ 2017-12-22 12:41     ` Greg Kroah-Hartman
  2017-12-22 13:06       ` Michal Hocko
  0 siblings, 1 reply; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22 12:41 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-kernel, stable, Shakeel Butt, Paolo Bonzini, Sasha Levin

On Fri, Dec 22, 2017 at 10:34:07AM +0100, Michal Hocko wrote:
> On Fri 22-12-17 09:46:33, Greg KH wrote:
> > 4.14-stable review patch.  If anyone has any objections, please let me know.
> > 
> > ------------------
> > 
> > From: Shakeel Butt <shakeelb@google.com>
> > 
> > 
> > [ Upstream commit 46bea48ac241fe0b413805952dda74dd0c09ba8b ]
> > 
> > The kvm slabs can consume a significant amount of system memory
> > and indeed in our production environment we have observed that
> > a lot of machines are spending significant amount of memory that
> > can not be left as system memory overhead. Also the allocations
> > from these slabs can be triggered directly by user space applications
> > which has access to kvm and thus a buggy application can leak
> > such memory. So, these caches should be accounted to kmemcg.
> > 
> > Signed-off-by: Shakeel Butt <shakeelb@google.com>
> > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> > Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
> > Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> 
> The patch is not marked for stable, nor does it fix an existing bug.
> It is a nice-to-have for sure, but I am wondering how this got
> through the stable filter.

Sasha picked it out, and it seemed like a sane thing to backport.  If
you think it's not worthy, I'll gladly drop it, but it seemed like such
a simple bugfix to include.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 108/159] kvm, mm: account kvm related kmem slabs to kmemcg
  2017-12-22 12:41     ` Greg Kroah-Hartman
@ 2017-12-22 13:06       ` Michal Hocko
  2017-12-22 17:40         ` alexander.levin
  0 siblings, 1 reply; 349+ messages in thread
From: Michal Hocko @ 2017-12-22 13:06 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, stable, Shakeel Butt, Paolo Bonzini, Sasha Levin

On Fri 22-12-17 13:41:22, Greg KH wrote:
> On Fri, Dec 22, 2017 at 10:34:07AM +0100, Michal Hocko wrote:
> > On Fri 22-12-17 09:46:33, Greg KH wrote:
> > > 4.14-stable review patch.  If anyone has any objections, please let me know.
> > > 
> > > ------------------
> > > 
> > > From: Shakeel Butt <shakeelb@google.com>
> > > 
> > > 
> > > [ Upstream commit 46bea48ac241fe0b413805952dda74dd0c09ba8b ]
> > > 
> > > The kvm slabs can consume a significant amount of system memory
> > > and indeed in our production environment we have observed that
> > > a lot of machines are spending significant amount of memory that
> > > can not be left as system memory overhead. Also the allocations
> > > from these slabs can be triggered directly by user space applications
> > > which has access to kvm and thus a buggy application can leak
> > > such memory. So, these caches should be accounted to kmemcg.
> > > 
> > > Signed-off-by: Shakeel Butt <shakeelb@google.com>
> > > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> > > Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
> > > Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > 
> > The patch is not marked for stable, nor does it fix an existing bug.
> > It is a nice-to-have for sure, but I am wondering how this got
> > through the stable filter.
> 
> Sasha picked it out, and it seemed like a sane thing to backport.  If
> you think it's not worthy, I'll gladly drop it, but it seemed like such
> a simple bugfix to include.

It is not that I have specific concerns about this particular patch.
It is more of a worry about the overall process. I thought that
_any_ patch backported to the stable tree would require a specific bug
to be fixed or, in exceptional cases, a performance issue. I have
experienced this pushback myself when trying to push "no real bug report
but better to have this plugged" patches.

So something has apparently changed in the process and I just haven't
noticed it. I am worried this might lead to more regressions in the
future. Not that my worry counts all that much, as I am not a stable
kernel user. So this is just my 2c worth of worry.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
  2017-12-22  8:45   ` Greg Kroah-Hartman
@ 2017-12-22 14:18     ` Dan Rue
  -1 siblings, 0 replies; 349+ messages in thread
From: Dan Rue @ 2017-12-22 14:18 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, stable, Kirill A. Shutemov, Andrew Morton,
	Andy Lutomirski, Borislav Petkov, Cyrill Gorcunov,
	Linus Torvalds, Peter Zijlstra, Thomas Gleixner, linux-mm,
	Ingo Molnar

On Fri, Dec 22, 2017 at 09:45:08AM +0100, Greg Kroah-Hartman wrote:
> 4.14-stable review patch.  If anyone has any objections, please let me know.
> 
> ------------------
> 
> From: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> 
> commit 83e3c48729d9ebb7af5a31a504f3fd6aff0348c4 upstream.
> 
> Size of the mem_section[] array depends on the size of the physical address space.
> 
> In preparation for boot-time switching between paging modes on x86-64
> we need to make the allocation of mem_section[] dynamic, because otherwise
> we waste a lot of RAM: with CONFIG_NODE_SHIFT=10, mem_section[] size is 32kB
> for 4-level paging and 2MB for 5-level paging mode.
> 
> The patch allocates the array on the first call to sparse_memory_present_with_active_regions().
> 
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Andy Lutomirski <luto@amacapital.net>
> Cc: Borislav Petkov <bp@suse.de>
> Cc: Cyrill Gorcunov <gorcunov@openvz.org>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: linux-mm@kvack.org
> Link: http://lkml.kernel.org/r/20170929140821.37654-2-kirill.shutemov@linux.intel.com
> Signed-off-by: Ingo Molnar <mingo@kernel.org>
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

This patch causes a boot failure on arm64.

Please drop this patch, or pick up the fix in:

    commit 629a359bdb0e0652a8227b4ff3125431995fec6e
    Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
    Date:   Tue Nov 7 11:33:37 2017 +0300

        mm/sparsemem: Fix ARM64 boot crash when CONFIG_SPARSEMEM_EXTREME=y

See https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1527427.html

> 
> ---
>  include/linux/mmzone.h |    6 +++++-
>  mm/page_alloc.c        |   10 ++++++++++
>  mm/sparse.c            |   17 +++++++++++------
>  3 files changed, 26 insertions(+), 7 deletions(-)
> 
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -1152,13 +1152,17 @@ struct mem_section {
>  #define SECTION_ROOT_MASK	(SECTIONS_PER_ROOT - 1)
>  
>  #ifdef CONFIG_SPARSEMEM_EXTREME
> -extern struct mem_section *mem_section[NR_SECTION_ROOTS];
> +extern struct mem_section **mem_section;
>  #else
>  extern struct mem_section mem_section[NR_SECTION_ROOTS][SECTIONS_PER_ROOT];
>  #endif
>  
>  static inline struct mem_section *__nr_to_section(unsigned long nr)
>  {
> +#ifdef CONFIG_SPARSEMEM_EXTREME
> +	if (!mem_section)
> +		return NULL;
> +#endif
>  	if (!mem_section[SECTION_NR_TO_ROOT(nr)])
>  		return NULL;
>  	return &mem_section[SECTION_NR_TO_ROOT(nr)][nr & SECTION_ROOT_MASK];
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5651,6 +5651,16 @@ void __init sparse_memory_present_with_a
>  	unsigned long start_pfn, end_pfn;
>  	int i, this_nid;
>  
> +#ifdef CONFIG_SPARSEMEM_EXTREME
> +	if (!mem_section) {
> +		unsigned long size, align;
> +
> +		size = sizeof(struct mem_section) * NR_SECTION_ROOTS;
> +		align = 1 << (INTERNODE_CACHE_SHIFT);
> +		mem_section = memblock_virt_alloc(size, align);
> +	}
> +#endif
> +
>  	for_each_mem_pfn_range(i, nid, &start_pfn, &end_pfn, &this_nid)
>  		memory_present(this_nid, start_pfn, end_pfn);
>  }
> --- a/mm/sparse.c
> +++ b/mm/sparse.c
> @@ -23,8 +23,7 @@
>   * 1) mem_section	- memory sections, mem_map's for valid memory
>   */
>  #ifdef CONFIG_SPARSEMEM_EXTREME
> -struct mem_section *mem_section[NR_SECTION_ROOTS]
> -	____cacheline_internodealigned_in_smp;
> +struct mem_section **mem_section;
>  #else
>  struct mem_section mem_section[NR_SECTION_ROOTS][SECTIONS_PER_ROOT]
>  	____cacheline_internodealigned_in_smp;
> @@ -101,7 +100,7 @@ static inline int sparse_index_init(unsi
>  int __section_nr(struct mem_section* ms)
>  {
>  	unsigned long root_nr;
> -	struct mem_section* root;
> +	struct mem_section *root = NULL;
>  
>  	for (root_nr = 0; root_nr < NR_SECTION_ROOTS; root_nr++) {
>  		root = __nr_to_section(root_nr * SECTIONS_PER_ROOT);
> @@ -112,7 +111,7 @@ int __section_nr(struct mem_section* ms)
>  		     break;
>  	}
>  
> -	VM_BUG_ON(root_nr == NR_SECTION_ROOTS);
> +	VM_BUG_ON(!root);
>  
>  	return (root_nr * SECTIONS_PER_ROOT) + (ms - root);
>  }
> @@ -330,11 +329,17 @@ again:
>  static void __init check_usemap_section_nr(int nid, unsigned long *usemap)
>  {
>  	unsigned long usemap_snr, pgdat_snr;
> -	static unsigned long old_usemap_snr = NR_MEM_SECTIONS;
> -	static unsigned long old_pgdat_snr = NR_MEM_SECTIONS;
> +	static unsigned long old_usemap_snr;
> +	static unsigned long old_pgdat_snr;
>  	struct pglist_data *pgdat = NODE_DATA(nid);
>  	int usemap_nid;
>  
> +	/* First call */
> +	if (!old_usemap_snr) {
> +		old_usemap_snr = NR_MEM_SECTIONS;
> +		old_pgdat_snr = NR_MEM_SECTIONS;
> +	}
> +
>  	usemap_snr = pfn_to_section_nr(__pa(usemap) >> PAGE_SHIFT);
>  	pgdat_snr = pfn_to_section_nr(__pa(pgdat) >> PAGE_SHIFT);
>  	if (usemap_snr == pgdat_snr)
> 
> 
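
The guarded two-level lookup from the quoted mmzone.h hunk can be sketched
in user space. This is only a model under stated assumptions: the demo_*
names and the table sizes are hypothetical, and calloc() stands in for the
boot-time allocator. It shows that a lookup made before the top-level
array exists returns NULL instead of dereferencing a NULL pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define DEMO_SECTIONS_PER_ROOT	4UL	/* hypothetical, power of two */
#define DEMO_NR_SECTION_ROOTS	8UL	/* hypothetical */
#define DEMO_SECTION_ROOT_MASK	(DEMO_SECTIONS_PER_ROOT - 1)

struct demo_mem_section {
	unsigned long map;
};

/* NULL until the top-level array is allocated at run time. */
static struct demo_mem_section **demo_mem_section;

static struct demo_mem_section *demo_nr_to_section(unsigned long nr)
{
	unsigned long root = nr / DEMO_SECTIONS_PER_ROOT;

	if (!demo_mem_section)		/* guard added by the quoted patch */
		return NULL;
	if (!demo_mem_section[root])
		return NULL;
	return &demo_mem_section[root][nr & DEMO_SECTION_ROOT_MASK];
}

/* Allocate the top-level array once; 0 on success, -1 on failure. */
static int demo_alloc_roots(void)
{
	if (demo_mem_section)
		return 0;
	demo_mem_section = calloc(DEMO_NR_SECTION_ROOTS,
				  sizeof(*demo_mem_section));
	return demo_mem_section ? 0 : -1;
}

static void demo_lookup_example(void)
{
	static struct demo_mem_section root1[DEMO_SECTIONS_PER_ROOT];

	/* Before allocation every lookup is refused rather than crashing. */
	assert(demo_nr_to_section(5) == NULL);

	assert(demo_alloc_roots() == 0);
	assert(demo_nr_to_section(5) == NULL);	/* root 1 not populated yet */

	demo_mem_section[1] = root1;
	assert(demo_nr_to_section(5) == &root1[1]);

	free(demo_mem_section);
	demo_mem_section = NULL;
}
```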

^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
  2017-12-22 14:18     ` Dan Rue
@ 2017-12-22 14:52       ` Naresh Kamboju
  -1 siblings, 0 replies; 349+ messages in thread
From: Naresh Kamboju @ 2017-12-22 14:52 UTC (permalink / raw)
  To: Greg Kroah-Hartman, linux-kernel, linux- stable,
	Kirill A. Shutemov, Andrew Morton, Andy Lutomirski,
	Borislav Petkov, Cyrill Gorcunov, Linus Torvalds, Peter Zijlstra,
	Thomas Gleixner, linux-mm, Ingo Molnar

On 22 December 2017 at 19:48, Dan Rue <dan.rue@linaro.org> wrote:
> On Fri, Dec 22, 2017 at 09:45:08AM +0100, Greg Kroah-Hartman wrote:
>> 4.14-stable review patch.  If anyone has any objections, please let me know.
>>
>> ------------------
>>
>> From: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>>
>> commit 83e3c48729d9ebb7af5a31a504f3fd6aff0348c4 upstream.
>>
>> Size of the mem_section[] array depends on the size of the physical address space.
>>
>> In preparation for boot-time switching between paging modes on x86-64
>> we need to make the allocation of mem_section[] dynamic, because otherwise
>> we waste a lot of RAM: with CONFIG_NODE_SHIFT=10, mem_section[] size is 32kB
>> for 4-level paging and 2MB for 5-level paging mode.
>>
>> The patch allocates the array on the first call to sparse_memory_present_with_active_regions().
>>
>> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>> Cc: Andrew Morton <akpm@linux-foundation.org>
>> Cc: Andy Lutomirski <luto@amacapital.net>
>> Cc: Borislav Petkov <bp@suse.de>
>> Cc: Cyrill Gorcunov <gorcunov@openvz.org>
>> Cc: Linus Torvalds <torvalds@linux-foundation.org>
>> Cc: Peter Zijlstra <peterz@infradead.org>
>> Cc: Thomas Gleixner <tglx@linutronix.de>
>> Cc: linux-mm@kvack.org
>> Link: http://lkml.kernel.org/r/20170929140821.37654-2-kirill.shutemov@linux.intel.com
>> Signed-off-by: Ingo Molnar <mingo@kernel.org>
>> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>
> This patch causes a boot failure on arm64.
>
> Please drop this patch, or pick up the fix in:
>
>     commit 629a359bdb0e0652a8227b4ff3125431995fec6e
>     Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>     Date:   Tue Nov 7 11:33:37 2017 +0300
>
>         mm/sparsemem: Fix ARM64 boot crash when CONFIG_SPARSEMEM_EXTREME=y
>
> See https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1527427.html

+1.
Boot failed on arm64 without 629a359b
("mm/sparsemem: Fix ARM64 boot crash when CONFIG_SPARSEMEM_EXTREME=y").

Boot Error log:
--------------------
[    0.000000] Unable to handle kernel NULL pointer dereference at virtual address 00000000
[    0.000000] Mem abort info:
[    0.000000]   Exception class = DABT (current EL), IL = 32 bits
[    0.000000]   SET = 0, FnV = 0
[    0.000000]   EA = 0, S1PTW = 0
[    0.000000] Data abort info:
[    0.000000]   ISV = 0, ISS = 0x00000004
[    0.000000]   CM = 0, WnR = 0
[    0.000000] [0000000000000000] user address but active_mm is swapper
[    0.000000] Internal error: Oops: 96000004 [#1] PREEMPT SMP
[    0.000000] Modules linked in:
[    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.14.9-rc1 #1
[    0.000000] Hardware name: ARM Juno development board (r2) (DT)
[    0.000000] task: ffff0000091d9380 task.stack: ffff0000091c0000
[    0.000000] PC is at memory_present+0x64/0xf4
[    0.000000] LR is at memory_present+0x38/0xf4
[    0.000000] pc : [<ffff0000090a1f54>] lr : [<ffff0000090a1f28>] pstate: 800000c5
[    0.000000] sp : ffff0000091c3e80

More information,
https://pastebin.com/KambxUwb
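
For readers following the oops above, here is a minimal user-space sketch of
the failure mode. It is not kernel code. It assumes (this is not stated in the
thread itself) that arm64 reaches memory_present() without first going through
sparse_memory_present_with_active_regions(), so the top-level mem_section
pointer is still NULL when it is indexed. The names section_roots, roots_init,
nr_to_section and mark_present, and the toy sizes, are hypothetical stand-ins;
calloc() stands in for the boot-time allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy sizes; the kernel derives the real ones from the physical address width. */
#define NR_SECTION_ROOTS  8
#define SECTIONS_PER_ROOT 4

/* Hypothetical stand-in for struct mem_section. */
struct section {
	unsigned long map;
};

/* Top-level table of roots: NULL until allocated at run time. */
struct section **section_roots;

/* Allocate the top-level table once. Returns 0 on success, -1 on failure. */
int roots_init(void)
{
	if (section_roots)
		return 0;
	section_roots = calloc(NR_SECTION_ROOTS, sizeof(*section_roots));
	return section_roots ? 0 : -1;
}

/* Guarded lookup: NULL if the table or the root is missing. */
struct section *nr_to_section(unsigned long nr)
{
	unsigned long root = nr / SECTIONS_PER_ROOT;

	if (root >= NR_SECTION_ROOTS)
		return NULL;
	if (!section_roots || !section_roots[root])
		return NULL;
	return &section_roots[root][nr % SECTIONS_PER_ROOT];
}

/*
 * Mark a section present. The top-level table is allocated here, on the
 * path every caller takes, rather than in one particular wrapper.
 */
int mark_present(unsigned long nr)
{
	unsigned long root = nr / SECTIONS_PER_ROOT;

	if (root >= NR_SECTION_ROOTS || roots_init() != 0)
		return -1;
	if (!section_roots[root]) {
		section_roots[root] = calloc(SECTIONS_PER_ROOT,
					     sizeof(struct section));
		if (!section_roots[root])
			return -1;
	}
	section_roots[root][nr % SECTIONS_PER_ROOT].map = 1;
	return 0;
}
```

Calling mark_present(5) on a fresh table allocates the top level and root 1 on
demand. nr_to_section() returns NULL for any section whose root was never
populated instead of indexing a NULL table.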

- Naresh
>
>>
>> ---
>>  include/linux/mmzone.h |    6 +++++-
>>  mm/page_alloc.c        |   10 ++++++++++
>>  mm/sparse.c            |   17 +++++++++++------
>>  3 files changed, 26 insertions(+), 7 deletions(-)
>>
>> --- a/include/linux/mmzone.h
>> +++ b/include/linux/mmzone.h
>> @@ -1152,13 +1152,17 @@ struct mem_section {
>>  #define SECTION_ROOT_MASK    (SECTIONS_PER_ROOT - 1)
>>
>>  #ifdef CONFIG_SPARSEMEM_EXTREME
>> -extern struct mem_section *mem_section[NR_SECTION_ROOTS];
>> +extern struct mem_section **mem_section;
>>  #else
>>  extern struct mem_section mem_section[NR_SECTION_ROOTS][SECTIONS_PER_ROOT];
>>  #endif
>>
>>  static inline struct mem_section *__nr_to_section(unsigned long nr)
>>  {
>> +#ifdef CONFIG_SPARSEMEM_EXTREME
>> +     if (!mem_section)
>> +             return NULL;
>> +#endif
>>       if (!mem_section[SECTION_NR_TO_ROOT(nr)])
>>               return NULL;
>>       return &mem_section[SECTION_NR_TO_ROOT(nr)][nr & SECTION_ROOT_MASK];
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -5651,6 +5651,16 @@ void __init sparse_memory_present_with_a
>>       unsigned long start_pfn, end_pfn;
>>       int i, this_nid;
>>
>> +#ifdef CONFIG_SPARSEMEM_EXTREME
>> +     if (!mem_section) {
>> +             unsigned long size, align;
>> +
>> +             size = sizeof(struct mem_section) * NR_SECTION_ROOTS;
>> +             align = 1 << (INTERNODE_CACHE_SHIFT);
>> +             mem_section = memblock_virt_alloc(size, align);
>> +     }
>> +#endif
>> +
>>       for_each_mem_pfn_range(i, nid, &start_pfn, &end_pfn, &this_nid)
>>               memory_present(this_nid, start_pfn, end_pfn);
>>  }
>> --- a/mm/sparse.c
>> +++ b/mm/sparse.c
>> @@ -23,8 +23,7 @@
>>   * 1) mem_section    - memory sections, mem_map's for valid memory
>>   */
>>  #ifdef CONFIG_SPARSEMEM_EXTREME
>> -struct mem_section *mem_section[NR_SECTION_ROOTS]
>> -     ____cacheline_internodealigned_in_smp;
>> +struct mem_section **mem_section;
>>  #else
>>  struct mem_section mem_section[NR_SECTION_ROOTS][SECTIONS_PER_ROOT]
>>       ____cacheline_internodealigned_in_smp;
>> @@ -101,7 +100,7 @@ static inline int sparse_index_init(unsi
>>  int __section_nr(struct mem_section* ms)
>>  {
>>       unsigned long root_nr;
>> -     struct mem_section* root;
>> +     struct mem_section *root = NULL;
>>
>>       for (root_nr = 0; root_nr < NR_SECTION_ROOTS; root_nr++) {
>>               root = __nr_to_section(root_nr * SECTIONS_PER_ROOT);
>> @@ -112,7 +111,7 @@ int __section_nr(struct mem_section* ms)
>>                    break;
>>       }
>>
>> -     VM_BUG_ON(root_nr == NR_SECTION_ROOTS);
>> +     VM_BUG_ON(!root);
>>
>>       return (root_nr * SECTIONS_PER_ROOT) + (ms - root);
>>  }
>> @@ -330,11 +329,17 @@ again:
>>  static void __init check_usemap_section_nr(int nid, unsigned long *usemap)
>>  {
>>       unsigned long usemap_snr, pgdat_snr;
>> -     static unsigned long old_usemap_snr = NR_MEM_SECTIONS;
>> -     static unsigned long old_pgdat_snr = NR_MEM_SECTIONS;
>> +     static unsigned long old_usemap_snr;
>> +     static unsigned long old_pgdat_snr;
>>       struct pglist_data *pgdat = NODE_DATA(nid);
>>       int usemap_nid;
>>
>> +     /* First call */
>> +     if (!old_usemap_snr) {
>> +             old_usemap_snr = NR_MEM_SECTIONS;
>> +             old_pgdat_snr = NR_MEM_SECTIONS;
>> +     }
>> +
>>       usemap_snr = pfn_to_section_nr(__pa(usemap) >> PAGE_SHIFT);
>>       pgdat_snr = pfn_to_section_nr(__pa(pgdat) >> PAGE_SHIFT);
>>       if (usemap_snr == pgdat_snr)
>>
>>

^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
  2017-12-22 14:18     ` Dan Rue
@ 2017-12-22 15:03       ` Greg Kroah-Hartman
  -1 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22 15:03 UTC (permalink / raw)
  To: linux-kernel, stable, Kirill A. Shutemov, Andrew Morton,
	Andy Lutomirski, Borislav Petkov, Cyrill Gorcunov,
	Linus Torvalds, Peter Zijlstra, Thomas Gleixner, linux-mm,
	Ingo Molnar

On Fri, Dec 22, 2017 at 08:18:10AM -0600, Dan Rue wrote:
> On Fri, Dec 22, 2017 at 09:45:08AM +0100, Greg Kroah-Hartman wrote:
> > 4.14-stable review patch.  If anyone has any objections, please let me know.
> > 
> > ------------------
> > 
> > From: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > 
> > commit 83e3c48729d9ebb7af5a31a504f3fd6aff0348c4 upstream.
> > 
> > Size of the mem_section[] array depends on the size of the physical address space.
> > 
> > In preparation for boot-time switching between paging modes on x86-64
> > we need to make the allocation of mem_section[] dynamic, because otherwise
> > we waste a lot of RAM: with CONFIG_NODE_SHIFT=10, mem_section[] size is 32kB
> > for 4-level paging and 2MB for 5-level paging mode.
> > 
> > The patch allocates the array on the first call to sparse_memory_present_with_active_regions().
> > 
> > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Cc: Andy Lutomirski <luto@amacapital.net>
> > Cc: Borislav Petkov <bp@suse.de>
> > Cc: Cyrill Gorcunov <gorcunov@openvz.org>
> > Cc: Linus Torvalds <torvalds@linux-foundation.org>
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: linux-mm@kvack.org
> > Link: http://lkml.kernel.org/r/20170929140821.37654-2-kirill.shutemov@linux.intel.com
> > Signed-off-by: Ingo Molnar <mingo@kernel.org>
> > Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> 
> This patch causes a boot failure on arm64.
> 
> Please drop this patch, or pick up the fix in:
> 
>     commit 629a359bdb0e0652a8227b4ff3125431995fec6e
>     Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>     Date:   Tue Nov 7 11:33:37 2017 +0300
> 
>         mm/sparsemem: Fix ARM64 boot crash when CONFIG_SPARSEMEM_EXTREME=y
> 
> See https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1527427.html

Now added, thanks.

greg k-h

^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 000/159] 4.14.9-stable review
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (159 preceding siblings ...)
  2017-12-22  8:47 ` [PATCH 4.14 159/159] platform/x86: asus-wireless: send an EV_SYN/SYN_REPORT between state changes Greg Kroah-Hartman
@ 2017-12-22 15:08 ` Greg Kroah-Hartman
  2017-12-22 15:54   ` Greg Kroah-Hartman
       [not found] ` <5a3cfea4.0692500a.66bcf.cf6b@mx.google.com>
                   ` (4 subsequent siblings)
  165 siblings, 1 reply; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22 15:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: torvalds, akpm, linux, shuahkh, patches, ben.hutchings,
	lkft-triage, stable

On Fri, Dec 22, 2017 at 09:44:45AM +0100, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.14.9 release.
> There are 159 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Sun Dec 24 08:45:36 UTC 2017.
> Anything received after that time might be too late.
> 
> The whole patch series can be found in one patch at:
> 	kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.9-rc1.gz
> or in the git tree and branch at:
>   git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y
> and the diffstat can be found below.

OK, that blew up hard on arm64; there's now a -rc2 out with a fix for
that.  Hopefully :)

 	kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.9-rc2.gz

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 000/159] 4.14.9-stable review
       [not found] ` <5a3cfea4.0692500a.66bcf.cf6b@mx.google.com>
@ 2017-12-22 15:11   ` Greg Kroah-Hartman
  2017-12-22 15:45     ` Greg Kroah-Hartman
  0 siblings, 1 reply; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22 15:11 UTC (permalink / raw)
  To: kernelci.org bot
  Cc: linux-kernel, torvalds, akpm, linux, shuahkh, patches,
	ben.hutchings, lkft-triage, stable

On Fri, Dec 22, 2017 at 04:46:28AM -0800, kernelci.org bot wrote:
> stable-rc/linux-4.14.y boot: 136 boots: 12 failed, 111 passed with 12 offline, 1 untried/unknown (v4.14.8-159-gc2a94d1a6095)
> 
> Full Boot Summary: https://kernelci.org/boot/all/job/stable-rc/branch/linux-4.14.y/kernel/v4.14.8-159-gc2a94d1a6095/
> Full Build Summary: https://kernelci.org/build/stable-rc/branch/linux-4.14.y/kernel/v4.14.8-159-gc2a94d1a6095/
> 
> Tree: stable-rc
> Branch: linux-4.14.y
> Git Describe: v4.14.8-159-gc2a94d1a6095
> Git Commit: c2a94d1a60958294af33649f908960c536a206d5
> Git URL: http://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
> Tested: 76 unique boards, 23 SoC families, 17 builds out of 185
> 
> Boot Regressions Detected:
> 
> arm:
> 
>     exynos_defconfig:
>         exynos5250-arndale:
>             lab-baylibre-seattle: failing since 6 days (last pass: v4.14.5-97-gcdda4aaafa84 - first fail: v4.14.6)
> 
> arm64:
> 
>     defconfig:
>         hip07-d05_rootfs:nfs:
>             lab-collabora: new failure (last pass: v4.14.8-63-gbbedfb07d3bf)
>         meson-gxbb-p200:
>             lab-baylibre-seattle: new failure (last pass: v4.14.8-63-gbbedfb07d3bf)
>         meson-gxl-s905d-p230:
>             lab-baylibre-seattle: new failure (last pass: v4.14.8-63-gbbedfb07d3bf)
>         meson-gxl-s905x-khadas-vim:
>             lab-baylibre-seattle: new failure (last pass: v4.14.8-63-gbbedfb07d3bf)
>         meson-gxl-s905x-nexbox-a95x:
>             lab-baylibre-seattle: new failure (last pass: v4.14.8-63-gbbedfb07d3bf)
>         meson-gxl-s905x-p212:
>             lab-baylibre-seattle: new failure (last pass: v4.14.8-63-gbbedfb07d3bf)
>         qemu:
>             lab-mhart: new failure (last pass: v4.14.8-63-gbbedfb07d3bf)
>             lab-collabora: new failure (last pass: v4.14.8-64-g6b2f7746b2ea)
>         r8a7796-m3ulcb:
>             lab-collabora: new failure (last pass: v4.14.8-63-gbbedfb07d3bf)
>         rk3399-firefly:
>             lab-baylibre-seattle: new failure (last pass: v4.14.8-63-gbbedfb07d3bf)

These should all now be fixed with the -rc2 release, sorry about that.

My arm64 box here is dead, so I couldn't test it locally :(

greg k-h

^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
  2017-12-22 14:52       ` Naresh Kamboju
@ 2017-12-22 15:12         ` Greg Kroah-Hartman
  -1 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22 15:12 UTC (permalink / raw)
  To: Naresh Kamboju
  Cc: linux-kernel, linux- stable, Kirill A. Shutemov, Andrew Morton,
	Andy Lutomirski, Borislav Petkov, Cyrill Gorcunov,
	Linus Torvalds, Peter Zijlstra, Thomas Gleixner, linux-mm,
	Ingo Molnar

On Fri, Dec 22, 2017 at 08:22:09PM +0530, Naresh Kamboju wrote:
> On 22 December 2017 at 19:48, Dan Rue <dan.rue@linaro.org> wrote:
> > On Fri, Dec 22, 2017 at 09:45:08AM +0100, Greg Kroah-Hartman wrote:
> >> 4.14-stable review patch.  If anyone has any objections, please let me know.
> >>
> >> ------------------
> >>
> >> From: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> >>
> >> commit 83e3c48729d9ebb7af5a31a504f3fd6aff0348c4 upstream.
> >>
> >> Size of the mem_section[] array depends on the size of the physical address space.
> >>
> >> In preparation for boot-time switching between paging modes on x86-64
> >> we need to make the allocation of mem_section[] dynamic, because otherwise
> >> we waste a lot of RAM: with CONFIG_NODE_SHIFT=10, mem_section[] size is 32kB
> >> for 4-level paging and 2MB for 5-level paging mode.
> >>
> >> The patch allocates the array on the first call to sparse_memory_present_with_active_regions().
> >>
> >> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> >> Cc: Andrew Morton <akpm@linux-foundation.org>
> >> Cc: Andy Lutomirski <luto@amacapital.net>
> >> Cc: Borislav Petkov <bp@suse.de>
> >> Cc: Cyrill Gorcunov <gorcunov@openvz.org>
> >> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> >> Cc: Peter Zijlstra <peterz@infradead.org>
> >> Cc: Thomas Gleixner <tglx@linutronix.de>
> >> Cc: linux-mm@kvack.org
> >> Link: http://lkml.kernel.org/r/20170929140821.37654-2-kirill.shutemov@linux.intel.com
> >> Signed-off-by: Ingo Molnar <mingo@kernel.org>
> >> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> >
> > This patch causes a boot failure on arm64.
> >
> > Please drop this patch, or pick up the fix in:
> >
> >     commit 629a359bdb0e0652a8227b4ff3125431995fec6e
> >     Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> >     Date:   Tue Nov 7 11:33:37 2017 +0300
> >
> >         mm/sparsemem: Fix ARM64 boot crash when CONFIG_SPARSEMEM_EXTREME=y
> >
> > See https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1527427.html
> 
> +1.
> Boot failed on arm64 without 629a359b
> mm/sparsemem: Fix ARM64 boot crash when CONFIG_SPARSEMEM_EXTREME=y
> 
> Boot Error log:
> --------------------
> [    0.000000] Unable to handle kernel NULL pointer dereference at
> virtual address 00000000
> [    0.000000] Mem abort info:
> [    0.000000]   Exception class = DABT (current EL), IL = 32 bits
> [    0.000000]   SET = 0, FnV = 0
> [    0.000000]   EA = 0, S1PTW = 0
> [    0.000000] Data abort info:
> [    0.000000]   ISV = 0, ISS = 0x00000004
> [    0.000000]   CM = 0, WnR = 0
> [    0.000000] [0000000000000000] user address but active_mm is swapper
> [    0.000000] Internal error: Oops: 96000004 [#1] PREEMPT SMP
> [    0.000000] Modules linked in:
> [    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.14.9-rc1 #1
> [    0.000000] Hardware name: ARM Juno development board (r2) (DT)
> [    0.000000] task: ffff0000091d9380 task.stack: ffff0000091c0000
> [    0.000000] PC is at memory_present+0x64/0xf4
> [    0.000000] LR is at memory_present+0x38/0xf4
> [    0.000000] pc : [<ffff0000090a1f54>] lr : [<ffff0000090a1f28>]
> pstate: 800000c5
> [    0.000000] sp : ffff0000091c3e80
> 
> More information,
> https://pastebin.com/KambxUwb

-rc2 is out with the fix, hopefully that survives longer :)

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 000/159] 4.14.9-stable review
  2017-12-22 15:11   ` Greg Kroah-Hartman
@ 2017-12-22 15:45     ` Greg Kroah-Hartman
  0 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22 15:45 UTC (permalink / raw)
  To: kernelci.org bot
  Cc: linux-kernel, torvalds, akpm, linux, shuahkh, patches,
	ben.hutchings, lkft-triage, stable

On Fri, Dec 22, 2017 at 04:11:16PM +0100, Greg Kroah-Hartman wrote:
> On Fri, Dec 22, 2017 at 04:46:28AM -0800, kernelci.org bot wrote:
> > stable-rc/linux-4.14.y boot: 136 boots: 12 failed, 111 passed with 12 offline, 1 untried/unknown (v4.14.8-159-gc2a94d1a6095)
> > 
> > Full Boot Summary: https://kernelci.org/boot/all/job/stable-rc/branch/linux-4.14.y/kernel/v4.14.8-159-gc2a94d1a6095/
> > Full Build Summary: https://kernelci.org/build/stable-rc/branch/linux-4.14.y/kernel/v4.14.8-159-gc2a94d1a6095/
> > 
> > Tree: stable-rc
> > Branch: linux-4.14.y
> > Git Describe: v4.14.8-159-gc2a94d1a6095
> > Git Commit: c2a94d1a60958294af33649f908960c536a206d5
> > Git URL: http://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
> > Tested: 76 unique boards, 23 SoC families, 17 builds out of 185
> > 
> > Boot Regressions Detected:
> > 
> > arm:
> > 
> >     exynos_defconfig:
> >         exynos5250-arndale:
> >             lab-baylibre-seattle: failing since 6 days (last pass: v4.14.5-97-gcdda4aaafa84 - first fail: v4.14.6)
> > 
> > arm64:
> > 
> >     defconfig:
> >         hip07-d05_rootfs:nfs:
> >             lab-collabora: new failure (last pass: v4.14.8-63-gbbedfb07d3bf)
> >         meson-gxbb-p200:
> >             lab-baylibre-seattle: new failure (last pass: v4.14.8-63-gbbedfb07d3bf)
> >         meson-gxl-s905d-p230:
> >             lab-baylibre-seattle: new failure (last pass: v4.14.8-63-gbbedfb07d3bf)
> >         meson-gxl-s905x-khadas-vim:
> >             lab-baylibre-seattle: new failure (last pass: v4.14.8-63-gbbedfb07d3bf)
> >         meson-gxl-s905x-nexbox-a95x:
> >             lab-baylibre-seattle: new failure (last pass: v4.14.8-63-gbbedfb07d3bf)
> >         meson-gxl-s905x-p212:
> >             lab-baylibre-seattle: new failure (last pass: v4.14.8-63-gbbedfb07d3bf)
> >         qemu:
> >             lab-mhart: new failure (last pass: v4.14.8-63-gbbedfb07d3bf)
> >             lab-collabora: new failure (last pass: v4.14.8-64-g6b2f7746b2ea)
> >         r8a7796-m3ulcb:
> >             lab-collabora: new failure (last pass: v4.14.8-63-gbbedfb07d3bf)
> >         rk3399-firefly:
> >             lab-baylibre-seattle: new failure (last pass: v4.14.8-63-gbbedfb07d3bf)
> 
> These should all now be fixed with the -rc2 release, sorry about that.
> 
> My arm64 box here is dead, so I couldn't test it locally :(

Ok, the box now works just fine, it was my fault.  I'm updating it right
now and will get it up and working as part of my test systems next week.

thanks,

greg "oh the silver thing is a power button?" k-h

^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 000/159] 4.14.9-stable review
  2017-12-22 15:08 ` [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
@ 2017-12-22 15:54   ` Greg Kroah-Hartman
  2017-12-22 18:15     ` Guenter Roeck
  0 siblings, 1 reply; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-22 15:54 UTC (permalink / raw)
  To: linux-kernel
  Cc: torvalds, akpm, linux, shuahkh, patches, ben.hutchings,
	lkft-triage, stable

On Fri, Dec 22, 2017 at 04:08:39PM +0100, Greg Kroah-Hartman wrote:
> On Fri, Dec 22, 2017 at 09:44:45AM +0100, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 4.14.9 release.
> > There are 159 patches in this series, all will be posted as a response
> > to this one.  If anyone has any issues with these being applied, please
> > let me know.
> > 
> > Responses should be made by Sun Dec 24 08:45:36 UTC 2017.
> > Anything received after that time might be too late.
> > 
> > The whole patch series can be found in one patch at:
> > 	kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.9-rc1.gz
> > or in the git tree and branch at:
> >   git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y
> > and the diffstat can be found below.
> 
> Ok, that blew up hard on arm64, there's now a -rc2 out with a fix for
> that.  Hopefully :)
> 
>  	kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.9-rc2.gz

And because it's just been one of those weeks, there's now a -rc3 out
due to a bunch of important BPF patches.

I'll stop now to give you all a chance to test...

  	kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.9-rc3.gz


thanks,

greg k-h

^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 108/159] kvm, mm: account kvm related kmem slabs to kmemcg
  2017-12-22 13:06       ` Michal Hocko
@ 2017-12-22 17:40         ` alexander.levin
  2017-12-22 17:56           ` Michal Hocko
  0 siblings, 1 reply; 349+ messages in thread
From: alexander.levin @ 2017-12-22 17:40 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Greg Kroah-Hartman, linux-kernel, stable, Shakeel Butt, Paolo Bonzini

On Fri, Dec 22, 2017 at 02:06:07PM +0100, Michal Hocko wrote:
>On Fri 22-12-17 13:41:22, Greg KH wrote:
>> On Fri, Dec 22, 2017 at 10:34:07AM +0100, Michal Hocko wrote:
>> > On Fri 22-12-17 09:46:33, Greg KH wrote:
>> > > 4.14-stable review patch.  If anyone has any objections, please let me know.
>> > >
>> > > ------------------
>> > >
>> > > From: Shakeel Butt <shakeelb@google.com>
>> > >
>> > >
>> > > [ Upstream commit 46bea48ac241fe0b413805952dda74dd0c09ba8b ]
>> > >
>> > > The kvm slabs can consume a significant amount of system memory
>> > > and indeed in our production environment we have observed that
>> > > a lot of machines are spending significant amount of memory that
>> > > can not be left as system memory overhead. Also the allocations
>> > > from these slabs can be triggered directly by user space applications
>> > > which has access to kvm and thus a buggy application can leak
>> > > such memory. So, these caches should be accounted to kmemcg.
>> > >
>> > > Signed-off-by: Shakeel Butt <shakeelb@google.com>
>> > > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>> > > Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
>> > > Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>> >
>> > The patch is not marked for stable, neither it fixes an existing bug.
>> > It is a nice to have thing for sure but I am wondering how this got
>> > through stable-filter.
>>
>> Sasha picked it out, and it seemed like a sane thing to backport.  If
>> you think it's not worthy, I'll gladly drop it, but it seemed like such
>> a simple bugfix to include.
>
>It is not that I would have some specific concerns about this particular
>patch. It is more of a worry about the overall process. I thought that
>_any_ patch backported to the stable tree would require a specific bug
>to be fixed or in exceptional cases a performance issue. I have
>experienced this pushback myself when trying to push "no real bug report
>but better to have this plugged" patches.
>
>So something has apparently changed in the process, I just haven't
>noticed it. I am worried this might lead to more regression in future.
>Not that my worry counts all that much as I am not a stable kernel user
>though. So this is just my 2c worth of worry.

The way I see it is that stable commits are supposed to fix a bug that
a user can hit/exploit, it doesn't have to have an actual user
complaining about it.

For this particular commit, the way I read it is that a user can avoid
his kmemcg limits (maybe maliciously), which would qualify as an
actual bug we want to get fixed.

-- 

Thanks,
Sasha

^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 108/159] kvm, mm: account kvm related kmem slabs to kmemcg
  2017-12-22 17:40         ` alexander.levin
@ 2017-12-22 17:56           ` Michal Hocko
  2017-12-22 18:07             ` alexander.levin
  2017-12-23  9:24             ` Greg Kroah-Hartman
  0 siblings, 2 replies; 349+ messages in thread
From: Michal Hocko @ 2017-12-22 17:56 UTC (permalink / raw)
  To: alexander.levin
  Cc: Greg Kroah-Hartman, linux-kernel, stable, Shakeel Butt, Paolo Bonzini

On Fri 22-12-17 17:40:10, Sasha Levin wrote:
> On Fri, Dec 22, 2017 at 02:06:07PM +0100, Michal Hocko wrote:
> >On Fri 22-12-17 13:41:22, Greg KH wrote:
> >> On Fri, Dec 22, 2017 at 10:34:07AM +0100, Michal Hocko wrote:
> >> > On Fri 22-12-17 09:46:33, Greg KH wrote:
> >> > > 4.14-stable review patch.  If anyone has any objections, please let me know.
> >> > >
> >> > > ------------------
> >> > >
> >> > > From: Shakeel Butt <shakeelb@google.com>
> >> > >
> >> > >
> >> > > [ Upstream commit 46bea48ac241fe0b413805952dda74dd0c09ba8b ]
> >> > >
> >> > > The kvm slabs can consume a significant amount of system memory
> >> > > and indeed in our production environment we have observed that
> >> > > a lot of machines are spending significant amount of memory that
> >> > > can not be left as system memory overhead. Also the allocations
> >> > > from these slabs can be triggered directly by user space applications
> >> > > which has access to kvm and thus a buggy application can leak
> >> > > such memory. So, these caches should be accounted to kmemcg.
> >> > >
> >> > > Signed-off-by: Shakeel Butt <shakeelb@google.com>
> >> > > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> >> > > Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
> >> > > Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> >> >
> >> > The patch is not marked for stable, neither it fixes an existing bug.
> >> > It is a nice to have thing for sure but I am wondering how this got
> >> > through stable-filter.
> >>
> >> Sasha picked it out, and it seemed like a sane thing to backport.  If
> >> you think it's not worthy, I'll gladly drop it, but it seemed like such
> >> a simple bugfix to include.
> >
> >It is not that I would have some specific concerns about this particular
> >patch. It is more of a worry about the overall process. I thought that
> >_any_ patch backported to the stable tree would require a specific bug
> >to be fixed or in exceptional cases a performance issue. I have
> >experienced this pushback myself when trying to push "no real bug report
> >but better to have this plugged" patches.
> >
> >So something has apparently changed in the process, I just haven't
> >noticed it. I am worried this might lead to more regression in future.
> >Not that my worry counts all that much as I am not a stable kernel user
> >though. So this is just my 2c worth of worry.
> 
> The way I see it is that stable commits are supposed to fix a bug that
> a user can hit/exploit, it doesn't have to have an actual user
> complaining about it.
> 
> For this particular commit, the way I read it is that a user can avoid
> his kmemcg limits (maybe maliciously), which would qualify as an
> actual bug we want to get fixed.

How are you going to judge all the possible relations to other
subsystems? I mean there is a good reason maintainers mark patches for
stable trees. How do you want to competently decide this for them? Can
you do that for all subsystems?

I do not want to underestimate your judgment or misinterpret your
process here but I _believe_ that picking patches based on the changelog
without a deep understanding of the subsystem is really risky. We do
not really have to go a long way to see that. Just look at the other patch
in this very thread [1]. But maybe our understanding of the stable
trees is different.

[1] http://lkml.kernel.org/r/20171222141810.dpeozmylmnj253do@xps
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 108/159] kvm, mm: account kvm related kmem slabs to kmemcg
  2017-12-22 17:56           ` Michal Hocko
@ 2017-12-22 18:07             ` alexander.levin
  2017-12-22 18:22               ` Michal Hocko
  2017-12-23  9:24             ` Greg Kroah-Hartman
  1 sibling, 1 reply; 349+ messages in thread
From: alexander.levin @ 2017-12-22 18:07 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Greg Kroah-Hartman, linux-kernel, stable, Shakeel Butt, Paolo Bonzini

On Fri, Dec 22, 2017 at 06:56:16PM +0100, Michal Hocko wrote:
>On Fri 22-12-17 17:40:10, Sasha Levin wrote:
>> On Fri, Dec 22, 2017 at 02:06:07PM +0100, Michal Hocko wrote:
>> >On Fri 22-12-17 13:41:22, Greg KH wrote:
>> >> On Fri, Dec 22, 2017 at 10:34:07AM +0100, Michal Hocko wrote:
>> >> > On Fri 22-12-17 09:46:33, Greg KH wrote:
>> >> > > 4.14-stable review patch.  If anyone has any objections, please let me know.
>> >> > >
>> >> > > ------------------
>> >> > >
>> >> > > From: Shakeel Butt <shakeelb@google.com>
>> >> > >
>> >> > >
>> >> > > [ Upstream commit 46bea48ac241fe0b413805952dda74dd0c09ba8b ]
>> >> > >
>> >> > > The kvm slabs can consume a significant amount of system memory
>> >> > > and indeed in our production environment we have observed that
>> >> > > a lot of machines are spending significant amount of memory that
>> >> > > can not be left as system memory overhead. Also the allocations
>> >> > > from these slabs can be triggered directly by user space applications
>> >> > > which has access to kvm and thus a buggy application can leak
>> >> > > such memory. So, these caches should be accounted to kmemcg.
>> >> > >
>> >> > > Signed-off-by: Shakeel Butt <shakeelb@google.com>
>> >> > > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>> >> > > Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
>> >> > > Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>> >> >
>> >> > The patch is not marked for stable, neither it fixes an existing bug.
>> >> > It is a nice to have thing for sure but I am wondering how this got
>> >> > through stable-filter.
>> >>
>> >> Sasha picked it out, and it seemed like a sane thing to backport.  If
>> >> you think it's not worthy, I'll gladly drop it, but it seemed like such
>> >> a simple bugfix to include.
>> >
>> >It is not that I would have some specific concerns about this particular
>> >patch. It is more of a worry about the overall process. I thought that
>> >_any_ patch backported to the stable tree would require a specific bug
>> >to be fixed or in exceptional cases a performance issue. I have
>> >experienced this pushback myself when trying to push "no real bug report
>> >but better to have this plugged" patches.
>> >
>> >So something has apparently changed in the process, I just haven't
>> >noticed it. I am worried this might lead to more regression in future.
>> >Not that my worry counts all that much as I am not a stable kernel user
>> >though. So this is just my 2c worth of worry.
>>
>> The way I see it is that stable commits are supposed to fix a bug that
>> a user can hit/exploit, it doesn't have to have an actual user
>> complaining about it.
>>
>> For this particular commit, the way I read it is that a user can avoid
>> his kmemcg limits (maybe maliciously), which would qualify as an
>> actual bug we want to get fixed.
>
>How are you going to judge all the possible relations to other
>subsystems? I mean there is a good reason maintainers mark patches for
>stable trees. How do you want to competently decide this for them? Can
>you do that for all subsystems?
>
>I do not want to underestimate your judgment or misinterpret your
>process here but I _believe_ that picking patches based on the changelog
>without a deep understanding of the subsystem is really risky. We do
>not really have to go a long way to see that. Just look at the other patch
>in this very thread [1]. But maybe our understanding of the stable
>trees is different.

I don't try and override maintainers, I mostly try to get fixes out
of subsystems where maintainers/authors partially (or just don't)
mark their commits for stable.

These patches also go through a much longer review process than
commits that are marked for stable (there are at least 3 emails issued
for each such commit, and at least 1 week (usually much more) is
given for reviews).

-- 

Thanks,
Sasha

^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 000/159] 4.14.9-stable review
  2017-12-22 15:54   ` Greg Kroah-Hartman
@ 2017-12-22 18:15     ` Guenter Roeck
  2017-12-23 14:21       ` Greg Kroah-Hartman
  0 siblings, 1 reply; 349+ messages in thread
From: Guenter Roeck @ 2017-12-22 18:15 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, torvalds, akpm, shuahkh, patches, ben.hutchings,
	lkft-triage, stable

On Fri, Dec 22, 2017 at 04:54:41PM +0100, Greg Kroah-Hartman wrote:
> On Fri, Dec 22, 2017 at 04:08:39PM +0100, Greg Kroah-Hartman wrote:
> > On Fri, Dec 22, 2017 at 09:44:45AM +0100, Greg Kroah-Hartman wrote:
> > > This is the start of the stable review cycle for the 4.14.9 release.
> > > There are 159 patches in this series, all will be posted as a response
> > > to this one.  If anyone has any issues with these being applied, please
> > > let me know.
> > > 
> > > Responses should be made by Sun Dec 24 08:45:36 UTC 2017.
> > > Anything received after that time might be too late.
> > > 
> > > The whole patch series can be found in one patch at:
> > > 	kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.9-rc1.gz
> > > or in the git tree and branch at:
> > >   git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y
> > > and the diffstat can be found below.
> > 
> > Ok, that blew up hard on arm64, there's now a -rc2 out with a fix for
> > that.  Hopefully :)
> > 
> >  	kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.9-rc2.gz
> 
> And because it's just been one of those weeks, there's now a -rc3 out
> due to a bunch of important BPF patches.
> 

> I'll stop now to give you all a chance to test...
> 

I can't keep up with this. I'll let my builders try again tonight
and report tomorrow.

This is what I have so far.

h8300 builds are broken.

include/linux/compiler.h:344:2: error:
	implicit declaration of function ‘smp_read_barrier_depends’

Seen when building arch/h8300/kernel/asm-offsets.c.

sparc32 builds are broken with the same error, in this case when building
init/do_mounts_initrd.c.

Guenter

^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 108/159] kvm, mm: account kvm related kmem slabs to kmemcg
  2017-12-22 18:07             ` alexander.levin
@ 2017-12-22 18:22               ` Michal Hocko
  2017-12-22 21:55                 ` alexander.levin
  0 siblings, 1 reply; 349+ messages in thread
From: Michal Hocko @ 2017-12-22 18:22 UTC (permalink / raw)
  To: alexander.levin
  Cc: Greg Kroah-Hartman, linux-kernel, stable, Shakeel Butt, Paolo Bonzini

On Fri 22-12-17 18:07:23, Sasha Levin wrote:
> On Fri, Dec 22, 2017 at 06:56:16PM +0100, Michal Hocko wrote:
> >On Fri 22-12-17 17:40:10, Sasha Levin wrote:
> >> On Fri, Dec 22, 2017 at 02:06:07PM +0100, Michal Hocko wrote:
> >> >On Fri 22-12-17 13:41:22, Greg KH wrote:
> >> >> On Fri, Dec 22, 2017 at 10:34:07AM +0100, Michal Hocko wrote:
> >> >> > On Fri 22-12-17 09:46:33, Greg KH wrote:
> >> >> > > 4.14-stable review patch.  If anyone has any objections, please let me know.
> >> >> > >
> >> >> > > ------------------
> >> >> > >
> >> >> > > From: Shakeel Butt <shakeelb@google.com>
> >> >> > >
> >> >> > >
> >> >> > > [ Upstream commit 46bea48ac241fe0b413805952dda74dd0c09ba8b ]
> >> >> > >
> >> >> > > The kvm slabs can consume a significant amount of system memory
> >> >> > > and indeed in our production environment we have observed that
> >> >> > > a lot of machines are spending significant amount of memory that
> >> >> > > can not be left as system memory overhead. Also the allocations
> >> >> > > from these slabs can be triggered directly by user space applications
> >> >> > > which has access to kvm and thus a buggy application can leak
> >> >> > > such memory. So, these caches should be accounted to kmemcg.
> >> >> > >
> >> >> > > Signed-off-by: Shakeel Butt <shakeelb@google.com>
> >> >> > > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> >> >> > > Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
> >> >> > > Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> >> >> >
> >> >> > The patch is not marked for stable, neither it fixes an existing bug.
> >> >> > It is a nice to have thing for sure but I am wondering how this got
> >> >> > through stable-filter.
> >> >>
> >> >> Sasha picked it out, and it seemed like a sane thing to backport.  If
> >> >> you think it's not worthy, I'll gladly drop it, but it seemed like such
> >> >> a simple bugfix to include.
> >> >
> >> >It is not that I would have some specific concerns about this particular
> >> >patch. It is more of a worry about the overall process. I thought that
> >> >_any_ patch backported to the stable tree would require a specific bug
> >> >to be fixed or in exceptional cases a performance issue. I have
> >> >experienced this pushback myself when trying to push "no real bug report
> >> >but better to have this plugged" patches.
> >> >
> >> >So something has apparently changed in the process, I just haven't
> >> >noticed it. I am worried this might lead to more regression in future.
> >> >Not that my worry counts all that much as I am not a stable kernel user
> >> >though. So this is just my 2c worth of worry.
> >>
> >> The way I see it is that stable commits are supposed to fix a bug that
> >> a user can hit/exploit, it doesn't have to have an actual user
> >> complaining about it.
> >>
> >> For this particular commit, the way I read it is that a user can avoid
> >> his kmemcg limits (maybe maliciously), which would qualify as an
> >> actual bug we want to get fixed.
> >
> >How are you going to judge all the possible relations to other
> >subsystems? I mean there is a good reason maintainers mark patches for
> >stable trees. How do you want to competently decide this for them? Can
> >you do that for all subsystems?
> >
> >I do not want to underestimate your judgment or misinterpret your
> >process here but I _believe_ that picking patches based on the changelog
> >without a deep understanding of the subsystem is really risky. We do
> >not really have to go a long way to see that. Just look at the other patch
> >in this very thread [1]. But maybe our understanding of the stable
> >trees is different.
> 
> I don't try and override maintainers, I mostly try to get fixes out
> of subsystems where maintainers/authors partially (or just don't)
> mark their commits for stable.

Well, I have seen quite some MM patches and I believe we are quite good
at marking patches for stable trees... I also think we (as the whole
kernel) are much better at using the Fixes tag (although it is overused
sometimes).

Moreover it makes more sense to push on those maintainers than to try to
substitute for them without being so closely familiar with the subsystem.
If missing backports result in bug reports then this just increases the
pressure on those maintainers /me thinks.

> These patches also go through a much longer review process than
> commits that are marked for stable (there are at least 3 emails issued
> for each such commit, and at least 1 week (usually much more) is
> given for reviews).

Do any of the maintainers read those emails? How many acks/reviews do
you get for those patches for the stable tree? To be honest I tend to
ignore those patchbombs most of the time because it is simply impossible
for me to handle them. I try to help backporting obvious fixes but
reviewing a seemingly randomly selected patch which applies and whose
changelog looks reasonable is simply out of my time budget. Not to mention
that this is not just about the patch itself but also the tree it is applied
to and other patches that are in the same pile.

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 000/159] 4.14.9-stable review
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (161 preceding siblings ...)
       [not found] ` <5a3cfea4.0692500a.66bcf.cf6b@mx.google.com>
@ 2017-12-22 21:09 ` Shuah Khan
  2017-12-23  9:14   ` Greg Kroah-Hartman
  2017-12-22 22:31 ` Dan Rue
                   ` (2 subsequent siblings)
  165 siblings, 1 reply; 349+ messages in thread
From: Shuah Khan @ 2017-12-22 21:09 UTC (permalink / raw)
  To: Greg Kroah-Hartman, linux-kernel
  Cc: torvalds, akpm, linux, patches, ben.hutchings, lkft-triage,
	stable, Shuah Khan

On 12/22/2017 01:44 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.14.9 release.
> There are 159 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Sun Dec 24 08:45:36 UTC 2017.
> Anything received after that time might be too late.
> 
> The whole patch series can be found in one patch at:
> 	kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.9-rc1.gz
> or in the git tree and branch at:
>   git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y
> and the diffstat can be found below.
> 
> thanks,
> 
> greg k-h
> 

Compiled and booted on my test system. No dmesg regressions.

thanks,
-- Shuah

^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 108/159] kvm, mm: account kvm related kmem slabs to kmemcg
  2017-12-22 18:22               ` Michal Hocko
@ 2017-12-22 21:55                 ` alexander.levin
  0 siblings, 0 replies; 349+ messages in thread
From: alexander.levin @ 2017-12-22 21:55 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Greg Kroah-Hartman, linux-kernel, stable, Shakeel Butt, Paolo Bonzini

On Fri, Dec 22, 2017 at 07:22:32PM +0100, Michal Hocko wrote:
>On Fri 22-12-17 18:07:23, Sasha Levin wrote:
>> I don't try and override maintainers, I mostly try to get fixes out
>> of subsystems where maintainers/authors partially (or just don't)
>> mark their commits for stable.
>
>Well, I have seen quite some MM patches and I believe we are quite good
>at marking patches for stable trees... I also think we (as the whole
>kernel) are much better at using the Fixes tag (although it is overused
>sometimes).

Indeed, mm/ is probably as good as it gets in the kernel.

>Moreover it makes more sense to push on those maintainers than to try to
>substitute for them without being so closely familiar with the subsystem.
>If missing backports result in bug reports then this just increases the
>pressure on those maintainers /me thinks.

Both are happening, but it's difficult to force maintainers into doing
anything, as you might have guessed...

I'm hoping that one result of this work is a tool we can stick into
scripts/ (maybe glue it to checkpatch) that'll alert when the patch
is -stable material and suggest adding tags.
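
As a rough illustration of the tag-checking half of that idea only (this is
not the patch classifier discussed in this thread), a script can at least
report whether a commit message already carries the trailers that mark it
for stable. The sketch below is in Python; the function name stable_tags and
the exact regular expressions are assumptions made for this example, and the
Fixes hash in the second sample is made up.

```python
import re

# "Cc: stable@vger.kernel.org" and "Fixes: <sha> (...)" are the trailers
# conventionally used to flag stable material. The patterns below are an
# approximation chosen for this sketch, not an official grammar.
CC_STABLE = re.compile(r"^Cc:.*stable@(?:vger\.)?kernel\.org",
                       re.IGNORECASE | re.MULTILINE)
FIXES = re.compile(r"^Fixes:\s*[0-9a-f]{7,40}\b",
                   re.IGNORECASE | re.MULTILINE)


def stable_tags(message: str) -> dict:
    """Report which stable-related trailers appear in a commit message."""
    return {
        "cc_stable": CC_STABLE.search(message) is not None,
        "fixes": FIXES.search(message) is not None,
    }


# Trailers of the commit discussed in this thread: no stable tag at all.
untagged = (
    "kvm, mm: account kvm related kmem slabs to kmemcg\n"
    "\n"
    "Signed-off-by: Shakeel Butt <shakeelb@google.com>\n"
    "Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>\n"
)

# Hypothetical variant with both trailers added (the hash is invented).
tagged = untagged + (
    'Fixes: 0123456789ab ("hypothetical subject")\n'
    "Cc: <stable@vger.kernel.org>\n"
)

print(stable_tags(untagged))
print(stable_tags(tagged))
```

A check like this only detects tags that are already present; deciding
whether an untagged patch is -stable material is the hard part and is not
attempted here.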

>> These patches also go through a much longer review process than
>> commits that are marked for stable (there are at least 3 emails issued
>> for each such commit, and at least 1 week (usually much more) is
>> given for reviews).
>
>Do any of the maintainers read those emails? How many acks/reviews do
>you get for those patches for the stable tree? To be honest I tend to

I get a fair amount of reviews which seems to be slightly above what
-stable tagged patches get, which is good.

Acks are not expected, and are not happening too often.

>ignore those patchbombs most of the time because it is simply impossible
>for me to handle them. I try to help backporting obvious fixes but
>reviewing a seemingly randomly selected patch which applies and whose
>changelog looks reasonable is simply out of my time budget. Not to mention
>that this is not just about the patch itself but also the tree it is applied
>to and other patches that are in the same pile.

I'd hope that these patches aren't "random" :)

For some background, this is based on Julia Lawall's work (and paper
https://soarsmu.github.io/papers/icse12-patch.pdf).

-- 

Thanks,
Sasha

^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 000/159] 4.14.9-stable review
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (162 preceding siblings ...)
  2017-12-22 21:09 ` Shuah Khan
@ 2017-12-22 22:31 ` Dan Rue
  2017-12-23  9:17   ` Greg Kroah-Hartman
  2017-12-23 22:54 ` Guenter Roeck
  2017-12-24 19:37 ` Ivan Kozik
  165 siblings, 1 reply; 349+ messages in thread
From: Dan Rue @ 2017-12-22 22:31 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, ben.hutchings, shuahkh, lkft-triage, patches,
	stable, akpm, torvalds, linux

On Fri, Dec 22, 2017 at 09:44:45AM +0100, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.14.9 release.
> There are 159 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Sun Dec 24 08:45:36 UTC 2017.
> Anything received after that time might be too late.
> 
> The whole patch series can be found in one patch at:
> 	kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.9-rc1.gz
> or in the git tree and branch at:
>   git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y
> and the diffstat can be found below.

Results from Linaro - 4.14.9-rc3 looks good. No regressions on arm64, arm, or
x86_64.

Summary
------------------------------------------------------------------------

kernel: 4.14.9-rc3
git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
git branch: linux-4.14.y
git commit: a8cfbfd47f96dfffa64be8567174a00e4dfc1458
git describe: v4.14.8-175-ga8cfbfd47f96
Test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-4.14-oe/build/v4.14.8-175-ga8cfbfd47f96


No regressions (compared to build v4.14.8-161-gc51876d1903b)
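
For readers unfamiliar with the "git describe" strings in this report, the
value v4.14.8-175-ga8cfbfd47f96 packs three fields: the nearest tag, the
number of commits on top of that tag, and an abbreviated commit hash
prefixed with "g". The following Python sketch splits the string and checks
the abbreviation against the full commit listed in the summary; the helper
name parse_describe is an assumption made for this example.

```python
import re


def parse_describe(describe: str):
    """Split `git describe` output of the form <tag>-<count>-g<abbrev>."""
    m = re.fullmatch(r"(?P<tag>.+)-(?P<count>\d+)-g(?P<abbrev>[0-9a-f]+)",
                     describe)
    if m is None:
        raise ValueError(f"not a <tag>-<count>-g<abbrev> string: {describe!r}")
    return m["tag"], int(m["count"]), m["abbrev"]


# Values copied from the summary in this message.
tag, count, abbrev = parse_describe("v4.14.8-175-ga8cfbfd47f96")
commit = "a8cfbfd47f96dfffa64be8567174a00e4dfc1458"

# The abbreviated hash must be a prefix of the full commit id.
print(tag, count, commit.startswith(abbrev))  # v4.14.8 175 True
```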

Boards, architectures and test suites:
-------------------------------------

hi6220-hikey - arm64
* boot - pass: 20,
* kselftest - pass: 46, skip: 16
* libhugetlbfs - pass: 90, skip: 1
* ltp-cap_bounds-tests - pass: 2,
* ltp-containers-tests - pass: 64,
* ltp-fcntl-locktests-tests - pass: 2,
* ltp-filecaps-tests - pass: 2,
* ltp-fs-tests - pass: 60,
* ltp-fs_bind-tests - pass: 2,
* ltp-fs_perms_simple-tests - pass: 19,
* ltp-fsx-tests - pass: 2,
* ltp-hugetlb-tests - pass: 21, skip: 1
* ltp-io-tests - pass: 3,
* ltp-ipc-tests - pass: 9,
* ltp-math-tests - pass: 11,
* ltp-nptl-tests - pass: 2,
* ltp-pty-tests - pass: 4,
* ltp-sched-tests - pass: 14,
* ltp-securebits-tests - pass: 4,
* ltp-syscalls-tests - pass: 983, skip: 121
* ltp-timers-tests - pass: 12,

juno-r2 - arm64
* boot - pass: 20,
* kselftest - pass: 45, skip: 17
* libhugetlbfs - pass: 90, skip: 1
* ltp-cap_bounds-tests - pass: 2,
* ltp-containers-tests - pass: 64,
* ltp-fcntl-locktests-tests - pass: 2,
* ltp-filecaps-tests - pass: 2,
* ltp-fs_bind-tests - pass: 2,
* ltp-fs_perms_simple-tests - pass: 19,
* ltp-fsx-tests - pass: 2,
* ltp-hugetlb-tests - pass: 22,
* ltp-io-tests - pass: 3,
* ltp-ipc-tests - pass: 9,
* ltp-math-tests - pass: 11,
* ltp-nptl-tests - pass: 2,
* ltp-pty-tests - pass: 4,
* ltp-sched-tests - pass: 14,
* ltp-securebits-tests - pass: 4,
* ltp-syscalls-tests - pass: 987, skip: 121
* ltp-timers-tests - pass: 12,

x15 - arm
* boot - pass: 20,
* kselftest - pass: 41, skip: 20
* libhugetlbfs - pass: 87, skip: 1
* ltp-cap_bounds-tests - pass: 2,
* ltp-containers-tests - pass: 64,
* ltp-fcntl-locktests-tests - pass: 2,
* ltp-filecaps-tests - pass: 2,
* ltp-fs-tests - pass: 60,
* ltp-fs_bind-tests - pass: 2,
* ltp-fs_perms_simple-tests - pass: 19,
* ltp-fsx-tests - pass: 2,
* ltp-hugetlb-tests - pass: 20, skip: 2
* ltp-io-tests - pass: 3,
* ltp-ipc-tests - pass: 9,
* ltp-math-tests - pass: 11,
* ltp-nptl-tests - pass: 2,
* ltp-pty-tests - pass: 4,
* ltp-sched-tests - pass: 13, skip: 1
* ltp-securebits-tests - pass: 4,
* ltp-syscalls-tests - pass: 1037, skip: 66
* ltp-timers-tests - pass: 12,

x86_64
* boot - pass: 20,
* kselftest - pass: 58, skip: 17
* libhugetlbfs - pass: 89, skip: 1
* ltp-cap_bounds-tests - pass: 2,
* ltp-containers-tests - pass: 64,
* ltp-fcntl-locktests-tests - pass: 2,
* ltp-filecaps-tests - pass: 2,
* ltp-fs-tests - pass: 61, skip: 1
* ltp-fs_bind-tests - pass: 2,
* ltp-fs_perms_simple-tests - pass: 19,
* ltp-fsx-tests - pass: 2,
* ltp-hugetlb-tests - pass: 22,
* ltp-io-tests - pass: 3,
* ltp-ipc-tests - pass: 9,
* ltp-math-tests - pass: 11,
* ltp-nptl-tests - pass: 2,
* ltp-pty-tests - pass: 4,
* ltp-sched-tests - pass: 9, skip: 1
* ltp-securebits-tests - pass: 4,
* ltp-syscalls-tests - pass: 1005, skip: 116
* ltp-timers-tests - pass: 12,



Documentation - https://collaborate.linaro.org/display/LKFT/Email+Reports
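
The per-suite lines in the report above follow a regular shape
("* suite - pass: N, skip: M"), so they are easy to post-process. The
Python sketch below turns such lines into a dictionary and looks for any
"fail" counter; the function name parse_suites and the regular expressions
are assumptions made for this example and are not part of the LKFT tooling.

```python
import re

# One report line: "* <suite> - <counter>: <n>, <counter>: <n>"
LINE = re.compile(r"^\* (?P<suite>\S+) - (?P<counts>.*)$")
COUNT = re.compile(r"(\w+): (\d+)")


def parse_suites(text: str) -> dict:
    """Map each suite name to its counters, e.g. {"pass": 58, "skip": 17}."""
    results = {}
    for line in text.splitlines():
        m = LINE.match(line.strip())
        if m:
            results[m["suite"]] = {
                name: int(value) for name, value in COUNT.findall(m["counts"])
            }
    return results


# Three lines copied from the x86_64 section of this report.
sample = """\
* boot - pass: 20,
* kselftest - pass: 58, skip: 17
* ltp-syscalls-tests - pass: 1005, skip: 116
"""

suites = parse_suites(sample)
failed = [name for name, counts in suites.items() if counts.get("fail", 0)]
print(suites["kselftest"])  # {'pass': 58, 'skip': 17}
print(failed)               # []
```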

^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 000/159] 4.14.9-stable review
  2017-12-22 21:09 ` Shuah Khan
@ 2017-12-23  9:14   ` Greg Kroah-Hartman
  0 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-23  9:14 UTC (permalink / raw)
  To: Shuah Khan
  Cc: linux-kernel, torvalds, akpm, linux, patches, ben.hutchings,
	lkft-triage, stable

On Fri, Dec 22, 2017 at 02:09:59PM -0700, Shuah Khan wrote:
> On 12/22/2017 01:44 AM, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 4.14.9 release.
> > There are 159 patches in this series, all will be posted as a response
> > to this one.  If anyone has any issues with these being applied, please
> > let me know.
> > 
> > Responses should be made by Sun Dec 24 08:45:36 UTC 2017.
> > Anything received after that time might be too late.
> > 
> > The whole patch series can be found in one patch at:
> > 	kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.9-rc1.gz
> > or in the git tree and branch at:
> >   git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y
> > and the diffstat can be found below.
> > 
> > thanks,
> > 
> > greg k-h
> > 
> 
> Compiled and booted on my test system. No dmesg regressions.

Great, thanks for testing all of these and letting me know.

greg k-h

* Re: [PATCH 4.14 000/159] 4.14.9-stable review
  2017-12-22 22:31 ` Dan Rue
@ 2017-12-23  9:17   ` Greg Kroah-Hartman
  0 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-23  9:17 UTC (permalink / raw)
  To: linux-kernel, ben.hutchings, shuahkh, lkft-triage, patches,
	stable, akpm, torvalds, linux

On Fri, Dec 22, 2017 at 04:31:05PM -0600, Dan Rue wrote:
> On Fri, Dec 22, 2017 at 09:44:45AM +0100, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 4.14.9 release.
> > There are 159 patches in this series, all will be posted as a response
> > to this one.  If anyone has any issues with these being applied, please
> > let me know.
> > 
> > Responses should be made by Sun Dec 24 08:45:36 UTC 2017.
> > Anything received after that time might be too late.
> > 
> > The whole patch series can be found in one patch at:
> > 	kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.9-rc1.gz
> > or in the git tree and branch at:
> >   git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y
> > and the diffstat can be found below.
> 
> Results from Linaro - 4.14.9-rc3 looks good. No regressions on arm64, arm, or
> x86_64.

That's amazing, it was a pain to get out :)

Thanks for testing and letting me know.

greg k-h

* Re: [PATCH 4.14 108/159] kvm, mm: account kvm related kmem slabs to kmemcg
  2017-12-22 17:56           ` Michal Hocko
  2017-12-22 18:07             ` alexander.levin
@ 2017-12-23  9:24             ` Greg Kroah-Hartman
  2017-12-27 10:30               ` Paolo Bonzini
  1 sibling, 1 reply; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-23  9:24 UTC (permalink / raw)
  To: Michal Hocko
  Cc: alexander.levin, linux-kernel, stable, Shakeel Butt, Paolo Bonzini

On Fri, Dec 22, 2017 at 06:56:16PM +0100, Michal Hocko wrote:
> On Fri 22-12-17 17:40:10, Sasha Levin wrote:
> > On Fri, Dec 22, 2017 at 02:06:07PM +0100, Michal Hocko wrote:
> > >On Fri 22-12-17 13:41:22, Greg KH wrote:
> > >> On Fri, Dec 22, 2017 at 10:34:07AM +0100, Michal Hocko wrote:
> > >> > On Fri 22-12-17 09:46:33, Greg KH wrote:
> > >> > > 4.14-stable review patch.  If anyone has any objections, please let me know.
> > >> > >
> > >> > > ------------------
> > >> > >
> > >> > > From: Shakeel Butt <shakeelb@google.com>
> > >> > >
> > >> > >
> > >> > > [ Upstream commit 46bea48ac241fe0b413805952dda74dd0c09ba8b ]
> > >> > >
> > >> > > The kvm slabs can consume a significant amount of system memory
> > >> > > and indeed in our production environment we have observed that
> > >> > > a lot of machines are spending significant amount of memory that
> > >> > > can not be left as system memory overhead. Also the allocations
> > >> > > from these slabs can be triggered directly by user space applications
> > >> > > which has access to kvm and thus a buggy application can leak
> > >> > > such memory. So, these caches should be accounted to kmemcg.
> > >> > >
> > >> > > Signed-off-by: Shakeel Butt <shakeelb@google.com>
> > >> > > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> > >> > > Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
> > >> > > Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > >> >
> > >> > The patch is not marked for stable, nor does it fix an existing bug.
> > >> > It is a nice to have thing for sure but I am wondering how this got
> > >> > through stable-filter.
> > >>
> > >> Sasha picked it out, and it seemed like a sane thing to backport.  If
> > >> you think it's not worthy, I'll gladly drop it, but it seemed like such
> > >> a simple bugfix to include.
> > >
> > >It is not that I would have some specific concerns about this particular
> > >patch. It is more of a worry about the overall process. I thought that
> > >_any_ patch backported to the stable tree would require a specific bug
> > >to be fixed or in exceptional cases a performance issue. I have
> > >experienced this pushback myself when trying to push "no real bug report
> > >but better to have this plugged" patches.
> > >
> > >So something has apparently changed in the process, I just haven't
> > >noticed it. I am worried this might lead to more regressions in the future.
> > >Not that my worry counts all that much as I am not a stable kernel user
> > >though. So this is just my 2c worth of worry.
> > 
> > The way I see it is that stable commits are supposed to fix a bug that
> > a user can hit/exploit, it doesn't have to have an actual user
> > complaining about it.
> > 
> > For this particular commit, the way I read it is that a user can avoid
> > his kmemcg limits (maybe maliciously), which would qualify as an
> > actual bug we want to get fixed.
> 
> How are you going to judge all the possible relations to other
> subsystems? I mean there is a good reason maintainers mark patches for
> stable trees. How do you want to competently decide this for them? Can
> you do that for all subsystems?
> 
> I do not want to underestimate your judgment or misinterpret your
> process here but I _believe_ that picking patches based on the changelog
> without a deep understanding of the subsystem is really risky. We do
> not really have to go a long way to see that. Just look at the other patch
> in this very thread [1]. But maybe our understandings of the stable
> trees are different.

For many subsystems, the maintainers _never_ mark patches for stable.
Others, they catch maybe half of the things they should be applying.

KVM is one such example of the "half" group, they mark patches as
resolving CVE issues at times, yet don't mark them for stable.  So when
I see a patch like this, it triggers the "oh, look, KVM doing the same
thing again", so I take the patch and of course cc: the
developers/maintainers so they can object if they want to.

Over time you get to know what subsystems are like this and what are
not.  MM is one that is really good, I almost never take a mm patch
without being told explicitly to do so.  Others are horrible and never
mark anything, so stuff has to be picked up manually through Sasha's
process or through other ways.

So it's not a perfect system, but it seems to work "good enough", and if
you ever have any questions about any patch, always feel free to ask,
there's usually a story behind almost every one...

thanks,

greg k-h

* Re: [PATCH 4.14 000/159] 4.14.9-stable review
  2017-12-22 18:15     ` Guenter Roeck
@ 2017-12-23 14:21       ` Greg Kroah-Hartman
  2017-12-23 17:09         ` Guenter Roeck
  0 siblings, 1 reply; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-23 14:21 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: linux-kernel, torvalds, akpm, shuahkh, patches, ben.hutchings,
	lkft-triage, stable

On Fri, Dec 22, 2017 at 10:15:50AM -0800, Guenter Roeck wrote:
> On Fri, Dec 22, 2017 at 04:54:41PM +0100, Greg Kroah-Hartman wrote:
> > On Fri, Dec 22, 2017 at 04:08:39PM +0100, Greg Kroah-Hartman wrote:
> > > On Fri, Dec 22, 2017 at 09:44:45AM +0100, Greg Kroah-Hartman wrote:
> > > > This is the start of the stable review cycle for the 4.14.9 release.
> > > > There are 159 patches in this series, all will be posted as a response
> > > > to this one.  If anyone has any issues with these being applied, please
> > > > let me know.
> > > > 
> > > > Responses should be made by Sun Dec 24 08:45:36 UTC 2017.
> > > > Anything received after that time might be too late.
> > > > 
> > > > The whole patch series can be found in one patch at:
> > > > 	kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.9-rc1.gz
> > > > or in the git tree and branch at:
> > > >   git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y
> > > > and the diffstat can be found below.
> > > 
> > > Ok, that blew up hard on arm64, there's now a -rc2 out with a fix for
> > > that.  Hopefully :)
> > > 
> > >  	kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.9-rc2.gz
> > 
> > And because it's just been one of those weeks, there's now a -rc3 out
> > due to a bunch of important BPF patches.
> > 
> 
> > I'll stop now to give you all a chance to test...
> > 
> 
> I can't keep up with this. I'll let my builders try again tonight
> and report tomorrow.
> 
> This is what I have so far.
> 
> h8300 builds are broken.
> 
> include/linux/compiler.h:344:2: error:
> 	implicit declaration of function ‘smp_read_barrier_depends’
> 
> Seen when building arch/h8300/kernel/asm-offsets.c.
> 
> sparc32 builds are broken with the same error, in this case when building
> init/do_mounts_initrd.c.

Ok, I think I found the problem for this, and have added a fix to the
tree.  It builds here for x86 for me, I'll watch your builders and see
how it goes...

thanks,

greg k-h

* Re: [PATCH 4.14 000/159] 4.14.9-stable review
  2017-12-23 14:21       ` Greg Kroah-Hartman
@ 2017-12-23 17:09         ` Guenter Roeck
  0 siblings, 0 replies; 349+ messages in thread
From: Guenter Roeck @ 2017-12-23 17:09 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, torvalds, akpm, shuahkh, patches, ben.hutchings,
	lkft-triage, stable

On 12/23/2017 06:21 AM, Greg Kroah-Hartman wrote:
> On Fri, Dec 22, 2017 at 10:15:50AM -0800, Guenter Roeck wrote:
>> On Fri, Dec 22, 2017 at 04:54:41PM +0100, Greg Kroah-Hartman wrote:
>>> On Fri, Dec 22, 2017 at 04:08:39PM +0100, Greg Kroah-Hartman wrote:
>>>> On Fri, Dec 22, 2017 at 09:44:45AM +0100, Greg Kroah-Hartman wrote:
>>>>> This is the start of the stable review cycle for the 4.14.9 release.
>>>>> There are 159 patches in this series, all will be posted as a response
>>>>> to this one.  If anyone has any issues with these being applied, please
>>>>> let me know.
>>>>>
>>>>> Responses should be made by Sun Dec 24 08:45:36 UTC 2017.
>>>>> Anything received after that time might be too late.
>>>>>
>>>>> The whole patch series can be found in one patch at:
>>>>> 	kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.9-rc1.gz
>>>>> or in the git tree and branch at:
>>>>>    git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y
>>>>> and the diffstat can be found below.
>>>>
>>>> Ok, that blew up hard on arm64, there's now a -rc2 out with a fix for
>>>> that.  Hopefully :)
>>>>
>>>>   	kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.9-rc2.gz
>>>
>>> And because it's just been one of those weeks, there's now a -rc3 out
>>> due to a bunch of important BPF patches.
>>>
>>
>>> I'll stop now to give you all a chance to test...
>>>
>>
>> I can't keep up with this. I'll let my builders try again tonight
>> and report tomorrow.
>>
>> This is what I have so far.
>>
>> h8300 builds are broken.
>>
>> include/linux/compiler.h:344:2: error:
>> 	implicit declaration of function ‘smp_read_barrier_depends’
>>
>> Seen when building arch/h8300/kernel/asm-offsets.c.
>>
>> sparc32 builds are broken with the same error, in this case when building
>> init/do_mounts_initrd.c.
> 
> Ok, I think I found the problem for this, and have added a fix to the
> tree.  It builds here for x86 for me, I'll watch your builders and see
> how it goes...
> 

This is now fixed. I'll send a complete build report after all builds are
complete.

Guenter

* Re: [PATCH 4.14 000/159] 4.14.9-stable review
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (163 preceding siblings ...)
  2017-12-22 22:31 ` Dan Rue
@ 2017-12-23 22:54 ` Guenter Roeck
  2017-12-25 13:35   ` Greg Kroah-Hartman
  2017-12-24 19:37 ` Ivan Kozik
  165 siblings, 1 reply; 349+ messages in thread
From: Guenter Roeck @ 2017-12-23 22:54 UTC (permalink / raw)
  To: Greg Kroah-Hartman, linux-kernel
  Cc: torvalds, akpm, shuahkh, patches, ben.hutchings, lkft-triage, stable

On 12/22/2017 12:44 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.14.9 release.
> There are 159 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Sun Dec 24 08:45:36 UTC 2017.
> Anything received after that time might be too late.
> 

For v4.14.8-176-g3b153f8:

Build results:
	total: 145 pass: 145 fail: 0
Qemu test results:
	total: 126 pass: 126 fail: 0

Guenter

* Re: [PATCH 4.14 000/159] 4.14.9-stable review
  2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
                   ` (164 preceding siblings ...)
  2017-12-23 22:54 ` Guenter Roeck
@ 2017-12-24 19:37 ` Ivan Kozik
  2017-12-24 22:03   ` Andre Tomt
  2017-12-25 13:38   ` Greg Kroah-Hartman
  165 siblings, 2 replies; 349+ messages in thread
From: Ivan Kozik @ 2017-12-24 19:37 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, Linus Torvalds, akpm, Guenter Roeck, Shuah Khan,
	patches, Ben Hutchings, lkft-triage, stable

On Fri, Dec 22, 2017 at 8:44 AM, Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
> This is the start of the stable review cycle for the 4.14.9 release.
> There are 159 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Sun Dec 24 08:45:36 UTC 2017.
> Anything received after that time might be too late.
>

> Josh Poimboeuf <jpoimboe@redhat.com>
>     x86/unwind: Make CONFIG_UNWINDER_ORC=y the default in kconfig for 64-bit

This is uncovering a very difficult-to-debug build failure with NVIDIA DKMS:
with CONFIG_UNWINDER_ORC=y, out-of-tree modules hit this rule in
scripts/Makefile.build:

$(obj)/%.o: $(src)/%.c $(recordmcount_source) $(objtool_dep) FORCE

and fail (here, at least) to build tools/objtool/objtool (note: I do have
libelf-dev installed).

After editing dkms.conf to do `make --debug=a -j1`, I see make output:

 Considering target file
'/var/lib/dkms/nvidia-current/387.34/build/nvidia/nv-gpu-numa.o'.
  File '/var/lib/dkms/nvidia-current/387.34/build/nvidia/nv-gpu-numa.o'
does not exist.
  Looking for an implicit rule for
'/var/lib/dkms/nvidia-current/387.34/build/nvidia/nv-gpu-numa.o'.
  [...]
  Trying rule prerequisite 'tools/objtool/objtool'.
  Looking for a rule with intermediate file 'tools/objtool/objtool'.
   Avoiding implicit rule recursion.

make then silently fails to build objtool, silently fails to build all the
.o files, and continues until ld finally errors out trying to link
nonexistent object files.
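
Because the failure is silent, it may help to check for the objtool binary before starting the module build. The following is a minimal sketch under stated assumptions: the helper name `objtool_available` is hypothetical, and it assumes the kernel build tree used for out-of-tree modules is expected to contain a prebuilt `tools/objtool/objtool`, which is the prerequisite path shown in the make debug output above.

```python
import os


def objtool_available(kernel_build_dir):
    """Return True if an executable objtool exists under the build tree.

    kernel_build_dir is the directory the module build points at (an
    assumption here; substitute whatever your packaging actually uses).
    """
    path = os.path.join(kernel_build_dir, "tools", "objtool", "objtool")
    # Require a regular file with the execute bit usable by this user.
    return os.path.isfile(path) and os.access(path, os.X_OK)
```

If this returns False on a kernel configured with CONFIG_UNWINDER_ORC=y, the silent failure described above is a likely outcome, and the missing binary is the first thing to investigate.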

If things look alright to you, and objtool is known to work with out-of-tree
modules, and the Debian packaging just needs to be adjusted, please ignore;
I figured I'd send this anyway because it was such a pain to debug.

Thanks,

Ivan

On Fri, Dec 22, 2017 at 8:44 AM, Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
> This is the start of the stable review cycle for the 4.14.9 release.
> There are 159 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Sun Dec 24 08:45:36 UTC 2017.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
>         kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.9-rc1.gz
> or in the git tree and branch at:
>   git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
>
> -------------
> Pseudo-Shortlog of commits:
>
> Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>     Linux 4.14.9-rc1
>
> Peter Hutterer <peter.hutterer@who-t.net>
>     platform/x86: asus-wireless: send an EV_SYN/SYN_REPORT between state changes
>
> Daniel Lezcano <daniel.lezcano@linaro.org>
>     thermal/drivers/hisi: Fix multiple alarm interrupts firing
>
> Daniel Lezcano <daniel.lezcano@linaro.org>
>     thermal/drivers/hisi: Simplify the temperature/step computation
>
> Daniel Lezcano <daniel.lezcano@linaro.org>
>     thermal/drivers/hisi: Fix kernel panic on alarm interrupt
>
> Daniel Lezcano <daniel.lezcano@linaro.org>
>     thermal/drivers/hisi: Fix missing interrupt enablement
>
> Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
>     IB/opa_vnic: Properly return the total MACs in UC MAC list
>
> Scott Franco <safranco@intel.com>
>     IB/opa_vnic: Properly clear Mac Table Digest
>
> Eric Anholt <eric@anholt.net>
>     drm/vc4: Avoid using vrefresh==0 mode in DSI htotal math.
>
> Nicholas Piggin <npiggin@gmail.com>
>     cpuidle: fix broadcast control when broadcast can not be entered
>
> Alexandre Belloni <alexandre.belloni@free-electrons.com>
>     rtc: set the alarm to the next expiring timer
>
> Hoang Tran <tranviethoang.vn@gmail.com>
>     tcp: fix under-evaluated ssthresh in TCP Vegas
>
> Chen-Yu Tsai <wens@csie.org>
>     clk: sunxi-ng: sun6i: Rename HDMI DDC clock to avoid name collision
>
> Arvind Yadav <arvind.yadav.cs@gmail.com>
>     staging: greybus: light: Release memory obtained by kasprintf
>
> Wei Hu(Xavier) <xavier.huwei@huawei.com>
>     RDMA/hns: Avoid NULL pointer exception
>
> Mike Manning <mmanning@brocade.com>
>     net: ipv6: send NS for DAD when link operationally up
>
> Mick Tarsel <mjtarsel@linux.vnet.ibm.com>
>     ibmvnic: Set state UP
>
> Jacob Keller <jacob.e.keller@intel.com>
>     fm10k: ensure we process SM mbx when processing VF mbx
>
> Marek Szyprowski <m.szyprowski@samsung.com>
>     ARM: exynos_defconfig: Enable UAS support for Odroid HC1 board
>
> Alex Williamson <alex.williamson@redhat.com>
>     vfio/pci: Virtualize Maximum Payload Size
>
> Alan Brady <alan.brady@intel.com>
>     i40e: fix client notify of VF reset
>
> Dick Kennedy <dick.kennedy@broadcom.com>
>     scsi: lpfc: Fix warning messages when NVME_TARGET_FC not defined
>
> Dick Kennedy <dick.kennedy@broadcom.com>
>     scsi: lpfc: PLOGI failures during NPIV testing
>
> Dick Kennedy <dick.kennedy@broadcom.com>
>     scsi: lpfc: Fix secure firmware updates
>
> Jacob Keller <jacob.e.keller@intel.com>
>     fm10k: fix mis-ordered parameters in declaration for .ndo_set_vf_bw
>
> Nicolas Dechesne <nicolas.dechesne@linaro.org>
>     ASoC: codecs: msm8916-wcd-analog: fix module autoload
>
> Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
>     sctp: silence warns on sctp_stream_init allocations
>
> Nicholas Piggin <npiggin@gmail.com>
>     powerpc/watchdog: Do not trigger SMP crash from touch_nmi_watchdog
>
> Nicholas Piggin <npiggin@gmail.com>
>     powerpc/xmon: Avoid tripping SMP hardlockup watchdog
>
> Ed Blake <ed.blake@sondrel.com>
>     ASoC: img-parallel-out: Add pm_runtime_get/put to set_fmt callback
>
> Jean-François Têtu <jean-francois.tetu@savoirfairelinux.com>
>     ASoC: codecs: msm8916-wcd-analog: fix micbias level
>
> Tom Zanussi <tom.zanussi@linux.intel.com>
>     tracing: Exclude 'generic fields' from histograms
>
> Gabriele Paoloni <gabriele.paoloni@huawei.com>
>     PCI/AER: Report non-fatal errors only to the affected endpoint
>
> Jacob Keller <jacob.e.keller@intel.com>
>     i40e/i40evf: spread CPU affinity hints across online CPUs only
>
> Hans de Goede <hdegoede@redhat.com>
>     Bluetooth: hci_bcm: Fix setting of irq trigger type
>
> Hans de Goede <hdegoede@redhat.com>
>     Bluetooth: hci_uart_set_flow_control: Fix NULL deref when using serdev
>
> Andrew Jeffery <andrew@aj.id.au>
>     leds: pca955x: Don't invert requested value in pca955x_gpio_set_value()
>
> Wei Wang <weiwan@google.com>
>     ipv6: grab rt->rt6i_ref before allocating pcpu rt
>
> William Tu <u9012063@gmail.com>
>     ip_gre: check packet length and mtu correctly in erspan tx
>
> Guoqing Jiang <gqjiang@suse.com>
>     md: always set THREAD_WAKEUP and wake up wqueue if thread existed
>
> Luca Miccio <lucmiccio@gmail.com>
>     block,bfq: Disable writeback throttling
>
> Colin Ian King <colin.king@canonical.com>
>     IB/rxe: check for allocation failure on elem
>
> Emil Tantilov <emil.s.tantilov@intel.com>
>     ixgbe: fix use of uninitialized padding
>
> Lorenzo Bianconi <lorenzo.bianconi83@gmail.com>
>     iio: st_sensors: add register mask for status register
>
> Lihong Yang <lihong.yang@intel.com>
>     i40e: use the safe hash table iterator when deleting mac filters
>
> Christophe JAILLET <christophe.jaillet@wanadoo.fr>
>     igb: check memory allocation failure
>
> Fabio Estevam <fabio.estevam@nxp.com>
>     PM / OPP: Move error message to debug level
>
> Stuart Hayes <stuart.w.hayes@gmail.com>
>     PCI: Create SR-IOV virtfn/physfn links before attaching driver
>
> Sreekanth Reddy <sreekanth.reddy@broadcom.com>
>     scsi: mpt3sas: Fix IO error occurs on pulling out a drive from RAID1 volume created on two SATA drive
>
> Varun Prakash <varun@chelsio.com>
>     scsi: cxgb4i: fix Tx skb leak
>
> David Daney <david.daney@cavium.com>
>     PCI: Avoid bus reset if bridge itself is broken
>
> Dan Murphy <dmurphy@ti.com>
>     net: phy: at803x: Change error to EINVAL for invalid MAC
>
> Shakeel Butt <shakeelb@google.com>
>     kvm, mm: account kvm related kmem slabs to kmemcg
>
> Russell King <rmk+kernel@armlinux.org.uk>
>     rtc: pl031: make interrupt optional
>
> Christophe Jaillet <christophe.jaillet@wanadoo.fr>
>     crypto: lrw - Fix an error handling path in 'create()'
>
> Christian Lamparter <chunkeey@gmail.com>
>     crypto: crypto4xx - increase context and scatter ring buffer elements
>
> Chen-Yu Tsai <wens@csie.org>
>     clk: sunxi-ng: sun5i: Fix bit offset of audio PLL post-divider
>
> Chen-Yu Tsai <wens@csie.org>
>     clk: sunxi-ng: nm: Check if requested rate is supported by fractional clock
>
> Shashank Sharma <shashank.sharma@intel.com>
>     drm: Add retries for lspcon mode detection
>
> Derek Basehore <dbasehore@chromium.org>
>     backlight: pwm_bl: Fix overflow condition
>
> Jens Wiklander <jens.wiklander@linaro.org>
>     optee: fix invalid of_node_put() in optee_driver_init()
>
> Thomas Gleixner <tglx@linutronix.de>
>     x86/cpufeatures: Make CPU bugs sticky
>
> Thomas Gleixner <tglx@linutronix.de>
>     x86/paravirt: Provide a way to check for hypervisors
>
> Thomas Gleixner <tglx@linutronix.de>
>     x86/paravirt: Dont patch flush_tlb_single
>
> Andy Lutomirski <luto@kernel.org>
>     x86/entry/64: Make cpu_entry_area.tss read-only
>
> Andy Lutomirski <luto@kernel.org>
>     x86/entry: Clean up the SYSENTER_stack code
>
> Andy Lutomirski <luto@kernel.org>
>     x86/entry/64: Remove the SYSENTER stack canary
>
> Andy Lutomirski <luto@kernel.org>
>     x86/entry/64: Move the IST stacks into struct cpu_entry_area
>
> Andy Lutomirski <luto@kernel.org>
>     x86/entry/64: Create a per-CPU SYSCALL entry trampoline
>
> Andy Lutomirski <luto@kernel.org>
>     x86/entry/64: Return to userspace from the trampoline stack
>
> Andy Lutomirski <luto@kernel.org>
>     x86/entry/64: Use a per-CPU trampoline stack for IDT entries
>
> Andy Lutomirski <luto@kernel.org>
>     x86/espfix/64: Stop assuming that pt_regs is on the entry stack
>
> Andy Lutomirski <luto@kernel.org>
>     x86/entry/64: Separate cpu_current_top_of_stack from TSS.sp0
>
> Andy Lutomirski <luto@kernel.org>
>     x86/entry: Remap the TSS into the CPU entry area
>
> Andy Lutomirski <luto@kernel.org>
>     x86/entry: Move SYSENTER_stack to the beginning of struct tss_struct
>
> Andy Lutomirski <luto@kernel.org>
>     x86/dumpstack: Handle stack overflow on all stacks
>
> Andy Lutomirski <luto@kernel.org>
>     x86/entry: Fix assumptions that the HW TSS is at the beginning of cpu_tss
>
> Andy Lutomirski <luto@kernel.org>
>     x86/kasan/64: Teach KASAN about the cpu_entry_area
>
> Andy Lutomirski <luto@kernel.org>
>     x86/mm/fixmap: Generalize the GDT fixmap mechanism, introduce struct cpu_entry_area
>
> Andy Lutomirski <luto@kernel.org>
>     x86/entry/gdt: Put per-CPU GDT remaps in ascending order
>
> Andy Lutomirski <luto@kernel.org>
>     x86/dumpstack: Add get_stack_info() support for the SYSENTER stack
>
> Andy Lutomirski <luto@kernel.org>
>     x86/entry/64: Allocate and enable the SYSENTER stack
>
> Andy Lutomirski <luto@kernel.org>
>     x86/irq/64: Print the offending IP in the stack overflow warning
>
> Andy Lutomirski <luto@kernel.org>
>     x86/irq: Remove an old outdated comment about context tracking races
>
> Josh Poimboeuf <jpoimboe@redhat.com>
>     x86/unwinder: Handle stack overflows more gracefully
>
> Andy Lutomirski <luto@kernel.org>
>     x86/unwinder/orc: Dont bail on stack overflow
>
> Boris Ostrovsky <boris.ostrovsky@oracle.com>
>     x86/entry/64/paravirt: Use paravirt-safe macro to access eflags
>
> Andrey Ryabinin <aryabinin@virtuozzo.com>
>     x86/mm/kasan: Don't use vmemmap_populate() to initialize shadow
>
> Will Deacon <will.deacon@arm.com>
>     locking/barriers: Convert users of lockless_dereference() to READ_ONCE()
>
> Will Deacon <will.deacon@arm.com>
>     locking/barriers: Add implicit smp_read_barrier_depends() to READ_ONCE()
>
> Daniel Borkmann <daniel@iogearbox.net>
>     bpf: fix build issues on um due to mising bpf_perf_event.h
>
> Andi Kleen <ak@linux.intel.com>
>     perf/x86: Enable free running PEBS for REGS_USER/INTR
>
> Rudolf Marek <r.marek@assembler.cz>
>     x86: Make X86_BUG_FXSAVE_LEAK detectable in CPUID on AMD
>
> Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
>     x86/cpufeature: Add User-Mode Instruction Prevention definitions
>
> Ingo Molnar <mingo@kernel.org>
>     drivers/misc/intel/pti: Rename the header file to free up the namespace
>
> Juergen Gross <jgross@suse.com>
>     x86/virt: Add enum for hypervisors to replace x86_hyper
>
> Juergen Gross <jgross@suse.com>
>     x86/virt, x86/platform: Merge 'struct x86_hyper' into 'struct x86_platform' and 'struct x86_init'
>
> James Morse <james.morse@arm.com>
>     ACPI / APEI: Replace ioremap_page_range() with fixmap
>
> Andy Lutomirski <luto@kernel.org>
>     selftests/x86/ldt_gdt: Run most existing LDT test cases against the GDT as well
>
> Andy Lutomirski <luto@kernel.org>
>     selftests/x86/ldt_gdt: Add infrastructure to test set_thread_area()
>
> Ingo Molnar <mingo@kernel.org>
>     x86/cpufeatures: Fix various details in the feature definitions
>
> Ingo Molnar <mingo@kernel.org>
>     x86/cpufeatures: Re-tabulate the X86_FEATURE definitions
>
> Borislav Petkov <bp@suse.de>
>     x86/mm: Define _PAGE_TABLE using _KERNPG_TABLE
>
> Thomas Gleixner <tglx@linutronix.de>
>     bitops: Revert cbe96375025e ("bitops: Add clear/set_bit32() to linux/bitops.h")
>
> Thomas Gleixner <tglx@linutronix.de>
>     x86/cpuid: Replace set/clear_bit32()
>
> Borislav Petkov <bp@suse.de>
>     x86/entry/64: Shorten TEST instructions
>
> Andy Lutomirski <luto@kernel.org>
>     x86/traps: Use a new on_thread_stack() helper to clean up an assertion
>
> Andy Lutomirski <luto@kernel.org>
>     x86/entry/64: Remove thread_struct::sp0
>
> Andy Lutomirski <luto@kernel.org>
>     x86/entry/32: Fix cpu_current_top_of_stack initialization at boot
>
> Andy Lutomirski <luto@kernel.org>
>     x86/entry/64: Remove all remaining direct thread_struct::sp0 reads
>
> Andy Lutomirski <luto@kernel.org>
>     x86/entry/64: Stop initializing TSS.sp0 at boot
>
> Andy Lutomirski <luto@kernel.org>
>     x86/xen/64, x86/entry/64: Clean up SP code in cpu_initialize_context()
>
> Andy Lutomirski <luto@kernel.org>
>     x86/entry: Add task_top_of_stack() to find the top of a task's stack
>
> Andy Lutomirski <luto@kernel.org>
>     x86/entry/64: Pass SP0 directly to load_sp0()
>
> Andy Lutomirski <luto@kernel.org>
>     x86/entry/32: Pull the MSR_IA32_SYSENTER_CS update code out of native_load_sp0()
>
> Andy Lutomirski <luto@kernel.org>
>     x86/entry/64: De-Xen-ify our NMI code
>
> Juergen Gross <jgross@suse.com>
>     xen, x86/entry/64: Add xen NMI trap entry
>
> Andy Lutomirski <luto@kernel.org>
>     x86/entry/64: Remove the RESTORE_..._REGS infrastructure
>
> Andy Lutomirski <luto@kernel.org>
>     x86/entry/64: Use POP instead of MOV to restore regs on NMI return
>
> Andy Lutomirski <luto@kernel.org>
>     x86/entry/64: Merge the fast and slow SYSRET paths
>
> Andy Lutomirski <luto@kernel.org>
>     x86/entry/64: Use pop instead of movq in syscall_return_via_sysret
>
> Andy Lutomirski <luto@kernel.org>
>     x86/entry/64: Shrink paranoid_exit_restore and make labels local
>
> Andy Lutomirski <luto@kernel.org>
>     x86/entry/64: Simplify reg restore code in the standard IRET paths
>
> Andy Lutomirski <luto@kernel.org>
>     x86/entry/64: Move SWAPGS into the common IRET-to-usermode path
>
> Andy Lutomirski <luto@kernel.org>
>     x86/entry/64: Split the IRET-to-user and IRET-to-kernel paths
>
> Andy Lutomirski <luto@kernel.org>
>     x86/entry/64: Remove the restore_c_regs_and_iret label
>
> Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
>     ptrace,x86: Make user_64bit_mode() available to 32-bit builds
>
> Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
>     x86/boot: Relocate definition of the initial state of CR0
>
> Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
>     x86/mm: Relocate page fault error codes to traps.h
>
> Gayatri Kammela <gayatri.kammela@intel.com>
>     x86/cpufeatures: Enable new SSE/AVX/AVX512 CPU features
>
> Baoquan He <bhe@redhat.com>
>     x86/mm/64: Rename the register_page_bootmem_memmap() 'size' parameter to 'nr_pages'
>
> Masahiro Yamada <yamada.masahiro@socionext.com>
>     x86/build: Beautify build log of syscall headers
>
> Josh Poimboeuf <jpoimboe@redhat.com>
>     x86/asm: Don't use the confusing '.ifeq' directive
>
> Dongjiu Geng <gengdongjiu@huawei.com>
>     ACPI / APEI: remove the unused dead-code for SEA/NMI notification type
>
> Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>     x86/xen: Drop 5-level paging support code from the XEN_PV code
>
> Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>     x86/xen: Provide pre-built page tables only for CONFIG_XEN_PV=y and CONFIG_XEN_PVH=y
>
> Andrey Ryabinin <aryabinin@virtuozzo.com>
>     x86/kasan: Use the same shadow offset for 4- and 5-level paging
>
> Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>     mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
>
> Thomas Gleixner <tglx@linutronix.de>
>     x86/cpuid: Prevent out of bound access in do_clear_cpu_cap()
>
> Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
>     objtool: Print top level commands on incorrect usage
>
> Kees Cook <keescook@chromium.org>
>     x86/platform/UV: Convert timers to use timer_setup()
>
> Andi Kleen <ak@linux.intel.com>
>     x86/fpu: Remove the explicit clearing of XSAVE dependent features
>
> Andi Kleen <ak@linux.intel.com>
>     x86/fpu: Make XSAVE check the base CPUID features before enabling
>
> Andi Kleen <ak@linux.intel.com>
>     x86/fpu: Parse clearcpuid= as early XSAVE argument
>
> Andi Kleen <ak@linux.intel.com>
>     x86/cpuid: Add generic table for CPUID dependencies
>
> Andi Kleen <ak@linux.intel.com>
>     bitops: Add clear/set_bit32() to linux/bitops.h
>
> Josh Poimboeuf <jpoimboe@redhat.com>
>     x86/unwind: Make CONFIG_UNWINDER_ORC=y the default in kconfig for 64-bit
>
> Josh Poimboeuf <jpoimboe@redhat.com>
>     x86/unwind: Rename unwinder config options to 'CONFIG_UNWINDER_*'
>
> Steven Rostedt (VMware) <rostedt@goodmis.org>
>     x86/fpu/debug: Remove unused 'x86_fpu_state' and 'x86_fpu_deactivate_state' tracepoints
>
> Ingo Molnar <mingo@kernel.org>
>     x86/unwinder: Make CONFIG_UNWINDER_ORC=y the default in the 64-bit defconfig
>
> Jan Beulich <JBeulich@suse.com>
>     ACPI / APEI: adjust a local variable type in ghes_ioremap_pfn_irq()
>
> Josh Poimboeuf <jpoimboe@redhat.com>
>     x86/head: Add unwind hint annotations
>
> Josh Poimboeuf <jpoimboe@redhat.com>
>     x86/xen: Add unwind hint annotations
>
> Josh Poimboeuf <jpoimboe@redhat.com>
>     x86/xen: Fix xen head ELF annotations
>
> Josh Poimboeuf <jpoimboe@redhat.com>
>     x86/boot: Annotate verify_cpu() as a callable function
>
> Josh Poimboeuf <jpoimboe@redhat.com>
>     x86/head: Fix head ELF function annotations
>
> Josh Poimboeuf <jpoimboe@redhat.com>
>     x86/head: Remove unused 'bad_address' code
>
> Josh Poimboeuf <jpoimboe@redhat.com>
>     x86/head: Remove confusing comment
>
> Josh Poimboeuf <jpoimboe@redhat.com>
>     objtool: Don't report end of section error after an empty unwind hint
>
> Uros Bizjak <ubizjak@gmail.com>
>     x86/asm: Remove unnecessary \n\t in front of CC_SET() from asm templates
>
>
> -------------
>
> Diffstat:
>
>  Documentation/x86/orc-unwinder.txt                 |   2 +-
>  Documentation/x86/x86_64/mm.txt                    |   2 +-
>  Makefile                                           |   8 +-
>  arch/arm/configs/exynos_defconfig                  |   2 +-
>  arch/arm64/include/asm/fixmap.h                    |   7 +
>  arch/powerpc/kernel/watchdog.c                     |   7 +-
>  arch/powerpc/xmon/xmon.c                           |  17 +-
>  arch/um/include/asm/Kbuild                         |   1 +
>  arch/x86/Kconfig                                   |   5 +-
>  arch/x86/Kconfig.debug                             |  39 +-
>  arch/x86/configs/tiny.config                       |   4 +-
>  arch/x86/configs/x86_64_defconfig                  |   1 +
>  arch/x86/entry/calling.h                           |  69 +--
>  arch/x86/entry/entry_32.S                          |   6 +-
>  arch/x86/entry/entry_64.S                          | 322 +++++++++---
>  arch/x86/entry/entry_64_compat.S                   |  10 +-
>  arch/x86/entry/syscalls/Makefile                   |   4 +-
>  arch/x86/events/core.c                             |   2 +-
>  arch/x86/events/intel/core.c                       |   4 +
>  arch/x86/events/perf_event.h                       |  24 +-
>  arch/x86/hyperv/hv_init.c                          |   2 +-
>  arch/x86/include/asm/archrandom.h                  |   8 +-
>  arch/x86/include/asm/bitops.h                      |  10 +-
>  arch/x86/include/asm/compat.h                      |   1 +
>  arch/x86/include/asm/cpufeature.h                  |  11 +-
>  arch/x86/include/asm/cpufeatures.h                 | 538 +++++++++++----------
>  arch/x86/include/asm/desc.h                        |  11 +-
>  arch/x86/include/asm/fixmap.h                      |  74 ++-
>  arch/x86/include/asm/hypervisor.h                  |  53 +-
>  arch/x86/include/asm/irqflags.h                    |   3 +
>  arch/x86/include/asm/kdebug.h                      |   1 +
>  arch/x86/include/asm/mmu_context.h                 |   4 +-
>  arch/x86/include/asm/module.h                      |   2 +-
>  arch/x86/include/asm/paravirt.h                    |  14 +-
>  arch/x86/include/asm/paravirt_types.h              |   2 +-
>  arch/x86/include/asm/percpu.h                      |   2 +-
>  arch/x86/include/asm/pgtable_types.h               |   3 +-
>  arch/x86/include/asm/processor.h                   | 109 +++--
>  arch/x86/include/asm/ptrace.h                      |   6 +-
>  arch/x86/include/asm/rmwcc.h                       |   2 +-
>  arch/x86/include/asm/stacktrace.h                  |   3 +
>  arch/x86/include/asm/switch_to.h                   |  26 +
>  arch/x86/include/asm/thread_info.h                 |   2 +-
>  arch/x86/include/asm/trace/fpu.h                   |  10 -
>  arch/x86/include/asm/traps.h                       |  21 +-
>  arch/x86/include/asm/unwind.h                      |  15 +-
>  arch/x86/include/asm/x86_init.h                    |  24 +
>  arch/x86/include/uapi/asm/processor-flags.h        |   3 +
>  arch/x86/kernel/Makefile                           |  10 +-
>  arch/x86/kernel/apic/apic.c                        |   2 +-
>  arch/x86/kernel/apic/x2apic_uv_x.c                 |   5 +-
>  arch/x86/kernel/asm-offsets.c                      |   6 +
>  arch/x86/kernel/asm-offsets_32.c                   |   9 +-
>  arch/x86/kernel/asm-offsets_64.c                   |   4 +
>  arch/x86/kernel/cpu/Makefile                       |   1 +
>  arch/x86/kernel/cpu/amd.c                          |   7 +-
>  arch/x86/kernel/cpu/common.c                       | 195 +++++---
>  arch/x86/kernel/cpu/cpuid-deps.c                   | 121 +++++
>  arch/x86/kernel/cpu/hypervisor.c                   |  64 +--
>  arch/x86/kernel/cpu/mshyperv.c                     |   6 +-
>  arch/x86/kernel/cpu/vmware.c                       |   8 +-
>  arch/x86/kernel/doublefault.c                      |  36 +-
>  arch/x86/kernel/dumpstack.c                        |  74 ++-
>  arch/x86/kernel/dumpstack_32.c                     |   6 +
>  arch/x86/kernel/dumpstack_64.c                     |   6 +
>  arch/x86/kernel/fpu/init.c                         |  11 +
>  arch/x86/kernel/fpu/xstate.c                       |  43 +-
>  arch/x86/kernel/head_32.S                          |   5 +-
>  arch/x86/kernel/head_64.S                          |  45 +-
>  arch/x86/kernel/ioport.c                           |   2 +-
>  arch/x86/kernel/irq.c                              |  12 -
>  arch/x86/kernel/irq_64.c                           |   4 +-
>  arch/x86/kernel/kvm.c                              |   6 +-
>  arch/x86/kernel/ldt.c                              |   2 +-
>  arch/x86/kernel/paravirt_patch_64.c                |   2 -
>  arch/x86/kernel/process.c                          |  27 +-
>  arch/x86/kernel/process_32.c                       |   8 +-
>  arch/x86/kernel/process_64.c                       |  19 +-
>  arch/x86/kernel/smpboot.c                          |   3 +-
>  arch/x86/kernel/traps.c                            |  72 +--
>  arch/x86/kernel/unwind_orc.c                       |  88 ++--
>  arch/x86/kernel/verify_cpu.S                       |   3 +-
>  arch/x86/kernel/vm86_32.c                          |  20 +-
>  arch/x86/kernel/vmlinux.lds.S                      |   9 +
>  arch/x86/kernel/x86_init.c                         |   9 +
>  arch/x86/kvm/mmu.c                                 |   4 +-
>  arch/x86/kvm/vmx.c                                 |   2 +-
>  arch/x86/lib/delay.c                               |   4 +-
>  arch/x86/mm/fault.c                                |  88 ++--
>  arch/x86/mm/init.c                                 |   2 +-
>  arch/x86/mm/init_64.c                              |  10 +-
>  arch/x86/mm/kasan_init_64.c                        | 262 ++++++++--
>  arch/x86/power/cpu.c                               |  16 +-
>  arch/x86/xen/enlighten_hvm.c                       |  12 +-
>  arch/x86/xen/enlighten_pv.c                        |  15 +-
>  arch/x86/xen/mmu_pv.c                              | 161 +++---
>  arch/x86/xen/smp_pv.c                              |  17 +-
>  arch/x86/xen/xen-asm_64.S                          |   2 +-
>  arch/x86/xen/xen-head.S                            |  11 +-
>  block/bfq-iosched.c                                |   3 +-
>  block/blk-wbt.c                                    |   2 +-
>  crypto/lrw.c                                       |   6 +-
>  drivers/acpi/apei/ghes.c                           |  78 +--
>  drivers/base/power/opp/core.c                      |   2 +-
>  drivers/bluetooth/hci_bcm.c                        |  23 +-
>  drivers/bluetooth/hci_ldisc.c                      |   7 +
>  drivers/clk/sunxi-ng/ccu-sun5i.c                   |   4 +-
>  drivers/clk/sunxi-ng/ccu-sun6i-a31.c               |   2 +-
>  drivers/clk/sunxi-ng/ccu_nm.c                      |   3 +
>  drivers/cpuidle/cpuidle.c                          |   1 +
>  drivers/crypto/amcc/crypto4xx_core.h               |  10 +-
>  drivers/gpu/drm/drm_dp_dual_mode_helper.c          |  16 +-
>  drivers/gpu/drm/vc4/vc4_dsi.c                      |   3 +-
>  drivers/hv/vmbus_drv.c                             |   2 +-
>  drivers/iio/accel/st_accel_core.c                  |  35 +-
>  drivers/iio/common/st_sensors/st_sensors_core.c    |   2 +-
>  drivers/iio/common/st_sensors/st_sensors_trigger.c |  16 +-
>  drivers/iio/gyro/st_gyro_core.c                    |  15 +-
>  drivers/iio/magnetometer/st_magn_core.c            |  10 +-
>  drivers/iio/pressure/st_pressure_core.c            |  15 +-
>  drivers/infiniband/hw/hns/hns_roce_hw_v1.c         |   5 +
>  drivers/infiniband/sw/rxe/rxe_pool.c               |   2 +
>  drivers/infiniband/ulp/opa_vnic/opa_vnic_encap.c   |   1 +
>  .../infiniband/ulp/opa_vnic/opa_vnic_vema_iface.c  |   8 +-
>  drivers/input/mouse/vmmouse.c                      |  10 +-
>  drivers/leds/leds-pca955x.c                        |  17 +-
>  drivers/md/dm-mpath.c                              |  20 +-
>  drivers/md/md.c                                    |   4 +-
>  drivers/misc/pti.c                                 |   2 +-
>  drivers/misc/vmw_balloon.c                         |   2 +-
>  drivers/net/ethernet/ibm/ibmvnic.c                 |   2 +
>  drivers/net/ethernet/intel/fm10k/fm10k.h           |   4 +-
>  drivers/net/ethernet/intel/fm10k/fm10k_iov.c       |  12 +-
>  drivers/net/ethernet/intel/i40e/i40e_main.c        |  16 +-
>  drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c |   7 +-
>  drivers/net/ethernet/intel/i40evf/i40evf_main.c    |   9 +-
>  drivers/net/ethernet/intel/igb/igb_main.c          |   2 +
>  drivers/net/ethernet/intel/ixgbe/ixgbe_common.c    |   4 +-
>  drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c      |   2 +
>  drivers/net/phy/at803x.c                           |   2 +-
>  drivers/pci/iov.c                                  |   3 +-
>  drivers/pci/pci.c                                  |   4 +
>  drivers/pci/pcie/aer/aerdrv_core.c                 |   9 +-
>  drivers/platform/x86/asus-wireless.c               |   1 +
>  drivers/rtc/interface.c                            |   2 +-
>  drivers/rtc/rtc-pl031.c                            |  14 +-
>  drivers/scsi/cxgbi/cxgb4i/cxgb4i.c                 |   1 +
>  drivers/scsi/lpfc/lpfc_hbadisc.c                   |   3 +-
>  drivers/scsi/lpfc/lpfc_hw4.h                       |   2 +-
>  drivers/scsi/lpfc/lpfc_nvmet.c                     |   2 +
>  drivers/scsi/mpt3sas/mpt3sas_scsih.c               |   5 +
>  drivers/staging/greybus/light.c                    |   2 +
>  drivers/tee/optee/core.c                           |   1 -
>  drivers/thermal/hisi_thermal.c                     |  74 ++-
>  drivers/vfio/pci/vfio_pci_config.c                 |   6 +-
>  drivers/video/backlight/pwm_bl.c                   |   7 +-
>  fs/dcache.c                                        |   4 +-
>  fs/overlayfs/ovl_entry.h                           |   2 +-
>  fs/overlayfs/readdir.c                             |   2 +-
>  include/asm-generic/vmlinux.lds.h                  |   2 +-
>  include/linux/compiler.h                           |   1 +
>  include/linux/hypervisor.h                         |   8 +-
>  include/linux/iio/common/st_sensors.h              |   7 +-
>  include/linux/{pti.h => intel-pti.h}               |   6 +-
>  include/linux/mm.h                                 |   2 +-
>  include/linux/mmzone.h                             |   6 +-
>  include/linux/rculist.h                            |   4 +-
>  include/linux/rcupdate.h                           |   4 +-
>  kernel/events/core.c                               |   4 +-
>  kernel/seccomp.c                                   |   2 +-
>  kernel/task_work.c                                 |   2 +-
>  kernel/trace/trace_events_hist.c                   |   4 +-
>  lib/Kconfig.debug                                  |   2 +-
>  mm/page_alloc.c                                    |  10 +
>  mm/slab.h                                          |   2 +-
>  mm/sparse.c                                        |  17 +-
>  net/ipv4/ip_gre.c                                  |   8 +-
>  net/ipv4/tcp_vegas.c                               |   2 +-
>  net/ipv6/addrconf.c                                |  12 +-
>  net/ipv6/route.c                                   |  58 +--
>  net/sctp/stream.c                                  |   8 +-
>  scripts/Makefile.build                             |   2 +-
>  sound/soc/codecs/msm8916-wcd-analog.c              |   9 +-
>  sound/soc/img/img-parallel-out.c                   |   2 +
>  tools/objtool/check.c                              |   7 +-
>  tools/objtool/objtool.c                            |   6 +-
>  tools/testing/selftests/x86/ldt_gdt.c              |  61 ++-
>  virt/kvm/kvm_main.c                                |   2 +-
>  188 files changed, 2414 insertions(+), 1428 deletions(-)
>
>

* Re: [PATCH 4.14 000/159] 4.14.9-stable review
  2017-12-24 19:37 ` Ivan Kozik
@ 2017-12-24 22:03   ` Andre Tomt
  2017-12-25 13:38   ` Greg Kroah-Hartman
  1 sibling, 0 replies; 349+ messages in thread
From: Andre Tomt @ 2017-12-24 22:03 UTC (permalink / raw)
  To: Ivan Kozik, Greg Kroah-Hartman, stable; +Cc: linux-kernel, jpoimboe

On 24 Dec 2017 20:37, Ivan Kozik wrote:
> On Fri, Dec 22, 2017 at 8:44 AM, Greg Kroah-Hartman
> <gregkh@linuxfoundation.org> wrote:
>> This is the start of the stable review cycle for the 4.14.9 release.
>> There are 159 patches in this series, all will be posted as a response
>> to this one.  If anyone has any issues with these being applied, please
>> let me know.
>>
>> Responses should be made by Sun Dec 24 08:45:36 UTC 2017.
>> Anything received after that time might be too late.
>>
> 
>> Josh Poimboeuf <jpoimboe@redhat.com>
>>      x86/unwind: Make CONFIG_UNWINDER_ORC=y the default in kconfig for 64-bit
> 
> This is uncovering a very difficult-to-debug build failure with NVIDIA DKMS:
> with CONFIG_UNWINDER_ORC=y, out-of-tree modules hit this rule in
> scripts/Makefile.build:
> 
> $(obj)/%.o: $(src)/%.c $(recordmcount_source) $(objtool_dep) FORCE
> 
> and fail (here, at least) to build tools/objtool/objtool (note: I do have
> libelf-dev installed)
> 
> After editing dkms.conf to do `make --debug=a -j1`, I see make output:
> 
>   Considering target file
> '/var/lib/dkms/nvidia-current/387.34/build/nvidia/nv-gpu-numa.o'.
>    File '/var/lib/dkms/nvidia-current/387.34/build/nvidia/nv-gpu-numa.o'
> does not exist.
>    Looking for an implicit rule for
> '/var/lib/dkms/nvidia-current/387.34/build/nvidia/nv-gpu-numa.o'.
>    [...]
>    Trying rule prerequisite 'tools/objtool/objtool'.
>    Looking for a rule with intermediate file 'tools/objtool/objtool'.
>     Avoiding implicit rule recursion.
> 
> then silently fail to build objtool, silently fail to build all the .o files,
> then continue until ld finally errors out trying to link nonexistent object
> files.
> 
> If things look alright to you, and objtool is known to work with out-of-tree
> modules, and the Debian packaging just needs to be adjusted, please ignore;
> I figured I'd send this anyway because it was such a pain to debug.

The linux-header packages dkms builds against need to include objtool 
when the ORC unwinder is enabled.

Changing the default like that is a pretty big change for a stable release.

* Re: [PATCH 4.14 000/159] 4.14.9-stable review
  2017-12-23 22:54 ` Guenter Roeck
@ 2017-12-25 13:35   ` Greg Kroah-Hartman
  0 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-25 13:35 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: linux-kernel, torvalds, akpm, shuahkh, patches, ben.hutchings,
	lkft-triage, stable

On Sat, Dec 23, 2017 at 02:54:35PM -0800, Guenter Roeck wrote:
> On 12/22/2017 12:44 AM, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 4.14.9 release.
> > There are 159 patches in this series, all will be posted as a response
> > to this one.  If anyone has any issues with these being applied, please
> > let me know.
> > 
> > Responses should be made by Sun Dec 24 08:45:36 UTC 2017.
> > Anything received after that time might be too late.
> > 
> 
> For v4.14.8-176-g3b153f8:
> 
> Build results:
> 	total: 145 pass: 145 fail: 0
> Qemu test results:
> 	total: 126 pass: 126 fail: 0

Wonderful, thanks for letting me know!

greg k-h

* Re: [PATCH 4.14 000/159] 4.14.9-stable review
  2017-12-24 19:37 ` Ivan Kozik
  2017-12-24 22:03   ` Andre Tomt
@ 2017-12-25 13:38   ` Greg Kroah-Hartman
  1 sibling, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-25 13:38 UTC (permalink / raw)
  To: Ivan Kozik
  Cc: linux-kernel, Linus Torvalds, akpm, Guenter Roeck, Shuah Khan,
	patches, Ben Hutchings, lkft-triage, stable

On Sun, Dec 24, 2017 at 07:37:23PM +0000, Ivan Kozik wrote:
> On Fri, Dec 22, 2017 at 8:44 AM, Greg Kroah-Hartman
> <gregkh@linuxfoundation.org> wrote:
> > This is the start of the stable review cycle for the 4.14.9 release.
> > There are 159 patches in this series, all will be posted as a response
> > to this one.  If anyone has any issues with these being applied, please
> > let me know.
> >
> > Responses should be made by Sun Dec 24 08:45:36 UTC 2017.
> > Anything received after that time might be too late.
> >
> 
> > Josh Poimboeuf <jpoimboe@redhat.com>
> >     x86/unwind: Make CONFIG_UNWINDER_ORC=y the default in kconfig for 64-bit
> 
> This is uncovering a very difficult-to-debug build failure with NVIDIA DKMS:
> with CONFIG_UNWINDER_ORC=y, out-of-tree modules hit this rule in
> scripts/Makefile.build:
> 
> $(obj)/%.o: $(src)/%.c $(recordmcount_source) $(objtool_dep) FORCE
> 
> and fail (here, at least) to build tools/objtool/objtool (note: I do have
> libelf-dev installed)

Is this problem also in Linus's tree?

> After editing dkms.conf to do `make --debug=a -j1`, I see make output:
> 
>  Considering target file
> '/var/lib/dkms/nvidia-current/387.34/build/nvidia/nv-gpu-numa.o'.
>   File '/var/lib/dkms/nvidia-current/387.34/build/nvidia/nv-gpu-numa.o'
> does not exist.
>   Looking for an implicit rule for
> '/var/lib/dkms/nvidia-current/387.34/build/nvidia/nv-gpu-numa.o'.
>   [...]
>   Trying rule prerequisite 'tools/objtool/objtool'.
>   Looking for a rule with intermediate file 'tools/objtool/objtool'.
>    Avoiding implicit rule recursion.
> 
> then silently fail to build objtool, silently fail to build all the .o files,
> then continue until ld finally errors out trying to link nonexistent object
> files.

Is this just a bug in the nvidia Makefile somehow?

> If things look alright to you, and objtool is known to work with out-of-tree
> modules, and the Debian packaging just needs to be adjusted, please ignore;
> I figured I'd send this anyway because it was such a pain to debug.

objtool should work with out-of-tree modules, again, try Linus's tree to
verify this.

thanks,

greg k-h

* Re: [PATCH 4.14 015/159] bitops: Add clear/set_bit32() to linux/bitops.h
  2017-12-22  8:45 ` [PATCH 4.14 015/159] bitops: Add clear/set_bit32() to linux/bitops.h Greg Kroah-Hartman
@ 2017-12-26 21:41   ` Ben Hutchings
  2017-12-27 12:48     ` Greg Kroah-Hartman
  0 siblings, 1 reply; 349+ messages in thread
From: Ben Hutchings @ 2017-12-26 21:41 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Andi Kleen
  Cc: inux-kernel, stable, Thomas Gleixner, Linus Torvalds,
	Peter Zijlstra, Ingo Molnar

On Fri, 2017-12-22 at 09:45 +0100, Greg Kroah-Hartman wrote:
> 4.14-stable review patch.  If anyone has any objections, please let me know.
> 
> ------------------
> 
> From: Andi Kleen <ak@linux.intel.com>
> 
> commit cbe96375025e14fc76f9ed42ee5225120d7210f8 upstream.
> 
> Add two simple wrappers around set_bit/clear_bit() that accept
> the common case of an u32 array. This avoids writing
> casts in all callers.

These won't work correctly on big-endian 64-bit systems.  They are also
unsafe to use on u32 arrays with an odd length, on 64-bit systems.
This is why lib/bitmap.c has conversion functions for u32 arrays.
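
Both failure modes can be illustrated with a minimal sketch. This is not
kernel code: it only models which bytes a bit operation on an 8-byte
unsigned long would touch when the bitmap is really an array of 4-byte
u32s. The helper names (byte_hit_via_long(), u32_hit_via_long(),
long_access_overruns()) are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Byte offset (from the start of the bitmap) that holds bit `nr` when the
 * bitmap is accessed as an array of 64-bit words. Assumes 8-byte longs.
 */
static size_t byte_hit_via_long(unsigned long nr, bool big_endian)
{
	size_t word = nr / 64;
	size_t byte_in_word = (nr % 64) / 8;

	/* On big-endian, the least significant byte is stored last. */
	if (big_endian)
		byte_in_word = 7 - byte_in_word;

	return word * 8 + byte_in_word;
}

/* Index of the u32 element that actually receives the bit. */
static size_t u32_hit_via_long(unsigned long nr, bool big_endian)
{
	return byte_hit_via_long(nr, big_endian) / 4;
}

/* Index of the u32 element the caller meant to modify. */
static size_t u32_intended(unsigned long nr)
{
	return nr / 32;
}

/*
 * True if the 64-bit word containing bit `nr` extends past the end of an
 * array of `n_u32` 32-bit elements.
 */
static bool long_access_overruns(unsigned long nr, size_t n_u32)
{
	size_t word_end;

	assert(nr < n_u32 * 32);	/* the bit itself must be in range */
	word_end = (nr / 64 + 1) * 8;

	return word_end > n_u32 * 4;
}
```

Under this model, bit 0 lands in element 1 instead of element 0 on
big-endian, and touching bit 70 of a three-element array accesses a word
that ends at byte 16 of a 12-byte array.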

Ben.

> Signed-off-by: Andi Kleen <ak@linux.intel.com>
> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Link: http://lkml.kernel.org/r/20171013215645.23166-2-andi@firstfloor.org
> Signed-off-by: Ingo Molnar <mingo@kernel.org>
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> 
> ---
>  include/linux/bitops.h |   26 ++++++++++++++++++++++++++
>  1 file changed, 26 insertions(+)
> 
> --- a/include/linux/bitops.h
> +++ b/include/linux/bitops.h
> @@ -228,6 +228,32 @@ static inline unsigned long __ffs64(u64
>  	return __ffs((unsigned long)word);
>  }
>  
> +/*
> + * clear_bit32 - Clear a bit in memory for u32 array
> + * @nr: Bit to clear
> + * @addr: u32 * address of bitmap
> + *
> + * Same as clear_bit, but avoids needing casts for u32 arrays.
> + */
> +
> +static __always_inline void clear_bit32(long nr, volatile u32 *addr)
> +{
> +	clear_bit(nr, (volatile unsigned long *)addr);
> +}
> +
> +/*
> + * set_bit32 - Set a bit in memory for u32 array
> + * @nr: Bit to clear
> + * @addr: u32 * address of bitmap
> + *
> + * Same as set_bit, but avoids needing casts for u32 arrays.
> + */
> +
> +static __always_inline void set_bit32(long nr, volatile u32 *addr)
> +{
> +	set_bit(nr, (volatile unsigned long *)addr);
> +}
> +
>  #ifdef __KERNEL__
>  
>  #ifndef set_mask_bits
> 
> 
-- 
Ben Hutchings
The world is coming to an end.	Please log off.


* Re: [PATCH 4.14 108/159] kvm, mm: account kvm related kmem slabs to kmemcg
  2017-12-23  9:24             ` Greg Kroah-Hartman
@ 2017-12-27 10:30               ` Paolo Bonzini
  0 siblings, 0 replies; 349+ messages in thread
From: Paolo Bonzini @ 2017-12-27 10:30 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Michal Hocko
  Cc: alexander.levin, linux-kernel, stable, Shakeel Butt

On 23/12/2017 10:24, Greg Kroah-Hartman wrote:
> For many subsystems, the maintainers _never_ mark patches for stable.
> Others, they catch maybe half of the things they should be applying.
> 
> KVM is one such example of the "half" group, they mark patches as
> resolving CVE issues at times, yet don't mark them for stable.  So when
> I see a patch like this, it triggers the "oh, look, KVM doing the same
> thing again", so I take the patch and of course cc: the
> developers/maintainers so they can object if they want to.

In general there are some cases where I tend to be conservative on
applying the "stable" tag, for example:

1) sometimes I'm not very familiar with API changes in the other
subsystems (this was the case for this patch).  If I am not sure of the
amount of backporting effort required, and the bug is not super
important, I don't mark it as stable because I don't want to later drop
a complex backport on the floor.  I prefer to have fewer patches
applied, but know that the fixes are backported to all branches.

2) not all bugs are equal; a WARN_ON_ONCE from a syzkaller testcase for
example doesn't really matter to a cloud provider that uses KVM, because
invalid API usage is not controlled by the customer.  But an oops or
BUG_ON probably *will* get CCed to stable.  So some patches for
syzkaller bugs may be CCed, some may not.

IIRC the CVE that you mention was a guest user->kernel escalation, but
it didn't affect Linux guests at all, and it couldn't be fixed
completely on Windows guests because Windows has another bug in the same
area.  Plus, I knew there would be different conflicts on all LTS
branches, so I decided to not mark it for stable.  I did dutifully
provide a backport when someone (either you or Ben Hutchings) asked for
one, though.

It does happen that Radim or I forget to Cc stable, so I'm okay with you
picking up more patches than what I mark and I will happily do the
backports for you.  Still, there is some thought put into whether to CC
stable or not. :)

Thanks,

Paolo

> Over time you get to know what subsystems are like this and what are
> not.  MM is one that is really good, I almost never take a mm patch
> without being told explicitly to do so.  Others are horrible and never
> mark anything, so stuff has to be picked up manually through Sasha's
> process or through other ways.
> 
> So it's not a perfect system, but it seems to work "good enough", and if
> you ever have any questions about any patch, always feel free to ask,
> there's usually a story behind almost every one...

* Re: [PATCH 4.14 015/159] bitops: Add clear/set_bit32() to linux/bitops.h
  2017-12-26 21:41   ` Ben Hutchings
@ 2017-12-27 12:48     ` Greg Kroah-Hartman
  2017-12-27 19:40       ` Ben Hutchings
  0 siblings, 1 reply; 349+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-27 12:48 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: Andi Kleen, inux-kernel, stable, Thomas Gleixner, Linus Torvalds,
	Peter Zijlstra, Ingo Molnar

On Tue, Dec 26, 2017 at 09:41:36PM +0000, Ben Hutchings wrote:
> On Fri, 2017-12-22 at 09:45 +0100, Greg Kroah-Hartman wrote:
> > 4.14-stable review patch.  If anyone has any objections, please let me know.
> > 
> > ------------------
> > 
> > From: Andi Kleen <ak@linux.intel.com>
> > 
> > commit cbe96375025e14fc76f9ed42ee5225120d7210f8 upstream.
> > 
> > Add two simple wrappers around set_bit/clear_bit() that accept
> > the common case of an u32 array. This avoids writing
> > casts in all callers.
> 
> These won't work correctly on big-endian 64-bit systems.  They are also
> unsafe to use on u32 arrays with an odd length, on 64-bit systems.
> This is why lib/bitmap.c has conversion functions for u32 arrays.

I end up deleting these later in the patch series, just like upstream
did, so all should be ok, right?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 015/159] bitops: Add clear/set_bit32() to linux/bitops.h
  2017-12-27 12:48     ` Greg Kroah-Hartman
@ 2017-12-27 19:40       ` Ben Hutchings
  0 siblings, 0 replies; 349+ messages in thread
From: Ben Hutchings @ 2017-12-27 19:40 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Andi Kleen, inux-kernel, stable, Thomas Gleixner, Linus Torvalds,
	Peter Zijlstra, Ingo Molnar

On Wed, 2017-12-27 at 13:48 +0100, Greg Kroah-Hartman wrote:
> On Tue, Dec 26, 2017 at 09:41:36PM +0000, Ben Hutchings wrote:
> > On Fri, 2017-12-22 at 09:45 +0100, Greg Kroah-Hartman wrote:
> > > 4.14-stable review patch.  If anyone has any objections, please let me know.
> > > 
> > > ------------------
> > > 
> > > From: Andi Kleen <ak@linux.intel.com>
> > > 
> > > commit cbe96375025e14fc76f9ed42ee5225120d7210f8 upstream.
> > > 
> > > Add two simple wrappers around set_bit/clear_bit() that accept
> > > the common case of an u32 array. This avoids writing
> > > casts in all callers.
> > 
> > These won't work correctly on big-endian 64-bit systems.  They are also
> > unsafe to use on u32 arrays with an odd length, on 64-bit systems.
> > This is why lib/bitmap.c has conversion functions for u32 arrays.
> 
> I end up deleting these later in the patch series, just like upstream
> did, so all should be ok, right?

Ah, sorry, I didn't get that far.  The end result looks OK.

Ben.

-- 
Ben Hutchings
Any sufficiently advanced bug is indistinguishable from a feature.


* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
  2017-12-22  8:45   ` Greg Kroah-Hartman
  (?)
@ 2018-01-07  5:14     ` Mike Galbraith
  -1 siblings, 0 replies; 349+ messages in thread
From: Mike Galbraith @ 2018-01-07  5:14 UTC (permalink / raw)
  To: Greg Kroah-Hartman, linux-kernel
  Cc: stable, Kirill A. Shutemov, Andrew Morton, Andy Lutomirski,
	Borislav Petkov, Cyrill Gorcunov, Linus Torvalds, Peter Zijlstra,
	Thomas Gleixner, linux-mm, Ingo Molnar

On Fri, 2017-12-22 at 09:45 +0100, Greg Kroah-Hartman wrote:
> 4.14-stable review patch.  If anyone has any objections, please let me know.

FYI, this broke kdump, or rather the makedumpfile part thereof.
Forward-looking wreckage is par for the kdump course, but...

> ------------------
> 
> From: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> 
> commit 83e3c48729d9ebb7af5a31a504f3fd6aff0348c4 upstream.
> 
> Size of the mem_section[] array depends on the size of the physical address space.
> 
> In preparation for boot-time switching between paging modes on x86-64
> we need to make the allocation of mem_section[] dynamic, because otherwise
> we waste a lot of RAM: with CONFIG_NODE_SHIFT=10, mem_section[] size is 32kB
> for 4-level paging and 2MB for 5-level paging mode.
> 
> The patch allocates the array on the first call to sparse_memory_present_with_active_regions().
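
The quoted sizes can be reproduced with a short calculation. This is a
sketch under stated assumptions, not the kernel's own definitions: 27-bit
sections, 46 or 52 physical address bits for 4- and 5-level paging, 4 kB
pages, 8-byte pointers, and an assumed 32-byte struct mem_section (the
real size depends on the configuration). It does not model
CONFIG_NODE_SHIFT. The function and constant names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values, for illustration only. */
enum {
	SKETCH_SECTION_SIZE_BITS = 27,	 /* 128 MB per memory section */
	SKETCH_PAGE_SIZE         = 4096,
	SKETCH_MEM_SECTION_BYTES = 32,	 /* assumed sizeof(struct mem_section) */
	SKETCH_POINTER_BYTES     = 8,
};

/* Number of root entries needed to cover the physical address space. */
static size_t nr_section_roots(unsigned int max_physmem_bits)
{
	size_t nr_mem_sections, sections_per_root;

	assert(max_physmem_bits > SKETCH_SECTION_SIZE_BITS);

	nr_mem_sections = (size_t)1 << (max_physmem_bits - SKETCH_SECTION_SIZE_BITS);
	/* Each root is one page worth of struct mem_section entries. */
	sections_per_root = SKETCH_PAGE_SIZE / SKETCH_MEM_SECTION_BYTES;

	return nr_mem_sections / sections_per_root;
}

/* Size in bytes of the top-level array of root pointers. */
static size_t root_array_bytes(unsigned int max_physmem_bits)
{
	return nr_section_roots(max_physmem_bits) * SKETCH_POINTER_BYTES;
}
```

With these assumptions, root_array_bytes(46) is 32 kB and
root_array_bytes(52) is 2 MB, matching the figures in the commit message.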
> 
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Andy Lutomirski <luto@amacapital.net>
> Cc: Borislav Petkov <bp@suse.de>
> Cc: Cyrill Gorcunov <gorcunov@openvz.org>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: linux-mm@kvack.org
> Link: http://lkml.kernel.org/r/20170929140821.37654-2-kirill.shutemov@linux.intel.com
> Signed-off-by: Ingo Molnar <mingo@kernel.org>
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> 
> ---
>  include/linux/mmzone.h |    6 +++++-
>  mm/page_alloc.c        |   10 ++++++++++
>  mm/sparse.c            |   17 +++++++++++------
>  3 files changed, 26 insertions(+), 7 deletions(-)
> 
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -1152,13 +1152,17 @@ struct mem_section {
>  #define SECTION_ROOT_MASK	(SECTIONS_PER_ROOT - 1)
>  
>  #ifdef CONFIG_SPARSEMEM_EXTREME
> -extern struct mem_section *mem_section[NR_SECTION_ROOTS];
> +extern struct mem_section **mem_section;
>  #else
>  extern struct mem_section mem_section[NR_SECTION_ROOTS][SECTIONS_PER_ROOT];
>  #endif
>  
>  static inline struct mem_section *__nr_to_section(unsigned long nr)
>  {
> +#ifdef CONFIG_SPARSEMEM_EXTREME
> +	if (!mem_section)
> +		return NULL;
> +#endif
>  	if (!mem_section[SECTION_NR_TO_ROOT(nr)])
>  		return NULL;
>  	return &mem_section[SECTION_NR_TO_ROOT(nr)][nr & SECTION_ROOT_MASK];
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5651,6 +5651,16 @@ void __init sparse_memory_present_with_a
>  	unsigned long start_pfn, end_pfn;
>  	int i, this_nid;
>  
> +#ifdef CONFIG_SPARSEMEM_EXTREME
> +	if (!mem_section) {
> +		unsigned long size, align;
> +
> +		size = sizeof(struct mem_section) * NR_SECTION_ROOTS;
> +		align = 1 << (INTERNODE_CACHE_SHIFT);
> +		mem_section = memblock_virt_alloc(size, align);
> +	}
> +#endif
> +
>  	for_each_mem_pfn_range(i, nid, &start_pfn, &end_pfn, &this_nid)
>  		memory_present(this_nid, start_pfn, end_pfn);
>  }
> --- a/mm/sparse.c
> +++ b/mm/sparse.c
> @@ -23,8 +23,7 @@
>   * 1) mem_section	- memory sections, mem_map's for valid memory
>   */
>  #ifdef CONFIG_SPARSEMEM_EXTREME
> -struct mem_section *mem_section[NR_SECTION_ROOTS]
> -	____cacheline_internodealigned_in_smp;
> +struct mem_section **mem_section;
>  #else
>  struct mem_section mem_section[NR_SECTION_ROOTS][SECTIONS_PER_ROOT]
>  	____cacheline_internodealigned_in_smp;
> @@ -101,7 +100,7 @@ static inline int sparse_index_init(unsi
>  int __section_nr(struct mem_section* ms)
>  {
>  	unsigned long root_nr;
> -	struct mem_section* root;
> +	struct mem_section *root = NULL;
>  
>  	for (root_nr = 0; root_nr < NR_SECTION_ROOTS; root_nr++) {
>  		root = __nr_to_section(root_nr * SECTIONS_PER_ROOT);
> @@ -112,7 +111,7 @@ int __section_nr(struct mem_section* ms)
>  		     break;
>  	}
>  
> -	VM_BUG_ON(root_nr == NR_SECTION_ROOTS);
> +	VM_BUG_ON(!root);
>  
>  	return (root_nr * SECTIONS_PER_ROOT) + (ms - root);
>  }
> @@ -330,11 +329,17 @@ again:
>  static void __init check_usemap_section_nr(int nid, unsigned long *usemap)
>  {
>  	unsigned long usemap_snr, pgdat_snr;
> -	static unsigned long old_usemap_snr = NR_MEM_SECTIONS;
> -	static unsigned long old_pgdat_snr = NR_MEM_SECTIONS;
> +	static unsigned long old_usemap_snr;
> +	static unsigned long old_pgdat_snr;
>  	struct pglist_data *pgdat = NODE_DATA(nid);
>  	int usemap_nid;
>  
> +	/* First call */
> +	if (!old_usemap_snr) {
> +		old_usemap_snr = NR_MEM_SECTIONS;
> +		old_pgdat_snr = NR_MEM_SECTIONS;
> +	}
> +
>  	usemap_snr = pfn_to_section_nr(__pa(usemap) >> PAGE_SHIFT);
>  	pgdat_snr = pfn_to_section_nr(__pa(pgdat) >> PAGE_SHIFT);
>  	if (usemap_snr == pgdat_snr)
> 
> 
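
For reference, the 32kB and 2MB figures quoted in the commit message can be reproduced with a little arithmetic. The sketch below is not kernel code. Its constants (128 MB sections, 46 or 52 physical address bits, 4 kB pages, a 32-byte struct mem_section, 8-byte pointers) are assumptions about a typical x86-64 configuration and are not stated in the patch; root_table_bytes() and print_root_table_sizes() are hypothetical helper names.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed x86-64 values; none of these are stated in the patch itself. */
#define SECTION_SIZE_BITS  27      /* 128 MB per memory section */
#define PAGE_SIZE_BYTES    4096UL
#define MEM_SECTION_BYTES  32UL    /* assumed sizeof(struct mem_section) */
#define POINTER_BYTES      8UL

/*
 * Bytes taken by a root table of pointers that covers an address space
 * of max_physmem_bits bits (hypothetical helper, for illustration only).
 */
static unsigned long root_table_bytes(unsigned int max_physmem_bits)
{
	unsigned long nr_mem_sections, sections_per_root, nr_section_roots;

	assert(max_physmem_bits > SECTION_SIZE_BITS && max_physmem_bits < 64);

	nr_mem_sections = 1UL << (max_physmem_bits - SECTION_SIZE_BITS);
	sections_per_root = PAGE_SIZE_BYTES / MEM_SECTION_BYTES;
	/* Round up, as DIV_ROUND_UP() would. */
	nr_section_roots = (nr_mem_sections + sections_per_root - 1) /
			   sections_per_root;

	return nr_section_roots * POINTER_BYTES;
}

/* Print the two cases named in the commit message. */
static void print_root_table_sizes(void)
{
	printf("4-level (46 bits): %lu kB\n", root_table_bytes(46) / 1024);
	printf("5-level (52 bits): %lu kB\n", root_table_bytes(52) / 1024);
}
```

Under these assumptions the 4-level case comes to 32 kB and the 5-level case to 2048 kB, which is why a statically sized array is wasteful once the paging mode is only known at boot.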

^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
@ 2018-01-07  5:14     ` Mike Galbraith
  0 siblings, 0 replies; 349+ messages in thread
From: Mike Galbraith @ 2018-01-07  5:14 UTC (permalink / raw)
  To: Greg Kroah-Hartman, linux-kernel
  Cc: stable, Kirill A. Shutemov, Andrew Morton, Andy Lutomirski,
	Borislav Petkov, Cyrill Gorcunov, Linus Torvalds, Peter Zijlstra,
	Thomas Gleixner, linux-mm, Ingo Molnar

On Fri, 2017-12-22 at 09:45 +0100, Greg Kroah-Hartman wrote:
> 4.14-stable review patch.  If anyone has any objections, please let me know.

FYI, this broke kdump, or rather the makedumpfile part thereof.
 Forward looking wreckage is par for the kdump course, but...

> ------------------
> 
> From: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> 
> commit 83e3c48729d9ebb7af5a31a504f3fd6aff0348c4 upstream.
> 
> Size of the mem_section[] array depends on the size of the physical address space.
> 
> In preparation for boot-time switching between paging modes on x86-64
> we need to make the allocation of mem_section[] dynamic, because otherwise
> we waste a lot of RAM: with CONFIG_NODE_SHIFT=10, mem_section[] size is 32kB
> for 4-level paging and 2MB for 5-level paging mode.
> 
> The patch allocates the array on the first call to sparse_memory_present_with_active_regions().
> 
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Andy Lutomirski <luto@amacapital.net>
> Cc: Borislav Petkov <bp@suse.de>
> Cc: Cyrill Gorcunov <gorcunov@openvz.org>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: linux-mm@kvack.org
> Link: http://lkml.kernel.org/r/20170929140821.37654-2-kirill.shutemov@linux.intel.com
> Signed-off-by: Ingo Molnar <mingo@kernel.org>
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> 
> ---
>  include/linux/mmzone.h |    6 +++++-
>  mm/page_alloc.c        |   10 ++++++++++
>  mm/sparse.c            |   17 +++++++++++------
>  3 files changed, 26 insertions(+), 7 deletions(-)
> 
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -1152,13 +1152,17 @@ struct mem_section {
>  #define SECTION_ROOT_MASK	(SECTIONS_PER_ROOT - 1)
>  
>  #ifdef CONFIG_SPARSEMEM_EXTREME
> -extern struct mem_section *mem_section[NR_SECTION_ROOTS];
> +extern struct mem_section **mem_section;
>  #else
>  extern struct mem_section mem_section[NR_SECTION_ROOTS][SECTIONS_PER_ROOT];
>  #endif
>  
>  static inline struct mem_section *__nr_to_section(unsigned long nr)
>  {
> +#ifdef CONFIG_SPARSEMEM_EXTREME
> +	if (!mem_section)
> +		return NULL;
> +#endif
>  	if (!mem_section[SECTION_NR_TO_ROOT(nr)])
>  		return NULL;
>  	return &mem_section[SECTION_NR_TO_ROOT(nr)][nr & SECTION_ROOT_MASK];
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5651,6 +5651,16 @@ void __init sparse_memory_present_with_a
>  	unsigned long start_pfn, end_pfn;
>  	int i, this_nid;
>  
> +#ifdef CONFIG_SPARSEMEM_EXTREME
> +	if (!mem_section) {
> +		unsigned long size, align;
> +
> +		size = sizeof(struct mem_section) * NR_SECTION_ROOTS;
> +		align = 1 << (INTERNODE_CACHE_SHIFT);
> +		mem_section = memblock_virt_alloc(size, align);
> +	}
> +#endif
> +
>  	for_each_mem_pfn_range(i, nid, &start_pfn, &end_pfn, &this_nid)
>  		memory_present(this_nid, start_pfn, end_pfn);
>  }
> --- a/mm/sparse.c
> +++ b/mm/sparse.c
> @@ -23,8 +23,7 @@
>   * 1) mem_section	- memory sections, mem_map's for valid memory
>   */
>  #ifdef CONFIG_SPARSEMEM_EXTREME
> -struct mem_section *mem_section[NR_SECTION_ROOTS]
> -	____cacheline_internodealigned_in_smp;
> +struct mem_section **mem_section;
>  #else
>  struct mem_section mem_section[NR_SECTION_ROOTS][SECTIONS_PER_ROOT]
>  	____cacheline_internodealigned_in_smp;
> @@ -101,7 +100,7 @@ static inline int sparse_index_init(unsi
>  int __section_nr(struct mem_section* ms)
>  {
>  	unsigned long root_nr;
> -	struct mem_section* root;
> +	struct mem_section *root = NULL;
>  
>  	for (root_nr = 0; root_nr < NR_SECTION_ROOTS; root_nr++) {
>  		root = __nr_to_section(root_nr * SECTIONS_PER_ROOT);
> @@ -112,7 +111,7 @@ int __section_nr(struct mem_section* ms)
>  		     break;
>  	}
>  
> -	VM_BUG_ON(root_nr == NR_SECTION_ROOTS);
> +	VM_BUG_ON(!root);
>  
>  	return (root_nr * SECTIONS_PER_ROOT) + (ms - root);
>  }
> @@ -330,11 +329,17 @@ again:
>  static void __init check_usemap_section_nr(int nid, unsigned long *usemap)
>  {
>  	unsigned long usemap_snr, pgdat_snr;
> -	static unsigned long old_usemap_snr = NR_MEM_SECTIONS;
> -	static unsigned long old_pgdat_snr = NR_MEM_SECTIONS;
> +	static unsigned long old_usemap_snr;
> +	static unsigned long old_pgdat_snr;
>  	struct pglist_data *pgdat = NODE_DATA(nid);
>  	int usemap_nid;
>  
> +	/* First call */
> +	if (!old_usemap_snr) {
> +		old_usemap_snr = NR_MEM_SECTIONS;
> +		old_pgdat_snr = NR_MEM_SECTIONS;
> +	}
> +
>  	usemap_snr = pfn_to_section_nr(__pa(usemap) >> PAGE_SHIFT);
>  	pgdat_snr = pfn_to_section_nr(__pa(pgdat) >> PAGE_SHIFT);
>  	if (usemap_snr == pgdat_snr)
> 
> 
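
The effect of the mmzone.h and page_alloc.c hunks above can be modelled in user space. This is a minimal sketch, not the kernel implementation: the table sizes are toy values, calloc() stands in for memblock_virt_alloc(), and nr_to_section() and section_present() are hypothetical names. It only shows that, once the root table is a pointer, a lookup has to tolerate a missing root table as well as a missing root.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy sizes for illustration; the kernel derives these from its config. */
#define SECTIONS_PER_ROOT	4UL
#define NR_SECTION_ROOTS	8UL
#define SECTION_ROOT_MASK	(SECTIONS_PER_ROOT - 1)
#define SECTION_NR_TO_ROOT(nr)	((nr) / SECTIONS_PER_ROOT)

struct mem_section {
	unsigned long section_mem_map;
};

/* The root table is a plain pointer, NULL until it is first allocated. */
static struct mem_section **mem_section;

/* Modelled on the patched __nr_to_section(): either level may be absent. */
static struct mem_section *nr_to_section(unsigned long nr)
{
	if (SECTION_NR_TO_ROOT(nr) >= NR_SECTION_ROOTS)
		return NULL;	/* bounds check added for the toy model */
	if (!mem_section)
		return NULL;
	if (!mem_section[SECTION_NR_TO_ROOT(nr)])
		return NULL;
	return &mem_section[SECTION_NR_TO_ROOT(nr)][nr & SECTION_ROOT_MASK];
}

/*
 * Hypothetical helper: allocate the root table on first use, then the
 * root that holds section nr. Returns 0 on success, -1 on failure.
 */
static int section_present(unsigned long nr)
{
	unsigned long root = SECTION_NR_TO_ROOT(nr);

	assert(root < NR_SECTION_ROOTS);

	if (!mem_section) {
		/* One pointer per root, zero-filled. */
		mem_section = calloc(NR_SECTION_ROOTS, sizeof(*mem_section));
		if (!mem_section)
			return -1;
	}
	if (!mem_section[root]) {
		mem_section[root] = calloc(SECTIONS_PER_ROOT,
					   sizeof(**mem_section));
		if (!mem_section[root])
			return -1;
	}
	return 0;
}
```

Before the first section_present() call every lookup returns NULL; afterwards only sections whose root has been populated resolve to an entry.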

^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
  2018-01-07  5:14     ` Mike Galbraith
  (?)
@ 2018-01-07  9:11       ` Greg Kroah-Hartman
  -1 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2018-01-07  9:11 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: linux-kernel, stable, Kirill A. Shutemov, Andrew Morton,
	Andy Lutomirski, Borislav Petkov, Cyrill Gorcunov,
	Linus Torvalds, Peter Zijlstra, Thomas Gleixner, linux-mm,
	Ingo Molnar

On Sun, Jan 07, 2018 at 06:14:22AM +0100, Mike Galbraith wrote:
> On Fri, 2017-12-22 at 09:45 +0100, Greg Kroah-Hartman wrote:
> > 4.14-stable review patch.  If anyone has any objections, please let me know.
> 
> FYI, this broke kdump, or rather the makedumpfile part thereof.
>  Forward looking wreckage is par for the kdump course, but...

Is it also broken in Linus's tree with this patch?  Or is there an
add-on patch that I should apply to 4.14 to resolve this issue there?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
  2018-01-07  9:11       ` Greg Kroah-Hartman
  (?)
@ 2018-01-07  9:21         ` Mike Galbraith
  -1 siblings, 0 replies; 349+ messages in thread
From: Mike Galbraith @ 2018-01-07  9:21 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, stable, Kirill A. Shutemov, Andrew Morton,
	Andy Lutomirski, Borislav Petkov, Cyrill Gorcunov,
	Linus Torvalds, Peter Zijlstra, Thomas Gleixner, linux-mm,
	Ingo Molnar

On Sun, 2018-01-07 at 10:11 +0100, Greg Kroah-Hartman wrote:
> On Sun, Jan 07, 2018 at 06:14:22AM +0100, Mike Galbraith wrote:
> > On Fri, 2017-12-22 at 09:45 +0100, Greg Kroah-Hartman wrote:
> > > 4.14-stable review patch.  If anyone has any objections, please let me know.
> > 
> > FYI, this broke kdump, or rather the makedumpfile part thereof.
> >  Forward looking wreckage is par for the kdump course, but...
> 
> Is it also broken in Linus's tree with this patch?  Or is there an
> add-on patch that I should apply to 4.14 to resolve this issue there?

Yeah, it's belly up.  By its very nature, it's gonna get dinged up
regularly.  I only mentioned it because it's not expected that stuff
gets dinged up retroactively.

	-Mike

^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
  2018-01-07  9:11       ` Greg Kroah-Hartman
  (?)
@ 2018-01-07 10:18         ` Michal Hocko
  -1 siblings, 0 replies; 349+ messages in thread
From: Michal Hocko @ 2018-01-07 10:18 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Mike Galbraith, linux-kernel, stable, Kirill A. Shutemov,
	Andrew Morton, Andy Lutomirski, Borislav Petkov, Cyrill Gorcunov,
	Linus Torvalds, Peter Zijlstra, Thomas Gleixner, linux-mm,
	Ingo Molnar

On Sun 07-01-18 10:11:15, Greg KH wrote:
> On Sun, Jan 07, 2018 at 06:14:22AM +0100, Mike Galbraith wrote:
> > On Fri, 2017-12-22 at 09:45 +0100, Greg Kroah-Hartman wrote:
> > > 4.14-stable review patch.  If anyone has any objections, please let me know.
> > 
> > FYI, this broke kdump, or rather the makedumpfile part thereof.
> >  Forward looking wreckage is par for the kdump course, but...
> 
> Is it also broken in Linus's tree with this patch?  Or is there an
> add-on patch that I should apply to 4.14 to resolve this issue there?

This one http://lkml.kernel.org/r/1513932498-20350-1-git-send-email-bhe@redhat.com
I guess.

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
  2018-01-07 10:18         ` Michal Hocko
  (?)
@ 2018-01-07 10:42           ` Greg Kroah-Hartman
  -1 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2018-01-07 10:42 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Mike Galbraith, linux-kernel, stable, Kirill A. Shutemov,
	Andrew Morton, Andy Lutomirski, Borislav Petkov, Cyrill Gorcunov,
	Linus Torvalds, Peter Zijlstra, Thomas Gleixner, linux-mm,
	Ingo Molnar

On Sun, Jan 07, 2018 at 11:18:47AM +0100, Michal Hocko wrote:
> On Sun 07-01-18 10:11:15, Greg KH wrote:
> > On Sun, Jan 07, 2018 at 06:14:22AM +0100, Mike Galbraith wrote:
> > > On Fri, 2017-12-22 at 09:45 +0100, Greg Kroah-Hartman wrote:
> > > > 4.14-stable review patch.  If anyone has any objections, please let me know.
> > > 
> > > FYI, this broke kdump, or rather the makedumpfile part thereof.
> > >  Forward looking wreckage is par for the kdump course, but...
> > 
> > Is it also broken in Linus's tree with this patch?  Or is there an
> > add-on patch that I should apply to 4.14 to resolve this issue there?
> 
> This one http://lkml.kernel.org/r/1513932498-20350-1-git-send-email-bhe@redhat.com
> I guess.

Good, that patch is queued up for the next 4.14-stable release in a few
days.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
  2018-01-07 10:18         ` Michal Hocko
  (?)
@ 2018-01-07 12:44           ` Mike Galbraith
  -1 siblings, 0 replies; 349+ messages in thread
From: Mike Galbraith @ 2018-01-07 12:44 UTC (permalink / raw)
  To: Michal Hocko, Greg Kroah-Hartman
  Cc: linux-kernel, stable, Kirill A. Shutemov, Andrew Morton,
	Andy Lutomirski, Borislav Petkov, Cyrill Gorcunov,
	Linus Torvalds, Peter Zijlstra, Thomas Gleixner, linux-mm,
	Ingo Molnar

On Sun, 2018-01-07 at 11:18 +0100, Michal Hocko wrote:
> On Sun 07-01-18 10:11:15, Greg KH wrote:
> > On Sun, Jan 07, 2018 at 06:14:22AM +0100, Mike Galbraith wrote:
> > > On Fri, 2017-12-22 at 09:45 +0100, Greg Kroah-Hartman wrote:
> > > > 4.14-stable review patch.  If anyone has any objections, please let me know.
> > > 
> > > FYI, this broke kdump, or rather the makedumpfile part thereof.
> > >  Forward looking wreckage is par for the kdump course, but...
> > 
> > Is it also broken in Linus's tree with this patch?  Or is there an
> > add-on patch that I should apply to 4.14 to resolve this issue there?
> 
> This one http://lkml.kernel.org/r/1513932498-20350-1-git-send-email-bhe@redhat.com
> I guess.

That won't unbreak kdump, else master wouldn't be broken.  I don't care
deeply, or know if anyone else does, I'm just reporting it because I
met it and chased it down.

	-Mike
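
The thread does not say why makedumpfile stopped working. One plausible mechanism, offered here purely as an assumption, is that a dump reader which treats the address of the mem_section symbol as the root table itself needs one extra dereference once the symbol becomes a pointer. The user-space sketch below illustrates that difference only; every name in it is hypothetical, it is not makedumpfile code, and it assumes data pointers are the size of uintptr_t.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NR_ROOTS 4

struct mem_section {
	unsigned long section_mem_map;
};

static struct mem_section section0;

/* Old layout: the symbol itself is the root table. */
static struct mem_section *old_symbol[NR_ROOTS] = { &section0 };

/* New layout: the symbol is one pointer to a table that lives elsewhere. */
static struct mem_section *new_table[NR_ROOTS] = { &section0 };
static struct mem_section **new_symbol = new_table;

/* Read one pointer-sized word from the "dump", as a dump tool would. */
static uintptr_t read_word(const void *addr)
{
	uintptr_t value;

	memcpy(&value, addr, sizeof(value));
	return value;
}

/* Reader written for the old layout: the symbol address is the table. */
static uintptr_t read_root_as_array(const void *symbol_addr, size_t root)
{
	assert(root < NR_ROOTS);
	return read_word((const char *)symbol_addr + root * sizeof(void *));
}

/* Reader for the new layout: load the table pointer first, then index. */
static uintptr_t read_root_via_pointer(const void *symbol_addr, size_t root)
{
	uintptr_t table = read_word(symbol_addr);

	assert(root < NR_ROOTS);
	if (!table)
		return 0;
	return read_word((const char *)table + root * sizeof(void *));
}
```

In this model, applying the array-style reader to the new symbol returns the address of the table rather than the first root entry, so every later lookup would start from the wrong place.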

^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
  2018-01-07 12:44           ` Mike Galbraith
  (?)
@ 2018-01-07 13:23             ` Michal Hocko
  -1 siblings, 0 replies; 349+ messages in thread
From: Michal Hocko @ 2018-01-07 13:23 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Greg Kroah-Hartman, linux-kernel, stable, Kirill A. Shutemov,
	Andrew Morton, Andy Lutomirski, Borislav Petkov, Cyrill Gorcunov,
	Linus Torvalds, Peter Zijlstra, Thomas Gleixner, linux-mm,
	Ingo Molnar

On Sun 07-01-18 13:44:02, Mike Galbraith wrote:
> On Sun, 2018-01-07 at 11:18 +0100, Michal Hocko wrote:
> > On Sun 07-01-18 10:11:15, Greg KH wrote:
> > > On Sun, Jan 07, 2018 at 06:14:22AM +0100, Mike Galbraith wrote:
> > > > On Fri, 2017-12-22 at 09:45 +0100, Greg Kroah-Hartman wrote:
> > > > > 4.14-stable review patch.  If anyone has any objections, please let me know.
> > > > 
> > > > FYI, this broke kdump, or rather the makedumpfile part thereof.
> > > >  Forward looking wreckage is par for the kdump course, but...
> > > 
> > > Is it also broken in Linus's tree with this patch?  Or is there an
> > > add-on patch that I should apply to 4.14 to resolve this issue there?
> > 
> > This one http://lkml.kernel.org/r/1513932498-20350-1-git-send-email-bhe@redhat.com
> > I guess.
> 
> That won't unbreak kdump, else master wouldn't be broken.  I don't care
> deeply, or know if anyone else does, I'm just reporting it because I
> met it and chased it down.

OK, I didn't notice that d8cfbbfa0f7 ("mm/sparse.c: wrong allocation
for mem_section") made it in after rc6. I am still wondering why
83e3c48729 ("mm/sparsemem: Allocate mem_section at runtime for
CONFIG_SPARSEMEM_EXTREME=y") made it into the stable tree in the first
place.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
  2018-01-07 13:23             ` Michal Hocko
  (?)
@ 2018-01-08  7:53               ` Greg Kroah-Hartman
  -1 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2018-01-08  7:53 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Mike Galbraith, linux-kernel, stable, Kirill A. Shutemov,
	Andrew Morton, Andy Lutomirski, Borislav Petkov, Cyrill Gorcunov,
	Linus Torvalds, Peter Zijlstra, Thomas Gleixner, linux-mm,
	Ingo Molnar

On Sun, Jan 07, 2018 at 02:23:09PM +0100, Michal Hocko wrote:
> On Sun 07-01-18 13:44:02, Mike Galbraith wrote:
> > On Sun, 2018-01-07 at 11:18 +0100, Michal Hocko wrote:
> > > On Sun 07-01-18 10:11:15, Greg KH wrote:
> > > > On Sun, Jan 07, 2018 at 06:14:22AM +0100, Mike Galbraith wrote:
> > > > > On Fri, 2017-12-22 at 09:45 +0100, Greg Kroah-Hartman wrote:
> > > > > > 4.14-stable review patch.  If anyone has any objections, please let me know.
> > > > > 
> > > > > FYI, this broke kdump, or rather the makedumpfile part thereof.
> > > > >  Forward looking wreckage is par for the kdump course, but...
> > > > 
> > > > Is it also broken in Linus's tree with this patch?  Or is there an
> > > > add-on patch that I should apply to 4.14 to resolve this issue there?
> > > 
> > > This one http://lkml.kernel.org/r/1513932498-20350-1-git-send-email-bhe@redhat.com
> > > I guess.
> > 
> > That won't unbreak kdump, else master wouldn't be broken.  I don't care
> > deeply, or know if anyone else does, I'm just reporting it because I
> > met it and chased it down.
> 
> OK, I didn't notice that d8cfbbfa0f7 ("mm/sparse.c: wrong allocation
> for mem_section") made it in after rc6. I am still wondering why
> 83e3c48729 ("mm/sparsemem: Allocate mem_section at runtime for
> CONFIG_SPARSEMEM_EXTREME=y") made it into the stable tree in the first
> place.

It was part of the prep for the KPTI code from what I can tell.  If you
think it should be reverted, just let me know and I'll be glad to do so.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
  2018-01-08  7:53               ` Greg Kroah-Hartman
  (?)
@ 2018-01-08  8:15                 ` Mike Galbraith
  -1 siblings, 0 replies; 349+ messages in thread
From: Mike Galbraith @ 2018-01-08  8:15 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Michal Hocko
  Cc: linux-kernel, stable, Kirill A. Shutemov, Andrew Morton,
	Andy Lutomirski, Borislav Petkov, Cyrill Gorcunov,
	Linus Torvalds, Peter Zijlstra, Thomas Gleixner, linux-mm,
	Ingo Molnar

On Mon, 2018-01-08 at 08:53 +0100, Greg Kroah-Hartman wrote:
> On Sun, Jan 07, 2018 at 02:23:09PM +0100, Michal Hocko wrote:
> > On Sun 07-01-18 13:44:02, Mike Galbraith wrote:
> > > On Sun, 2018-01-07 at 11:18 +0100, Michal Hocko wrote:
> > > > On Sun 07-01-18 10:11:15, Greg KH wrote:
> > > > > On Sun, Jan 07, 2018 at 06:14:22AM +0100, Mike Galbraith wrote:
> > > > > > On Fri, 2017-12-22 at 09:45 +0100, Greg Kroah-Hartman wrote:
> > > > > > > 4.14-stable review patch.  If anyone has any objections, please let me know.
> > > > > > 
> > > > > > FYI, this broke kdump, or rather the makedumpfile part thereof.
> > > > > >  Forward looking wreckage is par for the kdump course, but...
> > > > > 
> > > > > Is it also broken in Linus's tree with this patch?  Or is there an
> > > > > add-on patch that I should apply to 4.14 to resolve this issue there?
> > > > 
> > > > This one http://lkml.kernel.org/r/1513932498-20350-1-git-send-email-bhe@redhat.com
> > > > I guess.
> > > 
> > > That won't unbreak kdump, else master wouldn't be broken.  I don't care
> > > deeply, or know if anyone else does, I'm just reporting it because I
> > > met it and chased it down.
> > 
> > OK, I didn't notice that d8cfbbfa0f7 ("mm/sparse.c: wrong allocation
> > for mem_section") made it in after rc6. I am still wondering why
> > 83e3c48729 ("mm/sparsemem: Allocate mem_section at runtime for
> > CONFIG_SPARSEMEM_EXTREME=y") made it into the stable tree in the first
> > place.
> 
> It was part of the prep for the KPTI code from what I can tell.  If you
> think it should be reverted, just let me know and I'll be glad to do so.

No preference here.  I have to patch master regardless if I want kdump
to work while I patiently wait for userspace to get fixed up (either
that or use time I don't have to go fix it up myself).

	-Mike

^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
  2018-01-08  8:15                 ` Mike Galbraith
  (?)
@ 2018-01-08  8:33                   ` Greg Kroah-Hartman
  -1 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2018-01-08  8:33 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Michal Hocko, linux-kernel, stable, Kirill A. Shutemov,
	Andrew Morton, Andy Lutomirski, Borislav Petkov, Cyrill Gorcunov,
	Linus Torvalds, Peter Zijlstra, Thomas Gleixner, linux-mm,
	Ingo Molnar

On Mon, Jan 08, 2018 at 09:15:33AM +0100, Mike Galbraith wrote:
> On Mon, 2018-01-08 at 08:53 +0100, Greg Kroah-Hartman wrote:
> > On Sun, Jan 07, 2018 at 02:23:09PM +0100, Michal Hocko wrote:
> > > On Sun 07-01-18 13:44:02, Mike Galbraith wrote:
> > > > On Sun, 2018-01-07 at 11:18 +0100, Michal Hocko wrote:
> > > > > On Sun 07-01-18 10:11:15, Greg KH wrote:
> > > > > > On Sun, Jan 07, 2018 at 06:14:22AM +0100, Mike Galbraith wrote:
> > > > > > > On Fri, 2017-12-22 at 09:45 +0100, Greg Kroah-Hartman wrote:
> > > > > > > > 4.14-stable review patch.  If anyone has any objections, please let me know.
> > > > > > > 
> > > > > > > FYI, this broke kdump, or rather the makedumpfile part thereof.
> > > > > > >  Forward looking wreckage is par for the kdump course, but...
> > > > > > 
> > > > > > Is it also broken in Linus's tree with this patch?  Or is there an
> > > > > > add-on patch that I should apply to 4.14 to resolve this issue there?
> > > > > 
> > > > > This one http://lkml.kernel.org/r/1513932498-20350-1-git-send-email-bhe@redhat.com
> > > > > I guess.
> > > > 
> > > > That won't unbreak kdump, else master wouldn't be broken.  I don't care
> > > > deeply, or know if anyone else does, I'm just reporting it because I
> > > > met it and chased it down.
> > > 
> > > OK, I didn't notice that d8cfbbfa0f7 ("mm/sparse.c: wrong allocation
> > > for mem_section") made it in after rc6. I am still wondering why
> > > 83e3c48729 ("mm/sparsemem: Allocate mem_section at runtime for
> > > CONFIG_SPARSEMEM_EXTREME=y") made it into the stable tree in the first
> > > place.
> > 
> > It was part of the prep for the KPTI code from what I can tell.  If you
> > think it should be reverted, just let me know and I'll be glad to do so.
> 
> No preference here.  I have to patch master regardless if I want kdump
> to work while I patiently wait for userspace to get fixed up (either
> that or use time I don't have to go fix it up myself).

I'll stay "bug compatible" for the time being.  If you do fix this up,
can you add a cc: stable tag in your patch so I can pick it up when it
gets merged?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
  2018-01-08  7:53               ` Greg Kroah-Hartman
  (?)
@ 2018-01-08  8:47                 ` Michal Hocko
  -1 siblings, 0 replies; 349+ messages in thread
From: Michal Hocko @ 2018-01-08  8:47 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Mike Galbraith, linux-kernel, stable, Kirill A. Shutemov,
	Andrew Morton, Andy Lutomirski, Borislav Petkov, Cyrill Gorcunov,
	Linus Torvalds, Peter Zijlstra, Thomas Gleixner, linux-mm,
	Ingo Molnar

On Mon 08-01-18 08:53:08, Greg KH wrote:
> On Sun, Jan 07, 2018 at 02:23:09PM +0100, Michal Hocko wrote:
> > On Sun 07-01-18 13:44:02, Mike Galbraith wrote:
> > > On Sun, 2018-01-07 at 11:18 +0100, Michal Hocko wrote:
> > > > On Sun 07-01-18 10:11:15, Greg KH wrote:
> > > > > On Sun, Jan 07, 2018 at 06:14:22AM +0100, Mike Galbraith wrote:
> > > > > > On Fri, 2017-12-22 at 09:45 +0100, Greg Kroah-Hartman wrote:
> > > > > > > 4.14-stable review patch.  If anyone has any objections, please let me know.
> > > > > > 
> > > > > > FYI, this broke kdump, or rather the makedumpfile part thereof.
> > > > > >  Forward looking wreckage is par for the kdump course, but...
> > > > > 
> > > > > Is it also broken in Linus's tree with this patch?  Or is there an
> > > > > add-on patch that I should apply to 4.14 to resolve this issue there?
> > > > 
> > > > This one http://lkml.kernel.org/r/1513932498-20350-1-git-send-email-bhe@redhat.com
> > > > I guess.
> > > 
> > > That won't unbreak kdump, else master wouldn't be broken.  I don't care
> > > deeply, or know if anyone else does, I'm just reporting it because I
> > > met it and chased it down.
> > 
> > OK, I didn't notice that d09cfbbfa0f7 ("mm/sparse.c: wrong allocation
> > for mem_section") made it in after rc6. I am still wondering why
> > 83e3c48729 ("mm/sparsemem: Allocate mem_section at runtime for
> > CONFIG_SPARSEMEM_EXTREME=y") made it into the stable tree in the first
> > place.
> 
> It was part of the prep for the KPTI code from what I can tell.

I do not see a direct relation, to be honest. It is more related to
5-level page tables but I might be missing some subtle relation.

> If you
> think it should be reverted, just let me know and I'll be glad to do so.

This seems to be affecting Linus's tree as well, so it needs to get
resolved. I would suggest reverting in stable in the meantime.
If you really need it in the stable tree then you can pull it back later
with all the follow-up fixes.

-- 
Michal Hocko
SUSE Labs

* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
  2018-01-08  8:47                 ` Michal Hocko
  (?)
@ 2018-01-08  9:10                   ` Greg Kroah-Hartman
  -1 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2018-01-08  9:10 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Mike Galbraith, linux-kernel, stable, Kirill A. Shutemov,
	Andrew Morton, Andy Lutomirski, Borislav Petkov, Cyrill Gorcunov,
	Linus Torvalds, Peter Zijlstra, Thomas Gleixner, linux-mm,
	Ingo Molnar

On Mon, Jan 08, 2018 at 09:47:23AM +0100, Michal Hocko wrote:
> On Mon 08-01-18 08:53:08, Greg KH wrote:
> > On Sun, Jan 07, 2018 at 02:23:09PM +0100, Michal Hocko wrote:
> > > On Sun 07-01-18 13:44:02, Mike Galbraith wrote:
> > > > On Sun, 2018-01-07 at 11:18 +0100, Michal Hocko wrote:
> > > > > On Sun 07-01-18 10:11:15, Greg KH wrote:
> > > > > > On Sun, Jan 07, 2018 at 06:14:22AM +0100, Mike Galbraith wrote:
> > > > > > > On Fri, 2017-12-22 at 09:45 +0100, Greg Kroah-Hartman wrote:
> > > > > > > > 4.14-stable review patch.  If anyone has any objections, please let me know.
> > > > > > > 
> > > > > > > FYI, this broke kdump, or rather the makedumpfile part thereof.
> > > > > > >  Forward looking wreckage is par for the kdump course, but...
> > > > > > 
> > > > > > Is it also broken in Linus's tree with this patch?  Or is there an
> > > > > > add-on patch that I should apply to 4.14 to resolve this issue there?
> > > > > 
> > > > > This one http://lkml.kernel.org/r/1513932498-20350-1-git-send-email-bhe@redhat.com
> > > > > I guess.
> > > > 
> > > > That won't unbreak kdump, else master wouldn't be broken.  I don't care
> > > > deeply, or know if anyone else does, I'm just reporting it because I
> > > > met it and chased it down.
> > > 
> > > OK, I didn't notice that d09cfbbfa0f7 ("mm/sparse.c: wrong allocation
> > > for mem_section") made it in after rc6. I am still wondering why
> > > 83e3c48729 ("mm/sparsemem: Allocate mem_section at runtime for
> > > CONFIG_SPARSEMEM_EXTREME=y") made it into the stable tree in the first
> > > place.
> > 
> > It was part of the prep for the KPTI code from what I can tell.
> 
> I do not see a direct relation, to be honest. It is more related to
> 5-level page tables but I might be missing some subtle relation.
> 
> > If you
> > think it should be reverted, just let me know and I'll be glad to do so.
> 
> This seems to be affecting Linus's tree as well, so it needs to get
> resolved. I would suggest reverting in stable in the meantime.
> If you really need it in the stable tree then you can pull it back later
> with all the follow-up fixes.

Ok, I've now reverted it, thanks.

greg k-h

* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
  2018-01-08  9:10                   ` Greg Kroah-Hartman
  (?)
@ 2018-01-08  9:27                     ` Greg Kroah-Hartman
  -1 siblings, 0 replies; 349+ messages in thread
From: Greg Kroah-Hartman @ 2018-01-08  9:27 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Mike Galbraith, linux-kernel, stable, Kirill A. Shutemov,
	Andrew Morton, Andy Lutomirski, Borislav Petkov, Cyrill Gorcunov,
	Linus Torvalds, Peter Zijlstra, Thomas Gleixner, linux-mm,
	Ingo Molnar

On Mon, Jan 08, 2018 at 10:10:44AM +0100, Greg Kroah-Hartman wrote:
> On Mon, Jan 08, 2018 at 09:47:23AM +0100, Michal Hocko wrote:
> > On Mon 08-01-18 08:53:08, Greg KH wrote:
> > > On Sun, Jan 07, 2018 at 02:23:09PM +0100, Michal Hocko wrote:
> > > > On Sun 07-01-18 13:44:02, Mike Galbraith wrote:
> > > > > On Sun, 2018-01-07 at 11:18 +0100, Michal Hocko wrote:
> > > > > > On Sun 07-01-18 10:11:15, Greg KH wrote:
> > > > > > > On Sun, Jan 07, 2018 at 06:14:22AM +0100, Mike Galbraith wrote:
> > > > > > > > On Fri, 2017-12-22 at 09:45 +0100, Greg Kroah-Hartman wrote:
> > > > > > > > > 4.14-stable review patch.  If anyone has any objections, please let me know.
> > > > > > > > 
> > > > > > > > FYI, this broke kdump, or rather the makedumpfile part thereof.
> > > > > > > >  Forward looking wreckage is par for the kdump course, but...
> > > > > > > 
> > > > > > > Is it also broken in Linus's tree with this patch?  Or is there an
> > > > > > > add-on patch that I should apply to 4.14 to resolve this issue there?
> > > > > > 
> > > > > > This one http://lkml.kernel.org/r/1513932498-20350-1-git-send-email-bhe@redhat.com
> > > > > > I guess.
> > > > > 
> > > > > That won't unbreak kdump, else master wouldn't be broken.  I don't care
> > > > > deeply, or know if anyone else does, I'm just reporting it because I
> > > > > met it and chased it down.
> > > > 
> > > > OK, I didn't notice that d09cfbbfa0f7 ("mm/sparse.c: wrong allocation
> > > > for mem_section") made it in after rc6. I am still wondering why
> > > > 83e3c48729 ("mm/sparsemem: Allocate mem_section at runtime for
> > > > CONFIG_SPARSEMEM_EXTREME=y") made it into the stable tree in the first
> > > > place.
> > > 
> > > It was part of the prep for the KPTI code from what I can tell.
> > 
> > I do not see a direct relation, to be honest. It is more related to
> > 5-level page tables but I might be missing some subtle relation.
> > 
> > > If you
> > > think it should be reverted, just let me know and I'll be glad to do so.
> > 
> > This seems to be affecting Linus's tree as well, so it needs to get
> > resolved. I would suggest reverting in stable in the meantime.
> > If you really need it in the stable tree then you can pull it back later
> > with all the follow-up fixes.
> 
> Ok, I've now reverted it, thanks.

Nope, it breaks the build when reverted, I'm dropping that revert now.

It's as if the x86 maintainers actually knew what they were doing in
asking for this to be backported :)

thanks,

greg k-h

* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
  2018-01-08  8:33                   ` Greg Kroah-Hartman
  (?)
@ 2018-01-08  9:45                     ` Mike Galbraith
  -1 siblings, 0 replies; 349+ messages in thread
From: Mike Galbraith @ 2018-01-08  9:45 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Michal Hocko, linux-kernel, stable, Kirill A. Shutemov,
	Andrew Morton, Andy Lutomirski, Borislav Petkov, Cyrill Gorcunov,
	Linus Torvalds, Peter Zijlstra, Thomas Gleixner, linux-mm,
	Ingo Molnar

On Mon, 2018-01-08 at 09:33 +0100, Greg Kroah-Hartman wrote:
> On Mon, Jan 08, 2018 at 09:15:33AM +0100, Mike Galbraith wrote:
> 
> > > It was part of the prep for the KPTI code from what I can tell.  If you
> > > think it should be reverted, just let me know and I'll be glad to do so.
> > 
> > No preference here.  I have to patch master regardless if I want kdump
> > to work while I patiently wait for userspace to get fixed up (either
> > that or use time I don't have to go fix it up myself).
> 
> I'll stay "bug compatible" for the time being.  If you do fix this up,
> can you add a cc: stable tag in your patch so I can pick it up when it
> gets merged?

Userspace (makedumpfile) will have to adapt, not the kernel. Meanwhile
I carry reverts, making kernels, kdump and myself all happy campers.

	-Mike

* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
  2018-01-07  5:14     ` Mike Galbraith
  (?)
@ 2018-01-08 16:04       ` Ingo Molnar
  -1 siblings, 0 replies; 349+ messages in thread
From: Ingo Molnar @ 2018-01-08 16:04 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Greg Kroah-Hartman, linux-kernel, stable, Kirill A. Shutemov,
	Andrew Morton, Andy Lutomirski, Borislav Petkov, Cyrill Gorcunov,
	Linus Torvalds, Peter Zijlstra, Thomas Gleixner, linux-mm


hi Kirill,

As Mike reported below, your 5-level paging related upstream commit
83e3c48729d9 and all its follow-up fixes:

 83e3c48729d9: mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
 629a359bdb0e: mm/sparsemem: Fix ARM64 boot crash when CONFIG_SPARSEMEM_EXTREME=y
 d09cfbbfa0f7: mm/sparse.c: wrong allocation for mem_section

... still break kexec - and that now regresses -stable as well.

Given that 5-level paging now syntactically depends on having this commit, if we
fully revert this then we'll have to disable 5-level paging as well.

Thanks,

	Ingo

* Mike Galbraith <efault@gmx.de> wrote:

> On Fri, 2017-12-22 at 09:45 +0100, Greg Kroah-Hartman wrote:
> > 4.14-stable review patch.  If anyone has any objections, please let me know.
> 
> FYI, this broke kdump, or rather the makedumpfile part thereof.
>  Forward looking wreckage is par for the kdump course, but...
> 
> > ------------------
> > 
> > From: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > 
> > commit 83e3c48729d9ebb7af5a31a504f3fd6aff0348c4 upstream.
> > 
> > Size of the mem_section[] array depends on the size of the physical address space.
> > 
> > In preparation for boot-time switching between paging modes on x86-64
> > we need to make the allocation of mem_section[] dynamic, because otherwise
> > we waste a lot of RAM: with CONFIG_NODE_SHIFT=10, mem_section[] size is 32kB
> > for 4-level paging and 2MB for 5-level paging mode.
> > 
> > The patch allocates the array on the first call to sparse_memory_present_with_active_regions().
> > 
> > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Cc: Andy Lutomirski <luto@amacapital.net>
> > Cc: Borislav Petkov <bp@suse.de>
> > Cc: Cyrill Gorcunov <gorcunov@openvz.org>
> > Cc: Linus Torvalds <torvalds@linux-foundation.org>
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: linux-mm@kvack.org
> > Link: http://lkml.kernel.org/r/20170929140821.37654-2-kirill.shutemov@linux.intel.com
> > Signed-off-by: Ingo Molnar <mingo@kernel.org>
> > Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > 
> > ---
> >  include/linux/mmzone.h |    6 +++++-
> >  mm/page_alloc.c        |   10 ++++++++++
> >  mm/sparse.c            |   17 +++++++++++------
> >  3 files changed, 26 insertions(+), 7 deletions(-)
> > 
> > --- a/include/linux/mmzone.h
> > +++ b/include/linux/mmzone.h
> > @@ -1152,13 +1152,17 @@ struct mem_section {
> >  #define SECTION_ROOT_MASK	(SECTIONS_PER_ROOT - 1)
> >  
> >  #ifdef CONFIG_SPARSEMEM_EXTREME
> > -extern struct mem_section *mem_section[NR_SECTION_ROOTS];
> > +extern struct mem_section **mem_section;
> >  #else
> >  extern struct mem_section mem_section[NR_SECTION_ROOTS][SECTIONS_PER_ROOT];
> >  #endif
> >  
> >  static inline struct mem_section *__nr_to_section(unsigned long nr)
> >  {
> > +#ifdef CONFIG_SPARSEMEM_EXTREME
> > +	if (!mem_section)
> > +		return NULL;
> > +#endif
> >  	if (!mem_section[SECTION_NR_TO_ROOT(nr)])
> >  		return NULL;
> >  	return &mem_section[SECTION_NR_TO_ROOT(nr)][nr & SECTION_ROOT_MASK];
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -5651,6 +5651,16 @@ void __init sparse_memory_present_with_a
> >  	unsigned long start_pfn, end_pfn;
> >  	int i, this_nid;
> >  
> > +#ifdef CONFIG_SPARSEMEM_EXTREME
> > +	if (!mem_section) {
> > +		unsigned long size, align;
> > +
> > +		size = sizeof(struct mem_section) * NR_SECTION_ROOTS;
> > +		align = 1 << (INTERNODE_CACHE_SHIFT);
> > +		mem_section = memblock_virt_alloc(size, align);
> > +	}
> > +#endif
> > +
> >  	for_each_mem_pfn_range(i, nid, &start_pfn, &end_pfn, &this_nid)
> >  		memory_present(this_nid, start_pfn, end_pfn);
> >  }
> > --- a/mm/sparse.c
> > +++ b/mm/sparse.c
> > @@ -23,8 +23,7 @@
> >   * 1) mem_section	- memory sections, mem_map's for valid memory
> >   */
> >  #ifdef CONFIG_SPARSEMEM_EXTREME
> > -struct mem_section *mem_section[NR_SECTION_ROOTS]
> > -	____cacheline_internodealigned_in_smp;
> > +struct mem_section **mem_section;
> >  #else
> >  struct mem_section mem_section[NR_SECTION_ROOTS][SECTIONS_PER_ROOT]
> >  	____cacheline_internodealigned_in_smp;
> > @@ -101,7 +100,7 @@ static inline int sparse_index_init(unsi
> >  int __section_nr(struct mem_section* ms)
> >  {
> >  	unsigned long root_nr;
> > -	struct mem_section* root;
> > +	struct mem_section *root = NULL;
> >  
> >  	for (root_nr = 0; root_nr < NR_SECTION_ROOTS; root_nr++) {
> >  		root = __nr_to_section(root_nr * SECTIONS_PER_ROOT);
> > @@ -112,7 +111,7 @@ int __section_nr(struct mem_section* ms)
> >  		     break;
> >  	}
> >  
> > -	VM_BUG_ON(root_nr == NR_SECTION_ROOTS);
> > +	VM_BUG_ON(!root);
> >  
> >  	return (root_nr * SECTIONS_PER_ROOT) + (ms - root);
> >  }
> > @@ -330,11 +329,17 @@ again:
> >  static void __init check_usemap_section_nr(int nid, unsigned long *usemap)
> >  {
> >  	unsigned long usemap_snr, pgdat_snr;
> > -	static unsigned long old_usemap_snr = NR_MEM_SECTIONS;
> > -	static unsigned long old_pgdat_snr = NR_MEM_SECTIONS;
> > +	static unsigned long old_usemap_snr;
> > +	static unsigned long old_pgdat_snr;
> >  	struct pglist_data *pgdat = NODE_DATA(nid);
> >  	int usemap_nid;
> >  
> > +	/* First call */
> > +	if (!old_usemap_snr) {
> > +		old_usemap_snr = NR_MEM_SECTIONS;
> > +		old_pgdat_snr = NR_MEM_SECTIONS;
> > +	}
> > +
> >  	usemap_snr = pfn_to_section_nr(__pa(usemap) >> PAGE_SHIFT);
> >  	pgdat_snr = pfn_to_section_nr(__pa(pgdat) >> PAGE_SHIFT);
> >  	if (usemap_snr == pgdat_snr)
> > 
> > 
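
The 32kB and 2MB figures in the quoted changelog can be reproduced with a
small user-space sketch. The constants below are assumptions and are not
stated anywhere in this thread: 4kB pages, 128MB memory sections
(SECTION_SIZE_BITS=27), a 32-byte struct mem_section, 8-byte pointers, and
46 vs. 52 physical address bits for 4-level vs. 5-level paging.
root_table_bytes() is a hypothetical helper, not a kernel function.

```c
/* Assumed x86-64 values; none of these are given in the thread. */
#define PAGE_SIZE_BYTES     4096UL
#define SECTION_SIZE_BITS   27     /* 128MB per memory section */
#define MEM_SECTION_BYTES   32UL   /* assumed sizeof(struct mem_section) */
#define ROOT_POINTER_BYTES  8UL    /* sizeof(struct mem_section *) on 64-bit */

/*
 * Bytes needed for the top-level array of root pointers, i.e. what
 * mem_section[NR_SECTION_ROOTS] occupies with CONFIG_SPARSEMEM_EXTREME=y.
 */
unsigned long root_table_bytes(unsigned int max_physmem_bits)
{
	/* One section per 2^SECTION_SIZE_BITS bytes of physical address space. */
	unsigned long nr_mem_sections =
		1UL << (max_physmem_bits - SECTION_SIZE_BITS);
	/* Each root is one page worth of struct mem_section entries. */
	unsigned long sections_per_root = PAGE_SIZE_BYTES / MEM_SECTION_BYTES;
	/* Round up, as DIV_ROUND_UP() would. */
	unsigned long nr_section_roots =
		(nr_mem_sections + sections_per_root - 1) / sections_per_root;

	return nr_section_roots * ROOT_POINTER_BYTES;
}
```

Under these assumptions root_table_bytes(46) is 32768 (32kB) and
root_table_bytes(52) is 2097152 (2MB), which is the gap the runtime
allocation is meant to avoid paying on 4-level machines.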

^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
  2018-01-08 16:04       ` Ingo Molnar
@ 2018-01-08 17:46         ` Kirill A. Shutemov
  -1 siblings, 0 replies; 349+ messages in thread
From: Kirill A. Shutemov @ 2018-01-08 17:46 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Mike Galbraith, Greg Kroah-Hartman, linux-kernel, stable,
	Andrew Morton, Andy Lutomirski, Borislav Petkov, Cyrill Gorcunov,
	Linus Torvalds, Peter Zijlstra, Thomas Gleixner, linux-mm

On Mon, Jan 08, 2018 at 04:04:44PM +0000, Ingo Molnar wrote:
> 
> hi Kirill,
> 
> As Mike reported it below, your 5-level paging related upstream commit 
> 83e3c48729d9 and all its followup fixes:
> 
>  83e3c48729d9: mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
>  629a359bdb0e: mm/sparsemem: Fix ARM64 boot crash when CONFIG_SPARSEMEM_EXTREME=y
>  d09cfbbfa0f7: mm/sparse.c: wrong allocation for mem_section
> 
> ... still breaks kexec - and that now regresses -stable as well.
> 
> Given that 5-level paging now syntactically depends on having this commit, if we 
> fully revert this then we'll have to disable 5-level paging as well.

Urghh.. Sorry about this.

I'm on vacation right now. Give me a day to sort this out.

-- 
 Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
  2018-01-08 17:46         ` Kirill A. Shutemov
  (?)
  (?)
@ 2018-01-09  0:13           ` Kirill A. Shutemov
  -1 siblings, 0 replies; 349+ messages in thread
From: Kirill A. Shutemov @ 2018-01-09  0:13 UTC (permalink / raw)
  To: Ingo Molnar, Mike Galbraith, Andrew Morton
  Cc: Kirill A. Shutemov, Greg Kroah-Hartman, linux-kernel, stable,
	Andy Lutomirski, Borislav Petkov, Cyrill Gorcunov,
	Linus Torvalds, Peter Zijlstra, Thomas Gleixner, linux-mm,
	Dave Young, Baoquan He, Vivek Goyal, kexec

On Mon, Jan 08, 2018 at 08:46:53PM +0300, Kirill A. Shutemov wrote:
> On Mon, Jan 08, 2018 at 04:04:44PM +0000, Ingo Molnar wrote:
> > 
> > hi Kirill,
> > 
> > As Mike reported it below, your 5-level paging related upstream commit 
> > 83e3c48729d9 and all its followup fixes:
> > 
> >  83e3c48729d9: mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
> >  629a359bdb0e: mm/sparsemem: Fix ARM64 boot crash when CONFIG_SPARSEMEM_EXTREME=y
> >  d09cfbbfa0f7: mm/sparse.c: wrong allocation for mem_section
> > 
> > ... still breaks kexec - and that now regresses -stable as well.
> > 
> > Given that 5-level paging now syntactically depends on having this commit, if we 
> > fully revert this then we'll have to disable 5-level paging as well.

This *should* help.

Mike, could you test this? (On top of the rest of the fixes.)

Sorry for the mess.

From 100fd567754f1457be94732046aefca204c842d2 Mon Sep 17 00:00:00 2001
From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Date: Tue, 9 Jan 2018 02:55:47 +0300
Subject: [PATCH] kdump: Write a correct address of mem_section into vmcoreinfo

Depending on configuration mem_section can now be an array or a pointer
to an array allocated dynamically. In most cases, we can continue to refer
to it as 'mem_section' regardless of what it is.

But there's one exception: '&mem_section' means "address of the array" if
mem_section is an array, but if mem_section is a pointer, it would mean
"address of the pointer".

We've stepped onto this in kdump code. VMCOREINFO_SYMBOL(mem_section)
writes down address of pointer into vmcoreinfo, not array as we wanted.

Let's introduce VMCOREINFO_ARRAY() that would handle the situation
correctly for both cases.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Fixes: 83e3c48729d9 ("mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y")
---
 include/linux/crash_core.h | 2 ++
 kernel/crash_core.c        | 2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/linux/crash_core.h b/include/linux/crash_core.h
index 06097ef30449..83ae04950269 100644
--- a/include/linux/crash_core.h
+++ b/include/linux/crash_core.h
@@ -42,6 +42,8 @@ phys_addr_t paddr_vmcoreinfo_note(void);
 	vmcoreinfo_append_str("PAGESIZE=%ld\n", value)
 #define VMCOREINFO_SYMBOL(name) \
 	vmcoreinfo_append_str("SYMBOL(%s)=%lx\n", #name, (unsigned long)&name)
+#define VMCOREINFO_ARRAY(name) \
+	vmcoreinfo_append_str("SYMBOL(%s)=%lx\n", #name, (unsigned long)name)
 #define VMCOREINFO_SIZE(name) \
 	vmcoreinfo_append_str("SIZE(%s)=%lu\n", #name, \
 			      (unsigned long)sizeof(name))
diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index b3663896278e..d4122a837477 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -410,7 +410,7 @@ static int __init crash_save_vmcoreinfo_init(void)
 	VMCOREINFO_SYMBOL(contig_page_data);
 #endif
 #ifdef CONFIG_SPARSEMEM
-	VMCOREINFO_SYMBOL(mem_section);
+	VMCOREINFO_ARRAY(mem_section);
 	VMCOREINFO_LENGTH(mem_section, NR_SECTION_ROOTS);
 	VMCOREINFO_STRUCT_SIZE(mem_section);
 	VMCOREINFO_OFFSET(mem_section, section_mem_map);
-- 
 Kirill A. Shutemov
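
The array-vs-pointer difference described in the changelog above can be
seen in plain user-space C. This is only an illustration: struct section,
static_roots, dynamic_roots, alloc_dynamic_roots() and the two macros are
hypothetical stand-ins for the kernel's mem_section, VMCOREINFO_SYMBOL()
and VMCOREINFO_ARRAY(), keeping just the address expression each one
evaluates.

```c
#include <stdlib.h>

#define NR_ROOTS 4

struct section { unsigned long map; };

/* Array case: storage is the object itself. */
static struct section *static_roots[NR_ROOTS];
/* Pointer case: storage lives wherever the allocator put it. */
static struct section **dynamic_roots;

/* Mirrors the address expression in VMCOREINFO_SYMBOL(): takes &name. */
#define SYMBOL_ADDR(name) ((unsigned long)&name)
/* Mirrors the address expression in VMCOREINFO_ARRAY(): uses name itself. */
#define ARRAY_ADDR(name)  ((unsigned long)name)

/* Allocate the root table at runtime; returns 0 on success, -1 on failure. */
int alloc_dynamic_roots(void)
{
	dynamic_roots = calloc(NR_ROOTS, sizeof(*dynamic_roots));
	return dynamic_roots ? 0 : -1;
}
```

For static_roots both macros give the same number, because an array and
its first element share an address. For dynamic_roots, once
alloc_dynamic_roots() has succeeded, SYMBOL_ADDR() gives the location of
the pointer variable while ARRAY_ADDR() gives the location of the
allocated table, which is the value a dump tool needs in order to walk it.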

^ permalink raw reply related	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
  2018-01-09  0:13           ` Kirill A. Shutemov
  (?)
@ 2018-01-09  1:09             ` Dave Young
  -1 siblings, 0 replies; 349+ messages in thread
From: Dave Young @ 2018-01-09  1:09 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Ingo Molnar, Mike Galbraith, Andrew Morton, Kirill A. Shutemov,
	Greg Kroah-Hartman, linux-kernel, stable, Andy Lutomirski,
	Borislav Petkov, Cyrill Gorcunov, Linus Torvalds, Peter Zijlstra,
	Thomas Gleixner, linux-mm, Baoquan He, Vivek Goyal, kexec

On 01/09/18 at 03:13am, Kirill A. Shutemov wrote:
> On Mon, Jan 08, 2018 at 08:46:53PM +0300, Kirill A. Shutemov wrote:
> > On Mon, Jan 08, 2018 at 04:04:44PM +0000, Ingo Molnar wrote:
> > > 
> > > hi Kirill,
> > > 
> > > As Mike reported it below, your 5-level paging related upstream commit 
> > > 83e3c48729d9 and all its followup fixes:
> > > 
> > >  83e3c48729d9: mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
> > >  629a359bdb0e: mm/sparsemem: Fix ARM64 boot crash when CONFIG_SPARSEMEM_EXTREME=y
> > >  d09cfbbfa0f7: mm/sparse.c: wrong allocation for mem_section
> > > 
> > > ... still breaks kexec - and that now regresses -stable as well.
> > > 
> > > Given that 5-level paging now syntactically depends on having this commit, if we 
> > > fully revert this then we'll have to disable 5-level paging as well.
> 
> This *should* help.
> 
> Mike, could you test this? (On top of the rest of the fixes.)
> 
> Sorry for the mess.
> 
> From 100fd567754f1457be94732046aefca204c842d2 Mon Sep 17 00:00:00 2001
> From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> Date: Tue, 9 Jan 2018 02:55:47 +0300
> Subject: [PATCH] kdump: Write a correct address of mem_section into vmcoreinfo
> 
> Depending on configuration mem_section can now be an array or a pointer
> to an array allocated dynamically. In most cases, we can continue to refer
> to it as 'mem_section' regardless of what it is.
> 
> But there's one exception: '&mem_section' means "address of the array" if
> mem_section is an array, but if mem_section is a pointer, it would mean
> "address of the pointer".
> 
> We've stepped onto this in kdump code. VMCOREINFO_SYMBOL(mem_section)
> writes down address of pointer into vmcoreinfo, not array as we wanted.
> 
> Let's introduce VMCOREINFO_ARRAY() that would handle the situation
> correctly for both cases.
> 
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Fixes: 83e3c48729d9 ("mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y")
> ---
>  include/linux/crash_core.h | 2 ++
>  kernel/crash_core.c        | 2 +-
>  2 files changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/crash_core.h b/include/linux/crash_core.h
> index 06097ef30449..83ae04950269 100644
> --- a/include/linux/crash_core.h
> +++ b/include/linux/crash_core.h
> @@ -42,6 +42,8 @@ phys_addr_t paddr_vmcoreinfo_note(void);
>  	vmcoreinfo_append_str("PAGESIZE=%ld\n", value)
>  #define VMCOREINFO_SYMBOL(name) \
>  	vmcoreinfo_append_str("SYMBOL(%s)=%lx\n", #name, (unsigned long)&name)
> +#define VMCOREINFO_ARRAY(name) \

Thanks for the patch. I have a similar patch, but the makedumpfile
maintainer is looking at a userspace fix instead.

As for the macro name, VMCOREINFO_SYMBOL_ARRAY sounds better.

> +	vmcoreinfo_append_str("SYMBOL(%s)=%lx\n", #name, (unsigned long)name)
>  #define VMCOREINFO_SIZE(name) \
>  	vmcoreinfo_append_str("SIZE(%s)=%lu\n", #name, \
>  			      (unsigned long)sizeof(name))
> diff --git a/kernel/crash_core.c b/kernel/crash_core.c
> index b3663896278e..d4122a837477 100644
> --- a/kernel/crash_core.c
> +++ b/kernel/crash_core.c
> @@ -410,7 +410,7 @@ static int __init crash_save_vmcoreinfo_init(void)
>  	VMCOREINFO_SYMBOL(contig_page_data);
>  #endif
>  #ifdef CONFIG_SPARSEMEM
> -	VMCOREINFO_SYMBOL(mem_section);
> +	VMCOREINFO_ARRAY(mem_section);
>  	VMCOREINFO_LENGTH(mem_section, NR_SECTION_ROOTS);
>  	VMCOREINFO_STRUCT_SIZE(mem_section);
>  	VMCOREINFO_OFFSET(mem_section, section_mem_map);
> -- 
>  Kirill A. Shutemov

Thanks
Dave

^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
  2018-01-09  0:13           ` Kirill A. Shutemov
  (?)
  (?)
@ 2018-01-09  3:44             ` Mike Galbraith
  -1 siblings, 0 replies; 349+ messages in thread
From: Mike Galbraith @ 2018-01-09  3:44 UTC (permalink / raw)
  To: Kirill A. Shutemov, Ingo Molnar, Andrew Morton
  Cc: Kirill A. Shutemov, Greg Kroah-Hartman, linux-kernel, stable,
	Andy Lutomirski, Borislav Petkov, Cyrill Gorcunov,
	Linus Torvalds, Peter Zijlstra, Thomas Gleixner, linux-mm,
	Dave Young, Baoquan He, Vivek Goyal, kexec

On Tue, 2018-01-09 at 03:13 +0300, Kirill A. Shutemov wrote:
> 
> Mike, could you test this? (On top of the rest of the fixes.)

homer:..crash/2018-01-09-04:25 # ll
total 1863604
-rw------- 1 root root      66255 Jan  9 04:25 dmesg.txt
-rw-r--r-- 1 root root        182 Jan  9 04:25 README.txt
-rw-r--r-- 1 root root    2818240 Jan  9 04:25 System.map-4.15.0.gb2cd1df-master
-rw------- 1 root root 1832914928 Jan  9 04:25 vmcore
-rw-r--r-- 1 root root   72514993 Jan  9 04:25 vmlinux-4.15.0.gb2cd1df-master.gz

Yup, all better.

> Sorry for the mess.

(why, developers not installing shiny new bugs is a whole lot worse:)

> From 100fd567754f1457be94732046aefca204c842d2 Mon Sep 17 00:00:00 2001
> From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> Date: Tue, 9 Jan 2018 02:55:47 +0300
> Subject: [PATCH] kdump: Write a correct address of mem_section into vmcoreinfo
> 
> Depending on configuration mem_section can now be an array or a pointer
> to an array allocated dynamically. In most cases, we can continue to refer
> to it as 'mem_section' regardless of what it is.
> 
> But there's one exception: '&mem_section' means "address of the array" if
> mem_section is an array, but if mem_section is a pointer, it would mean
> "address of the pointer".
> 
> We've stepped onto this in kdump code. VMCOREINFO_SYMBOL(mem_section)
> writes down address of pointer into vmcoreinfo, not array as we wanted.
> 
> Let's introduce VMCOREINFO_ARRAY() that would handle the situation
> correctly for both cases.
> 
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Fixes: 83e3c48729d9 ("mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y")
> ---
>  include/linux/crash_core.h | 2 ++
>  kernel/crash_core.c        | 2 +-
>  2 files changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/crash_core.h b/include/linux/crash_core.h
> index 06097ef30449..83ae04950269 100644
> --- a/include/linux/crash_core.h
> +++ b/include/linux/crash_core.h
> @@ -42,6 +42,8 @@ phys_addr_t paddr_vmcoreinfo_note(void);
>  	vmcoreinfo_append_str("PAGESIZE=%ld\n", value)
>  #define VMCOREINFO_SYMBOL(name) \
>  	vmcoreinfo_append_str("SYMBOL(%s)=%lx\n", #name, (unsigned long)&name)
> +#define VMCOREINFO_ARRAY(name) \
> +	vmcoreinfo_append_str("SYMBOL(%s)=%lx\n", #name, (unsigned long)name)
>  #define VMCOREINFO_SIZE(name) \
>  	vmcoreinfo_append_str("SIZE(%s)=%lu\n", #name, \
>  			      (unsigned long)sizeof(name))
> diff --git a/kernel/crash_core.c b/kernel/crash_core.c
> index b3663896278e..d4122a837477 100644
> --- a/kernel/crash_core.c
> +++ b/kernel/crash_core.c
> @@ -410,7 +410,7 @@ static int __init crash_save_vmcoreinfo_init(void)
>  	VMCOREINFO_SYMBOL(contig_page_data);
>  #endif
>  #ifdef CONFIG_SPARSEMEM
> -	VMCOREINFO_SYMBOL(mem_section);
> +	VMCOREINFO_ARRAY(mem_section);
>  	VMCOREINFO_LENGTH(mem_section, NR_SECTION_ROOTS);
>  	VMCOREINFO_STRUCT_SIZE(mem_section);
>  	VMCOREINFO_OFFSET(mem_section, section_mem_map);

^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
  2018-01-09  1:09             ` Dave Young
  (?)
@ 2018-01-09  5:41               ` Baoquan He
  -1 siblings, 0 replies; 349+ messages in thread
From: Baoquan He @ 2018-01-09  5:41 UTC (permalink / raw)
  To: Dave Young
  Cc: Kirill A. Shutemov, Ingo Molnar, Mike Galbraith, Andrew Morton,
	Kirill A. Shutemov, Greg Kroah-Hartman, linux-kernel, stable,
	Andy Lutomirski, Borislav Petkov, Cyrill Gorcunov,
	Linus Torvalds, Peter Zijlstra, Thomas Gleixner, linux-mm,
	Vivek Goyal, kexec

On 01/09/18 at 09:09am, Dave Young wrote:
> On 01/09/18 at 03:13am, Kirill A. Shutemov wrote:
> > On Mon, Jan 08, 2018 at 08:46:53PM +0300, Kirill A. Shutemov wrote:
> > > On Mon, Jan 08, 2018 at 04:04:44PM +0000, Ingo Molnar wrote:
> > > > 
> > > > hi Kirill,
> > > > 
> > > > As Mike reported it below, your 5-level paging related upstream commit 
> > > > 83e3c48729d9 and all its followup fixes:
> > > > 
> > > >  83e3c48729d9: mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
> > > >  629a359bdb0e: mm/sparsemem: Fix ARM64 boot crash when CONFIG_SPARSEMEM_EXTREME=y
> > > >  d09cfbbfa0f7: mm/sparse.c: wrong allocation for mem_section
> > > > 
> > > > ... still breaks kexec - and that now regresses -stable as well.
> > > > 
> > > > Given that 5-level paging now syntactically depends on having this commit, if we 
> > > > fully revert this then we'll have to disable 5-level paging as well.
> > 
> > This *should* help.
> > 
> > Mike, could you test this? (On top of the rest of the fixes.)
> > 
> > Sorry for the mess.
> > 
> > From 100fd567754f1457be94732046aefca204c842d2 Mon Sep 17 00:00:00 2001
> > From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> > Date: Tue, 9 Jan 2018 02:55:47 +0300
> > Subject: [PATCH] kdump: Write a correct address of mem_section into vmcoreinfo
> > 
> > Depending on configuration mem_section can now be an array or a pointer
> > to an array allocated dynamically. In most cases, we can continue to refer
> > to it as 'mem_section' regardless of what it is.
> > 
> > But there's one exception: '&mem_section' means "address of the array" if
> > mem_section is an array, but if mem_section is a pointer, it would mean
> > "address of the pointer".
> > 
> > We've stepped onto this in kdump code. VMCOREINFO_SYMBOL(mem_section)
> > writes down address of pointer into vmcoreinfo, not array as we wanted.
> > 
> > Let's introduce VMCOREINFO_ARRAY() that would handle the situation
> > correctly for both cases.
> > 
> > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > Fixes: 83e3c48729d9 ("mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y")
> > ---
> >  include/linux/crash_core.h | 2 ++
> >  kernel/crash_core.c        | 2 +-
> >  2 files changed, 3 insertions(+), 1 deletion(-)
> > 
> > diff --git a/include/linux/crash_core.h b/include/linux/crash_core.h
> > index 06097ef30449..83ae04950269 100644
> > --- a/include/linux/crash_core.h
> > +++ b/include/linux/crash_core.h
> > @@ -42,6 +42,8 @@ phys_addr_t paddr_vmcoreinfo_note(void);
> >  	vmcoreinfo_append_str("PAGESIZE=%ld\n", value)
> >  #define VMCOREINFO_SYMBOL(name) \
> >  	vmcoreinfo_append_str("SYMBOL(%s)=%lx\n", #name, (unsigned long)&name)
> > +#define VMCOREINFO_ARRAY(name) \
> 
> Thanks for the patch, I have a similar patch but makedumpfile maintainer
> is looking at a userspace fix instead.

It seems we should CC lkml next time so that people can follow it.

> As for the macro name, VMCOREINFO_SYMBOL_ARRAY sounds better.

I still think calling vmcoreinfo_append_str() directly is better, unless
we convert all array variables to the newly added macro:

vmcoreinfo_append_str("SYMBOL(mem_section)=%lx\n",
                                (unsigned long)mem_section);
> 
> > +	vmcoreinfo_append_str("SYMBOL(%s)=%lx\n", #name, (unsigned long)name)
> >  #define VMCOREINFO_SIZE(name) \
> >  	vmcoreinfo_append_str("SIZE(%s)=%lu\n", #name, \
> >  			      (unsigned long)sizeof(name))
> > diff --git a/kernel/crash_core.c b/kernel/crash_core.c
> > index b3663896278e..d4122a837477 100644
> > --- a/kernel/crash_core.c
> > +++ b/kernel/crash_core.c
> > @@ -410,7 +410,7 @@ static int __init crash_save_vmcoreinfo_init(void)
> >  	VMCOREINFO_SYMBOL(contig_page_data);
> >  #endif
> >  #ifdef CONFIG_SPARSEMEM
> > -	VMCOREINFO_SYMBOL(mem_section);
> > +	VMCOREINFO_ARRAY(mem_section);
> >  	VMCOREINFO_LENGTH(mem_section, NR_SECTION_ROOTS);
> >  	VMCOREINFO_STRUCT_SIZE(mem_section);
> >  	VMCOREINFO_OFFSET(mem_section, section_mem_map);
> > -- 
> >  Kirill A. Shutemov
> 
> Thanks
> Dave
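
The distinction the quoted commit message draws between an array and a pointer can be shown with a minimal userspace sketch in C. It is only an illustration: `roots_static`, `roots_dynamic`, `NR_ROOTS`, `ADDR_OF` and `VALUE_OF` are hypothetical names invented here, and `struct section` is a simplified stand-in rather than the kernel's `struct mem_section`.

```c
#include <stdint.h>
#include <stdlib.h>

#define NR_ROOTS 4	/* hypothetical stand-in for NR_SECTION_ROOTS */

struct section { unsigned long section_mem_map; };

/* Static layout: the symbol itself is the array. */
static struct section *roots_static[NR_ROOTS];

/* Dynamic layout: the symbol is a pointer to a runtime allocation. */
static struct section **roots_dynamic;

/* What a "(unsigned long)&name" style macro would record. */
#define ADDR_OF(name)	((uintptr_t)&(name))
/* What a "(unsigned long)name" style macro would record. */
#define VALUE_OF(name)	((uintptr_t)(name))

/* Returns 1 if both forms yield the same address for the static array. */
int static_forms_agree(void)
{
	return ADDR_OF(roots_static) == VALUE_OF(roots_static);
}

/*
 * Returns 1 if both forms yield the same address for the pointer case,
 * 0 if they differ, and -1 if the allocation failed.
 */
int dynamic_forms_agree(void)
{
	if (!roots_dynamic)
		roots_dynamic = calloc(NR_ROOTS, sizeof(*roots_dynamic));
	if (!roots_dynamic)
		return -1;
	return ADDR_OF(roots_dynamic) == VALUE_OF(roots_dynamic);
}
```

For the static array the two forms agree, so either one names the table. For the pointer they differ: the `&name` form gives the location of the pointer variable, while the `name` form gives the allocated table, which is the address a dump tool would need.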

^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
  2018-01-09  5:41               ` Baoquan He
  (?)
@ 2018-01-09  7:24                 ` Dave Young
  -1 siblings, 0 replies; 349+ messages in thread
From: Dave Young @ 2018-01-09  7:24 UTC (permalink / raw)
  To: Baoquan He
  Cc: Kirill A. Shutemov, Ingo Molnar, Mike Galbraith, Andrew Morton,
	Kirill A. Shutemov, Greg Kroah-Hartman, linux-kernel, stable,
	Andy Lutomirski, Borislav Petkov, Cyrill Gorcunov,
	Linus Torvalds, Peter Zijlstra, Thomas Gleixner, linux-mm,
	Vivek Goyal, kexec

On 01/09/18 at 01:41pm, Baoquan He wrote:
> On 01/09/18 at 09:09am, Dave Young wrote:
> > On 01/09/18 at 03:13am, Kirill A. Shutemov wrote:
> > > On Mon, Jan 08, 2018 at 08:46:53PM +0300, Kirill A. Shutemov wrote:
> > > > On Mon, Jan 08, 2018 at 04:04:44PM +0000, Ingo Molnar wrote:
> > > > > 
> > > > > hi Kirill,
> > > > > 
> > > > > As Mike reported it below, your 5-level paging related upstream commit 
> > > > > 83e3c48729d9 and all its followup fixes:
> > > > > 
> > > > >  83e3c48729d9: mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
> > > > >  629a359bdb0e: mm/sparsemem: Fix ARM64 boot crash when CONFIG_SPARSEMEM_EXTREME=y
> > > > >  d09cfbbfa0f7: mm/sparse.c: wrong allocation for mem_section
> > > > > 
> > > > > ... still breaks kexec - and that now regresses -stable as well.
> > > > > 
> > > > > Given that 5-level paging now syntactically depends on having this commit, if we 
> > > > > fully revert this then we'll have to disable 5-level paging as well.
> > > 
> > > This *should* help.
> > > 
> > > Mike, could you test this? (On top of the rest of the fixes.)
> > > 
> > > Sorry for the mess.
> > > 
> > > From 100fd567754f1457be94732046aefca204c842d2 Mon Sep 17 00:00:00 2001
> > > From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> > > Date: Tue, 9 Jan 2018 02:55:47 +0300
> > > Subject: [PATCH] kdump: Write a correct address of mem_section into vmcoreinfo
> > > 
> > > Depending on configuration mem_section can now be an array or a pointer
> > > to an array allocated dynamically. In most cases, we can continue to refer
> > > to it as 'mem_section' regardless of what it is.
> > > 
> > > But there's one exception: '&mem_section' means "address of the array" if
> > > mem_section is an array, but if mem_section is a pointer, it would mean
> > > "address of the pointer".
> > > 
> > > We've stepped onto this in kdump code. VMCOREINFO_SYMBOL(mem_section)
> > > writes down address of pointer into vmcoreinfo, not array as we wanted.
> > > 
> > > Let's introduce VMCOREINFO_ARRAY() that would handle the situation
> > > correctly for both cases.
> > > 
> > > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > > Fixes: 83e3c48729d9 ("mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y")
> > > ---
> > >  include/linux/crash_core.h | 2 ++
> > >  kernel/crash_core.c        | 2 +-
> > >  2 files changed, 3 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/include/linux/crash_core.h b/include/linux/crash_core.h
> > > index 06097ef30449..83ae04950269 100644
> > > --- a/include/linux/crash_core.h
> > > +++ b/include/linux/crash_core.h
> > > @@ -42,6 +42,8 @@ phys_addr_t paddr_vmcoreinfo_note(void);
> > >  	vmcoreinfo_append_str("PAGESIZE=%ld\n", value)
> > >  #define VMCOREINFO_SYMBOL(name) \
> > >  	vmcoreinfo_append_str("SYMBOL(%s)=%lx\n", #name, (unsigned long)&name)
> > > +#define VMCOREINFO_ARRAY(name) \
> > 
> > Thanks for the patch, I have a similar patch but makedumpfile maintainer
> > is looking at a userspace fix instead.
> 
> Seems we should add lkml to CC next time so that people can watch it.

Yes, agreed.

> 
> > As for the macro name, VMCOREINFO_SYMBOL_ARRAY sounds better.
> 
> I still think using vmcoreinfo_append_str is better. Unless we replace
> all array variables with the newly added macro.
> 
> vmcoreinfo_append_str("SYMBOL(mem_section)=%lx\n",
>                                 (unsigned long)mem_section);

I have no strong opinion: either convert all existing array uses, or just
introduce the macro and use it from now on for any similar array symbols.

> > 
> > > +	vmcoreinfo_append_str("SYMBOL(%s)=%lx\n", #name, (unsigned long)name)
> > >  #define VMCOREINFO_SIZE(name) \
> > >  	vmcoreinfo_append_str("SIZE(%s)=%lu\n", #name, \
> > >  			      (unsigned long)sizeof(name))
> > > diff --git a/kernel/crash_core.c b/kernel/crash_core.c
> > > index b3663896278e..d4122a837477 100644
> > > --- a/kernel/crash_core.c
> > > +++ b/kernel/crash_core.c
> > > @@ -410,7 +410,7 @@ static int __init crash_save_vmcoreinfo_init(void)
> > >  	VMCOREINFO_SYMBOL(contig_page_data);
> > >  #endif
> > >  #ifdef CONFIG_SPARSEMEM
> > > -	VMCOREINFO_SYMBOL(mem_section);
> > > +	VMCOREINFO_ARRAY(mem_section);
> > >  	VMCOREINFO_LENGTH(mem_section, NR_SECTION_ROOTS);
> > >  	VMCOREINFO_STRUCT_SIZE(mem_section);
> > >  	VMCOREINFO_OFFSET(mem_section, section_mem_map);
> > > -- 
> > >  Kirill A. Shutemov
> > 
> > Thanks
> > Dave

Thanks
Dave
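
The two options discussed above (a dedicated array macro versus an open-coded call) can be compared in a small userspace sketch. This is an assumption-laden illustration, not kernel code: `vmcoreinfo_append_str()` here is a simplified stand-in that appends to a local buffer, `mem_section` has a simplified type, and the helper functions are hypothetical. The macro bodies follow the format strings quoted in the patch, with the array variant using the name suggested in the thread.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the kernel's vmcoreinfo buffer; the size is arbitrary. */
static char note_buf[256];
static size_t note_len;

/* Simplified userspace stand-in for the kernel's vmcoreinfo_append_str(). */
static void vmcoreinfo_append_str(const char *fmt, ...)
{
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(note_buf + note_len, sizeof(note_buf) - note_len, fmt, ap);
	va_end(ap);
	if (n > 0 && (size_t)n < sizeof(note_buf) - note_len)
		note_len += (size_t)n;
	else
		note_buf[note_len] = '\0';	/* drop a truncated entry */
}

/* As quoted from crash_core.h: records the address of the symbol. */
#define VMCOREINFO_SYMBOL(name) \
	vmcoreinfo_append_str("SYMBOL(%s)=%lx\n", #name, (unsigned long)&name)
/* The patch's macro under the name suggested in the thread: records the value. */
#define VMCOREINFO_SYMBOL_ARRAY(name) \
	vmcoreinfo_append_str("SYMBOL(%s)=%lx\n", #name, (unsigned long)name)

static unsigned long *mem_section;	/* simplified type, pointer case */

/* Allocate the hypothetical table once; returns 0 on success. */
static int setup(void)
{
	if (!mem_section)
		mem_section = calloc(8, sizeof(*mem_section));
	return mem_section ? 0 : -1;
}

/* Returns 1 when the array macro and the open-coded call emit the same text. */
int array_macro_matches_open_coded(void)
{
	char first[sizeof(note_buf)];

	if (setup())
		return -1;
	note_len = 0;
	VMCOREINFO_SYMBOL_ARRAY(mem_section);
	strcpy(first, note_buf);

	note_len = 0;
	vmcoreinfo_append_str("SYMBOL(mem_section)=%lx\n",
			      (unsigned long)mem_section);
	return strcmp(first, note_buf) == 0;
}

/* Returns 1 when VMCOREINFO_SYMBOL() emits a different address. */
int plain_macro_differs(void)
{
	char plain[sizeof(note_buf)];

	if (setup())
		return -1;
	note_len = 0;
	VMCOREINFO_SYMBOL(mem_section);
	strcpy(plain, note_buf);

	note_len = 0;
	VMCOREINFO_SYMBOL_ARRAY(mem_section);
	return strcmp(plain, note_buf) != 0;
}
```

In this sketch the macro and the open-coded call produce the same `SYMBOL(mem_section)=...` line, so the choice between them is about consistency across call sites rather than about the emitted data.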

^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
  2018-01-09  7:24                 ` Dave Young
  (?)
@ 2018-01-09  9:05                   ` Kirill A. Shutemov
  -1 siblings, 0 replies; 349+ messages in thread
From: Kirill A. Shutemov @ 2018-01-09  9:05 UTC (permalink / raw)
  To: Dave Young
  Cc: Baoquan He, Ingo Molnar, Mike Galbraith, Andrew Morton,
	Kirill A. Shutemov, Greg Kroah-Hartman, linux-kernel, stable,
	Andy Lutomirski, Borislav Petkov, Cyrill Gorcunov,
	Linus Torvalds, Peter Zijlstra, Thomas Gleixner, linux-mm,
	Vivek Goyal, kexec

On Tue, Jan 09, 2018 at 03:24:40PM +0800, Dave Young wrote:
> On 01/09/18 at 01:41pm, Baoquan He wrote:
> > On 01/09/18 at 09:09am, Dave Young wrote:
> > 
> > > As for the macro name, VMCOREINFO_SYMBOL_ARRAY sounds better.

Yep, that's better.

> > I still think using vmcoreinfo_append_str is better. Unless we replace
> > all array variables with the newly added macro.
> > 
> > vmcoreinfo_append_str("SYMBOL(mem_section)=%lx\n",
> >                                 (unsigned long)mem_section);
> 
> I have no strong opinion, either change all array uses or just introduce
> the macro and start to use it from now on if we have similar array
> symbols.

Do you need any action on my side, or will you folks take care of this?

-- 
 Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
  2018-01-09  9:05                   ` Kirill A. Shutemov
  (?)
@ 2018-01-10  3:08                     ` Dave Young
  -1 siblings, 0 replies; 349+ messages in thread
From: Dave Young @ 2018-01-10  3:08 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Baoquan He, Ingo Molnar, Mike Galbraith, Andrew Morton,
	Kirill A. Shutemov, Greg Kroah-Hartman, linux-kernel, stable,
	Andy Lutomirski, Borislav Petkov, Cyrill Gorcunov,
	Linus Torvalds, Peter Zijlstra, Thomas Gleixner, linux-mm,
	Vivek Goyal, kexec

On Tue, Jan 09, 2018 at 12:05:52PM +0300, Kirill A. Shutemov wrote:
> On Tue, Jan 09, 2018 at 03:24:40PM +0800, Dave Young wrote:
> > On 01/09/18 at 01:41pm, Baoquan He wrote:
> > > On 01/09/18 at 09:09am, Dave Young wrote:
> > > 
> > > > As for the macro name, VMCOREINFO_SYMBOL_ARRAY sounds better.
> 
> Yep, that's better.
> 
> > > I still think using vmcoreinfo_append_str is better. Unless we replace
> > > all array variables with the newly added macro.
> > > 
> > > vmcoreinfo_append_str("SYMBOL(mem_section)=%lx\n",
> > >                                 (unsigned long)mem_section);
> > 
> > I have no strong opinion, either change all array uses or just introduce
> > the macro and start to use it from now on if we have similar array
> > symbols.
> 
> Do you need some action on my side or will you folks take care about this?

I think Baoquan was suggesting updating all array users in the current
code; if you can check every VMCOREINFO_SYMBOL and update all the arrays,
he will be happy. But if you cannot do that easily, I'm fine with only the
VMCOREINFO_SYMBOL_ARRAY change for now; we kdump people can do the rest
later as well.

> 
> -- 
>  Kirill A. Shutemov

Thanks
Dave

^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
  2018-01-10  3:08                     ` Dave Young
  (?)
  (?)
@ 2018-01-10 11:16                       ` Kirill A. Shutemov
  -1 siblings, 0 replies; 349+ messages in thread
From: Kirill A. Shutemov @ 2018-01-10 11:16 UTC (permalink / raw)
  To: Dave Young
  Cc: Kirill A. Shutemov, Baoquan He, Ingo Molnar, Mike Galbraith,
	Andrew Morton, Greg Kroah-Hartman, linux-kernel, stable,
	Andy Lutomirski, Borislav Petkov, Cyrill Gorcunov,
	Linus Torvalds, Peter Zijlstra, Thomas Gleixner, linux-mm,
	Vivek Goyal, kexec

On Wed, Jan 10, 2018 at 03:08:04AM +0000, Dave Young wrote:
> On Tue, Jan 09, 2018 at 12:05:52PM +0300, Kirill A. Shutemov wrote:
> > On Tue, Jan 09, 2018 at 03:24:40PM +0800, Dave Young wrote:
> > > On 01/09/18 at 01:41pm, Baoquan He wrote:
> > > > On 01/09/18 at 09:09am, Dave Young wrote:
> > > > 
> > > > > As for the macro name, VMCOREINFO_SYMBOL_ARRAY sounds better.
> > 
> > Yep, that's better.
> > 
> > > > I still think using vmcoreinfo_append_str is better. Unless we replace
> > > > all array variables with the newly added macro.
> > > > 
> > > > vmcoreinfo_append_str("SYMBOL(mem_section)=%lx\n",
> > > >                                 (unsigned long)mem_section);
> > > 
> > > I have no strong opinion, either change all array uses or just introduce
> > > the macro and start to use it from now on if we have similar array
> > > symbols.
> > 
> > Do you need some action on my side or will you folks take care about this?
> 
> I think Baoquan was suggesting to update all array users in current
> code, if you can check every VMCOREINFO_SYMBOL and update all the arrays
> he will be happy. But if can not do it easily I'm fine with a
> VMCOREINFO_SYMBOL_ARRAY changes only now, we kdump people can do it
> later as well. 

It seems it's the only array we have there. swapper_pg_dir is a potential
candidate, but it's 'unsigned long' on arm.

Below is the patch with the corrected macro name.

Please consider applying.

From 70f3a84b97f2de98d1364f7b10b7a42a1d8e9968 Mon Sep 17 00:00:00 2001
From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Date: Tue, 9 Jan 2018 02:55:47 +0300
Subject: [PATCH] kdump: Write a correct address of mem_section into vmcoreinfo

Depending on configuration mem_section can now be an array or a pointer
to an array allocated dynamically. In most cases, we can continue to refer
to it as 'mem_section' regardless of what it is.

But there's one exception: '&mem_section' means "address of the array" if
mem_section is an array, but if mem_section is a pointer, it would mean
"address of the pointer".

We've stepped on this in the kdump code: VMCOREINFO_SYMBOL(mem_section)
writes the address of the pointer into vmcoreinfo, not the address of the
array as we wanted.

Let's introduce VMCOREINFO_SYMBOL_ARRAY(), which handles both cases
correctly.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Fixes: 83e3c48729d9 ("mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y")
---
 include/linux/crash_core.h | 2 ++
 kernel/crash_core.c        | 2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/linux/crash_core.h b/include/linux/crash_core.h
index 06097ef30449..b511f6d24b42 100644
--- a/include/linux/crash_core.h
+++ b/include/linux/crash_core.h
@@ -42,6 +42,8 @@ phys_addr_t paddr_vmcoreinfo_note(void);
 	vmcoreinfo_append_str("PAGESIZE=%ld\n", value)
 #define VMCOREINFO_SYMBOL(name) \
 	vmcoreinfo_append_str("SYMBOL(%s)=%lx\n", #name, (unsigned long)&name)
+#define VMCOREINFO_SYMBOL_ARRAY(name) \
+	vmcoreinfo_append_str("SYMBOL(%s)=%lx\n", #name, (unsigned long)name)
 #define VMCOREINFO_SIZE(name) \
 	vmcoreinfo_append_str("SIZE(%s)=%lu\n", #name, \
 			      (unsigned long)sizeof(name))
diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index b3663896278e..4f63597c824d 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -410,7 +410,7 @@ static int __init crash_save_vmcoreinfo_init(void)
 	VMCOREINFO_SYMBOL(contig_page_data);
 #endif
 #ifdef CONFIG_SPARSEMEM
-	VMCOREINFO_SYMBOL(mem_section);
+	VMCOREINFO_SYMBOL_ARRAY(mem_section);
 	VMCOREINFO_LENGTH(mem_section, NR_SECTION_ROOTS);
 	VMCOREINFO_STRUCT_SIZE(mem_section);
 	VMCOREINFO_OFFSET(mem_section, section_mem_map);
-- 
 Kirill A. Shutemov
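
The array-versus-pointer distinction in the commit message above can be
checked outside the kernel. What follows is only a minimal userspace
sketch, not kernel code: static_sections, dynamic_sections, NR_ROOTS and
both helper functions are made-up names that merely mimic the two shapes
mem_section can take.

```c
#include <stdlib.h>

/* Hypothetical stand-in for struct mem_section. */
struct section {
	unsigned long map;
};

#define NR_ROOTS 4

/* Static case: the symbol itself is the array. */
static struct section static_sections[NR_ROOTS];

/* Dynamic case: the symbol is a pointer to storage allocated at runtime. */
static struct section *dynamic_sections;

/* Returns 1 when "&sym" and "sym" refer to the same address. */
int array_address_matches(void)
{
	return (void *)&static_sections == (void *)static_sections;
}

/*
 * Allocates the table on first use (kept for the life of the program),
 * then reports whether "&sym" and "sym" refer to the same address.
 * Returns -1 if the allocation fails.
 */
int pointer_address_matches(void)
{
	if (!dynamic_sections)
		dynamic_sections = calloc(NR_ROOTS, sizeof(*dynamic_sections));
	if (!dynamic_sections)
		return -1;

	/* &dynamic_sections is where the pointer lives, not where the data is. */
	return (void *)&dynamic_sections == (void *)dynamic_sections;
}
```

array_address_matches() returns 1 and pointer_address_matches() returns 0,
which is why taking '&' of the pointer variant records the wrong address.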

^ permalink raw reply related	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
  2018-01-10 11:16                       ` Kirill A. Shutemov
  (?)
@ 2018-01-11  1:06                         ` Baoquan He
  -1 siblings, 0 replies; 349+ messages in thread
From: Baoquan He @ 2018-01-11  1:06 UTC (permalink / raw)
  To: Kirill A. Shutemov, Dave Young, Ingo Molnar
  Cc: Kirill A. Shutemov, Mike Galbraith, Andrew Morton,
	Greg Kroah-Hartman, linux-kernel, stable, Andy Lutomirski,
	Borislav Petkov, Cyrill Gorcunov, Linus Torvalds, Peter Zijlstra,
	Thomas Gleixner, linux-mm, Vivek Goyal, kexec

On 01/10/18 at 02:16pm, Kirill A. Shutemov wrote:
> On Wed, Jan 10, 2018 at 03:08:04AM +0000, Dave Young wrote:
> > On Tue, Jan 09, 2018 at 12:05:52PM +0300, Kirill A. Shutemov wrote:
> > > On Tue, Jan 09, 2018 at 03:24:40PM +0800, Dave Young wrote:
> > > > On 01/09/18 at 01:41pm, Baoquan He wrote:
> > > > > On 01/09/18 at 09:09am, Dave Young wrote:
> > > > > 
> > > > > > As for the macro name, VMCOREINFO_SYMBOL_ARRAY sounds better.
> > > 
> > > Yep, that's better.
> > > 
> > > > > I still think using vmcoreinfo_append_str is better. Unless we replace
> > > > > all array variables with the newly added macro.
> > > > > 
> > > > > vmcoreinfo_append_str("SYMBOL(mem_section)=%lx\n",
> > > > >                                 (unsigned long)mem_section);
> > > > 
> > > > I have no strong opinion, either change all array uses or just introduce
> > > > the macro and start to use it from now on if we have similar array
> > > > symbols.
> > > 
> > > Do you need some action on my side or will you folks take care about this?
> > 
> > I think Baoquan was suggesting to update all array users in current
> > code, if you can check every VMCOREINFO_SYMBOL and update all the arrays
> > he will be happy. But if can not do it easily I'm fine with a
> > VMCOREINFO_SYMBOL_ARRAY changes only now, we kdump people can do it
> > later as well. 
> 
> It seems it's the only array we have there. swapper_pg_dir is a potential
> candidate, but it's 'unsigned long' on arm.
> 
> Below it patch with corrected macro name.
> 
> Please, consider applying.
> 
> From 70f3a84b97f2de98d1364f7b10b7a42a1d8e9968 Mon Sep 17 00:00:00 2001
> From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> Date: Tue, 9 Jan 2018 02:55:47 +0300
> Subject: [PATCH] kdump: Write a correct address of mem_section into vmcoreinfo
> 
> Depending on configuration mem_section can now be an array or a pointer
> to an array allocated dynamically. In most cases, we can continue to refer
> to it as 'mem_section' regardless of what it is.
> 
> But there's one exception: '&mem_section' means "address of the array" if
> mem_section is an array, but if mem_section is a pointer, it would mean
> "address of the pointer".
> 
> We've stepped onto this in kdump code. VMCOREINFO_SYMBOL(mem_section)
> writes down address of pointer into vmcoreinfo, not array as we wanted.
> 
> Let's introduce VMCOREINFO_SYMBOL_ARRAY() that would handle the
> situation correctly for both cases.
> 
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Fixes: 83e3c48729d9 ("mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y")

Ack it, thanks.

Acked-by: Baoquan He <bhe@redhat.com>

> ---
>  include/linux/crash_core.h | 2 ++
>  kernel/crash_core.c        | 2 +-
>  2 files changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/crash_core.h b/include/linux/crash_core.h
> index 06097ef30449..b511f6d24b42 100644
> --- a/include/linux/crash_core.h
> +++ b/include/linux/crash_core.h
> @@ -42,6 +42,8 @@ phys_addr_t paddr_vmcoreinfo_note(void);
>  	vmcoreinfo_append_str("PAGESIZE=%ld\n", value)
>  #define VMCOREINFO_SYMBOL(name) \
>  	vmcoreinfo_append_str("SYMBOL(%s)=%lx\n", #name, (unsigned long)&name)
> +#define VMCOREINFO_SYMBOL_ARRAY(name) \
> +	vmcoreinfo_append_str("SYMBOL(%s)=%lx\n", #name, (unsigned long)name)
>  #define VMCOREINFO_SIZE(name) \
>  	vmcoreinfo_append_str("SIZE(%s)=%lu\n", #name, \
>  			      (unsigned long)sizeof(name))
> diff --git a/kernel/crash_core.c b/kernel/crash_core.c
> index b3663896278e..4f63597c824d 100644
> --- a/kernel/crash_core.c
> +++ b/kernel/crash_core.c
> @@ -410,7 +410,7 @@ static int __init crash_save_vmcoreinfo_init(void)
>  	VMCOREINFO_SYMBOL(contig_page_data);
>  #endif
>  #ifdef CONFIG_SPARSEMEM
> -	VMCOREINFO_SYMBOL(mem_section);
> +	VMCOREINFO_SYMBOL_ARRAY(mem_section);
>  	VMCOREINFO_LENGTH(mem_section, NR_SECTION_ROOTS);
>  	VMCOREINFO_STRUCT_SIZE(mem_section);
>  	VMCOREINFO_OFFSET(mem_section, section_mem_map);
> -- 
>  Kirill A. Shutemov
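
To see what the two macros in the quoted patch would record for a pointer
symbol, here is a rough userspace sketch. The macro bodies are copied from
the patch; everything else is an assumption made for illustration: the
vmcoreinfo_append_str() below is a simple stub rather than the kernel
function, and note, backing, the emit_* helpers and recorded_address() are
hypothetical names. It also assumes 'unsigned long' is wide enough to hold
a pointer, as it is on Linux.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's vmcoreinfo note buffer. */
static char note[128];
static size_t note_len;

/* Userspace stub: appends formatted text to the note buffer. */
static void vmcoreinfo_append_str(const char *fmt, ...)
{
	va_list args;
	int n;

	va_start(args, fmt);
	n = vsnprintf(note + note_len, sizeof(note) - note_len, fmt, args);
	va_end(args);

	if (n > 0 && (size_t)n < sizeof(note) - note_len)
		note_len += (size_t)n;	/* the whole string fit */
	else
		note[note_len] = '\0';	/* drop a truncated or failed append */
}

/* The two macros as they appear in the patch. */
#define VMCOREINFO_SYMBOL(name) \
	vmcoreinfo_append_str("SYMBOL(%s)=%lx\n", #name, (unsigned long)&name)
#define VMCOREINFO_SYMBOL_ARRAY(name) \
	vmcoreinfo_append_str("SYMBOL(%s)=%lx\n", #name, (unsigned long)name)

/* Made-up backing storage; mem_section is a pointer, as in the dynamic case. */
static unsigned long backing[8];
static unsigned long *mem_section = backing;

/* Emits one line with the old macro and returns the buffer. */
const char *emit_with_symbol(void)
{
	note_len = 0;
	note[0] = '\0';
	VMCOREINFO_SYMBOL(mem_section);
	return note;
}

/* Emits one line with the new macro and returns the buffer. */
const char *emit_with_symbol_array(void)
{
	note_len = 0;
	note[0] = '\0';
	VMCOREINFO_SYMBOL_ARRAY(mem_section);
	return note;
}

/* Parses the hexadecimal address after '=' in a "SYMBOL(x)=addr" line. */
unsigned long recorded_address(const char *line)
{
	const char *eq = strchr(line, '=');

	return eq ? strtoul(eq + 1, NULL, 16) : 0;
}
```

In this sketch the line from emit_with_symbol() carries the address of the
pointer variable, while the line from emit_with_symbol_array() carries the
address of the storage it points to. Both lines start with
"SYMBOL(mem_section)=" because '#name' stringifies the macro argument.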

^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
@ 2018-01-11  1:06                         ` Baoquan He
  0 siblings, 0 replies; 349+ messages in thread
From: Baoquan He @ 2018-01-11  1:06 UTC (permalink / raw)
  To: Kirill A. Shutemov, Dave Young, Ingo Molnar
  Cc: Peter Zijlstra, Greg Kroah-Hartman, Mike Galbraith, kexec,
	linux-kernel, stable, Andy Lutomirski, linux-mm, Cyrill Gorcunov,
	Kirill A. Shutemov, Andrew Morton, Borislav Petkov,
	Linus Torvalds, Thomas Gleixner, Vivek Goyal

On 01/10/18 at 02:16pm, Kirill A. Shutemov wrote:
> On Wed, Jan 10, 2018 at 03:08:04AM +0000, Dave Young wrote:
> > On Tue, Jan 09, 2018 at 12:05:52PM +0300, Kirill A. Shutemov wrote:
> > > On Tue, Jan 09, 2018 at 03:24:40PM +0800, Dave Young wrote:
> > > > On 01/09/18 at 01:41pm, Baoquan He wrote:
> > > > > On 01/09/18 at 09:09am, Dave Young wrote:
> > > > > 
> > > > > > As for the macro name, VMCOREINFO_SYMBOL_ARRAY sounds better.
> > > 
> > > Yep, that's better.
> > > 
> > > > > I still think using vmcoreinfo_append_str is better. Unless we replace
> > > > > all array variables with the newly added macro.
> > > > > 
> > > > > vmcoreinfo_append_str("SYMBOL(mem_section)=%lx\n",
> > > > >                                 (unsigned long)mem_section);
> > > > 
> > > > I have no strong opinion, either change all array uses or just introduce
> > > > the macro and start to use it from now on if we have similar array
> > > > symbols.
> > > 
> > > Do you need some action on my side or will you folks take care of this?
> > 
> > I think Baoquan was suggesting updating all array users in the current
> > code. If you can check every VMCOREINFO_SYMBOL and update all the arrays,
> > he will be happy. But if you cannot do it easily, I'm fine with only the
> > VMCOREINFO_SYMBOL_ARRAY change for now; we kdump people can do the rest
> > later as well.
> 
> It seems it's the only array we have there. swapper_pg_dir is a potential
> candidate, but it's 'unsigned long' on arm.
> 
> Below is the patch with the corrected macro name.
> 
> Please, consider applying.
> 
> From 70f3a84b97f2de98d1364f7b10b7a42a1d8e9968 Mon Sep 17 00:00:00 2001
> From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> Date: Tue, 9 Jan 2018 02:55:47 +0300
> Subject: [PATCH] kdump: Write a correct address of mem_section into vmcoreinfo
> 
> Depending on configuration mem_section can now be an array or a pointer
> to an array allocated dynamically. In most cases, we can continue to refer
> to it as 'mem_section' regardless of what it is.
> 
> But there's one exception: '&mem_section' means "address of the array" if
> mem_section is an array, but if mem_section is a pointer, it would mean
> "address of the pointer".
> 
> We've stepped onto this in kdump code. VMCOREINFO_SYMBOL(mem_section)
> writes down address of pointer into vmcoreinfo, not array as we wanted.
> 
> Let's introduce VMCOREINFO_SYMBOL_ARRAY() that would handle the
> situation correctly for both cases.
> 
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Fixes: 83e3c48729d9 ("mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y")

Ack it, thanks.

Acked-by: Baoquan He <bhe@redhat.com>

> ---
>  include/linux/crash_core.h | 2 ++
>  kernel/crash_core.c        | 2 +-
>  2 files changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/crash_core.h b/include/linux/crash_core.h
> index 06097ef30449..b511f6d24b42 100644
> --- a/include/linux/crash_core.h
> +++ b/include/linux/crash_core.h
> @@ -42,6 +42,8 @@ phys_addr_t paddr_vmcoreinfo_note(void);
>  	vmcoreinfo_append_str("PAGESIZE=%ld\n", value)
>  #define VMCOREINFO_SYMBOL(name) \
>  	vmcoreinfo_append_str("SYMBOL(%s)=%lx\n", #name, (unsigned long)&name)
> +#define VMCOREINFO_SYMBOL_ARRAY(name) \
> +	vmcoreinfo_append_str("SYMBOL(%s)=%lx\n", #name, (unsigned long)name)
>  #define VMCOREINFO_SIZE(name) \
>  	vmcoreinfo_append_str("SIZE(%s)=%lu\n", #name, \
>  			      (unsigned long)sizeof(name))
> diff --git a/kernel/crash_core.c b/kernel/crash_core.c
> index b3663896278e..4f63597c824d 100644
> --- a/kernel/crash_core.c
> +++ b/kernel/crash_core.c
> @@ -410,7 +410,7 @@ static int __init crash_save_vmcoreinfo_init(void)
>  	VMCOREINFO_SYMBOL(contig_page_data);
>  #endif
>  #ifdef CONFIG_SPARSEMEM
> -	VMCOREINFO_SYMBOL(mem_section);
> +	VMCOREINFO_SYMBOL_ARRAY(mem_section);
>  	VMCOREINFO_LENGTH(mem_section, NR_SECTION_ROOTS);
>  	VMCOREINFO_STRUCT_SIZE(mem_section);
>  	VMCOREINFO_OFFSET(mem_section, section_mem_map);
> -- 
>  Kirill A. Shutemov
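
The array-versus-pointer distinction described in the commit message
can be reproduced outside the kernel. The following is a minimal
standalone sketch, not kernel code: struct section, static_roots,
dynamic_roots, NR_ROOTS, SYMBOL_ADDR and SYMBOL_ARRAY_ADDR are all
hypothetical names chosen for illustration. Only the shape of the two
macros follows VMCOREINFO_SYMBOL() and VMCOREINFO_SYMBOL_ARRAY() from
the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's struct mem_section. */
struct section {
	unsigned long section_mem_map;
};

#define NR_ROOTS 4	/* illustrative value, not NR_SECTION_ROOTS */

/* Case 1: a static array of root pointers. */
struct section *static_roots[NR_ROOTS];

/* Case 2: a pointer to an array of root pointers allocated at runtime. */
struct section **dynamic_roots;

/* Same shape as VMCOREINFO_SYMBOL(): takes the address of the object. */
#define SYMBOL_ADDR(name)	((uintptr_t)&name)
/* Same shape as VMCOREINFO_SYMBOL_ARRAY(): uses the value as-is. */
#define SYMBOL_ARRAY_ADDR(name)	((uintptr_t)name)

/* Allocate the runtime array; returns 0 on success, -1 on failure. */
int alloc_dynamic_roots(void)
{
	dynamic_roots = calloc(NR_ROOTS, sizeof(*dynamic_roots));
	return dynamic_roots ? 0 : -1;
}

void demo(void)
{
	assert(alloc_dynamic_roots() == 0);

	/* For an array, both forms yield the address of element 0. */
	assert(SYMBOL_ADDR(static_roots) == SYMBOL_ARRAY_ADDR(static_roots));

	/*
	 * For a pointer, '&' yields where the pointer variable itself
	 * lives, not the address of the array it points to.
	 */
	assert(SYMBOL_ADDR(dynamic_roots) != SYMBOL_ARRAY_ADDR(dynamic_roots));
	assert(SYMBOL_ARRAY_ADDR(dynamic_roots) ==
	       (uintptr_t)&dynamic_roots[0]);
}
```

Calling demo() passes all four assertions: the value form gives the
array address in both configurations, which is why the patch can use
one macro for either definition of mem_section.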

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
  2018-01-10 11:16                       ` Kirill A. Shutemov
  (?)
@ 2018-01-12  0:55                         ` Dave Young
  -1 siblings, 0 replies; 349+ messages in thread
From: Dave Young @ 2018-01-12  0:55 UTC (permalink / raw)
  To: Kirill A. Shutemov, Andrew Morton
  Cc: Kirill A. Shutemov, Baoquan He, Ingo Molnar, Mike Galbraith,
	Greg Kroah-Hartman, linux-kernel, stable, Andy Lutomirski,
	Borislav Petkov, Cyrill Gorcunov, Linus Torvalds, Peter Zijlstra,
	Thomas Gleixner, linux-mm, Vivek Goyal, kexec

On 01/10/18 at 02:16pm, Kirill A. Shutemov wrote:
> On Wed, Jan 10, 2018 at 03:08:04AM +0000, Dave Young wrote:
> > On Tue, Jan 09, 2018 at 12:05:52PM +0300, Kirill A. Shutemov wrote:
> > > On Tue, Jan 09, 2018 at 03:24:40PM +0800, Dave Young wrote:
> > > > On 01/09/18 at 01:41pm, Baoquan He wrote:
> > > > > On 01/09/18 at 09:09am, Dave Young wrote:
> > > > > 
> > > > > > As for the macro name, VMCOREINFO_SYMBOL_ARRAY sounds better.
> > > 
> > > Yep, that's better.
> > > 
> > > > > I still think using vmcoreinfo_append_str is better. Unless we replace
> > > > > all array variables with the newly added macro.
> > > > > 
> > > > > vmcoreinfo_append_str("SYMBOL(mem_section)=%lx\n",
> > > > >                                 (unsigned long)mem_section);
> > > > 
> > > > I have no strong opinion, either change all array uses or just introduce
> > > > the macro and start to use it from now on if we have similar array
> > > > symbols.
> > > 
> > > Do you need some action on my side or will you folks take care of this?
> > 
> > I think Baoquan was suggesting updating all array users in the current
> > code. If you can check every VMCOREINFO_SYMBOL and update all the arrays,
> > he will be happy. But if you cannot do it easily, I'm fine with only the
> > VMCOREINFO_SYMBOL_ARRAY change for now; we kdump people can do the rest
> > later as well.
> 
> It seems it's the only array we have there. swapper_pg_dir is a potential
> candidate, but it's 'unsigned long' on arm.
> 
> Below is the patch with the corrected macro name.
> 
> Please, consider applying.
> 
> From 70f3a84b97f2de98d1364f7b10b7a42a1d8e9968 Mon Sep 17 00:00:00 2001
> From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> Date: Tue, 9 Jan 2018 02:55:47 +0300
> Subject: [PATCH] kdump: Write a correct address of mem_section into vmcoreinfo
> 
> Depending on configuration mem_section can now be an array or a pointer
> to an array allocated dynamically. In most cases, we can continue to refer
> to it as 'mem_section' regardless of what it is.
> 
> But there's one exception: '&mem_section' means "address of the array" if
> mem_section is an array, but if mem_section is a pointer, it would mean
> "address of the pointer".
> 
> We've stepped onto this in kdump code. VMCOREINFO_SYMBOL(mem_section)
> writes down address of pointer into vmcoreinfo, not array as we wanted.
> 
> Let's introduce VMCOREINFO_SYMBOL_ARRAY() that would handle the
> situation correctly for both cases.
> 
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Fixes: 83e3c48729d9 ("mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y")
> ---
>  include/linux/crash_core.h | 2 ++
>  kernel/crash_core.c        | 2 +-
>  2 files changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/crash_core.h b/include/linux/crash_core.h
> index 06097ef30449..b511f6d24b42 100644
> --- a/include/linux/crash_core.h
> +++ b/include/linux/crash_core.h
> @@ -42,6 +42,8 @@ phys_addr_t paddr_vmcoreinfo_note(void);
>  	vmcoreinfo_append_str("PAGESIZE=%ld\n", value)
>  #define VMCOREINFO_SYMBOL(name) \
>  	vmcoreinfo_append_str("SYMBOL(%s)=%lx\n", #name, (unsigned long)&name)
> +#define VMCOREINFO_SYMBOL_ARRAY(name) \
> +	vmcoreinfo_append_str("SYMBOL(%s)=%lx\n", #name, (unsigned long)name)
>  #define VMCOREINFO_SIZE(name) \
>  	vmcoreinfo_append_str("SIZE(%s)=%lu\n", #name, \
>  			      (unsigned long)sizeof(name))
> diff --git a/kernel/crash_core.c b/kernel/crash_core.c
> index b3663896278e..4f63597c824d 100644
> --- a/kernel/crash_core.c
> +++ b/kernel/crash_core.c
> @@ -410,7 +410,7 @@ static int __init crash_save_vmcoreinfo_init(void)
>  	VMCOREINFO_SYMBOL(contig_page_data);
>  #endif
>  #ifdef CONFIG_SPARSEMEM
> -	VMCOREINFO_SYMBOL(mem_section);
> +	VMCOREINFO_SYMBOL_ARRAY(mem_section);
>  	VMCOREINFO_LENGTH(mem_section, NR_SECTION_ROOTS);
>  	VMCOREINFO_STRUCT_SIZE(mem_section);
>  	VMCOREINFO_OFFSET(mem_section, section_mem_map);
> -- 
>  Kirill A. Shutemov


Acked-by: Dave Young <dyoung@redhat.com>

If the stable kernel took the mem_section commits, then this fix should
also be Cc'd to stable.  Andrew, can you help get this into 4.15?

Thanks
Dave

^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
  2018-01-12  0:55                         ` Dave Young
  (?)
  (?)
@ 2018-01-15  5:57                           ` Omar Sandoval
  -1 siblings, 0 replies; 349+ messages in thread
From: Omar Sandoval @ 2018-01-15  5:57 UTC (permalink / raw)
  To: Dave Young
  Cc: Kirill A. Shutemov, Andrew Morton, Kirill A. Shutemov,
	Baoquan He, Ingo Molnar, Mike Galbraith, Greg Kroah-Hartman,
	linux-kernel, stable, Andy Lutomirski, Borislav Petkov,
	Cyrill Gorcunov, Linus Torvalds, Peter Zijlstra, Thomas Gleixner,
	linux-mm, Vivek Goyal, kexec

On Fri, Jan 12, 2018 at 08:55:49AM +0800, Dave Young wrote:
> On 01/10/18 at 02:16pm, Kirill A. Shutemov wrote:
> > On Wed, Jan 10, 2018 at 03:08:04AM +0000, Dave Young wrote:
> > > On Tue, Jan 09, 2018 at 12:05:52PM +0300, Kirill A. Shutemov wrote:
> > > > On Tue, Jan 09, 2018 at 03:24:40PM +0800, Dave Young wrote:
> > > > > On 01/09/18 at 01:41pm, Baoquan He wrote:
> > > > > > On 01/09/18 at 09:09am, Dave Young wrote:
> > > > > > 
> > > > > > > As for the macro name, VMCOREINFO_SYMBOL_ARRAY sounds better.
> > > > 
> > > > Yep, that's better.
> > > > 
> > > > > > I still think using vmcoreinfo_append_str is better. Unless we replace
> > > > > > all array variables with the newly added macro.
> > > > > > 
> > > > > > vmcoreinfo_append_str("SYMBOL(mem_section)=%lx\n",
> > > > > >                                 (unsigned long)mem_section);
> > > > > 
> > > > > I have no strong opinion, either change all array uses or just introduce
> > > > > the macro and start to use it from now on if we have similar array
> > > > > symbols.
> > > > 
> > > > Do you need some action on my side or will you folks take care of this?
> > > 
> > > I think Baoquan was suggesting updating all array users in the current
> > > code. If you can check every VMCOREINFO_SYMBOL and update all the arrays,
> > > he will be happy. But if you cannot do it easily, I'm fine with only the
> > > VMCOREINFO_SYMBOL_ARRAY change for now; we kdump people can do the rest
> > > later as well.
> > 
> > It seems it's the only array we have there. swapper_pg_dir is a potential
> > candidate, but it's 'unsigned long' on arm.
> > 
> > Below is the patch with the corrected macro name.
> > 
> > Please, consider applying.
> > 
> > From 70f3a84b97f2de98d1364f7b10b7a42a1d8e9968 Mon Sep 17 00:00:00 2001
> > From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> > Date: Tue, 9 Jan 2018 02:55:47 +0300
> > Subject: [PATCH] kdump: Write a correct address of mem_section into vmcoreinfo
> > 
> > Depending on configuration mem_section can now be an array or a pointer
> > to an array allocated dynamically. In most cases, we can continue to refer
> > to it as 'mem_section' regardless of what it is.
> > 
> > But there's one exception: '&mem_section' means "address of the array" if
> > mem_section is an array, but if mem_section is a pointer, it would mean
> > "address of the pointer".
> > 
> > We've stepped onto this in kdump code. VMCOREINFO_SYMBOL(mem_section)
> > writes down address of pointer into vmcoreinfo, not array as we wanted.
> > 
> > Let's introduce VMCOREINFO_SYMBOL_ARRAY() that would handle the
> > situation correctly for both cases.
> > 
> > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > Fixes: 83e3c48729d9 ("mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y")
> > ---
> >  include/linux/crash_core.h | 2 ++
> >  kernel/crash_core.c        | 2 +-
> >  2 files changed, 3 insertions(+), 1 deletion(-)
> > 
> > diff --git a/include/linux/crash_core.h b/include/linux/crash_core.h
> > index 06097ef30449..b511f6d24b42 100644
> > --- a/include/linux/crash_core.h
> > +++ b/include/linux/crash_core.h
> > @@ -42,6 +42,8 @@ phys_addr_t paddr_vmcoreinfo_note(void);
> >  	vmcoreinfo_append_str("PAGESIZE=%ld\n", value)
> >  #define VMCOREINFO_SYMBOL(name) \
> >  	vmcoreinfo_append_str("SYMBOL(%s)=%lx\n", #name, (unsigned long)&name)
> > +#define VMCOREINFO_SYMBOL_ARRAY(name) \
> > +	vmcoreinfo_append_str("SYMBOL(%s)=%lx\n", #name, (unsigned long)name)
> >  #define VMCOREINFO_SIZE(name) \
> >  	vmcoreinfo_append_str("SIZE(%s)=%lu\n", #name, \
> >  			      (unsigned long)sizeof(name))
> > diff --git a/kernel/crash_core.c b/kernel/crash_core.c
> > index b3663896278e..4f63597c824d 100644
> > --- a/kernel/crash_core.c
> > +++ b/kernel/crash_core.c
> > @@ -410,7 +410,7 @@ static int __init crash_save_vmcoreinfo_init(void)
> >  	VMCOREINFO_SYMBOL(contig_page_data);
> >  #endif
> >  #ifdef CONFIG_SPARSEMEM
> > -	VMCOREINFO_SYMBOL(mem_section);
> > +	VMCOREINFO_SYMBOL_ARRAY(mem_section);
> >  	VMCOREINFO_LENGTH(mem_section, NR_SECTION_ROOTS);
> >  	VMCOREINFO_STRUCT_SIZE(mem_section);
> >  	VMCOREINFO_OFFSET(mem_section, section_mem_map);
> > -- 
> >  Kirill A. Shutemov
> 
> 
> Acked-by: Dave Young <dyoung@redhat.com>
> 
> If the stable kernel took the mem_section commits, then this fix should
> also be Cc'd to stable.  Andrew, can you help get this into 4.15?
> 
> Thanks
> Dave

Hm, this fix means that the vmlinux symbol table and vmcoreinfo have
different values for mem_section. That seems... odd. I had to patch
makedumpfile to fix the case of an explicit vmlinux being passed on the
command line (which I realized I don't need to do, but it should still
work):

From 542a11a8f28b0f0a989abc3adff89da22f44c719 Mon Sep 17 00:00:00 2001
Message-Id: <542a11a8f28b0f0a989abc3adff89da22f44c719.1515995400.git.osandov@fb.com>
From: Omar Sandoval <osandov@fb.com>
Date: Sun, 14 Jan 2018 17:10:30 -0800
Subject: [PATCH] Fix SPARSEMEM_EXTREME support on Linux v4.15 when passing
 vmlinux

Since kernel commit 83e3c48729d9 ("mm/sparsemem: Allocate mem_section at
runtime for CONFIG_SPARSEMEM_EXTREME=y"), mem_section is a dynamically
allocated array of pointers to mem_section instead of a static one
(i.e., struct mem_section ** instead of struct mem_section * []). This
adds an extra layer of indirection that breaks makedumpfile, which will
end up with a bunch of bogus mem_maps.

Since kernel commit a0b1280368d1 ("kdump: write correct address of
mem_section into vmcoreinfo"), the mem_section symbol in vmcoreinfo
contains the address of the actual struct mem_section * array instead of
the address of the pointer in .bss, which gets rid of the extra
indirection. However, makedumpfile still uses the debugging symbol from
the vmlinux image. Fix this by allowing symbols from the vmcore to
override symbols from the vmlinux image. As the comment in initial()
says, "vmcoreinfo in /proc/vmcore is more reliable than -x/-i option".

Signed-off-by: Omar Sandoval <osandov@fb.com>
---
 makedumpfile.h | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/makedumpfile.h b/makedumpfile.h
index 57cf4d9..d68c798 100644
--- a/makedumpfile.h
+++ b/makedumpfile.h
@@ -274,8 +274,10 @@ do { \
 } while (0)
 #define READ_SYMBOL(str_symbol, symbol) \
 do { \
-	if (SYMBOL(symbol) == NOT_FOUND_SYMBOL) { \
-		SYMBOL(symbol) = read_vmcoreinfo_symbol(STR_SYMBOL(str_symbol)); \
+	unsigned long _tmp_symbol; \
+	_tmp_symbol = read_vmcoreinfo_symbol(STR_SYMBOL(str_symbol)); \
+	if (_tmp_symbol != NOT_FOUND_SYMBOL) { \
+		SYMBOL(symbol) = _tmp_symbol; \
 		if (SYMBOL(symbol) == INVALID_SYMBOL_DATA) \
 			return FALSE; \
 	} \
-- 
2.9.5
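
The change in lookup precedence made by the READ_SYMBOL() hunk above
can be sketched as two small functions. This is a simplified model
under stated assumptions, not makedumpfile code: resolve_old() and
resolve_new() are hypothetical names, the sentinel value given to
NOT_FOUND_SYMBOL here is an assumption, and the INVALID_SYMBOL_DATA
early return from the real macro is left out.

```c
#include <assert.h>

/* Assumed sentinel for "symbol not present"; illustrative only. */
#define NOT_FOUND_SYMBOL 0UL

/*
 * Old behaviour: the vmcoreinfo value is consulted only when the
 * vmlinux image did not supply the symbol.
 */
unsigned long resolve_old(unsigned long from_vmlinux,
			  unsigned long from_vmcoreinfo)
{
	if (from_vmlinux == NOT_FOUND_SYMBOL)
		return from_vmcoreinfo;
	return from_vmlinux;
}

/*
 * New behaviour: a value found in vmcoreinfo overrides whatever the
 * vmlinux image supplied; vmlinux is only the fallback.
 */
unsigned long resolve_new(unsigned long from_vmlinux,
			  unsigned long from_vmcoreinfo)
{
	if (from_vmcoreinfo != NOT_FOUND_SYMBOL)
		return from_vmcoreinfo;
	return from_vmlinux;
}

void demo(void)
{
	/* Both sources present: the old logic keeps the vmlinux value. */
	assert(resolve_old(0x1000UL, 0x2000UL) == 0x1000UL);
	/* Both sources present: the new logic prefers vmcoreinfo. */
	assert(resolve_new(0x1000UL, 0x2000UL) == 0x2000UL);
	/* vmcoreinfo lacks the symbol: fall back to vmlinux. */
	assert(resolve_new(0x1000UL, NOT_FOUND_SYMBOL) == 0x1000UL);
}
```

With mem_section, the two sources disagree after the kernel fix, so
only the second ordering picks up the array address recorded in
vmcoreinfo.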

^ permalink raw reply related	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
@ 2018-01-15  5:57                           ` Omar Sandoval
  0 siblings, 0 replies; 349+ messages in thread
From: Omar Sandoval @ 2018-01-15  5:57 UTC (permalink / raw)
  To: Dave Young
  Cc: Kirill A. Shutemov, Andrew Morton, Kirill A. Shutemov,
	Baoquan He, Ingo Molnar, Mike Galbraith, Greg Kroah-Hartman,
	linux-kernel, stable, Andy Lutomirski, Borislav Petkov,
	Cyrill Gorcunov, Linus Torvalds, Peter Zijlstra, Thomas Gleixner,
	linux-mm, Vivek Goyal, kexec

On Fri, Jan 12, 2018 at 08:55:49AM +0800, Dave Young wrote:
> On 01/10/18 at 02:16pm, Kirill A. Shutemov wrote:
> > On Wed, Jan 10, 2018 at 03:08:04AM +0000, Dave Young wrote:
> > > On Tue, Jan 09, 2018 at 12:05:52PM +0300, Kirill A. Shutemov wrote:
> > > > On Tue, Jan 09, 2018 at 03:24:40PM +0800, Dave Young wrote:
> > > > > On 01/09/18 at 01:41pm, Baoquan He wrote:
> > > > > > On 01/09/18 at 09:09am, Dave Young wrote:
> > > > > > 
> > > > > > > As for the macro name, VMCOREINFO_SYMBOL_ARRAY sounds better.
> > > > 
> > > > Yep, that's better.
> > > > 
> > > > > > I still think using vmcoreinfo_append_str is better. Unless we replace
> > > > > > all array variables with the newly added macro.
> > > > > > 
> > > > > > vmcoreinfo_append_str("SYMBOL(mem_section)=%lx\n",
> > > > > >                                 (unsigned long)mem_section);
> > > > > 
> > > > > I have no strong opinion, either change all array uses or just introduce
> > > > > the macro and start to use it from now on if we have similar array
> > > > > symbols.
> > > > 
> > > > Do you need some action on my side or will you folks take care about this?
> > > 
> > > I think Baoquan was suggesting to update all array users in current
> > > code, if you can check every VMCOREINFO_SYMBOL and update all the arrays
> > > he will be happy. But if can not do it easily I'm fine with a
> > > VMCOREINFO_SYMBOL_ARRAY changes only now, we kdump people can do it
> > > later as well. 
> > 
> > It seems it's the only array we have there. swapper_pg_dir is a potential
> > candidate, but it's 'unsigned long' on arm.
> > 
> > Below it patch with corrected macro name.
> > 
> > Please, consider applying.
> > 
> > From 70f3a84b97f2de98d1364f7b10b7a42a1d8e9968 Mon Sep 17 00:00:00 2001
> > From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> > Date: Tue, 9 Jan 2018 02:55:47 +0300
> > Subject: [PATCH] kdump: Write a correct address of mem_section into vmcoreinfo
> > 
> > Depending on configuration mem_section can now be an array or a pointer
> > to an array allocated dynamically. In most cases, we can continue to refer
> > to it as 'mem_section' regardless of what it is.
> > 
> > But there's one exception: '&mem_section' means "address of the array" if
> > mem_section is an array, but if mem_section is a pointer, it would mean
> > "address of the pointer".
> > 
> > We've stepped onto this in kdump code. VMCOREINFO_SYMBOL(mem_section)
> > writes down address of pointer into vmcoreinfo, not array as we wanted.
> > 
> > Let's introduce VMCOREINFO_SYMBOL_ARRAY() that would handle the
> > situation correctly for both cases.
> > 
> > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > Fixes: 83e3c48729d9 ("mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y")
> > ---
> >  include/linux/crash_core.h | 2 ++
> >  kernel/crash_core.c        | 2 +-
> >  2 files changed, 3 insertions(+), 1 deletion(-)
> > 
> > diff --git a/include/linux/crash_core.h b/include/linux/crash_core.h
> > index 06097ef30449..b511f6d24b42 100644
> > --- a/include/linux/crash_core.h
> > +++ b/include/linux/crash_core.h
> > @@ -42,6 +42,8 @@ phys_addr_t paddr_vmcoreinfo_note(void);
> >  	vmcoreinfo_append_str("PAGESIZE=%ld\n", value)
> >  #define VMCOREINFO_SYMBOL(name) \
> >  	vmcoreinfo_append_str("SYMBOL(%s)=%lx\n", #name, (unsigned long)&name)
> > +#define VMCOREINFO_SYMBOL_ARRAY(name) \
> > +	vmcoreinfo_append_str("SYMBOL(%s)=%lx\n", #name, (unsigned long)name)
> >  #define VMCOREINFO_SIZE(name) \
> >  	vmcoreinfo_append_str("SIZE(%s)=%lu\n", #name, \
> >  			      (unsigned long)sizeof(name))
> > diff --git a/kernel/crash_core.c b/kernel/crash_core.c
> > index b3663896278e..4f63597c824d 100644
> > --- a/kernel/crash_core.c
> > +++ b/kernel/crash_core.c
> > @@ -410,7 +410,7 @@ static int __init crash_save_vmcoreinfo_init(void)
> >  	VMCOREINFO_SYMBOL(contig_page_data);
> >  #endif
> >  #ifdef CONFIG_SPARSEMEM
> > -	VMCOREINFO_SYMBOL(mem_section);
> > +	VMCOREINFO_SYMBOL_ARRAY(mem_section);
> >  	VMCOREINFO_LENGTH(mem_section, NR_SECTION_ROOTS);
> >  	VMCOREINFO_STRUCT_SIZE(mem_section);
> >  	VMCOREINFO_OFFSET(mem_section, section_mem_map);
> > -- 
> >  Kirill A. Shutemov
> 
> 
> Acked-by: Dave Young <dyoung@redhat.com>
> 
> If stable kernel took the mem section commits, then should also cc
> stable.  Andrew, can you help to make this in 4.15?
> 
> Thanks
> Dave

Hm, this fix means that the vmlinux symbol table and vmcoreinfo have
different values for mem_section. That seems... odd. I had to patch
makedumpfile to fix the case of an explicit vmlinux being passed on the
command line (which I realized I don't need to do, but it should still
work):

>From 542a11a8f28b0f0a989abc3adff89da22f44c719 Mon Sep 17 00:00:00 2001
Message-Id: <542a11a8f28b0f0a989abc3adff89da22f44c719.1515995400.git.osandov@fb.com>
From: Omar Sandoval <osandov@fb.com>
Date: Sun, 14 Jan 2018 17:10:30 -0800
Subject: [PATCH] Fix SPARSEMEM_EXTREME support on Linux v4.15 when passing
 vmlinux

Since kernel commit 83e3c48729d9 ("mm/sparsemem: Allocate mem_section at
runtime for CONFIG_SPARSEMEM_EXTREME=y"), mem_section is a dynamically
allocated array of pointers to mem_section instead of a static one
(i.e., struct mem_section ** instead of struct mem_section * []). This
adds an extra layer of indirection that breaks makedumpfile, which will
end up with a bunch of bogus mem_maps.

Since kernel commit a0b1280368d1 ("kdump: write correct address of
mem_section into vmcoreinfo"), the mem_section symbol in vmcoreinfo
contains the address of the actual struct mem_section * array instead of
the address of the pointer in .bss, which gets rid of the extra
indirection. However, makedumpfile still uses the debugging symbol from
the vmlinux image. Fix this by allowing symbols from the vmcore to
override symbols from the vmlinux image. As the comment in initial()
says, "vmcoreinfo in /proc/vmcore is more reliable than -x/-i option".

Signed-off-by: Omar Sandoval <osandov@fb.com>
---
 makedumpfile.h | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/makedumpfile.h b/makedumpfile.h
index 57cf4d9..d68c798 100644
--- a/makedumpfile.h
+++ b/makedumpfile.h
@@ -274,8 +274,10 @@ do { \
 } while (0)
 #define READ_SYMBOL(str_symbol, symbol) \
 do { \
-	if (SYMBOL(symbol) == NOT_FOUND_SYMBOL) { \
-		SYMBOL(symbol) = read_vmcoreinfo_symbol(STR_SYMBOL(str_symbol)); \
+	unsigned long _tmp_symbol; \
+	_tmp_symbol = read_vmcoreinfo_symbol(STR_SYMBOL(str_symbol)); \
+	if (_tmp_symbol != NOT_FOUND_SYMBOL) { \
+		SYMBOL(symbol) = _tmp_symbol; \
 		if (SYMBOL(symbol) == INVALID_SYMBOL_DATA) \
 			return FALSE; \
 	} \
-- 
2.9.5
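
The effect of the READ_SYMBOL change above can be sketched outside
makedumpfile. This is an illustration only: the two helper functions are
hypothetical stand-ins for the macro before and after the patch, and the
numeric values chosen for NOT_FOUND_SYMBOL and INVALID_SYMBOL_DATA are
assumptions, not copied from makedumpfile.h.

```c
#include <stdio.h>

/* Sentinel names follow the patch; the numeric values are assumptions. */
#define NOT_FOUND_SYMBOL    0UL
#define INVALID_SYMBOL_DATA (~0UL)

/*
 * Old behaviour (hypothetical stand-in for the macro before the patch):
 * vmcoreinfo is consulted only when vmlinux did not provide the symbol.
 * Returns 1 on success, 0 where the macro would "return FALSE".
 */
static int read_symbol_old(unsigned long *sym, unsigned long from_vmcoreinfo)
{
	if (*sym == NOT_FOUND_SYMBOL) {
		*sym = from_vmcoreinfo;
		if (*sym == INVALID_SYMBOL_DATA)
			return 0;
	}
	return 1;
}

/*
 * New behaviour (hypothetical stand-in for the macro after the patch):
 * a value found in vmcoreinfo overrides whatever vmlinux provided.
 */
static int read_symbol_new(unsigned long *sym, unsigned long from_vmcoreinfo)
{
	unsigned long tmp = from_vmcoreinfo;

	if (tmp != NOT_FOUND_SYMBOL) {
		*sym = tmp;
		if (*sym == INVALID_SYMBOL_DATA)
			return 0;
	}
	return 1;
}
```

With a vmlinux value already in place, read_symbol_old() leaves it
untouched, while read_symbol_new() replaces it with the vmcoreinfo value
whenever vmcoreinfo has one.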

^ permalink raw reply related	[flat|nested] 349+ messages in thread

* RE: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
  2018-01-15  5:57                           ` Omar Sandoval
  (?)
  (?)
@ 2018-01-16  8:36                             ` Atsushi Kumagai
  -1 siblings, 0 replies; 349+ messages in thread
From: Atsushi Kumagai @ 2018-01-16  8:36 UTC (permalink / raw)
  To: Omar Sandoval, Dave Young
  Cc: Baoquan He, Peter Zijlstra, Greg Kroah-Hartman, Mike Galbraith,
	kexec, linux-kernel, stable, Andy Lutomirski, linux-mm,
	Thomas Gleixner, Vivek Goyal, Cyrill Gorcunov,
	Kirill A. Shutemov, Andrew Morton, Borislav Petkov,
	Linus Torvalds, Ingo Molnar, Kirill A. Shutemov,
	Keiichirou Suzuki

>Hm, this fix means that the vmlinux symbol table and vmcoreinfo have
>different values for mem_section. That seems... odd. I had to patch
>makedumpfile to fix the case of an explicit vmlinux being passed on the
>command line (which I realized I don't need to do, but it should still
>work):

Looks good to me, I'll merge this into makedumpfile-1.6.4.

Thanks,
Atsushi Kumagai

>From 542a11a8f28b0f0a989abc3adff89da22f44c719 Mon Sep 17 00:00:00 2001
>Message-Id: <542a11a8f28b0f0a989abc3adff89da22f44c719.1515995400.git.osandov@fb.com>
>From: Omar Sandoval <osandov@fb.com>
>Date: Sun, 14 Jan 2018 17:10:30 -0800
>Subject: [PATCH] Fix SPARSEMEM_EXTREME support on Linux v4.15 when passing
> vmlinux
>
>Since kernel commit 83e3c48729d9 ("mm/sparsemem: Allocate mem_section at
>runtime for CONFIG_SPARSEMEM_EXTREME=y"), mem_section is a dynamically
>allocated array of pointers to mem_section instead of a static one
>(i.e., struct mem_section ** instead of struct mem_section * []). This
>adds an extra layer of indirection that breaks makedumpfile, which will
>end up with a bunch of bogus mem_maps.
>
>Since kernel commit a0b1280368d1 ("kdump: write correct address of
>mem_section into vmcoreinfo"), the mem_section symbol in vmcoreinfo
>contains the address of the actual struct mem_section * array instead of
>the address of the pointer in .bss, which gets rid of the extra
>indirection. However, makedumpfile still uses the debugging symbol from
>the vmlinux image. Fix this by allowing symbols from the vmcore to
>override symbols from the vmlinux image. As the comment in initial()
>says, "vmcoreinfo in /proc/vmcore is more reliable than -x/-i option".
>
>Signed-off-by: Omar Sandoval <osandov@fb.com>
>---
> makedumpfile.h | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
>diff --git a/makedumpfile.h b/makedumpfile.h
>index 57cf4d9..d68c798 100644
>--- a/makedumpfile.h
>+++ b/makedumpfile.h
>@@ -274,8 +274,10 @@ do { \
> } while (0)
> #define READ_SYMBOL(str_symbol, symbol) \
> do { \
>-	if (SYMBOL(symbol) == NOT_FOUND_SYMBOL) { \
>-		SYMBOL(symbol) = read_vmcoreinfo_symbol(STR_SYMBOL(str_symbol)); \
>+	unsigned long _tmp_symbol; \
>+	_tmp_symbol = read_vmcoreinfo_symbol(STR_SYMBOL(str_symbol)); \
>+	if (_tmp_symbol != NOT_FOUND_SYMBOL) { \
>+		SYMBOL(symbol) = _tmp_symbol; \
> 		if (SYMBOL(symbol) == INVALID_SYMBOL_DATA) \
> 			return FALSE; \
> 	} \
>--
>2.9.5
>
>
>_______________________________________________
>kexec mailing list
>kexec@lists.infradead.org
>http://lists.infradead.org/mailman/listinfo/kexec
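
The extra indirection described in the quoted commit message can be
sketched in user space. This is a minimal illustration under stated
assumptions: NR_ROOTS, static_roots, dynamic_roots, read_root() and
setup_roots() are hypothetical names, the one-field struct mem_section
is a placeholder, and read_root() only mimics a dump tool that reads
pointer-sized words starting at the address recorded for the symbol. It
also assumes uintptr_t and object pointers have the same size.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Placeholder layout; the real struct has more fields. */
struct mem_section { unsigned long section_mem_map; };

#define NR_ROOTS 4 /* hypothetical stand-in for NR_SECTION_ROOTS */

/* Shape before 83e3c48729d9: a static array of root pointers. */
static struct mem_section *static_roots[NR_ROOTS];

/* Shape after 83e3c48729d9: a pointer to a table allocated at runtime. */
static struct mem_section **dynamic_roots;

/*
 * Mimic a dump tool: treat table_addr as the start of the root table and
 * return the raw pointer-sized word stored in slot i.
 */
static uintptr_t read_root(uintptr_t table_addr, size_t i)
{
	uintptr_t word;

	memcpy(&word, (const unsigned char *)table_addr + i * sizeof(word),
	       sizeof(word));
	return word;
}

/* Allocate the dynamic table and point slot 0 of both tables at 'first'. */
static int setup_roots(struct mem_section *first)
{
	dynamic_roots = calloc(NR_ROOTS, sizeof(*dynamic_roots));
	if (!dynamic_roots)
		return -1;
	static_roots[0] = first;
	dynamic_roots[0] = first;
	return 0;
}
```

Reading slot 0 at &static_roots or at the value of dynamic_roots yields
the section pointer. Reading slot 0 at &dynamic_roots yields the table
address instead, which is the kind of bogus entry the commit message
refers to.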

^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
  2018-01-09  0:13           ` Kirill A. Shutemov
  (?)
@ 2018-01-17  5:24             ` Baoquan He
  -1 siblings, 0 replies; 349+ messages in thread
From: Baoquan He @ 2018-01-17  5:24 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Ingo Molnar, Mike Galbraith, Andrew Morton, Peter Zijlstra,
	Greg Kroah-Hartman, Dave Young, kexec, linux-kernel, stable,
	Andy Lutomirski, linux-mm, Vivek Goyal, Cyrill Gorcunov,
	Thomas Gleixner, Borislav Petkov, Linus Torvalds,
	Kirill A. Shutemov

Hi Kirill,

I set up qemu 2.9.0 to test kexec/kdump support with 5-level paging,
but both kexec and kdump reset to BIOS immediately after triggering. I
saw your patch adding 5-level paging support for kexec. I wonder whether
your test succeeded in jumping into the kexec/kdump kernel, and what
else I need to do to make it work. By the way, I just tested the latest
upstream kernel.

commit 7f6890418 x86/kexec: Add 5-level paging support

[ ~]$ qemu-system-x86_64 --version
QEMU emulator version 2.9.0(qemu-2.9.0-1.fc26.1)
Copyright (c) 2003-2017 Fabrice Bellard and the QEMU Project developers

Thanks
Baoquan

On 01/09/18 at 03:13am, Kirill A. Shutemov wrote:
> On Mon, Jan 08, 2018 at 08:46:53PM +0300, Kirill A. Shutemov wrote:
> > On Mon, Jan 08, 2018 at 04:04:44PM +0000, Ingo Molnar wrote:
> > > 
> > > hi Kirill,
> > > 
> > > As Mike reported it below, your 5-level paging related upstream commit 
> > > 83e3c48729d9 and all its followup fixes:
> > > 
> > >  83e3c48729d9: mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
> > >  629a359bdb0e: mm/sparsemem: Fix ARM64 boot crash when CONFIG_SPARSEMEM_EXTREME=y
> > >  d09cfbbfa0f7: mm/sparse.c: wrong allocation for mem_section
> > > 
> > > ... still breaks kexec - and that now regresses -stable as well.
> > > 
> > > Given that 5-level paging now syntactically depends on having this commit, if we 
> > > fully revert this then we'll have to disable 5-level paging as well.
> 
> This *should* help.
> 
> Mike, could you test this? (On top of the rest of the fixes.)
> 
> Sorry for the mess.
> 
> From 100fd567754f1457be94732046aefca204c842d2 Mon Sep 17 00:00:00 2001
> From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> Date: Tue, 9 Jan 2018 02:55:47 +0300
> Subject: [PATCH] kdump: Write a correct address of mem_section into vmcoreinfo
> 
> Depending on configuration mem_section can now be an array or a pointer
> to an array allocated dynamically. In most cases, we can continue to refer
> to it as 'mem_section' regardless of what it is.
> 
> But there's one exception: '&mem_section' means "address of the array" if
> mem_section is an array, but if mem_section is a pointer, it would mean
> "address of the pointer".
> 
> We've stepped onto this in kdump code. VMCOREINFO_SYMBOL(mem_section)
> writes down address of pointer into vmcoreinfo, not array as we wanted.
> 
> Let's introduce VMCOREINFO_ARRAY() that would handle the situation
> correctly for both cases.
> 
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Fixes: 83e3c48729d9 ("mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y")
> ---
>  include/linux/crash_core.h | 2 ++
>  kernel/crash_core.c        | 2 +-
>  2 files changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/crash_core.h b/include/linux/crash_core.h
> index 06097ef30449..83ae04950269 100644
> --- a/include/linux/crash_core.h
> +++ b/include/linux/crash_core.h
> @@ -42,6 +42,8 @@ phys_addr_t paddr_vmcoreinfo_note(void);
>  	vmcoreinfo_append_str("PAGESIZE=%ld\n", value)
>  #define VMCOREINFO_SYMBOL(name) \
>  	vmcoreinfo_append_str("SYMBOL(%s)=%lx\n", #name, (unsigned long)&name)
> +#define VMCOREINFO_ARRAY(name) \
> +	vmcoreinfo_append_str("SYMBOL(%s)=%lx\n", #name, (unsigned long)name)
>  #define VMCOREINFO_SIZE(name) \
>  	vmcoreinfo_append_str("SIZE(%s)=%lu\n", #name, \
>  			      (unsigned long)sizeof(name))
> diff --git a/kernel/crash_core.c b/kernel/crash_core.c
> index b3663896278e..d4122a837477 100644
> --- a/kernel/crash_core.c
> +++ b/kernel/crash_core.c
> @@ -410,7 +410,7 @@ static int __init crash_save_vmcoreinfo_init(void)
>  	VMCOREINFO_SYMBOL(contig_page_data);
>  #endif
>  #ifdef CONFIG_SPARSEMEM
> -	VMCOREINFO_SYMBOL(mem_section);
> +	VMCOREINFO_ARRAY(mem_section);
>  	VMCOREINFO_LENGTH(mem_section, NR_SECTION_ROOTS);
>  	VMCOREINFO_STRUCT_SIZE(mem_section);
>  	VMCOREINFO_OFFSET(mem_section, section_mem_map);
> -- 
>  Kirill A. Shutemov
> 
> _______________________________________________
> kexec mailing list
> kexec@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec
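
The difference between VMCOREINFO_SYMBOL() and VMCOREINFO_ARRAY() in the
quoted patch comes down to '&name' versus 'name'. A minimal user-space
sketch, assuming snprintf() into a caller-supplied char array in place
of vmcoreinfo_append_str(); SYMBOL_LINE, SYMBOL_ARRAY_LINE, demo_array
and demo_ptr are hypothetical names used only for illustration:

```c
#include <stdio.h>

/*
 * User-space stand-ins for the two kernel macros. 'buf' must be a char
 * array (not a pointer) so that sizeof(buf) is the buffer size.
 */
#define SYMBOL_LINE(buf, name) \
	snprintf(buf, sizeof(buf), "SYMBOL(%s)=%lx\n", #name, \
		 (unsigned long)&name)
#define SYMBOL_ARRAY_LINE(buf, name) \
	snprintf(buf, sizeof(buf), "SYMBOL(%s)=%lx\n", #name, \
		 (unsigned long)name)

/* An array, and a pointer that refers to that array's storage. */
static long demo_array[4];
static long *demo_ptr = demo_array;
```

For demo_array both macros format the same address, because an array
used as a value converts to the address of its first element, which is
also the address of the array. For demo_ptr they differ: SYMBOL_LINE
records where the pointer variable itself lives, while SYMBOL_ARRAY_LINE
records the address it points to.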

^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
  2018-01-17  5:24             ` Baoquan He
  (?)
@ 2018-01-25 15:50               ` Kirill A. Shutemov
  -1 siblings, 0 replies; 349+ messages in thread
From: Kirill A. Shutemov @ 2018-01-25 15:50 UTC (permalink / raw)
  To: Baoquan He
  Cc: Ingo Molnar, Mike Galbraith, Andrew Morton, Peter Zijlstra,
	Greg Kroah-Hartman, Dave Young, kexec, linux-kernel, stable,
	Andy Lutomirski, linux-mm, Vivek Goyal, Cyrill Gorcunov,
	Thomas Gleixner, Borislav Petkov, Linus Torvalds,
	Kirill A. Shutemov

On Wed, Jan 17, 2018 at 01:24:54PM +0800, Baoquan He wrote:
> Hi Kirill,
> 
> I set up qemu 2.9.0 to test kexec/kdump support with 5-level paging, but
> both kexec and kdump reset to BIOS immediately after triggering. I saw your
> patch adding 5-level paging support for kexec. I wonder whether your test
> succeeded in jumping into the kexec/kdump kernel, and what else I need to
> make it work. By the way, I just tested the latest upstream kernel.
> 
> commit 7f6890418 x86/kexec: Add 5-level paging support
> 
> [ ~]$ qemu-system-x86_64 --version
> QEMU emulator version 2.9.0(qemu-2.9.0-1.fc26.1)
> Copyright (c) 2003-2017 Fabrice Bellard and the QEMU Project developers

Sorry for the delay.

I didn't test it in 5-level paging mode :-/

The patch below helps in my case. Could you test it?

diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86/kernel/relocate_kernel_64.S
index 307d3bac5f04..65a98cf2307d 100644
--- a/arch/x86/kernel/relocate_kernel_64.S
+++ b/arch/x86/kernel/relocate_kernel_64.S
@@ -126,8 +126,12 @@ identity_mapped:
        /*
         * Set cr4 to a known state:
         *  - physical address extension enabled
+        *  - 5-level paging, if enabled
         */
        movl    $X86_CR4_PAE, %eax
+#ifdef CONFIG_X86_5LEVEL
+       orl     $X86_CR4_LA57, %eax
+#endif
        movq    %rax, %cr4

        jmp 1f
-- 
 Kirill A. Shutemov
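
For readers less used to the assembly, the CR4 value that the patched code loads can be written out in C. This is only a sketch of the arithmetic: the bit positions (PAE is bit 5, LA57 is bit 12 of CR4) are the architectural ones, while CR4_PAE, CR4_LA57 and kexec_cr4_value() are names made up for this illustration, and the five_level argument stands in for the CONFIG_X86_5LEVEL compile-time check.

```c
#include <assert.h>
#include <stdint.h>

/* Architectural CR4 bit positions; the macro names are local to this sketch. */
#define CR4_PAE  (UINT64_C(1) << 5)	/* physical address extension */
#define CR4_LA57 (UINT64_C(1) << 12)	/* 5-level paging */

/*
 * Hypothetical helper: returns the value the patched code would load into
 * CR4. five_level plays the role of "#ifdef CONFIG_X86_5LEVEL".
 */
uint64_t kexec_cr4_value(int five_level)
{
	uint64_t cr4 = CR4_PAE;	/* movl $X86_CR4_PAE, %eax */

	assert(five_level == 0 || five_level == 1);
	if (five_level)
		cr4 |= CR4_LA57;	/* orl $X86_CR4_LA57, %eax */

	return cr4;		/* movq %rax, %cr4 */
}
```

With five_level set to 0 the result is 0x20, and with 1 it is 0x1020. Note that in the patch the choice is made at build time, not from the running CPU state.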

^ permalink raw reply related	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
  2018-01-25 15:50               ` Kirill A. Shutemov
  (?)
@ 2018-01-26  2:48                 ` Baoquan He
  -1 siblings, 0 replies; 349+ messages in thread
From: Baoquan He @ 2018-01-26  2:48 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Ingo Molnar, Mike Galbraith, Andrew Morton, Peter Zijlstra,
	Greg Kroah-Hartman, Dave Young, kexec, linux-kernel, stable,
	Andy Lutomirski, linux-mm, Vivek Goyal, Cyrill Gorcunov,
	Thomas Gleixner, Borislav Petkov, Linus Torvalds,
	Kirill A. Shutemov

On 01/25/18 at 06:50pm, Kirill A. Shutemov wrote:
> On Wed, Jan 17, 2018 at 01:24:54PM +0800, Baoquan He wrote:
> > Hi Kirill,
> > 
> > I set up qemu 2.9.0 to test kexec/kdump support with 5-level paging, but
> > both kexec and kdump reset to BIOS immediately after triggering. I saw your
> > patch adding 5-level paging support for kexec. I wonder whether your test
> > succeeded in jumping into the kexec/kdump kernel, and what else I need to
> > make it work. By the way, I just tested the latest upstream kernel.
> > 
> > commit 7f6890418 x86/kexec: Add 5-level paging support
> > 
> > [ ~]$ qemu-system-x86_64 --version
> > QEMU emulator version 2.9.0(qemu-2.9.0-1.fc26.1)
> > Copyright (c) 2003-2017 Fabrice Bellard and the QEMU Project developers
> 
> Sorry for the delay.
> 
> I didn't test it in 5-level paging mode :-/
> 
> The patch below helps in my case. Could you test it?

Thanks, Kirill.

It doesn't seem to work. I'm a bit confused about the process, so I will
send you a private mail.

Thanks
Baoquan
> 
> diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86/kernel/relocate_kernel_64.S
> index 307d3bac5f04..65a98cf2307d 100644
> --- a/arch/x86/kernel/relocate_kernel_64.S
> +++ b/arch/x86/kernel/relocate_kernel_64.S
> @@ -126,8 +126,12 @@ identity_mapped:
>         /*
>          * Set cr4 to a known state:
>          *  - physical address extension enabled
> +        *  - 5-level paging, if enabled
>          */
>         movl    $X86_CR4_PAE, %eax
> +#ifdef CONFIG_X86_5LEVEL
> +       orl     $X86_CR4_LA57, %eax
> +#endif
>         movq    %rax, %cr4
> 
>         jmp 1f
> -- 
>  Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
  2018-01-09  3:44             ` Mike Galbraith
  (?)
@ 2018-02-07  9:25               ` Dou Liyang
  -1 siblings, 0 replies; 349+ messages in thread
From: Dou Liyang @ 2018-02-07  9:25 UTC (permalink / raw)
  To: Mike Galbraith, Kirill A. Shutemov, Ingo Molnar, Andrew Morton
  Cc: Baoquan He, Peter Zijlstra, Greg Kroah-Hartman, Dave Young,
	kexec, linux-kernel, stable, Andy Lutomirski, linux-mm,
	Vivek Goyal, Cyrill Gorcunov, Thomas Gleixner, Borislav Petkov,
	Linus Torvalds, Kirill A. Shutemov

Hi All,

makedumpfile fails for me on an upstream kernel that contains this
patch. Did I miss something else?

On a Fedora 27 host:

[douly@localhost code]$ ./makedumpfile -d 31 --message-level 31 -x vmlinux_4.15+ vmcore_4.15+_from_cp_command vmcore_4.15+

sadump: does not have partition header
sadump: read dump device as unknown format
sadump: unknown format
LOAD (0)
   phys_start : 1000000
   phys_end   : 2a86000
   virt_start : ffffffff81000000
   virt_end   : ffffffff82a86000
LOAD (1)
   phys_start : 1000
   phys_end   : 9fc00
   virt_start : ffff880000001000
   virt_end   : ffff88000009fc00
LOAD (2)
   phys_start : 100000
   phys_end   : 13000000
   virt_start : ffff880000100000
   virt_end   : ffff880013000000
LOAD (3)
   phys_start : 33000000
   phys_end   : 7ffd7000
   virt_start : ffff880033000000
   virt_end   : ffff88007ffd7000
Linux kdump
page_size    : 4096

max_mapnr    : 7ffd7

Buffer size for the cyclic mode: 131061
The kernel version is not supported.
The makedumpfile operation may be incomplete.

num of NODEs : 1


Memory type  : SPARSEMEM_EX

mem_map (0)
   mem_map    : ffff88007ff26000
   pfn_start  : 0
   pfn_end    : 8000
mem_map (1)
   mem_map    : 0
   pfn_start  : 8000
   pfn_end    : 10000
mem_map (2)
   mem_map    : 0
   pfn_start  : 10000
   pfn_end    : 18000
mem_map (3)
   mem_map    : 0
   pfn_start  : 18000
   pfn_end    : 20000
mem_map (4)
   mem_map    : 0
   pfn_start  : 20000
   pfn_end    : 28000
mem_map (5)
   mem_map    : 0
   pfn_start  : 28000
   pfn_end    : 30000
mem_map (6)
   mem_map    : 0
   pfn_start  : 30000
   pfn_end    : 38000
mem_map (7)
   mem_map    : 0
   pfn_start  : 38000
   pfn_end    : 40000
mem_map (8)
   mem_map    : 0
   pfn_start  : 40000
   pfn_end    : 48000
mem_map (9)
   mem_map    : 0
   pfn_start  : 48000
   pfn_end    : 50000
mem_map (10)
   mem_map    : 0
   pfn_start  : 50000
   pfn_end    : 58000
mem_map (11)
   mem_map    : 0
   pfn_start  : 58000
   pfn_end    : 60000
mem_map (12)
   mem_map    : 0
   pfn_start  : 60000
   pfn_end    : 68000
mem_map (13)
   mem_map    : 0
   pfn_start  : 68000
   pfn_end    : 70000
mem_map (14)
   mem_map    : 0
   pfn_start  : 70000
   pfn_end    : 78000
mem_map (15)
   mem_map    : 0
   pfn_start  : 78000
   pfn_end    : 7ffd7
mmap() is available on the kernel.
Checking for memory holes                         : [100.0 %] | 
         STEP [Checking for memory holes  ] : 0.000060 seconds
__vtop4_x86_64: Can't get a valid pte.
readmem: Can't convert a virtual address(ffff88007ffd7000) to physical address.
readmem: type_addr: 0, addr:ffff88007ffd7000, size:32768
__exclude_unnecessary_pages: Can't read the buffer of struct page.
create_2nd_bitmap: Can't exclude unnecessary pages.
Checking for memory holes                         : [100.0 %] \ 
         STEP [Checking for memory holes  ] : 0.000010 seconds
Checking for memory holes                         : [100.0 %] - 
         STEP [Checking for memory holes  ] : 0.000004 seconds
__vtop4_x86_64: Can't get a valid pte.
readmem: Can't convert a virtual address(ffff88007ffd7000) to physical address.
readmem: type_addr: 0, addr:ffff88007ffd7000, size:32768
__exclude_unnecessary_pages: Can't read the buffer of struct page.
create_2nd_bitmap: Can't exclude unnecessary pages.
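
One detail that can be read directly off the log above: the address that readmem fails on, ffff88007ffd7000, is exactly the virt_end of LOAD (3). The sketch below only checks that observation and does not diagnose the failure. The table is copied from the LOAD lines in the log; struct load_seg and find_load_segment() are hypothetical names, and treating each range as half-open (start inclusive, end exclusive) is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for one LOAD entry from the log. */
struct load_seg {
	uint64_t virt_start;
	uint64_t virt_end;
};

/* virt_start/virt_end pairs copied from LOAD (0) to LOAD (3). */
static const struct load_seg segs[] = {
	{ 0xffffffff81000000ULL, 0xffffffff82a86000ULL },
	{ 0xffff880000001000ULL, 0xffff88000009fc00ULL },
	{ 0xffff880000100000ULL, 0xffff880013000000ULL },
	{ 0xffff880033000000ULL, 0xffff88007ffd7000ULL },
};

/*
 * Returns the index of the segment covering vaddr, or -1 if none does.
 * Assumption: ranges are half-open, [virt_start, virt_end).
 */
int find_load_segment(uint64_t vaddr)
{
	size_t i;

	for (i = 0; i < sizeof(segs) / sizeof(segs[0]); i++) {
		assert(segs[i].virt_start < segs[i].virt_end);
		if (vaddr >= segs[i].virt_start && vaddr < segs[i].virt_end)
			return (int)i;
	}
	return -1;
}
```

Under that assumption, find_load_segment(0xffff88007ffd7000ULL) returns -1, while the byte just below it still falls in LOAD (3). In other words, the failing read starts at the first address past the last dumped segment.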

Thanks,
	dou

At 01/09/2018 11:44 AM, Mike Galbraith wrote:
> On Tue, 2018-01-09 at 03:13 +0300, Kirill A. Shutemov wrote:
>>
>> Mike, could you test this? (On top of the rest of the fixes.)
> 
> homer:..crash/2018-01-09-04:25 # ll
> total 1863604
> -rw------- 1 root root      66255 Jan  9 04:25 dmesg.txt
> -rw-r--r-- 1 root root        182 Jan  9 04:25 README.txt
> -rw-r--r-- 1 root root    2818240 Jan  9 04:25 System.map-4.15.0.gb2cd1df-master
> -rw------- 1 root root 1832914928 Jan  9 04:25 vmcore
> -rw-r--r-- 1 root root   72514993 Jan  9 04:25 vmlinux-4.15.0.gb2cd1df-master.gz
> 
> Yup, all better.
> 
>> Sorry for the mess.
> 
> (why, developers not installing shiny new bugs is a whole lot worse:)
> 
>>  From 100fd567754f1457be94732046aefca204c842d2 Mon Sep 17 00:00:00 2001
>> From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
>> Date: Tue, 9 Jan 2018 02:55:47 +0300
>> Subject: [PATCH] kdump: Write a correct address of mem_section into vmcoreinfo
>>
>> Depending on configuration mem_section can now be an array or a pointer
>> to an array allocated dynamically. In most cases, we can continue to refer
>> to it as 'mem_section' regardless of what it is.
>>
>> But there's one exception: '&mem_section' means "address of the array" if
>> mem_section is an array, but if mem_section is a pointer, it would mean
>> "address of the pointer".
>>
>> We've stepped onto this in kdump code. VMCOREINFO_SYMBOL(mem_section)
>> writes down address of pointer into vmcoreinfo, not array as we wanted.
>>
>> Let's introduce VMCOREINFO_ARRAY() that would handle the situation
>> correctly for both cases.
>>
>> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>> Fixes: 83e3c48729d9 ("mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y")
>> ---
>>   include/linux/crash_core.h | 2 ++
>>   kernel/crash_core.c        | 2 +-
>>   2 files changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/linux/crash_core.h b/include/linux/crash_core.h
>> index 06097ef30449..83ae04950269 100644
>> --- a/include/linux/crash_core.h
>> +++ b/include/linux/crash_core.h
>> @@ -42,6 +42,8 @@ phys_addr_t paddr_vmcoreinfo_note(void);
>>   	vmcoreinfo_append_str("PAGESIZE=%ld\n", value)
>>   #define VMCOREINFO_SYMBOL(name) \
>>   	vmcoreinfo_append_str("SYMBOL(%s)=%lx\n", #name, (unsigned long)&name)
>> +#define VMCOREINFO_ARRAY(name) \
>> +	vmcoreinfo_append_str("SYMBOL(%s)=%lx\n", #name, (unsigned long)name)
>>   #define VMCOREINFO_SIZE(name) \
>>   	vmcoreinfo_append_str("SIZE(%s)=%lu\n", #name, \
>>   			      (unsigned long)sizeof(name))
>> diff --git a/kernel/crash_core.c b/kernel/crash_core.c
>> index b3663896278e..d4122a837477 100644
>> --- a/kernel/crash_core.c
>> +++ b/kernel/crash_core.c
>> @@ -410,7 +410,7 @@ static int __init crash_save_vmcoreinfo_init(void)
>>   	VMCOREINFO_SYMBOL(contig_page_data);
>>   #endif
>>   #ifdef CONFIG_SPARSEMEM
>> -	VMCOREINFO_SYMBOL(mem_section);
>> +	VMCOREINFO_ARRAY(mem_section);
>>   	VMCOREINFO_LENGTH(mem_section, NR_SECTION_ROOTS);
>>   	VMCOREINFO_STRUCT_SIZE(mem_section);
>>   	VMCOREINFO_OFFSET(mem_section, section_mem_map);
> 

^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
@ 2018-02-07  9:25               ` Dou Liyang
  0 siblings, 0 replies; 349+ messages in thread
From: Dou Liyang @ 2018-02-07  9:25 UTC (permalink / raw)
  To: Mike Galbraith, Kirill A. Shutemov, Ingo Molnar, Andrew Morton
  Cc: Baoquan He, Peter Zijlstra, Greg Kroah-Hartman, Dave Young,
	kexec, linux-kernel, stable, Andy Lutomirski, linux-mm,
	Vivek Goyal, Cyrill Gorcunov, Thomas Gleixner, Borislav Petkov,
	Linus Torvalds, Kirill A. Shutemov

Hi All,

makedumpfile fails for me on an upstream kernel that contains this
patch. Did I miss something else?

On a Fedora 27 host:

[douly@localhost code]$ ./makedumpfile -d 31 --message-level 31 -x vmlinux_4.15+ vmcore_4.15+_from_cp_command vmcore_4.15+

sadump: does not have partition header
sadump: read dump device as unknown format
sadump: unknown format
LOAD (0)
   phys_start : 1000000
   phys_end   : 2a86000
   virt_start : ffffffff81000000
   virt_end   : ffffffff82a86000
LOAD (1)
   phys_start : 1000
   phys_end   : 9fc00
   virt_start : ffff880000001000
   virt_end   : ffff88000009fc00
LOAD (2)
   phys_start : 100000
   phys_end   : 13000000
   virt_start : ffff880000100000
   virt_end   : ffff880013000000
LOAD (3)
   phys_start : 33000000
   phys_end   : 7ffd7000
   virt_start : ffff880033000000
   virt_end   : ffff88007ffd7000
Linux kdump
page_size    : 4096

max_mapnr    : 7ffd7

Buffer size for the cyclic mode: 131061
The kernel version is not supported.
The makedumpfile operation may be incomplete.

num of NODEs : 1


Memory type  : SPARSEMEM_EX

mem_map (0)
   mem_map    : ffff88007ff26000
   pfn_start  : 0
   pfn_end    : 8000
mem_map (1)
   mem_map    : 0
   pfn_start  : 8000
   pfn_end    : 10000
mem_map (2)
   mem_map    : 0
   pfn_start  : 10000
   pfn_end    : 18000
mem_map (3)
   mem_map    : 0
   pfn_start  : 18000
   pfn_end    : 20000
mem_map (4)
   mem_map    : 0
   pfn_start  : 20000
   pfn_end    : 28000
mem_map (5)
   mem_map    : 0
   pfn_start  : 28000
   pfn_end    : 30000
mem_map (6)
   mem_map    : 0
   pfn_start  : 30000
   pfn_end    : 38000
mem_map (7)
   mem_map    : 0
   pfn_start  : 38000
   pfn_end    : 40000
mem_map (8)
   mem_map    : 0
   pfn_start  : 40000
   pfn_end    : 48000
mem_map (9)
   mem_map    : 0
   pfn_start  : 48000
   pfn_end    : 50000
mem_map (10)
   mem_map    : 0
   pfn_start  : 50000
   pfn_end    : 58000
mem_map (11)
   mem_map    : 0
   pfn_start  : 58000
   pfn_end    : 60000
mem_map (12)
   mem_map    : 0
   pfn_start  : 60000
   pfn_end    : 68000
mem_map (13)
   mem_map    : 0
   pfn_start  : 68000
   pfn_end    : 70000
mem_map (14)
   mem_map    : 0
   pfn_start  : 70000
   pfn_end    : 78000
mem_map (15)
   mem_map    : 0
   pfn_start  : 78000
   pfn_end    : 7ffd7
mmap() is available on the kernel.
Checking for memory holes                         : [100.0 %] | 
         STEP [Checking for memory holes  ] : 0.000060 seconds
__vtop4_x86_64: Can't get a valid pte.
readmem: Can't convert a virtual address(ffff88007ffd7000) to physical address.
readmem: type_addr: 0, addr:ffff88007ffd7000, size:32768
__exclude_unnecessary_pages: Can't read the buffer of struct page.
create_2nd_bitmap: Can't exclude unnecessary pages.
Checking for memory holes                         : [100.0 %] \ 
         STEP [Checking for memory holes  ] : 0.000010 seconds
Checking for memory holes                         : [100.0 %] - 
         STEP [Checking for memory holes  ] : 0.000004 seconds
__vtop4_x86_64: Can't get a valid pte.
readmem: Can't convert a virtual address(ffff88007ffd7000) to physical address.
readmem: type_addr: 0, addr:ffff88007ffd7000, size:32768
__exclude_unnecessary_pages: Can't read the buffer of struct page.
create_2nd_bitmap: Can't exclude unnecessary pages.

Thanks,
	dou

At 01/09/2018 11:44 AM, Mike Galbraith wrote:
> On Tue, 2018-01-09 at 03:13 +0300, Kirill A. Shutemov wrote:
>>
>> Mike, could you test this? (On top of the rest of the fixes.)
> 
> homer:..crash/2018-01-09-04:25 # ll
> total 1863604
> -rw------- 1 root root      66255 Jan  9 04:25 dmesg.txt
> -rw-r--r-- 1 root root        182 Jan  9 04:25 README.txt
> -rw-r--r-- 1 root root    2818240 Jan  9 04:25 System.map-4.15.0.gb2cd1df-master
> -rw------- 1 root root 1832914928 Jan  9 04:25 vmcore
> -rw-r--r-- 1 root root   72514993 Jan  9 04:25 vmlinux-4.15.0.gb2cd1df-master.gz
> 
> Yup, all better.
> 
>> Sorry for the mess.
> 
> (why, developers not installing shiny new bugs is a whole lot worse:)
> 
>>  From 100fd567754f1457be94732046aefca204c842d2 Mon Sep 17 00:00:00 2001
>> From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
>> Date: Tue, 9 Jan 2018 02:55:47 +0300
>> Subject: [PATCH] kdump: Write a correct address of mem_section into vmcoreinfo
>>
>> Depending on configuration mem_section can now be an array or a pointer
>> to an array allocated dynamically. In most cases, we can continue to refer
>> to it as 'mem_section' regardless of what it is.
>>
>> But there's one exception: '&mem_section' means "address of the array" if
>> mem_section is an array, but if mem_section is a pointer, it would mean
>> "address of the pointer".
>>
>> We've stepped onto this in kdump code. VMCOREINFO_SYMBOL(mem_section)
>> writes down address of pointer into vmcoreinfo, not array as we wanted.
>>
>> Let's introduce VMCOREINFO_ARRAY() that would handle the situation
>> correctly for both cases.
>>
>> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>> Fixes: 83e3c48729d9 ("mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y")
>> ---
>>   include/linux/crash_core.h | 2 ++
>>   kernel/crash_core.c        | 2 +-
>>   2 files changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/linux/crash_core.h b/include/linux/crash_core.h
>> index 06097ef30449..83ae04950269 100644
>> --- a/include/linux/crash_core.h
>> +++ b/include/linux/crash_core.h
>> @@ -42,6 +42,8 @@ phys_addr_t paddr_vmcoreinfo_note(void);
>>   	vmcoreinfo_append_str("PAGESIZE=%ld\n", value)
>>   #define VMCOREINFO_SYMBOL(name) \
>>   	vmcoreinfo_append_str("SYMBOL(%s)=%lx\n", #name, (unsigned long)&name)
>> +#define VMCOREINFO_ARRAY(name) \
>> +	vmcoreinfo_append_str("SYMBOL(%s)=%lx\n", #name, (unsigned long)name)
>>   #define VMCOREINFO_SIZE(name) \
>>   	vmcoreinfo_append_str("SIZE(%s)=%lu\n", #name, \
>>   			      (unsigned long)sizeof(name))
>> diff --git a/kernel/crash_core.c b/kernel/crash_core.c
>> index b3663896278e..d4122a837477 100644
>> --- a/kernel/crash_core.c
>> +++ b/kernel/crash_core.c
>> @@ -410,7 +410,7 @@ static int __init crash_save_vmcoreinfo_init(void)
>>   	VMCOREINFO_SYMBOL(contig_page_data);
>>   #endif
>>   #ifdef CONFIG_SPARSEMEM
>> -	VMCOREINFO_SYMBOL(mem_section);
>> +	VMCOREINFO_ARRAY(mem_section);
>>   	VMCOREINFO_LENGTH(mem_section, NR_SECTION_ROOTS);
>>   	VMCOREINFO_STRUCT_SIZE(mem_section);
>>   	VMCOREINFO_OFFSET(mem_section, section_mem_map);



^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
  2018-02-07  9:25               ` Dou Liyang
  (?)
@ 2018-02-07 10:41                 ` Kirill A. Shutemov
  -1 siblings, 0 replies; 349+ messages in thread
From: Kirill A. Shutemov @ 2018-02-07 10:41 UTC (permalink / raw)
  To: Dou Liyang
  Cc: Mike Galbraith, Ingo Molnar, Andrew Morton, Baoquan He,
	Peter Zijlstra, Greg Kroah-Hartman, Dave Young, kexec,
	linux-kernel, stable, Andy Lutomirski, linux-mm, Vivek Goyal,
	Cyrill Gorcunov, Thomas Gleixner, Borislav Petkov,
	Linus Torvalds, Kirill A. Shutemov

On Wed, Feb 07, 2018 at 05:25:05PM +0800, Dou Liyang wrote:
> Hi All,
> 
> makedumpfile fails for me on an upstream kernel that contains this
> patch. Did I miss something else?

None I'm aware of.

Is there a reason to suspect that the issue is related to the bug this patch
fixed?

-- 
 Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
  2018-02-07 10:41                 ` Kirill A. Shutemov
  (?)
  (?)
@ 2018-02-07 10:45                   ` Mike Galbraith
  -1 siblings, 0 replies; 349+ messages in thread
From: Mike Galbraith @ 2018-02-07 10:45 UTC (permalink / raw)
  To: Kirill A. Shutemov, Dou Liyang
  Cc: Ingo Molnar, Andrew Morton, Baoquan He, Peter Zijlstra,
	Greg Kroah-Hartman, Dave Young, kexec, linux-kernel, stable,
	Andy Lutomirski, linux-mm, Vivek Goyal, Cyrill Gorcunov,
	Thomas Gleixner, Borislav Petkov, Linus Torvalds,
	Kirill A. Shutemov

On Wed, 2018-02-07 at 13:41 +0300, Kirill A. Shutemov wrote:
> On Wed, Feb 07, 2018 at 05:25:05PM +0800, Dou Liyang wrote:
> > Hi All,
> > 
> > makedumpfile fails for me on an upstream kernel that contains this
> > patch. Did I miss something else?
> 
> None I'm aware of.
> 
> Is there a reason to suspect that the issue is related to the bug this patch
> fixed?

Still works fine for me with .today.  The box is only a 16GB desktop, though.

	-Mike

^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
  2018-02-07  9:25               ` Dou Liyang
  (?)
@ 2018-02-07 11:28                 ` Baoquan He
  -1 siblings, 0 replies; 349+ messages in thread
From: Baoquan He @ 2018-02-07 11:28 UTC (permalink / raw)
  To: Dou Liyang
  Cc: Mike Galbraith, Kirill A. Shutemov, Ingo Molnar, Andrew Morton,
	Peter Zijlstra, Greg Kroah-Hartman, Linus Torvalds, kexec,
	linux-kernel, stable, Andy Lutomirski, linux-mm,
	Kirill A. Shutemov, Cyrill Gorcunov, Thomas Gleixner,
	Borislav Petkov, Dave Young, Vivek Goyal

On 02/07/18 at 05:25pm, Dou Liyang wrote:
> Hi All,
> 
> makedumpfile fails for me on an upstream kernel that contains this
> patch. Did I miss something else?

readmem: Can't convert a virtual address(ffff88007ffd7000) to physical

This should not be related to this patch; otherwise your code could not
get to that step. From the message, ffff88007ffd7000 is the end of the
last memory region, so this looks like a code bug. You are testing
5-level with makedumpfile, right?

The patches I posted to decrease the memory cost of mem map allocation
have a code bug. Fengguang's test robot sent me a mail about it, and I
have updated the patches and am trying to write a good patch log. You
might also need to check the
5-level patches you posted to makedumpfile upstream.

> 
> In fedora27 host:
> 
> [douly@localhost code]$ ./makedumpfile -d 31 --message-level 31 -x
> vmlinux_4.15+ vmcore_4.15+_from_cp_command vmcore_4.15+
> 
> sadump: does not have partition header
> sadump: read dump device as unknown format
> sadump: unknown format
> LOAD (0)
>   phys_start : 1000000
>   phys_end   : 2a86000
>   virt_start : ffffffff81000000
>   virt_end   : ffffffff82a86000
> LOAD (1)
>   phys_start : 1000
>   phys_end   : 9fc00
>   virt_start : ffff880000001000
>   virt_end   : ffff88000009fc00
> LOAD (2)
>   phys_start : 100000
>   phys_end   : 13000000
>   virt_start : ffff880000100000
>   virt_end   : ffff880013000000
> LOAD (3)
>   phys_start : 33000000
>   phys_end   : 7ffd7000
>   virt_start : ffff880033000000
>   virt_end   : ffff88007ffd7000
> Linux kdump
> page_size    : 4096
> 
> max_mapnr    : 7ffd7
> 
> Buffer size for the cyclic mode: 131061
> The kernel version is not supported.
> The makedumpfile operation may be incomplete.
> 
> num of NODEs : 1
> 
> 
> Memory type  : SPARSEMEM_EX
> 
> mem_map (0)
>   mem_map    : ffff88007ff26000
>   pfn_start  : 0
>   pfn_end    : 8000
> mem_map (1)
>   mem_map    : 0
>   pfn_start  : 8000
>   pfn_end    : 10000
> mem_map (2)
>   mem_map    : 0
>   pfn_start  : 10000
>   pfn_end    : 18000
> mem_map (3)
>   mem_map    : 0
>   pfn_start  : 18000
>   pfn_end    : 20000
> mem_map (4)
>   mem_map    : 0
>   pfn_start  : 20000
>   pfn_end    : 28000
> mem_map (5)
>   mem_map    : 0
>   pfn_start  : 28000
>   pfn_end    : 30000
> mem_map (6)
>   mem_map    : 0
>   pfn_start  : 30000
>   pfn_end    : 38000
> mem_map (7)
>   mem_map    : 0
>   pfn_start  : 38000
>   pfn_end    : 40000
> mem_map (8)
>   mem_map    : 0
>   pfn_start  : 40000
>   pfn_end    : 48000
> mem_map (9)
>   mem_map    : 0
>   pfn_start  : 48000
>   pfn_end    : 50000
> mem_map (10)
>   mem_map    : 0
>   pfn_start  : 50000
>   pfn_end    : 58000
> mem_map (11)
>   mem_map    : 0
>   pfn_start  : 58000
>   pfn_end    : 60000
> mem_map (12)
>   mem_map    : 0
>   pfn_start  : 60000
>   pfn_end    : 68000
> mem_map (13)
>   mem_map    : 0
>   pfn_start  : 68000
>   pfn_end    : 70000
> mem_map (14)
>   mem_map    : 0
>   pfn_start  : 70000
>   pfn_end    : 78000
> mem_map (15)
>   mem_map    : 0
>   pfn_start  : 78000
>   pfn_end    : 7ffd7
> mmap() is available on the kernel.
> Checking for memory holes                         : [100.0 %] |
>          STEP [Checking for memory holes  ] : 0.000060 seconds
> __vtop4_x86_64: Can't get a valid pte.
> readmem: Can't convert a virtual address(ffff88007ffd7000) to physical address.
> readmem: type_addr: 0, addr:ffff88007ffd7000, size:32768
> __exclude_unnecessary_pages: Can't read the buffer of struct page.
> create_2nd_bitmap: Can't exclude unnecessary pages.
> Checking for memory holes                         : [100.0 %] \
>          STEP [Checking for memory holes  ] : 0.000010 seconds
> Checking for memory holes                         : [100.0 %] -
>          STEP [Checking for memory holes  ] : 0.000004 seconds
> __vtop4_x86_64: Can't get a valid pte.
> readmem: Can't convert a virtual address(ffff88007ffd7000) to physical address.
> readmem: type_addr: 0, addr:ffff88007ffd7000, size:32768
> __exclude_unnecessary_pages: Can't read the buffer of struct page.
> create_2nd_bitmap: Can't exclude unnecessary pages.
> 
> Thanks,
> 	dou
> At 01/09/2018 11:44 AM, Mike Galbraith wrote:
> > On Tue, 2018-01-09 at 03:13 +0300, Kirill A. Shutemov wrote:
> > > 
> > > Mike, could you test this? (On top of the rest of the fixes.)
> > 
> > homer:..crash/2018-01-09-04:25 # ll
> > total 1863604
> > -rw------- 1 root root      66255 Jan  9 04:25 dmesg.txt
> > -rw-r--r-- 1 root root        182 Jan  9 04:25 README.txt
> > -rw-r--r-- 1 root root    2818240 Jan  9 04:25 System.map-4.15.0.gb2cd1df-master
> > -rw------- 1 root root 1832914928 Jan  9 04:25 vmcore
> > -rw-r--r-- 1 root root   72514993 Jan  9 04:25 vmlinux-4.15.0.gb2cd1df-master.gz
> > 
> > Yup, all better.
> > 
> > > Sorry for the mess.
> > 
> > (why, developers not installing shiny new bugs is a whole lot worse:)
> > 
> > >  From 100fd567754f1457be94732046aefca204c842d2 Mon Sep 17 00:00:00 2001
> > > From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> > > Date: Tue, 9 Jan 2018 02:55:47 +0300
> > > Subject: [PATCH] kdump: Write a correct address of mem_section into vmcoreinfo
> > > 
> > > Depending on configuration mem_section can now be an array or a pointer
> > > to an array allocated dynamically. In most cases, we can continue to refer
> > > to it as 'mem_section' regardless of what it is.
> > > 
> > > But there's one exception: '&mem_section' means "address of the array" if
> > > mem_section is an array, but if mem_section is a pointer, it would mean
> > > "address of the pointer".
> > > 
> > > We've stepped onto this in kdump code. VMCOREINFO_SYMBOL(mem_section)
> > > writes down address of pointer into vmcoreinfo, not array as we wanted.
> > > 
> > > Let's introduce VMCOREINFO_ARRAY() that would handle the situation
> > > correctly for both cases.
> > > 
> > > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > > Fixes: 83e3c48729d9 ("mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y")
> > > ---
> > >   include/linux/crash_core.h | 2 ++
> > >   kernel/crash_core.c        | 2 +-
> > >   2 files changed, 3 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/include/linux/crash_core.h b/include/linux/crash_core.h
> > > index 06097ef30449..83ae04950269 100644
> > > --- a/include/linux/crash_core.h
> > > +++ b/include/linux/crash_core.h
> > > @@ -42,6 +42,8 @@ phys_addr_t paddr_vmcoreinfo_note(void);
> > >   	vmcoreinfo_append_str("PAGESIZE=%ld\n", value)
> > >   #define VMCOREINFO_SYMBOL(name) \
> > >   	vmcoreinfo_append_str("SYMBOL(%s)=%lx\n", #name, (unsigned long)&name)
> > > +#define VMCOREINFO_ARRAY(name) \
> > > +	vmcoreinfo_append_str("SYMBOL(%s)=%lx\n", #name, (unsigned long)name)
> > >   #define VMCOREINFO_SIZE(name) \
> > >   	vmcoreinfo_append_str("SIZE(%s)=%lu\n", #name, \
> > >   			      (unsigned long)sizeof(name))
> > > diff --git a/kernel/crash_core.c b/kernel/crash_core.c
> > > index b3663896278e..d4122a837477 100644
> > > --- a/kernel/crash_core.c
> > > +++ b/kernel/crash_core.c
> > > @@ -410,7 +410,7 @@ static int __init crash_save_vmcoreinfo_init(void)
> > >   	VMCOREINFO_SYMBOL(contig_page_data);
> > >   #endif
> > >   #ifdef CONFIG_SPARSEMEM
> > > -	VMCOREINFO_SYMBOL(mem_section);
> > > +	VMCOREINFO_ARRAY(mem_section);
> > >   	VMCOREINFO_LENGTH(mem_section, NR_SECTION_ROOTS);
> > >   	VMCOREINFO_STRUCT_SIZE(mem_section);
> > >   	VMCOREINFO_OFFSET(mem_section, section_mem_map);

^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
@ 2018-02-07 11:28                 ` Baoquan He
  0 siblings, 0 replies; 349+ messages in thread
From: Baoquan He @ 2018-02-07 11:28 UTC (permalink / raw)
  To: Dou Liyang
  Cc: Mike Galbraith, Kirill A. Shutemov, Ingo Molnar, Andrew Morton,
	Peter Zijlstra, Greg Kroah-Hartman, Linus Torvalds, kexec,
	linux-kernel, stable, Andy Lutomirski, linux-mm,
	Kirill A. Shutemov, Cyrill Gorcunov, Thomas Gleixner,
	Borislav Petkov, Dave Young, Vivek Goyal

On 02/07/18 at 05:25pm, Dou Liyang wrote:
> Hi All,
> 
> makedumpfile fails for me on an upstream kernel that contains this
> patch. Did I miss something else?

readmem: Can't convert a virtual address(ffff88007ffd7000) to physical

This should not be related to this patch; otherwise your code could not
get to that step. From the message, ffff88007ffd7000 is the end of the
last memory region, so this looks like a code bug. You are testing
5-level with makedumpfile, right?

The patches I posted to decrease the memory cost of mem map allocation
have a code bug. Fengguang's test robot sent me a mail about it, and I
have updated the patches and am trying to write a good patch log. You
might also need to check the
5-level patches you posted to makedumpfile upstream.

> 
> In fedora27 host:
> 
> [douly@localhost code]$ ./makedumpfile -d 31 --message-level 31 -x
> vmlinux_4.15+ vmcore_4.15+_from_cp_command vmcore_4.15+
> 
> sadump: does not have partition header
> sadump: read dump device as unknown format
> sadump: unknown format
> LOAD (0)
>   phys_start : 1000000
>   phys_end   : 2a86000
>   virt_start : ffffffff81000000
>   virt_end   : ffffffff82a86000
> LOAD (1)
>   phys_start : 1000
>   phys_end   : 9fc00
>   virt_start : ffff880000001000
>   virt_end   : ffff88000009fc00
> LOAD (2)
>   phys_start : 100000
>   phys_end   : 13000000
>   virt_start : ffff880000100000
>   virt_end   : ffff880013000000
> LOAD (3)
>   phys_start : 33000000
>   phys_end   : 7ffd7000
>   virt_start : ffff880033000000
>   virt_end   : ffff88007ffd7000
> Linux kdump
> page_size    : 4096
> 
> max_mapnr    : 7ffd7
> 
> Buffer size for the cyclic mode: 131061
> The kernel version is not supported.
> The makedumpfile operation may be incomplete.
> 
> num of NODEs : 1
> 
> 
> Memory type  : SPARSEMEM_EX
> 
> mem_map (0)
>   mem_map    : ffff88007ff26000
>   pfn_start  : 0
>   pfn_end    : 8000
> mem_map (1)
>   mem_map    : 0
>   pfn_start  : 8000
>   pfn_end    : 10000
> mem_map (2)
>   mem_map    : 0
>   pfn_start  : 10000
>   pfn_end    : 18000
> mem_map (3)
>   mem_map    : 0
>   pfn_start  : 18000
>   pfn_end    : 20000
> mem_map (4)
>   mem_map    : 0
>   pfn_start  : 20000
>   pfn_end    : 28000
> mem_map (5)
>   mem_map    : 0
>   pfn_start  : 28000
>   pfn_end    : 30000
> mem_map (6)
>   mem_map    : 0
>   pfn_start  : 30000
>   pfn_end    : 38000
> mem_map (7)
>   mem_map    : 0
>   pfn_start  : 38000
>   pfn_end    : 40000
> mem_map (8)
>   mem_map    : 0
>   pfn_start  : 40000
>   pfn_end    : 48000
> mem_map (9)
>   mem_map    : 0
>   pfn_start  : 48000
>   pfn_end    : 50000
> mem_map (10)
>   mem_map    : 0
>   pfn_start  : 50000
>   pfn_end    : 58000
> mem_map (11)
>   mem_map    : 0
>   pfn_start  : 58000
>   pfn_end    : 60000
> mem_map (12)
>   mem_map    : 0
>   pfn_start  : 60000
>   pfn_end    : 68000
> mem_map (13)
>   mem_map    : 0
>   pfn_start  : 68000
>   pfn_end    : 70000
> mem_map (14)
>   mem_map    : 0
>   pfn_start  : 70000
>   pfn_end    : 78000
> mem_map (15)
>   mem_map    : 0
>   pfn_start  : 78000
>   pfn_end    : 7ffd7
> mmap() is available on the kernel.
> Checking for memory holes                         : [100.0 %] |         STEP
> [Checking for memory holes  ] : 0.000060 seconds
> __vtop4_x86_64: Can't get a valid pte.
> readmem: Can't convert a virtual address(ffff88007ffd7000) to physical
> address.
> readmem: type_addr: 0, addr:ffff88007ffd7000, size:32768
> __exclude_unnecessary_pages: Can't read the buffer of struct page.
> create_2nd_bitmap: Can't exclude unnecessary pages.
> Checking for memory holes                         : [100.0 %] \         STEP
> [Checking for memory holes  ] : 0.000010 seconds
> Checking for memory holes                         : [100.0 %] -         STEP
> [Checking for memory holes  ] : 0.000004 seconds
> __vtop4_x86_64: Can't get a valid pte.
> readmem: Can't convert a virtual address(ffff88007ffd7000) to physical
> address.
> readmem: type_addr: 0, addr:ffff88007ffd7000, size:32768
> __exclude_unnecessary_pages: Can't read the buffer of struct page.
> create_2nd_bitmap: Can't exclude unnecessary pages.
> 
> Thanks,
> 	dou
> At 01/09/2018 11:44 AM, Mike Galbraith wrote:
> > On Tue, 2018-01-09 at 03:13 +0300, Kirill A. Shutemov wrote:
> > > 
> > > Mike, could you test this? (On top of the rest of the fixes.)
> > 
> > homer:..crash/2018-01-09-04:25 # ll
> > total 1863604
> > -rw------- 1 root root      66255 Jan  9 04:25 dmesg.txt
> > -rw-r--r-- 1 root root        182 Jan  9 04:25 README.txt
> > -rw-r--r-- 1 root root    2818240 Jan  9 04:25 System.map-4.15.0.gb2cd1df-master
> > -rw------- 1 root root 1832914928 Jan  9 04:25 vmcore
> > -rw-r--r-- 1 root root   72514993 Jan  9 04:25 vmlinux-4.15.0.gb2cd1df-master.gz
> > 
> > Yup, all better.
> > 
> > > Sorry for the mess.
> > 
> > (why, developers not installing shiny new bugs is a whole lot worse:)
> > 
> > >  From 100fd567754f1457be94732046aefca204c842d2 Mon Sep 17 00:00:00 2001
> > > From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> > > Date: Tue, 9 Jan 2018 02:55:47 +0300
> > > Subject: [PATCH] kdump: Write a correct address of mem_section into vmcoreinfo
> > > 
> > > Depending on configuration mem_section can now be an array or a pointer
> > > to an array allocated dynamically. In most cases, we can continue to refer
> > > to it as 'mem_section' regardless of what it is.
> > > 
> > > But there's one exception: '&mem_section' means "address of the array" if
> > > mem_section is an array, but if mem_section is a pointer, it would mean
> > > "address of the pointer".
> > > 
> > > We've stepped onto this in kdump code. VMCOREINFO_SYMBOL(mem_section)
> > > writes down address of pointer into vmcoreinfo, not array as we wanted.
> > > 
> > > Let's introduce VMCOREINFO_ARRAY() that would handle the situation
> > > correctly for both cases.
> > > 
> > > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > > Fixes: 83e3c48729d9 ("mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y")
> > > ---
> > >   include/linux/crash_core.h | 2 ++
> > >   kernel/crash_core.c        | 2 +-
> > >   2 files changed, 3 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/include/linux/crash_core.h b/include/linux/crash_core.h
> > > index 06097ef30449..83ae04950269 100644
> > > --- a/include/linux/crash_core.h
> > > +++ b/include/linux/crash_core.h
> > > @@ -42,6 +42,8 @@ phys_addr_t paddr_vmcoreinfo_note(void);
> > >   	vmcoreinfo_append_str("PAGESIZE=%ld\n", value)
> > >   #define VMCOREINFO_SYMBOL(name) \
> > >   	vmcoreinfo_append_str("SYMBOL(%s)=%lx\n", #name, (unsigned long)&name)
> > > +#define VMCOREINFO_ARRAY(name) \
> > > +	vmcoreinfo_append_str("SYMBOL(%s)=%lx\n", #name, (unsigned long)name)
> > >   #define VMCOREINFO_SIZE(name) \
> > >   	vmcoreinfo_append_str("SIZE(%s)=%lu\n", #name, \
> > >   			      (unsigned long)sizeof(name))
> > > diff --git a/kernel/crash_core.c b/kernel/crash_core.c
> > > index b3663896278e..d4122a837477 100644
> > > --- a/kernel/crash_core.c
> > > +++ b/kernel/crash_core.c
> > > @@ -410,7 +410,7 @@ static int __init crash_save_vmcoreinfo_init(void)
> > >   	VMCOREINFO_SYMBOL(contig_page_data);
> > >   #endif
> > >   #ifdef CONFIG_SPARSEMEM
> > > -	VMCOREINFO_SYMBOL(mem_section);
> > > +	VMCOREINFO_ARRAY(mem_section);
> > >   	VMCOREINFO_LENGTH(mem_section, NR_SECTION_ROOTS);
> > >   	VMCOREINFO_STRUCT_SIZE(mem_section);
> > >   	VMCOREINFO_OFFSET(mem_section, section_mem_map);
> > 
> > _______________________________________________
> > kexec mailing list
> > kexec@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/kexec
> > 
> > 
> > 
> 
> 
> 
> _______________________________________________
> kexec mailing list
> kexec@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec
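
The '&mem_section' pitfall described in the quoted patch can be
reproduced outside the kernel. Below is a minimal user-space sketch; the
section type, the two variables and the ADDR_OF_* macros are made-up
stand-ins for the kernel's mem_section and for the VMCOREINFO_SYMBOL()
and VMCOREINFO_ARRAY() macros, not the real definitions.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct mem_section. */
struct section {
	unsigned long section_mem_map;
};

/* Case 1: the symbol is a statically sized array. */
static struct section static_sections[4];

/* Case 2: the symbol is a pointer to separately allocated storage. */
static struct section backing_store[4];
static struct section *dynamic_sections = backing_store;

/* Same shape as VMCOREINFO_SYMBOL(): records &name. */
#define ADDR_OF_SYMBOL(name) ((uintptr_t)&(name))

/*
 * Same shape as VMCOREINFO_ARRAY(): records name, which decays to (or
 * already is) a pointer to the first element.
 */
#define ADDR_OF_ARRAY(name) ((uintptr_t)(name))
```

For static_sections both macros yield the same address. For
dynamic_sections, ADDR_OF_SYMBOL() yields the location of the pointer
variable itself, while ADDR_OF_ARRAY() yields the address of the storage
it points to, which is the value a dump tool wants to find in
vmcoreinfo.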

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
  2018-02-07 10:45                   ` Mike Galbraith
  (?)
@ 2018-02-07 12:00                     ` Dou Liyang
  -1 siblings, 0 replies; 349+ messages in thread
From: Dou Liyang @ 2018-02-07 12:00 UTC (permalink / raw)
  To: Mike Galbraith, Kirill A. Shutemov
  Cc: Ingo Molnar, Andrew Morton, Baoquan He, Peter Zijlstra,
	Greg Kroah-Hartman, Dave Young, kexec, linux-kernel, stable,
	Andy Lutomirski, linux-mm, Vivek Goyal, Cyrill Gorcunov,
	Thomas Gleixner, Borislav Petkov, Linus Torvalds,
	Kirill A. Shutemov, Takao Indoh

Hi Kirill, Mike

At 02/07/2018 06:45 PM, Mike Galbraith wrote:
> On Wed, 2018-02-07 at 13:41 +0300, Kirill A. Shutemov wrote:
>> On Wed, Feb 07, 2018 at 05:25:05PM +0800, Dou Liyang wrote:
>>> Hi All,
>>>
>>> I met the makedumpfile failed in the upstream kernel which contained
>>> this patch. Did I missed something else?
>>
>> None I'm aware of.
>>
>> Is there a reason to suspect that the issue is related to the bug this patch
>> fixed?
> 

I did a comparison test at my colleague Indoh's suggestion.

With your two commits reverted:

commit 83e3c48729d9ebb7af5a31a504f3fd6aff0348c4
Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Date:   Fri Sep 29 17:08:16 2017 +0300

commit 629a359bdb0e0652a8227b4ff3125431995fec6e
Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Date:   Tue Nov 7 11:33:37 2017 +0300

...and everything else unchanged, makedumpfile works well.

> Still works fine for me with .today.  Box is only 16GB desktop box though.
> 
Btw, in the upstream kernel which contains this patch, I did two tests:

  1) Use makedumpfile as core_collector in /etc/kdump.conf, then
trigger kdump with echo 1 >/proc/sysrq-trigger. makedumpfile works
well and I can get the vmcore file.

      ......It is OK

  2) Use cp as core_collector and do the same operation to get the
vmcore file, then run makedumpfile on it as above:

     [douly@localhost code]$ ./makedumpfile -d 31 --message-level 31 -x
vmlinux_4.15+ vmcore_4.15+_from_cp_command vmcore_4.15+

     ......This causes makedumpfile to fail.


Thanks,
	dou.

> 	-Mike
> 
> 
> 

^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
  2018-02-07 12:00                     ` Dou Liyang
  (?)
@ 2018-02-07 12:08                       ` Baoquan He
  -1 siblings, 0 replies; 349+ messages in thread
From: Baoquan He @ 2018-02-07 12:08 UTC (permalink / raw)
  To: Dou Liyang
  Cc: Mike Galbraith, Kirill A. Shutemov, Ingo Molnar, Andrew Morton,
	Peter Zijlstra, Greg Kroah-Hartman, Dave Young, kexec,
	linux-kernel, stable, Andy Lutomirski, linux-mm, Vivek Goyal,
	Cyrill Gorcunov, Thomas Gleixner, Borislav Petkov,
	Linus Torvalds, Kirill A. Shutemov, Takao Indoh

On 02/07/18 at 08:00pm, Dou Liyang wrote:
> Hi Kirill,Mike
> 
> At 02/07/2018 06:45 PM, Mike Galbraith wrote:
> > On Wed, 2018-02-07 at 13:41 +0300, Kirill A. Shutemov wrote:
> > > On Wed, Feb 07, 2018 at 05:25:05PM +0800, Dou Liyang wrote:
> > > > Hi All,
> > > > 
> > > > I met the makedumpfile failed in the upstream kernel which contained
> > > > this patch. Did I missed something else?
> > > 
> > > None I'm aware of.
> > > 
> > > Is there a reason to suspect that the issue is related to the bug this patch
> > > fixed?
> > 
> 
> I did a contrastive test by my colleagues Indoh's suggestion.
> 
> Revert your two commits:
> 
> commit 83e3c48729d9ebb7af5a31a504f3fd6aff0348c4
> Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Date:   Fri Sep 29 17:08:16 2017 +0300
> 
> commit 629a359bdb0e0652a8227b4ff3125431995fec6e
> Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Date:   Tue Nov 7 11:33:37 2017 +0300
> 
> ...and keep others unchanged, the makedumpfile works well.
> 
> > Still works fine for me with .today.  Box is only 16GB desktop box though.
> > 
> Btw, In the upstream kernel which contained this patch, I did two tests:
> 
>  1) use the makedumpfile as core_collector in /etc/kdump.conf, then
> trigger the process of kdump by echo 1 >/proc/sysrq-trigger, the
> makedumpfile works well and I can get the vmcore file.
> 
>      ......It is OK
> 
>  2) use cp as core_collector, do the same operation to get the vmcore file.
> then use makedumpfile to do like above:
> 
>     [douly@localhost code]$ ./makedumpfile -d 31 --message-level 31 -x
> vmlinux_4.15+ vmcore_4.15+_from_cp_command vmcore_4.15+

Oh, then please ignore my previous comment. Adding '-D' can give more
debugging messages.

> 
>     ......It causes makedumpfile failed.
> 
> 
> Thanks,
> 	dou.
> 
> > 	-Mike
> > 
> > 
> > 
> 
> 

^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
  2018-02-07 12:08                       ` Baoquan He
  (?)
  (?)
@ 2018-02-07 12:17                         ` Dou Liyang
  -1 siblings, 0 replies; 349+ messages in thread
From: Dou Liyang @ 2018-02-07 12:17 UTC (permalink / raw)
  To: Baoquan He
  Cc: Mike Galbraith, Kirill A. Shutemov, Ingo Molnar, Andrew Morton,
	Peter Zijlstra, Greg Kroah-Hartman, Dave Young, kexec,
	linux-kernel, stable, Andy Lutomirski, linux-mm, Vivek Goyal,
	Cyrill Gorcunov, Thomas Gleixner, Borislav Petkov,
	Linus Torvalds, Kirill A. Shutemov, Takao Indoh

Hi Baoquan,

At 02/07/2018 08:08 PM, Baoquan He wrote:
> On 02/07/18 at 08:00pm, Dou Liyang wrote:
>> Hi Kirill,Mike
>>
>> At 02/07/2018 06:45 PM, Mike Galbraith wrote:
>>> On Wed, 2018-02-07 at 13:41 +0300, Kirill A. Shutemov wrote:
>>>> On Wed, Feb 07, 2018 at 05:25:05PM +0800, Dou Liyang wrote:
>>>>> Hi All,
>>>>>
>>>>> I met the makedumpfile failed in the upstream kernel which contained
>>>>> this patch. Did I missed something else?
>>>>
>>>> None I'm aware of.
>>>>
>>>> Is there a reason to suspect that the issue is related to the bug this patch
>>>> fixed?
>>>
>>
>> I did a contrastive test by my colleagues Indoh's suggestion.
>>
>> Revert your two commits:
>>
>> commit 83e3c48729d9ebb7af5a31a504f3fd6aff0348c4
>> Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>> Date:   Fri Sep 29 17:08:16 2017 +0300
>>
>> commit 629a359bdb0e0652a8227b4ff3125431995fec6e
>> Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>> Date:   Tue Nov 7 11:33:37 2017 +0300
>>
>> ...and keep others unchanged, the makedumpfile works well.
>>
>>> Still works fine for me with .today.  Box is only 16GB desktop box though.
>>>
>> Btw, In the upstream kernel which contained this patch, I did two tests:
>>
>>   1) use the makedumpfile as core_collector in /etc/kdump.conf, then
>> trigger the process of kdump by echo 1 >/proc/sysrq-trigger, the
>> makedumpfile works well and I can get the vmcore file.
>>
>>       ......It is OK
>>
>>   2) use cp as core_collector, do the same operation to get the vmcore file.
>> then use makedumpfile to do like above:
>>
>>      [douly@localhost code]$ ./makedumpfile -d 31 --message-level 31 -x
>> vmlinux_4.15+ vmcore_4.15+_from_cp_command vmcore_4.15+
> 
> Oh, then please ignore my previous comment. Adding '-D' can give more
> debugging message.

I added '-D', but as before it gives no additional debugging messages
(full output below).

BTW, I used crash to analyze the vmcore file created by the 'cp' command:

    ./crash ../makedumpfile/code/vmcore_4.15+_from_cp_command ../makedumpfile/code/vmlinux_4.15+

crash works well on it, which is interesting.

Thanks,
	dou.

The debugging output with '-D':

[douly@localhost code]$ ./makedumpfile -D -d 31 --message-level 31 -x vmlinux_4.15+ vmcore_4.15+_from_cp_command vmcore_4.15+
sadump: does not have partition header
sadump: read dump device as unknown format
sadump: unknown format
LOAD (0)
   phys_start : 1000000
   phys_end   : 2a86000
   virt_start : ffffffff81000000
   virt_end   : ffffffff82a86000
LOAD (1)
   phys_start : 1000
   phys_end   : 9fc00
   virt_start : ffff880000001000
   virt_end   : ffff88000009fc00
LOAD (2)
   phys_start : 100000
   phys_end   : 13000000
   virt_start : ffff880000100000
   virt_end   : ffff880013000000
LOAD (3)
   phys_start : 33000000
   phys_end   : 7ffd7000
   virt_start : ffff880033000000
   virt_end   : ffff88007ffd7000
Linux kdump
page_size    : 4096

max_mapnr    : 7ffd7

Buffer size for the cyclic mode: 131061
The kernel version is not supported.
The makedumpfile operation may be incomplete.

num of NODEs : 1


Memory type  : SPARSEMEM_EX

mem_map (0)
   mem_map    : ffff88007ff26000
   pfn_start  : 0
   pfn_end    : 8000
mem_map (1)
   mem_map    : 0
   pfn_start  : 8000
   pfn_end    : 10000
mem_map (2)
   mem_map    : 0
   pfn_start  : 10000
   pfn_end    : 18000
mem_map (3)
   mem_map    : 0
   pfn_start  : 18000
   pfn_end    : 20000
mem_map (4)
   mem_map    : 0
   pfn_start  : 20000
   pfn_end    : 28000
mem_map (5)
   mem_map    : 0
   pfn_start  : 28000
   pfn_end    : 30000
mem_map (6)
   mem_map    : 0
   pfn_start  : 30000
   pfn_end    : 38000
mem_map (7)
   mem_map    : 0
   pfn_start  : 38000
   pfn_end    : 40000
mem_map (8)
   mem_map    : 0
   pfn_start  : 40000
   pfn_end    : 48000
mem_map (9)
   mem_map    : 0
   pfn_start  : 48000
   pfn_end    : 50000
mem_map (10)
   mem_map    : 0
   pfn_start  : 50000
   pfn_end    : 58000
mem_map (11)
   mem_map    : 0
   pfn_start  : 58000
   pfn_end    : 60000
mem_map (12)
   mem_map    : 0
   pfn_start  : 60000
   pfn_end    : 68000
mem_map (13)
   mem_map    : 0
   pfn_start  : 68000
   pfn_end    : 70000
mem_map (14)
   mem_map    : 0
   pfn_start  : 70000
   pfn_end    : 78000
mem_map (15)
   mem_map    : 0
   pfn_start  : 78000
   pfn_end    : 7ffd7
mmap() is available on the kernel.
Checking for memory holes                         : [100.0 %] | 
         STEP [Checking for memory holes  ] : 0.000014 seconds
__vtop4_x86_64: Can't get a valid pte.
readmem: Can't convert a virtual address(ffff88007ffd7000) to physical address.
readmem: type_addr: 0, addr:ffff88007ffd7000, size:32768
__exclude_unnecessary_pages: Can't read the buffer of struct page.
create_2nd_bitmap: Can't exclude unnecessary pages.
Checking for memory holes                         : [100.0 %] \ 
         STEP [Checking for memory holes  ] : 0.000006 seconds
Checking for memory holes                         : [100.0 %] - 
         STEP [Checking for memory holes  ] : 0.000004 seconds
__vtop4_x86_64: Can't get a valid pte.
readmem: Can't convert a virtual address(ffff88007ffd7000) to physical address.
readmem: type_addr: 0, addr:ffff88007ffd7000, size:32768
__exclude_unnecessary_pages: Can't read the buffer of struct page.
create_2nd_bitmap: Can't exclude unnecessary pages.

makedumpfile Failed.
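
One thing that stands out in the output above: the address readmem fails on, ffff88007ffd7000, is exactly the virt_end of LOAD (3). Below is a minimal Python sketch of that check, assuming each LOAD range is end-exclusive. The helper name vaddr_to_paddr is hypothetical and this is not makedumpfile's code; the failing path in the log is the __vtop4_x86_64 page-table walk, not a LOAD lookup.

```python
# LOAD segments copied from the '-D' output:
# (phys_start, phys_end, virt_start, virt_end)
LOADS = [
    (0x1000000, 0x2a86000, 0xffffffff81000000, 0xffffffff82a86000),
    (0x1000, 0x9fc00, 0xffff880000001000, 0xffff88000009fc00),
    (0x100000, 0x13000000, 0xffff880000100000, 0xffff880013000000),
    (0x33000000, 0x7ffd7000, 0xffff880033000000, 0xffff88007ffd7000),
]


def vaddr_to_paddr(vaddr):
    """Translate vaddr through the LOAD table; return None if unmapped.

    Assumption: virt_end is exclusive (first address past the segment).
    """
    for phys_start, _phys_end, virt_start, virt_end in LOADS:
        if virt_start <= vaddr < virt_end:
            return phys_start + (vaddr - virt_start)
    return None


MEM_MAP_0 = 0xffff88007ff26000  # mem_map (0) from the log
FAILING = 0xffff88007ffd7000    # address readmem could not convert

for name, addr in (("mem_map(0)", MEM_MAP_0), ("failing", FAILING)):
    paddr = vaddr_to_paddr(addr)
    shown = hex(paddr) if paddr is not None else "not covered by any LOAD"
    print(f"{name}: {addr:#x} -> {shown}")
```

Under that assumption the mem_map (0) address resolves to physical 7ff26000, while the failing address is the first address past the last LOAD range.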

> 
>>
>>      ......It causes makedumpfile failed.
>>
>>
>> Thanks,
>> 	dou.
>>
>>> 	-Mike
>>>
>>>
>>>
>>
>>
> 
> 
> 

^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
  2018-02-07 12:17                         ` Dou Liyang
  (?)
@ 2018-02-07 12:27                           ` Baoquan He
  -1 siblings, 0 replies; 349+ messages in thread
From: Baoquan He @ 2018-02-07 12:27 UTC (permalink / raw)
  To: Dou Liyang
  Cc: Mike Galbraith, Kirill A. Shutemov, Ingo Molnar, Andrew Morton,
	Peter Zijlstra, Greg Kroah-Hartman, Dave Young, kexec,
	linux-kernel, stable, Andy Lutomirski, linux-mm, Vivek Goyal,
	Cyrill Gorcunov, Thomas Gleixner, Borislav Petkov,
	Linus Torvalds, Kirill A. Shutemov, Takao Indoh

On 02/07/18 at 08:17pm, Dou Liyang wrote:
> Hi Baoquan,
> 
> At 02/07/2018 08:08 PM, Baoquan He wrote:
> > On 02/07/18 at 08:00pm, Dou Liyang wrote:
> > > Hi Kirill,Mike
> > > 
> > > At 02/07/2018 06:45 PM, Mike Galbraith wrote:
> > > > On Wed, 2018-02-07 at 13:41 +0300, Kirill A. Shutemov wrote:
> > > > > On Wed, Feb 07, 2018 at 05:25:05PM +0800, Dou Liyang wrote:
> > > > > > Hi All,
> > > > > > 
> > > > > > I met the makedumpfile failed in the upstream kernel which contained
> > > > > > this patch. Did I missed something else?
> > > > > 
> > > > > None I'm aware of.
> > > > > 
> > > > > Is there a reason to suspect that the issue is related to the bug this patch
> > > > > fixed?
> > > > 
> > > 
> > > I did a contrastive test by my colleagues Indoh's suggestion.
> > > 
> > > Revert your two commits:
> > > 
> > > commit 83e3c48729d9ebb7af5a31a504f3fd6aff0348c4
> > > Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > > Date:   Fri Sep 29 17:08:16 2017 +0300
> > > 
> > > commit 629a359bdb0e0652a8227b4ff3125431995fec6e
> > > Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > > Date:   Tue Nov 7 11:33:37 2017 +0300
> > > 
> > > ...and keep others unchanged, the makedumpfile works well.
> > > 
> > > > Still works fine for me with .today.  Box is only 16GB desktop box though.
> > > > 
> > > Btw, In the upstream kernel which contained this patch, I did two tests:
> > > 
> > >   1) use the makedumpfile as core_collector in /etc/kdump.conf, then
> > > trigger the process of kdump by echo 1 >/proc/sysrq-trigger, the
> > > makedumpfile works well and I can get the vmcore file.
> > > 
> > >       ......It is OK
> > > 
> > >   2) use cp as core_collector, do the same operation to get the vmcore file.
> > > then use makedumpfile to do like above:
> > > 
> > >      [douly@localhost code]$ ./makedumpfile -d 31 --message-level 31 -x
> > > vmlinux_4.15+ vmcore_4.15+_from_cp_command vmcore_4.15+
> > 
> > Oh, then please ignore my previous comment. Adding '-D' can give more
> > debugging message.
> 
> I added '-D', Just like before, no more debugging message:
> 
> BTW, I use crash to analyze the vmcore file created by 'cp' command.
> 
>    ./crash ../makedumpfile/code/vmcore_4.15+_from_cp_command
> ../makedumpfile/code/vmlinux_4.15+
> 
> the crash works well, It's so interesting.
> 
> Thanks,
> 	dou.
> 
> The debugging message with '-D':

And what is the debugging output when the crash is triggered by sysrq?

> 
> [douly@localhost code]$ ./makedumpfile -D -d 31 --message-level 31 -x
> vmlinux_4.15+  vmcore_4.15+_from_cp_command vmcore_4.15+
> sadump: does not have partition header
> sadump: read dump device as unknown format
> sadump: unknown format
> LOAD (0)
>   phys_start : 1000000
>   phys_end   : 2a86000
>   virt_start : ffffffff81000000
>   virt_end   : ffffffff82a86000
> LOAD (1)
>   phys_start : 1000
>   phys_end   : 9fc00
>   virt_start : ffff880000001000
>   virt_end   : ffff88000009fc00
> LOAD (2)
>   phys_start : 100000
>   phys_end   : 13000000
>   virt_start : ffff880000100000
>   virt_end   : ffff880013000000
> LOAD (3)
>   phys_start : 33000000
>   phys_end   : 7ffd7000
>   virt_start : ffff880033000000
>   virt_end   : ffff88007ffd7000
> Linux kdump
> page_size    : 4096
> 
> max_mapnr    : 7ffd7
> 
> Buffer size for the cyclic mode: 131061
> The kernel version is not supported.
> The makedumpfile operation may be incomplete.
> 
> num of NODEs : 1
> 
> 
> Memory type  : SPARSEMEM_EX
> 
> mem_map (0)
>   mem_map    : ffff88007ff26000
>   pfn_start  : 0
>   pfn_end    : 8000
> mem_map (1)
>   mem_map    : 0
>   pfn_start  : 8000
>   pfn_end    : 10000
> mem_map (2)
>   mem_map    : 0
>   pfn_start  : 10000
>   pfn_end    : 18000
> mem_map (3)
>   mem_map    : 0
>   pfn_start  : 18000
>   pfn_end    : 20000
> mem_map (4)
>   mem_map    : 0
>   pfn_start  : 20000
>   pfn_end    : 28000
> mem_map (5)
>   mem_map    : 0
>   pfn_start  : 28000
>   pfn_end    : 30000
> mem_map (6)
>   mem_map    : 0
>   pfn_start  : 30000
>   pfn_end    : 38000
> mem_map (7)
>   mem_map    : 0
>   pfn_start  : 38000
>   pfn_end    : 40000
> mem_map (8)
>   mem_map    : 0
>   pfn_start  : 40000
>   pfn_end    : 48000
> mem_map (9)
>   mem_map    : 0
>   pfn_start  : 48000
>   pfn_end    : 50000
> mem_map (10)
>   mem_map    : 0
>   pfn_start  : 50000
>   pfn_end    : 58000
> mem_map (11)
>   mem_map    : 0
>   pfn_start  : 58000
>   pfn_end    : 60000
> mem_map (12)
>   mem_map    : 0
>   pfn_start  : 60000
>   pfn_end    : 68000
> mem_map (13)
>   mem_map    : 0
>   pfn_start  : 68000
>   pfn_end    : 70000
> mem_map (14)
>   mem_map    : 0
>   pfn_start  : 70000
>   pfn_end    : 78000
> mem_map (15)
>   mem_map    : 0
>   pfn_start  : 78000
>   pfn_end    : 7ffd7
> mmap() is available on the kernel.
> Checking for memory holes                         : [100.0 %] |         STEP
> [Checking for memory holes  ] : 0.000014 seconds
> __vtop4_x86_64: Can't get a valid pte.
> readmem: Can't convert a virtual address(ffff88007ffd7000) to physical
> address.
> readmem: type_addr: 0, addr:ffff88007ffd7000, size:32768
> __exclude_unnecessary_pages: Can't read the buffer of struct page.
> create_2nd_bitmap: Can't exclude unnecessary pages.
> Checking for memory holes                         : [100.0 %] \         STEP
> [Checking for memory holes  ] : 0.000006 seconds
> Checking for memory holes                         : [100.0 %] -         STEP
> [Checking for memory holes  ] : 0.000004 seconds
> __vtop4_x86_64: Can't get a valid pte.
> readmem: Can't convert a virtual address(ffff88007ffd7000) to physical
> address.
> readmem: type_addr: 0, addr:ffff88007ffd7000, size:32768
> __exclude_unnecessary_pages: Can't read the buffer of struct page.
> create_2nd_bitmap: Can't exclude unnecessary pages.
> 
> makedumpfile Failed.
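
For reference, the mem_map table quoted above is consistent with 0x8000 pfns per section, which is 128 MiB at the reported page_size of 4096, and max_mapnr 7ffd7 then gives 16 sections. Here is a small Python sketch of that arithmetic; the constants are read from the log, and the function name section_ranges is illustrative only.

```python
PAGE_SIZE = 4096           # "page_size" from the log
MAX_MAPNR = 0x7ffd7        # "max_mapnr" from the log
PFNS_PER_SECTION = 0x8000  # pfn_end - pfn_start of mem_map (0)

# Bytes of physical memory covered by one section.
section_bytes = PFNS_PER_SECTION * PAGE_SIZE


def section_ranges(max_mapnr, pfns_per_section):
    """Return the (pfn_start, pfn_end) pairs, clipping the last one."""
    ranges = []
    start = 0
    while start < max_mapnr:
        end = min(start + pfns_per_section, max_mapnr)
        ranges.append((start, end))
        start = end
    return ranges


ranges = section_ranges(MAX_MAPNR, PFNS_PER_SECTION)
print(f"section size: {section_bytes // (1024 * 1024)} MiB")
print(f"sections: {len(ranges)}")
print(f"last range: {ranges[-1][0]:x}..{ranges[-1][1]:x}")
```

The computed ranges match the pfn_start/pfn_end pairs of mem_map (0) through mem_map (15). In the log only mem_map (0) carries a non-zero mem_map address; the other 15 entries are 0.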
> 
> > 
> > > 
> > >      ......It causes makedumpfile failed.
> > > 
> > > 
> > > Thanks,
> > > 	dou.
> > > 
> > > > 	-Mike
> > > > 
> > > > 
> > > > 
> > > 
> > > 
> > 
> > 
> > 
> 
> 

^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
@ 2018-02-07 12:27                           ` Baoquan He
  0 siblings, 0 replies; 349+ messages in thread
From: Baoquan He @ 2018-02-07 12:27 UTC (permalink / raw)
  To: Dou Liyang
  Cc: Mike Galbraith, Kirill A. Shutemov, Ingo Molnar, Andrew Morton,
	Peter Zijlstra, Greg Kroah-Hartman, Dave Young, kexec,
	linux-kernel, stable, Andy Lutomirski, linux-mm, Vivek Goyal,
	Cyrill Gorcunov, Thomas Gleixner, Borislav Petkov,
	Linus Torvalds, Kirill A. Shutemov, Takao Indoh

On 02/07/18 at 08:17pm, Dou Liyang wrote:
> Hi Baoquan,
> 
> At 02/07/2018 08:08 PM, Baoquan He wrote:
> > On 02/07/18 at 08:00pm, Dou Liyang wrote:
> > > Hi Kirill,Mike
> > > 
> > > At 02/07/2018 06:45 PM, Mike Galbraith wrote:
> > > > On Wed, 2018-02-07 at 13:41 +0300, Kirill A. Shutemov wrote:
> > > > > On Wed, Feb 07, 2018 at 05:25:05PM +0800, Dou Liyang wrote:
> > > > > > Hi All,
> > > > > > 
> > > > > > I met the makedumpfile failed in the upstream kernel which contained
> > > > > > this patch. Did I missed something else?
> > > > > 
> > > > > None I'm aware of.
> > > > > 
> > > > > Is there a reason to suspect that the issue is related to the bug this patch
> > > > > fixed?
> > > > 
> > > 
> > > I did a contrastive test by my colleagues Indoh's suggestion.
> > > 
> > > Revert your two commits:
> > > 
> > > commit 83e3c48729d9ebb7af5a31a504f3fd6aff0348c4
> > > Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > > Date:   Fri Sep 29 17:08:16 2017 +0300
> > > 
> > > commit 629a359bdb0e0652a8227b4ff3125431995fec6e
> > > Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > > Date:   Tue Nov 7 11:33:37 2017 +0300
> > > 
> > > ...and keep others unchanged, the makedumpfile works well.
> > > 
> > > > Still works fine for me with .today.  Box is only 16GB desktop box though.
> > > > 
> > > Btw, In the upstream kernel which contained this patch, I did two tests:
> > > 
> > >   1) use the makedumpfile as core_collector in /etc/kdump.conf, then
> > > trigger the process of kdump by echo 1 >/proc/sysrq-trigger, the
> > > makedumpfile works well and I can get the vmcore file.
> > > 
> > >       ......It is OK
> > > 
> > >   2) use cp as core_collector, do the same operation to get the vmcore file.
> > > then use makedumpfile to do like above:
> > > 
> > >      [douly@localhost code]$ ./makedumpfile -d 31 --message-level 31 -x
> > > vmlinux_4.15+ vmcore_4.15+_from_cp_command vmcore_4.15+
> > 
> > Oh, then please ignore my previous comment. Adding '-D' gives more
> > debugging messages.
> 
> I added '-D'. Just like before, there are no additional debugging messages.
> 
> BTW, I used crash to analyze the vmcore file created by the 'cp' command:
> 
>    ./crash ../makedumpfile/code/vmcore_4.15+_from_cp_command ../makedumpfile/code/vmlinux_4.15+
> 
> crash works well, which is interesting.
> 
> Thanks,
> 	dou.
> 
> The debugging messages with '-D':

And what is the debug output when the crash is triggered by sysrq?

> 
> [douly@localhost code]$ ./makedumpfile -D -d 31 --message-level 31 -x vmlinux_4.15+ vmcore_4.15+_from_cp_command vmcore_4.15+
> sadump: does not have partition header
> sadump: read dump device as unknown format
> sadump: unknown format
> LOAD (0)
>   phys_start : 1000000
>   phys_end   : 2a86000
>   virt_start : ffffffff81000000
>   virt_end   : ffffffff82a86000
> LOAD (1)
>   phys_start : 1000
>   phys_end   : 9fc00
>   virt_start : ffff880000001000
>   virt_end   : ffff88000009fc00
> LOAD (2)
>   phys_start : 100000
>   phys_end   : 13000000
>   virt_start : ffff880000100000
>   virt_end   : ffff880013000000
> LOAD (3)
>   phys_start : 33000000
>   phys_end   : 7ffd7000
>   virt_start : ffff880033000000
>   virt_end   : ffff88007ffd7000
> Linux kdump
> page_size    : 4096
> 
> max_mapnr    : 7ffd7
> 
> Buffer size for the cyclic mode: 131061
> The kernel version is not supported.
> The makedumpfile operation may be incomplete.
> 
> num of NODEs : 1
> 
> 
> Memory type  : SPARSEMEM_EX
> 
> mem_map (0)
>   mem_map    : ffff88007ff26000
>   pfn_start  : 0
>   pfn_end    : 8000
> mem_map (1)
>   mem_map    : 0
>   pfn_start  : 8000
>   pfn_end    : 10000
> mem_map (2)
>   mem_map    : 0
>   pfn_start  : 10000
>   pfn_end    : 18000
> mem_map (3)
>   mem_map    : 0
>   pfn_start  : 18000
>   pfn_end    : 20000
> mem_map (4)
>   mem_map    : 0
>   pfn_start  : 20000
>   pfn_end    : 28000
> mem_map (5)
>   mem_map    : 0
>   pfn_start  : 28000
>   pfn_end    : 30000
> mem_map (6)
>   mem_map    : 0
>   pfn_start  : 30000
>   pfn_end    : 38000
> mem_map (7)
>   mem_map    : 0
>   pfn_start  : 38000
>   pfn_end    : 40000
> mem_map (8)
>   mem_map    : 0
>   pfn_start  : 40000
>   pfn_end    : 48000
> mem_map (9)
>   mem_map    : 0
>   pfn_start  : 48000
>   pfn_end    : 50000
> mem_map (10)
>   mem_map    : 0
>   pfn_start  : 50000
>   pfn_end    : 58000
> mem_map (11)
>   mem_map    : 0
>   pfn_start  : 58000
>   pfn_end    : 60000
> mem_map (12)
>   mem_map    : 0
>   pfn_start  : 60000
>   pfn_end    : 68000
> mem_map (13)
>   mem_map    : 0
>   pfn_start  : 68000
>   pfn_end    : 70000
> mem_map (14)
>   mem_map    : 0
>   pfn_start  : 70000
>   pfn_end    : 78000
> mem_map (15)
>   mem_map    : 0
>   pfn_start  : 78000
>   pfn_end    : 7ffd7
> mmap() is available on the kernel.
> Checking for memory holes                         : [100.0 %] |         STEP
> [Checking for memory holes  ] : 0.000014 seconds
> __vtop4_x86_64: Can't get a valid pte.
> readmem: Can't convert a virtual address(ffff88007ffd7000) to physical
> address.
> readmem: type_addr: 0, addr:ffff88007ffd7000, size:32768
> __exclude_unnecessary_pages: Can't read the buffer of struct page.
> create_2nd_bitmap: Can't exclude unnecessary pages.
> Checking for memory holes                         : [100.0 %] \         STEP
> [Checking for memory holes  ] : 0.000006 seconds
> Checking for memory holes                         : [100.0 %] -         STEP
> [Checking for memory holes  ] : 0.000004 seconds
> __vtop4_x86_64: Can't get a valid pte.
> readmem: Can't convert a virtual address(ffff88007ffd7000) to physical
> address.
> readmem: type_addr: 0, addr:ffff88007ffd7000, size:32768
> __exclude_unnecessary_pages: Can't read the buffer of struct page.
> create_2nd_bitmap: Can't exclude unnecessary pages.
> 
> makedumpfile Failed.
> 
> > 
> > > 
> > >      ......It causes makedumpfile to fail.
> > > 
> > > 
> > > Thanks,
> > > 	dou.
> > > 
> > > > 	-Mike
> > > > 
> > > > 
> > > > 
> > > 
> > > 
> > 
> > 
> > 
> 
> 


^ permalink raw reply	[flat|nested] 349+ messages in thread
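
One detail in the failing '-D' output quoted above can be checked mechanically: the address that readmem cannot translate, ffff88007ffd7000, is exactly the virt_end of LOAD (3), while the mem_map (0) value ffff88007ff26000 falls inside that segment. The Python sketch below only illustrates that observation and is not part of makedumpfile; the helper names (parse_loads, covering_segment) are hypothetical, and it assumes each LOAD range is half-open, [virt_start, virt_end).

```python
import re

# LOAD segments copied from the failing 'makedumpfile -D' output in this thread.
DEBUG_OUTPUT = """
LOAD (0)
  phys_start : 1000000
  phys_end   : 2a86000
  virt_start : ffffffff81000000
  virt_end   : ffffffff82a86000
LOAD (1)
  phys_start : 1000
  phys_end   : 9fc00
  virt_start : ffff880000001000
  virt_end   : ffff88000009fc00
LOAD (2)
  phys_start : 100000
  phys_end   : 13000000
  virt_start : ffff880000100000
  virt_end   : ffff880013000000
LOAD (3)
  phys_start : 33000000
  phys_end   : 7ffd7000
  virt_start : ffff880033000000
  virt_end   : ffff88007ffd7000
"""


def parse_loads(text):
    """Return [(virt_start, virt_end), ...] in the order the LOAD blocks appear."""
    starts = re.findall(r"virt_start\s*:\s*([0-9a-f]+)", text)
    ends = re.findall(r"virt_end\s*:\s*([0-9a-f]+)", text)
    return [(int(s, 16), int(e, 16)) for s, e in zip(starts, ends)]


def covering_segment(loads, addr):
    """Index of the LOAD segment containing addr, or None.

    Assumption: ranges are half-open, so virt_end itself is not covered.
    """
    for index, (start, end) in enumerate(loads):
        if start <= addr < end:
            return index
    return None


loads = parse_loads(DEBUG_OUTPUT)
mem_map_0 = 0xFFFF88007FF26000      # mem_map (0) in the failing run
failing_addr = 0xFFFF88007FFD7000   # address reported by readmem

print("mem_map (0) is in LOAD:", covering_segment(loads, mem_map_0))
print("failing address is in LOAD:", covering_segment(loads, failing_addr))
print("failing address == virt_end of LOAD (3):", failing_addr == loads[3][1])
```

Under these assumptions the failing read starts inside LOAD (3) and stops at its very end, which suggests the reader walked off the last dumped segment rather than hitting a hole in the middle of it.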

* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
  2018-02-07 12:27                           ` Baoquan He
  (?)
  (?)
@ 2018-02-07 12:34                             ` Dou Liyang
  -1 siblings, 0 replies; 349+ messages in thread
From: Dou Liyang @ 2018-02-07 12:34 UTC (permalink / raw)
  To: Baoquan He
  Cc: Mike Galbraith, Kirill A. Shutemov, Ingo Molnar, Andrew Morton,
	Peter Zijlstra, Greg Kroah-Hartman, Dave Young, kexec,
	linux-kernel, stable, Andy Lutomirski, linux-mm, Vivek Goyal,
	Cyrill Gorcunov, Thomas Gleixner, Borislav Petkov,
	Linus Torvalds, Kirill A. Shutemov, Takao Indoh



At 02/07/2018 08:27 PM, Baoquan He wrote:
> On 02/07/18 at 08:17pm, Dou Liyang wrote:
>> Hi Baoquan,
>>
>> At 02/07/2018 08:08 PM, Baoquan He wrote:
>>> On 02/07/18 at 08:00pm, Dou Liyang wrote:
>>>> Hi Kirill,Mike
>>>>
>>>> At 02/07/2018 06:45 PM, Mike Galbraith wrote:
>>>>> On Wed, 2018-02-07 at 13:41 +0300, Kirill A. Shutemov wrote:
>>>>>> On Wed, Feb 07, 2018 at 05:25:05PM +0800, Dou Liyang wrote:
>>>>>>> Hi All,
>>>>>>>
>>>>>>> I hit a makedumpfile failure in the upstream kernel which contains
>>>>>>> this patch. Did I miss something else?
>>>>>>
>>>>>> None I'm aware of.
>>>>>>
>>>>>> Is there a reason to suspect that the issue is related to the bug this patch
>>>>>> fixed?
>>>>>
>>>>
>>>> I ran a comparison test at my colleague Indoh's suggestion.
>>>>
>>>> With your two commits reverted:
>>>>
>>>> commit 83e3c48729d9ebb7af5a31a504f3fd6aff0348c4
>>>> Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>>>> Date:   Fri Sep 29 17:08:16 2017 +0300
>>>>
>>>> commit 629a359bdb0e0652a8227b4ff3125431995fec6e
>>>> Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>>>> Date:   Tue Nov 7 11:33:37 2017 +0300
>>>>
>>>> ...and everything else unchanged, makedumpfile works well.
>>>>
>>>>> Still works fine for me with .today.  Box is only 16GB desktop box though.
>>>>>
>>>> Btw, in the upstream kernel which contains this patch, I ran two tests:
>>>>
>>>>    1) Use makedumpfile as core_collector in /etc/kdump.conf, then
>>>> trigger kdump with echo 1 >/proc/sysrq-trigger. makedumpfile
>>>> works well and I get the vmcore file.
>>>>
>>>>        ......It is OK
>>>>
>>>>    2) Use cp as core_collector and repeat the same steps to get the vmcore
>>>> file, then run makedumpfile on it as above:
>>>>
>>>>       [douly@localhost code]$ ./makedumpfile -d 31 --message-level 31 -x vmlinux_4.15+ vmcore_4.15+_from_cp_command vmcore_4.15+
>>>
>>> Oh, then please ignore my previous comment. Adding '-D' gives more
>>> debugging messages.
>>
>> I added '-D'. Just like before, there are no additional debugging messages.
>>
>> BTW, I used crash to analyze the vmcore file created by the 'cp' command:
>>
>>     ./crash ../makedumpfile/code/vmcore_4.15+_from_cp_command ../makedumpfile/code/vmlinux_4.15+
>>
>> crash works well, which is interesting.
>>
>> Thanks,
>> 	dou.
>>
>> The debugging messages with '-D':
> 
> And what is the debug output when the crash is triggered by sysrq?
> 

kdump: dump target is /dev/vda2
kdump: saving to /sysroot//var/crash/127.0.0.1-2018-02-07-07:31:56/
[    2.751352] EXT4-fs (vda2): re-mounted. Opts: data=ordered
kdump: saving vmcore-dmesg.txt
kdump: saving vmcore-dmesg.txt complete
kdump: saving vmcore
sadump: does not have partition header
sadump: read dump device as unknown format
sadump: unknown format
LOAD (0)
   phys_start : 1000000
   phys_end   : 2a86000
   virt_start : ffffffff81000000
   virt_end   : ffffffff82a86000
LOAD (1)
   phys_start : 1000
   phys_end   : 9fc00
   virt_start : ffff880000001000
   virt_end   : ffff88000009fc00
LOAD (2)
   phys_start : 100000
   phys_end   : 13000000
   virt_start : ffff880000100000
   virt_end   : ffff880013000000
LOAD (3)
   phys_start : 33000000
   phys_end   : 7ffd7000
   virt_start : ffff880033000000
   virt_end   : ffff88007ffd7000
Linux kdump
page_size    : 4096

max_mapnr    : 7ffd7

Buffer size for the cyclic mode: 131061

num of NODEs : 1


Memory type  : SPARSEMEM_EX

mem_map (0)
   mem_map    : ffffea0000000000
   pfn_start  : 0
   pfn_end    : 8000
mem_map (1)
   mem_map    : ffffea0000200000
   pfn_start  : 8000
   pfn_end    : 10000
mem_map (2)
   mem_map    : ffffea0000400000
   pfn_start  : 10000
   pfn_end    : 18000
mem_map (3)
   mem_map    : ffffea0000600000
   pfn_start  : 18000
   pfn_end    : 20000
mem_map (4)
   mem_map    : ffffea0000800000
   pfn_start  : 20000
   pfn_end    : 28000
mem_map (5)
   mem_map    : ffffea0000a00000
   pfn_start  : 28000
   pfn_end    : 30000
mem_map (6)
   mem_map    : ffffea0000c00000
   pfn_start  : 30000
   pfn_end    : 38000
mem_map (7)
   mem_map    : ffffea0000e00000
   pfn_start  : 38000
   pfn_end    : 40000
mem_map (8)
   mem_map    : ffffea0001000000
   pfn_start  : 40000
   pfn_end    : 48000
mem_map (9)
   mem_map    : ffffea0001200000
   pfn_start  : 48000
   pfn_end    : 50000
mem_map (10)
   mem_map    : ffffea0001400000
   pfn_start  : 50000
   pfn_end    : 58000
mem_map (11)
   mem_map    : ffffea0001600000
   pfn_start  : 58000
   pfn_end    : 60000
mem_map (12)
   mem_map    : ffffea0001800000
   pfn_start  : 60000
   pfn_end    : 68000
mem_map (13)
   mem_map    : ffffea0001a00000
   pfn_start  : 68000
   pfn_end    : 70000
mem_map (14)
   mem_map    : ffffea0001c00000
   pfn_start  : 70000
   pfn_end    : 78000
mem_map (15)
   mem_map    : ffffea0001e00000
   pfn_start  : 78000
   pfn_end    : 7ffd7
mmap() is available on the kernel.
Copying data                                      : [100.0 %] - 
  eta: 0s
Writing erase info...
offset_eraseinfo: 9567fb0, size_eraseinfo: 0
kdump: saving vmcore complete

Thanks,
	dou

>>
>> [douly@localhost code]$ ./makedumpfile -D -d 31 --message-level 31 -x vmlinux_4.15+ vmcore_4.15+_from_cp_command vmcore_4.15+
>> sadump: does not have partition header
>> sadump: read dump device as unknown format
>> sadump: unknown format
>> LOAD (0)
>>    phys_start : 1000000
>>    phys_end   : 2a86000
>>    virt_start : ffffffff81000000
>>    virt_end   : ffffffff82a86000
>> LOAD (1)
>>    phys_start : 1000
>>    phys_end   : 9fc00
>>    virt_start : ffff880000001000
>>    virt_end   : ffff88000009fc00
>> LOAD (2)
>>    phys_start : 100000
>>    phys_end   : 13000000
>>    virt_start : ffff880000100000
>>    virt_end   : ffff880013000000
>> LOAD (3)
>>    phys_start : 33000000
>>    phys_end   : 7ffd7000
>>    virt_start : ffff880033000000
>>    virt_end   : ffff88007ffd7000
>> Linux kdump
>> page_size    : 4096
>>
>> max_mapnr    : 7ffd7
>>
>> Buffer size for the cyclic mode: 131061
>> The kernel version is not supported.
>> The makedumpfile operation may be incomplete.
>>
>> num of NODEs : 1
>>
>>
>> Memory type  : SPARSEMEM_EX
>>
>> mem_map (0)
>>    mem_map    : ffff88007ff26000
>>    pfn_start  : 0
>>    pfn_end    : 8000
>> mem_map (1)
>>    mem_map    : 0
>>    pfn_start  : 8000
>>    pfn_end    : 10000
>> mem_map (2)
>>    mem_map    : 0
>>    pfn_start  : 10000
>>    pfn_end    : 18000
>> mem_map (3)
>>    mem_map    : 0
>>    pfn_start  : 18000
>>    pfn_end    : 20000
>> mem_map (4)
>>    mem_map    : 0
>>    pfn_start  : 20000
>>    pfn_end    : 28000
>> mem_map (5)
>>    mem_map    : 0
>>    pfn_start  : 28000
>>    pfn_end    : 30000
>> mem_map (6)
>>    mem_map    : 0
>>    pfn_start  : 30000
>>    pfn_end    : 38000
>> mem_map (7)
>>    mem_map    : 0
>>    pfn_start  : 38000
>>    pfn_end    : 40000
>> mem_map (8)
>>    mem_map    : 0
>>    pfn_start  : 40000
>>    pfn_end    : 48000
>> mem_map (9)
>>    mem_map    : 0
>>    pfn_start  : 48000
>>    pfn_end    : 50000
>> mem_map (10)
>>    mem_map    : 0
>>    pfn_start  : 50000
>>    pfn_end    : 58000
>> mem_map (11)
>>    mem_map    : 0
>>    pfn_start  : 58000
>>    pfn_end    : 60000
>> mem_map (12)
>>    mem_map    : 0
>>    pfn_start  : 60000
>>    pfn_end    : 68000
>> mem_map (13)
>>    mem_map    : 0
>>    pfn_start  : 68000
>>    pfn_end    : 70000
>> mem_map (14)
>>    mem_map    : 0
>>    pfn_start  : 70000
>>    pfn_end    : 78000
>> mem_map (15)
>>    mem_map    : 0
>>    pfn_start  : 78000
>>    pfn_end    : 7ffd7
>> mmap() is available on the kernel.
>> Checking for memory holes                         : [100.0 %] |         STEP
>> [Checking for memory holes  ] : 0.000014 seconds
>> __vtop4_x86_64: Can't get a valid pte.
>> readmem: Can't convert a virtual address(ffff88007ffd7000) to physical
>> address.
>> readmem: type_addr: 0, addr:ffff88007ffd7000, size:32768
>> __exclude_unnecessary_pages: Can't read the buffer of struct page.
>> create_2nd_bitmap: Can't exclude unnecessary pages.
>> Checking for memory holes                         : [100.0 %] \         STEP
>> [Checking for memory holes  ] : 0.000006 seconds
>> Checking for memory holes                         : [100.0 %] -         STEP
>> [Checking for memory holes  ] : 0.000004 seconds
>> __vtop4_x86_64: Can't get a valid pte.
>> readmem: Can't convert a virtual address(ffff88007ffd7000) to physical
>> address.
>> readmem: type_addr: 0, addr:ffff88007ffd7000, size:32768
>> __exclude_unnecessary_pages: Can't read the buffer of struct page.
>> create_2nd_bitmap: Can't exclude unnecessary pages.
>>
>> makedumpfile Failed.
>>
>>>
>>>>
>>>>       ......It causes makedumpfile to fail.
>>>>
>>>>
>>>> Thanks,
>>>> 	dou.
>>>>
>>>>> 	-Mike
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>>
> 
> 
> 

^ permalink raw reply	[flat|nested] 349+ messages in thread
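
The two mem_map tables in the message above can be compared with a little arithmetic. In the working (sysrq) run, consecutive mem_map values differ by 0x200000 while each section covers 0x8000 page frames, i.e. 64 bytes per page frame, which is consistent with a 64-byte struct page. In the failing run, every entry after the first is 0, and the first one is a ffff88... address rather than a ffffea... one. The Python sketch below is a hedged illustration, not makedumpfile code; the function names are hypothetical and only the first three entries of each table are copied, for brevity.

```python
import re

# First three mem_map entries of the working run (triggered via sysrq).
WORKING = """
mem_map (0)
   mem_map    : ffffea0000000000
   pfn_start  : 0
   pfn_end    : 8000
mem_map (1)
   mem_map    : ffffea0000200000
   pfn_start  : 8000
   pfn_end    : 10000
mem_map (2)
   mem_map    : ffffea0000400000
   pfn_start  : 10000
   pfn_end    : 18000
"""

# First three mem_map entries of the failing run (vmcore saved with cp).
FAILING = """
mem_map (0)
  mem_map    : ffff88007ff26000
  pfn_start  : 0
  pfn_end    : 8000
mem_map (1)
  mem_map    : 0
  pfn_start  : 8000
  pfn_end    : 10000
mem_map (2)
  mem_map    : 0
  pfn_start  : 10000
  pfn_end    : 18000
"""

ENTRY_RE = re.compile(
    r"mem_map\s+:\s*([0-9a-f]+)\s+"
    r"pfn_start\s+:\s*([0-9a-f]+)\s+"
    r"pfn_end\s+:\s*([0-9a-f]+)"
)


def parse_mem_maps(text):
    """Return [(mem_map_addr, pfn_start, pfn_end), ...] parsed as hex."""
    return [tuple(int(field, 16) for field in match)
            for match in ENTRY_RE.findall(text)]


def bytes_per_pfn(entries):
    """Set of address strides per page frame between consecutive entries."""
    strides = set()
    for (addr_a, pfn_a, _), (addr_b, pfn_b, _) in zip(entries, entries[1:]):
        strides.add((addr_b - addr_a) // (pfn_b - pfn_a))
    return strides


def zero_entries(entries):
    """Indices of sections whose mem_map address is 0."""
    return [i for i, (addr, _, _) in enumerate(entries) if addr == 0]


working = parse_mem_maps(WORKING)
failing = parse_mem_maps(FAILING)

print("bytes per page frame (working run):", bytes_per_pfn(working))
print("sections with mem_map == 0 (working run):", zero_entries(working))
print("sections with mem_map == 0 (failing run):", zero_entries(failing))
```

A check like this only shows that the section table was decoded differently in the two runs; it does not by itself say which side (kernel export or makedumpfile parsing) needs to change.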

* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
@ 2018-02-07 12:34                             ` Dou Liyang
  0 siblings, 0 replies; 349+ messages in thread
From: Dou Liyang @ 2018-02-07 12:34 UTC (permalink / raw)
  To: Baoquan He
  Cc: Mike Galbraith, Kirill A. Shutemov, Ingo Molnar, Andrew Morton,
	Peter Zijlstra, Greg Kroah-Hartman, Dave Young, kexec,
	linux-kernel, stable, Andy Lutomirski, linux-mm, Vivek Goyal,
	Cyrill Gorcunov, Thomas Gleixner, Borislav Petkov,
	Linus Torvalds, Kirill A. Shutemov, Takao Indoh



At 02/07/2018 08:27 PM, Baoquan He wrote:
> On 02/07/18 at 08:17pm, Dou Liyang wrote:
>> Hi Baoquan,
>>
>> At 02/07/2018 08:08 PM, Baoquan He wrote:
>>> On 02/07/18 at 08:00pm, Dou Liyang wrote:
>>>> Hi Kirill,Mike
>>>>
>>>> At 02/07/2018 06:45 PM, Mike Galbraith wrote:
>>>>> On Wed, 2018-02-07 at 13:41 +0300, Kirill A. Shutemov wrote:
>>>>>> On Wed, Feb 07, 2018 at 05:25:05PM +0800, Dou Liyang wrote:
>>>>>>> Hi All,
>>>>>>>
>>>>>>> I hit a makedumpfile failure in the upstream kernel which contains
>>>>>>> this patch. Did I miss something else?
>>>>>>
>>>>>> None I'm aware of.
>>>>>>
>>>>>> Is there a reason to suspect that the issue is related to the bug this patch
>>>>>> fixed?
>>>>>
>>>>
>>>> I ran a comparison test at my colleague Indoh's suggestion.
>>>>
>>>> With your two commits reverted:
>>>>
>>>> commit 83e3c48729d9ebb7af5a31a504f3fd6aff0348c4
>>>> Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>>>> Date:   Fri Sep 29 17:08:16 2017 +0300
>>>>
>>>> commit 629a359bdb0e0652a8227b4ff3125431995fec6e
>>>> Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>>>> Date:   Tue Nov 7 11:33:37 2017 +0300
>>>>
>>>> ...and everything else unchanged, makedumpfile works well.
>>>>
>>>>> Still works fine for me with .today.  Box is only 16GB desktop box though.
>>>>>
>>>> Btw, on the upstream kernel that contains this patch, I ran two tests:
>>>>
>>>>    1) Use makedumpfile as core_collector in /etc/kdump.conf, then
>>>> trigger kdump with echo c >/proc/sysrq-trigger. makedumpfile
>>>> works well and I get the vmcore file.
>>>>
>>>>        ......It is OK
>>>>
>>>>    2) Use cp as core_collector and do the same to get the vmcore file,
>>>> then run makedumpfile on it as follows:
>>>>
>>>>       [douly@localhost code]$ ./makedumpfile -d 31 --message-level 31 -x
>>>> vmlinux_4.15+ vmcore_4.15+_from_cp_command vmcore_4.15+
>>>
>>> Oh, then please ignore my previous comment. Adding '-D' can give more
>>> debugging messages.
>>
>> I added '-D'. Just like before, there are no further debugging messages:
>>
>> BTW, I used crash to analyze the vmcore file created by the 'cp' command.
>>
>>     ./crash ../makedumpfile/code/vmcore_4.15+_from_cp_command
>> ../makedumpfile/code/vmlinux_4.15+
>>
>> crash works well, which is interesting.
>>
>> Thanks,
>> 	dou.
>>
>> The debugging message with '-D':
> 
> And what's the debugging output when the crash is triggered by sysrq?
> 

kdump: dump target is /dev/vda2
kdump: saving to /sysroot//var/crash/127.0.0.1-2018-02-07-07:31:56/
[    2.751352] EXT4-fs (vda2): re-mounted. Opts: data=ordered
kdump: saving vmcore-dmesg.txt
kdump: saving vmcore-dmesg.txt complete
kdump: saving vmcore
sadump: does not have partition header
sadump: read dump device as unknown format
sadump: unknown format
LOAD (0)
   phys_start : 1000000
   phys_end   : 2a86000
   virt_start : ffffffff81000000
   virt_end   : ffffffff82a86000
LOAD (1)
   phys_start : 1000
   phys_end   : 9fc00
   virt_start : ffff880000001000
   virt_end   : ffff88000009fc00
LOAD (2)
   phys_start : 100000
   phys_end   : 13000000
   virt_start : ffff880000100000
   virt_end   : ffff880013000000
LOAD (3)
   phys_start : 33000000
   phys_end   : 7ffd7000
   virt_start : ffff880033000000
   virt_end   : ffff88007ffd7000
Linux kdump
page_size    : 4096

max_mapnr    : 7ffd7

Buffer size for the cyclic mode: 131061

num of NODEs : 1


Memory type  : SPARSEMEM_EX

mem_map (0)
   mem_map    : ffffea0000000000
   pfn_start  : 0
   pfn_end    : 8000
mem_map (1)
   mem_map    : ffffea0000200000
   pfn_start  : 8000
   pfn_end    : 10000
mem_map (2)
   mem_map    : ffffea0000400000
   pfn_start  : 10000
   pfn_end    : 18000
mem_map (3)
   mem_map    : ffffea0000600000
   pfn_start  : 18000
   pfn_end    : 20000
mem_map (4)
   mem_map    : ffffea0000800000
   pfn_start  : 20000
   pfn_end    : 28000
mem_map (5)
   mem_map    : ffffea0000a00000
   pfn_start  : 28000
   pfn_end    : 30000
mem_map (6)
   mem_map    : ffffea0000c00000
   pfn_start  : 30000
   pfn_end    : 38000
mem_map (7)
   mem_map    : ffffea0000e00000
   pfn_start  : 38000
   pfn_end    : 40000
mem_map (8)
   mem_map    : ffffea0001000000
   pfn_start  : 40000
   pfn_end    : 48000
mem_map (9)
   mem_map    : ffffea0001200000
   pfn_start  : 48000
   pfn_end    : 50000
mem_map (10)
   mem_map    : ffffea0001400000
   pfn_start  : 50000
   pfn_end    : 58000
mem_map (11)
   mem_map    : ffffea0001600000
   pfn_start  : 58000
   pfn_end    : 60000
mem_map (12)
   mem_map    : ffffea0001800000
   pfn_start  : 60000
   pfn_end    : 68000
mem_map (13)
   mem_map    : ffffea0001a00000
   pfn_start  : 68000
   pfn_end    : 70000
mem_map (14)
   mem_map    : ffffea0001c00000
   pfn_start  : 70000
   pfn_end    : 78000
mem_map (15)
   mem_map    : ffffea0001e00000
   pfn_start  : 78000
   pfn_end    : 7ffd7
mmap() is available on the kernel.
Copying data                                      : [100.0 %] -  eta: 0s
Writing erase info...
offset_eraseinfo: 9567fb0, size_eraseinfo: 0
kdump: saving vmcore complete

Thanks,
	dou
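
The mem_map table in the working run above follows a regular pattern that can be checked by hand. The sketch below is only a sanity check on the logged numbers: it assumes sizeof(struct page) is 64 bytes on this build (the thread does not state it) and treats mem_map (0) as the base of the struct page array. All other values are copied from the log.

```python
# Values copied from the working (sysrq-triggered) makedumpfile output.
PAGE_SIZE = 4096                    # "page_size : 4096"
VMEMMAP_BASE = 0xffffea0000000000   # mem_map (0)
PFNS_PER_SECTION = 0x8000           # pfn_end - pfn_start of a full section

# Assumption, not stated in the thread: sizeof(struct page) == 64.
STRUCT_PAGE_SIZE = 64

logged_mem_map = [
    0xffffea0000000000, 0xffffea0000200000, 0xffffea0000400000,
    0xffffea0000600000, 0xffffea0000800000, 0xffffea0000a00000,
    0xffffea0000c00000, 0xffffea0000e00000, 0xffffea0001000000,
    0xffffea0001200000, 0xffffea0001400000, 0xffffea0001600000,
    0xffffea0001800000, 0xffffea0001a00000, 0xffffea0001c00000,
    0xffffea0001e00000,
]


def expected_mem_map(section_nr):
    """Address of the first struct page of a section under the assumptions above."""
    pfn_start = section_nr * PFNS_PER_SECTION
    return VMEMMAP_BASE + pfn_start * STRUCT_PAGE_SIZE


# Sections whose logged mem_map differs from the computed one.
mismatches = [nr for nr, addr in enumerate(logged_mem_map)
              if addr != expected_mem_map(nr)]

# Amount of physical memory covered by one section.
section_bytes = PFNS_PER_SECTION * PAGE_SIZE

print("mismatching sections:", mismatches)
print("bytes per section   :", section_bytes)
```

Under these assumptions every logged entry matches, and each section covers 128 MiB. The failing run quoted in this message does not show this pattern: its mem_map (0) is ffff88007ff26000 and all other entries are 0.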

>>
>> [douly@localhost code]$ ./makedumpfile -D -d 31 --message-level 31 -x
>> vmlinux_4.15+  vmcore_4.15+_from_cp_command vmcore_4.15+
>> sadump: does not have partition header
>> sadump: read dump device as unknown format
>> sadump: unknown format
>> LOAD (0)
>>    phys_start : 1000000
>>    phys_end   : 2a86000
>>    virt_start : ffffffff81000000
>>    virt_end   : ffffffff82a86000
>> LOAD (1)
>>    phys_start : 1000
>>    phys_end   : 9fc00
>>    virt_start : ffff880000001000
>>    virt_end   : ffff88000009fc00
>> LOAD (2)
>>    phys_start : 100000
>>    phys_end   : 13000000
>>    virt_start : ffff880000100000
>>    virt_end   : ffff880013000000
>> LOAD (3)
>>    phys_start : 33000000
>>    phys_end   : 7ffd7000
>>    virt_start : ffff880033000000
>>    virt_end   : ffff88007ffd7000
>> Linux kdump
>> page_size    : 4096
>>
>> max_mapnr    : 7ffd7
>>
>> Buffer size for the cyclic mode: 131061
>> The kernel version is not supported.
>> The makedumpfile operation may be incomplete.
>>
>> num of NODEs : 1
>>
>>
>> Memory type  : SPARSEMEM_EX
>>
>> mem_map (0)
>>    mem_map    : ffff88007ff26000
>>    pfn_start  : 0
>>    pfn_end    : 8000
>> mem_map (1)
>>    mem_map    : 0
>>    pfn_start  : 8000
>>    pfn_end    : 10000
>> mem_map (2)
>>    mem_map    : 0
>>    pfn_start  : 10000
>>    pfn_end    : 18000
>> mem_map (3)
>>    mem_map    : 0
>>    pfn_start  : 18000
>>    pfn_end    : 20000
>> mem_map (4)
>>    mem_map    : 0
>>    pfn_start  : 20000
>>    pfn_end    : 28000
>> mem_map (5)
>>    mem_map    : 0
>>    pfn_start  : 28000
>>    pfn_end    : 30000
>> mem_map (6)
>>    mem_map    : 0
>>    pfn_start  : 30000
>>    pfn_end    : 38000
>> mem_map (7)
>>    mem_map    : 0
>>    pfn_start  : 38000
>>    pfn_end    : 40000
>> mem_map (8)
>>    mem_map    : 0
>>    pfn_start  : 40000
>>    pfn_end    : 48000
>> mem_map (9)
>>    mem_map    : 0
>>    pfn_start  : 48000
>>    pfn_end    : 50000
>> mem_map (10)
>>    mem_map    : 0
>>    pfn_start  : 50000
>>    pfn_end    : 58000
>> mem_map (11)
>>    mem_map    : 0
>>    pfn_start  : 58000
>>    pfn_end    : 60000
>> mem_map (12)
>>    mem_map    : 0
>>    pfn_start  : 60000
>>    pfn_end    : 68000
>> mem_map (13)
>>    mem_map    : 0
>>    pfn_start  : 68000
>>    pfn_end    : 70000
>> mem_map (14)
>>    mem_map    : 0
>>    pfn_start  : 70000
>>    pfn_end    : 78000
>> mem_map (15)
>>    mem_map    : 0
>>    pfn_start  : 78000
>>    pfn_end    : 7ffd7
>> mmap() is available on the kernel.
>> Checking for memory holes                         : [100.0 %] |         STEP
>> [Checking for memory holes  ] : 0.000014 seconds
>> __vtop4_x86_64: Can't get a valid pte.
>> readmem: Can't convert a virtual address(ffff88007ffd7000) to physical
>> address.
>> readmem: type_addr: 0, addr:ffff88007ffd7000, size:32768
>> __exclude_unnecessary_pages: Can't read the buffer of struct page.
>> create_2nd_bitmap: Can't exclude unnecessary pages.
>> Checking for memory holes                         : [100.0 %] \         STEP
>> [Checking for memory holes  ] : 0.000006 seconds
>> Checking for memory holes                         : [100.0 %] -         STEP
>> [Checking for memory holes  ] : 0.000004 seconds
>> __vtop4_x86_64: Can't get a valid pte.
>> readmem: Can't convert a virtual address(ffff88007ffd7000) to physical
>> address.
>> readmem: type_addr: 0, addr:ffff88007ffd7000, size:32768
>> __exclude_unnecessary_pages: Can't read the buffer of struct page.
>> create_2nd_bitmap: Can't exclude unnecessary pages.
>>
>> makedumpfile Failed.
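
The addresses in the failing output can be compared against the LOAD segments printed at the top of the same log. The sketch below is a hedged reading of the numbers, not a diagnosis: the helper find_segment is hypothetical, and it assumes virt_end is exclusive.

```python
# (phys_start, phys_end, virt_start, virt_end) copied from the LOAD entries.
LOAD_SEGMENTS = [
    (0x1000000,  0x2a86000,  0xffffffff81000000, 0xffffffff82a86000),
    (0x1000,     0x9fc00,    0xffff880000001000, 0xffff88000009fc00),
    (0x100000,   0x13000000, 0xffff880000100000, 0xffff880013000000),
    (0x33000000, 0x7ffd7000, 0xffff880033000000, 0xffff88007ffd7000),
]

MEM_MAP_0 = 0xffff88007ff26000    # mem_map (0) in the failing run
FAILED_ADDR = 0xffff88007ffd7000  # address reported by readmem


def find_segment(vaddr):
    """Return the index of the LOAD segment containing vaddr, or None.

    Assumption: virt_end is exclusive (hypothetical helper, not makedumpfile code).
    """
    for index, (_, _, virt_start, virt_end) in enumerate(LOAD_SEGMENTS):
        if virt_start <= vaddr < virt_end:
            return index
    return None


print("mem_map (0) segment :", find_segment(MEM_MAP_0))
print("failed addr segment :", find_segment(FAILED_ADDR))
print("distance            :", hex(FAILED_ADDR - MEM_MAP_0))
```

With these values, mem_map (0) falls inside LOAD (3), while the address readmem could not translate is exactly the virt_end of LOAD (3), so it is not covered by any segment. The two addresses are 0xb1000 bytes apart.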
>>
>>>
>>>>
>>>>       ......This causes makedumpfile to fail.
>>>>
>>>>
>>>> Thanks,
>>>> 	dou.
>>>>
>>>>> 	-Mike
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>>
> 
> 
> 



^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
  2018-02-07 12:34                             ` Dou Liyang
  (?)
@ 2018-02-07 12:45                               ` Baoquan He
  -1 siblings, 0 replies; 349+ messages in thread
From: Baoquan He @ 2018-02-07 12:45 UTC (permalink / raw)
  To: Dou Liyang
  Cc: Mike Galbraith, Kirill A. Shutemov, Ingo Molnar, Andrew Morton,
	Peter Zijlstra, Greg Kroah-Hartman, Dave Young, kexec,
	linux-kernel, stable, Andy Lutomirski, linux-mm, Vivek Goyal,
	Cyrill Gorcunov, Thomas Gleixner, Borislav Petkov,
	Linus Torvalds, Kirill A. Shutemov, Takao Indoh

On 02/07/18 at 08:34pm, Dou Liyang wrote:
> 
> 
> At 02/07/2018 08:27 PM, Baoquan He wrote:
> > On 02/07/18 at 08:17pm, Dou Liyang wrote:
> > > Hi Baoquan,
> > > 
> > > At 02/07/2018 08:08 PM, Baoquan He wrote:
> > > > On 02/07/18 at 08:00pm, Dou Liyang wrote:
> > > > > Hi Kirill,Mike
> > > > > 
> > > > > At 02/07/2018 06:45 PM, Mike Galbraith wrote:
> > > > > > On Wed, 2018-02-07 at 13:41 +0300, Kirill A. Shutemov wrote:
> > > > > > > On Wed, Feb 07, 2018 at 05:25:05PM +0800, Dou Liyang wrote:
> > > > > > > > Hi All,
> > > > > > > > 
> > > > > > > > makedumpfile fails for me on the upstream kernel that contains
> > > > > > > > this patch. Did I miss something else?
> > > > > > > 
> > > > > > > None I'm aware of.
> > > > > > > 
> > > > > > > Is there a reason to suspect that the issue is related to the bug this patch
> > > > > > > fixed?
> > > > > > 
> > > > > 
> > > > > I ran a comparison test at my colleague Indoh's suggestion.

OK, I think I may know the reason. KASLR is enabled, right? You can try
disabling KASLR and running both tests again. phys_base and kaslr_offset are
taken from vmlinux, where they are fixed at compile time, so with KASLR they
may not match the values of the running kernel. Just a guess.
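
One rough way to look at this guess using only the logs quoted below is to compare LOAD (0) with the addresses the kernel would have without any relocation. This is a sketch under assumptions that the thread does not state: that LOAD (0) describes the kernel text at its runtime address, that the default x86_64 text link address is 0xffffffff81000000, and that CONFIG_PHYSICAL_START has its default value 0x1000000.

```python
# Assumptions (not stated in the thread): default x86_64 link address of the
# kernel text and default CONFIG_PHYSICAL_START.
DEFAULT_TEXT_VADDR = 0xffffffff81000000
DEFAULT_PHYS_START = 0x1000000

# LOAD (0) values copied from the makedumpfile output quoted in this thread.
load0_virt_start = 0xffffffff81000000
load0_phys_start = 0x1000000

# Difference between where the text appears in the dump and where it would
# be without any relocation.
virt_shift = load0_virt_start - DEFAULT_TEXT_VADDR
phys_shift = load0_phys_start - DEFAULT_PHYS_START

print(f"virtual shift : {virt_shift:#x}")
print(f"physical shift: {phys_shift:#x}")
```

Both shifts come out as zero for the logged values. That would be consistent with the kernel text sitting at its default addresses, but it does not show how the kernel was configured or booted, so rerunning the tests with KASLR explicitly disabled is still the more direct check.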

> > > > > 
> > > > > Revert your two commits:
> > > > > 
> > > > > commit 83e3c48729d9ebb7af5a31a504f3fd6aff0348c4
> > > > > Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > > > > Date:   Fri Sep 29 17:08:16 2017 +0300
> > > > > 
> > > > > commit 629a359bdb0e0652a8227b4ff3125431995fec6e
> > > > > Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > > > > Date:   Tue Nov 7 11:33:37 2017 +0300
> > > > > 
> > > > > ...and keep the others unchanged, and makedumpfile works well.
> > > > > 
> > > > > > Still works fine for me with .today.  Box is only 16GB desktop box though.
> > > > > > 
> > > > > Btw, on the upstream kernel that contains this patch, I ran two tests:
> > > > > 
> > > > >    1) Use makedumpfile as core_collector in /etc/kdump.conf, then
> > > > > trigger kdump with echo c >/proc/sysrq-trigger. makedumpfile
> > > > > works well and I get the vmcore file.
> > > > > 
> > > > >        ......It is OK
> > > > > 
> > > > >    2) Use cp as core_collector and do the same to get the vmcore file,
> > > > > then run makedumpfile on it as follows:
> > > > > 
> > > > >       [douly@localhost code]$ ./makedumpfile -d 31 --message-level 31 -x
> > > > > vmlinux_4.15+ vmcore_4.15+_from_cp_command vmcore_4.15+
> > > > 
> > > > Oh, then please ignore my previous comment. Adding '-D' can give more
> > > > debugging messages.
> > > 
> > > I added '-D'. Just like before, there are no further debugging messages:
> > > 
> > > BTW, I used crash to analyze the vmcore file created by the 'cp' command.
> > > 
> > >     ./crash ../makedumpfile/code/vmcore_4.15+_from_cp_command
> > > ../makedumpfile/code/vmlinux_4.15+
> > > 
> > > crash works well, which is interesting.
> > > 
> > > Thanks,
> > > 	dou.
> > > 
> > > The debugging message with '-D':
> > 
> > And what's the debugging output when the crash is triggered by sysrq?
> > 
> 
> kdump: dump target is /dev/vda2
> kdump: saving to /sysroot//var/crash/127.0.0.1-2018-02-07-07:31:56/
> [    2.751352] EXT4-fs (vda2): re-mounted. Opts: data=ordered
> kdump: saving vmcore-dmesg.txt
> kdump: saving vmcore-dmesg.txt complete
> kdump: saving vmcore
> sadump: does not have partition header
> sadump: read dump device as unknown format
> sadump: unknown format
> LOAD (0)
>   phys_start : 1000000
>   phys_end   : 2a86000
>   virt_start : ffffffff81000000
>   virt_end   : ffffffff82a86000
> LOAD (1)
>   phys_start : 1000
>   phys_end   : 9fc00
>   virt_start : ffff880000001000
>   virt_end   : ffff88000009fc00
> LOAD (2)
>   phys_start : 100000
>   phys_end   : 13000000
>   virt_start : ffff880000100000
>   virt_end   : ffff880013000000
> LOAD (3)
>   phys_start : 33000000
>   phys_end   : 7ffd7000
>   virt_start : ffff880033000000
>   virt_end   : ffff88007ffd7000
> Linux kdump
> page_size    : 4096
> 
> max_mapnr    : 7ffd7
> 
> Buffer size for the cyclic mode: 131061
> 
> num of NODEs : 1
> 
> 
> Memory type  : SPARSEMEM_EX
> 
> mem_map (0)
>   mem_map    : ffffea0000000000
>   pfn_start  : 0
>   pfn_end    : 8000
> mem_map (1)
>   mem_map    : ffffea0000200000
>   pfn_start  : 8000
>   pfn_end    : 10000
> mem_map (2)
>   mem_map    : ffffea0000400000
>   pfn_start  : 10000
>   pfn_end    : 18000
> mem_map (3)
>   mem_map    : ffffea0000600000
>   pfn_start  : 18000
>   pfn_end    : 20000
> mem_map (4)
>   mem_map    : ffffea0000800000
>   pfn_start  : 20000
>   pfn_end    : 28000
> mem_map (5)
>   mem_map    : ffffea0000a00000
>   pfn_start  : 28000
>   pfn_end    : 30000
> mem_map (6)
>   mem_map    : ffffea0000c00000
>   pfn_start  : 30000
>   pfn_end    : 38000
> mem_map (7)
>   mem_map    : ffffea0000e00000
>   pfn_start  : 38000
>   pfn_end    : 40000
> mem_map (8)
>   mem_map    : ffffea0001000000
>   pfn_start  : 40000
>   pfn_end    : 48000
> mem_map (9)
>   mem_map    : ffffea0001200000
>   pfn_start  : 48000
>   pfn_end    : 50000
> mem_map (10)
>   mem_map    : ffffea0001400000
>   pfn_start  : 50000
>   pfn_end    : 58000
> mem_map (11)
>   mem_map    : ffffea0001600000
>   pfn_start  : 58000
>   pfn_end    : 60000
> mem_map (12)
>   mem_map    : ffffea0001800000
>   pfn_start  : 60000
>   pfn_end    : 68000
> mem_map (13)
>   mem_map    : ffffea0001a00000
>   pfn_start  : 68000
>   pfn_end    : 70000
> mem_map (14)
>   mem_map    : ffffea0001c00000
>   pfn_start  : 70000
>   pfn_end    : 78000
> mem_map (15)
>   mem_map    : ffffea0001e00000
>   pfn_start  : 78000
>   pfn_end    : 7ffd7
> mmap() is available on the kernel.
> Copying data                                      : [100.0 %] -  eta: 0s
> Writing erase info...
> offset_eraseinfo: 9567fb0, size_eraseinfo: 0
> kdump: saving vmcore complete
> 
> Thanks,
> 	dou
> 
> > > 
> > > [douly@localhost code]$ ./makedumpfile -D -d 31 --message-level 31 -x
> > > vmlinux_4.15+  vmcore_4.15+_from_cp_command vmcore_4.15+
> > > sadump: does not have partition header
> > > sadump: read dump device as unknown format
> > > sadump: unknown format
> > > LOAD (0)
> > >    phys_start : 1000000
> > >    phys_end   : 2a86000
> > >    virt_start : ffffffff81000000
> > >    virt_end   : ffffffff82a86000
> > > LOAD (1)
> > >    phys_start : 1000
> > >    phys_end   : 9fc00
> > >    virt_start : ffff880000001000
> > >    virt_end   : ffff88000009fc00
> > > LOAD (2)
> > >    phys_start : 100000
> > >    phys_end   : 13000000
> > >    virt_start : ffff880000100000
> > >    virt_end   : ffff880013000000
> > > LOAD (3)
> > >    phys_start : 33000000
> > >    phys_end   : 7ffd7000
> > >    virt_start : ffff880033000000
> > >    virt_end   : ffff88007ffd7000
> > > Linux kdump
> > > page_size    : 4096
> > > 
> > > max_mapnr    : 7ffd7
> > > 
> > > Buffer size for the cyclic mode: 131061
> > > The kernel version is not supported.
> > > The makedumpfile operation may be incomplete.
> > > 
> > > num of NODEs : 1
> > > 
> > > 
> > > Memory type  : SPARSEMEM_EX
> > > 
> > > mem_map (0)
> > >    mem_map    : ffff88007ff26000
> > >    pfn_start  : 0
> > >    pfn_end    : 8000
> > > mem_map (1)
> > >    mem_map    : 0
> > >    pfn_start  : 8000
> > >    pfn_end    : 10000
> > > mem_map (2)
> > >    mem_map    : 0
> > >    pfn_start  : 10000
> > >    pfn_end    : 18000
> > > mem_map (3)
> > >    mem_map    : 0
> > >    pfn_start  : 18000
> > >    pfn_end    : 20000
> > > mem_map (4)
> > >    mem_map    : 0
> > >    pfn_start  : 20000
> > >    pfn_end    : 28000
> > > mem_map (5)
> > >    mem_map    : 0
> > >    pfn_start  : 28000
> > >    pfn_end    : 30000
> > > mem_map (6)
> > >    mem_map    : 0
> > >    pfn_start  : 30000
> > >    pfn_end    : 38000
> > > mem_map (7)
> > >    mem_map    : 0
> > >    pfn_start  : 38000
> > >    pfn_end    : 40000
> > > mem_map (8)
> > >    mem_map    : 0
> > >    pfn_start  : 40000
> > >    pfn_end    : 48000
> > > mem_map (9)
> > >    mem_map    : 0
> > >    pfn_start  : 48000
> > >    pfn_end    : 50000
> > > mem_map (10)
> > >    mem_map    : 0
> > >    pfn_start  : 50000
> > >    pfn_end    : 58000
> > > mem_map (11)
> > >    mem_map    : 0
> > >    pfn_start  : 58000
> > >    pfn_end    : 60000
> > > mem_map (12)
> > >    mem_map    : 0
> > >    pfn_start  : 60000
> > >    pfn_end    : 68000
> > > mem_map (13)
> > >    mem_map    : 0
> > >    pfn_start  : 68000
> > >    pfn_end    : 70000
> > > mem_map (14)
> > >    mem_map    : 0
> > >    pfn_start  : 70000
> > >    pfn_end    : 78000
> > > mem_map (15)
> > >    mem_map    : 0
> > >    pfn_start  : 78000
> > >    pfn_end    : 7ffd7
> > > mmap() is available on the kernel.
> > > Checking for memory holes                         : [100.0 %] |         STEP
> > > [Checking for memory holes  ] : 0.000014 seconds
> > > __vtop4_x86_64: Can't get a valid pte.
> > > readmem: Can't convert a virtual address(ffff88007ffd7000) to physical
> > > address.
> > > readmem: type_addr: 0, addr:ffff88007ffd7000, size:32768
> > > __exclude_unnecessary_pages: Can't read the buffer of struct page.
> > > create_2nd_bitmap: Can't exclude unnecessary pages.
> > > Checking for memory holes                         : [100.0 %] \         STEP
> > > [Checking for memory holes  ] : 0.000006 seconds
> > > Checking for memory holes                         : [100.0 %] -         STEP
> > > [Checking for memory holes  ] : 0.000004 seconds
> > > __vtop4_x86_64: Can't get a valid pte.
> > > readmem: Can't convert a virtual address(ffff88007ffd7000) to physical
> > > address.
> > > readmem: type_addr: 0, addr:ffff88007ffd7000, size:32768
> > > __exclude_unnecessary_pages: Can't read the buffer of struct page.
> > > create_2nd_bitmap: Can't exclude unnecessary pages.
> > > 
> > > makedumpfile Failed.
> > > 
> > > > 
> > > > > 
> > > > >       ......This causes makedumpfile to fail.
> > > > > 
> > > > > 
> > > > > Thanks,
> > > > > 	dou.
> > > > > 
> > > > > > 	-Mike
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > 
> > > > > 
> > > > 
> > > > 
> > > > 
> > > 
> > > 
> > 
> > 
> > 
> 
> 

^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
@ 2018-02-07 12:45                               ` Baoquan He
  0 siblings, 0 replies; 349+ messages in thread
From: Baoquan He @ 2018-02-07 12:45 UTC (permalink / raw)
  To: Dou Liyang
  Cc: Mike Galbraith, Kirill A. Shutemov, Ingo Molnar, Andrew Morton,
	Peter Zijlstra, Greg Kroah-Hartman, Dave Young, kexec,
	linux-kernel, stable, Andy Lutomirski, linux-mm, Vivek Goyal,
	Cyrill Gorcunov, Thomas Gleixner, Borislav Petkov,
	Linus Torvalds, Kirill A. Shutemov, Takao Indoh

On 02/07/18 at 08:34pm, Dou Liyang wrote:
> 
> 
> At 02/07/2018 08:27 PM, Baoquan He wrote:
> > On 02/07/18 at 08:17pm, Dou Liyang wrote:
> > > Hi Baoquan,
> > > 
> > > At 02/07/2018 08:08 PM, Baoquan He wrote:
> > > > On 02/07/18 at 08:00pm, Dou Liyang wrote:
> > > > > Hi Kirill,Mike
> > > > > 
> > > > > At 02/07/2018 06:45 PM, Mike Galbraith wrote:
> > > > > > On Wed, 2018-02-07 at 13:41 +0300, Kirill A. Shutemov wrote:
> > > > > > > On Wed, Feb 07, 2018 at 05:25:05PM +0800, Dou Liyang wrote:
> > > > > > > > Hi All,
> > > > > > > > 
> > > > > > > > makedumpfile fails for me on the upstream kernel that contains
> > > > > > > > this patch. Did I miss something else?
> > > > > > > 
> > > > > > > None I'm aware of.
> > > > > > > 
> > > > > > > Is there a reason to suspect that the issue is related to the bug this patch
> > > > > > > fixed?
> > > > > > 
> > > > > 
> > > > > I ran a comparison test at my colleague Indoh's suggestion.

OK, I think I may know the reason. KASLR is enabled, right? You can try
disabling KASLR and running both tests again. phys_base and kaslr_offset are
taken from vmlinux, where they are fixed at compile time, so with KASLR they
may not match the values of the running kernel. Just a guess.

> > > > > 
> > > > > Revert your two commits:
> > > > > 
> > > > > commit 83e3c48729d9ebb7af5a31a504f3fd6aff0348c4
> > > > > Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > > > > Date:   Fri Sep 29 17:08:16 2017 +0300
> > > > > 
> > > > > commit 629a359bdb0e0652a8227b4ff3125431995fec6e
> > > > > Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > > > > Date:   Tue Nov 7 11:33:37 2017 +0300
> > > > > 
> > > > > ...and keep the others unchanged, and makedumpfile works well.
> > > > > 
> > > > > > Still works fine for me with .today.  Box is only 16GB desktop box though.
> > > > > > 
> > > > > Btw, on the upstream kernel that contains this patch, I ran two tests:
> > > > > 
> > > > >    1) Use makedumpfile as core_collector in /etc/kdump.conf, then
> > > > > trigger kdump with echo c >/proc/sysrq-trigger. makedumpfile
> > > > > works well and I get the vmcore file.
> > > > > 
> > > > >        ......It is OK
> > > > > 
> > > > >    2) Use cp as core_collector and do the same to get the vmcore file,
> > > > > then run makedumpfile on it as follows:
> > > > > 
> > > > >       [douly@localhost code]$ ./makedumpfile -d 31 --message-level 31 -x
> > > > > vmlinux_4.15+ vmcore_4.15+_from_cp_command vmcore_4.15+
> > > > 
> > > > Oh, then please ignore my previous comment. Adding '-D' can give more
> > > > debugging messages.
> > > 
> > > I added '-D'. Just like before, there are no further debugging messages:
> > > 
> > > BTW, I used crash to analyze the vmcore file created by the 'cp' command.
> > > 
> > >     ./crash ../makedumpfile/code/vmcore_4.15+_from_cp_command
> > > ../makedumpfile/code/vmlinux_4.15+
> > > 
> > > crash works well, which is interesting.
> > > 
> > > Thanks,
> > > 	dou.
> > > 
> > > The debugging message with '-D':
> > 
> > And what's the debugging output when the crash is triggered by sysrq?
> > 
> 
> kdump: dump target is /dev/vda2
> kdump: saving to /sysroot//var/crash/127.0.0.1-2018-02-07-07:31:56/
> [    2.751352] EXT4-fs (vda2): re-mounted. Opts: data=ordered
> kdump: saving vmcore-dmesg.txt
> kdump: saving vmcore-dmesg.txt complete
> kdump: saving vmcore
> sadump: does not have partition header
> sadump: read dump device as unknown format
> sadump: unknown format
> LOAD (0)
>   phys_start : 1000000
>   phys_end   : 2a86000
>   virt_start : ffffffff81000000
>   virt_end   : ffffffff82a86000
> LOAD (1)
>   phys_start : 1000
>   phys_end   : 9fc00
>   virt_start : ffff880000001000
>   virt_end   : ffff88000009fc00
> LOAD (2)
>   phys_start : 100000
>   phys_end   : 13000000
>   virt_start : ffff880000100000
>   virt_end   : ffff880013000000
> LOAD (3)
>   phys_start : 33000000
>   phys_end   : 7ffd7000
>   virt_start : ffff880033000000
>   virt_end   : ffff88007ffd7000
> Linux kdump
> page_size    : 4096
> 
> max_mapnr    : 7ffd7
> 
> Buffer size for the cyclic mode: 131061
> 
> num of NODEs : 1
> 
> 
> Memory type  : SPARSEMEM_EX
> 
> mem_map (0)
>   mem_map    : ffffea0000000000
>   pfn_start  : 0
>   pfn_end    : 8000
> mem_map (1)
>   mem_map    : ffffea0000200000
>   pfn_start  : 8000
>   pfn_end    : 10000
> mem_map (2)
>   mem_map    : ffffea0000400000
>   pfn_start  : 10000
>   pfn_end    : 18000
> mem_map (3)
>   mem_map    : ffffea0000600000
>   pfn_start  : 18000
>   pfn_end    : 20000
> mem_map (4)
>   mem_map    : ffffea0000800000
>   pfn_start  : 20000
>   pfn_end    : 28000
> mem_map (5)
>   mem_map    : ffffea0000a00000
>   pfn_start  : 28000
>   pfn_end    : 30000
> mem_map (6)
>   mem_map    : ffffea0000c00000
>   pfn_start  : 30000
>   pfn_end    : 38000
> mem_map (7)
>   mem_map    : ffffea0000e00000
>   pfn_start  : 38000
>   pfn_end    : 40000
> mem_map (8)
>   mem_map    : ffffea0001000000
>   pfn_start  : 40000
>   pfn_end    : 48000
> mem_map (9)
>   mem_map    : ffffea0001200000
>   pfn_start  : 48000
>   pfn_end    : 50000
> mem_map (10)
>   mem_map    : ffffea0001400000
>   pfn_start  : 50000
>   pfn_end    : 58000
> mem_map (11)
>   mem_map    : ffffea0001600000
>   pfn_start  : 58000
>   pfn_end    : 60000
> mem_map (12)
>   mem_map    : ffffea0001800000
>   pfn_start  : 60000
>   pfn_end    : 68000
> mem_map (13)
>   mem_map    : ffffea0001a00000
>   pfn_start  : 68000
>   pfn_end    : 70000
> mem_map (14)
>   mem_map    : ffffea0001c00000
>   pfn_start  : 70000
>   pfn_end    : 78000
> mem_map (15)
>   mem_map    : ffffea0001e00000
>   pfn_start  : 78000
>   pfn_end    : 7ffd7
> mmap() is available on the kernel.
> Copying data                                      : [100.0 %] -  eta: 0s
> Writing erase info...
> offset_eraseinfo: 9567fb0, size_eraseinfo: 0
> kdump: saving vmcore complete
> 
> Thanks,
> 	dou
> 
> > > 
> > > [douly@localhost code]$ ./makedumpfile -D -d 31 --message-level 31 -x
> > > vmlinux_4.15+  vmcore_4.15+_from_cp_command vmcore_4.15+
> > > sadump: does not have partition header
> > > sadump: read dump device as unknown format
> > > sadump: unknown format
> > > LOAD (0)
> > >    phys_start : 1000000
> > >    phys_end   : 2a86000
> > >    virt_start : ffffffff81000000
> > >    virt_end   : ffffffff82a86000
> > > LOAD (1)
> > >    phys_start : 1000
> > >    phys_end   : 9fc00
> > >    virt_start : ffff880000001000
> > >    virt_end   : ffff88000009fc00
> > > LOAD (2)
> > >    phys_start : 100000
> > >    phys_end   : 13000000
> > >    virt_start : ffff880000100000
> > >    virt_end   : ffff880013000000
> > > LOAD (3)
> > >    phys_start : 33000000
> > >    phys_end   : 7ffd7000
> > >    virt_start : ffff880033000000
> > >    virt_end   : ffff88007ffd7000
> > > Linux kdump
> > > page_size    : 4096
> > > 
> > > max_mapnr    : 7ffd7
> > > 
> > > Buffer size for the cyclic mode: 131061
> > > The kernel version is not supported.
> > > The makedumpfile operation may be incomplete.
> > > 
> > > num of NODEs : 1
> > > 
> > > 
> > > Memory type  : SPARSEMEM_EX
> > > 
> > > mem_map (0)
> > >    mem_map    : ffff88007ff26000
> > >    pfn_start  : 0
> > >    pfn_end    : 8000
> > > mem_map (1)
> > >    mem_map    : 0
> > >    pfn_start  : 8000
> > >    pfn_end    : 10000
> > > mem_map (2)
> > >    mem_map    : 0
> > >    pfn_start  : 10000
> > >    pfn_end    : 18000
> > > mem_map (3)
> > >    mem_map    : 0
> > >    pfn_start  : 18000
> > >    pfn_end    : 20000
> > > mem_map (4)
> > >    mem_map    : 0
> > >    pfn_start  : 20000
> > >    pfn_end    : 28000
> > > mem_map (5)
> > >    mem_map    : 0
> > >    pfn_start  : 28000
> > >    pfn_end    : 30000
> > > mem_map (6)
> > >    mem_map    : 0
> > >    pfn_start  : 30000
> > >    pfn_end    : 38000
> > > mem_map (7)
> > >    mem_map    : 0
> > >    pfn_start  : 38000
> > >    pfn_end    : 40000
> > > mem_map (8)
> > >    mem_map    : 0
> > >    pfn_start  : 40000
> > >    pfn_end    : 48000
> > > mem_map (9)
> > >    mem_map    : 0
> > >    pfn_start  : 48000
> > >    pfn_end    : 50000
> > > mem_map (10)
> > >    mem_map    : 0
> > >    pfn_start  : 50000
> > >    pfn_end    : 58000
> > > mem_map (11)
> > >    mem_map    : 0
> > >    pfn_start  : 58000
> > >    pfn_end    : 60000
> > > mem_map (12)
> > >    mem_map    : 0
> > >    pfn_start  : 60000
> > >    pfn_end    : 68000
> > > mem_map (13)
> > >    mem_map    : 0
> > >    pfn_start  : 68000
> > >    pfn_end    : 70000
> > > mem_map (14)
> > >    mem_map    : 0
> > >    pfn_start  : 70000
> > >    pfn_end    : 78000
> > > mem_map (15)
> > >    mem_map    : 0
> > >    pfn_start  : 78000
> > >    pfn_end    : 7ffd7
> > > mmap() is available on the kernel.
> > > Checking for memory holes                         : [100.0 %] |         STEP
> > > [Checking for memory holes  ] : 0.000014 seconds
> > > __vtop4_x86_64: Can't get a valid pte.
> > > readmem: Can't convert a virtual address(ffff88007ffd7000) to physical
> > > address.
> > > readmem: type_addr: 0, addr:ffff88007ffd7000, size:32768
> > > __exclude_unnecessary_pages: Can't read the buffer of struct page.
> > > create_2nd_bitmap: Can't exclude unnecessary pages.
> > > Checking for memory holes                         : [100.0 %] \         STEP
> > > [Checking for memory holes  ] : 0.000006 seconds
> > > Checking for memory holes                         : [100.0 %] -         STEP
> > > [Checking for memory holes  ] : 0.000004 seconds
> > > __vtop4_x86_64: Can't get a valid pte.
> > > readmem: Can't convert a virtual address(ffff88007ffd7000) to physical
> > > address.
> > > readmem: type_addr: 0, addr:ffff88007ffd7000, size:32768
> > > __exclude_unnecessary_pages: Can't read the buffer of struct page.
> > > create_2nd_bitmap: Can't exclude unnecessary pages.
> > > 
> > > makedumpfile Failed.
> > > 
> > > > 
> > > > > 
> > > > >       ......It causes makedumpfile failed.
> > > > > 
> > > > > 
> > > > > Thanks,
> > > > > 	dou.
> > > > > 
> > > > > > 	-Mike
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > 
> > > > > 
> > > > 
> > > > 
> > > > 
> > > 
> > > 
> > 
> > 
> > 
> 
> 



* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
@ 2018-02-07 12:45                               ` Baoquan He
  0 siblings, 0 replies; 349+ messages in thread
From: Baoquan He @ 2018-02-07 12:45 UTC (permalink / raw)
  To: Dou Liyang
  Cc: Takao Indoh, Peter Zijlstra, Greg Kroah-Hartman, Mike Galbraith,
	kexec, linux-kernel, stable, Andy Lutomirski, linux-mm,
	Thomas Gleixner, Kirill A. Shutemov, Linus Torvalds,
	Cyrill Gorcunov, Kirill A. Shutemov, Andrew Morton,
	Borislav Petkov, Dave Young, Ingo Molnar, Vivek Goyal

On 02/07/18 at 08:34pm, Dou Liyang wrote:
> 
> 
> At 02/07/2018 08:27 PM, Baoquan He wrote:
> > On 02/07/18 at 08:17pm, Dou Liyang wrote:
> > > Hi Baoquan,
> > > 
> > > At 02/07/2018 08:08 PM, Baoquan He wrote:
> > > > On 02/07/18 at 08:00pm, Dou Liyang wrote:
> > > > > Hi Kirill,Mike
> > > > > 
> > > > > At 02/07/2018 06:45 PM, Mike Galbraith wrote:
> > > > > > On Wed, 2018-02-07 at 13:41 +0300, Kirill A. Shutemov wrote:
> > > > > > > On Wed, Feb 07, 2018 at 05:25:05PM +0800, Dou Liyang wrote:
> > > > > > > > Hi All,
> > > > > > > > 
> > > > > > > > I met the makedumpfile failed in the upstream kernel which contained
> > > > > > > > this patch. Did I missed something else?
> > > > > > > 
> > > > > > > None I'm aware of.
> > > > > > > 
> > > > > > > Is there a reason to suspect that the issue is related to the bug this patch
> > > > > > > fixed?
> > > > > > 
> > > > > 
> > > > > I did a contrastive test by my colleagues Indoh's suggestion.

OK, I may have found the reason. KASLR is enabled, right? Try disabling
KASLR and running both tests again. phys_base and kaslr_offset are taken
from vmlinux, where they are fixed at compile time, so they may not match
the values the crashed kernel actually used. Just a guess.
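
As a rough sanity check on the failing log quoted further down, the sketch below looks up two addresses from that log in the LOAD segments printed by '-D' and compares the mem_map values with the vmemmap layout seen in the working run. It is only an illustration, not makedumpfile's translation code (the `__vtop4_x86_64` message suggests the tool walks page tables instead). The helper names, the half-open treatment of each segment, and the 64-byte struct page size (inferred from the 0x200000 stride between mem_map entries in the working run) are assumptions.

```python
# LOAD segments copied from the '-D' output quoted in this thread:
# (phys_start, phys_end, virt_start, virt_end)
LOAD_SEGMENTS = [
    (0x1000000, 0x2A86000, 0xFFFFFFFF81000000, 0xFFFFFFFF82A86000),
    (0x1000, 0x9FC00, 0xFFFF880000001000, 0xFFFF88000009FC00),
    (0x100000, 0x13000000, 0xFFFF880000100000, 0xFFFF880013000000),
    (0x33000000, 0x7FFD7000, 0xFFFF880033000000, 0xFFFF88007FFD7000),
]

# Assumptions for illustration: vmemmap base as seen in the working run,
# and sizeof(struct page) == 64 inferred from the stride in that run.
VMEMMAP_BASE = 0xFFFFEA0000000000
STRUCT_PAGE_SIZE = 64


def vaddr_to_paddr(vaddr, segments=LOAD_SEGMENTS):
    """Linear lookup: return the physical address, or None if no
    segment covers vaddr (each segment treated as [start, end))."""
    for phys_start, _phys_end, virt_start, virt_end in segments:
        if virt_start <= vaddr < virt_end:
            return phys_start + (vaddr - virt_start)
    return None


def expected_mem_map(pfn_start):
    """vmemmap address of the first struct page of a section."""
    return VMEMMAP_BASE + pfn_start * STRUCT_PAGE_SIZE


if __name__ == "__main__":
    # mem_map (0) as reported by the failing run: inside LOAD (3).
    print(hex(vaddr_to_paddr(0xFFFF88007FF26000)))
    # Address readmem complained about: equals virt_end of LOAD (3).
    print(vaddr_to_paddr(0xFFFF88007FFD7000))
    # mem_map (15) in the working run, for comparison.
    print(hex(expected_mem_map(0x78000)))
```

With these values, the address in the readmem error is exactly the virt_end of LOAD (3), so no segment covers it, and the mem_map (0) value from the failing run lies in the direct mapping rather than in the ffffea... range seen in the working run.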

> > > > > 
> > > > > Revert your two commits:
> > > > > 
> > > > > commit 83e3c48729d9ebb7af5a31a504f3fd6aff0348c4
> > > > > Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > > > > Date:   Fri Sep 29 17:08:16 2017 +0300
> > > > > 
> > > > > commit 629a359bdb0e0652a8227b4ff3125431995fec6e
> > > > > Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > > > > Date:   Tue Nov 7 11:33:37 2017 +0300
> > > > > 
> > > > > ...and keep others unchanged, the makedumpfile works well.
> > > > > 
> > > > > > Still works fine for me with .today.  Box is only 16GB desktop box though.
> > > > > > 
> > > > > Btw, In the upstream kernel which contained this patch, I did two tests:
> > > > > 
> > > > >    1) use the makedumpfile as core_collector in /etc/kdump.conf, then
> > > > > trigger the process of kdump by echo 1 >/proc/sysrq-trigger, the
> > > > > makedumpfile works well and I can get the vmcore file.
> > > > > 
> > > > >        ......It is OK
> > > > > 
> > > > >    2) use cp as core_collector, do the same operation to get the vmcore file.
> > > > > then use makedumpfile to do like above:
> > > > > 
> > > > >       [douly@localhost code]$ ./makedumpfile -d 31 --message-level 31 -x
> > > > > vmlinux_4.15+ vmcore_4.15+_from_cp_command vmcore_4.15+
> > > > 
> > > > Oh, then please ignore my previous comment. Adding '-D' can give more
> > > > debugging message.
> > > 
> > > I added '-D', Just like before, no more debugging message:
> > > 
> > > BTW, I use crash to analyze the vmcore file created by 'cp' command.
> > > 
> > >     ./crash ../makedumpfile/code/vmcore_4.15+_from_cp_command
> > > ../makedumpfile/code/vmlinux_4.15+
> > > 
> > > the crash works well, It's so interesting.
> > > 
> > > Thanks,
> > > 	dou.
> > > 
> > > The debugging message with '-D':
> > 
> > And what's the debugging printing when trigger crash by sysrq?
> > 
> 
> kdump: dump target is /dev/vda2
> kdump: saving to /sysroot//var/crash/127.0.0.1-2018-02-07-07:31:56/
> [    2.751352] EXT4-fs (vda2): re-mounted. Opts: data=ordered
> kdump: saving vmcore-dmesg.txt
> kdump: saving vmcore-dmesg.txt complete
> kdump: saving vmcore
> sadump: does not have partition header
> sadump: read dump device as unknown format
> sadump: unknown format
> LOAD (0)
>   phys_start : 1000000
>   phys_end   : 2a86000
>   virt_start : ffffffff81000000
>   virt_end   : ffffffff82a86000
> LOAD (1)
>   phys_start : 1000
>   phys_end   : 9fc00
>   virt_start : ffff880000001000
>   virt_end   : ffff88000009fc00
> LOAD (2)
>   phys_start : 100000
>   phys_end   : 13000000
>   virt_start : ffff880000100000
>   virt_end   : ffff880013000000
> LOAD (3)
>   phys_start : 33000000
>   phys_end   : 7ffd7000
>   virt_start : ffff880033000000
>   virt_end   : ffff88007ffd7000
> Linux kdump
> page_size    : 4096
> 
> max_mapnr    : 7ffd7
> 
> Buffer size for the cyclic mode: 131061
> 
> num of NODEs : 1
> 
> 
> Memory type  : SPARSEMEM_EX
> 
> mem_map (0)
>   mem_map    : ffffea0000000000
>   pfn_start  : 0
>   pfn_end    : 8000
> mem_map (1)
>   mem_map    : ffffea0000200000
>   pfn_start  : 8000
>   pfn_end    : 10000
> mem_map (2)
>   mem_map    : ffffea0000400000
>   pfn_start  : 10000
>   pfn_end    : 18000
> mem_map (3)
>   mem_map    : ffffea0000600000
>   pfn_start  : 18000
>   pfn_end    : 20000
> mem_map (4)
>   mem_map    : ffffea0000800000
>   pfn_start  : 20000
>   pfn_end    : 28000
> mem_map (5)
>   mem_map    : ffffea0000a00000
>   pfn_start  : 28000
>   pfn_end    : 30000
> mem_map (6)
>   mem_map    : ffffea0000c00000
>   pfn_start  : 30000
>   pfn_end    : 38000
> mem_map (7)
>   mem_map    : ffffea0000e00000
>   pfn_start  : 38000
>   pfn_end    : 40000
> mem_map (8)
>   mem_map    : ffffea0001000000
>   pfn_start  : 40000
>   pfn_end    : 48000
> mem_map (9)
>   mem_map    : ffffea0001200000
>   pfn_start  : 48000
>   pfn_end    : 50000
> mem_map (10)
>   mem_map    : ffffea0001400000
>   pfn_start  : 50000
>   pfn_end    : 58000
> mem_map (11)
>   mem_map    : ffffea0001600000
>   pfn_start  : 58000
>   pfn_end    : 60000
> mem_map (12)
>   mem_map    : ffffea0001800000
>   pfn_start  : 60000
>   pfn_end    : 68000
> mem_map (13)
>   mem_map    : ffffea0001a00000
>   pfn_start  : 68000
>   pfn_end    : 70000
> mem_map (14)
>   mem_map    : ffffea0001c00000
>   pfn_start  : 70000
>   pfn_end    : 78000
> mem_map (15)
>   mem_map    : ffffea0001e00000
>   pfn_start  : 78000
>   pfn_end    : 7ffd7
> mmap() is available on the kernel.
> Copying data                                      : [100.0 %] -  eta: 0s
> Writing erase info...
> offset_eraseinfo: 9567fb0, size_eraseinfo: 0
> kdump: saving vmcore complete
> 
> Thanks,
> 	dou
> 
> > > 
> > > [douly@localhost code]$ ./makedumpfile -D -d 31 --message-level 31 -x
> > > vmlinux_4.15+  vmcore_4.15+_from_cp_command vmcore_4.15+
> > > sadump: does not have partition header
> > > sadump: read dump device as unknown format
> > > sadump: unknown format
> > > LOAD (0)
> > >    phys_start : 1000000
> > >    phys_end   : 2a86000
> > >    virt_start : ffffffff81000000
> > >    virt_end   : ffffffff82a86000
> > > LOAD (1)
> > >    phys_start : 1000
> > >    phys_end   : 9fc00
> > >    virt_start : ffff880000001000
> > >    virt_end   : ffff88000009fc00
> > > LOAD (2)
> > >    phys_start : 100000
> > >    phys_end   : 13000000
> > >    virt_start : ffff880000100000
> > >    virt_end   : ffff880013000000
> > > LOAD (3)
> > >    phys_start : 33000000
> > >    phys_end   : 7ffd7000
> > >    virt_start : ffff880033000000
> > >    virt_end   : ffff88007ffd7000
> > > Linux kdump
> > > page_size    : 4096
> > > 
> > > max_mapnr    : 7ffd7
> > > 
> > > Buffer size for the cyclic mode: 131061
> > > The kernel version is not supported.
> > > The makedumpfile operation may be incomplete.
> > > 
> > > num of NODEs : 1
> > > 
> > > 
> > > Memory type  : SPARSEMEM_EX
> > > 
> > > mem_map (0)
> > >    mem_map    : ffff88007ff26000
> > >    pfn_start  : 0
> > >    pfn_end    : 8000
> > > mem_map (1)
> > >    mem_map    : 0
> > >    pfn_start  : 8000
> > >    pfn_end    : 10000
> > > mem_map (2)
> > >    mem_map    : 0
> > >    pfn_start  : 10000
> > >    pfn_end    : 18000
> > > mem_map (3)
> > >    mem_map    : 0
> > >    pfn_start  : 18000
> > >    pfn_end    : 20000
> > > mem_map (4)
> > >    mem_map    : 0
> > >    pfn_start  : 20000
> > >    pfn_end    : 28000
> > > mem_map (5)
> > >    mem_map    : 0
> > >    pfn_start  : 28000
> > >    pfn_end    : 30000
> > > mem_map (6)
> > >    mem_map    : 0
> > >    pfn_start  : 30000
> > >    pfn_end    : 38000
> > > mem_map (7)
> > >    mem_map    : 0
> > >    pfn_start  : 38000
> > >    pfn_end    : 40000
> > > mem_map (8)
> > >    mem_map    : 0
> > >    pfn_start  : 40000
> > >    pfn_end    : 48000
> > > mem_map (9)
> > >    mem_map    : 0
> > >    pfn_start  : 48000
> > >    pfn_end    : 50000
> > > mem_map (10)
> > >    mem_map    : 0
> > >    pfn_start  : 50000
> > >    pfn_end    : 58000
> > > mem_map (11)
> > >    mem_map    : 0
> > >    pfn_start  : 58000
> > >    pfn_end    : 60000
> > > mem_map (12)
> > >    mem_map    : 0
> > >    pfn_start  : 60000
> > >    pfn_end    : 68000
> > > mem_map (13)
> > >    mem_map    : 0
> > >    pfn_start  : 68000
> > >    pfn_end    : 70000
> > > mem_map (14)
> > >    mem_map    : 0
> > >    pfn_start  : 70000
> > >    pfn_end    : 78000
> > > mem_map (15)
> > >    mem_map    : 0
> > >    pfn_start  : 78000
> > >    pfn_end    : 7ffd7
> > > mmap() is available on the kernel.
> > > Checking for memory holes                         : [100.0 %] |         STEP
> > > [Checking for memory holes  ] : 0.000014 seconds
> > > __vtop4_x86_64: Can't get a valid pte.
> > > readmem: Can't convert a virtual address(ffff88007ffd7000) to physical
> > > address.
> > > readmem: type_addr: 0, addr:ffff88007ffd7000, size:32768
> > > __exclude_unnecessary_pages: Can't read the buffer of struct page.
> > > create_2nd_bitmap: Can't exclude unnecessary pages.
> > > Checking for memory holes                         : [100.0 %] \         STEP
> > > [Checking for memory holes  ] : 0.000006 seconds
> > > Checking for memory holes                         : [100.0 %] -         STEP
> > > [Checking for memory holes  ] : 0.000004 seconds
> > > __vtop4_x86_64: Can't get a valid pte.
> > > readmem: Can't convert a virtual address(ffff88007ffd7000) to physical
> > > address.
> > > readmem: type_addr: 0, addr:ffff88007ffd7000, size:32768
> > > __exclude_unnecessary_pages: Can't read the buffer of struct page.
> > > create_2nd_bitmap: Can't exclude unnecessary pages.
> > > 
> > > makedumpfile Failed.
> > > 
> > > > 
> > > > > 
> > > > >       ......It causes makedumpfile failed.
> > > > > 
> > > > > 
> > > > > Thanks,
> > > > > 	dou.
> > > > > 
> > > > > > 	-Mike
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > 
> > > > > 
> > > > 
> > > > 
> > > > 
> > > 
> > > 
> > 
> > 
> > 
> 
> 



* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
  2018-02-07 12:45                               ` Baoquan He
  (?)
@ 2018-02-08  1:14                                 ` Dou Liyang
  -1 siblings, 0 replies; 349+ messages in thread
From: Dou Liyang @ 2018-02-08  1:14 UTC (permalink / raw)
  To: Baoquan He
  Cc: Mike Galbraith, Kirill A. Shutemov, Ingo Molnar, Andrew Morton,
	Peter Zijlstra, Greg Kroah-Hartman, Dave Young, kexec,
	linux-kernel, stable, Andy Lutomirski, linux-mm, Vivek Goyal,
	Cyrill Gorcunov, Thomas Gleixner, Borislav Petkov,
	Linus Torvalds, Kirill A. Shutemov, Takao Indoh

Hi Baoquan,

At 02/07/2018 08:45 PM, Baoquan He wrote:
> On 02/07/18 at 08:34pm, Dou Liyang wrote:
>>
>>
>> At 02/07/2018 08:27 PM, Baoquan He wrote:
>>> On 02/07/18 at 08:17pm, Dou Liyang wrote:
>>>> Hi Baoquan,
>>>>
>>>> At 02/07/2018 08:08 PM, Baoquan He wrote:
>>>>> On 02/07/18 at 08:00pm, Dou Liyang wrote:
>>>>>> Hi Kirill,Mike
>>>>>>
>>>>>> At 02/07/2018 06:45 PM, Mike Galbraith wrote:
>>>>>>> On Wed, 2018-02-07 at 13:41 +0300, Kirill A. Shutemov wrote:
>>>>>>>> On Wed, Feb 07, 2018 at 05:25:05PM +0800, Dou Liyang wrote:
>>>>>>>>> Hi All,
>>>>>>>>>
>>>>>>>>> I met the makedumpfile failed in the upstream kernel which contained
>>>>>>>>> this patch. Did I missed something else?
>>>>>>>>
>>>>>>>> None I'm aware of.
>>>>>>>>
>>>>>>>> Is there a reason to suspect that the issue is related to the bug this patch
>>>>>>>> fixed?
>>>>>>>
>>>>>>
>>>>>> I did a contrastive test by my colleagues Indoh's suggestion.
> 
> OK, I may get the reason. kaslr is enabled, right? You can try to

I added 'nokaslr' to the kernel command line to disable KASLR.

# cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-4.15.0+ 
root=UUID=10f10326-c923-4098-86aa-afed5c54ee0b ro crashkernel=512M rhgb 
console=tty0 console=ttyS0 nokaslr LANG=en_US.UTF-8
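
The check above can also be scripted. This is a minimal sketch, assuming the parameter appears as its own whitespace-separated token; the function names are hypothetical, and the sample string is the command line shown above joined into one line.

```python
def kaslr_disabled(cmdline):
    """Return True if 'nokaslr' appears as a separate token."""
    return "nokaslr" in cmdline.split()


def read_cmdline(path="/proc/cmdline"):
    """Read the running kernel's command line (Linux only)."""
    with open(path) as f:
        return f.read().strip()


# Command line from this message, joined into a single line.
SAMPLE_CMDLINE = (
    "BOOT_IMAGE=/vmlinuz-4.15.0+ "
    "root=UUID=10f10326-c923-4098-86aa-afed5c54ee0b ro crashkernel=512M "
    "rhgb console=tty0 console=ttyS0 nokaslr LANG=en_US.UTF-8"
)

if __name__ == "__main__":
    print(kaslr_disabled(SAMPLE_CMDLINE))
```

This only reports what the currently running kernel was booted with. A vmcore captured from an earlier boot would still reflect that boot's settings.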

> disable kaslr and try them again. Because phys_base and kaslr_offset are
> got from vmlinux, while these are generated at compiling time. Just a
> guess.
> 

Oh, I will recompile the kernel with KASLR disabled in .config.


Thanks,
	dou.
>>>>>>
>>>>>> Revert your two commits:
>>>>>>
>>>>>> commit 83e3c48729d9ebb7af5a31a504f3fd6aff0348c4
>>>>>> Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>>>>>> Date:   Fri Sep 29 17:08:16 2017 +0300
>>>>>>
>>>>>> commit 629a359bdb0e0652a8227b4ff3125431995fec6e
>>>>>> Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>>>>>> Date:   Tue Nov 7 11:33:37 2017 +0300
>>>>>>
>>>>>> ...and keep others unchanged, the makedumpfile works well.
>>>>>>
>>>>>>> Still works fine for me with .today.  Box is only 16GB desktop box though.
>>>>>>>
>>>>>> Btw, In the upstream kernel which contained this patch, I did two tests:
>>>>>>
>>>>>>     1) use the makedumpfile as core_collector in /etc/kdump.conf, then
>>>>>> trigger the process of kdump by echo 1 >/proc/sysrq-trigger, the
>>>>>> makedumpfile works well and I can get the vmcore file.
>>>>>>
>>>>>>         ......It is OK
>>>>>>
>>>>>>     2) use cp as core_collector, do the same operation to get the vmcore file.
>>>>>> then use makedumpfile to do like above:
>>>>>>
>>>>>>        [douly@localhost code]$ ./makedumpfile -d 31 --message-level 31 -x
>>>>>> vmlinux_4.15+ vmcore_4.15+_from_cp_command vmcore_4.15+
>>>>>
>>>>> Oh, then please ignore my previous comment. Adding '-D' can give more
>>>>> debugging message.
>>>>
>>>> I added '-D', Just like before, no more debugging message:
>>>>
>>>> BTW, I use crash to analyze the vmcore file created by 'cp' command.
>>>>
>>>>      ./crash ../makedumpfile/code/vmcore_4.15+_from_cp_command
>>>> ../makedumpfile/code/vmlinux_4.15+
>>>>
>>>> the crash works well, It's so interesting.
>>>>
>>>> Thanks,
>>>> 	dou.
>>>>
>>>> The debugging message with '-D':
>>>
>>> And what's the debugging printing when trigger crash by sysrq?
>>>
>>
>> kdump: dump target is /dev/vda2
>> kdump: saving to /sysroot//var/crash/127.0.0.1-2018-02-07-07:31:56/
>> [    2.751352] EXT4-fs (vda2): re-mounted. Opts: data=ordered
>> kdump: saving vmcore-dmesg.txt
>> kdump: saving vmcore-dmesg.txt complete
>> kdump: saving vmcore
>> sadump: does not have partition header
>> sadump: read dump device as unknown format
>> sadump: unknown format
>> LOAD (0)
>>    phys_start : 1000000
>>    phys_end   : 2a86000
>>    virt_start : ffffffff81000000
>>    virt_end   : ffffffff82a86000
>> LOAD (1)
>>    phys_start : 1000
>>    phys_end   : 9fc00
>>    virt_start : ffff880000001000
>>    virt_end   : ffff88000009fc00
>> LOAD (2)
>>    phys_start : 100000
>>    phys_end   : 13000000
>>    virt_start : ffff880000100000
>>    virt_end   : ffff880013000000
>> LOAD (3)
>>    phys_start : 33000000
>>    phys_end   : 7ffd7000
>>    virt_start : ffff880033000000
>>    virt_end   : ffff88007ffd7000
>> Linux kdump
>> page_size    : 4096
>>
>> max_mapnr    : 7ffd7
>>
>> Buffer size for the cyclic mode: 131061
>>
>> num of NODEs : 1
>>
>>
>> Memory type  : SPARSEMEM_EX
>>
>> mem_map (0)
>>    mem_map    : ffffea0000000000
>>    pfn_start  : 0
>>    pfn_end    : 8000
>> mem_map (1)
>>    mem_map    : ffffea0000200000
>>    pfn_start  : 8000
>>    pfn_end    : 10000
>> mem_map (2)
>>    mem_map    : ffffea0000400000
>>    pfn_start  : 10000
>>    pfn_end    : 18000
>> mem_map (3)
>>    mem_map    : ffffea0000600000
>>    pfn_start  : 18000
>>    pfn_end    : 20000
>> mem_map (4)
>>    mem_map    : ffffea0000800000
>>    pfn_start  : 20000
>>    pfn_end    : 28000
>> mem_map (5)
>>    mem_map    : ffffea0000a00000
>>    pfn_start  : 28000
>>    pfn_end    : 30000
>> mem_map (6)
>>    mem_map    : ffffea0000c00000
>>    pfn_start  : 30000
>>    pfn_end    : 38000
>> mem_map (7)
>>    mem_map    : ffffea0000e00000
>>    pfn_start  : 38000
>>    pfn_end    : 40000
>> mem_map (8)
>>    mem_map    : ffffea0001000000
>>    pfn_start  : 40000
>>    pfn_end    : 48000
>> mem_map (9)
>>    mem_map    : ffffea0001200000
>>    pfn_start  : 48000
>>    pfn_end    : 50000
>> mem_map (10)
>>    mem_map    : ffffea0001400000
>>    pfn_start  : 50000
>>    pfn_end    : 58000
>> mem_map (11)
>>    mem_map    : ffffea0001600000
>>    pfn_start  : 58000
>>    pfn_end    : 60000
>> mem_map (12)
>>    mem_map    : ffffea0001800000
>>    pfn_start  : 60000
>>    pfn_end    : 68000
>> mem_map (13)
>>    mem_map    : ffffea0001a00000
>>    pfn_start  : 68000
>>    pfn_end    : 70000
>> mem_map (14)
>>    mem_map    : ffffea0001c00000
>>    pfn_start  : 70000
>>    pfn_end    : 78000
>> mem_map (15)
>>    mem_map    : ffffea0001e00000
>>    pfn_start  : 78000
>>    pfn_end    : 7ffd7
>> mmap() is available on the kernel.
>> Copying data                                      : [100.0 %] -  eta: 0s
>> Writing erase info...
>> offset_eraseinfo: 9567fb0, size_eraseinfo: 0
>> kdump: saving vmcore complete
>>
>> Thanks,
>> 	dou
>>
>>>>
>>>> [douly@localhost code]$ ./makedumpfile -D -d 31 --message-level 31 -x
>>>> vmlinux_4.15+  vmcore_4.15+_from_cp_command vmcore_4.15+
>>>> sadump: does not have partition header
>>>> sadump: read dump device as unknown format
>>>> sadump: unknown format
>>>> LOAD (0)
>>>>     phys_start : 1000000
>>>>     phys_end   : 2a86000
>>>>     virt_start : ffffffff81000000
>>>>     virt_end   : ffffffff82a86000
>>>> LOAD (1)
>>>>     phys_start : 1000
>>>>     phys_end   : 9fc00
>>>>     virt_start : ffff880000001000
>>>>     virt_end   : ffff88000009fc00
>>>> LOAD (2)
>>>>     phys_start : 100000
>>>>     phys_end   : 13000000
>>>>     virt_start : ffff880000100000
>>>>     virt_end   : ffff880013000000
>>>> LOAD (3)
>>>>     phys_start : 33000000
>>>>     phys_end   : 7ffd7000
>>>>     virt_start : ffff880033000000
>>>>     virt_end   : ffff88007ffd7000
>>>> Linux kdump
>>>> page_size    : 4096
>>>>
>>>> max_mapnr    : 7ffd7
>>>>
>>>> Buffer size for the cyclic mode: 131061
>>>> The kernel version is not supported.
>>>> The makedumpfile operation may be incomplete.
>>>>
>>>> num of NODEs : 1
>>>>
>>>>
>>>> Memory type  : SPARSEMEM_EX
>>>>
>>>> mem_map (0)
>>>>     mem_map    : ffff88007ff26000
>>>>     pfn_start  : 0
>>>>     pfn_end    : 8000
>>>> mem_map (1)
>>>>     mem_map    : 0
>>>>     pfn_start  : 8000
>>>>     pfn_end    : 10000
>>>> mem_map (2)
>>>>     mem_map    : 0
>>>>     pfn_start  : 10000
>>>>     pfn_end    : 18000
>>>> mem_map (3)
>>>>     mem_map    : 0
>>>>     pfn_start  : 18000
>>>>     pfn_end    : 20000
>>>> mem_map (4)
>>>>     mem_map    : 0
>>>>     pfn_start  : 20000
>>>>     pfn_end    : 28000
>>>> mem_map (5)
>>>>     mem_map    : 0
>>>>     pfn_start  : 28000
>>>>     pfn_end    : 30000
>>>> mem_map (6)
>>>>     mem_map    : 0
>>>>     pfn_start  : 30000
>>>>     pfn_end    : 38000
>>>> mem_map (7)
>>>>     mem_map    : 0
>>>>     pfn_start  : 38000
>>>>     pfn_end    : 40000
>>>> mem_map (8)
>>>>     mem_map    : 0
>>>>     pfn_start  : 40000
>>>>     pfn_end    : 48000
>>>> mem_map (9)
>>>>     mem_map    : 0
>>>>     pfn_start  : 48000
>>>>     pfn_end    : 50000
>>>> mem_map (10)
>>>>     mem_map    : 0
>>>>     pfn_start  : 50000
>>>>     pfn_end    : 58000
>>>> mem_map (11)
>>>>     mem_map    : 0
>>>>     pfn_start  : 58000
>>>>     pfn_end    : 60000
>>>> mem_map (12)
>>>>     mem_map    : 0
>>>>     pfn_start  : 60000
>>>>     pfn_end    : 68000
>>>> mem_map (13)
>>>>     mem_map    : 0
>>>>     pfn_start  : 68000
>>>>     pfn_end    : 70000
>>>> mem_map (14)
>>>>     mem_map    : 0
>>>>     pfn_start  : 70000
>>>>     pfn_end    : 78000
>>>> mem_map (15)
>>>>     mem_map    : 0
>>>>     pfn_start  : 78000
>>>>     pfn_end    : 7ffd7
>>>> mmap() is available on the kernel.
>>>> Checking for memory holes                         : [100.0 %] |         STEP
>>>> [Checking for memory holes  ] : 0.000014 seconds
>>>> __vtop4_x86_64: Can't get a valid pte.
>>>> readmem: Can't convert a virtual address(ffff88007ffd7000) to physical
>>>> address.
>>>> readmem: type_addr: 0, addr:ffff88007ffd7000, size:32768
>>>> __exclude_unnecessary_pages: Can't read the buffer of struct page.
>>>> create_2nd_bitmap: Can't exclude unnecessary pages.
>>>> Checking for memory holes                         : [100.0 %] \         STEP
>>>> [Checking for memory holes  ] : 0.000006 seconds
>>>> Checking for memory holes                         : [100.0 %] -         STEP
>>>> [Checking for memory holes  ] : 0.000004 seconds
>>>> __vtop4_x86_64: Can't get a valid pte.
>>>> readmem: Can't convert a virtual address(ffff88007ffd7000) to physical
>>>> address.
>>>> readmem: type_addr: 0, addr:ffff88007ffd7000, size:32768
>>>> __exclude_unnecessary_pages: Can't read the buffer of struct page.
>>>> create_2nd_bitmap: Can't exclude unnecessary pages.
>>>>
>>>> makedumpfile Failed.
>>>>
>>>>>
>>>>>>
>>>>>>        ......It causes makedumpfile failed.
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> 	dou.
>>>>>>
>>>>>>> 	-Mike
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>>
> 
> 
> 


* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
@ 2018-02-08  1:14                                 ` Dou Liyang
  0 siblings, 0 replies; 349+ messages in thread
From: Dou Liyang @ 2018-02-08  1:14 UTC (permalink / raw)
  To: Baoquan He
  Cc: Mike Galbraith, Kirill A. Shutemov, Ingo Molnar, Andrew Morton,
	Peter Zijlstra, Greg Kroah-Hartman, Dave Young, kexec,
	linux-kernel, stable, Andy Lutomirski, linux-mm, Vivek Goyal,
	Cyrill Gorcunov, Thomas Gleixner, Borislav Petkov,
	Linus Torvalds, Kirill A. Shutemov, Takao Indoh

Hi Baoquan,

At 02/07/2018 08:45 PM, Baoquan He wrote:
> On 02/07/18 at 08:34pm, Dou Liyang wrote:
>>
>>
>> At 02/07/2018 08:27 PM, Baoquan He wrote:
>>> On 02/07/18 at 08:17pm, Dou Liyang wrote:
>>>> Hi Baoquan,
>>>>
>>>> At 02/07/2018 08:08 PM, Baoquan He wrote:
>>>>> On 02/07/18 at 08:00pm, Dou Liyang wrote:
>>>>>> Hi Kirill,Mike
>>>>>>
>>>>>> At 02/07/2018 06:45 PM, Mike Galbraith wrote:
>>>>>>> On Wed, 2018-02-07 at 13:41 +0300, Kirill A. Shutemov wrote:
>>>>>>>> On Wed, Feb 07, 2018 at 05:25:05PM +0800, Dou Liyang wrote:
>>>>>>>>> Hi All,
>>>>>>>>>
>>>>>>>>> I met the makedumpfile failed in the upstream kernel which contained
>>>>>>>>> this patch. Did I missed something else?
>>>>>>>>
>>>>>>>> None I'm aware of.
>>>>>>>>
>>>>>>>> Is there a reason to suspect that the issue is related to the bug this patch
>>>>>>>> fixed?
>>>>>>>
>>>>>>
>>>>>> I did a contrastive test by my colleagues Indoh's suggestion.
> 
> OK, I may get the reason. kaslr is enabled, right? You can try to

I added 'nokaslr' to the kernel command line to disable KASLR.

# cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-4.15.0+ 
root=UUID=10f10326-c923-4098-86aa-afed5c54ee0b ro crashkernel=512M rhgb 
console=tty0 console=ttyS0 nokaslr LANG=en_US.UTF-8

> disable kaslr and try them again. Because phys_base and kaslr_offset are
> got from vmlinux, while these are generated at compiling time. Just a
> guess.
> 

Oh, I will recompile the kernel with KASLR disabled in .config.


Thanks,
	dou.
>>>>>>
>>>>>> Revert your two commits:
>>>>>>
>>>>>> commit 83e3c48729d9ebb7af5a31a504f3fd6aff0348c4
>>>>>> Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>>>>>> Date:   Fri Sep 29 17:08:16 2017 +0300
>>>>>>
>>>>>> commit 629a359bdb0e0652a8227b4ff3125431995fec6e
>>>>>> Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>>>>>> Date:   Tue Nov 7 11:33:37 2017 +0300
>>>>>>
>>>>>> ...and keep others unchanged, the makedumpfile works well.
>>>>>>
>>>>>>> Still works fine for me with .today.  Box is only 16GB desktop box though.
>>>>>>>
>>>>>> Btw, In the upstream kernel which contained this patch, I did two tests:
>>>>>>
>>>>>>     1) use the makedumpfile as core_collector in /etc/kdump.conf, then
>>>>>> trigger the process of kdump by echo 1 >/proc/sysrq-trigger, the
>>>>>> makedumpfile works well and I can get the vmcore file.
>>>>>>
>>>>>>         ......It is OK
>>>>>>
>>>>>>     2) use cp as core_collector, do the same operation to get the vmcore file.
>>>>>> then use makedumpfile to do like above:
>>>>>>
>>>>>>        [douly@localhost code]$ ./makedumpfile -d 31 --message-level 31 -x
>>>>>> vmlinux_4.15+ vmcore_4.15+_from_cp_command vmcore_4.15+
>>>>>
>>>>> Oh, then please ignore my previous comment. Adding '-D' can give more
>>>>> debugging message.
>>>>
>>>> I added '-D', Just like before, no more debugging message:
>>>>
>>>> BTW, I use crash to analyze the vmcore file created by 'cp' command.
>>>>
>>>>      ./crash ../makedumpfile/code/vmcore_4.15+_from_cp_command
>>>> ../makedumpfile/code/vmlinux_4.15+
>>>>
>>>> the crash works well, It's so interesting.
>>>>
>>>> Thanks,
>>>> 	dou.
>>>>
>>>> The debugging message with '-D':
>>>
>>> And what's the debugging printing when trigger crash by sysrq?
>>>
>>
>> kdump: dump target is /dev/vda2
>> kdump: saving to /sysroot//var/crash/127.0.0.1-2018-02-07-07:31:56/
>> [    2.751352] EXT4-fs (vda2): re-mounted. Opts: data=ordered
>> kdump: saving vmcore-dmesg.txt
>> kdump: saving vmcore-dmesg.txt complete
>> kdump: saving vmcore
>> sadump: does not have partition header
>> sadump: read dump device as unknown format
>> sadump: unknown format
>> LOAD (0)
>>    phys_start : 1000000
>>    phys_end   : 2a86000
>>    virt_start : ffffffff81000000
>>    virt_end   : ffffffff82a86000
>> LOAD (1)
>>    phys_start : 1000
>>    phys_end   : 9fc00
>>    virt_start : ffff880000001000
>>    virt_end   : ffff88000009fc00
>> LOAD (2)
>>    phys_start : 100000
>>    phys_end   : 13000000
>>    virt_start : ffff880000100000
>>    virt_end   : ffff880013000000
>> LOAD (3)
>>    phys_start : 33000000
>>    phys_end   : 7ffd7000
>>    virt_start : ffff880033000000
>>    virt_end   : ffff88007ffd7000
>> Linux kdump
>> page_size    : 4096
>>
>> max_mapnr    : 7ffd7
>>
>> Buffer size for the cyclic mode: 131061
>>
>> num of NODEs : 1
>>
>>
>> Memory type  : SPARSEMEM_EX
>>
>> mem_map (0)
>>    mem_map    : ffffea0000000000
>>    pfn_start  : 0
>>    pfn_end    : 8000
>> mem_map (1)
>>    mem_map    : ffffea0000200000
>>    pfn_start  : 8000
>>    pfn_end    : 10000
>> mem_map (2)
>>    mem_map    : ffffea0000400000
>>    pfn_start  : 10000
>>    pfn_end    : 18000
>> mem_map (3)
>>    mem_map    : ffffea0000600000
>>    pfn_start  : 18000
>>    pfn_end    : 20000
>> mem_map (4)
>>    mem_map    : ffffea0000800000
>>    pfn_start  : 20000
>>    pfn_end    : 28000
>> mem_map (5)
>>    mem_map    : ffffea0000a00000
>>    pfn_start  : 28000
>>    pfn_end    : 30000
>> mem_map (6)
>>    mem_map    : ffffea0000c00000
>>    pfn_start  : 30000
>>    pfn_end    : 38000
>> mem_map (7)
>>    mem_map    : ffffea0000e00000
>>    pfn_start  : 38000
>>    pfn_end    : 40000
>> mem_map (8)
>>    mem_map    : ffffea0001000000
>>    pfn_start  : 40000
>>    pfn_end    : 48000
>> mem_map (9)
>>    mem_map    : ffffea0001200000
>>    pfn_start  : 48000
>>    pfn_end    : 50000
>> mem_map (10)
>>    mem_map    : ffffea0001400000
>>    pfn_start  : 50000
>>    pfn_end    : 58000
>> mem_map (11)
>>    mem_map    : ffffea0001600000
>>    pfn_start  : 58000
>>    pfn_end    : 60000
>> mem_map (12)
>>    mem_map    : ffffea0001800000
>>    pfn_start  : 60000
>>    pfn_end    : 68000
>> mem_map (13)
>>    mem_map    : ffffea0001a00000
>>    pfn_start  : 68000
>>    pfn_end    : 70000
>> mem_map (14)
>>    mem_map    : ffffea0001c00000
>>    pfn_start  : 70000
>>    pfn_end    : 78000
>> mem_map (15)
>>    mem_map    : ffffea0001e00000
>>    pfn_start  : 78000
>>    pfn_end    : 7ffd7
>> mmap() is available on the kernel.
>> Copying data                                      : [100.0 %] -  eta: 0s
>> Writing erase info...
>> offset_eraseinfo: 9567fb0, size_eraseinfo: 0
>> kdump: saving vmcore complete
>>
>> Thanks,
>> 	dou
>>
>>>>
>>>> [douly@localhost code]$ ./makedumpfile -D -d 31 --message-level 31 -x
>>>> vmlinux_4.15+  vmcore_4.15+_from_cp_command vmcore_4.15+
>>>> sadump: does not have partition header
>>>> sadump: read dump device as unknown format
>>>> sadump: unknown format
>>>> LOAD (0)
>>>>     phys_start : 1000000
>>>>     phys_end   : 2a86000
>>>>     virt_start : ffffffff81000000
>>>>     virt_end   : ffffffff82a86000
>>>> LOAD (1)
>>>>     phys_start : 1000
>>>>     phys_end   : 9fc00
>>>>     virt_start : ffff880000001000
>>>>     virt_end   : ffff88000009fc00
>>>> LOAD (2)
>>>>     phys_start : 100000
>>>>     phys_end   : 13000000
>>>>     virt_start : ffff880000100000
>>>>     virt_end   : ffff880013000000
>>>> LOAD (3)
>>>>     phys_start : 33000000
>>>>     phys_end   : 7ffd7000
>>>>     virt_start : ffff880033000000
>>>>     virt_end   : ffff88007ffd7000
>>>> Linux kdump
>>>> page_size    : 4096
>>>>
>>>> max_mapnr    : 7ffd7
>>>>
>>>> Buffer size for the cyclic mode: 131061
>>>> The kernel version is not supported.
>>>> The makedumpfile operation may be incomplete.
>>>>
>>>> num of NODEs : 1
>>>>
>>>>
>>>> Memory type  : SPARSEMEM_EX
>>>>
>>>> mem_map (0)
>>>>     mem_map    : ffff88007ff26000
>>>>     pfn_start  : 0
>>>>     pfn_end    : 8000
>>>> mem_map (1)
>>>>     mem_map    : 0
>>>>     pfn_start  : 8000
>>>>     pfn_end    : 10000
>>>> mem_map (2)
>>>>     mem_map    : 0
>>>>     pfn_start  : 10000
>>>>     pfn_end    : 18000
>>>> mem_map (3)
>>>>     mem_map    : 0
>>>>     pfn_start  : 18000
>>>>     pfn_end    : 20000
>>>> mem_map (4)
>>>>     mem_map    : 0
>>>>     pfn_start  : 20000
>>>>     pfn_end    : 28000
>>>> mem_map (5)
>>>>     mem_map    : 0
>>>>     pfn_start  : 28000
>>>>     pfn_end    : 30000
>>>> mem_map (6)
>>>>     mem_map    : 0
>>>>     pfn_start  : 30000
>>>>     pfn_end    : 38000
>>>> mem_map (7)
>>>>     mem_map    : 0
>>>>     pfn_start  : 38000
>>>>     pfn_end    : 40000
>>>> mem_map (8)
>>>>     mem_map    : 0
>>>>     pfn_start  : 40000
>>>>     pfn_end    : 48000
>>>> mem_map (9)
>>>>     mem_map    : 0
>>>>     pfn_start  : 48000
>>>>     pfn_end    : 50000
>>>> mem_map (10)
>>>>     mem_map    : 0
>>>>     pfn_start  : 50000
>>>>     pfn_end    : 58000
>>>> mem_map (11)
>>>>     mem_map    : 0
>>>>     pfn_start  : 58000
>>>>     pfn_end    : 60000
>>>> mem_map (12)
>>>>     mem_map    : 0
>>>>     pfn_start  : 60000
>>>>     pfn_end    : 68000
>>>> mem_map (13)
>>>>     mem_map    : 0
>>>>     pfn_start  : 68000
>>>>     pfn_end    : 70000
>>>> mem_map (14)
>>>>     mem_map    : 0
>>>>     pfn_start  : 70000
>>>>     pfn_end    : 78000
>>>> mem_map (15)
>>>>     mem_map    : 0
>>>>     pfn_start  : 78000
>>>>     pfn_end    : 7ffd7
>>>> mmap() is available on the kernel.
>>>> Checking for memory holes                         : [100.0 %] |         STEP
>>>> [Checking for memory holes  ] : 0.000014 seconds
>>>> __vtop4_x86_64: Can't get a valid pte.
>>>> readmem: Can't convert a virtual address(ffff88007ffd7000) to physical
>>>> address.
>>>> readmem: type_addr: 0, addr:ffff88007ffd7000, size:32768
>>>> __exclude_unnecessary_pages: Can't read the buffer of struct page.
>>>> create_2nd_bitmap: Can't exclude unnecessary pages.
>>>> Checking for memory holes                         : [100.0 %] \         STEP
>>>> [Checking for memory holes  ] : 0.000006 seconds
>>>> Checking for memory holes                         : [100.0 %] -         STEP
>>>> [Checking for memory holes  ] : 0.000004 seconds
>>>> __vtop4_x86_64: Can't get a valid pte.
>>>> readmem: Can't convert a virtual address(ffff88007ffd7000) to physical
>>>> address.
>>>> readmem: type_addr: 0, addr:ffff88007ffd7000, size:32768
>>>> __exclude_unnecessary_pages: Can't read the buffer of struct page.
>>>> create_2nd_bitmap: Can't exclude unnecessary pages.
>>>>
>>>> makedumpfile Failed.
>>>>
>>>>>
>>>>>>
>>>>>>        ......It causes makedumpfile failed.
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> 	dou.
>>>>>>
>>>>>>> 	-Mike
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>>
> 
> 
> 



^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
  2018-02-08  1:14                                 ` Dou Liyang
  (?)
@ 2018-02-08  1:23                                   ` Baoquan He
  -1 siblings, 0 replies; 349+ messages in thread
From: Baoquan He @ 2018-02-08  1:23 UTC (permalink / raw)
  To: Dou Liyang
  Cc: Takao Indoh, Peter Zijlstra, Greg Kroah-Hartman, Mike Galbraith,
	kexec, linux-kernel, stable, Andy Lutomirski, linux-mm,
	Thomas Gleixner, Kirill A. Shutemov, Linus Torvalds,
	Cyrill Gorcunov, Kirill A. Shutemov, Andrew Morton,
	Borislav Petkov, Dave Young, Ingo Molnar, Vivek Goyal

On 02/08/18 at 09:14am, Dou Liyang wrote:
> Hi Baoquan,
> 
> At 02/07/2018 08:45 PM, Baoquan He wrote:
> > On 02/07/18 at 08:34pm, Dou Liyang wrote:
> > > 
> > > 
> > > At 02/07/2018 08:27 PM, Baoquan He wrote:
> > > > On 02/07/18 at 08:17pm, Dou Liyang wrote:
> > > > > Hi Baoquan,
> > > > > 
> > > > > At 02/07/2018 08:08 PM, Baoquan He wrote:
> > > > > > On 02/07/18 at 08:00pm, Dou Liyang wrote:
> > > > > > > Hi Kirill,Mike
> > > > > > > 
> > > > > > > At 02/07/2018 06:45 PM, Mike Galbraith wrote:
> > > > > > > > On Wed, 2018-02-07 at 13:41 +0300, Kirill A. Shutemov wrote:
> > > > > > > > > On Wed, Feb 07, 2018 at 05:25:05PM +0800, Dou Liyang wrote:
> > > > > > > > > > Hi All,
> > > > > > > > > > 
> > > > > > > > > > I met the makedumpfile failed in the upstream kernel which contained
> > > > > > > > > > this patch. Did I missed something else?
> > > > > > > > > 
> > > > > > > > > None I'm aware of.
> > > > > > > > > 
> > > > > > > > > Is there a reason to suspect that the issue is related to the bug this patch
> > > > > > > > > fixed?
> > > > > > > > 
> > > > > > > 
> > > > > > > I did a contrastive test by my colleagues Indoh's suggestion.
> > 
> > OK, I may get the reason. kaslr is enabled, right? You can try to
> 
> I add 'nokaslr' to disable the KASLR feature.
    ~~~added??
> 
> # cat /proc/cmdline
> BOOT_IMAGE=/vmlinuz-4.15.0+ root=UUID=10f10326-c923-4098-86aa-afed5c54ee0b
> ro crashkernel=512M rhgb console=tty0 console=ttyS0 nokaslr LANG=en_US.UTF-8
> 
> > disable kaslr and try them again. Because phys_base and kaslr_offset are
> > got from vmlinux, while these are generated at compiling time. Just a
> > guess.
> > 
> 
> Oh, I will recompile the kernel with KASLR disabled in .config.

Then it's not what I guessed. makedumpfile itself needs debugging, since using
vmlinux is a different code path that few people normally use.
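
For anyone comparing the two '-D' logs quoted in this thread, a minimal
Python sketch can pull out the LOAD ranges and mem_map values and flag what
differs between the runs. It is not part of makedumpfile; the function names
are made up for illustration, it assumes the log layout shown in this thread,
and it treats virt_end as exclusive:

```python
import re

# "LOAD (3)" / "mem_map (12)" start a block; "name : hexvalue" lines fill it.
HEADER = re.compile(r"^(LOAD|mem_map) \((\d+)\)$")
FIELD = re.compile(r"^(\w+)\s*:\s*([0-9a-f]+)$")


def parse_debug_log(text):
    """Collect LOAD and mem_map blocks from 'makedumpfile -D' output.

    Returns {"LOAD": {n: {...}}, "mem_map": {n: {...}}} with every field
    value parsed as hexadecimal (the log prints them without a 0x prefix).
    """
    sections = {"LOAD": {}, "mem_map": {}}
    current = None
    for raw in text.splitlines():
        # Drop mail quote markers such as ">>>> " or "> > > " first.
        line = raw.lstrip("> ").strip()
        header = HEADER.match(line)
        field = FIELD.match(line)
        if header:
            kind, index = header.group(1), int(header.group(2))
            current = sections[kind].setdefault(index, {})
        elif field and current is not None:
            current[field.group(1)] = int(field.group(2), 16)
        else:
            # Any other line ends the current block.
            current = None
    return sections


def null_mem_maps(sections):
    """Indexes of mem_map blocks whose mem_map pointer was printed as 0."""
    return sorted(
        index
        for index, fields in sections["mem_map"].items()
        if fields.get("mem_map") == 0
    )


def covered_by_load(sections, addr):
    """True if addr lies in some LOAD range [virt_start, virt_end)."""
    return any(
        fields["virt_start"] <= addr < fields["virt_end"]
        for fields in sections["LOAD"].values()
        if "virt_start" in fields and "virt_end" in fields
    )
```

Feeding it the failing log, null_mem_maps() lists mem_map (1) through (15),
and covered_by_load() is false for ffff88007ffd7000, the address in the
readmem error, because that value is exactly virt_end of LOAD (3). On the
sysrq-triggered log every mem_map is a non-zero ffffea... address.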

> 
> 
> Thanks,
> 	dou.
> > > > > > > 
> > > > > > > Revert your two commits:
> > > > > > > 
> > > > > > > commit 83e3c48729d9ebb7af5a31a504f3fd6aff0348c4
> > > > > > > Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > > > > > > Date:   Fri Sep 29 17:08:16 2017 +0300
> > > > > > > 
> > > > > > > commit 629a359bdb0e0652a8227b4ff3125431995fec6e
> > > > > > > Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > > > > > > Date:   Tue Nov 7 11:33:37 2017 +0300
> > > > > > > 
> > > > > > > ...and keep others unchanged, the makedumpfile works well.
> > > > > > > 
> > > > > > > > Still works fine for me with .today.  Box is only 16GB desktop box though.
> > > > > > > > 
> > > > > > > Btw, In the upstream kernel which contained this patch, I did two tests:
> > > > > > > 
> > > > > > >     1) use the makedumpfile as core_collector in /etc/kdump.conf, then
> > > > > > > trigger the process of kdump by echo 1 >/proc/sysrq-trigger, the
> > > > > > > makedumpfile works well and I can get the vmcore file.
> > > > > > > 
> > > > > > >         ......It is OK
> > > > > > > 
> > > > > > >     2) use cp as core_collector, do the same operation to get the vmcore file.
> > > > > > > then use makedumpfile to do like above:
> > > > > > > 
> > > > > > >        [douly@localhost code]$ ./makedumpfile -d 31 --message-level 31 -x
> > > > > > > vmlinux_4.15+ vmcore_4.15+_from_cp_command vmcore_4.15+
> > > > > > 
> > > > > > Oh, then please ignore my previous comment. Adding '-D' can give more
> > > > > > debugging message.
> > > > > 
> > > > > I added '-D', Just like before, no more debugging message:
> > > > > 
> > > > > BTW, I use crash to analyze the vmcore file created by 'cp' command.
> > > > > 
> > > > >      ./crash ../makedumpfile/code/vmcore_4.15+_from_cp_command
> > > > > ../makedumpfile/code/vmlinux_4.15+
> > > > > 
> > > > > the crash works well, It's so interesting.
> > > > > 
> > > > > Thanks,
> > > > > 	dou.
> > > > > 
> > > > > The debugging message with '-D':
> > > > 
> > > > And what's the debugging printing when trigger crash by sysrq?
> > > > 
> > > 
> > > kdump: dump target is /dev/vda2
> > > kdump: saving to /sysroot//var/crash/127.0.0.1-2018-02-07-07:31:56/
> > > [    2.751352] EXT4-fs (vda2): re-mounted. Opts: data=ordered
> > > kdump: saving vmcore-dmesg.txt
> > > kdump: saving vmcore-dmesg.txt complete
> > > kdump: saving vmcore
> > > sadump: does not have partition header
> > > sadump: read dump device as unknown format
> > > sadump: unknown format
> > > LOAD (0)
> > >    phys_start : 1000000
> > >    phys_end   : 2a86000
> > >    virt_start : ffffffff81000000
> > >    virt_end   : ffffffff82a86000
> > > LOAD (1)
> > >    phys_start : 1000
> > >    phys_end   : 9fc00
> > >    virt_start : ffff880000001000
> > >    virt_end   : ffff88000009fc00
> > > LOAD (2)
> > >    phys_start : 100000
> > >    phys_end   : 13000000
> > >    virt_start : ffff880000100000
> > >    virt_end   : ffff880013000000
> > > LOAD (3)
> > >    phys_start : 33000000
> > >    phys_end   : 7ffd7000
> > >    virt_start : ffff880033000000
> > >    virt_end   : ffff88007ffd7000
> > > Linux kdump
> > > page_size    : 4096
> > > 
> > > max_mapnr    : 7ffd7
> > > 
> > > Buffer size for the cyclic mode: 131061
> > > 
> > > num of NODEs : 1
> > > 
> > > 
> > > Memory type  : SPARSEMEM_EX
> > > 
> > > mem_map (0)
> > >    mem_map    : ffffea0000000000
> > >    pfn_start  : 0
> > >    pfn_end    : 8000
> > > mem_map (1)
> > >    mem_map    : ffffea0000200000
> > >    pfn_start  : 8000
> > >    pfn_end    : 10000
> > > mem_map (2)
> > >    mem_map    : ffffea0000400000
> > >    pfn_start  : 10000
> > >    pfn_end    : 18000
> > > mem_map (3)
> > >    mem_map    : ffffea0000600000
> > >    pfn_start  : 18000
> > >    pfn_end    : 20000
> > > mem_map (4)
> > >    mem_map    : ffffea0000800000
> > >    pfn_start  : 20000
> > >    pfn_end    : 28000
> > > mem_map (5)
> > >    mem_map    : ffffea0000a00000
> > >    pfn_start  : 28000
> > >    pfn_end    : 30000
> > > mem_map (6)
> > >    mem_map    : ffffea0000c00000
> > >    pfn_start  : 30000
> > >    pfn_end    : 38000
> > > mem_map (7)
> > >    mem_map    : ffffea0000e00000
> > >    pfn_start  : 38000
> > >    pfn_end    : 40000
> > > mem_map (8)
> > >    mem_map    : ffffea0001000000
> > >    pfn_start  : 40000
> > >    pfn_end    : 48000
> > > mem_map (9)
> > >    mem_map    : ffffea0001200000
> > >    pfn_start  : 48000
> > >    pfn_end    : 50000
> > > mem_map (10)
> > >    mem_map    : ffffea0001400000
> > >    pfn_start  : 50000
> > >    pfn_end    : 58000
> > > mem_map (11)
> > >    mem_map    : ffffea0001600000
> > >    pfn_start  : 58000
> > >    pfn_end    : 60000
> > > mem_map (12)
> > >    mem_map    : ffffea0001800000
> > >    pfn_start  : 60000
> > >    pfn_end    : 68000
> > > mem_map (13)
> > >    mem_map    : ffffea0001a00000
> > >    pfn_start  : 68000
> > >    pfn_end    : 70000
> > > mem_map (14)
> > >    mem_map    : ffffea0001c00000
> > >    pfn_start  : 70000
> > >    pfn_end    : 78000
> > > mem_map (15)
> > >    mem_map    : ffffea0001e00000
> > >    pfn_start  : 78000
> > >    pfn_end    : 7ffd7
> > > mmap() is available on the kernel.
> > > Copying data                                      : [100.0 %] -  eta: 0s
> > > Writing erase info...
> > > offset_eraseinfo: 9567fb0, size_eraseinfo: 0
> > > kdump: saving vmcore complete
> > > 
> > > Thanks,
> > > 	dou
> > > 
> > > > > 
> > > > > [douly@localhost code]$ ./makedumpfile -D -d 31 --message-level 31 -x
> > > > > vmlinux_4.15+  vmcore_4.15+_from_cp_command vmcore_4.15+
> > > > > sadump: does not have partition header
> > > > > sadump: read dump device as unknown format
> > > > > sadump: unknown format
> > > > > LOAD (0)
> > > > >     phys_start : 1000000
> > > > >     phys_end   : 2a86000
> > > > >     virt_start : ffffffff81000000
> > > > >     virt_end   : ffffffff82a86000
> > > > > LOAD (1)
> > > > >     phys_start : 1000
> > > > >     phys_end   : 9fc00
> > > > >     virt_start : ffff880000001000
> > > > >     virt_end   : ffff88000009fc00
> > > > > LOAD (2)
> > > > >     phys_start : 100000
> > > > >     phys_end   : 13000000
> > > > >     virt_start : ffff880000100000
> > > > >     virt_end   : ffff880013000000
> > > > > LOAD (3)
> > > > >     phys_start : 33000000
> > > > >     phys_end   : 7ffd7000
> > > > >     virt_start : ffff880033000000
> > > > >     virt_end   : ffff88007ffd7000
> > > > > Linux kdump
> > > > > page_size    : 4096
> > > > > 
> > > > > max_mapnr    : 7ffd7
> > > > > 
> > > > > Buffer size for the cyclic mode: 131061
> > > > > The kernel version is not supported.
> > > > > The makedumpfile operation may be incomplete.
> > > > > 
> > > > > num of NODEs : 1
> > > > > 
> > > > > 
> > > > > Memory type  : SPARSEMEM_EX
> > > > > 
> > > > > mem_map (0)
> > > > >     mem_map    : ffff88007ff26000
> > > > >     pfn_start  : 0
> > > > >     pfn_end    : 8000
> > > > > mem_map (1)
> > > > >     mem_map    : 0
> > > > >     pfn_start  : 8000
> > > > >     pfn_end    : 10000
> > > > > mem_map (2)
> > > > >     mem_map    : 0
> > > > >     pfn_start  : 10000
> > > > >     pfn_end    : 18000
> > > > > mem_map (3)
> > > > >     mem_map    : 0
> > > > >     pfn_start  : 18000
> > > > >     pfn_end    : 20000
> > > > > mem_map (4)
> > > > >     mem_map    : 0
> > > > >     pfn_start  : 20000
> > > > >     pfn_end    : 28000
> > > > > mem_map (5)
> > > > >     mem_map    : 0
> > > > >     pfn_start  : 28000
> > > > >     pfn_end    : 30000
> > > > > mem_map (6)
> > > > >     mem_map    : 0
> > > > >     pfn_start  : 30000
> > > > >     pfn_end    : 38000
> > > > > mem_map (7)
> > > > >     mem_map    : 0
> > > > >     pfn_start  : 38000
> > > > >     pfn_end    : 40000
> > > > > mem_map (8)
> > > > >     mem_map    : 0
> > > > >     pfn_start  : 40000
> > > > >     pfn_end    : 48000
> > > > > mem_map (9)
> > > > >     mem_map    : 0
> > > > >     pfn_start  : 48000
> > > > >     pfn_end    : 50000
> > > > > mem_map (10)
> > > > >     mem_map    : 0
> > > > >     pfn_start  : 50000
> > > > >     pfn_end    : 58000
> > > > > mem_map (11)
> > > > >     mem_map    : 0
> > > > >     pfn_start  : 58000
> > > > >     pfn_end    : 60000
> > > > > mem_map (12)
> > > > >     mem_map    : 0
> > > > >     pfn_start  : 60000
> > > > >     pfn_end    : 68000
> > > > > mem_map (13)
> > > > >     mem_map    : 0
> > > > >     pfn_start  : 68000
> > > > >     pfn_end    : 70000
> > > > > mem_map (14)
> > > > >     mem_map    : 0
> > > > >     pfn_start  : 70000
> > > > >     pfn_end    : 78000
> > > > > mem_map (15)
> > > > >     mem_map    : 0
> > > > >     pfn_start  : 78000
> > > > >     pfn_end    : 7ffd7
> > > > > mmap() is available on the kernel.
> > > > > Checking for memory holes                         : [100.0 %] |         STEP
> > > > > [Checking for memory holes  ] : 0.000014 seconds
> > > > > __vtop4_x86_64: Can't get a valid pte.
> > > > > readmem: Can't convert a virtual address(ffff88007ffd7000) to physical
> > > > > address.
> > > > > readmem: type_addr: 0, addr:ffff88007ffd7000, size:32768
> > > > > __exclude_unnecessary_pages: Can't read the buffer of struct page.
> > > > > create_2nd_bitmap: Can't exclude unnecessary pages.
> > > > > Checking for memory holes                         : [100.0 %] \         STEP
> > > > > [Checking for memory holes  ] : 0.000006 seconds
> > > > > Checking for memory holes                         : [100.0 %] -         STEP
> > > > > [Checking for memory holes  ] : 0.000004 seconds
> > > > > __vtop4_x86_64: Can't get a valid pte.
> > > > > readmem: Can't convert a virtual address(ffff88007ffd7000) to physical
> > > > > address.
> > > > > readmem: type_addr: 0, addr:ffff88007ffd7000, size:32768
> > > > > __exclude_unnecessary_pages: Can't read the buffer of struct page.
> > > > > create_2nd_bitmap: Can't exclude unnecessary pages.
> > > > > 
> > > > > makedumpfile Failed.
> > > > > 
> > > > > > 
> > > > > > > 
> > > > > > >        ......It causes makedumpfile failed.
> > > > > > > 
> > > > > > > 
> > > > > > > Thanks,
> > > > > > > 	dou.
> > > > > > > 
> > > > > > > > 	-Mike
> > > > > > > > 
> > > > > > > > 
> > > > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > 
> > > > > 
> > > > 
> > > > 
> > > > 
> > > 
> > > 
> > 
> > 
> > 
> 
> 
> 

^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
@ 2018-02-08  1:23                                   ` Baoquan He
  0 siblings, 0 replies; 349+ messages in thread
From: Baoquan He @ 2018-02-08  1:23 UTC (permalink / raw)
  To: Dou Liyang
  Cc: Takao Indoh, Peter Zijlstra, Greg Kroah-Hartman, Mike Galbraith,
	kexec, linux-kernel, stable, Andy Lutomirski, linux-mm,
	Thomas Gleixner, Kirill A. Shutemov, Linus Torvalds,
	Cyrill Gorcunov, Kirill A. Shutemov, Andrew Morton,
	Borislav Petkov, Dave Young, Ingo Molnar, Vivek Goyal

On 02/08/18 at 09:14am, Dou Liyang wrote:
> Hi Baoquan,
> 
> At 02/07/2018 08:45 PM, Baoquan He wrote:
> > On 02/07/18 at 08:34pm, Dou Liyang wrote:
> > > 
> > > 
> > > At 02/07/2018 08:27 PM, Baoquan He wrote:
> > > > On 02/07/18 at 08:17pm, Dou Liyang wrote:
> > > > > Hi Baoquan,
> > > > > 
> > > > > At 02/07/2018 08:08 PM, Baoquan He wrote:
> > > > > > On 02/07/18 at 08:00pm, Dou Liyang wrote:
> > > > > > > Hi Kirill,Mike
> > > > > > > 
> > > > > > > At 02/07/2018 06:45 PM, Mike Galbraith wrote:
> > > > > > > > On Wed, 2018-02-07 at 13:41 +0300, Kirill A. Shutemov wrote:
> > > > > > > > > On Wed, Feb 07, 2018 at 05:25:05PM +0800, Dou Liyang wrote:
> > > > > > > > > > Hi All,
> > > > > > > > > > 
> > > > > > > > > > I met the makedumpfile failed in the upstream kernel which contained
> > > > > > > > > > this patch. Did I missed something else?
> > > > > > > > > 
> > > > > > > > > None I'm aware of.
> > > > > > > > > 
> > > > > > > > > Is there a reason to suspect that the issue is related to the bug this patch
> > > > > > > > > fixed?
> > > > > > > > 
> > > > > > > 
> > > > > > > I did a contrastive test by my colleagues Indoh's suggestion.
> > 
> > OK, I may get the reason. kaslr is enabled, right? You can try to
> 
> I add 'nokaslr' to disable the KASLR feature.
    ~~~added??
> 
> # cat /proc/cmdline
> BOOT_IMAGE=/vmlinuz-4.15.0+ root=UUID=10f10326-c923-4098-86aa-afed5c54ee0b
> ro crashkernel=512M rhgb console=tty0 console=ttyS0 nokaslr LANG=en_US.UTF-8
> 
> > disable kaslr and try them again. Because phys_base and kaslr_offset are
> > got from vmlinux, while these are generated at compiling time. Just a
> > guess.
> > 
> 
> Oh, I will recompile the kernel with KASLR disabled in .config.

Then it's not what I guessed. We need to debug makedumpfile, since using
vmlinux is a different code path that few people use.

> 
> 
> Thanks,
> 	dou.
> > > > > > > 
> > > > > > > Revert your two commits:
> > > > > > > 
> > > > > > > commit 83e3c48729d9ebb7af5a31a504f3fd6aff0348c4
> > > > > > > Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > > > > > > Date:   Fri Sep 29 17:08:16 2017 +0300
> > > > > > > 
> > > > > > > commit 629a359bdb0e0652a8227b4ff3125431995fec6e
> > > > > > > Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > > > > > > Date:   Tue Nov 7 11:33:37 2017 +0300
> > > > > > > 
> > > > > > > ...and keep others unchanged, the makedumpfile works well.
> > > > > > > 
> > > > > > > > Still works fine for me with .today.  Box is only 16GB desktop box though.
> > > > > > > > 
> > > > > > > Btw, In the upstream kernel which contained this patch, I did two tests:
> > > > > > > 
> > > > > > >     1) use the makedumpfile as core_collector in /etc/kdump.conf, then
> > > > > > > trigger the process of kdump by echo 1 >/proc/sysrq-trigger, the
> > > > > > > makedumpfile works well and I can get the vmcore file.
> > > > > > > 
> > > > > > >         ......It is OK
> > > > > > > 
> > > > > > >     2) use cp as core_collector, do the same operation to get the vmcore file.
> > > > > > > then use makedumpfile to do like above:
> > > > > > > 
> > > > > > > >        [douly@localhost code]$ ./makedumpfile -d 31 --message-level 31 -x vmlinux_4.15+ vmcore_4.15+_from_cp_command vmcore_4.15+
> > > > > > 
> > > > > > Oh, then please ignore my previous comment. Adding '-D' can give more
> > > > > > debugging message.
> > > > > 
> > > > > I added '-D', Just like before, no more debugging message:
> > > > > 
> > > > > BTW, I use crash to analyze the vmcore file created by 'cp' command.
> > > > > 
> > > > >      ./crash ../makedumpfile/code/vmcore_4.15+_from_cp_command
> > > > > ../makedumpfile/code/vmlinux_4.15+
> > > > > 
> > > > > the crash works well, It's so interesting.
> > > > > 
> > > > > Thanks,
> > > > > 	dou.
> > > > > 
> > > > > The debugging message with '-D':
> > > > 
> > > > And what's the debugging printing when trigger crash by sysrq?
> > > > 
> > > 
> > > kdump: dump target is /dev/vda2
> > > kdump: saving to /sysroot//var/crash/127.0.0.1-2018-02-07-07:31:56/
> > > [    2.751352] EXT4-fs (vda2): re-mounted. Opts: data=ordered
> > > kdump: saving vmcore-dmesg.txt
> > > kdump: saving vmcore-dmesg.txt complete
> > > kdump: saving vmcore
> > > sadump: does not have partition header
> > > sadump: read dump device as unknown format
> > > sadump: unknown format
> > > LOAD (0)
> > >    phys_start : 1000000
> > >    phys_end   : 2a86000
> > >    virt_start : ffffffff81000000
> > >    virt_end   : ffffffff82a86000
> > > LOAD (1)
> > >    phys_start : 1000
> > >    phys_end   : 9fc00
> > >    virt_start : ffff880000001000
> > >    virt_end   : ffff88000009fc00
> > > LOAD (2)
> > >    phys_start : 100000
> > >    phys_end   : 13000000
> > >    virt_start : ffff880000100000
> > >    virt_end   : ffff880013000000
> > > LOAD (3)
> > >    phys_start : 33000000
> > >    phys_end   : 7ffd7000
> > >    virt_start : ffff880033000000
> > >    virt_end   : ffff88007ffd7000
> > > Linux kdump
> > > page_size    : 4096
> > > 
> > > max_mapnr    : 7ffd7
> > > 
> > > Buffer size for the cyclic mode: 131061
> > > 
> > > num of NODEs : 1
> > > 
> > > 
> > > Memory type  : SPARSEMEM_EX
> > > 
> > > mem_map (0)
> > >    mem_map    : ffffea0000000000
> > >    pfn_start  : 0
> > >    pfn_end    : 8000
> > > mem_map (1)
> > >    mem_map    : ffffea0000200000
> > >    pfn_start  : 8000
> > >    pfn_end    : 10000
> > > mem_map (2)
> > >    mem_map    : ffffea0000400000
> > >    pfn_start  : 10000
> > >    pfn_end    : 18000
> > > mem_map (3)
> > >    mem_map    : ffffea0000600000
> > >    pfn_start  : 18000
> > >    pfn_end    : 20000
> > > mem_map (4)
> > >    mem_map    : ffffea0000800000
> > >    pfn_start  : 20000
> > >    pfn_end    : 28000
> > > mem_map (5)
> > >    mem_map    : ffffea0000a00000
> > >    pfn_start  : 28000
> > >    pfn_end    : 30000
> > > mem_map (6)
> > >    mem_map    : ffffea0000c00000
> > >    pfn_start  : 30000
> > >    pfn_end    : 38000
> > > mem_map (7)
> > >    mem_map    : ffffea0000e00000
> > >    pfn_start  : 38000
> > >    pfn_end    : 40000
> > > mem_map (8)
> > >    mem_map    : ffffea0001000000
> > >    pfn_start  : 40000
> > >    pfn_end    : 48000
> > > mem_map (9)
> > >    mem_map    : ffffea0001200000
> > >    pfn_start  : 48000
> > >    pfn_end    : 50000
> > > mem_map (10)
> > >    mem_map    : ffffea0001400000
> > >    pfn_start  : 50000
> > >    pfn_end    : 58000
> > > mem_map (11)
> > >    mem_map    : ffffea0001600000
> > >    pfn_start  : 58000
> > >    pfn_end    : 60000
> > > mem_map (12)
> > >    mem_map    : ffffea0001800000
> > >    pfn_start  : 60000
> > >    pfn_end    : 68000
> > > mem_map (13)
> > >    mem_map    : ffffea0001a00000
> > >    pfn_start  : 68000
> > >    pfn_end    : 70000
> > > mem_map (14)
> > >    mem_map    : ffffea0001c00000
> > >    pfn_start  : 70000
> > >    pfn_end    : 78000
> > > mem_map (15)
> > >    mem_map    : ffffea0001e00000
> > >    pfn_start  : 78000
> > >    pfn_end    : 7ffd7
> > > mmap() is available on the kernel.
> > > Copying data                                      : [100.0 %] -  eta: 0s
> > > Writing erase info...
> > > offset_eraseinfo: 9567fb0, size_eraseinfo: 0
> > > kdump: saving vmcore complete
> > > 
> > > Thanks,
> > > 	dou
> > > 
> > > > > 
> > > > > > [douly@localhost code]$ ./makedumpfile -D -d 31 --message-level 31 -x vmlinux_4.15+ vmcore_4.15+_from_cp_command vmcore_4.15+
> > > > > sadump: does not have partition header
> > > > > sadump: read dump device as unknown format
> > > > > sadump: unknown format
> > > > > LOAD (0)
> > > > >     phys_start : 1000000
> > > > >     phys_end   : 2a86000
> > > > >     virt_start : ffffffff81000000
> > > > >     virt_end   : ffffffff82a86000
> > > > > LOAD (1)
> > > > >     phys_start : 1000
> > > > >     phys_end   : 9fc00
> > > > >     virt_start : ffff880000001000
> > > > >     virt_end   : ffff88000009fc00
> > > > > LOAD (2)
> > > > >     phys_start : 100000
> > > > >     phys_end   : 13000000
> > > > >     virt_start : ffff880000100000
> > > > >     virt_end   : ffff880013000000
> > > > > LOAD (3)
> > > > >     phys_start : 33000000
> > > > >     phys_end   : 7ffd7000
> > > > >     virt_start : ffff880033000000
> > > > >     virt_end   : ffff88007ffd7000
> > > > > Linux kdump
> > > > > page_size    : 4096
> > > > > 
> > > > > max_mapnr    : 7ffd7
> > > > > 
> > > > > Buffer size for the cyclic mode: 131061
> > > > > The kernel version is not supported.
> > > > > The makedumpfile operation may be incomplete.
> > > > > 
> > > > > num of NODEs : 1
> > > > > 
> > > > > 
> > > > > Memory type  : SPARSEMEM_EX
> > > > > 
> > > > > mem_map (0)
> > > > >     mem_map    : ffff88007ff26000
> > > > >     pfn_start  : 0
> > > > >     pfn_end    : 8000
> > > > > mem_map (1)
> > > > >     mem_map    : 0
> > > > >     pfn_start  : 8000
> > > > >     pfn_end    : 10000
> > > > > mem_map (2)
> > > > >     mem_map    : 0
> > > > >     pfn_start  : 10000
> > > > >     pfn_end    : 18000
> > > > > mem_map (3)
> > > > >     mem_map    : 0
> > > > >     pfn_start  : 18000
> > > > >     pfn_end    : 20000
> > > > > mem_map (4)
> > > > >     mem_map    : 0
> > > > >     pfn_start  : 20000
> > > > >     pfn_end    : 28000
> > > > > mem_map (5)
> > > > >     mem_map    : 0
> > > > >     pfn_start  : 28000
> > > > >     pfn_end    : 30000
> > > > > mem_map (6)
> > > > >     mem_map    : 0
> > > > >     pfn_start  : 30000
> > > > >     pfn_end    : 38000
> > > > > mem_map (7)
> > > > >     mem_map    : 0
> > > > >     pfn_start  : 38000
> > > > >     pfn_end    : 40000
> > > > > mem_map (8)
> > > > >     mem_map    : 0
> > > > >     pfn_start  : 40000
> > > > >     pfn_end    : 48000
> > > > > mem_map (9)
> > > > >     mem_map    : 0
> > > > >     pfn_start  : 48000
> > > > >     pfn_end    : 50000
> > > > > mem_map (10)
> > > > >     mem_map    : 0
> > > > >     pfn_start  : 50000
> > > > >     pfn_end    : 58000
> > > > > mem_map (11)
> > > > >     mem_map    : 0
> > > > >     pfn_start  : 58000
> > > > >     pfn_end    : 60000
> > > > > mem_map (12)
> > > > >     mem_map    : 0
> > > > >     pfn_start  : 60000
> > > > >     pfn_end    : 68000
> > > > > mem_map (13)
> > > > >     mem_map    : 0
> > > > >     pfn_start  : 68000
> > > > >     pfn_end    : 70000
> > > > > mem_map (14)
> > > > >     mem_map    : 0
> > > > >     pfn_start  : 70000
> > > > >     pfn_end    : 78000
> > > > > mem_map (15)
> > > > >     mem_map    : 0
> > > > >     pfn_start  : 78000
> > > > >     pfn_end    : 7ffd7
> > > > > mmap() is available on the kernel.
> > > > > Checking for memory holes                         : [100.0 %] |         STEP
> > > > > [Checking for memory holes  ] : 0.000014 seconds
> > > > > __vtop4_x86_64: Can't get a valid pte.
> > > > > > readmem: Can't convert a virtual address(ffff88007ffd7000) to physical address.
> > > > > readmem: type_addr: 0, addr:ffff88007ffd7000, size:32768
> > > > > __exclude_unnecessary_pages: Can't read the buffer of struct page.
> > > > > create_2nd_bitmap: Can't exclude unnecessary pages.
> > > > > Checking for memory holes                         : [100.0 %] \         STEP
> > > > > [Checking for memory holes  ] : 0.000006 seconds
> > > > > Checking for memory holes                         : [100.0 %] -         STEP
> > > > > [Checking for memory holes  ] : 0.000004 seconds
> > > > > __vtop4_x86_64: Can't get a valid pte.
> > > > > > readmem: Can't convert a virtual address(ffff88007ffd7000) to physical address.
> > > > > readmem: type_addr: 0, addr:ffff88007ffd7000, size:32768
> > > > > __exclude_unnecessary_pages: Can't read the buffer of struct page.
> > > > > create_2nd_bitmap: Can't exclude unnecessary pages.
> > > > > 
> > > > > makedumpfile Failed.
> > > > > 
> > > > > > 
> > > > > > > 
> > > > > > >        ......It causes makedumpfile failed.
> > > > > > > 
> > > > > > > 
> > > > > > > Thanks,
> > > > > > > 	dou.
> > > > > > > 
> > > > > > > > 	-Mike

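A side note on the two logs above, as a rough sketch rather than a
diagnosis: in the working kdump run the mem_map values step by 0x200000
per 0x8000-pfn section, which fits a 64-byte struct page laid out linearly
from ffffea0000000000, while the failing '-x vmlinux' run reports a
mem_map (0) that sits in the direct mapping and 0 for every other section.
The Python below only replays that arithmetic on numbers copied from the
logs. The names in it (SECTION_PFNS, expected_mem_map, in_vmemmap) are made
up for illustration and are not makedumpfile code, and the direct-map base
is an assumption inferred from LOAD (1) as virt_start minus phys_start.

```python
# Numbers copied from the debug logs quoted in this thread.
SECTION_PFNS = 0x8000  # pfn_end - pfn_start of a full section

# mem_map values from the working run (makedumpfile as core_collector);
# only a few of the 16 logged sections are listed here.
working_mem_map = {
    0: 0xffffea0000000000,
    1: 0xffffea0000200000,
    2: 0xffffea0000400000,
    15: 0xffffea0001e00000,
}

# mem_map (0) from the failing run; sections 1..15 were all reported as 0.
failing_mem_map0 = 0xffff88007ff26000

# The stride between consecutive sections gives the implied struct page size.
stride = working_mem_map[1] - working_mem_map[0]
struct_page_size = stride // SECTION_PFNS

vmemmap_base = working_mem_map[0]


def expected_mem_map(section):
    """Address a section's mem_map would have if laid out linearly from vmemmap_base."""
    return vmemmap_base + section * SECTION_PFNS * struct_page_size


def in_vmemmap(addr, nr_sections=16):
    """True if addr falls inside the range spanned by the logged sections."""
    return vmemmap_base <= addr < expected_mem_map(nr_sections)


# Assumption: direct-map base inferred from LOAD (1), virt_start - phys_start.
direct_map_base = 0xffff880000001000 - 0x1000

all_linear = all(addr == expected_mem_map(i) for i, addr in working_mem_map.items())

print("struct page size implied by the log:", struct_page_size)
print("working mem_map entries are linear:", all_linear)
print("failing mem_map (0) in vmemmap:", in_vmemmap(failing_mem_map0))
print("failing mem_map (0) offset in direct map: %#x" % (failing_mem_map0 - direct_map_base))
```

If this reading of the logs is right, the value printed for section 0 on
the vmlinux path does not look like a vmemmap address at all, which may be
a reasonable place to start when debugging that code path.
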
^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
  2018-02-08  1:23                                   ` Baoquan He
  (?)
  (?)
@ 2018-02-08  1:44                                     ` Dou Liyang
  -1 siblings, 0 replies; 349+ messages in thread
From: Dou Liyang @ 2018-02-08  1:44 UTC (permalink / raw)
  To: Baoquan He
  Cc: Takao Indoh, Peter Zijlstra, Greg Kroah-Hartman, Mike Galbraith,
	kexec, linux-kernel, stable, Andy Lutomirski, linux-mm,
	Thomas Gleixner, Kirill A. Shutemov, Linus Torvalds,
	Cyrill Gorcunov, Kirill A. Shutemov, Andrew Morton,
	Borislav Petkov, Dave Young, Ingo Molnar, Vivek Goyal

Hi Baoquan,

At 02/08/2018 09:23 AM, Baoquan He wrote:
> On 02/08/18 at 09:14am, Dou Liyang wrote:
>> Hi Baoquan,
>>
>> At 02/07/2018 08:45 PM, Baoquan He wrote:
>>> On 02/07/18 at 08:34pm, Dou Liyang wrote:
>>>>
>>>>
>>>> At 02/07/2018 08:27 PM, Baoquan He wrote:
>>>>> On 02/07/18 at 08:17pm, Dou Liyang wrote:
>>>>>> Hi Baoquan,
>>>>>>
>>>>>> At 02/07/2018 08:08 PM, Baoquan He wrote:
>>>>>>> On 02/07/18 at 08:00pm, Dou Liyang wrote:
>>>>>>>> Hi Kirill,Mike
>>>>>>>>
>>>>>>>> At 02/07/2018 06:45 PM, Mike Galbraith wrote:
>>>>>>>>> On Wed, 2018-02-07 at 13:41 +0300, Kirill A. Shutemov wrote:
>>>>>>>>>> On Wed, Feb 07, 2018 at 05:25:05PM +0800, Dou Liyang wrote:
>>>>>>>>>>> Hi All,
>>>>>>>>>>>
>>>>>>>>>>> I met the makedumpfile failed in the upstream kernel which contained
>>>>>>>>>>> this patch. Did I missed something else?
>>>>>>>>>>
>>>>>>>>>> None I'm aware of.
>>>>>>>>>>
>>>>>>>>>> Is there a reason to suspect that the issue is related to the bug this patch
>>>>>>>>>> fixed?
>>>>>>>>>
>>>>>>>>
>>>>>>>> I did a contrastive test by my colleagues Indoh's suggestion.
>>>
>>> OK, I may get the reason. kaslr is enabled, right? You can try to
>>
>> I add 'nokaslr' to disable the KASLR feature.
>      ~~~added??

Oops! Yes, KASLR had already been disabled by this option when I tested.

>>
>> # cat /proc/cmdline
>> BOOT_IMAGE=/vmlinuz-4.15.0+ root=UUID=10f10326-c923-4098-86aa-afed5c54ee0b
>> ro crashkernel=512M rhgb console=tty0 console=ttyS0 nokaslr LANG=en_US.UTF-8
>>
>>> disable kaslr and try them again. Because phys_base and kaslr_offset are
>>> got from vmlinux, while these are generated at compiling time. Just a
>>> guess.
>>>
>>
>> Oh, I will recompile the kernel with KASLR disabled in .config.
> 
> Then it's not what I guessed. We need to debug makedumpfile, since using
> vmlinux is a different code path that few people use.
> 

Understood, I will try to look into it.

Thanks,
	dou

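As a starting point, here is a small sketch that checks where the failing
readmem address sits relative to the LOAD segments printed in the logs in
this thread. It is not how makedumpfile translates addresses (the log shows
it walking page tables in __vtop4_x86_64); the helper vaddr_to_paddr and the
table LOADS are hypothetical names used only for this check, and treating
each segment as the half-open range [virt_start, virt_end) is an assumption.

```python
# LOAD segments copied from the makedumpfile debug output in this thread:
# (phys_start, phys_end, virt_start, virt_end)
LOADS = [
    (0x1000000, 0x2a86000, 0xffffffff81000000, 0xffffffff82a86000),
    (0x1000, 0x9fc00, 0xffff880000001000, 0xffff88000009fc00),
    (0x100000, 0x13000000, 0xffff880000100000, 0xffff880013000000),
    (0x33000000, 0x7ffd7000, 0xffff880033000000, 0xffff88007ffd7000),
]


def vaddr_to_paddr(vaddr):
    """Translate via the LOAD table; return None if no segment covers vaddr.

    Assumption: each segment covers the half-open range [virt_start, virt_end).
    """
    for phys_start, _phys_end, virt_start, virt_end in LOADS:
        if virt_start <= vaddr < virt_end:
            return phys_start + (vaddr - virt_start)
    return None


failing_addr = 0xffff88007ffd7000  # address from the readmem error
mem_map0 = 0xffff88007ff26000      # mem_map (0) reported by the failing run

print("failing address covered by a LOAD:", vaddr_to_paddr(failing_addr) is not None)
print("failing address equals virt_end of LOAD (3):", failing_addr == LOADS[3][3])
print("mem_map (0) maps to paddr: %#x" % vaddr_to_paddr(mem_map0))
print("distance from mem_map (0) to failing address: %#x" % (failing_addr - mem_map0))
```

Under these assumptions the failing address is exactly the end of the last
LOAD segment, 0xb1000 bytes above the mem_map (0) value from the failing
run, so the read appears to run off the end of the dumped memory.
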
>>
>>
>> Thanks,
>> 	dou.
>>>>>>>>
>>>>>>>> Revert your two commits:
>>>>>>>>
>>>>>>>> commit 83e3c48729d9ebb7af5a31a504f3fd6aff0348c4
>>>>>>>> Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>>>>>>>> Date:   Fri Sep 29 17:08:16 2017 +0300
>>>>>>>>
>>>>>>>> commit 629a359bdb0e0652a8227b4ff3125431995fec6e
>>>>>>>> Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>>>>>>>> Date:   Tue Nov 7 11:33:37 2017 +0300
>>>>>>>>
>>>>>>>> ...and keep others unchanged, the makedumpfile works well.
>>>>>>>>
>>>>>>>>> Still works fine for me with .today.  Box is only 16GB desktop box though.
>>>>>>>>>
>>>>>>>> Btw, In the upstream kernel which contained this patch, I did two tests:
>>>>>>>>
>>>>>>>>      1) use the makedumpfile as core_collector in /etc/kdump.conf, then
>>>>>>>> trigger the process of kdump by echo 1 >/proc/sysrq-trigger, the
>>>>>>>> makedumpfile works well and I can get the vmcore file.
>>>>>>>>
>>>>>>>>          ......It is OK
>>>>>>>>
>>>>>>>>      2) use cp as core_collector, do the same operation to get the vmcore file.
>>>>>>>> then use makedumpfile to do like above:
>>>>>>>>
>>>>>>>>         [douly@localhost code]$ ./makedumpfile -d 31 --message-level 31 -x
>>>>>>>> vmlinux_4.15+ vmcore_4.15+_from_cp_command vmcore_4.15+
>>>>>>>
>>>>>>> Oh, then please ignore my previous comment. Adding '-D' can give more
>>>>>>> debugging message.
>>>>>>
>>>>>> I added '-D', Just like before, no more debugging message:
>>>>>>
>>>>>> BTW, I use crash to analyze the vmcore file created by 'cp' command.
>>>>>>
>>>>>>       ./crash ../makedumpfile/code/vmcore_4.15+_from_cp_command
>>>>>> ../makedumpfile/code/vmlinux_4.15+
>>>>>>
>>>>>> the crash works well, It's so interesting.
>>>>>>
>>>>>> Thanks,
>>>>>> 	dou.
>>>>>>
>>>>>> The debugging message with '-D':
>>>>>
>>>>> And what's the debugging printing when trigger crash by sysrq?
>>>>>
>>>>
>>>> kdump: dump target is /dev/vda2
>>>> kdump: saving to /sysroot//var/crash/127.0.0.1-2018-02-07-07:31:56/
>>>> [    2.751352] EXT4-fs (vda2): re-mounted. Opts: data=ordered
>>>> kdump: saving vmcore-dmesg.txt
>>>> kdump: saving vmcore-dmesg.txt complete
>>>> kdump: saving vmcore
>>>> sadump: does not have partition header
>>>> sadump: read dump device as unknown format
>>>> sadump: unknown format
>>>> LOAD (0)
>>>>     phys_start : 1000000
>>>>     phys_end   : 2a86000
>>>>     virt_start : ffffffff81000000
>>>>     virt_end   : ffffffff82a86000
>>>> LOAD (1)
>>>>     phys_start : 1000
>>>>     phys_end   : 9fc00
>>>>     virt_start : ffff880000001000
>>>>     virt_end   : ffff88000009fc00
>>>> LOAD (2)
>>>>     phys_start : 100000
>>>>     phys_end   : 13000000
>>>>     virt_start : ffff880000100000
>>>>     virt_end   : ffff880013000000
>>>> LOAD (3)
>>>>     phys_start : 33000000
>>>>     phys_end   : 7ffd7000
>>>>     virt_start : ffff880033000000
>>>>     virt_end   : ffff88007ffd7000
>>>> Linux kdump
>>>> page_size    : 4096
>>>>
>>>> max_mapnr    : 7ffd7
>>>>
>>>> Buffer size for the cyclic mode: 131061
>>>>
>>>> num of NODEs : 1
>>>>
>>>>
>>>> Memory type  : SPARSEMEM_EX
>>>>
>>>> mem_map (0)
>>>>     mem_map    : ffffea0000000000
>>>>     pfn_start  : 0
>>>>     pfn_end    : 8000
>>>> mem_map (1)
>>>>     mem_map    : ffffea0000200000
>>>>     pfn_start  : 8000
>>>>     pfn_end    : 10000
>>>> mem_map (2)
>>>>     mem_map    : ffffea0000400000
>>>>     pfn_start  : 10000
>>>>     pfn_end    : 18000
>>>> mem_map (3)
>>>>     mem_map    : ffffea0000600000
>>>>     pfn_start  : 18000
>>>>     pfn_end    : 20000
>>>> mem_map (4)
>>>>     mem_map    : ffffea0000800000
>>>>     pfn_start  : 20000
>>>>     pfn_end    : 28000
>>>> mem_map (5)
>>>>     mem_map    : ffffea0000a00000
>>>>     pfn_start  : 28000
>>>>     pfn_end    : 30000
>>>> mem_map (6)
>>>>     mem_map    : ffffea0000c00000
>>>>     pfn_start  : 30000
>>>>     pfn_end    : 38000
>>>> mem_map (7)
>>>>     mem_map    : ffffea0000e00000
>>>>     pfn_start  : 38000
>>>>     pfn_end    : 40000
>>>> mem_map (8)
>>>>     mem_map    : ffffea0001000000
>>>>     pfn_start  : 40000
>>>>     pfn_end    : 48000
>>>> mem_map (9)
>>>>     mem_map    : ffffea0001200000
>>>>     pfn_start  : 48000
>>>>     pfn_end    : 50000
>>>> mem_map (10)
>>>>     mem_map    : ffffea0001400000
>>>>     pfn_start  : 50000
>>>>     pfn_end    : 58000
>>>> mem_map (11)
>>>>     mem_map    : ffffea0001600000
>>>>     pfn_start  : 58000
>>>>     pfn_end    : 60000
>>>> mem_map (12)
>>>>     mem_map    : ffffea0001800000
>>>>     pfn_start  : 60000
>>>>     pfn_end    : 68000
>>>> mem_map (13)
>>>>     mem_map    : ffffea0001a00000
>>>>     pfn_start  : 68000
>>>>     pfn_end    : 70000
>>>> mem_map (14)
>>>>     mem_map    : ffffea0001c00000
>>>>     pfn_start  : 70000
>>>>     pfn_end    : 78000
>>>> mem_map (15)
>>>>     mem_map    : ffffea0001e00000
>>>>     pfn_start  : 78000
>>>>     pfn_end    : 7ffd7
>>>> mmap() is available on the kernel.
>>>> Copying data                                      : [100.0 %] -  eta: 0s
>>>> Writing erase info...
>>>> offset_eraseinfo: 9567fb0, size_eraseinfo: 0
>>>> kdump: saving vmcore complete
>>>>
>>>> Thanks,
>>>> 	dou
>>>>
>>>>>>
>>>>>> [douly@localhost code]$ ./makedumpfile -D -d 31 --message-level 31 -x
>>>>>> vmlinux_4.15+  vmcore_4.15+_from_cp_command vmcore_4.15+
>>>>>> sadump: does not have partition header
>>>>>> sadump: read dump device as unknown format
>>>>>> sadump: unknown format
>>>>>> LOAD (0)
>>>>>>      phys_start : 1000000
>>>>>>      phys_end   : 2a86000
>>>>>>      virt_start : ffffffff81000000
>>>>>>      virt_end   : ffffffff82a86000
>>>>>> LOAD (1)
>>>>>>      phys_start : 1000
>>>>>>      phys_end   : 9fc00
>>>>>>      virt_start : ffff880000001000
>>>>>>      virt_end   : ffff88000009fc00
>>>>>> LOAD (2)
>>>>>>      phys_start : 100000
>>>>>>      phys_end   : 13000000
>>>>>>      virt_start : ffff880000100000
>>>>>>      virt_end   : ffff880013000000
>>>>>> LOAD (3)
>>>>>>      phys_start : 33000000
>>>>>>      phys_end   : 7ffd7000
>>>>>>      virt_start : ffff880033000000
>>>>>>      virt_end   : ffff88007ffd7000
>>>>>> Linux kdump
>>>>>> page_size    : 4096
>>>>>>
>>>>>> max_mapnr    : 7ffd7
>>>>>>
>>>>>> Buffer size for the cyclic mode: 131061
>>>>>> The kernel version is not supported.
>>>>>> The makedumpfile operation may be incomplete.
>>>>>>
>>>>>> num of NODEs : 1
>>>>>>
>>>>>>
>>>>>> Memory type  : SPARSEMEM_EX
>>>>>>
>>>>>> mem_map (0)
>>>>>>      mem_map    : ffff88007ff26000
>>>>>>      pfn_start  : 0
>>>>>>      pfn_end    : 8000
>>>>>> mem_map (1)
>>>>>>      mem_map    : 0
>>>>>>      pfn_start  : 8000
>>>>>>      pfn_end    : 10000
>>>>>> mem_map (2)
>>>>>>      mem_map    : 0
>>>>>>      pfn_start  : 10000
>>>>>>      pfn_end    : 18000
>>>>>> mem_map (3)
>>>>>>      mem_map    : 0
>>>>>>      pfn_start  : 18000
>>>>>>      pfn_end    : 20000
>>>>>> mem_map (4)
>>>>>>      mem_map    : 0
>>>>>>      pfn_start  : 20000
>>>>>>      pfn_end    : 28000
>>>>>> mem_map (5)
>>>>>>      mem_map    : 0
>>>>>>      pfn_start  : 28000
>>>>>>      pfn_end    : 30000
>>>>>> mem_map (6)
>>>>>>      mem_map    : 0
>>>>>>      pfn_start  : 30000
>>>>>>      pfn_end    : 38000
>>>>>> mem_map (7)
>>>>>>      mem_map    : 0
>>>>>>      pfn_start  : 38000
>>>>>>      pfn_end    : 40000
>>>>>> mem_map (8)
>>>>>>      mem_map    : 0
>>>>>>      pfn_start  : 40000
>>>>>>      pfn_end    : 48000
>>>>>> mem_map (9)
>>>>>>      mem_map    : 0
>>>>>>      pfn_start  : 48000
>>>>>>      pfn_end    : 50000
>>>>>> mem_map (10)
>>>>>>      mem_map    : 0
>>>>>>      pfn_start  : 50000
>>>>>>      pfn_end    : 58000
>>>>>> mem_map (11)
>>>>>>      mem_map    : 0
>>>>>>      pfn_start  : 58000
>>>>>>      pfn_end    : 60000
>>>>>> mem_map (12)
>>>>>>      mem_map    : 0
>>>>>>      pfn_start  : 60000
>>>>>>      pfn_end    : 68000
>>>>>> mem_map (13)
>>>>>>      mem_map    : 0
>>>>>>      pfn_start  : 68000
>>>>>>      pfn_end    : 70000
>>>>>> mem_map (14)
>>>>>>      mem_map    : 0
>>>>>>      pfn_start  : 70000
>>>>>>      pfn_end    : 78000
>>>>>> mem_map (15)
>>>>>>      mem_map    : 0
>>>>>>      pfn_start  : 78000
>>>>>>      pfn_end    : 7ffd7
>>>>>> mmap() is available on the kernel.
>>>>>> Checking for memory holes                         : [100.0 %] |         STEP
>>>>>> [Checking for memory holes  ] : 0.000014 seconds
>>>>>> __vtop4_x86_64: Can't get a valid pte.
>>>>>> readmem: Can't convert a virtual address(ffff88007ffd7000) to physical
>>>>>> address.
>>>>>> readmem: type_addr: 0, addr:ffff88007ffd7000, size:32768
>>>>>> __exclude_unnecessary_pages: Can't read the buffer of struct page.
>>>>>> create_2nd_bitmap: Can't exclude unnecessary pages.
>>>>>> Checking for memory holes                         : [100.0 %] \         STEP
>>>>>> [Checking for memory holes  ] : 0.000006 seconds
>>>>>> Checking for memory holes                         : [100.0 %] -         STEP
>>>>>> [Checking for memory holes  ] : 0.000004 seconds
>>>>>> __vtop4_x86_64: Can't get a valid pte.
>>>>>> readmem: Can't convert a virtual address(ffff88007ffd7000) to physical
>>>>>> address.
>>>>>> readmem: type_addr: 0, addr:ffff88007ffd7000, size:32768
>>>>>> __exclude_unnecessary_pages: Can't read the buffer of struct page.
>>>>>> create_2nd_bitmap: Can't exclude unnecessary pages.
>>>>>>
>>>>>> makedumpfile Failed.
>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>         ......It causes makedumpfile failed.
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> 	dou.
>>>>>>>>
>>>>>>>>> 	-Mike
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>>
>>
>> _______________________________________________
>> kexec mailing list
>> kexec@lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/kexec
> 
> 
> 
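
The address that readmem rejects in the '-D' output quoted above can be checked against the LOAD ranges printed in the same log. The following is a minimal Python sketch and not part of makedumpfile; the helper name find_load and the half-open reading of virt_start/virt_end are assumptions made for illustration. All numeric values are copied from the quoted output.

```python
# LOAD segments from the quoted 'makedumpfile -D' output.
# Assumption: virt_start is inclusive and virt_end is exclusive.
LOADS = [
    (0xffffffff81000000, 0xffffffff82a86000),  # LOAD (0)
    (0xffff880000001000, 0xffff88000009fc00),  # LOAD (1)
    (0xffff880000100000, 0xffff880013000000),  # LOAD (2)
    (0xffff880033000000, 0xffff88007ffd7000),  # LOAD (3)
]

def find_load(vaddr, loads=LOADS):
    """Return the index of the LOAD segment containing vaddr, or None."""
    for index, (start, end) in enumerate(loads):
        if start <= vaddr < end:
            return index
    return None

MEM_MAP_0 = 0xffff88007ff26000    # mem_map (0) in the failing run
FAILED_ADDR = 0xffff88007ffd7000  # address readmem could not translate

print(find_load(MEM_MAP_0))          # segment holding mem_map (0)
print(find_load(FAILED_ADDR))        # None if no segment covers it
print(hex(FAILED_ADDR - MEM_MAP_0))  # distance between the two
```

Under these assumptions mem_map (0) lies inside LOAD (3), while the rejected address equals virt_end of LOAD (3) and sits 0xb1000 bytes past mem_map (0). This only locates the address; it does not explain why the vmlinux (-x) code path ends up with that mem_map value.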

^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
@ 2018-02-08  1:44                                     ` Dou Liyang
  0 siblings, 0 replies; 349+ messages in thread
From: Dou Liyang @ 2018-02-08  1:44 UTC (permalink / raw)
  To: Baoquan He
  Cc: Takao Indoh, Peter Zijlstra, Greg Kroah-Hartman, Mike Galbraith,
	kexec, linux-kernel, stable, Andy Lutomirski, linux-mm,
	Thomas Gleixner, Kirill A. Shutemov, Linus Torvalds,
	Cyrill Gorcunov, Kirill A. Shutemov, Andrew Morton,
	Borislav Petkov, Dave Young, Ingo Molnar, Vivek Goyal

Hi Baoquan,

At 02/08/2018 09:23 AM, Baoquan He wrote:
> On 02/08/18 at 09:14am, Dou Liyang wrote:
>> Hi Baoquan,
>>
>> At 02/07/2018 08:45 PM, Baoquan He wrote:
>>> On 02/07/18 at 08:34pm, Dou Liyang wrote:
>>>>
>>>>
>>>> At 02/07/2018 08:27 PM, Baoquan He wrote:
>>>>> On 02/07/18 at 08:17pm, Dou Liyang wrote:
>>>>>> Hi Baoquan,
>>>>>>
>>>>>> At 02/07/2018 08:08 PM, Baoquan He wrote:
>>>>>>> On 02/07/18 at 08:00pm, Dou Liyang wrote:
>>>>>>>> Hi Kirill,Mike
>>>>>>>>
>>>>>>>> At 02/07/2018 06:45 PM, Mike Galbraith wrote:
>>>>>>>>> On Wed, 2018-02-07 at 13:41 +0300, Kirill A. Shutemov wrote:
>>>>>>>>>> On Wed, Feb 07, 2018 at 05:25:05PM +0800, Dou Liyang wrote:
>>>>>>>>>>> Hi All,
>>>>>>>>>>>
>>>>>>>>>>> I met the makedumpfile failed in the upstream kernel which contained
>>>>>>>>>>> this patch. Did I missed something else?
>>>>>>>>>>
>>>>>>>>>> None I'm aware of.
>>>>>>>>>>
>>>>>>>>>> Is there a reason to suspect that the issue is related to the bug this patch
>>>>>>>>>> fixed?
>>>>>>>>>
>>>>>>>>
>>>>>>>> I did a contrastive test by my colleagues Indoh's suggestion.
>>>
>>> OK, I may get the reason. kaslr is enabled, right? You can try to
>>
>> I add 'nokaslr' to disable the KASLR feature.
>      ~~~added??

Oops! Yes, KASLR had already been disabled by this option when I tested.

>>
>> # cat /proc/cmdline
>> BOOT_IMAGE=/vmlinuz-4.15.0+ root=UUID=10f10326-c923-4098-86aa-afed5c54ee0b
>> ro crashkernel=512M rhgb console=tty0 console=ttyS0 nokaslr LANG=en_US.UTF-8
>>
>>> disable kaslr and try them again. Because phys_base and kaslr_offset are
>>> got from vmlinux, while these are generated at compiling time. Just a
>>> guess.
>>>
>>
>> Oh, I will recompile the kernel with KASLR disabled in .config.
> 
> Then it's not what I guessed. Need debug makedumpfile since using
> vmlinux is another code path, few people use it usually.
> 

Understood, I will try to look into it.

Thanks,
	dou

>>
>>
>> Thanks,
>> 	dou.
>>>>>>>>
>>>>>>>> Revert your two commits:
>>>>>>>>
>>>>>>>> commit 83e3c48729d9ebb7af5a31a504f3fd6aff0348c4
>>>>>>>> Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>>>>>>>> Date:   Fri Sep 29 17:08:16 2017 +0300
>>>>>>>>
>>>>>>>> commit 629a359bdb0e0652a8227b4ff3125431995fec6e
>>>>>>>> Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>>>>>>>> Date:   Tue Nov 7 11:33:37 2017 +0300
>>>>>>>>
>>>>>>>> ...and keep others unchanged, the makedumpfile works well.
>>>>>>>>
>>>>>>>>> Still works fine for me with .today.  Box is only 16GB desktop box though.
>>>>>>>>>
>>>>>>>> Btw, In the upstream kernel which contained this patch, I did two tests:
>>>>>>>>
>>>>>>>>      1) use the makedumpfile as core_collector in /etc/kdump.conf, then
>>>>>>>> trigger the process of kdump by echo 1 >/proc/sysrq-trigger, the
>>>>>>>> makedumpfile works well and I can get the vmcore file.
>>>>>>>>
>>>>>>>>          ......It is OK
>>>>>>>>
>>>>>>>>      2) use cp as core_collector, do the same operation to get the vmcore file.
>>>>>>>> then use makedumpfile to do like above:
>>>>>>>>
>>>>>>>>         [douly@localhost code]$ ./makedumpfile -d 31 --message-level 31 -x
>>>>>>>> vmlinux_4.15+ vmcore_4.15+_from_cp_command vmcore_4.15+
>>>>>>>
>>>>>>> Oh, then please ignore my previous comment. Adding '-D' can give more
>>>>>>> debugging message.
>>>>>>
>>>>>> I added '-D', Just like before, no more debugging message:
>>>>>>
>>>>>> BTW, I use crash to analyze the vmcore file created by 'cp' command.
>>>>>>
>>>>>>       ./crash ../makedumpfile/code/vmcore_4.15+_from_cp_command
>>>>>> ../makedumpfile/code/vmlinux_4.15+
>>>>>>
>>>>>> the crash works well, It's so interesting.
>>>>>>
>>>>>> Thanks,
>>>>>> 	dou.
>>>>>>
>>>>>> The debugging message with '-D':
>>>>>
>>>>> And what's the debugging printing when trigger crash by sysrq?
>>>>>
>>>>
>>>> kdump: dump target is /dev/vda2
>>>> kdump: saving to /sysroot//var/crash/127.0.0.1-2018-02-07-07:31:56/
>>>> [    2.751352] EXT4-fs (vda2): re-mounted. Opts: data=ordered
>>>> kdump: saving vmcore-dmesg.txt
>>>> kdump: saving vmcore-dmesg.txt complete
>>>> kdump: saving vmcore
>>>> sadump: does not have partition header
>>>> sadump: read dump device as unknown format
>>>> sadump: unknown format
>>>> LOAD (0)
>>>>     phys_start : 1000000
>>>>     phys_end   : 2a86000
>>>>     virt_start : ffffffff81000000
>>>>     virt_end   : ffffffff82a86000
>>>> LOAD (1)
>>>>     phys_start : 1000
>>>>     phys_end   : 9fc00
>>>>     virt_start : ffff880000001000
>>>>     virt_end   : ffff88000009fc00
>>>> LOAD (2)
>>>>     phys_start : 100000
>>>>     phys_end   : 13000000
>>>>     virt_start : ffff880000100000
>>>>     virt_end   : ffff880013000000
>>>> LOAD (3)
>>>>     phys_start : 33000000
>>>>     phys_end   : 7ffd7000
>>>>     virt_start : ffff880033000000
>>>>     virt_end   : ffff88007ffd7000
>>>> Linux kdump
>>>> page_size    : 4096
>>>>
>>>> max_mapnr    : 7ffd7
>>>>
>>>> Buffer size for the cyclic mode: 131061
>>>>
>>>> num of NODEs : 1
>>>>
>>>>
>>>> Memory type  : SPARSEMEM_EX
>>>>
>>>> mem_map (0)
>>>>     mem_map    : ffffea0000000000
>>>>     pfn_start  : 0
>>>>     pfn_end    : 8000
>>>> mem_map (1)
>>>>     mem_map    : ffffea0000200000
>>>>     pfn_start  : 8000
>>>>     pfn_end    : 10000
>>>> mem_map (2)
>>>>     mem_map    : ffffea0000400000
>>>>     pfn_start  : 10000
>>>>     pfn_end    : 18000
>>>> mem_map (3)
>>>>     mem_map    : ffffea0000600000
>>>>     pfn_start  : 18000
>>>>     pfn_end    : 20000
>>>> mem_map (4)
>>>>     mem_map    : ffffea0000800000
>>>>     pfn_start  : 20000
>>>>     pfn_end    : 28000
>>>> mem_map (5)
>>>>     mem_map    : ffffea0000a00000
>>>>     pfn_start  : 28000
>>>>     pfn_end    : 30000
>>>> mem_map (6)
>>>>     mem_map    : ffffea0000c00000
>>>>     pfn_start  : 30000
>>>>     pfn_end    : 38000
>>>> mem_map (7)
>>>>     mem_map    : ffffea0000e00000
>>>>     pfn_start  : 38000
>>>>     pfn_end    : 40000
>>>> mem_map (8)
>>>>     mem_map    : ffffea0001000000
>>>>     pfn_start  : 40000
>>>>     pfn_end    : 48000
>>>> mem_map (9)
>>>>     mem_map    : ffffea0001200000
>>>>     pfn_start  : 48000
>>>>     pfn_end    : 50000
>>>> mem_map (10)
>>>>     mem_map    : ffffea0001400000
>>>>     pfn_start  : 50000
>>>>     pfn_end    : 58000
>>>> mem_map (11)
>>>>     mem_map    : ffffea0001600000
>>>>     pfn_start  : 58000
>>>>     pfn_end    : 60000
>>>> mem_map (12)
>>>>     mem_map    : ffffea0001800000
>>>>     pfn_start  : 60000
>>>>     pfn_end    : 68000
>>>> mem_map (13)
>>>>     mem_map    : ffffea0001a00000
>>>>     pfn_start  : 68000
>>>>     pfn_end    : 70000
>>>> mem_map (14)
>>>>     mem_map    : ffffea0001c00000
>>>>     pfn_start  : 70000
>>>>     pfn_end    : 78000
>>>> mem_map (15)
>>>>     mem_map    : ffffea0001e00000
>>>>     pfn_start  : 78000
>>>>     pfn_end    : 7ffd7
>>>> mmap() is available on the kernel.
>>>> Copying data                                      : [100.0 %] -  eta: 0s
>>>> Writing erase info...
>>>> offset_eraseinfo: 9567fb0, size_eraseinfo: 0
>>>> kdump: saving vmcore complete
>>>>
>>>> Thanks,
>>>> 	dou
>>>>
>>>>>>
>>>>>> [douly@localhost code]$ ./makedumpfile -D -d 31 --message-level 31 -x
>>>>>> vmlinux_4.15+  vmcore_4.15+_from_cp_command vmcore_4.15+
>>>>>> sadump: does not have partition header
>>>>>> sadump: read dump device as unknown format
>>>>>> sadump: unknown format
>>>>>> LOAD (0)
>>>>>>      phys_start : 1000000
>>>>>>      phys_end   : 2a86000
>>>>>>      virt_start : ffffffff81000000
>>>>>>      virt_end   : ffffffff82a86000
>>>>>> LOAD (1)
>>>>>>      phys_start : 1000
>>>>>>      phys_end   : 9fc00
>>>>>>      virt_start : ffff880000001000
>>>>>>      virt_end   : ffff88000009fc00
>>>>>> LOAD (2)
>>>>>>      phys_start : 100000
>>>>>>      phys_end   : 13000000
>>>>>>      virt_start : ffff880000100000
>>>>>>      virt_end   : ffff880013000000
>>>>>> LOAD (3)
>>>>>>      phys_start : 33000000
>>>>>>      phys_end   : 7ffd7000
>>>>>>      virt_start : ffff880033000000
>>>>>>      virt_end   : ffff88007ffd7000
>>>>>> Linux kdump
>>>>>> page_size    : 4096
>>>>>>
>>>>>> max_mapnr    : 7ffd7
>>>>>>
>>>>>> Buffer size for the cyclic mode: 131061
>>>>>> The kernel version is not supported.
>>>>>> The makedumpfile operation may be incomplete.
>>>>>>
>>>>>> num of NODEs : 1
>>>>>>
>>>>>>
>>>>>> Memory type  : SPARSEMEM_EX
>>>>>>
>>>>>> mem_map (0)
>>>>>>      mem_map    : ffff88007ff26000
>>>>>>      pfn_start  : 0
>>>>>>      pfn_end    : 8000
>>>>>> mem_map (1)
>>>>>>      mem_map    : 0
>>>>>>      pfn_start  : 8000
>>>>>>      pfn_end    : 10000
>>>>>> mem_map (2)
>>>>>>      mem_map    : 0
>>>>>>      pfn_start  : 10000
>>>>>>      pfn_end    : 18000
>>>>>> mem_map (3)
>>>>>>      mem_map    : 0
>>>>>>      pfn_start  : 18000
>>>>>>      pfn_end    : 20000
>>>>>> mem_map (4)
>>>>>>      mem_map    : 0
>>>>>>      pfn_start  : 20000
>>>>>>      pfn_end    : 28000
>>>>>> mem_map (5)
>>>>>>      mem_map    : 0
>>>>>>      pfn_start  : 28000
>>>>>>      pfn_end    : 30000
>>>>>> mem_map (6)
>>>>>>      mem_map    : 0
>>>>>>      pfn_start  : 30000
>>>>>>      pfn_end    : 38000
>>>>>> mem_map (7)
>>>>>>      mem_map    : 0
>>>>>>      pfn_start  : 38000
>>>>>>      pfn_end    : 40000
>>>>>> mem_map (8)
>>>>>>      mem_map    : 0
>>>>>>      pfn_start  : 40000
>>>>>>      pfn_end    : 48000
>>>>>> mem_map (9)
>>>>>>      mem_map    : 0
>>>>>>      pfn_start  : 48000
>>>>>>      pfn_end    : 50000
>>>>>> mem_map (10)
>>>>>>      mem_map    : 0
>>>>>>      pfn_start  : 50000
>>>>>>      pfn_end    : 58000
>>>>>> mem_map (11)
>>>>>>      mem_map    : 0
>>>>>>      pfn_start  : 58000
>>>>>>      pfn_end    : 60000
>>>>>> mem_map (12)
>>>>>>      mem_map    : 0
>>>>>>      pfn_start  : 60000
>>>>>>      pfn_end    : 68000
>>>>>> mem_map (13)
>>>>>>      mem_map    : 0
>>>>>>      pfn_start  : 68000
>>>>>>      pfn_end    : 70000
>>>>>> mem_map (14)
>>>>>>      mem_map    : 0
>>>>>>      pfn_start  : 70000
>>>>>>      pfn_end    : 78000
>>>>>> mem_map (15)
>>>>>>      mem_map    : 0
>>>>>>      pfn_start  : 78000
>>>>>>      pfn_end    : 7ffd7
>>>>>> mmap() is available on the kernel.
>>>>>> Checking for memory holes                         : [100.0 %] |         STEP
>>>>>> [Checking for memory holes  ] : 0.000014 seconds
>>>>>> __vtop4_x86_64: Can't get a valid pte.
>>>>>> readmem: Can't convert a virtual address(ffff88007ffd7000) to physical
>>>>>> address.
>>>>>> readmem: type_addr: 0, addr:ffff88007ffd7000, size:32768
>>>>>> __exclude_unnecessary_pages: Can't read the buffer of struct page.
>>>>>> create_2nd_bitmap: Can't exclude unnecessary pages.
>>>>>> Checking for memory holes                         : [100.0 %] \         STEP
>>>>>> [Checking for memory holes  ] : 0.000006 seconds
>>>>>> Checking for memory holes                         : [100.0 %] -         STEP
>>>>>> [Checking for memory holes  ] : 0.000004 seconds
>>>>>> __vtop4_x86_64: Can't get a valid pte.
>>>>>> readmem: Can't convert a virtual address(ffff88007ffd7000) to physical
>>>>>> address.
>>>>>> readmem: type_addr: 0, addr:ffff88007ffd7000, size:32768
>>>>>> __exclude_unnecessary_pages: Can't read the buffer of struct page.
>>>>>> create_2nd_bitmap: Can't exclude unnecessary pages.
>>>>>>
>>>>>> makedumpfile Failed.
>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>         ......It causes makedumpfile failed.
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> 	dou.
>>>>>>>>
>>>>>>>>> 	-Mike
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>>
>>
>> _______________________________________________
>> kexec mailing list
>> kexec@lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/kexec
> 
> 
> 
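
The mem_map values printed in the working sysrq run quoted above allow a quick consistency check of the per-section stride. This is a small Python sketch that uses only values shown in that log; the names WORKING, bytes_per_page and expected_mem_map are illustrative and do not come from makedumpfile.

```python
# (section index, mem_map, pfn_start, pfn_end) copied from the working run.
WORKING = [
    (0, 0xffffea0000000000, 0x0, 0x8000),
    (1, 0xffffea0000200000, 0x8000, 0x10000),
    (2, 0xffffea0000400000, 0x10000, 0x18000),
    (15, 0xffffea0001e00000, 0x78000, 0x7ffd7),
]

def bytes_per_page(entries):
    """Derive the mem_map bytes per pfn from the first two entries."""
    (_, map0, pfn0, _), (_, map1, pfn1, _) = entries[0], entries[1]
    return (map1 - map0) // (pfn1 - pfn0)

def expected_mem_map(base, pfn_start, per_page):
    """mem_map address a section would have if the array were contiguous."""
    return base + pfn_start * per_page

per_page = bytes_per_page(WORKING)
base = WORKING[0][1]
consistent = all(
    expected_mem_map(base, pfn_start, per_page) == mem_map
    for _, mem_map, pfn_start, _ in WORKING
)
print(per_page, consistent)
```

In the failing run only mem_map (0) is non-zero, and its value (ffff88007ff26000) falls in the ffff88... range used by the LOAD segments rather than in the ffffea... range seen here, so the same check cannot be applied to that output.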


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 349+ messages in thread

* Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
@ 2018-02-08  1:44                                     ` Dou Liyang
  0 siblings, 0 replies; 349+ messages in thread
From: Dou Liyang @ 2018-02-08  1:44 UTC (permalink / raw)
  To: Baoquan He
  Cc: Ingo Molnar, Takao Indoh, Peter Zijlstra, Greg Kroah-Hartman,
	Dave Young, Mike Galbraith, kexec, linux-kernel, stable,
	Andy Lutomirski, linux-mm, Vivek Goyal, Cyrill Gorcunov,
	Kirill A. Shutemov, Thomas Gleixner, Borislav Petkov,
	Linus Torvalds, Andrew Morton, Kirill A. Shutemov

Hi Baoquan,

At 02/08/2018 09:23 AM, Baoquan He wrote:
> On 02/08/18 at 09:14am, Dou Liyang wrote:
>> Hi Baoquan,
>>
>> At 02/07/2018 08:45 PM, Baoquan He wrote:
>>> On 02/07/18 at 08:34pm, Dou Liyang wrote:
>>>>
>>>>
>>>> At 02/07/2018 08:27 PM, Baoquan He wrote:
>>>>> On 02/07/18 at 08:17pm, Dou Liyang wrote:
>>>>>> Hi Baoquan,
>>>>>>
>>>>>> At 02/07/2018 08:08 PM, Baoquan He wrote:
>>>>>>> On 02/07/18 at 08:00pm, Dou Liyang wrote:
>>>>>>>> Hi Kirill,Mike
>>>>>>>>
>>>>>>>> At 02/07/2018 06:45 PM, Mike Galbraith wrote:
>>>>>>>>> On Wed, 2018-02-07 at 13:41 +0300, Kirill A. Shutemov wrote:
>>>>>>>>>> On Wed, Feb 07, 2018 at 05:25:05PM +0800, Dou Liyang wrote:
>>>>>>>>>>> Hi All,
>>>>>>>>>>>
>>>>>>>>>>> I hit a makedumpfile failure on the upstream kernel that contains
>>>>>>>>>>> this patch. Did I miss something else?
>>>>>>>>>>
>>>>>>>>>> None I'm aware of.
>>>>>>>>>>
>>>>>>>>>> Is there a reason to suspect that the issue is related to the bug this patch
>>>>>>>>>> fixed?
>>>>>>>>>
>>>>>>>>
>>>>>>>> I ran a comparison test at my colleague Indoh's suggestion.
>>>
>>> OK, I may get the reason. kaslr is enabled, right? You can try to
>>
>> I add 'nokaslr' to disable the KASLR feature.
>      ~~~added??

Oops! Yes, KASLR had already been disabled by this option when I tested.

>>
>> # cat /proc/cmdline
>> BOOT_IMAGE=/vmlinuz-4.15.0+ root=UUID=10f10326-c923-4098-86aa-afed5c54ee0b
>> ro crashkernel=512M rhgb console=tty0 console=ttyS0 nokaslr LANG=en_US.UTF-8
>>
>>> disable kaslr and try them again. Because phys_base and kaslr_offset are
>>> got from vmlinux, while these are generated at compiling time. Just a
>>> guess.
>>>
>>
>> Oh, I will recompile the kernel with KASLR disabled in .config.
> 
> Then it's not what I guessed. Need debug makedumpfile since using
> vmlinux is another code path, few people use it usually.
> 

Understood, I will try to look into it.

Thanks,
	dou
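
As a starting point for that investigation, here is a minimal sketch of the
arithmetic behind the mem_map table in the working (sysrq/kdump) log quoted
further down in this message. All constants are copied from that log; the
function names are hypothetical and are not part of makedumpfile. It only
shows what a healthy table looks like, so the failing run (where mem_map (0)
is ffff88007ff26000 and every other entry is 0) can be compared against it.

```python
# Values copied from the working log quoted in this message.
PAGES_PER_SECTION = 0x8000        # pfn_end - pfn_start of mem_map (0)
MEM_MAP_0 = 0xffffea0000000000    # mem_map (0) in the working run
MEM_MAP_1 = 0xffffea0000200000    # mem_map (1) in the working run
MAX_MAPNR = 0x7ffd7               # max_mapnr from the log


def implied_struct_page_size():
    """Bytes of mem_map consumed per pfn, inferred from two adjacent sections."""
    return (MEM_MAP_1 - MEM_MAP_0) // PAGES_PER_SECTION


def expected_mem_map(section, base=MEM_MAP_0):
    """mem_map address expected for a section if the table is contiguous."""
    return base + section * PAGES_PER_SECTION * implied_struct_page_size()


def num_sections(max_mapnr=MAX_MAPNR):
    """Number of sections needed to cover max_mapnr pfns (ceiling division)."""
    return -(-max_mapnr // PAGES_PER_SECTION)


if __name__ == "__main__":
    print("bytes per pfn:", implied_struct_page_size())
    print("sections     :", num_sections())
    for n in range(num_sections()):
        print("mem_map (%d) expected at %x" % (n, expected_mem_map(n)))
```

The addresses printed by this sketch can be compared line by line with the
mem_map entries in the two logs.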

>>
>>
>> Thanks,
>> 	dou.
>>>>>>>>
>>>>>>>> Revert your two commits:
>>>>>>>>
>>>>>>>> commit 83e3c48729d9ebb7af5a31a504f3fd6aff0348c4
>>>>>>>> Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>>>>>>>> Date:   Fri Sep 29 17:08:16 2017 +0300
>>>>>>>>
>>>>>>>> commit 629a359bdb0e0652a8227b4ff3125431995fec6e
>>>>>>>> Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>>>>>>>> Date:   Tue Nov 7 11:33:37 2017 +0300
>>>>>>>>
>>>>>>>> ...and keep others unchanged, the makedumpfile works well.
>>>>>>>>
>>>>>>>>> Still works fine for me with .today.  Box is only 16GB desktop box though.
>>>>>>>>>
>>>>>>>> Btw, In the upstream kernel which contained this patch, I did two tests:
>>>>>>>>
>>>>>>>>      1) use the makedumpfile as core_collector in /etc/kdump.conf, then
>>>>>>>> trigger the process of kdump by echo 1 >/proc/sysrq-trigger, the
>>>>>>>> makedumpfile works well and I can get the vmcore file.
>>>>>>>>
>>>>>>>>          ......It is OK
>>>>>>>>
>>>>>>>>      2) use cp as core_collector, do the same operation to get the vmcore file.
>>>>>>>> then use makedumpfile to do like above:
>>>>>>>>
>>>>>>>>         [douly@localhost code]$ ./makedumpfile -d 31 --message-level 31 -x
>>>>>>>> vmlinux_4.15+ vmcore_4.15+_from_cp_command vmcore_4.15+
>>>>>>>
>>>>>>> Oh, then please ignore my previous comment. Adding '-D' can give more
>>>>>>> debugging message.
>>>>>>
>>>>>> I added '-D', Just like before, no more debugging message:
>>>>>>
>>>>>> BTW, I use crash to analyze the vmcore file created by 'cp' command.
>>>>>>
>>>>>>       ./crash ../makedumpfile/code/vmcore_4.15+_from_cp_command
>>>>>> ../makedumpfile/code/vmlinux_4.15+
>>>>>>
>>>>>> the crash works well, It's so interesting.
>>>>>>
>>>>>> Thanks,
>>>>>> 	dou.
>>>>>>
>>>>>> The debugging message with '-D':
>>>>>
>>>>> And what's the debugging printing when trigger crash by sysrq?
>>>>>
>>>>
>>>> kdump: dump target is /dev/vda2
>>>> kdump: saving to /sysroot//var/crash/127.0.0.1-2018-02-07-07:31:56/
>>>> [    2.751352] EXT4-fs (vda2): re-mounted. Opts: data=ordered
>>>> kdump: saving vmcore-dmesg.txt
>>>> kdump: saving vmcore-dmesg.txt complete
>>>> kdump: saving vmcore
>>>> sadump: does not have partition header
>>>> sadump: read dump device as unknown format
>>>> sadump: unknown format
>>>> LOAD (0)
>>>>     phys_start : 1000000
>>>>     phys_end   : 2a86000
>>>>     virt_start : ffffffff81000000
>>>>     virt_end   : ffffffff82a86000
>>>> LOAD (1)
>>>>     phys_start : 1000
>>>>     phys_end   : 9fc00
>>>>     virt_start : ffff880000001000
>>>>     virt_end   : ffff88000009fc00
>>>> LOAD (2)
>>>>     phys_start : 100000
>>>>     phys_end   : 13000000
>>>>     virt_start : ffff880000100000
>>>>     virt_end   : ffff880013000000
>>>> LOAD (3)
>>>>     phys_start : 33000000
>>>>     phys_end   : 7ffd7000
>>>>     virt_start : ffff880033000000
>>>>     virt_end   : ffff88007ffd7000
>>>> Linux kdump
>>>> page_size    : 4096
>>>>
>>>> max_mapnr    : 7ffd7
>>>>
>>>> Buffer size for the cyclic mode: 131061
>>>>
>>>> num of NODEs : 1
>>>>
>>>>
>>>> Memory type  : SPARSEMEM_EX
>>>>
>>>> mem_map (0)
>>>>     mem_map    : ffffea0000000000
>>>>     pfn_start  : 0
>>>>     pfn_end    : 8000
>>>> mem_map (1)
>>>>     mem_map    : ffffea0000200000
>>>>     pfn_start  : 8000
>>>>     pfn_end    : 10000
>>>> mem_map (2)
>>>>     mem_map    : ffffea0000400000
>>>>     pfn_start  : 10000
>>>>     pfn_end    : 18000
>>>> mem_map (3)
>>>>     mem_map    : ffffea0000600000
>>>>     pfn_start  : 18000
>>>>     pfn_end    : 20000
>>>> mem_map (4)
>>>>     mem_map    : ffffea0000800000
>>>>     pfn_start  : 20000
>>>>     pfn_end    : 28000
>>>> mem_map (5)
>>>>     mem_map    : ffffea0000a00000
>>>>     pfn_start  : 28000
>>>>     pfn_end    : 30000
>>>> mem_map (6)
>>>>     mem_map    : ffffea0000c00000
>>>>     pfn_start  : 30000
>>>>     pfn_end    : 38000
>>>> mem_map (7)
>>>>     mem_map    : ffffea0000e00000
>>>>     pfn_start  : 38000
>>>>     pfn_end    : 40000
>>>> mem_map (8)
>>>>     mem_map    : ffffea0001000000
>>>>     pfn_start  : 40000
>>>>     pfn_end    : 48000
>>>> mem_map (9)
>>>>     mem_map    : ffffea0001200000
>>>>     pfn_start  : 48000
>>>>     pfn_end    : 50000
>>>> mem_map (10)
>>>>     mem_map    : ffffea0001400000
>>>>     pfn_start  : 50000
>>>>     pfn_end    : 58000
>>>> mem_map (11)
>>>>     mem_map    : ffffea0001600000
>>>>     pfn_start  : 58000
>>>>     pfn_end    : 60000
>>>> mem_map (12)
>>>>     mem_map    : ffffea0001800000
>>>>     pfn_start  : 60000
>>>>     pfn_end    : 68000
>>>> mem_map (13)
>>>>     mem_map    : ffffea0001a00000
>>>>     pfn_start  : 68000
>>>>     pfn_end    : 70000
>>>> mem_map (14)
>>>>     mem_map    : ffffea0001c00000
>>>>     pfn_start  : 70000
>>>>     pfn_end    : 78000
>>>> mem_map (15)
>>>>     mem_map    : ffffea0001e00000
>>>>     pfn_start  : 78000
>>>>     pfn_end    : 7ffd7
>>>> mmap() is available on the kernel.
>>>> Copying data                                      : [100.0 %] -  eta: 0s
>>>> Writing erase info...
>>>> offset_eraseinfo: 9567fb0, size_eraseinfo: 0
>>>> kdump: saving vmcore complete
>>>>
>>>> Thanks,
>>>> 	dou
>>>>
>>>>>>
>>>>>> [douly@localhost code]$ ./makedumpfile -D -d 31 --message-level 31 -x
>>>>>> vmlinux_4.15+  vmcore_4.15+_from_cp_command vmcore_4.15+
>>>>>> sadump: does not have partition header
>>>>>> sadump: read dump device as unknown format
>>>>>> sadump: unknown format
>>>>>> LOAD (0)
>>>>>>      phys_start : 1000000
>>>>>>      phys_end   : 2a86000
>>>>>>      virt_start : ffffffff81000000
>>>>>>      virt_end   : ffffffff82a86000
>>>>>> LOAD (1)
>>>>>>      phys_start : 1000
>>>>>>      phys_end   : 9fc00
>>>>>>      virt_start : ffff880000001000
>>>>>>      virt_end   : ffff88000009fc00
>>>>>> LOAD (2)
>>>>>>      phys_start : 100000
>>>>>>      phys_end   : 13000000
>>>>>>      virt_start : ffff880000100000
>>>>>>      virt_end   : ffff880013000000
>>>>>> LOAD (3)
>>>>>>      phys_start : 33000000
>>>>>>      phys_end   : 7ffd7000
>>>>>>      virt_start : ffff880033000000
>>>>>>      virt_end   : ffff88007ffd7000
>>>>>> Linux kdump
>>>>>> page_size    : 4096
>>>>>>
>>>>>> max_mapnr    : 7ffd7
>>>>>>
>>>>>> Buffer size for the cyclic mode: 131061
>>>>>> The kernel version is not supported.
>>>>>> The makedumpfile operation may be incomplete.
>>>>>>
>>>>>> num of NODEs : 1
>>>>>>
>>>>>>
>>>>>> Memory type  : SPARSEMEM_EX
>>>>>>
>>>>>> mem_map (0)
>>>>>>      mem_map    : ffff88007ff26000
>>>>>>      pfn_start  : 0
>>>>>>      pfn_end    : 8000
>>>>>> mem_map (1)
>>>>>>      mem_map    : 0
>>>>>>      pfn_start  : 8000
>>>>>>      pfn_end    : 10000
>>>>>> mem_map (2)
>>>>>>      mem_map    : 0
>>>>>>      pfn_start  : 10000
>>>>>>      pfn_end    : 18000
>>>>>> mem_map (3)
>>>>>>      mem_map    : 0
>>>>>>      pfn_start  : 18000
>>>>>>      pfn_end    : 20000
>>>>>> mem_map (4)
>>>>>>      mem_map    : 0
>>>>>>      pfn_start  : 20000
>>>>>>      pfn_end    : 28000
>>>>>> mem_map (5)
>>>>>>      mem_map    : 0
>>>>>>      pfn_start  : 28000
>>>>>>      pfn_end    : 30000
>>>>>> mem_map (6)
>>>>>>      mem_map    : 0
>>>>>>      pfn_start  : 30000
>>>>>>      pfn_end    : 38000
>>>>>> mem_map (7)
>>>>>>      mem_map    : 0
>>>>>>      pfn_start  : 38000
>>>>>>      pfn_end    : 40000
>>>>>> mem_map (8)
>>>>>>      mem_map    : 0
>>>>>>      pfn_start  : 40000
>>>>>>      pfn_end    : 48000
>>>>>> mem_map (9)
>>>>>>      mem_map    : 0
>>>>>>      pfn_start  : 48000
>>>>>>      pfn_end    : 50000
>>>>>> mem_map (10)
>>>>>>      mem_map    : 0
>>>>>>      pfn_start  : 50000
>>>>>>      pfn_end    : 58000
>>>>>> mem_map (11)
>>>>>>      mem_map    : 0
>>>>>>      pfn_start  : 58000
>>>>>>      pfn_end    : 60000
>>>>>> mem_map (12)
>>>>>>      mem_map    : 0
>>>>>>      pfn_start  : 60000
>>>>>>      pfn_end    : 68000
>>>>>> mem_map (13)
>>>>>>      mem_map    : 0
>>>>>>      pfn_start  : 68000
>>>>>>      pfn_end    : 70000
>>>>>> mem_map (14)
>>>>>>      mem_map    : 0
>>>>>>      pfn_start  : 70000
>>>>>>      pfn_end    : 78000
>>>>>> mem_map (15)
>>>>>>      mem_map    : 0
>>>>>>      pfn_start  : 78000
>>>>>>      pfn_end    : 7ffd7
>>>>>> mmap() is available on the kernel.
>>>>>> Checking for memory holes                         : [100.0 %] |         STEP
>>>>>> [Checking for memory holes  ] : 0.000014 seconds
>>>>>> __vtop4_x86_64: Can't get a valid pte.
>>>>>> readmem: Can't convert a virtual address(ffff88007ffd7000) to physical
>>>>>> address.
>>>>>> readmem: type_addr: 0, addr:ffff88007ffd7000, size:32768
>>>>>> __exclude_unnecessary_pages: Can't read the buffer of struct page.
>>>>>> create_2nd_bitmap: Can't exclude unnecessary pages.
>>>>>> Checking for memory holes                         : [100.0 %] \         STEP
>>>>>> [Checking for memory holes  ] : 0.000006 seconds
>>>>>> Checking for memory holes                         : [100.0 %] -         STEP
>>>>>> [Checking for memory holes  ] : 0.000004 seconds
>>>>>> __vtop4_x86_64: Can't get a valid pte.
>>>>>> readmem: Can't convert a virtual address(ffff88007ffd7000) to physical
>>>>>> address.
>>>>>> readmem: type_addr: 0, addr:ffff88007ffd7000, size:32768
>>>>>> __exclude_unnecessary_pages: Can't read the buffer of struct page.
>>>>>> create_2nd_bitmap: Can't exclude unnecessary pages.
>>>>>>
>>>>>> makedumpfile Failed.
>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>         ......It causes makedumpfile failed.
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> 	dou.
>>>>>>>>
>>>>>>>>> 	-Mike
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>>
>>
>> _______________________________________________
>> kexec mailing list
>> kexec@lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/kexec
> 
> 
> 
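
The readmem failure in the quoted log can be checked against the LOAD
segments printed in the same output. The sketch below is only an
illustration: the segment values are copied from the log, the helper name is
hypothetical, and it assumes that virt_end is an exclusive bound. Under that
assumption, the failing address ffff88007ffd7000 sits exactly at the end of
LOAD (3) and is therefore not backed by any segment, while the mem_map (0)
value reported in the failing run does fall inside LOAD (3).

```python
# LOAD segments copied from the makedumpfile -D output quoted above.
# Each tuple is (phys_start, phys_end, virt_start, virt_end).
LOAD_SEGMENTS = [
    (0x1000000, 0x2a86000, 0xffffffff81000000, 0xffffffff82a86000),
    (0x1000, 0x9fc00, 0xffff880000001000, 0xffff88000009fc00),
    (0x100000, 0x13000000, 0xffff880000100000, 0xffff880013000000),
    (0x33000000, 0x7ffd7000, 0xffff880033000000, 0xffff88007ffd7000),
]

FAILING_ADDR = 0xffff88007ffd7000     # address reported by readmem
FAILING_MEM_MAP_0 = 0xffff88007ff26000  # mem_map (0) in the failing run


def find_segment(vaddr, segments=LOAD_SEGMENTS):
    """Return the index of the LOAD segment containing vaddr, or None.

    Assumption: virt_end is exclusive (one past the last mapped byte).
    """
    for index, (_pstart, _pend, vstart, vend) in enumerate(segments):
        if vstart <= vaddr < vend:
            return index
    return None


if __name__ == "__main__":
    print("failing address in segment:", find_segment(FAILING_ADDR))
    print("mem_map (0) in segment    :", find_segment(FAILING_MEM_MAP_0))
```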





end of thread, other threads:[~2018-02-08  1:44 UTC | newest]

Thread overview: 349+ messages
2017-12-22  8:44 [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
2017-12-22  8:44 ` [PATCH 4.14 001/159] x86/asm: Remove unnecessary \n\t in front of CC_SET() from asm templates Greg Kroah-Hartman
2017-12-22  8:44 ` [PATCH 4.14 002/159] objtool: Dont report end of section error after an empty unwind hint Greg Kroah-Hartman
2017-12-22  8:44 ` [PATCH 4.14 003/159] x86/head: Remove confusing comment Greg Kroah-Hartman
2017-12-22  8:44 ` [PATCH 4.14 004/159] x86/head: Remove unused bad_address code Greg Kroah-Hartman
2017-12-22  8:44 ` [PATCH 4.14 005/159] x86/head: Fix head ELF function annotations Greg Kroah-Hartman
2017-12-22  8:44 ` [PATCH 4.14 006/159] x86/boot: Annotate verify_cpu() as a callable function Greg Kroah-Hartman
2017-12-22  8:44 ` [PATCH 4.14 007/159] x86/xen: Fix xen head ELF annotations Greg Kroah-Hartman
2017-12-22  8:44 ` [PATCH 4.14 008/159] x86/xen: Add unwind hint annotations Greg Kroah-Hartman
2017-12-22  8:44 ` [PATCH 4.14 009/159] x86/head: " Greg Kroah-Hartman
2017-12-22  8:44 ` [PATCH 4.14 010/159] ACPI / APEI: adjust a local variable type in ghes_ioremap_pfn_irq() Greg Kroah-Hartman
2017-12-22  8:44 ` [PATCH 4.14 011/159] x86/unwinder: Make CONFIG_UNWINDER_ORC=y the default in the 64-bit defconfig Greg Kroah-Hartman
2017-12-22  8:44 ` [PATCH 4.14 012/159] x86/fpu/debug: Remove unused x86_fpu_state and x86_fpu_deactivate_state tracepoints Greg Kroah-Hartman
2017-12-22  8:44 ` [PATCH 4.14 013/159] x86/unwind: Rename unwinder config options to CONFIG_UNWINDER_* Greg Kroah-Hartman
2017-12-22  8:44 ` [PATCH 4.14 014/159] x86/unwind: Make CONFIG_UNWINDER_ORC=y the default in kconfig for 64-bit Greg Kroah-Hartman
2017-12-22  8:45 ` [PATCH 4.14 015/159] bitops: Add clear/set_bit32() to linux/bitops.h Greg Kroah-Hartman
2017-12-26 21:41   ` Ben Hutchings
2017-12-27 12:48     ` Greg Kroah-Hartman
2017-12-27 19:40       ` Ben Hutchings
2017-12-22  8:45 ` [PATCH 4.14 016/159] x86/cpuid: Add generic table for CPUID dependencies Greg Kroah-Hartman
2017-12-22  8:45 ` [PATCH 4.14 017/159] x86/fpu: Parse clearcpuid= as early XSAVE argument Greg Kroah-Hartman
2017-12-22  8:45 ` [PATCH 4.14 018/159] x86/fpu: Make XSAVE check the base CPUID features before enabling Greg Kroah-Hartman
2017-12-22  8:45 ` [PATCH 4.14 019/159] x86/fpu: Remove the explicit clearing of XSAVE dependent features Greg Kroah-Hartman
2017-12-22  8:45 ` [PATCH 4.14 020/159] x86/platform/UV: Convert timers to use timer_setup() Greg Kroah-Hartman
2017-12-22  8:45 ` [PATCH 4.14 021/159] objtool: Print top level commands on incorrect usage Greg Kroah-Hartman
2017-12-22  8:45 ` [PATCH 4.14 022/159] x86/cpuid: Prevent out of bound access in do_clear_cpu_cap() Greg Kroah-Hartman
2017-12-22  8:45 ` [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y Greg Kroah-Hartman
2017-12-22  8:45   ` Greg Kroah-Hartman
2017-12-22 14:18   ` Dan Rue
2017-12-22 14:18     ` Dan Rue
2017-12-22 14:52     ` Naresh Kamboju
2017-12-22 14:52       ` Naresh Kamboju
2017-12-22 15:12       ` Greg Kroah-Hartman
2017-12-22 15:12         ` Greg Kroah-Hartman
2017-12-22 15:03     ` Greg Kroah-Hartman
2017-12-22 15:03       ` Greg Kroah-Hartman
2018-01-07  5:14   ` Mike Galbraith
2018-01-07  5:14     ` Mike Galbraith
2018-01-07  5:14     ` Mike Galbraith
2018-01-07  9:11     ` Greg Kroah-Hartman
2018-01-07  9:11       ` Greg Kroah-Hartman
2018-01-07  9:11       ` Greg Kroah-Hartman
2018-01-07  9:21       ` Mike Galbraith
2018-01-07  9:21         ` Mike Galbraith
2018-01-07  9:21         ` Mike Galbraith
2018-01-07 10:18       ` Michal Hocko
2018-01-07 10:18         ` Michal Hocko
2018-01-07 10:18         ` Michal Hocko
2018-01-07 10:42         ` Greg Kroah-Hartman
2018-01-07 10:42           ` Greg Kroah-Hartman
2018-01-07 10:42           ` Greg Kroah-Hartman
2018-01-07 12:44         ` Mike Galbraith
2018-01-07 12:44           ` Mike Galbraith
2018-01-07 12:44           ` Mike Galbraith
2018-01-07 13:23           ` Michal Hocko
2018-01-07 13:23             ` Michal Hocko
2018-01-07 13:23             ` Michal Hocko
2018-01-08  7:53             ` Greg Kroah-Hartman
2018-01-08  7:53               ` Greg Kroah-Hartman
2018-01-08  7:53               ` Greg Kroah-Hartman
2018-01-08  8:15               ` Mike Galbraith
2018-01-08  8:15                 ` Mike Galbraith
2018-01-08  8:15                 ` Mike Galbraith
2018-01-08  8:33                 ` Greg Kroah-Hartman
2018-01-08  8:33                   ` Greg Kroah-Hartman
2018-01-08  8:33                   ` Greg Kroah-Hartman
2018-01-08  9:45                   ` Mike Galbraith
2018-01-08  9:45                     ` Mike Galbraith
2018-01-08  9:45                     ` Mike Galbraith
2018-01-08  8:47               ` Michal Hocko
2018-01-08  8:47                 ` Michal Hocko
2018-01-08  8:47                 ` Michal Hocko
2018-01-08  9:10                 ` Greg Kroah-Hartman
2018-01-08  9:10                   ` Greg Kroah-Hartman
2018-01-08  9:10                   ` Greg Kroah-Hartman
2018-01-08  9:27                   ` Greg Kroah-Hartman
2018-01-08  9:27                     ` Greg Kroah-Hartman
2018-01-08  9:27                     ` Greg Kroah-Hartman
2018-01-08 16:04     ` Ingo Molnar
2018-01-08 16:04       ` Ingo Molnar
2018-01-08 16:04       ` Ingo Molnar
2018-01-08 17:46       ` Kirill A. Shutemov
2018-01-08 17:46         ` Kirill A. Shutemov
2018-01-09  0:13         ` Kirill A. Shutemov
2018-01-09  0:13           ` Kirill A. Shutemov
2018-01-09  0:13           ` Kirill A. Shutemov
2018-01-09  0:13           ` Kirill A. Shutemov
2018-01-09  1:09           ` Dave Young
2018-01-09  1:09             ` Dave Young
2018-01-09  1:09             ` Dave Young
2018-01-09  5:41             ` Baoquan He
2018-01-09  5:41               ` Baoquan He
2018-01-09  5:41               ` Baoquan He
2018-01-09  7:24               ` Dave Young
2018-01-09  7:24                 ` Dave Young
2018-01-09  7:24                 ` Dave Young
2018-01-09  9:05                 ` Kirill A. Shutemov
2018-01-09  9:05                   ` Kirill A. Shutemov
2018-01-09  9:05                   ` Kirill A. Shutemov
2018-01-10  3:08                   ` Dave Young
2018-01-10  3:08                     ` Dave Young
2018-01-10  3:08                     ` Dave Young
2018-01-10 11:16                     ` Kirill A. Shutemov
2018-01-10 11:16                       ` Kirill A. Shutemov
2018-01-10 11:16                       ` Kirill A. Shutemov
2018-01-10 11:16                       ` Kirill A. Shutemov
2018-01-11  1:06                       ` Baoquan He
2018-01-11  1:06                         ` Baoquan He
2018-01-11  1:06                         ` Baoquan He
2018-01-12  0:55                       ` Dave Young
2018-01-12  0:55                         ` Dave Young
2018-01-12  0:55                         ` Dave Young
2018-01-15  5:57                         ` Omar Sandoval
2018-01-15  5:57                           ` Omar Sandoval
2018-01-15  5:57                           ` Omar Sandoval
2018-01-15  5:57                           ` Omar Sandoval
2018-01-16  8:36                           ` Atsushi Kumagai
2018-01-16  8:36                             ` Atsushi Kumagai
2018-01-16  8:36                             ` Atsushi Kumagai
2018-01-16  8:36                             ` Atsushi Kumagai
2018-01-09  3:44           ` Mike Galbraith
2018-01-09  3:44             ` Mike Galbraith
2018-01-09  3:44             ` Mike Galbraith
2018-01-09  3:44             ` Mike Galbraith
2018-02-07  9:25             ` Dou Liyang
2018-02-07  9:25               ` Dou Liyang
2018-02-07  9:25               ` Dou Liyang
2018-02-07 10:41               ` Kirill A. Shutemov
2018-02-07 10:41                 ` Kirill A. Shutemov
2018-02-07 10:41                 ` Kirill A. Shutemov
2018-02-07 10:45                 ` Mike Galbraith
2018-02-07 10:45                   ` Mike Galbraith
2018-02-07 10:45                   ` Mike Galbraith
2018-02-07 10:45                   ` Mike Galbraith
2018-02-07 12:00                   ` Dou Liyang
2018-02-07 12:00                     ` Dou Liyang
2018-02-07 12:00                     ` Dou Liyang
2018-02-07 12:08                     ` Baoquan He
2018-02-07 12:08                       ` Baoquan He
2018-02-07 12:08                       ` Baoquan He
2018-02-07 12:17                       ` Dou Liyang
2018-02-07 12:17                         ` Dou Liyang
2018-02-07 12:17                         ` Dou Liyang
2018-02-07 12:17                         ` Dou Liyang
2018-02-07 12:27                         ` Baoquan He
2018-02-07 12:27                           ` Baoquan He
2018-02-07 12:27                           ` Baoquan He
2018-02-07 12:34                           ` Dou Liyang
2018-02-07 12:34                             ` Dou Liyang
2018-02-07 12:34                             ` Dou Liyang
2018-02-07 12:34                             ` Dou Liyang
2018-02-07 12:45                             ` Baoquan He
2018-02-07 12:45                               ` Baoquan He
2018-02-07 12:45                               ` Baoquan He
2018-02-08  1:14                               ` Dou Liyang
2018-02-08  1:14                                 ` Dou Liyang
2018-02-08  1:14                                 ` Dou Liyang
2018-02-08  1:23                                 ` Baoquan He
2018-02-08  1:23                                   ` Baoquan He
2018-02-08  1:23                                   ` Baoquan He
2018-02-08  1:44                                   ` Dou Liyang
2018-02-08  1:44                                     ` Dou Liyang
2018-02-08  1:44                                     ` Dou Liyang
2018-02-08  1:44                                     ` Dou Liyang
2018-02-07 11:28               ` Baoquan He
2018-02-07 11:28                 ` Baoquan He
2018-02-07 11:28                 ` Baoquan He
2018-01-17  5:24           ` Baoquan He
2018-01-17  5:24             ` Baoquan He
2018-01-17  5:24             ` Baoquan He
2018-01-25 15:50             ` Kirill A. Shutemov
2018-01-25 15:50               ` Kirill A. Shutemov
2018-01-25 15:50               ` Kirill A. Shutemov
2018-01-26  2:48               ` Baoquan He
2018-01-26  2:48                 ` Baoquan He
2018-01-26  2:48                 ` Baoquan He
2017-12-22  8:45 ` [PATCH 4.14 024/159] x86/kasan: Use the same shadow offset for 4- and 5-level paging Greg Kroah-Hartman
2017-12-22  8:45   ` Greg Kroah-Hartman
2017-12-22  8:45 ` [PATCH 4.14 025/159] x86/xen: Provide pre-built page tables only for CONFIG_XEN_PV=y and CONFIG_XEN_PVH=y Greg Kroah-Hartman
2017-12-22  8:45   ` Greg Kroah-Hartman
2017-12-22  8:45 ` [PATCH 4.14 026/159] x86/xen: Drop 5-level paging support code from the XEN_PV code Greg Kroah-Hartman
2017-12-22  8:45   ` Greg Kroah-Hartman
2017-12-22  8:45 ` [PATCH 4.14 027/159] ACPI / APEI: remove the unused dead-code for SEA/NMI notification type Greg Kroah-Hartman
2017-12-22  8:45 ` [PATCH 4.14 028/159] x86/asm: Dont use the confusing .ifeq directive Greg Kroah-Hartman
2017-12-22  8:45 ` [PATCH 4.14 029/159] x86/build: Beautify build log of syscall headers Greg Kroah-Hartman
2017-12-22  8:45 ` [PATCH 4.14 030/159] x86/mm/64: Rename the register_page_bootmem_memmap() size parameter to nr_pages Greg Kroah-Hartman
2017-12-22  8:45 ` [PATCH 4.14 031/159] x86/cpufeatures: Enable new SSE/AVX/AVX512 CPU features Greg Kroah-Hartman
2017-12-22  8:45 ` [PATCH 4.14 032/159] x86/mm: Relocate page fault error codes to traps.h Greg Kroah-Hartman
2017-12-22  8:45 ` [PATCH 4.14 033/159] x86/boot: Relocate definition of the initial state of CR0 Greg Kroah-Hartman
2017-12-22  8:45   ` Greg Kroah-Hartman
2017-12-22  8:45   ` Greg Kroah-Hartman
2017-12-22  8:45 ` [PATCH 4.14 034/159] ptrace,x86: Make user_64bit_mode() available to 32-bit builds Greg Kroah-Hartman
2017-12-22  8:45 ` [PATCH 4.14 035/159] x86/entry/64: Remove the restore_c_regs_and_iret label Greg Kroah-Hartman
2017-12-22  8:45 ` [PATCH 4.14 036/159] x86/entry/64: Split the IRET-to-user and IRET-to-kernel paths Greg Kroah-Hartman
2017-12-22  8:45 ` [PATCH 4.14 037/159] x86/entry/64: Move SWAPGS into the common IRET-to-usermode path Greg Kroah-Hartman
2017-12-22  8:45 ` [PATCH 4.14 038/159] x86/entry/64: Simplify reg restore code in the standard IRET paths Greg Kroah-Hartman
2017-12-22  8:45 ` [PATCH 4.14 039/159] x86/entry/64: Shrink paranoid_exit_restore and make labels local Greg Kroah-Hartman
2017-12-22  8:45 ` [PATCH 4.14 040/159] x86/entry/64: Use pop instead of movq in syscall_return_via_sysret Greg Kroah-Hartman
2017-12-22  8:45 ` [PATCH 4.14 041/159] x86/entry/64: Merge the fast and slow SYSRET paths Greg Kroah-Hartman
2017-12-22  8:45 ` [PATCH 4.14 042/159] x86/entry/64: Use POP instead of MOV to restore regs on NMI return Greg Kroah-Hartman
2017-12-22  8:45 ` [PATCH 4.14 043/159] x86/entry/64: Remove the RESTORE_..._REGS infrastructure Greg Kroah-Hartman
2017-12-22  8:45 ` [PATCH 4.14 044/159] xen, x86/entry/64: Add xen NMI trap entry Greg Kroah-Hartman
2017-12-22  8:45 ` [PATCH 4.14 045/159] x86/entry/64: De-Xen-ify our NMI code Greg Kroah-Hartman
2017-12-22  8:45 ` [PATCH 4.14 046/159] x86/entry/32: Pull the MSR_IA32_SYSENTER_CS update code out of native_load_sp0() Greg Kroah-Hartman
2017-12-22  8:45 ` [PATCH 4.14 047/159] x86/entry/64: Pass SP0 directly to load_sp0() Greg Kroah-Hartman
2017-12-22  8:45 ` [PATCH 4.14 048/159] x86/entry: Add task_top_of_stack() to find the top of a tasks stack Greg Kroah-Hartman
2017-12-22  8:45 ` [PATCH 4.14 049/159] x86/xen/64, x86/entry/64: Clean up SP code in cpu_initialize_context() Greg Kroah-Hartman
2017-12-22  8:45 ` [PATCH 4.14 050/159] x86/entry/64: Stop initializing TSS.sp0 at boot Greg Kroah-Hartman
2017-12-22  8:45 ` [PATCH 4.14 051/159] x86/entry/64: Remove all remaining direct thread_struct::sp0 reads Greg Kroah-Hartman
2017-12-22  8:45 ` [PATCH 4.14 052/159] x86/entry/32: Fix cpu_current_top_of_stack initialization at boot Greg Kroah-Hartman
2017-12-22  8:45 ` [PATCH 4.14 053/159] x86/entry/64: Remove thread_struct::sp0 Greg Kroah-Hartman
2017-12-22  8:45 ` [PATCH 4.14 054/159] x86/traps: Use a new on_thread_stack() helper to clean up an assertion Greg Kroah-Hartman
2017-12-22  8:45 ` [PATCH 4.14 055/159] x86/entry/64: Shorten TEST instructions Greg Kroah-Hartman
2017-12-22  8:45 ` [PATCH 4.14 056/159] x86/cpuid: Replace set/clear_bit32() Greg Kroah-Hartman
2017-12-22  8:45 ` [PATCH 4.14 057/159] bitops: Revert cbe96375025e ("bitops: Add clear/set_bit32() to linux/bitops.h") Greg Kroah-Hartman
2017-12-22  8:45 ` [PATCH 4.14 058/159] x86/mm: Define _PAGE_TABLE using _KERNPG_TABLE Greg Kroah-Hartman
2017-12-22  8:45 ` [PATCH 4.14 059/159] x86/cpufeatures: Re-tabulate the X86_FEATURE definitions Greg Kroah-Hartman
2017-12-22  8:45 ` [PATCH 4.14 060/159] x86/cpufeatures: Fix various details in the feature definitions Greg Kroah-Hartman
2017-12-22  8:45 ` [PATCH 4.14 061/159] selftests/x86/ldt_gdt: Add infrastructure to test set_thread_area() Greg Kroah-Hartman
2017-12-22  8:45 ` [PATCH 4.14 062/159] selftests/x86/ldt_gdt: Run most existing LDT test cases against the GDT as well Greg Kroah-Hartman
2017-12-22  8:45 ` [PATCH 4.14 063/159] ACPI / APEI: Replace ioremap_page_range() with fixmap Greg Kroah-Hartman
2017-12-22  8:45 ` [PATCH 4.14 064/159] x86/virt, x86/platform: Merge struct x86_hyper into struct x86_platform and struct x86_init Greg Kroah-Hartman
2017-12-22  8:45   ` Greg Kroah-Hartman
2017-12-22  8:45 ` Greg Kroah-Hartman
2017-12-22  8:45 ` [PATCH 4.14 065/159] x86/virt: Add enum for hypervisors to replace x86_hyper Greg Kroah-Hartman
2017-12-22  8:45 ` Greg Kroah-Hartman
2017-12-22  8:45 ` Greg Kroah-Hartman
2017-12-22  8:45   ` Greg Kroah-Hartman
2017-12-22  8:45 ` [PATCH 4.14 066/159] drivers/misc/intel/pti: Rename the header file to free up the namespace Greg Kroah-Hartman
2017-12-22  8:45 ` [PATCH 4.14 067/159] x86/cpufeature: Add User-Mode Instruction Prevention definitions Greg Kroah-Hartman
2017-12-22  8:45 ` [PATCH 4.14 068/159] x86: Make X86_BUG_FXSAVE_LEAK detectable in CPUID on AMD Greg Kroah-Hartman
2017-12-22  8:45 ` [PATCH 4.14 069/159] perf/x86: Enable free running PEBS for REGS_USER/INTR Greg Kroah-Hartman
2017-12-22  8:45 ` [PATCH 4.14 070/159] bpf: fix build issues on um due to mising bpf_perf_event.h Greg Kroah-Hartman
2017-12-22  8:45 ` [PATCH 4.14 071/159] locking/barriers: Add implicit smp_read_barrier_depends() to READ_ONCE() Greg Kroah-Hartman
2017-12-22  8:45 ` [PATCH 4.14 072/159] locking/barriers: Convert users of lockless_dereference() " Greg Kroah-Hartman
2017-12-22  8:45 ` [PATCH 4.14 073/159] x86/mm/kasan: Dont use vmemmap_populate() to initialize shadow Greg Kroah-Hartman
2017-12-22  8:45 ` [PATCH 4.14 074/159] x86/entry/64/paravirt: Use paravirt-safe macro to access eflags Greg Kroah-Hartman
2017-12-22  8:45   ` Greg Kroah-Hartman
2017-12-22  8:46 ` [PATCH 4.14 075/159] x86/unwinder/orc: Dont bail on stack overflow Greg Kroah-Hartman
2017-12-22  8:46 ` [PATCH 4.14 076/159] x86/unwinder: Handle stack overflows more gracefully Greg Kroah-Hartman
2017-12-22  8:46 ` [PATCH 4.14 077/159] x86/irq: Remove an old outdated comment about context tracking races Greg Kroah-Hartman
2017-12-22  8:46 ` [PATCH 4.14 078/159] x86/irq/64: Print the offending IP in the stack overflow warning Greg Kroah-Hartman
2017-12-22  8:46 ` [PATCH 4.14 079/159] x86/entry/64: Allocate and enable the SYSENTER stack Greg Kroah-Hartman
2017-12-22  8:46 ` [PATCH 4.14 080/159] x86/dumpstack: Add get_stack_info() support for " Greg Kroah-Hartman
2017-12-22  8:46 ` [PATCH 4.14 081/159] x86/entry/gdt: Put per-CPU GDT remaps in ascending order Greg Kroah-Hartman
2017-12-22  8:46 ` [PATCH 4.14 082/159] x86/mm/fixmap: Generalize the GDT fixmap mechanism, introduce struct cpu_entry_area Greg Kroah-Hartman
2017-12-22  8:46 ` [PATCH 4.14 083/159] x86/kasan/64: Teach KASAN about the cpu_entry_area Greg Kroah-Hartman
2017-12-22  8:46 ` [PATCH 4.14 084/159] x86/entry: Fix assumptions that the HW TSS is at the beginning of cpu_tss Greg Kroah-Hartman
2017-12-22  8:46 ` [PATCH 4.14 085/159] x86/dumpstack: Handle stack overflow on all stacks Greg Kroah-Hartman
2017-12-22  8:46 ` [PATCH 4.14 086/159] x86/entry: Move SYSENTER_stack to the beginning of struct tss_struct Greg Kroah-Hartman
2017-12-22  8:46 ` [PATCH 4.14 087/159] x86/entry: Remap the TSS into the CPU entry area Greg Kroah-Hartman
2017-12-22  8:46 ` [PATCH 4.14 088/159] x86/entry/64: Separate cpu_current_top_of_stack from TSS.sp0 Greg Kroah-Hartman
2017-12-22  8:46 ` [PATCH 4.14 089/159] x86/espfix/64: Stop assuming that pt_regs is on the entry stack Greg Kroah-Hartman
2017-12-22  8:46 ` [PATCH 4.14 090/159] x86/entry/64: Use a per-CPU trampoline stack for IDT entries Greg Kroah-Hartman
2017-12-22  8:46 ` [PATCH 4.14 091/159] x86/entry/64: Return to userspace from the trampoline stack Greg Kroah-Hartman
2017-12-22  8:46 ` [PATCH 4.14 092/159] x86/entry/64: Create a per-CPU SYSCALL entry trampoline Greg Kroah-Hartman
2017-12-22  8:46 ` [PATCH 4.14 093/159] x86/entry/64: Move the IST stacks into struct cpu_entry_area Greg Kroah-Hartman
2017-12-22  8:46 ` [PATCH 4.14 094/159] x86/entry/64: Remove the SYSENTER stack canary Greg Kroah-Hartman
2017-12-22  8:46 ` [PATCH 4.14 095/159] x86/entry: Clean up the SYSENTER_stack code Greg Kroah-Hartman
2017-12-22  8:46 ` [PATCH 4.14 096/159] x86/entry/64: Make cpu_entry_area.tss read-only Greg Kroah-Hartman
2017-12-22  8:46 ` [PATCH 4.14 097/159] x86/paravirt: Dont patch flush_tlb_single Greg Kroah-Hartman
2017-12-22  8:46   ` Greg Kroah-Hartman
2017-12-22  8:46 ` [PATCH 4.14 098/159] x86/paravirt: Provide a way to check for hypervisors Greg Kroah-Hartman
2017-12-22  8:46 ` [PATCH 4.14 099/159] x86/cpufeatures: Make CPU bugs sticky Greg Kroah-Hartman
2017-12-22  8:46 ` [PATCH 4.14 100/159] optee: fix invalid of_node_put() in optee_driver_init() Greg Kroah-Hartman
2017-12-22  8:46 ` [PATCH 4.14 101/159] backlight: pwm_bl: Fix overflow condition Greg Kroah-Hartman
2017-12-22  8:46 ` [PATCH 4.14 102/159] drm: Add retries for lspcon mode detection Greg Kroah-Hartman
2017-12-22  8:46   ` Greg Kroah-Hartman
2017-12-22  8:46 ` [PATCH 4.14 103/159] clk: sunxi-ng: nm: Check if requested rate is supported by fractional clock Greg Kroah-Hartman
2017-12-22  8:46 ` [PATCH 4.14 104/159] clk: sunxi-ng: sun5i: Fix bit offset of audio PLL post-divider Greg Kroah-Hartman
2017-12-22  8:46 ` [PATCH 4.14 105/159] crypto: crypto4xx - increase context and scatter ring buffer elements Greg Kroah-Hartman
2017-12-22  8:46 ` [PATCH 4.14 106/159] crypto: lrw - Fix an error handling path in create() Greg Kroah-Hartman
2017-12-22  8:46 ` [PATCH 4.14 107/159] rtc: pl031: make interrupt optional Greg Kroah-Hartman
2017-12-22  8:46 ` [PATCH 4.14 108/159] kvm, mm: account kvm related kmem slabs to kmemcg Greg Kroah-Hartman
2017-12-22  9:34   ` Michal Hocko
2017-12-22 12:41     ` Greg Kroah-Hartman
2017-12-22 13:06       ` Michal Hocko
2017-12-22 17:40         ` alexander.levin
2017-12-22 17:56           ` Michal Hocko
2017-12-22 18:07             ` alexander.levin
2017-12-22 18:22               ` Michal Hocko
2017-12-22 21:55                 ` alexander.levin
2017-12-23  9:24             ` Greg Kroah-Hartman
2017-12-27 10:30               ` Paolo Bonzini
2017-12-22  8:46 ` [PATCH 4.14 109/159] net: phy: at803x: Change error to EINVAL for invalid MAC Greg Kroah-Hartman
2017-12-22  8:46 ` [PATCH 4.14 110/159] PCI: Avoid bus reset if bridge itself is broken Greg Kroah-Hartman
2017-12-22  8:46 ` [PATCH 4.14 111/159] scsi: cxgb4i: fix Tx skb leak Greg Kroah-Hartman
2017-12-22  8:46 ` [PATCH 4.14 112/159] scsi: mpt3sas: Fix IO error occurs on pulling out a drive from RAID1 volume created on two SATA drive Greg Kroah-Hartman
2017-12-22  8:46 ` [PATCH 4.14 113/159] PCI: Create SR-IOV virtfn/physfn links before attaching driver Greg Kroah-Hartman
2017-12-22  8:46 ` [PATCH 4.14 114/159] PM / OPP: Move error message to debug level Greg Kroah-Hartman
2017-12-22  8:46 ` [PATCH 4.14 115/159] igb: check memory allocation failure Greg Kroah-Hartman
2017-12-22  8:46 ` [PATCH 4.14 116/159] i40e: use the safe hash table iterator when deleting mac filters Greg Kroah-Hartman
2017-12-22  8:46 ` [PATCH 4.14 117/159] iio: st_sensors: add register mask for status register Greg Kroah-Hartman
2017-12-22  8:46 ` [PATCH 4.14 118/159] ixgbe: fix use of uninitialized padding Greg Kroah-Hartman
2017-12-22  8:46 ` [PATCH 4.14 119/159] IB/rxe: check for allocation failure on elem Greg Kroah-Hartman
2017-12-22  8:46 ` [PATCH 4.14 120/159] block,bfq: Disable writeback throttling Greg Kroah-Hartman
2017-12-22  8:46 ` [PATCH 4.14 121/159] md: always set THREAD_WAKEUP and wake up wqueue if thread existed Greg Kroah-Hartman
2017-12-22  8:46 ` [PATCH 4.14 122/159] ip_gre: check packet length and mtu correctly in erspan tx Greg Kroah-Hartman
2017-12-22  8:46 ` [PATCH 4.14 123/159] ipv6: grab rt->rt6i_ref before allocating pcpu rt Greg Kroah-Hartman
2017-12-22  8:46 ` [PATCH 4.14 125/159] Bluetooth: hci_uart_set_flow_control: Fix NULL deref when using serdev Greg Kroah-Hartman
2017-12-22  8:46 ` [PATCH 4.14 126/159] Bluetooth: hci_bcm: Fix setting of irq trigger type Greg Kroah-Hartman
2017-12-22  8:46 ` [PATCH 4.14 127/159] i40e/i40evf: spread CPU affinity hints across online CPUs only Greg Kroah-Hartman
2017-12-22  8:46 ` [PATCH 4.14 128/159] PCI/AER: Report non-fatal errors only to the affected endpoint Greg Kroah-Hartman
2017-12-22  8:46 ` [PATCH 4.14 129/159] tracing: Exclude generic fields from histograms Greg Kroah-Hartman
2017-12-22  8:46 ` [PATCH 4.14 131/159] ASoC: img-parallel-out: Add pm_runtime_get/put to set_fmt callback Greg Kroah-Hartman
2017-12-22  8:46 ` [PATCH 4.14 132/159] powerpc/xmon: Avoid tripping SMP hardlockup watchdog Greg Kroah-Hartman
2017-12-22  8:46 ` [PATCH 4.14 133/159] powerpc/watchdog: Do not trigger SMP crash from touch_nmi_watchdog Greg Kroah-Hartman
2017-12-22  8:46 ` [PATCH 4.14 134/159] sctp: silence warns on sctp_stream_init allocations Greg Kroah-Hartman
2017-12-22  8:47 ` [PATCH 4.14 135/159] ASoC: codecs: msm8916-wcd-analog: fix module autoload Greg Kroah-Hartman
2017-12-22  8:47 ` [PATCH 4.14 136/159] fm10k: fix mis-ordered parameters in declaration for .ndo_set_vf_bw Greg Kroah-Hartman
2017-12-22  8:47 ` [PATCH 4.14 137/159] scsi: lpfc: Fix secure firmware updates Greg Kroah-Hartman
2017-12-22  8:47 ` [PATCH 4.14 138/159] scsi: lpfc: PLOGI failures during NPIV testing Greg Kroah-Hartman
2017-12-22  8:47 ` [PATCH 4.14 139/159] scsi: lpfc: Fix warning messages when NVME_TARGET_FC not defined Greg Kroah-Hartman
2017-12-22  8:47 ` [PATCH 4.14 140/159] i40e: fix client notify of VF reset Greg Kroah-Hartman
2017-12-22  8:47 ` [PATCH 4.14 141/159] vfio/pci: Virtualize Maximum Payload Size Greg Kroah-Hartman
2017-12-22  8:47 ` [PATCH 4.14 142/159] ARM: exynos_defconfig: Enable UAS support for Odroid HC1 board Greg Kroah-Hartman
2017-12-22  8:47 ` [PATCH 4.14 143/159] fm10k: ensure we process SM mbx when processing VF mbx Greg Kroah-Hartman
2017-12-22  8:47 ` [PATCH 4.14 144/159] ibmvnic: Set state UP Greg Kroah-Hartman
2017-12-22  8:47 ` [PATCH 4.14 145/159] net: ipv6: send NS for DAD when link operationally up Greg Kroah-Hartman
2017-12-22  8:47 ` [PATCH 4.14 146/159] RDMA/hns: Avoid NULL pointer exception Greg Kroah-Hartman
2017-12-22  8:47 ` [PATCH 4.14 147/159] staging: greybus: light: Release memory obtained by kasprintf Greg Kroah-Hartman
2017-12-22  8:47 ` [PATCH 4.14 148/159] clk: sunxi-ng: sun6i: Rename HDMI DDC clock to avoid name collision Greg Kroah-Hartman
2017-12-22  8:47 ` [PATCH 4.14 149/159] tcp: fix under-evaluated ssthresh in TCP Vegas Greg Kroah-Hartman
2017-12-22  8:47 ` [PATCH 4.14 150/159] rtc: set the alarm to the next expiring timer Greg Kroah-Hartman
2017-12-22  8:47 ` [PATCH 4.14 151/159] cpuidle: fix broadcast control when broadcast can not be entered Greg Kroah-Hartman
2017-12-22  8:47 ` [PATCH 4.14 152/159] drm/vc4: Avoid using vrefresh==0 mode in DSI htotal math Greg Kroah-Hartman
2017-12-22  8:47 ` [PATCH 4.14 153/159] IB/opa_vnic: Properly clear Mac Table Digest Greg Kroah-Hartman
2017-12-22  8:47 ` [PATCH 4.14 154/159] IB/opa_vnic: Properly return the total MACs in UC MAC list Greg Kroah-Hartman
2017-12-22  8:47 ` [PATCH 4.14 155/159] thermal/drivers/hisi: Fix missing interrupt enablement Greg Kroah-Hartman
2017-12-22  8:47 ` [PATCH 4.14 156/159] thermal/drivers/hisi: Fix kernel panic on alarm interrupt Greg Kroah-Hartman
2017-12-22  8:47 ` [PATCH 4.14 157/159] thermal/drivers/hisi: Simplify the temperature/step computation Greg Kroah-Hartman
2017-12-22  8:47 ` [PATCH 4.14 158/159] thermal/drivers/hisi: Fix multiple alarm interrupts firing Greg Kroah-Hartman
2017-12-22  8:47 ` [PATCH 4.14 159/159] platform/x86: asus-wireless: send an EV_SYN/SYN_REPORT between state changes Greg Kroah-Hartman
2017-12-22 15:08 ` [PATCH 4.14 000/159] 4.14.9-stable review Greg Kroah-Hartman
2017-12-22 15:54   ` Greg Kroah-Hartman
2017-12-22 18:15     ` Guenter Roeck
2017-12-23 14:21       ` Greg Kroah-Hartman
2017-12-23 17:09         ` Guenter Roeck
     [not found] ` <5a3cfea4.0692500a.66bcf.cf6b@mx.google.com>
2017-12-22 15:11   ` Greg Kroah-Hartman
2017-12-22 15:45     ` Greg Kroah-Hartman
2017-12-22 21:09 ` Shuah Khan
2017-12-23  9:14   ` Greg Kroah-Hartman
2017-12-22 22:31 ` Dan Rue
2017-12-23  9:17   ` Greg Kroah-Hartman
2017-12-23 22:54 ` Guenter Roeck
2017-12-25 13:35   ` Greg Kroah-Hartman
2017-12-24 19:37 ` Ivan Kozik
2017-12-24 22:03   ` Andre Tomt
2017-12-25 13:38   ` Greg Kroah-Hartman
