* Linux 5.1-rc5
From: Linus Torvalds @ 2019-04-14 22:40 UTC
To: Linux List Kernel Mailing

Here we go again.. It's Sunday afternoon, must mean another rc kernel.

We have changes all over, but not unseasonably many of them, and most
of the ones here are very small.

Looking at the stats, the sound driver updates kind of stand out,
being almost a third of the patch (and about a third of the commits
too, so it's not because of some single big patch). But none of it
looks all that scary.

Outside of the sound fixes, another third is other drivers (gpu, rdma,
nvme, mmc, block layer..) and the last third is "misc". That includes
arch updates, tooling, and various core fixes (networking, filesystem,
security modules, and core kernel/mm).

Nothing in here makes me feel uncomfortable about this release cycle
so far. Knock wood.

Shortlog appended with an overview of the details, as usual.

Linus

---

Alex Deucher (1):
      drm/amdkfd: Add picasso pci id

Alexander Potapenko (1):
      x86/asm: Use stricter assembly constraints in bitops

Anand Jain (2):
      btrfs: prop: fix zstd compression parameter validation
      btrfs: prop: fix vanished compression property after failed set

Andre Przywara (1):
      PCI: Add function 1 DMA alias quirk for Marvell 9170 SATA controller

Andrei Vagin (1):
      alarmtimer: Return correct remaining time

Annaliese McDermond (2):
      ASoC: tlv320aic32x4: Fix Common Pins
      ASoC: tlv320aic32x4: Change author's name

Ard Biesheuvel (1):
      arm64/ftrace: fix inadvertent BUG() in trampoline check

Arnaud Pouliquen (1):
      ASoC: stm32: fix sai driver name initialisation

Bart Van Assche (1):
      locking/lockdep: Zap lock classes even with lock debugging disabled

Brian Norris (1):
      Bluetooth: btusb: request wake pin with NOAUTOEN

CK Hu (2):
      drm/mediatek: Implement gem prime vmap/vunmap function
      drm/mediatek: Add Mediatek framebuffer device

Charles Keepax (6):
      ASoC: wm_adsp: Correct handling of compressed streams that restart
      ASoC: wm_adsp: Correct error messages in wm_adsp_buffer_get_error
      ASoC: wm_adsp: Add locking to wm_adsp2_bus_error
      ASoC: wm_adsp: Shutdown any compressed streams on DSP watchdog timeout
      ASoC: wm_adsp: Check for buffer in trigger stop
      ASoC: cs35l35: Disable regulators on driver removal

Chong Qiao (1):
      MIPS: KGDB: fix kgdb support for SMP platforms.

Chris Wilson (2):
      drm/i915/gvt: Annotate iomem usage
      drm/i915/gvt: Prevent use-after-free in ppgtt_free_all_spt()

Christoph Hellwig (1):
      sparc64/pci_sun4v: fix ATU checks for large DMA masks

Christophe Leroy (2):
      powerpc/32: Fix early boot failure with RTAS built-in
      powerpc/vdso32: fix CLOCK_MONOTONIC on PPC64

Chuck Lever (2):
      NFS: Fix handling of reply page vector
      xprtrdma: Fix helper that drains the transport

Cornelia Huck (1):
      virtio: Honour 'may_reduce_num' in vring_create_virtqueue

Dan Carpenter (5):
      drm/mediatek: Fix an error code in mtk_hdmi_dt_parse_pdata()
      aio: Fix an error code in __io_submit_one()
      irqchip/irq-ls1x: Missing error code in ls1x_intc_of_init()
      NFC: nci: Add some bounds checking in nci_hci_cmd_received()
      nfc: nci: Potential off by one in ->pipes[] array

Daniel Drake (1):
      mmc: alcor: don't write data before command has completed

Daniel Mack (1):
      ASoC: cs4270: Set auto-increment bit for register writes

Daniel Mentz (1):
      ALSA: uapi: #include <time.h> in asound.h

Dave Airlie (1):
      drm/udl: add a release method and delay modeset teardown

David Müller (1):
      clk: x86: Add system specific quirk to mark clocks as critical

Dongli Zhang (2):
      virtio-blk: limit number of hw queues by nr_cpu_ids
      scsi: virtio_scsi: limit number of hw queues by nr_cpu_ids

Erik Schmauss (1):
      ACPICA: Namespace: remove address node from global list after method termination

Faiz Abbas (1):
      mmc: sdhci-omap: Don't finish_mrq() on a command error during tuning

Filipe Manana (1):
      Btrfs: do not allow trimming when a fs is mounted with the nologreplay option

Guenter Roeck (1):
      ASoC: intel: Fix crash at suspend/resume after failed codec registration

Gustavo A. R. Silva (1):
      ASoC: ab8500: Mark expected switch fall-through

Hans Holmberg (1):
      lightnvm: pblk: fix crash in pblk_end_partial_read due to multipage bvecs

Hans de Goede (1):
      ASoC: Intel: cht_bsw_max98090_ti: Enable codec clock once and keep it enabled

Heiner Kallweit (1):
      r8169: disable ASPM again

Horatiu Vultur (1):
      MIPS: generic: Add switchdev, pinctrl and fit to ocelot_defconfig

Hui Wang (1):
      ALSA: hda - Add two more machines to the power_save_blacklist

Imre Deak (1):
      drm/i915: Get power refs in encoder->get_power_domains()

Iuliana Prodan (1):
      crypto: caam - fix copy of next buffer for xcbc and cmac

James Smart (1):
      nvme-fc: correct csn initialization and increments on error

Jani Nikula (1):
      drm/i915/dp: revert back to max link rate and lane count on eDP

Jann Horn (1):
      linux/kernel.h: Use parentheses around argument in u64_to_user_ptr()

Jarkko Sakkinen (2):
      tpm: turn on TPM on suspend for TPM 1.x
      KEYS: trusted: allow trusted.ko to initialize w/o a TPM

Jason Yan (1):
      block: fix the return errno for direct IO

Jenny TC (1):
      ASoC: Intel: Skylake: enable S24_LE format support

Jens Axboe (2):
      tools/io_uring: remove IOCQE_FLAG_CACHEHIT
      io_uring: restrict IORING_SETUP_SQPOLL to root

Jernej Skrabec (1):
      drm/sun4i: DW HDMI: Lower max. supported rate for H6

Jerome Brunet (1):
      ASoC: dpcm: skip missing substream while applying symmetry

Jiada Wang (2):
      ASoC: rsnd: src: Avoid a potential deadlock
      ASoC: rsnd: src: fix compiler warnings

Jian-Hong Pan (1):
      ALSA: hda/realtek: Enable headset MIC of Acer TravelMate B114-21 with ALC233

Joerg Roedel (1):
      iommu/amd: Set exclusion range correctly

John Hsu (2):
      ASoC: nau8824: fix the issue of the widget with prefix name
      ASoC: nau8810: fix the issue of widget with prefixed name

Jonathan Hunter (1):
      ASoC: soc-core: Fix probe deferral following prelink failure

Josh Poimboeuf (1):
      objtool: Add rewind_stack_do_exit() to the noreturn list

Jérôme Glisse (1):
      block: do not leak memory in bio_copy_user_iov()

KaiChieh Chuang (2):
      ASoC: mediatek: btcvsd add loopback
      ASoC: dpcm: prevent snd_soc_dpcm use after free

Kaike Wan (5):
      IB/hfi1: Failed to drain send queue when QP is put into error state
      IB/hfi1: Clear the IOWAIT pending bits when QP is put into error state
      IB/hfi1: Eliminate opcode tests on mr deref
      IB/hfi1: Fix the allocation of RSM table
      IB/hfi1: Do not flush send queue in the TID RDMA second leg

Kailang Yang (1):
      ALSA: hda/realtek - Move to ACT_INIT state

Kamal Heib (1):
      RDMA/vmw_pvrdma: Fix memory leak on pvrdma_pci_remove

Kees Cook (1):
      apparmor: Restore Y/N in /sys for apparmor's "enabled"

Kefeng Wang (1):
      genirq: Initialize request_mutex if CONFIG_SPARSE_IRQ=n

Keith Busch (1):
      nvmet: fix discover log page when offsets are used

Kuninori Morimoto (2):
      ASoC: audio-graph-card: don't select DPCM via audio-graph-card
      ASoC: simple-card: don't select DPCM via simple-audio-card

Lendacky, Thomas (3):
      x86/perf/amd: Resolve race condition when disabling PMC
      x86/perf/amd: Resolve NMI latency issues for active PMCs
      x86/perf/amd: Remove need to check "running" bit in NMI handler

Leonard Crestez (1):
      clk: imx: Fix PLL_1416X not rounding rates

Lijun Ou (1):
      RDMA/hns: Fix bug that caused srq creation to fail

Linus Torvalds (4):
      mm: make page ref count overflow check tighter and more explicit
      mm: add 'try_get_page()' helper function
      mm: prevent get_user_pages() from overflowing page refcount
      Linux 5.1-rc5

Longpeng (1):
      virtio_pci: fix a NULL pointer reference in vp_del_vqs

Lorenzo Bianconi (2):
      net: ip_gre: fix possible use-after-free in erspan_rcv
      net: ip6_gre: fix possible use-after-free in ip6erspan_rcv

Marc Gonzalez (1):
      ASoC: wcd9335: Fix missing regmap requirement

Martin Blumenstingl (1):
      clk: meson: pll: fix rounding and setting a rate that matches precisely

Matteo Croce (1):
      drm/omap: fix typo

Matthew Wilcox (1):
      fs: prevent page refcount overflow in pipe_buf_get

Matthias Wieloch (1):
      clk: at91: fix programmable clock for sama5d2

Max Filippov (4):
      xtensa: use actual syscall number in do_syscall_trace_leave
      xtensa: fix initialization of pt_regs::syscall in start_thread
      xtensa: fix return_address
      xtensa: fix format string warning in init_pmd

Maxime Jourdan (2):
      clk: meson-gxbb: round the vdec dividers to closest
      clk: meson: g12a: fix VPU clock muxes mask

Mel Gorman (1):
      sched/fair: Do not re-read ->h_load_next during hierarchical load calculation

Miaohe Lin (1):
      net: vrf: Fix ping failed when vrf mtu is set to 0

Michael Chan (2):
      bnxt_en: Improve RX consumer index validity check.
      bnxt_en: Reset device on RX buffer errors.

Michael Ellerman (1):
      powerpc/mm: Define MAX_PHYSMEM_BITS for all 64-bit configs

Michael S. Tsirkin (1):
      MAiNTAINERS: add Paolo, Stefan for virtio blk/scsi

Michael Zhivich (3):
      ethtool: avoid signed-unsigned comparison in ethtool_validate_speed()
      broadcom: tg3: fix use of SPEED_UNKNOWN ethtool constant
      qlogic: qlcnic: fix use of SPEED_UNKNOWN ethtool constant

Miguel Ojeda (1):
      clang-format: Update with the latest for_each macro list

Ming Lei (3):
      block: don't use for-inside-for in bio_for_each_segment_all
      blk-mq: introduce blk_mq_complete_request_sync()
      nvme: cancel request synchronously

Moni Shoua (1):
      IB/mlx5: Reset access mask when looping inside page fault handler

Neil Armstrong (4):
      clk: meson-g12a: fix VPU clock parents
      drm/bridge: dw-hdmi: disable SCDC configuration for invalid setups
      clk: meson: vid-pll-div: remove warning and return 0 on invalid config
      Revert "Documentation/gpu/meson: Remove link to meson_canvas.c"

Nicholas Kazlauskas (1):
      drm/amd/display: Fix negative cursor pos programming (v2)

Nicholas Piggin (1):
      powerpc/64s/radix: Fix radix segment exception handling

Nicolas Dichtel (1):
      selftests: add a tc matchall test case

Oleksandr Andrushchenko (1):
      ALSA: xen-front: Do not use stream buffer size before it is set

Olga Kornievskaia (1):
      NFSv4.1 fix incorrect return value in copy_file_range

Olivier Moysan (9):
      ASoC: stm32: sai: fix iec958 controls indexation
      ASoC: stm32: sai: fix exposed capabilities in spdif mode
      ASoC: stm32: sai: fix race condition in irq handler
      ASoC: stm32: sai: fix oversampling mode
      ASoC: stm32: sai: fix set_sync service
      ASoC: stm32: i2s: fix registers declaration in regmap
      ASoC: stm32: dfsdm: manage multiple prepare
      ASoC: stm32: dfsdm: fix debugfs warnings on entry creation
      ASoC: stm32: sai: fix master clock management

Ondrej Jirman (1):
      drm/sun4i: tcon top: Fix NULL/invalid pointer dereference in sun8i_tcon_top_un/bind

Pankaj Bharadiya (1):
      ASoC: dapm: Fix NULL pointer dereference in snd_soc_dapm_free_kcontrol

Paolo Valente (1):
      block, bfq: fix use after free in bfq_bfqq_expire

Paul Thomas (1):
      net: macb driver, check for SKBTX_HW_TSTAMP

Peter Zijlstra (2):
      perf/x86/intel: Initialize TFA MSR
      perf/core: Fix perf_event_disable_inatomic() race

Philipp Puschmann (1):
      ASoC: tlv320aic3x: fix reset gpio reference counting

Qian Cai (1):
      slab: fix a crash by reading /proc/slab_allocators

Rander Wang (3):
      ASoC:soc-pcm:fix a codec fixup issue in TDM case
      ASoC:hdac_hda:use correct format to setup hda codec
      ASoC:intel:skl:fix a simultaneous playback & capture issue on hda platform

Ranjani Sridharan (6):
      ASoC: dapm: set power_check callback for widgets that shouldnt be always on
      ASoC: intel: skylake: add remove() callback for component driver
      ASoC: topology: Use the correct dobj to free enum control values and texts
      ASoC: core: conditionally increase module refcount on component open
      ASoC: pcm: update module refcount if module_get_upon_open is set
      ASoC: pcm: fix error handling when try_module_get() fails.

Richard Sailer (1):
      ALSA: hda/realtek - Add quirk for Tuxedo XC 1509

Rodrigo Siqueira (1):
      drm/atomic-helper: Make atomic_enable/disable crtc callbacks optional

Russell King (2):
      ASoC: hdmi-codec: fix S/PDIF DAI
      ASoC: hdmi-codec: avoid limiting params->msbits in hw_params()

S.j. Wang (2):
      ASoC: fsl_asrc: add constraint for the asrc of older version
      ASoC: fsl_esai: fix channel swap issue when stream starts

Scott Wood (1):
      dma-debug: only skip one stackframe entry

Sean Paul (1):
      Documentation/gpu/meson: Remove link to meson_canvas.c

Sergey Miroshnichenko (1):
      PCI: pciehp: Ignore Link State Changes after powering off a slot

Shuming Fan (3):
      ASoC: rt5682: Check JD status when system resume
      ASoC: rt5682: fix jack type detection issue
      ASoC: rt5682: recording has no sound after booting

Stefan Agner (1):
      gpu: host1x: Fix compile error when IOMMU API is not available

Stefan Schmidt (1):
      MAINTAINERS: ieee802154: update documentation file pattern

Stephane Eranian (1):
      perf/x86/intel: Fix handling of wakeup_events for multi-entry PEBS

Stephen Boyd (2):
      genirq: Respect IRQCHIP_SKIP_SET_WAKE in irq_chip_set_wake_parent()
      platform/x86: pmc_atom: Drop __initconst on dmi table

Sugar Zhang (2):
      ASoC: rockchip: pdm: fix regmap_ops hang issue
      ASoC: rockchip: pdm: change dma burst to 8

Sylwester Nawrocki (2):
      ASoC: samsung: i2s: Fix DAPM routes for capture stream
      ASoC: samsung: odroid: Fix clock configuration for 44100 sample rate

Tadeusz Struk (3):
      tpm: fix an invalid condition in tpm_common_poll
      selftests/tpm2: Extend tests to cover partial reads
      selftests/tpm2: Open tpm dev in unbuffered mode

Takashi Iwai (1):
      ALSA: hda: Fix racy display power access

Tetsuo Handa (1):
      NFS: Forbid setting AF_INET6 to "struct sockaddr_in"->sin_family.

Thomas Bogendoerfer (1):
      MIPS: SGI-IP27: Fix use of unchecked pointer in shutdown_bridge_irq

Tony Lindgren (1):
      drm/omap: hdmi4_cec: Fix CEC clock handling for PM

Trond Myklebust (1):
      Revert "SUNRPC: Micro-optimise when the task is known not to be sleeping"

Tzung-Bi Shih (2):
      ASoC: mediatek: mt8183: skip for i2s5 in mck_disable
      ASoC: Intel: kbl: fix wrong number of channels

Vandita Kulkarni (2):
      drm/i915/icl: Ungate ddi clocks before IO enable
      drm/i915/icl: Fix port disable sequence for mipi-dsi

Varun Prakash (1):
      scsi: csiostor: fix missing data copy in csio_scsi_err_handler()

Ville Syrjälä (1):
      drm/i915: Fix pipe_bpp readout for BXT/GLK DSI

Wangyan Wang (5):
      drm/mediatek: fix the rate and divder of hdmi phy for MT2701
      drm/mediatek: make implementation of recalc_rate() for MT2701 hdmi phy
      drm/mediatek: remove flag CLK_SET_RATE_PARENT for MT2701 hdmi phy
      drm/mediatek: using new factor for tvdpll for MT2701 hdmi phy
      drm/mediatek: no change parent rate in round_rate() for MT2701 hdmi phy

Wei Yongjun (1):
      aio: use kmem_cache_free() instead of kfree()

Weiyi Lu (1):
      clk: mediatek: fix clk-gate flag setting

Wen Yang (1):
      drm/mediatek: fix possible object reference leak

Will Deacon (2):
      arm64: backtrace: Don't bother trying to unwind the userspace stack
      arm64: futex: Fix FUTEX_WAKE_OP atomic ops with non-zero result value

Xiaochen Shen (1):
      x86/resctrl: Fix typos in the mba_sc mount option

Xiong Zhang (1):
      drm/i915/gvt: Roundup fb->height into tile's height at calucation fb->size

Yangyang Li (1):
      RDMA/hns: Bugfix for SCC hem free

Yue Haibing (1):
      tpm: Fix the type of the return value in calc_tpm2_event_size()

YueHaibing (1):
      iov_iter: Fix build error without CONFIG_CRYPTO

Zubin Mithra (1):
      ALSA: seq: Fix OOB-reads from strlcpy

ndesaulniers@google.com (1):
      KEYS: trusted: fix -Wvarags warning

shaoyunl (1):
      drm/amdgpu: Adjust IB test timeout for XGMI configuration

tiancyin (1):
      drm/amd/display: fix cursor black issue

wentalou (1):
      drm/amdgpu: amdgpu_device_recover_vram always failed if only one node in shadow_list

* Re: Linux 5.1-rc5
From: Christoph Hellwig @ 2019-04-15 5:19 UTC
To: Linus Torvalds; +Cc: Linux List Kernel Mailing

Can we please have the page refcount overflow fixes out on the list
for review, even if it is after the fact?

On Sun, Apr 14, 2019 at 03:40:47PM -0700, Linus Torvalds wrote:
> Nothing in here makes me feel uncomfortable about this release cycle
> so far. Knock wood.
>
> Shortlog appended with an overview of the details, as usual.
>
> Linus
---end quoted text---

* Re: Linux 5.1-rc5
From: Linus Torvalds @ 2019-04-15 16:17 UTC
To: Christoph Hellwig
Cc: Linux List Kernel Mailing, Michael Ellerman, linuxppc-dev, Martin Schwidefsky, linux-s390

On Sun, Apr 14, 2019 at 10:19 PM Christoph Hellwig <hch@infradead.org> wrote:
>
> Can we please have the page refcount overflow fixes out on the list
> for review, even if it is after the fact?

They were actually on a list for review long before the fact, but it
was the security mailing list. The issue actually got discussed back
in January along with early versions of the patches, but then we
dropped the ball because it just wasn't on anybody's radar and it got
resurrected late March. Willy wrote a rather bigger patch-series, and
review of that is what then resulted in those commits. So they may
look recent, but that's just because the original patches got
seriously edited down and rewritten.

That said, powerpc and s390 should at least look at maybe adding a
check for the page ref in their gup paths too. Powerpc has the special
gup_hugepte() case, and s390 has its own version of gup entirely. I
was actually hoping the s390 guys would look at using the generic gup
code.

I ruthlessly also entirely ignored MIPS, SH and sparc, since they seem
largely irrelevant, partly since even theoretically this whole issue
needs a _lot_ of memory.

Michael, Martin, see commit 6b3a70773630 ("Merge branch 'page-refs'
(page ref overflow)"). You may or may not really care.

Linus

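For readers following along, the kind of "check for the page ref" discussed
in the message above can be illustrated with a small user-space sketch. This
is not the kernel code: fake_page and fake_try_get_page() are hypothetical
names made up for illustration, and the count is stored as an unsigned int
so that incrementing past INT_MAX stays well defined in plain C. The
assumption is that a count which reads as zero or negative when viewed as a
signed int means the page is either free or has wrapped, so no new reference
is handed out.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page: only the reference count. */
struct fake_page {
	unsigned int refcount;	/* interpreted as signed for the check */
};

/*
 * Take a reference unless the count is zero or has crossed INT_MAX,
 * i.e. would read as <= 0 if viewed as a signed int.
 * Returns true when a reference was taken.
 */
static bool fake_try_get_page(struct fake_page *page)
{
	assert(page != NULL);

	if (page->refcount == 0 || page->refcount > (unsigned int)INT_MAX)
		return false;	/* freed or overflowed: refuse */

	page->refcount++;	/* unsigned, so no undefined behaviour */
	return true;
}
```

A caller in this sketch would simply fail the operation when
fake_try_get_page() returns false, rather than carrying on with a count that
could later wrap to zero and trigger a premature free.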
* Re: Linux 5.1-rc5 2019-04-15 16:17 ` Linus Torvalds @ 2019-04-16 9:09 ` Martin Schwidefsky 2019-04-16 12:06 ` Martin Schwidefsky 2019-04-17 3:38 ` Michael Ellerman ` (2 subsequent siblings) 3 siblings, 1 reply; 26+ messages in thread From: Martin Schwidefsky @ 2019-04-16 9:09 UTC (permalink / raw) To: Linus Torvalds Cc: Christoph Hellwig, Linux List Kernel Mailing, Michael Ellerman, linuxppc-dev, linux-s390 On Mon, 15 Apr 2019 09:17:10 -0700 Linus Torvalds <torvalds@linux-foundation.org> wrote: > On Sun, Apr 14, 2019 at 10:19 PM Christoph Hellwig <hch@infradead.org> wrote: > > > > Can we please have the page refcount overflow fixes out on the list > > for review, even if it is after the fact? > > They were actually on a list for review long before the fact, but it > was the security mailing list. The issue actually got discussed back > in January along with early versions of the patches, but then we > dropped the ball because it just wasn't on anybody's radar and it got > resurrected late March. Willy wrote a rather bigger patch-series, and > review of that is what then resulted in those commits. So they may > look recent, but that's just because the original patches got > seriously edited down and rewritten. First time I hear about this, thanks for the heads up. > That said, powerpc and s390 should at least look at maybe adding a > check for the page ref in their gup paths too. Powerpc has the special > gup_hugepte() case, and s390 has its own version of gup entirely. I > was actually hoping the s390 guys would look at using the generic gup > code. We did look at converting the s390 gup code to CONFIG_HAVE_GENERIC_GUP, there are some details that need careful consideration. The top one is access_ok(), for s390 we always return true. 
The generic gup code relies on the fact that a page table walk with a
specific address is doable if access_ok() returned true, the s390 specific
check is slightly different:

	if ((end <= start) || (end > mm->context.asce_limit))
		return 0;

The obvious approach would be to modify access_ok() to check against
the asce_limit. I will try and see if anything breaks, e.g. the automatic
page table upgrade.

> I ruthlessly also entirely ignored MIPS, SH and sparc, since they seem
> largely irrelevant, partly since even theoretically this whole issue
> needs a _lot_ of memory.
>
> Michael, Martin, see commit 6b3a70773630 ("Merge branch 'page-refs'
> (page ref overflow)"). You may or may not really care.

On s390 we can have up to 16TB of memory in a single LPAR. So yes, I do
care about it.

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Linux 5.1-rc5 2019-04-16 9:09 ` Martin Schwidefsky @ 2019-04-16 12:06 ` Martin Schwidefsky 2019-04-16 16:16 ` Linus Torvalds 0 siblings, 1 reply; 26+ messages in thread From: Martin Schwidefsky @ 2019-04-16 12:06 UTC (permalink / raw) To: Linus Torvalds Cc: Christoph Hellwig, Linux List Kernel Mailing, Michael Ellerman, linuxppc-dev, linux-s390 On Tue, 16 Apr 2019 11:09:06 +0200 Martin Schwidefsky <schwidefsky@de.ibm.com> wrote: > On Mon, 15 Apr 2019 09:17:10 -0700 > Linus Torvalds <torvalds@linux-foundation.org> wrote: > > > On Sun, Apr 14, 2019 at 10:19 PM Christoph Hellwig <hch@infradead.org> wrote: > > > > > > Can we please have the page refcount overflow fixes out on the list > > > for review, even if it is after the fact? > > > > They were actually on a list for review long before the fact, but it > > was the security mailing list. The issue actually got discussed back > > in January along with early versions of the patches, but then we > > dropped the ball because it just wasn't on anybody's radar and it got > > resurrected late March. Willy wrote a rather bigger patch-series, and > > review of that is what then resulted in those commits. So they may > > look recent, but that's just because the original patches got > > seriously edited down and rewritten. > > First time I hear about this, thanks for the heads up. > > > That said, powerpc and s390 should at least look at maybe adding a > > check for the page ref in their gup paths too. Powerpc has the special > > gup_hugepte() case, and s390 has its own version of gup entirely. I > > was actually hoping the s390 guys would look at using the generic gup > > code. > > We did look at converting the s390 gup code to CONFIG_HAVE_GENERIC_GUP, > there are some details that need careful consideration. The top one > is access_ok(), for s390 we always return true. 
The generic gup code > relies on the fact that a page table walk with a specific address is > doable if access_ok() returned true, the s390 specific check is slightly > different: > > if ((end <= start) || (end > mm->context.asce_limit)) > return 0; > > The obvious approach would be to modify access_ok() to check against > the asce_limit. I will try and see if anything breaks, e.g. the automatic > page table upgrade. I tested the waters in regard to access_ok() and the generic gup code. The good news is that mm/gup.c with CONFIG_HAVE_GENERIC_GUP=y seems to work just fine if the access_ok() issue is taken care of. But.. Bloat-o-meter with a non-empty uaccess_ok() that checks against current->mm->context.asce_limit: add/remove: 8/2 grow/shrink: 611/11 up/down: 61352/-1914 (59438) with CONFIG_HAVE_GENERIC_GUP on top of that add/remove: 10/2 grow/shrink: 612/12 up/down: 63568/-3280 (60288) This is not nice, would a patch like the following be acceptable? -- Subject: [PATCH] mm: introduce mm_pgd_walk_ok Add the architecture overrideable function mm_pgd_walk_ok() to check if a block of memory is inside the limits of the page table hierarchy of a given mm struct. 
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
---
 include/asm-generic/pgtable.h | 4 ++++
 mm/gup.c                      | 4 ++--
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index fa782fba51ee..7d2a8a58f1c1 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -1186,4 +1186,8 @@ static inline bool arch_has_pfn_modify_check(void)
 #define mm_pmd_folded(mm)	__is_defined(__PAGETABLE_PMD_FOLDED)
 #endif
 
+#ifndef mm_pgd_walk_ok
+#define mm_pgd_walk_ok(mm, addr, size)	access_ok(addr, size)
+#endif
+
 #endif /* _ASM_GENERIC_PGTABLE_H */
diff --git a/mm/gup.c b/mm/gup.c
index 91819b8ad9cc..b3eb3f45d237 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1990,7 +1990,7 @@ int __get_user_pages_fast(unsigned long start, int nr_pages, int write,
 	len = (unsigned long) nr_pages << PAGE_SHIFT;
 	end = start + len;
 
-	if (unlikely(!access_ok((void __user *)start, len)))
+	if (unlikely(!mm_pgd_walk_ok(current->mm, (void __user *)start, len)))
 		return 0;
 
 	/*
@@ -2044,7 +2044,7 @@ int get_user_pages_fast(unsigned long start, int nr_pages, int write,
 	if (nr_pages <= 0)
 		return 0;
 
-	if (unlikely(!access_ok((void __user *)start, len)))
+	if (unlikely(!mm_pgd_walk_ok(current->mm, (void __user *)start, len)))
 		return -EFAULT;
 
 	if (gup_fast_permitted(start, nr_pages)) {
-- 
2.16.4

With an empty access_ok() but a "real" mm_pgd_walk_ok() the results are
much more reasonable:

add/remove: 2/0 grow/shrink: 2/1 up/down: 2186/-1382 (804)

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.

^ permalink raw reply related [flat|nested] 26+ messages in thread
* Re: Linux 5.1-rc5 2019-04-16 12:06 ` Martin Schwidefsky @ 2019-04-16 16:16 ` Linus Torvalds 2019-04-16 16:49 ` Linus Torvalds 0 siblings, 1 reply; 26+ messages in thread From: Linus Torvalds @ 2019-04-16 16:16 UTC (permalink / raw) To: Martin Schwidefsky Cc: Christoph Hellwig, Linux List Kernel Mailing, Michael Ellerman, linuxppc-dev, linux-s390 On Tue, Apr 16, 2019 at 5:08 AM Martin Schwidefsky <schwidefsky@de.ibm.com> wrote: > > This is not nice, would a patch like the following be acceptable? Umm. We actually already *have* this function. It's called "gup_fast_permitted()" and it's used by x86-64 to verify the proper address range. Exactly like s390 needs.. Could you please use that instead? Linus ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Linux 5.1-rc5 2019-04-16 16:16 ` Linus Torvalds @ 2019-04-16 16:49 ` Linus Torvalds 2019-04-17 7:46 ` Martin Schwidefsky 0 siblings, 1 reply; 26+ messages in thread From: Linus Torvalds @ 2019-04-16 16:49 UTC (permalink / raw) To: Martin Schwidefsky Cc: Christoph Hellwig, Linux List Kernel Mailing, Michael Ellerman, linuxppc-dev, linux-s390 [-- Attachment #1: Type: text/plain, Size: 739 bytes --] On Tue, Apr 16, 2019 at 9:16 AM Linus Torvalds <torvalds@linux-foundation.org> wrote: > > We actually already *have* this function. > > It's called "gup_fast_permitted()" and it's used by x86-64 to verify > the proper address range. Exactly like s390 needs.. > > Could you please use that instead? IOW, something like the attached. Obviously untested. And maybe 'current' isn't declared in <asm/pgtable.h>, in which case you'd need to modify it to instead make the inline function be "s390_gup_fast_permitted()" that takes a pointer to the mm, and do something like #define gup_fast_permitted(start, pages) \ s390_gup_fast_permitted(current->mm, start, pages) instead. But I think you get the idea.. Linus [-- Attachment #2: patch.diff --] [-- Type: text/x-patch, Size: 724 bytes --] arch/s390/include/asm/pgtable.h | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h index 76dc344edb8c..a08248995f50 100644 --- a/arch/s390/include/asm/pgtable.h +++ b/arch/s390/include/asm/pgtable.h @@ -1659,4 +1659,16 @@ static inline void check_pgt_cache(void) { } #include <asm-generic/pgtable.h> +static inline bool gup_fast_permitted(unsigned long start, int nr_pages) +{ + unsigned long len, end; + + len = (unsigned long)nr_pages << PAGE_SHIFT; + end = start + len; + if (end < start) + return false; + return end <= current->mm->context.asce_limit; +} +#define gup_fast_permitted gup_fast_permitted + #endif /* _S390_PAGE_H */ ^ permalink raw reply related [flat|nested] 26+ messages in thread
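The range check in the attached patch can be exercised outside the kernel. The following is a minimal user-space sketch, not the kernel code: `toy_gup_fast_permitted()` and `TOY_PAGE_SHIFT` are hypothetical names, the 4 KiB page size is an assumption, and the limit is passed as a parameter instead of being read from `current->mm->context.asce_limit`.

```c
#include <stdbool.h>

#define TOY_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/*
 * Hypothetical model of the check: the byte range
 * [start, start + nr_pages * page size) must not wrap around the address
 * space and must end at or below the limit of the page table hierarchy.
 */
static bool toy_gup_fast_permitted(unsigned long start, int nr_pages,
				   unsigned long asce_limit)
{
	unsigned long len = (unsigned long)nr_pages << TOY_PAGE_SHIFT;
	unsigned long end = start + len;

	if (end < start)	/* start + len wrapped around */
		return false;
	return end <= asce_limit;
}
```

Note that the open-coded s390 check quoted earlier in the thread uses `end <= start`, which additionally rejects an empty range; the sketch follows the `end < start` form of the patch.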
* Re: Linux 5.1-rc5 2019-04-16 16:49 ` Linus Torvalds @ 2019-04-17 7:46 ` Martin Schwidefsky 2019-04-17 8:02 ` Martin Schwidefsky 0 siblings, 1 reply; 26+ messages in thread From: Martin Schwidefsky @ 2019-04-17 7:46 UTC (permalink / raw) To: Linus Torvalds Cc: Christoph Hellwig, Linux List Kernel Mailing, Michael Ellerman, linuxppc-dev, linux-s390 On Tue, 16 Apr 2019 09:49:46 -0700 Linus Torvalds <torvalds@linux-foundation.org> wrote: > On Tue, Apr 16, 2019 at 9:16 AM Linus Torvalds > <torvalds@linux-foundation.org> wrote: > > > > We actually already *have* this function. > > > > It's called "gup_fast_permitted()" and it's used by x86-64 to verify > > the proper address range. Exactly like s390 needs.. > > > > Could you please use that instead? > > IOW, something like the attached. > > Obviously untested. And maybe 'current' isn't declared in > <asm/pgtable.h>, in which case you'd need to modify it to instead make > the inline function be "s390_gup_fast_permitted()" that takes a > pointer to the mm, and do something like > > #define gup_fast_permitted(start, pages) \ > s390_gup_fast_permitted(current->mm, start, pages) > > instead. > > But I think you get the idea.. Nice, I did not realize that gup_fast_permitted is a platform override-able function. So that part is doable in arch/s390. But I spoke to soon, I got my first crash and realized that the common gup code is not usable as it is. The reason is this e.g. this sequence: pgdp = pgd_offset(current->mm, addr); pgd_t pgd = READ_ONCE(*pgdp); /* some checking on pgd */ gup_p4d_range(pgd, addr, next, write, pages, nr); p4dp = p4d_offset(&pgd, addr); p4d_t p4d = READ_ONCE(*p4dp); /* some checking on p4d */ gup_pud_range(p4d, addr, next, write, pages, nr); pudp = pud_offset(&p4d, addr); pud_t pud = READ_ONCE(*pudp); /* some checking on pud */ gup_pmd_range(pud, addr, next, write, pages, nr; Each step along the way will read the page table entry and pass the table entry to the next function. 
This clashes with the page table folding on s390. The s390 gup code
looks more like this:

  pgdp = pgd_offset(current->mm, addr);
  /* some checking on pgd */
  pgd_t pgd = READ_ONCE(*pgdp);
  gup_p4d_range(pgdp, pgd, addr, next, write, pages, &nr);

  p4dp = p4d_offset(pgdp, addr);
  p4d_t p4d = READ_ONCE(*p4dp);
  /* some checking on p4d */
  gup_pud_range(p4dp, p4d, addr, next, write, pages, nr);

  pudp = pud_offset(p4dp, addr);
  pud_t pud = READ_ONCE(*pudp);
  /* some checking on pud */
  gup_pmd_range(pudp, pud, addr, next, write, pages, nr);

There are magic dereferences in the s390 versions of the p4d_offset,
pud_offset and pmd_offset functions. To make this work, the pointer
passed to these functions must not be the local copy of the already
dereferenced table entry. I'll cook up a patch for the common code.

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Linux 5.1-rc5 2019-04-17 7:46 ` Martin Schwidefsky @ 2019-04-17 8:02 ` Martin Schwidefsky 2019-04-17 16:57 ` Linus Torvalds 0 siblings, 1 reply; 26+ messages in thread From: Martin Schwidefsky @ 2019-04-17 8:02 UTC (permalink / raw) To: Linus Torvalds Cc: Christoph Hellwig, Linux List Kernel Mailing, Michael Ellerman, linuxppc-dev, linux-s390 On Wed, 17 Apr 2019 09:46:37 +0200 Martin Schwidefsky <schwidefsky@de.ibm.com> wrote: > On Tue, 16 Apr 2019 09:49:46 -0700 > Linus Torvalds <torvalds@linux-foundation.org> wrote: > > > On Tue, Apr 16, 2019 at 9:16 AM Linus Torvalds > > <torvalds@linux-foundation.org> wrote: > > > > > > We actually already *have* this function. > > > > > > It's called "gup_fast_permitted()" and it's used by x86-64 to verify > > > the proper address range. Exactly like s390 needs.. > > > > > > Could you please use that instead? > > > > IOW, something like the attached. > > > > Obviously untested. And maybe 'current' isn't declared in > > <asm/pgtable.h>, in which case you'd need to modify it to instead make > > the inline function be "s390_gup_fast_permitted()" that takes a > > pointer to the mm, and do something like > > > > #define gup_fast_permitted(start, pages) \ > > s390_gup_fast_permitted(current->mm, start, pages) > > > > instead. > > > > But I think you get the idea.. > > Nice, I did not realize that gup_fast_permitted is a platform > override-able function. So that part is doable in arch/s390. But I > spoke to soon, I got my first crash and realized that the common gup code > is not usable as it is. The reason is this e.g. 
this sequence: > > pgdp = pgd_offset(current->mm, addr); > pgd_t pgd = READ_ONCE(*pgdp); > /* some checking on pgd */ > gup_p4d_range(pgd, addr, next, write, pages, nr); > > p4dp = p4d_offset(&pgd, addr); > p4d_t p4d = READ_ONCE(*p4dp); > /* some checking on p4d */ > gup_pud_range(p4d, addr, next, write, pages, nr); > > pudp = pud_offset(&p4d, addr); > pud_t pud = READ_ONCE(*pudp); > /* some checking on pud */ > gup_pmd_range(pud, addr, next, write, pages, nr; > > Each step along the way will read the page table entry and pass the > table entry to the next function. This clashes with the page table > folding on s390. The s390 gup code looks more like this: > > pgdp = pgd_offset(current->mm, addr); > /* some checking on pgd */ > pgd_t pgd = READ_ONCE(*pgdp); > gup_p4d_range(pgdp, pgd, addr, next, write, pages, &nr); > > p4dp = p4d_offset(pgdp, addr); > p4d_t p4d = READ_ONCE(*p4dp); > /* some checking on p4d */ > gup_pud_range(p4dp, p4d, addr, next, write, pages, nr); > > pudp = pud_offset(p4dp, addr); > pud_t pud = READ_ONCE(*pudp); > /* some checking on pud */ > gup_pmd_range(pudp, pud, addr, next, write, pages, nr; > > There are magic dereferences in the s390 versions of p4d_offset, > pud_offset and pmd_offset functions. To make this work the pointer > passed to these functions may not be the local copy of the already > dereferenced table entry. I'll cook up a patch for the common code. Grumpf, that does *not* work. For gup the table entries may be read only once. Now I remember why I open-coded p4d_offset, pud_offset and pmd_offset in arch/s390/mm/gup.c, to avoid to read the table entries twice. It will be hard to use the common gup code after all. -- blue skies, Martin. "Reality continues to ruin my life." - Calvin. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Linux 5.1-rc5 2019-04-17 8:02 ` Martin Schwidefsky @ 2019-04-17 16:57 ` Linus Torvalds 2019-04-18 8:02 ` Martin Schwidefsky 0 siblings, 1 reply; 26+ messages in thread From: Linus Torvalds @ 2019-04-17 16:57 UTC (permalink / raw) To: Martin Schwidefsky Cc: Christoph Hellwig, Linux List Kernel Mailing, Michael Ellerman, linuxppc-dev, linux-s390 On Wed, Apr 17, 2019 at 1:02 AM Martin Schwidefsky <schwidefsky@de.ibm.com> wrote: > > Grumpf, that does *not* work. For gup the table entries may be read only > once. Now I remember why I open-coded p4d_offset, pud_offset and pmd_offset > in arch/s390/mm/gup.c, to avoid to read the table entries twice. > It will be hard to use the common gup code after all. Hmm. The common gup code generally should do the "read only once" thing too (since by definition the gup-fast case is done without locking), although it's probably the case that most architectures simply don't care. What would it require for the generic code to work for s390? Linus ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Linux 5.1-rc5 2019-04-17 16:57 ` Linus Torvalds @ 2019-04-18 8:02 ` Martin Schwidefsky 2019-04-18 15:49 ` Linus Torvalds 0 siblings, 1 reply; 26+ messages in thread From: Martin Schwidefsky @ 2019-04-18 8:02 UTC (permalink / raw) To: Linus Torvalds Cc: Christoph Hellwig, Linux List Kernel Mailing, Michael Ellerman, linuxppc-dev, linux-s390 On Wed, 17 Apr 2019 09:57:01 -0700 Linus Torvalds <torvalds@linux-foundation.org> wrote: > On Wed, Apr 17, 2019 at 1:02 AM Martin Schwidefsky > <schwidefsky@de.ibm.com> wrote: > > > > Grumpf, that does *not* work. For gup the table entries may be read only > > once. Now I remember why I open-coded p4d_offset, pud_offset and pmd_offset > > in arch/s390/mm/gup.c, to avoid to read the table entries twice. > > It will be hard to use the common gup code after all. > > Hmm. The common gup code generally should do the "read only once" > thing too (since by definition the gup-fast case is done without > locking), although it's probably the case that most architectures > simply don't care. > > What would it require for the generic code to work for s390? The problematic lines in the generic gup code are these three: 1845: pmdp = pmd_offset(&pud, addr); 1888: pudp = pud_offset(&p4d, addr); 1916: p4dp = p4d_offset(&pgd, addr); Passing the pointer of a *copy* of a page table entry to pxd_offset() does not work with the page table folding on s390. The pxd_offset() function on s390 have to make a choice, either return the dereferenced value behind the passed pointer (that works) or return the original page table pointer if the table level is folded (that does not work). 
To fix this we would need three new helpers pmd_offset_orig, pud_offset_orig
and p4d_offset_orig, their generic definition would look like this:

#define p4d_offset_orig(pgdp, pgd, address) p4d_offset(&pgd, address)
#define pud_offset_orig(p4dp, p4d, address) pud_offset(&p4d, address)
#define pmd_offset_orig(pudp, pud, address) pmd_offset(&pud, address)

For the s390 definition see the following branch:

git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git generic-gup

A quick test with this branch shows everything working normally.
Keeping my fingers crossed that I did not miss anything.

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Linux 5.1-rc5 2019-04-18 8:02 ` Martin Schwidefsky @ 2019-04-18 15:49 ` Linus Torvalds 2019-04-18 18:41 ` Martin Schwidefsky 0 siblings, 1 reply; 26+ messages in thread From: Linus Torvalds @ 2019-04-18 15:49 UTC (permalink / raw) To: Martin Schwidefsky Cc: Christoph Hellwig, Linux List Kernel Mailing, Michael Ellerman, linuxppc-dev, linux-s390 On Thu, Apr 18, 2019 at 1:02 AM Martin Schwidefsky <schwidefsky@de.ibm.com> wrote: > > The problematic lines in the generic gup code are these three: > > 1845: pmdp = pmd_offset(&pud, addr); > 1888: pudp = pud_offset(&p4d, addr); > 1916: p4dp = p4d_offset(&pgd, addr); > > Passing the pointer of a *copy* of a page table entry to pxd_offset() does > not work with the page table folding on s390. Hmm. I wonder why. x86 too does the folding thing for the p4d and pud case. The folding works with the local copy just the same way it works with the orignal value. But I see that s390 does some other kind of folding and does that addition of the p*d_index() unconditionally. I guess that does mean that s390 will just have to have its own walker. For the issue of the page refcount overflow it really isn't a huge deal. Adding the refcount checking is simple (see the example patch I gave for powerpc - you'll just have a couple of extra cases since you do it all, rather than just the special hugetlb cases). Obviously in general it would have been nicer to share as much code as possible, but let's not make things unnecessarily complex if s390 is just fundamentally different.. Linus ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Linux 5.1-rc5 2019-04-18 15:49 ` Linus Torvalds @ 2019-04-18 18:41 ` Martin Schwidefsky 2019-04-19 13:33 ` Martin Schwidefsky 0 siblings, 1 reply; 26+ messages in thread From: Martin Schwidefsky @ 2019-04-18 18:41 UTC (permalink / raw) To: Linus Torvalds Cc: Christoph Hellwig, Linux List Kernel Mailing, Michael Ellerman, linuxppc-dev, linux-s390 On Thu, 18 Apr 2019 08:49:32 -0700 Linus Torvalds <torvalds@linux-foundation.org> wrote: > On Thu, Apr 18, 2019 at 1:02 AM Martin Schwidefsky > <schwidefsky@de.ibm.com> wrote: > > > > The problematic lines in the generic gup code are these three: > > > > 1845: pmdp = pmd_offset(&pud, addr); > > 1888: pudp = pud_offset(&p4d, addr); > > 1916: p4dp = p4d_offset(&pgd, addr); > > > > Passing the pointer of a *copy* of a page table entry to pxd_offset() does > > not work with the page table folding on s390. > > Hmm. I wonder why. x86 too does the folding thing for the p4d and pud case. > > The folding works with the local copy just the same way it works with > the orignal value. The difference is that with the static page table folding pgd_offset() does the index calculation of the actual hardware top-level table. With dynamic page table folding as s390 is doing it, if the task does not use a 5-level page table pgd_offset() will see a pgd_index() of 0, the indexing of the actual top-level table is done later with p4d_offset(), pud_offset() or pmd_offset(). As an example, with a three level page table we have three indexes x/y/z. The common code "thinks" 5 indexing steps, with static folding the index sequence is x 0 0 y z. With dynamic folding the sequence is 0 0 x y z. By moving the first indexing operation to pgd_offset the static sequence does not add an index to a non-dereferenced pointer to a stack variable, the dynamic sequence does. > But I see that s390 does some other kind of folding and does that > addition of the p*d_index() unconditionally. 
>
> I guess that does mean that s390 will just have to have its own walker.
>
> For the issue of the page refcount overflow it really isn't a huge
> deal. Adding the refcount checking is simple (see the example patch I
> gave for powerpc - you'll just have a couple of extra cases since you
> do it all, rather than just the special hugetlb cases).
>
> Obviously in general it would have been nicer to share as much code as
> possible, but let's not make things unnecessarily complex if s390 is
> just fundamentally different..

It would have been nice to use the generic code (fewer bugs) but not at
the price of over-complicating things. And that page table folding thing
always makes my head hurt.

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.

^ permalink raw reply [flat|nested] 26+ messages in thread
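The "x 0 0 y z" versus "0 0 x y z" index sequences described above can be reproduced with a tiny model. This is a sketch only: `toy_static_fold()` and `toy_dynamic_fold()` are hypothetical helpers that record which index each of the five generic steps (pgd, p4d, pud, pmd, pte) would apply, given the real indexes of a table with fewer levels.

```c
enum { TOY_STEPS = 5 };	/* pgd, p4d, pud, pmd, pte */

/*
 * Static folding: the pgd step indexes the real top-level table, the
 * folded steps right after it contribute index 0.
 * 'levels' is the number of real table levels (1..5), 'idx' holds the
 * real indexes, top level first.
 */
static void toy_static_fold(int levels, const unsigned int *idx,
			    unsigned int out[TOY_STEPS])
{
	int folded = TOY_STEPS - levels;
	int i;

	out[0] = idx[0];
	for (i = 1; i <= folded; i++)
		out[i] = 0;
	for (i = folded + 1; i < TOY_STEPS; i++)
		out[i] = idx[i - folded];
}

/*
 * Dynamic folding as described for the old s390 code: the leading steps
 * contribute index 0 and the real top-level indexing happens later.
 */
static void toy_dynamic_fold(int levels, const unsigned int *idx,
			     unsigned int out[TOY_STEPS])
{
	int folded = TOY_STEPS - levels;
	int i;

	for (i = 0; i < folded; i++)
		out[i] = 0;
	for (i = folded; i < TOY_STEPS; i++)
		out[i] = idx[i - folded];
}
```

With three real levels and indexes 7/8/9, the static variant yields 7 0 0 8 9 and the dynamic one 0 0 7 8 9. In the dynamic case the first non-zero index is applied by a step that, in the generic gup code, receives a pointer to a stack copy of the entry, which is the problem discussed in this message.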
* Re: Linux 5.1-rc5 2019-04-18 18:41 ` Martin Schwidefsky @ 2019-04-19 13:33 ` Martin Schwidefsky 2019-04-19 17:27 ` Linus Torvalds 0 siblings, 1 reply; 26+ messages in thread From: Martin Schwidefsky @ 2019-04-19 13:33 UTC (permalink / raw) To: Linus Torvalds Cc: Christoph Hellwig, Linux List Kernel Mailing, Michael Ellerman, linuxppc-dev, linux-s390 On Thu, 18 Apr 2019 20:41:44 +0200 Martin Schwidefsky <schwidefsky@de.ibm.com> wrote: > On Thu, 18 Apr 2019 08:49:32 -0700 > Linus Torvalds <torvalds@linux-foundation.org> wrote: > > > On Thu, Apr 18, 2019 at 1:02 AM Martin Schwidefsky > > <schwidefsky@de.ibm.com> wrote: > > > > > > The problematic lines in the generic gup code are these three: > > > > > > 1845: pmdp = pmd_offset(&pud, addr); > > > 1888: pudp = pud_offset(&p4d, addr); > > > 1916: p4dp = p4d_offset(&pgd, addr); > > > > > > Passing the pointer of a *copy* of a page table entry to pxd_offset() does > > > not work with the page table folding on s390. > > > > Hmm. I wonder why. x86 too does the folding thing for the p4d and pud case. > > > > The folding works with the local copy just the same way it works with > > the orignal value. > > The difference is that with the static page table folding pgd_offset() > does the index calculation of the actual hardware top-level table. With > dynamic page table folding as s390 is doing it, if the task does not use > a 5-level page table pgd_offset() will see a pgd_index() of 0, the indexing > of the actual top-level table is done later with p4d_offset(), pud_offset() > or pmd_offset(). > > As an example, with a three level page table we have three indexes x/y/z. > The common code "thinks" 5 indexing steps, with static folding the index > sequence is x 0 0 y z. With dynamic folding the sequence is 0 0 x y z. > By moving the first indexing operation to pgd_offset the static sequence > does not add an index to a non-dereferenced pointer to a stack variable, > the dynamic sequence does. 
That problem got stuck in my head and I thought more about it. Why not
emulate the static folding sequence in the s390 page table code? As the
table type is encoded in every entry for the region and segment tables,
pgd_offset() can look at the first entry to find the table type and then
do the correct index calculation for the given top-level table. Like this:

static inline pgd_t *pgd_offset_raw(pgd_t *pgd, unsigned long address)
{
	unsigned long rste;
	unsigned int shift;

	/* Get the first entry of the top level table */
	rste = pgd_val(*pgd);
	/* Pick up the shift from the table type of the first entry */
	shift = ((rste & _REGION_ENTRY_TYPE_MASK) >> 2) * 11 + 20;
	return pgd + ((address >> shift) & (PTRS_PER_PGD - 1));
}

#define pgd_offset(mm, address) pgd_offset_raw((mm)->pgd, address)
#define pgd_offset_k(address) pgd_offset(&init_mm, address)

static inline p4d_t *p4d_offset(pgd_t *pgd, unsigned long address)
{
	if ((pgd_val(*pgd) & _REGION_ENTRY_TYPE_MASK) != _REGION_ENTRY_TYPE_R1)
		return (p4d_t *) pgd;
	return (p4d_t *) pgd_deref(*pgd) + p4d_index(address);
}

static inline pud_t *pud_offset(p4d_t *p4d, unsigned long address)
{
	if ((p4d_val(*p4d) & _REGION_ENTRY_TYPE_MASK) != _REGION_ENTRY_TYPE_R2)
		return (pud_t *) p4d;
	return (pud_t *) p4d_deref(*p4d) + pud_index(address);
}

static inline pmd_t *pmd_offset(pud_t *pud, unsigned long address)
{
	if ((pud_val(*pud) & _REGION_ENTRY_TYPE_MASK) != _REGION_ENTRY_TYPE_R3)
		return (pmd_t *) pud;
	return (pmd_t *) pud_deref(*pud) + pmd_index(address);
}

This needs more thorough testing but in principle it does work. The kernel
boots and survives a kernel compile. The only thing that is slightly off
is that pgd_offset() now has to look at the first table entry to do its job.

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.

^ permalink raw reply [flat|nested] 26+ messages in thread
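The shift arithmetic in pgd_offset_raw() can be checked by hand with a small user-space sketch. The constants below are assumptions taken from memory of arch/s390/include/asm/pgtable.h (type mask 0x0c, region-first/second/third types 0x0c/0x08/0x04, segment type 0x00, 2048 entries per table); the `toy_` names are hypothetical and the real kernel definitions should be treated as authoritative.

```c
#include <stdint.h>

/* Assumed values, mirroring the s390 region table entry type encoding. */
#define TOY_ENTRY_TYPE_MASK	0x0cULL
#define TOY_ENTRY_TYPE_R1	0x0cULL	/* region-first table  */
#define TOY_ENTRY_TYPE_R2	0x08ULL	/* region-second table */
#define TOY_ENTRY_TYPE_R3	0x04ULL	/* region-third table  */
#define TOY_ENTRY_TYPE_SEG	0x00ULL	/* segment table       */
#define TOY_PTRS_PER_TABLE	2048ULL	/* 11 index bits per level */

/* Same formula as in the mail: type 0..3 selects shift 20, 31, 42 or 53. */
static unsigned int toy_top_level_shift(uint64_t first_entry)
{
	return (unsigned int)((first_entry & TOY_ENTRY_TYPE_MASK) >> 2) * 11 + 20;
}

/* Index into the real top-level table for a given address. */
static uint64_t toy_top_level_index(uint64_t first_entry, uint64_t address)
{
	return (address >> toy_top_level_shift(first_entry)) &
	       (TOY_PTRS_PER_TABLE - 1);
}
```

Under these assumptions the shifts come out as 53, 42, 31 and 20 bits, so each table level consumes 11 address bits above a 1 MiB segment.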
* Re: Linux 5.1-rc5 2019-04-19 13:33 ` Martin Schwidefsky @ 2019-04-19 17:27 ` Linus Torvalds 2019-04-23 15:38 ` Martin Schwidefsky 0 siblings, 1 reply; 26+ messages in thread From: Linus Torvalds @ 2019-04-19 17:27 UTC (permalink / raw) To: Martin Schwidefsky Cc: Christoph Hellwig, Linux List Kernel Mailing, Michael Ellerman, linuxppc-dev, linux-s390 On Fri, Apr 19, 2019 at 6:33 AM Martin Schwidefsky <schwidefsky@de.ibm.com> wrote: > > That problem got stuck in my head and I thought more about it. Why not > emulate the static folding sequence in the s390 page table code? So this model seems much closer to what x86 does in its folding, where the pattern is basically > static inline pX-1d_t *pXd_offset(pXd_t *pXd, unsigned long address) > { > if (pXd_folded(pXd) > return (pX-1d_t *) pXd; > return (pX-1d_t *) pXd_deref(*pXd) + pXd_index(address); > } which is really how the code is designed to work (ie the folded entry doesn't actually do anything to the page directory pointer, it just says "ok, we'll use this exact page directory pointer for the next lower level instead". And that's very much what allows the generic gup code to load the entry once, and use a temporary, and as you walk down the chain, if it is folded it just then uses that (previous) temporary value for the next level instead. IOW, the lower level page table is hidden inside the upper level one, and folding just means "don't do any offsets, don't change any values, just use the entry as-is for the next lower level". So I think that's the right thing to do. Looking at the s390 code, it seems to fold things the other way, conceptually hiding the upper level inside the lower one, and always doing the offset thing (but just avoiding the dereference). Maybe there's some reason why the s390 code does it that way, but I think your new model is the right one, and hopefully means you can use the generic page table walking more easily. 
Of course, the s390 folding is very different from the x86 one (or the
generic fixed 3-level or 4-level cases). The x86 folding doesn't depend on
the contents of the page tables, it's just entirely static (well, the
5th level is conditional, but it's conditional on a static key, not on
what is in the page tables).

So maybe the old model of s390 made more sense in that context, but I
look at your new suggested pXd_offset() functions and I go "yeah,
that's the way it's supposed to work".

                  Linus

^ permalink raw reply [flat|nested] 26+ messages in thread
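The point that a folded level "just uses the entry as-is" is what makes a snapshot on the stack safe. The following user-space sketch models that with hypothetical toy types; wrapping one entry type inside the next is an assumption borrowed from how generic folded levels are commonly declared, not a copy of any architecture's code.

```c
/* Toy entry types: the folded p4d level simply wraps a pgd entry. */
typedef struct { unsigned long val; } toy_pgd_t;
typedef struct { toy_pgd_t pgd; } toy_p4d_t;

/*
 * Folded level: hand the very same entry pointer down. No dereference
 * and no index is applied, so it does not matter whether 'pgd' points
 * into the live table or at a local copy of one entry.
 */
static toy_p4d_t *toy_p4d_offset_folded(toy_pgd_t *pgd, unsigned long address)
{
	(void)address;
	return (toy_p4d_t *)pgd;
}

/*
 * Walk step in the style of the lockless gup code: read the live entry
 * exactly once, then continue from the snapshot only.
 */
static unsigned long toy_walk_once(const toy_pgd_t *table, unsigned long index)
{
	toy_pgd_t pgd = table[index];	/* single read of the live entry */
	toy_p4d_t *p4dp = toy_p4d_offset_folded(&pgd, 0);

	return p4dp->pgd.val;		/* value comes from the snapshot */
}
```

If the folded helper added an index to its argument instead, the result would point past the single-entry snapshot, which is the failure mode of the unconditional p*d_index() addition mentioned earlier in the thread.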
* Re: Linux 5.1-rc5 2019-04-19 17:27 ` Linus Torvalds @ 2019-04-23 15:38 ` Martin Schwidefsky 2019-04-23 16:06 ` Linus Torvalds 0 siblings, 1 reply; 26+ messages in thread From: Martin Schwidefsky @ 2019-04-23 15:38 UTC (permalink / raw) To: Linus Torvalds Cc: Christoph Hellwig, Linux List Kernel Mailing, Michael Ellerman, linuxppc-dev, linux-s390 On Fri, 19 Apr 2019 10:27:17 -0700 Linus Torvalds <torvalds@linux-foundation.org> wrote: > On Fri, Apr 19, 2019 at 6:33 AM Martin Schwidefsky > <schwidefsky@de.ibm.com> wrote: > > > > That problem got stuck in my head and I thought more about it. Why not > > emulate the static folding sequence in the s390 page table code? > > So this model seems much closer to what x86 does in its folding, where > the pattern is basically > > > static inline pX-1d_t *pXd_offset(pXd_t *pXd, unsigned long address) > > { > > if (pXd_folded(pXd) > > return (pX-1d_t *) pXd; > > return (pX-1d_t *) pXd_deref(*pXd) + pXd_index(address); > > } > > which is really how the code is designed to work (ie the folded entry > doesn't actually do anything to the page directory pointer, it just > says "ok, we'll use this exact page directory pointer for the next > lower level instead". > > And that's very much what allows the generic gup code to load the > entry once, and use a temporary, and as you walk down the chain, if it > is folded it just then uses that (previous) temporary value for the > next level instead. IOW, the lower level page table is hidden inside > the upper level one, and folding just means "don't do any offsets, > don't change any values, just use the entry as-is for the next lower > level". > > So I think that's the right thing to do. Ok, I added two patches for my s390/linux:features branch Martin Schwidefsky (2): s390/mm: make the pxd_offset functions more robust s390/mm: convert to the generic get_user_pages_fast code All code changes are inside arch/s390, I plan to include these patches with the next merge window. 
That gives us a little bit of time to run our tests. -- blue skies, Martin. "Reality continues to ruin my life." - Calvin. ^ permalink raw reply [flat|nested] 26+ messages in thread
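The folding pattern quoted in this message can be modelled in a few lines of userspace C. What follows is only a toy sketch, not the s390 or x86 code: every `toy_*` name, the two-level layout, and the constants are assumptions made up for illustration. It shows the one property the discussion relies on, namely that a folded level does no dereference and no offsetting and simply hands back the pointer it was given.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* All toy_* names and constants are hypothetical; none of this is kernel API. */
#define TOY_PAGE_SHIFT 12
#define TOY_ENTRIES    4	/* entries per lower-level table, power of two */

typedef struct toy_entry {
	bool is_upper;			/* true: a real upper-level entry */
	struct toy_entry *lower;	/* lower-level table, valid only if is_upper */
	int payload;			/* stands in for a lower-level entry's contents */
} toy_entry;

/* Index of 'address' within a lower-level table. */
static inline size_t toy_index(unsigned long address)
{
	return (address >> TOY_PAGE_SHIFT) & (TOY_ENTRIES - 1);
}

/*
 * Toy analogue of the quoted pXd_offset() pattern.
 *
 * Folded case: the "upper" entry is really already a lower-level entry,
 * so return the same pointer untouched.
 * Real case: follow the entry to the lower table and index into it.
 */
static inline toy_entry *toy_lower_offset(toy_entry *upper, unsigned long address)
{
	assert(upper != NULL);
	if (!upper->is_upper)
		return upper;
	return upper->lower + toy_index(address);
}
```

Because the folded case returns the pointer it was passed, a walker that has copied an entry into a local temporary keeps working on that temporary at the next level, which is the behaviour described above for the generic gup code.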
* Re: Linux 5.1-rc5 2019-04-23 15:38 ` Martin Schwidefsky @ 2019-04-23 16:06 ` Linus Torvalds 0 siblings, 0 replies; 26+ messages in thread From: Linus Torvalds @ 2019-04-23 16:06 UTC (permalink / raw) To: Martin Schwidefsky Cc: Christoph Hellwig, Linux List Kernel Mailing, Michael Ellerman, linuxppc-dev, linux-s390 On Tue, Apr 23, 2019 at 8:39 AM Martin Schwidefsky <schwidefsky@de.ibm.com> wrote: > > Ok, I added two patches for my s390/linux:features branch > > Martin Schwidefsky (2): > s390/mm: make the pxd_offset functions more robust > s390/mm: convert to the generic get_user_pages_fast code > > All code changes are inside arch/s390, I plan to include these patches with > the next merge window. That gives us a little bit of time to run our tests. Sounds good. Thanks for looking into this all. Now I slightly wonder about all the other random architectures that don't use the HAVE_GENERIC_GUP config option, but at least we'll have all of arm, powerpc, x86 and s390 using the generic code.. Linus ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Linux 5.1-rc5 2019-04-15 16:17 ` Linus Torvalds 2019-04-16 9:09 ` Martin Schwidefsky @ 2019-04-17 3:38 ` Michael Ellerman 2019-04-17 4:13 ` Linus Torvalds 2019-05-02 12:21 ` Greg KH 2019-05-02 23:15 ` Christoph Hellwig 3 siblings, 1 reply; 26+ messages in thread From: Michael Ellerman @ 2019-04-17 3:38 UTC (permalink / raw) To: Linus Torvalds, Christoph Hellwig Cc: Linux List Kernel Mailing, linuxppc-dev, Martin Schwidefsky, linux-s390, Nicholas Piggin, Aneesh Kumar K.V, Paul Mackerras [ Cc += Nick & Aneesh & Paul ] Linus Torvalds <torvalds@linux-foundation.org> writes: > On Sun, Apr 14, 2019 at 10:19 PM Christoph Hellwig <hch@infradead.org> wrote: >> >> Can we please have the page refcount overflow fixes out on the list >> for review, even if it is after the fact? > > They were actually on a list for review long before the fact, but it > was the security mailing list. The issue actually got discussed back > in January along with early versions of the patches, but then we > dropped the ball because it just wasn't on anybody's radar and it got > resurrected late March. Willy wrote a rather bigger patch-series, and > review of that is what then resulted in those commits. So they may > look recent, but that's just because the original patches got > seriously edited down and rewritten. > > That said, powerpc and s390 should at least look at maybe adding a > check for the page ref in their gup paths too. Powerpc has the special > gup_hugepte() case Which uses page_cache_add_speculative(), which handles the case of the refcount being zero but not overflow. So that looks like it needs fixing. We also have follow_huge_pd() that should use try_get_page(). And we have a few uses of bare get_page() in KVM code which might be subject to the same attack. cheers ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Linux 5.1-rc5 2019-04-17 3:38 ` Michael Ellerman @ 2019-04-17 4:13 ` Linus Torvalds 0 siblings, 0 replies; 26+ messages in thread From: Linus Torvalds @ 2019-04-17 4:13 UTC (permalink / raw) To: Michael Ellerman Cc: Christoph Hellwig, Linux List Kernel Mailing, linuxppc-dev, Martin Schwidefsky, linux-s390, Nicholas Piggin, Aneesh Kumar K.V, Paul Mackerras On Tue, Apr 16, 2019 at 8:38 PM Michael Ellerman <mpe@ellerman.id.au> wrote: > > > That said, powerpc and s390 should at least look at maybe adding a > > check for the page ref in their gup paths too. Powerpc has the special > > gup_hugepte() case > > Which uses page_cache_add_speculative(), which handles the case of the > refcount being zero but not overflow. So that looks like it needs > fixing. Note that unlike the zero check, the "too many refs" check does _not_ need to be atomic. Because it's not a correctness issue right at some magical exact point, it's a much more ambiguous "the refcount is now so large that I'm not going to do GUP on this page any more". Being off by a number of pages in case there's a race is just fine. So you could do something like the appended patch (TOTALLY UNTESTED, and whitespace-damaged on purpose - I don't want you to apply it blindly). > And we have a few uses of bare get_page() in KVM code which might be > subject to the same attack. Note that you really have to have not just a get_page(), but some way of lining up *billions* of them. Which really tends to be pretty hard. 
Linus

----
diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index 9e732bb2c84a..52db7ff7c756 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -523,7 +523,8 @@ struct page *follow_huge_pd(struct vm_area_struct *vma,
 page = pte_page(*ptep);
 page += ((address & mask) >> PAGE_SHIFT);
 if (flags & FOLL_GET)
- get_page(page);
+ if (!try_get_page(page))
+ page = NULL;
 } else {
 if (is_hugetlb_entry_migration(*ptep)) {
 spin_unlock(ptl);
@@ -883,6 +884,8 @@ int gup_hugepte(pte_t *ptep, unsigned long sz, unsigned long addr,
 refs = 0;
 head = pte_page(pte);
+ if (page_ref_count(head) < 0)
+ return 0;
 page = head + ((addr & (sz-1)) >> PAGE_SHIFT);
 do {

^ permalink raw reply related [flat|nested] 26+ messages in thread
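The difference between the two checks discussed in this message (an atomic "is it zero?" test versus a relaxed "is it already too large?" test) can be illustrated in userspace with C11 atomics. This is a hedged sketch only: `struct toy_page` and the `toy_*` functions are hypothetical stand-ins, loosely modelled on what the thread says about page_cache_add_speculative() and try_get_page(), and they are not the kernel implementations. A wrapped counter is modelled as a negative value.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page; only the refcount is modelled. */
struct toy_page {
	atomic_int refcount;
};

/*
 * The zero check must be atomic with the increment: a page whose count
 * has dropped to zero is being freed and must never be revived.
 */
static inline bool toy_add_unless_zero(struct toy_page *page, int count)
{
	int old = atomic_load(&page->refcount);

	assert(count > 0);
	do {
		if (old == 0)
			return false;
		/* unsigned arithmetic avoids signed-overflow UB in this sketch */
	} while (!atomic_compare_exchange_weak(&page->refcount, &old,
					       (int)((unsigned)old + (unsigned)count)));
	return true;
}

/*
 * The "too many refs" check is a plain relaxed read done beforehand.
 * Racing with other reference takers only makes it slightly imprecise,
 * which is acceptable because the limit is not an exact boundary.
 */
static inline bool toy_gup_get(struct toy_page *head, int refs)
{
	if (atomic_load_explicit(&head->refcount, memory_order_relaxed) < 0)
		return false;
	return toy_add_unless_zero(head, refs);
}

/* Single-reference variant: refuse when the count is zero or negative. */
static inline bool toy_try_get_page(struct toy_page *page)
{
	if (atomic_load_explicit(&page->refcount, memory_order_relaxed) <= 0)
		return false;
	atomic_fetch_add_explicit(&page->refcount, 1, memory_order_relaxed);
	return true;
}
```

Callers treat a false return the same way the appended patch does: no reference was taken, so the page must not be used (the patch sets page = NULL or returns 0).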
* Re: Linux 5.1-rc5 2019-04-15 16:17 ` Linus Torvalds 2019-04-16 9:09 ` Martin Schwidefsky 2019-04-17 3:38 ` Michael Ellerman @ 2019-05-02 12:21 ` Greg KH 2019-05-02 14:17 ` Martin Schwidefsky 2019-05-03 13:31 ` Michael Ellerman 2019-05-02 23:15 ` Christoph Hellwig 3 siblings, 2 replies; 26+ messages in thread From: Greg KH @ 2019-05-02 12:21 UTC (permalink / raw) To: Linus Torvalds Cc: Christoph Hellwig, Linux List Kernel Mailing, Michael Ellerman, linuxppc-dev, Martin Schwidefsky, linux-s390 On Mon, Apr 15, 2019 at 09:17:10AM -0700, Linus Torvalds wrote: > On Sun, Apr 14, 2019 at 10:19 PM Christoph Hellwig <hch@infradead.org> wrote: > > > > Can we please have the page refcount overflow fixes out on the list > > for review, even if it is after the fact? > > They were actually on a list for review long before the fact, but it > was the security mailing list. The issue actually got discussed back > in January along with early versions of the patches, but then we > dropped the ball because it just wasn't on anybody's radar and it got > resurrected late March. Willy wrote a rather bigger patch-series, and > review of that is what then resulted in those commits. So they may > look recent, but that's just because the original patches got > seriously edited down and rewritten. > > That said, powerpc and s390 should at least look at maybe adding a > check for the page ref in their gup paths too. Powerpc has the special > gup_hugepte() case, and s390 has its own version of gup entirely. I > was actually hoping the s390 guys would look at using the generic gup > code. > > I ruthlessly also entirely ignored MIPS, SH and sparc, since they seem > largely irrelevant, partly since even theoretically this whole issue > needs a _lot_ of memory. > > Michael, Martin, see commit 6b3a70773630 ("Merge branch 'page-refs' > (page ref overflow)"). You may or may not really care. 
I've now queued these patches up for the next round of stable releases, as some people seem to care about these. I didn't see any follow-on patches for s390 or ppc64 hit the tree for these changes, am I just missing them and should also queue up a few more to handle this issue on those platforms? thanks, greg k-h ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Linux 5.1-rc5 2019-05-02 12:21 ` Greg KH @ 2019-05-02 14:17 ` Martin Schwidefsky 2019-05-02 14:31 ` Greg KH 2019-05-03 13:31 ` Michael Ellerman 1 sibling, 1 reply; 26+ messages in thread From: Martin Schwidefsky @ 2019-05-02 14:17 UTC (permalink / raw) To: Greg KH Cc: Linus Torvalds, Christoph Hellwig, Linux List Kernel Mailing, Michael Ellerman, linuxppc-dev, linux-s390 On Thu, 2 May 2019 14:21:28 +0200 Greg KH <gregkh@linuxfoundation.org> wrote: > On Mon, Apr 15, 2019 at 09:17:10AM -0700, Linus Torvalds wrote: > > On Sun, Apr 14, 2019 at 10:19 PM Christoph Hellwig <hch@infradead.org> wrote: > > > > > > Can we please have the page refcount overflow fixes out on the list > > > for review, even if it is after the fact? > > > > They were actually on a list for review long before the fact, but it > > was the security mailing list. The issue actually got discussed back > > in January along with early versions of the patches, but then we > > dropped the ball because it just wasn't on anybody's radar and it got > > resurrected late March. Willy wrote a rather bigger patch-series, and > > review of that is what then resulted in those commits. So they may > > look recent, but that's just because the original patches got > > seriously edited down and rewritten. > > > > That said, powerpc and s390 should at least look at maybe adding a > > check for the page ref in their gup paths too. Powerpc has the special > > gup_hugepte() case, and s390 has its own version of gup entirely. I > > was actually hoping the s390 guys would look at using the generic gup > > code. > > > > I ruthlessly also entirely ignored MIPS, SH and sparc, since they seem > > largely irrelevant, partly since even theoretically this whole issue > > needs a _lot_ of memory. > > > > Michael, Martin, see commit 6b3a70773630 ("Merge branch 'page-refs' > > (page ref overflow)"). You may or may not really care. 
> > I've now queued these patches up for the next round of stable releases, > as some people seem to care about these. > > I didn't see any follow-on patches for s390 or ppc64 hit the tree for > these changes, am I just missing them and should also queue up a few > more to handle this issue on those platforms? I fixed that with a different approach. The following two patches are queued for the next merge window: d1874a0c2805 "s390/mm: make the pxd_offset functions more robust" 1a42010cdc26 "s390/mm: convert to the generic get_user_pages_fast code" With these two s390 now uses the generic gup code in mm/gup.c -- blue skies, Martin. "Reality continues to ruin my life." - Calvin. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Linux 5.1-rc5 2019-05-02 14:17 ` Martin Schwidefsky @ 2019-05-02 14:31 ` Greg KH 2019-05-02 15:10 ` Martin Schwidefsky 0 siblings, 1 reply; 26+ messages in thread From: Greg KH @ 2019-05-02 14:31 UTC (permalink / raw) To: Martin Schwidefsky Cc: Linus Torvalds, Christoph Hellwig, Linux List Kernel Mailing, Michael Ellerman, linuxppc-dev, linux-s390 On Thu, May 02, 2019 at 04:17:58PM +0200, Martin Schwidefsky wrote: > On Thu, 2 May 2019 14:21:28 +0200 > Greg KH <gregkh@linuxfoundation.org> wrote: > > > On Mon, Apr 15, 2019 at 09:17:10AM -0700, Linus Torvalds wrote: > > > On Sun, Apr 14, 2019 at 10:19 PM Christoph Hellwig <hch@infradead.org> wrote: > > > > > > > > Can we please have the page refcount overflow fixes out on the list > > > > for review, even if it is after the fact? > > > > > > They were actually on a list for review long before the fact, but it > > > was the security mailing list. The issue actually got discussed back > > > in January along with early versions of the patches, but then we > > > dropped the ball because it just wasn't on anybody's radar and it got > > > resurrected late March. Willy wrote a rather bigger patch-series, and > > > review of that is what then resulted in those commits. So they may > > > look recent, but that's just because the original patches got > > > seriously edited down and rewritten. > > > > > > That said, powerpc and s390 should at least look at maybe adding a > > > check for the page ref in their gup paths too. Powerpc has the special > > > gup_hugepte() case, and s390 has its own version of gup entirely. I > > > was actually hoping the s390 guys would look at using the generic gup > > > code. > > > > > > I ruthlessly also entirely ignored MIPS, SH and sparc, since they seem > > > largely irrelevant, partly since even theoretically this whole issue > > > needs a _lot_ of memory. > > > > > > Michael, Martin, see commit 6b3a70773630 ("Merge branch 'page-refs' > > > (page ref overflow)"). 
You may or may not really care. > > > > I've now queued these patches up for the next round of stable releases, > > as some people seem to care about these. > > > > I didn't see any follow-on patches for s390 or ppc64 hit the tree for > > these changes, am I just missing them and should also queue up a few > > more to handle this issue on those platforms? > > I fixed that with a different approach. The following two patches are > queued for the next merge window: > > d1874a0c2805 "s390/mm: make the pxd_offset functions more robust" > 1a42010cdc26 "s390/mm: convert to the generic get_user_pages_fast code" > > With these two s390 now uses the generic gup code in mm/gup.c Nice! Do you want me to queue those up for the stable backports once they hit a public -rc release? thanks, greg k-h ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Linux 5.1-rc5 2019-05-02 14:31 ` Greg KH @ 2019-05-02 15:10 ` Martin Schwidefsky 2019-05-20 11:09 ` Greg KH 0 siblings, 1 reply; 26+ messages in thread From: Martin Schwidefsky @ 2019-05-02 15:10 UTC (permalink / raw) To: Greg KH Cc: Linus Torvalds, Christoph Hellwig, Linux List Kernel Mailing, Michael Ellerman, linuxppc-dev, linux-s390 On Thu, 2 May 2019 16:31:10 +0200 Greg KH <gregkh@linuxfoundation.org> wrote: > On Thu, May 02, 2019 at 04:17:58PM +0200, Martin Schwidefsky wrote: > > On Thu, 2 May 2019 14:21:28 +0200 > > Greg KH <gregkh@linuxfoundation.org> wrote: > > > > > On Mon, Apr 15, 2019 at 09:17:10AM -0700, Linus Torvalds wrote: > > > > On Sun, Apr 14, 2019 at 10:19 PM Christoph Hellwig <hch@infradead.org> wrote: > > > > > > > > > > Can we please have the page refcount overflow fixes out on the list > > > > > for review, even if it is after the fact? > > > > > > > > They were actually on a list for review long before the fact, but it > > > > was the security mailing list. The issue actually got discussed back > > > > in January along with early versions of the patches, but then we > > > > dropped the ball because it just wasn't on anybody's radar and it got > > > > resurrected late March. Willy wrote a rather bigger patch-series, and > > > > review of that is what then resulted in those commits. So they may > > > > look recent, but that's just because the original patches got > > > > seriously edited down and rewritten. > > > > > > > > That said, powerpc and s390 should at least look at maybe adding a > > > > check for the page ref in their gup paths too. Powerpc has the special > > > > gup_hugepte() case, and s390 has its own version of gup entirely. I > > > > was actually hoping the s390 guys would look at using the generic gup > > > > code. > > > > > > > > I ruthlessly also entirely ignored MIPS, SH and sparc, since they seem > > > > largely irrelevant, partly since even theoretically this whole issue > > > > needs a _lot_ of memory. 
> > > > > > > > Michael, Martin, see commit 6b3a70773630 ("Merge branch 'page-refs' > > > > (page ref overflow)"). You may or may not really care. > > > > > > I've now queued these patches up for the next round of stable releases, > > > as some people seem to care about these. > > > > > > I didn't see any follow-on patches for s390 or ppc64 hit the tree for > > > these changes, am I just missing them and should also queue up a few > > > more to handle this issue on those platforms? > > > > I fixed that with a different approach. The following two patches are > > queued for the next merge window: > > > > d1874a0c2805 "s390/mm: make the pxd_offset functions more robust" > > 1a42010cdc26 "s390/mm: convert to the generic get_user_pages_fast code" > > > > With these two s390 now uses the generic gup code in mm/gup.c > > Nice! Do you want me to queue those up for the stable backports once > they hit a public -rc release? Yes please! -- blue skies, Martin. "Reality continues to ruin my life." - Calvin. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Linux 5.1-rc5 2019-05-02 15:10 ` Martin Schwidefsky @ 2019-05-20 11:09 ` Greg KH 0 siblings, 0 replies; 26+ messages in thread From: Greg KH @ 2019-05-20 11:09 UTC (permalink / raw) To: Martin Schwidefsky Cc: Linus Torvalds, Christoph Hellwig, Linux List Kernel Mailing, Michael Ellerman, linuxppc-dev, linux-s390 On Thu, May 02, 2019 at 05:10:55PM +0200, Martin Schwidefsky wrote: > On Thu, 2 May 2019 16:31:10 +0200 > Greg KH <gregkh@linuxfoundation.org> wrote: > > > On Thu, May 02, 2019 at 04:17:58PM +0200, Martin Schwidefsky wrote: > > > On Thu, 2 May 2019 14:21:28 +0200 > > > Greg KH <gregkh@linuxfoundation.org> wrote: > > > > > > > On Mon, Apr 15, 2019 at 09:17:10AM -0700, Linus Torvalds wrote: > > > > > On Sun, Apr 14, 2019 at 10:19 PM Christoph Hellwig <hch@infradead.org> wrote: > > > > > > > > > > > > Can we please have the page refcount overflow fixes out on the list > > > > > > for review, even if it is after the fact? > > > > > > > > > > They were actually on a list for review long before the fact, but it > > > > > was the security mailing list. The issue actually got discussed back > > > > > in January along with early versions of the patches, but then we > > > > > dropped the ball because it just wasn't on anybody's radar and it got > > > > > resurrected late March. Willy wrote a rather bigger patch-series, and > > > > > review of that is what then resulted in those commits. So they may > > > > > look recent, but that's just because the original patches got > > > > > seriously edited down and rewritten. > > > > > > > > > > That said, powerpc and s390 should at least look at maybe adding a > > > > > check for the page ref in their gup paths too. Powerpc has the special > > > > > gup_hugepte() case, and s390 has its own version of gup entirely. I > > > > > was actually hoping the s390 guys would look at using the generic gup > > > > > code. 
> > > > > > > > > > I ruthlessly also entirely ignored MIPS, SH and sparc, since they seem > > > > > largely irrelevant, partly since even theoretically this whole issue > > > > > needs a _lot_ of memory. > > > > > > > > > > Michael, Martin, see commit 6b3a70773630 ("Merge branch 'page-refs' > > > > > (page ref overflow)"). You may or may not really care. > > > > > > > > I've now queued these patches up for the next round of stable releases, > > > > as some people seem to care about these. > > > > > > > > I didn't see any follow-on patches for s390 or ppc64 hit the tree for > > > > these changes, am I just missing them and should also queue up a few > > > > more to handle this issue on those platforms? > > > > > > I fixed that with a different approach. The following two patches are > > > queued for the next merge window: > > > > > > d1874a0c2805 "s390/mm: make the pxd_offset functions more robust" > > > 1a42010cdc26 "s390/mm: convert to the generic get_user_pages_fast code" > > > > > > With these two s390 now uses the generic gup code in mm/gup.c > > > > Nice! Do you want me to queue those up for the stable backports once > > they hit a public -rc release? > > Yes please! Now queued up to 5.0 and 5.1, but did not apply to 4.19 or older :( thanks, greg k-h ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Linux 5.1-rc5 2019-05-02 12:21 ` Greg KH 2019-05-02 14:17 ` Martin Schwidefsky @ 2019-05-03 13:31 ` Michael Ellerman 1 sibling, 0 replies; 26+ messages in thread From: Michael Ellerman @ 2019-05-03 13:31 UTC (permalink / raw) To: Greg KH, Linus Torvalds Cc: Christoph Hellwig, Linux List Kernel Mailing, linuxppc-dev, Martin Schwidefsky, linux-s390 Greg KH <gregkh@linuxfoundation.org> writes: > On Mon, Apr 15, 2019 at 09:17:10AM -0700, Linus Torvalds wrote: >> On Sun, Apr 14, 2019 at 10:19 PM Christoph Hellwig <hch@infradead.org> wrote: >> > >> > Can we please have the page refcount overflow fixes out on the list >> > for review, even if it is after the fact? >> >> They were actually on a list for review long before the fact, but it >> was the security mailing list. The issue actually got discussed back >> in January along with early versions of the patches, but then we >> dropped the ball because it just wasn't on anybody's radar and it got >> resurrected late March. Willy wrote a rather bigger patch-series, and >> review of that is what then resulted in those commits. So they may >> look recent, but that's just because the original patches got >> seriously edited down and rewritten. >> >> That said, powerpc and s390 should at least look at maybe adding a >> check for the page ref in their gup paths too. Powerpc has the special >> gup_hugepte() case, and s390 has its own version of gup entirely. I >> was actually hoping the s390 guys would look at using the generic gup >> code. >> >> I ruthlessly also entirely ignored MIPS, SH and sparc, since they seem >> largely irrelevant, partly since even theoretically this whole issue >> needs a _lot_ of memory. >> >> Michael, Martin, see commit 6b3a70773630 ("Merge branch 'page-refs' >> (page ref overflow)"). You may or may not really care. > > I've now queued these patches up for the next round of stable releases, > as some people seem to care about these. 
> > I didn't see any follow-on patches for s390 or ppc64 hit the tree for > these changes, am I just missing them and should also queue up a few > more to handle this issue on those platforms? No you haven't missed them for powerpc. It's on my list. cheers ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Linux 5.1-rc5 2019-04-15 16:17 ` Linus Torvalds ` (2 preceding siblings ...) 2019-05-02 12:21 ` Greg KH @ 2019-05-02 23:15 ` Christoph Hellwig 3 siblings, 0 replies; 26+ messages in thread From: Christoph Hellwig @ 2019-05-02 23:15 UTC (permalink / raw) To: Linus Torvalds Cc: Christoph Hellwig, Linux List Kernel Mailing, Michael Ellerman, linuxppc-dev, Martin Schwidefsky, linux-s390, Hillf Danton, Paul Burton, James Hogan, linux-mips, Paul Mundt, Stas Sergeev, Yoshinori Sato, Rich Felker, linux-sh, David S. Miller, Khalid Aziz, Nitin Gupta, sparclinux On Mon, Apr 15, 2019 at 09:17:10AM -0700, Linus Torvalds wrote: > I ruthlessly also entirely ignored MIPS, SH and sparc, since they seem > largely irrelevant, partly since even theoretically this whole issue > needs a _lot_ of memory. Adding the relevant people - while they might be irrelevant, at least mips and sparc have some giant memory systems. And I'd really like to see the arch-specific GUP implementations go away for other reasons, as we have a few issues to sort out with GUP usage now (we just had discussions at LSF/MM), and the fewer implementations we have to deal with the better. ^ permalink raw reply [flat|nested] 26+ messages in thread