* Linux 4.15-rc6
@ 2017-12-31 22:57 Linus Torvalds
  2018-01-02 20:28 ` Andres Freund
  0 siblings, 1 reply; 7+ messages in thread
From: Linus Torvalds @ 2017-12-31 22:57 UTC (permalink / raw)
  To: Linux Kernel Mailing List

One last rc at the end of the year - and a Happy New Year to everybody!

This would have been a very quiet week, if it wasn't for the final x86
PTI stuff - and that shows in the diffstat too. About half the rc6
work is x86 updates. The timing for this isn't wonderful, but it all
looks nice and clean.

Outside of the x86 updates, it's misc driver updates (usb, networking,
rdma, sound), some perf tooling, and misc random stuff (core
networking, some irq fixes).

With all the x86 pti work coming in late in the rc like this, I'm by
now almost guaranteed to do an rc8 this release, not because there are
any known problems, but simply because of the timing of the patches.

Go forth and test,

                  Linus

---

Abhijeet Kumar (1):
      ASoC: nau8825: fix issue that pop noise when start capture

Adam Borowski (1):
      MAINTAINERS: mark arch/blackfin/ and its gubbins as orphaned

Adam Thomson (2):
      ASoC: da7219: Correct IRQ level in DT binding example
      ASoC: da7218: Correct IRQ level in DT binding example

Alex Vesker (1):
      IB/ipoib: Fix lockdep issue found on ipoib_ib_dev_heavy_flush

Alexander Kappner (1):
      xhci: Fix use-after-free in xhci debugfs

Alexandre Belloni (1):
      ASoC: atmel-classd: select correct Kconfig symbol

Alexey Kodanev (1):
      ip6_gre: fix device features for ioctl setup

Andrew F. Davis (1):
      ASoC: tlv320aic31xx: Fix GPIO1 register definition

Andrew Lunn (1):
      kernel/irq: Extend lockdep class for request mutex

Andy Lutomirski (7):
      x86/espfix/64: Fix espfix double-fault handling on 5-level systems
      x86/mm/pti: Add functions to clone kernel PMDs
      x86/mm/pti: Share cpu_entry_area with user space page tables
      x86/mm/pti: Map ESPFIX into user space
      x86/mm/64: Make a full PGD-entry size hole in the memory map
      x86/pti: Put the LDT in its own PGD if PTI is on
      x86/pti: Map the vsyscall page if needed

Andy Shevchenko (1):
      thunderbolt: Make pathname to force_power shorter

Anna-Maria Gleixner (1):
      timers: Use deferrable base independent of base::nohz_active

Antony Antony (1):
      xfrm: fix xfrm_do_migrate() with AEAD e.g(AES-GCM)

Arnaldo Carvalho de Melo (2):
      tools arch s390: Do not include header files from the kernel sources
      x86/asm: Allow again using asm.h when building for the 'bpf' clang target

Arnd Bergmann (1):
      phy: rcar-gen3-usb2: select USB_COMMON

Arvind Yadav (1):
      phy: cpcap-usb: Fix platform_get_irq_byname's error checking.

Avinash Repaka (1):
      RDS: Check cmsg_len before dereferencing CMSG_DATA

Aviv Heller (1):
      xfrm: Fix xfrm_input() to verify state is valid when (encap_type < 0)

Bard Liao (1):
      ASoC: rt5645: reset RT5645_AD_DA_MIXER at probe

Ben Gainey (1):
      perf jvmti: Generate correct debug information for inlined code

Ben Hutchings (1):
      ASoC: wm_adsp: Fix validation of firmware and coeff lengths

Ben Skeggs (1):
      drm/nouveau: fix race when adding delayed work items

Borislav Petkov (2):
      x86/pti: Add the pti= cmdline option and documentation
      x86/mm/dump_pagetables: Add page table directory to the debugfs VFS hierarchy

Brian Norris (1):
      ASoC: rt5514-spi: only enable wakeup when fully initialized

Bryan Tan (3):
      RDMA/vmw_pvrdma: Call ib_umem_release on destroy QP path
      RDMA/vmw_pvrdma: Use refcount_dec_and_test to avoid warning
      RDMA/vmw_pvrdma: Avoid use after free due to QP/CQ/SRQ destroy

Cathy Avery (1):
      scsi: storvsc: Fix scsi_cmd error assignments in storvsc_handle_error

Chris Zhong (1):
      phy: rockchip-typec: add pm_runtime_disable in err case

Christophe Leroy (1):
      gpio: fix "gpio-line-names" property retrieval

Cong Wang (2):
      xfrm: check id proto in validate_tmpl()
      net_sched: fix a missing rcu barrier in mini_qdisc_pair_swap()

Daniel Axtens (1):
      HID: holtekff: move MODULE_* parameters out of #ifdef block

Daniel Thompson (1):
      usb: xhci: Add XHCI_TRUST_TX_LENGTH for Renesas uPD720201

Daniele Palmas (1):
      USB: serial: option: add support for Telit ME910 PID 0x1101

Dave Hansen (10):
      x86/mm/pti: Disable global pages if PAGE_TABLE_ISOLATION=y
      x86/mm/pti: Prepare the x86/entry assembly code for entry/exit CR3 switching
      x86/mm/pti: Add mapping helper functions
      x86/mm/pti: Allow NX poison to be set in p4d/pgd
      x86/mm/pti: Allocate a separate user PGD
      x86/mm/pti: Populate user PGD
      x86/mm: Allow flushing for future ASID switches
      x86/mm: Abstract switching CR3
      x86/mm: Use INVPCID for __native_flush_tlb_single()
      x86/mm/pti: Add Kconfig

Dexuan Cui (1):
      vmbus: unregister device_obj->channels_kset

Dmitry Fleytman Dmitry Fleytman (1):
      usb: Add device quirk for Logitech HD Pro Webcam C925e

Dmitry Torokhov (1):
      kobject: fix suppressing modalias in uevents delivered over netlink

Dong Aisheng (1):
      clk: use atomic runtime pm api in clk_core_is_enabled

Dou Liyang (2):
      x86/apic: Avoid wrong warning when parsing 'apic=' in X86-32 case
      x86/apic: Update the 'apic=' description of setting APIC driver

Eudean Sun (1):
      HID: cp2112: Fix I2C_BLOCK_DATA transactions

Florian Westphal (1):
      xfrm: put policies when reusing pcpu xdst entry

Frederic Weisbecker (2):
      sched/isolation: Enable CONFIG_CPU_ISOLATION=y by default
      sched/isolation: Document boot parameters dependency on CONFIG_CPU_ISOLATION=y

Fugang Duan (1):
      net: fec: unmap the xmit buffer that are not transferred by DMA

Gabriel Krisman Bertazi (1):
      i915: Reject CCS modifiers for pipe C on Geminilake

Grygorii Strashko (2):
      gpio: gpio-reg: fix build
      net: phy: micrel: ksz9031: reconfigure autoneg after phy autoneg workaround

Guenter Roeck (2):
      ASoC: amd: Add error checking to probe function
      genirq: Guard handle_bad_irq log messages

Guilherme G. Piccoli (1):
      bnx2x: Improve reliability in case of nested PCI errors

Guneshwor Singh (1):
      ASoC: Intel: Skylake: Do not check dev_type for dmic link type

Hannes Reinecke (1):
      scsi: core: check for device state in __scsi_remove_target()

Hans de Goede (1):
      HID: core: lower log level for unknown main item tags to warnings

Herbert Xu (1):
      xfrm: Reinject transport-mode packets through tasklet

Hugh Dickins (1):
      x86/events/intel/ds: Map debug buffers in cpu_entry_area

Hui Wang (3):
      ALSA: hda - Add MIC_NO_PRESENCE fixup for 2 HP machines
      ALSA: hda - fix headset mic detection issue on a Dell machine
      ALSA: hda - change the location for one mic on a Lenovo machine

Jakub Kicinski (2):
      tools: bpftool: maps: close json array on error paths of show
      tools: bpftool: protect against races with disappearing objects

Jan Engelhardt (1):
      sparc64: repair calling incorrect hweight function from stubs

Jiada Wang (2):
      ASoC: rsnd: ssiu: clear SSI_MODE for non TDM Extended modes
      ASoC: rsnd: ssi: fix race condition in rsnd_ssi_pointer_update

Jing Xia (1):
      tracing: Fix crash when it fails to alloc ring buffer

Jiri Olsa (2):
      perf tools: Use shell function for perl cflags retrieval
      perf tools: Fix up build in hardened environments

Jiri Pirko (1):
      net: sched: fix possible null pointer deref in tcf_block_put

Joel Fernandes (1):
      cpufreq: schedutil: Use idle_calls counter of the remote CPU

Johan Hovold (4):
      ASoC: da7218: fix fix child-node lookup
      ASoC: twl4030: fix child-node lookup
      USB: chipidea: msm: fix ulpi-node lookup
      phy: tegra: fix device-tree node lookups

John Stultz (1):
      staging: ion: Fix ion_cma_heap allocations

Jon Maloy (2):
      tipc: base group replicast ack counter on number of actual receivers
      tipc: fix memory leak of group member when peer node is lost

Josh Poimboeuf (1):
      x86/stacktrace: Make zombie stack traces reliable

Juan Zea (1):
      usbip: fix usbip bind writing random string after command in match_busid

Kuninori Morimoto (2):
      ASoC: rcar: revert IOMMU support so far
      ASoC: rsnd: fixup ADG register mask

Linus Torvalds (4):
      n_tty: fix EXTPROC vs ICANON interaction with TIOCINQ (aka FIONREAD)
      x86-32: Fix kexec with stack canary (CONFIG_CC_STACKPROTECTOR)
      kbuild: add '-fno-stack-check' to kernel build options
      Linux 4.15-rc6

Linus Walleij (1):
      hwmon: Deal with errors from the thermal subsystem

Lukas Bulwahn (1):
      objtool: Fix Clang enum conversion warning

Maciej S. Szmigiero (2):
      ASoC: fsl_ssi: AC'97 ops need regmap, clock and cleaning up on failure
      ASoC: fsl_ssi: serialize AC'97 register access operations

Majd Dibbiny (2):
      IB/mlx5: Fix congestion counters in LAG mode
      IB/mlx5: Serialize access to the VMA list

Martin Blumenstingl (1):
      nvmem: meson-mx-efuse: fix reading from an offset other than 0

Mat Martineau (1):
      tcp: Avoid preprocessor directives in tracepoint macro args

Mathias Nyman (2):
      USB: Fix off by one in type-specific length check of BOS SSP capability
      xhci: Fix xhci debugfs NULL pointer dereference in resume from hibernate

Mathieu Malaterre (1):
      cpu/hotplug: Move inline keyword at the beginning of declaration

Matthew Wilcox (1):
      x86/build: Make isoimage work on Debian

Matthieu CASTET (1):
      led: core: Fix brightness setting when setting delay_off=0

Max Schulze (1):
      USB: serial: ftdi_sio: add id for Airbus DS P8GR

Michael J. Ruhl (1):
      IB/hfi: Only read capability registers if the capability exists

Michal Kubecek (1):
      xfrm: fix XFRMA_OUTPUT_MARK policy entry

Mika Westerberg (2):
      MAINTAINERS: Add thunderbolt.rst to the Thunderbolt driver entry
      thunderbolt: Mask ring interrupt properly when polling starts

Moni Shoua (2):
      IB/uverbs: Fix command checking as part of ib_uverbs_ex_modify_qp()
      IB/core: Verify that QP is security enabled in create and destroy

Naveen Manohar (2):
      ASoC: Intel: kbl: Modify map for Headset Playback to fix pop-noise
      ASoC: Intel: Change kern log level to avoid unwanted messages

NeilBrown (1):
      staging: lustre: lnet: Fix recent breakage from list_for_each conversion

Nicolin Chen (1):
      ASoC: fsl_asrc: Fix typo in a field define

Nitzan Carmi (1):
      IB/mlx5: Fix mlx5_ib_alloc_mr error flow

Oliver Neukum (1):
      usb: add RESET_RESUME for ELSA MicroLink 56K

Parthasarathy Bhuvaragan (1):
      tipc: fix hanging poll() for stream sockets

Paul E. McKenney (1):
      sched/isolation: Make CONFIG_NO_HZ_FULL select CONFIG_CPU_ISOLATION

Peter Zijlstra (3):
      x86/mm: Use/Fix PCID to optimize user/kernel switches
      x86/mm: Optimize RESTORE_CR3
      x86/mm: Clarify the whole ASID/kernel PCID/user PCID naming

Quentin Monnet (1):
      selftests/bpf: fix Makefile for passing LLC to the command line

Reinhard Speyerer (1):
      USB: serial: qcserial: add Sierra Wireless EM7565

Russell King (2):
      phylink: ensure the PHY interface mode is appropriately set
      phylink: ensure AN is enabled

SZ Lin (林上智) (1):
      USB: serial: option: adding support for YUGA CLM920-NC5

Shuah Khan (4):
      usbip: vhci: stop printing kernel pointer addresses in messages
      usbip: stub: stop printing kernel pointer addresses in messages
      usbip: prevent leaking socket pointer address in messages
      usbip: stub_rx: fix static checker warning on unnecessary checks

Simon Ser (2):
      objtool: Fix seg fault caused by missing parameter
      objtool: Fix seg fault with clang-compiled objects

Siva Reddy Kallam (3):
      tg3: Update copyright
      tg3: Add workaround to restrict 5762 MRRS to 2048
      tg3: Enable PHY reset in MTU change path for 5720

Srinivas Kandagatla (1):
      ASoC: codecs: msm8916-wcd: Fix supported formats

Stefan Potyra (1):
      ASoC: rockchip: disable clock on error

Steffen Klassert (2):
      xfrm: Fix stack-out-of-bounds read on socket policy lookup.
      xfrm: Fix stack-out-of-bounds with misconfigured transport mode policies.

Steve Wise (3):
      iw_cxgb4: Only validate the MSN for successful completions
      iw_cxgb4: reflect the original WR opcode in drain cqes
      iw_cxgb4: when flushing, complete all wrs in a chain

Steven Rostedt (VMware) (4):
      ring-buffer: Mask out the info bits when returning buffer page length
      tracing: Remove extra zeroing out of the ring buffer page
      ring-buffer: Do no reuse reader page if still in use
      tracing: Fix possible double free on failure of allocating trace buffer

Sudeep Holla (1):
      drivers: base: cacheinfo: fix cache type for non-architected system cache

Sushmita Susheelendra (1):
      staging: android: ion: Fix dma direction for dma_sync_sg_for_cpu/device

Takashi Iwai (2):
      ALSA: hda: Drop useless WARN_ON()
      ALSA: hda - Fix missing COEF init for ALC225/295/299

Thomas Gleixner (25):
      x86/cpufeatures: Add X86_BUG_CPU_INSECURE
      x86/mm/pti: Add infrastructure for page table isolation
      x86/mm/pti: Force entry through trampoline when PTI active
      x86/entry: Align entry text section to PMD boundary
      x86/mm/pti: Share entry text PMD
      x86/cpu_entry_area: Add debugstore entries to cpu_entry_area
      x86/mm/dump_pagetables: Check user space page table for WX pages
      x86/mm/dump_pagetables: Allow dumping current pagetables
      x86/ldt: Make the LDT mapping RO
      perf/x86/intel: Plug memory leak in intel_pmu_init()
      x86/apic: Switch all APICs to Fixed delivery mode
      gpio: brcmstb: Make really use of the new lockdep class
      genirq/msi: Handle reactivation only on success
      genirq: Introduce IRQD_CAN_RESERVE flag
      x86/vector: Use IRQD_CAN_RESERVE flag
      genirq/irqdomain: Rename early argument of irq_domain_activate_irq()
      genirq/msi, x86/vector: Prevent reservation mode for non maskable MSI
      timers: Reinitialize per cpu bases on hotplug
      nohz: Prevent a timer interrupt storm in tick_nohz_stop_sched_tick()
      timers: Invoke timer_start_debug() where it makes sense
      timerqueue: Document return values of timerqueue_add/del()
      x86/smpboot: Remove stale TLB flush invocations
      x86/mm: Remove preempt_disable/enable() from __native_flush_tlb()
      x86/ldt: Plug memory leak in error path
      x86/ldt: Make LDT pgtable free conditional

Todd Kjos (1):
      binder: fix proc->files use-after-free

Tom Herbert (2):
      sock: Add sock_owned_by_user_nocheck
      strparser: Call sock_owned_by_user_nocheck

Tom Lendacky (1):
      x86/mm: Unbreak modules that use the DMA API

Tommi Rantala (2):
      tipc: error path leak fixes in tipc_enable_bearer()
      tipc: fix tipc_mon_delete() oops in tipc_enable_bearer() error path

Tonghao Zhang (1):
      sctp: Replace use of sockets_allocated with specified macro.

Vlastimil Babka (1):
      x86/dumpstack: Indicate in Oops whether PTI is configured and enabled

Willem de Bruijn (1):
      skbuff: in skb_copy_ubufs unclone before releasing zerocopy

Xiaolin Zhang (1):
      drm/i915/gvt: Fix pipe A enable as default for vgpu

oder_chiou@realtek.com (3):
      ASoC: rt5514: Make sure the DMIC delay will be happened after normal SUPPLY widgets power on
      ASoC: rt5514: Add the sanity check for the driver_data in the resume function
      ASoC: rt5663: Fix the wrong result of the first jack detection

rodrigosiqueira (1):
      x86: Remove unused parameter of prepare_switch_to


* Re: Linux 4.15-rc6
  2017-12-31 22:57 Linux 4.15-rc6 Linus Torvalds
@ 2018-01-02 20:28 ` Andres Freund
  2018-01-02 21:09   ` Linus Torvalds
  0 siblings, 1 reply; 7+ messages in thread
From: Andres Freund @ 2018-01-02 20:28 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Linux Kernel Mailing List

On 2017-12-31 14:57:51 -0800, Linus Torvalds wrote:
> With all the x86 pti work coming in late in the rc like this, I'm by
> now almost guaranteed to do an rc8 this release, not because there are
> any known problems, but simply because of the timing of the patches.
>
> Go forth and test,

I thought it'd be interesting to run a short benchmark to be able to
estimate the impact of the PTI work on postgres workloads (which I work
on). On my skylake laptop, a memory resident, OLTP workload with 16
connections results in:

pti=on:
tps = 220791.228297 (including connections establishing)
tps = 220836.842052 (excluding connections establishing)

pti=off:
tps = 236629.778328 (including connections establishing)
tps = 236646.598274 (excluding connections establishing)

model name      : Intel(R) Core(TM) i7-6820HQ CPU @ 2.70GHz
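
For reference, the relative slowdown implied by the two "including
connections establishing" figures above can be worked out with a few
lines of C. This is only a sketch of the arithmetic; the helper name
slowdown_pct() is made up for illustration and is not part of pgbench
or of the original message.

```c
#include <stdio.h>

/* Relative throughput loss, in percent, of `measured` against `baseline`.
 * Hypothetical helper, introduced only to illustrate the arithmetic. */
static double slowdown_pct(double baseline, double measured)
{
    return (baseline - measured) / baseline * 100.0;
}

/* Print the slowdown for the pti=off (baseline) and pti=on tps figures
 * quoted in the message. */
static void print_pti_slowdown(void)
{
    double pti_off = 236629.778328; /* tps with pti=off */
    double pti_on  = 220791.228297; /* tps with pti=on  */

    printf("pti=on slowdown: %.2f%%\n", slowdown_pct(pti_off, pti_on));
}
```

With these inputs the result is a little under 7%, i.e. single-digit
overhead for this particular workload on this particular machine.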


This isn't a complaint, I just thought it might be useful
information. If it helps for anything/anybody, I'm happy to run
additional benchmarks / provide additional information.


- Andres


* Re: Linux 4.15-rc6
  2018-01-02 20:28 ` Andres Freund
@ 2018-01-02 21:09   ` Linus Torvalds
  2018-01-03 12:57     ` Willy Tarreau
  0 siblings, 1 reply; 7+ messages in thread
From: Linus Torvalds @ 2018-01-02 21:09 UTC (permalink / raw)
  To: Andres Freund; +Cc: Linux Kernel Mailing List

On Tue, Jan 2, 2018 at 12:28 PM, Andres Freund <andres@anarazel.de> wrote:
>
> I thought it'd be interesting to run a short benchmark to be able to
> estimate the impact of the PTI work on postgres workloads (which I work
> on). On my skylake laptop, a memory resident, OLTP workload with 16
> connections results in:

Yeah, that's actually pretty much in line with expectations.

Something around 5% performance impact of the isolation is what people
are looking at.

Obviously it depends on just exactly what you do. Some loads will
hardly be affected at all, if they just spend all their time in user
space. And if you do a lot of small system calls, you might see
double-digit slowdowns.

> This isn't a complaint, I just thought it might be useful
> information. If it helps for anything/anybody, I'm happy to run
> additional benchmarks / provide additional information.

Note that it will depend heavily on the hardware too. Older CPUs
without PCID will be impacted more by the isolation. And I think some
of the back-ports won't take advantage of PCID even on newer hardware.
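
Whether a given x86 machine advertises PCID can be checked from the
"flags" line of /proc/cpuinfo. The sketch below assumes a Linux system
and that the flag is spelled "pcid" there; the helper names are made up
for illustration. It matches whole words only, so that "invpcid" is not
mistaken for "pcid".

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if `flag` appears as a whole whitespace-separated word in
 * `line`, 0 otherwise. */
static int line_has_flag(const char *line, const char *flag)
{
    size_t len = strlen(flag);
    const char *p = line;

    if (len == 0)
        return 0;
    while ((p = strstr(p, flag)) != NULL) {
        int starts_word = (p == line) || p[-1] == ' ' || p[-1] == '\t';
        int ends_word = p[len] == '\0' || p[len] == ' ' || p[len] == '\n';

        if (starts_word && ends_word)
            return 1;
        p += len;
    }
    return 0;
}

/* Returns 1 if the first "flags" line of /proc/cpuinfo lists pcid,
 * 0 if it does not, and -1 if the file cannot be read or has no
 * "flags" line (for example on a non-x86 system). */
static int cpu_has_pcid(void)
{
    char line[8192];
    FILE *f = fopen("/proc/cpuinfo", "r");
    int found = -1;

    if (!f)
        return -1;
    while (fgets(line, sizeof line, f)) {
        if (strncmp(line, "flags", 5) == 0) {
            found = line_has_flag(line, "pcid");
            break;
        }
    }
    fclose(f);
    return found;
}
```

Note that the flag only says what the CPU supports; whether a given
kernel actually uses PCID is a separate question.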

              Linus


* Re: Linux 4.15-rc6
  2018-01-02 21:09   ` Linus Torvalds
@ 2018-01-03 12:57     ` Willy Tarreau
  2018-01-03 21:20       ` Andres Freund
  0 siblings, 1 reply; 7+ messages in thread
From: Willy Tarreau @ 2018-01-03 12:57 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Andres Freund, Linux Kernel Mailing List

On Tue, Jan 02, 2018 at 01:09:13PM -0800, Linus Torvalds wrote:
> On Tue, Jan 2, 2018 at 12:28 PM, Andres Freund <andres@anarazel.de> wrote:
> >
> > I thought it'd be interesting to run a short benchmark to be able to
> > estimate the impact of the PTI work on postgres workloads (which I work
> > on). On my skylake laptop, a memory resident, OLTP workload with 16
> > connections results in:
> 
> Yeah, that's actually pretty much in line with expectations.
> 
> Something around 5% performance impact of the isolation is what people
> are looking at.
> 
> Obviously it depends on just exactly what you do. Some loads will
> hardly be affected at all, if they just spend all their time in user
> space. And if you do a lot of small system calls, you might see
> double-digit slowdowns.

I can confirm, I've just run some tests on haproxy on a core i7-4790K
and I'm observing a performance loss of ~17%, making the connection
rate go down from 245k/s to 204k/s. It's indeed quite significant for
such use cases, even though I think it might reasonably be absorbed by
usual noise in most use cases.

With that said, I think we should start to think about an option to
disable this per process. We could imagine for example a prctl()
requiring CAP_SYS_ADMIN to disable it. This would at least allow
processes started as root to disable it when they consider themselves
irrelevant to this kind of protection (mostly I/O intensive or network
intensive applications).

> > This isn't a complaint, I just thought it might be useful
> > information. If it helps for anything/anybody, I'm happy to run
> > additional benchmarks / provide additional information.
> 
> Note that it will depend heavily on the hardware too. Older CPUs
> without PCID will be impacted more by the isolation.

Interesting. This CPU has PCID, so it's possible that older hardware
may indeed be hit a bit more.

Regards,
Willy


* Re: Linux 4.15-rc6
  2018-01-03 12:57     ` Willy Tarreau
@ 2018-01-03 21:20       ` Andres Freund
  2018-01-04 10:27         ` Willy Tarreau
  0 siblings, 1 reply; 7+ messages in thread
From: Andres Freund @ 2018-01-03 21:20 UTC (permalink / raw)
  To: Willy Tarreau; +Cc: Linus Torvalds, Linux Kernel Mailing List

On 2018-01-03 13:57:25 +0100, Willy Tarreau wrote:
> On Tue, Jan 02, 2018 at 01:09:13PM -0800, Linus Torvalds wrote:
> > On Tue, Jan 2, 2018 at 12:28 PM, Andres Freund <andres@anarazel.de> wrote:
> > >
> > > I thought it'd be interesting to run a short benchmark to be able to
> > > estimate the impact of the PTI work on postgres workloads (which I work
> > > on). On my skylake laptop, a memory resident, OLTP workload with 16
> > > connections results in:
> > 
> > Yeah, that's actually pretty much in line with expectations.
> > 
> > Something around 5% performance impact of the isolation is what people
> > are looking at.
> > 
> > Obviously it depends on just exactly what you do. Some loads will
> > hardly be affected at all, if they just spend all their time in user
> > space. And if you do a lot of small system calls, you might see
> > double-digit slowdowns.
> 
> I can confirm, I've just run some tests on haproxy on a core i7-4790K
> and I'm observing a performance loss of ~17%, making the connection
> rate go down from 245k/s to 204k/s. It's indeed quite significant for
> such use cases, even though I think it might reasonably be absorbed by
> usual noise in most use cases.

Yea, I've expanded the postgres benchmarks a bit, and it's not hard to
construct cases with significantly increased slowdowns:
https://www.postgresql.org/message-id/20180102222354.qikjmf7dvnjgbkxe@alap3.anarazel.de

and that's on a laptop, not a large system. I'd assume at least the
nopcid cases get considerably worse on larger systems.


> With that said, I think we should start to think about an option to
> disable this per process. We could imagine for example a prctl()
> requiring CAP_SYS_ADMIN to disable it. This would at least allow
> processes started as root to disable it when they consider themselves
> irrelevant to this kind of protection (mostly I/O intensive or network
> intensive applications).

That might not be a bad idea. If so, it'd be a good idea to keep it
separate from CAP_SYS_ADMIN. E.g. postgres refuses to run as root, but
setcap'ing to allow CAP_SYS_LIVE_AND_LET_LIVE_SYSCALL or such would
work.

But I suspect this isn't something easily done on a capability/prctl
level? Seems not uncomplicated to change this after a process has
already been created - so maybe it'd be easier to force this via
personality()?


> > > This isn't a complaint, I just thought it might be useful
> > > information. If it helps for anything/anybody, I'm happy to run
> > > additional benchmarks / provide additional information.
> > 
> > Note that it will depend heavily on the hardware too. Older CPUs
> > without PCID will be impacted more by the isolation.
> 
> Interesting. This CPU has PCID, so it's possible that older hardware
> may indeed be hit a bit more.

The post linked above has numbers with nopcid disabling pcid use, and
indeed, the difference is quite measurable.

Greetings,

Andres Freund


* Re: Linux 4.15-rc6
  2018-01-03 21:20       ` Andres Freund
@ 2018-01-04 10:27         ` Willy Tarreau
  2018-01-04 11:03           ` Willy Tarreau
  0 siblings, 1 reply; 7+ messages in thread
From: Willy Tarreau @ 2018-01-04 10:27 UTC (permalink / raw)
  To: Andres Freund; +Cc: Linus Torvalds, Linux Kernel Mailing List

On Wed, Jan 03, 2018 at 01:20:00PM -0800, Andres Freund wrote:
> On 2018-01-03 13:57:25 +0100, Willy Tarreau wrote:
> > I think we should start to think about an option to
> > disable this per process. We could imagine for example a prctl()
> > requiring CAP_SYS_ADMIN to disable it. This would at least allow
> > processes started as root to disable it when they consider themselves
> > irrelevant to this kind of protection (mostly I/O intensive or network
> > intensive applications).
> 
> That might not be a bad idea. If so, it'd be a good idea to keep it
> separate from CAP_SYS_ADMIN. E.g. postgres refuses to run as root,

There's a difference between "running as" and "starting as" (e.g. in
haproxy we encourage users to *start as root* but not to *run as root*;
this allows the process to chroot to /var/empty and drop all privileges).
But I get your point, it's important to adapt to what various programs
will require.

> but
> setcap'ing to allow CAP_SYS_LIVE_AND_LET_LIVE_SYSCALL or such would
> work.

yes probably.

> But I suspect this isn't something easily done on a capability/prctl
> level? Seems not uncomplicated to change this after a process has
> already been created - so maybe it'd be easier to force this via
> personality()?

I don't know. One solution when you perform changes that affect the
running process' VMA is for the process to re-exec itself after the change:

   if (pti_protection_enabled && prctl(PR_SET_PTI, PR_PTI_DISABLE) == 0)
           exit(execve(argv[0], argv, envp));
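
Spelled out a little more, the idea above could look like the sketch
below. It is purely illustrative: PR_SET_PTI, PR_GET_PTI and
PR_PTI_DISABLE do not exist in any kernel, and the numeric values are
placeholders picked only so that the code compiles. On a current kernel
both prctl() calls simply fail and the process carries on unchanged.

```c
#include <stdio.h>
#include <unistd.h>
#include <sys/prctl.h>

/* Hypothetical prctl options: no kernel defines these. The values are
 * placeholders so that the sketch compiles. */
#define PR_SET_PTI      0x50544901
#define PR_GET_PTI      0x50544902
#define PR_PTI_DISABLE  0UL

extern char **environ;

/* Assumed semantics: PR_GET_PTI returns 0 once PTI has been disabled
 * for this process. Any other result, including failure on kernels
 * without the option, is treated as "still protected". */
static int pti_protection_enabled(void)
{
    return prctl(PR_GET_PTI, 0UL, 0UL, 0UL, 0UL) != 0;
}

/* Try to disable PTI, then re-exec so that the new image starts with
 * the changed setup. Does not return if the re-exec succeeds; returns
 * 0 when the process continues in the current image. The check on
 * pti_protection_enabled() keeps the re-exec'd image from looping. */
static int pti_disable_and_reexec(char *const argv[])
{
    if (pti_protection_enabled() &&
        prctl(PR_SET_PTI, PR_PTI_DISABLE, 0UL, 0UL, 0UL) == 0) {
        /* execve() does not search PATH, so argv[0] must be a usable
         * path to the binary. */
        execve(argv[0], argv, environ);
        perror("execve"); /* reached only if execve() failed */
        _exit(127);
    }
    return 0;
}
```

A program would call pti_disable_and_reexec(argv) early in main(),
before dropping the privileges that such an option would presumably
require.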

> > > > This isn't a complaint, I just thought it might be useful
> > > > information. If it helps for anything/anybody, I'm happy to run
> > > > additional benchmarks / provide additional information.
> > > 
> > > > Note that it will depend heavily on the hardware too. Older CPUs
> > > without PCID will be impacted more by the isolation.
> > 
> > Interesting. This CPU has PCID, so it's possible that older hardware
> > may indeed be hit a bit more.
> 
> The post linked above has numbers with nopcid disabling pcid use, and
> indeed, the difference is quite measurable.

I'm going to re-run the tests on an Atom C2518 now, which doesn't have pcid;
I don't even know if it's affected by the issue.

Cheers,
Willy


* Re: Linux 4.15-rc6
  2018-01-04 10:27         ` Willy Tarreau
@ 2018-01-04 11:03           ` Willy Tarreau
  0 siblings, 0 replies; 7+ messages in thread
From: Willy Tarreau @ 2018-01-04 11:03 UTC (permalink / raw)
  To: Andres Freund; +Cc: Linus Torvalds, Linux Kernel Mailing List

On Thu, Jan 04, 2018 at 11:27:04AM +0100, Willy Tarreau wrote:
> > The post linked above has numbers with nopcid disabling pcid use, and
> > indeed, the difference is quite measurable.
> 
> I'm going to re-run the tests on an Atom C2518 now, which doesn't have pcid;
> I don't even know if it's affected by the issue.

FWIW the performance loss on the Atom is smaller (probably in
part because other parts are slower and the bottlenecks are spread across
all the hardware). I'm observing "only" 11-14% loss on the same tests.

Willy

