* Linux 4.15-rc7 @ 2018-01-07 22:55 Linus Torvalds 2018-01-08 7:20 ` Thomas Gleixner 2018-01-10 23:32 ` Pavel Machek 0 siblings, 2 replies; 20+ messages in thread From: Linus Torvalds @ 2018-01-07 22:55 UTC (permalink / raw) To: Linux Kernel Mailing List Ok, we had an interesting week, and by now everybody knows why we were merging all those odd x86 page table isolation patches without following all of the normal release timing rules. But rc7 itself is actually pretty calm. Yes, there were a few small follow-up patches to the PTI code still, and yes, there's been a fair amount of discussion about the exact details of the Spectre fixes, but at least in general things have been nice and calm. And we're actually back to "normal" in that most of the patches are drivers (mainly GPU, some crypto, some random small things - input layer, platform drivers etc). There are misc small filesystem and arch updates too. The appended shortlog is small enough that it's easy to just scroll down and get a feel for what happened. The one thing I want to do now that Meltdown and Spectre are public, is to give a *big* shout-out to the x86 people, and Thomas Gleixner in particular for really being on top of this. It's been one huge annoyance, and honestly, Thomas really went over and beyond in this whole mess. A lot of other people have obviously been involved too, don't get me wrong, but this is exactly the kind of issue that easily results in lots of nasty hacky patches because people are falling all over themselves trying to fix it and they can't even talk about why they are doing it in public, and Thomas &co ended up being a huge reason for why it was all much easier for me to merge: because of the literally _months_ of work on quality control and gating these patches and making sure the end result was a clean and manageable series. So a big *BIG* thanks to Thomas for making it so much easier for me to merge all this stuff. 
The whole nasty TLB isolation patches would have been just _so_ much more horrible without him. Anyway, due to this all, 4.15 will obviously be one of the releases with an rc8, even if things are starting to really calm down by now. We'll see, hopefully we won't need any more than that. Linus --- Aaron Ma (1): Input: elantech - add new icbody type 15 Al Viro (2): sget(): handle failures of register_shrinker() fix "netfilter: xt_bpf: Fix XT_BPF_MODE_FD_PINNED mode of 'xt_bpf_info_v1'" Alejandro Mery (3): ARM: davinci: Use platform_device_register_full() to create pdev for dm365's eDMA ARM: davinci: Add dma_mask to dm365's eDMA device ARM: davinci: fix mmc entries in dm365's dma_slave_map Alexey Brodkin (2): ARC: Fix detection of dual-issue enabled ARC: [plat-hsdk] Switch DisplayLink driver from fbdev to DRM Aliaksei Karaliou (2): xfs: quota: fix missed destroy of qi_tree_lock xfs: quota: check result of register_shrinker() Andrea Arcangeli (1): userfaultfd: clear the vma->vm_userfaultfd_ctx if UFFD_EVENT_FORK fails Andrew Morton (1): kernel/exit.c: export abort() to modules Andrey Ryabinin (1): x86/mm: Set MODULES_END to 0xffffffffff000000 Anshuman Khandual (1): mm/mprotect: add a cond_resched() inside change_pmd_range() Anthony Kim (1): Input: hideep - fix compile error due to missing include file Antoine Tenart (3): crypto: inside-secure - free requests even if their handling failed crypto: inside-secure - fix request allocations in invalidation path crypto: inside-secure - do not use areq->result for partial results Ard Biesheuvel (1): efi/capsule-loader: Reinstate virtual capsule mapping Arnd Bergmann (3): ARM: dts: ls1021a: fix incorrect clock references ARM: dts: tango4: remove bogus interrupt-controller property crypto: chelsio - select CRYPTO_GF128MUL Baoquan He (1): mm/sparse.c: wrong allocation for mem_section Bogdan Mirea (2): arm64: dts: renesas: salvator-x: Remove renesas, no-ether-link property arm64: dts: renesas: ulcb: Remove renesas, no-ether-link 
property Boris Brezillon (1): mtd: nand: pxa3xx: Fix READOOB implementation Chen-Yu Tsai (1): ARM: dts: sunxi: Convert to CCU index macros for HDMI controller Chris Mason (1): btrfs: fix refcount_t usage when deleting btrfs_delayed_nodes Christian Borntraeger (2): KVM: s390: fix cmma migration for multiple memory slots KVM: s390: prevent buffer overrun on memory hotplug during migration Dan Carpenter (1): afs: Potential uninitialized variable in afs_extract_data() Darrick J. Wong (1): xfs: fix s_maxbytes overflow problems Dave Young (2): x86/efi: Fix kernel param add_efi_memmap regression mm: check pfn_valid first in zero_resv_unavail David Howells (3): fscache: Fix the default for fscache_maybe_release_page() afs: Fix unlink afs: Fix missing error handling in afs_write_end() David Lechner (1): ARM: dts: da850-lego-ev3: Fix battery voltage gpio David Woodhouse (1): x86/alternatives: Add missing '\n' at end of ALTERNATIVE inline asm Dhinakaran Pandiyan (1): drm/i915/psr: Fix register name mess up. Dmitry Torokhov (1): Input: elants_i2c - do not clobber interrupt trigger on x86 Eric Biggers (3): crypto: chacha20poly1305 - validate the digest size crypto: pcrypt - fix freeing pcrypt instances capabilities: fix buffer overread on very short xattr Eric W. 
Biederman (1): pid: Handle failure to allocate the first pid in a pid namespace Eugeniy Paltsev (4): ARC: [plat-hsdk]: Set initial core pll output frequency ARC: [plat-hsdk]: Get rid of core pll frequency set in platform code ARC: [plat-axs103]: Set initial core pll output frequency ARC: [plat-axs103] refactor the quad core DT quirk code Hans Verkuil (1): omapdrm/dss/hdmi4_cec: fix interrupt handling Heiko Carstens (1): s390/sclp: disable FORTIFY_SOURCE for early sclp code Heiko Stuebner (3): ARM: dts: rockchip: add cpu0-regulator on rk3066a-marsboard arm64: dts: rockchip: fix trailing 0 in rk3328 tsadc interrupts arm64: dts: rockchip: limit rk3328-rock64 gmac speed to 100MBit for now Helge Deller (6): parisc: Show unhashed hardware inventory parisc: Show initial kernel memory layout unhashed parisc: Show unhashed HPA of Dino chip parisc: Show unhashed EISA EEPROM address parisc: Fix alignment of pa_tlb_lock in assembly on 32-bit SMP kernel parisc: qemu idle sleep support Icenowy Zheng (1): arm64: allwinner: a64: add Ethernet PHY regulator for several boards Jacek Anaszewski (1): leds: core: Fix regression caused by commit 2b83ff96f51d Jagan Teki (1): arm64: allwinner: a64-sopine: Fix to use dcdc1 regulator instead of vcc3v3 James Hogan (1): lib/mpi: Fix umul_ppmm() for MIPS64r6 Jan Engelhardt (1): crypto: n2 - cure use after free Javier Martinez Canillas (1): ARM: dts: exynos: Enable Mixer node for Exynos5800 Peach Pi machine Jean-Philippe Brucker (1): iommu/arm-smmu-v3: Don't free page table ops twice Jeffy Chen (1): mailmap: update Mark Yao's email address Jim Mattson (1): kvm: vmx: Scrub hardware GPRs at VM-exit Joel Stanley (1): ARM: dts: aspeed-g4: Correct VUART IRQ number John Johansen (1): apparmor: fix regression in mount mediation when feature set is pinned John Sperbeck (1): powerpc/mm: Fix SEGV on mapped region to return SEGV_ACCERR Jonathan Cameron (1): crypto: af_alg - Fix race around ctx->rcvused by making it atomic_t Josh Poimboeuf (2): 
x86/dumpstack: Fix partial register dumps x86/dumpstack: Print registers for first stack frame Kees Cook (1): exec: Weaken dumpability for secureexec Klaus Goger (1): arm64: dts: rockchip: remove vdd_log from rk3399-puma Linus Torvalds (1): Linux 4.15-rc7 Lucas De Marchi (1): drm/i915: Apply Display WA #1183 on skl, kbl, and cfl Markus Heiser (1): docs: fix, intel_guc_loader.c has been moved to intel_guc_fw.c Martin Schwidefsky (1): s390: fix preemption race in disable_sacf_uaccess Masahiro Yamada (1): arm64: dts: uniphier: fix gpio-ranges property of PXs3 SoC Matt Fleming (1): MAINTAINERS: Remove Matt Fleming as EFI co-maintainer Matthew Wilcox (1): mm/debug.c: provide useful debugging information for VM_BUG Maxime Ripard (1): ARM: dts: sun8i: a711: Reinstate the PMIC compatible Nick Desaulniers (1): x86/process: Define cpu_tss_rw in same section as declaration Nikolay Borisov (1): btrfs: Fix flush bio leak Ofer Heifetz (1): crypto: inside-secure - per request invalidation Oleg Nesterov (1): kernel/acct.c: fix the acct->needcheck check in check_free_space() Oleksandr Andrushchenko (1): Input: xen-kbdfront - do not advertise multi-touch pressure support Olof Johansson (1): Input: joystick/analog - riscv has get_cycles() Peter Rosin (1): ARM: dts: at91: disable the nxp,se97b SMBUS timeout on the TSE-850 Peter Zijlstra (1): x86/events/intel/ds: Use the proper cache flush method for mapping ds buffers Randy Dunlap (1): documentation/gpu/i915: fix docs build error after file rename Rob Herring (1): ARM: dts: rockchip: fix rk3288 iep-IOMMU interrupts property cells Robin Murphy (1): iommu/arm-smmu-v3: Cope with duplicated Stream IDs Russell King (5): drm/armada: fix leak of crtc structure drm/armada: fix SRAM powerdown drm/armada: fix UV swap code drm/armada: improve efficiency of armada_drm_plane_calc_addrs() drm/armada: fix YUV planar format framebuffer offsets Sebastian Ott (1): s390/pci: handle insufficient resources during dma tlb flush Sergey Matyukevich (1): 
arm64: dts: orange-pi-zero-plus2: fix sdcard detect Sergey Senozhatsky (2): arc: do not use __print_symbol() mm/zsmalloc.c: include fs.h Sinan Kaya (1): mfd: rtsx: Release IRQ during shutdown Stefan Brüns (1): sunxi-rsb: Include OF based modalias in device uevent Stefan Haberland (1): s390/dasd: fix wrongly assigned configuration data Tetsuo Handa (1): mm,vmscan: Make unregister_shrinker() no-op if register_shrinker() failed. Thomas Gleixner (7): x86/pti: Enable PTI by default x86/pti: Make sure the user/kernel PTEs match x86/pti: Switch to kernel CR3 at early in entry_SYSCALL_compat() x86/mm: Map cpu_entry_area at the same place on 4/5 level x86/kaslr: Fix the vaddr_end mess x86/tlb: Drop the _GPL from the cpu_tlbstate export x86/pti: Rename BUG_CPU_INSECURE to BUG_CPU_MELTDOWN Tom Lendacky (1): x86/cpu, x86/pti: Do not enable PTI on AMD processors Ville Syrjälä (2): drm/i915: Disable DC states around GMBUS on GLK drm/i915: Put all non-blocking modesets onto an ordered wq Vineet Gupta (3): ARC: uaccess: dont use "l" gcc inline asm constraint modifier ARC: handle gcc generated __builtin_trap() ARC: handle gcc generated __builtin_trap for older compiler Wei Yongjun (1): xen/pvcalls: use GFP_ATOMIC under spin lock Xiongwei Song (1): drm/ttm: check the return value of kzalloc Yue Hin Lau (1): drm/amd/display: call set csc_default if enable adjustment is false Zhen Lei (1): Input: ims-pcu - fix typo in the error message ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Linux 4.15-rc7
From: Thomas Gleixner @ 2018-01-08 7:20 UTC
To: Linus Torvalds
Cc: Linux Kernel Mailing List

Linus,

On Sun, 7 Jan 2018, Linus Torvalds wrote:
> The one thing I want to do now that Meltdown and Spectre are public,
> is to give a *big* shout-out to the x86 people, and Thomas Gleixner in
> particular for really being on top of this. It's been one huge
> annoyance, and honestly, Thomas really went over and beyond in this
> whole mess. A lot of other people have obviously been involved too,
> don't get me wrong, but this is exactly the kind of issue that easily
> results in lots of nasty hacky patches because people are falling all
> over themselves trying to fix it and they can't even talk about why
> they are doing it in public, and Thomas &co ended up being a huge
> reason for why it was all much easier for me to merge: because of the
> literally _months_ of work on quality control and gating these patches
> and making sure the end result was a clean and manageable series.
>
> So a big *BIG* thanks to Thomas for making it so much easier for me to
> merge all this stuff. The whole nasty TLB isolation patches would
> have been just _so_ much more horrible without him.

I'm deeply moved and feel a little ashamed as without the help of others
this wouldn't have been possible at all. So it's on me to hand over the
*BIG* thanks to:

  Ingo Molnar who was the git logistics mastermind behind this, the last
  sanity check before commit and the initial stress tester. Thanks
  especially for taking over most of the regular tip maintenance workload.

  Andy Lutomirski for the great work on the entry code, cpu entry area,
  LDT mapping and the PCID rework and his reviews.

  Borislav Petkov for his meticulous reviews, his help with all AMD issues
  and being always on standby for testing and debugging despite his
  workload of backporting KAISER to dead kernels.

  Peter Zijlstra for his work on the tlb flush / PCID code, reviews and
  supporting me on the short trip into LDT VMA mapping which we had to
  drop for various reasons.

  Dave Hansen who did the initial KAISER port and helped all along with
  the rework in various ways.

  Josh Poimboeuf for fixing up all the stacktrace issues we encountered.

  Juergen Gross for helping out on the XEN side of things so we did not
  have to dig into the innards of XEN/PV.

  Kees Cook for helping with coordination behind the scenes.

  Greg Kroah-Hartman for not pestering us with all the pre-4.14 backports
  and the smooth integration and exposure to 4.14 stable which gave us
  more test coverage and helped us to iron out the inevitable hiccups.

  Linus for keeping his diving harpoon in the cabinet and giving us great
  support for getting this into his tree on time which allowed 4.14 to
  gain all the goods as well.

  The team at TU Graz who did the initial KAISER implementation. I'm
  really impressed by what kernel first timers can achieve and I have to
  say that I see worse code in my daily work as a maintainer. Congrats to
  them for their findings in the guts of our CPUs as well. Really
  impressive!

This list is surely not complete, so I extend the thanks to everyone who
helped with review, patches, testing, bug reports and regression hunting.

I want to take the opportunity to say thanks to my wife Monika, my family
and my great team @linutronix for bearing with the extraordinarily grumpy
old greybeard which I certainly was for the past two months.

It's been an interesting challenge to sort that out in such a short
timeframe, but I'm sure all of the involved people would have preferred to
do this with the head start which at least one other OS got on this.

But it's not time yet for a post-mortem of this mess, we still have to sort
out the Spectre mitigations and it seems Linus expects me to keep my hand
on things for the near future. Aye, aye, captain!

Let's sort this in a technical manner, with the security of our users in
mind and then take a break and after that sit down and gain the performance
back which we lost on the way. Lots of work ahead.

Thanks,

	Thomas
* Re: Linux 4.15-rc7 2018-01-07 22:55 Linux 4.15-rc7 Linus Torvalds 2018-01-08 7:20 ` Thomas Gleixner @ 2018-01-10 23:32 ` Pavel Machek 2018-01-11 11:29 ` Olivier Galibert 2018-01-11 14:07 ` Jiri Kosina 1 sibling, 2 replies; 20+ messages in thread From: Pavel Machek @ 2018-01-10 23:32 UTC (permalink / raw) To: Linus Torvalds; +Cc: Linux Kernel Mailing List, jikos [-- Attachment #1: Type: text/plain, Size: 972 bytes --] Hi! > The one thing I want to do now that Meltdown and Spectre are public, > is to give a *big* shout-out to the x86 people, and Thomas Gleixner in > particular for really being on top of this. It's been one huge > annoyance, and honestly, Thomas really went over and beyond in this > whole mess. A lot of other people have obviously been involved too, As I understand it: KPTI prevents Meltdown attack on x86-64, but Spectre means even x86-64 is not expected to be safe? Ok, so Meltdown is public... And I still have some nice 32-bit machines I'd like to keep working. Proof of concept is out, https://github.com/IAIK/meltdown/ . Is anyone working on KPTI for x86-32? SLES11 should still be supported, and that should have x86-32 version; any chance SUSE can share some patches? Thanks, Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html [-- Attachment #2: Digital signature --] [-- Type: application/pgp-signature, Size: 181 bytes --] ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Linux 4.15-rc7 2018-01-10 23:32 ` Pavel Machek @ 2018-01-11 11:29 ` Olivier Galibert 2018-01-11 14:06 ` Nikolay Borisov 2018-01-12 11:06 ` Pavel Machek 2018-01-11 14:07 ` Jiri Kosina 1 sibling, 2 replies; 20+ messages in thread From: Olivier Galibert @ 2018-01-11 11:29 UTC (permalink / raw) To: Pavel Machek; +Cc: Linus Torvalds, Linux Kernel Mailing List, jikos Wasn't/Isn't the 4G/4G memory layout for 32 bits essentially KPTI? OG. On Thu, Jan 11, 2018 at 12:32 AM, Pavel Machek <pavel@ucw.cz> wrote: > Hi! > >> The one thing I want to do now that Meltdown and Spectre are public, >> is to give a *big* shout-out to the x86 people, and Thomas Gleixner in >> particular for really being on top of this. It's been one huge >> annoyance, and honestly, Thomas really went over and beyond in this >> whole mess. A lot of other people have obviously been involved too, > > As I understand it: KPTI prevents Meltdown attack on x86-64, but > Spectre means even x86-64 is not expected to be safe? > > Ok, so Meltdown is public... And I still have some nice 32-bit > machines I'd like to keep working. > > Proof of concept is out, https://github.com/IAIK/meltdown/ . > > Is anyone working on KPTI for x86-32? SLES11 should still be > supported, and that should have x86-32 version; any chance SUSE can > share some patches? > > Thanks, > Pavel > -- > (english) http://www.livejournal.com/~pavelmachek > (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Linux 4.15-rc7 2018-01-11 11:29 ` Olivier Galibert @ 2018-01-11 14:06 ` Nikolay Borisov 2018-01-12 11:06 ` Pavel Machek 1 sibling, 0 replies; 20+ messages in thread From: Nikolay Borisov @ 2018-01-11 14:06 UTC (permalink / raw) To: Olivier Galibert, Pavel Machek Cc: Linus Torvalds, Linux Kernel Mailing List, jikos On 11.01.2018 13:29, Olivier Galibert wrote: > Wasn't/Isn't the 4G/4G memory layout for 32 bits essentially KPTI? 4g/4g was never accepted upstream > > OG. > > > On Thu, Jan 11, 2018 at 12:32 AM, Pavel Machek <pavel@ucw.cz> wrote: >> Hi! >> >>> The one thing I want to do now that Meltdown and Spectre are public, >>> is to give a *big* shout-out to the x86 people, and Thomas Gleixner in >>> particular for really being on top of this. It's been one huge >>> annoyance, and honestly, Thomas really went over and beyond in this >>> whole mess. A lot of other people have obviously been involved too, >> >> As I understand it: KPTI prevents Meltdown attack on x86-64, but >> Spectre means even x86-64 is not expected to be safe? >> >> Ok, so Meltdown is public... And I still have some nice 32-bit >> machines I'd like to keep working. >> >> Proof of concept is out, https://github.com/IAIK/meltdown/ . >> >> Is anyone working on KPTI for x86-32? SLES11 should still be >> supported, and that should have x86-32 version; any chance SUSE can >> share some patches? >> >> Thanks, >> Pavel >> -- >> (english) http://www.livejournal.com/~pavelmachek >> (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Linux 4.15-rc7
From: Pavel Machek @ 2018-01-12 11:06 UTC
To: Olivier Galibert
Cc: Linus Torvalds, Linux Kernel Mailing List, jikos

Hi!

> Wasn't/Isn't the 4G/4G memory layout for 32 bits essentially KPTI?

Good point. Is that still supported? Was it ever?

Umm. I seem to recall that 4G/4G layout was out of tree but never
merged.

High Memory Support
  1. off (NOHIGHMEM)
  2. 4GB (HIGHMEM4G)
> 3. 64GB (HIGHMEM64G)
choice[1-3]: 3
Memory split
> 1. 3G/1G user/kernel split (VMSPLIT_3G) (NEW)
  2. 2G/2G user/kernel split (VMSPLIT_2G)
  3. 1G/3G user/kernel split (VMSPLIT_1G)
choice[1-3?]:

Does anyone have recent patches?

Best regards,
								Pavel

> On Thu, Jan 11, 2018 at 12:32 AM, Pavel Machek <pavel@ucw.cz> wrote:
> > Hi!
> >
> >> The one thing I want to do now that Meltdown and Spectre are public,
> >> is to give a *big* shout-out to the x86 people, and Thomas Gleixner in
> >> particular for really being on top of this. It's been one huge
> >> annoyance, and honestly, Thomas really went over and beyond in this
> >> whole mess. A lot of other people have obviously been involved too,
> >
> > As I understand it: KPTI prevents Meltdown attack on x86-64, but
> > Spectre means even x86-64 is not expected to be safe?
> >
> > Ok, so Meltdown is public... And I still have some nice 32-bit
> > machines I'd like to keep working.
> >
> > Proof of concept is out, https://github.com/IAIK/meltdown/ .
> >
> > Is anyone working on KPTI for x86-32? SLES11 should still be
> > supported, and that should have x86-32 version; any chance SUSE can
> > share some patches?
> >
> > Thanks,
> > Pavel
> > --
> > (english) http://www.livejournal.com/~pavelmachek
> > (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
* Re: Linux 4.15-rc7
From: Arnd Bergmann @ 2018-01-12 13:23 UTC
To: Pavel Machek
Cc: Olivier Galibert, Linus Torvalds, Linux Kernel Mailing List, jikos

On Fri, Jan 12, 2018 at 12:06 PM, Pavel Machek <pavel@ucw.cz> wrote:
> Hi!
>
>> Wasn't/Isn't the 4G/4G memory layout for 32 bits essentially KPTI?
>
> Good point. Is that still supported? Was it ever?
>
> Umm. I seem to recall that 4G/4G layout was out of tree but never
> merged.

I think that's correct: it was in RHEL3 and RHEL4 but never merged
upstream.

However, there is an important difference between KPTI and X86_4G:
The former unmaps the kernel pages from the user space page tables,
but keeps both the linear mapping and the user pages visible in
kernel mode, while the latter must have also unmapped user space
pages from kernel mode, requiring a more expensive get_user/put_user
implementation.

Kees mentioned an idea to also unmap user pages from kernel
mode as an additional safeguard on top of KPTI, which would get
it even closer to the X86_4G implementation:
https://outflux.net/blog/archives/2018/01/04/smep-emulation-in-pti/

Could you be more specific which 32-bit x86 chips you have that are
affected by Meltdown? Do you mean pre-2004 Pentiums or Core-Duo
laptops? I would guess that Cyrix/Natsemi/AMD 6x86/MediaGX/Geode
and AMD NexGen K6/K7 also affected by Spectre but probably not
Meltdown, and most other 32-bit microarchitectures seem to be purely
in-order.

       Arnd
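The difference between KPTI and the 4G/4G layout described in the message
above can be written down as a small lookup table. This is only a toy model
in Python, not kernel code: the region names, the table and both helper
functions are made up for illustration, and real KPTI keeps a small entry
trampoline mapped in user mode that is left out here.

```python
# Toy model: which address ranges stay mapped in each CPU mode.
# Names and structure are illustrative assumptions, not kernel interfaces.

USER = "user pages"
KERNEL = "kernel pages (incl. linear mapping)"

VISIBLE = {
    # Classic layout: kernel pages are mapped in user page tables and
    # protected only by the supervisor permission bit.
    "no isolation": {"user mode": {USER, KERNEL}, "kernel mode": {USER, KERNEL}},
    # KPTI: kernel pages are unmapped while running user code, but the
    # kernel still sees the user pages.
    "KPTI": {"user mode": {USER}, "kernel mode": {USER, KERNEL}},
    # 4G/4G: each side sees only its own pages.
    "4G/4G": {"user mode": {USER}, "kernel mode": {KERNEL}},
}


def kernel_mapped_while_in_user_mode(scheme):
    """True if kernel pages are present in the page tables used by user code."""
    return KERNEL in VISIBLE[scheme]["user mode"]


def kernel_can_deref_user_pointer(scheme):
    """True if the kernel can read user memory directly in kernel mode."""
    return USER in VISIBLE[scheme]["kernel mode"]
```

In this model only "no isolation" leaves kernel pages mapped during user
execution, and only "4G/4G" stops the kernel from dereferencing user
pointers directly, which is where the more expensive get_user/put_user
comes from.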
* Re: Linux 4.15-rc7
From: Pavel Machek @ 2018-01-12 14:43 UTC
To: Arnd Bergmann
Cc: Olivier Galibert, Linus Torvalds, Linux Kernel Mailing List, jikos

Hi!

> >> Wasn't/Isn't the 4G/4G memory layout for 32 bits essentially KPTI?
> >
> > Good point. Is that still supported? Was it ever?
> >
> > Umm. I seem to recall that 4G/4G layout was out of tree but never
> > merged.
>
> I think that's correct: it was in RHEL3 and RHEL4 but never merged
> upstream.

Too bad.

> However, there is an important difference between KPTI and X86_4G:
> The former unmaps the kernel pages from the user space page tables,
> but keeps both the linear mapping and the user pages visible in
> kernel mode, while the latter must have also unmapped user space
> pages from kernel mode, requiring a more expensive get_user/put_user
> implementation.
>
> Kees mentioned an idea to also unmap user pages from kernel
> mode as an additional safeguard on top of KPTI, which would get
> it even closer to the X86_4G implementation:
> https://outflux.net/blog/archives/2018/01/04/smep-emulation-in-pti/

Well, I guess at this point I'm looking for a good place to start
from...

> Could you be more specific which 32-bit x86 chips you have that are
> affected by Meltdown? Do you mean pre-2004 Pentiums or Core-Duo
> laptops? I would guess that Cyrix/Natsemi/AMD 6x86/MediaGX/Geode
> and AMD NexGen K6/K7 also affected by Spectre but probably not
> Meltdown, and most other 32-bit microarchitectures seem to be purely
> in-order.

I do have a Core Solo here that I'd like to keep working (and useful for
web browsing). Then there's a Pentium M. Occasionally I run 32-bit kernels
on modern machines for testing.

Thanks,
								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
* Re: Linux 4.15-rc7 2018-01-12 13:23 ` Arnd Bergmann 2018-01-12 14:43 ` Pavel Machek @ 2018-01-12 17:20 ` vcaputo 2018-01-12 20:11 ` Arnd Bergmann 2018-01-12 17:34 ` Linus Torvalds 2 siblings, 1 reply; 20+ messages in thread From: vcaputo @ 2018-01-12 17:20 UTC (permalink / raw) To: Arnd Bergmann Cc: Pavel Machek, Olivier Galibert, Linus Torvalds, Linux Kernel Mailing List, jikos On Fri, Jan 12, 2018 at 02:23:20PM +0100, Arnd Bergmann wrote: > On Fri, Jan 12, 2018 at 12:06 PM, Pavel Machek <pavel@ucw.cz> wrote: > > Hi! > > > >> Wasn't/Isn't the 4G/4G memory layout for 32 bits essentially KPTI? > > > > Good point. Is that still supported? Was it ever? > > > > Umm. I seem to recall that 4G/4G layout was out of tree but never > > merged. > > I think that's correct: it was in RHEL3 and RHEL4 but never merged > upstream. > > However, there is an important difference between KPTI and X86_4G: > The former unmaps the kernel pages from the user space page tables, > but keeps both the linear mapping and the user pages visible in > kernel mode, while the latter must have also unmapped user space > pages from kernel mode, requiring a more expensive get_user/put_user > implementation. > > Kees mentioned an idea to also unmap user pages from kernel > mode as an additional safeguard on top of KPTI, which would get > it even closer to the X86_4G implementation: > https://outflux.net/blog/archives/2018/01/04/smep-emulation-in-pti/ > > Could you be more specific which 32-bit x86 chips you have that are > affected by Meltdown? Do you mean pre-2004 Pentiums or Core-Duo > laptops? I would guess that Cyrix/Natsemi/AMD 6x86/MediaGX/Geode > and AMD NexGen K6/K7 also affected by Spectre but probably not > Meltdown, and most other 32-bit microarchitectures seem to be purely > in-order. > I have some Celeron D, 4GiB dedicated servers with a 32-bit stack. They've proven to be very reliable boxes, and are the most affordable baremetal x86 machines I've found. 
I'd appreciate a PTI implementation on them. Thanks, Vito Caputo ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Linux 4.15-rc7
From: Arnd Bergmann @ 2018-01-12 20:11 UTC
To: vcaputo
Cc: Pavel Machek, Olivier Galibert, Linus Torvalds, Linux Kernel Mailing List, jikos

On Fri, Jan 12, 2018 at 6:20 PM, <vcaputo@pengaru.com> wrote:
> On Fri, Jan 12, 2018 at 02:23:20PM +0100, Arnd Bergmann wrote:

>> Could you be more specific which 32-bit x86 chips you have that are
>> affected by Meltdown? Do you mean pre-2004 Pentiums or Core-Duo
>> laptops? I would guess that Cyrix/Natsemi/AMD 6x86/MediaGX/Geode
>> and AMD NexGen K6/K7 also affected by Spectre but probably not
>> Meltdown, and most other 32-bit microarchitectures seem to be purely
>> in-order.
>>
>
> I have some Celeron D, 4GiB dedicated servers with a 32-bit stack.
> They've proven to be very reliable boxes, and are the most affordable
> baremetal x86 machines I've found. I'd appreciate a PTI implementation
> on them.

That's an interesting setup for a number of reasons:

- Celeron D are mostly 64-bit CPUs, but it depends on the particular
  model/stepping, so if you have a couple of them, you might be able to
  avoid the Meltdown bug by running a 64-bit kernel with KPTI at least on
  some of them, or trivially replace the CPU on others. This usually
  works without changing user space, and tends to result in a faster
  system than running a 32-bit kernel as you avoid highmem.

- I haven't found a definite answer on whether Netburst-based CPUs
  are affected by Meltdown at all. Some people claim they are affected,
  others say they are not. If the code from https://github.com/IAIK/meltdown
  is successful on your Celeron D, then we know it's affected, if not,
  then you could decide to not care about KPTI (Spectre would still
  be an issue).

- A 32-bit system running with mostly highmem (only the low 768 MB
  out of 4GB are directly mapped) means some of the exploits are
  harder to do in practice, as most of the page cache is not visible
  in the kernel, and reading data from other processes will fail more
  often than succeed.

- Economically, it seems barely worth running these if you pay for
  the electricity: the CPU costs a few dollars/euros, it only takes
  a couple of weeks of continuous operation to exceed that in
  operating cost. Replacing the mainboard with a modern low-end
  all-in-one board at 10W might pay off within a year. If you don't pay
  for electricity, that obviously doesn't work.

       Arnd
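Whether a particular Celeron D can run a 64-bit kernel, as suggested in the
first point above, can be read from the CPU feature flags: on x86 Linux the
"lm" (long mode) flag in /proc/cpuinfo marks a 64-bit capable CPU, and a
32-bit kernel still reports it. A minimal sketch in Python, assuming an x86
machine whose /proc/cpuinfo has a "flags" line; the function name is made up
for illustration:

```python
def supports_long_mode(cpuinfo_text):
    """Return True if a 'flags' line in /proc/cpuinfo-style text lists 'lm'."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Compare whole tokens so that flags merely containing "lm"
        # as a substring do not count.
        if sep and key.strip() == "flags" and "lm" in value.split():
            return True
    return False
```

On the machine in question this would be called as
supports_long_mode(open("/proc/cpuinfo").read()). A True result only says
the CPU can enter 64-bit mode, not whether it is affected by Meltdown.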
* Re: Linux 4.15-rc7
  2018-01-12 20:11 ` Arnd Bergmann
@ 2018-01-12 22:04 ` vcaputo
  2018-01-12 22:08 ` Arnd Bergmann
  0 siblings, 1 reply; 20+ messages in thread
From: vcaputo @ 2018-01-12 22:04 UTC
To: Arnd Bergmann
Cc: Pavel Machek, Olivier Galibert, Linus Torvalds, Linux Kernel Mailing List, jikos

On Fri, Jan 12, 2018 at 09:11:38PM +0100, Arnd Bergmann wrote:
> On Fri, Jan 12, 2018 at 6:20 PM, <vcaputo@pengaru.com> wrote:
> > On Fri, Jan 12, 2018 at 02:23:20PM +0100, Arnd Bergmann wrote:
> >
> >> Could you be more specific which 32-bit x86 chips you have that are
> >> affected by Meltdown? Do you mean pre-2004 Pentiums or Core-Duo
> >> laptops? I would guess that Cyrix/Natsemi/AMD 6x86/MediaGX/Geode
> >> and AMD NexGen K6/K7 are also affected by Spectre but probably not
> >> Meltdown, and most other 32-bit microarchitectures seem to be purely
> >> in-order.
> >>
> >
> > I have some Celeron D, 4GiB dedicated servers with a 32-bit stack.
> > They've proven to be very reliable boxes, and are the most affordable
> > baremetal x86 machines I've found. I'd appreciate a PTI implementation
> > on them.
>
> That's an interesting setup for a number of reasons:
>
> - Celeron D are mostly 64-bit CPUs, but it depends on the particular
>   model/stepping, so if you have a couple of them, you might be able to
>   avoid the meltdown bug by running a 64-bit kernel with KPTI at least on
>   some of them, or trivially replace the CPU on others. This usually
>   works without changing user space, and tends to result in a faster
>   system than running a 32-bit kernel as you avoid highmem.
>

This may be possible, I'll need to try booting an x86_64 kernel on one
and see. I would rather not change all of userspace.

> - I haven't found a definite answer on whether Netburst-based CPUs
>   are affected by meltdown at all. Some people claim it's affected,
>   others say it's not. If the code from https://github.com/IAIK/meltdown
>   is successful on your Celeron D, then we know it's affected, if not,
>   then you could decide to not care about KPTI (Spectre would still
>   be an issue).
>

I tried that when the code was first made public, but libkdump doesn't
support 32-bit; it's full of 64-bit register use in the assembly bits.

> - A 32-bit system running with mostly highmem (only the low 768 MB
>   out of 4GB are directly mapped) means some of the exploits are
>   harder to do in practice, as most of the page cache is not visible
>   in the kernel, and reading data from other processes will fail more
>   often than succeed.
>

Well that's good news.

> - Economically, it seems barely worth running these if you pay for
>   the electricity: the CPU costs a few dollars/euros, it only takes
>   a couple of weeks of continuous operation to exceed that in
>   operating cost. Replacing the mainboard with a modern low end
>   all-in-one board at 10W might pay off within a year. If you don't pay
>   for electricity, that obviously doesn't work.
>

I don't pay for the electricity, these are old dedicated servers hosted
by a third party. Not my hardware, and any more modern dedicated x86
servers I've found are substantially more expensive and always SMP.

This particular hosting provider has tried selling me upgrades to their
current low-end offering (which is still SMP), the price basically
doubles.

These boxes are mostly idle, performing just personal email and ssh
duties. For this situation reliability and security are the priority,
power efficiency and performance are not.

Thanks,
Vito Caputo
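Whether a given Celeron D can boot the x86_64 kernel discussed above
comes down to whether the CPU advertises long mode, which Linux reports
as the "lm" token in the flags line of /proc/cpuinfo. The following is a
minimal sketch of that check, not anything posted in the thread: the
function name flags_have_long_mode() is made up for illustration, and it
only parses a flags string that the caller has already read.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper (not from the thread): return true if a
 * whitespace-separated CPU flags string contains the exact token "lm"
 * (long mode, i.e. the CPU can run a 64-bit kernel).
 * Tokens such as "lahf_lm" must not match.
 */
bool flags_have_long_mode(const char *flags)
{
	const char *p = flags;

	assert(flags != NULL);

	while (*p) {
		const char *start;

		/* Skip separators between tokens. */
		while (*p == ' ' || *p == '\t' || *p == '\n')
			p++;

		/* Find the end of the current token. */
		start = p;
		while (*p && *p != ' ' && *p != '\t' && *p != '\n')
			p++;

		if (p - start == 2 && strncmp(start, "lm", 2) == 0)
			return true;
	}
	return false;
}
```

A caller would read the "flags" line from /proc/cpuinfo and pass the
text after the colon. Matching whole tokens rather than substrings is
the point of the loop, since "lahf_lm" appears on many CPUs that also
lack or have long mode independently.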
* Re: Linux 4.15-rc7
  2018-01-12 22:04 ` vcaputo
@ 2018-01-12 22:08 ` Arnd Bergmann
  2018-01-12 22:58 ` vcaputo
  0 siblings, 1 reply; 20+ messages in thread
From: Arnd Bergmann @ 2018-01-12 22:08 UTC
To: vcaputo
Cc: Pavel Machek, Olivier Galibert, Linus Torvalds, Linux Kernel Mailing List, jikos

On Fri, Jan 12, 2018 at 11:04 PM, <vcaputo@pengaru.com> wrote:
> On Fri, Jan 12, 2018 at 09:11:38PM +0100, Arnd Bergmann wrote:
>
>> - I haven't found a definite answer on whether Netburst-based CPUs
>>   are affected by meltdown at all. Some people claim it's affected,
>>   others say it's not. If the code from https://github.com/IAIK/meltdown
>>   is successful on your Celeron D, then we know it's affected, if not,
>>   then you could decide to not care about KPTI (Spectre would still
>>   be an issue).
>>
>
> I tried that when the code was first made public, but libkdump doesn't
> support 32-bit; it's full of 64-bit register use in the assembly bits.

Apparently 32-bit support was added on Wednesday, maybe you
can try again with today's version.

      Arnd
* Re: Linux 4.15-rc7
  2018-01-12 22:08 ` Arnd Bergmann
@ 2018-01-12 22:58 ` vcaputo
  0 siblings, 0 replies; 20+ messages in thread
From: vcaputo @ 2018-01-12 22:58 UTC
To: Arnd Bergmann
Cc: Pavel Machek, Olivier Galibert, Linus Torvalds, Linux Kernel Mailing List, jikos

On Fri, Jan 12, 2018 at 11:08:58PM +0100, Arnd Bergmann wrote:
> On Fri, Jan 12, 2018 at 11:04 PM, <vcaputo@pengaru.com> wrote:
> > On Fri, Jan 12, 2018 at 09:11:38PM +0100, Arnd Bergmann wrote:
> >
> >> - I haven't found a definite answer on whether Netburst-based CPUs
> >>   are affected by meltdown at all. Some people claim it's affected,
> >>   others say it's not. If the code from https://github.com/IAIK/meltdown
> >>   is successful on your Celeron D, then we know it's affected, if not,
> >>   then you could decide to not care about KPTI (Spectre would still
> >>   be an issue).
> >>
> >
> > I tried that when the code was first made public, but libkdump doesn't
> > support 32-bit; it's full of 64-bit register use in the assembly bits.
>
> Apparently 32-bit support was added on Wednesday, maybe you
> can try again with today's version.
>

Thanks for informing me of this, I hadn't noticed.

I just tried it out, and confirmed the Celeron D is vulnerable to
meltdown.

Regards,
Vito Caputo
* Re: Linux 4.15-rc7
  2018-01-12 13:23 ` Arnd Bergmann
  2018-01-12 14:43 ` Pavel Machek
  2018-01-12 17:20 ` vcaputo
@ 2018-01-12 17:34 ` Linus Torvalds
  2018-01-12 19:38 ` Pavel Machek
  2 siblings, 1 reply; 20+ messages in thread
From: Linus Torvalds @ 2018-01-12 17:34 UTC
To: Arnd Bergmann
Cc: Pavel Machek, Olivier Galibert, Linux Kernel Mailing List, jikos

On Fri, Jan 12, 2018 at 5:23 AM, Arnd Bergmann <arnd@arndb.de> wrote:
>
> However, there is an important difference between KPTI and X86_4G:
> The former unmaps the kernel pages from the user space page tables,
> but keeps both the linear mapping and the user pages visible in
> kernel mode, while the latter must have also unmapped user space
> pages from kernel mode, requiring a more expensive get_user/put_user
> implementation.

Indeed. And I think that the 4G:4G patches do things wrong.

People are already complaining about the PTI costs. Separating user
space entirely is much much worse, and makes all user accesses from
kernel space too painful for words.

Honestly, I didn't merge the old 4G:4G patches originally, and I'm not
going to merge them this time around either.

      Linus
* Re: Linux 4.15-rc7
  2018-01-12 17:34 ` Linus Torvalds
@ 2018-01-12 19:38 ` Pavel Machek
  2018-01-12 19:44 ` Linus Torvalds
  0 siblings, 1 reply; 20+ messages in thread
From: Pavel Machek @ 2018-01-12 19:38 UTC
To: Linus Torvalds
Cc: Arnd Bergmann, Olivier Galibert, Linux Kernel Mailing List, jikos

On Fri 2018-01-12 09:34:03, Linus Torvalds wrote:
> On Fri, Jan 12, 2018 at 5:23 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> >
> > However, there is an important difference between KPTI and X86_4G:
> > The former unmaps the kernel pages from the user space page tables,
> > but keeps both the linear mapping and the user pages visible in
> > kernel mode, while the latter must have also unmapped user space
> > pages from kernel mode, requiring a more expensive get_user/put_user
> > implementation.
>
> Indeed. And I think that the 4G:4G patches do things wrong.

Yeah. But if there's a copy around for something recent, I'd still like
to see it.

> People are already complaining about the PTI costs. Separating user
> space entirely is much much worse, and makes all user accesses from
> kernel space too painful for words.
>
> Honestly, I didn't merge the old 4G:4G patches originally, and I'm not
> going to merge them this time around either.

I'll try to do the right thing. OTOH... I don't like the fact that
kernel memory on my machine is currently readable, probably even from
javascript.

I tried disabling CPU caches. Just like that, off, boom. My system
will not survive that, and it looks like a 100x slowdown.

So a 2x slowdown would be an improvement (and 4G:4G can probably do
better than that).

									Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
* Re: Linux 4.15-rc7
  2018-01-12 19:38 ` Pavel Machek
@ 2018-01-12 19:44 ` Linus Torvalds
  2018-01-12 20:41 ` Pavel Machek
  2018-01-13 12:52 ` kernel page table isolation for x86-32 was " Pavel Machek
  0 siblings, 2 replies; 20+ messages in thread
From: Linus Torvalds @ 2018-01-12 19:44 UTC
To: Pavel Machek
Cc: Arnd Bergmann, Olivier Galibert, Linux Kernel Mailing List, jikos

On Fri, Jan 12, 2018 at 11:38 AM, Pavel Machek <pavel@ucw.cz> wrote:
>
> I'll try to do the right thing. OTOH... I don't like the fact that
> kernel memory on my machine is currently readable, probably even from
> javascript.

Oh, absolutely. I'm just saying that it's probably best to try to
start from the x86-64 KPTI model, and see how that works for x86-32.

Maybe some of the 4G:4G entry code could come in handy as a "these are
the issues" kind of thing.

> I tried disabling CPU caches. Just like that, off, boom. My system
> will not survive that, and it looks like 100x slowdown.

Yeah, no. That is not a realistic thing to do on any hardware since
the PPro, I'm afraid.

      Linus
* Re: Linux 4.15-rc7
  2018-01-12 19:44 ` Linus Torvalds
@ 2018-01-12 20:41 ` Pavel Machek
  2018-01-13 12:52 ` kernel page table isolation for x86-32 was " Pavel Machek
  1 sibling, 0 replies; 20+ messages in thread
From: Pavel Machek @ 2018-01-12 20:41 UTC
To: Linus Torvalds
Cc: Arnd Bergmann, Olivier Galibert, Linux Kernel Mailing List, jikos

On Fri 2018-01-12 11:44:48, Linus Torvalds wrote:
> On Fri, Jan 12, 2018 at 11:38 AM, Pavel Machek <pavel@ucw.cz> wrote:
> >
> > I'll try to do the right thing. OTOH... I don't like the fact that
> > kernel memory on my machine is currently readable, probably even from
> > javascript.
>
> Oh, absolutely. I'm just saying that it's probably best to try to
> start from the x86-64 KPTI model, and see how that works for x86-32.
>
> Maybe some of the 4G:4G entry code could come in handy as a "these are
> the issues" kind of thing.

Ok, so I do have the diff that compiles, and it is 300 lines. Those
will be extremely tricky 300 lines, but...

> > I tried disabling CPU caches. Just like that, off, boom. My system
> > will not survive that, and it looks like 100x slowdown.
>
> Yeah, no. That is not a realistic thing to do on any hardware since
> the PPro, I'm afraid.

What is special about PPro?

Well -- cache off kind of is what I want -- kills Spectre _and_
Meltdown ;-), attacking close to the fundamental issue. And it really
should be doable on a UP system, right?

I guess I should re-try with plain VGA console, not framebuffer.

									Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
* kernel page table isolation for x86-32 was Re: Linux 4.15-rc7 2018-01-12 19:44 ` Linus Torvalds 2018-01-12 20:41 ` Pavel Machek @ 2018-01-13 12:52 ` Pavel Machek 1 sibling, 0 replies; 20+ messages in thread From: Pavel Machek @ 2018-01-13 12:52 UTC (permalink / raw) To: Linus Torvalds, vcaputo Cc: Arnd Bergmann, Olivier Galibert, Linux Kernel Mailing List, jikos [-- Attachment #1: Type: text/plain, Size: 14802 bytes --] Hi! > > I'll try to do the right thing. OTOH... I don't like the fact that > > kernel memory on my machine is currently readable, probably even from > > javascript. > > Oh, absolutely. I'm just saying that it's probably best to try to > start from the x86-64 KPTI model, and see how that works for x86-32. Ok, it should not be too bad. Here's something... getting it to compile should be easy, getting it to work might be trickier. Not sure what needs to be done for the LDT. Pavel diff --git a/Documentation/x86/pti.txt b/Documentation/x86/pti.txt index d11eff6..e13e1e5 100644 --- a/Documentation/x86/pti.txt +++ b/Documentation/x86/pti.txt @@ -124,7 +124,7 @@ Possible Future Work boot-time switching. Testing -======== +======= To test stability of PTI, the following test procedure is recommended, ideally doing all of these in parallel: diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h index 45a63e0..b0485cc 100644 --- a/arch/x86/entry/calling.h +++ b/arch/x86/entry/calling.h @@ -332,6 +332,99 @@ For 32-bit we have the following conventions - kernel is built with #endif +#else + +/* + * x86-32 kernel page table isolation. + */ +#ifdef CONFIG_PAGE_TABLE_ISOLATION + +/* + * PAGE_TABLE_ISOLATION PGDs are 8k. 
Flip bit 12 to switch between the two + * halves: + */ +#define PTI_SWITCH_PGTABLES_MASK (1<<PAGE_SHIFT) +#define PTI_SWITCH_MASK (PTI_SWITCH_PGTABLES_MASK|(1<<X86_CR3_PTI_SWITCH_BIT)) + +.macro ADJUST_KERNEL_CR3 reg:req + /* Clear PCID and "PAGE_TABLE_ISOLATION bit", point CR3 at kernel pagetables: */ + andl $(~PTI_SWITCH_MASK), \reg +.endm + +.macro SWITCH_TO_KERNEL_CR3 scratch_reg:req + ALTERNATIVE "jmp .Lend_\@", "", X86_FEATURE_PTI + movl %cr3, \scratch_reg + ADJUST_KERNEL_CR3 \scratch_reg + movl \scratch_reg, %cr3 +.Lend_\@: +.endm + +.macro SWITCH_TO_USER_CR3_NOSTACK scratch_reg:req scratch_reg2:req + ALTERNATIVE "jmp .Lend_\@", "", X86_FEATURE_PTI + mov %cr3, \scratch_reg + +.Lwrcr3_\@: + /* Flip the PGD and ASID to the user version */ + orl $(PTI_SWITCH_MASK), \scratch_reg + mov \scratch_reg, %cr3 +.Lend_\@: +.endm + +.macro SWITCH_TO_USER_CR3_STACK scratch_reg:req + pushl %eax + SWITCH_TO_USER_CR3_NOSTACK scratch_reg=\scratch_reg scratch_reg2=%eax + popl %eax +.endm + +.macro SAVE_AND_SWITCH_TO_KERNEL_CR3 scratch_reg:req save_reg:req + ALTERNATIVE "jmp .Ldone_\@", "", X86_FEATURE_PTI + movl %cr3, \scratch_reg + movl \scratch_reg, \save_reg + /* + * Is the "switch mask" all zero? That means that both of + * these are zero: + * + * 1. The user/kernel PCID bit, and + * 2. The user/kernel "bit" that points CR3 to the + * bottom half of the 8k PGD + * + * That indicates a kernel CR3 value, not a user CR3. + */ + testl $(PTI_SWITCH_MASK), \scratch_reg + jz .Ldone_\@ + + ADJUST_KERNEL_CR3 \scratch_reg + movl \scratch_reg, %cr3 + +.Ldone_\@: +.endm + +.macro RESTORE_CR3 scratch_reg:req save_reg:req + ALTERNATIVE "jmp .Lend_\@", "", X86_FEATURE_PTI + + /* + * The CR3 write could be avoided when not changing its value, + * but would require a CR3 read *and* a scratch register. 
+ */ + movl \save_reg, %cr3 +.Lend_\@: +.endm + +#else /* CONFIG_PAGE_TABLE_ISOLATION=n: */ + +.macro SWITCH_TO_KERNEL_CR3 scratch_reg:req +.endm +.macro SWITCH_TO_USER_CR3_NOSTACK scratch_reg:req scratch_reg2:req +.endm +.macro SWITCH_TO_USER_CR3_STACK scratch_reg:req +.endm +.macro SAVE_AND_SWITCH_TO_KERNEL_CR3 scratch_reg:req save_reg:req +.endm +.macro RESTORE_CR3 scratch_reg:req save_reg:req +.endm + +#endif + #endif /* CONFIG_X86_64 */ /* diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S index d2ef7f32..be8759b 100644 --- a/arch/x86/entry/entry_32.S +++ b/arch/x86/entry/entry_32.S @@ -46,6 +46,8 @@ #include <asm/frame.h> #include <asm/nospec-branch.h> +#include "calling.h" + .section .entry.text, "ax" /* @@ -428,6 +430,7 @@ ENTRY(entry_SYSENTER_32) pushl $0 /* pt_regs->ip = 0 (placeholder) */ pushl %eax /* pt_regs->orig_ax */ SAVE_ALL pt_regs_ax=$-ENOSYS /* save rest */ + SAVE_AND_SWITCH_TO_KERNEL_CR3 scratch_reg=%edx save_reg=%ecx /* * SYSENTER doesn't filter flags, so we need to clear NT, AC @@ -464,6 +467,7 @@ ENTRY(entry_SYSENTER_32) ALTERNATIVE "testl %eax, %eax; jz .Lsyscall_32_done", \ "jmp .Lsyscall_32_done", X86_FEATURE_XENPV + RESTORE_CR3 scratch_reg=%edx save_reg=%ecx /* Opportunistic SYSEXIT */ TRACE_IRQS_ON /* User mode traces as IRQs on. 
*/ movl PT_EIP(%esp), %edx /* pt_regs->ip */ @@ -539,6 +543,7 @@ ENTRY(entry_INT80_32) ASM_CLAC pushl %eax /* pt_regs->orig_ax */ SAVE_ALL pt_regs_ax=$-ENOSYS /* save rest */ + SAVE_AND_SWITCH_TO_KERNEL_CR3 scratch_reg=%edx save_reg=%ecx /* * User mode is traced as though IRQs are on, and the interrupt gate @@ -552,6 +557,7 @@ ENTRY(entry_INT80_32) restore_all: TRACE_IRQS_IRET + RESTORE_CR3 scratch_reg=%eax save_reg=%ecx .Lrestore_all_notrace: #ifdef CONFIG_X86_ESPFIX32 ALTERNATIVE "jmp .Lrestore_nocheck", "", X86_BUG_ESPFIX diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S index e51b65a..a87fb89 100644 --- a/arch/x86/entry/entry_64.S +++ b/arch/x86/entry/entry_64.S @@ -4,7 +4,7 @@ * * Copyright (C) 1991, 1992 Linus Torvalds * Copyright (C) 2000, 2001, 2002 Andi Kleen SuSE Labs - * Copyright (C) 2000 Pavel Machek <pavel@suse.cz> + * Copyright (C) 2000 Pavel Machek SuSE Labs * * entry.S contains the system-call and fault low-level handling routines. * diff --git a/arch/x86/include/asm/pgtable_32.h b/arch/x86/include/asm/pgtable_32.h index e67c062..1b36e56 100644 --- a/arch/x86/include/asm/pgtable_32.h +++ b/arch/x86/include/asm/pgtable_32.h @@ -107,4 +107,78 @@ do { \ */ #define LOWMEM_PAGES ((((2<<31) - __PAGE_OFFSET) >> PAGE_SHIFT)) +#ifdef CONFIG_PAGE_TABLE_ISOLATION +/* + * All top-level PAGE_TABLE_ISOLATION page tables are order-1 pages + * (8k-aligned and 8k in size). The kernel one is at the beginning 4k and + * the user one is in the last 4k. To switch between them, you + * just need to flip the 12th bit in their addresses. + */ +#define PTI_PGTABLE_SWITCH_BIT PAGE_SHIFT + +#ifndef __ASSEMBLY__ +/* + * This generates better code than the inline assembly in + * __set_bit(). 
+ */ +static inline void *ptr_set_bit(void *ptr, int bit) +{ + unsigned long __ptr = (unsigned long)ptr; + + __ptr |= BIT(bit); + return (void *)__ptr; +} +static inline void *ptr_clear_bit(void *ptr, int bit) +{ + unsigned long __ptr = (unsigned long)ptr; + + __ptr &= ~BIT(bit); + return (void *)__ptr; +} + +static inline pgd_t *kernel_to_user_pgdp(pgd_t *pgdp) +{ + return ptr_set_bit(pgdp, PTI_PGTABLE_SWITCH_BIT); +} + +static inline pgd_t *user_to_kernel_pgdp(pgd_t *pgdp) +{ + return ptr_clear_bit(pgdp, PTI_PGTABLE_SWITCH_BIT); +} + +static inline p4d_t *kernel_to_user_p4dp(p4d_t *p4dp) +{ + return ptr_set_bit(p4dp, PTI_PGTABLE_SWITCH_BIT); +} + +static inline p4d_t *user_to_kernel_p4dp(p4d_t *p4dp) +{ + return ptr_clear_bit(p4dp, PTI_PGTABLE_SWITCH_BIT); +} +#endif +#endif /* CONFIG_PAGE_TABLE_ISOLATION */ + +#ifndef __ASSEMBLY__ +#ifdef CONFIG_PAGE_TABLE_ISOLATION +pgd_t __pti_set_user_pgd(pgd_t *pgdp, pgd_t pgd); + +/* + * Take a PGD location (pgdp) and a pgd value that needs to be set there. + * Populates the user and returns the resulting PGD that must be set in + * the kernel copy of the page tables. 
+ */ +static inline pgd_t pti_set_user_pgd(pgd_t *pgdp, pgd_t pgd) +{ + if (!static_cpu_has(X86_FEATURE_PTI)) + return pgd; + return __pti_set_user_pgd(pgdp, pgd); +} +#else +static inline pgd_t pti_set_user_pgd(pgd_t *pgdp, pgd_t pgd) +{ + return pgd; +} +#endif +#endif + #endif /* _ASM_X86_PGTABLE_32_H */ diff --git a/arch/x86/include/asm/pgtable_32_types.h b/arch/x86/include/asm/pgtable_32_types.h index ce245b0..804fc33 100644 --- a/arch/x86/include/asm/pgtable_32_types.h +++ b/arch/x86/include/asm/pgtable_32_types.h @@ -62,4 +62,6 @@ extern bool __vmalloc_start_set; /* set once high_memory is set */ #define MAXMEM (VMALLOC_END - PAGE_OFFSET - __VMALLOC_RESERVE) +#define LDT_BASE_ADDR 0 /* FIXME */ + #endif /* _ASM_X86_PGTABLE_32_DEFS_H */ diff --git a/arch/x86/include/asm/processor-flags.h b/arch/x86/include/asm/processor-flags.h index 6a60fea..8f1cf71 100644 --- a/arch/x86/include/asm/processor-flags.h +++ b/arch/x86/include/asm/processor-flags.h @@ -39,10 +39,6 @@ #define CR3_PCID_MASK 0xFFFull #define CR3_NOFLUSH BIT_ULL(63) -#ifdef CONFIG_PAGE_TABLE_ISOLATION -# define X86_CR3_PTI_SWITCH_BIT 11 -#endif - #else /* * CR3_ADDR_MASK needs at least bits 31:5 set on PAE systems, and we save @@ -53,4 +49,8 @@ #define CR3_NOFLUSH 0 #endif +#ifdef CONFIG_PAGE_TABLE_ISOLATION +# define X86_CR3_PTI_SWITCH_BIT 11 +#endif + #endif /* _ASM_X86_PROCESSOR_FLAGS_H */ diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c index dbaf14d..849e073 100644 --- a/arch/x86/kernel/alternative.c +++ b/arch/x86/kernel/alternative.c @@ -1,5 +1,7 @@ #define pr_fmt(fmt) "SMP alternatives: " fmt +#define DEBUG + #include <linux/module.h> #include <linux/sched.h> #include <linux/mutex.h> @@ -28,7 +30,7 @@ EXPORT_SYMBOL_GPL(alternatives_patched); #define MAX_PATCH_LEN (255-1) -static int __initdata_or_module debug_alternative; +static int __initdata_or_module debug_alternative = 1; static int __init debug_alt(char *str) { @@ -60,7 +62,7 @@ __setup("noreplace-paravirt", 
setup_noreplace_paravirt); #define DPRINTK(fmt, args...) \ do { \ if (debug_alternative) \ - printk(KERN_DEBUG "%s: " fmt "\n", __func__, ##args); \ + printk( "%s: " fmt "\n", __func__, ##args); \ } while (0) #define DUMP_BYTES(buf, len, fmt, args...) \ @@ -71,7 +73,7 @@ do { \ if (!(len)) \ break; \ \ - printk(KERN_DEBUG fmt, ##args); \ + printk( fmt, ##args); \ for (j = 0; j < (len) - 1; j++) \ printk(KERN_CONT "%02hhx ", buf[j]); \ printk(KERN_CONT "%02hhx\n", buf[j]); \ @@ -373,6 +375,8 @@ void __init_or_module noinline apply_alternatives(struct alt_instr *start, u8 *instr, *replacement; u8 insnbuf[MAX_PATCH_LEN]; + printk("apply_alternatives: entering\n"); + DPRINTK("alt table %p -> %p", start, end); /* * The scan order should be from start to end. A later scanned diff --git a/arch/x86/kernel/head_32.S b/arch/x86/kernel/head_32.S index c290209..002ffaf 100644 --- a/arch/x86/kernel/head_32.S +++ b/arch/x86/kernel/head_32.S @@ -505,6 +505,31 @@ __INITDATA GLOBAL(early_recursion_flag) .long 0 +#define NEXT_PAGE(name) \ + .balign PAGE_SIZE; \ +GLOBAL(name) + +#ifdef CONFIG_PAGE_TABLE_ISOLATION +/* + * Each PGD needs to be 8k long and 8k aligned. We do not + * ever go out to userspace with these, so we do not + * strictly *need* the second page, but this allows us to + * have a single set_pgd() implementation that does not + * need to worry about whether it has 4k or 8k to work + * with. 
+ * + * This ensures PGDs are 8k long: + */ +#define PTI_USER_PGD_FILL 1024 +/* This ensures they are 8k-aligned: */ +#define NEXT_PGD_PAGE(name) \ + .balign 2 * PAGE_SIZE; \ +GLOBAL(name) +#else +#define NEXT_PGD_PAGE(name) NEXT_PAGE(name) +#define PTI_USER_PGD_FILL 0 +#endif + __REFDATA .align 4 ENTRY(initial_code) @@ -516,24 +541,26 @@ ENTRY(setup_once_ref) * BSS section */ __PAGE_ALIGNED_BSS - .align PAGE_SIZE #ifdef CONFIG_X86_PAE -.globl initial_pg_pmd +NEXT_PGD_PAGE(initial_pg_pmd) initial_pg_pmd: .fill 1024*KPMDS,4,0 + .fill PTI_USER_PGD_FILL,4,0 #else -.globl initial_page_table +NEXT_PGD_PAGE(initial_page_table) initial_page_table: .fill 1024,4,0 + .fill PTI_USER_PGD_FILL,4,0 #endif initial_pg_fixmap: .fill 1024,4,0 .globl empty_zero_page empty_zero_page: .fill 4096,1,0 -.globl swapper_pg_dir +NEXT_PGD_PAGE(swapper_pg_dir) swapper_pg_dir: .fill 1024,4,0 + .fill PTI_USER_PGD_FILL,4,0 EXPORT_SYMBOL(empty_zero_page) /* diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S index 04a625f..57f5cd4 100644 --- a/arch/x86/kernel/head_64.S +++ b/arch/x86/kernel/head_64.S @@ -3,7 +3,7 @@ * linux/arch/x86/kernel/head_64.S -- start in 32bit and switch to 64bit * * Copyright (C) 2000 Andrea Arcangeli <andrea@suse.de> SuSE - * Copyright (C) 2000 Pavel Machek <pavel@suse.cz> + * Copyright (C) 2000 Pavel Machek * Copyright (C) 2000 Karsten Keil <kkeil@suse.de> * Copyright (C) 2001,2002 Andi Kleen <ak@suse.de> * Copyright (C) 2005 Eric Biederman <ebiederm@xmission.com> diff --git a/arch/x86/mm/dump_pagetables.c b/arch/x86/mm/dump_pagetables.c index 2a4849e..896b53b 100644 --- a/arch/x86/mm/dump_pagetables.c +++ b/arch/x86/mm/dump_pagetables.c @@ -543,7 +543,11 @@ EXPORT_SYMBOL_GPL(ptdump_walk_pgd_level_debugfs); static void ptdump_walk_user_pgd_level_checkwx(void) { #ifdef CONFIG_PAGE_TABLE_ISOLATION +#ifdef CONFIG_X86_64 pgd_t *pgd = (pgd_t *) &init_top_pgt; +#else + pgd_t *pgd = swapper_pg_dir; +#endif if (!static_cpu_has(X86_FEATURE_PTI)) return; diff --git 
a/arch/x86/mm/pti.c b/arch/x86/mm/pti.c index ce38f16..029b5c8 100644 --- a/arch/x86/mm/pti.c +++ b/arch/x86/mm/pti.c @@ -113,8 +113,11 @@ pgd_t __pti_set_user_pgd(pgd_t *pgdp, pgd_t pgd) * Top-level entries added to init_mm's usermode pgd after boot * will not be automatically propagated to other mms. */ +#ifdef X86_64 + /* FIXME? */ if (!pgdp_maps_userspace(pgdp)) return pgd; +#endif /* * The user page tables get the full PGD, accessible from @@ -166,7 +169,9 @@ static __init p4d_t *pti_user_pagetable_walk_p4d(unsigned long address) set_pgd(pgd, __pgd(_KERNPG_TABLE | __pa(new_p4d_page))); } +#ifdef X86_64 BUILD_BUG_ON(pgd_large(*pgd) != 0); +#endif return p4d_offset(pgd, address); } diff --git a/security/Kconfig b/security/Kconfig index 1f96e19..ad77de4 100644 --- a/security/Kconfig +++ b/security/Kconfig @@ -57,7 +57,7 @@ config SECURITY_NETWORK config PAGE_TABLE_ISOLATION bool "Remove the kernel mapping in user mode" default y - depends on X86_64 && !UML + depends on (X86_32 || X86_64) && !UML help This feature reduces the number of hardware side channels by ensuring that the majority of kernel addresses are not mapped -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html [-- Attachment #2: Digital signature --] [-- Type: application/pgp-signature, Size: 181 bytes --] ^ permalink raw reply related [flat|nested] 20+ messages in thread
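The patch above relies on two pieces of bit arithmetic: the kernel and
user PGDs live in one 8k-aligned pair, so flipping bit 12 of the address
moves between them, and the entry macros set or clear PTI_SWITCH_MASK
(bits 12 and 11) in CR3. The following is a rough user-space model of
that arithmetic only, assuming 4k pages and 32-bit physical addresses.
The constants are copied from the patch; the function names
(kernel_to_user_pgd(), cr3_to_user() and friends) are illustrative and
operate on plain integers, not on real page tables or the real CR3.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Constants as defined in the patch; 4k pages are assumed. */
#define PAGE_SHIFT              12
#define PTI_PGTABLE_SWITCH_BIT  PAGE_SHIFT
#define X86_CR3_PTI_SWITCH_BIT  11
#define PTI_SWITCH_MASK \
	((UINT32_C(1) << PAGE_SHIFT) | (UINT32_C(1) << X86_CR3_PTI_SWITCH_BIT))

/* Kernel PGD is the first 4k of the 8k pair, user PGD the second 4k. */
static inline uint32_t kernel_to_user_pgd(uint32_t kernel_pgd)
{
	/* Without 8k alignment, bit 12 would be part of the address. */
	assert((kernel_pgd & ((UINT32_C(2) << PAGE_SHIFT) - 1)) == 0);
	return kernel_pgd | (UINT32_C(1) << PTI_PGTABLE_SWITCH_BIT);
}

static inline uint32_t user_to_kernel_pgd(uint32_t user_pgd)
{
	return user_pgd & ~(UINT32_C(1) << PTI_PGTABLE_SWITCH_BIT);
}

/* Models ADJUST_KERNEL_CR3: clear both switch bits. */
static inline uint32_t cr3_to_kernel(uint32_t cr3)
{
	return cr3 & ~PTI_SWITCH_MASK;
}

/* Models the orl in SWITCH_TO_USER_CR3_NOSTACK: set both switch bits. */
static inline uint32_t cr3_to_user(uint32_t cr3)
{
	return cr3 | PTI_SWITCH_MASK;
}

/* Models the testl in SAVE_AND_SWITCH_TO_KERNEL_CR3. */
static inline bool cr3_is_kernel(uint32_t cr3)
{
	return (cr3 & PTI_SWITCH_MASK) == 0;
}
```

With a pair at 0x01234000, the user PGD sits at 0x01235000 and the
combined mask is 0x1800. This is why head_32.S in the patch pads
swapper_pg_dir and initial_page_table to 8k and aligns them with
.balign 2 * PAGE_SIZE: a PGD that is only 4k-aligned would make the
bit-12 flip point at unrelated memory.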
* Re: Linux 4.15-rc7
  2018-01-10 23:32 ` Pavel Machek
  2018-01-11 11:29 ` Olivier Galibert
@ 2018-01-11 14:07 ` Jiri Kosina
  2018-01-19 10:28 ` Pavel Machek
  1 sibling, 1 reply; 20+ messages in thread
From: Jiri Kosina @ 2018-01-11 14:07 UTC
To: Pavel Machek; +Cc: Linus Torvalds, Linux Kernel Mailing List

On Thu, 11 Jan 2018, Pavel Machek wrote:

> Is anyone working on KPTI for x86-32? SLES11 should still be supported,
> and that should have x86-32 version; any chance SUSE can share some
> patches?

We are sharing sources of all our kernels at

	http://kernel.suse.com/

If you can find the x86-32 support there, it's yours (hint: you won't).

Otherwise, you'd either have to wait until we (or someone else) implements
it (it's on our list), or implement it yourself.

Thanks,

--
Jiri Kosina
SUSE Labs
* Re: Linux 4.15-rc7
  2018-01-11 14:07 ` Jiri Kosina
@ 2018-01-19 10:28 ` Pavel Machek
  0 siblings, 0 replies; 20+ messages in thread
From: Pavel Machek @ 2018-01-19 10:28 UTC
To: Jiri Kosina; +Cc: Linus Torvalds, Linux Kernel Mailing List

On Thu 2018-01-11 15:07:22, Jiri Kosina wrote:
> On Thu, 11 Jan 2018, Pavel Machek wrote:
>
> > Is anyone working on KPTI for x86-32? SLES11 should still be supported,
> > and that should have x86-32 version; any chance SUSE can share some
> > patches?
>
> We are sharing sources of all our kernels at
>
> 	http://kernel.suse.com/
>
> If you can find the x86-32 support there, it's yours (hint: you won't).
>
> Otherwise, you'd either have to wait until we (or someone else) implements
> it (it's on our list), or implement it yourself.

Hmm. Seems Joerg Roedel from SUSE sent an implementation after all. And
it should boot; mine does not yet. Let me do some testing...

									Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
end of thread, other threads:[~2018-01-19 10:28 UTC | newest]

Thread overview: 20+ messages
2018-01-07 22:55 Linux 4.15-rc7 Linus Torvalds
2018-01-08  7:20 ` Thomas Gleixner
2018-01-10 23:32 ` Pavel Machek
2018-01-11 11:29 ` Olivier Galibert
2018-01-11 14:06 ` Nikolay Borisov
2018-01-12 11:06 ` Pavel Machek
2018-01-12 13:23 ` Arnd Bergmann
2018-01-12 14:43 ` Pavel Machek
2018-01-12 17:20 ` vcaputo
2018-01-12 20:11 ` Arnd Bergmann
2018-01-12 22:04 ` vcaputo
2018-01-12 22:08 ` Arnd Bergmann
2018-01-12 22:58 ` vcaputo
2018-01-12 17:34 ` Linus Torvalds
2018-01-12 19:38 ` Pavel Machek
2018-01-12 19:44 ` Linus Torvalds
2018-01-12 20:41 ` Pavel Machek
2018-01-13 12:52 ` kernel page table isolation for x86-32 was " Pavel Machek
2018-01-11 14:07 ` Jiri Kosina
2018-01-19 10:28 ` Pavel Machek