linux-kernel.vger.kernel.org archive mirror
* Re: [RFC PATCH v9 02/13] x86: always set IF before oopsing from page fault
       [not found]         ` <20190404154727.GA14030@cisco>
@ 2019-04-04 16:23           ` Sebastian Andrzej Siewior
  0 siblings, 0 replies; 24+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-04-04 16:23 UTC (permalink / raw)
  To: Tycho Andersen
  Cc: Andy Lutomirski, Khalid Aziz, iommu, X86 ML, LKML, Linux-MM, Khalid Aziz

- stepping on the del button while browsing through CCs.
On 2019-04-04 09:47:27 [-0600], Tycho Andersen wrote:
> > Hmm.  do_exit() isn't really meant to be "try your best to leave the
> > system somewhat usable without returning" -- it's a function that,
> > other than in OOPSes, is called from a well-defined state.  So I think
> > rewind_stack_do_exit() is probably a better spot.  But we need to
> > rewind the stack and *then* turn on IRQs, since we otherwise risk
> > exploding quite badly.
> 
> Ok, sounds good. I guess we can include something like this patch in
> the next series.

The tracing infrastructure probably doesn't know that the interrupts are
back on. Also if you were holding a spin lock then your preempt count
isn't 0 which means that might_sleep() will trigger a splat (in your
backtrace it was zero).

> Thanks,
> 
> Tycho
Sebastian


* Re: [RFC PATCH v9 00/13] Add support for eXclusive Page Frame Ownership
       [not found] <cover.1554248001.git.khalid.aziz@oracle.com>
       [not found] ` <e6c57f675e5b53d4de266412aa526b7660c47918.1554248002.git.khalid.aziz@oracle.com>
@ 2019-04-04 16:44 ` Nadav Amit
  2019-04-04 17:18   ` Khalid Aziz
       [not found] ` <f1ac3700970365fb979533294774af0b0dd84b3b.1554248002.git.khalid.aziz@oracle.com>
  2 siblings, 1 reply; 24+ messages in thread
From: Nadav Amit @ 2019-04-04 16:44 UTC (permalink / raw)
  To: Khalid Aziz
  Cc: X86 ML, linux-arm-kernel, open list:DOCUMENTATION,
	Linux List Kernel Mailing, Linux-MM, LSM List

> On Apr 3, 2019, at 10:34 AM, Khalid Aziz <khalid.aziz@oracle.com> wrote:
> 
> This is another update to the work Juerg, Tycho and Julian have
> done on XPFO.

Interesting work, but note that it triggers a warning on my system due to a
possible deadlock. It seems that the patch-set disables IRQs in
xpfo_kunmap() and then might flush remote TLBs when a large page is split.
This is wrong, since it might lead to deadlocks.


[  947.262208] WARNING: CPU: 6 PID: 9892 at kernel/smp.c:416 smp_call_function_many+0x92/0x250
[  947.263767] Modules linked in: sb_edac vmw_balloon crct10dif_pclmul crc32_pclmul joydev ghash_clmulni_intel input_leds intel_rapl_perf serio_raw mac_hid sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core vmw_vsock_vmci_transport vsock vmw_vmci iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx libcrc32c xor raid6_pq raid1 raid0 multipath linear hid_generic usbhid hid vmwgfx drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm aesni_intel psmouse aes_x86_64 crypto_simd cryptd glue_helper mptspi vmxnet3 scsi_transport_spi mptscsih ahci mptbase libahci i2c_piix4 pata_acpi
[  947.274649] CPU: 6 PID: 9892 Comm: cc1 Not tainted 5.0.0+ #7
[  947.275804] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/28/2017
[  947.277704] RIP: 0010:smp_call_function_many+0x92/0x250
[  947.278640] Code: 3b 05 66 fc 4e 01 72 26 48 83 c4 10 5b 41 5c 41 5d 41 5e 41 5f 5d c3 8b 05 2b cc 7e 01 85 c0 75 bf 80 3d a8 99 4e 01 00 75 b6 <0f> 0b eb b2 44 89 c7 48 c7 c2 a0 9a 61 aa 4c 89 fe 44 89 45 d0 e8
[  947.281895] RSP: 0000:ffffafe04538f970 EFLAGS: 00010046
[  947.282821] RAX: 0000000000000000 RBX: 0000000000000006 RCX: 0000000000000001
[  947.284084] RDX: 0000000000000000 RSI: ffffffffa9078d70 RDI: ffffffffaa619aa0
[  947.285343] RBP: ffffafe04538f9a8 R08: ffff9d7040000ff0 R09: 0000000000000000
[  947.286596] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffa9078d70
[  947.287855] R13: 0000000000000000 R14: 0000000000000001 R15: ffffffffaa619aa0
[  947.289118] FS:  00007f668b122ac0(0000) GS:ffff9d727fd80000(0000) knlGS:0000000000000000
[  947.290550] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  947.291569] CR2: 00007f6688389004 CR3: 0000000224496006 CR4: 00000000003606e0
[  947.292861] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  947.294125] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  947.295394] Call Trace:
[  947.295854]  ? load_new_mm_cr3+0xe0/0xe0
[  947.296568]  on_each_cpu+0x2d/0x60
[  947.297191]  flush_tlb_all+0x1c/0x20
[  947.297846]  __split_large_page+0x5d9/0x640
[  947.298604]  set_kpte+0xfe/0x260
[  947.299824]  get_page_from_freelist+0x1633/0x1680
[  947.301260]  ? lookup_address+0x2d/0x30
[  947.302550]  ? set_kpte+0x1e1/0x260
[  947.303760]  __alloc_pages_nodemask+0x13f/0x2e0
[  947.305137]  alloc_pages_vma+0x7a/0x1c0
[  947.306378]  wp_page_copy+0x201/0xa30
[  947.307582]  ? generic_file_read_iter+0x96a/0xcf0
[  947.308946]  do_wp_page+0x1cc/0x420
[  947.310086]  __handle_mm_fault+0xc0d/0x1600
[  947.311331]  handle_mm_fault+0xe1/0x210
[  947.312502]  __do_page_fault+0x23a/0x4c0
[  947.313672]  ? _cond_resched+0x19/0x30
[  947.314795]  do_page_fault+0x2e/0xe0
[  947.315878]  ? page_fault+0x8/0x30
[  947.316916]  page_fault+0x1e/0x30
[  947.317930] RIP: 0033:0x76581e
[  947.318893] Code: eb 05 89 d8 48 8d 04 80 48 8d 34 c5 08 00 00 00 48 85 ff 74 04 44 8b 67 04 e8 de 80 08 00 81 e3 ff ff ff 7f 48 89 45 00 8b 10 <44> 89 60 04 81 e2 00 00 00 80 09 da 89 10 c1 ea 18 83 e2 7f 88 50
[  947.323337] RSP: 002b:00007ffde06c0e40 EFLAGS: 00010202
[  947.324663] RAX: 00007f6688389000 RBX: 0000000000000004 RCX: 0000000000000001
[  947.326317] RDX: 0000000000000000 RSI: 0000000001000001 RDI: 0000000000000017
[  947.327973] RBP: 00007f66883882d8 R08: 00000000032e05f0 R09: 00007f668b30e6f0
[  947.329619] R10: 0000000000000002 R11: 00000000032e05f0 R12: 0000000000000000
[  947.331260] R13: 00007f6688388230 R14: 00007f6688388288 R15: 00007f668ac3b0a8
[  947.332911] ---[ end trace 7d605a38c67d83ae ]---
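
The hazard described above can be illustrated with a small user-space model.
This is only a sketch under stated assumptions: struct cpu_model,
cross_call_is_safe() and would_deadlock() are hypothetical names invented for
illustration, not kernel APIs. The model captures one idea: a CPU that has
masked interrupts cannot service an incoming IPI while it waits for other
CPUs to acknowledge its own.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one CPU; an illustration only, not kernel code. */
struct cpu_model {
    bool irqs_disabled;     /* local interrupts are masked */
    bool waiting_for_ack;   /* spinning until a remote CPU handles our IPI */
};

/* A synchronous cross-CPU call is only safe with local IRQs enabled. */
bool cross_call_is_safe(const struct cpu_model *cpu)
{
    assert(cpu);
    return !cpu->irqs_disabled;
}

/*
 * Two CPUs deadlock when each masks IRQs and then waits for the other
 * to acknowledge an IPI: neither can take the interrupt it is owed.
 */
bool would_deadlock(const struct cpu_model *a, const struct cpu_model *b)
{
    assert(a && b);
    return a->irqs_disabled && a->waiting_for_ack &&
           b->irqs_disabled && b->waiting_for_ack;
}
```

In this model, a remote TLB flush issued with IRQs disabled is the unsafe
case: cross_call_is_safe() returns false for that CPU, which is the same
condition the warning in the trace appears to be guarding against.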


* Re: [RFC PATCH v9 00/13] Add support for eXclusive Page Frame Ownership
  2019-04-04 16:44 ` [RFC PATCH v9 00/13] Add support for eXclusive Page Frame Ownership Nadav Amit
@ 2019-04-04 17:18   ` Khalid Aziz
  0 siblings, 0 replies; 24+ messages in thread
From: Khalid Aziz @ 2019-04-04 17:18 UTC (permalink / raw)
  To: Nadav Amit
  Cc: X86 ML, linux-arm-kernel, open list:DOCUMENTATION,
	Linux List Kernel Mailing, Linux-MM, LSM List

On 4/4/19 10:44 AM, Nadav Amit wrote:
>> On Apr 3, 2019, at 10:34 AM, Khalid Aziz <khalid.aziz@oracle.com> wrote:
>>
>> This is another update to the work Juerg, Tycho and Julian have
>> done on XPFO.
> 
> Interesting work, but note that it triggers a warning on my system due to a
> possible deadlock. It seems that the patch-set disables IRQs in
> xpfo_kunmap() and then might flush remote TLBs when a large page is split.
> This is wrong, since it might lead to deadlocks.
> 
> 
> [  947.262208] WARNING: CPU: 6 PID: 9892 at kernel/smp.c:416 smp_call_function_many+0x92/0x250
> [  947.263767] Modules linked in: sb_edac vmw_balloon crct10dif_pclmul crc32_pclmul joydev ghash_clmulni_intel input_leds intel_rapl_perf serio_raw mac_hid sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core vmw_vsock_vmci_transport vsock vmw_vmci iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx libcrc32c xor raid6_pq raid1 raid0 multipath linear hid_generic usbhid hid vmwgfx drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm aesni_intel psmouse aes_x86_64 crypto_simd cryptd glue_helper mptspi vmxnet3 scsi_transport_spi mptscsih ahci mptbase libahci i2c_piix4 pata_acpi
> [  947.274649] CPU: 6 PID: 9892 Comm: cc1 Not tainted 5.0.0+ #7
> [  947.275804] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/28/2017
> [  947.277704] RIP: 0010:smp_call_function_many+0x92/0x250
> [  947.278640] Code: 3b 05 66 fc 4e 01 72 26 48 83 c4 10 5b 41 5c 41 5d 41 5e 41 5f 5d c3 8b 05 2b cc 7e 01 85 c0 75 bf 80 3d a8 99 4e 01 00 75 b6 <0f> 0b eb b2 44 89 c7 48 c7 c2 a0 9a 61 aa 4c 89 fe 44 89 45 d0 e8
> [  947.281895] RSP: 0000:ffffafe04538f970 EFLAGS: 00010046
> [  947.282821] RAX: 0000000000000000 RBX: 0000000000000006 RCX: 0000000000000001
> [  947.284084] RDX: 0000000000000000 RSI: ffffffffa9078d70 RDI: ffffffffaa619aa0
> [  947.285343] RBP: ffffafe04538f9a8 R08: ffff9d7040000ff0 R09: 0000000000000000
> [  947.286596] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffa9078d70
> [  947.287855] R13: 0000000000000000 R14: 0000000000000001 R15: ffffffffaa619aa0
> [  947.289118] FS:  00007f668b122ac0(0000) GS:ffff9d727fd80000(0000) knlGS:0000000000000000
> [  947.290550] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  947.291569] CR2: 00007f6688389004 CR3: 0000000224496006 CR4: 00000000003606e0
> [  947.292861] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [  947.294125] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [  947.295394] Call Trace:
> [  947.295854]  ? load_new_mm_cr3+0xe0/0xe0
> [  947.296568]  on_each_cpu+0x2d/0x60
> [  947.297191]  flush_tlb_all+0x1c/0x20
> [  947.297846]  __split_large_page+0x5d9/0x640
> [  947.298604]  set_kpte+0xfe/0x260
> [  947.299824]  get_page_from_freelist+0x1633/0x1680
> [  947.301260]  ? lookup_address+0x2d/0x30
> [  947.302550]  ? set_kpte+0x1e1/0x260
> [  947.303760]  __alloc_pages_nodemask+0x13f/0x2e0
> [  947.305137]  alloc_pages_vma+0x7a/0x1c0
> [  947.306378]  wp_page_copy+0x201/0xa30
> [  947.307582]  ? generic_file_read_iter+0x96a/0xcf0
> [  947.308946]  do_wp_page+0x1cc/0x420
> [  947.310086]  __handle_mm_fault+0xc0d/0x1600
> [  947.311331]  handle_mm_fault+0xe1/0x210
> [  947.312502]  __do_page_fault+0x23a/0x4c0
> [  947.313672]  ? _cond_resched+0x19/0x30
> [  947.314795]  do_page_fault+0x2e/0xe0
> [  947.315878]  ? page_fault+0x8/0x30
> [  947.316916]  page_fault+0x1e/0x30
> [  947.317930] RIP: 0033:0x76581e
> [  947.318893] Code: eb 05 89 d8 48 8d 04 80 48 8d 34 c5 08 00 00 00 48 85 ff 74 04 44 8b 67 04 e8 de 80 08 00 81 e3 ff ff ff 7f 48 89 45 00 8b 10 <44> 89 60 04 81 e2 00 00 00 80 09 da 89 10 c1 ea 18 83 e2 7f 88 50
> [  947.323337] RSP: 002b:00007ffde06c0e40 EFLAGS: 00010202
> [  947.324663] RAX: 00007f6688389000 RBX: 0000000000000004 RCX: 0000000000000001
> [  947.326317] RDX: 0000000000000000 RSI: 0000000001000001 RDI: 0000000000000017
> [  947.327973] RBP: 00007f66883882d8 R08: 00000000032e05f0 R09: 00007f668b30e6f0
> [  947.329619] R10: 0000000000000002 R11: 00000000032e05f0 R12: 0000000000000000
> [  947.331260] R13: 00007f6688388230 R14: 00007f6688388288 R15: 00007f668ac3b0a8
> [  947.332911] ---[ end trace 7d605a38c67d83ae ]---
> 

Thanks for letting me know. xpfo_kunmap() is not quite right. It will
end up being rewritten for the next version.

--
Khalid



* Re: [RFC PATCH v9 03/13] mm: Add support for eXclusive Page Frame Ownership (XPFO)
       [not found] ` <f1ac3700970365fb979533294774af0b0dd84b3b.1554248002.git.khalid.aziz@oracle.com>
@ 2019-04-17 16:15   ` Ingo Molnar
  2019-04-17 16:49     ` Khalid Aziz
  0 siblings, 1 reply; 24+ messages in thread
From: Ingo Molnar @ 2019-04-17 16:15 UTC (permalink / raw)
  To: Khalid Aziz
  Cc: juergh, tycho, jsteckli, keescook, konrad.wilk, Juerg Haefliger,
	deepa.srinivasan, chris.hyser, tyhicks, dwmw, andrew.cooper3,
	jcm, boris.ostrovsky, iommu, x86, linux-arm-kernel, linux-doc,
	linux-kernel, linux-mm, linux-security-module, Khalid Aziz,
	Linus Torvalds, Andrew Morton, Thomas Gleixner, Andy Lutomirski,
	Peter Zijlstra, Dave Hansen, Borislav Petkov, H. Peter Anvin,
	Arjan van de Ven, Greg Kroah-Hartman


[ Sorry, had to trim the Cc: list from hell. Tried to keep all the 
  mailing lists and all x86 developers. ]

* Khalid Aziz <khalid.aziz@oracle.com> wrote:

> From: Juerg Haefliger <juerg.haefliger@canonical.com>
> 
> This patch adds basic support infrastructure for XPFO which protects 
> against 'ret2dir' kernel attacks. The basic idea is to enforce 
> exclusive ownership of page frames by either the kernel or userspace, 
> unless explicitly requested by the kernel. Whenever a page destined for 
> userspace is allocated, it is unmapped from physmap (the kernel's page 
> table). When such a page is reclaimed from userspace, it is mapped back 
> to physmap. Individual architectures can enable full XPFO support using 
> this infrastructure by supplying architecture specific pieces.

I have a higher level, meta question:

Is there any updated analysis outlining why this XPFO overhead would be 
required on x86-64 kernels running on SMAP/SMEP CPUs which should be all 
recent Intel and AMD CPUs, and with kernels that mark all direct kernel 
mappings as non-executable - which should be all reasonably modern 
kernels later than v4.0 or so?

I.e. the original motivation of the XPFO patches was to prevent execution 
of direct kernel mappings. Is this motivation still present if those 
mappings are non-executable?

(Sorry if this has been asked and answered in previous discussions.)

Thanks,

	Ingo
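
The ownership rule in the quoted patch description can be sketched as a toy
user-space model. Everything here is an assumption made for illustration:
NR_FRAMES, struct frame and the xpfo_model_* helpers are hypothetical names
and do not come from the patch series. The sketch only encodes the stated
invariant that a frame handed to userspace loses its physmap alias and gets
it back when reclaimed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_FRAMES 8  /* hypothetical tiny machine */

enum frame_owner { OWNER_KERNEL, OWNER_USER };

struct frame {
    enum frame_owner owner;
    bool in_physmap;  /* reachable through the kernel direct map? */
};

/* Every frame starts kernel-owned and mapped in physmap. */
void xpfo_model_init(struct frame frames[NR_FRAMES])
{
    for (size_t i = 0; i < NR_FRAMES; i++) {
        frames[i].owner = OWNER_KERNEL;
        frames[i].in_physmap = true;
    }
}

/* Handing a frame to userspace removes its physmap alias. */
void xpfo_model_alloc_user(struct frame frames[NR_FRAMES], size_t pfn)
{
    assert(pfn < NR_FRAMES);
    frames[pfn].owner = OWNER_USER;
    frames[pfn].in_physmap = false;
}

/* Reclaiming the frame from userspace restores the alias. */
void xpfo_model_free_user(struct frame frames[NR_FRAMES], size_t pfn)
{
    assert(pfn < NR_FRAMES);
    frames[pfn].owner = OWNER_KERNEL;
    frames[pfn].in_physmap = true;
}

/* Invariant: no user-owned frame is visible through physmap. */
bool xpfo_model_invariant_holds(const struct frame frames[NR_FRAMES])
{
    for (size_t i = 0; i < NR_FRAMES; i++) {
        if (frames[i].owner == OWNER_USER && frames[i].in_physmap)
            return false;
    }
    return true;
}
```

The model leaves out the "unless explicitly requested by the kernel" case
and all of the real cost (page table updates and TLB flushes), which is what
the overhead question is about.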


* Re: [RFC PATCH v9 03/13] mm: Add support for eXclusive Page Frame Ownership (XPFO)
  2019-04-17 16:15   ` [RFC PATCH v9 03/13] mm: Add support for eXclusive Page Frame Ownership (XPFO) Ingo Molnar
@ 2019-04-17 16:49     ` Khalid Aziz
  2019-04-17 17:09       ` Ingo Molnar
  2019-05-01 14:49       ` Waiman Long
  0 siblings, 2 replies; 24+ messages in thread
From: Khalid Aziz @ 2019-04-17 16:49 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: juergh, tycho, jsteckli, keescook, konrad.wilk, Juerg Haefliger,
	deepa.srinivasan, chris.hyser, tyhicks, dwmw, andrew.cooper3,
	jcm, boris.ostrovsky, iommu, x86, linux-arm-kernel, linux-doc,
	linux-kernel, linux-mm, linux-security-module, Khalid Aziz,
	Linus Torvalds, Andrew Morton, Thomas Gleixner, Andy Lutomirski,
	Peter Zijlstra, Dave Hansen, Borislav Petkov, H. Peter Anvin,
	Arjan van de Ven, Greg Kroah-Hartman

On 4/17/19 10:15 AM, Ingo Molnar wrote:
> 
> [ Sorry, had to trim the Cc: list from hell. Tried to keep all the 
>   mailing lists and all x86 developers. ]
> 
> * Khalid Aziz <khalid.aziz@oracle.com> wrote:
> 
>> From: Juerg Haefliger <juerg.haefliger@canonical.com>
>>
>> This patch adds basic support infrastructure for XPFO which protects 
>> against 'ret2dir' kernel attacks. The basic idea is to enforce 
>> exclusive ownership of page frames by either the kernel or userspace, 
>> unless explicitly requested by the kernel. Whenever a page destined for 
>> userspace is allocated, it is unmapped from physmap (the kernel's page 
>> table). When such a page is reclaimed from userspace, it is mapped back 
>> to physmap. Individual architectures can enable full XPFO support using 
>> this infrastructure by supplying architecture specific pieces.
> 
> I have a higher level, meta question:
> 
> Is there any updated analysis outlining why this XPFO overhead would be 
> required on x86-64 kernels running on SMAP/SMEP CPUs which should be all 
> recent Intel and AMD CPUs, and with kernels that mark all direct kernel 
> mappings as non-executable - which should be all reasonably modern 
> kernels later than v4.0 or so?
> 
> I.e. the original motivation of the XPFO patches was to prevent execution 
> of direct kernel mappings. Is this motivation still present if those 
> mappings are non-executable?
> 
> (Sorry if this has been asked and answered in previous discussions.)

Hi Ingo,

That is a good question. Because of the cost of XPFO, we have to be very
sure we need this protection. The paper from Vasileios, Michalis and
Angelos - <http://www.cs.columbia.edu/~vpk/papers/ret2dir.sec14.pdf>,
does go into how ret2dir attacks can bypass SMAP/SMEP in sections 6.1
and 6.2.

Thanks,
Khalid




* Re: [RFC PATCH v9 03/13] mm: Add support for eXclusive Page Frame Ownership (XPFO)
  2019-04-17 16:49     ` Khalid Aziz
@ 2019-04-17 17:09       ` Ingo Molnar
  2019-04-17 17:19         ` Nadav Amit
  2019-04-17 17:33         ` Khalid Aziz
  2019-05-01 14:49       ` Waiman Long
  1 sibling, 2 replies; 24+ messages in thread
From: Ingo Molnar @ 2019-04-17 17:09 UTC (permalink / raw)
  To: Khalid Aziz
  Cc: juergh, tycho, jsteckli, keescook, konrad.wilk, Juerg Haefliger,
	deepa.srinivasan, chris.hyser, tyhicks, dwmw, andrew.cooper3,
	jcm, boris.ostrovsky, iommu, x86, linux-arm-kernel, linux-doc,
	linux-kernel, linux-mm, linux-security-module, Khalid Aziz,
	Linus Torvalds, Andrew Morton, Thomas Gleixner, Andy Lutomirski,
	Peter Zijlstra, Dave Hansen, Borislav Petkov, H. Peter Anvin,
	Arjan van de Ven, Greg Kroah-Hartman


* Khalid Aziz <khalid.aziz@oracle.com> wrote:

> > I.e. the original motivation of the XPFO patches was to prevent execution 
> > of direct kernel mappings. Is this motivation still present if those 
> > mappings are non-executable?
> > 
> > (Sorry if this has been asked and answered in previous discussions.)
> 
> Hi Ingo,
> 
> That is a good question. Because of the cost of XPFO, we have to be very
> sure we need this protection. The paper from Vasileios, Michalis and
> Angelos - <http://www.cs.columbia.edu/~vpk/papers/ret2dir.sec14.pdf>,
> does go into how ret2dir attacks can bypass SMAP/SMEP in sections 6.1
> and 6.2.

So it would be nice if you could generally summarize external arguments 
when defending a patchset, instead of me having to dig through a PDF 
which not only causes me to spend time that you probably already spent 
reading that PDF, but I might also interpret it incorrectly. ;-)

The PDF you cited says this:

  "Unfortunately, as shown in Table 1, the W^X property is not enforced 
   in many platforms, including x86-64.  In our example, the content of 
   user address 0xBEEF000 is also accessible through kernel address 
   0xFFFF87FF9F080000 as plain, executable code."

Is this actually true of modern x86-64 kernels? We've locked down W^X 
protections in general.

I.e. this conclusion:

  "Therefore, by simply overwriting kfptr with 0xFFFF87FF9F080000 and 
   triggering the kernel to dereference it, an attacker can directly 
   execute shell code with kernel privileges."

... appears to be predicated on imperfect W^X protections on the x86-64 
kernel.

Do such holes exist on the latest x86-64 kernel? If yes, is there a 
reason to believe that these W^X holes cannot be fixed, or that any fix 
would be more expensive than XPFO?

Thanks,

	Ingo
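
The W^X property being asked about can be stated as a predicate on a single
x86-64 page table entry. This is a minimal sketch, assuming the standard
bit layout (bit 0 present, bit 1 writable, bit 63 no-execute) and assuming
NX is enabled in EFER; the PTE_* macros and function names are hypothetical
and chosen for this example, not taken from the kernel headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PTE_PRESENT (UINT64_C(1) << 0)   /* mapping is valid */
#define PTE_RW      (UINT64_C(1) << 1)   /* mapping is writable */
#define PTE_NX      (UINT64_C(1) << 63)  /* mapping is not executable */

/* A present mapping violates W^X if it is writable and executable. */
bool pte_violates_wx(uint64_t pte)
{
    return (pte & PTE_PRESENT) && (pte & PTE_RW) && !(pte & PTE_NX);
}

/* Turn a present mapping into the conservative RW (non-executable) form. */
uint64_t pte_make_rw_nx(uint64_t pte)
{
    assert(pte & PTE_PRESENT);
    return pte | PTE_RW | PTE_NX;
}
```

An RWX direct mapping makes pte_violates_wx() return true, while the same
entry passed through pte_make_rw_nx() does not.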


* Re: [RFC PATCH v9 03/13] mm: Add support for eXclusive Page Frame Ownership (XPFO)
  2019-04-17 17:09       ` Ingo Molnar
@ 2019-04-17 17:19         ` Nadav Amit
  2019-04-17 17:26           ` Ingo Molnar
  2019-04-17 17:33         ` Khalid Aziz
  1 sibling, 1 reply; 24+ messages in thread
From: Nadav Amit @ 2019-04-17 17:19 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Khalid Aziz, juergh, Tycho Andersen, jsteckli, keescook,
	Konrad Rzeszutek Wilk, Juerg Haefliger, deepa.srinivasan,
	chris.hyser, tyhicks, David Woodhouse, Andrew Cooper, jcm,
	Boris Ostrovsky, iommu, X86 ML, linux-arm-kernel,
	open list:DOCUMENTATION, Linux List Kernel Mailing, Linux-MM,
	LSM List, Khalid Aziz, Linus Torvalds, Andrew Morton,
	Thomas Gleixner, Andy Lutomirski, Peter Zijlstra, Dave Hansen,
	Borislav Petkov, H. Peter Anvin, Arjan van de Ven,
	Greg Kroah-Hartman

> On Apr 17, 2019, at 10:09 AM, Ingo Molnar <mingo@kernel.org> wrote:
> 
> 
> * Khalid Aziz <khalid.aziz@oracle.com> wrote:
> 
>>> I.e. the original motivation of the XPFO patches was to prevent execution 
>>> of direct kernel mappings. Is this motivation still present if those 
>>> mappings are non-executable?
>>> 
>>> (Sorry if this has been asked and answered in previous discussions.)
>> 
>> Hi Ingo,
>> 
>> That is a good question. Because of the cost of XPFO, we have to be very
>> sure we need this protection. The paper from Vasileios, Michalis and
>> Angelos - <http://www.cs.columbia.edu/~vpk/papers/ret2dir.sec14.pdf>,
>> does go into how ret2dir attacks can bypass SMAP/SMEP in sections 6.1
>> and 6.2.
> 
> So it would be nice if you could generally summarize external arguments 
> when defending a patchset, instead of me having to dig through a PDF 
> which not only causes me to spend time that you probably already spent 
> reading that PDF, but I might also interpret it incorrectly. ;-)
> 
> The PDF you cited says this:
> 
>  "Unfortunately, as shown in Table 1, the W^X property is not enforced 
>   in many platforms, including x86-64.  In our example, the content of 
>   user address 0xBEEF000 is also accessible through kernel address 
>   0xFFFF87FF9F080000 as plain, executable code."
> 
> Is this actually true of modern x86-64 kernels? We've locked down W^X 
> protections in general.

As I was curious, I looked at the paper. Here is a quote from it:

"In x86-64, however, the permissions of physmap are not in sane state.
Kernels up to v3.8.13 violate the W^X property by mapping the entire region
as “readable, writeable, and executable” (RWX)—only very recent kernels
(≥v3.9) use the more conservative RW mapping."



* Re: [RFC PATCH v9 03/13] mm: Add support for eXclusive Page Frame Ownership (XPFO)
  2019-04-17 17:19         ` Nadav Amit
@ 2019-04-17 17:26           ` Ingo Molnar
  2019-04-17 17:44             ` Nadav Amit
  0 siblings, 1 reply; 24+ messages in thread
From: Ingo Molnar @ 2019-04-17 17:26 UTC (permalink / raw)
  To: Nadav Amit
  Cc: Khalid Aziz, juergh, Tycho Andersen, jsteckli, keescook,
	Konrad Rzeszutek Wilk, Juerg Haefliger, deepa.srinivasan,
	chris.hyser, tyhicks, David Woodhouse, Andrew Cooper, jcm,
	Boris Ostrovsky, iommu, X86 ML, linux-arm-kernel,
	open list:DOCUMENTATION, Linux List Kernel Mailing, Linux-MM,
	LSM List, Khalid Aziz, Linus Torvalds, Andrew Morton,
	Thomas Gleixner, Andy Lutomirski, Peter Zijlstra, Dave Hansen,
	Borislav Petkov, H. Peter Anvin, Arjan van de Ven,
	Greg Kroah-Hartman


* Nadav Amit <nadav.amit@gmail.com> wrote:

> > On Apr 17, 2019, at 10:09 AM, Ingo Molnar <mingo@kernel.org> wrote:
> > 
> > 
> > * Khalid Aziz <khalid.aziz@oracle.com> wrote:
> > 
> >>> I.e. the original motivation of the XPFO patches was to prevent execution 
> >>> of direct kernel mappings. Is this motivation still present if those 
> >>> mappings are non-executable?
> >>> 
> >>> (Sorry if this has been asked and answered in previous discussions.)
> >> 
> >> Hi Ingo,
> >> 
> >> That is a good question. Because of the cost of XPFO, we have to be very
> >> sure we need this protection. The paper from Vasileios, Michalis and
> >> Angelos - <http://www.cs.columbia.edu/~vpk/papers/ret2dir.sec14.pdf>,
> >> does go into how ret2dir attacks can bypass SMAP/SMEP in sections 6.1
> >> and 6.2.
> > 
> > So it would be nice if you could generally summarize external arguments 
> > when defending a patchset, instead of me having to dig through a PDF 
> > which not only causes me to spend time that you probably already spent 
> > reading that PDF, but I might also interpret it incorrectly. ;-)
> > 
> > The PDF you cited says this:
> > 
> >  "Unfortunately, as shown in Table 1, the W^X property is not enforced 
> >   in many platforms, including x86-64.  In our example, the content of 
> >   user address 0xBEEF000 is also accessible through kernel address 
> >   0xFFFF87FF9F080000 as plain, executable code."
> > 
> > Is this actually true of modern x86-64 kernels? We've locked down W^X 
> > protections in general.
> 
> As I was curious, I looked at the paper. Here is a quote from it:
> 
> "In x86-64, however, the permissions of physmap are not in sane state.
> Kernels up to v3.8.13 violate the W^X property by mapping the entire region
> as “readable, writeable, and executable” (RWX)—only very recent kernels
> (≥v3.9) use the more conservative RW mapping."

But v3.8.13 is a 5+ year old kernel, it doesn't count as a "modern" 
kernel in any sense of the word. For any proposed patchset with 
significant complexity and non-trivial costs the benchmark version 
threshold is the "current upstream kernel".

So does that quote address my followup questions:

> Is this actually true of modern x86-64 kernels? We've locked down W^X
> protections in general.
>
> I.e. this conclusion:
>
>   "Therefore, by simply overwriting kfptr with 0xFFFF87FF9F080000 and
>    triggering the kernel to dereference it, an attacker can directly
>    execute shell code with kernel privileges."
>
> ... appears to be predicated on imperfect W^X protections on the x86-64
> kernel.
>
> Do such holes exist on the latest x86-64 kernel? If yes, is there a
> reason to believe that these W^X holes cannot be fixed, or that any fix
> would be more expensive than XPFO?

?

What you are proposing here is an XPFO patch-set against recent kernels 
with significant runtime overhead, so my questions about the W^X holes 
are warranted.

Thanks,

	Ingo


* Re: [RFC PATCH v9 03/13] mm: Add support for eXclusive Page Frame Ownership (XPFO)
  2019-04-17 17:09       ` Ingo Molnar
  2019-04-17 17:19         ` Nadav Amit
@ 2019-04-17 17:33         ` Khalid Aziz
  2019-04-17 19:49           ` Andy Lutomirski
  1 sibling, 1 reply; 24+ messages in thread
From: Khalid Aziz @ 2019-04-17 17:33 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: juergh, tycho, jsteckli, keescook, konrad.wilk, Juerg Haefliger,
	deepa.srinivasan, chris.hyser, tyhicks, dwmw, andrew.cooper3,
	jcm, boris.ostrovsky, iommu, x86, linux-arm-kernel, linux-doc,
	linux-kernel, linux-mm, linux-security-module, Khalid Aziz,
	Linus Torvalds, Andrew Morton, Thomas Gleixner, Andy Lutomirski,
	Peter Zijlstra, Dave Hansen, Borislav Petkov, H. Peter Anvin,
	Arjan van de Ven, Greg Kroah-Hartman

On 4/17/19 11:09 AM, Ingo Molnar wrote:
> 
> * Khalid Aziz <khalid.aziz@oracle.com> wrote:
> 
>>> I.e. the original motivation of the XPFO patches was to prevent execution 
>>> of direct kernel mappings. Is this motivation still present if those 
>>> mappings are non-executable?
>>>
>>> (Sorry if this has been asked and answered in previous discussions.)
>>
>> Hi Ingo,
>>
>> That is a good question. Because of the cost of XPFO, we have to be very
>> sure we need this protection. The paper from Vasileios, Michalis and
>> Angelos - <http://www.cs.columbia.edu/~vpk/papers/ret2dir.sec14.pdf>,
>> does go into how ret2dir attacks can bypass SMAP/SMEP in sections 6.1
>> and 6.2.
> 
> So it would be nice if you could generally summarize external arguments 
> when defending a patchset, instead of me having to dig through a PDF 
> which not only causes me to spend time that you probably already spent 
> reading that PDF, but I might also interpret it incorrectly. ;-)

Sorry, you are right. Even though that paper explains it well, a summary
is always useful.

> 
> The PDF you cited says this:
> 
>   "Unfortunately, as shown in Table 1, the W^X property is not enforced 
>    in many platforms, including x86-64.  In our example, the content of 
>    user address 0xBEEF000 is also accessible through kernel address 
>    0xFFFF87FF9F080000 as plain, executable code."
> 
> Is this actually true of modern x86-64 kernels? We've locked down W^X 
> protections in general.
> 
> I.e. this conclusion:
> 
>   "Therefore, by simply overwriting kfptr with 0xFFFF87FF9F080000 and 
>    triggering the kernel to dereference it, an attacker can directly 
>    execute shell code with kernel privileges."
> 
> ... appears to be predicated on imperfect W^X protections on the x86-64 
> kernel.
> 
> Do such holes exist on the latest x86-64 kernel? If yes, is there a 
> reason to believe that these W^X holes cannot be fixed, or that any fix 
> would be more expensive than XPFO?

Even if physmap is not executable, return-oriented programming (ROP) can
still be used to launch an attack. Instead of placing executable code at
user address 0xBEEF000, an attacker can place an ROP payload there. kfptr
is then overwritten to point to a stack-pivoting gadget. Through
physmap address aliasing, the ROP payload becomes the kernel-mode stack, and
execution can then be hijacked when a ret instruction executes. This
is the gist of the subsection titled "Non-executable physmap" under
section 6.2, and it looked convincing enough to me. If you have a
different take on this, I am very interested in your point of view.

Thanks,
Khalid
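
The address aliasing that this argument relies on is plain arithmetic and
can be sketched in user-space C. The constants are assumptions made for
illustration: MODEL_PAGE_OFFSET is one possible direct-map base for x86-64
with 4-level paging and no KASLR, the 2^34 frame limit assumes a 64 TB
direct map, and physmap_alias() is a hypothetical helper rather than a
kernel function. The sketch only shows that a user page backed by a given
page frame is also reachable at a predictable kernel virtual address.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT  12
#define MODEL_PAGE_SIZE   (UINT64_C(1) << MODEL_PAGE_SHIFT)
/* Assumed direct-map base; real kernels may differ or randomize it. */
#define MODEL_PAGE_OFFSET UINT64_C(0xffff888000000000)

/*
 * Return the direct-map (physmap) address that aliases a user virtual
 * address, given the page frame number backing that user page.
 */
uint64_t physmap_alias(uint64_t pfn, uint64_t user_vaddr)
{
    /* Assumed 64 TB direct map: 2^34 frames of 4 KiB each. */
    assert(pfn < (UINT64_C(1) << 34));

    uint64_t offset_in_page = user_vaddr & (MODEL_PAGE_SIZE - 1);
    return MODEL_PAGE_OFFSET + (pfn << MODEL_PAGE_SHIFT) + offset_in_page;
}
```

With XPFO, the alias computed this way would not be mapped while the frame
belongs to userspace, so a kernel-mode access through it would fault.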




* Re: [RFC PATCH v9 03/13] mm: Add support for eXclusive Page Frame Ownership (XPFO)
  2019-04-17 17:26           ` Ingo Molnar
@ 2019-04-17 17:44             ` Nadav Amit
  2019-04-17 21:19               ` Thomas Gleixner
  0 siblings, 1 reply; 24+ messages in thread
From: Nadav Amit @ 2019-04-17 17:44 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Khalid Aziz, juergh, Tycho Andersen, jsteckli, keescook,
	Konrad Rzeszutek Wilk, Juerg Haefliger, deepa.srinivasan,
	chris.hyser, tyhicks, David Woodhouse, Andrew Cooper, jcm,
	Boris Ostrovsky, iommu, X86 ML, linux-arm-kernel,
	open list:DOCUMENTATION, Linux List Kernel Mailing, Linux-MM,
	LSM List, Khalid Aziz, Linus Torvalds, Andrew Morton,
	Thomas Gleixner, Andy Lutomirski, Peter Zijlstra, Dave Hansen,
	Borislav Petkov, H. Peter Anvin, Arjan van de Ven,
	Greg Kroah-Hartman

> On Apr 17, 2019, at 10:26 AM, Ingo Molnar <mingo@kernel.org> wrote:
> 
> 
> * Nadav Amit <nadav.amit@gmail.com> wrote:
> 
>>> On Apr 17, 2019, at 10:09 AM, Ingo Molnar <mingo@kernel.org> wrote:
>>> 
>>> 
>>> * Khalid Aziz <khalid.aziz@oracle.com> wrote:
>>> 
>>>>> I.e. the original motivation of the XPFO patches was to prevent execution 
>>>>> of direct kernel mappings. Is this motivation still present if those 
>>>>> mappings are non-executable?
>>>>> 
>>>>> (Sorry if this has been asked and answered in previous discussions.)
>>>> 
>>>> Hi Ingo,
>>>> 
>>>> That is a good question. Because of the cost of XPFO, we have to be very
>>>> sure we need this protection. The paper from Vasileios, Michalis and
>>>> Angelos - <http://www.cs.columbia.edu/~vpk/papers/ret2dir.sec14.pdf>,
>>>> does go into how ret2dir attacks can bypass SMAP/SMEP in sections 6.1
>>>> and 6.2.
>>> 
>>> So it would be nice if you could generally summarize external arguments 
>>> when defending a patchset, instead of me having to dig through a PDF 
>>> which not only causes me to spend time that you probably already spent 
>>> reading that PDF, but I might also interpret it incorrectly. ;-)
>>> 
>>> The PDF you cited says this:
>>> 
>>> "Unfortunately, as shown in Table 1, the W^X property is not enforced 
>>>  in many platforms, including x86-64.  In our example, the content of 
>>>  user address 0xBEEF000 is also accessible through kernel address 
>>>  0xFFFF87FF9F080000 as plain, executable code."
>>> 
>>> Is this actually true of modern x86-64 kernels? We've locked down W^X 
>>> protections in general.
>> 
>> As I was curious, I looked at the paper. Here is a quote from it:
>> 
>> "In x86-64, however, the permissions of physmap are not in sane state.
>> Kernels up to v3.8.13 violate the W^X property by mapping the entire region
>> as “readable, writeable, and executable” (RWX)—only very recent kernels
>> (≥v3.9) use the more conservative RW mapping."
> 
> But v3.8.13 is a 5+ year old kernel, it doesn't count as a "modern" 
> kernel in any sense of the word. For any proposed patchset with 
> significant complexity and non-trivial costs the benchmark version 
> threshold is the "current upstream kernel".
> 
> So does that quote address my followup questions:
> 
>> Is this actually true of modern x86-64 kernels? We've locked down W^X
>> protections in general.
>> 
>> I.e. this conclusion:
>> 
>>  "Therefore, by simply overwriting kfptr with 0xFFFF87FF9F080000 and
>>   triggering the kernel to dereference it, an attacker can directly
>>   execute shell code with kernel privileges."
>> 
>> ... appears to be predicated on imperfect W^X protections on the x86-64
>> kernel.
>> 
>> Do such holes exist on the latest x86-64 kernel? If yes, is there a
>> reason to believe that these W^X holes cannot be fixed, or that any fix
>> would be more expensive than XPFO?
> 
> ?
> 
> What you are proposing here is a XPFO patch-set against recent kernels 
> with significant runtime overhead, so my questions about the W^X holes 
> are warranted.
> 

Just to clarify - I am an innocent bystander and have no part in this work.
I was just looking (again) at the paper, as I was curious due to the recent
patches that I sent that improve W^X protection.



* Re: [RFC PATCH v9 03/13] mm: Add support for eXclusive Page Frame Ownership (XPFO)
  2019-04-17 17:33         ` Khalid Aziz
@ 2019-04-17 19:49           ` Andy Lutomirski
  2019-04-17 19:52             ` Tycho Andersen
  2019-04-17 20:12             ` Khalid Aziz
  0 siblings, 2 replies; 24+ messages in thread
From: Andy Lutomirski @ 2019-04-17 19:49 UTC (permalink / raw)
  To: Khalid Aziz
  Cc: Ingo Molnar, Juerg Haefliger, Tycho Andersen, jsteckli,
	Kees Cook, Konrad Rzeszutek Wilk, Juerg Haefliger,
	deepa.srinivasan, chris hyser, Tyler Hicks, Woodhouse, David,
	Andrew Cooper, Jon Masters, Boris Ostrovsky, iommu, X86 ML,
	linux-arm-kernel, open list:DOCUMENTATION, LKML, Linux-MM,
	LSM List, Khalid Aziz, Linus Torvalds, Andrew Morton,
	Thomas Gleixner, Andy Lutomirski, Peter Zijlstra, Dave Hansen,
	Borislav Petkov, H. Peter Anvin, Arjan van de Ven,
	Greg Kroah-Hartman

On Wed, Apr 17, 2019 at 10:33 AM Khalid Aziz <khalid.aziz@oracle.com> wrote:
>
> On 4/17/19 11:09 AM, Ingo Molnar wrote:
> >
> > * Khalid Aziz <khalid.aziz@oracle.com> wrote:
> >
> >>> I.e. the original motivation of the XPFO patches was to prevent execution
> >>> of direct kernel mappings. Is this motivation still present if those
> >>> mappings are non-executable?
> >>>
> >>> (Sorry if this has been asked and answered in previous discussions.)
> >>
> >> Hi Ingo,
> >>
> >> That is a good question. Because of the cost of XPFO, we have to be very
> >> sure we need this protection. The paper from Vasileios, Michalis and
> >> Angelos - <http://www.cs.columbia.edu/~vpk/papers/ret2dir.sec14.pdf>,
> >> does go into how ret2dir attacks can bypass SMAP/SMEP in sections 6.1
> >> and 6.2.
> >
> > So it would be nice if you could generally summarize external arguments
> > when defending a patchset, instead of me having to dig through a PDF
> > which not only causes me to spend time that you probably already spent
> > reading that PDF, but I might also interpret it incorrectly. ;-)
>
> Sorry, you are right. Even though that paper explains it well, a summary
> is always useful.
>
> >
> > The PDF you cited says this:
> >
> >   "Unfortunately, as shown in Table 1, the W^X property is not enforced
> >    in many platforms, including x86-64.  In our example, the content of
> >    user address 0xBEEF000 is also accessible through kernel address
> >    0xFFFF87FF9F080000 as plain, executable code."
> >
> > Is this actually true of modern x86-64 kernels? We've locked down W^X
> > protections in general.
> >
> > I.e. this conclusion:
> >
> >   "Therefore, by simply overwriting kfptr with 0xFFFF87FF9F080000 and
> >    triggering the kernel to dereference it, an attacker can directly
> >    execute shell code with kernel privileges."
> >
> > ... appears to be predicated on imperfect W^X protections on the x86-64
> > kernel.
> >
> > Do such holes exist on the latest x86-64 kernel? If yes, is there a
> > reason to believe that these W^X holes cannot be fixed, or that any fix
> > would be more expensive than XPFO?
>
> Even if physmap is not executable, return-oriented programming (ROP) can
> still be used to launch an attack. Instead of placing executable code at
> user address 0xBEEF000, an attacker can place a ROP payload there. kfptr
> is then overwritten to point to a stack-pivoting gadget. Through the
> physmap address aliasing, the ROP payload becomes the kernel-mode stack.
> Execution can then be hijacked when a ret instruction executes. This
> is the gist of the subsection titled "Non-executable physmap" under
> section 6.2 and it looked convincing enough to me. If you have a
> different take on this, I am very interested in your point of view.

My issue with all this is that XPFO is really very expensive.  I think
that, if we're going to seriously consider upstreaming expensive
exploit mitigations like this, we should consider others first, in
particular CFI techniques.  grsecurity's RAP would be a great start.
I also proposed using a gcc plugin (or upstream gcc feature) to add
some instrumentation to any code that pops RSP to verify that the
resulting (unsigned) change in RSP is between 0 and THREAD_SIZE bytes.
This will make ROP quite a bit harder.
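
To illustrate the RSP check, here is a minimal sketch in plain C. It is
not plugin output and not existing kernel code: rsp_change_ok() is a
hypothetical helper name, and THREAD_SIZE is assumed to be 16 KiB (the
usual x86-64 value without KASAN). The point is only that one unsigned
comparison covers both bounds.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 16 KiB kernel stacks (x86-64 without KASAN). */
#define THREAD_SIZE (16UL * 1024)

/*
 * Hypothetical check that instrumentation could run after any
 * instruction that loads RSP (e.g. "pop %rsp", "leave", "mov ..., %rsp").
 * The unsigned difference new - old must lie in [0, THREAD_SIZE].
 * A move to a lower address wraps around to a huge unsigned value,
 * so it fails the same comparison as a jump far upward.
 */
static bool rsp_change_ok(uint64_t old_rsp, uint64_t new_rsp)
{
	return new_rsp - old_rsp <= THREAD_SIZE;
}
```

With this shape, an ordinary unwind within the current stack passes,
while a pivot to an address in the direct map (or anywhere else more
than THREAD_SIZE away) is rejected. What the instrumented code does on
failure is left open here.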


* Re: [RFC PATCH v9 03/13] mm: Add support for eXclusive Page Frame Ownership (XPFO)
  2019-04-17 19:49           ` Andy Lutomirski
@ 2019-04-17 19:52             ` Tycho Andersen
  2019-04-17 20:12             ` Khalid Aziz
  1 sibling, 0 replies; 24+ messages in thread
From: Tycho Andersen @ 2019-04-17 19:52 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Khalid Aziz, Ingo Molnar, Juerg Haefliger, jsteckli, Kees Cook,
	Konrad Rzeszutek Wilk, Juerg Haefliger, deepa.srinivasan,
	chris hyser, Tyler Hicks, Woodhouse, David, Andrew Cooper,
	Jon Masters, Boris Ostrovsky, iommu, X86 ML, linux-arm-kernel,
	open list:DOCUMENTATION, LKML, Linux-MM, LSM List, Khalid Aziz,
	Linus Torvalds, Andrew Morton, Thomas Gleixner, Peter Zijlstra,
	Dave Hansen, Borislav Petkov, H. Peter Anvin, Arjan van de Ven,
	Greg Kroah-Hartman

On Wed, Apr 17, 2019 at 12:49:04PM -0700, Andy Lutomirski wrote:
> I also proposed using a gcc plugin (or upstream gcc feature) to add
> some instrumentation to any code that pops RSP to verify that the
> resulting (unsigned) change in RSP is between 0 and THREAD_SIZE bytes.
> This will make ROP quite a bit harder.

I've been playing around with this for a bit, and hope to have
something to post Soon :)

Tycho


* Re: [RFC PATCH v9 03/13] mm: Add support for eXclusive Page Frame Ownership (XPFO)
  2019-04-17 19:49           ` Andy Lutomirski
  2019-04-17 19:52             ` Tycho Andersen
@ 2019-04-17 20:12             ` Khalid Aziz
  1 sibling, 0 replies; 24+ messages in thread
From: Khalid Aziz @ 2019-04-17 20:12 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Ingo Molnar, Juerg Haefliger, Tycho Andersen, jsteckli,
	Kees Cook, Konrad Rzeszutek Wilk, Juerg Haefliger,
	deepa.srinivasan, chris hyser, Tyler Hicks, Woodhouse, David,
	Andrew Cooper, Jon Masters, Boris Ostrovsky, iommu, X86 ML,
	linux-arm-kernel, open list:DOCUMENTATION, LKML, Linux-MM,
	LSM List, Khalid Aziz, Linus Torvalds, Andrew Morton,
	Thomas Gleixner, Peter Zijlstra, Dave Hansen, Borislav Petkov,
	H. Peter Anvin, Arjan van de Ven, Greg Kroah-Hartman

On 4/17/19 1:49 PM, Andy Lutomirski wrote:
> On Wed, Apr 17, 2019 at 10:33 AM Khalid Aziz <khalid.aziz@oracle.com> wrote:
>>
>> On 4/17/19 11:09 AM, Ingo Molnar wrote:
>>>
>>> * Khalid Aziz <khalid.aziz@oracle.com> wrote:
>>>
>>>>> I.e. the original motivation of the XPFO patches was to prevent execution
>>>>> of direct kernel mappings. Is this motivation still present if those
>>>>> mappings are non-executable?
>>>>>
>>>>> (Sorry if this has been asked and answered in previous discussions.)
>>>>
>>>> Hi Ingo,
>>>>
>>>> That is a good question. Because of the cost of XPFO, we have to be very
>>>> sure we need this protection. The paper from Vasileios, Michalis and
>>>> Angelos - <http://www.cs.columbia.edu/~vpk/papers/ret2dir.sec14.pdf>,
>>>> does go into how ret2dir attacks can bypass SMAP/SMEP in sections 6.1
>>>> and 6.2.
>>>
>>> So it would be nice if you could generally summarize external arguments
>>> when defending a patchset, instead of me having to dig through a PDF
>>> which not only causes me to spend time that you probably already spent
>>> reading that PDF, but I might also interpret it incorrectly. ;-)
>>
>> Sorry, you are right. Even though that paper explains it well, a summary
>> is always useful.
>>
>>>
>>> The PDF you cited says this:
>>>
>>>   "Unfortunately, as shown in Table 1, the W^X property is not enforced
>>>    in many platforms, including x86-64.  In our example, the content of
>>>    user address 0xBEEF000 is also accessible through kernel address
>>>    0xFFFF87FF9F080000 as plain, executable code."
>>>
>>> Is this actually true of modern x86-64 kernels? We've locked down W^X
>>> protections in general.
>>>
>>> I.e. this conclusion:
>>>
>>>   "Therefore, by simply overwriting kfptr with 0xFFFF87FF9F080000 and
>>>    triggering the kernel to dereference it, an attacker can directly
>>>    execute shell code with kernel privileges."
>>>
>>> ... appears to be predicated on imperfect W^X protections on the x86-64
>>> kernel.
>>>
>>> Do such holes exist on the latest x86-64 kernel? If yes, is there a
>>> reason to believe that these W^X holes cannot be fixed, or that any fix
>>> would be more expensive than XPFO?
>>
>> Even if physmap is not executable, return-oriented programming (ROP) can
>> still be used to launch an attack. Instead of placing executable code at
>> user address 0xBEEF000, an attacker can place a ROP payload there. kfptr
>> is then overwritten to point to a stack-pivoting gadget. Through the
>> physmap address aliasing, the ROP payload becomes the kernel-mode stack.
>> Execution can then be hijacked when a ret instruction executes. This
>> is the gist of the subsection titled "Non-executable physmap" under
>> section 6.2 and it looked convincing enough to me. If you have a
>> different take on this, I am very interested in your point of view.
> 
> My issue with all this is that XPFO is really very expensive.  I think
> that, if we're going to seriously consider upstreaming expensive
> exploit mitigations like this, we should consider others first, in
> particular CFI techniques.  grsecurity's RAP would be a great start.
> I also proposed using a gcc plugin (or upstream gcc feature) to add
> some instrumentation to any code that pops RSP to verify that the
> resulting (unsigned) change in RSP is between 0 and THREAD_SIZE bytes.
> This will make ROP quite a bit harder.
> 

Yes, XPFO is expensive. I have been able to reduce the overhead of XPFO
from 2537% to 28% (on large servers) but 28% is still quite significant.
Alternative mitigation techniques with lower impact would easily be more
acceptable as long as they provide the same level of protection. If we have
to go with XPFO, we will continue to look for more performance
improvements to bring that number down further from 28%. Hopefully what
Tycho is working on will yield better results. I am continuing to look
for improvements to XPFO in parallel.

Thanks,
Khalid



* Re: [RFC PATCH v9 03/13] mm: Add support for eXclusive Page Frame Ownership (XPFO)
  2019-04-17 17:44             ` Nadav Amit
@ 2019-04-17 21:19               ` Thomas Gleixner
       [not found]                 ` <CAHk-=wgBMg9P-nYQR2pS0XwVdikPCBqLsMFqR9nk=wSmAd4_5g@mail.gmail.com>
  0 siblings, 1 reply; 24+ messages in thread
From: Thomas Gleixner @ 2019-04-17 21:19 UTC (permalink / raw)
  To: Nadav Amit
  Cc: Ingo Molnar, Khalid Aziz, juergh, Tycho Andersen, jsteckli,
	keescook, Konrad Rzeszutek Wilk, Juerg Haefliger,
	deepa.srinivasan, chris.hyser, tyhicks, David Woodhouse,
	Andrew Cooper, jcm, Boris Ostrovsky, iommu, X86 ML,
	linux-arm-kernel, open list:DOCUMENTATION,
	Linux List Kernel Mailing, Linux-MM, LSM List, Khalid Aziz,
	Linus Torvalds, Andrew Morton, Andy Lutomirski, Peter Zijlstra,
	Dave Hansen, Borislav Petkov, H. Peter Anvin, Arjan van de Ven,
	Greg Kroah-Hartman


On Wed, 17 Apr 2019, Nadav Amit wrote:
> > On Apr 17, 2019, at 10:26 AM, Ingo Molnar <mingo@kernel.org> wrote:
> >> As I was curious, I looked at the paper. Here is a quote from it:
> >> 
> >> "In x86-64, however, the permissions of physmap are not in sane state.
> >> Kernels up to v3.8.13 violate the W^X property by mapping the entire region
> >> as “readable, writeable, and executable” (RWX)—only very recent kernels
> >> (≥v3.9) use the more conservative RW mapping.”
> > 
> > But v3.8.13 is a 5+ years old kernel, it doesn't count as a "modern" 
> > kernel in any sense of the word. For any proposed patchset with 
> > significant complexity and non-trivial costs the benchmark version 
> > threshold is the "current upstream kernel".
> > 
> > So does that quote address my followup questions:
> > 
> >> Is this actually true of modern x86-64 kernels? We've locked down W^X
> >> protections in general.
> >> 
> >> I.e. this conclusion:
> >> 
> >>  "Therefore, by simply overwriting kfptr with 0xFFFF87FF9F080000 and
> >>   triggering the kernel to dereference it, an attacker can directly
> >>   execute shell code with kernel privileges."
> >> 
> >> ... appears to be predicated on imperfect W^X protections on the x86-64
> >> kernel.
> >> 
> >> Do such holes exist on the latest x86-64 kernel? If yes, is there a
> >> reason to believe that these W^X holes cannot be fixed, or that any fix
> >> would be more expensive than XPFO?
> > 
> > ?
> > 
> > What you are proposing here is a XPFO patch-set against recent kernels 
> > with significant runtime overhead, so my questions about the W^X holes 
> > are warranted.
> > 
> 
> Just to clarify - I am an innocent bystander and have no part in this work.
> I was just looking (again) at the paper, as I was curious due to the recent
> patches that I sent that improve W^X protection.

It's not necessarily a W+X issue. The user space text is mapped in the
kernel as well and even if it is mapped RX then this can happen. So any
kernel mappings of user space text need to be mapped NX!

Thanks,

	tglx


* Re: [RFC PATCH v9 03/13] mm: Add support for eXclusive Page Frame Ownership (XPFO)
       [not found]                 ` <CAHk-=wgBMg9P-nYQR2pS0XwVdikPCBqLsMFqR9nk=wSmAd4_5g@mail.gmail.com>
@ 2019-04-17 23:42                   ` Thomas Gleixner
  2019-04-17 23:52                     ` Linus Torvalds
  0 siblings, 1 reply; 24+ messages in thread
From: Thomas Gleixner @ 2019-04-17 23:42 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Nadav Amit, Ingo Molnar, Khalid Aziz, juergh, Tycho Andersen,
	jsteckli, keescook, Konrad Rzeszutek Wilk, Juerg Haefliger,
	deepa.srinivasan, chris.hyser, tyhicks, David Woodhouse,
	Andrew Cooper, jcm, Boris Ostrovsky, iommu, X86 ML,
	linux-arm-kernel, open list:DOCUMENTATION,
	Linux List Kernel Mailing, Linux-MM, LSM List, Khalid Aziz,
	Andrew Morton, Andy Lutomirski, Peter Zijlstra, Dave Hansen,
	Borislav Petkov, H. Peter Anvin, Arjan van de Ven,
	Greg Kroah-Hartman

On Wed, 17 Apr 2019, Linus Torvalds wrote:

> On Wed, Apr 17, 2019, 14:20 Thomas Gleixner <tglx@linutronix.de> wrote:
> 
> >
> > It's not necessarily a W+X issue. The user space text is mapped in the
> > kernel as well and even if it is mapped RX then this can happen. So any
> > kernel mappings of user space text need to be mapped NX!
> 
> With SMEP, user space pages are always NX.

We are talking past each other. The user space page in the ring3 valid
virtual address space (non-negative) is of course protected by SMEP.

The attack utilizes the kernel linear mapping of the physical
memory. I.e. user space address 0x43210 has a kernel equivalent at
0xfxxxxxxxxxx. So if the attack manages to trick the kernel into jumping
to that valid kernel address and that is mapped X --> game over. SMEP does
not help there.

Off the top of my head I'd say this is a non-issue as those kernel address
space mappings _should_ be NX, but we got bitten by _should_ in the past :)
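
The aliasing described above can be sketched in a few lines of C. This
is an illustration only, not kernel code: direct_map_alias() and
is_user_half() are hypothetical names, and the direct-map base is
assumed to be the non-randomized 4-level paging value
0xffff888000000000 (with CONFIG_RANDOMIZE_MEMORY the real base differs
per boot).

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumption: base of the direct mapping of all physical memory for
 * x86-64 with 4-level paging and no memory-layout randomization.
 */
#define DIRECT_MAP_BASE_ASSUMED 0xffff888000000000UL

/* Kernel virtual address that aliases a given physical address. */
static uint64_t direct_map_alias(uint64_t phys)
{
	return DIRECT_MAP_BASE_ASSUMED + phys;
}

/*
 * SMEP/SMAP only guard pages marked as user pages, which live in the
 * lower (non-negative) half of the address space. The alias of the
 * same physical page lies in the upper half and is a supervisor page.
 */
static bool is_user_half(uint64_t vaddr)
{
	return (vaddr >> 63) == 0;
}
```

So for a user page at virtual address 0x43210 that happens to be backed
by physical address P, direct_map_alias(P) is a kernel-half address for
the same bytes, and only the permissions of that kernel mapping decide
whether it can be executed.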

Thanks,

	tglx




* Re: [RFC PATCH v9 03/13] mm: Add support for eXclusive Page Frame Ownership (XPFO)
  2019-04-17 23:42                   ` Thomas Gleixner
@ 2019-04-17 23:52                     ` Linus Torvalds
  2019-04-18  4:41                       ` Andy Lutomirski
  2019-04-18  6:14                       ` Thomas Gleixner
  0 siblings, 2 replies; 24+ messages in thread
From: Linus Torvalds @ 2019-04-17 23:52 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Nadav Amit, Ingo Molnar, Khalid Aziz, juergh, Tycho Andersen,
	jsteckli, Kees Cook, Konrad Rzeszutek Wilk, Juerg Haefliger,
	deepa.srinivasan, chris.hyser, Tyler Hicks, David Woodhouse,
	Andrew Cooper, Jon Masters, Boris Ostrovsky, iommu, X86 ML,
	linux-alpha@vger.kernel.org, open list:DOCUMENTATION,
	Linux List Kernel Mailing, Linux-MM, LSM List, Khalid Aziz,
	Andrew Morton, Andy Lutomirski, Peter Zijlstra, Dave Hansen,
	Borislav Petkov, H. Peter Anvin, Arjan van de Ven,
	Greg Kroah-Hartman

On Wed, Apr 17, 2019 at 4:42 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> On Wed, 17 Apr 2019, Linus Torvalds wrote:
>
> > With SMEP, user space pages are always NX.
>
> We talk past each other. The user space page in the ring3 valid virtual
> address space (non negative) is of course protected by SMEP.
>
> The attack utilizes the kernel linear mapping of the physical
> memory. I.e. user space address 0x43210 has a kernel equivalent at
> 0xfxxxxxxxxxx. So if the attack manages to trick the kernel to that valid
> kernel address and that is mapped X --> game over. SMEP does not help
> there.

Oh, agreed.

But that would simply be a kernel bug. We should only map kernel pages
executable when we have kernel code in them, and we should certainly
not allow those pages to be mapped writably in user space.

That kind of "executable in kernel, writable in user" would be a
horrendous and major bug.

So I think it's a non-issue.

> From the top of my head I'd say this is a non issue as those kernel address
> space mappings _should_ be NX, but we got bitten by _should_ in the past:)

I do agree that bugs can happen, obviously, and we might have missed something.

But in the context of XPFO, I would argue (*very* strongly) that the
likelihood of the above kind of bug is absolutely *minuscule* compared
to the likelihood that we'd have something wrong in the software
implementation of XPFO.

So if the argument is "we might have bugs in software", then I think
that's an argument _against_ XPFO rather than for it.

                Linus


* Re: [RFC PATCH v9 03/13] mm: Add support for eXclusive Page Frame Ownership (XPFO)
  2019-04-17 23:52                     ` Linus Torvalds
@ 2019-04-18  4:41                       ` Andy Lutomirski
  2019-04-18  5:41                         ` Kees Cook
  2019-04-18  6:14                       ` Thomas Gleixner
  1 sibling, 1 reply; 24+ messages in thread
From: Andy Lutomirski @ 2019-04-18  4:41 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Thomas Gleixner, Nadav Amit, Ingo Molnar, Khalid Aziz,
	Juerg Haefliger, Tycho Andersen, jsteckli, Kees Cook,
	Konrad Rzeszutek Wilk, Juerg Haefliger, deepa.srinivasan,
	chris hyser, Tyler Hicks, David Woodhouse, Andrew Cooper,
	Jon Masters, Boris Ostrovsky, iommu, X86 ML,
	linux-alpha@vger.kernel.org, open list:DOCUMENTATION,
	Linux List Kernel Mailing, Linux-MM, LSM List, Khalid Aziz,
	Andrew Morton, Andy Lutomirski, Peter Zijlstra, Dave Hansen,
	Borislav Petkov, H. Peter Anvin, Arjan van de Ven,
	Greg Kroah-Hartman

On Wed, Apr 17, 2019 at 5:00 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Wed, Apr 17, 2019 at 4:42 PM Thomas Gleixner <tglx@linutronix.de> wrote:
> >
> > On Wed, 17 Apr 2019, Linus Torvalds wrote:
> >
> > > With SMEP, user space pages are always NX.
> >
> > We talk past each other. The user space page in the ring3 valid virtual
> > address space (non negative) is of course protected by SMEP.
> >
> > The attack utilizes the kernel linear mapping of the physical
> > memory. I.e. user space address 0x43210 has a kernel equivalent at
> > 0xfxxxxxxxxxx. So if the attack manages to trick the kernel to that valid
> > kernel address and that is mapped X --> game over. SMEP does not help
> > there.
>
> Oh, agreed.
>
> But that would simply be a kernel bug. We should only map kernel pages
> executable when we have kernel code in them, and we should certainly
> not allow those pages to be mapped writably in user space.
>
> That kind of "executable in kernel, writable in user" would be a
> horrendous and major bug.
>
> So I think it's a non-issue.
>
> > From the top of my head I'd say this is a non issue as those kernel address
> > space mappings _should_ be NX, but we got bitten by _should_ in the past:)
>
> I do agree that bugs can happen, obviously, and we might have missed something.
>
> But in the context of XPFO, I would argue (*very* strongly) that the
> likelihood of the above kind of bug is absolutely *minuscule* compared
> to the likelihood that we'd have something wrong in the software
> implementation of XPFO.
>
> So if the argument is "we might have bugs in software", then I think
> that's an argument _against_ XPFO rather than for it.
>

I don't think this type of NX goof was ever the argument for XPFO.
The main argument I've heard is that a malicious user program writes a
ROP payload into user memory (regular anonymous user memory) and then
gets the kernel to erroneously set RSP (*not* RIP) to point there.

I find this argument fairly weak for a couple of reasons.  First, if
we're worried about this, let's do in-kernel CFI, not XPFO, to
mitigate it.  Second, I don't see why the exact same attack can't be
done using, say, page cache, and unless I'm missing something, XPFO
doesn't protect page cache.  Or network buffers, or pipe buffers, etc.


* Re: [RFC PATCH v9 03/13] mm: Add support for eXclusive Page Frame Ownership (XPFO)
  2019-04-18  4:41                       ` Andy Lutomirski
@ 2019-04-18  5:41                         ` Kees Cook
  2019-04-18 14:34                           ` Khalid Aziz
  0 siblings, 1 reply; 24+ messages in thread
From: Kees Cook @ 2019-04-18  5:41 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Linus Torvalds, Thomas Gleixner, Nadav Amit, Ingo Molnar,
	Khalid Aziz, Juerg Haefliger, Tycho Andersen, Julian Stecklina,
	Kees Cook, Konrad Rzeszutek Wilk, Juerg Haefliger,
	deepa.srinivasan, chris hyser, Tyler Hicks, David Woodhouse,
	Andrew Cooper, Jon Masters, Boris Ostrovsky, iommu, X86 ML,
	linux-alpha@vger.kernel.org, open list:DOCUMENTATION,
	Linux List Kernel Mailing, Linux-MM, LSM List, Khalid Aziz,
	Andrew Morton, Peter Zijlstra, Dave Hansen, Borislav Petkov,
	H. Peter Anvin, Arjan van de Ven, Greg Kroah-Hartman

On Wed, Apr 17, 2019 at 11:41 PM Andy Lutomirski <luto@kernel.org> wrote:
> I don't think this type of NX goof was ever the argument for XPFO.
> The main argument I've heard is that a malicious user program writes a
> ROP payload into user memory (regular anonymous user memory) and then
> gets the kernel to erroneously set RSP (*not* RIP) to point there.

Well, more than just ROP. Any of the various attack primitives. The NX
stuff is about moving RIP: SMEP-bypassing. But there is still basic
SMAP-bypassing for putting a malicious structure in userspace and
having the kernel access it via the linear mapping, etc.

> I find this argument fairly weak for a couple reasons.  First, if
> we're worried about this, let's do in-kernel CFI, not XPFO, to

CFI is getting much closer. Getting the kernel happy under Clang, LTO,
and CFI is under active development. (It's functional for arm64
already, and pieces have been getting upstreamed.)

> mitigate it.  Second, I don't see why the exact same attack can't be
> done using, say, page cache, and unless I'm missing something, XPFO
> doesn't protect page cache.  Or network buffers, or pipe buffers, etc.

My understanding is that it's much easier to feel out the linear
mapping address than for the others. But yes, all of those same attack
primitives are possible in other memory areas (though most are NX),
and plenty of exploits have done such things.

-- 
Kees Cook


* Re: [RFC PATCH v9 03/13] mm: Add support for eXclusive Page Frame Ownership (XPFO)
  2019-04-17 23:52                     ` Linus Torvalds
  2019-04-18  4:41                       ` Andy Lutomirski
@ 2019-04-18  6:14                       ` Thomas Gleixner
  1 sibling, 0 replies; 24+ messages in thread
From: Thomas Gleixner @ 2019-04-18  6:14 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Nadav Amit, Ingo Molnar, Khalid Aziz, juergh, Tycho Andersen,
	jsteckli, Kees Cook, Konrad Rzeszutek Wilk, Juerg Haefliger,
	deepa.srinivasan, chris.hyser, Tyler Hicks, David Woodhouse,
	Andrew Cooper, Jon Masters, Boris Ostrovsky, iommu, X86 ML,
	linux-alpha@vger.kernel.org, open list:DOCUMENTATION,
	Linux List Kernel Mailing, Linux-MM, LSM List, Khalid Aziz,
	Andrew Morton, Andy Lutomirski, Peter Zijlstra, Dave Hansen,
	Borislav Petkov, H. Peter Anvin, Arjan van de Ven,
	Greg Kroah-Hartman

On Wed, 17 Apr 2019, Linus Torvalds wrote:
> On Wed, Apr 17, 2019 at 4:42 PM Thomas Gleixner <tglx@linutronix.de> wrote:
> > On Wed, 17 Apr 2019, Linus Torvalds wrote:
> > > With SMEP, user space pages are always NX.
> >
> > We talk past each other. The user space page in the ring3 valid virtual
> > address space (non negative) is of course protected by SMEP.
> >
> > The attack utilizes the kernel linear mapping of the physical
> > memory. I.e. user space address 0x43210 has a kernel equivalent at
> > 0xfxxxxxxxxxx. So if the attack manages to trick the kernel to that valid
> > kernel address and that is mapped X --> game over. SMEP does not help
> > there.
> 
> Oh, agreed.
> 
> But that would simply be a kernel bug. We should only map kernel pages
> executable when we have kernel code in them, and we should certainly
> not allow those pages to be mapped writably in user space.
> 
> That kind of "executable in kernel, writable in user" would be a
> horrendous and major bug.
> 
> So I think it's a non-issue.

Pretty much.

> > From the top of my head I'd say this is a non issue as those kernel address
> > space mappings _should_ be NX, but we got bitten by _should_ in the past:)
> 
> I do agree that bugs can happen, obviously, and we might have missed something.
>
> But in the context of XPFO, I would argue (*very* strongly) that the
> likelihood of the above kind of bug is absolutely *minuscule* compared
> to the likelihood that we'd have something wrong in the software
> implementation of XPFO.
> 
> So if the argument is "we might have bugs in software", then I think
> that's an argument _against_ XPFO rather than for it.

No argument from my side. We had better spend time making sure that a bogus
kernel side X mapping is caught, like we catch other things.

Thanks,

	tglx



* Re: [RFC PATCH v9 03/13] mm: Add support for eXclusive Page Frame Ownership (XPFO)
  2019-04-18  5:41                         ` Kees Cook
@ 2019-04-18 14:34                           ` Khalid Aziz
  2019-04-22 19:30                             ` Khalid Aziz
  2019-04-22 22:23                             ` Kees Cook
  0 siblings, 2 replies; 24+ messages in thread
From: Khalid Aziz @ 2019-04-18 14:34 UTC (permalink / raw)
  To: Kees Cook, Andy Lutomirski
  Cc: Linus Torvalds, Thomas Gleixner, Nadav Amit, Ingo Molnar,
	Juerg Haefliger, Tycho Andersen, Julian Stecklina,
	Konrad Rzeszutek Wilk, Juerg Haefliger, deepa.srinivasan,
	chris hyser, Tyler Hicks, David Woodhouse, Andrew Cooper,
	Jon Masters, Boris Ostrovsky, iommu, X86 ML,
	linux-alpha@vger.kernel.org, open list:DOCUMENTATION,
	Linux List Kernel Mailing, Linux-MM, LSM List, Khalid Aziz,
	Andrew Morton, Peter Zijlstra, Dave Hansen, Borislav Petkov,
	H. Peter Anvin, Arjan van de Ven, Greg Kroah-Hartman

On 4/17/19 11:41 PM, Kees Cook wrote:
> On Wed, Apr 17, 2019 at 11:41 PM Andy Lutomirski <luto@kernel.org> wrote:
>> I don't think this type of NX goof was ever the argument for XPFO.
>> The main argument I've heard is that a malicious user program writes a
>> ROP payload into user memory (regular anonymous user memory) and then
>> gets the kernel to erroneously set RSP (*not* RIP) to point there.
> 
> Well, more than just ROP. Any of the various attack primitives. The NX
> stuff is about moving RIP: SMEP-bypassing. But there is still basic
> SMAP-bypassing for putting a malicious structure in userspace and
> having the kernel access it via the linear mapping, etc.
> 
>> I find this argument fairly weak for a couple reasons.  First, if
>> we're worried about this, let's do in-kernel CFI, not XPFO, to
> 
> CFI is getting much closer. Getting the kernel happy under Clang, LTO,
> and CFI is under active development. (It's functional for arm64
> already, and pieces have been getting upstreamed.)
> 

CFI theoretically offers protection with fairly low overhead. I have not
played much with CFI in Clang. I agree with Linus that the probability of
bugs in the XPFO implementation itself is a cause for concern. If CFI in
Clang can provide us the same level of protection as XPFO does, I
wouldn't want to push for an expensive change like XPFO.

If Clang/CFI can't get us there for an extended period of time, does it
make sense to continue to poke at XPFO?

Thanks,
Khalid



* Re: [RFC PATCH v9 03/13] mm: Add support for eXclusive Page Frame Ownership (XPFO)
  2019-04-18 14:34                           ` Khalid Aziz
@ 2019-04-22 19:30                             ` Khalid Aziz
  2019-04-22 22:23                             ` Kees Cook
  1 sibling, 0 replies; 24+ messages in thread
From: Khalid Aziz @ 2019-04-22 19:30 UTC (permalink / raw)
  To: Kees Cook, Andy Lutomirski, Linus Torvalds
  Cc: Thomas Gleixner, Nadav Amit, Ingo Molnar, Juerg Haefliger,
	Tycho Andersen, Julian Stecklina, Konrad Rzeszutek Wilk,
	Juerg Haefliger, deepa.srinivasan, chris hyser, Tyler Hicks,
	David Woodhouse, Andrew Cooper, Jon Masters, Boris Ostrovsky,
	iommu, X86 ML, linux-alpha@vger.kernel.org,
	open list:DOCUMENTATION, Linux List Kernel Mailing, Linux-MM,
	LSM List, Khalid Aziz, Andrew Morton, Peter Zijlstra,
	Dave Hansen, Borislav Petkov, H. Peter Anvin, Arjan van de Ven,
	Greg Kroah-Hartman

On 4/18/19 8:34 AM, Khalid Aziz wrote:
> On 4/17/19 11:41 PM, Kees Cook wrote:
>> On Wed, Apr 17, 2019 at 11:41 PM Andy Lutomirski <luto@kernel.org> wrote:
>>> I don't think this type of NX goof was ever the argument for XPFO.
>>> The main argument I've heard is that a malicious user program writes a
>>> ROP payload into user memory (regular anonymous user memory) and then
>>> gets the kernel to erroneously set RSP (*not* RIP) to point there.
>>
>> Well, more than just ROP. Any of the various attack primitives. The NX
>> stuff is about moving RIP: SMEP-bypassing. But there is still basic
>> SMAP-bypassing for putting a malicious structure in userspace and
>> having the kernel access it via the linear mapping, etc.
>>
>>> I find this argument fairly weak for a couple reasons.  First, if
>>> we're worried about this, let's do in-kernel CFI, not XPFO, to
>>
>> CFI is getting much closer. Getting the kernel happy under Clang, LTO,
>> and CFI is under active development. (It's functional for arm64
>> already, and pieces have been getting upstreamed.)
>>
> 
> CFI theoretically offers protection with fairly low overhead. I have not
> played much with CFI in clang. I agree with Linus that probability of
> bugs in XPFO implementation itself is a cause of concern. If CFI in
> Clang can provide us the same level of protection as XPFO does, I
> wouldn't want to push for an expensive change like XPFO.
> 
> If Clang/CFI can't get us there for extended period of time, does it
> make sense to continue to poke at XPFO?

Any feedback on continuing the XPFO effort? If it makes sense to have XPFO
available as a solution for the ret2dir issue in case Clang/CFI does not
work out, I will continue to refine it.

--
Khalid



* Re: [RFC PATCH v9 03/13] mm: Add support for eXclusive Page Frame Ownership (XPFO)
  2019-04-18 14:34                           ` Khalid Aziz
  2019-04-22 19:30                             ` Khalid Aziz
@ 2019-04-22 22:23                             ` Kees Cook
  1 sibling, 0 replies; 24+ messages in thread
From: Kees Cook @ 2019-04-22 22:23 UTC (permalink / raw)
  To: Khalid Aziz
  Cc: Andy Lutomirski, Linus Torvalds, Thomas Gleixner, Nadav Amit,
	Ingo Molnar, Juerg Haefliger, Tycho Andersen, Julian Stecklina,
	Konrad Rzeszutek Wilk, Juerg Haefliger, deepa.srinivasan,
	chris hyser, Tyler Hicks, David Woodhouse, Andrew Cooper,
	Jon Masters, Boris Ostrovsky, iommu, X86 ML,
	linux-alpha@vger.kernel.org, open list:DOCUMENTATION,
	Linux List Kernel Mailing, Linux-MM, LSM List, Khalid Aziz,
	Andrew Morton, Peter Zijlstra, Dave Hansen, Borislav Petkov,
	H. Peter Anvin, Arjan van de Ven, Greg Kroah-Hartman

On Thu, Apr 18, 2019 at 7:35 AM Khalid Aziz <khalid.aziz@oracle.com> wrote:
>
> On 4/17/19 11:41 PM, Kees Cook wrote:
> > On Wed, Apr 17, 2019 at 11:41 PM Andy Lutomirski <luto@kernel.org> wrote:
> >> I don't think this type of NX goof was ever the argument for XPFO.
> >> The main argument I've heard is that a malicious user program writes a
> >> ROP payload into user memory (regular anonymous user memory) and then
> >> gets the kernel to erroneously set RSP (*not* RIP) to point there.
> >
> > Well, more than just ROP. Any of the various attack primitives. The NX
> > stuff is about moving RIP: SMEP-bypassing. But there is still basic
> > SMAP-bypassing for putting a malicious structure in userspace and
> > having the kernel access it via the linear mapping, etc.
> >
> >> I find this argument fairly weak for a couple reasons.  First, if
> >> we're worried about this, let's do in-kernel CFI, not XPFO, to
> >
> > CFI is getting much closer. Getting the kernel happy under Clang, LTO,
> > and CFI is under active development. (It's functional for arm64
> > already, and pieces have been getting upstreamed.)
> >
>
> CFI theoretically offers protection with fairly low overhead. I have not
> played much with CFI in clang. I agree with Linus that probability of
> bugs in XPFO implementation itself is a cause of concern. If CFI in
> Clang can provide us the same level of protection as XPFO does, I
> wouldn't want to push for an expensive change like XPFO.
>
> If Clang/CFI can't get us there for extended period of time, does it
> make sense to continue to poke at XPFO?

Well, I think CFI will certainly vastly narrow the execution paths
available to an attacker, but what I continue to see XPFO useful for
is stopping attacks that need to locate something in memory. (i.e. not
ret2dir but, like, read2dir.) It's arguable that such attacks would
just use heap, stack, etc to hold such things, but the linear map
remains relatively easy to find/target. But I agree: the protection is
getting more and more narrow (especially with CFI coming down the
pipe), and if it's still a 28% hit, that's not going to be tenable for
anyone but the truly paranoid. :)

All that said, there isn't a very good backward-edge CFI protection
(i.e. ROP defense) on x86 in Clang. The forward-edge looks decent, but
requires LTO, etc. My point is there is still a long path to gaining
CFI in upstream.

-- 
Kees Cook


* Re: [RFC PATCH v9 03/13] mm: Add support for eXclusive Page Frame Ownership (XPFO)
  2019-04-17 16:49     ` Khalid Aziz
  2019-04-17 17:09       ` Ingo Molnar
@ 2019-05-01 14:49       ` Waiman Long
  2019-05-01 15:18         ` Khalid Aziz
  1 sibling, 1 reply; 24+ messages in thread
From: Waiman Long @ 2019-05-01 14:49 UTC (permalink / raw)
  To: Khalid Aziz, Ingo Molnar
  Cc: juergh, tycho, jsteckli, keescook, konrad.wilk, Juerg Haefliger,
	deepa.srinivasan, chris.hyser, tyhicks, dwmw, andrew.cooper3,
	jcm, boris.ostrovsky, iommu, x86, linux-arm-kernel, linux-doc,
	linux-kernel, linux-mm, linux-security-module, Khalid Aziz,
	Linus Torvalds, Andrew Morton, Thomas Gleixner, Andy Lutomirski,
	Peter Zijlstra, Dave Hansen, Borislav Petkov, H. Peter Anvin,
	Arjan van de Ven, Greg Kroah-Hartman

On Wed, Apr 03, 2019 at 11:34:04AM -0600, Khalid Aziz wrote:
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index 858b6c0b9a15..9b36da94760e 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -2997,6 +2997,12 @@
>
>      nox2apic    [X86-64,APIC] Do not enable x2APIC mode.
>
> +    noxpfo        [XPFO] Disable eXclusive Page Frame Ownership (XPFO)
> +            when CONFIG_XPFO is on. Physical pages mapped into
> +            user applications will also be mapped in the
> +            kernel's address space as if CONFIG_XPFO was not
> +            enabled.
> +
>      cpu0_hotplug    [X86] Turn on CPU0 hotplug feature when
>              CONFIG_BOOTPARAM_HOTPLUG_CPU0 is off.
>              Some features depend on CPU0. Known dependencies are:

Given the big performance impact that XPFO can have, it should be off by
default when configured. Instead, the xpfo option should be used to
enable it.

Cheers,
Longman



* Re: [RFC PATCH v9 03/13] mm: Add support for eXclusive Page Frame Ownership (XPFO)
  2019-05-01 14:49       ` Waiman Long
@ 2019-05-01 15:18         ` Khalid Aziz
  0 siblings, 0 replies; 24+ messages in thread
From: Khalid Aziz @ 2019-05-01 15:18 UTC (permalink / raw)
  To: Waiman Long, Ingo Molnar
  Cc: juergh, tycho, jsteckli, keescook, konrad.wilk, Juerg Haefliger,
	deepa.srinivasan, chris.hyser, tyhicks, dwmw, andrew.cooper3,
	jcm, boris.ostrovsky, iommu, x86, linux-arm-kernel, linux-doc,
	linux-kernel, linux-mm, linux-security-module, Khalid Aziz,
	Linus Torvalds, Andrew Morton, Thomas Gleixner, Andy Lutomirski,
	Peter Zijlstra, Dave Hansen, Borislav Petkov, H. Peter Anvin,
	Arjan van de Ven, Greg Kroah-Hartman

On 5/1/19 8:49 AM, Waiman Long wrote:
> On Wed, Apr 03, 2019 at 11:34:04AM -0600, Khalid Aziz wrote:
>> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
>> index 858b6c0b9a15..9b36da94760e 100644
>> --- a/Documentation/admin-guide/kernel-parameters.txt
>> +++ b/Documentation/admin-guide/kernel-parameters.txt
>> @@ -2997,6 +2997,12 @@
>>
>>       nox2apic    [X86-64,APIC] Do not enable x2APIC mode.
>>
>> +    noxpfo        [XPFO] Disable eXclusive Page Frame Ownership (XPFO)
>> +            when CONFIG_XPFO is on. Physical pages mapped into
>> +            user applications will also be mapped in the
>> +            kernel's address space as if CONFIG_XPFO was not
>> +            enabled.
>> +
>>       cpu0_hotplug    [X86] Turn on CPU0 hotplug feature when
>>               CONFIG_BOOTPARAM_HOTPLUG_CPU0 is off.
>>               Some features depend on CPU0. Known dependencies are:
> 
> Given the big performance impact that XPFO can have, it should be off by
> default when configured. Instead, the xpfo option should be used to
> enable it.

Agreed. I plan to disable it by default in the next version of the
patch. This is likely to end up being a feature for extremely
security-conscious folks only, unless I or someone else comes up with a
further significant performance boost.

Thanks,
Khalid
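
[Illustration only, not part of the posted patch.] The opt-in semantics
agreed on above (XPFO off by default even when CONFIG_XPFO is built in,
with an "xpfo" boot option to turn it on) can be modelled with a small
userspace sketch. The function name xpfo_enabled_from_cmdline() and the
"last option wins" rule are assumptions made for this example. In the
kernel itself a switch like this would normally be registered through
early_param() or __setup() rather than by parsing the command line by
hand.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical userspace model of the proposed opt-in behaviour.
 * This is NOT kernel code and NOT the patch's implementation.
 *
 * Assumptions: options are whitespace-separated tokens, "xpfo" enables
 * the feature, "noxpfo" disables it, and the last matching token wins.
 */
static bool xpfo_enabled_from_cmdline(const char *cmdline)
{
	bool enabled = false;	/* proposed default: off */
	const char *p = cmdline;

	while (*p) {
		size_t len;

		p += strspn(p, " \t\n");	/* skip separators */
		len = strcspn(p, " \t\n");	/* length of the current token */

		/* Match whole tokens only, so "xpfoo" is not treated as "xpfo". */
		if (len == 4 && strncmp(p, "xpfo", 4) == 0)
			enabled = true;
		else if (len == 6 && strncmp(p, "noxpfo", 6) == 0)
			enabled = false;

		p += len;
	}

	return enabled;
}
```

With this model, a command line such as "root=/dev/sda1 quiet" leaves
XPFO disabled, and adding "xpfo" enables it. Keeping "noxpfo" as an
explicit off switch is a design choice of this sketch, not something
the thread settles.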



end of thread, other threads:[~2019-05-01 15:22 UTC | newest]

Thread overview: 24+ messages
     [not found] <cover.1554248001.git.khalid.aziz@oracle.com>
     [not found] ` <e6c57f675e5b53d4de266412aa526b7660c47918.1554248002.git.khalid.aziz@oracle.com>
     [not found]   ` <CALCETrXvwuwkVSJ+S5s7wTBkNNj3fRVxpx9BvsXWrT=3ZdRnCw@mail.gmail.com>
     [not found]     ` <20190404013956.GA3365@cisco>
     [not found]       ` <CALCETrVp37Xo3EMHkeedP1zxUMf9og=mceBa8c55e1F4G1DRSQ@mail.gmail.com>
     [not found]         ` <20190404154727.GA14030@cisco>
2019-04-04 16:23           ` [RFC PATCH v9 02/13] x86: always set IF before oopsing from page fault Sebastian Andrzej Siewior
2019-04-04 16:44 ` [RFC PATCH v9 00/13] Add support for eXclusive Page Frame Ownership Nadav Amit
2019-04-04 17:18   ` Khalid Aziz
     [not found] ` <f1ac3700970365fb979533294774af0b0dd84b3b.1554248002.git.khalid.aziz@oracle.com>
2019-04-17 16:15   ` [RFC PATCH v9 03/13] mm: Add support for eXclusive Page Frame Ownership (XPFO) Ingo Molnar
2019-04-17 16:49     ` Khalid Aziz
2019-04-17 17:09       ` Ingo Molnar
2019-04-17 17:19         ` Nadav Amit
2019-04-17 17:26           ` Ingo Molnar
2019-04-17 17:44             ` Nadav Amit
2019-04-17 21:19               ` Thomas Gleixner
     [not found]                 ` <CAHk-=wgBMg9P-nYQR2pS0XwVdikPCBqLsMFqR9nk=wSmAd4_5g@mail.gmail.com>
2019-04-17 23:42                   ` Thomas Gleixner
2019-04-17 23:52                     ` Linus Torvalds
2019-04-18  4:41                       ` Andy Lutomirski
2019-04-18  5:41                         ` Kees Cook
2019-04-18 14:34                           ` Khalid Aziz
2019-04-22 19:30                             ` Khalid Aziz
2019-04-22 22:23                             ` Kees Cook
2019-04-18  6:14                       ` Thomas Gleixner
2019-04-17 17:33         ` Khalid Aziz
2019-04-17 19:49           ` Andy Lutomirski
2019-04-17 19:52             ` Tycho Andersen
2019-04-17 20:12             ` Khalid Aziz
2019-05-01 14:49       ` Waiman Long
2019-05-01 15:18         ` Khalid Aziz
