From: Yu-cheng Yu <yu-cheng.yu@intel.com>
To: linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
	linux-mm@kvack.org, linux-arch@vger.kernel.org, x86@kernel.org,
	"H. Peter Anvin" <hpa@zytor.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, "H.J. Lu" <hjl.tools@gmail.com>,
	Vedvyas Shanbhogue <vedvyas.shanbhogue@intel.com>,
	"Ravi V. Shankar" <ravi.v.shankar@intel.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Andy Lutomirski <luto@amacapital.net>,
	Jonathan Corbet <corbet@lwn.net>, Oleg Nesterov <oleg@redhat.com>,
	Arnd Bergmann <arnd@arndb.de>,
	Mike Kravetz <mike.kravetz@oracle.com>
Cc: Yu-cheng Yu <yu-cheng.yu@intel.com>
Subject: [PATCH 8/9] x86/cet: Handle shadow stack page fault
Date: Thu,  7 Jun 2018 07:37:04 -0700	[thread overview]
Message-ID: <20180607143705.3531-9-yu-cheng.yu@intel.com> (raw)
In-Reply-To: <20180607143705.3531-1-yu-cheng.yu@intel.com>

When a task does fork(), its shadow stack must be duplicated for
the child.  However, the child may not actually use all pages of
the copied shadow stack.  This patch implements a flow similar to
copy-on-write of an anonymous page, but for shadow stack memory.
A shadow stack PTE must be RO and dirty.  We use this dirty-bit
requirement to effect the copying of shadow stack pages.
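
As a rough sketch of what this encoding implies (assuming the
_PAGE_DIRTY_HW and _PAGE_DIRTY_SW bits introduced earlier in this
series; the helper body below is illustrative, not taken from this
patch), pte_mkdirty_shstk() could look like:

	static inline pte_t pte_mkdirty_shstk(pte_t pte)
	{
		/*
		 * A shadow stack PTE carries the hardware dirty bit
		 * and must not carry the software one.
		 */
		pte = pte_clear_flags(pte, _PAGE_DIRTY_SW);
		return pte_set_flags(pte, _PAGE_DIRTY_HW);
	}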

In copy_one_pte(), we clear the dirty bit from the shadow stack
PTE.  On the next shadow stack access to the PTE, a page fault
occurs.  At that time, we copy or re-use the page and fix up the
PTE.
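
A minimal sketch of the fork-time write protection, assuming
ptep_set_wrprotect_flush() (from patch 6 of this series) moves the
dirty bit from hardware to software and flushes the stale TLB
entry; this body is illustrative, not the exact code from that
patch:

	static inline void ptep_set_wrprotect_flush(struct vm_area_struct *vma,
						    unsigned long addr,
						    pte_t *ptep)
	{
		pte_t pte = ptep_get_and_clear(vma->vm_mm, addr, ptep);

		pte = pte_wrprotect(pte);
		if (pte_flags(pte) & _PAGE_DIRTY_HW) {
			/* Keep the dirty state, but not in the hardware bit. */
			pte = pte_clear_flags(pte, _PAGE_DIRTY_HW);
			pte = pte_set_flags(pte, _PAGE_DIRTY_SW);
		}
		set_pte_at(vma->vm_mm, addr, ptep, pte);
		flush_tlb_page(vma, addr);
	}

The next shadow stack access then reaches do_wp_page(), which
copies or re-uses the page and restores _PAGE_DIRTY_HW via
pte_mkdirty_shstk().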

Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
---
 mm/memory.c | 32 +++++++++++++++++++++++++++++---
 1 file changed, 29 insertions(+), 3 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 01f5464e0fd2..275c7fb3fc96 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1022,7 +1022,7 @@ copy_one_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 	 * in the parent and the child
 	 */
 	if (is_cow_mapping(vm_flags)) {
-		ptep_set_wrprotect(src_mm, addr, src_pte);
+		ptep_set_wrprotect_flush(vma, addr, src_pte);
 		pte = pte_wrprotect(pte);
 	}
 
@@ -2444,7 +2444,13 @@ static inline void wp_page_reuse(struct vm_fault *vmf)
 
 	flush_cache_page(vma, vmf->address, pte_pfn(vmf->orig_pte));
 	entry = pte_mkyoung(vmf->orig_pte);
-	entry = maybe_mkwrite(pte_mkdirty(entry), vma);
+
+	if (is_shstk_mapping(vma->vm_flags))
+		entry = pte_mkdirty_shstk(entry);
+	else
+		entry = pte_mkdirty(entry);
+
+	entry = maybe_mkwrite(entry, vma);
 	if (ptep_set_access_flags(vma, vmf->address, vmf->pte, entry, 1))
 		update_mmu_cache(vma, vmf->address, vmf->pte);
 	pte_unmap_unlock(vmf->pte, vmf->ptl);
@@ -2517,7 +2523,11 @@ static int wp_page_copy(struct vm_fault *vmf)
 		}
 		flush_cache_page(vma, vmf->address, pte_pfn(vmf->orig_pte));
 		entry = mk_pte(new_page, vma->vm_page_prot);
-		entry = maybe_mkwrite(pte_mkdirty(entry), vma);
+		if (is_shstk_mapping(vma->vm_flags))
+			entry = pte_mkdirty_shstk(entry);
+		else
+			entry = pte_mkdirty(entry);
+		entry = maybe_mkwrite(entry, vma);
 		/*
 		 * Clear the pte entry and flush it first, before updating the
 		 * pte with the new entry. This will avoid a race condition
@@ -3192,6 +3202,14 @@ static int do_anonymous_page(struct vm_fault *vmf)
 	mem_cgroup_commit_charge(page, memcg, false, false);
 	lru_cache_add_active_or_unevictable(page, vma);
 setpte:
+	/*
+	 * If this is within a shadow stack mapping, mark
+	 * the PTE dirty.  We don't use pte_mkdirty(),
+	 * because the PTE must have _PAGE_DIRTY_HW set.
+	 */
+	if (is_shstk_mapping(vma->vm_flags))
+		entry = pte_mkdirty_shstk(entry);
+
 	set_pte_at(vma->vm_mm, vmf->address, vmf->pte, entry);
 
 	/* No need to invalidate - it was non-present before */
@@ -3974,6 +3992,14 @@ static int handle_pte_fault(struct vm_fault *vmf)
 	entry = vmf->orig_pte;
 	if (unlikely(!pte_same(*vmf->pte, entry)))
 		goto unlock;
+
+	/*
+	 * Shadow stack PTEs are copy-on-access, so do_wp_page()
+	 * handles them regardless of whether this is a write fault.
+	 */
+	if (is_shstk_mapping(vmf->vma->vm_flags))
+		return do_wp_page(vmf);
+
 	if (vmf->flags & FAULT_FLAG_WRITE) {
 		if (!pte_write(entry))
 			return do_wp_page(vmf);
-- 
2.15.1

Thread overview: 68+ messages

2018-06-07 14:36 [PATCH 0/9] Control Flow Enforcement - Part (2) Yu-cheng Yu
2018-06-07 14:36 ` [PATCH 1/9] x86/cet: Control protection exception handler Yu-cheng Yu
2018-06-07 15:46   ` Andy Lutomirski
2018-06-07 16:23     ` Yu-cheng Yu
2018-06-08  4:17   ` kbuild test robot
2018-06-08  4:18   ` kbuild test robot
2018-06-07 14:36 ` [PATCH 2/9] x86/cet: Add Kconfig option for user-mode shadow stack Yu-cheng Yu
2018-06-07 15:47   ` Andy Lutomirski
2018-06-07 15:58     ` Yu-cheng Yu
2018-06-07 16:28       ` Andy Lutomirski
2018-06-07 14:36 ` [PATCH 3/9] mm: Introduce VM_SHSTK for shadow stack memory Yu-cheng Yu
2018-06-07 14:37 ` [PATCH 4/9] x86/mm: Change _PAGE_DIRTY to _PAGE_DIRTY_HW Yu-cheng Yu
2018-06-08  3:53   ` kbuild test robot
2018-06-07 14:37 ` [PATCH 5/9] x86/mm: Introduce _PAGE_DIRTY_SW Yu-cheng Yu
2018-06-08  5:15   ` kbuild test robot
2018-06-07 14:37 ` [PATCH 6/9] x86/mm: Introduce ptep_set_wrprotect_flush and related functions Yu-cheng Yu
2018-06-07 16:24   ` Andy Lutomirski
2018-06-07 18:21     ` Dave Hansen
2018-06-07 18:24       ` Andy Lutomirski
2018-06-07 20:29     ` Dave Hansen
2018-06-07 20:36       ` Yu-cheng Yu
2018-06-08  0:59       ` Andy Lutomirski
2018-06-08  1:20         ` Dave Hansen
2018-06-08  4:43   ` kbuild test robot
2018-06-08 14:13   ` kbuild test robot
2018-06-07 14:37 ` [PATCH 7/9] x86/mm: Shadow stack page fault error checking Yu-cheng Yu
2018-06-07 16:26   ` Andy Lutomirski
2018-06-07 16:46     ` Yu-cheng Yu
2018-06-07 16:56     ` Dave Hansen
2018-06-07 14:37 ` [PATCH 8/9] x86/cet: Handle shadow stack page fault Yu-cheng Yu [this message]
2018-06-07 14:37 ` [PATCH 9/9] x86/cet: Handle THP/HugeTLB shadow stack page copying Yu-cheng Yu
