* [PATCH 0/4] MDWE without inheritance @ 2023-05-04 17:09 Florent Revest 2023-05-04 17:09 ` [PATCH 1/4] kselftest: vm: Fix tabs/spaces inconsistency in the mdwe test Florent Revest ` (4 more replies) 0 siblings, 5 replies; 19+ messages in thread From: Florent Revest @ 2023-05-04 17:09 UTC (permalink / raw) To: linux-kernel, linux-mm Cc: akpm, catalin.marinas, anshuman.khandual, joey.gouly, mhocko, keescook, david, peterx, izbyshev, nd, broonie, szabolcs.nagy, Florent Revest Joey recently introduced a Memory-Deny-Write-Executable (MDWE) prctl which tags current with a flag that prevents pages that were previously not executable from becoming executable. This tag always gets inherited by children tasks. (it's in MMF_INIT_MASK) At Google, we've been using a somewhat similar downstream patch for a few years now. To make the adoption of this feature easier, we've had it support a mode in which the W^X flag does not propagate to children. For example, this is handy if a C process which wants W^X protection suspects it could start children processes that would use a JIT. I'd like to align our features with the upstream prctl. This series proposes a new NO_INHERIT flag to the MDWE prctl to make this kind of adoption easier. It sets a different flag in current that is not in MMF_INIT_MASK and which does not propagate. As part of looking into MDWE, I also fixed a couple of things in the MDWE test. Florent Revest (4): kselftest: vm: Fix tabs/spaces inconsistency in the mdwe test kselftest: vm: Fix mdwe's mmap_FIXED test case mm: Add a NO_INHERIT flag to the PR_SET_MDWE prctl kselftest: vm: Add tests for no-inherit memory-deny-write-execute include/linux/mman.h | 8 +- include/linux/sched/coredump.h | 1 + include/uapi/linux/prctl.h | 1 + kernel/sys.c | 29 +++++-- tools/include/uapi/linux/prctl.h | 1 + tools/testing/selftests/mm/mdwe_test.c | 110 +++++++++++++++++++++---- 6 files changed, 128 insertions(+), 22 deletions(-) -- 2.40.1.495.gc816e09b53d-goog ^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH 1/4] kselftest: vm: Fix tabs/spaces inconsistency in the mdwe test 2023-05-04 17:09 [PATCH 0/4] MDWE without inheritance Florent Revest @ 2023-05-04 17:09 ` Florent Revest 2023-05-04 17:09 ` [PATCH 2/4] kselftest: vm: Fix mdwe's mmap_FIXED test case Florent Revest ` (3 subsequent siblings) 4 siblings, 0 replies; 19+ messages in thread From: Florent Revest @ 2023-05-04 17:09 UTC (permalink / raw) To: linux-kernel, linux-mm Cc: akpm, catalin.marinas, anshuman.khandual, joey.gouly, mhocko, keescook, david, peterx, izbyshev, nd, broonie, szabolcs.nagy, Florent Revest Signed-off-by: Florent Revest <revest@chromium.org> --- tools/testing/selftests/mm/mdwe_test.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/tools/testing/selftests/mm/mdwe_test.c b/tools/testing/selftests/mm/mdwe_test.c index bc91bef5d254..d0954c657feb 100644 --- a/tools/testing/selftests/mm/mdwe_test.c +++ b/tools/testing/selftests/mm/mdwe_test.c @@ -49,19 +49,19 @@ FIXTURE_VARIANT(mdwe) FIXTURE_VARIANT_ADD(mdwe, stock) { - .enabled = false, + .enabled = false, .forked = false, }; FIXTURE_VARIANT_ADD(mdwe, enabled) { - .enabled = true, + .enabled = true, .forked = false, }; FIXTURE_VARIANT_ADD(mdwe, forked) { - .enabled = true, + .enabled = true, .forked = true, }; -- 2.40.1.495.gc816e09b53d-goog ^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH 2/4] kselftest: vm: Fix mdwe's mmap_FIXED test case 2023-05-04 17:09 [PATCH 0/4] MDWE without inheritance Florent Revest 2023-05-04 17:09 ` [PATCH 1/4] kselftest: vm: Fix tabs/spaces inconsistency in the mdwe test Florent Revest @ 2023-05-04 17:09 ` Florent Revest 2023-05-04 17:13 ` Florent Revest 2023-05-04 17:09 ` [PATCH 3/4] mm: Add a NO_INHERIT flag to the PR_SET_MDWE prctl Florent Revest ` (2 subsequent siblings) 4 siblings, 1 reply; 19+ messages in thread From: Florent Revest @ 2023-05-04 17:09 UTC (permalink / raw) To: linux-kernel, linux-mm Cc: akpm, catalin.marinas, anshuman.khandual, joey.gouly, mhocko, keescook, david, peterx, izbyshev, nd, broonie, szabolcs.nagy, Florent Revest I checked with the original author, the mmap_FIXED test case wasn't properly tested and fails. Currently, it maps two consecutive (non overlapping) pages and expects the second mapping to be denied by MDWE but these two pages have nothing to do with each other so MDWE is actually out of the picture here. What the test actually intended to do was to remap a virtual address using MAP_FIXED. However, this operation unmaps the existing mapping and creates a new one so the va is backed by a new page and MDWE is again out of the picture, all remappings should succeed. This patch keeps the test case to make it clear that this situation is expected to work. Signed-off-by: Florent Revest <revest@chromium.org> --- tools/testing/selftests/mm/mdwe_test.c | 9 +++------ 1 file changed, 3 insertions(+), 6 deletions(-) diff --git a/tools/testing/selftests/mm/mdwe_test.c b/tools/testing/selftests/mm/mdwe_test.c index d0954c657feb..91aa9c3099e7 100644 --- a/tools/testing/selftests/mm/mdwe_test.c +++ b/tools/testing/selftests/mm/mdwe_test.c @@ -168,13 +168,10 @@ TEST_F(mdwe, mmap_FIXED) self->p = mmap(NULL, self->size, PROT_READ, self->flags, 0, 0); ASSERT_NE(self->p, MAP_FAILED); - p = mmap(self->p + self->size, self->size, PROT_READ | PROT_EXEC, + /* MAP_FIXED unmaps the existing page before mapping which is allowed */ + p = mmap(self->p, self->size, PROT_READ | PROT_EXEC, self->flags | MAP_FIXED, 0, 0); - if (variant->enabled) { - EXPECT_EQ(p, MAP_FAILED); - } else { - EXPECT_EQ(p, self->p); - } + EXPECT_EQ(p, self->p); } TEST_F(mdwe, arm64_BTI) -- 2.40.1.495.gc816e09b53d-goog ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH 2/4] kselftest: vm: Fix mdwe's mmap_FIXED test case 2023-05-04 17:09 ` [PATCH 2/4] kselftest: vm: Fix mdwe's mmap_FIXED test case Florent Revest @ 2023-05-04 17:13 ` Florent Revest 0 siblings, 0 replies; 19+ messages in thread From: Florent Revest @ 2023-05-04 17:13 UTC (permalink / raw) To: linux-kernel, linux-mm Cc: akpm, catalin.marinas, anshuman.khandual, joey.gouly, mhocko, keescook, david, peterx, izbyshev, nd, broonie, szabolcs.nagy On Thu, May 4, 2023 at 7:10 PM Florent Revest <revest@chromium.org> wrote: > > I checked with the original author, the mmap_FIXED test case wasn't > properly tested and fails. Currently, it maps two consecutive (non > overlapping) pages and expects the second mapping to be denied by MDWE > but these two pages have nothing to do with each other so MDWE is > actually out of the picture here. > > What the test actually intended to do was to remap a virtual address > using MAP_FIXED. However, this operation unmaps the existing mapping and > creates a new one so the va is backed by a new page and MDWE is again > out of the picture, all remappings should succeed. > > This patch keeps the test case to make it clear that this situation is > expected to work. > > Signed-off-by: Florent Revest <revest@chromium.org> Ah, snap, I hit send to fast and forgot to add: Fixes: 4cf1fe34fd18 ("kselftest: vm: add tests for memory-deny-write-execute") I will do it in the next iteration (surely there'll be other things to fix... :)) ^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH 3/4] mm: Add a NO_INHERIT flag to the PR_SET_MDWE prctl 2023-05-04 17:09 [PATCH 0/4] MDWE without inheritance Florent Revest 2023-05-04 17:09 ` [PATCH 1/4] kselftest: vm: Fix tabs/spaces inconsistency in the mdwe test Florent Revest 2023-05-04 17:09 ` [PATCH 2/4] kselftest: vm: Fix mdwe's mmap_FIXED test case Florent Revest @ 2023-05-04 17:09 ` Florent Revest 2023-05-05 18:34 ` Catalin Marinas 2023-05-04 17:09 ` [PATCH 4/4] kselftest: vm: Add tests for no-inherit memory-deny-write-execute Florent Revest 2023-05-04 20:06 ` [PATCH 0/4] MDWE without inheritance Peter Xu 4 siblings, 1 reply; 19+ messages in thread From: Florent Revest @ 2023-05-04 17:09 UTC (permalink / raw) To: linux-kernel, linux-mm Cc: akpm, catalin.marinas, anshuman.khandual, joey.gouly, mhocko, keescook, david, peterx, izbyshev, nd, broonie, szabolcs.nagy, Florent Revest This extends the current PR_SET_MDWE prctl arg with a bit to indicate that the process doesn't want MDWE protection to propagate to children. To implement this no-inherit mode, the tag in current->mm->flags must be absent from MMF_INIT_MASK. This means that the encoding for "MDWE but without inherit" is different in the prctl than in the mm flags. This leads to a bit of bit-mangling in the prctl implementation. Signed-off-by: Florent Revest <revest@chromium.org> --- include/linux/mman.h | 8 +++++++- include/linux/sched/coredump.h | 1 + include/uapi/linux/prctl.h | 1 + kernel/sys.c | 29 +++++++++++++++++++++++------ tools/include/uapi/linux/prctl.h | 1 + 5 files changed, 33 insertions(+), 7 deletions(-) diff --git a/include/linux/mman.h b/include/linux/mman.h index cee1e4b566d8..3d7a0b70ad2d 100644 --- a/include/linux/mman.h +++ b/include/linux/mman.h @@ -157,6 +157,12 @@ calc_vm_flag_bits(unsigned long flags) unsigned long vm_commit_limit(void); +static inline bool has_mdwe_enabled(struct task_struct *task) +{ + return test_bit(MMF_HAS_MDWE, &task->mm->flags) || + test_bit(MMF_HAS_MDWE_NO_INHERIT, &task->mm->flags); +} + /* * Denies creating a writable executable mapping or gaining executable permissions. * @@ -178,7 +184,7 @@ unsigned long vm_commit_limit(void); */ static inline bool map_deny_write_exec(struct vm_area_struct *vma, unsigned long vm_flags) { - if (!test_bit(MMF_HAS_MDWE, ¤t->mm->flags)) + if (!has_mdwe_enabled(current)) return false; if ((vm_flags & VM_EXEC) && (vm_flags & VM_WRITE)) diff --git a/include/linux/sched/coredump.h b/include/linux/sched/coredump.h index 0ee96ea7a0e9..b2d9659ef863 100644 --- a/include/linux/sched/coredump.h +++ b/include/linux/sched/coredump.h @@ -91,4 +91,5 @@ static inline int get_dumpable(struct mm_struct *mm) MMF_DISABLE_THP_MASK | MMF_HAS_MDWE_MASK) #define MMF_VM_MERGE_ANY 29 +#define MMF_HAS_MDWE_NO_INHERIT 30 #endif /* _LINUX_SCHED_COREDUMP_H */ diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h index f23d9a16507f..31ec44728412 100644 --- a/include/uapi/linux/prctl.h +++ b/include/uapi/linux/prctl.h @@ -284,6 +284,7 @@ struct prctl_mm_map { /* Memory deny write / execute */ #define PR_SET_MDWE 65 # define PR_MDWE_REFUSE_EXEC_GAIN 1 +# define PR_MDWE_NO_INHERIT 2 #define PR_GET_MDWE 66 diff --git a/kernel/sys.c b/kernel/sys.c index 339fee3eff6a..c864fd42ece1 100644 --- a/kernel/sys.c +++ b/kernel/sys.c @@ -2368,12 +2368,25 @@ static inline int prctl_set_mdwe(unsigned long bits, unsigned long arg3, if (arg3 || arg4 || arg5) return -EINVAL; - if (bits & ~(PR_MDWE_REFUSE_EXEC_GAIN)) + if (bits & ~(PR_MDWE_REFUSE_EXEC_GAIN | PR_MDWE_NO_INHERIT)) return -EINVAL; - if (bits & PR_MDWE_REFUSE_EXEC_GAIN) - set_bit(MMF_HAS_MDWE, ¤t->mm->flags); - else if (test_bit(MMF_HAS_MDWE, ¤t->mm->flags)) + /* Cannot set NO_INHERIT without REFUSE_EXEC_GAIN */ + if (bits & PR_MDWE_NO_INHERIT && !(bits & PR_MDWE_REFUSE_EXEC_GAIN)) + return -EINVAL; + + if (bits & PR_MDWE_REFUSE_EXEC_GAIN) { + if (bits & PR_MDWE_NO_INHERIT) { + /* Cannot go from inherit mode to no inherit */ + if (test_bit(MMF_HAS_MDWE, ¤t->mm->flags)) + return -EPERM; + + set_bit(MMF_HAS_MDWE_NO_INHERIT, ¤t->mm->flags); + } else { + set_bit(MMF_HAS_MDWE, ¤t->mm->flags); + clear_bit(MMF_HAS_MDWE_NO_INHERIT, ¤t->mm->flags); + } + } else if (has_mdwe_enabled(current)) return -EPERM; /* Cannot unset the flag */ return 0; @@ -2385,8 +2398,12 @@ static inline int prctl_get_mdwe(unsigned long arg2, unsigned long arg3, if (arg2 || arg3 || arg4 || arg5) return -EINVAL; - return test_bit(MMF_HAS_MDWE, ¤t->mm->flags) ? - PR_MDWE_REFUSE_EXEC_GAIN : 0; + if (test_bit(MMF_HAS_MDWE, ¤t->mm->flags)) + return PR_MDWE_REFUSE_EXEC_GAIN; + else if (test_bit(MMF_HAS_MDWE_NO_INHERIT, ¤t->mm->flags)) + return PR_MDWE_REFUSE_EXEC_GAIN | PR_MDWE_NO_INHERIT; + + return 0; } static int prctl_get_auxv(void __user *addr, unsigned long len) diff --git a/tools/include/uapi/linux/prctl.h b/tools/include/uapi/linux/prctl.h index 759b3f53e53f..a3424852d2d6 100644 --- a/tools/include/uapi/linux/prctl.h +++ b/tools/include/uapi/linux/prctl.h @@ -284,6 +284,7 @@ struct prctl_mm_map { /* Memory deny write / execute */ #define PR_SET_MDWE 65 # define PR_MDWE_REFUSE_EXEC_GAIN 1 +# define PR_MDWE_NO_INHERIT 2 #define PR_GET_MDWE 66 -- 2.40.1.495.gc816e09b53d-goog ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH 3/4] mm: Add a NO_INHERIT flag to the PR_SET_MDWE prctl 2023-05-04 17:09 ` [PATCH 3/4] mm: Add a NO_INHERIT flag to the PR_SET_MDWE prctl Florent Revest @ 2023-05-05 18:34 ` Catalin Marinas 2023-05-08 12:11 ` Florent Revest 0 siblings, 1 reply; 19+ messages in thread From: Catalin Marinas @ 2023-05-05 18:34 UTC (permalink / raw) To: Florent Revest Cc: linux-kernel, linux-mm, akpm, anshuman.khandual, joey.gouly, mhocko, keescook, david, peterx, izbyshev, nd, broonie, szabolcs.nagy On Thu, May 04, 2023 at 07:09:41PM +0200, Florent Revest wrote: > This extends the current PR_SET_MDWE prctl arg with a bit to indicate > that the process doesn't want MDWE protection to propagate to children. > > To implement this no-inherit mode, the tag in current->mm->flags must be > absent from MMF_INIT_MASK. This means that the encoding for "MDWE but > without inherit" is different in the prctl than in the mm flags. This > leads to a bit of bit-mangling in the prctl implementation. That bit mangling is not that bad but it complicates the code a bit, especially if we'll add new bits in the future. We also need to check both the original and the no-inherit bits for each feature. Another question is whether we want to support more fine-grained inheriting or just a big knob that disables inheriting for all the (future) MDWE flags. I think a somewhat simpler way would be to clear the flags on fork(), either based on a big MMF_HAS_MDWE_NO_INHERIT knob or individual ones. Something like below (completely untested): diff --git a/include/linux/sched/coredump.h b/include/linux/sched/coredump.h index 0ee96ea7a0e9..ca83a0c8d19c 100644 --- a/include/linux/sched/coredump.h +++ b/include/linux/sched/coredump.h @@ -91,4 +91,12 @@ static inline int get_dumpable(struct mm_struct *mm) MMF_DISABLE_THP_MASK | MMF_HAS_MDWE_MASK) #define MMF_VM_MERGE_ANY 29 + +#define MMF_INIT_FLAGS(flags) ({ \ + unsigned long new_flags = flags; \ + if (new_flags & (1UL << MMF_HAS_MDWE_NO_INHERIT)) \ + new_flags &= ~(1UL << MMF_HAS_MDWE_MASK); \ + new_flags & MMF_INIT_MASK; \ +}) + #endif /* _LINUX_SCHED_COREDUMP_H */ diff --git a/kernel/fork.c b/kernel/fork.c index ed4e01daccaa..53f0b68a5451 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -1288,7 +1288,7 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p, hugetlb_count_init(mm); if (current->mm) { - mm->flags = current->mm->flags & MMF_INIT_MASK; + mm->flags = MMF_INIT_FLAGS(current->mm->flags); mm->def_flags = current->mm->def_flags & VM_INIT_DEF_MASK; } else { mm->flags = default_dump_filter; The checks in MMF_INIT_FLAGS() can grow in time if we add more bits in there but we still only keep a single flag that determines whether the feature is enabled (maybe that's more like bikeshedding at this moment when we have a single bit). (fun remark: I see you cc'ed nd@arm.com'; that's not a real person, it's what our IT folk asked us to add on cc so that the Exchange server doesn't append the legal disclaimer; most lists are covered already without such cc but I guess people feel safer to add it, just in case) -- Catalin ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH 3/4] mm: Add a NO_INHERIT flag to the PR_SET_MDWE prctl 2023-05-05 18:34 ` Catalin Marinas @ 2023-05-08 12:11 ` Florent Revest 0 siblings, 0 replies; 19+ messages in thread From: Florent Revest @ 2023-05-08 12:11 UTC (permalink / raw) To: Catalin Marinas Cc: linux-kernel, linux-mm, akpm, anshuman.khandual, joey.gouly, mhocko, keescook, david, peterx, izbyshev, nd, broonie, szabolcs.nagy On Fri, May 5, 2023 at 8:34 PM Catalin Marinas <catalin.marinas@arm.com> wrote: > > On Thu, May 04, 2023 at 07:09:41PM +0200, Florent Revest wrote: > > This extends the current PR_SET_MDWE prctl arg with a bit to indicate > > that the process doesn't want MDWE protection to propagate to children. > > > > To implement this no-inherit mode, the tag in current->mm->flags must be > > absent from MMF_INIT_MASK. This means that the encoding for "MDWE but > > without inherit" is different in the prctl than in the mm flags. This > > leads to a bit of bit-mangling in the prctl implementation. > > That bit mangling is not that bad but it complicates the code a bit, > especially if we'll add new bits in the future. We also need to check > both the original and the no-inherit bits for each feature. I agree :) > Another question is whether we want to support more fine-grained > inheriting or just a big knob that disables inheriting for all the > (future) MDWE flags. Yep, I can't think of a more fine-grained inheritance model off the top of my head but it would be good to have a sane base for when the need arises. > I think a somewhat simpler way would be to clear the flags on fork(), > either based on a big MMF_HAS_MDWE_NO_INHERIT knob or individual ones. > Something like below (completely untested): > > diff --git a/include/linux/sched/coredump.h b/include/linux/sched/coredump.h > index 0ee96ea7a0e9..ca83a0c8d19c 100644 > --- a/include/linux/sched/coredump.h > +++ b/include/linux/sched/coredump.h > @@ -91,4 +91,12 @@ static inline int get_dumpable(struct mm_struct *mm) > MMF_DISABLE_THP_MASK | MMF_HAS_MDWE_MASK) > > #define MMF_VM_MERGE_ANY 29 > + > +#define MMF_INIT_FLAGS(flags) ({ \ > + unsigned long new_flags = flags; \ > + if (new_flags & (1UL << MMF_HAS_MDWE_NO_INHERIT)) \ > + new_flags &= ~(1UL << MMF_HAS_MDWE_MASK); \ > + new_flags & MMF_INIT_MASK; \ > +}) > + > #endif /* _LINUX_SCHED_COREDUMP_H */ > diff --git a/kernel/fork.c b/kernel/fork.c > index ed4e01daccaa..53f0b68a5451 100644 > --- a/kernel/fork.c > +++ b/kernel/fork.c > @@ -1288,7 +1288,7 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p, > hugetlb_count_init(mm); > > if (current->mm) { > - mm->flags = current->mm->flags & MMF_INIT_MASK; > + mm->flags = MMF_INIT_FLAGS(current->mm->flags); > mm->def_flags = current->mm->def_flags & VM_INIT_DEF_MASK; > } else { > mm->flags = default_dump_filter; > > The checks in MMF_INIT_FLAGS() can grow in time if we add more bits in > there but we still only keep a single flag that determines whether the > feature is enabled (maybe that's more like bikeshedding at this moment > when we have a single bit). Sounds good! I had considered something like this but I was afraid I'd spill too much logic into fork.c... I didn't think of making it a neat macro in coredump.h. That's a good point, I'll do this in v2. > (fun remark: I see you cc'ed nd@arm.com'; that's not a real person, it's > what our IT folk asked us to add on cc so that the Exchange server > doesn't append the legal disclaimer; most lists are covered already > without such cc but I guess people feel safer to add it, just in case) Ahah! I mostly just copied the cc list from Joey's series. I remember wondering why I couldn't find any patch sent by this mysterious ND but I thought that if they got such a cool username, surely they must have been at ARM since the early days and have some important role... :) Then... mister nd won't get to see my v2! Thanks for the heads up. ^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH 4/4] kselftest: vm: Add tests for no-inherit memory-deny-write-execute 2023-05-04 17:09 [PATCH 0/4] MDWE without inheritance Florent Revest ` (2 preceding siblings ...) 2023-05-04 17:09 ` [PATCH 3/4] mm: Add a NO_INHERIT flag to the PR_SET_MDWE prctl Florent Revest @ 2023-05-04 17:09 ` Florent Revest 2023-05-04 20:29 ` Alexey Izbyshev 2023-05-04 20:06 ` [PATCH 0/4] MDWE without inheritance Peter Xu 4 siblings, 1 reply; 19+ messages in thread From: Florent Revest @ 2023-05-04 17:09 UTC (permalink / raw) To: linux-kernel, linux-mm Cc: akpm, catalin.marinas, anshuman.khandual, joey.gouly, mhocko, keescook, david, peterx, izbyshev, nd, broonie, szabolcs.nagy, Florent Revest Add some tests to cover the new PR_MDWE_NO_INHERIT flag of the PR_SET_MDWE prctl. Signed-off-by: Florent Revest <revest@chromium.org> --- tools/testing/selftests/mm/mdwe_test.c | 95 ++++++++++++++++++++++++-- 1 file changed, 89 insertions(+), 6 deletions(-) diff --git a/tools/testing/selftests/mm/mdwe_test.c b/tools/testing/selftests/mm/mdwe_test.c index 91aa9c3099e7..9f08ed1b99ae 100644 --- a/tools/testing/selftests/mm/mdwe_test.c +++ b/tools/testing/selftests/mm/mdwe_test.c @@ -22,6 +22,8 @@ TEST(prctl_flags) { + EXPECT_LT(prctl(PR_SET_MDWE, PR_MDWE_NO_INHERIT, 0L, 0L, 7L), 0); + EXPECT_LT(prctl(PR_SET_MDWE, 7L, 0L, 0L, 0L), 0); EXPECT_LT(prctl(PR_SET_MDWE, 0L, 7L, 0L, 0L), 0); EXPECT_LT(prctl(PR_SET_MDWE, 0L, 0L, 7L, 0L), 0); @@ -33,6 +35,66 @@ TEST(prctl_flags) EXPECT_LT(prctl(PR_GET_MDWE, 0L, 0L, 0L, 7L), 0); } +FIXTURE(consecutive_prctl_flags) {}; +FIXTURE_SETUP(consecutive_prctl_flags) {} +FIXTURE_TEARDOWN(consecutive_prctl_flags) {} + +FIXTURE_VARIANT(consecutive_prctl_flags) +{ + unsigned long first_flags; + unsigned long second_flags; + bool should_work; +}; + +FIXTURE_VARIANT_ADD(consecutive_prctl_flags, same) +{ + .first_flags = PR_MDWE_REFUSE_EXEC_GAIN, + .second_flags = PR_MDWE_REFUSE_EXEC_GAIN, + .should_work = true, +}; + +FIXTURE_VARIANT_ADD(consecutive_prctl_flags, cant_disable_mdwe) +{ + .first_flags = PR_MDWE_REFUSE_EXEC_GAIN, + .second_flags = 0, + .should_work = false, +}; + +FIXTURE_VARIANT_ADD(consecutive_prctl_flags, cant_disable_mdwe_no_inherit) +{ + .first_flags = PR_MDWE_REFUSE_EXEC_GAIN | PR_MDWE_NO_INHERIT, + .second_flags = 0, + .should_work = false, +}; + +FIXTURE_VARIANT_ADD(consecutive_prctl_flags, can_lower_privileges) +{ + .first_flags = PR_MDWE_REFUSE_EXEC_GAIN | PR_MDWE_NO_INHERIT, + .second_flags = PR_MDWE_REFUSE_EXEC_GAIN, + .should_work = true, +}; + +FIXTURE_VARIANT_ADD(consecutive_prctl_flags, cant_gain_privileges) +{ + .first_flags = PR_MDWE_REFUSE_EXEC_GAIN, + .second_flags = PR_MDWE_REFUSE_EXEC_GAIN | PR_MDWE_NO_INHERIT, + .should_work = false, +}; + +TEST_F(consecutive_prctl_flags, two_prctls) +{ + int ret; + + EXPECT_EQ(prctl(PR_SET_MDWE, variant->first_flags, 0L, 0L, 0L), 0); + + ret = prctl(PR_SET_MDWE, variant->second_flags, 0L, 0L, 0L); + if (variant->should_work) { + EXPECT_EQ(ret, 0); + } else { + EXPECT_NE(ret, 0); + } +} + FIXTURE(mdwe) { void *p; @@ -45,28 +107,45 @@ FIXTURE_VARIANT(mdwe) { bool enabled; bool forked; + bool inherit; }; FIXTURE_VARIANT_ADD(mdwe, stock) { .enabled = false, .forked = false, + .inherit = false, }; FIXTURE_VARIANT_ADD(mdwe, enabled) { .enabled = true, .forked = false, + .inherit = true, }; -FIXTURE_VARIANT_ADD(mdwe, forked) +FIXTURE_VARIANT_ADD(mdwe, inherited) { .enabled = true, .forked = true, + .inherit = true, }; +FIXTURE_VARIANT_ADD(mdwe, not_inherited) +{ + .enabled = true, + .forked = true, + .inherit = false, +}; + +static bool executable_map_should_fail(const FIXTURE_VARIANT(mdwe) *variant) +{ + return variant->enabled && (!variant->forked || variant->inherit); +} + FIXTURE_SETUP(mdwe) { + unsigned long mdwe_flags; int ret, status; self->p = NULL; @@ -76,13 +155,17 @@ FIXTURE_SETUP(mdwe) if (!variant->enabled) return; - ret = prctl(PR_SET_MDWE, PR_MDWE_REFUSE_EXEC_GAIN, 0L, 0L, 0L); + mdwe_flags = PR_MDWE_REFUSE_EXEC_GAIN; + if (!variant->inherit) + mdwe_flags |= PR_MDWE_NO_INHERIT; + + ret = prctl(PR_SET_MDWE, mdwe_flags, 0L, 0L, 0L); ASSERT_EQ(ret, 0) { TH_LOG("PR_SET_MDWE failed or unsupported"); } ret = prctl(PR_GET_MDWE, 0L, 0L, 0L, 0L); - ASSERT_EQ(ret, 1); + ASSERT_EQ(ret, mdwe_flags); if (variant->forked) { self->pid = fork(); @@ -113,7 +196,7 @@ TEST_F(mdwe, mmap_READ_EXEC) TEST_F(mdwe, mmap_WRITE_EXEC) { self->p = mmap(NULL, self->size, PROT_WRITE | PROT_EXEC, self->flags, 0, 0); - if (variant->enabled) { + if (executable_map_should_fail(variant)) { EXPECT_EQ(self->p, MAP_FAILED); } else { EXPECT_NE(self->p, MAP_FAILED); @@ -139,7 +222,7 @@ TEST_F(mdwe, mprotect_add_EXEC) ASSERT_NE(self->p, MAP_FAILED); ret = mprotect(self->p, self->size, PROT_READ | PROT_EXEC); - if (variant->enabled) { + if (executable_map_should_fail(variant)) { EXPECT_LT(ret, 0); } else { EXPECT_EQ(ret, 0); @@ -154,7 +237,7 @@ TEST_F(mdwe, mprotect_WRITE_EXEC) ASSERT_NE(self->p, MAP_FAILED); ret = mprotect(self->p, self->size, PROT_WRITE | PROT_EXEC); - if (variant->enabled) { + if (executable_map_should_fail(variant)) { EXPECT_LT(ret, 0); } else { EXPECT_EQ(ret, 0); -- 2.40.1.495.gc816e09b53d-goog ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH 4/4] kselftest: vm: Add tests for no-inherit memory-deny-write-execute 2023-05-04 17:09 ` [PATCH 4/4] kselftest: vm: Add tests for no-inherit memory-deny-write-execute Florent Revest @ 2023-05-04 20:29 ` Alexey Izbyshev 2023-05-05 16:42 ` Florent Revest 0 siblings, 1 reply; 19+ messages in thread From: Alexey Izbyshev @ 2023-05-04 20:29 UTC (permalink / raw) To: Florent Revest Cc: linux-kernel, linux-mm, akpm, catalin.marinas, anshuman.khandual, joey.gouly, mhocko, keescook, david, peterx, nd, broonie, szabolcs.nagy On 2023-05-04 20:09, Florent Revest wrote: > Add some tests to cover the new PR_MDWE_NO_INHERIT flag of the > PR_SET_MDWE prctl. > > Signed-off-by: Florent Revest <revest@chromium.org> > --- > tools/testing/selftests/mm/mdwe_test.c | 95 ++++++++++++++++++++++++-- > 1 file changed, 89 insertions(+), 6 deletions(-) > > diff --git a/tools/testing/selftests/mm/mdwe_test.c > b/tools/testing/selftests/mm/mdwe_test.c > index 91aa9c3099e7..9f08ed1b99ae 100644 > --- a/tools/testing/selftests/mm/mdwe_test.c > +++ b/tools/testing/selftests/mm/mdwe_test.c > @@ -22,6 +22,8 @@ > > TEST(prctl_flags) > { > + EXPECT_LT(prctl(PR_SET_MDWE, PR_MDWE_NO_INHERIT, 0L, 0L, 7L), 0); > + PR_MDWE_NO_INHERIT is defined to an int constant, so passing it to prctl() without a cast to long or similar may produce wrong code on 64-bit targets (ABIs typically don't require the compiler to clear the upper 32 bits of a 64-bit register when passing a 32-bit argument, so va_arg(arg, unsigned long) in prctl() implementation might get junk). Arguably, defining PR_MDWE_* to plain int constants is a bug, or at least a footgun for users of uapi headers. Thanks, Alexey > EXPECT_LT(prctl(PR_SET_MDWE, 7L, 0L, 0L, 0L), 0); > EXPECT_LT(prctl(PR_SET_MDWE, 0L, 7L, 0L, 0L), 0); > EXPECT_LT(prctl(PR_SET_MDWE, 0L, 0L, 7L, 0L), 0); > @@ -33,6 +35,66 @@ TEST(prctl_flags) > EXPECT_LT(prctl(PR_GET_MDWE, 0L, 0L, 0L, 7L), 0); > } > > +FIXTURE(consecutive_prctl_flags) {}; > +FIXTURE_SETUP(consecutive_prctl_flags) {} > +FIXTURE_TEARDOWN(consecutive_prctl_flags) {} > + > +FIXTURE_VARIANT(consecutive_prctl_flags) > +{ > + unsigned long first_flags; > + unsigned long second_flags; > + bool should_work; > +}; > + > +FIXTURE_VARIANT_ADD(consecutive_prctl_flags, same) > +{ > + .first_flags = PR_MDWE_REFUSE_EXEC_GAIN, > + .second_flags = PR_MDWE_REFUSE_EXEC_GAIN, > + .should_work = true, > +}; > + > +FIXTURE_VARIANT_ADD(consecutive_prctl_flags, cant_disable_mdwe) > +{ > + .first_flags = PR_MDWE_REFUSE_EXEC_GAIN, > + .second_flags = 0, > + .should_work = false, > +}; > + > +FIXTURE_VARIANT_ADD(consecutive_prctl_flags, > cant_disable_mdwe_no_inherit) > +{ > + .first_flags = PR_MDWE_REFUSE_EXEC_GAIN | PR_MDWE_NO_INHERIT, > + .second_flags = 0, > + .should_work = false, > +}; > + > +FIXTURE_VARIANT_ADD(consecutive_prctl_flags, can_lower_privileges) > +{ > + .first_flags = PR_MDWE_REFUSE_EXEC_GAIN | PR_MDWE_NO_INHERIT, > + .second_flags = PR_MDWE_REFUSE_EXEC_GAIN, > + .should_work = true, > +}; > + > +FIXTURE_VARIANT_ADD(consecutive_prctl_flags, cant_gain_privileges) > +{ > + .first_flags = PR_MDWE_REFUSE_EXEC_GAIN, > + .second_flags = PR_MDWE_REFUSE_EXEC_GAIN | PR_MDWE_NO_INHERIT, > + .should_work = false, > +}; > + > +TEST_F(consecutive_prctl_flags, two_prctls) > +{ > + int ret; > + > + EXPECT_EQ(prctl(PR_SET_MDWE, variant->first_flags, 0L, 0L, 0L), 0); > + > + ret = prctl(PR_SET_MDWE, variant->second_flags, 0L, 0L, 0L); > + if (variant->should_work) { > + EXPECT_EQ(ret, 0); > + } else { > + EXPECT_NE(ret, 0); > + } > +} > + > FIXTURE(mdwe) > { > void *p; > @@ -45,28 +107,45 @@ FIXTURE_VARIANT(mdwe) > { > bool enabled; > bool forked; > + bool inherit; > }; > > FIXTURE_VARIANT_ADD(mdwe, stock) > { > .enabled = false, > .forked = false, > + .inherit = false, > }; > > FIXTURE_VARIANT_ADD(mdwe, enabled) > { > .enabled = true, > .forked = false, > + .inherit = true, > }; > > -FIXTURE_VARIANT_ADD(mdwe, forked) > +FIXTURE_VARIANT_ADD(mdwe, inherited) > { > .enabled = true, > .forked = true, > + .inherit = true, > }; > > +FIXTURE_VARIANT_ADD(mdwe, not_inherited) > +{ > + .enabled = true, > + .forked = true, > + .inherit = false, > +}; > + > +static bool executable_map_should_fail(const FIXTURE_VARIANT(mdwe) > *variant) > +{ > + return variant->enabled && (!variant->forked || variant->inherit); > +} > + > FIXTURE_SETUP(mdwe) > { > + unsigned long mdwe_flags; > int ret, status; > > self->p = NULL; > @@ -76,13 +155,17 @@ FIXTURE_SETUP(mdwe) > if (!variant->enabled) > return; > > - ret = prctl(PR_SET_MDWE, PR_MDWE_REFUSE_EXEC_GAIN, 0L, 0L, 0L); > + mdwe_flags = PR_MDWE_REFUSE_EXEC_GAIN; > + if (!variant->inherit) > + mdwe_flags |= PR_MDWE_NO_INHERIT; > + > + ret = prctl(PR_SET_MDWE, mdwe_flags, 0L, 0L, 0L); > ASSERT_EQ(ret, 0) { > TH_LOG("PR_SET_MDWE failed or unsupported"); > } > > ret = prctl(PR_GET_MDWE, 0L, 0L, 0L, 0L); > - ASSERT_EQ(ret, 1); > + ASSERT_EQ(ret, mdwe_flags); > > if (variant->forked) { > self->pid = fork(); > @@ -113,7 +196,7 @@ TEST_F(mdwe, mmap_READ_EXEC) > TEST_F(mdwe, mmap_WRITE_EXEC) > { > self->p = mmap(NULL, self->size, PROT_WRITE | PROT_EXEC, self->flags, > 0, 0); > - if (variant->enabled) { > + if (executable_map_should_fail(variant)) { > EXPECT_EQ(self->p, MAP_FAILED); > } else { > EXPECT_NE(self->p, MAP_FAILED); > @@ -139,7 +222,7 @@ TEST_F(mdwe, mprotect_add_EXEC) > ASSERT_NE(self->p, MAP_FAILED); > > ret = mprotect(self->p, self->size, PROT_READ | PROT_EXEC); > - if (variant->enabled) { > + if (executable_map_should_fail(variant)) { > EXPECT_LT(ret, 0); > } else { > EXPECT_EQ(ret, 0); > @@ -154,7 +237,7 @@ TEST_F(mdwe, mprotect_WRITE_EXEC) > ASSERT_NE(self->p, MAP_FAILED); > > ret = mprotect(self->p, self->size, PROT_WRITE | PROT_EXEC); > - if (variant->enabled) { > + if (executable_map_should_fail(variant)) { > EXPECT_LT(ret, 0); > } else { > EXPECT_EQ(ret, 0); ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 4/4] kselftest: vm: Add tests for no-inherit memory-deny-write-execute 2023-05-04 20:29 ` Alexey Izbyshev @ 2023-05-05 16:42 ` Florent Revest 2023-05-05 21:26 ` Alexey Izbyshev 0 siblings, 1 reply; 19+ messages in thread From: Florent Revest @ 2023-05-05 16:42 UTC (permalink / raw) To: Alexey Izbyshev Cc: linux-kernel, linux-mm, akpm, catalin.marinas, anshuman.khandual, joey.gouly, mhocko, keescook, david, peterx, nd, broonie, szabolcs.nagy On Thu, May 4, 2023 at 10:30 PM Alexey Izbyshev <izbyshev@ispras.ru> wrote: > > On 2023-05-04 20:09, Florent Revest wrote: > > Add some tests to cover the new PR_MDWE_NO_INHERIT flag of the > > PR_SET_MDWE prctl. > > > > Signed-off-by: Florent Revest <revest@chromium.org> > > --- > > tools/testing/selftests/mm/mdwe_test.c | 95 ++++++++++++++++++++++++-- > > 1 file changed, 89 insertions(+), 6 deletions(-) > > > > diff --git a/tools/testing/selftests/mm/mdwe_test.c > > b/tools/testing/selftests/mm/mdwe_test.c > > index 91aa9c3099e7..9f08ed1b99ae 100644 > > --- a/tools/testing/selftests/mm/mdwe_test.c > > +++ b/tools/testing/selftests/mm/mdwe_test.c > > @@ -22,6 +22,8 @@ > > > > TEST(prctl_flags) > > { > > + EXPECT_LT(prctl(PR_SET_MDWE, PR_MDWE_NO_INHERIT, 0L, 0L, 7L), 0); > > + > > PR_MDWE_NO_INHERIT is defined to an int constant, so passing it to > prctl() without a cast to long or similar may produce wrong code on > 64-bit targets (ABIs typically don't require the compiler to clear the > upper 32 bits of a 64-bit register when passing a 32-bit argument, so > va_arg(arg, unsigned long) in prctl() implementation might get junk). Ah, good catch Alexey! :) > Arguably, defining PR_MDWE_* to plain int constants is a bug, or at > least a footgun for users of uapi headers. As part of the next version of this series, I'm happy to: 1- change the existing PR_MDWE_REFUSE_EXEC_GAIN to 1UL 2- introduce PR_MDWE_NO_INHERIT as 2UL But I'm surprised that most of the macros in include/uapi/linux/prctl.h are the same sort of footguns already ? Hasn't it been an issue for other prctls yet ? ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 4/4] kselftest: vm: Add tests for no-inherit memory-deny-write-execute 2023-05-05 16:42 ` Florent Revest @ 2023-05-05 21:26 ` Alexey Izbyshev 2023-05-08 12:12 ` Florent Revest 0 siblings, 1 reply; 19+ messages in thread From: Alexey Izbyshev @ 2023-05-05 21:26 UTC (permalink / raw) To: Florent Revest Cc: linux-kernel, linux-mm, akpm, catalin.marinas, anshuman.khandual, joey.gouly, mhocko, keescook, david, peterx, nd, broonie, szabolcs.nagy On 2023-05-05 19:42, Florent Revest wrote: > On Thu, May 4, 2023 at 10:30 PM Alexey Izbyshev <izbyshev@ispras.ru> > wrote: >> >> On 2023-05-04 20:09, Florent Revest wrote: >> > Add some tests to cover the new PR_MDWE_NO_INHERIT flag of the >> > PR_SET_MDWE prctl. >> > >> > Signed-off-by: Florent Revest <revest@chromium.org> >> > --- >> > tools/testing/selftests/mm/mdwe_test.c | 95 ++++++++++++++++++++++++-- >> > 1 file changed, 89 insertions(+), 6 deletions(-) >> > >> > diff --git a/tools/testing/selftests/mm/mdwe_test.c >> > b/tools/testing/selftests/mm/mdwe_test.c >> > index 91aa9c3099e7..9f08ed1b99ae 100644 >> > --- a/tools/testing/selftests/mm/mdwe_test.c >> > +++ b/tools/testing/selftests/mm/mdwe_test.c >> > @@ -22,6 +22,8 @@ >> > >> > TEST(prctl_flags) >> > { >> > + EXPECT_LT(prctl(PR_SET_MDWE, PR_MDWE_NO_INHERIT, 0L, 0L, 7L), 0); >> > + >> >> PR_MDWE_NO_INHERIT is defined to an int constant, so passing it to >> prctl() without a cast to long or similar may produce wrong code on >> 64-bit targets (ABIs typically don't require the compiler to clear the >> upper 32 bits of a 64-bit register when passing a 32-bit argument, so >> va_arg(arg, unsigned long) in prctl() implementation might get junk). > > Ah, good catch Alexey! :) > >> Arguably, defining PR_MDWE_* to plain int constants is a bug, or at >> least a footgun for users of uapi headers. > > As part of the next version of this series, I'm happy to: > 1- change the existing PR_MDWE_REFUSE_EXEC_GAIN to 1UL > 2- introduce PR_MDWE_NO_INHERIT as 2UL > Yes, I think it's the right thing to do. I suggest to spell them as (1UL << 0), etc. for consistency with all other unsigned long flags in the header. > But I'm surprised that most of the macros in > include/uapi/linux/prctl.h are the same sort of footguns already ? > Hasn't it been an issue for other prctls yet ? Yes, they are. I'm not aware of a public discussion of this specific issue, but note that at least for some prctl() options the kernel doesn't care about upper bits because arguments are truncated before doing anything else with them (e.g. for PR_SCHED_CORE raw prctl() arguments are implicitly converted to what sched_core_share_pid() expects). Also, actually getting junk in the upper bits might not always be easy (e.g. on x86-64 all or almost all instructions with r32 destination operand clear the upper bits). Unfortunately, I don't have a better answer than this. Alexey ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 4/4] kselftest: vm: Add tests for no-inherit memory-deny-write-execute 2023-05-05 21:26 ` Alexey Izbyshev @ 2023-05-08 12:12 ` Florent Revest 0 siblings, 0 replies; 19+ messages in thread From: Florent Revest @ 2023-05-08 12:12 UTC (permalink / raw) To: Alexey Izbyshev Cc: linux-kernel, linux-mm, akpm, catalin.marinas, anshuman.khandual, joey.gouly, mhocko, keescook, david, peterx, nd, broonie, szabolcs.nagy On Fri, May 5, 2023 at 11:26 PM Alexey Izbyshev <izbyshev@ispras.ru> wrote: > > On 2023-05-05 19:42, Florent Revest wrote: > > On Thu, May 4, 2023 at 10:30 PM Alexey Izbyshev <izbyshev@ispras.ru> > > wrote: > >> > >> On 2023-05-04 20:09, Florent Revest wrote: > >> > Add some tests to cover the new PR_MDWE_NO_INHERIT flag of the > >> > PR_SET_MDWE prctl. > >> > > >> > Signed-off-by: Florent Revest <revest@chromium.org> > >> > --- > >> > tools/testing/selftests/mm/mdwe_test.c | 95 ++++++++++++++++++++++++-- > >> > 1 file changed, 89 insertions(+), 6 deletions(-) > >> > > >> > diff --git a/tools/testing/selftests/mm/mdwe_test.c > >> > b/tools/testing/selftests/mm/mdwe_test.c > >> > index 91aa9c3099e7..9f08ed1b99ae 100644 > >> > --- a/tools/testing/selftests/mm/mdwe_test.c > >> > +++ b/tools/testing/selftests/mm/mdwe_test.c > >> > @@ -22,6 +22,8 @@ > >> > > >> > TEST(prctl_flags) > >> > { > >> > + EXPECT_LT(prctl(PR_SET_MDWE, PR_MDWE_NO_INHERIT, 0L, 0L, 7L), 0); > >> > + > >> > >> PR_MDWE_NO_INHERIT is defined to an int constant, so passing it to > >> prctl() without a cast to long or similar may produce wrong code on > >> 64-bit targets (ABIs typically don't require the compiler to clear the > >> upper 32 bits of a 64-bit register when passing a 32-bit argument, so > >> va_arg(arg, unsigned long) in prctl() implementation might get junk). > > > > Ah, good catch Alexey! :) > > > >> Arguably, defining PR_MDWE_* to plain int constants is a bug, or at > >> least a footgun for users of uapi headers. > > > > As part of the next version of this series, I'm happy to: > > 1- change the existing PR_MDWE_REFUSE_EXEC_GAIN to 1UL > > 2- introduce PR_MDWE_NO_INHERIT as 2UL > > > Yes, I think it's the right thing to do. I suggest to spell them as (1UL > << 0), etc. for consistency with all other unsigned long flags in the > header. Ah yeah, absolutely! Good tip too, thank you :) > > But I'm surprised that most of the macros in > > include/uapi/linux/prctl.h are the same sort of footguns already ? > > Hasn't it been an issue for other prctls yet ? > > Yes, they are. I'm not aware of a public discussion of this specific > issue, but note that at least for some prctl() options the kernel > doesn't care about upper bits because arguments are truncated before > doing anything else with them (e.g. for PR_SCHED_CORE raw prctl() That makes sense > arguments are implicitly converted to what sched_core_share_pid() > expects). Also, actually getting junk in the upper bits might not always > be easy (e.g. on x86-64 all or almost all instructions with r32 > destination operand clear the upper bits). Unfortunately, I don't have a > better answer than this. Okay, I was just curious, that's good to know ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 0/4] MDWE without inheritance 2023-05-04 17:09 [PATCH 0/4] MDWE without inheritance Florent Revest ` (3 preceding siblings ...) 2023-05-04 17:09 ` [PATCH 4/4] kselftest: vm: Add tests for no-inherit memory-deny-write-execute Florent Revest @ 2023-05-04 20:06 ` Peter Xu 2023-05-05 16:42 ` Florent Revest 4 siblings, 1 reply; 19+ messages in thread From: Peter Xu @ 2023-05-04 20:06 UTC (permalink / raw) To: Florent Revest Cc: linux-kernel, linux-mm, akpm, catalin.marinas, anshuman.khandual, joey.gouly, mhocko, keescook, david, izbyshev, nd, broonie, szabolcs.nagy On Thu, May 04, 2023 at 07:09:38PM +0200, Florent Revest wrote: > Joey recently introduced a Memory-Deny-Write-Executable (MDWE) prctl which tags > current with a flag that prevents pages that were previously not executable from > becoming executable. > This tag always gets inherited by children tasks. (it's in MMF_INIT_MASK) > > At Google, we've been using a somewhat similar downstream patch for a few years > now. To make the adoption of this feature easier, we've had it support a mode in > which the W^X flag does not propagate to children. For example, this is handy if > a C process which wants W^X protection suspects it could start children > processes that would use a JIT. > > I'd like to align our features with the upstream prctl. This series proposes a > new NO_INHERIT flag to the MDWE prctl to make this kind of adoption easier. It > sets a different flag in current that is not in MMF_INIT_MASK and which does not > propagate. I don't think I have enough context, so sorry if I'm going to ask a naive question.. I can understand how current MDWE helps on not allowing any modifi-able content from becoming executable. How could NO_INHERIT help if it won't inherit and not in MMF_INIT_MASK? IIUC it means the restriction will only apply to the current process. Then I assume the process can escape from this rule simply by a fork(). If so, what's the point to protect at all? And, what's the difference of this comparing to disabling MDWE after being enabled (which seems to be forbidden for now, but it seems fork() can play a similar role of disabling it)? Thanks, -- Peter Xu ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 0/4] MDWE without inheritance 2023-05-04 20:06 ` [PATCH 0/4] MDWE without inheritance Peter Xu @ 2023-05-05 16:42 ` Florent Revest 2023-05-08 1:29 ` Peter Xu 0 siblings, 1 reply; 19+ messages in thread From: Florent Revest @ 2023-05-05 16:42 UTC (permalink / raw) To: Peter Xu Cc: linux-kernel, linux-mm, akpm, catalin.marinas, anshuman.khandual, joey.gouly, mhocko, keescook, david, izbyshev, nd, broonie, szabolcs.nagy On Thu, May 4, 2023 at 10:06 PM Peter Xu <peterx@redhat.com> wrote: > > On Thu, May 04, 2023 at 07:09:38PM +0200, Florent Revest wrote: > > Joey recently introduced a Memory-Deny-Write-Executable (MDWE) prctl which tags > > current with a flag that prevents pages that were previously not executable from > > becoming executable. > > This tag always gets inherited by children tasks. (it's in MMF_INIT_MASK) > > > > At Google, we've been using a somewhat similar downstream patch for a few years > > now. To make the adoption of this feature easier, we've had it support a mode in > > which the W^X flag does not propagate to children. For example, this is handy if > > a C process which wants W^X protection suspects it could start children > > processes that would use a JIT. > > > > I'd like to align our features with the upstream prctl. This series proposes a > > new NO_INHERIT flag to the MDWE prctl to make this kind of adoption easier. It > > sets a different flag in current that is not in MMF_INIT_MASK and which does not > > propagate. > > I don't think I have enough context, so sorry if I'm going to ask a naive > question.. Not at all! :) You're absolutely right, it's important to address these points. > I can understand how current MDWE helps on not allowing any modifi-able > content from becoming executable. How could NO_INHERIT help if it won't > inherit and not in MMF_INIT_MASK? The way I see it, enabling MDWE is just a small step towards hardening a binary anyway. It can possibly make exploitation a bit harder in the case where the attacker has _just_: a write primitive they can use to write a shellcode somewhere and a primitive to make that page executable later. It's a fairly narrow protection already and I think it only really helps as part of a broader "defense in depth" strategy. > IIUC it means the restriction will only apply to the current process. Then > I assume the process can escape from this rule simply by a fork(). If so, > what's the point to protect at all? If we assume enough control from the attacker, then MDWE is already useless since it can be bypassed by writing to a file and then mmapping that file with PROT_EXEC. I think that's a good example of how "perfect can be the enemy of good" in security hardening. MDWE isn't a silver-bullet but it's a cheap trick and it makes a small dent in reducing the attack surface so it seems worth having anyway ? But indeed, to address your question, if you choose to use this NO_INHERIT flag: you're no longer protected if the attacker can fork() as part of their exploitation. I think it's been a useful trade-off for our internal users since, on the other hand, it also makes adoption a lot easier: our C++ services developers can trivially opt into a potpourri of hardening features without having to think too much about how they work under-the-hood. The default behavior has been to use a NO_INHERIT strategy so users don't get bad surprises the day when they try to spawn a JITted subcommand. In the meantime, their C++ service still gets a little bit of extra protection. > And, what's the difference of this comparing to disabling MDWE after being > enabled (which seems to be forbidden for now, but it seems fork() can play > a similar role of disabling it)? That would be functionally somewhat similar, yes. I think it mostly comes down to ease of adoption. I imagine that users who would opt into NO_INHERIT are those who are interested in MDWE for the binary they are writing but aren't 100% confident in what subprocesses they will run and so they don't have to think about disabling it after every fork. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 0/4] MDWE without inheritance 2023-05-05 16:42 ` Florent Revest @ 2023-05-08 1:29 ` Peter Xu 2023-05-08 12:12 ` Florent Revest 0 siblings, 1 reply; 19+ messages in thread From: Peter Xu @ 2023-05-08 1:29 UTC (permalink / raw) To: Florent Revest Cc: linux-kernel, linux-mm, akpm, catalin.marinas, anshuman.khandual, joey.gouly, mhocko, keescook, david, izbyshev, nd, broonie, szabolcs.nagy On Fri, May 05, 2023 at 06:42:08PM +0200, Florent Revest wrote: > On Thu, May 4, 2023 at 10:06 PM Peter Xu <peterx@redhat.com> wrote: > > > > On Thu, May 04, 2023 at 07:09:38PM +0200, Florent Revest wrote: > > > Joey recently introduced a Memory-Deny-Write-Executable (MDWE) prctl which tags > > > current with a flag that prevents pages that were previously not executable from > > > becoming executable. > > > This tag always gets inherited by children tasks. (it's in MMF_INIT_MASK) > > > > > > At Google, we've been using a somewhat similar downstream patch for a few years > > > now. To make the adoption of this feature easier, we've had it support a mode in > > > which the W^X flag does not propagate to children. For example, this is handy if > > > a C process which wants W^X protection suspects it could start children > > > processes that would use a JIT. > > > > > > I'd like to align our features with the upstream prctl. This series proposes a > > > new NO_INHERIT flag to the MDWE prctl to make this kind of adoption easier. It > > > sets a different flag in current that is not in MMF_INIT_MASK and which does not > > > propagate. > > > > I don't think I have enough context, so sorry if I'm going to ask a naive > > question.. > > Not at all! :) You're absolutely right, it's important to address these points. > > > I can understand how current MDWE helps on not allowing any modifi-able > > content from becoming executable. How could NO_INHERIT help if it won't > > inherit and not in MMF_INIT_MASK? > > The way I see it, enabling MDWE is just a small step towards hardening > a binary anyway. It can possibly make exploitation a bit harder in the > case where the attacker has _just_: a write primitive they can use to > write a shellcode somewhere and a primitive to make that page > executable later. It's a fairly narrow protection already and I think > it only really helps as part of a broader "defense in depth" strategy. > > > IIUC it means the restriction will only apply to the current process. Then > > I assume the process can escape from this rule simply by a fork(). If so, > > what's the point to protect at all? > > If we assume enough control from the attacker, then MDWE is already > useless since it can be bypassed by writing to a file and then > mmapping that file with PROT_EXEC. I think that's a good example of > how "perfect can be the enemy of good" in security hardening. MDWE > isn't a silver-bullet but it's a cheap trick and it makes a small dent > in reducing the attack surface so it seems worth having anyway ? > > But indeed, to address your question, if you choose to use this > NO_INHERIT flag: you're no longer protected if the attacker can fork() > as part of their exploitation. I think it's been a useful trade-off > for our internal users since, on the other hand, it also makes > adoption a lot easier: our C++ services developers can trivially opt > into a potpourri of hardening features without having to think too > much about how they work under-the-hood. The default behavior has been > to use a NO_INHERIT strategy so users don't get bad surprises the day > when they try to spawn a JITted subcommand. In the meantime, their C++ > service still gets a little bit of extra protection. > > > And, what's the difference of this comparing to disabling MDWE after being > > enabled (which seems to be forbidden for now, but it seems fork() can play > > a similar role of disabling it)? > > That would be functionally somewhat similar, yes. I think it mostly > comes down to ease of adoption. I imagine that users who would opt > into NO_INHERIT are those who are interested in MDWE for the binary > they are writing but aren't 100% confident in what subprocesses they > will run and so they don't have to think about disabling it after > every fork. Okay, that makes sense to me. Thanks. Since the original MDWE was for systemd, I'm wondering what will happen if some program like what you said is invoked by systemd and with MDWE enabled already. Currently in your patch IIUC MDWE_NO_INHERIT will fail directly on MDWE enabled process, but then it makes me think whether it makes more sense to allow converting MDWE->MDWE_NO_INHERIT in this case. It seems to provide a most broad coverage on system daemons using MDWE starting from systemd initial process, meanwhile allows specific daemons to fork into anything like a JIT process so it can make itself NO_INHERIT. Attackers won't leverage this because MDWE_NO_INHERIT also means MDWE enabled. -- Peter Xu ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 0/4] MDWE without inheritance 2023-05-08 1:29 ` Peter Xu @ 2023-05-08 12:12 ` Florent Revest 2023-05-08 14:10 ` Catalin Marinas 0 siblings, 1 reply; 19+ messages in thread From: Florent Revest @ 2023-05-08 12:12 UTC (permalink / raw) To: Peter Xu Cc: linux-kernel, linux-mm, akpm, catalin.marinas, anshuman.khandual, joey.gouly, mhocko, keescook, david, izbyshev, nd, broonie, szabolcs.nagy, toiwoton, lennart On Mon, May 8, 2023 at 3:29 AM Peter Xu <peterx@redhat.com> wrote: > > On Fri, May 05, 2023 at 06:42:08PM +0200, Florent Revest wrote: > > On Thu, May 4, 2023 at 10:06 PM Peter Xu <peterx@redhat.com> wrote: > > > > > > On Thu, May 04, 2023 at 07:09:38PM +0200, Florent Revest wrote: > > > > Joey recently introduced a Memory-Deny-Write-Executable (MDWE) prctl which tags > > > > current with a flag that prevents pages that were previously not executable from > > > > becoming executable. > > > > This tag always gets inherited by children tasks. (it's in MMF_INIT_MASK) > > > > > > > > At Google, we've been using a somewhat similar downstream patch for a few years > > > > now. To make the adoption of this feature easier, we've had it support a mode in > > > > which the W^X flag does not propagate to children. For example, this is handy if > > > > a C process which wants W^X protection suspects it could start children > > > > processes that would use a JIT. > > > > > > > > I'd like to align our features with the upstream prctl. This series proposes a > > > > new NO_INHERIT flag to the MDWE prctl to make this kind of adoption easier. It > > > > sets a different flag in current that is not in MMF_INIT_MASK and which does not > > > > propagate. > > > > > > I don't think I have enough context, so sorry if I'm going to ask a naive > > > question.. > > > > Not at all! :) You're absolutely right, it's important to address these points. > > > > > I can understand how current MDWE helps on not allowing any modifi-able > > > content from becoming executable. How could NO_INHERIT help if it won't > > > inherit and not in MMF_INIT_MASK? > > > > The way I see it, enabling MDWE is just a small step towards hardening > > a binary anyway. It can possibly make exploitation a bit harder in the > > case where the attacker has _just_: a write primitive they can use to > > write a shellcode somewhere and a primitive to make that page > > executable later. It's a fairly narrow protection already and I think > > it only really helps as part of a broader "defense in depth" strategy. > > > > > IIUC it means the restriction will only apply to the current process. Then > > > I assume the process can escape from this rule simply by a fork(). If so, > > > what's the point to protect at all? > > > > If we assume enough control from the attacker, then MDWE is already > > useless since it can be bypassed by writing to a file and then > > mmapping that file with PROT_EXEC. I think that's a good example of > > how "perfect can be the enemy of good" in security hardening. MDWE > > isn't a silver-bullet but it's a cheap trick and it makes a small dent > > in reducing the attack surface so it seems worth having anyway ? > > > > But indeed, to address your question, if you choose to use this > > NO_INHERIT flag: you're no longer protected if the attacker can fork() > > as part of their exploitation. I think it's been a useful trade-off > > for our internal users since, on the other hand, it also makes > > adoption a lot easier: our C++ services developers can trivially opt > > into a potpourri of hardening features without having to think too > > much about how they work under-the-hood. The default behavior has been > > to use a NO_INHERIT strategy so users don't get bad surprises the day > > when they try to spawn a JITted subcommand. In the meantime, their C++ > > service still gets a little bit of extra protection. > > > > > And, what's the difference of this comparing to disabling MDWE after being > > > enabled (which seems to be forbidden for now, but it seems fork() can play > > > a similar role of disabling it)? > > > > That would be functionally somewhat similar, yes. I think it mostly > > comes down to ease of adoption. I imagine that users who would opt > > into NO_INHERIT are those who are interested in MDWE for the binary > > they are writing but aren't 100% confident in what subprocesses they > > will run and so they don't have to think about disabling it after > > every fork. > > Okay, that makes sense to me. Thanks. > > Since the original MDWE was for systemd, I'm wondering what will happen if > some program like what you said is invoked by systemd and with MDWE enabled > already. Good question > Currently in your patch IIUC MDWE_NO_INHERIT will fail directly on MDWE > enabled process, Yes, I tried to stay close to the spirit of the existing logic (which doesn't allow any sort of privilege gains) but this is not particularly a requirement on our side so I'm quite flexible here. Maybe Joey has an input here ? > but then it makes me think whether it makes more sense to > allow converting MDWE->MDWE_NO_INHERIT in this case. It seems to provide a > most broad coverage on system daemons using MDWE starting from systemd > initial process, meanwhile allows specific daemons to fork into anything > like a JIT process so it can make itself NO_INHERIT. Attackers won't > leverage this because MDWE_NO_INHERIT also means MDWE enabled. I should have cc-ed systemd folks who could have opinions on this. I will for v2. + cc Topi & Lennart ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 0/4] MDWE without inheritance 2023-05-08 12:12 ` Florent Revest @ 2023-05-08 14:10 ` Catalin Marinas 2023-05-08 17:21 ` Topi Miettinen 0 siblings, 1 reply; 19+ messages in thread From: Catalin Marinas @ 2023-05-08 14:10 UTC (permalink / raw) To: Florent Revest Cc: Peter Xu, linux-kernel, linux-mm, akpm, anshuman.khandual, joey.gouly, mhocko, keescook, david, izbyshev, nd, broonie, szabolcs.nagy, toiwoton, lennart On Mon, May 08, 2023 at 02:12:21PM +0200, Florent Revest wrote: > On Mon, May 8, 2023 at 3:29 AM Peter Xu <peterx@redhat.com> wrote: > > On Fri, May 05, 2023 at 06:42:08PM +0200, Florent Revest wrote: > > > On Thu, May 4, 2023 at 10:06 PM Peter Xu <peterx@redhat.com> wrote: > > > > And, what's the difference of this comparing to disabling MDWE after being > > > > enabled (which seems to be forbidden for now, but it seems fork() can play > > > > a similar role of disabling it)? > > > > > > That would be functionally somewhat similar, yes. I think it mostly > > > comes down to ease of adoption. I imagine that users who would opt > > > into NO_INHERIT are those who are interested in MDWE for the binary > > > they are writing but aren't 100% confident in what subprocesses they > > > will run and so they don't have to think about disabling it after > > > every fork. > > > > Okay, that makes sense to me. Thanks. > > > > Since the original MDWE was for systemd, I'm wondering what will happen if > > some program like what you said is invoked by systemd and with MDWE enabled > > already. > > Good question I think JITs work around this by creating two separate mappings of the same pages, one RX and the other RW (rather than toggling the permission with mprotect()). I had the impression Chromium can use memfd to do something similar but I never checked. > > Currently in your patch IIUC MDWE_NO_INHERIT will fail directly on MDWE > > enabled process, > > Yes, I tried to stay close to the spirit of the existing logic (which > doesn't allow any sort of privilege gains) but this is not > particularly a requirement on our side so I'm quite flexible here. I think we should keep the original behaviour of systemd here, otherwise they won't transition to the new interface and keep using the SECCOMP BPF approach (which, in addition, prevents glibc from setting PROT_BTI on an already executable mapping). To me MDWE is not about preventing JITs but rather ensuring buggy programs don't end up with WX mappings. We ended up this way because of the SECCOMP BPF limitations (just guessing, I haven't been involved in its design). With a no-inherit MDWE, one can introduce an additional policy for systemd. It would be a sysadmin decision which one to enable and maybe current (inherit) MDWE will disappear in time. x86 has protection keys and arm64 will soon have permission overlays that allow user-space to toggle between RX and RW (Joey is looking at the arm64 support). I'm not sure how we'll end up implemented this on arm64 (and haven't looked at x86) but I have a suspicion MDWE will get in the way as the base page table permission will probably need PROT_WRITE|PROT_EXEC. -- Catalin ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 0/4] MDWE without inheritance 2023-05-08 14:10 ` Catalin Marinas @ 2023-05-08 17:21 ` Topi Miettinen 2023-05-09 10:04 ` Catalin Marinas 0 siblings, 1 reply; 19+ messages in thread From: Topi Miettinen @ 2023-05-08 17:21 UTC (permalink / raw) To: Catalin Marinas, Florent Revest Cc: Peter Xu, linux-kernel, linux-mm, akpm, anshuman.khandual, joey.gouly, mhocko, keescook, david, izbyshev, nd, broonie, szabolcs.nagy, lennart On 8.5.2023 17.10, Catalin Marinas wrote: > I think we should keep the original behaviour of systemd here, otherwise > they won't transition to the new interface and keep using the SECCOMP > BPF approach (which, in addition, prevents glibc from setting PROT_BTI > on an already executable mapping). Systemd has transitioned to prctl(PR_SET_MDWE) method since release of v253, so the original behaviour definitely should be kept. > To me MDWE is not about preventing JITs but rather ensuring buggy > programs don't end up with WX mappings. We ended up this way because of > the SECCOMP BPF limitations (just guessing, I haven't been involved in > its design). With a no-inherit MDWE, one can introduce an additional > policy for systemd. It would be a sysadmin decision which one to enable > and maybe current (inherit) MDWE will disappear in time. There could be a new setting for this, like MemoryDenyWriteExecute=no-inherit. I'd only use it for those special cases where MemoryDenyWriteExecute=yes can't be used. > x86 has protection keys and arm64 will soon have permission overlays > that allow user-space to toggle between RX and RW (Joey is looking at > the arm64 support). I'm not sure how we'll end up implemented this on > arm64 (and haven't looked at x86) but I have a suspicion MDWE will get > in the way as the base page table permission will probably need > PROT_WRITE|PROT_EXEC. Wouldn't those features defeat any gains from MDWE? The features probably should be forbidden with MemoryDenyWriteExecute=yes. -Topi ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 0/4] MDWE without inheritance 2023-05-08 17:21 ` Topi Miettinen @ 2023-05-09 10:04 ` Catalin Marinas 0 siblings, 0 replies; 19+ messages in thread From: Catalin Marinas @ 2023-05-09 10:04 UTC (permalink / raw) To: Topi Miettinen Cc: Florent Revest, Peter Xu, linux-kernel, linux-mm, akpm, anshuman.khandual, joey.gouly, mhocko, keescook, david, izbyshev, nd, broonie, szabolcs.nagy, lennart On Mon, May 08, 2023 at 08:21:16PM +0300, Topi Miettinen wrote: > On 8.5.2023 17.10, Catalin Marinas wrote: > > I think we should keep the original behaviour of systemd here, otherwise > > they won't transition to the new interface and keep using the SECCOMP > > BPF approach (which, in addition, prevents glibc from setting PROT_BTI > > on an already executable mapping). > > Systemd has transitioned to prctl(PR_SET_MDWE) method since release of v253, > so the original behaviour definitely should be kept. That's great. So yes, no ABI changes allowed anymore. > > x86 has protection keys and arm64 will soon have permission overlays > > that allow user-space to toggle between RX and RW (Joey is looking at > > the arm64 support). I'm not sure how we'll end up implemented this on > > arm64 (and haven't looked at x86) but I have a suspicion MDWE will get > > in the way as the base page table permission will probably need > > PROT_WRITE|PROT_EXEC. > > Wouldn't those features defeat any gains from MDWE? The features probably > should be forbidden with MemoryDenyWriteExecute=yes. The permission overlays (controlled by the user) can only further restrict the mmap() permissions. So MDWE would still work as expected. If one wants to toggle between RW and RX with overlays, the overall mmap() needs to be RWX and it won't work if MDWE=yes. No need to explicitly disable the overlays feature. On arm64 at least, with the introduction of permission overlays we also have the notion of "Read, Execute if not Write". This permission automatically disables Exec if the mapping becomes writable (overlays can disable writable, allowing exec). We could have a new MDWE policy which allows this, though I'm not that keen on using it in Linux since background permission changes done by the kernel can lead to an unexpected executable permission (e.g. marking a page read-only for clean/dirty tracking or in preparation for CoW after fork()). -- Catalin ^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2023-05-09 10:04 UTC | newest] Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2023-05-04 17:09 [PATCH 0/4] MDWE without inheritance Florent Revest 2023-05-04 17:09 ` [PATCH 1/4] kselftest: vm: Fix tabs/spaces inconsistency in the mdwe test Florent Revest 2023-05-04 17:09 ` [PATCH 2/4] kselftest: vm: Fix mdwe's mmap_FIXED test case Florent Revest 2023-05-04 17:13 ` Florent Revest 2023-05-04 17:09 ` [PATCH 3/4] mm: Add a NO_INHERIT flag to the PR_SET_MDWE prctl Florent Revest 2023-05-05 18:34 ` Catalin Marinas 2023-05-08 12:11 ` Florent Revest 2023-05-04 17:09 ` [PATCH 4/4] kselftest: vm: Add tests for no-inherit memory-deny-write-execute Florent Revest 2023-05-04 20:29 ` Alexey Izbyshev 2023-05-05 16:42 ` Florent Revest 2023-05-05 21:26 ` Alexey Izbyshev 2023-05-08 12:12 ` Florent Revest 2023-05-04 20:06 ` [PATCH 0/4] MDWE without inheritance Peter Xu 2023-05-05 16:42 ` Florent Revest 2023-05-08 1:29 ` Peter Xu 2023-05-08 12:12 ` Florent Revest 2023-05-08 14:10 ` Catalin Marinas 2023-05-08 17:21 ` Topi Miettinen 2023-05-09 10:04 ` Catalin Marinas
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.