All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/4] aarch64: avoid mprotect(PROT_BTI|PROT_EXEC) [BZ #26831]
@ 2020-11-03 10:25 ` Szabolcs Nagy
  0 siblings, 0 replies; 62+ messages in thread
From: Szabolcs Nagy @ 2020-11-03 10:25 UTC (permalink / raw)
  To: libc-alpha
  Cc: Jeremy Linton, Catalin Marinas, Mark Rutland, Will Deacon,
	Mark Brown, Florian Weimer, Kees Cook, Salvatore Mesoraca,
	Lennart Poettering, Topi Miettinen, linux-kernel,
	linux-arm-kernel, kernel-hardening, linux-hardening

Re-mmap executable segments instead of mprotecting them in
case mprotect is seccomp filtered.

For the kernel mapped main executable we don't have the fd
for re-mmap so linux needs to be updated to add BTI. (In the
presence of seccomp filters for mprotect(PROT_EXEC) the libc
cannot change BTI protection at runtime based on user space
policy so it is better if the kernel maps BTI compatible
binaries with PROT_BTI by default.)

Szabolcs Nagy (4):
  elf: Pass the fd to note processing [BZ #26831]
  elf: Move note processing after l_phdr is updated [BZ #26831]
  aarch64: Use mmap to add PROT_BTI instead of mprotect [BZ #26831]
  aarch64: Remove the bti link_map field [BZ #26831]

 elf/dl-load.c              | 38 ++++++++++++++++---------------
 elf/rtld.c                 |  4 ++--
 sysdeps/aarch64/dl-bti.c   | 46 ++++++++++++++++++++------------------
 sysdeps/aarch64/dl-prop.h  | 17 +++++++-------
 sysdeps/aarch64/linkmap.h  |  1 -
 sysdeps/generic/dl-prop.h  |  6 ++---
 sysdeps/generic/ldsodefs.h |  5 +++--
 sysdeps/x86/dl-prop.h      |  6 ++---
 8 files changed, 64 insertions(+), 59 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH 0/4] aarch64: avoid mprotect(PROT_BTI|PROT_EXEC) [BZ #26831]
@ 2020-11-03 10:25 ` Szabolcs Nagy
  0 siblings, 0 replies; 62+ messages in thread
From: Szabolcs Nagy @ 2020-11-03 10:25 UTC (permalink / raw)
  To: libc-alpha
  Cc: Mark Rutland, Florian Weimer, Kees Cook, kernel-hardening,
	Salvatore Mesoraca, Catalin Marinas, Will Deacon, linux-kernel,
	Jeremy Linton, Mark Brown, Lennart Poettering, linux-hardening,
	Topi Miettinen, linux-arm-kernel

Re-mmap executable segments instead of mprotecting them in
case mprotect is seccomp filtered.

For the kernel mapped main executable we don't have the fd
for re-mmap so linux needs to be updated to add BTI. (In the
presence of seccomp filters for mprotect(PROT_EXEC) the libc
cannot change BTI protection at runtime based on user space
policy so it is better if the kernel maps BTI compatible
binaries with PROT_BTI by default.)

Szabolcs Nagy (4):
  elf: Pass the fd to note processing [BZ #26831]
  elf: Move note processing after l_phdr is updated [BZ #26831]
  aarch64: Use mmap to add PROT_BTI instead of mprotect [BZ #26831]
  aarch64: Remove the bti link_map field [BZ #26831]

 elf/dl-load.c              | 38 ++++++++++++++++---------------
 elf/rtld.c                 |  4 ++--
 sysdeps/aarch64/dl-bti.c   | 46 ++++++++++++++++++++------------------
 sysdeps/aarch64/dl-prop.h  | 17 +++++++-------
 sysdeps/aarch64/linkmap.h  |  1 -
 sysdeps/generic/dl-prop.h  |  6 ++---
 sysdeps/generic/ldsodefs.h |  5 +++--
 sysdeps/x86/dl-prop.h      |  6 ++---
 8 files changed, 64 insertions(+), 59 deletions(-)

-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH 1/4] elf: Pass the fd to note processing [BZ #26831]
  2020-11-03 10:25 ` Szabolcs Nagy
@ 2020-11-03 10:25   ` Szabolcs Nagy
  -1 siblings, 0 replies; 62+ messages in thread
From: Szabolcs Nagy @ 2020-11-03 10:25 UTC (permalink / raw)
  To: libc-alpha
  Cc: Jeremy Linton, Catalin Marinas, Mark Rutland, Will Deacon,
	Mark Brown, Florian Weimer, Kees Cook, Salvatore Mesoraca,
	Lennart Poettering, Topi Miettinen, linux-kernel,
	linux-arm-kernel, kernel-hardening, linux-hardening

To handle GNU property notes on aarch64 some segments need to
be mmaped again, so the fd of the loaded ELF module is needed.

When the fd is not available (kernel loaded modules), then -1
is passed.

The fd is passed to both _dl_process_pt_gnu_property and
_dl_process_pt_note for consistency. Target specific note
processing functions are updated accordingly.
---
 elf/dl-load.c              | 12 +++++++-----
 elf/rtld.c                 |  4 ++--
 sysdeps/aarch64/dl-prop.h  |  6 +++---
 sysdeps/generic/dl-prop.h  |  6 +++---
 sysdeps/generic/ldsodefs.h |  5 +++--
 sysdeps/x86/dl-prop.h      |  6 +++---
 6 files changed, 21 insertions(+), 18 deletions(-)

diff --git a/elf/dl-load.c b/elf/dl-load.c
index f3201e7c14..ceaab7f18e 100644
--- a/elf/dl-load.c
+++ b/elf/dl-load.c
@@ -861,10 +861,12 @@ lose (int code, int fd, const char *name, char *realname, struct link_map *l,
 
 /* Process PT_GNU_PROPERTY program header PH in module L after
    PT_LOAD segments are mapped.  Only one NT_GNU_PROPERTY_TYPE_0
-   note is handled which contains processor specific properties.  */
+   note is handled which contains processor specific properties.
+   FD is -1 for the kernel mapped main executable otherwise it is
+   the fd used for loading module L.  */
 
 void
-_dl_process_pt_gnu_property (struct link_map *l, const ElfW(Phdr) *ph)
+_dl_process_pt_gnu_property (struct link_map *l, int fd, const ElfW(Phdr) *ph)
 {
   const ElfW(Nhdr) *note = (const void *) (ph->p_vaddr + l->l_addr);
   const ElfW(Addr) size = ph->p_memsz;
@@ -911,7 +913,7 @@ _dl_process_pt_gnu_property (struct link_map *l, const ElfW(Phdr) *ph)
 	      last_type = type;
 
 	      /* Target specific property processing.  */
-	      if (_dl_process_gnu_property (l, type, datasz, ptr) == 0)
+	      if (_dl_process_gnu_property (l, fd, type, datasz, ptr) == 0)
 		return;
 
 	      /* Check the next property item.  */
@@ -1266,10 +1268,10 @@ _dl_map_object_from_fd (const char *name, const char *origname, int fd,
       switch (ph[-1].p_type)
 	{
 	case PT_NOTE:
-	  _dl_process_pt_note (l, &ph[-1]);
+	  _dl_process_pt_note (l, fd, &ph[-1]);
 	  break;
 	case PT_GNU_PROPERTY:
-	  _dl_process_pt_gnu_property (l, &ph[-1]);
+	  _dl_process_pt_gnu_property (l, fd, &ph[-1]);
 	  break;
 	}
   }
diff --git a/elf/rtld.c b/elf/rtld.c
index 5d117d0d2c..6ba918338b 100644
--- a/elf/rtld.c
+++ b/elf/rtld.c
@@ -1531,10 +1531,10 @@ dl_main (const ElfW(Phdr) *phdr,
     switch (ph[-1].p_type)
       {
       case PT_NOTE:
-	_dl_process_pt_note (main_map, &ph[-1]);
+	_dl_process_pt_note (main_map, -1, &ph[-1]);
 	break;
       case PT_GNU_PROPERTY:
-	_dl_process_pt_gnu_property (main_map, &ph[-1]);
+	_dl_process_pt_gnu_property (main_map, -1, &ph[-1]);
 	break;
       }
 
diff --git a/sysdeps/aarch64/dl-prop.h b/sysdeps/aarch64/dl-prop.h
index b0785bda83..2016d1472e 100644
--- a/sysdeps/aarch64/dl-prop.h
+++ b/sysdeps/aarch64/dl-prop.h
@@ -35,13 +35,13 @@ _dl_open_check (struct link_map *m)
 }
 
 static inline void __attribute__ ((always_inline))
-_dl_process_pt_note (struct link_map *l, const ElfW(Phdr) *ph)
+_dl_process_pt_note (struct link_map *l, int fd, const ElfW(Phdr) *ph)
 {
 }
 
 static inline int
-_dl_process_gnu_property (struct link_map *l, uint32_t type, uint32_t datasz,
-			  void *data)
+_dl_process_gnu_property (struct link_map *l, int fd, uint32_t type,
+			  uint32_t datasz, void *data)
 {
   if (type == GNU_PROPERTY_AARCH64_FEATURE_1_AND)
     {
diff --git a/sysdeps/generic/dl-prop.h b/sysdeps/generic/dl-prop.h
index f1cf576fe3..df27ff8e6a 100644
--- a/sysdeps/generic/dl-prop.h
+++ b/sysdeps/generic/dl-prop.h
@@ -37,15 +37,15 @@ _dl_open_check (struct link_map *m)
 }
 
 static inline void __attribute__ ((always_inline))
-_dl_process_pt_note (struct link_map *l, const ElfW(Phdr) *ph)
+_dl_process_pt_note (struct link_map *l, int fd, const ElfW(Phdr) *ph)
 {
 }
 
 /* Called for each property in the NT_GNU_PROPERTY_TYPE_0 note of L,
    processing of the properties continues until this returns 0.  */
 static inline int __attribute__ ((always_inline))
-_dl_process_gnu_property (struct link_map *l, uint32_t type, uint32_t datasz,
-			  void *data)
+_dl_process_gnu_property (struct link_map *l, int fd, uint32_t type,
+			  uint32_t datasz, void *data)
 {
   return 0;
 }
diff --git a/sysdeps/generic/ldsodefs.h b/sysdeps/generic/ldsodefs.h
index 382eeb9be0..702cb0f488 100644
--- a/sysdeps/generic/ldsodefs.h
+++ b/sysdeps/generic/ldsodefs.h
@@ -925,8 +925,9 @@ extern void _dl_rtld_di_serinfo (struct link_map *loader,
 				 Dl_serinfo *si, bool counting);
 
 /* Process PT_GNU_PROPERTY program header PH in module L after
-   PT_LOAD segments are mapped.  */
-void _dl_process_pt_gnu_property (struct link_map *l, const ElfW(Phdr) *ph);
+   PT_LOAD segments are mapped from file FD.  */
+void _dl_process_pt_gnu_property (struct link_map *l, int fd,
+				  const ElfW(Phdr) *ph);
 
 
 /* Search loaded objects' symbol tables for a definition of the symbol
diff --git a/sysdeps/x86/dl-prop.h b/sysdeps/x86/dl-prop.h
index 89911e19e2..4eb3b85a7b 100644
--- a/sysdeps/x86/dl-prop.h
+++ b/sysdeps/x86/dl-prop.h
@@ -145,15 +145,15 @@ _dl_process_cet_property_note (struct link_map *l,
 }
 
 static inline void __attribute__ ((unused))
-_dl_process_pt_note (struct link_map *l, const ElfW(Phdr) *ph)
+_dl_process_pt_note (struct link_map *l, int fd, const ElfW(Phdr) *ph)
 {
   const ElfW(Nhdr) *note = (const void *) (ph->p_vaddr + l->l_addr);
   _dl_process_cet_property_note (l, note, ph->p_memsz, ph->p_align);
 }
 
 static inline int __attribute__ ((always_inline))
-_dl_process_gnu_property (struct link_map *l, uint32_t type, uint32_t datasz,
-			  void *data)
+_dl_process_gnu_property (struct link_map *l, int fd, uint32_t type,
+			  uint32_t datasz, void *data)
 {
   return 0;
 }
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH 1/4] elf: Pass the fd to note processing [BZ #26831]
@ 2020-11-03 10:25   ` Szabolcs Nagy
  0 siblings, 0 replies; 62+ messages in thread
From: Szabolcs Nagy @ 2020-11-03 10:25 UTC (permalink / raw)
  To: libc-alpha
  Cc: Mark Rutland, Florian Weimer, Kees Cook, kernel-hardening,
	Salvatore Mesoraca, Catalin Marinas, Will Deacon, linux-kernel,
	Jeremy Linton, Mark Brown, Lennart Poettering, linux-hardening,
	Topi Miettinen, linux-arm-kernel

To handle GNU property notes on aarch64 some segments need to
be mmaped again, so the fd of the loaded ELF module is needed.

When the fd is not available (kernel loaded modules), then -1
is passed.

The fd is passed to both _dl_process_pt_gnu_property and
_dl_process_pt_note for consistency. Target specific note
processing functions are updated accordingly.
---
 elf/dl-load.c              | 12 +++++++-----
 elf/rtld.c                 |  4 ++--
 sysdeps/aarch64/dl-prop.h  |  6 +++---
 sysdeps/generic/dl-prop.h  |  6 +++---
 sysdeps/generic/ldsodefs.h |  5 +++--
 sysdeps/x86/dl-prop.h      |  6 +++---
 6 files changed, 21 insertions(+), 18 deletions(-)

diff --git a/elf/dl-load.c b/elf/dl-load.c
index f3201e7c14..ceaab7f18e 100644
--- a/elf/dl-load.c
+++ b/elf/dl-load.c
@@ -861,10 +861,12 @@ lose (int code, int fd, const char *name, char *realname, struct link_map *l,
 
 /* Process PT_GNU_PROPERTY program header PH in module L after
    PT_LOAD segments are mapped.  Only one NT_GNU_PROPERTY_TYPE_0
-   note is handled which contains processor specific properties.  */
+   note is handled which contains processor specific properties.
+   FD is -1 for the kernel mapped main executable otherwise it is
+   the fd used for loading module L.  */
 
 void
-_dl_process_pt_gnu_property (struct link_map *l, const ElfW(Phdr) *ph)
+_dl_process_pt_gnu_property (struct link_map *l, int fd, const ElfW(Phdr) *ph)
 {
   const ElfW(Nhdr) *note = (const void *) (ph->p_vaddr + l->l_addr);
   const ElfW(Addr) size = ph->p_memsz;
@@ -911,7 +913,7 @@ _dl_process_pt_gnu_property (struct link_map *l, const ElfW(Phdr) *ph)
 	      last_type = type;
 
 	      /* Target specific property processing.  */
-	      if (_dl_process_gnu_property (l, type, datasz, ptr) == 0)
+	      if (_dl_process_gnu_property (l, fd, type, datasz, ptr) == 0)
 		return;
 
 	      /* Check the next property item.  */
@@ -1266,10 +1268,10 @@ _dl_map_object_from_fd (const char *name, const char *origname, int fd,
       switch (ph[-1].p_type)
 	{
 	case PT_NOTE:
-	  _dl_process_pt_note (l, &ph[-1]);
+	  _dl_process_pt_note (l, fd, &ph[-1]);
 	  break;
 	case PT_GNU_PROPERTY:
-	  _dl_process_pt_gnu_property (l, &ph[-1]);
+	  _dl_process_pt_gnu_property (l, fd, &ph[-1]);
 	  break;
 	}
   }
diff --git a/elf/rtld.c b/elf/rtld.c
index 5d117d0d2c..6ba918338b 100644
--- a/elf/rtld.c
+++ b/elf/rtld.c
@@ -1531,10 +1531,10 @@ dl_main (const ElfW(Phdr) *phdr,
     switch (ph[-1].p_type)
       {
       case PT_NOTE:
-	_dl_process_pt_note (main_map, &ph[-1]);
+	_dl_process_pt_note (main_map, -1, &ph[-1]);
 	break;
       case PT_GNU_PROPERTY:
-	_dl_process_pt_gnu_property (main_map, &ph[-1]);
+	_dl_process_pt_gnu_property (main_map, -1, &ph[-1]);
 	break;
       }
 
diff --git a/sysdeps/aarch64/dl-prop.h b/sysdeps/aarch64/dl-prop.h
index b0785bda83..2016d1472e 100644
--- a/sysdeps/aarch64/dl-prop.h
+++ b/sysdeps/aarch64/dl-prop.h
@@ -35,13 +35,13 @@ _dl_open_check (struct link_map *m)
 }
 
 static inline void __attribute__ ((always_inline))
-_dl_process_pt_note (struct link_map *l, const ElfW(Phdr) *ph)
+_dl_process_pt_note (struct link_map *l, int fd, const ElfW(Phdr) *ph)
 {
 }
 
 static inline int
-_dl_process_gnu_property (struct link_map *l, uint32_t type, uint32_t datasz,
-			  void *data)
+_dl_process_gnu_property (struct link_map *l, int fd, uint32_t type,
+			  uint32_t datasz, void *data)
 {
   if (type == GNU_PROPERTY_AARCH64_FEATURE_1_AND)
     {
diff --git a/sysdeps/generic/dl-prop.h b/sysdeps/generic/dl-prop.h
index f1cf576fe3..df27ff8e6a 100644
--- a/sysdeps/generic/dl-prop.h
+++ b/sysdeps/generic/dl-prop.h
@@ -37,15 +37,15 @@ _dl_open_check (struct link_map *m)
 }
 
 static inline void __attribute__ ((always_inline))
-_dl_process_pt_note (struct link_map *l, const ElfW(Phdr) *ph)
+_dl_process_pt_note (struct link_map *l, int fd, const ElfW(Phdr) *ph)
 {
 }
 
 /* Called for each property in the NT_GNU_PROPERTY_TYPE_0 note of L,
    processing of the properties continues until this returns 0.  */
 static inline int __attribute__ ((always_inline))
-_dl_process_gnu_property (struct link_map *l, uint32_t type, uint32_t datasz,
-			  void *data)
+_dl_process_gnu_property (struct link_map *l, int fd, uint32_t type,
+			  uint32_t datasz, void *data)
 {
   return 0;
 }
diff --git a/sysdeps/generic/ldsodefs.h b/sysdeps/generic/ldsodefs.h
index 382eeb9be0..702cb0f488 100644
--- a/sysdeps/generic/ldsodefs.h
+++ b/sysdeps/generic/ldsodefs.h
@@ -925,8 +925,9 @@ extern void _dl_rtld_di_serinfo (struct link_map *loader,
 				 Dl_serinfo *si, bool counting);
 
 /* Process PT_GNU_PROPERTY program header PH in module L after
-   PT_LOAD segments are mapped.  */
-void _dl_process_pt_gnu_property (struct link_map *l, const ElfW(Phdr) *ph);
+   PT_LOAD segments are mapped from file FD.  */
+void _dl_process_pt_gnu_property (struct link_map *l, int fd,
+				  const ElfW(Phdr) *ph);
 
 
 /* Search loaded objects' symbol tables for a definition of the symbol
diff --git a/sysdeps/x86/dl-prop.h b/sysdeps/x86/dl-prop.h
index 89911e19e2..4eb3b85a7b 100644
--- a/sysdeps/x86/dl-prop.h
+++ b/sysdeps/x86/dl-prop.h
@@ -145,15 +145,15 @@ _dl_process_cet_property_note (struct link_map *l,
 }
 
 static inline void __attribute__ ((unused))
-_dl_process_pt_note (struct link_map *l, const ElfW(Phdr) *ph)
+_dl_process_pt_note (struct link_map *l, int fd, const ElfW(Phdr) *ph)
 {
   const ElfW(Nhdr) *note = (const void *) (ph->p_vaddr + l->l_addr);
   _dl_process_cet_property_note (l, note, ph->p_memsz, ph->p_align);
 }
 
 static inline int __attribute__ ((always_inline))
-_dl_process_gnu_property (struct link_map *l, uint32_t type, uint32_t datasz,
-			  void *data)
+_dl_process_gnu_property (struct link_map *l, int fd, uint32_t type,
+			  uint32_t datasz, void *data)
 {
   return 0;
 }
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH 2/4] elf: Move note processing after l_phdr is updated [BZ #26831]
  2020-11-03 10:25 ` Szabolcs Nagy
@ 2020-11-03 10:26   ` Szabolcs Nagy
  -1 siblings, 0 replies; 62+ messages in thread
From: Szabolcs Nagy @ 2020-11-03 10:26 UTC (permalink / raw)
  To: libc-alpha
  Cc: Jeremy Linton, Catalin Marinas, Mark Rutland, Will Deacon,
	Mark Brown, Florian Weimer, Kees Cook, Salvatore Mesoraca,
	Lennart Poettering, Topi Miettinen, linux-kernel,
	linux-arm-kernel, kernel-hardening, linux-hardening

Program headers are processed in two pass: after the first pass
load segments are mmapped so in the second pass target specific
note processing logic can access the notes.

The second pass is moved later so various link_map fields are
set up that may be useful for note processing such as l_phdr.
---
 elf/dl-load.c | 30 +++++++++++++++---------------
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/elf/dl-load.c b/elf/dl-load.c
index ceaab7f18e..673cf960a0 100644
--- a/elf/dl-load.c
+++ b/elf/dl-load.c
@@ -1259,21 +1259,6 @@ _dl_map_object_from_fd (const char *name, const char *origname, int fd,
 				  maplength, has_holes, loader);
     if (__glibc_unlikely (errstring != NULL))
       goto call_lose;
-
-    /* Process program headers again after load segments are mapped in
-       case processing requires accessing those segments.  Scan program
-       headers backward so that PT_NOTE can be skipped if PT_GNU_PROPERTY
-       exits.  */
-    for (ph = &phdr[l->l_phnum]; ph != phdr; --ph)
-      switch (ph[-1].p_type)
-	{
-	case PT_NOTE:
-	  _dl_process_pt_note (l, fd, &ph[-1]);
-	  break;
-	case PT_GNU_PROPERTY:
-	  _dl_process_pt_gnu_property (l, fd, &ph[-1]);
-	  break;
-	}
   }
 
   if (l->l_ld == 0)
@@ -1481,6 +1466,21 @@ cannot enable executable stack as shared object requires");
     /* Assign the next available module ID.  */
     l->l_tls_modid = _dl_next_tls_modid ();
 
+  /* Process program headers again after load segments are mapped in
+     case processing requires accessing those segments.  Scan program
+     headers backward so that PT_NOTE can be skipped if PT_GNU_PROPERTY
+     exits.  */
+  for (ph = &l->l_phdr[l->l_phnum]; ph != l->l_phdr; --ph)
+    switch (ph[-1].p_type)
+      {
+      case PT_NOTE:
+	_dl_process_pt_note (l, fd, &ph[-1]);
+	break;
+      case PT_GNU_PROPERTY:
+	_dl_process_pt_gnu_property (l, fd, &ph[-1]);
+	break;
+      }
+
 #ifdef DL_AFTER_LOAD
   DL_AFTER_LOAD (l);
 #endif
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH 2/4] elf: Move note processing after l_phdr is updated [BZ #26831]
@ 2020-11-03 10:26   ` Szabolcs Nagy
  0 siblings, 0 replies; 62+ messages in thread
From: Szabolcs Nagy @ 2020-11-03 10:26 UTC (permalink / raw)
  To: libc-alpha
  Cc: Mark Rutland, Florian Weimer, Kees Cook, kernel-hardening,
	Salvatore Mesoraca, Catalin Marinas, Will Deacon, linux-kernel,
	Jeremy Linton, Mark Brown, Lennart Poettering, linux-hardening,
	Topi Miettinen, linux-arm-kernel

Program headers are processed in two pass: after the first pass
load segments are mmapped so in the second pass target specific
note processing logic can access the notes.

The second pass is moved later so various link_map fields are
set up that may be useful for note processing such as l_phdr.
---
 elf/dl-load.c | 30 +++++++++++++++---------------
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/elf/dl-load.c b/elf/dl-load.c
index ceaab7f18e..673cf960a0 100644
--- a/elf/dl-load.c
+++ b/elf/dl-load.c
@@ -1259,21 +1259,6 @@ _dl_map_object_from_fd (const char *name, const char *origname, int fd,
 				  maplength, has_holes, loader);
     if (__glibc_unlikely (errstring != NULL))
       goto call_lose;
-
-    /* Process program headers again after load segments are mapped in
-       case processing requires accessing those segments.  Scan program
-       headers backward so that PT_NOTE can be skipped if PT_GNU_PROPERTY
-       exits.  */
-    for (ph = &phdr[l->l_phnum]; ph != phdr; --ph)
-      switch (ph[-1].p_type)
-	{
-	case PT_NOTE:
-	  _dl_process_pt_note (l, fd, &ph[-1]);
-	  break;
-	case PT_GNU_PROPERTY:
-	  _dl_process_pt_gnu_property (l, fd, &ph[-1]);
-	  break;
-	}
   }
 
   if (l->l_ld == 0)
@@ -1481,6 +1466,21 @@ cannot enable executable stack as shared object requires");
     /* Assign the next available module ID.  */
     l->l_tls_modid = _dl_next_tls_modid ();
 
+  /* Process program headers again after load segments are mapped in
+     case processing requires accessing those segments.  Scan program
+     headers backward so that PT_NOTE can be skipped if PT_GNU_PROPERTY
+     exits.  */
+  for (ph = &l->l_phdr[l->l_phnum]; ph != l->l_phdr; --ph)
+    switch (ph[-1].p_type)
+      {
+      case PT_NOTE:
+	_dl_process_pt_note (l, fd, &ph[-1]);
+	break;
+      case PT_GNU_PROPERTY:
+	_dl_process_pt_gnu_property (l, fd, &ph[-1]);
+	break;
+      }
+
 #ifdef DL_AFTER_LOAD
   DL_AFTER_LOAD (l);
 #endif
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH 3/4] aarch64: Use mmap to add PROT_BTI instead of mprotect [BZ #26831]
  2020-11-03 10:25 ` Szabolcs Nagy
@ 2020-11-03 10:26   ` Szabolcs Nagy
  -1 siblings, 0 replies; 62+ messages in thread
From: Szabolcs Nagy @ 2020-11-03 10:26 UTC (permalink / raw)
  To: libc-alpha
  Cc: Jeremy Linton, Catalin Marinas, Mark Rutland, Will Deacon,
	Mark Brown, Florian Weimer, Kees Cook, Salvatore Mesoraca,
	Lennart Poettering, Topi Miettinen, linux-kernel,
	linux-arm-kernel, kernel-hardening, linux-hardening

Re-mmap executable segments if possible instead of using mprotect
to add PROT_BTI. This allows using BTI protection with security
policies that prevent mprotect with PROT_EXEC.

If the fd of the ELF module is not available because it was kernel
mapped then mprotect is used and failures are ignored.  It is
expected that linux kernel will add PROT_BTI when mapping a module
(current linux as of version 5.9 does not do this).

Computing the mapping parameters follows the logic of
_dl_map_object_from_fd more closely now.

Fixes bug 26831.
---
 sysdeps/aarch64/dl-bti.c  | 46 ++++++++++++++++++++-------------------
 sysdeps/aarch64/dl-prop.h | 14 +++++++-----
 2 files changed, 33 insertions(+), 27 deletions(-)

diff --git a/sysdeps/aarch64/dl-bti.c b/sysdeps/aarch64/dl-bti.c
index 196e462520..385f1731ca 100644
--- a/sysdeps/aarch64/dl-bti.c
+++ b/sysdeps/aarch64/dl-bti.c
@@ -19,43 +19,45 @@
 #include <errno.h>
 #include <libintl.h>
 #include <ldsodefs.h>
+#include <dl-load.h>  /* For MAP_COPY.  */
 
-static int
-enable_bti (struct link_map *map, const char *program)
+/* Enable BTI protection for MAP.  */
+
+void
+_dl_bti_protect (struct link_map *map, int fd)
 {
+  const size_t pagesz = GLRO(dl_pagesize);
   const ElfW(Phdr) *phdr;
-  unsigned prot;
 
   for (phdr = map->l_phdr; phdr < &map->l_phdr[map->l_phnum]; ++phdr)
     if (phdr->p_type == PT_LOAD && (phdr->p_flags & PF_X))
       {
-	void *start = (void *) (phdr->p_vaddr + map->l_addr);
-	size_t len = phdr->p_memsz;
-
-	prot = PROT_EXEC | PROT_BTI;
+	size_t vstart = ALIGN_DOWN (phdr->p_vaddr, pagesz);
+	size_t vend = ALIGN_UP (phdr->p_vaddr + phdr->p_filesz, pagesz);
+	off_t off = ALIGN_DOWN (phdr->p_offset, pagesz);
+	void *start = (void *) (vstart + map->l_addr);
+	size_t len = vend - vstart;
+
+	/* Add PROT_BTI.  */
+	unsigned prot = PROT_EXEC | PROT_BTI;
 	if (phdr->p_flags & PF_R)
 	  prot |= PROT_READ;
 	if (phdr->p_flags & PF_W)
 	  prot |= PROT_WRITE;
 
-	if (__mprotect (start, len, prot) < 0)
+	if (fd == -1)
+	  {
+	    /* Ignore failures: rely on the kernel adding PROT_BTI then.  */
+	    __mprotect (start, len, prot);
+	  }
+	else
 	  {
-	    if (program)
-	      _dl_fatal_printf ("%s: mprotect failed to turn on BTI\n",
-				map->l_name);
-	    else
+	    void *p = __mmap (start, len, prot, MAP_FIXED|MAP_COPY|MAP_FILE,
+			      fd, off);
+	    if (p == MAP_FAILED)
 	      _dl_signal_error (errno, map->l_name, "dlopen",
-				N_("mprotect failed to turn on BTI"));
+				N_("failed to turn on BTI protection"));
 	  }
       }
   return 0;
 }
-
-/* Enable BTI for L if required.  */
-
-void
-_dl_bti_check (struct link_map *l, const char *program)
-{
-  if (GLRO(dl_aarch64_cpu_features).bti && l->l_mach.bti)
-    enable_bti (l, program);
-}
diff --git a/sysdeps/aarch64/dl-prop.h b/sysdeps/aarch64/dl-prop.h
index 2016d1472e..762bc93733 100644
--- a/sysdeps/aarch64/dl-prop.h
+++ b/sysdeps/aarch64/dl-prop.h
@@ -19,19 +19,16 @@
 #ifndef _DL_PROP_H
 #define _DL_PROP_H
 
-extern void _dl_bti_check (struct link_map *, const char *)
-    attribute_hidden;
+extern void _dl_bti_protect (struct link_map *, int) attribute_hidden;
 
 static inline void __attribute__ ((always_inline))
 _rtld_main_check (struct link_map *m, const char *program)
 {
-  _dl_bti_check (m, program);
 }
 
 static inline void __attribute__ ((always_inline))
 _dl_open_check (struct link_map *m)
 {
-  _dl_bti_check (m, NULL);
 }
 
 static inline void __attribute__ ((always_inline))
@@ -43,6 +40,10 @@ static inline int
 _dl_process_gnu_property (struct link_map *l, int fd, uint32_t type,
 			  uint32_t datasz, void *data)
 {
+  if (!GLRO(dl_aarch64_cpu_features).bti)
+    /* Skip note processing.  */
+    return 0;
+
   if (type == GNU_PROPERTY_AARCH64_FEATURE_1_AND)
     {
       /* Stop if the property note is ill-formed.  */
@@ -51,7 +52,10 @@ _dl_process_gnu_property (struct link_map *l, int fd, uint32_t type,
 
       unsigned int feature_1 = *(unsigned int *) data;
       if (feature_1 & GNU_PROPERTY_AARCH64_FEATURE_1_BTI)
-	l->l_mach.bti = true;
+	{
+	  l->l_mach.bti = true;  /* No longer needed.  */
+	  _dl_bti_protect (l, fd);
+	}
 
       /* Stop if we processed the property note.  */
       return 0;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH 3/4] aarch64: Use mmap to add PROT_BTI instead of mprotect [BZ #26831]
@ 2020-11-03 10:26   ` Szabolcs Nagy
  0 siblings, 0 replies; 62+ messages in thread
From: Szabolcs Nagy @ 2020-11-03 10:26 UTC (permalink / raw)
  To: libc-alpha
  Cc: Mark Rutland, Florian Weimer, Kees Cook, kernel-hardening,
	Salvatore Mesoraca, Catalin Marinas, Will Deacon, linux-kernel,
	Jeremy Linton, Mark Brown, Lennart Poettering, linux-hardening,
	Topi Miettinen, linux-arm-kernel

Re-mmap executable segments if possible instead of using mprotect
to add PROT_BTI. This allows using BTI protection with security
policies that prevent mprotect with PROT_EXEC.

If the fd of the ELF module is not available because it was kernel
mapped then mprotect is used and failures are ignored.  It is
expected that linux kernel will add PROT_BTI when mapping a module
(current linux as of version 5.9 does not do this).

Computing the mapping parameters follows the logic of
_dl_map_object_from_fd more closely now.

Fixes bug 26831.
---
 sysdeps/aarch64/dl-bti.c  | 46 ++++++++++++++++++++-------------------
 sysdeps/aarch64/dl-prop.h | 14 +++++++-----
 2 files changed, 33 insertions(+), 27 deletions(-)

diff --git a/sysdeps/aarch64/dl-bti.c b/sysdeps/aarch64/dl-bti.c
index 196e462520..385f1731ca 100644
--- a/sysdeps/aarch64/dl-bti.c
+++ b/sysdeps/aarch64/dl-bti.c
@@ -19,43 +19,45 @@
 #include <errno.h>
 #include <libintl.h>
 #include <ldsodefs.h>
+#include <dl-load.h>  /* For MAP_COPY.  */
 
-static int
-enable_bti (struct link_map *map, const char *program)
+/* Enable BTI protection for MAP.  */
+
+void
+_dl_bti_protect (struct link_map *map, int fd)
 {
+  const size_t pagesz = GLRO(dl_pagesize);
   const ElfW(Phdr) *phdr;
-  unsigned prot;
 
   for (phdr = map->l_phdr; phdr < &map->l_phdr[map->l_phnum]; ++phdr)
     if (phdr->p_type == PT_LOAD && (phdr->p_flags & PF_X))
       {
-	void *start = (void *) (phdr->p_vaddr + map->l_addr);
-	size_t len = phdr->p_memsz;
-
-	prot = PROT_EXEC | PROT_BTI;
+	size_t vstart = ALIGN_DOWN (phdr->p_vaddr, pagesz);
+	size_t vend = ALIGN_UP (phdr->p_vaddr + phdr->p_filesz, pagesz);
+	off_t off = ALIGN_DOWN (phdr->p_offset, pagesz);
+	void *start = (void *) (vstart + map->l_addr);
+	size_t len = vend - vstart;
+
+	/* Add PROT_BTI.  */
+	unsigned prot = PROT_EXEC | PROT_BTI;
 	if (phdr->p_flags & PF_R)
 	  prot |= PROT_READ;
 	if (phdr->p_flags & PF_W)
 	  prot |= PROT_WRITE;
 
-	if (__mprotect (start, len, prot) < 0)
+	if (fd == -1)
+	  {
+	    /* Ignore failures: rely on the kernel adding PROT_BTI then.  */
+	    __mprotect (start, len, prot);
+	  }
+	else
 	  {
-	    if (program)
-	      _dl_fatal_printf ("%s: mprotect failed to turn on BTI\n",
-				map->l_name);
-	    else
+	    void *p = __mmap (start, len, prot, MAP_FIXED|MAP_COPY|MAP_FILE,
+			      fd, off);
+	    if (p == MAP_FAILED)
 	      _dl_signal_error (errno, map->l_name, "dlopen",
-				N_("mprotect failed to turn on BTI"));
+				N_("failed to turn on BTI protection"));
 	  }
       }
   return 0;
 }
-
-/* Enable BTI for L if required.  */
-
-void
-_dl_bti_check (struct link_map *l, const char *program)
-{
-  if (GLRO(dl_aarch64_cpu_features).bti && l->l_mach.bti)
-    enable_bti (l, program);
-}
diff --git a/sysdeps/aarch64/dl-prop.h b/sysdeps/aarch64/dl-prop.h
index 2016d1472e..762bc93733 100644
--- a/sysdeps/aarch64/dl-prop.h
+++ b/sysdeps/aarch64/dl-prop.h
@@ -19,19 +19,16 @@
 #ifndef _DL_PROP_H
 #define _DL_PROP_H
 
-extern void _dl_bti_check (struct link_map *, const char *)
-    attribute_hidden;
+extern void _dl_bti_protect (struct link_map *, int) attribute_hidden;
 
 static inline void __attribute__ ((always_inline))
 _rtld_main_check (struct link_map *m, const char *program)
 {
-  _dl_bti_check (m, program);
 }
 
 static inline void __attribute__ ((always_inline))
 _dl_open_check (struct link_map *m)
 {
-  _dl_bti_check (m, NULL);
 }
 
 static inline void __attribute__ ((always_inline))
@@ -43,6 +40,10 @@ static inline int
 _dl_process_gnu_property (struct link_map *l, int fd, uint32_t type,
 			  uint32_t datasz, void *data)
 {
+  if (!GLRO(dl_aarch64_cpu_features).bti)
+    /* Skip note processing.  */
+    return 0;
+
   if (type == GNU_PROPERTY_AARCH64_FEATURE_1_AND)
     {
       /* Stop if the property note is ill-formed.  */
@@ -51,7 +52,10 @@ _dl_process_gnu_property (struct link_map *l, int fd, uint32_t type,
 
       unsigned int feature_1 = *(unsigned int *) data;
       if (feature_1 & GNU_PROPERTY_AARCH64_FEATURE_1_BTI)
-	l->l_mach.bti = true;
+	{
+	  l->l_mach.bti = true;  /* No longer needed.  */
+	  _dl_bti_protect (l, fd);
+	}
 
       /* Stop if we processed the property note.  */
       return 0;
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH 4/4] aarch64: Remove the bti link_map field [BZ #26831]
  2020-11-03 10:25 ` Szabolcs Nagy
@ 2020-11-03 10:26   ` Szabolcs Nagy
  -1 siblings, 0 replies; 62+ messages in thread
From: Szabolcs Nagy @ 2020-11-03 10:26 UTC (permalink / raw)
  To: libc-alpha
  Cc: Jeremy Linton, Catalin Marinas, Mark Rutland, Will Deacon,
	Mark Brown, Florian Weimer, Kees Cook, Salvatore Mesoraca,
	Lennart Poettering, Topi Miettinen, linux-kernel,
	linux-arm-kernel, kernel-hardening, linux-hardening

The bti link_map field is no longer necessary because PROT_BTI
is applied at note processing time immediately instead of in
_dl_open_check based on the bti field.

This is a separate patch that is not expected to be backported
to avoid changing the link_map layout that is libc internal ABI.
---
 sysdeps/aarch64/dl-prop.h | 5 +----
 sysdeps/aarch64/linkmap.h | 1 -
 2 files changed, 1 insertion(+), 5 deletions(-)

diff --git a/sysdeps/aarch64/dl-prop.h b/sysdeps/aarch64/dl-prop.h
index 762bc93733..cf14381e4a 100644
--- a/sysdeps/aarch64/dl-prop.h
+++ b/sysdeps/aarch64/dl-prop.h
@@ -52,10 +52,7 @@ _dl_process_gnu_property (struct link_map *l, int fd, uint32_t type,
 
       unsigned int feature_1 = *(unsigned int *) data;
       if (feature_1 & GNU_PROPERTY_AARCH64_FEATURE_1_BTI)
-	{
-	  l->l_mach.bti = true;  /* No longer needed.  */
-	  _dl_bti_protect (l, fd);
-	}
+	_dl_bti_protect (l, fd);
 
       /* Stop if we processed the property note.  */
       return 0;
diff --git a/sysdeps/aarch64/linkmap.h b/sysdeps/aarch64/linkmap.h
index 847a03ace2..e921e77495 100644
--- a/sysdeps/aarch64/linkmap.h
+++ b/sysdeps/aarch64/linkmap.h
@@ -22,5 +22,4 @@ struct link_map_machine
 {
   ElfW(Addr) plt;	  /* Address of .plt */
   void *tlsdesc_table;	  /* Address of TLS descriptor hash table.  */
-  bool bti;		  /* Branch Target Identification is enabled.  */
 };
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH 4/4] aarch64: Remove the bti link_map field [BZ #26831]
@ 2020-11-03 10:26   ` Szabolcs Nagy
  0 siblings, 0 replies; 62+ messages in thread
From: Szabolcs Nagy @ 2020-11-03 10:26 UTC (permalink / raw)
  To: libc-alpha
  Cc: Mark Rutland, Florian Weimer, Kees Cook, kernel-hardening,
	Salvatore Mesoraca, Catalin Marinas, Will Deacon, linux-kernel,
	Jeremy Linton, Mark Brown, Lennart Poettering, linux-hardening,
	Topi Miettinen, linux-arm-kernel

The bti link_map field is no longer necessary because PROT_BTI
is applied at note processing time immediately instead of in
_dl_open_check based on the bti field.

This is a separate patch that is not expected to be backported
to avoid changing the link_map layout that is libc internal ABI.
---
 sysdeps/aarch64/dl-prop.h | 5 +----
 sysdeps/aarch64/linkmap.h | 1 -
 2 files changed, 1 insertion(+), 5 deletions(-)

diff --git a/sysdeps/aarch64/dl-prop.h b/sysdeps/aarch64/dl-prop.h
index 762bc93733..cf14381e4a 100644
--- a/sysdeps/aarch64/dl-prop.h
+++ b/sysdeps/aarch64/dl-prop.h
@@ -52,10 +52,7 @@ _dl_process_gnu_property (struct link_map *l, int fd, uint32_t type,
 
       unsigned int feature_1 = *(unsigned int *) data;
       if (feature_1 & GNU_PROPERTY_AARCH64_FEATURE_1_BTI)
-	{
-	  l->l_mach.bti = true;  /* No longer needed.  */
-	  _dl_bti_protect (l, fd);
-	}
+	_dl_bti_protect (l, fd);
 
       /* Stop if we processed the property note.  */
       return 0;
diff --git a/sysdeps/aarch64/linkmap.h b/sysdeps/aarch64/linkmap.h
index 847a03ace2..e921e77495 100644
--- a/sysdeps/aarch64/linkmap.h
+++ b/sysdeps/aarch64/linkmap.h
@@ -22,5 +22,4 @@ struct link_map_machine
 {
   ElfW(Addr) plt;	  /* Address of .plt */
   void *tlsdesc_table;	  /* Address of TLS descriptor hash table.  */
-  bool bti;		  /* Branch Target Identification is enabled.  */
 };
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH 3/4] aarch64: Use mmap to add PROT_BTI instead of mprotect [BZ #26831]
  2020-11-03 10:26   ` Szabolcs Nagy
  (?)
@ 2020-11-03 10:34     ` Florian Weimer
  -1 siblings, 0 replies; 62+ messages in thread
From: Florian Weimer @ 2020-11-03 10:34 UTC (permalink / raw)
  To: Szabolcs Nagy
  Cc: libc-alpha, Jeremy Linton, Catalin Marinas, Mark Rutland,
	Will Deacon, Mark Brown, Kees Cook, Salvatore Mesoraca,
	Lennart Poettering, Topi Miettinen, linux-kernel,
	linux-arm-kernel, kernel-hardening, linux-hardening

* Szabolcs Nagy:

> Re-mmap executable segments if possible instead of using mprotect
> to add PROT_BTI. This allows using BTI protection with security
> policies that prevent mprotect with PROT_EXEC.
>
> If the fd of the ELF module is not available because it was kernel
> mapped then mprotect is used and failures are ignored.  It is
> expected that linux kernel will add PROT_BTI when mapping a module
> (current linux as of version 5.9 does not do this).
>
> Computing the mapping parameters follows the logic of
> _dl_map_object_from_fd more closely now.

What's the performance of this on execve-heavy workloads, such as kernel
or glibc builds?  Hopefully it's cheap because these mappings have not
been faulted in yet.

Thanks,
Florian
-- 
Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 3/4] aarch64: Use mmap to add PROT_BTI instead of mprotect [BZ #26831]
@ 2020-11-03 10:34     ` Florian Weimer
  0 siblings, 0 replies; 62+ messages in thread
From: Florian Weimer @ 2020-11-03 10:34 UTC (permalink / raw)
  To: Szabolcs Nagy
  Cc: Mark Rutland, Salvatore Mesoraca, libc-alpha, Kees Cook,
	kernel-hardening, Catalin Marinas, Will Deacon, linux-kernel,
	Jeremy Linton, Mark Brown, Lennart Poettering, linux-hardening,
	Topi Miettinen, linux-arm-kernel

* Szabolcs Nagy:

> Re-mmap executable segments if possible instead of using mprotect
> to add PROT_BTI. This allows using BTI protection with security
> policies that prevent mprotect with PROT_EXEC.
>
> If the fd of the ELF module is not available because it was kernel
> mapped then mprotect is used and failures are ignored.  It is
> expected that linux kernel will add PROT_BTI when mapping a module
> (current linux as of version 5.9 does not do this).
>
> Computing the mapping parameters follows the logic of
> _dl_map_object_from_fd more closely now.

What's the performance of this on execve-heavy workloads, such as kernel
or glibc builds?  Hopefully it's cheap because these mappings have not
been faulted in yet.

Thanks,
Florian
-- 
Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 3/4] aarch64: Use mmap to add PROT_BTI instead of mprotect [BZ #26831]
@ 2020-11-03 10:34     ` Florian Weimer
  0 siblings, 0 replies; 62+ messages in thread
From: Florian Weimer @ 2020-11-03 10:34 UTC (permalink / raw)
  To: Szabolcs Nagy
  Cc: libc-alpha, Jeremy Linton, Catalin Marinas, Mark Rutland,
	Will Deacon, Mark Brown, Kees Cook, Salvatore Mesoraca,
	Lennart Poettering, Topi Miettinen, linux-kernel,
	linux-arm-kernel, kernel-hardening, linux-hardening

* Szabolcs Nagy:

> Re-mmap executable segments if possible instead of using mprotect
> to add PROT_BTI. This allows using BTI protection with security
> policies that prevent mprotect with PROT_EXEC.
>
> If the fd of the ELF module is not available because it was kernel
> mapped then mprotect is used and failures are ignored.  It is
> expected that linux kernel will add PROT_BTI when mapping a module
> (current linux as of version 5.9 does not do this).
>
> Computing the mapping parameters follows the logic of
> _dl_map_object_from_fd more closely now.

What's the performance of this on execve-heavy workloads, such as kernel
or glibc builds?  Hopefully it's cheap because these mappings have not
been faulted in yet.

Thanks,
Florian
-- 
Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 2/4] elf: Move note processing after l_phdr is updated [BZ #26831]
  2020-11-03 10:26   ` Szabolcs Nagy
  (?)
@ 2020-11-03 10:38     ` Florian Weimer
  -1 siblings, 0 replies; 62+ messages in thread
From: Florian Weimer @ 2020-11-03 10:38 UTC (permalink / raw)
  To: Szabolcs Nagy
  Cc: libc-alpha, Jeremy Linton, Catalin Marinas, Mark Rutland,
	Will Deacon, Mark Brown, Kees Cook, Salvatore Mesoraca,
	Lennart Poettering, Topi Miettinen, linux-kernel,
	linux-arm-kernel, kernel-hardening, linux-hardening, H.J. Lu

* Szabolcs Nagy:

> Program headers are processed in two pass: after the first pass
> load segments are mmapped so in the second pass target specific
> note processing logic can access the notes.
>
> The second pass is moved later so various link_map fields are
> set up that may be useful for note processing such as l_phdr.
> ---
>  elf/dl-load.c | 30 +++++++++++++++---------------
>  1 file changed, 15 insertions(+), 15 deletions(-)
>
> diff --git a/elf/dl-load.c b/elf/dl-load.c
> index ceaab7f18e..673cf960a0 100644
> --- a/elf/dl-load.c
> +++ b/elf/dl-load.c
> @@ -1259,21 +1259,6 @@ _dl_map_object_from_fd (const char *name, const char *origname, int fd,
>  				  maplength, has_holes, loader);
>      if (__glibc_unlikely (errstring != NULL))
>        goto call_lose;
> -
> -    /* Process program headers again after load segments are mapped in
> -       case processing requires accessing those segments.  Scan program
> -       headers backward so that PT_NOTE can be skipped if PT_GNU_PROPERTY
> -       exits.  */
> -    for (ph = &phdr[l->l_phnum]; ph != phdr; --ph)
> -      switch (ph[-1].p_type)
> -	{
> -	case PT_NOTE:
> -	  _dl_process_pt_note (l, fd, &ph[-1]);
> -	  break;
> -	case PT_GNU_PROPERTY:
> -	  _dl_process_pt_gnu_property (l, fd, &ph[-1]);
> -	  break;
> -	}
>    }
>  
>    if (l->l_ld == 0)
> @@ -1481,6 +1466,21 @@ cannot enable executable stack as shared object requires");
>      /* Assign the next available module ID.  */
>      l->l_tls_modid = _dl_next_tls_modid ();
>  
> +  /* Process program headers again after load segments are mapped in
> +     case processing requires accessing those segments.  Scan program
> +     headers backward so that PT_NOTE can be skipped if PT_GNU_PROPERTY
> +     exits.  */
> +  for (ph = &l->l_phdr[l->l_phnum]; ph != l->l_phdr; --ph)
> +    switch (ph[-1].p_type)
> +      {
> +      case PT_NOTE:
> +	_dl_process_pt_note (l, fd, &ph[-1]);
> +	break;
> +      case PT_GNU_PROPERTY:
> +	_dl_process_pt_gnu_property (l, fd, &ph[-1]);
> +	break;
> +      }
> +
>  #ifdef DL_AFTER_LOAD
>    DL_AFTER_LOAD (l);
>  #endif

Is this still compatible with the CET requirements?

I hope it is because the CET magic happens in _dl_open_check, so after
the the code in elf/dl-load.c has run.

Thanks,
Florian
-- 
Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 2/4] elf: Move note processing after l_phdr is updated [BZ #26831]
@ 2020-11-03 10:38     ` Florian Weimer
  0 siblings, 0 replies; 62+ messages in thread
From: Florian Weimer @ 2020-11-03 10:38 UTC (permalink / raw)
  To: Szabolcs Nagy
  Cc: Mark Rutland, Salvatore Mesoraca, libc-alpha, Kees Cook,
	kernel-hardening, Catalin Marinas, H.J. Lu, Will Deacon,
	linux-kernel, Jeremy Linton, Mark Brown, Lennart Poettering,
	linux-hardening, Topi Miettinen, linux-arm-kernel

* Szabolcs Nagy:

> Program headers are processed in two pass: after the first pass
> load segments are mmapped so in the second pass target specific
> note processing logic can access the notes.
>
> The second pass is moved later so various link_map fields are
> set up that may be useful for note processing such as l_phdr.
> ---
>  elf/dl-load.c | 30 +++++++++++++++---------------
>  1 file changed, 15 insertions(+), 15 deletions(-)
>
> diff --git a/elf/dl-load.c b/elf/dl-load.c
> index ceaab7f18e..673cf960a0 100644
> --- a/elf/dl-load.c
> +++ b/elf/dl-load.c
> @@ -1259,21 +1259,6 @@ _dl_map_object_from_fd (const char *name, const char *origname, int fd,
>  				  maplength, has_holes, loader);
>      if (__glibc_unlikely (errstring != NULL))
>        goto call_lose;
> -
> -    /* Process program headers again after load segments are mapped in
> -       case processing requires accessing those segments.  Scan program
> -       headers backward so that PT_NOTE can be skipped if PT_GNU_PROPERTY
> -       exits.  */
> -    for (ph = &phdr[l->l_phnum]; ph != phdr; --ph)
> -      switch (ph[-1].p_type)
> -	{
> -	case PT_NOTE:
> -	  _dl_process_pt_note (l, fd, &ph[-1]);
> -	  break;
> -	case PT_GNU_PROPERTY:
> -	  _dl_process_pt_gnu_property (l, fd, &ph[-1]);
> -	  break;
> -	}
>    }
>  
>    if (l->l_ld == 0)
> @@ -1481,6 +1466,21 @@ cannot enable executable stack as shared object requires");
>      /* Assign the next available module ID.  */
>      l->l_tls_modid = _dl_next_tls_modid ();
>  
> +  /* Process program headers again after load segments are mapped in
> +     case processing requires accessing those segments.  Scan program
> +     headers backward so that PT_NOTE can be skipped if PT_GNU_PROPERTY
> +     exits.  */
> +  for (ph = &l->l_phdr[l->l_phnum]; ph != l->l_phdr; --ph)
> +    switch (ph[-1].p_type)
> +      {
> +      case PT_NOTE:
> +	_dl_process_pt_note (l, fd, &ph[-1]);
> +	break;
> +      case PT_GNU_PROPERTY:
> +	_dl_process_pt_gnu_property (l, fd, &ph[-1]);
> +	break;
> +      }
> +
>  #ifdef DL_AFTER_LOAD
>    DL_AFTER_LOAD (l);
>  #endif

Is this still compatible with the CET requirements?

I hope it is because the CET magic happens in _dl_open_check, so after
the the code in elf/dl-load.c has run.

Thanks,
Florian
-- 
Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 2/4] elf: Move note processing after l_phdr is updated [BZ #26831]
@ 2020-11-03 10:38     ` Florian Weimer
  0 siblings, 0 replies; 62+ messages in thread
From: Florian Weimer @ 2020-11-03 10:38 UTC (permalink / raw)
  To: Szabolcs Nagy
  Cc: libc-alpha, Jeremy Linton, Catalin Marinas, Mark Rutland,
	Will Deacon, Mark Brown, Kees Cook, Salvatore Mesoraca,
	Lennart Poettering, Topi Miettinen, linux-kernel,
	linux-arm-kernel, kernel-hardening, linux-hardening, H.J. Lu

* Szabolcs Nagy:

> Program headers are processed in two pass: after the first pass
> load segments are mmapped so in the second pass target specific
> note processing logic can access the notes.
>
> The second pass is moved later so various link_map fields are
> set up that may be useful for note processing such as l_phdr.
> ---
>  elf/dl-load.c | 30 +++++++++++++++---------------
>  1 file changed, 15 insertions(+), 15 deletions(-)
>
> diff --git a/elf/dl-load.c b/elf/dl-load.c
> index ceaab7f18e..673cf960a0 100644
> --- a/elf/dl-load.c
> +++ b/elf/dl-load.c
> @@ -1259,21 +1259,6 @@ _dl_map_object_from_fd (const char *name, const char *origname, int fd,
>  				  maplength, has_holes, loader);
>      if (__glibc_unlikely (errstring != NULL))
>        goto call_lose;
> -
> -    /* Process program headers again after load segments are mapped in
> -       case processing requires accessing those segments.  Scan program
> -       headers backward so that PT_NOTE can be skipped if PT_GNU_PROPERTY
> -       exits.  */
> -    for (ph = &phdr[l->l_phnum]; ph != phdr; --ph)
> -      switch (ph[-1].p_type)
> -	{
> -	case PT_NOTE:
> -	  _dl_process_pt_note (l, fd, &ph[-1]);
> -	  break;
> -	case PT_GNU_PROPERTY:
> -	  _dl_process_pt_gnu_property (l, fd, &ph[-1]);
> -	  break;
> -	}
>    }
>  
>    if (l->l_ld == 0)
> @@ -1481,6 +1466,21 @@ cannot enable executable stack as shared object requires");
>      /* Assign the next available module ID.  */
>      l->l_tls_modid = _dl_next_tls_modid ();
>  
> +  /* Process program headers again after load segments are mapped in
> +     case processing requires accessing those segments.  Scan program
> +     headers backward so that PT_NOTE can be skipped if PT_GNU_PROPERTY
> +     exits.  */
> +  for (ph = &l->l_phdr[l->l_phnum]; ph != l->l_phdr; --ph)
> +    switch (ph[-1].p_type)
> +      {
> +      case PT_NOTE:
> +	_dl_process_pt_note (l, fd, &ph[-1]);
> +	break;
> +      case PT_GNU_PROPERTY:
> +	_dl_process_pt_gnu_property (l, fd, &ph[-1]);
> +	break;
> +      }
> +
>  #ifdef DL_AFTER_LOAD
>    DL_AFTER_LOAD (l);
>  #endif

Is this still compatible with the CET requirements?

I hope it is because the CET magic happens in _dl_open_check, so after
the the code in elf/dl-load.c has run.

Thanks,
Florian
-- 
Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 2/4] elf: Move note processing after l_phdr is updated [BZ #26831]
  2020-11-03 10:38     ` Florian Weimer
  (?)
@ 2020-11-03 12:36       ` H.J. Lu
  -1 siblings, 0 replies; 62+ messages in thread
From: H.J. Lu @ 2020-11-03 12:36 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Szabolcs Nagy, GNU C Library, Jeremy Linton, Catalin Marinas,
	Mark Rutland, Will Deacon, Mark Brown, Kees Cook,
	Salvatore Mesoraca, Lennart Poettering, Topi Miettinen, LKML,
	linux-arm-kernel, Kernel Hardening, linux-hardening

On Tue, Nov 3, 2020 at 2:38 AM Florian Weimer <fweimer@redhat.com> wrote:
>
> * Szabolcs Nagy:
>
> > Program headers are processed in two pass: after the first pass
> > load segments are mmapped so in the second pass target specific
> > note processing logic can access the notes.
> >
> > The second pass is moved later so various link_map fields are
> > set up that may be useful for note processing such as l_phdr.
> > ---
> >  elf/dl-load.c | 30 +++++++++++++++---------------
> >  1 file changed, 15 insertions(+), 15 deletions(-)
> >
> > diff --git a/elf/dl-load.c b/elf/dl-load.c
> > index ceaab7f18e..673cf960a0 100644
> > --- a/elf/dl-load.c
> > +++ b/elf/dl-load.c
> > @@ -1259,21 +1259,6 @@ _dl_map_object_from_fd (const char *name, const char *origname, int fd,
> >                                 maplength, has_holes, loader);
> >      if (__glibc_unlikely (errstring != NULL))
> >        goto call_lose;
> > -
> > -    /* Process program headers again after load segments are mapped in
> > -       case processing requires accessing those segments.  Scan program
> > -       headers backward so that PT_NOTE can be skipped if PT_GNU_PROPERTY
> > -       exits.  */
> > -    for (ph = &phdr[l->l_phnum]; ph != phdr; --ph)
> > -      switch (ph[-1].p_type)
> > -     {
> > -     case PT_NOTE:
> > -       _dl_process_pt_note (l, fd, &ph[-1]);
> > -       break;
> > -     case PT_GNU_PROPERTY:
> > -       _dl_process_pt_gnu_property (l, fd, &ph[-1]);
> > -       break;
> > -     }
> >    }
> >
> >    if (l->l_ld == 0)
> > @@ -1481,6 +1466,21 @@ cannot enable executable stack as shared object requires");
> >      /* Assign the next available module ID.  */
> >      l->l_tls_modid = _dl_next_tls_modid ();
> >
> > +  /* Process program headers again after load segments are mapped in
> > +     case processing requires accessing those segments.  Scan program
> > +     headers backward so that PT_NOTE can be skipped if PT_GNU_PROPERTY
> > +     exits.  */
> > +  for (ph = &l->l_phdr[l->l_phnum]; ph != l->l_phdr; --ph)
> > +    switch (ph[-1].p_type)
> > +      {
> > +      case PT_NOTE:
> > +     _dl_process_pt_note (l, fd, &ph[-1]);
> > +     break;
> > +      case PT_GNU_PROPERTY:
> > +     _dl_process_pt_gnu_property (l, fd, &ph[-1]);
> > +     break;
> > +      }
> > +
> >  #ifdef DL_AFTER_LOAD
> >    DL_AFTER_LOAD (l);
> >  #endif
>
> Is this still compatible with the CET requirements?
>
> I hope it is because the CET magic happens in _dl_open_check, so after
> the the code in elf/dl-load.c has run.
>
>

_dl_process_pt_note and _dl_process_pt_gnu_property may call
_dl_signal_error.  Are we prepared to clean more things up when it
happens?  I am investigating:

https://sourceware.org/bugzilla/show_bug.cgi?id=26825

I don't think cleanup of _dl_process_pt_gnu_property failure is done
properly.

-- 
H.J.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 2/4] elf: Move note processing after l_phdr is updated [BZ #26831]
@ 2020-11-03 12:36       ` H.J. Lu
  0 siblings, 0 replies; 62+ messages in thread
From: H.J. Lu @ 2020-11-03 12:36 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Mark Rutland, Salvatore Mesoraca, GNU C Library, Kees Cook,
	Kernel Hardening, Szabolcs Nagy, Catalin Marinas, Will Deacon,
	LKML, Jeremy Linton, Mark Brown, Lennart Poettering,
	linux-hardening, Topi Miettinen, linux-arm-kernel

On Tue, Nov 3, 2020 at 2:38 AM Florian Weimer <fweimer@redhat.com> wrote:
>
> * Szabolcs Nagy:
>
> > Program headers are processed in two pass: after the first pass
> > load segments are mmapped so in the second pass target specific
> > note processing logic can access the notes.
> >
> > The second pass is moved later so various link_map fields are
> > set up that may be useful for note processing such as l_phdr.
> > ---
> >  elf/dl-load.c | 30 +++++++++++++++---------------
> >  1 file changed, 15 insertions(+), 15 deletions(-)
> >
> > diff --git a/elf/dl-load.c b/elf/dl-load.c
> > index ceaab7f18e..673cf960a0 100644
> > --- a/elf/dl-load.c
> > +++ b/elf/dl-load.c
> > @@ -1259,21 +1259,6 @@ _dl_map_object_from_fd (const char *name, const char *origname, int fd,
> >                                 maplength, has_holes, loader);
> >      if (__glibc_unlikely (errstring != NULL))
> >        goto call_lose;
> > -
> > -    /* Process program headers again after load segments are mapped in
> > -       case processing requires accessing those segments.  Scan program
> > -       headers backward so that PT_NOTE can be skipped if PT_GNU_PROPERTY
> > -       exits.  */
> > -    for (ph = &phdr[l->l_phnum]; ph != phdr; --ph)
> > -      switch (ph[-1].p_type)
> > -     {
> > -     case PT_NOTE:
> > -       _dl_process_pt_note (l, fd, &ph[-1]);
> > -       break;
> > -     case PT_GNU_PROPERTY:
> > -       _dl_process_pt_gnu_property (l, fd, &ph[-1]);
> > -       break;
> > -     }
> >    }
> >
> >    if (l->l_ld == 0)
> > @@ -1481,6 +1466,21 @@ cannot enable executable stack as shared object requires");
> >      /* Assign the next available module ID.  */
> >      l->l_tls_modid = _dl_next_tls_modid ();
> >
> > +  /* Process program headers again after load segments are mapped in
> > +     case processing requires accessing those segments.  Scan program
> > +     headers backward so that PT_NOTE can be skipped if PT_GNU_PROPERTY
> > +     exits.  */
> > +  for (ph = &l->l_phdr[l->l_phnum]; ph != l->l_phdr; --ph)
> > +    switch (ph[-1].p_type)
> > +      {
> > +      case PT_NOTE:
> > +     _dl_process_pt_note (l, fd, &ph[-1]);
> > +     break;
> > +      case PT_GNU_PROPERTY:
> > +     _dl_process_pt_gnu_property (l, fd, &ph[-1]);
> > +     break;
> > +      }
> > +
> >  #ifdef DL_AFTER_LOAD
> >    DL_AFTER_LOAD (l);
> >  #endif
>
> Is this still compatible with the CET requirements?
>
> I hope it is because the CET magic happens in _dl_open_check, so after
> the the code in elf/dl-load.c has run.
>
>

_dl_process_pt_note and _dl_process_pt_gnu_property may call
_dl_signal_error.  Are we prepared to clean more things up when it
happens?  I am investigating:

https://sourceware.org/bugzilla/show_bug.cgi?id=26825

I don't think cleanup of _dl_process_pt_gnu_property failure is done
properly.

-- 
H.J.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 2/4] elf: Move note processing after l_phdr is updated [BZ #26831]
@ 2020-11-03 12:36       ` H.J. Lu
  0 siblings, 0 replies; 62+ messages in thread
From: H.J. Lu @ 2020-11-03 12:36 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Szabolcs Nagy, GNU C Library, Jeremy Linton, Catalin Marinas,
	Mark Rutland, Will Deacon, Mark Brown, Kees Cook,
	Salvatore Mesoraca, Lennart Poettering, Topi Miettinen, LKML,
	linux-arm-kernel, Kernel Hardening, linux-hardening

On Tue, Nov 3, 2020 at 2:38 AM Florian Weimer <fweimer@redhat.com> wrote:
>
> * Szabolcs Nagy:
>
> > Program headers are processed in two pass: after the first pass
> > load segments are mmapped so in the second pass target specific
> > note processing logic can access the notes.
> >
> > The second pass is moved later so various link_map fields are
> > set up that may be useful for note processing such as l_phdr.
> > ---
> >  elf/dl-load.c | 30 +++++++++++++++---------------
> >  1 file changed, 15 insertions(+), 15 deletions(-)
> >
> > diff --git a/elf/dl-load.c b/elf/dl-load.c
> > index ceaab7f18e..673cf960a0 100644
> > --- a/elf/dl-load.c
> > +++ b/elf/dl-load.c
> > @@ -1259,21 +1259,6 @@ _dl_map_object_from_fd (const char *name, const char *origname, int fd,
> >                                 maplength, has_holes, loader);
> >      if (__glibc_unlikely (errstring != NULL))
> >        goto call_lose;
> > -
> > -    /* Process program headers again after load segments are mapped in
> > -       case processing requires accessing those segments.  Scan program
> > -       headers backward so that PT_NOTE can be skipped if PT_GNU_PROPERTY
> > -       exits.  */
> > -    for (ph = &phdr[l->l_phnum]; ph != phdr; --ph)
> > -      switch (ph[-1].p_type)
> > -     {
> > -     case PT_NOTE:
> > -       _dl_process_pt_note (l, fd, &ph[-1]);
> > -       break;
> > -     case PT_GNU_PROPERTY:
> > -       _dl_process_pt_gnu_property (l, fd, &ph[-1]);
> > -       break;
> > -     }
> >    }
> >
> >    if (l->l_ld == 0)
> > @@ -1481,6 +1466,21 @@ cannot enable executable stack as shared object requires");
> >      /* Assign the next available module ID.  */
> >      l->l_tls_modid = _dl_next_tls_modid ();
> >
> > +  /* Process program headers again after load segments are mapped in
> > +     case processing requires accessing those segments.  Scan program
> > +     headers backward so that PT_NOTE can be skipped if PT_GNU_PROPERTY
> > +     exits.  */
> > +  for (ph = &l->l_phdr[l->l_phnum]; ph != l->l_phdr; --ph)
> > +    switch (ph[-1].p_type)
> > +      {
> > +      case PT_NOTE:
> > +     _dl_process_pt_note (l, fd, &ph[-1]);
> > +     break;
> > +      case PT_GNU_PROPERTY:
> > +     _dl_process_pt_gnu_property (l, fd, &ph[-1]);
> > +     break;
> > +      }
> > +
> >  #ifdef DL_AFTER_LOAD
> >    DL_AFTER_LOAD (l);
> >  #endif
>
> Is this still compatible with the CET requirements?
>
> I hope it is because the CET magic happens in _dl_open_check, so after
> the the code in elf/dl-load.c has run.
>
>

_dl_process_pt_note and _dl_process_pt_gnu_property may call
_dl_signal_error.  Are we prepared to clean more things up when it
happens?  I am investigating:

https://sourceware.org/bugzilla/show_bug.cgi?id=26825

I don't think cleanup of _dl_process_pt_gnu_property failure is done
properly.

-- 
H.J.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 2/4] elf: Move note processing after l_phdr is updated [BZ #26831]
  2020-11-03 12:36       ` H.J. Lu
@ 2020-11-03 15:04         ` Szabolcs Nagy
  -1 siblings, 0 replies; 62+ messages in thread
From: Szabolcs Nagy @ 2020-11-03 15:04 UTC (permalink / raw)
  To: H.J. Lu
  Cc: Florian Weimer, GNU C Library, Jeremy Linton, Catalin Marinas,
	Mark Rutland, Will Deacon, Mark Brown, Kees Cook,
	Salvatore Mesoraca, Lennart Poettering, Topi Miettinen, LKML,
	linux-arm-kernel, Kernel Hardening, linux-hardening

The 11/03/2020 04:36, H.J. Lu wrote:
> On Tue, Nov 3, 2020 at 2:38 AM Florian Weimer <fweimer@redhat.com> wrote:
> > * Szabolcs Nagy:
> >
> > > Program headers are processed in two pass: after the first pass
> > > load segments are mmapped so in the second pass target specific
> > > note processing logic can access the notes.
> > >
> > > The second pass is moved later so various link_map fields are
> > > set up that may be useful for note processing such as l_phdr.
> > > ---
> > >  elf/dl-load.c | 30 +++++++++++++++---------------
> > >  1 file changed, 15 insertions(+), 15 deletions(-)
> > >
> > > diff --git a/elf/dl-load.c b/elf/dl-load.c
> > > index ceaab7f18e..673cf960a0 100644
> > > --- a/elf/dl-load.c
> > > +++ b/elf/dl-load.c
> > > @@ -1259,21 +1259,6 @@ _dl_map_object_from_fd (const char *name, const char *origname, int fd,
> > >                                 maplength, has_holes, loader);
> > >      if (__glibc_unlikely (errstring != NULL))
> > >        goto call_lose;
> > > -
> > > -    /* Process program headers again after load segments are mapped in
> > > -       case processing requires accessing those segments.  Scan program
> > > -       headers backward so that PT_NOTE can be skipped if PT_GNU_PROPERTY
> > > -       exits.  */
> > > -    for (ph = &phdr[l->l_phnum]; ph != phdr; --ph)
> > > -      switch (ph[-1].p_type)
> > > -     {
> > > -     case PT_NOTE:
> > > -       _dl_process_pt_note (l, fd, &ph[-1]);
> > > -       break;
> > > -     case PT_GNU_PROPERTY:
> > > -       _dl_process_pt_gnu_property (l, fd, &ph[-1]);
> > > -       break;
> > > -     }
> > >    }
> > >
> > >    if (l->l_ld == 0)
> > > @@ -1481,6 +1466,21 @@ cannot enable executable stack as shared object requires");
> > >      /* Assign the next available module ID.  */
> > >      l->l_tls_modid = _dl_next_tls_modid ();
> > >
> > > +  /* Process program headers again after load segments are mapped in
> > > +     case processing requires accessing those segments.  Scan program
> > > +     headers backward so that PT_NOTE can be skipped if PT_GNU_PROPERTY
> > > +     exits.  */
> > > +  for (ph = &l->l_phdr[l->l_phnum]; ph != l->l_phdr; --ph)
> > > +    switch (ph[-1].p_type)
> > > +      {
> > > +      case PT_NOTE:
> > > +     _dl_process_pt_note (l, fd, &ph[-1]);
> > > +     break;
> > > +      case PT_GNU_PROPERTY:
> > > +     _dl_process_pt_gnu_property (l, fd, &ph[-1]);
> > > +     break;
> > > +      }
> > > +
> > >  #ifdef DL_AFTER_LOAD
> > >    DL_AFTER_LOAD (l);
> > >  #endif
> >
> > Is this still compatible with the CET requirements?
> >
> > I hope it is because the CET magic happens in _dl_open_check, so after
> > the the code in elf/dl-load.c has run.

i believe the note processing and later cet magic
are not affected by this code move.

but i did not test this with cet.

> 
> _dl_process_pt_note and _dl_process_pt_gnu_property may call
> _dl_signal_error.  Are we prepared to clean more things up when it
> happens?  I am investigating:

yeah, this is difficult to reason about.

it seems to me that after _dl_map_object returns there
may be _dl_map_object_deps which can fail in a way that
all of dlopen has to be rolled back, so if i move things
around in _dl_map_object that should not introduce new
issues.

but it is not clear to me how robust the dlopen code is
against arbitrary failure in dl_open_worker.

> 
> https://sourceware.org/bugzilla/show_bug.cgi?id=26825
> 
> I don't think cleanup of _dl_process_pt_gnu_property failure is done
> properly.
> 
> -- 
> H.J.

-- 

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 2/4] elf: Move note processing after l_phdr is updated [BZ #26831]
@ 2020-11-03 15:04         ` Szabolcs Nagy
  0 siblings, 0 replies; 62+ messages in thread
From: Szabolcs Nagy @ 2020-11-03 15:04 UTC (permalink / raw)
  To: H.J. Lu
  Cc: Florian Weimer, Mark Rutland, GNU C Library, Kees Cook,
	Kernel Hardening, Salvatore Mesoraca, Catalin Marinas,
	Will Deacon, LKML, Jeremy Linton, Mark Brown, Lennart Poettering,
	linux-hardening, Topi Miettinen, linux-arm-kernel

The 11/03/2020 04:36, H.J. Lu wrote:
> On Tue, Nov 3, 2020 at 2:38 AM Florian Weimer <fweimer@redhat.com> wrote:
> > * Szabolcs Nagy:
> >
> > > Program headers are processed in two pass: after the first pass
> > > load segments are mmapped so in the second pass target specific
> > > note processing logic can access the notes.
> > >
> > > The second pass is moved later so various link_map fields are
> > > set up that may be useful for note processing such as l_phdr.
> > > ---
> > >  elf/dl-load.c | 30 +++++++++++++++---------------
> > >  1 file changed, 15 insertions(+), 15 deletions(-)
> > >
> > > diff --git a/elf/dl-load.c b/elf/dl-load.c
> > > index ceaab7f18e..673cf960a0 100644
> > > --- a/elf/dl-load.c
> > > +++ b/elf/dl-load.c
> > > @@ -1259,21 +1259,6 @@ _dl_map_object_from_fd (const char *name, const char *origname, int fd,
> > >                                 maplength, has_holes, loader);
> > >      if (__glibc_unlikely (errstring != NULL))
> > >        goto call_lose;
> > > -
> > > -    /* Process program headers again after load segments are mapped in
> > > -       case processing requires accessing those segments.  Scan program
> > > -       headers backward so that PT_NOTE can be skipped if PT_GNU_PROPERTY
> > > -       exits.  */
> > > -    for (ph = &phdr[l->l_phnum]; ph != phdr; --ph)
> > > -      switch (ph[-1].p_type)
> > > -     {
> > > -     case PT_NOTE:
> > > -       _dl_process_pt_note (l, fd, &ph[-1]);
> > > -       break;
> > > -     case PT_GNU_PROPERTY:
> > > -       _dl_process_pt_gnu_property (l, fd, &ph[-1]);
> > > -       break;
> > > -     }
> > >    }
> > >
> > >    if (l->l_ld == 0)
> > > @@ -1481,6 +1466,21 @@ cannot enable executable stack as shared object requires");
> > >      /* Assign the next available module ID.  */
> > >      l->l_tls_modid = _dl_next_tls_modid ();
> > >
> > > +  /* Process program headers again after load segments are mapped in
> > > +     case processing requires accessing those segments.  Scan program
> > > +     headers backward so that PT_NOTE can be skipped if PT_GNU_PROPERTY
> > > +     exits.  */
> > > +  for (ph = &l->l_phdr[l->l_phnum]; ph != l->l_phdr; --ph)
> > > +    switch (ph[-1].p_type)
> > > +      {
> > > +      case PT_NOTE:
> > > +     _dl_process_pt_note (l, fd, &ph[-1]);
> > > +     break;
> > > +      case PT_GNU_PROPERTY:
> > > +     _dl_process_pt_gnu_property (l, fd, &ph[-1]);
> > > +     break;
> > > +      }
> > > +
> > >  #ifdef DL_AFTER_LOAD
> > >    DL_AFTER_LOAD (l);
> > >  #endif
> >
> > Is this still compatible with the CET requirements?
> >
> > I hope it is because the CET magic happens in _dl_open_check, so after
> > the the code in elf/dl-load.c has run.

i believe the note processing and later cet magic
are not affected by this code move.

but i did not test this with cet.

> 
> _dl_process_pt_note and _dl_process_pt_gnu_property may call
> _dl_signal_error.  Are we prepared to clean more things up when it
> happens?  I am investigating:

yeah, this is difficult to reason about.

it seems to me that after _dl_map_object returns there
may be _dl_map_object_deps which can fail in a way that
all of dlopen has to be rolled back, so if i move things
around in _dl_map_object that should not introduce new
issues.

but it is not clear to me how robust the dlopen code is
against arbitrary failure in dl_open_worker.

> 
> https://sourceware.org/bugzilla/show_bug.cgi?id=26825
> 
> I don't think cleanup of _dl_process_pt_gnu_property failure is done
> properly.
> 
> -- 
> H.J.

-- 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 2/4] elf: Move note processing after l_phdr is updated [BZ #26831]
  2020-11-03 15:04         ` Szabolcs Nagy
  (?)
@ 2020-11-03 15:27           ` H.J. Lu
  -1 siblings, 0 replies; 62+ messages in thread
From: H.J. Lu @ 2020-11-03 15:27 UTC (permalink / raw)
  To: Szabolcs Nagy
  Cc: Florian Weimer, GNU C Library, Jeremy Linton, Catalin Marinas,
	Mark Rutland, Will Deacon, Mark Brown, Kees Cook,
	Salvatore Mesoraca, Lennart Poettering, Topi Miettinen, LKML,
	linux-arm-kernel, Kernel Hardening, linux-hardening

On Tue, Nov 3, 2020 at 7:04 AM Szabolcs Nagy <szabolcs.nagy@arm.com> wrote:
>
> The 11/03/2020 04:36, H.J. Lu wrote:
> > On Tue, Nov 3, 2020 at 2:38 AM Florian Weimer <fweimer@redhat.com> wrote:
> > > * Szabolcs Nagy:
> > >
> > > > Program headers are processed in two pass: after the first pass
> > > > load segments are mmapped so in the second pass target specific
> > > > note processing logic can access the notes.
> > > >
> > > > The second pass is moved later so various link_map fields are
> > > > set up that may be useful for note processing such as l_phdr.
> > > > ---
> > > >  elf/dl-load.c | 30 +++++++++++++++---------------
> > > >  1 file changed, 15 insertions(+), 15 deletions(-)
> > > >
> > > > diff --git a/elf/dl-load.c b/elf/dl-load.c
> > > > index ceaab7f18e..673cf960a0 100644
> > > > --- a/elf/dl-load.c
> > > > +++ b/elf/dl-load.c
> > > > @@ -1259,21 +1259,6 @@ _dl_map_object_from_fd (const char *name, const char *origname, int fd,
> > > >                                 maplength, has_holes, loader);
> > > >      if (__glibc_unlikely (errstring != NULL))
> > > >        goto call_lose;
> > > > -
> > > > -    /* Process program headers again after load segments are mapped in
> > > > -       case processing requires accessing those segments.  Scan program
> > > > -       headers backward so that PT_NOTE can be skipped if PT_GNU_PROPERTY
> > > > -       exits.  */
> > > > -    for (ph = &phdr[l->l_phnum]; ph != phdr; --ph)
> > > > -      switch (ph[-1].p_type)
> > > > -     {
> > > > -     case PT_NOTE:
> > > > -       _dl_process_pt_note (l, fd, &ph[-1]);
> > > > -       break;
> > > > -     case PT_GNU_PROPERTY:
> > > > -       _dl_process_pt_gnu_property (l, fd, &ph[-1]);
> > > > -       break;
> > > > -     }
> > > >    }
> > > >
> > > >    if (l->l_ld == 0)
> > > > @@ -1481,6 +1466,21 @@ cannot enable executable stack as shared object requires");
> > > >      /* Assign the next available module ID.  */
> > > >      l->l_tls_modid = _dl_next_tls_modid ();
> > > >
> > > > +  /* Process program headers again after load segments are mapped in
> > > > +     case processing requires accessing those segments.  Scan program
> > > > +     headers backward so that PT_NOTE can be skipped if PT_GNU_PROPERTY
> > > > +     exits.  */
> > > > +  for (ph = &l->l_phdr[l->l_phnum]; ph != l->l_phdr; --ph)
> > > > +    switch (ph[-1].p_type)
> > > > +      {
> > > > +      case PT_NOTE:
> > > > +     _dl_process_pt_note (l, fd, &ph[-1]);
> > > > +     break;
> > > > +      case PT_GNU_PROPERTY:
> > > > +     _dl_process_pt_gnu_property (l, fd, &ph[-1]);
> > > > +     break;
> > > > +      }
> > > > +
> > > >  #ifdef DL_AFTER_LOAD
> > > >    DL_AFTER_LOAD (l);
> > > >  #endif
> > >
> > > Is this still compatible with the CET requirements?
> > >
> > > I hope it is because the CET magic happens in _dl_open_check, so after
> > > the the code in elf/dl-load.c has run.
>
> i believe the note processing and later cet magic
> are not affected by this code move.
>
> but i did not test this with cet.
>
> >
> > _dl_process_pt_note and _dl_process_pt_gnu_property may call
> > _dl_signal_error.  Are we prepared to clean more things up when it
> > happens?  I am investigating:
>
> yeah, this is difficult to reason about.
>
> it seems to me that after _dl_map_object returns there
> may be _dl_map_object_deps which can fail in a way that
> all of dlopen has to be rolled back, so if i move things
> around in _dl_map_object that should not introduce new
> issues.

I haven't investigated it in detail.  But there are

1314   if (l->l_phdr == NULL)
1315     {
1316       /* The program header is not contained in any of the segments.
1317          We have to allocate memory ourself and copy it over from out
1318          temporary place.  */
1319       ElfW(Phdr) *newp = (ElfW(Phdr) *) malloc (header->e_phnum
1320                                                 * sizeof (ElfW(Phdr)));
1321       if (newp == NULL)
1322         {
1323           errstring = N_("cannot allocate memory for program header");
1324           goto call_lose_errno;
1325         }
1326
1327       l->l_phdr = memcpy (newp, phdr,
1328                           (header->e_phnum * sizeof (ElfW(Phdr))));
1329       l->l_phdr_allocated = 1;
1330     }

When _dl_process_pt_gnu_property is moved after it, will l_phdr be
free on _dl_signal_error?

> but it is not clear to me how robust the dlopen code is
> against arbitrary failure in dl_open_worker.

I think we are mostly OK, except for some corner cases.   Delay
_dl_process_pt_gnu_property may introduce more corner cases.

> >
> > https://sourceware.org/bugzilla/show_bug.cgi?id=26825
> >
> > I don't think cleanup of _dl_process_pt_gnu_property failure is done
> > properly.
> >
> > --
> > H.J.
>
> --



-- 
H.J.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 2/4] elf: Move note processing after l_phdr is updated [BZ #26831]
@ 2020-11-03 15:27           ` H.J. Lu
  0 siblings, 0 replies; 62+ messages in thread
From: H.J. Lu @ 2020-11-03 15:27 UTC (permalink / raw)
  To: Szabolcs Nagy
  Cc: Florian Weimer, Mark Rutland, GNU C Library, Kees Cook,
	Kernel Hardening, Salvatore Mesoraca, Catalin Marinas,
	Will Deacon, LKML, Jeremy Linton, Mark Brown, Lennart Poettering,
	linux-hardening, Topi Miettinen, linux-arm-kernel

On Tue, Nov 3, 2020 at 7:04 AM Szabolcs Nagy <szabolcs.nagy@arm.com> wrote:
>
> The 11/03/2020 04:36, H.J. Lu wrote:
> > On Tue, Nov 3, 2020 at 2:38 AM Florian Weimer <fweimer@redhat.com> wrote:
> > > * Szabolcs Nagy:
> > >
> > > > Program headers are processed in two pass: after the first pass
> > > > load segments are mmapped so in the second pass target specific
> > > > note processing logic can access the notes.
> > > >
> > > > The second pass is moved later so various link_map fields are
> > > > set up that may be useful for note processing such as l_phdr.
> > > > ---
> > > >  elf/dl-load.c | 30 +++++++++++++++---------------
> > > >  1 file changed, 15 insertions(+), 15 deletions(-)
> > > >
> > > > diff --git a/elf/dl-load.c b/elf/dl-load.c
> > > > index ceaab7f18e..673cf960a0 100644
> > > > --- a/elf/dl-load.c
> > > > +++ b/elf/dl-load.c
> > > > @@ -1259,21 +1259,6 @@ _dl_map_object_from_fd (const char *name, const char *origname, int fd,
> > > >                                 maplength, has_holes, loader);
> > > >      if (__glibc_unlikely (errstring != NULL))
> > > >        goto call_lose;
> > > > -
> > > > -    /* Process program headers again after load segments are mapped in
> > > > -       case processing requires accessing those segments.  Scan program
> > > > -       headers backward so that PT_NOTE can be skipped if PT_GNU_PROPERTY
> > > > -       exits.  */
> > > > -    for (ph = &phdr[l->l_phnum]; ph != phdr; --ph)
> > > > -      switch (ph[-1].p_type)
> > > > -     {
> > > > -     case PT_NOTE:
> > > > -       _dl_process_pt_note (l, fd, &ph[-1]);
> > > > -       break;
> > > > -     case PT_GNU_PROPERTY:
> > > > -       _dl_process_pt_gnu_property (l, fd, &ph[-1]);
> > > > -       break;
> > > > -     }
> > > >    }
> > > >
> > > >    if (l->l_ld == 0)
> > > > @@ -1481,6 +1466,21 @@ cannot enable executable stack as shared object requires");
> > > >      /* Assign the next available module ID.  */
> > > >      l->l_tls_modid = _dl_next_tls_modid ();
> > > >
> > > > +  /* Process program headers again after load segments are mapped in
> > > > +     case processing requires accessing those segments.  Scan program
> > > > +     headers backward so that PT_NOTE can be skipped if PT_GNU_PROPERTY
> > > > +     exits.  */
> > > > +  for (ph = &l->l_phdr[l->l_phnum]; ph != l->l_phdr; --ph)
> > > > +    switch (ph[-1].p_type)
> > > > +      {
> > > > +      case PT_NOTE:
> > > > +     _dl_process_pt_note (l, fd, &ph[-1]);
> > > > +     break;
> > > > +      case PT_GNU_PROPERTY:
> > > > +     _dl_process_pt_gnu_property (l, fd, &ph[-1]);
> > > > +     break;
> > > > +      }
> > > > +
> > > >  #ifdef DL_AFTER_LOAD
> > > >    DL_AFTER_LOAD (l);
> > > >  #endif
> > >
> > > Is this still compatible with the CET requirements?
> > >
> > > I hope it is because the CET magic happens in _dl_open_check, so after
> > > the the code in elf/dl-load.c has run.
>
> i believe the note processing and later cet magic
> are not affected by this code move.
>
> but i did not test this with cet.
>
> >
> > _dl_process_pt_note and _dl_process_pt_gnu_property may call
> > _dl_signal_error.  Are we prepared to clean more things up when it
> > happens?  I am investigating:
>
> yeah, this is difficult to reason about.
>
> it seems to me that after _dl_map_object returns there
> may be _dl_map_object_deps which can fail in a way that
> all of dlopen has to be rolled back, so if i move things
> around in _dl_map_object that should not introduce new
> issues.

I haven't investigated it in detail.  But there are

1314   if (l->l_phdr == NULL)
1315     {
1316       /* The program header is not contained in any of the segments.
1317          We have to allocate memory ourself and copy it over from out
1318          temporary place.  */
1319       ElfW(Phdr) *newp = (ElfW(Phdr) *) malloc (header->e_phnum
1320                                                 * sizeof (ElfW(Phdr)));
1321       if (newp == NULL)
1322         {
1323           errstring = N_("cannot allocate memory for program header");
1324           goto call_lose_errno;
1325         }
1326
1327       l->l_phdr = memcpy (newp, phdr,
1328                           (header->e_phnum * sizeof (ElfW(Phdr))));
1329       l->l_phdr_allocated = 1;
1330     }

When _dl_process_pt_gnu_property is moved after it, will l_phdr be
free on _dl_signal_error?

> but it is not clear to me how robust the dlopen code is
> against arbitrary failure in dl_open_worker.

I think we are mostly OK, except for some corner cases.   Delay
_dl_process_pt_gnu_property may introduce more corner cases.

> >
> > https://sourceware.org/bugzilla/show_bug.cgi?id=26825
> >
> > I don't think cleanup of _dl_process_pt_gnu_property failure is done
> > properly.
> >
> > --
> > H.J.
>
> --



-- 
H.J.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 2/4] elf: Move note processing after l_phdr is updated [BZ #26831]
@ 2020-11-03 15:27           ` H.J. Lu
  0 siblings, 0 replies; 62+ messages in thread
From: H.J. Lu @ 2020-11-03 15:27 UTC (permalink / raw)
  To: Szabolcs Nagy
  Cc: Florian Weimer, GNU C Library, Jeremy Linton, Catalin Marinas,
	Mark Rutland, Will Deacon, Mark Brown, Kees Cook,
	Salvatore Mesoraca, Lennart Poettering, Topi Miettinen, LKML,
	linux-arm-kernel, Kernel Hardening, linux-hardening

On Tue, Nov 3, 2020 at 7:04 AM Szabolcs Nagy <szabolcs.nagy@arm.com> wrote:
>
> The 11/03/2020 04:36, H.J. Lu wrote:
> > On Tue, Nov 3, 2020 at 2:38 AM Florian Weimer <fweimer@redhat.com> wrote:
> > > * Szabolcs Nagy:
> > >
> > > > Program headers are processed in two pass: after the first pass
> > > > load segments are mmapped so in the second pass target specific
> > > > note processing logic can access the notes.
> > > >
> > > > The second pass is moved later so various link_map fields are
> > > > set up that may be useful for note processing such as l_phdr.
> > > > ---
> > > >  elf/dl-load.c | 30 +++++++++++++++---------------
> > > >  1 file changed, 15 insertions(+), 15 deletions(-)
> > > >
> > > > diff --git a/elf/dl-load.c b/elf/dl-load.c
> > > > index ceaab7f18e..673cf960a0 100644
> > > > --- a/elf/dl-load.c
> > > > +++ b/elf/dl-load.c
> > > > @@ -1259,21 +1259,6 @@ _dl_map_object_from_fd (const char *name, const char *origname, int fd,
> > > >                                 maplength, has_holes, loader);
> > > >      if (__glibc_unlikely (errstring != NULL))
> > > >        goto call_lose;
> > > > -
> > > > -    /* Process program headers again after load segments are mapped in
> > > > -       case processing requires accessing those segments.  Scan program
> > > > -       headers backward so that PT_NOTE can be skipped if PT_GNU_PROPERTY
> > > > -       exits.  */
> > > > -    for (ph = &phdr[l->l_phnum]; ph != phdr; --ph)
> > > > -      switch (ph[-1].p_type)
> > > > -     {
> > > > -     case PT_NOTE:
> > > > -       _dl_process_pt_note (l, fd, &ph[-1]);
> > > > -       break;
> > > > -     case PT_GNU_PROPERTY:
> > > > -       _dl_process_pt_gnu_property (l, fd, &ph[-1]);
> > > > -       break;
> > > > -     }
> > > >    }
> > > >
> > > >    if (l->l_ld == 0)
> > > > @@ -1481,6 +1466,21 @@ cannot enable executable stack as shared object requires");
> > > >      /* Assign the next available module ID.  */
> > > >      l->l_tls_modid = _dl_next_tls_modid ();
> > > >
> > > > +  /* Process program headers again after load segments are mapped in
> > > > +     case processing requires accessing those segments.  Scan program
> > > > +     headers backward so that PT_NOTE can be skipped if PT_GNU_PROPERTY
> > > > +     exits.  */
> > > > +  for (ph = &l->l_phdr[l->l_phnum]; ph != l->l_phdr; --ph)
> > > > +    switch (ph[-1].p_type)
> > > > +      {
> > > > +      case PT_NOTE:
> > > > +     _dl_process_pt_note (l, fd, &ph[-1]);
> > > > +     break;
> > > > +      case PT_GNU_PROPERTY:
> > > > +     _dl_process_pt_gnu_property (l, fd, &ph[-1]);
> > > > +     break;
> > > > +      }
> > > > +
> > > >  #ifdef DL_AFTER_LOAD
> > > >    DL_AFTER_LOAD (l);
> > > >  #endif
> > >
> > > Is this still compatible with the CET requirements?
> > >
> > > I hope it is because the CET magic happens in _dl_open_check, so after
> > > the the code in elf/dl-load.c has run.
>
> i believe the note processing and later cet magic
> are not affected by this code move.
>
> but i did not test this with cet.
>
> >
> > _dl_process_pt_note and _dl_process_pt_gnu_property may call
> > _dl_signal_error.  Are we prepared to clean more things up when it
> > happens?  I am investigating:
>
> yeah, this is difficult to reason about.
>
> it seems to me that after _dl_map_object returns there
> may be _dl_map_object_deps which can fail in a way that
> all of dlopen has to be rolled back, so if i move things
> around in _dl_map_object that should not introduce new
> issues.

I haven't investigated it in detail.  But there are

1314   if (l->l_phdr == NULL)
1315     {
1316       /* The program header is not contained in any of the segments.
1317          We have to allocate memory ourself and copy it over from out
1318          temporary place.  */
1319       ElfW(Phdr) *newp = (ElfW(Phdr) *) malloc (header->e_phnum
1320                                                 * sizeof (ElfW(Phdr)));
1321       if (newp == NULL)
1322         {
1323           errstring = N_("cannot allocate memory for program header");
1324           goto call_lose_errno;
1325         }
1326
1327       l->l_phdr = memcpy (newp, phdr,
1328                           (header->e_phnum * sizeof (ElfW(Phdr))));
1329       l->l_phdr_allocated = 1;
1330     }

When _dl_process_pt_gnu_property is moved after it, will l_phdr be
free on _dl_signal_error?

> but it is not clear to me how robust the dlopen code is
> against arbitrary failure in dl_open_worker.

I think we are mostly OK, except for some corner cases.   Delay
_dl_process_pt_gnu_property may introduce more corner cases.

> >
> > https://sourceware.org/bugzilla/show_bug.cgi?id=26825
> >
> > I don't think cleanup of _dl_process_pt_gnu_property failure is done
> > properly.
> >
> > --
> > H.J.
>
> --



-- 
H.J.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 0/4] aarch64: avoid mprotect(PROT_BTI|PROT_EXEC) [BZ #26831]
  2020-11-03 10:25 ` Szabolcs Nagy
@ 2020-11-03 17:34   ` Mark Brown
  -1 siblings, 0 replies; 62+ messages in thread
From: Mark Brown @ 2020-11-03 17:34 UTC (permalink / raw)
  To: Szabolcs Nagy
  Cc: libc-alpha, Jeremy Linton, Catalin Marinas, Mark Rutland,
	Will Deacon, Florian Weimer, Kees Cook, Salvatore Mesoraca,
	Lennart Poettering, Topi Miettinen, linux-kernel,
	linux-arm-kernel, kernel-hardening, linux-hardening

[-- Attachment #1: Type: text/plain, Size: 1283 bytes --]

On Tue, Nov 03, 2020 at 10:25:37AM +0000, Szabolcs Nagy wrote:

> Re-mmap executable segments instead of mprotecting them in
> case mprotect is seccomp filtered.

> For the kernel mapped main executable we don't have the fd
> for re-mmap so linux needs to be updated to add BTI. (In the
> presence of seccomp filters for mprotect(PROT_EXEC) the libc
> cannot change BTI protection at runtime based on user space
> policy so it is better if the kernel maps BTI compatible
> binaries with PROT_BTI by default.)

Given that there were still some ongoing discussions on a more robust
kernel interface here and there seem to be a few concerns with this
series should we perhaps just take a step back and disable this seccomp
filter in systemd on arm64, at least for the time being?  That seems
safer than rolling out things that set ABI quickly, a big part of the
reason we went with having the dynamic linker enable PROT_BTI in the
first place was to give us more flexibility to handle any unforseen
consequences of enabling BTI that we run into.  We are going to have
similar issues with other features like MTE so we need to make sure that
whatever we're doing works with them too.

Also updated to Will's current e-mail address - Will, do you have
thoughts on what we should do here?

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 0/4] aarch64: avoid mprotect(PROT_BTI|PROT_EXEC) [BZ #26831]
@ 2020-11-03 17:34   ` Mark Brown
  0 siblings, 0 replies; 62+ messages in thread
From: Mark Brown @ 2020-11-03 17:34 UTC (permalink / raw)
  To: Szabolcs Nagy
  Cc: Mark Rutland, Florian Weimer, libc-alpha, Kees Cook,
	kernel-hardening, Salvatore Mesoraca, Catalin Marinas,
	linux-kernel, Jeremy Linton, Lennart Poettering, linux-hardening,
	Topi Miettinen, Will Deacon, linux-arm-kernel


[-- Attachment #1.1: Type: text/plain, Size: 1283 bytes --]

On Tue, Nov 03, 2020 at 10:25:37AM +0000, Szabolcs Nagy wrote:

> Re-mmap executable segments instead of mprotecting them in
> case mprotect is seccomp filtered.

> For the kernel mapped main executable we don't have the fd
> for re-mmap so linux needs to be updated to add BTI. (In the
> presence of seccomp filters for mprotect(PROT_EXEC) the libc
> cannot change BTI protection at runtime based on user space
> policy so it is better if the kernel maps BTI compatible
> binaries with PROT_BTI by default.)

Given that there were still some ongoing discussions on a more robust
kernel interface here and there seem to be a few concerns with this
series should we perhaps just take a step back and disable this seccomp
filter in systemd on arm64, at least for the time being?  That seems
safer than rolling out things that set ABI quickly, a big part of the
reason we went with having the dynamic linker enable PROT_BTI in the
first place was to give us more flexibility to handle any unforseen
consequences of enabling BTI that we run into.  We are going to have
similar issues with other features like MTE so we need to make sure that
whatever we're doing works with them too.

Also updated to Will's current e-mail address - Will, do you have
thoughts on what we should do here?

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

[-- Attachment #2: Type: text/plain, Size: 176 bytes --]

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 0/4] aarch64: avoid mprotect(PROT_BTI|PROT_EXEC) [BZ #26831]
  2020-11-03 17:34   ` Mark Brown
@ 2020-11-04  5:41     ` Jeremy Linton
  -1 siblings, 0 replies; 62+ messages in thread
From: Jeremy Linton @ 2020-11-04  5:41 UTC (permalink / raw)
  To: Mark Brown, Szabolcs Nagy
  Cc: libc-alpha, Catalin Marinas, Mark Rutland, Will Deacon,
	Florian Weimer, Kees Cook, Salvatore Mesoraca,
	Lennart Poettering, Topi Miettinen, linux-kernel,
	linux-arm-kernel, kernel-hardening, linux-hardening

Hi,

On 11/3/20 11:34 AM, Mark Brown wrote:
> On Tue, Nov 03, 2020 at 10:25:37AM +0000, Szabolcs Nagy wrote:
> 
>> Re-mmap executable segments instead of mprotecting them in
>> case mprotect is seccomp filtered.
> 
>> For the kernel mapped main executable we don't have the fd
>> for re-mmap so linux needs to be updated to add BTI. (In the
>> presence of seccomp filters for mprotect(PROT_EXEC) the libc
>> cannot change BTI protection at runtime based on user space
>> policy so it is better if the kernel maps BTI compatible
>> binaries with PROT_BTI by default.)
> 
> Given that there were still some ongoing discussions on a more robust
> kernel interface here and there seem to be a few concerns with this
> series should we perhaps just take a step back and disable this seccomp
> filter in systemd on arm64, at least for the time being?  That seems
> safer than rolling out things that set ABI quickly, a big part of the

So, that's a bigger hammer than I think is needed and punishes !BTI 
machines. I'm going to suggest that if we need to carry a temp patch its 
more like the glibc patch I mentioned in the Fedora defect. That patch 
simply logs a message, on the mprotect failures rather than aborting. 
Its fairly non-intrusive.

That leaves seccomp functional, and BTI generally functional except when 
seccomp is restricting it. I've also been asked that if a patch like 
that is needed, its (temporary?) merged to the glibc trunk, rather than 
just being carried by the distro's.

Thanks,

> reason we went with having the dynamic linker enable PROT_BTI in the
> first place was to give us more flexibility to handle any unforseen
> consequences of enabling BTI that we run into.  We are going to have
> similar issues with other features like MTE so we need to make sure that
> whatever we're doing works with them too.
> 
> Also updated to Will's current e-mail address - Will, do you have
> thoughts on what we should do here?
> 


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 0/4] aarch64: avoid mprotect(PROT_BTI|PROT_EXEC) [BZ #26831]
@ 2020-11-04  5:41     ` Jeremy Linton
  0 siblings, 0 replies; 62+ messages in thread
From: Jeremy Linton @ 2020-11-04  5:41 UTC (permalink / raw)
  To: Mark Brown, Szabolcs Nagy
  Cc: Mark Rutland, Florian Weimer, libc-alpha, Kees Cook,
	kernel-hardening, Salvatore Mesoraca, Catalin Marinas,
	linux-kernel, Lennart Poettering, linux-hardening,
	Topi Miettinen, Will Deacon, linux-arm-kernel

Hi,

On 11/3/20 11:34 AM, Mark Brown wrote:
> On Tue, Nov 03, 2020 at 10:25:37AM +0000, Szabolcs Nagy wrote:
> 
>> Re-mmap executable segments instead of mprotecting them in
>> case mprotect is seccomp filtered.
> 
>> For the kernel mapped main executable we don't have the fd
>> for re-mmap so linux needs to be updated to add BTI. (In the
>> presence of seccomp filters for mprotect(PROT_EXEC) the libc
>> cannot change BTI protection at runtime based on user space
>> policy so it is better if the kernel maps BTI compatible
>> binaries with PROT_BTI by default.)
> 
> Given that there were still some ongoing discussions on a more robust
> kernel interface here and there seem to be a few concerns with this
> series should we perhaps just take a step back and disable this seccomp
> filter in systemd on arm64, at least for the time being?  That seems
> safer than rolling out things that set ABI quickly, a big part of the

So, that's a bigger hammer than I think is needed and punishes !BTI 
machines. I'm going to suggest that if we need to carry a temp patch its 
more like the glibc patch I mentioned in the Fedora defect. That patch 
simply logs a message, on the mprotect failures rather than aborting. 
Its fairly non-intrusive.

That leaves seccomp functional, and BTI generally functional except when 
seccomp is restricting it. I've also been asked that if a patch like 
that is needed, its (temporary?) merged to the glibc trunk, rather than 
just being carried by the distro's.

Thanks,

> reason we went with having the dynamic linker enable PROT_BTI in the
> first place was to give us more flexibility to handle any unforseen
> consequences of enabling BTI that we run into.  We are going to have
> similar issues with other features like MTE so we need to make sure that
> whatever we're doing works with them too.
> 
> Also updated to Will's current e-mail address - Will, do you have
> thoughts on what we should do here?
> 


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 0/4] aarch64: avoid mprotect(PROT_BTI|PROT_EXEC) [BZ #26831]
  2020-11-04  5:41     ` Jeremy Linton
@ 2020-11-04  8:57       ` Szabolcs Nagy
  -1 siblings, 0 replies; 62+ messages in thread
From: Szabolcs Nagy @ 2020-11-04  8:57 UTC (permalink / raw)
  To: Jeremy Linton
  Cc: Mark Brown, libc-alpha, Catalin Marinas, Mark Rutland,
	Will Deacon, Florian Weimer, Kees Cook, Salvatore Mesoraca,
	Lennart Poettering, Topi Miettinen, linux-kernel,
	linux-arm-kernel, kernel-hardening, linux-hardening

The 11/03/2020 23:41, Jeremy Linton wrote:
> On 11/3/20 11:34 AM, Mark Brown wrote:
> > On Tue, Nov 03, 2020 at 10:25:37AM +0000, Szabolcs Nagy wrote:
> > 
> > > Re-mmap executable segments instead of mprotecting them in
> > > case mprotect is seccomp filtered.
> > 
> > > For the kernel mapped main executable we don't have the fd
> > > for re-mmap so linux needs to be updated to add BTI. (In the
> > > presence of seccomp filters for mprotect(PROT_EXEC) the libc
> > > cannot change BTI protection at runtime based on user space
> > > policy so it is better if the kernel maps BTI compatible
> > > binaries with PROT_BTI by default.)
> > 
> > Given that there were still some ongoing discussions on a more robust
> > kernel interface here and there seem to be a few concerns with this
> > series should we perhaps just take a step back and disable this seccomp
> > filter in systemd on arm64, at least for the time being?  That seems
> > safer than rolling out things that set ABI quickly, a big part of the
> 
> So, that's a bigger hammer than I think is needed and punishes !BTI
> machines. I'm going to suggest that if we need to carry a temp patch its
> more like the glibc patch I mentioned in the Fedora defect. That patch
> simply logs a message, on the mprotect failures rather than aborting. Its
> fairly non-intrusive.
> 
> That leaves seccomp functional, and BTI generally functional except when
> seccomp is restricting it. I've also been asked that if a patch like that is
> needed, its (temporary?) merged to the glibc trunk, rather than just being
> carried by the distro's.

note that changing mprotect into mmap in glibc works
even if the kernel or systemd decides to do things
differently: currently the only wart is that on the
main exe we have to use mprotect and silently ignore
the failures. (but e.g. some return to libc attacks
are reliably prevented since libc.so guaranteed to
have PROT_BTI this way.)

the patchset is a bit more intrusive than i hoped
for, so if we expect to get a new syscall then
simply ignoring mprotect failures may be a better
temporary solution. (and i still need to do
benchmarks to see if the cost of re-mmap is not
much more than the cost of mprotect.)

changing the seccomp filter in systemd does not
help if there are other seccomp filters elsewhere
doing the same. i think we will have to avoid using
mprotect(PROT_EXEC) or at least ignore the failure.

> > reason we went with having the dynamic linker enable PROT_BTI in the
> > first place was to give us more flexibility to handle any unforseen
> > consequences of enabling BTI that we run into.  We are going to have
> > similar issues with other features like MTE so we need to make sure that
> > whatever we're doing works with them too.
> > 
> > Also updated to Will's current e-mail address - Will, do you have
> > thoughts on what we should do here?
> > 

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 0/4] aarch64: avoid mprotect(PROT_BTI|PROT_EXEC) [BZ #26831]
@ 2020-11-04  8:57       ` Szabolcs Nagy
  0 siblings, 0 replies; 62+ messages in thread
From: Szabolcs Nagy @ 2020-11-04  8:57 UTC (permalink / raw)
  To: Jeremy Linton
  Cc: Mark Rutland, Florian Weimer, libc-alpha, Kees Cook,
	kernel-hardening, Salvatore Mesoraca, Catalin Marinas,
	linux-kernel, Mark Brown, Lennart Poettering, linux-hardening,
	Topi Miettinen, Will Deacon, linux-arm-kernel

The 11/03/2020 23:41, Jeremy Linton wrote:
> On 11/3/20 11:34 AM, Mark Brown wrote:
> > On Tue, Nov 03, 2020 at 10:25:37AM +0000, Szabolcs Nagy wrote:
> > 
> > > Re-mmap executable segments instead of mprotecting them in
> > > case mprotect is seccomp filtered.
> > 
> > > For the kernel mapped main executable we don't have the fd
> > > for re-mmap so linux needs to be updated to add BTI. (In the
> > > presence of seccomp filters for mprotect(PROT_EXEC) the libc
> > > cannot change BTI protection at runtime based on user space
> > > policy so it is better if the kernel maps BTI compatible
> > > binaries with PROT_BTI by default.)
> > 
> > Given that there were still some ongoing discussions on a more robust
> > kernel interface here and there seem to be a few concerns with this
> > series should we perhaps just take a step back and disable this seccomp
> > filter in systemd on arm64, at least for the time being?  That seems
> > safer than rolling out things that set ABI quickly, a big part of the
> 
> So, that's a bigger hammer than I think is needed and punishes !BTI
> machines. I'm going to suggest that if we need to carry a temp patch its
> more like the glibc patch I mentioned in the Fedora defect. That patch
> simply logs a message, on the mprotect failures rather than aborting. Its
> fairly non-intrusive.
> 
> That leaves seccomp functional, and BTI generally functional except when
> seccomp is restricting it. I've also been asked that if a patch like that is
> needed, its (temporary?) merged to the glibc trunk, rather than just being
> carried by the distro's.

note that changing mprotect into mmap in glibc works
even if the kernel or systemd decides to do things
differently: currently the only wart is that on the
main exe we have to use mprotect and silently ignore
the failures. (but e.g. some return to libc attacks
are reliably prevented since libc.so guaranteed to
have PROT_BTI this way.)

the patchset is a bit more intrusive than i hoped
for, so if we expect to get a new syscall then
simply ignoring mprotect failures may be a better
temporary solution. (and i still need to do
benchmarks to see if the cost of re-mmap is not
much more than the cost of mprotect.)

changing the seccomp filter in systemd does not
help if there are other seccomp filters elsewhere
doing the same. i think we will have to avoid using
mprotect(PROT_EXEC) or at least ignore the failure.

> > reason we went with having the dynamic linker enable PROT_BTI in the
> > first place was to give us more flexibility to handle any unforseen
> > consequences of enabling BTI that we run into.  We are going to have
> > similar issues with other features like MTE so we need to make sure that
> > whatever we're doing works with them too.
> > 
> > Also updated to Will's current e-mail address - Will, do you have
> > thoughts on what we should do here?
> > 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 0/4] aarch64: avoid mprotect(PROT_BTI|PROT_EXEC) [BZ #26831]
  2020-11-03 17:34   ` Mark Brown
@ 2020-11-04  9:02     ` Topi Miettinen
  -1 siblings, 0 replies; 62+ messages in thread
From: Topi Miettinen @ 2020-11-04  9:02 UTC (permalink / raw)
  To: Mark Brown, Szabolcs Nagy
  Cc: libc-alpha, Jeremy Linton, Catalin Marinas, Mark Rutland,
	Will Deacon, Florian Weimer, Kees Cook, Salvatore Mesoraca,
	Lennart Poettering, linux-kernel, linux-arm-kernel,
	kernel-hardening, linux-hardening

On 3.11.2020 19.34, Mark Brown wrote:
> On Tue, Nov 03, 2020 at 10:25:37AM +0000, Szabolcs Nagy wrote:
> 
>> Re-mmap executable segments instead of mprotecting them in
>> case mprotect is seccomp filtered.
> 
>> For the kernel mapped main executable we don't have the fd
>> for re-mmap so linux needs to be updated to add BTI. (In the
>> presence of seccomp filters for mprotect(PROT_EXEC) the libc
>> cannot change BTI protection at runtime based on user space
>> policy so it is better if the kernel maps BTI compatible
>> binaries with PROT_BTI by default.)
> 
> Given that there were still some ongoing discussions on a more robust
> kernel interface here and there seem to be a few concerns with this
> series should we perhaps just take a step back and disable this seccomp
> filter in systemd on arm64, at least for the time being?

Filtering mprotect() and mmap() with seccomp also protects BTI, since 
without it the attacker could remove PROT_BTI from existing pages, or 
map new pages without BTI. This would be possible even with SARA or 
SELinux execmem protections enabled, since they don't care about PROT_BTI.

-Topi

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 0/4] aarch64: avoid mprotect(PROT_BTI|PROT_EXEC) [BZ #26831]
@ 2020-11-04  9:02     ` Topi Miettinen
  0 siblings, 0 replies; 62+ messages in thread
From: Topi Miettinen @ 2020-11-04  9:02 UTC (permalink / raw)
  To: Mark Brown, Szabolcs Nagy
  Cc: Mark Rutland, Florian Weimer, libc-alpha, Kees Cook,
	kernel-hardening, Salvatore Mesoraca, Catalin Marinas,
	linux-kernel, Jeremy Linton, Lennart Poettering, linux-hardening,
	Will Deacon, linux-arm-kernel

On 3.11.2020 19.34, Mark Brown wrote:
> On Tue, Nov 03, 2020 at 10:25:37AM +0000, Szabolcs Nagy wrote:
> 
>> Re-mmap executable segments instead of mprotecting them in
>> case mprotect is seccomp filtered.
> 
>> For the kernel mapped main executable we don't have the fd
>> for re-mmap so linux needs to be updated to add BTI. (In the
>> presence of seccomp filters for mprotect(PROT_EXEC) the libc
>> cannot change BTI protection at runtime based on user space
>> policy so it is better if the kernel maps BTI compatible
>> binaries with PROT_BTI by default.)
> 
> Given that there were still some ongoing discussions on a more robust
> kernel interface here and there seem to be a few concerns with this
> series should we perhaps just take a step back and disable this seccomp
> filter in systemd on arm64, at least for the time being?

Filtering mprotect() and mmap() with seccomp also protects BTI, since 
without it the attacker could remove PROT_BTI from existing pages, or 
map new pages without BTI. This would be possible even with SARA or 
SELinux execmem protections enabled, since they don't care about PROT_BTI.

-Topi

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 0/4] aarch64: avoid mprotect(PROT_BTI|PROT_EXEC) [BZ #26831]
  2020-11-03 17:34   ` Mark Brown
@ 2020-11-04  9:20     ` Will Deacon
  -1 siblings, 0 replies; 62+ messages in thread
From: Will Deacon @ 2020-11-04  9:20 UTC (permalink / raw)
  To: Mark Brown
  Cc: Szabolcs Nagy, libc-alpha, Jeremy Linton, Catalin Marinas,
	Mark Rutland, Florian Weimer, Kees Cook, Salvatore Mesoraca,
	Lennart Poettering, Topi Miettinen, linux-kernel,
	linux-arm-kernel, kernel-hardening, linux-hardening

On Tue, Nov 03, 2020 at 05:34:38PM +0000, Mark Brown wrote:
> On Tue, Nov 03, 2020 at 10:25:37AM +0000, Szabolcs Nagy wrote:
> 
> > Re-mmap executable segments instead of mprotecting them in
> > case mprotect is seccomp filtered.
> 
> > For the kernel mapped main executable we don't have the fd
> > for re-mmap so linux needs to be updated to add BTI. (In the
> > presence of seccomp filters for mprotect(PROT_EXEC) the libc
> > cannot change BTI protection at runtime based on user space
> > policy so it is better if the kernel maps BTI compatible
> > binaries with PROT_BTI by default.)
> 
> Given that there were still some ongoing discussions on a more robust
> kernel interface here and there seem to be a few concerns with this
> series should we perhaps just take a step back and disable this seccomp
> filter in systemd on arm64, at least for the time being?  That seems
> safer than rolling out things that set ABI quickly, a big part of the
> reason we went with having the dynamic linker enable PROT_BTI in the
> first place was to give us more flexibility to handle any unforseen
> consequences of enabling BTI that we run into.  We are going to have
> similar issues with other features like MTE so we need to make sure that
> whatever we're doing works with them too.
> 
> Also updated to Will's current e-mail address - Will, do you have
> thoughts on what we should do here?

Changing the kernel to map the main executable with PROT_BTI by default is a
user-visible change in behaviour and not without risk, so if we're going to
do that then it needs to be opt-in because the current behaviour has been
there since 5.8. I suppose we could shoe-horn in a cmdline option for 5.10
(which will be the first LTS with BTI) but it would be better to put up with
the current ABI if possible.

Is there real value in this seccomp filter if it only looks at mprotect(),
or was it just implemented because it's easy to do and sounds like a good
idea?

Will

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 0/4] aarch64: avoid mprotect(PROT_BTI|PROT_EXEC) [BZ #26831]
@ 2020-11-04  9:20     ` Will Deacon
  0 siblings, 0 replies; 62+ messages in thread
From: Will Deacon @ 2020-11-04  9:20 UTC (permalink / raw)
  To: Mark Brown
  Cc: Mark Rutland, Florian Weimer, libc-alpha, Kees Cook,
	kernel-hardening, Szabolcs Nagy, Catalin Marinas, linux-kernel,
	Jeremy Linton, Lennart Poettering, linux-hardening,
	Salvatore Mesoraca, Topi Miettinen, linux-arm-kernel

On Tue, Nov 03, 2020 at 05:34:38PM +0000, Mark Brown wrote:
> On Tue, Nov 03, 2020 at 10:25:37AM +0000, Szabolcs Nagy wrote:
> 
> > Re-mmap executable segments instead of mprotecting them in
> > case mprotect is seccomp filtered.
> 
> > For the kernel mapped main executable we don't have the fd
> > for re-mmap so linux needs to be updated to add BTI. (In the
> > presence of seccomp filters for mprotect(PROT_EXEC) the libc
> > cannot change BTI protection at runtime based on user space
> > policy so it is better if the kernel maps BTI compatible
> > binaries with PROT_BTI by default.)
> 
> Given that there were still some ongoing discussions on a more robust
> kernel interface here and there seem to be a few concerns with this
> series should we perhaps just take a step back and disable this seccomp
> filter in systemd on arm64, at least for the time being?  That seems
> safer than rolling out things that set ABI quickly, a big part of the
> reason we went with having the dynamic linker enable PROT_BTI in the
> first place was to give us more flexibility to handle any unforseen
> consequences of enabling BTI that we run into.  We are going to have
> similar issues with other features like MTE so we need to make sure that
> whatever we're doing works with them too.
> 
> Also updated to Will's current e-mail address - Will, do you have
> thoughts on what we should do here?

Changing the kernel to map the main executable with PROT_BTI by default is a
user-visible change in behaviour and not without risk, so if we're going to
do that then it needs to be opt-in because the current behaviour has been
there since 5.8. I suppose we could shoe-horn in a cmdline option for 5.10
(which will be the first LTS with BTI) but it would be better to put up with
the current ABI if possible.

Is there real value in this seccomp filter if it only looks at mprotect(),
or was it just implemented because it's easy to do and sounds like a good
idea?

Will

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 0/4] aarch64: avoid mprotect(PROT_BTI|PROT_EXEC) [BZ #26831]
  2020-11-04  9:20     ` Will Deacon
  (?)
@ 2020-11-04  9:29       ` Florian Weimer
  -1 siblings, 0 replies; 62+ messages in thread
From: Florian Weimer @ 2020-11-04  9:29 UTC (permalink / raw)
  To: Will Deacon
  Cc: Mark Brown, Szabolcs Nagy, libc-alpha, Jeremy Linton,
	Catalin Marinas, Mark Rutland, Kees Cook, Salvatore Mesoraca,
	Lennart Poettering, Topi Miettinen, linux-kernel,
	linux-arm-kernel, kernel-hardening, linux-hardening

* Will Deacon:

> Is there real value in this seccomp filter if it only looks at mprotect(),
> or was it just implemented because it's easy to do and sounds like a good
> idea?

It seems bogus to me.  Everyone will just create alias mappings instead,
just like they did for the similar SELinux feature.  See “Example code
to avoid execmem violations” in:

  <https://www.akkadia.org/drepper/selinux-mem.html>

As you can see, this reference implementation creates a PROT_WRITE
mapping aliased to a PROT_EXEC mapping, so it actually reduces security
compared to something that generates the code in an anonymous mapping
and calls mprotect to make it executable.

Furthermore, it requires unusual cache flushing code on some AArch64
implementations (a requirement that is not shared by any Linux other
architecture to which libffi has been ported), resulting in
hard-to-track-down real-world bugs.

Thanks,
Florian
-- 
Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 0/4] aarch64: avoid mprotect(PROT_BTI|PROT_EXEC) [BZ #26831]
@ 2020-11-04  9:29       ` Florian Weimer
  0 siblings, 0 replies; 62+ messages in thread
From: Florian Weimer @ 2020-11-04  9:29 UTC (permalink / raw)
  To: Will Deacon
  Cc: Mark Rutland, Salvatore Mesoraca, libc-alpha, Kees Cook,
	kernel-hardening, Szabolcs Nagy, Catalin Marinas, linux-kernel,
	Jeremy Linton, Mark Brown, Lennart Poettering, linux-hardening,
	Topi Miettinen, linux-arm-kernel

* Will Deacon:

> Is there real value in this seccomp filter if it only looks at mprotect(),
> or was it just implemented because it's easy to do and sounds like a good
> idea?

It seems bogus to me.  Everyone will just create alias mappings instead,
just like they did for the similar SELinux feature.  See “Example code
to avoid execmem violations” in:

  <https://www.akkadia.org/drepper/selinux-mem.html>

As you can see, this reference implementation creates a PROT_WRITE
mapping aliased to a PROT_EXEC mapping, so it actually reduces security
compared to something that generates the code in an anonymous mapping
and calls mprotect to make it executable.

Furthermore, it requires unusual cache flushing code on some AArch64
implementations (a requirement that is not shared by any Linux other
architecture to which libffi has been ported), resulting in
hard-to-track-down real-world bugs.

Thanks,
Florian
-- 
Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 0/4] aarch64: avoid mprotect(PROT_BTI|PROT_EXEC) [BZ #26831]
@ 2020-11-04  9:29       ` Florian Weimer
  0 siblings, 0 replies; 62+ messages in thread
From: Florian Weimer @ 2020-11-04  9:29 UTC (permalink / raw)
  To: Will Deacon
  Cc: Mark Brown, Szabolcs Nagy, libc-alpha, Jeremy Linton,
	Catalin Marinas, Mark Rutland, Kees Cook, Salvatore Mesoraca,
	Lennart Poettering, Topi Miettinen, linux-kernel,
	linux-arm-kernel, kernel-hardening, linux-hardening

* Will Deacon:

> Is there real value in this seccomp filter if it only looks at mprotect(),
> or was it just implemented because it's easy to do and sounds like a good
> idea?

It seems bogus to me.  Everyone will just create alias mappings instead,
just like they did for the similar SELinux feature.  See “Example code
to avoid execmem violations” in:

  <https://www.akkadia.org/drepper/selinux-mem.html>

As you can see, this reference implementation creates a PROT_WRITE
mapping aliased to a PROT_EXEC mapping, so it actually reduces security
compared to something that generates the code in an anonymous mapping
and calls mprotect to make it executable.

Furthermore, it requires unusual cache flushing code on some AArch64
implementations (a requirement that is not shared by any Linux other
architecture to which libffi has been ported), resulting in
hard-to-track-down real-world bugs.

Thanks,
Florian
-- 
Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 0/4] aarch64: avoid mprotect(PROT_BTI|PROT_EXEC) [BZ #26831]
  2020-11-04  9:29       ` Florian Weimer
@ 2020-11-04  9:55         ` Topi Miettinen
  -1 siblings, 0 replies; 62+ messages in thread
From: Topi Miettinen @ 2020-11-04  9:55 UTC (permalink / raw)
  To: Florian Weimer, Will Deacon
  Cc: Mark Brown, Szabolcs Nagy, libc-alpha, Jeremy Linton,
	Catalin Marinas, Mark Rutland, Kees Cook, Salvatore Mesoraca,
	Lennart Poettering, linux-kernel, linux-arm-kernel,
	kernel-hardening, linux-hardening

On 4.11.2020 11.29, Florian Weimer wrote:
> * Will Deacon:
> 
>> Is there real value in this seccomp filter if it only looks at mprotect(),
>> or was it just implemented because it's easy to do and sounds like a good
>> idea?
> 
> It seems bogus to me.  Everyone will just create alias mappings instead,
> just like they did for the similar SELinux feature.  See “Example code
> to avoid execmem violations” in:
> 
>    <https://www.akkadia.org/drepper/selinux-mem.html>

Also note "But this is very dangerous: programs should never use memory 
regions which are writable and executable at the same time. Assuming 
that it is really necessary to generate executable code while the 
program runs the method employed should be reconsidered."

> As you can see, this reference implementation creates a PROT_WRITE
> mapping aliased to a PROT_EXEC mapping, so it actually reduces security
> compared to something that generates the code in an anonymous mapping
> and calls mprotect to make it executable.

Drepper's methods to avoid SELinux protections are indeed the two ways 
(which I'm aware) to also avoid the seccomp filter: by using 
memfd_create() and using a file system which writable and executable to 
the process to create a new executable file. Both methods can be 
eliminated for many system services, memfd_create() with seccomp and 
filesystem W&X with mount namespaces.

If a service legitimately needs executable and writable mappings (due to 
JIT, trampolines etc), it's easy to disable the filter whenever really 
needed with "MemoryDenyWriteExecute=no" (which is the default) in case 
of systemd or a TE rule like "allow type_t self:process { execmem };" 
for SELinux. But this shouldn't be the default case, since there are 
many services which don't need W&X.

I'd also question what is the value of BTI if it can be easily 
circumvented by removing PROT_BTI with mprotect()?

-Topi

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 0/4] aarch64: avoid mprotect(PROT_BTI|PROT_EXEC) [BZ #26831]
@ 2020-11-04  9:55         ` Topi Miettinen
  0 siblings, 0 replies; 62+ messages in thread
From: Topi Miettinen @ 2020-11-04  9:55 UTC (permalink / raw)
  To: Florian Weimer, Will Deacon
  Cc: Mark Rutland, Salvatore Mesoraca, libc-alpha, Kees Cook,
	kernel-hardening, Szabolcs Nagy, Catalin Marinas, linux-kernel,
	Jeremy Linton, Mark Brown, Lennart Poettering, linux-hardening,
	linux-arm-kernel

On 4.11.2020 11.29, Florian Weimer wrote:
> * Will Deacon:
> 
>> Is there real value in this seccomp filter if it only looks at mprotect(),
>> or was it just implemented because it's easy to do and sounds like a good
>> idea?
> 
> It seems bogus to me.  Everyone will just create alias mappings instead,
> just like they did for the similar SELinux feature.  See “Example code
> to avoid execmem violations” in:
> 
>    <https://www.akkadia.org/drepper/selinux-mem.html>

Also note "But this is very dangerous: programs should never use memory 
regions which are writable and executable at the same time. Assuming 
that it is really necessary to generate executable code while the 
program runs the method employed should be reconsidered."

> As you can see, this reference implementation creates a PROT_WRITE
> mapping aliased to a PROT_EXEC mapping, so it actually reduces security
> compared to something that generates the code in an anonymous mapping
> and calls mprotect to make it executable.

Drepper's methods to avoid SELinux protections are indeed the two ways 
(which I'm aware) to also avoid the seccomp filter: by using 
memfd_create() and using a file system which writable and executable to 
the process to create a new executable file. Both methods can be 
eliminated for many system services, memfd_create() with seccomp and 
filesystem W&X with mount namespaces.

If a service legitimately needs executable and writable mappings (due to 
JIT, trampolines etc), it's easy to disable the filter whenever really 
needed with "MemoryDenyWriteExecute=no" (which is the default) in case 
of systemd or a TE rule like "allow type_t self:process { execmem };" 
for SELinux. But this shouldn't be the default case, since there are 
many services which don't need W&X.

I'd also question what is the value of BTI if it can be easily 
circumvented by removing PROT_BTI with mprotect()?

-Topi

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 0/4] aarch64: avoid mprotect(PROT_BTI|PROT_EXEC) [BZ #26831]
  2020-11-04  5:41     ` Jeremy Linton
@ 2020-11-04 10:50       ` Mark Brown
  -1 siblings, 0 replies; 62+ messages in thread
From: Mark Brown @ 2020-11-04 10:50 UTC (permalink / raw)
  To: Jeremy Linton
  Cc: Szabolcs Nagy, libc-alpha, Catalin Marinas, Mark Rutland,
	Will Deacon, Florian Weimer, Kees Cook, Salvatore Mesoraca,
	Lennart Poettering, Topi Miettinen, linux-kernel,
	linux-arm-kernel, kernel-hardening, linux-hardening

[-- Attachment #1: Type: text/plain, Size: 1277 bytes --]

On Tue, Nov 03, 2020 at 11:41:42PM -0600, Jeremy Linton wrote:
> On 11/3/20 11:34 AM, Mark Brown wrote:

> > Given that there were still some ongoing discussions on a more robust
> > kernel interface here and there seem to be a few concerns with this
> > series should we perhaps just take a step back and disable this seccomp
> > filter in systemd on arm64, at least for the time being?  That seems
> > safer than rolling out things that set ABI quickly, a big part of the

> So, that's a bigger hammer than I think is needed and punishes !BTI
> machines. I'm going to suggest that if we need to carry a temp patch its
> more like the glibc patch I mentioned in the Fedora defect. That patch
> simply logs a message, on the mprotect failures rather than aborting. Its
> fairly non-intrusive.

> That leaves seccomp functional, and BTI generally functional except when
> seccomp is restricting it. I've also been asked that if a patch like that is
> needed, its (temporary?) merged to the glibc trunk, rather than just being
> carried by the distro's.

The effect on pre-BTI hardware is an issue, another option would be for
systemd to disable this seccomp usage but only after checking for BTI
support in the system rather than just doing so purely based on the
architecture.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 0/4] aarch64: avoid mprotect(PROT_BTI|PROT_EXEC) [BZ #26831]
@ 2020-11-04 10:50       ` Mark Brown
  0 siblings, 0 replies; 62+ messages in thread
From: Mark Brown @ 2020-11-04 10:50 UTC (permalink / raw)
  To: Jeremy Linton
  Cc: Mark Rutland, Florian Weimer, libc-alpha, Kees Cook,
	kernel-hardening, Szabolcs Nagy, Catalin Marinas, linux-kernel,
	Lennart Poettering, linux-hardening, Salvatore Mesoraca,
	Topi Miettinen, Will Deacon, linux-arm-kernel


[-- Attachment #1.1: Type: text/plain, Size: 1277 bytes --]

On Tue, Nov 03, 2020 at 11:41:42PM -0600, Jeremy Linton wrote:
> On 11/3/20 11:34 AM, Mark Brown wrote:

> > Given that there were still some ongoing discussions on a more robust
> > kernel interface here and there seem to be a few concerns with this
> > series should we perhaps just take a step back and disable this seccomp
> > filter in systemd on arm64, at least for the time being?  That seems
> > safer than rolling out things that set ABI quickly, a big part of the

> So, that's a bigger hammer than I think is needed and punishes !BTI
> machines. I'm going to suggest that if we need to carry a temp patch its
> more like the glibc patch I mentioned in the Fedora defect. That patch
> simply logs a message, on the mprotect failures rather than aborting. Its
> fairly non-intrusive.

> That leaves seccomp functional, and BTI generally functional except when
> seccomp is restricting it. I've also been asked that if a patch like that is
> needed, its (temporary?) merged to the glibc trunk, rather than just being
> carried by the distro's.

The effect on pre-BTI hardware is an issue, another option would be for
systemd to disable this seccomp usage but only after checking for BTI
support in the system rather than just doing so purely based on the
architecture.

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

[-- Attachment #2: Type: text/plain, Size: 176 bytes --]

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 0/4] aarch64: avoid mprotect(PROT_BTI|PROT_EXEC) [BZ #26831]
  2020-11-04  9:55         ` Topi Miettinen
@ 2020-11-04 14:35           ` Catalin Marinas
  -1 siblings, 0 replies; 62+ messages in thread
From: Catalin Marinas @ 2020-11-04 14:35 UTC (permalink / raw)
  To: Topi Miettinen
  Cc: Florian Weimer, Will Deacon, Mark Brown, Szabolcs Nagy,
	libc-alpha, Jeremy Linton, Mark Rutland, Kees Cook,
	Salvatore Mesoraca, Lennart Poettering, linux-kernel,
	linux-arm-kernel, kernel-hardening, linux-hardening

On Wed, Nov 04, 2020 at 11:55:57AM +0200, Topi Miettinen wrote:
> On 4.11.2020 11.29, Florian Weimer wrote:
> > * Will Deacon:
> > 
> > > Is there real value in this seccomp filter if it only looks at mprotect(),
> > > or was it just implemented because it's easy to do and sounds like a good
> > > idea?
> > 
> > It seems bogus to me.  Everyone will just create alias mappings instead,
> > just like they did for the similar SELinux feature.  See “Example code
> > to avoid execmem violations” in:
> > 
> >    <https://www.akkadia.org/drepper/selinux-mem.html>
[...]
> > As you can see, this reference implementation creates a PROT_WRITE
> > mapping aliased to a PROT_EXEC mapping, so it actually reduces security
> > compared to something that generates the code in an anonymous mapping
> > and calls mprotect to make it executable.
[...]
> If a service legitimately needs executable and writable mappings (due to
> JIT, trampolines etc), it's easy to disable the filter whenever really
> needed with "MemoryDenyWriteExecute=no" (which is the default) in case of
> systemd or a TE rule like "allow type_t self:process { execmem };" for
> SELinux. But this shouldn't be the default case, since there are many
> services which don't need W&X.

I think Drepper's point is that separate X and W mappings, with enough
randomisation, would be more secure than allowing W&X at the same
address (but, of course, less secure than not having W at all, though
that's not always possible).

> I'd also question what is the value of BTI if it can be easily circumvented
> by removing PROT_BTI with mprotect()?

Well, BTI is a protection against JOP attacks. The assumption here is
that an attacker cannot invoke mprotect() to disable PROT_BTI. If it
can, it's probably not worth bothering with a subsequent JOP attack, it
can already call functions directly.

I see MDWX not as a way of detecting attacks but rather plugging
inadvertent security holes in certain programs. On arm64, such hardening
currently gets in the way of another hardening feature, BTI.

-- 
Catalin

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 0/4] aarch64: avoid mprotect(PROT_BTI|PROT_EXEC) [BZ #26831]
@ 2020-11-04 14:35           ` Catalin Marinas
  0 siblings, 0 replies; 62+ messages in thread
From: Catalin Marinas @ 2020-11-04 14:35 UTC (permalink / raw)
  To: Topi Miettinen
  Cc: Florian Weimer, Mark Rutland, libc-alpha, Kees Cook,
	kernel-hardening, Szabolcs Nagy, linux-kernel, Jeremy Linton,
	Mark Brown, Lennart Poettering, linux-hardening,
	Salvatore Mesoraca, Will Deacon, linux-arm-kernel

On Wed, Nov 04, 2020 at 11:55:57AM +0200, Topi Miettinen wrote:
> On 4.11.2020 11.29, Florian Weimer wrote:
> > * Will Deacon:
> > 
> > > Is there real value in this seccomp filter if it only looks at mprotect(),
> > > or was it just implemented because it's easy to do and sounds like a good
> > > idea?
> > 
> > It seems bogus to me.  Everyone will just create alias mappings instead,
> > just like they did for the similar SELinux feature.  See “Example code
> > to avoid execmem violations” in:
> > 
> >    <https://www.akkadia.org/drepper/selinux-mem.html>
[...]
> > As you can see, this reference implementation creates a PROT_WRITE
> > mapping aliased to a PROT_EXEC mapping, so it actually reduces security
> > compared to something that generates the code in an anonymous mapping
> > and calls mprotect to make it executable.
[...]
> If a service legitimately needs executable and writable mappings (due to
> JIT, trampolines etc), it's easy to disable the filter whenever really
> needed with "MemoryDenyWriteExecute=no" (which is the default) in case of
> systemd or a TE rule like "allow type_t self:process { execmem };" for
> SELinux. But this shouldn't be the default case, since there are many
> services which don't need W&X.

I think Drepper's point is that separate X and W mappings, with enough
randomisation, would be more secure than allowing W&X at the same
address (but, of course, less secure than not having W at all, though
that's not always possible).

> I'd also question what is the value of BTI if it can be easily circumvented
> by removing PROT_BTI with mprotect()?

Well, BTI is a protection against JOP attacks. The assumption here is
that an attacker cannot invoke mprotect() to disable PROT_BTI. If it
can, it's probably not worth bothering with a subsequent JOP attack, it
can already call functions directly.

I see MDWX not as a way of detecting attacks but rather plugging
inadvertent security holes in certain programs. On arm64, such hardening
currently gets in the way of another hardening feature, BTI.

-- 
Catalin

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 0/4] aarch64: avoid mprotect(PROT_BTI|PROT_EXEC) [BZ #26831]
  2020-11-04  8:57       ` Szabolcs Nagy
@ 2020-11-04 14:41         ` Catalin Marinas
  -1 siblings, 0 replies; 62+ messages in thread
From: Catalin Marinas @ 2020-11-04 14:41 UTC (permalink / raw)
  To: Szabolcs Nagy
  Cc: Jeremy Linton, Mark Brown, libc-alpha, Mark Rutland, Will Deacon,
	Florian Weimer, Kees Cook, Salvatore Mesoraca,
	Lennart Poettering, Topi Miettinen, linux-kernel,
	linux-arm-kernel, kernel-hardening, linux-hardening

On Wed, Nov 04, 2020 at 08:57:05AM +0000, Szabolcs Nagy wrote:
> The 11/03/2020 23:41, Jeremy Linton wrote:
> > On 11/3/20 11:34 AM, Mark Brown wrote:
> > > On Tue, Nov 03, 2020 at 10:25:37AM +0000, Szabolcs Nagy wrote:
> > > 
> > > > Re-mmap executable segments instead of mprotecting them in
> > > > case mprotect is seccomp filtered.
> > > 
> > > > For the kernel mapped main executable we don't have the fd
> > > > for re-mmap so linux needs to be updated to add BTI. (In the
> > > > presence of seccomp filters for mprotect(PROT_EXEC) the libc
> > > > cannot change BTI protection at runtime based on user space
> > > > policy so it is better if the kernel maps BTI compatible
> > > > binaries with PROT_BTI by default.)
> > > 
> > > Given that there were still some ongoing discussions on a more robust
> > > kernel interface here and there seem to be a few concerns with this
> > > series should we perhaps just take a step back and disable this seccomp
> > > filter in systemd on arm64, at least for the time being?  That seems
> > > safer than rolling out things that set ABI quickly, a big part of the
> > 
> > So, that's a bigger hammer than I think is needed and punishes !BTI
> > machines. I'm going to suggest that if we need to carry a temp patch its
> > more like the glibc patch I mentioned in the Fedora defect. That patch
> > simply logs a message, on the mprotect failures rather than aborting. Its
> > fairly non-intrusive.
> > 
> > That leaves seccomp functional, and BTI generally functional except when
> > seccomp is restricting it. I've also been asked that if a patch like that is
> > needed, its (temporary?) merged to the glibc trunk, rather than just being
> > carried by the distro's.
> 
> note that changing mprotect into mmap in glibc works
> even if the kernel or systemd decides to do things
> differently: currently the only wart is that on the
> main exe we have to use mprotect and silently ignore
> the failures.

Can the dynamic loader mmap() the main exe again while munmap'ing the
original one? (sorry if it was already discussed)

-- 
Catalin

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 0/4] aarch64: avoid mprotect(PROT_BTI|PROT_EXEC) [BZ #26831]
@ 2020-11-04 14:41         ` Catalin Marinas
  0 siblings, 0 replies; 62+ messages in thread
From: Catalin Marinas @ 2020-11-04 14:41 UTC (permalink / raw)
  To: Szabolcs Nagy
  Cc: Mark Rutland, Florian Weimer, libc-alpha, Kees Cook,
	kernel-hardening, Salvatore Mesoraca, linux-kernel,
	Jeremy Linton, Mark Brown, Lennart Poettering, linux-hardening,
	Topi Miettinen, Will Deacon, linux-arm-kernel

On Wed, Nov 04, 2020 at 08:57:05AM +0000, Szabolcs Nagy wrote:
> The 11/03/2020 23:41, Jeremy Linton wrote:
> > On 11/3/20 11:34 AM, Mark Brown wrote:
> > > On Tue, Nov 03, 2020 at 10:25:37AM +0000, Szabolcs Nagy wrote:
> > > 
> > > > Re-mmap executable segments instead of mprotecting them in
> > > > case mprotect is seccomp filtered.
> > > 
> > > > For the kernel mapped main executable we don't have the fd
> > > > for re-mmap so linux needs to be updated to add BTI. (In the
> > > > presence of seccomp filters for mprotect(PROT_EXEC) the libc
> > > > cannot change BTI protection at runtime based on user space
> > > > policy so it is better if the kernel maps BTI compatible
> > > > binaries with PROT_BTI by default.)
> > > 
> > > Given that there were still some ongoing discussions on a more robust
> > > kernel interface here and there seem to be a few concerns with this
> > > series should we perhaps just take a step back and disable this seccomp
> > > filter in systemd on arm64, at least for the time being?  That seems
> > > safer than rolling out things that set ABI quickly, a big part of the
> > 
> > So, that's a bigger hammer than I think is needed and punishes !BTI
> > machines. I'm going to suggest that if we need to carry a temp patch its
> > more like the glibc patch I mentioned in the Fedora defect. That patch
> > simply logs a message, on the mprotect failures rather than aborting. Its
> > fairly non-intrusive.
> > 
> > That leaves seccomp functional, and BTI generally functional except when
> > seccomp is restricting it. I've also been asked that if a patch like that is
> > needed, its (temporary?) merged to the glibc trunk, rather than just being
> > carried by the distro's.
> 
> note that changing mprotect into mmap in glibc works
> even if the kernel or systemd decides to do things
> differently: currently the only wart is that on the
> main exe we have to use mprotect and silently ignore
> the failures.

Can the dynamic loader mmap() the main exe again while munmap'ing the
original one? (sorry if it was already discussed)

-- 
Catalin

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 0/4] aarch64: avoid mprotect(PROT_BTI|PROT_EXEC) [BZ #26831]
  2020-11-04 14:41         ` Catalin Marinas
  (?)
@ 2020-11-04 14:45           ` Florian Weimer
  -1 siblings, 0 replies; 62+ messages in thread
From: Florian Weimer @ 2020-11-04 14:45 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Szabolcs Nagy, Jeremy Linton, Mark Brown, libc-alpha,
	Mark Rutland, Will Deacon, Kees Cook, Salvatore Mesoraca,
	Lennart Poettering, Topi Miettinen, linux-kernel,
	linux-arm-kernel, kernel-hardening, linux-hardening

* Catalin Marinas:

> Can the dynamic loader mmap() the main exe again while munmap'ing the
> original one? (sorry if it was already discussed)

No, we don't have a descriptor for that.  /proc may not be mounted, and
using the path stored there has a race condition anyway.

Thanks,
Florian
-- 
Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 0/4] aarch64: avoid mprotect(PROT_BTI|PROT_EXEC) [BZ #26831]
@ 2020-11-04 14:45           ` Florian Weimer
  0 siblings, 0 replies; 62+ messages in thread
From: Florian Weimer @ 2020-11-04 14:45 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Mark Rutland, Salvatore Mesoraca, libc-alpha, Kees Cook,
	kernel-hardening, Szabolcs Nagy, linux-kernel, Jeremy Linton,
	Mark Brown, Lennart Poettering, linux-hardening, Topi Miettinen,
	Will Deacon, linux-arm-kernel

* Catalin Marinas:

> Can the dynamic loader mmap() the main exe again while munmap'ing the
> original one? (sorry if it was already discussed)

No, we don't have a descriptor for that.  /proc may not be mounted, and
using the path stored there has a race condition anyway.

Thanks,
Florian
-- 
Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 0/4] aarch64: avoid mprotect(PROT_BTI|PROT_EXEC) [BZ #26831]
@ 2020-11-04 14:45           ` Florian Weimer
  0 siblings, 0 replies; 62+ messages in thread
From: Florian Weimer @ 2020-11-04 14:45 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Szabolcs Nagy, Jeremy Linton, Mark Brown, libc-alpha,
	Mark Rutland, Will Deacon, Kees Cook, Salvatore Mesoraca,
	Lennart Poettering, Topi Miettinen, linux-kernel,
	linux-arm-kernel, kernel-hardening, linux-hardening

* Catalin Marinas:

> Can the dynamic loader mmap() the main exe again while munmap'ing the
> original one? (sorry if it was already discussed)

No, we don't have a descriptor for that.  /proc may not be mounted, and
using the path stored there has a race condition anyway.

Thanks,
Florian
-- 
Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 0/4] aarch64: avoid mprotect(PROT_BTI|PROT_EXEC) [BZ #26831]
  2020-11-04 14:35           ` Catalin Marinas
@ 2020-11-04 15:19             ` Topi Miettinen
  -1 siblings, 0 replies; 62+ messages in thread
From: Topi Miettinen @ 2020-11-04 15:19 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Florian Weimer, Will Deacon, Mark Brown, Szabolcs Nagy,
	libc-alpha, Jeremy Linton, Mark Rutland, Kees Cook,
	Salvatore Mesoraca, Lennart Poettering, linux-kernel,
	linux-arm-kernel, kernel-hardening, linux-hardening

On 4.11.2020 16.35, Catalin Marinas wrote:
> On Wed, Nov 04, 2020 at 11:55:57AM +0200, Topi Miettinen wrote:
>> On 4.11.2020 11.29, Florian Weimer wrote:
>>> * Will Deacon:
>>>
>>>> Is there real value in this seccomp filter if it only looks at mprotect(),
>>>> or was it just implemented because it's easy to do and sounds like a good
>>>> idea?
>>>
>>> It seems bogus to me.  Everyone will just create alias mappings instead,
>>> just like they did for the similar SELinux feature.  See “Example code
>>> to avoid execmem violations” in:
>>>
>>>     <https://www.akkadia.org/drepper/selinux-mem.html>
> [...]
>>> As you can see, this reference implementation creates a PROT_WRITE
>>> mapping aliased to a PROT_EXEC mapping, so it actually reduces security
>>> compared to something that generates the code in an anonymous mapping
>>> and calls mprotect to make it executable.
> [...]
>> If a service legitimately needs executable and writable mappings (due to
>> JIT, trampolines etc), it's easy to disable the filter whenever really
>> needed with "MemoryDenyWriteExecute=no" (which is the default) in case of
>> systemd or a TE rule like "allow type_t self:process { execmem };" for
>> SELinux. But this shouldn't be the default case, since there are many
>> services which don't need W&X.
> 
> I think Drepper's point is that separate X and W mappings, with enough
> randomisation, would be more secure than allowing W&X at the same
> address (but, of course, less secure than not having W at all, though
> that's not always possible).
> 
>> I'd also question what is the value of BTI if it can be easily circumvented
>> by removing PROT_BTI with mprotect()?
> 
> Well, BTI is a protection against JOP attacks. The assumption here is
> that an attacker cannot invoke mprotect() to disable PROT_BTI. If it
> can, it's probably not worth bothering with a subsequent JOP attack, it
> can already call functions directly.

I suppose that the target for the attacker is to eventually perform 
system calls rather than looping forever in JOP/ROP gadgets.

> I see MDWX not as a way of detecting attacks but rather plugging
> inadvertent security holes in certain programs. On arm64, such hardening
> currently gets in the way of another hardening feature, BTI.

I don't think it has to get in the way at all. Why wouldn't something 
simple like this work:

diff --git a/elf/dl-load.c b/elf/dl-load.c
index 646c5dca40..12a74d15e8 100644
--- a/elf/dl-load.c
+++ b/elf/dl-load.c
@@ -1170,8 +1170,13 @@ _dl_map_object_from_fd (const char *name, const 
char *origname, int fd,
             c->prot |= PROT_READ;
           if (ph->p_flags & PF_W)
             c->prot |= PROT_WRITE;
-         if (ph->p_flags & PF_X)
+         if (ph->p_flags & PF_X) {
             c->prot |= PROT_EXEC;
+#ifdef PROT_BTI
+           if (GLRO(dl_bti) & 1)
+             c->prot |= PROT_BTI;
+#endif
+         }
  #endif
           break;

diff --git a/elf/dl-support.c b/elf/dl-support.c
index 7704c101c5..22c7cc7b81 100644
--- a/elf/dl-support.c
+++ b/elf/dl-support.c
@@ -222,7 +222,7 @@ __rtld_lock_define_initialized_recursive (, 
_dl_load_write_lock)


  #ifdef HAVE_AUX_VECTOR
-int _dl_clktck;
+int _dl_clktck, _dl_bti;

  void
  _dl_aux_init (ElfW(auxv_t) *av)
@@ -294,6 +294,11 @@ _dl_aux_init (ElfW(auxv_t) *av)
        case AT_RANDOM:
         _dl_random = (void *) av->a_un.a_val;
         break;
+#ifdef PROT_BTI
+      case AT_BTI:
+       _dl_bti = (void *) av->a_un.a_val;
+       break;
+#endif
        DL_PLATFORM_AUXV
        }
    if (seen == 0xf)

Kernel sets the aux vector to indicate that BTI should be enabled for 
all segments and main exe is already protected.

-Topi

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH 0/4] aarch64: avoid mprotect(PROT_BTI|PROT_EXEC) [BZ #26831]
@ 2020-11-04 15:19             ` Topi Miettinen
  0 siblings, 0 replies; 62+ messages in thread
From: Topi Miettinen @ 2020-11-04 15:19 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Florian Weimer, Mark Rutland, libc-alpha, Kees Cook,
	kernel-hardening, Szabolcs Nagy, linux-kernel, Jeremy Linton,
	Mark Brown, Lennart Poettering, linux-hardening,
	Salvatore Mesoraca, Will Deacon, linux-arm-kernel

On 4.11.2020 16.35, Catalin Marinas wrote:
> On Wed, Nov 04, 2020 at 11:55:57AM +0200, Topi Miettinen wrote:
>> On 4.11.2020 11.29, Florian Weimer wrote:
>>> * Will Deacon:
>>>
>>>> Is there real value in this seccomp filter if it only looks at mprotect(),
>>>> or was it just implemented because it's easy to do and sounds like a good
>>>> idea?
>>>
>>> It seems bogus to me.  Everyone will just create alias mappings instead,
>>> just like they did for the similar SELinux feature.  See “Example code
>>> to avoid execmem violations” in:
>>>
>>>     <https://www.akkadia.org/drepper/selinux-mem.html>
> [...]
>>> As you can see, this reference implementation creates a PROT_WRITE
>>> mapping aliased to a PROT_EXEC mapping, so it actually reduces security
>>> compared to something that generates the code in an anonymous mapping
>>> and calls mprotect to make it executable.
> [...]
>> If a service legitimately needs executable and writable mappings (due to
>> JIT, trampolines etc), it's easy to disable the filter whenever really
>> needed with "MemoryDenyWriteExecute=no" (which is the default) in case of
>> systemd or a TE rule like "allow type_t self:process { execmem };" for
>> SELinux. But this shouldn't be the default case, since there are many
>> services which don't need W&X.
> 
> I think Drepper's point is that separate X and W mappings, with enough
> randomisation, would be more secure than allowing W&X at the same
> address (but, of course, less secure than not having W at all, though
> that's not always possible).
> 
>> I'd also question what is the value of BTI if it can be easily circumvented
>> by removing PROT_BTI with mprotect()?
> 
> Well, BTI is a protection against JOP attacks. The assumption here is
> that an attacker cannot invoke mprotect() to disable PROT_BTI. If it
> can, it's probably not worth bothering with a subsequent JOP attack, it
> can already call functions directly.

I suppose that the target for the attacker is to eventually perform 
system calls rather than looping forever in JOP/ROP gadgets.

> I see MDWX not as a way of detecting attacks but rather plugging
> inadvertent security holes in certain programs. On arm64, such hardening
> currently gets in the way of another hardening feature, BTI.

I don't think it has to get in the way at all. Why wouldn't something 
simple like this work:

diff --git a/elf/dl-load.c b/elf/dl-load.c
index 646c5dca40..12a74d15e8 100644
--- a/elf/dl-load.c
+++ b/elf/dl-load.c
@@ -1170,8 +1170,13 @@ _dl_map_object_from_fd (const char *name, const 
char *origname, int fd,
             c->prot |= PROT_READ;
           if (ph->p_flags & PF_W)
             c->prot |= PROT_WRITE;
-         if (ph->p_flags & PF_X)
+         if (ph->p_flags & PF_X) {
             c->prot |= PROT_EXEC;
+#ifdef PROT_BTI
+           if (GLRO(dl_bti) & 1)
+             c->prot |= PROT_BTI;
+#endif
+         }
  #endif
           break;

diff --git a/elf/dl-support.c b/elf/dl-support.c
index 7704c101c5..22c7cc7b81 100644
--- a/elf/dl-support.c
+++ b/elf/dl-support.c
@@ -222,7 +222,7 @@ __rtld_lock_define_initialized_recursive (, 
_dl_load_write_lock)


  #ifdef HAVE_AUX_VECTOR
-int _dl_clktck;
+int _dl_clktck, _dl_bti;

  void
  _dl_aux_init (ElfW(auxv_t) *av)
@@ -294,6 +294,11 @@ _dl_aux_init (ElfW(auxv_t) *av)
        case AT_RANDOM:
         _dl_random = (void *) av->a_un.a_val;
         break;
+#ifdef PROT_BTI
+      case AT_BTI:
+       _dl_bti = (void *) av->a_un.a_val;
+       break;
+#endif
        DL_PLATFORM_AUXV
        }
    if (seen == 0xf)

Kernel sets the aux vector to indicate that BTI should be enabled for 
all segments and main exe is already protected.

-Topi

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH 0/4] aarch64: avoid mprotect(PROT_BTI|PROT_EXEC) [BZ #26831]
  2020-11-04  9:55         ` Topi Miettinen
@ 2020-11-04 15:20           ` Mark Rutland
  -1 siblings, 0 replies; 62+ messages in thread
From: Mark Rutland @ 2020-11-04 15:20 UTC (permalink / raw)
  To: Topi Miettinen
  Cc: Florian Weimer, Will Deacon, Mark Brown, Szabolcs Nagy,
	libc-alpha, Jeremy Linton, Catalin Marinas, Kees Cook,
	Salvatore Mesoraca, Lennart Poettering, linux-kernel,
	linux-arm-kernel, kernel-hardening, linux-hardening

On Wed, Nov 04, 2020 at 11:55:57AM +0200, Topi Miettinen wrote:
> On 4.11.2020 11.29, Florian Weimer wrote:
> > * Will Deacon:
> > 
> > > Is there real value in this seccomp filter if it only looks at mprotect(),
> > > or was it just implemented because it's easy to do and sounds like a good
> > > idea?
> > 
> > It seems bogus to me.  Everyone will just create alias mappings instead,
> > just like they did for the similar SELinux feature.  See “Example code
> > to avoid execmem violations” in:
> > 
> >    <https://www.akkadia.org/drepper/selinux-mem.html>
> 
> Also note "But this is very dangerous: programs should never use memory
> regions which are writable and executable at the same time. Assuming that it
> is really necessary to generate executable code while the program runs the
> method employed should be reconsidered."

Sure, and to be clear we're not trying to violate the "at the same time"
property. We do not want to permit simultaneous PROT_WRITE and PROT_EXEC
at any instant in time. What we're asking is to not block changing
permissions to PROT_EXEC in the absence of PROT_WRITE.

I think that the goal of preventing WRITE -> EXEC transitions for some
memory is sane, but I think the existing kernel primitives available to
systemd don't allow us to do that in a robust way because we don't have
all the relevant state tracked and accessible, and the existing approach
gets in the way of doing the right thing for other mitigations.

Consequently I think it would be better going forward to add a more
robust (kernel) mechanism for enforcement that can distinguish
WRITE->EXEC from EXEC->EXEC+BTI, and e.g. can be used to forbid aliasing
mappings with differing W/X permissions. Then userspace could eventually
transition over to that and get /stronger/ protection while permitting
the BTI case we'd like to work now.

> If a service legitimately needs executable and writable mappings (due to
> JIT, trampolines etc), it's easy to disable the filter whenever really
> needed with "MemoryDenyWriteExecute=no" (which is the default) in case of
> systemd or a TE rule like "allow type_t self:process { execmem };" for
> SELinux. But this shouldn't be the default case, since there are many
> services which don't need W&X.
> 
> I'd also question what is the value of BTI if it can be easily circumvented
> by removing PROT_BTI with mprotect()?

I agree that turning BTI off is a concern, and to that end I'd like to
add an enforcement mechanism whereby we could prevent that (ideally the
same mechanism by which we could prevent WRITE -> EXEC transitions). 

But, as with all things it's a matter of degree. MDWE and BTI are both
hurdles to an adversary, but neither are absolutes and there are
approaches to bypass either. By the time someone's issuing mprotect()
with an arbitrary VA and/or prot, they are liable to have been able to
do the same with mmap() and circumvent MDWE.

I'd really like to not have BTI silently disabled in order to work with
MDWE, because the risk is that it gets silently disabled elsewhere. The
risk of the changing the kernel to enable BTI for a binary is not well
known since we don't control other peoples libraries that might end up
not being compatible somehow with that. The risk of disabling a portion
of the MDWE protections seems to be the least out of the options we have
available, as unfortunate as it seems, and I think we can come up with a
better MDWE approach going forward.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 0/4] aarch64: avoid mprotect(PROT_BTI|PROT_EXEC) [BZ #26831]
@ 2020-11-04 15:20           ` Mark Rutland
  0 siblings, 0 replies; 62+ messages in thread
From: Mark Rutland @ 2020-11-04 15:20 UTC (permalink / raw)
  To: Topi Miettinen
  Cc: Florian Weimer, Salvatore Mesoraca, libc-alpha, Kees Cook,
	kernel-hardening, Szabolcs Nagy, Catalin Marinas, linux-kernel,
	Jeremy Linton, Mark Brown, Lennart Poettering, linux-hardening,
	Will Deacon, linux-arm-kernel

On Wed, Nov 04, 2020 at 11:55:57AM +0200, Topi Miettinen wrote:
> On 4.11.2020 11.29, Florian Weimer wrote:
> > * Will Deacon:
> > 
> > > Is there real value in this seccomp filter if it only looks at mprotect(),
> > > or was it just implemented because it's easy to do and sounds like a good
> > > idea?
> > 
> > It seems bogus to me.  Everyone will just create alias mappings instead,
> > just like they did for the similar SELinux feature.  See “Example code
> > to avoid execmem violations” in:
> > 
> >    <https://www.akkadia.org/drepper/selinux-mem.html>
> 
> Also note "But this is very dangerous: programs should never use memory
> regions which are writable and executable at the same time. Assuming that it
> is really necessary to generate executable code while the program runs the
> method employed should be reconsidered."

Sure, and to be clear we're not trying to violate the "at the same time"
property. We do not want to permit simultaneous PROT_WRITE and PROT_EXEC
at any instant in time. What we're asking is to not block changing
permissions to PROT_EXEC in the absence of PROT_WRITE.

I think that the goal of preventing WRITE -> EXEC transitions for some
memory is sane, but I think the existing kernel primitives available to
systemd don't allow us to do that in a robust way because we don't have
all the relevant state tracked and accessible, and the existing approach
gets in the way of doing the right thing for other mitigations.

Consequently I think it would be better going forward to add a more
robust (kernel) mechanism for enforcement that can distinguish
WRITE->EXEC from EXEC->EXEC+BTI, and e.g. can be used to forbid aliasing
mappings with differing W/X permissions. Then userspace could eventually
transition over to that and get /stronger/ protection while permitting
the BTI case we'd like to work now.

> If a service legitimately needs executable and writable mappings (due to
> JIT, trampolines etc), it's easy to disable the filter whenever really
> needed with "MemoryDenyWriteExecute=no" (which is the default) in case of
> systemd or a TE rule like "allow type_t self:process { execmem };" for
> SELinux. But this shouldn't be the default case, since there are many
> services which don't need W&X.
> 
> I'd also question what is the value of BTI if it can be easily circumvented
> by removing PROT_BTI with mprotect()?

I agree that turning BTI off is a concern, and to that end I'd like to
add an enforcement mechanism whereby we could prevent that (ideally the
same mechanism by which we could prevent WRITE -> EXEC transitions). 

But, as with all things it's a matter of degree. MDWE and BTI are both
hurdles to an adversary, but neither are absolutes and there are
approaches to bypass either. By the time someone's issuing mprotect()
with an arbitrary VA and/or prot, they are liable to have been able to
do the same with mmap() and circumvent MDWE.

I'd really like to not have BTI silently disabled in order to work with
MDWE, because the risk is that it gets silently disabled elsewhere. The
risk of the changing the kernel to enable BTI for a binary is not well
known since we don't control other peoples libraries that might end up
not being compatible somehow with that. The risk of disabling a portion
of the MDWE protections seems to be the least out of the options we have
available, as unfortunate as it seems, and I think we can come up with a
better MDWE approach going forward.

Thanks,
Mark.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 0/4] aarch64: avoid mprotect(PROT_BTI|PROT_EXEC) [BZ #26831]
  2020-11-04 15:19             ` Topi Miettinen
@ 2020-11-04 16:08               ` Szabolcs Nagy
  -1 siblings, 0 replies; 62+ messages in thread
From: Szabolcs Nagy @ 2020-11-04 16:08 UTC (permalink / raw)
  To: Topi Miettinen
  Cc: Catalin Marinas, Florian Weimer, Will Deacon, Mark Brown,
	libc-alpha, Jeremy Linton, Mark Rutland, Kees Cook,
	Salvatore Mesoraca, Lennart Poettering, linux-kernel,
	linux-arm-kernel, kernel-hardening, linux-hardening

The 11/04/2020 17:19, Topi Miettinen wrote:
> On 4.11.2020 16.35, Catalin Marinas wrote:
> > On Wed, Nov 04, 2020 at 11:55:57AM +0200, Topi Miettinen wrote:
> > > On 4.11.2020 11.29, Florian Weimer wrote:
> > > > * Will Deacon:
> > > > > Is there real value in this seccomp filter if it only looks at mprotect(),
> > > > > or was it just implemented because it's easy to do and sounds like a good
> > > > > idea?
> > > > 
> > > > It seems bogus to me.  Everyone will just create alias mappings instead,
> > > > just like they did for the similar SELinux feature.  See “Example code
> > > > to avoid execmem violations” in:
> > > > 
> > > >     <https://www.akkadia.org/drepper/selinux-mem.html>
> > [...]
> > > > As you can see, this reference implementation creates a PROT_WRITE
> > > > mapping aliased to a PROT_EXEC mapping, so it actually reduces security
> > > > compared to something that generates the code in an anonymous mapping
> > > > and calls mprotect to make it executable.
> > [...]
> > > If a service legitimately needs executable and writable mappings (due to
> > > JIT, trampolines etc), it's easy to disable the filter whenever really
> > > needed with "MemoryDenyWriteExecute=no" (which is the default) in case of
> > > systemd or a TE rule like "allow type_t self:process { execmem };" for
> > > SELinux. But this shouldn't be the default case, since there are many
> > > services which don't need W&X.
> > 
> > I think Drepper's point is that separate X and W mappings, with enough
> > randomisation, would be more secure than allowing W&X at the same
> > address (but, of course, less secure than not having W at all, though
> > that's not always possible).
> > 
> > > I'd also question what is the value of BTI if it can be easily circumvented
> > > by removing PROT_BTI with mprotect()?
> > 
> > Well, BTI is a protection against JOP attacks. The assumption here is
> > that an attacker cannot invoke mprotect() to disable PROT_BTI. If it
> > can, it's probably not worth bothering with a subsequent JOP attack, it
> > can already call functions directly.
> 
> I suppose that the target for the attacker is to eventually perform system
> calls rather than looping forever in JOP/ROP gadgets.
> 
> > I see MDWX not as a way of detecting attacks but rather plugging
> > inadvertent security holes in certain programs. On arm64, such hardening
> > currently gets in the way of another hardening feature, BTI.
> 
> I don't think it has to get in the way at all. Why wouldn't something simple
> like this work:

PROT_BTI is only valid on binaries that are BTI compatible.
to detect that, the load segments must be already mapped.

AT_BTI does not solve this: we want to be able to load legacy
elf modules. (a BTI enforcement setting may be useful where
incompatible modules are rejected, but that cannot be the
default for backward compatibility reasons.)

> 
> diff --git a/elf/dl-load.c b/elf/dl-load.c
> index 646c5dca40..12a74d15e8 100644
> --- a/elf/dl-load.c
> +++ b/elf/dl-load.c
> @@ -1170,8 +1170,13 @@ _dl_map_object_from_fd (const char *name, const char
> *origname, int fd,
>             c->prot |= PROT_READ;
>           if (ph->p_flags & PF_W)
>             c->prot |= PROT_WRITE;
> -         if (ph->p_flags & PF_X)
> +         if (ph->p_flags & PF_X) {
>             c->prot |= PROT_EXEC;
> +#ifdef PROT_BTI
> +           if (GLRO(dl_bti) & 1)
> +             c->prot |= PROT_BTI;
> +#endif
> +         }
>  #endif
>           break;
> 
> diff --git a/elf/dl-support.c b/elf/dl-support.c
> index 7704c101c5..22c7cc7b81 100644
> --- a/elf/dl-support.c
> +++ b/elf/dl-support.c
> @@ -222,7 +222,7 @@ __rtld_lock_define_initialized_recursive (,
> _dl_load_write_lock)
> 
> 
>  #ifdef HAVE_AUX_VECTOR
> -int _dl_clktck;
> +int _dl_clktck, _dl_bti;
> 
>  void
>  _dl_aux_init (ElfW(auxv_t) *av)
> @@ -294,6 +294,11 @@ _dl_aux_init (ElfW(auxv_t) *av)
>        case AT_RANDOM:
>         _dl_random = (void *) av->a_un.a_val;
>         break;
> +#ifdef PROT_BTI
> +      case AT_BTI:
> +       _dl_bti = (void *) av->a_un.a_val;
> +       break;
> +#endif
>        DL_PLATFORM_AUXV
>        }
>    if (seen == 0xf)
> 
> Kernel sets the aux vector to indicate that BTI should be enabled for all
> segments and main exe is already protected.
> 
> -Topi

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 0/4] aarch64: avoid mprotect(PROT_BTI|PROT_EXEC) [BZ #26831]
@ 2020-11-04 16:08               ` Szabolcs Nagy
  0 siblings, 0 replies; 62+ messages in thread
From: Szabolcs Nagy @ 2020-11-04 16:08 UTC (permalink / raw)
  To: Topi Miettinen
  Cc: Florian Weimer, Mark Rutland, libc-alpha, Kees Cook,
	kernel-hardening, Salvatore Mesoraca, Catalin Marinas,
	linux-kernel, Jeremy Linton, Mark Brown, Lennart Poettering,
	linux-hardening, Will Deacon, linux-arm-kernel

The 11/04/2020 17:19, Topi Miettinen wrote:
> On 4.11.2020 16.35, Catalin Marinas wrote:
> > On Wed, Nov 04, 2020 at 11:55:57AM +0200, Topi Miettinen wrote:
> > > On 4.11.2020 11.29, Florian Weimer wrote:
> > > > * Will Deacon:
> > > > > Is there real value in this seccomp filter if it only looks at mprotect(),
> > > > > or was it just implemented because it's easy to do and sounds like a good
> > > > > idea?
> > > > 
> > > > It seems bogus to me.  Everyone will just create alias mappings instead,
> > > > just like they did for the similar SELinux feature.  See “Example code
> > > > to avoid execmem violations” in:
> > > > 
> > > >     <https://www.akkadia.org/drepper/selinux-mem.html>
> > [...]
> > > > As you can see, this reference implementation creates a PROT_WRITE
> > > > mapping aliased to a PROT_EXEC mapping, so it actually reduces security
> > > > compared to something that generates the code in an anonymous mapping
> > > > and calls mprotect to make it executable.
> > [...]
> > > If a service legitimately needs executable and writable mappings (due to
> > > JIT, trampolines etc), it's easy to disable the filter whenever really
> > > needed with "MemoryDenyWriteExecute=no" (which is the default) in case of
> > > systemd or a TE rule like "allow type_t self:process { execmem };" for
> > > SELinux. But this shouldn't be the default case, since there are many
> > > services which don't need W&X.
> > 
> > I think Drepper's point is that separate X and W mappings, with enough
> > randomisation, would be more secure than allowing W&X at the same
> > address (but, of course, less secure than not having W at all, though
> > that's not always possible).
> > 
> > > I'd also question what is the value of BTI if it can be easily circumvented
> > > by removing PROT_BTI with mprotect()?
> > 
> > Well, BTI is a protection against JOP attacks. The assumption here is
> > that an attacker cannot invoke mprotect() to disable PROT_BTI. If it
> > can, it's probably not worth bothering with a subsequent JOP attack, it
> > can already call functions directly.
> 
> I suppose that the target for the attacker is to eventually perform system
> calls rather than looping forever in JOP/ROP gadgets.
> 
> > I see MDWX not as a way of detecting attacks but rather plugging
> > inadvertent security holes in certain programs. On arm64, such hardening
> > currently gets in the way of another hardening feature, BTI.
> 
> I don't think it has to get in the way at all. Why wouldn't something simple
> like this work:

PROT_BTI is only valid on binaries that are BTI compatible.
to detect that, the load segments must be already mapped.

AT_BTI does not solve this: we want to be able to load legacy
elf modules. (a BTI enforcement setting may be useful where
incompatible modules are rejected, but that cannot be the
default for backward compatibility reasons.)

> 
> diff --git a/elf/dl-load.c b/elf/dl-load.c
> index 646c5dca40..12a74d15e8 100644
> --- a/elf/dl-load.c
> +++ b/elf/dl-load.c
> @@ -1170,8 +1170,13 @@ _dl_map_object_from_fd (const char *name, const char
> *origname, int fd,
>             c->prot |= PROT_READ;
>           if (ph->p_flags & PF_W)
>             c->prot |= PROT_WRITE;
> -         if (ph->p_flags & PF_X)
> +         if (ph->p_flags & PF_X) {
>             c->prot |= PROT_EXEC;
> +#ifdef PROT_BTI
> +           if (GLRO(dl_bti) & 1)
> +             c->prot |= PROT_BTI;
> +#endif
> +         }
>  #endif
>           break;
> 
> diff --git a/elf/dl-support.c b/elf/dl-support.c
> index 7704c101c5..22c7cc7b81 100644
> --- a/elf/dl-support.c
> +++ b/elf/dl-support.c
> @@ -222,7 +222,7 @@ __rtld_lock_define_initialized_recursive (,
> _dl_load_write_lock)
> 
> 
>  #ifdef HAVE_AUX_VECTOR
> -int _dl_clktck;
> +int _dl_clktck, _dl_bti;
> 
>  void
>  _dl_aux_init (ElfW(auxv_t) *av)
> @@ -294,6 +294,11 @@ _dl_aux_init (ElfW(auxv_t) *av)
>        case AT_RANDOM:
>         _dl_random = (void *) av->a_un.a_val;
>         break;
> +#ifdef PROT_BTI
> +      case AT_BTI:
> +       _dl_bti = (void *) av->a_un.a_val;
> +       break;
> +#endif
>        DL_PLATFORM_AUXV
>        }
>    if (seen == 0xf)
> 
> Kernel sets the aux vector to indicate that BTI should be enabled for all
> segments and main exe is already protected.
> 
> -Topi

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 0/4] aarch64: avoid mprotect(PROT_BTI|PROT_EXEC) [BZ #26831]
  2020-11-04 10:50       ` Mark Brown
@ 2020-11-04 18:47         ` Jeremy Linton
  -1 siblings, 0 replies; 62+ messages in thread
From: Jeremy Linton @ 2020-11-04 18:47 UTC (permalink / raw)
  To: Mark Brown
  Cc: Szabolcs Nagy, libc-alpha, Catalin Marinas, Mark Rutland,
	Will Deacon, Florian Weimer, Kees Cook, Salvatore Mesoraca,
	Lennart Poettering, Topi Miettinen, linux-kernel,
	linux-arm-kernel, kernel-hardening, linux-hardening

Hi,

On 11/4/20 4:50 AM, Mark Brown wrote:
> On Tue, Nov 03, 2020 at 11:41:42PM -0600, Jeremy Linton wrote:
>> On 11/3/20 11:34 AM, Mark Brown wrote:
> 
>>> Given that there were still some ongoing discussions on a more robust
>>> kernel interface here and there seem to be a few concerns with this
>>> series should we perhaps just take a step back and disable this seccomp
>>> filter in systemd on arm64, at least for the time being?  That seems
>>> safer than rolling out things that set ABI quickly, a big part of the
> 
>> So, that's a bigger hammer than I think is needed and punishes !BTI
>> machines. I'm going to suggest that if we need to carry a temp patch its
>> more like the glibc patch I mentioned in the Fedora defect. That patch
>> simply logs a message, on the mprotect failures rather than aborting. Its
>> fairly non-intrusive.
> 
>> That leaves seccomp functional, and BTI generally functional except when
>> seccomp is restricting it. I've also been asked that if a patch like that is
>> needed, its (temporary?) merged to the glibc trunk, rather than just being
>> carried by the distro's.
> 
> The effect on pre-BTI hardware is an issue, another option would be for
> systemd to disable this seccomp usage but only after checking for BTI
> support in the system rather than just doing so purely based on the
> architecture.
> 

That works, but your also losing seccomp in the case where the machine 
is BTI capable, but the service isn't. So it should really be checking 
the elf notes, but at that point you might just as well patch glibc.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 0/4] aarch64: avoid mprotect(PROT_BTI|PROT_EXEC) [BZ #26831]
@ 2020-11-04 18:47         ` Jeremy Linton
  0 siblings, 0 replies; 62+ messages in thread
From: Jeremy Linton @ 2020-11-04 18:47 UTC (permalink / raw)
  To: Mark Brown
  Cc: Mark Rutland, Florian Weimer, libc-alpha, Kees Cook,
	kernel-hardening, Szabolcs Nagy, Catalin Marinas, linux-kernel,
	Lennart Poettering, linux-hardening, Salvatore Mesoraca,
	Topi Miettinen, Will Deacon, linux-arm-kernel

Hi,

On 11/4/20 4:50 AM, Mark Brown wrote:
> On Tue, Nov 03, 2020 at 11:41:42PM -0600, Jeremy Linton wrote:
>> On 11/3/20 11:34 AM, Mark Brown wrote:
> 
>>> Given that there were still some ongoing discussions on a more robust
>>> kernel interface here and there seem to be a few concerns with this
>>> series should we perhaps just take a step back and disable this seccomp
>>> filter in systemd on arm64, at least for the time being?  That seems
>>> safer than rolling out things that set ABI quickly, a big part of the
> 
>> So, that's a bigger hammer than I think is needed and punishes !BTI
>> machines. I'm going to suggest that if we need to carry a temp patch its
>> more like the glibc patch I mentioned in the Fedora defect. That patch
>> simply logs a message, on the mprotect failures rather than aborting. Its
>> fairly non-intrusive.
> 
>> That leaves seccomp functional, and BTI generally functional except when
>> seccomp is restricting it. I've also been asked that if a patch like that is
>> needed, its (temporary?) merged to the glibc trunk, rather than just being
>> carried by the distro's.
> 
> The effect on pre-BTI hardware is an issue, another option would be for
> systemd to disable this seccomp usage but only after checking for BTI
> support in the system rather than just doing so purely based on the
> architecture.
> 

That works, but your also losing seccomp in the case where the machine 
is BTI capable, but the service isn't. So it should really be checking 
the elf notes, but at that point you might just as well patch glibc.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 0/4] aarch64: avoid mprotect(PROT_BTI|PROT_EXEC) [BZ #26831]
  2020-11-04 18:47         ` Jeremy Linton
@ 2020-11-04 18:53           ` Mark Brown
  -1 siblings, 0 replies; 62+ messages in thread
From: Mark Brown @ 2020-11-04 18:53 UTC (permalink / raw)
  To: Jeremy Linton
  Cc: Szabolcs Nagy, libc-alpha, Catalin Marinas, Mark Rutland,
	Will Deacon, Florian Weimer, Kees Cook, Salvatore Mesoraca,
	Lennart Poettering, Topi Miettinen, linux-kernel,
	linux-arm-kernel, kernel-hardening, linux-hardening

[-- Attachment #1: Type: text/plain, Size: 766 bytes --]

On Wed, Nov 04, 2020 at 12:47:09PM -0600, Jeremy Linton wrote:
> On 11/4/20 4:50 AM, Mark Brown wrote:

> > The effect on pre-BTI hardware is an issue, another option would be for
> > systemd to disable this seccomp usage but only after checking for BTI
> > support in the system rather than just doing so purely based on the
> > architecture.

> That works, but your also losing seccomp in the case where the machine is
> BTI capable, but the service isn't. So it should really be checking the elf
> notes, but at that point you might just as well patch glibc.

True, I guess I was assuming that a BTI rebuild is done at the distro
level but of course even if that's the case a system could have third
party binaries so you can't just assume that the world is BTI.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 0/4] aarch64: avoid mprotect(PROT_BTI|PROT_EXEC) [BZ #26831]
@ 2020-11-04 18:53           ` Mark Brown
  0 siblings, 0 replies; 62+ messages in thread
From: Mark Brown @ 2020-11-04 18:53 UTC (permalink / raw)
  To: Jeremy Linton
  Cc: Mark Rutland, Florian Weimer, libc-alpha, Kees Cook,
	kernel-hardening, Szabolcs Nagy, Catalin Marinas, linux-kernel,
	Lennart Poettering, linux-hardening, Salvatore Mesoraca,
	Topi Miettinen, Will Deacon, linux-arm-kernel


[-- Attachment #1.1: Type: text/plain, Size: 766 bytes --]

On Wed, Nov 04, 2020 at 12:47:09PM -0600, Jeremy Linton wrote:
> On 11/4/20 4:50 AM, Mark Brown wrote:

> > The effect on pre-BTI hardware is an issue, another option would be for
> > systemd to disable this seccomp usage but only after checking for BTI
> > support in the system rather than just doing so purely based on the
> > architecture.

> That works, but your also losing seccomp in the case where the machine is
> BTI capable, but the service isn't. So it should really be checking the elf
> notes, but at that point you might just as well patch glibc.

True, I guess I was assuming that a BTI rebuild is done at the distro
level but of course even if that's the case a system could have third
party binaries so you can't just assume that the world is BTI.

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

[-- Attachment #2: Type: text/plain, Size: 176 bytes --]

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 0/4] aarch64: avoid mprotect(PROT_BTI|PROT_EXEC) [BZ #26831]
  2020-11-04 15:20           ` Mark Rutland
@ 2020-11-04 18:59             ` Jeremy Linton
  -1 siblings, 0 replies; 62+ messages in thread
From: Jeremy Linton @ 2020-11-04 18:59 UTC (permalink / raw)
  To: Mark Rutland, Topi Miettinen
  Cc: Florian Weimer, Will Deacon, Mark Brown, Szabolcs Nagy,
	libc-alpha, Catalin Marinas, Kees Cook, Salvatore Mesoraca,
	Lennart Poettering, linux-kernel, linux-arm-kernel,
	kernel-hardening, linux-hardening

Hi,

On 11/4/20 9:20 AM, Mark Rutland wrote:
> On Wed, Nov 04, 2020 at 11:55:57AM +0200, Topi Miettinen wrote:
>> On 4.11.2020 11.29, Florian Weimer wrote:
>>> * Will Deacon:
>>>
>>>> Is there real value in this seccomp filter if it only looks at mprotect(),
>>>> or was it just implemented because it's easy to do and sounds like a good
>>>> idea?
>>>
>>> It seems bogus to me.  Everyone will just create alias mappings instead,
>>> just like they did for the similar SELinux feature.  See “Example code
>>> to avoid execmem violations” in:
>>>
>>>     <https://www.akkadia.org/drepper/selinux-mem.html>
>>
>> Also note "But this is very dangerous: programs should never use memory
>> regions which are writable and executable at the same time. Assuming that it
>> is really necessary to generate executable code while the program runs the
>> method employed should be reconsidered."
> 
> Sure, and to be clear we're not trying to violate the "at the same time"
> property. We do not want to permit simultaneous PROT_WRITE and PROT_EXEC
> at any instant in time. What we're asking is to not block changing
> permissions to PROT_EXEC in the absence of PROT_WRITE.
> 
> I think that the goal of preventing WRITE -> EXEC transitions for some
> memory is sane, but I think the existing kernel primitives available to
> systemd don't allow us to do that in a robust way because we don't have
> all the relevant state tracked and accessible, and the existing approach
> gets in the way of doing the right thing for other mitigations.
> 
> Consequently I think it would be better going forward to add a more
> robust (kernel) mechanism for enforcement that can distinguish
> WRITE->EXEC from EXEC->EXEC+BTI, and e.g. can be used to forbid aliasing
> mappings with differing W/X permissions. Then userspace could eventually
> transition over to that and get /stronger/ protection while permitting
> the BTI case we'd like to work now.
> 
>> If a service legitimately needs executable and writable mappings (due to
>> JIT, trampolines etc), it's easy to disable the filter whenever really
>> needed with "MemoryDenyWriteExecute=no" (which is the default) in case of
>> systemd or a TE rule like "allow type_t self:process { execmem };" for
>> SELinux. But this shouldn't be the default case, since there are many
>> services which don't need W&X.
>>
>> I'd also question what is the value of BTI if it can be easily circumvented
>> by removing PROT_BTI with mprotect()?
> 
> I agree that turning BTI off is a concern, and to that end I'd like to
> add an enforcement mechanism whereby we could prevent that (ideally the
> same mechanism by which we could prevent WRITE -> EXEC transitions).
> 
> But, as with all things it's a matter of degree. MDWE and BTI are both
> hurdles to an adversary, but neither are absolutes and there are
> approaches to bypass either. By the time someone's issuing mprotect()
> with an arbitrary VA and/or prot, they are liable to have been able to
> do the same with mmap() and circumvent MDWE.
> 
> I'd really like to not have BTI silently disabled in order to work with
> MDWE, because the risk is that it gets silently disabled elsewhere. The
> risk of the changing the kernel to enable BTI for a binary is not well
> known since we don't control other peoples libraries that might end up
> not being compatible somehow with that. The risk of disabling a portion
> of the MDWE protections seems to be the least out of the options we have
> available, as unfortunate as it seems, and I think we can come up with a
> better MDWE approach going forward.

OTOH, You don't really want to blanket disable either protection, and 
unfortunately  you can't really tell until its too late if the service 
is fully BTI enabled. So you either end up disabling MDWE unnecessarily, 
or you delay until the only choice is not enabling BTI.

I guess there is another option too, which is some kind of delayed MDWE 
policy that only turns on once the service has started, but that isn't 
ideal either.

.

> 
> Thanks,
> Mark.
> 


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 0/4] aarch64: avoid mprotect(PROT_BTI|PROT_EXEC) [BZ #26831]
@ 2020-11-04 18:59             ` Jeremy Linton
  0 siblings, 0 replies; 62+ messages in thread
From: Jeremy Linton @ 2020-11-04 18:59 UTC (permalink / raw)
  To: Mark Rutland, Topi Miettinen
  Cc: Florian Weimer, Salvatore Mesoraca, libc-alpha, Kees Cook,
	kernel-hardening, Szabolcs Nagy, Catalin Marinas, linux-kernel,
	Mark Brown, Lennart Poettering, linux-hardening, Will Deacon,
	linux-arm-kernel

Hi,

On 11/4/20 9:20 AM, Mark Rutland wrote:
> On Wed, Nov 04, 2020 at 11:55:57AM +0200, Topi Miettinen wrote:
>> On 4.11.2020 11.29, Florian Weimer wrote:
>>> * Will Deacon:
>>>
>>>> Is there real value in this seccomp filter if it only looks at mprotect(),
>>>> or was it just implemented because it's easy to do and sounds like a good
>>>> idea?
>>>
>>> It seems bogus to me.  Everyone will just create alias mappings instead,
>>> just like they did for the similar SELinux feature.  See “Example code
>>> to avoid execmem violations” in:
>>>
>>>     <https://www.akkadia.org/drepper/selinux-mem.html>
>>
>> Also note "But this is very dangerous: programs should never use memory
>> regions which are writable and executable at the same time. Assuming that it
>> is really necessary to generate executable code while the program runs the
>> method employed should be reconsidered."
> 
> Sure, and to be clear we're not trying to violate the "at the same time"
> property. We do not want to permit simultaneous PROT_WRITE and PROT_EXEC
> at any instant in time. What we're asking is to not block changing
> permissions to PROT_EXEC in the absence of PROT_WRITE.
> 
> I think that the goal of preventing WRITE -> EXEC transitions for some
> memory is sane, but I think the existing kernel primitives available to
> systemd don't allow us to do that in a robust way because we don't have
> all the relevant state tracked and accessible, and the existing approach
> gets in the way of doing the right thing for other mitigations.
> 
> Consequently I think it would be better going forward to add a more
> robust (kernel) mechanism for enforcement that can distinguish
> WRITE->EXEC from EXEC->EXEC+BTI, and e.g. can be used to forbid aliasing
> mappings with differing W/X permissions. Then userspace could eventually
> transition over to that and get /stronger/ protection while permitting
> the BTI case we'd like to work now.
> 
>> If a service legitimately needs executable and writable mappings (due to
>> JIT, trampolines etc), it's easy to disable the filter whenever really
>> needed with "MemoryDenyWriteExecute=no" (which is the default) in case of
>> systemd or a TE rule like "allow type_t self:process { execmem };" for
>> SELinux. But this shouldn't be the default case, since there are many
>> services which don't need W&X.
>>
>> I'd also question what is the value of BTI if it can be easily circumvented
>> by removing PROT_BTI with mprotect()?
> 
> I agree that turning BTI off is a concern, and to that end I'd like to
> add an enforcement mechanism whereby we could prevent that (ideally the
> same mechanism by which we could prevent WRITE -> EXEC transitions).
> 
> But, as with all things it's a matter of degree. MDWE and BTI are both
> hurdles to an adversary, but neither are absolutes and there are
> approaches to bypass either. By the time someone's issuing mprotect()
> with an arbitrary VA and/or prot, they are liable to have been able to
> do the same with mmap() and circumvent MDWE.
> 
> I'd really like to not have BTI silently disabled in order to work with
> MDWE, because the risk is that it gets silently disabled elsewhere. The
> risk of the changing the kernel to enable BTI for a binary is not well
> known since we don't control other peoples libraries that might end up
> not being compatible somehow with that. The risk of disabling a portion
> of the MDWE protections seems to be the least out of the options we have
> available, as unfortunate as it seems, and I think we can come up with a
> better MDWE approach going forward.

OTOH, You don't really want to blanket disable either protection, and 
unfortunately  you can't really tell until its too late if the service 
is fully BTI enabled. So you either end up disabling MDWE unnecessarily, 
or you delay until the only choice is not enabling BTI.

I guess there is another option too, which is some kind of delayed MDWE 
policy that only turns on once the service has started, but that isn't 
ideal either.

.

> 
> Thanks,
> Mark.
> 


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 0/4] aarch64: avoid mprotect(PROT_BTI|PROT_EXEC) [BZ #26831]
  2020-11-04  9:20     ` Will Deacon
@ 2020-11-05 11:31       ` Szabolcs Nagy
  -1 siblings, 0 replies; 62+ messages in thread
From: Szabolcs Nagy @ 2020-11-05 11:31 UTC (permalink / raw)
  To: Will Deacon
  Cc: Mark Brown, libc-alpha, Jeremy Linton, Catalin Marinas,
	Mark Rutland, Florian Weimer, Kees Cook, Salvatore Mesoraca,
	Lennart Poettering, Topi Miettinen, linux-kernel,
	linux-arm-kernel, kernel-hardening, linux-hardening

The 11/04/2020 09:20, Will Deacon wrote:
> On Tue, Nov 03, 2020 at 05:34:38PM +0000, Mark Brown wrote:
> > On Tue, Nov 03, 2020 at 10:25:37AM +0000, Szabolcs Nagy wrote:
> > 
> > > Re-mmap executable segments instead of mprotecting them in
> > > case mprotect is seccomp filtered.
> > 
> > > For the kernel mapped main executable we don't have the fd
> > > for re-mmap so linux needs to be updated to add BTI. (In the
> > > presence of seccomp filters for mprotect(PROT_EXEC) the libc
> > > cannot change BTI protection at runtime based on user space
> > > policy so it is better if the kernel maps BTI compatible
> > > binaries with PROT_BTI by default.)
> > 
> > Given that there were still some ongoing discussions on a more robust
> > kernel interface here and there seem to be a few concerns with this
> > series should we perhaps just take a step back and disable this seccomp
> > filter in systemd on arm64, at least for the time being?  That seems
> > safer than rolling out things that set ABI quickly, a big part of the
> > reason we went with having the dynamic linker enable PROT_BTI in the
> > first place was to give us more flexibility to handle any unforseen
> > consequences of enabling BTI that we run into.  We are going to have
> > similar issues with other features like MTE so we need to make sure that
> > whatever we're doing works with them too.
> > 
> > Also updated to Will's current e-mail address - Will, do you have
> > thoughts on what we should do here?
> 
> Changing the kernel to map the main executable with PROT_BTI by default is a
> user-visible change in behaviour and not without risk, so if we're going to
> do that then it needs to be opt-in because the current behaviour has been
> there since 5.8. I suppose we could shoe-horn in a cmdline option for 5.10
> (which will be the first LTS with BTI) but it would be better to put up with
> the current ABI if possible.

it's not clear to me how adding PROT_BTI in
the kernel would be observable in practice.

adding PROT_BTI to marked elf modules should
only have effect in cases when the process does
invalid operations and then there is no compat
requirement. if this is not the case then adding
PROT_BTI on static exe is already problematic.

if there is some issue with bti that makes
users want to turn it off, then they should do
it system wide or may be we can have a no-bti
option in userspace which uses mprotect to turn
it off (but since that has to be an explicit
opt-out i don't mind if the user also has to
disable the seccomp sandbox).

> Is there real value in this seccomp filter if it only looks at mprotect(),
> or was it just implemented because it's easy to do and sounds like a good
> idea?

i'm fine with just using mprotect and telling
users to remove the seccomp filter. but that
makes bti less attractive for deployment.


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 0/4] aarch64: avoid mprotect(PROT_BTI|PROT_EXEC) [BZ #26831]
@ 2020-11-05 11:31       ` Szabolcs Nagy
  0 siblings, 0 replies; 62+ messages in thread
From: Szabolcs Nagy @ 2020-11-05 11:31 UTC (permalink / raw)
  To: Will Deacon
  Cc: Mark Rutland, Florian Weimer, libc-alpha, Kees Cook,
	kernel-hardening, Salvatore Mesoraca, Catalin Marinas,
	linux-kernel, Jeremy Linton, Mark Brown, Lennart Poettering,
	linux-hardening, Topi Miettinen, linux-arm-kernel

The 11/04/2020 09:20, Will Deacon wrote:
> On Tue, Nov 03, 2020 at 05:34:38PM +0000, Mark Brown wrote:
> > On Tue, Nov 03, 2020 at 10:25:37AM +0000, Szabolcs Nagy wrote:
> > 
> > > Re-mmap executable segments instead of mprotecting them in
> > > case mprotect is seccomp filtered.
> > 
> > > For the kernel mapped main executable we don't have the fd
> > > for re-mmap so linux needs to be updated to add BTI. (In the
> > > presence of seccomp filters for mprotect(PROT_EXEC) the libc
> > > cannot change BTI protection at runtime based on user space
> > > policy so it is better if the kernel maps BTI compatible
> > > binaries with PROT_BTI by default.)
> > 
> > Given that there were still some ongoing discussions on a more robust
> > kernel interface here and there seem to be a few concerns with this
> > series should we perhaps just take a step back and disable this seccomp
> > filter in systemd on arm64, at least for the time being?  That seems
> > safer than rolling out things that set ABI quickly, a big part of the
> > reason we went with having the dynamic linker enable PROT_BTI in the
> > first place was to give us more flexibility to handle any unforseen
> > consequences of enabling BTI that we run into.  We are going to have
> > similar issues with other features like MTE so we need to make sure that
> > whatever we're doing works with them too.
> > 
> > Also updated to Will's current e-mail address - Will, do you have
> > thoughts on what we should do here?
> 
> Changing the kernel to map the main executable with PROT_BTI by default is a
> user-visible change in behaviour and not without risk, so if we're going to
> do that then it needs to be opt-in because the current behaviour has been
> there since 5.8. I suppose we could shoe-horn in a cmdline option for 5.10
> (which will be the first LTS with BTI) but it would be better to put up with
> the current ABI if possible.

it's not clear to me how adding PROT_BTI in
the kernel would be observable in practice.

adding PROT_BTI to marked elf modules should
only have effect in cases when the process does
invalid operations and then there is no compat
requirement. if this is not the case then adding
PROT_BTI on static exe is already problematic.

if there is some issue with bti that makes
users want to turn it off, then they should do
it system wide or may be we can have a no-bti
option in userspace which uses mprotect to turn
it off (but since that has to be an explicit
opt-out i don't mind if the user also has to
disable the seccomp sandbox).

> Is there real value in this seccomp filter if it only looks at mprotect(),
> or was it just implemented because it's easy to do and sounds like a good
> idea?

i'm fine with just using mprotect and telling
users to remove the seccomp filter. but that
makes bti less attractive for deployment.


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

end of thread, other threads:[~2020-11-05 11:33 UTC | newest]

Thread overview: 62+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-03 10:25 [PATCH 0/4] aarch64: avoid mprotect(PROT_BTI|PROT_EXEC) [BZ #26831] Szabolcs Nagy
2020-11-03 10:25 ` Szabolcs Nagy
2020-11-03 10:25 ` [PATCH 1/4] elf: Pass the fd to note processing " Szabolcs Nagy
2020-11-03 10:25   ` Szabolcs Nagy
2020-11-03 10:26 ` [PATCH 2/4] elf: Move note processing after l_phdr is updated " Szabolcs Nagy
2020-11-03 10:26   ` Szabolcs Nagy
2020-11-03 10:38   ` Florian Weimer
2020-11-03 10:38     ` Florian Weimer
2020-11-03 10:38     ` Florian Weimer
2020-11-03 12:36     ` H.J. Lu
2020-11-03 12:36       ` H.J. Lu
2020-11-03 12:36       ` H.J. Lu
2020-11-03 15:04       ` Szabolcs Nagy
2020-11-03 15:04         ` Szabolcs Nagy
2020-11-03 15:27         ` H.J. Lu
2020-11-03 15:27           ` H.J. Lu
2020-11-03 15:27           ` H.J. Lu
2020-11-03 10:26 ` [PATCH 3/4] aarch64: Use mmap to add PROT_BTI instead of mprotect " Szabolcs Nagy
2020-11-03 10:26   ` Szabolcs Nagy
2020-11-03 10:34   ` Florian Weimer
2020-11-03 10:34     ` Florian Weimer
2020-11-03 10:34     ` Florian Weimer
2020-11-03 10:26 ` [PATCH 4/4] aarch64: Remove the bti link_map field " Szabolcs Nagy
2020-11-03 10:26   ` Szabolcs Nagy
2020-11-03 17:34 ` [PATCH 0/4] aarch64: avoid mprotect(PROT_BTI|PROT_EXEC) " Mark Brown
2020-11-03 17:34   ` Mark Brown
2020-11-04  5:41   ` Jeremy Linton
2020-11-04  5:41     ` Jeremy Linton
2020-11-04  8:57     ` Szabolcs Nagy
2020-11-04  8:57       ` Szabolcs Nagy
2020-11-04 14:41       ` Catalin Marinas
2020-11-04 14:41         ` Catalin Marinas
2020-11-04 14:45         ` Florian Weimer
2020-11-04 14:45           ` Florian Weimer
2020-11-04 14:45           ` Florian Weimer
2020-11-04 10:50     ` Mark Brown
2020-11-04 10:50       ` Mark Brown
2020-11-04 18:47       ` Jeremy Linton
2020-11-04 18:47         ` Jeremy Linton
2020-11-04 18:53         ` Mark Brown
2020-11-04 18:53           ` Mark Brown
2020-11-04  9:02   ` Topi Miettinen
2020-11-04  9:02     ` Topi Miettinen
2020-11-04  9:20   ` Will Deacon
2020-11-04  9:20     ` Will Deacon
2020-11-04  9:29     ` Florian Weimer
2020-11-04  9:29       ` Florian Weimer
2020-11-04  9:29       ` Florian Weimer
2020-11-04  9:55       ` Topi Miettinen
2020-11-04  9:55         ` Topi Miettinen
2020-11-04 14:35         ` Catalin Marinas
2020-11-04 14:35           ` Catalin Marinas
2020-11-04 15:19           ` Topi Miettinen
2020-11-04 15:19             ` Topi Miettinen
2020-11-04 16:08             ` Szabolcs Nagy
2020-11-04 16:08               ` Szabolcs Nagy
2020-11-04 15:20         ` Mark Rutland
2020-11-04 15:20           ` Mark Rutland
2020-11-04 18:59           ` Jeremy Linton
2020-11-04 18:59             ` Jeremy Linton
2020-11-05 11:31     ` Szabolcs Nagy
2020-11-05 11:31       ` Szabolcs Nagy

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.