linux-kernel.vger.kernel.org archive mirror
* [RFC 1/7] arch: add __ro_mostly_after_init section marker
@ 2017-02-19 10:04 Hoeun Ryu
  2017-02-19 10:04 ` [RFC 2/7] init: add set_ro_mostly_after_init_rw/ro function Hoeun Ryu
                   ` (6 more replies)
  0 siblings, 7 replies; 22+ messages in thread
From: Hoeun Ryu @ 2017-02-19 10:04 UTC (permalink / raw)
  To: kernel-hardening, linux-kernel
  Cc: Hoeun Ryu, Arnd Bergmann, Kees Cook, Ingo Molnar, linux-arch

 After the `__ro_after_init` marker was added to the kernel, many kernel data
objects could become read-only after init. But there are many other places
where read-only-after-init would be desirable yet `__ro_after_init` cannot
simply be applied, because the objects must be writable at certain points:
during module_init/exit or dynamic de/registration for a specific subsystem.
 `__ro_mostly_after_init` is basically the same as `__ro_after_init`: the
section is mapped read-only after kernel init. The difference is that this
section is temporarily remapped read-write during module_init/exit and
de/registration of a subsystem, using the set_ro_mostly_after_init_rw/ro pair.
 Use `__ro_mostly_after_init` to mark such memory when `__ro_after_init` is
not applicable because the memory must be writable at those points in time.
It is read-only right after kernel init and writable temporarily only during
module_init/exit and dynamic de/registration for a subsystem.

Signed-off-by: Hoeun Ryu <hoeun.ryu@gmail.com>
---
 include/asm-generic/sections.h    |  1 +
 include/asm-generic/vmlinux.lds.h | 10 ++++++++++
 include/linux/cache.h             | 11 +++++++++++
 3 files changed, 22 insertions(+)

diff --git a/include/asm-generic/sections.h b/include/asm-generic/sections.h
index 4df64a1..16a6f21 100644
--- a/include/asm-generic/sections.h
+++ b/include/asm-generic/sections.h
@@ -34,6 +34,7 @@ extern char __bss_start[], __bss_stop[];
 extern char __init_begin[], __init_end[];
 extern char _sinittext[], _einittext[];
 extern char __start_data_ro_after_init[], __end_data_ro_after_init[];
+extern char __start_data_ro_mostly_after_init[], __end_data_ro_mostly_after_init[];
 extern char _end[];
 extern char __per_cpu_load[], __per_cpu_start[], __per_cpu_end[];
 extern char __kprobes_text_start[], __kprobes_text_end[];
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 4e09b28..cc5f44e 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -265,6 +265,15 @@
 	__end_data_ro_after_init = .;
 #endif
 
+#ifndef RO_MOSTLY_AFTER_INIT_DATA
+#define RO_MOSTLY_AFTER_INIT_DATA(align)				\
+	. = ALIGN(align);						\
+	VMLINUX_SYMBOL(__start_data_ro_mostly_after_init) = .;		\
+	*(.data..ro_mostly_after_init)					\
+	. = ALIGN(align);						\
+	VMLINUX_SYMBOL(__end_data_ro_mostly_after_init) = .;
+#endif
+
 /*
  * Read only Data
  */
@@ -275,6 +284,7 @@
 		*(.rodata) *(.rodata.*)					\
 		RO_AFTER_INIT_DATA	/* Read only after init */	\
 		KEEP(*(__vermagic))	/* Kernel version magic */	\
+		RO_MOSTLY_AFTER_INIT_DATA(align)			\
 		. = ALIGN(8);						\
 		VMLINUX_SYMBOL(__start___tracepoints_ptrs) = .;		\
 		KEEP(*(__tracepoints_ptrs)) /* Tracepoints: pointer array */ \
diff --git a/include/linux/cache.h b/include/linux/cache.h
index 1be04f8..fd1cb9b 100644
--- a/include/linux/cache.h
+++ b/include/linux/cache.h
@@ -30,6 +30,17 @@
 #define __ro_after_init __attribute__((__section__(".data..ro_after_init")))
 #endif
 
+/*
+ * __ro_mostly_after_init is almost like __ro_after_init, but the
+ * __ro_mostly_after_init section is temporarily writable only during
+ * module_init/exit or dynamic de/registration of a subsystem using
+ * the set_ro_mostly_after_init_rw/ro pair.
+ */
+#ifndef __ro_mostly_after_init
+#define __ro_mostly_after_init \
+	__attribute__((__section__(".data..ro_mostly_after_init")))
+#endif
+
 #ifndef ____cacheline_aligned
 #define ____cacheline_aligned __attribute__((__aligned__(SMP_CACHE_BYTES)))
 #endif
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [RFC 2/7] init: add set_ro_mostly_after_init_rw/ro function
  2017-02-19 10:04 [RFC 1/7] arch: add __ro_mostly_after_init section marker Hoeun Ryu
@ 2017-02-19 10:04 ` Hoeun Ryu
  2017-02-20 10:22   ` [kernel-hardening] " Mark Rutland
  2017-02-19 10:04 ` [RFC 3/7] module: modify memory attrs for __ro_mostly_after_init during module_init/exit Hoeun Ryu
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 22+ messages in thread
From: Hoeun Ryu @ 2017-02-19 10:04 UTC (permalink / raw)
  To: kernel-hardening, linux-kernel
  Cc: Hoeun Ryu, Kees Cook, Jessica Yu, Ingo Molnar, Andrew Morton,
	Emese Revfy, AKASHI Takahiro, Fabian Frederick, Helge Deller,
	Laura Abbott, Nicholas Piggin, Thomas Gleixner, Petr Mladek,
	Yang Shi, Rasmus Villemoes, Tejun Heo, Prarit Bhargava,
	Lokesh Vutla

 Add the set_ro_mostly_after_init_rw/ro pair to modify the memory attributes
of memory marked `__ro_mostly_after_init`.

 I am not sure that this is the right place for these functions to live, or
that they suit every architecture's memory-attribute handling. Please
comment.

Signed-off-by: Hoeun Ryu <hoeun.ryu@gmail.com>
---
 include/linux/init.h |  6 ++++++
 init/main.c          | 24 ++++++++++++++++++++++++
 2 files changed, 30 insertions(+)

diff --git a/include/linux/init.h b/include/linux/init.h
index 79af096..d68e4f7 100644
--- a/include/linux/init.h
+++ b/include/linux/init.h
@@ -131,6 +131,12 @@ extern bool rodata_enabled;
 #endif
 #ifdef CONFIG_STRICT_KERNEL_RWX
 void mark_rodata_ro(void);
+
+void set_ro_mostly_after_init_rw(void);
+void set_ro_mostly_after_init_ro(void);
+#else
+static inline void set_ro_mostly_after_init_rw(void) { }
+static inline void set_ro_mostly_after_init_ro(void) { }
 #endif
 
 extern void (*late_time_init)(void);
diff --git a/init/main.c b/init/main.c
index 4719abf..a5d4873 100644
--- a/init/main.c
+++ b/init/main.c
@@ -941,6 +941,30 @@ static void mark_readonly(void)
 	} else
 		pr_info("Kernel memory protection disabled.\n");
 }
+
+void set_ro_mostly_after_init_rw(void)
+{
+	unsigned long start = PFN_ALIGN(__start_data_ro_mostly_after_init);
+	unsigned long end = PFN_ALIGN(__end_data_ro_mostly_after_init);
+	unsigned long nr_pages = (end - start) >> PAGE_SHIFT;
+
+	if (!rodata_enabled)
+		return;
+
+	set_memory_rw(start, nr_pages);
+}
+
+void set_ro_mostly_after_init_ro(void)
+{
+	unsigned long start = PFN_ALIGN(__start_data_ro_mostly_after_init);
+	unsigned long end = PFN_ALIGN(__end_data_ro_mostly_after_init);
+	unsigned long nr_pages = (end - start) >> PAGE_SHIFT;
+
+	if (!rodata_enabled)
+		return;
+
+	set_memory_ro(start, nr_pages);
+}
 #else
 static inline void mark_readonly(void)
 {
-- 
2.7.4


* [RFC 3/7] module: modify memory attrs for __ro_mostly_after_init during module_init/exit
  2017-02-19 10:04 [RFC 1/7] arch: add __ro_mostly_after_init section marker Hoeun Ryu
  2017-02-19 10:04 ` [RFC 2/7] init: add set_ro_mostly_after_init_rw/ro function Hoeun Ryu
@ 2017-02-19 10:04 ` Hoeun Ryu
  2017-02-20 10:30   ` [kernel-hardening] " Mark Rutland
  2017-02-19 10:04 ` [RFC 4/7] selinux: mark __ro_mostly_after_init for selinux_hooks/selinux_nf_ops Hoeun Ryu
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 22+ messages in thread
From: Hoeun Ryu @ 2017-02-19 10:04 UTC (permalink / raw)
  To: kernel-hardening, linux-kernel; +Cc: Hoeun Ryu, Jessica Yu, Rusty Russell

 `__ro_mostly_after_init` is almost like `__ro_after_init`: the section is
read-only after kernel init, just as with `__ro_after_init`. This patch makes
the `__ro_mostly_after_init` section temporarily read-write during
module_init/module_exit.

Signed-off-by: Hoeun Ryu <hoeun.ryu@gmail.com>
---
 kernel/module.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/kernel/module.c b/kernel/module.c
index 7eba6de..3b25e0e 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -987,8 +987,11 @@ SYSCALL_DEFINE2(delete_module, const char __user *, name_user,
 
 	mutex_unlock(&module_mutex);
 	/* Final destruction now no one is using it. */
-	if (mod->exit != NULL)
+	if (mod->exit != NULL) {
+		set_ro_mostly_after_init_rw();
 		mod->exit();
+		set_ro_mostly_after_init_ro();
+	}
 	blocking_notifier_call_chain(&module_notify_list,
 				     MODULE_STATE_GOING, mod);
 	klp_module_going(mod);
@@ -3396,8 +3399,11 @@ static noinline int do_init_module(struct module *mod)
 
 	do_mod_ctors(mod);
 	/* Start the module */
-	if (mod->init != NULL)
+	if (mod->init != NULL) {
+		set_ro_mostly_after_init_rw();
 		ret = do_one_initcall(mod->init);
+		set_ro_mostly_after_init_ro();
+	}
 	if (ret < 0) {
 		goto fail_free_freeinit;
 	}
-- 
2.7.4


* [RFC 4/7] selinux: mark __ro_mostly_after_init for selinux_hooks/selinux_nf_ops
  2017-02-19 10:04 [RFC 1/7] arch: add __ro_mostly_after_init section marker Hoeun Ryu
  2017-02-19 10:04 ` [RFC 2/7] init: add set_ro_mostly_after_init_rw/ro function Hoeun Ryu
  2017-02-19 10:04 ` [RFC 3/7] module: modify memory attrs for __ro_mostly_after_init during module_init/exit Hoeun Ryu
@ 2017-02-19 10:04 ` Hoeun Ryu
  2017-02-21 10:35   ` Tetsuo Handa
  2017-02-19 10:04 ` [RFC 5/7] cpu: mark ro_mostly_after_init for cpuhp_ap/bp_states Hoeun Ryu
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 22+ messages in thread
From: Hoeun Ryu @ 2017-02-19 10:04 UTC (permalink / raw)
  To: kernel-hardening, linux-kernel
  Cc: Hoeun Ryu, Paul Moore, Stephen Smalley, Eric Paris, James Morris,
	Serge E. Hallyn, selinux, linux-security-module

 It would be good for the selinux hook objects to be marked
`__ro_mostly_after_init`. They cannot simply be marked `__ro_after_init`
because they must be writable during the selinux_disable procedure. The
`__ro_mostly_after_init` section is made temporarily read-write during that
procedure via the set_ro_mostly_after_init_rw/ro pair, so the hooks are
read-only at all other times.

Signed-off-by: Hoeun Ryu <hoeun.ryu@gmail.com>
---
 security/selinux/hooks.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 9a8f12f..64fd799 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -6106,7 +6106,7 @@ static int selinux_key_getsecurity(struct key *key, char **_buffer)
 
 #endif
 
-static struct security_hook_list selinux_hooks[] = {
+static struct security_hook_list selinux_hooks[] __ro_mostly_after_init = {
 	LSM_HOOK_INIT(binder_set_context_mgr, selinux_binder_set_context_mgr),
 	LSM_HOOK_INIT(binder_transaction, selinux_binder_transaction),
 	LSM_HOOK_INIT(binder_transfer_binder, selinux_binder_transfer_binder),
@@ -6381,7 +6381,7 @@ security_initcall(selinux_init);
 
 #if defined(CONFIG_NETFILTER)
 
-static struct nf_hook_ops selinux_nf_ops[] = {
+static struct nf_hook_ops selinux_nf_ops[] __ro_mostly_after_init = {
 	{
 		.hook =		selinux_ipv4_postroute,
 		.pf =		NFPROTO_IPV4,
@@ -6477,13 +6477,17 @@ int selinux_disable(void)
 	selinux_disabled = 1;
 	selinux_enabled = 0;
 
+	set_ro_mostly_after_init_rw();
 	security_delete_hooks(selinux_hooks, ARRAY_SIZE(selinux_hooks));
+	set_ro_mostly_after_init_ro();
 
 	/* Try to destroy the avc node cache */
 	avc_disable();
 
 	/* Unregister netfilter hooks. */
+	set_ro_mostly_after_init_rw();
 	selinux_nf_ip_exit();
+	set_ro_mostly_after_init_ro();
 
 	/* Unregister selinuxfs. */
 	exit_sel_fs();
-- 
2.7.4


* [RFC 5/7] cpu: mark ro_mostly_after_init for cpuhp_ap/bp_states
  2017-02-19 10:04 [RFC 1/7] arch: add __ro_mostly_after_init section marker Hoeun Ryu
                   ` (2 preceding siblings ...)
  2017-02-19 10:04 ` [RFC 4/7] selinux: mark __ro_mostly_after_init for selinux_hooks/selinux_nf_ops Hoeun Ryu
@ 2017-02-19 10:04 ` Hoeun Ryu
  2017-02-20  8:20   ` Sebastian Andrzej Siewior
  2017-02-19 10:04 ` [RFC 6/7] arm64: add __map_kernel_segment to accept additional vm flags Hoeun Ryu
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 22+ messages in thread
From: Hoeun Ryu @ 2017-02-19 10:04 UTC (permalink / raw)
  To: kernel-hardening, linux-kernel
  Cc: Hoeun Ryu, Thomas Gleixner, Sebastian Andrzej Siewior,
	Ingo Molnar, Anna-Maria Gleixner, Peter Zijlstra,
	Boris Ostrovsky

It would be good for the cpuhp state objects to be marked
`__ro_mostly_after_init`. They cannot simply be marked `__ro_after_init`
because they must be writable during module_init/exit. With this marking,
they can be read-only except during module_init/exit.

Signed-off-by: Hoeun Ryu <hoeun.ryu@gmail.com>
---
 kernel/cpu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/cpu.c b/kernel/cpu.c
index 0a5f630..12ad4c2 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -1126,7 +1126,7 @@ core_initcall(cpu_hotplug_pm_sync_init);
 #endif /* CONFIG_SMP */
 
 /* Boot processor state steps */
-static struct cpuhp_step cpuhp_bp_states[] = {
+static struct cpuhp_step cpuhp_bp_states[] __ro_mostly_after_init = {
 	[CPUHP_OFFLINE] = {
 		.name			= "offline",
 		.startup.single		= NULL,
@@ -1212,7 +1212,7 @@ static struct cpuhp_step cpuhp_bp_states[] = {
 };
 
 /* Application processor state steps */
-static struct cpuhp_step cpuhp_ap_states[] = {
+static struct cpuhp_step cpuhp_ap_states[] __ro_mostly_after_init = {
 #ifdef CONFIG_SMP
 	/* Final state before CPU kills itself */
 	[CPUHP_AP_IDLE_DEAD] = {
-- 
2.7.4


* [RFC 6/7] arm64: add __map_kernel_segment to accept additional vm flags
  2017-02-19 10:04 [RFC 1/7] arch: add __ro_mostly_after_init section marker Hoeun Ryu
                   ` (3 preceding siblings ...)
  2017-02-19 10:04 ` [RFC 5/7] cpu: mark ro_mostly_after_init for cpuhp_ap/bp_states Hoeun Ryu
@ 2017-02-19 10:04 ` Hoeun Ryu
  2017-02-19 11:21   ` Ard Biesheuvel
  2017-02-19 10:04 ` [RFC 7/7] arm64: map rodata sections separately for __ro_mostly_after_init section Hoeun Ryu
  2017-02-19 11:24 ` [kernel-hardening] [RFC 1/7] arch: add __ro_mostly_after_init section marker Ard Biesheuvel
  6 siblings, 1 reply; 22+ messages in thread
From: Hoeun Ryu @ 2017-02-19 10:04 UTC (permalink / raw)
  To: kernel-hardening, linux-kernel
  Cc: Hoeun Ryu, Catalin Marinas, Will Deacon, Ard Biesheuvel,
	Mark Rutland, Laura Abbott, Kefeng Wang, Jeremy Linton,
	linux-arm-kernel

 The memory attributes of the `__ro_mostly_after_init` section are changed
via set_memory_rw/ro, which does not work on vm areas that lack VM_ALLOC.
Add this function so that map_kernel can map the `__ro_mostly_after_init`
section with the VM_ALLOC flag set.

Signed-off-by: Hoeun Ryu <hoeun.ryu@gmail.com>
---
 arch/arm64/mm/mmu.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index b805c01..91271b1 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -444,8 +444,10 @@ void mark_rodata_ro(void)
 	debug_checkwx();
 }
 
-static void __init map_kernel_segment(pgd_t *pgd, void *va_start, void *va_end,
-				      pgprot_t prot, struct vm_struct *vma)
+static void __init __map_kernel_segment(pgd_t *pgd,
+					void *va_start, void *va_end,
+					pgprot_t prot, struct vm_struct *vma,
+					unsigned long flags)
 {
 	phys_addr_t pa_start = __pa_symbol(va_start);
 	unsigned long size = va_end - va_start;
@@ -459,12 +461,18 @@ static void __init map_kernel_segment(pgd_t *pgd, void *va_start, void *va_end,
 	vma->addr	= va_start;
 	vma->phys_addr	= pa_start;
 	vma->size	= size;
-	vma->flags	= VM_MAP;
+	vma->flags	= flags;
 	vma->caller	= __builtin_return_address(0);
 
 	vm_area_add_early(vma);
 }
 
+static void __init map_kernel_segment(pgd_t *pgd, void *va_start, void *va_end,
+				      pgprot_t prot, struct vm_struct *vma)
+{
+	return __map_kernel_segment(pgd, va_start, va_end, prot, vma, VM_MAP);
+}
+
 /*
  * Create fine-grained mappings for the kernel.
  */
-- 
2.7.4


* [RFC 7/7] arm64: map rodata sections separately for __ro_mostly_after_init section
  2017-02-19 10:04 [RFC 1/7] arch: add __ro_mostly_after_init section marker Hoeun Ryu
                   ` (4 preceding siblings ...)
  2017-02-19 10:04 ` [RFC 6/7] arm64: add __map_kernel_segment to accept additional vm flags Hoeun Ryu
@ 2017-02-19 10:04 ` Hoeun Ryu
  2017-02-19 11:35   ` Ard Biesheuvel
  2017-02-19 11:24 ` [kernel-hardening] [RFC 1/7] arch: add __ro_mostly_after_init section marker Ard Biesheuvel
  6 siblings, 1 reply; 22+ messages in thread
From: Hoeun Ryu @ 2017-02-19 10:04 UTC (permalink / raw)
  To: kernel-hardening, linux-kernel
  Cc: Hoeun Ryu, Catalin Marinas, Will Deacon, Ard Biesheuvel,
	Mark Rutland, Laura Abbott, Kefeng Wang, Jeremy Linton,
	linux-arm-kernel

Map the rodata sections separately to accommodate the new
__ro_mostly_after_init section. The memory attributes of the
__ro_mostly_after_init section can be changed later, so it needs a dedicated
vmalloc'ed region for the set_memory_rw/ro API.

Signed-off-by: Hoeun Ryu <hoeun.ryu@gmail.com>
---
 arch/arm64/mm/mmu.c | 30 ++++++++++++++++++++++++++----
 1 file changed, 26 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 91271b1..4a89a2e 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -434,8 +434,22 @@ void mark_rodata_ro(void)
 	 * mark .rodata as read only. Use __init_begin rather than __end_rodata
 	 * to cover NOTES and EXCEPTION_TABLE.
 	 */
-	section_size = (unsigned long)__init_begin - (unsigned long)__start_rodata;
-	create_mapping_late(__pa_symbol(__start_rodata), (unsigned long)__start_rodata,
+	section_size = (unsigned long)__start_data_ro_mostly_after_init -
+		(unsigned long)__start_rodata;
+	create_mapping_late(__pa_symbol(__start_rodata),
+			    (unsigned long)__start_rodata,
+			    section_size, PAGE_KERNEL_RO);
+
+	section_size = (unsigned long)__end_data_ro_mostly_after_init -
+		(unsigned long)__start_data_ro_mostly_after_init;
+	create_mapping_late(__pa_symbol(__start_data_ro_mostly_after_init),
+			    (unsigned long)__start_data_ro_mostly_after_init,
+			    section_size, PAGE_KERNEL_RO);
+
+	section_size = (unsigned long)__init_begin -
+		(unsigned long)__end_data_ro_mostly_after_init;
+	create_mapping_late(__pa_symbol(__end_data_ro_mostly_after_init),
+			    (unsigned long)__end_data_ro_mostly_after_init,
 			    section_size, PAGE_KERNEL_RO);
 
 	/* flush the TLBs after updating live kernel mappings */
@@ -478,10 +492,18 @@ static void __init map_kernel_segment(pgd_t *pgd, void *va_start, void *va_end,
  */
 static void __init map_kernel(pgd_t *pgd)
 {
-	static struct vm_struct vmlinux_text, vmlinux_rodata, vmlinux_init, vmlinux_data;
+	static struct vm_struct vmlinux_text, vmlinux_rodata1, vmlinux_rodata2, vmlinux_ro_mostly_after_init, vmlinux_init, vmlinux_data;
 
 	map_kernel_segment(pgd, _text, _etext, PAGE_KERNEL_EXEC, &vmlinux_text);
-	map_kernel_segment(pgd, __start_rodata, __init_begin, PAGE_KERNEL, &vmlinux_rodata);
+	map_kernel_segment(pgd, __start_rodata, __start_data_ro_mostly_after_init, PAGE_KERNEL, &vmlinux_rodata1);
+	__map_kernel_segment(pgd,
+			     __start_data_ro_mostly_after_init,
+			     __end_data_ro_mostly_after_init,
+			     PAGE_KERNEL,
+			     &vmlinux_ro_mostly_after_init,
+			     VM_MAP | VM_ALLOC);
+	map_kernel_segment(pgd, __end_data_ro_mostly_after_init, __init_begin, PAGE_KERNEL, &vmlinux_rodata2);
+
 	map_kernel_segment(pgd, __init_begin, __init_end, PAGE_KERNEL_EXEC,
 			   &vmlinux_init);
 	map_kernel_segment(pgd, _data, _end, PAGE_KERNEL, &vmlinux_data);
-- 
2.7.4


* Re: [RFC 6/7] arm64: add __map_kernel_segment to accept additional vm flags
  2017-02-19 10:04 ` [RFC 6/7] arm64: add __map_kernel_segment to accept additional vm flags Hoeun Ryu
@ 2017-02-19 11:21   ` Ard Biesheuvel
  0 siblings, 0 replies; 22+ messages in thread
From: Ard Biesheuvel @ 2017-02-19 11:21 UTC (permalink / raw)
  To: Hoeun Ryu
  Cc: kernel-hardening, linux-kernel, Catalin Marinas, Will Deacon,
	Mark Rutland, Laura Abbott, Kefeng Wang, Jeremy Linton,
	linux-arm-kernel

On 19 February 2017 at 10:04, Hoeun Ryu <hoeun.ryu@gmail.com> wrote:
>  The memory attributes of the `__ro_mostly_after_init` section are changed
> via set_memory_rw/ro, which does not work on vm areas that lack VM_ALLOC.

This is for a good reason: VMALLOC regions are guaranteed to be mapped
down to pages, which is the only thing set_memory_rX() supports. On
arm64, kernel segments may be mapped using contiguous page mappings or
block mappings, which cannot be modified like this.

> Add this function so that map_kernel can map the `__ro_mostly_after_init`
> section with the VM_ALLOC flag set.
>

If we add this functionality, I'd rather we have a special handler for
the __ro_mostly_after_init section and not use set_memory_rX().

> Signed-off-by: Hoeun Ryu <hoeun.ryu@gmail.com>
> ---
>  arch/arm64/mm/mmu.c | 14 +++++++++++---
>  1 file changed, 11 insertions(+), 3 deletions(-)
>
> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> index b805c01..91271b1 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -444,8 +444,10 @@ void mark_rodata_ro(void)
>         debug_checkwx();
>  }
>
> -static void __init map_kernel_segment(pgd_t *pgd, void *va_start, void *va_end,
> -                                     pgprot_t prot, struct vm_struct *vma)
> +static void __init __map_kernel_segment(pgd_t *pgd,
> +                                       void *va_start, void *va_end,
> +                                       pgprot_t prot, struct vm_struct *vma,
> +                                       unsigned long flags)
>  {
>         phys_addr_t pa_start = __pa_symbol(va_start);
>         unsigned long size = va_end - va_start;
> @@ -459,12 +461,18 @@ static void __init map_kernel_segment(pgd_t *pgd, void *va_start, void *va_end,
>         vma->addr       = va_start;
>         vma->phys_addr  = pa_start;
>         vma->size       = size;
> -       vma->flags      = VM_MAP;
> +       vma->flags      = flags;
>         vma->caller     = __builtin_return_address(0);
>
>         vm_area_add_early(vma);
>  }
>
> +static void __init map_kernel_segment(pgd_t *pgd, void *va_start, void *va_end,
> +                                     pgprot_t prot, struct vm_struct *vma)
> +{
> +       return __map_kernel_segment(pgd, va_start, va_end, prot, vma, VM_MAP);
> +}
> +
>  /*
>   * Create fine-grained mappings for the kernel.
>   */
> --
> 2.7.4
>


* Re: [kernel-hardening] [RFC 1/7] arch: add __ro_mostly_after_init section marker
  2017-02-19 10:04 [RFC 1/7] arch: add __ro_mostly_after_init section marker Hoeun Ryu
                   ` (5 preceding siblings ...)
  2017-02-19 10:04 ` [RFC 7/7] arm64: map rodata sections separately for __ro_mostly_after_init section Hoeun Ryu
@ 2017-02-19 11:24 ` Ard Biesheuvel
  2017-02-21  6:29   ` Ho-Eun Ryu
  6 siblings, 1 reply; 22+ messages in thread
From: Ard Biesheuvel @ 2017-02-19 11:24 UTC (permalink / raw)
  To: Hoeun Ryu
  Cc: kernel-hardening, linux-kernel, Arnd Bergmann, Kees Cook,
	Ingo Molnar, linux-arch

On 19 February 2017 at 10:04, Hoeun Ryu <hoeun.ryu@gmail.com> wrote:
>  After the `__ro_after_init` marker was added to the kernel, many kernel data
> objects could become read-only after init. But there are many other places
> where read-only-after-init would be desirable yet `__ro_after_init` cannot
> simply be applied, because the objects must be writable at certain points:
> during module_init/exit or dynamic de/registration for a specific subsystem.
>  `__ro_mostly_after_init` is basically the same as `__ro_after_init`: the
> section is mapped read-only after kernel init. The difference is that this
> section is temporarily remapped read-write during module_init/exit and
> de/registration of a subsystem, using the set_ro_mostly_after_init_rw/ro pair.
>  Use `__ro_mostly_after_init` to mark such memory when `__ro_after_init` is
> not applicable because the memory must be writable at those points in time.
> It is read-only right after kernel init and writable temporarily only during
> module_init/exit and dynamic de/registration for a subsystem.
>
> Signed-off-by: Hoeun Ryu <hoeun.ryu@gmail.com>
> ---
>  include/asm-generic/sections.h    |  1 +
>  include/asm-generic/vmlinux.lds.h | 10 ++++++++++
>  include/linux/cache.h             | 11 +++++++++++
>  3 files changed, 22 insertions(+)
>
> diff --git a/include/asm-generic/sections.h b/include/asm-generic/sections.h
> index 4df64a1..16a6f21 100644
> --- a/include/asm-generic/sections.h
> +++ b/include/asm-generic/sections.h
> @@ -34,6 +34,7 @@ extern char __bss_start[], __bss_stop[];
>  extern char __init_begin[], __init_end[];
>  extern char _sinittext[], _einittext[];
>  extern char __start_data_ro_after_init[], __end_data_ro_after_init[];
> +extern char __start_data_ro_mostly_after_init[], __end_data_ro_mostly_after_init[];
>  extern char _end[];
>  extern char __per_cpu_load[], __per_cpu_start[], __per_cpu_end[];
>  extern char __kprobes_text_start[], __kprobes_text_end[];
> diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
> index 4e09b28..cc5f44e 100644
> --- a/include/asm-generic/vmlinux.lds.h
> +++ b/include/asm-generic/vmlinux.lds.h
> @@ -265,6 +265,15 @@
>         __end_data_ro_after_init = .;
>  #endif
>
> +#ifndef RO_MOSTLY_AFTER_INIT_DATA
> +#define RO_MOSTLY_AFTER_INIT_DATA(align)                               \
> +       . = ALIGN(align);                                               \
> +       VMLINUX_SYMBOL(__start_data_ro_mostly_after_init) = .;          \
> +       *(.data..ro_mostly_after_init)                                  \
> +       . = ALIGN(align);                                               \
> +       VMLINUX_SYMBOL(__end_data_ro_mostly_after_init) = .;
> +#endif
> +
>  /*
>   * Read only Data
>   */
> @@ -275,6 +284,7 @@
>                 *(.rodata) *(.rodata.*)                                 \
>                 RO_AFTER_INIT_DATA      /* Read only after init */      \
>                 KEEP(*(__vermagic))     /* Kernel version magic */      \
> +               RO_MOSTLY_AFTER_INIT_DATA(align)                        \

You can't really drop this in the middle of a section like this. On
arm64, we try very hard to use a minimal segment alignment of 64 KB
(or 2 MB if DEBUG_ALIGN_RODATA=y), to ensure that the TLB footprint of
the kernel image is minimized.

So this should be a separate section in the arm64 linker script.

>                 . = ALIGN(8);                                           \
>                 VMLINUX_SYMBOL(__start___tracepoints_ptrs) = .;         \
>                 KEEP(*(__tracepoints_ptrs)) /* Tracepoints: pointer array */ \
> diff --git a/include/linux/cache.h b/include/linux/cache.h
> index 1be04f8..fd1cb9b 100644
> --- a/include/linux/cache.h
> +++ b/include/linux/cache.h
> @@ -30,6 +30,17 @@
>  #define __ro_after_init __attribute__((__section__(".data..ro_after_init")))
>  #endif
>
> +/*
> + * __ro_mostly_after_init is almost like __ro_after_init, but the
> + * __ro_mostly_after_init section is temporarily writable only during
> + * module_init/exit or dynamic de/registration of a subsystem using
> + * the set_ro_mostly_after_init_rw/ro pair.
> + */
> +#ifndef __ro_mostly_after_init
> +#define __ro_mostly_after_init \
> +       __attribute__((__section__(".data..ro_mostly_after_init")))
> +#endif
> +
>  #ifndef ____cacheline_aligned
>  #define ____cacheline_aligned __attribute__((__aligned__(SMP_CACHE_BYTES)))
>  #endif
> --
> 2.7.4
>


* Re: [RFC 7/7] arm64: map rodata sections separately for __ro_mostly_after_init section
  2017-02-19 10:04 ` [RFC 7/7] arm64: map rodata sections separately for __ro_mostly_after_init section Hoeun Ryu
@ 2017-02-19 11:35   ` Ard Biesheuvel
  2017-02-20 12:45     ` Mark Rutland
  0 siblings, 1 reply; 22+ messages in thread
From: Ard Biesheuvel @ 2017-02-19 11:35 UTC (permalink / raw)
  To: Hoeun Ryu, Kees Cook
  Cc: kernel-hardening, linux-kernel, Catalin Marinas, Will Deacon,
	Mark Rutland, Laura Abbott, Kefeng Wang, Jeremy Linton,
	linux-arm-kernel

On 19 February 2017 at 10:04, Hoeun Ryu <hoeun.ryu@gmail.com> wrote:
> Map the rodata sections separately to accommodate the new
> __ro_mostly_after_init section. The memory attributes of the
> __ro_mostly_after_init section can be changed later, so it needs a dedicated
> vmalloc'ed region for the set_memory_rw/ro API.
>
> Signed-off-by: Hoeun Ryu <hoeun.ryu@gmail.com>
> ---
>  arch/arm64/mm/mmu.c | 30 ++++++++++++++++++++++++++----
>  1 file changed, 26 insertions(+), 4 deletions(-)
>
> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> index 91271b1..4a89a2e 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -434,8 +434,22 @@ void mark_rodata_ro(void)
>          * mark .rodata as read only. Use __init_begin rather than __end_rodata
>          * to cover NOTES and EXCEPTION_TABLE.
>          */
> -       section_size = (unsigned long)__init_begin - (unsigned long)__start_rodata;
> -       create_mapping_late(__pa_symbol(__start_rodata), (unsigned long)__start_rodata,
> +       section_size = (unsigned long)__start_data_ro_mostly_after_init -
> +               (unsigned long)__start_rodata;
> +       create_mapping_late(__pa_symbol(__start_rodata),
> +                           (unsigned long)__start_rodata,
> +                           section_size, PAGE_KERNEL_RO);
> +
> +       section_size = (unsigned long)__end_data_ro_mostly_after_init -
> +               (unsigned long)__start_data_ro_mostly_after_init;
> +       create_mapping_late(__pa_symbol(__start_data_ro_mostly_after_init),
> +                           (unsigned long)__start_data_ro_mostly_after_init,
> +                           section_size, PAGE_KERNEL_RO);
> +
> +       section_size = (unsigned long)__init_begin -
> +               (unsigned long)__end_data_ro_mostly_after_init;
> +       create_mapping_late(__pa_symbol(__end_data_ro_mostly_after_init),
> +                           (unsigned long)__end_data_ro_mostly_after_init,
>                             section_size, PAGE_KERNEL_RO);
>
>         /* flush the TLBs after updating live kernel mappings */
> @@ -478,10 +492,18 @@ static void __init map_kernel_segment(pgd_t *pgd, void *va_start, void *va_end,
>   */
>  static void __init map_kernel(pgd_t *pgd)
>  {
> -       static struct vm_struct vmlinux_text, vmlinux_rodata, vmlinux_init, vmlinux_data;
> +       static struct vm_struct vmlinux_text, vmlinux_rodata1, vmlinux_rodata2, vmlinux_ro_mostly_after_init, vmlinux_init, vmlinux_data;
>
>         map_kernel_segment(pgd, _text, _etext, PAGE_KERNEL_EXEC, &vmlinux_text);
> -       map_kernel_segment(pgd, __start_rodata, __init_begin, PAGE_KERNEL, &vmlinux_rodata);
> +       map_kernel_segment(pgd, __start_rodata, __start_data_ro_mostly_after_init, PAGE_KERNEL, &vmlinux_rodata1);
> +       __map_kernel_segment(pgd,
> +                            __start_data_ro_mostly_after_init,
> +                            __end_data_ro_mostly_after_init,
> +                            PAGE_KERNEL,
> +                            &vmlinux_ro_mostly_after_init,
> +                            VM_MAP | VM_ALLOC);
> +       map_kernel_segment(pgd, __end_data_ro_mostly_after_init, __init_begin, PAGE_KERNEL, &vmlinux_rodata2);
> +
>         map_kernel_segment(pgd, __init_begin, __init_end, PAGE_KERNEL_EXEC,
>                            &vmlinux_init);
>         map_kernel_segment(pgd, _data, _end, PAGE_KERNEL, &vmlinux_data);
> --
> 2.7.4
>

While it is correct that you are splitting this into three separate
segments (otherwise we would not be able to change the permissions
later without risking that mappings would have to be split), I think
this leads to unnecessary fragmentation.

If there is demand for this feature (but you still need to make the
argument for that), I wonder if it wouldn't be sufficient, and much
more straightforward, to redefine the __ro_after_init semantics to
include the kind of subsystem registration and module init context you
are targeting, and implement some hooks to temporarily lift the
__ro_after_init r/o permission restrictions in a controlled manner.

Kees: any thoughts?

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC 5/7] cpu: mark ro_mostly_after_init for cpuhp_ap/bp_states
  2017-02-19 10:04 ` [RFC 5/7] cpu: mark ro_mostly_after_init for cpuhp_ap/bp_states Hoeun Ryu
@ 2017-02-20  8:20   ` Sebastian Andrzej Siewior
  2017-02-21  5:47     ` Ho-Eun Ryu
  0 siblings, 1 reply; 22+ messages in thread
From: Sebastian Andrzej Siewior @ 2017-02-20  8:20 UTC (permalink / raw)
  To: Hoeun Ryu
  Cc: kernel-hardening, linux-kernel, Thomas Gleixner, Ingo Molnar,
	Anna-Maria Gleixner, Peter Zijlstra, Boris Ostrovsky

On 2017-02-19 19:04:08 [+0900], Hoeun Ryu wrote:
> It would be good that `__ro_mostly_after_init` is marked to cpuhp state
> objects. 
why?

Sebastian


* Re: [kernel-hardening] [RFC 2/7] init: add set_ro_mostly_after_init_rw/ro function
  2017-02-19 10:04 ` [RFC 2/7] init: add set_ro_mostly_after_init_rw/ro function Hoeun Ryu
@ 2017-02-20 10:22   ` Mark Rutland
  2017-02-21  6:33     ` Ho-Eun Ryu
  0 siblings, 1 reply; 22+ messages in thread
From: Mark Rutland @ 2017-02-20 10:22 UTC (permalink / raw)
  To: Hoeun Ryu
  Cc: kernel-hardening, linux-kernel, Kees Cook, Jessica Yu,
	Ingo Molnar, Andrew Morton, Emese Revfy, AKASHI Takahiro,
	Fabian Frederick, Helge Deller, Laura Abbott, Nicholas Piggin,
	Thomas Gleixner, Petr Mladek, Yang Shi, Rasmus Villemoes,
	Tejun Heo, Prarit Bhargava, Lokesh Vutla

On Sun, Feb 19, 2017 at 07:04:05PM +0900, Hoeun Ryu wrote:
>  Add set_ro_mostly_after_init_rw/ro pair to modify memory attributes for
> memory marked as `ro_mostly_after_init`.
> 
>  I am doubtful that this is the right place where these functions reside and
> these functions are suitable for all architectures for memory attributes
> modification. Please comment.

These won't work for arm64, since set_memory_* only work on
page-granular mappings in the vmalloc area.

The "real" kernel mappings can use larger block mappings, and would need
to be split (which cannot be done at runtime) before permissions could
be changed at page granularity.

Thanks,
Mark.

> Signed-off-by: Hoeun Ryu <hoeun.ryu@gmail.com>
> ---
>  include/linux/init.h |  6 ++++++
>  init/main.c          | 24 ++++++++++++++++++++++++
>  2 files changed, 30 insertions(+)
> 
> diff --git a/include/linux/init.h b/include/linux/init.h
> index 79af096..d68e4f7 100644
> --- a/include/linux/init.h
> +++ b/include/linux/init.h
> @@ -131,6 +131,12 @@ extern bool rodata_enabled;
>  #endif
>  #ifdef CONFIG_STRICT_KERNEL_RWX
>  void mark_rodata_ro(void);
> +
> +void set_ro_mostly_after_init_rw(void);
> +void set_ro_mostly_after_init_ro(void);
> +#else
> +static inline void set_ro_mostly_after_init_rw(void) { }
> +static inline void set_ro_mostly_after_init_ro(void) { }
>  #endif
>  
>  extern void (*late_time_init)(void);
> diff --git a/init/main.c b/init/main.c
> index 4719abf..a5d4873 100644
> --- a/init/main.c
> +++ b/init/main.c
> @@ -941,6 +941,30 @@ static void mark_readonly(void)
>  	} else
>  		pr_info("Kernel memory protection disabled.\n");
>  }
> +
> +void set_ro_mostly_after_init_rw(void)
> +{
> +	unsigned long start = PFN_ALIGN(__start_data_ro_mostly_after_init);
> +	unsigned long end = PFN_ALIGN(&__end_data_ro_mostly_after_init);
> +	unsigned long nr_pages = (end - start) >> PAGE_SHIFT;
> +
> +	if (!rodata_enabled)
> +		return;
> +
> +	set_memory_rw(start, nr_pages);
> +}
> +
> +void set_ro_mostly_after_init_ro(void)
> +{
> +	unsigned long start = PFN_ALIGN(__start_data_ro_mostly_after_init);
> +	unsigned long end = PFN_ALIGN(&__end_data_ro_mostly_after_init);
> +	unsigned long nr_pages = (end - start) >> PAGE_SHIFT;
> +
> +	if (!rodata_enabled)
> +		return;
> +
> +	set_memory_ro(start, nr_pages);
> +}
>  #else
>  static inline void mark_readonly(void)
>  {
> -- 
> 2.7.4
> 


* Re: [kernel-hardening] [RFC 3/7] module: modify memory attrs for __ro_mostly_after_init during module_init/exit
  2017-02-19 10:04 ` [RFC 3/7] module: modify memory attrs for __ro_mostly_after_init during module_init/exit Hoeun Ryu
@ 2017-02-20 10:30   ` Mark Rutland
  2017-02-21 13:36     ` Ho-Eun Ryu
  0 siblings, 1 reply; 22+ messages in thread
From: Mark Rutland @ 2017-02-20 10:30 UTC (permalink / raw)
  To: Hoeun Ryu; +Cc: kernel-hardening, linux-kernel, Jessica Yu, Rusty Russell

On Sun, Feb 19, 2017 at 07:04:06PM +0900, Hoeun Ryu wrote:
>  `__ro_mostly_after_init` is almost like `__ro_after_init`. The section is
> read-only as same as `__ro_after_init` after kernel init. This patch makes
> `__ro_mostly_after_init` section read-write temporarily only during
> module_init/module_exit.
> 
> Signed-off-by: Hoeun Ryu <hoeun.ryu@gmail.com>
> ---
>  kernel/module.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/module.c b/kernel/module.c
> index 7eba6de..3b25e0e 100644
> --- a/kernel/module.c
> +++ b/kernel/module.c
> @@ -987,8 +987,11 @@ SYSCALL_DEFINE2(delete_module, const char __user *, name_user,
>  
>  	mutex_unlock(&module_mutex);
>  	/* Final destruction now no one is using it. */
> -	if (mod->exit != NULL)
> +	if (mod->exit != NULL) {
> +		set_ro_mostly_after_init_rw();
>  		mod->exit();
> +		set_ro_mostly_after_init_ro();
> +	}
>  	blocking_notifier_call_chain(&module_notify_list,
>  				     MODULE_STATE_GOING, mod);
>  	klp_module_going(mod);
> @@ -3396,8 +3399,11 @@ static noinline int do_init_module(struct module *mod)
>  
>  	do_mod_ctors(mod);
>  	/* Start the module */
> -	if (mod->init != NULL)
> +	if (mod->init != NULL) {
> +		set_ro_mostly_after_init_rw();
>  		ret = do_one_initcall(mod->init);
> +		set_ro_mostly_after_init_ro();
> +	}

This looks very much like the pax_{open,close}_kernel() approach for
write-rarely data.

I think it would be better to implement a first class write-rarely
mechanism rather than trying to extend __ro_after_init to cover this
case.

As mentioned previously, I *think* we can have a generic implementation
that uses an mm to temporarily map a (thread/cpu-local) RW alias of the
data in question in what would otherwise be the user half of the address
space. Regardless, we can have a generic interface [1] that can cater
for that style of approach and/or something like ARM's domains or x86's
pkeys.

Thanks,
Mark.

[1] http://www.openwall.com/lists/kernel-hardening/2016/11/18/3


* Re: [RFC 7/7] arm64: map seperately rodata sections for __ro_mostly_after_init section
  2017-02-19 11:35   ` Ard Biesheuvel
@ 2017-02-20 12:45     ` Mark Rutland
  2017-02-21 20:38       ` Kees Cook
  0 siblings, 1 reply; 22+ messages in thread
From: Mark Rutland @ 2017-02-20 12:45 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Hoeun Ryu, Kees Cook, kernel-hardening, linux-kernel,
	Catalin Marinas, Will Deacon, Laura Abbott, Kefeng Wang,
	Jeremy Linton, linux-arm-kernel

On Sun, Feb 19, 2017 at 11:35:51AM +0000, Ard Biesheuvel wrote:
> On 19 February 2017 at 10:04, Hoeun Ryu <hoeun.ryu@gmail.com> wrote:
> > Map rodata sections seperately for the new __ro_mostly_after_init section.
> > Attribute of memory for __ro_mostly_after_init section can be changed later
> > so we need a dedicated vmalloced region for set_memory_rw/ro api.

> While it is correct that you are splitting this into three separate
> segments (otherwise we would not be able to change the permissions
> later without risking splitting to occur), I think this leads to
> unnecessary fragmentation.
> 
> If there is demand for this feature (but you still need to make the
> argument for that), I wonder if it wouldn't be sufficient, and much
> more straightforward, to redefine the __ro_after_init semantics to
> include the kind of subsystem registration and module init context you
> are targeting, and implement some hooks to temporarily lift the
> __ro_after_init r/o permission restrictions in a controlled manner.

From a look over the series, I think this is just __write_rarely in
disguise. I personally think that we should keep __write_rarely and
__ro_after_init separate, the latter being a strictly one-shot affair.

I had some ideas [1] as to how we could implement __write_rarely without
carving up the kernel mapping further (and keeping the RW permissions
local to the thread needing it), but I have not had the time to look
into that further.

Thanks,
Mark.

[1] http://www.openwall.com/lists/kernel-hardening/2016/11/18/3


* Re: [RFC 5/7] cpu: mark ro_mostly_after_init for cpuhp_ap/bp_states
  2017-02-20  8:20   ` Sebastian Andrzej Siewior
@ 2017-02-21  5:47     ` Ho-Eun Ryu
  0 siblings, 0 replies; 22+ messages in thread
From: Ho-Eun Ryu @ 2017-02-21  5:47 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: kernel-hardening, linux-kernel, Thomas Gleixner, Ingo Molnar,
	Anna-Maria Gleixner, Peter Zijlstra, Boris Ostrovsky


> On 20 Feb 2017, at 5:20 PM, Sebastian Andrzej Siewior <bigeasy@linutronix.de> wrote:
> 
> On 2017-02-19 19:04:08 [+0900], Hoeun Ryu wrote:
>> It would be good that `__ro_mostly_after_init` is marked to cpuhp state
>> objects. 
> why?
> 

I’m requesting comments on a new feature, the __ro_mostly_after_init section marker.
It’s similar to __ro_after_init, but the section can be made writable at certain points in time.
Please see the cover letter [1] and the first patch [2].

[1] : https://lkml.org/lkml/2017/2/19/29
[2] : https://lkml.org/lkml/2017/2/19/32

> Sebastian


* Re: [kernel-hardening] [RFC 1/7] arch: add __ro_mostly_after_init section marker
  2017-02-19 11:24 ` [kernel-hardening] [RFC 1/7] arch: add __ro_mostly_after_init section marker Ard Biesheuvel
@ 2017-02-21  6:29   ` Ho-Eun Ryu
  0 siblings, 0 replies; 22+ messages in thread
From: Ho-Eun Ryu @ 2017-02-21  6:29 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: kernel-hardening, linux-kernel, Arnd Bergmann, Kees Cook,
	Ingo Molnar, linux-arch


> On 19 Feb 2017, at 8:24 PM, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
> 
> On 19 February 2017 at 10:04, Hoeun Ryu <hoeun.ryu@gmail.com> wrote:
>> After `__ro_after_init` marker is included in kernel, many kernel data
>> objects can be read-only-after-init. But there are many other places that
>> would be good to read-only-after-init but `__ro_after_init` can not be simply
>> applicable to them because they should be writable at some points, which are
>> during module_init/exit or dynamic de/registration for a specific subsystem.
>> `__ro_mostly_after_init` is basically the same to `__ro_after_init`. The
>> section is mapped as read-only after kernel init. The different thing is
>> this section is temporarily mapped as read-write during module_init/exit and
>> de/registration of a subsystem using set_ro_mostly_after_init_rw/ro pair.
>> Use `__ro_mostly_after_init` as a way to mark such memory instead when
>> `__ro_after_init` is not applicable because the memory should be writable
>> at the described points of time. They are read-only right after kernel init
>> and writable temporarily only during module_init/exit and dynamic
>> de/registration for a subsystem.
>> 
>> Signed-off-by: Hoeun Ryu <hoeun.ryu@gmail.com>
>> ---
>> include/asm-generic/sections.h    |  1 +
>> include/asm-generic/vmlinux.lds.h | 10 ++++++++++
>> include/linux/cache.h             | 11 +++++++++++
>> 3 files changed, 22 insertions(+)
>> 
>> diff --git a/include/asm-generic/sections.h b/include/asm-generic/sections.h
>> index 4df64a1..16a6f21 100644
>> --- a/include/asm-generic/sections.h
>> +++ b/include/asm-generic/sections.h
>> @@ -34,6 +34,7 @@ extern char __bss_start[], __bss_stop[];
>> extern char __init_begin[], __init_end[];
>> extern char _sinittext[], _einittext[];
>> extern char __start_data_ro_after_init[], __end_data_ro_after_init[];
>> +extern char __start_data_ro_mostly_after_init[], __end_data_ro_mostly_after_init[];
>> extern char _end[];
>> extern char __per_cpu_load[], __per_cpu_start[], __per_cpu_end[];
>> extern char __kprobes_text_start[], __kprobes_text_end[];
>> diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
>> index 4e09b28..cc5f44e 100644
>> --- a/include/asm-generic/vmlinux.lds.h
>> +++ b/include/asm-generic/vmlinux.lds.h
>> @@ -265,6 +265,15 @@
>>        __end_data_ro_after_init = .;
>> #endif
>> 
>> +#ifndef RO_MOSTLY_AFTER_INIT_DATA
>> +#define RO_MOSTLY_AFTER_INIT_DATA(align)                               \
>> +       . = ALIGN(align);                                               \
>> +       VMLINUX_SYMBOL(__start_data_ro_mostly_after_init) = .;          \
>> +       *(.data..ro_mostly_after_init)                                  \
>> +       . = ALIGN(align);                                               \
>> +       VMLINUX_SYMBOL(__end_data_ro_mostly_after_init) = .;
>> +#endif
>> +
>> /*
>>  * Read only Data
>>  */
>> @@ -275,6 +284,7 @@
>>                *(.rodata) *(.rodata.*)                                 \
>>                RO_AFTER_INIT_DATA      /* Read only after init */      \
>>                KEEP(*(__vermagic))     /* Kernel version magic */      \
>> +               RO_MOSTLY_AFTER_INIT_DATA(align)                        \
> 
> You can't really drop this in the middle of a section like this. On
> arm64, we try very hard to use a minimal segment alignment of 64 KB
> (of 2 MB if DEBUG_ALIGN_RODATA=y), to ensure that the TLB footprint of
> the kernel image is minimized.
> 
> So this should be a separate section in the arm64 linker script.
> 

Could we achieve that by using ALIGN(SECTION_SIZE) rather than ALIGN(PAGE_SIZE) for RO_DATA?

>>                . = ALIGN(8);                                           \
>>                VMLINUX_SYMBOL(__start___tracepoints_ptrs) = .;         \
>>                KEEP(*(__tracepoints_ptrs)) /* Tracepoints: pointer array */ \
>> diff --git a/include/linux/cache.h b/include/linux/cache.h
>> index 1be04f8..fd1cb9b 100644
>> --- a/include/linux/cache.h
>> +++ b/include/linux/cache.h
>> @@ -30,6 +30,17 @@
>> #define __ro_after_init __attribute__((__section__(".data..ro_after_init")))
>> #endif
>> 
>> +/*
>> + * __ro_mostly_after_init is almost like __ro_after_init.
>> + * but __ro_mostly_after_init section is temporarily writable only during
>> + * module_init/exit or dynamic de/registeration of a subsystem using
>> + * set_ro_mostly_after_init_rw/ro pair.
>> + */
>> +#ifndef __ro_mostly_after_init
>> +#define __ro_mostly_after_init \
>> +       __attribute__((__section__(".data..ro_mostly_after_init")))
>> +#endif
>> +
>> #ifndef ____cacheline_aligned
>> #define ____cacheline_aligned __attribute__((__aligned__(SMP_CACHE_BYTES)))
>> #endif
>> --
>> 2.7.4


* Re: [kernel-hardening] [RFC 2/7] init: add set_ro_mostly_after_init_rw/ro function
  2017-02-20 10:22   ` [kernel-hardening] " Mark Rutland
@ 2017-02-21  6:33     ` Ho-Eun Ryu
  0 siblings, 0 replies; 22+ messages in thread
From: Ho-Eun Ryu @ 2017-02-21  6:33 UTC (permalink / raw)
  To: Mark Rutland
  Cc: kernel-hardening, LKML, Kees Cook, Jessica Yu, Ingo Molnar,
	Andrew Morton, Emese Revfy, AKASHI Takahiro, Fabian Frederick,
	Helge Deller, Laura Abbott, Nicholas Piggin, Thomas Gleixner,
	Petr Mladek, Yang Shi, Rasmus Villemoes, Tejun Heo,
	Prarit Bhargava, Lokesh Vutla


> On 20 Feb 2017, at 7:22 PM, Mark Rutland <mark.rutland@arm.com> wrote:
> 
> On Sun, Feb 19, 2017 at 07:04:05PM +0900, Hoeun Ryu wrote:
>> Add set_ro_mostly_after_init_rw/ro pair to modify memory attributes for
>> memory marked as `ro_mostly_after_init`.
>> 
>> I am doubtful that this is the right place where these functions reside and
>> these functions are suitable for all architectures for memory attributes
>> modification. Please comment.
> 
> These won't work for arm64, since set_memory_* only work on
> page-granular mappings in the vmalloc area.
> 
> The "real" kernel mappings can use larger block mappings, and would need
> to be split (which cannot be done at runtime) before permissions could
> be changed at page granularity.

So I sent RFC 6/7 [1] and 7/7 [2], which split the block mapping down to page granularity.
I think you and Ard Biesheuvel don’t like that approach anyway.

[1] : https://lkml.org/lkml/2017/2/19/38
[2] : https://lkml.org/lkml/2017/2/19/39

> 
> Thanks,
> Mark.
> 
>> Signed-off-by: Hoeun Ryu <hoeun.ryu@gmail.com>
>> ---
>> include/linux/init.h |  6 ++++++
>> init/main.c          | 24 ++++++++++++++++++++++++
>> 2 files changed, 30 insertions(+)
>> 
>> diff --git a/include/linux/init.h b/include/linux/init.h
>> index 79af096..d68e4f7 100644
>> --- a/include/linux/init.h
>> +++ b/include/linux/init.h
>> @@ -131,6 +131,12 @@ extern bool rodata_enabled;
>> #endif
>> #ifdef CONFIG_STRICT_KERNEL_RWX
>> void mark_rodata_ro(void);
>> +
>> +void set_ro_mostly_after_init_rw(void);
>> +void set_ro_mostly_after_init_ro(void);
>> +#else
>> +static inline void set_ro_mostly_after_init_rw(void) { }
>> +static inline void set_ro_mostly_after_init_ro(void) { }
>> #endif
>> 
>> extern void (*late_time_init)(void);
>> diff --git a/init/main.c b/init/main.c
>> index 4719abf..a5d4873 100644
>> --- a/init/main.c
>> +++ b/init/main.c
>> @@ -941,6 +941,30 @@ static void mark_readonly(void)
>> 	} else
>> 		pr_info("Kernel memory protection disabled.\n");
>> }
>> +
>> +void set_ro_mostly_after_init_rw(void)
>> +{
>> +	unsigned long start = PFN_ALIGN(__start_data_ro_mostly_after_init);
>> +	unsigned long end = PFN_ALIGN(&__end_data_ro_mostly_after_init);
>> +	unsigned long nr_pages = (end - start) >> PAGE_SHIFT;
>> +
>> +	if (!rodata_enabled)
>> +		return;
>> +
>> +	set_memory_rw(start, nr_pages);
>> +}
>> +
>> +void set_ro_mostly_after_init_ro(void)
>> +{
>> +	unsigned long start = PFN_ALIGN(__start_data_ro_mostly_after_init);
>> +	unsigned long end = PFN_ALIGN(&__end_data_ro_mostly_after_init);
>> +	unsigned long nr_pages = (end - start) >> PAGE_SHIFT;
>> +
>> +	if (!rodata_enabled)
>> +		return;
>> +
>> +	set_memory_ro(start, nr_pages);
>> +}
>> #else
>> static inline void mark_readonly(void)
>> {
>> -- 
>> 2.7.4
>> 


* Re: [RFC 4/7] selinux: mark __ro_mostly_after_init for selinux_hooks/selinux_nf_ops
  2017-02-19 10:04 ` [RFC 4/7] selinux: mark __ro_mostly_after_init for selinux_hooks/selinux_nf_ops Hoeun Ryu
@ 2017-02-21 10:35   ` Tetsuo Handa
  0 siblings, 0 replies; 22+ messages in thread
From: Tetsuo Handa @ 2017-02-21 10:35 UTC (permalink / raw)
  To: Hoeun Ryu, kernel-hardening, linux-kernel
  Cc: Paul Moore, Stephen Smalley, Eric Paris, James Morris,
	Serge E. Hallyn, selinux, linux-security-module

On 2017/02/19 19:04, Hoeun Ryu wrote:
>  It would be good that selinux hooks objects are marked as
> `__ro_mostly_after_init`. They can not be simply marked as `__ro_after_init'
> because they should be writable during selinux_disable procedure.
> `__ro_mostly_after_init` section is temporarily read-write during
> selinux_disable procedure via set_ro_mostly_after_init_rw/ro pair. Now that
> they can be read-only except during the procedure.
> 
> -static struct security_hook_list selinux_hooks[] = {
> +static struct security_hook_list selinux_hooks[] __ro_mostly_after_init = {

This won't work. This variable is an array of "struct list_head"s.
You would need to set the same attribute on the objects pointed to by
"struct list_head"->next and "struct list_head"->prev.

>  	LSM_HOOK_INIT(binder_set_context_mgr, selinux_binder_set_context_mgr),
>  	LSM_HOOK_INIT(binder_transaction, selinux_binder_transaction),
>  	LSM_HOOK_INIT(binder_transfer_binder, selinux_binder_transfer_binder),


* Re: [kernel-hardening] [RFC 3/7] module: modify memory attrs for __ro_mostly_after_init during module_init/exit
  2017-02-20 10:30   ` [kernel-hardening] " Mark Rutland
@ 2017-02-21 13:36     ` Ho-Eun Ryu
  2017-02-21 13:58       ` Mark Rutland
  0 siblings, 1 reply; 22+ messages in thread
From: Ho-Eun Ryu @ 2017-02-21 13:36 UTC (permalink / raw)
  To: Mark Rutland; +Cc: kernel-hardening, linux-kernel, Jessica Yu, Rusty Russell


> On 20 Feb 2017, at 7:30 PM, Mark Rutland <mark.rutland@arm.com> wrote:
> 
> On Sun, Feb 19, 2017 at 07:04:06PM +0900, Hoeun Ryu wrote:
>> `__ro_mostly_after_init` is almost like `__ro_after_init`. The section is
>> read-only as same as `__ro_after_init` after kernel init. This patch makes
>> `__ro_mostly_after_init` section read-write temporarily only during
>> module_init/module_exit.
>> 
>> Signed-off-by: Hoeun Ryu <hoeun.ryu@gmail.com>
>> ---
>> kernel/module.c | 10 ++++++++--
>> 1 file changed, 8 insertions(+), 2 deletions(-)
>> 
>> diff --git a/kernel/module.c b/kernel/module.c
>> index 7eba6de..3b25e0e 100644
>> --- a/kernel/module.c
>> +++ b/kernel/module.c
>> @@ -987,8 +987,11 @@ SYSCALL_DEFINE2(delete_module, const char __user *, name_user,
>> 
>> 	mutex_unlock(&module_mutex);
>> 	/* Final destruction now no one is using it. */
>> -	if (mod->exit != NULL)
>> +	if (mod->exit != NULL) {
>> +		set_ro_mostly_after_init_rw();
>> 		mod->exit();
>> +		set_ro_mostly_after_init_ro();
>> +	}
>> 	blocking_notifier_call_chain(&module_notify_list,
>> 				     MODULE_STATE_GOING, mod);
>> 	klp_module_going(mod);
>> @@ -3396,8 +3399,11 @@ static noinline int do_init_module(struct module *mod)
>> 
>> 	do_mod_ctors(mod);
>> 	/* Start the module */
>> -	if (mod->init != NULL)
>> +	if (mod->init != NULL) {
>> +		set_ro_mostly_after_init_rw();
>> 		ret = do_one_initcall(mod->init);
>> +		set_ro_mostly_after_init_ro();
>> +	}
> 
> This looks very much like the pax_{open,close}_kernel() approach for
> write-rarely data.

I read the discussion [1] and I agree that the __ro_mostly_after_init marker
looks very similar to __write_rarely.

> 
> I think it would be better to implement a first class write-rarely
> mechanism rather than trying to extend __ro_after_init to cover this
> case.

I’m not extending __ro_after_init, though __ro_mostly_after_init resides in the same rodata section.

> 
> As mentioned previously, I *think* we can have a generic implementation
> that uses an mm to temporarily map a (thread/cpu-local) RW alias of the
> data in question in what would otherwise be the user half of the address
> space. Regardless, we can have a generic interface [1] that can cater
> for that style of approach and/or something like ARM's domains or x86's
> pkeys.
> 

I’m still learning CPU/kernel architectures; I would be very thankful if you could tell me more about the details of the implementation itself.

The mm that maps the temporary RW alias is like:
    * a special mm, like idmap/init_mm, which has its own page tables?
    * the page tables have the same content as init_mm’s swapper_pg_dir, except for RW permissions on a specific section (let’s say __write_rarely)
    * then use switch_mm(special_rw_mm) to change the address space before the access to the section happens
    * then use switch_mm(current->mm) to change the address space back to the original after the access is done

And about the interface itself, rare_write(__var, __val): is it a single-value access interface?
I’m intending to make data in the __ro_mostly_after_init section RW across multiple accesses, such as during module_init/exit,
and the __rare_rw_map()/unmap() pair used in rare_write() seems to work like an open/close API.

How could __rare_rw_ptr() be implemented, and what happens when `__rw_var = __rare_rw_ptr(&__var)` is done?

However the interface ends up looking, do we still need a special data section that is mapped RO in general but RW in some cases?
If so, doesn’t the __ro_mostly_after_init marker itself make sense, and do we still need it?

> Thanks,
> Mark.
> 
> [1] http://www.openwall.com/lists/kernel-hardening/2016/11/18/3


* Re: [kernel-hardening] [RFC 3/7] module: modify memory attrs for __ro_mostly_after_init during module_init/exit
  2017-02-21 13:36     ` Ho-Eun Ryu
@ 2017-02-21 13:58       ` Mark Rutland
  2017-02-22 13:45         ` Hoeun Ryu
  0 siblings, 1 reply; 22+ messages in thread
From: Mark Rutland @ 2017-02-21 13:58 UTC (permalink / raw)
  To: Ho-Eun Ryu; +Cc: kernel-hardening, linux-kernel, Jessica Yu, Rusty Russell

On Tue, Feb 21, 2017 at 10:36:05PM +0900, Ho-Eun Ryu wrote:
> > On 20 Feb 2017, at 7:30 PM, Mark Rutland <mark.rutland@arm.com> wrote:
> > On Sun, Feb 19, 2017 at 07:04:06PM +0900, Hoeun Ryu wrote:

> >> @@ -3396,8 +3399,11 @@ static noinline int do_init_module(struct module *mod)
> >> 
> >> 	do_mod_ctors(mod);
> >> 	/* Start the module */
> >> -	if (mod->init != NULL)
> >> +	if (mod->init != NULL) {
> >> +		set_ro_mostly_after_init_rw();
> >> 		ret = do_one_initcall(mod->init);
> >> +		set_ro_mostly_after_init_ro();
> >> +	}
> > 
> > This looks very much like the pax_{open,close}_kernel() approach for
> > write-rarely data.
> 
> I read the discussion [1] and I agree that __ro_mostly_after_init marker
> looks very similar to __write_rarely. 
> 
> > I think it would be better to implement a first class write-rarely
> > mechanism rather than trying to extend __ro_after_init to cover this
> > case.
> 
> I’m not extending __ro_after_init. __ro_mostly_after_init resides in
> the same section of rodata though.

Sorry; I was confused when I wrote that email. I now understand that
you're adding a separate annotation.

> > As mentioned previously, I *think* we can have a generic implementation
> > that uses an mm to temporarily map a (thread/cpu-local) RW alias of the
> > data in question in what would otherwise be the user half of the address
> > space. Regardless, we can have a generic interface [1] that can cater
> > for that style of approach and/or something like ARM's domains or x86's
> > pkeys.
> > 
> 
> I’m still learning cpu/kernel architectures, It would be very thankful if you tell me more about the detail of the implementation itself.
> 
> The mm that maps temporary RW alias is like
>     * special mm like idmap/init_mm which have its own page tables?
>     * the page tables have the same content of page tables of
>       init_mm’s swapper_pg_dir except for RW permissions for a
>       specific section (let’s say __write_rarely)

This would be a special mm, like a user mm, that only mapped the
relevant VA(s).

That might map the relevant variable on-demand, or the mapping could
cover the whole write_rarely area.

>     * then use switch_mm(special_rw_mm) to change the address space
>       before the access happens to the section
>     * then use switch_mm(current->mm) to change the address space to
>       original after the access is done

Yes.

> And the interface itself. rare_write(__val, __val), is it a single
> value access interface.
> I’m intending to make data in __ro_mostly_after_init section RW during
> multiple accesses like during module_init/exit.
> and __rare_rw_map()/unmap() used in rare_write() seems to work like
> open/close api.

The __rare_rw_{map,unmap}() functions would map in the RW alias, but do
not necessarily change the RO alias to RW. This is why __rare_rw_ptr()
would be necessary, and is the major difference to the open/close API.

We could certainly allow several writes between a map/unmap. The key
requirement is that each write is instrumented so that it goes via the
RW alias.

> How could __rare_rw_ptr() be implemented and what happens when
> `__rw_var = __rare_rw_ptr(&(__var))` is done ?

__rare_rw_ptr() would take a pointer to the usual RO alias, and derive
its RW alias. What exactly this should do depends on how the RW alias is
implemented.

On a system using an RW mm, let's assume we place all __write_rarely
variables in a region bounded by __rare_write_start/__rare_write_end,
and that when the mm is installed, we have an RW alias of this region
beginning at __rw_alias_start. In this case, it'd look something like:

#define __rare_rw_ptr(ptr) ({				\
	unsigned long __ptr = (unsigned long)(ptr);	\
	__ptr -= __rare_write_start;			\
	__ptr += __rw_alias_start;			\
	(typeof(ptr))__ptr;				\
})

... does that make sense?

For systems where you can freely/easily alter (local) permissions (e.g.
using ARM's domains), that can be done within __rare_rw_{map,unmap}(),
and __rare_rw_ptr can just return the original pointer.

> However the interface will look like, Do we still need a special data
> section that is mapped RO in general but RW in some cases ?

With the above, I think the usual mapping can always be RO.

> if then, doesn’t __ro_mostly_after_init marker itself make sense and
> we still need it ?

We may need a marker to bound the set of variables we wish to map in
this way.

Thanks,
Mark.


* Re: [RFC 7/7] arm64: map seperately rodata sections for __ro_mostly_after_init section
  2017-02-20 12:45     ` Mark Rutland
@ 2017-02-21 20:38       ` Kees Cook
  0 siblings, 0 replies; 22+ messages in thread
From: Kees Cook @ 2017-02-21 20:38 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Ard Biesheuvel, Hoeun Ryu, kernel-hardening, linux-kernel,
	Catalin Marinas, Will Deacon, Laura Abbott, Kefeng Wang,
	Jeremy Linton, linux-arm-kernel

On Mon, Feb 20, 2017 at 4:45 AM, Mark Rutland <mark.rutland@arm.com> wrote:
> On Sun, Feb 19, 2017 at 11:35:51AM +0000, Ard Biesheuvel wrote:
>> On 19 February 2017 at 10:04, Hoeun Ryu <hoeun.ryu@gmail.com> wrote:
>> > Map rodata sections seperately for the new __ro_mostly_after_init section.
>> > Attribute of memory for __ro_mostly_after_init section can be changed later
>> > so we need a dedicated vmalloced region for set_memory_rw/ro api.
>
>> While it is correct that you are splitting this into three separate
>> segments (otherwise we would not be able to change the permissions
>> later without risking splitting to occur), I think this leads to
>> unnecessary fragmentation.
>>
>> If there is demand for this feature (but you still need to make the
>> argument for that), I wonder if it wouldn't be sufficient, and much
>> more straightforward, to redefine the __ro_after_init semantics to
>> include the kind of subsystem registration and module init context you
>> are targeting, and implement some hooks to temporarily lift the
>> __ro_after_init r/o permission restrictions in a controlled manner.
>
> From a look over the series, I think this is just __write_rarely in
> disguise. I personally think that we should keep __write_rarely and
> __ro_after_init separate, the later being a strictly one-shot affair.

That's my thinking too.

> I had some ideas [1] as to how we could implement __write_rarely without
> carving up the kernel mapping further (and keeping the RW permissions
> local to the thread needing it), but I have not had the time to look
> into that further.

I'm working on a series to do this for x86, but I keep getting
distracted. I hope to get an RFC posted this week.

-Kees

-- 
Kees Cook
Pixel Security


* Re: [kernel-hardening] [RFC 3/7] module: modify memory attrs for __ro_mostly_after_init during module_init/exit
  2017-02-21 13:58       ` Mark Rutland
@ 2017-02-22 13:45         ` Hoeun Ryu
  0 siblings, 0 replies; 22+ messages in thread
From: Hoeun Ryu @ 2017-02-22 13:45 UTC (permalink / raw)
  To: Mark Rutland; +Cc: kernel-hardening, linux-kernel, Jessica Yu, Rusty Russell

Thank you for your detailed explanation. It helped a lot with my understanding.

> On Feb 21, 2017, at 10:58 PM, Mark Rutland <mark.rutland@arm.com> wrote:
> 
> On Tue, Feb 21, 2017 at 10:36:05PM +0900, Ho-Eun Ryu wrote:
>>> On 20 Feb 2017, at 7:30 PM, Mark Rutland <mark.rutland@arm.com> wrote:
>>> On Sun, Feb 19, 2017 at 07:04:06PM +0900, Hoeun Ryu wrote:
> 
>>>> @@ -3396,8 +3399,11 @@ static noinline int do_init_module(struct module *mod)
>>>> 
>>>>   do_mod_ctors(mod);
>>>>   /* Start the module */
>>>> -    if (mod->init != NULL)
>>>> +    if (mod->init != NULL) {
>>>> +        set_ro_mostly_after_init_rw();
>>>>       ret = do_one_initcall(mod->init);
>>>> +        set_ro_mostly_after_init_ro();
>>>> +    }
>>> 
>>> This looks very much like the pax_{open,close}_kernel() approach for
>>> write-rarely data.
>> 
>> I read the discussion [1] and I agree that __ro_mostly_after_init marker
>> looks very similar to __write_rarely. 
>> 
>>> I think it would be better to implement a first class write-rarely
>>> mechanism rather than trying to extend __ro_after_init to cover this
>>> case.
>> 
>> I’m not extending __ro_after_init; __ro_mostly_after_init just resides
>> in the same rodata section.
> 
> Sorry; I was confused when I wrote that email. I now understand that
> you're adding a separate annotation.
> 
>>> As mentioned previously, I *think* we can have a generic implementation
>>> that uses an mm to temporarily map a (thread/cpu-local) RW alias of the
>>> data in question in what would otherwise be the user half of the address
>>> space. Regardless, we can have a generic interface [1] that can cater
>>> for that style of approach and/or something like ARM's domains or x86's
>>> pkeys.
>> 
>> I’m still learning CPU/kernel architecture, so I would be very thankful
>> if you could tell me more about the details of the implementation itself.
>> 
>> The mm that maps the temporary RW alias would be something like:
>>   * a special mm like idmap/init_mm which has its own page tables?
>>   * page tables with the same content as init_mm’s swapper_pg_dir,
>>     except for RW permissions on a specific section (let’s say
>>     __write_rarely)
> 
> This would be a special mm, like a user mm, that only mapped the
> relevant VA(s).

We need a separate mm/pgd for TTBR0_EL1 in the kernel image, as idmap and swapper_pg_dir currently have, and during kernel init we create a VA alias of the RO section, mapped RW below TASK_SIZE. Then we can switch to that mm by installing its pgd into TTBR0_EL1. Right?

It also came to my mind: what about the interaction with SW_TTBR0_PAN?
What if copy_from_user() tries to do something against the RW alias?

val_rw = __rw_ptr(&val);
__rw_map();
copy_from_user(val_rw, user_ptr, sizeof(val));
__rw_unmap();

__rw_map() will install rw_mm->pgd into TTBR0_EL1, but uaccess_enable() will immediately reinstall the pgd saved in thread_info into TTBR0_EL1, and we lose the RW alias.
Am I wrong or confused about something?

> 
> That might map the relevant variable on-demand, or the mapping could
> cover the whole write_rarely area.
> 
>>   * then use switch_mm(special_rw_mm) to change the address space
>>     before the access happens to the section
>>   * then use switch_mm(current->mm) to change the address space back
>>     to the original after the access is done
> 
> Yes.
> 
>> And about the interface itself, rare_write(__var, __val): is it a
>> single-value access interface?
>> I’m intending to make data in the __ro_mostly_after_init section RW
>> during multiple accesses, like during module_init/exit,
>> and __rare_rw_map()/unmap() used in rare_write() seem to work like an
>> open/close API.
> 
> The __rare_rw_{map,unmap}() functions would map in the RW alias, but do
> not necessarily change the RO alias to RW. This is why __rare_rw_ptr()
> would be necessary, and is the major difference to the open/close API.
> 
> We could certainly allow several writes between a map/unmap. The key
> requirement is that each write is instrumented so that it goes via the
> RW alias.
> 
>> How could __rare_rw_ptr() be implemented, and what happens when
>> `__rw_var = __rare_rw_ptr(&(__var))` is done?
> 
> __rare_rw_ptr() would take a pointer to the usual RO alias, and derive
> its RW alias. What exactly this should do depends on how the RW alias is
> implemented.
> 
> On a system using an RW mm, let's assume we place all __write_rarely
> variables in a region bounded by __rare_write_start/__rare_write_end,
> and when the mm is installed, we have an RW alias of this region
> beginning at __rw_alias_start. In this case, it'd look something like:
> 
> #define __rare_rw_ptr(ptr) ({                \
>   unsigned long __ptr = (unsigned long)(ptr);    \
>   __ptr -= __rare_write_start;            \
>   __ptr += __rw_alias_start;            \
>   (typeof(ptr))__ptr;                \
> })
> 
> ... does that make sense?

Yes. Cool.

> 
> For systems where you can freely/easily alter (local) permissions (e.g.
> using ARM's domains), that can be done within __rare_rw_{map,unmap}(),
> and __rare_rw_ptr can just return the original pointer.
> 
>> However the interface will look like, Do we still need a special data
>> section that is mapped RO in general but RW in some cases ?
> 
> With the above, I think the usual mapping can always be RO.
> 
>> if then, doesn’t __ro_mostly_after_init marker itself make sense and
>> we still need it ?
> 
> We may need a marker to bound the set of variables we wish to map in
> this way.
> 
> Thanks,
> Mark.
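If we do keep a marker to bound those variables, I imagine it could look like the linker-script fragment below. This is only a sketch of my assumption: the section name .data..rare_write and the symbol names are made up here, modeled on the existing macros in include/asm-generic/vmlinux.lds.h, so that __rare_rw_ptr() has well-defined start/end addresses to translate from.

```
/*
 * Hypothetical fragment for vmlinux.lds.h: group all __write_rarely
 * variables together and emit bounding symbols for the RW-alias
 * translation. Page alignment keeps the region's permissions
 * independent of neighbouring data.
 */
. = ALIGN(PAGE_SIZE);
__rare_write_start = .;
*(.data..rare_write)
. = ALIGN(PAGE_SIZE);
__rare_write_end = .;
```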

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2017-02-22 13:45 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-02-19 10:04 [RFC 1/7] arch: add __ro_mostly_after_init section marker Hoeun Ryu
2017-02-19 10:04 ` [RFC 2/7] init: add set_ro_mostly_after_init_rw/ro function Hoeun Ryu
2017-02-20 10:22   ` [kernel-hardening] " Mark Rutland
2017-02-21  6:33     ` Ho-Eun Ryu
2017-02-19 10:04 ` [RFC 3/7] module: modify memory attrs for __ro_mostly_after_init during module_init/exit Hoeun Ryu
2017-02-20 10:30   ` [kernel-hardening] " Mark Rutland
2017-02-21 13:36     ` Ho-Eun Ryu
2017-02-21 13:58       ` Mark Rutland
2017-02-22 13:45         ` Hoeun Ryu
2017-02-19 10:04 ` [RFC 4/7] selinux: mark __ro_mostly_after_init for selinux_hooks/selinux_nf_ops Hoeun Ryu
2017-02-21 10:35   ` Tetsuo Handa
2017-02-19 10:04 ` [RFC 5/7] cpu: mark ro_mostly_after_init for cpuhp_ap/bp_states Hoeun Ryu
2017-02-20  8:20   ` Sebastian Andrzej Siewior
2017-02-21  5:47     ` Ho-Eun Ryu
2017-02-19 10:04 ` [RFC 6/7] arm64: add __map_kernel_segment to accept additional vm flags Hoeun Ryu
2017-02-19 11:21   ` Ard Biesheuvel
2017-02-19 10:04 ` [RFC 7/7] arm64: map seperately rodata sections for __ro_mostly_after_init section Hoeun Ryu
2017-02-19 11:35   ` Ard Biesheuvel
2017-02-20 12:45     ` Mark Rutland
2017-02-21 20:38       ` Kees Cook
2017-02-19 11:24 ` [kernel-hardening] [RFC 1/7] arch: add __ro_mostly_after_init section marker Ard Biesheuvel
2017-02-21  6:29   ` Ho-Eun Ryu
