live-patching.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH 0/2] livepatch: Clear relocation targets on a module removal
@ 2019-07-19 12:28 Miroslav Benes
  2019-07-19 12:28 ` [PATCH 1/2] livepatch: Nullify obj->mod in klp_module_coming()'s error path Miroslav Benes
  2019-07-19 12:28 ` [RFC PATCH 2/2] livepatch: Clear relocation targets on a module removal Miroslav Benes
  0 siblings, 2 replies; 44+ messages in thread
From: Miroslav Benes @ 2019-07-19 12:28 UTC (permalink / raw)
  To: jikos, jpoimboe, pmladek
  Cc: joe.lawrence, live-patching, linux-kernel, Miroslav Benes

The second attempt to resolve the issue reported by Josh last year [1]
and also reported earlier this year again [2]. The first attempt [3]
tried to deny the patched modules to be removed. It did not solve the
issue completely. It would be possible, but we decided to try the
arch-specific approach first, which I am sending now.

Sending early as RFC, because I am leaving on holiday tomorrow and it
would be great if you took a look meanwhile.

- I decided not to CC the arch maintainers yet. If we decide that the
  approach is feasible first on our livepatch side, I will split the
  second patch and resend properly.
- the first patch could go in regardless the rest, I guess.
- I am not sure about the placement in
  klp_cleanup_module_patches_limited(). I think it is the best possible,
  but I would really appreciate double-checking.
- I am also not sure about the naming, so ideas also welcome

Lightly tested on both x86_64 and ppc64le and it looked ok.

[1] 20180602161151.apuhs2dygsexmcg2@treble
[2] 1561019068-132672-1-git-send-email-cj.chengjian@huawei.com
[3] 20180607092949.1706-1-mbenes@suse.cz

Miroslav Benes (2):
  livepatch: Nullify obj->mod in klp_module_coming()'s error path
  livepatch: Clear relocation targets on a module removal

 arch/powerpc/kernel/Makefile    |  1 +
 arch/powerpc/kernel/livepatch.c | 75 +++++++++++++++++++++++++++++++++
 arch/powerpc/kernel/module.h    | 15 +++++++
 arch/powerpc/kernel/module_64.c |  7 +--
 arch/x86/kernel/livepatch.c     | 67 +++++++++++++++++++++++++++++
 include/linux/livepatch.h       |  5 +++
 kernel/livepatch/core.c         | 18 +++++---
 7 files changed, 177 insertions(+), 11 deletions(-)
 create mode 100644 arch/powerpc/kernel/livepatch.c
 create mode 100644 arch/powerpc/kernel/module.h

-- 
2.22.0


^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH 1/2] livepatch: Nullify obj->mod in klp_module_coming()'s error path
  2019-07-19 12:28 [RFC PATCH 0/2] livepatch: Clear relocation targets on a module removal Miroslav Benes
@ 2019-07-19 12:28 ` Miroslav Benes
  2019-07-28 19:45   ` Josh Poimboeuf
  2019-07-19 12:28 ` [RFC PATCH 2/2] livepatch: Clear relocation targets on a module removal Miroslav Benes
  1 sibling, 1 reply; 44+ messages in thread
From: Miroslav Benes @ 2019-07-19 12:28 UTC (permalink / raw)
  To: jikos, jpoimboe, pmladek
  Cc: joe.lawrence, live-patching, linux-kernel, Miroslav Benes

klp_module_coming() is called for every module appearing in the system.
It sets obj->mod to a patched module for klp_object obj. Unfortunately
it leaves it set even if an error happens later in the function and the
patched module is not allowed to be loaded.

klp_is_object_loaded() uses obj->mod variable and could currently give a
wrong return value. The bug is probably harmless as of now.

Signed-off-by: Miroslav Benes <mbenes@suse.cz>
Reviewed-by: Petr Mladek <pmladek@suse.com>
---
 kernel/livepatch/core.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
index c4ce08f43bd6..ab4a4606d19b 100644
--- a/kernel/livepatch/core.c
+++ b/kernel/livepatch/core.c
@@ -1175,6 +1175,7 @@ int klp_module_coming(struct module *mod)
 	pr_warn("patch '%s' failed for module '%s', refusing to load module '%s'\n",
 		patch->mod->name, obj->mod->name, obj->mod->name);
 	mod->klp_alive = false;
+	obj->mod = NULL;
 	klp_cleanup_module_patches_limited(mod, patch);
 	mutex_unlock(&klp_mutex);
 
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [RFC PATCH 2/2] livepatch: Clear relocation targets on a module removal
  2019-07-19 12:28 [RFC PATCH 0/2] livepatch: Clear relocation targets on a module removal Miroslav Benes
  2019-07-19 12:28 ` [PATCH 1/2] livepatch: Nullify obj->mod in klp_module_coming()'s error path Miroslav Benes
@ 2019-07-19 12:28 ` Miroslav Benes
  2019-07-22  9:33   ` Petr Mladek
  2019-07-28 20:04   ` Josh Poimboeuf
  1 sibling, 2 replies; 44+ messages in thread
From: Miroslav Benes @ 2019-07-19 12:28 UTC (permalink / raw)
  To: jikos, jpoimboe, pmladek
  Cc: joe.lawrence, live-patching, linux-kernel, Miroslav Benes

Josh reported a bug:

  When the object to be patched is a module, and that module is
  rmmod'ed and reloaded, it fails to load with:

  module: x86/modules: Skipping invalid relocation target, existing value is nonzero for type 2, loc 00000000ba0302e9, val ffffffffa03e293c
  livepatch: failed to initialize patch 'livepatch_nfsd' for module 'nfsd' (-8)
  livepatch: patch 'livepatch_nfsd' failed for module 'nfsd', refusing to load module 'nfsd'

  The livepatch module has a relocation which references a symbol
  in the _previous_ loading of nfsd. When apply_relocate_add()
  tries to replace the old relocation with a new one, it sees that
  the previous one is nonzero and it errors out.

  On ppc64le, we have a similar issue:

  module_64: livepatch_nfsd: Expected nop after call, got e8410018 at e_show+0x60/0x548 [livepatch_nfsd]
  livepatch: failed to initialize patch 'livepatch_nfsd' for module 'nfsd' (-8)
  livepatch: patch 'livepatch_nfsd' failed for module 'nfsd', refusing to load module 'nfsd'

He also proposed three different solutions. We could remove the error
check in apply_relocate_add() introduced by commit eda9cec4c9a1
("x86/module: Detect and skip invalid relocations"). However the check
is useful for detecting corrupted modules.

We could also deny the patched modules to be removed. If it proved to be
a major drawback for users, we could still implement a different
approach. The solution would also complicate the existing code a lot.

We thus decided to reverse the relocation patching (clear all relocation
targets on x86_64, or return back nops on powerpc). The solution is not
universal and is too much arch-specific, but it may prove to be simpler
in the end.

Reported-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Miroslav Benes <mbenes@suse.cz>
---
 arch/powerpc/kernel/Makefile    |  1 +
 arch/powerpc/kernel/livepatch.c | 75 +++++++++++++++++++++++++++++++++
 arch/powerpc/kernel/module.h    | 15 +++++++
 arch/powerpc/kernel/module_64.c |  7 +--
 arch/x86/kernel/livepatch.c     | 67 +++++++++++++++++++++++++++++
 include/linux/livepatch.h       |  5 +++
 kernel/livepatch/core.c         | 17 +++++---
 7 files changed, 176 insertions(+), 11 deletions(-)
 create mode 100644 arch/powerpc/kernel/livepatch.c
 create mode 100644 arch/powerpc/kernel/module.h

diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index 0ea6c4aa3a20..639000f78dc3 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -154,6 +154,7 @@ endif
 
 obj-$(CONFIG_EPAPR_PARAVIRT)	+= epapr_paravirt.o epapr_hcalls.o
 obj-$(CONFIG_KVM_GUEST)		+= kvm.o kvm_emul.o
+obj-$(CONFIG_LIVEPATCH)	+= livepatch.o
 
 # Disable GCOV, KCOV & sanitizers in odd or sensitive code
 GCOV_PROFILE_prom_init.o := n
diff --git a/arch/powerpc/kernel/livepatch.c b/arch/powerpc/kernel/livepatch.c
new file mode 100644
index 000000000000..6f2468c60695
--- /dev/null
+++ b/arch/powerpc/kernel/livepatch.c
@@ -0,0 +1,75 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * livepatch.c - powerpc-specific Kernel Live Patching Core
+ */
+
+#include <linux/livepatch.h>
+#include <asm/code-patching.h>
+#include "module.h"
+
+void arch_klp_free_object_loaded(struct klp_patch *patch,
+				 struct klp_object *obj)
+{
+	const char *objname, *secname, *symname;
+	char sec_objname[MODULE_NAME_LEN];
+	struct klp_modinfo *info;
+	Elf64_Shdr *s;
+	Elf64_Rela *rel;
+	Elf64_Sym *sym;
+	void *loc;
+	u32 *instruction;
+	int i, cnt;
+
+	info = patch->mod->klp_info;
+	objname = klp_is_module(obj) ? obj->name : "vmlinux";
+
+	/* See livepatch core code for BUILD_BUG_ON() explanation */
+	BUILD_BUG_ON(MODULE_NAME_LEN < 56 || KSYM_NAME_LEN != 128);
+
+	/* For each klp relocation section */
+	for (s = info->sechdrs; s < info->sechdrs + info->hdr.e_shnum; s++) {
+		if (!(s->sh_flags & SHF_RELA_LIVEPATCH))
+			continue;
+
+		/*
+		 * Format: .klp.rela.sec_objname.section_name
+		 */
+		secname = info->secstrings + s->sh_name;
+		cnt = sscanf(secname, ".klp.rela.%55[^.]", sec_objname);
+		if (cnt != 1) {
+			pr_err("section %s has an incorrectly formatted name\n",
+			       secname);
+			continue;
+		}
+
+		if (strcmp(objname, sec_objname))
+			continue;
+
+		rel = (void *)s->sh_addr;
+		for (i = 0; i < s->sh_size / sizeof(*rel); i++) {
+			loc = (void *)info->sechdrs[s->sh_info].sh_addr
+				+ rel[i].r_offset;
+			sym = (Elf64_Sym *)info->sechdrs[info->symndx].sh_addr
+				+ ELF64_R_SYM(rel[i].r_info);
+			symname = patch->mod->core_kallsyms.strtab
+				+ sym->st_name;
+
+			if (ELF64_R_TYPE(rel[i].r_info) != R_PPC_REL24)
+				continue;
+
+			if (sym->st_shndx != SHN_UNDEF &&
+			    sym->st_shndx != SHN_LIVEPATCH)
+				continue;
+
+			instruction = (u32 *)loc;
+			if (is_mprofile_mcount_callsite(symname, instruction))
+				continue;
+
+			if (!instr_is_relative_link_branch(*instruction))
+				continue;
+
+			instruction += 1;
+			*instruction = PPC_INST_NOP;
+		}
+	}
+}
diff --git a/arch/powerpc/kernel/module.h b/arch/powerpc/kernel/module.h
new file mode 100644
index 000000000000..748c38addf53
--- /dev/null
+++ b/arch/powerpc/kernel/module.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+
+#ifndef _POWERPC_ARCH_MODULE_H
+#define _POWERPC_ARCH_MODULE_H
+
+#ifdef CONFIG_MPROFILE_KERNEL
+bool is_mprofile_mcount_callsite(const char *name, u32 *instruction);
+#else
+static inline bool is_mprofile_mcount_callsite(const char *name, u32 *instruction)
+{
+	return false;
+}
+#endif
+
+#endif
diff --git a/arch/powerpc/kernel/module_64.c b/arch/powerpc/kernel/module_64.c
index a93b10c48000..5f34fbc3f1b8 100644
--- a/arch/powerpc/kernel/module_64.c
+++ b/arch/powerpc/kernel/module_64.c
@@ -450,7 +450,7 @@ static unsigned long stub_for_addr(const Elf64_Shdr *sechdrs,
 }
 
 #ifdef CONFIG_MPROFILE_KERNEL
-static bool is_mprofile_mcount_callsite(const char *name, u32 *instruction)
+bool is_mprofile_mcount_callsite(const char *name, u32 *instruction)
 {
 	if (strcmp("_mcount", name))
 		return false;
@@ -485,11 +485,6 @@ static void squash_toc_save_inst(const char *name, unsigned long addr)
 }
 #else
 static void squash_toc_save_inst(const char *name, unsigned long addr) { }
-
-static bool is_mprofile_mcount_callsite(const char *name, u32 *instruction)
-{
-	return false;
-}
 #endif
 
 /* We expect a noop next: if it is, replace it with instruction to
diff --git a/arch/x86/kernel/livepatch.c b/arch/x86/kernel/livepatch.c
index 6a68e41206e7..e99bc8dbf4a7 100644
--- a/arch/x86/kernel/livepatch.c
+++ b/arch/x86/kernel/livepatch.c
@@ -51,3 +51,70 @@ void arch_klp_init_object_loaded(struct klp_patch *patch,
 		apply_paravirt(pseg, pseg + para->sh_size);
 	}
 }
+
+void arch_klp_free_object_loaded(struct klp_patch *patch,
+				 struct klp_object *obj)
+{
+	const char *objname, *secname;
+	char sec_objname[MODULE_NAME_LEN];
+	struct klp_modinfo *info;
+	Elf64_Shdr *s;
+	Elf64_Rela *rel;
+	void *loc;
+	int i, cnt;
+
+	info = patch->mod->klp_info;
+	objname = klp_is_module(obj) ? obj->name : "vmlinux";
+
+	/* See livepatch core code for BUILD_BUG_ON() explanation */
+	BUILD_BUG_ON(MODULE_NAME_LEN < 56 || KSYM_NAME_LEN != 128);
+
+	/* For each klp relocation section */
+	for (s = info->sechdrs; s < info->sechdrs + info->hdr.e_shnum; s++) {
+		if (!(s->sh_flags & SHF_RELA_LIVEPATCH))
+			continue;
+
+		/*
+		 * Format: .klp.rela.sec_objname.section_name
+		 */
+		secname = info->secstrings + s->sh_name;
+		cnt = sscanf(secname, ".klp.rela.%55[^.]", sec_objname);
+		if (cnt != 1) {
+			pr_err("section %s has an incorrectly formatted name\n",
+			       secname);
+			continue;
+		}
+
+		if (strcmp(objname, sec_objname))
+			continue;
+
+		rel = (void *)s->sh_addr;
+		for (i = 0; i < s->sh_size / sizeof(*rel); i++) {
+			loc = (void *)info->sechdrs[s->sh_info].sh_addr
+				+ rel[i].r_offset;
+
+			switch (ELF64_R_TYPE(rel[i].r_info)) {
+			case R_X86_64_NONE:
+				break;
+			case R_X86_64_64:
+				*(u64 *)loc = 0;
+				break;
+			case R_X86_64_32:
+				*(u32 *)loc = 0;
+				break;
+			case R_X86_64_32S:
+				*(s32 *)loc = 0;
+				break;
+			case R_X86_64_PC32:
+			case R_X86_64_PLT32:
+				*(u32 *)loc = 0;
+				break;
+			case R_X86_64_PC64:
+				*(u64 *)loc = 0;
+				break;
+			default:
+				break;
+			}
+		}
+	}
+}
diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
index 273400814020..a227f473d5e5 100644
--- a/include/linux/livepatch.h
+++ b/include/linux/livepatch.h
@@ -202,6 +202,11 @@ static inline bool klp_have_reliable_stack(void)
 	       IS_ENABLED(CONFIG_HAVE_RELIABLE_STACKTRACE);
 }
 
+static inline bool klp_is_module(struct klp_object *obj)
+{
+	return obj->name;
+}
+
 typedef int (*klp_shadow_ctor_t)(void *obj,
 				 void *shadow_data,
 				 void *ctor_data);
diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
index ab4a4606d19b..7a5d0cd59ec1 100644
--- a/kernel/livepatch/core.c
+++ b/kernel/livepatch/core.c
@@ -43,11 +43,6 @@ LIST_HEAD(klp_patches);
 
 static struct kobject *klp_root_kobj;
 
-static bool klp_is_module(struct klp_object *obj)
-{
-	return obj->name;
-}
-
 /* sets obj->mod if object is not vmlinux and module is found */
 static void klp_find_object_module(struct klp_object *obj)
 {
@@ -712,6 +707,12 @@ void __weak arch_klp_init_object_loaded(struct klp_patch *patch,
 {
 }
 
+/* Arches may override this to finish any remaining arch-specific tasks */
+void __weak arch_klp_free_object_loaded(struct klp_patch *patch,
+					struct klp_object *obj)
+{
+}
+
 /* parts of the initialization that is done only when the object is loaded */
 static int klp_init_object_loaded(struct klp_patch *patch,
 				  struct klp_object *obj)
@@ -1100,6 +1101,12 @@ static void klp_cleanup_module_patches_limited(struct module *mod,
 
 			klp_post_unpatch_callback(obj);
 
+			mutex_lock(&text_mutex);
+			module_disable_ro(patch->mod);
+			arch_klp_free_object_loaded(patch, obj);
+			module_enable_ro(patch->mod, true);
+			mutex_unlock(&text_mutex);
+
 			klp_free_object_loaded(obj);
 			break;
 		}
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 2/2] livepatch: Clear relocation targets on a module removal
  2019-07-19 12:28 ` [RFC PATCH 2/2] livepatch: Clear relocation targets on a module removal Miroslav Benes
@ 2019-07-22  9:33   ` Petr Mladek
  2019-08-14 12:33     ` Miroslav Benes
  2019-07-28 20:04   ` Josh Poimboeuf
  1 sibling, 1 reply; 44+ messages in thread
From: Petr Mladek @ 2019-07-22  9:33 UTC (permalink / raw)
  To: Miroslav Benes; +Cc: jikos, jpoimboe, joe.lawrence, linux-kernel, live-patching

On Fri 2019-07-19 14:28:40, Miroslav Benes wrote:
> Josh reported a bug:
> 
>   When the object to be patched is a module, and that module is
>   rmmod'ed and reloaded, it fails to load with:
> 
>   module: x86/modules: Skipping invalid relocation target, existing value is nonzero for type 2, loc 00000000ba0302e9, val ffffffffa03e293c
>   livepatch: failed to initialize patch 'livepatch_nfsd' for module 'nfsd' (-8)
>   livepatch: patch 'livepatch_nfsd' failed for module 'nfsd', refusing to load module 'nfsd'
> 
>   The livepatch module has a relocation which references a symbol
>   in the _previous_ loading of nfsd. When apply_relocate_add()
>   tries to replace the old relocation with a new one, it sees that
>   the previous one is nonzero and it errors out.
> 
>   On ppc64le, we have a similar issue:
> 
>   module_64: livepatch_nfsd: Expected nop after call, got e8410018 at e_show+0x60/0x548 [livepatch_nfsd]
>   livepatch: failed to initialize patch 'livepatch_nfsd' for module 'nfsd' (-8)
>   livepatch: patch 'livepatch_nfsd' failed for module 'nfsd', refusing to load module 'nfsd'
> 
> He also proposed three different solutions. We could remove the error
> check in apply_relocate_add() introduced by commit eda9cec4c9a1
> ("x86/module: Detect and skip invalid relocations"). However the check
> is useful for detecting corrupted modules.
> 
> We could also deny the patched modules to be removed. If it proved to be
> a major drawback for users, we could still implement a different
> approach. The solution would also complicate the existing code a lot.
> 
> We thus decided to reverse the relocation patching (clear all relocation
> targets on x86_64, or return back nops on powerpc). The solution is not
> universal and is too much arch-specific, but it may prove to be simpler
> in the end.
> 
> Reported-by: Josh Poimboeuf <jpoimboe@redhat.com>
> Signed-off-by: Miroslav Benes <mbenes@suse.cz>
> ---
>  arch/powerpc/kernel/Makefile    |  1 +
>  arch/powerpc/kernel/livepatch.c | 75 +++++++++++++++++++++++++++++++++
>  arch/powerpc/kernel/module.h    | 15 +++++++
>  arch/powerpc/kernel/module_64.c |  7 +--
>  arch/x86/kernel/livepatch.c     | 67 +++++++++++++++++++++++++++++
>  include/linux/livepatch.h       |  5 +++
>  kernel/livepatch/core.c         | 17 +++++---
>  7 files changed, 176 insertions(+), 11 deletions(-)
>  create mode 100644 arch/powerpc/kernel/livepatch.c
>  create mode 100644 arch/powerpc/kernel/module.h
> 
> diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
> index 0ea6c4aa3a20..639000f78dc3 100644
> --- a/arch/powerpc/kernel/Makefile
> +++ b/arch/powerpc/kernel/Makefile
> @@ -154,6 +154,7 @@ endif
>  
>  obj-$(CONFIG_EPAPR_PARAVIRT)	+= epapr_paravirt.o epapr_hcalls.o
>  obj-$(CONFIG_KVM_GUEST)		+= kvm.o kvm_emul.o
> +obj-$(CONFIG_LIVEPATCH)	+= livepatch.o
>  
>  # Disable GCOV, KCOV & sanitizers in odd or sensitive code
>  GCOV_PROFILE_prom_init.o := n
> diff --git a/arch/powerpc/kernel/livepatch.c b/arch/powerpc/kernel/livepatch.c
> new file mode 100644
> index 000000000000..6f2468c60695
> --- /dev/null
> +++ b/arch/powerpc/kernel/livepatch.c
> @@ -0,0 +1,75 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +/*
> + * livepatch.c - powerpc-specific Kernel Live Patching Core
> + */
> +
> +#include <linux/livepatch.h>
> +#include <asm/code-patching.h>
> +#include "module.h"
> +
> +void arch_klp_free_object_loaded(struct klp_patch *patch,
> +				 struct klp_object *obj)

If I get it correctly then this functions reverts changes done by
klp_write_object_relocations(). Therefore it should get called
klp_clear_object_relocations() or so.

There is also arch_klp_init_object_loaded() but it does different
things, for example it applies alternatives or paravirt instructions.
Do we need to revert these as well?


> +{
> +	const char *objname, *secname, *symname;
> +	char sec_objname[MODULE_NAME_LEN];
> +	struct klp_modinfo *info;
> +	Elf64_Shdr *s;
> +	Elf64_Rela *rel;
> +	Elf64_Sym *sym;
> +	void *loc;
> +	u32 *instruction;
> +	int i, cnt;
> +
> +	info = patch->mod->klp_info;
> +	objname = klp_is_module(obj) ? obj->name : "vmlinux";
> +
> +	/* See livepatch core code for BUILD_BUG_ON() explanation */
> +	BUILD_BUG_ON(MODULE_NAME_LEN < 56 || KSYM_NAME_LEN != 128);
> +
> +	/* For each klp relocation section */
> +	for (s = info->sechdrs; s < info->sechdrs + info->hdr.e_shnum; s++) {
> +		if (!(s->sh_flags & SHF_RELA_LIVEPATCH))
> +			continue;
> +
> +		/*
> +		 * Format: .klp.rela.sec_objname.section_name
> +		 */
> +		secname = info->secstrings + s->sh_name;
> +		cnt = sscanf(secname, ".klp.rela.%55[^.]", sec_objname);
> +		if (cnt != 1) {
> +			pr_err("section %s has an incorrectly formatted name\n",
> +			       secname);
> +			continue;
> +		}
> +
> +		if (strcmp(objname, sec_objname))
> +			continue;

The above code seems to be arch-independent. Please, move it into
klp_clear_object_relocations() or so.

> +		rel = (void *)s->sh_addr;
> +		for (i = 0; i < s->sh_size / sizeof(*rel); i++) {
> +			loc = (void *)info->sechdrs[s->sh_info].sh_addr
> +				+ rel[i].r_offset;
> +			sym = (Elf64_Sym *)info->sechdrs[info->symndx].sh_addr
> +				+ ELF64_R_SYM(rel[i].r_info);
> +			symname = patch->mod->core_kallsyms.strtab
> +				+ sym->st_name;
> +
> +			if (ELF64_R_TYPE(rel[i].r_info) != R_PPC_REL24)
> +				continue;
> +
> +			if (sym->st_shndx != SHN_UNDEF &&
> +			    sym->st_shndx != SHN_LIVEPATCH)
> +				continue;

The above check is livepatch-specific. But in principle, this should
revert changes done by apply_relocate_add(). I would implement
apply_relocation_clear() or apply_relocation_del() or ...
and call it from the generic klp_clear_object_relocations().

The code should be put into the same source files as
apply_relocate_add(). It will increase the chance that
any changes will be in sync.

Of course, it is possible that there was a reason for the
livepatch-specific filtering that I am not aware of.

> +
> +			instruction = (u32 *)loc;
> +			if (is_mprofile_mcount_callsite(symname, instruction))
> +				continue;
> +
> +			if (!instr_is_relative_link_branch(*instruction))
> +				continue;
> +
> +			instruction += 1;
> +			*instruction = PPC_INST_NOP;
> +		}
> +	}
> +}

Otherwise, this approach looks fine to me. I believe that this area
is pretty stable and the maintenance should be rather cheap.

Best Regards,
Petr
acceptable.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 1/2] livepatch: Nullify obj->mod in klp_module_coming()'s error path
  2019-07-19 12:28 ` [PATCH 1/2] livepatch: Nullify obj->mod in klp_module_coming()'s error path Miroslav Benes
@ 2019-07-28 19:45   ` Josh Poimboeuf
  2019-08-19 11:26     ` Petr Mladek
  0 siblings, 1 reply; 44+ messages in thread
From: Josh Poimboeuf @ 2019-07-28 19:45 UTC (permalink / raw)
  To: Miroslav Benes; +Cc: jikos, pmladek, joe.lawrence, live-patching, linux-kernel

On Fri, Jul 19, 2019 at 02:28:39PM +0200, Miroslav Benes wrote:
> klp_module_coming() is called for every module appearing in the system.
> It sets obj->mod to a patched module for klp_object obj. Unfortunately
> it leaves it set even if an error happens later in the function and the
> patched module is not allowed to be loaded.
> 
> klp_is_object_loaded() uses obj->mod variable and could currently give a
> wrong return value. The bug is probably harmless as of now.
> 
> Signed-off-by: Miroslav Benes <mbenes@suse.cz>
> Reviewed-by: Petr Mladek <pmladek@suse.com>

Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>

-- 
Josh

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 2/2] livepatch: Clear relocation targets on a module removal
  2019-07-19 12:28 ` [RFC PATCH 2/2] livepatch: Clear relocation targets on a module removal Miroslav Benes
  2019-07-22  9:33   ` Petr Mladek
@ 2019-07-28 20:04   ` Josh Poimboeuf
  2019-08-14 11:06     ` Miroslav Benes
  1 sibling, 1 reply; 44+ messages in thread
From: Josh Poimboeuf @ 2019-07-28 20:04 UTC (permalink / raw)
  To: Miroslav Benes; +Cc: jikos, pmladek, joe.lawrence, live-patching, linux-kernel

On Fri, Jul 19, 2019 at 02:28:40PM +0200, Miroslav Benes wrote:
> Josh reported a bug:
> 
>   When the object to be patched is a module, and that module is
>   rmmod'ed and reloaded, it fails to load with:
> 
>   module: x86/modules: Skipping invalid relocation target, existing value is nonzero for type 2, loc 00000000ba0302e9, val ffffffffa03e293c
>   livepatch: failed to initialize patch 'livepatch_nfsd' for module 'nfsd' (-8)
>   livepatch: patch 'livepatch_nfsd' failed for module 'nfsd', refusing to load module 'nfsd'
> 
>   The livepatch module has a relocation which references a symbol
>   in the _previous_ loading of nfsd. When apply_relocate_add()
>   tries to replace the old relocation with a new one, it sees that
>   the previous one is nonzero and it errors out.
> 
>   On ppc64le, we have a similar issue:
> 
>   module_64: livepatch_nfsd: Expected nop after call, got e8410018 at e_show+0x60/0x548 [livepatch_nfsd]
>   livepatch: failed to initialize patch 'livepatch_nfsd' for module 'nfsd' (-8)
>   livepatch: patch 'livepatch_nfsd' failed for module 'nfsd', refusing to load module 'nfsd'
> 
> He also proposed three different solutions. We could remove the error
> check in apply_relocate_add() introduced by commit eda9cec4c9a1
> ("x86/module: Detect and skip invalid relocations"). However the check
> is useful for detecting corrupted modules.
> 
> We could also deny the patched modules to be removed. If it proved to be
> a major drawback for users, we could still implement a different
> approach. The solution would also complicate the existing code a lot.
> 
> We thus decided to reverse the relocation patching (clear all relocation
> targets on x86_64, or return back nops on powerpc). The solution is not
> universal and is too much arch-specific, but it may prove to be simpler
> in the end.

Thanks for the patch Miroslav.

However, I really don't like it.  All this extra convoluted
arch-specific code, just so users can unload a patched module.

Remind me why we didn't do the "deny the patched modules to be removed"
option?

Really, we should be going in the opposite direction, by creating module
dependencies, like all other kernel modules do, ensuring that a module
is loaded *before* we patch it.  That would also eliminate this bug.

And I think it would also help us remove a lot of nasty code, like the
coming/going notifiers and the .klp.arch mess.  Which, BTW, seem to be
the sources of most of our bugs...

Yes, there's the "but it's less flexible!" argument.  Does anybody
really need the flexibility?  I strongly doubt it.  I would love to see
an RFC patch which enforces that restriction, to see all the nasty code
we could remove.  I would much rather live patching be stable than
flexible.

-- 
Josh

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 2/2] livepatch: Clear relocation targets on a module removal
  2019-07-28 20:04   ` Josh Poimboeuf
@ 2019-08-14 11:06     ` Miroslav Benes
  2019-08-14 15:12       ` Josh Poimboeuf
  0 siblings, 1 reply; 44+ messages in thread
From: Miroslav Benes @ 2019-08-14 11:06 UTC (permalink / raw)
  To: Josh Poimboeuf; +Cc: jikos, pmladek, joe.lawrence, live-patching, linux-kernel

On Sun, 28 Jul 2019, Josh Poimboeuf wrote:

> On Fri, Jul 19, 2019 at 02:28:40PM +0200, Miroslav Benes wrote:
> > Josh reported a bug:
> > 
> >   When the object to be patched is a module, and that module is
> >   rmmod'ed and reloaded, it fails to load with:
> > 
> >   module: x86/modules: Skipping invalid relocation target, existing value is nonzero for type 2, loc 00000000ba0302e9, val ffffffffa03e293c
> >   livepatch: failed to initialize patch 'livepatch_nfsd' for module 'nfsd' (-8)
> >   livepatch: patch 'livepatch_nfsd' failed for module 'nfsd', refusing to load module 'nfsd'
> > 
> >   The livepatch module has a relocation which references a symbol
> >   in the _previous_ loading of nfsd. When apply_relocate_add()
> >   tries to replace the old relocation with a new one, it sees that
> >   the previous one is nonzero and it errors out.
> > 
> >   On ppc64le, we have a similar issue:
> > 
> >   module_64: livepatch_nfsd: Expected nop after call, got e8410018 at e_show+0x60/0x548 [livepatch_nfsd]
> >   livepatch: failed to initialize patch 'livepatch_nfsd' for module 'nfsd' (-8)
> >   livepatch: patch 'livepatch_nfsd' failed for module 'nfsd', refusing to load module 'nfsd'
> > 
> > He also proposed three different solutions. We could remove the error
> > check in apply_relocate_add() introduced by commit eda9cec4c9a1
> > ("x86/module: Detect and skip invalid relocations"). However the check
> > is useful for detecting corrupted modules.
> > 
> > We could also deny the patched modules to be removed. If it proved to be
> > a major drawback for users, we could still implement a different
> > approach. The solution would also complicate the existing code a lot.
> > 
> > We thus decided to reverse the relocation patching (clear all relocation
> > targets on x86_64, or return back nops on powerpc). The solution is not
> > universal and is too much arch-specific, but it may prove to be simpler
> > in the end.
> 
> Thanks for the patch Miroslav.
> 
> However, I really don't like it.  All this extra convoluted
> arch-specific code, just so users can unload a patched module.

Yes, it is unfortunate.
 
> Remind me why we didn't do the "deny the patched modules to be removed"
> option?

Petr came with a couple of issues in the patch. Nothing unfixable, but it 
would complicate the code a bit, so we wanted to explore arch-specific 
approach first. I'll return to it, fix it and we'll see the outcome.

> Really, we should be going in the opposite direction, by creating module
> dependencies, like all other kernel modules do, ensuring that a module
> is loaded *before* we patch it.  That would also eliminate this bug.

Yes, but it is not ideal either with cumulative one-fixes-all patch 
modules. It would load also modules which are not necessary for a 
customer and I know that at least some customers care about this. They 
want to deploy only things which are crucial for their systems.

We could split patch modules as you proposed in the past, but that have 
issues as well.

Anyway, that is why I proposed "Rethinking late module patching" talk at 
LPC and we should try to come up with a solution there.

> And I think it would also help us remove a lot of nasty code, like the
> coming/going notifiers and the .klp.arch mess.  Which, BTW, seem to be
> the sources of most of our bugs...

Yes.

> Yes, there's the "but it's less flexible!" argument.  Does anybody
> really need the flexibility?  I strongly doubt it.  I would love to see
> an RFC patch which enforces that restriction, to see all the nasty code
> we could remove.  I would much rather live patching be stable than
> flexible.

I agree that unloading a module does not make sense much (famous last 
words), so we could try.

Miroslav

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 2/2] livepatch: Clear relocation targets on a module removal
  2019-07-22  9:33   ` Petr Mladek
@ 2019-08-14 12:33     ` Miroslav Benes
  0 siblings, 0 replies; 44+ messages in thread
From: Miroslav Benes @ 2019-08-14 12:33 UTC (permalink / raw)
  To: Petr Mladek; +Cc: jikos, jpoimboe, joe.lawrence, linux-kernel, live-patching

On Mon, 22 Jul 2019, Petr Mladek wrote:

> On Fri 2019-07-19 14:28:40, Miroslav Benes wrote:
> > Josh reported a bug:
> > 
> >   When the object to be patched is a module, and that module is
> >   rmmod'ed and reloaded, it fails to load with:
> > 
> >   module: x86/modules: Skipping invalid relocation target, existing value is nonzero for type 2, loc 00000000ba0302e9, val ffffffffa03e293c
> >   livepatch: failed to initialize patch 'livepatch_nfsd' for module 'nfsd' (-8)
> >   livepatch: patch 'livepatch_nfsd' failed for module 'nfsd', refusing to load module 'nfsd'
> > 
> >   The livepatch module has a relocation which references a symbol
> >   in the _previous_ loading of nfsd. When apply_relocate_add()
> >   tries to replace the old relocation with a new one, it sees that
> >   the previous one is nonzero and it errors out.
> > 
> >   On ppc64le, we have a similar issue:
> > 
> >   module_64: livepatch_nfsd: Expected nop after call, got e8410018 at e_show+0x60/0x548 [livepatch_nfsd]
> >   livepatch: failed to initialize patch 'livepatch_nfsd' for module 'nfsd' (-8)
> >   livepatch: patch 'livepatch_nfsd' failed for module 'nfsd', refusing to load module 'nfsd'
> > 
> > He also proposed three different solutions. We could remove the error
> > check in apply_relocate_add() introduced by commit eda9cec4c9a1
> > ("x86/module: Detect and skip invalid relocations"). However the check
> > is useful for detecting corrupted modules.
> > 
> > We could also deny the patched modules to be removed. If it proved to be
> > a major drawback for users, we could still implement a different
> > approach. The solution would also complicate the existing code a lot.
> > 
> > We thus decided to reverse the relocation patching (clear all relocation
> > targets on x86_64, or return back nops on powerpc). The solution is not
> > universal and is too much arch-specific, but it may prove to be simpler
> > in the end.
> > 
> > Reported-by: Josh Poimboeuf <jpoimboe@redhat.com>
> > Signed-off-by: Miroslav Benes <mbenes@suse.cz>
> > ---
> >  arch/powerpc/kernel/Makefile    |  1 +
> >  arch/powerpc/kernel/livepatch.c | 75 +++++++++++++++++++++++++++++++++
> >  arch/powerpc/kernel/module.h    | 15 +++++++
> >  arch/powerpc/kernel/module_64.c |  7 +--
> >  arch/x86/kernel/livepatch.c     | 67 +++++++++++++++++++++++++++++
> >  include/linux/livepatch.h       |  5 +++
> >  kernel/livepatch/core.c         | 17 +++++---
> >  7 files changed, 176 insertions(+), 11 deletions(-)
> >  create mode 100644 arch/powerpc/kernel/livepatch.c
> >  create mode 100644 arch/powerpc/kernel/module.h
> > 
> > diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
> > index 0ea6c4aa3a20..639000f78dc3 100644
> > --- a/arch/powerpc/kernel/Makefile
> > +++ b/arch/powerpc/kernel/Makefile
> > @@ -154,6 +154,7 @@ endif
> >  
> >  obj-$(CONFIG_EPAPR_PARAVIRT)	+= epapr_paravirt.o epapr_hcalls.o
> >  obj-$(CONFIG_KVM_GUEST)		+= kvm.o kvm_emul.o
> > +obj-$(CONFIG_LIVEPATCH)	+= livepatch.o
> >  
> >  # Disable GCOV, KCOV & sanitizers in odd or sensitive code
> >  GCOV_PROFILE_prom_init.o := n
> > diff --git a/arch/powerpc/kernel/livepatch.c b/arch/powerpc/kernel/livepatch.c
> > new file mode 100644
> > index 000000000000..6f2468c60695
> > --- /dev/null
> > +++ b/arch/powerpc/kernel/livepatch.c
> > @@ -0,0 +1,75 @@
> > +// SPDX-License-Identifier: GPL-2.0-or-later
> > +/*
> > + * livepatch.c - powerpc-specific Kernel Live Patching Core
> > + */
> > +
> > +#include <linux/livepatch.h>
> > +#include <asm/code-patching.h>
> > +#include "module.h"
> > +
> > +void arch_klp_free_object_loaded(struct klp_patch *patch,
> > +				 struct klp_object *obj)
> 
> If I get it correctly then this functions reverts changes done by
> klp_write_object_relocations(). Therefore it should get called
> klp_clear_object_relocations() or so.

Good point. Especially when we want to split the function to 
arch-independent and arch-specific parts.
 
> There is also arch_klp_init_object_loaded() but it does different
> things, for example it applies alternatives or paravirt instructions.
> Do we need to revert these as well?

No, I don't think so. They should not change because the patch module 
stays loaded.
 
> > +{
> > +	const char *objname, *secname, *symname;
> > +	char sec_objname[MODULE_NAME_LEN];
> > +	struct klp_modinfo *info;
> > +	Elf64_Shdr *s;
> > +	Elf64_Rela *rel;
> > +	Elf64_Sym *sym;
> > +	void *loc;
> > +	u32 *instruction;
> > +	int i, cnt;
> > +
> > +	info = patch->mod->klp_info;
> > +	objname = klp_is_module(obj) ? obj->name : "vmlinux";
> > +
> > +	/* See livepatch core code for BUILD_BUG_ON() explanation */
> > +	BUILD_BUG_ON(MODULE_NAME_LEN < 56 || KSYM_NAME_LEN != 128);
> > +
> > +	/* For each klp relocation section */
> > +	for (s = info->sechdrs; s < info->sechdrs + info->hdr.e_shnum; s++) {
> > +		if (!(s->sh_flags & SHF_RELA_LIVEPATCH))
> > +			continue;
> > +
> > +		/*
> > +		 * Format: .klp.rela.sec_objname.section_name
> > +		 */
> > +		secname = info->secstrings + s->sh_name;
> > +		cnt = sscanf(secname, ".klp.rela.%55[^.]", sec_objname);
> > +		if (cnt != 1) {
> > +			pr_err("section %s has an incorrectly formatted name\n",
> > +			       secname);
> > +			continue;
> > +		}
> > +
> > +		if (strcmp(objname, sec_objname))
> > +			continue;
> 
> The above code seems to be arch-independent. Please, move it into
> klp_clear_object_relocations() or so.

Yes.
 
> > +		rel = (void *)s->sh_addr;
> > +		for (i = 0; i < s->sh_size / sizeof(*rel); i++) {
> > +			loc = (void *)info->sechdrs[s->sh_info].sh_addr
> > +				+ rel[i].r_offset;
> > +			sym = (Elf64_Sym *)info->sechdrs[info->symndx].sh_addr
> > +				+ ELF64_R_SYM(rel[i].r_info);
> > +			symname = patch->mod->core_kallsyms.strtab
> > +				+ sym->st_name;
> > +
> > +			if (ELF64_R_TYPE(rel[i].r_info) != R_PPC_REL24)
> > +				continue;
> > +
> > +			if (sym->st_shndx != SHN_UNDEF &&
> > +			    sym->st_shndx != SHN_LIVEPATCH)
> > +				continue;
> 
> The above check is livepatch-specific. But in principle, this should
> revert changes done by apply_relocate_add(). I would implement
> apply_relocation_clear() or apply_relocation_del() or ...
> and call it from the generic klp_clear_object_relocations().
> 
> The code should be put into the same source files as
> apply_relocate_add(). It will increase the chance that
> any changes will be in sync.

Yes, it would be more consistent. It also shows how much code have to be 
introduced to fix the issue :/
 
> Of course, it is possible that there was a reason for the
> livepatch-specific filtering that I am not aware of.
> 
> > +
> > +			instruction = (u32 *)loc;
> > +			if (is_mprofile_mcount_callsite(symname, instruction))
> > +				continue;
> > +
> > +			if (!instr_is_relative_link_branch(*instruction))
> > +				continue;
> > +
> > +			instruction += 1;
> > +			*instruction = PPC_INST_NOP;
> > +		}
> > +	}
> > +}
> 
> Otherwise, this approach looks fine to me. I believe that this area
> is pretty stable and the maintenance should be rather cheap.

Thanks for the review. I'll try to fix the first approach (which denies 
the modules to be removed) now so we can compare.

Miroslav

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 2/2] livepatch: Clear relocation targets on a module removal
  2019-08-14 11:06     ` Miroslav Benes
@ 2019-08-14 15:12       ` Josh Poimboeuf
  2019-08-16  9:46         ` Petr Mladek
  2019-08-26 13:44         ` Nicolai Stange
  0 siblings, 2 replies; 44+ messages in thread
From: Josh Poimboeuf @ 2019-08-14 15:12 UTC (permalink / raw)
  To: Miroslav Benes; +Cc: jikos, pmladek, joe.lawrence, live-patching, linux-kernel

On Wed, Aug 14, 2019 at 01:06:09PM +0200, Miroslav Benes wrote:
> > Really, we should be going in the opposite direction, by creating module
> > dependencies, like all other kernel modules do, ensuring that a module
> > is loaded *before* we patch it.  That would also eliminate this bug.
> 
> Yes, but it is not ideal either with cumulative one-fixes-all patch 
> modules. It would load also modules which are not necessary for a 
> customer and I know that at least some customers care about this. They 
> want to deploy only things which are crucial for their systems.

If you frame the question as "do you want to destabilize the live
patching infrastucture" then the answer might be different.

We should look at whether it makes sense to destabilize live patching
for everybody, for a small minority of people who care about a small
minority of edge cases.

Or maybe there's some other solution we haven't thought about, which
fits more in the framework of how kernel modules already work.

> We could split patch modules as you proposed in the past, but that have 
> issues as well.

Right, I'm not really crazy about that solution either.

Here's another idea: per-object patch modules.  Patches to vmlinux are
in a vmlinux patch module.  Patches to kvm.ko are in a kvm patch module.
That would require:

- Careful management of dependencies between object-specific patches.
  Maybe that just means that exported function ABIs shouldn't change.

- Some kind of hooking into modprobe to ensure the patch module gets
  loaded with the real one.

- Changing 'atomic replace' to allow patch modules to be per-object.

> Anyway, that is why I proposed "Rethinking late module patching" talk at 
> LPC and we should try to come up with a solution there.

Thanks, I saw that.  It's definitely worthy of more discussion.  The
talk may be more productive if there were code to look at.  For example,
a patch which removes all the "late module patching" gunk, so we can at
least quantify the cost of the current approach.

-- 
Josh

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 2/2] livepatch: Clear relocation targets on a module removal
  2019-08-14 15:12       ` Josh Poimboeuf
@ 2019-08-16  9:46         ` Petr Mladek
  2019-08-22 22:36           ` Josh Poimboeuf
  2019-08-26 13:44         ` Nicolai Stange
  1 sibling, 1 reply; 44+ messages in thread
From: Petr Mladek @ 2019-08-16  9:46 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Miroslav Benes, jikos, joe.lawrence, linux-kernel, live-patching

On Wed 2019-08-14 10:12:44, Josh Poimboeuf wrote:
> On Wed, Aug 14, 2019 at 01:06:09PM +0200, Miroslav Benes wrote:
> > > Really, we should be going in the opposite direction, by creating module
> > > dependencies, like all other kernel modules do, ensuring that a module
> > > is loaded *before* we patch it.  That would also eliminate this bug.
> > 
> > Yes, but it is not ideal either with cumulative one-fixes-all patch 
> > modules. It would load also modules which are not necessary for a 
> > customer and I know that at least some customers care about this. They 
> > want to deploy only things which are crucial for their systems.
> 
> If you frame the question as "do you want to destabilize the live
> patching infrastucture" then the answer might be different.
> 
> We should look at whether it makes sense to destabilize live patching
> for everybody, for a small minority of people who care about a small
> minority of edge cases.

I do not see it that simple. Forcing livepatched modules to be
loaded would mean loading "random" new modules when updating
livepatches:

  + It means more actions and higher risk to destabilize
    the system. Different modules have different quality.

  + It might open more security holes that are not fixed by
    the livepatch.

  + It might require some extra configuration actions to handle
    the newly opened interfaces (devices). For example, updating
    SELinux policies.

  + Are there conflicting modules that might need to get
    livepatched?

This approach has a strong no-go from my side.


> Or maybe there's some other solution we haven't thought about, which
> fits more in the framework of how kernel modules already work.
>
> > We could split patch modules as you proposed in the past, but that have 
> > issues as well.

> Right, I'm not really crazy about that solution either.

Yes, this would just move the problem somewhere else.


> Here's another idea: per-object patch modules.  Patches to vmlinux are
> in a vmlinux patch module.  Patches to kvm.ko are in a kvm patch module.
> That would require:
> 
> - Careful management of dependencies between object-specific patches.
>   Maybe that just means that exported function ABIs shouldn't change.
> 
> - Some kind of hooking into modprobe to ensure the patch module gets
>   loaded with the real one.

I see this just as a particular approach how to split livepatches
per-object. The above points suggest how to handle dependencies
on the kernel side.

> - Changing 'atomic replace' to allow patch modules to be per-object.

The problem might be how to transition all loaded objects atomically
when the needed code is loaded from different modules.

Alternative would be to support only per-object consitency. But it
might reduce the number of supported scenarios too much. Also it
would make livepatching more error-prone.


I would like to see updated variant of this patch to see how much
arch-specific code is really necessary.

IMHO, if reverting relocations is too complicated then the least painful
compromise is to "deny the patched modules to be removed".

> > Anyway, that is why I proposed "Rethinking late module patching" talk at 
> > LPC and we should try to come up with a solution there.
>
> Thanks, I saw that.  It's definitely worthy of more discussion.  The
> talk may be more productive if there were code to look at.  For example,
> a patch which removes all the "late module patching" gunk, so we can at
> least quantify the cost of the current approach.

+1

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 1/2] livepatch: Nullify obj->mod in klp_module_coming()'s error path
  2019-07-28 19:45   ` Josh Poimboeuf
@ 2019-08-19 11:26     ` Petr Mladek
  0 siblings, 0 replies; 44+ messages in thread
From: Petr Mladek @ 2019-08-19 11:26 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Miroslav Benes, jikos, joe.lawrence, linux-kernel, live-patching

On Sun 2019-07-28 14:45:33, Josh Poimboeuf wrote:
> On Fri, Jul 19, 2019 at 02:28:39PM +0200, Miroslav Benes wrote:
> > klp_module_coming() is called for every module appearing in the system.
> > It sets obj->mod to a patched module for klp_object obj. Unfortunately
> > it leaves it set even if an error happens later in the function and the
> > patched module is not allowed to be loaded.
> > 
> > klp_is_object_loaded() uses obj->mod variable and could currently give a
> > wrong return value. The bug is probably harmless as of now.
> > 
> > Signed-off-by: Miroslav Benes <mbenes@suse.cz>
> > Reviewed-by: Petr Mladek <pmladek@suse.com>
> 
> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>

This patch has been committed into the branch for-5.4/core.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 2/2] livepatch: Clear relocation targets on a module removal
  2019-08-16  9:46         ` Petr Mladek
@ 2019-08-22 22:36           ` Josh Poimboeuf
  2019-08-23  8:13             ` Petr Mladek
  0 siblings, 1 reply; 44+ messages in thread
From: Josh Poimboeuf @ 2019-08-22 22:36 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Miroslav Benes, jikos, joe.lawrence, linux-kernel, live-patching

On Fri, Aug 16, 2019 at 11:46:08AM +0200, Petr Mladek wrote:
> On Wed 2019-08-14 10:12:44, Josh Poimboeuf wrote:
> > On Wed, Aug 14, 2019 at 01:06:09PM +0200, Miroslav Benes wrote:
> > > > Really, we should be going in the opposite direction, by creating module
> > > > dependencies, like all other kernel modules do, ensuring that a module
> > > > is loaded *before* we patch it.  That would also eliminate this bug.
> > > 
> > > Yes, but it is not ideal either with cumulative one-fixes-all patch 
> > > modules. It would load also modules which are not necessary for a 
> > > customer and I know that at least some customers care about this. They 
> > > want to deploy only things which are crucial for their systems.
> > 
> > If you frame the question as "do you want to destabilize the live
> > patching infrastucture" then the answer might be different.
> > 
> > We should look at whether it makes sense to destabilize live patching
> > for everybody, for a small minority of people who care about a small
> > minority of edge cases.
> 
> I do not see it that simple. Forcing livepatched modules to be
> loaded would mean loading "random" new modules when updating
> livepatches:

I don't want to start a long debate on this, because this idea isn't
even my first choice.  But we shouldn't dismiss it outright.

<devils-advocate>

>   + It means more actions and higher risk to destabilize
>     the system. Different modules have different quality.

Maybe the distro shouldn't ship modules which would destabilize the
system.

>   + It might open more security holes that are not fixed by
>     the livepatch.

Following the same line of thinking, the livepatch infrastructure might
open security holes because of the inherent complexity of late module
patching.

>   + It might require some extra configuration actions to handle
>     the newly opened interfaces (devices). For example, updating
>     SELinux policies.

I assume you mean user-created policies, not distro ones?  Is this even
a realistic concern?

>   + Are there conflicting modules that might need to get
>     livepatched?

Again is this realistic?

> This approach has a strong no-go from my side.

</devils-advocate>

I agree it's not ideal, but nothing is ideal at this point.  Let's not
to rule it out prematurely.  I do feel that our current approach is not
the best.  It will continue to create problems for us until we fix it.

>
> > Or maybe there's some other solution we haven't thought about, which
> > fits more in the framework of how kernel modules already work.
> >
> > > We could split patch modules as you proposed in the past, but that have 
> > > issues as well.
> 
> > Right, I'm not really crazy about that solution either.
> 
> Yes, this would just move the problem somewhere else.
> 
> 
> > Here's another idea: per-object patch modules.  Patches to vmlinux are
> > in a vmlinux patch module.  Patches to kvm.ko are in a kvm patch module.
> > That would require:
> > 
> > - Careful management of dependencies between object-specific patches.
> >   Maybe that just means that exported function ABIs shouldn't change.
> > 
> > - Some kind of hooking into modprobe to ensure the patch module gets
> >   loaded with the real one.
> 
> I see this just as a particular approach how to split livepatches
> per-object. The above points suggest how to handle dependencies
> on the kernel side.

Yes, they would need to be done on the distro / patch creation /
operational side.  They probably wouldn't affect livepatch code.

> > - Changing 'atomic replace' to allow patch modules to be per-object.
> 
> The problem might be how to transition all loaded objects atomically
> when the needed code is loaded from different modules.

I'm not sure what you mean.

My idea was that each patch module would be specific to an object, with
no inter-module change dependencies.  So when using atomic replace, if
the patch module is only targeted to vmlinux, then only vmlinux-targeted
patch modules would be replaced.

In other words, 'atomic replace' would be object-specific.

> Alternative would be to support only per-object consitency. But it
> might reduce the number of supported scenarios too much. Also it
> would make livepatching more error-prone.

Again, I don't follow.

-- 
Josh

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 2/2] livepatch: Clear relocation targets on a module removal
  2019-08-22 22:36           ` Josh Poimboeuf
@ 2019-08-23  8:13             ` Petr Mladek
  2019-08-26 14:54               ` Josh Poimboeuf
  0 siblings, 1 reply; 44+ messages in thread
From: Petr Mladek @ 2019-08-23  8:13 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: jikos, joe.lawrence, Miroslav Benes, linux-kernel, live-patching

On Thu 2019-08-22 17:36:49, Josh Poimboeuf wrote:
> On Fri, Aug 16, 2019 at 11:46:08AM +0200, Petr Mladek wrote:
> > On Wed 2019-08-14 10:12:44, Josh Poimboeuf wrote:
> > > On Wed, Aug 14, 2019 at 01:06:09PM +0200, Miroslav Benes wrote:
> > > > > Really, we should be going in the opposite direction, by creating module
> > > > > dependencies, like all other kernel modules do, ensuring that a module
> > > > > is loaded *before* we patch it.  That would also eliminate this bug.
> > > 
> > > We should look at whether it makes sense to destabilize live patching
> > > for everybody, for a small minority of people who care about a small
> > > minority of edge cases.
> > 
> > I do not see it that simple. Forcing livepatched modules to be
> > loaded would mean loading "random" new modules when updating
> > livepatches:
> 
> I don't want to start a long debate on this, because this idea isn't
> even my first choice.  But we shouldn't dismiss it outright.

I am glad to hear that this is not your first choice.


> >   + It means more actions and higher risk to destabilize
> >     the system. Different modules have different quality.
> 
> Maybe the distro shouldn't ship modules which would destabilize the
> system.

Is this realistic? Even the best QA could not check all scenarios.
My point is that the more actions we do the bigger the risk is.

Anyway, this approach might cause loading modules that are never
or rarely loaded together. Real life systems have limited number of
peripherals.

I wonder if it might actually break certification of some
hardware. It is just an idea. I do not know how certifications
are done and what is the scope or limits.


> >   + It might open more security holes that are not fixed by
> >     the livepatch.
> 
> Following the same line of thinking, the livepatch infrastructure might
> open security holes because of the inherent complexity of late module
> patching.

Could you be more specific, please?
Has there been any known security hole in the late module
livepatching code?


> >   + It might require some extra configuration actions to handle
> >     the newly opened interfaces (devices). For example, updating
> >     SELinux policies.
> 
> I assume you mean user-created policies, not distro ones?  Is this even
> a realistic concern?

Honestly, I do not know. I am not familiar with this area. There are
also containers. They are going to be everywhere. They also need a lot
of rules to keep stuff separated. And it is another area where I have
no idea if newly loaded and unexpectedly modules might need special
handling.


> >   + Are there conflicting modules that might need to get
> >     livepatched?
> 
> Again is this realistic?

I do not know. But I could imagine it.


> > This approach has a strong no-go from my side.
> 
> </devils-advocate>
> 
> I agree it's not ideal, but nothing is ideal at this point.  Let's not
> to rule it out prematurely.  I do feel that our current approach is not
> the best.  It will continue to create problems for us until we fix it.

I am sure that we could do better. I just think that forcibly loading
modules is opening too huge can of worms. Basically all other
approaches have more limited or better defined effects.

For example, the newly added code that clears the relocations
is something that can be tested. Behavior of "random" mix of
loaded modules opens possibilities that have never been
discovered before.


> > > - Changing 'atomic replace' to allow patch modules to be per-object.
> > 
> > The problem might be how to transition all loaded objects atomically
> > when the needed code is loaded from different modules.
> 
> I'm not sure what you mean.
> 
> My idea was that each patch module would be specific to an object, with
> no inter-module change dependencies.  So when using atomic replace, if
> the patch module is only targeted to vmlinux, then only vmlinux-targeted
> patch modules would be replaced.
> 
> In other words, 'atomic replace' would be object-specific.
> 
> > Alternative would be to support only per-object consitency. But it
> > might reduce the number of supported scenarios too much. Also it
> > would make livepatching more error-prone.

By per-object consistency I mean the same as you with "each patch
module would be specific to an object, with no inter-module change
dependencies".

My concern is that it would prevent semantic changes in a shared code.
Semantic changes are rare. But changes in shared code are not.

If we reduce the consistency to per-object consistency. Will the
consistency still make sense then? We might want to go back to
trees, I mean immediate mode.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 2/2] livepatch: Clear relocation targets on a module removal
  2019-08-14 15:12       ` Josh Poimboeuf
  2019-08-16  9:46         ` Petr Mladek
@ 2019-08-26 13:44         ` Nicolai Stange
  2019-08-26 15:02           ` Josh Poimboeuf
  1 sibling, 1 reply; 44+ messages in thread
From: Nicolai Stange @ 2019-08-26 13:44 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Miroslav Benes, jikos, pmladek, joe.lawrence, live-patching,
	linux-kernel

Josh Poimboeuf <jpoimboe@redhat.com> writes:

> On Wed, Aug 14, 2019 at 01:06:09PM +0200, Miroslav Benes wrote:
>> > Really, we should be going in the opposite direction, by creating module
>> > dependencies, like all other kernel modules do, ensuring that a module
>> > is loaded *before* we patch it.  That would also eliminate this bug.
>> 
>> Yes, but it is not ideal either with cumulative one-fixes-all patch 
>> modules. It would load also modules which are not necessary for a 
>> customer and I know that at least some customers care about this. They 
>> want to deploy only things which are crucial for their systems.

Security concerns set aside, some of the patched modules might get
distributed separately from the main kernel through some sort of
kernel-*-extra packages and thus, not be found on some target system
at all. Or they might have been blacklisted.


> If you frame the question as "do you want to destabilize the live
> patching infrastucture" then the answer might be different.
>
> We should look at whether it makes sense to destabilize live patching
> for everybody, for a small minority of people who care about a small
> minority of edge cases.
>
> Or maybe there's some other solution we haven't thought about, which
> fits more in the framework of how kernel modules already work.
>
>> We could split patch modules as you proposed in the past, but that have 
>> issues as well.
>
> Right, I'm not really crazy about that solution either.
>
> Here's another idea: per-object patch modules.  Patches to vmlinux are
> in a vmlinux patch module.  Patches to kvm.ko are in a kvm patch module.
> That would require:
>
> - Careful management of dependencies between object-specific patches.
>   Maybe that just means that exported function ABIs shouldn't change.
>
> - Some kind of hooking into modprobe to ensure the patch module gets
>   loaded with the real one.
>
> - Changing 'atomic replace' to allow patch modules to be per-object.
>

Perhaps I'm misunderstanding, but supporting only per-object livepatch
modules would make livepatch creation for something like commit
15fab63e1e57 ("fs: prevent page refcount overflow in pipe_buf_get"),
CVE-2019-11487 really cumbersome (see the fuse part)?

I think I've seen similar interdependencies between e.g. kvm.ko <->
kvm_intel.ko, but can't find an example right now.


Thanks,

Nicolai

--
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 247165, AG München), GF: Felix Imendörffer

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 2/2] livepatch: Clear relocation targets on a module removal
  2019-08-23  8:13             ` Petr Mladek
@ 2019-08-26 14:54               ` Josh Poimboeuf
  2019-08-27 15:05                 ` Joe Lawrence
  2019-09-02 16:13                 ` Miroslav Benes
  0 siblings, 2 replies; 44+ messages in thread
From: Josh Poimboeuf @ 2019-08-26 14:54 UTC (permalink / raw)
  To: Petr Mladek
  Cc: jikos, joe.lawrence, Miroslav Benes, linux-kernel, live-patching

On Fri, Aug 23, 2019 at 10:13:06AM +0200, Petr Mladek wrote:
> On Thu 2019-08-22 17:36:49, Josh Poimboeuf wrote:
> > On Fri, Aug 16, 2019 at 11:46:08AM +0200, Petr Mladek wrote:
> > > On Wed 2019-08-14 10:12:44, Josh Poimboeuf wrote:
> > > > On Wed, Aug 14, 2019 at 01:06:09PM +0200, Miroslav Benes wrote:
> > > > > > Really, we should be going in the opposite direction, by creating module
> > > > > > dependencies, like all other kernel modules do, ensuring that a module
> > > > > > is loaded *before* we patch it.  That would also eliminate this bug.
> > > > 
> > > > We should look at whether it makes sense to destabilize live patching
> > > > for everybody, for a small minority of people who care about a small
> > > > minority of edge cases.
> > > 
> > > I do not see it that simple. Forcing livepatched modules to be
> > > loaded would mean loading "random" new modules when updating
> > > livepatches:
> > 
> > I don't want to start a long debate on this, because this idea isn't
> > even my first choice.  But we shouldn't dismiss it outright.
> 
> I am glad to hear that this is not your first choice.
> 
> 
> > >   + It means more actions and higher risk to destabilize
> > >     the system. Different modules have different quality.
> > 
> > Maybe the distro shouldn't ship modules which would destabilize the
> > system.
> 
> Is this realistic? Even the best QA could not check all scenarios.
> My point is that the more actions we do the bigger the risk is.

Sure, it introduces risk.  But we have to compare that risk (which only
affects rare edge cases) with the ones introduced by the late module
patching code.  I get the feeling that "late module patching" introduces
risk to a broader range of use cases than "occasional loading of unused
modules".

The latter risk could be minimized by introducing a disabled state for
modules - load it in memory, but don't expose it to users until
explicitly loaded.  Just a brainstormed idea; not sure whether it would
work in practice.

> Anyway, this approach might cause loading modules that are never
> or rarely loaded together. Real life systems have limited number of
> peripherals.
> 
> I wonder if it might actually break certification of some
> hardware. It is just an idea. I do not know how certifications
> are done and what is the scope or limits.
> 
> 
> > >   + It might open more security holes that are not fixed by
> > >     the livepatch.
> > 
> > Following the same line of thinking, the livepatch infrastructure might
> > open security holes because of the inherent complexity of late module
> > patching.
> 
> Could you be more specific, please?
> Has there been any known security hole in the late module
> livepatching code?

Just off the top of my head, I can think of two recent bugs which can be
blamed on late module patching:

1) There was a RHEL-only bug which caused arch_klp_init_object_loaded()
   to not be loaded.  This resulted in a panic when certain patched code
   was executed.

2) arch_klp_init_object_loaded() currently doesn't have any jump label
   specific code.  This has recently caused panics for patched code
   which relies on static keys.  The workaround is to not use jump
   labels in patched code.  The real fix is to add support for them in
   arch_klp_init_object_loaded().

I can easily foresee more problems like those in the future.  Going
forward we have to always keep track of which special sections are
needed for which architectures.  Those special sections can change over
time, or can simply be overlooked for a given architecture.  It's
fragile.

Not to mention that most of the bugs we've fixed over the years seem to
be related to klp_init_object_loaded() and klp_module_coming/going().
I would expect that to continue given the hackish nature of late module
loading.  With live patching, almost any bug can be a security bug.


> > >   + It might require some extra configuration actions to handle
> > >     the newly opened interfaces (devices). For example, updating
> > >     SELinux policies.
> > 
> > I assume you mean user-created policies, not distro ones?  Is this even
> > a realistic concern?
> 
> Honestly, I do not know. I am not familiar with this area. There are
> also containers. They are going to be everywhere. They also need a lot
> of rules to keep stuff separated. And it is another area where I have
> no idea if newly loaded and unexpectedly modules might need special
> handling.
> 
> 
> > >   + Are there conflicting modules that might need to get
> > >     livepatched?
> > 
> > Again is this realistic?
> 
> I do not know. But I could imagine it.
> 
> 
> > > This approach has a strong no-go from my side.
> > 
> > </devils-advocate>
> > 
> > I agree it's not ideal, but nothing is ideal at this point.  Let's not
> > to rule it out prematurely.  I do feel that our current approach is not
> > the best.  It will continue to create problems for us until we fix it.
> 
> I am sure that we could do better. I just think that forcibly loading
> modules is opening too huge can of worms. Basically all other
> approaches have more limited or better defined effects.
> 
> For example, the newly added code that clears the relocations
> is something that can be tested. Behavior of "random" mix of
> loaded modules opens possibilities that have never been
> discovered before.

I'd just ask that you not be so quick to shut down ideas.  Ideas can be
iterated.  If you're sure we can do better, propose something better.
Shooting down ideas without trying to improve them (or find better ones)
doesn't help.

Our late module patching architecture is too fragile to be acceptable.
It's time to find something better.

> > > > - Changing 'atomic replace' to allow patch modules to be per-object.
> > > 
> > > The problem might be how to transition all loaded objects atomically
> > > when the needed code is loaded from different modules.
> > 
> > I'm not sure what you mean.
> > 
> > My idea was that each patch module would be specific to an object, with
> > no inter-module change dependencies.  So when using atomic replace, if
> > the patch module is only targeted to vmlinux, then only vmlinux-targeted
> > patch modules would be replaced.
> > 
> > In other words, 'atomic replace' would be object-specific.
> > 
> > > Alternative would be to support only per-object consitency. But it
> > > might reduce the number of supported scenarios too much. Also it
> > > would make livepatching more error-prone.
> 
> By per-object consistency I mean the same as you with "each patch
> module would be specific to an object, with no inter-module change
> dependencies".
> 
> My concern is that it would prevent semantic changes in a shared code.
> Semantic changes are rare. But changes in shared code are not.
> 
> If we reduce the consistency to per-object consistency. Will the
> consistency still make sense then? We might want to go back to
> trees, I mean immediate mode.

I still don't follow your logic.  Why wouldn't per-object consistency
make sense?  Most patches are per-object anyway.

We just have to make sure not to change exported interfaces.  (But
that's already an issue for our distros anyway because of kABI.)

-- 
Josh

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 2/2] livepatch: Clear relocation targets on a module removal
  2019-08-26 13:44         ` Nicolai Stange
@ 2019-08-26 15:02           ` Josh Poimboeuf
  0 siblings, 0 replies; 44+ messages in thread
From: Josh Poimboeuf @ 2019-08-26 15:02 UTC (permalink / raw)
  To: Nicolai Stange
  Cc: Miroslav Benes, jikos, pmladek, joe.lawrence, live-patching,
	linux-kernel

On Mon, Aug 26, 2019 at 03:44:21PM +0200, Nicolai Stange wrote:
> Josh Poimboeuf <jpoimboe@redhat.com> writes:
> 
> > On Wed, Aug 14, 2019 at 01:06:09PM +0200, Miroslav Benes wrote:
> >> > Really, we should be going in the opposite direction, by creating module
> >> > dependencies, like all other kernel modules do, ensuring that a module
> >> > is loaded *before* we patch it.  That would also eliminate this bug.
> >> 
> >> Yes, but it is not ideal either with cumulative one-fixes-all patch 
> >> modules. It would load also modules which are not necessary for a 
> >> customer and I know that at least some customers care about this. They 
> >> want to deploy only things which are crucial for their systems.
> 
> Security concerns set aside, some of the patched modules might get
> distributed separately from the main kernel through some sort of
> kernel-*-extra packages and thus, not be found on some target system
> at all. Or they might have been blacklisted.

True.

> > If you frame the question as "do you want to destabilize the live
> > patching infrastucture" then the answer might be different.
> >
> > We should look at whether it makes sense to destabilize live patching
> > for everybody, for a small minority of people who care about a small
> > minority of edge cases.
> >
> > Or maybe there's some other solution we haven't thought about, which
> > fits more in the framework of how kernel modules already work.
> >
> >> We could split patch modules as you proposed in the past, but that have 
> >> issues as well.
> >
> > Right, I'm not really crazy about that solution either.
> >
> > Here's another idea: per-object patch modules.  Patches to vmlinux are
> > in a vmlinux patch module.  Patches to kvm.ko are in a kvm patch module.
> > That would require:
> >
> > - Careful management of dependencies between object-specific patches.
> >   Maybe that just means that exported function ABIs shouldn't change.
> >
> > - Some kind of hooking into modprobe to ensure the patch module gets
> >   loaded with the real one.
> >
> > - Changing 'atomic replace' to allow patch modules to be per-object.
> >
> 
> Perhaps I'm misunderstanding, but supporting only per-object livepatch
> modules would make livepatch creation for something like commit
> 15fab63e1e57 ("fs: prevent page refcount overflow in pipe_buf_get"),
> CVE-2019-11487 really cumbersome (see the fuse part)?

Just don't change exported interfaces.

In this case you could leave generic_pipe_buf_get() alone and then
instead add a generic_pipe_buf_get__patched() which is called by the
patched fuse module.  If you build the fuse-specific livepatch module
right, it would automatically have a dependency on the vmlinux-specific
livepatch module.

> I think I've seen similar interdependencies between e.g. kvm.ko <->
> kvm_intel.ko, but can't find an example right now.

-- 
Josh

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 2/2] livepatch: Clear relocation targets on a module removal
  2019-08-26 14:54               ` Josh Poimboeuf
@ 2019-08-27 15:05                 ` Joe Lawrence
  2019-08-27 15:37                   ` Josh Poimboeuf
  2019-09-02 16:13                 ` Miroslav Benes
  1 sibling, 1 reply; 44+ messages in thread
From: Joe Lawrence @ 2019-08-27 15:05 UTC (permalink / raw)
  To: Josh Poimboeuf, Petr Mladek
  Cc: jikos, Miroslav Benes, linux-kernel, live-patching

On 8/26/19 10:54 AM, Josh Poimboeuf wrote:
> On Fri, Aug 23, 2019 at 10:13:06AM +0200, Petr Mladek wrote:
>> On Thu 2019-08-22 17:36:49, Josh Poimboeuf wrote:
>>> On Fri, Aug 16, 2019 at 11:46:08AM +0200, Petr Mladek wrote:
>>>> On Wed 2019-08-14 10:12:44, Josh Poimboeuf wrote:
>>>>> On Wed, Aug 14, 2019 at 01:06:09PM +0200, Miroslav Benes wrote:
>>>>>>> Really, we should be going in the opposite direction, by creating module
>>>>>>> dependencies, like all other kernel modules do, ensuring that a module
>>>>>>> is loaded *before* we patch it.  That would also eliminate this bug.
>>>>>
>>>>> We should look at whether it makes sense to destabilize live patching
>>>>> for everybody, for a small minority of people who care about a small
>>>>> minority of edge cases.
>>>>
>>>> I do not see it that simple. Forcing livepatched modules to be
>>>> loaded would mean loading "random" new modules when updating
>>>> livepatches:
>>>
>>> I don't want to start a long debate on this, because this idea isn't
>>> even my first choice.  But we shouldn't dismiss it outright.
>>
>> I am glad to hear that this is not your first choice.
>>
>>
>>>>    + It means more actions and higher risk to destabilize
>>>>      the system. Different modules have different quality.
>>>
>>> Maybe the distro shouldn't ship modules which would destabilize the
>>> system.
>>
>> Is this realistic? Even the best QA could not check all scenarios.
>> My point is that the more actions we do the bigger the risk is.
> 
> Sure, it introduces risk.  But we have to compare that risk (which only
> affects rare edge cases) with the ones introduced by the late module
> patching code.  I get the feeling that "late module patching" introduces
> risk to a broader range of use cases than "occasional loading of unused
> modules".
> 
> The latter risk could be minimized by introducing a disabled state for
> modules - load it in memory, but don't expose it to users until
> explicitly loaded.  Just a brainstormed idea; not sure whether it would
> work in practice.
> 

Interesting idea.  We would need to ensure consistency between the 
loaded-but-not-enabled module and the version on disk.  Does module init 
run when it's enabled?  Etc.

<blue sky ideas>

What about folding this the other way?  ie, if a livepatch targets 
unloaded module foo, loaded module bar, and vmlinux ... it effectively 
patches bar and vmlinux, but the foo changes are dropped. 
Responsibility is placed on the admin to install an updated foo before 
loading it (in which case, livepatching core will again ignore foo.)

Building on this idea, perhaps loading that livepatch would also 
blacklist specific, known vulnerable (unloaded) module versions.  If the 
admin tries to load one, a debug msg is generated explaining why it 
can't be loaded by default.

</blue sky ideas>

>>>>    + It might open more security holes that are not fixed by
>>>>      the livepatch.
>>>
>>> Following the same line of thinking, the livepatch infrastructure might
>>> open security holes because of the inherent complexity of late module
>>> patching.
>>
>> Could you be more specific, please?
>> Has there been any known security hole in the late module
>> livepatching code?
> 
> Just off the top of my head, I can think of two recent bugs which can be
> blamed on late module patching:
> 
> 1) There was a RHEL-only bug which caused arch_klp_init_object_loaded()
>     to not be loaded.  This resulted in a panic when certain patched code
>     was executed.
> 
> 2) arch_klp_init_object_loaded() currently doesn't have any jump label
>     specific code.  This has recently caused panics for patched code
>     which relies on static keys.  The workaround is to not use jump
>     labels in patched code.  The real fix is to add support for them in
>     arch_klp_init_object_loaded().
> 
> I can easily foresee more problems like those in the future.  Going
> forward we have to always keep track of which special sections are
> needed for which architectures.  Those special sections can change over
> time, or can simply be overlooked for a given architecture.  It's
> fragile.

FWIW, the static keys case is more involved than simple deferred 
relocations -- those keys are added to lists and then the static key 
code futzes with them when it needs to update code sites.  That means 
the code managing the data structures, kernel/jump_label.c, will need to 
understand livepatch's strangely loaded-but-not-initialized variants.

I don't think the other special sections will require such invasive 
changes, but it's something to keep in mind with respect to late module 
patching.

-- Joe

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 2/2] livepatch: Clear relocation targets on a module removal
  2019-08-27 15:05                 ` Joe Lawrence
@ 2019-08-27 15:37                   ` Josh Poimboeuf
  0 siblings, 0 replies; 44+ messages in thread
From: Josh Poimboeuf @ 2019-08-27 15:37 UTC (permalink / raw)
  To: Joe Lawrence
  Cc: Petr Mladek, jikos, Miroslav Benes, linux-kernel, live-patching

On Tue, Aug 27, 2019 at 11:05:51AM -0400, Joe Lawrence wrote:
> > Sure, it introduces risk.  But we have to compare that risk (which only
> > affects rare edge cases) with the ones introduced by the late module
> > patching code.  I get the feeling that "late module patching" introduces
> > risk to a broader range of use cases than "occasional loading of unused
> > modules".
> > 
> > The latter risk could be minimized by introducing a disabled state for
> > modules - load it in memory, but don't expose it to users until
> > explicitly loaded.  Just a brainstormed idea; not sure whether it would
> > work in practice.
> > 
> 
> Interesting idea.  We would need to ensure consistency between the
> loaded-but-not-enabled module and the version on disk.  Does module init run
> when it's enabled?  Etc.

I don't think there can be a mismatch unless somebody is mucking with
the .ko files directly -- and anyway that would already be a problem
today, because the patch module already assumes that the runtime version
of the module matches what the patch module was built against.

> <blue sky ideas>
> 
> What about folding this the other way?  ie, if a livepatch targets unloaded
> module foo, loaded module bar, and vmlinux ... it effectively patches bar
> and vmlinux, but the foo changes are dropped. Responsibility is placed on
> the admin to install an updated foo before loading it (in which case,
> livepatching core will again ignore foo.)
> 
> Building on this idea, perhaps loading that livepatch would also blacklist
> specific, known vulnerable (unloaded) module versions.  If the admin tries
> to load one, a debug msg is generated explaining why it can't be loaded by
> default.
> 
> </blue sky ideas>

I like this.

One potential tweak: the updated modules could be delivered with the
patch module, and either replaced on disk or otherwise hooked into
modprobe.

> > > > >    + It might open more security holes that are not fixed by
> > > > >      the livepatch.
> > > > 
> > > > Following the same line of thinking, the livepatch infrastructure might
> > > > open security holes because of the inherent complexity of late module
> > > > patching.
> > > 
> > > Could you be more specific, please?
> > > Has there been any known security hole in the late module
> > > livepatching code?
> > 
> > Just off the top of my head, I can think of two recent bugs which can be
> > blamed on late module patching:
> > 
> > 1) There was a RHEL-only bug which caused arch_klp_init_object_loaded()
> >     to not be loaded.  This resulted in a panic when certain patched code
> >     was executed.
> > 
> > 2) arch_klp_init_object_loaded() currently doesn't have any jump label
> >     specific code.  This has recently caused panics for patched code
> >     which relies on static keys.  The workaround is to not use jump
> >     labels in patched code.  The real fix is to add support for them in
> >     arch_klp_init_object_loaded().
> > 
> > I can easily foresee more problems like those in the future.  Going
> > forward we have to always keep track of which special sections are
> > needed for which architectures.  Those special sections can change over
> > time, or can simply be overlooked for a given architecture.  It's
> > fragile.
> 
> FWIW, the static keys case is more involved than simple deferred relocations
> -- those keys are added to lists and then the static key code futzes with
> them when it needs to update code sites.  That means the code managing the
> data structures, kernel/jump_label.c, will need to understand livepatch's
> strangely loaded-but-not-initialized variants.
> 
> I don't think the other special sections will require such invasive changes,
> but it's something to keep in mind with respect to late module patching.

Maybe it could be implemented in a way that such differences are
transparent (insert lots of hand-waving).

So as far as I can tell, we currently have three feasible options:

1) drop unloaded module changes (and blacklist the old .ko and/or replace it)
2) use per-object patches (with no exported function changes)
3) half-load unloaded modules so we can patch them

I think I like #1, if we could figure out a simple way to do it.

-- 
Josh

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 2/2] livepatch: Clear relocation targets on a module removal
  2019-08-26 14:54               ` Josh Poimboeuf
  2019-08-27 15:05                 ` Joe Lawrence
@ 2019-09-02 16:13                 ` Miroslav Benes
  2019-09-02 17:05                   ` Joe Lawrence
  1 sibling, 1 reply; 44+ messages in thread
From: Miroslav Benes @ 2019-09-02 16:13 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Petr Mladek, jikos, joe.lawrence, linux-kernel, live-patching

> I can easily foresee more problems like those in the future.  Going
> forward we have to always keep track of which special sections are
> needed for which architectures.  Those special sections can change over
> time, or can simply be overlooked for a given architecture.  It's
> fragile.

Indeed. It bothers me a lot. Even x86 "port" is not feature complete in 
this regard (jump labels, alternatives,...) and who knows what lurks in 
the corners of the other architectures we support.

So it is in itself reason enough to do something about late module 
patching.

Miroslav

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 2/2] livepatch: Clear relocation targets on a module removal
  2019-09-02 16:13                 ` Miroslav Benes
@ 2019-09-02 17:05                   ` Joe Lawrence
  2019-09-03 13:02                     ` Miroslav Benes
  0 siblings, 1 reply; 44+ messages in thread
From: Joe Lawrence @ 2019-09-02 17:05 UTC (permalink / raw)
  To: Miroslav Benes, Josh Poimboeuf
  Cc: Petr Mladek, jikos, linux-kernel, live-patching

On 9/2/19 12:13 PM, Miroslav Benes wrote:
>> I can easily foresee more problems like those in the future.  Going
>> forward we have to always keep track of which special sections are
>> needed for which architectures.  Those special sections can change over
>> time, or can simply be overlooked for a given architecture.  It's
>> fragile.
> 
> Indeed. It bothers me a lot. Even x86 "port" is not feature complete in
> this regard (jump labels, alternatives,...) and who knows what lurks in
> the corners of the other architectures we support.
> 
> So it is in itself reason enough to do something about late module
> patching.
> 

Hi Miroslav,

I was tinkering with the "blue-sky" ideas that I mentioned to Josh the 
other day.  I dunno if you had a chance to look at what removing that 
code looks like, but I can continue to flesh out that idea if it looks 
interesting:

   https://github.com/joe-lawrence/linux/tree/blue-sky

A full demo would require packaging up replacement .ko's with a 
livepatch, as well as "blacklisting" those deprecated .kos, etc.  But 
that's all I had time to cook up last week before our holiday weekend here.

Regards,

-- Joe

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 2/2] livepatch: Clear relocation targets on a module removal
  2019-09-02 17:05                   ` Joe Lawrence
@ 2019-09-03 13:02                     ` Miroslav Benes
  2019-09-04  8:49                       ` Petr Mladek
  2019-09-05  2:32                       ` Josh Poimboeuf
  0 siblings, 2 replies; 44+ messages in thread
From: Miroslav Benes @ 2019-09-03 13:02 UTC (permalink / raw)
  To: Joe Lawrence
  Cc: Josh Poimboeuf, Petr Mladek, jikos, linux-kernel, live-patching

On Mon, 2 Sep 2019, Joe Lawrence wrote:

> On 9/2/19 12:13 PM, Miroslav Benes wrote:
> >> I can easily foresee more problems like those in the future.  Going
> >> forward we have to always keep track of which special sections are
> >> needed for which architectures.  Those special sections can change over
> >> time, or can simply be overlooked for a given architecture.  It's
> >> fragile.
> > 
> > Indeed. It bothers me a lot. Even x86 "port" is not feature complete in
> > this regard (jump labels, alternatives,...) and who knows what lurks in
> > the corners of the other architectures we support.
> > 
> > So it is in itself reason enough to do something about late module
> > patching.
> > 
> 
> Hi Miroslav,
> 
> I was tinkering with the "blue-sky" ideas that I mentioned to Josh the other
> day.

> I dunno if you had a chance to look at what removing that code looks
> like, but I can continue to flesh out that idea if it looks interesting:

Unfortunately no and I don't think I'll come up with something useful 
before LPC, so anything is really welcome.

> 
>   https://github.com/joe-lawrence/linux/tree/blue-sky
> 
> A full demo would require packaging up replacement .ko's with a livepatch, as
> well as "blacklisting" those deprecated .kos, etc.  But that's all I had time
> to cook up last week before our holiday weekend here.

Frankly, I'm not sure about this approach. I'm kind of torn. The current 
solution is far from ideal, but I'm not excited about the other options 
either. It seems like the choice is basically between "general but 
technically complicated fragile solution with nontrivial maintenance 
burden", or "something safer and maybe cleaner, but limiting for 
users/distros". Of course it depends on whether the limitation is even 
real and how big it is. Unfortunately we cannot quantify it much and that 
is probably why our opinions (in the email thread) differ.

Not much constructive email, but I have to think about it some more 
(before LPC).

Regards
Miroslav

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 2/2] livepatch: Clear relocation targets on a module removal
  2019-09-03 13:02                     ` Miroslav Benes
@ 2019-09-04  8:49                       ` Petr Mladek
  2019-09-04 16:26                         ` Joe Lawrence
                                           ` (2 more replies)
  2019-09-05  2:32                       ` Josh Poimboeuf
  1 sibling, 3 replies; 44+ messages in thread
From: Petr Mladek @ 2019-09-04  8:49 UTC (permalink / raw)
  To: Miroslav Benes
  Cc: Joe Lawrence, jikos, Josh Poimboeuf, linux-kernel, live-patching

On Tue 2019-09-03 15:02:34, Miroslav Benes wrote:
> On Mon, 2 Sep 2019, Joe Lawrence wrote:
> 
> > On 9/2/19 12:13 PM, Miroslav Benes wrote:
> > >> I can easily foresee more problems like those in the future.  Going
> > >> forward we have to always keep track of which special sections are
> > >> needed for which architectures.  Those special sections can change over
> > >> time, or can simply be overlooked for a given architecture.  It's
> > >> fragile.
> > > 
> > > Indeed. It bothers me a lot. Even x86 "port" is not feature complete in
> > > this regard (jump labels, alternatives,...) and who knows what lurks in
> > > the corners of the other architectures we support.
> > > 
> > > So it is in itself reason enough to do something about late module
> > > patching.
> > > 
> > 
> > Hi Miroslav,
> > 
> > I was tinkering with the "blue-sky" ideas that I mentioned to Josh the other
> > day.
> 
> > I dunno if you had a chance to look at what removing that code looks
> > like, but I can continue to flesh out that idea if it looks interesting:
> 
> Unfortunately no and I don't think I'll come up with something useful 
> before LPC, so anything is really welcome.
> 
> > 
> >   https://github.com/joe-lawrence/linux/tree/blue-sky
> > 
> > A full demo would require packaging up replacement .ko's with a livepatch, as
> > well as "blacklisting" those deprecated .kos, etc.  But that's all I had time
> > to cook up last week before our holiday weekend here.
> 
> Frankly, I'm not sure about this approach. I'm kind of torn. The current 
> solution is far from ideal, but I'm not excited about the other options 
> either. It seems like the choice is basically between "general but 
> technically complicated fragile solution with nontrivial maintenance 
> burden", or "something safer and maybe cleaner, but limiting for 
> users/distros". Of course it depends on whether the limitation is even 
> real and how big it is. Unfortunately we cannot quantify it much and that 
> is probably why our opinions (in the email thread) differ.

I wonder what is necessary for a productive discussion on Plumbers:

  + Josh would like to see what code can get removed when late
    handling of modules gets removed. I think that it might be
    partially visible from Joe's blue-sky patches.


  + I would like to better understand the scope of the current
    problems. It is about modifying code in the livepatch that
    depends on position of the related code:

      + relocations are rather clear; we will need them anyway
	to access non-public (static) API from the original code.

      + What are the other changes?

      + Do we use them in livepatches? How often?

      + How often new problematic features appear?

      + Would be possible to detect potential problems, for example
	by comparing the code in the binary and in memory when
	the module is loaded the normal way?

      + Would be possible to reset the livepatch code in memory
	when the related module is unloaded and safe us half
	of the troubles?


    + It might be useful to prepare overview of the existing proposals
      and agree on the positives and negatives. I am afraid that some
      of them might depend on the customer base and
      use cases. Sometimes we might not have enough information.
      But it might be good to get on the same page where possible.

      Anyway, it might rule out some variants so that we could better
      concentrate on the acceptable ones. Or come with yet another
      proposal that would avoid the real blockers.


Any other ideas?

Would it be better to discuss this in a separate room with
a whiteboard or paperboard?

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 2/2] livepatch: Clear relocation targets on a module removal
  2019-09-04  8:49                       ` Petr Mladek
@ 2019-09-04 16:26                         ` Joe Lawrence
  2019-09-05  2:50                         ` Josh Poimboeuf
  2019-09-05 11:52                         ` Miroslav Benes
  2 siblings, 0 replies; 44+ messages in thread
From: Joe Lawrence @ 2019-09-04 16:26 UTC (permalink / raw)
  To: Petr Mladek, Miroslav Benes
  Cc: jikos, Josh Poimboeuf, linux-kernel, live-patching

On 9/4/19 4:49 AM, Petr Mladek wrote:
> On Tue 2019-09-03 15:02:34, Miroslav Benes wrote:
>> On Mon, 2 Sep 2019, Joe Lawrence wrote:
>>
>>> On 9/2/19 12:13 PM, Miroslav Benes wrote:
>>>>> I can easily foresee more problems like those in the future.  Going
>>>>> forward we have to always keep track of which special sections are
>>>>> needed for which architectures.  Those special sections can change over
>>>>> time, or can simply be overlooked for a given architecture.  It's
>>>>> fragile.
>>>>
>>>> Indeed. It bothers me a lot. Even x86 "port" is not feature complete in
>>>> this regard (jump labels, alternatives,...) and who knows what lurks in
>>>> the corners of the other architectures we support.
>>>>
>>>> So it is in itself reason enough to do something about late module
>>>> patching.
>>>>
>>>
>>> Hi Miroslav,
>>>
>>> I was tinkering with the "blue-sky" ideas that I mentioned to Josh the other
>>> day.
>>
>>> I dunno if you had a chance to look at what removing that code looks
>>> like, but I can continue to flesh out that idea if it looks interesting:
>>
>> Unfortunately no and I don't think I'll come up with something useful
>> before LPC, so anything is really welcome.
>>
>>>
>>>    https://github.com/joe-lawrence/linux/tree/blue-sky
>>>
>>> A full demo would require packaging up replacement .ko's with a livepatch, as
>>> well as "blacklisting" those deprecated .kos, etc.  But that's all I had time
>>> to cook up last week before our holiday weekend here.
>>
>> Frankly, I'm not sure about this approach. I'm kind of torn. The current
>> solution is far from ideal, but I'm not excited about the other options
>> either. It seems like the choice is basically between "general but
>> technically complicated fragile solution with nontrivial maintenance
>> burden", or "something safer and maybe cleaner, but limiting for
>> users/distros". Of course it depends on whether the limitation is even
>> real and how big it is. Unfortunately we cannot quantify it much and that
>> is probably why our opinions (in the email thread) differ.
> 
> I wonder what is necessary for a productive discussion on Plumbers:
> 

Pre-planning this part of the miniconf is a great idea.

>    + Josh would like to see what code can get removed when late
>      handling of modules gets removed. I think that it might be
>      partially visible from Joe's blue-sky patches.
> 
> 
>    + I would like to better understand the scope of the current
>      problems. It is about modifying code in the livepatch that
>      depends on position of the related code:
> 
>        + relocations are rather clear; we will need them anyway
> 	to access non-public (static) API from the original code.
> 
>        + What are the other changes?
> 
>        + Do we use them in livepatches? How often?
> 
>        + How often new problematic features appear?
> 
>        + Would be possible to detect potential problems, for example
> 	by comparing the code in the binary and in memory when
> 	the module is loaded the normal way?
> 
>        + Would be possible to reset the livepatch code in memory
> 	when the related module is unloaded and safe us half
> 	of the troubles?
> 
> 
>      + It might be useful to prepare overview of the existing proposals
>        and agree on the positives and negatives. I am afraid that some
>        of them might depend on the customer base and
>        use cases. Sometimes we might not have enough information.
>        But it might be good to get on the same page where possible.
> 
>        Anyway, it might rule out some variants so that we could better
>        concentrate on the acceptable ones. Or come with yet another
>        proposal that would avoid the real blockers.
> 
> 
> Any other ideas?

I'll just add to your list that late module patching introduces 
complexity for klp-convert / livepatch style relocation support. 
Without worrying about unloaded modules, I *think* klp-convert might 
already be able to handle relocations in special sections (altinsts, 
parainst, etc.).

I've put the current klp-convert patchset on top of the blue-sky branch 
to see if this indeed the case, but I'm not sure if I'll get through 
that experiment before LPC.

> 
> Would it be better to discuss this in a separate room with
> a whiteboard or paperboard?
> 

Whiteboard would probably be ideal, but paper would work and be more 
transportable than the former.

-- Joe

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 2/2] livepatch: Clear relocation targets on a module removal
  2019-09-03 13:02                     ` Miroslav Benes
  2019-09-04  8:49                       ` Petr Mladek
@ 2019-09-05  2:32                       ` Josh Poimboeuf
  2019-09-05 12:16                         ` Miroslav Benes
  1 sibling, 1 reply; 44+ messages in thread
From: Josh Poimboeuf @ 2019-09-05  2:32 UTC (permalink / raw)
  To: Miroslav Benes
  Cc: Joe Lawrence, Petr Mladek, jikos, linux-kernel, live-patching

On Tue, Sep 03, 2019 at 03:02:34PM +0200, Miroslav Benes wrote:
> On Mon, 2 Sep 2019, Joe Lawrence wrote:
> 
> > On 9/2/19 12:13 PM, Miroslav Benes wrote:
> > >> I can easily foresee more problems like those in the future.  Going
> > >> forward we have to always keep track of which special sections are
> > >> needed for which architectures.  Those special sections can change over
> > >> time, or can simply be overlooked for a given architecture.  It's
> > >> fragile.
> > > 
> > > Indeed. It bothers me a lot. Even x86 "port" is not feature complete in
> > > this regard (jump labels, alternatives,...) and who knows what lurks in
> > > the corners of the other architectures we support.
> > > 
> > > So it is in itself reason enough to do something about late module
> > > patching.
> > > 
> > 
> > Hi Miroslav,
> > 
> > I was tinkering with the "blue-sky" ideas that I mentioned to Josh the other
> > day.
> 
> > I dunno if you had a chance to look at what removing that code looks
> > like, but I can continue to flesh out that idea if it looks interesting:
> 
> Unfortunately no and I don't think I'll come up with something useful 
> before LPC, so anything is really welcome.
> 
> > 
> >   https://github.com/joe-lawrence/linux/tree/blue-sky

I like this a lot.

> > A full demo would require packaging up replacement .ko's with a livepatch, as
> > well as "blacklisting" those deprecated .kos, etc.  But that's all I had time
> > to cook up last week before our holiday weekend here.
> 
> Frankly, I'm not sure about this approach. I'm kind of torn. The current 
> solution is far from ideal, but I'm not excited about the other options 
> either. It seems like the choice is basically between "general but 
> technically complicated fragile solution with nontrivial maintenance 
> burden", or "something safer and maybe cleaner, but limiting for 
> users/distros". Of course it depends on whether the limitation is even 
> real and how big it is. Unfortunately we cannot quantify it much and that 
> is probably why our opinions (in the email thread) differ.

How would this option be "limiting for users/distros"?  If the packaging
part of the solution is done correctly then I don't see how it would be
limiting.

-- 
Josh

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 2/2] livepatch: Clear relocation targets on a module removal
  2019-09-04  8:49                       ` Petr Mladek
  2019-09-04 16:26                         ` Joe Lawrence
@ 2019-09-05  2:50                         ` Josh Poimboeuf
  2019-09-05 11:09                           ` Petr Mladek
  2019-09-05 12:03                           ` Miroslav Benes
  2019-09-05 11:52                         ` Miroslav Benes
  2 siblings, 2 replies; 44+ messages in thread
From: Josh Poimboeuf @ 2019-09-05  2:50 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Miroslav Benes, Joe Lawrence, jikos, linux-kernel, live-patching

On Wed, Sep 04, 2019 at 10:49:32AM +0200, Petr Mladek wrote:
> On Tue 2019-09-03 15:02:34, Miroslav Benes wrote:
> > On Mon, 2 Sep 2019, Joe Lawrence wrote:
> > 
> > > On 9/2/19 12:13 PM, Miroslav Benes wrote:
> > > >> I can easily foresee more problems like those in the future.  Going
> > > >> forward we have to always keep track of which special sections are
> > > >> needed for which architectures.  Those special sections can change over
> > > >> time, or can simply be overlooked for a given architecture.  It's
> > > >> fragile.
> > > > 
> > > > Indeed. It bothers me a lot. Even x86 "port" is not feature complete in
> > > > this regard (jump labels, alternatives,...) and who knows what lurks in
> > > > the corners of the other architectures we support.
> > > > 
> > > > So it is in itself reason enough to do something about late module
> > > > patching.
> > > > 
> > > 
> > > Hi Miroslav,
> > > 
> > > I was tinkering with the "blue-sky" ideas that I mentioned to Josh the other
> > > day.
> > 
> > > I dunno if you had a chance to look at what removing that code looks
> > > like, but I can continue to flesh out that idea if it looks interesting:
> > 
> > Unfortunately no and I don't think I'll come up with something useful 
> > before LPC, so anything is really welcome.
> > 
> > > 
> > >   https://github.com/joe-lawrence/linux/tree/blue-sky
> > > 
> > > A full demo would require packaging up replacement .ko's with a livepatch, as
> > > well as "blacklisting" those deprecated .kos, etc.  But that's all I had time
> > > to cook up last week before our holiday weekend here.
> > 
> > Frankly, I'm not sure about this approach. I'm kind of torn. The current 
> > solution is far from ideal, but I'm not excited about the other options 
> > either. It seems like the choice is basically between "general but 
> > technically complicated fragile solution with nontrivial maintenance 
> > burden", or "something safer and maybe cleaner, but limiting for 
> > users/distros". Of course it depends on whether the limitation is even 
> > real and how big it is. Unfortunately we cannot quantify it much and that 
> > is probably why our opinions (in the email thread) differ.
> 
> I wonder what is necessary for a productive discussion on Plumbers:
> 
>   + Josh would like to see what code can get removed when late
>     handling of modules gets removed. I think that it might be
>     partially visible from Joe's blue-sky patches.

Yes, and I like what I see.  Especially the removal of the .klp.arch
nastiness!

>   + I would like to better understand the scope of the current
>     problems. It is about modifying code in the livepatch that
>     depends on position of the related code:
> 
>       + relocations are rather clear; we will need them anyway
> 	to access non-public (static) API from the original code.
> 
>       + What are the other changes?

I think the .klp.arch sections are the big ones:

  .klp.arch.altinstructions
  .klp.arch.parainstructions
  .klp.arch.jump_labels (doesn't exist yet)

And that's just x86...

And then of course there's the klp coming/going notifiers which have
also been an additional source of complexity.

>       + Do we use them in livepatches? How often?

I don't have a number, but it's very common to patch a function which
uses jump labels or alternatives.

>       + How often new problematic features appear?

I'm not exactly sure what you mean, but it seems that anytime we add a
new feature, we have to try to wrap our heads around how it interacts
with the weirdness of late module patching.

>       + Would be possible to detect potential problems, for example
> 	by comparing the code in the binary and in memory when
> 	the module is loaded the normal way?

Perhaps, though I assume this would be some out-of-band testing thing.

>       + Would be possible to reset the livepatch code in memory
> 	when the related module is unloaded and safe us half
> 	of the troubles?

Maybe, but I think that would solve a much lower percentage of our
troubles than half :-/

>     + It might be useful to prepare overview of the existing proposals
>       and agree on the positives and negatives. I am afraid that some
>       of them might depend on the customer base and
>       use cases. Sometimes we might not have enough information.
>       But it might be good to get on the same page where possible.

I think we've already done that for the existing proposals.  Maybe
Miroslav can summarize them at the LPC session.

>       Anyway, it might rule out some variants so that we could better
>       concentrate on the acceptable ones. Or come with yet another
>       proposal that would avoid the real blockers.

I'd like to hear more specific negatives about Joe's recent patches,
which IMO, are the best option we've discussed so far.

-- 
Josh

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 2/2] livepatch: Clear relocation targets on a module removal
  2019-09-05  2:50                         ` Josh Poimboeuf
@ 2019-09-05 11:09                           ` Petr Mladek
  2019-09-05 11:19                             ` Jiri Kosina
                                               ` (2 more replies)
  2019-09-05 12:03                           ` Miroslav Benes
  1 sibling, 3 replies; 44+ messages in thread
From: Petr Mladek @ 2019-09-05 11:09 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: jikos, Joe Lawrence, Miroslav Benes, linux-kernel, live-patching

On Wed 2019-09-04 21:50:55, Josh Poimboeuf wrote:
> On Wed, Sep 04, 2019 at 10:49:32AM +0200, Petr Mladek wrote:
> > I wonder what is necessary for a productive discussion on Plumbers:
> > 
> >   + Josh would like to see what code can get removed when late
> >     handling of modules gets removed. I think that it might be
> >     partially visible from Joe's blue-sky patches.
> 
> Yes, and I like what I see.  Especially the removal of the .klp.arch
> nastiness!

Could we get rid of it?

Is there any other way to get access to static variables
and functions from the livepatched code?

> I think the .klp.arch sections are the big ones:
> 
>   .klp.arch.altinstructions
>   .klp.arch.parainstructions
>   .klp.arch.jump_labels (doesn't exist yet)
> 
> And that's just x86...
> 
> And then of course there's the klp coming/going notifiers which have
> also been an additional source of complexity.
> 
> >       + Do we use them in livepatches? How often?
> 
> I don't have a number, but it's very common to patch a function which
> uses jump labels or alternatives.

Really? My impression is that both alternatives and jump_labels
are used in hot paths. I would expect them mostly in core code
that is always loaded.

Alternatives are often used in assembly that we are not able
to livepatch anyway.

Or are they spread widely via some macros or inlined functions?


> >       + How often new problematic features appear?
> 
> I'm not exactly sure what you mean, but it seems that anytime we add a
> new feature, we have to try to wrap our heads around how it interacts
> with the weirdness of late module patching.

I agree that we need to think about it and it makes complications.
Anyway, I think that these are never the biggest problems.

I would be more concerned about arch-specific features that might need
special handling in the livepatch code. Everyone talks only about
alternatives and jump_labels that were added long time ago.


> >       Anyway, it might rule out some variants so that we could better
> >       concentrate on the acceptable ones. Or come with yet another
> >       proposal that would avoid the real blockers.
> 
> I'd like to hear more specific negatives about Joe's recent patches,
> which IMO, are the best option we've discussed so far.

I discussed this approach with our project manager. He was not much
excited about this solution. His first idea was that it would block
attaching USB devices. They are used by admins when taking care of
the servers. And there might be other scenarios where a new module
might need loading to solve some situation.

Customers understand Livepatching as a way how to secure system
without immediate reboot and with minimal (invisible) effect
on the workload. They might get pretty surprised when the system
suddenly blocks their "normal" workflow.

As Miroslav said. No solution is perfect. We need to find the most
acceptable compromise. It seems that you are more concerned about
saving code, reducing complexity and risk. I am more concerned
about user satisfaction.

It is almost impossible to predict effects on user satisfaction
because they have different workflow, use case, expectation,
and tolerance.

We could better estimate the technical side of each solution:

   + implementation cost
   + maintenance cost
   + risks
   + possible improvements and hardening
   + user visible effects
   + complication and limits with creating livepatches


From my POV, the most problematic is the arch-specific code.
It is hard to maintain and we do not have it fully under
control.

And I do not believe that we could remove all arch specific code
when we do not allow delayed livepatching of modules.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 2/2] livepatch: Clear relocation targets on a module removal
  2019-09-05 11:09                           ` Petr Mladek
@ 2019-09-05 11:19                             ` Jiri Kosina
  2019-09-05 13:23                               ` Josh Poimboeuf
  2019-09-05 11:39                             ` Joe Lawrence
  2019-09-05 13:08                             ` Josh Poimboeuf
  2 siblings, 1 reply; 44+ messages in thread
From: Jiri Kosina @ 2019-09-05 11:19 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Josh Poimboeuf, Joe Lawrence, Miroslav Benes, linux-kernel,
	live-patching

On Thu, 5 Sep 2019, Petr Mladek wrote:

> > I don't have a number, but it's very common to patch a function which 
> > uses jump labels or alternatives.
> 
> Really? My impression is that both alternatives and jump_labels
> are used in hot paths. I would expect them mostly in core code
> that is always loaded.
> 
> Alternatives are often used in assembly that we are not able
> to livepatch anyway.
> 
> Or are they spread widely via some macros or inlined functions?

All the indirect jumps are turned into alternatives when retpolines are in 
place.

-- 
Jiri Kosina
SUSE Labs


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 2/2] livepatch: Clear relocation targets on a module removal
  2019-09-05 11:09                           ` Petr Mladek
  2019-09-05 11:19                             ` Jiri Kosina
@ 2019-09-05 11:39                             ` Joe Lawrence
  2019-09-05 13:08                             ` Josh Poimboeuf
  2 siblings, 0 replies; 44+ messages in thread
From: Joe Lawrence @ 2019-09-05 11:39 UTC (permalink / raw)
  To: Petr Mladek, Josh Poimboeuf
  Cc: jikos, Miroslav Benes, linux-kernel, live-patching

On 9/5/19 7:09 AM, Petr Mladek wrote:
> On Wed 2019-09-04 21:50:55, Josh Poimboeuf wrote:
>> On Wed, Sep 04, 2019 at 10:49:32AM +0200, Petr Mladek wrote:
>>> I wonder what is necessary for a productive discussion on Plumbers:
>>>
>>>    + Josh would like to see what code can get removed when late
>>>      handling of modules gets removed. I think that it might be
>>>      partially visible from Joe's blue-sky patches.
>>
>> Yes, and I like what I see.  Especially the removal of the .klp.arch
>> nastiness!
> 
> Could we get rid of it?
> 
> Is there any other way to get access to static variables
> and functions from the livepatched code?
> 

Hi Petr,

I think the question is whether .klp (not-arch specific) relocations 
would be sufficient (without late module patching).  If it would a great 
simplification if those were all we needed.  I'm not 100% sure about 
this quite yet, but am hoping that is the case.

>>>        Anyway, it might rule out some variants so that we could better
>>>        concentrate on the acceptable ones. Or come with yet another
>>>        proposal that would avoid the real blockers.
>>
>> I'd like to hear more specific negatives about Joe's recent patches,
>> which IMO, are the best option we've discussed so far.
> 
> I discussed this approach with our project manager. He was not much
> excited about this solution. His first idea was that it would block
> attaching USB devices. They are used by admins when taking care of
> the servers. And there might be other scenarios where a new module
> might need loading to solve some situation.
> > Customers understand Livepatching as a way how to secure system
> without immediate reboot and with minimal (invisible) effect
> on the workload. They might get pretty surprised when the system > suddenly blocks their "normal" workflow.

FWIW the complete blue-sky idea was that the package delivered to the 
customer (RPM, deb, whatever) would provide:

  - livepatch .ko, blacklists known vulnerable srcversions
  - updated .ko's for the blacklisted modules

The second part would maintain module loading workflow, albeit with a 
new set .ko files.

> As Miroslav said. No solution is perfect. We need to find the most
> acceptable compromise. It seems that you are more concerned about
> saving code, reducing complexity and risk. I am more concerned
> about user satisfaction.
> 
> It is almost impossible to predict effects on user satisfaction
> because they have different workflow, use case, expectation,
> and tolerance.
> 
> We could better estimate the technical side of each solution:
> 
>     + implementation cost
>     + maintenance cost
>     + risks
>     + possible improvements and hardening
>     + user visible effects
>     + complication and limits with creating livepatches
> 
> 
>  From my POV, the most problematic is the arch-specific code.
> It is hard to maintain and we do not have it fully under
> control.
> 
> And I do not believe that we could remove all arch specific code
> when we do not allow delayed livepatching of modules.
> 

No doubt there will probably always be some arch-specific code, and even 
my blue-sky branch didn't move all that much.  But I think the idea 
could be a bigger simplification in terms of the mental model, should 
the solution be acceptable by criteria you mention above.

-- Joe

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 2/2] livepatch: Clear relocation targets on a module removal
  2019-09-04  8:49                       ` Petr Mladek
  2019-09-04 16:26                         ` Joe Lawrence
  2019-09-05  2:50                         ` Josh Poimboeuf
@ 2019-09-05 11:52                         ` Miroslav Benes
  2 siblings, 0 replies; 44+ messages in thread
From: Miroslav Benes @ 2019-09-05 11:52 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Joe Lawrence, jikos, Josh Poimboeuf, linux-kernel, live-patching

[...]

> I wonder what is necessary for a productive discussion on Plumbers:

[...]

>     + It might be useful to prepare overview of the existing proposals
>       and agree on the positives and negatives. I am afraid that some
>       of them might depend on the customer base and
>       use cases. Sometimes we might not have enough information.
>       But it might be good to get on the same page where possible.
> 
>       Anyway, it might rule out some variants so that we could better
>       concentrate on the acceptable ones. Or come with yet another
>       proposal that would avoid the real blockers.

My plan is to describe the problem first for the public in the room. Then 
describe the proposals and their advantages and disadvantages. This should 
start the discussion pretty well.

I would be happy if we managed to settle at least on the requirements for 
a solution. It seems that our experience with users and their use cases 
differ a lot and I doubt we could come up with a good solution without 
stating what we want (and don't want) first.

Silly hope is that there may be someone with a perfect solution in the 
room. After all, it is what conferences are for.

> Would it be better to discuss this in a separate room with
> a whiteboard or paperboard?

Maybe.

Miroslav

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 2/2] livepatch: Clear relocation targets on a module removal
  2019-09-05  2:50                         ` Josh Poimboeuf
  2019-09-05 11:09                           ` Petr Mladek
@ 2019-09-05 12:03                           ` Miroslav Benes
  2019-09-05 12:35                             ` Josh Poimboeuf
  1 sibling, 1 reply; 44+ messages in thread
From: Miroslav Benes @ 2019-09-05 12:03 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Petr Mladek, Joe Lawrence, jikos, linux-kernel, live-patching

On Wed, 4 Sep 2019, Josh Poimboeuf wrote:

> On Wed, Sep 04, 2019 at 10:49:32AM +0200, Petr Mladek wrote:
> > On Tue 2019-09-03 15:02:34, Miroslav Benes wrote:
> > > On Mon, 2 Sep 2019, Joe Lawrence wrote:
> > > 
> > > > On 9/2/19 12:13 PM, Miroslav Benes wrote:
> > > > >> I can easily foresee more problems like those in the future.  Going
> > > > >> forward we have to always keep track of which special sections are
> > > > >> needed for which architectures.  Those special sections can change over
> > > > >> time, or can simply be overlooked for a given architecture.  It's
> > > > >> fragile.
> > > > > 
> > > > > Indeed. It bothers me a lot. Even x86 "port" is not feature complete in
> > > > > this regard (jump labels, alternatives,...) and who knows what lurks in
> > > > > the corners of the other architectures we support.
> > > > > 
> > > > > So it is in itself reason enough to do something about late module
> > > > > patching.
> > > > > 
> > > > 
> > > > Hi Miroslav,
> > > > 
> > > > I was tinkering with the "blue-sky" ideas that I mentioned to Josh the other
> > > > day.
> > > 
> > > > I dunno if you had a chance to look at what removing that code looks
> > > > like, but I can continue to flesh out that idea if it looks interesting:
> > > 
> > > Unfortunately no and I don't think I'll come up with something useful 
> > > before LPC, so anything is really welcome.
> > > 
> > > > 
> > > >   https://github.com/joe-lawrence/linux/tree/blue-sky
> > > > 
> > > > A full demo would require packaging up replacement .ko's with a livepatch, as
> > > > well as "blacklisting" those deprecated .kos, etc.  But that's all I had time
> > > > to cook up last week before our holiday weekend here.
> > > 
> > > Frankly, I'm not sure about this approach. I'm kind of torn. The current 
> > > solution is far from ideal, but I'm not excited about the other options 
> > > either. It seems like the choice is basically between "general but 
> > > technically complicated fragile solution with nontrivial maintenance 
> > > burden", or "something safer and maybe cleaner, but limiting for 
> > > users/distros". Of course it depends on whether the limitation is even 
> > > real and how big it is. Unfortunately we cannot quantify it much and that 
> > > is probably why our opinions (in the email thread) differ.
> > 
> > I wonder what is necessary for a productive discussion on Plumbers:
> > 
> >   + Josh would like to see what code can get removed when late
> >     handling of modules gets removed. I think that it might be
> >     partially visible from Joe's blue-sky patches.
> 
> Yes, and I like what I see.  Especially the removal of the .klp.arch
> nastiness!
> 
> >   + I would like to better understand the scope of the current
> >     problems. It is about modifying code in the livepatch that
> >     depends on position of the related code:
> > 
> >       + relocations are rather clear; we will need them anyway
> > 	to access non-public (static) API from the original code.
> > 
> >       + What are the other changes?
> 
> I think the .klp.arch sections are the big ones:
> 
>   .klp.arch.altinstructions
>   .klp.arch.parainstructions
>   .klp.arch.jump_labels (doesn't exist yet)
> 
> And that's just x86...

I may misunderstand, but we have .klp.arch sections because para and 
alternatives have to be processed after relocations. And if we cannot get 
rid of relocations completely, because of static symbols, then we cannot 
get rid of .klp.arch sections either.
 
> And then of course there's the klp coming/going notifiers which have
> also been an additional source of complexity.

True, but I think we (me and Petr) do not consider it as much of a problem 
as you.

> I'd like to hear more specific negatives about Joe's recent patches,
> which IMO, are the best option we've discussed so far.

I'll reply elsewhere...

Miroslav

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 2/2] livepatch: Clear relocation targets on a module removal
  2019-09-05  2:32                       ` Josh Poimboeuf
@ 2019-09-05 12:16                         ` Miroslav Benes
  2019-09-05 12:54                           ` Josh Poimboeuf
  0 siblings, 1 reply; 44+ messages in thread
From: Miroslav Benes @ 2019-09-05 12:16 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Joe Lawrence, Petr Mladek, jikos, linux-kernel, live-patching

On Wed, 4 Sep 2019, Josh Poimboeuf wrote:

> On Tue, Sep 03, 2019 at 03:02:34PM +0200, Miroslav Benes wrote:
> > On Mon, 2 Sep 2019, Joe Lawrence wrote:
> > 
> > > On 9/2/19 12:13 PM, Miroslav Benes wrote:
> > > >> I can easily foresee more problems like those in the future.  Going
> > > >> forward we have to always keep track of which special sections are
> > > >> needed for which architectures.  Those special sections can change over
> > > >> time, or can simply be overlooked for a given architecture.  It's
> > > >> fragile.
> > > > 
> > > > Indeed. It bothers me a lot. Even x86 "port" is not feature complete in
> > > > this regard (jump labels, alternatives,...) and who knows what lurks in
> > > > the corners of the other architectures we support.
> > > > 
> > > > So it is in itself reason enough to do something about late module
> > > > patching.
> > > > 
> > > 
> > > Hi Miroslav,
> > > 
> > > I was tinkering with the "blue-sky" ideas that I mentioned to Josh the other
> > > day.
> > 
> > > I dunno if you had a chance to look at what removing that code looks
> > > like, but I can continue to flesh out that idea if it looks interesting:
> > 
> > Unfortunately no and I don't think I'll come up with something useful 
> > before LPC, so anything is really welcome.
> > 
> > > 
> > >   https://github.com/joe-lawrence/linux/tree/blue-sky
> 
> I like this a lot.
> 
> > > A full demo would require packaging up replacement .ko's with a livepatch, as
> > > well as "blacklisting" those deprecated .kos, etc.  But that's all I had time
> > > to cook up last week before our holiday weekend here.
> > 
> > Frankly, I'm not sure about this approach. I'm kind of torn. The current 
> > solution is far from ideal, but I'm not excited about the other options 
> > either. It seems like the choice is basically between "general but 
> > technically complicated fragile solution with nontrivial maintenance 
> > burden", or "something safer and maybe cleaner, but limiting for 
> > users/distros". Of course it depends on whether the limitation is even 
> > real and how big it is. Unfortunately we cannot quantify it much and that 
> > is probably why our opinions (in the email thread) differ.
> 
> How would this option be "limiting for users/distros"?  If the packaging
> part of the solution is done correctly then I don't see how it would be
> limiting.

I'll try to explain my worries.

Blacklisting first. Yes, I agree that it would make things a lot simpler, 
but I am afraid it would not fly at SUSE. Petr meanwhile explained 
elsewhere, but I don't think we can limit our customers that much. We 
perceive live patching as a product as much transparent as possible and as 
less intrusive as possible. One thing is to forbid to remove a module, the 
other is to forbid its loading.

We could warn the admin. Something like "there is a fix for a module foo, 
which is not loaded currently. It will not be patched and the system will 
be still vulnerable if you load the module unless a new fixed version is 
provided."

Yes, we can distribute the new version of .ko with a livepatch. What is 
the reason for blacklisting then? I don't probably understand, but either 
a module is loaded and we can patch it (without late module patching), or 
it is not and we could replace .ko on disk.

Now, I don't think that replacing .ko on disk is a good idea. We've 
already discussed it. It would lead to a maintenance/packaging problem, 
because you never know which version of the module is loaded in the 
system. The state space grows rather rapidly there.

But I may be wrong in my understanding, so bear with me.

Miroslav

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 2/2] livepatch: Clear relocation targets on a module removal
  2019-09-05 12:03                           ` Miroslav Benes
@ 2019-09-05 12:35                             ` Josh Poimboeuf
  2019-09-05 12:49                               ` Miroslav Benes
  0 siblings, 1 reply; 44+ messages in thread
From: Josh Poimboeuf @ 2019-09-05 12:35 UTC (permalink / raw)
  To: Miroslav Benes
  Cc: Petr Mladek, Joe Lawrence, jikos, linux-kernel, live-patching

On Thu, Sep 05, 2019 at 02:03:34PM +0200, Miroslav Benes wrote:
> > >   + I would like to better understand the scope of the current
> > >     problems. It is about modifying code in the livepatch that
> > >     depends on position of the related code:
> > > 
> > >       + relocations are rather clear; we will need them anyway
> > > 	to access non-public (static) API from the original code.
> > > 
> > >       + What are the other changes?
> > 
> > I think the .klp.arch sections are the big ones:
> > 
> >   .klp.arch.altinstructions
> >   .klp.arch.parainstructions
> >   .klp.arch.jump_labels (doesn't exist yet)
> > 
> > And that's just x86...
> 
> I may misunderstand, but we have .klp.arch sections because para and 
> alternatives have to be processed after relocations. And if we cannot get 
> rid of relocations completely, because of static symbols, then we cannot 
> get rid of .klp.arch sections either.

With late module patching gone, the module code can just process the klp
relocations at the same time it processes normal relocations.

Then the normal module alt/para/jump_label processing code can be used
instead of arch_klp_init_object_loaded().

Note this also means that Joe's patches can remove copy_module_elf() and
free_module_elf().  And module_arch_freeing_init() in s390.

> > And then of course there's the klp coming/going notifiers which have
> > also been an additional source of complexity.
> 
> True, but I think we (me and Petr) do not consider it as much of a problem 
> as you.

It's less of an issue than .klp.arch and all the related code which can
be removed.

-- 
Josh

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 2/2] livepatch: Clear relocation targets on a module removal
  2019-09-05 12:35                             ` Josh Poimboeuf
@ 2019-09-05 12:49                               ` Miroslav Benes
  0 siblings, 0 replies; 44+ messages in thread
From: Miroslav Benes @ 2019-09-05 12:49 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Petr Mladek, Joe Lawrence, jikos, linux-kernel, live-patching

On Thu, 5 Sep 2019, Josh Poimboeuf wrote:

> On Thu, Sep 05, 2019 at 02:03:34PM +0200, Miroslav Benes wrote:
> > > >   + I would like to better understand the scope of the current
> > > >     problems. It is about modifying code in the livepatch that
> > > >     depends on position of the related code:
> > > > 
> > > >       + relocations are rather clear; we will need them anyway
> > > > 	to access non-public (static) API from the original code.
> > > > 
> > > >       + What are the other changes?
> > > 
> > > I think the .klp.arch sections are the big ones:
> > > 
> > >   .klp.arch.altinstructions
> > >   .klp.arch.parainstructions
> > >   .klp.arch.jump_labels (doesn't exist yet)
> > > 
> > > And that's just x86...
> > 
> > I may misunderstand, but we have .klp.arch sections because para and 
> > alternatives have to be processed after relocations. And if we cannot get 
> > rid of relocations completely, because of static symbols, then we cannot 
> > get rid of .klp.arch sections either.
> 
> With late module patching gone, the module code can just process the klp
> relocations at the same time it processes normal relocations.
> 
> Then the normal module alt/para/jump_label processing code can be used
> instead of arch_klp_init_object_loaded().

Ah, of course. I obviously cannot grasp the idea of not having late module 
patching :)
 
> Note this also means that Joe's patches can remove copy_module_elf() and
> free_module_elf().  And module_arch_freeing_init() in s390.

Correct.

So yes, it would simplify the code a lot. I am still worried about the 
consequences.

> > > And then of course there's the klp coming/going notifiers which have
> > > also been an additional source of complexity.
> > 
> > True, but I think we (me and Petr) do not consider it as much of a problem 
> > as you.
> 
> It's less of an issue than .klp.arch and all the related code which can
> be removed.

Ok.

Miroslav

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 2/2] livepatch: Clear relocation targets on a module removal
  2019-09-05 12:16                         ` Miroslav Benes
@ 2019-09-05 12:54                           ` Josh Poimboeuf
  2019-09-06 12:51                             ` Miroslav Benes
  0 siblings, 1 reply; 44+ messages in thread
From: Josh Poimboeuf @ 2019-09-05 12:54 UTC (permalink / raw)
  To: Miroslav Benes
  Cc: Joe Lawrence, Petr Mladek, jikos, linux-kernel, live-patching

On Thu, Sep 05, 2019 at 02:16:51PM +0200, Miroslav Benes wrote:
> > > > A full demo would require packaging up replacement .ko's with a livepatch, as
> > > > well as "blacklisting" those deprecated .kos, etc.  But that's all I had time
> > > > to cook up last week before our holiday weekend here.
> > > 
> > > Frankly, I'm not sure about this approach. I'm kind of torn. The current 
> > > solution is far from ideal, but I'm not excited about the other options 
> > > either. It seems like the choice is basically between "general but 
> > > technically complicated fragile solution with nontrivial maintenance 
> > > burden", or "something safer and maybe cleaner, but limiting for 
> > > users/distros". Of course it depends on whether the limitation is even 
> > > real and how big it is. Unfortunately we cannot quantify it much and that 
> > > is probably why our opinions (in the email thread) differ.
> > 
> > How would this option be "limiting for users/distros"?  If the packaging
> > part of the solution is done correctly then I don't see how it would be
> > limiting.
> 
> I'll try to explain my worries.
> 
> Blacklisting first. Yes, I agree that it would make things a lot simpler, 
> but I am afraid it would not fly at SUSE. Petr meanwhile explained 
> elsewhere, but I don't think we can limit our customers that much. We 
> perceive live patching as a product as much transparent as possible and as 
> less intrusive as possible. One thing is to forbid to remove a module, the 
> other is to forbid its loading.
> 
> We could warn the admin. Something like "there is a fix for a module foo, 
> which is not loaded currently. It will not be patched and the system will 
> be still vulnerable if you load the module unless a new fixed version is 
> provided."

No.  We just distribute the new .ko with the livepatch.  It should be
transparent to the user.

> Yes, we can distribute the new version of .ko with a livepatch. What is 
> the reason for blacklisting then? I don't probably understand, but either 
> a module is loaded and we can patch it (without late module patching), or 
> it is not and we could replace .ko on disk.

I think the blacklisting is a failsafe to prevent the old module from
accidentally getting loaded after patching.

> Now, I don't think that replacing .ko on disk is a good idea. We've 
> already discussed it. It would lead to a maintenance/packaging problem, 
> because you never know which version of the module is loaded in the 
> system. The state space grows rather rapidly there.

What exactly are your concerns?

Either the old version of the module is loaded, and it's livepatched; or
the new version of the module is loaded, and it's not livepatched.

Anyway that could be reported to the user somehow, e.g. report
srcversion in sysfs.

-- 
Josh

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 2/2] livepatch: Clear relocation targets on a module removal
  2019-09-05 11:09                           ` Petr Mladek
  2019-09-05 11:19                             ` Jiri Kosina
  2019-09-05 11:39                             ` Joe Lawrence
@ 2019-09-05 13:08                             ` Josh Poimboeuf
  2019-09-05 13:15                               ` Josh Poimboeuf
  2 siblings, 1 reply; 44+ messages in thread
From: Josh Poimboeuf @ 2019-09-05 13:08 UTC (permalink / raw)
  To: Petr Mladek
  Cc: jikos, Joe Lawrence, Miroslav Benes, linux-kernel, live-patching

On Thu, Sep 05, 2019 at 01:09:55PM +0200, Petr Mladek wrote:
> > I don't have a number, but it's very common to patch a function which
> > uses jump labels or alternatives.
> 
> Really? My impression is that both alternatives and jump_labels
> are used in hot paths. I would expect them mostly in core code
> that is always loaded.
> 
> Alternatives are often used in assembly that we are not able
> to livepatch anyway.
> 
> Or are they spread widely via some macros or inlined functions?

Jump labels are used everywhere.  Looking at vmlinux.o in my kernel:

  Relocation section [19621] '.rela__jump_table' for section [19620] '__jump_table' at offset 0x197873c8 contains 11913 entries:

Each jump label entry has 3 entries, so 11913/3 = 3971 jump labels.

$ readelf -s vmlinux.o |grep FUNC |wc -l
46902

3971/46902 = ~8.5%

~8.5% of functions use jump labels.

> > >       + How often new problematic features appear?
> > 
> > I'm not exactly sure what you mean, but it seems that anytime we add a
> > new feature, we have to try to wrap our heads around how it interacts
> > with the weirdness of late module patching.
> 
> I agree that we need to think about it and it makes complications.
> Anyway, I think that these are never the biggest problems.
> 
> I would be more concerned about arch-specific features that might need
> special handling in the livepatch code. Everyone talks only about
> alternatives and jump_labels that were added long time ago.

Jump labels have been around for many years, but we somehow missed
implementing klp.arch for them.  As I said this resulted in panics.

There may be other similar cases lurking, both in x86 and other arches.
It's not a comforting thought!

And each case requires special klp code in addition to the real code.

-- 
Josh

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 2/2] livepatch: Clear relocation targets on a module removal
  2019-09-05 13:08                             ` Josh Poimboeuf
@ 2019-09-05 13:15                               ` Josh Poimboeuf
  2019-09-05 13:52                                 ` Petr Mladek
  0 siblings, 1 reply; 44+ messages in thread
From: Josh Poimboeuf @ 2019-09-05 13:15 UTC (permalink / raw)
  To: Petr Mladek
  Cc: jikos, Joe Lawrence, Miroslav Benes, linux-kernel, live-patching

On Thu, Sep 05, 2019 at 08:08:32AM -0500, Josh Poimboeuf wrote:
> On Thu, Sep 05, 2019 at 01:09:55PM +0200, Petr Mladek wrote:
> > > I don't have a number, but it's very common to patch a function which
> > > uses jump labels or alternatives.
> > 
> > Really? My impression is that both alternatives and jump_labels
> > are used in hot paths. I would expect them mostly in core code
> > that is always loaded.
> > 
> > Alternatives are often used in assembly that we are not able
> > to livepatch anyway.
> > 
> > Or are they spread widely via some macros or inlined functions?
> 
> Jump labels are used everywhere.  Looking at vmlinux.o in my kernel:
> 
>   Relocation section [19621] '.rela__jump_table' for section [19620] '__jump_table' at offset 0x197873c8 contains 11913 entries:
> 
> Each jump label entry has 3 entries, so 11913/3 = 3971 jump labels.
> 
> $ readelf -s vmlinux.o |grep FUNC |wc -l
> 46902
> 
> 3971/46902 = ~8.5%
> 
> ~8.5% of functions use jump labels.

Obviously some functions may use more than one jump label so this isn't
exactly bulletproof math.  But it gives a rough idea of how widespread
they are.

> 
> > > >       + How often new problematic features appear?
> > > 
> > > I'm not exactly sure what you mean, but it seems that anytime we add a
> > > new feature, we have to try to wrap our heads around how it interacts
> > > with the weirdness of late module patching.
> > 
> > I agree that we need to think about it and it makes complications.
> > Anyway, I think that these are never the biggest problems.
> > 
> > I would be more concerned about arch-specific features that might need
> > special handling in the livepatch code. Everyone talks only about
> > alternatives and jump_labels that were added long time ago.
> 
> Jump labels have been around for many years, but we somehow missed
> implementing klp.arch for them.  As I said this resulted in panics.
> 
> There may be other similar cases lurking, both in x86 and other arches.
> It's not a comforting thought!
> 
> And each case requires special klp code in addition to the real code.
> 
> -- 
> Josh

-- 
Josh

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 2/2] livepatch: Clear relocation targets on a module removal
  2019-09-05 11:19                             ` Jiri Kosina
@ 2019-09-05 13:23                               ` Josh Poimboeuf
  2019-09-05 13:31                                 ` Jiri Kosina
  0 siblings, 1 reply; 44+ messages in thread
From: Josh Poimboeuf @ 2019-09-05 13:23 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: Petr Mladek, Joe Lawrence, Miroslav Benes, linux-kernel, live-patching

On Thu, Sep 05, 2019 at 01:19:06PM +0200, Jiri Kosina wrote:
> On Thu, 5 Sep 2019, Petr Mladek wrote:
> 
> > > I don't have a number, but it's very common to patch a function which 
> > > uses jump labels or alternatives.
> > 
> > Really? My impression is that both alternatives and jump_labels
> > are used in hot paths. I would expect them mostly in core code
> > that is always loaded.
> > 
> > Alternatives are often used in assembly that we are not able
> > to livepatch anyway.
> > 
> > Or are they spread widely via some macros or inlined functions?
> 
> All the indirect jumps are turned into alternatives when retpolines are in 
> place.

Actually in C code those are done by the compiler as calls/jumps to
__x86_indirect_thunk_*.

But there are still a bunch of paravirt patched instructions and
alternatives used throughout the kernel.

-- 
Josh

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 2/2] livepatch: Clear relocation targets on a module removal
  2019-09-05 13:23                               ` Josh Poimboeuf
@ 2019-09-05 13:31                                 ` Jiri Kosina
  2019-09-05 13:42                                   ` Josh Poimboeuf
  0 siblings, 1 reply; 44+ messages in thread
From: Jiri Kosina @ 2019-09-05 13:31 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Petr Mladek, Joe Lawrence, Miroslav Benes, linux-kernel, live-patching

On Thu, 5 Sep 2019, Josh Poimboeuf wrote:

> > All the indirect jumps are turned into alternatives when retpolines 
> > are in place.
> 
> Actually in C code those are done by the compiler as calls/jumps to
> __x86_indirect_thunk_*.

Sure, and the thunks do the redirection via JMP_NOSPEC / CALL_NOSPEC, 
which has alternative in it.

-- 
Jiri Kosina
SUSE Labs


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 2/2] livepatch: Clear relocation targets on a module removal
  2019-09-05 13:31                                 ` Jiri Kosina
@ 2019-09-05 13:42                                   ` Josh Poimboeuf
  0 siblings, 0 replies; 44+ messages in thread
From: Josh Poimboeuf @ 2019-09-05 13:42 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: Petr Mladek, Joe Lawrence, Miroslav Benes, linux-kernel, live-patching

On Thu, Sep 05, 2019 at 03:31:56PM +0200, Jiri Kosina wrote:
> On Thu, 5 Sep 2019, Josh Poimboeuf wrote:
> 
> > > All the indirect jumps are turned into alternatives when retpolines 
> > > are in place.
> > 
> > Actually in C code those are done by the compiler as calls/jumps to
> > __x86_indirect_thunk_*.
> 
> Sure, and the thunks do the redirection via JMP_NOSPEC / CALL_NOSPEC, 
> which has alternative in it.

But the thunks are isolated to arch/x86/lib/retpoline.S.  We can't patch
that code anyway.

-- 
Josh

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 2/2] livepatch: Clear relocation targets on a module removal
  2019-09-05 13:15                               ` Josh Poimboeuf
@ 2019-09-05 13:52                                 ` Petr Mladek
  2019-09-05 14:28                                   ` Josh Poimboeuf
  0 siblings, 1 reply; 44+ messages in thread
From: Petr Mladek @ 2019-09-05 13:52 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: jikos, Joe Lawrence, Miroslav Benes, linux-kernel, live-patching

On Thu 2019-09-05 08:15:02, Josh Poimboeuf wrote:
> On Thu, Sep 05, 2019 at 08:08:32AM -0500, Josh Poimboeuf wrote:
> > On Thu, Sep 05, 2019 at 01:09:55PM +0200, Petr Mladek wrote:
> > > > I don't have a number, but it's very common to patch a function which
> > > > uses jump labels or alternatives.
> > > 
> > > Really? My impression is that both alternatives and jump_labels
> > > are used in hot paths. I would expect them mostly in core code
> > > that is always loaded.
> > > 
> > > Alternatives are often used in assembly that we are not able
> > > to livepatch anyway.
> > > 
> > > Or are they spread widely via some macros or inlined functions?
> > 
> > Jump labels are used everywhere.  Looking at vmlinux.o in my kernel:
> > 
> >   Relocation section [19621] '.rela__jump_table' for section [19620] '__jump_table' at offset 0x197873c8 contains 11913 entries:
> > 
> > Each jump label entry has 3 entries, so 11913/3 = 3971 jump labels.
> > 
> > $ readelf -s vmlinux.o |grep FUNC |wc -l
> > 46902
> > 
> > 3971/46902 = ~8.5%
> > 
> > ~8.5% of functions use jump labels.
> 
> Obviously some functions may use more than one jump label so this isn't
> exactly bulletproof math.  But it gives a rough idea of how widespread
> they are.

It looks scary. I just wonder why we have never met this problem during
last few years.

My only guess is that most of these functions are either in core
kernel or in code that we do not livepatch.

I do not want to say that we should ignore it. I want to
understand the cost and impact of the various approaches.

Regards,
Petr

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 2/2] livepatch: Clear relocation targets on a module removal
  2019-09-05 13:52                                 ` Petr Mladek
@ 2019-09-05 14:28                                   ` Josh Poimboeuf
  0 siblings, 0 replies; 44+ messages in thread
From: Josh Poimboeuf @ 2019-09-05 14:28 UTC (permalink / raw)
  To: Petr Mladek
  Cc: jikos, Joe Lawrence, Miroslav Benes, linux-kernel, live-patching

On Thu, Sep 05, 2019 at 03:52:59PM +0200, Petr Mladek wrote:
> On Thu 2019-09-05 08:15:02, Josh Poimboeuf wrote:
> > On Thu, Sep 05, 2019 at 08:08:32AM -0500, Josh Poimboeuf wrote:
> > > On Thu, Sep 05, 2019 at 01:09:55PM +0200, Petr Mladek wrote:
> > > > > I don't have a number, but it's very common to patch a function which
> > > > > uses jump labels or alternatives.
> > > > 
> > > > Really? My impression is that both alternatives and jump_labels
> > > > are used in hot paths. I would expect them mostly in core code
> > > > that is always loaded.
> > > > 
> > > > Alternatives are often used in assembly that we are not able
> > > > to livepatch anyway.
> > > > 
> > > > Or are they spread widely via some macros or inlined functions?
> > > 
> > > Jump labels are used everywhere.  Looking at vmlinux.o in my kernel:
> > > 
> > >   Relocation section [19621] '.rela__jump_table' for section [19620] '__jump_table' at offset 0x197873c8 contains 11913 entries:
> > > 
> > > Each jump label entry has 3 entries, so 11913/3 = 3971 jump labels.
> > > 
> > > $ readelf -s vmlinux.o |grep FUNC |wc -l
> > > 46902
> > > 
> > > 3971/46902 = ~8.5%
> > > 
> > > ~8.5% of functions use jump labels.
> > 
> > Obviously some functions may use more than one jump label so this isn't
> > exactly bulletproof math.  But it gives a rough idea of how widespread
> > they are.
> 
> It looks scary. I just wonder why we have never met this problem during
> last few years.

Who knows what can happen when you disable jump label patching.
Sometimes it may be harmless.  A panic is probably the worst case.
There may be other fail modes which are harder to detect.

> My only guess is that most of these functions are either in core
> kernel or in code that we do not livepatch.

This is definitely not the case.  We recently introduced jump label
checking in kpatch-build, and it complains a lot.  The workaround is to
replace such uses with static_key_enabled().

-- 
Josh

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 2/2] livepatch: Clear relocation targets on a module removal
  2019-09-05 12:54                           ` Josh Poimboeuf
@ 2019-09-06 12:51                             ` Miroslav Benes
  2019-09-06 15:38                               ` Joe Lawrence
  2019-09-06 16:45                               ` Josh Poimboeuf
  0 siblings, 2 replies; 44+ messages in thread
From: Miroslav Benes @ 2019-09-06 12:51 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Joe Lawrence, Petr Mladek, jikos, linux-kernel, live-patching


> > Now, I don't think that replacing .ko on disk is a good idea. We've 
> > already discussed it. It would lead to a maintenance/packaging problem, 
> > because you never know which version of the module is loaded in the 
> > system. The state space grows rather rapidly there.
> 
> What exactly are your concerns?
> 
> Either the old version of the module is loaded, and it's livepatched; or
> the new version of the module is loaded, and it's not livepatched.

Let's have module foo.ko with function a().

Live patch 1 (LP1) fixes it to a'(), which calls new function b() (present 
in LP1). LP1 is used only if foo.ko is loaded. foo.ko is replaced with 
foo'.ko on disk. It contains both a'() (fixed a() to be precise) and new 
b().

Now there is LP2 with new function c() (or c'(), it does not matter) 
calling b(). Either foo.ko or foo'.ko can be loaded and you don't know 
which one. The implementation LP2 would be different in both cases.

You could say that it does not matter. If LP2 is implemented for foo.ko, 
the same could work for foo'.ko (b() would be a part of LP2 and would not 
be called directly from foo'.ko). LP2 would only be necessarily larger. It 
is true in case of functions, but if symbol b is not a function but a 
global variable, it is different then.

Moreover, in this case foo'.ko is "LP superset". Meaning that it contains 
only fixes which are present in LP1. What if it is not. We usually 
preserve kABI, so there could be a module in two or more versions compiled 
from slightly different code (older/newer and so on) and you don't know 
which one is loaded. To be fair we don't allow it (I think) at SUSE except 
for KMPs (kernel module packages) (the issue of course exists even now 
and we haven't solved it yet, because it is rare) and out of tree modules 
which we don't support with LP. It could be solved with srcversion, but it 
complicates things a lot. "blue sky" idea could extend the issue to all 
modules given the above is real.

Does it make sense?

Miroslav

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 2/2] livepatch: Clear relocation targets on a module removal
  2019-09-06 12:51                             ` Miroslav Benes
@ 2019-09-06 15:38                               ` Joe Lawrence
  2019-09-06 16:45                               ` Josh Poimboeuf
  1 sibling, 0 replies; 44+ messages in thread
From: Joe Lawrence @ 2019-09-06 15:38 UTC (permalink / raw)
  To: Miroslav Benes, Josh Poimboeuf
  Cc: Petr Mladek, jikos, linux-kernel, live-patching

On 9/6/19 1:51 PM, Miroslav Benes wrote:
> 
>>> Now, I don't think that replacing .ko on disk is a good idea. We've
>>> already discussed it. It would lead to a maintenance/packaging problem,
>>> because you never know which version of the module is loaded in the
>>> system. The state space grows rather rapidly there.
>>
>> What exactly are your concerns?
>>
>> Either the old version of the module is loaded, and it's livepatched; or
>> the new version of the module is loaded, and it's not livepatched.
> 
> Let's have module foo.ko with function a().
> 
> Live patch 1 (LP1) fixes it to a'(), which calls new function b() (present
> in LP1). LP1 is used only if foo.ko is loaded. foo.ko is replaced with
> foo'.ko on disk. It contains both a'() (fixed a() to be precise) and new
> b().
> 
> Now there is LP2 with new function c() (or c'(), it does not matter)
> calling b(). Either foo.ko or foo'.ko can be loaded and you don't know
> which one. The implementation LP2 would be different in both cases.
> 
> You could say that it does not matter. If LP2 is implemented for foo.ko,
> the same could work for foo'.ko (b() would be a part of LP2 and would not
> be called directly from foo'.ko). LP2 would only be necessarily larger. It
> is true in case of functions, but if symbol b is not a function but a
> global variable, it is different then.
> 
> Moreover, in this case foo'.ko is "LP superset". Meaning that it contains
> only fixes which are present in LP1. What if it is not. We usually
> preserve kABI, so there could be a module in two or more versions compiled
> from slightly different code (older/newer and so on) and you don't know
> which one is loaded. To be fair we don't allow it (I think) at SUSE except
> for KMPs (kernel module packages) (the issue of course exists even now
> and we haven't solved it yet, because it is rare) and out of tree modules
> which we don't support with LP. It could be solved with srcversion, but it
> complicates things a lot. "blue sky" idea could extend the issue to all
> modules given the above is real.
> 
> Does it make sense?
> 

If I understand correctly, you're saying that this would add another 
dimension to the potential system state that livepatches need to 
consider?  e.g. when updating a livepatch to v3, a v2 patched module may 
or may not be loaded.  So are we updating livepatch v2 code or module v2 
code...

I agree that for functions, we could probably get away with repeating 
code, but not necessarily for new global variables.

Then there's the question of:

   module v3 == module v{1,2} + livepatch v3?


Is this scenario similar to one where a customer somehow finds and loads 
module v3 before loading livepatch v3?  Livepatch doesn't have a 
srcversion whitelist so this should be entirely possible.  I suppose it 
is a bit different in that module v3 would be starting from a fresh load 
and not something that livepatch v3 has hotpatched from an unknown 
source/base.

-- Joe

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 2/2] livepatch: Clear relocation targets on a module removal
  2019-09-06 12:51                             ` Miroslav Benes
  2019-09-06 15:38                               ` Joe Lawrence
@ 2019-09-06 16:45                               ` Josh Poimboeuf
  1 sibling, 0 replies; 44+ messages in thread
From: Josh Poimboeuf @ 2019-09-06 16:45 UTC (permalink / raw)
  To: Miroslav Benes
  Cc: Joe Lawrence, Petr Mladek, jikos, linux-kernel, live-patching

On Fri, Sep 06, 2019 at 02:51:01PM +0200, Miroslav Benes wrote:
> 
> > > Now, I don't think that replacing .ko on disk is a good idea. We've 
> > > already discussed it. It would lead to a maintenance/packaging problem, 
> > > because you never know which version of the module is loaded in the 
> > > system. The state space grows rather rapidly there.
> > 
> > What exactly are your concerns?
> > 
> > Either the old version of the module is loaded, and it's livepatched; or
> > the new version of the module is loaded, and it's not livepatched.
> 
> Let's have module foo.ko with function a().
> 
> Live patch 1 (LP1) fixes it to a'(), which calls new function b() (present 
> in LP1). LP1 is used only if foo.ko is loaded. foo.ko is replaced with 
> foo'.ko on disk. It contains both a'() (fixed a() to be precise) and new 
> b().
> 
> Now there is LP2 with new function c() (or c'(), it does not matter) 
> calling b(). Either foo.ko or foo'.ko can be loaded and you don't know 
> which one. The implementation LP2 would be different in both cases.
> 
> You could say that it does not matter. If LP2 is implemented for foo.ko, 
> the same could work for foo'.ko (b() would be a part of LP2 and would not 
> be called directly from foo'.ko). LP2 would only be necessarily larger. It 
> is true in case of functions, but if symbol b is not a function but a 
> global variable, it is different then.

Assuming atomic replace, I don't see how this could be a problem.  LP2
replaces LP1, so why would LP2 need to access LP1's (or foo'.ko's)
symbol b?  All live patches should be built against and targeted for the
original foo.ko.

However... it might break atomic replace functionality in another way.

If LP2 is an 'atomic replace' partial revert of LP1, and foo'.ko were
loaded, when loading LP2, the atomic replace code wouldn't be able to
detect which functions were "patched" in foo'.ko. So if the LP2
functions are not a superset of the LP1 functions, the "patched"
functions in foo'.ko wouldn't get reverted.

What if foo'.ko were really just the original foo.ko, plus livepatch
metadata grafted onto it somehow, such that it patches itself when it
loads?  Then patched state would always be the same regardless of
whether the patch came from the LP or foo'.ko.

> Moreover, in this case foo'.ko is "LP superset". Meaning that it contains 
> only fixes which are present in LP1. What if it is not. We usually 
> preserve kABI, so there could be a module in two or more versions compiled 
> from slightly different code (older/newer and so on) and you don't know 
> which one is loaded. To be fair we don't allow it (I think) at SUSE except 
> for KMPs (kernel module packages) (the issue of course exists even now 
> and we haven't solved it yet, because it is rare) and out of tree modules 
> which we don't support with LP. It could be solved with srcversion, but it 
> complicates things a lot. "blue sky" idea could extend the issue to all 
> modules given the above is real.

I'm having trouble understanding what this issue is and how "blue sky"
would extend it.

-- 
Josh

^ permalink raw reply	[flat|nested] 44+ messages in thread

end of thread, other threads:[~2019-09-06 16:45 UTC | newest]

Thread overview: 44+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-19 12:28 [RFC PATCH 0/2] livepatch: Clear relocation targets on a module removal Miroslav Benes
2019-07-19 12:28 ` [PATCH 1/2] livepatch: Nullify obj->mod in klp_module_coming()'s error path Miroslav Benes
2019-07-28 19:45   ` Josh Poimboeuf
2019-08-19 11:26     ` Petr Mladek
2019-07-19 12:28 ` [RFC PATCH 2/2] livepatch: Clear relocation targets on a module removal Miroslav Benes
2019-07-22  9:33   ` Petr Mladek
2019-08-14 12:33     ` Miroslav Benes
2019-07-28 20:04   ` Josh Poimboeuf
2019-08-14 11:06     ` Miroslav Benes
2019-08-14 15:12       ` Josh Poimboeuf
2019-08-16  9:46         ` Petr Mladek
2019-08-22 22:36           ` Josh Poimboeuf
2019-08-23  8:13             ` Petr Mladek
2019-08-26 14:54               ` Josh Poimboeuf
2019-08-27 15:05                 ` Joe Lawrence
2019-08-27 15:37                   ` Josh Poimboeuf
2019-09-02 16:13                 ` Miroslav Benes
2019-09-02 17:05                   ` Joe Lawrence
2019-09-03 13:02                     ` Miroslav Benes
2019-09-04  8:49                       ` Petr Mladek
2019-09-04 16:26                         ` Joe Lawrence
2019-09-05  2:50                         ` Josh Poimboeuf
2019-09-05 11:09                           ` Petr Mladek
2019-09-05 11:19                             ` Jiri Kosina
2019-09-05 13:23                               ` Josh Poimboeuf
2019-09-05 13:31                                 ` Jiri Kosina
2019-09-05 13:42                                   ` Josh Poimboeuf
2019-09-05 11:39                             ` Joe Lawrence
2019-09-05 13:08                             ` Josh Poimboeuf
2019-09-05 13:15                               ` Josh Poimboeuf
2019-09-05 13:52                                 ` Petr Mladek
2019-09-05 14:28                                   ` Josh Poimboeuf
2019-09-05 12:03                           ` Miroslav Benes
2019-09-05 12:35                             ` Josh Poimboeuf
2019-09-05 12:49                               ` Miroslav Benes
2019-09-05 11:52                         ` Miroslav Benes
2019-09-05  2:32                       ` Josh Poimboeuf
2019-09-05 12:16                         ` Miroslav Benes
2019-09-05 12:54                           ` Josh Poimboeuf
2019-09-06 12:51                             ` Miroslav Benes
2019-09-06 15:38                               ` Joe Lawrence
2019-09-06 16:45                               ` Josh Poimboeuf
2019-08-26 13:44         ` Nicolai Stange
2019-08-26 15:02           ` Josh Poimboeuf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).