* [PATCH v2 0/3] livepatch: introduce atomic replace
@ 2017-08-30 21:38 Jason Baron
  2017-08-30 21:38 ` [PATCH v2 1/3] livepatch: Add dynamic klp_object and klp_func iterators Jason Baron
From: Jason Baron @ 2017-08-30 21:38 UTC (permalink / raw)
  To: linux-kernel, live-patching; +Cc: jpoimboe, jeyu, jikos, mbenes, pmladek

Hi,

In testing livepatch, I found that when doing cumulative patches, if a patched
function is completely reverted by a subsequent patch (back to its original
state), livepatch does not revert the function to its original state.
Specifically, if patch A introduces a change to function 1, and patch B reverts
the change to function 1 and introduces changes to, say, functions 2 and 3 as
well, the change that patch A introduced to function 1 is still present. This
could be addressed by first completely removing patch A (disable and then
rmmod) and then inserting patch B (insmod and enable), but this leaves an
unpatched window. In discussing this issue with Josh on the kpatch mailing
list, he mentioned that we could get 'atomic replace' working properly, and
that is the direction of this patchset:
https://www.redhat.com/archives/kpatch/2017-June/msg00005.html
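
To make the scenario concrete, the two patches might define their function
tables like this (purely illustrative; the symbol and function names are
hypothetical):

    /* patch A: changes function 1 */
    static struct klp_func patch_a_funcs[] = {
            { .old_name = "func1", .new_func = livepatch_a_func1, },
            { }
    };

    /* patch B: reverts function 1 (by omitting it), changes 2 and 3 */
    static struct klp_func patch_b_funcs[] = {
            { .old_name = "func2", .new_func = livepatch_b_func2, },
            { .old_name = "func3", .new_func = livepatch_b_func3, },
            { }
    };

Without atomic replace, loading patch B on top of patch A leaves the patched
func1 from patch A in place, because patch B simply does not mention it.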

Patches:

1) livepatch: Add dynamic klp_object and klp_func iterators
A prep patch for the 'atomic replace' feature such that dynamic objects
and functions can be allocated.

2) livepatch: add atomic replace
Core feature.

3) livepatch: Add a sysctl livepatch_mode for atomic replace
Introduces a knob for enabling atomic replace. This patch is really only
meant for testing purposes, and should not be considered for application.
The intention is for the livepatch module generator to set the 'replace'
field in the klp_patch structure such that the kernel knows to do the atomic
replace.
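
In other words, a generated cumulative patch module would eventually declare
something like the following (a sketch based on patch 2 of this series; until
then, the sysctl in patch 3 stands in for it):

    static struct klp_patch patch = {
            .mod = THIS_MODULE,
            .objs = objs,
            .replace = true,
    };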

Thanks,

-Jason

v1->v2:
-removed the func_iter and obj_iter (Petr Mladek)
-initializing kobject structure for no_op functions using:
 klp_init_object() and klp_init_func()
-added a 'replace' field to klp_patch, similar to the immediate field
-a 'replace' patch now disables all previous patches
-tried to shorten klp_init_patch_no_ops()...
-simplified logic in klp_complete_transition() (Petr Mladek)

Jason Baron (3):
  livepatch: Add dynamic klp_object and klp_func iterators
  livepatch: add atomic replace
  livepatch: Add a sysctl livepatch_mode for atomic replace

 include/linux/livepatch.h     | 108 ++++++++++++++++++++++++-
 kernel/livepatch/core.c       | 182 +++++++++++++++++++++++++++++++++++++++---
 kernel/livepatch/core.h       |   5 ++
 kernel/livepatch/patch.c      |  19 +++--
 kernel/livepatch/patch.h      |   4 +-
 kernel/livepatch/transition.c |  47 ++++++++++-
 kernel/sysctl.c               |  12 +++
 7 files changed, 352 insertions(+), 25 deletions(-)

-- 
2.6.1


* [PATCH v2 1/3] livepatch: Add dynamic klp_object and klp_func iterators
  2017-08-30 21:38 [PATCH v2 0/3] livepatch: introduce atomic replace Jason Baron
@ 2017-08-30 21:38 ` Jason Baron
  2017-09-07 12:34   ` Petr Mladek
  2017-08-30 21:38 ` [PATCH v2 2/3] livepatch: add atomic replace Jason Baron
  2017-08-30 21:38 ` [PATCH v2 3/3] livepatch: Add a sysctl livepatch_mode for " Jason Baron
From: Jason Baron @ 2017-08-30 21:38 UTC (permalink / raw)
  To: linux-kernel, live-patching; +Cc: jpoimboe, jeyu, jikos, mbenes, pmladek

In preparation for introducing atomic replace, introduce iterators for klp_func
and klp_object, such that objects and functions can be dynamically allocated
(needed for atomic replace). This patch is intended to effectively be a no-op
until atomic replace is introduced.

Signed-off-by: Jason Baron <jbaron@akamai.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Petr Mladek <pmladek@suse.com>
---
 include/linux/livepatch.h | 94 +++++++++++++++++++++++++++++++++++++++++++++--
 kernel/livepatch/core.c   |  4 ++
 2 files changed, 94 insertions(+), 4 deletions(-)

diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
index 194991e..8d3df55 100644
--- a/include/linux/livepatch.h
+++ b/include/linux/livepatch.h
@@ -24,6 +24,7 @@
 #include <linux/module.h>
 #include <linux/ftrace.h>
 #include <linux/completion.h>
+#include <linux/list.h>
 
 #if IS_ENABLED(CONFIG_LIVEPATCH)
 
@@ -44,6 +45,7 @@
  * @old_addr:	the address of the function being patched
  * @kobj:	kobject for sysfs resources
  * @stack_node:	list node for klp_ops func_stack list
+ * @func_entry: used to link struct klp_func to struct klp_object
  * @old_size:	size of the old function
  * @new_size:	size of the new function
  * @patched:	the func has been added to the klp_ops list
@@ -82,6 +84,7 @@ struct klp_func {
 	unsigned long old_addr;
 	struct kobject kobj;
 	struct list_head stack_node;
+	struct list_head func_entry;
 	unsigned long old_size, new_size;
 	bool patched;
 	bool transition;
@@ -92,6 +95,8 @@ struct klp_func {
  * @name:	module name (or NULL for vmlinux)
  * @funcs:	function entries for functions to be patched in the object
  * @kobj:	kobject for sysfs resources
+ * @func_list:	head of list for dynamically allocated struct klp_func
+ * @obj_entry:	used to link struct klp_object to struct klp_patch
  * @mod:	kernel module associated with the patched object
  *		(NULL for vmlinux)
  * @patched:	the object's funcs have been added to the klp_ops list
@@ -103,6 +108,8 @@ struct klp_object {
 
 	/* internal */
 	struct kobject kobj;
+	struct list_head func_list;
+	struct list_head obj_entry;
 	struct module *mod;
 	bool patched;
 };
@@ -114,6 +121,7 @@ struct klp_object {
  * @immediate:  patch all funcs immediately, bypassing safety mechanisms
  * @list:	list node for global list of registered patches
  * @kobj:	kobject for sysfs resources
+ * @obj_list:	head of list for dynamically allocated struct klp_object
  * @enabled:	the patch is enabled (but operation may be incomplete)
  * @finish:	for waiting till it is safe to remove the patch module
  */
@@ -126,17 +134,95 @@ struct klp_patch {
 	/* internal */
 	struct list_head list;
 	struct kobject kobj;
+	struct list_head obj_list;
 	bool enabled;
 	struct completion finish;
 };
 
+static inline struct klp_object *obj_iter_next(struct klp_patch *patch,
+					       struct klp_object *obj)
+{
+	struct klp_object *next_obj = NULL;
+
+	if (list_empty(&obj->obj_entry)) {
+		next_obj = obj + 1;
+		if (next_obj->funcs || next_obj->name)
+			goto out;
+		else
+			next_obj = NULL;
+		if (!list_empty(&patch->obj_list))
+			next_obj = container_of(patch->obj_list.next,
+					struct klp_object,
+					obj_entry);
+		goto out;
+	}
+	if (obj->obj_entry.next != &patch->obj_list)
+		next_obj = container_of(obj->obj_entry.next,
+				struct klp_object,
+				obj_entry);
+out:
+	return next_obj;
+}
+
+static inline struct klp_object *obj_iter_init(struct klp_patch *patch)
+{
+	if (patch->objs->funcs || patch->objs->name)
+		return patch->objs;
+	else
+		return NULL;
+}
+
 #define klp_for_each_object(patch, obj) \
-	for (obj = patch->objs; obj->funcs || obj->name; obj++)
+	for (obj = obj_iter_init(patch); obj; obj = obj_iter_next(patch, obj))
+
+static inline struct klp_func *func_iter_next(struct klp_object *obj,
+					      struct klp_func *func)
+{
+	struct klp_func *next_func = NULL;
+
+	if (list_empty(&func->func_entry)) {
+		next_func = func + 1;
+		if (next_func->old_name || next_func->new_func ||
+		    next_func->old_sympos)
+			goto out;
+		else
+			next_func = NULL;
+		if (!list_empty(&obj->func_list))
+			next_func = container_of(obj->func_list.next,
+					struct klp_func,
+					func_entry);
+		goto out;
+	}
+	if (func->func_entry.next != &obj->func_list)
+		next_func = container_of(func->func_entry.next,
+					 struct klp_func,
+					 func_entry);
+out:
+	return next_func;
+}
+
+static inline struct klp_func *func_iter_init(struct klp_object *obj)
+{
+	/* statically allocated */
+	if (list_empty(&obj->obj_entry)) {
+		if (obj->funcs->old_name || obj->funcs->new_func ||
+		    obj->funcs->old_sympos)
+			return obj->funcs;
+		else
+			return NULL;
+	} else {
+		if (!list_empty(&obj->func_list))
+			return container_of(obj->func_list.next,
+					    struct klp_func,
+					    func_entry);
+		else
+			return NULL;
+	}
+}
 
 #define klp_for_each_func(obj, func) \
-	for (func = obj->funcs; \
-	     func->old_name || func->new_func || func->old_sympos; \
-	     func++)
+	for (func = func_iter_init(obj); func; \
+	     func = func_iter_next(obj, func))
 
 int klp_register_patch(struct klp_patch *);
 int klp_unregister_patch(struct klp_patch *);
diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
index b9628e4..6004be3 100644
--- a/kernel/livepatch/core.c
+++ b/kernel/livepatch/core.c
@@ -606,6 +606,7 @@ static int klp_init_func(struct klp_object *obj, struct klp_func *func)
 		return -EINVAL;
 
 	INIT_LIST_HEAD(&func->stack_node);
+	INIT_LIST_HEAD(&func->func_entry);
 	func->patched = false;
 	func->transition = false;
 
@@ -729,7 +730,10 @@ static int klp_init_patch(struct klp_patch *patch)
 		return ret;
 	}
 
+	INIT_LIST_HEAD(&patch->obj_list);
 	klp_for_each_object(patch, obj) {
+		INIT_LIST_HEAD(&obj->obj_entry);
+		INIT_LIST_HEAD(&obj->func_list);
 		ret = klp_init_object(patch, obj);
 		if (ret)
 			goto free;
-- 
2.6.1


* [PATCH v2 2/3] livepatch: add atomic replace
  2017-08-30 21:38 [PATCH v2 0/3] livepatch: introduce atomic replace Jason Baron
  2017-08-30 21:38 ` [PATCH v2 1/3] livepatch: Add dynamic klp_object and klp_func iterators Jason Baron
@ 2017-08-30 21:38 ` Jason Baron
  2017-09-11 13:53   ` Petr Mladek
  2017-08-30 21:38 ` [PATCH v2 3/3] livepatch: Add a sysctl livepatch_mode for " Jason Baron
From: Jason Baron @ 2017-08-30 21:38 UTC (permalink / raw)
  To: linux-kernel, live-patching; +Cc: jpoimboe, jeyu, jikos, mbenes, pmladek

When doing cumulative patches, if patch A introduces a change to function 1,
and patch B reverts the change to function 1 and introduces changes to, say,
functions 2 and 3 as well, the change that patch A introduced to function 1
is still present.

Introduce atomic replace, by first creating a 'no_op' klp_func for all
the functions that are reverted by patch B. The reason that 'no_op'
klp_funcs are created, instead of just unregistering directly from the ftrace
function hook, is to ensure that the consistency model is properly preserved.
By introducing the 'no_op' functions, 'atomic replace' leverages the existing
consistency model code. Then, after transition to the new code, 'atomic
replace' unregisters the ftrace handlers that are associated with the 'no_op'
klp_funcs, such that we run the original un-patched function with no
additional no_op function overhead.
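
Conceptually, a no_op klp_func is an ordinary klp_func with no replacement
code (the values below are taken from the hunks that follow):

	func->new_func = NULL;
	func->no_op = true;

klp_ftrace_handler() leaves the saved instruction pointer untouched for such
a func, so once a task has transitioned it executes the original function.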

Since an 'atomic replace' patch has completely replaced all previous livepatch
modules, it explicitly disables those previous modules (in the example, patch
A), such that the livepatch module associated with patch A can be completely
removed (rmmod). Patch A is now in a permanently disabled state. But if it is
removed from the kernel with rmmod, it can be re-inserted (insmod) and enabled
again, this time acting as an atomic replace on top of patch B.

Signed-off-by: Jason Baron <jbaron@akamai.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Petr Mladek <pmladek@suse.com>
---
 include/linux/livepatch.h     |   6 ++
 kernel/livepatch/core.c       | 177 +++++++++++++++++++++++++++++++++++++++---
 kernel/livepatch/core.h       |   5 ++
 kernel/livepatch/patch.c      |  19 +++--
 kernel/livepatch/patch.h      |   4 +-
 kernel/livepatch/transition.c |  47 ++++++++++-
 6 files changed, 234 insertions(+), 24 deletions(-)

diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
index 8d3df55..ee6d18b 100644
--- a/include/linux/livepatch.h
+++ b/include/linux/livepatch.h
@@ -50,6 +50,7 @@
  * @new_size:	size of the new function
  * @patched:	the func has been added to the klp_ops list
  * @transition:	the func is currently being applied or reverted
+ * @no_op:	this is a no_op function used to completely revert a function
  *
  * The patched and transition variables define the func's patching state.  When
  * patching, a func is always in one of the following states:
@@ -88,6 +89,7 @@ struct klp_func {
 	unsigned long old_size, new_size;
 	bool patched;
 	bool transition;
+	bool no_op;
 };
 
 /**
@@ -119,10 +121,12 @@ struct klp_object {
  * @mod:	reference to the live patch module
  * @objs:	object entries for kernel objects to be patched
  * @immediate:  patch all funcs immediately, bypassing safety mechanisms
+ * @replace:	replace all funcs, reverting functions that are no longer patched
  * @list:	list node for global list of registered patches
  * @kobj:	kobject for sysfs resources
  * @obj_list:	head of list for dynamically allocated struct klp_object
  * @enabled:	the patch is enabled (but operation may be incomplete)
+ * @replaced:	the patch has been replaced and cannot be re-enabled
  * @finish:	for waiting till it is safe to remove the patch module
  */
 struct klp_patch {
@@ -130,12 +134,14 @@ struct klp_patch {
 	struct module *mod;
 	struct klp_object *objs;
 	bool immediate;
+	bool replace;
 
 	/* internal */
 	struct list_head list;
 	struct kobject kobj;
 	struct list_head obj_list;
 	bool enabled;
+	bool replaced;
 	struct completion finish;
 };
 
diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
index 6004be3..21cecc5 100644
--- a/kernel/livepatch/core.c
+++ b/kernel/livepatch/core.c
@@ -45,7 +45,7 @@
  */
 DEFINE_MUTEX(klp_mutex);
 
-static LIST_HEAD(klp_patches);
+LIST_HEAD(klp_patches);
 
 static struct kobject *klp_root_kobj;
 
@@ -351,6 +351,9 @@ static int __klp_enable_patch(struct klp_patch *patch)
 	if (klp_transition_patch)
 		return -EBUSY;
 
+	if (patch->replaced)
+		return -EINVAL;
+
 	if (WARN_ON(patch->enabled))
 		return -EINVAL;
 
@@ -600,13 +603,38 @@ static void klp_free_patch(struct klp_patch *patch)
 		list_del(&patch->list);
 }
 
-static int klp_init_func(struct klp_object *obj, struct klp_func *func)
+void klp_patch_free_no_ops(struct klp_patch *patch)
+{
+	struct klp_object *obj, *tmp_obj;
+	struct klp_func *func, *tmp_func;
+
+	klp_for_each_object(patch, obj) {
+		list_for_each_entry_safe(func, tmp_func, &obj->func_list,
+					 func_entry) {
+			list_del_init(&func->func_entry);
+			kobject_put(&func->kobj);
+			kfree(func->old_name);
+			kfree(func);
+		}
+		INIT_LIST_HEAD(&obj->func_list);
+	}
+	list_for_each_entry_safe(obj, tmp_obj, &patch->obj_list, obj_entry) {
+		list_del_init(&obj->obj_entry);
+		kobject_put(&obj->kobj);
+		kfree(obj->name);
+		kfree(obj);
+	}
+	INIT_LIST_HEAD(&patch->obj_list);
+}
+
+static int klp_init_func(struct klp_object *obj, struct klp_func *func,
+			 bool no_op)
 {
-	if (!func->old_name || !func->new_func)
+	if (!func->old_name || (!no_op && !func->new_func))
 		return -EINVAL;
 
-	INIT_LIST_HEAD(&func->stack_node);
 	INIT_LIST_HEAD(&func->func_entry);
+	INIT_LIST_HEAD(&func->stack_node);
 	func->patched = false;
 	func->transition = false;
 
@@ -670,19 +698,23 @@ static int klp_init_object_loaded(struct klp_patch *patch,
 	return 0;
 }
 
-static int klp_init_object(struct klp_patch *patch, struct klp_object *obj)
+static int klp_init_object(struct klp_patch *patch, struct klp_object *obj,
+			   bool no_op)
 {
 	struct klp_func *func;
 	int ret;
 	const char *name;
 
-	if (!obj->funcs)
+	if (!obj->funcs && !no_op)
 		return -EINVAL;
 
 	obj->patched = false;
 	obj->mod = NULL;
+	INIT_LIST_HEAD(&obj->obj_entry);
+	INIT_LIST_HEAD(&obj->func_list);
 
-	klp_find_object_module(obj);
+	if (!no_op)
+		klp_find_object_module(obj);
 
 	name = klp_is_module(obj) ? obj->name : "vmlinux";
 	ret = kobject_init_and_add(&obj->kobj, &klp_ktype_object,
@@ -690,8 +722,12 @@ static int klp_init_object(struct klp_patch *patch, struct klp_object *obj)
 	if (ret)
 		return ret;
 
+	if (no_op)
+		return 0;
+
 	klp_for_each_func(obj, func) {
-		ret = klp_init_func(obj, func);
+		func->no_op = false;
+		ret = klp_init_func(obj, func, false);
 		if (ret)
 			goto free;
 	}
@@ -710,6 +746,115 @@ static int klp_init_object(struct klp_patch *patch, struct klp_object *obj)
 	return ret;
 }
 
+static int klp_init_patch_no_ops(struct klp_patch *prev_patch,
+				 struct klp_patch *patch)
+{
+	struct klp_object *obj, *prev_obj;
+	struct klp_func *prev_func, *func;
+	int ret;
+	bool found, mod;
+
+	klp_for_each_object(prev_patch, prev_obj) {
+		klp_for_each_func(prev_obj, prev_func) {
+next_func:
+			klp_for_each_object(patch, obj) {
+				klp_for_each_func(obj, func) {
+					if ((strcmp(prev_func->old_name,
+						    func->old_name) == 0) &&
+						(prev_func->old_sympos ==
+							func->old_sympos)) {
+						goto next_func;
+					}
+				}
+			}
+			mod = klp_is_module(prev_obj);
+			found = false;
+			klp_for_each_object(patch, obj) {
+				if (mod) {
+					if (klp_is_module(obj) &&
+					    strcmp(prev_obj->name,
+						   obj->name) == 0) {
+						found = true;
+						break;
+					}
+				} else if (!klp_is_module(obj)) {
+					found = true;
+					break;
+				}
+			}
+			if (!found) {
+				obj = kzalloc(sizeof(*obj), GFP_KERNEL);
+				if (!obj)
+					return -ENOMEM;
+				if (prev_obj->name) {
+					obj->name = kstrdup(prev_obj->name,
+							    GFP_KERNEL);
+					if (!obj->name) {
+						kfree(obj);
+						return -ENOMEM;
+					}
+				} else {
+					obj->name = NULL;
+				}
+				obj->funcs = NULL;
+				ret = klp_init_object(patch, obj, true);
+				if (ret) {
+					kfree(obj->name);
+					kfree(obj);
+					return ret;
+				}
+				obj->mod = prev_obj->mod;
+				list_add(&obj->obj_entry, &patch->obj_list);
+			}
+			func = kzalloc(sizeof(*func), GFP_KERNEL);
+			if (!func)
+				return -ENOMEM;
+			if (prev_func->old_name) {
+				func->old_name = kstrdup(prev_func->old_name,
+							 GFP_KERNEL);
+				if (!func->old_name) {
+					kfree(func);
+					return -ENOMEM;
+				}
+			} else {
+				func->old_name = NULL;
+			}
+			func->new_func = NULL;
+			func->old_sympos = prev_func->old_sympos;
+			ret = klp_init_func(obj, func, true);
+			func->immediate = prev_func->immediate;
+			func->old_addr = prev_func->old_addr;
+			func->old_size = prev_func->old_size;
+			func->new_size = 0;
+			func->no_op = true;
+			list_add(&func->func_entry, &obj->func_list);
+		}
+	}
+	return 0;
+}
+
+static int klp_init_no_ops(struct klp_patch *patch)
+{
+	struct klp_patch *prev_patch;
+	int ret = 0;
+
+	if (!patch->replace)
+		return 0;
+
+	prev_patch = patch;
+	while (prev_patch->list.prev != &klp_patches) {
+		prev_patch = list_prev_entry(prev_patch, list);
+
+		ret = klp_init_patch_no_ops(prev_patch, patch);
+		if (ret)
+			return ret;
+
+		if (prev_patch->replace)
+			break;
+	}
+	return 0;
+}
+
 static int klp_init_patch(struct klp_patch *patch)
 {
 	struct klp_object *obj;
@@ -721,6 +866,8 @@ static int klp_init_patch(struct klp_patch *patch)
 	mutex_lock(&klp_mutex);
 
 	patch->enabled = false;
+	patch->replaced = false;
+
 	init_completion(&patch->finish);
 
 	ret = kobject_init_and_add(&patch->kobj, &klp_ktype_patch,
@@ -732,20 +879,25 @@ static int klp_init_patch(struct klp_patch *patch)
 
 	INIT_LIST_HEAD(&patch->obj_list);
 	klp_for_each_object(patch, obj) {
-		INIT_LIST_HEAD(&obj->obj_entry);
-		INIT_LIST_HEAD(&obj->func_list);
-		ret = klp_init_object(patch, obj);
+		ret = klp_init_object(patch, obj, false);
 		if (ret)
 			goto free;
 	}
 
 	list_add_tail(&patch->list, &klp_patches);
 
+	ret = klp_init_no_ops(patch);
+	if (ret) {
+		list_del(&patch->list);
+		goto free;
+	}
+
 	mutex_unlock(&klp_mutex);
 
 	return 0;
 
 free:
+	klp_patch_free_no_ops(patch);
 	klp_free_objects_limited(patch, obj);
 
 	mutex_unlock(&klp_mutex);
@@ -780,6 +932,7 @@ int klp_unregister_patch(struct klp_patch *patch)
 		goto err;
 	}
 
+	klp_patch_free_no_ops(patch);
 	klp_free_patch(patch);
 
 	mutex_unlock(&klp_mutex);
@@ -933,7 +1086,7 @@ void klp_module_going(struct module *mod)
 			if (patch->enabled || patch == klp_transition_patch) {
 				pr_notice("reverting patch '%s' on unloading module '%s'\n",
 					  patch->mod->name, obj->mod->name);
-				klp_unpatch_object(obj);
+				klp_unpatch_object(obj, false);
 			}
 
 			klp_free_object_loaded(obj);
diff --git a/kernel/livepatch/core.h b/kernel/livepatch/core.h
index c74f24c..0705ac3 100644
--- a/kernel/livepatch/core.h
+++ b/kernel/livepatch/core.h
@@ -1,6 +1,11 @@
 #ifndef _LIVEPATCH_CORE_H
 #define _LIVEPATCH_CORE_H
 
+#include <linux/livepatch.h>
+
 extern struct mutex klp_mutex;
+extern struct list_head klp_patches;
+
+void klp_patch_free_no_ops(struct klp_patch *patch);
 
 #endif /* _LIVEPATCH_CORE_H */
diff --git a/kernel/livepatch/patch.c b/kernel/livepatch/patch.c
index 52c4e90..10b75e3 100644
--- a/kernel/livepatch/patch.c
+++ b/kernel/livepatch/patch.c
@@ -117,6 +117,8 @@ static void notrace klp_ftrace_handler(unsigned long ip,
 		}
 	}
 
+	if (func->no_op)
+		goto unlock;
 	klp_arch_set_pc(regs, (unsigned long)func->new_func);
 unlock:
 	preempt_enable_notrace();
@@ -235,15 +237,20 @@ static int klp_patch_func(struct klp_func *func)
 	return ret;
 }
 
-void klp_unpatch_object(struct klp_object *obj)
+void klp_unpatch_object(struct klp_object *obj, bool no_op)
 {
 	struct klp_func *func;
 
-	klp_for_each_func(obj, func)
+	klp_for_each_func(obj, func) {
+		if (no_op && !func->no_op)
+			continue;
+
 		if (func->patched)
 			klp_unpatch_func(func);
+	}
 
-	obj->patched = false;
+	if (!no_op)
+		obj->patched = false;
 }
 
 int klp_patch_object(struct klp_object *obj)
@@ -257,7 +264,7 @@ int klp_patch_object(struct klp_object *obj)
 	klp_for_each_func(obj, func) {
 		ret = klp_patch_func(func);
 		if (ret) {
-			klp_unpatch_object(obj);
+			klp_unpatch_object(obj, false);
 			return ret;
 		}
 	}
@@ -266,11 +273,11 @@ int klp_patch_object(struct klp_object *obj)
 	return 0;
 }
 
-void klp_unpatch_objects(struct klp_patch *patch)
+void klp_unpatch_objects(struct klp_patch *patch, bool no_op)
 {
 	struct klp_object *obj;
 
 	klp_for_each_object(patch, obj)
 		if (obj->patched)
-			klp_unpatch_object(obj);
+			klp_unpatch_object(obj, no_op);
 }
diff --git a/kernel/livepatch/patch.h b/kernel/livepatch/patch.h
index 0db2271..2e13c50 100644
--- a/kernel/livepatch/patch.h
+++ b/kernel/livepatch/patch.h
@@ -27,7 +27,7 @@ struct klp_ops {
 struct klp_ops *klp_find_ops(unsigned long old_addr);
 
 int klp_patch_object(struct klp_object *obj);
-void klp_unpatch_object(struct klp_object *obj);
-void klp_unpatch_objects(struct klp_patch *patch);
+void klp_unpatch_object(struct klp_object *obj, bool no_op);
+void klp_unpatch_objects(struct klp_patch *patch, bool no_op);
 
 #endif /* _LIVEPATCH_PATCH_H */
diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
index b004a1f..d5e620e 100644
--- a/kernel/livepatch/transition.c
+++ b/kernel/livepatch/transition.c
@@ -21,6 +21,8 @@
 
 #include <linux/cpu.h>
 #include <linux/stacktrace.h>
+#include <linux/ftrace.h>
+#include <linux/delay.h>
 #include "core.h"
 #include "patch.h"
 #include "transition.h"
@@ -70,6 +72,7 @@ static void klp_synchronize_transition(void)
 	schedule_on_each_cpu(klp_sync);
 }
 
+
 /*
  * The transition to the target patch state is complete.  Clean up the data
  * structures.
@@ -81,14 +84,39 @@ static void klp_complete_transition(void)
 	struct task_struct *g, *task;
 	unsigned int cpu;
 	bool immediate_func = false;
+	bool no_op = false;
+	struct klp_patch *prev_patch;
+
+	/*
+	 * For replace patches, we disable all previous patches, and replace
+	 * the dynamic no-op functions by removing the ftrace hook. After
+	 * klp_synchronize_transition() is called, it is safe to free the
+	 * dynamic no-op functions, done by klp_patch_free_no_ops().
+	 */
+	if (klp_target_state == KLP_PATCHED && klp_transition_patch->replace) {
+		prev_patch = klp_transition_patch;
+		while (prev_patch->list.prev != &klp_patches) {
+			prev_patch = list_prev_entry(prev_patch, list);
+			if (prev_patch->enabled) {
+				klp_unpatch_objects(prev_patch, false);
+				prev_patch->enabled = false;
+				prev_patch->replaced = true;
+				module_put(prev_patch->mod);
+			}
+		}
+		klp_unpatch_objects(klp_transition_patch, true);
+		no_op = true;
+	}
 
 	if (klp_target_state == KLP_UNPATCHED) {
 		/*
 		 * All tasks have transitioned to KLP_UNPATCHED so we can now
 		 * remove the new functions from the func_stack.
 		 */
-		klp_unpatch_objects(klp_transition_patch);
+		klp_unpatch_objects(klp_transition_patch, false);
+	}
 
+	if (klp_target_state == KLP_UNPATCHED || no_op) {
 		/*
 		 * Make sure klp_ftrace_handler() can no longer see functions
 		 * from this patch on the ops->func_stack.  Otherwise, after
@@ -130,6 +158,9 @@ static void klp_complete_transition(void)
 	}
 
 done:
+	if (no_op)
+		klp_patch_free_no_ops(klp_transition_patch);
+
 	klp_target_state = KLP_UNDEFINED;
 	klp_transition_patch = NULL;
 }
@@ -202,10 +233,18 @@ static int klp_check_stack_func(struct klp_func *func,
 		if (klp_target_state == KLP_UNPATCHED) {
 			 /*
 			  * Check for the to-be-unpatched function
-			  * (the func itself).
+			  * (the func itself). If we're unpatching
+			  * a no-op, then we're running the original
+			  * function. We never 'patch' a no-op function,
+			  * since we just remove the ftrace hook.
 			  */
-			func_addr = (unsigned long)func->new_func;
-			func_size = func->new_size;
+			if (func->no_op) {
+				func_addr = (unsigned long)func->old_addr;
+				func_size = func->old_size;
+			} else {
+				func_addr = (unsigned long)func->new_func;
+				func_size = func->new_size;
+			}
 		} else {
 			/*
 			 * Check for the to-be-patched function
-- 
2.6.1


* [PATCH v2 3/3] livepatch: Add a sysctl livepatch_mode for atomic replace
  2017-08-30 21:38 [PATCH v2 0/3] livepatch: introduce atomic replace Jason Baron
  2017-08-30 21:38 ` [PATCH v2 1/3] livepatch: Add dynamic klp_object and klp_func iterators Jason Baron
  2017-08-30 21:38 ` [PATCH v2 2/3] livepatch: add atomic replace Jason Baron
@ 2017-08-30 21:38 ` Jason Baron
From: Jason Baron @ 2017-08-30 21:38 UTC (permalink / raw)
  To: linux-kernel, live-patching; +Cc: jpoimboe, jeyu, jikos, mbenes, pmladek

Introduce a sysctl knob such that by default livepatch is not in
'atomic replace' mode. A '0' in /proc/sys/kernel/livepatch_mode selects the
current default mode, while a '1' selects atomic replace.

This patch is not meant to be applied and is for testing purposes only. The
intent is for the tool that creates the livepatch modules to set the 'replace'
field in struct klp_patch to 1, to indicate that atomic replace mode is
being requested, 0 otherwise.
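
For example, to test atomic replace with this knob (the module name is just a
placeholder):

  echo 1 > /proc/sys/kernel/livepatch_mode
  insmod livepatch-sample.ko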

Signed-off-by: Jason Baron <jbaron@akamai.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Petr Mladek <pmladek@suse.com>
---
 include/linux/livepatch.h |  8 ++++++++
 kernel/livepatch/core.c   |  7 +++++++
 kernel/sysctl.c           | 12 ++++++++++++
 3 files changed, 27 insertions(+)

diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
index ee6d18b..3c4df79 100644
--- a/include/linux/livepatch.h
+++ b/include/linux/livepatch.h
@@ -35,6 +35,14 @@
 #define KLP_UNPATCHED	 0
 #define KLP_PATCHED	 1
 
+/* livepatch mode */
+
+extern int sysctl_livepatch_mode;
+enum {
+	LIVEPATCH_MODE_DEFAULT,
+	LIVEPATCH_MODE_REPLACE,
+};
+
 /**
  * struct klp_func - function structure for live patching
  * @old_name:	name of the function to be patched
diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
index 21cecc5..cdd89a4 100644
--- a/kernel/livepatch/core.c
+++ b/kernel/livepatch/core.c
@@ -49,6 +49,8 @@ LIST_HEAD(klp_patches);
 
 static struct kobject *klp_root_kobj;
 
+int sysctl_livepatch_mode;
+
 static bool klp_is_module(struct klp_object *obj)
 {
 	return obj->name;
@@ -868,6 +870,11 @@ static int klp_init_patch(struct klp_patch *patch)
 	patch->enabled = false;
 	patch->replaced = false;
 
+	if (sysctl_livepatch_mode == LIVEPATCH_MODE_REPLACE)
+		patch->replace = true;
+	else
+		patch->replace = false;
+
 	init_completion(&patch->finish);
 
 	ret = kobject_init_and_add(&patch->kobj, &klp_ktype_patch,
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 4dfba1a..3a0a1f6 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -67,6 +67,7 @@
 #include <linux/kexec.h>
 #include <linux/bpf.h>
 #include <linux/mount.h>
+#include <linux/livepatch.h>
 
 #include <linux/uaccess.h>
 #include <asm/processor.h>
@@ -1203,6 +1204,17 @@ static struct ctl_table kern_table[] = {
 		.extra2		= &one,
 	},
 #endif
+#ifdef CONFIG_LIVEPATCH
+	{
+		.procname	= "livepatch_mode",
+		.data		= &sysctl_livepatch_mode,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= &zero,
+		.extra2		= &one,
+	},
+#endif
 	{ }
 };
 
-- 
2.6.1


* Re: [PATCH v2 1/3] livepatch: Add dynamic klp_object and klp_func iterators
  2017-08-30 21:38 ` [PATCH v2 1/3] livepatch: Add dynamic klp_object and klp_func iterators Jason Baron
@ 2017-09-07 12:34   ` Petr Mladek
  2017-09-07 16:53     ` Joe Perches
From: Petr Mladek @ 2017-09-07 12:34 UTC (permalink / raw)
  To: Jason Baron; +Cc: linux-kernel, live-patching, jpoimboe, jeyu, jikos, mbenes

On Wed 2017-08-30 17:38:43, Jason Baron wrote:
> In preparation for introducing atomic replace, introduce iterators for klp_func
> and klp_object, such that objects and functions can be dynamically allocated
> (needed for atomic replace). This patch is intended to effectively be a no-op
> until atomic replace is introduced.

./scripts/checkpatch.pl reports that these lines are too long.
It prefers a maximum of 75 chars per line in the commit message ;-)

> 
> --- a/include/linux/livepatch.h
> +++ b/include/linux/livepatch.h
> @@ -24,6 +24,7 @@
>  #include <linux/module.h>
>  #include <linux/ftrace.h>
>  #include <linux/completion.h>
> +#include <linux/list.h>
>  
>  #if IS_ENABLED(CONFIG_LIVEPATCH)
>  
> @@ -44,6 +45,7 @@
>   * @old_addr:	the address of the function being patched
>   * @kobj:	kobject for sysfs resources
>   * @stack_node:	list node for klp_ops func_stack list
> + * @func_entry: used to link struct klp_func to struct klp_object

Please, make it clear that only dynamically allocated structures
are linked. Same for the other entries.

It might look superfluous when you read this patch. But it
will help a lot when you read the code one year from now.


>   * @old_size:	size of the old function
>   * @new_size:	size of the new function
>   * @patched:	the func has been added to the klp_ops list

[...]

> @@ -126,17 +134,95 @@ struct klp_patch {
>  	/* internal */
>  	struct list_head list;
>  	struct kobject kobj;
> +	struct list_head obj_list;
>  	bool enabled;
>  	struct completion finish;
>  };
>  
> +static inline struct klp_object *obj_iter_next(struct klp_patch *patch,
> +					       struct klp_object *obj)

The function is far from trivial. I wonder if it is still a good
candidate for inlining.

Also it should get prefixed by klp_ because it is in a header
that can be included anywhere.

Next I still think that it will be easier to understand when
we make it more clear that only the dynamically allocated
objects are in the list. I mean renaming:

  obj_entry -> dyn_obj_entry
  obj_list -> dyn_obj_list

> +{
> +	struct klp_object *next_obj = NULL;
> +

The code is quite tricky. IMHO, it would deserve a comment.

	/*
	 * Statically defined objects are in NULL-ended array.
	 * Only dynamic ones are in the obj_list.
	 */
> +	if (list_empty(&obj->obj_entry)) {
> +		next_obj = obj + 1;
> +		if (next_obj->funcs || next_obj->name)
> +			goto out;
> +		else
> +			next_obj = NULL;

Please, add an empty line here to make it better readable.

> +		if (!list_empty(&patch->obj_list))
> +			next_obj = container_of(patch->obj_list.next,
> +					struct klp_object,
> +					obj_entry);
> +		goto out;
> +	}

Also here an empty line.

> +	if (obj->obj_entry.next != &patch->obj_list)
> +		next_obj = container_of(obj->obj_entry.next,
> +				struct klp_object,
> +				obj_entry);
> +out:
> +	return next_obj;
> +}

> +static inline struct klp_object *obj_iter_init(struct klp_patch *patch)
> +{
> +	if (patch->objs->funcs || patch->objs->name)

I would do something like

#define klp_is_null_obj(obj) (!obj->funcs && !obj->name)

Then it can be used here and in klp_obj_iter_next().
This will be even more useful in the func iterator
where the check is even more complicated.


> +		return patch->objs;
> +	else
> +		return NULL;
> +}
> +
>  #define klp_for_each_object(patch, obj) \
> -	for (obj = patch->objs; obj->funcs || obj->name; obj++)
> +	for (obj = obj_iter_init(patch); obj; obj = obj_iter_next(patch, obj))
> +
> +static inline struct klp_func *func_iter_next(struct klp_object *obj,
> +					      struct klp_func *func)
> +{
> +	struct klp_func *next_func = NULL;
> +
> +	if (list_empty(&func->func_entry)) {
> +		next_func = func + 1;
> +		if (next_func->old_name || next_func->new_func ||
> +		    next_func->old_sympos)
> +			goto out;
> +		else
> +			next_func = NULL;
> +		if (!list_empty(&obj->func_list))
> +			next_func = container_of(obj->func_list.next,
> +					struct klp_func,
> +					func_entry);

I have just realized that the usual practice is to use list_entry() instead
of container_of() for list entries. It probably makes the code more
readable to a trained eye.

> +		goto out;
> +	}
> +	if (func->func_entry.next != &obj->func_list)
> +		next_func = container_of(func->func_entry.next,
> +					 struct klp_func,
> +					 func_entry);
> +out:
> +	return next_func;
> +}

[...]

>  #define klp_for_each_func(obj, func) \
> -	for (func = obj->funcs; \
> -	     func->old_name || func->new_func || func->old_sympos; \
> -	     func++)
> +	for (func = func_iter_init(obj); func; \
> +	     func = func_iter_next(obj, func))

Otherwise, I have basically the same comments about iter_func
like for iter_obj.


>  int klp_register_patch(struct klp_patch *);
>  int klp_unregister_patch(struct klp_patch *);
> diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
> index b9628e4..6004be3 100644
> --- a/kernel/livepatch/core.c
> +++ b/kernel/livepatch/core.c
> @@ -729,7 +730,10 @@ static int klp_init_patch(struct klp_patch *patch)
>  		return ret;
>  	}
>  
> +	INIT_LIST_HEAD(&patch->obj_list);

Please, do this together with the other trivial initializations.
I mean to do it in the same place as in the other init functions:

	patch->enabled = false;
	patch->replaced = false;
+	INIT_LIST_HEAD(&patch->obj_list);


>  	klp_for_each_object(patch, obj) {
> +		INIT_LIST_HEAD(&obj->obj_entry);
> +		INIT_LIST_HEAD(&obj->func_list);

These two should be done in klp_init_object(). You move it there
in the second patch anyway.

>  		ret = klp_init_object(patch, obj);
>  		if (ret)
>  			goto free;

Best Regards,
Petr


* Re: [PATCH v2 1/3] livepatch: Add dynamic klp_object and klp_func iterators
  2017-09-07 12:34   ` Petr Mladek
@ 2017-09-07 16:53     ` Joe Perches
From: Joe Perches @ 2017-09-07 16:53 UTC (permalink / raw)
  To: Petr Mladek, Jason Baron
  Cc: linux-kernel, live-patching, jpoimboe, jeyu, jikos, mbenes

On Thu, 2017-09-07 at 14:34 +0200, Petr Mladek wrote:
> On Wed 2017-08-30 17:38:43, Jason Baron wrote:
[]
> > +	if (list_empty(&obj->obj_entry)) {
> > +		next_obj = obj + 1;
> > +		if (next_obj->funcs || next_obj->name)
> > +			goto out;
> > +		else
> > +			next_obj = NULL;
> 
> Please, add an empty line here to make it better readable.

and/or just get rid of the else

> > +static inline struct klp_func *func_iter_next(struct klp_object *obj,
> > +					      struct klp_func *func)
> > +{
> > +	struct klp_func *next_func = NULL;
> > +
> > +	if (list_empty(&func->func_entry)) {
> > +		next_func = func + 1;
> > +		if (next_func->old_name || next_func->new_func ||
> > +		    next_func->old_sympos)
> > +			goto out;
> > +		else
> > +			next_func = NULL;

here too

> > +		if (!list_empty(&obj->func_list))
> > +			next_func = container_of(obj->func_list.next,
> > +					struct klp_func,
> > +					func_entry);
> 
> I have just realized that a practice is to use list_entry() instead
> of container_of() for list entries. It probably makes the code better
> readable for a trained eye.


* Re: [PATCH v2 2/3] livepatch: add atomic replace
  2017-08-30 21:38 ` [PATCH v2 2/3] livepatch: add atomic replace Jason Baron
@ 2017-09-11 13:53   ` Petr Mladek
  2017-09-12  3:46     ` Jason Baron
From: Petr Mladek @ 2017-09-11 13:53 UTC (permalink / raw)
  To: Jason Baron; +Cc: linux-kernel, live-patching, jpoimboe, jeyu, jikos, mbenes

On Wed 2017-08-30 17:38:44, Jason Baron wrote:
> When doing cumulative patches, if patch A introduces a change to function 1,
> and patch B reverts the change to function 1 and introduces changes to, say,
> functions 2 and 3 as well, the change that patch A introduced to function 1
> is still present.

This is a bit confusing. One might think that function 1 still
uses the code added by patch A. I would say something like:


Sometimes we would like to revert a particular fix. This is not easy
because we want to keep all other fixes active, and we could revert
only the last applied patch.

One solution would be to apply a new patch that implements all
the reverted functions as in the original code. It would work
as expected but there would be unnecessary redirections.

A better solution would be to say that a new patch replaces
an older one. This might be complicated if we want to replace
a particular patch. But it is rather easy when a new cumulative
patch replaces all others.

For this, a new "replace" flag is added to the struct klp_patch.
When it is enabled, a 'no_op' klp_func will be dynamically
created for all functions that are already being patched but
that will no longer be modified by the new patch. They are
temporarily used as a new target during the patch transition...


> Introduce atomic replace, by first creating a 'no_op' klp_func for all
> the functions that are reverted by patch B. The reason that 'no_op'
> klp_funcs are created, instead of just unregistering directly from the ftrace
> function hook, is to ensure that the consistency model is properly preserved.
> By introducing the 'no_op' functions, 'atomic replace' leverages the existing
> consistency model code. Then, after transition to the new code, 'atomic
> replace' unregisters the ftrace handlers that are associated with the 'no_op'
> klp_funcs, such that we run the original un-patched function with no
> additional no_op function overhead.
> 
> Since an 'atomic replace' patch has completely replaced all previous livepatch
> modules, it explicitly disables those previous modules (in the example, patch
> A), such that the livepatch module associated with patch A can be completely
> removed (rmmod). Patch A is now in a permanently disabled state. But if it is
> removed from the kernel with rmmod, it can be re-inserted (insmod) and enabled
> again, this time acting as an atomic replace on top of patch B.
>
> diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
> index 8d3df55..ee6d18b 100644
> --- a/include/linux/livepatch.h
> +++ b/include/linux/livepatch.h
> @@ -119,10 +121,12 @@ struct klp_object {
>   * @mod:	reference to the live patch module
>   * @objs:	object entries for kernel objects to be patched
>   * @immediate:  patch all funcs immediately, bypassing safety mechanisms
> + * @replace:	replace all funcs, reverting functions that are no longer patched
>   * @list:	list node for global list of registered patches
>   * @kobj:	kobject for sysfs resources
>   * @obj_list:	head of list for dynamically allocated struct klp_object
>   * @enabled:	the patch is enabled (but operation may be incomplete)
> + * @replaced:	the patch has been replaced and cannot be re-enabled

I finally understand why you do this. I forgot that even disabled
patches are still in the klp_patches list.

This makes some sense for patches that fix different bugs and
touch the same function. They should be applied and enabled
in the right order because a later patch for the same function
must be based on the code from the previous one.

It makes less sense for cummulative patches that we use in kGraft.
There basically every patch has the "replace" flag set. If
we enable one patch it simply replaces any other one. Then
ordering is not important. Each patch is standalone and
consistent on its own.


I see two problems with your approach. One is cosmetic.
The names of the two flags replace/replaced are too similar.
It is quite prone to mistakes ;-)

Second, we could not remove a module where any function is patched
using the "immediate" flag. But people might want to revert
to this patch if the last one does not work as expected.
After all, the main purpose of livepatching is to fix
problems without a system reboot. Therefore we should
allow enabling the replaced patches even without
removing the module.


One solution would be to remove the replaced patches from
klp_patch list. And enforce stacking the following way
in __klp_enable_patch():

	/*
	 * Enforce stacking: only the first disabled patch can be
	 * enabled. The only exception is a patch that atomically
	 * replaces all other patches and was disabled by
	 * another patch of this type.
	 */
	if (list_empty(&patch->list)) {
		if (!patch->replace)
			return -EBUSY;
		list_add_tail(&patch->list, &klp_patches);
	} else if (patch->list.prev != &klp_patches &&
		   !list_prev_entry(patch, list)->enabled) {
		return -EBUSY;
	}

Well, I would add this support later in a separate patch.
We would need to (re)generate the no_op stuff in __klp_enable_patch()
for this. Also we would need to make sure that this patch
could no longer get enabled once klp_unregister_patch()
is called.

For now, I would just remove the replaced patches from
klp_patches list and refuse enabling patches that are
not in the list. I mean to use a check of
list_empty(&patch->list) instead of the extra
"replaced" variable.


>   * @finish:	for waiting till it is safe to remove the patch module
>   */
>  struct klp_patch {
> @@ -130,12 +134,14 @@ struct klp_patch {
>  	struct module *mod;
>  	struct klp_object *objs;
>  	bool immediate;
> +	bool replace;
>  
>  	/* internal */
>  	struct list_head list;
>  	struct kobject kobj;
>  	struct list_head obj_list;
>  	bool enabled;
> +	bool replaced;
>  	struct completion finish;
>  };
>  
> diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
> index 6004be3..21cecc5 100644
> --- a/kernel/livepatch/core.c
> +++ b/kernel/livepatch/core.c
> @@ -600,13 +603,38 @@ static void klp_free_patch(struct klp_patch *patch)
>  		list_del(&patch->list);
>  }
>  
> -static int klp_init_func(struct klp_object *obj, struct klp_func *func)
> +void klp_patch_free_no_ops(struct klp_patch *patch)
> +{
> +	struct klp_object *obj, *tmp_obj;
> +	struct klp_func *func, *tmp_func;
> +
> +	klp_for_each_object(patch, obj) {
> +		list_for_each_entry_safe(func, tmp_func, &obj->func_list,
> +					 func_entry) {
> +			list_del_init(&func->func_entry);
> +			kobject_put(&func->kobj);
> +			kfree(func->old_name);
> +			kfree(func);

kobject_put() is asynchronous. The structure should be freed using
the kobject release method.

The question is how secure this should be. We might want to block
other operations with the patch until all the release methods
are called. But this might be tricky as there is not a single
root kobject that would get destroyed at the end.

A crazy solution would be to define a global atomic counter.
It might get increased with each kobject_put() call and
decreased in each release method. Then we could wait
until it gets to zero.

It should be safe to wait with klp_mutex taken. Note that this
is not possible with patch->kobj, where the "enable"
attribute takes the mutex as well, see
enabled_store() in kernel/livepatch/core.c.
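
For illustration, the release-based variant might look something like
this (an untested sketch; only the dynamically allocated no_op
structures own their memory):

static void klp_kobj_release_func(struct kobject *kobj)
{
	struct klp_func *func = container_of(kobj, struct klp_func, kobj);

	/* Free only the dynamically allocated no_op funcs. */
	if (func->no_op) {
		kfree(func->old_name);
		kfree(func);
	}
}

Then klp_patch_free_no_ops() would just do list_del_init() and
kobject_put(), and the memory would go away once the last sysfs
reference is dropped.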


> +		}
> +		INIT_LIST_HEAD(&obj->func_list);
> +	}
> +	list_for_each_entry_safe(obj, tmp_obj, &patch->obj_list, obj_entry) {
> +		list_del_init(&obj->obj_entry);
> +		kobject_put(&obj->kobj);
> +		kfree(obj->name);
> +		kfree(obj);

Same here.

> +	}
> +	INIT_LIST_HEAD(&patch->obj_list);
> +}
> +
> +static int klp_init_func(struct klp_object *obj, struct klp_func *func,
> +			 bool no_op)
>  {
> -	if (!func->old_name || !func->new_func)
> +	if (!func->old_name || (!no_op && !func->new_func))
>  		return -EINVAL;
>  
> -	INIT_LIST_HEAD(&func->stack_node);
>  	INIT_LIST_HEAD(&func->func_entry);
> +	INIT_LIST_HEAD(&func->stack_node);

This just shuffles a line that was added in the previous patch ;-)

>  	func->patched = false;
>  	func->transition = false;
>  
> @@ -670,19 +698,23 @@ static int klp_init_object_loaded(struct klp_patch *patch,
>  	return 0;
>  }
>  
> -static int klp_init_object(struct klp_patch *patch, struct klp_object *obj)
> +static int klp_init_object(struct klp_patch *patch, struct klp_object *obj,
> +			   bool no_op)

If we initialized the no_op stuff earlier, we might pass this
information via obj->no_op. See below for more details.

>  {
>  	struct klp_func *func;
>  	int ret;
>  	const char *name;
>  
> -	if (!obj->funcs)
> +	if (!obj->funcs && !no_op)
>  		return -EINVAL;
>  
>  	obj->patched = false;
>  	obj->mod = NULL;
> +	INIT_LIST_HEAD(&obj->obj_entry);
> +	INIT_LIST_HEAD(&obj->func_list);

This should be done already in the previous patch.
Well, we might want to change this to

	if (!obj->no_op)
		INIT_LIST_HEAD(&obj->func_list);


>  
> -	klp_find_object_module(obj);
> +	if (!no_op)
> +		klp_find_object_module(obj);

This looks just like an optimization. You copy the information
from the related to-be-removed patch. I would avoid it to reduce
the extra changes and twists in the code. It is a rare slow path.

Instead, I suggest to initialize the no_op related stuff
earlier and then call the classic init() methods that
would fill the rest. See below.


>  	name = klp_is_module(obj) ? obj->name : "vmlinux";
>  	ret = kobject_init_and_add(&obj->kobj, &klp_ktype_object,
> @@ -690,8 +722,12 @@ static int klp_init_object(struct klp_patch *patch, struct klp_object *obj)
>  	if (ret)
>  		return ret;
>  
> +	if (no_op)
> +		return 0;

This might be hard to maintain. It seems to be needed because
you call klp_init_no_ops() after you have already called
klp_init_object() for the statically defined ones.

Alternative solution would be to add the dynamically allocated
structures earlier and call basically the unchanged klp_init_object()
for all of them. See below for more details.

> +
>  	klp_for_each_func(obj, func) {
> -		ret = klp_init_func(obj, func);
> +		func->no_op = false;

This is weird and not needed. We should not touch it here.
Note that statically defined structures are zeroed except
for the explicitly defined members. Therefore it is false
out of the box. It should be set earlier for the dynamic ones.

> +		ret = klp_init_func(obj, func, false);
>  		if (ret)
>  			goto free;
>  	}
> @@ -710,6 +746,115 @@ static int klp_init_object(struct klp_patch *patch, struct klp_object *obj)
>  	return ret;
>  }
>  
> +static int klp_init_patch_no_ops(struct klp_patch *prev_patch,
> +				 struct klp_patch *patch)
> +{
> +	struct klp_object *obj, *prev_obj;
> +	struct klp_func *prev_func, *func;
> +	int ret;
> +	bool found, mod;

This is a lot of spaghetti code: more than one screen and 6 indentation
levels. Also, all the use of prev_patch is rather confusing.

If we call this function before we initialize the statically defined
objects, then we could keep the original klp_init_* functions
almost without changes.

First, I suggest to create a function that would initialize just
the stuff specific for the dynamically allocated stuff.

static void klp_init_patch_dyn(struct klp_patch *patch)
{
	struct klp_object *obj;
	struct klp_func *func;

	INIT_LIST_HEAD(&patch->obj_list);
	klp_for_each_object(patch, obj) {
		INIT_LIST_HEAD(&obj->obj_entry);
		INIT_LIST_HEAD(&obj->func_list);
		klp_for_each_func(obj, func) {
			INIT_LIST_HEAD(&func->func_entry);
		}
	}
}

I still suggest making it clear that these list items are
not generic. I would prefer names showing that they
are used by the no_op stuff =>

  patch->obj_list  -> patch->nop_obj_list
  obj->obj_entry   -> obj->nop_obj_entry
  obj->func_list   -> obj->nop_func_list
  func->func_entry -> func->nop_func_entry

Note that I used "nop" instead of "no_op". I think that
it is a well known shortcut.



Let's go back to the init stuff. As a second step, we could
allocate and add the structures for the no_op operations.
We should fill the no_op-specific stuff and the items that
are normally pre-set in statically defined structures.
Then we could call the classic klp_init_object().

I would split this into several levels. I do not say that the
following is the best split but...

static int klp_check_and_add_no_op_func(struct klp_object *obj,
				struct klp_func *old_func)
{
	struct klp_func *func;

	func = klp_find_func(obj, old_func->old_name, old_func->old_sympos);
	if (func)
		return 0;

	func = klp_create_no_op_func(old_func->old_name,
				     old_func->old_sympos,
				     old_func->immediate);
	if (IS_ERR(func))
		return PTR_ERR(func);

	list_add(&func->func_entry, &obj->func_list);

	return 0;
}

static int klp_check_and_add_no_op_object(struct klp_patch *patch,
				struct klp_object *old_obj)
{
	struct klp_object *obj;
	struct klp_func *old_func;
	int err = 0;

	obj = klp_find_obj(patch, old_obj->name);
	if (!obj) {
		obj = klp_create_no_op_object(old_obj->name);
		if (IS_ERR(obj))
			   return PTR_ERR(obj);

		list_add(&obj->obj_entry, &patch->obj_list);
	}

	klp_for_each_func(old_obj, old_func) {
		err = klp_check_and_add_no_op_func(obj, old_func);
		if (err)
			// destroy previously created no_op stuff?
	}

	return 0;
}

static int klp_check_and_add_no_ops(struct klp_patch *patch)
{
	struct klp_patch *old_patch;
	struct klp_object *old_obj;
	int err = 0;

	list_for_each_entry(old_patch, &klp_patches, list) {
		klp_for_each_object(old_patch, old_obj) {
			err = klp_check_and_add_no_op_object(patch,
					old_obj);
			if (err)
				return err;
		}
	}

	return 0;
}

It is more code but it should be more readable and manageable.
Especially, the new functions handle only the no_op-specific stuff.
They do not duplicate functionality that could be done
by the original init() functions for the statically
defined structures.

> +
> +	klp_for_each_object(prev_patch, prev_obj) {
> +		klp_for_each_func(prev_obj, prev_func) {
> +next_func:

> +			klp_for_each_object(patch, obj) {
> +				klp_for_each_func(obj, func) {
> +					if ((strcmp(prev_func->old_name,
> +						    func->old_name) == 0) &&
> +						(prev_func->old_sympos ==
> +							func->old_sympos)) {
> +						goto next_func;
> +					}
> +				}
> +			}

This could be used to implement the klp_find_func() helper.

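For completeness, such a helper might be as simple as (an untested
sketch):

static struct klp_func *klp_find_func(struct klp_object *obj,
				      const char *old_name,
				      unsigned long old_sympos)
{
	struct klp_func *func;

	klp_for_each_func(obj, func) {
		if (!strcmp(func->old_name, old_name) &&
		    func->old_sympos == old_sympos)
			return func;
	}

	return NULL;
}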

> +			mod = klp_is_module(prev_obj);
> +			found = false;
> +			klp_for_each_object(patch, obj) {
> +				if (mod) {
> +					if (klp_is_module(obj) &&
> +					    strcmp(prev_obj->name,
> +						   obj->name) == 0) {
> +						found = true;
> +						break;
> +					}
> +				} else if (!klp_is_module(obj)) {
> +					found = true;
> +					break;
> +				}
> +			}

This could be used to implement the klp_find_obj() helper used above.

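Again an untested sketch, mirroring the NULL-name convention
used for vmlinux:

static struct klp_object *klp_find_obj(struct klp_patch *patch,
				       const char *name)
{
	struct klp_object *obj;

	klp_for_each_object(patch, obj) {
		if ((!name && !obj->name) ||
		    (name && obj->name && !strcmp(name, obj->name)))
			return obj;
	}

	return NULL;
}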

> +			if (!found) {
> +				obj = kzalloc(sizeof(*obj), GFP_KERNEL);
> +				if (!obj)
> +					return -ENOMEM;
> +				if (prev_obj->name) {
> +					obj->name = kstrdup(prev_obj->name,
> +							    GFP_KERNEL);
> +					if (!obj->name) {
> +						kfree(obj);
> +						return -ENOMEM;
> +					}
> +				} else {
> +					obj->name = NULL;
> +				}
> +				obj->funcs = NULL;
> +				ret = klp_init_object(patch, obj, true);
> +				if (ret) {
> +					kfree(obj->name);
> +					kfree(obj);
> +					return ret;
> +				}
> +				obj->mod = prev_obj->mod;
> +				list_add(&obj->obj_entry, &patch->obj_list);
> +			}

This is a base for klp_create_no_op_object(). It should set everything
that will be needed for the classic klp_init_object() plus the
no_op-specific stuff. It might look like:

static struct klp_object *
klp_create_no_op_object(const char *name)
{
	struct klp_object *obj;

	obj = kzalloc(sizeof(*obj), GFP_KERNEL);
	if (!obj)
		return ERR_PTR(-ENOMEM);

	obj->no_op = true;
	INIT_LIST_HEAD(&obj->func_list);

	if (name) {
		obj->name = kstrdup(name, GFP_KERNEL);
		if (!obj->name) {
			kfree(obj);
			return ERR_PTR(-ENOMEM);
		}
	}

	return obj;
}

> +			func = kzalloc(sizeof(*func), GFP_KERNEL);
> +			if (!func)
> +				return -ENOMEM;
> +			if (prev_func->old_name) {
> +				func->old_name = kstrdup(prev_func->old_name,
> +							 GFP_KERNEL);
> +				if (!func->old_name) {
> +					kfree(func);
> +					return -ENOMEM;
> +				}
> +			} else {
> +				func->old_name = NULL;
> +			}
> +			func->new_func = NULL;
> +			func->old_sympos = prev_func->old_sympos;
> +			ret = klp_init_func(obj, func, true);
> +			func->immediate = prev_func->immediate;
> +			func->old_addr = prev_func->old_addr;
> +			func->old_size = prev_func->old_size;
> +			func->new_size = 0;
> +			func->no_op = true;
> +			list_add(&func->func_entry, &obj->func_list);

And this is a base for klp_create_no_op_func(). Something like:

static struct klp_func *
klp_create_no_op_func(const char *old_name,
		      unsigned long old_sympos,
		      bool immediate)
{
	struct klp_func *func;

	if (!old_name)
		return ERR_PTR(-EINVAL);

	func = kzalloc(sizeof(*func), GFP_KERNEL);
	if (!func)
		return ERR_PTR(-ENOMEM);

	func->old_name = kstrdup(old_name, GFP_KERNEL);
	if (!func->old_name) {
		kfree(func);
		return ERR_PTR(-ENOMEM);
	}

	func->old_sympos = old_sympos;
	func->immediate = immediate;
	func->no_op = true;

	return func;
}


> +		}
> +	}
> +	return 0;
> +}
> +
> +static int klp_init_no_ops(struct klp_patch *patch)
> +{
> +	struct klp_patch *prev_patch;
> +	int ret = 0;
> +
> +	if (!patch->replace)
> +		return 0;
> +
> +	prev_patch = patch;
> +	while (prev_patch->list.prev != &klp_patches) {
> +		prev_patch = list_prev_entry(prev_patch, list);
> +
> +		ret = klp_init_patch_no_ops(prev_patch, patch);
> +		if (ret)
> +			return ret;
> +
> +		if (prev_patch->replace)
> +			break;
> +	}
> +	return 0;
> +}
> +
>  static int klp_init_patch(struct klp_patch *patch)
>  {
>  	struct klp_object *obj;
> @@ -721,6 +866,8 @@ static int klp_init_patch(struct klp_patch *patch)
>  	mutex_lock(&klp_mutex);
>  
>  	patch->enabled = false;
> +	patch->replaced = false;
> +
>  	init_completion(&patch->finish);
>  
>  	ret = kobject_init_and_add(&patch->kobj, &klp_ktype_patch,
> @@ -732,20 +879,25 @@ static int klp_init_patch(struct klp_patch *patch)
>  
>  	INIT_LIST_HEAD(&patch->obj_list);
>  	klp_for_each_object(patch, obj) {
> -		INIT_LIST_HEAD(&obj->obj_entry);
> -		INIT_LIST_HEAD(&obj->func_list);
> -		ret = klp_init_object(patch, obj);
> +		ret = klp_init_object(patch, obj, false);

The extra parameter still might be needed because only no_ops
need to get re-added and reinitialized when the disabled patch
is enabled again. Well, this feature might be added later
in a separate patch.

>  		if (ret)
>  			goto free;
>  	}
>  
>  	list_add_tail(&patch->list, &klp_patches);
>  
> +	ret = klp_init_no_ops(patch);
> +	if (ret) {
> +		list_del(&patch->list);
> +		goto free;
> +	}

This should be done earlier, using the klp_check_and_add_no_ops()
suggested above.

> +
>  	mutex_unlock(&klp_mutex);
>  
>  	return 0;
>  
>  free:
> +	klp_patch_free_no_ops(patch);
>  	klp_free_objects_limited(patch, obj);
>  
>  	mutex_unlock(&klp_mutex);
> diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
> index b004a1f..d5e620e 100644
> --- a/kernel/livepatch/transition.c
> +++ b/kernel/livepatch/transition.c
> @@ -70,6 +72,7 @@ static void klp_synchronize_transition(void)
>  	schedule_on_each_cpu(klp_sync);
>  }
>  
> +

extra empty line?

>  /*
>   * The transition to the target patch state is complete.  Clean up the data
>   * structures.
> @@ -81,14 +84,39 @@ static void klp_complete_transition(void)
>  	struct task_struct *g, *task;
>  	unsigned int cpu;
>  	bool immediate_func = false;
> +	bool no_op = false;
> +	struct klp_patch *prev_patch;
> +
> +	/*
> +	 * for replace patches, we disable all previous patches, and replace
> +	 * the dynamic no-op functions by removing the ftrace hook. After
> +	 * klp_synchronize_transition() is called it's safe to free
> +	 * the dynamic no-op functions, done by klp_patch_free_no_ops()
> +	 */
> +	if (klp_target_state == KLP_PATCHED && klp_transition_patch->replace) {
> +		prev_patch = klp_transition_patch;
> +		while (prev_patch->list.prev != &klp_patches) {
> +			prev_patch = list_prev_entry(prev_patch, list);

I wonder if the following code might be more obvious:

	list_for_each_entry(old_patch, &klp_patches, list) {
		if (old_patch == klp_transition_patch)
			continue;

> +			if (prev_patch->enabled) {
> +				klp_unpatch_objects(prev_patch, false);
> +				prev_patch->enabled = false;

I suggested to do this in __klp_disable_patch() in the last review.
But we need something else.

The fact is that patch->enabled is normally manipulated at the
beginning of the transition, see __klp_enable_patch()
and __klp_disable_patch(). Therefore we are kind of
inconsistent when we manipulate it here for the replaced
patches.

But we must keep the patches enabled during the entire
transition. Otherwise, klp_module_coming() and klp_module_going()
would not patch/unpatch the objects correctly.

In fact, I think that we should set
klp_transition_patch->enabled = false in klp_complete_transition()
instead of in __klp_disable_patch(). I mean that flag should reflect
whether the ftrace handlers see the patch or not. Therefore it
should be always true during the transition. Then we would not
need the extra check for klp_transition_patch in coming/going
handlers. Well, this should be done in separate patch.
I could prepare it if you want.
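Just to show the placement I mean (a rough sketch, not tested):

static void klp_complete_transition(void)
{
	/*
	 * The flag reflects whether the ftrace handlers still see
	 * the patch. It stays true during the whole transition and
	 * gets cleared only here, once the transition has finished.
	 */
	if (klp_target_state == KLP_UNPATCHED)
		klp_transition_patch->enabled = false;

	/* the rest of the cleanup stays as is */
}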


> +				prev_patch->replaced = true;
> +				module_put(prev_patch->mod);

In each case, we could do this only when the related no_op
functions were not applied using "immediate" flag.

I am not 100% sure that it is really necessary. But I would feel
more comfortable when we do this after klp_synchronize_transition();
so that any ftrace handler could not see the related
functions on the stack.

> +			}
> +		}
> +		klp_unpatch_objects(klp_transition_patch, true);
> +		no_op = true;
> +	}
>  
>  	if (klp_target_state == KLP_UNPATCHED) {
>  		/*
>  		 * All tasks have transitioned to KLP_UNPATCHED so we can now
>  		 * remove the new functions from the func_stack.
>  		 */
> -		klp_unpatch_objects(klp_transition_patch);
> +		klp_unpatch_objects(klp_transition_patch, false);

This code path made me think about a situation when
the "enable" operation was reverted during the transition.
Then the target state is KLP_UNPATCHED and no_op stuff
is there.

Well, in this case, we need to keep the no_op stuff
until either the patch is enabled again and the
transition finishes, or we need to remove/free it
in klp_unregister_patch().


> +	}
>  
> +	if (klp_target_state == KLP_UNPATCHED || no_op) {
>  		/*
>  		 * Make sure klp_ftrace_handler() can no longer see functions
>  		 * from this patch on the ops->func_stack.  Otherwise, after

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 2/3] livepatch: add atomic replace
  2017-09-11 13:53   ` Petr Mladek
@ 2017-09-12  3:46     ` Jason Baron
  2017-09-12  8:35       ` Petr Mladek
  0 siblings, 1 reply; 14+ messages in thread
From: Jason Baron @ 2017-09-12  3:46 UTC (permalink / raw)
  To: Petr Mladek; +Cc: linux-kernel, live-patching, jpoimboe, jeyu, jikos, mbenes



On 09/11/2017 09:53 AM, Petr Mladek wrote:
> On Wed 2017-08-30 17:38:44, Jason Baron wrote:
>> When doing cumulative patches, if patch A introduces a change to function 1,
>> and patch B reverts the change to function 1 and introduces changes to say
>> function 2 and 3 as well, the change that patch A introduced to function 1
>> is still present.
> 
> This is a bit confusing. One might think that function 1 still
> uses the code added by patch A. I would say something like:
> 
> 
> Sometimes we would like to revert a particular fix. This is not easy
> because we want to keep all other fixes active and we could revert
> only the last applied patch.
> 
> One solution would be to apply a new patch that implements all
> the reverted functions as in the original code. It would work
> as expected but there would be unnecessary redirections.
> 
> A better solution would be to say that a new patch replaces
> an older one. This might be complicated if we want to replace
> a particular patch. But it is rather easy when a new cumulative
> patch replaces all others.
> 
> For this, a new "replace" flag is added to the struct klp_patch.
> When it is enabled, 'no_op' klp_func will be dynamically
> created for all functions that are already being patched but
> that will no longer be modified by the new patch. They are
> temporarily used as a new target during the patch transition...
> 
>

ok. I will re-work the text here.

>> Introduce atomic replace, by first creating a 'no_op' klp_func for all
>> the functions that are reverted by patch B. The reason that 'no_op'
>> klp_funcs are created, instead of just unregistering directly from the ftrace
>> function hook, is to ensure that the consistency model is properly preserved.
>> By introducing the 'no_op' functions, 'atomic replace' leverages the existing
>> consistency model code. Then, after transition to the new code, 'atomic
>> replace' unregisters the ftrace handlers that are associated with the 'no_op'
>> klp_funcs, such that we run the original un-patched function with no
>> additional no_op function overhead.
>>
>> Since 'atomic replace' has completely replaced all previous livepatch modules,
>> it explicitly disables all previous livepatch modules, in the example, patch A,
>> such that the livepatch module associated with patch A can be completely removed
>> (rmmod). Patch A is now in a permanently disabled state. But if it is removed
>> from the kernel with rmmod, it can be re-inserted (insmod), and act as an atomic
>> replace on top of patch B.
>>
>> diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
>> index 8d3df55..ee6d18b 100644
>> --- a/include/linux/livepatch.h
>> +++ b/include/linux/livepatch.h
>> @@ -119,10 +121,12 @@ struct klp_object {
>>   * @mod:	reference to the live patch module
>>   * @objs:	object entries for kernel objects to be patched
>>   * @immediate:  patch all funcs immediately, bypassing safety mechanisms
>> + * @replace:	replace all funcs, reverting functions that are no longer patched
>>   * @list:	list node for global list of registered patches
>>   * @kobj:	kobject for sysfs resources
>>   * @obj_list:	head of list for dynamically allocated struct klp_object
>>   * @enabled:	the patch is enabled (but operation may be incomplete)
>> + * @replaced:	the patch has been replaced and cannot be re-enabled
> 
> I finally understand why you do this. I forgot that even disabled
> patches are still in klp_patch list.
> 
> This makes some sense for patches that fix different bugs and
> touch the same function. They should be applied and enabled
> in the right order because a later patch for the same function
> must be based on the code from the previous one.
> 
> It makes less sense for cumulative patches that we use in kGraft.
> There basically every patch has the "replace" flag set. If
> we enable one patch it simply replaces any other one. Then
> ordering is not important. Each patch is standalone and
> consistent on its own.
> 
> 
> I see two problems with your approach. One is cosmetic.
> The names of the two flags replace/replaced are too similar.
> It is quite prone to mistakes ;-)
> 
> Second, we could not remove a module where any function is patched
> using the "immediate" flag. But people might want to revert
> to this patch if the last one does not work as expected.
> After all the main purpose of livepatching is to fix
> problems without a system reboot. Therefore we should
> allow to enable the replaced patches even without
> removing the module.
> 

If the livepatch has the 'replace' bit set and not the 'immediate' bit,
then I believe that we could remove *all* types of previous livepatches
even if they have the 'immediate' flag set. That is, because of the
consistency model, we are sure that we are running only functions from
the cumulative replace patch.

So I think it may be worth making a distinction between patches that
have 'replace' bit set and the immediate bit *unset*, and those that
have the 'replace' bit set and the immediate bit *set*. So it seems to
me that this patchset could just reject 'replace' patches that have the
'immediate' flag set, and treat the 'immediate' flag being set as
separate todo item if there is interest?

Then, the 'replace' patch with 'immediate' not set, conceptually does a
'disable' and an 'unregister' of all previous live patches, such that
they could be removed (rmmod) and re-inserted.

> 
> One solution would be to remove the replaced patches from
> klp_patch list. And enforce stacking the following way
> in __klp_enable_patch():
> 
> 	/*
> 	 * Enforce stacking: only the first disabled patch can be
> 	 * enabled. The only exception is a patch that atomically
> 	 * replaces all other patches and was disabled by
> 	 * another patch of this type.
> 	 */
> 	if (list_empty(&patch->list)) {
> 		if (!patch->replace)
> 			return -EBUSY;
> 		list_add_tail(&patch->list, &klp_patches);
> 	} else if (patch->list.prev != &klp_patches &&
> 		   !list_prev_entry(patch, list)->enabled) {
> 		return -EBUSY;
> 	}
> 
> Well, I would add this support later in a separate patch.
> We would need to (re)generate the no_op stuff in __klp_enable_patch()
> for this. Also we would need to make sure that this patch
> could not longer get enabled once klp_unregister_patch()
> is called.
> 
> For now, I would just remove the replaced patches from
> klp_patches list and refuse enabling patches that are
> not in the list. I mean to use a check of
> list_empty(&patch->list) instead of the extra
> "replaced" variable.

Ok, that makes sense to me as a way to remove 'replaced'.

> 
> 
>>   * @finish:	for waiting till it is safe to remove the patch module
>>   */
>>  struct klp_patch {
>> @@ -130,12 +134,14 @@ struct klp_patch {
>>  	struct module *mod;
>>  	struct klp_object *objs;
>>  	bool immediate;
>> +	bool replace;
>>  
>>  	/* internal */
>>  	struct list_head list;
>>  	struct kobject kobj;
>>  	struct list_head obj_list;
>>  	bool enabled;
>> +	bool replaced;
>>  	struct completion finish;
>>  };
>>  
>> diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
>> index 6004be3..21cecc5 100644
>> --- a/kernel/livepatch/core.c
>> +++ b/kernel/livepatch/core.c
>> @@ -600,13 +603,38 @@ static void klp_free_patch(struct klp_patch *patch)
>>  		list_del(&patch->list);
>>  }
>>  
>> -static int klp_init_func(struct klp_object *obj, struct klp_func *func)
>> +void klp_patch_free_no_ops(struct klp_patch *patch)
>> +{
>> +	struct klp_object *obj, *tmp_obj;
>> +	struct klp_func *func, *tmp_func;
>> +
>> +	klp_for_each_object(patch, obj) {
>> +		list_for_each_entry_safe(func, tmp_func, &obj->func_list,
>> +					 func_entry) {
>> +			list_del_init(&func->func_entry);
>> +			kobject_put(&func->kobj);
>> +			kfree(func->old_name);
>> +			kfree(func);
> 
> kobject_put() is asynchronous. The structure should be freed using
> the kobject release method.
> 
> The question is how secure this should be. We might want to block
> other operations with the patch until all the release methods
> are called. But this might be tricky as there is not a single
> root kobject that would get destroyed at the end.
> 
> A crazy solution would be to define a global atomic counter.
> It might get increased with each kobject_put() call and
> decreased in each release method. Then we could wait
> until it gets zero.
> 
> It should be safe to wait with klp_mutex taken. Note that this
> is not possible with patch->kobj() where the "enable"
> attribute takes the mutex as well, see
> enabled_store() in kernel/livepatch/core.c.

Thanks for pointing this out - it looks like the livepatch code uses
wait_for_completion() with special kobj callbacks. Perhaps, there could
be a nop callback, but I'd have to look at this more closely...
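For example, a minimal release callback might look like this (just a
sketch; it assumes the kfree() calls move out of klp_patch_free_no_ops()
into the ktype release method):

static void klp_kobj_release_func(struct kobject *kobj)
{
	struct klp_func *func = container_of(kobj, struct klp_func, kobj);

	/* free the dynamic no_op data only after the kobject is gone */
	if (func->no_op) {
		kfree(func->old_name);
		kfree(func);
	}
}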

> 
> 
>> +		}
>> +		INIT_LIST_HEAD(&obj->func_list);
>> +	}
>> +	list_for_each_entry_safe(obj, tmp_obj, &patch->obj_list, obj_entry) {
>> +		list_del_init(&obj->obj_entry);
>> +		kobject_put(&obj->kobj);
>> +		kfree(obj->name);
>> +		kfree(obj);
> 
> Same here.
> 
>> +	}
>> +	INIT_LIST_HEAD(&patch->obj_list);
>> +}
>> +
>> +static int klp_init_func(struct klp_object *obj, struct klp_func *func,
>> +			 bool no_op)
>>  {
>> -	if (!func->old_name || !func->new_func)
>> +	if (!func->old_name || (!no_op && !func->new_func))
>>  		return -EINVAL;
>>  
>> -	INIT_LIST_HEAD(&func->stack_node);
>>  	INIT_LIST_HEAD(&func->func_entry);
>> +	INIT_LIST_HEAD(&func->stack_node);
> 
> This just shuffles a line that was added in the previous patch ;-)
> 
>>  	func->patched = false;
>>  	func->transition = false;
>>  
>> @@ -670,19 +698,23 @@ static int klp_init_object_loaded(struct klp_patch *patch,
>>  	return 0;
>>  }
>>  
>> -static int klp_init_object(struct klp_patch *patch, struct klp_object *obj)
>> +static int klp_init_object(struct klp_patch *patch, struct klp_object *obj,
>> +			   bool no_op)
> 
> If we initialized the no_op stuff earlier, we might pass this
> information via obj->no_op. See below for more details.
> 
>>  {
>>  	struct klp_func *func;
>>  	int ret;
>>  	const char *name;
>>  
>> -	if (!obj->funcs)
>> +	if (!obj->funcs && !no_op)
>>  		return -EINVAL;
>>  
>>  	obj->patched = false;
>>  	obj->mod = NULL;
>> +	INIT_LIST_HEAD(&obj->obj_entry);
>> +	INIT_LIST_HEAD(&obj->func_list);
> 
> This should be done already in the previous patch.
> Well, we might want to change this to
> 
> 	if (!obj->no_op)
> 		INIT_LIST_HEAD(&obj->func_list);
> 
> 
>>  
>> -	klp_find_object_module(obj);
>> +	if (!no_op)
>> +		klp_find_object_module(obj);
> 
> This looks just like an optimization. You copy the information
> from the related to-be-removed patch. I would avoid it to reduce
> the extra changes and twists in the code. It is a rare slow path.

ok.

> 
> Instead, I suggest to initialize the no_op related stuff
> earlier and then call the classic init() methods that
> would fill the rest. See below.
> 
> 
>>  	name = klp_is_module(obj) ? obj->name : "vmlinux";
>>  	ret = kobject_init_and_add(&obj->kobj, &klp_ktype_object,
>> @@ -690,8 +722,12 @@ static int klp_init_object(struct klp_patch *patch, struct klp_object *obj)
>>  	if (ret)
>>  		return ret;
>>  
>> +	if (no_op)
>> +		return 0;
> 
> This might be hard to maintain. It seems to be needed because
> you call klp_init_no_ops() after you already klp_init_object()
> for the statically defined ones.
> 
> Alternative solution would be to add the dynamically allocated
> structures earlier and call basically the unchanged klp_init_object()
> for all of them. See below for more details.

Ok, I can re-work this.

> 
>> +
>>  	klp_for_each_func(obj, func) {
>> -		ret = klp_init_func(obj, func);
>> +		func->no_op = false;
> 
> This is weird and not needed. We should not touch it here.
> Note that statically defined structures are zeroed except
> for the explicitly defined members. Therefore it is false
> out of box. It should be set earlier for the dynamic ones.
> 
>> +		ret = klp_init_func(obj, func, false);
>>  		if (ret)
>>  			goto free;
>>  	}
>> @@ -710,6 +746,115 @@ static int klp_init_object(struct klp_patch *patch, struct klp_object *obj)
>>  	return ret;
>>  }
>>  
>> +static int klp_init_patch_no_ops(struct klp_patch *prev_patch,
>> +				 struct klp_patch *patch)
>> +{
>> +	struct klp_object *obj, *prev_obj;
>> +	struct klp_func *prev_func, *func;
>> +	int ret;
>> +	bool found, mod;
> 
> This is a lot of spaghetti code: more than one screen and six indentation
> levels. Also all the use of prev_patch is rather confusing.
> 
> If we call this function before we initialize the statically defined
> objects, then we could keep the original klp_init_* functions
> almost without changes.
> 
> First, I suggest creating a function that would initialize just
> the stuff specific to the dynamically allocated structures.
> 
> static void klp_init_patch_dyn(struct klp_patch *patch)
> {
> 	struct klp_object *obj;
> 	struct klp_func *func;
> 
> 	INIT_LIST_HEAD(&patch->obj_list);
> 	klp_for_each_object(patch, obj) {
> 		INIT_LIST_HEAD(&obj->obj_entry);
> 		INIT_LIST_HEAD(&obj->func_list);
> 		klp_for_each_func(obj, func) {
> 			INIT_LIST_HEAD(&func->func_entry);
> 		}
> 	}
> }
> 
> I still suggest making it clear that these list items are
> not generic. I would prefer to make it clear that they
> are used by the no_op stuff =>
> 
>   patch->obj_list  -> patch->nop_obj_list
>   obj->obj_entry   -> obj->nop_obj_entry
>   obj->func_list   -> obj->nop_func_list
>   func->func_entry -> func->nop_func_entry
> 
> Note that I used "nop" instead of "no_op". I think that
> it is a well known shortcut.
> 
> 
> 
> Let's go back to the init stuff. As a second step, we could
> allocate and add the structures for the no_op operations.
> We should fill the no_op-specific stuff and the items that
> are normally pre-set in statically defined structures.
> Then we could call the classic klp_init_object().
> 
> I would split this into several levels. I do not say that the
> following is the best split but...
> 
> static int klp_check_and_add_no_op_func(struct klp_object *obj,
> 				struct klp_func *old_func)
> {
> 	struct klp_func *func;
> 
> 	func = klp_find_func(obj, old_func->old_name, old_func->old_sympos);
> 	if (func)
> 		return 0;
> 
> 	func = klp_create_no_op_func(old_func->old_name,
> 				     old_func->old_sympos,
> 				     old_func->immediate);
> 	if (IS_ERR(func))
> 		return PTR_ERR(func);
> 
> 	list_add(&func->func_entry, &obj->func_list);
> 
> 	return 0;
> }
> 
> static int klp_check_and_add_no_op_object(struct klp_patch *patch,
> 				struct klp_object *old_obj)
> {
> 	struct klp_object *obj;
> 	struct klp_func *old_func;
> 	int err = 0;
> 
> 	obj = klp_find_obj(patch, old_obj->name);
> 	if (!obj) {
> 		obj = klp_create_no_op_object(old_obj->name);
> 		if (IS_ERR(obj))
> 			   return PTR_ERR(obj);
> 
> 		list_add(&obj->obj_entry, &patch->obj_list);
> 	}
> 
> 	klp_for_each_func(old_obj, old_func) {
> 		err = klp_check_and_add_no_op_func(obj, old_func);
> 		if (err)
> 			return err; /* destroy previously created no_op stuff? */
> 	}
> 
> 	return 0;
> }
> 
> static int klp_check_and_add_no_ops(struct klp_patch *patch)
> {
> 	struct klp_patch *old_patch;
> 	struct klp_object *old_obj;
> 	int err = 0;
> 
> 	list_for_each_entry(old_patch, &klp_patches, list) {
> 		klp_for_each_object(old_patch, old_obj) {
> 			err = klp_check_and_add_no_op_object(patch,
> 					old_obj);
> 			if (err)
> 				return err;
> 		}
> 	}
> 
> 	return 0;
> }
> 
> It is more code but it should be more readable and manageable.
> Especially, the new functions handle only the no_op specific stuff.
> They do not duplicate functionality that could be done
> by the original init() functions for the statically
> defined structures.

Ok, this restructuring makes sense and I will try it out.

> 
>> +
>> +	klp_for_each_object(prev_patch, prev_obj) {
>> +		klp_for_each_func(prev_obj, prev_func) {
>> +next_func:
> 
>> +			klp_for_each_object(patch, obj) {
>> +				klp_for_each_func(obj, func) {
>> +					if ((strcmp(prev_func->old_name,
>> +						    func->old_name) == 0) &&
>> +						(prev_func->old_sympos ==
>> +							func->old_sympos)) {
>> +						goto next_func;
>> +					}
>> +				}
>> +			}
> 
> This could be used to implement that klp_find_func().
> 
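Something like the following, I guess (untested sketch, reusing the
existing klp_for_each_func() iterator):

static struct klp_func *klp_find_func(struct klp_object *obj,
				      const char *old_name,
				      unsigned long old_sympos)
{
	struct klp_func *func;

	klp_for_each_func(obj, func) {
		if (!strcmp(func->old_name, old_name) &&
		    func->old_sympos == old_sympos)
			return func;
	}

	return NULL;
}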
> 
>> +			mod = klp_is_module(prev_obj);
>> +			found = false;
>> +			klp_for_each_object(patch, obj) {
>> +				if (mod) {
>> +					if (klp_is_module(obj) &&
>> +					    strcmp(prev_obj->name,
>> +						   obj->name) == 0) {
>> +						found = true;
>> +						break;
>> +					}
>> +				} else if (!klp_is_module(obj)) {
>> +					found = true;
>> +					break;
>> +				}
>> +			}
> 
> This could be used to implement klp_find_object().
> 
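And klp_find_obj() might look something like this (again just a sketch;
a NULL name stands for vmlinux, as with klp_is_module()):

static struct klp_object *klp_find_obj(struct klp_patch *patch,
				       const char *name)
{
	struct klp_object *obj;

	klp_for_each_object(patch, obj) {
		if (!name) {
			/* a NULL name means vmlinux */
			if (!klp_is_module(obj))
				return obj;
		} else if (klp_is_module(obj) && !strcmp(obj->name, name)) {
			return obj;
		}
	}

	return NULL;
}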
> 
>> +			if (!found) {
>> +				obj = kzalloc(sizeof(*obj), GFP_KERNEL);
>> +				if (!obj)
>> +					return -ENOMEM;
>> +				if (prev_obj->name) {
>> +					obj->name = kstrdup(prev_obj->name,
>> +							    GFP_KERNEL);
>> +					if (!obj->name) {
>> +						kfree(obj);
>> +						return -ENOMEM;
>> +					}
>> +				} else {
>> +					obj->name = NULL;
>> +				}
>> +				obj->funcs = NULL;
>> +				ret = klp_init_object(patch, obj, true);
>> +				if (ret) {
>> +					kfree(obj->name);
>> +					kfree(obj);
>> +					return ret;
>> +				}
>> +				obj->mod = prev_obj->mod;
>> +				list_add(&obj->obj_entry, &patch->obj_list);
>> +			}
> 
> This is a base for the klp_create_no_op_object(). It should set everything
> that will be needed for the classic klp_init_object() plus that no_op
> specific stuff. It might look like:
> 
> static struct klp_object *
> klp_create_no_op_object(const char *name)
> {
> 	struct klp_object *obj;
> 
> 	obj = kzalloc(sizeof(*obj), GFP_KERNEL);
> 	if (!obj)
> 		return ERR_PTR(-ENOMEM);
> 
> 	obj->no_op = true;
> 	INIT_LIST_HEAD(&obj->func_list);
> 
> 	if (name) {
> 		obj->name = kstrdup(name, GFP_KERNEL);
> 		if (!obj->name) {
> 			kfree(obj);
> 			return ERR_PTR(-ENOMEM);
> 		}
> 	}
> 
> 	return obj;
> }
> 
>> +			func = kzalloc(sizeof(*func), GFP_KERNEL);
>> +			if (!func)
>> +				return -ENOMEM;
>> +			if (prev_func->old_name) {
>> +				func->old_name = kstrdup(prev_func->old_name,
>> +							 GFP_KERNEL);
>> +				if (!func->old_name) {
>> +					kfree(func);
>> +					return -ENOMEM;
>> +				}
>> +			} else {
>> +				func->old_name = NULL;
>> +			}
>> +			func->new_func = NULL;
>> +			func->old_sympos = prev_func->old_sympos;
>> +			ret = klp_init_func(obj, func, true);
>> +			func->immediate = prev_func->immediate;
>> +			func->old_addr = prev_func->old_addr;
>> +			func->old_size = prev_func->old_size;
>> +			func->new_size = 0;
>> +			func->no_op = true;
>> +			list_add(&func->func_entry, &obj->func_list);
> 
> And this is a base for klp_create_no_op_func(). Something like:
> 
> static struct klp_func *
> klp_create_no_op_func(const char *old_name,
> 		      unsigned long old_sympos,
> 		      bool immediate)
> {
> 	struct klp_func *func;
>
> 	if (!old_name)
> 		return ERR_PTR(-EINVAL);
> 
> 	func = kzalloc(sizeof(*func), GFP_KERNEL);
> 	if (!func)
> 		return ERR_PTR(-ENOMEM);
> 
> 	func->old_name = kstrdup(old_name, GFP_KERNEL);
> 	if (!func->old_name) {
> 		kfree(func);
> 		return ERR_PTR(-ENOMEM);
> 	}
> 
> 	func->old_sympos = old_sympos;
> 	func->immediate = immediate;
> 	func->no_op = true;
> 
> 	return func;
> }
> 
> 
>> +		}
>> +	}
>> +	return 0;
>> +}
>> +
>> +static int klp_init_no_ops(struct klp_patch *patch)
>> +{
>> +	struct klp_patch *prev_patch;
>> +	int ret = 0;
>> +
>> +	if (!patch->replace)
>> +		return 0;
>> +
>> +	prev_patch = patch;
>> +	while (prev_patch->list.prev != &klp_patches) {
>> +		prev_patch = list_prev_entry(prev_patch, list);
>> +
>> +		ret = klp_init_patch_no_ops(prev_patch, patch);
>> +		if (ret)
>> +			return ret;
>> +
>> +		if (prev_patch->replace)
>> +			break;
>> +	}
>> +	return 0;
>> +}
>> +
>>  static int klp_init_patch(struct klp_patch *patch)
>>  {
>>  	struct klp_object *obj;
>> @@ -721,6 +866,8 @@ static int klp_init_patch(struct klp_patch *patch)
>>  	mutex_lock(&klp_mutex);
>>  
>>  	patch->enabled = false;
>> +	patch->replaced = false;
>> +
>>  	init_completion(&patch->finish);
>>  
>>  	ret = kobject_init_and_add(&patch->kobj, &klp_ktype_patch,
>> @@ -732,20 +879,25 @@ static int klp_init_patch(struct klp_patch *patch)
>>  
>>  	INIT_LIST_HEAD(&patch->obj_list);
>>  	klp_for_each_object(patch, obj) {
>> -		INIT_LIST_HEAD(&obj->obj_entry);
>> -		INIT_LIST_HEAD(&obj->func_list);
>> -		ret = klp_init_object(patch, obj);
>> +		ret = klp_init_object(patch, obj, false);
> 
> The extra parameter still might be needed because only no_ops
> need to get re-added and reinitialized when the disabled patch
> is enabled again. Well, this feature might be added later
> in a separate patch.
> 
>>  		if (ret)
>>  			goto free;
>>  	}
>>  
>>  	list_add_tail(&patch->list, &klp_patches);
>>  
>> +	ret = klp_init_no_ops(patch);
>> +	if (ret) {
>> +		list_del(&patch->list);
>> +		goto free;
>> +	}
> 
> This should be done earlier using that klp_check_and_add_no_ops().
> 
>> +
>>  	mutex_unlock(&klp_mutex);
>>  
>>  	return 0;
>>  
>>  free:
>> +	klp_patch_free_no_ops(patch);
>>  	klp_free_objects_limited(patch, obj);
>>  
>>  	mutex_unlock(&klp_mutex);
>> diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
>> index b004a1f..d5e620e 100644
>> --- a/kernel/livepatch/transition.c
>> +++ b/kernel/livepatch/transition.c
>> @@ -70,6 +72,7 @@ static void klp_synchronize_transition(void)
>>  	schedule_on_each_cpu(klp_sync);
>>  }
>>  
>> +
> 
> extra empty line?
> 
>>  /*
>>   * The transition to the target patch state is complete.  Clean up the data
>>   * structures.
>> @@ -81,14 +84,39 @@ static void klp_complete_transition(void)
>>  	struct task_struct *g, *task;
>>  	unsigned int cpu;
>>  	bool immediate_func = false;
>> +	bool no_op = false;
>> +	struct klp_patch *prev_patch;
>> +
>> +	/*
>> +	 * for replace patches, we disable all previous patches, and replace
>> +	 * the dynamic no-op functions by removing the ftrace hook. After
>> +	 * klp_synchronize_transition() is called it's safe to free
>> +	 * the dynamic no-op functions, done by klp_patch_free_no_ops()
>> +	 */
>> +	if (klp_target_state == KLP_PATCHED && klp_transition_patch->replace) {
>> +		prev_patch = klp_transition_patch;
>> +		while (prev_patch->list.prev != &klp_patches) {
>> +			prev_patch = list_prev_entry(prev_patch, list);
> 
> I wonder if the following code might be more obvious:
> 
> 	list_for_each_entry(old_patch, &klp_patches, list) {
> 		if (old_patch = klp_transition_patch)
> 			continue;
> 
>> +			if (prev_patch->enabled) {
>> +				klp_unpatch_objects(prev_patch, false);
>> +				prev_patch->enabled = false;
> 
> I suggested to do this in __klp_disable_patch() in the last review.
> But we need something else.
> 
> The fact is that patch->enabled is normally manipulated at the
> beginning of the transition, see __klp_enable_patch()
> and __klp_disable_patch(). Therefore we are kind of
> inconsistent when we manipulate it here for the replaced
> patches.

ok, the suggestion is that for the non-immediate replace we do a
'special' disable here, once we know that the transition patch has been
successfully enabled.

> But we must keep the patches enabled during the entire
> transition. Otherwise, klp_module_coming() and klp_module_going()
> would not patch/unpatch the objects correctly.

I think the patch does; it's only when it's been successfully enabled that
it goes and disables the previous patches.

> 
> In fact, I think that we should set
> klp_transition_patch->enabled = false in klp_complete_transition()
> instead of in __klp_disable_patch(). I mean that flag should reflect
> whether the ftrace handlers see the patch or not. Therefore it
> should be always true during the transition. Then we would not
> need the extra check for klp_transition_patch in coming/going
> handlers. Well, this should be done in separate patch.
> I could prepare it if you want.

I don't quite see why this is needed for this patch series - I think the
added nops should be handled the way all the other functions are...


> 
> 
>> +				prev_patch->replaced = true;
>> +				module_put(prev_patch->mod);
> 
> In each case, we could do this only when the related no_op
> functions were not applied using "immediate" flag.

yes.

> 
> I am not 100% sure that it is really necessary. But I would feel
> more comfortable when we do this after klp_synchronize_transition();
> so that any ftrace handler could not see the related
> functions on the stack.

Which bit are you referring to here?

> 
>> +			}
>> +		}
>> +		klp_unpatch_objects(klp_transition_patch, true);
>> +		no_op = true;
>> +	}
>>  
>>  	if (klp_target_state == KLP_UNPATCHED) {
>>  		/*
>>  		 * All tasks have transitioned to KLP_UNPATCHED so we can now
>>  		 * remove the new functions from the func_stack.
>>  		 */
>> -		klp_unpatch_objects(klp_transition_patch);
>> +		klp_unpatch_objects(klp_transition_patch, false);
> 
> This code path made me think about a situation when
> the "enable" operation was reverted during the transition.
> Then the target state is KLP_UNPATCHED and no_op stuff
> is there.
> 
> Well, in this case, we need to keep the no_op stuff
> until either the patch is enabled again and the
> transition finishes, or we need to remove/free it
> in klp_unregister_patch().

In the case that klp_cancel_transition() calls klp_complete_transition()
and we remove the nop functions, we also remove all the regular
'functions', which should be re-added on the function stacks if the
patch is re-enabled. So I think this is fine.

Thanks,

-Jason



> 
> 
>> +	}
>>  
>> +	if (klp_target_state == KLP_UNPATCHED || no_op) {
>>  		/*
>>  		 * Make sure klp_ftrace_handler() can no longer see functions
>>  		 * from this patch on the ops->func_stack.  Otherwise, after
> 
> Best Regards,
> Petr
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 2/3] livepatch: add atomic replace
  2017-09-12  3:46     ` Jason Baron
@ 2017-09-12  8:35       ` Petr Mladek
  2017-09-13  6:47         ` Jason Baron
  0 siblings, 1 reply; 14+ messages in thread
From: Petr Mladek @ 2017-09-12  8:35 UTC (permalink / raw)
  To: Jason Baron; +Cc: linux-kernel, live-patching, jpoimboe, jeyu, jikos, mbenes

On Mon 2017-09-11 23:46:28, Jason Baron wrote:
> On 09/11/2017 09:53 AM, Petr Mladek wrote:
> > On Wed 2017-08-30 17:38:44, Jason Baron wrote:
> >> diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
> >> index 8d3df55..ee6d18b 100644
> >> --- a/include/linux/livepatch.h
> >> +++ b/include/linux/livepatch.h
> >> @@ -119,10 +121,12 @@ struct klp_object {
> >>   * @mod:	reference to the live patch module
> >>   * @objs:	object entries for kernel objects to be patched
> >>   * @immediate:  patch all funcs immediately, bypassing safety mechanisms
> >> + * @replace:	replace all funcs, reverting functions that are no longer patched
> >>   * @list:	list node for global list of registered patches
> >>   * @kobj:	kobject for sysfs resources
> >>   * @obj_list:	head of list for dynamically allocated struct klp_object
> >>   * @enabled:	the patch is enabled (but operation may be incomplete)
> >> + * @replaced:	the patch has been replaced and cannot be re-enabled
> > 
> > I finally understand why you do this. I forgot that even disabled
> > patches are still in klp_patch list.
> > 
> > This makes some sense for patches that fix different bugs and
> > touch the same function. They should be applied and enabled
> > in the right order because a later patch for the same function
> > must be based on the code from the previous one.
> > 
> > It makes less sense for cumulative patches that we use in kGraft.
> > There basically every patch has the "replace" flag set. If
> > we enable one patch it simply replaces any other one. Then
> > ordering is not important. Each patch is standalone and
> > consistent on its own.
> > 
> > 
> > I see two problems with your approach. One is cosmetic.
> > The names of the two flags replace/replaced are too similar.
> > It is quite prone to mistakes ;-)
> > 
> > Second, we could not remove a module where any function is patched
> > using the "immediate" flag. But people might want to revert
> > to this patch if the last one does not work as expected.
> > After all the main purpose of livepatching is to fix
> > problems without a system reboot. Therefore we should
> > allow to enable the replaced patches even without
> > removing the module.
> > 
> 
> If the livepatch has the 'replace' bit set and not the 'immediate' bit,
> then I believe that we could remove *all* types of previous livepatches
> even if they have the 'immediate' flag set. That is, because of the
> consistency model, we are sure that we are running only functions from
> the cumulative replace patch.

You are partly right. The catch is if a function was previously
patched using the immediate flag and the function is no longer
patched by the new cumulative patch with the 'replace' flag.
Then we need to create a "no_op" for the function and the 'no_op'
must inherit the immediate flag. Then the consistency model
does not guarantee that the code from the older patch is not running
and we cannot allow removing the older patch.


> So I think it may be worth making a distinction between patches that
> have 'replace' bit set and the immediate bit *unset*, and those that
> have the 'replace' bit set and the immediate bit *set*. So it seems to
> me that this patchset could just reject 'replace' patches that have the
> 'immediate' flag set, and treat the 'immediate' flag being set as
> separate todo item if there is interest?

I would do it the other way. If we apply a livepatch with both "replace"
and "immediate" flag set then I would avoid calling module_put()
for all the replaced patches. IMHO, it is a useful feature and
this is safe and easy.

We could later add an optimization that would allow removing
patches that were not immediate and where all the functions
were replaced without the immediate flag as well.

I guess that you know this. But just for sure note that there
are two possibilities. The "immediate" flag might be set either
for the entire patch (struct klp_patch) or for particular
functions (struct klp_func).
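E.g. with a hypothetical helper like this (only a sketch):

/* Could the module of a replaced patch be safely released? */
static bool klp_patch_is_removable(struct klp_patch *patch)
{
	struct klp_object *obj;
	struct klp_func *func;

	if (patch->immediate)
		return false;

	klp_for_each_object(patch, obj) {
		klp_for_each_func(obj, func) {
			if (func->immediate)
				return false;
		}
	}

	return true;
}

Then klp_complete_transition() would call module_put() on a replaced
patch only when the helper returns true.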


> Then, the 'replace' patch with 'immediate' not set, conceptually does a
> 'disable' and an 'unregister' of all previous live patches, such that
> they could be removed (rmmod) and re-inserted.

Yes, if you consider also the flags used by "no_op" functions.


> > One solution would be to remove the replaced patches from
> > klp_patch list. And enforce stacking the following way
> > in __klp_enable_patch():
> > 
> > 	/*
> > 	 * Enforce stacking: only the first disabled patch can be
> > 	 * enabled. The only exception is a patch that atomically
> > 	 * replaces all other patches and was disabled by
> > 	 * another patch of this type.
> > 	 */
> > 	if (list_empty(&patch->list)) {
> > 		if (!patch->replace)
> > 			return -EBUSY;
> > 		list_add_tail(&patch->list, &klp_patches);
> > 	} else if (patch->list.prev != &klp_patches &&
> > 		   !list_prev_entry(patch, list)->enabled) {
> > 		return -EBUSY;
> > 	}
> > 
> > Well, I would add this support later in a separate patch.
> > We would need to (re)generate the no_op stuff in __klp_enable_patch()
> > for this. Also we would need to make sure that this patch
> > could no longer get enabled once klp_unregister_patch()
> > is called.
> > 
> > For now, I would just remove the replaced patches from
> > klp_patches list and refuse enabling patches that are
> > not in the list. I mean to use a check of
> > list_empty(&patch->list) instead of the extra
> > "replaced" variable.
> 
> Ok, that makes sense to me as a way to remove 'replaced'.
> 
> > 
> > 
> >>   * @finish:	for waiting till it is safe to remove the patch module
> >>   */
> >>  struct klp_patch {
> >> @@ -130,12 +134,14 @@ struct klp_patch {
> >>  	struct module *mod;
> >>  	struct klp_object *objs;
> >>  	bool immediate;
> >> +	bool replace;
> >>  
> >>  	/* internal */
> >>  	struct list_head list;
> >>  	struct kobject kobj;
> >>  	struct list_head obj_list;
> >>  	bool enabled;
> >> +	bool replaced;
> >>  	struct completion finish;
> >>  };
> >>  
> >> diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
> >> index 6004be3..21cecc5 100644
> >> --- a/kernel/livepatch/core.c
> >> +++ b/kernel/livepatch/core.c
> >> @@ -600,13 +603,38 @@ static void klp_free_patch(struct klp_patch *patch)
> >>  		list_del(&patch->list);
> >>  }
> >>  
> >> -static int klp_init_func(struct klp_object *obj, struct klp_func *func)
> >> +void klp_patch_free_no_ops(struct klp_patch *patch)
> >> +{
> >> +	struct klp_object *obj, *tmp_obj;
> >> +	struct klp_func *func, *tmp_func;
> >> +
> >> +	klp_for_each_object(patch, obj) {
> >> +		list_for_each_entry_safe(func, tmp_func, &obj->func_list,
> >> +					 func_entry) {
> >> +			list_del_init(&func->func_entry);
> >> +			kobject_put(&func->kobj);
> >> +			kfree(func->old_name);
> >> +			kfree(func);
> > 
> > kobject_put() is asynchronous. The structure should be freed using
> > the kobject release method.
> > 
> > The question is how secure this should be. We might want to block
> > other operations with the patch until all the release methods
> > are called. But this might be tricky as there is not a single
> > root kobject that would get destroyed at the end.
> > 
> > A crazy solution would be to define a global atomic counter.
> > It might get increased with each kobject_put() call and
> > decreased in each release method. Then we could wait
> > until it gets zero.
> > 
> > It should be safe to wait with klp_mutex taken. Note that this
> > is not possible with patch->kobj() where the "enable"
> > attribute takes the mutex as well, see
> > enabled_store() in kernel/livepatch/core.c.
> 
> Thanks for pointing this out - it looks like the livepatch code uses
> wait_for_completion() with special kobj callbacks. Perhaps, there could
> be a nop callback, but I'd have to look at this more closely...

The completion is usable for the root kobject but you do not free
it here. It might be pretty complicated if you need a separate
completion for each of the freed kobjects.

A solution might be a global atomic counter and a waitqueue.
Feel free to ask for more details.
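Roughly this (untested sketch, hypothetical names):

static atomic_t klp_nop_release_count = ATOMIC_INIT(0);
static DECLARE_WAIT_QUEUE_HEAD(klp_nop_release_wait);

/* called instead of a plain kobject_put() for the no_op stuff */
static void klp_put_nop_kobject(struct kobject *kobj)
{
	atomic_inc(&klp_nop_release_count);
	kobject_put(kobj);
}

/* called at the end of each kobject release method */
static void klp_nop_release_done(void)
{
	if (atomic_dec_and_test(&klp_nop_release_count))
		wake_up(&klp_nop_release_wait);
}

/* wait with klp_mutex taken until all release methods have run */
static void klp_wait_for_nop_release(void)
{
	wait_event(klp_nop_release_wait,
		   atomic_read(&klp_nop_release_count) == 0);
}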


> >> +		}
> >> +		INIT_LIST_HEAD(&obj->func_list);
> >> +	}
> >> +	list_for_each_entry_safe(obj, tmp_obj, &patch->obj_list, obj_entry) {
> >> +		list_del_init(&obj->obj_entry);
> >> +		kobject_put(&obj->kobj);
> >> +		kfree(obj->name);
> >> +		kfree(obj);
> > 
> > Same here.
> > 
> >> +	}
> >> +	INIT_LIST_HEAD(&patch->obj_list);
> >> +}
> >> +


> >> diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
> >> index b004a1f..d5e620e 100644
> >> --- a/kernel/livepatch/transition.c
> >> +++ b/kernel/livepatch/transition.c
> >> @@ -70,6 +72,7 @@ static void klp_synchronize_transition(void)
> >>  	schedule_on_each_cpu(klp_sync);
> >>  }
> >>  
> >> +
> > 
> > extra empty line?
> > 
> >>  /*
> >>   * The transition to the target patch state is complete.  Clean up the data
> >>   * structures.
> >> @@ -81,14 +84,39 @@ static void klp_complete_transition(void)
> >>  	struct task_struct *g, *task;
> >>  	unsigned int cpu;
> >>  	bool immediate_func = false;
> >> +	bool no_op = false;
> >> +	struct klp_patch *prev_patch;
> >> +
> >> +	/*
> >> +	 * for replace patches, we disable all previous patches, and replace
> >> +	 * the dynamic no-op functions by removing the ftrace hook. After
> >> +	 * klp_synchronize_transition() is called it's safe to free
> >> +	 * the dynamic no-op functions, done by klp_patch_free_no_ops()
> >> +	 */
> >> +	if (klp_target_state == KLP_PATCHED && klp_transition_patch->replace) {
> >> +		prev_patch = klp_transition_patch;
> >> +		while (prev_patch->list.prev != &klp_patches) {
> >> +			prev_patch = list_prev_entry(prev_patch, list);
> > 
> > I wonder if the following code might be more obvious:
> > 
> > 	list_for_each_entry(old_patch, &klp_patches, list) {
> > 		if (old_patch == klp_transition_patch)
> > 			continue;
> > 
> >> +			if (prev_patch->enabled) {
> >> +				klp_unpatch_objects(prev_patch, false);
> >> +				prev_patch->enabled = false;
> > 
> > I suggested to do this in __klp_disable_patch() in the last review.
> > But we need something else.
> > 
> > The fact is that patch->enabled is normally manipulated at the
> > beginning of the transition, see __klp_enable_patch()
> > and __klp_disable_patch(). Therefore we are kind of
> > inconsistent when we manipulate it here for the replaced
> > patches.
> 
> ok, the suggestion is that for the non-immediate replace we do a
> 'special' disable here, once we know that the transition patch has been
> successfully enabled.
> 
> > But we must keep the patches enabled during the entire
> > transition. Otherwise, klp_module_coming() and klp_module_going()
> > would not patch/unpatch the objects correctly.
> 
> I think the patch does; it's only when it's been successfully enabled that
> it goes and disables the previous patches.
> 
> > 
> > In fact, I think that we should set
> > klp_transition_patch->enabled = false in klp_complete_transition()
> > instead of in __klp_disable_patch(). I mean that flag should reflect
> > whether the ftrace handlers see the patch or not. Therefore it
> > should be always true during the transition. Then we would not
> > need the extra check for klp_transition_patch in coming/going
> > handlers. Well, this should be done in separate patch.
> > I could prepare it if you want.
> 
> I don't quite see why this is needed for this patch series

Yes, please, see the last two sentences of my paragraph. I suggested
to do this in a separate patch. I also offered to provide it.

> I think the added nops should be handled the way all the other
> functions are...

It is not about no_ops. It is about the "enabled" flag
in struct klp_patch. This patchset handles this flag
differently for replaced patches than we do for
normally disabled ones.

Well, it is a detail that I wanted to point out.
The handling of replaced patches looks fine after all.
Feel free to handle the "enabled" flag as you do now.


> 
> > 
> > 
> >> +				prev_patch->replaced = true;
> >> +				module_put(prev_patch->mod);
> > 
> > In each case, we could do this only when the related no_op
> > functions were not applied using "immediate" flag.
> 
> yes.
> 
> > 
> > I am not 100% sure that it is really necessary. But I would feel
> > more comfortable when we do this after klp_synchronize_transition();
> > so that any ftrace handler could not see the related
> > functions on the stack.
> 
> Which bit are you referring to here?

I mean that we should call module_put() for all the replaced patches
later in klp_complete_transition(). In particular, we should do so
after we call klp_synchronize_transition() for
klp_target_state == KLP_PATCHED variant.
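I mean roughly this ordering (just a sketch of the KLP_PATCHED
branch in klp_complete_transition()):

	if (klp_target_state == KLP_PATCHED && klp_transition_patch->replace) {
		klp_unpatch_objects(klp_transition_patch, true);

		/* no ftrace handler can see the no_op funcs after this */
		klp_synchronize_transition();

		list_for_each_entry(old_patch, &klp_patches, list) {
			if (old_patch == klp_transition_patch)
				continue;
			module_put(old_patch->mod);
		}
	}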


> > 
> >> +			}
> >> +		}
> >> +		klp_unpatch_objects(klp_transition_patch, true);
> >> +		no_op = true;
> >> +	}
> >>  
> >>  	if (klp_target_state == KLP_UNPATCHED) {
> >>  		/*
> >>  		 * All tasks have transitioned to KLP_UNPATCHED so we can now
> >>  		 * remove the new functions from the func_stack.
> >>  		 */
> >> -		klp_unpatch_objects(klp_transition_patch);
> >> +		klp_unpatch_objects(klp_transition_patch, false);
> > 
> > This code path made me think about a situation when
> > the "enable" operation was reverted during the transition.
> > Then the target state is KLP_UNPATCHED and no_op stuff
> > is there.
> > 
> > Well, in this case, we need to keep the no_op stuff
> > until either the patch is enabled again and the
> > transition finishes, or we need to remove/free it
> > in klp_unregister_patch().
> 
> In the case that klp_cancel_transition() calls klp_complete_transition()
> and we remove the nop functions, we also remove all the regular
> 'functions', which should be re-added on the function stacks if the
> patch is re-enabled. So I think this is fine.

But we remove the no_op stuff only when klp_target_state == KLP_PATCHED.
And this might never happen when we call klp_reverse_transition()
in the middle of the transition.

I see two solutions:

Either we could postpone generating the no_op functions
until __klp_enable_patch() call and always free
them in klp_finish_transition().

Or we need to check if some no_op functions are still
there when __klp_unregister_patch() is called and free
them there.


In the long term, we would need the first solution
because it would allow re-enabling the replaced
patches. But it will be more complicated to reuse
the existing code, especially the init functions.

Feel free to keep it easy and implement
the 2nd possibility in this patch(set).

Thanks for working on this.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 2/3] livepatch: add atomic replace
  2017-09-12  8:35       ` Petr Mladek
@ 2017-09-13  6:47         ` Jason Baron
  2017-09-14 13:32           ` Petr Mladek
  0 siblings, 1 reply; 14+ messages in thread
From: Jason Baron @ 2017-09-13  6:47 UTC (permalink / raw)
  To: Petr Mladek; +Cc: linux-kernel, live-patching, jpoimboe, jeyu, jikos, mbenes



On 09/12/2017 01:35 AM, Petr Mladek wrote:
> On Mon 2017-09-11 23:46:28, Jason Baron wrote:
>> On 09/11/2017 09:53 AM, Petr Mladek wrote:
>>> On Wed 2017-08-30 17:38:44, Jason Baron wrote:
>>>> diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
>>>> index 8d3df55..ee6d18b 100644
>>>> --- a/include/linux/livepatch.h
>>>> +++ b/include/linux/livepatch.h
>>>> @@ -119,10 +121,12 @@ struct klp_object {
>>>>   * @mod:	reference to the live patch module
>>>>   * @objs:	object entries for kernel objects to be patched
>>>>   * @immediate:  patch all funcs immediately, bypassing safety mechanisms
>>>> + * @replace:	replace all funcs, reverting functions that are no longer patched
>>>>   * @list:	list node for global list of registered patches
>>>>   * @kobj:	kobject for sysfs resources
>>>>   * @obj_list:	head of list for dynamically allocated struct klp_object
>>>>   * @enabled:	the patch is enabled (but operation may be incomplete)
>>>> + * @replaced:	the patch has been replaced and cannot be re-enabled
>>>
>>> I finally understand why you do this. I forgot that even disabled
>>> patches are still in klp_patch list.
>>>
>>> This makes some sense for patches that fix different bugs and
>>> touch the same function. They should be applied and enabled
>>> in the right order because a later patch for the same function
>>> must be based on the code from the previous one.
>>>
>>> It makes less sense for cumulative patches that we use in kGraft.
>>> There basically every patch has the "replace" flag set. If
>>> we enable one patch it simply replaces any other one. Then
>>> ordering is not important. Each patch is standalone and
>>> consistent on its own.
>>>
>>>
>>> I see two problems with your approach. One is cosmetic.
>>> The names of the two flags replace/replaced are too similar.
>>> It is quite prone to mistakes ;-)
>>>
>>> Second, we could not remove a module where any function is patched
>>> using the "immediate" flag. But people might want to revert
>>> to this patch if the last one does not work as expected.
>>> After all the main purpose of livepatching is to fix
>>> problems without a system reboot. Therefore we should
>>> allow to enable the replaced patches even without
>>> removing the module.
>>>
>>
>> If the livepatch has the 'replace' bit set and not the 'immediate' bit,
>> then I believe that we could remove *all* types of previous livepatches
>> even if they have the 'immediate' flag set. That is, because of the
>> consistency model, we are sure that we are running only functions from
>> the cumulative replace patch.
> 
> You are partly right. The catch is if a function was previously
> patched using the immediate flag and the function is no longer
> patched by the new cumulative patch with the 'replace' flag.
> Then we need to create a "no_op" for the function and the 'no_op'
> must inherit the immediate flag. Then the consistency model
> does not guarantee that the code from the older patch is not running
> and we cannot allow removing the older patch.
> 

Agreed - the replace patch should 'inherit' the immediate flag. This
raises the question, what if say two patches have the immediate flag set
differently for the same function? This would be unlikely since
presumably we would set it the same way for all patches. In this case, I
think a reasonable thing to do would be to 'inherit' the immediate flag
from the immediately proceeding patch, since that's the one we are
reverting from.

> 
>> So I think it may be worth making a distinction between patches that
>> have 'replace' bit set and the immediate bit *unset*, and those that
>> have the 'replace' bit set and the immediate bit *set*. So it seems to
>> me that this patchset could just reject 'replace' patches that have the
>> 'immediate' flag set, and treat the 'immediate' flag being set as
>> separate todo item if there is interest?
> 
> I would do it the other way. If we apply a livepatch with both "replace"
> and "immediate" flag set then I would avoid calling module_put()
> for all the replaced patches. IMHO, it is a useful feature and
> this is safe and easy.
> 
> We could later add an optimization that would allow removing
> patches that were not immediate and where all the functions
> were replaced without the immediate flag as well.

ok.

> 
> I guess that you know this. But just for sure note that there
> are two possibilities. The "immediate" flag might be set either
> for the entire patch (struct klp_patch) or for particular
> functions (struct klp_func).
> 
> 
>> Then, the 'replace' patch with 'immediate' not set, conceptually does a
>> 'disable' and an 'unregister' of all previous live patches, such that
>> they could be removed (rmmod) and re-inserted.
> 
> Yes, if you consider also the flags used by "no_op" functions.
> 
>

Agreed. I'm thinking it might make sense to split this out as a separate
patch.


>>> One solution would be to remove the replaced patches from
>>> klp_patch list. And enforce stacking the following way
>>> in __klp_enable_patch():
>>>
>>> 	/*
>>> 	 * Enforce stacking: only the first disabled patch can be
>>> 	 * enabled. The only exception is a patch that atomically
>>> 	 * replaces all other patches and was disabled by
>>> 	 * another patch of this type.
>>> 	 */
>>> 	if (list_empty(&patch->list)) {
>>> 		if (!patch->replace)
>>> 			return -EBUSY;
>>> 		list_add_tail(&patch->list, &klp_patches);
>>> 	} else if (patch->list.prev != &klp_patches &&
>>> 		   !list_prev_entry(patch, list)->enabled) {
>>> 		return -EBUSY;
>>> 	}
>>>
>>> Well, I would add this support later in a separate patch.
>>> We would need to (re)generate the no_op stuff in __klp_enable_patch()
>>> for this. Also we would need to make sure that this patch
>>> could no longer get enabled once klp_unregister_patch()
>>> is called.
>>>
>>> For now, I would just remove the replaced patches from
>>> klp_patches list and refuse enabling patches that are
>>> not in the list. I mean to use a check of
>>> list_empty(&patch->list) instead of the extra
>>> "replaced" variable.
>>
>> Ok, that makes sense to me as a way to remove 'replaced'.
>>
>>>
>>>
>>>>   * @finish:	for waiting till it is safe to remove the patch module
>>>>   */
>>>>  struct klp_patch {
>>>> @@ -130,12 +134,14 @@ struct klp_patch {
>>>>  	struct module *mod;
>>>>  	struct klp_object *objs;
>>>>  	bool immediate;
>>>> +	bool replace;
>>>>  
>>>>  	/* internal */
>>>>  	struct list_head list;
>>>>  	struct kobject kobj;
>>>>  	struct list_head obj_list;
>>>>  	bool enabled;
>>>> +	bool replaced;
>>>>  	struct completion finish;
>>>>  };
>>>>  
>>>> diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
>>>> index 6004be3..21cecc5 100644
>>>> --- a/kernel/livepatch/core.c
>>>> +++ b/kernel/livepatch/core.c
>>>> @@ -600,13 +603,38 @@ static void klp_free_patch(struct klp_patch *patch)
>>>>  		list_del(&patch->list);
>>>>  }
>>>>  
>>>> -static int klp_init_func(struct klp_object *obj, struct klp_func *func)
>>>> +void klp_patch_free_no_ops(struct klp_patch *patch)
>>>> +{
>>>> +	struct klp_object *obj, *tmp_obj;
>>>> +	struct klp_func *func, *tmp_func;
>>>> +
>>>> +	klp_for_each_object(patch, obj) {
>>>> +		list_for_each_entry_safe(func, tmp_func, &obj->func_list,
>>>> +					 func_entry) {
>>>> +			list_del_init(&func->func_entry);
>>>> +			kobject_put(&func->kobj);
>>>> +			kfree(func->old_name);
>>>> +			kfree(func);
>>>
>>> kobject_put() is asynchronous. The structure should be freed using
>>> the kobject release method.
>>>
>>> The question is how secure this should be. We might want to block
>>> other operations with the patch until all the release methods
>>> are called. But this might be tricky as there is not a single
>>> root kobject that would get destroyed at the end.
>>>
>>> A crazy solution would be to define a global atomic counter.
>>> It might get increased with each kobject_put() call and
>>> decreased in each release method. Then we could wait
>>> until it gets zero.
>>>
>>> It should be safe to wait with klp_mutex taken. Note that this
>>> is not possible with patch->kobj() where the "enable"
>>> attribute takes the mutex as well, see
>>> enabled_store() in kernel/livepatch/core.c.
>>
>> Thanks for pointing this out - it looks like the livepatch code uses
>> wait_for_completion() with special kobj callbacks. Perhaps, there could
>> be a nop callback, but I'd have to look at this more closely...
> 
> The completion is usable for the root kobject but you do not free
> it here. It might be pretty complicated if you need a separate
> completion for each of the freed kobjects.
> 
> A solution might be a global atomic counter and a waitqueue.
> Feel free to ask for more details.
> 

So the issue is that the user might access some of the klp_* data
structures via sysfs after we have already freed them? If so, this
seems to be a common kernel pattern:

kobject_del(kobj);
kobject_put(kobj);


> 
>>>> +		}
>>>> +		INIT_LIST_HEAD(&obj->func_list);
>>>> +	}
>>>> +	list_for_each_entry_safe(obj, tmp_obj, &patch->obj_list, obj_entry) {
>>>> +		list_del_init(&obj->obj_entry);
>>>> +		kobject_put(&obj->kobj);
>>>> +		kfree(obj->name);
>>>> +		kfree(obj);
>>>
>>> Same here.
>>>
>>>> +	}
>>>> +	INIT_LIST_HEAD(&patch->obj_list);
>>>> +}
>>>> +
> 
> 
>>>> diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
>>>> index b004a1f..d5e620e 100644
>>>> --- a/kernel/livepatch/transition.c
>>>> +++ b/kernel/livepatch/transition.c
>>>> @@ -70,6 +72,7 @@ static void klp_synchronize_transition(void)
>>>>  	schedule_on_each_cpu(klp_sync);
>>>>  }
>>>>  
>>>> +
>>>
>>> extra empty line?
>>>
>>>>  /*
>>>>   * The transition to the target patch state is complete.  Clean up the data
>>>>   * structures.
>>>> @@ -81,14 +84,39 @@ static void klp_complete_transition(void)
>>>>  	struct task_struct *g, *task;
>>>>  	unsigned int cpu;
>>>>  	bool immediate_func = false;
>>>> +	bool no_op = false;
>>>> +	struct klp_patch *prev_patch;
>>>> +
>>>> +	/*
>>>> +	 * for replace patches, we disable all previous patches, and replace
>>>> +	 * the dynamic no-op functions by removing the ftrace hook. After
>>>> +	 * klp_synchronize_transition() is called it's safe to free
>>>> +	 * the dynamic no-op functions, done by klp_patch_free_no_ops()
>>>> +	 */
>>>> +	if (klp_target_state == KLP_PATCHED && klp_transition_patch->replace) {
>>>> +		prev_patch = klp_transition_patch;
>>>> +		while (prev_patch->list.prev != &klp_patches) {
>>>> +			prev_patch = list_prev_entry(prev_patch, list);
>>>
>>> I wonder if the following code might be more obvious:
>>>
>>> 	list_for_each_entry(old_patch, &klp_patches, list) {
>>> 		if (old_patch == klp_transition_patch)
>>> 			continue;
>>>
>>>> +			if (prev_patch->enabled) {
>>>> +				klp_unpatch_objects(prev_patch, false);
>>>> +				prev_patch->enabled = false;
>>>
>>> I suggested to do this in __klp_disable_patch() in the last review.
>>> But we need something else.
>>>
>>> The fact is that patch->enabled is normally manipulated at the
>>> beginning of the transition, see __klp_enable_patch()
>>> and __klp_disable_patch(). Therefore we are kind of
>>> inconsistent when we manipulate it here for the replaced
>>> patches.
>>
>> ok, the suggestion is that for the non-immediate replace we do a
>> 'special' disable here, once we know that the transition patch has been
>> successfully enabled.
>>
>>> But we must keep the patches enabled during the entire
>>> transition. Otherwise, klp_module_coming() and klp_module_going()
>>> would not patch/unpatch the objects correctly.
>>
>> I think the patch does; it's only when it's been successfully enabled that
>> it goes and disables the previous patches.
>>
>>>
>>> In fact, I think that we should set
>>> klp_transition_patch->enabled = false in klp_complete_transition()
>>> instead of in __klp_disable_patch(). I mean that flag should reflect
>>> whether the ftrace handlers see the patch or not. Therefore it
>>> should be always true during the transition. Then we would not
>>> need the extra check for klp_transition_patch in coming/going
>>> handlers. Well, this should be done in separate patch.
>>> I could prepare it if you want.
>>
>> I don't quite see why this is needed for this patch series
> 
> Yes, please, see the last two sentences of my paragraph. I suggested
> to do this in a separate patch. I also offered to provide it.
> 
>> I think the added nops should be handled the way all the other
>> functions are...
> 
> It is not about no_ops. It is about the "enabled" flag
> in struct klp_patch. This patchset handles this flag
> differently for replaced patches than we do for
> normally disabled ones.
> 
> Well, it is a detail that I wanted to point out.
> The handling of replaced patches looks fine after all.
> Feel free to handle the "enabled" flag as you do now.
> 
> 
>>
>>>
>>>
>>>> +				prev_patch->replaced = true;
>>>> +				module_put(prev_patch->mod);
>>>
>>> In any case, we could do this only when the related no_op
>>> functions were not applied using the "immediate" flag.
>>
>> yes.
>>
>>>
>>> I am not 100% sure that it is really necessary. But I would feel
>>> more comfortable if we do this after klp_synchronize_transition(),
>>> so that any ftrace handler could not see the related
>>> functions on the stack.
>>
>> Which bit are you referring to here?
> 
> I mean that we should call module_put() for all the replaced patches
> later in klp_complete_transition(). In particular, we should do so
> after we call klp_synchronize_transition() for the
> klp_target_state == KLP_PATCHED variant.
>

ok.
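
Concretely, perhaps something like this (sketch only; it assumes the
'replaced' flag from this series and elides the rest of the completion
logic):

	struct klp_patch *old_patch;

	/* drop module refs for replaced patches after the grace period */
	if (klp_target_state == KLP_PATCHED && klp_transition_patch->replace) {
		klp_synchronize_transition();
		list_for_each_entry(old_patch, &klp_patches, list) {
			if (!old_patch->replaced)
				continue;
			module_put(old_patch->mod);
		}
	}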

> 
>>>
>>>> +			}
>>>> +		}
>>>> +		klp_unpatch_objects(klp_transition_patch, true);
>>>> +		no_op = true;
>>>> +	}
>>>>  
>>>>  	if (klp_target_state == KLP_UNPATCHED) {
>>>>  		/*
>>>>  		 * All tasks have transitioned to KLP_UNPATCHED so we can now
>>>>  		 * remove the new functions from the func_stack.
>>>>  		 */
>>>> -		klp_unpatch_objects(klp_transition_patch);
>>>> +		klp_unpatch_objects(klp_transition_patch, false);
>>>
>>> This code patch made me think about a situation when
>>> the "enable" operation was reverted during the transition.
>>> Then the target state is KLP_UNPATCHED and no_op stuff
>>> is there.
>>>
>>> Well, in this case, we need to keep the no_op stuff
>>> until either the patch is enabled again and the
>>> transition finishes, or we need to remove/free it in
>>> klp_unregister_patch().
>>
>> In the case that klp_cancel_transition() calls klp_complete_transition()
>> and we remove the nop functions, we also remove all the regular
>> 'functions', which should be re-added on the function stacks if the
>> patch is re-enabled. So I think this is fine.
> 
> But we do not remove no_op when klp_target_state == KLP_PATCHED.
> And this might never happen when we call klp_reverse_transition()
> in the middle of the transition.
> 
> I see two solutions:
> 
> Either we could postpone generating the no_op functions
> until the __klp_enable_patch() call and always free
> them in klp_finish_transition().
> 
> Or we need to check if some no_op functions are still
> there when __klp_unregister_patch() is called and free
> them there.
> 
> 
> In the long term, we would need the first solution
> because it would allow re-enabling the replaced
> patches. But it will be more complicated to reuse
> the existing code, especially the init functions.
> 
> Feel free to keep it easy and implement
> the 2nd possibility in this patch(set).
> 

Indeed I took the 2nd possibility in this patch already - see the call
to klp_patch_free_no_ops() in klp_unregister_patch().

Thanks,

-Jason

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 2/3] livepatch: add atomic replace
  2017-09-13  6:47         ` Jason Baron
@ 2017-09-14 13:32           ` Petr Mladek
  2017-09-14 15:31             ` Jason Baron
  0 siblings, 1 reply; 14+ messages in thread
From: Petr Mladek @ 2017-09-14 13:32 UTC (permalink / raw)
  To: Jason Baron; +Cc: linux-kernel, live-patching, jpoimboe, jeyu, jikos, mbenes

On Tue 2017-09-12 23:47:32, Jason Baron wrote:
> 
> 
> On 09/12/2017 01:35 AM, Petr Mladek wrote:
> > On Mon 2017-09-11 23:46:28, Jason Baron wrote:
> >> On 09/11/2017 09:53 AM, Petr Mladek wrote:
> >>> On Wed 2017-08-30 17:38:44, Jason Baron wrote:
> >>>> diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
> >>>> index 8d3df55..ee6d18b 100644
> >>>> --- a/include/linux/livepatch.h
> >>>> +++ b/include/linux/livepatch.h
> >>>> @@ -119,10 +121,12 @@ struct klp_object {
> >>>>   * @mod:	reference to the live patch module
> >>>>   * @objs:	object entries for kernel objects to be patched
> >>>>   * @immediate:  patch all funcs immediately, bypassing safety mechanisms
> >>>> + * @replace:	replace all funcs, reverting functions that are no longer patched
> >>>>   * @list:	list node for global list of registered patches
> >>>>   * @kobj:	kobject for sysfs resources
> >>>>   * @obj_list:	head of list for dynamically allocated struct klp_object
> >>>>   * @enabled:	the patch is enabled (but operation may be incomplete)
> >>>> + * @replaced:	the patch has been replaced and cannot be re-enabled
> >>>
> >>> I finally understand why you do this. I forgot that even disabled
> >>> patches are still in klp_patch list.
> >>>
> >>> This makes some sense for patches that fix different bugs and
> >>> touch the same function. They should be applied and enabled
> >>> in the right order because a later patch for the same function
> >>> must be based on the code from the previous one.
> >>>
> >>> It makes less sense for cumulative patches that we use in kGraft.
> >>> There basically every patch has the "replace" flag set. If
> >>> we enable one patch it simply replaces any other one. Then
> >>> ordering is not important. Each patch is standalone and
> >>> consistent on its own.
> >>>
> >>>
> >>> I see two problems with your approach. One is cosmetic.
> >>> The names of the two flags replace/replaced are too similar.
> >>> It is quite prone to mistakes ;-)
> >>>
> >>> Second, we could not remove a module where any function is patched
> >>> using the "immediate" flag. But people might want to revert
> >>> to this patch if the last one does not work as expected.
> >>> After all the main purpose of livepatching is to fix
> >>> problems without a system reboot. Therefore we should
> >>> allow enabling the replaced patches even without
> >>> removing the module.
> >>>
> >>
> >> If the livepatch has the 'replace' bit set and not the 'immediate' bit,
> >> then I believe that we could remove *all* types of previous livepatches
> >> even if they have the 'immediate' flag set. That is, because of the
> >> consistency model, we are sure that we are running only functions from
> >> the cumulative replace patch.
> > 
> > You are partly right. The catch is if a function was previously
> > patched using the immediate flag and the function is no longer
> > patched by the new cumulative patch with the 'replace' flag.
> > Then we need to create a "no_op" for the function and the "no_op"
> > must inherit the immediate flag set. Then the consistency model
> > does not guarantee that the code from the older patch is not running
> > and we could not allow removing the older patch.
> > 
> 
> Agreed - the replace patch should 'inherit' the immediate flag. This
> raises the question: what if, say, two patches have the immediate flag set
> differently for the same function? This would be unlikely since
> presumably we would set it the same way for all patches. In this case, I
> think a reasonable thing to do would be to 'inherit' the immediate flag
> from the immediately preceding patch, since that's the one we are
> reverting from.

Good question. I would personally use immediate = false if there
is a mismatch. It might block the transition but it will not break
the consistency model. Consistency and stability are always
more important. The user can always either revert the transition,
or there is going to be a way to force it.

IMHO, there are basically two approaches to using
the immediate flag:

Some people might enable it only when it is safe and needed
to patch a function that might sleep for long. These people
either want to be on the safe side, or they want to remove
disabled patches when possible. Your approach would be fine
in this case.

Another group of people might always enable the immediate flag
when it is safe. They would disable the immediate flag only when
the patch does a semantic change. These people want to keep
it easy and avoid potential troubles with a transition
(complex code, might get blocked, ...). This is the reason
to be conservative. The cumulative patch is going to replace
all existing patches. If one of them needed the complex
consistency model, we should use it as well.
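
To make the conservative rule concrete, a tiny sketch (hypothetical
helper, not part of the posted patches): a generated no_op may use
'immediate' only if every patch that previously touched the function
used it too.

/*
 * Hypothetical: pick the 'immediate' flag for a generated no_op.
 * A single non-immediate patch in the stack forces the full
 * consistency model for the no_op as well.
 */
static bool klp_no_op_immediate(const bool *immediate_flags, int nr)
{
	int i;

	for (i = 0; i < nr; i++)
		if (!immediate_flags[i])
			return false;
	return true;
}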


> >>>> diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
> >>>> index 6004be3..21cecc5 100644
> >>>> --- a/kernel/livepatch/core.c
> >>>> +++ b/kernel/livepatch/core.c
> >>>> @@ -600,13 +603,38 @@ static void klp_free_patch(struct klp_patch *patch)
> >>>>  		list_del(&patch->list);
> >>>>  }
> >>>>  
> >>>> -static int klp_init_func(struct klp_object *obj, struct klp_func *func)
> >>>> +void klp_patch_free_no_ops(struct klp_patch *patch)
> >>>> +{
> >>>> +	struct klp_object *obj, *tmp_obj;
> >>>> +	struct klp_func *func, *tmp_func;
> >>>> +
> >>>> +	klp_for_each_object(patch, obj) {
> >>>> +		list_for_each_entry_safe(func, tmp_func, &obj->func_list,
> >>>> +					 func_entry) {
> >>>> +			list_del_init(&func->func_entry);
> >>>> +			kobject_put(&func->kobj);
> >>>> +			kfree(func->old_name);
> >>>> +			kfree(func);
> >>>
> >>> kobject_put() is asynchronous. The structure should be freed using
> >>> the kobject release method.
> >>>
> >>> The question is how secure this should be. We might want to block
> >>> other operations with the patch until all the release methods
> >>> are called. But this might be tricky as there is not a single
> >>> root kobject that would get destroyed at the end.
> >>>
> >>> A crazy solution would be to define a global atomic counter.
> >>> It might get increased with each kobject_put() call and
> >>> decreased in each release method. Then we could wait
> >>> until it gets zero.
> >>>
> >>> It should be safe to wait with klp_mutex taken. Note that this
> >>> is not possible with patch->kobj, where the "enabled"
> >>> attribute takes the mutex as well, see
> >>> enabled_store() in kernel/livepatch/core.c.
> >>
> >> Thanks for pointing this out - it looks like the livepatch code uses
> >> wait_for_completion() with special kobj callbacks. Perhaps, there could
> >> be a nop callback, but I'd have to look at this more closely...
> > 
> > The completion is usable for the root kobject but you do not free
> > it here. It might be pretty complicated if you need a separate
> > completion for each of the freed kobjects.
> > 
> > A solution might be a global atomic counter and a waitqueue.
> > Feel free to ask for more details.
> > 
> 
> So the issue is that the user might access some of the klp_* data
> structures via /sysfs after we have already freed them?

yes

> if so, this seems to be a common kernel pattern:
> 
> kobject_del(kobj);
> kobject_put(kobj);

IMHO, this is used for another reason.

kobject_del() removes the object from the hierarchy. Therefore
it prevents new operations. But it does not wait for the existing
operations to finish. Therefore there still might be users that
access the data even after this function finishes.

kobject_put() is enough if you do not mind how long the object
might stay visible. It would remove it once the last reference
on the object is removed. See the following code in
kobject_cleanup.

	/* remove from sysfs if the caller did not do it */
	if (kobj->state_in_sysfs) {
		pr_debug("kobject: '%s' (%p): auto cleanup kobject_del\n",
			 kobject_name(kobj), kobj);
		kobject_del(kobj);
	}

In any case, you must not free the data accessible from the kobject
handlers until the kobj_type->release() method is called.


IMHO, the solution is not that complicated after all.
If you are able to distinguish statically and dynamically
defined klp_func and klp_obj structures, you might
just modify the existing kobject release methods:


static atomic_t klp_no_op_release_count = ATOMIC_INIT(0);
static DECLARE_WAIT_QUEUE_HEAD(klp_no_op_release_wait);

static void klp_kobj_release_object(struct kobject *kobj)
{
	struct klp_object *obj;

	obj = container_of(kobj, struct klp_object, kobj);
	/* Free dynamically allocated object */
	if (!obj->funcs) {
		kfree(obj->name);
		kfree(obj);
		atomic_dec(&klp_no_op_release_count);
		wake_up(&klp_no_op_release_wait);
	}
}

static void klp_kobj_release_func(struct kobject *kobj)
{
	struct klp_func *func;

	func = container_of(kobj, struct klp_func, kobj);
	/* Free dynamically allocated functions */
	if (!func->new_func) {
		kfree(func->old_name);
		kfree(func);
		atomic_dec(&klp_no_op_release_count);
		wake_up(&klp_no_op_release_wait);
	}
}


then we could have

void klp_patch_free_no_ops(struct klp_patch *patch)
{
	struct klp_object *obj, *tmp_obj;
	struct klp_func *func, *tmp_func;

	klp_for_each_object(patch, obj) {
		list_for_each_entry_safe(func, tmp_func, &obj->func_list,
					 func_entry) {
			list_del_init(&func->func_entry);
			atomic_inc(&klp_no_op_release_count);
			kobject_put(&func->kobj);
		}
		INIT_LIST_HEAD(&obj->func_list);
	}
	list_for_each_entry_safe(obj, tmp_obj, &patch->obj_list, obj_entry) {
		list_del_init(&obj->obj_entry);
		atomic_inc(&klp_no_op_release_count);
		kobject_put(&obj->kobj);
	}
	INIT_LIST_HEAD(&patch->obj_list);

	wait_event(klp_no_op_release_wait,
		   atomic_read(&klp_no_op_release_count) == 0);
}

And we should call this function also in klp_complete_transition()
to avoid code duplication.
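
Roughly like this (sketch only; the surrounding transition logic is
elided):

	/* in klp_complete_transition(), for a successful 'replace' */
	if (klp_target_state == KLP_PATCHED && klp_transition_patch->replace) {
		klp_synchronize_transition();
		/* blocks until all dynamic kobjects have been released */
		klp_patch_free_no_ops(klp_transition_patch);
	}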


> >>>> +		}
> >>>> +		INIT_LIST_HEAD(&obj->func_list);
> >>>> +	}
> >>>> +	list_for_each_entry_safe(obj, tmp_obj, &patch->obj_list, obj_entry) {
> >>>> +		list_del_init(&obj->obj_entry);
> >>>> +		kobject_put(&obj->kobj);
> >>>> +		kfree(obj->name);
> >>>> +		kfree(obj);
> >>>
> >>> Same here.
> >>>
> >>>> +	}
> >>>> +	INIT_LIST_HEAD(&patch->obj_list);
> >>>> +}
> >>>> +
> > 
> > 
> >>>> diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
> >>>> index b004a1f..d5e620e 100644
> >>>> --- a/kernel/livepatch/transition.c
> >>>> +++ b/kernel/livepatch/transition.c
> > 
> >>>
> >>>> +			}
> >>>> +		}
> >>>> +		klp_unpatch_objects(klp_transition_patch, true);
> >>>> +		no_op = true;
> >>>> +	}
> >>>>  
> >>>>  	if (klp_target_state == KLP_UNPATCHED) {
> >>>>  		/*
> >>>>  		 * All tasks have transitioned to KLP_UNPATCHED so we can now
> >>>>  		 * remove the new functions from the func_stack.
> >>>>  		 */
> >>>> -		klp_unpatch_objects(klp_transition_patch);
> >>>> +		klp_unpatch_objects(klp_transition_patch, false);
> >>>
> >>> This code patch made me think about a situation when
> >>> the "enable" operation was reverted during the transition.
> >>> Then the target state is KLP_UNPATCHED and no_op stuff
> >>> is there.
> >>>
> >>> Well, in this case, we need to keep the no_op stuff
> >>> until either the patch is enabled again and the
> >>> transition finishes, or we need to remove/free it in
> >>> klp_unregister_patch().
> >>
> >> In the case that klp_cancel_transition() calls klp_complete_transition()
> >> and we remove the nop functions, we also remove all the regular
> >> 'functions', which should be re-added on the function stacks if the
> >> patch is re-enabled. So I think this is fine.
> > 
> > But we do not remove no_op when klp_target_state == KLP_PATCHED.
> > And this might never happen when we call klp_reverse_transition()
> > in the middle of the transition.
> > 
> > I see two solutions:
> > 
> > Either we could postpone generating the no_op functions
> > until the __klp_enable_patch() call and always free
> > them in klp_finish_transition().
> > 
> > Or we need to check if some no_op functions are still
> > there when __klp_unregister_patch() is called and free
> > them there.
> > 
> > 
> > In the long term, we would need the first solution
> > because it would allow re-enabling the replaced
> > patches. But it will be more complicated to reuse
> > the existing code, especially the init functions.
> > 
> > Feel free to keep it easy and implement
> > the 2nd possibility in this patch(set).
> > 
> 
> Indeed I took the 2nd possibility in this patch already - see the call
> to klp_patch_free_no_ops() in klp_unregister_patch().

Ah, I have missed this. Then we are on the safe side.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 2/3] livepatch: add atomic replace
  2017-09-14 13:32           ` Petr Mladek
@ 2017-09-14 15:31             ` Jason Baron
  2017-09-15 15:46               ` Petr Mladek
  0 siblings, 1 reply; 14+ messages in thread
From: Jason Baron @ 2017-09-14 15:31 UTC (permalink / raw)
  To: Petr Mladek
  Cc: linux-kernel, live-patching, jpoimboe, jeyu, jikos, mbenes, gregkh



On 09/14/2017 06:32 AM, Petr Mladek wrote:
> On Tue 2017-09-12 23:47:32, Jason Baron wrote:
>>
>>
>> On 09/12/2017 01:35 AM, Petr Mladek wrote:
>>> On Mon 2017-09-11 23:46:28, Jason Baron wrote:
>>>> On 09/11/2017 09:53 AM, Petr Mladek wrote:
>>>>> On Wed 2017-08-30 17:38:44, Jason Baron wrote:
>>>>>> diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
>>>>>> index 8d3df55..ee6d18b 100644
>>>>>> --- a/include/linux/livepatch.h
>>>>>> +++ b/include/linux/livepatch.h
>>>>>> @@ -119,10 +121,12 @@ struct klp_object {
>>>>>>   * @mod:	reference to the live patch module
>>>>>>   * @objs:	object entries for kernel objects to be patched
>>>>>>   * @immediate:  patch all funcs immediately, bypassing safety mechanisms
>>>>>> + * @replace:	replace all funcs, reverting functions that are no longer patched
>>>>>>   * @list:	list node for global list of registered patches
>>>>>>   * @kobj:	kobject for sysfs resources
>>>>>>   * @obj_list:	head of list for dynamically allocated struct klp_object
>>>>>>   * @enabled:	the patch is enabled (but operation may be incomplete)
>>>>>> + * @replaced:	the patch has been replaced and cannot be re-enabled
>>>>>
>>>>> I finally understand why you do this. I forgot that even disabled
>>>>> patches are still in klp_patch list.
>>>>>
>>>>> This makes some sense for patches that fix different bugs and
>>>>> touch the same function. They should be applied and enabled
>>>>> in the right order because a later patch for the same function
>>>>> must be based on the code from the previous one.
>>>>>
>>>>> It makes less sense for cumulative patches that we use in kGraft.
>>>>> There basically every patch has the "replace" flag set. If
>>>>> we enable one patch it simply replaces any other one. Then
>>>>> ordering is not important. Each patch is standalone and
>>>>> consistent on its own.
>>>>>
>>>>>
>>>>> I see two problems with your approach. One is cosmetic.
>>>>> The names of the two flags replace/replaced are too similar.
>>>>> It is quite prone to mistakes ;-)
>>>>>
>>>>> Second, we could not remove a module where any function is patched
>>>>> using the "immediate" flag. But people might want to revert
>>>>> to this patch if the last one does not work as expected.
>>>>> After all the main purpose of livepatching is to fix
>>>>> problems without a system reboot. Therefore we should
>>>>> allow enabling the replaced patches even without
>>>>> removing the module.
>>>>>
>>>>
>>>> If the livepatch has the 'replace' bit set and not the 'immediate' bit,
>>>> then I believe that we could remove *all* types of previous livepatches
>>>> even if they have the 'immediate' flag set. That is, because of the
>>>> consistency model, we are sure that we are running only functions from
>>>> the cumulative replace patch.
>>>
>>> You are partly right. The catch is if a function was previously
>>> patched using the immediate flag and the function is no longer
>>> patched by the new cumulative patch with the 'replace' flag.
>>> Then we need to create a "no_op" for the function and the "no_op"
>>> must inherit the immediate flag set. Then the consistency model
>>> does not guarantee that the code from the older patch is not running
>>> and we could not allow removing the older patch.
>>>
>>
>> Agreed - the replace patch should 'inherit' the immediate flag. This
>> raises the question: what if, say, two patches have the immediate flag set
>> differently for the same function? This would be unlikely since
>> presumably we would set it the same way for all patches. In this case, I
>> think a reasonable thing to do would be to 'inherit' the immediate flag
>> from the immediately preceding patch, since that's the one we are
>> reverting from.
> 
> Good question. I would personally use immediate = false if there
> is a mismatch. It might block the transition but it will not break
> the consistency model. Consistency and stability are always
> more important. The user can always either revert the transition,
> or there is going to be a way to force it.
> 
> IMHO, there are basically two approaches to using
> the immediate flag:
> 
> Some people might enable it only when it is safe and needed
> to patch a function that might sleep for long. These people
> either want to be on the safe side, or they want to remove
> disabled patches when possible. Your approach would be fine
> in this case.
> 
> Another group of people might always enable the immediate flag
> when it is safe. They would disable the immediate flag only when
> the patch does a semantic change. These people want to keep
> it easy and avoid potential troubles with a transition
> (complex code, might get blocked, ...). This is the reason
> to be conservative. The cumulative patch is going to replace
> all existing patches. If one of them needed the complex
> consistency model, we should use it as well.
> 

Ok, makes sense.

> 
>>>>>> diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
>>>>>> index 6004be3..21cecc5 100644
>>>>>> --- a/kernel/livepatch/core.c
>>>>>> +++ b/kernel/livepatch/core.c
>>>>>> @@ -600,13 +603,38 @@ static void klp_free_patch(struct klp_patch *patch)
>>>>>>  		list_del(&patch->list);
>>>>>>  }
>>>>>>  
>>>>>> -static int klp_init_func(struct klp_object *obj, struct klp_func *func)
>>>>>> +void klp_patch_free_no_ops(struct klp_patch *patch)
>>>>>> +{
>>>>>> +	struct klp_object *obj, *tmp_obj;
>>>>>> +	struct klp_func *func, *tmp_func;
>>>>>> +
>>>>>> +	klp_for_each_object(patch, obj) {
>>>>>> +		list_for_each_entry_safe(func, tmp_func, &obj->func_list,
>>>>>> +					 func_entry) {
>>>>>> +			list_del_init(&func->func_entry);
>>>>>> +			kobject_put(&func->kobj);
>>>>>> +			kfree(func->old_name);
>>>>>> +			kfree(func);
>>>>>
>>>>> kobject_put() is asynchronous. The structure should be freed using
>>>>> the kobject release method.
>>>>>
>>>>> The question is how secure this should be. We might want to block
>>>>> other operations with the patch until all the release methods
>>>>> are called. But this might be tricky as there is not a single
>>>>> root kobject that would get destroyed at the end.
>>>>>
>>>>> A crazy solution would be to define a global atomic counter.
>>>>> It might get increased with each kobject_put() call and
>>>>> decreased in each release method. Then we could wait
>>>>> until it gets zero.
>>>>>
>>>>> It should be safe to wait with klp_mutex taken. Note that this
>>>>> is not possible with patch->kobj, where the "enabled"
>>>>> attribute takes the mutex as well, see
>>>>> enabled_store() in kernel/livepatch/core.c.
>>>>
>>>> Thanks for pointing this out - it looks like the livepatch code uses
>>>> wait_for_completion() with special kobj callbacks. Perhaps, there could
>>>> be a nop callback, but I'd have to look at this more closely...
>>>
>>> The completion is usable for the root kobject but you do not free
>>> it here. It might be pretty complicated if you need a separate
>>> completion for each of the freed kobjects.
>>>
>>> A solution might be a global atomic counter and a waitqueue.
>>> Feel free to ask for more details.
>>>
>>
>> So the issue is that the user might access some of the klp_* data
>> structures via /sysfs after we have already freed them?
> 
> yes
> 
>> if so, this seems to be a common kernel pattern:
>>
>> kobject_del(kobj);
>> kobject_put(kobj);
> 
> IMHO, this is used for another reason.
> 
> kobject_del() removes the object from the hierarchy. Therefore
> it prevents new operations. But it does not wait for the existing
> operations to finish. Therefore there still might be users that
> access the data even after this function finishes.

I looked into this further - and it does appear to wait until all
operations finish. In kernfs_drain() the code does:

wait_event(root->deactivate_waitq, atomic_read(&kn->active) ==
KN_DEACTIVATED_BIAS);

The call stack is:

kobject_del()
 sysfs_remove_dir()
  kernfs_remove()
   __kernfs_remove()
    kernfs_drain()

And __kernfs_remove() has already done a 'deactivate' prior:

/* prevent any new usage under @kn by deactivating all nodes */
pos = NULL;
while ((pos = kernfs_next_descendant_post(pos, kn)))
       if (kernfs_active(pos))
               atomic_add(KN_DEACTIVATED_BIAS, &pos->active);


So I *think* doing the kobject_del() first is sufficient. It may be
worth documenting this better if that is the case...


Thanks,

-Jason


> 
> kobject_put() is enough if you do not mind how long the object
> might stay visible. It would remove it once the last reference
> on the object is removed. See the following code in
> kobject_cleanup.
> 
> 	/* remove from sysfs if the caller did not do it */
> 	if (kobj->state_in_sysfs) {
> 		pr_debug("kobject: '%s' (%p): auto cleanup kobject_del\n",
> 			 kobject_name(kobj), kobj);
> 		kobject_del(kobj);
> 	}
> 
> In any case, you must not free the data accessible from the kobject
> handlers until the kobj_type->release() method is called.
> 
> 
> IMHO, the solution is not that complicated after all.
> If you are able to distinguish statically and dynamically
> defined klp_func and klp_obj structures, you might
> just modify the existing kobject release methods:
> 
> 
> static atomic_t klp_no_op_release_count = ATOMIC_INIT(0);
> static DECLARE_WAIT_QUEUE_HEAD(klp_no_op_release_wait);
> 
> static void klp_kobj_release_object(struct kobject *kobj)
> {
> 	struct klp_object *obj;
> 
> 	obj = container_of(kobj, struct klp_object, kobj);
> 	/* Free dynamically allocated object */
> 	if (!obj->funcs) {
> 		kfree(obj->name);
> 		kfree(obj);
> 		atomic_dec(&klp_no_op_release_count);
> 		wake_up(&klp_no_op_release_wait);
> 	}
> }
> 
> static void klp_kobj_release_func(struct kobject *kobj)
> {
> 	struct klp_func *func;
> 
> 	func = container_of(kobj, struct klp_func, kobj);
> 	/* Free dynamically allocated functions */
> 	if (!func->new_func) {
> 		kfree(func->old_name);
> 		kfree(func);
> 		atomic_dec(&klp_no_op_release_count);
> 		wake_up(&klp_no_op_release_wait);
> 	}
> }
> 
> 
> then we could have
> 
> void klp_patch_free_no_ops(struct klp_patch *patch)
> {
> 	struct klp_object *obj, *tmp_obj;
> 	struct klp_func *func, *tmp_func;
> 
> 	klp_for_each_object(patch, obj) {
> 		list_for_each_entry_safe(func, tmp_func, &obj->func_list,
> 					 func_entry) {
> 			list_del_init(&func->func_entry);
> 			atomic_inc(&klp_no_op_release_count);
> 			kobject_put(&func->kobj);
> 		}
> 		INIT_LIST_HEAD(&obj->func_list);
> 	}
> 	list_for_each_entry_safe(obj, tmp_obj, &patch->obj_list, obj_entry) {
> 		list_del_init(&obj->obj_entry);
> 		atomic_inc(&klp_no_op_release_count);
> 		kobject_put(&obj->kobj);
> 	}
> 	INIT_LIST_HEAD(&patch->obj_list);
> 
> 	wait_event(klp_no_op_release_wait,
> 		   atomic_read(&klp_no_op_release_count) == 0);
> }
> 
> And we should call this function also in klp_complete_transition()
> to avoid code duplication.
> 
> 
>>>>>> +		}
>>>>>> +		INIT_LIST_HEAD(&obj->func_list);
>>>>>> +	}
>>>>>> +	list_for_each_entry_safe(obj, tmp_obj, &patch->obj_list, obj_entry) {
>>>>>> +		list_del_init(&obj->obj_entry);
>>>>>> +		kobject_put(&obj->kobj);
>>>>>> +		kfree(obj->name);
>>>>>> +		kfree(obj);
>>>>>
>>>>> Same here.
>>>>>
>>>>>> +	}
>>>>>> +	INIT_LIST_HEAD(&patch->obj_list);
>>>>>> +}
>>>>>> +
>>>
>>>
>>>>>> diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
>>>>>> index b004a1f..d5e620e 100644
>>>>>> --- a/kernel/livepatch/transition.c
>>>>>> +++ b/kernel/livepatch/transition.c
>>>
>>>>>
>>>>>> +			}
>>>>>> +		}
>>>>>> +		klp_unpatch_objects(klp_transition_patch, true);
>>>>>> +		no_op = true;
>>>>>> +	}
>>>>>>  
>>>>>>  	if (klp_target_state == KLP_UNPATCHED) {
>>>>>>  		/*
>>>>>>  		 * All tasks have transitioned to KLP_UNPATCHED so we can now
>>>>>>  		 * remove the new functions from the func_stack.
>>>>>>  		 */
>>>>>> -		klp_unpatch_objects(klp_transition_patch);
>>>>>> +		klp_unpatch_objects(klp_transition_patch, false);
>>>>>
>>>>> This code patch made me think about a situation when
>>>>> the "enable" operation was reverted during the transition.
>>>>> Then the target state is KLP_UNPATCHED and no_op stuff
>>>>> is there.
>>>>>
>>>>> Well, in this case, we need to keep the no_op stuff
>>>>> until either the patch is enabled again and the
>>>>> transition finishes, or we need to remove/free it in
>>>>> klp_unregister_patch().
>>>>
>>>> In the case that klp_cancel_transition() calls klp_complete_transition()
>>>> and we remove the nop functions, we also remove all the regular
>>>> 'functions', which should be re-added on the function stacks if the
>>>> patch is re-enabled. So I think this is fine.
>>>
>>> But we do not remove no_op when klp_target_state == KLP_PATCHED.
>>> And this might never happen when we call klp_reverse_transition()
>>> in the middle of the transition.
>>>
>>> I see two solutions:
>>>
>>> Either we could postpone generating the no_op functions
>>> until the __klp_enable_patch() call and always free
>>> them in klp_finish_transition().
>>>
>>> Or we need to check if some no_op functions are still
>>> there when __klp_unregister_patch() is called and free
>>> them there.
>>>
>>>
>>> In the long term, we would need the first solution
>>> because it would allow re-enabling the replaced
>>> patches. But it will be more complicated to reuse
>>> the existing code, especially the init functions.
>>>
>>> Feel free to keep it easy and implement
>>> the 2nd possibility in this patch(set).
>>>
>>
>> Indeed I took the 2nd possibility in this patch already - see the call
>> to klp_patch_free_no_ops() in klp_unregister_patch().
> 
> Ah, I have missed this. Then we are on the safe side.
> 
> Best Regards,
> Petr
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 2/3] livepatch: add atomic replace
  2017-09-14 15:31             ` Jason Baron
@ 2017-09-15 15:46               ` Petr Mladek
  2017-09-16  0:10                 ` Greg KH
  0 siblings, 1 reply; 14+ messages in thread
From: Petr Mladek @ 2017-09-15 15:46 UTC (permalink / raw)
  To: Jason Baron
  Cc: linux-kernel, live-patching, jpoimboe, jeyu, jikos, mbenes, gregkh

On Thu 2017-09-14 08:31:24, Jason Baron wrote:
> 
> 
> On 09/14/2017 06:32 AM, Petr Mladek wrote:
> > On Tue 2017-09-12 23:47:32, Jason Baron wrote:
> >>
> >>
> >> On 09/12/2017 01:35 AM, Petr Mladek wrote:
> >>> On Mon 2017-09-11 23:46:28, Jason Baron wrote:
> >>>> On 09/11/2017 09:53 AM, Petr Mladek wrote:
> >>>>> On Wed 2017-08-30 17:38:44, Jason Baron wrote:
> >>>>>> diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
> >>>>>> index 6004be3..21cecc5 100644
> >>>>>> --- a/kernel/livepatch/core.c
> >>>>>> +++ b/kernel/livepatch/core.c
> >>>>>> @@ -600,13 +603,38 @@ static void klp_free_patch(struct klp_patch *patch)
> >>>>>>  		list_del(&patch->list);
> >>>>>>  }
> >>>>>>  
> >>>>>> -static int klp_init_func(struct klp_object *obj, struct klp_func *func)
> >>>>>> +void klp_patch_free_no_ops(struct klp_patch *patch)
> >>>>>> +{
> >>>>>> +	struct klp_object *obj, *tmp_obj;
> >>>>>> +	struct klp_func *func, *tmp_func;
> >>>>>> +
> >>>>>> +	klp_for_each_object(patch, obj) {
> >>>>>> +		list_for_each_entry_safe(func, tmp_func, &obj->func_list,
> >>>>>> +					 func_entry) {
> >>>>>> +			list_del_init(&func->func_entry);
> >>>>>> +			kobject_put(&func->kobj);
> >>>>>> +			kfree(func->old_name);
> >>>>>> +			kfree(func);
> >>>>>
> >>>>> kobject_put() is asynchronous. The structure should be freed using
> >>>>> the kobject release method.
> >>>>>
> >>>>> The question is how secure this should be. We might want to block
> >>>>> other operations with the patch until all the release methods
> >>>>> are called. But this might be tricky as there is not a single
> >>>>> root kobject that would get destroyed at the end.
> >>>>>
> >>>>> A crazy solution would be to define a global atomic counter.
> >>>>> It might get increased with each kobject_put() call and
> >>>>> decreased in each release method. Then we could wait
> >>>>> until it gets zero.
> >>>>>
> >>>>> It should be safe to wait with klp_mutex taken. Note that this
> >>>>> is not possible with patch->kobj, where the "enabled"
> >>>>> attribute takes the mutex as well, see
> >>>>> enabled_store() in kernel/livepatch/core.c.
> >>>>
> >>>> Thanks for pointing this out - it looks like the livepatch code uses
> >>>> wait_for_completion() with special kobj callbacks. Perhaps, there could
> >>>> be a nop callback, but I'd have to look at this more closely...
> >>>
> >>> The completion is usable for the root kobject but you do not free
> >>> it here. It might be pretty complicated if you need a separate
> >>> completion for each of the freed kobjects.
> >>>
> >>> A solution might be a global atomic counter and a waitqueue.
> >>> Feel free to ask for more details.
> >>>
> >>
> >> So the issue is that the user might access some of the klp_* data
> >> structures via /sysfs after we have already freed them?
> > 
> > yes
> > 
> >> if so, this seems to be a common kernel pattern:
> >>
> >> kobject_del(kobj);
> >> kobject_put(kobj);
> > 
> > IMHO, this is used for another reason.
> > 
> > kobject_del() removes the object from the hierarchy. Therefore
> > it prevents new operations. But it does not wait for the existing
> > operations to finish. Therefore there still might be users that
> > access the data even after this function finishes.
> 
> I looked into this further - and it does appear to wait until all
> operations finish. In kernfs_drain() the code does:
> 
> wait_event(root->deactivate_waitq, atomic_read(&kn->active) ==
> KN_DEACTIVATED_BIAS);
> 
> The call stack is:
> 
> kobject_del()
>  sysfs_remove_dir()
>   kernfs_remove()
>    __kernfs_remove()
>     kernfs_drain()
> 
> And __kernfs_remove() has already done a 'deactivate' prior:
> 
> /* prevent any new usage under @kn by deactivating all nodes */
> pos = NULL;
> while ((pos = kernfs_next_descendant_post(pos, kn)))
>        if (kernfs_active(pos))
>                atomic_add(KN_DEACTIVATED_BIAS, &pos->active);
> 
> 
> So I *think* doing the kobject_del() first is sufficient. It may be
> worth documenting this better if that is the case...

Sigh, I am confused. Your arguments look convincing. But I still
have a fear. As I said, we had a long discussion about this a long
time ago. Unfortunately, I do not remember all the details.

In any case, the following is mentioned in
Documentation/kobject.txt:

=== cut ===
Once you registered your kobject via kobject_add(), you must never use
kfree() to free it directly. The only safe way is to use kobject_put(). It
is good practice to always use kobject_put() after kobject_init() to avoid
errors creeping in.

This notification is done through a kobject's release() method. Usually
such a method has a form like::

    void my_object_release(struct kobject *kobj)
    {
            struct my_object *mine = container_of(kobj, struct my_object, kobj);

            /* Perform any additional cleanup on this object, then... */
            kfree(mine);
    }

One important point cannot be overstated: every kobject must have a
release() method, and the kobject must persist (in a consistent state)
until that method is called. If these constraints are not met, the code is
flawed.  Note that the kernel will warn you if you forget to provide a
release() method.  Do not try to get rid of this warning by providing an
"empty" release function; you will be mocked mercilessly by the kobject
maintainer if you attempt this.

Note, the name of the kobject is available in the release function, but it
must NOT be changed within this callback.  Otherwise there will be a memory
leak in the kobject core, which makes people unhappy.
=== cut ===


The fact is that if you enable CONFIG_DEBUG_KOBJECT_RELEASE, it will
intentionally make a deferred access to the struct kobject.
See schedule_delayed_work() in kobject_release().
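
Roughly, the relevant logic looks like this (paraphrased sketch of
kobject_release(); the real code also logs the kobject and uses a
randomized delay, and details vary by kernel version):

static void kobject_release(struct kref *kref)
{
	struct kobject *kobj = container_of(kref, struct kobject, kref);

#ifdef CONFIG_DEBUG_KOBJECT_RELEASE
	/* defer the real cleanup; the kobject is accessed again later */
	INIT_DELAYED_WORK(&kobj->release, kobject_delayed_cleanup);
	schedule_delayed_work(&kobj->release, HZ);
#else
	kobject_cleanup(kobj);
#endif
}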

The struct kobject is part of both struct klp_func and
struct klp_object. The delayed call will cause an access
to already freed memory if you free the structures
immediately after calling kobject_put().


I am not sure why this is. Either we missed something and
kernfs_drain() is not enough to avoid extra references
to the kobject, or the kobject authors are over-paranoid and
push people toward the single supported "right design"
the hard way.

Anyway, we either need to "fix" the kobject implementation
or we need to free the structures in the respective
release callbacks.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 2/3] livepatch: add atomic replace
  2017-09-15 15:46               ` Petr Mladek
@ 2017-09-16  0:10                 ` Greg KH
  0 siblings, 0 replies; 14+ messages in thread
From: Greg KH @ 2017-09-16  0:10 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Jason Baron, linux-kernel, live-patching, jpoimboe, jeyu, jikos, mbenes

On Fri, Sep 15, 2017 at 05:46:58PM +0200, Petr Mladek wrote:
> On Thu 2017-09-14 08:31:24, Jason Baron wrote:
> > 
> > 
> > On 09/14/2017 06:32 AM, Petr Mladek wrote:
> > > On Tue 2017-09-12 23:47:32, Jason Baron wrote:
> > >>
> > >>
> > >> On 09/12/2017 01:35 AM, Petr Mladek wrote:
> > >>> On Mon 2017-09-11 23:46:28, Jason Baron wrote:
> > >>>> On 09/11/2017 09:53 AM, Petr Mladek wrote:
> > >>>>> On Wed 2017-08-30 17:38:44, Jason Baron wrote:
> > >>>>>> diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
> > >>>>>> index 6004be3..21cecc5 100644
> > >>>>>> --- a/kernel/livepatch/core.c
> > >>>>>> +++ b/kernel/livepatch/core.c
> > >>>>>> @@ -600,13 +603,38 @@ static void klp_free_patch(struct klp_patch *patch)
> > >>>>>>  		list_del(&patch->list);
> > >>>>>>  }
> > >>>>>>  
> > >>>>>> -static int klp_init_func(struct klp_object *obj, struct klp_func *func)
> > >>>>>> +void klp_patch_free_no_ops(struct klp_patch *patch)
> > >>>>>> +{
> > >>>>>> +	struct klp_object *obj, *tmp_obj;
> > >>>>>> +	struct klp_func *func, *tmp_func;
> > >>>>>> +
> > >>>>>> +	klp_for_each_object(patch, obj) {
> > >>>>>> +		list_for_each_entry_safe(func, tmp_func, &obj->func_list,
> > >>>>>> +					 func_entry) {
> > >>>>>> +			list_del_init(&func->func_entry);
> > >>>>>> +			kobject_put(&func->kobj);
> > >>>>>> +			kfree(func->old_name);
> > >>>>>> +			kfree(func);
> > >>>>>
> > >>>>> kobject_put() is asynchronous. The structure should be freed using
> > >>>>> the kobject release method.
> > >>>>>
> > >>>>> The question is how secure this should be. We might want to block
> > >>>>> other operations with the patch until all the release methods
> > >>>>> are called. But this might be tricky as there is not a single
> > >>>>> root kobject that would get destroyed at the end.
> > >>>>>
> > >>>>> A crazy solution would be to define a global atomic counter.
> > >>>>> It might get increased with each kobject_put() call and
> > >>>>> decreased in each release method. Then we could wait
> > >>>>> until it gets zero.
> > >>>>>
> > >>>>> It should be safe to wait with klp_mutex taken. Note that this
> > >>>>> is not possible with patch->kobj, where the "enabled"
> > >>>>> attribute takes the mutex as well, see
> > >>>>> enabled_store() in kernel/livepatch/core.c.
> > >>>>
> > >>>> Thanks for pointing this out - it looks like the livepatch code uses
> > >>>> wait_for_completion() with special kobj callbacks. Perhaps, there could
> > >>>> be a nop callback, but I'd have to look at this more closely...
> > >>>
> > >>> The completion is usable for the root kobject but you do not free
> > >>> it here. It might be pretty complicated if you need a separate
> > >>> completion for each of the freed kobjects.
> > >>>
> > >>> A solution might be a global atomic counter and a waitqueue.
> > >>> Feel free to ask for more details.
> > >>>
> > >>
> > >> So the issue is that the user might access some of the klp_* data
> > >> structures via /sysfs after we have already freed them?
> > > 
> > > yes
> > > 
> > >> if so, this seems to be a common kernel pattern:
> > >>
> > >> kobject_del(kobj);
> > >> kobject_put(kobj);
> > > 
> > > IMHO, this is used for another reason.
> > > 
> > > kobject_del() removes the object from the hierarchy. Therefore
> > > it prevents new operations. But it does not wait for the existing
> > > operations to finish. Therefore there still might be users that
> > > access the data even after this function finishes.
> > 
> > I looked into this further - and it does appear to wait until all
> > operations finish. In kernfs_drain() the code does:
> > 
> > wait_event(root->deactivate_waitq, atomic_read(&kn->active) ==
> > KN_DEACTIVATED_BIAS);
> > 
> > The call stack is:
> > 
> > kobject_del()
> >  sysfs_remove_dir()
> >   kernfs_remove()
> >    __kernfs_remove()
> >     kernfs_drain()
> > 
> > And __kernfs_remove() has already done a 'deactivate' prior:
> > 
> > /* prevent any new usage under @kn by deactivating all nodes */
> > pos = NULL;
> > while ((pos = kernfs_next_descendant_post(pos, kn)))
> >        if (kernfs_active(pos))
> >                atomic_add(KN_DEACTIVATED_BIAS, &pos->active);
> > 
> > 
> > So I *think* doing the kobject_del() first is sufficient. It may be
> > worth documenting this better if that is the case...
> 
> Sigh, I am confused. Your arguments look convincing. But I still
> have a fear. As I said, we had a long discussion about this a long
> time ago. Unfortunately, I do not remember all the details.
> 
> In any case, the following is mentioned in
> Documentation/kobject.txt:
> 
> === cut ===
> Once you registered your kobject via kobject_add(), you must never use
> kfree() to free it directly. The only safe way is to use kobject_put(). It
> is good practice to always use kobject_put() after kobject_init() to avoid
> errors creeping in.
> 
> This notification is done through a kobject's release() method. Usually
> such a method has a form like::
> 
>     void my_object_release(struct kobject *kobj)
>     {
>             struct my_object *mine = container_of(kobj, struct my_object, kobj);
> 
>             /* Perform any additional cleanup on this object, then... */
>             kfree(mine);
>     }
> 
> One important point cannot be overstated: every kobject must have a
> release() method, and the kobject must persist (in a consistent state)
> until that method is called. If these constraints are not met, the code is
> flawed.  Note that the kernel will warn you if you forget to provide a
> release() method.  Do not try to get rid of this warning by providing an
> "empty" release function; you will be mocked mercilessly by the kobject
> maintainer if you attempt this.
> 
> Note, the name of the kobject is available in the release function, but it
> must NOT be changed within this callback.  Otherwise there will be a memory
> leak in the kobject core, which makes people unhappy.
> === cut ===
> 
> 
> The fact is that if you enable CONFIG_DEBUG_KOBJECT_RELEASE, it will
> intentionally make a deferred access to the struct kobject.
> See schedule_delayed_work() in kobject_release().
> 
> The struct kobject is part of both struct klp_func and
> struct klp_object. The delayed call will cause an access
> to already freed memory if you free the structures
> immediately after calling kobject_put().
> 
> 
> I am not sure why this is. Either we missed something and
> kernfs_drain() is not enough to avoid extra references
> to the kobject, or the kobject authors are over-paranoid and
> push people toward the single supported "right design"
> the hard way.
> 
> Anyway, we either need to "fix" the kobject implementation
> or we need to free the structures in the respective
> release callbacks.

You read the documentation correctly: you need to free your structures
in the release callbacks; kobject is doing this correctly.

Also, it's kind of hard to keep an extra reference around; kernfs does a
good job of keeping the kernel from crashing for issues like this.  Try
enabling the above-mentioned kernel option, along with slab poisoning,
and watch your code blow up :)
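
For reference, the options in question (names as of current kernels;
slub_debug=P on the kernel command line also turns on poisoning):

CONFIG_DEBUG_KOBJECT_RELEASE=y
CONFIG_SLUB_DEBUG=y
CONFIG_SLUB_DEBUG_ON=y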

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2017-09-16  0:10 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-08-30 21:38 [PATCH v2 0/3] livepatch: introduce atomic replace Jason Baron
2017-08-30 21:38 ` [PATCH v2 1/3] livepatch: Add dynamic klp_object and klp_func iterators Jason Baron
2017-09-07 12:34   ` Petr Mladek
2017-09-07 16:53     ` Joe Perches
2017-08-30 21:38 ` [PATCH v2 2/3] livepatch: add atomic replace Jason Baron
2017-09-11 13:53   ` Petr Mladek
2017-09-12  3:46     ` Jason Baron
2017-09-12  8:35       ` Petr Mladek
2017-09-13  6:47         ` Jason Baron
2017-09-14 13:32           ` Petr Mladek
2017-09-14 15:31             ` Jason Baron
2017-09-15 15:46               ` Petr Mladek
2017-09-16  0:10                 ` Greg KH
2017-08-30 21:38 ` [PATCH v2 3/3] livepatch: Add a sysctl livepatch_mode for " Jason Baron

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).