From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751899AbeCVPnU (ORCPT ); Thu, 22 Mar 2018 11:43:20 -0400
Received: from mx2.suse.de ([195.135.220.15]:52503 "EHLO mx2.suse.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1751870AbeCVPnN (ORCPT ); Thu, 22 Mar 2018 11:43:13 -0400
Date: Thu, 22 Mar 2018 16:43:11 +0100
From: Petr Mladek
To: Josh Poimboeuf
Cc: Jiri Kosina, Miroslav Benes, Jason Baron, Joe Lawrence, Jessica Yu,
	Evgenii Shatokhin, live-patching@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v10 06/10] livepatch: Add atomic replace
Message-ID: <20180322154311.eosu7udehilywlfb@pathway.suse.cz>
References: <20180307082039.10196-1-pmladek@suse.com>
	<20180307082039.10196-7-pmladek@suse.com>
	<20180313224804.hvqngflltznjtttw@treble>
	<20180320143501.j66vjipux2fedz5j@pathway.suse.cz>
	<20180320212619.3rb6lfbtrjkfe4id@treble>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20180320212619.3rb6lfbtrjkfe4id@treble>
User-Agent: NeoMutt/20170421 (1.8.2)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue 2018-03-20 16:26:19, Josh Poimboeuf wrote:
> On Tue, Mar 20, 2018 at 03:35:01PM +0100, Petr Mladek wrote:
> > On Tue 2018-03-13 17:48:04, Josh Poimboeuf wrote:
> > > On Wed, Mar 07, 2018 at 09:20:35AM +0100, Petr Mladek wrote:
> > > > This patch adds a new "replace" flag to struct klp_patch. When it is
> > > > enabled, a set of 'nop' klp_func will be dynamically created for all
> > > > functions that are already being patched but that will no longer be
> > > > modified by the new patch. They are temporarily used as a new target
> > > > during the patch transition.
> > > >
> > > > diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
> > > > index fd0296859ff4..ad508a86b2f9 100644
> > > > --- a/kernel/livepatch/core.c
> > > > +++ b/kernel/livepatch/core.c
> > > > +static int klp_add_nops(struct klp_patch *patch)
> > > > +{
> > > > +	struct klp_patch *old_patch;
> > > > +	struct klp_object *old_obj;
> > > > +	int err = 0;
> > > > +
> > > > +	if (WARN_ON(!patch->replace))
> > > > +		return -EINVAL;
> > >
> > > IMO, this is another one of those overly paranoid warnings that isn't
> > > really needed.  Why would we call klp_add_nops() for a non-replace
> > > patch?
> >
> > Just to be sure. What is the difference, for example, against the following
> > checks in __klp_enable_patch() from your point of view, please?
> >
> > 	if (klp_transition_patch)
> > 		return -EBUSY;
> >
> > 	if (WARN_ON(patch->enabled))
> > 		return -EINVAL;
> >
> > One difference is that klp_enable_patch() is an exported symbol. On the
> > other hand, livepatch code developers could make mistakes as well.
> > Adding nops sounds like an innocent operation after all ;-)
>
> But klp_enable_patch() being an exported symbol is an important
> difference.  It catches a patch author abusing the interface.  Which is
> much more likely than one of us accidentally calling klp_add_nops().
> Have you not noticed how thorough our code reviews are? ;-)
>
> Anyway, I suppose it's a harmless check and I don't feel very strongly
> about it, it just seems unnecessary.

I have removed the check.

> > > > diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
> > > > index 6917100fbe79..d6af190865d2 100644
> > > > --- a/kernel/livepatch/transition.c
> > > > +++ b/kernel/livepatch/transition.c
> > > > @@ -87,6 +87,36 @@ static void klp_complete_transition(void)
> > > >  		 klp_transition_patch->mod->name,
> > > >  		 klp_target_state == KLP_PATCHED ?
> > > >  		 "patching" : "unpatching");
> > > >
> > > > +	/*
> > > > +	 * For replace patches, we disable all previous patches, and replace
> > > > +	 * the dynamic no-op functions by removing the ftrace hook.
> > > > +	 */
> > > > +	if (klp_transition_patch->replace && klp_target_state == KLP_PATCHED) {
> > > > +		/*
> > > > +		 * Make sure that no ftrace handler accesses any older patch
> > > > +		 * on the stack.  This might happen when the user forced the
> > > > +		 * transaction while some running tasks were still falling
> > > > +		 * back to the old code.  There might even still be ftrace
> > > > +		 * handlers that have not seen the last patch on the stack yet.
> > > > +		 *
> > > > +		 * It probably is not necessary because of the rcu-safe access.
> > > > +		 * But better be safe than sorry.
> > > > +		 */
> > > > +		if (klp_forced)
> > > > +			klp_synchronize_transition();
> > >
> > > I don't like this.  Hopefully we can just get rid of it, if we also get
> > > rid of the concept of "throwing away" patches like I proposed.
> >
> > What exactly do you not like about it, please?
> >
> > It is not needed if all processes were migrated using the consistency
> > model, definitely.
> >
> > If the transition has been forced then the barrier should be needed for
> > similar reasons as the barrier after klp_unpatch_objects() below.
> > We basically want to be sure what ftrace handlers see on the stack.
> >
> > Will it help, when I remove the last paragraph where the formulation
> > is quite uncertain?
>
> Well, the last paragraph doesn't inspire a lot of confidence ;-)  It
> sounds like voodoo.

Races are never an easy area. You are right that I was not completely
confident and wanted to be on the safe side. Your questions helped
me realize that the synchronization is not needed.

> Also the comment just seems very confusing to me:
>
> - What specifically is it protecting against, e.g., _why_ should no
>   ftrace handler access any old patch on the stack, and when shouldn't
>   it do so?
>   Is the barrier needed before func->transition is cleared,
>   or what?

I was afraid of an invalid memory access in klp_ftrace_handler().
I simply underestimated the power of RCU. I am not that familiar
with it. Also the following is a bit non-standard:

	func = list_entry_rcu(func->stack_node.next,
			      struct klp_func, stack_node);

Anyway, you made me check it. It all looks safe after all.

> - Does RCU make it safe, or doesn't it?  If yes, why is this needed?  If
>   no, why not?

Yes. I removed the synchronization. Instead, I explained
the situation in a comment above klp_discard_replaced_patches().

> > > > +
> > > > +		klp_throw_away_replaced_patches(klp_transition_patch,
> > > > +						klp_forced);
> > > > +
> > > > +		/*
> > > > +		 * There is no need to synchronize the transition after removing
> > > > +		 * nops.  They must be the last on the func_stack.  Ftrace
> > > > +		 * gurantees that nobody will stay in the trampoline after
> > >
> > > "guarantees"
> > >
> > > > +		 * the ftrace handler is unregistered.
> > > > +		 */
> > > > +		klp_unpatch_objects(klp_transition_patch, KLP_FUNC_NOP);
> > > > +	}
> > > > +
> > > >  	if (klp_target_state == KLP_UNPATCHED) {
> > > >  		/*
> > > >  		 * All tasks have transitioned to KLP_UNPATCHED so we can now
> > > > @@ -143,6 +173,15 @@ static void klp_complete_transition(void)
> > > >  	if (!klp_forced && klp_target_state == KLP_UNPATCHED)
> > > >  		module_put(klp_transition_patch->mod);
> > > >
> > > > +	/*
> > > > +	 * We do not need to wait until the objects are really freed.
> > > > +	 * The patch must be on the bottom of the stack.  Therefore it
> > > > +	 * will never replace anything else.  The only important thing
> > > > +	 * is that we wait when the patch is being unregistered.
> > > > +	 */
> > > > +	if (klp_transition_patch->replace && klp_target_state == KLP_PATCHED)
> > > > +		klp_free_objects(klp_transition_patch, KLP_FUNC_NOP);
> > > > +
> > > >
> > > This makes me a bit nervous.  What happens if the patch is enabled, then
> > > disabled, then enabled again?
> > > Then klp_free_objects() wouldn't do
> > > anything, because the ops would already be freed.
> >
> > They are not necessary when all replaced patches are removed from
> > the stack. There will be no livepatch if this one gets disabled.
>
> My point was that if you enable, then disable, then enable again,
> klp_free_objects() will get called again, and it will do nothing the
> second time around.
> Maybe that's safe in this instance, but in general, it's easy to forget
> the re-enable case when adding special cases for 'patch->replace'.

Yes, it is safe.

> I get the feeling that it would be safer to just clear 'patch->replace'
> after this step to avoid such scenarios.  After all, when re-enabling a
> 'replace' patch, it's no longer replacing anything (assuming here that a
> replace patch will permanently disable all previous patches).

I am not completely comfortable with touching an item that is set
by the author of the patch. On the other hand, I do not see
how it could do any harm. I have just added the line to clear
the flag.

Best Regards,
Petr
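
For reference, the enable/disable/enable case discussed above can be
modelled in user space. This is only a minimal sketch under stated
assumptions: the toy_* names, the nop counter and the replace_runs
counter are hypothetical stand-ins, not the real struct klp_patch or
the code in the patch. It shows why clearing the replace flag after the
first completed transition keeps the replace-only steps from running
again when the patch is re-enabled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for struct klp_patch. */
struct toy_patch {
	bool replace;		/* set by the patch author */
	bool enabled;
	int nops;		/* dynamically created nop functions */
	int replace_runs;	/* how often the replace-only path ran */
};

/* Models the end of a completed transition to the patched state. */
static void toy_complete_enable(struct toy_patch *p)
{
	p->enabled = true;

	if (!p->replace)
		return;

	p->nops = 0;		/* stands in for freeing the nops */
	p->replace_runs++;
	p->replace = false;	/* nothing left to replace next time */
}

/* Models the end of a transition to the unpatched state. */
static void toy_complete_disable(struct toy_patch *p)
{
	p->enabled = false;
}

/* Enable, disable, enable again; returns the final state by value. */
static struct toy_patch toy_reenable_scenario(void)
{
	struct toy_patch p = { .replace = true, .nops = 3 };

	toy_complete_enable(&p);
	toy_complete_disable(&p);
	toy_complete_enable(&p);
	return p;
}
```

Calling toy_reenable_scenario() from a small test program and checking
the returned fields is enough to see that the second enable skips the
replace-only branch in this model.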