Date: Fri, 16 May 2014 18:27:27 +0200 (CEST)
From: Jiri Kosina
To: Steven Rostedt
cc: Masami Hiramatsu, Ingo Molnar, Frederic Weisbecker, Josh Poimboeuf,
    Seth Jennings, Ingo Molnar, Jiri Slaby, linux-kernel@vger.kernel.org,
    Peter Zijlstra, Andrew Morton, Linus Torvalds, Thomas Gleixner
Subject: Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching
In-Reply-To: <20140506082604.31928cb9@gandalf.local.home>
References: <20140505085537.GA32196@gmail.com>
    <20140505132638.GA14432@treble.redhat.com>
    <20140505141038.GA27403@localhost.localdomain>
    <20140505184304.GA15137@gmail.com>
    <5368CB6E.3090105@hitachi.com>
    <20140506082604.31928cb9@gandalf.local.home>
User-Agent: Alpine 2.00 (LNX 1167 2008-08-23)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, 6 May 2014, Steven Rostedt wrote:

> > However, I also think if users can accept such freezing wait-time,
> > it means they can also accept kexec based "checkpoint-restart" patching.
> > So, I think the final goal of the kpatch will be live patching without
> > stopping the machine. I'm discussing the issue on github #138, but that is
> > off-topic. :)
>
> I agree with Ingo too. Being conservative at first is the right
> approach here. We should start out with a stop_machine making sure that
> everything is sane before we continue. Sure, that's not much different
> than a kexec, but lets take things one step at a time.
>
> ftrace did the stop_machine (and still does for some archs), and slowly
> moved to a more efficient method. kpatch/kgraft should follow suit.

I don't really agree here.

I actually believe that the "lazy" switching kgraft is doing provides a
little bit more in the sense of consistency than the stop_machine()-based
approach.

Consider this scenario:

	void foo()
	{
		int i;

		for (i = 0; i < 10000; i++) {
			bar(i);
			something_else(i);
		}
	}

Let's say you want to live-patch bar(). With the stop_machine()-based
approach, you can easily end up with old bar() and new bar() being called
in two consecutive iterations before the loop is even exited, right?
(Especially on a preemptible kernel, or if something_else() goes to sleep.)

With the lazy switching implemented in kgraft, this can never happen.

So I'd like to ask for a little bit more explanation of why you think
stop_machine()-based patching provides more sanity/consistency assurance
than the lazy switching we're doing.

Thanks a lot,

--
Jiri Kosina
SUSE Labs
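
The foo()/bar() scenario above can be modelled in userspace as a rough sketch. This is not how kpatch or kgraft are implemented. It is a single-threaded toy in which the moment of patching is simulated by a loop index, and every name in it (bar_global, bar_lazy, patch_applied, task_migrated, count_version_changes) is hypothetical. It assumes that a global switch redirects every caller as soon as the patch is applied, while a lazy switch keeps a task on the old function until that task has passed some point at which switching is safe.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical userspace model of the foo()/bar() scenario.
 * None of these names are kernel, kpatch or kgraft APIs.
 */

static bool patch_applied;   /* the patch has been loaded */
static bool task_migrated;   /* lazy model only: this task passed a safe point */

static int bar_old(int i) { (void)i; return 0; }   /* 0 marks the old bar() */
static int bar_new(int i) { (void)i; return 1; }   /* 1 marks the new bar() */

/* Global model: every caller is redirected as soon as the patch is applied. */
static int bar_global(int i)
{
	return patch_applied ? bar_new(i) : bar_old(i);
}

/* Lazy model: a task keeps the old bar() until it has passed a safe point. */
static int bar_lazy(int i)
{
	return (patch_applied && task_migrated) ? bar_new(i) : bar_old(i);
}

/*
 * Run the loop from foo() and apply the patch just before iteration patch_at.
 * Returns how many times consecutive iterations saw different bar() versions.
 */
static int count_version_changes(int (*bar)(int), int iterations, int patch_at)
{
	int changes = 0;
	int prev = -1;   /* no previous iteration yet */

	assert(iterations > 0);
	patch_applied = false;
	task_migrated = false;

	for (int i = 0; i < iterations; i++) {
		if (i == patch_at)
			patch_applied = true;   /* patch lands mid-loop */

		int version = bar(i);
		if (prev != -1 && version != prev)
			changes++;
		prev = version;
	}

	/* Assumed safe point: the task has left the loop and may switch now. */
	task_migrated = true;
	return changes;
}
```

In this model, count_version_changes(bar_global, 10000, 5000) reports one version change inside the loop, while count_version_changes(bar_lazy, 10000, 5000) reports none: the new bar() only becomes visible to the task after the loop has finished.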