From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1757972AbaEPSKI (ORCPT ); Fri, 16 May 2014 14:10:08 -0400
Received: from mail9.hitachi.co.jp ([133.145.228.44]:49292 "EHLO
	mail9.hitachi.co.jp" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1757914AbaEPSKF (ORCPT );
	Fri, 16 May 2014 14:10:05 -0400
Message-ID: <53765475.6040707@hitachi.com>
Date: Sat, 17 May 2014 03:09:57 +0900
From: Masami Hiramatsu
Organization: Hitachi, Ltd., Japan
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:13.0) Gecko/20120614 Thunderbird/13.0.1
MIME-Version: 1.0
To: Jiri Kosina
Cc: Steven Rostedt , Ingo Molnar , Frederic Weisbecker ,
	Josh Poimboeuf , Seth Jennings , Ingo Molnar , Jiri Slaby ,
	linux-kernel@vger.kernel.org, Peter Zijlstra , Andrew Morton ,
	Linus Torvalds , Thomas Gleixner
Subject: Re: Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching
References: <20140505085537.GA32196@gmail.com>
	<20140505132638.GA14432@treble.redhat.com>
	<20140505141038.GA27403@localhost.localdomain>
	<20140505184304.GA15137@gmail.com>
	<5368CB6E.3090105@hitachi.com>
	<20140506082604.31928cb9@gandalf.local.home>
In-Reply-To:
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

(2014/05/17 1:27), Jiri Kosina wrote:
> On Tue, 6 May 2014, Steven Rostedt wrote:
>
>>> However, I also think if users can accept such freezing wait-time,
>>> it means they can also accept kexec based "checkpoint-restart" patching.
>>> So, I think the final goal of the kpatch will be live patching without
>>> stopping the machine. I'm discussing the issue on github #138, but that is
>>> off-topic. :)
>>
>> I agree with Ingo too. Being conservative at first is the right
>> approach here. We should start out with a stop_machine making sure that
>> everything is sane before we continue. Sure, that's not much different
>> than a kexec, but let's take things one step at a time.
>>
>> ftrace did the stop_machine (and still does for some archs), and slowly
>> moved to a more efficient method. kpatch/kgraft should follow suit.
>
> I don't really agree here.
>
> I actually believe that "lazy" switching kgraft is doing provides a little
> bit more in the sense of consistency than stop_machine()-based approach.
>
> Consider this scenario:
>
> 	void foo()
> 	{
> 		for (i=0; i<10000; i++) {
> 			bar(i);
> 			something_else(i);
> 		}
> 	}

In this case, I'd recommend adding foo() to the set of functions to be
replaced, as a dummy target. kpatch can then ensure that foo() is not
actually running when the patch is applied. :)

> Let's say you want to live-patch bar(). With stop_machine()-based approach,
> you can easily end-up with old bar() and new bar() being called in two
> consecutive iterations before the loop is even exited, right? (especially
> on preemptible kernel, or if something_else() goes to sleep).
>
> With lazy-switching implemented in kgraft, this can never happen.

I guess a similar thing may happen with kgraft, though. If the old
function and the new function share a non-automatic variable and modify
it in different ways, their mutual interference will produce unexpected
results.

Thank you,

>
> So I'd like to ask for a little bit more explanation why you think the
> stop_machine()-based patching provides more sanity/consistency assurance
> than the lazy switching we're doing.
>
> Thanks a lot,
>

-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com