Date: Thu, 23 Sep 2021 15:14:48 +0200
From: Petr Mladek
To: Peter Zijlstra
Cc: gor@linux.ibm.com, jpoimboe@redhat.com, jikos@kernel.org,
    mbenes@suse.cz, mingo@kernel.org, linux-kernel@vger.kernel.org,
    joe.lawrence@redhat.com, fweisbec@gmail.com, tglx@linutronix.de,
    hca@linux.ibm.com, svens@linux.ibm.com, sumanthk@linux.ibm.com,
    live-patching@vger.kernel.org, paulmck@kernel.org
Subject: Re: [RFC][PATCH 7/7] livepatch,context_tracking: Avoid disturbing NOHZ_FULL tasks

On Wed 2021-09-22 13:05:13, Peter Zijlstra wrote:
> When a task is stuck in NOHZ_FULL usermode, we can simply mark the
> livepatch state complete.
>
> Signed-off-by: Peter Zijlstra (Intel)
> ---
>  kernel/livepatch/transition.c | 13 ++++++++++++-
>  1 file changed, 12 insertions(+), 1 deletion(-)
>
> --- a/kernel/livepatch/transition.c
> +++ b/kernel/livepatch/transition.c
> @@ -270,13 +270,24 @@ static int klp_check_task(struct task_st
>  {
>         int ret;
>
> -       if (task_curr(task))
> +       if (task_curr(task)) {
> +               if (context_tracking_state_cpu(task_cpu(task)) == CONTEXT_USER) {
> +                       /*
> +                        * If we observe the CPU being in USER context, they
> +                        * must issue an smp_mb() before doing much kernel
> +                        * space and as such will observe the patched state,
> +                        * mark it clean.
> +                        */
> +                       goto complete;

IMHO, this is not safe:

CPU0                                    CPU1

klp_check_task(A)

  if (context_tracking_state_cpu(task_cpu(task)) == CONTEXT_USER)
     goto complete;

  clear_tsk_thread_flag(task, TIF_PATCH_PENDING);

                                        # task switching to kernel space
                                        klp_update_patch_state(A)

                                          if (test_and_clear_tsk_thread_flag(task, TIF_PATCH_PENDING))
                                          // false

                                        # calling kernel code with old
                                        # task->patch_state

  task->patch_state = klp_target_state;

BANG: CPU0 sets task->patch_state when task A is already running
kernel code on CPU1.

> +               }
>                 return -EBUSY;
> +       }
>
>         ret = klp_check_stack(task, arg);
>         if (ret)
>                 return ret;
>
> +complete:
>         clear_tsk_thread_flag(task, TIF_PATCH_PENDING);
>         task->patch_state = klp_target_state;

A solution might be to switch the ordering and add a barrier here.

>         return 0;

The code might look like:

static int klp_check_task(struct task_struct *task, void *arg)
{
        int ret;

        if (task_curr(task)) {
                if (context_tracking_state_cpu(task_cpu(task)) == CONTEXT_USER) {
                        /*
                         * Tasks running in USER mode might get switched
                         * immediately. They are switched when entering
                         * kernel code anyway.
                         */
                        goto complete;
                }
                return -EBUSY;
        }

        ret = klp_check_stack(task, arg);
        if (ret)
                return ret;

complete:
        WRITE_ONCE(task->patch_state, klp_target_state);
        /*
         * We also switch tasks running in USER mode here. They must
         * see the new state before clearing the pending flag.
         * Otherwise, they might enter kernel mode without switching
         * the state in klp_update_patch_state().
         */
        smp_wmb();
        clear_tsk_thread_flag(task, TIF_PATCH_PENDING);

        return 0;
}

The only problem is that the corresponding read barrier is not clear.
It would make more sense if it were paired with some read barrier in
the scheduler after handling the TIF flags.

But we should be on the safe side because klp_ftrace_handler() always
does a read barrier before reading the state. Though it is done there
for other reasons.

Best Regards,
Petr
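
PS: Just to make the pairing concrete, the read side could spell out
the barrier explicitly. This is only a sketch, not part of the patch;
it assumes klp_update_patch_state() keeps its current shape, and the
smp_rmb() placement is my own illustration:

void klp_update_patch_state(struct task_struct *task)
{
        preempt_disable_notrace();

        if (test_and_clear_tsk_thread_flag(task, TIF_PATCH_PENDING))
                task->patch_state = READ_ONCE(klp_target_state);

        /*
         * Pairs with the smp_wmb() in the proposed klp_check_task()
         * above. When the pending flag is observed as already cleared
         * by another CPU, later reads of task->patch_state must see
         * the value that was written before that CPU cleared the flag.
         */
        smp_rmb();

        preempt_enable_notrace();
}

The barrier would only be needed on the "flag already cleared" path,
but placing it unconditionally keeps the function simple. In practice
the smp_rmb() in klp_ftrace_handler() should already cover the path
that actually consumes patch_state, as mentioned above.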