From: Thomas Gleixner
To: Hsin-Yi Wang
Subject: Re: [PATCH v5] reboot: support offline CPUs before reboot
In-Reply-To: <20200115063410.131692-1-hsinyi@chromium.org>
References: <20200115063410.131692-1-hsinyi@chromium.org>
Date: Thu, 16 Jan 2020 01:30:33 +0100
Message-ID: <8736cgxmxi.fsf@nanos.tec.linutronix.de>
Cc: Mark Rutland, linux-ia64@vger.kernel.org, linux-sh@vger.kernel.org,
 Peter Zijlstra, Heiko Carstens, linux-kernel@vger.kernel.org,
 sparclinux@vger.kernel.org, Guenter Roeck, Will Deacon, Ingo Molnar,
 linux-s390@vger.kernel.org, linux-csky@vger.kernel.org, Aaro Koskinen,
 Fenghua Yu, linux-pm@vger.kernel.org, linux-xtensa@linux-xtensa.org,
 Stephen Boyd, Josh Poimboeuf, Pavankumar Kondeti,
 linux-arm-kernel@lists.infradead.org, linux-parisc@vger.kernel.org,
 Greg Kroah-Hartman, linux-mips@vger.kernel.org, James Morse, Jiri Kosina,
 Vitaly Kuznetsov, linuxppc-dev@lists.ozlabs.org

Hsin-Yi Wang writes:

> Currently system reboots uses architecture specific codes (smp_send_stop)
> to offline non reboot CPUs. Most architecture's implementation is looping
> through all non reboot online CPUs and call ipi function to each of them. Some
> architecture like arm64, arm, and x86... would set offline masks to cpu without
> really offline them. This causes some race condition and kernel warning comes
> out sometimes when system reboots.

'some race condition and kernel warning' is pretty useless information.
Please describe exactly which kind of issues are caused by the current
mechanism. The race conditions in particular are the interesting part
(the warnings are just a consequence).

> This patch adds a config ARCH_OFFLINE_CPUS_ON_REBOOT, which would
> offline cpus in

Please read Documentation/process/submitting-patches.rst and search for
'This patch'.

> migrate_to_reboot_cpu(). If non reboot cpus are all offlined here, the loop for
> checking online cpus would be an empty loop.

This does not make any sense. The issues which you are trying to solve
will still be there when CONFIG_HOTPLUG_CPU is disabled.

> If architecture don't enable this config, or some cpus somehow fails
> to offline, it would fallback to ipi function.

This is really a half-baked solution which keeps the various pointlessly
different pseudo reboot/kexec offlining implementations around.
So with this we have yet more code which only works depending on kernel
configuration and has the issue of potentially not being able to offline
a CPU. IOW this is going to fail completely in cases where a system is
in a state which prevents regular hotplug.

The existing pseudo-offline functions have timeouts and eventually a
fallback, e.g. the NMI fallback on x86. With this proposed regular
offline solution this will just get stuck w/o a chance to force
recovery.

While I like the idea and surely agree that the ideal solution is to
properly shut down the CPUs on reboot, we need to take a step back and
look at the minimum requirements for a regular shutdown/reboot and at
the same time have a look at the requirements for emergency shutdown and
kexec/kcrash.

Having proper information about the race conditions and warnings you
mentioned would be a good starting point.

> Opt in this config for architectures that support CONFIG_HOTPLUG_CPU.

This is not opt-in. You force that on all architectures which support
CONFIG_HOTPLUG_CPU.

The way we do this normally is to provide the infrastructure first and
then have separate patches (one per architecture) enabling this, which
allows the architecture maintainers to decide individually.

Thanks,

        tglx
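
The point about timeouts and a forced fallback can be illustrated with a
small user-space model. This is only a sketch under stated assumptions,
not kernel code: struct sim_cpu, the sim_* functions and the poll counts
are hypothetical names made up for the illustration, and the forced
phase merely stands in for something like the NMI fallback on x86.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a secondary CPU; not a kernel data structure. */
struct sim_cpu {
	bool online;		/* CPU is still running */
	int polls_until_ack;	/* polls before it honours the stop request,
				 * negative means it never responds */
};

/* Count the simulated CPUs which are still online. */
size_t sim_count_online(const struct sim_cpu *cpus, size_t n)
{
	size_t i, cnt = 0;

	for (i = 0; i < n; i++)
		if (cpus[i].online)
			cnt++;
	return cnt;
}

/* One polling step: responsive CPUs move closer to going offline. */
void sim_poll(struct sim_cpu *cpus, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (!cpus[i].online || cpus[i].polls_until_ack < 0)
			continue;
		if (cpus[i].polls_until_ack == 0)
			cpus[i].online = false;
		else
			cpus[i].polls_until_ack--;
	}
}

/*
 * Stop all simulated CPUs. Phase 1 waits a bounded number of polls for
 * the CPUs to stop on their own. Phase 2 forces whatever is left, so the
 * caller can never get stuck. Returns the number of CPUs forced down.
 */
size_t sim_stop_cpus(struct sim_cpu *cpus, size_t n, int max_polls)
{
	size_t i, forced = 0;
	int t;

	assert(cpus != NULL || n == 0);

	/* Phase 1: cooperative stop with a timeout */
	for (t = 0; t < max_polls && sim_count_online(cpus, n) > 0; t++)
		sim_poll(cpus, n);

	/* Phase 2: forced fallback for unresponsive CPUs */
	for (i = 0; i < n; i++) {
		if (cpus[i].online) {
			cpus[i].online = false;
			forced++;
		}
	}
	return forced;
}
```

A stop path built purely on regular hotplug corresponds to phase 1
without the bound and without phase 2: a single CPU with a negative
polls_until_ack would keep the loop spinning forever.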