From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.0 required=3.0 tests=BAYES_00,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A2F66C4709D for ; Fri, 4 Jun 2021 10:04:02 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 91BF161208 for ; Fri, 4 Jun 2021 10:04:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230483AbhFDKFq (ORCPT ); Fri, 4 Jun 2021 06:05:46 -0400 Received: from mail.kernel.org ([198.145.29.99]:51676 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230410AbhFDKFl (ORCPT ); Fri, 4 Jun 2021 06:05:41 -0400 Received: from disco-boy.misterjones.org (disco-boy.misterjones.org [51.254.78.96]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id ABFA461369; Fri, 4 Jun 2021 10:03:54 +0000 (UTC) Received: from 78.163-31-62.static.virginmediabusiness.co.uk ([62.31.163.78] helo=why.misterjones.org) by disco-boy.misterjones.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1lp6gC-005SAa-JF; Fri, 04 Jun 2021 11:03:52 +0100 Date: Fri, 04 Jun 2021 11:03:50 +0100 Message-ID: <87tumeymih.wl-maz@kernel.org> From: Marc Zyngier To: Sergey Senozhatsky Cc: Paolo Bonzini , Sean Christopherson , Vitaly Kuznetsov , Jim Mattson , Huacai Chen , Paul Mackerras , Christian Borntraeger , Suleiman Souhlal , x86@kernel.org, linux-arm-kernel@lists.infradead.org, kvmarm@lists.cs.columbia.edu, linux-kernel@vger.kernel.org, linux-mips@vger.kernel.org, kvm@vger.kernel.org, kvm-ppc@vger.kernel.org, linux-s390@vger.kernel.org Subject: Re: [RFC][PATCH] kvm: add suspend pm-notifier In-Reply-To: References: <20210603164315.682994-1-senozhatsky@chromium.org> <87v96uyq2v.wl-maz@kernel.org> User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue) FLIM-LB/1.14.9 (=?UTF-8?B?R29qxY0=?=) APEL-LB/10.8 EasyPG/1.0.0 Emacs/27.1 (x86_64-pc-linux-gnu) MULE/6.0 (HANACHIRUSATO) MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue") Content-Type: text/plain; charset=US-ASCII X-SA-Exim-Connect-IP: 62.31.163.78 X-SA-Exim-Rcpt-To: senozhatsky@chromium.org, pbonzini@redhat.com, seanjc@google.com, vkuznets@redhat.com, jmattson@google.com, chenhuacai@kernel.org, paulus@ozlabs.org, borntraeger@de.ibm.com, suleiman@google.com, x86@kernel.org, linux-arm-kernel@lists.infradead.org, kvmarm@lists.cs.columbia.edu, linux-kernel@vger.kernel.org, linux-mips@vger.kernel.org, kvm@vger.kernel.org, kvm-ppc@vger.kernel.org, linux-s390@vger.kernel.org X-SA-Exim-Mail-From: maz@kernel.org X-SA-Exim-Scanned: No (on disco-boy.misterjones.org); SAEximRunCond expanded to false Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, 04 Jun 2021 10:20:28 +0100, Sergey Senozhatsky wrote: > > On (21/06/04 09:46), Marc Zyngier wrote: > [..] > > > +void kvm_arch_pm_notifier(struct kvm *kvm) > > > +{ > > > +#ifdef CONFIG_PM > > > + int c; > > > + > > > + mutex_lock(&kvm->lock); > > > + for (c = 0; c < kvm->created_vcpus; c++) { > > > + struct kvm_vcpu *vcpu = kvm->vcpus[c]; > > > + int r; > > > + > > > + if (!vcpu) > > > + continue; > > > > Wouldn't kvm_for_each_vcpu() avoid this kind of checks? > > Right, that's what I do in v2, which I haven't posted yet. > > [..] > > > +#include > > > + > > > #ifndef KVM_MAX_VCPU_ID > > > #define KVM_MAX_VCPU_ID KVM_MAX_VCPUS > > > #endif > > > @@ -579,6 +581,10 @@ struct kvm { > > > pid_t userspace_pid; > > > unsigned int max_halt_poll_ns; > > > u32 dirty_ring_size; > > > + > > > +#ifdef CONFIG_PM > > > + struct notifier_block pm_notifier; > > > +#endif > > > > I'd certainly like to be able to opt out from this on architectures > > that do not implement anything useful in the PM callbacks. > > Well on the other hand PM-callbacks are harmless on those archs, they > won't overload the __weak function. I don't care much for the callbacks. But struct kvm is bloated enough, and I'd be happy not to have this structure embedded in it if I can avoid it. > > > Please consider making this an independent config option that individual > > archs can buy into. > > Sure, I can take a look into this, but how is this better than __weak > function? (that's a real question) Weak functions are OK AFAIC. More crud in struct kvm is what I'm not OK with. > > [..] > > > +#ifdef CONFIG_PM > > > +static int kvm_pm_notifier_call(struct notifier_block *bl, > > > + unsigned long state, > > > + void *unused) > > > +{ > > > + struct kvm *kvm = container_of(bl, struct kvm, pm_notifier); > > > + > > > + switch (state) { > > > + case PM_HIBERNATION_PREPARE: > > > + case PM_SUSPEND_PREPARE: > > > + kvm_arch_pm_notifier(kvm); > > > > How about passing the state to the notifier callback? I'd expect it to > > be useful to do something on resume too. > > For different states we can have different kvm_arch functions instead. > kvm_arch_pm_notifier() can be renamed to kvm_arch_suspend_notifier(), > so that we don't need to have `switch (state)` in every arch-code. Then > for resume/post resume states we can have kvm_arch_resume_notifier() > arch functions. I'd rather we keep an arch API that is similar to the one the rest of the kernel has, instead of a flurry of small helpers that need to grow each time someone adds a new PM state. A switch() in the arch-specific implementation is absolutely fine. > > > > + break; > > > + } > > > + return NOTIFY_DONE; > > > +} > > > + > > > +static void kvm_init_pm_notifier(struct kvm *kvm) > > > +{ > > > + kvm->pm_notifier.notifier_call = kvm_pm_notifier_call; > > > + kvm->pm_notifier.priority = INT_MAX; > > > > How is this priority determined? > > Nothing magical here. I want this to be executed first, before we suspend > ftrace, RCU and the like. Besides KVM is usually the last one to register > its PM callbacks, so there can be something on the notifier list that > returns NOTIFY_STOP_MASK in front of KVM PM-notifier list entry. Which begs the question: should arch-specific code be able to veto suspend and return an error itself? Always returning NOTIFY_DONE seems highly optimistic. M. -- Without deviation from the norm, progress is not possible.