From mboxrd@z Thu Jan 1 00:00:00 1970
From: Peter Gonda
Date: Tue, 5 Apr 2022 13:26:09 -0600
Subject: Re: [PATCH] KVM: SEV: Add cond_resched() to loop in sev_clflush_pages()
To: Sean Christopherson
Cc: Mingwei Zhang, kvm, LKML
References: <20220330164306.2376085-1-pgonda@google.com>
Content-Type: text/plain; charset="UTF-8"
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Mar 31, 2022 at 3:31 PM Sean Christopherson wrote:
>
> On Wed, Mar 30, 2022, Mingwei Zhang wrote:
> > On Wed, Mar 30, 2022 at 9:43 AM Peter Gonda wrote:
> > >
> > > Add resched to avoid warning from sev_clflush_pages() with large number
> > > of pages.
> > >
> > > Signed-off-by: Peter Gonda
> > > Cc: Sean Christopherson
> > > Cc: kvm@vger.kernel.org
> > > Cc: linux-kernel@vger.kernel.org
> > >
> > > ---
> > > Here is a warning similar to what I've seen many times running large SEV
> > > VMs:
> > > [ 357.714051] CPU 15: need_resched set for > 52000222 ns (52 ticks) without schedule
> > > [ 357.721623] WARNING: CPU: 15 PID: 35848 at kernel/sched/core.c:3733 scheduler_tick+0x2f9/0x3f0
> > > [ 357.730222] Modules linked in: kvm_amd uhaul vfat fat hdi2_standard_ftl hdi2_megablocks hdi2_pmc hdi2_pmc_eeprom hdi2 stg elephant_dev_num ccp i2c_mux_ltc4306 i2c_mux i2c_via_ipmi i2c_piix4 google_bmc_usb google_bmc_gpioi2c_mb_common google_bmc_mailbox cdc_acm xhci_pci xhci_hcd sha3_generic gq nv_p2p_glue accel_class
> > > [ 357.758261] CPU: 15 PID: 35848 Comm: switchto-defaul Not tainted 4.15.0-smp-DEV #11
> > > [ 357.765912] Hardware name: Google, Inc. Arcadia_IT_80/Arcadia_IT_80, BIOS 30.20.2-gce 11/05/2021
> > > [ 357.779372] RIP: 0010:scheduler_tick+0x2f9/0x3f0
> > > [ 357.783988] RSP: 0018:ffff98558d1c3dd8 EFLAGS: 00010046
> > > [ 357.789207] RAX: 741f23206aa8dc00 RBX: 0000005349236a42 RCX: 0000000000000007
> > > [ 357.796339] RDX: 0000000000000006 RSI: 0000000000000002 RDI: ffff98558d1d5a98
> > > [ 357.803463] RBP: ffff98558d1c3ea0 R08: 0000000000100ceb R09: 0000000000000000
> > > [ 357.810597] R10: ffff98558c958c00 R11: ffffffff94850740 R12: 00000000031975de
> > > [ 357.817729] R13: 0000000000000000 R14: ffff98558d1e2640 R15: ffff98525739ea40
> > > [ 357.824862] FS: 00007f87503eb700(0000) GS:ffff98558d1c0000(0000) knlGS:0000000000000000
> > > [ 357.832948] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > [ 357.838695] CR2: 00005572fe74b080 CR3: 0000007bea706006 CR4: 0000000000360ef0
> > > [ 357.845828] Call Trace:
> > > [ 357.848277]
> > > [ 357.850294] [] ? tick_setup_sched_timer+0x130/0x130
> > > [ 357.856818] [] ? rcu_sched_clock_irq+0x6ed/0x850
> > > [ 357.863084] [] ? __run_timers+0x42/0x260
> > > [ 357.868654] [] ? tick_setup_sched_timer+0x130/0x130
> > > [ 357.875182] [] update_process_times+0x7b/0x90
> > > [ 357.881188] [] tick_sched_timer+0x82/0xd0
> > > [ 357.886845] [] __run_hrtimer+0x81/0x200
> > > [ 357.892331] [] hrtimer_interrupt+0x192/0x450
> > > [ 357.898252] [] ? __do_softirq+0x2fa/0x33e
> > > [ 357.903911] [] smp_apic_timer_interrupt+0xac/0x1d0
> > > [ 357.910349] [] apic_timer_interrupt+0x86/0x90
> > > [ 357.916347]
> > > [ 357.918452] RIP: 0010:clflush_cache_range+0x3f/0x50
> > > [ 357.923324] RSP: 0018:ffff98529af89cc0 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff12
> > > [ 357.930889] RAX: 0000000000000040 RBX: 0000000000038135 RCX: ffff985233d36000
> > > [ 357.938013] RDX: ffff985233d36000 RSI: 0000000000001000 RDI: ffff985233d35000
> > > [ 357.945145] RBP: ffff98529af89cc0 R08: 0000000000000001 R09: ffffb5753fb23000
> > > [ 357.952271] R10: 000000000003fe00 R11: 0000000000000008 R12: 0000000000040000
> > > [ 357.959401] R13: ffff98525739ea40 R14: ffffb5753fb22000 R15: ffff98532a58dd80
> > > [ 357.966536] [] svm_register_enc_region+0xd1/0x170 [kvm_amd]
> > > [ 357.973758] [] kvm_arch_vm_ioctl+0x84c/0xb00
> > > [ 357.979677] [] ? handle_mm_fault+0x6ff/0x1370
> > > [ 357.985683] [] kvm_vm_ioctl+0x69b/0x720
> > > [ 357.991167] [] do_vfs_ioctl+0x47d/0x680
> > > [ 357.996654] [] SyS_ioctl+0x68/0x90
> > > [ 358.001706] [] do_syscall_64+0x71/0x110
> > > [ 358.007192] [] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
> > >
> > > Tested by running a large 256gib SEV VM several times, saw no warnings.
> > > Without the change warnings are seen.
>
> Clean up the splat (remove timestamps, everything with a ?, etc... I believe there
> is a kernel scripts/ to do this...) and throw it in the changelog. Documenting the
> exact problem is very helpful, e.g. future readers may wonder "what warning?".

Paolo has queued this I think, so I'll do this next time I am fixing a warning.

Thanks Sean.

> >
> > > ---
> > >  arch/x86/kvm/svm/sev.c | 1 +
> > >  1 file changed, 1 insertion(+)
> > >
> > > diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> > > index 75fa6dd268f0..c2fe89ecdb2d 100644
> > > --- a/arch/x86/kvm/svm/sev.c
> > > +++ b/arch/x86/kvm/svm/sev.c
> > > @@ -465,6 +465,7 @@ static void sev_clflush_pages(struct page *pages[], unsigned long npages)
> > >                 page_virtual = kmap_atomic(pages[i]);
> > >                 clflush_cache_range(page_virtual, PAGE_SIZE);
> > >                 kunmap_atomic(page_virtual);
> > > +               cond_resched();
> >
> > If you add cond_resched() here, the frequency (once per 4K) might be
> > too high. You may want to do it once per X pages, where X could be
> > something like 1G/4K?
>
> No, every iteration is perfectly ok. The "cond"itional part means that this will
> reschedule if and only if it actually needs to be rescheduled, e.g. if the task's
> timeslice has expired. The check for a needed reschedule is cheap; using
> cond_resched() in tight-ish loops is ok and intended, e.g. KVM does a resched
> check prior to entering the guest.
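
For illustration only, the point above (a check on every iteration, an actual
reschedule only when one is pending) can be sketched in plain userspace C.
This is not kernel code: need_resched_flag, model_tick(), model_cond_resched()
and process_pages() are hypothetical names invented for this sketch, and the
"tick" is simulated by a counter rather than a real timer.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for kernel state; none of these are kernel APIs. */
static bool need_resched_flag;   /* models the "reschedule pending" flag */
static unsigned long yields;     /* times the loop actually gave up the CPU */

/* Models a timer tick deciding that the task's timeslice has expired. */
static void model_tick(void)
{
	need_resched_flag = true;
}

/*
 * Models the conditional part of cond_resched(): a cheap flag test that
 * yields only when a reschedule is pending. Returns true if it yielded.
 */
static bool model_cond_resched(void)
{
	if (!need_resched_flag)
		return false;        /* common case: a single branch */
	need_resched_flag = false;   /* the simulated scheduler ran */
	yields++;
	return true;
}

/*
 * Flush-like loop that performs the check on every iteration. A tick is
 * simulated once every pages_per_tick pages. Returns the number of yields.
 */
static unsigned long process_pages(unsigned long npages,
				   unsigned long pages_per_tick)
{
	unsigned long i;

	assert(pages_per_tick != 0);
	yields = 0;
	need_resched_flag = false;
	for (i = 0; i < npages; i++) {
		/* per-page work would go here */
		if ((i + 1) % pages_per_tick == 0)
			model_tick();
		model_cond_resched();
	}
	return yields;
}
```

With these assumptions, process_pages(1000000, 250000) performs one million
checks but yields only 4 times, which is the behaviour described above: the
per-iteration cost is the flag test, not a context switch.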