* [PATCH] KVM: SEV: Add cond_resched() to loop in sev_clflush_pages()
@ 2022-03-30 16:43 Peter Gonda
  2022-03-30 16:52 ` Mingwei Zhang
  2022-04-05 12:12 ` Paolo Bonzini
  0 siblings, 2 replies; 11+ messages in thread
From: Peter Gonda @ 2022-03-30 16:43 UTC (permalink / raw)
  To: kvm; +Cc: Peter Gonda, Sean Christopherson, linux-kernel

Add resched to avoid warning from sev_clflush_pages() with large number
of pages.

Signed-off-by: Peter Gonda <pgonda@google.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org

---
Here is a warning similar to what I've seen many times running large SEV
VMs:
[  357.714051] CPU 15: need_resched set for > 52000222 ns (52 ticks) without schedule
[  357.721623] WARNING: CPU: 15 PID: 35848 at kernel/sched/core.c:3733 scheduler_tick+0x2f9/0x3f0
[  357.730222] Modules linked in: kvm_amd uhaul vfat fat hdi2_standard_ftl hdi2_megablocks hdi2_pmc hdi2_pmc_eeprom hdi2 stg elephant_dev_num ccp i2c_mux_ltc4306 i2c_mux i2c_via_ipmi i2c_piix4 google_bmc_usb google_bmc_gpioi2c_mb_common google_bmc_mailbox cdc_acm xhci_pci xhci_hcd sha3_generic gq nv_p2p_glue accel_class
[  357.758261] CPU: 15 PID: 35848 Comm: switchto-defaul Not tainted 4.15.0-smp-DEV #11
[  357.765912] Hardware name: Google, Inc.                                                       Arcadia_IT_80/Arcadia_IT_80, BIOS 30.20.2-gce 11/05/2021
[  357.779372] RIP: 0010:scheduler_tick+0x2f9/0x3f0
[  357.783988] RSP: 0018:ffff98558d1c3dd8 EFLAGS: 00010046
[  357.789207] RAX: 741f23206aa8dc00 RBX: 0000005349236a42 RCX: 0000000000000007
[  357.796339] RDX: 0000000000000006 RSI: 0000000000000002 RDI: ffff98558d1d5a98
[  357.803463] RBP: ffff98558d1c3ea0 R08: 0000000000100ceb R09: 0000000000000000
[  357.810597] R10: ffff98558c958c00 R11: ffffffff94850740 R12: 00000000031975de
[  357.817729] R13: 0000000000000000 R14: ffff98558d1e2640 R15: ffff98525739ea40
[  357.824862] FS:  00007f87503eb700(0000) GS:ffff98558d1c0000(0000) knlGS:0000000000000000
[  357.832948] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  357.838695] CR2: 00005572fe74b080 CR3: 0000007bea706006 CR4: 0000000000360ef0
[  357.845828] Call Trace:
[  357.848277]  <IRQ>
[  357.850294]  [<ffffffff94411420>] ? tick_setup_sched_timer+0x130/0x130
[  357.856818]  [<ffffffff943ed60d>] ? rcu_sched_clock_irq+0x6ed/0x850
[  357.863084]  [<ffffffff943fdf02>] ? __run_timers+0x42/0x260
[  357.868654]  [<ffffffff94411420>] ? tick_setup_sched_timer+0x130/0x130
[  357.875182]  [<ffffffff943fd35b>] update_process_times+0x7b/0x90
[  357.881188]  [<ffffffff944114a2>] tick_sched_timer+0x82/0xd0
[  357.886845]  [<ffffffff94400671>] __run_hrtimer+0x81/0x200
[  357.892331]  [<ffffffff943ff222>] hrtimer_interrupt+0x192/0x450
[  357.898252]  [<ffffffff950002fa>] ? __do_softirq+0x2fa/0x33e
[  357.903911]  [<ffffffff94e02edc>] smp_apic_timer_interrupt+0xac/0x1d0
[  357.910349]  [<ffffffff94e01ef6>] apic_timer_interrupt+0x86/0x90
[  357.916347]  </IRQ>
[  357.918452] RIP: 0010:clflush_cache_range+0x3f/0x50
[  357.923324] RSP: 0018:ffff98529af89cc0 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff12
[  357.930889] RAX: 0000000000000040 RBX: 0000000000038135 RCX: ffff985233d36000
[  357.938013] RDX: ffff985233d36000 RSI: 0000000000001000 RDI: ffff985233d35000
[  357.945145] RBP: ffff98529af89cc0 R08: 0000000000000001 R09: ffffb5753fb23000
[  357.952271] R10: 000000000003fe00 R11: 0000000000000008 R12: 0000000000040000
[  357.959401] R13: ffff98525739ea40 R14: ffffb5753fb22000 R15: ffff98532a58dd80
[  357.966536]  [<ffffffffc07afd41>] svm_register_enc_region+0xd1/0x170 [kvm_amd]
[  357.973758]  [<ffffffff94246e8c>] kvm_arch_vm_ioctl+0x84c/0xb00
[  357.979677]  [<ffffffff9455980f>] ? handle_mm_fault+0x6ff/0x1370
[  357.985683]  [<ffffffff9423412b>] kvm_vm_ioctl+0x69b/0x720
[  357.991167]  [<ffffffff945dfd9d>] do_vfs_ioctl+0x47d/0x680
[  357.996654]  [<ffffffff945e0188>] SyS_ioctl+0x68/0x90
[  358.001706]  [<ffffffff942066f1>] do_syscall_64+0x71/0x110
[  358.007192]  [<ffffffff94e00081>] entry_SYSCALL_64_after_hwframe+0x3d/0xa2

Tested by running a large 256 GiB SEV VM several times; no warnings were seen.
Without the change, warnings are seen.

---
 arch/x86/kvm/svm/sev.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 75fa6dd268f0..c2fe89ecdb2d 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -465,6 +465,7 @@ static void sev_clflush_pages(struct page *pages[], unsigned long npages)
 		page_virtual = kmap_atomic(pages[i]);
 		clflush_cache_range(page_virtual, PAGE_SIZE);
 		kunmap_atomic(page_virtual);
+		cond_resched();
 	}
 }
 
-- 
2.35.1.1094.g7c7d902a7c-goog



* Re: [PATCH] KVM: SEV: Add cond_resched() to loop in sev_clflush_pages()
  2022-03-30 16:43 [PATCH] KVM: SEV: Add cond_resched() to loop in sev_clflush_pages() Peter Gonda
@ 2022-03-30 16:52 ` Mingwei Zhang
  2022-03-31 21:31   ` Sean Christopherson
  2022-04-05 12:12 ` Paolo Bonzini
  1 sibling, 1 reply; 11+ messages in thread
From: Mingwei Zhang @ 2022-03-30 16:52 UTC (permalink / raw)
  To: Peter Gonda; +Cc: kvm, Sean Christopherson, LKML

On Wed, Mar 30, 2022 at 9:43 AM Peter Gonda <pgonda@google.com> wrote:
>
> Add resched to avoid warning from sev_clflush_pages() with large number
> of pages.
>
> Signed-off-by: Peter Gonda <pgonda@google.com>
> Cc: Sean Christopherson <seanjc@google.com>
> Cc: kvm@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
>
> ---
> Here is a warning similar to what I've seen many times running large SEV
> VMs:
> [  357.714051] CPU 15: need_resched set for > 52000222 ns (52 ticks) without schedule
> [  357.721623] WARNING: CPU: 15 PID: 35848 at kernel/sched/core.c:3733 scheduler_tick+0x2f9/0x3f0
> [  357.730222] Modules linked in: kvm_amd uhaul vfat fat hdi2_standard_ftl hdi2_megablocks hdi2_pmc hdi2_pmc_eeprom hdi2 stg elephant_dev_num ccp i2c_mux_ltc4306 i2c_mux i2c_via_ipmi i2c_piix4 google_bmc_usb google_bmc_gpioi2c_mb_common google_bmc_mailbox cdc_acm xhci_pci xhci_hcd sha3_generic gq nv_p2p_glue accel_class
> [  357.758261] CPU: 15 PID: 35848 Comm: switchto-defaul Not tainted 4.15.0-smp-DEV #11
> [  357.765912] Hardware name: Google, Inc.                                                       Arcadia_IT_80/Arcadia_IT_80, BIOS 30.20.2-gce 11/05/2021
> [  357.779372] RIP: 0010:scheduler_tick+0x2f9/0x3f0
> [  357.783988] RSP: 0018:ffff98558d1c3dd8 EFLAGS: 00010046
> [  357.789207] RAX: 741f23206aa8dc00 RBX: 0000005349236a42 RCX: 0000000000000007
> [  357.796339] RDX: 0000000000000006 RSI: 0000000000000002 RDI: ffff98558d1d5a98
> [  357.803463] RBP: ffff98558d1c3ea0 R08: 0000000000100ceb R09: 0000000000000000
> [  357.810597] R10: ffff98558c958c00 R11: ffffffff94850740 R12: 00000000031975de
> [  357.817729] R13: 0000000000000000 R14: ffff98558d1e2640 R15: ffff98525739ea40
> [  357.824862] FS:  00007f87503eb700(0000) GS:ffff98558d1c0000(0000) knlGS:0000000000000000
> [  357.832948] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  357.838695] CR2: 00005572fe74b080 CR3: 0000007bea706006 CR4: 0000000000360ef0
> [  357.845828] Call Trace:
> [  357.848277]  <IRQ>
> [  357.850294]  [<ffffffff94411420>] ? tick_setup_sched_timer+0x130/0x130
> [  357.856818]  [<ffffffff943ed60d>] ? rcu_sched_clock_irq+0x6ed/0x850
> [  357.863084]  [<ffffffff943fdf02>] ? __run_timers+0x42/0x260
> [  357.868654]  [<ffffffff94411420>] ? tick_setup_sched_timer+0x130/0x130
> [  357.875182]  [<ffffffff943fd35b>] update_process_times+0x7b/0x90
> [  357.881188]  [<ffffffff944114a2>] tick_sched_timer+0x82/0xd0
> [  357.886845]  [<ffffffff94400671>] __run_hrtimer+0x81/0x200
> [  357.892331]  [<ffffffff943ff222>] hrtimer_interrupt+0x192/0x450
> [  357.898252]  [<ffffffff950002fa>] ? __do_softirq+0x2fa/0x33e
> [  357.903911]  [<ffffffff94e02edc>] smp_apic_timer_interrupt+0xac/0x1d0
> [  357.910349]  [<ffffffff94e01ef6>] apic_timer_interrupt+0x86/0x90
> [  357.916347]  </IRQ>
> [  357.918452] RIP: 0010:clflush_cache_range+0x3f/0x50
> [  357.923324] RSP: 0018:ffff98529af89cc0 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff12
> [  357.930889] RAX: 0000000000000040 RBX: 0000000000038135 RCX: ffff985233d36000
> [  357.938013] RDX: ffff985233d36000 RSI: 0000000000001000 RDI: ffff985233d35000
> [  357.945145] RBP: ffff98529af89cc0 R08: 0000000000000001 R09: ffffb5753fb23000
> [  357.952271] R10: 000000000003fe00 R11: 0000000000000008 R12: 0000000000040000
> [  357.959401] R13: ffff98525739ea40 R14: ffffb5753fb22000 R15: ffff98532a58dd80
> [  357.966536]  [<ffffffffc07afd41>] svm_register_enc_region+0xd1/0x170 [kvm_amd]
> [  357.973758]  [<ffffffff94246e8c>] kvm_arch_vm_ioctl+0x84c/0xb00
> [  357.979677]  [<ffffffff9455980f>] ? handle_mm_fault+0x6ff/0x1370
> [  357.985683]  [<ffffffff9423412b>] kvm_vm_ioctl+0x69b/0x720
> [  357.991167]  [<ffffffff945dfd9d>] do_vfs_ioctl+0x47d/0x680
> [  357.996654]  [<ffffffff945e0188>] SyS_ioctl+0x68/0x90
> [  358.001706]  [<ffffffff942066f1>] do_syscall_64+0x71/0x110
> [  358.007192]  [<ffffffff94e00081>] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
>
> Tested by running a large 256gib SEV VM several times, saw no warnings.
> Without the change warnings are seen.
>
> ---
>  arch/x86/kvm/svm/sev.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 75fa6dd268f0..c2fe89ecdb2d 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -465,6 +465,7 @@ static void sev_clflush_pages(struct page *pages[], unsigned long npages)
>                 page_virtual = kmap_atomic(pages[i]);
>                 clflush_cache_range(page_virtual, PAGE_SIZE);
>                 kunmap_atomic(page_virtual);
> +               cond_resched();

If you add cond_resched() here, the frequency (once per 4K) might be
too high. You may want to do it once per X pages, where X could be
something like 1G/4K?
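
Just to illustrate what I mean (an untested sketch; the once-per-X check and
the choice of SZ_1G / PAGE_SIZE for X are only examples, not a concrete
proposal):

	/* Untested sketch: only do the resched check once every X pages. */
	if (!((i + 1) % (SZ_1G / PAGE_SIZE)))
		cond_resched();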

>         }
>  }
>
> --
> 2.35.1.1094.g7c7d902a7c-goog
>


* Re: [PATCH] KVM: SEV: Add cond_resched() to loop in sev_clflush_pages()
  2022-03-30 16:52 ` Mingwei Zhang
@ 2022-03-31 21:31   ` Sean Christopherson
  2022-04-05 19:26     ` Peter Gonda
  2022-04-06 17:53     ` Mingwei Zhang
  0 siblings, 2 replies; 11+ messages in thread
From: Sean Christopherson @ 2022-03-31 21:31 UTC (permalink / raw)
  To: Mingwei Zhang; +Cc: Peter Gonda, kvm, LKML

On Wed, Mar 30, 2022, Mingwei Zhang wrote:
> On Wed, Mar 30, 2022 at 9:43 AM Peter Gonda <pgonda@google.com> wrote:
> >
> > Add resched to avoid warning from sev_clflush_pages() with large number
> > of pages.
> >
> > Signed-off-by: Peter Gonda <pgonda@google.com>
> > Cc: Sean Christopherson <seanjc@google.com>
> > Cc: kvm@vger.kernel.org
> > Cc: linux-kernel@vger.kernel.org
> >
> > ---
> > Here is a warning similar to what I've seen many times running large SEV
> > VMs:
> > [  357.714051] CPU 15: need_resched set for > 52000222 ns (52 ticks) without schedule
> > [  357.721623] WARNING: CPU: 15 PID: 35848 at kernel/sched/core.c:3733 scheduler_tick+0x2f9/0x3f0
> > [  357.730222] Modules linked in: kvm_amd uhaul vfat fat hdi2_standard_ftl hdi2_megablocks hdi2_pmc hdi2_pmc_eeprom hdi2 stg elephant_dev_num ccp i2c_mux_ltc4306 i2c_mux i2c_via_ipmi i2c_piix4 google_bmc_usb google_bmc_gpioi2c_mb_common google_bmc_mailbox cdc_acm xhci_pci xhci_hcd sha3_generic gq nv_p2p_glue accel_class
> > [  357.758261] CPU: 15 PID: 35848 Comm: switchto-defaul Not tainted 4.15.0-smp-DEV #11
> > [  357.765912] Hardware name: Google, Inc.                                                       Arcadia_IT_80/Arcadia_IT_80, BIOS 30.20.2-gce 11/05/2021
> > [  357.779372] RIP: 0010:scheduler_tick+0x2f9/0x3f0
> > [  357.783988] RSP: 0018:ffff98558d1c3dd8 EFLAGS: 00010046
> > [  357.789207] RAX: 741f23206aa8dc00 RBX: 0000005349236a42 RCX: 0000000000000007
> > [  357.796339] RDX: 0000000000000006 RSI: 0000000000000002 RDI: ffff98558d1d5a98
> > [  357.803463] RBP: ffff98558d1c3ea0 R08: 0000000000100ceb R09: 0000000000000000
> > [  357.810597] R10: ffff98558c958c00 R11: ffffffff94850740 R12: 00000000031975de
> > [  357.817729] R13: 0000000000000000 R14: ffff98558d1e2640 R15: ffff98525739ea40
> > [  357.824862] FS:  00007f87503eb700(0000) GS:ffff98558d1c0000(0000) knlGS:0000000000000000
> > [  357.832948] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [  357.838695] CR2: 00005572fe74b080 CR3: 0000007bea706006 CR4: 0000000000360ef0
> > [  357.845828] Call Trace:
> > [  357.848277]  <IRQ>
> > [  357.850294]  [<ffffffff94411420>] ? tick_setup_sched_timer+0x130/0x130
> > [  357.856818]  [<ffffffff943ed60d>] ? rcu_sched_clock_irq+0x6ed/0x850
> > [  357.863084]  [<ffffffff943fdf02>] ? __run_timers+0x42/0x260
> > [  357.868654]  [<ffffffff94411420>] ? tick_setup_sched_timer+0x130/0x130
> > [  357.875182]  [<ffffffff943fd35b>] update_process_times+0x7b/0x90
> > [  357.881188]  [<ffffffff944114a2>] tick_sched_timer+0x82/0xd0
> > [  357.886845]  [<ffffffff94400671>] __run_hrtimer+0x81/0x200
> > [  357.892331]  [<ffffffff943ff222>] hrtimer_interrupt+0x192/0x450
> > [  357.898252]  [<ffffffff950002fa>] ? __do_softirq+0x2fa/0x33e
> > [  357.903911]  [<ffffffff94e02edc>] smp_apic_timer_interrupt+0xac/0x1d0
> > [  357.910349]  [<ffffffff94e01ef6>] apic_timer_interrupt+0x86/0x90
> > [  357.916347]  </IRQ>
> > [  357.918452] RIP: 0010:clflush_cache_range+0x3f/0x50
> > [  357.923324] RSP: 0018:ffff98529af89cc0 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff12
> > [  357.930889] RAX: 0000000000000040 RBX: 0000000000038135 RCX: ffff985233d36000
> > [  357.938013] RDX: ffff985233d36000 RSI: 0000000000001000 RDI: ffff985233d35000
> > [  357.945145] RBP: ffff98529af89cc0 R08: 0000000000000001 R09: ffffb5753fb23000
> > [  357.952271] R10: 000000000003fe00 R11: 0000000000000008 R12: 0000000000040000
> > [  357.959401] R13: ffff98525739ea40 R14: ffffb5753fb22000 R15: ffff98532a58dd80
> > [  357.966536]  [<ffffffffc07afd41>] svm_register_enc_region+0xd1/0x170 [kvm_amd]
> > [  357.973758]  [<ffffffff94246e8c>] kvm_arch_vm_ioctl+0x84c/0xb00
> > [  357.979677]  [<ffffffff9455980f>] ? handle_mm_fault+0x6ff/0x1370
> > [  357.985683]  [<ffffffff9423412b>] kvm_vm_ioctl+0x69b/0x720
> > [  357.991167]  [<ffffffff945dfd9d>] do_vfs_ioctl+0x47d/0x680
> > [  357.996654]  [<ffffffff945e0188>] SyS_ioctl+0x68/0x90
> > [  358.001706]  [<ffffffff942066f1>] do_syscall_64+0x71/0x110
> > [  358.007192]  [<ffffffff94e00081>] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
> >
> > Tested by running a large 256gib SEV VM several times, saw no warnings.
> > Without the change warnings are seen.

Clean up the splat (remove timestamps, everything with a ?, etc... I believe there
is a kernel scripts/ to do this...) and throw it in the changelog.  Documenting the
exact problem is very helpful, e.g. future readers may wonder "what warning?".

> > ---
> >  arch/x86/kvm/svm/sev.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> > index 75fa6dd268f0..c2fe89ecdb2d 100644
> > --- a/arch/x86/kvm/svm/sev.c
> > +++ b/arch/x86/kvm/svm/sev.c
> > @@ -465,6 +465,7 @@ static void sev_clflush_pages(struct page *pages[], unsigned long npages)
> >                 page_virtual = kmap_atomic(pages[i]);
> >                 clflush_cache_range(page_virtual, PAGE_SIZE);
> >                 kunmap_atomic(page_virtual);
> > +               cond_resched();
> 
> If you add cond_resched() here, the frequency (once per 4K) might be
> too high. You may want to do it once per X pages, where X could be
> something like 1G/4K?

No, every iteration is perfectly ok.  The "cond"itional part means that this will
reschedule if and only if it actually needs to be rescheduled, e.g. if the task's
timeslice has expired.  The check for a needed reschedule is cheap; using
cond_resched() in tight-ish loops is ok and intended, e.g. KVM does a resched
check prior to entering the guest.
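
For reference, on a kernel without full preemption, cond_resched() boils down
to roughly the following (heavily simplified from memory, not the exact
kernel/sched code):

	/* Simplified sketch; the real implementation differs in detail. */
	if (should_resched(0))	/* cheap check of the need_resched state */
		schedule();	/* only reached when a resched is actually pending */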


* Re: [PATCH] KVM: SEV: Add cond_resched() to loop in sev_clflush_pages()
  2022-03-30 16:43 [PATCH] KVM: SEV: Add cond_resched() to loop in sev_clflush_pages() Peter Gonda
  2022-03-30 16:52 ` Mingwei Zhang
@ 2022-04-05 12:12 ` Paolo Bonzini
  1 sibling, 0 replies; 11+ messages in thread
From: Paolo Bonzini @ 2022-04-05 12:12 UTC (permalink / raw)
  To: Peter Gonda; +Cc: kvm, Sean Christopherson, linux-kernel

Queued, thanks.

Paolo




* Re: [PATCH] KVM: SEV: Add cond_resched() to loop in sev_clflush_pages()
  2022-03-31 21:31   ` Sean Christopherson
@ 2022-04-05 19:26     ` Peter Gonda
  2022-04-06 17:53     ` Mingwei Zhang
  1 sibling, 0 replies; 11+ messages in thread
From: Peter Gonda @ 2022-04-05 19:26 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Mingwei Zhang, kvm, LKML

On Thu, Mar 31, 2022 at 3:31 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Wed, Mar 30, 2022, Mingwei Zhang wrote:
> > On Wed, Mar 30, 2022 at 9:43 AM Peter Gonda <pgonda@google.com> wrote:
> > >
> > > Add resched to avoid warning from sev_clflush_pages() with large number
> > > of pages.
> > >
> > > Signed-off-by: Peter Gonda <pgonda@google.com>
> > > Cc: Sean Christopherson <seanjc@google.com>
> > > Cc: kvm@vger.kernel.org
> > > Cc: linux-kernel@vger.kernel.org
> > >
> > > ---
> > > Here is a warning similar to what I've seen many times running large SEV
> > > VMs:
> > > [  357.714051] CPU 15: need_resched set for > 52000222 ns (52 ticks) without schedule
> > > [  357.721623] WARNING: CPU: 15 PID: 35848 at kernel/sched/core.c:3733 scheduler_tick+0x2f9/0x3f0
> > > [  357.730222] Modules linked in: kvm_amd uhaul vfat fat hdi2_standard_ftl hdi2_megablocks hdi2_pmc hdi2_pmc_eeprom hdi2 stg elephant_dev_num ccp i2c_mux_ltc4306 i2c_mux i2c_via_ipmi i2c_piix4 google_bmc_usb google_bmc_gpioi2c_mb_common google_bmc_mailbox cdc_acm xhci_pci xhci_hcd sha3_generic gq nv_p2p_glue accel_class
> > > [  357.758261] CPU: 15 PID: 35848 Comm: switchto-defaul Not tainted 4.15.0-smp-DEV #11
> > > [  357.765912] Hardware name: Google, Inc.                                                       Arcadia_IT_80/Arcadia_IT_80, BIOS 30.20.2-gce 11/05/2021
> > > [  357.779372] RIP: 0010:scheduler_tick+0x2f9/0x3f0
> > > [  357.783988] RSP: 0018:ffff98558d1c3dd8 EFLAGS: 00010046
> > > [  357.789207] RAX: 741f23206aa8dc00 RBX: 0000005349236a42 RCX: 0000000000000007
> > > [  357.796339] RDX: 0000000000000006 RSI: 0000000000000002 RDI: ffff98558d1d5a98
> > > [  357.803463] RBP: ffff98558d1c3ea0 R08: 0000000000100ceb R09: 0000000000000000
> > > [  357.810597] R10: ffff98558c958c00 R11: ffffffff94850740 R12: 00000000031975de
> > > [  357.817729] R13: 0000000000000000 R14: ffff98558d1e2640 R15: ffff98525739ea40
> > > [  357.824862] FS:  00007f87503eb700(0000) GS:ffff98558d1c0000(0000) knlGS:0000000000000000
> > > [  357.832948] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > [  357.838695] CR2: 00005572fe74b080 CR3: 0000007bea706006 CR4: 0000000000360ef0
> > > [  357.845828] Call Trace:
> > > [  357.848277]  <IRQ>
> > > [  357.850294]  [<ffffffff94411420>] ? tick_setup_sched_timer+0x130/0x130
> > > [  357.856818]  [<ffffffff943ed60d>] ? rcu_sched_clock_irq+0x6ed/0x850
> > > [  357.863084]  [<ffffffff943fdf02>] ? __run_timers+0x42/0x260
> > > [  357.868654]  [<ffffffff94411420>] ? tick_setup_sched_timer+0x130/0x130
> > > [  357.875182]  [<ffffffff943fd35b>] update_process_times+0x7b/0x90
> > > [  357.881188]  [<ffffffff944114a2>] tick_sched_timer+0x82/0xd0
> > > [  357.886845]  [<ffffffff94400671>] __run_hrtimer+0x81/0x200
> > > [  357.892331]  [<ffffffff943ff222>] hrtimer_interrupt+0x192/0x450
> > > [  357.898252]  [<ffffffff950002fa>] ? __do_softirq+0x2fa/0x33e
> > > [  357.903911]  [<ffffffff94e02edc>] smp_apic_timer_interrupt+0xac/0x1d0
> > > [  357.910349]  [<ffffffff94e01ef6>] apic_timer_interrupt+0x86/0x90
> > > [  357.916347]  </IRQ>
> > > [  357.918452] RIP: 0010:clflush_cache_range+0x3f/0x50
> > > [  357.923324] RSP: 0018:ffff98529af89cc0 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff12
> > > [  357.930889] RAX: 0000000000000040 RBX: 0000000000038135 RCX: ffff985233d36000
> > > [  357.938013] RDX: ffff985233d36000 RSI: 0000000000001000 RDI: ffff985233d35000
> > > [  357.945145] RBP: ffff98529af89cc0 R08: 0000000000000001 R09: ffffb5753fb23000
> > > [  357.952271] R10: 000000000003fe00 R11: 0000000000000008 R12: 0000000000040000
> > > [  357.959401] R13: ffff98525739ea40 R14: ffffb5753fb22000 R15: ffff98532a58dd80
> > > [  357.966536]  [<ffffffffc07afd41>] svm_register_enc_region+0xd1/0x170 [kvm_amd]
> > > [  357.973758]  [<ffffffff94246e8c>] kvm_arch_vm_ioctl+0x84c/0xb00
> > > [  357.979677]  [<ffffffff9455980f>] ? handle_mm_fault+0x6ff/0x1370
> > > [  357.985683]  [<ffffffff9423412b>] kvm_vm_ioctl+0x69b/0x720
> > > [  357.991167]  [<ffffffff945dfd9d>] do_vfs_ioctl+0x47d/0x680
> > > [  357.996654]  [<ffffffff945e0188>] SyS_ioctl+0x68/0x90
> > > [  358.001706]  [<ffffffff942066f1>] do_syscall_64+0x71/0x110
> > > [  358.007192]  [<ffffffff94e00081>] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
> > >
> > > Tested by running a large 256gib SEV VM several times, saw no warnings.
> > > Without the change warnings are seen.
>
> Clean up the splat (remove timestamps, everything with a ?, etc... I believe there
> is a kernel scripts/ to do this...) and throw it in the changelog.  Documenting the
> exact problem is very helpful, e.g. future readers may wonder "what warning?".

Paolo has queued this I think, so I'll do this next time I am fixing a
warning.  Thanks Sean.

>
>
> > > ---
> > >  arch/x86/kvm/svm/sev.c | 1 +
> > >  1 file changed, 1 insertion(+)
> > >
> > > diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> > > index 75fa6dd268f0..c2fe89ecdb2d 100644
> > > --- a/arch/x86/kvm/svm/sev.c
> > > +++ b/arch/x86/kvm/svm/sev.c
> > > @@ -465,6 +465,7 @@ static void sev_clflush_pages(struct page *pages[], unsigned long npages)
> > >                 page_virtual = kmap_atomic(pages[i]);
> > >                 clflush_cache_range(page_virtual, PAGE_SIZE);
> > >                 kunmap_atomic(page_virtual);
> > > +               cond_resched();
> >
> > If you add cond_resched() here, the frequency (once per 4K) might be
> > too high. You may want to do it once per X pages, where X could be
> > something like 1G/4K?
>
> No, every iteration is perfectly ok.  The "cond"itional part means that this will
> reschedule if and only if it actually needs to be rescheduled, e.g. if the task's
> timeslice as expired.  The check for a needed reschedule is cheap, using
> cond_resched() in tight-ish loops is ok and intended, e.g. KVM does a reched
> check prior to enterring the guest.


* Re: [PATCH] KVM: SEV: Add cond_resched() to loop in sev_clflush_pages()
  2022-03-31 21:31   ` Sean Christopherson
  2022-04-05 19:26     ` Peter Gonda
@ 2022-04-06 17:53     ` Mingwei Zhang
  2022-04-06 18:26       ` Sean Christopherson
  1 sibling, 1 reply; 11+ messages in thread
From: Mingwei Zhang @ 2022-04-06 17:53 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Peter Gonda, kvm, LKML

Hi Sean,

> > > diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> > > index 75fa6dd268f0..c2fe89ecdb2d 100644
> > > --- a/arch/x86/kvm/svm/sev.c
> > > +++ b/arch/x86/kvm/svm/sev.c
> > > @@ -465,6 +465,7 @@ static void sev_clflush_pages(struct page *pages[], unsigned long npages)
> > >                 page_virtual = kmap_atomic(pages[i]);
> > >                 clflush_cache_range(page_virtual, PAGE_SIZE);
> > >                 kunmap_atomic(page_virtual);
> > > +               cond_resched();
> >
> > If you add cond_resched() here, the frequency (once per 4K) might be
> > too high. You may want to do it once per X pages, where X could be
> > something like 1G/4K?
>
> No, every iteration is perfectly ok.  The "cond"itional part means that this will
> reschedule if and only if it actually needs to be rescheduled, e.g. if the task's
> timeslice as expired.  The check for a needed reschedule is cheap, using
> cond_resched() in tight-ish loops is ok and intended, e.g. KVM does a reched
> check prior to enterring the guest.

Double-checking the code again: I think the point is not about the flag
check; obviously branch prediction could really help there. The point, I
think, is the 'call' to cond_resched(). Depending on the kernel
configuration, cond_resched() may not always be inlined, at least that is
my understanding so far. If that is true, then it still might not always
be best to call cond_resched() that often.


* Re: [PATCH] KVM: SEV: Add cond_resched() to loop in sev_clflush_pages()
  2022-04-06 17:53     ` Mingwei Zhang
@ 2022-04-06 18:26       ` Sean Christopherson
  2022-04-06 18:34         ` Peter Gonda
  0 siblings, 1 reply; 11+ messages in thread
From: Sean Christopherson @ 2022-04-06 18:26 UTC (permalink / raw)
  To: Mingwei Zhang; +Cc: Peter Gonda, kvm, LKML

On Wed, Apr 06, 2022, Mingwei Zhang wrote:
> Hi Sean,
> 
> > > > diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> > > > index 75fa6dd268f0..c2fe89ecdb2d 100644
> > > > --- a/arch/x86/kvm/svm/sev.c
> > > > +++ b/arch/x86/kvm/svm/sev.c
> > > > @@ -465,6 +465,7 @@ static void sev_clflush_pages(struct page *pages[], unsigned long npages)
> > > >                 page_virtual = kmap_atomic(pages[i]);
> > > >                 clflush_cache_range(page_virtual, PAGE_SIZE);
> > > >                 kunmap_atomic(page_virtual);
> > > > +               cond_resched();
> > >
> > > If you add cond_resched() here, the frequency (once per 4K) might be
> > > too high. You may want to do it once per X pages, where X could be
> > > something like 1G/4K?
> >
> > No, every iteration is perfectly ok.  The "cond"itional part means that this will
> > reschedule if and only if it actually needs to be rescheduled, e.g. if the task's
> > timeslice as expired.  The check for a needed reschedule is cheap, using
> > cond_resched() in tight-ish loops is ok and intended, e.g. KVM does a reched
> > check prior to enterring the guest.
> 
> Double check on the code again. I think the point is not about flag
> checking. Obviously branch prediction could really help. The point I
> think is the 'call' to cond_resched(). Depending on the kernel
> configuration, cond_resched() may not always be inlined, at least this
> is my understanding so far? So if that is true, then it still might
> not always be the best to call cond_resched() that often.

Eh, compared to the cost of 64 back-to-back CLFLUSHOPTs, the cost of __cond_resched()
is peanuts.  Even accounting for the rcu_all_qs() work, it's still dwarfed by the
cost of flushing data from the cache.  E.g. based on Agner Fog's wonderful uop
latencies[*], the actual flush time for a single page is going to be upwards of
10k cycles, whereas __cond_resched() is going to be well under 100 cycles in the
happy case of no work.  Even if those throughput numbers are off by an order of
magnitude, e.g. CLFLUSHOPT can complete in 15 cycles, that's still ~1k cycles.

Peter, don't we also theoretically need cond_resched() in the loops in
sev_launch_update_data()?  AFAICT, there's no artificial restriction on the size
of the payload, i.e. the kernel is effectively relying on userspace to not update
large swaths of memory.
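
Something like the below is what I have in mind (untested sketch; the loop
header is approximate and the existing per-iteration work is elided):

	/* Untested sketch of sev_launch_update_data()'s main loop. */
	for (i = 0; vaddr < vaddr_end; vaddr = next_vaddr, i += pages) {
		/* ... existing SEV_CMD_LAUNCH_UPDATE_DATA work on inpages[i] ... */
		cond_resched();
	}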

[*] https://www.agner.org/optimize/instruction_tables.pdf


* Re: [PATCH] KVM: SEV: Add cond_resched() to loop in sev_clflush_pages()
  2022-04-06 18:26       ` Sean Christopherson
@ 2022-04-06 18:34         ` Peter Gonda
  2022-04-18 15:48           ` Sean Christopherson
  0 siblings, 1 reply; 11+ messages in thread
From: Peter Gonda @ 2022-04-06 18:34 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Mingwei Zhang, kvm, LKML

On Wed, Apr 6, 2022 at 12:26 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Wed, Apr 06, 2022, Mingwei Zhang wrote:
> > Hi Sean,
> >
> > > > > diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> > > > > index 75fa6dd268f0..c2fe89ecdb2d 100644
> > > > > --- a/arch/x86/kvm/svm/sev.c
> > > > > +++ b/arch/x86/kvm/svm/sev.c
> > > > > @@ -465,6 +465,7 @@ static void sev_clflush_pages(struct page *pages[], unsigned long npages)
> > > > >                 page_virtual = kmap_atomic(pages[i]);
> > > > >                 clflush_cache_range(page_virtual, PAGE_SIZE);
> > > > >                 kunmap_atomic(page_virtual);
> > > > > +               cond_resched();
> > > >
> > > > If you add cond_resched() here, the frequency (once per 4K) might be
> > > > too high. You may want to do it once per X pages, where X could be
> > > > something like 1G/4K?
> > >
> > > No, every iteration is perfectly ok.  The "cond"itional part means that this will
> > > reschedule if and only if it actually needs to be rescheduled, e.g. if the task's
> > > timeslice as expired.  The check for a needed reschedule is cheap, using
> > > cond_resched() in tight-ish loops is ok and intended, e.g. KVM does a reched
> > > check prior to enterring the guest.
> >
> > Double check on the code again. I think the point is not about flag
> > checking. Obviously branch prediction could really help. The point I
> > think is the 'call' to cond_resched(). Depending on the kernel
> > configuration, cond_resched() may not always be inlined, at least this
> > is my understanding so far? So if that is true, then it still might
> > not always be the best to call cond_resched() that often.
>
> Eh, compared to the cost of 64 back-to-back CLFLUSHOPTs, the cost of __cond_resched()
> is peanuts.  Even accounting for the rcu_all_qs() work, it's still dwarfed by the
> cost of flushing data from the cache.  E.g. based on Agner Fog's wonderful uop
> latencies[*], the actual flush time for a single page is going to be upwards of
> 10k cycles, whereas __cond_resched() is going to well under 100 cycles in the happy
> case of no work.  Even if those throughput numbers are off by an order of magnitude,
> e.g. CLFLUSHOPT can complete in 15 cycles, that's still ~1k cycles.
>
> Peter, don't we also theoretically need cond_resched() in the loops in
> sev_launch_update_data()?  AFAICT, there's no articifical restriction on the size
> of the payload, i.e. the kernel is effectively relying on userspace to not update
> large swaths of memory.

Yeah, we probably do want a cond_resched() in the for loop inside of
sev_launch_update_data(). I think in sev_dbg_crypt() userspace could
request a large number of pages to be decrypted/encrypted for
debugging, but we have a call to sev_pin_memory() in the loop, so that
will have a cond_resched() inside of __get_user_pages(). Or should we
have a cond_resched() inside of the loop in sev_dbg_crypt() too?

>
> [*] https://www.agner.org/optimize/instruction_tables.pdf


* Re: [PATCH] KVM: SEV: Add cond_resched() to loop in sev_clflush_pages()
  2022-04-06 18:34         ` Peter Gonda
@ 2022-04-18 15:48           ` Sean Christopherson
  2022-04-21 14:35             ` Peter Gonda
  0 siblings, 1 reply; 11+ messages in thread
From: Sean Christopherson @ 2022-04-18 15:48 UTC (permalink / raw)
  To: Peter Gonda; +Cc: Mingwei Zhang, kvm, LKML

On Wed, Apr 06, 2022, Peter Gonda wrote:
> On Wed, Apr 6, 2022 at 12:26 PM Sean Christopherson <seanjc@google.com> wrote:
> >
> > On Wed, Apr 06, 2022, Mingwei Zhang wrote:
> > > Hi Sean,
> > >
> > > > > > diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> > > > > > index 75fa6dd268f0..c2fe89ecdb2d 100644
> > > > > > --- a/arch/x86/kvm/svm/sev.c
> > > > > > +++ b/arch/x86/kvm/svm/sev.c
> > > > > > @@ -465,6 +465,7 @@ static void sev_clflush_pages(struct page *pages[], unsigned long npages)
> > > > > >                 page_virtual = kmap_atomic(pages[i]);
> > > > > >                 clflush_cache_range(page_virtual, PAGE_SIZE);
> > > > > >                 kunmap_atomic(page_virtual);
> > > > > > +               cond_resched();
> > > > >
> > > > > If you add cond_resched() here, the frequency (once per 4K) might be
> > > > > too high. You may want to do it once per X pages, where X could be
> > > > > something like 1G/4K?
> > > >
> > > > No, every iteration is perfectly ok.  The "cond"itional part means that this will
> > > > reschedule if and only if it actually needs to be rescheduled, e.g. if the task's
> > > > timeslice as expired.  The check for a needed reschedule is cheap, using
> > > > cond_resched() in tight-ish loops is ok and intended, e.g. KVM does a reched
> > > > check prior to enterring the guest.
> > >
> > > Double check on the code again. I think the point is not about flag
> > > checking. Obviously branch prediction could really help. The point I
> > > think is the 'call' to cond_resched(). Depending on the kernel
> > > configuration, cond_resched() may not always be inlined, at least this
> > > is my understanding so far? So if that is true, then it still might
> > > not always be the best to call cond_resched() that often.
> >
> > Eh, compared to the cost of 64 back-to-back CLFLUSHOPTs, the cost of __cond_resched()
> > is peanuts.  Even accounting for the rcu_all_qs() work, it's still dwarfed by the
> > cost of flushing data from the cache.  E.g. based on Agner Fog's wonderful uop
> > latencies[*], the actual flush time for a single page is going to be upwards of
> > 10k cycles, whereas __cond_resched() is going to well under 100 cycles in the happy
> > case of no work.  Even if those throughput numbers are off by an order of magnitude,
> > e.g. CLFLUSHOPT can complete in 15 cycles, that's still ~1k cycles.
> >
> > Peter, don't we also theoretically need cond_resched() in the loops in
> > sev_launch_update_data()?  AFAICT, there's no articifical restriction on the size
> > of the payload, i.e. the kernel is effectively relying on userspace to not update
> > large swaths of memory.
> 
> Yea we probably do want to cond_resched() in the for loop inside of
> sev_launch_update_data(). Ithink in  sev_dbg_crypt() userspace could
> request a large number of pages to be decrypted/encrypted for
> debugging but se have a call to sev_pin_memory() in the loop so that
> will have a cond_resded() inside of __get_users_pages(). Or should we
> have a cond_resded() inside of the loop in sev_dbg_crypt() too?

I believe sev_dbg_crypt() needs a cond_resched() of its own, sev_pin_memory()
isn't guaranteed to get into the slow path of internal_get_user_pages_fast().
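
E.g. something along these lines (untested sketch; the loop header is
approximate and the per-page pin/encrypt-decrypt/unpin work is elided):

	/* Untested sketch of sev_dbg_crypt()'s main loop. */
	for (; vaddr < vaddr_end; vaddr = next_vaddr) {
		/* ... pin pages, issue SEV_CMD_DBG_{DECRYPT,ENCRYPT}, unpin ... */
		cond_resched();
	}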


* Re: [PATCH] KVM: SEV: Add cond_resched() to loop in sev_clflush_pages()
  2022-04-18 15:48           ` Sean Christopherson
@ 2022-04-21 14:35             ` Peter Gonda
  2022-04-21 14:46               ` Sean Christopherson
  0 siblings, 1 reply; 11+ messages in thread
From: Peter Gonda @ 2022-04-21 14:35 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Mingwei Zhang, kvm, LKML

On Mon, Apr 18, 2022 at 9:48 AM Sean Christopherson <seanjc@google.com> wrote:
>
> On Wed, Apr 06, 2022, Peter Gonda wrote:
> > On Wed, Apr 6, 2022 at 12:26 PM Sean Christopherson <seanjc@google.com> wrote:
> > >
> > > On Wed, Apr 06, 2022, Mingwei Zhang wrote:
> > > > Hi Sean,
> > > >
> > > > > > > diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> > > > > > > index 75fa6dd268f0..c2fe89ecdb2d 100644
> > > > > > > --- a/arch/x86/kvm/svm/sev.c
> > > > > > > +++ b/arch/x86/kvm/svm/sev.c
> > > > > > > @@ -465,6 +465,7 @@ static void sev_clflush_pages(struct page *pages[], unsigned long npages)
> > > > > > >                 page_virtual = kmap_atomic(pages[i]);
> > > > > > >                 clflush_cache_range(page_virtual, PAGE_SIZE);
> > > > > > >                 kunmap_atomic(page_virtual);
> > > > > > > +               cond_resched();
> > > > > >
> > > > > > If you add cond_resched() here, the frequency (once per 4K) might be
> > > > > > too high. You may want to do it once per X pages, where X could be
> > > > > > something like 1G/4K?
> > > > >
> > > > > No, every iteration is perfectly ok.  The "cond"itional part means that this will
> > > > > reschedule if and only if it actually needs to be rescheduled, e.g. if the task's
> > > > > timeslice as expired.  The check for a needed reschedule is cheap, using
> > > > > cond_resched() in tight-ish loops is ok and intended, e.g. KVM does a reched
> > > > > check prior to enterring the guest.
> > > >
> > > > Double check on the code again. I think the point is not about flag
> > > > checking. Obviously branch prediction could really help. The point I
> > > > think is the 'call' to cond_resched(). Depending on the kernel
> > > > configuration, cond_resched() may not always be inlined, at least this
> > > > is my understanding so far? So if that is true, then it still might
> > > > not always be the best to call cond_resched() that often.
> > >
> > > Eh, compared to the cost of 64 back-to-back CLFLUSHOPTs, the cost of __cond_resched()
> > > is peanuts.  Even accounting for the rcu_all_qs() work, it's still dwarfed by the
> > > cost of flushing data from the cache.  E.g. based on Agner Fog's wonderful uop
> > > latencies[*], the actual flush time for a single page is going to be upwards of
> > > 10k cycles, whereas __cond_resched() is going to well under 100 cycles in the happy
> > > case of no work.  Even if those throughput numbers are off by an order of magnitude,
> > > e.g. CLFLUSHOPT can complete in 15 cycles, that's still ~1k cycles.
> > >
> > > Peter, don't we also theoretically need cond_resched() in the loops in
> > > sev_launch_update_data()?  AFAICT, there's no articifical restriction on the size
> > > of the payload, i.e. the kernel is effectively relying on userspace to not update
> > > large swaths of memory.
> >
> > Yea we probably do want to cond_resched() in the for loop inside of
> > sev_launch_update_data(). Ithink in  sev_dbg_crypt() userspace could
> > request a large number of pages to be decrypted/encrypted for
> > debugging but se have a call to sev_pin_memory() in the loop so that
> > will have a cond_resded() inside of __get_users_pages(). Or should we
> > have a cond_resded() inside of the loop in sev_dbg_crypt() too?
>
> I believe sev_dbg_crypt() needs a cond_resched() of its own, sev_pin_memory()
> isn't guaranteed to get into the slow path of internal_get_user_pages_fast().

Ah, understood, thanks. I'll send out patches for those two paths. I
personally haven't seen any warning logs from them, though.


* Re: [PATCH] KVM: SEV: Add cond_resched() to loop in sev_clflush_pages()
  2022-04-21 14:35             ` Peter Gonda
@ 2022-04-21 14:46               ` Sean Christopherson
  0 siblings, 0 replies; 11+ messages in thread
From: Sean Christopherson @ 2022-04-21 14:46 UTC (permalink / raw)
  To: Peter Gonda; +Cc: Mingwei Zhang, kvm, LKML

On Thu, Apr 21, 2022, Peter Gonda wrote:
> On Mon, Apr 18, 2022 at 9:48 AM Sean Christopherson <seanjc@google.com> wrote:
> >
> > On Wed, Apr 06, 2022, Peter Gonda wrote:
> > > On Wed, Apr 6, 2022 at 12:26 PM Sean Christopherson <seanjc@google.com> wrote:
> > > >
> > > > On Wed, Apr 06, 2022, Mingwei Zhang wrote:
> > > > > Hi Sean,
> > > > >
> > > > > > > > diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> > > > > > > > index 75fa6dd268f0..c2fe89ecdb2d 100644
> > > > > > > > --- a/arch/x86/kvm/svm/sev.c
> > > > > > > > +++ b/arch/x86/kvm/svm/sev.c
> > > > > > > > @@ -465,6 +465,7 @@ static void sev_clflush_pages(struct page *pages[], unsigned long npages)
> > > > > > > >                 page_virtual = kmap_atomic(pages[i]);
> > > > > > > >                 clflush_cache_range(page_virtual, PAGE_SIZE);
> > > > > > > >                 kunmap_atomic(page_virtual);
> > > > > > > > +               cond_resched();
> > > > > > >
> > > > > > > If you add cond_resched() here, the frequency (once per 4K) might be
> > > > > > > too high. You may want to do it once per X pages, where X could be
> > > > > > > something like 1G/4K?
> > > > > >
> > > > > > No, every iteration is perfectly ok.  The "cond"itional part means that this will
> > > > > > reschedule if and only if it actually needs to be rescheduled, e.g. if the task's
> > > > > > timeslice as expired.  The check for a needed reschedule is cheap, using
> > > > > > cond_resched() in tight-ish loops is ok and intended, e.g. KVM does a reched
> > > > > > check prior to enterring the guest.
> > > > >
> > > > > Double check on the code again. I think the point is not about flag
> > > > > checking. Obviously branch prediction could really help. The point I
> > > > > think is the 'call' to cond_resched(). Depending on the kernel
> > > > > configuration, cond_resched() may not always be inlined, at least this
> > > > > is my understanding so far? So if that is true, then it still might
> > > > > not always be the best to call cond_resched() that often.
> > > >
> > > > Eh, compared to the cost of 64 back-to-back CLFLUSHOPTs, the cost of __cond_resched()
> > > > is peanuts.  Even accounting for the rcu_all_qs() work, it's still dwarfed by the
> > > > cost of flushing data from the cache.  E.g. based on Agner Fog's wonderful uop
> > > > latencies[*], the actual flush time for a single page is going to be upwards of
> > > > 10k cycles, whereas __cond_resched() is going to well under 100 cycles in the happy
> > > > case of no work.  Even if those throughput numbers are off by an order of magnitude,
> > > > e.g. CLFLUSHOPT can complete in 15 cycles, that's still ~1k cycles.
> > > >
> > > > Peter, don't we also theoretically need cond_resched() in the loops in
> > > > sev_launch_update_data()?  AFAICT, there's no articifical restriction on the size
> > > > of the payload, i.e. the kernel is effectively relying on userspace to not update
> > > > large swaths of memory.
> > >
> > > Yea we probably do want to cond_resched() in the for loop inside of
> > > sev_launch_update_data(). Ithink in  sev_dbg_crypt() userspace could
> > > request a large number of pages to be decrypted/encrypted for
> > > debugging but se have a call to sev_pin_memory() in the loop so that
> > > will have a cond_resded() inside of __get_users_pages(). Or should we
> > > have a cond_resded() inside of the loop in sev_dbg_crypt() too?
> >
> > I believe sev_dbg_crypt() needs a cond_resched() of its own, sev_pin_memory()
> > isn't guaranteed to get into the slow path of internal_get_user_pages_fast().
> 
> Ah, understood thanks. I'll send out patches for those two paths. I
> personally haven't seen any warning logs from them though.

Do you have test cases that deliberately attempt to decrypt+read pages upon pages
of guest memory at a time?  Unless someone has wired up a VMM to do a full dump
of guest memory, I highly doubt a "real" VMM will do more than read a handful of
bytes at a time.
