* BUG: credit=sched2 machine hang when using DRAKVUF
@ 2020-10-22 22:59 Michał Leszczyński
  2020-10-23  4:47 ` Jürgen Groß
  0 siblings, 1 reply; 7+ messages in thread
From: Michał Leszczyński @ 2020-10-22 22:59 UTC (permalink / raw)
  To: xen-devel

Hello,

When using DRAKVUF against a Windows 7 x64 DomU, the whole machine hangs after a few minutes.

The chance of a hang seems to be correlated with the number of PCPUs: on this machine with 14 PCPUs the hang is very easily reproducible, while on other machines with 2-4 PCPUs it is very rare (but still occurs sometimes). The issue is observed with the default sched=credit2 and is no longer reproducible once sched=credit is set.


Enclosed: panic log from my Dom0.

Best regards,
Michał Leszczyński
CERT Polska


paź 22 12:20:50 hostname kernel: rcu: INFO: rcu_sched self-detected stall on CPU
paź 22 12:20:50 hostname kernel: rcu:         3-....: (21002 ticks this GP) idle=7e2/1/0x4000000000000002 softirq=61729/61729 fqs=10490
paź 22 12:20:50 hostname kernel: rcu:          (t=21003 jiffies g=36437 q=9406)
paź 22 12:20:50 hostname kernel: NMI backtrace for cpu 3
paź 22 12:20:50 hostname kernel: CPU: 3 PID: 4153 Comm: drakvuf Tainted: P           OEL    4.19.0-6-amd64 #1 Debian 4.19.67-2+deb10u2
paź 22 12:20:50 hostname kernel: Hardware name: Dell Inc. PowerEdge R640/08HT8T, BIOS 2.1.8 04/30/2019
paź 22 12:20:50 hostname kernel: Call Trace:
paź 22 12:20:50 hostname kernel:  <IRQ>
paź 22 12:20:50 hostname kernel:  dump_stack+0x5c/0x80
paź 22 12:20:50 hostname kernel:  nmi_cpu_backtrace.cold.4+0x13/0x50
paź 22 12:20:50 hostname kernel:  ? lapic_can_unplug_cpu.cold.29+0x3b/0x3b
paź 22 12:20:50 hostname kernel:  nmi_trigger_cpumask_backtrace+0xf9/0xfb
paź 22 12:20:50 hostname kernel:  rcu_dump_cpu_stacks+0x9b/0xcb
paź 22 12:20:50 hostname kernel:  rcu_check_callbacks.cold.81+0x1db/0x335
paź 22 12:20:50 hostname kernel:  ? tick_sched_do_timer+0x60/0x60
paź 22 12:20:50 hostname kernel:  update_process_times+0x28/0x60
paź 22 12:20:50 hostname kernel:  tick_sched_handle+0x22/0x60
paź 22 12:20:50 hostname kernel:  tick_sched_timer+0x37/0x70
paź 22 12:20:50 hostname kernel:  __hrtimer_run_queues+0x100/0x280
paź 22 12:20:50 hostname kernel:  hrtimer_interrupt+0x100/0x220
paź 22 12:20:50 hostname kernel:  xen_timer_interrupt+0x1e/0x30
paź 22 12:20:50 hostname kernel:  __handle_irq_event_percpu+0x46/0x190
paź 22 12:20:50 hostname kernel:  handle_irq_event_percpu+0x30/0x80
paź 22 12:20:50 hostname kernel:  handle_percpu_irq+0x40/0x60
paź 22 12:20:50 hostname kernel:  generic_handle_irq+0x27/0x30
paź 22 12:20:50 hostname kernel:  __evtchn_fifo_handle_events+0x17d/0x190
paź 22 12:20:50 hostname kernel:  __xen_evtchn_do_upcall+0x42/0x80
paź 22 12:20:50 hostname kernel:  xen_evtchn_do_upcall+0x27/0x40
paź 22 12:20:50 hostname kernel:  xen_do_hypervisor_callback+0x29/0x40
paź 22 12:20:50 hostname kernel:  </IRQ>
paź 22 12:20:50 hostname kernel: RIP: e030:smp_call_function_single+0xce/0xf0
paź 22 12:20:50 hostname kernel: Code: 8b 4c 24 38 65 48 33 0c 25 28 00 00 00 75 34 c9 c3 48 89 d1 48 89 f2 48 89 e6 e8 6d fe ff ff 8b 54 24 18 83 e2 01 74 0b f3 90 <8b> 54 24 18 83 e2 01 75 f5 eb ca 8b 05 b9 99 4d 01 85 c0 75 88 0f
paź 22 12:20:50 hostname kernel: RSP: e02b:ffffc9004713bd00 EFLAGS: 00000202
paź 22 12:20:50 hostname kernel: RAX: 0000000000000000 RBX: ffff888b0b6eea40 RCX: 0000000000000200
paź 22 12:20:50 hostname kernel: RDX: 0000000000000001 RSI: ffffffff8212e4a0 RDI: ffffffff81c2dec0
paź 22 12:20:50 hostname kernel: RBP: ffffc9004713bd50 R08: 0000000000000000 R09: ffff888c54052480
paź 22 12:20:50 hostname kernel: R10: ffff888c540524a8 R11: 0000000000000000 R12: ffffc9004713bd60
paź 22 12:20:50 hostname kernel: R13: 0000000080000000 R14: ffffffff80000000 R15: ffff888b0b6eeab0
paź 22 12:20:50 hostname kernel:  ? xen_pgd_alloc+0x110/0x110
paź 22 12:20:50 hostname kernel:  xen_exit_mmap+0xaa/0x100
paź 22 12:20:50 hostname kernel:  exit_mmap+0x64/0x180
paź 22 12:20:50 hostname kernel:  ? __raw_spin_unlock+0x5/0x10
paź 22 12:20:50 hostname kernel:  ? __handle_mm_fault+0x1090/0x1270
paź 22 12:20:50 hostname kernel:  ? _raw_spin_unlock_irqrestore+0x14/0x20
paź 22 12:20:50 hostname kernel:  ? exit_robust_list+0x5b/0x130
paź 22 12:20:50 hostname kernel:  mmput+0x54/0x130
paź 22 12:20:50 hostname kernel:  do_exit+0x290/0xb90
paź 22 12:20:50 hostname kernel:  ? handle_mm_fault+0xd6/0x200
paź 22 12:20:50 hostname kernel:  do_group_exit+0x3a/0xa0
paź 22 12:20:50 hostname kernel:  __x64_sys_exit_group+0x14/0x20
paź 22 12:20:50 hostname kernel:  do_syscall_64+0x53/0x110
paź 22 12:20:50 hostname kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xa9
paź 22 12:20:50 hostname kernel: RIP: 0033:0x7f98d23ec9d6
paź 22 12:20:50 hostname kernel: Code: Bad RIP value.
paź 22 12:20:50 hostname kernel: RSP: 002b:00007ffc4a0327f8 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
paź 22 12:20:50 hostname kernel: RAX: ffffffffffffffda RBX: 00007f98d24dd760 RCX: 00007f98d23ec9d6
paź 22 12:20:50 hostname kernel: RDX: 0000000000000000 RSI: 000000000000003c RDI: 0000000000000000
paź 22 12:20:50 hostname kernel: RBP: 0000000000000000 R08: 00000000000000e7 R09: ffffffffffffff60
paź 22 12:20:50 hostname kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 00007f98d24dd760
paź 22 12:20:50 hostname kernel: R13: 000000000000005a R14: 00007f98d24e6428 R15: 0000000000000000



* Re: BUG: credit=sched2 machine hang when using DRAKVUF
  2020-10-22 22:59 BUG: credit=sched2 machine hang when using DRAKVUF Michał Leszczyński
@ 2020-10-23  4:47 ` Jürgen Groß
  2020-10-23 23:36   ` Michał Leszczyński
  2020-10-28  2:04   ` Michał Leszczyński
  0 siblings, 2 replies; 7+ messages in thread
From: Jürgen Groß @ 2020-10-23  4:47 UTC (permalink / raw)
  To: Michał Leszczyński, xen-devel

On 23.10.20 00:59, Michał Leszczyński wrote:
> Hello,
> 
> when using DRAKVUF against a Windows 7 x64 DomU, the whole machine hangs after a few minutes.
> 
> The chance for a hang seems to be correlated with number of PCPUs, in this case we have 14 PCPUs and hang is very easily reproducible, while on other machines with 2-4 PCPUs it's very rare (but still occurring sometimes). The issue is observed with the default sched=credit2 and is no longer reproducible once sched=credit is set.

Interesting. Can you please share some more information?

Which Xen version are you using?

Is there any additional information in the dom0 log which could be
related to the hang (earlier WARN() splats, Oopses, Xen-related
messages, hardware failure messages, ...)?

Can you please try to get backtraces of all cpus at the time of the
hang?

It would help to know which CPU was the target of the call to
smp_call_function_single(), so a disassembly of that function would
be needed to recover that information from the dumped registers.
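
For example (just a sketch, assuming sysrq is enabled in dom0, the xl
toolstack is available, and a debug vmlinux matching the running
kernel is installed; adjust the path as needed):

  # per-CPU backtraces of the dom0 kernel (sysrq 'l')
  echo l > /proc/sysrq-trigger && dmesg | tail -n 300

  # Xen-side register/stack dump of all physical CPUs ('d' debug key)
  xl debug-keys d && xl dmesg | tail -n 400

  # disassemble smp_call_function_single() to map the dumped registers
  # back to the target cpu argument
  gdb -batch -ex 'disassemble smp_call_function_single' \
      /usr/lib/debug/boot/vmlinux-4.19.0-6-amd64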

I'm asking because I've seen a similar problem recently, and I was
suspecting a FIFO event channel issue rather than the Xen scheduler,
but your data suggests it could be the scheduler after all (if it is
the same issue, of course).


Juergen



* Re: BUG: credit=sched2 machine hang when using DRAKVUF
  2020-10-23  4:47 ` Jürgen Groß
@ 2020-10-23 23:36   ` Michał Leszczyński
  2020-10-28  2:04   ` Michał Leszczyński
  1 sibling, 0 replies; 7+ messages in thread
From: Michał Leszczyński @ 2020-10-23 23:36 UTC (permalink / raw)
  To: Jürgen Groß; +Cc: xen-devel

----- On Oct 23, 2020, at 6:47, Jürgen Groß jgross@suse.com wrote:

> On 23.10.20 00:59, Michał Leszczyński wrote:
>> Hello,
>> 
>> when using DRAKVUF against a Windows 7 x64 DomU, the whole machine hangs after a
>> few minutes.
>> 
>> The chance for a hang seems to be correlated with number of PCPUs, in this case
>> we have 14 PCPUs and hang is very easily reproducible, while on other machines
>> with 2-4 PCPUs it's very rare (but still occurring sometimes). The issue is
>> observed with the default sched=credit2 and is no longer reproducible once
>> sched=credit is set.
> 
> Interesting. Can you please share some more information?
> 
> Which Xen version are you using?

RELEASE-4.14

> 
> Is there any additional information in the dom0 log which could be
> related to the hang (earlier WARN() splats, Oopses, Xen related
> messages, hardware failure messages, ...?

I will try to find something out next week and will come back to you.

> 
> Can you please try to get backtraces of all cpus at the time of the
> hang?
> 
> It would help to know which cpu was the target of the call of
> smp_call_function_single(), so a disassembly of that function would
> be needed to find that information from the dumped registers.
> 
> I'm asking because I've seen a similar problem recently and I was
> rather suspecting a fifo event channel issue than the Xen scheduler,
> but your data suggests it could be the scheduler after all (if it is
> the same issue, of course).
> 
> 
> Juergen



* Re: BUG: credit=sched2 machine hang when using DRAKVUF
  2020-10-23  4:47 ` Jürgen Groß
  2020-10-23 23:36   ` Michał Leszczyński
@ 2020-10-28  2:04   ` Michał Leszczyński
  2020-10-28  7:45     ` Jan Beulich
  2020-10-30 18:50     ` Dario Faggioli
  1 sibling, 2 replies; 7+ messages in thread
From: Michał Leszczyński @ 2020-10-28  2:04 UTC (permalink / raw)
  To: Jürgen Groß; +Cc: xen-devel

----- On Oct 23, 2020, at 6:47, Jürgen Groß jgross@suse.com wrote:

> On 23.10.20 00:59, Michał Leszczyński wrote:
>> Hello,
>> 
>> when using DRAKVUF against a Windows 7 x64 DomU, the whole machine hangs after a
>> few minutes.
>> 
>> The chance for a hang seems to be correlated with number of PCPUs, in this case
>> we have 14 PCPUs and hang is very easily reproducible, while on other machines
>> with 2-4 PCPUs it's very rare (but still occurring sometimes). The issue is
>> observed with the default sched=credit2 and is no longer reproducible once
>> sched=credit is set.
> 
> Interesting. Can you please share some more information?
> 
> Which Xen version are you using?
> 
> Is there any additional information in the dom0 log which could be
> related to the hang (earlier WARN() splats, Oopses, Xen related
> messages, hardware failure messages, ...?
> 
> Can you please try to get backtraces of all cpus at the time of the
> hang?
> 
> It would help to know which cpu was the target of the call of
> smp_call_function_single(), so a disassembly of that function would
> be needed to find that information from the dumped registers.
> 
> I'm asking because I've seen a similar problem recently and I was
> rather suspecting a fifo event channel issue than the Xen scheduler,
> but your data suggests it could be the scheduler after all (if it is
> the same issue, of course).
> 
> 
> Juergen


As I've said before, I'm using RELEASE-4.14.0; this is a Dell PowerEdge R640 with 14 PCPUs.

I have the following additional pieces of log (enclosed below). As you can see, the issue is that particular vCPUs of Dom0 are not being scheduled for a long time, which really decreases the stability of the host system.

Hope this helps somehow.



Best regards,
Michał Leszczyński
CERT Polska

---

[  313.730969] rcu: INFO: rcu_sched self-detected stall on CPU
[  313.731154] rcu:     5-....: (5249 ticks this GP) idle=c6e/1/0x4000000000000002 softirq=4625/4625 fqs=2624
[  313.731474] rcu:      (t=5250 jiffies g=10309 q=220)
[  338.968676] watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [sshd:5991]
[  346.963959] watchdog: BUG: soft lockup - CPU#2 stuck for 23s! [xenconsoled:2747]
(XEN) *** Serial input to Xen (type 'CTRL-a' three times to switch input)
(XEN) sched_smt_power_savings: disabled
(XEN) NOW=384307105230
(XEN) Online Cpus: 0,2,4,6,8,10,12,14,16,18,20,22,24,26
(XEN) Cpupool 0:
(XEN) Cpus: 0,2,4,6,8,10,12,14,16,18,20,22,24,26
(XEN) Scheduling granularity: cpu, 1 CPU per sched-resource
(XEN) Scheduler: SMP Credit Scheduler rev2 (credit2)
(XEN) Active queues: 2
(XEN)   default-weight     = 256
(XEN) Runqueue 0:
(XEN)   ncpus              = 7
(XEN)   cpus               = 0,2,4,6,8,10,12
(XEN)   max_weight         = 256
(XEN)   pick_bias          = 10
(XEN)   instload           = 3
(XEN)   aveload            = 805194 (~307%)
(XEN)   idlers: 00000000,00000000,00000000,00000000,00000000,00000000,00001145
(XEN)   tickled: 00000000,00000000,00000000,00000000,00000000,00000000,00000000
(XEN)   fully idle cores: 00000000,00000000,00000000,00000000,00000000,00000000,00001145
(XEN) Runqueue 1:
(XEN)   ncpus              = 7
(XEN)   cpus               = 14,16,18,20,22,24,26
(XEN)   max_weight         = 256
(XEN)   pick_bias          = 22
(XEN)   instload           = 0
(XEN)   aveload            = 51211 (~19%)
(XEN)   idlers: 00000000,00000000,00000000,00000000,00000000,00000000,05454000
(XEN)   tickled: 00000000,00000000,00000000,00000000,00000000,00000000,00000000
(XEN)   fully idle cores: 00000000,00000000,00000000,00000000,00000000,00000000,05454000
(XEN) Domain info:
(XEN)   Domain: 0 w 256 c 0 v 14
(XEN)     1: [0.0] flags=20 cpu=0 credit=-10000000 [w=256] load=4594 (~1%)
(XEN)     2: [0.1] flags=20 cpu=2 credit=9134904 [w=256] load=262144 (~100%)
(XEN)     3: [0.2] flags=22 cpu=4 credit=-10000000 [w=256] load=262144 (~100%)
(XEN)     4: [0.3] flags=20 cpu=6 credit=-10000000 [w=256] load=4299 (~1%)
(XEN)     5: [0.4] flags=20 cpu=8 credit=-10000000 [w=256] load=4537 (~1%)
(XEN)     6: [0.5] flags=22 cpu=10 credit=-10000000 [w=256] load=262144 (~100%)
(XEN)     7: [0.6] flags=20 cpu=12 credit=-10000000 [w=256] load=5158 (~1%)
(XEN)     8: [0.7] flags=20 cpu=14 credit=10053352 [w=256] load=5150 (~1%)
(XEN)     9: [0.8] flags=20 cpu=16 credit=10200526 [w=256] load=5155 (~1%)
(XEN)    10: [0.9] flags=20 cpu=18 credit=10207025 [w=256] load=4939 (~1%)
(XEN)    11: [0.10] flags=20 cpu=20 credit=10131199 [w=256] load=5753 (~2%)
(XEN)    12: [0.11] flags=20 cpu=22 credit=8103663 [w=256] load=22544 (~8%)
(XEN)    13: [0.12] flags=20 cpu=24 credit=10213151 [w=256] load=4905 (~1%)
(XEN)    14: [0.13] flags=20 cpu=26 credit=10235821 [w=256] load=4858 (~1%)
(XEN)   Domain: 29 w 256 c 0 v 4
(XEN)    15: [29.0] flags=0 cpu=16 credit=10500000 [w=256] load=0 (~0%)
(XEN)    16: [29.1] flags=0 cpu=26 credit=10500000 [w=256] load=0 (~0%)
(XEN)    17: [29.2] flags=0 cpu=10 credit=8294046 [w=256] load=0 (~0%)
(XEN)    18: [29.3] flags=0 cpu=12 credit=9009727 [w=256] load=0 (~0%)
(XEN)   Domain: 30 w 256 c 0 v 4
(XEN)    19: [30.0] flags=0 cpu=26 credit=10500000 [w=256] load=0 (~0%)
(XEN)    20: [30.1] flags=0 cpu=16 credit=10500000 [w=256] load=0 (~0%)
(XEN)    21: [30.2] flags=0 cpu=18 credit=10500000 [w=256] load=0 (~0%)
(XEN)    22: [30.3] flags=0 cpu=22 credit=10500000 [w=256] load=0 (~0%)
(XEN) Runqueue 0:
(XEN) CPU[00] runq=0, sibling={0}, core={0,2,4,6,8,10,12,14,16,18,20,22,24,26}
(XEN) CPU[02] runq=0, sibling={2}, core={0,2,4,6,8,10,12,14,16,18,20,22,24,26}
(XEN) CPU[04] runq=0, sibling={4}, core={0,2,4,6,8,10,12,14,16,18,20,22,24,26}
(XEN)   run: [0.2] flags=22 cpu=4 credit=-10000000 [w=256] load=262144 (~100%)
(XEN) CPU[06] runq=0, sibling={6}, core={0,2,4,6,8,10,12,14,16,18,20,22,24,26}
(XEN) CPU[08] runq=0, sibling={8}, core={0,2,4,6,8,10,12,14,16,18,20,22,24,26}
(XEN) CPU[10] runq=0, sibling={10}, core={0,2,4,6,8,10,12,14,16,18,20,22,24,26}
(XEN)   run: [0.5] flags=22 cpu=10 credit=-10000000 [w=256] load=262144 (~100%)
(XEN) CPU[12] runq=0, sibling={12}, core={0,2,4,6,8,10,12,14,16,18,20,22,24,26}
(XEN) RUNQ:
(XEN)     0: [0.1] flags=20 cpu=2 credit=9134904 [w=256] load=262144 (~100%)
(XEN) Runqueue 1:
(XEN) CPU[14] runq=1, sibling={14}, core={0,2,4,6,8,10,12,14,16,18,20,22,24,26}
(XEN) CPU[16] runq=1, sibling={16}, core={0,2,4,6,8,10,12,14,16,18,20,22,24,26}
(XEN) CPU[18] runq=1, sibling={18}, core={0,2,4,6,8,10,12,14,16,18,20,22,24,26}
(XEN) CPU[20] runq=1, sibling={20}, core={0,2,4,6,8,10,12,14,16,18,20,22,24,26}
(XEN) CPU[22] runq=1, sibling={22}, core={0,2,4,6,8,10,12,14,16,18,20,22,24,26}
(XEN) CPU[24] runq=1, sibling={24}, core={0,2,4,6,8,10,12,14,16,18,20,22,24,26}
(XEN) CPU[26] runq=1, sibling={26}, core={0,2,4,6,8,10,12,14,16,18,20,22,24,26}
(XEN) RUNQ:
(XEN) CPUs info:
(XEN) CPU[00] current=d[IDLE]v0, curr=d[IDLE]v0, prev=NULL
(XEN) CPU[02] current=d[IDLE]v2, curr=d[IDLE]v2, prev=NULL
(XEN) CPU[04] current=d0v2, curr=d0v2, prev=NULL
(XEN) CPU[06] current=d[IDLE]v6, curr=d[IDLE]v6, prev=NULL
(XEN) CPU[08] current=d[IDLE]v8, curr=d[IDLE]v8, prev=NULL
(XEN) CPU[10] current=d0v5, curr=d0v5, prev=NULL
(XEN) CPU[12] current=d[IDLE]v12, curr=d[IDLE]v12, prev=NULL
(XEN) CPU[14] current=d[IDLE]v14, curr=d[IDLE]v14, prev=NULL
(XEN) CPU[16] current=d[IDLE]v16, curr=d[IDLE]v16, prev=NULL
(XEN) CPU[18] current=d[IDLE]v18, curr=d[IDLE]v18, prev=NULL
(XEN) CPU[20] current=d[IDLE]v20, curr=d[IDLE]v20, prev=NULL
(XEN) CPU[22] current=d[IDLE]v22, curr=d[IDLE]v22, prev=NULL
(XEN) CPU[24] current=d[IDLE]v24, curr=d[IDLE]v24, prev=NULL
(XEN) CPU[26] current=d[IDLE]v26, curr=d[IDLE]v26, prev=NULL
[  366.966158] watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [sshd:5991]
[  374.961437] watchdog: BUG: soft lockup - CPU#2 stuck for 23s! [xenconsoled:2747]
[  376.737277] rcu: INFO: rcu_sched self-detected stall on CPU
[  376.737457] rcu:     5-....: (21002 ticks this GP) idle=c6e/1/0x4000000000000002 softirq=4625/4630 fqs=10491
[  376.737773] rcu:      (t=21003 jiffies g=10309 q=4631)
[  402.958905] watchdog: BUG: soft lockup - CPU#2 stuck for 23s! [xenconsoled:2747]
[  402.962904] watchdog: BUG: soft lockup - CPU#5 stuck for 23s! [sshd:5991]
[  413.434099] rcu: INFO: rcu_sched detected expedited stalls on CPUs/tasks: { 2-... 5-... } 5657 jiffies s: 57 root: 0x24/.
[  413.434470] rcu: blocking rcu_node structures:
[  430.956362] watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [xenconsoled:2747]
[  430.960361] watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [sshd:5991]
[  439.743562] rcu: INFO: rcu_sched self-detected stall on CPU
[  439.743741] rcu:     5-....: (36755 ticks this GP) idle=c6e/1/0x4000000000000002 softirq=4625/4630 fqs=18363
[  439.744060] rcu:      (t=36756 jiffies g=10309 q=8837)
[  458.953810] watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [xenconsoled:2747]
[  466.957079] watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [sshd:5991]
[  476.916310] rcu: INFO: rcu_sched detected expedited stalls on CPUs/tasks: { 2-... 5-... } 21529 jiffies s: 57 root: 0x24/.
[  476.916712] rcu: blocking rcu_node structures:
[  486.951250] watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [xenconsoled:2747]
[  491.251030] INFO: task sshd:5993 blocked for more than 120 seconds.
[  491.251249]       Tainted: P           OEL    4.19.0-6-amd64 #1 Debian 4.19.67-2+deb10u2
[  491.251535] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  491.251891] INFO: task sshd:5995 blocked for more than 120 seconds.
[  491.252078]       Tainted: P           OEL    4.19.0-6-amd64 #1 Debian 4.19.67-2+deb10u2
[  491.252321] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  491.252657] INFO: task qemu-system-i38:6011 blocked for more than 120 seconds.
[  491.252924]       Tainted: P           OEL    4.19.0-6-amd64 #1 Debian 4.19.67-2+deb10u2
[  491.253167] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  491.253512] INFO: task sshd:6056 blocked for more than 120 seconds.
[  491.253703]       Tainted: P           OEL    4.19.0-6-amd64 #1 Debian 4.19.67-2+deb10u2
[  491.253947] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  494.954517] watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [sshd:5991]
[  502.749802] rcu: INFO: rcu_sched self-detected stall on CPU
[  502.749981] rcu:     5-....: (52508 ticks this GP) idle=c6e/1/0x4000000000000002 softirq=4625/4630 fqs=26225
[  502.750307] rcu:      (t=52509 jiffies g=10309 q=12321)
[  514.948683] watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [xenconsoled:2747]
[  526.951580] watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [sshd:5991]
[  540.398469] rcu: INFO: rcu_sched detected expedited stalls on CPUs/tasks: { 2-... 5-... } 37401 jiffies s: 57 root: 0x24/.
[  540.398843] rcu: blocking rcu_node structures:
[  542.946109] watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [xenconsoled:2747]
[  554.949003] watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [sshd:5991]
[  565.756021] rcu: INFO: rcu_sched self-detected stall on CPU
[  565.756203] rcu:     5-....: (68261 ticks this GP) idle=c6e/1/0x4000000000000002 softirq=4625/4630 fqs=34096
[  565.756521] rcu:      (t=68262 jiffies g=10309 q=14838)
[  570.943529] watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [xenconsoled:2747]
[  590.945683] watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [sshd:5991]
[  598.940945] watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [xenconsoled:2747]
[  603.880631] rcu: INFO: rcu_sched detected expedited stalls on CPUs/tasks: { 2-... 5-... } 53273 jiffies s: 57 root: 0x24/.
[  603.881006] rcu: blocking rcu_node structures:
[  612.071898] INFO: task ntpd:2726 blocked for more than 120 seconds.
[  612.072120]       Tainted: P           OEL    4.19.0-6-amd64 #1 Debian 4.19.67-2+deb10u2
[  612.072409] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  612.072806] INFO: task sshd:5993 blocked for more than 120 seconds.
[  612.073016]       Tainted: P           OEL    4.19.0-6-amd64 #1 Debian 4.19.67-2+deb10u2
[  612.073258] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  612.073605] INFO: task sshd:5995 blocked for more than 120 seconds.
[  612.085380]       Tainted: P           OEL    4.19.0-6-amd64 #1 Debian 4.19.67-2+deb10u2
[  612.097439] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  612.109691] INFO: task qemu-system-i38:6011 blocked for more than 120 seconds.
[  612.121877]       Tainted: P           OEL    4.19.0-6-amd64 #1 Debian 4.19.67-2+deb10u2
[  612.134123] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  612.146501] INFO: task sshd:6056 blocked for more than 120 seconds.
[  612.158836]       Tainted: P           OEL    4.19.0-6-amd64 #1 Debian 4.19.67-2+deb10u2
[  612.171134] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  612.183643] INFO: task qemu-system-i38:6105 blocked for more than 120 seconds.
[  612.196174]       Tainted: P           OEL    4.19.0-6-amd64 #1 Debian 4.19.67-2+deb10u2
[  612.208899] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  618.943102] watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [sshd:5991]
[  626.938354] watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [xenconsoled:2747]
[  628.762185] rcu: INFO: rcu_sched self-detected stall on CPU
[  628.774574] rcu:     5-....: (84011 ticks this GP) idle=c6e/1/0x4000000000000002 softirq=4625/4630 fqs=41964
[  628.787208] rcu:      (t=84015 jiffies g=10309 q=21278)
[  654.935761] watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [xenconsoled:2747]
[  654.939869] watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [sshd:5991]



* Re: BUG: credit=sched2 machine hang when using DRAKVUF
  2020-10-28  2:04   ` Michał Leszczyński
@ 2020-10-28  7:45     ` Jan Beulich
  2020-10-31  0:42       ` Dario Faggioli
  2020-10-30 18:50     ` Dario Faggioli
  1 sibling, 1 reply; 7+ messages in thread
From: Jan Beulich @ 2020-10-28  7:45 UTC (permalink / raw)
  To: Michał Leszczyński; +Cc: xen-devel, Jürgen Groß

On 28.10.2020 03:04, Michał Leszczyński wrote:
> As I've said before, I'm using RELEASE-4.14.0, this is DELL PowerEdge R640 with 14 PCPUs.

I.e. you haven't tried the tip of the 4.14 stable branch?

> I have the following additional pieces of log (enclosed below). As you could see, the issue is about particular vCPUs of Dom0 not being scheduled for a long time, which really decreases stability of the host system.

I have to admit that the log makes me wonder whether this isn't a
Dom0 internal issue:

> [  338.968676] watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [sshd:5991]
> [  346.963959] watchdog: BUG: soft lockup - CPU#2 stuck for 23s! [xenconsoled:2747]

For these two vCPU-s we see ...

> (XEN) Domain info:
> (XEN)   Domain: 0 w 256 c 0 v 14
> (XEN)     1: [0.0] flags=20 cpu=0 credit=-10000000 [w=256] load=4594 (~1%)
> (XEN)     2: [0.1] flags=20 cpu=2 credit=9134904 [w=256] load=262144 (~100%)
> (XEN)     3: [0.2] flags=22 cpu=4 credit=-10000000 [w=256] load=262144 (~100%)
> (XEN)     4: [0.3] flags=20 cpu=6 credit=-10000000 [w=256] load=4299 (~1%)
> (XEN)     5: [0.4] flags=20 cpu=8 credit=-10000000 [w=256] load=4537 (~1%)
> (XEN)     6: [0.5] flags=22 cpu=10 credit=-10000000 [w=256] load=262144 (~100%)

... that both are fully loaded and ...

> (XEN) Runqueue 0:
> (XEN) CPU[00] runq=0, sibling={0}, core={0,2,4,6,8,10,12,14,16,18,20,22,24,26}
> (XEN) CPU[02] runq=0, sibling={2}, core={0,2,4,6,8,10,12,14,16,18,20,22,24,26}
> (XEN) CPU[04] runq=0, sibling={4}, core={0,2,4,6,8,10,12,14,16,18,20,22,24,26}
> (XEN)   run: [0.2] flags=22 cpu=4 credit=-10000000 [w=256] load=262144 (~100%)
> (XEN) CPU[06] runq=0, sibling={6}, core={0,2,4,6,8,10,12,14,16,18,20,22,24,26}
> (XEN) CPU[08] runq=0, sibling={8}, core={0,2,4,6,8,10,12,14,16,18,20,22,24,26}
> (XEN) CPU[10] runq=0, sibling={10}, core={0,2,4,6,8,10,12,14,16,18,20,22,24,26}
> (XEN)   run: [0.5] flags=22 cpu=10 credit=-10000000 [w=256] load=262144 (~100%)

... they're actively running, confirmed another time ...

> (XEN) RUNQ:
> (XEN) CPUs info:
> (XEN) CPU[00] current=d[IDLE]v0, curr=d[IDLE]v0, prev=NULL
> (XEN) CPU[02] current=d[IDLE]v2, curr=d[IDLE]v2, prev=NULL
> (XEN) CPU[04] current=d0v2, curr=d0v2, prev=NULL
> (XEN) CPU[06] current=d[IDLE]v6, curr=d[IDLE]v6, prev=NULL
> (XEN) CPU[08] current=d[IDLE]v8, curr=d[IDLE]v8, prev=NULL
> (XEN) CPU[10] current=d0v5, curr=d0v5, prev=NULL

... here. Hence an additional question is what exactly they're doing.
'0' and possibly 'd' debug key output may shed some light on it, but
to interpret that output the exact kernel and hypervisor binaries
would need to be known / available.

Furthermore, to tell a deadlock from a livelock, more than one invocation
of any of the involved debug keys is often helpful.
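
E.g. something along these lines from dom0, while the hang is in
progress (just a sketch; the output ends up in the hypervisor log):

  for i in 1 2 3; do
      xl debug-keys 0     # dump Dom0 registers
      xl debug-keys d     # dump registers/stack of all pCPUs
      sleep 10
  done
  xl dmesg > debug-keys.log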

Jan



* Re: BUG: credit=sched2 machine hang when using DRAKVUF
  2020-10-28  2:04   ` Michał Leszczyński
  2020-10-28  7:45     ` Jan Beulich
@ 2020-10-30 18:50     ` Dario Faggioli
  1 sibling, 0 replies; 7+ messages in thread
From: Dario Faggioli @ 2020-10-30 18:50 UTC (permalink / raw)
  To: Michał Leszczyński, Jürgen Groß
  Cc: xen-devel, George Dunlap


[Cc-ing George as it's often useful having him in the loop when doing 
 all this math on credits]

On Wed, 2020-10-28 at 03:04 +0100, Michał Leszczyński wrote:
> ----- On Oct 23, 2020, at 6:47, Jürgen Groß jgross@suse.com wrote:
> As I've said before, I'm using RELEASE-4.14.0, this is DELL PowerEdge
> R640 with 14 PCPUs.
> 
> I have the following additional pieces of log (enclosed below). As
> you could see, the issue is about particular vCPUs of Dom0 not being
> scheduled for a long time, which really decreases stability of the
> host system.
> 
> ---
> 
> [  313.730969] rcu: INFO: rcu_sched self-detected stall on CPU
> [  313.731154] rcu:     5-....: (5249 ticks this GP)
> idle=c6e/1/0x4000000000000002 softirq=4625/4625 fqs=2624
> [  313.731474] rcu:      (t=5250 jiffies g=10309 q=220)
> [  338.968676] watchdog: BUG: soft lockup - CPU#5 stuck for 22s!
> [sshd:5991]
> [  346.963959] watchdog: BUG: soft lockup - CPU#2 stuck for 23s!
> [xenconsoled:2747]
> (XEN) *** Serial input to Xen (type 'CTRL-a' three times to switch
> input)
> (XEN) sched_smt_power_savings: disabled
> (XEN) NOW=384307105230
> (XEN) Online Cpus: 0,2,4,6,8,10,12,14,16,18,20,22,24,26
> (XEN) Cpupool 0:
> (XEN) Cpus: 0,2,4,6,8,10,12,14,16,18,20,22,24,26
> (XEN) Scheduling granularity: cpu, 1 CPU per sched-resource
> (XEN) Scheduler: SMP Credit Scheduler rev2 (credit2)
> (XEN) Active queues: 2
> (XEN)   default-weight     = 256
> (XEN) Runqueue 0:
> (XEN)   ncpus              = 7
> (XEN)   cpus               = 0,2,4,6,8,10,12
> (XEN)   max_weight         = 256
> (XEN)   pick_bias          = 10
> (XEN)   instload           = 3
> (XEN)   aveload            = 805194 (~307%)
> [...]
> (XEN) Runqueue 1:
> (XEN)   ncpus              = 7
> (XEN)   cpus               = 14,16,18,20,22,24,26
> (XEN)   max_weight         = 256
> (XEN)   pick_bias          = 22
> (XEN)   instload           = 0
> (XEN)   aveload            = 51211 (~19%)
> (XEN)   idlers:
>
That's quite imbalanced... Is there any pinning involved, by any
chance?
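
(Pinning would normally come from the Xen command line, the domain
config, or a runtime command; the values below are purely illustrative:

  dom0_vcpus_pin                  # Xen boot option
  cpus = "14-27"                  # in the domU's xl config file
  xl vcpu-pin <domid> all <cpus>  # at runtime

xl vcpu-list shows the affinity currently in effect.)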

> [...]
> (XEN) Domain info:
> (XEN)   Domain: 0 w 256 c 0 v 14
> (XEN)     1: [0.0] flags=20 cpu=0 credit=-10000000 [w=256] load=4594
> (~1%)
> (XEN)     2: [0.1] flags=20 cpu=2 credit=9134904 [w=256] load=262144
> (~100%)
> (XEN)     3: [0.2] flags=22 cpu=4 credit=-10000000 [w=256]
> load=262144 (~100%)
> (XEN)     4: [0.3] flags=20 cpu=6 credit=-10000000 [w=256] load=4299
> (~1%)
> (XEN)     5: [0.4] flags=20 cpu=8 credit=-10000000 [w=256] load=4537
> (~1%)
> (XEN)     6: [0.5] flags=22 cpu=10 credit=-10000000 [w=256]
> load=262144 (~100%)
> (XEN)     7: [0.6] flags=20 cpu=12 credit=-10000000 [w=256] load=5158
>
All these credit = -10000000 values are a bit weird. The scheduler only
allows a vCPU to run until it has no less than -500000 credits. As soon
as a vCPU reaches that point (actually, as soon as it reaches credit < 0),
it should be preempted. And if there is no other vCPU with a positive
amount of credits, credits themselves are reset (by adding 10000000 to them).

So, in an ideal world, we'd see no vCPU with credit < 0 here. We know
that we're not in a super-ideal world, and that's why we allow
credits to go a bit lower, down to -500000.

And we're not even in an ideal world, so values smaller than -500000
are actually possible, due to timer/IRQ latency, etc. But they
shouldn't be much smaller than that.

If we see -10000000, it seems that a vCPU managed to run for more than
20ms, i.e., the 10ms it was given at the last reset event plus 10ms
more, which brought it to -10ms. We can't know exactly how much more
than 20ms, because even if it ran for 40ms, -10000000 is the lower
limit for the credits, and we clip "more negative" values at that
level.

The point is, though, that the scheduler is indeed setting timers to
make sure that the vCPU is interrupted when it reaches -500000. It looks
like, here, quite a few vCPUs are able to ignore such timer interrupts,
or the timer interrupts are not being delivered, or they are, but with
some giant latency/delay.
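
Just to make the arithmetic explicit, here is a toy (bash) model of the
accounting described above, using the constants from this discussion;
this is of course not the actual Xen code:

  CREDIT_RESET=10000000   # ~10ms worth of credits granted at a reset
  CREDIT_MIN=-10000000    # floor at which negative credits get clipped
  credit=$CREDIT_RESET    # credits right after a reset
  ran=25000000            # suppose the vCPU then ran ~25ms uninterrupted
  credit=$(( credit - ran ))                               # -15000000
  credit=$(( credit < CREDIT_MIN ? CREDIT_MIN : credit ))  # -10000000
  echo $credit

So any run longer than ~20ms shows up as exactly -10000000, and the
value alone can't tell us how late the preemption really was.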

> (XEN) Runqueue 0:
> (XEN) CPU[00] runq=0, sibling={0}, core={0,2,4,6,8,10,12,14,16,18,20,22,24,26}
> (XEN) CPU[02] runq=0, sibling={2}, core={0,2,4,6,8,10,12,14,16,18,20,22,24,26}
> (XEN) CPU[04] runq=0, sibling={4}, core={0,2,4,6,8,10,12,14,16,18,20,22,24,26}
> (XEN)   run: [0.2] flags=22 cpu=4 credit=-10000000 [w=256] load=262144 (~100%)
>
And here we see this again. So, they're reporting being stuck, but it's
not that they're stuck as in "not being run by the hypervisor". On the
contrary, they're running all the time!

And... well, I'd like to think a bit more about this but I'd say that
the vCPU is running and the functions that do the credit accounting
(like burn_credits()-->t2c_update()) are being called, otherwise we
would not see -10000000.

Also...

> (XEN) CPU[06] runq=0, sibling={6}, core={0,2,4,6,8,10,12,14,16,18,20,22,24,26}
> (XEN) CPU[08] runq=0, sibling={8}, core={0,2,4,6,8,10,12,14,16,18,20,22,24,26}
> (XEN) CPU[10] runq=0, sibling={10}, core={0,2,4,6,8,10,12,14,16,18,20,22,24,26}
> (XEN)   run: [0.5] flags=22 cpu=10 credit=-10000000 [w=256] load=262144 (~100%)
> (XEN) CPU[12] runq=0, sibling={12}, core={0,2,4,6,8,10,12,14,16,18,20,22,24,26}
> (XEN) RUNQ:
> (XEN)     0: [0.1] flags=20 cpu=2 credit=9134904 [w=256] load=262144 (~100%)
>
... The actual scheduler (csched2_schedule()) is not being invoked. In
fact, there is vCPU d0v1, sitting here in the runqueue.

And if the scheduler ran on either CPU 4 or 12, it would notice
that vCPUs 2 and 5 have -10000000 credits, while vCPU 1 is waiting
there with 9134904 credits. And they'd pick it up and preempt.

Or it is running, but not doing what I just described because of some
bug we have there that I am not currently seeing.

Also, all the other idle vCPUs are not picking d0v1 up either.

Another thing that should not happen.

So, can I see:

# xl info -n
# xl vcpu-list

 ?

In the meantime, I'll continue looking at the code, and maybe come up
with a debug patch, e.g., for figuring out whether csched2_schedule()
is either running but not doing its job, or not running at all.

Regards
-- 
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
-------------------------------------------------------------------
<<This happens because _I_ choose it to happen!>> (Raistlin Majere)



* Re: BUG: credit=sched2 machine hang when using DRAKVUF
  2020-10-28  7:45     ` Jan Beulich
@ 2020-10-31  0:42       ` Dario Faggioli
  0 siblings, 0 replies; 7+ messages in thread
From: Dario Faggioli @ 2020-10-31  0:42 UTC (permalink / raw)
  To: Jan Beulich, Michał Leszczyński
  Cc: xen-devel, Jürgen Groß


On Wed, 2020-10-28 at 08:45 +0100, Jan Beulich wrote:
> On 28.10.2020 03:04, Michał Leszczyński wrote:
> 
> 
> I have to admit that the log makes me wonder whether this isn't a
> Dom0 internal issue:
> 
> > [  338.968676] watchdog: BUG: soft lockup - CPU#5 stuck for 22s!
> > [sshd:5991]
> > [  346.963959] watchdog: BUG: soft lockup - CPU#2 stuck for 23s!
> > [xenconsoled:2747]
> 
Yeah, weird.

> For these two vCPU-s we see ...
> 
> > (XEN) Domain info:
> > (XEN)   Domain: 0 w 256 c 0 v 14
> > (XEN)     1: [0.0] flags=20 cpu=0 credit=-10000000 [w=256]
> > load=4594 (~1%)
> > (XEN)     2: [0.1] flags=20 cpu=2 credit=9134904 [w=256]
> > load=262144 (~100%)
> > (XEN)     3: [0.2] flags=22 cpu=4 credit=-10000000 [w=256]
> > load=262144 (~100%)
> > (XEN)     4: [0.3] flags=20 cpu=6 credit=-10000000 [w=256]
> > load=4299 (~1%)
> > (XEN)     5: [0.4] flags=20 cpu=8 credit=-10000000 [w=256]
> > load=4537 (~1%)
> > (XEN)     6: [0.5] flags=22 cpu=10 credit=-10000000 [w=256]
> > load=262144 (~100%)
> 
> ... that both are fully loaded and ...
> 
> > [...]
> 
> ... they're actively running,
>
True indeed. But as I said in my other reply, it's weird that we have
so many vCPUs with the artificial value that we use to represent the
minimum value of credits we allow a vCPU to have.

And it's weird that, with some idle CPUs and with two CPUs running
vCPUs with negative credits, we have one with positive credits sitting
in the runqueue.

Unless the debug-key output captured a transient state. Like, d0v1 is in
the runqueue because it just woke up, and the 'r' dump occurred between
when it was put in the runqueue and when a physical CPU (which is poked
during the wake-up itself) picked it up.

It seems unlikely, and this still would not explain nor justify the
-10000000. But, still, Michał, can you perhaps check whether, while the
issue manifests, poking at the 'r' key a few times always shows the same
(or a similar) situation?
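
Something as simple as this, run from dom0 while the stall is ongoing,
should be enough (just a sketch):

  for i in $(seq 5); do xl debug-keys r; sleep 2; done
  xl dmesg > runq-dumps.log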

> > (XEN) RUNQ:
> > (XEN) CPUs info:
> > (XEN) CPU[00] current=d[IDLE]v0, curr=d[IDLE]v0, prev=NULL
> > (XEN) CPU[02] current=d[IDLE]v2, curr=d[IDLE]v2, prev=NULL
> > (XEN) CPU[04] current=d0v2, curr=d0v2, prev=NULL
> > (XEN) CPU[06] current=d[IDLE]v6, curr=d[IDLE]v6, prev=NULL
> > (XEN) CPU[08] current=d[IDLE]v8, curr=d[IDLE]v8, prev=NULL
> > (XEN) CPU[10] current=d0v5, curr=d0v5, prev=NULL
> 
> ... here. Hence an additional question is what exactly they're doing.
> '0' and possibly 'd' debug key output may shed some light on it, but
> to interpret that output the exact kernel and hypervisor binaries
> would need to be known / available.
> 
Yes, I agree. Even considering all that I said (which seems to point
back at a Xen issue, rather than kernel), knowing more about what the
vCPUs are doing could indeed be helpful!

Regards
-- 
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
-------------------------------------------------------------------
<<This happens because _I_ choose it to happen!>> (Raistlin Majere)


