* [Xen-devel] Xen crash after S3 suspend - Xen 4.13
@ 2020-03-18 14:16 Marek Marczykowski-Górecki
  2020-03-18 14:50 ` Andrew Cooper
  0 siblings, 1 reply; 26+ messages in thread
From: Marek Marczykowski-Górecki @ 2020-03-18 14:16 UTC (permalink / raw)
  To: xen-devel



Hi,

In my test setup (inside KVM with nested virt enabled), I rather
frequently get a Xen crash on resume from S3. Full message below.

This is Xen 4.13.0, with some patches, including "sched: fix resuming
from S3 with smt=0".

Contrary to the previous issue, this one does not always happen - I
would say in about 40% of cases on this setup, but very rarely on a
physical setup.

This is _without_ core scheduling enabled, and also with smt=off.

Do you think it would be any different on xen-unstable? I can try, but
it isn't trivial in this setup, so I'd ask first.


(XEN) Xen BUG at sched_credit2.c:3474
(XEN) ----[ Xen-4.13.0  x86_64  debug=n   Not tainted ]----
(XEN) CPU:    1
(XEN) RIP:    e008:[<ffff82d08022a60a>] sched_credit2.c#csched2_schedule+0xafa/0x1350
(XEN) RFLAGS: 0000000000010097   CONTEXT: hypervisor (d5v0)
(XEN) rax: ffff8300a9845000   rbx: 0000000000000001   rcx: ffff8301ba8fb0e0
(XEN) rdx: 000000000000001b   rsi: ffff8300821dcdd0   rdi: ffff82d0803fa720
(XEN) rbp: ffff82d08053aee0   rsp: ffff8301ba907d68   r8:  0000000000000000
(XEN) r9:  ffff8300a98452b8   r10: 0000000000000000   r11: 0000000000000000
(XEN) r12: ffff8301ba8fb238   r13: ffff8300821dcec0   r14: ffff8301ba8fbf70
(XEN) r15: 0000000000000001   cr0: 0000000080050033   cr4: 00000000000026e0
(XEN) cr3: 000000008e85e000   cr2: 00007a169cd25a10
(XEN) fsb: 0000000000000000   gsb: 0000000000000000   gss: 0000000000000000
(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
(XEN) Xen code around <ffff82d08022a60a> (sched_credit2.c#csched2_schedule+0xafa/0x1350):
(XEN)  65 10 0f 84 b6 f5 ff ff <0f> 0b 0f 1f 40 00 4c 89 ef e8 b8 bb ff ff e9 ee
(XEN) Xen stack trace from rsp=ffff8301ba907d68:
(XEN)    ffff82d0803fa720 00000051e8e6c8a5 0000000000000001 ffff8300821dcdd0
(XEN)    ffff82d000000000 0000000000000282 0000000000000001 ba90000100000010
(XEN)    00000051e8e7146c ffff83008ec38090 ffff8301ba907fff ffff8300a9845000
(XEN)    0000000000000000 0000000000000000 ffff8300821dcdd0 ffff8301ba919010
(XEN)    0000000000000000 ffff82d0803fa720 ffff8301ba8fbf70 00000051e8e6c8a5
(XEN)    ffff82d08022f469 ffff82d080229b10 ffff8300821dcdd0 0000000000000001
(XEN)    ffff8300821dcdd0 ffff82d08053aee0 0000000000000001 ffff8301ba919010
(XEN)    ffff83008ec38000 ffff82d080233815 ffff8301ba8fb238 ffff82d08051ff88
(XEN)    00000001ba907ef8 ffff8301ba8fb238 ffff83008ec38000 0000000000000000
(XEN)    ffff82d0802eed2f ffff82d08030e494 ffffffffffffffff ffff82d080524880
(XEN)    ffff82d08052c880 0000000000000001 0000000000000000 0000000000000000
(XEN)    ffff82d080234b51 ffff83008ec38000 0000000000000000 0000000000000000
(XEN)    0000000000000000 ffff82d08031233b ffffffff82614780 0000000000000000
(XEN)    0000000000000000 ffffffff82614780 0000000000000000 0000000000000000
(XEN)    000000000002e2b8 0000000000000000 00000000001b2ff4 0000000000000001
(XEN)    ffffffff81a18a10 0000000000000000 0000000000000000 ffffffff82603e58
(XEN)    0000003d0d5893ff 000000fa00000000 ffffffff81a18dee 0000000000000000
(XEN)    0000000000000246 ffffffff82603e90 0000000000000000 0000000000000000
(XEN)    0000000000000000 0000000000000000 0000000000000000 0000e01000000001
(XEN)    ffff83008ec38000 000000313a3ea000 00000000000026e0 0000000000000000
(XEN) Xen call trace:
(XEN)    [<ffff82d08022a60a>] R sched_credit2.c#csched2_schedule+0xafa/0x1350
(XEN)    [<ffff82d08022f469>] S schedule.c#do_schedule+0xb9/0x520
(XEN)    [<ffff82d080229b10>] S sched_credit2.c#csched2_schedule+0/0x1350
(XEN)    [<ffff82d080233815>] S schedule.c#schedule+0x1b5/0x230
(XEN)    [<ffff82d0802eed2f>] S hvm_vcpu_has_pending_irq+0x6f/0x90
(XEN)    [<ffff82d08030e494>] S nvmx_switch_guest+0x94/0x1a40
(XEN)    [<ffff82d080234b51>] S softirq.c#__do_softirq+0x51/0x80
(XEN)    [<ffff82d08031233b>] S vmx_asm_do_vmentry+0x2b/0x30
(XEN) 
(XEN) 
(XEN) ****************************************
(XEN) Panic on CPU 1:
(XEN) Xen BUG at sched_credit2.c:3474
(XEN) ****************************************
(XEN) 
(XEN) Reboot in five seconds...

Full serial log:
https://openqa.qubes-os.org/tests/6994/file/serial0.txt

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?


* Re: [Xen-devel] Xen crash after S3 suspend - Xen 4.13
  2020-03-18 14:16 [Xen-devel] Xen crash after S3 suspend - Xen 4.13 Marek Marczykowski-Górecki
@ 2020-03-18 14:50 ` Andrew Cooper
  2020-03-18 22:10   ` Marek Marczykowski-Górecki
  0 siblings, 1 reply; 26+ messages in thread
From: Andrew Cooper @ 2020-03-18 14:50 UTC (permalink / raw)
  To: Marek Marczykowski-Górecki, xen-devel

On 18/03/2020 14:16, Marek Marczykowski-Górecki wrote:
> Hi,
>
> In my test setup (inside KVM with nested virt enabled), I rather
> frequently get Xen crash on resume from S3. Full message below.
>
> This is Xen 4.13.0, with some patches, including "sched: fix resuming
> from S3 with smt=0".
>
> Contrary to the previous issue, this one does not happen always - I
> would say in about 40% cases on this setup, but very rarely on physical
> setup.
>
> This is _without_ core scheduling enabled, and also with smt=off.
>
> Do you think it would be any different on xen-unstable? I cat try, but
> it isn't trivial in this setup, so I'd ask first.
>
>
> (XEN) Xen BUG at sched_credit2.c:3474
> (XEN) ----[ Xen-4.13.0  x86_64  debug=n   Not tainted ]----

Any chance you can reproduce with a debug build?

There is a good chance that an assertion might fire earlier than this BUG().
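
Roughly speaking (a sketch with made-up macros, not the actual Xen ones): in
a debug build the ASSERT()s are compiled in and can catch the inconsistency
close to where it first appears, while in a release build they compile away
and only an unconditional BUG() much later on the path fires:

#include <stdio.h>
#include <stdlib.h>

/* Rough stand-ins for the idea only - not the real Xen macros. */
#ifdef NDEBUG                                   /* release-like build      */
#define MY_ASSERT(cond) ((void)0)               /* checks compiled away    */
#else                                           /* debug-like build        */
#define MY_ASSERT(cond)                                              \
    do { if ( !(cond) ) {                                            \
        fprintf(stderr, "Assertion '%s' failed\n", #cond);           \
        abort();                                                     \
    } } while ( 0 )
#endif

#define MY_BUG()                                                     \
    do {                                                             \
        fprintf(stderr, "BUG at %s:%d\n", __FILE__, __LINE__);       \
        abort();                                                     \
    } while ( 0 )

static void schedule_sketch(int state_consistent)
{
    MY_ASSERT(state_consistent);    /* fires early, but only with debug=y */

    /* ... lots of scheduler work based on the broken state ... */

    if ( !state_consistent )
        MY_BUG();                   /* what a debug=n build eventually hits */
}

int main(void)
{
    schedule_sketch(0);
    return 0;
}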

~Andrew



* Re: [Xen-devel] Xen crash after S3 suspend - Xen 4.13
  2020-03-18 14:50 ` Andrew Cooper
@ 2020-03-18 22:10   ` Marek Marczykowski-Górecki
  2020-03-19  0:28     ` Dario Faggioli
  0 siblings, 1 reply; 26+ messages in thread
From: Marek Marczykowski-Górecki @ 2020-03-18 22:10 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel



On Wed, Mar 18, 2020 at 02:50:52PM +0000, Andrew Cooper wrote:
> On 18/03/2020 14:16, Marek Marczykowski-Górecki wrote:
> > Hi,
> >
> > In my test setup (inside KVM with nested virt enabled), I rather
> > frequently get Xen crash on resume from S3. Full message below.
> >
> > This is Xen 4.13.0, with some patches, including "sched: fix resuming
> > from S3 with smt=0".
> >
> > Contrary to the previous issue, this one does not happen always - I
> > would say in about 40% cases on this setup, but very rarely on physical
> > setup.
> >
> > This is _without_ core scheduling enabled, and also with smt=off.
> >
> > Do you think it would be any different on xen-unstable? I cat try, but
> > it isn't trivial in this setup, so I'd ask first.
> >
> >
> > (XEN) Xen BUG at sched_credit2.c:3474
> > (XEN) ----[ Xen-4.13.0  x86_64  debug=n   Not tainted ]----
> 
> Any chance you can reproduce with a debug build?
> 
> There is a good chance that an assertion might fire earlier than this BUG().

Got it:

(XEN) Assertion 'c2rqd(ops, sched_unit_master(unit)) == svc->rqd' failed at sched_credit2.c:2137
(XEN) ----[ Xen-4.13.0  x86_64  debug=y   Not tainted ]----
(XEN) CPU:    0
(XEN) RIP:    e008:[<ffff82d08022bee9>] sched_credit2.c#csched2_unit_wake+0x174/0x176
(XEN) RFLAGS: 0000000000010097   CONTEXT: hypervisor (d0v1)
(XEN) rax: ffff8301ba8fb198   rbx: ffff83008ea2ae10   rcx: 000000000000001b
(XEN) rdx: ffff8301ba8fb0c0   rsi: ffff8301ba8fb040   rdi: 0000000000000001
(XEN) rbp: ffff8300be84fd30   rsp: ffff8300be84fd10   r8:  deadbeefdeadf00d
(XEN) r9:  deadbeefdeadf00d   r10: 0000000000000000   r11: 0000000000000000
(XEN) r12: ffff83008ea2af30   r13: ffff82d080430720   r14: 0000000000000001
(XEN) r15: 0000004177a8cc12   cr0: 0000000080050033   cr4: 00000000000006e0
(XEN) cr3: 000000010e658000   cr2: 0000599b4e014010
(XEN) fsb: 00007d6c53bfd700   gsb: ffff88813ff00000   gss: 0000000000000000
(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e010   cs: e008
(XEN) Xen code around <ffff82d08022bee9> (sched_credit2.c#csched2_unit_wake+0x174/0x176):
(XEN)  ef e8 e4 c3 ff ff eb a7 <0f> 0b 55 48 89 e5 41 57 41 56 41 55 41 54 53 48
(XEN) Xen stack trace from rsp=ffff8300be84fd10:
(XEN)    ffff83008dc3b000 ffff83008ea2ae10 ffff83008ea2ae10 ffff8301ba8fb198
(XEN)    ffff8300be84fd78 ffff82d0802346c6 0000000000000202 ffff8301ba8fb198
(XEN)    ffff83008dc3b000 ffff83011e6e4000 00007d6c503d2010 0000000000000000
(XEN)    0000000000000001 ffff8300be84fd90 ffff82d0802082f1 ffff83011e6e4000
(XEN)    ffff8300be84fda8 ffff82d08020843e 0000000000000000 ffff8300be84fec0
(XEN)    ffff82d080205ea5 0000000000000206 ffff830100e16060 0000000000007f88
(XEN)    ffff8301ba88b000 ffff8300be84fdf8 ffff82d08023bae1 ffff8301ba88b068
(XEN)    ffff82d080557300 ffff8300be84fe50 0000000000000206 0000001200000004
(XEN)    00007d6c2f6b0005 00007d6c3c00a250 0000000000000005 0000000000000001
(XEN)    00007d6c2f6b0189 000000030000007c 00007d6c10000e20 ffff010a00000005
(XEN)    000000000002cf56 000000000002e846 0000000000000000 0000000000000000
(XEN)    0000000000000000 0000000000000001 0000000554eea2c6 0000000000000001
(XEN)    7ba0548d00000000 ffff8300be84fef8 ffff8301ba88b000 0000000000000001
(XEN)    0000000000000000 0000000000000000 ffff8300be84fee8 ffff82d08034d922
(XEN)    ffff8301ba88b000 0000000000000000 0000000000000000 00007cff417b00e7
(XEN)    ffff82d080354432 ffff88813a157100 00007d6c53bfc590 00007d6c53bfc590
(XEN)    ffff8881358ad3e8 ffff88813a157100 ffff88813a157101 0000000000000282
(XEN)    0000000000000000 ffffc90001953e08 0000000000000000 0000000000000024
(XEN)    ffffffff8100148a 0000000000000000 0000000000000000 00007d6c503d2010
(XEN)    0000010000000000 ffffffff8100148a 000000000000e033 0000000000000282
(XEN) Xen call trace:
(XEN)    [<ffff82d08022bee9>] R sched_credit2.c#csched2_unit_wake+0x174/0x176
(XEN)    [<ffff82d0802346c6>] F vcpu_wake+0xdd/0x3ff
(XEN)    [<ffff82d0802082f1>] F domain_unpause+0x2f/0x3b
(XEN)    [<ffff82d08020843e>] F domain_unpause_by_systemcontroller+0x40/0x60
(XEN)    [<ffff82d080205ea5>] F do_domctl+0x9e4/0x1952
(XEN)    [<ffff82d08034d922>] F pv_hypercall+0x548/0x560
(XEN)    [<ffff82d080354432>] F lstar_enter+0x112/0x120
(XEN) 
(XEN) 
(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) Assertion 'c2rqd(ops, sched_unit_master(unit)) == svc->rqd' failed at sched_credit2.c:2137
(XEN) ****************************************
(XEN) 
(XEN) Reboot in five seconds...


-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?


* Re: [Xen-devel] Xen crash after S3 suspend - Xen 4.13
  2020-03-18 22:10   ` Marek Marczykowski-Górecki
@ 2020-03-19  0:28     ` Dario Faggioli
  2020-03-19  0:59       ` Marek Marczykowski-Górecki
  2020-03-23  0:09       ` Marek Marczykowski-Górecki
  0 siblings, 2 replies; 26+ messages in thread
From: Dario Faggioli @ 2020-03-19  0:28 UTC (permalink / raw)
  To: Marek Marczykowski-Górecki, Andrew Cooper; +Cc: xen-devel, Juergen Gross



[Adding Juergen]

On Wed, 2020-03-18 at 23:10 +0100, Marek Marczykowski-Górecki wrote:
> On Wed, Mar 18, 2020 at 02:50:52PM +0000, Andrew Cooper wrote:
> > On 18/03/2020 14:16, Marek Marczykowski-Górecki wrote:
> > > Hi,
> > > 
> > > In my test setup (inside KVM with nested virt enabled), I rather
> > > frequently get Xen crash on resume from S3. Full message below.
> > > 
> > > This is Xen 4.13.0, with some patches, including "sched: fix
> > > resuming
> > > from S3 with smt=0".
> > > 
> > > Contrary to the previous issue, this one does not happen always -
> > > I
> > > would say in about 40% cases on this setup, but very rarely on
> > > physical
> > > setup.
> > > 
> > > This is _without_ core scheduling enabled, and also with smt=off.
> > > 
> > > Do you think it would be any different on xen-unstable? I cat
> > > try, but
> > > it isn't trivial in this setup, so I'd ask first.
> > > 
Well, Juergen has fixed quite a few issues.

Most of them were triggering with core-scheduling enabled, and I don't
recall any of them that looked similar or related to this.

Still, it's possible that the same issue causes different symptoms, and
hence that one of the patches might fix this too.

But if it's difficult for you to try upstream, let's maybe wait and see
if he has an opinion about this bug.

I have just one question:

> (XEN) Assertion 'c2rqd(ops, sched_unit_master(unit)) == svc->rqd'
> failed at sched_credit2.c:2137
> (XEN) ----[ Xen-4.13.0  x86_64  debug=y   Not tainted ]----
> [...]
> (XEN) Xen call trace:
> (XEN)    [<ffff82d08022bee9>] R
> sched_credit2.c#csched2_unit_wake+0x174/0x176
> (XEN)    [<ffff82d0802346c6>] F vcpu_wake+0xdd/0x3ff
> (XEN)    [<ffff82d0802082f1>] F domain_unpause+0x2f/0x3b
> (XEN)    [<ffff82d08020843e>] F
> domain_unpause_by_systemcontroller+0x40/0x60
> (XEN)    [<ffff82d080205ea5>] F do_domctl+0x9e4/0x1952
> (XEN)    [<ffff82d08034d922>] F pv_hypercall+0x548/0x560
> (XEN)    [<ffff82d080354432>] F lstar_enter+0x112/0x120
> (XEN) 
> (XEN) 
> (XEN) ****************************************
> (XEN) Panic on CPU 0:
> (XEN) Assertion 'c2rqd(ops, sched_unit_master(unit)) == svc->rqd'
> failed at sched_credit2.c:2137
> (XEN) ****************************************
> (XEN) 
>
Do you remember (or can you easily test) whether this was also occurring on
Xen 4.12, i.e., without core-scheduling code even being there, when
this ASSERT was:

 ASSERT(c2rqd(ops, vc->processor) == svc->rqd );

If not, that might mean we have some scheduling resource and/or master
CPU issue on the S3 resume path.
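
For illustration only, here is a rough sketch (plain C with made-up names,
not the actual Xen code) of the invariant both versions of that ASSERT
express: the runqueue derived from the CPU the vCPU/unit currently sits on
must be the very runqueue recorded in the scheduler's per-unit data; the two
versions differ only in how that CPU is obtained (vc->processor vs.
sched_unit_master(unit)):

#include <assert.h>

/* Made-up, minimal stand-ins for the credit2 structures. */
struct runqueue { int id; };

struct sched_item {
    struct runqueue *rqd;   /* runqueue this vCPU/unit was assigned to   */
    int cpu;                /* CPU it is currently placed on ("master")  */
};

/* Made-up mapping from a CPU number to the runqueue it belongs to. */
static struct runqueue runqueues[2];
static const int cpu_to_rq[2] = { 0, 1 };

static struct runqueue *c2rqd_sketch(int cpu)
{
    return &runqueues[cpu_to_rq[cpu]];
}

/* The wake path expects both views of "my runqueue" to agree. */
static void wake_sketch(const struct sched_item *svc)
{
    assert(c2rqd_sketch(svc->cpu) == svc->rqd);
}

int main(void)
{
    struct sched_item item = { .rqd = &runqueues[1], .cpu = 1 };

    wake_sketch(&item);     /* fine: CPU 1 maps to runqueue 1            */

    item.cpu = 0;           /* e.g. moved/reassigned across suspend ...  */
    wake_sketch(&item);     /* ... now the views disagree and this fires */
    return 0;
}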

Thanks and Regards
-- 
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
-------------------------------------------------------------------
<<This happens because _I_ choose it to happen!>> (Raistlin Majere)



* Re: [Xen-devel] Xen crash after S3 suspend - Xen 4.13
  2020-03-19  0:28     ` Dario Faggioli
@ 2020-03-19  0:59       ` Marek Marczykowski-Górecki
  2020-03-23  0:09       ` Marek Marczykowski-Górecki
  1 sibling, 0 replies; 26+ messages in thread
From: Marek Marczykowski-Górecki @ 2020-03-19  0:59 UTC (permalink / raw)
  To: Dario Faggioli; +Cc: Andrew Cooper, Juergen Gross, xen-devel



On Thu, Mar 19, 2020 at 01:28:10AM +0100, Dario Faggioli wrote:
> Do you remember (or can easily test) whether this was also occurring on
> Xen 4.12, 

Looking at my test logs, it did not happen on Xen 4.12.

> i.e., without core-scheduling code even being there, when
> this ASSERT was:
> 
>  ASSERT(c2rqd(ops, vc->processor) == svc->rqd );
> 
> If no, that might mean we have some scheduling resource and/or master
> CPU issue on the S3 resume path.

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?


* Re: [Xen-devel] Xen crash after S3 suspend - Xen 4.13
  2020-03-19  0:28     ` Dario Faggioli
  2020-03-19  0:59       ` Marek Marczykowski-Górecki
@ 2020-03-23  0:09       ` Marek Marczykowski-Górecki
  2020-03-23  8:14         ` Jan Beulich
  2020-09-29 14:27         ` Marek Marczykowski-Górecki
  1 sibling, 2 replies; 26+ messages in thread
From: Marek Marczykowski-Górecki @ 2020-03-23  0:09 UTC (permalink / raw)
  To: Dario Faggioli; +Cc: Andrew Cooper, Juergen Gross, xen-devel



On Thu, Mar 19, 2020 at 01:28:10AM +0100, Dario Faggioli wrote:
> [Adding Juergen]
> 
> On Wed, 2020-03-18 at 23:10 +0100, Marek Marczykowski-Górecki wrote:
> > On Wed, Mar 18, 2020 at 02:50:52PM +0000, Andrew Cooper wrote:
> > > On 18/03/2020 14:16, Marek Marczykowski-Górecki wrote:
> > > > Hi,
> > > > 
> > > > In my test setup (inside KVM with nested virt enabled), I rather
> > > > frequently get Xen crash on resume from S3. Full message below.
> > > > 
> > > > This is Xen 4.13.0, with some patches, including "sched: fix
> > > > resuming
> > > > from S3 with smt=0".
> > > > 
> > > > Contrary to the previous issue, this one does not happen always -
> > > > I
> > > > would say in about 40% cases on this setup, but very rarely on
> > > > physical
> > > > setup.
> > > > 
> > > > This is _without_ core scheduling enabled, and also with smt=off.
> > > > 
> > > > Do you think it would be any different on xen-unstable? I cat
> > > > try, but
> > > > it isn't trivial in this setup, so I'd ask first.
> > > > 
> Well, Juergen has fixed quite a few issues.
> 
> Most of them where triggering with core-scheduling enabled, and I don't
> recall any of them which looked similar or related to this.
> 
> Still, it's possible that the same issue causes different symptoms, and
> hence that maybe one of the patches would fix this too.

I've tested on master (d094e95fb7c) and reproduced exactly the same crash
(pasted below for completeness).
But there is more: additionally, in most (all?) cases after resume I've got a
soft lockup in Linux dom0 in smp_call_function_single() - see below. It
didn't happen before, and the only change was Xen 4.13 -> master.

Xen crash:

(XEN) Assertion 'c2rqd(sched_unit_master(unit)) == svc->rqd' failed at credit2.c:2133
(XEN) ----[ Xen-4.14-unstable  x86_64  debug=y   Not tainted ]----
(XEN) CPU:    1
(XEN) RIP:    e008:[<ffff82d08023a3c5>] credit2.c#csched2_unit_wake+0x14f/0x151
(XEN) RFLAGS: 0000000000010002   CONTEXT: hypervisor (d0v1)
(XEN) rax: ffff8301ba8fafb0   rbx: ffff8300912238b0   rcx: 0000000000000000
(XEN) rdx: ffff8301ba8d81f0   rsi: 0000000000000000   rdi: ffff8301ba8d8016
(XEN) rbp: ffff830170db7d30   rsp: ffff830170db7d10   r8:  deadbeefdeadf00d
(XEN) r9:  deadbeefdeadf00d   r10: 0000000000000000   r11: 0000000000000000
(XEN) r12: ffff8300912239a0   r13: ffff82d080433780   r14: 0000000000000000
(XEN) r15: 0000005bdb5286ad   cr0: 0000000080050033   cr4: 0000000000000660
(XEN) cr3: 000000010e53c000   cr2: 00005ec1b2f56280
(XEN) fsb: 000079872ee29700   gsb: ffff88813ff00000   gss: 0000000000000000
(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e010   cs: e008
(XEN) Xen code around <ffff82d08023a3c5> (credit2.c#csched2_unit_wake+0x14f/0x151):
(XEN)  df e8 f9 c5 ff ff eb ad <0f> 0b 55 48 89 e5 41 57 41 56 41 55 41 54 53 48
(XEN) Xen stack trace from rsp=ffff830170db7d10:
(XEN)    ffff830090a33000 ffff8300912238b0 ffff8300912238b0 ffff8301ba8d8010
(XEN)    ffff830170db7d78 ffff82d08024253b 0000000000000202 ffff8301ba8d8010
(XEN)    ffff830090a33000 ffff8300a864b000 000079872c600010 0000000000000000
(XEN)    0000000000000001 ffff830170db7d90 ffff82d080206e09 ffff8300a864b000
(XEN)    ffff830170db7da8 ffff82d080206f1c 0000000000000000 ffff830170db7ec0
(XEN)    ffff82d080204de7 ffff8301ba8cb001 ffff830170db7fff 0000000470db7e10
(XEN)    0000000000000000 ffff82e0021d0160 ffff88813ff15b28 ffff8301ba8cb000
(XEN)    ffff8301ba8cb000 ffff8301ba88b000 ffff830170db7e10 0000001200000004
(XEN)    0000798728000005 0000000000000001 0000000000000005 000079872ee286e0
(XEN)    000079872c109e77 000000030000001c 00007986ec0013c0 ffff010a00000005
(XEN)    000000000002a240 000000000002bb30 0000000000000000 0000000000000000
(XEN)    0000000000000000 0000000000000001 00000005d1ea5ab2 0000000000000001
(XEN)    7ba0548d00000000 ffff830170db7ef8 ffff8301ba88b000 0000000000000001
(XEN)    0000000000000000 0000000000000000 ffff830170db7ee8 ffff82d0802d779d
(XEN)    ffff8301ba88b000 0000000000000000 0000000000000000 00007cfe8f2480e7
(XEN)    ffff82d080355432 ffff88813a1bef00 000079872ee28590 000079872ee28590
(XEN)    ffff8881358e9c40 ffff88813a1bef00 ffff88813a1bef01 0000000000000282
(XEN)    0000000000000000 ffffc90001923e08 0000000000000000 0000000000000024
(XEN)    ffffffff8100148a 0000000000000000 0000000000000000 000079872c600010
(XEN)    0000010000000000 ffffffff8100148a 000000000000e033 0000000000000282
(XEN) Xen call trace:
(XEN)    [<ffff82d08023a3c5>] R credit2.c#csched2_unit_wake+0x14f/0x151
(XEN)    [<ffff82d08024253b>] F vcpu_wake+0xdd/0x3ff
(XEN)    [<ffff82d080206e09>] F domain_unpause+0x2f/0x3b
(XEN)    [<ffff82d080206f1c>] F domain_unpause_by_systemcontroller+0x40/0x60
(XEN)    [<ffff82d080204de7>] F do_domctl+0x9e1/0x16f1
(XEN)    [<ffff82d0802d779d>] F pv_hypercall+0x548/0x560
(XEN)    [<ffff82d080355432>] F lstar_enter+0x112/0x120
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 1:
(XEN) Assertion 'c2rqd(sched_unit_master(unit)) == svc->rqd' failed at credit2.c:2133
(XEN) ****************************************
(XEN)
(XEN) Reboot in five seconds...


Linux dom0 soft lockup:

[  524.742089] watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [systemd:1]
[  524.747897] Modules linked in: joydev br_netfilter xt_physdev xen_netback bridge stp llc loop ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter snd_hda_codec_generic ledtrig_audio ppdev snd_hda_intel snd_intel_nhlt snd_hda_codec snd_hda_core edac_mce_amd snd_hwdep snd_seq snd_seq_device snd_pcm pcspkr snd_timer snd parport_pc e1000e soundcore parport i2c_piix4 xenfs ip_tables dm_thin_pool dm_persistent_data libcrc32c dm_bio_prison bochs_drm drm_kms_helper drm_vram_helper ttm drm serio_raw ehci_pci ehci_hcd virtio_console virtio_scsi ata_generic pata_acpi floppy qemu_fw_cfg xen_privcmd xen_pciback xen_blkback xen_gntalloc xen_gntdev xen_evtchn uinput pkcs8_key_parser
[  524.768696] CPU: 1 PID: 1 Comm: systemd Tainted: G        W         5.4.25-1.qubes.x86_64 #1
[  524.771407] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba527-rebuilt.opensuse.org 04/01/2014
[  524.775056] RIP: e030:smp_call_function_single+0xe0/0x110
[  524.776755] Code: 65 48 33 0c 25 28 00 00 00 75 3b c9 c3 4c 89 c2 4c 89 c9 48 89 e6 e8 5f fe ff ff 8b 54 24 18 83 e2 01 74 0b f3 90 8b 54 24 18 <83> e2 01 75 f5 eb ca 8b 05 3b 92 e0 01 85 c0 75 80 0f 0b e9 79 ff
[  524.783649] RSP: e02b:ffffc90000c0fc60 EFLAGS: 00000202
[  524.788857] RAX: 0000000000000000 RBX: ffff888136632540 RCX: 0000000000000040
[  524.791207] RDX: 0000000000000003 RSI: ffffffff82824c60 RDI: ffffffff820107c0
[  524.793610] RBP: ffffc90000c0fca0 R08: 0000000000000000 R09: ffff88813b0007e8
[  524.795737] R10: 0000000000000000 R11: ffffffff8265b6e8 R12: 0000000000000001
[  524.797847] R13: ffffc90000c0fdb0 R14: ffffffff82feb744 R15: ffff88813b7c6800
[  524.800156] FS:  000074e59239e5c0(0000) GS:ffff88813ff00000(0000) knlGS:0000000000000000
[  524.802883] CS:  e030 DS: 0000 ES: 0000 CR0: 0000000080050033
[  524.804661] CR2: 000074e59345a400 CR3: 00000001337e0000 CR4: 0000000000000660
[  524.807097] Call Trace:
[  524.807959]  ? perf_cgroup_attach+0x70/0x70
[  524.809433]  ? _raw_spin_unlock_irqrestore+0x14/0x20
[  524.811179]  ? cgroup_move_task+0x109/0x150
[  524.812623]  task_function_call+0x4d/0x80
[  524.814179]  ? perf_cgroup_switch+0x190/0x190
[  524.815738]  perf_cgroup_attach+0x3f/0x70
[  524.817125]  cgroup_migrate_execute+0x35e/0x420
[  524.818704]  cgroup_attach_task+0x159/0x210
[  524.820158]  ? find_inode_fast.isra.0+0x8e/0xb0
[  524.822055]  cgroup_procs_write+0xd0/0x100
[  524.823692]  cgroup_file_write+0x9b/0x170
[  524.825220]  kernfs_fop_write+0xce/0x1b0
[  524.826598]  vfs_write+0xb6/0x1a0
[  524.827776]  ksys_write+0x67/0xe0
[  524.828969]  do_syscall_64+0x5b/0x1a0
[  524.830083]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  524.831599] RIP: 0033:0x74e5933894b7
[  524.832696] Code: 64 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 00 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
[  524.838570] RSP: 002b:00007ffdfc2df548 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[  524.841100] RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 000074e5933894b7
[  524.843469] RDX: 0000000000000005 RSI: 00007ffdfc2df70a RDI: 0000000000000017
[  524.846368] RBP: 00007ffdfc2df70a R08: 0000000000000000 R09: 00007ffdfc2df590
[  524.848816] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000005
[  524.851009] R13: 00006149cb4f3800 R14: 0000000000000005 R15: 000074e59345a700

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?


* Re: [Xen-devel] Xen crash after S3 suspend - Xen 4.13
  2020-03-23  0:09       ` Marek Marczykowski-Górecki
@ 2020-03-23  8:14         ` Jan Beulich
  2020-09-29 14:27         ` Marek Marczykowski-Górecki
  1 sibling, 0 replies; 26+ messages in thread
From: Jan Beulich @ 2020-03-23  8:14 UTC (permalink / raw)
  To: Marek Marczykowski-Górecki
  Cc: Andrew Cooper, Juergen Gross, xen-devel, Dario Faggioli

On 23.03.2020 01:09, Marek Marczykowski-Górecki wrote:
> But there is more: additionally, in most (all?) cases after resume I've got
> soft lockup in Linux dom0 in smp_call_function_single() - see below. It
> didn't happened before and the only change was Xen 4.13 -> master.

Unless the Linux-side manifestation rings a bell for someone, would
there be any chance you could bisect this to the offending commit?

Jan


* Re: [Xen-devel] Xen crash after S3 suspend - Xen 4.13
  2020-03-23  0:09       ` Marek Marczykowski-Górecki
  2020-03-23  8:14         ` Jan Beulich
@ 2020-09-29 14:27         ` Marek Marczykowski-Górecki
  2020-09-29 15:07           ` Jürgen Groß
  1 sibling, 1 reply; 26+ messages in thread
From: Marek Marczykowski-Górecki @ 2020-09-29 14:27 UTC (permalink / raw)
  To: Juergen Gross; +Cc: Dario Faggioli, Andrew Cooper, xen-devel


On Mon, Mar 23, 2020 at 01:09:49AM +0100, Marek Marczykowski-Górecki wrote:
> On Thu, Mar 19, 2020 at 01:28:10AM +0100, Dario Faggioli wrote:
> > [Adding Juergen]
> > 
> > On Wed, 2020-03-18 at 23:10 +0100, Marek Marczykowski-Górecki wrote:
> > > On Wed, Mar 18, 2020 at 02:50:52PM +0000, Andrew Cooper wrote:
> > > > On 18/03/2020 14:16, Marek Marczykowski-Górecki wrote:
> > > > > Hi,
> > > > > 
> > > > > In my test setup (inside KVM with nested virt enabled), I rather
> > > > > frequently get Xen crash on resume from S3. Full message below.
> > > > > 
> > > > > This is Xen 4.13.0, with some patches, including "sched: fix
> > > > > resuming
> > > > > from S3 with smt=0".
> > > > > 
> > > > > Contrary to the previous issue, this one does not happen always -
> > > > > I
> > > > > would say in about 40% cases on this setup, but very rarely on
> > > > > physical
> > > > > setup.
> > > > > 
> > > > > This is _without_ core scheduling enabled, and also with smt=off.
> > > > > 
> > > > > Do you think it would be any different on xen-unstable? I cat
> > > > > try, but
> > > > > it isn't trivial in this setup, so I'd ask first.
> > > > > 
> > Well, Juergen has fixed quite a few issues.
> > 
> > Most of them where triggering with core-scheduling enabled, and I don't
> > recall any of them which looked similar or related to this.
> > 
> > Still, it's possible that the same issue causes different symptoms, and
> > hence that maybe one of the patches would fix this too.
> 
> I've tested on master (d094e95fb7c), and reproduced exactly the same crash
> (pasted below for the completeness). 
> But there is more: additionally, in most (all?) cases after resume I've got
> soft lockup in Linux dom0 in smp_call_function_single() - see below. It
> didn't happened before and the only change was Xen 4.13 -> master.
> 
> Xen crash:
> 
> (XEN) Assertion 'c2rqd(sched_unit_master(unit)) == svc->rqd' failed at credit2.c:2133

Juergen, any idea about this one? This is also happening on the current
stable-4.14 (28855ebcdbfa).

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?


* Re: [Xen-devel] Xen crash after S3 suspend - Xen 4.13
  2020-09-29 14:27         ` Marek Marczykowski-Górecki
@ 2020-09-29 15:07           ` Jürgen Groß
  2020-09-29 15:16             ` Marek Marczykowski-Górecki
  0 siblings, 1 reply; 26+ messages in thread
From: Jürgen Groß @ 2020-09-29 15:07 UTC (permalink / raw)
  To: Marek Marczykowski-Górecki, Juergen Gross
  Cc: Dario Faggioli, Andrew Cooper, xen-devel

On 29.09.20 16:27, Marek Marczykowski-Górecki wrote:
> On Mon, Mar 23, 2020 at 01:09:49AM +0100, Marek Marczykowski-Górecki wrote:
>> On Thu, Mar 19, 2020 at 01:28:10AM +0100, Dario Faggioli wrote:
>>> [Adding Juergen]
>>>
>>> On Wed, 2020-03-18 at 23:10 +0100, Marek Marczykowski-Górecki wrote:
>>>> On Wed, Mar 18, 2020 at 02:50:52PM +0000, Andrew Cooper wrote:
>>>>> On 18/03/2020 14:16, Marek Marczykowski-Górecki wrote:
>>>>>> Hi,
>>>>>>
>>>>>> In my test setup (inside KVM with nested virt enabled), I rather
>>>>>> frequently get Xen crash on resume from S3. Full message below.
>>>>>>
>>>>>> This is Xen 4.13.0, with some patches, including "sched: fix
>>>>>> resuming
>>>>>> from S3 with smt=0".
>>>>>>
>>>>>> Contrary to the previous issue, this one does not happen always -
>>>>>> I
>>>>>> would say in about 40% cases on this setup, but very rarely on
>>>>>> physical
>>>>>> setup.
>>>>>>
>>>>>> This is _without_ core scheduling enabled, and also with smt=off.
>>>>>>
>>>>>> Do you think it would be any different on xen-unstable? I cat
>>>>>> try, but
>>>>>> it isn't trivial in this setup, so I'd ask first.
>>>>>>
>>> Well, Juergen has fixed quite a few issues.
>>>
>>> Most of them where triggering with core-scheduling enabled, and I don't
>>> recall any of them which looked similar or related to this.
>>>
>>> Still, it's possible that the same issue causes different symptoms, and
>>> hence that maybe one of the patches would fix this too.
>>
>> I've tested on master (d094e95fb7c), and reproduced exactly the same crash
>> (pasted below for the completeness).
>> But there is more: additionally, in most (all?) cases after resume I've got
>> soft lockup in Linux dom0 in smp_call_function_single() - see below. It
>> didn't happened before and the only change was Xen 4.13 -> master.
>>
>> Xen crash:
>>
>> (XEN) Assertion 'c2rqd(sched_unit_master(unit)) == svc->rqd' failed at credit2.c:2133
> 
> Juergen, any idea about this one? This is also happening on the current
> stable-4.14 (28855ebcdbfa).
> 

Oh, sorry, I didn't come back to this issue.

I suspect this is related to stop_machine_run() being called during
suspend(), as I'm seeing very sporadic issues when offlining and then
onlining cpus with core scheduling being active (it seems as if the
dom0 vcpu doing the cpu online activity sometimes is using an old
vcpu state).

I wasn't able to catch the real problem despite having tried lots
of approaches using debug patches.

Recently I suspected the whole problem could be somehow related to
RCU handling, as stop_machine_run() relies on tasklets, which execute
in idle context, and RCU handling is done in idle context, too. So
there might be some kind of use-after-free scenario in case some
memory is freed via RCU despite still being used by a tasklet.
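
To make the suspected pattern concrete, a minimal sketch (plain C with
made-up names, not actual Xen code): one object is still referenced by a
pending "tasklet" while the same object is queued for a deferred "RCU" free,
and nothing in the shared idle context orders the two:

#include <stdio.h>
#include <stdlib.h>

struct work { int cpu; };

/* Both deferred mechanisms run later, from the same (idle vCPU) context. */
static struct work *tasklet_arg;   /* still needed by the pending tasklet */
static struct work *rcu_to_free;   /* queued for a delayed free           */

static void rcu_callback(void)
{
    free(rcu_to_free);             /* the object is gone from here on     */
    rcu_to_free = NULL;
}

static void tasklet_fn(void)
{
    /* If rcu_callback() has already run, this would be a use-after-free. */
    printf("tasklet working for cpu %d\n", tasklet_arg->cpu);
}

int main(void)
{
    struct work *w = malloc(sizeof(*w));

    w->cpu = 1;
    tasklet_arg = w;               /* "tasklet scheduled" with this object */
    rcu_to_free = w;               /* same object "queued for an RCU free" */

    /* Safe ordering: the tasklet finishes before the deferred free runs. */
    tasklet_fn();
    rcu_callback();

    /*
     * The suspected hazard is the opposite ordering - the "RCU" free first,
     * then the tasklet dereferencing freed memory - since nothing enforces
     * which piece of deferred work the idle context processes first.
     */
    return 0;
}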

I "just" need to find some time to verify this suspicion. Any help doing
this would be appreciated. :-)


Juergen



* Re: [Xen-devel] Xen crash after S3 suspend - Xen 4.13
  2020-09-29 15:07           ` Jürgen Groß
@ 2020-09-29 15:16             ` Marek Marczykowski-Górecki
  2020-09-29 15:27               ` Jürgen Groß
  0 siblings, 1 reply; 26+ messages in thread
From: Marek Marczykowski-Górecki @ 2020-09-29 15:16 UTC (permalink / raw)
  To: Jürgen Groß
  Cc: Juergen Gross, Dario Faggioli, Andrew Cooper, xen-devel


On Tue, Sep 29, 2020 at 05:07:11PM +0200, Jürgen Groß wrote:
> On 29.09.20 16:27, Marek Marczykowski-Górecki wrote:
> > On Mon, Mar 23, 2020 at 01:09:49AM +0100, Marek Marczykowski-Górecki wrote:
> > > On Thu, Mar 19, 2020 at 01:28:10AM +0100, Dario Faggioli wrote:
> > > > [Adding Juergen]
> > > > 
> > > > On Wed, 2020-03-18 at 23:10 +0100, Marek Marczykowski-Górecki wrote:
> > > > > On Wed, Mar 18, 2020 at 02:50:52PM +0000, Andrew Cooper wrote:
> > > > > > On 18/03/2020 14:16, Marek Marczykowski-Górecki wrote:
> > > > > > > Hi,
> > > > > > > 
> > > > > > > In my test setup (inside KVM with nested virt enabled), I rather
> > > > > > > frequently get Xen crash on resume from S3. Full message below.
> > > > > > > 
> > > > > > > This is Xen 4.13.0, with some patches, including "sched: fix
> > > > > > > resuming
> > > > > > > from S3 with smt=0".
> > > > > > > 
> > > > > > > Contrary to the previous issue, this one does not happen always -
> > > > > > > I
> > > > > > > would say in about 40% cases on this setup, but very rarely on
> > > > > > > physical
> > > > > > > setup.
> > > > > > > 
> > > > > > > This is _without_ core scheduling enabled, and also with smt=off.
> > > > > > > 
> > > > > > > Do you think it would be any different on xen-unstable? I cat
> > > > > > > try, but
> > > > > > > it isn't trivial in this setup, so I'd ask first.
> > > > > > > 
> > > > Well, Juergen has fixed quite a few issues.
> > > > 
> > > > Most of them where triggering with core-scheduling enabled, and I don't
> > > > recall any of them which looked similar or related to this.
> > > > 
> > > > Still, it's possible that the same issue causes different symptoms, and
> > > > hence that maybe one of the patches would fix this too.
> > > 
> > > I've tested on master (d094e95fb7c), and reproduced exactly the same crash
> > > (pasted below for the completeness).
> > > But there is more: additionally, in most (all?) cases after resume I've got
> > > soft lockup in Linux dom0 in smp_call_function_single() - see below. It
> > > didn't happened before and the only change was Xen 4.13 -> master.
> > > 
> > > Xen crash:
> > > 
> > > (XEN) Assertion 'c2rqd(sched_unit_master(unit)) == svc->rqd' failed at credit2.c:2133
> > 
> > Juergen, any idea about this one? This is also happening on the current
> > stable-4.14 (28855ebcdbfa).
> > 
> 
> Oh, sorry I didn't come back to this issue.
> 
> I suspect this is related to stop_machine_run() being called during
> suspend(), as I'm seeing very sporadic issues when offlining and then
> onlining cpus with core scheduling being active (it seems as if the
> dom0 vcpu doing the cpu online activity sometimes is using an old
> vcpu state).

Note this is a default Xen 4.14 start, so core scheduling is _not_ active:

    (XEN) Brought up 2 CPUs
    (XEN) Scheduling granularity: cpu, 1 CPU per sched-resource
    (XEN) Adding cpu 0 to runqueue 0
    (XEN)  First cpu on runqueue, activating
    (XEN) Adding cpu 1 to runqueue 1
    (XEN)  First cpu on runqueue, activating

> I wasn't able to catch the real problem despite of having tried lots
> of approaches using debug patches.
> 
> Recently I suspected the whole problem could be somehow related to
> RCU handling, as stop_machine_run() is relying on tasklets which are
> executing in idle context, and RCU handling is done in idle context,
> too. So there might be some kind of use after free scenario in case
> some memory is freed via RCU despite it still being used by a tasklet.

That sounds plausible, even though I don't really know this area of Xen.

> I "just" need to find some time to verify this suspicion. Any help doing
> this would be appreciated. :-)

I do have a setup where I can easily-ish reproduce the issue. If there
is some debug patch you'd like me to try, I can do that.

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?


* Re: [Xen-devel] Xen crash after S3 suspend - Xen 4.13
  2020-09-29 15:16             ` Marek Marczykowski-Górecki
@ 2020-09-29 15:27               ` Jürgen Groß
  2021-01-31  2:15                 ` [Xen-devel] Xen crash after S3 suspend - Xen 4.13 and newer Marek Marczykowski-Górecki
  0 siblings, 1 reply; 26+ messages in thread
From: Jürgen Groß @ 2020-09-29 15:27 UTC (permalink / raw)
  To: Marek Marczykowski-Górecki
  Cc: Juergen Gross, Dario Faggioli, Andrew Cooper, xen-devel

On 29.09.20 17:16, Marek Marczykowski-Górecki wrote:
> On Tue, Sep 29, 2020 at 05:07:11PM +0200, Jürgen Groß wrote:
>> On 29.09.20 16:27, Marek Marczykowski-Górecki wrote:
>>> On Mon, Mar 23, 2020 at 01:09:49AM +0100, Marek Marczykowski-Górecki wrote:
>>>> On Thu, Mar 19, 2020 at 01:28:10AM +0100, Dario Faggioli wrote:
>>>>> [Adding Juergen]
>>>>>
>>>>> On Wed, 2020-03-18 at 23:10 +0100, Marek Marczykowski-Górecki wrote:
>>>>>> On Wed, Mar 18, 2020 at 02:50:52PM +0000, Andrew Cooper wrote:
>>>>>>> On 18/03/2020 14:16, Marek Marczykowski-Górecki wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> In my test setup (inside KVM with nested virt enabled), I rather
>>>>>>>> frequently get Xen crash on resume from S3. Full message below.
>>>>>>>>
>>>>>>>> This is Xen 4.13.0, with some patches, including "sched: fix
>>>>>>>> resuming
>>>>>>>> from S3 with smt=0".
>>>>>>>>
>>>>>>>> Contrary to the previous issue, this one does not happen always -
>>>>>>>> I
>>>>>>>> would say in about 40% cases on this setup, but very rarely on
>>>>>>>> physical
>>>>>>>> setup.
>>>>>>>>
>>>>>>>> This is _without_ core scheduling enabled, and also with smt=off.
>>>>>>>>
>>>>>>>> Do you think it would be any different on xen-unstable? I cat
>>>>>>>> try, but
>>>>>>>> it isn't trivial in this setup, so I'd ask first.
>>>>>>>>
>>>>> Well, Juergen has fixed quite a few issues.
>>>>>
>>>>> Most of them where triggering with core-scheduling enabled, and I don't
>>>>> recall any of them which looked similar or related to this.
>>>>>
>>>>> Still, it's possible that the same issue causes different symptoms, and
>>>>> hence that maybe one of the patches would fix this too.
>>>>
>>>> I've tested on master (d094e95fb7c), and reproduced exactly the same crash
>>>> (pasted below for the completeness).
>>>> But there is more: additionally, in most (all?) cases after resume I've got
>>>> soft lockup in Linux dom0 in smp_call_function_single() - see below. It
>>>> didn't happened before and the only change was Xen 4.13 -> master.
>>>>
>>>> Xen crash:
>>>>
>>>> (XEN) Assertion 'c2rqd(sched_unit_master(unit)) == svc->rqd' failed at credit2.c:2133
>>>
>>> Juergen, any idea about this one? This is also happening on the current
>>> stable-4.14 (28855ebcdbfa).
>>>
>>
>> Oh, sorry I didn't come back to this issue.
>>
>> I suspect this is related to stop_machine_run() being called during
>> suspend(), as I'm seeing very sporadic issues when offlining and then
>> onlining cpus with core scheduling being active (it seems as if the
>> dom0 vcpu doing the cpu online activity sometimes is using an old
>> vcpu state).
> 
> Note this is default Xen 4.14 start, so core scheduling is _not_ active:

The similarity in the two failure cases is that multiple cpus are
affected by the operations during stop_machine_run().

> 
>      (XEN) Brought up 2 CPUs
>      (XEN) Scheduling granularity: cpu, 1 CPU per sched-resource
>      (XEN) Adding cpu 0 to runqueue 0
>      (XEN)  First cpu on runqueue, activating
>      (XEN) Adding cpu 1 to runqueue 1
>      (XEN)  First cpu on runqueue, activating
> 
>> I wasn't able to catch the real problem despite of having tried lots
>> of approaches using debug patches.
>>
>> Recently I suspected the whole problem could be somehow related to
>> RCU handling, as stop_machine_run() is relying on tasklets which are
>> executing in idle context, and RCU handling is done in idle context,
>> too. So there might be some kind of use after free scenario in case
>> some memory is freed via RCU despite it still being used by a tasklet.
> 
> That sounds plausible, even though I don't really know this area of Xen.
> 
>> I "just" need to find some time to verify this suspicion. Any help doing
>> this would be appreciated. :-)
> 
> I do have a setup where I can easily-ish reproduce the issue. If there
> is some debug patch you'd like me to try, I can do that.

Thanks. I might come back to that offer, as you are seeing a crash, which
will be much easier to analyze. Catching my error case is much harder, as
it surfaces some time after the real problem in a non-destructive way
(usually I'm seeing a program fail to load a library even though it has
just done its job via exactly that library).


Juergen



* Re: [Xen-devel] Xen crash after S3 suspend - Xen 4.13 and newer
  2020-09-29 15:27               ` Jürgen Groß
@ 2021-01-31  2:15                 ` Marek Marczykowski-Górecki
  2021-10-09 16:28                   ` Marek Marczykowski-Górecki
  0 siblings, 1 reply; 26+ messages in thread
From: Marek Marczykowski-Górecki @ 2021-01-31  2:15 UTC (permalink / raw)
  To: Jürgen Groß
  Cc: Juergen Gross, Dario Faggioli, Jan Beulich, Andrew Cooper, xen-devel


On Tue, Sep 29, 2020 at 05:27:48PM +0200, Jürgen Groß wrote:
> On 29.09.20 17:16, Marek Marczykowski-Górecki wrote:
> > On Tue, Sep 29, 2020 at 05:07:11PM +0200, Jürgen Groß wrote:
> > > On 29.09.20 16:27, Marek Marczykowski-Górecki wrote:
> > > > On Mon, Mar 23, 2020 at 01:09:49AM +0100, Marek Marczykowski-Górecki wrote:
> > > > > On Thu, Mar 19, 2020 at 01:28:10AM +0100, Dario Faggioli wrote:
> > > > > > [Adding Juergen]
> > > > > > 
> > > > > > On Wed, 2020-03-18 at 23:10 +0100, Marek Marczykowski-Górecki wrote:
> > > > > > > On Wed, Mar 18, 2020 at 02:50:52PM +0000, Andrew Cooper wrote:
> > > > > > > > On 18/03/2020 14:16, Marek Marczykowski-Górecki wrote:
> > > > > > > > > Hi,
> > > > > > > > > 
> > > > > > > > > In my test setup (inside KVM with nested virt enabled), I rather
> > > > > > > > > frequently get Xen crash on resume from S3. Full message below.
> > > > > > > > > 
> > > > > > > > > This is Xen 4.13.0, with some patches, including "sched: fix
> > > > > > > > > resuming
> > > > > > > > > from S3 with smt=0".
> > > > > > > > > 
> > > > > > > > > Contrary to the previous issue, this one does not happen always -
> > > > > > > > > I
> > > > > > > > > would say in about 40% cases on this setup, but very rarely on
> > > > > > > > > physical
> > > > > > > > > setup.
> > > > > > > > > 
> > > > > > > > > This is _without_ core scheduling enabled, and also with smt=off.
> > > > > > > > > 
> > > > > > > > > Do you think it would be any different on xen-unstable? I cat
> > > > > > > > > try, but
> > > > > > > > > it isn't trivial in this setup, so I'd ask first.
> > > > > > > > > 
> > > > > > Well, Juergen has fixed quite a few issues.
> > > > > > 
> > > > > > Most of them where triggering with core-scheduling enabled, and I don't
> > > > > > recall any of them which looked similar or related to this.
> > > > > > 
> > > > > > Still, it's possible that the same issue causes different symptoms, and
> > > > > > hence that maybe one of the patches would fix this too.
> > > > > 
> > > > > I've tested on master (d094e95fb7c), and reproduced exactly the same crash
> > > > > (pasted below for the completeness).
> > > > > But there is more: additionally, in most (all?) cases after resume I've got
> > > > > soft lockup in Linux dom0 in smp_call_function_single() - see below. It
> > > > > didn't happened before and the only change was Xen 4.13 -> master.
> > > > > 
> > > > > Xen crash:
> > > > > 
> > > > > (XEN) Assertion 'c2rqd(sched_unit_master(unit)) == svc->rqd' failed at credit2.c:2133
> > > > 
> > > > Juergen, any idea about this one? This is also happening on the current
> > > > stable-4.14 (28855ebcdbfa).
> > > > 
> > > 
> > > Oh, sorry I didn't come back to this issue.
> > > 
> > > I suspect this is related to stop_machine_run() being called during
> > > suspend(), as I'm seeing very sporadic issues when offlining and then
> > > onlining cpus with core scheduling being active (it seems as if the
> > > dom0 vcpu doing the cpu online activity sometimes is using an old
> > > vcpu state).
> > 
> > Note this is default Xen 4.14 start, so core scheduling is _not_ active:
> 
> The similarity in the two failure cases is that multiple cpus are
> affected by the operations during stop_machine_run().
> 
> > 
> >      (XEN) Brought up 2 CPUs
> >      (XEN) Scheduling granularity: cpu, 1 CPU per sched-resource
> >      (XEN) Adding cpu 0 to runqueue 0
> >      (XEN)  First cpu on runqueue, activating
> >      (XEN) Adding cpu 1 to runqueue 1
> >      (XEN)  First cpu on runqueue, activating
> > 
> > > I wasn't able to catch the real problem despite of having tried lots
> > > of approaches using debug patches.
> > > 
> > > Recently I suspected the whole problem could be somehow related to
> > > RCU handling, as stop_machine_run() is relying on tasklets which are
> > > executing in idle context, and RCU handling is done in idle context,
> > > too. So there might be some kind of use after free scenario in case
> > > some memory is freed via RCU despite it still being used by a tasklet.
> > 
> > That sounds plausible, even though I don't really know this area of Xen.
> > 
> > > I "just" need to find some time to verify this suspicion. Any help doing
> > > this would be appreciated. :-)
> > 
> > I do have a setup where I can easily-ish reproduce the issue. If there
> > is some debug patch you'd like me to try, I can do that.
> 
> Thanks. I might come back to that offer as you are seeing a crash which
> will be much easier to analyze. Catching my error case is much harder as
> it surfaces some time after the real problem in a non destructive way
> (usually I'm seeing a failure to load a library in the program which
> just did its job via exactly the library claiming not being loadable).

Hi,

I'm resurrecting this thread as it was recently mentioned elsewhere. I
can still reproduce the issue on the recent staging branch (9dc687f155).

It fails after the first resume (not always, but frequently enough to
debug it). At least one guest needs to be running - with just (PV) dom0
the crash doesn't happen (at least for the ~8 times in a row I tried).
If the first resume works, the second will (almost?) always fail, but
with different symptoms - dom0 kernel lockups (of at least some of its
vcpus). I haven't debugged this one at all yet.

Any help will be appreciated; I can apply debug patches, change the
configuration, etc.

In the meantime I tried to collect more info with the patch below, but I
can't make much of it. Here is what I've got:

(XEN) c2r(sched_unit_master(unit)) = 1, c2rqd(sched_unit_master(unit)) = ffff8301ba6d0230, sched_unit_master(unit) = 1, svc->rqd->id = 0, svc->rqd = ffff8301ba6d0050, 
(XEN) Assertion 'c2rqd(sched_unit_master(unit)) == svc->rqd' failed at credit2.c:2294
(XEN) ----[ Xen-4.15-unstable  x86_64  debug=y  Not tainted ]----
(XEN) CPU:    0
(XEN) RIP:    e008:[<ffff82d040246642>] credit2.c#csched2_unit_wake+0x205/0x207
(XEN) RFLAGS: 0000000000010083   CONTEXT: hypervisor
(XEN) rax: ffff8301ba6d0230   rbx: ffff8301ba6b7b50   rcx: 0000000000000001
(XEN) rdx: 000000317a159000   rsi: 000000000000000a   rdi: ffff82d0404686b8
(XEN) rbp: ffff8300be877df0   rsp: ffff8300be877dd0   r8:  0000000000000000
(XEN) r9:  0000000000000004   r10: 0000000000000001   r11: 0000000000000001
(XEN) r12: ffff8301ba6b7be0   r13: ffff82d040453d00   r14: 0000000000000001
(XEN) r15: 0000013446313336   cr0: 000000008005003b   cr4: 00000000000006e0
(XEN) cr3: 00000000be866000   cr2: 0000000000000000
(XEN) fsb: 00000000000b8000   gsb: 0000000000000000   gss: 0000000000000000
(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
(XEN) Xen code around <ffff82d040246642> (credit2.c#csched2_unit_wake+0x205/0x207):
(XEN)  f3 00 00 e9 7a ff ff ff <0f> 0b 55 48 89 e5 41 57 41 56 41 55 41 54 53 48
(XEN) Xen stack trace from rsp=ffff8300be877dd0:
(XEN)    ffff8301ba6a2000 ffff8301ba6b7b50 ffff8301ba6b7b50 ffff8301ba6d0230
(XEN)    ffff8300be877e40 ffff82d04024f968 0000000000000202 ffff8301ba6d0230
(XEN)    ffff82d0405a05c0 ffff83010a6f1fd0 ffff8301ba6a2000 0000000000000000
(XEN)    0000000000000000 ffff82d0405a05c0 ffff8300be877e50 ffff82d04020548a
(XEN)    ffff8300be877e70 ffff82d04020552b ffff8301ba6a21b8 ffff82d0405801b0
(XEN)    ffff8300be877e88 ffff82d04022ba4e ffff82d0405801a0 ffff8300be877eb8
(XEN)    ffff82d04022bd2f 0000000000000000 0000000000007fff ffff82d040586f00
(XEN)    ffff82d0405801b0 ffff8300be877ef0 ffff82d0402f0412 ffff82d0402f039e
(XEN)    ffff8301ba6a2000 ffff8301ba708000 0000000000000000 ffff8301ba6bb000
(XEN)    ffff8300be877e18 ffff8881072a76e0 ffffc90003f3ff08 0000000000000003
(XEN)    0000000000000000 0000000000002401 ffffffff827594c8 0000000000000246
(XEN)    0000000000000003 0000000000002401 0000000000002401 0000000000000000
(XEN)    ffffffff810010ea 0000000000002401 0000000000000010 deadbeefdeadf00d
(XEN)    0000010000000000 ffffffff810010ea 000000000000e033 0000000000000246
(XEN)    ffffc90003f3fcb8 000000000000e02b 0000000000000000 0000000000000000
(XEN)    0000000000000000 0000000000000000 0000e01000000000 ffff8301ba6fb000
(XEN)    0000000000000000 00000000000006e0 0000000000000000 0000000000000000
(XEN)    0000000000000000 0000000000000000
(XEN) Xen call trace:
(XEN)    [<ffff82d040246642>] R credit2.c#csched2_unit_wake+0x205/0x207
(XEN)    [<ffff82d04024f968>] F vcpu_wake+0x105/0x52c
(XEN)    [<ffff82d04020548a>] F vcpu_unpause+0x13/0x15
(XEN)    [<ffff82d04020552b>] F domain.c#continue_hypercall_tasklet_handler+0x9f/0xb9
(XEN)    [<ffff82d04022ba4e>] F tasklet.c#do_tasklet_work+0x76/0xa9
(XEN)    [<ffff82d04022bd2f>] F do_tasklet+0x58/0x8a
(XEN)    [<ffff82d0402f0412>] F domain.c#idle_loop+0x74/0xdd
(XEN) 
(XEN) 
(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) Assertion 'c2rqd(sched_unit_master(unit)) == svc->rqd' failed at credit2.c:2294

And the patch:

diff --git a/xen/common/sched/credit2.c b/xen/common/sched/credit2.c
index eb5e5a78c5e7..475f0acf2dc5 100644
--- a/xen/common/sched/credit2.c
+++ b/xen/common/sched/credit2.c
@@ -2268,10 +2268,28 @@ csched2_unit_wake(const struct scheduler *ops, struct sched_unit *unit)
     }
 
     /* Add into the new runqueue if necessary */
-    if ( svc->rqd == NULL )
+    if ( svc->rqd == NULL ) {
+        printk(XENLOG_DEBUG "assigning cpu %d to runqueue\n", cpu);
         runq_assign(unit);
-    else
+        printk(XENLOG_DEBUG "assigned cpu %d to runqueue %d\n",
+                cpu,
+                svc->rqd ? svc->rqd->id : -1);
+    } else {
+        if (c2rqd(sched_unit_master(unit)) != svc->rqd) {
+            printk(XENLOG_DEBUG "c2r(sched_unit_master(unit)) = %d, "
+                                "c2rqd(sched_unit_master(unit)) = %p, "
+                                "sched_unit_master(unit) = %d, "
+                                "svc->rqd->id = %d, "
+                                "svc->rqd = %p, "
+                                "\n",
+                c2r(sched_unit_master(unit)),
+                c2rqd(sched_unit_master(unit)),
+                sched_unit_master(unit),
+                svc->rqd->id,
+                svc->rqd);
+        }
         ASSERT(c2rqd(sched_unit_master(unit)) == svc->rqd );
+    }
 
     now = NOW();
 



-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab


* Re: [Xen-devel] Xen crash after S3 suspend - Xen 4.13 and newer
  2021-01-31  2:15                 ` [Xen-devel] Xen crash after S3 suspend - Xen 4.13 and newer Marek Marczykowski-Górecki
@ 2021-10-09 16:28                   ` Marek Marczykowski-Górecki
  2022-08-21 16:14                     ` Marek Marczykowski-Górecki
  0 siblings, 1 reply; 26+ messages in thread
From: Marek Marczykowski-Górecki @ 2021-10-09 16:28 UTC (permalink / raw)
  To: Jürgen Groß
  Cc: Juergen Gross, Dario Faggioli, Jan Beulich, Andrew Cooper, xen-devel


On Sun, Jan 31, 2021 at 03:15:30AM +0100, Marek Marczykowski-Górecki wrote:
> On Tue, Sep 29, 2020 at 05:27:48PM +0200, Jürgen Groß wrote:
> > On 29.09.20 17:16, Marek Marczykowski-Górecki wrote:
> > > On Tue, Sep 29, 2020 at 05:07:11PM +0200, Jürgen Groß wrote:
> > > > On 29.09.20 16:27, Marek Marczykowski-Górecki wrote:
> > > > > On Mon, Mar 23, 2020 at 01:09:49AM +0100, Marek Marczykowski-Górecki wrote:
> > > > > > On Thu, Mar 19, 2020 at 01:28:10AM +0100, Dario Faggioli wrote:
> > > > > > > [Adding Juergen]
> > > > > > > 
> > > > > > > On Wed, 2020-03-18 at 23:10 +0100, Marek Marczykowski-Górecki wrote:
> > > > > > > > On Wed, Mar 18, 2020 at 02:50:52PM +0000, Andrew Cooper wrote:
> > > > > > > > > On 18/03/2020 14:16, Marek Marczykowski-Górecki wrote:
> > > > > > > > > > Hi,
> > > > > > > > > > 
> > > > > > > > > > In my test setup (inside KVM with nested virt enabled), I rather
> > > > > > > > > > frequently get Xen crash on resume from S3. Full message below.
> > > > > > > > > > 
> > > > > > > > > > This is Xen 4.13.0, with some patches, including "sched: fix
> > > > > > > > > > resuming
> > > > > > > > > > from S3 with smt=0".
> > > > > > > > > > 
> > > > > > > > > > Contrary to the previous issue, this one does not happen always -
> > > > > > > > > > I
> > > > > > > > > > would say in about 40% cases on this setup, but very rarely on
> > > > > > > > > > physical
> > > > > > > > > > setup.
> > > > > > > > > > 
> > > > > > > > > > This is _without_ core scheduling enabled, and also with smt=off.
> > > > > > > > > > 
> > > > > > > > > > Do you think it would be any different on xen-unstable? I cat
> > > > > > > > > > try, but
> > > > > > > > > > it isn't trivial in this setup, so I'd ask first.
> > > > > > > > > > 
> > > > > > > Well, Juergen has fixed quite a few issues.
> > > > > > > 
> > > > > > > Most of them where triggering with core-scheduling enabled, and I don't
> > > > > > > recall any of them which looked similar or related to this.
> > > > > > > 
> > > > > > > Still, it's possible that the same issue causes different symptoms, and
> > > > > > > hence that maybe one of the patches would fix this too.
> > > > > > 
> > > > > > I've tested on master (d094e95fb7c), and reproduced exactly the same crash
> > > > > > (pasted below for the completeness).
> > > > > > But there is more: additionally, in most (all?) cases after resume I've got
> > > > > > soft lockup in Linux dom0 in smp_call_function_single() - see below. It
> > > > > > didn't happened before and the only change was Xen 4.13 -> master.
> > > > > > 
> > > > > > Xen crash:
> > > > > > 
> > > > > > (XEN) Assertion 'c2rqd(sched_unit_master(unit)) == svc->rqd' failed at credit2.c:2133
> > > > > 
> > > > > Juergen, any idea about this one? This is also happening on the current
> > > > > stable-4.14 (28855ebcdbfa).
> > > > > 
> > > > 
> > > > Oh, sorry I didn't come back to this issue.
> > > > 
> > > > I suspect this is related to stop_machine_run() being called during
> > > > suspend(), as I'm seeing very sporadic issues when offlining and then
> > > > onlining cpus with core scheduling being active (it seems as if the
> > > > dom0 vcpu doing the cpu online activity sometimes is using an old
> > > > vcpu state).
> > > 
> > > Note this is default Xen 4.14 start, so core scheduling is _not_ active:
> > 
> > The similarity in the two failure cases is that multiple cpus are
> > affected by the operations during stop_machine_run().
> > 
> > > 
> > >      (XEN) Brought up 2 CPUs
> > >      (XEN) Scheduling granularity: cpu, 1 CPU per sched-resource
> > >      (XEN) Adding cpu 0 to runqueue 0
> > >      (XEN)  First cpu on runqueue, activating
> > >      (XEN) Adding cpu 1 to runqueue 1
> > >      (XEN)  First cpu on runqueue, activating
> > > 
> > > > I wasn't able to catch the real problem despite of having tried lots
> > > > of approaches using debug patches.
> > > > 
> > > > Recently I suspected the whole problem could be somehow related to
> > > > RCU handling, as stop_machine_run() is relying on tasklets which are
> > > > executing in idle context, and RCU handling is done in idle context,
> > > > too. So there might be some kind of use after free scenario in case
> > > > some memory is freed via RCU despite it still being used by a tasklet.
> > > 
> > > That sounds plausible, even though I don't really know this area of Xen.
> > > 
> > > > I "just" need to find some time to verify this suspicion. Any help doing
> > > > this would be appreciated. :-)
> > > 
> > > I do have a setup where I can easily-ish reproduce the issue. If there
> > > is some debug patch you'd like me to try, I can do that.
> > 
> > Thanks. I might come back to that offer as you are seeing a crash which
> > will be much easier to analyze. Catching my error case is much harder as
> > it surfaces some time after the real problem in a non destructive way
> > (usually I'm seeing a failure to load a library in the program which
> > just did its job via exactly the library claiming not being loadable).
> 
> Hi,
> 
> I'm resurrecting this thread as it was recently mentioned elsewhere. I
> can still reproduce the issue on the recent staging branch (9dc687f155).
> 
> It fails after the first resume (not always, but frequent enough to
> debug it). At least one guest needs to be running - with just (PV) dom0
> the crash doesn't happen (at least for the ~8 times in a row I tried).
> If the first resume works, the second (almost?) always will fail but
> with a different symptoms - dom0 kernel lockups (at least some of its
> vcpus). I haven't debugged this one yet at all.
> 
> Any help will be appreciated, I can apply some debug patches, change
> configuration etc.

This still happens on 4.14.3. Maybe it is related to freeing percpu
areas, as it caused other issues with suspend too? Just a thought...

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Xen-devel] Xen crash after S3 suspend - Xen 4.13 and newer
  2021-10-09 16:28                   ` Marek Marczykowski-Górecki
@ 2022-08-21 16:14                     ` Marek Marczykowski-Górecki
  2022-08-22  9:53                       ` Jan Beulich
  2022-08-22 15:34                       ` Juergen Gross
  0 siblings, 2 replies; 26+ messages in thread
From: Marek Marczykowski-Górecki @ 2022-08-21 16:14 UTC (permalink / raw)
  To: Jürgen Groß
  Cc: Juergen Gross, Dario Faggioli, Jan Beulich, Andrew Cooper, xen-devel

[-- Attachment #1: Type: text/plain, Size: 14708 bytes --]

On Sat, Oct 09, 2021 at 06:28:17PM +0200, Marek Marczykowski-Górecki wrote:
> On Sun, Jan 31, 2021 at 03:15:30AM +0100, Marek Marczykowski-Górecki wrote:
> > On Tue, Sep 29, 2020 at 05:27:48PM +0200, Jürgen Groß wrote:
> > > On 29.09.20 17:16, Marek Marczykowski-Górecki wrote:
> > > > On Tue, Sep 29, 2020 at 05:07:11PM +0200, Jürgen Groß wrote:
> > > > > On 29.09.20 16:27, Marek Marczykowski-Górecki wrote:
> > > > > > On Mon, Mar 23, 2020 at 01:09:49AM +0100, Marek Marczykowski-Górecki wrote:
> > > > > > > On Thu, Mar 19, 2020 at 01:28:10AM +0100, Dario Faggioli wrote:
> > > > > > > > [Adding Juergen]
> > > > > > > > 
> > > > > > > > On Wed, 2020-03-18 at 23:10 +0100, Marek Marczykowski-Górecki wrote:
> > > > > > > > > On Wed, Mar 18, 2020 at 02:50:52PM +0000, Andrew Cooper wrote:
> > > > > > > > > > On 18/03/2020 14:16, Marek Marczykowski-Górecki wrote:
> > > > > > > > > > > Hi,
> > > > > > > > > > > 
> > > > > > > > > > > In my test setup (inside KVM with nested virt enabled), I rather
> > > > > > > > > > > frequently get Xen crash on resume from S3. Full message below.
> > > > > > > > > > > 
> > > > > > > > > > > This is Xen 4.13.0, with some patches, including "sched: fix
> > > > > > > > > > > resuming
> > > > > > > > > > > from S3 with smt=0".
> > > > > > > > > > > 
> > > > > > > > > > > Contrary to the previous issue, this one does not happen always -
> > > > > > > > > > > I
> > > > > > > > > > > would say in about 40% cases on this setup, but very rarely on
> > > > > > > > > > > physical
> > > > > > > > > > > setup.
> > > > > > > > > > > 
> > > > > > > > > > > This is _without_ core scheduling enabled, and also with smt=off.
> > > > > > > > > > > 
> > > > > > > > > > > Do you think it would be any different on xen-unstable? I cat
> > > > > > > > > > > try, but
> > > > > > > > > > > it isn't trivial in this setup, so I'd ask first.
> > > > > > > > > > > 
> > > > > > > > Well, Juergen has fixed quite a few issues.
> > > > > > > > 
> > > > > > > > Most of them where triggering with core-scheduling enabled, and I don't
> > > > > > > > recall any of them which looked similar or related to this.
> > > > > > > > 
> > > > > > > > Still, it's possible that the same issue causes different symptoms, and
> > > > > > > > hence that maybe one of the patches would fix this too.
> > > > > > > 
> > > > > > > I've tested on master (d094e95fb7c), and reproduced exactly the same crash
> > > > > > > (pasted below for the completeness).
> > > > > > > But there is more: additionally, in most (all?) cases after resume I've got
> > > > > > > soft lockup in Linux dom0 in smp_call_function_single() - see below. It
> > > > > > > didn't happened before and the only change was Xen 4.13 -> master.
> > > > > > > 
> > > > > > > Xen crash:
> > > > > > > 
> > > > > > > (XEN) Assertion 'c2rqd(sched_unit_master(unit)) == svc->rqd' failed at credit2.c:2133
> > > > > > 
> > > > > > Juergen, any idea about this one? This is also happening on the current
> > > > > > stable-4.14 (28855ebcdbfa).
> > > > > > 
> > > > > 
> > > > > Oh, sorry I didn't come back to this issue.
> > > > > 
> > > > > I suspect this is related to stop_machine_run() being called during
> > > > > suspend(), as I'm seeing very sporadic issues when offlining and then
> > > > > onlining cpus with core scheduling being active (it seems as if the
> > > > > dom0 vcpu doing the cpu online activity sometimes is using an old
> > > > > vcpu state).
> > > > 
> > > > Note this is default Xen 4.14 start, so core scheduling is _not_ active:
> > > 
> > > The similarity in the two failure cases is that multiple cpus are
> > > affected by the operations during stop_machine_run().
> > > 
> > > > 
> > > >      (XEN) Brought up 2 CPUs
> > > >      (XEN) Scheduling granularity: cpu, 1 CPU per sched-resource
> > > >      (XEN) Adding cpu 0 to runqueue 0
> > > >      (XEN)  First cpu on runqueue, activating
> > > >      (XEN) Adding cpu 1 to runqueue 1
> > > >      (XEN)  First cpu on runqueue, activating
> > > > 
> > > > > I wasn't able to catch the real problem despite of having tried lots
> > > > > of approaches using debug patches.
> > > > > 
> > > > > Recently I suspected the whole problem could be somehow related to
> > > > > RCU handling, as stop_machine_run() is relying on tasklets which are
> > > > > executing in idle context, and RCU handling is done in idle context,
> > > > > too. So there might be some kind of use after free scenario in case
> > > > > some memory is freed via RCU despite it still being used by a tasklet.
> > > > 
> > > > That sounds plausible, even though I don't really know this area of Xen.
> > > > 
> > > > > I "just" need to find some time to verify this suspicion. Any help doing
> > > > > this would be appreciated. :-)
> > > > 
> > > > I do have a setup where I can easily-ish reproduce the issue. If there
> > > > is some debug patch you'd like me to try, I can do that.
> > > 
> > > Thanks. I might come back to that offer as you are seeing a crash which
> > > will be much easier to analyze. Catching my error case is much harder as
> > > it surfaces some time after the real problem in a non destructive way
> > > (usually I'm seeing a failure to load a library in the program which
> > > just did its job via exactly the library claiming not being loadable).
> > 
> > Hi,
> > 
> > I'm resurrecting this thread as it was recently mentioned elsewhere. I
> > can still reproduce the issue on the recent staging branch (9dc687f155).
> > 
> > It fails after the first resume (not always, but frequent enough to
> > debug it). At least one guest needs to be running - with just (PV) dom0
> > the crash doesn't happen (at least for the ~8 times in a row I tried).
> > If the first resume works, the second (almost?) always will fail but
> > with a different symptoms - dom0 kernel lockups (at least some of its
> > vcpus). I haven't debugged this one yet at all.
> > 
> > Any help will be appreciated, I can apply some debug patches, change
> > configuration etc.
> 
> This still happens on 4.14.3. Maybe it is related to freeing percpu
> areas, as it caused other issues with suspend too? Just a thought...

I have reproduced this on current staging(*), and I can reproduce it
reliably. I have also got what I believe is a closely related crash with
the credit1 scheduler.

(*) It isn't plain staging, it's one with my xhci console patches on
top, including an attempt to make it survive S3. I believe the only
relevant part there is sticking set_timer() into the console resume path
(or just having a timer with a rather short delay registered). The actual
tree is at https://github.com/marmarek/xen/tree/master-xue2-debug, and it
includes quite a lot of debug prints and debug hacks.
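
For illustration only, the triggering piece is roughly of this shape (the
xhci console driver and its resume hook are out of tree, so the names below
are placeholders I made up; only struct timer, set_timer(), NOW() and
MILLISECS() are real Xen interfaces):

    /* Placeholder names - this only shows the shape of the trigger: a timer
     * with a short expiry being (re-)armed around the resume path. */
    static struct timer xhci_poll_timer;      /* hypothetical */

    static void xhci_console_resume(void)     /* hypothetical resume hook */
    {
        /* Re-arm a short-delay timer right after resume; having such a timer
         * registered seems to be enough to hit the scheduler assertions. */
        set_timer(&xhci_poll_timer, NOW() + MILLISECS(1));
    }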

Specific crash with credit2:

    (XEN) Assertion 'sched_unit_master(currunit) == cpu' failed at common/sched/credit.c:928
    (XEN) ----[ Xen-4.17-unstable  x86_64  debug=y  Tainted:   C    ]----
    (XEN) CPU:    0
    (XEN) RIP:    e008:[<ffff82d0402434bf>] credit.c#csched_tick+0x2d4/0x494
    (XEN) RFLAGS: 0000000000010202   CONTEXT: hypervisor (d0v4)
    (XEN) rax: ffff82d0405c4298   rbx: 0000000000000002   rcx: 0000000000000002
    (XEN) rdx: ffff8302517f64d0   rsi: ffff8302515c0fc0   rdi: 0000000000000002
    (XEN) rbp: ffff830256227e38   rsp: ffff830256227de0   r8:  0000000000000004
    (XEN) r9:  ffff8302517ac820   r10: ffff830251745068   r11: 00000088cb734887
    (XEN) r12: ffff83025174de50   r13: ffff8302515c0fa0   r14: ffff83025174df40
    (XEN) r15: ffff8302515c0cc0   cr0: 0000000080050033   cr4: 0000000000372660
    (XEN) cr3: 00000001bacbd000   cr2: 000077e5ec02a318
    (XEN) fsb: 000077e5fe533700   gsb: ffff888255700000   gss: 0000000000000000
    (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
    (XEN) Xen code around <ffff82d0402434bf> (credit.c#csched_tick+0x2d4/0x494):
    (XEN)  01 00 00 e9 2a 01 00 00 <0f> 0b 0f 0b 0f 0b 48 8b 41 20 0f b7 00 89 45 cc
    (XEN) Xen stack trace from rsp=ffff830256227de0:
    (XEN)    ffff830256227fff 0000000000000000 0000000256227e10 ffff82d04035be90
    (XEN)    ffff830256227ef8 ffff830251745000 ffff82d0405c3280 ffff82d0402431eb
    (XEN)    0000000000000002 00000088c9ba9534 0000000000000000 ffff830256227e60
    (XEN)    ffff82d04022ee53 ffff82d0405c3280 ffff8302963e1320 ffff8302515c0fc0
    (XEN)    ffff830256227ea0 ffff82d04022f73f ffff830256227e80 ffff82d0405c9f00
    (XEN)    ffffffffffffffff ffff82d0405c9f00 ffff830256227fff 0000000000000000
    (XEN)    ffff830256227ed8 ffff82d04022d26c ffff830251745000 0000000000000000
    (XEN)    0000000000000000 ffff830256227fff 0000000000000000 ffff830256227ee8
    (XEN)    ffff82d04022d2ff 00007cfda9dd80e7 ffff82d0402f03c6 ffff88810c005c00
    (XEN)    0000000000000031 0000000000000100 00000000fffffe00 0000000000000031
    (XEN)    0000000000000031 ffffffff82d45d28 0000000000000e2e 0000000000000000
    (XEN)    0000000000000032 00000000ffffef31 0000000000000000 ffff88812244a700
    (XEN)    0000000000000005 ffff88812244a780 000000fa00000000 ffffffff818db55f
    (XEN)    000000000000e033 0000000000000246 ffffc900409b7c50 000000000000e02b
    (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
    (XEN)    0000e01000000000 ffff830251745000 0000000000000000 0000000000372660
    (XEN)    0000000000000000 800000025620b002 000e030300000001 0000000000000000
    (XEN) Xen call trace:
    (XEN)    [<ffff82d0402434bf>] R credit.c#csched_tick+0x2d4/0x494
    (XEN)    [<ffff82d04022ee53>] F timer.c#execute_timer+0x45/0x5c
    (XEN)    [<ffff82d04022f73f>] F timer.c#timer_softirq_action+0x71/0x278
    (XEN)    [<ffff82d04022d26c>] F softirq.c#__do_softirq+0x94/0xbe
    (XEN)    [<ffff82d04022d2ff>] F do_softirq+0x13/0x15
    (XEN)    [<ffff82d0402f03c6>] F x86_64/entry.S#process_softirqs+0x6/0x20
    (XEN) 
    (XEN) 
    (XEN) ****************************************
    (XEN) Panic on CPU 0:
    (XEN) Assertion 'sched_unit_master(currunit) == cpu' failed at common/sched/credit.c:928
    (XEN) ****************************************

Specific crash with credit1:

    (XEN) ASSERT FAIL! cpu 4, sched_unit_master 0, currunit id 4
    (XEN) Assertion 'sched_unit_master(currunit) == cpu' failed at common/sched/credit.c:936
    (XEN) ----[ Xen-4.17-unstable  x86_64  debug=y  Tainted:   C    ]----
    (XEN) CPU:    0
    (XEN) RIP:    e008:[<ffff82d04024355c>] credit.c#csched_tick+0x2f7/0x4b7
    (XEN) RFLAGS: 0000000000010202   CONTEXT: hypervisor (d0v4)
    (XEN) rax: ffff8302517f64d0   rbx: 0000000000000004   rcx: 0000000000000000
    (XEN) rdx: ffff830256227fff   rsi: 000000000000000a   rdi: ffff82d0404786b8
    (XEN) rbp: ffff830256227e38   rsp: ffff830256227de0   r8:  0000000000000087
    (XEN) r9:  0000000000000001   r10: 0000000000000001   r11: 0000000000000004
    (XEN) r12: ffff83025174de50   r13: ffff830251778100   r14: ffff83025174df40
    (XEN) r15: ffff8302515c0cc0   cr0: 0000000080050033   cr4: 0000000000372660
    (XEN) cr3: 000000024ac10000   cr2: 0000000000000000
    (XEN) fsb: 0000000000000000   gsb: ffff888255700000   gss: 0000000000000000
    (XEN) ds: 002b   es: 002b   fs: 0000   gs: 0000   ss: 0000   cs: e008
    (XEN) Xen code around <ffff82d04024355c> (credit.c#csched_tick+0x2f7/0x4b7):
    (XEN)  58 70 0f 84 bf fe ff ff <0f> 0b 0f 0b 0f 0b 48 8b 40 20 0f b7 00 89 45 cc
    (XEN) Xen stack trace from rsp=ffff830256227de0:
    (XEN)    000000000046692a ffff8302515c0bc0 000000044025ae27 002191c0517f64e8
    (XEN)    ffff82d0405c33e0 ffff8302517c8590 ffff82d0405c3280 ffff82d040243265
    (XEN)    0000000000000004 00000026d27a80a7 0000000000000000 ffff830256227e60
    (XEN)    ffff82d04022ee94 ffff82d0405c3280 ffff8302517e7df0 ffff830251778120
    (XEN)    ffff830256227ea0 ffff82d04022f77a ffff830256227e80 ffff82d0405c9f00
    (XEN)    ffffffffffffffff ffff82d0405c9f00 ffff830256227fff 0000000000000000
    (XEN)    ffff830256227ed8 ffff82d04022d26c ffff830251745000 0000000000000000
    (XEN)    0000000000000000 ffff830256227fff 0000000000000000 ffff830256227ee8
    (XEN)    ffff82d04022d2ff 00007cfda9dd80e7 ffff82d0402f0496 0000000000000001
    (XEN)    ffffc90040127ddc ffffc90040127dd0 ffffc90040127dd4 ffffc90040127dd8
    (XEN)    00000000756e6547 0000000000000000 0000000000000000 ffffc90040127ddc
    (XEN)    ffffc90040127dd0 000000000000000d 000000006c65746e 0000000049656e69
    (XEN)    ffffc90040127dd4 ffffc90040127dd8 000000fa00000000 ffffffff8101c5a4
    (XEN)    000000000000e033 0000000000000293 ffffc90040127db8 000000000000e02b
    (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
    (XEN)    0000e01000000000 ffff830251745000 0000000000000000 0000000000372660
    (XEN)    0000000000000000 800000025620b002 000e030300000001 0000000000000000
    (XEN) Xen call trace:
    (XEN)    [<ffff82d04024355c>] R credit.c#csched_tick+0x2f7/0x4b7
    (XEN)    [<ffff82d04022ee94>] F timer.c#execute_timer+0x45/0x5c
    (XEN)    [<ffff82d04022f77a>] F timer.c#timer_softirq_action+0x71/0x278
    (XEN)    [<ffff82d04022d26c>] F softirq.c#__do_softirq+0x94/0xbe
    (XEN)    [<ffff82d04022d2ff>] F do_softirq+0x13/0x15  
    (XEN)    [<ffff82d0402f0496>] F x86_64/entry.S#process_softirqs+0x6/0x20
    (XEN)
    (XEN) ****************************************
    (XEN) Panic on CPU 0:
    (XEN) Assertion 'sched_unit_master(currunit) == cpu' failed at common/sched/credit.c:936
    (XEN) ****************************************

I'm not sure about the nature of the credit2 crash, but for credit1 it
looks much simpler to understand. Credit1 registers a timer for each pCPU
(in init_pdata()). Then the timer function
(csched_tick()->csched_unit_acct()) asserts that it was really called on
the correct CPU. The issue is that the suspend path migrates all timers to
CPU0 (timer.c:cpu_callback->migrate_timers_from_cpu()), but the restore
path doesn't migrate them back. All this happens without unregistering
pCPUs from the scheduler (sched_deinit_pdata() / init_pdata() is not
called - confirmed via debug prints).

I'm not exactly sure whether the same issue applies to credit2, but I
think it's likely. A naive fix could be unregistering all pCPUs from the
scheduler (and registering them back on resume), but this may mess up CPU
pools, and probably a few other things. Any better ideas? I can test
various patches, provide more debug info etc. - it should be rather
straightforward given that I have a reliable reproducer now.
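
To make the failure mode concrete, here is a minimal stand-alone C model of
it (this is not Xen code, just a toy I put together; every name in it is
made up): each "pCPU" owns a tick that asserts it fires on its owner, the
suspend step moves every tick to CPU0, and there is no step moving them
back, so the first tick after resume trips the assertion.

    #include <assert.h>
    #include <stdio.h>

    #define NR_CPUS 4

    struct tick_timer {
        int owner_cpu;   /* cpu the tick accounts for (set at init time) */
        int exec_cpu;    /* cpu the tick will actually fire on */
    };

    static struct tick_timer ticker[NR_CPUS];

    static void csched_tick_model(const struct tick_timer *t)
    {
        /* Mirrors the failing check: the tick must run on the cpu it
         * accounts for. */
        assert(t->exec_cpu == t->owner_cpu);
        printf("tick ok on cpu %d\n", t->owner_cpu);
    }

    static void suspend_model(void)
    {
        /* Suspend path: every timer gets migrated to CPU0 ... */
        for (int cpu = 1; cpu < NR_CPUS; cpu++)
            ticker[cpu].exec_cpu = 0;
        /* ... and the resume path has no matching "migrate back" step. */
    }

    int main(void)
    {
        for (int cpu = 0; cpu < NR_CPUS; cpu++)
            ticker[cpu] = (struct tick_timer){ .owner_cpu = cpu,
                                               .exec_cpu = cpu };

        suspend_model();

        for (int cpu = 0; cpu < NR_CPUS; cpu++)
            csched_tick_model(&ticker[cpu]);   /* asserts for cpu >= 1 */

        return 0;
    }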

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Xen-devel] Xen crash after S3 suspend - Xen 4.13 and newer
  2022-08-21 16:14                     ` Marek Marczykowski-Górecki
@ 2022-08-22  9:53                       ` Jan Beulich
  2022-08-22 10:00                         ` Marek Marczykowski-Górecki
  2022-08-22 15:34                       ` Juergen Gross
  1 sibling, 1 reply; 26+ messages in thread
From: Jan Beulich @ 2022-08-22  9:53 UTC (permalink / raw)
  To: Marek Marczykowski-Górecki
  Cc: Juergen Gross, Dario Faggioli, Andrew Cooper, xen-devel,
	Jürgen Groß

On 21.08.2022 18:14, Marek Marczykowski-Górecki wrote:
> On Sat, Oct 09, 2021 at 06:28:17PM +0200, Marek Marczykowski-Górecki wrote:
>> On Sun, Jan 31, 2021 at 03:15:30AM +0100, Marek Marczykowski-Górecki wrote:
>>> On Tue, Sep 29, 2020 at 05:27:48PM +0200, Jürgen Groß wrote:
>>>> On 29.09.20 17:16, Marek Marczykowski-Górecki wrote:
>>>>> On Tue, Sep 29, 2020 at 05:07:11PM +0200, Jürgen Groß wrote:
>>>>>> On 29.09.20 16:27, Marek Marczykowski-Górecki wrote:
>>>>>>> On Mon, Mar 23, 2020 at 01:09:49AM +0100, Marek Marczykowski-Górecki wrote:
>>>>>>>> On Thu, Mar 19, 2020 at 01:28:10AM +0100, Dario Faggioli wrote:
>>>>>>>>> [Adding Juergen]
>>>>>>>>>
>>>>>>>>> On Wed, 2020-03-18 at 23:10 +0100, Marek Marczykowski-Górecki wrote:
>>>>>>>>>> On Wed, Mar 18, 2020 at 02:50:52PM +0000, Andrew Cooper wrote:
>>>>>>>>>>> On 18/03/2020 14:16, Marek Marczykowski-Górecki wrote:
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> In my test setup (inside KVM with nested virt enabled), I rather
>>>>>>>>>>>> frequently get Xen crash on resume from S3. Full message below.
>>>>>>>>>>>>
>>>>>>>>>>>> This is Xen 4.13.0, with some patches, including "sched: fix
>>>>>>>>>>>> resuming
>>>>>>>>>>>> from S3 with smt=0".
>>>>>>>>>>>>
>>>>>>>>>>>> Contrary to the previous issue, this one does not happen always -
>>>>>>>>>>>> I
>>>>>>>>>>>> would say in about 40% cases on this setup, but very rarely on
>>>>>>>>>>>> physical
>>>>>>>>>>>> setup.
>>>>>>>>>>>>
>>>>>>>>>>>> This is _without_ core scheduling enabled, and also with smt=off.
>>>>>>>>>>>>
>>>>>>>>>>>> Do you think it would be any different on xen-unstable? I cat
>>>>>>>>>>>> try, but
>>>>>>>>>>>> it isn't trivial in this setup, so I'd ask first.
>>>>>>>>>>>>
>>>>>>>>> Well, Juergen has fixed quite a few issues.
>>>>>>>>>
>>>>>>>>> Most of them where triggering with core-scheduling enabled, and I don't
>>>>>>>>> recall any of them which looked similar or related to this.
>>>>>>>>>
>>>>>>>>> Still, it's possible that the same issue causes different symptoms, and
>>>>>>>>> hence that maybe one of the patches would fix this too.
>>>>>>>>
>>>>>>>> I've tested on master (d094e95fb7c), and reproduced exactly the same crash
>>>>>>>> (pasted below for the completeness).
>>>>>>>> But there is more: additionally, in most (all?) cases after resume I've got
>>>>>>>> soft lockup in Linux dom0 in smp_call_function_single() - see below. It
>>>>>>>> didn't happened before and the only change was Xen 4.13 -> master.
>>>>>>>>
>>>>>>>> Xen crash:
>>>>>>>>
>>>>>>>> (XEN) Assertion 'c2rqd(sched_unit_master(unit)) == svc->rqd' failed at credit2.c:2133
>>>>>>>
>>>>>>> Juergen, any idea about this one? This is also happening on the current
>>>>>>> stable-4.14 (28855ebcdbfa).
>>>>>>>
>>>>>>
>>>>>> Oh, sorry I didn't come back to this issue.
>>>>>>
>>>>>> I suspect this is related to stop_machine_run() being called during
>>>>>> suspend(), as I'm seeing very sporadic issues when offlining and then
>>>>>> onlining cpus with core scheduling being active (it seems as if the
>>>>>> dom0 vcpu doing the cpu online activity sometimes is using an old
>>>>>> vcpu state).
>>>>>
>>>>> Note this is default Xen 4.14 start, so core scheduling is _not_ active:
>>>>
>>>> The similarity in the two failure cases is that multiple cpus are
>>>> affected by the operations during stop_machine_run().
>>>>
>>>>>
>>>>>      (XEN) Brought up 2 CPUs
>>>>>      (XEN) Scheduling granularity: cpu, 1 CPU per sched-resource
>>>>>      (XEN) Adding cpu 0 to runqueue 0
>>>>>      (XEN)  First cpu on runqueue, activating
>>>>>      (XEN) Adding cpu 1 to runqueue 1
>>>>>      (XEN)  First cpu on runqueue, activating
>>>>>
>>>>>> I wasn't able to catch the real problem despite of having tried lots
>>>>>> of approaches using debug patches.
>>>>>>
>>>>>> Recently I suspected the whole problem could be somehow related to
>>>>>> RCU handling, as stop_machine_run() is relying on tasklets which are
>>>>>> executing in idle context, and RCU handling is done in idle context,
>>>>>> too. So there might be some kind of use after free scenario in case
>>>>>> some memory is freed via RCU despite it still being used by a tasklet.
>>>>>
>>>>> That sounds plausible, even though I don't really know this area of Xen.
>>>>>
>>>>>> I "just" need to find some time to verify this suspicion. Any help doing
>>>>>> this would be appreciated. :-)
>>>>>
>>>>> I do have a setup where I can easily-ish reproduce the issue. If there
>>>>> is some debug patch you'd like me to try, I can do that.
>>>>
>>>> Thanks. I might come back to that offer as you are seeing a crash which
>>>> will be much easier to analyze. Catching my error case is much harder as
>>>> it surfaces some time after the real problem in a non destructive way
>>>> (usually I'm seeing a failure to load a library in the program which
>>>> just did its job via exactly the library claiming not being loadable).
>>>
>>> Hi,
>>>
>>> I'm resurrecting this thread as it was recently mentioned elsewhere. I
>>> can still reproduce the issue on the recent staging branch (9dc687f155).
>>>
>>> It fails after the first resume (not always, but frequent enough to
>>> debug it). At least one guest needs to be running - with just (PV) dom0
>>> the crash doesn't happen (at least for the ~8 times in a row I tried).
>>> If the first resume works, the second (almost?) always will fail but
>>> with a different symptoms - dom0 kernel lockups (at least some of its
>>> vcpus). I haven't debugged this one yet at all.
>>>
>>> Any help will be appreciated, I can apply some debug patches, change
>>> configuration etc.
>>
>> This still happens on 4.14.3. Maybe it is related to freeing percpu
>> areas, as it caused other issues with suspend too? Just a thought...
> 
> I have reproduced this on current staging(*). And I can reproduce it
> reliably. And also, I got (I believe) closely related crash with credit1
> scheduler.
> 
> (*) It isn't plain staging, it's one with my xhci console patches on
> top, including attempt to make it survive S3. I believe the only
> relevant part there is sticking set_timer() into console resume path (or
> just having a timer with rather short delay registered). The actual tree
> at https://github.com/marmarek/xen/tree/master-xue2-debug, including
> quite a lot of debug prints and debug hacks.
> 
> Specific crash with credit2:

Are you sure this is Credit2? Both ...

>     (XEN) Assertion 'sched_unit_master(currunit) == cpu' failed at common/sched/credit.c:928

... here and ...

>     (XEN) ----[ Xen-4.17-unstable  x86_64  debug=y  Tainted:   C    ]----
>     (XEN) CPU:    0
>     (XEN) RIP:    e008:[<ffff82d0402434bf>] credit.c#csched_tick+0x2d4/0x494
>     (XEN) RFLAGS: 0000000000010202   CONTEXT: hypervisor (d0v4)
>     (XEN) rax: ffff82d0405c4298   rbx: 0000000000000002   rcx: 0000000000000002
>     (XEN) rdx: ffff8302517f64d0   rsi: ffff8302515c0fc0   rdi: 0000000000000002
>     (XEN) rbp: ffff830256227e38   rsp: ffff830256227de0   r8:  0000000000000004
>     (XEN) r9:  ffff8302517ac820   r10: ffff830251745068   r11: 00000088cb734887
>     (XEN) r12: ffff83025174de50   r13: ffff8302515c0fa0   r14: ffff83025174df40
>     (XEN) r15: ffff8302515c0cc0   cr0: 0000000080050033   cr4: 0000000000372660
>     (XEN) cr3: 00000001bacbd000   cr2: 000077e5ec02a318
>     (XEN) fsb: 000077e5fe533700   gsb: ffff888255700000   gss: 0000000000000000
>     (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
>     (XEN) Xen code around <ffff82d0402434bf> (credit.c#csched_tick+0x2d4/0x494):
>     (XEN)  01 00 00 e9 2a 01 00 00 <0f> 0b 0f 0b 0f 0b 48 8b 41 20 0f b7 00 89 45 cc
>     (XEN) Xen stack trace from rsp=ffff830256227de0:
>     (XEN)    ffff830256227fff 0000000000000000 0000000256227e10 ffff82d04035be90
>     (XEN)    ffff830256227ef8 ffff830251745000 ffff82d0405c3280 ffff82d0402431eb
>     (XEN)    0000000000000002 00000088c9ba9534 0000000000000000 ffff830256227e60
>     (XEN)    ffff82d04022ee53 ffff82d0405c3280 ffff8302963e1320 ffff8302515c0fc0
>     (XEN)    ffff830256227ea0 ffff82d04022f73f ffff830256227e80 ffff82d0405c9f00
>     (XEN)    ffffffffffffffff ffff82d0405c9f00 ffff830256227fff 0000000000000000
>     (XEN)    ffff830256227ed8 ffff82d04022d26c ffff830251745000 0000000000000000
>     (XEN)    0000000000000000 ffff830256227fff 0000000000000000 ffff830256227ee8
>     (XEN)    ffff82d04022d2ff 00007cfda9dd80e7 ffff82d0402f03c6 ffff88810c005c00
>     (XEN)    0000000000000031 0000000000000100 00000000fffffe00 0000000000000031
>     (XEN)    0000000000000031 ffffffff82d45d28 0000000000000e2e 0000000000000000
>     (XEN)    0000000000000032 00000000ffffef31 0000000000000000 ffff88812244a700
>     (XEN)    0000000000000005 ffff88812244a780 000000fa00000000 ffffffff818db55f
>     (XEN)    000000000000e033 0000000000000246 ffffc900409b7c50 000000000000e02b
>     (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
>     (XEN)    0000e01000000000 ffff830251745000 0000000000000000 0000000000372660
>     (XEN)    0000000000000000 800000025620b002 000e030300000001 0000000000000000
>     (XEN) Xen call trace:
>     (XEN)    [<ffff82d0402434bf>] R credit.c#csched_tick+0x2d4/0x494
>     (XEN)    [<ffff82d04022ee53>] F timer.c#execute_timer+0x45/0x5c
>     (XEN)    [<ffff82d04022f73f>] F timer.c#timer_softirq_action+0x71/0x278
>     (XEN)    [<ffff82d04022d26c>] F softirq.c#__do_softirq+0x94/0xbe
>     (XEN)    [<ffff82d04022d2ff>] F do_softirq+0x13/0x15
>     (XEN)    [<ffff82d0402f03c6>] F x86_64/entry.S#process_softirqs+0x6/0x20

... here the only references are to credit.c, i.e. Credit1 code.

Jan


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Xen-devel] Xen crash after S3 suspend - Xen 4.13 and newer
  2022-08-22  9:53                       ` Jan Beulich
@ 2022-08-22 10:00                         ` Marek Marczykowski-Górecki
  2022-09-20 10:22                           ` Marek Marczykowski-Górecki
  0 siblings, 1 reply; 26+ messages in thread
From: Marek Marczykowski-Górecki @ 2022-08-22 10:00 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Dario Faggioli, Andrew Cooper, xen-devel,
	Jürgen Groß

[-- Attachment #1: Type: text/plain, Size: 13724 bytes --]

On Mon, Aug 22, 2022 at 11:53:50AM +0200, Jan Beulich wrote:
> On 21.08.2022 18:14, Marek Marczykowski-Górecki wrote:
> > On Sat, Oct 09, 2021 at 06:28:17PM +0200, Marek Marczykowski-Górecki wrote:
> >> On Sun, Jan 31, 2021 at 03:15:30AM +0100, Marek Marczykowski-Górecki wrote:
> >>> On Tue, Sep 29, 2020 at 05:27:48PM +0200, Jürgen Groß wrote:
> >>>> On 29.09.20 17:16, Marek Marczykowski-Górecki wrote:
> >>>>> On Tue, Sep 29, 2020 at 05:07:11PM +0200, Jürgen Groß wrote:
> >>>>>> On 29.09.20 16:27, Marek Marczykowski-Górecki wrote:
> >>>>>>> On Mon, Mar 23, 2020 at 01:09:49AM +0100, Marek Marczykowski-Górecki wrote:
> >>>>>>>> On Thu, Mar 19, 2020 at 01:28:10AM +0100, Dario Faggioli wrote:
> >>>>>>>>> [Adding Juergen]
> >>>>>>>>>
> >>>>>>>>> On Wed, 2020-03-18 at 23:10 +0100, Marek Marczykowski-Górecki wrote:
> >>>>>>>>>> On Wed, Mar 18, 2020 at 02:50:52PM +0000, Andrew Cooper wrote:
> >>>>>>>>>>> On 18/03/2020 14:16, Marek Marczykowski-Górecki wrote:
> >>>>>>>>>>>> Hi,
> >>>>>>>>>>>>
> >>>>>>>>>>>> In my test setup (inside KVM with nested virt enabled), I rather
> >>>>>>>>>>>> frequently get Xen crash on resume from S3. Full message below.
> >>>>>>>>>>>>
> >>>>>>>>>>>> This is Xen 4.13.0, with some patches, including "sched: fix
> >>>>>>>>>>>> resuming
> >>>>>>>>>>>> from S3 with smt=0".
> >>>>>>>>>>>>
> >>>>>>>>>>>> Contrary to the previous issue, this one does not happen always -
> >>>>>>>>>>>> I
> >>>>>>>>>>>> would say in about 40% cases on this setup, but very rarely on
> >>>>>>>>>>>> physical
> >>>>>>>>>>>> setup.
> >>>>>>>>>>>>
> >>>>>>>>>>>> This is _without_ core scheduling enabled, and also with smt=off.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Do you think it would be any different on xen-unstable? I cat
> >>>>>>>>>>>> try, but
> >>>>>>>>>>>> it isn't trivial in this setup, so I'd ask first.
> >>>>>>>>>>>>
> >>>>>>>>> Well, Juergen has fixed quite a few issues.
> >>>>>>>>>
> >>>>>>>>> Most of them where triggering with core-scheduling enabled, and I don't
> >>>>>>>>> recall any of them which looked similar or related to this.
> >>>>>>>>>
> >>>>>>>>> Still, it's possible that the same issue causes different symptoms, and
> >>>>>>>>> hence that maybe one of the patches would fix this too.
> >>>>>>>>
> >>>>>>>> I've tested on master (d094e95fb7c), and reproduced exactly the same crash
> >>>>>>>> (pasted below for the completeness).
> >>>>>>>> But there is more: additionally, in most (all?) cases after resume I've got
> >>>>>>>> soft lockup in Linux dom0 in smp_call_function_single() - see below. It
> >>>>>>>> didn't happened before and the only change was Xen 4.13 -> master.
> >>>>>>>>
> >>>>>>>> Xen crash:
> >>>>>>>>
> >>>>>>>> (XEN) Assertion 'c2rqd(sched_unit_master(unit)) == svc->rqd' failed at credit2.c:2133
> >>>>>>>
> >>>>>>> Juergen, any idea about this one? This is also happening on the current
> >>>>>>> stable-4.14 (28855ebcdbfa).
> >>>>>>>
> >>>>>>
> >>>>>> Oh, sorry I didn't come back to this issue.
> >>>>>>
> >>>>>> I suspect this is related to stop_machine_run() being called during
> >>>>>> suspend(), as I'm seeing very sporadic issues when offlining and then
> >>>>>> onlining cpus with core scheduling being active (it seems as if the
> >>>>>> dom0 vcpu doing the cpu online activity sometimes is using an old
> >>>>>> vcpu state).
> >>>>>
> >>>>> Note this is default Xen 4.14 start, so core scheduling is _not_ active:
> >>>>
> >>>> The similarity in the two failure cases is that multiple cpus are
> >>>> affected by the operations during stop_machine_run().
> >>>>
> >>>>>
> >>>>>      (XEN) Brought up 2 CPUs
> >>>>>      (XEN) Scheduling granularity: cpu, 1 CPU per sched-resource
> >>>>>      (XEN) Adding cpu 0 to runqueue 0
> >>>>>      (XEN)  First cpu on runqueue, activating
> >>>>>      (XEN) Adding cpu 1 to runqueue 1
> >>>>>      (XEN)  First cpu on runqueue, activating
> >>>>>
> >>>>>> I wasn't able to catch the real problem despite of having tried lots
> >>>>>> of approaches using debug patches.
> >>>>>>
> >>>>>> Recently I suspected the whole problem could be somehow related to
> >>>>>> RCU handling, as stop_machine_run() is relying on tasklets which are
> >>>>>> executing in idle context, and RCU handling is done in idle context,
> >>>>>> too. So there might be some kind of use after free scenario in case
> >>>>>> some memory is freed via RCU despite it still being used by a tasklet.
> >>>>>
> >>>>> That sounds plausible, even though I don't really know this area of Xen.
> >>>>>
> >>>>>> I "just" need to find some time to verify this suspicion. Any help doing
> >>>>>> this would be appreciated. :-)
> >>>>>
> >>>>> I do have a setup where I can easily-ish reproduce the issue. If there
> >>>>> is some debug patch you'd like me to try, I can do that.
> >>>>
> >>>> Thanks. I might come back to that offer as you are seeing a crash which
> >>>> will be much easier to analyze. Catching my error case is much harder as
> >>>> it surfaces some time after the real problem in a non destructive way
> >>>> (usually I'm seeing a failure to load a library in the program which
> >>>> just did its job via exactly the library claiming not being loadable).
> >>>
> >>> Hi,
> >>>
> >>> I'm resurrecting this thread as it was recently mentioned elsewhere. I
> >>> can still reproduce the issue on the recent staging branch (9dc687f155).
> >>>
> >>> It fails after the first resume (not always, but frequent enough to
> >>> debug it). At least one guest needs to be running - with just (PV) dom0
> >>> the crash doesn't happen (at least for the ~8 times in a row I tried).
> >>> If the first resume works, the second (almost?) always will fail but
> >>> with a different symptoms - dom0 kernel lockups (at least some of its
> >>> vcpus). I haven't debugged this one yet at all.
> >>>
> >>> Any help will be appreciated, I can apply some debug patches, change
> >>> configuration etc.
> >>
> >> This still happens on 4.14.3. Maybe it is related to freeing percpu
> >> areas, as it caused other issues with suspend too? Just a thought...
> > 
> > I have reproduced this on current staging(*). And I can reproduce it
> > reliably. And also, I got (I believe) closely related crash with credit1
> > scheduler.
> > 
> > (*) It isn't plain staging, it's one with my xhci console patches on
> > top, including attempt to make it survive S3. I believe the only
> > relevant part there is sticking set_timer() into console resume path (or
> > just having a timer with rather short delay registered). The actual tree
> > at https://github.com/marmarek/xen/tree/master-xue2-debug, including
> > quite a lot of debug prints and debug hacks.
> > 
> > Specific crash with credit2:
> 
> Are you sure this is Credit2? Both ...
> 
> >     (XEN) Assertion 'sched_unit_master(currunit) == cpu' failed at common/sched/credit.c:928
> 
> ... here and ...
> 
> >     (XEN) ----[ Xen-4.17-unstable  x86_64  debug=y  Tainted:   C    ]----
> >     (XEN) CPU:    0
> >     (XEN) RIP:    e008:[<ffff82d0402434bf>] credit.c#csched_tick+0x2d4/0x494
> >     (XEN) RFLAGS: 0000000000010202   CONTEXT: hypervisor (d0v4)
> >     (XEN) rax: ffff82d0405c4298   rbx: 0000000000000002   rcx: 0000000000000002
> >     (XEN) rdx: ffff8302517f64d0   rsi: ffff8302515c0fc0   rdi: 0000000000000002
> >     (XEN) rbp: ffff830256227e38   rsp: ffff830256227de0   r8:  0000000000000004
> >     (XEN) r9:  ffff8302517ac820   r10: ffff830251745068   r11: 00000088cb734887
> >     (XEN) r12: ffff83025174de50   r13: ffff8302515c0fa0   r14: ffff83025174df40
> >     (XEN) r15: ffff8302515c0cc0   cr0: 0000000080050033   cr4: 0000000000372660
> >     (XEN) cr3: 00000001bacbd000   cr2: 000077e5ec02a318
> >     (XEN) fsb: 000077e5fe533700   gsb: ffff888255700000   gss: 0000000000000000
> >     (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
> >     (XEN) Xen code around <ffff82d0402434bf> (credit.c#csched_tick+0x2d4/0x494):
> >     (XEN)  01 00 00 e9 2a 01 00 00 <0f> 0b 0f 0b 0f 0b 48 8b 41 20 0f b7 00 89 45 cc
> >     (XEN) Xen stack trace from rsp=ffff830256227de0:
> >     (XEN)    ffff830256227fff 0000000000000000 0000000256227e10 ffff82d04035be90
> >     (XEN)    ffff830256227ef8 ffff830251745000 ffff82d0405c3280 ffff82d0402431eb
> >     (XEN)    0000000000000002 00000088c9ba9534 0000000000000000 ffff830256227e60
> >     (XEN)    ffff82d04022ee53 ffff82d0405c3280 ffff8302963e1320 ffff8302515c0fc0
> >     (XEN)    ffff830256227ea0 ffff82d04022f73f ffff830256227e80 ffff82d0405c9f00
> >     (XEN)    ffffffffffffffff ffff82d0405c9f00 ffff830256227fff 0000000000000000
> >     (XEN)    ffff830256227ed8 ffff82d04022d26c ffff830251745000 0000000000000000
> >     (XEN)    0000000000000000 ffff830256227fff 0000000000000000 ffff830256227ee8
> >     (XEN)    ffff82d04022d2ff 00007cfda9dd80e7 ffff82d0402f03c6 ffff88810c005c00
> >     (XEN)    0000000000000031 0000000000000100 00000000fffffe00 0000000000000031
> >     (XEN)    0000000000000031 ffffffff82d45d28 0000000000000e2e 0000000000000000
> >     (XEN)    0000000000000032 00000000ffffef31 0000000000000000 ffff88812244a700
> >     (XEN)    0000000000000005 ffff88812244a780 000000fa00000000 ffffffff818db55f
> >     (XEN)    000000000000e033 0000000000000246 ffffc900409b7c50 000000000000e02b
> >     (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> >     (XEN)    0000e01000000000 ffff830251745000 0000000000000000 0000000000372660
> >     (XEN)    0000000000000000 800000025620b002 000e030300000001 0000000000000000
> >     (XEN) Xen call trace:
> >     (XEN)    [<ffff82d0402434bf>] R credit.c#csched_tick+0x2d4/0x494
> >     (XEN)    [<ffff82d04022ee53>] F timer.c#execute_timer+0x45/0x5c
> >     (XEN)    [<ffff82d04022f73f>] F timer.c#timer_softirq_action+0x71/0x278
> >     (XEN)    [<ffff82d04022d26c>] F softirq.c#__do_softirq+0x94/0xbe
> >     (XEN)    [<ffff82d04022d2ff>] F do_softirq+0x13/0x15
> >     (XEN)    [<ffff82d0402f03c6>] F x86_64/entry.S#process_softirqs+0x6/0x20
> 
> ... here the only references are to credit.c, i.e. Credit1 code.

Wrong paste, sorry, this is the credit2 one:

(XEN) Assertion 'c2rqd(sched_unit_master(unit)) == svc->rqd' failed at common/sched/credit2.c:2274
(XEN) ----[ Xen-4.17-unstable  x86_64  debug=y  Tainted:   C    ]----
(XEN) CPU:    10
(XEN) RIP:    e008:[<ffff82d040247a4d>] credit2.c#csched2_unit_wake+0x152/0x154
(XEN) RFLAGS: 0000000000010083   CONTEXT: hypervisor (d0v0)
(XEN) rax: ffff830251778230   rbx: ffff830251768cb0   rcx: 00000032111d6000
(XEN) rdx: ffff8302515c1eb0   rsi: 0000000000000006   rdi: ffff830251769000
(XEN) rbp: ffff8302515cfd90   rsp: ffff8302515cfd70   r8:  ffff830251769000
(XEN) r9:  0000000000000000   r10: 0000000000000000   r11: 0000000000000000
(XEN) r12: ffff830251768dd0   r13: ffff8302515c1d00   r14: 0000000000000006
(XEN) r15: ffff82d0405ddb40   cr0: 0000000080050033   cr4: 0000000000372660
(XEN) cr3: 000000022f2a1000   cr2: ffff8881012738e0
(XEN) fsb: 0000744bf6a0db80   gsb: ffff888255600000   gss: 0000000000000000
(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e010   cs: e008
(XEN) Xen code around <ffff82d040247a4d> (credit2.c#csched2_unit_wake+0x152/0x154):
(XEN)  df e8 6f bf ff ff eb ad <0f> 0b f3 0f 1e fa 55 48 89 e5 41 57 41 56 41 55
(XEN) Xen stack trace from rsp=ffff8302515cfd70:
(XEN)    ffff83025174b000 ffff830251768cb0 ffff830251778270 ffff82d0405c4298
(XEN)    ffff8302515cfdd8 ffff82d04024fcb8 0000000000000202 ffff830251778270
(XEN)    ffff83025174b000 0000000000000001 ffff830251769018 0000000000000000
(XEN)    0000000000000000 ffff8302515cfe48 ffff82d04020a8c9 ffff8882556aedc0
(XEN)    0000000000000003 00001910537e623e 0000000b988f78a6 0000000059d4a716
(XEN)    00001901f30fa41e 0000000217f96af6 0000000000000000 ffff83025174b000
(XEN)    ffff830251756000 0000000000000002 0000000000000001 ffff8302515cfe70
(XEN)    ffff82d0402f7968 ffff830251756000 ffff8302515cfef8 0000000000000018
(XEN)    ffff8302515cfee8 ffff82d0402ec6de 0000000000000000 ffffffff82f157e0
(XEN)    0000000000000000 0000000000000000 ffff8302515cfef8 0000000000000000
(XEN)    0000000000000000 ffff8302515cffff ffff830251756000 0000000000000000
(XEN)    0000000000000000 0000000000000000 0000000000000000 00007cfdaea300e7
(XEN)    ffff82d0402012bd 0000000000000000 ffffffff82c51120 ffff88810036cf00
(XEN)    0000000000000002 000000000001e120 0000000000000002 0000000000000246
(XEN)    ffffffff82f157e0 0000000000000001 0000000000000000 0000000000000018
(XEN)    ffffffff81e4a30a 0000000000000000 0000000000000002 0000000000000001
(XEN)    0000010000000000 ffffffff81e4a30a 000000000000e033 0000000000000246
(XEN)    ffffc9004aef7c18 000000000000e02b fb5ee398d214b10c eb5ef398c214a10c
(XEN)    eb56f390c21ca104 ebd6f310c29ca184 0000e0100000000a ffff830251756000
(XEN)    0000003211016000 0000000000372660 0000000000000000 80000002963e1002
(XEN) Xen call trace:
(XEN)    [<ffff82d040247a4d>] R credit2.c#csched2_unit_wake+0x152/0x154
(XEN)    [<ffff82d04024fcb8>] F vcpu_wake+0xfd/0x267
(XEN)    [<ffff82d04020a8c9>] F common_vcpu_op+0x178/0x5d1
(XEN)    [<ffff82d0402f7968>] F do_vcpu_op+0x69/0x226
(XEN)    [<ffff82d0402ec6de>] F pv_hypercall+0x575/0x657
(XEN)    [<ffff82d0402012bd>] F lstar_enter+0x13d/0x150
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 10:
(XEN) Assertion 'c2rqd(sched_unit_master(unit)) == svc->rqd' failed at common/sched/credit2.c:2274
(XEN) ****************************************


-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Xen-devel] Xen crash after S3 suspend - Xen 4.13 and newer
  2022-08-21 16:14                     ` Marek Marczykowski-Górecki
  2022-08-22  9:53                       ` Jan Beulich
@ 2022-08-22 15:34                       ` Juergen Gross
  2022-09-06 11:46                         ` Juergen Gross
  1 sibling, 1 reply; 26+ messages in thread
From: Juergen Gross @ 2022-08-22 15:34 UTC (permalink / raw)
  To: Marek Marczykowski-Górecki
  Cc: Juergen Gross, Dario Faggioli, Jan Beulich, Andrew Cooper, xen-devel


[-- Attachment #1.1.1: Type: text/plain, Size: 14766 bytes --]

On 21.08.22 18:14, Marek Marczykowski-Górecki wrote:
> On Sat, Oct 09, 2021 at 06:28:17PM +0200, Marek Marczykowski-Górecki wrote:
>> On Sun, Jan 31, 2021 at 03:15:30AM +0100, Marek Marczykowski-Górecki wrote:
>>> On Tue, Sep 29, 2020 at 05:27:48PM +0200, Jürgen Groß wrote:
>>>> On 29.09.20 17:16, Marek Marczykowski-Górecki wrote:
>>>>> On Tue, Sep 29, 2020 at 05:07:11PM +0200, Jürgen Groß wrote:
>>>>>> On 29.09.20 16:27, Marek Marczykowski-Górecki wrote:
>>>>>>> On Mon, Mar 23, 2020 at 01:09:49AM +0100, Marek Marczykowski-Górecki wrote:
>>>>>>>> On Thu, Mar 19, 2020 at 01:28:10AM +0100, Dario Faggioli wrote:
>>>>>>>>> [Adding Juergen]
>>>>>>>>>
>>>>>>>>> On Wed, 2020-03-18 at 23:10 +0100, Marek Marczykowski-Górecki wrote:
>>>>>>>>>> On Wed, Mar 18, 2020 at 02:50:52PM +0000, Andrew Cooper wrote:
>>>>>>>>>>> On 18/03/2020 14:16, Marek Marczykowski-Górecki wrote:
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> In my test setup (inside KVM with nested virt enabled), I rather
>>>>>>>>>>>> frequently get Xen crash on resume from S3. Full message below.
>>>>>>>>>>>>
>>>>>>>>>>>> This is Xen 4.13.0, with some patches, including "sched: fix
>>>>>>>>>>>> resuming
>>>>>>>>>>>> from S3 with smt=0".
>>>>>>>>>>>>
>>>>>>>>>>>> Contrary to the previous issue, this one does not happen always -
>>>>>>>>>>>> I
>>>>>>>>>>>> would say in about 40% cases on this setup, but very rarely on
>>>>>>>>>>>> physical
>>>>>>>>>>>> setup.
>>>>>>>>>>>>
>>>>>>>>>>>> This is _without_ core scheduling enabled, and also with smt=off.
>>>>>>>>>>>>
>>>>>>>>>>>> Do you think it would be any different on xen-unstable? I cat
>>>>>>>>>>>> try, but
>>>>>>>>>>>> it isn't trivial in this setup, so I'd ask first.
>>>>>>>>>>>>
>>>>>>>>> Well, Juergen has fixed quite a few issues.
>>>>>>>>>
>>>>>>>>> Most of them where triggering with core-scheduling enabled, and I don't
>>>>>>>>> recall any of them which looked similar or related to this.
>>>>>>>>>
>>>>>>>>> Still, it's possible that the same issue causes different symptoms, and
>>>>>>>>> hence that maybe one of the patches would fix this too.
>>>>>>>>
>>>>>>>> I've tested on master (d094e95fb7c), and reproduced exactly the same crash
>>>>>>>> (pasted below for the completeness).
>>>>>>>> But there is more: additionally, in most (all?) cases after resume I've got
>>>>>>>> soft lockup in Linux dom0 in smp_call_function_single() - see below. It
>>>>>>>> didn't happened before and the only change was Xen 4.13 -> master.
>>>>>>>>
>>>>>>>> Xen crash:
>>>>>>>>
>>>>>>>> (XEN) Assertion 'c2rqd(sched_unit_master(unit)) == svc->rqd' failed at credit2.c:2133
>>>>>>>
>>>>>>> Juergen, any idea about this one? This is also happening on the current
>>>>>>> stable-4.14 (28855ebcdbfa).
>>>>>>>
>>>>>>
>>>>>> Oh, sorry I didn't come back to this issue.
>>>>>>
>>>>>> I suspect this is related to stop_machine_run() being called during
>>>>>> suspend(), as I'm seeing very sporadic issues when offlining and then
>>>>>> onlining cpus with core scheduling being active (it seems as if the
>>>>>> dom0 vcpu doing the cpu online activity sometimes is using an old
>>>>>> vcpu state).
>>>>>
>>>>> Note this is default Xen 4.14 start, so core scheduling is _not_ active:
>>>>
>>>> The similarity in the two failure cases is that multiple cpus are
>>>> affected by the operations during stop_machine_run().
>>>>
>>>>>
>>>>>       (XEN) Brought up 2 CPUs
>>>>>       (XEN) Scheduling granularity: cpu, 1 CPU per sched-resource
>>>>>       (XEN) Adding cpu 0 to runqueue 0
>>>>>       (XEN)  First cpu on runqueue, activating
>>>>>       (XEN) Adding cpu 1 to runqueue 1
>>>>>       (XEN)  First cpu on runqueue, activating
>>>>>
>>>>>> I wasn't able to catch the real problem despite of having tried lots
>>>>>> of approaches using debug patches.
>>>>>>
>>>>>> Recently I suspected the whole problem could be somehow related to
>>>>>> RCU handling, as stop_machine_run() is relying on tasklets which are
>>>>>> executing in idle context, and RCU handling is done in idle context,
>>>>>> too. So there might be some kind of use after free scenario in case
>>>>>> some memory is freed via RCU despite it still being used by a tasklet.
>>>>>
>>>>> That sounds plausible, even though I don't really know this area of Xen.
>>>>>
>>>>>> I "just" need to find some time to verify this suspicion. Any help doing
>>>>>> this would be appreciated. :-)
>>>>>
>>>>> I do have a setup where I can easily-ish reproduce the issue. If there
>>>>> is some debug patch you'd like me to try, I can do that.
>>>>
>>>> Thanks. I might come back to that offer as you are seeing a crash which
>>>> will be much easier to analyze. Catching my error case is much harder as
>>>> it surfaces some time after the real problem in a non destructive way
>>>> (usually I'm seeing a failure to load a library in the program which
>>>> just did its job via exactly the library claiming not being loadable).
>>>
>>> Hi,
>>>
>>> I'm resurrecting this thread as it was recently mentioned elsewhere. I
>>> can still reproduce the issue on the recent staging branch (9dc687f155).
>>>
>>> It fails after the first resume (not always, but frequent enough to
>>> debug it). At least one guest needs to be running - with just (PV) dom0
>>> the crash doesn't happen (at least for the ~8 times in a row I tried).
>>> If the first resume works, the second (almost?) always will fail but
>>> with a different symptoms - dom0 kernel lockups (at least some of its
>>> vcpus). I haven't debugged this one yet at all.
>>>
>>> Any help will be appreciated, I can apply some debug patches, change
>>> configuration etc.
>>
>> This still happens on 4.14.3. Maybe it is related to freeing percpu
>> areas, as it caused other issues with suspend too? Just a thought...
> 
> I have reproduced this on current staging(*). And I can reproduce it
> reliably. And also, I got (I believe) closely related crash with credit1
> scheduler.
> 
> (*) It isn't plain staging, it's one with my xhci console patches on
> top, including attempt to make it survive S3. I believe the only
> relevant part there is sticking set_timer() into console resume path (or
> just having a timer with rather short delay registered). The actual tree
> at https://github.com/marmarek/xen/tree/master-xue2-debug, including
> quite a lot of debug prints and debug hacks.
> 
> Specific crash with credit2:
> 
>      (XEN) Assertion 'sched_unit_master(currunit) == cpu' failed at common/sched/credit.c:928
>      (XEN) ----[ Xen-4.17-unstable  x86_64  debug=y  Tainted:   C    ]----
>      (XEN) CPU:    0
>      (XEN) RIP:    e008:[<ffff82d0402434bf>] credit.c#csched_tick+0x2d4/0x494
>      (XEN) RFLAGS: 0000000000010202   CONTEXT: hypervisor (d0v4)
>      (XEN) rax: ffff82d0405c4298   rbx: 0000000000000002   rcx: 0000000000000002
>      (XEN) rdx: ffff8302517f64d0   rsi: ffff8302515c0fc0   rdi: 0000000000000002
>      (XEN) rbp: ffff830256227e38   rsp: ffff830256227de0   r8:  0000000000000004
>      (XEN) r9:  ffff8302517ac820   r10: ffff830251745068   r11: 00000088cb734887
>      (XEN) r12: ffff83025174de50   r13: ffff8302515c0fa0   r14: ffff83025174df40
>      (XEN) r15: ffff8302515c0cc0   cr0: 0000000080050033   cr4: 0000000000372660
>      (XEN) cr3: 00000001bacbd000   cr2: 000077e5ec02a318
>      (XEN) fsb: 000077e5fe533700   gsb: ffff888255700000   gss: 0000000000000000
>      (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
>      (XEN) Xen code around <ffff82d0402434bf> (credit.c#csched_tick+0x2d4/0x494):
>      (XEN)  01 00 00 e9 2a 01 00 00 <0f> 0b 0f 0b 0f 0b 48 8b 41 20 0f b7 00 89 45 cc
>      (XEN) Xen stack trace from rsp=ffff830256227de0:
>      (XEN)    ffff830256227fff 0000000000000000 0000000256227e10 ffff82d04035be90
>      (XEN)    ffff830256227ef8 ffff830251745000 ffff82d0405c3280 ffff82d0402431eb
>      (XEN)    0000000000000002 00000088c9ba9534 0000000000000000 ffff830256227e60
>      (XEN)    ffff82d04022ee53 ffff82d0405c3280 ffff8302963e1320 ffff8302515c0fc0
>      (XEN)    ffff830256227ea0 ffff82d04022f73f ffff830256227e80 ffff82d0405c9f00
>      (XEN)    ffffffffffffffff ffff82d0405c9f00 ffff830256227fff 0000000000000000
>      (XEN)    ffff830256227ed8 ffff82d04022d26c ffff830251745000 0000000000000000
>      (XEN)    0000000000000000 ffff830256227fff 0000000000000000 ffff830256227ee8
>      (XEN)    ffff82d04022d2ff 00007cfda9dd80e7 ffff82d0402f03c6 ffff88810c005c00
>      (XEN)    0000000000000031 0000000000000100 00000000fffffe00 0000000000000031
>      (XEN)    0000000000000031 ffffffff82d45d28 0000000000000e2e 0000000000000000
>      (XEN)    0000000000000032 00000000ffffef31 0000000000000000 ffff88812244a700
>      (XEN)    0000000000000005 ffff88812244a780 000000fa00000000 ffffffff818db55f
>      (XEN)    000000000000e033 0000000000000246 ffffc900409b7c50 000000000000e02b
>      (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
>      (XEN)    0000e01000000000 ffff830251745000 0000000000000000 0000000000372660
>      (XEN)    0000000000000000 800000025620b002 000e030300000001 0000000000000000
>      (XEN) Xen call trace:
>      (XEN)    [<ffff82d0402434bf>] R credit.c#csched_tick+0x2d4/0x494
>      (XEN)    [<ffff82d04022ee53>] F timer.c#execute_timer+0x45/0x5c
>      (XEN)    [<ffff82d04022f73f>] F timer.c#timer_softirq_action+0x71/0x278
>      (XEN)    [<ffff82d04022d26c>] F softirq.c#__do_softirq+0x94/0xbe
>      (XEN)    [<ffff82d04022d2ff>] F do_softirq+0x13/0x15
>      (XEN)    [<ffff82d0402f03c6>] F x86_64/entry.S#process_softirqs+0x6/0x20
>      (XEN)
>      (XEN)
>      (XEN) ****************************************
>      (XEN) Panic on CPU 0:
>      (XEN) Assertion 'sched_unit_master(currunit) == cpu' failed at common/sched/credit.c:928
>      (XEN) ****************************************
> 
> Specific crash with credit1:
> 
>      (XEN) ASSERT FAIL! cpu 4, sched_unit_master 0, currunit id 4
>      (XEN) Assertion 'sched_unit_master(currunit) == cpu' failed at common/sched/credit.c:936
>      (XEN) ----[ Xen-4.17-unstable  x86_64  debug=y  Tainted:   C    ]----
>      (XEN) CPU:    0
>      (XEN) RIP:    e008:[<ffff82d04024355c>] credit.c#csched_tick+0x2f7/0x4b7
>      (XEN) RFLAGS: 0000000000010202   CONTEXT: hypervisor (d0v4)
>      (XEN) rax: ffff8302517f64d0   rbx: 0000000000000004   rcx: 0000000000000000
>      (XEN) rdx: ffff830256227fff   rsi: 000000000000000a   rdi: ffff82d0404786b8
>      (XEN) rbp: ffff830256227e38   rsp: ffff830256227de0   r8:  0000000000000087
>      (XEN) r9:  0000000000000001   r10: 0000000000000001   r11: 0000000000000004
>      (XEN) r12: ffff83025174de50   r13: ffff830251778100   r14: ffff83025174df40
>      (XEN) r15: ffff8302515c0cc0   cr0: 0000000080050033   cr4: 0000000000372660
>      (XEN) cr3: 000000024ac10000   cr2: 0000000000000000
>      (XEN) fsb: 0000000000000000   gsb: ffff888255700000   gss: 0000000000000000
>      (XEN) ds: 002b   es: 002b   fs: 0000   gs: 0000   ss: 0000   cs: e008
>      (XEN) Xen code around <ffff82d04024355c> (credit.c#csched_tick+0x2f7/0x4b7):
>      (XEN)  58 70 0f 84 bf fe ff ff <0f> 0b 0f 0b 0f 0b 48 8b 40 20 0f b7 00 89 45 cc
>      (XEN) Xen stack trace from rsp=ffff830256227de0:
>      (XEN)    000000000046692a ffff8302515c0bc0 000000044025ae27 002191c0517f64e8
>      (XEN)    ffff82d0405c33e0 ffff8302517c8590 ffff82d0405c3280 ffff82d040243265
>      (XEN)    0000000000000004 00000026d27a80a7 0000000000000000 ffff830256227e60
>      (XEN)    ffff82d04022ee94 ffff82d0405c3280 ffff8302517e7df0 ffff830251778120
>      (XEN)    ffff830256227ea0 ffff82d04022f77a ffff830256227e80 ffff82d0405c9f00
>      (XEN)    ffffffffffffffff ffff82d0405c9f00 ffff830256227fff 0000000000000000
>      (XEN)    ffff830256227ed8 ffff82d04022d26c ffff830251745000 0000000000000000
>      (XEN)    0000000000000000 ffff830256227fff 0000000000000000 ffff830256227ee8
>      (XEN)    ffff82d04022d2ff 00007cfda9dd80e7 ffff82d0402f0496 0000000000000001
>      (XEN)    ffffc90040127ddc ffffc90040127dd0 ffffc90040127dd4 ffffc90040127dd8
>      (XEN)    00000000756e6547 0000000000000000 0000000000000000 ffffc90040127ddc
>      (XEN)    ffffc90040127dd0 000000000000000d 000000006c65746e 0000000049656e69
>      (XEN)    ffffc90040127dd4 ffffc90040127dd8 000000fa00000000 ffffffff8101c5a4
>      (XEN)    000000000000e033 0000000000000293 ffffc90040127db8 000000000000e02b
>      (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
>      (XEN)    0000e01000000000 ffff830251745000 0000000000000000 0000000000372660
>      (XEN)    0000000000000000 800000025620b002 000e030300000001 0000000000000000
>      (XEN) Xen call trace:
>      (XEN)    [<ffff82d04024355c>] R credit.c#csched_tick+0x2f7/0x4b7
>      (XEN)    [<ffff82d04022ee94>] F timer.c#execute_timer+0x45/0x5c
>      (XEN)    [<ffff82d04022f77a>] F timer.c#timer_softirq_action+0x71/0x278
>      (XEN)    [<ffff82d04022d26c>] F softirq.c#__do_softirq+0x94/0xbe
>      (XEN)    [<ffff82d04022d2ff>] F do_softirq+0x13/0x15
>      (XEN)    [<ffff82d0402f0496>] F x86_64/entry.S#process_softirqs+0x6/0x20
>      (XEN)
>      (XEN) ****************************************
>      (XEN) Panic on CPU 0:
>      (XEN) Assertion 'sched_unit_master(currunit) == cpu' failed at common/sched/credit.c:936
>      (XEN) ****************************************
> 
> I'm not sure about the nature of the credit2 crash, but for credit1 it
> looks much simpler to understand. Credit1 registers a timer for each pcpu
> (in init_pdata()). Then, the timer function
> (csched_tick()->csched_unit_acct()) asserts that it was really called on
> the correct CPU. The issue is that the suspend path migrates all timers to
> CPU0 (timer.c:cpu_callback->migrate_timers_from_cpu()), and the restore
> path doesn't migrate them back. All this while not unregistering pCPUs from
> the scheduler (sched_deinit_pdata() / init_pdata() is not called -
> confirmed via debug prints).
> 
> I'm not exactly sure if that's the same issue that applies to credit2,
> but I think it's likely. A naive fix could be unregistering all pCPUs
> from the scheduler (and registering them back on resume), but this may
> mess up CPU pools, and probably a few other things. Any better ideas? I
> can test various patches, provide more debug info etc. - it should be
> rather straightforward given I have a reliable reproducer now.
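
For reference, the failure mode described above boils down to the following
minimal, self-contained model (plain C; names like toy_timer and toy_tick are
invented for illustration and are not the actual credit.c / timer.c code):

    /* Toy model of the credit1 tick vs. timer migration interaction:
     * a per-pCPU periodic handler that asserts it runs on its own pCPU,
     * a "suspend" that collects all timers onto CPU0, and a "resume"
     * that never gives them back. */
    #include <assert.h>
    #include <stdio.h>

    #define NR_CPUS 4

    struct toy_timer {
        unsigned int cpu;        /* pCPU the timer is currently queued on */
        unsigned int owner_cpu;  /* pCPU whose scheduler data it services */
    };

    static struct toy_timer tick[NR_CPUS];

    /* Stands in for credit1's init_pdata(): one tick timer per pCPU. */
    static void toy_init_pdata(unsigned int cpu)
    {
        tick[cpu].cpu = cpu;
        tick[cpu].owner_cpu = cpu;
    }

    /* Stands in for csched_tick()/csched_unit_acct(): must run on its pCPU. */
    static void toy_tick(const struct toy_timer *t, unsigned int running_cpu)
    {
        /* Mirrors ASSERT(sched_unit_master(currunit) == cpu). */
        assert(t->owner_cpu == running_cpu);
        printf("tick for pCPU %u ran on pCPU %u\n", t->owner_cpu, running_cpu);
    }

    /* Stands in for migrate_timers_from_cpu() running for every parked pCPU. */
    static void toy_suspend(void)
    {
        for (unsigned int cpu = 1; cpu < NR_CPUS; cpu++)
            tick[cpu].cpu = 0;   /* every timer ends up queued on CPU0 ... */
    }

    static void toy_resume(void)
    {
        /* ... and nothing moves them back: sched_deinit_pdata()/init_pdata()
         * are not called, so owner_cpu still says 1..NR_CPUS-1. */
    }

    int main(void)
    {
        for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++)
            toy_init_pdata(cpu);

        toy_suspend();
        toy_resume();

        /* After resume each tick fires where its timer is queued (CPU0), so
         * the first tick belonging to a non-CPU0 pCPU trips the assertion. */
        for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++)
            toy_tick(&tick[cpu], tick[cpu].cpu);

        return 0;
    }

Running the model aborts on the tick for pCPU 1, which is essentially what the
'sched_unit_master(currunit) == cpu' assertion reports after a real resume.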

Unregistering the cpus from the scheduler would break core scheduling.

I'll have a look if the timers can just be deactivated in order not to
have them migrated to cpu0.


Juergen

[-- Attachment #1.1.2: OpenPGP public key --]
[-- Type: application/pgp-keys, Size: 3149 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 495 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Xen-devel] Xen crash after S3 suspend - Xen 4.13 and newer
  2022-08-22 15:34                       ` Juergen Gross
@ 2022-09-06 11:46                         ` Juergen Gross
  2022-09-06 12:35                           ` Marek Marczykowski-Górecki
  0 siblings, 1 reply; 26+ messages in thread
From: Juergen Gross @ 2022-09-06 11:46 UTC (permalink / raw)
  To: Marek Marczykowski-Górecki
  Cc: Juergen Gross, Dario Faggioli, Jan Beulich, Andrew Cooper, xen-devel


[-- Attachment #1.1.1: Type: text/plain, Size: 15820 bytes --]

On 22.08.22 17:34, Juergen Gross wrote:
> On 21.08.22 18:14, Marek Marczykowski-Górecki wrote:
>> On Sat, Oct 09, 2021 at 06:28:17PM +0200, Marek Marczykowski-Górecki wrote:
>>> On Sun, Jan 31, 2021 at 03:15:30AM +0100, Marek Marczykowski-Górecki wrote:
>>>> On Tue, Sep 29, 2020 at 05:27:48PM +0200, Jürgen Groß wrote:
>>>>> On 29.09.20 17:16, Marek Marczykowski-Górecki wrote:
>>>>>> On Tue, Sep 29, 2020 at 05:07:11PM +0200, Jürgen Groß wrote:
>>>>>>> On 29.09.20 16:27, Marek Marczykowski-Górecki wrote:
>>>>>>>> On Mon, Mar 23, 2020 at 01:09:49AM +0100, Marek Marczykowski-Górecki wrote:
>>>>>>>>> On Thu, Mar 19, 2020 at 01:28:10AM +0100, Dario Faggioli wrote:
>>>>>>>>>> [Adding Juergen]
>>>>>>>>>>
>>>>>>>>>> On Wed, 2020-03-18 at 23:10 +0100, Marek Marczykowski-Górecki wrote:
>>>>>>>>>>> On Wed, Mar 18, 2020 at 02:50:52PM +0000, Andrew Cooper wrote:
>>>>>>>>>>>> On 18/03/2020 14:16, Marek Marczykowski-Górecki wrote:
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> In my test setup (inside KVM with nested virt enabled), I rather
>>>>>>>>>>>>> frequently get Xen crash on resume from S3. Full message below.
>>>>>>>>>>>>>
>>>>>>>>>>>>> This is Xen 4.13.0, with some patches, including "sched: fix
>>>>>>>>>>>>> resuming
>>>>>>>>>>>>> from S3 with smt=0".
>>>>>>>>>>>>>
>>>>>>>>>>>>> Contrary to the previous issue, this one does not happen always -
>>>>>>>>>>>>> I
>>>>>>>>>>>>> would say in about 40% cases on this setup, but very rarely on
>>>>>>>>>>>>> physical
>>>>>>>>>>>>> setup.
>>>>>>>>>>>>>
>>>>>>>>>>>>> This is _without_ core scheduling enabled, and also with smt=off.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Do you think it would be any different on xen-unstable? I cat
>>>>>>>>>>>>> try, but
>>>>>>>>>>>>> it isn't trivial in this setup, so I'd ask first.
>>>>>>>>>>>>>
>>>>>>>>>> Well, Juergen has fixed quite a few issues.
>>>>>>>>>>
>>>>>>>>>> Most of them where triggering with core-scheduling enabled, and I don't
>>>>>>>>>> recall any of them which looked similar or related to this.
>>>>>>>>>>
>>>>>>>>>> Still, it's possible that the same issue causes different symptoms, and
>>>>>>>>>> hence that maybe one of the patches would fix this too.
>>>>>>>>>
>>>>>>>>> I've tested on master (d094e95fb7c), and reproduced exactly the same crash
>>>>>>>>> (pasted below for the completeness).
>>>>>>>>> But there is more: additionally, in most (all?) cases after resume I've 
>>>>>>>>> got
>>>>>>>>> soft lockup in Linux dom0 in smp_call_function_single() - see below. It
>>>>>>>>> didn't happened before and the only change was Xen 4.13 -> master.
>>>>>>>>>
>>>>>>>>> Xen crash:
>>>>>>>>>
>>>>>>>>> (XEN) Assertion 'c2rqd(sched_unit_master(unit)) == svc->rqd' failed at 
>>>>>>>>> credit2.c:2133
>>>>>>>>
>>>>>>>> Juergen, any idea about this one? This is also happening on the current
>>>>>>>> stable-4.14 (28855ebcdbfa).
>>>>>>>>
>>>>>>>
>>>>>>> Oh, sorry I didn't come back to this issue.
>>>>>>>
>>>>>>> I suspect this is related to stop_machine_run() being called during
>>>>>>> suspend(), as I'm seeing very sporadic issues when offlining and then
>>>>>>> onlining cpus with core scheduling being active (it seems as if the
>>>>>>> dom0 vcpu doing the cpu online activity sometimes is using an old
>>>>>>> vcpu state).
>>>>>>
>>>>>> Note this is default Xen 4.14 start, so core scheduling is _not_ active:
>>>>>
>>>>> The similarity in the two failure cases is that multiple cpus are
>>>>> affected by the operations during stop_machine_run().
>>>>>
>>>>>>
>>>>>>       (XEN) Brought up 2 CPUs
>>>>>>       (XEN) Scheduling granularity: cpu, 1 CPU per sched-resource
>>>>>>       (XEN) Adding cpu 0 to runqueue 0
>>>>>>       (XEN)  First cpu on runqueue, activating
>>>>>>       (XEN) Adding cpu 1 to runqueue 1
>>>>>>       (XEN)  First cpu on runqueue, activating
>>>>>>
>>>>>>> I wasn't able to catch the real problem despite of having tried lots
>>>>>>> of approaches using debug patches.
>>>>>>>
>>>>>>> Recently I suspected the whole problem could be somehow related to
>>>>>>> RCU handling, as stop_machine_run() is relying on tasklets which are
>>>>>>> executing in idle context, and RCU handling is done in idle context,
>>>>>>> too. So there might be some kind of use after free scenario in case
>>>>>>> some memory is freed via RCU despite it still being used by a tasklet.
>>>>>>
>>>>>> That sounds plausible, even though I don't really know this area of Xen.
>>>>>>
>>>>>>> I "just" need to find some time to verify this suspicion. Any help doing
>>>>>>> this would be appreciated. :-)
>>>>>>
>>>>>> I do have a setup where I can easily-ish reproduce the issue. If there
>>>>>> is some debug patch you'd like me to try, I can do that.
>>>>>
>>>>> Thanks. I might come back to that offer as you are seeing a crash which
>>>>> will be much easier to analyze. Catching my error case is much harder as
>>>>> it surfaces some time after the real problem in a non destructive way
>>>>> (usually I'm seeing a failure to load a library in the program which
>>>>> just did its job via exactly the library claiming not being loadable).
>>>>
>>>> Hi,
>>>>
>>>> I'm resurrecting this thread as it was recently mentioned elsewhere. I
>>>> can still reproduce the issue on the recent staging branch (9dc687f155).
>>>>
>>>> It fails after the first resume (not always, but frequent enough to
>>>> debug it). At least one guest needs to be running - with just (PV) dom0
>>>> the crash doesn't happen (at least for the ~8 times in a row I tried).
>>>> If the first resume works, the second (almost?) always will fail but
>>>> with a different symptoms - dom0 kernel lockups (at least some of its
>>>> vcpus). I haven't debugged this one yet at all.
>>>>
>>>> Any help will be appreciated, I can apply some debug patches, change
>>>> configuration etc.
>>>
>>> This still happens on 4.14.3. Maybe it is related to freeing percpu
>>> areas, as it caused other issues with suspend too? Just a thought...
>>
>> I have reproduced this on current staging(*). And I can reproduce it
>> reliably. And also, I got (I believe) closely related crash with credit1
>> scheduler.
>>
>> (*) It isn't plain staging, it's one with my xhci console patches on
>> top, including attempt to make it survive S3. I believe the only
>> relevant part there is sticking set_timer() into console resume path (or
>> just having a timer with rather short delay registered). The actual tree
>> at https://github.com/marmarek/xen/tree/master-xue2-debug, including
>> quite a lot of debug prints and debug hacks.
>>
>> Specific crash with credit2:
>>
>>      (XEN) Assertion 'sched_unit_master(currunit) == cpu' failed at 
>> common/sched/credit.c:928
>>      (XEN) ----[ Xen-4.17-unstable  x86_64  debug=y  Tainted:   C    ]----
>>      (XEN) CPU:    0
>>      (XEN) RIP:    e008:[<ffff82d0402434bf>] credit.c#csched_tick+0x2d4/0x494
>>      (XEN) RFLAGS: 0000000000010202   CONTEXT: hypervisor (d0v4)
>>      (XEN) rax: ffff82d0405c4298   rbx: 0000000000000002   rcx: 0000000000000002
>>      (XEN) rdx: ffff8302517f64d0   rsi: ffff8302515c0fc0   rdi: 0000000000000002
>>      (XEN) rbp: ffff830256227e38   rsp: ffff830256227de0   r8:  0000000000000004
>>      (XEN) r9:  ffff8302517ac820   r10: ffff830251745068   r11: 00000088cb734887
>>      (XEN) r12: ffff83025174de50   r13: ffff8302515c0fa0   r14: ffff83025174df40
>>      (XEN) r15: ffff8302515c0cc0   cr0: 0000000080050033   cr4: 0000000000372660
>>      (XEN) cr3: 00000001bacbd000   cr2: 000077e5ec02a318
>>      (XEN) fsb: 000077e5fe533700   gsb: ffff888255700000   gss: 0000000000000000
>>      (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
>>      (XEN) Xen code around <ffff82d0402434bf> (credit.c#csched_tick+0x2d4/0x494):
>>      (XEN)  01 00 00 e9 2a 01 00 00 <0f> 0b 0f 0b 0f 0b 48 8b 41 20 0f b7 00 
>> 89 45 cc
>>      (XEN) Xen stack trace from rsp=ffff830256227de0:
>>      (XEN)    ffff830256227fff 0000000000000000 0000000256227e10 ffff82d04035be90
>>      (XEN)    ffff830256227ef8 ffff830251745000 ffff82d0405c3280 ffff82d0402431eb
>>      (XEN)    0000000000000002 00000088c9ba9534 0000000000000000 ffff830256227e60
>>      (XEN)    ffff82d04022ee53 ffff82d0405c3280 ffff8302963e1320 ffff8302515c0fc0
>>      (XEN)    ffff830256227ea0 ffff82d04022f73f ffff830256227e80 ffff82d0405c9f00
>>      (XEN)    ffffffffffffffff ffff82d0405c9f00 ffff830256227fff 0000000000000000
>>      (XEN)    ffff830256227ed8 ffff82d04022d26c ffff830251745000 0000000000000000
>>      (XEN)    0000000000000000 ffff830256227fff 0000000000000000 ffff830256227ee8
>>      (XEN)    ffff82d04022d2ff 00007cfda9dd80e7 ffff82d0402f03c6 ffff88810c005c00
>>      (XEN)    0000000000000031 0000000000000100 00000000fffffe00 0000000000000031
>>      (XEN)    0000000000000031 ffffffff82d45d28 0000000000000e2e 0000000000000000
>>      (XEN)    0000000000000032 00000000ffffef31 0000000000000000 ffff88812244a700
>>      (XEN)    0000000000000005 ffff88812244a780 000000fa00000000 ffffffff818db55f
>>      (XEN)    000000000000e033 0000000000000246 ffffc900409b7c50 000000000000e02b
>>      (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
>>      (XEN)    0000e01000000000 ffff830251745000 0000000000000000 0000000000372660
>>      (XEN)    0000000000000000 800000025620b002 000e030300000001 0000000000000000
>>      (XEN) Xen call trace:
>>      (XEN)    [<ffff82d0402434bf>] R credit.c#csched_tick+0x2d4/0x494
>>      (XEN)    [<ffff82d04022ee53>] F timer.c#execute_timer+0x45/0x5c
>>      (XEN)    [<ffff82d04022f73f>] F timer.c#timer_softirq_action+0x71/0x278
>>      (XEN)    [<ffff82d04022d26c>] F softirq.c#__do_softirq+0x94/0xbe
>>      (XEN)    [<ffff82d04022d2ff>] F do_softirq+0x13/0x15
>>      (XEN)    [<ffff82d0402f03c6>] F x86_64/entry.S#process_softirqs+0x6/0x20
>>      (XEN)
>>      (XEN)
>>      (XEN) ****************************************
>>      (XEN) Panic on CPU 0:
>>      (XEN) Assertion 'sched_unit_master(currunit) == cpu' failed at 
>> common/sched/credit.c:928
>>      (XEN) ****************************************
>>
>> Specific crash with credit1:
>>
>>      (XEN) ASSERT FAIL! cpu 4, sched_unit_master 0, currunit id 4
>>      (XEN) Assertion 'sched_unit_master(currunit) == cpu' failed at 
>> common/sched/credit.c:936
>>      (XEN) ----[ Xen-4.17-unstable  x86_64  debug=y  Tainted:   C    ]----
>>      (XEN) CPU:    0
>>      (XEN) RIP:    e008:[<ffff82d04024355c>] credit.c#csched_tick+0x2f7/0x4b7
>>      (XEN) RFLAGS: 0000000000010202   CONTEXT: hypervisor (d0v4)
>>      (XEN) rax: ffff8302517f64d0   rbx: 0000000000000004   rcx: 0000000000000000
>>      (XEN) rdx: ffff830256227fff   rsi: 000000000000000a   rdi: ffff82d0404786b8
>>      (XEN) rbp: ffff830256227e38   rsp: ffff830256227de0   r8:  0000000000000087
>>      (XEN) r9:  0000000000000001   r10: 0000000000000001   r11: 0000000000000004
>>      (XEN) r12: ffff83025174de50   r13: ffff830251778100   r14: ffff83025174df40
>>      (XEN) r15: ffff8302515c0cc0   cr0: 0000000080050033   cr4: 0000000000372660
>>      (XEN) cr3: 000000024ac10000   cr2: 0000000000000000
>>      (XEN) fsb: 0000000000000000   gsb: ffff888255700000   gss: 0000000000000000
>>      (XEN) ds: 002b   es: 002b   fs: 0000   gs: 0000   ss: 0000   cs: e008
>>      (XEN) Xen code around <ffff82d04024355c> (credit.c#csched_tick+0x2f7/0x4b7):
>>      (XEN)  58 70 0f 84 bf fe ff ff <0f> 0b 0f 0b 0f 0b 48 8b 40 20 0f b7 00 
>> 89 45 cc
>>      (XEN) Xen stack trace from rsp=ffff830256227de0:
>>      (XEN)    000000000046692a ffff8302515c0bc0 000000044025ae27 002191c0517f64e8
>>      (XEN)    ffff82d0405c33e0 ffff8302517c8590 ffff82d0405c3280 ffff82d040243265
>>      (XEN)    0000000000000004 00000026d27a80a7 0000000000000000 ffff830256227e60
>>      (XEN)    ffff82d04022ee94 ffff82d0405c3280 ffff8302517e7df0 ffff830251778120
>>      (XEN)    ffff830256227ea0 ffff82d04022f77a ffff830256227e80 ffff82d0405c9f00
>>      (XEN)    ffffffffffffffff ffff82d0405c9f00 ffff830256227fff 0000000000000000
>>      (XEN)    ffff830256227ed8 ffff82d04022d26c ffff830251745000 0000000000000000
>>      (XEN)    0000000000000000 ffff830256227fff 0000000000000000 ffff830256227ee8
>>      (XEN)    ffff82d04022d2ff 00007cfda9dd80e7 ffff82d0402f0496 0000000000000001
>>      (XEN)    ffffc90040127ddc ffffc90040127dd0 ffffc90040127dd4 ffffc90040127dd8
>>      (XEN)    00000000756e6547 0000000000000000 0000000000000000 ffffc90040127ddc
>>      (XEN)    ffffc90040127dd0 000000000000000d 000000006c65746e 0000000049656e69
>>      (XEN)    ffffc90040127dd4 ffffc90040127dd8 000000fa00000000 ffffffff8101c5a4
>>      (XEN)    000000000000e033 0000000000000293 ffffc90040127db8 000000000000e02b
>>      (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
>>      (XEN)    0000e01000000000 ffff830251745000 0000000000000000 0000000000372660
>>      (XEN)    0000000000000000 800000025620b002 000e030300000001 0000000000000000
>>      (XEN) Xen call trace:
>>      (XEN)    [<ffff82d04024355c>] R credit.c#csched_tick+0x2f7/0x4b7
>>      (XEN)    [<ffff82d04022ee94>] F timer.c#execute_timer+0x45/0x5c
>>      (XEN)    [<ffff82d04022f77a>] F timer.c#timer_softirq_action+0x71/0x278
>>      (XEN)    [<ffff82d04022d26c>] F softirq.c#__do_softirq+0x94/0xbe
>>      (XEN)    [<ffff82d04022d2ff>] F do_softirq+0x13/0x15
>>      (XEN)    [<ffff82d0402f0496>] F x86_64/entry.S#process_softirqs+0x6/0x20
>>      (XEN)
>>      (XEN) ****************************************
>>      (XEN) Panic on CPU 0:
>>      (XEN) Assertion 'sched_unit_master(currunit) == cpu' failed at 
>> common/sched/credit.c:936
>>      (XEN) ****************************************
>>
>> I'm not sure about the nature of the credit2 crash, but for credit1 it
>> looks much simpler to understand. Credit1 registers a timer for each pcpu
>> (in init_pdata()). Then, the timer function
>> (csched_tick()->csched_unit_acct()) asserts that it was really called on
>> the correct CPU. The issue is that the suspend path migrates all timers to
>> CPU0 (timer.c:cpu_callback->migrate_timers_from_cpu()), and the restore
>> path doesn't migrate them back. All this while not unregistering pCPUs from
>> the scheduler (sched_deinit_pdata() / init_pdata() is not called -
>> confirmed via debug prints).
>>
>> I'm not exactly sure if that's the same issue that applies to credit2,
>> but I think it's likely. A naive fix could be unregistering all pCPUs
>> from the scheduler (and registering them back on resume), but this may
>> mess up CPU pools, and probably a few other things. Any better ideas? I
>> can test various patches, provide more debug info etc. - it should be
>> rather straightforward given I have a reliable reproducer now.
> 
> Unregistering the cpus from the scheduler would break core scheduling.
> 
> I'll have a look if the timers can just be deactivated in order not to
> have them migrated to cpu0.

Could you test the attached patch, please?


Juergen

[-- Attachment #1.1.2: 0001-xen-timer-don-t-migrate-timers-away-from-cpus-during.patch --]
[-- Type: text/x-patch, Size: 2238 bytes --]

From a2c224bee53e2216d41b34518bb2d11b64d63b6a Mon Sep 17 00:00:00 2001
From: Juergen Gross <jgross@suse.com>
To: xen-devel@lists.xenproject.org
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <george.dunlap@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Julien Grall <julien@xen.org>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Wei Liu <wl@xen.org>
Date: Tue, 6 Sep 2022 13:39:48 +0200
Subject: [PATCH] xen/timer: don't migrate timers away from cpus during suspend

During a suspend/resume cycle timers on all cpus but cpu 0 will be
migrated to cpu 0, as the other cpus are taken down.

This is problematic in case such a timer is related to a specific vcpu,
as the vcpus are not migrated to another cpu during suspend (migrating
them would break cpupools and core scheduling).

In order to avoid the problems just try to keep the timers on their
cpus. Only migrate them away in case resume failed. Doing so isn't
problematic, as any vcpu on a cpu not coming back to life would be
migrated away, too.

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 xen/common/timer.c | 18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/xen/common/timer.c b/xen/common/timer.c
index 9b5016d5ed..6b5473e5f1 100644
--- a/xen/common/timer.c
+++ b/xen/common/timer.c
@@ -637,6 +637,13 @@ static void free_percpu_timers(unsigned int cpu)
         ASSERT(ts->heap == dummy_heap);
 }
 
+static void deinit_timers(unsigned int cpu)
+{
+    migrate_timers_from_cpu(cpu);
+    if ( !park_offline_cpus )
+        free_percpu_timers(cpu);
+}
+
 static int cf_check cpu_callback(
     struct notifier_block *nfb, unsigned long action, void *hcpu)
 {
@@ -655,13 +662,14 @@ static int cf_check cpu_callback(
         }
         break;
 
-    case CPU_UP_CANCELED:
     case CPU_DEAD:
-    case CPU_RESUME_FAILED:
-        migrate_timers_from_cpu(cpu);
+        if ( system_state != SYS_STATE_suspend )
+            deinit_timers(cpu);
+        break;
 
-        if ( !park_offline_cpus && system_state != SYS_STATE_suspend )
-            free_percpu_timers(cpu);
+    case CPU_UP_CANCELED:
+    case CPU_RESUME_FAILED:
+        deinit_timers(cpu);
         break;
 
     case CPU_REMOVE:
-- 
2.35.3


[-- Attachment #1.1.3: OpenPGP public key --]
[-- Type: application/pgp-keys, Size: 3149 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 495 bytes --]

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [Xen-devel] Xen crash after S3 suspend - Xen 4.13 and newer
  2022-09-06 11:46                         ` Juergen Gross
@ 2022-09-06 12:35                           ` Marek Marczykowski-Górecki
  2022-09-07 12:21                             ` Dario Faggioli
  0 siblings, 1 reply; 26+ messages in thread
From: Marek Marczykowski-Górecki @ 2022-09-06 12:35 UTC (permalink / raw)
  To: Juergen Gross
  Cc: Juergen Gross, Dario Faggioli, Jan Beulich, Andrew Cooper, xen-devel

[-- Attachment #1: Type: text/plain, Size: 20183 bytes --]

On Tue, Sep 06, 2022 at 01:46:55PM +0200, Juergen Gross wrote:
> On 22.08.22 17:34, Juergen Gross wrote:
> > On 21.08.22 18:14, Marek Marczykowski-Górecki wrote:
> > > On Sat, Oct 09, 2021 at 06:28:17PM +0200, Marek Marczykowski-Górecki wrote:
> > > > On Sun, Jan 31, 2021 at 03:15:30AM +0100, Marek Marczykowski-Górecki wrote:
> > > > > On Tue, Sep 29, 2020 at 05:27:48PM +0200, Jürgen Groß wrote:
> > > > > > On 29.09.20 17:16, Marek Marczykowski-Górecki wrote:
> > > > > > > On Tue, Sep 29, 2020 at 05:07:11PM +0200, Jürgen Groß wrote:
> > > > > > > > On 29.09.20 16:27, Marek Marczykowski-Górecki wrote:
> > > > > > > > > On Mon, Mar 23, 2020 at 01:09:49AM +0100, Marek Marczykowski-Górecki wrote:
> > > > > > > > > > On Thu, Mar 19, 2020 at 01:28:10AM +0100, Dario Faggioli wrote:
> > > > > > > > > > > [Adding Juergen]
> > > > > > > > > > > 
> > > > > > > > > > > On Wed, 2020-03-18 at 23:10 +0100, Marek Marczykowski-Górecki wrote:
> > > > > > > > > > > > On Wed, Mar 18, 2020 at 02:50:52PM +0000, Andrew Cooper wrote:
> > > > > > > > > > > > > On 18/03/2020 14:16, Marek Marczykowski-Górecki wrote:
> > > > > > > > > > > > > > Hi,
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > In my test setup (inside KVM with nested virt enabled), I rather
> > > > > > > > > > > > > > frequently get Xen crash on resume from S3. Full message below.
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > This is Xen 4.13.0, with some patches, including "sched: fix
> > > > > > > > > > > > > > resuming
> > > > > > > > > > > > > > from S3 with smt=0".
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > Contrary to the previous issue, this one does not happen always -
> > > > > > > > > > > > > > I
> > > > > > > > > > > > > > would say in about 40% cases on this setup, but very rarely on
> > > > > > > > > > > > > > physical
> > > > > > > > > > > > > > setup.
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > This is _without_ core scheduling enabled, and also with smt=off.
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > Do you think it would be any different on xen-unstable? I cat
> > > > > > > > > > > > > > try, but
> > > > > > > > > > > > > > it isn't trivial in this setup, so I'd ask first.
> > > > > > > > > > > > > > 
> > > > > > > > > > > Well, Juergen has fixed quite a few issues.
> > > > > > > > > > > 
> > > > > > > > > > > Most of them where triggering with core-scheduling enabled, and I don't
> > > > > > > > > > > recall any of them which looked similar or related to this.
> > > > > > > > > > > 
> > > > > > > > > > > Still, it's possible that the same issue causes different symptoms, and
> > > > > > > > > > > hence that maybe one of the patches would fix this too.
> > > > > > > > > > 
> > > > > > > > > > I've tested on master (d094e95fb7c), and reproduced exactly the same crash
> > > > > > > > > > (pasted below for the completeness).
> > > > > > > > > > But there is more: additionally, in most
> > > > > > > > > > (all?) cases after resume I've got
> > > > > > > > > > soft lockup in Linux dom0 in smp_call_function_single() - see below. It
> > > > > > > > > > didn't happened before and the only change was Xen 4.13 -> master.
> > > > > > > > > > 
> > > > > > > > > > Xen crash:
> > > > > > > > > > 
> > > > > > > > > > (XEN) Assertion
> > > > > > > > > > 'c2rqd(sched_unit_master(unit)) ==
> > > > > > > > > > svc->rqd' failed at credit2.c:2133
> > > > > > > > > 
> > > > > > > > > Juergen, any idea about this one? This is also happening on the current
> > > > > > > > > stable-4.14 (28855ebcdbfa).
> > > > > > > > > 
> > > > > > > > 
> > > > > > > > Oh, sorry I didn't come back to this issue.
> > > > > > > > 
> > > > > > > > I suspect this is related to stop_machine_run() being called during
> > > > > > > > suspend(), as I'm seeing very sporadic issues when offlining and then
> > > > > > > > onlining cpus with core scheduling being active (it seems as if the
> > > > > > > > dom0 vcpu doing the cpu online activity sometimes is using an old
> > > > > > > > vcpu state).
> > > > > > > 
> > > > > > > Note this is default Xen 4.14 start, so core scheduling is _not_ active:
> > > > > > 
> > > > > > The similarity in the two failure cases is that multiple cpus are
> > > > > > affected by the operations during stop_machine_run().
> > > > > > 
> > > > > > > 
> > > > > > >       (XEN) Brought up 2 CPUs
> > > > > > >       (XEN) Scheduling granularity: cpu, 1 CPU per sched-resource
> > > > > > >       (XEN) Adding cpu 0 to runqueue 0
> > > > > > >       (XEN)  First cpu on runqueue, activating
> > > > > > >       (XEN) Adding cpu 1 to runqueue 1
> > > > > > >       (XEN)  First cpu on runqueue, activating
> > > > > > > 
> > > > > > > > I wasn't able to catch the real problem despite of having tried lots
> > > > > > > > of approaches using debug patches.
> > > > > > > > 
> > > > > > > > Recently I suspected the whole problem could be somehow related to
> > > > > > > > RCU handling, as stop_machine_run() is relying on tasklets which are
> > > > > > > > executing in idle context, and RCU handling is done in idle context,
> > > > > > > > too. So there might be some kind of use after free scenario in case
> > > > > > > > some memory is freed via RCU despite it still being used by a tasklet.
> > > > > > > 
> > > > > > > That sounds plausible, even though I don't really know this area of Xen.
> > > > > > > 
> > > > > > > > I "just" need to find some time to verify this suspicion. Any help doing
> > > > > > > > this would be appreciated. :-)
> > > > > > > 
> > > > > > > I do have a setup where I can easily-ish reproduce the issue. If there
> > > > > > > is some debug patch you'd like me to try, I can do that.
> > > > > > 
> > > > > > Thanks. I might come back to that offer as you are seeing a crash which
> > > > > > will be much easier to analyze. Catching my error case is much harder as
> > > > > > it surfaces some time after the real problem in a non destructive way
> > > > > > (usually I'm seeing a failure to load a library in the program which
> > > > > > just did its job via exactly the library claiming not being loadable).
> > > > > 
> > > > > Hi,
> > > > > 
> > > > > I'm resurrecting this thread as it was recently mentioned elsewhere. I
> > > > > can still reproduce the issue on the recent staging branch (9dc687f155).
> > > > > 
> > > > > It fails after the first resume (not always, but frequent enough to
> > > > > debug it). At least one guest needs to be running - with just (PV) dom0
> > > > > the crash doesn't happen (at least for the ~8 times in a row I tried).
> > > > > If the first resume works, the second (almost?) always will fail but
> > > > > with a different symptoms - dom0 kernel lockups (at least some of its
> > > > > vcpus). I haven't debugged this one yet at all.
> > > > > 
> > > > > Any help will be appreciated, I can apply some debug patches, change
> > > > > configuration etc.
> > > > 
> > > > This still happens on 4.14.3. Maybe it is related to freeing percpu
> > > > areas, as it caused other issues with suspend too? Just a thought...
> > > 
> > > I have reproduced this on current staging(*). And I can reproduce it
> > > reliably. And also, I got (I believe) closely related crash with credit1
> > > scheduler.
> > > 
> > > (*) It isn't plain staging, it's one with my xhci console patches on
> > > top, including attempt to make it survive S3. I believe the only
> > > relevant part there is sticking set_timer() into console resume path (or
> > > just having a timer with rather short delay registered). The actual tree
> > > at https://github.com/marmarek/xen/tree/master-xue2-debug, including
> > > quite a lot of debug prints and debug hacks.
> > > 
> > > Specific crash with credit2:
> > > 
> > >      (XEN) Assertion 'sched_unit_master(currunit) == cpu' failed at
> > > common/sched/credit.c:928
> > >      (XEN) ----[ Xen-4.17-unstable  x86_64  debug=y  Tainted:   C    ]----
> > >      (XEN) CPU:    0
> > >      (XEN) RIP:    e008:[<ffff82d0402434bf>] credit.c#csched_tick+0x2d4/0x494
> > >      (XEN) RFLAGS: 0000000000010202   CONTEXT: hypervisor (d0v4)
> > >      (XEN) rax: ffff82d0405c4298   rbx: 0000000000000002   rcx: 0000000000000002
> > >      (XEN) rdx: ffff8302517f64d0   rsi: ffff8302515c0fc0   rdi: 0000000000000002
> > >      (XEN) rbp: ffff830256227e38   rsp: ffff830256227de0   r8:  0000000000000004
> > >      (XEN) r9:  ffff8302517ac820   r10: ffff830251745068   r11: 00000088cb734887
> > >      (XEN) r12: ffff83025174de50   r13: ffff8302515c0fa0   r14: ffff83025174df40
> > >      (XEN) r15: ffff8302515c0cc0   cr0: 0000000080050033   cr4: 0000000000372660
> > >      (XEN) cr3: 00000001bacbd000   cr2: 000077e5ec02a318
> > >      (XEN) fsb: 000077e5fe533700   gsb: ffff888255700000   gss: 0000000000000000
> > >      (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
> > >      (XEN) Xen code around <ffff82d0402434bf> (credit.c#csched_tick+0x2d4/0x494):
> > >      (XEN)  01 00 00 e9 2a 01 00 00 <0f> 0b 0f 0b 0f 0b 48 8b 41 20
> > > 0f b7 00 89 45 cc
> > >      (XEN) Xen stack trace from rsp=ffff830256227de0:
> > >      (XEN)    ffff830256227fff 0000000000000000 0000000256227e10 ffff82d04035be90
> > >      (XEN)    ffff830256227ef8 ffff830251745000 ffff82d0405c3280 ffff82d0402431eb
> > >      (XEN)    0000000000000002 00000088c9ba9534 0000000000000000 ffff830256227e60
> > >      (XEN)    ffff82d04022ee53 ffff82d0405c3280 ffff8302963e1320 ffff8302515c0fc0
> > >      (XEN)    ffff830256227ea0 ffff82d04022f73f ffff830256227e80 ffff82d0405c9f00
> > >      (XEN)    ffffffffffffffff ffff82d0405c9f00 ffff830256227fff 0000000000000000
> > >      (XEN)    ffff830256227ed8 ffff82d04022d26c ffff830251745000 0000000000000000
> > >      (XEN)    0000000000000000 ffff830256227fff 0000000000000000 ffff830256227ee8
> > >      (XEN)    ffff82d04022d2ff 00007cfda9dd80e7 ffff82d0402f03c6 ffff88810c005c00
> > >      (XEN)    0000000000000031 0000000000000100 00000000fffffe00 0000000000000031
> > >      (XEN)    0000000000000031 ffffffff82d45d28 0000000000000e2e 0000000000000000
> > >      (XEN)    0000000000000032 00000000ffffef31 0000000000000000 ffff88812244a700
> > >      (XEN)    0000000000000005 ffff88812244a780 000000fa00000000 ffffffff818db55f
> > >      (XEN)    000000000000e033 0000000000000246 ffffc900409b7c50 000000000000e02b
> > >      (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > >      (XEN)    0000e01000000000 ffff830251745000 0000000000000000 0000000000372660
> > >      (XEN)    0000000000000000 800000025620b002 000e030300000001 0000000000000000
> > >      (XEN) Xen call trace:
> > >      (XEN)    [<ffff82d0402434bf>] R credit.c#csched_tick+0x2d4/0x494
> > >      (XEN)    [<ffff82d04022ee53>] F timer.c#execute_timer+0x45/0x5c
> > >      (XEN)    [<ffff82d04022f73f>] F timer.c#timer_softirq_action+0x71/0x278
> > >      (XEN)    [<ffff82d04022d26c>] F softirq.c#__do_softirq+0x94/0xbe
> > >      (XEN)    [<ffff82d04022d2ff>] F do_softirq+0x13/0x15
> > >      (XEN)    [<ffff82d0402f03c6>] F x86_64/entry.S#process_softirqs+0x6/0x20
> > >      (XEN)
> > >      (XEN)
> > >      (XEN) ****************************************
> > >      (XEN) Panic on CPU 0:
> > >      (XEN) Assertion 'sched_unit_master(currunit) == cpu' failed at
> > > common/sched/credit.c:928
> > >      (XEN) ****************************************
> > > 
> > > Specific crash with credit1:
> > > 
> > >      (XEN) ASSERT FAIL! cpu 4, sched_unit_master 0, currunit id 4
> > >      (XEN) Assertion 'sched_unit_master(currunit) == cpu' failed at
> > > common/sched/credit.c:936
> > >      (XEN) ----[ Xen-4.17-unstable  x86_64  debug=y  Tainted:   C    ]----
> > >      (XEN) CPU:    0
> > >      (XEN) RIP:    e008:[<ffff82d04024355c>] credit.c#csched_tick+0x2f7/0x4b7
> > >      (XEN) RFLAGS: 0000000000010202   CONTEXT: hypervisor (d0v4)
> > >      (XEN) rax: ffff8302517f64d0   rbx: 0000000000000004   rcx: 0000000000000000
> > >      (XEN) rdx: ffff830256227fff   rsi: 000000000000000a   rdi: ffff82d0404786b8
> > >      (XEN) rbp: ffff830256227e38   rsp: ffff830256227de0   r8:  0000000000000087
> > >      (XEN) r9:  0000000000000001   r10: 0000000000000001   r11: 0000000000000004
> > >      (XEN) r12: ffff83025174de50   r13: ffff830251778100   r14: ffff83025174df40
> > >      (XEN) r15: ffff8302515c0cc0   cr0: 0000000080050033   cr4: 0000000000372660
> > >      (XEN) cr3: 000000024ac10000   cr2: 0000000000000000
> > >      (XEN) fsb: 0000000000000000   gsb: ffff888255700000   gss: 0000000000000000
> > >      (XEN) ds: 002b   es: 002b   fs: 0000   gs: 0000   ss: 0000   cs: e008
> > >      (XEN) Xen code around <ffff82d04024355c> (credit.c#csched_tick+0x2f7/0x4b7):
> > >      (XEN)  58 70 0f 84 bf fe ff ff <0f> 0b 0f 0b 0f 0b 48 8b 40 20
> > > 0f b7 00 89 45 cc
> > >      (XEN) Xen stack trace from rsp=ffff830256227de0:
> > >      (XEN)    000000000046692a ffff8302515c0bc0 000000044025ae27 002191c0517f64e8
> > >      (XEN)    ffff82d0405c33e0 ffff8302517c8590 ffff82d0405c3280 ffff82d040243265
> > >      (XEN)    0000000000000004 00000026d27a80a7 0000000000000000 ffff830256227e60
> > >      (XEN)    ffff82d04022ee94 ffff82d0405c3280 ffff8302517e7df0 ffff830251778120
> > >      (XEN)    ffff830256227ea0 ffff82d04022f77a ffff830256227e80 ffff82d0405c9f00
> > >      (XEN)    ffffffffffffffff ffff82d0405c9f00 ffff830256227fff 0000000000000000
> > >      (XEN)    ffff830256227ed8 ffff82d04022d26c ffff830251745000 0000000000000000
> > >      (XEN)    0000000000000000 ffff830256227fff 0000000000000000 ffff830256227ee8
> > >      (XEN)    ffff82d04022d2ff 00007cfda9dd80e7 ffff82d0402f0496 0000000000000001
> > >      (XEN)    ffffc90040127ddc ffffc90040127dd0 ffffc90040127dd4 ffffc90040127dd8
> > >      (XEN)    00000000756e6547 0000000000000000 0000000000000000 ffffc90040127ddc
> > >      (XEN)    ffffc90040127dd0 000000000000000d 000000006c65746e 0000000049656e69
> > >      (XEN)    ffffc90040127dd4 ffffc90040127dd8 000000fa00000000 ffffffff8101c5a4
> > >      (XEN)    000000000000e033 0000000000000293 ffffc90040127db8 000000000000e02b
> > >      (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > >      (XEN)    0000e01000000000 ffff830251745000 0000000000000000 0000000000372660
> > >      (XEN)    0000000000000000 800000025620b002 000e030300000001 0000000000000000
> > >      (XEN) Xen call trace:
> > >      (XEN)    [<ffff82d04024355c>] R credit.c#csched_tick+0x2f7/0x4b7
> > >      (XEN)    [<ffff82d04022ee94>] F timer.c#execute_timer+0x45/0x5c
> > >      (XEN)    [<ffff82d04022f77a>] F timer.c#timer_softirq_action+0x71/0x278
> > >      (XEN)    [<ffff82d04022d26c>] F softirq.c#__do_softirq+0x94/0xbe
> > >      (XEN)    [<ffff82d04022d2ff>] F do_softirq+0x13/0x15
> > >      (XEN)    [<ffff82d0402f0496>] F x86_64/entry.S#process_softirqs+0x6/0x20
> > >      (XEN)
> > >      (XEN) ****************************************
> > >      (XEN) Panic on CPU 0:
> > >      (XEN) Assertion 'sched_unit_master(currunit) == cpu' failed at
> > > common/sched/credit.c:936
> > >      (XEN) ****************************************
> > > 
> > > I'm not sure about the nature of the credit2 crash, but for credit1 it
> > > looks much simpler to understand. Credit1 registers a timer for each pcpu
> > > (in init_pdata()). Then, the timer function
> > > (csched_tick()->csched_unit_acct()) asserts that it was really called on
> > > the correct CPU. The issue is that the suspend path migrates all timers to
> > > CPU0 (timer.c:cpu_callback->migrate_timers_from_cpu()), and the restore
> > > path doesn't migrate them back. All this while not unregistering pCPUs from
> > > the scheduler (sched_deinit_pdata() / init_pdata() is not called -
> > > confirmed via debug prints).
> > > 
> > > I'm not exactly sure if that's the same issue that applies to credit2,
> > > but I think it's likely. A naive fix could be unregistering all pCPUs
> > > from the scheduler (and registering them back on resume), but this may
> > > mess up CPU pools, and probably a few other things. Any better ideas? I
> > > can test various patches, provide more debug info etc. - it should be
> > > rather straightforward given I have a reliable reproducer now.
> > 
> > Unregistering the cpus from the scheduler would break core scheduling.
> > 
> > I'll have a look if the timers can just be deactivated in order not to
> > have them migrated to cpu0.
> 
> Could you test the attached patch, please?

I did a test with only dom0 running, and it works now. It isn't a
comprehensive test, but just dom0 was enough to crash it before, and it
stays working now. Thanks!

Tested-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>

> 
> 
> Juergen

> From a2c224bee53e2216d41b34518bb2d11b64d63b6a Mon Sep 17 00:00:00 2001
> From: Juergen Gross <jgross@suse.com>
> To: xen-devel@lists.xenproject.org
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> Cc: George Dunlap <george.dunlap@citrix.com>
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: Julien Grall <julien@xen.org>
> Cc: Stefano Stabellini <sstabellini@kernel.org>
> Cc: Wei Liu <wl@xen.org>
> Date: Tue, 6 Sep 2022 13:39:48 +0200
> Subject: [PATCH] xen/timer: don't migrate timers away from cpus during suspend
> 
> During a suspend/resume cycle timers on all cpus but cpu 0 will be
> migrated to cpu 0, as the other cpus are taken down.
> 
> This is problematic in case such a timer is related to a specific vcpu,
> as the vcpus are not migrated to another cpu during suspend (migrating
> them would break cpupools and core scheduling).
> 
> In order to avoid the problems just try to keep the timers on their
> cpus. Only migrate them away in case resume failed. Doing so isn't
> problematic, as any vcpu on a cpu not coming back to life would be
> migrated away, too.
> 
> Signed-off-by: Juergen Gross <jgross@suse.com>
> ---
>  xen/common/timer.c | 18 +++++++++++++-----
>  1 file changed, 13 insertions(+), 5 deletions(-)
> 
> diff --git a/xen/common/timer.c b/xen/common/timer.c
> index 9b5016d5ed..6b5473e5f1 100644
> --- a/xen/common/timer.c
> +++ b/xen/common/timer.c
> @@ -637,6 +637,13 @@ static void free_percpu_timers(unsigned int cpu)
>          ASSERT(ts->heap == dummy_heap);
>  }
>  
> +static void deinit_timers(unsigned int cpu)
> +{
> +    migrate_timers_from_cpu(cpu);
> +    if ( !park_offline_cpus )
> +        free_percpu_timers(cpu);
> +}
> +
>  static int cf_check cpu_callback(
>      struct notifier_block *nfb, unsigned long action, void *hcpu)
>  {
> @@ -655,13 +662,14 @@ static int cf_check cpu_callback(
>          }
>          break;
>  
> -    case CPU_UP_CANCELED:
>      case CPU_DEAD:
> -    case CPU_RESUME_FAILED:
> -        migrate_timers_from_cpu(cpu);
> +        if ( system_state != SYS_STATE_suspend )
> +            deinit_timers(cpu);
> +        break;
>  
> -        if ( !park_offline_cpus && system_state != SYS_STATE_suspend )
> -            free_percpu_timers(cpu);
> +    case CPU_UP_CANCELED:
> +    case CPU_RESUME_FAILED:
> +        deinit_timers(cpu);
>          break;
>  
>      case CPU_REMOVE:
> -- 
> 2.35.3
> 






-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Xen-devel] Xen crash after S3 suspend - Xen 4.13 and newer
  2022-09-06 12:35                           ` Marek Marczykowski-Górecki
@ 2022-09-07 12:21                             ` Dario Faggioli
  2022-09-07 15:07                               ` marmarek
  0 siblings, 1 reply; 26+ messages in thread
From: Dario Faggioli @ 2022-09-07 12:21 UTC (permalink / raw)
  To: Juergen Gross, marmarek; +Cc: jgross, Jan Beulich, andrew.cooper3, xen-devel

[-- Attachment #1: Type: text/plain, Size: 836 bytes --]

On Tue, 2022-09-06 at 14:35 +0200, Marek Marczykowski-Górecki wrote:
> On Tue, Sep 06, 2022 at 01:46:55PM +0200, Juergen Gross wrote:
> > 
> > Could you test the attached patch, please?
> 
> I did a test with only dom0 running, and it works now. It isn't a
> comprehensive test, but just dom0 was enough to crash it before, and
> it
> stays working now.
>
That's very cool to hear! Thanks for testing and reporting back.

Just to be sure, did you check both Credit1 and Credit2 and do they
both work, with Juergen's patch?

Thanks again and Regards
-- 
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
-------------------------------------------------------------------
<<This happens because _I_ choose it to happen!>> (Raistlin Majere)

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Xen-devel] Xen crash after S3 suspend - Xen 4.13 and newer
  2022-09-07 12:21                             ` Dario Faggioli
@ 2022-09-07 15:07                               ` marmarek
  0 siblings, 0 replies; 26+ messages in thread
From: marmarek @ 2022-09-07 15:07 UTC (permalink / raw)
  To: Dario Faggioli
  Cc: Juergen Gross, jgross, Jan Beulich, andrew.cooper3, xen-devel

[-- Attachment #1: Type: text/plain, Size: 849 bytes --]

On Wed, Sep 07, 2022 at 12:21:12PM +0000, Dario Faggioli wrote:
> On Tue, 2022-09-06 at 14:35 +0200, Marek Marczykowski-Górecki wrote:
> > On Tue, Sep 06, 2022 at 01:46:55PM +0200, Juergen Gross wrote:
> > > 
> > > Could you test the attached patch, please?
> > 
> > I did a test with only dom0 running, and it works now. It isn't a
> > comprehensive test, but just dom0 was enough to crash it before, and
> > it
> > stays working now.
> >
> That's very cool to hear! Thanks for testing and reporting back.
> 
> Just to be sure, did you check both Credit1 and Credit2 and do they
> both work, with Juergen's patch?

The test above was with credit1. I did check credit2 later, and it
still crashes, unfortunately (Juergen already knows from our IRC chat).

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Xen-devel] Xen crash after S3 suspend - Xen 4.13 and newer
  2022-08-22 10:00                         ` Marek Marczykowski-Górecki
@ 2022-09-20 10:22                           ` Marek Marczykowski-Górecki
  2022-09-20 14:30                             ` Jan Beulich
                                               ` (2 more replies)
  0 siblings, 3 replies; 26+ messages in thread
From: Marek Marczykowski-Górecki @ 2022-09-20 10:22 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Dario Faggioli, Andrew Cooper, xen-devel,
	Jürgen Groß

[-- Attachment #1: Type: text/plain, Size: 10212 bytes --]

On Mon, Aug 22, 2022 at 12:00:27PM +0200, Marek Marczykowski-Górecki wrote:
> On Mon, Aug 22, 2022 at 11:53:50AM +0200, Jan Beulich wrote:
> > On 21.08.2022 18:14, Marek Marczykowski-Górecki wrote:
> > > On Sat, Oct 09, 2021 at 06:28:17PM +0200, Marek Marczykowski-Górecki wrote:
> > >> On Sun, Jan 31, 2021 at 03:15:30AM +0100, Marek Marczykowski-Górecki wrote:
> > >>> I'm resurrecting this thread as it was recently mentioned elsewhere. I
> > >>> can still reproduce the issue on the recent staging branch (9dc687f155).
> > >>>
> > >>> It fails after the first resume (not always, but frequent enough to
> > >>> debug it). At least one guest needs to be running - with just (PV) dom0
> > >>> the crash doesn't happen (at least for the ~8 times in a row I tried).
> > >>> If the first resume works, the second (almost?) always will fail but
> > >>> with a different symptoms - dom0 kernel lockups (at least some of its
> > >>> vcpus). I haven't debugged this one yet at all.
> > >>>
> > >>> Any help will be appreciated, I can apply some debug patches, change
> > >>> configuration etc.
> > >>
> > >> This still happens on 4.14.3. Maybe it is related to freeing percpu
> > >> areas, as it caused other issues with suspend too? Just a thought...
> > > 
> > > I have reproduced this on current staging(*). And I can reproduce it
> > > reliably. And also, I got (I believe) closely related crash with credit1
> > > scheduler.
> > > 
> > > (*) It isn't plain staging, it's one with my xhci console patches on
> > > top, including attempt to make it survive S3. I believe the only
> > > relevant part there is sticking set_timer() into console resume path (or
> > > just having a timer with rather short delay registered). The actual tree
> > > at https://github.com/marmarek/xen/tree/master-xue2-debug, including
> > > quite a lot of debug prints and debug hacks.
> > > 
> > > Specific crash with credit2:
> 
> (XEN) Assertion 'c2rqd(sched_unit_master(unit)) == svc->rqd' failed at common/sched/credit2.c:2274
> (XEN) ----[ Xen-4.17-unstable  x86_64  debug=y  Tainted:   C    ]----
> (XEN) CPU:    10
> (XEN) RIP:    e008:[<ffff82d040247a4d>] credit2.c#csched2_unit_wake+0x152/0x154
> (XEN) RFLAGS: 0000000000010083   CONTEXT: hypervisor (d0v0)
> (XEN) rax: ffff830251778230   rbx: ffff830251768cb0   rcx: 00000032111d6000
> (XEN) rdx: ffff8302515c1eb0   rsi: 0000000000000006   rdi: ffff830251769000
> (XEN) rbp: ffff8302515cfd90   rsp: ffff8302515cfd70   r8:  ffff830251769000
> (XEN) r9:  0000000000000000   r10: 0000000000000000   r11: 0000000000000000
> (XEN) r12: ffff830251768dd0   r13: ffff8302515c1d00   r14: 0000000000000006
> (XEN) r15: ffff82d0405ddb40   cr0: 0000000080050033   cr4: 0000000000372660
> (XEN) cr3: 000000022f2a1000   cr2: ffff8881012738e0
> (XEN) fsb: 0000744bf6a0db80   gsb: ffff888255600000   gss: 0000000000000000
> (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e010   cs: e008
> (XEN) Xen code around <ffff82d040247a4d> (credit2.c#csched2_unit_wake+0x152/0x154):
> (XEN)  df e8 6f bf ff ff eb ad <0f> 0b f3 0f 1e fa 55 48 89 e5 41 57 41 56 41 55
> (XEN) Xen stack trace from rsp=ffff8302515cfd70:
> (XEN)    ffff83025174b000 ffff830251768cb0 ffff830251778270 ffff82d0405c4298
> (XEN)    ffff8302515cfdd8 ffff82d04024fcb8 0000000000000202 ffff830251778270
> (XEN)    ffff83025174b000 0000000000000001 ffff830251769018 0000000000000000
> (XEN)    0000000000000000 ffff8302515cfe48 ffff82d04020a8c9 ffff8882556aedc0
> (XEN)    0000000000000003 00001910537e623e 0000000b988f78a6 0000000059d4a716
> (XEN)    00001901f30fa41e 0000000217f96af6 0000000000000000 ffff83025174b000
> (XEN)    ffff830251756000 0000000000000002 0000000000000001 ffff8302515cfe70
> (XEN)    ffff82d0402f7968 ffff830251756000 ffff8302515cfef8 0000000000000018
> (XEN)    ffff8302515cfee8 ffff82d0402ec6de 0000000000000000 ffffffff82f157e0
> (XEN)    0000000000000000 0000000000000000 ffff8302515cfef8 0000000000000000
> (XEN)    0000000000000000 ffff8302515cffff ffff830251756000 0000000000000000
> (XEN)    0000000000000000 0000000000000000 0000000000000000 00007cfdaea300e7
> (XEN)    ffff82d0402012bd 0000000000000000 ffffffff82c51120 ffff88810036cf00
> (XEN)    0000000000000002 000000000001e120 0000000000000002 0000000000000246
> (XEN)    ffffffff82f157e0 0000000000000001 0000000000000000 0000000000000018
> (XEN)    ffffffff81e4a30a 0000000000000000 0000000000000002 0000000000000001
> (XEN)    0000010000000000 ffffffff81e4a30a 000000000000e033 0000000000000246
> (XEN)    ffffc9004aef7c18 000000000000e02b fb5ee398d214b10c eb5ef398c214a10c
> (XEN)    eb56f390c21ca104 ebd6f310c29ca184 0000e0100000000a ffff830251756000
> (XEN)    0000003211016000 0000000000372660 0000000000000000 80000002963e1002
> (XEN) Xen call trace:
> (XEN)    [<ffff82d040247a4d>] R credit2.c#csched2_unit_wake+0x152/0x154
> (XEN)    [<ffff82d04024fcb8>] F vcpu_wake+0xfd/0x267
> (XEN)    [<ffff82d04020a8c9>] F common_vcpu_op+0x178/0x5d1
> (XEN)    [<ffff82d0402f7968>] F do_vcpu_op+0x69/0x226
> (XEN)    [<ffff82d0402ec6de>] F pv_hypercall+0x575/0x657
> (XEN)    [<ffff82d0402012bd>] F lstar_enter+0x13d/0x150
> (XEN)
> (XEN)
> (XEN) ****************************************
> (XEN) Panic on CPU 10:
> (XEN) Assertion 'c2rqd(sched_unit_master(unit)) == svc->rqd' failed at common/sched/credit2.c:2274
> (XEN) ****************************************

Ok, I think I figured it out!

I added a function that verifies the run queues of all the sched units, and
called it basically every other line on the resume path. The debug
function (if anybody is interested):

    void verify_sched_units(void)
    {   
        struct domain *d;
        const struct sched_unit *unit;
        
        for_each_domain ( d )
        {
            for_each_sched_unit ( d, unit )
            {
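                /* The runqueue recorded in the unit's credit2 data must match
                 * the runqueue of the pCPU the unit is currently assigned to. */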
                if ( c2rqd(sched_unit_master(unit)) != csched2_unit(unit)->rqd )
                {
                    printk(XENLOG_WARNING "d%d sched unit %d: rq=%d, unit master %d, rq=%d\n",
                            d->domain_id, unit->unit_id,
                            csched2_unit(unit)->rqd ? csched2_unit(unit)->rqd->id : -1,
                            sched_unit_master(unit),
                            c2rqd(sched_unit_master(unit))->id);
                    WARN_ON(1);
                }
            }
        }
    }

It appears that restore_vcpu_affinity() is responsible, specifically
this part:

1216         /*
1217          * Re-assign the initial processor as after resume we have no
1218          * guarantee the old processor has come back to life again.
1219          *
1220          * Therefore, here, before actually unpausing the domains, we should
1221          * set v->processor of each of their vCPUs to something that will
1222          * make sense for the scheduler of the cpupool in which they are in.
1223          */
...
1249         res = get_sched_res(cpumask_any(cpumask_scratch_cpu(cpu)));
1250         sched_set_res(unit, res);
1251 
1252         spin_unlock_irq(lock);
1253 
1254         /* v->processor might have changed, so reacquire the lock. */
1255         lock = unit_schedule_lock_irq(unit);
1256         res = sched_pick_resource(unit_scheduler(unit), unit);
1257         sched_set_res(unit, res);
1258         spin_unlock_irq(lock);
1259 
1260         if ( old_cpu != sched_unit_master(unit) )
1261             sched_move_irqs(unit);

It calls sched_set_res() directly, which assigns sched resources, but
does _not_ adjust runqueues (in case the new pcpu happens to be assigned
to a different runqueue than the one of the previous pcpu).

I have two (non-exclusive) ideas here:
1. If old_cpu is actually still available, do not move it at all.
2. Use sched_migrate() instead of sched_set_res().

Here is the patch that fixes it for me:
---8<---
diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c
index 83455fbde1c8..dcf202d8b307 100644
--- a/xen/common/sched/core.c
+++ b/xen/common/sched/core.c
@@ -1246,19 +1246,29 @@ void restore_vcpu_affinity(struct domain *d)
             }
         }
 
-        res = get_sched_res(cpumask_any(cpumask_scratch_cpu(cpu)));
+        /* Prefer old cpu if available. */
+        if ( cpumask_test_cpu(old_cpu, cpumask_scratch_cpu(cpu)) )
+            res = get_sched_res(old_cpu);
+        else
+            res = get_sched_res(cpumask_any(cpumask_scratch_cpu(cpu)));
         sched_set_res(unit, res);
 
         spin_unlock_irq(lock);
 
-        /* v->processor might have changed, so reacquire the lock. */
-        lock = unit_schedule_lock_irq(unit);
-        res = sched_pick_resource(unit_scheduler(unit), unit);
-        sched_set_res(unit, res);
-        spin_unlock_irq(lock);
-
+        /*
+         * If different cpu was chosen, it was random, let scheduler do proper
+         * decision.
+         */
         if ( old_cpu != sched_unit_master(unit) )
+        {
+            /* v->processor might have changed, so reacquire the lock. */
+            lock = unit_schedule_lock_irq(unit);
+            res = sched_pick_resource(unit_scheduler(unit), unit);
+            sched_migrate(unit_scheduler(unit), unit, res->master_cpu);
+            spin_unlock_irq(lock);
+
             sched_move_irqs(unit);
+        }
     }
 
     rcu_read_unlock(&sched_res_rculock);
---8<---

I have several doubts here:

1. If old_cpu is available, is sched_set_res() needed at all?
2. Should both calls be changed to sched_migrate()? Currently I changed
   only the second one, in case the scheduler could be confused about
   old_cpu not being available anymore.
3. Are there any extra locking requirements for sched_migrate() at this
   stage? The long comment above sched_unit_migrate_start() suggests
   there might be, but I'm not sure if that's really the case during
   resume.
4. Related to the above - should thaw_domains() be modified to call
   restore_vcpu_affinity() for all domains first, and unpause only
   later? That could reduce locking requirements, I guess.

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [Xen-devel] Xen crash after S3 suspend - Xen 4.13 and newer
  2022-09-20 10:22                           ` Marek Marczykowski-Górecki
@ 2022-09-20 14:30                             ` Jan Beulich
  2022-10-11 11:22                               ` Marek Marczykowski-Górecki
  2022-10-14 16:42                             ` George Dunlap
  2022-10-21  6:41                             ` Juergen Gross
  2 siblings, 1 reply; 26+ messages in thread
From: Jan Beulich @ 2022-09-20 14:30 UTC (permalink / raw)
  To: Marek Marczykowski-Górecki, Dario Faggioli, George Dunlap
  Cc: Andrew Cooper, xen-devel, Jürgen Groß

On 20.09.2022 12:22, Marek Marczykowski-Górecki wrote:
> On Mon, Aug 22, 2022 at 12:00:27PM +0200, Marek Marczykowski-Górecki wrote:
>> On Mon, Aug 22, 2022 at 11:53:50AM +0200, Jan Beulich wrote:
>>> On 21.08.2022 18:14, Marek Marczykowski-Górecki wrote:
>>>> On Sat, Oct 09, 2021 at 06:28:17PM +0200, Marek Marczykowski-Górecki wrote:
>>>>> On Sun, Jan 31, 2021 at 03:15:30AM +0100, Marek Marczykowski-Górecki wrote:
>>>>>> I'm resurrecting this thread as it was recently mentioned elsewhere. I
>>>>>> can still reproduce the issue on the recent staging branch (9dc687f155).
>>>>>>
>>>>>> It fails after the first resume (not always, but frequent enough to
>>>>>> debug it). At least one guest needs to be running - with just (PV) dom0
>>>>>> the crash doesn't happen (at least for the ~8 times in a row I tried).
>>>>>> If the first resume works, the second (almost?) always will fail but
>>>>>> with a different symptoms - dom0 kernel lockups (at least some of its
>>>>>> vcpus). I haven't debugged this one yet at all.
>>>>>>
>>>>>> Any help will be appreciated, I can apply some debug patches, change
>>>>>> configuration etc.
>>>>>
>>>>> This still happens on 4.14.3. Maybe it is related to freeing percpu
>>>>> areas, as it caused other issues with suspend too? Just a thought...
>>>>
>>>> I have reproduced this on current staging(*). And I can reproduce it
>>>> reliably. And also, I got (I believe) closely related crash with credit1
>>>> scheduler.
>>>>
>>>> (*) It isn't plain staging, it's one with my xhci console patches on
>>>> top, including attempt to make it survive S3. I believe the only
>>>> relevant part there is sticking set_timer() into console resume path (or
>>>> just having a timer with rather short delay registered). The actual tree
>>>> at https://github.com/marmarek/xen/tree/master-xue2-debug, including
>>>> quite a lot of debug prints and debug hacks.
>>>>
>>>> Specific crash with credit2:
>>
>> (XEN) Assertion 'c2rqd(sched_unit_master(unit)) == svc->rqd' failed at common/sched/credit2.c:2274
>> (XEN) ----[ Xen-4.17-unstable  x86_64  debug=y  Tainted:   C    ]----
>> (XEN) CPU:    10
>> (XEN) RIP:    e008:[<ffff82d040247a4d>] credit2.c#csched2_unit_wake+0x152/0x154
>> (XEN) RFLAGS: 0000000000010083   CONTEXT: hypervisor (d0v0)
>> (XEN) rax: ffff830251778230   rbx: ffff830251768cb0   rcx: 00000032111d6000
>> (XEN) rdx: ffff8302515c1eb0   rsi: 0000000000000006   rdi: ffff830251769000
>> (XEN) rbp: ffff8302515cfd90   rsp: ffff8302515cfd70   r8:  ffff830251769000
>> (XEN) r9:  0000000000000000   r10: 0000000000000000   r11: 0000000000000000
>> (XEN) r12: ffff830251768dd0   r13: ffff8302515c1d00   r14: 0000000000000006
>> (XEN) r15: ffff82d0405ddb40   cr0: 0000000080050033   cr4: 0000000000372660
>> (XEN) cr3: 000000022f2a1000   cr2: ffff8881012738e0
>> (XEN) fsb: 0000744bf6a0db80   gsb: ffff888255600000   gss: 0000000000000000
>> (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e010   cs: e008
>> (XEN) Xen code around <ffff82d040247a4d> (credit2.c#csched2_unit_wake+0x152/0x154):
>> (XEN)  df e8 6f bf ff ff eb ad <0f> 0b f3 0f 1e fa 55 48 89 e5 41 57 41 56 41 55
>> (XEN) Xen stack trace from rsp=ffff8302515cfd70:
>> (XEN)    ffff83025174b000 ffff830251768cb0 ffff830251778270 ffff82d0405c4298
>> (XEN)    ffff8302515cfdd8 ffff82d04024fcb8 0000000000000202 ffff830251778270
>> (XEN)    ffff83025174b000 0000000000000001 ffff830251769018 0000000000000000
>> (XEN)    0000000000000000 ffff8302515cfe48 ffff82d04020a8c9 ffff8882556aedc0
>> (XEN)    0000000000000003 00001910537e623e 0000000b988f78a6 0000000059d4a716
>> (XEN)    00001901f30fa41e 0000000217f96af6 0000000000000000 ffff83025174b000
>> (XEN)    ffff830251756000 0000000000000002 0000000000000001 ffff8302515cfe70
>> (XEN)    ffff82d0402f7968 ffff830251756000 ffff8302515cfef8 0000000000000018
>> (XEN)    ffff8302515cfee8 ffff82d0402ec6de 0000000000000000 ffffffff82f157e0
>> (XEN)    0000000000000000 0000000000000000 ffff8302515cfef8 0000000000000000
>> (XEN)    0000000000000000 ffff8302515cffff ffff830251756000 0000000000000000
>> (XEN)    0000000000000000 0000000000000000 0000000000000000 00007cfdaea300e7
>> (XEN)    ffff82d0402012bd 0000000000000000 ffffffff82c51120 ffff88810036cf00
>> (XEN)    0000000000000002 000000000001e120 0000000000000002 0000000000000246
>> (XEN)    ffffffff82f157e0 0000000000000001 0000000000000000 0000000000000018
>> (XEN)    ffffffff81e4a30a 0000000000000000 0000000000000002 0000000000000001
>> (XEN)    0000010000000000 ffffffff81e4a30a 000000000000e033 0000000000000246
>> (XEN)    ffffc9004aef7c18 000000000000e02b fb5ee398d214b10c eb5ef398c214a10c
>> (XEN)    eb56f390c21ca104 ebd6f310c29ca184 0000e0100000000a ffff830251756000
>> (XEN)    0000003211016000 0000000000372660 0000000000000000 80000002963e1002
>> (XEN) Xen call trace:
>> (XEN)    [<ffff82d040247a4d>] R credit2.c#csched2_unit_wake+0x152/0x154
>> (XEN)    [<ffff82d04024fcb8>] F vcpu_wake+0xfd/0x267
>> (XEN)    [<ffff82d04020a8c9>] F common_vcpu_op+0x178/0x5d1
>> (XEN)    [<ffff82d0402f7968>] F do_vcpu_op+0x69/0x226
>> (XEN)    [<ffff82d0402ec6de>] F pv_hypercall+0x575/0x657
>> (XEN)    [<ffff82d0402012bd>] F lstar_enter+0x13d/0x150
>> (XEN)
>> (XEN)
>> (XEN) ****************************************
>> (XEN) Panic on CPU 10:
>> (XEN) Assertion 'c2rqd(sched_unit_master(unit)) == svc->rqd' failed at common/sched/credit2.c:2274
>> (XEN) ****************************************
> 
> Ok, I think I figured it out!
> 
> I added a function that verifies run queues of all the sched units, and
> called it basically every other line on the resume path. The debug
> function (if anybody is interested):
> 
>     void verify_sched_units(void)
>     {   
>         struct domain *d;
>         const struct sched_unit *unit;
>         
>         for_each_domain ( d )
>         {
>             for_each_sched_unit ( d, unit )
>             {
>                 if ( c2rqd(sched_unit_master(unit)) != csched2_unit(unit)->rqd )
>                 {
>                     printk(XENLOG_WARNING "d%d sched unit %d: rq=%d, unit master %d, rq=%d\n",
>                             d->domain_id, unit->unit_id,
>                             csched2_unit(unit)->rqd ? csched2_unit(unit)->rqd->id : -1,
>                             sched_unit_master(unit),
>                             c2rqd(sched_unit_master(unit))->id);
>                     WARN_ON(1);
>                 }
>             }
>         }
>     }
> 
> It appears that restore_vcpu_affinity() is responsible, specifically
> this part:
> 
> 1216         /*
> 1217          * Re-assign the initial processor as after resume we have no
> 1218          * guarantee the old processor has come back to life again.
> 1219          *
> 1220          * Therefore, here, before actually unpausing the domains, we should
> 1221          * set v->processor of each of their vCPUs to something that will
> 1222          * make sense for the scheduler of the cpupool in which they are in.
> 1223          */
> ...
> 1249         res = get_sched_res(cpumask_any(cpumask_scratch_cpu(cpu)));
> 1250         sched_set_res(unit, res);
> 1251 
> 1252         spin_unlock_irq(lock);
> 1253 
> 1254         /* v->processor might have changed, so reacquire the lock. */
> 1255         lock = unit_schedule_lock_irq(unit);
> 1256         res = sched_pick_resource(unit_scheduler(unit), unit);
> 1257         sched_set_res(unit, res);
> 1258         spin_unlock_irq(lock);
> 1259 
> 1260         if ( old_cpu != sched_unit_master(unit) )
> 1261             sched_move_irqs(unit);
> 
> It calls sched_set_res() directly, which assigns sched resources, but
> does _not_ adjust runqueues (if the new pcpu happens to be assigned to
> a different runqueue than the previous pcpu's).
> 
> I have two (non exclusive) ideas here:
> 1. If old_cpu is actually still available, do not move it at all.
> 2. Use sched_migrate() instead of sched_set_res().
> 
> Here is the patch that fixes it for me:
> ---8<---
> diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c
> index 83455fbde1c8..dcf202d8b307 100644
> --- a/xen/common/sched/core.c
> +++ b/xen/common/sched/core.c
> @@ -1246,19 +1246,29 @@ void restore_vcpu_affinity(struct domain *d)
>              }
>          }
>  
> -        res = get_sched_res(cpumask_any(cpumask_scratch_cpu(cpu)));
> +        /* Prefer old cpu if available. */
> +        if ( cpumask_test_cpu(old_cpu, cpumask_scratch_cpu(cpu)) )
> +            res = get_sched_res(old_cpu);
> +        else
> +            res = get_sched_res(cpumask_any(cpumask_scratch_cpu(cpu)));
>          sched_set_res(unit, res);
>  
>          spin_unlock_irq(lock);
>  
> -        /* v->processor might have changed, so reacquire the lock. */
> -        lock = unit_schedule_lock_irq(unit);
> -        res = sched_pick_resource(unit_scheduler(unit), unit);
> -        sched_set_res(unit, res);
> -        spin_unlock_irq(lock);
> -
> +        /*
> +         * If different cpu was chosen, it was random, let scheduler do proper
> +         * decision.
> +         */
>          if ( old_cpu != sched_unit_master(unit) )
> +        {
> +            /* v->processor might have changed, so reacquire the lock. */
> +            lock = unit_schedule_lock_irq(unit);
> +            res = sched_pick_resource(unit_scheduler(unit), unit);
> +            sched_migrate(unit_scheduler(unit), unit, res->master_cpu);
> +            spin_unlock_irq(lock);
> +
>              sched_move_irqs(unit);
> +        }
>      }
>  
>      rcu_read_unlock(&sched_res_rculock);
> ---8<---
> 
> I have several doubts here:
> 
> 1. If old_cpu is available, is sched_set_res() needed at all?
> 2. Should both calls be changed to sched_migrate()? Currently I changed
>    only the second one, in case scheduler could be confused about
>    old_cpu not being available anymore.
> 3. Are there any extra locking requirements for sched_migrate() at this
>    stage? The long comment above sched_unit_migrate_start() suggests
>    there might be, but I'm not sure if that's really the case during
>    resume.
> 4. Related to the above - should thaw_domains() be modified to call
>    restore_vcpu_affinity() for all domains first, and unpause only
>    later? That could reduce locking requirements, I guess.

All questions primarily to the scheduler maintainers - forwarding
accordingly.

Jan


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Xen-devel] Xen crash after S3 suspend - Xen 4.13 and newer
  2022-09-20 14:30                             ` Jan Beulich
@ 2022-10-11 11:22                               ` Marek Marczykowski-Górecki
  0 siblings, 0 replies; 26+ messages in thread
From: Marek Marczykowski-Górecki @ 2022-10-11 11:22 UTC (permalink / raw)
  To: Dario Faggioli, George Dunlap
  Cc: Jan Beulich, xen-devel, Andrew Cooper, Jürgen Groß

[-- Attachment #1: Type: text/plain, Size: 11103 bytes --]

On Tue, Sep 20, 2022 at 04:30:41PM +0200, Jan Beulich wrote:
> On 20.09.2022 12:22, Marek Marczykowski-Górecki wrote:
> > On Mon, Aug 22, 2022 at 12:00:27PM +0200, Marek Marczykowski-Górecki wrote:
> >> On Mon, Aug 22, 2022 at 11:53:50AM +0200, Jan Beulich wrote:
> >>> On 21.08.2022 18:14, Marek Marczykowski-Górecki wrote:
> >>>> On Sat, Oct 09, 2021 at 06:28:17PM +0200, Marek Marczykowski-Górecki wrote:
> >>>>> On Sun, Jan 31, 2021 at 03:15:30AM +0100, Marek Marczykowski-Górecki wrote:
> >>>>>> I'm resurrecting this thread as it was recently mentioned elsewhere. I
> >>>>>> can still reproduce the issue on the recent staging branch (9dc687f155).
> >>>>>>
> >>>>>> It fails after the first resume (not always, but frequent enough to
> >>>>>> debug it). At least one guest needs to be running - with just (PV) dom0
> >>>>>> the crash doesn't happen (at least for the ~8 times in a row I tried).
> >>>>>> If the first resume works, the second will (almost?) always fail, but
> >>>>>> with different symptoms - dom0 kernel lockups (at least some of its
> >>>>>> vcpus). I haven't debugged this one yet at all.
> >>>>>>
> >>>>>> Any help will be appreciated, I can apply some debug patches, change
> >>>>>> configuration etc.
> >>>>>
> >>>>> This still happens on 4.14.3. Maybe it is related to freeing percpu
> >>>>> areas, as it caused other issues with suspend too? Just a thought...
> >>>>
> >>>> I have reproduced this on current staging(*). And I can reproduce it
> >>>> reliably. And also, I got (I believe) closely related crash with credit1
> >>>> scheduler.
> >>>>
> >>>> (*) It isn't plain staging, it's one with my xhci console patches on
> >>>> top, including attempt to make it survive S3. I believe the only
> >>>> relevant part there is sticking set_timer() into console resume path (or
> >>>> just having a timer with rather short delay registered). The actual tree
> >>>> at https://github.com/marmarek/xen/tree/master-xue2-debug, including
> >>>> quite a lot of debug prints and debug hacks.
> >>>>
> >>>> Specific crash with credit2:
> >>
> >> (XEN) Assertion 'c2rqd(sched_unit_master(unit)) == svc->rqd' failed at common/sched/credit2.c:2274
> >> (XEN) ----[ Xen-4.17-unstable  x86_64  debug=y  Tainted:   C    ]----
> >> (XEN) CPU:    10
> >> (XEN) RIP:    e008:[<ffff82d040247a4d>] credit2.c#csched2_unit_wake+0x152/0x154
> >> (XEN) RFLAGS: 0000000000010083   CONTEXT: hypervisor (d0v0)
> >> (XEN) rax: ffff830251778230   rbx: ffff830251768cb0   rcx: 00000032111d6000
> >> (XEN) rdx: ffff8302515c1eb0   rsi: 0000000000000006   rdi: ffff830251769000
> >> (XEN) rbp: ffff8302515cfd90   rsp: ffff8302515cfd70   r8:  ffff830251769000
> >> (XEN) r9:  0000000000000000   r10: 0000000000000000   r11: 0000000000000000
> >> (XEN) r12: ffff830251768dd0   r13: ffff8302515c1d00   r14: 0000000000000006
> >> (XEN) r15: ffff82d0405ddb40   cr0: 0000000080050033   cr4: 0000000000372660
> >> (XEN) cr3: 000000022f2a1000   cr2: ffff8881012738e0
> >> (XEN) fsb: 0000744bf6a0db80   gsb: ffff888255600000   gss: 0000000000000000
> >> (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e010   cs: e008
> >> (XEN) Xen code around <ffff82d040247a4d> (credit2.c#csched2_unit_wake+0x152/0x154):
> >> (XEN)  df e8 6f bf ff ff eb ad <0f> 0b f3 0f 1e fa 55 48 89 e5 41 57 41 56 41 55
> >> (XEN) Xen stack trace from rsp=ffff8302515cfd70:
> >> (XEN)    ffff83025174b000 ffff830251768cb0 ffff830251778270 ffff82d0405c4298
> >> (XEN)    ffff8302515cfdd8 ffff82d04024fcb8 0000000000000202 ffff830251778270
> >> (XEN)    ffff83025174b000 0000000000000001 ffff830251769018 0000000000000000
> >> (XEN)    0000000000000000 ffff8302515cfe48 ffff82d04020a8c9 ffff8882556aedc0
> >> (XEN)    0000000000000003 00001910537e623e 0000000b988f78a6 0000000059d4a716
> >> (XEN)    00001901f30fa41e 0000000217f96af6 0000000000000000 ffff83025174b000
> >> (XEN)    ffff830251756000 0000000000000002 0000000000000001 ffff8302515cfe70
> >> (XEN)    ffff82d0402f7968 ffff830251756000 ffff8302515cfef8 0000000000000018
> >> (XEN)    ffff8302515cfee8 ffff82d0402ec6de 0000000000000000 ffffffff82f157e0
> >> (XEN)    0000000000000000 0000000000000000 ffff8302515cfef8 0000000000000000
> >> (XEN)    0000000000000000 ffff8302515cffff ffff830251756000 0000000000000000
> >> (XEN)    0000000000000000 0000000000000000 0000000000000000 00007cfdaea300e7
> >> (XEN)    ffff82d0402012bd 0000000000000000 ffffffff82c51120 ffff88810036cf00
> >> (XEN)    0000000000000002 000000000001e120 0000000000000002 0000000000000246
> >> (XEN)    ffffffff82f157e0 0000000000000001 0000000000000000 0000000000000018
> >> (XEN)    ffffffff81e4a30a 0000000000000000 0000000000000002 0000000000000001
> >> (XEN)    0000010000000000 ffffffff81e4a30a 000000000000e033 0000000000000246
> >> (XEN)    ffffc9004aef7c18 000000000000e02b fb5ee398d214b10c eb5ef398c214a10c
> >> (XEN)    eb56f390c21ca104 ebd6f310c29ca184 0000e0100000000a ffff830251756000
> >> (XEN)    0000003211016000 0000000000372660 0000000000000000 80000002963e1002
> >> (XEN) Xen call trace:
> >> (XEN)    [<ffff82d040247a4d>] R credit2.c#csched2_unit_wake+0x152/0x154
> >> (XEN)    [<ffff82d04024fcb8>] F vcpu_wake+0xfd/0x267
> >> (XEN)    [<ffff82d04020a8c9>] F common_vcpu_op+0x178/0x5d1
> >> (XEN)    [<ffff82d0402f7968>] F do_vcpu_op+0x69/0x226
> >> (XEN)    [<ffff82d0402ec6de>] F pv_hypercall+0x575/0x657
> >> (XEN)    [<ffff82d0402012bd>] F lstar_enter+0x13d/0x150
> >> (XEN)
> >> (XEN)
> >> (XEN) ****************************************
> >> (XEN) Panic on CPU 10:
> >> (XEN) Assertion 'c2rqd(sched_unit_master(unit)) == svc->rqd' failed at common/sched/credit2.c:2274
> >> (XEN) ****************************************
> > 
> > Ok, I think I figured it out!
> > 
> > I added a function that verifies run queues of all the sched units, and
> > called it basically every other line on the resume path. The debug
> > function (if anybody is interested):
> > 
> >     void verify_sched_units(void)
> >     {   
> >         struct domain *d;
> >         const struct sched_unit *unit;
> >         
> >         for_each_domain ( d )
> >         {
> >             for_each_sched_unit ( d, unit )
> >             {
> >                 if ( c2rqd(sched_unit_master(unit)) != csched2_unit(unit)->rqd )
> >                 {
> >                     printk(XENLOG_WARNING "d%d sched unit %d: rq=%d, unit master %d, rq=%d\n",
> >                             d->domain_id, unit->unit_id,
> >                             csched2_unit(unit)->rqd ? csched2_unit(unit)->rqd->id : -1,
> >                             sched_unit_master(unit),
> >                             c2rqd(sched_unit_master(unit))->id);
> >                     WARN_ON(1);
> >                 }
> >             }
> >         }
> >     }
> > 
> > It appears that restore_vcpu_affinity() is responsible, specifically
> > this part:
> > 
> > 1216         /*
> > 1217          * Re-assign the initial processor as after resume we have no
> > 1218          * guarantee the old processor has come back to life again.
> > 1219          *
> > 1220          * Therefore, here, before actually unpausing the domains, we should
> > 1221          * set v->processor of each of their vCPUs to something that will
> > 1222          * make sense for the scheduler of the cpupool in which they are in.
> > 1223          */
> > ...
> > 1249         res = get_sched_res(cpumask_any(cpumask_scratch_cpu(cpu)));
> > 1250         sched_set_res(unit, res);
> > 1251 
> > 1252         spin_unlock_irq(lock);
> > 1253 
> > 1254         /* v->processor might have changed, so reacquire the lock. */
> > 1255         lock = unit_schedule_lock_irq(unit);
> > 1256         res = sched_pick_resource(unit_scheduler(unit), unit);
> > 1257         sched_set_res(unit, res);
> > 1258         spin_unlock_irq(lock);
> > 1259 
> > 1260         if ( old_cpu != sched_unit_master(unit) )
> > 1261             sched_move_irqs(unit);
> > 
> > It calls sched_set_res() directly, which assigns sched resources, but
> > does _not_ adjust runqueues (if the new pcpu happens to be assigned to
> > a different runqueue than the previous pcpu's).
> > 
> > I have two (non exclusive) ideas here:
> > 1. If old_cpu is actually still available, do not move it at all.
> > 2. Use sched_migrate() instead of sched_set_res().
> > 
> > Here is the patch that fixes it for me:
> > ---8<---
> > diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c
> > index 83455fbde1c8..dcf202d8b307 100644
> > --- a/xen/common/sched/core.c
> > +++ b/xen/common/sched/core.c
> > @@ -1246,19 +1246,29 @@ void restore_vcpu_affinity(struct domain *d)
> >              }
> >          }
> >  
> > -        res = get_sched_res(cpumask_any(cpumask_scratch_cpu(cpu)));
> > +        /* Prefer old cpu if available. */
> > +        if ( cpumask_test_cpu(old_cpu, cpumask_scratch_cpu(cpu)) )
> > +            res = get_sched_res(old_cpu);
> > +        else
> > +            res = get_sched_res(cpumask_any(cpumask_scratch_cpu(cpu)));
> >          sched_set_res(unit, res);
> >  
> >          spin_unlock_irq(lock);
> >  
> > -        /* v->processor might have changed, so reacquire the lock. */
> > -        lock = unit_schedule_lock_irq(unit);
> > -        res = sched_pick_resource(unit_scheduler(unit), unit);
> > -        sched_set_res(unit, res);
> > -        spin_unlock_irq(lock);
> > -
> > +        /*
> > +         * If different cpu was chosen, it was random, let scheduler do proper
> > +         * decision.
> > +         */
> >          if ( old_cpu != sched_unit_master(unit) )
> > +        {
> > +            /* v->processor might have changed, so reacquire the lock. */
> > +            lock = unit_schedule_lock_irq(unit);
> > +            res = sched_pick_resource(unit_scheduler(unit), unit);
> > +            sched_migrate(unit_scheduler(unit), unit, res->master_cpu);
> > +            spin_unlock_irq(lock);
> > +
> >              sched_move_irqs(unit);
> > +        }
> >      }
> >  
> >      rcu_read_unlock(&sched_res_rculock);
> > ---8<---
> > 
> > I have several doubts here:
> > 
> > 1. If old_cpu is available, is sched_set_res() needed at all?
> > 2. Should both calls be changed to sched_migrate()? Currently I changed
> >    only the second one, in case scheduler could be confused about
> >    old_cpu not being available anymore.
> > 3. Are there any extra locking requirements for sched_migrate() at this
> >    stage? The long comment above sched_unit_migrate_start() suggests
> >    there might be, but I'm not sure if that's really the case during
> >    resume.
> > 4. Related to the above - should thaw_domains() be modified to call
> >    restore_vcpu_affinity() for all domains first, and unpause only
> >    later? That could reduce locking requirements, I guess.
> 
> All questions primarily to the scheduler maintainers - forwarding
> accordingly.

Ping?

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Xen-devel] Xen crash after S3 suspend - Xen 4.13 and newer
  2022-09-20 10:22                           ` Marek Marczykowski-Górecki
  2022-09-20 14:30                             ` Jan Beulich
@ 2022-10-14 16:42                             ` George Dunlap
  2022-10-21  6:41                             ` Juergen Gross
  2 siblings, 0 replies; 26+ messages in thread
From: George Dunlap @ 2022-10-14 16:42 UTC (permalink / raw)
  To: Marek Marczykowski-Górecki
  Cc: Jan Beulich, Juergen Gross, Dario Faggioli, Andrew Cooper,
	xen-devel, Jürgen Groß

[-- Attachment #1: Type: text/plain, Size: 3771 bytes --]

On Tue, Sep 20, 2022 at 11:23 AM Marek Marczykowski-Górecki <
marmarek@invisiblethingslab.com> wrote:

>
> I have two (non exclusive) ideas here:
> 1. If old_cpu is actually still available, do not move it at all.
> 2. Use sched_migrate() instead of sched_set_res().
>

Other possibilities:

3.  Make sure that svc->rqd is set to null when the affinity is broken.

Currently, on vcpu creation, sched_init_vcpu() expects to set the pcpu; and
it looks like for credit2, svc->rqd may not be set until the first time the
unit is woken up (that's the 'if' part of the 'if/else' clause whose 'else'
contains the ASSERT() you're hitting).  If we set svc->rqd to NULL when
breaking the CPU affinity on suspend, then on wake the unit would "take"
the runqueue assigned by restore_vcpu_affinity().

4. Make csched2_unit_wake() tolerant of pcpus changing under its feet.

#3 would potentially make things more robust, but would require adding some
sort of call-back to notify schedulers that affinity had been broken.  ATM
this might only be used by credit2.

#4 would potentially be dangerous if some other bit of credit2 code
assumes that svc->rqd is valid.
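
To make #3 a bit more concrete, a rough sketch of the credit2 side.  The
hook name and the wiring of a new callback into the common scheduler code
are made up (no such interface exists today); it's only meant to
illustrate the idea:

    /*
     * Hypothetical hook, called by the common code for each unit whose
     * affinity is broken across suspend.
     */
    static void csched2_unit_break_affinity(const struct scheduler *ops,
                                            struct sched_unit *unit)
    {
        struct csched2_unit *svc = csched2_unit(unit);

        /*
         * Forget the stale runqueue; csched2_unit_wake() would then assign
         * a runqueue matching whatever pcpu was picked on resume, the same
         * way it does for a unit that has never run.
         */
        svc->rqd = NULL;
    }

The common code would call such a hook while the domains are still paused,
before restore_vcpu_affinity() picks a new resource for the unit.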


> Here is the patch that fixes it for me:
> ---8<---
> diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c
> index 83455fbde1c8..dcf202d8b307 100644
> --- a/xen/common/sched/core.c
> +++ b/xen/common/sched/core.c
> @@ -1246,19 +1246,29 @@ void restore_vcpu_affinity(struct domain *d)
>              }
>          }
>
> -        res = get_sched_res(cpumask_any(cpumask_scratch_cpu(cpu)));
> +        /* Prefer old cpu if available. */
> +        if ( cpumask_test_cpu(old_cpu, cpumask_scratch_cpu(cpu)) )
> +            res = get_sched_res(old_cpu);
> +        else
> +            res = get_sched_res(cpumask_any(cpumask_scratch_cpu(cpu)));
>          sched_set_res(unit, res);
>
>          spin_unlock_irq(lock);
>
> -        /* v->processor might have changed, so reacquire the lock. */
> -        lock = unit_schedule_lock_irq(unit);
> -        res = sched_pick_resource(unit_scheduler(unit), unit);
> -        sched_set_res(unit, res);
> -        spin_unlock_irq(lock);
> -
> +        /*
> +         * If different cpu was chosen, it was random, let scheduler do
> proper
> +         * decision.
> +         */
>          if ( old_cpu != sched_unit_master(unit) )
> +        {
> +            /* v->processor might have changed, so reacquire the lock. */
> +            lock = unit_schedule_lock_irq(unit);
> +            res = sched_pick_resource(unit_scheduler(unit), unit);
> +            sched_migrate(unit_scheduler(unit), unit, res->master_cpu);
> +            spin_unlock_irq(lock);
> +
>              sched_move_irqs(unit);
> +        }
>      }
>
>      rcu_read_unlock(&sched_res_rculock);
> ---8<---
>
> I have several doubts here:
>
> 1. If old_cpu is available, is sched_set_res() needed at all?
> 2. Should both calls be changed to sched_migrate()? Currently I changed
>    only the second one, in case scheduler could be confused about
>    old_cpu not being available anymore.
> 3. Are there any extra locking requirements for sched_migrate() at this
>    stage? The long comment above sched_unit_migrate_start() suggests
>    there might be, but I'm not sure if that's really the case during
>    resume.
> 4. Related to the above - should thaw_domains() be modified to call
>    restore_vcpu_affinity() for all domains first, and unpause only
>    later? That could reduce locking requirements, I guess.
>

Unfortunately this code has had a lot of churn since the last time I really
engaged with it; I'm going to have to come back to this on Monday.

Jürgen / Dario, any thoughts?

 -George

[-- Attachment #2: Type: text/html, Size: 4790 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Xen-devel] Xen crash after S3 suspend - Xen 4.13 and newer
  2022-09-20 10:22                           ` Marek Marczykowski-Górecki
  2022-09-20 14:30                             ` Jan Beulich
  2022-10-14 16:42                             ` George Dunlap
@ 2022-10-21  6:41                             ` Juergen Gross
  2 siblings, 0 replies; 26+ messages in thread
From: Juergen Gross @ 2022-10-21  6:41 UTC (permalink / raw)
  To: Marek Marczykowski-Górecki, Jan Beulich
  Cc: Juergen Gross, Dario Faggioli, Andrew Cooper, xen-devel


[-- Attachment #1.1.1: Type: text/plain, Size: 10839 bytes --]

On 20.09.22 12:22, Marek Marczykowski-Górecki wrote:
> On Mon, Aug 22, 2022 at 12:00:27PM +0200, Marek Marczykowski-Górecki wrote:
>> On Mon, Aug 22, 2022 at 11:53:50AM +0200, Jan Beulich wrote:
>>> On 21.08.2022 18:14, Marek Marczykowski-Górecki wrote:
>>>> On Sat, Oct 09, 2021 at 06:28:17PM +0200, Marek Marczykowski-Górecki wrote:
>>>>> On Sun, Jan 31, 2021 at 03:15:30AM +0100, Marek Marczykowski-Górecki wrote:
>>>>>> I'm resurrecting this thread as it was recently mentioned elsewhere. I
>>>>>> can still reproduce the issue on the recent staging branch (9dc687f155).
>>>>>>
>>>>>> It fails after the first resume (not always, but frequent enough to
>>>>>> debug it). At least one guest needs to be running - with just (PV) dom0
>>>>>> the crash doesn't happen (at least for the ~8 times in a row I tried).
>>>>>> If the first resume works, the second will (almost?) always fail, but
>>>>>> with different symptoms - dom0 kernel lockups (at least some of its
>>>>>> vcpus). I haven't debugged this one yet at all.
>>>>>>
>>>>>> Any help will be appreciated, I can apply some debug patches, change
>>>>>> configuration etc.
>>>>>
>>>>> This still happens on 4.14.3. Maybe it is related to freeing percpu
>>>>> areas, as it caused other issues with suspend too? Just a thought...
>>>>
>>>> I have reproduced this on current staging(*). And I can reproduce it
>>>> reliably. And also, I got (I believe) closely related crash with credit1
>>>> scheduler.
>>>>
>>>> (*) It isn't plain staging, it's one with my xhci console patches on
>>>> top, including attempt to make it survive S3. I believe the only
>>>> relevant part there is sticking set_timer() into console resume path (or
>>>> just having a timer with rather short delay registered). The actual tree
>>>> at https://github.com/marmarek/xen/tree/master-xue2-debug, including
>>>> quite a lot of debug prints and debug hacks.
>>>>
>>>> Specific crash with credit2:
>>
>> (XEN) Assertion 'c2rqd(sched_unit_master(unit)) == svc->rqd' failed at common/sched/credit2.c:2274
>> (XEN) ----[ Xen-4.17-unstable  x86_64  debug=y  Tainted:   C    ]----
>> (XEN) CPU:    10
>> (XEN) RIP:    e008:[<ffff82d040247a4d>] credit2.c#csched2_unit_wake+0x152/0x154
>> (XEN) RFLAGS: 0000000000010083   CONTEXT: hypervisor (d0v0)
>> (XEN) rax: ffff830251778230   rbx: ffff830251768cb0   rcx: 00000032111d6000
>> (XEN) rdx: ffff8302515c1eb0   rsi: 0000000000000006   rdi: ffff830251769000
>> (XEN) rbp: ffff8302515cfd90   rsp: ffff8302515cfd70   r8:  ffff830251769000
>> (XEN) r9:  0000000000000000   r10: 0000000000000000   r11: 0000000000000000
>> (XEN) r12: ffff830251768dd0   r13: ffff8302515c1d00   r14: 0000000000000006
>> (XEN) r15: ffff82d0405ddb40   cr0: 0000000080050033   cr4: 0000000000372660
>> (XEN) cr3: 000000022f2a1000   cr2: ffff8881012738e0
>> (XEN) fsb: 0000744bf6a0db80   gsb: ffff888255600000   gss: 0000000000000000
>> (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e010   cs: e008
>> (XEN) Xen code around <ffff82d040247a4d> (credit2.c#csched2_unit_wake+0x152/0x154):
>> (XEN)  df e8 6f bf ff ff eb ad <0f> 0b f3 0f 1e fa 55 48 89 e5 41 57 41 56 41 55
>> (XEN) Xen stack trace from rsp=ffff8302515cfd70:
>> (XEN)    ffff83025174b000 ffff830251768cb0 ffff830251778270 ffff82d0405c4298
>> (XEN)    ffff8302515cfdd8 ffff82d04024fcb8 0000000000000202 ffff830251778270
>> (XEN)    ffff83025174b000 0000000000000001 ffff830251769018 0000000000000000
>> (XEN)    0000000000000000 ffff8302515cfe48 ffff82d04020a8c9 ffff8882556aedc0
>> (XEN)    0000000000000003 00001910537e623e 0000000b988f78a6 0000000059d4a716
>> (XEN)    00001901f30fa41e 0000000217f96af6 0000000000000000 ffff83025174b000
>> (XEN)    ffff830251756000 0000000000000002 0000000000000001 ffff8302515cfe70
>> (XEN)    ffff82d0402f7968 ffff830251756000 ffff8302515cfef8 0000000000000018
>> (XEN)    ffff8302515cfee8 ffff82d0402ec6de 0000000000000000 ffffffff82f157e0
>> (XEN)    0000000000000000 0000000000000000 ffff8302515cfef8 0000000000000000
>> (XEN)    0000000000000000 ffff8302515cffff ffff830251756000 0000000000000000
>> (XEN)    0000000000000000 0000000000000000 0000000000000000 00007cfdaea300e7
>> (XEN)    ffff82d0402012bd 0000000000000000 ffffffff82c51120 ffff88810036cf00
>> (XEN)    0000000000000002 000000000001e120 0000000000000002 0000000000000246
>> (XEN)    ffffffff82f157e0 0000000000000001 0000000000000000 0000000000000018
>> (XEN)    ffffffff81e4a30a 0000000000000000 0000000000000002 0000000000000001
>> (XEN)    0000010000000000 ffffffff81e4a30a 000000000000e033 0000000000000246
>> (XEN)    ffffc9004aef7c18 000000000000e02b fb5ee398d214b10c eb5ef398c214a10c
>> (XEN)    eb56f390c21ca104 ebd6f310c29ca184 0000e0100000000a ffff830251756000
>> (XEN)    0000003211016000 0000000000372660 0000000000000000 80000002963e1002
>> (XEN) Xen call trace:
>> (XEN)    [<ffff82d040247a4d>] R credit2.c#csched2_unit_wake+0x152/0x154
>> (XEN)    [<ffff82d04024fcb8>] F vcpu_wake+0xfd/0x267
>> (XEN)    [<ffff82d04020a8c9>] F common_vcpu_op+0x178/0x5d1
>> (XEN)    [<ffff82d0402f7968>] F do_vcpu_op+0x69/0x226
>> (XEN)    [<ffff82d0402ec6de>] F pv_hypercall+0x575/0x657
>> (XEN)    [<ffff82d0402012bd>] F lstar_enter+0x13d/0x150
>> (XEN)
>> (XEN)
>> (XEN) ****************************************
>> (XEN) Panic on CPU 10:
>> (XEN) Assertion 'c2rqd(sched_unit_master(unit)) == svc->rqd' failed at common/sched/credit2.c:2274
>> (XEN) ****************************************
> 
> Ok, I think I figured it out!
> 
> I added a function that verifies run queues of all the sched units, and
> called it basically every other line on the resume path. The debug
> function (if anybody is interested):
> 
>      void verify_sched_units(void)
>      {
>          struct domain *d;
>          const struct sched_unit *unit;
>          
>          for_each_domain ( d )
>          {
>              for_each_sched_unit ( d, unit )
>              {
>                  if ( c2rqd(sched_unit_master(unit)) != csched2_unit(unit)->rqd )
>                  {
>                      printk(XENLOG_WARNING "d%d sched unit %d: rq=%d, unit master %d, rq=%d\n",
>                              d->domain_id, unit->unit_id,
>                              csched2_unit(unit)->rqd ? csched2_unit(unit)->rqd->id : -1,
>                              sched_unit_master(unit),
>                              c2rqd(sched_unit_master(unit))->id);
>                      WARN_ON(1);
>                  }
>              }
>          }
>      }
> 
> It appears that restore_vcpu_affinity() is responsible, specifically
> this part:
> 
> 1216         /*
> 1217          * Re-assign the initial processor as after resume we have no
> 1218          * guarantee the old processor has come back to life again.
> 1219          *
> 1220          * Therefore, here, before actually unpausing the domains, we should
> 1221          * set v->processor of each of their vCPUs to something that will
> 1222          * make sense for the scheduler of the cpupool in which they are in.
> 1223          */
> ...
> 1249         res = get_sched_res(cpumask_any(cpumask_scratch_cpu(cpu)));
> 1250         sched_set_res(unit, res);
> 1251
> 1252         spin_unlock_irq(lock);
> 1253
> 1254         /* v->processor might have changed, so reacquire the lock. */
> 1255         lock = unit_schedule_lock_irq(unit);
> 1256         res = sched_pick_resource(unit_scheduler(unit), unit);
> 1257         sched_set_res(unit, res);
> 1258         spin_unlock_irq(lock);
> 1259
> 1260         if ( old_cpu != sched_unit_master(unit) )
> 1261             sched_move_irqs(unit);
> 
> It calls sched_set_res() directly, which assigns sched resources, but
> does _not_ adjust runqueues (if the new pcpu happens to be assigned to
> a different runqueue than the previous pcpu's).
> 
> I have two (non exclusive) ideas here:
> 1. If old_cpu is actually still available, do not move it at all.
> 2. Use sched_migrate() instead of sched_set_res().
> 
> Here is the patch that fixes it for me:
> ---8<---
> diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c
> index 83455fbde1c8..dcf202d8b307 100644
> --- a/xen/common/sched/core.c
> +++ b/xen/common/sched/core.c
> @@ -1246,19 +1246,29 @@ void restore_vcpu_affinity(struct domain *d)
>               }
>           }
>   
> -        res = get_sched_res(cpumask_any(cpumask_scratch_cpu(cpu)));
> +        /* Prefer old cpu if available. */
> +        if ( cpumask_test_cpu(old_cpu, cpumask_scratch_cpu(cpu)) )
> +            res = get_sched_res(old_cpu);
> +        else
> +            res = get_sched_res(cpumask_any(cpumask_scratch_cpu(cpu)));
>           sched_set_res(unit, res);
>   
>           spin_unlock_irq(lock);
>   
> -        /* v->processor might have changed, so reacquire the lock. */
> -        lock = unit_schedule_lock_irq(unit);
> -        res = sched_pick_resource(unit_scheduler(unit), unit);
> -        sched_set_res(unit, res);
> -        spin_unlock_irq(lock);
> -
> +        /*
> +         * If different cpu was chosen, it was random, let scheduler do proper
> +         * decision.
> +         */
>           if ( old_cpu != sched_unit_master(unit) )
> +        {
> +            /* v->processor might have changed, so reacquire the lock. */
> +            lock = unit_schedule_lock_irq(unit);
> +            res = sched_pick_resource(unit_scheduler(unit), unit);
> +            sched_migrate(unit_scheduler(unit), unit, res->master_cpu);
> +            spin_unlock_irq(lock);
> +
>               sched_move_irqs(unit);
> +        }
>       }
>   
>       rcu_read_unlock(&sched_res_rculock);
> ---8<---
> 
> I have several doubts here:
> 
> 1. If old_cpu is available, is sched_set_res() needed at all?
> 2. Should both calls be changed to sched_migrate()? Currently I changed
>     only the second one, in case scheduler could be confused about
>     old_cpu not being available anymore.
> 3. Are there any extra locking requirements for sched_migrate() at this
>     stage? The long comment above sched_unit_migrate_start() suggests
>     there might be, but I'm not sure if that's really the case during
>     resume.
> 4. Related to the above - should thaw_domains() be modified to call
>     restore_vcpu_affinity() for all domains first, and unpause only
>     later? That could reduce locking requirements, I guess.
> 

Looking into this in more detail, I think the fix is much simpler.

If a cpu doesn't come up again after suspend, cpu_disable_scheduler() is
called for that cpu. This already does the needed vcpu migration, so we can
just remove restore_vcpu_affinity() completely, without any need for a
replacement.
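
In diff terms the change would be roughly the following (untested sketch,
file path and hunk context from memory; the function itself and its
declaration would be dropped as well):

--- a/xen/arch/x86/acpi/power.c
+++ b/xen/arch/x86/acpi/power.c
@@ ... @@ static void thaw_domains(void)
     rcu_read_lock(&domlist_read_lock);
     for_each_domain ( d )
     {
-        restore_vcpu_affinity(d);
         domain_unpause(d);
     }
     rcu_read_unlock(&domlist_read_lock);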

I'll write a patch.


Juergen

[-- Attachment #1.1.2: OpenPGP public key --]
[-- Type: application/pgp-keys, Size: 3149 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 495 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2022-10-21  6:41 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-18 14:16 [Xen-devel] Xen crash after S3 suspend - Xen 4.13 Marek Marczykowski-Górecki
2020-03-18 14:50 ` Andrew Cooper
2020-03-18 22:10   ` Marek Marczykowski-Górecki
2020-03-19  0:28     ` Dario Faggioli
2020-03-19  0:59       ` Marek Marczykowski-Górecki
2020-03-23  0:09       ` Marek Marczykowski-Górecki
2020-03-23  8:14         ` Jan Beulich
2020-09-29 14:27         ` Marek Marczykowski-Górecki
2020-09-29 15:07           ` Jürgen Groß
2020-09-29 15:16             ` Marek Marczykowski-Górecki
2020-09-29 15:27               ` Jürgen Groß
2021-01-31  2:15                 ` [Xen-devel] Xen crash after S3 suspend - Xen 4.13 and newer Marek Marczykowski-Górecki
2021-10-09 16:28                   ` Marek Marczykowski-Górecki
2022-08-21 16:14                     ` Marek Marczykowski-Górecki
2022-08-22  9:53                       ` Jan Beulich
2022-08-22 10:00                         ` Marek Marczykowski-Górecki
2022-09-20 10:22                           ` Marek Marczykowski-Górecki
2022-09-20 14:30                             ` Jan Beulich
2022-10-11 11:22                               ` Marek Marczykowski-Górecki
2022-10-14 16:42                             ` George Dunlap
2022-10-21  6:41                             ` Juergen Gross
2022-08-22 15:34                       ` Juergen Gross
2022-09-06 11:46                         ` Juergen Gross
2022-09-06 12:35                           ` Marek Marczykowski-Górecki
2022-09-07 12:21                             ` Dario Faggioli
2022-09-07 15:07                               ` marmarek
