On Tue, Sep 06, 2022 at 01:46:55PM +0200, Juergen Gross wrote:
> On 22.08.22 17:34, Juergen Gross wrote:
> > On 21.08.22 18:14, Marek Marczykowski-Górecki wrote:
> > > On Sat, Oct 09, 2021 at 06:28:17PM +0200, Marek Marczykowski-Górecki wrote:
> > > > On Sun, Jan 31, 2021 at 03:15:30AM +0100, Marek Marczykowski-Górecki wrote:
> > > > > On Tue, Sep 29, 2020 at 05:27:48PM +0200, Jürgen Groß wrote:
> > > > > > On 29.09.20 17:16, Marek Marczykowski-Górecki wrote:
> > > > > > > On Tue, Sep 29, 2020 at 05:07:11PM +0200, Jürgen Groß wrote:
> > > > > > > > On 29.09.20 16:27, Marek Marczykowski-Górecki wrote:
> > > > > > > > > On Mon, Mar 23, 2020 at 01:09:49AM +0100, Marek Marczykowski-Górecki wrote:
> > > > > > > > > > On Thu, Mar 19, 2020 at 01:28:10AM +0100, Dario Faggioli wrote:
> > > > > > > > > > > [Adding Juergen]
> > > > > > > > > > > 
> > > > > > > > > > > On Wed, 2020-03-18 at 23:10 +0100, Marek Marczykowski-Górecki wrote:
> > > > > > > > > > > > On Wed, Mar 18, 2020 at 02:50:52PM +0000, Andrew Cooper wrote:
> > > > > > > > > > > > > On 18/03/2020 14:16, Marek Marczykowski-Górecki wrote:
> > > > > > > > > > > > > > Hi,
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > In my test setup (inside KVM with nested virt enabled), I rather
> > > > > > > > > > > > > > frequently get a Xen crash on resume from S3. Full message below.
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > This is Xen 4.13.0, with some patches, including "sched: fix
> > > > > > > > > > > > > > resuming from S3 with smt=0".
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > Contrary to the previous issue, this one does not happen always -
> > > > > > > > > > > > > > I would say in about 40% of cases on this setup, but very rarely
> > > > > > > > > > > > > > on a physical setup.
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > This is _without_ core scheduling enabled, and also with smt=off.
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > Do you think it would be any different on xen-unstable? I can
> > > > > > > > > > > > > > try, but it isn't trivial in this setup, so I'd ask first.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > Well, Juergen has fixed quite a few issues.
> > > > > > > > > > > 
> > > > > > > > > > > Most of them were triggering with core-scheduling enabled, and I don't
> > > > > > > > > > > recall any of them which looked similar or related to this.
> > > > > > > > > > > 
> > > > > > > > > > > Still, it's possible that the same issue causes different symptoms, and
> > > > > > > > > > > hence that maybe one of the patches would fix this too.
> > > > > > > > > > 
> > > > > > > > > > I've tested on master (d094e95fb7c), and reproduced exactly the same crash
> > > > > > > > > > (pasted below for completeness). But there is more: additionally, in most
> > > > > > > > > > (all?) cases after resume I've got a soft lockup in Linux dom0 in
> > > > > > > > > > smp_call_function_single() - see below. It didn't happen before, and the
> > > > > > > > > > only change was Xen 4.13 -> master.
> > > > > > > > > > 
> > > > > > > > > > Xen crash:
> > > > > > > > > > 
> > > > > > > > > > (XEN) Assertion 'c2rqd(sched_unit_master(unit)) == svc->rqd' failed at credit2.c:2133
> > > > > > > > > 
> > > > > > > > > Juergen, any idea about this one? This is also happening on the current
> > > > > > > > > stable-4.14 (28855ebcdbfa).
> > > > > > > > Oh, sorry I didn't come back to this issue.
> > > > > > > > 
> > > > > > > > I suspect this is related to stop_machine_run() being called during
> > > > > > > > suspend(), as I'm seeing very sporadic issues when offlining and then
> > > > > > > > onlining cpus with core scheduling being active (it seems as if the
> > > > > > > > dom0 vcpu doing the cpu online activity sometimes is using an old
> > > > > > > > vcpu state).
> > > > > > > 
> > > > > > > Note this is a default Xen 4.14 start, so core scheduling is _not_ active:
> > > > > > 
> > > > > > The similarity in the two failure cases is that multiple cpus are
> > > > > > affected by the operations during stop_machine_run().
> > > > > > 
> > > > > > >       (XEN) Brought up 2 CPUs
> > > > > > >       (XEN) Scheduling granularity: cpu, 1 CPU per sched-resource
> > > > > > >       (XEN) Adding cpu 0 to runqueue 0
> > > > > > >       (XEN)  First cpu on runqueue, activating
> > > > > > >       (XEN) Adding cpu 1 to runqueue 1
> > > > > > >       (XEN)  First cpu on runqueue, activating
> > > > > > 
> > > > > > > > I wasn't able to catch the real problem despite having tried lots
> > > > > > > > of approaches using debug patches.
> > > > > > > > 
> > > > > > > > Recently I suspected the whole problem could be somehow related to
> > > > > > > > RCU handling, as stop_machine_run() is relying on tasklets which are
> > > > > > > > executing in idle context, and RCU handling is done in idle context,
> > > > > > > > too. So there might be some kind of use-after-free scenario in case
> > > > > > > > some memory is freed via RCU despite it still being used by a tasklet.
> > > > > > > 
> > > > > > > That sounds plausible, even though I don't really know this area of Xen.
> > > > > > > 
> > > > > > > > I "just" need to find some time to verify this suspicion. Any help
> > > > > > > > doing this would be appreciated. :-)
> > > > > > > 
> > > > > > > I do have a setup where I can easily-ish reproduce the issue. If there
> > > > > > > is some debug patch you'd like me to try, I can do that.
> > > > > > 
> > > > > > Thanks. I might come back to that offer, as you are seeing a crash which
> > > > > > will be much easier to analyze. Catching my error case is much harder, as
> > > > > > it surfaces some time after the real problem in a non-destructive way
> > > > > > (usually I'm seeing a failure to load a library in the program which
> > > > > > just did its job via exactly the library claiming not to be loadable).
> > > > > 
> > > > > Hi,
> > > > > 
> > > > > I'm resurrecting this thread as it was recently mentioned elsewhere. I
> > > > > can still reproduce the issue on the recent staging branch (9dc687f155).
> > > > > 
> > > > > It fails after the first resume (not always, but frequently enough to
> > > > > debug it). At least one guest needs to be running - with just (PV) dom0
> > > > > the crash doesn't happen (at least for the ~8 times in a row I tried).
> > > > > If the first resume works, the second will (almost?) always fail, but
> > > > > with different symptoms - dom0 kernel lockups (of at least some of its
> > > > > vcpus). I haven't debugged this one at all yet.
> > > > > 
> > > > > Any help will be appreciated; I can apply some debug patches, change
> > > > > configuration, etc.
> > > > 
> > > > This still happens on 4.14.3. Maybe it is related to freeing percpu
> > > > areas, as it caused other issues with suspend too? Just a thought...
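To make the use-after-free suspicion quoted above concrete: stop_machine_run() drives its work through tasklets that execute in idle-vcpu context, and RCU callbacks are also invoked from idle context, so an object freed via call_rcu() could in principle be reclaimed while a tasklet still holds a pointer into it. A minimal sketch of that hazard follows; struct work, retire_work(), tasklet_body() and process() are made-up names for illustration, not Xen APIs - only call_rcu(), container_of() and xfree() are real:

    /* Illustration only: the suspected RCU-vs-tasklet ordering hazard. */
    struct work {
        struct rcu_head rcu;        /* for deferred freeing */
        int payload;
    };

    static struct work *shared;     /* also reachable from a tasklet */

    static void rcu_free_work(struct rcu_head *head)
    {
        /* Runs from idle context once a grace period has elapsed. */
        xfree(container_of(head, struct work, rcu));
    }

    static void retire_work(struct work *w)
    {
        shared = NULL;
        call_rcu(&w->rcu, rcu_free_work);   /* schedule deferred free */
    }

    /* Tasklet body; tasklets likewise run from idle context. */
    static void tasklet_body(struct work *w)
    {
        /*
         * If the grace period is treated as complete while this tasklet
         * is still running - conceivable when both share the idle
         * context - rcu_free_work() may already have freed *w here:
         */
        process(w->payload);            /* potential use after free */
    }

If that is what happens, the symptom would look exactly as described: corruption that surfaces some time after the real problem, in whatever code happens to reuse the freed memory.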
> > > I have reproduced this on current staging(*). And I can reproduce it
> > > reliably. And also, I got (I believe) a closely related crash with the
> > > credit1 scheduler.
> > > 
> > > (*) It isn't plain staging, it's one with my xhci console patches on
> > > top, including an attempt to make it survive S3. I believe the only
> > > relevant part there is sticking set_timer() into the console resume path
> > > (or just having a timer with a rather short delay registered). The actual
> > > tree is at https://github.com/marmarek/xen/tree/master-xue2-debug,
> > > including quite a lot of debug prints and debug hacks.
> > > 
> > > Specific crash with credit2:
> > > 
> > >      (XEN) Assertion 'sched_unit_master(currunit) == cpu' failed at common/sched/credit.c:928
> > >      (XEN) ----[ Xen-4.17-unstable  x86_64  debug=y  Tainted:   C    ]----
> > >      (XEN) CPU:    0
> > >      (XEN) RIP:    e008:[] credit.c#csched_tick+0x2d4/0x494
> > >      (XEN) RFLAGS: 0000000000010202   CONTEXT: hypervisor (d0v4)
> > >      (XEN) rax: ffff82d0405c4298   rbx: 0000000000000002   rcx: 0000000000000002
> > >      (XEN) rdx: ffff8302517f64d0   rsi: ffff8302515c0fc0   rdi: 0000000000000002
> > >      (XEN) rbp: ffff830256227e38   rsp: ffff830256227de0   r8:  0000000000000004
> > >      (XEN) r9:  ffff8302517ac820   r10: ffff830251745068   r11: 00000088cb734887
> > >      (XEN) r12: ffff83025174de50   r13: ffff8302515c0fa0   r14: ffff83025174df40
> > >      (XEN) r15: ffff8302515c0cc0   cr0: 0000000080050033   cr4: 0000000000372660
> > >      (XEN) cr3: 00000001bacbd000   cr2: 000077e5ec02a318
> > >      (XEN) fsb: 000077e5fe533700   gsb: ffff888255700000   gss: 0000000000000000
> > >      (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
> > >      (XEN) Xen code around (credit.c#csched_tick+0x2d4/0x494):
> > >      (XEN)  01 00 00 e9 2a 01 00 00 <0f> 0b 0f 0b 0f 0b 48 8b 41 20 0f b7 00 89 45 cc
> > >      (XEN) Xen stack trace from rsp=ffff830256227de0:
> > >      (XEN)    ffff830256227fff 0000000000000000 0000000256227e10 ffff82d04035be90
> > >      (XEN)    ffff830256227ef8 ffff830251745000 ffff82d0405c3280 ffff82d0402431eb
> > >      (XEN)    0000000000000002 00000088c9ba9534 0000000000000000 ffff830256227e60
> > >      (XEN)    ffff82d04022ee53 ffff82d0405c3280 ffff8302963e1320 ffff8302515c0fc0
> > >      (XEN)    ffff830256227ea0 ffff82d04022f73f ffff830256227e80 ffff82d0405c9f00
> > >      (XEN)    ffffffffffffffff ffff82d0405c9f00 ffff830256227fff 0000000000000000
> > >      (XEN)    ffff830256227ed8 ffff82d04022d26c ffff830251745000 0000000000000000
> > >      (XEN)    0000000000000000 ffff830256227fff 0000000000000000 ffff830256227ee8
> > >      (XEN)    ffff82d04022d2ff 00007cfda9dd80e7 ffff82d0402f03c6 ffff88810c005c00
> > >      (XEN)    0000000000000031 0000000000000100 00000000fffffe00 0000000000000031
> > >      (XEN)    0000000000000031 ffffffff82d45d28 0000000000000e2e 0000000000000000
> > >      (XEN)    0000000000000032 00000000ffffef31 0000000000000000 ffff88812244a700
> > >      (XEN)    0000000000000005 ffff88812244a780 000000fa00000000 ffffffff818db55f
> > >      (XEN)    000000000000e033 0000000000000246 ffffc900409b7c50 000000000000e02b
> > >      (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > >      (XEN)    0000e01000000000 ffff830251745000 0000000000000000 0000000000372660
> > >      (XEN)    0000000000000000 800000025620b002 000e030300000001 0000000000000000
> > >      (XEN) Xen call trace:
> > >      (XEN)    [] R credit.c#csched_tick+0x2d4/0x494
> > >      (XEN)    [] F timer.c#execute_timer+0x45/0x5c
> > >      (XEN)    [] F timer.c#timer_softirq_action+0x71/0x278
> > >      (XEN)    [] F softirq.c#__do_softirq+0x94/0xbe
> > >      (XEN)    [] F do_softirq+0x13/0x15
> > >      (XEN)    [] F x86_64/entry.S#process_softirqs+0x6/0x20
> > >      (XEN)
> > >      (XEN)
> > >      (XEN) ****************************************
> > >      (XEN) Panic on CPU 0:
> > >      (XEN) Assertion 'sched_unit_master(currunit) == cpu' failed at common/sched/credit.c:928
> > >      (XEN) ****************************************
> > > 
> > > Specific crash with credit1:
> > > 
> > >      (XEN) ASSERT FAIL! cpu 4, sched_unit_master 0, currunit id 4
> > >      (XEN) Assertion 'sched_unit_master(currunit) == cpu' failed at common/sched/credit.c:936
> > >      (XEN) ----[ Xen-4.17-unstable  x86_64  debug=y  Tainted:   C    ]----
> > >      (XEN) CPU:    0
> > >      (XEN) RIP:    e008:[] credit.c#csched_tick+0x2f7/0x4b7
> > >      (XEN) RFLAGS: 0000000000010202   CONTEXT: hypervisor (d0v4)
> > >      (XEN) rax: ffff8302517f64d0   rbx: 0000000000000004   rcx: 0000000000000000
> > >      (XEN) rdx: ffff830256227fff   rsi: 000000000000000a   rdi: ffff82d0404786b8
> > >      (XEN) rbp: ffff830256227e38   rsp: ffff830256227de0   r8:  0000000000000087
> > >      (XEN) r9:  0000000000000001   r10: 0000000000000001   r11: 0000000000000004
> > >      (XEN) r12: ffff83025174de50   r13: ffff830251778100   r14: ffff83025174df40
> > >      (XEN) r15: ffff8302515c0cc0   cr0: 0000000080050033   cr4: 0000000000372660
> > >      (XEN) cr3: 000000024ac10000   cr2: 0000000000000000
> > >      (XEN) fsb: 0000000000000000   gsb: ffff888255700000   gss: 0000000000000000
> > >      (XEN) ds: 002b   es: 002b   fs: 0000   gs: 0000   ss: 0000   cs: e008
> > >      (XEN) Xen code around (credit.c#csched_tick+0x2f7/0x4b7):
> > >      (XEN)  58 70 0f 84 bf fe ff ff <0f> 0b 0f 0b 0f 0b 48 8b 40 20 0f b7 00 89 45 cc
> > >      (XEN) Xen stack trace from rsp=ffff830256227de0:
> > >      (XEN)    000000000046692a ffff8302515c0bc0 000000044025ae27 002191c0517f64e8
> > >      (XEN)    ffff82d0405c33e0 ffff8302517c8590 ffff82d0405c3280 ffff82d040243265
> > >      (XEN)    0000000000000004 00000026d27a80a7 0000000000000000 ffff830256227e60
> > >      (XEN)    ffff82d04022ee94 ffff82d0405c3280 ffff8302517e7df0 ffff830251778120
> > >      (XEN)    ffff830256227ea0 ffff82d04022f77a ffff830256227e80 ffff82d0405c9f00
> > >      (XEN)    ffffffffffffffff ffff82d0405c9f00 ffff830256227fff 0000000000000000
> > >      (XEN)    ffff830256227ed8 ffff82d04022d26c ffff830251745000 0000000000000000
> > >      (XEN)    0000000000000000 ffff830256227fff 0000000000000000 ffff830256227ee8
> > >      (XEN)    ffff82d04022d2ff 00007cfda9dd80e7 ffff82d0402f0496 0000000000000001
> > >      (XEN)    ffffc90040127ddc ffffc90040127dd0 ffffc90040127dd4 ffffc90040127dd8
> > >      (XEN)    00000000756e6547 0000000000000000 0000000000000000 ffffc90040127ddc
> > >      (XEN)    ffffc90040127dd0 000000000000000d 000000006c65746e 0000000049656e69
> > >      (XEN)    ffffc90040127dd4 ffffc90040127dd8 000000fa00000000 ffffffff8101c5a4
> > >      (XEN)    000000000000e033 0000000000000293 ffffc90040127db8 000000000000e02b
> > >      (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > >      (XEN)    0000e01000000000 ffff830251745000 0000000000000000 0000000000372660
> > >      (XEN)    0000000000000000 800000025620b002 000e030300000001 0000000000000000
> > >      (XEN) Xen call trace:
> > >      (XEN)    [] R credit.c#csched_tick+0x2f7/0x4b7
> > >      (XEN)    [] F timer.c#execute_timer+0x45/0x5c
> > >      (XEN)    [] F timer.c#timer_softirq_action+0x71/0x278
> > >      (XEN)    [] F softirq.c#__do_softirq+0x94/0xbe
> > >      (XEN)    [] F do_softirq+0x13/0x15
> > >      (XEN)    [] F x86_64/entry.S#process_softirqs+0x6/0x20
> > >      (XEN)
> > >      (XEN) ****************************************
> > >      (XEN) Panic on CPU 0:
> > >      (XEN) Assertion 'sched_unit_master(currunit) == cpu' failed at common/sched/credit.c:936
> > >      (XEN) ****************************************
> > > 
> > > I'm not sure about the nature of the credit2 crash, but for credit1 it
> > > looks much simpler to understand. Credit1 registers a timer for each
> > > pcpu (in init_pdata()). Then, the timer function
> > > (csched_tick()->csched_unit_acct()) asserts that it was really called on
> > > the correct CPU. The issue is that the suspend path migrates all timers
> > > to CPU0 (timer.c:cpu_callback->migrate_timers_from_cpu()), and the
> > > resume path doesn't migrate them back. All this while not unregistering
> > > pCPUs from the scheduler (sched_deinit_pdata() / init_pdata() is not
> > > called - confirmed via debug prints).
> > > 
> > > I'm not exactly sure if that's the same issue that applies to credit2,
> > > but I think it's likely. A naive fix could be unregistering all pCPUs
> > > from the scheduler (and registering them back on resume), but this may
> > > mess up CPU pools, and probably a few other things. Any better ideas? I
> > > can test various patches, provide more debug info, etc. - it should be
> > > rather straightforward, given I have a reliable reproducer now.
> > 
> > Unregistering the cpus from the scheduler would break core scheduling.
> > 
> > I'll have a look if the timers can just be deactivated in order not to
> > have them migrated to cpu0.
> 
> Could you test the attached patch, please?

I did a test with only dom0 running, and it works now. It isn't a
comprehensive test, but just dom0 was enough to crash it before, and it
stays working now. Thanks!

Tested-by: Marek Marczykowski-Górecki
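To illustrate the analysis above before the patch itself: credit1's tick boils down to one self-re-arming timer per pcpu whose handler expects to run on the pcpu it was registered for. A condensed sketch using Xen's timer API (init_timer()/set_timer()); the tick_timer variable, the setup_tick() wrapper and the 30ms period are illustrative stand-ins for the real init_pdata() code:

    /* Condensed sketch of the per-pcpu tick described above. */
    static DEFINE_PER_CPU(struct timer, tick_timer);

    static void cf_check csched_tick(void *_cpu)
    {
        unsigned int cpu = (unsigned long)_cpu;

        /*
         * csched_unit_acct() effectively asserts that the tick fired on
         * the pcpu it was registered for:
         *     ASSERT(sched_unit_master(currunit) == cpu);
         * After S3, migrate_timers_from_cpu() has moved this timer to
         * CPU0 while its data still names e.g. cpu 4, so the timer now
         * fires on the wrong pcpu and the assertion triggers.
         */
        ASSERT(cpu == smp_processor_id());

        /* ... accounting work ..., then re-arm on the same pcpu: */
        set_timer(&per_cpu(tick_timer, cpu), NOW() + MILLISECS(30));
    }

    /* init_pdata() sets up one such timer per pcpu, roughly like this: */
    static void setup_tick(unsigned int cpu)
    {
        init_timer(&per_cpu(tick_timer, cpu), csched_tick,
                   (void *)(unsigned long)cpu, cpu);
        set_timer(&per_cpu(tick_timer, cpu), NOW() + MILLISECS(30));
    }

With the patch quoted below, such timers simply stay on their pcpu across suspend, so the handler's assumption holds again after resume.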
> 
> Juergen
> 
> From a2c224bee53e2216d41b34518bb2d11b64d63b6a Mon Sep 17 00:00:00 2001
> From: Juergen Gross
> To: xen-devel@lists.xenproject.org
> Cc: Andrew Cooper
> Cc: George Dunlap
> Cc: Jan Beulich
> Cc: Julien Grall
> Cc: Stefano Stabellini
> Cc: Wei Liu
> Date: Tue, 6 Sep 2022 13:39:48 +0200
> Subject: [PATCH] xen/timer: don't migrate timers away from cpus during suspend
> 
> During a suspend/resume cycle timers on all cpus but cpu 0 will be
> migrated to cpu 0, as the other cpus are taken down.
> 
> This is problematic in case such a timer is related to a specific vcpu,
> as the vcpus are not migrated to another cpu during suspend (migrating
> them would break cpupools and core scheduling).
> 
> In order to avoid the problems just try to keep the timers on their
> cpus. Only migrate them away in case resume failed. Doing so isn't
> problematic, as any vcpu on a cpu not coming back to life would be
> migrated away, too.
> 
> Signed-off-by: Juergen Gross
> ---
>  xen/common/timer.c | 18 +++++++++++++-----
>  1 file changed, 13 insertions(+), 5 deletions(-)
> 
> diff --git a/xen/common/timer.c b/xen/common/timer.c
> index 9b5016d5ed..6b5473e5f1 100644
> --- a/xen/common/timer.c
> +++ b/xen/common/timer.c
> @@ -637,6 +637,13 @@ static void free_percpu_timers(unsigned int cpu)
>          ASSERT(ts->heap == dummy_heap);
>  }
>  
> +static void deinit_timers(unsigned int cpu)
> +{
> +    migrate_timers_from_cpu(cpu);
> +    if ( !park_offline_cpus )
> +        free_percpu_timers(cpu);
> +}
> +
>  static int cf_check cpu_callback(
>      struct notifier_block *nfb, unsigned long action, void *hcpu)
>  {
> @@ -655,13 +662,14 @@ static int cf_check cpu_callback(
>          }
>          break;
>  
> -    case CPU_UP_CANCELED:
>      case CPU_DEAD:
> -    case CPU_RESUME_FAILED:
> -        migrate_timers_from_cpu(cpu);
> +        if ( system_state != SYS_STATE_suspend )
> +            deinit_timers(cpu);
> +        break;
>  
> -        if ( !park_offline_cpus && system_state != SYS_STATE_suspend )
> -            free_percpu_timers(cpu);
> +    case CPU_UP_CANCELED:
> +    case CPU_RESUME_FAILED:
> +        deinit_timers(cpu);
>          break;
>  
>      case CPU_REMOVE:
> -- 
> 2.35.3

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
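A final note on the diagnosis: the "ASSERT FAIL! cpu 4, sched_unit_master 0, currunit id 4" line in the credit1 report above apparently comes from a debug print in Marek's debug tree placed just ahead of the failing assertion. For anyone wanting to reproduce the diagnosis rather than the fix, a sketch of such a debug aid (assuming the locals available in csched_unit_acct(); the exact names may differ):

    /* Hypothetical debug aid just ahead of the assertion: */
    if ( sched_unit_master(currunit) != cpu )
        printk("ASSERT FAIL! cpu %u, sched_unit_master %u, currunit id %u\n",
               cpu, sched_unit_master(currunit), currunit->unit_id);
    ASSERT(sched_unit_master(currunit) == cpu);

Printing the conflicting values before the ASSERT is what makes the failure mode - the timer executing on CPU0 while its data still names cpu 4 - immediately visible in the log.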