xen-devel.lists.xenproject.org archive mirror
* [Xen-devel] dom0less + sched=null => broken in staging
@ 2019-08-07 18:22 Stefano Stabellini
  2019-08-08  8:04 ` George Dunlap
  2019-08-09 17:57 ` Dario Faggioli
  0 siblings, 2 replies; 26+ messages in thread
From: Stefano Stabellini @ 2019-08-07 18:22 UTC (permalink / raw)
  To: George.Dunlap, dfaggioli; +Cc: xen-devel, sstabellini

Hi Dario, George,

Dom0less with sched=null is broken on staging, it simply hangs soon
after Xen is finished loading things. My impression is that vcpus are
not actually started. I did a git bisection and it pointed to:

commit d545f1d6c2519a183ed631cfca7aff0baf29fde5 (refs/bisect/bad)
Author: Dario Faggioli <dfaggioli@suse.com>
Date:   Mon Aug 5 11:50:55 2019 +0100

    xen: sched: deal with vCPUs being or becoming online or offline
    
Any ideas?

Cheers,

Stefano


* Re: [Xen-devel] dom0less + sched=null => broken in staging
  2019-08-07 18:22 [Xen-devel] dom0less + sched=null => broken in staging Stefano Stabellini
@ 2019-08-08  8:04 ` George Dunlap
  2019-08-08 20:44   ` Stefano Stabellini
  2019-08-09 17:57 ` Dario Faggioli
  1 sibling, 1 reply; 26+ messages in thread
From: George Dunlap @ 2019-08-08  8:04 UTC (permalink / raw)
  To: Stefano Stabellini, George.Dunlap, dfaggioli; +Cc: xen-devel

On 8/7/19 7:22 PM, Stefano Stabellini wrote:
> Hi Dario, George,
> 
> Dom0less with sched=null is broken on staging, it simply hangs soon
> after Xen is finished loading things. My impression is that vcpus are
> not actually started. I did a git bisection and it pointed to:
> 
> commit d545f1d6c2519a183ed631cfca7aff0baf29fde5 (refs/bisect/bad)
> Author: Dario Faggioli <dfaggioli@suse.com>
> Date:   Mon Aug 5 11:50:55 2019 +0100
> 
>     xen: sched: deal with vCPUs being or becoming online or offline

That's Dario's patch -- Dario, can you take a look?

Stefano, how urgent is it for things to work for you -- i.e., at what
point do you want to consider reverting the patch?

 -George


* Re: [Xen-devel] dom0less + sched=null => broken in staging
  2019-08-08  8:04 ` George Dunlap
@ 2019-08-08 20:44   ` Stefano Stabellini
  2019-08-09  7:40     ` Dario Faggioli
  0 siblings, 1 reply; 26+ messages in thread
From: Stefano Stabellini @ 2019-08-08 20:44 UTC (permalink / raw)
  To: George Dunlap; +Cc: George.Dunlap, xen-devel, Stefano Stabellini, dfaggioli

On Thu, 8 Aug 2019, George Dunlap wrote:
> On 8/7/19 7:22 PM, Stefano Stabellini wrote:
> > Hi Dario, George,
> > 
> > Dom0less with sched=null is broken on staging, it simply hangs soon
> > after Xen is finished loading things. My impression is that vcpus are
> > not actually started. I did a git bisection and it pointed to:
> > 
> > commit d545f1d6c2519a183ed631cfca7aff0baf29fde5 (refs/bisect/bad)
> > Author: Dario Faggioli <dfaggioli@suse.com>
> > Date:   Mon Aug 5 11:50:55 2019 +0100
> > 
> >     xen: sched: deal with vCPUs being or becoming online or offline
> 
> That's Dario's patch -- Dario, can you take a look?
> 
> Stefano, how urgent is it for things to work for you -- i.e., at what
> point do you want to consider reverting the patch?

Of course, we cannot make a release with this issue. I can live with it
for now because I have a revert for
d545f1d6c2519a183ed631cfca7aff0baf29fde5 at the top of all my working
branches and the production branches at Xilinx are based on the last
release.


* Re: [Xen-devel] dom0less + sched=null => broken in staging
  2019-08-08 20:44   ` Stefano Stabellini
@ 2019-08-09  7:40     ` Dario Faggioli
  0 siblings, 0 replies; 26+ messages in thread
From: Dario Faggioli @ 2019-08-09  7:40 UTC (permalink / raw)
  To: Stefano Stabellini, George Dunlap; +Cc: George.Dunlap, xen-devel



On Thu, 2019-08-08 at 13:44 -0700, Stefano Stabellini wrote:
> On Thu, 8 Aug 2019, George Dunlap wrote:
> > On 8/7/19 7:22 PM, Stefano Stabellini wrote:
> > > Hi Dario, George,
> > > 
> > > Dom0less with sched=null is broken on staging, it simply hangs
> > > soon
> > > after Xen is finished loading things. My impression is that vcpus
> > > are
> > > not actually started. I did a git bisection and it pointed to:
> > > 
> > > commit d545f1d6c2519a183ed631cfca7aff0baf29fde5 (refs/bisect/bad)
> > > Author: Dario Faggioli <dfaggioli@suse.com>
> > > Date:   Mon Aug 5 11:50:55 2019 +0100
> > > 
> > >     xen: sched: deal with vCPUs being or becoming online or
> > > offline
> > 
> > That's Dario's patch -- Dario, can you take a look?
> > 
> > Stefano, how urgent is it for things to work for you -- i.e., at
> > what
> > point do you want to consider reverting the patch?
> 
Ok... The patches work for me, in a "dom0full" configuration. :-)

> Of course, we cannot make a release with this issue. I can live with
> it
> for now because I have a revert for
> d545f1d6c2519a183ed631cfca7aff0baf29fde5 at the top of all my working
> branches and the production branches at Xilinx are based on the last
> release.
>
I'll take a look.

Regards
-- 
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
-------------------------------------------------------------------
<<This happens because _I_ choose it to happen!>> (Raistlin Majere)



* Re: [Xen-devel] dom0less + sched=null => broken in staging
  2019-08-07 18:22 [Xen-devel] dom0less + sched=null => broken in staging Stefano Stabellini
  2019-08-08  8:04 ` George Dunlap
@ 2019-08-09 17:57 ` Dario Faggioli
  2019-08-09 18:30   ` Stefano Stabellini
  1 sibling, 1 reply; 26+ messages in thread
From: Dario Faggioli @ 2019-08-09 17:57 UTC (permalink / raw)
  To: Stefano Stabellini, George.Dunlap; +Cc: xen-devel



On Wed, 2019-08-07 at 11:22 -0700, Stefano Stabellini wrote:
> Hi Dario, George,
> 
> Dom0less with sched=null is broken on staging, it simply hangs soon
> after Xen is finished loading things. My impression is that vcpus are
> not actually started. I did a git bisection and it pointed to:
> 
> commit d545f1d6c2519a183ed631cfca7aff0baf29fde5 (refs/bisect/bad)
> Author: Dario Faggioli <dfaggioli@suse.com>
> Date:   Mon Aug 5 11:50:55 2019 +0100
> 
>     xen: sched: deal with vCPUs being or becoming online or offline
>     
> Any ideas?
> 
Ok, I've done some basic testing, and inspected the code again, and
honestly I am not finding anything really suspicious.

Of course, I'm not really testing dom0less, and I'm not sure I can
easily do that.

Can you help me with this, e.g., by providing some more info and, if
possible, logs?

E.g., you say boot stops after Xen loading. Is there a bootlog that we
can see (ideally from a debug build, and with "loglvl=all
guest_loglvl=all")?

Does the system respond to debug-keys? If yes, the log after triggering
the 'r' debug-key would be useful.

These patches are about vcpus going offline and online... does dom0less
play with vcpu offline/online in any way?

I've put together a debug patch (attached), focusing on what the
mentioned commit does, but it's nothing more than a shot in the dark,
for now...

Thanks and Regards
-- 
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
-------------------------------------------------------------------
<<This happens because _I_ choose it to happen!>> (Raistlin Majere)


[-- Attachment #1.1.2: xen-sched-null-vcpu-onoff-debug.patch --]
[-- Type: text/x-patch, Size: 1594 bytes --]

diff --git a/xen/common/sched_null.c b/xen/common/sched_null.c
index 26c6f0f129..afd42e552f 100644
--- a/xen/common/sched_null.c
+++ b/xen/common/sched_null.c
@@ -455,6 +455,7 @@ static void null_vcpu_insert(const struct scheduler *ops, struct vcpu *v)
 
     if ( unlikely(!is_vcpu_online(v)) )
     {
+        dprintk(XENLOG_G_INFO, "Not inserting %pv (not online!)\n", v);
         vcpu_schedule_unlock_irq(lock, v);
         return;
     }
@@ -516,6 +517,7 @@ static void null_vcpu_remove(const struct scheduler *ops, struct vcpu *v)
     /* If offline, the vcpu shouldn't be assigned, nor in the waitqueue */
     if ( unlikely(!is_vcpu_online(v)) )
     {
+        dprintk(XENLOG_G_INFO, "Not removing %pv (wasn't online!)\n", v);
         ASSERT(per_cpu(npc, v->processor).vcpu != v);
         ASSERT(list_empty(&nvc->waitq_elem));
         goto out;
@@ -635,6 +637,8 @@ static void null_vcpu_sleep(const struct scheduler *ops, struct vcpu *v)
         }
         else if ( per_cpu(npc, cpu).vcpu == v )
             tickled = vcpu_deassign(prv, v);
+
+        dprintk(XENLOG_G_INFO, "%pv is, apparently, going offline (tickled=%d)\n", v, tickled);
     }
 
     /* If v is not assigned to a pCPU, or is not running, no need to bother */
@@ -697,6 +701,8 @@ static void null_vcpu_migrate(const struct scheduler *ops, struct vcpu *v,
      */
     if ( unlikely(!is_vcpu_online(v)) )
     {
+        dprintk(XENLOG_G_INFO, "%pv is, apparently, going offline\n", v);
+
         spin_lock(&prv->waitq_lock);
         list_del_init(&nvc->waitq_elem);
         spin_unlock(&prv->waitq_lock);


* Re: [Xen-devel] dom0less + sched=null => broken in staging
  2019-08-09 17:57 ` Dario Faggioli
@ 2019-08-09 18:30   ` Stefano Stabellini
  2019-08-13 15:27     ` Dario Faggioli
  0 siblings, 1 reply; 26+ messages in thread
From: Stefano Stabellini @ 2019-08-09 18:30 UTC (permalink / raw)
  To: Dario Faggioli; +Cc: George.Dunlap, xen-devel, Stefano Stabellini


On Fri, 9 Aug 2019, Dario Faggioli wrote:
> On Wed, 2019-08-07 at 11:22 -0700, Stefano Stabellini wrote:
> > Hi Dario, George,
> > 
> > Dom0less with sched=null is broken on staging, it simply hangs soon
> > after Xen is finished loading things. My impression is that vcpus are
> > not actually started. I did a git bisection and it pointed to:
> > 
> > commit d545f1d6c2519a183ed631cfca7aff0baf29fde5 (refs/bisect/bad)
> > Author: Dario Faggioli <dfaggioli@suse.com>
> > Date:   Mon Aug 5 11:50:55 2019 +0100
> > 
> >     xen: sched: deal with vCPUs being or becoming online or offline
> >     
> > Any ideas?
> > 
> Ok, I've done some basic testing, and inspected the code again, and
> honestly I am not finding anything really suspicious.
> 
> Of course, I'm not really testing dom0less, and I'm not sure I can
> easily do that.
> 
> Can you help me with this, e.g., by providing some more info and, if
> possible, logs?

I am attaching the logs. Interestingly, I get a bunch of:

(XEN) *** LOADING DOMU cpus=1 memory=40000KB ***
(XEN) sched_null.c:458: Not inserting d2v0 (not online!)

Maybe we are missing a call to online the vcpus somewhere in
xen/arch/arm/domain_build.c:construct_domain?


> E.g., you say boot stops after Xen loading. Is there a bootlog that we
> can see (ideally from a debug build, and with "loglvl=all
> guest_loglvl=all")?
> 
> Does the system respond to debug-keys? If yes, the log after triggering
> the 'r' debug-key would be useful.

The system doesn't respond to debug keys. My guess is that it is still
too early in boot for that.


> These patches are about vcpus going offline and online... does dom0less
> play with vcpu offline/online in any way?
> 
> I've put together a debug patch (attached), focusing on what the
> mentioned commit does, but it's nothing more than a shot in the dark,
> for now...

[-- Attachment #2: Type: text/plain, Size: 8014 bytes --]

- UART enabled -
- Boot CPU booting -
- Current EL 00000008 -
- Zero BSS -
- Initialize CPU -
- Turning on paging -
- Ready -
(XEN) Checking for initrd in /chosen
(XEN) RAM: 0000000000000000 - 000000007fefffff
(XEN) RAM: 0000000800000000 - 000000087fffffff
(XEN) 
(XEN) MODULE[0]: 0000000005e00000 - 0000000005e08000 Device Tree 
(XEN) MODULE[1]: 0000000005c00000 - 0000000005d83400 Ramdisk     
(XEN) MODULE[2]: 0000000004c00000 - 0000000005bdfa00 Kernel      
(XEN) MODULE[3]: 0000000004a00000 - 0000000004b83400 Ramdisk     
(XEN) MODULE[4]: 0000000003a00000 - 00000000049dfa00 Kernel      
(XEN) MODULE[5]: 0000000002400000 - 00000000039fa954 Ramdisk     
(XEN) MODULE[6]: 0000000001000000 - 00000000022f2200 Kernel      
(XEN)  RESVD[0]: 0000000005e00000 - 0000000005e08000
(XEN) 
(XEN) CMDLINE[0000000004c00000]:domU1 console=ttyAMA0
(XEN) CMDLINE[0000000003a00000]:domU0 console=ttyAMA0
(XEN) CMDLINE[0000000001000000]:chosen console=hvc0 earlycon=xen earlyprintk=xen root=/dev/ram0
(XEN) 
(XEN) Command line: console=dtuart dtuart=serial0 dom0_mem=700M dom0_max_vcpus=1 bootscrub=0 serrors=forward vwfi=native sched=null
(XEN) PFN compression on bits 19...22
(XEN) Domain heap initialised
(XEN) Booting using Device Tree
(XEN) Platform: Xilinx ZynqMP
(XEN) Looking for dtuart at "serial0", options ""
 Xen 4.13-unstable
(XEN) Xen version 4.13-unstable (sstabellini@) (aarch64-linux-gnu-gcc (Linaro GCC 5.3-2016.05) 5.3.1 20160412) debug=y  Fri Aug  9 11:25:18 PDT 2019
(XEN) Latest ChangeSet: Fri Aug 9 13:14:40 2019 +0100 git:762b9a2d99-dirty
(XEN) build-id: 23d86e8e8792dcc96038a90d8dab9698ddc3ed57
(XEN) Processor: 410fd034: "ARM Limited", variant: 0x0, part 0xd03, rev 0x4
(XEN) 64-bit Execution:
(XEN)   Processor Features: 1100000000002222 0000000000000000
(XEN)     Exception Levels: EL3:64+32 EL2:64+32 EL1:64+32 EL0:64+32
(XEN)     Extensions: FloatingPoint AdvancedSIMD
(XEN)   Debug Features: 0000000010305106 0000000000000000
(XEN)   Auxiliary Features: 0000000000000000 0000000000000000
(XEN)   Memory Model Features: 0000000000001122 0000000000000000
(XEN)   ISA Features:  0000000000011120 0000000000000000
(XEN) 32-bit Execution:
(XEN)   Processor Features: 00001231:00011011
(XEN)     Instruction Sets: AArch32 A32 Thumb Thumb-2 ThumbEE Jazelle
(XEN)     Extensions: GenericTimer Security
(XEN)   Debug Features: 03010066
(XEN)   Auxiliary Features: 00000000
(XEN)   Memory Model Features: 10101105 40000000 01260000 02102211
(XEN)  ISA Features: 02101110 13112111 21232042 01112131 00011142 00011121
(XEN) Using SMC Calling Convention v1.1
(XEN) Using PSCI v1.1
(XEN) SMP: Allowing 4 CPUs
(XEN) Generic Timer IRQ: phys=30 hyp=26 virt=27 Freq: 50000 KHz
(XEN) GICv2 initialization:
(XEN)         gic_dist_addr=00000000f9010000
(XEN)         gic_cpu_addr=00000000f9020000
(XEN)         gic_hyp_addr=00000000f9040000
(XEN)         gic_vcpu_addr=00000000f9060000
(XEN)         gic_maintenance_irq=25
(XEN) GICv2: Adjusting CPU interface base to 0xf902f000
(XEN) GICv2: 192 lines, 4 cpus (IID 00000000).
(XEN) XSM Framework v1.0.0 initialized
(XEN) Initialising XSM SILO mode
(XEN) WARNING: hypervisor-timer IRQ26 is not level triggered.
(XEN) WARNING: virtual-timer IRQ27 is not level triggered.
(XEN) WARNING: NS-physical-timer IRQ30 is not level triggered.
(XEN) Using scheduler: null Scheduler (null)
(XEN) Initializing null scheduler
(XEN) WARNING: This is experimental software in development.
(XEN) Use at your own risk.
(XEN) Allocated console ring of 32 KiB.
(XEN) CPU0: Guest atomics will try 1 times before pausing the domain
(XEN) Bringing up CPU1
- CPU 00000001 booting -
- Current EL 00000008 -
- Initialize CPU -
- Turning on paging -
- Ready -
(XEN) WARNING: hypervisor-timer IRQ26 is not level triggered.
(XEN) WARNING: virtual-timer IRQ27 is not level triggered.
(XEN) WARNING: NS-physical-timer IRQ30 is not level triggered.
(XEN) CPU1: Guest atomics will try 1 times before pausing the domain
(XEN) CPU 1 booted.
(XEN) Bringing up CPU2
- CPU 00000002 booting -
- Current EL 00000008 -
- Initialize CPU -
- Turning on paging -
- Ready -
(XEN) WARNING: hypervisor-timer IRQ26 is not level triggered.
(XEN) WARNING: virtual-timer IRQ27 is not level triggered.
(XEN) WARNING: NS-physical-timer IRQ30 is not level triggered.
(XEN) CPU2: Guest atomics will try 1 times before pausing the domain
(XEN) CPU 2 booted.
(XEN) Bringing up CPU3
- CPU 00000003 booting -
- Current EL 00000008 -
- Initialize CPU -
- Turning on paging -
- Ready -
(XEN) WARNING: hypervisor-timer IRQ26 is not level triggered.
(XEN) WARNING: virtual-timer IRQ27 is not level triggered.
(XEN) WARNING: NS-physical-timer IRQ30 is not level triggered.
(XEN) CPU3: Guest atomics will try 1 times before pausing the domain
(XEN) CPU 3 booted.
(XEN) Brought up 4 CPUs
(XEN) P2M: 40-bit IPA with 40-bit PA and 8-bit VMID
(XEN) P2M: 3 levels with order-1 root, VTCR 0x80023558
(XEN) smmu: /amba/smmu@fd800000: probing hardware configuration...
(XEN) smmu: /amba/smmu@fd800000: SMMUv2 with:
(XEN) smmu: /amba/smmu@fd800000:        stage 2 translation
(XEN) smmu: /amba/smmu@fd800000:        stream matching with 48 register groups, mask 0x7fff
(XEN) smmu: /amba/smmu@fd800000:        16 context banks (0 stage-2 only)
(XEN) smmu: /amba/smmu@fd800000:        Stage-2: 40-bit IPA -> 48-bit PA
(XEN) smmu: /amba/smmu@fd800000: registered 26 master devices
/amba@0/smmu0@0xFD800000: Decode error: write to 6c=0
(XEN) I/O virtualisation enabled
(XEN)  - Dom0 mode: Relaxed
(XEN) Interrupt remapping enabled
(XEN) alternatives: Patching with alt table 00000000002bbe60 -> 00000000002bc520
(XEN) sched_null.c:458: Not inserting d0v0 (not online!)
(XEN) *** LOADING DOMAIN 0 ***
(XEN) Loading d0 kernel from boot module @ 0000000001000000
(XEN) Loading ramdisk from boot module @ 0000000002400000
(XEN) Allocating 1:1 mappings totalling 700MB for dom0:
(XEN) BANK[0] 0x00000020000000-0x00000040000000 (512MB)
(XEN) BANK[1] 0x00000070000000-0x00000078000000 (128MB)
(XEN) BANK[2] 0x0000007c000000-0x0000007fc00000 (60MB)
(XEN) Grant table range: 0x00000000e00000-0x00000000e40000
(XEN) smmu: /amba/smmu@fd800000: d0: p2maddr 0x000000087ffa2000
(XEN) Allocating PPI 16 for event channel interrupt
(XEN) Loading zImage from 0000000001000000 to 0000000020080000-0000000021372200
(XEN) Loading dom0 initrd from 0000000002400000 to 0x0000000028200000-0x00000000297fa954
(XEN) Loading dom0 DTB to 0x0000000028000000-0x0000000028006d75
(XEN) Initial low memory virq threshold set at 0x4000 pages.
(XEN) Std. Loglevel: All
(XEN) Guest Loglevel: All
(XEN) *** Serial input to DOM0 (type 'CTRL-a' three times to switch input)
(XEN) *** LOADING DOMU cpus=1 memory=40000KB ***
(XEN) sched_null.c:458: Not inserting d1v0 (not online!)
(XEN) Loading d1 kernel from boot module @ 0000000004c00000
(XEN) Loading ramdisk from boot module @ 0000000005c00000
(XEN) Allocating mappings totalling 256MB for d1:
(XEN) d1 BANK[0] 0x00000040000000-0x00000050000000 (256MB)
(XEN) d1 BANK[1] 0x00000200000000-0x00000200000000 (0MB)
(XEN) Loading zImage from 0000000004c00000 to 0000000040080000-000000004105fa00
(XEN) Loading dom0 initrd from 0000000005c00000 to 0x0000000048200000-0x0000000048383400
(XEN) Loading dom0 DTB to 0x0000000048000000-0x00000000480004bd
(XEN) *** LOADING DOMU cpus=1 memory=40000KB ***
(XEN) sched_null.c:458: Not inserting d2v0 (not online!)
(XEN) Loading d2 kernel from boot module @ 0000000003a00000
(XEN) Loading ramdisk from boot module @ 0000000004a00000
(XEN) Allocating mappings totalling 256MB for d2:
(XEN) d2 BANK[0] 0x00000040000000-0x00000050000000 (256MB)
(XEN) d2 BANK[1] 0x00000200000000-0x00000200000000 (0MB)
(XEN) Loading zImage from 0000000003a00000 to 0000000040080000-000000004105fa00
(XEN) Loading dom0 initrd from 0000000004a00000 to 0x0000000048200000-0x0000000048383400
(XEN) Loading dom0 DTB to 0x0000000048000000-0x00000000480004bd



* Re: [Xen-devel] dom0less + sched=null => broken in staging
  2019-08-09 18:30   ` Stefano Stabellini
@ 2019-08-13 15:27     ` Dario Faggioli
  2019-08-13 16:52       ` Julien Grall
  2019-08-13 21:14       ` Stefano Stabellini
  0 siblings, 2 replies; 26+ messages in thread
From: Dario Faggioli @ 2019-08-13 15:27 UTC (permalink / raw)
  To: sstabellini; +Cc: George.Dunlap, xen-devel



On Fri, 2019-08-09 at 11:30 -0700, Stefano Stabellini wrote:
> On Fri, 9 Aug 2019, Dario Faggioli wrote:
> > Can you help me with this, e.g., by providing some more info and,
> > if
> > possible, logs?
> 
> I am attaching the logs. 
>
Thanks!

> Interestingly, I get a bunch of:
> 
> (XEN) *** LOADING DOMU cpus=1 memory=40000KB ***
> (XEN) sched_null.c:458: Not inserting d2v0 (not online!)
> 
> Maybe we are missing a call to online the vcpus somewhere in
> xen/arch/arm/domain_build.c:construct_domain?
> 
Actually, those lines are normal, because vCPUs are created offline.
(see the set_bit(_VPF_down) in vcpu_create()).

The problem is why aren't they coming up. Basically, you're missing a
call to vcpu_wake().

In my (x86 and "dom0full") testbox, this seems to come from
domain_unpause_by_systemcontroller(dom0) called by
xen/arch/x86/setup.c:init_done(), at the very end of __start_xen().

I don't know if domain construction in an ARM dom0less system works
similarly, though. What we want, is someone calling either vcpu_wake()
or vcpu_unpause(), after having cleared _VPF_down from pause_flags.
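
To spell the expected flow out (just a sketch pieced together from the
snippets in this thread, not the literal call chain):

  v = vcpu_create(d, 0, cpu);             /* vCPU starts with _VPF_down set */

  /* ... domain construction ... */
  v->is_initialised = 1;
  clear_bit(_VPF_down, &v->pause_flags);  /* vCPU is now considered online */

  domain_unpause_by_systemcontroller(d);  /* drops the pause count; should
                                           * end up in vcpu_wake(v), which
                                           * lets the null scheduler assign
                                           * v to a pCPU */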

I am attaching an updated debug patch, with an additional printk when
we reach the point, within the null scheduler, when the vcpu would wake
up (to check whether the problem is that we never reach that point, or
something else).

Regards
-- 
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
-------------------------------------------------------------------
<<This happens because _I_ choose it to happen!>> (Raistlin Majere)


[-- Attachment #1.1.2: xen-sched-null-vcpu-onoff-debug-v2.patc --]
[-- Type: text/x-patch, Size: 2001 bytes --]

diff --git a/xen/common/sched_null.c b/xen/common/sched_null.c
index 26c6f0f129..e78afadf5c 100644
--- a/xen/common/sched_null.c
+++ b/xen/common/sched_null.c
@@ -455,6 +455,7 @@ static void null_vcpu_insert(const struct scheduler *ops, struct vcpu *v)
 
     if ( unlikely(!is_vcpu_online(v)) )
     {
+        dprintk(XENLOG_G_INFO, "Not inserting %pv (not online!)\n", v);
         vcpu_schedule_unlock_irq(lock, v);
         return;
     }
@@ -516,6 +517,7 @@ static void null_vcpu_remove(const struct scheduler *ops, struct vcpu *v)
     /* If offline, the vcpu shouldn't be assigned, nor in the waitqueue */
     if ( unlikely(!is_vcpu_online(v)) )
     {
+        dprintk(XENLOG_G_INFO, "Not removing %pv (wasn't online!)\n", v);
         ASSERT(per_cpu(npc, v->processor).vcpu != v);
         ASSERT(list_empty(&nvc->waitq_elem));
         goto out;
@@ -571,6 +573,7 @@ static void null_vcpu_wake(const struct scheduler *ops, struct vcpu *v)
      */
     if ( unlikely(per_cpu(npc, cpu).vcpu != v && list_empty(&nvc->waitq_elem)) )
     {
+        dprintk(XENLOG_G_INFO, "%pv is waking up after having been offline\n", v);
         spin_lock(&prv->waitq_lock);
         list_add_tail(&nvc->waitq_elem, &prv->waitq);
         spin_unlock(&prv->waitq_lock);
@@ -635,6 +638,8 @@ static void null_vcpu_sleep(const struct scheduler *ops, struct vcpu *v)
         }
         else if ( per_cpu(npc, cpu).vcpu == v )
             tickled = vcpu_deassign(prv, v);
+
+        dprintk(XENLOG_G_INFO, "%pv is, apparently, going offline (tickled=%d)\n", v, tickled);
     }
 
     /* If v is not assigned to a pCPU, or is not running, no need to bother */
@@ -697,6 +702,8 @@ static void null_vcpu_migrate(const struct scheduler *ops, struct vcpu *v,
      */
     if ( unlikely(!is_vcpu_online(v)) )
     {
+        dprintk(XENLOG_G_INFO, "%pv is, apparently, going offline\n", v);
+
         spin_lock(&prv->waitq_lock);
         list_del_init(&nvc->waitq_elem);
         spin_unlock(&prv->waitq_lock);


* Re: [Xen-devel] dom0less + sched=null => broken in staging
  2019-08-13 15:27     ` Dario Faggioli
@ 2019-08-13 16:52       ` Julien Grall
  2019-08-13 17:34         ` Dario Faggioli
  2019-08-13 21:14       ` Stefano Stabellini
  1 sibling, 1 reply; 26+ messages in thread
From: Julien Grall @ 2019-08-13 16:52 UTC (permalink / raw)
  To: Dario Faggioli, sstabellini; +Cc: George.Dunlap, xen-devel

Hi Dario,

On 8/13/19 4:27 PM, Dario Faggioli wrote:
> On Fri, 2019-08-09 at 11:30 -0700, Stefano Stabellini wrote:
>> On Fri, 9 Aug 2019, Dario Faggioli wrote:
>>> Can you help me with this, e.g., by providing some more info and,
>>> if
>>> possible, logs?
>>
>> I am attaching the logs.
>>
> Thanks!
> 
>> Interestingly, I get a bunch of:
>>
>> (XEN) *** LOADING DOMU cpus=1 memory=40000KB ***
>> (XEN) sched_null.c:458: Not inserting d2v0 (not online!)
>>
>> Maybe we are missing a call to online the vcpus somewhere in
>> xen/arch/arm/domain_build.c:construct_domain?
>>
> Actually, those lines are normal, because vCPUs are created offline.
> (see the set_bit(_VPF_down) in vcpu_create()).
> 
> The problem is why aren't they coming up. Basically, you're missing a
> call to vcpu_wake().
> 
> In my (x86 and "dom0full") testbox, this seems to come from
> domain_unpause_by_systemcontroller(dom0) called by
> xen/arch/x86/setup.c:init_done(), at the very end of __start_xen().
> 
> I don't know if domain construction in an ARM dom0less system works
> similarly, though. What we want, is someone calling either vcpu_wake()
> or vcpu_unpause(), after having cleared _VPF_down from pause_flags.

Looking at create_domUs(), there is a call to
domain_unpause_by_controller for each domU.

Cheers,

-- 
Julien Grall


* Re: [Xen-devel] dom0less + sched=null => broken in staging
  2019-08-13 16:52       ` Julien Grall
@ 2019-08-13 17:34         ` Dario Faggioli
  2019-08-13 18:43           ` Julien Grall
  0 siblings, 1 reply; 26+ messages in thread
From: Dario Faggioli @ 2019-08-13 17:34 UTC (permalink / raw)
  To: sstabellini, julien.grall; +Cc: George.Dunlap, xen-devel



On Tue, 2019-08-13 at 17:52 +0100, Julien Grall wrote:
> Hi Dario,
> 
Hello!

> On 8/13/19 4:27 PM, Dario Faggioli wrote:
> > On Fri, 2019-08-09 at 11:30 -0700, Stefano Stabellini wrote:
> > > 
> > In my (x86 and "dom0full") testbox, this seems to come from
> > domain_unpause_by_systemcontroller(dom0) called by
> > xen/arch/x86/setup.c:init_done(), at the very end of __start_xen().
> > 
> > I don't know if domain construction in an ARM dom0less system works
> > similarly, though. What we want, is someone calling either
> > vcpu_wake()
> > or vcpu_unpause(), after having cleared _VPF_down from pause_flags.
> 
> Looking at create_domUs() there is a call to 
> domain_unpause_by_controller for each domUs.
> 
Yes, I saw that. And I've seen the one done on dom0, at the end of
xen/arch/arm/setup.c:start_xen(), as well.

Also, both construct_dom0() (still from start_xen()) and
construct_domU() (called from create_domUs()) call construct_domain(),
which does clear_bit(_VPF_down), setting the domain to online.

So, unless the flag gets cleared again, or something else happens that
makes the vCPU(s) fail the vcpu_runnable() check in
domain_unpause()->vcpu_wake(), I don't see why the wakeup that let the
null scheduler start scheduling the vCPU doesn't happen... as it
instead does on x86 or !dom0less ARM (because, as far as I've
understood, it's only dom0less that doesn't work, is this correct?)

Regards
-- 
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
-------------------------------------------------------------------
<<This happens because _I_ choose it to happen!>> (Raistlin Majere)



* Re: [Xen-devel] dom0less + sched=null => broken in staging
  2019-08-13 17:34         ` Dario Faggioli
@ 2019-08-13 18:43           ` Julien Grall
  2019-08-13 22:26             ` Julien Grall
  2019-08-13 22:34             ` Dario Faggioli
  0 siblings, 2 replies; 26+ messages in thread
From: Julien Grall @ 2019-08-13 18:43 UTC (permalink / raw)
  To: Dario Faggioli, sstabellini; +Cc: George.Dunlap, xen-devel



On 8/13/19 6:34 PM, Dario Faggioli wrote:
> On Tue, 2019-08-13 at 17:52 +0100, Julien Grall wrote:
>> Hi Dario,
>>
> Hello!
> 
>> On 8/13/19 4:27 PM, Dario Faggioli wrote:
>>> On Fri, 2019-08-09 at 11:30 -0700, Stefano Stabellini wrote:
>>>>
>>> In my (x86 and "dom0full") testbox, this seems to come from
>>> domain_unpause_by_systemcontroller(dom0) called by
>>> xen/arch/x86/setup.c:init_done(), at the very end of __start_xen().
>>>
>>> I don't know if domain construction in an ARM dom0less system works
>>> similarly, though. What we want, is someone calling either
>>> vcpu_wake()
>>> or vcpu_unpause(), after having cleared _VPF_down from pause_flags.
>>
>> Looking at create_domUs() there is a call to
>> domain_unpause_by_controller for each domUs.
>>
> Yes, I saw that. And I've seen the one done don dom0, at the end of
> xen/arch/arm/setup.c:start_xen(), as well.
> 
> Also, both construct_dom0() (still from start_xen()) and
> construct_domU() (called from create_domUs()) call construct_domain(),
> which does clear_bit(_VPF_down), setting the domain to online.
> 
> So, unless the flag gets cleared again, or something else happens that
> makes the vCPU(s) fail the vcpu_runnable() check in
> domain_unpause()->vcpu_wake(), I don't see why the wakeup that let the
> null scheduler start scheduling the vCPU doesn't happen... as it
> instead does on x86 or !dom0less ARM (because, as far as I've
> understood, it's only dom0less that doesn't work, it this correct?)

Yes, I quickly tried to use NULL scheduler with just dom0 and it boots.

Interestingly, I can't see the log:

(XEN) Freed 328kB init memory.

This is called as part of init_done before CPU0 goes into the idle loop.

Adding more debug, it is getting stuck when calling 
domain_unpause_by_controller for dom0. Specifically vcpu_wake on dom0v0.

The loop to assign a pCPU in null_vcpu_wake() is turning into an 
infinite loop. Indeed the loop is trying to pick CPU0 for dom0v0 that is 
already used by dom1v0. So the problem is in pick_cpu() or the data used 
by it.

It feels to me this is an affinity problem. Note that I didn't request 
to pin dom0 vCPUs.
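
The loop in question looks roughly like this (simplified; the
sched_null.c hunks quoted elsewhere in this thread show the exact code):

  /* null_vcpu_wake(): retry until v gets a free pCPU within its mask of
   * usable CPUs; this is the loop that never exits for dom0v0. */
  while ( cpumask_intersects(&prv->cpus_free, cpumask_scratch_cpu(cpu)) )
  {
      unsigned int new_cpu = pick_cpu(prv, v);
      ...
  }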

Cheers,

-- 
Julien Grall


* Re: [Xen-devel] dom0less + sched=null => broken in staging
  2019-08-13 15:27     ` Dario Faggioli
  2019-08-13 16:52       ` Julien Grall
@ 2019-08-13 21:14       ` Stefano Stabellini
  2019-08-14  2:04         ` Dario Faggioli
  1 sibling, 1 reply; 26+ messages in thread
From: Stefano Stabellini @ 2019-08-13 21:14 UTC (permalink / raw)
  To: Dario Faggioli; +Cc: George.Dunlap, xen-devel, sstabellini


On Tue, 13 Aug 2019, Dario Faggioli wrote:
> On Fri, 2019-08-09 at 11:30 -0700, Stefano Stabellini wrote:
> > On Fri, 9 Aug 2019, Dario Faggioli wrote:
> > > Can you help me with this, e.g., by providing some more info and,
> > > if
> > > possible, logs?
> > 
> > I am attaching the logs. 
> >
> Thanks!
> 
> > Interestingly, I get a bunch of:
> > 
> > (XEN) *** LOADING DOMU cpus=1 memory=40000KB ***
> > (XEN) sched_null.c:458: Not inserting d2v0 (not online!)
> > 
> > Maybe we are missing a call to online the vcpus somewhere in
> > xen/arch/arm/domain_build.c:construct_domain?
> > 
> Actually, those lines are normal, because vCPUs are created offline.
> (see the set_bit(_VPF_down) in vcpu_create()).
> 
> The problem is why aren't they coming up. Basically, you're missing a
> call to vcpu_wake().
> 
> In my (x86 and "dom0full") testbox, this seems to come from
> domain_unpause_by_systemcontroller(dom0) called by
> xen/arch/x86/setup.c:init_done(), at the very end of __start_xen().
> 
> I don't know if domain construction in an ARM dom0less system works
> similarly, though. What we want, is someone calling either vcpu_wake()
> or vcpu_unpause(), after having cleared _VPF_down from pause_flags.
> 
> I am attaching an updated debug patch, with an additional printk when
> we reach the point, within the null scheduler, when the vcpu would wake
> up (to check whether the problem is that we never reach that point, or
> something else).

See attached.

[-- Attachment #2: Type: text/plain, Size: 6479 bytes --]

(XEN) Xen version 4.13-unstable (sstabellini@) (aarch64-linux-gnu-gcc (Linaro GCC 5.3-2016.05) 5.3.1 20160412) debug=y  Tue Aug 13 14:12:29 PDT 2019
(XEN) Latest ChangeSet: Fri Dec 21 13:44:30 2018 +0000 git:243cc95d48-dirty
(XEN) build-id: 95462325c4240e3913a88e8465cd8a3aaf007b53
(XEN) Processor: 410fd034: "ARM Limited", variant: 0x0, part 0xd03, rev 0x4
(XEN) 64-bit Execution:
(XEN)   Processor Features: 1100000000002222 0000000000000000
(XEN)     Exception Levels: EL3:64+32 EL2:64+32 EL1:64+32 EL0:64+32
(XEN)     Extensions: FloatingPoint AdvancedSIMD
(XEN)   Debug Features: 0000000010305106 0000000000000000
(XEN)   Auxiliary Features: 0000000000000000 0000000000000000
(XEN)   Memory Model Features: 0000000000001122 0000000000000000
(XEN)   ISA Features:  0000000000011120 0000000000000000
(XEN) 32-bit Execution:
(XEN)   Processor Features: 00001231:00011011
(XEN)     Instruction Sets: AArch32 A32 Thumb Thumb-2 ThumbEE Jazelle
(XEN)     Extensions: GenericTimer Security
(XEN)   Debug Features: 03010066
(XEN)   Auxiliary Features: 00000000
(XEN)   Memory Model Features: 10101105 40000000 01260000 02102211
(XEN)  ISA Features: 02101110 13112111 21232042 01112131 00011142 00011121
(XEN) Using SMC Calling Convention v1.1
(XEN) Using PSCI v1.1
(XEN) SMP: Allowing 4 CPUs
(XEN) Generic Timer IRQ: phys=30 hyp=26 virt=27 Freq: 50000 KHz
(XEN) GICv2 initialization:
(XEN)         gic_dist_addr=00000000f9010000
(XEN)         gic_cpu_addr=00000000f9020000
(XEN)         gic_hyp_addr=00000000f9040000
(XEN)         gic_vcpu_addr=00000000f9060000
(XEN)         gic_maintenance_irq=25
(XEN) GICv2: Adjusting CPU interface base to 0xf902f000
(XEN) GICv2: 192 lines, 4 cpus (IID 00000000).
(XEN) XSM Framework v1.0.0 initialized
(XEN) Initialising XSM SILO mode
(XEN) WARNING: hypervisor-timer IRQ26 is not level triggered.
(XEN) WARNING: virtual-timer IRQ27 is not level triggered.
(XEN) WARNING: NS-physical-timer IRQ30 is not level triggered.
(XEN) Using scheduler: null Scheduler (null)
(XEN) Initializing null scheduler
(XEN) WARNING: This is experimental software in development.
(XEN) Use at your own risk.
(XEN) Allocated console ring of 32 KiB.
(XEN) CPU0: Guest atomics will try 1 times before pausing the domain
(XEN) Bringing up CPU1
(XEN) WARNING: hypervisor-timer IRQ26 is not level triggered.
(XEN) WARNING: virtual-timer IRQ27 is not level triggered.
(XEN) WARNING: NS-physical-timer IRQ30 is not level triggered.
(XEN) CPU1: Guest atomics will try 1 times before pausing the domain
(XEN) Bringing up CPU2
(XEN) CPU 1 booted.
(XEN) WARNING: hypervisor-timer IRQ26 is not level triggered.
(XEN) WARNING: virtual-timer IRQ27 is not level triggered.
(XEN) WARNING: NS-physical-timer IRQ30 is not level triggered.
(XEN) CPU2: Guest atomics will try 1 times before pausing the domain
(XEN) CPU 2 booted.
(XEN) Bringing up CPU3
(XEN) WARNING: hypervisor-timer IRQ26 is not level triggered.
(XEN) WARNING: virtual-timer IRQ27 is not level triggered.
(XEN) WARNING: NS-physical-timer IRQ30 is not level triggered.
(XEN) CPU3: Guest atomics will try 1 times before pausing the domain
(XEN) CPU 3 booted.
(XEN) Brought up 4 CPUs
(XEN) P2M: 40-bit IPA with 40-bit PA and 8-bit VMID
(XEN) P2M: 3 levels with order-1 root, VTCR 0x80023558
(XEN) smmu: /amba/smmu@fd800000: probing hardware configuration...
(XEN) smmu: /amba/smmu@fd800000: SMMUv2 with:
(XEN) smmu: /amba/smmu@fd800000:        stage 2 translation
(XEN) smmu: /amba/smmu@fd800000:        stream matching with 48 register groups, mask 0x7fff
(XEN) smmu: /amba/smmu@fd800000:        16 context banks (0 stage-2 only)
(XEN) smmu: /amba/smmu@fd800000:        Stage-2: 40-bit IPA -> 48-bit PA
(XEN) smmu: /amba/smmu@fd800000: registered 26 master devices
/amba@0/smmu0@0xFD800000: Decode error: write to 6c=0
(XEN) I/O virtualisation enabled
(XEN)  - Dom0 mode: Relaxed
(XEN) Interrupt remapping enabled
(XEN) alternatives: Patching with alt table 00000000002bbe68 -> 00000000002bc528
(XEN) sched_null.c:458: Not inserting d0v0 (not online!)
(XEN) *** LOADING DOMAIN 0 ***
(XEN) Loading d0 kernel from boot module @ 0000000001000000
(XEN) Loading ramdisk from boot module @ 0000000002400000
(XEN) Allocating 1:1 mappings totalling 700MB for dom0:
(XEN) BANK[0] 0x00000020000000-0x00000040000000 (512MB)
(XEN) BANK[1] 0x00000070000000-0x00000078000000 (128MB)
(XEN) BANK[2] 0x0000007c000000-0x0000007fc00000 (60MB)
(XEN) Grant table range: 0x00000000e00000-0x00000000e40000
(XEN) smmu: /amba/smmu@fd800000: d0: p2maddr 0x000000087ffa2000
(XEN) Allocating PPI 16 for event channel interrupt
(XEN) Loading zImage from 0000000001000000 to 0000000020080000-0000000021372200
(XEN) Loading dom0 initrd from 0000000002400000 to 0x0000000028200000-0x00000000297fa954
(XEN) Loading dom0 DTB to 0x0000000028000000-0x0000000028006d75
(XEN) Initial low memory virq threshold set at 0x4000 pages.
(XEN) Std. Loglevel: All
(XEN) Guest Loglevel: All
(XEN) *** Serial input to DOM0 (type 'CTRL-a' three times to switch input)
(XEN) *** LOADING DOMU cpus=1 memory=40000KB ***
(XEN) sched_null.c:458: Not inserting d1v0 (not online!)
(XEN) Loading d1 kernel from boot module @ 0000000004c00000
(XEN) Loading ramdisk from boot module @ 0000000005c00000
(XEN) Allocating mappings totalling 256MB for d1:
(XEN) d1 BANK[0] 0x00000040000000-0x00000050000000 (256MB)
(XEN) d1 BANK[1] 0x00000200000000-0x00000200000000 (0MB)
(XEN) Loading zImage from 0000000004c00000 to 0000000040080000-000000004105fa00
(XEN) Loading dom0 initrd from 0000000005c00000 to 0x0000000048200000-0x0000000048383400
(XEN) Loading dom0 DTB to 0x0000000048000000-0x00000000480004bd
(XEN) sched_null.c:576: d1v0 is waking up after having been offline
(XEN) *** LOADING DOMU cpus=1 memory=40000KB ***
(XEN) sched_null.c:458: Not inserting d2v0 (not online!)
(XEN) Loading d2 kernel from boot module @ 0000000003a00000
(XEN) Loading ramdisk from boot module @ 0000000004a00000
(XEN) Allocating mappings totalling 256MB for d2:
(XEN) d2 BANK[0] 0x00000040000000-0x00000050000000 (256MB)
(XEN) d2 BANK[1] 0x00000200000000-0x00000200000000 (0MB)
(XEN) Loading zImage from 0000000003a00000 to 0000000040080000-000000004105fa00
(XEN) Loading dom0 initrd from 0000000004a00000 to 0x0000000048200000-0x0000000048383400
(XEN) Loading dom0 DTB to 0x0000000048000000-0x00000000480004bd
(XEN) sched_null.c:576: d2v0 is waking up after having been offline



* Re: [Xen-devel] dom0less + sched=null => broken in staging
  2019-08-13 18:43           ` Julien Grall
@ 2019-08-13 22:26             ` Julien Grall
  2019-08-13 22:34             ` Dario Faggioli
  1 sibling, 0 replies; 26+ messages in thread
From: Julien Grall @ 2019-08-13 22:26 UTC (permalink / raw)
  To: Dario Faggioli, sstabellini; +Cc: George.Dunlap, xen-devel

Hi,

On 8/13/19 7:43 PM, Julien Grall wrote:
> 
> 
> On 8/13/19 6:34 PM, Dario Faggioli wrote:
>> On Tue, 2019-08-13 at 17:52 +0100, Julien Grall wrote:
>>> Hi Dario,
>>>
>> Hello!
>>
>>> On 8/13/19 4:27 PM, Dario Faggioli wrote:
>>>> On Fri, 2019-08-09 at 11:30 -0700, Stefano Stabellini wrote:
>>>>>
>>>> In my (x86 and "dom0full") testbox, this seems to come from
>>>> domain_unpause_by_systemcontroller(dom0) called by
>>>> xen/arch/x86/setup.c:init_done(), at the very end of __start_xen().
>>>>
>>>> I don't know if domain construction in an ARM dom0less system works
>>>> similarly, though. What we want, is someone calling either
>>>> vcpu_wake()
>>>> or vcpu_unpause(), after having cleared _VPF_down from pause_flags.
>>>
>>> Looking at create_domUs() there is a call to
>>> domain_unpause_by_controller for each domUs.
>>>
>> Yes, I saw that. And I've seen the one done don dom0, at the end of
>> xen/arch/arm/setup.c:start_xen(), as well.
>>
>> Also, both construct_dom0() (still from start_xen()) and
>> construct_domU() (called from create_domUs()) call construct_domain(),
>> which does clear_bit(_VPF_down), setting the domain to online.
>>
>> So, unless the flag gets cleared again, or something else happens that
>> makes the vCPU(s) fail the vcpu_runnable() check in
>> domain_unpause()->vcpu_wake(), I don't see why the wakeup that let the
>> null scheduler start scheduling the vCPU doesn't happen... as it
>> instead does on x86 or !dom0less ARM (because, as far as I've
>> understood, it's only dom0less that doesn't work, it this correct?)
> 
> Yes, I quickly tried to use NULL scheduler with just dom0 and it boots.
> 
> Interestingly, I can't see the log:
> 
> (XEN) Freed 328kB init memory.
> 
> This is called as part of init_done before CPU0 goes into the idle loop.
> 
> Adding more debug, it is getting stuck when calling 
> domain_unpause_by_controller for dom0. Specifically vcpu_wake on dom0v0.
> 
> The loop to assign a pCPU in null_vcpu_wake() is turning into an 
> infinite loop. Indeed the loop is trying to pick CPU0 for dom0v0 that is 
> already used by dom1v0. So the problem is in pick_cpu() or the data used 
> by it.
> 
> It feels to me this is an affinity problem. Note that I didn't request 
> to pin dom0 vCPUs.

I did a bit more digging. As I pointed out before, pick_cpu() is
returning pCPU0. This is because per_cpu(npc, 0) == NULL.

per_cpu(npc, 0) will be set by vcpu_assign(). AFAIU, the function
is called during scheduling. As CPU0 is not able to serve softirqs until it
finishes initializing, per_cpu(npc, 0) will still be NULL when trying to
wake dom0v0.
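
For reference, the per-pCPU bookkeeping involved is roughly the
following (I am sketching the declaration from memory, so field and type
names may differ from the real ones):

  struct null_pcpu {
      struct vcpu *vcpu;   /* NULL => no vCPU assigned to this pCPU yet */
  };
  static DEFINE_PER_CPU(struct null_pcpu, npc);

  /* vcpu_assign() sets per_cpu(npc, cpu).vcpu = v, but only from the
   * scheduling path (softirq context).  CPU0 cannot process softirqs
   * before start_xen() completes, so per_cpu(npc, 0).vcpu is still NULL
   * when dom0v0 is being woken. */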

My knowledge of the scheduler is pretty limited, so I will leave it to
Dario and George to suggest a fix :).

On a side note, I have tried to hack the dom0 vCPU allocation a bit
to see if I could help you reproduce the issue on x86. But I stumbled
across another error while bringing up d0v1:

(XEN) Assertion 'lock == per_cpu(schedule_data, v->processor).schedule_lock' failed at /home/julieng/works/xen/xen/include/xen/sched-if.h:108
(XEN) ----[ Xen-4.13-unstable  arm64  debug=y   Not tainted ]----
(XEN) CPU:    0

[...]

(XEN) Xen call trace:
(XEN)    [<00000000002251b8>] vcpu_wake+0x550/0x554 (PC)
(XEN)    [<0000000000224da4>] vcpu_wake+0x13c/0x554 (LR)
(XEN)    [<0000000000261624>] vpsci.c#do_common_cpu_on+0x134/0x1c4
(XEN)    [<0000000000261a04>] do_vpsci_0_2_call+0x294/0x3d0
(XEN)    [<00000000002612c0>] vsmc.c#vsmccc_handle_call+0x3a0/0x4b0
(XEN)    [<0000000000261484>] do_trap_hvc_smccc+0x28/0x4c
(XEN)    [<0000000000257efc>] do_trap_guest_sync+0x508/0x5d8
(XEN)    [<000000000026542c>] entry.o#guest_sync_slowpath+0x9c/0xcc
(XEN) 
(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) Assertion 'lock == per_cpu(schedule_data, v->processor).schedule_lock' failed at /home/julieng/works/xen/xen/include/xen/sched-***************************************

I only tried to create all the vCPUs on pCPU 0 with the following code:

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index 4c8404155a..ce92e3841f 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -2004,7 +2004,7 @@ static int __init construct_domain(struct domain *d, struct kernel_info *kinfo)
     for ( i = 1, cpu = 0; i < d->max_vcpus; i++ )
     {
         cpu = cpumask_cycle(cpu, &cpu_online_map);
-        if ( vcpu_create(d, i, cpu) == NULL )
+        if ( vcpu_create(d, i, 0) == NULL )
         {
             printk("Failed to allocate dom0 vcpu %d on pcpu %d\n", i, cpu);
             break;

I am not entirely sure whether the problem is related.

Anyway, I have written the following patch to reproduce the issue on Arm
without dom0less:

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index 4c8404155a..20246ae475 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -2004,7 +2004,7 @@ static int __init construct_domain(struct domain *d, struct kernel_info *kinfo)
     for ( i = 1, cpu = 0; i < d->max_vcpus; i++ )
     {
         cpu = cpumask_cycle(cpu, &cpu_online_map);
-        if ( vcpu_create(d, i, cpu) == NULL )
+        if ( vcpu_create(d, i, 0) == NULL )
         {
             printk("Failed to allocate dom0 vcpu %d on pcpu %d\n", i, cpu);
             break;
@@ -2019,6 +2019,10 @@ static int __init construct_domain(struct domain *d, struct kernel_info *kinfo)
     v->is_initialised = 1;
     clear_bit(_VPF_down, &v->pause_flags);
 
+    v = d->vcpu[1];
+    v->is_initialised = 1;
+    clear_bit(_VPF_down, &v->pause_flags);
+
     return 0;
 }
 
This could easily be adapted for x86 so you can reproduce it there :).

Cheers,

-- 
Julien Grall


* Re: [Xen-devel] dom0less + sched=null => broken in staging
  2019-08-13 18:43           ` Julien Grall
  2019-08-13 22:26             ` Julien Grall
@ 2019-08-13 22:34             ` Dario Faggioli
  2019-08-13 23:07               ` Julien Grall
  1 sibling, 1 reply; 26+ messages in thread
From: Dario Faggioli @ 2019-08-13 22:34 UTC (permalink / raw)
  To: sstabellini, julien.grall; +Cc: George.Dunlap, xen-devel



On Tue, 2019-08-13 at 19:43 +0100, Julien Grall wrote:
> On 8/13/19 6:34 PM, Dario Faggioli wrote:
> > On Tue, 2019-08-13 at 17:52 +0100, Julien Grall wrote:
> > > 
> > So, unless the flag gets cleared again, or something else happens
> > that
> > makes the vCPU(s) fail the vcpu_runnable() check in
> > domain_unpause()->vcpu_wake(), I don't see why the wakeup that let
> > the
> > null scheduler start scheduling the vCPU doesn't happen... as it
> > instead does on x86 or !dom0less ARM (because, as far as I've
> > understood, it's only dom0less that doesn't work, it this correct?)
> 
> Yes, I quickly tried to use NULL scheduler with just dom0 and it
> boots.
> 
Ok.

> Interestingly, I can't see the log:
> 
> (XEN) Freed 328kB init memory.
> 
> This is called as part of init_done before CPU0 goes into the idle
> loop.
> 
> Adding more debug, it is getting stuck when calling 
> domain_unpause_by_controller for dom0. Specifically vcpu_wake on
> dom0v0.
> 
Wait... Is this also with just dom0, or when trying dom0less with some
domUs?

> The loop to assign a pCPU in null_vcpu_wake() is turning into an 
> infinite loop. Indeed the loop is trying to pick CPU0 for dom0v0 that
> is 
> already used by dom1v0. So the problem is in pick_cpu() or the data
> used 
> by it.
> 
Ah, interesting...

> It feels to me this is an affinity problem. Note that I didn't
> request 
> to pin dom0 vCPUs.
> 
Yep, looking better, I think I've seen something suspicious now. I'll
send another debug patch.

Regards
-- 
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
-------------------------------------------------------------------
<<This happens because _I_ choose it to happen!>> (Raistlin Majere)



* Re: [Xen-devel] dom0less + sched=null => broken in staging
  2019-08-13 22:34             ` Dario Faggioli
@ 2019-08-13 23:07               ` Julien Grall
  0 siblings, 0 replies; 26+ messages in thread
From: Julien Grall @ 2019-08-13 23:07 UTC (permalink / raw)
  To: Dario Faggioli; +Cc: George.Dunlap, xen-devel, Julien Grall, Stefano Stabellini



On Tue, 13 Aug 2019, 23:39 Dario Faggioli, <dfaggioli@suse.com> wrote:

> On Tue, 2019-08-13 at 19:43 +0100, Julien Grall wrote:
> > On 8/13/19 6:34 PM, Dario Faggioli wrote:
> > > On Tue, 2019-08-13 at 17:52 +0100, Julien Grall wrote:
> > > >
> > > So, unless the flag gets cleared again, or something else happens
> > > that
> > > makes the vCPU(s) fail the vcpu_runnable() check in
> > > domain_unpause()->vcpu_wake(), I don't see why the wakeup that let
> > > the
> > > null scheduler start scheduling the vCPU doesn't happen... as it
> > > instead does on x86 or !dom0less ARM (because, as far as I've
> > > understood, it's only dom0less that doesn't work, it this correct?)
> >
> > Yes, I quickly tried to use NULL scheduler with just dom0 and it
> > boots.
> >
> Ok.
>
> > Interestingly, I can't see the log:
> >
> > (XEN) Freed 328kB init memory.
> >
> > This is called as part of init_done before CPU0 goes into the idle
> > loop.
> >
> > Adding more debug, it is getting stuck when calling
> > domain_unpause_by_controller for dom0. Specifically vcpu_wake on
> > dom0v0.
> >
> Wait... Is this also with just dom0, or when trying dom0less with some
> domUs?
>

Dom0 is unpaused after all the domUs. In other words, the scheduler will
see domUs first.
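
Roughly (a sketch of the ordering as I understand it, not the literal
code):

  /* dom0less boot on Arm, heavily simplified: */
  create_domUs();                /* each domU is unpaused here, so d1v0
                                  * and d2v0 reach the scheduler first */
  ...
  domain_unpause_by_systemcontroller(dom0);   /* dom0v0 is woken last,
                                               * at the end of start_xen() */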



> > The loop to assign a pCPU in null_vcpu_wake() is turning into an
> > infinite loop. Indeed the loop is trying to pick CPU0 for dom0v0 that
> > is
> > already used by dom1v0. So the problem is in pick_cpu() or the data
> > used
> > by it.
> >
> Ah, interesting...
>
> > It feels to me this is an affinity problem. Note that I didn't
> > request
> > to pin dom0 vCPUs.
> >
> Yep, looking better, I think I've seen something suspicious now. I'll
> send another debug patch.
>

You may want to see my last e-mail first just in case it rings a bell. :) I
did more debugging during the evening.

Cheers,



> Regards
> --
> Dario Faggioli, Ph.D
> http://about.me/dario.faggioli
> Virtualization Software Engineer
> SUSE Labs, SUSE https://www.suse.com/
> -------------------------------------------------------------------
> <<This happens because _I_ choose it to happen!>> (Raistlin Majere)
>


* Re: [Xen-devel] dom0less + sched=null => broken in staging
  2019-08-13 21:14       ` Stefano Stabellini
@ 2019-08-14  2:04         ` Dario Faggioli
  2019-08-14 16:27           ` Stefano Stabellini
  0 siblings, 1 reply; 26+ messages in thread
From: Dario Faggioli @ 2019-08-14  2:04 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: George.Dunlap, xen-devel, Julien Grall



On Tue, 2019-08-13 at 14:14 -0700, Stefano Stabellini wrote:
> On Tue, 13 Aug 2019, Dario Faggioli wrote:
> > 
> > I am attaching an updated debug patch, with an additional printk
> > when
> > we reach the point, within the null scheduler, when the vcpu would
> > wake
> > up (to check whether the problem is that we never reach that point,
> > or
> > something else).
> 
> See attached.
>
Ok, so we're not missing an "online call" nor a wakeup.

As Julien has identified, we seem to be stuck in a loop.

Now, while staring at the code of that loop, I've seen that pick_cpu()
may mess up with the scratch cpumask for the CPU, which I don't think
is a good thing.
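
Concretely, the pattern I am suspicious of is the following (simplified;
the attached patch shows the exact lines it touches):

  /* The wake path stashes the usable CPUs in the per-CPU scratch mask... */
  cpumask_and(cpumask_scratch_cpu(cpu), v->cpu_hard_affinity,
              cpupool_domain_cpumask(v->domain));

  /* ...and then loops on that same mask around pick_cpu().  If pick_cpu()
   * also writes to cpumask_scratch_cpu(cpu), the condition below no longer
   * tests the mask computed above, which could keep us spinning.  The
   * attached patch switches to a local mask instead. */
  while ( cpumask_intersects(&prv->cpus_free, cpumask_scratch_cpu(cpu)) )
  {
      unsigned int new_cpu = pick_cpu(prv, v);
      ...
  }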

So, can you also try this third debug-patch?

Thanks and Regards
-- 
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
-------------------------------------------------------------------
<<This happens because _I_ choose it to happen!>> (Raistlin Majere)


[-- Attachment #1.1.2: xen-sched-null-vcpu-onoff-debug-v3.patch --]
[-- Type: text/x-patch, Size: 2914 bytes --]

diff --git a/xen/common/sched_null.c b/xen/common/sched_null.c
index 26c6f0f129..f90b146209 100644
--- a/xen/common/sched_null.c
+++ b/xen/common/sched_null.c
@@ -455,6 +455,7 @@ static void null_vcpu_insert(const struct scheduler *ops, struct vcpu *v)
 
     if ( unlikely(!is_vcpu_online(v)) )
     {
+        dprintk(XENLOG_G_INFO, "Not inserting %pv (not online!)\n", v);
         vcpu_schedule_unlock_irq(lock, v);
         return;
     }
@@ -516,6 +517,7 @@ static void null_vcpu_remove(const struct scheduler *ops, struct vcpu *v)
     /* If offline, the vcpu shouldn't be assigned, nor in the waitqueue */
     if ( unlikely(!is_vcpu_online(v)) )
     {
+        dprintk(XENLOG_G_INFO, "Not removing %pv (wasn't online!)\n", v);
         ASSERT(per_cpu(npc, v->processor).vcpu != v);
         ASSERT(list_empty(&nvc->waitq_elem));
         goto out;
@@ -571,14 +573,17 @@ static void null_vcpu_wake(const struct scheduler *ops, struct vcpu *v)
      */
     if ( unlikely(per_cpu(npc, cpu).vcpu != v && list_empty(&nvc->waitq_elem)) )
     {
+        cpumask_t mask;
+
+        dprintk(XENLOG_G_INFO, "%pv is waking up after having been offline\n", v);
         spin_lock(&prv->waitq_lock);
         list_add_tail(&nvc->waitq_elem, &prv->waitq);
         spin_unlock(&prv->waitq_lock);
 
-        cpumask_and(cpumask_scratch_cpu(cpu), v->cpu_hard_affinity,
+        cpumask_and(&mask, v->cpu_hard_affinity,
                     cpupool_domain_cpumask(v->domain));
 
-        if ( !cpumask_intersects(&prv->cpus_free, cpumask_scratch_cpu(cpu)) )
+        if ( !cpumask_intersects(&prv->cpus_free, &mask) )
         {
             dprintk(XENLOG_G_WARNING, "WARNING: d%dv%d not assigned to any CPU!\n",
                     v->domain->domain_id, v->vcpu_id);
@@ -595,7 +600,7 @@ static void null_vcpu_wake(const struct scheduler *ops, struct vcpu *v)
          * - if we're racing already, and if there still are free cpus, try
          *   again.
          */
-        while ( cpumask_intersects(&prv->cpus_free, cpumask_scratch_cpu(cpu)) )
+        while ( cpumask_intersects(&prv->cpus_free, &mask) )
         {
             unsigned int new_cpu = pick_cpu(prv, v);
 
@@ -635,6 +640,8 @@ static void null_vcpu_sleep(const struct scheduler *ops, struct vcpu *v)
         }
         else if ( per_cpu(npc, cpu).vcpu == v )
             tickled = vcpu_deassign(prv, v);
+
+        dprintk(XENLOG_G_INFO, "%pv is, apparently, going offline (tickled=%d)\n", v, tickled);
     }
 
     /* If v is not assigned to a pCPU, or is not running, no need to bother */
@@ -697,6 +704,8 @@ static void null_vcpu_migrate(const struct scheduler *ops, struct vcpu *v,
      */
     if ( unlikely(!is_vcpu_online(v)) )
     {
+        dprintk(XENLOG_G_INFO, "%pv is, apparently, going offline\n", v);
+
         spin_lock(&prv->waitq_lock);
         list_del_init(&nvc->waitq_elem);
         spin_unlock(&prv->waitq_lock);

[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [Xen-devel] dom0less + sched=null => broken in staging
  2019-08-14  2:04         ` Dario Faggioli
@ 2019-08-14 16:27           ` Stefano Stabellini
  2019-08-14 17:35             ` Dario Faggioli
  0 siblings, 1 reply; 26+ messages in thread
From: Stefano Stabellini @ 2019-08-14 16:27 UTC (permalink / raw)
  To: Dario Faggioli; +Cc: George.Dunlap, xen-devel, Julien Grall, Stefano Stabellini

[-- Attachment #1: Type: text/plain, Size: 814 bytes --]

On Wed, 14 Aug 2019, Dario Faggioli wrote:
> On Tue, 2019-08-13 at 14:14 -0700, Stefano Stabellini wrote:
> > On Tue, 13 Aug 2019, Dario Faggioli wrote:
> > > 
> > > I am attaching an updated debug patch, with an additional printk
> > > when
> > > we reach the point, within the null scheduler, when the vcpu would
> > > wake
> > > up (to check whether the problem is that we never reach that point,
> > > or
> > > something else).
> > 
> > See attached.
> >
> Ok, so we're not missing an "online call" nor a wakeup.
> 
> As Julien has identified, we seem to be stuck in a loop.
> 
> Now, while staring at the code of that loop, I've seen that pick_cpu()
> may mess up with the scratch cpumask for the CPU, which I don't think
> is a good thing.
> 
> So, can you also try this third debug-patch?

Yep, see attached

[-- Attachment #2: Type: text/plain, Size: 6479 bytes --]

(XEN) Xen version 4.13-unstable (sstabellini@) (aarch64-linux-gnu-gcc (Linaro GCC 5.3-2016.05) 5.3.1 20160412) debug=y  Wed Aug 14 09:24:00 PDT 2019
(XEN) Latest ChangeSet: Fri Dec 21 13:44:30 2018 +0000 git:243cc95d48-dirty
(XEN) build-id: 992c2d9483465d0f9d8f1e9ec372e1cfc4b90bb3
(XEN) Processor: 410fd034: "ARM Limited", variant: 0x0, part 0xd03, rev 0x4
(XEN) 64-bit Execution:
(XEN)   Processor Features: 1100000000002222 0000000000000000
(XEN)     Exception Levels: EL3:64+32 EL2:64+32 EL1:64+32 EL0:64+32
(XEN)     Extensions: FloatingPoint AdvancedSIMD
(XEN)   Debug Features: 0000000010305106 0000000000000000
(XEN)   Auxiliary Features: 0000000000000000 0000000000000000
(XEN)   Memory Model Features: 0000000000001122 0000000000000000
(XEN)   ISA Features:  0000000000011120 0000000000000000
(XEN) 32-bit Execution:
(XEN)   Processor Features: 00001231:00011011
(XEN)     Instruction Sets: AArch32 A32 Thumb Thumb-2 ThumbEE Jazelle
(XEN)     Extensions: GenericTimer Security
(XEN)   Debug Features: 03010066
(XEN)   Auxiliary Features: 00000000
(XEN)   Memory Model Features: 10101105 40000000 01260000 02102211
(XEN)  ISA Features: 02101110 13112111 21232042 01112131 00011142 00011121
(XEN) Using SMC Calling Convention v1.1
(XEN) Using PSCI v1.1
(XEN) SMP: Allowing 4 CPUs
(XEN) Generic Timer IRQ: phys=30 hyp=26 virt=27 Freq: 50000 KHz
(XEN) GICv2 initialization:
(XEN)         gic_dist_addr=00000000f9010000
(XEN)         gic_cpu_addr=00000000f9020000
(XEN)         gic_hyp_addr=00000000f9040000
(XEN)         gic_vcpu_addr=00000000f9060000
(XEN)         gic_maintenance_irq=25
(XEN) GICv2: Adjusting CPU interface base to 0xf902f000
(XEN) GICv2: 192 lines, 4 cpus (IID 00000000).
(XEN) XSM Framework v1.0.0 initialized
(XEN) Initialising XSM SILO mode
(XEN) WARNING: hypervisor-timer IRQ26 is not level triggered.
(XEN) WARNING: virtual-timer IRQ27 is not level triggered.
(XEN) WARNING: NS-physical-timer IRQ30 is not level triggered.
(XEN) Using scheduler: null Scheduler (null)
(XEN) Initializing null scheduler
(XEN) WARNING: This is experimental software in development.
(XEN) Use at your own risk.
(XEN) Allocated console ring of 32 KiB.
(XEN) CPU0: Guest atomics will try 1 times before pausing the domain
(XEN) Bringing up CPU1
(XEN) WARNING: hypervisor-timer IRQ26 is not level triggered.
(XEN) WARNING: virtual-timer IRQ27 is not level triggered.
(XEN) WARNING: NS-physical-timer IRQ30 is not level triggered.
(XEN) CPU1: Guest atomics will try 1 times before pausing the domain
(XEN) CPU 1 booted.
(XEN) Bringing up CPU2
(XEN) WARNING: hypervisor-timer IRQ26 is not level triggered.
(XEN) WARNING: virtual-timer IRQ27 is not level triggered.
(XEN) WARNING: NS-physical-timer IRQ30 is not level triggered.
(XEN) CPU2: Guest atomics will try 1 times before pausing the domain
(XEN) CPU 2 booted.
(XEN) Bringing up CPU3
(XEN) WARNING: hypervisor-timer IRQ26 is not level triggered.
(XEN) WARNING: virtual-timer IRQ27 is not level triggered.
(XEN) WARNING: NS-physical-timer IRQ30 is not level triggered.
(XEN) CPU3: Guest atomics will try 1 times before pausing the domain
(XEN) CPU 3 booted.
(XEN) Brought up 4 CPUs
(XEN) P2M: 40-bit IPA with 40-bit PA and 8-bit VMID
(XEN) P2M: 3 levels with order-1 root, VTCR 0x80023558
(XEN) smmu: /amba/smmu@fd800000: probing hardware configuration...
(XEN) smmu: /amba/smmu@fd800000: SMMUv2 with:
(XEN) smmu: /amba/smmu@fd800000:        stage 2 translation
(XEN) smmu: /amba/smmu@fd800000:        stream matching with 48 register groups, mask 0x7fff
(XEN) smmu: /amba/smmu@fd800000:        16 context banks (0 stage-2 only)
(XEN) smmu: /amba/smmu@fd800000:        Stage-2: 40-bit IPA -> 48-bit PA
(XEN) smmu: /amba/smmu@fd800000: registered 26 master devices
/amba@0/smmu0@0xFD800000: Decode error: write to 6c=0
(XEN) I/O virtualisation enabled
(XEN)  - Dom0 mode: Relaxed
(XEN) Interrupt remapping enabled
(XEN) alternatives: Patching with alt table 00000000002bbe68 -> 00000000002bc528
(XEN) sched_null.c:458: Not inserting d0v0 (not online!)
(XEN) *** LOADING DOMAIN 0 ***
(XEN) Loading d0 kernel from boot module @ 0000000001000000
(XEN) Loading ramdisk from boot module @ 0000000002400000
(XEN) Allocating 1:1 mappings totalling 700MB for dom0:
(XEN) BANK[0] 0x00000020000000-0x00000040000000 (512MB)
(XEN) BANK[1] 0x00000070000000-0x00000078000000 (128MB)
(XEN) BANK[2] 0x0000007c000000-0x0000007fc00000 (60MB)
(XEN) Grant table range: 0x00000000e00000-0x00000000e40000
(XEN) smmu: /amba/smmu@fd800000: d0: p2maddr 0x000000087ffa2000
(XEN) Allocating PPI 16 for event channel interrupt
(XEN) Loading zImage from 0000000001000000 to 0000000020080000-0000000021372200
(XEN) Loading dom0 initrd from 0000000002400000 to 0x0000000028200000-0x00000000297fa954
(XEN) Loading dom0 DTB to 0x0000000028000000-0x0000000028006d75
(XEN) Initial low memory virq threshold set at 0x4000 pages.
(XEN) Std. Loglevel: All
(XEN) Guest Loglevel: All
(XEN) *** Serial input to DOM0 (type 'CTRL-a' three times to switch input)
(XEN) *** LOADING DOMU cpus=1 memory=40000KB ***
(XEN) sched_null.c:458: Not inserting d1v0 (not online!)
(XEN) Loading d1 kernel from boot module @ 0000000004c00000
(XEN) Loading ramdisk from boot module @ 0000000005c00000
(XEN) Allocating mappings totalling 256MB for d1:
(XEN) d1 BANK[0] 0x00000040000000-0x00000050000000 (256MB)
(XEN) d1 BANK[1] 0x00000200000000-0x00000200000000 (0MB)
(XEN) Loading zImage from 0000000004c00000 to 0000000040080000-000000004105fa00
(XEN) Loading dom0 initrd from 0000000005c00000 to 0x0000000048200000-0x0000000048383400
(XEN) Loading dom0 DTB to 0x0000000048000000-0x00000000480004bd
(XEN) sched_null.c:578: d1v0 is waking up after having been offline
(XEN) *** LOADING DOMU cpus=1 memory=40000KB ***
(XEN) sched_null.c:458: Not inserting d2v0 (not online!)
(XEN) Loading d2 kernel from boot module @ 0000000003a00000
(XEN) Loading ramdisk from boot module @ 0000000004a00000
(XEN) Allocating mappings totalling 256MB for d2:
(XEN) d2 BANK[0] 0x00000040000000-0x00000050000000 (256MB)
(XEN) d2 BANK[1] 0x00000200000000-0x00000200000000 (0MB)
(XEN) Loading zImage from 0000000003a00000 to 0000000040080000-000000004105fa00
(XEN) Loading dom0 initrd from 0000000004a00000 to 0x0000000048200000-0x0000000048383400
(XEN) Loading dom0 DTB to 0x0000000048000000-0x00000000480004bd
(XEN) sched_null.c:578: d2v0 is waking up after having been offline


[-- Attachment #3: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Xen-devel] dom0less + sched=null => broken in staging
  2019-08-14 16:27           ` Stefano Stabellini
@ 2019-08-14 17:35             ` Dario Faggioli
  2019-08-21 10:33               ` Dario Faggioli
  0 siblings, 1 reply; 26+ messages in thread
From: Dario Faggioli @ 2019-08-14 17:35 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: George.Dunlap, xen-devel, Julien Grall


[-- Attachment #1.1: Type: text/plain, Size: 1014 bytes --]

On Wed, 2019-08-14 at 09:27 -0700, Stefano Stabellini wrote:
> On Wed, 14 Aug 2019, Dario Faggioli wrote:
> > On Tue, 2019-08-13 at 14:14 -0700, Stefano Stabellini wrote:
> > > 
> > Now, while staring at the code of that loop, I've seen that
> > pick_cpu()
> > may mess up with the scratch cpumask for the CPU, which I don't
> > think
> > is a good thing.
> > 
> > So, can you also try this third debug-patch?
> 
> Yep, see attached
>
Ok, thanks again. So, cpumask_scratch() being mishandled was part of
the problem, but not the root-cause.

Well, it was worth a shot. :-P

I think we need to get rid of the loop in which we're stuck. I have in
mind a way to do this... I'll craft a patch later, or on Friday.
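
For orientation, the shape that fix eventually takes (condensed from the
xen-sched-null-vcpu-onoff.patch attached to the 2019-08-21 message further
down this thread; this is a simplified excerpt, not the actual patch) is:

    /*
     * Fast path: v->processor is free and affinities match, so assign the
     * vcpu directly (the wake path already holds that CPU's scheduler lock).
     */
    if ( per_cpu(npc, cpu).vcpu == NULL &&
         vcpu_check_affinity(v, cpu, BALANCE_HARD_AFFINITY) )
    {
        vcpu_assign(prv, v, cpu);
        cpu_raise_softirq(cpu, SCHEDULE_SOFTIRQ);
        return;
    }

    /*
     * Otherwise, park the vcpu in the waitqueue and tickle every free CPU it
     * can run on; whichever of them schedules first picks it up, so there is
     * no retry loop left to get stuck in.
     */
    spin_lock(&prv->waitq_lock);
    list_add_tail(&nvc->waitq_elem, &prv->waitq);
    spin_unlock(&prv->waitq_lock);

    cpumask_and(cpumask_scratch_cpu(cpu), v->cpu_hard_affinity,
                cpupool_domain_cpumask(v->domain));
    cpumask_and(cpumask_scratch_cpu(cpu), cpumask_scratch_cpu(cpu),
                &prv->cpus_free);
    cpumask_raise_softirq(cpumask_scratch_cpu(cpu), SCHEDULE_SOFTIRQ);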

Regards
-- 
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
-------------------------------------------------------------------
<<This happens because _I_ choose it to happen!>> (Raistlin Majere)


[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Xen-devel] dom0less + sched=null => broken in staging
  2019-08-14 17:35             ` Dario Faggioli
@ 2019-08-21 10:33               ` Dario Faggioli
  2019-08-24  1:16                 ` Stefano Stabellini
  0 siblings, 1 reply; 26+ messages in thread
From: Dario Faggioli @ 2019-08-21 10:33 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: George.Dunlap, xen-devel, Julien Grall


[-- Attachment #1.1.1: Type: text/plain, Size: 1226 bytes --]

On Wed, 2019-08-14 at 19:35 +0200, Dario Faggioli wrote:
> On Wed, 2019-08-14 at 09:27 -0700, Stefano Stabellini wrote:
> > On Wed, 14 Aug 2019, Dario Faggioli wrote:
> > > On Tue, 2019-08-13 at 14:14 -0700, Stefano Stabellini wrote:
> > > Now, while staring at the code of that loop, I've seen that
> > > pick_cpu()
> > > may mess up with the scratch cpumask for the CPU, which I don't
> > > think
> > > is a good thing.
> > > 
> > > So, can you also try this third debug-patch?
> > 
> > Yep, see attached
> > 
> Ok, thanks again. So, cpumask_scratch() being mishandled was part of
> the problem, but not the root-cause.
> 
> Well, it was worth a shot. :-P
> 
> I think we need to get rid of the loop in which we're stuck. 
>
Hey, Stefano, Julien,

Here's another patch.

Rather than a debug patch, this is rather an actual "proposed
solution".

Can you give it a go? If it works, I'll spin it as a proper patch.

Thanks!
-- 
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
-------------------------------------------------------------------
<<This happens because _I_ choose it to happen!>> (Raistlin Majere)


[-- Attachment #1.1.2: xen-sched-null-vcpu-onoff.patch --]
[-- Type: text/x-patch, Size: 6645 bytes --]

diff --git a/xen/common/sched_null.c b/xen/common/sched_null.c
index 26c6f0f129..4fc6f3a3c5 100644
--- a/xen/common/sched_null.c
+++ b/xen/common/sched_null.c
@@ -565,50 +565,52 @@ static void null_vcpu_wake(const struct scheduler *ops, struct vcpu *v)
     else
         SCHED_STAT_CRANK(vcpu_wake_not_runnable);
 
+    if ( likely(per_cpu(npc, cpu).vcpu == v) )
+    {
+        cpu_raise_softirq(cpu, SCHEDULE_SOFTIRQ);
+        return;
+    }
+
     /*
      * If a vcpu is neither on a pCPU nor in the waitqueue, it means it was
-     * offline, and that it is now coming back being online.
+     * offline, and that it is now coming back being online. If we're lucky,
+     * and v->processor is free (and affinities match), we can just assign
+     * the vcpu to it (we own the proper lock already) and be done.
      */
-    if ( unlikely(per_cpu(npc, cpu).vcpu != v && list_empty(&nvc->waitq_elem)) )
+    if ( per_cpu(npc, cpu).vcpu == NULL &&
+         vcpu_check_affinity(v, cpu, BALANCE_HARD_AFFINITY) )
     {
-        spin_lock(&prv->waitq_lock);
-        list_add_tail(&nvc->waitq_elem, &prv->waitq);
-        spin_unlock(&prv->waitq_lock);
-
-        cpumask_and(cpumask_scratch_cpu(cpu), v->cpu_hard_affinity,
-                    cpupool_domain_cpumask(v->domain));
-
-        if ( !cpumask_intersects(&prv->cpus_free, cpumask_scratch_cpu(cpu)) )
+        if ( !has_soft_affinity(v) ||
+             vcpu_check_affinity(v, cpu, BALANCE_SOFT_AFFINITY) )
         {
-            dprintk(XENLOG_G_WARNING, "WARNING: d%dv%d not assigned to any CPU!\n",
-                    v->domain->domain_id, v->vcpu_id);
+            vcpu_assign(prv, v, cpu);
+            cpu_raise_softirq(cpu, SCHEDULE_SOFTIRQ);
             return;
         }
+    }
 
-        /*
-         * Now we would want to assign the vcpu to cpu, but we can't, because
-         * we don't have the lock. So, let's do the following:
-         * - try to remove cpu from the list of free cpus, to avoid races with
-         *   other onlining, inserting or migrating operations;
-         * - tickle the cpu, which will pickup work from the waitqueue, and
-         *   assign it to itself;
-         * - if we're racing already, and if there still are free cpus, try
-         *   again.
-         */
-        while ( cpumask_intersects(&prv->cpus_free, cpumask_scratch_cpu(cpu)) )
-        {
-            unsigned int new_cpu = pick_cpu(prv, v);
+    /*
+     * If v->processor is not free (or affinities do not match) we need
+     * to assign v to some other CPU, but we can't do it here, as:
+     * - we don't own  the proper lock,
+     * - we can't change v->processor under vcpu_wake()'s feet.
+     * So we add it to the waitqueue, and tickle all the free CPUs (if any)
+     * on which v can run. The first one that schedules will pick it up.
+     */
+    spin_lock(&prv->waitq_lock);
+    list_add_tail(&nvc->waitq_elem, &prv->waitq);
+    spin_unlock(&prv->waitq_lock);
 
-            if ( test_and_clear_bit(new_cpu, &prv->cpus_free) )
-            {
-                cpu_raise_softirq(new_cpu, SCHEDULE_SOFTIRQ);
-                return;
-            }
-        }
-    }
+    cpumask_and(cpumask_scratch_cpu(cpu), v->cpu_hard_affinity,
+                cpupool_domain_cpumask(v->domain));
+    cpumask_and(cpumask_scratch_cpu(cpu), cpumask_scratch_cpu(cpu),
+                &prv->cpus_free);
 
-    /* Note that we get here only for vCPUs assigned to a pCPU */
-    cpu_raise_softirq(v->processor, SCHEDULE_SOFTIRQ);
+    if ( cpumask_empty(cpumask_scratch_cpu(cpu)) )
+        dprintk(XENLOG_G_WARNING, "WARNING: d%dv%d not assigned to any CPU!\n",
+                v->domain->domain_id, v->vcpu_id);
+    else
+        cpumask_raise_softirq(cpumask_scratch_cpu(cpu), SCHEDULE_SOFTIRQ);
 }
 
 static void null_vcpu_sleep(const struct scheduler *ops, struct vcpu *v)
@@ -822,6 +824,8 @@ static struct task_slice null_schedule(const struct scheduler *ops,
      */
     if ( unlikely(ret.task == NULL) )
     {
+        bool vcpu_found;
+
         spin_lock(&prv->waitq_lock);
 
         if ( list_empty(&prv->waitq) )
@@ -834,6 +838,7 @@ static struct task_slice null_schedule(const struct scheduler *ops,
          * it only in cases where a pcpu has no vcpu associated (e.g., as
          * said above, the cpu has just joined a cpupool).
          */
+        vcpu_found = false;
         for_each_affinity_balance_step( bs )
         {
             list_for_each_entry( wvc, &prv->waitq, waitq_elem )
@@ -844,13 +849,44 @@ static struct task_slice null_schedule(const struct scheduler *ops,
 
                 if ( vcpu_check_affinity(wvc->vcpu, cpu, bs) )
                 {
-                    vcpu_assign(prv, wvc->vcpu, cpu);
-                    list_del_init(&wvc->waitq_elem);
-                    ret.task = wvc->vcpu;
-                    goto unlock;
+                    spinlock_t *lock;
+
+                    vcpu_found = true;
+
+                    /*
+                     * If the vcpu in the waitqueue has just come up online,
+                     * we risk racing with vcpu_wake(). To avoid this, sync
+                     * on the spinlock that vcpu_wake() holds, while waking up
+                     * this vcpu (but only with trylock, or we may deadlock).
+                     */
+                    lock = pcpu_schedule_trylock(wvc->vcpu->processor);
+
+                    /*
+                     * We know the vcpu's lock is not this cpu's lock. In
+                     * fact, if it were, since this cpu is free, vcpu_wake()
+                     * would have assigned the vcpu to this cpu directly.
+                     */
+                    ASSERT(lock != per_cpu(schedule_data, cpu).schedule_lock);
+
+                    if ( lock ) {
+                        vcpu_assign(prv, wvc->vcpu, cpu);
+                        list_del_init(&wvc->waitq_elem);
+                        ret.task = wvc->vcpu;
+                        spin_unlock(lock);
+                        goto unlock;
+                    }
                 }
             }
         }
+        /*
+         * If we did find a vcpu with suitable affinity in the waitqueue, but
+         * we could not pick it up (due to lock contention), and hence we are
+         * still free, plan for another try. In fact, we don't want such vcpu
+         * to be stuck in the waitqueue, when there are free cpus where it
+         * could run.
+         */
+        if ( unlikely( vcpu_found && ret.task == NULL && !list_empty(&prv->waitq)) )
+            cpu_raise_softirq(cpu, SCHEDULE_SOFTIRQ);
  unlock:
         spin_unlock(&prv->waitq_lock);
 

[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [Xen-devel] dom0less + sched=null => broken in staging
  2019-08-21 10:33               ` Dario Faggioli
@ 2019-08-24  1:16                 ` Stefano Stabellini
  2019-09-11 13:53                   ` Dario Faggioli
  2019-10-28  5:35                   ` Dario Faggioli
  0 siblings, 2 replies; 26+ messages in thread
From: Stefano Stabellini @ 2019-08-24  1:16 UTC (permalink / raw)
  To: Dario Faggioli; +Cc: George.Dunlap, xen-devel, Julien Grall, Stefano Stabellini

On Wed, 21 Aug 2019, Dario Faggioli wrote:
> On Wed, 2019-08-14 at 19:35 +0200, Dario Faggioli wrote:
> > On Wed, 2019-08-14 at 09:27 -0700, Stefano Stabellini wrote:
> > > On Wed, 14 Aug 2019, Dario Faggioli wrote:
> > > > On Tue, 2019-08-13 at 14:14 -0700, Stefano Stabellini wrote:
> > > > Now, while staring at the code of that loop, I've seen that
> > > > pick_cpu()
> > > > may mess up with the scratch cpumask for the CPU, which I don't
> > > > think
> > > > is a good thing.
> > > > 
> > > > So, can you also try this third debug-patch?
> > > 
> > > Yep, see attached
> > > 
> > Ok, thanks again. So, cpumask_scratch() being mishandled was part of
> > the problem, but not the root-cause.
> > 
> > Well, it was worth a shot. :-P
> > 
> > I think we need to get rid of the loop in which we're stuck. 
> >
> Hey, Stefano, Julien,
> 
> Here's another patch.
> 
> Rather than a debug patch, this is rather an actual "proposed
> solution".
> 
> Can you give it a go? If it works, I'll spin it as a proper patch.

Yes, this seems to solve the problem, thank you!

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Xen-devel] dom0less + sched=null => broken in staging
  2019-08-24  1:16                 ` Stefano Stabellini
@ 2019-09-11 13:53                   ` Dario Faggioli
  2019-09-25 15:19                     ` Julien Grall
  2019-10-28  5:35                   ` Dario Faggioli
  1 sibling, 1 reply; 26+ messages in thread
From: Dario Faggioli @ 2019-09-11 13:53 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: George.Dunlap, xen-devel, Julien Grall


[-- Attachment #1.1: Type: text/plain, Size: 860 bytes --]

On Fri, 2019-08-23 at 18:16 -0700, Stefano Stabellini wrote:
> On Wed, 21 Aug 2019, Dario Faggioli wrote:
> > 
> > Hey, Stefano, Julien,
> > 
> > Here's another patch.
> > 
> > Rather than a debug patch, this is rather an actual "proposed
> > solution".
> > 
> > Can you give it a go? If it works, I'll spin it as a proper patch.
> 
> Yes, this seems to solve the problem, thank you!
> 
Ok, thanks again for testing, and good to know.

I'm still catching up after vacations, and I'm traveling next week. But
I'll submit a proper patch as soon as I find time.

Regards
-- 
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
-------------------------------------------------------------------
<<This happens because _I_ choose it to happen!>> (Raistlin Majere)


[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Xen-devel] dom0less + sched=null => broken in staging
  2019-09-11 13:53                   ` Dario Faggioli
@ 2019-09-25 15:19                     ` Julien Grall
  2019-09-25 15:34                       ` Dario Faggioli
  0 siblings, 1 reply; 26+ messages in thread
From: Julien Grall @ 2019-09-25 15:19 UTC (permalink / raw)
  To: Dario Faggioli, Stefano Stabellini
  Cc: George.Dunlap, xen-devel, Juergen Gross

(+Juergen)

Hi Dario,

On 11/09/2019 14:53, Dario Faggioli wrote:
> On Fri, 2019-08-23 at 18:16 -0700, Stefano Stabellini wrote:
>> On Wed, 21 Aug 2019, Dario Faggioli wrote:
>>>
>>> Hey, Stefano, Julien,
>>>
>>> Here's another patch.
>>>
>>> Rather than a debug patch, this is rather an actual "proposed
>>> solution".
>>>
>>> Can you give it a go? If it works, I'll spin it as a proper patch.
>>
>> Yes, this seems to solve the problem, thank you!
>>
> Ok, thanks again for testing, and good to know.
> 
> I'm still catching up after vacations, and I'm traveling next week. But
> I'll submit a proper patch as soon as I find time.

Just wanted to follow-up on this. Do you have an update for the fix?

I would rather not want to see Xen 4.13 released with this. So I have CCed 
Juergen to mark it as a blocker.

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Xen-devel] dom0less + sched=null => broken in staging
  2019-09-25 15:19                     ` Julien Grall
@ 2019-09-25 15:34                       ` Dario Faggioli
  2019-09-25 15:39                         ` Julien Grall
  0 siblings, 1 reply; 26+ messages in thread
From: Dario Faggioli @ 2019-09-25 15:34 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini; +Cc: George.Dunlap, xen-devel, Juergen Gross


[-- Attachment #1.1: Type: text/plain, Size: 1300 bytes --]

On Wed, 2019-09-25 at 16:19 +0100, Julien Grall wrote:
> (+Juergen)
> 
> Hi Dario,
> 
Hi,

> On 11/09/2019 14:53, Dario Faggioli wrote:
> > On Fri, 2019-08-23 at 18:16 -0700, Stefano Stabellini wrote:
> > Ok, thanks again for testing, and good to know.
> > 
> > I'm still catching up after vacations, and I'm traveling next week.
> > But
> > I'll submit a proper patch as soon as I find time.
> 
> Just wanted to follow-up on this. Do you have an update for the fix?
> 
> I would rather not want to see Xen 4.13 released with this. So I have
> CCed 
> Juergen to mark it as a blocker.
> 
Yep, I spoke with Juergen about this last week (in person). Basically,
since we decided to try to push core-scheduling in, I'm focusing on
that series right now.

In fact, this fix can go in after code-freeze as well, since it's,
well, a fix. :-)

After code freeze, I'll prepare and send the patch (and if core-
scheduling would have gone in, I'll rebase it on top of that, of
course).

Thanks and Regards
-- 
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
-------------------------------------------------------------------
<<This happens because _I_ choose it to happen!>> (Raistlin Majere)


[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Xen-devel] dom0less + sched=null => broken in staging
  2019-09-25 15:34                       ` Dario Faggioli
@ 2019-09-25 15:39                         ` Julien Grall
  2019-09-25 15:41                           ` Jürgen Groß
  0 siblings, 1 reply; 26+ messages in thread
From: Julien Grall @ 2019-09-25 15:39 UTC (permalink / raw)
  To: Dario Faggioli, Stefano Stabellini
  Cc: George.Dunlap, xen-devel, Juergen Gross

Hi,

On 25/09/2019 16:34, Dario Faggioli wrote:
> On Wed, 2019-09-25 at 16:19 +0100, Julien Grall wrote:
>> (+Juergen)
>>
>> Hi Dario,
>>
> Hi,
> 
>> On 11/09/2019 14:53, Dario Faggioli wrote:
>>> On Fri, 2019-08-23 at 18:16 -0700, Stefano Stabellini wrote:
>>> Ok, thanks again for testing, and good to know.
>>>
>>> I'm still catching up after vacations, and I'm traveling next week.
>>> But
>>> I'll submit a proper patch as soon as I find time.
>>
>> Just wanted to follow-up on this. Do you have an update for the fix?
>>
>> I would rather not want to see Xen 4.13 released with this. So I have
>> CCed
>> Juergen to mark it as a blocker.
>>
> Yep, I spoke with Juergen about this last week (in person). Basically,
> since we decided to try to push core-scheduling in, I'm focusing on
> that series right now.
> 
> In fact, this fix can go in after code-freeze as well, since it's,
> well, a fix. :-)
> 
> After code freeze, I'll prepare and send the patch (and if core-
> scheduling would have gone in, I'll rebase it on top of that, of
> course).

Makes sense. I just wanted to make sure this is tracked by Juergen :).

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Xen-devel] dom0less + sched=null => broken in staging
  2019-09-25 15:39                         ` Julien Grall
@ 2019-09-25 15:41                           ` Jürgen Groß
  0 siblings, 0 replies; 26+ messages in thread
From: Jürgen Groß @ 2019-09-25 15:41 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini, Dario Faggioli; +Cc: George.Dunlap, xen-devel

On 25.09.19 17:39, Julien Grall wrote:
> Hi,
> 
> On 25/09/2019 16:34, Dario Faggioli wrote:
>> On Wed, 2019-09-25 at 16:19 +0100, Julien Grall wrote:
>>> (+Juergen)
>>>
>>> Hi Dario,
>>>
>> Hi,
>>
>>> On 11/09/2019 14:53, Dario Faggioli wrote:
>>>> On Fri, 2019-08-23 at 18:16 -0700, Stefano Stabellini wrote:
>>>> Ok, thanks again for testing, and good to know.
>>>>
>>>> I'm still catching up after vacations, and I'm traveling next week.
>>>> But
>>>> I'll submit a proper patch as soon as I find time.
>>>
>>> Just wanted to follow-up on this. Do you have an update for the fix?
>>>
>>> I would rather not want to see Xen 4.13 released with this. So I have
>>> CCed
>>> Juergen to mark it as a blocker.
>>>
>> Yep, I spoke with Juergen about this last week (in person). Basically,
>> since we decided to try to push core-scheduling in, I'm focusing on
>> that series right now.
>>
>> In fact, this fix can go in after code-freeze as well, since it's,
>> well, a fix. :-)
>>
>> After code freeze, I'll prepare and send the patch (and if core-
>> scheduling would have gone in, I'll rebase it on top of that, of
>> course).
> 
> Makes sense. I just wanted to make sure this is tracked by Juergen :).

It is. :-)


Juergen


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Xen-devel] dom0less + sched=null => broken in staging
  2019-08-24  1:16                 ` Stefano Stabellini
  2019-09-11 13:53                   ` Dario Faggioli
@ 2019-10-28  5:35                   ` Dario Faggioli
  2019-10-28 18:40                     ` Stefano Stabellini
  1 sibling, 1 reply; 26+ messages in thread
From: Dario Faggioli @ 2019-10-28  5:35 UTC (permalink / raw)
  To: sstabellini; +Cc: George.Dunlap, xen-devel, julien.grall, jgross


[-- Attachment #1.1.1: Type: text/plain, Size: 1057 bytes --]

On Fri, 2019-08-23 at 18:16 -0700, Stefano Stabellini wrote:
> On Wed, 21 Aug 2019, Dario Faggioli wrote:
> > Hey, Stefano, Julien,
> > 
> > Here's another patch.
> > 
> > Rather than a debug patch, this is rather an actual "proposed
> > solution".
> > 
> > Can you give it a go? If it works, I'll spin it as a proper patch.
> 
> Yes, this seems to solve the problem, thank you!
> 
Hey,

Sorry this is taking a little while. Can any of you please test the
attached, on top of current staging?

In fact, I rebased the patch in my last email on top of that, and I'd
like to know if it still works, even now that core-scheduling is in.

If it does, then a proper changelog is the only thing it'd be missing,
and I'll do it quickly, I promise :-)

Regards,
Dario
-- 
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
-------------------------------------------------------------------
<<This happens because _I_ choose it to happen!>> (Raistlin Majere)


[-- Attachment #1.1.2: xen-sched-null-vcpu-onoff-coresched.patch --]
[-- Type: text/x-patch, Size: 6949 bytes --]

commit 403339e2da498491573b8db539fe0307643264ee
Author: Dario Faggioli <dfaggioli@suse.com>
Date:   Sat Oct 26 00:21:29 2019 +0200

    TBD: Fix for online issue

diff --git a/xen/common/sched_null.c b/xen/common/sched_null.c
index 2525464a7c..af1cf5e37e 100644
--- a/xen/common/sched_null.c
+++ b/xen/common/sched_null.c
@@ -568,50 +568,52 @@ static void null_unit_wake(const struct scheduler *ops,
     else
         SCHED_STAT_CRANK(unit_wake_not_runnable);
 
+    if ( likely(per_cpu(npc, cpu).unit == unit) )
+    {
+        cpu_raise_softirq(cpu, SCHEDULE_SOFTIRQ);
+        return;
+    }
+
     /*
      * If a unit is neither on a pCPU nor in the waitqueue, it means it was
-     * offline, and that it is now coming back being online.
+     * offline, and that it is now coming back being online. If we're lucky,
+     * and its previous resource is free (and affinities match), we can just
+     * assign the unit to it (we own the proper lock already) and be done.
      */
-    if ( unlikely(per_cpu(npc, cpu).unit != unit && list_empty(&nvc->waitq_elem)) )
+    if ( per_cpu(npc, cpu).unit == NULL &&
+         unit_check_affinity(unit, cpu, BALANCE_HARD_AFFINITY) )
     {
-        spin_lock(&prv->waitq_lock);
-        list_add_tail(&nvc->waitq_elem, &prv->waitq);
-        spin_unlock(&prv->waitq_lock);
-
-        cpumask_and(cpumask_scratch_cpu(cpu), unit->cpu_hard_affinity,
-                    cpupool_domain_master_cpumask(unit->domain));
-
-        if ( !cpumask_intersects(&prv->cpus_free, cpumask_scratch_cpu(cpu)) )
+        if ( !has_soft_affinity(unit) ||
+             unit_check_affinity(unit, cpu, BALANCE_SOFT_AFFINITY) )
         {
-            dprintk(XENLOG_G_WARNING, "WARNING: d%dv%d not assigned to any CPU!\n",
-                    unit->domain->domain_id, unit->unit_id);
+            unit_assign(prv, unit, cpu);
+            cpu_raise_softirq(cpu, SCHEDULE_SOFTIRQ);
             return;
         }
+    }
 
-        /*
-         * Now we would want to assign the unit to cpu, but we can't, because
-         * we don't have the lock. So, let's do the following:
-         * - try to remove cpu from the list of free cpus, to avoid races with
-         *   other onlining, inserting or migrating operations;
-         * - tickle the cpu, which will pickup work from the waitqueue, and
-         *   assign it to itself;
-         * - if we're racing already, and if there still are free cpus, try
-         *   again.
-         */
-        while ( cpumask_intersects(&prv->cpus_free, cpumask_scratch_cpu(cpu)) )
-        {
-            unsigned int new_cpu = pick_res(prv, unit)->master_cpu;
+    /*
+     * If the resource is not free (or affinities do not match) we need
+     * to assign unit to some other one, but we can't do it here, as:
+     * - we don't own the proper lock,
+     * - we can't change v->processor under vcpu_wake()'s feet.
+     * So we add it to the waitqueue, and tickle all the free CPUs (if any)
+     * on which unit can run. The first one that schedules will pick it up.
+     */
+    spin_lock(&prv->waitq_lock);
+    list_add_tail(&nvc->waitq_elem, &prv->waitq);
+    spin_unlock(&prv->waitq_lock);
 
-            if ( test_and_clear_bit(new_cpu, &prv->cpus_free) )
-            {
-                cpu_raise_softirq(new_cpu, SCHEDULE_SOFTIRQ);
-                return;
-            }
-        }
-    }
+    cpumask_and(cpumask_scratch_cpu(cpu), unit->cpu_hard_affinity,
+                cpupool_domain_master_cpumask(unit->domain));
+    cpumask_and(cpumask_scratch_cpu(cpu), cpumask_scratch_cpu(cpu),
+                &prv->cpus_free);
 
-    /* Note that we get here only for units assigned to a pCPU */
-    cpu_raise_softirq(sched_unit_master(unit), SCHEDULE_SOFTIRQ);
+    if ( cpumask_empty(cpumask_scratch_cpu(cpu)) )
+        dprintk(XENLOG_G_WARNING, "WARNING: d%dv%d not assigned to any CPU!\n",
+                unit->domain->domain_id, unit->unit_id);
+    else
+        cpumask_raise_softirq(cpumask_scratch_cpu(cpu), SCHEDULE_SOFTIRQ);
 }
 
 static void null_unit_sleep(const struct scheduler *ops,
@@ -827,6 +829,8 @@ static void null_schedule(const struct scheduler *ops, struct sched_unit *prev,
      */
     if ( unlikely(prev->next_task == NULL) )
     {
+        bool unit_found;
+
         spin_lock(&prv->waitq_lock);
 
         if ( list_empty(&prv->waitq) )
@@ -839,6 +843,7 @@ static void null_schedule(const struct scheduler *ops, struct sched_unit *prev,
          * it only in cases where a pcpu has no unit associated (e.g., as
          * said above, the cpu has just joined a cpupool).
          */
+        unit_found = false;
         for_each_affinity_balance_step( bs )
         {
             list_for_each_entry( wvc, &prv->waitq, waitq_elem )
@@ -849,13 +854,45 @@ static void null_schedule(const struct scheduler *ops, struct sched_unit *prev,
 
                 if ( unit_check_affinity(wvc->unit, sched_cpu, bs) )
                 {
-                    unit_assign(prv, wvc->unit, sched_cpu);
-                    list_del_init(&wvc->waitq_elem);
-                    prev->next_task = wvc->unit;
-                    goto unlock;
+                    spinlock_t *lock;
+
+                    unit_found = true;
+
+                    /*
+                     * If the unit in the waitqueue has just come up online,
+                     * we risk racing with vcpu_wake(). To avoid this, sync
+                     * on the spinlock that vcpu_wake() holds, but only with
+                     * trylock, to avoid deadlock.
+                     */
+                    lock = pcpu_schedule_trylock(sched_unit_master(wvc->unit));
+
+                    /*
+                     * We know the vcpu's lock is not this resource's lock. In
+                     * fact, if it were, since this cpu is free, vcpu_wake()
+                     * would have assigned the unit here directly.
+                     */
+                    ASSERT(lock != get_sched_res(sched_cpu)->schedule_lock);
+
+                    if ( lock ) {
+                        unit_assign(prv, wvc->unit, sched_cpu);
+                        list_del_init(&wvc->waitq_elem);
+                        prev->next_task = wvc->unit;
+                        spin_unlock(lock);
+                        goto unlock;
+                    }
                 }
             }
         }
+        /*
+         * If we did find a unit with suitable affinity in the waitqueue, but
+         * we could not pick it up (due to lock contention), and hence we are
+         * still free, plan for another try. In fact, we don't want such unit
+         * to be stuck in the waitqueue, when there are free cpus where it
+         * could run.
+         */
+        if ( unlikely( unit_found && prev->next_task == NULL &&
+                       !list_empty(&prv->waitq)) )
+            cpu_raise_softirq(cur_cpu, SCHEDULE_SOFTIRQ);
  unlock:
         spin_unlock(&prv->waitq_lock);
 

[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [Xen-devel] dom0less + sched=null => broken in staging
  2019-10-28  5:35                   ` Dario Faggioli
@ 2019-10-28 18:40                     ` Stefano Stabellini
  0 siblings, 0 replies; 26+ messages in thread
From: Stefano Stabellini @ 2019-10-28 18:40 UTC (permalink / raw)
  To: Dario Faggioli
  Cc: George.Dunlap, xen-devel, julien.grall, sstabellini, jgross

On Mon, 28 Oct 2019, Dario Faggioli wrote:
> On Fri, 2019-08-23 at 18:16 -0700, Stefano Stabellini wrote:
> > On Wed, 21 Aug 2019, Dario Faggioli wrote:
> > > Hey, Stefano, Julien,
> > > 
> > > Here's another patch.
> > > 
> > > Rather than a debug patch, this is rather an actual "proposed
> > > solution".
> > > 
> > > Can you give it a go? If it works, I'll spin it as a proper patch.
> > 
> > Yes, this seems to solve the problem, thank you!
> > 
> Hey,
> 
> Sorry this is taking a little while. Can any of you please test the
> attached, on top of current staging?
> 
> In fact, I rebased the patch in my last email on top of that, and I'd
> like to know if it still works, even now that core-scheduling is in.
> 
> If it does, then a proper changelog is the only thing it'd be missing,
> and I'll do it quickly, I promise :-)

Tested-by: Stefano Stabellini <sstabellini@kernel.org>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2019-10-28 18:40 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-07 18:22 [Xen-devel] dom0less + sched=null => broken in staging Stefano Stabellini
2019-08-08  8:04 ` George Dunlap
2019-08-08 20:44   ` Stefano Stabellini
2019-08-09  7:40     ` Dario Faggioli
2019-08-09 17:57 ` Dario Faggioli
2019-08-09 18:30   ` Stefano Stabellini
2019-08-13 15:27     ` Dario Faggioli
2019-08-13 16:52       ` Julien Grall
2019-08-13 17:34         ` Dario Faggioli
2019-08-13 18:43           ` Julien Grall
2019-08-13 22:26             ` Julien Grall
2019-08-13 22:34             ` Dario Faggioli
2019-08-13 23:07               ` Julien Grall
2019-08-13 21:14       ` Stefano Stabellini
2019-08-14  2:04         ` Dario Faggioli
2019-08-14 16:27           ` Stefano Stabellini
2019-08-14 17:35             ` Dario Faggioli
2019-08-21 10:33               ` Dario Faggioli
2019-08-24  1:16                 ` Stefano Stabellini
2019-09-11 13:53                   ` Dario Faggioli
2019-09-25 15:19                     ` Julien Grall
2019-09-25 15:34                       ` Dario Faggioli
2019-09-25 15:39                         ` Julien Grall
2019-09-25 15:41                           ` Jürgen Groß
2019-10-28  5:35                   ` Dario Faggioli
2019-10-28 18:40                     ` Stefano Stabellini

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).