* [ANNOUNCE] Xen 4.15 release update - still in feature freeze
@ 2021-03-15 12:18 Ian Jackson
  2021-03-15 13:10 ` Andrew Cooper
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Ian Jackson @ 2021-03-15 12:18 UTC (permalink / raw)
  To: committers, xen-devel
  Cc: Jan Beulich, Andrew Cooper, Frédéric Pierret, Dario Faggioli

Thanks everyone for your hard work so far.  I think things are looking
pretty good, although we have slipped.

Please see below for my updated list of release blockers and tracking
issues.  Please let me know if there is information missing, or if you
have corrections.

There is one issue on my radar that I am concerned about and want to
see sorted out: "io-apic issue on Ryzen 1800X".  If we can't get it
fixed soon we may have to live with it as a release notes issue.

I am probably going to take the scheduler issues off this list because
I haven't seen any sign of activity, and because I don't actually
think there are release critical bugs there.  Please let me know if
you disagree.

As previously announced, we are still in code freeze.  All changes must
have a release-ack.

My current tentative schedule is:

   Tuesday 16th March  RC3 test day

   Wednesday 17th March
       Branch, turn off debug on the 4.15 branch
       xen-next will be open but only for non-disruptive changes

   Monday 22nd March   RC4
   Tuesday 23rd March  RC4 test day

   Week of 29th March **tentative**
       Release (probably Tuesday or Wednesday)

Thanks,
Ian.


OPEN ISSUES AND BLOCKERS
========================

io-apic issue on Ryzen 1800X
Related Qubes issue tracking this:
https://github.com/QubesOS/qubes-issues/issues/6423
Information from
  Jan Beulich <jbeulich@suse.com>
  Andrew Cooper <andrew.cooper3@citrix.com>
  Frédéric Pierret <frederic.pierret@qubes-os.org>


ABI stability checking

   [PATCH for-4.15 00/10] tools: Support to use abi-dumper on libraries
   [PATCH v2 for-4.15] tools/libxl: Work around unintialised variable libxl__domain_get_device_model_uid()
   etc.

This is testing/build work and will enable ABI checking of future
changes to 4.15 after its release.  I don't think it's a blocker but
it would be nice to have.

My most recent impression is that there are still some loose ends
here.



SCHEDULER ISSUES NOT MAKING PROGRESS ?
--------------------------------------

BUG: credit=sched2 machine hang when using DRAKVUF

Information from
  Dario Faggioli <dfaggioli@suse.com>
References
  https://lists.xen.org/archives/html/xen-devel/2020-05/msg01985.html
  https://lists.xenproject.org/archives/html/xen-devel/2020-10/msg01561.html
  https://bugzilla.opensuse.org/show_bug.cgi?id=1179246

Quoting Dario:
| Manifests only with certain combinations of hardware and workload.
| I'm not reproducing, but there are multiple reports of it (see
| above). I'm investigating and trying to come up at least with
| debug patches that one of the reporters should be able and willing to
| test.

Dario is working on this.  Last update 29.1.21 ?


G. Null scheduler and vwfi native problem

Information from
  Dario Faggioli <dfaggioli@suse.com>

References
  https://lists.xenproject.org/archives/html/xen-devel/2021-01/msg01634.html

Quoting Dario:
| RCU issues, but manifests due to scheduler behavior (especially   
| NULL scheduler, especially on ARM).
|
| Patches that should solve the issue for ARM posted already. They
| will need to be slightly adjusted to cover x86 as well.

As of last update from Dario 29.1.21:
waiting for test report from submitter.


H. Ryzen 4000 (Mobile) Softlocks/Micro-stutters

Information from
  Dario Faggioli <dfaggioli@suse.com>

As of last update from Dario 29.1.21:
Discussions currently ongoing about the severity of this issue.


ISSUES BELIEVED NEWLY RESOLVED
==============================

Fallout from MSR handling behavioral change.

I think there are now no outstanding patches to fix/change MSR
behaviour and there is no longer any blocker here ?

Key participants:
  Jan Beulich <jbeulich@suse.com>
  Andrew Cooper <andrew.cooper3@citrix.com>


Use-after-free in the IOMMU code

Information from
  Julien Grall <julien@xen.org>
References
 [PATCH for-4.15 v5 0/3] xen/iommu: Collection of bug fixes for     
 IOMMU teardown
Now committed


"x86/PV: avoid speculation abuse through guest accessors"

Information from
  Jan Beulich <jbeulich@suse.com>

| F. The almost-XSA "x86/PV: avoid speculation abuse through guest
| accessors" - the first 4 patches are needed to address the actual
| issue. The next 3 patches are needed to get the tree into
| consistent state again, identifier-wise. The remaining patches
| can probably wait.

This has been committed.


Problems with xl save / cancel

Information from Jürgen Groß:
  xl daemon won't kill the domain after it has gone through a
  suspend-cancel cycle.

I think this was fixed by
  tools/libs/light: fix xl save -c handling


x86/time: calibration rendezvous adjustments

Information from
  Jan Beulich <jbeulich@suse.com>

Not entirely a regression.  3 out of the 4 patches seem to have been
committed.

Patch 4/4 is not targeted at 4.15, I think.


* Re: [ANNOUNCE] Xen 4.15 release update - still in feature freeze
  2021-03-15 12:18 [ANNOUNCE] Xen 4.15 release update - still in feature freeze Ian Jackson
@ 2021-03-15 13:10 ` Andrew Cooper
  2021-03-15 13:46 ` Jan Beulich
  2021-03-18 18:11 ` Dario Faggioli
  2 siblings, 0 replies; 7+ messages in thread
From: Andrew Cooper @ 2021-03-15 13:10 UTC (permalink / raw)
  To: Ian Jackson, committers, xen-devel
  Cc: Jan Beulich, Frédéric Pierret, Dario Faggioli

On 15/03/2021 12:18, Ian Jackson wrote:
> OPEN ISSUES AND BLOCKERS
> ========================
>
> io-apic issue on Ryzen 1800X
> Related Qubes issue tracking this:
> https://github.com/QubesOS/qubes-issues/issues/6423
> Information from
>   Jan Beulich <jbeulich@suse.com>
>   Andrew Cooper <andrew.cooper3@citrix.com>
>   Frédéric Pierret <frederic.pierret@qubes-os.org>

Debugging ongoing.

> ABI stability checking
>
>    [PATCH for-4.15 00/10] tools: Support to use abi-dumper on libraries
>    [PATCH v2 for-4.15] tools/libxl: Work around unintialised variable libxl__domain_get_device_model_uid()
>    etc.

The libxl thing is already committed (2ff2adc61fcfa0).

> This is testing/build work and will enable ABI checking of future
> changes to 4.15 after its release.  I don't think it's a blocker but
> it would be nice to have.
>
> My most recent impression is that there are still some loose ends
> here.

Plan 1 (committing dumps into the tree) won't work.  Plan 2 (OSSTest and
other systems doing a double checkout) probably does require a tweak or
two in 4.15 to make it easy to start in 4.16.


Also, "xenstore_lib.h and libxenstore API/ABI problems" still has work
to do for 4.15.

> ISSUES BELIEVED NEWLY RESOLVED
> ==============================
>
> Fallout from MSR handling behavioral change.
>
> I think there are now no outstanding patches to fix/change MSR
> behaviour and there is no longer any blocker here ?

Still one known issue remaining, as pointed out in Roger's summary.  I'm
still working on it.

~Andrew



* Re: [ANNOUNCE] Xen 4.15 release update - still in feature freeze
  2021-03-15 12:18 [ANNOUNCE] Xen 4.15 release update - still in feature freeze Ian Jackson
  2021-03-15 13:10 ` Andrew Cooper
@ 2021-03-15 13:46 ` Jan Beulich
  2021-03-16  9:43   ` Roger Pau Monné
  2021-03-18 18:11 ` Dario Faggioli
  2 siblings, 1 reply; 7+ messages in thread
From: Jan Beulich @ 2021-03-15 13:46 UTC (permalink / raw)
  To: Ian Jackson
  Cc: Andrew Cooper, Frédéric Pierret, Dario Faggioli,
	committers, xen-devel

On 15.03.2021 13:18, Ian Jackson wrote:
> ISSUES BELIEVED NEWLY RESOLVED
> ==============================
> 
> Fallout from MSR handling behavioral change.
> 
> I think there are now no outstanding patches to fix/change MSR
> behaviour and there is no longer any blocker here ?

In addition to what Andrew has said, while not a blocker in that
sense I think the excessive verbosity of the logging is also an
issue.

> x86/time: calibration rendezvous adjustments
> 
> Information from
>   Jan Beulich <jbeulich@suse.com>
> 
> Not entirely a regression.  3 out of the 4 patches seem to have been
> committed.
> 
> Patch 4/4 is not targeted at 4.15, I think.

Indeed.

Jan


* Re: [ANNOUNCE] Xen 4.15 release update - still in feature freeze
  2021-03-15 13:46 ` Jan Beulich
@ 2021-03-16  9:43   ` Roger Pau Monné
  2021-03-16 10:12     ` Jan Beulich
  0 siblings, 1 reply; 7+ messages in thread
From: Roger Pau Monné @ 2021-03-16  9:43 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Ian Jackson, Andrew Cooper, Frédéric Pierret,
	Dario Faggioli, committers, xen-devel

On Mon, Mar 15, 2021 at 02:46:07PM +0100, Jan Beulich wrote:
> On 15.03.2021 13:18, Ian Jackson wrote:
> > ISSUES BELIEVED NEWLY RESOLVED
> > ==============================
> > 
> > Fallout from MSR handling behavioral change.
> > 
> > I think there are now no outstanding patches to fix/change MSR
> > behaviour and there is no longer any blocker here ?
> 
> In addition to what Andrew has said, while not a blocker in that
> sense I think the excessive verbosity of the logging is also an
> issue.

I think you meant the logging done for each MSR that's not explicitly
handled?

While I agree it might be too verbose, I don't see how we can change
that right now. We could introduce a command line parameter to select
whether to print those messages or not, but I think that's too
specific for a command line option.

We should look into some kind of logging improvements that allow
selecting which messages to print on a per-domain basis IMO.

In any case, those messages will only show up in debug builds, so it's
mostly annoying to developers but transparent to consumers of the
production build.

Thanks, Roger.


* Re: [ANNOUNCE] Xen 4.15 release update - still in feature freeze
  2021-03-16  9:43   ` Roger Pau Monné
@ 2021-03-16 10:12     ` Jan Beulich
  0 siblings, 0 replies; 7+ messages in thread
From: Jan Beulich @ 2021-03-16 10:12 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Ian Jackson, Andrew Cooper, Frédéric Pierret,
	Dario Faggioli, committers, xen-devel

On 16.03.2021 10:43, Roger Pau Monné wrote:
> On Mon, Mar 15, 2021 at 02:46:07PM +0100, Jan Beulich wrote:
>> On 15.03.2021 13:18, Ian Jackson wrote:
>>> ISSUES BELIEVED NEWLY RESOLVED
>>> ==============================
>>>
>>> Fallout from MSR handling behavioral change.
>>>
>>> I think there are now no outstanding patches to fix/change MSR
>>> behaviour and there is no longer any blocker here ?
>>
>> In addition to what Andrew has said, while not a blocker in that
>> sense I think the excessive verbosity of the logging is also an
>> issue.
> 
> I think you meant the logging done for each MSR that's not explicitly
> handled?
> 
> While I agree it might be too verbose, I don't see how we can change
> that right now. We could introduce a command line parameter to select
> whether to print those messages or not, but I think that's too
> specific for a command line option.

Yes, I agree.

> We should look into some kind of logging improvements that allow
> selecting which messages to print on a per-domain basis IMO.

Indeed, this was my thinking as well. I was wondering whether we
could at least limit reporting each unhandled MSR only once per
domain. But yes, this would require at least two extra pages to
hold the required bitmaps (one for the MSRs starting at 0x00000000
and the other for the group up from 0xC0000000; a 3rd one for AMD
for the group up from 0xC0010000).
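
To make the bitmap idea concrete, here is a minimal, self-contained
sketch. It is not Xen code: struct msr_seen, MSRS_PER_PAGE and
msr_first_report() are names made up for illustration, and it assumes
one 4 KiB page (32768 bits) per MSR range, with the three range bases
mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: one 4 KiB page of bits per tracked MSR range. */
#define PAGE_BYTES     4096u
#define MSRS_PER_PAGE  (PAGE_BYTES * 8u)   /* 32768 MSR indices per page */

/* Range bases taken from the discussion; the third one is AMD-specific. */
static const uint32_t msr_range_base[] = {
    0x00000000u, 0xC0000000u, 0xC0010000u,
};
#define NR_RANGES (sizeof(msr_range_base) / sizeof(msr_range_base[0]))

/* Hypothetical per-domain state: one bitmap page per range. */
struct msr_seen {
    uint8_t bits[NR_RANGES][PAGE_BYTES];
};

/*
 * Return true the first time an MSR is seen for this domain and false
 * on repeats.  MSRs outside every tracked range always return true,
 * because there is nowhere to remember them.
 */
static bool msr_first_report(struct msr_seen *s, uint32_t msr)
{
    assert(s != NULL);

    for ( size_t r = 0; r < NR_RANGES; r++ )
    {
        uint32_t base = msr_range_base[r];

        if ( msr >= base && msr - base < MSRS_PER_PAGE )
        {
            uint32_t off = msr - base;
            uint8_t mask = (uint8_t)(1u << (off & 7));
            bool seen = (s->bits[r][off >> 3] & mask) != 0;

            s->bits[r][off >> 3] |= mask;   /* remember it for next time */
            return !seen;
        }
    }

    return true;
}
```

A caller would log the unhandled MSR only when msr_first_report()
returns true, which bounds the output to one message per MSR per
domain at a cost of 4 KiB per tracked range.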

> In any case, those messages will only show up in debug builds, so it's
> mostly annoying to developers but transparent to consumers of the
> production build.

Or when, because of things working differently than before, people
need to be told to increase verbosity for debugging purposes.

Jan


* Re: [ANNOUNCE] Xen 4.15 release update - still in feature freeze
  2021-03-15 12:18 [ANNOUNCE] Xen 4.15 release update - still in feature freeze Ian Jackson
  2021-03-15 13:10 ` Andrew Cooper
  2021-03-15 13:46 ` Jan Beulich
@ 2021-03-18 18:11 ` Dario Faggioli
  2021-03-29 17:16   ` Dario Faggioli
  2 siblings, 1 reply; 7+ messages in thread
From: Dario Faggioli @ 2021-03-18 18:11 UTC (permalink / raw)
  To: Ian Jackson, committers, xen-devel
  Cc: Jan Beulich, Andrew Cooper, Frédéric Pierret, George Dunlap


[Adding George, since it's scheduling]

On Mon, 2021-03-15 at 12:18 +0000, Ian Jackson wrote:
> 
> OPEN ISSUES AND BLOCKERS
> ========================
> 
> [...]
> 
> SCHEDULER ISSUES NOT MAKING PROGRESS ?
> --------------------------------------
> 
Yeah... let's try.

> BUG: credit=sched2 machine hang when using DRAKVUF
> 
> Information from
>   Dario Faggioli <dfaggioli@suse.com>
> References
>   https://lists.xen.org/archives/html/xen-devel/2020-05/msg01985.html
>    
> https://lists.xenproject.org/archives/html/xen-devel/2020-10/msg01561.html
>   https://bugzilla.opensuse.org/show_bug.cgi?id=1179246
> 
So, this is mostly about the third issue, the one described in the
openSUSE bug, which was however also reported here, by different
people.

As I've just written there (on the bug), I've been working on trying to
reproduce the problem on a variety of different machines. Since AMD
seemed to be the most impacted, I've lately focused on hardware from
that vendor.

I have been, however, unable to re-create a situation where the
symptoms described in the reports occur. I specifically looked for
hardware that was the same, or similar enough, and I replayed the dom0
vcpu pinning configuration and the creation of domUs, both PV and HVM,
but the problem did not show up for me. The only difference between
what I've done so far and what is described, e.g., in the bug is that
I've not been able to check Windows guests yet. (I'll try that as soon
as I can, but if this really were a scheduling issue, which OS runs
in the guest should not matter much, I think.)

Code inspection for something that comes from and/or affects the
scheduler and is both:
- CPU-vendor specific, and
- guest-type specific

also led me pretty much nowhere.

I produced a debug patch (I attach two versions of it, one for staging
and one for v4.13.2) that should help me tell whether or not the
scheduler is being invoked every time it should be and whether or not
there are vcpus that manage to run for longer than the scheduler
would want them to.

But as you can imagine, a debug patch is not really helpful if it can't
be used within the scenario it is meant to debug, i.e., without a
reproducer.
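
To illustrate what the scheduler part of the debug patch measures, here
is a minimal standalone sketch. It is not the patch itself: struct
sched_stats and sched_stats_update() are hypothetical names, a plain
struct stands in for Xen's per-CPU variables, and s_time_t is assumed
to be a signed 64-bit nanosecond count.

```c
#include <assert.h>
#include <stdint.h>

typedef int64_t s_time_t;   /* assumption: signed nanoseconds */

/* Hypothetical stand-in for the per-CPU variables added by the patch. */
struct sched_stats {
    s_time_t last_time;      /* when the scheduler last ran */
    s_time_t last_interval;  /* gap between the last two invocations */
    s_time_t max_interval;   /* largest gap seen so far */
    s_time_t max_time;       /* when that largest gap was observed */
};

/*
 * Record one scheduler invocation at time 'now'.
 * Returns 1 if this invocation set a new maximum interval, 0 otherwise.
 */
static int sched_stats_update(struct sched_stats *st, s_time_t now)
{
    int new_max = 0;

    assert(now >= st->last_time);   /* time must not go backwards */

    st->last_interval = now - st->last_time;
    if ( st->last_interval > st->max_interval )
    {
        st->max_interval = st->last_interval;
        st->max_time = now;
        new_max = 1;                /* the patch emits a trace record here */
    }
    st->last_time = now;

    return new_max;
}
```

An unexpectedly large max_interval on some CPU would suggest that the
scheduler was not invoked when it should have been, which is exactly
the symptom the reports describe.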

I did manage to find an actual bug in Credit2, but that's totally
unrelated to the problem at hand (and that will hence be discussed in
another email).

So, that's the status. I definitely was hoping for things to be better
at this point of the release cycle. Sorry they're not. And of course
I'll keep digging, but unless I find a way to reproduce, I don't expect
a big breakthrough. :-/

> G. Null scheduler and vwfi native problem
> 
> Information from
>   Dario Faggioli <dfaggioli@suse.com>
> 
> References
>    
> https://lists.xenproject.org/archives/html/xen-devel/2021-01/msg01634.html
> 
> Quoting Dario:
> > RCU issues, but manifests due to scheduler behavior (especially   
> > NULL scheduler, especially on ARM).
> > 
> > Patches that should solve the issue for ARM posted already. They
> > will need to be slightly adjusted to cover x86 as well.
> 
> As of last update from Dario 29.1.21:
> waiting for test report from submitter.
> 
For this, I made progress toward making an actual patch that works for
both ARM and x86, but I've been sidetracked by a number of things, and
have not finished it.

The ARM-only fix has been tested successfully and would be ready
already. The full solution may not be ready in time for 4.15.

So, I'd say we can either merge the ARM part (ARM is where the issue
manifests most often and most severely) or wait for a full
solution during 4.16 development, which we will then backport.

Thanks and Regards
-- 
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
-------------------------------------------------------------------
<<This happens because _I_ choose it to happen!>> (Raistlin Majere)

[-- Attachment #1.2: xen-sched-suspect-debug.patch --]
[-- Type: text/x-patch, Size: 7534 bytes --]

commit 278305aff03edd326382374d7757822a20d96c86
Author: Dario Faggioli <dfaggioli@suse.com>
Date:   Tue Mar 2 19:03:05 2021 +0000

    Debug patch for suspect scheduler issues.
    
    Signed-off-by: Dario Faggioli <dfaggioli@suse.com>

diff --git a/tools/xentrace/xenalyze.c b/tools/xentrace/xenalyze.c
index b7f4e2bea8..8ce705bd48 100644
--- a/tools/xentrace/xenalyze.c
+++ b/tools/xentrace/xenalyze.c
@@ -7440,6 +7440,17 @@ void sched_process(struct pcpu_info *p)
         /* TRC_SCHED_VERBOSE */
         switch(ri->event)
         {
+        case TRC_SCHED_MAX_INTRV:
+            if(opt.dump_all) {
+                struct {
+                    unsigned int domid, vcpuid;
+                    unsigned int interv, time, last;
+                } *r = (typeof(r))ri->d;
+
+                printf(" %s sched_max_interv %u usecs, at %u usecs with d%uv%u (last: %u usecs)\n",
+                       ri->dump_header, r->interv, r->time, r->domid, r->vcpuid, r->last);
+            }
+            break;
         case TRC_SCHED_DOM_ADD:
             if(opt.dump_all) {
                 struct {
@@ -7904,6 +7915,18 @@ void sched_process(struct pcpu_info *p)
                        ri->dump_header, r->domid, r->vcpuid);
             }
             break;
+        case TRC_SCHED_CLASS_EVT(CSCHED2, 24):
+            if(opt.dump_all) {
+                struct {
+                    unsigned int domid, vcpuid;
+                    unsigned int limits, now, exec;
+                } *r = (typeof(r))ri->d;
+
+                printf(" %s csched2:limit_credit_loss[#%u] d%uv%u, at %u, exec'd %u usecs!\n",
+                       ri->dump_header, r->limits, r->domid, r->vcpuid, r->now, r->exec);
+            }
+            break;
+
         /* RTDS (TRC_RTDS_xxx) */
         case TRC_SCHED_CLASS_EVT(RTDS, 1): /* TICKLE           */
             if(opt.dump_all) {
diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c
index 6d34764d38..a88e2a1d0f 100644
--- a/xen/common/sched/core.c
+++ b/xen/common/sched/core.c
@@ -2627,6 +2627,12 @@ static void sched_slave(void)
                          is_idle_unit(next) && !is_idle_unit(prev), now);
 }
 
+static DEFINE_PER_CPU(s_time_t, last_sched_time);
+static DEFINE_PER_CPU(s_time_t, last_sched_interval);
+static DEFINE_PER_CPU(s_time_t, max_sched_interval);
+static DEFINE_PER_CPU(s_time_t, max_sched_time);
+static DEFINE_PER_CPU(struct vcpu *, max_sched_interv_vprev);
+
 /*
  * The main function
  * - deschedule the current domain (scheduler independent).
@@ -2641,6 +2647,7 @@ static void schedule(void)
     spinlock_t           *lock;
     int cpu = smp_processor_id();
     unsigned int          gran;
+    s_time_t              sched_interval;
 
     ASSERT_NOT_IN_ATOMIC();
 
@@ -2650,6 +2657,21 @@ static void schedule(void)
 
     lock = pcpu_schedule_lock_irq(cpu);
 
+    now = NOW();
+
+    sched_interval = this_cpu(last_sched_interval) = now - this_cpu(last_sched_time);
+    if ( sched_interval > this_cpu(max_sched_interval) )
+    {
+        this_cpu(max_sched_interval) = sched_interval;
+        this_cpu(max_sched_interv_vprev) = vprev;
+        this_cpu(max_sched_time) = now;
+        TRACE_5D(TRC_SCHED_MAX_INTRV, vprev->domain->domain_id, vprev->vcpu_id,
+                 (uint32_t)(this_cpu(max_sched_interval) / MICROSECS(1)),
+                 (uint32_t)(this_cpu(max_sched_time) / MICROSECS(1)),
+                 (uint32_t)(this_cpu(last_sched_interval) / MICROSECS(1)));
+    }
+    this_cpu(last_sched_time) = now;
+
     sr = get_sched_res(cpu);
     gran = sr->granularity;
 
@@ -2669,8 +2691,6 @@ static void schedule(void)
 
     stop_timer(&sr->s_timer);
 
-    now = NOW();
-
     if ( gran > 1 )
     {
         cpumask_t *mask = cpumask_scratch_cpu(cpu);
@@ -3365,6 +3385,12 @@ void schedule_dump(struct cpupool *c)
         printk("CPU[%02d] current=%pv, curr=%pv, prev=%pv\n", i,
                get_cpu_current(i), sr->curr ? sr->curr->vcpu_list : NULL,
                sr->prev ? sr->prev->vcpu_list : NULL);
+        printk("\tlast schedule: %"PRI_stime", last_interval=%"PRI_stime", "
+               "max_interval=%"PRI_stime" at %"PRI_stime" (after running %pv)\n",
+               per_cpu(last_sched_time, i), per_cpu(last_sched_interval, i),
+               per_cpu(max_sched_interval, i), per_cpu(max_sched_time, i),
+               per_cpu(max_sched_interv_vprev, i));
+        per_cpu(max_sched_interval, i) = 0;
         for_each_cpu (j, sr->cpus)
             if ( i != j )
                 printk("CPU[%02d] current=%pv\n", j, get_cpu_current(j));
diff --git a/xen/common/sched/credit2.c b/xen/common/sched/credit2.c
index eb5e5a78c5..4263b67f23 100644
--- a/xen/common/sched/credit2.c
+++ b/xen/common/sched/credit2.c
@@ -61,6 +61,7 @@
 #define TRC_CSCHED2_SCHEDULE         TRC_SCHED_CLASS_EVT(CSCHED2, 21)
 #define TRC_CSCHED2_RATELIMIT        TRC_SCHED_CLASS_EVT(CSCHED2, 22)
 #define TRC_CSCHED2_RUNQ_CAND_CHECK  TRC_SCHED_CLASS_EVT(CSCHED2, 23)
+#define TRC_CSCHED2_LIMIT_CREDITS    TRC_SCHED_CLASS_EVT(CSCHED2, 24)
 
 /*
  * TODO:
@@ -798,6 +799,11 @@ static int get_fallback_cpu(struct csched2_unit *svc)
     return cpumask_any(cpumask_scratch_cpu(sched_unit_master(unit)));
 }
 
+static DEFINE_PER_CPU(unsigned int, limit_credits);
+static DEFINE_PER_CPU(s_time_t, limit_credits_time);
+static DEFINE_PER_CPU(s_time_t, limit_credits_exec);
+static DEFINE_PER_CPU(struct sched_unit *, limit_credits_unit);
+
 /*
  * Time-to-credit, credit-to-time.
  *
@@ -815,7 +821,17 @@ static void t2c_update(const struct csched2_runqueue_data *rqd, s_time_t time,
     /* Getting to lower credit than CSCHED2_CREDIT_MIN makes no sense. */
     val = svc->credit - val;
     if ( unlikely(val < CSCHED2_CREDIT_MIN) )
+    {
+        this_cpu(limit_credits)++;
+        this_cpu(limit_credits_time) = NOW();
+        this_cpu(limit_credits_exec) = time;
+        this_cpu(limit_credits_unit) = svc->unit;
+        TRACE_5D(TRC_CSCHED2_LIMIT_CREDITS, svc->unit->domain->domain_id,
+                 svc->unit->unit_id, this_cpu(limit_credits),
+                 (uint32_t)(this_cpu(limit_credits_time)/MICROSECS(1)),
+                 (uint32_t)(this_cpu(limit_credits_exec)/MICROSECS(1)));
         svc->credit = CSCHED2_CREDIT_MIN;
+    }
     else
         svc->credit = val;
 }
@@ -3757,6 +3773,12 @@ dump_pcpu(const struct scheduler *ops, int cpu)
            cpu, c2r(cpu),
            CPUMASK_PR(per_cpu(cpu_sibling_mask, cpu)),
            CPUMASK_PR(per_cpu(cpu_core_mask, cpu)));
+    if ( per_cpu(limit_credits_unit, cpu) != NULL ) {
+        printk("\tCredit limited: #%u, last at %"PRI_stime" as d%uv%u exec'd %"PRI_stime"\n",
+               per_cpu(limit_credits, cpu), per_cpu(limit_credits_time, cpu),
+               per_cpu(limit_credits_unit, cpu)->domain->domain_id,
+               per_cpu(limit_credits_unit, cpu)->unit_id, per_cpu(limit_credits_exec, cpu));
+    }
 
     /* current UNIT (nothing to say if that's the idle unit) */
     svc = csched2_unit(curr_on_cpu(cpu));
diff --git a/xen/include/public/trace.h b/xen/include/public/trace.h
index d5fa4aea8d..5b3faf0fd5 100644
--- a/xen/include/public/trace.h
+++ b/xen/include/public/trace.h
@@ -117,6 +117,7 @@
 #define TRC_SCHED_SWITCH_INFNEXT (TRC_SCHED_VERBOSE + 15)
 #define TRC_SCHED_SHUTDOWN_CODE  (TRC_SCHED_VERBOSE + 16)
 #define TRC_SCHED_SWITCH_INFCONT (TRC_SCHED_VERBOSE + 17)
+#define TRC_SCHED_MAX_INTRV      (TRC_SCHED_VERBOSE + 18)
 
 #define TRC_DOM0_DOM_ADD         (TRC_DOM0_DOMOPS + 1)
 #define TRC_DOM0_DOM_REM         (TRC_DOM0_DOMOPS + 2)

[-- Attachment #1.3: xen-sched-suspect-debug_4.13.2.patch --]
[-- Type: text/x-patch, Size: 8314 bytes --]

commit 0f8ef8f23718cc24b0bc958979aa789be4ed89d5
Author: Dario Faggioli <dfaggioli@suse.com>
Date:   Tue Mar 2 19:03:05 2021 +0000

    Debug patch for suspect scheduler issues.
    
    Signed-off-by: Dario Faggioli <dfaggioli@suse.com>

diff --git a/tools/xentrace/xenalyze.c b/tools/xentrace/xenalyze.c
index b7f4e2bea8..8ce705bd48 100644
--- a/tools/xentrace/xenalyze.c
+++ b/tools/xentrace/xenalyze.c
@@ -7440,6 +7440,17 @@ void sched_process(struct pcpu_info *p)
         /* TRC_SCHED_VERBOSE */
         switch(ri->event)
         {
+        case TRC_SCHED_MAX_INTRV:
+            if(opt.dump_all) {
+                struct {
+                    unsigned int domid, vcpuid;
+                    unsigned int interv, time, last;
+                } *r = (typeof(r))ri->d;
+
+                printf(" %s sched_max_interv %u usecs, at %u usecs with d%uv%u (last: %u usecs)\n",
+                       ri->dump_header, r->interv, r->time, r->domid, r->vcpuid, r->last);
+            }
+            break;
         case TRC_SCHED_DOM_ADD:
             if(opt.dump_all) {
                 struct {
@@ -7904,6 +7915,18 @@ void sched_process(struct pcpu_info *p)
                        ri->dump_header, r->domid, r->vcpuid);
             }
             break;
+        case TRC_SCHED_CLASS_EVT(CSCHED2, 24):
+            if(opt.dump_all) {
+                struct {
+                    unsigned int domid, vcpuid;
+                    unsigned int limits, now, exec;
+                } *r = (typeof(r))ri->d;
+
+                printf(" %s csched2:limit_credit_loss[#%u] d%uv%u, at %u, exec'd %u usecs!\n",
+                       ri->dump_header, r->limits, r->domid, r->vcpuid, r->now, r->exec);
+            }
+            break;
+
         /* RTDS (TRC_RTDS_xxx) */
         case TRC_SCHED_CLASS_EVT(RTDS, 1): /* TICKLE           */
             if(opt.dump_all) {
diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index ce7c56147b..29aa99db86 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -57,6 +57,7 @@
 #define TRC_CSCHED2_SCHEDULE         TRC_SCHED_CLASS_EVT(CSCHED2, 21)
 #define TRC_CSCHED2_RATELIMIT        TRC_SCHED_CLASS_EVT(CSCHED2, 22)
 #define TRC_CSCHED2_RUNQ_CAND_CHECK  TRC_SCHED_CLASS_EVT(CSCHED2, 23)
+#define TRC_CSCHED2_LIMIT_CREDITS    TRC_SCHED_CLASS_EVT(CSCHED2, 24)
 
 /*
  * TODO:
@@ -775,6 +776,11 @@ static int get_fallback_cpu(struct csched2_unit *svc)
     return cpumask_any(cpumask_scratch_cpu(sched_unit_master(unit)));
 }
 
+static DEFINE_PER_CPU(unsigned int, limit_credits);
+static DEFINE_PER_CPU(s_time_t, limit_credits_time);
+static DEFINE_PER_CPU(s_time_t, limit_credits_exec);
+static DEFINE_PER_CPU(struct sched_unit *, limit_credits_unit);
+
 /*
  * Time-to-credit, credit-to-time.
  *
@@ -792,7 +798,17 @@ static void t2c_update(struct csched2_runqueue_data *rqd, s_time_t time,
     /* Getting to lower credit than CSCHED2_CREDIT_MIN makes no sense. */
     val = svc->credit - val;
     if ( unlikely(val < CSCHED2_CREDIT_MIN) )
+    {
+        this_cpu(limit_credits)++;
+        this_cpu(limit_credits_time) = NOW();
+        this_cpu(limit_credits_exec) = time;
+        this_cpu(limit_credits_unit) = svc->unit;
+        TRACE_5D(TRC_CSCHED2_LIMIT_CREDITS, svc->unit->domain->domain_id,
+                 svc->unit->unit_id, this_cpu(limit_credits),
+                 (uint32_t)(this_cpu(limit_credits_time)/MICROSECS(1)),
+                 (uint32_t)(this_cpu(limit_credits_exec)/MICROSECS(1)));
         svc->credit = CSCHED2_CREDIT_MIN;
+    }
     else
         svc->credit = val;
 }
@@ -3661,6 +3677,12 @@ dump_pcpu(const struct scheduler *ops, int cpu)
            cpu, c2r(cpu),
            CPUMASK_PR(per_cpu(cpu_sibling_mask, cpu)),
            CPUMASK_PR(per_cpu(cpu_core_mask, cpu)));
+    if ( per_cpu(limit_credits_unit, cpu) != NULL ) {
+        printk("\tCredit limited: #%u, last at %"PRI_stime" as d%uv%u exec'd %"PRI_stime"\n",
+               per_cpu(limit_credits, cpu), per_cpu(limit_credits_time, cpu),
+               per_cpu(limit_credits_unit, cpu)->domain->domain_id,
+               per_cpu(limit_credits_unit, cpu)->unit_id, per_cpu(limit_credits_exec, cpu));
+    }
 
     /* current UNIT (nothing to say if that's the idle unit) */
     svc = csched2_unit(curr_on_cpu(cpu));
diff --git a/xen/common/schedule.c b/xen/common/schedule.c
index 6b1ae7bf8c..4a950f3b57 100644
--- a/xen/common/schedule.c
+++ b/xen/common/schedule.c
@@ -2385,6 +2385,12 @@ static void sched_slave(void)
                          is_idle_unit(next) && !is_idle_unit(prev), now);
 }
 
+static DEFINE_PER_CPU(s_time_t, last_sched_time);
+static DEFINE_PER_CPU(s_time_t, last_sched_interval);
+static DEFINE_PER_CPU(s_time_t, max_sched_interval);
+static DEFINE_PER_CPU(s_time_t, max_sched_time);
+static DEFINE_PER_CPU(struct vcpu *, max_sched_interv_vprev);
+
 /*
  * The main function
  * - deschedule the current domain (scheduler independent).
@@ -2399,6 +2405,7 @@ static void schedule(void)
     spinlock_t           *lock;
     int cpu = smp_processor_id();
     unsigned int          gran;
+    s_time_t              sched_interval;
 
     ASSERT_NOT_IN_ATOMIC();
 
@@ -2408,6 +2415,21 @@ static void schedule(void)
 
     lock = pcpu_schedule_lock_irq(cpu);
 
+    now = NOW();
+
+    sched_interval = this_cpu(last_sched_interval) = now - this_cpu(last_sched_time);
+    if ( sched_interval > this_cpu(max_sched_interval) )
+    {
+        this_cpu(max_sched_interval) = sched_interval;
+        this_cpu(max_sched_interv_vprev) = vprev;
+        this_cpu(max_sched_time) = now;
+        TRACE_5D(TRC_SCHED_MAX_INTRV, vprev->domain->domain_id, vprev->vcpu_id,
+                 (uint32_t)(this_cpu(max_sched_interval) / MICROSECS(1)),
+                 (uint32_t)(this_cpu(max_sched_time) / MICROSECS(1)),
+                 (uint32_t)(this_cpu(last_sched_interval) / MICROSECS(1)));
+    }
+    this_cpu(last_sched_time) = now;
+
     sr = get_sched_res(cpu);
     gran = sr->granularity;
 
@@ -2427,8 +2449,6 @@ static void schedule(void)
 
     stop_timer(&sr->s_timer);
 
-    now = NOW();
-
     if ( gran > 1 )
     {
         cpumask_t mask;
@@ -3085,7 +3105,7 @@ void scheduler_free(struct scheduler *sched)
 
 void schedule_dump(struct cpupool *c)
 {
-    unsigned int      i;
+    unsigned int      i,j;
     struct scheduler *sched;
     cpumask_t        *cpus;
 
@@ -3106,11 +3126,30 @@ void schedule_dump(struct cpupool *c)
         cpus = &cpupool_free_cpus;
     }
 
-    if ( sched->dump_cpu_state != NULL )
-    {
-        printk("CPUs info:\n");
-        for_each_cpu (i, cpus)
-            sched_dump_cpu_state(sched, i);
+    printk("CPUs info:\n");
+    for_each_cpu (i, cpus) {
+        struct sched_resource *sr = get_sched_res(i);
+        unsigned long flags;
+        spinlock_t *lock;
+
+        lock = pcpu_schedule_lock_irqsave(i, &flags);
+
+        printk("CPU[%02d] current=%pv, curr=%pv, prev=%pv\n", i,
+               get_cpu_current(i), sr->curr ? sr->curr->vcpu_list : NULL,
+               sr->prev ? sr->prev->vcpu_list : NULL);
+        printk("\tlast schedule: %"PRI_stime", last_interval=%"PRI_stime", "
+               "max_interval=%"PRI_stime" at %"PRI_stime" (after running %pv)\n",
+               per_cpu(last_sched_time, i), per_cpu(last_sched_interval, i),
+               per_cpu(max_sched_interval, i), per_cpu(max_sched_time, i),
+               per_cpu(max_sched_interv_vprev, i));
+        per_cpu(max_sched_interval, i) = 0;
+        for_each_cpu (j, sr->cpus)
+            if ( i != j )
+                printk("CPU[%02d] current=%pv\n", j, get_cpu_current(j));
+
+        pcpu_schedule_unlock_irqrestore(lock, flags, i);
+
+        sched_dump_cpu_state(sched, i);
     }
 
     rcu_read_unlock(&sched_res_rculock);
diff --git a/xen/include/public/trace.h b/xen/include/public/trace.h
index d5fa4aea8d..5b3faf0fd5 100644
--- a/xen/include/public/trace.h
+++ b/xen/include/public/trace.h
@@ -117,6 +117,7 @@
 #define TRC_SCHED_SWITCH_INFNEXT (TRC_SCHED_VERBOSE + 15)
 #define TRC_SCHED_SHUTDOWN_CODE  (TRC_SCHED_VERBOSE + 16)
 #define TRC_SCHED_SWITCH_INFCONT (TRC_SCHED_VERBOSE + 17)
+#define TRC_SCHED_MAX_INTRV      (TRC_SCHED_VERBOSE + 18)
 
 #define TRC_DOM0_DOM_ADD         (TRC_DOM0_DOMOPS + 1)
 #define TRC_DOM0_DOM_REM         (TRC_DOM0_DOMOPS + 2)

* Re: [ANNOUNCE] Xen 4.15 release update - still in feature freeze
  2021-03-18 18:11 ` Dario Faggioli
@ 2021-03-29 17:16   ` Dario Faggioli
  0 siblings, 0 replies; 7+ messages in thread
From: Dario Faggioli @ 2021-03-29 17:16 UTC (permalink / raw)
  To: Ian Jackson, committers, xen-devel
  Cc: Jan Beulich, Andrew Cooper, Frédéric Pierret, George Dunlap

On Thu, 2021-03-18 at 19:11 +0100, Dario Faggioli wrote:
> On Mon, 2021-03-15 at 12:18 +0000, Ian Jackson wrote:
> 
> >  
> >   https://bugzilla.opensuse.org/show_bug.cgi?id=1179246
> > 
> So, this is mostly about the third issue, the one described in the
> openSUSE bug, which was however also reported here, by different
> people.
> 
> As I've just written there (on the bug), I've been working on trying to
> reproduce the problem on a variety of different machines. Since AMD
> seemed to be the most impacted, I've lately focused on hardware from
> that vendor.
> 
FWIW, as a further update, there are now new info/logs here:
https://bugzilla.opensuse.org/show_bug.cgi?id=1179246

which I'm analyzing. I should also be able to get direct access to a
box where the issue can be reproduced.

Regards
-- 
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
-------------------------------------------------------------------
<<This happens because _I_ choose it to happen!>> (Raistlin Majere)
