From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Dario Faggioli <dario.faggioli@citrix.com>,
Jan Beulich <JBeulich@suse.com>,
George Dunlap <george.dunlap@citrix.com>
Cc: xen-devel@lists.xenproject.org,
Anshul Makkar <anshul.makkar@citrix.com>,
MengXu <mengxu@cis.upenn.edu>
Subject: Re: [PATCH 2/3] xen: Have schedulers revise initial placement
Date: Thu, 11 Aug 2016 16:51:25 +0100
Message-ID: <a52941cd-ac3d-40a1-b684-1938e6b31d81@citrix.com>
In-Reply-To: <1470927584.6250.26.camel@citrix.com>
On 11/08/16 15:59, Dario Faggioli wrote:
> On Fri, 2016-08-05 at 07:24 -0600, Jan Beulich wrote:
>> I'd really like to have those backported, but I have to ask one
>> of you to identify which prereq-s are needed on 4.6 and 4.5
>> (I'll revert them from 4.5 right away, but I'll wait for an osstest
>> flight to confirm the same issue exists on 4.6).
>>
> Hey, I could only start working on this this morning (sorry for the
> delay), and I'll continue tomorrow but, at least here, staging-4.6
> (plus the patches!) crashes like this:
>
> (XEN) [ 0.000000] ----[ Xen-4.6.4-pre x86_64 debug=y Not tainted ]----
> (XEN) [ 0.000000] CPU: 0
> (XEN) [ 0.000000] RIP: e008:[<ffff82d0801238c4>] _csched_cpu_pick+0x156/0x612
> (XEN) [ 0.000000] RFLAGS: 0000000000010046 CONTEXT: hypervisor
> (XEN) [ 0.000000] rax: 0000000000000040 rbx: 0000000000000040 rcx: 0000000000000000
> (XEN) [ 0.000000] rdx: 000000000000003f rsi: 0000000000000040 rdi: 0000000000000040
> (XEN) [ 0.000000] rbp: ffff82d0802f7d68 rsp: ffff82d0802f7c78 r8: 0000000000000000
> (XEN) [ 0.000000] r9: 0000000000000001 r10: 0000000000000001 r11: 00000000000000b4
> (XEN) [ 0.000000] r12: 0000000000000040 r13: ffff83032152bf40 r14: ffff82d08033ae40
> (XEN) [ 0.000000] r15: ffff8300dbdf4000 cr0: 0000000080050033 cr4: 00000000000006e0
> (XEN) [ 0.000000] cr3: 00000000dba9f000 cr2: 0000000000000000
> (XEN) [ 0.000000] ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008
> (XEN) [ 0.000000] Xen stack trace from rsp=ffff82d0802f7c78:
> (XEN) [ 0.000000] 0000000000000046 0000000200000092 ffff82d08033ae48 ffff82d08028b020
> (XEN) [ 0.000000] 0000000000000000 ffff82d0802ff040 0000000100000001 ffff82d08033ae40
> (XEN) [ 0.000000] 0000000000000097 ffff82d0802f7cd8 0000000000000006 ffff82d0802f7ce8
> (XEN) [ 0.000000] ffff82d08012d027 ffff830321532000 ffff82d0802f7d28 ffff82d08013c779
> (XEN) [ 0.000000] 0000000000000000 0000000000000010 0000000000000048 0000000000000048
> (XEN) [ 0.000000] 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN) [ 0.000000] ffff82d0802f7d78 ffff8300dbdf4000 ffff83032152a000 ffff83032152bf40
> (XEN) [ 0.000000] 0000000000000000 ffff82d08028e020 ffff82d0802f7d78 ffff82d080123d8e
> (XEN) [ 0.000000] ffff82d0802f7db8 ffff82d080123db0 ffff83032152a000 ffff8300dbdf4000
> (XEN) [ 0.000000] ffff83032152a000 0000000000000000 0000000000000000 ffff82d08028e020
> (XEN) [ 0.000000] ffff82d0802f7de8 ffff82d080129e8d ffff82d0802f7de8 ffff82d08013cda0
> (XEN) [ 0.000000] ffff8300dbdf4000 ffff83032152a000 ffff82d0802f7e18 ffff82d080105f59
> (XEN) [ 0.000000] ffff82d0802a1720 ffff82d0802a1718 ffff82d0802a1720 ffff82d0802daa60
> (XEN) [ 0.000000] ffff82d0802f7e48 ffff82d0802a8145 0000000000000002 ffff83032152b610
> (XEN) [ 0.000000] 0000000000000002 0000000000000001 ffff82d0802f7f08 ffff82d0802c5c6f
> (XEN) [ 0.000000] 0000000000000000 0000000000100000 00000000014ae000 0000000000324000
> (XEN) [ 0.000000] 0080016300000000 0000000300000015 0000000000000000 0000000000000010
> (XEN) [ 0.000000] ffff83000008dd50 ffff83000008df20 0000000000000002 ffff83000008dfb0
> (XEN) [ 0.000000] 0000000800000000 000000010000006e 0000000000000003 00000000000002f8
> (XEN) [ 0.000000] 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN) [ 0.000000] Xen call trace:
> (XEN) [ 0.000000] [<ffff82d0801238c4>] _csched_cpu_pick+0x156/0x612
> (XEN) [ 0.000000] [<ffff82d080123d8e>] csched_cpu_pick+0xe/0x10
> (XEN) [ 0.000000] [<ffff82d080123db0>] csched_vcpu_insert+0x20/0x145
> (XEN) [ 0.000000] [<ffff82d080129e8d>] sched_init_vcpu+0x1d1/0x218
> (XEN) [ 0.000000] [<ffff82d080105f59>] alloc_vcpu+0x1ba/0x2a4
> (XEN) [ 0.000000] [<ffff82d0802a8145>] scheduler_init+0x1b0/0x271
> (XEN) [ 0.000000] [<ffff82d0802c5c6f>] __start_xen+0x1f85/0x2550
> (XEN) [ 0.000000] [<ffff82d080100073>] __high_start+0x53/0x55
> (XEN) [ 0.000000]
> (XEN) [ 0.000000]
> (XEN) [ 0.000000] ****************************************
> (XEN) [ 0.000000] Panic on CPU 0:
> (XEN) [ 0.000000] Assertion 'cpu < nr_cpu_ids' failed at ...e/SOURCES/xen/xen.git/xen/include/xen/cpumask.h:97
>
> Which, I think, needs at least this hunk (from 6b53bb4ab3c9 "sched:
> better handle (not) inserting idle vCPUs in runqueues"):
>
> diff --git a/xen/common/schedule.c b/xen/common/schedule.c
> index 2beebe8..fddcd52 100644
> --- a/xen/common/schedule.c
> +++ b/xen/common/schedule.c
> @@ -240,20 +240,22 @@ int sched_init_vcpu(struct vcpu *v, unsigned int processor)
>      init_timer(&v->poll_timer, poll_timer_fn,
>                 v, v->processor);
>
> -    /* Idle VCPUs are scheduled immediately. */
> +    v->sched_priv = SCHED_OP(DOM2OP(d), alloc_vdata, v, d->sched_priv);
> +    if ( v->sched_priv == NULL )
> +        return 1;
> +
> +    TRACE_2D(TRC_SCHED_DOM_ADD, v->domain->domain_id, v->vcpu_id);
> +
> +    /* Idle VCPUs are scheduled immediately, so don't put them in runqueue. */
>      if ( is_idle_domain(d) )
>      {
>          per_cpu(schedule_data, v->processor).curr = v;
>          v->is_running = 1;
>      }
> -
> -    TRACE_2D(TRC_SCHED_DOM_ADD, v->domain->domain_id, v->vcpu_id);
> -
> -    v->sched_priv = SCHED_OP(DOM2OP(d), alloc_vdata, v, d->sched_priv);
> -    if ( v->sched_priv == NULL )
> -        return 1;
> -
> -    SCHED_OP(DOM2OP(d), insert_vcpu, v);
> +    else
> +    {
> +        SCHED_OP(DOM2OP(d), insert_vcpu, v);
> +    }
>
>      return 0;
>  }
>
> So, yeah, it's proving a little more complicated than I thought it
> would be, just from looking at the patches. :-/
>
> Will let you know.

FWIW, this looks very similar to the regression I just raised against
Xen 4.7 in "[Xen-devel] Scheduler regression in 4.7". The stack traces
are suspiciously alike, so I expect both have the same root cause.
~Andrew