From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Dario Faggioli <dario.faggioli@citrix.com>,
	George Dunlap <george.dunlap@citrix.com>,
	George Dunlap <george.dunlap@eu.citrix.com>
Cc: Jan Beulich <JBeulich@suse.com>,
	Xen-devel List <xen-devel@lists.xen.org>
Subject: Re: Scheduler regression in 4.7
Date: Thu, 11 Aug 2016 16:42:13 +0100
Message-ID: <bbc5e619-4ef5-16fd-bb4c-782ffd9b0988@citrix.com>
In-Reply-To: <1470925692.6250.20.camel@citrix.com>

On 11/08/16 15:28, Dario Faggioli wrote:
> On Thu, 2016-08-11 at 14:39 +0100, Andrew Cooper wrote:
>> On 11/08/16 14:24, George Dunlap wrote:
>>> On 11/08/16 12:35, Andrew Cooper wrote:
>>>> The actual cause is _csched_cpu_pick() falling over LIST_POISON,
>>>> which happened to occur at the same time as a domain was shutting
>>>> down.  The instruction in question is `mov 0x10(%rax),%rax` which
>>>> looks like reverse list traversal.
> Thanks for the report.
>
>>> Could you use addr2line or objdump -dl to get a better idea where
>>> the #GP is happening?
>> addr2line -e xen-syms-4.7.0-xs127493 ffff82d08012944f
>> /obj/RPM_BUILD_DIRECTORY/xen-4.7.0/xen/common/sched_credit.c:775
>> (discriminator 1)
>>
>> It will be IS_RUNQ_IDLE() which is the problem.
>>
> Ok, that does one step of list traversal (of the runq). What I didn't
> understand from your report is what crashed, and when.

IS_RUNQ_IDLE() was traversing a list, and it encountered an element
which was being concurrently deleted on a different pcpu.
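
For reference, a minimal sketch (simplified, not the actual Xen sources;
names and poison constants are illustrative) of the shape of that race:
a Linux/Xen-style list_del() writes poison values into the entry it
removes, and because those values are non-canonical on x86-64, a later
dereference raises #GP rather than a page fault.

    struct list_head { struct list_head *next, *prev; };

    /* Illustrative poison constants (non-canonical addresses). */
    #define LIST_POISON1 ((struct list_head *)0x0100100100100100UL)
    #define LIST_POISON2 ((struct list_head *)0x0200200200200200UL)

    /* pcpu A: removes an element under the runqueue lock, then poisons it. */
    static void list_del_poison(struct list_head *entry)
    {
        entry->prev->next = entry->next;
        entry->next->prev = entry->prev;
        entry->next = LIST_POISON1;
        entry->prev = LIST_POISON2;
    }

    /* pcpu B: peeks at the same runqueue *without* taking that lock. */
    static int runq_walk_racy(struct list_head *runq)
    {
        struct list_head *first = runq->next; /* old head element            */
        /* ... pcpu A runs list_del_poison(first) at this point ...          */
        struct list_head *next = first->next; /* now loads LIST_POISON1      */
        return next->prev == runq;            /* dereferencing poison => #GP */
    }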

>
> IS_RUNQ_IDLE() was introduced a while back and nothing like that has
> ever been caught so far. George's patch makes _csched_cpu_pick() be
> called during insert_vcpu()-->csched_vcpu_insert() which, in 4.7, is
> called:
>  1) during domain (well, vcpu) creation,
>  2) when a domain is moved among cpupools.
>
> AFAICR, during domain destruction we basically move the domain to
> cpupool0, and without a patch that I sent recently, that is always done
> as a full-fledged cpupool movement, even if the domain is _already_ in
> cpupool0. So, even if you are not using cpupools, and since you mention
> domain shutdown, we are probably looking at 2).

XenServer doesn't use any cpupools, so all pcpus and vcpus are in cpupool0.

>
> But this is the part I'm not sure I understood... Do you have enough
> info to tell precisely when the crash manifests? Is it indeed during a
> domain shutdown, or was it during a domain creation (sched_init_vcpu()
> is in the stack trace... although I've read it's a non-debug one)? And
> is it a 'regular' domain or dom0 that is shutting down/coming up?

It is a VM reboot of an HVM domU (CentOS 7 64-bit, although I doubt
that is relevant).

The testcase is VM lifecycle ops on a 32-vcpu VM, on a host which
happens to have 32 pcpus.

>
> The idea behind IS_RUNQ_IDLE() is that we need to know whether there is
> someone in the runq of a cpu or not, to correctly initialize --and
> hence avoid biasing-- some load balancing calculations. I've never
> liked the idea (let alone the code), but it's necessary (or, at
> least, I don't see a sensible alternative).
>
> The questions I'm asking above have the aim of figuring out what the
> status of the runq could be, and why adding a call to csched_cpu_pick()
> from insert_vcpu() is making things explode...

It turns out that the stack trace is rather less stack rubble than I
first thought.  We are in domain construction, and specifically the
XEN_DOMCTL_max_vcpus hypercall.  All other pcpus are in idle.

    for ( i = 0; i < max; i++ )
    {
        if ( d->vcpu[i] != NULL )
            continue;

        cpu = (i == 0) ?
            cpumask_any(online) :
            cpumask_cycle(d->vcpu[i-1]->processor, online);

        if ( alloc_vcpu(d, i, cpu) == NULL )
            goto maxvcpu_out;
    }

The cpumask_cycle() call is complete and execution has moved into
alloc_vcpu().

Unfortunately, none of the code around here spills i or cpu onto the
stack, so I can't see which values they have from the stack dump.

However, I see that csched_vcpu_insert() plays with vc->processor, which
surely invalidates the cycle logic behind this loop?
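
For what it's worth, a toy illustration (hypothetical names, not the
real call chain) of why re-picking v->processor inside the insert hook
would defeat the cycling: the loop cycles from d->vcpu[i-1]->processor,
so if the insert path overrides the processor the caller asked for,
every later iteration cycles from the scheduler's choice instead, and
the intended round-robin spread over the pcpus collapses.

    #include <stdio.h>

    #define NR_PCPUS 4

    /* Toy stand-in for cpumask_cycle(): next pcpu after 'prev', wrapping. */
    static int cycle(int prev) { return (prev + 1) % NR_PCPUS; }

    /* Hypothetical stand-in for the insert path re-picking the processor
     * (alloc_vcpu() -> sched_init_vcpu() -> insert_vcpu()).  Here it always
     * lands on pcpu0, e.g. because that pcpu currently looks idle. */
    static int insert_repick(int requested_cpu)
    {
        (void)requested_cpu;
        return 0;
    }

    int main(void)
    {
        int processor[8]; /* plays the role of d->vcpu[i]->processor */
        int i, cpu;

        for ( i = 0; i < 8; i++ )
        {
            cpu = (i == 0) ? 0 : cycle(processor[i - 1]);
            processor[i] = insert_repick(cpu);
            printf("vcpu%d: asked for pcpu%d, placed on pcpu%d\n",
                   i, cpu, processor[i]);
        }
        return 0;
    }

Every vcpu after the first asks for pcpu1 and ends up on pcpu0, rather
than the vcpus being spread over pcpu0..pcpu3 as the loop intends.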

~Andrew

