All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Roger Pau Monné" <roger.pau@citrix.com>
To: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>,
	Graeme Gregory <graeme.gregory@linaro.org>,
	Al Stone <al.stone@linaro.org>,
	Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
	Andrew Cooper <andrew.cooper3@citrix.com>,
	Anshul Makkar <anshul.makkar@citrix.com>,
	Julien Grall <julien.grall@arm.com>,
	Jan Beulich <JBeulich@suse.com>,
	xen-devel@lists.xenproject.org
Subject: Re: PVH CPU hotplug design document
Date: Wed, 18 Jan 2017 10:34:18 +0000	[thread overview]
Message-ID: <20170118103418.7kuv3nhptedz4f4h@dhcp-3-221.uk.xensource.com> (raw)
In-Reply-To: <5d804004-fb7a-9425-8364-b5b330459021@oracle.com>

On Tue, Jan 17, 2017 at 01:50:14PM -0500, Boris Ostrovsky wrote:
> On 01/17/2017 12:45 PM, Roger Pau Monné wrote:
> > On Tue, Jan 17, 2017 at 10:50:44AM -0500, Boris Ostrovsky wrote:
> >> On 01/17/2017 10:33 AM, Jan Beulich wrote:
> >>>>>> On 17.01.17 at 16:27, <boris.ostrovsky@oracle.com> wrote:
> >>>> On 01/17/2017 09:44 AM, Jan Beulich wrote:
> >>>>>>>> On 17.01.17 at 15:13, <roger.pau@citrix.com> wrote:
> >>>>>> There's only one kind of PVHv2 guest that doesn't require ACPI, and that guest
> >>>>>> type also doesn't have emulated local APICs. We agreed that this model was
> >>>>>> interesting from things like unikernels DomUs, but that's the only reason why
> >>>>>> we are providing it. Not that full OSes couldn't use it, but it seems
> >>>>>> pointless.
> >>>>> You writing things this way makes me notice another possible design
> >>>>> issue here: Requiring ACPI is a bad thing imo, with even bare hardware
> >>>>> going different directions for at least some use cases (SFI being one
> >>>>> example). Hence I think ACPI should - like on bare hardware - remain
> >>>>> an optional thing. Which in turn require _all_ information obtained from
> >>>>> ACPI (if available) to also be available another way. And this other
> >>>>> way might by hypercalls in our case.
> >>>> At the risk of derailing this thread: why do we need vCPU hotplug for
> >>>> dom0 in the first place? What do we gain over "echo {1|0} >
> >>>> /sys/devices/system/cpu/cpuX/online" ?
> >>>>
> >>>> I can see why this may be needed for domUs where Xen can enforce number
> >>>> of vCPUs that are allowed to run (which we don't enforce now anyway) but
> >>>> why for dom0?
> >>> Good that you now ask this too - that's the PV hotplug mechanism,
> >>> and I've been saying all the time that this should be just fine for PVH
> >>> (Dom0 and DomU).
> >> I think domU hotplug has some value in that we can change number VCPUs
> >> that the guest sees and ACPI-based hotplug allows us to do that in a
> >> "standard" manner.
> >>
> >> For dom0 this doesn't seem to be necessary as it's a special domain
> >> available only to platform administrator.
> >>
> >> Part of confusion I think is because PV hotplug is not hotplug, really,
> >> as far as Linux kernel is concerned.
> > Hm, I'm not really sure I'm following, but I think that we could translate this
> > Dom0 PV hotplug mechanism to PVH as:
> >
> >  - Dom0 is provided with up to HVM_MAX_VCPUS local APIC entries in the MADT, and
> >    the entries > dom0_max_vcpus are marked as disabled.
> >  - Dom0 has HVM_MAX_VCPUS vCPUs ready to be started, either by using the local
> >    APIC or an hypercall.
> >
> > Would that match what's done for classic PV Dom0?
> 
> To match what we have for PV dom0 I believe you'd provide MADT with
> opt_dom0_max_vcpus_max entries and mark all of them enabled.
> 
> dom0 brings up all opt_dom0_max_vcpus_max VCPUs, and then offlines
> (opt_dom0_max_vcpus_min+1)..opt_dom0_max_vcpus_max. See
> drivers/xen/cpu_hotplug.c:setup_cpu_watcher(). That's why I said it's
> not a hotplug but rather on/off-lining.

But how does Dom0 get the value of opt_dom0_max_vcpus_min? It doesn't seem to
be propagated anywhere from domain_build.

Also the logic in cpu_hotplug.c is weird IMHO:

static int vcpu_online(unsigned int cpu)
{
        int err;
        char dir[16], state[16];

        sprintf(dir, "cpu/%u", cpu);
        err = xenbus_scanf(XBT_NIL, dir, "availability", "%15s", state);
        if (err != 1) {
                if (!xen_initial_domain())
                        pr_err("Unable to read cpu state\n");
                return err;
        }

        if (strcmp(state, "online") == 0)
                return 1;
        else if (strcmp(state, "offline") == 0)
                return 0;

        pr_err("unknown state(%s) on CPU%d\n", state, cpu);
        return -EINVAL;
}
[...]
static int setup_cpu_watcher(struct notifier_block *notifier,
                              unsigned long event, void *data)
{
        int cpu;
        static struct xenbus_watch cpu_watch = {
                .node = "cpu",
                .callback = handle_vcpu_hotplug_event};

        (void)register_xenbus_watch(&cpu_watch);

        for_each_possible_cpu(cpu) {
                if (vcpu_online(cpu) == 0) {
                        (void)cpu_down(cpu);
                        set_cpu_present(cpu, false);
                }
        }

        return NOTIFY_DONE;
}

xenbus_scanf should return ENOENT for Dom0, because those paths don't exist,
and the all vcpus are going to be left enabled? I'm quite sure I'm missing
something here...

Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

  reply	other threads:[~2017-01-18 10:34 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-01-12 12:13 PVH CPU hotplug design document Roger Pau Monné
2017-01-12 19:00 ` Andrew Cooper
2017-01-13  3:06   ` Boris Ostrovsky
2017-01-13 15:27   ` Jan Beulich
2017-01-16 14:59     ` Roger Pau Monné
2017-01-16 14:50   ` Roger Pau Monné
2017-01-13 15:51 ` Jan Beulich
2017-01-13 19:41   ` Stefano Stabellini
2017-01-14  1:44   ` Boris Ostrovsky
2017-01-16 11:03     ` Jan Beulich
2017-01-16 15:14   ` Roger Pau Monné
2017-01-16 16:09     ` Jan Beulich
2017-01-16 16:31       ` Roger Pau Monné
2017-01-16 16:50         ` Jan Beulich
2017-01-16 17:44           ` Roger Pau Monné
2017-01-16 18:16             ` Stefano Stabellini
2017-01-17  9:12             ` Jan Beulich
2017-01-17 11:43               ` Roger Pau Monné
2017-01-17 12:33                 ` Jan Beulich
2017-01-17 14:13                   ` Roger Pau Monné
2017-01-17 14:44                     ` Jan Beulich
2017-01-17 15:27                       ` Boris Ostrovsky
2017-01-17 15:33                         ` Jan Beulich
2017-01-17 15:50                           ` Boris Ostrovsky
2017-01-17 17:45                             ` Roger Pau Monné
2017-01-17 18:50                               ` Boris Ostrovsky
2017-01-18 10:34                                 ` Roger Pau Monné [this message]
2017-01-18 10:44                                   ` Jan Beulich
2017-01-18 11:54                                     ` Roger Pau Monné
2017-01-18 13:25                                       ` Jan Beulich
2017-01-22 18:39                                         ` Boris Ostrovsky
2017-01-23 10:35                                           ` Jan Beulich
2017-01-23 14:28                                             ` Boris Ostrovsky
2017-01-23 15:17                                               ` Jan Beulich
2017-01-23 15:43                                                 ` Boris Ostrovsky
2017-01-23 15:49                                                   ` Boris Ostrovsky
2017-01-23 15:58                                                     ` Jan Beulich
2017-01-23 16:24                                                       ` Boris Ostrovsky
2017-01-23 16:50                                                         ` Roger Pau Monné
2017-01-23 17:05                                                           ` Boris Ostrovsky
2017-01-23 17:14                                                             ` Roger Pau Monné
2017-01-23 18:36                                                               ` Boris Ostrovsky
2017-01-24  7:43                                                                 ` Jan Beulich
2017-01-24 14:15                                                                   ` Boris Ostrovsky
2017-01-23 15:49                                                   ` Jan Beulich
2017-01-18 13:34                                       ` Boris Ostrovsky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170118103418.7kuv3nhptedz4f4h@dhcp-3-221.uk.xensource.com \
    --to=roger.pau@citrix.com \
    --cc=JBeulich@suse.com \
    --cc=al.stone@linaro.org \
    --cc=andrew.cooper3@citrix.com \
    --cc=anshul.makkar@citrix.com \
    --cc=boris.ostrovsky@oracle.com \
    --cc=graeme.gregory@linaro.org \
    --cc=julien.grall@arm.com \
    --cc=konrad.wilk@oracle.com \
    --cc=sstabellini@kernel.org \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.