From: "Jan Beulich" <JBeulich@suse.com>
To: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Xen-devel <xen-devel@lists.xen.org>,
	Wei Liu <wei.liu2@citrix.com>,
	Roger Pau Monne <roger.pau@citrix.com>
Subject: Re: [PATCH 3/3] x86/smt: Support for enabling/disabling SMT at runtime
Date: Wed, 03 Apr 2019 04:44:42 -0600	[thread overview]
Message-ID: <5CA48E9A020000780022438B@prv1-mh.provo.novell.com> (raw)
In-Reply-To: <8641e436-9f65-ca4f-df24-72745d9acdb1@citrix.com>

>>> On 03.04.19 at 12:17, <andrew.cooper3@citrix.com> wrote:
> On 03/04/2019 10:33, Jan Beulich wrote:
>>>>> On 02.04.19 at 21:57, <andrew.cooper3@citrix.com> wrote:
>>> Slightly RFC.  I'm not very happy with the continuation situation, but -EBUSY
>>> is the preexisting style and it seems like it is the only option from tasklet
>>> context.
>> Well, offloading the re-invocation to the caller isn't really nice.
>> Looking at the code, is there any reason why we couldn't use
>> the usual -ERESTART / hypercall_create_continuation()? This
>> would require a little bit of re-work, in particular to allow
>> passing the vCPU into hypercall_create_continuation(), but
>> beyond that I can't see any immediate obstacles. Though
>> clearly I wouldn't make this a prerequisite for the work here.
> 
> The problem isn't really the ERESTART.  We could do some plumbing and
> make it work, but the real problem is that I can't stash the current cpu
> index in the sysctl data block across the continuation point.
> 
> At the moment, the loop depends on, once all CPUs are in the correct
> state, getting through the for_each_present_cpu() loop without taking a
> further continuation.

But these are two orthogonal things: One is how to invoke the
continuation, and the other is where the continuation is to
resume from. I think the former is more important to address,
as it affects how the tools side code needs to look.

>>> Is it intentional that we can actually online and offline processors beyond
>>> maxcpus?  This is a consequence of the cpu parking logic.
>> I think so, yes. That's meant to be a boot time limit only imo.
>> The runtime limit is nr_cpu_ids.
>>
>>> --- a/xen/arch/x86/setup.c
>>> +++ b/xen/arch/x86/setup.c
>>> @@ -60,7 +60,7 @@ static bool __initdata opt_nosmp;
>>>  boolean_param("nosmp", opt_nosmp);
>>>  
>>>  /* maxcpus: maximum number of CPUs to activate. */
>>> -static unsigned int __initdata max_cpus;
>>> +unsigned int max_cpus;
>>>  integer_param("maxcpus", max_cpus);
>> As per above I don't think this change should be needed or
>> wanted, but if so for whatever reason, wouldn't the variable
>> better be __read_mostly?
> 
> __read_mostly, yes, but as to whether the change is needed, that
> entirely depends on whether the change in semantics to maxcpus= was
> accidental or intentional.

Well, as said, I did consider this while putting together the
parking series, and I therefore consider it intentional.

>>> +    opt_smt = true;
>> Perhaps also bail early when the variable already has the
>> designated value? And again perhaps right in the sysctl
>> handler?
> 
> That is not safe across continuations.
> 
> While it would be a very silly thing to do, there could be two callers
> which are fighting over whether SMT is disabled or enabled.

Oh, and actually not just that: The continuation then wouldn't
do anything anymore (unless you first reverted the setting,
which in turn wouldn't be right in case any other CPU activity
would occur in parallel, while the continuation is still pending).

>>> +    for_each_present_cpu ( cpu )
>>> +    {
>>> +        if ( cpu == 0 )
>>> +            continue;
>> Is this special case really needed? If so, perhaps worth a brief
>> comment?
> 
> Trying to down cpu 0 is a hard -EINVAL.

But here we're on the CPU-up path. Plus, for eventually supporting
the offlining of CPU 0, it would feel slightly better if you used
smp_processor_id() here.

>>> +        if ( cpu >= max_cpus )
>>> +            break;
>>> +
>>> +        if ( x86_cpu_to_apicid[cpu] & sibling_mask )
>>> +            ret = cpu_up_helper(_p(cpu));
>> Shouldn't this be restricted to CPUs a sibling of which is already
>> online? And widened at the same time, to also online thread 0
>> if one of the other threads is already online?
> 
> Unfortunately, that turns into a rat's nest very quickly, which is
> why I gave up and simplified the semantics to strictly "this shall
> {on,off}line the non-zero sibling threads".

Okay, if that's the intention, then I can certainly live with this.
But it needs to be called out at the very least in the public header.
(It might be worthwhile setting up a flag right away for "full"
behavior, but leaving acting upon it unimplemented.) It also wouldn't
hurt if the patch description already set expectations accordingly.

Then again, considering your "maxcpus=" related question,
it would certainly be odd for people to see non-zero threads
come online here when they've intentionally left entire cores
or nodes offline for whatever reason. Arguably that's not
something to expect people would commonly do, and hence it
may not be worth wasting meaningful extra effort on. But as
said above, such "oddities" should be spelled out, so that it
is clear they're not oversights.

>> I also notice that the two functions are extremely similar, and
>> hence it might be worthwhile to consider folding them, with the
>> caller controlling the behavior via the so far unused function
>> parameter (at which point the related remark of mine on patch
>> 2 would become inapplicable).
> 
> By passing the plug boolean in via data?

Yes.

>>> --- a/xen/include/public/sysctl.h
>>> +++ b/xen/include/public/sysctl.h
>>> @@ -246,8 +246,17 @@ struct xen_sysctl_get_pmstat {
>>>  struct xen_sysctl_cpu_hotplug {
>>>      /* IN variables */
>>>      uint32_t cpu;   /* Physical cpu. */
>>> +
>>> +    /* Single CPU enable/disable. */
>>>  #define XEN_SYSCTL_CPU_HOTPLUG_ONLINE  0
>>>  #define XEN_SYSCTL_CPU_HOTPLUG_OFFLINE 1
>>> +
>>> +    /*
>>> +     * SMT enable/disable. Caller must zero the 'cpu' field to begin, and
>>> +     * ignore it on completion.
>>> +     */
>>> +#define XEN_SYSCTL_CPU_HOTPLUG_SMT_ENABLE  2
>>> +#define XEN_SYSCTL_CPU_HOTPLUG_SMT_DISABLE 3
>> Is the "cpu" field constraint mentioned in the comment just a
>> precaution? I can't see you encode anything into that field, or
>> use it upon getting re-invoked. I assume that's because of the
>> expectation that only actual onlining/offlining would potentially
>> take long, while iterating over all present CPUs without further
>> action ought to be fast enough.
> 
> Ah - that was stale from before I encountered the "fun" of continuations
> from tasklet context.
> 
> I would prefer to find a better way, but short of doing a full vcpu
> context switch, I don't see an option.

And I don't think there's a strong need. It should just be made
clear (again in the description) that the remark here is just a
precaution at this time, unless you want to drop it altogether.

One thing you may want to do though:

        /* Tolerate already-online siblings. */
        if ( ret == -EEXIST )
        {
            ret = 0;
            continue;
        }

to bypass the general_preempt_check() in that case, such
that you can guarantee making forward progress.

Jan


