From: Juergen Gross <jgross@suse.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>, Wei Liu <wl@xen.org>,
	Konrad Wilk <konrad.wilk@oracle.com>,
	George Dunlap <George.Dunlap@eu.citrix.com>,
	Andrew Cooper <andrew.cooper3@citrix.com>,
	Tim Deegan <tim@xen.org>, Dario Faggioli <dfaggioli@suse.com>,
	Julien Grall <julien.grall@arm.com>,
	Meng Xu <mengxu@cis.upenn.edu>,
	"xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>,
	Ian Jackson <ian.jackson@citrix.com>,
	Roger Pau Monne <roger.pau@citrix.com>
Subject: Re: [Xen-devel] [PATCH 13/60] xen/sched: move some per-vcpu items to struct sched_unit
Date: Tue, 2 Jul 2019 10:44:47 +0200
Message-ID: <64f37be6-1dd4-60fb-fe23-d9d4e9e3cf63@suse.com>
In-Reply-To: <948bef8c-79bb-a5e1-f510-91ce95c4082f@suse.com>

On 02.07.19 10:27, Jan Beulich wrote:
> On 02.07.2019 10:14, Juergen Gross wrote:
>> On 02.07.19 09:54, Jan Beulich wrote:
>>> And again - if someone pins every vCPU to a single pCPU, that last
>>> such pinning operation will be what takes long term effect. Aiui all
>>> vCPU-s in the unit will then be pinned to that one pCPU, i.e.
>>> they'll either all compete for the one pCPU's time, or only one of
>>> them will ever get scheduled.
>>
>> No, that's not how it works. Let's say we have a system with the
>> following topology and core scheduling active:
>>
>> cpu0: core 0, thread 0
>> cpu1: core 0, thread 1
>> cpu2: core 1, thread 0
>> cpu3: core 1, thread 1
>>
>> Then any even numbered vcpu will only ever be scheduled on cpu0 or cpu2,
>> while any odd numbered vcpu will only run on cpu1 or cpu3.
>>
>> So virtual cores get scheduled on physical cores. Virtual thread 0 will
>> only run on physical thread 0 and the associated virtual thread 1 will
>> run on the associated physical thread 1 of the same physical core.
>>
>> Pinning a virtual thread 1 to a physical thread 0 is not possible (in
>> reality only the virtual core is being pinned to the physical core).
> 
> But that's what existing guests may be doing. You may want to
> take a look at our old, non-pvops kernels, in particular the
> functionality provided by their drivers/xen/core/domctl.c. Yes,
> {sys,dom}ctl-s aren't supposed to be used by the kernel, but to
> achieve the intended effects I saw no way around (ab)using them.
> (I mean this to be taken as an example only - I realize that the
> code there wouldn't work on modern Xen without updating, due to
> the sysctl/domctl interface version that needs setting.)

First of all, speaking of "guests" is a bit misleading here: this is
restricted to dom0.

So if you want to use such a dom0 kernel with Xen 4.13 or later, you'd
need to stay with cpu scheduling. Any recent kernel will run just fine
as dom0 with core scheduling active.
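
Coming back to the topology example above, here is a tiny standalone
illustration of that mapping (not Xen code; the 2-core/2-thread topology
and the granularity of 2 are just the assumptions from the example):

/* With core scheduling and 2 threads per core, vcpu n can only ever be
 * scheduled on pcpus whose thread index equals n % 2. */
#include <stdio.h>

#define THREADS_PER_CORE 2   /* scheduling granularity in this example */
#define NR_CPUS          4   /* cpu = core * THREADS_PER_CORE + thread */

int main(void)
{
    for ( unsigned int v = 0; v < 4; v++ )
    {
        printf("vcpu%u can run on:", v);
        for ( unsigned int cpu = 0; cpu < NR_CPUS; cpu++ )
            if ( cpu % THREADS_PER_CORE == v % THREADS_PER_CORE )
                printf(" cpu%u", cpu);
        printf("\n");
    }

    return 0;
}

This prints cpu0/cpu2 for the even numbered vcpus and cpu1/cpu3 for the
odd numbered ones, matching the description above.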

> 
>>>>>>>> --- a/xen/include/xen/sched-if.h
>>>>>>>> +++ b/xen/include/xen/sched-if.h
>>>>>>>> @@ -438,11 +438,11 @@ static inline cpumask_t* cpupool_domain_cpumask(struct domain *d)
>>>>>>>>       * * The hard affinity is not a subset of soft affinity
>>>>>>>>       * * There is an overlap between the soft and hard affinity masks
>>>>>>>>       */
>>>>>>>> -static inline int has_soft_affinity(const struct vcpu *v)
>>>>>>>> +static inline int has_soft_affinity(const struct sched_unit *unit)
>>>>>>>>      {
>>>>>>>> -    return v->soft_aff_effective &&
>>>>>>>> -           !cpumask_subset(cpupool_domain_cpumask(v->domain),
>>>>>>>> -                           v->cpu_soft_affinity);
>>>>>>>> +    return unit->soft_aff_effective &&
>>>>>>>> +           !cpumask_subset(cpupool_domain_cpumask(unit->vcpu->domain),
>>>>>>>> +                           unit->cpu_soft_affinity);
>>>>>>>>      }
>>>>>>>
>>>>>>> Okay, at the moment there looks to be a 1:1 relationship between sched
>>>>>>> units and vCPU-s. This would - at this point of the series - invalidate most of
>>>>>>> my earlier comments. However, in patch 57 I don't see how this unit->vcpu
>>>>>>> mapping would get broken, and I can't seem to identify any other patch
>>>>>>> where this might be happening. Looking at the github branch I also get the
>>>>>>> impression that the struct vcpu * pointer out of struct sched_unit survives
>>>>>>> until the end of the series, which doesn't seem right to me.
>>>>>>
>>>>>> It is right. The vcpu pointer in the sched_unit is pointing to the first
>>>>>> vcpu of the unit at the end of the series. Further vcpus are found via
>>>>>> v->next_in_list.
>>>>>
>>>>> I'm afraid this sets us up for misunderstanding and misuse. I don't
>>>>> think there should be a straight struct vcpu * out of struct sched_unit.
>>>>
>>>> That was the most effective way to do it. What are you suggesting?
>>>
>>> An actual list, i.e. with a struct list_head. That'll make it obvious that
>>> more than one vCPU might be associated with a unit. That's all the more
>>> important since the ability to associate more than one appears only quite
>>> late in the series, i.e. there may be further instances like the code above,
>>> and it would require a careful audit (rather than the compiler finding such
>>> instances) to determine all places where using the first vCPU in a unit
>>> isn't really what was meant.
>>
>> TBH I don't see how this would help at all.
>>
>> Instead of using the vcpu pointer I'd have to use the list head.
>> How is that different from a plain pointer when it comes to finding the
>> places where using the first vcpu is wrong?
> 
> Take the example above: Is it correct to act on just the first vCPU?
> I guess _here_ it is, but the same pattern could be found elsewhere.
> If, from the beginning, you use a clearly identifiable list construct,
> then it'll be obvious to you as the writer and to reviewers that by
> the end of the series there may be multiple entities that need dealing
> with - we'd see list_first*() or for_each*() constructs right away
> (and you wouldn't be able to circumvent their use in a way that
> wouldn't trigger "don't open-code" comments).

Would you be fine with just renaming the pointer to "vcpu_list"?
That would avoid the need to introduce another vcpu list in struct vcpu,
and I could reuse the already existing list as I do today.

Note that I originally named the pointer just "vcpu" in order to avoid
lots of reformatting due to longer lines.
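
Just to make that concrete, here is a rough standalone sketch of what
iterating the vcpus of a unit could then look like (names are
illustrative only, not necessarily what the series will end up using):

struct vcpu {
    unsigned int vcpu_id;
    struct vcpu *next_in_list;     /* existing per-domain linkage */
};

struct sched_unit {
    struct vcpu *vcpu_list;        /* first vcpu of the unit */
};

/* Iterate the vcpus of a unit; a real implementation would additionally
 * have to stop at the end of the unit rather than at the end of the
 * domain's vcpu list. */
#define for_each_sched_unit_vcpu(u, v) \
    for ( (v) = (u)->vcpu_list; (v) != NULL; (v) = (v)->next_in_list )

static inline unsigned int unit_vcpu_count(const struct sched_unit *u)
{
    const struct vcpu *v;
    unsigned int n = 0;

    for_each_sched_unit_vcpu(u, v)
        n++;

    return n;
}

With an explicit struct list_head instead, each such place would have to
use list_first_entry() or list_for_each_entry(), making the 1:n relation
visible at every use site, but at the cost of an additional list linkage
in struct vcpu.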


Juergen
