[Xen-devel] [PATCH v5 20/19] docs: add "sched-gran" boot parameter documentation

* [Xen-devel] [PATCH v5 20/19] docs: add "sched-gran" boot parameter documentation
@ 2019-09-30 10:09 Juergen Gross
  2019-09-30 10:25 ` Jan Beulich
  0 siblings, 1 reply; 10+ messages in thread
From: Juergen Gross @ 2019-09-30 10:09 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Tim Deegan, Julien Grall, Jan Beulich

Add documentation for the new "sched-gran" hypervisor boot parameter.

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 docs/misc/xen-command-line.pandoc | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
index fc64429064..c855246050 100644
--- a/docs/misc/xen-command-line.pandoc
+++ b/docs/misc/xen-command-line.pandoc
@@ -1782,6 +1782,27 @@ Set the timeslice of the credit1 scheduler, in milliseconds.  The
 default is 30ms.  Reasonable values may include 10, 5, or even 1 for
 very latency-sensitive workloads.
 
+### sched-gran (x86)
+> `= cpu | core | socket`
+
+> Default: `sched-gran=cpu`
+
+Set the scheduling granularity. In case the granularity is larger than 1 (e.g.
+`core` on an SMT-enabled system, or `socket`) multiple vcpus are assigned
+statically to a "scheduling unit" which will then be subject to scheduling.
+This assignment of vcpus to scheduling units is fixed.
+
+`cpu`: Vcpus will be scheduled individually on single cpus.
+
+`core`: As many vcpus as there are hyperthreads on a physical core are
+scheduled together on a physical core.
+
+`socket`: As many vcpus as there are hyperthreads on a physical socket are
+scheduled together on a physical socket.
+
+Note: a value other than `cpu` will result in rejecting a runtime modification
+of the "smt" setting.
+
 ### sched_ratelimit_us
 > `= <integer>`
 
-- 
2.16.4
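
To make the documented behaviour concrete, here is a small Python sketch of
the static vcpu-to-unit assignment the patch describes. It is an illustrative
model only, not hypervisor code: the names `build_units` and
`smt_change_allowed` are hypothetical, and grouping vcpus by consecutive id is
an assumption (the patch text only says the assignment is static and fixed).

```python
from typing import List


def build_units(num_vcpus: int, cpus_per_unit: int) -> List[List[int]]:
    """Group vcpu ids into fixed scheduling units.

    cpus_per_unit models the granularity: 1 for sched-gran=cpu, the number
    of cpus per core for sched-gran=core, and the number of cpus per socket
    for sched-gran=socket.
    Assumption: vcpus are grouped by consecutive id.
    """
    if cpus_per_unit < 1:
        raise ValueError("cpus_per_unit must be at least 1")
    return [
        list(range(start, min(start + cpus_per_unit, num_vcpus)))
        for start in range(0, num_vcpus, cpus_per_unit)
    ]


def smt_change_allowed(sched_gran: str) -> bool:
    """Model of the note in the patch: an attempt to modify the "smt"
    setting at runtime is rejected unless sched-gran=cpu."""
    return sched_gran == "cpu"
```

With two cpus per core, `build_units(4, 2)` yields two units of two vcpus
each, while `build_units(4, 1)` models the default `cpu` granularity where
every vcpu is its own unit.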


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

* Re: [Xen-devel] [PATCH v5 20/19] docs: add "sched-gran" boot parameter documentation
  2019-09-30 10:09 [Xen-devel] [PATCH v5 20/19] docs: add "sched-gran" boot parameter documentation Juergen Gross
@ 2019-09-30 10:25 ` Jan Beulich
  2019-09-30 10:51   ` Jürgen Groß
  0 siblings, 1 reply; 10+ messages in thread
From: Jan Beulich @ 2019-09-30 10:25 UTC (permalink / raw)
  To: Juergen Gross
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, xen-devel

On 30.09.2019 12:09, Juergen Gross wrote:
> Add documentation for the new "sched-gran" hypervisor boot parameter.
> 
> Signed-off-by: Juergen Gross <jgross@suse.com>
> ---
>  docs/misc/xen-command-line.pandoc | 21 +++++++++++++++++++++
>  1 file changed, 21 insertions(+)
> 
> diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
> index fc64429064..c855246050 100644
> --- a/docs/misc/xen-command-line.pandoc
> +++ b/docs/misc/xen-command-line.pandoc
> @@ -1782,6 +1782,27 @@ Set the timeslice of the credit1 scheduler, in milliseconds.  The
>  default is 30ms.  Reasonable values may include 10, 5, or even 1 for
>  very latency-sensitive workloads.
>  
> +### sched-gran (x86)
> +> `= cpu | core | socket`
> +
> +> Default: `sched-gran=cpu`
> +
> +Set the scheduling granularity. In case the granularity is larger than 1 (e.g.
> +`core`on a SMT-enabled system, or `socket`) multiple vcpus are assigned
> +statically to a "scheduling unit" which will then be subject to scheduling.
> +This assignment of vcpus to scheduling units is fixed.
> +
> +`cpu`: Vcpus will be scheduled individually on single cpus.
> +
> +`core`: As many vcpus as there are hyperthreads on a physical core are
> +scheduled together on a physical core.
> +
> +`socket`: As many vcpus as there are hyperthreads on a physical sockets are
> +scheduled together on a physical socket.

I'd prefer if this didn't end up Intel-centric; ideally it also wouldn't be
x86-specific. AMD has introduced hyperthreading in Fam17 only; Fam15 used
"compute units", grouping together "cores". Internally the Intel side
"core vs hyperthread" is represented in the same variables (cpu_sibling_mask
in particular) as the AMD side "compute unit vs core".
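
As a rough illustration of that point (a Python sketch, not Xen code), the
same sibling bitmask can be read with two vocabularies. The table
`SIBLING_GROUP_LABELS` and the function `describe_sibling_group` are
hypothetical names; only the two vendor pairings come from the paragraph
above.

```python
# Hypothetical label table: (name of a mask member, name of the group).
SIBLING_GROUP_LABELS = {
    "intel": ("hyperthread", "core"),
    "amd-fam15": ("core", "compute unit"),
}


def describe_sibling_group(platform: str, sibling_mask: int) -> str:
    """Describe one sibling mask using the platform's own terminology."""
    member, group = SIBLING_GROUP_LABELS[platform]
    count = bin(sibling_mask).count("1")  # number of cpus set in the mask
    suffix = "" if count == 1 else "s"
    return f"{count} {member}{suffix} sharing one {group}"
```

The mask value is identical in both cases; only the words attached to it
differ, which is why "core" in an option name can mean different things to
different readers.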

Therefore it may be better to talk here about e.g. "smallest topological
sub-unit" and only say "e.g. a hyperthread to make a connection to common
x86 / Intel terminology". Of course the AMD side alternative use of the
variables also renders the actual command line option "sched-gran=core"
not overly fortunate. Perhaps we'd want to also use more abstract terms
here, e.g. topological "levels"?

> +Note: a value other than `cpu` will result in rejecting a runtime modification
> +of the "smt" setting.

Perhaps add "attempt" here?

Jan

* Re: [Xen-devel] [PATCH v5 20/19] docs: add "sched-gran" boot parameter documentation
  2019-09-30 10:25 ` Jan Beulich
@ 2019-09-30 10:51   ` Jürgen Groß
  2019-09-30 11:02     ` Jan Beulich
  0 siblings, 1 reply; 10+ messages in thread
From: Jürgen Groß @ 2019-09-30 10:51 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, xen-devel

On 30.09.19 12:25, Jan Beulich wrote:
> On 30.09.2019 12:09, Juergen Gross wrote:
>> Add documentation for the new "sched-gran" hypervisor boot parameter.
>>
>> Signed-off-by: Juergen Gross <jgross@suse.com>
>> ---
>>   docs/misc/xen-command-line.pandoc | 21 +++++++++++++++++++++
>>   1 file changed, 21 insertions(+)
>>
>> diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
>> index fc64429064..c855246050 100644
>> --- a/docs/misc/xen-command-line.pandoc
>> +++ b/docs/misc/xen-command-line.pandoc
>> @@ -1782,6 +1782,27 @@ Set the timeslice of the credit1 scheduler, in milliseconds.  The
>>   default is 30ms.  Reasonable values may include 10, 5, or even 1 for
>>   very latency-sensitive workloads.
>>   
>> +### sched-gran (x86)
>> +> `= cpu | core | socket`
>> +
>> +> Default: `sched-gran=cpu`
>> +
>> +Set the scheduling granularity. In case the granularity is larger than 1 (e.g.
>> +`core`on a SMT-enabled system, or `socket`) multiple vcpus are assigned
>> +statically to a "scheduling unit" which will then be subject to scheduling.
>> +This assignment of vcpus to scheduling units is fixed.
>> +
>> +`cpu`: Vcpus will be scheduled individually on single cpus.
>> +
>> +`core`: As many vcpus as there are hyperthreads on a physical core are
>> +scheduled together on a physical core.
>> +
>> +`socket`: As many vcpus as there are hyperthreads on a physical sockets are
>> +scheduled together on a physical socket.
> 
> I'd prefer if this didn't end up Intel-centric; ideally it also wouldn't be
> x86-specific. AMD has introduced hyperthreading in Fam17 only; Fam15 used
> "compute units", grouping together "cores". Internally the Intel side
> "core vs hyperthread" is represented in the same variables (cpu_sibling_mask
> in particular) as the AMD side "compute unit vs core".

Yes, it is a mess.

> Therefore it may be better to talk here about e.g. "smallest topological
> sub-unit" and only say "e.g. a hyperthread to make a connection to common
> x86 / Intel terminology". Of course the AMD side alternative use of the
> variables also renders the actual command line option "sched-gran=core"
> not overly fortunate. Perhaps we'd want to also use more abstract terms
> here, e.g. topological "levels"?

I think regarding usage of "hyperthreads" I'll go with:

+`cpu`: Vcpus will be scheduled individually on single cpus (e.g. a
+ hyperthread using x86/Intel terminology)
+
+ `core`: As many vcpus as there are cpus on a physical core are
+ scheduled together on a physical core.
...

I think using "core" is fine. We have it in multiple places in the
hypervisor which are _not_ specific to Intel. And "core-scheduling" is
a well-known buzzword already.

> 
>> +Note: a value other than `cpu` will result in rejecting a runtime modification
>> +of the "smt" setting.
> 
> Perhaps add "attempt" here?

Yes.


Juergen

* Re: [Xen-devel] [PATCH v5 20/19] docs: add "sched-gran" boot parameter documentation
  2019-09-30 10:51   ` Jürgen Groß
@ 2019-09-30 11:02     ` Jan Beulich
  2019-09-30 11:13       ` Jürgen Groß
  0 siblings, 1 reply; 10+ messages in thread
From: Jan Beulich @ 2019-09-30 11:02 UTC (permalink / raw)
  To: Jürgen Groß
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, xen-devel

On 30.09.2019 12:51, Jürgen Groß wrote:
> On 30.09.19 12:25, Jan Beulich wrote:
>> On 30.09.2019 12:09, Juergen Gross wrote:
>>> Add documentation for the new "sched-gran" hypervisor boot parameter.
>>>
>>> Signed-off-by: Juergen Gross <jgross@suse.com>
>>> ---
>>>   docs/misc/xen-command-line.pandoc | 21 +++++++++++++++++++++
>>>   1 file changed, 21 insertions(+)
>>>
>>> diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
>>> index fc64429064..c855246050 100644
>>> --- a/docs/misc/xen-command-line.pandoc
>>> +++ b/docs/misc/xen-command-line.pandoc
>>> @@ -1782,6 +1782,27 @@ Set the timeslice of the credit1 scheduler, in milliseconds.  The
>>>   default is 30ms.  Reasonable values may include 10, 5, or even 1 for
>>>   very latency-sensitive workloads.
>>>   
>>> +### sched-gran (x86)
>>> +> `= cpu | core | socket`
>>> +
>>> +> Default: `sched-gran=cpu`
>>> +
>>> +Set the scheduling granularity. In case the granularity is larger than 1 (e.g.
>>> +`core`on a SMT-enabled system, or `socket`) multiple vcpus are assigned
>>> +statically to a "scheduling unit" which will then be subject to scheduling.
>>> +This assignment of vcpus to scheduling units is fixed.
>>> +
>>> +`cpu`: Vcpus will be scheduled individually on single cpus.
>>> +
>>> +`core`: As many vcpus as there are hyperthreads on a physical core are
>>> +scheduled together on a physical core.
>>> +
>>> +`socket`: As many vcpus as there are hyperthreads on a physical sockets are
>>> +scheduled together on a physical socket.
>>
>> I'd prefer if this didn't end up Intel-centric; ideally it also wouldn't be
>> x86-specific. AMD has introduced hyperthreading in Fam17 only; Fam15 used
>> "compute units", grouping together "cores". Internally the Intel side
>> "core vs hyperthread" is represented in the same variables (cpu_sibling_mask
>> in particular) as the AMD side "compute unit vs core".
> 
> Yes, it is a mess.
> 
>> Therefore it may be better to talk here about e.g. "smallest topological
>> sub-unit" and only say "e.g. a hyperthread to make a connection to common
>> x86 / Intel terminology". Of course the AMD side alternative use of the
>> variables also renders the actual command line option "sched-gran=core"
>> not overly fortunate. Perhaps we'd want to also use more abstract terms
>> here, e.g. topological "levels"?
> 
> I think regarding usage of "hyperthreads" I'll go with:
> 
> +`cpu`: Vcpus will be scheduled individually on single cpus (e.g. a
> + hyperthread using x86/Intel terminology)
> +
> + `core`: As many vcpus as there are cpus on a physical core are
> + scheduled together on a physical core.
> ...
> 
> I think using "core" is fine. We have it in multiple places in the
> hypervisor which are _not_ specific to Intel.

Well, what we have in hypervisor sources is one thing - we can
settle on any convention we want there. It's the user (admin)
interface (i.e. the command line option name and description
here) which we may want to be a little more careful with. But
yes, I can see how we use "core" already in similar contexts
in the command line option doc, first and foremost on
"credit2_runqueue". (In retrospect I think this might have been
a mistake though.)

> And "core-scheduling" is a well-known buzzword already.

Let me not get started on buzzwords ;-)

Jan

* Re: [Xen-devel] [PATCH v5 20/19] docs: add "sched-gran" boot parameter documentation
  2019-09-30 11:02     ` Jan Beulich
@ 2019-09-30 11:13       ` Jürgen Groß
  2019-09-30 11:20         ` Jan Beulich
  0 siblings, 1 reply; 10+ messages in thread
From: Jürgen Groß @ 2019-09-30 11:13 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, xen-devel

On 30.09.19 13:02, Jan Beulich wrote:
> On 30.09.2019 12:51, Jürgen Groß wrote:
>> On 30.09.19 12:25, Jan Beulich wrote:
>>> On 30.09.2019 12:09, Juergen Gross wrote:
>>>> Add documentation for the new "sched-gran" hypervisor boot parameter.
>>>>
>>>> Signed-off-by: Juergen Gross <jgross@suse.com>
>>>> ---
>>>>    docs/misc/xen-command-line.pandoc | 21 +++++++++++++++++++++
>>>>    1 file changed, 21 insertions(+)
>>>>
>>>> diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
>>>> index fc64429064..c855246050 100644
>>>> --- a/docs/misc/xen-command-line.pandoc
>>>> +++ b/docs/misc/xen-command-line.pandoc
>>>> @@ -1782,6 +1782,27 @@ Set the timeslice of the credit1 scheduler, in milliseconds.  The
>>>>    default is 30ms.  Reasonable values may include 10, 5, or even 1 for
>>>>    very latency-sensitive workloads.
>>>>    
>>>> +### sched-gran (x86)
>>>> +> `= cpu | core | socket`
>>>> +
>>>> +> Default: `sched-gran=cpu`
>>>> +
>>>> +Set the scheduling granularity. In case the granularity is larger than 1 (e.g.
>>>> +`core`on a SMT-enabled system, or `socket`) multiple vcpus are assigned
>>>> +statically to a "scheduling unit" which will then be subject to scheduling.
>>>> +This assignment of vcpus to scheduling units is fixed.
>>>> +
>>>> +`cpu`: Vcpus will be scheduled individually on single cpus.
>>>> +
>>>> +`core`: As many vcpus as there are hyperthreads on a physical core are
>>>> +scheduled together on a physical core.
>>>> +
>>>> +`socket`: As many vcpus as there are hyperthreads on a physical sockets are
>>>> +scheduled together on a physical socket.
>>>
>>> I'd prefer if this didn't end up Intel-centric; ideally it also wouldn't be
>>> x86-specific. AMD has introduced hyperthreading in Fam17 only; Fam15 used
>>> "compute units", grouping together "cores". Internally the Intel side
>>> "core vs hyperthread" is represented in the same variables (cpu_sibling_mask
>>> in particular) as the AMD side "compute unit vs core".
>>
>> Yes, it is a mess.
>>
>>> Therefore it may be better to talk here about e.g. "smallest topological
>>> sub-unit" and only say "e.g. a hyperthread to make a connection to common
>>> x86 / Intel terminology". Of course the AMD side alternative use of the
>>> variables also renders the actual command line option "sched-gran=core"
>>> not overly fortunate. Perhaps we'd want to also use more abstract terms
>>> here, e.g. topological "levels"?
>>
>> I think regarding usage of "hyperthreads" I'll go with:
>>
>> +`cpu`: Vcpus will be scheduled individually on single cpus (e.g. a
>> + hyperthread using x86/Intel terminology)
>> +
>> + `core`: As many vcpus as there are cpus on a physical core are
>> + scheduled together on a physical core.
>> ...
>>
>> I think using "core" is fine. We have it in multiple places in the
>> hypervisor which are _not_ specific to Intel.
> 
> Well, what we have in hypervisor sources is one thing - we can
> settle on any convention we want there. It's the user (admin)
> interface (i.e. the command line option name and description
> here) which we may want to be a little more careful with. But
> yes, I can see how we use "core" already in similar contexts
> in the command line option doc, first and foremost on
> "credit2_runqueue". (In retrospect I think this might have been
> a mistake though.)

So what do you suggest?

<Irony on>
"topology-level-just-above-the-smallest-topological-sub-unit"?
<Irony-off>

I can't think of any sensible terminology not resulting in something
which is much harder to understand than "core".

And we are using "core" or "cores" in hypervisor messages, too.

>> And "core-scheduling" is a well-known buzzword already.
> 
> Let me not get started on buzzwords ;-)

:-)


Juergen


* Re: [Xen-devel] [PATCH v5 20/19] docs: add "sched-gran" boot parameter documentation
  2019-09-30 11:13       ` Jürgen Groß
@ 2019-09-30 11:20         ` Jan Beulich
  2019-09-30 11:26           ` Jürgen Groß
  0 siblings, 1 reply; 10+ messages in thread
From: Jan Beulich @ 2019-09-30 11:20 UTC (permalink / raw)
  To: Jürgen Groß
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, xen-devel

On 30.09.2019 13:13, Jürgen Groß wrote:
> On 30.09.19 13:02, Jan Beulich wrote:
>> On 30.09.2019 12:51, Jürgen Groß wrote:
>>> On 30.09.19 12:25, Jan Beulich wrote:
>>>> On 30.09.2019 12:09, Juergen Gross wrote:
>>>>> Add documentation for the new "sched-gran" hypervisor boot parameter.
>>>>>
>>>>> Signed-off-by: Juergen Gross <jgross@suse.com>
>>>>> ---
>>>>>    docs/misc/xen-command-line.pandoc | 21 +++++++++++++++++++++
>>>>>    1 file changed, 21 insertions(+)
>>>>>
>>>>> diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
>>>>> index fc64429064..c855246050 100644
>>>>> --- a/docs/misc/xen-command-line.pandoc
>>>>> +++ b/docs/misc/xen-command-line.pandoc
>>>>> @@ -1782,6 +1782,27 @@ Set the timeslice of the credit1 scheduler, in milliseconds.  The
>>>>>    default is 30ms.  Reasonable values may include 10, 5, or even 1 for
>>>>>    very latency-sensitive workloads.
>>>>>    
>>>>> +### sched-gran (x86)
>>>>> +> `= cpu | core | socket`
>>>>> +
>>>>> +> Default: `sched-gran=cpu`
>>>>> +
>>>>> +Set the scheduling granularity. In case the granularity is larger than 1 (e.g.
>>>>> +`core`on a SMT-enabled system, or `socket`) multiple vcpus are assigned
>>>>> +statically to a "scheduling unit" which will then be subject to scheduling.
>>>>> +This assignment of vcpus to scheduling units is fixed.
>>>>> +
>>>>> +`cpu`: Vcpus will be scheduled individually on single cpus.
>>>>> +
>>>>> +`core`: As many vcpus as there are hyperthreads on a physical core are
>>>>> +scheduled together on a physical core.
>>>>> +
>>>>> +`socket`: As many vcpus as there are hyperthreads on a physical sockets are
>>>>> +scheduled together on a physical socket.
>>>>
>>>> I'd prefer if this didn't end up Intel-centric; ideally it also wouldn't be
>>>> x86-specific. AMD has introduced hyperthreading in Fam17 only; Fam15 used
>>>> "compute units", grouping together "cores". Internally the Intel side
>>>> "core vs hyperthread" is represented in the same variables (cpu_sibling_mask
>>>> in particular) as the AMD side "compute unit vs core".
>>>
>>> Yes, it is a mess.
>>>
>>>> Therefore it may be better to talk here about e.g. "smallest topological
>>>> sub-unit" and only say "e.g. a hyperthread to make a connection to common
>>>> x86 / Intel terminology". Of course the AMD side alternative use of the
>>>> variables also renders the actual command line option "sched-gran=core"
>>>> not overly fortunate. Perhaps we'd want to also use more abstract terms
>>>> here, e.g. topological "levels"?
>>>
>>> I think regarding usage of "hyperthreads" I'll go with:
>>>
>>> +`cpu`: Vcpus will be scheduled individually on single cpus (e.g. a
>>> + hyperthread using x86/Intel terminology)
>>> +
>>> + `core`: As many vcpus as there are cpus on a physical core are
>>> + scheduled together on a physical core.
>>> ...
>>>
>>> I think using "core" is fine. We have it in multiple places in the
>>> hypervisor which are _not_ specific to Intel.
>>
>> Well, what we have in hypervisor sources is one thing - we can
>> settle on any convention we want there. It's the user (admin)
>> interface (i.e. the command line option name and description
>> here) which we may want to be a little more careful with. But
>> yes, I can see how we use "core" already in similar contexts
>> in the command line option doc, first and foremost on
>> "credit2_runqueue". (In retrospect I think this might have been
>> a mistake though.)
> 
> So what do you suggest?
> 
> <Irony on>
> "topology-level-just-above-the-smallest-topological-sub-unit"?
> <Irony-off>
> 
> I can't think of any sensible terminology not resulting in something
> which is much harder to understand than "core".

Ideally I'd like us to have an arch-independent way of
expressing things - "socket" and "node" look to be common enough,
so perhaps wouldn't need further abstraction, but sub-socket
granularities could perhaps be expressed as "level1" or "level2"?
And then there could be context sensitive meanings of "core",
"cu", and perhaps (in the future) "die".

My concern is that AMD-focused people may, when using "core", not
get what they'd expect (and this concern extends to the existing
uses of "core"). IOW "context sensitive" above would assign
different meaning to "core" depending on the hardware we run on.
Granted I can also see how this might confuse people other than
the example AMD-focused ones.

> And we are using "core" or "cores" in hypervisor messages, too.

That's still slightly different though.

Jan

* Re: [Xen-devel] [PATCH v5 20/19] docs: add "sched-gran" boot parameter documentation
  2019-09-30 11:20         ` Jan Beulich
@ 2019-09-30 11:26           ` Jürgen Groß
  2019-09-30 11:45             ` George Dunlap
  2019-09-30 12:38             ` Jan Beulich
  0 siblings, 2 replies; 10+ messages in thread
From: Jürgen Groß @ 2019-09-30 11:26 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, xen-devel

On 30.09.19 13:20, Jan Beulich wrote:
> On 30.09.2019 13:13, Jürgen Groß wrote:
>> On 30.09.19 13:02, Jan Beulich wrote:
>>> On 30.09.2019 12:51, Jürgen Groß wrote:
>>>> On 30.09.19 12:25, Jan Beulich wrote:
>>>>> On 30.09.2019 12:09, Juergen Gross wrote:
>>>>>> Add documentation for the new "sched-gran" hypervisor boot parameter.
>>>>>>
>>>>>> Signed-off-by: Juergen Gross <jgross@suse.com>
>>>>>> ---
>>>>>>     docs/misc/xen-command-line.pandoc | 21 +++++++++++++++++++++
>>>>>>     1 file changed, 21 insertions(+)
>>>>>>
>>>>>> diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
>>>>>> index fc64429064..c855246050 100644
>>>>>> --- a/docs/misc/xen-command-line.pandoc
>>>>>> +++ b/docs/misc/xen-command-line.pandoc
>>>>>> @@ -1782,6 +1782,27 @@ Set the timeslice of the credit1 scheduler, in milliseconds.  The
>>>>>>     default is 30ms.  Reasonable values may include 10, 5, or even 1 for
>>>>>>     very latency-sensitive workloads.
>>>>>>     
>>>>>> +### sched-gran (x86)
>>>>>> +> `= cpu | core | socket`
>>>>>> +
>>>>>> +> Default: `sched-gran=cpu`
>>>>>> +
>>>>>> +Set the scheduling granularity. In case the granularity is larger than 1 (e.g.
>>>>>> +`core`on a SMT-enabled system, or `socket`) multiple vcpus are assigned
>>>>>> +statically to a "scheduling unit" which will then be subject to scheduling.
>>>>>> +This assignment of vcpus to scheduling units is fixed.
>>>>>> +
>>>>>> +`cpu`: Vcpus will be scheduled individually on single cpus.
>>>>>> +
>>>>>> +`core`: As many vcpus as there are hyperthreads on a physical core are
>>>>>> +scheduled together on a physical core.
>>>>>> +
>>>>>> +`socket`: As many vcpus as there are hyperthreads on a physical sockets are
>>>>>> +scheduled together on a physical socket.
>>>>>
>>>>> I'd prefer if this didn't end up Intel-centric; ideally it also wouldn't be
>>>>> x86-specific. AMD has introduced hyperthreading in Fam17 only; Fam15 used
>>>>> "compute units", grouping together "cores". Internally the Intel side
>>>>> "core vs hyperthread" is represented in the same variables (cpu_sibling_mask
>>>>> in particular) as the AMD side "compute unit vs core".
>>>>
>>>> Yes, it is a mess.
>>>>
>>>>> Therefore it may be better to talk here about e.g. "smallest topological
>>>>> sub-unit" and only say "e.g. a hyperthread to make a connection to common
>>>>> x86 / Intel terminology". Of course the AMD side alternative use of the
>>>>> variables also renders the actual command line option "sched-gran=core"
>>>>> not overly fortunate. Perhaps we'd want to also use more abstract terms
>>>>> here, e.g. topological "levels"?
>>>>
>>>> I think regarding usage of "hyperthreads" I'll go with:
>>>>
>>>> +`cpu`: Vcpus will be scheduled individually on single cpus (e.g. a
>>>> + hyperthread using x86/Intel terminology)
>>>> +
>>>> + `core`: As many vcpus as there are cpus on a physical core are
>>>> + scheduled together on a physical core.
>>>> ...
>>>>
>>>> I think using "core" is fine. We have it in multiple places in the
>>>> hypervisor which are _not_ specific to Intel.
>>>
>>> Well, what we have in hypervisor sources is one thing - we can
>>> settle on any convention we want there. It's the user (admin)
>>> interface (i.e. the command line option name and description
>>> here) which we may want to be a little more careful with. But
>>> yes, I can see how we use "core" already in similar contexts
>>> in the command line option doc, first and foremost on
>>> "credit2_runqueue". (In retrospect I think this might have been
>>> a mistake though.)
>>
>> So what do you suggest?
>>
>> <Irony on>
>> "topology-level-just-above-the-smallest-topological-sub-unit"?
>> <Irony-off>
>>
>> I can't think of any sensible terminology not resulting in something
>> which is much harder to understand than "core".
> 
> Ideally I'd like us to have an arch-independent way of
> expressing things - "socket" and "node" look to be common enough,
> so perhaps wouldn't need further abstraction, but sub-socket
> granularities could perhaps be expressed as "level1" or "level2"?
> And then there could be context sensitive meanings of "core",
> "cu", and perhaps (in the future) "die".
> 
> My concern is that AMD-focused people may, when using "core", not
> get what they'd expect (and this concern extends to the existing
> uses of "core"). IOW "context sensitive" above would assign
> different meaning to "core" depending on the hardware we run on.
> Granted I can also see how this might confuse people other than
> the example AMD-focused ones.

And it will be fatal for large scale installations with both AMD and Intel
servers. Boot parameters having the same semantics should be named the
same (whether in the name or the value part) so that such customers can
use the same setting on each server.


Juergen

* Re: [Xen-devel] [PATCH v5 20/19] docs: add "sched-gran" boot parameter documentation
  2019-09-30 11:26           ` Jürgen Groß
@ 2019-09-30 11:45             ` George Dunlap
  2019-09-30 12:30               ` Jan Beulich
  2019-09-30 12:38             ` Jan Beulich
  1 sibling, 1 reply; 10+ messages in thread
From: George Dunlap @ 2019-09-30 11:45 UTC (permalink / raw)
  To: Jürgen Groß, Jan Beulich
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, xen-devel

On 9/30/19 12:26 PM, Jürgen Groß wrote:
> On 30.09.19 13:20, Jan Beulich wrote:
>> On 30.09.2019 13:13, Jürgen Groß wrote:
>>> On 30.09.19 13:02, Jan Beulich wrote:
>>>> On 30.09.2019 12:51, Jürgen Groß wrote:
>>>>> On 30.09.19 12:25, Jan Beulich wrote:
>>>>>> On 30.09.2019 12:09, Juergen Gross wrote:
>>>>>>> Add documentation for the new "sched-gran" hypervisor boot parameter.
>>>>>>>
>>>>>>> Signed-off-by: Juergen Gross <jgross@suse.com>
>>>>>>> ---
>>>>>>>     docs/misc/xen-command-line.pandoc | 21 +++++++++++++++++++++
>>>>>>>     1 file changed, 21 insertions(+)
>>>>>>>
>>>>>>> diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
>>>>>>> index fc64429064..c855246050 100644
>>>>>>> --- a/docs/misc/xen-command-line.pandoc
>>>>>>> +++ b/docs/misc/xen-command-line.pandoc
>>>>>>> @@ -1782,6 +1782,27 @@ Set the timeslice of the credit1 scheduler, in milliseconds.  The
>>>>>>>     default is 30ms.  Reasonable values may include 10, 5, or even 1 for
>>>>>>>     very latency-sensitive workloads.
>>>>>>>     
>>>>>>> +### sched-gran (x86)
>>>>>>> +> `= cpu | core | socket`
>>>>>>> +
>>>>>>> +> Default: `sched-gran=cpu`
>>>>>>> +
>>>>>>> +Set the scheduling granularity. In case the granularity is larger than 1 (e.g.
>>>>>>> +`core`on a SMT-enabled system, or `socket`) multiple vcpus are assigned
>>>>>>> +statically to a "scheduling unit" which will then be subject to scheduling.
>>>>>>> +This assignment of vcpus to scheduling units is fixed.
>>>>>>> +
>>>>>>> +`cpu`: Vcpus will be scheduled individually on single cpus.
>>>>>>> +
>>>>>>> +`core`: As many vcpus as there are hyperthreads on a physical core are
>>>>>>> +scheduled together on a physical core.
>>>>>>> +
>>>>>>> +`socket`: As many vcpus as there are hyperthreads on a physical sockets are
>>>>>>> +scheduled together on a physical socket.
>>>>>>
>>>>>> I'd prefer if this didn't end up Intel-centric; ideally it also wouldn't be
>>>>>> x86-specific. AMD has introduced hyperthreading in Fam17 only; Fam15 used
>>>>>> "compute units", grouping together "cores". Internally the Intel side
>>>>>> "core vs hyperthread" is represented in the same variables (cpu_sibling_mask
>>>>>> in particular) as the AMD side "compute unit vs core".
>>>>>
>>>>> Yes, it is a mess.
>>>>>
>>>>>> Therefore it may be better to talk here about e.g. "smallest topological
>>>>>> sub-unit" and only say "e.g. a hyperthread to make a connection to common
>>>>>> x86 / Intel terminology". Of course the AMD side alternative use of the
>>>>>> variables also renders the actual command line option "sched-gran=core"
>>>>>> not overly fortunate. Perhaps we'd want to also use more abstract terms
>>>>>> here, e.g. topological "levels"?
>>>>>
>>>>> I think regarding usage of "hyperthreads" I'll go with:
>>>>>
>>>>> +`cpu`: Vcpus will be scheduled individually on single cpus (e.g. a
>>>>> + hyperthread using x86/Intel terminology)
>>>>> +
>>>>> + `core`: As many vcpus as there are cpus on a physical core are
>>>>> + scheduled together on a physical core.
>>>>> ...
>>>>>
>>>>> I think using "core" is fine. We have it in multiple places in the
>>>>> hypervisor which are _not_ specific to Intel.
>>>>
>>>> Well, what we have in hypervisor sources is one thing - we can
>>>> settle on any convention we want there. It's the user (admin)
>>>> interface (i.e. the command line option name and description
>>>> here) which we may want to be a little more careful with. But
>>>> yes, I can see how we use "core" already in similar contexts
>>>> in the command line option doc, first and foremost on
>>>> "credit2_runqueue". (In retrospect I think this might have been
>>>> a mistake though.)
>>>
>>> So what do you suggest?
>>>
>>> <Irony on>
>>> "topology-level-just-above-the-smallest-topological-sub-unit"?
>>> <Irony-off>
>>>
>>> I can't think of any sensible terminology not resulting in something
>>> which is much harder to understand than "core".
>>
>> Ideally I'd like us to have an arch-independent way of
>> expressing things - "socket" and "node" look to be common enough,
>> so perhaps wouldn't need further abstraction, but sub-socket
>> granularities could perhaps be expressed as "level1" or "level2"?
>> And then there could be context sensitive meanings of "core",
>> "cu", and perhaps (in the future) "die".

Words like "core" should have a consistent meaning.

I did a quick search and couldn't really find any useful resources
describing the difference.

I think we have a couple of options (not necessarily all of these are
exclusive):

* Use "core / thread" for both, and document the rough mapping of these
onto AMD terminologies.

* Use "core / thread" for Intel, and AMD-specific terminology for AMD.

* Add higher-level terms, like "secure" and "performance" (or
"smallest"), so that an administrator can say, "Give me the smallest
granularity which is still secure", and "Give me the best performance
regardless of security".  If HT is ever fixed on future processors, then
those processors in the fleet will have thread-based scheduling, and
insecure processors will have core-based scheduling.

Fundamentally, either the topology levels are similar enough that a
single setting is sensible to use across both, or they are not.  If they
are similar enough, then I think using "core / thread" and mapping them
is probably the best option.  If they are not similar enough, then
things like "level1" and "level2" aren't actually useful anyway, because
what they mean on different systems is too divergent; i.e., in all
likelihood you'd want "level2" on Intels and "level1" on AMD anyway.
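
As a rough illustration of the third option above (higher-level terms), the
sketch below resolves a policy name to a concrete granularity per host. It is
not part of the patch series: the `Host` type, the `smt_is_safe` flag and
`resolve_granularity()` are hypothetical names, and only the `cpu` and `core`
values are taken from the proposed documentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    """Hypothetical description of one machine in a mixed fleet."""
    vendor: str            # illustrative label only, e.g. "intel" or "amd"
    threads_per_core: int  # 1 means the smallest sub-unit is the core itself
    smt_is_safe: bool      # assumption: True if sibling threads do not leak


def resolve_granularity(policy: str, host: Host) -> str:
    """Map a high-level policy to a sched-gran value ("cpu" or "core")."""
    if policy == "performance":
        # Best performance regardless of security: smallest granularity.
        return "cpu"
    if policy == "secure":
        # Smallest granularity that is still secure: group siblings only
        # when the host has them and they cannot be trusted.
        if host.threads_per_core > 1 and not host.smt_is_safe:
            return "core"
        return "cpu"
    raise ValueError(f"unknown policy: {policy!r}")
```

With such a mapping, the same policy string could be used on every server,
while the value it resolves to could differ per machine.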

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

* Re: [Xen-devel] [PATCH v5 20/19] docs: add "sched-gran" boot parameter documentation
  2019-09-30 11:45             ` George Dunlap
@ 2019-09-30 12:30               ` Jan Beulich
  0 siblings, 0 replies; 10+ messages in thread
From: Jan Beulich @ 2019-09-30 12:30 UTC (permalink / raw)
  To: George Dunlap, Juergen Gross
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk, George Dunlap,
	Andrew Cooper, Ian Jackson, Tim Deegan, Julien Grall, xen-devel

On 30.09.2019 13:45, George Dunlap wrote:
> Fundamentally, either the topology levels are similar enough that a
> single setting is sensible to use across both, or they are not.  If they
> are similar enough, then I think using "core / thread" and mapping them
> is probably the best option.

Indeed - hence my comment here and not on the code actually parsing
the option. I.e. while I'd ideally prefer to see even the tokens on
the command line to match what they mean on underlying hardware, I
can accept (the reasons for) a common spelling, as long as the
respective doc parts sufficiently clarify the meaning.

Jan

* Re: [Xen-devel] [PATCH v5 20/19] docs: add "sched-gran" boot parameter documentation
  2019-09-30 11:26           ` Jürgen Groß
  2019-09-30 11:45             ` George Dunlap
@ 2019-09-30 12:38             ` Jan Beulich
  1 sibling, 0 replies; 10+ messages in thread
From: Jan Beulich @ 2019-09-30 12:38 UTC (permalink / raw)
  To: Jürgen Groß
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, xen-devel

On 30.09.2019 13:26, Jürgen Groß wrote:
> And it will be fatal for large scale installations with AMD- and INTEL-
> servers. Boot-parameters having the same semantics should be named the
> same (regardless of the name or value part) in order to enable such
> customers to use the same setting on each server.

But such a large scale user would quite likely want the meaning of
"core" in the respective vendor's sense, i.e. CPU scheduling on AMD
(as not being affected by the various HT leaks), and core scheduling
on Intel. Due to AMD Fam17 now actually calling the thing HT too, in
fact such installations would likely want _different_ options when
the primary goal is security, and a secondary one is performance /
throughput. Otoh I guess this is going to be our default eventually,
i.e. no command line option ought to be needed to achieve this.

Jan

Thread overview: 10+ messages
2019-09-30 10:09 [Xen-devel] [PATCH v5 20/19] docs: add "sched-gran" boot parameter documentation Juergen Gross
2019-09-30 10:25 ` Jan Beulich
2019-09-30 10:51   ` Jürgen Groß
2019-09-30 11:02     ` Jan Beulich
2019-09-30 11:13       ` Jürgen Groß
2019-09-30 11:20         ` Jan Beulich
2019-09-30 11:26           ` Jürgen Groß
2019-09-30 11:45             ` George Dunlap
2019-09-30 12:30               ` Jan Beulich
2019-09-30 12:38             ` Jan Beulich
