* Re: [PATCH RFC] x86/smpboot: Set safer __max_logical_packages limit
2017-04-20 15:06 ` Peter Zijlstra
@ 2017-04-20 15:40 ` Vitaly Kuznetsov
2017-04-20 15:40 ` Vitaly Kuznetsov
` (2 subsequent siblings)
3 siblings, 0 replies; 15+ messages in thread
From: Vitaly Kuznetsov @ 2017-04-20 15:40 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Prarit Bhargava, x86, linux-kernel, Ingo Molnar, H. Peter Anvin,
xen-devel, Thomas Gleixner, Borislav Petkov
Peter Zijlstra <peterz@infradead.org> writes:
> On Thu, Apr 20, 2017 at 03:24:53PM +0200, Vitaly Kuznetsov wrote:
>> In this patch I suggest we set __max_logical_packages based on the
>> max_physical_pkg_id and total_cpus,
>
> So my 4 socket 144 CPU system will then get max_physical_pkg_id=144,
> instead of 4.
>
> This wastes quite a bit of memory for the per-node arrays. Luckily most
> are just pointer arrays, but still, wasting 140*8 bytes for each of
> them.
>
>> this should be safe and cover all
>> possible cases. Alternatively, we may think about eliminating the concept
>> of __max_logical_packages completely and relying on max_physical_pkg_id/
>> total_cpus where we currently use topology_max_packages().
>>
>> The issue could've been solved in Xen too I guess. CPUID returning
>> x86_max_cores can be tweaked to be the lowerest(?) possible number of
>> all logical packages of the guest.
>
> This is getting ludicrous. Xen is plain broken, and instead of fixing
> it, you propose to somehow deal with its obviously crack induced
> behaviour :-(
Totally agree and I don't like the solution I propose (and that's why
this is RFC)... The problem is that there are such Xen setups in the
wild and with the recent changes some guests will BUG() :-(
Alternatively, we can just remove the BUG() and do something with CPUs
which have their pkg >= __max_logical_packages, e.g. assign them to the
last package. Far from ideal but will help to avoid the regression.
--
Vitaly
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH RFC] x86/smpboot: Set safer __max_logical_packages limit
2017-04-20 15:06 ` Peter Zijlstra
2017-04-20 15:40 ` Vitaly Kuznetsov
@ 2017-04-20 15:40 ` Vitaly Kuznetsov
2017-04-20 16:01 ` [Xen-devel] " Boris Ostrovsky
` (3 more replies)
2017-04-20 17:04 ` Andrew Cooper
2017-04-20 17:04 ` [Xen-devel] " Andrew Cooper
3 siblings, 4 replies; 15+ messages in thread
From: Vitaly Kuznetsov @ 2017-04-20 15:40 UTC (permalink / raw)
To: Peter Zijlstra
Cc: x86, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
Borislav Petkov, Prarit Bhargava, linux-kernel, xen-devel
Peter Zijlstra <peterz@infradead.org> writes:
> On Thu, Apr 20, 2017 at 03:24:53PM +0200, Vitaly Kuznetsov wrote:
>> In this patch I suggest we set __max_logical_packages based on the
>> max_physical_pkg_id and total_cpus,
>
> So my 4 socket 144 CPU system will then get max_physical_pkg_id=144,
> instead of 4.
>
> This wastes quite a bit of memory for the per-node arrays. Luckily most
> are just pointer arrays, but still, wasting 140*8 bytes for each of
> them.
>
>> this should be safe and cover all
>> possible cases. Alternatively, we may think about eliminating the concept
>> of __max_logical_packages completely and relying on max_physical_pkg_id/
>> total_cpus where we currently use topology_max_packages().
>>
>> The issue could've been solved in Xen too I guess. CPUID returning
>> x86_max_cores can be tweaked to be the lowerest(?) possible number of
>> all logical packages of the guest.
>
> This is getting ludicrous. Xen is plain broken, and instead of fixing
> it, you propose to somehow deal with its obviously crack induced
> behaviour :-(
Totally agree and I don't like the solution I propose (and that's why
this is RFC)... The problem is that there are such Xen setups in the
wild and with the recent changes some guests will BUG() :-(
Alternatively, we can just remove the BUG() and do something with CPUs
which have their pkg >= __max_logical_packages, e.g. assign them to the
last package. Far from ideal but will help to avoid the regression.
--
Vitaly
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Xen-devel] [PATCH RFC] x86/smpboot: Set safer __max_logical_packages limit
2017-04-20 15:40 ` Vitaly Kuznetsov
@ 2017-04-20 16:01 ` Boris Ostrovsky
2017-04-20 16:21 ` Vitaly Kuznetsov
2017-04-20 16:21 ` Vitaly Kuznetsov
2017-04-20 16:01 ` Boris Ostrovsky
` (2 subsequent siblings)
3 siblings, 2 replies; 15+ messages in thread
From: Boris Ostrovsky @ 2017-04-20 16:01 UTC (permalink / raw)
To: Vitaly Kuznetsov, Peter Zijlstra
Cc: Prarit Bhargava, x86, linux-kernel, Ingo Molnar, H. Peter Anvin,
xen-devel, Thomas Gleixner, Borislav Petkov, Juergen Gross
On 04/20/2017 11:40 AM, Vitaly Kuznetsov wrote:
> Peter Zijlstra <peterz@infradead.org> writes:
>
>> On Thu, Apr 20, 2017 at 03:24:53PM +0200, Vitaly Kuznetsov wrote:
>>> In this patch I suggest we set __max_logical_packages based on the
>>> max_physical_pkg_id and total_cpus,
>> So my 4 socket 144 CPU system will then get max_physical_pkg_id=144,
>> instead of 4.
>>
>> This wastes quite a bit of memory for the per-node arrays. Luckily most
>> are just pointer arrays, but still, wasting 140*8 bytes for each of
>> them.
>>
>>> this should be safe and cover all
>>> possible cases. Alternatively, we may think about eliminating the concept
>>> of __max_logical_packages completely and relying on max_physical_pkg_id/
>>> total_cpus where we currently use topology_max_packages().
>>>
>>> The issue could've been solved in Xen too I guess. CPUID returning
>>> x86_max_cores can be tweaked to be the lowerest(?) possible number of
>>> all logical packages of the guest.
>> This is getting ludicrous. Xen is plain broken, and instead of fixing
>> it, you propose to somehow deal with its obviously crack induced
>> behaviour :-(
> Totally agree and I don't like the solution I propose (and that's why
> this is RFC)... The problem is that there are such Xen setups in the
> wild and with the recent changes some guests will BUG() :-(
>
> Alternatively, we can just remove the BUG() and do something with CPUs
> which have their pkg >= __max_logical_packages, e.g. assign them to the
> last package. Far from ideal but will help to avoid the regression.
Do you observe this failure on PV or HVM guest?
We've had a number of issues with topology discovery for PV guests but
AFAIK they have been addressed (so far). I wonder though whether it
would make sense to have some sort of a callback (or an smp_ops.op) to
override native topology info, if needed.
-boris
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Xen-devel] [PATCH RFC] x86/smpboot: Set safer __max_logical_packages limit
2017-04-20 16:01 ` [Xen-devel] " Boris Ostrovsky
@ 2017-04-20 16:21 ` Vitaly Kuznetsov
2017-04-20 16:21 ` Vitaly Kuznetsov
1 sibling, 0 replies; 15+ messages in thread
From: Vitaly Kuznetsov @ 2017-04-20 16:21 UTC (permalink / raw)
To: Boris Ostrovsky
Cc: Peter Zijlstra, Prarit Bhargava, x86, linux-kernel, Ingo Molnar,
H. Peter Anvin, xen-devel, Thomas Gleixner, Borislav Petkov,
Juergen Gross
Boris Ostrovsky <boris.ostrovsky@oracle.com> writes:
> On 04/20/2017 11:40 AM, Vitaly Kuznetsov wrote:
>> Peter Zijlstra <peterz@infradead.org> writes:
>>
>>> On Thu, Apr 20, 2017 at 03:24:53PM +0200, Vitaly Kuznetsov wrote:
>>>> In this patch I suggest we set __max_logical_packages based on the
>>>> max_physical_pkg_id and total_cpus,
>>> So my 4 socket 144 CPU system will then get max_physical_pkg_id=144,
>>> instead of 4.
>>>
>>> This wastes quite a bit of memory for the per-node arrays. Luckily most
>>> are just pointer arrays, but still, wasting 140*8 bytes for each of
>>> them.
>>>
>>>> this should be safe and cover all
>>>> possible cases. Alternatively, we may think about eliminating the concept
>>>> of __max_logical_packages completely and relying on max_physical_pkg_id/
>>>> total_cpus where we currently use topology_max_packages().
>>>>
>>>> The issue could've been solved in Xen too I guess. CPUID returning
>>>> x86_max_cores can be tweaked to be the lowerest(?) possible number of
>>>> all logical packages of the guest.
>>> This is getting ludicrous. Xen is plain broken, and instead of fixing
>>> it, you propose to somehow deal with its obviously crack induced
>>> behaviour :-(
>> Totally agree and I don't like the solution I propose (and that's why
>> this is RFC)... The problem is that there are such Xen setups in the
>> wild and with the recent changes some guests will BUG() :-(
>>
>> Alternatively, we can just remove the BUG() and do something with CPUs
>> which have their pkg >= __max_logical_packages, e.g. assign them to the
>> last package. Far from ideal but will help to avoid the regression.
>
> Do you observe this failure on PV or HVM guest?
>
> We've had a number of issues with topology discovery for PV guests but
> AFAIK they have been addressed (so far). I wonder though whether it
> would make sense to have some sort of a callback (or an smp_ops.op) to
> override native topology info, if needed.
>
This is HVM.
I guess that CPUID handling for AMD processors in the hypervisor doesn't
adjust the core information and passes it from hardware as-is while it
should be tweaked.
--
Vitaly
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH RFC] x86/smpboot: Set safer __max_logical_packages limit
2017-04-20 16:01 ` [Xen-devel] " Boris Ostrovsky
2017-04-20 16:21 ` Vitaly Kuznetsov
@ 2017-04-20 16:21 ` Vitaly Kuznetsov
1 sibling, 0 replies; 15+ messages in thread
From: Vitaly Kuznetsov @ 2017-04-20 16:21 UTC (permalink / raw)
To: Boris Ostrovsky
Cc: Prarit Bhargava, Juergen Gross, Peter Zijlstra, x86,
linux-kernel, Ingo Molnar, H. Peter Anvin, xen-devel,
Thomas Gleixner, Borislav Petkov
Boris Ostrovsky <boris.ostrovsky@oracle.com> writes:
> On 04/20/2017 11:40 AM, Vitaly Kuznetsov wrote:
>> Peter Zijlstra <peterz@infradead.org> writes:
>>
>>> On Thu, Apr 20, 2017 at 03:24:53PM +0200, Vitaly Kuznetsov wrote:
>>>> In this patch I suggest we set __max_logical_packages based on the
>>>> max_physical_pkg_id and total_cpus,
>>> So my 4 socket 144 CPU system will then get max_physical_pkg_id=144,
>>> instead of 4.
>>>
>>> This wastes quite a bit of memory for the per-node arrays. Luckily most
>>> are just pointer arrays, but still, wasting 140*8 bytes for each of
>>> them.
>>>
>>>> this should be safe and cover all
>>>> possible cases. Alternatively, we may think about eliminating the concept
>>>> of __max_logical_packages completely and relying on max_physical_pkg_id/
>>>> total_cpus where we currently use topology_max_packages().
>>>>
>>>> The issue could've been solved in Xen too I guess. CPUID returning
>>>> x86_max_cores can be tweaked to be the lowerest(?) possible number of
>>>> all logical packages of the guest.
>>> This is getting ludicrous. Xen is plain broken, and instead of fixing
>>> it, you propose to somehow deal with its obviously crack induced
>>> behaviour :-(
>> Totally agree and I don't like the solution I propose (and that's why
>> this is RFC)... The problem is that there are such Xen setups in the
>> wild and with the recent changes some guests will BUG() :-(
>>
>> Alternatively, we can just remove the BUG() and do something with CPUs
>> which have their pkg >= __max_logical_packages, e.g. assign them to the
>> last package. Far from ideal but will help to avoid the regression.
>
> Do you observe this failure on PV or HVM guest?
>
> We've had a number of issues with topology discovery for PV guests but
> AFAIK they have been addressed (so far). I wonder though whether it
> would make sense to have some sort of a callback (or an smp_ops.op) to
> override native topology info, if needed.
>
This is HVM.
I guess that CPUID handling for AMD processors in the hypervisor doesn't
adjust the core information and passes it from hardware as-is while it
should be tweaked.
--
Vitaly
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH RFC] x86/smpboot: Set safer __max_logical_packages limit
2017-04-20 15:40 ` Vitaly Kuznetsov
2017-04-20 16:01 ` [Xen-devel] " Boris Ostrovsky
@ 2017-04-20 16:01 ` Boris Ostrovsky
2017-04-20 16:15 ` Peter Zijlstra
2017-04-20 16:15 ` Peter Zijlstra
3 siblings, 0 replies; 15+ messages in thread
From: Boris Ostrovsky @ 2017-04-20 16:01 UTC (permalink / raw)
To: Vitaly Kuznetsov, Peter Zijlstra
Cc: Prarit Bhargava, Juergen Gross, x86, linux-kernel, Ingo Molnar,
H. Peter Anvin, xen-devel, Thomas Gleixner, Borislav Petkov
On 04/20/2017 11:40 AM, Vitaly Kuznetsov wrote:
> Peter Zijlstra <peterz@infradead.org> writes:
>
>> On Thu, Apr 20, 2017 at 03:24:53PM +0200, Vitaly Kuznetsov wrote:
>>> In this patch I suggest we set __max_logical_packages based on the
>>> max_physical_pkg_id and total_cpus,
>> So my 4 socket 144 CPU system will then get max_physical_pkg_id=144,
>> instead of 4.
>>
>> This wastes quite a bit of memory for the per-node arrays. Luckily most
>> are just pointer arrays, but still, wasting 140*8 bytes for each of
>> them.
>>
>>> this should be safe and cover all
>>> possible cases. Alternatively, we may think about eliminating the concept
>>> of __max_logical_packages completely and relying on max_physical_pkg_id/
>>> total_cpus where we currently use topology_max_packages().
>>>
>>> The issue could've been solved in Xen too I guess. CPUID returning
>>> x86_max_cores can be tweaked to be the lowerest(?) possible number of
>>> all logical packages of the guest.
>> This is getting ludicrous. Xen is plain broken, and instead of fixing
>> it, you propose to somehow deal with its obviously crack induced
>> behaviour :-(
> Totally agree and I don't like the solution I propose (and that's why
> this is RFC)... The problem is that there are such Xen setups in the
> wild and with the recent changes some guests will BUG() :-(
>
> Alternatively, we can just remove the BUG() and do something with CPUs
> which have their pkg >= __max_logical_packages, e.g. assign them to the
> last package. Far from ideal but will help to avoid the regression.
Do you observe this failure on PV or HVM guest?
We've had a number of issues with topology discovery for PV guests but
AFAIK they have been addressed (so far). I wonder though whether it
would make sense to have some sort of a callback (or an smp_ops.op) to
override native topology info, if needed.
-boris
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH RFC] x86/smpboot: Set safer __max_logical_packages limit
2017-04-20 15:40 ` Vitaly Kuznetsov
2017-04-20 16:01 ` [Xen-devel] " Boris Ostrovsky
2017-04-20 16:01 ` Boris Ostrovsky
@ 2017-04-20 16:15 ` Peter Zijlstra
2017-04-20 16:15 ` Peter Zijlstra
3 siblings, 0 replies; 15+ messages in thread
From: Peter Zijlstra @ 2017-04-20 16:15 UTC (permalink / raw)
To: Vitaly Kuznetsov
Cc: Prarit Bhargava, x86, linux-kernel, Ingo Molnar, H. Peter Anvin,
xen-devel, Thomas Gleixner, Borislav Petkov
On Thu, Apr 20, 2017 at 05:40:37PM +0200, Vitaly Kuznetsov wrote:
> > This is getting ludicrous. Xen is plain broken, and instead of fixing
> > it, you propose to somehow deal with its obviously crack induced
> > behaviour :-(
>
> Totally agree and I don't like the solution I propose (and that's why
> this is RFC)... The problem is that there are such Xen setups in the
> wild and with the recent changes some guests will BUG() :-(
>
> Alternatively, we can just remove the BUG() and do something with CPUs
> which have their pkg >= __max_logical_packages, e.g. assign them to the
> last package. Far from ideal but will help to avoid the regression.
So currently none of the stuff that uses this should appear in Xen. Its
all drivers for hardware that isn't virtualized (afaik). So assigning to
the last package 'works'.
But the moment this ends up getting used that explodes, because we'll
need different object instances for each piece of hardware.
There just isn't a good solution; on the one hand the BIOS is prone to
providing crap numbers, on the other hand virt (esp. Xen as it turns
out) provides absolutely bonkers/inconsistent topology information.
Very frustrating :-/
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH RFC] x86/smpboot: Set safer __max_logical_packages limit
2017-04-20 15:40 ` Vitaly Kuznetsov
` (2 preceding siblings ...)
2017-04-20 16:15 ` Peter Zijlstra
@ 2017-04-20 16:15 ` Peter Zijlstra
2017-04-20 17:09 ` Boris Ostrovsky
2017-04-20 17:09 ` [Xen-devel] " Boris Ostrovsky
3 siblings, 2 replies; 15+ messages in thread
From: Peter Zijlstra @ 2017-04-20 16:15 UTC (permalink / raw)
To: Vitaly Kuznetsov
Cc: x86, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
Borislav Petkov, Prarit Bhargava, linux-kernel, xen-devel
On Thu, Apr 20, 2017 at 05:40:37PM +0200, Vitaly Kuznetsov wrote:
> > This is getting ludicrous. Xen is plain broken, and instead of fixing
> > it, you propose to somehow deal with its obviously crack induced
> > behaviour :-(
>
> Totally agree and I don't like the solution I propose (and that's why
> this is RFC)... The problem is that there are such Xen setups in the
> wild and with the recent changes some guests will BUG() :-(
>
> Alternatively, we can just remove the BUG() and do something with CPUs
> which have their pkg >= __max_logical_packages, e.g. assign them to the
> last package. Far from ideal but will help to avoid the regression.
So currently none of the stuff that uses this should appear in Xen. Its
all drivers for hardware that isn't virtualized (afaik). So assigning to
the last package 'works'.
But the moment this ends up getting used that explodes, because we'll
need different object instances for each piece of hardware.
There just isn't a good solution; on the one hand the BIOS is prone to
providing crap numbers, on the other hand virt (esp. Xen as it turns
out) provides absolutely bonkers/inconsistent topology information.
Very frustrating :-/
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH RFC] x86/smpboot: Set safer __max_logical_packages limit
2017-04-20 16:15 ` Peter Zijlstra
@ 2017-04-20 17:09 ` Boris Ostrovsky
2017-04-20 17:09 ` [Xen-devel] " Boris Ostrovsky
1 sibling, 0 replies; 15+ messages in thread
From: Boris Ostrovsky @ 2017-04-20 17:09 UTC (permalink / raw)
To: Peter Zijlstra, Vitaly Kuznetsov
Cc: Prarit Bhargava, Juergen Gross, x86, linux-kernel, Ingo Molnar,
H. Peter Anvin, xen-devel, Thomas Gleixner, Borislav Petkov
On 04/20/2017 12:15 PM, Peter Zijlstra wrote:
> On Thu, Apr 20, 2017 at 05:40:37PM +0200, Vitaly Kuznetsov wrote:
>>> This is getting ludicrous. Xen is plain broken, and instead of fixing
>>> it, you propose to somehow deal with its obviously crack induced
>>> behaviour :-(
>> Totally agree and I don't like the solution I propose (and that's why
>> this is RFC)... The problem is that there are such Xen setups in the
>> wild and with the recent changes some guests will BUG() :-(
>>
>> Alternatively, we can just remove the BUG() and do something with CPUs
>> which have their pkg >= __max_logical_packages, e.g. assign them to the
>> last package. Far from ideal but will help to avoid the regression.
> So currently none of the stuff that uses this should appear in Xen. Its
> all drivers for hardware that isn't virtualized (afaik). So assigning to
> the last package 'works'.
>
> But the moment this ends up getting used that explodes, because we'll
> need different object instances for each piece of hardware.
This already gets used. I don't remember details but we had to fix
something due to RAPL code referencing topology info (and yes, there is
no reason for a guest to use RAPL, but we shouldn't crash neither)
>
> There just isn't a good solution; on the one hand the BIOS is prone to
> providing crap numbers, on the other hand virt (esp. Xen as it turns
> out) provides absolutely bonkers/inconsistent topology information.
>
> Very frustrating :-/
>
So we might need a way to bypass topology discovery and present some
sort of default topology (single package, or 1 CPU per package, for
example) and ignore APICID and all that.
-boris
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Xen-devel] [PATCH RFC] x86/smpboot: Set safer __max_logical_packages limit
2017-04-20 16:15 ` Peter Zijlstra
2017-04-20 17:09 ` Boris Ostrovsky
@ 2017-04-20 17:09 ` Boris Ostrovsky
1 sibling, 0 replies; 15+ messages in thread
From: Boris Ostrovsky @ 2017-04-20 17:09 UTC (permalink / raw)
To: Peter Zijlstra, Vitaly Kuznetsov
Cc: Prarit Bhargava, x86, linux-kernel, Ingo Molnar, H. Peter Anvin,
xen-devel, Thomas Gleixner, Borislav Petkov, Juergen Gross
On 04/20/2017 12:15 PM, Peter Zijlstra wrote:
> On Thu, Apr 20, 2017 at 05:40:37PM +0200, Vitaly Kuznetsov wrote:
>>> This is getting ludicrous. Xen is plain broken, and instead of fixing
>>> it, you propose to somehow deal with its obviously crack induced
>>> behaviour :-(
>> Totally agree and I don't like the solution I propose (and that's why
>> this is RFC)... The problem is that there are such Xen setups in the
>> wild and with the recent changes some guests will BUG() :-(
>>
>> Alternatively, we can just remove the BUG() and do something with CPUs
>> which have their pkg >= __max_logical_packages, e.g. assign them to the
>> last package. Far from ideal but will help to avoid the regression.
> So currently none of the stuff that uses this should appear in Xen. Its
> all drivers for hardware that isn't virtualized (afaik). So assigning to
> the last package 'works'.
>
> But the moment this ends up getting used that explodes, because we'll
> need different object instances for each piece of hardware.
This already gets used. I don't remember details but we had to fix
something due to RAPL code referencing topology info (and yes, there is
no reason for a guest to use RAPL, but we shouldn't crash neither)
>
> There just isn't a good solution; on the one hand the BIOS is prone to
> providing crap numbers, on the other hand virt (esp. Xen as it turns
> out) provides absolutely bonkers/inconsistent topology information.
>
> Very frustrating :-/
>
So we might need a way to bypass topology discovery and present some
sort of default topology (single package, or 1 CPU per package, for
example) and ignore APICID and all that.
-boris
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH RFC] x86/smpboot: Set safer __max_logical_packages limit
2017-04-20 15:06 ` Peter Zijlstra
2017-04-20 15:40 ` Vitaly Kuznetsov
2017-04-20 15:40 ` Vitaly Kuznetsov
@ 2017-04-20 17:04 ` Andrew Cooper
2017-04-20 17:04 ` [Xen-devel] " Andrew Cooper
3 siblings, 0 replies; 15+ messages in thread
From: Andrew Cooper @ 2017-04-20 17:04 UTC (permalink / raw)
To: Peter Zijlstra, Vitaly Kuznetsov
Cc: Prarit Bhargava, x86, linux-kernel, Ingo Molnar, H. Peter Anvin,
xen-devel, Thomas Gleixner, Borislav Petkov
On 20/04/17 16:06, Peter Zijlstra wrote:
> On Thu, Apr 20, 2017 at 03:24:53PM +0200, Vitaly Kuznetsov wrote:
>> In this patch I suggest we set __max_logical_packages based on the
>> max_physical_pkg_id and total_cpus,
> So my 4 socket 144 CPU system will then get max_physical_pkg_id=144,
> instead of 4.
>
> This wastes quite a bit of memory for the per-node arrays. Luckily most
> are just pointer arrays, but still, wasting 140*8 bytes for each of
> them.
>
>> this should be safe and cover all
>> possible cases. Alternatively, we may think about eliminating the concept
>> of __max_logical_packages completely and relying on max_physical_pkg_id/
>> total_cpus where we currently use topology_max_packages().
>>
>> The issue could've been solved in Xen too I guess. CPUID returning
>> x86_max_cores can be tweaked to be the lowerest(?) possible number of
>> all logical packages of the guest.
> This is getting ludicrous. Xen is plain broken, and instead of fixing
> it, you propose to somehow deal with its obviously crack induced
> behaviour :-(
Xen is (and has been forever) plain broken in this specific regard.
Fixing CPUID handling for guests has taken me 18 months and ~80 patches
so far, and it still isn't complete.
However, Linux needs to be able to function on existing production
systems (which is every instance of Xen currently running).
Linux shouldn't BUG() because something provided by the firmware looks
wonky. This is conceptually no different from finding junk the APCI tables.
(I fully agree that we shouldn't have got to this state, but we are 12
years too late in this respect.)
~Andrew
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Xen-devel] [PATCH RFC] x86/smpboot: Set safer __max_logical_packages limit
2017-04-20 15:06 ` Peter Zijlstra
` (2 preceding siblings ...)
2017-04-20 17:04 ` Andrew Cooper
@ 2017-04-20 17:04 ` Andrew Cooper
3 siblings, 0 replies; 15+ messages in thread
From: Andrew Cooper @ 2017-04-20 17:04 UTC (permalink / raw)
To: Peter Zijlstra, Vitaly Kuznetsov
Cc: Prarit Bhargava, x86, linux-kernel, Ingo Molnar, H. Peter Anvin,
xen-devel, Thomas Gleixner, Borislav Petkov
On 20/04/17 16:06, Peter Zijlstra wrote:
> On Thu, Apr 20, 2017 at 03:24:53PM +0200, Vitaly Kuznetsov wrote:
>> In this patch I suggest we set __max_logical_packages based on the
>> max_physical_pkg_id and total_cpus,
> So my 4 socket 144 CPU system will then get max_physical_pkg_id=144,
> instead of 4.
>
> This wastes quite a bit of memory for the per-node arrays. Luckily most
> are just pointer arrays, but still, wasting 140*8 bytes for each of
> them.
>
>> this should be safe and cover all
>> possible cases. Alternatively, we may think about eliminating the concept
>> of __max_logical_packages completely and relying on max_physical_pkg_id/
>> total_cpus where we currently use topology_max_packages().
>>
>> The issue could've been solved in Xen too I guess. CPUID returning
>> x86_max_cores can be tweaked to be the lowerest(?) possible number of
>> all logical packages of the guest.
> This is getting ludicrous. Xen is plain broken, and instead of fixing
> it, you propose to somehow deal with its obviously crack induced
> behaviour :-(
Xen is (and has been forever) plain broken in this specific regard.
Fixing CPUID handling for guests has taken me 18 months and ~80 patches
so far, and it still isn't complete.
However, Linux needs to be able to function on existing production
systems (which is every instance of Xen currently running).
Linux shouldn't BUG() because something provided by the firmware looks
wonky. This is conceptually no different from finding junk the APCI tables.
(I fully agree that we shouldn't have got to this state, but we are 12
years too late in this respect.)
~Andrew
^ permalink raw reply [flat|nested] 15+ messages in thread