* dom0 Linux 5.8-rc5 kernel failing to initialize cooling maps for Allwinner H6 SoC
@ 2020-07-20 14:53 Alejandro
  2020-07-24 10:45 ` Julien Grall
  0 siblings, 1 reply; 11+ messages in thread
From: Alejandro @ 2020-07-20 14:53 UTC (permalink / raw)
  To: xen-devel

Hello all.

I'm new to this community, and first I'd like to thank you all for
your efforts to support Xen on ARM devices.

I'm trying Xen 4.13.1 on an Allwinner H6 SoC (more precisely a Pine H64
model B, with an ARM Cortex-A53 CPU). I managed to get a dom0 Linux
5.8-rc5 kernel running fine, unpatched, and I'm using the upstream
device tree for my board. However, the dom0 kernel has trouble reading
some DT nodes related to the CPUs, and it can't initialize the thermal
subsystem properly. That is a showstopper for me, because I'm concerned
that letting the CPU run at the maximum frequency without watching its
temperature may cause overheating.
The relevant kernel messages are:

[  +0.001959] sun50i-cpufreq-nvmem: probe of sun50i-cpufreq-nvmem
failed with error -2
...
[  +0.003053] hw perfevents: failed to parse interrupt-affinity[0] for pmu
[  +0.000043] hw perfevents: /pmu: failed to register PMU devices!
[  +0.000037] armv8-pmu: probe of pmu failed with error -22
...
[  +0.000163] OF: /thermal-zones/cpu-thermal/cooling-maps/map0: could
not find phandle
[  +0.000063] thermal_sys: failed to build thermal zone cpu-thermal: -22

I've searched for issues, code or commits that may be related to this
issue. The most relevant things I found are:

- A patch that blacklists the A53 PMU:
https://patchwork.kernel.org/patch/10899881/
- The handle_node function in xen/arch/arm/domain_build.c:
https://github.com/xen-project/xen/blob/master/xen/arch/arm/domain_build.c#L1427

I've thought about removing "/cpus" from the skip_matches array in the
handle_node function, but I'm not sure
that would be a good fix.

I'd appreciate any tips for fixing this issue. Don't hesitate to
contact me back if you need any more information
about the problem.



* Re: dom0 Linux 5.8-rc5 kernel failing to initialize cooling maps for Allwinner H6 SoC
  2020-07-20 14:53 dom0 Linux 5.8-rc5 kernel failing to initialize cooling maps for Allwinner H6 SoC Alejandro
@ 2020-07-24 10:45 ` Julien Grall
  2020-07-24 11:17   ` Amit Tomer
  2020-07-24 11:20   ` Alejandro
  0 siblings, 2 replies; 11+ messages in thread
From: Julien Grall @ 2020-07-24 10:45 UTC (permalink / raw)
  To: Alejandro, xen-devel
  Cc: Andre Przywara, Stefano Stabellini, Volodymyr Babchuk

(+ Andre and Stefano)

On 20/07/2020 15:53, Alejandro wrote:
> Hello all.

Hello,

> 
> I'm new to this community, and firstly I'd like to thank you all for
> your efforts on supporting Xen in ARM devices.

Welcome to the community!

> 
> I'm trying Xen 4.13.1 in a Allwinner H6 SoC (more precisely a Pine H64
> model B, with a ARM Cortex-A53 CPU).
> I managed to get a dom0 Linux 5.8-rc5 kernel running fine, unpatched,
> and I'm using the upstream device tree for
> my board. However, the dom0 kernel has trouble when reading some DT
> nodes that are related to the CPUs, and
> it can't initialize the thermal subsystem properly, which is a kind of
> showstopper for me, because I'm concerned
> that letting the CPU run at the maximum frequency without watching out
> its temperature may cause overheating.

I understand this concern. I am aware of some efforts to get CPUFreq 
working on Xen but I am not sure if there is anything available yet. I 
have CCed a couple more people who may be able to help here.

> The relevant kernel messages are:
> 
> [  +0.001959] sun50i-cpufreq-nvmem: probe of sun50i-cpufreq-nvmem
> failed with error -2
> ...
> [  +0.003053] hw perfevents: failed to parse interrupt-affinity[0] for pmu
> [  +0.000043] hw perfevents: /pmu: failed to register PMU devices!
> [  +0.000037] armv8-pmu: probe of pmu failed with error -22

I am not sure the PMU failure is related to the thermal failure below.

> ...
> [  +0.000163] OF: /thermal-zones/cpu-thermal/cooling-maps/map0: could
> not find phandle
> [  +0.000063] thermal_sys: failed to build thermal zone cpu-thermal: -22

Would it be possible to paste the device-tree node for 
/thermal-zones/cpu-thermal/cooling-maps? I suspect the issue is that we 
recreate /cpus from scratch.

I don't know much about how the thermal subsystem works, but I suspect 
this will not be enough to get it working properly on Xen. As a 
workaround, you would need to create a dom0 with the same number of 
vCPUs as pCPUs. They would also need to be pinned.
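
Something along these lines on the Xen command line should give you that 
(a sketch only; the option names are from the Xen command-line 
documentation, and I am assuming the H6's four Cortex-A53 cores here):

  dom0_max_vcpus=4 dom0_vcpus_pin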

I will leave the others to fill in more details.

> 
> I've searched for issues, code or commits that may be related for this
> issue. The most relevant things I found are:
> 
> - A patch that blacklists the A53 PMU:
> https://patchwork.kernel.org/patch/10899881/
> - The handle_node function in xen/arch/arm/domain_build.c:
> https://github.com/xen-project/xen/blob/master/xen/arch/arm/domain_build.c#L1427

I remember this discussion. The problem was that the PMU is using 
per-CPU interrupts. Xen is not yet able to handle PPIs as they often 
require more context to be saved/restored (in this case the PMU context).

There was a proposal to check whether a device uses PPIs and, if so, 
remove it from the device tree. Unfortunately, I haven't seen any 
official submission for this patch.

Did you have to apply the patch to boot up? If not, then the error above 
shouldn't be a concern. However, if you need PMU support for using the 
thermal devices then it is going to require some work.

> 
> I've thought about removing "/cpus" from the skip_matches array in the
> handle_node function, but I'm not sure
> that would be a good fix.

The node "/cpus" and its sub-node are recreated by Xen for Dom0. This is 
because Dom0 may have a different numbers of vCPUs and it doesn't seen 
the pCPUs.

If you don't skip "/cpus" from the host DT then you would end up with 
two "/cpus" paths in your dom0 DT. Most likely, Linux will not be happy 
with that.

I vaguely remember some discussions on how to deal with CPUFreq in Xen. 
IIRC we agreed that Dom0 should be part of the equation because it 
already contains all the drivers. However, I can't remember if we agreed 
how the dom0 would be made aware of the pCPUs.

@Volodymyr, I think you were looking at CPUFreq. Maybe you can help?

Best regards,

-- 
Julien Grall



* Re: dom0 Linux 5.8-rc5 kernel failing to initialize cooling maps for Allwinner H6 SoC
  2020-07-24 10:45 ` Julien Grall
@ 2020-07-24 11:17   ` Amit Tomer
  2020-07-24 11:18     ` Julien Grall
  2020-07-24 11:20   ` Alejandro
  1 sibling, 1 reply; 11+ messages in thread
From: Amit Tomer @ 2020-07-24 11:17 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, Stefano Stabellini, Volodymyr Babchuk, Alejandro,
	Andre Przywara

Hi,

> I remember this discussion. The problem was that the PMU is using
> per-CPU interrupts. Xen is not yet able to handle PPIs as they often
> requires more context to be saved/restored (in this case the PMU context).
>
> There was a proposal to look if a device is using PPIs and just remove
> them from the Device-Tree. Unfortunately, I haven't seen any official
> submission for this patch.

But we have this patch that removes devices using PPIs:
http://xenbits.xen.org/gitweb/?p=xen.git;a=commit;h=9b1a31922ac066ef0dffe36ebd6a6ba016567d69

Thanks
-Amit



* Re: dom0 Linux 5.8-rc5 kernel failing to initialize cooling maps for Allwinner H6 SoC
  2020-07-24 11:17   ` Amit Tomer
@ 2020-07-24 11:18     ` Julien Grall
  0 siblings, 0 replies; 11+ messages in thread
From: Julien Grall @ 2020-07-24 11:18 UTC (permalink / raw)
  To: Amit Tomer
  Cc: xen-devel, Stefano Stabellini, Volodymyr Babchuk, Alejandro,
	Andre Przywara



On 24/07/2020 12:17, Amit Tomer wrote:
> Hi,

Hi,

>> I remember this discussion. The problem was that the PMU is using
>> per-CPU interrupts. Xen is not yet able to handle PPIs as they often
>> requires more context to be saved/restored (in this case the PMU context).
>>
>> There was a proposal to look if a device is using PPIs and just remove
>> them from the Device-Tree. Unfortunately, I haven't seen any official
>> submission for this patch.
> 
> But we have this patch that remove devices using PPIs
> http://xenbits.xen.org/gitweb/?p=xen.git;a=commit;h=9b1a31922ac066ef0dffe36ebd6a6ba016567d69

Urgh, I forgot we merged it. I should have double-checked the tree. 
Apologies for that.

Cheers,

-- 
Julien Grall



* Re: dom0 Linux 5.8-rc5 kernel failing to initialize cooling maps for Allwinner H6 SoC
  2020-07-24 10:45 ` Julien Grall
  2020-07-24 11:17   ` Amit Tomer
@ 2020-07-24 11:20   ` Alejandro
  2020-07-26 20:24     ` André Przywara
  1 sibling, 1 reply; 11+ messages in thread
From: Alejandro @ 2020-07-24 11:20 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, Stefano Stabellini, Volodymyr Babchuk, Andre Przywara

Hello, and thanks for the response.

On Fri, 24 Jul 2020 at 12:45, Julien Grall (<julien@xen.org>) wrote:
> > I'm trying Xen 4.13.1 in a Allwinner H6 SoC (more precisely a Pine H64
> > model B, with a ARM Cortex-A53 CPU).
> > I managed to get a dom0 Linux 5.8-rc5 kernel running fine, unpatched,
> > and I'm using the upstream device tree for
> > my board. However, the dom0 kernel has trouble when reading some DT
> > nodes that are related to the CPUs, and
> > it can't initialize the thermal subsystem properly, which is a kind of
> > showstopper for me, because I'm concerned
> > that letting the CPU run at the maximum frequency without watching out
> > its temperature may cause overheating.
>
> I understand this concern, I am aware of some efforts to get CPUFreq
> working on Xen but I am not sure if there is anything available yet. I
> have CCed a couple of more person that may be able to help here.

Thank you for the CCs. I hope they can bring some insight on this :)

> > The relevant kernel messages are:
> >
> > [  +0.001959] sun50i-cpufreq-nvmem: probe of sun50i-cpufreq-nvmem
> > failed with error -2
> > ...
> > [  +0.003053] hw perfevents: failed to parse interrupt-affinity[0] for pmu
> > [  +0.000043] hw perfevents: /pmu: failed to register PMU devices!
> > [  +0.000037] armv8-pmu: probe of pmu failed with error -22
>
> I am not sure the PMU failure is related to the thermal failure below.

I'm not sure either, but after comparing the kernel messages for a
boot with and without Xen, those were the differences (excluding, of
course, the messages that inform that the Xen hypervisor console is
being used and such). For the sake of completeness, I decided to
mention it anyway.

> > [  +0.000163] OF: /thermal-zones/cpu-thermal/cooling-maps/map0: could
> > not find phandle
> > [  +0.000063] thermal_sys: failed to build thermal zone cpu-thermal: -22
> Would it be possible to paste the device-tree node for
> /thermal-zones/cpu-thermal/cooling-maps? I suspect the issue is because
> we recreated /cpus from scratch.
>
> I don't know much about how the thermal subsystem works, but I suspect
> this will not be enough to get it working properly on Xen. For a
> workaround, you would need to create a dom0 with the same numbers of
> vCPU as the numbers of pCPUs. They would also need to be pinned.
>
> I will leave the others to fill in more details.

I think I should mention that I've tried to hackily fix things by
removing the make_cpus_node call in handle_node
(https://github.com/xen-project/xen/blob/master/xen/arch/arm/domain_build.c#L1585),
after removing the /cpus node from the skip_matches array. This way, the
original /cpus node was passed through, without being recreated by Xen.
Of course, I made sure that dom0 used the same number of vCPUs as pCPUs,
because otherwise things would probably blow up; luckily that was not a
compromise for me. The end result was that the aforementioned kernel
error messages were gone, and the thermal subsystem worked fine again.
However, this time the cpufreq-dt probe failed, with what I think was an
ENODEV error. This left the CPU locked at the boot frequency of less
than 1 GHz, compared to the maximum 1.8 GHz frequency that the SoC
supports, which has bad implications for performance.

Therefore, as it seems that passing more properties (like
#cooling-cells) is enough to get temperatures working, I suspect that
fixing the thermal issue is relatively easy, at least for my SoC. But
maybe I have just been lucky and that's not supposed to work anyway;
I'm not sure.
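
For reference, the pieces of the mainline H6 device tree involved here 
look roughly like this (heavily trimmed; labels and values are 
illustrative rather than copied verbatim from sun50i-h6.dtsi):

  cpu0: cpu@0 {
      compatible = "arm,cortex-a53";
      ...
      #cooling-cells = <2>;
  };

  thermal-zones {
      cpu-thermal {
          thermal-sensors = <&ths 0>;
          ...
          cooling-maps {
              map0 {
                  trip = <&cpu_alert>;
                  cooling-device = <&cpu0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
              };
          };
      };
  };

Since map0's cooling-device phandle points at &cpu0, and the /cpus node 
that Xen recreates for dom0 carries no #cooling-cells (nor any of the 
other cpufreq-related properties), the phandle lookup fails, which would 
explain the "could not find phandle" error above.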

> >
> > I've searched for issues, code or commits that may be related for this
> > issue. The most relevant things I found are:
> >
> > - A patch that blacklists the A53 PMU:
> > https://patchwork.kernel.org/patch/10899881/
> > - The handle_node function in xen/arch/arm/domain_build.c:
> > https://github.com/xen-project/xen/blob/master/xen/arch/arm/domain_build.c#L1427
>
> I remember this discussion. The problem was that the PMU is using
> per-CPU interrupts. Xen is not yet able to handle PPIs as they often
> requires more context to be saved/restored (in this case the PMU context).
>
> There was a proposal to look if a device is using PPIs and just remove
> them from the Device-Tree. Unfortunately, I haven't seen any official
> submission for this patch.
>
> Did you have to apply the patch to boot up? If not, then the error above
> shouldn't be a concern. However, if you need PMU support for the using
> thermal devices then it is going to require some work.


No, I didn't apply any patch to Xen whatsoever. It worked fine out of
the box. As I mentioned above, with a more complete /cpus node
declaration, the thermal subsystem works. I guess the PMU worked fine
too, but I didn't test it in any way, so maybe it is just barely able
to probe successfully somehow.

> > I've thought about removing "/cpus" from the skip_matches array in the
> > handle_node function, but I'm not sure
> > that would be a good fix.
>
> The node "/cpus" and its sub-node are recreated by Xen for Dom0. This is
> because Dom0 may have a different numbers of vCPUs and it doesn't seen
> the pCPUs.
>
> If you don't skip "/cpus" from the host DT then you would end up with
> two "/cpus" path in your dom0 DT. Mostly likely, Linux will not be happy
> with it.

Indeed, that is consistent with my observations of how the source code
works. Thanks for the confirmation :)

> I vaguely remember some discussions on how to deal with CPUFreq in Xen.
> IIRC we agreed that Dom0 should be part of the equation because it
> already contains all the drivers. However, I can't remember if we agreed
> how the dom0 would be made aware of the pCPUs.

That makes sense. Supporting every existing thermal and cpufreq method
in every ARM SoC seems like a lot of unneeded duplication of work,
given that Linux already has pretty good support for that. But, if
that's the case, I guess we should not mark the "dom0-kernel" cpufreq
boot parameter as deprecated in the documentation, at least for the
ARM platform: http://xenbits.xen.org/docs/unstable/misc/xen-command-line.html#cpufreq



* Re: dom0 Linux 5.8-rc5 kernel failing to initialize cooling maps for Allwinner H6 SoC
  2020-07-24 11:20   ` Alejandro
@ 2020-07-26 20:24     ` André Przywara
  2020-07-28 10:39       ` Alejandro
  0 siblings, 1 reply; 11+ messages in thread
From: André Przywara @ 2020-07-26 20:24 UTC (permalink / raw)
  To: Alejandro, Julien Grall; +Cc: xen-devel, Stefano Stabellini, Volodymyr Babchuk

On 24/07/2020 12:20, Alejandro wrote:

Hi,

> On Fri, 24 Jul 2020 at 12:45, Julien Grall (<julien@xen.org>) wrote:
>>> I'm trying Xen 4.13.1 in a Allwinner H6 SoC (more precisely a Pine H64
>>> model B, with a ARM Cortex-A53 CPU).
>>> I managed to get a dom0 Linux 5.8-rc5 kernel running fine, unpatched,
>>> and I'm using the upstream device tree for
>>> my board. However, the dom0 kernel has trouble when reading some DT
>>> nodes that are related to the CPUs, and
>>> it can't initialize the thermal subsystem properly, which is a kind of
>>> showstopper for me, because I'm concerned
>>> that letting the CPU run at the maximum frequency without watching out
>>> its temperature may cause overheating.
>>
>> I understand this concern, I am aware of some efforts to get CPUFreq
>> working on Xen but I am not sure if there is anything available yet. I
>> have CCed a couple of more person that may be able to help here.
> 
> Thank you for the CCs. I hope they can bring on some insight about this :)
> 
>>> The relevant kernel messages are:
>>>
>>> [  +0.001959] sun50i-cpufreq-nvmem: probe of sun50i-cpufreq-nvmem
>>> failed with error -2
>>> ...
>>> [  +0.003053] hw perfevents: failed to parse interrupt-affinity[0] for pmu
>>> [  +0.000043] hw perfevents: /pmu: failed to register PMU devices!
>>> [  +0.000037] armv8-pmu: probe of pmu failed with error -22
>>
>> I am not sure the PMU failure is related to the thermal failure below.
> 
> I'm not sure either, but after comparing the kernel messages for a
> boot with and without Xen, those were the differences (excluding, of
> course, the messages that inform that the Xen hypervisor console is
> being used and such). For the sake of completeness, I decided to
> mention it anyway.
> 
>>> [  +0.000163] OF: /thermal-zones/cpu-thermal/cooling-maps/map0: could
>>> not find phandle
>>> [  +0.000063] thermal_sys: failed to build thermal zone cpu-thermal: -22
>> Would it be possible to paste the device-tree node for
>> /thermal-zones/cpu-thermal/cooling-maps? I suspect the issue is because
>> we recreated /cpus from scratch.
>>
>> I don't know much about how the thermal subsystem works, but I suspect
>> this will not be enough to get it working properly on Xen. For a
>> workaround, you would need to create a dom0 with the same numbers of
>> vCPU as the numbers of pCPUs. They would also need to be pinned.
>>
>> I will leave the others to fill in more details.
> 
> I think I should mention that I've tried to hackily fix things by
> removing the make_cpus_node call on handle_node
> (https://github.com/xen-project/xen/blob/master/xen/arch/arm/domain_build.c#L1585),
> after removing the /cpus node from the skip_matches array. This way,
> the original /cpus node was passed through, without being recreated by
> Xen. Of course, I made sure that dom0 used the same number of vCPUs as
> pCPUs, because otherwise things would probably blow up, which luckily
> that was not a compromise for me. The end result was that the
> aforementioned kernel error messages were gone, and the thermal
> subsystem worked fine again. However, this time the cpufreq-dt probe
> failed, with what I think was an ENODEV error. This left the CPU
> locked at the boot frequency of less than 1 GHz, compared to the
> maximum 1.8 GHz frequency that the SoC supports, which has bad
> implications for performance.

So this was actually my first thought: The firmware (U-Boot SPL) sets up
some basic CPU frequency (888 MHz for H6 [1]), which is known to never
overheat the chip, even under full load. So any concern from your side
about the board or SoC overheating could be dismissed, with the current
mainline code, at least. However, you lose the full speed, by quite a
margin on the H6 (on the A64 it's only 816 vs 1200(ish) MHz).
However, without the clock entries in the CPU node, the frequency would
never be changed by Dom0 anyway (nor by Xen, which doesn't even know how
to do this).
So from a practical point of view: unless you hack Xen to pass on more
cpu node properties, you are stuck at 888 MHz anyway, and don't need to
worry about overheating.

Now if you were to pass CPU clock frequency control on to Dom0, you
would run into more issues: the Linux governors would probably try to
set up both frequency and voltage based on load, BUT this would be
Dom0's bogus perception of the actual system load. Even with pinned Dom0
vCPUs, a busy system might spend most of its CPU time in DomU vCPUs,
which probably makes it look mostly idle in Dom0. Using a fixed governor
(performance) would avoid this, at the cost of running at full speed all
of the time, probably needlessly.
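
(If cpufreq did come up in Dom0, pinning the governor would just be the 
usual cpufreq sysfs poke, e.g. as root:

  echo performance | tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor

but as said, that trades power for never being clocked down.)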

So fixing the CPU clocking issue is more complex and requires more
ground work in Xen first, probably involving some enlightened Dom0
drivers as well. I haven't followed the latest developments in this
area, nor do I remember x86's answer to this, but it's not something
easy, I would presume.

Alejandro: can you try to measure the actual CPU frequency in Dom0?
Maybe some easy benchmark? "mhz" from lmbench does a great job in
telling you the actual frequency, just by clever measurement. But any
other CPU bound benchmark would do, if you compare bare metal Linux vs.
Dom0.
Also, does cpufreq come up in Dom0 at all? Can you select governors and
frequencies?
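
For instance, something like this in Dom0 and on bare-metal Linux for 
comparison (assuming lmbench is built; the sysfs path is the standard 
cpufreq one and only appears if a cpufreq driver probed):

  ./mhz
  ls /sys/devices/system/cpu/cpu0/cpufreq/
  cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq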

Cheers,
Andre.

> Therefore, as it seems that passing more properties (like
> #cooling-cells) is enough to get temperatures working, I suspect that
> fixing the thermal issue is relatively easy, at least for my SoC. But
> maybe I have just been lucky and that's not supposed to work anyway;
> I'm not sure.
> 
>>>
>>> I've searched for issues, code or commits that may be related for this
>>> issue. The most relevant things I found are:
>>>
>>> - A patch that blacklists the A53 PMU:
>>> https://patchwork.kernel.org/patch/10899881/
>>> - The handle_node function in xen/arch/arm/domain_build.c:
>>> https://github.com/xen-project/xen/blob/master/xen/arch/arm/domain_build.c#L1427
>>
>> I remember this discussion. The problem was that the PMU is using
>> per-CPU interrupts. Xen is not yet able to handle PPIs as they often
>> requires more context to be saved/restored (in this case the PMU context).
>>
>> There was a proposal to look if a device is using PPIs and just remove
>> them from the Device-Tree. Unfortunately, I haven't seen any official
>> submission for this patch.
>>
>> Did you have to apply the patch to boot up? If not, then the error above
>> shouldn't be a concern. However, if you need PMU support for the using
>> thermal devices then it is going to require some work.
> 
> No, I didn't apply any patch to Xen whatsoever. It worked fine out of
> the box. As I mentioned above, with a more complete /cpus node
> declaration, the thermal subsystem works. I guess the PMU worked fine
> too, but I didn't test it in any way, so maybe it is just barely able
> to probe successfully somehow.
> 
>>> I've thought about removing "/cpus" from the skip_matches array in the
>>> handle_node function, but I'm not sure
>>> that would be a good fix.
>>
>> The node "/cpus" and its sub-node are recreated by Xen for Dom0. This is
>> because Dom0 may have a different numbers of vCPUs and it doesn't seen
>> the pCPUs.
>>
>> If you don't skip "/cpus" from the host DT then you would end up with
>> two "/cpus" path in your dom0 DT. Mostly likely, Linux will not be happy
>> with it.
> 
> Indeed, that is consistent with my observations of how the source code
> works. Thanks for the confirmation :)
> 
>> I vaguely remember some discussions on how to deal with CPUFreq in Xen.
>> IIRC we agreed that Dom0 should be part of the equation because it
>> already contains all the drivers. However, I can't remember if we agreed
>> how the dom0 would be made aware of the pCPUs.
> 
> That makes sense. Supporting every existing thermal and cpufreq method
> in every ARM SoC seems like a lot of unneeded duplication of work,
> provided that Linux already has pretty good support for that. But, if
> that's the case, I guess we should not mark the "dom0-kernel" cpufreq
> boot parameter as deprecated in the documentation, at least for the
> ARM platform: http://xenbits.xen.org/docs/unstable/misc/xen-command-line.html#cpufreq
> 




* Re: dom0 Linux 5.8-rc5 kernel failing to initialize cooling maps for Allwinner H6 SoC
  2020-07-26 20:24     ` André Przywara
@ 2020-07-28 10:39       ` Alejandro
  2020-07-28 11:17         ` André Przywara
  0 siblings, 1 reply; 11+ messages in thread
From: Alejandro @ 2020-07-28 10:39 UTC (permalink / raw)
  To: André Przywara
  Cc: xen-devel, Stefano Stabellini, Julien Grall, Volodymyr Babchuk

Hello,

On Sun, 26 Jul 2020 at 22:25, André Przywara
(<andre.przywara@arm.com>) wrote:
> So this was actually my first thought: The firmware (U-Boot SPL) sets up
> some basic CPU frequency (888 MHz for H6 [1]), which is known to never
> overheat the chip, even under full load. So any concern from your side
> about the board or SoC overheating could be dismissed, with the current
> mainline code, at least. However you lose the full speed, by quite a
> margin on the H6 (on the A64 it's only 816 vs 1200(ish) MHz).
> However, without the clock entries in the CPU node, the frequency would
> never be changed by Dom0 anyway (nor by Xen, which doesn't even know how
> to do this).
> So from a practical point of view: unless you hack Xen to pass on more
> cpu node properties, you are stuck at 888 MHz anyway, and don't need to
> worry about overheating.

Thank you. Knowing that at least it won't overheat is a relief. But the
performance definitely suffers from the current situation, and quite a
bit. I'm thinking about using KVM instead: even if it does less
paravirtualization of guests, I'm sure that the ability to use the
maximum frequency of the CPU would offset the additional overhead, and
in general offer better performance. But with KVM I lose the ability to
have individual domUs dedicated to some device driver, which is a nice
thing to have from a security standpoint.

> Now if you would pass on the CPU clock frequency control to Dom0, you
> run into more issues: the Linux governors would probably try to setup
> both frequency and voltage based on load, BUT this would be Dom0's bogus
> perception of the actual system load. Even with pinned Dom0 vCPUs, a
> busy system might spend most of its CPU time in DomU VCPUs, which
> probably makes it look mostly idle in Dom0. Using a fixed governor
> (performance) would avoid this, at the cost of running full speed all of
> the time, probably needlessly.
>
> So fixing the CPU clocking issue is more complex and requires more
> ground work in Xen first, probably involving some enlightenend Dom0
> drivers as well. I didn't follow latest developments in this area, nor
> do I remember x86's answer to this, but it's not something easy, I would
> presume.

I understand, thanks :). I know that recent Intel CPUs (from Sandy
Bridge onwards) use P-states to manage frequencies, and even have a mode
of operation that lets the CPU select the P-states by itself. On older
processors, Xen can probably rely on ACPI data to do the frequency
scaling. But the most similar "standard thing" that my board has, an
AR100 coprocessor which, with the (work-in-progress) Crust firmware, can
be used via SCMI, doesn't even seem to support the use case of changing
the CPU frequency... and SCMI is the most promising approach for adding
DVFS support in Xen on ARM, according to this previous work:
https://www.slideshare.net/xen_com_mgr/xpdds18-cpufreq-in-xen-on-arm-oleksandr-tyshchenko-epam-systems

> Alejandro: can you try to measure the actual CPU frequency in Dom0?
> Maybe some easy benchmark? "mhz" from lmbench does a great job in
> telling you the actual frequency, just by clever measurement. But any
> other CPU bound benchmark would do, if you compare bare metal Linux vs.
> Dom0.

I have measured the CPU frequency in Dom0 using lmbench several times
and it seems to be stuck at 888 MHz, the frequency set by U-Boot.
Overall, the system feels more sluggish than when using bare Linux,
too. It doesn't matter if I apply the "hacky fix" I mentioned before
or not.

> Also, does cpufreq come up in Dom0 at all? Can you select governors and
> frequencies?

It doesn't come up, and no sysfs entries are created for cpufreq. With
the "fix", the kernel prints an error message complaining that it
couldn't probe cpufreq-dt, but it still doesn't come up, and sysfs
entries for cpufreq aren't created either.



* Re: dom0 Linux 5.8-rc5 kernel failing to initialize cooling maps for Allwinner H6 SoC
  2020-07-28 10:39       ` Alejandro
@ 2020-07-28 11:17         ` André Przywara
  2020-07-28 18:14           ` Stefano Stabellini
  0 siblings, 1 reply; 11+ messages in thread
From: André Przywara @ 2020-07-28 11:17 UTC (permalink / raw)
  To: Alejandro; +Cc: xen-devel, Stefano Stabellini, Julien Grall, Volodymyr Babchuk

On 28/07/2020 11:39, Alejandro wrote:
> Hello,
> 
> On Sun, 26 Jul 2020 at 22:25, André Przywara
> (<andre.przywara@arm.com>) wrote:
>> So this was actually my first thought: The firmware (U-Boot SPL) sets up
>> some basic CPU frequency (888 MHz for H6 [1]), which is known to never
>> overheat the chip, even under full load. So any concern from your side
>> about the board or SoC overheating could be dismissed, with the current
>> mainline code, at least. However you lose the full speed, by quite a
>> margin on the H6 (on the A64 it's only 816 vs 1200(ish) MHz).
>> However, without the clock entries in the CPU node, the frequency would
>> never be changed by Dom0 anyway (nor by Xen, which doesn't even know how
>> to do this).
>> So from a practical point of view: unless you hack Xen to pass on more
>> cpu node properties, you are stuck at 888 MHz anyway, and don't need to
>> worry about overheating.
> Thank you. Knowing that at least it won't overheat is a relief. But
> the performance definitely suffers from the current situation, and
> quite a bit. I'm thinking about using KVM instead: even if it does
> less paravirtualization of guests,

What is this statement based on? I think on ARM this never really
applied, and in general whether you do virtio or xen front-end/back-end
does not really matter. IMHO any reasoning about performance just based
on software architecture is mostly flawed (because it's complex and
reality might have missed some memos ;-) So just measure your particular
use case, then you know.

> I'm sure that the ability to use
> the maximum frequency of the CPU would offset the additional overhead,
> and in general offer better performance. But with KVM I lose the
> ability to have individual domU's dedicated to some device driver,
> which is a nice thing to have from a security standpoint.

I understand the theoretical merits, but a) does this really work on
your board and b) is this really more secure? What do you want to
protect against?

>> Now if you would pass on the CPU clock frequency control to Dom0, you
>> run into more issues: the Linux governors would probably try to setup
>> both frequency and voltage based on load, BUT this would be Dom0's bogus
>> perception of the actual system load. Even with pinned Dom0 vCPUs, a
>> busy system might spend most of its CPU time in DomU VCPUs, which
>> probably makes it look mostly idle in Dom0. Using a fixed governor
>> (performance) would avoid this, at the cost of running full speed all of
>> the time, probably needlessly.
>>
>> So fixing the CPU clocking issue is more complex and requires more
>> ground work in Xen first, probably involving some enlightenend Dom0
>> drivers as well. I didn't follow latest developments in this area, nor
>> do I remember x86's answer to this, but it's not something easy, I would
>> presume.
> I understand, thanks :). I know that recent Intel CPUs (from Sandy
> Bridge onwards) use P-states to manage frequencies, and even have a
> mode of operation that lets the CPU select the P-states by itself. On
> older processors, Xen can probably rely on ACPI data to do the
> frequency scaling. But the most similar "standard thing" that my board
> has, a AR100 coprocessor that with the (work in progress) Crust
> firmware can be used with SCMI, doesn't even seem to support the use
> case of changing CPU frequency... and SCMI is the most promising
> approach for adding DVFS support in Xen for ARM, according to this
> previous work: https://www.slideshare.net/xen_com_mgr/xpdds18-cpufreq-in-xen-on-arm-oleksandr-tyshchenko-epam-systems

So architecturally you could run all cores at full speed, always, and
tell Crust to clock down / decrease voltage once a thermal condition
triggers. That's not power-saving, but at least should be relatively safe.
On Allwinner platforms this isn't really bullet-proof, though, since the
THS device is non-secure, so anyone with access to the MMIO region could
turn it off. Or Dom0 could just turn the THS clock off - which it
actually does, because it's not used.
In the end it's a much bigger discussion about doing those things in
firmware or in the OS. For traditionally embedded platforms like
Allwinner there is, unfortunately, a large fraction of the community
that does not trust firmware, so moving responsibility into firmware is
not very popular upstream (been there, done that).

>> Alejandro: can you try to measure the actual CPU frequency in Dom0?
>> Maybe some easy benchmark? "mhz" from lmbench does a great job in
>> telling you the actual frequency, just by clever measurement. But any
>> other CPU bound benchmark would do, if you compare bare metal Linux vs.
>> Dom0.
> I have measured the CPU frequency in Dom0 using lmbench several times
> and it seems to be stuck at 888 MHz, the frequency set by U-Boot.
> Overall, the system feels more sluggish than when using bare Linux,
> too. It doesn't matter if I apply the "hacky fix" I mentioned before
> or not.
>
>> Also, does cpufreq come up in Dom0 at all? Can you select governors and
>> frequencies?
> It doesn't come up, and no sysfs entries are created for cpufreq. With
> the "fix", the kernel prints an error message complaining that it
> couldn't probe cpufreq-dt, but it still doesn't come up, and sysfs
> entries for cpufreq aren't created either.

I see, many thanks for doing this, as this seems to confirm my assumptions.

If you have good cooling in place, or always one hand on the power plug,
you could change U-Boot to bump up the CPU frequency (make menuconfig,
search for CONFIG_SYS_CLK_FREQ). Then you could at least see if your
observed performance issues are related to the core frequency. You might
need to adjust the CPU voltage, too.
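
A rough sketch of that experiment (the defconfig name and the value are 
just what I would try first, untested; whether 1.08 GHz is safe depends 
on the voltage the SPL programs):

  $ make pine_h64_defconfig
  $ make menuconfig          # search for SYS_CLK_FREQ and bump it, e.g.
  CONFIG_SYS_CLK_FREQ=1080000000
  $ make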

Cheers,
Andre



* Re: dom0 Linux 5.8-rc5 kernel failing to initialize cooling maps for Allwinner H6 SoC
  2020-07-28 11:17         ` André Przywara
@ 2020-07-28 18:14           ` Stefano Stabellini
  2020-07-28 18:52             ` Christopher Clark
  0 siblings, 1 reply; 11+ messages in thread
From: Stefano Stabellini @ 2020-07-28 18:14 UTC (permalink / raw)
  To: André Przywara
  Cc: xen-devel, Stefano Stabellini, Julien Grall, Volodymyr Babchuk,
	Alejandro


On Tue, 28 Jul 2020, André Przywara wrote:
> On 28/07/2020 11:39, Alejandro wrote:
> > Hello,
> > 
> > On Sun, 26 Jul 2020 at 22:25, André Przywara
> > (<andre.przywara@arm.com>) wrote:
> >> So this was actually my first thought: The firmware (U-Boot SPL) sets up
> >> some basic CPU frequency (888 MHz for H6 [1]), which is known to never
> >> overheat the chip, even under full load. So any concern from your side
> >> about the board or SoC overheating could be dismissed, with the current
> >> mainline code, at least. However you lose the full speed, by quite a
> >> margin on the H6 (on the A64 it's only 816 vs 1200(ish) MHz).
> >> However, without the clock entries in the CPU node, the frequency would
> >> never be changed by Dom0 anyway (nor by Xen, which doesn't even know how
> >> to do this).
> >> So from a practical point of view: unless you hack Xen to pass on more
> >> cpu node properties, you are stuck at 888 MHz anyway, and don't need to
> >> worry about overheating.
> > Thank you. Knowing that at least it won't overheat is a relief. But
> > the performance definitely suffers from the current situation, and
> > quite a bit. I'm thinking about using KVM instead: even if it does
> > less paravirtualization of guests,
> 
> What is this statement based on? I think on ARM this never really
> applied, and in general whether you do virtio or xen front-end/back-end
> does not really matter. IMHO any reasoning about performance just based
> on software architecture is mostly flawed (because it's complex and
> reality might have missed some memos ;-) So just measure your particular
> use case, then you know.
> 
> > I'm sure that the ability to use
> > the maximum frequency of the CPU would offset the additional overhead,
> > and in general offer better performance. But with KVM I lose the
> > ability to have individual domU's dedicated to some device driver,
> > which is a nice thing to have from a security standpoint.
> 
> I understand the theoretical merits, but a) does this really work on
> your board and b) is this really more secure? What do you want to
> protect against?

For "does it work on your board", the main obstacle is typically IOMMU
support to be able to do device assignment properly. That's definitely
something to check. If it doesn't work nowadays you can try to
workaround it by using direct 1:1 memory mappings [1].  However, for
security then you have to configure a MPU. I wonder if H6 has a MPU and
how it can be configured. In any case, something to keep in mind in case
the default IOMMU-based setup doesn't work for some reason for the
device you care about.

For "is this really more secure?", yes it is more secure as you are
running larger portions of the codebase in unprivileged mode and isolated
from each other with IOMMU (or MPU) protection. See what the OpenXT and
Qubes OS guys have been doing.


[1] https://marc.info/?l=xen-devel&m=158691258712815


* Re: dom0 Linux 5.8-rc5 kernel failing to initialize cooling maps for Allwinner H6 SoC
  2020-07-28 18:14           ` Stefano Stabellini
@ 2020-07-28 18:52             ` Christopher Clark
  2020-07-29  0:18               ` André Przywara
  0 siblings, 1 reply; 11+ messages in thread
From: Christopher Clark @ 2020-07-28 18:52 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: André Przywara, Julien Grall, Volodymyr Babchuk, Alejandro,
	xen-devel

On Tue, Jul 28, 2020 at 11:16 AM Stefano Stabellini
<sstabellini@kernel.org> wrote:
>
> On Tue, 28 Jul 2020, André Przywara wrote:
> > On 28/07/2020 11:39, Alejandro wrote:
> > > Hello,
> > >
> > > On Sun, 26 Jul 2020 at 22:25, André Przywara
> > > (<andre.przywara@arm.com>) wrote:
> > >> So this was actually my first thought: The firmware (U-Boot SPL) sets up
> > >> some basic CPU frequency (888 MHz for H6 [1]), which is known to never
> > >> overheat the chip, even under full load. So any concern from your side
> > >> about the board or SoC overheating could be dismissed, with the current
> > >> mainline code, at least. However you lose the full speed, by quite a
> > >> margin on the H6 (on the A64 it's only 816 vs 1200(ish) MHz).
> > >> However, without the clock entries in the CPU node, the frequency would
> > >> never be changed by Dom0 anyway (nor by Xen, which doesn't even know how
> > >> to do this).
> > >> So from a practical point of view: unless you hack Xen to pass on more
> > >> cpu node properties, you are stuck at 888 MHz anyway, and don't need to
> > >> worry about overheating.
> > > Thank you. Knowing that at least it won't overheat is a relief. But
> > > the performance definitely suffers from the current situation, and
> > > quite a bit. I'm thinking about using KVM instead: even if it does
> > > less paravirtualization of guests,
> >
> > What is this statement based on? I think on ARM this never really
> > applied, and in general whether you do virtio or xen front-end/back-end
> > does not really matter.

When you say "in general" here, this becomes a very broad statement
about virtio and xen front-end/back-ends being equivalent and
interchangable, and that could cause some misunderstanding for a
newcomer.

There are important differences between the isolation properties of
classic virtio and Xen's front-end/back-ends -- and also the Argo
transport. This is particularly important for Xen because it has
prioritized support for strong isolation between execution environments
to a greater extent than some other hypervisors. It is a critical
differentiator for it. The importance of isolation is why Xen 4.14's
headline feature was support for Linux stubdomains, upstreamed to Xen
after years of work by the Qubes and OpenXT communities.

> > IMHO any reasoning about performance just based
> > on software architecture is mostly flawed (because it's complex and
> > reality might have missed some memos ;-)

That's another pretty strong statement. Measurement is great, but
maybe performance analysis that is informed and directed by an
understanding of the architecture under test could potentially be more
rigorous and persuasive than work done without it?

> > So just measure your particular use case, then you know.

Hmm.

> > > I'm sure that the ability to use
> > > the maximum frequency of the CPU would offset the additional overhead,
> > > and in general offer better performance. But with KVM I lose the
> > > ability to have individual domU's dedicated to some device driver,
> > > which is a nice thing to have from a security standpoint.
> >
> > I understand the theoretical merits, but a) does this really work on
> > your board and b) is this really more secure? What do you want to
> > protect against?
>
> For "does it work on your board", the main obstacle is typically IOMMU
> support to be able to do device assignment properly. That's definitely
> something to check. If it doesn't work nowadays you can try to
> workaround it by using direct 1:1 memory mappings [1].  However, for
> security then you have to configure a MPU. I wonder if H6 has a MPU and
> how it can be configured. In any case, something to keep in mind in case
> the default IOMMU-based setup doesn't work for some reason for the
> device you care about.
>
> For "is this really more secure?", yes it is more secure as you are
> running larger portions of the codebase in unprivileged mode and isolated
> from each other with IOMMU (or MPU) protection. See what the OpenXT and
> Qubes OS guys have been doing.

Yes. Both projects have done quite a lot of work to enable and
maintain driver domains.

thanks,

Christopher

>
>
> [1] https://marc.info/?l=xen-devel&m=158691258712815



* Re: dom0 Linux 5.8-rc5 kernel failing to initialize cooling maps for Allwinner H6 SoC
  2020-07-28 18:52             ` Christopher Clark
@ 2020-07-29  0:18               ` André Przywara
  0 siblings, 0 replies; 11+ messages in thread
From: André Przywara @ 2020-07-29  0:18 UTC (permalink / raw)
  To: Christopher Clark, Stefano Stabellini
  Cc: xen-devel, Julien Grall, Volodymyr Babchuk, Alejandro

On 28/07/2020 19:52, Christopher Clark wrote:

Hi Christopher,

Wow, this quickly got out of hand. I never meant to downplay anyone's
work here, but on this particular platform some things might look a bit
different than normal. See below.

> On Tue, Jul 28, 2020 at 11:16 AM Stefano Stabellini
> <sstabellini@kernel.org> wrote:
>>
>> On Tue, 28 Jul 2020, André Przywara wrote:
>>> On 28/07/2020 11:39, Alejandro wrote:
>>>> Hello,
>>>>
>>>> On Sun, 26 Jul 2020 at 22:25, André Przywara
>>>> (<andre.przywara@arm.com>) wrote:
>>>>> So this was actually my first thought: The firmware (U-Boot SPL) sets up
>>>>> some basic CPU frequency (888 MHz for H6 [1]), which is known to never
>>>>> overheat the chip, even under full load. So any concern from your side
>>>>> about the board or SoC overheating could be dismissed, with the current
>>>>> mainline code, at least. However you lose the full speed, by quite a
>>>>> margin on the H6 (on the A64 it's only 816 vs 1200(ish) MHz).
>>>>> However, without the clock entries in the CPU node, the frequency would
>>>>> never be changed by Dom0 anyway (nor by Xen, which doesn't even know how
>>>>> to do this).
>>>>> So from a practical point of view: unless you hack Xen to pass on more
>>>>> cpu node properties, you are stuck at 888 MHz anyway, and don't need to
>>>>> worry about overheating.
>>>> Thank you. Knowing that at least it won't overheat is a relief. But
>>>> the performance definitely suffers from the current situation, and
>>>> quite a bit. I'm thinking about using KVM instead: even if it does
>>>> less paravirtualization of guests,
>>>
>>> What is this statement based on? I think on ARM this never really
>>> applied, and in general whether you do virtio or xen front-end/back-end
>>> does not really matter.
> 
> When you say "in general" here, this becomes a very broad statement
> about virtio and xen front-end/back-ends being equivalent and
> interchangable, and that could cause some misunderstanding for a
> newcomer.
> 
> There are important differences between the isolation properties of
> classic virtio and Xen's front-end/back-ends -- and also the Argo
> transport. It's particularly important for Xen because it has
> priortized support for stronger isolation between execution
> environments to a greater extent than some other hypervisors. It is a
> critical differentiator for it. The importance of isolation is why Xen
> 4.14's headline feature was support for Linux stubdomains, upstreamed
> to Xen after years of work by the Qubes and OpenXT communities.

He was talking about performance. My take on this was that this seems to
go back to the old days, when Xen was considered faster because of
paravirt (vs. trap&emulate h/w in QEMU). And this clearly does not apply
anymore, and never really applied to ARM.

>>> IMHO any reasoning about performance just based
>>> on software architecture is mostly flawed (because it's complex and
>>> reality might have missed some memos ;-)
> 
> That's another pretty strong statement. Measurement is great, but
> maybe performance analysis that is informed and directed by an
> understanding of the architecture under test could potentially be more
> rigorous and persuasive than work done without it?

You seem to draw quite a lot from my statement. All I was saying is
that modern systems are far too complex to reason about actual
performance based on some architectural ideas.
Also, my statement was in response to a generic statement, but of course
in this particular context. Please keep in mind that we are talking
about a US$5 TV-box SoC here, basically a toy platform. The chip has
severe architectural issues (secure devices not being secure, critical
devices not being isolated). I/O probably means an SD card at about
25 MB/s; the fastest I have seen is 80 MB/s on some better (but
optional!) eMMC modules. DRAM is via a single-channel 32-bit path. The
cores use an almost 8-year-old energy-efficient micro-architecture. So
whether any clever architecture really contributes to performance on
this system is somewhat questionable.

So I was suggesting that, before jumping to conclusions based on broad
architectural design ideas, an actual reality check of whether those
really apply to the platform might be warranted.
Also I haven't seen what kind of performance he is actually interested
in. Is the task at hand I/O bound, memory bound, CPU bound?
The discussion so far was about the CPU clock frequency only.

>>> So just measure your particular use case, then you know.
> 
> Hmm.

Is this questioning the usefulness of actual performance measurement? He
seems to be after a particular setup, so keeping an eye on the *actual*
performance outcome seems quite reasonable to me.

>>>> I'm sure that the ability to use
>>>> the maximum frequency of the CPU would offset the additional overhead,
>>>> and in general offer better performance. But with KVM I lose the
>>>> ability to have individual domU's dedicated to some device driver,
>>>> which is a nice thing to have from a security standpoint.
>>>
>>> I understand the theoretical merits, but a) does this really work on
>>> your board and b) is this really more secure? What do you want to
>>> protect against?
>>
>> For "does it work on your board", the main obstacle is typically IOMMU
>> support to be able to do device assignment properly. That's definitely
>> something to check. If it doesn't work nowadays you can try to
>> workaround it by using direct 1:1 memory mappings [1].  However, for
>> security then you have to configure a MPU. I wonder if H6 has a MPU and
>> how it can be configured. In any case, something to keep in mind in case
>> the default IOMMU-based setup doesn't work for some reason for the
>> device you care about.

It's even worse: this SoC only provides platform devices, which all rely
on at least pinctrl, clocks and regulators to function. All of this
functionality is provided via centralised devices, probably controlled
by Dom0 (or just one domain, anyway). The MMC controller, for instance,
needs to adjust the SD bus clock to the storage array dynamically, which
requires reprogramming the CCU. So I don't see how a driver domain would
conceptually work, without solving the very same problems that we just
faced with cpufreq here.

And of course this device does not have an IOMMU worth mentioning: there
is some device with that name, but it mostly provides scatter-gather
support for the video and display devices *only*.
The MMC controller has its own built-in DMA controller, so it can access
the *whole* of memory, including Xen's own.

>> For "is this really more secure?", yes it is more secure as you are
>> running larger portions of the codebase in unprivileged mode and isolated
>> from each other with IOMMU (or MPU) protection. See what the OpenXT and
>> Qubes OS guys have been doing.
> 
> Yes. Both projects have done quite a lot of work to enable and
> maintain driver domains.

Which don't work here, see above. Besides, I was genuinely interested in
the actual threat model here. What do we expect to go wrong and how
would putting the driver in its own domain help? (while considering the
platform's limitations)

Cheers,
Andre


