* [RFC PATCH 00/31] CPUFreq on ARM
@ 2017-11-09 17:09 Oleksandr Tyshchenko
  2017-11-09 17:09 ` [RFC PATCH 01/31] cpufreq: move cpufreq.h file to the xen/include/xen location Oleksandr Tyshchenko
                   ` (33 more replies)
  0 siblings, 34 replies; 108+ messages in thread
From: Oleksandr Tyshchenko @ 2017-11-09 17:09 UTC (permalink / raw)
  To: xen-devel
  Cc: Edgar E . Iglesias, Stefano Stabellini, Jassi Brar,
	Andrew Cooper, Julien Grall, Andre Przywara,
	Oleksandr Tyshchenko, Jan Beulich, Sudeep Holla

From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

Hi, all.

The purpose of this RFC patch series is to add CPUFreq support to Xen on ARM.
The motivation for hypervisor-based CPUFreq is to enable one of the main PM use cases in a virtualized system powered by the Xen hypervisor. The rationale is that CPU virtualization is done by the hypervisor, and the guest OS knows nothing about physical CPUs because it runs on virtual CPUs. The decision to change frequency should therefore be taken by the hypervisor, as only it has information about the actual CPU load. Although the required components (CPUFreq core, governors, etc.) already exist in Xen, they are ACPI specific. So, part of this patch series makes them more generic so that CPUFreq can be used on architectures without ACPI support.
But the main question we have to answer is about the frequency-changing interface in a virtualized system. The frequency-changing interface and all dependent components that CPUFreq needs in order to function on ARM are not present in Xen today. The list of required components is quite long and varies across ARM SoC vendors. As an example, the following components are involved in DVFS on the Renesas Salvator-X board, which has an R-Car Gen3 SoC installed: generic clock, regulator and thermal frameworks, the vendor’s CPG, PMIC, AVS and THS drivers, i2c support, etc.

We considered a few possible approaches to hypervisor-based CPUFreq on ARM and decided to base this solution on the ARM System Control and Power Interface (SCPI) protocol [1], which is currently popular and already upstream in Linux. We chose SCPI over the newer ARM System Control and Management Interface (SCMI) protocol [2] because it is widely used in Linux, there are good examples of how to use it, and its capabilities are sufficient for implementing hypervisor-based CPUFreq. What is more, upstream Linux support for SCMI is still missing, although SCMI could be used as well.

Briefly, the SCPI protocol is used between the System Control Processor (SCP) and the Application Processors (AP). The mailbox feature provides a mechanism for inter-processor communication between the SCP and the AP. The main purpose of the SCP is to offload various PM-related tasks from the AP, and one of the services the SCP provides is dynamic voltage and frequency scaling (DVFS), which is what we actually need for CPUFreq. I describe this approach in detail below.

Let me explain a bit more what these possible approaches are:

1. “Xen+hwdom” solution.
The GlobalLogic team proposed a split model [3], where a “hwdom-cpufreq” frontend driver in Xen interacts with the “xen-cpufreq” backend driver in the Linux hwdom (possibly dom0) in order to scale physical CPUs. This solution hasn’t been accepted by the Xen community yet, and it seems it will not be accepted without addressing the major questions that are still unanswered and proving that the “all-in-Xen” solution, which the Xen community considers the architecturally cleaner option, would be unworkable in practice.
The other reasons why we decided not to pursue this approach are the complex communication interface between Xen and hwdom (event channel, hypercalls, syscalls, passing CPU info via DT, etc.) and possible synchronization issues with the proposed solution.
It is worth mentioning, though, that the beauty of this approach was that there would be no need to port a lot of things to Xen. The frequency-changing interface and all dependent components that CPUFreq needs in order to function were already in place.
Although this approach is not used, I still picked a few already-acked patches that make the ACPI-specific CPUFreq code more generic.

2. “all-in-Xen” solution.
This implies that all CPUFreq related stuff should be located in Xen.
The community considered this solution an architecturally cleaner option than the “Xen+hwdom” one. There is no layering violation compared with the previous approach (letting a guest OS manage one or more physical CPUs is more of a layering violation).
This solution looks better but, to be honest, we are not in favor of it either. We expect an enormous development effort to get this support in (the scope of required components looks unrealistic) and to maintain it. So, we decided not to pursue this approach either.

3. “Xen+SCP(ARM TF)” solution.
This is yet another solution, based on the ARM SCPI protocol. The general idea is that firmware running on a dedicated IP core acts as a server and provides various PM services (DVFS, sensors, etc.). On the other side, a CPUFreq driver in Xen, running on the AP, acts as a client and consumes these services. The CPUFreq driver neither changes the CPU frequency/voltage by itself nor cooperates with Linux to do so. It just communicates with the SCP directly using the SCPI protocol. As I said before, a mailbox IP integrated into the SoC needs to be used for IPC (a doorbell for triggering an action and a shared memory region for commands). The CPUFreq driver doesn’t even need to know what must be physically changed for the new frequency to take effect. That is entirely the SCP’s responsibility. All this keeps the CPUFreq infrastructure in Xen on ARM from diving into the internals of each supported SoC and, as a result, from carrying a lot of code.
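
To make the client/server split above more concrete, here is a minimal sketch in C. It is only an illustrative model: the structure layout, the command IDs and all demo_* names are hypothetical and do not reflect the real SCPI message format or any code in this series. The "doorbell" is modelled as a direct function call, whereas real hardware would raise an interrupt on the SCP.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical command IDs, NOT the real SCPI command set. */
enum { DEMO_CMD_GET_DVFS = 1, DEMO_CMD_SET_DVFS = 2 };

#define DEMO_NR_DOMAINS 2   /* assumed number of DVFS domains */
#define DEMO_NR_OPPS    4   /* assumed number of operating points */

/* Shared memory region visible to both the client (Xen) and the server (SCP). */
struct demo_shmem {
    uint32_t cmd;        /* request: command ID */
    uint32_t domain;     /* request: DVFS domain */
    uint32_t opp_index;  /* request (SET) or response (GET) */
    int32_t  status;     /* response: 0 on success, -1 on error */
};

/* Server-side state: only the SCP knows the current operating point. */
static uint32_t demo_scp_cur_opp[DEMO_NR_DOMAINS];

/* Server: runs when the doorbell rings and processes the request in place. */
void demo_scp_handle_doorbell(struct demo_shmem *shm)
{
    shm->status = -1;
    if (shm->domain >= DEMO_NR_DOMAINS)
        return;

    switch (shm->cmd) {
    case DEMO_CMD_SET_DVFS:
        if (shm->opp_index >= DEMO_NR_OPPS)
            return;
        /* Real firmware would program clocks and regulators here. */
        demo_scp_cur_opp[shm->domain] = shm->opp_index;
        shm->status = 0;
        break;
    case DEMO_CMD_GET_DVFS:
        shm->opp_index = demo_scp_cur_opp[shm->domain];
        shm->status = 0;
        break;
    default:
        break;
    }
}

/* Client: fills the shared memory, rings the doorbell, reads the status. */
int demo_client_set_opp(struct demo_shmem *shm, uint32_t domain, uint32_t opp)
{
    assert(shm != NULL);
    shm->cmd = DEMO_CMD_SET_DVFS;
    shm->domain = domain;
    shm->opp_index = opp;
    demo_scp_handle_doorbell(shm);  /* stands in for ringing the doorbell */
    return shm->status;
}

int demo_client_get_opp(struct demo_shmem *shm, uint32_t domain, uint32_t *opp)
{
    assert(shm != NULL && opp != NULL);
    shm->cmd = DEMO_CMD_GET_DVFS;
    shm->domain = domain;
    demo_scp_handle_doorbell(shm);
    if (shm->status == 0)
        *opp = shm->opp_index;
    return shm->status;
}
```

Note that the client side never touches clocks or regulators; it only names a domain and an operating point, which is the property the approach relies on.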

The possible issue here is the SCP itself: a dedicated IP core may be absent altogether, or it may perform tasks other than PM. Fortunately, there is a brilliant solution: teach the firmware running at the EL3 exception level (ARM TF) to perform the SCP functions and use SMC calls for communication [4]. This is exactly the transport implementation I want to bring to Xen first. Such a solution should be generic across all ARM platforms that have firmware running at the EL3 exception level and don’t have a candidate for being the SCP.

Here we have a completely synchronous case because of the nature of SMC calls. The SMC-triggered mailbox driver emulates a mailbox which signals transmitted data via the Secure Monitor Call (SMC) instruction [5]. The mailbox receiver is implemented in firmware and synchronously returns data when it returns execution to the non-secure world. This allows us both to trigger a request and to transfer execution to the firmware code in a safe and architected way, like PSCI requests.
As you can see, this method is free from synchronization issues. What is more, it is architecturally cleaner than the “Xen+hwdom” split model. From the security point of view, I hope everything will be much more correct, since ARM TF, which we want to see in charge of controlling CPU frequency/voltage, is a trusted SW layer. Moreover, ARM TF is already responsible for enabling/disabling CPUs (PSCI) and nobody complains about that, so let it do DVFS too.
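
The synchronous behaviour described above can be sketched roughly as follows. This is a simplified model and not the driver from this series: the function ID, the two-register result layout and all demo_* names are assumptions made for illustration, and the firmware is replaced by an ordinary C function because a real SMC instruction cannot be executed in a plain user-space example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical function ID; a real one would be assigned by the firmware. */
#define DEMO_SMC_FUNC_ID 0x82000001u

/* Values handed back by the "firmware" (modelled on return registers). */
struct demo_smc_result {
    int64_t  status;  /* 0 on success, -1 if the function ID is unknown */
    uint64_t data;    /* response payload */
};

/* Stand-in for firmware entered via SMC; runs to completion before returning. */
struct demo_smc_result demo_firmware_smc(uint64_t func_id, uint64_t arg)
{
    struct demo_smc_result res = { -1, 0 };

    if (func_id == DEMO_SMC_FUNC_ID) {
        res.status = 0;
        res.data = arg + 1;  /* arbitrary "work" done by the firmware */
    }
    return res;
}

/* Minimal channel state for the emulated mailbox. */
struct demo_mbox_chan {
    uint64_t func_id;
    bool     tx_done;
    uint64_t rx_data;
};

/*
 * Send one message. When the call returns, the response is already
 * available, so no TX-done interrupt or polling loop is needed.
 */
int demo_mbox_send(struct demo_mbox_chan *chan, uint64_t msg)
{
    struct demo_smc_result res;

    assert(chan != NULL);
    chan->tx_done = false;

    res = demo_firmware_smc(chan->func_id, msg);
    if (res.status != 0)
        return -1;

    chan->rx_data = res.data;
    chan->tx_done = true;
    return 0;
}
```

Because the request and the response are handled within a single call, there is no window in which the sender has to wait for a completion signal, which is why this transport avoids the synchronization concerns of an asynchronous mailbox.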

I have to admit that I have tested only this (synchronous) variant, since I have no platform with a candidate for being the SCP. I hope that other ARM SoCs where a dedicated SCP is present (the asynchronous case) will work too, but with some limitations: the mailbox IPs for these ARM SoCs must have TX/RX-done IRQs. I have described in the corresponding patches why this limitation exists.

To be honest, I have Renesas R-Car Gen3 SoCs in mind as our nearest target, but I would like to make this solution as generic as possible. I don’t consider the proposed solution universally generic, but I hope it may be suitable for other ARM SoCs which meet these requirements. Anyway, having something which works but doesn’t cover all cases is better than having nothing.

I would like to note that the patches are in a PoC state, and I am posting them just to illustrate in more detail what I am talking about. The patch series consists of the following parts:
1. GL’s patches which make the ACPI-specific CPUFreq code more generic. Although these patches have already been acked by the Xen community and the CPUFreq code base hasn’t changed in the last few years, I dropped all A-b tags.
2. A bunch of device-tree helpers and macros.
3. Directly ported SCPI protocol, mailbox infrastructure and the ARM SMC-triggered mailbox driver. All components except the mailbox driver are in mainline Linux.
4. Xen changes to the directly ported code to make it compile. These changes don’t change functionality.
5. Some modifications to the directly ported code which slightly change functionality, mainly to restrict it.
6. SCPI based CPUFreq driver and CPUFreq interface component.
7. Misc patches mostly to ARM subsystem.
8. Patch from Volodymyr Babchuk which adds SMC wrapper.

Most important TODOs regarding the whole patch series:
1. Handle devm in the directly ported code. Currently, if any error occurs, previously allocated resources are not freed.
2. Thermal management integration.
3. Don't pass CPUFreq related nodes to dom0. Xen owns SCPI completely.
4. Handle CPU_TURBO frequencies if they are supported by HW.

You can find the whole patch series here:
repo: https://github.com/otyshchenko1/xen.git branch: cpufreq-devel1

P.S. There is no need to modify xenpm tool. It works out of the box on ARM.

[1]
Linux code:
http://elixir.free-electrons.com/linux/v4.14-rc6/source/drivers/firmware/arm_scpi.c
http://elixir.free-electrons.com/linux/v4.14-rc6/source/include/linux/scpi_protocol.h
http://elixir.free-electrons.com/linux/v4.14-rc6/source/Documentation/devicetree/bindings/arm/arm,scpi.txt

Recent protocol version:
http://infocenter.arm.com/help/topic/com.arm.doc.dui0922g/scp_message_interface_v1_2_DUI0922G_en.pdf

[2]
http://infocenter.arm.com/help/topic/com.arm.doc.den0056a/DEN0056A_System_Control_and_Management_Interface.pdf

[3]
Xen part:
https://lists.xen.org/archives/html/xen-devel/2014-11/msg00940.html
Linux part:
https://lists.xen.org/archives/html/xen-devel/2014-11/msg00944.html

[4]
http://linux-sunxi.narkive.com/qYWJqjXU/patch-v2-0-3-mailbox-arm-introduce-smc-triggered-mailbox

[5]
http://infocenter.arm.com/help/topic/com.arm.doc.den0028b/ARM_DEN0028B_SMC_Calling_Convention.pdf

Oleksandr Dmytryshyn (6):
  cpufreq: move cpufreq.h file to the xen/include/xen location
  pm: move processor_perf.h file to the xen/include/xen location
  pmstat: move pmstat.c file to the xen/drivers/pm/stat.c location
  cpufreq: make turbo settings to be configurable
  pmstat: make pmstat functions more generalizable
  cpufreq: make cpufreq driver more generalizable

Oleksandr Tyshchenko (24):
  xenpm: Clarify xenpm usage
  xen/device-tree: Add dt_count_phandle_with_args helper
  xen/device-tree: Add dt_property_for_each_string macros
  xen/device-tree: Add dt_property_read_u32_index helper
  xen/device-tree: Add dt_property_count_elems_of_size helper
  xen/device-tree: Add dt_property_read_string_helper and friends
  xen/arm: Add driver_data field to struct device
  xen/arm: Add DEVICE_MAILBOX device class
  xen/arm: Store device-tree node per cpu
  xen/arm: Add ARM System Control and Power Interface (SCPI) protocol
  xen/arm: Add mailbox infrastructure
  xen/arm: Introduce ARM SMC based mailbox
  xen/arm: Add common header file wrappers.h
  xen/arm: Add rxdone_auto flag to mbox_controller structure
  xen/arm: Add Xen changes to SCPI protocol
  xen/arm: Add Xen changes to mailbox infrastructure
  xen/arm: Add Xen changes to ARM SMC based mailbox
  xen/arm: Use non-blocking mode for SCPI protocol
  xen/arm: Don't set txdone_poll flag for ARM SMC mailbox
  cpufreq: hack: perf->states isn't a real guest handle on ARM
  xen/arm: Introduce SCPI based CPUFreq driver
  xen/arm: Introduce CPUFreq Interface component
  xen/arm: Build CPUFreq components
  xen/arm: Enable CPUFreq on ARM

Volodymyr Babchuk (1):
  arm: add SMC wrapper that is compatible with SMCCC

 MAINTAINERS                                  |    4 +-
 tools/misc/xenpm.c                           |    6 +-
 xen/arch/arm/Kconfig                         |    2 +
 xen/arch/arm/Makefile                        |    1 +
 xen/arch/arm/arm32/Makefile                  |    1 +
 xen/arch/arm/arm32/smc.S                     |   32 +
 xen/arch/arm/arm64/Makefile                  |    1 +
 xen/arch/arm/arm64/smc.S                     |   29 +
 xen/arch/arm/cpufreq/Makefile                |    5 +
 xen/arch/arm/cpufreq/arm-smc-mailbox.c       |  248 ++++++
 xen/arch/arm/cpufreq/arm_scpi.c              | 1191 ++++++++++++++++++++++++++
 xen/arch/arm/cpufreq/cpufreq_if.c            |  522 +++++++++++
 xen/arch/arm/cpufreq/mailbox.c               |  562 ++++++++++++
 xen/arch/arm/cpufreq/mailbox.h               |   28 +
 xen/arch/arm/cpufreq/mailbox_client.h        |   69 ++
 xen/arch/arm/cpufreq/mailbox_controller.h    |  161 ++++
 xen/arch/arm/cpufreq/scpi_cpufreq.c          |  328 +++++++
 xen/arch/arm/cpufreq/scpi_protocol.h         |  116 +++
 xen/arch/arm/cpufreq/wrappers.h              |  239 ++++++
 xen/arch/arm/smpboot.c                       |    5 +
 xen/arch/x86/Kconfig                         |    2 +
 xen/arch/x86/acpi/cpu_idle.c                 |    2 +-
 xen/arch/x86/acpi/cpufreq/cpufreq.c          |    2 +-
 xen/arch/x86/acpi/cpufreq/powernow.c         |    2 +-
 xen/arch/x86/acpi/power.c                    |    2 +-
 xen/arch/x86/cpu/mwait-idle.c                |    2 +-
 xen/arch/x86/platform_hypercall.c            |    2 +-
 xen/common/device_tree.c                     |  124 +++
 xen/common/sysctl.c                          |    2 +-
 xen/drivers/Kconfig                          |    2 +
 xen/drivers/Makefile                         |    1 +
 xen/drivers/acpi/Makefile                    |    1 -
 xen/drivers/acpi/pmstat.c                    |  526 ------------
 xen/drivers/cpufreq/Kconfig                  |    3 +
 xen/drivers/cpufreq/cpufreq.c                |  102 ++-
 xen/drivers/cpufreq/cpufreq_misc_governors.c |    2 +-
 xen/drivers/cpufreq/cpufreq_ondemand.c       |    4 +-
 xen/drivers/cpufreq/utility.c                |   13 +-
 xen/drivers/pm/Kconfig                       |    3 +
 xen/drivers/pm/Makefile                      |    1 +
 xen/drivers/pm/stat.c                        |  538 ++++++++++++
 xen/include/acpi/cpufreq/cpufreq.h           |  245 ------
 xen/include/acpi/cpufreq/processor_perf.h    |   63 --
 xen/include/asm-arm/device.h                 |    2 +
 xen/include/asm-arm/processor.h              |    4 +
 xen/include/public/platform.h                |    1 +
 xen/include/xen/cpufreq.h                    |  254 ++++++
 xen/include/xen/device_tree.h                |  158 ++++
 xen/include/xen/pmstat.h                     |    2 +
 xen/include/xen/processor_perf.h             |   69 ++
 50 files changed, 4822 insertions(+), 862 deletions(-)
 create mode 100644 xen/arch/arm/arm32/smc.S
 create mode 100644 xen/arch/arm/arm64/smc.S
 create mode 100644 xen/arch/arm/cpufreq/Makefile
 create mode 100644 xen/arch/arm/cpufreq/arm-smc-mailbox.c
 create mode 100644 xen/arch/arm/cpufreq/arm_scpi.c
 create mode 100644 xen/arch/arm/cpufreq/cpufreq_if.c
 create mode 100644 xen/arch/arm/cpufreq/mailbox.c
 create mode 100644 xen/arch/arm/cpufreq/mailbox.h
 create mode 100644 xen/arch/arm/cpufreq/mailbox_client.h
 create mode 100644 xen/arch/arm/cpufreq/mailbox_controller.h
 create mode 100644 xen/arch/arm/cpufreq/scpi_cpufreq.c
 create mode 100644 xen/arch/arm/cpufreq/scpi_protocol.h
 create mode 100644 xen/arch/arm/cpufreq/wrappers.h
 delete mode 100644 xen/drivers/acpi/pmstat.c
 create mode 100644 xen/drivers/pm/Kconfig
 create mode 100644 xen/drivers/pm/Makefile
 create mode 100644 xen/drivers/pm/stat.c
 delete mode 100644 xen/include/acpi/cpufreq/cpufreq.h
 delete mode 100644 xen/include/acpi/cpufreq/processor_perf.h
 create mode 100644 xen/include/xen/cpufreq.h
 create mode 100644 xen/include/xen/processor_perf.h

-- 
2.7.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


* [RFC PATCH 01/31] cpufreq: move cpufreq.h file to the xen/include/xen location
  2017-11-09 17:09 [RFC PATCH 00/31] CPUFreq on ARM Oleksandr Tyshchenko
@ 2017-11-09 17:09 ` Oleksandr Tyshchenko
  2017-12-02  0:35   ` Stefano Stabellini
  2017-11-09 17:09 ` [RFC PATCH 02/31] pm: move processor_perf.h " Oleksandr Tyshchenko
                   ` (32 subsequent siblings)
  33 siblings, 1 reply; 108+ messages in thread
From: Oleksandr Tyshchenko @ 2017-11-09 17:09 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Andrew Cooper, Oleksandr Dmytryshyn,
	Julien Grall, Oleksandr Tyshchenko, Jan Beulich

From: Oleksandr Dmytryshyn <oleksandr.dmytryshyn@globallogic.com>

The cpufreq driver should be more generic (not ACPI-specific),
so this header should be moved to a more suitable location.

This is a rebased version of the original patch:
https://lists.xen.org/archives/html/xen-devel/2014-11/msg00938.html

Signed-off-by: Oleksandr Dmytryshyn <oleksandr.dmytryshyn@globallogic.com>
Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
CC: Jan Beulich <jbeulich@suse.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Julien Grall <julien.grall@linaro.org>
---
 MAINTAINERS                                  |   1 +
 xen/arch/x86/acpi/cpu_idle.c                 |   2 +-
 xen/arch/x86/acpi/cpufreq/cpufreq.c          |   2 +-
 xen/arch/x86/acpi/cpufreq/powernow.c         |   2 +-
 xen/arch/x86/acpi/power.c                    |   2 +-
 xen/arch/x86/cpu/mwait-idle.c                |   2 +-
 xen/drivers/acpi/pmstat.c                    |   2 +-
 xen/drivers/cpufreq/cpufreq.c                |   2 +-
 xen/drivers/cpufreq/cpufreq_misc_governors.c |   2 +-
 xen/drivers/cpufreq/cpufreq_ondemand.c       |   4 +-
 xen/drivers/cpufreq/utility.c                |   2 +-
 xen/include/acpi/cpufreq/cpufreq.h           | 245 --------------------------
 xen/include/xen/cpufreq.h                    | 248 +++++++++++++++++++++++++++
 13 files changed, 260 insertions(+), 256 deletions(-)
 delete mode 100644 xen/include/acpi/cpufreq/cpufreq.h
 create mode 100644 xen/include/xen/cpufreq.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 5b9e123..524e067 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -295,6 +295,7 @@ X:	xen/arch/x86/acpi/boot.c
 X:	xen/arch/x86/acpi/lib.c
 F:	xen/drivers/cpufreq/
 F:	xen/include/acpi/cpufreq/
+F:	xen/include/xen/cpufreq.h
 
 PUBLIC I/O INTERFACES AND PV DRIVERS DESIGNS
 M:  Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
diff --git a/xen/arch/x86/acpi/cpu_idle.c b/xen/arch/x86/acpi/cpu_idle.c
index 482b8a7..c66622e 100644
--- a/xen/arch/x86/acpi/cpu_idle.c
+++ b/xen/arch/x86/acpi/cpu_idle.c
@@ -49,7 +49,7 @@
 #include <xen/softirq.h>
 #include <public/platform.h>
 #include <public/sysctl.h>
-#include <acpi/cpufreq/cpufreq.h>
+#include <xen/cpufreq.h>
 #include <asm/apic.h>
 #include <asm/cpuidle.h>
 #include <asm/mwait.h>
diff --git a/xen/arch/x86/acpi/cpufreq/cpufreq.c b/xen/arch/x86/acpi/cpufreq/cpufreq.c
index 1f8d02a..bd82025 100644
--- a/xen/arch/x86/acpi/cpufreq/cpufreq.c
+++ b/xen/arch/x86/acpi/cpufreq/cpufreq.c
@@ -41,7 +41,7 @@
 #include <asm/percpu.h>
 #include <asm/cpufeature.h>
 #include <acpi/acpi.h>
-#include <acpi/cpufreq/cpufreq.h>
+#include <xen/cpufreq.h>
 
 enum {
     UNDEFINED_CAPABLE = 0,
diff --git a/xen/arch/x86/acpi/cpufreq/powernow.c b/xen/arch/x86/acpi/cpufreq/powernow.c
index 8f1ac74..79f55a3 100644
--- a/xen/arch/x86/acpi/cpufreq/powernow.c
+++ b/xen/arch/x86/acpi/cpufreq/powernow.c
@@ -35,7 +35,7 @@
 #include <asm/percpu.h>
 #include <asm/cpufeature.h>
 #include <acpi/acpi.h>
-#include <acpi/cpufreq/cpufreq.h>
+#include <xen/cpufreq.h>
 
 #define CPUID_FREQ_VOLT_CAPABILITIES    0x80000007
 #define CPB_CAPABLE             0x00000200
diff --git a/xen/arch/x86/acpi/power.c b/xen/arch/x86/acpi/power.c
index 1e4e568..beebd5a 100644
--- a/xen/arch/x86/acpi/power.c
+++ b/xen/arch/x86/acpi/power.c
@@ -28,7 +28,7 @@
 #include <asm/tboot.h>
 #include <asm/apic.h>
 #include <asm/io_apic.h>
-#include <acpi/cpufreq/cpufreq.h>
+#include <xen/cpufreq.h>
 
 uint32_t system_reset_counter = 1;
 
diff --git a/xen/arch/x86/cpu/mwait-idle.c b/xen/arch/x86/cpu/mwait-idle.c
index 762dff1..29f0286 100644
--- a/xen/arch/x86/cpu/mwait-idle.c
+++ b/xen/arch/x86/cpu/mwait-idle.c
@@ -58,7 +58,7 @@
 #include <asm/hpet.h>
 #include <asm/mwait.h>
 #include <asm/msr.h>
-#include <acpi/cpufreq/cpufreq.h>
+#include <xen/cpufreq.h>
 
 #define MWAIT_IDLE_VERSION "0.4.1"
 #undef PREFIX
diff --git a/xen/drivers/acpi/pmstat.c b/xen/drivers/acpi/pmstat.c
index 2a6c4c7..2dbde1c 100644
--- a/xen/drivers/acpi/pmstat.c
+++ b/xen/drivers/acpi/pmstat.c
@@ -38,7 +38,7 @@
 #include <xen/acpi.h>
 
 #include <public/sysctl.h>
-#include <acpi/cpufreq/cpufreq.h>
+#include <xen/cpufreq.h>
 #include <xen/pmstat.h>
 
 DEFINE_PER_CPU_READ_MOSTLY(struct pm_px *, cpufreq_statistic_data);
diff --git a/xen/drivers/cpufreq/cpufreq.c b/xen/drivers/cpufreq/cpufreq.c
index 212f48f..ab909e2 100644
--- a/xen/drivers/cpufreq/cpufreq.c
+++ b/xen/drivers/cpufreq/cpufreq.c
@@ -43,7 +43,7 @@
 #include <asm/processor.h>
 #include <asm/percpu.h>
 #include <acpi/acpi.h>
-#include <acpi/cpufreq/cpufreq.h>
+#include <xen/cpufreq.h>
 
 static unsigned int __read_mostly usr_min_freq;
 static unsigned int __read_mostly usr_max_freq;
diff --git a/xen/drivers/cpufreq/cpufreq_misc_governors.c b/xen/drivers/cpufreq/cpufreq_misc_governors.c
index 746bbcd..4a5510c 100644
--- a/xen/drivers/cpufreq/cpufreq_misc_governors.c
+++ b/xen/drivers/cpufreq/cpufreq_misc_governors.c
@@ -18,7 +18,7 @@
 #include <xen/init.h>
 #include <xen/percpu.h>
 #include <xen/sched.h>
-#include <acpi/cpufreq/cpufreq.h>
+#include <xen/cpufreq.h>
 
 /*
  * cpufreq userspace governor
diff --git a/xen/drivers/cpufreq/cpufreq_ondemand.c b/xen/drivers/cpufreq/cpufreq_ondemand.c
index fe6c63d..1c384ec 100644
--- a/xen/drivers/cpufreq/cpufreq_ondemand.c
+++ b/xen/drivers/cpufreq/cpufreq_ondemand.c
@@ -1,5 +1,5 @@
 /*
- *  xen/arch/x86/acpi/cpufreq/cpufreq_ondemand.c
+ *  xen/drivers/cpufreq/cpufreq_ondemand.c
  *
  *  Copyright (C)  2001 Russell King
  *            (C)  2003 Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>.
@@ -18,7 +18,7 @@
 #include <xen/types.h>
 #include <xen/sched.h>
 #include <xen/timer.h>
-#include <acpi/cpufreq/cpufreq.h>
+#include <xen/cpufreq.h>
 
 #define DEF_FREQUENCY_UP_THRESHOLD              (80)
 #define MIN_FREQUENCY_UP_THRESHOLD              (11)
diff --git a/xen/drivers/cpufreq/utility.c b/xen/drivers/cpufreq/utility.c
index 53879fe..a687e5a 100644
--- a/xen/drivers/cpufreq/utility.c
+++ b/xen/drivers/cpufreq/utility.c
@@ -28,7 +28,7 @@
 #include <xen/sched.h>
 #include <xen/timer.h>
 #include <xen/trace.h>
-#include <acpi/cpufreq/cpufreq.h>
+#include <xen/cpufreq.h>
 #include <public/sysctl.h>
 
 struct cpufreq_driver   *cpufreq_driver;
diff --git a/xen/include/acpi/cpufreq/cpufreq.h b/xen/include/acpi/cpufreq/cpufreq.h
deleted file mode 100644
index a5cd7d0..0000000
--- a/xen/include/acpi/cpufreq/cpufreq.h
+++ /dev/null
@@ -1,245 +0,0 @@
-/*
- *  xen/include/acpi/cpufreq/cpufreq.h
- *
- *  Copyright (C) 2001 Russell King
- *            (C) 2002 - 2003 Dominik Brodowski <linux@brodo.de>
- *
- * $Id: cpufreq.h,v 1.36 2003/01/20 17:31:48 db Exp $
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- */
-
-#ifndef __XEN_CPUFREQ_PM_H__
-#define __XEN_CPUFREQ_PM_H__
-
-#include <xen/types.h>
-#include <xen/list.h>
-#include <xen/cpumask.h>
-
-#include "processor_perf.h"
-
-DECLARE_PER_CPU(spinlock_t, cpufreq_statistic_lock);
-
-extern bool_t cpufreq_verbose;
-
-struct cpufreq_governor;
-
-struct acpi_cpufreq_data {
-    struct processor_performance *acpi_data;
-    struct cpufreq_frequency_table *freq_table;
-    unsigned int arch_cpu_flags;
-};
-
-extern struct acpi_cpufreq_data *cpufreq_drv_data[NR_CPUS];
-
-struct cpufreq_cpuinfo {
-    unsigned int        max_freq;
-    unsigned int        second_max_freq;    /* P1 if Turbo Mode is on */
-    unsigned int        min_freq;
-    unsigned int        transition_latency; /* in 10^(-9) s = nanoseconds */
-};
-
-struct perf_limits {
-    bool_t no_turbo;
-    bool_t turbo_disabled;
-    uint32_t turbo_pct;
-    uint32_t max_perf_pct; /* max performance in percentage */
-    uint32_t min_perf_pct; /* min performance in percentage */
-    uint32_t max_perf;
-    uint32_t min_perf;
-    uint32_t max_policy_pct;
-    uint32_t min_policy_pct;
-};
-
-struct cpufreq_policy {
-    cpumask_var_t       cpus;          /* affected CPUs */
-    unsigned int        shared_type;   /* ANY or ALL affected CPUs
-                                          should set cpufreq */
-    unsigned int        cpu;           /* cpu nr of registered CPU */
-    struct cpufreq_cpuinfo    cpuinfo;
-
-    unsigned int        min;    /* in kHz */
-    unsigned int        max;    /* in kHz */
-    unsigned int        cur;    /* in kHz, only needed if cpufreq
-                                 * governors are used */
-    struct perf_limits  limits;
-    struct cpufreq_governor     *governor;
-
-    bool_t              resume; /* flag for cpufreq 1st run
-                                 * S3 wakeup, hotplug cpu, etc */
-    s8                  turbo;  /* tristate flag: 0 for unsupported
-                                 * -1 for disable, 1 for enabled
-                                 * See CPUFREQ_TURBO_* below for defines */
-    bool_t              aperf_mperf; /* CPU has APERF/MPERF MSRs */
-};
-DECLARE_PER_CPU(struct cpufreq_policy *, cpufreq_cpu_policy);
-
-extern int __cpufreq_set_policy(struct cpufreq_policy *data,
-                                struct cpufreq_policy *policy);
-
-#define CPUFREQ_SHARED_TYPE_NONE (0) /* None */
-#define CPUFREQ_SHARED_TYPE_HW   (1) /* HW does needed coordination */
-#define CPUFREQ_SHARED_TYPE_ALL  (2) /* All dependent CPUs should set freq */
-#define CPUFREQ_SHARED_TYPE_ANY  (3) /* Freq can be set from any dependent CPU*/
-
-/******************** cpufreq transition notifiers *******************/
-
-struct cpufreq_freqs {
-    unsigned int cpu;    /* cpu nr */
-    unsigned int old;
-    unsigned int new;
-    u8 flags;            /* flags of cpufreq_driver, see below. */
-};
-
-
-/*********************************************************************
- *                          CPUFREQ GOVERNORS                        *
- *********************************************************************/
-
-#define CPUFREQ_GOV_START  1
-#define CPUFREQ_GOV_STOP   2
-#define CPUFREQ_GOV_LIMITS 3
-
-struct cpufreq_governor {
-    char    name[CPUFREQ_NAME_LEN];
-    int     (*governor)(struct cpufreq_policy *policy,
-                        unsigned int event);
-    bool_t  (*handle_option)(const char *name, const char *value);
-    struct list_head governor_list;
-};
-
-extern struct cpufreq_governor *cpufreq_opt_governor;
-extern struct cpufreq_governor cpufreq_gov_dbs;
-extern struct cpufreq_governor cpufreq_gov_userspace;
-extern struct cpufreq_governor cpufreq_gov_performance;
-extern struct cpufreq_governor cpufreq_gov_powersave;
-
-extern struct list_head cpufreq_governor_list;
-
-extern int cpufreq_register_governor(struct cpufreq_governor *governor);
-extern struct cpufreq_governor *__find_governor(const char *governor);
-#define CPUFREQ_DEFAULT_GOVERNOR &cpufreq_gov_dbs
-
-/* pass a target to the cpufreq driver */
-extern int __cpufreq_driver_target(struct cpufreq_policy *policy,
-                                   unsigned int target_freq,
-                                   unsigned int relation);
-
-#define GOV_GETAVG     1
-#define USR_GETAVG     2
-extern int cpufreq_driver_getavg(unsigned int cpu, unsigned int flag);
-
-#define CPUFREQ_TURBO_DISABLED      -1
-#define CPUFREQ_TURBO_UNSUPPORTED   0
-#define CPUFREQ_TURBO_ENABLED       1
-
-extern int cpufreq_update_turbo(int cpuid, int new_state);
-extern int cpufreq_get_turbo_status(int cpuid);
-
-static __inline__ int 
-__cpufreq_governor(struct cpufreq_policy *policy, unsigned int event)
-{
-    return policy->governor->governor(policy, event);
-}
-
-
-/*********************************************************************
- *                      CPUFREQ DRIVER INTERFACE                     *
- *********************************************************************/
-
-#define CPUFREQ_RELATION_L 0  /* lowest frequency at or above target */
-#define CPUFREQ_RELATION_H 1  /* highest frequency below or at target */
-
-struct cpufreq_driver {
-    char   name[CPUFREQ_NAME_LEN];
-    int    (*init)(struct cpufreq_policy *policy);
-    int    (*verify)(struct cpufreq_policy *policy);
-    int    (*setpolicy)(struct cpufreq_policy *policy);
-    int    (*update)(int cpuid, struct cpufreq_policy *policy);
-    int    (*target)(struct cpufreq_policy *policy,
-                     unsigned int target_freq,
-                     unsigned int relation);
-    unsigned int    (*get)(unsigned int cpu);
-    unsigned int    (*getavg)(unsigned int cpu, unsigned int flag);
-    int    (*exit)(struct cpufreq_policy *policy);
-};
-
-extern struct cpufreq_driver *cpufreq_driver;
-
-int cpufreq_register_driver(struct cpufreq_driver *);
-
-static __inline__
-void cpufreq_verify_within_limits(struct cpufreq_policy *policy,
-                                  unsigned int min, unsigned int max)
-{
-    if (policy->min < min)
-        policy->min = min;
-    if (policy->max < min)
-        policy->max = min;
-    if (policy->min > max)
-        policy->min = max;
-    if (policy->max > max)
-        policy->max = max;
-    if (policy->min > policy->max)
-        policy->min = policy->max;
-    return;
-}
-
-
-/*********************************************************************
- *                     FREQUENCY TABLE HELPERS                       *
- *********************************************************************/
-
-#define CPUFREQ_ENTRY_INVALID ~0
-#define CPUFREQ_TABLE_END     ~1
-
-struct cpufreq_frequency_table {
-    unsigned int    index;     /* any */
-    unsigned int    frequency; /* kHz - doesn't need to be in ascending
-                                * order */
-};
-
-int cpufreq_frequency_table_cpuinfo(struct cpufreq_policy *policy,
-                   struct cpufreq_frequency_table *table);
-
-int cpufreq_frequency_table_verify(struct cpufreq_policy *policy,
-                   struct cpufreq_frequency_table *table);
-
-int cpufreq_frequency_table_target(struct cpufreq_policy *policy,
-                   struct cpufreq_frequency_table *table,
-                   unsigned int target_freq,
-                   unsigned int relation,
-                   unsigned int *index);
-
-
-/*********************************************************************
- *                     UNIFIED DEBUG HELPERS                         *
- *********************************************************************/
-
-struct cpu_dbs_info_s {
-    uint64_t prev_cpu_idle;
-    uint64_t prev_cpu_wall;
-    struct cpufreq_policy *cur_policy;
-    struct cpufreq_frequency_table *freq_table;
-    int cpu;
-    unsigned int enable:1;
-    unsigned int stoppable:1;
-    unsigned int turbo_enabled:1;
-};
-
-int cpufreq_governor_dbs(struct cpufreq_policy *policy, unsigned int event);
-int get_cpufreq_ondemand_para(uint32_t *sampling_rate_max,
-                              uint32_t *sampling_rate_min,
-                              uint32_t *sampling_rate,
-                              uint32_t *up_threshold);
-int write_ondemand_sampling_rate(unsigned int sampling_rate);
-int write_ondemand_up_threshold(unsigned int up_threshold);
-
-int write_userspace_scaling_setspeed(unsigned int cpu, unsigned int freq);
-
-void cpufreq_dbs_timer_suspend(void);
-void cpufreq_dbs_timer_resume(void);
-
-#endif /* __XEN_CPUFREQ_PM_H__ */
diff --git a/xen/include/xen/cpufreq.h b/xen/include/xen/cpufreq.h
new file mode 100644
index 0000000..ed38a6c
--- /dev/null
+++ b/xen/include/xen/cpufreq.h
@@ -0,0 +1,248 @@
+/*
+ *  xen/include/acpi/cpufreq/cpufreq.h
+ *
+ *  Copyright (C) 2001 Russell King
+ *            (C) 2002 - 2003 Dominik Brodowski <linux@brodo.de>
+ *
+ * $Id: cpufreq.h,v 1.36 2003/01/20 17:31:48 db Exp $
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef __XEN_CPUFREQ_PM_H__
+#define __XEN_CPUFREQ_PM_H__
+
+#include <xen/types.h>
+#include <xen/list.h>
+#include <xen/percpu.h>
+#include <xen/spinlock.h>
+#include <xen/errno.h>
+#include <xen/cpumask.h>
+
+#include <acpi/cpufreq/processor_perf.h>
+
+DECLARE_PER_CPU(spinlock_t, cpufreq_statistic_lock);
+
+extern bool_t cpufreq_verbose;
+
+struct cpufreq_governor;
+
+struct acpi_cpufreq_data {
+    struct processor_performance *acpi_data;
+    struct cpufreq_frequency_table *freq_table;
+    unsigned int arch_cpu_flags;
+};
+
+extern struct acpi_cpufreq_data *cpufreq_drv_data[NR_CPUS];
+
+struct cpufreq_cpuinfo {
+    unsigned int        max_freq;
+    unsigned int        second_max_freq;    /* P1 if Turbo Mode is on */
+    unsigned int        min_freq;
+    unsigned int        transition_latency; /* in 10^(-9) s = nanoseconds */
+};
+
+struct perf_limits {
+    bool_t no_turbo;
+    bool_t turbo_disabled;
+    uint32_t turbo_pct;
+    uint32_t max_perf_pct; /* max performance in percentage */
+    uint32_t min_perf_pct; /* min performance in percentage */
+    uint32_t max_perf;
+    uint32_t min_perf;
+    uint32_t max_policy_pct;
+    uint32_t min_policy_pct;
+};
+
+struct cpufreq_policy {
+    cpumask_var_t       cpus;          /* affected CPUs */
+    unsigned int        shared_type;   /* ANY or ALL affected CPUs
+                                          should set cpufreq */
+    unsigned int        cpu;           /* cpu nr of registered CPU */
+    struct cpufreq_cpuinfo    cpuinfo;
+
+    unsigned int        min;    /* in kHz */
+    unsigned int        max;    /* in kHz */
+    unsigned int        cur;    /* in kHz, only needed if cpufreq
+                                 * governors are used */
+    struct perf_limits  limits;
+    struct cpufreq_governor     *governor;
+
+    bool_t              resume; /* flag for cpufreq 1st run
+                                 * S3 wakeup, hotplug cpu, etc */
+    s8                  turbo;  /* tristate flag: 0 for unsupported
+                                 * -1 for disable, 1 for enabled
+                                 * See CPUFREQ_TURBO_* below for defines */
+    bool_t              aperf_mperf; /* CPU has APERF/MPERF MSRs */
+};
+DECLARE_PER_CPU(struct cpufreq_policy *, cpufreq_cpu_policy);
+
+extern int __cpufreq_set_policy(struct cpufreq_policy *data,
+                                struct cpufreq_policy *policy);
+
+#define CPUFREQ_SHARED_TYPE_NONE (0) /* None */
+#define CPUFREQ_SHARED_TYPE_HW   (1) /* HW does needed coordination */
+#define CPUFREQ_SHARED_TYPE_ALL  (2) /* All dependent CPUs should set freq */
+#define CPUFREQ_SHARED_TYPE_ANY  (3) /* Freq can be set from any dependent CPU*/
+
+/******************** cpufreq transition notifiers *******************/
+
+struct cpufreq_freqs {
+    unsigned int cpu;    /* cpu nr */
+    unsigned int old;
+    unsigned int new;
+    u8 flags;            /* flags of cpufreq_driver, see below. */
+};
+
+
+/*********************************************************************
+ *                          CPUFREQ GOVERNORS                        *
+ *********************************************************************/
+
+#define CPUFREQ_GOV_START  1
+#define CPUFREQ_GOV_STOP   2
+#define CPUFREQ_GOV_LIMITS 3
+
+struct cpufreq_governor {
+    char    name[CPUFREQ_NAME_LEN];
+    int     (*governor)(struct cpufreq_policy *policy,
+                        unsigned int event);
+    bool_t  (*handle_option)(const char *name, const char *value);
+    struct list_head governor_list;
+};
+
+extern struct cpufreq_governor *cpufreq_opt_governor;
+extern struct cpufreq_governor cpufreq_gov_dbs;
+extern struct cpufreq_governor cpufreq_gov_userspace;
+extern struct cpufreq_governor cpufreq_gov_performance;
+extern struct cpufreq_governor cpufreq_gov_powersave;
+
+extern struct list_head cpufreq_governor_list;
+
+extern int cpufreq_register_governor(struct cpufreq_governor *governor);
+extern struct cpufreq_governor *__find_governor(const char *governor);
+#define CPUFREQ_DEFAULT_GOVERNOR &cpufreq_gov_dbs
+
+/* pass a target to the cpufreq driver */
+extern int __cpufreq_driver_target(struct cpufreq_policy *policy,
+                                   unsigned int target_freq,
+                                   unsigned int relation);
+
+#define GOV_GETAVG     1
+#define USR_GETAVG     2
+extern int cpufreq_driver_getavg(unsigned int cpu, unsigned int flag);
+
+#define CPUFREQ_TURBO_DISABLED      -1
+#define CPUFREQ_TURBO_UNSUPPORTED   0
+#define CPUFREQ_TURBO_ENABLED       1
+
+extern int cpufreq_update_turbo(int cpuid, int new_state);
+extern int cpufreq_get_turbo_status(int cpuid);
+
+static __inline__ int 
+__cpufreq_governor(struct cpufreq_policy *policy, unsigned int event)
+{
+    return policy->governor->governor(policy, event);
+}
+
+
+/*********************************************************************
+ *                      CPUFREQ DRIVER INTERFACE                     *
+ *********************************************************************/
+
+#define CPUFREQ_RELATION_L 0  /* lowest frequency at or above target */
+#define CPUFREQ_RELATION_H 1  /* highest frequency below or at target */
+
+struct cpufreq_driver {
+    char   name[CPUFREQ_NAME_LEN];
+    int    (*init)(struct cpufreq_policy *policy);
+    int    (*verify)(struct cpufreq_policy *policy);
+    int    (*setpolicy)(struct cpufreq_policy *policy);
+    int    (*update)(int cpuid, struct cpufreq_policy *policy);
+    int    (*target)(struct cpufreq_policy *policy,
+                     unsigned int target_freq,
+                     unsigned int relation);
+    unsigned int    (*get)(unsigned int cpu);
+    unsigned int    (*getavg)(unsigned int cpu, unsigned int flag);
+    int    (*exit)(struct cpufreq_policy *policy);
+};
+
+extern struct cpufreq_driver *cpufreq_driver;
+
+int cpufreq_register_driver(struct cpufreq_driver *);
+
+static __inline__
+void cpufreq_verify_within_limits(struct cpufreq_policy *policy,
+                                  unsigned int min, unsigned int max)
+{
+    if (policy->min < min)
+        policy->min = min;
+    if (policy->max < min)
+        policy->max = min;
+    if (policy->min > max)
+        policy->min = max;
+    if (policy->max > max)
+        policy->max = max;
+    if (policy->min > policy->max)
+        policy->min = policy->max;
+    return;
+}
+
+
+/*********************************************************************
+ *                     FREQUENCY TABLE HELPERS                       *
+ *********************************************************************/
+
+#define CPUFREQ_ENTRY_INVALID ~0
+#define CPUFREQ_TABLE_END     ~1
+
+struct cpufreq_frequency_table {
+    unsigned int    index;     /* any */
+    unsigned int    frequency; /* kHz - doesn't need to be in ascending
+                                * order */
+};
+
+int cpufreq_frequency_table_cpuinfo(struct cpufreq_policy *policy,
+                   struct cpufreq_frequency_table *table);
+
+int cpufreq_frequency_table_verify(struct cpufreq_policy *policy,
+                   struct cpufreq_frequency_table *table);
+
+int cpufreq_frequency_table_target(struct cpufreq_policy *policy,
+                   struct cpufreq_frequency_table *table,
+                   unsigned int target_freq,
+                   unsigned int relation,
+                   unsigned int *index);
+
+
+/*********************************************************************
+ *                     UNIFIED DEBUG HELPERS                         *
+ *********************************************************************/
+
+struct cpu_dbs_info_s {
+    uint64_t prev_cpu_idle;
+    uint64_t prev_cpu_wall;
+    struct cpufreq_policy *cur_policy;
+    struct cpufreq_frequency_table *freq_table;
+    int cpu;
+    unsigned int enable:1;
+    unsigned int stoppable:1;
+    unsigned int turbo_enabled:1;
+};
+
+int cpufreq_governor_dbs(struct cpufreq_policy *policy, unsigned int event);
+int get_cpufreq_ondemand_para(uint32_t *sampling_rate_max,
+                              uint32_t *sampling_rate_min,
+                              uint32_t *sampling_rate,
+                              uint32_t *up_threshold);
+int write_ondemand_sampling_rate(unsigned int sampling_rate);
+int write_ondemand_up_threshold(unsigned int up_threshold);
+
+int write_userspace_scaling_setspeed(unsigned int cpu, unsigned int freq);
+
+void cpufreq_dbs_timer_suspend(void);
+void cpufreq_dbs_timer_resume(void);
+
+#endif /* __XEN_CPUFREQ_PM_H__ */
-- 
2.7.4
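
Note for readers of the moved header: the order of the checks in
cpufreq_verify_within_limits() decides what happens when a requested range
lies partly or wholly outside the allowed one. The sketch below is only an
illustration under stated assumptions: struct policy_limits, clamp_policy(),
demo_clamp() and all frequencies are hypothetical stand-ins, not part of this
patch. Only the sequence of comparisons is taken from the header.

```c
#include <assert.h>

/* Hypothetical stand-in for the two cpufreq_policy fields the helper touches. */
struct policy_limits {
    unsigned int min; /* kHz */
    unsigned int max; /* kHz */
};

/* Same sequence of comparisons as cpufreq_verify_within_limits(). */
static void clamp_policy(struct policy_limits *p,
                         unsigned int min, unsigned int max)
{
    if (p->min < min)
        p->min = min;
    if (p->max < min)
        p->max = min;
    if (p->min > max)
        p->min = max;
    if (p->max > max)
        p->max = max;
    if (p->min > p->max)
        p->min = p->max;
}

/* Self-check with made-up frequencies (kHz). */
static void demo_clamp(void)
{
    struct policy_limits p = { .min = 300000, .max = 2500000 };

    /* A request wider than the allowed range is narrowed on both ends. */
    clamp_policy(&p, 500000, 1500000);
    assert(p.min == 500000 && p.max == 1500000);

    /* A request entirely above the allowed range collapses onto its top. */
    p.min = 1800000;
    p.max = 2000000;
    clamp_policy(&p, 500000, 1500000);
    assert(p.min == 1500000 && p.max == 1500000);
}
```

A request entirely below the allowed range collapses onto the lower bound in
the same way, because both fields are raised to min by the first two checks.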


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


* [RFC PATCH 02/31] pm: move processor_perf.h file to the xen/include/xen location
  2017-11-09 17:09 [RFC PATCH 00/31] CPUFreq on ARM Oleksandr Tyshchenko
  2017-11-09 17:09 ` [RFC PATCH 01/31] cpufreq: move cpufreq.h file to the xen/include/xen location Oleksandr Tyshchenko
@ 2017-11-09 17:09 ` Oleksandr Tyshchenko
  2017-12-02  0:41   ` Stefano Stabellini
  2017-11-09 17:09 ` [RFC PATCH 03/31] pmstat: move pmstat.c file to the xen/drivers/pm/stat.c location Oleksandr Tyshchenko
                   ` (31 subsequent siblings)
  33 siblings, 1 reply; 108+ messages in thread
From: Oleksandr Tyshchenko @ 2017-11-09 17:09 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Andrew Cooper, Oleksandr Dmytryshyn,
	Julien Grall, Oleksandr Tyshchenko, Jan Beulich

From: Oleksandr Dmytryshyn <oleksandr.dmytryshyn@globallogic.com>

The cpufreq driver should be generic rather than ACPI-specific,
so move this header to a more suitable location.

This is a rebased version of the original patch:
https://lists.xen.org/archives/html/xen-devel/2014-11/msg00934.html

Signed-off-by: Oleksandr Dmytryshyn <oleksandr.dmytryshyn@globallogic.com>
Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
CC: Jan Beulich <jbeulich@suse.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Julien Grall <julien.grall@linaro.org>
---
 MAINTAINERS                               |  2 +-
 xen/arch/x86/platform_hypercall.c         |  2 +-
 xen/include/acpi/cpufreq/processor_perf.h | 63 -------------------------------
 xen/include/xen/cpufreq.h                 |  2 +-
 xen/include/xen/processor_perf.h          | 63 +++++++++++++++++++++++++++++++
 5 files changed, 66 insertions(+), 66 deletions(-)
 delete mode 100644 xen/include/acpi/cpufreq/processor_perf.h
 create mode 100644 xen/include/xen/processor_perf.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 524e067..9794a81 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -294,8 +294,8 @@ F:	xen/arch/x86/acpi/
 X:	xen/arch/x86/acpi/boot.c
 X:	xen/arch/x86/acpi/lib.c
 F:	xen/drivers/cpufreq/
-F:	xen/include/acpi/cpufreq/
 F:	xen/include/xen/cpufreq.h
+F:	xen/include/xen/processor_perf.h
 
 PUBLIC I/O INTERFACES AND PV DRIVERS DESIGNS
 M:  Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
diff --git a/xen/arch/x86/platform_hypercall.c b/xen/arch/x86/platform_hypercall.c
index ebc2f39..17c8304 100644
--- a/xen/arch/x86/platform_hypercall.c
+++ b/xen/arch/x86/platform_hypercall.c
@@ -25,7 +25,7 @@
 #include <xen/symbols.h>
 #include <asm/current.h>
 #include <public/platform.h>
-#include <acpi/cpufreq/processor_perf.h>
+#include <xen/processor_perf.h>
 #include <asm/edd.h>
 #include <asm/mtrr.h>
 #include <asm/io_apic.h>
diff --git a/xen/include/acpi/cpufreq/processor_perf.h b/xen/include/acpi/cpufreq/processor_perf.h
deleted file mode 100644
index d8a1ba6..0000000
--- a/xen/include/acpi/cpufreq/processor_perf.h
+++ /dev/null
@@ -1,63 +0,0 @@
-#ifndef __XEN_PROCESSOR_PM_H__
-#define __XEN_PROCESSOR_PM_H__
-
-#include <public/platform.h>
-#include <public/sysctl.h>
-#include <xen/acpi.h>
-
-#define XEN_PX_INIT 0x80000000
-
-int powernow_cpufreq_init(void);
-unsigned int powernow_register_driver(void);
-unsigned int get_measured_perf(unsigned int cpu, unsigned int flag);
-void cpufreq_residency_update(unsigned int, uint8_t);
-void cpufreq_statistic_update(unsigned int, uint8_t, uint8_t);
-int  cpufreq_statistic_init(unsigned int);
-void cpufreq_statistic_exit(unsigned int);
-void cpufreq_statistic_reset(unsigned int);
-
-int  cpufreq_limit_change(unsigned int);
-
-int  cpufreq_add_cpu(unsigned int);
-int  cpufreq_del_cpu(unsigned int);
-
-struct processor_performance {
-    uint32_t state;
-    uint32_t platform_limit;
-    struct xen_pct_register control_register;
-    struct xen_pct_register status_register;
-    uint32_t state_count;
-    struct xen_processor_px *states;
-    struct xen_psd_package domain_info;
-    uint32_t shared_type;
-
-    uint32_t init;
-};
-
-struct processor_pminfo {
-    uint32_t acpi_id;
-    uint32_t id;
-    struct processor_performance    perf;
-};
-
-extern struct processor_pminfo *processor_pminfo[NR_CPUS];
-
-struct px_stat {
-    uint8_t total;        /* total Px states */
-    uint8_t usable;       /* usable Px states */
-    uint8_t last;         /* last Px state */
-    uint8_t cur;          /* current Px state */
-    uint64_t *trans_pt;   /* Px transition table */
-    pm_px_val_t *pt;
-};
-
-struct pm_px {
-    struct px_stat u;
-    uint64_t prev_state_wall;
-    uint64_t prev_idle_wall;
-};
-
-DECLARE_PER_CPU(struct pm_px *, cpufreq_statistic_data);
-
-int cpufreq_cpu_init(unsigned int cpuid);
-#endif /* __XEN_PROCESSOR_PM_H__ */
diff --git a/xen/include/xen/cpufreq.h b/xen/include/xen/cpufreq.h
index ed38a6c..30c70c9 100644
--- a/xen/include/xen/cpufreq.h
+++ b/xen/include/xen/cpufreq.h
@@ -21,7 +21,7 @@
 #include <xen/errno.h>
 #include <xen/cpumask.h>
 
-#include <acpi/cpufreq/processor_perf.h>
+#include <xen/processor_perf.h>
 
 DECLARE_PER_CPU(spinlock_t, cpufreq_statistic_lock);
 
diff --git a/xen/include/xen/processor_perf.h b/xen/include/xen/processor_perf.h
new file mode 100644
index 0000000..d8a1ba6
--- /dev/null
+++ b/xen/include/xen/processor_perf.h
@@ -0,0 +1,63 @@
+#ifndef __XEN_PROCESSOR_PM_H__
+#define __XEN_PROCESSOR_PM_H__
+
+#include <public/platform.h>
+#include <public/sysctl.h>
+#include <xen/acpi.h>
+
+#define XEN_PX_INIT 0x80000000
+
+int powernow_cpufreq_init(void);
+unsigned int powernow_register_driver(void);
+unsigned int get_measured_perf(unsigned int cpu, unsigned int flag);
+void cpufreq_residency_update(unsigned int, uint8_t);
+void cpufreq_statistic_update(unsigned int, uint8_t, uint8_t);
+int  cpufreq_statistic_init(unsigned int);
+void cpufreq_statistic_exit(unsigned int);
+void cpufreq_statistic_reset(unsigned int);
+
+int  cpufreq_limit_change(unsigned int);
+
+int  cpufreq_add_cpu(unsigned int);
+int  cpufreq_del_cpu(unsigned int);
+
+struct processor_performance {
+    uint32_t state;
+    uint32_t platform_limit;
+    struct xen_pct_register control_register;
+    struct xen_pct_register status_register;
+    uint32_t state_count;
+    struct xen_processor_px *states;
+    struct xen_psd_package domain_info;
+    uint32_t shared_type;
+
+    uint32_t init;
+};
+
+struct processor_pminfo {
+    uint32_t acpi_id;
+    uint32_t id;
+    struct processor_performance    perf;
+};
+
+extern struct processor_pminfo *processor_pminfo[NR_CPUS];
+
+struct px_stat {
+    uint8_t total;        /* total Px states */
+    uint8_t usable;       /* usable Px states */
+    uint8_t last;         /* last Px state */
+    uint8_t cur;          /* current Px state */
+    uint64_t *trans_pt;   /* Px transition table */
+    pm_px_val_t *pt;
+};
+
+struct pm_px {
+    struct px_stat u;
+    uint64_t prev_state_wall;
+    uint64_t prev_idle_wall;
+};
+
+DECLARE_PER_CPU(struct pm_px *, cpufreq_statistic_data);
+
+int cpufreq_cpu_init(unsigned int cpuid);
+#endif /* __XEN_PROCESSOR_PM_H__ */
-- 
2.7.4
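
Note on struct px_stat in the moved header: trans_pt is described only as the
"Px transition table". The sketch below shows one plausible way to read it,
assuming a flattened total x total array of counters indexed row-major by
(from, to). That layout is an assumption, suggested by do_get_pm_info()
copying ct*ct entries of trans_pt to the guest. struct px_counts and the
px_counts_*() helpers are hypothetical and are not Xen functions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, trimmed-down counterpart of struct px_stat. */
struct px_counts {
    uint8_t total;   /* number of Px states */
    uint8_t last;    /* previous Px state */
    uint8_t cur;     /* current Px state */
    uint64_t *trans; /* total * total counters: row = from, column = to */
};

/* Allocate a zeroed transition matrix; returns 0 on success, -1 on failure. */
static int px_counts_init(struct px_counts *s, uint8_t total)
{
    s->trans = calloc((size_t)total * total, sizeof(*s->trans));
    if (!s->trans)
        return -1;
    s->total = total;
    s->last = 0;
    s->cur = 0;
    return 0;
}

/* Count one from -> to transition; out-of-range states are ignored. */
static void px_counts_record(struct px_counts *s, uint8_t from, uint8_t to)
{
    if (from >= s->total || to >= s->total)
        return;
    s->trans[(size_t)from * s->total + to]++;
    s->last = from;
    s->cur = to;
}

/* Read the counter for one from -> to pair (0 if out of range). */
static uint64_t px_counts_get(const struct px_counts *s,
                              uint8_t from, uint8_t to)
{
    if (from >= s->total || to >= s->total)
        return 0;
    return s->trans[(size_t)from * s->total + to];
}
```

With this layout a single flat copy of total * total entries hands the whole
matrix to a consumer, which can then index it the same way.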




* [RFC PATCH 03/31] pmstat: move pmstat.c file to the xen/drivers/pm/stat.c location
  2017-11-09 17:09 [RFC PATCH 00/31] CPUFreq on ARM Oleksandr Tyshchenko
  2017-11-09 17:09 ` [RFC PATCH 01/31] cpufreq: move cpufreq.h file to the xen/include/xen location Oleksandr Tyshchenko
  2017-11-09 17:09 ` [RFC PATCH 02/31] pm: move processor_perf.h " Oleksandr Tyshchenko
@ 2017-11-09 17:09 ` Oleksandr Tyshchenko
  2017-12-02  0:47   ` Stefano Stabellini
  2018-05-07 15:36   ` Jan Beulich
  2017-11-09 17:09 ` [RFC PATCH 04/31] cpufreq: make turbo settings to be configurable Oleksandr Tyshchenko
                   ` (30 subsequent siblings)
  33 siblings, 2 replies; 108+ messages in thread
From: Oleksandr Tyshchenko @ 2017-11-09 17:09 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Andrew Cooper, Oleksandr Dmytryshyn,
	Julien Grall, Oleksandr Tyshchenko, Jan Beulich

From: Oleksandr Dmytryshyn <oleksandr.dmytryshyn@globallogic.com>

The cpufreq driver should be generic rather than ACPI-specific,
so move pmstat.c to a more suitable location.

This is a rebased version of the original patch:
https://lists.xen.org/archives/html/xen-devel/2014-11/msg00935.html

Signed-off-by: Oleksandr Dmytryshyn <oleksandr.dmytryshyn@globallogic.com>
Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
CC: Jan Beulich <jbeulich@suse.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Julien Grall <julien.grall@linaro.org>
---
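Note on the moved code: get_cpufreq_para() compares the element counts
supplied by the caller with the real ones and, on a mismatch, writes back the
real counts and returns -EAGAIN. The sketch below models that size
negotiation in isolation. It is an illustration only: struct freq_query,
provider_get_freqs(), caller_get_freqs() and the frequency values are
hypothetical, and the retry loop on the caller side is an assumption about how
a consumer would use the interface, not code from this series.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical request: the caller states how many entries its buffer holds. */
struct freq_query {
    uint32_t freq_num;
    uint32_t *freqs;
};

static const uint32_t hw_freqs[] = { 500000, 1000000, 1500000 }; /* made-up kHz */
#define HW_FREQ_NUM (sizeof(hw_freqs) / sizeof(hw_freqs[0]))

/* Provider side: reject a wrong count, but report the right one back. */
static int provider_get_freqs(struct freq_query *q)
{
    uint32_t i;

    if (q->freq_num != HW_FREQ_NUM) {
        q->freq_num = HW_FREQ_NUM;
        return -EAGAIN;
    }
    for (i = 0; i < q->freq_num; i++)
        q->freqs[i] = hw_freqs[i];
    return 0;
}

/* Caller side: probe with a count of 0, then resize once and retry. */
static int caller_get_freqs(struct freq_query *q)
{
    int ret = -EAGAIN;
    int tries;

    q->freq_num = 0;
    q->freqs = NULL;
    for (tries = 0; tries < 2; tries++) {
        ret = provider_get_freqs(q);
        if (ret != -EAGAIN)
            return ret;
        /* The provider wrote the required count into q->freq_num. */
        free(q->freqs);
        q->freqs = calloc(q->freq_num, sizeof(*q->freqs));
        if (!q->freqs)
            return -ENOMEM;
    }
    return ret;
}
```

Bounding the retries keeps the caller from looping forever if the counts keep
changing between calls, for example across CPU hotplug.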
 MAINTAINERS               |   1 +
 xen/arch/x86/Kconfig      |   1 +
 xen/common/sysctl.c       |   2 +-
 xen/drivers/Kconfig       |   2 +
 xen/drivers/Makefile      |   1 +
 xen/drivers/acpi/Makefile |   1 -
 xen/drivers/acpi/pmstat.c | 526 ----------------------------------------------
 xen/drivers/pm/Kconfig    |   3 +
 xen/drivers/pm/Makefile   |   1 +
 xen/drivers/pm/stat.c     | 526 ++++++++++++++++++++++++++++++++++++++++++++++
 10 files changed, 536 insertions(+), 528 deletions(-)
 delete mode 100644 xen/drivers/acpi/pmstat.c
 create mode 100644 xen/drivers/pm/Kconfig
 create mode 100644 xen/drivers/pm/Makefile
 create mode 100644 xen/drivers/pm/stat.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 9794a81..87ade6f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -294,6 +294,7 @@ F:	xen/arch/x86/acpi/
 X:	xen/arch/x86/acpi/boot.c
 X:	xen/arch/x86/acpi/lib.c
 F:	xen/drivers/cpufreq/
+F:	xen/drivers/pm/
 F:	xen/include/xen/cpufreq.h
 F:	xen/include/xen/processor_perf.h
 
diff --git a/xen/arch/x86/Kconfig b/xen/arch/x86/Kconfig
index 30c2769..86c8eca 100644
--- a/xen/arch/x86/Kconfig
+++ b/xen/arch/x86/Kconfig
@@ -23,6 +23,7 @@ config X86
 	select HAS_PDX
 	select NUMA
 	select VGA
+	select HAS_PM
 
 config ARCH_DEFCONFIG
 	string
diff --git a/xen/common/sysctl.c b/xen/common/sysctl.c
index a6882d1..ac96347 100644
--- a/xen/common/sysctl.c
+++ b/xen/common/sysctl.c
@@ -171,7 +171,7 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
         op->u.availheap.avail_bytes <<= PAGE_SHIFT;
         break;
 
-#if defined (CONFIG_ACPI) && defined (CONFIG_HAS_CPUFREQ)
+#if defined (CONFIG_HAS_PM) && defined (CONFIG_HAS_CPUFREQ)
     case XEN_SYSCTL_get_pmstat:
         ret = do_get_pm_info(&op->u.get_pmstat);
         break;
diff --git a/xen/drivers/Kconfig b/xen/drivers/Kconfig
index bc3a54f..ddaec11 100644
--- a/xen/drivers/Kconfig
+++ b/xen/drivers/Kconfig
@@ -12,4 +12,6 @@ source "drivers/pci/Kconfig"
 
 source "drivers/video/Kconfig"
 
+source "drivers/pm/Kconfig"
+
 endmenu
diff --git a/xen/drivers/Makefile b/xen/drivers/Makefile
index 1939180..dd0b496 100644
--- a/xen/drivers/Makefile
+++ b/xen/drivers/Makefile
@@ -4,3 +4,4 @@ subdir-$(CONFIG_HAS_PCI) += pci
 subdir-$(CONFIG_HAS_PASSTHROUGH) += passthrough
 subdir-$(CONFIG_ACPI) += acpi
 subdir-$(CONFIG_VIDEO) += video
+subdir-$(CONFIG_HAS_PM) += pm
diff --git a/xen/drivers/acpi/Makefile b/xen/drivers/acpi/Makefile
index 444b11d..6f6470a 100644
--- a/xen/drivers/acpi/Makefile
+++ b/xen/drivers/acpi/Makefile
@@ -5,7 +5,6 @@ subdir-$(CONFIG_X86) += apei
 obj-bin-y += tables.init.o
 obj-$(CONFIG_NUMA) += numa.o
 obj-y += osl.o
-obj-$(CONFIG_HAS_CPUFREQ) += pmstat.o
 
 obj-$(CONFIG_X86) += hwregs.o
 obj-$(CONFIG_X86) += reboot.o
diff --git a/xen/drivers/acpi/pmstat.c b/xen/drivers/acpi/pmstat.c
deleted file mode 100644
index 2dbde1c..0000000
--- a/xen/drivers/acpi/pmstat.c
+++ /dev/null
@@ -1,526 +0,0 @@
-/*****************************************************************************
-#  pmstat.c - Power Management statistic information (Px/Cx/Tx, etc.)
-#
-#  Copyright (c) 2008, Liu Jinsong <jinsong.liu@intel.com>
-#
-# This program is free software; you can redistribute it and/or modify it 
-# under the terms of the GNU General Public License as published by the Free 
-# Software Foundation; either version 2 of the License, or (at your option) 
-# any later version.
-#
-# This program is distributed in the hope that it will be useful, but WITHOUT 
-# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or 
-# FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for 
-# more details.
-#
-# You should have received a copy of the GNU General Public License along with
-# this program; If not, see <http://www.gnu.org/licenses/>.
-#
-# The full GNU General Public License is included in this distribution in the
-# file called LICENSE.
-#
-*****************************************************************************/
-
-#include <xen/lib.h>
-#include <xen/errno.h>
-#include <xen/sched.h>
-#include <xen/event.h>
-#include <xen/irq.h>
-#include <xen/iocap.h>
-#include <xen/compat.h>
-#include <xen/guest_access.h>
-#include <asm/current.h>
-#include <public/xen.h>
-#include <xen/cpumask.h>
-#include <asm/processor.h>
-#include <xen/percpu.h>
-#include <xen/domain.h>
-#include <xen/acpi.h>
-
-#include <public/sysctl.h>
-#include <xen/cpufreq.h>
-#include <xen/pmstat.h>
-
-DEFINE_PER_CPU_READ_MOSTLY(struct pm_px *, cpufreq_statistic_data);
-
-/*
- * Get PM statistic info
- */
-int do_get_pm_info(struct xen_sysctl_get_pmstat *op)
-{
-    int ret = 0;
-    const struct processor_pminfo *pmpt;
-
-    if ( !op || (op->cpuid >= nr_cpu_ids) || !cpu_online(op->cpuid) )
-        return -EINVAL;
-    pmpt = processor_pminfo[op->cpuid];
-
-    switch ( op->type & PMSTAT_CATEGORY_MASK )
-    {
-    case PMSTAT_CX:
-        if ( !(xen_processor_pmbits & XEN_PROCESSOR_PM_CX) )
-            return -ENODEV;
-        break;
-    case PMSTAT_PX:
-        if ( !(xen_processor_pmbits & XEN_PROCESSOR_PM_PX) )
-            return -ENODEV;
-        if ( !cpufreq_driver )
-            return -ENODEV;
-        if ( !pmpt || !(pmpt->perf.init & XEN_PX_INIT) )
-            return -EINVAL;
-        break;
-    default:
-        return -ENODEV;
-    }
-
-    switch ( op->type )
-    {
-    case PMSTAT_get_max_px:
-    {
-        op->u.getpx.total = pmpt->perf.state_count;
-        break;
-    }
-
-    case PMSTAT_get_pxstat:
-    {
-        uint32_t ct;
-        struct pm_px *pxpt;
-        spinlock_t *cpufreq_statistic_lock = 
-                   &per_cpu(cpufreq_statistic_lock, op->cpuid);
-
-        spin_lock(cpufreq_statistic_lock);
-
-        pxpt = per_cpu(cpufreq_statistic_data, op->cpuid);
-        if ( !pxpt || !pxpt->u.pt || !pxpt->u.trans_pt )
-        {
-            spin_unlock(cpufreq_statistic_lock);
-            return -ENODATA;
-        }
-
-        pxpt->u.usable = pmpt->perf.state_count - pmpt->perf.platform_limit;
-
-        cpufreq_residency_update(op->cpuid, pxpt->u.cur);
-
-        ct = pmpt->perf.state_count;
-        if ( copy_to_guest(op->u.getpx.trans_pt, pxpt->u.trans_pt, ct*ct) )
-        {
-            spin_unlock(cpufreq_statistic_lock);
-            ret = -EFAULT;
-            break;
-        }
-
-        if ( copy_to_guest(op->u.getpx.pt, pxpt->u.pt, ct) )
-        {
-            spin_unlock(cpufreq_statistic_lock);
-            ret = -EFAULT;
-            break;
-        }
-
-        op->u.getpx.total = pxpt->u.total;
-        op->u.getpx.usable = pxpt->u.usable;
-        op->u.getpx.last = pxpt->u.last;
-        op->u.getpx.cur = pxpt->u.cur;
-
-        spin_unlock(cpufreq_statistic_lock);
-
-        break;
-    }
-
-    case PMSTAT_reset_pxstat:
-    {
-        cpufreq_statistic_reset(op->cpuid);
-        break;
-    }
-
-    case PMSTAT_get_max_cx:
-    {
-        op->u.getcx.nr = pmstat_get_cx_nr(op->cpuid);
-        ret = 0;
-        break;
-    }
-
-    case PMSTAT_get_cxstat:
-    {
-        ret = pmstat_get_cx_stat(op->cpuid, &op->u.getcx);
-        break;
-    }
-
-    case PMSTAT_reset_cxstat:
-    {
-        ret = pmstat_reset_cx_stat(op->cpuid);
-        break;
-    }
-
-    default:
-        printk("not defined sub-hypercall @ do_get_pm_info\n");
-        ret = -ENOSYS;
-        break;
-    }
-
-    return ret;
-}
-
-/*
- * 1. Get PM parameter
- * 2. Provide user PM control
- */
-static int read_scaling_available_governors(char *scaling_available_governors,
-                                            unsigned int size)
-{
-    unsigned int i = 0;
-    struct cpufreq_governor *t;
-
-    if ( !scaling_available_governors )
-        return -EINVAL;
-
-    list_for_each_entry(t, &cpufreq_governor_list, governor_list)
-    {
-        i += scnprintf(&scaling_available_governors[i],
-                       CPUFREQ_NAME_LEN, "%s ", t->name);
-        if ( i > size )
-            return -EINVAL;
-    }
-    scaling_available_governors[i-1] = '\0';
-
-    return 0;
-}
-
-static int get_cpufreq_para(struct xen_sysctl_pm_op *op)
-{
-    uint32_t ret = 0;
-    const struct processor_pminfo *pmpt;
-    struct cpufreq_policy *policy;
-    uint32_t gov_num = 0;
-    uint32_t *affected_cpus;
-    uint32_t *scaling_available_frequencies;
-    char     *scaling_available_governors;
-    struct list_head *pos;
-    uint32_t cpu, i, j = 0;
-
-    pmpt = processor_pminfo[op->cpuid];
-    policy = per_cpu(cpufreq_cpu_policy, op->cpuid);
-
-    if ( !pmpt || !pmpt->perf.states ||
-         !policy || !policy->governor )
-        return -EINVAL;
-
-    list_for_each(pos, &cpufreq_governor_list)
-        gov_num++;
-
-    if ( (op->u.get_para.cpu_num  != cpumask_weight(policy->cpus)) ||
-         (op->u.get_para.freq_num != pmpt->perf.state_count)    ||
-         (op->u.get_para.gov_num  != gov_num) )
-    {
-        op->u.get_para.cpu_num =  cpumask_weight(policy->cpus);
-        op->u.get_para.freq_num = pmpt->perf.state_count;
-        op->u.get_para.gov_num  = gov_num;
-        return -EAGAIN;
-    }
-
-    if ( !(affected_cpus = xzalloc_array(uint32_t, op->u.get_para.cpu_num)) )
-        return -ENOMEM;
-    for_each_cpu(cpu, policy->cpus)
-        affected_cpus[j++] = cpu;
-    ret = copy_to_guest(op->u.get_para.affected_cpus,
-                       affected_cpus, op->u.get_para.cpu_num);
-    xfree(affected_cpus);
-    if ( ret )
-        return ret;
-
-    if ( !(scaling_available_frequencies =
-           xzalloc_array(uint32_t, op->u.get_para.freq_num)) )
-        return -ENOMEM;
-    for ( i = 0; i < op->u.get_para.freq_num; i++ )
-        scaling_available_frequencies[i] =
-                        pmpt->perf.states[i].core_frequency * 1000;
-    ret = copy_to_guest(op->u.get_para.scaling_available_frequencies,
-                   scaling_available_frequencies, op->u.get_para.freq_num);
-    xfree(scaling_available_frequencies);
-    if ( ret )
-        return ret;
-
-    if ( !(scaling_available_governors =
-           xzalloc_array(char, gov_num * CPUFREQ_NAME_LEN)) )
-        return -ENOMEM;
-    if ( (ret = read_scaling_available_governors(scaling_available_governors,
-                gov_num * CPUFREQ_NAME_LEN * sizeof(char))) )
-    {
-        xfree(scaling_available_governors);
-        return ret;
-    }
-    ret = copy_to_guest(op->u.get_para.scaling_available_governors,
-                scaling_available_governors, gov_num * CPUFREQ_NAME_LEN);
-    xfree(scaling_available_governors);
-    if ( ret )
-        return ret;
-
-    op->u.get_para.cpuinfo_cur_freq =
-        cpufreq_driver->get ? cpufreq_driver->get(op->cpuid) : policy->cur;
-    op->u.get_para.cpuinfo_max_freq = policy->cpuinfo.max_freq;
-    op->u.get_para.cpuinfo_min_freq = policy->cpuinfo.min_freq;
-    op->u.get_para.scaling_cur_freq = policy->cur;
-    op->u.get_para.scaling_max_freq = policy->max;
-    op->u.get_para.scaling_min_freq = policy->min;
-
-    if ( cpufreq_driver->name[0] )
-        strlcpy(op->u.get_para.scaling_driver, 
-            cpufreq_driver->name, CPUFREQ_NAME_LEN);
-    else
-        strlcpy(op->u.get_para.scaling_driver, "Unknown", CPUFREQ_NAME_LEN);
-
-    if ( policy->governor->name[0] )
-        strlcpy(op->u.get_para.scaling_governor, 
-            policy->governor->name, CPUFREQ_NAME_LEN);
-    else
-        strlcpy(op->u.get_para.scaling_governor, "Unknown", CPUFREQ_NAME_LEN);
-
-    /* governor specific para */
-    if ( !strnicmp(op->u.get_para.scaling_governor, 
-                   "userspace", CPUFREQ_NAME_LEN) )
-    {
-        op->u.get_para.u.userspace.scaling_setspeed = policy->cur;
-    }
-
-    if ( !strnicmp(op->u.get_para.scaling_governor, 
-                   "ondemand", CPUFREQ_NAME_LEN) )
-    {
-        ret = get_cpufreq_ondemand_para(
-            &op->u.get_para.u.ondemand.sampling_rate_max,
-            &op->u.get_para.u.ondemand.sampling_rate_min,
-            &op->u.get_para.u.ondemand.sampling_rate,
-            &op->u.get_para.u.ondemand.up_threshold);
-    }
-    op->u.get_para.turbo_enabled = cpufreq_get_turbo_status(op->cpuid);
-
-    return ret;
-}
-
-static int set_cpufreq_gov(struct xen_sysctl_pm_op *op)
-{
-    struct cpufreq_policy new_policy, *old_policy;
-
-    old_policy = per_cpu(cpufreq_cpu_policy, op->cpuid);
-    if ( !old_policy )
-        return -EINVAL;
-
-    memcpy(&new_policy, old_policy, sizeof(struct cpufreq_policy));
-
-    new_policy.governor = __find_governor(op->u.set_gov.scaling_governor);
-    if (new_policy.governor == NULL)
-        return -EINVAL;
-
-    return __cpufreq_set_policy(old_policy, &new_policy);
-}
-
-static int set_cpufreq_para(struct xen_sysctl_pm_op *op)
-{
-    int ret = 0;
-    struct cpufreq_policy *policy;
-
-    policy = per_cpu(cpufreq_cpu_policy, op->cpuid);
-
-    if ( !policy || !policy->governor )
-        return -EINVAL;
-
-    switch(op->u.set_para.ctrl_type)
-    {
-    case SCALING_MAX_FREQ:
-    {
-        struct cpufreq_policy new_policy;
-
-        memcpy(&new_policy, policy, sizeof(struct cpufreq_policy));
-        new_policy.max = op->u.set_para.ctrl_value;
-        ret = __cpufreq_set_policy(policy, &new_policy);
-
-        break;
-    }
-
-    case SCALING_MIN_FREQ:
-    {
-        struct cpufreq_policy new_policy;
-
-        memcpy(&new_policy, policy, sizeof(struct cpufreq_policy));
-        new_policy.min = op->u.set_para.ctrl_value;
-        ret = __cpufreq_set_policy(policy, &new_policy);
-
-        break;
-    }
-
-    case SCALING_SETSPEED:
-    {
-        unsigned int freq =op->u.set_para.ctrl_value;
-
-        if ( !strnicmp(policy->governor->name,
-                       "userspace", CPUFREQ_NAME_LEN) )
-            ret = write_userspace_scaling_setspeed(op->cpuid, freq);
-        else
-            ret = -EINVAL;
-
-        break;
-    }
-
-    case SAMPLING_RATE:
-    {
-        unsigned int sampling_rate = op->u.set_para.ctrl_value;
-
-        if ( !strnicmp(policy->governor->name,
-                       "ondemand", CPUFREQ_NAME_LEN) )
-            ret = write_ondemand_sampling_rate(sampling_rate);
-        else
-            ret = -EINVAL;
-
-        break;
-    }
-
-    case UP_THRESHOLD:
-    {
-        unsigned int up_threshold = op->u.set_para.ctrl_value;
-
-        if ( !strnicmp(policy->governor->name,
-                       "ondemand", CPUFREQ_NAME_LEN) )
-            ret = write_ondemand_up_threshold(up_threshold);
-        else
-            ret = -EINVAL;
-
-        break;
-    }
-
-    default:
-        ret = -EINVAL;
-        break;
-    }
-
-    return ret;
-}
-
-int do_pm_op(struct xen_sysctl_pm_op *op)
-{
-    int ret = 0;
-    const struct processor_pminfo *pmpt;
-
-    if ( !op || op->cpuid >= nr_cpu_ids || !cpu_online(op->cpuid) )
-        return -EINVAL;
-    pmpt = processor_pminfo[op->cpuid];
-
-    switch ( op->cmd & PM_PARA_CATEGORY_MASK )
-    {
-    case CPUFREQ_PARA:
-        if ( !(xen_processor_pmbits & XEN_PROCESSOR_PM_PX) )
-            return -ENODEV;
-        if ( !pmpt || !(pmpt->perf.init & XEN_PX_INIT) )
-            return -EINVAL;
-        break;
-    }
-
-    switch ( op->cmd )
-    {
-    case GET_CPUFREQ_PARA:
-    {
-        ret = get_cpufreq_para(op);
-        break;
-    }
-
-    case SET_CPUFREQ_GOV:
-    {
-        ret = set_cpufreq_gov(op);
-        break;
-    }
-
-    case SET_CPUFREQ_PARA:
-    {
-        ret = set_cpufreq_para(op);
-        break;
-    }
-
-    case GET_CPUFREQ_AVGFREQ:
-    {
-        op->u.get_avgfreq = cpufreq_driver_getavg(op->cpuid, USR_GETAVG);
-        break;
-    }
-
-    case XEN_SYSCTL_pm_op_set_sched_opt_smt:
-    {
-        uint32_t saved_value;
-
-        saved_value = sched_smt_power_savings;
-        sched_smt_power_savings = !!op->u.set_sched_opt_smt;
-        op->u.set_sched_opt_smt = saved_value;
-
-        break;
-    }
-
-    case XEN_SYSCTL_pm_op_set_vcpu_migration_delay:
-    {
-        set_vcpu_migration_delay(op->u.set_vcpu_migration_delay);
-        break;
-    }
-
-    case XEN_SYSCTL_pm_op_get_vcpu_migration_delay:
-    {
-        op->u.get_vcpu_migration_delay = get_vcpu_migration_delay();
-        break;
-    }
-
-    case XEN_SYSCTL_pm_op_get_max_cstate:
-    {
-        op->u.get_max_cstate = acpi_get_cstate_limit();
-        break;
-    }
-
-    case XEN_SYSCTL_pm_op_set_max_cstate:
-    {
-        acpi_set_cstate_limit(op->u.set_max_cstate);
-        break;
-    }
-
-    case XEN_SYSCTL_pm_op_enable_turbo:
-    {
-        ret = cpufreq_update_turbo(op->cpuid, CPUFREQ_TURBO_ENABLED);
-        break;
-    }
-
-    case XEN_SYSCTL_pm_op_disable_turbo:
-    {
-        ret = cpufreq_update_turbo(op->cpuid, CPUFREQ_TURBO_DISABLED);
-        break;
-    }
-
-    default:
-        printk("not defined sub-hypercall @ do_pm_op\n");
-        ret = -ENOSYS;
-        break;
-    }
-
-    return ret;
-}
-
-int acpi_set_pdc_bits(u32 acpi_id, XEN_GUEST_HANDLE_PARAM(uint32) pdc)
-{
-    u32 bits[3];
-    int ret;
-
-    if ( copy_from_guest(bits, pdc, 2) )
-        ret = -EFAULT;
-    else if ( bits[0] != ACPI_PDC_REVISION_ID || !bits[1] )
-        ret = -EINVAL;
-    else if ( copy_from_guest_offset(bits + 2, pdc, 2, 1) )
-        ret = -EFAULT;
-    else
-    {
-        u32 mask = 0;
-
-        if ( xen_processor_pmbits & XEN_PROCESSOR_PM_CX )
-            mask |= ACPI_PDC_C_MASK | ACPI_PDC_SMP_C1PT;
-        if ( xen_processor_pmbits & XEN_PROCESSOR_PM_PX )
-            mask |= ACPI_PDC_P_MASK | ACPI_PDC_SMP_C1PT;
-        if ( xen_processor_pmbits & XEN_PROCESSOR_PM_TX )
-            mask |= ACPI_PDC_T_MASK | ACPI_PDC_SMP_C1PT;
-        bits[2] &= (ACPI_PDC_C_MASK | ACPI_PDC_P_MASK | ACPI_PDC_T_MASK |
-                    ACPI_PDC_SMP_C1PT) & ~mask;
-        ret = arch_acpi_set_pdc_bits(acpi_id, bits, mask);
-    }
-    if ( !ret && __copy_to_guest_offset(pdc, 2, bits + 2, 1) )
-        ret = -EFAULT;
-
-    return ret;
-}
diff --git a/xen/drivers/pm/Kconfig b/xen/drivers/pm/Kconfig
new file mode 100644
index 0000000..6d4fda1
--- /dev/null
+++ b/xen/drivers/pm/Kconfig
@@ -0,0 +1,3 @@
+
+config HAS_PM
+	bool
diff --git a/xen/drivers/pm/Makefile b/xen/drivers/pm/Makefile
new file mode 100644
index 0000000..2073683
--- /dev/null
+++ b/xen/drivers/pm/Makefile
@@ -0,0 +1 @@
+obj-y += stat.o
diff --git a/xen/drivers/pm/stat.c b/xen/drivers/pm/stat.c
new file mode 100644
index 0000000..2dbde1c
--- /dev/null
+++ b/xen/drivers/pm/stat.c
@@ -0,0 +1,526 @@
+/*****************************************************************************
+#  pmstat.c - Power Management statistic information (Px/Cx/Tx, etc.)
+#
+#  Copyright (c) 2008, Liu Jinsong <jinsong.liu@intel.com>
+#
+# This program is free software; you can redistribute it and/or modify it 
+# under the terms of the GNU General Public License as published by the Free 
+# Software Foundation; either version 2 of the License, or (at your option) 
+# any later version.
+#
+# This program is distributed in the hope that it will be useful, but WITHOUT 
+# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or 
+# FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for 
+# more details.
+#
+# You should have received a copy of the GNU General Public License along with
+# this program; If not, see <http://www.gnu.org/licenses/>.
+#
+# The full GNU General Public License is included in this distribution in the
+# file called LICENSE.
+#
+*****************************************************************************/
+
+#include <xen/lib.h>
+#include <xen/errno.h>
+#include <xen/sched.h>
+#include <xen/event.h>
+#include <xen/irq.h>
+#include <xen/iocap.h>
+#include <xen/compat.h>
+#include <xen/guest_access.h>
+#include <asm/current.h>
+#include <public/xen.h>
+#include <xen/cpumask.h>
+#include <asm/processor.h>
+#include <xen/percpu.h>
+#include <xen/domain.h>
+#include <xen/acpi.h>
+
+#include <public/sysctl.h>
+#include <xen/cpufreq.h>
+#include <xen/pmstat.h>
+
+DEFINE_PER_CPU_READ_MOSTLY(struct pm_px *, cpufreq_statistic_data);
+
+/*
+ * Get PM statistic info
+ */
+int do_get_pm_info(struct xen_sysctl_get_pmstat *op)
+{
+    int ret = 0;
+    const struct processor_pminfo *pmpt;
+
+    if ( !op || (op->cpuid >= nr_cpu_ids) || !cpu_online(op->cpuid) )
+        return -EINVAL;
+    pmpt = processor_pminfo[op->cpuid];
+
+    switch ( op->type & PMSTAT_CATEGORY_MASK )
+    {
+    case PMSTAT_CX:
+        if ( !(xen_processor_pmbits & XEN_PROCESSOR_PM_CX) )
+            return -ENODEV;
+        break;
+    case PMSTAT_PX:
+        if ( !(xen_processor_pmbits & XEN_PROCESSOR_PM_PX) )
+            return -ENODEV;
+        if ( !cpufreq_driver )
+            return -ENODEV;
+        if ( !pmpt || !(pmpt->perf.init & XEN_PX_INIT) )
+            return -EINVAL;
+        break;
+    default:
+        return -ENODEV;
+    }
+
+    switch ( op->type )
+    {
+    case PMSTAT_get_max_px:
+    {
+        op->u.getpx.total = pmpt->perf.state_count;
+        break;
+    }
+
+    case PMSTAT_get_pxstat:
+    {
+        uint32_t ct;
+        struct pm_px *pxpt;
+        spinlock_t *cpufreq_statistic_lock = 
+                   &per_cpu(cpufreq_statistic_lock, op->cpuid);
+
+        spin_lock(cpufreq_statistic_lock);
+
+        pxpt = per_cpu(cpufreq_statistic_data, op->cpuid);
+        if ( !pxpt || !pxpt->u.pt || !pxpt->u.trans_pt )
+        {
+            spin_unlock(cpufreq_statistic_lock);
+            return -ENODATA;
+        }
+
+        pxpt->u.usable = pmpt->perf.state_count - pmpt->perf.platform_limit;
+
+        cpufreq_residency_update(op->cpuid, pxpt->u.cur);
+
+        ct = pmpt->perf.state_count;
+        if ( copy_to_guest(op->u.getpx.trans_pt, pxpt->u.trans_pt, ct*ct) )
+        {
+            spin_unlock(cpufreq_statistic_lock);
+            ret = -EFAULT;
+            break;
+        }
+
+        if ( copy_to_guest(op->u.getpx.pt, pxpt->u.pt, ct) )
+        {
+            spin_unlock(cpufreq_statistic_lock);
+            ret = -EFAULT;
+            break;
+        }
+
+        op->u.getpx.total = pxpt->u.total;
+        op->u.getpx.usable = pxpt->u.usable;
+        op->u.getpx.last = pxpt->u.last;
+        op->u.getpx.cur = pxpt->u.cur;
+
+        spin_unlock(cpufreq_statistic_lock);
+
+        break;
+    }
+
+    case PMSTAT_reset_pxstat:
+    {
+        cpufreq_statistic_reset(op->cpuid);
+        break;
+    }
+
+    case PMSTAT_get_max_cx:
+    {
+        op->u.getcx.nr = pmstat_get_cx_nr(op->cpuid);
+        ret = 0;
+        break;
+    }
+
+    case PMSTAT_get_cxstat:
+    {
+        ret = pmstat_get_cx_stat(op->cpuid, &op->u.getcx);
+        break;
+    }
+
+    case PMSTAT_reset_cxstat:
+    {
+        ret = pmstat_reset_cx_stat(op->cpuid);
+        break;
+    }
+
+    default:
+        printk("not defined sub-hypercall @ do_get_pm_info\n");
+        ret = -ENOSYS;
+        break;
+    }
+
+    return ret;
+}
+
+/*
+ * 1. Get PM parameter
+ * 2. Provide user PM control
+ */
+static int read_scaling_available_governors(char *scaling_available_governors,
+                                            unsigned int size)
+{
+    unsigned int i = 0;
+    struct cpufreq_governor *t;
+
+    if ( !scaling_available_governors )
+        return -EINVAL;
+
+    list_for_each_entry(t, &cpufreq_governor_list, governor_list)
+    {
+        i += scnprintf(&scaling_available_governors[i],
+                       CPUFREQ_NAME_LEN, "%s ", t->name);
+        if ( i > size )
+            return -EINVAL;
+    }
+    scaling_available_governors[i-1] = '\0';
+
+    return 0;
+}
+
+static int get_cpufreq_para(struct xen_sysctl_pm_op *op)
+{
+    uint32_t ret = 0;
+    const struct processor_pminfo *pmpt;
+    struct cpufreq_policy *policy;
+    uint32_t gov_num = 0;
+    uint32_t *affected_cpus;
+    uint32_t *scaling_available_frequencies;
+    char     *scaling_available_governors;
+    struct list_head *pos;
+    uint32_t cpu, i, j = 0;
+
+    pmpt = processor_pminfo[op->cpuid];
+    policy = per_cpu(cpufreq_cpu_policy, op->cpuid);
+
+    if ( !pmpt || !pmpt->perf.states ||
+         !policy || !policy->governor )
+        return -EINVAL;
+
+    list_for_each(pos, &cpufreq_governor_list)
+        gov_num++;
+
+    if ( (op->u.get_para.cpu_num  != cpumask_weight(policy->cpus)) ||
+         (op->u.get_para.freq_num != pmpt->perf.state_count)    ||
+         (op->u.get_para.gov_num  != gov_num) )
+    {
+        op->u.get_para.cpu_num =  cpumask_weight(policy->cpus);
+        op->u.get_para.freq_num = pmpt->perf.state_count;
+        op->u.get_para.gov_num  = gov_num;
+        return -EAGAIN;
+    }
+
+    if ( !(affected_cpus = xzalloc_array(uint32_t, op->u.get_para.cpu_num)) )
+        return -ENOMEM;
+    for_each_cpu(cpu, policy->cpus)
+        affected_cpus[j++] = cpu;
+    ret = copy_to_guest(op->u.get_para.affected_cpus,
+                       affected_cpus, op->u.get_para.cpu_num);
+    xfree(affected_cpus);
+    if ( ret )
+        return ret;
+
+    if ( !(scaling_available_frequencies =
+           xzalloc_array(uint32_t, op->u.get_para.freq_num)) )
+        return -ENOMEM;
+    for ( i = 0; i < op->u.get_para.freq_num; i++ )
+        scaling_available_frequencies[i] =
+                        pmpt->perf.states[i].core_frequency * 1000;
+    ret = copy_to_guest(op->u.get_para.scaling_available_frequencies,
+                   scaling_available_frequencies, op->u.get_para.freq_num);
+    xfree(scaling_available_frequencies);
+    if ( ret )
+        return ret;
+
+    if ( !(scaling_available_governors =
+           xzalloc_array(char, gov_num * CPUFREQ_NAME_LEN)) )
+        return -ENOMEM;
+    if ( (ret = read_scaling_available_governors(scaling_available_governors,
+                gov_num * CPUFREQ_NAME_LEN * sizeof(char))) )
+    {
+        xfree(scaling_available_governors);
+        return ret;
+    }
+    ret = copy_to_guest(op->u.get_para.scaling_available_governors,
+                scaling_available_governors, gov_num * CPUFREQ_NAME_LEN);
+    xfree(scaling_available_governors);
+    if ( ret )
+        return ret;
+
+    op->u.get_para.cpuinfo_cur_freq =
+        cpufreq_driver->get ? cpufreq_driver->get(op->cpuid) : policy->cur;
+    op->u.get_para.cpuinfo_max_freq = policy->cpuinfo.max_freq;
+    op->u.get_para.cpuinfo_min_freq = policy->cpuinfo.min_freq;
+    op->u.get_para.scaling_cur_freq = policy->cur;
+    op->u.get_para.scaling_max_freq = policy->max;
+    op->u.get_para.scaling_min_freq = policy->min;
+
+    if ( cpufreq_driver->name[0] )
+        strlcpy(op->u.get_para.scaling_driver, 
+            cpufreq_driver->name, CPUFREQ_NAME_LEN);
+    else
+        strlcpy(op->u.get_para.scaling_driver, "Unknown", CPUFREQ_NAME_LEN);
+
+    if ( policy->governor->name[0] )
+        strlcpy(op->u.get_para.scaling_governor, 
+            policy->governor->name, CPUFREQ_NAME_LEN);
+    else
+        strlcpy(op->u.get_para.scaling_governor, "Unknown", CPUFREQ_NAME_LEN);
+
+    /* governor specific para */
+    if ( !strnicmp(op->u.get_para.scaling_governor, 
+                   "userspace", CPUFREQ_NAME_LEN) )
+    {
+        op->u.get_para.u.userspace.scaling_setspeed = policy->cur;
+    }
+
+    if ( !strnicmp(op->u.get_para.scaling_governor, 
+                   "ondemand", CPUFREQ_NAME_LEN) )
+    {
+        ret = get_cpufreq_ondemand_para(
+            &op->u.get_para.u.ondemand.sampling_rate_max,
+            &op->u.get_para.u.ondemand.sampling_rate_min,
+            &op->u.get_para.u.ondemand.sampling_rate,
+            &op->u.get_para.u.ondemand.up_threshold);
+    }
+    op->u.get_para.turbo_enabled = cpufreq_get_turbo_status(op->cpuid);
+
+    return ret;
+}
+
+static int set_cpufreq_gov(struct xen_sysctl_pm_op *op)
+{
+    struct cpufreq_policy new_policy, *old_policy;
+
+    old_policy = per_cpu(cpufreq_cpu_policy, op->cpuid);
+    if ( !old_policy )
+        return -EINVAL;
+
+    memcpy(&new_policy, old_policy, sizeof(struct cpufreq_policy));
+
+    new_policy.governor = __find_governor(op->u.set_gov.scaling_governor);
+    if (new_policy.governor == NULL)
+        return -EINVAL;
+
+    return __cpufreq_set_policy(old_policy, &new_policy);
+}
+
+static int set_cpufreq_para(struct xen_sysctl_pm_op *op)
+{
+    int ret = 0;
+    struct cpufreq_policy *policy;
+
+    policy = per_cpu(cpufreq_cpu_policy, op->cpuid);
+
+    if ( !policy || !policy->governor )
+        return -EINVAL;
+
+    switch(op->u.set_para.ctrl_type)
+    {
+    case SCALING_MAX_FREQ:
+    {
+        struct cpufreq_policy new_policy;
+
+        memcpy(&new_policy, policy, sizeof(struct cpufreq_policy));
+        new_policy.max = op->u.set_para.ctrl_value;
+        ret = __cpufreq_set_policy(policy, &new_policy);
+
+        break;
+    }
+
+    case SCALING_MIN_FREQ:
+    {
+        struct cpufreq_policy new_policy;
+
+        memcpy(&new_policy, policy, sizeof(struct cpufreq_policy));
+        new_policy.min = op->u.set_para.ctrl_value;
+        ret = __cpufreq_set_policy(policy, &new_policy);
+
+        break;
+    }
+
+    case SCALING_SETSPEED:
+    {
+        unsigned int freq =op->u.set_para.ctrl_value;
+
+        if ( !strnicmp(policy->governor->name,
+                       "userspace", CPUFREQ_NAME_LEN) )
+            ret = write_userspace_scaling_setspeed(op->cpuid, freq);
+        else
+            ret = -EINVAL;
+
+        break;
+    }
+
+    case SAMPLING_RATE:
+    {
+        unsigned int sampling_rate = op->u.set_para.ctrl_value;
+
+        if ( !strnicmp(policy->governor->name,
+                       "ondemand", CPUFREQ_NAME_LEN) )
+            ret = write_ondemand_sampling_rate(sampling_rate);
+        else
+            ret = -EINVAL;
+
+        break;
+    }
+
+    case UP_THRESHOLD:
+    {
+        unsigned int up_threshold = op->u.set_para.ctrl_value;
+
+        if ( !strnicmp(policy->governor->name,
+                       "ondemand", CPUFREQ_NAME_LEN) )
+            ret = write_ondemand_up_threshold(up_threshold);
+        else
+            ret = -EINVAL;
+
+        break;
+    }
+
+    default:
+        ret = -EINVAL;
+        break;
+    }
+
+    return ret;
+}
+
+int do_pm_op(struct xen_sysctl_pm_op *op)
+{
+    int ret = 0;
+    const struct processor_pminfo *pmpt;
+
+    if ( !op || op->cpuid >= nr_cpu_ids || !cpu_online(op->cpuid) )
+        return -EINVAL;
+    pmpt = processor_pminfo[op->cpuid];
+
+    switch ( op->cmd & PM_PARA_CATEGORY_MASK )
+    {
+    case CPUFREQ_PARA:
+        if ( !(xen_processor_pmbits & XEN_PROCESSOR_PM_PX) )
+            return -ENODEV;
+        if ( !pmpt || !(pmpt->perf.init & XEN_PX_INIT) )
+            return -EINVAL;
+        break;
+    }
+
+    switch ( op->cmd )
+    {
+    case GET_CPUFREQ_PARA:
+    {
+        ret = get_cpufreq_para(op);
+        break;
+    }
+
+    case SET_CPUFREQ_GOV:
+    {
+        ret = set_cpufreq_gov(op);
+        break;
+    }
+
+    case SET_CPUFREQ_PARA:
+    {
+        ret = set_cpufreq_para(op);
+        break;
+    }
+
+    case GET_CPUFREQ_AVGFREQ:
+    {
+        op->u.get_avgfreq = cpufreq_driver_getavg(op->cpuid, USR_GETAVG);
+        break;
+    }
+
+    case XEN_SYSCTL_pm_op_set_sched_opt_smt:
+    {
+        uint32_t saved_value;
+
+        saved_value = sched_smt_power_savings;
+        sched_smt_power_savings = !!op->u.set_sched_opt_smt;
+        op->u.set_sched_opt_smt = saved_value;
+
+        break;
+    }
+
+    case XEN_SYSCTL_pm_op_set_vcpu_migration_delay:
+    {
+        set_vcpu_migration_delay(op->u.set_vcpu_migration_delay);
+        break;
+    }
+
+    case XEN_SYSCTL_pm_op_get_vcpu_migration_delay:
+    {
+        op->u.get_vcpu_migration_delay = get_vcpu_migration_delay();
+        break;
+    }
+
+    case XEN_SYSCTL_pm_op_get_max_cstate:
+    {
+        op->u.get_max_cstate = acpi_get_cstate_limit();
+        break;
+    }
+
+    case XEN_SYSCTL_pm_op_set_max_cstate:
+    {
+        acpi_set_cstate_limit(op->u.set_max_cstate);
+        break;
+    }
+
+    case XEN_SYSCTL_pm_op_enable_turbo:
+    {
+        ret = cpufreq_update_turbo(op->cpuid, CPUFREQ_TURBO_ENABLED);
+        break;
+    }
+
+    case XEN_SYSCTL_pm_op_disable_turbo:
+    {
+        ret = cpufreq_update_turbo(op->cpuid, CPUFREQ_TURBO_DISABLED);
+        break;
+    }
+
+    default:
+        printk("not defined sub-hypercall @ do_pm_op\n");
+        ret = -ENOSYS;
+        break;
+    }
+
+    return ret;
+}
+
+int acpi_set_pdc_bits(u32 acpi_id, XEN_GUEST_HANDLE_PARAM(uint32) pdc)
+{
+    u32 bits[3];
+    int ret;
+
+    if ( copy_from_guest(bits, pdc, 2) )
+        ret = -EFAULT;
+    else if ( bits[0] != ACPI_PDC_REVISION_ID || !bits[1] )
+        ret = -EINVAL;
+    else if ( copy_from_guest_offset(bits + 2, pdc, 2, 1) )
+        ret = -EFAULT;
+    else
+    {
+        u32 mask = 0;
+
+        if ( xen_processor_pmbits & XEN_PROCESSOR_PM_CX )
+            mask |= ACPI_PDC_C_MASK | ACPI_PDC_SMP_C1PT;
+        if ( xen_processor_pmbits & XEN_PROCESSOR_PM_PX )
+            mask |= ACPI_PDC_P_MASK | ACPI_PDC_SMP_C1PT;
+        if ( xen_processor_pmbits & XEN_PROCESSOR_PM_TX )
+            mask |= ACPI_PDC_T_MASK | ACPI_PDC_SMP_C1PT;
+        bits[2] &= (ACPI_PDC_C_MASK | ACPI_PDC_P_MASK | ACPI_PDC_T_MASK |
+                    ACPI_PDC_SMP_C1PT) & ~mask;
+        ret = arch_acpi_set_pdc_bits(acpi_id, bits, mask);
+    }
+    if ( !ret && __copy_to_guest_offset(pdc, 2, bits + 2, 1) )
+        ret = -EFAULT;
+
+    return ret;
+}
-- 
2.7.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


* [RFC PATCH 04/31] cpufreq: make turbo settings to be configurable
  2017-11-09 17:09 [RFC PATCH 00/31] CPUFreq on ARM Oleksandr Tyshchenko
                   ` (2 preceding siblings ...)
  2017-11-09 17:09 ` [RFC PATCH 03/31] pmstat: move pmstat.c file to the xen/drivers/pm/stat.c location Oleksandr Tyshchenko
@ 2017-11-09 17:09 ` Oleksandr Tyshchenko
  2017-12-02  1:06   ` Stefano Stabellini
  2018-05-07 15:39   ` Jan Beulich
  2017-11-09 17:09 ` [RFC PATCH 05/31] pmstat: make pmstat functions more generalizable Oleksandr Tyshchenko
                   ` (29 subsequent siblings)
  33 siblings, 2 replies; 108+ messages in thread
From: Oleksandr Tyshchenko @ 2017-11-09 17:09 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Andrew Cooper, Oleksandr Dmytryshyn,
	Julien Grall, Oleksandr Tyshchenko, Jan Beulich

From: Oleksandr Dmytryshyn <oleksandr.dmytryshyn@globallogic.com>

These settings are not needed on some architectures.
So make them configurable via a new HAS_CPU_TURBO option
and select it for the x86 architecture.
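
For illustration only (not part of the patch), the compile-time gating
could be sketched as below. The demo_* names, the DEMO_TURBO_* values and
the hand-written #define are hypothetical stand-ins; in Xen the symbol
would come from Kconfig rather than from the source file.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for the Kconfig-generated symbol.
 * Comment this out to model an architecture without turbo support.
 */
#define CONFIG_HAS_CPU_TURBO 1

/* Tristate values modelled on the CPUFREQ_TURBO_* defines. */
#define DEMO_TURBO_DISABLED    (-1)
#define DEMO_TURBO_UNSUPPORTED 0
#define DEMO_TURBO_ENABLED     1

/* Simplified policy: only the fields needed for this illustration. */
struct demo_policy {
    unsigned int max_freq;
#ifdef CONFIG_HAS_CPU_TURBO
    signed char turbo;          /* present only when turbo is configured */
#endif
};

#ifdef CONFIG_HAS_CPU_TURBO
/* Exists only when the option is selected. */
static int demo_get_turbo_status(const struct demo_policy *p)
{
    return p && p->turbo == DEMO_TURBO_ENABLED;
}
#endif

/* Caller side: report 0 when the turbo code is compiled out. */
static unsigned int demo_report_turbo(const struct demo_policy *p)
{
#ifdef CONFIG_HAS_CPU_TURBO
    return demo_get_turbo_status(p);
#else
    (void)p;
    return 0;
#endif
}
```

With the define in place, demo_report_turbo() follows the policy's turbo
field; without it, the field and the helper disappear and callers always
see 0.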

This is a rebased version of the original patch:
https://lists.xen.org/archives/html/xen-devel/2014-11/msg00942.html

Signed-off-by: Oleksandr Dmytryshyn <oleksandr.dmytryshyn@globallogic.com>
Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
CC: Jan Beulich <jbeulich@suse.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Julien Grall <julien.grall@linaro.org>
---
 xen/arch/x86/Kconfig          |  1 +
 xen/drivers/cpufreq/Kconfig   |  3 +++
 xen/drivers/cpufreq/utility.c | 11 ++++++++++-
 xen/drivers/pm/stat.c         |  6 ++++++
 xen/include/xen/cpufreq.h     |  6 ++++++
 5 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/Kconfig b/xen/arch/x86/Kconfig
index 86c8eca..c1eac1d 100644
--- a/xen/arch/x86/Kconfig
+++ b/xen/arch/x86/Kconfig
@@ -24,6 +24,7 @@ config X86
 	select NUMA
 	select VGA
 	select HAS_PM
+	select HAS_CPU_TURBO
 
 config ARCH_DEFCONFIG
 	string
diff --git a/xen/drivers/cpufreq/Kconfig b/xen/drivers/cpufreq/Kconfig
index cce80f4..427ea2a 100644
--- a/xen/drivers/cpufreq/Kconfig
+++ b/xen/drivers/cpufreq/Kconfig
@@ -1,3 +1,6 @@
 
 config HAS_CPUFREQ
 	bool
+
+config HAS_CPU_TURBO
+	bool
diff --git a/xen/drivers/cpufreq/utility.c b/xen/drivers/cpufreq/utility.c
index a687e5a..25bf983 100644
--- a/xen/drivers/cpufreq/utility.c
+++ b/xen/drivers/cpufreq/utility.c
@@ -209,7 +209,9 @@ int cpufreq_frequency_table_cpuinfo(struct cpufreq_policy *policy,
 {
     unsigned int min_freq = ~0;
     unsigned int max_freq = 0;
+#ifdef CONFIG_HAS_CPU_TURBO
     unsigned int second_max_freq = 0;
+#endif
     unsigned int i;
 
     for (i=0; (table[i].frequency != CPUFREQ_TABLE_END); i++) {
@@ -221,6 +223,7 @@ int cpufreq_frequency_table_cpuinfo(struct cpufreq_policy *policy,
         if (freq > max_freq)
             max_freq = freq;
     }
+#ifdef CONFIG_HAS_CPU_TURBO
     for (i=0; (table[i].frequency != CPUFREQ_TABLE_END); i++) {
         unsigned int freq = table[i].frequency;
         if (freq == CPUFREQ_ENTRY_INVALID || freq == max_freq)
@@ -234,9 +237,13 @@ int cpufreq_frequency_table_cpuinfo(struct cpufreq_policy *policy,
         printk("max_freq: %u    second_max_freq: %u\n",
                max_freq, second_max_freq);
 
+    policy->cpuinfo.second_max_freq = second_max_freq;
+#else /* !CONFIG_HAS_CPU_TURBO */
+    if (cpufreq_verbose)
+        printk("max_freq: %u\n", max_freq);
+#endif /* CONFIG_HAS_CPU_TURBO */
     policy->min = policy->cpuinfo.min_freq = min_freq;
     policy->max = policy->cpuinfo.max_freq = max_freq;
-    policy->cpuinfo.second_max_freq = second_max_freq;
 
     if (policy->min == ~0)
         return -EINVAL;
@@ -390,6 +397,7 @@ int cpufreq_driver_getavg(unsigned int cpu, unsigned int flag)
     return policy->cur;
 }
 
+#ifdef CONFIG_HAS_CPU_TURBO
 int cpufreq_update_turbo(int cpuid, int new_state)
 {
     struct cpufreq_policy *policy;
@@ -430,6 +438,7 @@ int cpufreq_get_turbo_status(int cpuid)
     policy = per_cpu(cpufreq_cpu_policy, cpuid);
     return policy && policy->turbo == CPUFREQ_TURBO_ENABLED;
 }
+#endif /* CONFIG_HAS_CPU_TURBO */
 
 /*********************************************************************
  *                 POLICY                                            *
diff --git a/xen/drivers/pm/stat.c b/xen/drivers/pm/stat.c
index 2dbde1c..133e64d 100644
--- a/xen/drivers/pm/stat.c
+++ b/xen/drivers/pm/stat.c
@@ -290,7 +290,11 @@ static int get_cpufreq_para(struct xen_sysctl_pm_op *op)
             &op->u.get_para.u.ondemand.sampling_rate,
             &op->u.get_para.u.ondemand.up_threshold);
     }
+#ifdef CONFIG_HAS_CPU_TURBO
     op->u.get_para.turbo_enabled = cpufreq_get_turbo_status(op->cpuid);
+#else
+    op->u.get_para.turbo_enabled = 0;
+#endif
 
     return ret;
 }
@@ -473,6 +477,7 @@ int do_pm_op(struct xen_sysctl_pm_op *op)
         break;
     }
 
+#ifdef CONFIG_HAS_CPU_TURBO
     case XEN_SYSCTL_pm_op_enable_turbo:
     {
         ret = cpufreq_update_turbo(op->cpuid, CPUFREQ_TURBO_ENABLED);
@@ -484,6 +489,7 @@ int do_pm_op(struct xen_sysctl_pm_op *op)
         ret = cpufreq_update_turbo(op->cpuid, CPUFREQ_TURBO_DISABLED);
         break;
     }
+#endif /* CONFIG_HAS_CPU_TURBO */
 
     default:
         printk("not defined sub-hypercall @ do_pm_op\n");
diff --git a/xen/include/xen/cpufreq.h b/xen/include/xen/cpufreq.h
index 30c70c9..2e0c16a 100644
--- a/xen/include/xen/cpufreq.h
+++ b/xen/include/xen/cpufreq.h
@@ -39,7 +39,9 @@ extern struct acpi_cpufreq_data *cpufreq_drv_data[NR_CPUS];
 
 struct cpufreq_cpuinfo {
     unsigned int        max_freq;
+#ifdef CONFIG_HAS_CPU_TURBO
     unsigned int        second_max_freq;    /* P1 if Turbo Mode is on */
+#endif
     unsigned int        min_freq;
     unsigned int        transition_latency; /* in 10^(-9) s = nanoseconds */
 };
@@ -72,9 +74,11 @@ struct cpufreq_policy {
 
     bool_t              resume; /* flag for cpufreq 1st run
                                  * S3 wakeup, hotplug cpu, etc */
+#ifdef CONFIG_HAS_CPU_TURBO
     s8                  turbo;  /* tristate flag: 0 for unsupported
                                  * -1 for disable, 1 for enabled
                                  * See CPUFREQ_TURBO_* below for defines */
+#endif
     bool_t              aperf_mperf; /* CPU has APERF/MPERF MSRs */
 };
 DECLARE_PER_CPU(struct cpufreq_policy *, cpufreq_cpu_policy);
@@ -138,8 +142,10 @@ extern int cpufreq_driver_getavg(unsigned int cpu, unsigned int flag);
 #define CPUFREQ_TURBO_UNSUPPORTED   0
 #define CPUFREQ_TURBO_ENABLED       1
 
+#ifdef CONFIG_HAS_CPU_TURBO
 extern int cpufreq_update_turbo(int cpuid, int new_state);
 extern int cpufreq_get_turbo_status(int cpuid);
+#endif
 
 static __inline__ int 
 __cpufreq_governor(struct cpufreq_policy *policy, unsigned int event)
-- 
2.7.4



* [RFC PATCH 05/31] pmstat: make pmstat functions more generalizable
  2017-11-09 17:09 [RFC PATCH 00/31] CPUFreq on ARM Oleksandr Tyshchenko
                   ` (3 preceding siblings ...)
  2017-11-09 17:09 ` [RFC PATCH 04/31] cpufreq: make turbo settings to be configurable Oleksandr Tyshchenko
@ 2017-11-09 17:09 ` Oleksandr Tyshchenko
  2017-12-02  1:21   ` Stefano Stabellini
  2017-11-09 17:09 ` [RFC PATCH 06/31] cpufreq: make cpufreq driver " Oleksandr Tyshchenko
                   ` (28 subsequent siblings)
  33 siblings, 1 reply; 108+ messages in thread
From: Oleksandr Tyshchenko @ 2017-11-09 17:09 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Andrew Cooper, Oleksandr Dmytryshyn,
	Julien Grall, Oleksandr Tyshchenko, Jan Beulich

From: Oleksandr Dmytryshyn <oleksandr.dmytryshyn@globallogic.com>

Move the ACPI-specific parts under CONFIG_ACPI ifdefs.
The pmstat functions can now be used on ARM platforms as well.
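
As a rough sketch of the effect (not part of the patch), a sub-op that is
guarded by the ifdef falls through to the default case and returns
-ENOSYS when ACPI is not configured. The demo_* names and the placeholder
values are hypothetical; CONFIG_ACPI is deliberately left undefined here
to model an ARM build without ACPI.

```c
#include <errno.h>

/* CONFIG_ACPI is intentionally NOT defined in this sketch. */

enum demo_cmd {
    DEMO_GET_AVGFREQ,       /* generic sub-op, always available */
    DEMO_GET_MAX_CSTATE,    /* ACPI-only sub-op */
};

static int demo_pm_op(enum demo_cmd cmd, unsigned int *out)
{
    int ret = 0;

    switch ( cmd )
    {
    case DEMO_GET_AVGFREQ:
        *out = 1200000;     /* placeholder frequency value */
        break;

#ifdef CONFIG_ACPI
    case DEMO_GET_MAX_CSTATE:
        *out = 3;           /* placeholder C-state limit */
        break;
#endif /* CONFIG_ACPI */

    default:
        /* Compiled-out sub-ops end up here. */
        ret = -ENOSYS;
        break;
    }

    return ret;
}
```

Callers on a non-ACPI build therefore get a defined error for the C-state
sub-ops instead of a link failure on the missing ACPI helpers.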

This is a rebased version of the original patch:
https://lists.xen.org/archives/html/xen-devel/2014-11/msg00941.html

Signed-off-by: Oleksandr Dmytryshyn <oleksandr.dmytryshyn@globallogic.com>
Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
CC: Jan Beulich <jbeulich@suse.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Julien Grall <julien.grall@linaro.org>
---
 xen/drivers/pm/stat.c    | 8 +++++++-
 xen/include/xen/pmstat.h | 2 ++
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/xen/drivers/pm/stat.c b/xen/drivers/pm/stat.c
index 133e64d..986ba41 100644
--- a/xen/drivers/pm/stat.c
+++ b/xen/drivers/pm/stat.c
@@ -35,7 +35,6 @@
 #include <asm/processor.h>
 #include <xen/percpu.h>
 #include <xen/domain.h>
-#include <xen/acpi.h>
 
 #include <public/sysctl.h>
 #include <xen/cpufreq.h>
@@ -132,6 +131,8 @@ int do_get_pm_info(struct xen_sysctl_get_pmstat *op)
         break;
     }
 
+/* For now those operations can be used only when ACPI is enabled */
+#ifdef CONFIG_ACPI
     case PMSTAT_get_max_cx:
     {
         op->u.getcx.nr = pmstat_get_cx_nr(op->cpuid);
@@ -150,6 +151,7 @@ int do_get_pm_info(struct xen_sysctl_get_pmstat *op)
         ret = pmstat_reset_cx_stat(op->cpuid);
         break;
     }
+#endif /* CONFIG_ACPI */
 
     default:
         printk("not defined sub-hypercall @ do_get_pm_info\n");
@@ -465,6 +467,7 @@ int do_pm_op(struct xen_sysctl_pm_op *op)
         break;
     }
 
+#ifdef CONFIG_ACPI
     case XEN_SYSCTL_pm_op_get_max_cstate:
     {
         op->u.get_max_cstate = acpi_get_cstate_limit();
@@ -476,6 +479,7 @@ int do_pm_op(struct xen_sysctl_pm_op *op)
         acpi_set_cstate_limit(op->u.set_max_cstate);
         break;
     }
+#endif /* CONFIG_ACPI */
 
 #ifdef CONFIG_HAS_CPU_TURBO
     case XEN_SYSCTL_pm_op_enable_turbo:
@@ -500,6 +504,7 @@ int do_pm_op(struct xen_sysctl_pm_op *op)
     return ret;
 }
 
+#ifdef CONFIG_ACPI
 int acpi_set_pdc_bits(u32 acpi_id, XEN_GUEST_HANDLE_PARAM(uint32) pdc)
 {
     u32 bits[3];
@@ -530,3 +535,4 @@ int acpi_set_pdc_bits(u32 acpi_id, XEN_GUEST_HANDLE_PARAM(uint32) pdc)
 
     return ret;
 }
+#endif /* CONFIG_ACPI */
diff --git a/xen/include/xen/pmstat.h b/xen/include/xen/pmstat.h
index 266bc16..a870c8a 100644
--- a/xen/include/xen/pmstat.h
+++ b/xen/include/xen/pmstat.h
@@ -6,10 +6,12 @@
 #include <public/sysctl.h>   /* for struct pm_cx_stat */
 
 int set_px_pminfo(uint32_t cpu, struct xen_processor_performance *perf);
+#ifdef CONFIG_ACPI
 long set_cx_pminfo(uint32_t cpu, struct xen_processor_power *power);
 uint32_t pmstat_get_cx_nr(uint32_t cpuid);
 int pmstat_get_cx_stat(uint32_t cpuid, struct pm_cx_stat *stat);
 int pmstat_reset_cx_stat(uint32_t cpuid);
+#endif
 
 int do_get_pm_info(struct xen_sysctl_get_pmstat *op);
 int do_pm_op(struct xen_sysctl_pm_op *op);
-- 
2.7.4



* [RFC PATCH 06/31] cpufreq: make cpufreq driver more generalizable
  2017-11-09 17:09 [RFC PATCH 00/31] CPUFreq on ARM Oleksandr Tyshchenko
                   ` (4 preceding siblings ...)
  2017-11-09 17:09 ` [RFC PATCH 05/31] pmstat: make pmstat functions more generalizable Oleksandr Tyshchenko
@ 2017-11-09 17:09 ` Oleksandr Tyshchenko
  2017-12-02  1:37   ` Stefano Stabellini
  2017-11-09 17:09 ` [RFC PATCH 07/31] xenpm: Clarify xenpm usage Oleksandr Tyshchenko
                   ` (27 subsequent siblings)
  33 siblings, 1 reply; 108+ messages in thread
From: Oleksandr Tyshchenko @ 2017-11-09 17:09 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Andrew Cooper, Oleksandr Dmytryshyn,
	Julien Grall, Oleksandr Tyshchenko, Jan Beulich

From: Oleksandr Dmytryshyn <oleksandr.dmytryshyn@globallogic.com>

The first implementation of the cpufreq driver was
written with x86 in mind. This patch makes it possible
for the cpufreq driver to work on both x86 and ARM
architectures.

This is a rebased version of the original patch:
https://lists.xen.org/archives/html/xen-devel/2014-11/msg00932.html

Signed-off-by: Oleksandr Dmytryshyn <oleksandr.dmytryshyn@globallogic.com>
Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
CC: Jan Beulich <jbeulich@suse.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Julien Grall <julien.grall@linaro.org>
---
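
A minimal standalone sketch of the flag checks this patch introduces,
assuming the XEN_PX_* values from xen/include/public/platform.h. The
function names below are hypothetical; in the patch itself the two
variants are selected with #ifdef CONFIG_ACPI inside is_pss_data(),
is_psd_data(), is_ppc_data() and is_all_data().

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag values from public/platform.h; XEN_PX_DATA is new in this patch. */
#define XEN_PX_PCT   1
#define XEN_PX_PSS   2
#define XEN_PX_PPC   4
#define XEN_PX_PSD   8
#define XEN_PX_DATA  16

/* ACPI build: _PSS data is present whenever its bit is set. */
bool acpi_has_pss(uint32_t flags)
{
    return (flags & XEN_PX_PSS) != 0;
}

/* ACPI build: the upload is complete only when all four bits are set. */
bool acpi_has_all(uint32_t flags)
{
    return flags == (XEN_PX_PCT | XEN_PX_PSS | XEN_PX_PSD | XEN_PX_PPC);
}

/* Non-ACPI build: a single flag value stands for PSS, PSD and PPC data. */
bool generic_has_data(uint32_t flags)
{
    return flags == XEN_PX_DATA;
}
```

With these definitions, an ACPI caller may set individual bits per
call, whereas the non-ACPI checks accept only the exact value
XEN_PX_DATA.
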
 xen/drivers/cpufreq/cpufreq.c    | 81 +++++++++++++++++++++++++++++++++++++---
 xen/include/public/platform.h    |  1 +
 xen/include/xen/processor_perf.h |  6 +++
 3 files changed, 82 insertions(+), 6 deletions(-)

diff --git a/xen/drivers/cpufreq/cpufreq.c b/xen/drivers/cpufreq/cpufreq.c
index ab909e2..64e1ae7 100644
--- a/xen/drivers/cpufreq/cpufreq.c
+++ b/xen/drivers/cpufreq/cpufreq.c
@@ -42,7 +42,6 @@
 #include <asm/io.h>
 #include <asm/processor.h>
 #include <asm/percpu.h>
-#include <acpi/acpi.h>
 #include <xen/cpufreq.h>
 
 static unsigned int __read_mostly usr_min_freq;
@@ -206,6 +205,7 @@ int cpufreq_add_cpu(unsigned int cpu)
     } else {
         /* domain sanity check under whatever coordination type */
         firstcpu = cpumask_first(cpufreq_dom->map);
+#ifdef CONFIG_ACPI
         if ((perf->domain_info.coord_type !=
             processor_pminfo[firstcpu]->perf.domain_info.coord_type) ||
             (perf->domain_info.num_processors !=
@@ -221,6 +221,19 @@ int cpufreq_add_cpu(unsigned int cpu)
                 );
             return -EINVAL;
         }
+#else /* !CONFIG_ACPI */
+        if ((perf->domain_info.num_processors !=
+            processor_pminfo[firstcpu]->perf.domain_info.num_processors)) {
+
+            printk(KERN_WARNING "cpufreq fail to add CPU%d:"
+                   "incorrect num processors (%"PRIu64"), "
+                   "expect(%"PRIu64")\n",
+                   cpu, perf->domain_info.num_processors,
+                   processor_pminfo[firstcpu]->perf.domain_info.num_processors
+                );
+            return -EINVAL;
+        }
+#endif /* CONFIG_ACPI */
     }
 
     if (!domexist || hw_all) {
@@ -380,6 +393,7 @@ int cpufreq_del_cpu(unsigned int cpu)
     return 0;
 }
 
+#ifdef CONFIG_ACPI
 static void print_PCT(struct xen_pct_register *ptr)
 {
     printk("\t_PCT: descriptor=%d, length=%d, space_id=%d, "
@@ -387,12 +401,14 @@ static void print_PCT(struct xen_pct_register *ptr)
            ptr->descriptor, ptr->length, ptr->space_id, ptr->bit_width,
            ptr->bit_offset, ptr->reserved, ptr->address);
 }
+#endif /* CONFIG_ACPI */
 
 static void print_PSS(struct xen_processor_px *ptr, int count)
 {
     int i;
     printk("\t_PSS: state_count=%d\n", count);
     for (i=0; i<count; i++){
+#ifdef CONFIG_ACPI
         printk("\tState%d: %"PRId64"MHz %"PRId64"mW %"PRId64"us "
                "%"PRId64"us %#"PRIx64" %#"PRIx64"\n",
                i,
@@ -402,15 +418,26 @@ static void print_PSS(struct xen_processor_px *ptr, int count)
                ptr[i].bus_master_latency,
                ptr[i].control,
                ptr[i].status);
+#else /* !CONFIG_ACPI */
+        printk("\tState%d: %"PRId64"MHz %"PRId64"us\n",
+               i,
+               ptr[i].core_frequency,
+               ptr[i].transition_latency);
+#endif /* CONFIG_ACPI */
     }
 }
 
 static void print_PSD( struct xen_psd_package *ptr)
 {
+#ifdef CONFIG_ACPI
     printk("\t_PSD: num_entries=%"PRId64" rev=%"PRId64
            " domain=%"PRId64" coord_type=%"PRId64" num_processors=%"PRId64"\n",
            ptr->num_entries, ptr->revision, ptr->domain, ptr->coord_type,
            ptr->num_processors);
+#else /* !CONFIG_ACPI */
+    printk("\t_PSD:  domain=%"PRId64" num_processors=%"PRId64"\n",
+           ptr->domain, ptr->num_processors);
+#endif /* CONFIG_ACPI */
 }
 
 static void print_PPC(unsigned int platform_limit)
@@ -418,13 +445,53 @@ static void print_PPC(unsigned int platform_limit)
     printk("\t_PPC: %d\n", platform_limit);
 }
 
+static inline bool is_pss_data(struct xen_processor_performance *px)
+{
+#ifdef CONFIG_ACPI
+    return px->flags & XEN_PX_PSS;
+#else
+    return px->flags == XEN_PX_DATA;
+#endif
+}
+
+static inline bool is_psd_data(struct xen_processor_performance *px)
+{
+#ifdef CONFIG_ACPI
+    return px->flags & XEN_PX_PSD;
+#else
+    return px->flags == XEN_PX_DATA;
+#endif
+}
+
+static inline bool is_ppc_data(struct xen_processor_performance *px)
+{
+#ifdef CONFIG_ACPI
+    return px->flags & XEN_PX_PPC;
+#else
+    return px->flags == XEN_PX_DATA;
+#endif
+}
+
+static inline bool is_all_data(struct xen_processor_performance *px)
+{
+#ifdef CONFIG_ACPI
+    return px->flags == ( XEN_PX_PCT | XEN_PX_PSS | XEN_PX_PSD | XEN_PX_PPC );
+#else
+    return px->flags == XEN_PX_DATA;
+#endif
+}
+
 int set_px_pminfo(uint32_t acpi_id, struct xen_processor_performance *dom0_px_info)
 {
     int ret=0, cpuid;
     struct processor_pminfo *pmpt;
     struct processor_performance *pxpt;
 
+#ifdef CONFIG_ACPI
     cpuid = get_cpu_id(acpi_id);
+#else
+    cpuid = acpi_id;
+#endif
     if ( cpuid < 0 || !dom0_px_info)
     {
         ret = -EINVAL;
@@ -446,6 +513,8 @@ int set_px_pminfo(uint32_t acpi_id, struct xen_processor_performance *dom0_px_in
         processor_pminfo[cpuid] = pmpt;
     }
     pxpt = &pmpt->perf;
+
+#ifdef CONFIG_ACPI
     pmpt->acpi_id = acpi_id;
     pmpt->id = cpuid;
 
@@ -472,8 +541,9 @@ int set_px_pminfo(uint32_t acpi_id, struct xen_processor_performance *dom0_px_in
             print_PCT(&pxpt->status_register);
         }
     }
+#endif /* CONFIG_ACPI */
 
-    if ( dom0_px_info->flags & XEN_PX_PSS ) 
+    if ( is_pss_data(dom0_px_info) )
     {
         /* capability check */
         if (dom0_px_info->state_count <= 1)
@@ -500,7 +570,7 @@ int set_px_pminfo(uint32_t acpi_id, struct xen_processor_performance *dom0_px_in
             print_PSS(pxpt->states,pxpt->state_count);
     }
 
-    if ( dom0_px_info->flags & XEN_PX_PSD )
+    if ( is_psd_data(dom0_px_info) )
     {
         /* check domain coordination */
         if (dom0_px_info->shared_type != CPUFREQ_SHARED_TYPE_ALL &&
@@ -520,7 +590,7 @@ int set_px_pminfo(uint32_t acpi_id, struct xen_processor_performance *dom0_px_in
             print_PSD(&pxpt->domain_info);
     }
 
-    if ( dom0_px_info->flags & XEN_PX_PPC )
+    if ( is_ppc_data(dom0_px_info) )
     {
         pxpt->platform_limit = dom0_px_info->platform_limit;
 
@@ -534,8 +604,7 @@ int set_px_pminfo(uint32_t acpi_id, struct xen_processor_performance *dom0_px_in
         }
     }
 
-    if ( dom0_px_info->flags == ( XEN_PX_PCT | XEN_PX_PSS |
-                XEN_PX_PSD | XEN_PX_PPC ) )
+    if ( is_all_data(dom0_px_info) )
     {
         pxpt->init = XEN_PX_INIT;
 
diff --git a/xen/include/public/platform.h b/xen/include/public/platform.h
index 94dbc3f..328579c 100644
--- a/xen/include/public/platform.h
+++ b/xen/include/public/platform.h
@@ -384,6 +384,7 @@ DEFINE_XEN_GUEST_HANDLE(xenpf_getidletime_t);
 #define XEN_PX_PSS   2
 #define XEN_PX_PPC   4
 #define XEN_PX_PSD   8
+#define XEN_PX_DATA  16
 
 struct xen_power_register {
     uint32_t     space_id;
diff --git a/xen/include/xen/processor_perf.h b/xen/include/xen/processor_perf.h
index d8a1ba6..afdccf2 100644
--- a/xen/include/xen/processor_perf.h
+++ b/xen/include/xen/processor_perf.h
@@ -3,7 +3,9 @@
 
 #include <public/platform.h>
 #include <public/sysctl.h>
+#ifdef CONFIG_ACPI
 #include <xen/acpi.h>
+#endif
 
 #define XEN_PX_INIT 0x80000000
 
@@ -24,8 +26,10 @@ int  cpufreq_del_cpu(unsigned int);
 struct processor_performance {
     uint32_t state;
     uint32_t platform_limit;
+#ifdef CONFIG_ACPI
     struct xen_pct_register control_register;
     struct xen_pct_register status_register;
+#endif
     uint32_t state_count;
     struct xen_processor_px *states;
     struct xen_psd_package domain_info;
@@ -35,8 +39,10 @@ struct processor_performance {
 };
 
 struct processor_pminfo {
+#ifdef CONFIG_ACPI
     uint32_t acpi_id;
     uint32_t id;
+#endif
     struct processor_performance    perf;
 };
 
-- 
2.7.4



* [RFC PATCH 07/31] xenpm: Clarify xenpm usage
  2017-11-09 17:09 [RFC PATCH 00/31] CPUFreq on ARM Oleksandr Tyshchenko
                   ` (5 preceding siblings ...)
  2017-11-09 17:09 ` [RFC PATCH 06/31] cpufreq: make cpufreq driver " Oleksandr Tyshchenko
@ 2017-11-09 17:09 ` Oleksandr Tyshchenko
  2017-11-09 17:13   ` Wei Liu
  2017-11-09 17:09 ` [RFC PATCH 08/31] xen/device-tree: Add dt_count_phandle_with_args helper Oleksandr Tyshchenko
                   ` (26 subsequent siblings)
  33 siblings, 1 reply; 108+ messages in thread
From: Oleksandr Tyshchenko @ 2017-11-09 17:09 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksandr Tyshchenko, Stefano Stabellini, Ian Jackson, Wei Liu,
	Julien Grall

From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

CPU frequencies are in kHz, so correct the displayed help text accordingly.

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
CC: Ian Jackson <ian.jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Julien Grall <julien.grall@linaro.org>
---
 tools/misc/xenpm.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/misc/xenpm.c b/tools/misc/xenpm.c
index 762311e..37da1d8 100644
--- a/tools/misc/xenpm.c
+++ b/tools/misc/xenpm.c
@@ -48,11 +48,11 @@ void show_help(void)
             " get-cpufreq-average   [cpuid]       average cpu frequency since last invocation\n"
             "                                     for CPU <cpuid> or all\n"
             " get-cpufreq-para      [cpuid]       list cpu freq parameter of CPU <cpuid> or all\n"
-            " set-scaling-maxfreq   [cpuid] <HZ>  set max cpu frequency <HZ> on CPU <cpuid>\n"
+            " set-scaling-maxfreq   [cpuid] <kHZ> set max cpu frequency <kHZ> on CPU <cpuid>\n"
             "                                     or all CPUs\n"
-            " set-scaling-minfreq   [cpuid] <HZ>  set min cpu frequency <HZ> on CPU <cpuid>\n"
+            " set-scaling-minfreq   [cpuid] <kHZ> set min cpu frequency <kHZ> on CPU <cpuid>\n"
             "                                     or all CPUs\n"
-            " set-scaling-speed     [cpuid] <num> set scaling speed on CPU <cpuid> or all\n"
+            " set-scaling-speed     [cpuid] <kHZ> set scaling speed <kHZ> on CPU <cpuid> or all\n"
             "                                     it is used in userspace governor.\n"
             " set-scaling-governor  [cpuid] <gov> set scaling governor on CPU <cpuid> or all\n"
             "                                     as userspace/performance/powersave/ondemand\n"
-- 
2.7.4



* [RFC PATCH 08/31] xen/device-tree: Add dt_count_phandle_with_args helper
  2017-11-09 17:09 [RFC PATCH 00/31] CPUFreq on ARM Oleksandr Tyshchenko
                   ` (6 preceding siblings ...)
  2017-11-09 17:09 ` [RFC PATCH 07/31] xenpm: Clarify xenpm usage Oleksandr Tyshchenko
@ 2017-11-09 17:09 ` Oleksandr Tyshchenko
  2017-11-09 17:09 ` [RFC PATCH 09/31] xen/device-tree: Add dt_property_for_each_string macros Oleksandr Tyshchenko
                   ` (25 subsequent siblings)
  33 siblings, 0 replies; 108+ messages in thread
From: Oleksandr Tyshchenko @ 2017-11-09 17:09 UTC (permalink / raw)
  To: xen-devel; +Cc: Oleksandr Tyshchenko

From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

Port the Linux helper of_count_phandle_with_args, which counts
the number of phandles in a property.

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Reviewed-by: Julien Grall <julien.grall@arm.com>

---
   Changes in v1:
      - Add Julien's reviewed-by

   Changes in v2:
      -
---
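
A minimal standalone sketch of what counting "phandle + argument"
tuples means, assuming a flat array of cells and a lookup table that
gives the argument count for each phandle. The struct and function
names are hypothetical and are not part of the Xen device tree API;
empty (zero) phandle entries are not modelled.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a provider node and its "#...-cells" value. */
struct provider {
    uint32_t phandle;
    uint32_t arg_cells;
};

static const struct provider *find_provider(const struct provider *tbl,
                                            size_t ntbl, uint32_t phandle)
{
    size_t i;

    for (i = 0; i < ntbl; i++)
        if (tbl[i].phandle == phandle)
            return &tbl[i];
    return NULL;
}

/* Walk the cell list; each entry is one phandle followed by its arguments. */
int count_phandle_tuples(const uint32_t *list, size_t len,
                         const struct provider *tbl, size_t ntbl)
{
    size_t pos = 0;
    int count = 0;

    while (pos < len) {
        const struct provider *p = find_provider(tbl, ntbl, list[pos]);

        if (!p)
            return -EINVAL;          /* phandle points nowhere */
        pos++;
        if (p->arg_cells > len - pos)
            return -EINVAL;          /* arguments run past the property */
        pos += p->arg_cells;
        count++;
    }
    return count;
}
```

For a list such as <1 10 0  2 5  1 11 1>, where phandle 1 takes two
argument cells and phandle 2 takes one, the walk finds three tuples.
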
 xen/common/device_tree.c      |  7 +++++++
 xen/include/xen/device_tree.h | 19 +++++++++++++++++++
 2 files changed, 26 insertions(+)

diff --git a/xen/common/device_tree.c b/xen/common/device_tree.c
index 7b009ea..60b0095 100644
--- a/xen/common/device_tree.c
+++ b/xen/common/device_tree.c
@@ -1663,6 +1663,13 @@ int dt_parse_phandle_with_args(const struct dt_device_node *np,
                                         index, out_args);
 }
 
+int dt_count_phandle_with_args(const struct dt_device_node *np,
+                               const char *list_name,
+                               const char *cells_name)
+{
+    return __dt_parse_phandle_with_args(np, list_name, cells_name, 0, -1, NULL);
+}
+
 /**
  * unflatten_dt_node - Alloc and populate a device_node from the flat tree
  * @fdt: The parent device tree blob
diff --git a/xen/include/xen/device_tree.h b/xen/include/xen/device_tree.h
index 0aecbe0..738f1b6 100644
--- a/xen/include/xen/device_tree.h
+++ b/xen/include/xen/device_tree.h
@@ -764,6 +764,25 @@ int dt_parse_phandle_with_args(const struct dt_device_node *np,
                                const char *cells_name, int index,
                                struct dt_phandle_args *out_args);
 
+/**
+ * dt_count_phandle_with_args() - Find the number of phandle references in a property
+ * @np: pointer to a device tree node containing a list
+ * @list_name: property name that contains a list
+ * @cells_name: property name that specifies phandles' arguments count
+ *
+ * Returns the number of phandle + argument tuples within a property. It
+ * is a typical pattern to encode a list of phandle and variable
+ * arguments into a single property. The number of arguments is encoded
+ * by a property in the phandle-target node. For example, a gpios
+ * property would contain a list of GPIO specifiers consisting of a
+ * phandle and 1 or more arguments. The number of arguments is
+ * determined by the #gpio-cells property in the node pointed to by the
+ * phandle.
+ */
+int dt_count_phandle_with_args(const struct dt_device_node *np,
+                               const char *list_name,
+                               const char *cells_name);
+
 #ifdef CONFIG_DEVICE_TREE_DEBUG
 #define dt_dprintk(fmt, args...)  \
     printk(XENLOG_DEBUG fmt, ## args)
-- 
2.7.4



* [RFC PATCH 09/31] xen/device-tree: Add dt_property_for_each_string macros
  2017-11-09 17:09 [RFC PATCH 00/31] CPUFreq on ARM Oleksandr Tyshchenko
                   ` (7 preceding siblings ...)
  2017-11-09 17:09 ` [RFC PATCH 08/31] xen/device-tree: Add dt_count_phandle_with_args helper Oleksandr Tyshchenko
@ 2017-11-09 17:09 ` Oleksandr Tyshchenko
  2017-12-04 23:24   ` Stefano Stabellini
  2017-11-09 17:10 ` [RFC PATCH 10/31] xen/device-tree: Add dt_property_read_u32_index helper Oleksandr Tyshchenko
                   ` (24 subsequent siblings)
  33 siblings, 1 reply; 108+ messages in thread
From: Oleksandr Tyshchenko @ 2017-11-09 17:09 UTC (permalink / raw)
  To: xen-devel; +Cc: Oleksandr Tyshchenko, Stefano Stabellini, Julien Grall

From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

This is a port from Linux.

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Julien Grall <julien.grall@linaro.org>
---
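
A minimal standalone sketch of the iteration this patch ports,
assuming a property whose value is a sequence of NUL-terminated
strings packed back to back. struct str_prop, next_string() and
for_each_string() are hypothetical stand-ins for struct dt_property,
dt_property_next_string() and dt_property_for_each_string().

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct dt_property. */
struct str_prop {
    const char *value;   /* packed strings: "a\0b\0c\0" */
    size_t length;       /* total length in bytes, including the NULs */
};

/* Return the first string when cur is NULL, else the one following cur. */
const char *next_string(const struct str_prop *prop, const char *cur)
{
    const char *next;

    if (!prop)
        return NULL;
    if (!cur)
        return prop->value;

    next = cur + strlen(cur) + 1;
    if (next >= prop->value + prop->length)
        return NULL;     /* walked past the last string */
    return next;
}

#define for_each_string(prop, s) \
    for (s = next_string(prop, NULL); s; s = next_string(prop, s))

/* Example consumer: count the strings by walking the whole list. */
int count_strings(const struct str_prop *prop)
{
    const char *s;
    int n = 0;

    for_each_string(prop, s)
        n++;
    return n;
}
```

As in the patch, the iterator returns pointers into the property data
rather than copies, so the strings stay valid only as long as the
property does.
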
 xen/common/device_tree.c      | 18 ++++++++++++++++++
 xen/include/xen/device_tree.h | 21 +++++++++++++++++++++
 2 files changed, 39 insertions(+)

diff --git a/xen/common/device_tree.c b/xen/common/device_tree.c
index 60b0095..08f8072 100644
--- a/xen/common/device_tree.c
+++ b/xen/common/device_tree.c
@@ -208,6 +208,24 @@ int dt_property_read_string(const struct dt_device_node *np,
     return 0;
 }
 
+const char *dt_property_next_string(const struct dt_property *prop,
+                                    const char *cur)
+{
+    const void *curv = cur;
+
+    if ( !prop )
+        return NULL;
+
+    if ( !cur )
+        return prop->value;
+
+    curv += strlen(cur) + 1;
+    if ( curv >= prop->value + prop->length )
+        return NULL;
+
+    return curv;
+}
+
 bool_t dt_device_is_compatible(const struct dt_device_node *device,
                                const char *compat)
 {
diff --git a/xen/include/xen/device_tree.h b/xen/include/xen/device_tree.h
index 738f1b6..9e0931c 100644
--- a/xen/include/xen/device_tree.h
+++ b/xen/include/xen/device_tree.h
@@ -420,6 +420,27 @@ int dt_property_read_string(const struct dt_device_node *np,
                             const char *propname, const char **out_string);
 
 /**
+ * dt_property_for_each_string - Iterate over an array of strings within
+ * a property with a given name for a given node.
+ *
+ * Example:
+ *
+ * struct dt_property *prop;
+ * const char *s;
+ *
+ * dt_property_for_each_string(np, "propname", prop, s)
+ *     printk("String value: %s\n", s);
+ */
+const char *dt_property_next_string(const struct dt_property *prop,
+                                    const char *cur);
+
+#define dt_property_for_each_string(np, propname, prop, s)    \
+    for (prop = dt_find_property(np, propname, NULL),         \
+        s = dt_property_next_string(prop, NULL);              \
+        s;                                                    \
+        s = dt_property_next_string(prop, s))
+
+/**
  * Checks if the given "compat" string matches one of the strings in
  * the device's "compatible" property
  */
-- 
2.7.4



* [RFC PATCH 10/31] xen/device-tree: Add dt_property_read_u32_index helper
  2017-11-09 17:09 [RFC PATCH 00/31] CPUFreq on ARM Oleksandr Tyshchenko
                   ` (8 preceding siblings ...)
  2017-11-09 17:09 ` [RFC PATCH 09/31] xen/device-tree: Add dt_property_for_each_string macros Oleksandr Tyshchenko
@ 2017-11-09 17:10 ` Oleksandr Tyshchenko
  2017-12-04 23:29   ` Stefano Stabellini
  2017-11-09 17:10 ` [RFC PATCH 11/31] xen/device-tree: Add dt_property_count_elems_of_size helper Oleksandr Tyshchenko
                   ` (23 subsequent siblings)
  33 siblings, 1 reply; 108+ messages in thread
From: Oleksandr Tyshchenko @ 2017-11-09 17:10 UTC (permalink / raw)
  To: xen-devel; +Cc: Oleksandr Tyshchenko, Stefano Stabellini, Julien Grall

From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

This is a port from Linux.

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Julien Grall <julien.grall@linaro.org>
---
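
A minimal standalone sketch of reading the nth 32-bit cell from a
multi-value property, assuming the value is stored as big-endian
bytes as in a flattened device tree. struct raw_prop and
read_u32_index() are hypothetical names. Unlike the patch, which
passes (index + 1) * sizeof(u32) as the minimum length, this sketch
compares element counts so that a very large index cannot wrap the
size computation; that is a choice of the sketch, not of the patch.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct dt_property: raw big-endian bytes. */
struct raw_prop {
    const uint8_t *value;
    size_t length;       /* length of value in bytes */
};

int read_u32_index(const struct raw_prop *prop, uint32_t index, uint32_t *out)
{
    const uint8_t *p;

    if (!prop)
        return -EINVAL;      /* property does not exist */
    if (!prop->value)
        return -ENODATA;     /* property has no value */
    if (prop->length / sizeof(uint32_t) <= index)
        return -EOVERFLOW;   /* not enough data for this index */

    /* Decode one big-endian cell; *out is written only on success. */
    p = prop->value + (size_t)index * sizeof(uint32_t);
    *out = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
    return 0;
}
```

The error codes follow the ones documented for
dt_property_read_u32_index() in this patch.
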
 xen/common/device_tree.c      | 52 +++++++++++++++++++++++++++++++++++++++++++
 xen/include/xen/device_tree.h | 20 +++++++++++++++++
 2 files changed, 72 insertions(+)

diff --git a/xen/common/device_tree.c b/xen/common/device_tree.c
index 08f8072..0fa654e 100644
--- a/xen/common/device_tree.c
+++ b/xen/common/device_tree.c
@@ -176,6 +176,58 @@ bool_t dt_property_read_u32(const struct dt_device_node *np,
     return 1;
 }
 
+/**
+ * dt_find_property_value_of_size
+ *
+ * @np:       device node from which the property value is to be read.
+ * @propname: name of the property to be searched.
+ * @min:      minimum allowed length of property value
+ * @max:      maximum allowed length of property value (0 means unlimited)
+ * @len:      if !=NULL, actual length is written to here
+ *
+ * Search for a property in a device node and validate the requested size.
+ * Returns the property value on success, -EINVAL if the property does not
+ * exist, -ENODATA if property does not have a value, and -EOVERFLOW if the
+ * property data is too small or too large.
+ */
+static void *dt_find_property_value_of_size(const struct dt_device_node *np,
+                                            const char *propname,
+                                            u32 min, u32 max, size_t *len)
+{
+    const struct dt_property *prop = dt_find_property(np, propname, NULL);
+
+    if ( !prop )
+        return ERR_PTR(-EINVAL);
+    if ( !prop->value )
+        return ERR_PTR(-ENODATA);
+    if ( prop->length < min )
+        return ERR_PTR(-EOVERFLOW);
+    if ( max && prop->length > max )
+        return ERR_PTR(-EOVERFLOW);
+
+    if ( len )
+        *len = prop->length;
+
+    return prop->value;
+}
+
+int dt_property_read_u32_index(const struct dt_device_node *np,
+                               const char *propname,
+                               u32 index, u32 *out_value)
+{
+    const u32 *val =
+        dt_find_property_value_of_size(np, propname,
+                                       ((index + 1) * sizeof(*out_value)),
+                                       0,
+                                       NULL);
+
+    if ( IS_ERR(val) )
+        return PTR_ERR(val);
+
+    *out_value = be32_to_cpup(((__be32 *)val) + index);
+
+    return 0;
+}
 
 bool_t dt_property_read_u64(const struct dt_device_node *np,
                          const char *name, u64 *out_value)
diff --git a/xen/include/xen/device_tree.h b/xen/include/xen/device_tree.h
index 9e0931c..87b4b67 100644
--- a/xen/include/xen/device_tree.h
+++ b/xen/include/xen/device_tree.h
@@ -374,6 +374,26 @@ const struct dt_property *dt_find_property(const struct dt_device_node *np,
  */
 bool_t dt_property_read_u32(const struct dt_device_node *np,
                             const char *name, u32 *out_value);
+
+/**
+ * dt_property_read_u32_index - Find and read a u32 from a multi-value property.
+ *
+ * @np:        device node from which the property value is to be read.
+ * @propname:  name of the property to be searched.
+ * @index:     index of the u32 in the list of values
+ * @out_value: pointer to return value, modified only if no error.
+ *
+ * Search for a property in a device node and read nth 32-bit value from
+ * it. Returns 0 on success, -EINVAL if the property does not exist,
+ * -ENODATA if property does not have a value, and -EOVERFLOW if the
+ * property data isn't large enough.
+ *
+ * The out_value is modified only if a valid u32 value can be decoded.
+ */
+int dt_property_read_u32_index(const struct dt_device_node *np,
+                               const char *propname,
+                               u32 index, u32 *out_value);
+
 /**
  * dt_property_read_u64 - Helper to read a u64 property.
  * @np: node to get the value
-- 
2.7.4



* [RFC PATCH 11/31] xen/device-tree: Add dt_property_count_elems_of_size helper
  2017-11-09 17:09 [RFC PATCH 00/31] CPUFreq on ARM Oleksandr Tyshchenko
                   ` (9 preceding siblings ...)
  2017-11-09 17:10 ` [RFC PATCH 10/31] xen/device-tree: Add dt_property_read_u32_index helper Oleksandr Tyshchenko
@ 2017-11-09 17:10 ` Oleksandr Tyshchenko
  2017-12-04 23:29   ` Stefano Stabellini
  2017-11-09 17:10 ` [RFC PATCH 12/31] xen/device-tree: Add dt_property_read_string_helper and friends Oleksandr Tyshchenko
                   ` (22 subsequent siblings)
  33 siblings, 1 reply; 108+ messages in thread
From: Oleksandr Tyshchenko @ 2017-11-09 17:10 UTC (permalink / raw)
  To: xen-devel; +Cc: Oleksandr Tyshchenko, Stefano Stabellini, Julien Grall

From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

This is a port from Linux.

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Julien Grall <julien.grall@linaro.org>
---
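
A minimal standalone sketch of the element-count rule documented for
dt_property_count_elems_of_size(), assuming a property described only
by a value pointer and a byte length. struct raw_prop and
count_elems_of_size() are hypothetical names, and the guard against a
non-positive elem_size is an addition of this sketch that the patch
does not contain.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct dt_property. */
struct raw_prop {
    const void *value;
    size_t length;       /* length of value in bytes */
};

int count_elems_of_size(const struct raw_prop *prop, int elem_size)
{
    /* The elem_size check is an assumption of this sketch. */
    if (!prop || elem_size <= 0)
        return -EINVAL;
    if (!prop->value)
        return -ENODATA;

    /* The length must be an exact multiple of the element size. */
    if (prop->length % (size_t)elem_size != 0)
        return -EINVAL;

    return (int)(prop->length / (size_t)elem_size);
}
```

A 12-byte property therefore holds three 4-byte elements, while asking
for 5-byte elements is rejected with -EINVAL.
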
 xen/common/device_tree.c      | 20 ++++++++++++++++++++
 xen/include/xen/device_tree.h | 15 +++++++++++++++
 2 files changed, 35 insertions(+)

diff --git a/xen/common/device_tree.c b/xen/common/device_tree.c
index 0fa654e..7b4cad3 100644
--- a/xen/common/device_tree.c
+++ b/xen/common/device_tree.c
@@ -278,6 +278,26 @@ const char *dt_property_next_string(const struct dt_property *prop,
     return curv;
 }
 
+int dt_property_count_elems_of_size(const struct dt_device_node *np,
+                                    const char *propname, int elem_size)
+{
+    const struct dt_property *prop = dt_find_property(np, propname, NULL);
+
+    if ( !prop )
+        return -EINVAL;
+    if ( !prop->value )
+        return -ENODATA;
+
+    if ( prop->length % elem_size != 0 )
+    {
+        printk("%s: size of %s is not a multiple of %d\n", np->full_name,
+               propname, elem_size);
+        return -EINVAL;
+    }
+
+    return prop->length / elem_size;
+}
+
 bool_t dt_device_is_compatible(const struct dt_device_node *device,
                                const char *compat)
 {
diff --git a/xen/include/xen/device_tree.h b/xen/include/xen/device_tree.h
index 87b4b67..e2d7346 100644
--- a/xen/include/xen/device_tree.h
+++ b/xen/include/xen/device_tree.h
@@ -461,6 +461,21 @@ const char *dt_property_next_string(const struct dt_property *prop,
         s = dt_property_next_string(prop, s))
 
 /**
+ * dt_property_count_elems_of_size - Count the number of elements in a property
+ *
+ * @np:        device node from which the property value is to be read.
+ * @propname:  name of the property to be searched.
+ * @elem_size: size of the individual element
+ *
+ * Search for a property in a device node and count the number of elements of
+ * size elem_size in it. Returns number of elements on success, -EINVAL if the
+ * property does not exist or its length does not match a multiple of elem_size
+ * and -ENODATA if the property does not have a value.
+ */
+int dt_property_count_elems_of_size(const struct dt_device_node *np,
+                                    const char *propname, int elem_size);
+
+/**
  * Checks if the given "compat" string matches one of the strings in
  * the device's "compatible" property
  */
-- 
2.7.4



* [RFC PATCH 12/31] xen/device-tree: Add dt_property_read_string_helper and friends
  2017-11-09 17:09 [RFC PATCH 00/31] CPUFreq on ARM Oleksandr Tyshchenko
                   ` (10 preceding siblings ...)
  2017-11-09 17:10 ` [RFC PATCH 11/31] xen/device-tree: Add dt_property_count_elems_of_size helper Oleksandr Tyshchenko
@ 2017-11-09 17:10 ` Oleksandr Tyshchenko
  2017-12-04 23:29   ` Stefano Stabellini
  2017-11-09 17:10 ` [RFC PATCH 13/31] xen/arm: Add driver_data field to struct device Oleksandr Tyshchenko
                   ` (21 subsequent siblings)
  33 siblings, 1 reply; 108+ messages in thread
From: Oleksandr Tyshchenko @ 2017-11-09 17:10 UTC (permalink / raw)
  To: xen-devel; +Cc: Oleksandr Tyshchenko, Stefano Stabellini, Julien Grall

From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

This is a port from Linux.

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Julien Grall <julien.grall@linaro.org>
---
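
A minimal standalone sketch of how the helper and its wrappers fit
together, assuming a property whose value is a packed list of
NUL-terminated strings. struct str_prop and the function names are
hypothetical stand-ins for struct dt_property and the
dt_property_read_string*() family; memchr() is used here in place of
strnlen() only to stay within standard C.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct dt_property. */
struct str_prop {
    const char *value;
    size_t length;       /* total bytes, including the NUL terminators */
};

/* Skip `skip` strings, then store up to `sz` pointers in out_strs. */
int read_string_helper(const struct str_prop *prop, const char **out_strs,
                       size_t sz, int skip)
{
    const char *p, *end, *nul;
    size_t l = 0;
    int i;

    if (!prop)
        return -EINVAL;
    if (!prop->value)
        return -ENODATA;
    p = prop->value;
    end = p + prop->length;

    for (i = 0; p < end && (!out_strs || (size_t)i < skip + sz); i++, p += l) {
        nul = memchr(p, '\0', (size_t)(end - p));
        if (!nul)
            return -EILSEQ;   /* last string is not terminated */
        l = (size_t)(nul - p) + 1;
        if (out_strs && i >= skip)
            *out_strs++ = p;
    }
    i -= skip;
    return i <= 0 ? -ENODATA : i;
}

/* With no output array, the helper simply counts every string. */
int count_strings(const struct str_prop *prop)
{
    return read_string_helper(prop, NULL, 0, 0);
}

/* Fetch a single string by skipping `index` entries and reading one. */
int read_string_index(const struct str_prop *prop, int index, const char **out)
{
    int rc = read_string_helper(prop, out, 1, index);

    return rc < 0 ? rc : 0;
}
```

For the packed value "a\0bc\0def\0", counting yields 3, index 1
yields "bc", and index 3 fails with -ENODATA because no string remains
after skipping three.
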
 xen/common/device_tree.c      | 27 +++++++++++++++
 xen/include/xen/device_tree.h | 81 +++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 108 insertions(+)

diff --git a/xen/common/device_tree.c b/xen/common/device_tree.c
index 7b4cad3..827eadd 100644
--- a/xen/common/device_tree.c
+++ b/xen/common/device_tree.c
@@ -260,6 +260,33 @@ int dt_property_read_string(const struct dt_device_node *np,
     return 0;
 }
 
+int dt_property_read_string_helper(const struct dt_device_node *np,
+                                   const char *propname, const char **out_strs,
+                                   size_t sz, int skip)
+{
+    const struct dt_property *prop = dt_find_property(np, propname, NULL);
+    int l = 0, i = 0;
+    const char *p, *end;
+
+    if ( !prop )
+        return -EINVAL;
+    if ( !prop->value )
+        return -ENODATA;
+    p = prop->value;
+    end = p + prop->length;
+
+    for ( i = 0; p < end && (!out_strs || i < skip + sz); i++, p += l )
+    {
+        l = strnlen(p, end - p) + 1;
+        if ( p + l > end )
+            return -EILSEQ;
+        if ( out_strs && i >= skip )
+            *out_strs++ = p;
+    }
+    i -= skip;
+    return i <= 0 ? -ENODATA : i;
+}
+
 const char *dt_property_next_string(const struct dt_property *prop,
                                     const char *cur)
 {
diff --git a/xen/include/xen/device_tree.h b/xen/include/xen/device_tree.h
index e2d7346..7e51a7a 100644
--- a/xen/include/xen/device_tree.h
+++ b/xen/include/xen/device_tree.h
@@ -440,6 +440,87 @@ int dt_property_read_string(const struct dt_device_node *np,
                             const char *propname, const char **out_string);
 
 /**
+ * dt_property_read_string_helper() - Utility helper for parsing string properties
+ * @np:       device node from which the property value is to be read.
+ * @propname: name of the property to be searched.
+ * @out_strs: output array of string pointers.
+ * @sz:       number of array elements to read.
+ * @skip:     Number of strings to skip over at beginning of list.
+ *
+ * Don't call this function directly. It is a utility helper for the
+ * dt_property_read_string*() family of functions.
+ */
+int dt_property_read_string_helper(const struct dt_device_node *np,
+                                   const char *propname, const char **out_strs,
+                                   size_t sz, int skip);
+
+/**
+ * dt_property_read_string_array() - Read an array of strings from a multiple
+ *                                   strings property.
+ * @np:       device node from which the property value is to be read.
+ * @propname: name of the property to be searched.
+ * @out_strs: output array of string pointers.
+ * @sz:       number of array elements to read.
+ *
+ * Search for a property in a device tree node and retrieve a list of
+ * terminated string values (pointer to data, not a copy) in that property.
+ *
+ * If @out_strs is NULL, the number of strings in the property is returned.
+ */
+static inline int dt_property_read_string_array(const struct dt_device_node *np,
+                                                const char *propname,
+                                                const char **out_strs,
+                                                size_t sz)
+{
+	return dt_property_read_string_helper(np, propname, out_strs, sz, 0);
+}
+
+/**
+ * dt_property_count_strings() - Find and return the number of strings from a
+ *                               multiple strings property.
+ * @np:       device node from which the property value is to be read.
+ * @propname: name of the property to be searched.
+ *
+ * Search for a property in a device tree node and retrieve the number of
+ * null-terminated strings contained in it. Returns the number of strings on
+ * success, -EINVAL if the property does not exist, -ENODATA if property
+ * does not have a value, and -EILSEQ if the string is not null-terminated
+ * within the length of the property data.
+ */
+static inline int dt_property_count_strings(const struct dt_device_node *np,
+                                            const char *propname)
+{
+	return dt_property_read_string_helper(np, propname, NULL, 0, 0);
+}
+
+/**
+ * dt_property_read_string_index() - Find and read a string from a multiple
+ *                                   strings property.
+ * @np:         device node from which the property value is to be read.
+ * @propname:   name of the property to be searched.
+ * @index:      index of the string in the list of strings
+ * @out_string: pointer to null terminated return string, modified only if
+ *              return value is 0.
+ *
+ * Search for a property in a device tree node and retrieve a null
+ * terminated string value (pointer to data, not a copy) in the list of strings
+ * contained in that property.
+ * Returns 0 on success, -EINVAL if the property does not exist, -ENODATA if
+ * property does not have a value, and -EILSEQ if the string is not
+ * null-terminated within the length of the property data.
+ *
+ * The output pointer is modified only if a valid string can be decoded.
+ */
+static inline int dt_property_read_string_index(const struct dt_device_node *np,
+                                                const char *propname,
+                                                int index, const char **output)
+{
+	int rc = dt_property_read_string_helper(np, propname, output, 1, index);
+
+	return rc < 0 ? rc : 0;
+}
+
+/**
  * dt_property_for_each_string - Iterate over an array of strings within
  * a property with a given name for a given node.
  *
-- 
2.7.4
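
For readers who want to see the counting semantics documented above in
isolation, here is a minimal user-space sketch. It is not the Xen helper:
count_strings() is a hypothetical name, it works on a raw byte buffer rather
than a struct dt_device_node, and it does not model the property lookup (so
the -EINVAL case for a missing property is left out). It only illustrates how
a value can be walked to count null terminated strings and when -ENODATA and
-EILSEQ would be returned.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for the "count" mode described above: walk a raw
 * property value of len bytes and count the null terminated strings in it.
 */
static int count_strings(const char *value, size_t len)
{
    const char *p = value;
    const char *end;
    int count = 0;

    if (!value || len == 0)
        return -ENODATA;            /* property has no value */

    end = value + len;
    while (p < end) {
        /* Look for the terminator inside the remaining property data. */
        const char *nul = memchr(p, '\0', (size_t)(end - p));

        if (!nul)
            return -EILSEQ;         /* string runs past the property data */

        p = nul + 1;                /* step over the terminator */
        count++;
    }

    return count;
}
```

A caller would typically use the count to size an array before asking for
the strings themselves, which matches the "pass NULL to get the count"
convention described in the comment for dt_property_read_string_array().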


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


* [RFC PATCH 13/31] xen/arm: Add driver_data field to struct device
  2017-11-09 17:09 [RFC PATCH 00/31] CPUFreq on ARM Oleksandr Tyshchenko
                   ` (11 preceding siblings ...)
  2017-11-09 17:10 ` [RFC PATCH 12/31] xen/device-tree: Add dt_property_read_string_helper and friends Oleksandr Tyshchenko
@ 2017-11-09 17:10 ` Oleksandr Tyshchenko
  2017-12-04 23:31   ` Stefano Stabellini
  2017-12-05 11:26   ` Julien Grall
  2017-11-09 17:10 ` [RFC PATCH 14/31] xen/arm: Add DEVICE_MAILBOX device class Oleksandr Tyshchenko
                   ` (20 subsequent siblings)
  33 siblings, 2 replies; 108+ messages in thread
From: Oleksandr Tyshchenko @ 2017-11-09 17:10 UTC (permalink / raw)
  To: xen-devel; +Cc: Oleksandr Tyshchenko, Stefano Stabellini, Julien Grall

From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Julien Grall <julien.grall@linaro.org>
---
 xen/include/asm-arm/device.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/xen/include/asm-arm/device.h b/xen/include/asm-arm/device.h
index 6734ae8..3e2f34a 100644
--- a/xen/include/asm-arm/device.h
+++ b/xen/include/asm-arm/device.h
@@ -20,6 +20,7 @@ struct device
     struct dt_device_node *of_node; /* Used by drivers imported from Linux */
 #endif
     struct dev_archdata archdata;
+    void *driver_data;
 };
 
 typedef struct device device_t;
-- 
2.7.4



* [RFC PATCH 14/31] xen/arm: Add DEVICE_MAILBOX device class
  2017-11-09 17:09 [RFC PATCH 00/31] CPUFreq on ARM Oleksandr Tyshchenko
                   ` (12 preceding siblings ...)
  2017-11-09 17:10 ` [RFC PATCH 13/31] xen/arm: Add driver_data field to struct device Oleksandr Tyshchenko
@ 2017-11-09 17:10 ` Oleksandr Tyshchenko
  2017-11-09 17:10 ` [RFC PATCH 15/31] xen/arm: Store device-tree node per cpu Oleksandr Tyshchenko
                   ` (19 subsequent siblings)
  33 siblings, 0 replies; 108+ messages in thread
From: Oleksandr Tyshchenko @ 2017-11-09 17:10 UTC (permalink / raw)
  To: xen-devel; +Cc: Oleksandr Tyshchenko, Stefano Stabellini, Julien Grall

From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Julien Grall <julien.grall@linaro.org>
---
 xen/include/asm-arm/device.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/xen/include/asm-arm/device.h b/xen/include/asm-arm/device.h
index 3e2f34a..e8ce338 100644
--- a/xen/include/asm-arm/device.h
+++ b/xen/include/asm-arm/device.h
@@ -36,6 +36,7 @@ enum device_class
     DEVICE_SERIAL,
     DEVICE_IOMMU,
     DEVICE_GIC,
+    DEVICE_MAILBOX,
     /* Use for error */
     DEVICE_UNKNOWN,
 };
-- 
2.7.4



* [RFC PATCH 15/31] xen/arm: Store device-tree node per cpu
  2017-11-09 17:09 [RFC PATCH 00/31] CPUFreq on ARM Oleksandr Tyshchenko
                   ` (13 preceding siblings ...)
  2017-11-09 17:10 ` [RFC PATCH 14/31] xen/arm: Add DEVICE_MAILBOX device class Oleksandr Tyshchenko
@ 2017-11-09 17:10 ` Oleksandr Tyshchenko
  2017-11-09 17:10 ` [RFC PATCH 16/31] arm: add SMC wrapper that is compatible with SMCCC Oleksandr Tyshchenko
                   ` (18 subsequent siblings)
  33 siblings, 0 replies; 108+ messages in thread
From: Oleksandr Tyshchenko @ 2017-11-09 17:10 UTC (permalink / raw)
  To: xen-devel; +Cc: Oleksandr Tyshchenko, Stefano Stabellini, Julien Grall

From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Julien Grall <julien.grall@linaro.org>
---
 xen/arch/arm/smpboot.c        | 5 +++++
 xen/include/xen/device_tree.h | 2 ++
 2 files changed, 7 insertions(+)

diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c
index 32e8722..caa126e 100644
--- a/xen/arch/arm/smpboot.c
+++ b/xen/arch/arm/smpboot.c
@@ -39,6 +39,7 @@ cpumask_t cpu_present_map;
 cpumask_t cpu_possible_map;
 
 struct cpuinfo_arm cpu_data[NR_CPUS];
+struct dt_device_node *cpu_dt_nodes[NR_CPUS];
 
 /* CPU logical map: map xen cpuid to an MPIDR */
 register_t __cpu_logical_map[NR_CPUS] = { [0 ... NR_CPUS-1] = MPIDR_INVALID };
@@ -110,6 +111,8 @@ static void __init dt_smp_init_cpus(void)
 
     mpidr = boot_cpu_data.mpidr.bits & MPIDR_HWID_MASK;
 
+    memset(cpu_dt_nodes, 0, sizeof(cpu_dt_nodes));
+
     if ( !cpus )
     {
         printk(XENLOG_WARNING "WARNING: Can't find /cpus in the device tree.\n"
@@ -211,6 +214,8 @@ static void __init dt_smp_init_cpus(void)
             break;
         }
 
+        cpu_dt_nodes[i] = cpu;
+
         if ( (rc = arch_cpu_init(i, cpu)) < 0 )
         {
             printk("cpu%d init failed (hwid %"PRIregister"): %d\n", i, hwid, rc);
diff --git a/xen/include/xen/device_tree.h b/xen/include/xen/device_tree.h
index 7e51a7a..98933f7 100644
--- a/xen/include/xen/device_tree.h
+++ b/xen/include/xen/device_tree.h
@@ -920,6 +920,8 @@ int dt_count_phandle_with_args(const struct dt_device_node *np,
                                const char *list_name,
                                const char *cells_name);
 
+extern struct dt_device_node *cpu_dt_nodes[];
+
 #ifdef CONFIG_DEVICE_TREE_DEBUG
 #define dt_dprintk(fmt, args...)  \
     printk(XENLOG_DEBUG fmt, ## args)
-- 
2.7.4



* [RFC PATCH 16/31] arm: add SMC wrapper that is compatible with SMCCC
  2017-11-09 17:09 [RFC PATCH 00/31] CPUFreq on ARM Oleksandr Tyshchenko
                   ` (14 preceding siblings ...)
  2017-11-09 17:10 ` [RFC PATCH 15/31] xen/arm: Store device-tree node per cpu Oleksandr Tyshchenko
@ 2017-11-09 17:10 ` Oleksandr Tyshchenko
  2017-12-05  2:30   ` Stefano Stabellini
  2017-12-05 14:58   ` Julien Grall
  2017-11-09 17:10 ` [RFC PATCH 17/31] xen/arm: Add ARM System Control and Power Interface (SCPI) protocol Oleksandr Tyshchenko
                   ` (17 subsequent siblings)
  33 siblings, 2 replies; 108+ messages in thread
From: Oleksandr Tyshchenko @ 2017-11-09 17:10 UTC (permalink / raw)
  To: xen-devel
  Cc: Edgar E. Iglesias, Stefano Stabellini, Volodymyr Babchuk, Julien Grall

From: Volodymyr Babchuk <volodymyr_babchuk@epam.com>

The existing SMC wrapper call_smc() accepts only 4 parameters and
returns only one value. This is enough for its current use in the
PSCI code, but the TEE mediator will need a call that is fully
compatible with the ARM SMCCC.
This patch adds such a call for both arm32 and arm64.

There was a similar patch by Edgar E. Iglesias ([1]), but it looks
like it has been abandoned.

[1] https://lists.xenproject.org/archives/html/xen-devel/2017-02/msg00636.html

CC: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>

Signed-off-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Julien Grall <julien.grall@linaro.org>
---
 xen/arch/arm/arm32/Makefile     |  1 +
 xen/arch/arm/arm32/smc.S        | 32 ++++++++++++++++++++++++++++++++
 xen/arch/arm/arm64/Makefile     |  1 +
 xen/arch/arm/arm64/smc.S        | 29 +++++++++++++++++++++++++++++
 xen/include/asm-arm/processor.h |  4 ++++
 5 files changed, 67 insertions(+)
 create mode 100644 xen/arch/arm/arm32/smc.S
 create mode 100644 xen/arch/arm/arm64/smc.S

diff --git a/xen/arch/arm/arm32/Makefile b/xen/arch/arm/arm32/Makefile
index 0ac254f..a2362f3 100644
--- a/xen/arch/arm/arm32/Makefile
+++ b/xen/arch/arm/arm32/Makefile
@@ -8,6 +8,7 @@ obj-y += insn.o
 obj-$(CONFIG_LIVEPATCH) += livepatch.o
 obj-y += proc-v7.o proc-caxx.o
 obj-y += smpboot.o
+obj-y += smc.o
 obj-y += traps.o
 obj-y += vfp.o
 
diff --git a/xen/arch/arm/arm32/smc.S b/xen/arch/arm/arm32/smc.S
new file mode 100644
index 0000000..1cc9528
--- /dev/null
+++ b/xen/arch/arm/arm32/smc.S
@@ -0,0 +1,32 @@
+/*
+ * xen/arch/arm/arm32/smc.S
+ *
+ * Wrapper for Secure Monitor Calls
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include <asm/macros.h>
+
+/*
+ * void call_smccc_smc(register_t a0, register_t a1, register_t a2,
+ *                     register_t a3, register_t a4, register_t a5,
+ *                     register_t a6, register_t a7, register_t res[4])
+ */
+ENTRY(call_smccc_smc)
+        mov     r12, sp
+        push    {r4-r7}
+        ldm     r12, {r4-r7}
+        smc     #0
+        pop     {r4-r7}
+        ldr     r12, [sp, #(4 * 4)]
+        stm     r12, {r0-r3}
+        bx      lr
diff --git a/xen/arch/arm/arm64/Makefile b/xen/arch/arm/arm64/Makefile
index 149b6b3..7831dc1 100644
--- a/xen/arch/arm/arm64/Makefile
+++ b/xen/arch/arm/arm64/Makefile
@@ -8,5 +8,6 @@ obj-y += entry.o
 obj-y += insn.o
 obj-$(CONFIG_LIVEPATCH) += livepatch.o
 obj-y += smpboot.o
+obj-y += smc.o
 obj-y += traps.o
 obj-y += vfp.o
diff --git a/xen/arch/arm/arm64/smc.S b/xen/arch/arm/arm64/smc.S
new file mode 100644
index 0000000..aa44fba
--- /dev/null
+++ b/xen/arch/arm/arm64/smc.S
@@ -0,0 +1,29 @@
+/*
+ * xen/arch/arm/arm64/smc.S
+ *
+ * Wrapper for Secure Monitor Calls
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include <asm/macros.h>
+
+/*
+ * void call_smccc_smc(register_t a0, register_t a1, register_t a2,
+ *                     register_t a3, register_t a4, register_t a5,
+ *                     register_t a6, register_t a7, register_t res[4])
+ */
+ENTRY(call_smccc_smc)
+        smc     #0
+        ldr     x4, [sp]
+        stp     x0, x1, [x4, 0]
+        stp     x2, x3, [x4, 16]
+        ret
diff --git a/xen/include/asm-arm/processor.h b/xen/include/asm-arm/processor.h
index 9f7a42f..4ce5bb6 100644
--- a/xen/include/asm-arm/processor.h
+++ b/xen/include/asm-arm/processor.h
@@ -786,6 +786,10 @@ void vcpu_regs_user_to_hyp(struct vcpu *vcpu,
 int call_smc(register_t function_id, register_t arg0, register_t arg1,
              register_t arg2);
 
+void call_smccc_smc(register_t a0, register_t a1, register_t a2,
+                    register_t a3, register_t a4, register_t a5,
+                    register_t a6, register_t a7, register_t res[4]);
+
 void do_trap_hyp_serror(struct cpu_user_regs *regs);
 
 void do_trap_guest_serror(struct cpu_user_regs *regs);
-- 
2.7.4
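
To illustrate the calling pattern the new prototype implies (eight argument
registers in, four result registers out), here is a small user-space sketch.
Everything in it is an assumption made for illustration: reg_t stands in for
Xen's register_t, fake_smccc_smc() is a C stub with the same shape as
call_smccc_smc() (a real build would link the assembly from this patch), and
DEMO_FID is a made-up function ID, not one defined by any firmware.

```c
#include <stdint.h>

typedef uint64_t reg_t;        /* stand-in for Xen's register_t (assumption) */

#define DEMO_FID 0x82000001u   /* made-up function ID, for illustration only */

/*
 * Stub with the same parameter shape as call_smccc_smc(). It pretends the
 * secure side adds a1 and a2 and reports success (0) in the first result.
 */
static void fake_smccc_smc(reg_t a0, reg_t a1, reg_t a2, reg_t a3,
                           reg_t a4, reg_t a5, reg_t a6, reg_t a7,
                           reg_t res[4])
{
    (void)a3; (void)a4; (void)a5; (void)a6; (void)a7;

    res[0] = (a0 == DEMO_FID) ? 0 : (reg_t)-1;  /* status */
    res[1] = a1 + a2;                           /* first payload word */
    res[2] = 0;
    res[3] = 0;
}

/* Caller pattern: function ID in a0, unused arguments zeroed, results by array. */
static int demo_call(reg_t x, reg_t y, reg_t *sum)
{
    reg_t res[4];

    fake_smccc_smc(DEMO_FID, x, y, 0, 0, 0, 0, 0, res);
    if (res[0] != 0)
        return -1;

    *sum = res[1];
    return 0;
}
```

The point of the array parameter is that a caller can read all four result
words, which the single return value of call_smc() cannot provide.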



* [RFC PATCH 17/31] xen/arm: Add ARM System Control and Power Interface (SCPI) protocol
  2017-11-09 17:09 [RFC PATCH 00/31] CPUFreq on ARM Oleksandr Tyshchenko
                   ` (15 preceding siblings ...)
  2017-11-09 17:10 ` [RFC PATCH 16/31] arm: add SMC wrapper that is compatible with SMCCC Oleksandr Tyshchenko
@ 2017-11-09 17:10 ` Oleksandr Tyshchenko
  2017-11-09 17:10 ` [RFC PATCH 18/31] xen/arm: Add mailbox infrastructure Oleksandr Tyshchenko
                   ` (16 subsequent siblings)
  33 siblings, 0 replies; 108+ messages in thread
From: Oleksandr Tyshchenko @ 2017-11-09 17:10 UTC (permalink / raw)
  To: xen-devel; +Cc: Oleksandr Tyshchenko, Stefano Stabellini, Julien Grall

From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

This code is borrowed in its entirety from Linux. Please see:
http://elixir.free-electrons.com/linux/v4.14-rc6/source/drivers/firmware/arm_scpi.c
http://elixir.free-electrons.com/linux/v4.14-rc6/source/include/linux/scpi_protocol.h

Bindings are here:
http://elixir.free-electrons.com/linux/v4.14-rc6/source/Documentation/devicetree/bindings/arm/arm,scpi.txt

The most recent protocol version can be found here:
http://infocenter.arm.com/help/topic/com.arm.doc.dui0922g/scp_message_interface_v1_2_DUI0922G_en.pdf

I ported this protocol with CPUFreq on ARM in mind.
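
As a reading aid for the command word layout used by the driver, here is a
small stand-alone sketch. The macros are copied from arm_scpi.c as added by
this patch; build_cmd() is a hypothetical helper that is not part of the
driver and only combines the two steps the driver performs separately
(packing in scpi_send_message() and adding the token in scpi_tx_prepare()).

```c
#include <stdint.h>

/* Field layout of the non-legacy command word, as defined in arm_scpi.c. */
#define CMD_ID_SHIFT            0
#define CMD_ID_MASK             0x7f
#define CMD_TOKEN_ID_SHIFT      8
#define CMD_TOKEN_ID_MASK       0xff
#define CMD_DATA_SIZE_SHIFT     16
#define CMD_DATA_SIZE_MASK      0x1ff

#define PACK_SCPI_CMD(cmd_id, tx_sz)                    \
    ((((cmd_id) & CMD_ID_MASK) << CMD_ID_SHIFT) |       \
    (((tx_sz) & CMD_DATA_SIZE_MASK) << CMD_DATA_SIZE_SHIFT))
#define ADD_SCPI_TOKEN(cmd, token)                      \
    ((cmd) |= (((token) & CMD_TOKEN_ID_MASK) << CMD_TOKEN_ID_SHIFT))

#define CMD_SIZE(cmd)   (((cmd) >> CMD_DATA_SIZE_SHIFT) & CMD_DATA_SIZE_MASK)
#define CMD_UNIQ_MASK   (CMD_TOKEN_ID_MASK << CMD_TOKEN_ID_SHIFT | CMD_ID_MASK)
#define CMD_XTRACT_UNIQ(cmd)    ((cmd) & CMD_UNIQ_MASK)

/* Hypothetical helper: pack command ID and payload size, then add the token. */
static uint32_t build_cmd(uint8_t cmd_id, unsigned int tx_len, uint8_t token)
{
    uint32_t cmd = PACK_SCPI_CMD(cmd_id, tx_len);

    ADD_SCPI_TOKEN(cmd, token);
    return cmd;
}
```

The ID and token together form the key that CMD_XTRACT_UNIQ() extracts, which
is what scpi_process_cmd() compares to match a reply against a pending
transfer in the non-legacy case.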

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Julien Grall <julien.grall@linaro.org>
---
 xen/arch/arm/cpufreq/arm_scpi.c      | 1085 ++++++++++++++++++++++++++++++++++
 xen/arch/arm/cpufreq/scpi_protocol.h |   84 +++
 2 files changed, 1169 insertions(+)
 create mode 100644 xen/arch/arm/cpufreq/arm_scpi.c
 create mode 100644 xen/arch/arm/cpufreq/scpi_protocol.h

diff --git a/xen/arch/arm/cpufreq/arm_scpi.c b/xen/arch/arm/cpufreq/arm_scpi.c
new file mode 100644
index 0000000..7da9f1b
--- /dev/null
+++ b/xen/arch/arm/cpufreq/arm_scpi.c
@@ -0,0 +1,1085 @@
+/*
+ * System Control and Power Interface (SCPI) Message Protocol driver
+ *
+ * SCPI Message Protocol is used between the System Control Processor(SCP)
+ * and the Application Processors(AP). The Message Handling Unit(MHU)
+ * provides a mechanism for inter-processor communication between SCP's
+ * Cortex M3 and AP.
+ *
+ * SCP offers control and management of the core/cluster power states,
+ * various power domain DVFS including the core/cluster, certain system
+ * clocks configuration, thermal sensors and many others.
+ *
+ * Copyright (C) 2015 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include <linux/bitmap.h>
+#include <linux/device.h>
+#include <linux/err.h>
+#include <linux/export.h>
+#include <linux/io.h>
+#include <linux/kernel.h>
+#include <linux/list.h>
+#include <linux/mailbox_client.h>
+#include <linux/module.h>
+#include <linux/of_address.h>
+#include <linux/of_platform.h>
+#include <linux/printk.h>
+#include <linux/pm_opp.h>
+#include <linux/scpi_protocol.h>
+#include <linux/slab.h>
+#include <linux/sort.h>
+#include <linux/spinlock.h>
+
+#define CMD_ID_SHIFT		0
+#define CMD_ID_MASK		0x7f
+#define CMD_TOKEN_ID_SHIFT	8
+#define CMD_TOKEN_ID_MASK	0xff
+#define CMD_DATA_SIZE_SHIFT	16
+#define CMD_DATA_SIZE_MASK	0x1ff
+#define CMD_LEGACY_DATA_SIZE_SHIFT	20
+#define CMD_LEGACY_DATA_SIZE_MASK	0x1ff
+#define PACK_SCPI_CMD(cmd_id, tx_sz)			\
+	((((cmd_id) & CMD_ID_MASK) << CMD_ID_SHIFT) |	\
+	(((tx_sz) & CMD_DATA_SIZE_MASK) << CMD_DATA_SIZE_SHIFT))
+#define ADD_SCPI_TOKEN(cmd, token)			\
+	((cmd) |= (((token) & CMD_TOKEN_ID_MASK) << CMD_TOKEN_ID_SHIFT))
+#define PACK_LEGACY_SCPI_CMD(cmd_id, tx_sz)				\
+	((((cmd_id) & CMD_ID_MASK) << CMD_ID_SHIFT) |			       \
+	(((tx_sz) & CMD_LEGACY_DATA_SIZE_MASK) << CMD_LEGACY_DATA_SIZE_SHIFT))
+
+#define CMD_SIZE(cmd)	(((cmd) >> CMD_DATA_SIZE_SHIFT) & CMD_DATA_SIZE_MASK)
+#define CMD_LEGACY_SIZE(cmd)	(((cmd) >> CMD_LEGACY_DATA_SIZE_SHIFT) & \
+					CMD_LEGACY_DATA_SIZE_MASK)
+#define CMD_UNIQ_MASK	(CMD_TOKEN_ID_MASK << CMD_TOKEN_ID_SHIFT | CMD_ID_MASK)
+#define CMD_XTRACT_UNIQ(cmd)	((cmd) & CMD_UNIQ_MASK)
+
+#define SCPI_SLOT		0
+
+#define MAX_DVFS_DOMAINS	8
+#define MAX_DVFS_OPPS		16
+#define DVFS_LATENCY(hdr)	(le32_to_cpu(hdr) >> 16)
+#define DVFS_OPP_COUNT(hdr)	((le32_to_cpu(hdr) >> 8) & 0xff)
+
+#define PROTOCOL_REV_MINOR_BITS	16
+#define PROTOCOL_REV_MINOR_MASK	((1U << PROTOCOL_REV_MINOR_BITS) - 1)
+#define PROTOCOL_REV_MAJOR(x)	((x) >> PROTOCOL_REV_MINOR_BITS)
+#define PROTOCOL_REV_MINOR(x)	((x) & PROTOCOL_REV_MINOR_MASK)
+
+#define FW_REV_MAJOR_BITS	24
+#define FW_REV_MINOR_BITS	16
+#define FW_REV_PATCH_MASK	((1U << FW_REV_MINOR_BITS) - 1)
+#define FW_REV_MINOR_MASK	((1U << FW_REV_MAJOR_BITS) - 1)
+#define FW_REV_MAJOR(x)		((x) >> FW_REV_MAJOR_BITS)
+#define FW_REV_MINOR(x)		(((x) & FW_REV_MINOR_MASK) >> FW_REV_MINOR_BITS)
+#define FW_REV_PATCH(x)		((x) & FW_REV_PATCH_MASK)
+
+#define MAX_RX_TIMEOUT		(msecs_to_jiffies(30))
+
+enum scpi_error_codes {
+	SCPI_SUCCESS = 0, /* Success */
+	SCPI_ERR_PARAM = 1, /* Invalid parameter(s) */
+	SCPI_ERR_ALIGN = 2, /* Invalid alignment */
+	SCPI_ERR_SIZE = 3, /* Invalid size */
+	SCPI_ERR_HANDLER = 4, /* Invalid handler/callback */
+	SCPI_ERR_ACCESS = 5, /* Invalid access/permission denied */
+	SCPI_ERR_RANGE = 6, /* Value out of range */
+	SCPI_ERR_TIMEOUT = 7, /* Timeout has occurred */
+	SCPI_ERR_NOMEM = 8, /* Invalid memory area or pointer */
+	SCPI_ERR_PWRSTATE = 9, /* Invalid power state */
+	SCPI_ERR_SUPPORT = 10, /* Not supported or disabled */
+	SCPI_ERR_DEVICE = 11, /* Device error */
+	SCPI_ERR_BUSY = 12, /* Device busy */
+	SCPI_ERR_MAX
+};
+
+/* SCPI Standard commands */
+enum scpi_std_cmd {
+	SCPI_CMD_INVALID		= 0x00,
+	SCPI_CMD_SCPI_READY		= 0x01,
+	SCPI_CMD_SCPI_CAPABILITIES	= 0x02,
+	SCPI_CMD_SET_CSS_PWR_STATE	= 0x03,
+	SCPI_CMD_GET_CSS_PWR_STATE	= 0x04,
+	SCPI_CMD_SET_SYS_PWR_STATE	= 0x05,
+	SCPI_CMD_SET_CPU_TIMER		= 0x06,
+	SCPI_CMD_CANCEL_CPU_TIMER	= 0x07,
+	SCPI_CMD_DVFS_CAPABILITIES	= 0x08,
+	SCPI_CMD_GET_DVFS_INFO		= 0x09,
+	SCPI_CMD_SET_DVFS		= 0x0a,
+	SCPI_CMD_GET_DVFS		= 0x0b,
+	SCPI_CMD_GET_DVFS_STAT		= 0x0c,
+	SCPI_CMD_CLOCK_CAPABILITIES	= 0x0d,
+	SCPI_CMD_GET_CLOCK_INFO		= 0x0e,
+	SCPI_CMD_SET_CLOCK_VALUE	= 0x0f,
+	SCPI_CMD_GET_CLOCK_VALUE	= 0x10,
+	SCPI_CMD_PSU_CAPABILITIES	= 0x11,
+	SCPI_CMD_GET_PSU_INFO		= 0x12,
+	SCPI_CMD_SET_PSU		= 0x13,
+	SCPI_CMD_GET_PSU		= 0x14,
+	SCPI_CMD_SENSOR_CAPABILITIES	= 0x15,
+	SCPI_CMD_SENSOR_INFO		= 0x16,
+	SCPI_CMD_SENSOR_VALUE		= 0x17,
+	SCPI_CMD_SENSOR_CFG_PERIODIC	= 0x18,
+	SCPI_CMD_SENSOR_CFG_BOUNDS	= 0x19,
+	SCPI_CMD_SENSOR_ASYNC_VALUE	= 0x1a,
+	SCPI_CMD_SET_DEVICE_PWR_STATE	= 0x1b,
+	SCPI_CMD_GET_DEVICE_PWR_STATE	= 0x1c,
+	SCPI_CMD_COUNT
+};
+
+/* SCPI Legacy Commands */
+enum legacy_scpi_std_cmd {
+	LEGACY_SCPI_CMD_INVALID			= 0x00,
+	LEGACY_SCPI_CMD_SCPI_READY		= 0x01,
+	LEGACY_SCPI_CMD_SCPI_CAPABILITIES	= 0x02,
+	LEGACY_SCPI_CMD_EVENT			= 0x03,
+	LEGACY_SCPI_CMD_SET_CSS_PWR_STATE	= 0x04,
+	LEGACY_SCPI_CMD_GET_CSS_PWR_STATE	= 0x05,
+	LEGACY_SCPI_CMD_CFG_PWR_STATE_STAT	= 0x06,
+	LEGACY_SCPI_CMD_GET_PWR_STATE_STAT	= 0x07,
+	LEGACY_SCPI_CMD_SYS_PWR_STATE		= 0x08,
+	LEGACY_SCPI_CMD_L2_READY		= 0x09,
+	LEGACY_SCPI_CMD_SET_AP_TIMER		= 0x0a,
+	LEGACY_SCPI_CMD_CANCEL_AP_TIME		= 0x0b,
+	LEGACY_SCPI_CMD_DVFS_CAPABILITIES	= 0x0c,
+	LEGACY_SCPI_CMD_GET_DVFS_INFO		= 0x0d,
+	LEGACY_SCPI_CMD_SET_DVFS		= 0x0e,
+	LEGACY_SCPI_CMD_GET_DVFS		= 0x0f,
+	LEGACY_SCPI_CMD_GET_DVFS_STAT		= 0x10,
+	LEGACY_SCPI_CMD_SET_RTC			= 0x11,
+	LEGACY_SCPI_CMD_GET_RTC			= 0x12,
+	LEGACY_SCPI_CMD_CLOCK_CAPABILITIES	= 0x13,
+	LEGACY_SCPI_CMD_SET_CLOCK_INDEX		= 0x14,
+	LEGACY_SCPI_CMD_SET_CLOCK_VALUE		= 0x15,
+	LEGACY_SCPI_CMD_GET_CLOCK_VALUE		= 0x16,
+	LEGACY_SCPI_CMD_PSU_CAPABILITIES	= 0x17,
+	LEGACY_SCPI_CMD_SET_PSU			= 0x18,
+	LEGACY_SCPI_CMD_GET_PSU			= 0x19,
+	LEGACY_SCPI_CMD_SENSOR_CAPABILITIES	= 0x1a,
+	LEGACY_SCPI_CMD_SENSOR_INFO		= 0x1b,
+	LEGACY_SCPI_CMD_SENSOR_VALUE		= 0x1c,
+	LEGACY_SCPI_CMD_SENSOR_CFG_PERIODIC	= 0x1d,
+	LEGACY_SCPI_CMD_SENSOR_CFG_BOUNDS	= 0x1e,
+	LEGACY_SCPI_CMD_SENSOR_ASYNC_VALUE	= 0x1f,
+	LEGACY_SCPI_CMD_COUNT
+};
+
+/* List all commands that are required to go through the high priority link */
+static int legacy_hpriority_cmds[] = {
+	LEGACY_SCPI_CMD_GET_CSS_PWR_STATE,
+	LEGACY_SCPI_CMD_CFG_PWR_STATE_STAT,
+	LEGACY_SCPI_CMD_GET_PWR_STATE_STAT,
+	LEGACY_SCPI_CMD_SET_DVFS,
+	LEGACY_SCPI_CMD_GET_DVFS,
+	LEGACY_SCPI_CMD_SET_RTC,
+	LEGACY_SCPI_CMD_GET_RTC,
+	LEGACY_SCPI_CMD_SET_CLOCK_INDEX,
+	LEGACY_SCPI_CMD_SET_CLOCK_VALUE,
+	LEGACY_SCPI_CMD_GET_CLOCK_VALUE,
+	LEGACY_SCPI_CMD_SET_PSU,
+	LEGACY_SCPI_CMD_GET_PSU,
+	LEGACY_SCPI_CMD_SENSOR_CFG_PERIODIC,
+	LEGACY_SCPI_CMD_SENSOR_CFG_BOUNDS,
+};
+
+/* List all commands used by this driver, used as indexes */
+enum scpi_drv_cmds {
+	CMD_SCPI_CAPABILITIES = 0,
+	CMD_GET_CLOCK_INFO,
+	CMD_GET_CLOCK_VALUE,
+	CMD_SET_CLOCK_VALUE,
+	CMD_GET_DVFS,
+	CMD_SET_DVFS,
+	CMD_GET_DVFS_INFO,
+	CMD_SENSOR_CAPABILITIES,
+	CMD_SENSOR_INFO,
+	CMD_SENSOR_VALUE,
+	CMD_SET_DEVICE_PWR_STATE,
+	CMD_GET_DEVICE_PWR_STATE,
+	CMD_MAX_COUNT,
+};
+
+static int scpi_std_commands[CMD_MAX_COUNT] = {
+	SCPI_CMD_SCPI_CAPABILITIES,
+	SCPI_CMD_GET_CLOCK_INFO,
+	SCPI_CMD_GET_CLOCK_VALUE,
+	SCPI_CMD_SET_CLOCK_VALUE,
+	SCPI_CMD_GET_DVFS,
+	SCPI_CMD_SET_DVFS,
+	SCPI_CMD_GET_DVFS_INFO,
+	SCPI_CMD_SENSOR_CAPABILITIES,
+	SCPI_CMD_SENSOR_INFO,
+	SCPI_CMD_SENSOR_VALUE,
+	SCPI_CMD_SET_DEVICE_PWR_STATE,
+	SCPI_CMD_GET_DEVICE_PWR_STATE,
+};
+
+static int scpi_legacy_commands[CMD_MAX_COUNT] = {
+	LEGACY_SCPI_CMD_SCPI_CAPABILITIES,
+	-1, /* GET_CLOCK_INFO */
+	LEGACY_SCPI_CMD_GET_CLOCK_VALUE,
+	LEGACY_SCPI_CMD_SET_CLOCK_VALUE,
+	LEGACY_SCPI_CMD_GET_DVFS,
+	LEGACY_SCPI_CMD_SET_DVFS,
+	LEGACY_SCPI_CMD_GET_DVFS_INFO,
+	LEGACY_SCPI_CMD_SENSOR_CAPABILITIES,
+	LEGACY_SCPI_CMD_SENSOR_INFO,
+	LEGACY_SCPI_CMD_SENSOR_VALUE,
+	-1, /* SET_DEVICE_PWR_STATE */
+	-1, /* GET_DEVICE_PWR_STATE */
+};
+
+struct scpi_xfer {
+	u32 slot; /* has to be first element */
+	u32 cmd;
+	u32 status;
+	const void *tx_buf;
+	void *rx_buf;
+	unsigned int tx_len;
+	unsigned int rx_len;
+	struct list_head node;
+	struct completion done;
+};
+
+struct scpi_chan {
+	struct mbox_client cl;
+	struct mbox_chan *chan;
+	void __iomem *tx_payload;
+	void __iomem *rx_payload;
+	struct list_head rx_pending;
+	struct list_head xfers_list;
+	struct scpi_xfer *xfers;
+	spinlock_t rx_lock; /* locking for the rx pending list */
+	struct mutex xfers_lock;
+	u8 token;
+};
+
+struct scpi_drvinfo {
+	u32 protocol_version;
+	u32 firmware_version;
+	bool is_legacy;
+	int num_chans;
+	int *commands;
+	DECLARE_BITMAP(cmd_priority, LEGACY_SCPI_CMD_COUNT);
+	atomic_t next_chan;
+	struct scpi_ops *scpi_ops;
+	struct scpi_chan *channels;
+	struct scpi_dvfs_info *dvfs[MAX_DVFS_DOMAINS];
+};
+
+/*
+ * The SCP firmware only executes in little-endian mode, so any buffers
+ * shared through SCPI should have their contents converted to little-endian
+ */
+struct scpi_shared_mem {
+	__le32 command;
+	__le32 status;
+	u8 payload[0];
+} __packed;
+
+struct legacy_scpi_shared_mem {
+	__le32 status;
+	u8 payload[0];
+} __packed;
+
+struct scp_capabilities {
+	__le32 protocol_version;
+	__le32 event_version;
+	__le32 platform_version;
+	__le32 commands[4];
+} __packed;
+
+struct clk_get_info {
+	__le16 id;
+	__le16 flags;
+	__le32 min_rate;
+	__le32 max_rate;
+	u8 name[20];
+} __packed;
+
+struct clk_get_value {
+	__le32 rate;
+} __packed;
+
+struct clk_set_value {
+	__le16 id;
+	__le16 reserved;
+	__le32 rate;
+} __packed;
+
+struct legacy_clk_set_value {
+	__le32 rate;
+	__le16 id;
+	__le16 reserved;
+} __packed;
+
+struct dvfs_info {
+	__le32 header;
+	struct {
+		__le32 freq;
+		__le32 m_volt;
+	} opps[MAX_DVFS_OPPS];
+} __packed;
+
+struct dvfs_set {
+	u8 domain;
+	u8 index;
+} __packed;
+
+struct sensor_capabilities {
+	__le16 sensors;
+} __packed;
+
+struct _scpi_sensor_info {
+	__le16 sensor_id;
+	u8 class;
+	u8 trigger_type;
+	char name[20];
+};
+
+struct sensor_value {
+	__le32 lo_val;
+	__le32 hi_val;
+} __packed;
+
+struct dev_pstate_set {
+	__le16 dev_id;
+	u8 pstate;
+} __packed;
+
+static struct scpi_drvinfo *scpi_info;
+
+static int scpi_linux_errmap[SCPI_ERR_MAX] = {
+	/* better than switch case as long as return value is continuous */
+	0, /* SCPI_SUCCESS */
+	-EINVAL, /* SCPI_ERR_PARAM */
+	-ENOEXEC, /* SCPI_ERR_ALIGN */
+	-EMSGSIZE, /* SCPI_ERR_SIZE */
+	-EINVAL, /* SCPI_ERR_HANDLER */
+	-EACCES, /* SCPI_ERR_ACCESS */
+	-ERANGE, /* SCPI_ERR_RANGE */
+	-ETIMEDOUT, /* SCPI_ERR_TIMEOUT */
+	-ENOMEM, /* SCPI_ERR_NOMEM */
+	-EINVAL, /* SCPI_ERR_PWRSTATE */
+	-EOPNOTSUPP, /* SCPI_ERR_SUPPORT */
+	-EIO, /* SCPI_ERR_DEVICE */
+	-EBUSY, /* SCPI_ERR_BUSY */
+};
+
+static inline int scpi_to_linux_errno(int errno)
+{
+	if (errno >= SCPI_SUCCESS && errno < SCPI_ERR_MAX)
+		return scpi_linux_errmap[errno];
+	return -EIO;
+}
+
+static void scpi_process_cmd(struct scpi_chan *ch, u32 cmd)
+{
+	unsigned long flags;
+	struct scpi_xfer *t, *match = NULL;
+
+	spin_lock_irqsave(&ch->rx_lock, flags);
+	if (list_empty(&ch->rx_pending)) {
+		spin_unlock_irqrestore(&ch->rx_lock, flags);
+		return;
+	}
+
+	/* Command type is not replied by the SCP Firmware in legacy Mode
+	 * We should consider that command is the head of pending RX commands
+	 * if the list is not empty. In TX only mode, the list would be empty.
+	 */
+	if (scpi_info->is_legacy) {
+		match = list_first_entry(&ch->rx_pending, struct scpi_xfer,
+					 node);
+		list_del(&match->node);
+	} else {
+		list_for_each_entry(t, &ch->rx_pending, node)
+			if (CMD_XTRACT_UNIQ(t->cmd) == CMD_XTRACT_UNIQ(cmd)) {
+				list_del(&t->node);
+				match = t;
+				break;
+			}
+	}
+	/* check if wait_for_completion is in progress or timed-out */
+	if (match && !completion_done(&match->done)) {
+		unsigned int len;
+
+		if (scpi_info->is_legacy) {
+			struct legacy_scpi_shared_mem *mem = ch->rx_payload;
+
+			/* RX Length is not replied by the legacy Firmware */
+			len = match->rx_len;
+
+			match->status = le32_to_cpu(mem->status);
+			memcpy_fromio(match->rx_buf, mem->payload, len);
+		} else {
+			struct scpi_shared_mem *mem = ch->rx_payload;
+
+			len = min(match->rx_len, CMD_SIZE(cmd));
+
+			match->status = le32_to_cpu(mem->status);
+			memcpy_fromio(match->rx_buf, mem->payload, len);
+		}
+
+		if (match->rx_len > len)
+			memset(match->rx_buf + len, 0, match->rx_len - len);
+		complete(&match->done);
+	}
+	spin_unlock_irqrestore(&ch->rx_lock, flags);
+}
+
+static void scpi_handle_remote_msg(struct mbox_client *c, void *msg)
+{
+	struct scpi_chan *ch = container_of(c, struct scpi_chan, cl);
+	struct scpi_shared_mem *mem = ch->rx_payload;
+	u32 cmd = 0;
+
+	if (!scpi_info->is_legacy)
+		cmd = le32_to_cpu(mem->command);
+
+	scpi_process_cmd(ch, cmd);
+}
+
+static void scpi_tx_prepare(struct mbox_client *c, void *msg)
+{
+	unsigned long flags;
+	struct scpi_xfer *t = msg;
+	struct scpi_chan *ch = container_of(c, struct scpi_chan, cl);
+	struct scpi_shared_mem *mem = (struct scpi_shared_mem *)ch->tx_payload;
+
+	if (t->tx_buf) {
+		if (scpi_info->is_legacy)
+			memcpy_toio(ch->tx_payload, t->tx_buf, t->tx_len);
+		else
+			memcpy_toio(mem->payload, t->tx_buf, t->tx_len);
+	}
+
+	if (t->rx_buf) {
+		if (!(++ch->token))
+			++ch->token;
+		ADD_SCPI_TOKEN(t->cmd, ch->token);
+		spin_lock_irqsave(&ch->rx_lock, flags);
+		list_add_tail(&t->node, &ch->rx_pending);
+		spin_unlock_irqrestore(&ch->rx_lock, flags);
+	}
+
+	if (!scpi_info->is_legacy)
+		mem->command = cpu_to_le32(t->cmd);
+}
+
+static struct scpi_xfer *get_scpi_xfer(struct scpi_chan *ch)
+{
+	struct scpi_xfer *t;
+
+	mutex_lock(&ch->xfers_lock);
+	if (list_empty(&ch->xfers_list)) {
+		mutex_unlock(&ch->xfers_lock);
+		return NULL;
+	}
+	t = list_first_entry(&ch->xfers_list, struct scpi_xfer, node);
+	list_del(&t->node);
+	mutex_unlock(&ch->xfers_lock);
+	return t;
+}
+
+static void put_scpi_xfer(struct scpi_xfer *t, struct scpi_chan *ch)
+{
+	mutex_lock(&ch->xfers_lock);
+	list_add_tail(&t->node, &ch->xfers_list);
+	mutex_unlock(&ch->xfers_lock);
+}
+
+static int scpi_send_message(u8 idx, void *tx_buf, unsigned int tx_len,
+			     void *rx_buf, unsigned int rx_len)
+{
+	int ret;
+	u8 chan;
+	u8 cmd;
+	struct scpi_xfer *msg;
+	struct scpi_chan *scpi_chan;
+
+	if (scpi_info->commands[idx] < 0)
+		return -EOPNOTSUPP;
+
+	cmd = scpi_info->commands[idx];
+
+	if (scpi_info->is_legacy)
+		chan = test_bit(cmd, scpi_info->cmd_priority) ? 1 : 0;
+	else
+		chan = atomic_inc_return(&scpi_info->next_chan) %
+			scpi_info->num_chans;
+	scpi_chan = scpi_info->channels + chan;
+
+	msg = get_scpi_xfer(scpi_chan);
+	if (!msg)
+		return -ENOMEM;
+
+	if (scpi_info->is_legacy) {
+		msg->cmd = PACK_LEGACY_SCPI_CMD(cmd, tx_len);
+		msg->slot = msg->cmd;
+	} else {
+		msg->slot = BIT(SCPI_SLOT);
+		msg->cmd = PACK_SCPI_CMD(cmd, tx_len);
+	}
+	msg->tx_buf = tx_buf;
+	msg->tx_len = tx_len;
+	msg->rx_buf = rx_buf;
+	msg->rx_len = rx_len;
+	reinit_completion(&msg->done);
+
+	ret = mbox_send_message(scpi_chan->chan, msg);
+	if (ret < 0 || !rx_buf)
+		goto out;
+
+	if (!wait_for_completion_timeout(&msg->done, MAX_RX_TIMEOUT))
+		ret = -ETIMEDOUT;
+	else
+		/* first status word */
+		ret = msg->status;
+out:
+	if (ret < 0 && rx_buf) /* remove entry from the list if timed-out */
+		scpi_process_cmd(scpi_chan, msg->cmd);
+
+	put_scpi_xfer(msg, scpi_chan);
+	/* SCPI error codes > 0, translate them to Linux scale */
+	return ret > 0 ? scpi_to_linux_errno(ret) : ret;
+}
+
+static u32 scpi_get_version(void)
+{
+	return scpi_info->protocol_version;
+}
+
+static int
+scpi_clk_get_range(u16 clk_id, unsigned long *min, unsigned long *max)
+{
+	int ret;
+	struct clk_get_info clk;
+	__le16 le_clk_id = cpu_to_le16(clk_id);
+
+	ret = scpi_send_message(CMD_GET_CLOCK_INFO, &le_clk_id,
+				sizeof(le_clk_id), &clk, sizeof(clk));
+	if (!ret) {
+		*min = le32_to_cpu(clk.min_rate);
+		*max = le32_to_cpu(clk.max_rate);
+	}
+	return ret;
+}
+
+static unsigned long scpi_clk_get_val(u16 clk_id)
+{
+	int ret;
+	struct clk_get_value clk;
+	__le16 le_clk_id = cpu_to_le16(clk_id);
+
+	ret = scpi_send_message(CMD_GET_CLOCK_VALUE, &le_clk_id,
+				sizeof(le_clk_id), &clk, sizeof(clk));
+
+	return ret ? ret : le32_to_cpu(clk.rate);
+}
+
+static int scpi_clk_set_val(u16 clk_id, unsigned long rate)
+{
+	int stat;
+	struct clk_set_value clk = {
+		.id = cpu_to_le16(clk_id),
+		.rate = cpu_to_le32(rate)
+	};
+
+	return scpi_send_message(CMD_SET_CLOCK_VALUE, &clk, sizeof(clk),
+				 &stat, sizeof(stat));
+}
+
+static int legacy_scpi_clk_set_val(u16 clk_id, unsigned long rate)
+{
+	int stat;
+	struct legacy_clk_set_value clk = {
+		.id = cpu_to_le16(clk_id),
+		.rate = cpu_to_le32(rate)
+	};
+
+	return scpi_send_message(CMD_SET_CLOCK_VALUE, &clk, sizeof(clk),
+				 &stat, sizeof(stat));
+}
+
+static int scpi_dvfs_get_idx(u8 domain)
+{
+	int ret;
+	u8 dvfs_idx;
+
+	ret = scpi_send_message(CMD_GET_DVFS, &domain, sizeof(domain),
+				&dvfs_idx, sizeof(dvfs_idx));
+
+	return ret ? ret : dvfs_idx;
+}
+
+static int scpi_dvfs_set_idx(u8 domain, u8 index)
+{
+	int stat;
+	struct dvfs_set dvfs = {domain, index};
+
+	return scpi_send_message(CMD_SET_DVFS, &dvfs, sizeof(dvfs),
+				 &stat, sizeof(stat));
+}
+
+static int opp_cmp_func(const void *opp1, const void *opp2)
+{
+	const struct scpi_opp *t1 = opp1, *t2 = opp2;
+
+	return (t1->freq > t2->freq) - (t1->freq < t2->freq);
+}
+
+static struct scpi_dvfs_info *scpi_dvfs_get_info(u8 domain)
+{
+	struct scpi_dvfs_info *info;
+	struct scpi_opp *opp;
+	struct dvfs_info buf;
+	int ret, i;
+
+	if (domain >= MAX_DVFS_DOMAINS)
+		return ERR_PTR(-EINVAL);
+
+	if (scpi_info->dvfs[domain])	/* data already populated */
+		return scpi_info->dvfs[domain];
+
+	ret = scpi_send_message(CMD_GET_DVFS_INFO, &domain, sizeof(domain),
+				&buf, sizeof(buf));
+	if (ret)
+		return ERR_PTR(ret);
+
+	info = kmalloc(sizeof(*info), GFP_KERNEL);
+	if (!info)
+		return ERR_PTR(-ENOMEM);
+
+	info->count = DVFS_OPP_COUNT(buf.header);
+	info->latency = DVFS_LATENCY(buf.header) * 1000; /* uS to nS */
+
+	info->opps = kcalloc(info->count, sizeof(*opp), GFP_KERNEL);
+	if (!info->opps) {
+		kfree(info);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	for (i = 0, opp = info->opps; i < info->count; i++, opp++) {
+		opp->freq = le32_to_cpu(buf.opps[i].freq);
+		opp->m_volt = le32_to_cpu(buf.opps[i].m_volt);
+	}
+
+	sort(info->opps, info->count, sizeof(*opp), opp_cmp_func, NULL);
+
+	scpi_info->dvfs[domain] = info;
+	return info;
+}
+
+static int scpi_dev_domain_id(struct device *dev)
+{
+	struct of_phandle_args clkspec;
+
+	if (of_parse_phandle_with_args(dev->of_node, "clocks", "#clock-cells",
+				       0, &clkspec))
+		return -EINVAL;
+
+	return clkspec.args[0];
+}
+
+static struct scpi_dvfs_info *scpi_dvfs_info(struct device *dev)
+{
+	int domain = scpi_dev_domain_id(dev);
+
+	if (domain < 0)
+		return ERR_PTR(domain);
+
+	return scpi_dvfs_get_info(domain);
+}
+
+static int scpi_dvfs_get_transition_latency(struct device *dev)
+{
+	struct scpi_dvfs_info *info = scpi_dvfs_info(dev);
+
+	if (IS_ERR(info))
+		return PTR_ERR(info);
+
+	if (!info->latency)
+		return 0;
+
+	return info->latency;
+}
+
+static int scpi_dvfs_add_opps_to_device(struct device *dev)
+{
+	int idx, ret;
+	struct scpi_opp *opp;
+	struct scpi_dvfs_info *info = scpi_dvfs_info(dev);
+
+	if (IS_ERR(info))
+		return PTR_ERR(info);
+
+	if (!info->opps)
+		return -EIO;
+
+	for (opp = info->opps, idx = 0; idx < info->count; idx++, opp++) {
+		ret = dev_pm_opp_add(dev, opp->freq, opp->m_volt * 1000);
+		if (ret) {
+			dev_warn(dev, "failed to add opp %uHz %umV\n",
+				 opp->freq, opp->m_volt);
+			while (idx-- > 0)
+				dev_pm_opp_remove(dev, (--opp)->freq);
+			return ret;
+		}
+	}
+	return 0;
+}
+
+static int scpi_sensor_get_capability(u16 *sensors)
+{
+	struct sensor_capabilities cap_buf;
+	int ret;
+
+	ret = scpi_send_message(CMD_SENSOR_CAPABILITIES, NULL, 0, &cap_buf,
+				sizeof(cap_buf));
+	if (!ret)
+		*sensors = le16_to_cpu(cap_buf.sensors);
+
+	return ret;
+}
+
+static int scpi_sensor_get_info(u16 sensor_id, struct scpi_sensor_info *info)
+{
+	__le16 id = cpu_to_le16(sensor_id);
+	struct _scpi_sensor_info _info;
+	int ret;
+
+	ret = scpi_send_message(CMD_SENSOR_INFO, &id, sizeof(id),
+				&_info, sizeof(_info));
+	if (!ret) {
+		memcpy(info, &_info, sizeof(*info));
+		info->sensor_id = le16_to_cpu(_info.sensor_id);
+	}
+
+	return ret;
+}
+
+static int scpi_sensor_get_value(u16 sensor, u64 *val)
+{
+	__le16 id = cpu_to_le16(sensor);
+	struct sensor_value buf;
+	int ret;
+
+	ret = scpi_send_message(CMD_SENSOR_VALUE, &id, sizeof(id),
+				&buf, sizeof(buf));
+	if (ret)
+		return ret;
+
+	if (scpi_info->is_legacy)
+		/* only 32-bits supported, hi_val can be junk */
+		*val = le32_to_cpu(buf.lo_val);
+	else
+		*val = (u64)le32_to_cpu(buf.hi_val) << 32 |
+			le32_to_cpu(buf.lo_val);
+
+	return 0;
+}
+
+static int scpi_device_get_power_state(u16 dev_id)
+{
+	int ret;
+	u8 pstate;
+	__le16 id = cpu_to_le16(dev_id);
+
+	ret = scpi_send_message(CMD_GET_DEVICE_PWR_STATE, &id,
+				sizeof(id), &pstate, sizeof(pstate));
+	return ret ? ret : pstate;
+}
+
+static int scpi_device_set_power_state(u16 dev_id, u8 pstate)
+{
+	int stat;
+	struct dev_pstate_set dev_set = {
+		.dev_id = cpu_to_le16(dev_id),
+		.pstate = pstate,
+	};
+
+	return scpi_send_message(CMD_SET_DEVICE_PWR_STATE, &dev_set,
+				 sizeof(dev_set), &stat, sizeof(stat));
+}
+
+static struct scpi_ops scpi_ops = {
+	.get_version = scpi_get_version,
+	.clk_get_range = scpi_clk_get_range,
+	.clk_get_val = scpi_clk_get_val,
+	.clk_set_val = scpi_clk_set_val,
+	.dvfs_get_idx = scpi_dvfs_get_idx,
+	.dvfs_set_idx = scpi_dvfs_set_idx,
+	.dvfs_get_info = scpi_dvfs_get_info,
+	.device_domain_id = scpi_dev_domain_id,
+	.get_transition_latency = scpi_dvfs_get_transition_latency,
+	.add_opps_to_device = scpi_dvfs_add_opps_to_device,
+	.sensor_get_capability = scpi_sensor_get_capability,
+	.sensor_get_info = scpi_sensor_get_info,
+	.sensor_get_value = scpi_sensor_get_value,
+	.device_get_power_state = scpi_device_get_power_state,
+	.device_set_power_state = scpi_device_set_power_state,
+};
+
+struct scpi_ops *get_scpi_ops(void)
+{
+	return scpi_info ? scpi_info->scpi_ops : NULL;
+}
+EXPORT_SYMBOL_GPL(get_scpi_ops);
+
+static int scpi_init_versions(struct scpi_drvinfo *info)
+{
+	int ret;
+	struct scp_capabilities caps;
+
+	ret = scpi_send_message(CMD_SCPI_CAPABILITIES, NULL, 0,
+				&caps, sizeof(caps));
+	if (!ret) {
+		info->protocol_version = le32_to_cpu(caps.protocol_version);
+		info->firmware_version = le32_to_cpu(caps.platform_version);
+	}
+	/* Ignore error if not implemented */
+	if (scpi_info->is_legacy && ret == -EOPNOTSUPP)
+		return 0;
+
+	return ret;
+}
+
+static ssize_t protocol_version_show(struct device *dev,
+				     struct device_attribute *attr, char *buf)
+{
+	struct scpi_drvinfo *scpi_info = dev_get_drvdata(dev);
+
+	return sprintf(buf, "%d.%d\n",
+		       PROTOCOL_REV_MAJOR(scpi_info->protocol_version),
+		       PROTOCOL_REV_MINOR(scpi_info->protocol_version));
+}
+static DEVICE_ATTR_RO(protocol_version);
+
+static ssize_t firmware_version_show(struct device *dev,
+				     struct device_attribute *attr, char *buf)
+{
+	struct scpi_drvinfo *scpi_info = dev_get_drvdata(dev);
+
+	return sprintf(buf, "%d.%d.%d\n",
+		       FW_REV_MAJOR(scpi_info->firmware_version),
+		       FW_REV_MINOR(scpi_info->firmware_version),
+		       FW_REV_PATCH(scpi_info->firmware_version));
+}
+static DEVICE_ATTR_RO(firmware_version);
+
+static struct attribute *versions_attrs[] = {
+	&dev_attr_firmware_version.attr,
+	&dev_attr_protocol_version.attr,
+	NULL,
+};
+ATTRIBUTE_GROUPS(versions);
+
+static void
+scpi_free_channels(struct device *dev, struct scpi_chan *pchan, int count)
+{
+	int i;
+
+	for (i = 0; i < count && pchan->chan; i++, pchan++) {
+		mbox_free_channel(pchan->chan);
+		devm_kfree(dev, pchan->xfers);
+		devm_iounmap(dev, pchan->rx_payload);
+	}
+}
+
+static int scpi_remove(struct platform_device *pdev)
+{
+	int i;
+	struct device *dev = &pdev->dev;
+	struct scpi_drvinfo *info = platform_get_drvdata(pdev);
+
+	scpi_info = NULL; /* stop exporting SCPI ops through get_scpi_ops */
+
+	of_platform_depopulate(dev);
+	sysfs_remove_groups(&dev->kobj, versions_groups);
+	scpi_free_channels(dev, info->channels, info->num_chans);
+	platform_set_drvdata(pdev, NULL);
+
+	for (i = 0; i < MAX_DVFS_DOMAINS && info->dvfs[i]; i++) {
+		kfree(info->dvfs[i]->opps);
+		kfree(info->dvfs[i]);
+	}
+	devm_kfree(dev, info->channels);
+	devm_kfree(dev, info);
+
+	return 0;
+}
+
+#define MAX_SCPI_XFERS		10
+static int scpi_alloc_xfer_list(struct device *dev, struct scpi_chan *ch)
+{
+	int i;
+	struct scpi_xfer *xfers;
+
+	xfers = devm_kzalloc(dev, MAX_SCPI_XFERS * sizeof(*xfers), GFP_KERNEL);
+	if (!xfers)
+		return -ENOMEM;
+
+	ch->xfers = xfers;
+	for (i = 0; i < MAX_SCPI_XFERS; i++, xfers++) {
+		init_completion(&xfers->done);
+		list_add_tail(&xfers->node, &ch->xfers_list);
+	}
+
+	return 0;
+}
+
+static const struct of_device_id legacy_scpi_of_match[] = {
+	{.compatible = "arm,scpi-pre-1.0"},
+	{},
+};
+
+static int scpi_probe(struct platform_device *pdev)
+{
+	int count, idx, ret;
+	struct resource res;
+	struct scpi_chan *scpi_chan;
+	struct device *dev = &pdev->dev;
+	struct device_node *np = dev->of_node;
+
+	scpi_info = devm_kzalloc(dev, sizeof(*scpi_info), GFP_KERNEL);
+	if (!scpi_info)
+		return -ENOMEM;
+
+	if (of_match_device(legacy_scpi_of_match, &pdev->dev))
+		scpi_info->is_legacy = true;
+
+	count = of_count_phandle_with_args(np, "mboxes", "#mbox-cells");
+	if (count < 0) {
+		dev_err(dev, "no mboxes property in '%pOF'\n", np);
+		return -ENODEV;
+	}
+
+	scpi_chan = devm_kcalloc(dev, count, sizeof(*scpi_chan), GFP_KERNEL);
+	if (!scpi_chan)
+		return -ENOMEM;
+
+	for (idx = 0; idx < count; idx++) {
+		resource_size_t size;
+		struct scpi_chan *pchan = scpi_chan + idx;
+		struct mbox_client *cl = &pchan->cl;
+		struct device_node *shmem = of_parse_phandle(np, "shmem", idx);
+
+		ret = of_address_to_resource(shmem, 0, &res);
+		of_node_put(shmem);
+		if (ret) {
+			dev_err(dev, "failed to get SCPI payload mem resource\n");
+			goto err;
+		}
+
+		size = resource_size(&res);
+		pchan->rx_payload = devm_ioremap(dev, res.start, size);
+		if (!pchan->rx_payload) {
+			dev_err(dev, "failed to ioremap SCPI payload\n");
+			ret = -EADDRNOTAVAIL;
+			goto err;
+		}
+		pchan->tx_payload = pchan->rx_payload + (size >> 1);
+
+		cl->dev = dev;
+		cl->rx_callback = scpi_handle_remote_msg;
+		cl->tx_prepare = scpi_tx_prepare;
+		cl->tx_block = true;
+		cl->tx_tout = 20;
+		cl->knows_txdone = false; /* controller can't ack */
+
+		INIT_LIST_HEAD(&pchan->rx_pending);
+		INIT_LIST_HEAD(&pchan->xfers_list);
+		spin_lock_init(&pchan->rx_lock);
+		mutex_init(&pchan->xfers_lock);
+
+		ret = scpi_alloc_xfer_list(dev, pchan);
+		if (!ret) {
+			pchan->chan = mbox_request_channel(cl, idx);
+			if (!IS_ERR(pchan->chan))
+				continue;
+			ret = PTR_ERR(pchan->chan);
+			if (ret != -EPROBE_DEFER)
+				dev_err(dev, "failed to get channel%d err %d\n",
+					idx, ret);
+		}
+err:
+		scpi_free_channels(dev, scpi_chan, idx);
+		scpi_info = NULL;
+		return ret;
+	}
+
+	scpi_info->channels = scpi_chan;
+	scpi_info->num_chans = count;
+	scpi_info->commands = scpi_std_commands;
+
+	platform_set_drvdata(pdev, scpi_info);
+
+	if (scpi_info->is_legacy) {
+		/* Replace with legacy variants */
+		scpi_ops.clk_set_val = legacy_scpi_clk_set_val;
+		scpi_info->commands = scpi_legacy_commands;
+
+		/* Fill priority bitmap */
+		for (idx = 0; idx < ARRAY_SIZE(legacy_hpriority_cmds); idx++)
+			set_bit(legacy_hpriority_cmds[idx],
+				scpi_info->cmd_priority);
+	}
+
+	ret = scpi_init_versions(scpi_info);
+	if (ret) {
+		dev_err(dev, "incorrect or no SCP firmware found\n");
+		scpi_remove(pdev);
+		return ret;
+	}
+
+	_dev_info(dev, "SCP Protocol %d.%d Firmware %d.%d.%d version\n",
+		  PROTOCOL_REV_MAJOR(scpi_info->protocol_version),
+		  PROTOCOL_REV_MINOR(scpi_info->protocol_version),
+		  FW_REV_MAJOR(scpi_info->firmware_version),
+		  FW_REV_MINOR(scpi_info->firmware_version),
+		  FW_REV_PATCH(scpi_info->firmware_version));
+	scpi_info->scpi_ops = &scpi_ops;
+
+	ret = sysfs_create_groups(&dev->kobj, versions_groups);
+	if (ret)
+		dev_err(dev, "unable to create sysfs version group\n");
+
+	return of_platform_populate(dev->of_node, NULL, NULL, dev);
+}
+
+static const struct of_device_id scpi_of_match[] = {
+	{.compatible = "arm,scpi"},
+	{.compatible = "arm,scpi-pre-1.0"},
+	{},
+};
+
+MODULE_DEVICE_TABLE(of, scpi_of_match);
+
+static struct platform_driver scpi_driver = {
+	.driver = {
+		.name = "scpi_protocol",
+		.of_match_table = scpi_of_match,
+	},
+	.probe = scpi_probe,
+	.remove = scpi_remove,
+};
+module_platform_driver(scpi_driver);
+
+MODULE_AUTHOR("Sudeep Holla <sudeep.holla@arm.com>");
+MODULE_DESCRIPTION("ARM SCPI mailbox protocol driver");
+MODULE_LICENSE("GPL v2");
diff --git a/xen/arch/arm/cpufreq/scpi_protocol.h b/xen/arch/arm/cpufreq/scpi_protocol.h
new file mode 100644
index 0000000..327d656
--- /dev/null
+++ b/xen/arch/arm/cpufreq/scpi_protocol.h
@@ -0,0 +1,84 @@
+/*
+ * SCPI Message Protocol driver header
+ *
+ * Copyright (C) 2014 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+#include <linux/types.h>
+
+struct scpi_opp {
+	u32 freq;
+	u32 m_volt;
+} __packed;
+
+struct scpi_dvfs_info {
+	unsigned int count;
+	unsigned int latency; /* in nanoseconds */
+	struct scpi_opp *opps;
+};
+
+enum scpi_sensor_class {
+	TEMPERATURE,
+	VOLTAGE,
+	CURRENT,
+	POWER,
+	ENERGY,
+};
+
+struct scpi_sensor_info {
+	u16 sensor_id;
+	u8 class;
+	u8 trigger_type;
+	char name[20];
+} __packed;
+
+/**
+ * struct scpi_ops - represents the various operations provided
+ *	by SCP through SCPI message protocol
+ * @get_version: returns the major and minor revision of the SCPI
+ *	message protocol
+ * @clk_get_range: gets the clock range limits (min - max, in Hz)
+ * @clk_get_val: gets the clock value (in Hz)
+ * @clk_set_val: sets the clock value, setting to 0 will disable the
+ *	clock (if supported)
+ * @dvfs_get_idx: gets the Operating Point of the given power domain.
+ *	OPP is an index to the list returned by @dvfs_get_info
+ * @dvfs_set_idx: sets the Operating Point of the given power domain.
+ *	OPP is an index to the list returned by @dvfs_get_info
+ * @dvfs_get_info: returns the DVFS capabilities of the given power
+ *	domain. It includes the OPP list and the latency information
+ */
+struct scpi_ops {
+	u32 (*get_version)(void);
+	int (*clk_get_range)(u16, unsigned long *, unsigned long *);
+	unsigned long (*clk_get_val)(u16);
+	int (*clk_set_val)(u16, unsigned long);
+	int (*dvfs_get_idx)(u8);
+	int (*dvfs_set_idx)(u8, u8);
+	struct scpi_dvfs_info *(*dvfs_get_info)(u8);
+	int (*device_domain_id)(struct device *);
+	int (*get_transition_latency)(struct device *);
+	int (*add_opps_to_device)(struct device *);
+	int (*sensor_get_capability)(u16 *sensors);
+	int (*sensor_get_info)(u16 sensor_id, struct scpi_sensor_info *);
+	int (*sensor_get_value)(u16, u64 *);
+	int (*device_get_power_state)(u16);
+	int (*device_set_power_state)(u16, u8);
+};
+
+#if IS_REACHABLE(CONFIG_ARM_SCPI_PROTOCOL)
+struct scpi_ops *get_scpi_ops(void);
+#else
+static inline struct scpi_ops *get_scpi_ops(void) { return NULL; }
+#endif
-- 
2.7.4



* [RFC PATCH 18/31] xen/arm: Add mailbox infrastructure
  2017-11-09 17:09 [RFC PATCH 00/31] CPUFreq on ARM Oleksandr Tyshchenko
                   ` (16 preceding siblings ...)
  2017-11-09 17:10 ` [RFC PATCH 17/31] xen/arm: Add ARM System Control and Power Interface (SCPI) protocol Oleksandr Tyshchenko
@ 2017-11-09 17:10 ` Oleksandr Tyshchenko
  2017-11-09 17:10 ` [RFC PATCH 19/31] xen/arm: Introduce ARM SMC based mailbox Oleksandr Tyshchenko
                   ` (15 subsequent siblings)
  33 siblings, 0 replies; 108+ messages in thread
From: Oleksandr Tyshchenko @ 2017-11-09 17:10 UTC (permalink / raw)
  To: xen-devel; +Cc: Oleksandr Tyshchenko, Stefano Stabellini, Julien Grall

From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

The mailbox feature is used by the SCPI protocol for inter-processor
communication between the System Control Processor (SCP) and the
Application Processor(s) (AP). The existing SCPI implementation uses
the mailbox together with a shared memory region: the message payload
is exchanged through shared memory, while the mailbox itself only
signals the SCP that a request is waiting to be handled.
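
To illustrate the split between shared memory and mailbox described
above, here is a minimal standalone sketch. It is not part of the patch
and does not use the mailbox API: the names struct shmem,
struct doorbell, ap_send() and scp_poll() are hypothetical, and the
"SCP" is simulated in the same program. It only shows the idea that the
payload travels through shared memory while the mailbox acts as a
doorbell.

```c
/* Illustrative only: all names below are hypothetical, not from the patch. */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define SHMEM_SIZE 64

/* Stand-in for the memory region that both processors can access. */
struct shmem {
	uint8_t payload[SHMEM_SIZE];
	size_t len;
};

/* Stand-in for a mailbox channel: it carries no payload, only a signal. */
struct doorbell {
	int pending;
};

/* AP side: place the request in shared memory, then ring the doorbell. */
static int ap_send(struct shmem *mem, struct doorbell *db,
		   const void *req, size_t len)
{
	/* Refuse oversized requests and requests while one is in flight. */
	if (len > SHMEM_SIZE || db->pending)
		return -1;

	memcpy(mem->payload, req, len);
	mem->len = len;
	db->pending = 1;
	return 0;
}

/* SCP side: read the payload only if the doorbell was rung, then clear it. */
static size_t scp_poll(struct shmem *mem, struct doorbell *db,
		       void *out, size_t out_size)
{
	size_t len;

	if (!db->pending)
		return 0;

	len = mem->len < out_size ? mem->len : out_size;
	memcpy(out, mem->payload, len);
	db->pending = 0;
	return len;
}
```

In this sketch a caller would invoke ap_send() with the request bytes
and the simulated SCP would pick them up with scp_poll(). A second
ap_send() before the doorbell is cleared is rejected, which loosely
mirrors a single request being in flight on a channel at a time.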

This code is taken entirely from Linux (v4.14-rc6).
Please see:
http://elixir.free-electrons.com/linux/v4.14-rc6/source/drivers/mailbox/mailbox.c
http://elixir.free-electrons.com/linux/v4.14-rc6/source/drivers/mailbox/mailbox.h
http://elixir.free-electrons.com/linux/v4.14-rc6/source/include/linux/mailbox_client.h
http://elixir.free-electrons.com/linux/v4.14-rc6/source/include/linux/mailbox_controller.h

Where the common mailbox code should live is still an open question.

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Julien Grall <julien.grall@linaro.org>
---
 xen/arch/arm/cpufreq/mailbox.c            | 517 ++++++++++++++++++++++++++++++
 xen/arch/arm/cpufreq/mailbox.h            |  14 +
 xen/arch/arm/cpufreq/mailbox_client.h     |  51 +++
 xen/arch/arm/cpufreq/mailbox_controller.h | 134 ++++++++
 4 files changed, 716 insertions(+)
 create mode 100644 xen/arch/arm/cpufreq/mailbox.c
 create mode 100644 xen/arch/arm/cpufreq/mailbox.h
 create mode 100644 xen/arch/arm/cpufreq/mailbox_client.h
 create mode 100644 xen/arch/arm/cpufreq/mailbox_controller.h

diff --git a/xen/arch/arm/cpufreq/mailbox.c b/xen/arch/arm/cpufreq/mailbox.c
new file mode 100644
index 0000000..537f4f6
--- /dev/null
+++ b/xen/arch/arm/cpufreq/mailbox.c
@@ -0,0 +1,517 @@
+/*
+ * Mailbox: Common code for Mailbox controllers and users
+ *
+ * Copyright (C) 2013-2014 Linaro Ltd.
+ * Author: Jassi Brar <jassisinghbrar@gmail.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/interrupt.h>
+#include <linux/spinlock.h>
+#include <linux/mutex.h>
+#include <linux/delay.h>
+#include <linux/slab.h>
+#include <linux/err.h>
+#include <linux/module.h>
+#include <linux/device.h>
+#include <linux/bitops.h>
+#include <linux/mailbox_client.h>
+#include <linux/mailbox_controller.h>
+
+#include "mailbox.h"
+
+static LIST_HEAD(mbox_cons);
+static DEFINE_MUTEX(con_mutex);
+
+static int add_to_rbuf(struct mbox_chan *chan, void *mssg)
+{
+	int idx;
+	unsigned long flags;
+
+	spin_lock_irqsave(&chan->lock, flags);
+
+	/* See if there is any space left */
+	if (chan->msg_count == MBOX_TX_QUEUE_LEN) {
+		spin_unlock_irqrestore(&chan->lock, flags);
+		return -ENOBUFS;
+	}
+
+	idx = chan->msg_free;
+	chan->msg_data[idx] = mssg;
+	chan->msg_count++;
+
+	if (idx == MBOX_TX_QUEUE_LEN - 1)
+		chan->msg_free = 0;
+	else
+		chan->msg_free++;
+
+	spin_unlock_irqrestore(&chan->lock, flags);
+
+	return idx;
+}
+
+static void msg_submit(struct mbox_chan *chan)
+{
+	unsigned count, idx;
+	unsigned long flags;
+	void *data;
+	int err = -EBUSY;
+
+	spin_lock_irqsave(&chan->lock, flags);
+
+	if (!chan->msg_count || chan->active_req)
+		goto exit;
+
+	count = chan->msg_count;
+	idx = chan->msg_free;
+	if (idx >= count)
+		idx -= count;
+	else
+		idx += MBOX_TX_QUEUE_LEN - count;
+
+	data = chan->msg_data[idx];
+
+	if (chan->cl->tx_prepare)
+		chan->cl->tx_prepare(chan->cl, data);
+	/* Try to submit a message to the MBOX controller */
+	err = chan->mbox->ops->send_data(chan, data);
+	if (!err) {
+		chan->active_req = data;
+		chan->msg_count--;
+	}
+exit:
+	spin_unlock_irqrestore(&chan->lock, flags);
+
+	if (!err && (chan->txdone_method & TXDONE_BY_POLL))
+		/* kick start the timer immediately to avoid delays */
+		hrtimer_start(&chan->mbox->poll_hrt, 0, HRTIMER_MODE_REL);
+}
+
+static void tx_tick(struct mbox_chan *chan, int r)
+{
+	unsigned long flags;
+	void *mssg;
+
+	spin_lock_irqsave(&chan->lock, flags);
+	mssg = chan->active_req;
+	chan->active_req = NULL;
+	spin_unlock_irqrestore(&chan->lock, flags);
+
+	/* Submit next message */
+	msg_submit(chan);
+
+	if (!mssg)
+		return;
+
+	/* Notify the client */
+	if (chan->cl->tx_done)
+		chan->cl->tx_done(chan->cl, mssg, r);
+
+	if (r != -ETIME && chan->cl->tx_block)
+		complete(&chan->tx_complete);
+}
+
+static enum hrtimer_restart txdone_hrtimer(struct hrtimer *hrtimer)
+{
+	struct mbox_controller *mbox =
+		container_of(hrtimer, struct mbox_controller, poll_hrt);
+	bool txdone, resched = false;
+	int i;
+
+	for (i = 0; i < mbox->num_chans; i++) {
+		struct mbox_chan *chan = &mbox->chans[i];
+
+		if (chan->active_req && chan->cl) {
+			txdone = chan->mbox->ops->last_tx_done(chan);
+			if (txdone)
+				tx_tick(chan, 0);
+			else
+				resched = true;
+		}
+	}
+
+	if (resched) {
+		hrtimer_forward_now(hrtimer, ms_to_ktime(mbox->txpoll_period));
+		return HRTIMER_RESTART;
+	}
+	return HRTIMER_NORESTART;
+}
+
+/**
+ * mbox_chan_received_data - A way for controller driver to push data
+ *				received from remote to the upper layer.
+ * @chan: Pointer to the mailbox channel on which RX happened.
+ * @mssg: Client specific message typecasted as void *
+ *
+ * After startup and before shutdown any data received on the chan
+ * is passed on to the API via atomic mbox_chan_received_data().
+ * The controller should ACK the RX only after this call returns.
+ */
+void mbox_chan_received_data(struct mbox_chan *chan, void *mssg)
+{
+	/* No buffering the received data */
+	if (chan->cl->rx_callback)
+		chan->cl->rx_callback(chan->cl, mssg);
+}
+EXPORT_SYMBOL_GPL(mbox_chan_received_data);
+
+/**
+ * mbox_chan_txdone - A way for controller driver to notify the
+ *			framework that the last TX has completed.
+ * @chan: Pointer to the mailbox chan on which TX happened.
+ * @r: Status of last TX - OK or ERROR
+ *
+ * The controller that has IRQ for TX ACK calls this atomic API
+ * to tick the TX state machine. It works only if txdone_irq
+ * is set by the controller.
+ */
+void mbox_chan_txdone(struct mbox_chan *chan, int r)
+{
+	if (unlikely(!(chan->txdone_method & TXDONE_BY_IRQ))) {
+		dev_err(chan->mbox->dev,
+		       "Controller can't run the TX ticker\n");
+		return;
+	}
+
+	tx_tick(chan, r);
+}
+EXPORT_SYMBOL_GPL(mbox_chan_txdone);
+
+/**
+ * mbox_client_txdone - The way for a client to run the TX state machine.
+ * @chan: Mailbox channel assigned to this client.
+ * @r: Success status of last transmission.
+ *
+ * The client/protocol had received some 'ACK' packet and it notifies
+ * the API that the last packet was sent successfully. This only works
+ * if the controller can't sense TX-Done.
+ */
+void mbox_client_txdone(struct mbox_chan *chan, int r)
+{
+	if (unlikely(!(chan->txdone_method & TXDONE_BY_ACK))) {
+		dev_err(chan->mbox->dev, "Client can't run the TX ticker\n");
+		return;
+	}
+
+	tx_tick(chan, r);
+}
+EXPORT_SYMBOL_GPL(mbox_client_txdone);
+
+/**
+ * mbox_client_peek_data - A way for client driver to pull data
+ *			received from remote by the controller.
+ * @chan: Mailbox channel assigned to this client.
+ *
+ * A poke to controller driver for any received data.
+ * The data is actually passed onto client via the
+ * mbox_chan_received_data()
+ * The call can be made from atomic context, so the controller's
+ * implementation of peek_data() must not sleep.
+ *
+ * Return: True, if controller has, and is going to push after this,
+ *          some data.
+ *         False, if controller doesn't have any data to be read.
+ */
+bool mbox_client_peek_data(struct mbox_chan *chan)
+{
+	if (chan->mbox->ops->peek_data)
+		return chan->mbox->ops->peek_data(chan);
+
+	return false;
+}
+EXPORT_SYMBOL_GPL(mbox_client_peek_data);
+
+/**
+ * mbox_send_message -	For client to submit a message to be
+ *				sent to the remote.
+ * @chan: Mailbox channel assigned to this client.
+ * @mssg: Client specific message typecasted.
+ *
+ * For client to submit data to the controller destined for a remote
+ * processor. If the client had set 'tx_block', the call will return
+ * either when the remote receives the data or when 'tx_tout' millisecs
+ * run out.
+ *  In non-blocking mode, the requests are buffered by the API and a
+ * non-negative token is returned for each queued request. If the request
+ * is not queued, a negative token is returned. Upon failure or successful
+ * TX, the API calls 'tx_done' from atomic context, from which the client
+ * could submit yet another request.
+ * The pointer to the message should be preserved until it is sent
+ * over the chan, i.e., until tx_done() is called.
+ * This function could be called from atomic context as it simply
+ * queues the data and returns a token against the request.
+ *
+ * Return: Non-negative integer for successful submission (non-blocking mode)
+ *	or transmission over chan (blocking mode).
+ *	Negative value denotes failure.
+ */
+int mbox_send_message(struct mbox_chan *chan, void *mssg)
+{
+	int t;
+
+	if (!chan || !chan->cl)
+		return -EINVAL;
+
+	t = add_to_rbuf(chan, mssg);
+	if (t < 0) {
+		dev_err(chan->mbox->dev, "Try increasing MBOX_TX_QUEUE_LEN\n");
+		return t;
+	}
+
+	msg_submit(chan);
+
+	if (chan->cl->tx_block) {
+		unsigned long wait;
+		int ret;
+
+		if (!chan->cl->tx_tout) /* wait forever */
+			wait = msecs_to_jiffies(3600000);
+		else
+			wait = msecs_to_jiffies(chan->cl->tx_tout);
+
+		ret = wait_for_completion_timeout(&chan->tx_complete, wait);
+		if (ret == 0) {
+			t = -ETIME;
+			tx_tick(chan, t);
+		}
+	}
+
+	return t;
+}
+EXPORT_SYMBOL_GPL(mbox_send_message);
+
+/**
+ * mbox_request_channel - Request a mailbox channel.
+ * @cl: Identity of the client requesting the channel.
+ * @index: Index of mailbox specifier in 'mboxes' property.
+ *
+ * The Client specifies its requirements and capabilities while asking for
+ * a mailbox channel. It can't be called from atomic context.
+ * The channel is exclusively allocated and can't be used by another
+ * client before the owner calls mbox_free_channel.
+ * After assignment, any packet received on this channel will be
+ * handed over to the client via the 'rx_callback'.
+ * The framework holds reference to the client, so the mbox_client
+ * structure shouldn't be modified until the mbox_free_channel returns.
+ *
+ * Return: Pointer to the channel assigned to the client if successful.
+ *		ERR_PTR for request failure.
+ */
+struct mbox_chan *mbox_request_channel(struct mbox_client *cl, int index)
+{
+	struct device *dev = cl->dev;
+	struct mbox_controller *mbox;
+	struct of_phandle_args spec;
+	struct mbox_chan *chan;
+	unsigned long flags;
+	int ret;
+
+	if (!dev || !dev->of_node) {
+		pr_debug("%s: No owner device node\n", __func__);
+		return ERR_PTR(-ENODEV);
+	}
+
+	mutex_lock(&con_mutex);
+
+	if (of_parse_phandle_with_args(dev->of_node, "mboxes",
+				       "#mbox-cells", index, &spec)) {
+		dev_dbg(dev, "%s: can't parse \"mboxes\" property\n", __func__);
+		mutex_unlock(&con_mutex);
+		return ERR_PTR(-ENODEV);
+	}
+
+	chan = ERR_PTR(-EPROBE_DEFER);
+	list_for_each_entry(mbox, &mbox_cons, node)
+		if (mbox->dev->of_node == spec.np) {
+			chan = mbox->of_xlate(mbox, &spec);
+			break;
+		}
+
+	of_node_put(spec.np);
+
+	if (IS_ERR(chan)) {
+		mutex_unlock(&con_mutex);
+		return chan;
+	}
+
+	if (chan->cl || !try_module_get(mbox->dev->driver->owner)) {
+		dev_dbg(dev, "%s: mailbox not free\n", __func__);
+		mutex_unlock(&con_mutex);
+		return ERR_PTR(-EBUSY);
+	}
+
+	spin_lock_irqsave(&chan->lock, flags);
+	chan->msg_free = 0;
+	chan->msg_count = 0;
+	chan->active_req = NULL;
+	chan->cl = cl;
+	init_completion(&chan->tx_complete);
+
+	if (chan->txdone_method	== TXDONE_BY_POLL && cl->knows_txdone)
+		chan->txdone_method |= TXDONE_BY_ACK;
+
+	spin_unlock_irqrestore(&chan->lock, flags);
+
+	if (chan->mbox->ops->startup) {
+		ret = chan->mbox->ops->startup(chan);
+
+		if (ret) {
+			dev_err(dev, "Unable to startup the chan (%d)\n", ret);
+			mbox_free_channel(chan);
+			chan = ERR_PTR(ret);
+		}
+	}
+
+	mutex_unlock(&con_mutex);
+	return chan;
+}
+EXPORT_SYMBOL_GPL(mbox_request_channel);
+
+struct mbox_chan *mbox_request_channel_byname(struct mbox_client *cl,
+					      const char *name)
+{
+	struct device_node *np = cl->dev->of_node;
+	struct property *prop;
+	const char *mbox_name;
+	int index = 0;
+
+	if (!np) {
+		dev_err(cl->dev, "%s() currently only supports DT\n", __func__);
+		return ERR_PTR(-EINVAL);
+	}
+
+	if (!of_get_property(np, "mbox-names", NULL)) {
+		dev_err(cl->dev,
+			"%s() requires an \"mbox-names\" property\n", __func__);
+		return ERR_PTR(-EINVAL);
+	}
+
+	of_property_for_each_string(np, "mbox-names", prop, mbox_name) {
+		if (!strncmp(name, mbox_name, strlen(name)))
+			break;
+		index++;
+	}
+
+	return mbox_request_channel(cl, index);
+}
+EXPORT_SYMBOL_GPL(mbox_request_channel_byname);
+
+/**
+ * mbox_free_channel - The client relinquishes control of a mailbox
+ *			channel by this call.
+ * @chan: The mailbox channel to be freed.
+ */
+void mbox_free_channel(struct mbox_chan *chan)
+{
+	unsigned long flags;
+
+	if (!chan || !chan->cl)
+		return;
+
+	if (chan->mbox->ops->shutdown)
+		chan->mbox->ops->shutdown(chan);
+
+	/* The queued TX requests are simply aborted, no callbacks are made */
+	spin_lock_irqsave(&chan->lock, flags);
+	chan->cl = NULL;
+	chan->active_req = NULL;
+	if (chan->txdone_method == (TXDONE_BY_POLL | TXDONE_BY_ACK))
+		chan->txdone_method = TXDONE_BY_POLL;
+
+	module_put(chan->mbox->dev->driver->owner);
+	spin_unlock_irqrestore(&chan->lock, flags);
+}
+EXPORT_SYMBOL_GPL(mbox_free_channel);
+
+static struct mbox_chan *
+of_mbox_index_xlate(struct mbox_controller *mbox,
+		    const struct of_phandle_args *sp)
+{
+	int ind = sp->args[0];
+
+	if (ind >= mbox->num_chans)
+		return ERR_PTR(-EINVAL);
+
+	return &mbox->chans[ind];
+}
+
+/**
+ * mbox_controller_register - Register the mailbox controller
+ * @mbox:	Pointer to the mailbox controller.
+ *
+ * The controller driver registers its communication channels
+ */
+int mbox_controller_register(struct mbox_controller *mbox)
+{
+	int i, txdone;
+
+	/* Sanity check */
+	if (!mbox || !mbox->dev || !mbox->ops || !mbox->num_chans)
+		return -EINVAL;
+
+	if (mbox->txdone_irq)
+		txdone = TXDONE_BY_IRQ;
+	else if (mbox->txdone_poll)
+		txdone = TXDONE_BY_POLL;
+	else /* It has to be ACK then */
+		txdone = TXDONE_BY_ACK;
+
+	if (txdone == TXDONE_BY_POLL) {
+
+		if (!mbox->ops->last_tx_done) {
+			dev_err(mbox->dev, "last_tx_done method is absent\n");
+			return -EINVAL;
+		}
+
+		hrtimer_init(&mbox->poll_hrt, CLOCK_MONOTONIC,
+			     HRTIMER_MODE_REL);
+		mbox->poll_hrt.function = txdone_hrtimer;
+	}
+
+	for (i = 0; i < mbox->num_chans; i++) {
+		struct mbox_chan *chan = &mbox->chans[i];
+
+		chan->cl = NULL;
+		chan->mbox = mbox;
+		chan->txdone_method = txdone;
+		spin_lock_init(&chan->lock);
+	}
+
+	if (!mbox->of_xlate)
+		mbox->of_xlate = of_mbox_index_xlate;
+
+	mutex_lock(&con_mutex);
+	list_add_tail(&mbox->node, &mbox_cons);
+	mutex_unlock(&con_mutex);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(mbox_controller_register);
+
+/**
+ * mbox_controller_unregister - Unregister the mailbox controller
+ * @mbox:	Pointer to the mailbox controller.
+ */
+void mbox_controller_unregister(struct mbox_controller *mbox)
+{
+	int i;
+
+	if (!mbox)
+		return;
+
+	mutex_lock(&con_mutex);
+
+	list_del(&mbox->node);
+
+	for (i = 0; i < mbox->num_chans; i++)
+		mbox_free_channel(&mbox->chans[i]);
+
+	if (mbox->txdone_poll)
+		hrtimer_cancel(&mbox->poll_hrt);
+
+	mutex_unlock(&con_mutex);
+}
+EXPORT_SYMBOL_GPL(mbox_controller_unregister);
diff --git a/xen/arch/arm/cpufreq/mailbox.h b/xen/arch/arm/cpufreq/mailbox.h
new file mode 100644
index 0000000..456ba68
--- /dev/null
+++ b/xen/arch/arm/cpufreq/mailbox.h
@@ -0,0 +1,14 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef __MAILBOX_H
+#define __MAILBOX_H
+
+#define TXDONE_BY_IRQ	BIT(0) /* controller has remote RTR irq */
+#define TXDONE_BY_POLL	BIT(1) /* controller can read status of last TX */
+#define TXDONE_BY_ACK	BIT(2) /* S/W ACK received by Client ticks the TX */
+
+#endif /* __MAILBOX_H */
diff --git a/xen/arch/arm/cpufreq/mailbox_client.h b/xen/arch/arm/cpufreq/mailbox_client.h
new file mode 100644
index 0000000..4434871
--- /dev/null
+++ b/xen/arch/arm/cpufreq/mailbox_client.h
@@ -0,0 +1,51 @@
+/*
+ * Copyright (C) 2013-2014 Linaro Ltd.
+ * Author: Jassi Brar <jassisinghbrar@gmail.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef __MAILBOX_CLIENT_H
+#define __MAILBOX_CLIENT_H
+
+#include <linux/of.h>
+#include <linux/device.h>
+
+struct mbox_chan;
+
+/**
+ * struct mbox_client - User of a mailbox
+ * @dev:		The client device
+ * @tx_block:		If the mbox_send_message should block until data is
+ *			transmitted.
+ * @tx_tout:		Max block period in ms before TX is assumed failure
+ * @knows_txdone:	If the client could run the TX state machine. Usually
+ *			if the client receives some ACK packet for transmission.
+ *			Unused if the controller already has TX_Done/RTR IRQ.
+ * @rx_callback:	Atomic callback to provide client the data received
+ * @tx_prepare: 	Atomic callback to ask client to prepare the payload
+ *			before initiating the transmission if required.
+ * @tx_done:		Atomic callback to tell client of data transmission
+ */
+struct mbox_client {
+	struct device *dev;
+	bool tx_block;
+	unsigned long tx_tout;
+	bool knows_txdone;
+
+	void (*rx_callback)(struct mbox_client *cl, void *mssg);
+	void (*tx_prepare)(struct mbox_client *cl, void *mssg);
+	void (*tx_done)(struct mbox_client *cl, void *mssg, int r);
+};
+
+struct mbox_chan *mbox_request_channel_byname(struct mbox_client *cl,
+					      const char *name);
+struct mbox_chan *mbox_request_channel(struct mbox_client *cl, int index);
+int mbox_send_message(struct mbox_chan *chan, void *mssg);
+void mbox_client_txdone(struct mbox_chan *chan, int r); /* atomic */
+bool mbox_client_peek_data(struct mbox_chan *chan); /* atomic */
+void mbox_free_channel(struct mbox_chan *chan); /* may sleep */
+
+#endif /* __MAILBOX_CLIENT_H */
diff --git a/xen/arch/arm/cpufreq/mailbox_controller.h b/xen/arch/arm/cpufreq/mailbox_controller.h
new file mode 100644
index 0000000..74deadb
--- /dev/null
+++ b/xen/arch/arm/cpufreq/mailbox_controller.h
@@ -0,0 +1,134 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef __MAILBOX_CONTROLLER_H
+#define __MAILBOX_CONTROLLER_H
+
+#include <linux/of.h>
+#include <linux/types.h>
+#include <linux/hrtimer.h>
+#include <linux/device.h>
+#include <linux/completion.h>
+
+struct mbox_chan;
+
+/**
+ * struct mbox_chan_ops - methods to control mailbox channels
+ * @send_data:	The API asks the MBOX controller driver, in atomic
+ *		context try to transmit a message on the bus. Returns 0 if
+ *		data is accepted for transmission, -EBUSY while rejecting
+ *		if the remote hasn't yet read the last data sent. Actual
+ *		transmission of data is reported by the controller via
+ *		mbox_chan_txdone (if it has some TX ACK irq). It must not
+ *		sleep.
+ * @startup:	Called when a client requests the chan. The controller
+ *		could ask clients for additional parameters of communication
+ *		to be provided via client's chan_data. This call may
+ *		block. After this call the Controller must forward any
+ *		data received on the chan by calling mbox_chan_received_data.
+ *		The controller may do stuff that need to sleep.
+ * @shutdown:	Called when a client relinquishes control of a chan.
+ *		This call may block too. The controller must not forward
+ *		any received data anymore.
+ *		The controller may do stuff that need to sleep.
+ * @last_tx_done: If the controller sets 'txdone_poll', the API calls
+ *		  this to poll status of last TX. The controller must
+ *		  give priority to IRQ method over polling and never
+ *		  set both txdone_poll and txdone_irq. Only in polling
+ *		  mode 'send_data' is expected to return -EBUSY.
+ *		  The controller may do stuff that need to sleep/block.
+ *		  Used only if txdone_poll:=true && txdone_irq:=false
+ * @peek_data: Atomic check for any received data. Return true if controller
+ *		  has some data to push to the client. False otherwise.
+ */
+struct mbox_chan_ops {
+	int (*send_data)(struct mbox_chan *chan, void *data);
+	int (*startup)(struct mbox_chan *chan);
+	void (*shutdown)(struct mbox_chan *chan);
+	bool (*last_tx_done)(struct mbox_chan *chan);
+	bool (*peek_data)(struct mbox_chan *chan);
+};
+
+/**
+ * struct mbox_controller - Controller of a class of communication channels
+ * @dev:		Device backing this controller
+ * @ops:		Operators that work on each communication chan
+ * @chans:		Array of channels
+ * @num_chans:		Number of channels in the 'chans' array.
+ * @txdone_irq:		Indicates if the controller can report to API when
+ *			the last transmitted data was read by the remote.
+ *			Eg, if it has some TX ACK irq.
+ * @txdone_poll:	If the controller can read but not report the TX
+ *			done. Ex, some register shows the TX status but
+ *			no interrupt rises. Ignored if 'txdone_irq' is set.
+ * @txpoll_period:	If 'txdone_poll' is in effect, the API polls for
+ *			last TX's status after these many millisecs
+ * @of_xlate:		Controller driver specific mapping of channel via DT
+ * @poll_hrt:		API private. hrtimer used to poll for TXDONE on all
+ *			channels.
+ * @node:		API private. To hook into list of controllers.
+ */
+struct mbox_controller {
+	struct device *dev;
+	const struct mbox_chan_ops *ops;
+	struct mbox_chan *chans;
+	int num_chans;
+	bool txdone_irq;
+	bool txdone_poll;
+	unsigned txpoll_period;
+	struct mbox_chan *(*of_xlate)(struct mbox_controller *mbox,
+				      const struct of_phandle_args *sp);
+	/* Internal to API */
+	struct hrtimer poll_hrt;
+	struct list_head node;
+};
+
+/*
+ * The length of circular buffer for queuing messages from a client.
+ * 'msg_count' tracks the number of buffered messages while 'msg_free'
+ * is the index where the next message would be buffered.
+ * We shouldn't need it too big because every transfer is interrupt
+ * triggered and if we have lots of data to transfer, the interrupt
+ * latencies are going to be the bottleneck, not the buffer length.
+ * Besides, mbox_send_message could be called from atomic context and
+ * the client could also queue another message from the notifier 'tx_done'
+ * of the last transfer done.
+ * REVISIT: If too many platforms see the "Try increasing MBOX_TX_QUEUE_LEN"
+ * print, it needs to be taken from config option or somesuch.
+ */
+#define MBOX_TX_QUEUE_LEN	20
+
+/**
+ * struct mbox_chan - s/w representation of a communication chan
+ * @mbox:		Pointer to the parent/provider of this channel
+ * @txdone_method:	Way to detect TXDone chosen by the API
+ * @cl:			Pointer to the current owner of this channel
+ * @tx_complete:	Transmission completion
+ * @active_req:		Currently active request hook
+ * @msg_count:		No. of mssg currently queued
+ * @msg_free:		Index of next available mssg slot
+ * @msg_data:		Hook for data packet
+ * @lock:		Serialise access to the channel
+ * @con_priv:		Hook for controller driver to attach private data
+ */
+struct mbox_chan {
+	struct mbox_controller *mbox;
+	unsigned txdone_method;
+	struct mbox_client *cl;
+	struct completion tx_complete;
+	void *active_req;
+	unsigned msg_count, msg_free;
+	void *msg_data[MBOX_TX_QUEUE_LEN];
+	spinlock_t lock; /* Serialise access to the channel */
+	void *con_priv;
+};
+
+int mbox_controller_register(struct mbox_controller *mbox); /* can sleep */
+void mbox_controller_unregister(struct mbox_controller *mbox); /* can sleep */
+void mbox_chan_received_data(struct mbox_chan *chan, void *data); /* atomic */
+void mbox_chan_txdone(struct mbox_chan *chan, int r); /* atomic */
+
+#endif /* __MAILBOX_CONTROLLER_H */
-- 
2.7.4
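
For reference, the TX-done method selection done by
mbox_controller_register() in the patch above can be sketched in
isolation. This is only an illustration: struct txdone_caps and
pick_txdone_method() are hypothetical names that do not exist in the
patch, and BIT() is redefined locally so the snippet builds on its own.

```c
#include <stdbool.h>

#define BIT(n)          (1U << (n))
#define TXDONE_BY_IRQ   BIT(0) /* controller has remote RTR irq */
#define TXDONE_BY_POLL  BIT(1) /* controller can read status of last TX */
#define TXDONE_BY_ACK   BIT(2) /* S/W ACK received by client ticks the TX */

/* Hypothetical subset of struct mbox_controller, for illustration only. */
struct txdone_caps {
    bool txdone_irq;
    bool txdone_poll;
};

/*
 * Same priority order as in mbox_controller_register():
 * IRQ wins over polling, and ACK is the fallback when neither is set.
 */
static unsigned int pick_txdone_method(const struct txdone_caps *caps)
{
    if (caps->txdone_irq)
        return TXDONE_BY_IRQ;
    if (caps->txdone_poll)
        return TXDONE_BY_POLL;
    return TXDONE_BY_ACK;
}
```

Note that a controller which sets both flags ends up in IRQ mode, which
matches the "Ignored if 'txdone_irq' is set" remark on txdone_poll.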


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

* [RFC PATCH 19/31] xen/arm: Introduce ARM SMC based mailbox
  2017-11-09 17:09 [RFC PATCH 00/31] CPUFreq on ARM Oleksandr Tyshchenko
                   ` (17 preceding siblings ...)
  2017-11-09 17:10 ` [RFC PATCH 18/31] xen/arm: Add mailbox infrastructure Oleksandr Tyshchenko
@ 2017-11-09 17:10 ` Oleksandr Tyshchenko
  2017-11-09 17:10 ` [RFC PATCH 20/31] xen/arm: Add common header file wrappers.h Oleksandr Tyshchenko
                   ` (14 subsequent siblings)
  33 siblings, 0 replies; 108+ messages in thread
From: Oleksandr Tyshchenko @ 2017-11-09 17:10 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksandr Tyshchenko, Stefano Stabellini, Andre Przywara, Julien Grall

From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

This code is completely borrowed from the patch series for Linux
which hasn't been upstreamed yet:
[PATCH v2 0/3] mailbox: arm: introduce smc triggered mailbox
https://lkml.org/lkml/2017/7/23/129

I am very excited about the idea described in the link above.
This solution lets those of us who work with ARM based SoCs that have
Security Extensions enabled, but no standalone co-processor available
to control PM, have SCPI based CPUFreq in Xen on ARM at minimal cost.

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Julien Grall <julien.grall@linaro.org>
CC: Andre Przywara <andre.przywara@linaro.org>
---
 xen/arch/arm/cpufreq/arm-smc-mailbox.c | 155 +++++++++++++++++++++++++++++++++
 1 file changed, 155 insertions(+)
 create mode 100644 xen/arch/arm/cpufreq/arm-smc-mailbox.c
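
The synchronous behaviour of this mailbox can be sketched outside of
the hypervisor as follows. This is a minimal sketch under stated
assumptions, not the driver itself: the firmware call is replaced by a
plain function pointer (fw_call_fn), and struct sync_chan, demo_fw_call(),
demo_rx() and the function ID used in the note below are hypothetical.
The point is that the reply is already available when the call returns,
so it can be handed to the client right away and "last TX done" is
always true.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the SMC/HVC conduit; real code traps to firmware. */
typedef uint32_t (*fw_call_fn)(uint32_t function_id, uint32_t msg);

struct sync_chan {
    uint32_t function_id;
    fw_call_fn call;
    void (*rx_callback)(struct sync_chan *chan, uint32_t reply);
    uint32_t last_reply;
    unsigned int rx_count;
};

/* Send one message; the reply is delivered before this function returns. */
static int sync_send_data(struct sync_chan *chan, const uint32_t *msg)
{
    uint32_t reply = chan->call(chan->function_id, *msg);

    if (chan->rx_callback)
        chan->rx_callback(chan, reply);

    return 0;
}

/* The call is synchronous, so the previous TX has always completed. */
static bool sync_last_tx_done(const struct sync_chan *chan)
{
    (void)chan;
    return true;
}

/* Demo "firmware" which answers with msg + 1. */
static uint32_t demo_fw_call(uint32_t function_id, uint32_t msg)
{
    (void)function_id;
    return msg + 1;
}

/* Demo client callback which records the reply. */
static void demo_rx(struct sync_chan *chan, uint32_t reply)
{
    chan->last_reply = reply;
    chan->rx_count++;
}
```

A caller would fill in a struct sync_chan with a function ID (say
0x82000001, chosen arbitrarily here), demo_fw_call and demo_rx, and then
call sync_send_data(); the reply is stored in last_reply by the time
the call returns.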

diff --git a/xen/arch/arm/cpufreq/arm-smc-mailbox.c b/xen/arch/arm/cpufreq/arm-smc-mailbox.c
new file mode 100644
index 0000000..d7b61a7
--- /dev/null
+++ b/xen/arch/arm/cpufreq/arm-smc-mailbox.c
@@ -0,0 +1,155 @@
+/*
+ *  Copyright (C) 2016,2017 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This device provides a mechanism for emulating a mailbox by using
+ * smc calls, allowing a "mailbox" consumer to sit in firmware running
+ * on the same core.
+ */
+
+#include <linux/device.h>
+#include <linux/kernel.h>
+#include <linux/mailbox_controller.h>
+#include <linux/module.h>
+#include <linux/platform_device.h>
+#include <linux/arm-smccc.h>
+
+#define ARM_SMC_MBOX_USE_HVC	BIT(0)
+
+struct arm_smc_chan_data {
+	u32 function_id;
+	u32 flags;
+};
+
+static int arm_smc_send_data(struct mbox_chan *link, void *data)
+{
+	struct arm_smc_chan_data *chan_data = link->con_priv;
+	u32 function_id = chan_data->function_id;
+	struct arm_smccc_res res;
+	u32 msg = *(u32 *)data;
+
+	if (chan_data->flags & ARM_SMC_MBOX_USE_HVC)
+		arm_smccc_hvc(function_id, msg, 0, 0, 0, 0, 0, 0, &res);
+	else
+		arm_smccc_smc(function_id, msg, 0, 0, 0, 0, 0, 0, &res);
+
+	mbox_chan_received_data(link, (void *)res.a0);
+
+	return 0;
+}
+
+/* This mailbox is synchronous, so we are always done. */
+static bool arm_smc_last_tx_done(struct mbox_chan *link)
+{
+	return true;
+}
+
+static const struct mbox_chan_ops arm_smc_mbox_chan_ops = {
+	.send_data	= arm_smc_send_data,
+	.last_tx_done	= arm_smc_last_tx_done
+};
+
+static int arm_smc_mbox_probe(struct platform_device *pdev)
+{
+	struct device *dev = &pdev->dev;
+	struct mbox_controller *mbox;
+	struct arm_smc_chan_data *chan_data;
+	const char *method;
+	bool use_hvc = false;
+	int ret, i;
+
+	ret = of_property_count_elems_of_size(dev->of_node, "arm,func-ids",
+					      sizeof(u32));
+	if (ret < 0)
+		return ret;
+
+	if (!of_property_read_string(dev->of_node, "method", &method)) {
+		if (!strcmp("hvc", method)) {
+			use_hvc = true;
+		} else if (!strcmp("smc", method)) {
+			use_hvc = false;
+		} else {
+			dev_warn(dev, "invalid \"method\" property: %s\n",
+				 method);
+
+			return -EINVAL;
+		}
+	}
+
+	mbox = devm_kzalloc(dev, sizeof(*mbox), GFP_KERNEL);
+	if (!mbox)
+		return -ENOMEM;
+
+	mbox->num_chans = ret;
+	mbox->chans = devm_kcalloc(dev, mbox->num_chans, sizeof(*mbox->chans),
+				   GFP_KERNEL);
+	if (!mbox->chans)
+		return -ENOMEM;
+
+	chan_data = devm_kcalloc(dev, mbox->num_chans, sizeof(*chan_data),
+				 GFP_KERNEL);
+	if (!chan_data)
+		return -ENOMEM;
+
+	for (i = 0; i < mbox->num_chans; i++) {
+		u32 function_id;
+
+		ret = of_property_read_u32_index(dev->of_node,
+						 "arm,func-ids", i,
+						 &function_id);
+		if (ret)
+			return ret;
+
+		chan_data[i].function_id = function_id;
+		if (use_hvc)
+			chan_data[i].flags |= ARM_SMC_MBOX_USE_HVC;
+		mbox->chans[i].con_priv = &chan_data[i];
+	}
+
+	mbox->txdone_poll = true;
+	mbox->txdone_irq = false;
+	mbox->txpoll_period = 1;
+	mbox->ops = &arm_smc_mbox_chan_ops;
+	mbox->dev = dev;
+
+	ret = mbox_controller_register(mbox);
+	if (ret)
+		return ret;
+
+	platform_set_drvdata(pdev, mbox);
+	dev_info(dev, "ARM SMC mailbox enabled with %d chan%s.\n",
+		 mbox->num_chans, mbox->num_chans == 1 ? "" : "s");
+
+	return ret;
+}
+
+static int arm_smc_mbox_remove(struct platform_device *pdev)
+{
+	struct mbox_controller *mbox = platform_get_drvdata(pdev);
+
+	mbox_controller_unregister(mbox);
+	return 0;
+}
+
+static const struct of_device_id arm_smc_mbox_of_match[] = {
+	{ .compatible = "arm,smc-mbox", },
+	{},
+};
+MODULE_DEVICE_TABLE(of, arm_smc_mbox_of_match);
+
+static struct platform_driver arm_smc_mbox_driver = {
+	.driver = {
+		.name = "arm-smc-mbox",
+		.of_match_table = arm_smc_mbox_of_match,
+	},
+	.probe		= arm_smc_mbox_probe,
+	.remove		= arm_smc_mbox_remove,
+};
+module_platform_driver(arm_smc_mbox_driver);
+
+MODULE_AUTHOR("Andre Przywara <andre.przywara@arm.com>");
+MODULE_DESCRIPTION("Generic ARM smc mailbox driver");
+MODULE_LICENSE("GPL v2");
-- 
2.7.4


* [RFC PATCH 20/31] xen/arm: Add common header file wrappers.h
  2017-11-09 17:09 [RFC PATCH 00/31] CPUFreq on ARM Oleksandr Tyshchenko
                   ` (18 preceding siblings ...)
  2017-11-09 17:10 ` [RFC PATCH 19/31] xen/arm: Introduce ARM SMC based mailbox Oleksandr Tyshchenko
@ 2017-11-09 17:10 ` Oleksandr Tyshchenko
  2017-11-09 17:10 ` [RFC PATCH 21/31] xen/arm: Add rxdone_auto flag to mbox_controller structure Oleksandr Tyshchenko
                   ` (13 subsequent siblings)
  33 siblings, 0 replies; 108+ messages in thread
From: Oleksandr Tyshchenko @ 2017-11-09 17:10 UTC (permalink / raw)
  To: xen-devel; +Cc: Oleksandr Tyshchenko, Stefano Stabellini, Julien Grall

From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

This header file collects the various Linux2Xen wrappers, defines and
stubs which are used by all directly ported CPUFreq components.

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Julien Grall <julien.grall@linaro.org>
---
 xen/arch/arm/cpufreq/wrappers.h | 239 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 239 insertions(+)
 create mode 100644 xen/arch/arm/cpufreq/wrappers.h
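
The dummy completion introduced below boils down to an atomic flag plus
a busy wait with a deadline. A standalone sketch of that idea, assuming
C11 atomics and timespec_get() in place of Xen's atomic_t, NOW() and
process_pending_softirqs() (now_ms() is a hypothetical helper, and the
wall clock stands in for a monotonic one):

```c
#include <stdatomic.h>
#include <time.h>

struct completion {
    atomic_int done;
};

static void init_completion(struct completion *x)
{
    atomic_store(&x->done, 0);
}

static void complete(struct completion *x)
{
    atomic_store(&x->done, 1);
}

/* Hypothetical helper: current time in milliseconds. */
static long long now_ms(void)
{
    struct timespec ts;

    timespec_get(&ts, TIME_UTC);
    return (long long)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/*
 * Spin until the flag is set or the timeout (in ms) expires.
 * The compare-and-exchange consumes the signal, so one complete()
 * satisfies exactly one waiter. Returns 1 on success, 0 on timeout.
 */
static unsigned long wait_for_completion_timeout(struct completion *x,
                                                 unsigned long timeout_ms)
{
    long long deadline = now_ms() + (long long)timeout_ms;

    do {
        int expected = 1;

        if (atomic_compare_exchange_strong(&x->done, &expected, 0))
            return 1;
    } while (now_ms() <= deadline);

    return 0;
}
```

As in the wrapper, a wait that follows an already signaled completion
returns immediately, which is the case the synchronous mailbox relies on.
Anything that must signal from the same CPU while the loop spins is the
problematic case.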

diff --git a/xen/arch/arm/cpufreq/wrappers.h b/xen/arch/arm/cpufreq/wrappers.h
new file mode 100644
index 0000000..284faa4
--- /dev/null
+++ b/xen/arch/arm/cpufreq/wrappers.h
@@ -0,0 +1,239 @@
+/*
+ * xen/arch/arm/cpufreq/wrappers.h
+ *
+ * This header file contains Linux2Xen wrappers, defines and various stubs
+ * which are used by all directly ported CPUFreq components.
+ *
+ * Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
+ * Copyright (c) 2017 EPAM Systems.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef __ARCH_ARM_CPUFREQ_WRAPPERS_H__
+#define __ARCH_ARM_CPUFREQ_WRAPPERS_H__
+
+#include <xen/time.h>
+#include <xen/delay.h>
+#include <xen/softirq.h>
+#include <xen/spinlock.h>
+#include <asm/device.h>
+#include <asm/atomic.h>
+
+/* Xen doesn't have mutex, so use spinlock instead. */
+#define mutex        spinlock
+#define mutex_lock   spin_lock
+#define mutex_unlock spin_unlock
+#define mutex_init   spin_lock_init
+#define DEFINE_MUTEX DEFINE_SPINLOCK
+
+/* Aliases to Xen allocation helpers. */
+#define devm_kmalloc(dev, size, flags)    _xmalloc(size, sizeof(void *))
+#define devm_kzalloc(dev, size, flags)    _xzalloc(size, sizeof(void *))
+#define devm_kcalloc(dev, n, size, flags) _xzalloc_array(size, sizeof(void *), n)
+#define kmalloc(size, flags)              _xmalloc(size, sizeof(void *))
+#define kcalloc(n, size, flags)           _xzalloc_array(size, sizeof(void *), n)
+#define devm_kfree(dev, p)                xfree(p)
+#define kfree                             xfree
+
+/* Aliases to Xen device tree helpers. */
+#define device_node                     dt_device_node
+#define platform_device                 dt_device_node
+#define of_device_id                    dt_device_match
+#define of_match_node                   dt_match_node
+#define of_property_count_elems_of_size dt_property_count_elems_of_size
+#define of_property_read_u32_index      dt_property_read_u32_index
+#define of_property_for_each_string     dt_property_for_each_string
+#define of_parse_phandle_with_args      dt_parse_phandle_with_args
+#define of_count_phandle_with_args      dt_count_phandle_with_args
+#define of_property_read_string         dt_property_read_string
+#define of_parse_phandle                dt_parse_phandle
+#define of_phandle_args                 dt_phandle_args
+#define of_get_property                 dt_get_property
+#define property                        dt_property
+
+static inline const struct of_device_id *of_match_device(
+    const struct of_device_id *matches, const struct device *dev)
+{
+    if ( !matches || !dev->of_node )
+        return NULL;
+
+    return of_match_node(matches, dev->of_node);
+}
+
+/* Stuff to deal with device address ranges. */
+struct resource
+{
+    u64 start;
+    u64 size;
+};
+
+#define resource_size(res) ((res)->size)
+
+static inline int of_address_to_resource(struct device_node *node, int index,
+                                         struct resource *res)
+{
+    return dt_device_get_address(node, index, &res->start, &res->size);
+}
+
+typedef u64 resource_size_t;
+
+#define devm_ioremap(dev, addr, size) ioremap_nocache(addr, size)
+#define devm_iounmap(dev, addr)       iounmap(addr)
+
+/* Device logger functions */
+#define dev_print(dev, lvl, fmt, ...)    \
+    printk(lvl "scpi: %s: " fmt, dt_node_full_name(dev_to_dt(dev)), ## __VA_ARGS__)
+
+#define dev_info(dev, fmt, ...) dev_print(dev, XENLOG_INFO, fmt, ## __VA_ARGS__)
+#define dev_dbg(dev, fmt, ...)  dev_print(dev, XENLOG_DEBUG, fmt, ## __VA_ARGS__)
+#define dev_warn(dev, fmt, ...) dev_print(dev, XENLOG_WARNING, fmt, ## __VA_ARGS__)
+#define dev_err(dev, fmt, ...)  dev_print(dev, XENLOG_ERR, fmt, ## __VA_ARGS__)
+
+#define pr_debug  printk
+#define _dev_info dev_info
+
+/* Helpers to get/set driver specific info. */
+static inline void platform_set_drvdata(struct platform_device *pdev, void *data)
+{
+    pdev->dev.driver_data = data;
+}
+
+static inline void *platform_get_drvdata(const struct platform_device *pdev)
+{
+    return pdev->dev.driver_data;
+}
+
+/*
+ * Xen doesn't have such a synchronization mechanism as Linux's
+ * "wait-for-completion", because of its nature. Create dummy completion
+ * infrastructure to make direct ported code compilable.
+ */
+struct completion {
+    atomic_t done;
+};
+
+static inline void init_completion(struct completion *x)
+{
+    atomic_set(&x->done, 0);
+}
+
+static inline void reinit_completion(struct completion *x)
+{
+    atomic_set(&x->done, 0);
+}
+
+static inline void complete(struct completion *x)
+{
+    atomic_set(&x->done, 1);
+}
+
+/*
+ * Please note that this function is not fully functional in Xen, at least
+ * for now. The "busy loop" based wait logic below won't work in all cases.
+ * For example, it fails when the code spins in the busy loop waiting for a
+ * completion while the SW timer whose interrupt handler is in charge of
+ * signaling that completion runs on the same CPU.
+ * So, avoid using it whenever possible. The only allowed method is to signal
+ * a completion from HW interrupt handlers. Also please note that this code
+ * mustn't be used outside of the cpufreq directory.
+ *
+ * It is worth mentioning that a synchronous mailbox causes no problem
+ * here, since we always enter this function with the completion already
+ * signaled.
+ *
+ * A warning is printed here to notify users every time they call this while
+ * working with an asynchronous mailbox... But it should be reconsidered.
+ */
+static inline unsigned long wait_for_completion_timeout(struct completion *x,
+                                                        unsigned long timeout)
+{
+    s_time_t deadline = NOW() + MILLISECS(timeout);
+    bool warn_once = true;
+
+    do
+    {
+        if ( atomic_cmpxchg(&x->done, 1, 0) )
+            return 1;
+
+        if ( warn_once )
+        {
+            warn_once = false;
+            printk("wait_for_completion isn't properly functional! "
+                   "It might lead to wrong timeout error\n");
+            WARN();
+        }
+
+        cpu_relax();
+        udelay(1);
+        process_pending_softirqs();
+    } while (NOW() <= deadline);
+
+    return 0;
+}
+
+static inline bool completion_done(struct completion *x)
+{
+    if ( !atomic_read(&x->done) )
+        return false;
+
+    return true;
+}
+
+/*
+ * As we only call this function to obtain a timeout for wait_for_completion_timeout
+ * which was modified to expect a timeout in millisecs, just return passed argument.
+ */
+static inline unsigned long msecs_to_jiffies(unsigned long timeout)
+{
+    return timeout;
+}
+
+/* Misc */
+#define MODULE_DEVICE_TABLE(type, name)
+#define EXPORT_SYMBOL_GPL(name)
+
+#define module_put(owner)
+#define try_module_get(owner) 1
+
+#define of_node_put(np)
+
+#define memcpy_fromio memcpy
+#define memcpy_toio   memcpy
+
+#define EMSGSIZE 90        /* Message too long */
+#define EPROBE_DEFER 517   /* Driver requests probe retry */
+
+/* Stubs to make driver compilable */
+static inline int dev_pm_opp_add(struct device *dev, unsigned long freq,
+                                 unsigned long u_volt)
+{
+    return 0;
+}
+
+static inline void dev_pm_opp_remove(struct device *dev, unsigned long freq)
+{
+
+}
+
+#endif /* __ARCH_ARM_CPUFREQ_WRAPPERS_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
2.7.4


* [RFC PATCH 21/31] xen/arm: Add rxdone_auto flag to mbox_controller structure
  2017-11-09 17:09 [RFC PATCH 00/31] CPUFreq on ARM Oleksandr Tyshchenko
                   ` (19 preceding siblings ...)
  2017-11-09 17:10 ` [RFC PATCH 20/31] xen/arm: Add common header file wrappers.h Oleksandr Tyshchenko
@ 2017-11-09 17:10 ` Oleksandr Tyshchenko
  2017-11-09 17:10 ` [RFC PATCH 22/31] xen/arm: Add Xen changes to SCPI protocol Oleksandr Tyshchenko
                   ` (12 subsequent siblings)
  33 siblings, 0 replies; 108+ messages in thread
From: Oleksandr Tyshchenko @ 2017-11-09 17:10 UTC (permalink / raw)
  To: xen-devel; +Cc: Oleksandr Tyshchenko, Stefano Stabellini, Julien Grall

From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

This patch adds a flag which indicates that the mailbox controller doesn't
need to poll for received data. Such a controller either has an RX done irq
that signals when received data is ready, or its received data 'appears'
right after the transmitted data has been sent (synchronous case).

The purpose of this flag is to help the framework recognize controllers
which need timer based polling for received data and then restrict
their registration.

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Julien Grall <julien.grall@linaro.org>
---
 xen/arch/arm/cpufreq/mailbox_controller.h | 5 +++++
 1 file changed, 5 insertions(+)
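
One way the framework could use the new flag at registration time is
sketched below. This is an assumption made for illustration only: the
actual check is not part of this patch, and struct mbox_caps and
check_mbox_caps() are hypothetical names.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical subset of struct mbox_controller, for illustration only. */
struct mbox_caps {
    bool txdone_irq;
    bool txdone_poll;
    bool rxdone_auto;
};

/*
 * Refuse controllers which would need timer based polling for received
 * data, i.e. those which do not set rxdone_auto.
 */
static int check_mbox_caps(const struct mbox_caps *caps)
{
    if (!caps)
        return -EINVAL;

    if (!caps->rxdone_auto)
        return -EINVAL;

    return 0;
}
```

With such a check, a synchronous controller like the SMC mailbox would
set rxdone_auto and register normally, while a controller without the
flag would be rejected up front.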

diff --git a/xen/arch/arm/cpufreq/mailbox_controller.h b/xen/arch/arm/cpufreq/mailbox_controller.h
index 74deadb..05c6e45 100644
--- a/xen/arch/arm/cpufreq/mailbox_controller.h
+++ b/xen/arch/arm/cpufreq/mailbox_controller.h
@@ -64,6 +64,10 @@ struct mbox_chan_ops {
  * @txdone_poll:	If the controller can read but not report the TX
  *			done. Ex, some register shows the TX status but
  *			no interrupt rises. Ignored if 'txdone_irq' is set.
+ * @rxdone_auto:	Indicates if the controller doesn't need to poll for
+ *			received data. It either has an RX done irq signaling when
+ *			received data is ready, or received data 'appears' right
+ *			after transmitted data has been sent (synchronous case).
  * @txpoll_period:	If 'txdone_poll' is in effect, the API polls for
  *			last TX's status after these many millisecs
  * @of_xlate:		Controller driver specific mapping of channel via DT
@@ -78,6 +82,7 @@ struct mbox_controller {
 	int num_chans;
 	bool txdone_irq;
 	bool txdone_poll;
+	bool rxdone_auto;
 	unsigned txpoll_period;
 	struct mbox_chan *(*of_xlate)(struct mbox_controller *mbox,
 				      const struct of_phandle_args *sp);
-- 
2.7.4


* [RFC PATCH 22/31] xen/arm: Add Xen changes to SCPI protocol
  2017-11-09 17:09 [RFC PATCH 00/31] CPUFreq on ARM Oleksandr Tyshchenko
                   ` (20 preceding siblings ...)
  2017-11-09 17:10 ` [RFC PATCH 21/31] xen/arm: Add rxdone_auto flag to mbox_controller structure Oleksandr Tyshchenko
@ 2017-11-09 17:10 ` Oleksandr Tyshchenko
  2017-12-05 21:20   ` Stefano Stabellini
  2017-11-09 17:10 ` [RFC PATCH 23/31] xen/arm: Add Xen changes to mailbox infrastructure Oleksandr Tyshchenko
                   ` (11 subsequent siblings)
  33 siblings, 1 reply; 108+ messages in thread
From: Oleksandr Tyshchenko @ 2017-11-09 17:10 UTC (permalink / raw)
  To: xen-devel; +Cc: Oleksandr Tyshchenko, Stefano Stabellini, Julien Grall

From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

Modify the directly ported SCPI Message Protocol driver to be
functional inside Xen.

As the SCPI Message Protocol driver expects a mailbox to be registered,
find and initialize the mailbox before probing the driver.

Include "wrappers.h", which contains everything the directly
ported code relies on.

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Julien Grall <julien.grall@linaro.org>
---
 xen/arch/arm/cpufreq/arm_scpi.c      | 90 ++++++++++++++++++++++++++++++++++++
 xen/arch/arm/cpufreq/scpi_protocol.h | 32 +++++++++++++
 2 files changed, 122 insertions(+)

diff --git a/xen/arch/arm/cpufreq/arm_scpi.c b/xen/arch/arm/cpufreq/arm_scpi.c
index 7da9f1b..553a516 100644
--- a/xen/arch/arm/cpufreq/arm_scpi.c
+++ b/xen/arch/arm/cpufreq/arm_scpi.c
@@ -23,8 +23,16 @@
  *
  * You should have received a copy of the GNU General Public License along
  * with this program. If not, see <http://www.gnu.org/licenses/>.
+ *
+ * Based on Linux drivers/firmware/arm_scpi.c
+ * => commit 0d30176819c8738b012ec623c7b3db19df818e70
+ *
+ * Xen modification:
+ * Oleksandr Tyshchenko <Oleksandr_Tyshchenko@epam.com>
+ * Copyright (C) 2017 EPAM Systems Inc.
  */
 
+#if 0
 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
 #include <linux/bitmap.h>
@@ -44,6 +52,22 @@
 #include <linux/slab.h>
 #include <linux/sort.h>
 #include <linux/spinlock.h>
+#endif
+
+#include <xen/device_tree.h>
+#include <xen/err.h>
+#include <xen/vmap.h>
+#include <xen/sort.h>
+
+#include "scpi_protocol.h"
+#include "mailbox_client.h"
+#include "mailbox_controller.h"
+#include "wrappers.h"
+
+/*
+ * TODO:
+ * 1. Release resources explicitly, since the devm_* helpers don't do it in Xen.
+ */
 
 #define CMD_ID_SHIFT		0
 #define CMD_ID_MASK		0x7f
@@ -859,6 +883,7 @@ static int scpi_init_versions(struct scpi_drvinfo *info)
 	return ret;
 }
 
+#if 0
 static ssize_t protocol_version_show(struct device *dev,
 				     struct device_attribute *attr, char *buf)
 {
@@ -888,6 +913,7 @@ static struct attribute *versions_attrs[] = {
 	NULL,
 };
 ATTRIBUTE_GROUPS(versions);
+#endif
 
 static void
 scpi_free_channels(struct device *dev, struct scpi_chan *pchan, int count)
@@ -909,8 +935,10 @@ static int scpi_remove(struct platform_device *pdev)
 
 	scpi_info = NULL; /* stop exporting SCPI ops through get_scpi_ops */
 
+#if 0
 	of_platform_depopulate(dev);
 	sysfs_remove_groups(&dev->kobj, versions_groups);
+#endif
 	scpi_free_channels(dev, info->channels, info->num_chans);
 	platform_set_drvdata(pdev, NULL);
 
@@ -1055,11 +1083,15 @@ err:
 		  FW_REV_PATCH(scpi_info->firmware_version));
 	scpi_info->scpi_ops = &scpi_ops;
 
+#if 0
 	ret = sysfs_create_groups(&dev->kobj, versions_groups);
 	if (ret)
 		dev_err(dev, "unable to create sysfs version group\n");
 
 	return of_platform_populate(dev->of_node, NULL, NULL, dev);
+#else
+	return 0;
+#endif
 }
 
 static const struct of_device_id scpi_of_match[] = {
@@ -1070,6 +1102,7 @@ static const struct of_device_id scpi_of_match[] = {
 
 MODULE_DEVICE_TABLE(of, scpi_of_match);
 
+#if 0
 static struct platform_driver scpi_driver = {
 	.driver = {
 		.name = "scpi_protocol",
@@ -1083,3 +1116,60 @@ module_platform_driver(scpi_driver);
 MODULE_AUTHOR("Sudeep Holla <sudeep.holla@arm.com>");
 MODULE_DESCRIPTION("ARM SCPI mailbox protocol driver");
 MODULE_LICENSE("GPL v2");
+#endif
+
+static struct device *scpi_dev;
+
+struct device *get_scpi_dev(void)
+{
+	return scpi_dev;
+}
+
+int __init scpi_init(void)
+{
+	struct dt_device_node *scpi, *mbox;
+	bool has_mbox = false;
+	int ret = -ENODEV;
+
+	scpi = dt_find_matching_node(NULL, scpi_of_match);
+	if (!scpi) {
+		printk("failed to find SCPI node in the device tree\n");
+		return -ENXIO;
+	}
+
+	/* First, find and initialize the mailbox to communicate with the SCP */
+	dt_for_each_device_node(dt_host, mbox) {
+		ret = device_init(mbox, DEVICE_MAILBOX, NULL);
+		if (!ret) {
+			has_mbox = true;
+			break;
+		}
+	}
+
+	if (!has_mbox) {
+		dev_err(&scpi->dev, "failed to init Mailbox interface (%d)\n", ret);
+		return ret;
+	}
+
+	ret = scpi_probe(scpi);
+	if (ret) {
+		/* TODO Do we need to deinit mailbox? */
+		dev_err(&scpi->dev, "failed to init SCPI Message Protocol (%d)\n", ret);
+		return ret;
+	}
+
+	scpi_dev = &scpi->dev;
+
+	/* TODO Do we need to mark device as used by Xen? */
+
+	return 0;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 8
+ * indent-tabs-mode: t
+ * End:
+ */
diff --git a/xen/arch/arm/cpufreq/scpi_protocol.h b/xen/arch/arm/cpufreq/scpi_protocol.h
index 327d656..0f6dab3 100644
--- a/xen/arch/arm/cpufreq/scpi_protocol.h
+++ b/xen/arch/arm/cpufreq/scpi_protocol.h
@@ -14,8 +14,25 @@
  *
  * You should have received a copy of the GNU General Public License along with
  * this program. If not, see <http://www.gnu.org/licenses/>.
+ *
+ * Based on Linux include/linux/scpi_protocol.h
+ * => commit 45ca7df7c345465dbd2426a33012c9c33d27de62
+ *
+ * Xen modification:
+ * Oleksandr Tyshchenko <Oleksandr_Tyshchenko@epam.com>
+ * Copyright (C) 2017 EPAM Systems Inc.
  */
+
+#ifndef __ARCH_ARM_CPUFREQ_SCPI_PROTOCOL_H__
+#define __ARCH_ARM_CPUFREQ_SCPI_PROTOCOL_H__
+
+#if 0
 #include <linux/types.h>
+#endif
+
+#include <asm/device.h>
+
+#define IS_REACHABLE(CONFIG_ARM_SCPI_PROTOCOL) 1
 
 struct scpi_opp {
 	u32 freq;
@@ -78,7 +95,22 @@ struct scpi_ops {
 };
 
 #if IS_REACHABLE(CONFIG_ARM_SCPI_PROTOCOL)
+int scpi_init(void);
+struct device *get_scpi_dev(void);
 struct scpi_ops *get_scpi_ops(void);
 #else
+static inline int scpi_init(void) { return -1; }
+static inline struct device *get_scpi_dev(void) { return NULL; }
 static inline struct scpi_ops *get_scpi_ops(void) { return NULL; }
 #endif
+
+#endif /* __ARCH_ARM_CPUFREQ_SCPI_PROTOCOL_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 8
+ * indent-tabs-mode: t
+ * End:
+ */
-- 
2.7.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


* [RFC PATCH 23/31] xen/arm: Add Xen changes to mailbox infrastructure
  2017-11-09 17:09 [RFC PATCH 00/31] CPUFreq on ARM Oleksandr Tyshchenko
                   ` (21 preceding siblings ...)
  2017-11-09 17:10 ` [RFC PATCH 22/31] xen/arm: Add Xen changes to SCPI protocol Oleksandr Tyshchenko
@ 2017-11-09 17:10 ` Oleksandr Tyshchenko
  2017-11-09 17:10 ` [RFC PATCH 24/31] xen/arm: Add Xen changes to ARM SMC based mailbox Oleksandr Tyshchenko
                   ` (10 subsequent siblings)
  33 siblings, 0 replies; 108+ messages in thread
From: Oleksandr Tyshchenko @ 2017-11-09 17:10 UTC (permalink / raw)
  To: xen-devel; +Cc: Oleksandr Tyshchenko, Stefano Stabellini, Julien Grall

From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

Modify the directly ported mailbox infrastructure so that it is
functional inside Xen.

Include "wrappers.h", which provides everything the directly ported
code relies on.

Important note: the dummy "wait-for-completion" implementation is based
on a busy loop, which prevents us from using timer based polling.
So, prevent mailbox controllers that need a polling timer from being
registered.
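
To illustrate the restriction in isolation, here is a minimal,
self-contained sketch. It is not the patch itself: "struct mbox_caps"
and mbox_pick_txdone() are hypothetical names used only for this
example, while the three flags and the TXDONE_BY_* values follow the
mailbox code.

```c
#include <errno.h>
#include <stdbool.h>

/* Completion methods, same bit values as in mailbox.h. */
#define TXDONE_BY_IRQ	(1 << 0)
#define TXDONE_BY_POLL	(1 << 1)
#define TXDONE_BY_ACK	(1 << 2)

/* Hypothetical, reduced stand-in for struct mbox_controller. */
struct mbox_caps {
	bool txdone_irq;	/* controller raises a TX-done irq */
	bool txdone_poll;	/* controller needs last_tx_done() polling */
	bool rxdone_auto;	/* received data is available without polling */
};

/*
 * Return the TX-done method on success, or -EINVAL if the controller
 * would need timer based polling.
 */
static int mbox_pick_txdone(const struct mbox_caps *caps)
{
	if (!caps->rxdone_auto)
		return -EINVAL;		/* RX polling is not supported */

	if (caps->txdone_irq)
		return TXDONE_BY_IRQ;

	if (caps->txdone_poll)
		return -EINVAL;		/* TX polling is not supported */

	return TXDONE_BY_ACK;		/* the client signals TX-done */
}
```

Under these assumptions a fully synchronous controller (no irq, no
polling, rxdone_auto set) ends up with TXDONE_BY_ACK, and anything that
asks for polling is rejected.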

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Julien Grall <julien.grall@linaro.org>
---
 xen/arch/arm/cpufreq/mailbox.c            | 51 +++++++++++++++++++++++++++++--
 xen/arch/arm/cpufreq/mailbox.h            | 14 +++++++++
 xen/arch/arm/cpufreq/mailbox_client.h     | 18 +++++++++++
 xen/arch/arm/cpufreq/mailbox_controller.h | 22 +++++++++++++
 4 files changed, 102 insertions(+), 3 deletions(-)

diff --git a/xen/arch/arm/cpufreq/mailbox.c b/xen/arch/arm/cpufreq/mailbox.c
index 537f4f6..7a34e36 100644
--- a/xen/arch/arm/cpufreq/mailbox.c
+++ b/xen/arch/arm/cpufreq/mailbox.c
@@ -7,8 +7,16 @@
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License version 2 as
  * published by the Free Software Foundation.
+ *
+ * Based on Linux drivers/mailbox/mailbox.c
+ * => commit b7133d6fcd9a9eb4633357d4a27430d4e0c794ad
+ *
+ * Xen modification:
+ * Oleksandr Tyshchenko <Oleksandr_Tyshchenko@epam.com>
+ * Copyright (C) 2017 EPAM Systems Inc.
  */
 
+#if 0
 #include <linux/interrupt.h>
 #include <linux/spinlock.h>
 #include <linux/mutex.h>
@@ -20,8 +28,16 @@
 #include <linux/bitops.h>
 #include <linux/mailbox_client.h>
 #include <linux/mailbox_controller.h>
+#endif
+
+#include <xen/device_tree.h>
+#include <xen/err.h>
+#include <xen/xmalloc.h>
 
 #include "mailbox.h"
+#include "mailbox_client.h"
+#include "mailbox_controller.h"
+#include "wrappers.h"
 
 static LIST_HEAD(mbox_cons);
 static DEFINE_MUTEX(con_mutex);
@@ -85,9 +101,11 @@ static void msg_submit(struct mbox_chan *chan)
 exit:
 	spin_unlock_irqrestore(&chan->lock, flags);
 
+#if 0 /* We don't support timer based polling. */
 	if (!err && (chan->txdone_method & TXDONE_BY_POLL))
 		/* kick start the timer immediately to avoid delays */
 		hrtimer_start(&chan->mbox->poll_hrt, 0, HRTIMER_MODE_REL);
+#endif
 }
 
 static void tx_tick(struct mbox_chan *chan, int r)
@@ -114,6 +132,7 @@ static void tx_tick(struct mbox_chan *chan, int r)
 		complete(&chan->tx_complete);
 }
 
+#if 0 /* We don't support timer based polling. */
 static enum hrtimer_restart txdone_hrtimer(struct hrtimer *hrtimer)
 {
 	struct mbox_controller *mbox =
@@ -139,6 +158,7 @@ static enum hrtimer_restart txdone_hrtimer(struct hrtimer *hrtimer)
 	}
 	return HRTIMER_NORESTART;
 }
+#endif
 
 /**
  * mbox_chan_received_data - A way for controller driver to push data
@@ -374,7 +394,7 @@ struct mbox_chan *mbox_request_channel_byname(struct mbox_client *cl,
 					      const char *name)
 {
 	struct device_node *np = cl->dev->of_node;
-	struct property *prop;
+	const struct property *prop;
 	const char *mbox_name;
 	int index = 0;
 
@@ -452,13 +472,26 @@ int mbox_controller_register(struct mbox_controller *mbox)
 	if (!mbox || !mbox->dev || !mbox->ops || !mbox->num_chans)
 		return -EINVAL;
 
+	/*
+	 * Unfortunately, here we have to prevent controllers that need a
+	 * polling timer from being registered. An acceptable controller must
+	 * either have both TX-Done and RX-Done irqs or be completely synchronous.
+	 */
+	if (!mbox->rxdone_auto) {
+		dev_err(mbox->dev, "rx polling method is not supported\n");
+		return -EINVAL;
+	}
+
 	if (mbox->txdone_irq)
 		txdone = TXDONE_BY_IRQ;
-	else if (mbox->txdone_poll)
-		txdone = TXDONE_BY_POLL;
+	else if (mbox->txdone_poll) {
+		dev_err(mbox->dev, "tx polling method is not supported\n");
+		return -EINVAL;
+	}
 	else /* It has to be ACK then */
 		txdone = TXDONE_BY_ACK;
 
+#if 0 /* We don't support timer based polling. */
 	if (txdone == TXDONE_BY_POLL) {
 
 		if (!mbox->ops->last_tx_done) {
@@ -470,6 +503,7 @@ int mbox_controller_register(struct mbox_controller *mbox)
 			     HRTIMER_MODE_REL);
 		mbox->poll_hrt.function = txdone_hrtimer;
 	}
+#endif
 
 	for (i = 0; i < mbox->num_chans; i++) {
 		struct mbox_chan *chan = &mbox->chans[i];
@@ -509,9 +543,20 @@ void mbox_controller_unregister(struct mbox_controller *mbox)
 	for (i = 0; i < mbox->num_chans; i++)
 		mbox_free_channel(&mbox->chans[i]);
 
+#if 0 /* We don't support timer based polling. */
 	if (mbox->txdone_poll)
 		hrtimer_cancel(&mbox->poll_hrt);
+#endif
 
 	mutex_unlock(&con_mutex);
 }
 EXPORT_SYMBOL_GPL(mbox_controller_unregister);
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 8
+ * indent-tabs-mode: t
+ * End:
+ */
diff --git a/xen/arch/arm/cpufreq/mailbox.h b/xen/arch/arm/cpufreq/mailbox.h
index 456ba68..ed8fd42 100644
--- a/xen/arch/arm/cpufreq/mailbox.h
+++ b/xen/arch/arm/cpufreq/mailbox.h
@@ -2,6 +2,11 @@
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License version 2 as
  * published by the Free Software Foundation.
+ *
+ * Based on Linux drivers/mailbox/mailbox.h
+ * => commit 86c22f8c9a3b71d42d38bfcd80372de72f573713
+ *
+ * No Xen modification.
  */
 
 #ifndef __MAILBOX_H
@@ -12,3 +17,12 @@
 #define TXDONE_BY_ACK	BIT(2) /* S/W ACK recevied by Client ticks the TX */
 
 #endif /* __MAILBOX_H */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 8
+ * indent-tabs-mode: t
+ * End:
+ */
diff --git a/xen/arch/arm/cpufreq/mailbox_client.h b/xen/arch/arm/cpufreq/mailbox_client.h
index 4434871..d6ded8b 100644
--- a/xen/arch/arm/cpufreq/mailbox_client.h
+++ b/xen/arch/arm/cpufreq/mailbox_client.h
@@ -5,13 +5,22 @@
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License version 2 as
  * published by the Free Software Foundation.
+ *
+ * Based on Linux include/linux/mailbox_client.h
+ * => commit dfabde206aa10ae71a89ba75e68b1f58a6336a05
+ *
+ * No Xen modification.
  */
 
 #ifndef __MAILBOX_CLIENT_H
 #define __MAILBOX_CLIENT_H
 
+#if 0
 #include <linux/of.h>
 #include <linux/device.h>
+#endif
+
+#include <asm/device.h>
 
 struct mbox_chan;
 
@@ -49,3 +58,12 @@ bool mbox_client_peek_data(struct mbox_chan *chan); /* atomic */
 void mbox_free_channel(struct mbox_chan *chan); /* may sleep */
 
 #endif /* __MAILBOX_CLIENT_H */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 8
+ * indent-tabs-mode: t
+ * End:
+ */
diff --git a/xen/arch/arm/cpufreq/mailbox_controller.h b/xen/arch/arm/cpufreq/mailbox_controller.h
index 05c6e45..93ab62d 100644
--- a/xen/arch/arm/cpufreq/mailbox_controller.h
+++ b/xen/arch/arm/cpufreq/mailbox_controller.h
@@ -2,16 +2,27 @@
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License version 2 as
  * published by the Free Software Foundation.
+ *
+ * Based on Linux include/linux/mailbox_controller.h
+ * => commit 0cc67945ea5933d53db69606312cf52f553d1b81
+ *
+ * Xen modification:
+ * Oleksandr Tyshchenko <Oleksandr_Tyshchenko@epam.com>
+ * Copyright (C) 2017 EPAM Systems Inc.
  */
 
 #ifndef __MAILBOX_CONTROLLER_H
 #define __MAILBOX_CONTROLLER_H
 
+#if 0
 #include <linux/of.h>
 #include <linux/types.h>
 #include <linux/hrtimer.h>
 #include <linux/device.h>
 #include <linux/completion.h>
+#endif
+
+#include "wrappers.h"
 
 struct mbox_chan;
 
@@ -87,7 +98,9 @@ struct mbox_controller {
 	struct mbox_chan *(*of_xlate)(struct mbox_controller *mbox,
 				      const struct of_phandle_args *sp);
 	/* Internal to API */
+#if 0 /* We don't support timer based polling. */
 	struct hrtimer poll_hrt;
+#endif
 	struct list_head node;
 };
 
@@ -137,3 +150,12 @@ void mbox_chan_received_data(struct mbox_chan *chan, void *data); /* atomic */
 void mbox_chan_txdone(struct mbox_chan *chan, int r); /* atomic */
 
 #endif /* __MAILBOX_CONTROLLER_H */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 8
+ * indent-tabs-mode: t
+ * End:
+ */
-- 
2.7.4



* [RFC PATCH 24/31] xen/arm: Add Xen changes to ARM SMC based mailbox
  2017-11-09 17:09 [RFC PATCH 00/31] CPUFreq on ARM Oleksandr Tyshchenko
                   ` (22 preceding siblings ...)
  2017-11-09 17:10 ` [RFC PATCH 23/31] xen/arm: Add Xen changes to mailbox infrastructure Oleksandr Tyshchenko
@ 2017-11-09 17:10 ` Oleksandr Tyshchenko
  2017-11-09 17:10 ` [RFC PATCH 25/31] xen/arm: Use non-blocking mode for SCPI protocol Oleksandr Tyshchenko
                   ` (9 subsequent siblings)
  33 siblings, 0 replies; 108+ messages in thread
From: Oleksandr Tyshchenko @ 2017-11-09 17:10 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksandr Tyshchenko, Stefano Stabellini, Andre Przywara, Julien Grall

From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

Modify the directly ported ARM SMC based mailbox so that it is
functional inside Xen.

Include "wrappers.h", which provides everything the directly ported
code relies on.

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Julien Grall <julien.grall@linaro.org>
CC: Andre Przywara <andre.przywara@linaro.org>
---
 xen/arch/arm/cpufreq/arm-smc-mailbox.c | 101 +++++++++++++++++++++++++++++++++
 1 file changed, 101 insertions(+)

diff --git a/xen/arch/arm/cpufreq/arm-smc-mailbox.c b/xen/arch/arm/cpufreq/arm-smc-mailbox.c
index d7b61a7..65d183e 100644
--- a/xen/arch/arm/cpufreq/arm-smc-mailbox.c
+++ b/xen/arch/arm/cpufreq/arm-smc-mailbox.c
@@ -8,14 +8,73 @@
  * This device provides a mechanism for emulating a mailbox by using
  * smc calls, allowing a "mailbox" consumer to sit in firmware running
  * on the same core.
+ *
+ * Based on a patch series which hasn't reached upstream yet:
+ * => https://lkml.org/lkml/2017/7/23/129
+ *    [PATCH v2 0/3] mailbox: arm: introduce smc triggered mailbox
+ *
+ * Xen modification:
+ * Oleksandr Tyshchenko <Oleksandr_Tyshchenko@epam.com>
+ * Copyright (C) 2017 EPAM Systems Inc.
  */
 
+#if 0
 #include <linux/device.h>
 #include <linux/kernel.h>
 #include <linux/mailbox_controller.h>
 #include <linux/module.h>
 #include <linux/platform_device.h>
 #include <linux/arm-smccc.h>
+#endif
+
+#include <xen/device_tree.h>
+#include <xen/err.h>
+#include <xen/xmalloc.h>
+
+#include "mailbox_controller.h"
+#include "wrappers.h"
+
+/*
+ * TODO:
+ * 1. Release resources explicitly, since devm is not available.
+ */
+
+struct arm_smccc_res {
+	unsigned long a0;
+	unsigned long a1;
+	unsigned long a2;
+	unsigned long a3;
+};
+
+/* This is just to align interfaces. */
+static inline void arm_smccc_smc(unsigned long a0, unsigned long a1,
+		unsigned long a2, unsigned long a3, unsigned long a4,
+		unsigned long a5, unsigned long a6, unsigned long a7,
+		struct arm_smccc_res *res)
+{
+	register_t ret[4] = { 0 };
+
+	call_smccc_smc(a0, a1, a2, a3, a4, a5, a6, a7, ret);
+
+	res->a0 = ret[0];
+	res->a1 = ret[1];
+	res->a2 = ret[2];
+	res->a3 = ret[3];
+}
+
+static inline void arm_smccc_hvc(unsigned long a0, unsigned long a1,
+		unsigned long a2, unsigned long a3, unsigned long a4,
+		unsigned long a5, unsigned long a6, unsigned long a7,
+		struct arm_smccc_res *res)
+{
+	/*
+	 * We should never get here since the "use_hvc" flag is always false
+	 * (smc is the only allowed method).
+	 */
+	BUG();
+}
+
+/***** Start of Linux code *****/
 
 #define ARM_SMC_MBOX_USE_HVC	BIT(0)
 
@@ -69,6 +128,9 @@ static int arm_smc_mbox_probe(struct platform_device *pdev)
 	if (!of_property_read_string(dev->of_node, "method", &method)) {
 		if (!strcmp("hvc", method)) {
 			use_hvc = true;
+
+			dev_warn(dev, "method must be smc\n");
+			return -EINVAL;
 		} else if (!strcmp("smc", method)) {
 			use_hvc = false;
 		} else {
@@ -112,6 +174,11 @@ static int arm_smc_mbox_probe(struct platform_device *pdev)
 	mbox->txdone_poll = true;
 	mbox->txdone_irq = false;
 	mbox->txpoll_period = 1;
+	/*
+	 * We don't have RX-done irq, but always have received data in hand since
+	 * mailbox is synchronous.
+	 */
+	mbox->rxdone_auto = true;
 	mbox->ops = &arm_smc_mbox_chan_ops;
 	mbox->dev = dev;
 
@@ -126,6 +193,7 @@ static int arm_smc_mbox_probe(struct platform_device *pdev)
 	return ret;
 }
 
+#if 0
 static int arm_smc_mbox_remove(struct platform_device *pdev)
 {
 	struct mbox_controller *mbox = platform_get_drvdata(pdev);
@@ -133,6 +201,7 @@ static int arm_smc_mbox_remove(struct platform_device *pdev)
 	mbox_controller_unregister(mbox);
 	return 0;
 }
+#endif
 
 static const struct of_device_id arm_smc_mbox_of_match[] = {
 	{ .compatible = "arm,smc-mbox", },
@@ -140,6 +209,7 @@ static const struct of_device_id arm_smc_mbox_of_match[] = {
 };
 MODULE_DEVICE_TABLE(of, arm_smc_mbox_of_match);
 
+#if 0
 static struct platform_driver arm_smc_mbox_driver = {
 	.driver = {
 		.name = "arm-smc-mbox",
@@ -153,3 +223,34 @@ module_platform_driver(arm_smc_mbox_driver);
 MODULE_AUTHOR("Andre Przywara <andre.przywara@arm.com>");
 MODULE_DESCRIPTION("Generic ARM smc mailbox driver");
 MODULE_LICENSE("GPL v2");
+#endif
+
+/***** End of Linux code *****/
+
+static int __init arm_smc_mbox_init(struct dt_device_node *dev,
+		const void *data)
+{
+	int ret;
+
+	ret = arm_smc_mbox_probe(dev);
+	if (ret) {
+		dev_err(&dev->dev, "failed to init ARM SMC mailbox (%d)\n", ret);
+		return ret;
+	}
+
+	return 0;
+}
+
+DT_DEVICE_START(arm_smc_mbox, "ARM SMC mailbox", DEVICE_MAILBOX)
+	.dt_match = arm_smc_mbox_of_match,
+	.init = arm_smc_mbox_init,
+DT_DEVICE_END
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 8
+ * indent-tabs-mode: t
+ * End:
+ */
-- 
2.7.4



* [RFC PATCH 25/31] xen/arm: Use non-blocking mode for SCPI protocol
  2017-11-09 17:09 [RFC PATCH 00/31] CPUFreq on ARM Oleksandr Tyshchenko
                   ` (23 preceding siblings ...)
  2017-11-09 17:10 ` [RFC PATCH 24/31] xen/arm: Add Xen changes to ARM SMC based mailbox Oleksandr Tyshchenko
@ 2017-11-09 17:10 ` Oleksandr Tyshchenko
  2017-11-09 17:10 ` [RFC PATCH 26/31] xen/arm: Don't set txdone_poll flag for ARM SMC mailbox Oleksandr Tyshchenko
                   ` (8 subsequent siblings)
  33 siblings, 0 replies; 108+ messages in thread
From: Oleksandr Tyshchenko @ 2017-11-09 17:10 UTC (permalink / raw)
  To: xen-devel; +Cc: Oleksandr Tyshchenko, Stefano Stabellini, Julien Grall

From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

Don't block until data is transmitted.
As we are limited to only two methods, TXDONE_BY_IRQ and TXDONE_BY_ACK,
there are two possible scenarios:
- If the mailbox controller has a TX-done irq, it knows exactly when
  the data has been sent and it is responsible for calling
  mbox_chan_txdone() to signal the framework about TX-done.
- If the controller can't generate a TX-done irq, the client has to
  signal TX-done by itself.

So, in the case of the "ARM SMC mailbox" we explicitly tick the TX state
machine by calling mbox_client_txdone().
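
The two scenarios can be modelled with a small, self-contained sketch.
It is only an illustration: "struct chan_model", client_txdone() and
finish_transfer() are hypothetical stand-ins, not the mailbox API.

```c
#include <stdbool.h>

/* Hypothetical, reduced model of a mailbox channel. */
struct chan_model {
	bool txdone_irq;	/* controller raises a TX-done irq */
	int acked;		/* how often the client signalled TX-done */
	int last_status;	/* status passed with the last client ack */
};

/* Stand-in for mbox_client_txdone(): the client ticks the TX state machine. */
static void client_txdone(struct chan_model *chan, int status)
{
	chan->acked++;
	chan->last_status = status;
}

/*
 * Called once the transfer has finished (or failed) with the given status.
 * Only channels without a TX-done irq need the explicit client ack; with
 * an irq the controller reports TX-done on its own.
 */
static int finish_transfer(struct chan_model *chan, int status)
{
	if (!chan->txdone_irq)
		client_txdone(chan, status);

	return status;
}
```

In this model a channel without a TX-done irq is acked exactly once per
transfer by the client, while an irq capable channel is left alone.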

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Julien Grall <julien.grall@linaro.org>
---
 xen/arch/arm/cpufreq/arm_scpi.c | 30 +++++++++++++++++++++++-------
 1 file changed, 23 insertions(+), 7 deletions(-)

diff --git a/xen/arch/arm/cpufreq/arm_scpi.c b/xen/arch/arm/cpufreq/arm_scpi.c
index 553a516..51a9742 100644
--- a/xen/arch/arm/cpufreq/arm_scpi.c
+++ b/xen/arch/arm/cpufreq/arm_scpi.c
@@ -530,7 +530,7 @@ static void put_scpi_xfer(struct scpi_xfer *t, struct scpi_chan *ch)
 static int scpi_send_message(u8 idx, void *tx_buf, unsigned int tx_len,
 			     void *rx_buf, unsigned int rx_len)
 {
-	int ret;
+	int ret, ret2;
 	u8 chan;
 	u8 cmd;
 	struct scpi_xfer *msg;
@@ -566,21 +566,37 @@ static int scpi_send_message(u8 idx, void *tx_buf, unsigned int tx_len,
 	reinit_completion(&msg->done);
 
 	ret = mbox_send_message(scpi_chan->chan, msg);
-	if (ret < 0 || !rx_buf)
+	if (ret < 0 || !rx_buf) {
+		/* mbox_send_message returns non-negative value on success */
+		ret2 = ret < 0 ? ret : 0;
 		goto out;
+	}
 
 	if (!wait_for_completion_timeout(&msg->done, MAX_RX_TIMEOUT))
 		ret = -ETIMEDOUT;
 	else
 		/* first status word */
 		ret = msg->status;
+
+	/* SCPI error codes > 0, translate them to Linux scale */
+	ret2 = ret > 0 ? scpi_to_linux_errno(ret) : ret;
+
+	/*
+	 * If the mailbox controller has a TX-done irq, it knows exactly when
+	 * the data has been sent and it is responsible for calling
+	 * mbox_chan_txdone() to signal the framework about TX-done.
+	 * If the controller can't generate a TX-done irq, the client has to
+	 * signal TX-done by itself.
+	 */
+	if (!scpi_chan->chan->mbox->txdone_irq)
+		mbox_client_txdone(scpi_chan->chan, ret2);
+
 out:
 	if (ret < 0 && rx_buf) /* remove entry from the list if timed-out */
 		scpi_process_cmd(scpi_chan, msg->cmd);
 
 	put_scpi_xfer(msg, scpi_chan);
-	/* SCPI error codes > 0, translate them to Linux scale*/
-	return ret > 0 ? scpi_to_linux_errno(ret) : ret;
+	return ret2;
 }
 
 static u32 scpi_get_version(void)
@@ -1026,9 +1042,9 @@ static int scpi_probe(struct platform_device *pdev)
 		cl->dev = dev;
 		cl->rx_callback = scpi_handle_remote_msg;
 		cl->tx_prepare = scpi_tx_prepare;
-		cl->tx_block = true;
-		cl->tx_tout = 20;
-		cl->knows_txdone = false; /* controller can't ack */
+		/* Use non-blocking mode for client */
+		cl->tx_block = false;
+		cl->knows_txdone = true;
 
 		INIT_LIST_HEAD(&pchan->rx_pending);
 		INIT_LIST_HEAD(&pchan->xfers_list);
-- 
2.7.4



* [RFC PATCH 26/31] xen/arm: Don't set txdone_poll flag for ARM SMC mailbox
  2017-11-09 17:09 [RFC PATCH 00/31] CPUFreq on ARM Oleksandr Tyshchenko
                   ` (24 preceding siblings ...)
  2017-11-09 17:10 ` [RFC PATCH 25/31] xen/arm: Use non-blocking mode for SCPI protocol Oleksandr Tyshchenko
@ 2017-11-09 17:10 ` Oleksandr Tyshchenko
  2017-11-09 17:10 ` [RFC PATCH 27/31] cpufreq: hack: perf->states isn't a real guest handle on ARM Oleksandr Tyshchenko
                   ` (7 subsequent siblings)
  33 siblings, 0 replies; 108+ messages in thread
From: Oleksandr Tyshchenko @ 2017-11-09 17:10 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksandr Tyshchenko, Stefano Stabellini, Andre Przywara, Julien Grall

From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

Don't set the txdone_poll flag, which results in the TXDONE_BY_POLL
method. It is not optimal to use this method along with the dummy
last_tx_done(), since the controller is completely synchronous.
What is more, the TXDONE_BY_POLL method is prohibited because it
involves timer based polling.

This change leads to using the TXDONE_BY_ACK method and, as a result,
the client (SCPI protocol) explicitly ticks the TX state machine.

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Julien Grall <julien.grall@linaro.org>
CC: Andre Przywara <andre.przywara@linaro.org>
---
 xen/arch/arm/cpufreq/arm-smc-mailbox.c | 10 +---------
 1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/xen/arch/arm/cpufreq/arm-smc-mailbox.c b/xen/arch/arm/cpufreq/arm-smc-mailbox.c
index 65d183e..c9fea49 100644
--- a/xen/arch/arm/cpufreq/arm-smc-mailbox.c
+++ b/xen/arch/arm/cpufreq/arm-smc-mailbox.c
@@ -100,15 +100,8 @@ static int arm_smc_send_data(struct mbox_chan *link, void *data)
 	return 0;
 }
 
-/* This mailbox is synchronous, so we are always done. */
-static bool arm_smc_last_tx_done(struct mbox_chan *link)
-{
-	return true;
-}
-
 static const struct mbox_chan_ops arm_smc_mbox_chan_ops = {
 	.send_data	= arm_smc_send_data,
-	.last_tx_done	= arm_smc_last_tx_done
 };
 
 static int arm_smc_mbox_probe(struct platform_device *pdev)
@@ -171,9 +164,8 @@ static int arm_smc_mbox_probe(struct platform_device *pdev)
 		mbox->chans[i].con_priv = &chan_data[i];
 	}
 
-	mbox->txdone_poll = true;
+	mbox->txdone_poll = false;
 	mbox->txdone_irq = false;
-	mbox->txpoll_period = 1;
 	/*
 	 * We don't have RX-done irq, but always have received data in hand since
 	 * mailbox is synchronous.
-- 
2.7.4



* [RFC PATCH 27/31] cpufreq: hack: perf->states isn't a real guest handle on ARM
  2017-11-09 17:09 [RFC PATCH 00/31] CPUFreq on ARM Oleksandr Tyshchenko
                   ` (25 preceding siblings ...)
  2017-11-09 17:10 ` [RFC PATCH 26/31] xen/arm: Don't set txdone_poll flag for ARM SMC mailbox Oleksandr Tyshchenko
@ 2017-11-09 17:10 ` Oleksandr Tyshchenko
  2017-12-05 21:34   ` Stefano Stabellini
  2017-11-09 17:10 ` [RFC PATCH 28/31] xen/arm: Introduce SCPI based CPUFreq driver Oleksandr Tyshchenko
                   ` (6 subsequent siblings)
  33 siblings, 1 reply; 108+ messages in thread
From: Oleksandr Tyshchenko @ 2017-11-09 17:10 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksandr Tyshchenko, Stefano Stabellini, Julien Grall,
	Jan Beulich, Andrew Cooper

From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

This patch is just a temporary solution to highlight a problem which
should be resolved in a proper way.

set_px_pminfo() is intended to be called from a platform hypercall
where the "perf" argument has been entirely filled in by hwdom.

But unlike x86, on ARM we don't get this info from hwdom; we get it
from other sources (device tree + firmware). In order to retain the
function interface, we emulate receiving a hypercall and pass the
argument the function expects to see. Although "perf->states" looks
like a guest handle, it is not a real handle and we can't use
copy_from_guest() on it. As only scpi-cpufreq sets the XEN_PX_DATA
flag, use it as an indicator to do memcpy instead.
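
The idea can be sketched outside of Xen as follows. Everything here is
a stand-in chosen for illustration: PX_DATA_SKETCH does not claim to be
the real XEN_PX_DATA value, "struct px_state" is a reduced
xen_processor_px, and the guest copy routine is passed in as a function
pointer so that the example stays self-contained.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flag value; the real XEN_PX_DATA is defined by the series. */
#define PX_DATA_SKETCH 0x10u

/* Reduced stand-in for struct xen_processor_px. */
struct px_state
{
    unsigned long long core_frequency; /* MHz */
};

/* Signature of a guest copy routine; returns non-zero on failure. */
typedef int (*guest_copy_fn)(struct px_state *dst, const void *handle,
                             size_t count);

/* Stub used only for illustration: behaves like a guest copy that faults. */
static int faulting_guest_copy(struct px_state *dst, const void *handle,
                               size_t count)
{
    (void)dst;
    (void)handle;
    (void)count;

    return 1;
}

/*
 * Copy "count" states into "dst". If the flags identify the in-hypervisor
 * producer, "src" is an ordinary pointer and memcpy() is enough. Otherwise
 * "src" is treated as a guest handle and handed to "guest_copy".
 */
static int copy_px_states(struct px_state *dst, const void *src, size_t count,
                          unsigned int flags, guest_copy_fn guest_copy)
{
    if ( flags == PX_DATA_SKETCH )
    {
        memcpy(dst, src, count * sizeof(*dst));
        return 0;
    }

    return guest_copy(dst, src, count) ? -EFAULT : 0;
}
```

Note that the sketch, like the patch, compares the flags for equality,
so it only takes the memcpy path when no other flag is set.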

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
CC: Jan Beulich <jbeulich@suse.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Julien Grall <julien.grall@linaro.org>
---
 xen/drivers/cpufreq/cpufreq.c | 19 +++++++++++++++----
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/xen/drivers/cpufreq/cpufreq.c b/xen/drivers/cpufreq/cpufreq.c
index 64e1ae7..1022cd1 100644
--- a/xen/drivers/cpufreq/cpufreq.c
+++ b/xen/drivers/cpufreq/cpufreq.c
@@ -558,11 +558,22 @@ int set_px_pminfo(uint32_t acpi_id, struct xen_processor_performance *dom0_px_in
             ret = -ENOMEM;
             goto out;
         }
-        if ( copy_from_guest(pxpt->states, dom0_px_info->states,
-                             dom0_px_info->state_count) )
+
+        if ( dom0_px_info->flags == XEN_PX_DATA )
         {
-            ret = -EFAULT;
-            goto out;
+            struct xen_processor_px *states = (dom0_px_info->states).p;
+
+            memcpy(pxpt->states, states,
+                   dom0_px_info->state_count * sizeof(struct xen_processor_px));
+        }
+        else
+        {
+            if ( copy_from_guest(pxpt->states, dom0_px_info->states,
+                                 dom0_px_info->state_count) )
+            {
+                ret = -EFAULT;
+                goto out;
+            }
         }
         pxpt->state_count = dom0_px_info->state_count;
 
-- 
2.7.4



* [RFC PATCH 28/31] xen/arm: Introduce SCPI based CPUFreq driver
  2017-11-09 17:09 [RFC PATCH 00/31] CPUFreq on ARM Oleksandr Tyshchenko
                   ` (26 preceding siblings ...)
  2017-11-09 17:10 ` [RFC PATCH 27/31] cpufreq: hack: perf->states isn't a real guest handle on ARM Oleksandr Tyshchenko
@ 2017-11-09 17:10 ` Oleksandr Tyshchenko
  2017-11-09 17:10 ` [RFC PATCH 29/31] xen/arm: Introduce CPUFreq Interface component Oleksandr Tyshchenko
                   ` (5 subsequent siblings)
  33 siblings, 0 replies; 108+ messages in thread
From: Oleksandr Tyshchenko @ 2017-11-09 17:10 UTC (permalink / raw)
  To: xen-devel; +Cc: Oleksandr Tyshchenko, Stefano Stabellini, Julien Grall

From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

This patch adds a CPUFreq driver for controlling the CPUs' DVFS feature
provided by the System Control Processor (SCP), using the SCPI protocol
for inter-processor communication.

The important point is that, unlike Linux, Xen doesn't have a clock
infrastructure, so the clocks for the CPUs (DVFS clocks) provided by
the SCP are managed by this driver directly, using DVFS operations on
the power domains the controlled CPUs are part of.
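
As a rough illustration of what "managed directly" means: without a
clock framework, a requested frequency has to be mapped to an index
into the OPP table of the CPU's power domain, and that index is what
is handed to the DVFS "set index" operation. The sketch below assumes
OPP frequencies in Hz and a target in kHz; "struct opp_sketch" and
opp_index_for_khz() are hypothetical names used only here.

```c
#include <errno.h>
#include <stdint.h>

/* Reduced stand-in for struct scpi_opp: only the frequency matters here. */
struct opp_sketch
{
    uint32_t freq; /* Hz, as reported by the SCP */
};

/*
 * Map a target frequency in kHz to an index into the OPP table of the
 * CPU's power domain. Returns -EINVAL if no OPP matches.
 */
static int opp_index_for_khz(const struct opp_sketch *opps, int count,
                             unsigned int target_khz)
{
    int idx;

    for ( idx = 0; idx < count; idx++ )
    {
        /* Compare in kHz */
        if ( opps[idx].freq / 1000 == target_khz )
            return idx;
    }

    return -EINVAL;
}
```

With a table of 500 MHz, 1 GHz and 1.5 GHz, a request for 1000000 kHz
maps to index 1, while a frequency that is not in the table is
rejected instead of being rounded.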

The non-arch specific driver code is mostly borrowed from
the x86 ACPI CPUFreq driver.

The most important TODOs regarding the whole patch series:
1. Handle devm in the directly ported code. Currently, if any error
   occurs, previously allocated resources are left unfreed.
2. Thermal management integration.
3. Don't pass CPUFreq related nodes to dom0. Xen owns SCPI completely.
4. Handle CPU_TURBO frequencies.

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Julien Grall <julien.grall@linaro.org>
---
 xen/arch/arm/cpufreq/scpi_cpufreq.c | 328 ++++++++++++++++++++++++++++++++++++
 1 file changed, 328 insertions(+)
 create mode 100644 xen/arch/arm/cpufreq/scpi_cpufreq.c

diff --git a/xen/arch/arm/cpufreq/scpi_cpufreq.c b/xen/arch/arm/cpufreq/scpi_cpufreq.c
new file mode 100644
index 0000000..bcd8889
--- /dev/null
+++ b/xen/arch/arm/cpufreq/scpi_cpufreq.c
@@ -0,0 +1,328 @@
+/*
+ * xen/arch/arm/cpufreq/scpi_cpufreq.c
+ *
+ * SCPI based CPUFreq driver
+ *
+ * Based on Xen arch/x86/acpi/cpufreq/cpufreq.c
+ *
+ * Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
+ * Copyright (c) 2017 EPAM Systems.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <xen/types.h>
+#include <xen/delay.h>
+#include <xen/cpumask.h>
+#include <xen/sched.h>
+#include <xen/xmalloc.h>
+#include <xen/err.h>
+#include <xen/cpufreq.h>
+#include <asm/bug.h>
+#include <asm/percpu.h>
+
+#include "scpi_protocol.h"
+
+extern struct device *get_cpu_device(unsigned int cpu);
+
+struct scpi_cpufreq_data
+{
+    struct processor_performance *perf;
+    struct cpufreq_frequency_table *freq_table;
+    struct scpi_dvfs_info *info; /* DVFS capabilities of the CPU's power domain */
+    int domain; /* power domain id this CPU belongs to */
+};
+
+static struct scpi_cpufreq_data *cpufreq_driver_data[NR_CPUS];
+
+static struct cpufreq_driver scpi_cpufreq_driver;
+
+static struct scpi_ops *scpi_ops;
+
+static unsigned int scpi_cpufreq_get(unsigned int cpu)
+{
+    struct scpi_cpufreq_data *data;
+    struct cpufreq_policy *policy;
+    const struct scpi_opp *opp;
+    int idx;
+
+    if ( cpu >= nr_cpu_ids || !cpu_online(cpu) )
+        return 0;
+
+    policy = per_cpu(cpufreq_cpu_policy, cpu);
+    if ( !policy || !(data = cpufreq_driver_data[policy->cpu]) ||
+         !data->info )
+        return 0;
+
+    idx = scpi_ops->dvfs_get_idx(data->domain);
+    if ( idx < 0 )
+        return 0;
+
+    opp = data->info->opps + idx;
+
+    /* Convert Hz -> kHz */
+    return opp->freq / 1000;
+}
+
+static int scpi_cpufreq_target(struct cpufreq_policy *policy,
+                               unsigned int target_freq, unsigned int relation)
+{
+    struct scpi_cpufreq_data *data = cpufreq_driver_data[policy->cpu];
+    struct processor_performance *perf;
+    struct cpufreq_freqs freqs;
+    cpumask_t online_policy_cpus;
+    unsigned int next_state = 0; /* Index into freq_table */
+    unsigned int next_perf_state = 0; /* Index into perf table */
+    unsigned int j;
+    int result;
+    const struct scpi_opp *opp;
+    int idx, max_opp;
+
+    if ( unlikely(!data) || !data->perf || !data->freq_table || !data->info )
+        return -ENODEV;
+
+    perf = data->perf;
+    result = cpufreq_frequency_table_target(policy,
+                                            data->freq_table,
+                                            target_freq,
+                                            relation, &next_state);
+    if ( unlikely(result) )
+        return -ENODEV;
+
+    cpumask_and(&online_policy_cpus, &cpu_online_map, policy->cpus);
+
+    next_perf_state = data->freq_table[next_state].index;
+    if ( perf->state == next_perf_state )
+    {
+        if ( unlikely(policy->resume) )
+            policy->resume = 0;
+        else
+            return 0;
+    }
+
+    /* Convert MHz -> kHz */
+    freqs.old = perf->states[perf->state].core_frequency * 1000;
+    freqs.new = data->freq_table[next_state].frequency;
+
+    /* Find corresponding index */
+    max_opp = data->info->count;
+    opp = data->info->opps;
+    for ( idx = 0; idx < max_opp; idx++, opp++ )
+    {
+        /* Compare in kHz */
+        if ( opp->freq / 1000 == freqs.new )
+            break;
+    }
+    if ( idx == max_opp )
+        return -EINVAL;
+
+    result = scpi_ops->dvfs_set_idx(data->domain, idx);
+    if ( result < 0 )
+        return result;
+
+    for_each_cpu( j, &online_policy_cpus )
+        cpufreq_statistic_update(j, perf->state, next_perf_state);
+
+    perf->state = next_perf_state;
+    policy->cur = freqs.new;
+
+    return result;
+}
+
+static int scpi_cpufreq_verify(struct cpufreq_policy *policy)
+{
+    struct scpi_cpufreq_data *data;
+    struct processor_performance *perf;
+
+    if ( !policy || !(data = cpufreq_driver_data[policy->cpu]) ||
+         !processor_pminfo[policy->cpu] )
+        return -EINVAL;
+
+    perf = &processor_pminfo[policy->cpu]->perf;
+
+    /* Convert MHz -> kHz */
+    cpufreq_verify_within_limits(policy, 0,
+        perf->states[perf->platform_limit].core_frequency * 1000);
+
+    return cpufreq_frequency_table_verify(policy, data->freq_table);
+}
+
+static int scpi_cpufreq_cpu_init(struct cpufreq_policy *policy)
+{
+    unsigned int i;
+    unsigned int valid_states = 0;
+    unsigned int curr_state, curr_freq;
+    struct scpi_cpufreq_data *data;
+    int result;
+    struct processor_performance *perf;
+    struct device *cpu_dev;
+    struct scpi_dvfs_info *info;
+    int domain;
+
+    cpu_dev = get_cpu_device(policy->cpu);
+    if ( !cpu_dev )
+        return -ENODEV;
+
+    data = xzalloc(struct scpi_cpufreq_data);
+    if ( !data )
+        return -ENOMEM;
+
+    cpufreq_driver_data[policy->cpu] = data;
+
+    data->perf = &processor_pminfo[policy->cpu]->perf;
+
+    perf = data->perf;
+    policy->shared_type = perf->shared_type;
+
+    data->freq_table = xmalloc_array(struct cpufreq_frequency_table,
+                                    (perf->state_count + 1));
+    if ( !data->freq_table )
+    {
+        result = -ENOMEM;
+        goto err_unreg;
+    }
+
+    /* Detect transition latency */
+    policy->cpuinfo.transition_latency = 0;
+    for ( i = 0; i < perf->state_count; i++ )
+    {
+        /* Compare in ns */
+        if ( perf->states[i].transition_latency * 1000 >
+             policy->cpuinfo.transition_latency )
+            /* Convert us -> ns */
+            policy->cpuinfo.transition_latency =
+                perf->states[i].transition_latency * 1000;
+    }
+
+    policy->governor = cpufreq_opt_governor ? : CPUFREQ_DEFAULT_GOVERNOR;
+
+    /* Initialize frequency table */
+    for ( i = 0; i < perf->state_count; i++ )
+    {
+        /* Compare in MHz */
+        if ( i > 0 && perf->states[i].core_frequency >=
+             data->freq_table[valid_states - 1].frequency / 1000 )
+            continue;
+
+        data->freq_table[valid_states].index = i;
+        /* Convert MHz -> kHz */
+        data->freq_table[valid_states].frequency =
+            perf->states[i].core_frequency * 1000;
+        valid_states++;
+    }
+    data->freq_table[valid_states].frequency = CPUFREQ_TABLE_END;
+    perf->state = 0;
+
+    result = cpufreq_frequency_table_cpuinfo(policy, data->freq_table);
+    if ( result )
+        goto err_freqfree;
+
+    /* Fill in fields needed for frequency changing */
+    domain = scpi_ops->device_domain_id(cpu_dev);
+    if ( domain < 0 )
+    {
+        result = domain;
+        goto err_freqfree;
+    }
+    data->domain = domain;
+
+    info = scpi_ops->dvfs_get_info(domain);
+    if ( IS_ERR(info) )
+    {
+        result = PTR_ERR(info);
+        goto err_freqfree;
+    }
+    data->info = info;
+
+    /* Retrieve current frequency */
+    curr_freq = scpi_cpufreq_get(policy->cpu);
+
+    /* Find corresponding state */
+    curr_state = 0;
+    for ( i = 0; data->freq_table[i].frequency != CPUFREQ_TABLE_END; i++ )
+    {
+        if ( curr_freq == data->freq_table[i].frequency )
+        {
+            curr_state = i;
+            break;
+        }
+    }
+
+    /* Update fields with actual values */
+    policy->cur = curr_freq;
+    perf->state = data->freq_table[curr_state].index;
+
+    /*
+     * the first call to ->target() should result in us actually
+     * writing something to the appropriate registers.
+     */
+    policy->resume = 1;
+
+    return result;
+
+err_freqfree:
+    xfree(data->freq_table);
+err_unreg:
+    xfree(data);
+    cpufreq_driver_data[policy->cpu] = NULL;
+
+    return result;
+}
+
+static int scpi_cpufreq_cpu_exit(struct cpufreq_policy *policy)
+{
+    struct scpi_cpufreq_data *data = cpufreq_driver_data[policy->cpu];
+
+    if ( data )
+    {
+        xfree(data->freq_table);
+        xfree(data);
+        cpufreq_driver_data[policy->cpu] = NULL;
+    }
+
+    return 0;
+}
+
+static struct cpufreq_driver scpi_cpufreq_driver = {
+    .name   = "scpi-cpufreq",
+
+    .verify = scpi_cpufreq_verify,
+    .target = scpi_cpufreq_target,
+    .get    = scpi_cpufreq_get,
+    .init   = scpi_cpufreq_cpu_init,
+    .exit   = scpi_cpufreq_cpu_exit,
+};
+
+int __init scpi_cpufreq_register_driver(void)
+{
+    scpi_ops = get_scpi_ops();
+    if ( !scpi_ops )
+        return -ENXIO;
+
+    return cpufreq_register_driver(&scpi_cpufreq_driver);
+}
+
+int cpufreq_cpu_init(unsigned int cpuid)
+{
+    return cpufreq_add_cpu(cpuid);
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
2.7.4
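
A note on the unit handling in scpi_cpufreq_target() above: the OPP list
reported by SCP is in Hz, while the CPUFreq frequency table is in kHz, so
the lookup compares both values in kHz. The following is a minimal
standalone sketch of that lookup, not the patch code itself. It assumes a
simplified `struct opp` and a helper named `find_opp_index()`, both of which
are hypothetical and introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for an SCPI OPP entry. */
struct opp {
    unsigned int freq_hz;   /* frequency in Hz, as reported by SCP */
};

/*
 * Return the index of the first OPP whose frequency equals target_khz
 * when compared in kHz, or -1 if no entry matches.
 */
static int find_opp_index(const struct opp *opps, size_t count,
                          unsigned int target_khz)
{
    size_t i;

    assert(opps != NULL || count == 0);

    for ( i = 0; i < count; i++ )
    {
        /* Convert Hz -> kHz before comparing */
        if ( opps[i].freq_hz / 1000 == target_khz )
            return (int)i;
    }

    return -1;
}
```

Because the division truncates, two OPPs that differ by less than 1 kHz
would map to the same table entry; the sketch returns the first match in
that case.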


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


* [RFC PATCH 29/31] xen/arm: Introduce CPUFreq Interface component
  2017-11-09 17:09 [RFC PATCH 00/31] CPUFreq on ARM Oleksandr Tyshchenko
                   ` (27 preceding siblings ...)
  2017-11-09 17:10 ` [RFC PATCH 28/31] xen/arm: Introduce SCPI based CPUFreq driver Oleksandr Tyshchenko
@ 2017-11-09 17:10 ` Oleksandr Tyshchenko
  2017-12-05 22:25   ` Stefano Stabellini
  2017-11-09 17:10 ` [RFC PATCH 30/31] xen/arm: Build CPUFreq components Oleksandr Tyshchenko
                   ` (4 subsequent siblings)
  33 siblings, 1 reply; 108+ messages in thread
From: Oleksandr Tyshchenko @ 2017-11-09 17:10 UTC (permalink / raw)
  To: xen-devel; +Cc: Oleksandr Tyshchenko, Stefano Stabellini, Julien Grall

From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

This patch adds an interface component which performs the following steps:
1. Initialize everything the SCPI based CPUFreq driver needs to be
   functional (SCPI Message protocol, mailbox to communicate with SCP, etc).
   Also do a preliminary check that the SCPI DVFS clock nodes offered by
   SCP are present in the device tree.
2. Register the SCPI based CPUFreq driver.
3. Populate CPUs. Get DVFS info (OPP list and the latency information)
   for all DVFS capable CPUs using the SCPI protocol, convert these
   capabilities into the PM data the CPUFreq framework expects to see,
   and then upload it.

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Julien Grall <julien.grall@linaro.org>
---
 xen/arch/arm/cpufreq/cpufreq_if.c | 522 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 522 insertions(+)
 create mode 100644 xen/arch/arm/cpufreq/cpufreq_if.c

diff --git a/xen/arch/arm/cpufreq/cpufreq_if.c b/xen/arch/arm/cpufreq/cpufreq_if.c
new file mode 100644
index 0000000..2451d00
--- /dev/null
+++ b/xen/arch/arm/cpufreq/cpufreq_if.c
@@ -0,0 +1,522 @@
+/*
+ * xen/arch/arm/cpufreq/cpufreq_if.c
+ *
+ * CPUFreq interface component
+ *
+ * Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
+ * Copyright (c) 2017 EPAM Systems.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <xen/device_tree.h>
+#include <xen/err.h>
+#include <xen/sched.h>
+#include <xen/cpufreq.h>
+#include <xen/pmstat.h>
+#include <xen/guest_access.h>
+
+#include "scpi_protocol.h"
+
+/*
+ * TODO:
+ * 1. Add __init to required funcs
+ * 2. Put get_cpu_device() into common place
+ */
+
+static struct scpi_ops *scpi_ops;
+
+extern int scpi_cpufreq_register_driver(void);
+
+#define dev_name(dev) dt_node_full_name(dev_to_dt(dev))
+
+struct device *get_cpu_device(unsigned int cpu)
+{
+    if ( cpu < nr_cpu_ids && cpu_possible(cpu) )
+        return dt_to_dev(cpu_dt_nodes[cpu]);
+    else
+        return NULL;
+}
+
+static bool is_dvfs_capable(unsigned int cpu)
+{
+    static const struct dt_device_match scpi_dvfs_clock_match[] =
+    {
+        DT_MATCH_COMPATIBLE("arm,scpi-dvfs-clocks"),
+        { /* sentinel */ },
+    };
+    struct device *cpu_dev;
+    struct dt_phandle_args clock_spec;
+    struct scpi_dvfs_info *info;
+    u32 domain;
+    int i, ret, count;
+
+    cpu_dev = get_cpu_device(cpu);
+    if ( !cpu_dev )
+    {
+        printk("cpu%d: failed to get device\n", cpu);
+        return false;
+    }
+
+    /* First of all find a clock node this CPU is a consumer of */
+    ret = dt_parse_phandle_with_args(cpu_dev->of_node,
+                                     "clocks",
+                                     "#clock-cells",
+                                     0,
+                                     &clock_spec);
+    if ( ret )
+    {
+        printk("cpu%d: failed to get clock node\n", cpu);
+        return false;
+    }
+
+    /* Make sure it is an available DVFS clock node */
+    if ( !dt_match_node(scpi_dvfs_clock_match, clock_spec.np) ||
+         !dt_device_is_available(clock_spec.np) )
+    {
+        printk("cpu%d: clock node '%s' is either non-DVFS or non-available\n",
+               cpu, dev_name(&clock_spec.np->dev));
+        return false;
+    }
+
+    /*
+     * We already have the id of the power domain this CPU belongs to: it is
+     * stored in args[0] of the CPU clock specifier, so we could ask SCP for
+     * its DVFS info right away. But we dig a little deeper first to make
+     * sure that everything is consistent.
+     */
+
+    /* Check how many clock ids a DVFS clock node has */
+    ret = dt_property_count_elems_of_size(clock_spec.np,
+                                          "clock-indices",
+                                          sizeof(u32));
+    if ( ret < 0 )
+    {
+        printk("cpu%d: failed to get clock-indices count in '%s'\n",
+               cpu, dev_name(&clock_spec.np->dev));
+        return false;
+    }
+    count = ret;
+
+    /* Check if a clock id the CPU clock specifier points to is present */
+    for ( i = 0; i < count; i++ )
+    {
+        ret = dt_property_read_u32_index(clock_spec.np,
+                                         "clock-indices",
+                                         i,
+                                         &domain);
+        if ( ret )
+        {
+            printk("cpu%d: failed to get clock index in '%s'\n",
+                   cpu, dev_name(&clock_spec.np->dev));
+            return false;
+        }
+
+        /* Match found */
+        if ( clock_spec.args[0] == domain )
+            break;
+    }
+
+    if ( i == count )
+    {
+        printk("cpu%d: failed to find matching clk_id (pd) %d\n",
+               cpu, clock_spec.args[0]);
+        return false;
+    }
+
+    /*
+     * Check whether SCP is aware of this power domain. If it is, the SCPI
+     * Message protocol driver populates the power domain's DVFS info.
+     */
+    info = scpi_ops->dvfs_get_info(domain);
+    if ( IS_ERR(info) )
+    {
+        printk("cpu%d: failed to get DVFS info of pd%u\n", cpu, domain);
+        return false;
+    }
+
+    printk(XENLOG_DEBUG "cpu%d: is DVFS capable, belongs to pd%u\n",
+           cpu, domain);
+
+    return true;
+}
+
+static int get_sharing_cpus(unsigned int cpu, cpumask_t *mask)
+{
+    struct device *cpu_dev = get_cpu_device(cpu), *tcpu_dev;
+    unsigned int tcpu;
+    int domain, tdomain;
+
+    BUG_ON(!cpu_dev);
+
+    domain = scpi_ops->device_domain_id(cpu_dev);
+    if ( domain < 0 )
+        return domain;
+
+    cpumask_clear(mask);
+    cpumask_set_cpu(cpu, mask);
+
+    for_each_online_cpu( tcpu )
+    {
+        if ( tcpu == cpu )
+            continue;
+
+        tcpu_dev = get_cpu_device(tcpu);
+        if ( !tcpu_dev )
+            continue;
+
+        tdomain = scpi_ops->device_domain_id(tcpu_dev);
+        if ( tdomain == domain )
+            cpumask_set_cpu(tcpu, mask);
+    }
+
+    return 0;
+}
+
+static int get_transition_latency(struct device *cpu_dev)
+{
+    return scpi_ops->get_transition_latency(cpu_dev);
+}
+
+static struct scpi_dvfs_info *get_dvfs_info(struct device *cpu_dev)
+{
+    int domain;
+
+    domain = scpi_ops->device_domain_id(cpu_dev);
+    if ( domain < 0 )
+        return ERR_PTR(-EINVAL);
+
+    return scpi_ops->dvfs_get_info(domain);
+}
+
+static int init_cpufreq_table(unsigned int cpu,
+                              struct cpufreq_frequency_table **table)
+{
+    struct cpufreq_frequency_table *freq_table = NULL;
+    struct device *cpu_dev = get_cpu_device(cpu);
+    struct scpi_dvfs_info *info;
+    struct scpi_opp *opp;
+    int i;
+
+    BUG_ON(!cpu_dev);
+
+    info = get_dvfs_info(cpu_dev);
+    if ( IS_ERR(info) )
+        return PTR_ERR(info);
+
+    if ( !info->opps )
+        return -EIO;
+
+    freq_table = xzalloc_array(struct cpufreq_frequency_table, info->count + 1);
+    if ( !freq_table )
+        return -ENOMEM;
+
+    for ( opp = info->opps, i = 0; i < info->count; i++, opp++ )
+    {
+        freq_table[i].index = i;
+        /* Convert Hz -> kHz */
+        freq_table[i].frequency = opp->freq / 1000;
+    }
+
+    freq_table[i].index = i;
+    freq_table[i].frequency = CPUFREQ_TABLE_END;
+
+    *table = &freq_table[0];
+
+    return 0;
+}
+
+static void free_cpufreq_table(struct cpufreq_frequency_table **table)
+{
+    if ( !table )
+        return;
+
+    xfree(*table);
+    *table = NULL;
+}
+
+static int upload_cpufreq_data(cpumask_t *mask,
+                               struct cpufreq_frequency_table *table)
+{
+    struct xen_processor_performance *perf;
+    struct xen_processor_px *states;
+    uint32_t platform_limit = 0, state_count = 0;
+    unsigned int max_freq = 0, prev_freq = 0, cpu = cpumask_first(mask);
+    int i, latency, ret = 0;
+
+    perf = xzalloc(struct xen_processor_performance);
+    if ( !perf )
+        return -ENOMEM;
+
+    /* Check frequency table and find max frequency */
+    for ( i = 0; (table[i].frequency != CPUFREQ_TABLE_END); i++ )
+    {
+        unsigned int freq = table[i].frequency;
+
+        if ( freq == CPUFREQ_ENTRY_INVALID )
+            continue;
+
+        if ( table[i].index != state_count || freq <= prev_freq )
+        {
+            printk("cpu%d: frequency table format error\n", cpu);
+            ret = -EINVAL;
+            goto out;
+        }
+
+        prev_freq = freq;
+        state_count++;
+        if ( freq > max_freq )
+            max_freq = freq;
+    }
+
+    /*
+     * The frequency table we have is only temporary storage for the DVFS
+     * info provided by SCP. Create the performance states array from it.
+     */
+    if ( !state_count )
+    {
+        printk("cpu%d: no available performance states\n", cpu);
+        ret = -EINVAL;
+        goto out;
+    }
+
+    states = xzalloc_array(struct xen_processor_px, state_count);
+    if ( !states )
+    {
+        ret = -ENOMEM;
+        goto out;
+    }
+
+    set_xen_guest_handle(perf->states, states);
+    perf->state_count = state_count;
+
+    latency = get_transition_latency(get_cpu_device(cpu));
+
+    /* Performance states must be ordered from the highest frequency down */
+    for ( i = 0; (table[i].frequency != CPUFREQ_TABLE_END); i++ )
+    {
+        unsigned int freq = table[i].frequency;
+        unsigned int index = state_count - 1 - table[i].index;
+
+        if ( freq == CPUFREQ_ENTRY_INVALID )
+            continue;
+
+        if ( freq == max_freq )
+            platform_limit = index;
+
+        /* Convert kHz -> MHz */
+        states[index].core_frequency = freq / 1000;
+        /* Convert ns -> us */
+        states[index].transition_latency = DIV_ROUND_UP(latency, 1000);
+    }
+
+    perf->flags = XEN_PX_DATA; /* all info in a one-shot */
+    perf->platform_limit = platform_limit;
+    perf->shared_type = CPUFREQ_SHARED_TYPE_ANY;
+    perf->domain_info.domain = cpumask_first(mask);
+    perf->domain_info.num_processors = cpumask_weight(mask);
+
+    /* Iterate through all CPUs which are in the same boat */
+    for_each_cpu( cpu, mask )
+    {
+        ret = set_px_pminfo(cpu, perf);
+        if ( ret )
+        {
+            printk("cpu%d: failed to set Px states (%d)\n", cpu, ret);
+            break;
+        }
+
+        printk(XENLOG_DEBUG "cpu%d: set Px states\n", cpu);
+    }
+
+    xfree(states);
+out:
+    xfree(perf);
+
+    return ret;
+}
+
+static int __init scpi_cpufreq_postinit(void)
+{
+    struct cpufreq_frequency_table *freq_table = NULL;
+    cpumask_t processed_cpus, shared_cpus;
+    unsigned int cpu;
+    int ret = -ENODEV;
+
+    cpumask_clear(&processed_cpus);
+
+    for_each_online_cpu( cpu )
+    {
+        if ( cpumask_test_cpu(cpu, &processed_cpus) )
+            continue;
+
+        if ( !is_dvfs_capable(cpu) )
+            continue;
+
+        ret = get_sharing_cpus(cpu, &shared_cpus);
+        if ( ret )
+        {
+            printk("cpu%d: failed to get sharing cpumask (%d)\n", cpu, ret);
+            return ret;
+        }
+
+        BUG_ON(cpumask_empty(&shared_cpus));
+        cpumask_or(&processed_cpus, &processed_cpus, &shared_cpus);
+
+        /* Create intermediate frequency table */
+        ret = init_cpufreq_table(cpu, &freq_table);
+        if ( ret )
+        {
+            printk("cpu%d: failed to initialize frequency table (%d)\n",
+                   cpu, ret);
+            return ret;
+        }
+
+        ret = upload_cpufreq_data(&shared_cpus, freq_table);
+        /* Destroy intermediate frequency table */
+        free_cpufreq_table(&freq_table);
+        if ( ret )
+        {
+            printk("cpu%d: failed to upload cpufreq data (%d)\n", cpu, ret);
+            return ret;
+        }
+
+        printk(XENLOG_DEBUG "cpu%d: uploaded cpufreq data\n", cpu);
+    }
+
+    return ret;
+}
+
+static int __init scpi_cpufreq_preinit(void)
+{
+    struct dt_device_node *scpi, *clk, *dvfs_clk;
+    int ret;
+
+    /* Initialize SCPI Message protocol */
+    ret = scpi_init();
+    if ( ret )
+    {
+        printk("failed to initialize SCPI (%d)\n", ret);
+        return ret;
+    }
+
+    /* Sanity check */
+    if ( !get_scpi_ops() || !get_scpi_dev() )
+        return -ENXIO;
+
+    scpi = get_scpi_dev()->of_node;
+    scpi_ops = get_scpi_ops();
+
+    ret = -ENODEV;
+
+    /*
+     * Check for clock related nodes only for now. Additional nodes, such as
+     * a thermal sensor, might need to be checked later.
+     */
+    dt_for_each_child_node( scpi, clk )
+    {
+        /*
+         * First of all there must be a container node which contains all
+         * clocks provided by SCP.
+         */
+        if ( !dt_device_is_compatible(clk, "arm,scpi-clocks") )
+            continue;
+
+        /*
+         * As we are interested in the DVFS feature only, check for a DVFS
+         * clock sub-node. At this stage check for its presence only: without
+         * it there is no point in registering SCPI based CPUFreq. We will
+         * perform a thorough check later when populating DVFS clock consumers.
+         */
+        dt_for_each_child_node( clk, dvfs_clk )
+        {
+            if ( !dt_device_is_compatible(dvfs_clk, "arm,scpi-dvfs-clocks") )
+                continue;
+
+            return 0;
+        }
+
+        break;
+    }
+
+    printk("failed to find SCPI DVFS clocks (%d)\n", ret);
+
+    return ret;
+}
+
+/* TODO Implement me */
+static void scpi_cpufreq_deinit(void)
+{
+
+}
+
+static int __init cpufreq_driver_init(void)
+{
+    int ret;
+
+    if ( cpufreq_controller != FREQCTL_xen )
+        return 0;
+
+    /*
+     * Initialize everything the SCPI based CPUFreq driver needs to be
+     * functional (SCPI Message protocol, mailbox to communicate with SCP,
+     * etc). Also do a preliminary check that the SCPI DVFS clock nodes
+     * offered by SCP are present in the device tree.
+     */
+    ret = scpi_cpufreq_preinit();
+    if ( ret )
+        goto out;
+
+    /* Register SCPI based CPUFreq driver */
+    ret = scpi_cpufreq_register_driver();
+    if ( ret )
+        goto out;
+
+    /*
+     * Populate CPUs. Get DVFS info (OPP list and the latency information)
+     * for all DVFS capable CPUs using SCPI protocol, convert these capabilities
+     * into the PM data the CPUFreq framework expects to see, and then
+     * upload it.
+     *
+     * This is almost the same PM data which hwdom uploads on x86 via a
+     * platform hypercall after parsing ACPI tables. In our case we don't
+     * need hwdom to be involved, since we already have everything in hand.
+     * Moreover, the hwdom doesn't even know anything about physical CPUs.
+     * It is not certain that this is the best place to do so, but it
+     * certainly must happen after driver registration.
+     */
+    ret = scpi_cpufreq_postinit();
+
+out:
+    if ( ret )
+    {
+        printk("failed to initialize SCPI based CPUFreq (%d)\n", ret);
+        scpi_cpufreq_deinit();
+        return ret;
+    }
+
+    printk("initialized SCPI based CPUFreq\n");
+
+    return 0;
+}
+__initcall(cpufreq_driver_init);
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
2.7.4
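
For reference, the conversion done by upload_cpufreq_data() above can be
illustrated in isolation: the intermediate frequency table is in ascending
order and in kHz, while the Px states array is filled in reverse so that
state 0 is the highest frequency, with frequencies in MHz and the latency
rounded up from ns to us. The sketch below is a simplified standalone
version under stated assumptions: `struct px_state`, `div_round_up()` and
`build_px_states()` are hypothetical names, and the table is assumed to
contain no invalid entries (the patch skips CPUFREQ_ENTRY_INVALID ones).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct xen_processor_px. */
struct px_state {
    unsigned int core_frequency_mhz;
    unsigned int transition_latency_us;
};

/* Integer division rounding up; d must be non-zero. */
static unsigned int div_round_up(unsigned int n, unsigned int d)
{
    return (n + d - 1) / d;
}

/*
 * Fill states[] from freq_khz[], which must hold count entries sorted in
 * ascending order. Input entry i is stored at index count - 1 - i, so
 * states[0] ends up describing the highest frequency.
 */
static void build_px_states(const unsigned int *freq_khz, size_t count,
                            unsigned int latency_ns,
                            struct px_state *states)
{
    size_t i;

    assert(freq_khz != NULL && states != NULL && count > 0);

    for ( i = 0; i < count; i++ )
    {
        size_t index = count - 1 - i;

        /* Convert kHz -> MHz */
        states[index].core_frequency_mhz = freq_khz[i] / 1000;
        /* Convert ns -> us, rounding up */
        states[index].transition_latency_us = div_round_up(latency_ns, 1000);
    }
}
```

With an ascending input the maximum frequency always lands at index 0,
which is why platform_limit ends up as 0 in the patch for a well-formed
table.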



* [RFC PATCH 30/31] xen/arm: Build CPUFreq components
  2017-11-09 17:09 [RFC PATCH 00/31] CPUFreq on ARM Oleksandr Tyshchenko
                   ` (28 preceding siblings ...)
  2017-11-09 17:10 ` [RFC PATCH 29/31] xen/arm: Introduce CPUFreq Interface component Oleksandr Tyshchenko
@ 2017-11-09 17:10 ` Oleksandr Tyshchenko
  2017-11-09 17:10 ` [RFC PATCH 31/31] xen/arm: Enable CPUFreq on ARM Oleksandr Tyshchenko
                   ` (3 subsequent siblings)
  33 siblings, 0 replies; 108+ messages in thread
From: Oleksandr Tyshchenko @ 2017-11-09 17:10 UTC (permalink / raw)
  To: xen-devel; +Cc: Oleksandr Tyshchenko, Stefano Stabellini, Julien Grall

From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Julien Grall <julien.grall@linaro.org>
---
 xen/arch/arm/Makefile         | 1 +
 xen/arch/arm/cpufreq/Makefile | 5 +++++
 2 files changed, 6 insertions(+)
 create mode 100644 xen/arch/arm/cpufreq/Makefile

diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
index 282d2c2..fe98570 100644
--- a/xen/arch/arm/Makefile
+++ b/xen/arch/arm/Makefile
@@ -3,6 +3,7 @@ subdir-$(CONFIG_ARM_64) += arm64
 subdir-y += platforms
 subdir-$(CONFIG_ARM_64) += efi
 subdir-$(CONFIG_ACPI) += acpi
+subdir-$(CONFIG_HAS_CPUFREQ) += cpufreq
 
 obj-$(CONFIG_HAS_ALTERNATIVE) += alternative.o
 obj-y += bootfdt.init.o
diff --git a/xen/arch/arm/cpufreq/Makefile b/xen/arch/arm/cpufreq/Makefile
new file mode 100644
index 0000000..9880208
--- /dev/null
+++ b/xen/arch/arm/cpufreq/Makefile
@@ -0,0 +1,5 @@
+obj-y += cpufreq_if.o
+obj-y += scpi_cpufreq.o
+obj-y += arm_scpi.o
+obj-y += mailbox.o
+obj-y += arm-smc-mailbox.o
-- 
2.7.4



* [RFC PATCH 31/31] xen/arm: Enable CPUFreq on ARM
  2017-11-09 17:09 [RFC PATCH 00/31] CPUFreq on ARM Oleksandr Tyshchenko
                   ` (29 preceding siblings ...)
  2017-11-09 17:10 ` [RFC PATCH 30/31] xen/arm: Build CPUFreq components Oleksandr Tyshchenko
@ 2017-11-09 17:10 ` Oleksandr Tyshchenko
  2017-11-09 17:18 ` [RFC PATCH 00/31] " Andrii Anisov
                   ` (2 subsequent siblings)
  33 siblings, 0 replies; 108+ messages in thread
From: Oleksandr Tyshchenko @ 2017-11-09 17:10 UTC (permalink / raw)
  To: xen-devel; +Cc: Oleksandr Tyshchenko, Stefano Stabellini, Julien Grall

From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Julien Grall <julien.grall@linaro.org>
---
 xen/arch/arm/Kconfig | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
index d46b98c..edd12f8 100644
--- a/xen/arch/arm/Kconfig
+++ b/xen/arch/arm/Kconfig
@@ -23,6 +23,8 @@ config ARM
 	select HAS_PASSTHROUGH
 	select HAS_PDX
 	select VIDEO
+	select HAS_CPUFREQ
+	select HAS_PM
 
 config ARCH_DEFCONFIG
 	string
-- 
2.7.4



* Re: [RFC PATCH 07/31] xenpm: Clarify xenpm usage
  2017-11-09 17:09 ` [RFC PATCH 07/31] xenpm: Clarify xenpm usage Oleksandr Tyshchenko
@ 2017-11-09 17:13   ` Wei Liu
  2017-12-02  1:28     ` Stefano Stabellini
  0 siblings, 1 reply; 108+ messages in thread
From: Wei Liu @ 2017-11-09 17:13 UTC (permalink / raw)
  To: Oleksandr Tyshchenko
  Cc: Stefano Stabellini, Wei Liu, Julien Grall, Ian Jackson,
	Oleksandr Tyshchenko, xen-devel

On Thu, Nov 09, 2017 at 07:09:57PM +0200, Oleksandr Tyshchenko wrote:
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> 
> CPU frequencies are in kHz. So, correct displayed text.
> 
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> CC: Ian Jackson <ian.jackson@eu.citrix.com>
> CC: Wei Liu <wei.liu2@citrix.com>
> CC: Stefano Stabellini <sstabellini@kernel.org>
> CC: Julien Grall <julien.grall@linaro.org>
> ---
>  tools/misc/xenpm.c | 6 +++---

Acked-by: Wei Liu <wei.liu2@citrix.com>


* Re: [RFC PATCH 00/31] CPUFreq on ARM
  2017-11-09 17:09 [RFC PATCH 00/31] CPUFreq on ARM Oleksandr Tyshchenko
                   ` (30 preceding siblings ...)
  2017-11-09 17:10 ` [RFC PATCH 31/31] xen/arm: Enable CPUFreq on ARM Oleksandr Tyshchenko
@ 2017-11-09 17:18 ` Andrii Anisov
  2017-11-13 19:40   ` Oleksandr Tyshchenko
  2017-11-13 15:21 ` Andre Przywara
  2017-12-05 22:26 ` Stefano Stabellini
  33 siblings, 1 reply; 108+ messages in thread
From: Andrii Anisov @ 2017-11-09 17:18 UTC (permalink / raw)
  To: Oleksandr Tyshchenko, xen-devel
  Cc: Edgar E . Iglesias, Stefano Stabellini, Oleksandr Tyshchenko,
	Andrew Cooper, Julien Grall, Andre Przywara, Jassi Brar,
	Jan Beulich, Sudeep Holla

Dear Oleksandr,


Please consider my `Reviewed-by: Andrii Anisov <andrii_anisov@epam.com>`
for all patches.

It was dropped when this series was extracted from GitHub.


On 09.11.17 19:09, Oleksandr Tyshchenko wrote:
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

-- 

*Andrii Anisov*




* Re: [RFC PATCH 00/31] CPUFreq on ARM
  2017-11-09 17:09 [RFC PATCH 00/31] CPUFreq on ARM Oleksandr Tyshchenko
                   ` (31 preceding siblings ...)
  2017-11-09 17:18 ` [RFC PATCH 00/31] " Andrii Anisov
@ 2017-11-13 15:21 ` Andre Przywara
  2017-11-13 19:40   ` Oleksandr Tyshchenko
  2017-12-05 22:26 ` Stefano Stabellini
  33 siblings, 1 reply; 108+ messages in thread
From: Andre Przywara @ 2017-11-13 15:21 UTC (permalink / raw)
  To: Oleksandr Tyshchenko, xen-devel
  Cc: Edgar E . Iglesias, Stefano Stabellini, Jassi Brar,
	Andrew Cooper, Julien Grall, Oleksandr Tyshchenko, Jan Beulich,
	Sudeep Holla

Hi,

thanks very much for your work on this!

On 09/11/17 17:09, Oleksandr Tyshchenko wrote:
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> 
> Hi, all.
> 
> The purpose of this RFC patch series is to add CPUFreq support to Xen on ARM.
> Motivation of hypervisor based CPUFreq is to enable one of the main PM use-cases in virtualized system powered by Xen hypervisor. Rationale behind this activity is that CPU virtualization is done by hypervisor and the guest OS doesn't actually know anything about physical CPUs because it is running on virtual CPUs. It is quite clear that a decision about frequency change should be taken by hypervisor as only it has information about actual CPU load.

Can you please sketch your usage scenario or workloads here? I can think
of quite different scenarios (oversubscribed server vs. partitioning
RTOS guests, for instance). The usefulness of CPUFreq and the trade-offs
in the design are quite different between those.

In general I doubt that a hypervisor scheduling vCPUs is in a good
position to make a decision on the proper frequency physical CPUs should
run with. From all I know it's already hard for an OS kernel to make
that call. So I would actually expect that guests provide some input,
for instance by signalling OPP change requests up to the hypervisor. The
hypervisor could then decide to act on them - or not.
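
Purely as an illustration of what such arbitration might look like (nothing
of this kind exists in the series, and the policy chosen here is only an
assumption): the hypervisor could collect frequency hints from the guests
whose vCPUs share a pCPU, take the highest one, and clamp it to the policy
limits, keeping the current frequency when no guest has asked for anything.
The function name `arbitrate_khz()` and its parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical arbitration of guest frequency hints for one pCPU.
 * Picks the highest hint and clamps it to [min_khz, max_khz].
 * With no hints at all, the current frequency is kept unchanged.
 */
static unsigned int arbitrate_khz(const unsigned int *hints_khz, size_t count,
                                  unsigned int cur_khz,
                                  unsigned int min_khz, unsigned int max_khz)
{
    unsigned int pick;
    size_t i;

    assert(min_khz <= max_khz);

    if ( count == 0 )
        return cur_khz;

    pick = hints_khz[0];
    for ( i = 1; i < count; i++ )
    {
        if ( hints_khz[i] > pick )
            pick = hints_khz[i];
    }

    if ( pick < min_khz )
        pick = min_khz;
    if ( pick > max_khz )
        pick = max_khz;

    return pick;
}
```

Taking the maximum favours the most demanding guest; a real design would
have to weigh that against power budgets and against guests that request
more than they need.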

> Although these required components (CPUFreq core, governors, etc) already exist in Xen, it is worth to mention that they are ACPI specific. So, a part of the current patch series makes them more generic in order to make possible a CPUFreq usage on architectures without ACPI support in.

Have you looked at how this is used on x86 these days? Can you briefly
describe how this works and how it's used there?

> But, the main question we have to answer is about frequency changing interface in virtualized system. The frequency changing interface and all dependent components which needed CPUFreq to be functional on ARM are not present in Xen these days. The list of required components is quite big and may change across different ARM SoC vendors. As an example, the following components are involved in DVFS on Renesas Salvator-X board which has R-Car Gen3 SoC installed: generic clock, regulator and thermal frameworks, Vendor’s CPG, PMIC, AVS, THS drivers, i2c support, etc.
> 
> We were considering a few possible approaches of hypervisor based CPUFreqs on ARM and came to conclusion to base this solution on popular at the moment, already upstreamed to Linux, ARM System Control and Power Interface(SCPI) protocol [1]. We chose SCPI protocol instead of newer ARM System Control and Management Interface (SCMI) protocol [2] since it is widely spread in Linux, there are good examples how to use it, the range of capabilities it has is enough for implementing hypervisor based CPUFreq and, what is more, upstream Linux support for SCMI is missed so far, but SCMI could be used as well.
> 
> Briefly speaking, the SCPI protocol is used between the System Control Processor(SCP) and the Application Processors(AP). The mailbox feature provides a mechanism for inter-processor communication between SCP and AP. The main purpose of SCP is to offload different PM related tasks from AP and one of the services that SCP provides is Dynamic voltage and frequency scaling (DVFS), it is what we actually need for CPUFreq. I will describe this approach in details down the text.
> 
> Let me explain a bit more what these possible approaches are:
> 
> 1. “Xen+hwdom” solution.
> GlobalLogic team proposed split model [3], where “hwdom-cpufreq” frontend driver in Xen interacts with the “xen-cpufreq” backend driver in Linux hwdom (possibly dom0) in order to scale physical CPUs. This solution hasn’t been accepted by Xen community yet and seems it is not going to be accepted without taking into the account still unanswered major questions and proving that “all-in-Xen” solution, which Xen community considered as more architecturally cleaner option, would be unworkable in practice.
> The other reasons why we decided not to stick to this approach are complex communication interface between Xen and hwdom: event channel, hypercalls, syscalls, passing CPU info via DT, etc and possible synchronization issues with a proposed solution.
> Although it is worth to mention that the beauty of this approach was that there wouldn’t be a need to port a lot of things to Xen. All frequency changing interface and all dependent components which needed CPUFreq to be functional were already in place.

Stefano, Julien and I were thinking about this: Wouldn't it be possible
to come up with some hardware domain, solely dealing with CPUFreq
changes? This could run a Linux kernel, but no or very little userland.
All its vCPUs would be pinned to pCPUs and would normally not be
scheduled by Xen. If Xen wants to change the frequency, it schedules the
respective vCPU to the right pCPU and passes down the frequency change
request. Sounds a bit involved, though, and probably doesn't solve the
problem where this domain needs to share access to hardware with Dom0
(clocks come to mind).

> Although this approach is not used, still I picked a few already acked patches which made ACPI specific CPUFreq stuff more generic.
> 
> 2. “all-in-Xen” solution.
> This implies that all CPUFreq related stuff should be located in Xen.
> Community considered this solution as more architecturally cleaner option than “Xen+hwdom” one. No layering violation comparing with the previous approach (letting guest OS manage one or more physical CPUs is more of a layering violation).
> This solution looks better, but to be honest, we are not in favor of this solution as well. We expect enormous developing effort to get this support in (the scope of required components looks unreal) and maintain it. So, we decided not to stick to this approach as well.

Yes, I even think it's not feasible to implement this. With a modern
clock implementation there is one driver to control *all* clocks of an
SoC, so you can't single out the CPU clock easily, for instance. One
would probably run into synchronisation issues, at best.

> 3. “Xen+SCP(ARM TF)” solution.
> It is yet another solution based on ARM SCPI protocol. The generic idea here is that there is a firmware, which being a server runs on some dedicated IP core (server), provides different PM services (DVFS, sensors, etc). On the other side there is a CPUFreq driver in Xen, which is running on the AP (client), consumes these services. CPUFreq driver neither changes the CPU frequency/voltage by itself nor cooperates with Linux in order to do such job. It just communicates with SCP directly using SCPI protocol. As I said before, some integrated into a SoC mailbox IP need to be used for IPC (doorbell for triggering action and shared memory region for commands). CPUFreq driver doesn’t even need to know what should be physically changed for the new frequency to take effect. It is a certainly SCP’s responsibility. This all avoid CPUFreq infrastructure in Xen on ARM from diving into each supported SoC internals and as the result having a lot of code.
> 
> The possible issue here could be in SCP, the problem is that some dedicated IP core may be absent at all or performs other than PM tasks. Fortunately, there is a brilliant solution to teach firmware running in the EL3 exception level (ARM TF) to perform SCP functions and use SMC calls for communications [4]. Exactly this transport implementation I want to bring to Xen the first. Such solution is going to be generic across all ARM platforms that do have firmware running in the EL3 exception level and don’t have candidate for being SCP.

While I feel flattered that you like that idea as well ;-), you should
mention that this requires actual firmware providing those services. I
am not sure there is actually *any* implementation of this at the
moment, apart from my PoC code for Allwinner.
And from a Xen point of view I am not sure we are in the position to
force users to use this firmware. This may be feasible in a classic
embedded scenario, where both firmware and software are provided by the
same entity, but that should be clearly noted as a restriction.

> Here we have completely synchronous case because of SMC calls nature. SMC triggered mailbox driver emulates a mailbox which signals transmitted data via Secure Monitor Call (SMC) instruction [5]. The mailbox receiver is implemented in firmware and synchronously returns data when it returns execution to the non-secure world again. This would allow us both to trigger a request and transfer execution to the firmware code in a safe and architected way. Like PSCI requests.
> As you can see this method is free from synchronization issues. What is more, this solution is more architecturally cleaner solution than split model “Xen+hwdom” one. From the security point of view, I hope, everything will be much more correct since the ARM TF, which we want to see in charge of controlling CPU frequency/voltage, is a trusted SW layer. Moreover, ARM TF is responsible for enabling/disabling CPU (PSCI) and nobody complains about it, so let it do DVFS too.

It should be noted that this synchronous nature of the communication can
actually be a problem: a DVFS request usually involves regulator and PLL
changes, which could take some time to settle in. Blocking all of this
time (milliseconds?) in EL3 (probably busy-waiting) might not be desirable.

> I have to admit that I have checked this solution only due to a lack of candidate for being SCP. But, I hope, that other ARM SoCs where dedicated SCP is present (asynchronous case) will work too, but with some limitations. The mailbox IPs for these ARM SoCs must have TX/RX-done irqs. I have described in the corresponding patches why this limitation is present.
> 
> To be honest I have Renesas R-Car Gen3 SoCs in mind as our nearest target, but I would like to make this solution as generic as possible. I don’t treat proposed solution as world-wide generic, but I hope, this solution may be suitable for other ARM SoCs which meet such requirements. Anyway, having something which works, but doesn’t cover all cases is better than having nothing.
> 
> I would like to notice that the patches are POC state and I post them just to illustrate in more detail of what I am talking about. Patch series consist of the following parts:
> 1. GL’s patches which make ACPI specific CPUFreq stuff more generic. Although these patches has been already acked by Xen community and the CPUFreq code base hasn’t changed in a last few years I drop all A-b.
> 2. A bunch of device-tree helpers and macros.
> 3. Direct ported SCPI protocol, mailbox infrastructure and the ARM SMC triggered mailbox driver. All components except mailbox driver are in mainline Linux.

Why do you actually need this mailbox framework? Actually I just
proposed the SMC driver to make it fit into the Linux framework. All we
actually need for SCPI is to write a simple command into some memory and
"press a button". I don't see a need to import the whole Linux
framework, especially as our mailbox usage is actually just a corner
case of the mailbox's capability (namely a "single-bit" doorbell).
The SMC use case is trivial to implement, and I believe using the Juno
mailbox is similarly simple, for instance.
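
To illustrate the "write a command, press a button" idea, here is a
minimal C sketch. It is not taken from the patches or from the SCPI
specification: the demo_shmem layout, DEMO_CMD_SET_DVFS, the
demo_doorbell_fn type and demo_fake_doorbell() are all hypothetical
names made up for this example. A real transport would issue a single
SMC (or write one mailbox register) where the fake doorbell is called.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical command ID, NOT the real SCPI encoding. */
#define DEMO_CMD_SET_DVFS 0x1u

/* Hypothetical shared-memory layout: command word, status, small payload. */
struct demo_shmem {
    uint32_t command;
    uint32_t status;
    uint8_t  payload[32];
};

/* Hypothetical doorbell: one SMC or one register write in a real port. */
typedef int (*demo_doorbell_fn)(struct demo_shmem *shmem);

/* Write the command and payload into shared memory, then ring the doorbell. */
static int demo_scpi_send(struct demo_shmem *shmem, demo_doorbell_fn ring,
                          uint32_t cmd, const void *data, size_t len)
{
    assert(shmem != NULL && ring != NULL);

    if (len > sizeof(shmem->payload))
        return -1;              /* payload does not fit, nothing is sent */

    shmem->command = cmd;
    if (len)
        memcpy(shmem->payload, data, len);

    return ring(shmem);         /* synchronous: returns when firmware is done */
}

/* Stand-in for the firmware side: counts rings and reports success. */
static unsigned int demo_doorbell_count;

static int demo_fake_doorbell(struct demo_shmem *shmem)
{
    demo_doorbell_count++;
    shmem->status = 0;
    return 0;
}
```

A caller would fill in a one-byte OPP index and call
demo_scpi_send(&shmem, demo_fake_doorbell, DEMO_CMD_SET_DVFS, &opp, 1);
everything else the mailbox framework offers is not needed for this
corner case.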


So to summarize I think we need to agree on those general questions:
1) Shall the Xen hypervisor actually be involved in CPUFreq at all? Can
this be left to corner-cases like pinned CPUs/guests, where guest
requests are passed on to the hardware?
2) Is EL3/ATF providing SCPI services something we can build on?
Normally I would expect we write drivers to match existing firmware.
3) When we go this way, do we really need to port all of the Linux
drivers and their framework to Xen? Can't we get away with much simpler
solutions? In the end all the SMC mailbox driver does is trigger a
single SMC call, embedded in a lot of glorious Linux boilerplate code.

What I was *actually* thinking of when using the SMC mailbox approach is
the ability to provide *virtual* SCPI services to guests, in a generic,
not-SoC-specific way. The proposed SMC mailbox binding allows using
*hvc* calls to trigger services, so Xen could pick up DVFS requests from
guests in a generic way and act upon them.
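
A rough C sketch of what "pick up DVFS requests from guests and act
upon them" could look like on the hypervisor side. This is only an
illustration under assumptions: the demo_* names are hypothetical, and
the "highest request wins" policy is one possible choice that nothing
in this thread prescribes.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_GUESTS 8

/* Hypothetical per-frequency-domain state kept by the hypervisor. */
struct demo_dvfs_arbiter {
    unsigned int max_opp;                     /* highest OPP index offered */
    unsigned int requested[DEMO_MAX_GUESTS];  /* last request per guest */
};

/*
 * Record a guest's OPP request, e.g. one decoded from a trapped hvc.
 * Requests above max_opp are clamped; unknown guests are rejected.
 */
static int demo_guest_request(struct demo_dvfs_arbiter *a, size_t guest,
                              unsigned int opp)
{
    assert(a != NULL);

    if (guest >= DEMO_MAX_GUESTS)
        return -1;

    a->requested[guest] = opp > a->max_opp ? a->max_opp : opp;
    return 0;
}

/* Assumed policy: run the domain at the highest OPP any guest asked for. */
static unsigned int demo_pick_opp(const struct demo_dvfs_arbiter *a)
{
    unsigned int best = 0;

    assert(a != NULL);

    for (size_t i = 0; i < DEMO_MAX_GUESTS; i++)
        if (a->requested[i] > best)
            best = a->requested[i];

    return best;
}
```

The hypervisor stays free to ignore or override what demo_pick_opp()
returns, for instance for thermal reasons; the guests only provide
input.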

Cheers,
Andre.

> 4. Xen changes to direct ported code for making it compilable. These changes don’t change functionality.
> 5. Some modification to direct ported code which slightly change functionality, I would say to restrict it.
> 6. SCPI based CPUFreq driver and CPUFreq interface component.
> 7. Misc patches mostly to ARM subsystem.
> 8. Patch from Volodymyr Babchuk which adds SMC wrapper.
> 
> Most important TODOs regarding the whole patch series:
> 1. Handle devm in the direct ported code. Currently, in case of any errors previously allocated resources are left unfreed.
> 2. Thermal management integration.
> 3. Don't pass CPUFreq related nodes to dom0. Xen owns SCPI completely.
> 4. Handle CPU_TURBO frequencies if they are supported by HW.
> 
> You can find the whole patch series here:
> repo: https://github.com/otyshchenko1/xen.git branch: cpufreq-devel1
> 
> P.S. There is no need to modify xenpm tool. It works out of the box on ARM.
> 
> [1]
> Linux code:
> http://elixir.free-electrons.com/linux/v4.14-rc6/source/drivers/firmware/arm_scpi.c
> http://elixir.free-electrons.com/linux/v4.14-rc6/source/include/linux/scpi_protocol.h
> http://elixir.free-electrons.com/linux/v4.14-rc6/source/Documentation/devicetree/bindings/arm/arm,scpi.txt
> 
> Recent protocol version:
> http://infocenter.arm.com/help/topic/com.arm.doc.dui0922g/scp_message_interface_v1_2_DUI0922G_en.pdf
> 
> [2]
> http://infocenter.arm.com/help/topic/com.arm.doc.den0056a/DEN0056A_System_Control_and_Management_Interface.pdf
> 
> [3]
> Xen part:
> https://lists.xen.org/archives/html/xen-devel/2014-11/msg00940.html
> Linux part:
> https://lists.xen.org/archives/html/xen-devel/2014-11/msg00944.html
> 
> [4]
> http://linux-sunxi.narkive.com/qYWJqjXU/patch-v2-0-3-mailbox-arm-introduce-smc-triggered-mailbox
> 
> [5]
> http://infocenter.arm.com/help/topic/com.arm.doc.den0028b/ARM_DEN0028B_SMC_Calling_Convention.pdf
> 
> Oleksandr Dmytryshyn (6):
>   cpufreq: move cpufreq.h file to the xen/include/xen location
>   pm: move processor_perf.h file to the xen/include/xen location
>   pmstat: move pmstat.c file to the xen/drivers/pm/stat.c location
>   cpufreq: make turbo settings to be configurable
>   pmstat: make pmstat functions more generalizable
>   cpufreq: make cpufreq driver more generalizable
> 
> Oleksandr Tyshchenko (24):
>   xenpm: Clarify xenpm usage
>   xen/device-tree: Add dt_count_phandle_with_args helper
>   xen/device-tree: Add dt_property_for_each_string macros
>   xen/device-tree: Add dt_property_read_u32_index helper
>   xen/device-tree: Add dt_property_count_elems_of_size helper
>   xen/device-tree: Add dt_property_read_string_helper and friends
>   xen/arm: Add driver_data field to struct device
>   xen/arm: Add DEVICE_MAILBOX device class
>   xen/arm: Store device-tree node per cpu
>   xen/arm: Add ARM System Control and Power Interface (SCPI) protocol
>   xen/arm: Add mailbox infrastructure
>   xen/arm: Introduce ARM SMC based mailbox
>   xen/arm: Add common header file wrappers.h
>   xen/arm: Add rxdone_auto flag to mbox_controller structure
>   xen/arm: Add Xen changes to SCPI protocol
>   xen/arm: Add Xen changes to mailbox infrastructure
>   xen/arm: Add Xen changes to ARM SMC based mailbox
>   xen/arm: Use non-blocking mode for SCPI protocol
>   xen/arm: Don't set txdone_poll flag for ARM SMC mailbox
>   cpufreq: hack: perf->states isn't a real guest handle on ARM
>   xen/arm: Introduce SCPI based CPUFreq driver
>   xen/arm: Introduce CPUFreq Interface component
>   xen/arm: Build CPUFreq components
>   xen/arm: Enable CPUFreq on ARM
> 
> Volodymyr Babchuk (1):
>   arm: add SMC wrapper that is compatible with SMCCC
> 
>  MAINTAINERS                                  |    4 +-
>  tools/misc/xenpm.c                           |    6 +-
>  xen/arch/arm/Kconfig                         |    2 +
>  xen/arch/arm/Makefile                        |    1 +
>  xen/arch/arm/arm32/Makefile                  |    1 +
>  xen/arch/arm/arm32/smc.S                     |   32 +
>  xen/arch/arm/arm64/Makefile                  |    1 +
>  xen/arch/arm/arm64/smc.S                     |   29 +
>  xen/arch/arm/cpufreq/Makefile                |    5 +
>  xen/arch/arm/cpufreq/arm-smc-mailbox.c       |  248 ++++++
>  xen/arch/arm/cpufreq/arm_scpi.c              | 1191 ++++++++++++++++++++++++++
>  xen/arch/arm/cpufreq/cpufreq_if.c            |  522 +++++++++++
>  xen/arch/arm/cpufreq/mailbox.c               |  562 ++++++++++++
>  xen/arch/arm/cpufreq/mailbox.h               |   28 +
>  xen/arch/arm/cpufreq/mailbox_client.h        |   69 ++
>  xen/arch/arm/cpufreq/mailbox_controller.h    |  161 ++++
>  xen/arch/arm/cpufreq/scpi_cpufreq.c          |  328 +++++++
>  xen/arch/arm/cpufreq/scpi_protocol.h         |  116 +++
>  xen/arch/arm/cpufreq/wrappers.h              |  239 ++++++
>  xen/arch/arm/smpboot.c                       |    5 +
>  xen/arch/x86/Kconfig                         |    2 +
>  xen/arch/x86/acpi/cpu_idle.c                 |    2 +-
>  xen/arch/x86/acpi/cpufreq/cpufreq.c          |    2 +-
>  xen/arch/x86/acpi/cpufreq/powernow.c         |    2 +-
>  xen/arch/x86/acpi/power.c                    |    2 +-
>  xen/arch/x86/cpu/mwait-idle.c                |    2 +-
>  xen/arch/x86/platform_hypercall.c            |    2 +-
>  xen/common/device_tree.c                     |  124 +++
>  xen/common/sysctl.c                          |    2 +-
>  xen/drivers/Kconfig                          |    2 +
>  xen/drivers/Makefile                         |    1 +
>  xen/drivers/acpi/Makefile                    |    1 -
>  xen/drivers/acpi/pmstat.c                    |  526 ------------
>  xen/drivers/cpufreq/Kconfig                  |    3 +
>  xen/drivers/cpufreq/cpufreq.c                |  102 ++-
>  xen/drivers/cpufreq/cpufreq_misc_governors.c |    2 +-
>  xen/drivers/cpufreq/cpufreq_ondemand.c       |    4 +-
>  xen/drivers/cpufreq/utility.c                |   13 +-
>  xen/drivers/pm/Kconfig                       |    3 +
>  xen/drivers/pm/Makefile                      |    1 +
>  xen/drivers/pm/stat.c                        |  538 ++++++++++++
>  xen/include/acpi/cpufreq/cpufreq.h           |  245 ------
>  xen/include/acpi/cpufreq/processor_perf.h    |   63 --
>  xen/include/asm-arm/device.h                 |    2 +
>  xen/include/asm-arm/processor.h              |    4 +
>  xen/include/public/platform.h                |    1 +
>  xen/include/xen/cpufreq.h                    |  254 ++++++
>  xen/include/xen/device_tree.h                |  158 ++++
>  xen/include/xen/pmstat.h                     |    2 +
>  xen/include/xen/processor_perf.h             |   69 ++
>  50 files changed, 4822 insertions(+), 862 deletions(-)
>  create mode 100644 xen/arch/arm/arm32/smc.S
>  create mode 100644 xen/arch/arm/arm64/smc.S
>  create mode 100644 xen/arch/arm/cpufreq/Makefile
>  create mode 100644 xen/arch/arm/cpufreq/arm-smc-mailbox.c
>  create mode 100644 xen/arch/arm/cpufreq/arm_scpi.c
>  create mode 100644 xen/arch/arm/cpufreq/cpufreq_if.c
>  create mode 100644 xen/arch/arm/cpufreq/mailbox.c
>  create mode 100644 xen/arch/arm/cpufreq/mailbox.h
>  create mode 100644 xen/arch/arm/cpufreq/mailbox_client.h
>  create mode 100644 xen/arch/arm/cpufreq/mailbox_controller.h
>  create mode 100644 xen/arch/arm/cpufreq/scpi_cpufreq.c
>  create mode 100644 xen/arch/arm/cpufreq/scpi_protocol.h
>  create mode 100644 xen/arch/arm/cpufreq/wrappers.h
>  delete mode 100644 xen/drivers/acpi/pmstat.c
>  create mode 100644 xen/drivers/pm/Kconfig
>  create mode 100644 xen/drivers/pm/Makefile
>  create mode 100644 xen/drivers/pm/stat.c
>  delete mode 100644 xen/include/acpi/cpufreq/cpufreq.h
>  delete mode 100644 xen/include/acpi/cpufreq/processor_perf.h
>  create mode 100644 xen/include/xen/cpufreq.h
>  create mode 100644 xen/include/xen/processor_perf.h
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


* Re: [RFC PATCH 00/31] CPUFreq on ARM
  2017-11-09 17:18 ` [RFC PATCH 00/31] " Andrii Anisov
@ 2017-11-13 19:40   ` Oleksandr Tyshchenko
  0 siblings, 0 replies; 108+ messages in thread
From: Oleksandr Tyshchenko @ 2017-11-13 19:40 UTC (permalink / raw)
  To: Andrii Anisov
  Cc: Edgar E . Iglesias, Stefano Stabellini, Oleksandr Tyshchenko,
	Andrew Cooper, Julien Grall, Andre Przywara, Jassi Brar,
	Jan Beulich, Sudeep Holla, xen-devel

On Thu, Nov 9, 2017 at 7:18 PM, Andrii Anisov <andrii_anisov@epam.com> wrote:
> Dear Oleksandr,
Dear Andrii

>
>
> Please consider my `Reviewed-by: Andrii Anisov <andrii_anisov@epam.com>` for
> all patches.
>
> What you missed after extracting this stuff from github.
Thanks. I will add.

>
>
> On 09.11.17 19:09, Oleksandr Tyshchenko wrote:
>>
>> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>
>
> --
>
> *Andrii Anisov*
>
>



-- 
Regards,

Oleksandr Tyshchenko


* Re: [RFC PATCH 00/31] CPUFreq on ARM
  2017-11-13 15:21 ` Andre Przywara
@ 2017-11-13 19:40   ` Oleksandr Tyshchenko
  2017-11-14 10:49     ` Andre Przywara
  0 siblings, 1 reply; 108+ messages in thread
From: Oleksandr Tyshchenko @ 2017-11-13 19:40 UTC (permalink / raw)
  To: Andre Przywara
  Cc: Edgar E . Iglesias, Stefano Stabellini, Jassi Brar,
	Andrew Cooper, Julien Grall, Oleksandr Tyshchenko, Jan Beulich,
	Sudeep Holla, xen-devel

On Mon, Nov 13, 2017 at 5:21 PM, Andre Przywara
<andre.przywara@linaro.org> wrote:
> Hi,
Hi Andre

>
> thanks very much for your work on this!
Thank you for your comments.

>
> On 09/11/17 17:09, Oleksandr Tyshchenko wrote:
>> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>>
>> Hi, all.
>>
>> The purpose of this RFC patch series is to add CPUFreq support to Xen on ARM.
>> Motivation of hypervisor based CPUFreq is to enable one of the main PM use-cases in virtualized system powered by Xen hypervisor. Rationale behind this activity is that CPU virtualization is done by hypervisor and the guest OS doesn't actually know anything about physical CPUs because it is running on virtual CPUs. It is quite clear that a decision about frequency change should be taken by hypervisor as only it has information about actual CPU load.
>
> Can you please sketch your usage scenario or workloads here? I can think
> of quite different scenarios (oversubscribed server vs. partitioning
> RTOS guests, for instance). The usefulness of CPUFreq and the trade-offs
> in the design are quite different between those.
We have embedded use-cases in mind. For example, a system with several
domains, where one domain runs the most critical SW and the other
domain(s) are, let's say, for entertainment purposes.
I think CPUFreq is useful wherever power consumption matters.

>
> In general I doubt that a hypervisor scheduling vCPUs is in a good
> position to make a decision on the proper frequency physical CPUs should
> run with. From all I know it's already hard for an OS kernel to make
> that call. So I would actually expect that guests provide some input,
> for instance by signalling OPP change request up to the hypervisor. This
> could then decide to act on it - or not.
Each running guest sees only part of the picture, but the hypervisor
has the whole picture: it knows all about the CPUs, measures CPU load
and is able to choose the required CPU frequency. I am wondering, does
Xen need additional input from guests to make a decision?
BTW, currently a guest domain on ARM doesn't even know how many physical
CPUs the system has or what their OPPs are. When creating a guest
domain Xen inserts only dummy CPU nodes. CPU info such as clocks,
OPPs, thermal, etc. is not passed to the guest.

>
>> Although these required components (CPUFreq core, governors, etc) already exist in Xen, it is worth to mention that they are ACPI specific. So, a part of the current patch series makes them more generic in order to make possible a CPUFreq usage on architectures without ACPI support in.
>
> Have you looked at how this is used on x86 these days? Can you briefly
> describe how this works and it's used there?
Xen supports the CPUFreq feature on x86 [1]. I don't know how widely it
is used at the moment, but that is another question. There are two
possible modes: Domain0 based CPUFreq and Hypervisor based CPUFreq
[2]. As I understand, the second option is more popular.
Two different implementations of "Hypervisor based CPUFreq" are
present: ACPI Processor P-States Driver and AMD Architectural P-state
Driver. You can find both of them in the xen/arch/x86/acpi/cpufreq/ dir.

[1] https://wiki.xenproject.org/wiki/Xen_power_management#CPU_P-states_.28cpufreq.29
[2] https://wiki.xenproject.org/wiki/Xen_power_management#Hypervisor_based_cpufreq

>
>> But, the main question we have to answer is about frequency changing interface in virtualized system. The frequency changing interface and all dependent components which needed CPUFreq to be functional on ARM are not present in Xen these days. The list of required components is quite big and may change across different ARM SoC vendors. As an example, the following components are involved in DVFS on Renesas Salvator-X board which has R-Car Gen3 SoC installed: generic clock, regulator and thermal frameworks, Vendor’s CPG, PMIC, AVS, THS drivers, i2c support, etc.
>>
>> We were considering a few possible approaches of hypervisor based CPUFreqs on ARM and came to conclusion to base this solution on popular at the moment, already upstreamed to Linux, ARM System Control and Power Interface(SCPI) protocol [1]. We chose SCPI protocol instead of newer ARM System Control and Management Interface (SCMI) protocol [2] since it is widely spread in Linux, there are good examples how to use it, the range of capabilities it has is enough for implementing hypervisor based CPUFreq and, what is more, upstream Linux support for SCMI is missed so far, but SCMI could be used as well.
>>
>> Briefly speaking, the SCPI protocol is used between the System Control Processor(SCP) and the Application Processors(AP). The mailbox feature provides a mechanism for inter-processor communication between SCP and AP. The main purpose of SCP is to offload different PM related tasks from AP and one of the services that SCP provides is Dynamic voltage and frequency scaling (DVFS), it is what we actually need for CPUFreq. I will describe this approach in details down the text.
>>
>> Let me explain a bit more what these possible approaches are:
>>
>> 1. “Xen+hwdom” solution.
>> GlobalLogic team proposed split model [3], where “hwdom-cpufreq” frontend driver in Xen interacts with the “xen-cpufreq” backend driver in Linux hwdom (possibly dom0) in order to scale physical CPUs. This solution hasn’t been accepted by Xen community yet and seems it is not going to be accepted without taking into the account still unanswered major questions and proving that “all-in-Xen” solution, which Xen community considered as more architecturally cleaner option, would be unworkable in practice.
>> The other reasons why we decided not to stick to this approach are complex communication interface between Xen and hwdom: event channel, hypercalls, syscalls, passing CPU info via DT, etc and possible synchronization issues with a proposed solution.
>> Although it is worth to mention that the beauty of this approach was that there wouldn’t be a need to port a lot of things to Xen. All frequency changing interface and all dependent components which needed CPUFreq to be functional were already in place.
>
> Stefano, Julien and I were thinking about this: Wouldn't it be possible
> to come up with some hardware domain, solely dealing with CPUFreq
> changes? This could run a Linux kernel, but no or very little userland.
> All its vCPUs would be pinned to pCPUs and would normally not be
> scheduled by Xen. If Xen wants to change the frequency, it schedules the
> respective vCPU to the right pCPU and passes down the frequency change
> request. Sounds a bit involved, though, and probably doesn't solve the
> problem where this domain needs to share access to hardware with Dom0
> (clocks come to mind).
Yes, another question is how to get this Linux kernel stuff (backend,
top level driver, etc) upstreamed.

>
>> Although this approach is not used, still I picked a few already acked patches which made ACPI specific CPUFreq stuff more generic.
>>
>> 2. “all-in-Xen” solution.
>> This implies that all CPUFreq related stuff should be located in Xen.
>> Community considered this solution as more architecturally cleaner option than “Xen+hwdom” one. No layering violation comparing with the previous approach (letting guest OS manage one or more physical CPUs is more of a layering violation).
>> This solution looks better, but to be honest, we are not in favor of this solution as well. We expect enormous developing effort to get this support in (the scope of required components looks unreal) and maintain it. So, we decided not to stick to this approach as well.
>
> Yes, I even think it's not feasible to implement this. With a modern
> clock implementation there is one driver to control *all* clocks of an
> SoC, so you can't single out the CPU clock easily, for instance. One
> would probably run into synchronisation issues, at best.
>
>> 3. “Xen+SCP(ARM TF)” solution.
>> It is yet another solution based on ARM SCPI protocol. The generic idea here is that there is a firmware, which being a server runs on some dedicated IP core (server), provides different PM services (DVFS, sensors, etc). On the other side there is a CPUFreq driver in Xen, which is running on the AP (client), consumes these services. CPUFreq driver neither changes the CPU frequency/voltage by itself nor cooperates with Linux in order to do such job. It just communicates with SCP directly using SCPI protocol. As I said before, some integrated into a SoC mailbox IP need to be used for IPC (doorbell for triggering action and shared memory region for commands). CPUFreq driver doesn’t even need to know what should be physically changed for the new frequency to take effect. It is a certainly SCP’s responsibility. This all avoid CPUFreq infrastructure in Xen on ARM from diving into each supported SoC internals and as the result having a lot of code.
>>
>> The possible issue here could be in SCP, the problem is that some dedicated IP core may be absent at all or performs other than PM tasks. Fortunately, there is a brilliant solution to teach firmware running in the EL3 exception level (ARM TF) to perform SCP functions and use SMC calls for communications [4]. Exactly this transport implementation I want to bring to Xen the first. Such solution is going to be generic across all ARM platforms that do have firmware running in the EL3 exception level and don’t have candidate for being SCP.
>
> While I feel flattered that you like that idea as well ;-), you should
> mention that this requires actual firmware providing those services.
Yes, some firmware which provides these services must be present
on the other end.
In the common case it is firmware which runs on the dedicated IP core(s).
In this particular case it is firmware which runs on the same core(s)
as the hypervisor.

> I
> am not sure there is actually *any* implementation of this at the
> moment, apart from my PoC code for Allwinner.
Your PoC is a good example for writing the firmware side. So, why not
use it as a base for other platforms?

> And from a Xen point of view I am not sure we are in the position to
> force users to use this firmware. This may be feasible in a classic
> embedded scenario, where both firmware and software are provided by the
> same entity, but that should be clearly noted as a restriction.
Agree.

>
>> Here we have completely synchronous case because of SMC calls nature. SMC triggered mailbox driver emulates a mailbox which signals transmitted data via Secure Monitor Call (SMC) instruction [5]. The mailbox receiver is implemented in firmware and synchronously returns data when it returns execution to the non-secure world again. This would allow us both to trigger a request and transfer execution to the firmware code in a safe and architected way. Like PSCI requests.
>> As you can see this method is free from synchronization issues. What is more, this solution is more architecturally cleaner solution than split model “Xen+hwdom” one. From the security point of view, I hope, everything will be much more correct since the ARM TF, which we want to see in charge of controlling CPU frequency/voltage, is a trusted SW layer. Moreover, ARM TF is responsible for enabling/disabling CPU (PSCI) and nobody complains about it, so let it do DVFS too.
>
> It should be noted that this synchronous nature of the communication can
> actually be a problem: a DVFS request usually involves regulator and PLL
> changes, which could take some time to settle in. Blocking all of this
> time (milliseconds?) in EL3 (probably busy-waiting) might not be desirable.
Agree. I haven't measured the time yet to say how long it is, since I
don't have working firmware at the moment, just an emulator,
but, yes, it will definitely take some time. The whole system won't be
blocked, only the CPU which performs the SMC call.
But if we ask hwdom to change the frequency, won't we wait too? And if
Xen manages the PLL/regulator by itself, won't it wait anyway?

>
>> I have to admit that I have checked this solution only due to a lack of candidate for being SCP. But, I hope, that other ARM SoCs where dedicated SCP is present (asynchronous case) will work too, but with some limitations. The mailbox IPs for these ARM SoCs must have TX/RX-done irqs. I have described in the corresponding patches why this limitation is present.
>>
>> To be honest I have Renesas R-Car Gen3 SoCs in mind as our nearest target, but I would like to make this solution as generic as possible. I don’t treat proposed solution as world-wide generic, but I hope, this solution may be suitable for other ARM SoCs which meet such requirements. Anyway, having something which works, but doesn’t cover all cases is better than having nothing.
>>
>> I would like to notice that the patches are POC state and I post them just to illustrate in more detail of what I am talking about. Patch series consist of the following parts:
>> 1. GL’s patches which make ACPI specific CPUFreq stuff more generic. Although these patches has been already acked by Xen community and the CPUFreq code base hasn’t changed in a last few years I drop all A-b.
>> 2. A bunch of device-tree helpers and macros.
>> 3. Direct ported SCPI protocol, mailbox infrastructure and the ARM SMC triggered mailbox driver. All components except mailbox driver are in mainline Linux.
>
> Why do you actually need this mailbox framework? Actually I just
> proposed the SMC driver the make it fit into the Linux framework. All we
> actually need for SCPI is to write a simple command into some memory and
> "press a button". I don't see a need to import the whole Linux
> framework, especially as our mailbox usage is actually just a corner
> case of the mailbox's capability (namely a "single-bit" doorbell).
> The SMC use case is trivial to implement, and I believe using the Juno
> mailbox is similarly simple, for instance.
I did a direct port of the SCPI protocol. I think it is something that
should be retained as much as possible.
The protocol relies on the mailbox feature, so I ported the mailbox too.
It would be much easier for me to just add handling for the few required
commands and issue the SMC call directly, without any mailbox
infrastructure involved.
But I want to show what is going on and where these things come from.

What is more, I don't want to restrict the usage of this CPUFreq to a
single scenario where the firmware which provides the DVFS service is in
ARM TF. I hope that this solution will also be suitable for ARM SoCs
where a standalone SCP is present and a real mailbox IP, which is
asynchronous in nature, is used for IPC. Of course, this mailbox must
have TX/RX-done irqs.
This is a limitation at the moment.

>
>
> So to summarize I think we need to agree on those general questions:
> 1) Shall the Xen hypervisor actually be involved in CPUFreq at all? Can
> this be left to corner-cases like pinned CPUs/guests, where guests
> requests are passed on to the hardware?
> 2) Is EL3/ATF providing SCPI services something we can build on?
> Normally I would expect we write drivers to match existing firmware.
> 3) When we go this way, do we really need to port all of the Linux
> drivers and its framework to Xen? Can't we get away with much simpler
> solutions? In the end all the SMC mailbox driver does it to trigger an
> single SMC call, embedded in a lot of glorious Linux boiler plate code.
>
> What I was *actually* thinking of when using the SMC mailbox approach is
> the ability to provide *virtual* SCPI services to guest, in a generic,
> not-SoC-specific way. The proposed SMC mailbox binding allows using
> *hvc* calls to trigger services, so Xen could pick up DVFS requests from
> guests in a generic way and act upon them.
>
> Cheers,
> Andre.
>
>> 4. Xen changes to direct ported code for making it compilable. These changes don’t change functionality.
>> 5. Some modification to direct ported code which slightly change functionality, I would say to restrict it.
>> 6. SCPI based CPUFreq driver and CPUFreq interface component.
>> 7. Misc patches mostly to ARM subsystem.
>> 8. Patch from Volodymyr Babchuk which adds SMC wrapper.
>>
>> Most important TODOs regarding the whole patch series:
>> 1. Handle devm in the direct ported code. Currently, in case of any errors previously allocated resources are left unfreed.
>> 2. Thermal management integration.
>> 3. Don't pass CPUFreq related nodes to dom0. Xen owns SCPI completely.
>> 4. Handle CPU_TURBO frequencies if they are supported by HW.
>>
>> You can find the whole patch series here:
>> repo: https://github.com/otyshchenko1/xen.git branch: cpufreq-devel1
>>
>> P.S. There is no need to modify xenpm tool. It works out of the box on ARM.
>>
>> [1]
>> Linux code:
>> http://elixir.free-electrons.com/linux/v4.14-rc6/source/drivers/firmware/arm_scpi.c
>> http://elixir.free-electrons.com/linux/v4.14-rc6/source/include/linux/scpi_protocol.h
>> http://elixir.free-electrons.com/linux/v4.14-rc6/source/Documentation/devicetree/bindings/arm/arm,scpi.txt
>>
>> Recent protocol version:
>> http://infocenter.arm.com/help/topic/com.arm.doc.dui0922g/scp_message_interface_v1_2_DUI0922G_en.pdf
>>
>> [2]
>> Xen part:
>> https://lists.xen.org/archives/html/xen-devel/2014-11/msg00940.html
>> Linux part:
>> https://lists.xen.org/archives/html/xen-devel/2014-11/msg00944.html
>>
>> [3]
>> http://infocenter.arm.com/help/topic/com.arm.doc.den0056a/DEN0056A_System_Control_and_Management_Interface.pdf
>>
>> [4]
>> http://linux-sunxi.narkive.com/qYWJqjXU/patch-v2-0-3-mailbox-arm-introduce-smc-triggered-mailbox
>>
>> [5]
>> http://infocenter.arm.com/help/topic/com.arm.doc.den0028b/ARM_DEN0028B_SMC_Calling_Convention.pdf
>>
>> Oleksandr Dmytryshyn (6):
>>   cpufreq: move cpufreq.h file to the xen/include/xen location
>>   pm: move processor_perf.h file to the xen/include/xen location
>>   pmstat: move pmstat.c file to the xen/drivers/pm/stat.c location
>>   cpufreq: make turbo settings to be configurable
>>   pmstat: make pmstat functions more generalizable
>>   cpufreq: make cpufreq driver more generalizable
>>
>> Oleksandr Tyshchenko (24):
>>   xenpm: Clarify xenpm usage
>>   xen/device-tree: Add dt_count_phandle_with_args helper
>>   xen/device-tree: Add dt_property_for_each_string macros
>>   xen/device-tree: Add dt_property_read_u32_index helper
>>   xen/device-tree: Add dt_property_count_elems_of_size helper
>>   xen/device-tree: Add dt_property_read_string_helper and friends
>>   xen/arm: Add driver_data field to struct device
>>   xen/arm: Add DEVICE_MAILBOX device class
>>   xen/arm: Store device-tree node per cpu
>>   xen/arm: Add ARM System Control and Power Interface (SCPI) protocol
>>   xen/arm: Add mailbox infrastructure
>>   xen/arm: Introduce ARM SMC based mailbox
>>   xen/arm: Add common header file wrappers.h
>>   xen/arm: Add rxdone_auto flag to mbox_controller structure
>>   xen/arm: Add Xen changes to SCPI protocol
>>   xen/arm: Add Xen changes to mailbox infrastructure
>>   xen/arm: Add Xen changes to ARM SMC based mailbox
>>   xen/arm: Use non-blocking mode for SCPI protocol
>>   xen/arm: Don't set txdone_poll flag for ARM SMC mailbox
>>   cpufreq: hack: perf->states isn't a real guest handle on ARM
>>   xen/arm: Introduce SCPI based CPUFreq driver
>>   xen/arm: Introduce CPUFreq Interface component
>>   xen/arm: Build CPUFreq components
>>   xen/arm: Enable CPUFreq on ARM
>>
>> Volodymyr Babchuk (1):
>>   arm: add SMC wrapper that is compatible with SMCCC
>>
>>  MAINTAINERS                                  |    4 +-
>>  tools/misc/xenpm.c                           |    6 +-
>>  xen/arch/arm/Kconfig                         |    2 +
>>  xen/arch/arm/Makefile                        |    1 +
>>  xen/arch/arm/arm32/Makefile                  |    1 +
>>  xen/arch/arm/arm32/smc.S                     |   32 +
>>  xen/arch/arm/arm64/Makefile                  |    1 +
>>  xen/arch/arm/arm64/smc.S                     |   29 +
>>  xen/arch/arm/cpufreq/Makefile                |    5 +
>>  xen/arch/arm/cpufreq/arm-smc-mailbox.c       |  248 ++++++
>>  xen/arch/arm/cpufreq/arm_scpi.c              | 1191 ++++++++++++++++++++++++++
>>  xen/arch/arm/cpufreq/cpufreq_if.c            |  522 +++++++++++
>>  xen/arch/arm/cpufreq/mailbox.c               |  562 ++++++++++++
>>  xen/arch/arm/cpufreq/mailbox.h               |   28 +
>>  xen/arch/arm/cpufreq/mailbox_client.h        |   69 ++
>>  xen/arch/arm/cpufreq/mailbox_controller.h    |  161 ++++
>>  xen/arch/arm/cpufreq/scpi_cpufreq.c          |  328 +++++++
>>  xen/arch/arm/cpufreq/scpi_protocol.h         |  116 +++
>>  xen/arch/arm/cpufreq/wrappers.h              |  239 ++++++
>>  xen/arch/arm/smpboot.c                       |    5 +
>>  xen/arch/x86/Kconfig                         |    2 +
>>  xen/arch/x86/acpi/cpu_idle.c                 |    2 +-
>>  xen/arch/x86/acpi/cpufreq/cpufreq.c          |    2 +-
>>  xen/arch/x86/acpi/cpufreq/powernow.c         |    2 +-
>>  xen/arch/x86/acpi/power.c                    |    2 +-
>>  xen/arch/x86/cpu/mwait-idle.c                |    2 +-
>>  xen/arch/x86/platform_hypercall.c            |    2 +-
>>  xen/common/device_tree.c                     |  124 +++
>>  xen/common/sysctl.c                          |    2 +-
>>  xen/drivers/Kconfig                          |    2 +
>>  xen/drivers/Makefile                         |    1 +
>>  xen/drivers/acpi/Makefile                    |    1 -
>>  xen/drivers/acpi/pmstat.c                    |  526 ------------
>>  xen/drivers/cpufreq/Kconfig                  |    3 +
>>  xen/drivers/cpufreq/cpufreq.c                |  102 ++-
>>  xen/drivers/cpufreq/cpufreq_misc_governors.c |    2 +-
>>  xen/drivers/cpufreq/cpufreq_ondemand.c       |    4 +-
>>  xen/drivers/cpufreq/utility.c                |   13 +-
>>  xen/drivers/pm/Kconfig                       |    3 +
>>  xen/drivers/pm/Makefile                      |    1 +
>>  xen/drivers/pm/stat.c                        |  538 ++++++++++++
>>  xen/include/acpi/cpufreq/cpufreq.h           |  245 ------
>>  xen/include/acpi/cpufreq/processor_perf.h    |   63 --
>>  xen/include/asm-arm/device.h                 |    2 +
>>  xen/include/asm-arm/processor.h              |    4 +
>>  xen/include/public/platform.h                |    1 +
>>  xen/include/xen/cpufreq.h                    |  254 ++++++
>>  xen/include/xen/device_tree.h                |  158 ++++
>>  xen/include/xen/pmstat.h                     |    2 +
>>  xen/include/xen/processor_perf.h             |   69 ++
>>  50 files changed, 4822 insertions(+), 862 deletions(-)
>>  create mode 100644 xen/arch/arm/arm32/smc.S
>>  create mode 100644 xen/arch/arm/arm64/smc.S
>>  create mode 100644 xen/arch/arm/cpufreq/Makefile
>>  create mode 100644 xen/arch/arm/cpufreq/arm-smc-mailbox.c
>>  create mode 100644 xen/arch/arm/cpufreq/arm_scpi.c
>>  create mode 100644 xen/arch/arm/cpufreq/cpufreq_if.c
>>  create mode 100644 xen/arch/arm/cpufreq/mailbox.c
>>  create mode 100644 xen/arch/arm/cpufreq/mailbox.h
>>  create mode 100644 xen/arch/arm/cpufreq/mailbox_client.h
>>  create mode 100644 xen/arch/arm/cpufreq/mailbox_controller.h
>>  create mode 100644 xen/arch/arm/cpufreq/scpi_cpufreq.c
>>  create mode 100644 xen/arch/arm/cpufreq/scpi_protocol.h
>>  create mode 100644 xen/arch/arm/cpufreq/wrappers.h
>>  delete mode 100644 xen/drivers/acpi/pmstat.c
>>  create mode 100644 xen/drivers/pm/Kconfig
>>  create mode 100644 xen/drivers/pm/Makefile
>>  create mode 100644 xen/drivers/pm/stat.c
>>  delete mode 100644 xen/include/acpi/cpufreq/cpufreq.h
>>  delete mode 100644 xen/include/acpi/cpufreq/processor_perf.h
>>  create mode 100644 xen/include/xen/cpufreq.h
>>  create mode 100644 xen/include/xen/processor_perf.h
>>


-- 
Regards,

Oleksandr Tyshchenko


* Re: [RFC PATCH 00/31] CPUFreq on ARM
  2017-11-13 19:40   ` Oleksandr Tyshchenko
@ 2017-11-14 10:49     ` Andre Przywara
  2017-11-14 20:46       ` Oleksandr Tyshchenko
  0 siblings, 1 reply; 108+ messages in thread
From: Andre Przywara @ 2017-11-14 10:49 UTC (permalink / raw)
  To: Oleksandr Tyshchenko
  Cc: Edgar E . Iglesias, Stefano Stabellini, Jassi Brar,
	Andrew Cooper, Julien Grall, Oleksandr Tyshchenko, Jan Beulich,
	Sudeep Holla, xen-devel

Hi,

On 13/11/17 19:40, Oleksandr Tyshchenko wrote:
> On Mon, Nov 13, 2017 at 5:21 PM, Andre Przywara
> <andre.przywara@linaro.org> wrote:
>> Hi,
> Hi Andre
> 
>>
>> thanks very much for your work on this!
> Thank you for your comments.
> 
>>
>> On 09/11/17 17:09, Oleksandr Tyshchenko wrote:
>>> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>>>
>>> Hi, all.
>>>
>>> The purpose of this RFC patch series is to add CPUFreq support to Xen on ARM.
>>> Motivation of hypervisor based CPUFreq is to enable one of the main PM use-cases in virtualized system powered by Xen hypervisor. Rationale behind this activity is that CPU virtualization is done by hypervisor and the guest OS doesn't actually know anything about physical CPUs because it is running on virtual CPUs. It is quite clear that a decision about frequency change should be taken by hypervisor as only it has information about actual CPU load.
>>
>> Can you please sketch your usage scenario or workloads here? I can think
>> of quite different scenarios (oversubscribed server vs. partitioning
>> RTOS guests, for instance). The usefulness of CPUFreq and the trade-offs
>> in the design are quite different between those.
> We keep embedded use-cases in mind. For example, it is a system with
> several domains,
> where one domain has most critical SW running on and other domain(s)
> are, let say, for entertainment purposes.
> I think, the CPUFreq is useful where power consumption is a question.

Does the SoC you use allow different frequencies for each core? Or is it
one frequency for all cores? Most x86 CPUs allow different frequencies
for each core, AFAIK. Just having the same OPP for the whole SoC might
limit the usefulness of this approach in general.

>> In general I doubt that a hypervisor scheduling vCPUs is in a good
>> position to make a decision on the proper frequency physical CPUs should
>> run with. From all I know it's already hard for an OS kernel to make
>> that call. So I would actually expect that guests provide some input,
>> for instance by signalling OPP change request up to the hypervisor. This
>> could then decide to act on it - or not.
> Each running guest sees only part of the picture, but hypervisor has
> the whole picture, it knows all about CPU, measures CPU load and able
> to choose required CPU frequency to run on.

But based on what data? All Xen sees is a vCPU trapping on MMIO, a
hypercall or on WFI, for that matter. It does not know much more about
the guest, especially it's rather clueless about what the guest OS
actually intended to do.
For instance, Linux can track the actual utilization of a core by keeping
statistics of runnable processes and monitoring their time slice usage.
It can see that a certain process exhibits periodic but bursty CPU
usage, which may hint that it could run at a lower frequency. Xen does
not see this fine-grained information.

> I am wondering, does Xen
> need additional input from guests for make a decision?

I very much believe so. The guest OS is in a much better position to
make that call.

> BTW, currently guest domain on ARM doesn't even know how many physical
> CPUs the system has and what are these OPPs. When creating guest
> domain Xen inserts only dummy CPU nodes. All CPU info, such as clocks,
> OPPs, thermal, etc are not passed to guest.

Sure, because this is what virtualization is about. And I am not asking
for unconditionally allowing any guest to change frequency.
But there could be certain use cases where this could be considered:
Think about your "critical SW" mentioned above, which is probably some
RTOS, also possibly running on pinned vCPUs. For that
(latency-sensitive) guest it might be well suited to run at a lower
frequency for some time, but how should Xen know about this?
"Normally" the best strategy to save power is to run as fast as
possible, finish all outstanding work, then put the core to sleep.
Because not running at all consumes much less energy than running at a
reduced frequency. But this may not be suitable for an RTOS.

So I think we would need a combined approach:
a) Let an administrator (via tools running in Dom0) tell Xen about power
management strategies to use for certain guests. An RTOS could be
treated differently (lower, but constant frequency) than an
"entertainment" guest (varying frequency, based on guest OS input), also
differently than some background guest doing logging, OTA update, etc.
(constant high frequency, but putting cores to sleep instead as often as
possible).
b) Allow some guests (based on policy from (a)) to signal CPUFreq change
requests to the hypervisor. Xen takes those into account, though it may
decide to not act immediately on it, because it is going to schedule
another vCPU, for instance.
c) Have some way of actually realising certain OPPs. This could be via
an SCPI client in Xen, or some other way. Might be an implementation detail.
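
To make (a) and (b) a bit more concrete, here is a minimal sketch of how
a per-domain policy could map onto an OPP choice. Nothing like this
exists in Xen or in the posted series: every name below (pm_policy,
dom_pm_config, pm_pick_opp) is hypothetical, and the sketch assumes OPP
indices are ordered so that a higher index means a higher frequency.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-domain strategies, one per guest type named in (a). */
enum pm_policy {
    PM_POLICY_FIXED,        /* e.g. RTOS: constant, administrator-chosen OPP */
    PM_POLICY_GUEST_HINT,   /* e.g. entertainment guest: follow guest requests */
    PM_POLICY_RACE_TO_IDLE  /* e.g. background guest: run fast, then sleep */
};

struct dom_pm_config {
    enum pm_policy policy;
    unsigned int fixed_opp;  /* only used by PM_POLICY_FIXED */
};

/*
 * Pick the OPP index to use while a vCPU of this domain runs.
 * Assumption: a higher index means a higher frequency and max_opp is the
 * highest valid index. guest_hint is the last OPP the guest asked for,
 * as in (b); it is ignored unless the policy allows guest input.
 */
unsigned int pm_pick_opp(const struct dom_pm_config *cfg,
                         unsigned int guest_hint, unsigned int max_opp)
{
    unsigned int opp = max_opp;

    assert(cfg != NULL);

    switch (cfg->policy) {
    case PM_POLICY_FIXED:
        opp = cfg->fixed_opp;
        break;
    case PM_POLICY_GUEST_HINT:
        opp = guest_hint;
        break;
    case PM_POLICY_RACE_TO_IDLE:
        opp = max_opp;
        break;
    }

    /* Never hand out an index the hardware does not have. */
    return opp > max_opp ? max_opp : opp;
}
```

The sketch deliberately leaves out the "Xen may decide not to act
immediately" part of (b); that would sit in the scheduler path that
calls such a helper.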

>>> Although these required components (CPUFreq core, governors, etc) already exist in Xen, it is worth to mention that they are ACPI specific. So, a part of the current patch series makes them more generic in order to make possible a CPUFreq usage on architectures without ACPI support in.
>>
>> Have you looked at how this is used on x86 these days? Can you briefly
>> describe how this works and it's used there?
> Xen supports CPUFreq feature on x86 [1]. I don't know how widely it is
> used at the moment, but it is another question. So, there are two
> possible modes: Domain0 based CPUFreq and Hypervisor based CPUFreq
> [2]. As I understand, the second option is more popular.
> Two different implementations of "Hypervisor based CPUFreq" are
> present: ACPI Processor P-States Driver and AMD Architectural P-state
> Driver. You can find both them in xen/arch/x86/acpi/cpufreq/ dir.
> 
> [1] https://wiki.xenproject.org/wiki/Xen_power_management#CPU_P-states_.28cpufreq.29
> [2] https://wiki.xenproject.org/wiki/Xen_power_management#Hypervisor_based_cpufreq

Thanks for the research and the pointers, will look at it later.

>>> But, the main question we have to answer is about frequency changing interface in virtualized system. The frequency changing interface and all dependent components which needed CPUFreq to be functional on ARM are not present in Xen these days. The list of required components is quite big and may change across different ARM SoC vendors. As an example, the following components are involved in DVFS on Renesas Salvator-X board which has R-Car Gen3 SoC installed: generic clock, regulator and thermal frameworks, Vendor’s CPG, PMIC, AVS, THS drivers, i2c support, etc.
>>>
>>> We were considering a few possible approaches of hypervisor based CPUFreqs on ARM and came to conclusion to base this solution on popular at the moment, already upstreamed to Linux, ARM System Control and Power Interface(SCPI) protocol [1]. We chose SCPI protocol instead of newer ARM System Control and Management Interface (SCMI) protocol [2] since it is widely spread in Linux, there are good examples how to use it, the range of capabilities it has is enough for implementing hypervisor based CPUFreq and, what is more, upstream Linux support for SCMI is missed so far, but SCMI could be used as well.
>>>
>>> Briefly speaking, the SCPI protocol is used between the System Control Processor(SCP) and the Application Processors(AP). The mailbox feature provides a mechanism for inter-processor communication between SCP and AP. The main purpose of SCP is to offload different PM related tasks from AP and one of the services that SCP provides is Dynamic voltage and frequency scaling (DVFS), it is what we actually need for CPUFreq. I will describe this approach in details down the text.
>>>
>>> Let me explain a bit more what these possible approaches are:
>>>
>>> 1. “Xen+hwdom” solution.
>>> GlobalLogic team proposed split model [3], where “hwdom-cpufreq” frontend driver in Xen interacts with the “xen-cpufreq” backend driver in Linux hwdom (possibly dom0) in order to scale physical CPUs. This solution hasn’t been accepted by Xen community yet and seems it is not going to be accepted without taking into the account still unanswered major questions and proving that “all-in-Xen” solution, which Xen community considered as more architecturally cleaner option, would be unworkable in practice.
>>> The other reasons why we decided not to stick to this approach are complex communication interface between Xen and hwdom: event channel, hypercalls, syscalls, passing CPU info via DT, etc and possible synchronization issues with a proposed solution.
>>> Although it is worth to mention that the beauty of this approach was that there wouldn’t be a need to port a lot of things to Xen. All frequency changing interface and all dependent components which needed CPUFreq to be functional were already in place.
>>
>> Stefano, Julien and I were thinking about this: Wouldn't it be possible
>> to come up with some hardware domain, solely dealing with CPUFreq
>> changes? This could run a Linux kernel, but no or very little userland.
>> All its vCPUs would be pinned to pCPUs and would normally not be
>> scheduled by Xen. If Xen wants to change the frequency, it schedules the
>> respective vCPU to the right pCPU and passes down the frequency change
>> request. Sounds a bit involved, though, and probably doesn't solve the
>> problem where this domain needs to share access to hardware with Dom0
>> (clocks come to mind).
> Yes, another question is how to get this Linux kernel stuff (backend,
> top level driver, etc) upstreamed.

Well, the idea would be to use already upstream drivers to actually
implement OPP changes (via Linux clock and regulator drivers), then use
existing interfaces like the userspace governor, for instance, to
trigger those. I don't think we need much extra kernel code for that.

>>> Although this approach is not used, still I picked a few already acked patches which made ACPI specific CPUFreq stuff more generic.
>>>
>>> 2. “all-in-Xen” solution.
>>> This implies that all CPUFreq related stuff should be located in Xen.
>>> Community considered this solution as more architecturally cleaner option than “Xen+hwdom” one. No layering violation comparing with the previous approach (letting guest OS manage one or more physical CPUs is more of a layering violation).
>>> This solution looks better, but to be honest, we are not in favor of this solution as well. We expect enormous developing effort to get this support in (the scope of required components looks unreal) and maintain it. So, we decided not to stick to this approach as well.
>>
>> Yes, I even think it's not feasible to implement this. With a modern
>> clock implementation there is one driver to control *all* clocks of an
>> SoC, so you can't single out the CPU clock easily, for instance. One
>> would probably run into synchronisation issues, at best.
>>
>>> 3. “Xen+SCP(ARM TF)” solution.
>>> It is yet another solution based on ARM SCPI protocol. The generic idea here is that there is a firmware, which being a server runs on some dedicated IP core (server), provides different PM services (DVFS, sensors, etc). On the other side there is a CPUFreq driver in Xen, which is running on the AP (client), consumes these services. CPUFreq driver neither changes the CPU frequency/voltage by itself nor cooperates with Linux in order to do such job. It just communicates with SCP directly using SCPI protocol. As I said before, some integrated into a SoC mailbox IP need to be used for IPC (doorbell for triggering action and shared memory region for commands). CPUFreq driver doesn’t even need to know what should be physically changed for the new frequency to take effect. It is a certainly SCP’s responsibility. This all avoid CPUFreq infrastructure in Xen on ARM from diving into each supported SoC internals and as the result having a lot of code.
>>>
>>> The possible issue here could be in SCP, the problem is that some dedicated IP core may be absent at all or performs other than PM tasks. Fortunately, there is a brilliant solution to teach firmware running in the EL3 exception level (ARM TF) to perform SCP functions and use SMC calls for communications [4]. Exactly this transport implementation I want to bring to Xen the first. Such solution is going to be generic across all ARM platforms that do have firmware running in the EL3 exception level and don’t have candidate for being SCP.
>>
>> While I feel flattered that you like that idea as well ;-), you should
>> mention that this requires actual firmware providing those services.
> Yes, a some firmware, which provides these services, must be present
> on the other end.
> It is a firmware which runs on the dedicated IP core(s) in common case.
> And it is a firmware which runs on the same core(s) as the hypervisor
> in particular case.
> 
>> I
>> am not sure there is actually *any* implementation of this at the
>> moment, apart from my PoC code for Allwinner.
> Your PoC is a good example for writing firmware side. So, why don't
> use it as a base for
> other platform.

Sure, but normally firmware is provided by the vendor. And until more
vendors actually implement this, it's a bit weird to ask Xen users to
install this hand-crafted home-brew firmware to use this feature.
For a particular embedded use case like yours this might be feasible,
though.

>> And from a Xen point of view I am not sure we are in the position to
>> force users to use this firmware. This may be feasible in a classic
>> embedded scenario, where both firmware and software are provided by the
>> same entity, but that should be clearly noted as a restriction.
> Agree.
> 
>>
>>> Here we have completely synchronous case because of SMC calls nature. SMC triggered mailbox driver emulates a mailbox which signals transmitted data via Secure Monitor Call (SMC) instruction [5]. The mailbox receiver is implemented in firmware and synchronously returns data when it returns execution to the non-secure world again. This would allow us both to trigger a request and transfer execution to the firmware code in a safe and architected way. Like PSCI requests.
>>> As you can see this method is free from synchronization issues. What is more, this solution is more architecturally cleaner solution than split model “Xen+hwdom” one. From the security point of view, I hope, everything will be much more correct since the ARM TF, which we want to see in charge of controlling CPU frequency/voltage, is a trusted SW layer. Moreover, ARM TF is responsible for enabling/disabling CPU (PSCI) and nobody complains about it, so let it do DVFS too.
>>
>> It should be noted that this synchronous nature of the communication can
>> actually be a problem: a DVFS request usually involves regulator and PLL
>> changes, which could take some time to settle in. Blocking all of this
>> time (milliseconds?) in EL3 (probably busy-waiting) might not be desirable.
> Agree. I haven't measured time yet to say how long is it, since I
> don't have a working firmware at the moment, just an emulator,
> but, yes, it will definitely take some time. The whole system won't be
> blocked, only the CPU which performs SMC call.
> But, if we ask hwdom to change frequency we will wait too? Or if Xen
> manages PLL/regulator by itself, it will wait anyway?

Normally this is done asynchronously. For instance the OS programs the
regulator to change the voltage, then does other things until the
regulator signals the change has been realised. Then it re-programs the
PLL, again executing other code, eventually being interrupted by a
completion interrupt (or by periodically polling a bit). If we need to
spend all of this time in EL3, the HV is blocked on this. This might or
might not be a problem, but it should be noted.
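
The asynchronous sequence described above can be sketched as a small
state machine that never blocks: each call returns the next action for
the caller to perform and then returns control. This is only an
illustration, not code from Linux or Xen; all names are hypothetical,
and it assumes the regulator-then-PLL order given above.

```c
#include <assert.h>
#include <stddef.h>

enum dvfs_state  { DVFS_IDLE, DVFS_WAIT_REGULATOR, DVFS_WAIT_PLL };
enum dvfs_event  { DVFS_EV_REGULATOR_SETTLED, DVFS_EV_PLL_LOCKED };
enum dvfs_action {
    DVFS_ACT_NONE,               /* nothing to do, keep running other code */
    DVFS_ACT_PROGRAM_REGULATOR,  /* caller writes the new voltage */
    DVFS_ACT_PROGRAM_PLL,        /* caller writes the new frequency */
    DVFS_ACT_COMPLETE            /* OPP change has been realised */
};

struct dvfs_request {
    enum dvfs_state state;
};

/* Kick off an OPP change; returns immediately with the first action. */
enum dvfs_action dvfs_start(struct dvfs_request *req)
{
    assert(req != NULL);

    if (req->state != DVFS_IDLE)
        return DVFS_ACT_NONE;  /* a change is already in flight */

    req->state = DVFS_WAIT_REGULATOR;
    return DVFS_ACT_PROGRAM_REGULATOR;
}

/* Called from the completion interrupt (or a poll) for each step. */
enum dvfs_action dvfs_on_event(struct dvfs_request *req, enum dvfs_event ev)
{
    assert(req != NULL);

    if (req->state == DVFS_WAIT_REGULATOR && ev == DVFS_EV_REGULATOR_SETTLED) {
        req->state = DVFS_WAIT_PLL;
        return DVFS_ACT_PROGRAM_PLL;
    }
    if (req->state == DVFS_WAIT_PLL && ev == DVFS_EV_PLL_LOCKED) {
        req->state = DVFS_IDLE;
        return DVFS_ACT_COMPLETE;
    }
    return DVFS_ACT_NONE;  /* unexpected or out-of-order event: ignore */
}
```

With a synchronous SMC call, both waiting states would instead be spent
inside EL3, which is the concern raised above.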

>>> I have to admit that I have checked this solution only due to a lack of candidate for being SCP. But, I hope, that other ARM SoCs where dedicated SCP is present (asynchronous case) will work too, but with some limitations. The mailbox IPs for these ARM SoCs must have TX/RX-done irqs. I have described in the corresponding patches why this limitation is present.
>>>
>>> To be honest I have Renesas R-Car Gen3 SoCs in mind as our nearest target, but I would like to make this solution as generic as possible. I don’t treat proposed solution as world-wide generic, but I hope, this solution may be suitable for other ARM SoCs which meet such requirements. Anyway, having something which works, but doesn’t cover all cases is better than having nothing.
>>>
>>> I would like to notice that the patches are POC state and I post them just to illustrate in more detail of what I am talking about. Patch series consist of the following parts:
>>> 1. GL’s patches which make ACPI specific CPUFreq stuff more generic. Although these patches has been already acked by Xen community and the CPUFreq code base hasn’t changed in a last few years I drop all A-b.
>>> 2. A bunch of device-tree helpers and macros.
>>> 3. Direct ported SCPI protocol, mailbox infrastructure and the ARM SMC triggered mailbox driver. All components except mailbox driver are in mainline Linux.
>>
>> Why do you actually need this mailbox framework? Actually I just
>> proposed the SMC driver the make it fit into the Linux framework. All we
>> actually need for SCPI is to write a simple command into some memory and
>> "press a button". I don't see a need to import the whole Linux
>> framework, especially as our mailbox usage is actually just a corner
>> case of the mailbox's capability (namely a "single-bit" doorbell).
>> The SMC use case is trivial to implement, and I believe using the Juno
>> mailbox is similarly simple, for instance.
> I did a direct port for SCPI protocol. I think, it is something that
> should be retained as much as possible.

But the actual protocol is really simple. And we just need a subset of
it, namely to query and trigger OPPs.

> Protocol relies on mailbox feature, so I ported mailbox too. I think,
> it would be much more easy for me to just add
> a few required commands handling with issuing SMC call and without any
> mailbox infrastructure involved.
> But, I want to show what is going on and what place these things come from.

I appreciate that, but I think we already have enough "bloated" Linux +
glue code in Xen. And in particular the Linux mailbox framework is much
more powerful than we need for SCPI, so we have a lot of unneeded
functionality.
If we just want to support CPUFreq using SCPI via SMC/Juno MHU/Rockchip
mailbox, we can get away with a *much* simpler solution.
- We would need to port mailbox drivers one-by-one anyway, so we could
as well implement the simple "press-the-button" subset for each mailbox
separately. The interface between the SCPI code and the mailbox is
probably just "signal_mailbox()". For SMC it's trivial, and for the Juno
MHU it's also simple, I guess ([1], chapter 3.6).
- The SCPI message assembly is easy as well.
- The only other code needed is some DT parsing code to be compatible
with the existing DTs describing the SCPI implementation. We would claim
to have a mailbox driver for those compatibles, but cheat a bit since we
only use it for SCPI and just need the single bit subset of the mailbox.
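
A rough sketch of what the "press-the-button" subset could look like,
under stated assumptions: the command-word field layout and the SET_DVFS
command ID follow my reading of Linux's arm_scpi.c and should be checked
against the SCPI specification; the shared-memory struct, the function
names and the signal_mailbox callback type are hypothetical, and
fake_doorbell is only a stand-in for a real SMC call or MHU register
write.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Command word fields (assumed to match Linux's arm_scpi.c; verify). */
#define SCPI_CMD_ID_MASK     0x7fu
#define SCPI_CMD_SIZE_SHIFT  16
#define SCPI_CMD_SIZE_MASK   0x1ffu
#define SCPI_CMD_SET_DVFS    0x0au

/* Hypothetical view of the shared memory area; payload size is arbitrary. */
struct scpi_shared_mem {
    uint32_t command;
    uint32_t status;
    uint8_t  payload[8];
};

/* Hypothetical per-transport doorbell: SMC, Juno MHU, ... */
typedef void (*signal_mailbox_fn)(void *priv);

/* Pack command ID and payload size into the 32-bit command word. */
uint32_t scpi_pack_cmd(uint32_t cmd_id, uint32_t tx_size)
{
    return (cmd_id & SCPI_CMD_ID_MASK) |
           ((tx_size & SCPI_CMD_SIZE_MASK) << SCPI_CMD_SIZE_SHIFT);
}

/*
 * Write a SET_DVFS request (power domain + OPP index) and ring the bell.
 * Real code would also need volatile/MMIO accessors, memory barriers and
 * little-endian conversion, all of which are omitted here.
 */
void scpi_set_dvfs(struct scpi_shared_mem *shm, uint8_t domain,
                   uint8_t opp_index, signal_mailbox_fn signal_mailbox,
                   void *priv)
{
    const uint8_t payload[2] = { domain, opp_index };

    assert(shm != NULL && signal_mailbox != NULL);

    memcpy(shm->payload, payload, sizeof(payload));
    shm->command = scpi_pack_cmd(SCPI_CMD_SET_DVFS, sizeof(payload));
    signal_mailbox(priv);
}

/* Stand-in doorbell that only counts how often it was rung. */
void fake_doorbell(void *priv)
{
    (*(unsigned int *)priv)++;
}
```

The SCPI side only ever sees the callback, so adding another mailbox
means adding another small signal function, not a framework.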

> What is more, I don't want to restrict a usage of this CPUFreq by only
> covering single scenario where a
> firmware, which provides DVFS service, is in ARM TF. I hope, that this
> solution will be suitable for ARM SoCs where a standalone SCP
> is present and real mailbox IP, which has asynchronous nature, is used
> for IPC. Of course, this mailbox must have TX/RX-done irqs.
> This is a limitation at the moment.

Sure, see above and the document [1] below.

Cheers,
Andre.

[1]
http://infocenter.arm.com/help/topic/com.arm.doc.ddi0515f/DDI0515F_juno_arm_development_platform_soc_trm.pdf
> 
>>
>>
>> So to summarize I think we need to agree on those general questions:
>> 1) Shall the Xen hypervisor actually be involved in CPUFreq at all? Can
>> this be left to corner-cases like pinned CPUs/guests, where guests
>> requests are passed on to the hardware?
>> 2) Is EL3/ATF providing SCPI services something we can build on?
>> Normally I would expect we write drivers to match existing firmware.
>> 3) When we go this way, do we really need to port all of the Linux
>> drivers and its framework to Xen? Can't we get away with much simpler
>> solutions? In the end all the SMC mailbox driver does it to trigger an
>> single SMC call, embedded in a lot of glorious Linux boiler plate code.
>>
>> What I was *actually* thinking of when using the SMC mailbox approach is
>> the ability to provide *virtual* SCPI services to guest, in a generic,
>> not-SoC-specific way. The proposed SMC mailbox binding allows using
>> *hvc* calls to trigger services, so Xen could pick up DVFS requests from
>> guests in a generic way and act upon them.
>>
>> Cheers,
>> Andre.
>>
>>> 4. Xen changes to direct ported code for making it compilable. These changes don’t change functionality.
>>> 5. Some modification to direct ported code which slightly change functionality, I would say to restrict it.
>>> 6. SCPI based CPUFreq driver and CPUFreq interface component.
>>> 7. Misc patches mostly to ARM subsystem.
>>> 8. Patch from Volodymyr Babchuk which adds SMC wrapper.
>>>
>>> Most important TODOs regarding the whole patch series:
>>> 1. Handle devm in the direct ported code. Currently, in case of any errors previously allocated resources are left unfreed.
>>> 2. Thermal management integration.
>>> 3. Don't pass CPUFreq related nodes to dom0. Xen owns SCPI completely.
>>> 4. Handle CPU_TURBO frequencies if they are supported by HW.
>>>
>>> You can find the whole patch series here:
>>> repo: https://github.com/otyshchenko1/xen.git branch: cpufreq-devel1
>>>
>>> P.S. There is no need to modify xenpm tool. It works out of the box on ARM.
>>>
>>> [1]
>>> Linux code:
>>> http://elixir.free-electrons.com/linux/v4.14-rc6/source/drivers/firmware/arm_scpi.c
>>> http://elixir.free-electrons.com/linux/v4.14-rc6/source/include/linux/scpi_protocol.h
>>> http://elixir.free-electrons.com/linux/v4.14-rc6/source/Documentation/devicetree/bindings/arm/arm,scpi.txt
>>>
>>> Recent protocol version:
>>> http://infocenter.arm.com/help/topic/com.arm.doc.dui0922g/scp_message_interface_v1_2_DUI0922G_en.pdf
>>>
>>> [2]
>>> Xen part:
>>> https://lists.xen.org/archives/html/xen-devel/2014-11/msg00940.html
>>> Linux part:
>>> https://lists.xen.org/archives/html/xen-devel/2014-11/msg00944.html
>>>
>>> [3]
>>> http://infocenter.arm.com/help/topic/com.arm.doc.den0056a/DEN0056A_System_Control_and_Management_Interface.pdf
>>>
>>> [4]
>>> http://linux-sunxi.narkive.com/qYWJqjXU/patch-v2-0-3-mailbox-arm-introduce-smc-triggered-mailbox
>>>
>>> [5]
>>> http://infocenter.arm.com/help/topic/com.arm.doc.den0028b/ARM_DEN0028B_SMC_Calling_Convention.pdf
>>>
>>> Oleksandr Dmytryshyn (6):
>>>   cpufreq: move cpufreq.h file to the xen/include/xen location
>>>   pm: move processor_perf.h file to the xen/include/xen location
>>>   pmstat: move pmstat.c file to the xen/drivers/pm/stat.c location
>>>   cpufreq: make turbo settings to be configurable
>>>   pmstat: make pmstat functions more generalizable
>>>   cpufreq: make cpufreq driver more generalizable
>>>
>>> Oleksandr Tyshchenko (24):
>>>   xenpm: Clarify xenpm usage
>>>   xen/device-tree: Add dt_count_phandle_with_args helper
>>>   xen/device-tree: Add dt_property_for_each_string macros
>>>   xen/device-tree: Add dt_property_read_u32_index helper
>>>   xen/device-tree: Add dt_property_count_elems_of_size helper
>>>   xen/device-tree: Add dt_property_read_string_helper and friends
>>>   xen/arm: Add driver_data field to struct device
>>>   xen/arm: Add DEVICE_MAILBOX device class
>>>   xen/arm: Store device-tree node per cpu
>>>   xen/arm: Add ARM System Control and Power Interface (SCPI) protocol
>>>   xen/arm: Add mailbox infrastructure
>>>   xen/arm: Introduce ARM SMC based mailbox
>>>   xen/arm: Add common header file wrappers.h
>>>   xen/arm: Add rxdone_auto flag to mbox_controller structure
>>>   xen/arm: Add Xen changes to SCPI protocol
>>>   xen/arm: Add Xen changes to mailbox infrastructure
>>>   xen/arm: Add Xen changes to ARM SMC based mailbox
>>>   xen/arm: Use non-blocking mode for SCPI protocol
>>>   xen/arm: Don't set txdone_poll flag for ARM SMC mailbox
>>>   cpufreq: hack: perf->states isn't a real guest handle on ARM
>>>   xen/arm: Introduce SCPI based CPUFreq driver
>>>   xen/arm: Introduce CPUFreq Interface component
>>>   xen/arm: Build CPUFreq components
>>>   xen/arm: Enable CPUFreq on ARM
>>>
>>> Volodymyr Babchuk (1):
>>>   arm: add SMC wrapper that is compatible with SMCCC
>>>
>>>  MAINTAINERS                                  |    4 +-
>>>  tools/misc/xenpm.c                           |    6 +-
>>>  xen/arch/arm/Kconfig                         |    2 +
>>>  xen/arch/arm/Makefile                        |    1 +
>>>  xen/arch/arm/arm32/Makefile                  |    1 +
>>>  xen/arch/arm/arm32/smc.S                     |   32 +
>>>  xen/arch/arm/arm64/Makefile                  |    1 +
>>>  xen/arch/arm/arm64/smc.S                     |   29 +
>>>  xen/arch/arm/cpufreq/Makefile                |    5 +
>>>  xen/arch/arm/cpufreq/arm-smc-mailbox.c       |  248 ++++++
>>>  xen/arch/arm/cpufreq/arm_scpi.c              | 1191 ++++++++++++++++++++++++++
>>>  xen/arch/arm/cpufreq/cpufreq_if.c            |  522 +++++++++++
>>>  xen/arch/arm/cpufreq/mailbox.c               |  562 ++++++++++++
>>>  xen/arch/arm/cpufreq/mailbox.h               |   28 +
>>>  xen/arch/arm/cpufreq/mailbox_client.h        |   69 ++
>>>  xen/arch/arm/cpufreq/mailbox_controller.h    |  161 ++++
>>>  xen/arch/arm/cpufreq/scpi_cpufreq.c          |  328 +++++++
>>>  xen/arch/arm/cpufreq/scpi_protocol.h         |  116 +++
>>>  xen/arch/arm/cpufreq/wrappers.h              |  239 ++++++
>>>  xen/arch/arm/smpboot.c                       |    5 +
>>>  xen/arch/x86/Kconfig                         |    2 +
>>>  xen/arch/x86/acpi/cpu_idle.c                 |    2 +-
>>>  xen/arch/x86/acpi/cpufreq/cpufreq.c          |    2 +-
>>>  xen/arch/x86/acpi/cpufreq/powernow.c         |    2 +-
>>>  xen/arch/x86/acpi/power.c                    |    2 +-
>>>  xen/arch/x86/cpu/mwait-idle.c                |    2 +-
>>>  xen/arch/x86/platform_hypercall.c            |    2 +-
>>>  xen/common/device_tree.c                     |  124 +++
>>>  xen/common/sysctl.c                          |    2 +-
>>>  xen/drivers/Kconfig                          |    2 +
>>>  xen/drivers/Makefile                         |    1 +
>>>  xen/drivers/acpi/Makefile                    |    1 -
>>>  xen/drivers/acpi/pmstat.c                    |  526 ------------
>>>  xen/drivers/cpufreq/Kconfig                  |    3 +
>>>  xen/drivers/cpufreq/cpufreq.c                |  102 ++-
>>>  xen/drivers/cpufreq/cpufreq_misc_governors.c |    2 +-
>>>  xen/drivers/cpufreq/cpufreq_ondemand.c       |    4 +-
>>>  xen/drivers/cpufreq/utility.c                |   13 +-
>>>  xen/drivers/pm/Kconfig                       |    3 +
>>>  xen/drivers/pm/Makefile                      |    1 +
>>>  xen/drivers/pm/stat.c                        |  538 ++++++++++++
>>>  xen/include/acpi/cpufreq/cpufreq.h           |  245 ------
>>>  xen/include/acpi/cpufreq/processor_perf.h    |   63 --
>>>  xen/include/asm-arm/device.h                 |    2 +
>>>  xen/include/asm-arm/processor.h              |    4 +
>>>  xen/include/public/platform.h                |    1 +
>>>  xen/include/xen/cpufreq.h                    |  254 ++++++
>>>  xen/include/xen/device_tree.h                |  158 ++++
>>>  xen/include/xen/pmstat.h                     |    2 +
>>>  xen/include/xen/processor_perf.h             |   69 ++
>>>  50 files changed, 4822 insertions(+), 862 deletions(-)
>>>  create mode 100644 xen/arch/arm/arm32/smc.S
>>>  create mode 100644 xen/arch/arm/arm64/smc.S
>>>  create mode 100644 xen/arch/arm/cpufreq/Makefile
>>>  create mode 100644 xen/arch/arm/cpufreq/arm-smc-mailbox.c
>>>  create mode 100644 xen/arch/arm/cpufreq/arm_scpi.c
>>>  create mode 100644 xen/arch/arm/cpufreq/cpufreq_if.c
>>>  create mode 100644 xen/arch/arm/cpufreq/mailbox.c
>>>  create mode 100644 xen/arch/arm/cpufreq/mailbox.h
>>>  create mode 100644 xen/arch/arm/cpufreq/mailbox_client.h
>>>  create mode 100644 xen/arch/arm/cpufreq/mailbox_controller.h
>>>  create mode 100644 xen/arch/arm/cpufreq/scpi_cpufreq.c
>>>  create mode 100644 xen/arch/arm/cpufreq/scpi_protocol.h
>>>  create mode 100644 xen/arch/arm/cpufreq/wrappers.h
>>>  delete mode 100644 xen/drivers/acpi/pmstat.c
>>>  create mode 100644 xen/drivers/pm/Kconfig
>>>  create mode 100644 xen/drivers/pm/Makefile
>>>  create mode 100644 xen/drivers/pm/stat.c
>>>  delete mode 100644 xen/include/acpi/cpufreq/cpufreq.h
>>>  delete mode 100644 xen/include/acpi/cpufreq/processor_perf.h
>>>  create mode 100644 xen/include/xen/cpufreq.h
>>>  create mode 100644 xen/include/xen/processor_perf.h
>>>
> 
> 


* Re: [RFC PATCH 00/31] CPUFreq on ARM
  2017-11-14 10:49     ` Andre Przywara
@ 2017-11-14 20:46       ` Oleksandr Tyshchenko
  2017-11-15  3:03         ` Jassi Brar
  2017-11-15 14:28         ` Andre Przywara
  0 siblings, 2 replies; 108+ messages in thread
From: Oleksandr Tyshchenko @ 2017-11-14 20:46 UTC (permalink / raw)
  To: Andre Przywara
  Cc: Edgar E . Iglesias, Stefano Stabellini, Jassi Brar,
	Andrew Cooper, Julien Grall, Oleksandr Tyshchenko, Jan Beulich,
	Sudeep Holla, xen-devel

On Tue, Nov 14, 2017 at 12:49 PM, Andre Przywara
<andre.przywara@linaro.org> wrote:
> Hi,
Hi Andre

>
> On 13/11/17 19:40, Oleksandr Tyshchenko wrote:
>> On Mon, Nov 13, 2017 at 5:21 PM, Andre Przywara
>> <andre.przywara@linaro.org> wrote:
>>> Hi,
>> Hi Andre,
>>
>>>
>>> thanks very much for your work on this!
>> Thank you for your comments.
>>
>>>
>>> On 09/11/17 17:09, Oleksandr Tyshchenko wrote:
>>>> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>>>>
>>>> Hi, all.
>>>>
>>>> The purpose of this RFC patch series is to add CPUFreq support to Xen on ARM.
>>>> Motivation of hypervisor based CPUFreq is to enable one of the main PM use-cases in virtualized system powered by Xen hypervisor. Rationale behind this activity is that CPU virtualization is done by hypervisor and the guest OS doesn't actually know anything about physical CPUs because it is running on virtual CPUs. It is quite clear that a decision about frequency change should be taken by hypervisor as only it has information about actual CPU load.
>>>
>>> Can you please sketch your usage scenario or workloads here? I can think
>>> of quite different scenarios (oversubscribed server vs. partitioning
>>> RTOS guests, for instance). The usefulness of CPUFreq and the trade-offs
>>> in the design are quite different between those.
>> We keep embedded use-cases in mind. For example, it is a system with
>> several domains,
>> where one domain has most critical SW running on and other domain(s)
>> are, let say, for entertainment purposes.
>> I think, the CPUFreq is useful where power consumption is a question.
>
> Does the SoC you use allow different frequencies for each core? Or is it
> one frequency for all cores? Most x86 CPU allow different frequencies
> for each core, AFAIK. Just having the same OPP for the whole SoC might
> limit the usefulness of this approach in general.
Good question. All cores in a cluster share the same clock. It is
impossible to set different frequencies on the cores inside one
cluster.

>
>>> In general I doubt that a hypervisor scheduling vCPUs is in a good
>>> position to make a decision on the proper frequency physical CPUs should
>>> run with. From all I know it's already hard for an OS kernel to make
>>> that call. So I would actually expect that guests provide some input,
>>> for instance by signalling OPP change request up to the hypervisor. This
>>> could then decide to act on it - or not.
>> Each running guest sees only part of the picture, but hypervisor has
>> the whole picture, it knows all about CPU, measures CPU load and able
>> to choose required CPU frequency to run on.
>
> But based on what data? All Xen sees is a vCPU trapping on MMIO, a
> hypercall or on WFI, for that matter. It does not know much more about
> the guest, especially it's rather clueless about what the guest OS
> actually intended to do.
> For instance Linux can track the actual utilization of a core by keeping
> statistics of runnable processes and monitoring their time slice usage.
> It can see that a certain process exhibits periodical, but bursty CPU
> usage, which may hint that is could run at lower frequency. Xen does not
> see this fine granular information.
>
>> I am wondering, does Xen
>> need additional input from guests for make a decision?
>
> I very much believe so. The guest OS is in a much better position to
> make that call.
>
>> BTW, currently guest domain on ARM doesn't even know how many physical
>> CPUs the system has and what are these OPPs. When creating guest
>> domain Xen inserts only dummy CPU nodes. All CPU info, such as clocks,
>> OPPs, thermal, etc are not passed to guest.
>
> Sure, because this is what virtualization is about. And I am not asking
> for unconditionally allowing any guest to change frequency.
> But there could be certain use cases where this could be considered:
> Think about your "critical SW" mentioned above, which is probably some
> RTOS, also possibly running on pinned vCPUs. For that
> (latency-sensitive) guest it might be well suited to run at a lower
> frequency for some time, but how should Xen know about this?
> "Normally" the best strategy to save power is to run as fast as
> possible, finish all outstanding work, then put the core to sleep.
> Because not running at all consumes much less energy than running at a
> reduced frequency. But this may not be suitable for an RTOS.
By "one domain has most critical SW running on" I meant the hardware
domain/driver domain, or even another domain which performs some
important tasks (disk, net, display, camera, whatever) that the whole
system treats as critical and that must never fail. Other domains, for
example Android, are not critical at all from the system point of view.
To be honest, I haven't yet considered using CPUFreq in a system where
some RT guest is present.
I think it is something that should be *thoroughly* investigated and
then worked out.
I am not familiar with RT system requirements. I suppose, but am not
entirely sure, that either CPUFreq should use a constant frequency for
all cores the RT system is running on, or the RT system parameters
should be recalculated each time the CPU frequency is changed
(in which case the guest needs some input from Xen).

Anyway, I got your point about some guest input. Could you please
describe how you think it should look:
1. Xen doesn't have CPUFreq logic at all. It only collects OPP change
requests from all guests and makes a decision based on these requests
and maybe some policy for prioritizing requests. Then it sends an OPP
change request to the SCP.
2. Xen has CPUFreq logic. In addition, it can collect OPP change
requests from all guests and make a decision based on both its own view
and the guest requests. Then it sends an OPP change request to the SCP.

Both variants imply that something like PV CPUFreq should be involved,
with frontend drivers located in guests. Am I correct?
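
To make variant 2 a bit more concrete, here is a minimal sketch of how
one target OPP per cluster could be resolved. It is only an illustration
under assumptions: struct opp_table, resolve_cluster_opp() and the
"highest request wins" policy are hypothetical and are not taken from
the patch series. Since all cores in a cluster share one clock, a single
value per cluster is enough.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical OPP table for one cluster, sorted ascending, in kHz. */
struct opp_table {
    const uint32_t *freq_khz;
    size_t count;
};

/*
 * Assumed policy: take the highest of Xen's own governor target and all
 * guest requests, then pick the lowest OPP that satisfies it. A guest
 * request of 0 means "no opinion". If nothing in the table is high
 * enough, fall back to the highest available OPP.
 */
static uint32_t resolve_cluster_opp(const struct opp_table *t,
                                    uint32_t xen_target_khz,
                                    const uint32_t *guest_req_khz,
                                    size_t nr_guests)
{
    uint32_t want = xen_target_khz;
    size_t i;

    assert(t != NULL && t->count > 0);

    /* Fold the guest hints into Xen's own view. */
    for (i = 0; i < nr_guests; i++)
        if (guest_req_khz[i] > want)
            want = guest_req_khz[i];

    /* Snap to the lowest OPP that is at least as fast as requested. */
    for (i = 0; i < t->count; i++)
        if (t->freq_khz[i] >= want)
            return t->freq_khz[i];

    return t->freq_khz[t->count - 1];
}
```

Variant 1 would be the same function with xen_target_khz fixed to 0, so
that only the guest requests drive the decision. A per-guest priority or
an administrator-set policy could replace the plain maximum.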

>
> So I think we would need a combined approach:
> a) Let an administrator (via tools running in Dom0) tell Xen about power
> management strategies to use for certain guests. An RTOS could be
> treated differently (lower, but constant frequency) than an
> "entertainment" guest (varying frequency, based on guest OS input), also
> differently than some background guest doing logging, OTA update, etc.
> (constant high frequency, but putting cores to sleep instead as often as
> possible).
> b) Allow some guests (based on policy from (a)) to signal CPUFreq change
> requests to the hypervisor. Xen takes those into account, though it may
> decide to not act immediately on it, because it is going to schedule
> another vCPU, for instance.
> c) Have some way of actually realising certain OPPs. This could be via
> an SCPI client in Xen, or some other way. Might be an implementation detail.

Just to clarify whether I got the main idea right:
1. Guests have CPUFreq logic; they send OPP change requests to Xen.
2. Xen has CPUFreq logic too, but in addition it can take into account
    OPP change requests from guests. Xen sends the final OPP change
    request.
Is my understanding correct?

Also, does "different power management strategies to use for certain
guests" mean that there should be hard vCPU->pCPU pinning for each
guest, together with the possibility in Xen to have different CPUFreq
governors running at the same time (one governor per CPU pool)?

>
>>>> Although these required components (CPUFreq core, governors, etc) already exist in Xen, it is worth to mention that they are ACPI specific. So, a part of the current patch series makes them more generic in order to make possible a CPUFreq usage on architectures without ACPI support in.
>>>
>>> Have you looked at how this is used on x86 these days? Can you briefly
>>> describe how this works and it's used there?
>> Xen supports CPUFreq feature on x86 [1]. I don't know how widely it is
>> used at the moment, but it is another question. So, there are two
>> possible modes: Domain0 based CPUFreq and Hypervisor based CPUFreq
>> [2]. As I understand, the second option is more popular.
>> Two different implementations of "Hypervisor based CPUFreq" are
>> present: ACPI Processor P-States Driver and AMD Architectural P-state
>> Driver. You can find both them in xen/arch/x86/acpi/cpufreq/ dir.
>>
>> [1] https://wiki.xenproject.org/wiki/Xen_power_management#CPU_P-states_.28cpufreq.29
>> [2] https://wiki.xenproject.org/wiki/Xen_power_management#Hypervisor_based_cpufreq
>
> Thanks for the research and the pointers, will look at it later.
>
>>>> But, the main question we have to answer is about frequency changing interface in virtualized system. The frequency changing interface and all dependent components which needed CPUFreq to be functional on ARM are not present in Xen these days. The list of required components is quite big and may change across different ARM SoC vendors. As an example, the following components are involved in DVFS on Renesas Salvator-X board which has R-Car Gen3 SoC installed: generic clock, regulator and thermal frameworks, Vendor’s CPG, PMIC, AVS, THS drivers, i2c support, etc.
>>>>
>>>> We were considering a few possible approaches of hypervisor based CPUFreqs on ARM and came to conclusion to base this solution on popular at the moment, already upstreamed to Linux, ARM System Control and Power Interface(SCPI) protocol [1]. We chose SCPI protocol instead of newer ARM System Control and Management Interface (SCMI) protocol [2] since it is widely spread in Linux, there are good examples how to use it, the range of capabilities it has is enough for implementing hypervisor based CPUFreq and, what is more, upstream Linux support for SCMI is missed so far, but SCMI could be used as well.
>>>>
>>>> Briefly speaking, the SCPI protocol is used between the System Control Processor(SCP) and the Application Processors(AP). The mailbox feature provides a mechanism for inter-processor communication between SCP and AP. The main purpose of SCP is to offload different PM related tasks from AP and one of the services that SCP provides is Dynamic voltage and frequency scaling (DVFS), it is what we actually need for CPUFreq. I will describe this approach in details down the text.
>>>>
>>>> Let me explain a bit more what these possible approaches are:
>>>>
>>>> 1. “Xen+hwdom” solution.
>>>> GlobalLogic team proposed split model [3], where “hwdom-cpufreq” frontend driver in Xen interacts with the “xen-cpufreq” backend driver in Linux hwdom (possibly dom0) in order to scale physical CPUs. This solution hasn’t been accepted by Xen community yet and seems it is not going to be accepted without taking into the account still unanswered major questions and proving that “all-in-Xen” solution, which Xen community considered as more architecturally cleaner option, would be unworkable in practice.
>>>> The other reasons why we decided not to stick to this approach are complex communication interface between Xen and hwdom: event channel, hypercalls, syscalls, passing CPU info via DT, etc and possible synchronization issues with a proposed solution.
>>>> Although it is worth to mention that the beauty of this approach was that there wouldn’t be a need to port a lot of things to Xen. All frequency changing interface and all dependent components which needed CPUFreq to be functional were already in place.
>>>
>>> Stefano, Julien and I were thinking about this: Wouldn't it be possible
>>> to come up with some hardware domain, solely dealing with CPUFreq
>>> changes? This could run a Linux kernel, but no or very little userland.
>>> All its vCPUs would be pinned to pCPUs and would normally not be
>>> scheduled by Xen. If Xen wants to change the frequency, it schedules the
>>> respective vCPU to the right pCPU and passes down the frequency change
>>> request. Sounds a bit involved, though, and probably doesn't solve the
>>> problem where this domain needs to share access to hardware with Dom0
>>> (clocks come to mind).
>> Yes, another question is how to get this Linux kernel stuff (backend,
>> top level driver, etc) upstreamed.
>
> Well, the idea would be to use already upstream drivers to actually
> implement OPP changes (via Linux clock and regulator drivers), then use
> existing interfaces like the userspace governor, for instance, to
> trigger those. I don't think we need much extra kernel code for that.
I understand. A backend in userspace sets the desired frequency on
request from a frontend in Xen.

>
>>>> Although this approach is not used, still I picked a few already acked patches which made ACPI specific CPUFreq stuff more generic.
>>>>
>>>> 2. “all-in-Xen” solution.
>>>> This implies that all CPUFreq related stuff should be located in Xen.
>>>> Community considered this solution as more architecturally cleaner option than “Xen+hwdom” one. No layering violation comparing with the previous approach (letting guest OS manage one or more physical CPUs is more of a layering violation).
>>>> This solution looks better, but to be honest, we are not in favor of this solution as well. We expect enormous developing effort to get this support in (the scope of required components looks unreal) and maintain it. So, we decided not to stick to this approach as well.
>>>
>>> Yes, I even think it's not feasible to implement this. With a modern
>>> clock implementation there is one driver to control *all* clocks of an
>>> SoC, so you can't single out the CPU clock easily, for instance. One
>>> would probably run into synchronisation issues, at best.
>>>
>>>> 3. “Xen+SCP(ARM TF)” solution.
>>>> It is yet another solution based on ARM SCPI protocol. The generic idea here is that there is a firmware, which being a server runs on some dedicated IP core (server), provides different PM services (DVFS, sensors, etc). On the other side there is a CPUFreq driver in Xen, which is running on the AP (client), consumes these services. CPUFreq driver neither changes the CPU frequency/voltage by itself nor cooperates with Linux in order to do such job. It just communicates with SCP directly using SCPI protocol. As I said before, some integrated into a SoC mailbox IP need to be used for IPC (doorbell for triggering action and shared memory region for commands). CPUFreq driver doesn’t even need to know what should be physically changed for the new frequency to take effect. It is a certainly SCP’s responsibility. This all avoid CPUFreq infrastructure in Xen on ARM from diving into each supported SoC internals and as the result having a lot of code.
>>>>
>>>> The possible issue here could be in SCP, the problem is that some dedicated IP core may be absent at all or performs other than PM tasks. Fortunately, there is a brilliant solution to teach firmware running in the EL3 exception level (ARM TF) to perform SCP functions and use SMC calls for communications [4]. Exactly this transport implementation I want to bring to Xen the first. Such solution is going to be generic across all ARM platforms that do have firmware running in the EL3 exception level and don’t have candidate for being SCP.
>>>
>>> While I feel flattered that you like that idea as well ;-), you should
>>> mention that this requires actual firmware providing those services.
>> Yes, a some firmware, which provides these services, must be present
>> on the other end.
>> It is a firmware which runs on the dedicated IP core(s) in common case.
>> And it is a firmware which runs on the same core(s) as the hypervisor
>> in particular case.
>>
>>> I
>>> am not sure there is actually *any* implementation of this at the
>>> moment, apart from my PoC code for Allwinner.
>> Your PoC is a good example for writing firmware side. So, why don't
>> use it as a base for
>> other platform.
>
> Sure, but normally firmware is provided by the vendor. And until more
> vendors actually implement this, it's a bit weird to ask Xen users to
> install this hand-crafted home-brew firmware to use this feature.
> For a particular embedded use case like yours this might be feasible,
> though.
Agree. It is exactly for ARM SoCs with security extensions enabled,
but where an SCP isn't available.
And such SoCs do exist.

>
>>> And from a Xen point of view I am not sure we are in the position to
>>> force users to use this firmware. This may be feasible in a classic
>>> embedded scenario, where both firmware and software are provided by the
>>> same entity, but that should be clearly noted as a restriction.
>> Agree.
>>
>>>
>>>> Here we have completely synchronous case because of SMC calls nature. SMC triggered mailbox driver emulates a mailbox which signals transmitted data via Secure Monitor Call (SMC) instruction [5]. The mailbox receiver is implemented in firmware and synchronously returns data when it returns execution to the non-secure world again. This would allow us both to trigger a request and transfer execution to the firmware code in a safe and architected way. Like PSCI requests.
>>>> As you can see this method is free from synchronization issues. What is more, this solution is more architecturally cleaner solution than split model “Xen+hwdom” one. From the security point of view, I hope, everything will be much more correct since the ARM TF, which we want to see in charge of controlling CPU frequency/voltage, is a trusted SW layer. Moreover, ARM TF is responsible for enabling/disabling CPU (PSCI) and nobody complains about it, so let it do DVFS too.
>>>
>>> It should be noted that this synchronous nature of the communication can
>>> actually be a problem: a DVFS request usually involves regulator and PLL
>>> changes, which could take some time to settle in. Blocking all of this
>>> time (milliseconds?) in EL3 (probably busy-waiting) might not be desirable.
>> Agree. I haven't measured time yet to say how long is it, since I
>> don't have a working firmware at the moment, just an emulator,
>> but, yes, it will definitely take some time. The whole system won't be
>> blocked, only the CPU which performs SMC call.
>> But, if we ask hwdom to change frequency we will wait too? Or if Xen
>> manages PLL/regulator by itself, it will wait anyway?
>
> Normally this is done asynchronously. For instance the OS programs the
> regulator to change the voltage, then does other things until the
> regulator signals the change has been realised. The it re-programs the
> PLL, again executing other code, eventually being interrupted by a
> completion interrupt (or by periodically polling a bit). If we need to
> spend all of this time in EL3, the HV is blocked on this. This might or
> might not be a problem, but it should be noted.
Agree.

>
>>>> I have to admit that I have checked this solution only due to a lack of candidate for being SCP. But, I hope, that other ARM SoCs where dedicated SCP is present (asynchronous case) will work too, but with some limitations. The mailbox IPs for these ARM SoCs must have TX/RX-done irqs. I have described in the corresponding patches why this limitation is present.
>>>>
>>>> To be honest I have Renesas R-Car Gen3 SoCs in mind as our nearest target, but I would like to make this solution as generic as possible. I don’t treat proposed solution as world-wide generic, but I hope, this solution may be suitable for other ARM SoCs which meet such requirements. Anyway, having something which works, but doesn’t cover all cases is better than having nothing.
>>>>
>>>> I would like to notice that the patches are POC state and I post them just to illustrate in more detail of what I am talking about. Patch series consist of the following parts:
>>>> 1. GL’s patches which make ACPI specific CPUFreq stuff more generic. Although these patches has been already acked by Xen community and the CPUFreq code base hasn’t changed in a last few years I drop all A-b.
>>>> 2. A bunch of device-tree helpers and macros.
>>>> 3. Direct ported SCPI protocol, mailbox infrastructure and the ARM SMC triggered mailbox driver. All components except mailbox driver are in mainline Linux.
>>>
>>> Why do you actually need this mailbox framework? Actually I just
>>> proposed the SMC driver the make it fit into the Linux framework. All we
>>> actually need for SCPI is to write a simple command into some memory and
>>> "press a button". I don't see a need to import the whole Linux
>>> framework, especially as our mailbox usage is actually just a corner
>>> case of the mailbox's capability (namely a "single-bit" doorbell).
>>> The SMC use case is trivial to implement, and I believe using the Juno
>>> mailbox is similarly simple, for instance.
>> I did a direct port for SCPI protocol. I think, it is something that
>> should be retained as much as possible.
>
> But the actual protocol is really simple. And we just need a subset of
> it, namely to query and trigger OPPs.
Yes. I think that the "Sensors service" is needed as well. I think that
CPUFreq is not complete without thermal feedback.

>
>> Protocol relies on mailbox feature, so I ported mailbox too. I think,
>> it would be much more easy for me to just add
>> a few required commands handling with issuing SMC call and without any
>> mailbox infrastructure involved.
>> But, I want to show what is going on and what place these things come from.
>
> I appreciate that, but I think we already have enough "bloated" Linux +
> glue code in Xen. And in particular the Linux mailbox framework is much
> more powerful than we need for SCPI, so we have a lot of unneeded
> functionality.
> If we just want to support CPUfreq using SCPI via SMC/Juno MHU/Rockchip
> mailbox, we can get away with a *much* simpler solution.

Agree, but I am afraid that simplifying things now might lead to some
difficulties when there is a need
to integrate a slightly different mailbox IP. Also, we need to
check whether SCMI, which we might want to support as well,
has a similar interface to the mailbox.

> - We would need to port mailbox drivers one-by-one anyway, so we could
> as well implement the simple "press-the-button" subset for each mailbox
> separately. The interface between the SCPI code and the mailbox is
> probably just "signal_mailbox()". For SMC it's trivial, and for the Juno
> MHU it's also simple, I guess ([1], chapter 3.6).
> - The SCPI message assembly is easy as well.
> - The only other code needed is some DT parsing code to be compatible
> with the existing DTs describing the SCPI implementation. We would claim
> to have a mailbox driver for those compatibles, but cheat a bit since we
> only use it for SCPI and just need the single bit subset of the mailbox.
Yes, I think we can optimize in such a way.

Just to clarify:
Is the proposed "signal_mailbox" intended for both actions: sending
the request and receiving the response?
So when it returns we will have either a response or a timeout error,
or will some callback be needed anyway?
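
One possible answer, as a minimal sketch under assumptions: the function
both rings the doorbell and waits, so on return the caller has either a
ready response or an error. struct mbox_transport, its hooks and the
poll budget are hypothetical names for illustration, and the fake
doorbell only stands in for real hardware. None of this is code from
the series or from Linux.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical transport hooks. kick() rings the doorbell (an SMC call
 * or a write to a mailbox register). tx_done() reports whether the
 * other side has finished. A synchronous transport such as SMC can
 * leave tx_done NULL.
 */
struct mbox_transport {
    int (*kick)(void *priv);
    bool (*tx_done)(void *priv);
    void *priv;
};

/*
 * Ring the doorbell and wait for completion.
 * Returns 0 once the response can be read from shared memory,
 * -ETIMEDOUT if the other side did not finish within max_polls checks,
 * or the error returned by kick().
 */
static int signal_mailbox(const struct mbox_transport *t,
                          unsigned int max_polls)
{
    int rc;

    assert(t != NULL && t->kick != NULL);

    rc = t->kick(t->priv);
    if (rc)
        return rc;

    if (t->tx_done == NULL)
        return 0;   /* synchronous: done as soon as kick() returns */

    while (max_polls--)
        if (t->tx_done(t->priv))
            return 0;

    return -ETIMEDOUT;
}

/* Stand-in for real hardware: reports "done" after busy_polls checks. */
struct fake_doorbell {
    unsigned int busy_polls;
};

static int fake_kick(void *priv)
{
    (void)priv;
    return 0;
}

static bool fake_tx_done(void *priv)
{
    struct fake_doorbell *d = priv;

    if (d->busy_polls == 0)
        return true;
    d->busy_polls--;
    return false;
}
```

With this shape no callback is needed: the SMC case returns immediately
after kick(), and a polled mailbox returns after tx_done() reports
completion or the poll budget runs out. An interrupt-driven mailbox
would still need a different completion path.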

I don't have any objections regarding optimizations. We need to
decide which mailboxes we should stick to (can support) and in what
form we should keep all this stuff.
Also, while making a decision, we need to keep in mind the advantages
of "direct ported code":
- "direct ported code" (SCPI + mailbox) has had a thorough review by
the Linux community, and the Xen community
  may rely on that review.
- As "direct ported code" wasn't changed heavily, I believe it would
be easy to backport fixes/features to Xen.

So, let's decide.

>
>> What is more, I don't want to restrict a usage of this CPUFreq by only
>> covering single scenario where a
>> firmware, which provides DVFS service, is in ARM TF. I hope, that this
>> solution will be suitable for ARM SoCs where a standalone SCP
>> is present and real mailbox IP, which has asynchronous nature, is used
>> for IPC. Of course, this mailbox must have TX/RX-done irqs.
>> This is a limitation at the moment.
>
> Sure, see above and the document [1] below.
Thank you for the link. It seems that with the MHU we have to poll for
last_tx_done after pressing the button (the condition being that the
interrupt line is deasserted in the status register).
Or did I miss something?

>
> Cheers,
> Andre.
>
> [1]
> http://infocenter.arm.com/help/topic/com.arm.doc.ddi0515f/DDI0515F_juno_arm_development_platform_soc_trm.pdf
>>
>>>
>>>
>>> So to summarize I think we need to agree on those general questions:
>>> 1) Shall the Xen hypervisor actually be involved in CPUFreq at all? Can
>>> this be left to corner-cases like pinned CPUs/guests, where guests
>>> requests are passed on to the hardware?
>>> 2) Is EL3/ATF providing SCPI services something we can build on?
>>> Normally I would expect we write drivers to match existing firmware.
>>> 3) When we go this way, do we really need to port all of the Linux
>>> drivers and its framework to Xen? Can't we get away with much simpler
>>> solutions? In the end all the SMC mailbox driver does it to trigger an
>>> single SMC call, embedded in a lot of glorious Linux boiler plate code.
>>>
>>> What I was *actually* thinking of when using the SMC mailbox approach is
>>> the ability to provide *virtual* SCPI services to guest, in a generic,
>>> not-SoC-specific way. The proposed SMC mailbox binding allows using
>>> *hvc* calls to trigger services, so Xen could pick up DVFS requests from
>>> guests in a generic way and act upon them.
>>>
>>> Cheers,
>>> Andre.
>>>
>>>> 4. Xen changes to direct ported code for making it compilable. These changes don’t change functionality.
>>>> 5. Some modification to direct ported code which slightly change functionality, I would say to restrict it.
>>>> 6. SCPI based CPUFreq driver and CPUFreq interface component.
>>>> 7. Misc patches mostly to ARM subsystem.
>>>> 8. Patch from Volodymyr Babchuk which adds SMC wrapper.
>>>>
>>>> Most important TODOs regarding the whole patch series:
>>>> 1. Handle devm in the direct ported code. Currently, in case of any errors previously allocated resources are left unfreed.
>>>> 2. Thermal management integration.
>>>> 3. Don't pass CPUFreq related nodes to dom0. Xen owns SCPI completely.
>>>> 4. Handle CPU_TURBO frequencies if they are supported by HW.
>>>>
>>>> You can find the whole patch series here:
>>>> repo: https://github.com/otyshchenko1/xen.git branch: cpufreq-devel1
>>>>
>>>> P.S. There is no need to modify xenpm tool. It works out of the box on ARM.
>>>>
>>>> [1]
>>>> Linux code:
>>>> http://elixir.free-electrons.com/linux/v4.14-rc6/source/drivers/firmware/arm_scpi.c
>>>> http://elixir.free-electrons.com/linux/v4.14-rc6/source/include/linux/scpi_protocol.h
>>>> http://elixir.free-electrons.com/linux/v4.14-rc6/source/Documentation/devicetree/bindings/arm/arm,scpi.txt
>>>>
>>>> Recent protocol version:
>>>> http://infocenter.arm.com/help/topic/com.arm.doc.dui0922g/scp_message_interface_v1_2_DUI0922G_en.pdf
>>>>
>>>> [2]
>>>> Xen part:
>>>> https://lists.xen.org/archives/html/xen-devel/2014-11/msg00940.html
>>>> Linux part:
>>>> https://lists.xen.org/archives/html/xen-devel/2014-11/msg00944.html
>>>>
>>>> [3]
>>>> http://infocenter.arm.com/help/topic/com.arm.doc.den0056a/DEN0056A_System_Control_and_Management_Interface.pdf
>>>>
>>>> [4]
>>>> http://linux-sunxi.narkive.com/qYWJqjXU/patch-v2-0-3-mailbox-arm-introduce-smc-triggered-mailbox
>>>>
>>>> [5]
>>>> http://infocenter.arm.com/help/topic/com.arm.doc.den0028b/ARM_DEN0028B_SMC_Calling_Convention.pdf
>>>>
>>>> Oleksandr Dmytryshyn (6):
>>>>   cpufreq: move cpufreq.h file to the xen/include/xen location
>>>>   pm: move processor_perf.h file to the xen/include/xen location
>>>>   pmstat: move pmstat.c file to the xen/drivers/pm/stat.c location
>>>>   cpufreq: make turbo settings to be configurable
>>>>   pmstat: make pmstat functions more generalizable
>>>>   cpufreq: make cpufreq driver more generalizable
>>>>
>>>> Oleksandr Tyshchenko (24):
>>>>   xenpm: Clarify xenpm usage
>>>>   xen/device-tree: Add dt_count_phandle_with_args helper
>>>>   xen/device-tree: Add dt_property_for_each_string macros
>>>>   xen/device-tree: Add dt_property_read_u32_index helper
>>>>   xen/device-tree: Add dt_property_count_elems_of_size helper
>>>>   xen/device-tree: Add dt_property_read_string_helper and friends
>>>>   xen/arm: Add driver_data field to struct device
>>>>   xen/arm: Add DEVICE_MAILBOX device class
>>>>   xen/arm: Store device-tree node per cpu
>>>>   xen/arm: Add ARM System Control and Power Interface (SCPI) protocol
>>>>   xen/arm: Add mailbox infrastructure
>>>>   xen/arm: Introduce ARM SMC based mailbox
>>>>   xen/arm: Add common header file wrappers.h
>>>>   xen/arm: Add rxdone_auto flag to mbox_controller structure
>>>>   xen/arm: Add Xen changes to SCPI protocol
>>>>   xen/arm: Add Xen changes to mailbox infrastructure
>>>>   xen/arm: Add Xen changes to ARM SMC based mailbox
>>>>   xen/arm: Use non-blocking mode for SCPI protocol
>>>>   xen/arm: Don't set txdone_poll flag for ARM SMC mailbox
>>>>   cpufreq: hack: perf->states isn't a real guest handle on ARM
>>>>   xen/arm: Introduce SCPI based CPUFreq driver
>>>>   xen/arm: Introduce CPUFreq Interface component
>>>>   xen/arm: Build CPUFreq components
>>>>   xen/arm: Enable CPUFreq on ARM
>>>>
>>>> Volodymyr Babchuk (1):
>>>>   arm: add SMC wrapper that is compatible with SMCCC
>>>>
>>>>  MAINTAINERS                                  |    4 +-
>>>>  tools/misc/xenpm.c                           |    6 +-
>>>>  xen/arch/arm/Kconfig                         |    2 +
>>>>  xen/arch/arm/Makefile                        |    1 +
>>>>  xen/arch/arm/arm32/Makefile                  |    1 +
>>>>  xen/arch/arm/arm32/smc.S                     |   32 +
>>>>  xen/arch/arm/arm64/Makefile                  |    1 +
>>>>  xen/arch/arm/arm64/smc.S                     |   29 +
>>>>  xen/arch/arm/cpufreq/Makefile                |    5 +
>>>>  xen/arch/arm/cpufreq/arm-smc-mailbox.c       |  248 ++++++
>>>>  xen/arch/arm/cpufreq/arm_scpi.c              | 1191 ++++++++++++++++++++++++++
>>>>  xen/arch/arm/cpufreq/cpufreq_if.c            |  522 +++++++++++
>>>>  xen/arch/arm/cpufreq/mailbox.c               |  562 ++++++++++++
>>>>  xen/arch/arm/cpufreq/mailbox.h               |   28 +
>>>>  xen/arch/arm/cpufreq/mailbox_client.h        |   69 ++
>>>>  xen/arch/arm/cpufreq/mailbox_controller.h    |  161 ++++
>>>>  xen/arch/arm/cpufreq/scpi_cpufreq.c          |  328 +++++++
>>>>  xen/arch/arm/cpufreq/scpi_protocol.h         |  116 +++
>>>>  xen/arch/arm/cpufreq/wrappers.h              |  239 ++++++
>>>>  xen/arch/arm/smpboot.c                       |    5 +
>>>>  xen/arch/x86/Kconfig                         |    2 +
>>>>  xen/arch/x86/acpi/cpu_idle.c                 |    2 +-
>>>>  xen/arch/x86/acpi/cpufreq/cpufreq.c          |    2 +-
>>>>  xen/arch/x86/acpi/cpufreq/powernow.c         |    2 +-
>>>>  xen/arch/x86/acpi/power.c                    |    2 +-
>>>>  xen/arch/x86/cpu/mwait-idle.c                |    2 +-
>>>>  xen/arch/x86/platform_hypercall.c            |    2 +-
>>>>  xen/common/device_tree.c                     |  124 +++
>>>>  xen/common/sysctl.c                          |    2 +-
>>>>  xen/drivers/Kconfig                          |    2 +
>>>>  xen/drivers/Makefile                         |    1 +
>>>>  xen/drivers/acpi/Makefile                    |    1 -
>>>>  xen/drivers/acpi/pmstat.c                    |  526 ------------
>>>>  xen/drivers/cpufreq/Kconfig                  |    3 +
>>>>  xen/drivers/cpufreq/cpufreq.c                |  102 ++-
>>>>  xen/drivers/cpufreq/cpufreq_misc_governors.c |    2 +-
>>>>  xen/drivers/cpufreq/cpufreq_ondemand.c       |    4 +-
>>>>  xen/drivers/cpufreq/utility.c                |   13 +-
>>>>  xen/drivers/pm/Kconfig                       |    3 +
>>>>  xen/drivers/pm/Makefile                      |    1 +
>>>>  xen/drivers/pm/stat.c                        |  538 ++++++++++++
>>>>  xen/include/acpi/cpufreq/cpufreq.h           |  245 ------
>>>>  xen/include/acpi/cpufreq/processor_perf.h    |   63 --
>>>>  xen/include/asm-arm/device.h                 |    2 +
>>>>  xen/include/asm-arm/processor.h              |    4 +
>>>>  xen/include/public/platform.h                |    1 +
>>>>  xen/include/xen/cpufreq.h                    |  254 ++++++
>>>>  xen/include/xen/device_tree.h                |  158 ++++
>>>>  xen/include/xen/pmstat.h                     |    2 +
>>>>  xen/include/xen/processor_perf.h             |   69 ++
>>>>  50 files changed, 4822 insertions(+), 862 deletions(-)
>>>>  create mode 100644 xen/arch/arm/arm32/smc.S
>>>>  create mode 100644 xen/arch/arm/arm64/smc.S
>>>>  create mode 100644 xen/arch/arm/cpufreq/Makefile
>>>>  create mode 100644 xen/arch/arm/cpufreq/arm-smc-mailbox.c
>>>>  create mode 100644 xen/arch/arm/cpufreq/arm_scpi.c
>>>>  create mode 100644 xen/arch/arm/cpufreq/cpufreq_if.c
>>>>  create mode 100644 xen/arch/arm/cpufreq/mailbox.c
>>>>  create mode 100644 xen/arch/arm/cpufreq/mailbox.h
>>>>  create mode 100644 xen/arch/arm/cpufreq/mailbox_client.h
>>>>  create mode 100644 xen/arch/arm/cpufreq/mailbox_controller.h
>>>>  create mode 100644 xen/arch/arm/cpufreq/scpi_cpufreq.c
>>>>  create mode 100644 xen/arch/arm/cpufreq/scpi_protocol.h
>>>>  create mode 100644 xen/arch/arm/cpufreq/wrappers.h
>>>>  delete mode 100644 xen/drivers/acpi/pmstat.c
>>>>  create mode 100644 xen/drivers/pm/Kconfig
>>>>  create mode 100644 xen/drivers/pm/Makefile
>>>>  create mode 100644 xen/drivers/pm/stat.c
>>>>  delete mode 100644 xen/include/acpi/cpufreq/cpufreq.h
>>>>  delete mode 100644 xen/include/acpi/cpufreq/processor_perf.h
>>>>  create mode 100644 xen/include/xen/cpufreq.h
>>>>  create mode 100644 xen/include/xen/processor_perf.h
>>>>
>>
>>



-- 
Regards,

Oleksandr Tyshchenko

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [RFC PATCH 00/31] CPUFreq on ARM
  2017-11-14 20:46       ` Oleksandr Tyshchenko
@ 2017-11-15  3:03         ` Jassi Brar
  2017-11-15 13:28           ` Andre Przywara
  2017-11-15 14:28         ` Andre Przywara
  1 sibling, 1 reply; 108+ messages in thread
From: Jassi Brar @ 2017-11-15  3:03 UTC (permalink / raw)
  To: Oleksandr Tyshchenko
  Cc: Edgar E . Iglesias, Stefano Stabellini, Andrew Cooper,
	Julien Grall, Andre Przywara, Oleksandr Tyshchenko, Jan Beulich,
	Sudeep Holla, xen-devel

On 15 November 2017 at 02:16, Oleksandr Tyshchenko <olekstysh@gmail.com> wrote:
> On Tue, Nov 14, 2017 at 12:49 PM, Andre Przywara
> <andre.przywara@linaro.org> wrote:
>

>>>>> 3. Direct ported SCPI protocol, mailbox infrastructure and the ARM SMC triggered mailbox driver. All components except mailbox driver are in mainline Linux.
>>>>
>>>> Why do you actually need this mailbox framework?
>
It is unnecessary if you are always going to use one particular signal
mechanism, say SMC. However ...

>>>> Actually I just
>>>> proposed the SMC driver the make it fit into the Linux framework. All we
>>>> actually need for SCPI is to write a simple command into some memory and
>>>> "press a button". I don't see a need to import the whole Linux
>>>> framework, especially as our mailbox usage is actually just a corner
>>>> case of the mailbox's capability (namely a "single-bit" doorbell).
>>>> The SMC use case is trivial to implement, and I believe using the Juno
>>>> mailbox is similarly simple, for instance.
>
... It's going to be SMC and MHU now... and you talk about Rockchip as
well later. That becomes unwieldy.


>>
>>> Protocol relies on mailbox feature, so I ported mailbox too. I think,
>>> it would be much more easy for me to just add
>>> a few required commands handling with issuing SMC call and without any
>>> mailbox infrastructure involved.
>>> But, I want to show what is going on and what place these things come from.
>>
>> I appreciate that, but I think we already have enough "bloated" Linux +
>> glue code in Xen. And in particular the Linux mailbox framework is much
>> more powerful than we need for SCPI, so we have a lot of unneeded
>> functionality.
>
That is a painful misconception.
The mailbox API is designed to be lightweight, (almost) to the point of
being transparent. Please have a look at mbox_send_message() and see how
negligible the overhead is for the "SMC controller" that you compare
against here: just integer manipulations protected by a spinlock.
Of course, if your protocol needs async messaging, you pay the price,
but that is only fair.


>> If we just want to support CPUfreq using SCPI via SMC/Juno MHU/Rockchip
>> mailbox, we can get away with a *much* simpler solution.
>
> Agree, but I am afraid that simplifying things now might lead to some
> difficulties when there is a need
> to integrate a little bit different mailbox IP. Also, we need to
> recheck if SCMI, we might want to support as well,
> have the similar interface with mailbox.
>
Exactly.


>> - We would need to port mailbox drivers one-by-one anyway, so we could
>> as well implement the simple "press-the-button" subset for each mailbox
>> separately.
>
Is this about a virtual controller?

>> The interface between the SCPI code and the mailbox is
>> probably just "signal_mailbox()".
>
After all, we should have the following to spread the nice feeling of
"supporting doorbell controllers"  :)

mailbox_client.h
*******************
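/* Doorbell-only helper: no payload, just ring the channel. */
static inline void signal_mailbox(struct mbox_chan *chan)
{
   (void)mbox_send_message(chan, NULL);
   /* Report the (empty) transfer as completed successfully. */
   mbox_client_txdone(chan, 0);
}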


Cheers!

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [RFC PATCH 00/31] CPUFreq on ARM
  2017-11-15  3:03         ` Jassi Brar
@ 2017-11-15 13:28           ` Andre Przywara
  2017-11-15 15:18             ` Jassi Brar
  0 siblings, 1 reply; 108+ messages in thread
From: Andre Przywara @ 2017-11-15 13:28 UTC (permalink / raw)
  To: Jassi Brar, Oleksandr Tyshchenko
  Cc: Edgar E . Iglesias, Stefano Stabellini, Andrew Cooper,
	Julien Grall, Oleksandr Tyshchenko, Jan Beulich, Sudeep Holla,
	xen-devel

Hi,

On 15/11/17 03:03, Jassi Brar wrote:
> On 15 November 2017 at 02:16, Oleksandr Tyshchenko <olekstysh@gmail.com> wrote:
>> On Tue, Nov 14, 2017 at 12:49 PM, Andre Przywara
>> <andre.przywara@linaro.org> wrote:
>>
> 
>>>>>> 3. Direct ported SCPI protocol, mailbox infrastructure and the ARM SMC triggered mailbox driver. All components except mailbox driver are in mainline Linux.
>>>>>
>>>>> Why do you actually need this mailbox framework?
>>
> It is unnecessary if you are always going to use one particular signal
> mechanism, say SMC. However ...
> 
>>>>> Actually I just
>>>>> proposed the SMC driver the make it fit into the Linux framework. All we
>>>>> actually need for SCPI is to write a simple command into some memory and
>>>>> "press a button". I don't see a need to import the whole Linux
>>>>> framework, especially as our mailbox usage is actually just a corner
>>>>> case of the mailbox's capability (namely a "single-bit" doorbell).
>>>>> The SMC use case is trivial to implement, and I believe using the Juno
>>>>> mailbox is similarly simple, for instance.
>>
> ... Its going to be SMC and MHU now... and you talk about Rockchip as
> well later. That becomes unwieldy.
> 
> 
>>>
>>>> Protocol relies on mailbox feature, so I ported mailbox too. I think,
>>>> it would be much more easy for me to just add
>>>> a few required commands handling with issuing SMC call and without any
>>>> mailbox infrastructure involved.
>>>> But, I want to show what is going on and what place these things come from.
>>>
>>> I appreciate that, but I think we already have enough "bloated" Linux +
>>> glue code in Xen. And in particular the Linux mailbox framework is much
>>> more powerful than we need for SCPI, so we have a lot of unneeded
>>> functionality.
>>
> That is a painful misconception.
> Mailbox api is designed to be (almost) as light weight as being
> transparent. Please have a look at mbox_send_message() and see how
> negligible overhead it adds for "SMC controller" that you compare
> against here..... just integer manipulations protected by a spinlock.
> Of course if your protocol needs async messaging, you pay the price
> but only fair.

Normally I would agree on importing some well designed code rather than
hacking up something yourself.

BUT: This is Xen, which is meant to be a lean, microkernel-like
hypervisor. If we now add code from Linux, there must be a good
rationale for why we need it. And this is why we need to make sure that
CPUFreq is really justified in the first place.
So I am a bit wary that pulling some rather unrelated Linux *framework*
into Xen bloats it and adds to the burden on the trusted code
base. With SCPI being the only user, this controller/client
abstraction is not really needed. And to just trigger an interrupt on
the SCP side we just need to:
	writel(BIT(channel), base_addr + CPU_INTR_H_SET);

I expect other mailboxes to be similarly simple.
The only other code needed is some DT parsing.
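
As a rough illustration of the "single-bit doorbell" idea, the sketch
below rings one channel bit and then checks for completion by reading
the status back. It runs against an emulated register block so that it
is self-contained; the register indexes, the fake_* accessors and the
function names are assumptions made for this example and do not
reflect the real MHU register layout or any existing driver.

```c
#include <assert.h>
#include <stdint.h>

#define BIT(n) (UINT32_C(1) << (n))

/* Placeholder register selectors for the emulation, not real MHU offsets. */
enum { REG_INTR_STAT, REG_INTR_SET, REG_INTR_CLEAR };

/* Emulated doorbell: one status word with set/clear side-effect registers. */
struct fake_mhu {
    uint32_t stat;
};

/* Emulates write-1-to-set and write-1-to-clear register semantics. */
static void fake_writel(struct fake_mhu *mhu, uint32_t val, int reg)
{
    if (reg == REG_INTR_SET)
        mhu->stat |= val;
    else if (reg == REG_INTR_CLEAR)
        mhu->stat &= ~val;
}

static uint32_t fake_readl(const struct fake_mhu *mhu, int reg)
{
    return reg == REG_INTR_STAT ? mhu->stat : 0;
}

/* "Press the button" for one channel. */
static void doorbell_ring(struct fake_mhu *mhu, unsigned int channel)
{
    assert(channel < 32);
    fake_writel(mhu, BIT(channel), REG_INTR_SET);
}

/* The transfer counts as done once the receiver has cleared the bit. */
static int doorbell_tx_done(const struct fake_mhu *mhu, unsigned int channel)
{
    assert(channel < 32);
    return (fake_readl(mhu, REG_INTR_STAT) & BIT(channel)) == 0;
}

/* Stand-in for the SCP acknowledging the interrupt. */
static void fake_scp_ack(struct fake_mhu *mhu, unsigned int channel)
{
    fake_writel(mhu, BIT(channel), REG_INTR_CLEAR);
}
```

On real hardware the two accessors would be writel()/readl() on the
mapped MMIO region, and the caller would poll doorbell_tx_done() with
a timeout instead of relying on a TX-done interrupt.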

That being said, I haven't looked too closely at how much code this
actually pulls in; it is just my gut feeling that it's a bit over the
top, conceptually.

>>> If we just want to support CPUfreq using SCPI via SMC/Juno MHU/Rockchip
>>> mailbox, we can get away with a *much* simpler solution.
>>
>> Agree, but I am afraid that simplifying things now might lead to some
>> difficulties when there is a need
>> to integrate a little bit different mailbox IP. Also, we need to
>> recheck if SCMI, we might want to support as well,
>> have the similar interface with mailbox.
>>
> Exactly.

My understanding is that the SCMI transport protocol is not different
from that used by SCPI.

Cheers,
Andre.

>>> - We would need to port mailbox drivers one-by-one anyway, so we could
>>> as well implement the simple "press-the-button" subset for each mailbox
>>> separately.
>>
> Is it about virtual controller?
> 
>>> The interface between the SCPI code and the mailbox is
>>> probably just "signal_mailbox()".
>>
> Afterall we should have the following to spread the nice feeling of
> "supporting doorbell controllers"  :)
> 
> mailbox_client.h
> *******************
> void signal_mailbox(struct mbox_chan *chan)
> {
>    (void)mbox_send_message(chan, NULL);
>    mbox_client_txdone(chan, 0);
> }
> 
> 
> Cheers!
> 


^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [RFC PATCH 00/31] CPUFreq on ARM
  2017-11-14 20:46       ` Oleksandr Tyshchenko
  2017-11-15  3:03         ` Jassi Brar
@ 2017-11-15 14:28         ` Andre Przywara
  2017-11-16 14:57           ` Oleksandr Tyshchenko
  1 sibling, 1 reply; 108+ messages in thread
From: Andre Przywara @ 2017-11-15 14:28 UTC (permalink / raw)
  To: Oleksandr Tyshchenko
  Cc: Edgar E . Iglesias, Stefano Stabellini, Jassi Brar,
	Andrew Cooper, Julien Grall, Oleksandr Tyshchenko, Jan Beulich,
	Sudeep Holla, xen-devel

Hi,

On 14/11/17 20:46, Oleksandr Tyshchenko wrote:
> On Tue, Nov 14, 2017 at 12:49 PM, Andre Przywara
> <andre.przywara@linaro.org> wrote:
>> Hi,
> Hi Andre
> 
>>
>> On 13/11/17 19:40, Oleksandr Tyshchenko wrote:
>>> On Mon, Nov 13, 2017 at 5:21 PM, Andre Przywara
>>> <andre.przywara@linaro.org> wrote:
>>>> Hi,
>>> Hi Andre,
>>>
>>>>
>>>> thanks very much for your work on this!
>>> Thank you for your comments.
>>>
>>>>
>>>> On 09/11/17 17:09, Oleksandr Tyshchenko wrote:
>>>>> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>>>>>
>>>>> Hi, all.
>>>>>
>>>>> The purpose of this RFC patch series is to add CPUFreq support to Xen on ARM.
>>>>> Motivation of hypervisor based CPUFreq is to enable one of the main PM use-cases in virtualized system powered by Xen hypervisor. Rationale behind this activity is that CPU virtualization is done by hypervisor and the guest OS doesn't actually know anything about physical CPUs because it is running on virtual CPUs. It is quite clear that a decision about frequency change should be taken by hypervisor as only it has information about actual CPU load.
>>>>
>>>> Can you please sketch your usage scenario or workloads here? I can think
>>>> of quite different scenarios (oversubscribed server vs. partitioning
>>>> RTOS guests, for instance). The usefulness of CPUFreq and the trade-offs
>>>> in the design are quite different between those.
>>> We keep embedded use-cases in mind. For example, it is a system with
>>> several domains,
>>> where one domain has most critical SW running on and other domain(s)
>>> are, let say, for entertainment purposes.
>>> I think, the CPUFreq is useful where power consumption is a question.
>>
>> Does the SoC you use allow different frequencies for each core? Or is it
>> one frequency for all cores? Most x86 CPU allow different frequencies
>> for each core, AFAIK. Just having the same OPP for the whole SoC might
>> limit the usefulness of this approach in general.
> Good question. All cores in a cluster share the same clock. It is
> impossible to set different frequencies on the cores inside one
> cluster.
> 
>>
>>>> In general I doubt that a hypervisor scheduling vCPUs is in a good
>>>> position to make a decision on the proper frequency physical CPUs should
>>>> run with. From all I know it's already hard for an OS kernel to make
>>>> that call. So I would actually expect that guests provide some input,
>>>> for instance by signalling OPP change request up to the hypervisor. This
>>>> could then decide to act on it - or not.
>>> Each running guest sees only part of the picture, but hypervisor has
>>> the whole picture, it knows all about CPU, measures CPU load and able
>>> to choose required CPU frequency to run on.
>>
>> But based on what data? All Xen sees is a vCPU trapping on MMIO, a
>> hypercall or on WFI, for that matter. It does not know much more about
>> the guest, especially it's rather clueless about what the guest OS
>> actually intended to do.
>> For instance Linux can track the actual utilization of a core by keeping
>> statistics of runnable processes and monitoring their time slice usage.
>> It can see that a certain process exhibits periodical, but bursty CPU
>> usage, which may hint that is could run at lower frequency. Xen does not
>> see this fine granular information.
>>
>>> I am wondering, does Xen
>>> need additional input from guests for make a decision?
>>
>> I very much believe so. The guest OS is in a much better position to
>> make that call.
>>
>>> BTW, currently guest domain on ARM doesn't even know how many physical
>>> CPUs the system has and what are these OPPs. When creating guest
>>> domain Xen inserts only dummy CPU nodes. All CPU info, such as clocks,
>>> OPPs, thermal, etc are not passed to guest.
>>
>> Sure, because this is what virtualization is about. And I am not asking
>> for unconditionally allowing any guest to change frequency.
>> But there could be certain use cases where this could be considered:
>> Think about your "critical SW" mentioned above, which is probably some
>> RTOS, also possibly running on pinned vCPUs. For that
>> (latency-sensitive) guest it might be well suited to run at a lower
>> frequency for some time, but how should Xen know about this?
>> "Normally" the best strategy to save power is to run as fast as
>> possible, finish all outstanding work, then put the core to sleep.
>> Because not running at all consumes much less energy than running at a
>> reduced frequency. But this may not be suitable for an RTOS.
> Saying "one domain has most critical SW running on" I meant hardware
> domain/driver domain or even other
> domain which perform some important tasks (disk, net, display, camera,
> whatever) which treated by the whole system as critical
> and must never fail. Other domains, for example, it might be Android
> as well, are not critical at all from the system point of view.
> Being honest, I haven't considered yet using CPUFreq in system where
> some RT guest is present.
> I think it is something that should be *thoroughly* investigated and
> then worked out.

Yes, as mentioned before there are quite different use cases with quite
different requirements when it comes to DVFS.
I believe the best would be to define typical scenarios, then assess the
usefulness of CPUFreq separately for each one of them.
Based on this we then should be able to make a decision.

> I am not familiar with RT system requirements, I suppose, but not
> entirely sure, that CPUFreq should use const
> frequency for all cores the RT system is running on, or RT system
> parameters should be recalculated each time the CPU frequency is being
> changed
> (in such case guest needs some input from Xen).
> 
> Anyway, I got your point about some guest input. Could you, please,
> describe how you think it should look like:
> 1. Xen doesn't have CPUFreq logic at all. It only collects OPP change
> requests from all guests and make
> a decision based on these requests and maybe some policy for
> prioritizing requests. Then it sends OPP change request to SCP.
> 2. Xen has CPUFreq logic. In addition it can collect OPP change
> requests from all guests and make
> a decision based on both: it's own view and guest requests. Then it
> sends OPP change request to SCP.

I am leaning towards 1) conceptually. But if there is some kind of
reasonable implementation of 2) already in Xen (for x86), this might be
feasible as well.

> Both variant implies that something like PV CPUFreq should be involved
> with frontend drivers are located in guests. Am I correct?

And here the SMC mailbox comes into play again, but with a twist. For
guests we create SCPI, mailbox and shmem DT nodes, and use the SMC
mailbox with: method = "hvc";. Xen's HVC handler then redirects this to
the CPUFreq code.
This would be platform agnostic for the guests, while making all CPUFreq
requests end up in Xen. So there is no need for an extra PV protocol.
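
A very rough sketch of the hypervisor side of this idea: the guest
places a command in its shared-memory area and issues the HVC, and the
handler decodes the command and records the request for the CPUFreq
code. Everything here is hypothetical: the vscpi_* names, the message
layout and the command numbers are placeholders for illustration and
do not follow the SCPI wire format or any existing Xen interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder command and status codes, not the real SCPI encoding. */
enum { VSCPI_CMD_SET_DVFS = 1, VSCPI_CMD_GET_DVFS = 2 };
enum { VSCPI_OK = 0, VSCPI_EINVAL = -1, VSCPI_EPERM = -2 };

#define VSCPI_MAX_OPPS 8u

/* Simplified per-guest shared-memory message area. */
struct vscpi_shmem {
    uint32_t cmd;
    uint32_t opp_index;
    int32_t status;
};

/* Per-guest state the hypervisor would keep. */
struct vscpi_guest {
    bool may_request_dvfs;      /* policy set by the administrator */
    uint32_t requested_opp;     /* last accepted request */
    struct vscpi_shmem shmem;
};

/* Called when the guest "presses the button" via HVC. */
static int vscpi_handle_hvc(struct vscpi_guest *g)
{
    struct vscpi_shmem *m;

    assert(g);
    m = &g->shmem;

    switch (m->cmd) {
    case VSCPI_CMD_SET_DVFS:
        if (!g->may_request_dvfs) {
            m->status = VSCPI_EPERM;
        } else if (m->opp_index >= VSCPI_MAX_OPPS) {
            m->status = VSCPI_EINVAL;
        } else {
            /* Only record the wish; the CPUFreq code decides later. */
            g->requested_opp = m->opp_index;
            m->status = VSCPI_OK;
        }
        break;
    case VSCPI_CMD_GET_DVFS:
        m->opp_index = g->requested_opp;
        m->status = VSCPI_OK;
        break;
    default:
        m->status = VSCPI_EINVAL;
        break;
    }
    return m->status;
}
```

The handler deliberately only records the request rather than acting
on it, which leaves the final OPP decision to the hypervisor.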

>> So I think we would need a combined approach:
>> a) Let an administrator (via tools running in Dom0) tell Xen about power
>> management strategies to use for certain guests. An RTOS could be
>> treated differently (lower, but constant frequency) than an
>> "entertainment" guest (varying frequency, based on guest OS input), also
>> differently than some background guest doing logging, OTA update, etc.
>> (constant high frequency, but putting cores to sleep instead as often as
>> possible).
>> b) Allow some guests (based on policy from (a)) to signal CPUFreq change
>> requests to the hypervisor. Xen takes those into account, though it may
>> decide to not act immediately on it, because it is going to schedule
>> another vCPU, for instance.
>> c) Have some way of actually realising certain OPPs. This could be via
>> an SCPI client in Xen, or some other way. Might be an implementation detail.
> 
> Just to clarify if I got the main idea correct:
> 1. Guests have CPUFreq logic, they send OPP change requests to Xen.
> 2. Xen has CPUFreq logic too, but in additional it can take into the account OPP
>     change requests from guests. Xen sends final OPP change request.
> Is my understanding correct?

Yes, I think this sounds like the most flexible option. Xen's CPUFreq
logic could be quite simple, possibly starting with some static
assignment based on administrator input, e.g. given at guest creation
time. It might not involve further runtime decisions.
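
One possible shape of such a simple policy, given that all cores in a
cluster share one clock: take the highest frequency requested by any
guest that the administrator allows to influence DVFS, and round it up
to the next available OPP. This is only a sketch; the structure, the
function name and the "highest request wins" rule are assumptions for
illustration, not something proposed in the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-guest request as seen by the hypervisor. */
struct guest_request {
    bool allowed;       /* policy: may this guest influence the OPP? */
    uint32_t freq_khz;  /* frequency the guest asked for */
};

/*
 * Pick the lowest OPP from opps[] (sorted ascending) that satisfies the
 * highest allowed request, never going below floor_khz. If the request
 * exceeds every OPP, the highest OPP is returned.
 */
static uint32_t cluster_pick_opp(const uint32_t *opps, size_t n_opps,
                                 const struct guest_request *reqs,
                                 size_t n_reqs, uint32_t floor_khz)
{
    uint32_t want = floor_khz;
    size_t i;

    assert(opps && n_opps > 0);

    for (i = 0; i < n_reqs; i++) {
        if (reqs[i].allowed && reqs[i].freq_khz > want)
            want = reqs[i].freq_khz;
    }
    for (i = 0; i < n_opps; i++) {
        if (opps[i] >= want)
            return opps[i];
    }
    return opps[n_opps - 1];
}
```

A static assignment given at guest creation time is then just the
special case where no guest is allowed to make requests and floor_khz
carries the administrator's choice.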

> Also "Different power management strategies to use for certain guests"
> means that it should be
> hard vCPU->pCPU pinning for each guest together with possibility in
> Xen to have different CPUFreq governors
> running at the same time (each governor for each CPU pool)?

That would need to be worked out, but I suspect that CPU pinning might
be *one* option for a certain class of guests. This would probably be
related to the CPUFreq policy. Without pinning the decision might become
quite involved: If Xen wants to migrate a vCPU to a different pCPU, it
needs to take the different P-states into account, including the cost to
change the OPP. I am not sure the benefit justifies the effort. Some
numbers would help here.

>>>>> Although these required components (CPUFreq core, governors, etc) already exist in Xen, it is worth to mention that they are ACPI specific. So, a part of the current patch series makes them more generic in order to make possible a CPUFreq usage on architectures without ACPI support in.
>>>>
>>>> Have you looked at how this is used on x86 these days? Can you briefly
>>>> describe how this works and it's used there?
>>> Xen supports CPUFreq feature on x86 [1]. I don't know how widely it is
>>> used at the moment, but it is another question. So, there are two
>>> possible modes: Domain0 based CPUFreq and Hypervisor based CPUFreq
>>> [2]. As I understand, the second option is more popular.
>>> Two different implementations of "Hypervisor based CPUFreq" are
>>> present: ACPI Processor P-States Driver and AMD Architectural P-state
>>> Driver. You can find both them in xen/arch/x86/acpi/cpufreq/ dir.
>>>
>>> [1] https://wiki.xenproject.org/wiki/Xen_power_management#CPU_P-states_.28cpufreq.29
>>> [2] https://wiki.xenproject.org/wiki/Xen_power_management#Hypervisor_based_cpufreq
>>
>> Thanks for the research and the pointers, will look at it later.
>>
>>>>> But, the main question we have to answer is about frequency changing interface in virtualized system. The frequency changing interface and all dependent components which needed CPUFreq to be functional on ARM are not present in Xen these days. The list of required components is quite big and may change across different ARM SoC vendors. As an example, the following components are involved in DVFS on Renesas Salvator-X board which has R-Car Gen3 SoC installed: generic clock, regulator and thermal frameworks, Vendor’s CPG, PMIC, AVS, THS drivers, i2c support, etc.
>>>>>
>>>>> We were considering a few possible approaches of hypervisor based CPUFreqs on ARM and came to conclusion to base this solution on popular at the moment, already upstreamed to Linux, ARM System Control and Power Interface(SCPI) protocol [1]. We chose SCPI protocol instead of newer ARM System Control and Management Interface (SCMI) protocol [2] since it is widely spread in Linux, there are good examples how to use it, the range of capabilities it has is enough for implementing hypervisor based CPUFreq and, what is more, upstream Linux support for SCMI is missed so far, but SCMI could be used as well.
>>>>>
>>>>> Briefly speaking, the SCPI protocol is used between the System Control Processor(SCP) and the Application Processors(AP). The mailbox feature provides a mechanism for inter-processor communication between SCP and AP. The main purpose of SCP is to offload different PM related tasks from AP and one of the services that SCP provides is Dynamic voltage and frequency scaling (DVFS), it is what we actually need for CPUFreq. I will describe this approach in details down the text.
>>>>>
>>>>> Let me explain a bit more what these possible approaches are:
>>>>>
>>>>> 1. “Xen+hwdom” solution.
>>>>> GlobalLogic team proposed split model [3], where “hwdom-cpufreq” frontend driver in Xen interacts with the “xen-cpufreq” backend driver in Linux hwdom (possibly dom0) in order to scale physical CPUs. This solution hasn’t been accepted by Xen community yet and seems it is not going to be accepted without taking into the account still unanswered major questions and proving that “all-in-Xen” solution, which Xen community considered as more architecturally cleaner option, would be unworkable in practice.
>>>>> The other reasons why we decided not to stick to this approach are complex communication interface between Xen and hwdom: event channel, hypercalls, syscalls, passing CPU info via DT, etc and possible synchronization issues with a proposed solution.
>>>>> Although it is worth to mention that the beauty of this approach was that there wouldn’t be a need to port a lot of things to Xen. All frequency changing interface and all dependent components which needed CPUFreq to be functional were already in place.
>>>>
>>>> Stefano, Julien and I were thinking about this: Wouldn't it be possible
>>>> to come up with some hardware domain, solely dealing with CPUFreq
>>>> changes? This could run a Linux kernel, but no or very little userland.
>>>> All its vCPUs would be pinned to pCPUs and would normally not be
>>>> scheduled by Xen. If Xen wants to change the frequency, it schedules the
>>>> respective vCPU to the right pCPU and passes down the frequency change
>>>> request. Sounds a bit involved, though, and probably doesn't solve the
>>>> problem where this domain needs to share access to hardware with Dom0
>>>> (clocks come to mind).
>>> Yes, another question is how to get this Linux kernel stuff (backend,
>>> top level driver, etc) upstreamed.
>>
>> Well, the idea would be to use already upstream drivers to actually
>> implement OPP changes (via Linux clock and regulator drivers), then use
>> existing interfaces like the userspace governor, for instance, to
>> trigger those. I don't think we need much extra kernel code for that.
> I understand. Backend in userspace sets desired frequency by request
> from frontend in Xen.

Yeah, something like that. It was just an idea, not fully thought
through yet.

>>>>> Although this approach is not used, still I picked a few already acked patches which made ACPI specific CPUFreq stuff more generic.
>>>>>
>>>>> 2. “all-in-Xen” solution.
>>>>> This implies that all CPUFreq related stuff should be located in Xen.
>>>>> Community considered this solution as more architecturally cleaner option than “Xen+hwdom” one. No layering violation comparing with the previous approach (letting guest OS manage one or more physical CPUs is more of a layering violation).
>>>>> This solution looks better, but to be honest, we are not in favor of this solution as well. We expect enormous developing effort to get this support in (the scope of required components looks unreal) and maintain it. So, we decided not to stick to this approach as well.
>>>>
>>>> Yes, I even think it's not feasible to implement this. With a modern
>>>> clock implementation there is one driver to control *all* clocks of an
>>>> SoC, so you can't single out the CPU clock easily, for instance. One
>>>> would probably run into synchronisation issues, at best.
>>>>
>>>>> 3. “Xen+SCP(ARM TF)” solution.
>>>>> It is yet another solution based on ARM SCPI protocol. The generic idea here is that there is a firmware, which being a server runs on some dedicated IP core (server), provides different PM services (DVFS, sensors, etc). On the other side there is a CPUFreq driver in Xen, which is running on the AP (client), consumes these services. CPUFreq driver neither changes the CPU frequency/voltage by itself nor cooperates with Linux in order to do such job. It just communicates with SCP directly using SCPI protocol. As I said before, some integrated into a SoC mailbox IP need to be used for IPC (doorbell for triggering action and shared memory region for commands). CPUFreq driver doesn’t even need to know what should be physically changed for the new frequency to take effect. It is a certainly SCP’s responsibility. This all avoid CPUFreq infrastructure in Xen on ARM from diving into each supported SoC internals and as the result having a lot of code.
>>>>>
>>>>> The possible issue here could be in SCP, the problem is that some dedicated IP core may be absent at all or performs other than PM tasks. Fortunately, there is a brilliant solution to teach firmware running in the EL3 exception level (ARM TF) to perform SCP functions and use SMC calls for communications [4]. Exactly this transport implementation I want to bring to Xen the first. Such solution is going to be generic across all ARM platforms that do have firmware running in the EL3 exception level and don’t have candidate for being SCP.
>>>>
>>>> While I feel flattered that you like that idea as well ;-), you should
>>>> mention that this requires actual firmware providing those services.
>>> Yes, some firmware, which provides these services, must be present
>>> on the other end.
>>> It is a firmware which runs on the dedicated IP core(s) in common case.
>>> And it is a firmware which runs on the same core(s) as the hypervisor
>>> in particular case.
>>>
>>>> I
>>>> am not sure there is actually *any* implementation of this at the
>>>> moment, apart from my PoC code for Allwinner.
>>> Your PoC is a good example for writing firmware side. So, why don't
>>> use it as a base for
>>> other platform.
>>
>> Sure, but normally firmware is provided by the vendor. And until more
>> vendors actually implement this, it's a bit weird to ask Xen users to
>> install this hand-crafted home-brew firmware to use this feature.
>> For a particular embedded use case like yours this might be feasible,
>> though.
> Agree. it is exactly for ARM SoCs with security extensions enabled,
> but where SCP isn't available.
> And such SoCs do exist.

Sure, also it depends on the accessibility of firmware. Some SoCs only
run signed firmware, or there is no source code for crucial firmware
components (SoC setup, DRAM init), so changing the firmware might not be
an option.

>>>> And from a Xen point of view I am not sure we are in the position to
>>>> force users to use this firmware. This may be feasible in a classic
>>>> embedded scenario, where both firmware and software are provided by the
>>>> same entity, but that should be clearly noted as a restriction.
>>> Agree.
>>>
>>>>
>>>>> Here we have completely synchronous case because of SMC calls nature. SMC triggered mailbox driver emulates a mailbox which signals transmitted data via Secure Monitor Call (SMC) instruction [5]. The mailbox receiver is implemented in firmware and synchronously returns data when it returns execution to the non-secure world again. This would allow us both to trigger a request and transfer execution to the firmware code in a safe and architected way. Like PSCI requests.
>>>>> As you can see this method is free from synchronization issues. What is more, this solution is more architecturally cleaner solution than split model “Xen+hwdom” one. From the security point of view, I hope, everything will be much more correct since the ARM TF, which we want to see in charge of controlling CPU frequency/voltage, is a trusted SW layer. Moreover, ARM TF is responsible for enabling/disabling CPU (PSCI) and nobody complains about it, so let it do DVFS too.
>>>>
>>>> It should be noted that this synchronous nature of the communication can
>>>> actually be a problem: a DVFS request usually involves regulator and PLL
>>>> changes, which could take some time to settle in. Blocking all of this
>>>> time (milliseconds?) in EL3 (probably busy-waiting) might not be desirable.
>>> Agree. I haven't measured time yet to say how long is it, since I
>>> don't have a working firmware at the moment, just an emulator,
>>> but, yes, it will definitely take some time. The whole system won't be
>>> blocked, only the CPU which performs SMC call.
>>> But, if we ask hwdom to change frequency we will wait too? Or if Xen
>>> manages PLL/regulator by itself, it will wait anyway?
>>
>> Normally this is done asynchronously. For instance the OS programs the
>> regulator to change the voltage, then does other things until the
>> regulator signals the change has been realised. Then it re-programs the
>> PLL, again executing other code, eventually being interrupted by a
>> completion interrupt (or by periodically polling a bit). If we need to
>> spend all of this time in EL3, the HV is blocked on this. This might or
>> might not be a problem, but it should be noted.
> Agree.
> 
>>
>>>>> I have to admit that I have checked this solution only due to a lack of candidate for being SCP. But, I hope, that other ARM SoCs where dedicated SCP is present (asynchronous case) will work too, but with some limitations. The mailbox IPs for these ARM SoCs must have TX/RX-done irqs. I have described in the corresponding patches why this limitation is present.
>>>>>
>>>>> To be honest I have Renesas R-Car Gen3 SoCs in mind as our nearest target, but I would like to make this solution as generic as possible. I don’t treat proposed solution as world-wide generic, but I hope, this solution may be suitable for other ARM SoCs which meet such requirements. Anyway, having something which works, but doesn’t cover all cases is better than having nothing.
>>>>>
>>>>> I would like to notice that the patches are POC state and I post them just to illustrate in more detail of what I am talking about. Patch series consist of the following parts:
>>>>> 1. GL’s patches which make ACPI specific CPUFreq stuff more generic. Although these patches has been already acked by Xen community and the CPUFreq code base hasn’t changed in a last few years I drop all A-b.
>>>>> 2. A bunch of device-tree helpers and macros.
>>>>> 3. Direct ported SCPI protocol, mailbox infrastructure and the ARM SMC triggered mailbox driver. All components except mailbox driver are in mainline Linux.
>>>>
>>>> Why do you actually need this mailbox framework? Actually I just
>>>> proposed the SMC driver to make it fit into the Linux framework. All we
>>>> actually need for SCPI is to write a simple command into some memory and
>>>> "press a button". I don't see a need to import the whole Linux
>>>> framework, especially as our mailbox usage is actually just a corner
>>>> case of the mailbox's capability (namely a "single-bit" doorbell).
>>>> The SMC use case is trivial to implement, and I believe using the Juno
>>>> mailbox is similarly simple, for instance.
>>> I did a direct port for SCPI protocol. I think, it is something that
>>> should be retained as much as possible.
>>
>> But the actual protocol is really simple. And we just need a subset of
>> it, namely to query and trigger OPPs.
> Yes. I think, that "Sensors service" is needed as well. I think that
> CPUFreq is not completed without thermal feedback.

Personally I think this should be handled by the SCPI firmware: if the
requested OPP would violate a thermal constraint, the firmware would just
not set it. Also (secure) temperature alarm interrupts could lower the OPP.
Doing this in firmware means it would just need to be implemented once,
and I consider this system critical, so firmware is conceptually the
better place for this code.

>>> Protocol relies on mailbox feature, so I ported mailbox too. I think,
>>> it would be much more easy for me to just add
>>> a few required commands handling with issuing SMC call and without any
>>> mailbox infrastructure involved.
>>> But, I want to show what is going on and what place these things come from.
>>
>> I appreciate that, but I think we already have enough "bloated" Linux +
>> glue code in Xen. And in particular the Linux mailbox framework is much
>> more powerful than we need for SCPI, so we have a lot of unneeded
>> functionality.
>> If we just want to support CPUfreq using SCPI via SMC/Juno MHU/Rockchip
>> mailbox, we can get away with a *much* simpler solution.
> 
> Agree, but I am afraid that simplifying things now might lead to some
> difficulties when there is a need
> to integrate a little bit different mailbox IP. Also, we need to
> recheck if SCMI, we might want to support as well,
> have the similar interface with mailbox.
> 
>> - We would need to port mailbox drivers one-by-one anyway, so we could
>> as well implement the simple "press-the-button" subset for each mailbox
>> separately. The interface between the SCPI code and the mailbox is
>> probably just "signal_mailbox()". For SMC it's trivial, and for the Juno
>> MHU it's also simple, I guess ([1], chapter 3.6).
>> - The SCPI message assembly is easy as well.
>> - The only other code needed is some DT parsing code to be compatible
>> with the existing DTs describing the SCPI implementation. We would claim
>> to have a mailbox driver for those compatibles, but cheat a bit since we
>> only use it for SCPI and just need the single bit subset of the mailbox.
> Yes, I think, we can optimize in a such way.
> 
> Just to clarify:
> Proposed "signal_mailbox" is intended for both actions: sending
> request and receiving response?
> So when it returns we will have either response or timeout error or
> some callback will be needed anyway?
> 
> I don't have any objections regarding optimizations, we need to
> decide what mailboxes we should stick to (we can support) and in what
> form we should keep
> all this stuff in.
> Also while making a decision, we need to keep in mind "direct ported
> code" advantages:
> - "direct ported code" (SCPI + mailbox) have had a thorough review by
> the Linux community and Xen community
>   may rely on their review.
> - As "direct ported code" wasn't changed heavily, I believe, it would
> be easy to backport fixes/features to Xen.

I understand that, but as I wrote in the other mail: This is a lean
hypervisor, not a driver and subsystem dump site. The security aspect of
just having much less code is crucial here.

> So, let's decide.
> 
>>
>>> What is more, I don't want to restrict a usage of this CPUFreq by only
>>> covering single scenario where a
>>> firmware, which provides DVFS service, is in ARM TF. I hope, that this
>>> solution will be suitable for ARM SoCs where a standalone SCP
>>> is present and real mailbox IP, which has asynchronous nature, is used
>>> for IPC. Of course, this mailbox must have TX/RX-done irqs.
>>> This is a limitation at the moment.
>>
>> Sure, see above and the document [1] below.
> Thank you for the link, it seems with MHU we have to poll for the
> last_tx_done (where deasserted interrupt line in a status register is
> a condition for)
> after pressing the button. Or I missed something?

It depends on whether we care. We could just treat this request in a
fire-and-forget manner. I am not sure how far Xen really needs to
know the actual OPP used and when it's ready.
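
To make the "write a command into memory and press a button" idea a bit
more concrete, here is a minimal sketch, not taken from the patch series.
All names in it (struct scpi_shmem, scpi_send, the signal hook, the fake
doorbell register) are hypothetical, the shared-memory layout is
simplified, and the doorbell is simulated in plain memory so the snippet
runs on a host. A real MHU or SMC backend would replace
fake_doorbell_signal() with a register write or an SMC call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified shared-memory layout: command word + payload. */
#define SHMEM_PAYLOAD_MAX 16

struct scpi_shmem {
    uint32_t command;                    /* illustrative command word */
    uint8_t  payload[SHMEM_PAYLOAD_MAX]; /* command arguments */
};

/* Hypothetical per-transport hook: "press the button" for one channel. */
typedef void (*signal_mailbox_fn)(void *transport, unsigned int channel);

struct scpi_channel {
    struct scpi_shmem *shmem;     /* memory shared with the firmware */
    signal_mailbox_fn  signal;    /* transport-specific doorbell */
    void              *transport; /* opaque transport state */
    unsigned int       channel;   /* doorbell bit to raise */
};

/* Stand-in for an MMIO "interrupt set" register, simulated in memory. */
struct fake_doorbell {
    uint32_t intr_set;
};

static void fake_doorbell_signal(void *transport, unsigned int channel)
{
    struct fake_doorbell *db = transport;

    assert(channel < 32);
    db->intr_set |= UINT32_C(1) << channel;
}

/*
 * Fire-and-forget send: copy the command into shared memory, then ring
 * the doorbell. Returns 0 on success, -1 if the payload does not fit.
 * On real hardware a write barrier would be needed before the doorbell.
 */
static int scpi_send(struct scpi_channel *ch, uint32_t command,
                     const void *payload, size_t len)
{
    assert(ch && ch->shmem && ch->signal);

    if (len > SHMEM_PAYLOAD_MAX)
        return -1;

    ch->shmem->command = command;
    if (len)
        memcpy(ch->shmem->payload, payload, len);

    ch->signal(ch->transport, ch->channel);
    return 0;
}
```

With this split the SCPI side only ever sees the signal hook, so adding
another mailbox means adding one more small function rather than a
controller/client framework. Whether a response path or completion
callback is needed on top is exactly the open question above.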

Cheers,
Andre.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


* Re: [RFC PATCH 00/31] CPUFreq on ARM
  2017-11-15 13:28           ` Andre Przywara
@ 2017-11-15 15:18             ` Jassi Brar
  0 siblings, 0 replies; 108+ messages in thread
From: Jassi Brar @ 2017-11-15 15:18 UTC (permalink / raw)
  To: Andre Przywara
  Cc: Edgar E . Iglesias, Stefano Stabellini, Andrew Cooper,
	Julien Grall, Oleksandr Tyshchenko, Oleksandr Tyshchenko,
	Jan Beulich, Sudeep Holla, xen-devel

On 15 November 2017 at 18:58, Andre Przywara <andre.przywara@linaro.org> wrote:
> Hi,
>
> On 15/11/17 03:03, Jassi Brar wrote:
>> On 15 November 2017 at 02:16, Oleksandr Tyshchenko <olekstysh@gmail.com> wrote:
>>> On Tue, Nov 14, 2017 at 12:49 PM, Andre Przywara
>>> <andre.przywara@linaro.org> wrote:
>>>
>>
>>>>>>> 3. Direct ported SCPI protocol, mailbox infrastructure and the ARM SMC triggered mailbox driver. All components except mailbox driver are in mainline Linux.
>>>>>>
>>>>>> Why do you actually need this mailbox framework?
>>>
>> It is unnecessary if you are always going to use one particular signal
>> mechanism, say SMC. However ...
>>
>>>>>> Actually I just
>>>>>> proposed the SMC driver to make it fit into the Linux framework. All we
>>>>>> actually need for SCPI is to write a simple command into some memory and
>>>>>> "press a button". I don't see a need to import the whole Linux
>>>>>> framework, especially as our mailbox usage is actually just a corner
>>>>>> case of the mailbox's capability (namely a "single-bit" doorbell).
>>>>>> The SMC use case is trivial to implement, and I believe using the Juno
>>>>>> mailbox is similarly simple, for instance.
>>>
>> ... It's going to be SMC and MHU now... and you talk about Rockchip as
>> well later. That becomes unwieldy.
>>
>>
>>>>
>>>>> Protocol relies on mailbox feature, so I ported mailbox too. I think,
>>>>> it would be much more easy for me to just add
>>>>> a few required commands handling with issuing SMC call and without any
>>>>> mailbox infrastructure involved.
>>>>> But, I want to show what is going on and what place these things come from.
>>>>
>>>> I appreciate that, but I think we already have enough "bloated" Linux +
>>>> glue code in Xen. And in particular the Linux mailbox framework is much
>>>> more powerful than we need for SCPI, so we have a lot of unneeded
>>>> functionality.
>>>
>> That is a painful misconception.
>> Mailbox api is designed to be (almost) as light weight as being
>> transparent. Please have a look at mbox_send_message() and see how
>> negligible overhead it adds for "SMC controller" that you compare
>> against here..... just integer manipulations protected by a spinlock.
>> Of course if your protocol needs async messaging, you pay the price
>> but only fair.
>
> Normally I would agree on importing some well designed code rather than
> hacking up something yourself.
>
> BUT: This is Xen, which is meant to be lean, micro-kernel like
> hypervisor. If we now add code from Linux, there must be a good
> rationale why we need it. And this is why we need to make sure that
> CPUFreq is really justified in the first place.
> So I am a bit wary that pulling some rather unrelated Linux *framework*
> into Xen bloats it up and introduces more burden to the trusted code
> base. With SCPI being the only user, this controller - client
> abstraction is not really needed. And to just trigger an interrupt on
> the SCP side we just need to:
>         writel(BIT(channel), base_addr + CPU_INTR_H_SET);
>
> I expect other mailboxes to be similarly simple.
> The only other code needed is some DT parsing.
>
> That being said I haven't looked too closely how much code this actually
> pulls in, it is just my gut feeling that it's a bit over the top,
> conceptually.
>
Please do have a look and let me know how it drags the SCP down.

Thanks


* Re: [RFC PATCH 00/31] CPUFreq on ARM
  2017-11-15 14:28         ` Andre Przywara
@ 2017-11-16 14:57           ` Oleksandr Tyshchenko
  2017-11-16 17:04             ` Andre Przywara
  0 siblings, 1 reply; 108+ messages in thread
From: Oleksandr Tyshchenko @ 2017-11-16 14:57 UTC (permalink / raw)
  To: Andre Przywara, Jassi Brar
  Cc: Edgar E . Iglesias, Stefano Stabellini, Andrew Cooper,
	Julien Grall, Oleksandr Tyshchenko, Jan Beulich, Sudeep Holla,
	xen-devel

On Wed, Nov 15, 2017 at 4:28 PM, Andre Przywara
<andre.przywara@linaro.org> wrote:
> Hi,
Hi Andre, Jassi

Thank you for your comments!

>
> On 14/11/17 20:46, Oleksandr Tyshchenko wrote:
>> On Tue, Nov 14, 2017 at 12:49 PM, Andre Przywara
>> <andre.przywara@linaro.org> wrote:
>>> Hi,
>> Hi Andre
>>
>>>
>>> On 13/11/17 19:40, Oleksandr Tyshchenko wrote:
>>>> On Mon, Nov 13, 2017 at 5:21 PM, Andre Przywara
>>>> <andre.przywara@linaro.org> wrote:
>>>>> Hi,
>>>> Hi Andre,
>>>>
>>>>>
>>>>> thanks very much for your work on this!
>>>> Thank you for your comments.
>>>>
>>>>>
>>>>> On 09/11/17 17:09, Oleksandr Tyshchenko wrote:
>>>>>> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>>>>>>
>>>>>> Hi, all.
>>>>>>
>>>>>> The purpose of this RFC patch series is to add CPUFreq support to Xen on ARM.
>>>>>> Motivation of hypervisor based CPUFreq is to enable one of the main PM use-cases in virtualized system powered by Xen hypervisor. Rationale behind this activity is that CPU virtualization is done by hypervisor and the guest OS doesn't actually know anything about physical CPUs because it is running on virtual CPUs. It is quite clear that a decision about frequency change should be taken by hypervisor as only it has information about actual CPU load.
>>>>>
>>>>> Can you please sketch your usage scenario or workloads here? I can think
>>>>> of quite different scenarios (oversubscribed server vs. partitioning
>>>>> RTOS guests, for instance). The usefulness of CPUFreq and the trade-offs
>>>>> in the design are quite different between those.
>>>> We keep embedded use-cases in mind. For example, it is a system with
>>>> several domains,
>>>> where one domain has most critical SW running on and other domain(s)
>>>> are, let say, for entertainment purposes.
>>>> I think, the CPUFreq is useful where power consumption is a question.
>>>
>>> Does the SoC you use allow different frequencies for each core? Or is it
>>> one frequency for all cores? Most x86 CPU allow different frequencies
>>> for each core, AFAIK. Just having the same OPP for the whole SoC might
>>> limit the usefulness of this approach in general.
>> Good question. All cores in a cluster share the same clock. It is
>> impossible to set different frequencies on the cores inside one
>> cluster.
>>
>>>
>>>>> In general I doubt that a hypervisor scheduling vCPUs is in a good
>>>>> position to make a decision on the proper frequency physical CPUs should
>>>>> run with. From all I know it's already hard for an OS kernel to make
>>>>> that call. So I would actually expect that guests provide some input,
>>>>> for instance by signalling OPP change request up to the hypervisor. This
>>>>> could then decide to act on it - or not.
>>>> Each running guest sees only part of the picture, but hypervisor has
>>>> the whole picture, it knows all about CPU, measures CPU load and able
>>>> to choose required CPU frequency to run on.
>>>
>>> But based on what data? All Xen sees is a vCPU trapping on MMIO, a
>>> hypercall or on WFI, for that matter. It does not know much more about
>>> the guest, especially it's rather clueless about what the guest OS
>>> actually intended to do.
>>> For instance Linux can track the actual utilization of a core by keeping
>>> statistics of runnable processes and monitoring their time slice usage.
>>> It can see that a certain process exhibits periodical, but bursty CPU
>>> usage, which may hint that it could run at lower frequency. Xen does not
>>> see this fine granular information.
>>>
>>>> I am wondering, does Xen
>>>> need additional input from guests for make a decision?
>>>
>>> I very much believe so. The guest OS is in a much better position to
>>> make that call.
>>>
>>>> BTW, currently guest domain on ARM doesn't even know how many physical
>>>> CPUs the system has and what are these OPPs. When creating guest
>>>> domain Xen inserts only dummy CPU nodes. All CPU info, such as clocks,
>>>> OPPs, thermal, etc are not passed to guest.
>>>
>>> Sure, because this is what virtualization is about. And I am not asking
>>> for unconditionally allowing any guest to change frequency.
>>> But there could be certain use cases where this could be considered:
>>> Think about your "critical SW" mentioned above, which is probably some
>>> RTOS, also possibly running on pinned vCPUs. For that
>>> (latency-sensitive) guest it might be well suited to run at a lower
>>> frequency for some time, but how should Xen know about this?
>>> "Normally" the best strategy to save power is to run as fast as
>>> possible, finish all outstanding work, then put the core to sleep.
>>> Because not running at all consumes much less energy than running at a
>>> reduced frequency. But this may not be suitable for an RTOS.
>> Saying "one domain has most critical SW running on" I meant hardware
>> domain/driver domain or even other
>> domain which perform some important tasks (disk, net, display, camera,
>> whatever) which treated by the whole system as critical
>> and must never fail. Other domains, for example, it might be Android
>> as well, are not critical at all from the system point of view.
>> Being honest, I haven't considered yet using CPUFreq in system where
>> some RT guest is present.
>> I think it is something that should be *thoroughly* investigated and
>> then worked out.
>
> Yes, as mentioned before there are quite different use cases with quite
> different requirements when it comes to DVFS.
> I believe the best would be to define typical scenarios, then assess the
> usefulness of CPUFreq separately for each one of them.
> Based on this we then should be able to make a decision.

Agree here.
Well, let's imagine the following use-case(s), maybe too complex, but it
might take place.
The ARM SoC is big.LITTLE and has >=1 big core(s) and >=1 little
core(s) with the following abilities:
1. The big core(s) are DVFS capable (>1 OPP) and the little core(s) aren't
(1 OPP), or vice versa.
2. Both types are DVFS capable.
The system which runs on this SoC has 3 guests:
1. Thin dom0, has some storage driver (mmc, sata, whatever) with
blkback running.
Tasks:
- Running VM
- Watchdog
- vbd support
2. Driver domain (maybe RT-guest: Linux with RT infra or even some
RTOS, maybe non-RT-guest)
For example, instrument cluster.
Tasks:
- Gears
- RVC
- OpenCL
- 3D UI
- vdispl, vsnd, vif, vusb, (vbd) support.
3. Entertainment domain.
For example, Android.
Tasks:
- Navi(Maps)
- Multimedia(Audio/Video)
- Cell
- OTA
- Third-party apps
Also, such a system might be battery-powered.

>
>> I am not familiar with RT system requirements, I suppose, but not
>> entirely sure, that CPUFreq should use const
>> frequency for all cores the RT system is running on, or RT system
>> parameters should be recalculated each time the CPU frequency is being
>> changed
>> (in such case guest needs some input from Xen).
>>
>> Anyway, I got your point about some guest input. Could you, please,
>> describe how you think it should look like:
>> 1. Xen doesn't have CPUFreq logic at all. It only collects OPP change
>> requests from all guests and make
>> a decision based on these requests and maybe some policy for
>> prioritizing requests. Then it sends OPP change request to SCP.
>> 2. Xen has CPUFreq logic. In addition it can collect OPP change
>> requests from all guests and make
>> a decision based on both: it's own view and guest requests. Then it
>> sends OPP change request to SCP.
>
> I am leaning towards 1) conceptually. But if there is some kind of
> reasonable implementation of 2) already in Xen (for x86), this might be
> feasible as well.

Sure, Xen has a common CPUFreq infra (core, set of governors) and
two ACPI P-state CPUFreq drivers. This patch series adds an SCPI-based
CPUFreq driver which, like the existing drivers, just
issues the command to change the CPU frequency.
The entity which decides what CPU frequency to set next is already present.

I got your point. I think that approach 1 is radically different from
what we have in Xen for x86 these days.
Anyway, we need to weigh all pros and cons to decide what direction
we want to follow.

BTW, I see that the existing CPUFreq drivers can read some performance counters
to measure performance over a period of time, and this measured
performance can then be used as an additional input for
the governor. Do we have something similar on ARM?
I was thinking about how to take the guests' OPP
change requests into account from the governor's perspective,
and these "requests" might be treated like performance counters.

>
>> Both variant implies that something like PV CPUFreq should be involved
>> with frontend drivers are located in guests. Am I correct?
>
> And here the SMC mailbox comes into play again, but with a twist. For
> guests we create SCPI, mailbox and shmem DT nodes, and use the SMC
> mailbox with: method = "hvc";. Xen's HVC handles then redirects this to
> the CPUFreq code.
> This would be platform agnostic for the guests, while making all CPUFreq
> requests ending up in Xen. So there is no need for an extra PV protocol.

This idea is indeed interesting.

Could you please answer these questions:
1. As I understand it, in Xen we have to emulate all DVFS
related commands, i.e. act as an SCP for the guests?
2. How do we recognize from a guest's OPP change request on which
physical CPU it wants to change the frequency?
    Do we need to pin the guest's vCPU to the respective pCPU?
3. The Linux "SCPI CPUFreq Interface driver" is tied to the "ARM big.LITTLE
Platforms CPUFreq driver", so will the latter be "happy"
    to work with the virtual CPUs a particular guest is running on?
4. Together with creating dummy SCPI nodes for the guest we have to insert
a clock specifier into the CPU node
    which we expose to the guest (clocks = <&scpi_dvfs 0>;). Correct?
5. Will there be any synchronization issues if two guests send
OPP change requests at the same time?

>
>>> So I think we would need a combined approach:
>>> a) Let an administrator (via tools running in Dom0) tell Xen about power
>>> management strategies to use for certain guests. An RTOS could be
>>> treated differently (lower, but constant frequency) than an
>>> "entertainment" guest (varying frequency, based on guest OS input), also
>>> differently than some background guest doing logging, OTA update, etc.
>>> (constant high frequency, but putting cores to sleep instead as often as
>>> possible).
>>> b) Allow some guests (based on policy from (a)) to signal CPUFreq change
>>> requests to the hypervisor. Xen takes those into account, though it may
>>> decide to not act immediately on it, because it is going to schedule
>>> another vCPU, for instance.
>>> c) Have some way of actually realising certain OPPs. This could be via
>>> an SCPI client in Xen, or some other way. Might be an implementation detail.
>>
>> Just to clarify if I got the main idea correct:
>> 1. Guests have CPUFreq logic, they send OPP change requests to Xen.
>> 2. Xen has CPUFreq logic too, but in additional it can take into the account OPP
>>     change requests from guests. Xen sends final OPP change request.
>> Is my understanding correct?
>
> Yes, I think this sounds like the most flexible. Xen's CPUFreq logic
> could be quite simple, possibly starting with some static assignment
> based on administrator input, e.g. given at guest creation time.
> It might not involve further runtime decisions.
>
>> Also "Different power management strategies to use for certain guests"
>> means that it should be
>> hard vCPU->pCPU pinning for each guest together with possibility in
>> Xen to have different CPUFreq governors
>> running at the same time (each governor for each CPU pool)?
>
> That would need to be worked out, but I suspect that CPU pinning might
> be *one* option for a certain class of guests. This would probably be
> related to the CPUFreq policy. Without pinning the decision might become
> quite involved: If Xen wants to migrate a vCPU to a different pCPU, it
> needs to take the different P-states into account, including the cost to
> change the OPP. I am not sure the benefit justifies the effort. Some
> numbers would help here.

I can't even imagine the development effort of adding the ability to have
different CPUFreq policies for different CPUs in Xen. Another question is
whether this is feasible at all if all cores share one OPP;
I am afraid it is not.

Anyway, I think we should go step-by-step.
If the community agrees that the CPUFreq feature in Xen on ARM is needed and
that the SCPI/SCMI based approach
is the right thing to do in general, I would stick to the following, taking
Andre's suggestions regarding guest input into account:

1. Xen does have CPUFreq logic. It measures CPU utilization by itself.
2. In addition it can collect OPP change requests from the guests:
  - There are some policies describing which guests are allowed to send
OPP change requests.
  - Of course, the involved guests have CPUFreq enabled. All we need is
that these OPP change requests don't lead to
    any physical changes and are picked up by Xen. We could use
Andre's idea here (SCPI CPUFreq + SMC mailbox with hvc method).
3. Xen makes a decision based on the whole system status it measures
periodically and the guests' input (OPP change requests), if present.
4. Xen actually issues the command to change the CPU frequency (sends an OPP
change request to the SCP).

How does it sound?
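
To make step 3 of the list above a bit more concrete, here is a rough
sketch in C of how the two inputs might be combined. Everything in it is
hypothetical and only for illustration: the struct, the function names
(opp_from_load(), pick_opp()), the linear load-to-OPP mapping and the
rule that guest requests may only raise Xen's own choice are assumptions,
not something taken from the patch series or from existing Xen code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-guest input; all names are illustrative only. */
struct guest_opp_req {
    bool allowed;      /* policy: may this guest send OPP requests? */
    bool pending;      /* did the guest send a request in this period? */
    unsigned int opp;  /* requested OPP index, 0 = lowest */
};

/* Map Xen's own utilization measurement (0..100 %) to an OPP index.
 * A linear, rounded-up mapping is assumed here for simplicity. */
static unsigned int opp_from_load(unsigned int load_pct, unsigned int nr_opps)
{
    if (load_pct > 100)
        load_pct = 100;
    return (load_pct * (nr_opps - 1) + 99) / 100;
}

/* Step 3: pick the next OPP. Xen's own view is the floor; requests from
 * permitted guests can only raise it (an assumed rule, not a given). */
static unsigned int pick_opp(unsigned int load_pct, unsigned int nr_opps,
                             const struct guest_opp_req *reqs, size_t nr_reqs)
{
    unsigned int opp = opp_from_load(load_pct, nr_opps);

    for (size_t i = 0; i < nr_reqs; i++) {
        unsigned int want;

        if (!reqs[i].allowed || !reqs[i].pending)
            continue;               /* ignored by policy or nothing asked */

        /* Clamp out-of-range requests to the highest OPP. */
        want = reqs[i].opp < nr_opps ? reqs[i].opp : nr_opps - 1;
        if (want > opp)
            opp = want;
    }
    return opp;                     /* step 4 would send this to the SCP */
}
```

With four OPPs, an idle system and one permitted guest asking for OPP 3,
pick_opp() would return 3. The same request from a guest that is not
allowed by policy would be ignored.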

>
>>>>>> Although these required components (CPUFreq core, governors, etc) already exist in Xen, it is worth to mention that they are ACPI specific. So, a part of the current patch series makes them more generic in order to make possible a CPUFreq usage on architectures without ACPI support in.
>>>>>
>>>>> Have you looked at how this is used on x86 these days? Can you briefly
>>>>> describe how this works and it's used there?
>>>> Xen supports CPUFreq feature on x86 [1]. I don't know how widely it is
>>>> used at the moment, but it is another question. So, there are two
>>>> possible modes: Domain0 based CPUFreq and Hypervisor based CPUFreq
>>>> [2]. As I understand, the second option is more popular.
>>>> Two different implementations of "Hypervisor based CPUFreq" are
>>>> present: ACPI Processor P-States Driver and AMD Architectural P-state
>>>> Driver. You can find both them in xen/arch/x86/acpi/cpufreq/ dir.
>>>>
>>>> [1] https://wiki.xenproject.org/wiki/Xen_power_management#CPU_P-states_.28cpufreq.29
>>>> [2] https://wiki.xenproject.org/wiki/Xen_power_management#Hypervisor_based_cpufreq
>>>
>>> Thanks for the research and the pointers, will look at it later.
>>>
>>>>>> But, the main question we have to answer is about frequency changing interface in virtualized system. The frequency changing interface and all dependent components which needed CPUFreq to be functional on ARM are not present in Xen these days. The list of required components is quite big and may change across different ARM SoC vendors. As an example, the following components are involved in DVFS on Renesas Salvator-X board which has R-Car Gen3 SoC installed: generic clock, regulator and thermal frameworks, Vendor’s CPG, PMIC, AVS, THS drivers, i2c support, etc.
>>>>>>
>>>>>> We were considering a few possible approaches of hypervisor based CPUFreqs on ARM and came to conclusion to base this solution on popular at the moment, already upstreamed to Linux, ARM System Control and Power Interface(SCPI) protocol [1]. We chose SCPI protocol instead of newer ARM System Control and Management Interface (SCMI) protocol [2] since it is widely spread in Linux, there are good examples how to use it, the range of capabilities it has is enough for implementing hypervisor based CPUFreq and, what is more, upstream Linux support for SCMI is missed so far, but SCMI could be used as well.
>>>>>>
>>>>>> Briefly speaking, the SCPI protocol is used between the System Control Processor(SCP) and the Application Processors(AP). The mailbox feature provides a mechanism for inter-processor communication between SCP and AP. The main purpose of SCP is to offload different PM related tasks from AP and one of the services that SCP provides is Dynamic voltage and frequency scaling (DVFS), it is what we actually need for CPUFreq. I will describe this approach in details down the text.
>>>>>>
>>>>>> Let me explain a bit more what these possible approaches are:
>>>>>>
>>>>>> 1. “Xen+hwdom” solution.
>>>>>> GlobalLogic team proposed split model [3], where “hwdom-cpufreq” frontend driver in Xen interacts with the “xen-cpufreq” backend driver in Linux hwdom (possibly dom0) in order to scale physical CPUs. This solution hasn’t been accepted by Xen community yet and seems it is not going to be accepted without taking into the account still unanswered major questions and proving that “all-in-Xen” solution, which Xen community considered as more architecturally cleaner option, would be unworkable in practice.
>>>>>> The other reasons why we decided not to stick to this approach are complex communication interface between Xen and hwdom: event channel, hypercalls, syscalls, passing CPU info via DT, etc and possible synchronization issues with a proposed solution.
>>>>>> Although it is worth to mention that the beauty of this approach was that there wouldn’t be a need to port a lot of things to Xen. All frequency changing interface and all dependent components which needed CPUFreq to be functional were already in place.
>>>>>
>>>>> Stefano, Julien and I were thinking about this: Wouldn't it be possible
>>>>> to come up with some hardware domain, solely dealing with CPUFreq
>>>>> changes? This could run a Linux kernel, but no or very little userland.
>>>>> All its vCPUs would be pinned to pCPUs and would normally not be
>>>>> scheduled by Xen. If Xen wants to change the frequency, it schedules the
>>>>> respective vCPU to the right pCPU and passes down the frequency change
>>>>> request. Sounds a bit involved, though, and probably doesn't solve the
>>>>> problem where this domain needs to share access to hardware with Dom0
>>>>> (clocks come to mind).
>>>> Yes, another question is how to get this Linux kernel stuff (backend,
>>>> top level driver, etc) upstreamed.
>>>
>>> Well, the idea would be to use already upstream drivers to actually
>>> implement OPP changes (via Linux clock and regulator drivers), then use
>>> existing interfaces like the userspace governor, for instance, to
>>> trigger those. I don't think we need much extra kernel code for that.
>> I understand. Backend in userspace sets desired frequency by request
>> from frontend in Xen.
>
> Yeah, something like that. It was just an idea, not fully thought
> through yet.
>
>>>>>> Although this approach is not used, still I picked a few already acked patches which made ACPI specific CPUFreq stuff more generic.
>>>>>>
>>>>>> 2. “all-in-Xen” solution.
>>>>>> This implies that all CPUFreq related stuff should be located in Xen.
>>>>>> Community considered this solution as more architecturally cleaner option than “Xen+hwdom” one. No layering violation comparing with the previous approach (letting guest OS manage one or more physical CPUs is more of a layering violation).
>>>>>> This solution looks better, but to be honest, we are not in favor of this solution as well. We expect enormous developing effort to get this support in (the scope of required components looks unreal) and maintain it. So, we decided not to stick to this approach as well.
>>>>>
>>>>> Yes, I even think it's not feasible to implement this. With a modern
>>>>> clock implementation there is one driver to control *all* clocks of an
>>>>> SoC, so you can't single out the CPU clock easily, for instance. One
>>>>> would probably run into synchronisation issues, at best.
>>>>>
>>>>>> 3. “Xen+SCP(ARM TF)” solution.
>>>>>> It is yet another solution based on ARM SCPI protocol. The generic idea here is that there is a firmware, which being a server runs on some dedicated IP core (server), provides different PM services (DVFS, sensors, etc). On the other side there is a CPUFreq driver in Xen, which is running on the AP (client), consumes these services. CPUFreq driver neither changes the CPU frequency/voltage by itself nor cooperates with Linux in order to do such job. It just communicates with SCP directly using SCPI protocol. As I said before, some integrated into a SoC mailbox IP need to be used for IPC (doorbell for triggering action and shared memory region for commands). CPUFreq driver doesn’t even need to know what should be physically changed for the new frequency to take effect. It is a certainly SCP’s responsibility. This all avoid CPUFreq infrastructure in Xen on ARM from diving into each supported SoC internals and as the result having a lot of code.
>>>>>>
>>>>>> The possible issue here could be in SCP, the problem is that some dedicated IP core may be absent at all or performs other than PM tasks. Fortunately, there is a brilliant solution to teach firmware running in the EL3 exception level (ARM TF) to perform SCP functions and use SMC calls for communications [4]. Exactly this transport implementation I want to bring to Xen the first. Such solution is going to be generic across all ARM platforms that do have firmware running in the EL3 exception level and don’t have candidate for being SCP.
>>>>>
>>>>> While I feel flattered that you like that idea as well ;-), you should
>>>>> mention that this requires actual firmware providing those services.
>>>> Yes, some firmware which provides these services must be present
>>>> on the other end.
>>>> It is a firmware which runs on the dedicated IP core(s) in common case.
>>>> And it is a firmware which runs on the same core(s) as the hypervisor
>>>> in particular case.
>>>>
>>>>> I
>>>>> am not sure there is actually *any* implementation of this at the
>>>>> moment, apart from my PoC code for Allwinner.
>>>> Your PoC is a good example for writing the firmware side. So, why not
>>>> use it as a base for other platforms?
>>>
>>> Sure, but normally firmware is provided by the vendor. And until more
>>> vendors actually implement this, it's a bit weird to ask Xen users to
>>> install this hand-crafted home-brew firmware to use this feature.
>>> For a particular embedded use case like yours this might be feasible,
>>> though.
>> Agree. it is exactly for ARM SoCs with security extensions enabled,
>> but where SCP isn't available.
>> And such SoCs do exist.
>
> Sure, also it depends on the accessibility of firmware. Some SoCs only
> run signed firmware, or there is no source code for crucial firmware
> components (SoC setup, DRAM init), so changing the firmware might not be
> an option.

Agree.

>
>>>>> And from a Xen point of view I am not sure we are in the position to
>>>>> force users to use this firmware. This may be feasible in a classic
>>>>> embedded scenario, where both firmware and software are provided by the
>>>>> same entity, but that should be clearly noted as a restriction.
>>>> Agree.
>>>>
>>>>>
>>>>>> Here we have completely synchronous case because of SMC calls nature. SMC triggered mailbox driver emulates a mailbox which signals transmitted data via Secure Monitor Call (SMC) instruction [5]. The mailbox receiver is implemented in firmware and synchronously returns data when it returns execution to the non-secure world again. This would allow us both to trigger a request and transfer execution to the firmware code in a safe and architected way. Like PSCI requests.
>>>>>> As you can see this method is free from synchronization issues. What is more, this solution is more architecturally cleaner solution than split model “Xen+hwdom” one. From the security point of view, I hope, everything will be much more correct since the ARM TF, which we want to see in charge of controlling CPU frequency/voltage, is a trusted SW layer. Moreover, ARM TF is responsible for enabling/disabling CPU (PSCI) and nobody complains about it, so let it do DVFS too.
>>>>>
>>>>> It should be noted that this synchronous nature of the communication can
>>>>> actually be a problem: a DVFS request usually involves regulator and PLL
>>>>> changes, which could take some time to settle in. Blocking all of this
>>>>> time (milliseconds?) in EL3 (probably busy-waiting) might not be desirable.
>>>> Agree. I haven't measured time yet to say how long is it, since I
>>>> don't have a working firmware at the moment, just an emulator,
>>>> but, yes, it will definitely take some time. The whole system won't be
>>>> blocked, only the CPU which performs SMC call.
>>>> But, if we ask hwdom to change frequency we will wait too? Or if Xen
>>>> manages PLL/regulator by itself, it will wait anyway?
>>>
>>> Normally this is done asynchronously. For instance the OS programs the
>>> regulator to change the voltage, then does other things until the
>>> regulator signals the change has been realised. Then it re-programs the
>>> PLL, again executing other code, eventually being interrupted by a
>>> completion interrupt (or by periodically polling a bit). If we need to
>>> spend all of this time in EL3, the HV is blocked on this. This might or
>>> might not be a problem, but it should be noted.
>> Agree.
>>
>>>
>>>>>> I have to admit that I have checked this solution only due to a lack of candidate for being SCP. But, I hope, that other ARM SoCs where dedicated SCP is present (asynchronous case) will work too, but with some limitations. The mailbox IPs for these ARM SoCs must have TX/RX-done irqs. I have described in the corresponding patches why this limitation is present.
>>>>>>
>>>>>> To be honest I have Renesas R-Car Gen3 SoCs in mind as our nearest target, but I would like to make this solution as generic as possible. I don’t treat proposed solution as world-wide generic, but I hope, this solution may be suitable for other ARM SoCs which meet such requirements. Anyway, having something which works, but doesn’t cover all cases is better than having nothing.
>>>>>>
>>>>>> I would like to notice that the patches are POC state and I post them just to illustrate in more detail of what I am talking about. Patch series consist of the following parts:
>>>>>> 1. GL’s patches which make ACPI specific CPUFreq stuff more generic. Although these patches has been already acked by Xen community and the CPUFreq code base hasn’t changed in a last few years I drop all A-b.
>>>>>> 2. A bunch of device-tree helpers and macros.
>>>>>> 3. Direct ported SCPI protocol, mailbox infrastructure and the ARM SMC triggered mailbox driver. All components except mailbox driver are in mainline Linux.
>>>>>
>>>>> Why do you actually need this mailbox framework? Actually I just
>>>>> proposed the SMC driver to make it fit into the Linux framework. All we
>>>>> actually need for SCPI is to write a simple command into some memory and
>>>>> "press a button". I don't see a need to import the whole Linux
>>>>> framework, especially as our mailbox usage is actually just a corner
>>>>> case of the mailbox's capability (namely a "single-bit" doorbell).
>>>>> The SMC use case is trivial to implement, and I believe using the Juno
>>>>> mailbox is similarly simple, for instance.
>>>> I did a direct port for SCPI protocol. I think, it is something that
>>>> should be retained as much as possible.
>>>
>>> But the actual protocol is really simple. And we just need a subset of
>>> it, namely to query and trigger OPPs.
>> Yes. I think, that "Sensors service" is needed as well. I think that
>> CPUFreq is not completed without thermal feedback.
>
> Personally I think this should be handled by the SCPI firmware: if the
> requested OPP would violate thermal constraint, the firmware would just
> not set it. Also (secure) temperature alarm interrupts could lower the OPP.
> Doing this in firmware means it would just need to be implemented once,
> and I consider this system critical, so firmware is conceptually the
> better place for this code.

Sounds reasonable to me.

>
>>>> Protocol relies on mailbox feature, so I ported mailbox too. I think,
>>>> it would be much more easy for me to just add
>>>> a few required commands handling with issuing SMC call and without any
>>>> mailbox infrastructure involved.
>>>> But, I want to show what is going on and what place these things come from.
>>>
>>> I appreciate that, but I think we already have enough "bloated" Linux +
>>> glue code in Xen. And in particular the Linux mailbox framework is much
>>> more powerful than we need for SCPI, so we have a lot of unneeded
>>> functionality.
>>> If we just want to support CPUfreq using SCPI via SMC/Juno MHU/Rockchip
>>> mailbox, we can get away with a *much* simpler solution.
>>
>> Agree, but I am afraid that simplifying things now might lead to some
>> difficulties when there is a need
>> to integrate a little bit different mailbox IP. Also, we need to
>> recheck if SCMI, we might want to support as well,
>> have the similar interface with mailbox.
>>
>>> - We would need to port mailbox drivers one-by-one anyway, so we could
>>> as well implement the simple "press-the-button" subset for each mailbox
>>> separately. The interface between the SCPI code and the mailbox is
>>> probably just "signal_mailbox()". For SMC it's trivial, and for the Juno
>>> MHU it's also simple, I guess ([1], chapter 3.6).
>>> - The SCPI message assembly is easy as well.
>>> - The only other code needed is some DT parsing code to be compatible
>>> with the existing DTs describing the SCPI implementation. We would claim
>>> to have a mailbox driver for those compatibles, but cheat a bit since we
>>> only use it for SCPI and just need the single bit subset of the mailbox.
>> Yes, I think, we can optimize in a such way.
>>
>> Just to clarify:
>> Proposed "signal_mailbox" is intended for both actions: sending
>> request and receiving response?
>> So when it returns we will have either response or timeout error or
>> some callback will be needed anyway?
>>
>> I don't have any objections regarding optimizations, we need to
>> decide what mailboxes we should stick to (we can support) and in what
>> form we should keep
>> all this stuff in.
>> Also while making a decision, we need to keep in mind "direct ported
>> code" advantages:
>> - "direct ported code" (SCPI + mailbox) have had a thorough review by
>> the Linux community and Xen community
>>   may rely on their review.
>> - As "direct ported code" wasn't changed heavily, I believe, it would
>> be easy to backport fixes/features to Xen.
>
> I understand that, but as I wrote in the other mail: This is a lean
> hypervisor, not a driver and subsystem dump site. The security aspect of
>  just having much less code is crucial here.
>
>> So, let's decide.
>>
>>>
>>>> What is more, I don't want to restrict a usage of this CPUFreq by only
>>>> covering single scenario where a
>>>> firmware, which provides DVFS service, is in ARM TF. I hope, that this
>>>> solution will be suitable for ARM SoCs where a standalone SCP
>>>> is present and real mailbox IP, which has asynchronous nature, is used
>>>> for IPC. Of course, this mailbox must have TX/RX-done irqs.
>>>> This is a limitation at the moment.
>>>
>>> Sure, see above and the document [1] below.
>> Thank you for the link, it seems with MHU we have to poll for the
>> last_tx_done (where deasserted interrupt line in a status register is
>> a condition for)
>> after pressing the button. Or I missed something?
>
> It depends on whether we care. We could just treat this request in a
> fire-and-forget manner. I am not sure in how far Xen really needs to
> know the actual OPP used and when it's ready.

I got your point.

There is a "get" callback for CPUFreq drivers, through which the CPUFreq
core expects to get the current frequency.
The current frequency is also needed as an initial condition. We might
guess it, but why guess if SCPI allows us to retrieve it?

Personally I think that although the "fire-and-forget" manner has an
advantage (the code is much simpler), we would never know what is going
on in case of errors. There are, I think, a few reasons why the firmware
might not process a request.
I agree that, for asynchronous mailboxes, we could try not to wait for
the real TX-done condition at all if we are not going to queue requests.
If we have already got a response, it is quite clear that the request
has successfully reached the other end; if we got a timeout error,
something bad has happened and we could treat it as a global connection
error, for example.
But responses are something we should handle.
So I believe we will be able to handle MHU as well as Rockchip and other
mailbox IPs which do have an RX-done irq.
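
A minimal sketch of what I mean, in C: press the button, do not wait for
TX-done, and only look for the response or a timeout. All names here
(struct mbox_ops, scpi_xfer(), the fake_* helpers) are made up for this
mail, and polling stands in for the RX-done irq only to keep the example
self-contained.

```c
#include <stdbool.h>

enum scpi_xfer_status { XFER_OK, XFER_TIMEOUT };

/* Hypothetical per-mailbox hooks; a real driver would fill these in. */
struct mbox_ops {
    void (*ring_doorbell)(void *priv);   /* "press the button" */
    bool (*response_ready)(void *priv);  /* has the response arrived? */
    void *priv;
};

/* Send one request and wait for its response. TX-done is not checked:
 * a response implies the request reached the other end, and a timeout
 * is reported so the caller can treat it as a connection error. */
static enum scpi_xfer_status scpi_xfer(const struct mbox_ops *ops,
                                       unsigned int max_polls)
{
    ops->ring_doorbell(ops->priv);

    for (unsigned int i = 0; i < max_polls; i++)
        if (ops->response_ready(ops->priv))
            return XFER_OK;

    return XFER_TIMEOUT;
}

/* Fake mailbox used only to exercise the code above. */
struct fake_mbox {
    bool rung;                /* doorbell has been pressed */
    unsigned int polls_left;  /* polls to burn before the reply shows up */
    bool never_reply;         /* simulate firmware that stays silent */
};

static void fake_ring(void *p)
{
    ((struct fake_mbox *)p)->rung = true;
}

static bool fake_ready(void *p)
{
    struct fake_mbox *m = p;

    if (!m->rung || m->never_reply)
        return false;
    if (m->polls_left > 0) {
        m->polls_left--;
        return false;
    }
    return true;
}
```

The "get" callback could then be built on top of such a transfer, and a
timeout would be the single place where we learn that the firmware did
not process the request.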

>
> Cheers,
> Andre.

-- 
Regards,

Oleksandr Tyshchenko


* Re: [RFC PATCH 00/31] CPUFreq on ARM
  2017-11-16 14:57           ` Oleksandr Tyshchenko
@ 2017-11-16 17:04             ` Andre Przywara
  2017-11-17 14:01               ` Julien Grall
  2017-11-17 14:55               ` Oleksandr Tyshchenko
  0 siblings, 2 replies; 108+ messages in thread
From: Andre Przywara @ 2017-11-16 17:04 UTC (permalink / raw)
  To: Oleksandr Tyshchenko, Jassi Brar
  Cc: Edgar E . Iglesias, Stefano Stabellini, Andrew Cooper,
	Julien Grall, Oleksandr Tyshchenko, Jan Beulich, Sudeep Holla,
	xen-devel

Hi,

On 16/11/17 14:57, Oleksandr Tyshchenko wrote:
> On Wed, Nov 15, 2017 at 4:28 PM, Andre Przywara
> <andre.przywara@linaro.org> wrote:
>> Hi,
> Hi Andre, Jassi
> 
> Thank you for your comments!
> 
>>
>> On 14/11/17 20:46, Oleksandr Tyshchenko wrote:
>>> On Tue, Nov 14, 2017 at 12:49 PM, Andre Przywara
>>> <andre.przywara@linaro.org> wrote:
>>>> Hi,
>>> Hi Andre
>>>
>>>>
>>>> On 13/11/17 19:40, Oleksandr Tyshchenko wrote:
>>>>> On Mon, Nov 13, 2017 at 5:21 PM, Andre Przywara
>>>>> <andre.przywara@linaro.org> wrote:
>>>>>> Hi,
>>>>> Hi Andre,
>>>>>
>>>>>>
>>>>>> thanks very much for your work on this!
>>>>> Thank you for your comments.
>>>>>
>>>>>>
>>>>>> On 09/11/17 17:09, Oleksandr Tyshchenko wrote:
>>>>>>> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>>>>>>>
>>>>>>> Hi, all.
>>>>>>>
>>>>>>> The purpose of this RFC patch series is to add CPUFreq support to Xen on ARM.
>>>>>>> Motivation of hypervisor based CPUFreq is to enable one of the main PM use-cases in virtualized system powered by Xen hypervisor. Rationale behind this activity is that CPU virtualization is done by hypervisor and the guest OS doesn't actually know anything about physical CPUs because it is running on virtual CPUs. It is quite clear that a decision about frequency change should be taken by hypervisor as only it has information about actual CPU load.
>>>>>>
>>>>>> Can you please sketch your usage scenario or workloads here? I can think
>>>>>> of quite different scenarios (oversubscribed server vs. partitioning
>>>>>> RTOS guests, for instance). The usefulness of CPUFreq and the trade-offs
>>>>>> in the design are quite different between those.
>>>>> We keep embedded use-cases in mind. For example, it is a system with
>>>>> several domains,
>>>>> where one domain has most critical SW running on and other domain(s)
>>>>> are, let say, for entertainment purposes.
>>>>> I think, the CPUFreq is useful where power consumption is a question.
>>>>
>>>> Does the SoC you use allow different frequencies for each core? Or is it
>>>> one frequency for all cores? Most x86 CPU allow different frequencies
>>>> for each core, AFAIK. Just having the same OPP for the whole SoC might
>>>> limit the usefulness of this approach in general.
>>> Good question. All cores in a cluster share the same clock. It is
>>> impossible to set different frequencies on the cores inside one
>>> cluster.
>>>
>>>>
>>>>>> In general I doubt that a hypervisor scheduling vCPUs is in a good
>>>>>> position to make a decision on the proper frequency physical CPUs should
>>>>>> run with. From all I know it's already hard for an OS kernel to make
>>>>>> that call. So I would actually expect that guests provide some input,
>>>>>> for instance by signalling OPP change request up to the hypervisor. This
>>>>>> could then decide to act on it - or not.
>>>>> Each running guest sees only part of the picture, but hypervisor has
>>>>> the whole picture, it knows all about CPU, measures CPU load and able
>>>>> to choose required CPU frequency to run on.
>>>>
>>>> But based on what data? All Xen sees is a vCPU trapping on MMIO, a
>>>> hypercall or on WFI, for that matter. It does not know much more about
>>>> the guest, especially it's rather clueless about what the guest OS
>>>> actually intended to do.
>>>> For instance Linux can track the actual utilization of a core by keeping
>>>> statistics of runnable processes and monitoring their time slice usage.
>>>> It can see that a certain process exhibits periodical, but bursty CPU
>>>> usage, which may hint that it could run at lower frequency. Xen does not
>>>> see this fine granular information.
>>>>
>>>>> I am wondering, does Xen
>>>>> need additional input from guests for make a decision?
>>>>
>>>> I very much believe so. The guest OS is in a much better position to
>>>> make that call.
>>>>
>>>>> BTW, currently guest domain on ARM doesn't even know how many physical
>>>>> CPUs the system has and what are these OPPs. When creating guest
>>>>> domain Xen inserts only dummy CPU nodes. All CPU info, such as clocks,
>>>>> OPPs, thermal, etc are not passed to guest.
>>>>
>>>> Sure, because this is what virtualization is about. And I am not asking
>>>> for unconditionally allowing any guest to change frequency.
>>>> But there could be certain use cases where this could be considered:
>>>> Think about your "critical SW" mentioned above, which is probably some
>>>> RTOS, also possibly running on pinned vCPUs. For that
>>>> (latency-sensitive) guest it might be well suited to run at a lower
>>>> frequency for some time, but how should Xen know about this?
>>>> "Normally" the best strategy to save power is to run as fast as
>>>> possible, finish all outstanding work, then put the core to sleep.
>>>> Because not running at all consumes much less energy than running at a
>>>> reduced frequency. But this may not be suitable for an RTOS.
>>> Saying "one domain has most critical SW running on" I meant hardware
>>> domain/driver domain or even other
>>> domain which perform some important tasks (disk, net, display, camera,
>>> whatever) which treated by the whole system as critical
>>> and must never fail. Other domains, for example, it might be Android
>>> as well, are not critical at all from the system point of view.
>>> Being honest, I haven't considered yet using CPUFreq in system where
>>> some RT guest is present.
>>> I think it is something that should be *thoroughly* investigated and
>>> then worked out.
>>
>> Yes, as mentioned before there are quite different use cases with quite
>> different requirements when it comes to DVFS.
>> I believe the best would be to define typical scenarios, then assess the
>> usefulness of CPUFreq separately for each one of them.
>> Based on this we then should be able to make a decision.
> 
> Agree here.
> Well, let's imagine following use-case(s), maybe too complex, but it
> might take place.
> ARM SoC is big.LITTLE and it has >=1 big core(s) and >=1 little
> core(s) with following abilities:
> 1. big core(s) is DVFS capable (>1 OPP), little core(s) isn't DVFS
> capable (1 OPP) and vice versa.
> 2. Both types are DVFS capable.
> The system which runs on this SoC has 3 guests:
> 1. Thin dom0, has some storage driver (mmc, sata, whatever) with
> blkback running.
> Tasks:
> - Running VM
> - Watchdog
> - vbd support
> 2. Driver domain (maybe RT-guest: Linux with RT infra or even some
> RTOS, maybe non-RT-guest)
> For example, instrument cluster.
> Tasks:
> - Gears
> - RVC
> - OpenCL
> - 3D UI
> - vdispl, vsnd, vif, vusb, (vbd) support.
> 3. Entertainment domain.
> For example, Android.
> Tasks:
> - Navi(Maps)
> - Multimedia(Audio/Video)
> - Cell
> - OTA
> - Third-party apps
> Also, such system might be "battery-powered".

All valid points, and they demonstrate the variety of use cases. I was
hoping for more general system or guest use cases, like:
- oversubscribed server machine, possibly in a migration pool
- server for isolating system components (web server, mail server,
application server), possibly not loaded 100% all of the time
- desktop machine or laptop, isolation for security reasons (Qubes OS)
- embedded system, mostly partitioning (not oversubscribed, vCPUs pinned)
- embedded system with at least one "media domain" (video/audio playback)
- embedded system with at least one realtime domain
....

>>
>>> I am not familiar with RT system requirements, I suppose, but not
>>> entirely sure, that CPUFreq should use const
>>> frequency for all cores the RT system is running on, or RT system
>>> parameters should be recalculated each time the CPU frequency is being
>>> changed
>>> (in such case guest needs some input from Xen).
>>>
>>> Anyway, I got your point about some guest input. Could you, please,
>>> describe how you think it should look like:
>>> 1. Xen doesn't have CPUFreq logic at all. It only collects OPP change
>>> requests from all guests and make
>>> a decision based on these requests and maybe some policy for
>>> prioritizing requests. Then it sends OPP change request to SCP.
>>> 2. Xen has CPUFreq logic. In addition it can collect OPP change
>>> requests from all guests and make
>>> a decision based on both: it's own view and guest requests. Then it
>>> sends OPP change request to SCP.
>>
>> I am leaning towards 1) conceptually. But if there is some kind of
>> reasonable implementation of 2) already in Xen (for x86), this might be
>> feasible as well.
> 
> Sure, Xen has common CPUFreq infra (core, set of governors) and
> two ACPI P-state CPUFreq drivers. Actually this patch series adds SCPI-based
> CPUFreq driver, which as well as existing drivers, are just for
> issuing command to change CPU frequency.
> The entity which decides what CPU frequency to set next is already present.
> 
> I got your point. I think that approach 1 is radically different from
> what we have in Xen for x86 these days.
> Anyway, we need to weight all pros and cons to decide what direction
> we want to follow.
> 
> BTW, I see that existing CPUFreq drivers can read some performance counters
> to measure performance over a period of time

Is that APERF/MPERF on x86? Which gives you the ratio between idle and
wall clock time?

> and this measured
> performance can be used as an additional input for
> governor then. Do we have something on ARM?

Not architecturally, but I guess you can sample the arch timer counter
before entering WFI and again when coming back, to record the time spent
sleeping. But I am not sure that sleep time is a good metric to deduce
the CPU frequency from.
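
Just to illustrate the bookkeeping I have in mind, a small sketch in C.
The names (struct idle_acct, idle_enter(), idle_exit(),
window_busy_pct()) are invented for this mail, and the counter value is
passed in as a plain parameter; on real hardware it would be read from
the arch timer counter around the WFI.

```c
#include <stdint.h>

/* Hypothetical per-pCPU accounting; names are illustrative only. */
struct idle_acct {
    uint64_t window_start;  /* counter value when the window began */
    uint64_t idle_ticks;    /* ticks spent in WFI during this window */
    uint64_t wfi_entry;     /* counter value sampled just before WFI */
};

/* Call right before executing WFI. */
static void idle_enter(struct idle_acct *a, uint64_t now)
{
    a->wfi_entry = now;
}

/* Call right after waking up from WFI. */
static void idle_exit(struct idle_acct *a, uint64_t now)
{
    a->idle_ticks += now - a->wfi_entry;
}

/* Return the busy share of the window in percent and start a new one. */
static unsigned int window_busy_pct(struct idle_acct *a, uint64_t now)
{
    uint64_t total = now - a->window_start;
    uint64_t idle = a->idle_ticks > total ? total : a->idle_ticks;
    unsigned int pct = total ? (unsigned int)((total - idle) * 100 / total)
                             : 0;

    a->window_start = now;
    a->idle_ticks = 0;
    return pct;
}
```

A governor could sample window_busy_pct() periodically, although whether
that number is a good basis for choosing an OPP is exactly the open
question.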

> I was thinking, how to actually take into the account guest's OPP
> change requests from the governor's perspective,
> and these "requests" might be considered as performance counters.

Maybe, maybe it's even simpler. You have a static vCPU frequency
setting, as given by the administrator from Dom0, either at domain
creation time or at runtime. Plus you have the guests' requests, which
may or may not override this.
So the policies could be:
- Always run at full speed.
- Run at full speed, and realise guest CPUFreq requests
- Run at low speed, and realise guest CPUFreq requests
- Always run at low speed

So Xen does not need to throw in its own ideas here. Which would avoid
some of the hard problems we encountered.
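
The four policies above could be sketched in C roughly like this. The
enum, the function name resolve_opp() and the "OPP index 0 is the lowest
speed" convention are assumptions made up for this mail, not existing Xen
interfaces.

```c
#include <stdbool.h>

/* Hypothetical per-vCPU setting, given by the administrator from Dom0. */
enum vcpu_freq_policy {
    POLICY_ALWAYS_FULL,        /* always run at full speed */
    POLICY_FULL_HONOUR_GUEST,  /* full speed unless the guest asks */
    POLICY_LOW_HONOUR_GUEST,   /* low speed unless the guest asks */
    POLICY_ALWAYS_LOW,         /* always run at low speed */
};

/* Resolve the OPP index for a vCPU: 0 is the lowest, max_opp the highest.
 * guest_req_valid says whether the guest has sent a CPUFreq request. */
static unsigned int resolve_opp(enum vcpu_freq_policy policy,
                                bool guest_req_valid,
                                unsigned int guest_opp,
                                unsigned int max_opp)
{
    /* Never let a guest ask for more than the hardware offers. */
    unsigned int req = guest_opp > max_opp ? max_opp : guest_opp;

    switch (policy) {
    case POLICY_ALWAYS_FULL:
        return max_opp;
    case POLICY_FULL_HONOUR_GUEST:
        return guest_req_valid ? req : max_opp;
    case POLICY_LOW_HONOUR_GUEST:
        return guest_req_valid ? req : 0;
    case POLICY_ALWAYS_LOW:
        return 0;
    }
    return max_opp;  /* unknown policy: fall back to full speed */
}
```

Note that there is no load measurement anywhere in this: the static
setting and the guest request are the only inputs.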

>>> Both variant implies that something like PV CPUFreq should be involved
>>> with frontend drivers are located in guests. Am I correct?
>>
>> And here the SMC mailbox comes into play again, but with a twist. For
>> guests we create SCPI, mailbox and shmem DT nodes, and use the SMC
>> mailbox with: method = "hvc";. Xen's HVC handler then redirects this to
>> the CPUFreq code.
>> This would be platform agnostic for the guests, while making all CPUFreq
>> requests ending up in Xen. So there is no need for an extra PV protocol.
> 
> This idea is indeed interesting.
> 
> Could you please answer these questions:
> 1. As I understand correctly here in Xen we have to emulate all DVFS
> related commands, I mean to be an SCP for the guests?

Yes, though "emulate all DVFS commands" sounds more complicated than it
is, it could be as simple as my ATF implementation:
https://github.com/apritzel/arm-trusted-firmware/commit/2f6f7d1746f72d0fe4da461ab1b3bfddc082636d

> 2. How do we recognize from guest's OPP change request on which
> physical CPU it wants to change frequency?

I think that maps to the DVFS power domains. We could offer one power
domain per vCPU.

>     Do we need to pin guest's vCPU to the respective pCPU?

No, I don't see why. Makes the code and the decision when to switch more
complicated, of course.

> 3. Linux "SCPI CPUFreq Interface driver" is tied to "ARM big.LITTLE
> Platforms CPUFreq driver", so will the latter be "happy"
>     to play with virtual CPUs a particular guests is running on?

I think so. But possibly SCMI provides a better answer to this.

> 4. Together with creating dummy SCPI nodes for guest we have to insert
> clock specifier into a CPU node
>     which we expose to guest (clocks = <&scpi_dvfs 0>;). Correct?

Yes, but that should be easy.

> 5. Will there be any possible synchronization issues if two guests send
> OPP change requests at the same time?

No, this is a per-VCPU trap to EL2 and will be handled in the context of a
VCPU and its domain. How this translates to the actual frequency of a
physical core is a different question, though. This is one of the reasons
I am a bit wary of the usefulness of this exercise: the downclocked
physical core might be given to another VCPU in another guest shortly
afterwards, at which point it might need to be clocked up again - or not.

>>>> So I think we would need a combined approach:
>>>> a) Let an administrator (via tools running in Dom0) tell Xen about power
>>>> management strategies to use for certain guests. An RTOS could be
>>>> treated differently (lower, but constant frequency) than an
>>>> "entertainment" guest (varying frequency, based on guest OS input), also
>>>> differently than some background guest doing logging, OTA update, etc.
>>>> (constant high frequency, but putting cores to sleep instead as often as
>>>> possible).
>>>> b) Allow some guests (based on policy from (a)) to signal CPUFreq change
>>>> requests to the hypervisor. Xen takes those into account, though it may
>>>> decide to not act immediately on it, because it is going to schedule
>>>> another vCPU, for instance.
>>>> c) Have some way of actually realising certain OPPs. This could be via
>>>> an SCPI client in Xen, or some other way. Might be an implementation detail.
>>>
>>> Just to clarify if I got the main idea correct:
>>> 1. Guests have CPUFreq logic, they send OPP change requests to Xen.
>>> 2. Xen has CPUFreq logic too, but in addition it can take into account OPP
>>>     change requests from guests. Xen sends the final OPP change request.
>>> Is my understanding correct?
>>
>> Yes, I think this sounds like the most flexible. Xen's CPUFreq logic
>> could be quite simple, possibly starting with some static assignment
>> based on administrator input, e.g. given at guest creation time.
>> It might not involve further runtime decisions.
>>
>>> Also "Different power management strategies to use for certain guests"
>>> means that it should be
>>> hard vCPU->pCPU pinning for each guest together with possibility in
>>> Xen to have different CPUFreq governors
>>> running at the same time (each governor for each CPU pool)?
>>
>> That would need to be worked out, but I suspect that CPU pinning might
>> be *one* option for a certain class of guests. This would probably be
>> related to the CPUFreq policy. Without pinning the decision might become
>> quite involved: If Xen wants to migrate a vCPU to a different pCPU, it
>> needs to take the different P-states into account, including the cost to
>> change the OPP. I am not sure the benefit justifies the effort. Some
>> numbers would help here.
> 
> I can't even imagine the development effort of adding the ability to have different
> CPUFreq policies over different CPUs in Xen. Another issue is that if
> all cores share an OPP,
> it is not feasible to realize that, I am afraid.
> 
> Anyway, I think we should go step-by-step.
> If the community agrees that the CPUFreq feature in Xen on ARM is needed and
> that the SCPI/SCMI based approach
> is the right thing to do in general, I would stick to the following, taking into
> account Andre's suggestions
> regarding some guest input:
> 
> 1. Xen does have CPUFreq logic. It measures CPU utilization by itself.
> 2. In addition it can collect OPP change requests from the guests:
>   - There are some policies describing which guest is allowed to send
> OPP change requests.
>   - Of course, involved guests have CPUFreq enabled. All we need is that
> these OPP change requests don't lead to
>     any physical changes and are picked up by Xen. Here we could use
> Andre's idea (SCPI CPUFreq + SMC mailbox with hvc method).
> 3. Xen makes a decision based on the whole system status it measures
> periodically and the guests' input (OPP change requests) if present.
> 4. Xen actually issues the command to change the CPU frequency (sends an OPP
> change request to the SCP).
> 
> How does it sound?

0. Decide whether CPUFreq justifies 1.-4. in the first place. That
sounds like a lot of work and code, so we should be sure it's worth it.

I wonder if you could provide some input, ideally measurements on the
actual power savings CPUFreq provides.

Does the wish to have CPUFreq purely come from some "tick-the-box"
exercise? As in: We have it on native Linux, so we need it in Xen?

What power savings can we expect from CPUFreq? Can those possible
savings be transferred into a virtualized environment at all? And do
those savings justify all the extra code in Xen?

I think those questions need to be answered first, then we can discuss
the implementation details.

>>>>>>> Although these required components (CPUFreq core, governors, etc) already exist in Xen, it is worth to mention that they are ACPI specific. So, a part of the current patch series makes them more generic in order to make possible a CPUFreq usage on architectures without ACPI support in.
>>>>>>
>>>>>> Have you looked at how this is used on x86 these days? Can you briefly
>>>>>> describe how this works and how it's used there?
>>>>> Xen supports CPUFreq feature on x86 [1]. I don't know how widely it is
>>>>> used at the moment, but it is another question. So, there are two
>>>>> possible modes: Domain0 based CPUFreq and Hypervisor based CPUFreq
>>>>> [2]. As I understand, the second option is more popular.
>>>>> Two different implementations of "Hypervisor based CPUFreq" are
>>>>> present: ACPI Processor P-States Driver and AMD Architectural P-state
>>>>> Driver. You can find both of them in the xen/arch/x86/acpi/cpufreq/ dir.
>>>>>
>>>>> [1] https://wiki.xenproject.org/wiki/Xen_power_management#CPU_P-states_.28cpufreq.29
>>>>> [2] https://wiki.xenproject.org/wiki/Xen_power_management#Hypervisor_based_cpufreq
>>>>
>>>> Thanks for the research and the pointers, will look at it later.
>>>>
>>>>>>> But, the main question we have to answer is about frequency changing interface in virtualized system. The frequency changing interface and all dependent components which needed CPUFreq to be functional on ARM are not present in Xen these days. The list of required components is quite big and may change across different ARM SoC vendors. As an example, the following components are involved in DVFS on Renesas Salvator-X board which has R-Car Gen3 SoC installed: generic clock, regulator and thermal frameworks, Vendor’s CPG, PMIC, AVS, THS drivers, i2c support, etc.
>>>>>>>
>>>>>>> We were considering a few possible approaches of hypervisor based CPUFreqs on ARM and came to conclusion to base this solution on popular at the moment, already upstreamed to Linux, ARM System Control and Power Interface(SCPI) protocol [1]. We chose SCPI protocol instead of newer ARM System Control and Management Interface (SCMI) protocol [2] since it is widely spread in Linux, there are good examples how to use it, the range of capabilities it has is enough for implementing hypervisor based CPUFreq and, what is more, upstream Linux support for SCMI is missed so far, but SCMI could be used as well.
>>>>>>>
>>>>>>> Briefly speaking, the SCPI protocol is used between the System Control Processor(SCP) and the Application Processors(AP). The mailbox feature provides a mechanism for inter-processor communication between SCP and AP. The main purpose of SCP is to offload different PM related tasks from AP and one of the services that SCP provides is Dynamic voltage and frequency scaling (DVFS), it is what we actually need for CPUFreq. I will describe this approach in details down the text.
>>>>>>>
>>>>>>> Let me explain a bit more what these possible approaches are:
>>>>>>>
>>>>>>> 1. “Xen+hwdom” solution.
>>>>>>> GlobalLogic team proposed split model [3], where “hwdom-cpufreq” frontend driver in Xen interacts with the “xen-cpufreq” backend driver in Linux hwdom (possibly dom0) in order to scale physical CPUs. This solution hasn’t been accepted by Xen community yet and seems it is not going to be accepted without taking into the account still unanswered major questions and proving that “all-in-Xen” solution, which Xen community considered as more architecturally cleaner option, would be unworkable in practice.
>>>>>>> The other reasons why we decided not to stick to this approach are complex communication interface between Xen and hwdom: event channel, hypercalls, syscalls, passing CPU info via DT, etc and possible synchronization issues with a proposed solution.
>>>>>>> Although it is worth to mention that the beauty of this approach was that there wouldn’t be a need to port a lot of things to Xen. All frequency changing interface and all dependent components which needed CPUFreq to be functional were already in place.
>>>>>>
>>>>>> Stefano, Julien and I were thinking about this: Wouldn't it be possible
>>>>>> to come up with some hardware domain, solely dealing with CPUFreq
>>>>>> changes? This could run a Linux kernel, but no or very little userland.
>>>>>> All its vCPUs would be pinned to pCPUs and would normally not be
>>>>>> scheduled by Xen. If Xen wants to change the frequency, it schedules the
>>>>>> respective vCPU to the right pCPU and passes down the frequency change
>>>>>> request. Sounds a bit involved, though, and probably doesn't solve the
>>>>>> problem where this domain needs to share access to hardware with Dom0
>>>>>> (clocks come to mind).
>>>>> Yes, another question is how to get this Linux kernel stuff (backend,
>>>>> top level driver, etc) upstreamed.
>>>>
>>>> Well, the idea would be to use already upstream drivers to actually
>>>> implement OPP changes (via Linux clock and regulator drivers), then use
>>>> existing interfaces like the userspace governor, for instance, to
>>>> trigger those. I don't think we need much extra kernel code for that.
>>> I understand. Backend in userspace sets desired frequency by request
>>> from frontend in Xen.
>>
>> Yeah, something like that. It was just an idea, not fully thought
>> through yet.
>>
>>>>>>> Although this approach is not used, still I picked a few already acked patches which made ACPI specific CPUFreq stuff more generic.
>>>>>>>
>>>>>>> 2. “all-in-Xen” solution.
>>>>>>> This implies that all CPUFreq related stuff should be located in Xen.
>>>>>>> The community considered this solution an architecturally cleaner option than the “Xen+hwdom” one. There is no layering violation compared with the previous approach (letting a guest OS manage one or more physical CPUs is more of a layering violation).
>>>>>>> This solution looks better, but to be honest, we are not in favor of this solution either. We expect an enormous development effort to get this support in (the scope of required components looks unrealistic) and to maintain it. So we decided not to stick to this approach either.
>>>>>>
>>>>>> Yes, I even think it's not feasible to implement this. With a modern
>>>>>> clock implementation there is one driver to control *all* clocks of an
>>>>>> SoC, so you can't single out the CPU clock easily, for instance. One
>>>>>> would probably run into synchronisation issues, at best.
>>>>>>
>>>>>>> 3. “Xen+SCP(ARM TF)” solution.
>>>>>>> It is yet another solution based on ARM SCPI protocol. The generic idea here is that there is a firmware, which being a server runs on some dedicated IP core (server), provides different PM services (DVFS, sensors, etc). On the other side there is a CPUFreq driver in Xen, which is running on the AP (client), consumes these services. CPUFreq driver neither changes the CPU frequency/voltage by itself nor cooperates with Linux in order to do such job. It just communicates with SCP directly using SCPI protocol. As I said before, some integrated into a SoC mailbox IP need to be used for IPC (doorbell for triggering action and shared memory region for commands). CPUFreq driver doesn’t even need to know what should be physically changed for the new frequency to take effect. It is a certainly SCP’s responsibility. This all avoid CPUFreq infrastructure in Xen on ARM from diving into each supported SoC internals and as the result having a lot of code.
>>>>>>>
>>>>>>> The possible issue here could be in SCP, the problem is that some dedicated IP core may be absent at all or performs other than PM tasks. Fortunately, there is a brilliant solution to teach firmware running in the EL3 exception level (ARM TF) to perform SCP functions and use SMC calls for communications [4]. Exactly this transport implementation I want to bring to Xen the first. Such solution is going to be generic across all ARM platforms that do have firmware running in the EL3 exception level and don’t have candidate for being SCP.
>>>>>>
>>>>>> While I feel flattered that you like that idea as well ;-), you should
>>>>>> mention that this requires actual firmware providing those services.
>>>>> Yes, some firmware which provides these services must be present
>>>>> on the other end.
>>>>> It is firmware which runs on the dedicated IP core(s) in the common case.
>>>>> And it is firmware which runs on the same core(s) as the hypervisor
>>>>> in this particular case.
>>>>>
>>>>>> I
>>>>>> am not sure there is actually *any* implementation of this at the
>>>>>> moment, apart from my PoC code for Allwinner.
>>>>> Your PoC is a good example for writing the firmware side. So why not
>>>>> use it as a base for
>>>>> other platforms?
>>>>
>>>> Sure, but normally firmware is provided by the vendor. And until more
>>>> vendors actually implement this, it's a bit weird to ask Xen users to
>>>> install this hand-crafted home-brew firmware to use this feature.
>>>> For a particular embedded use case like yours this might be feasible,
>>>> though.
>>> Agree. It is exactly for ARM SoCs with security extensions enabled,
>>> but where an SCP isn't available.
>>> And such SoCs do exist.
>>
>> Sure, also it depends on the accessibility of firmware. Some SoCs only
>> run signed firmware, or there is no source code for crucial firmware
>> components (SoC setup, DRAM init), so changing the firmware might not be
>> an option.
> 
> Agree.
> 
>>
>>>>>> And from a Xen point of view I am not sure we are in the position to
>>>>>> force users to use this firmware. This may be feasible in a classic
>>>>>> embedded scenario, where both firmware and software are provided by the
>>>>>> same entity, but that should be clearly noted as a restriction.
>>>>> Agree.
>>>>>
>>>>>>
>>>>>>> Here we have completely synchronous case because of SMC calls nature. SMC triggered mailbox driver emulates a mailbox which signals transmitted data via Secure Monitor Call (SMC) instruction [5]. The mailbox receiver is implemented in firmware and synchronously returns data when it returns execution to the non-secure world again. This would allow us both to trigger a request and transfer execution to the firmware code in a safe and architected way. Like PSCI requests.
>>>>>>> As you can see this method is free from synchronization issues. What is more, this solution is more architecturally cleaner solution than split model “Xen+hwdom” one. From the security point of view, I hope, everything will be much more correct since the ARM TF, which we want to see in charge of controlling CPU frequency/voltage, is a trusted SW layer. Moreover, ARM TF is responsible for enabling/disabling CPU (PSCI) and nobody complains about it, so let it do DVFS too.
>>>>>>
>>>>>> It should be noted that this synchronous nature of the communication can
>>>>>> actually be a problem: a DVFS request usually involves regulator and PLL
>>>>>> changes, which could take some time to settle in. Blocking all of this
>>>>>> time (milliseconds?) in EL3 (probably busy-waiting) might not be desirable.
>>>>> Agree. I haven't measured time yet to say how long is it, since I
>>>>> don't have a working firmware at the moment, just an emulator,
>>>>> but, yes, it will definitely take some time. The whole system won't be
>>>>> blocked, only the CPU which performs SMC call.
>>>>> But, if we ask hwdom to change frequency we will wait too? Or if Xen
>>>>> manages PLL/regulator by itself, it will wait anyway?
>>>>
>>>> Normally this is done asynchronously. For instance the OS programs the
>>>> regulator to change the voltage, then does other things until the
>>>> regulator signals the change has been realised. Then it re-programs the
>>>> PLL, again executing other code, eventually being interrupted by a
>>>> completion interrupt (or by periodically polling a bit). If we need to
>>>> spend all of this time in EL3, the HV is blocked on this. This might or
>>>> might not be a problem, but it should be noted.
>>> Agree.
>>>
>>>>
>>>>>>> I have to admit that I have checked this solution only due to a lack of candidate for being SCP. But, I hope, that other ARM SoCs where dedicated SCP is present (asynchronous case) will work too, but with some limitations. The mailbox IPs for these ARM SoCs must have TX/RX-done irqs. I have described in the corresponding patches why this limitation is present.
>>>>>>>
>>>>>>> To be honest I have Renesas R-Car Gen3 SoCs in mind as our nearest target, but I would like to make this solution as generic as possible. I don’t treat proposed solution as world-wide generic, but I hope, this solution may be suitable for other ARM SoCs which meet such requirements. Anyway, having something which works, but doesn’t cover all cases is better than having nothing.
>>>>>>>
>>>>>>> I would like to note that the patches are in PoC state and I post them just to illustrate in more detail what I am talking about. The patch series consists of the following parts:
>>>>>>> 1. GL’s patches which make ACPI specific CPUFreq stuff more generic. Although these patches have already been acked by the Xen community and the CPUFreq code base hasn’t changed in the last few years, I dropped all A-b tags.
>>>>>>> 2. A bunch of device-tree helpers and macros.
>>>>>>> 3. Directly ported SCPI protocol, mailbox infrastructure and the ARM SMC triggered mailbox driver. All components except the mailbox driver are in mainline Linux.
>>>>>>
>>>>>> Why do you actually need this mailbox framework? Actually I just
>>>>>> proposed the SMC driver to make it fit into the Linux framework. All we
>>>>>> actually need for SCPI is to write a simple command into some memory and
>>>>>> "press a button". I don't see a need to import the whole Linux
>>>>>> framework, especially as our mailbox usage is actually just a corner
>>>>>> case of the mailbox's capability (namely a "single-bit" doorbell).
>>>>>> The SMC use case is trivial to implement, and I believe using the Juno
>>>>>> mailbox is similarly simple, for instance.
>>>>> I did a direct port for SCPI protocol. I think, it is something that
>>>>> should be retained as much as possible.
>>>>
>>>> But the actual protocol is really simple. And we just need a subset of
>>>> it, namely to query and trigger OPPs.
>>> Yes. I think that the "Sensors service" is needed as well. I think that
>>> CPUFreq is not complete without thermal feedback.
>>
>> Personally I think this should be handled by the SCPI firmware: if the
>> requested OPP would violate thermal constraint, the firmware would just
>> not set it. Also (secure) temperature alarm interrupts could lower the OPP.
>> Doing this in firmware means it would just need to be implemented once,
>> and I consider this system critical, so firmware is conceptually the
>> better place for this code.
> 
> Sounds reasonable to me.
> 
>>
>>>>> The protocol relies on the mailbox feature, so I ported the mailbox too. I think
>>>>> it would be much easier for me to just add
>>>>> handling for a few required commands that issue an SMC call, without any
>>>>> mailbox infrastructure involved.
>>>>> But I want to show what is going on and where these things come from.
>>>>
>>>> I appreciate that, but I think we already have enough "bloated" Linux +
>>>> glue code in Xen. And in particular the Linux mailbox framework is much
>>>> more powerful than we need for SCPI, so we have a lot of unneeded
>>>> functionality.
>>>> If we just want to support CPUfreq using SCPI via SMC/Juno MHU/Rockchip
>>>> mailbox, we can get away with a *much* simpler solution.
>>>
>>> Agree, but I am afraid that simplifying things now might lead to some
>>> difficulties when there is a need
>>> to integrate a slightly different mailbox IP. Also, we need to
>>> recheck whether SCMI, which we might want to support as well,
>>> has a similar interface to the mailbox.
>>>
>>>> - We would need to port mailbox drivers one-by-one anyway, so we could
>>>> as well implement the simple "press-the-button" subset for each mailbox
>>>> separately. The interface between the SCPI code and the mailbox is
>>>> probably just "signal_mailbox()". For SMC it's trivial, and for the Juno
>>>> MHU it's also simple, I guess ([1], chapter 3.6).
>>>> - The SCPI message assembly is easy as well.
>>>> - The only other code needed is some DT parsing code to be compatible
>>>> with the existing DTs describing the SCPI implementation. We would claim
>>>> to have a mailbox driver for those compatibles, but cheat a bit since we
>>>> only use it for SCPI and just need the single bit subset of the mailbox.
>>> Yes, I think we can optimize in such a way.
>>>
>>> Just to clarify:
>>> Proposed "signal_mailbox" is intended for both actions: sending
>>> request and receiving response?
>>> So when it returns we will have either response or timeout error or
>>> some callback will be needed anyway?
>>>
>>> I don't have any objections regarding optimizations, we need to
>>> decide what mailboxes we should stick to (we can support) and in what
>>> form we should keep
>>> all this stuff in.
>>> Also while making a decision, we need to keep in mind "direct ported
>>> code" advantages:
>>> - "direct ported code" (SCPI + mailbox) have had a thorough review by
>>> the Linux community and Xen community
>>>   may rely on their review.
>>> - As "direct ported code" wasn't changed heavily, I believe, it would
>>> be easy to backport fixes/features to Xen.
>>
>> I understand that, but as I wrote in the other mail: This is a lean
>> hypervisor, not a driver and subsystem dump site. The security aspect of
>>  just having much less code is crucial here.
>>
>>> So, let's decide.
>>>
>>>>
>>>>> What is more, I don't want to restrict a usage of this CPUFreq by only
>>>>> covering single scenario where a
>>>>> firmware, which provides DVFS service, is in ARM TF. I hope, that this
>>>>> solution will be suitable for ARM SoCs where a standalone SCP
>>>>> is present and real mailbox IP, which has asynchronous nature, is used
>>>>> for IPC. Of course, this mailbox must have TX/RX-done irqs.
>>>>> This is a limitation at the moment.
>>>>
>>>> Sure, see above and the document [1] below.
>>> Thank you for the link, it seems with MHU we have to poll for the
>>> last_tx_done (where deasserted interrupt line in a status register is
>>> a condition for)
>>> after pressing the button. Or I missed something?
>>
>> It depends on whether we care. We could just treat this request in a
>> fire-and-forget manner. I am not sure in how far Xen really needs to
>> know the actual OPP used and when it's ready.
> 
> I got your point.
> 
> There is a "get" callback for CPUFreq drivers, where the CPUFreq core
> expects to get current frequency.
> Current frequency is also needed for initial condition, we might guess
> it, but why if SCPI does allow to retrieve it.

Well, that means you can read it if you want to know. That doesn't mean
that an implementation needs to poll the current state to see if it has
been realized already.
If a system asks for a lower frequency, it might just express the
possibility to run at this speed, not necessarily the hard requirement to
actually do so.
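
As an illustration of the fire-and-forget handling discussed above, a request could be as small as the sketch below. Everything in it is an assumption: the shared-memory layout, the header packing, the command value and the doorbell callback are made up for the example and should be checked against the SCPI specification and the actual mailbox before being relied on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical shared-memory area: one header word plus a small payload. */
struct scpi_shmem {
    uint32_t header;
    uint8_t payload[8];
};

/* Illustrative command ID for "set DVFS", assumed for this sketch. */
#define SKETCH_CMD_SET_DVFS 0x0au

/* The "button": an SMC, an MHU register write, or similar. */
typedef void (*doorbell_fn)(void *ctx);

/* Assumed packing: command ID in bits [6:0], payload size in bits [24:16]. */
static uint32_t pack_header(uint32_t cmd, uint32_t payload_len)
{
    return (cmd & 0x7fu) | ((payload_len & 0x1ffu) << 16);
}

/*
 * Write the request and ring the doorbell once. No polling, no wait for
 * completion, no response parsing: the OPP is a hint to the firmware.
 * Real code would also need memory barriers before ringing.
 */
void request_opp(struct scpi_shmem *shm, uint8_t domain, uint8_t opp,
                 doorbell_fn ring, void *ctx)
{
    assert(shm != NULL && ring != NULL);

    shm->payload[0] = domain;
    shm->payload[1] = opp;
    shm->header = pack_header(SKETCH_CMD_SET_DVFS, 2);
    ring(ctx);
}

/* Stand-in doorbell that only counts how often it was rung. */
void count_doorbell(void *ctx)
{
    ++*(unsigned int *)ctx;
}
```

Reading back the current OPP, for the "get" callback or when a user asks, would then be a separate and explicit query rather than something every request has to wait for.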

> Personally I think that although the "fire-and-forget" manner has an
> advantage (the code is much simpler), we will never know what is going on
> in case of errors;
> there are, I think, a few reasons for the firmware not to process a request.
> I agree that we could try not to wait for the real TX-done condition
> at all for asynchronous mailboxes if we are not going to queue
> requests.
> It is quite clear that if we already got a response, the request
> has successfully reached the other end,
> and if we got a timeout error, something bad has happened and we
> could treat it as a global connection error, for example.
> But responses are something we should handle.

I think we might care when we want to change it *again* or when a user
actually asks for the current frequency. But even this might not tell
you the truth.
Think of your x86 laptop: It might get boosted without the OS knowing,
or thermal throttling might actually limit the frequency. Mine at least
does that all of the time.

Cheers,
Andre.

> So, MHU as well as Rockchip and other mailbox IPs which do have
> RX-done irq, I believe, we will be able to handle.



* Re: [RFC PATCH 00/31] CPUFreq on ARM
  2017-11-16 17:04             ` Andre Przywara
@ 2017-11-17 14:01               ` Julien Grall
  2017-11-17 18:36                 ` Oleksandr Tyshchenko
  2017-11-17 14:55               ` Oleksandr Tyshchenko
  1 sibling, 1 reply; 108+ messages in thread
From: Julien Grall @ 2017-11-17 14:01 UTC (permalink / raw)
  To: Andre Przywara, Oleksandr Tyshchenko, Jassi Brar
  Cc: Edgar E . Iglesias, Stefano Stabellini, Andrew Cooper,
	Oleksandr Tyshchenko, Jan Beulich, Sudeep Holla, xen-devel

Hi,

First of all, thank you Oleksandr for starting a thread around CPUFreq 
support.

On 11/16/2017 05:04 PM, Andre Przywara wrote:
> On 16/11/17 14:57, Oleksandr Tyshchenko wrote:
>> On Wed, Nov 15, 2017 at 4:28 PM, Andre Przywara
>> <andre.przywara@linaro.org> wrote:
>> Anyway, I think we should go step-by-step.
>> If the community agrees that the CPUFreq feature in Xen on ARM is needed and
>> that the SCPI/SCMI based approach
>> is the right thing to do in general, I would stick to the following, taking into
>> account Andre's suggestions
>> regarding some guest input:
>>
>> 1. Xen does have CPUFreq logic. It measures CPU utilization by itself.
>> 2. In addition it can collect OPP change requests from the guests:
>>    - There are some policies describing which guest is allowed to send
>> OPP change requests.
>>    - Of course, involved guests have CPUFreq enabled. All we need is that
>> these OPP change requests don't lead to
>>      any physical changes and are picked up by Xen. Here we could use
>> Andre's idea (SCPI CPUFreq + SMC mailbox with hvc method).
>> 3. Xen makes a decision based on the whole system status it measures
>> periodically and the guests' input (OPP change requests) if present.
>> 4. Xen actually issues the command to change the CPU frequency (sends an OPP
>> change request to the SCP).
>>
>> How does it sound?
> 
> 0. Decide whether CPUFreq justifies 1.-4. in the first place. That
> sounds like a lot of work and code, so we should be sure it's worth it.
> 
> I wonder if you could provide some input, ideally measurements on the
> actual power savings CPUFreq provides.
> 
> Does the wish to have CPUFreq purely come from some "tick-the-box"
> exercise? As in: We have it on native Linux, so we need it in Xen?
> 
> What power savings can we expect from CPUFreq? Can those possible
> savings be transferred into a virtualized environment at all? And do
> those savings justify all the extra code in Xen?
> 
> I think those questions need to be answered first, then we can discuss
> the implementation details.

I am going to throw in a few more ideas. From the discussion, it looks to 
me like the story is about power saving when using Xen. Am I right?

Have you explored other possibilities to save power? I am asking 
because the topic is fairly new to Xen.

One area where power could be saved is the idle loop (see idle_loop in 
arch/arm/domain.c). At the moment only WFI is used. It would be 
possible to enter a deeper low-power state by using PSCI.

Similarly, the virtual PSCI implementation for suspend is quite simple. 
You could potentially use that information to decide what to do with 
the pCPU (suspend, turning off...).
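
As a rough illustration of the idle-loop idea, the sketch below picks the deepest idle state whose minimum residency fits the expected idle time, falling back to WFI. The state table, its numbers and the function name are invented for the example; real states would come from the platform description, and entering them would go through PSCI rather than this helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one idle state. */
struct idle_state {
    const char *name;
    uint32_t min_residency_us;  /* idle time needed to pay off entry/exit */
};

/* Ordered from shallowest to deepest; the numbers are made up. */
static const struct idle_state idle_states[] = {
    { "wfi",           0 },
    { "cpu-retention", 500 },
    { "cpu-off",       5000 },
};

#define NR_IDLE_STATES (sizeof(idle_states) / sizeof(idle_states[0]))

/*
 * Return the index of the deepest state worth entering for the expected
 * idle period. Index 0 (plain WFI) is always a valid answer.
 */
size_t pick_idle_state(uint64_t expected_idle_us)
{
    size_t i, best = 0;

    assert(NR_IDLE_STATES >= 1);

    for (i = 1; i < NR_IDLE_STATES; i++) {
        if (expected_idle_us >= idle_states[i].min_residency_us)
            best = i;
    }
    return best;
}
```

Where the expected idle time would come from (the next timer deadline, or what the guests report through virtual PSCI suspend calls) is left open here.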

Cheers,

-- 
Julien Grall



* Re: [RFC PATCH 00/31] CPUFreq on ARM
  2017-11-16 17:04             ` Andre Przywara
  2017-11-17 14:01               ` Julien Grall
@ 2017-11-17 14:55               ` Oleksandr Tyshchenko
  2017-11-17 16:41                 ` Andre Przywara
  1 sibling, 1 reply; 108+ messages in thread
From: Oleksandr Tyshchenko @ 2017-11-17 14:55 UTC (permalink / raw)
  To: Andre Przywara
  Cc: Edgar E . Iglesias, Stefano Stabellini, Oleksandr Tyshchenko,
	Andrew Cooper, Julien Grall, Jassi Brar, Jan Beulich,
	Sudeep Holla, xen-devel

On Thu, Nov 16, 2017 at 7:04 PM, Andre Przywara
<andre.przywara@linaro.org> wrote:
> Hi,
Hi Andre

Thank you for your comments!

>
> On 16/11/17 14:57, Oleksandr Tyshchenko wrote:
>> On Wed, Nov 15, 2017 at 4:28 PM, Andre Przywara
>> <andre.przywara@linaro.org> wrote:
>>> Hi,
>> Hi Andre, Jassi
>>
>> Thank you for your comments!
>>
>>>
>>> On 14/11/17 20:46, Oleksandr Tyshchenko wrote:
>>>> On Tue, Nov 14, 2017 at 12:49 PM, Andre Przywara
>>>> <andre.przywara@linaro.org> wrote:
>>>>> Hi,
>>>> Hi Andre
>>>>
>>>>>
>>>>> On 13/11/17 19:40, Oleksandr Tyshchenko wrote:
>>>>>> On Mon, Nov 13, 2017 at 5:21 PM, Andre Przywara
>>>>>> <andre.przywara@linaro.org> wrote:
>>>>>>> Hi,
>>>>>> Hi Andre,
>>>>>>
>>>>>>>
>>>>>>> thanks very much for your work on this!
>>>>>> Thank you for your comments.
>>>>>>
>>>>>>>
>>>>>>> On 09/11/17 17:09, Oleksandr Tyshchenko wrote:
>>>>>>>> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>>>>>>>>
>>>>>>>> Hi, all.
>>>>>>>>
>>>>>>>> The purpose of this RFC patch series is to add CPUFreq support to Xen on ARM.
>>>>>>>> Motivation of hypervisor based CPUFreq is to enable one of the main PM use-cases in virtualized system powered by Xen hypervisor. Rationale behind this activity is that CPU virtualization is done by hypervisor and the guest OS doesn't actually know anything about physical CPUs because it is running on virtual CPUs. It is quite clear that a decision about frequency change should be taken by hypervisor as only it has information about actual CPU load.
>>>>>>>
>>>>>>> Can you please sketch your usage scenario or workloads here? I can think
>>>>>>> of quite different scenarios (oversubscribed server vs. partitioning
>>>>>>> RTOS guests, for instance). The usefulness of CPUFreq and the trade-offs
>>>>>>> in the design are quite different between those.
>>>>>> We keep embedded use-cases in mind. For example, it is a system with
>>>>>> several domains,
>>>>>> where one domain runs the most critical SW and the other domain(s)
>>>>>> are, let's say, for entertainment purposes.
>>>>>> I think CPUFreq is useful wherever power consumption is a concern.
>>>>>
>>>>> Does the SoC you use allow different frequencies for each core? Or is it
>>>>> one frequency for all cores? Most x86 CPUs allow different frequencies
>>>>> for each core, AFAIK. Just having the same OPP for the whole SoC might
>>>>> limit the usefulness of this approach in general.
>>>> Good question. All cores in a cluster share the same clock. It is
>>>> impossible to set different frequencies on the cores inside one
>>>> cluster.
>>>>
>>>>>
>>>>>>> In general I doubt that a hypervisor scheduling vCPUs is in a good
>>>>>>> position to make a decision on the proper frequency physical CPUs should
>>>>>>> run with. From all I know it's already hard for an OS kernel to make
>>>>>>> that call. So I would actually expect that guests provide some input,
>>>>>>> for instance by signalling OPP change request up to the hypervisor. This
>>>>>>> could then decide to act on it - or not.
>>>>>> Each running guest sees only part of the picture, but hypervisor has
>>>>>> the whole picture: it knows all about the CPUs, measures CPU load and is able
>>>>>> to choose the required CPU frequency to run at.
>>>>>
>>>>> But based on what data? All Xen sees is a vCPU trapping on MMIO, a
>>>>> hypercall or on WFI, for that matter. It does not know much more about
>>>>> the guest, especially it's rather clueless about what the guest OS
>>>>> actually intended to do.
>>>>> For instance Linux can track the actual utilization of a core by keeping
>>>>> statistics of runnable processes and monitoring their time slice usage.
>>>>> It can see that a certain process exhibits periodical, but bursty CPU
>>>>> usage, which may hint that it could run at a lower frequency. Xen does not
>>>>> see this fine granular information.
>>>>>
>>>>>> I am wondering, does Xen
>>>>>> need additional input from guests to make a decision?
>>>>>
>>>>> I very much believe so. The guest OS is in a much better position to
>>>>> make that call.
>>>>>
>>>>>> BTW, currently guest domain on ARM doesn't even know how many physical
>>>>>> CPUs the system has and what are these OPPs. When creating guest
>>>>>> domain Xen inserts only dummy CPU nodes. All CPU info, such as clocks,
>>>>>> OPPs, thermal, etc are not passed to guest.
>>>>>
>>>>> Sure, because this is what virtualization is about. And I am not asking
>>>>> for unconditionally allowing any guest to change frequency.
>>>>> But there could be certain use cases where this could be considered:
>>>>> Think about your "critical SW" mentioned above, which is probably some
>>>>> RTOS, also possibly running on pinned vCPUs. For that
>>>>> (latency-sensitive) guest it might be well suited to run at a lower
>>>>> frequency for some time, but how should Xen know about this?
>>>>> "Normally" the best strategy to save power is to run as fast as
>>>>> possible, finish all outstanding work, then put the core to sleep.
>>>>> Because not running at all consumes much less energy than running at a
>>>>> reduced frequency. But this may not be suitable for an RTOS.
>>>> By "one domain has most critical SW running on" I meant the hardware
>>>> domain/driver domain, or even another
>>>> domain which performs some important tasks (disk, net, display, camera,
>>>> whatever) that are treated by the whole system as critical
>>>> and must never fail. Other domains (Android, for example)
>>>> are not critical at all from the system point of view.
>>>> To be honest, I haven't yet considered using CPUFreq in a system where
>>>> some RT guest is present.
>>>> I think it is something that should be *thoroughly* investigated and
>>>> then worked out.
>>>
>>> Yes, as mentioned before there are quite different use cases with quite
>>> different requirements when it comes to DVFS.
>>> I believe the best would be to define typical scenarios, then assess the
>>> usefulness of CPUFreq separately for each one of them.
>>> Based on this we then should be able to make a decision.
>>
>> Agree here.
>> Well, let's imagine following use-case(s), maybe too complex, but it
>> might take place.
>> ARM SoC is big.LITTLE and it has >=1 big core(s) and >=1 little
>> core(s) with following abilities:
>> 1. big core(s) is DVFS capable (>1 OPP), little core(s) isn't DVFS
>> capable (1 OPP) and vice versa.
>> 2. Both types are DVFS capable.
>> The system which runs on this SoC has 3 guests:
>> 1. Thin dom0, has some storage driver (mmc, sata, whatever) with
>> blkback running.
>> Tasks:
>> - Running VM
>> - Watchdog
>> - vbd support
>> 2. Driver domain (maybe RT-guest: Linux with RT infra or even some
>> RTOS, maybe non-RT-guest)
>> For example, instrument cluster.
>> Tasks:
>> - Gears
>> - RVC
>> - OpenCL
>> - 3D UI
>> - vdispl, vsnd, vif, vusb, (vbd) support.
>> 3. Entertainment domain.
>> For example, Android.
>> Tasks:
>> - Navi(Maps)
>> - Multimedia(Audio/Video)
>> - Cell
>> - OTA
>> - Third-party apps
>> Also, such system might be "battery-powered".
>
> All valid points, and demonstrates the variety of use cases. I was
> hoping for more general systems or guest use case, like:
> - oversubscribed server machine, possibly in a migration pool
> - server for isolating system components (web server, mail server,
> application server), possibly not loaded 100% all of the time
> - desktop machine or laptop, isolation for security reasons (Qubes OS)
> - embedded system, mostly partitioning (not oversubscribed, vCPUs pinned)
> - embedded system with at least one "media domain" (video/audio playback)
> - embedded system with at least one realtime domain
> ....
>
>>>
>>>> I am not familiar with RT system requirements. I suppose, but am not
>>>> entirely sure, that either CPUFreq should use a constant
>>>> frequency for all cores the RT system is running on, or the RT system
>>>> parameters should be recalculated each time the CPU frequency is
>>>> changed
>>>> (in which case the guest needs some input from Xen).
>>>>
>>>> Anyway, I got your point about some guest input. Could you please
>>>> describe what you think it should look like:
>>>> 1. Xen doesn't have CPUFreq logic at all. It only collects OPP change
>>>> requests from all guests and makes
>>>> a decision based on these requests and maybe some policy for
>>>> prioritizing requests. Then it sends an OPP change request to the SCP.
>>>> 2. Xen has CPUFreq logic. In addition it can collect OPP change
>>>> requests from all guests and make
>>>> a decision based on both: its own view and the guest requests. Then it
>>>> sends an OPP change request to the SCP.
>>>
>>> I am leaning towards 1) conceptually. But if there is some kind of
>>> reasonable implementation of 2) already in Xen (for x86), this might be
>>> feasible as well.
>>
>> Sure, Xen has common CPUFreq infra (core, set of governors) and
>> two ACPI P-state CPUFreq drivers. Actually this patch series adds SCPI-based
>> CPUFreq driver which, like the existing drivers, is just for
>> issuing the command to change the CPU frequency.
>> The entity which decides what CPU frequency to set next is already present.
>>
>> I got your point. I think that approach 1 is radically different from
>> what we have in Xen for x86 these days.
>> Anyway, we need to weigh all pros and cons to decide what direction
>> we want to follow.
>>
>> BTW, I see that existing CPUFreq drivers can read some performance counters
>> to measure performance over a period of time
>
> Is that APERF/MPERF on x86? Which gives you the ratio between idle and
> wall clock time?
Yes, I meant APERF/MPERF.

>
>> and this measured
>> performance can be used as an additional input for
>> governor then. Do we have something on ARM?
>
> Not architecturally, but I guess you can track the arch timer counter
> before entering WFI and when coming back to record the time spent sleeping.
> But I am not sure that sleep time is a good metric to deduce CPU frequency.
Agree.
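
For reference, the WFI-based accounting Andre describes could be sketched
roughly as below. This is only an illustration under my own assumptions:
`struct idle_stats`, `idle_enter()`, `idle_exit()` and `busy_percent()` are
hypothetical names (nothing like this exists in the series), and the counter
value is passed in as a plain number instead of being read from the arch timer.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-pCPU bookkeeping; not an existing Xen structure. */
struct idle_stats {
    uint64_t window_start; /* counter value when the sampling window began */
    uint64_t idle_enter;   /* counter value taken right before WFI */
    uint64_t idle_ticks;   /* ticks spent sleeping within this window */
};

/* Call right before executing WFI. */
static void idle_enter(struct idle_stats *s, uint64_t now)
{
    s->idle_enter = now;
}

/* Call right after waking up from WFI. */
static void idle_exit(struct idle_stats *s, uint64_t now)
{
    assert(now >= s->idle_enter);
    s->idle_ticks += now - s->idle_enter;
}

/* Busy share of the window in percent; also starts a new window. */
static unsigned int busy_percent(struct idle_stats *s, uint64_t now)
{
    uint64_t total = now - s->window_start;
    uint64_t idle = s->idle_ticks;
    unsigned int busy = 0;

    if (total != 0) {
        if (idle > total) /* sleep began before the window did */
            idle = total;
        busy = (unsigned int)(((total - idle) * 100) / total);
    }

    s->window_start = now;
    s->idle_ticks = 0;
    return busy;
}
```

A governor could sample `busy_percent()` periodically, but as noted above
this only measures time spent sleeping, which may well be a poor hint for
choosing a frequency.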

>
>> I was thinking, how to actually take into the account guest's OPP
>> change requests from the governor's perspective,
>> and these "requests" might be considered as performance counters.
>
> Maybe, maybe it's even simpler. You have a static vCPU frequency
> setting, as given by the administrator from Dom0, either at domain
> creation time or at runtime. Plus you have the guests' requests, which
> may or may not override this.
> So the policies could be:
> - Always run at full speed.
> - Run at full speed, and realise guest CPUFreq requests
> - Run at low speed, and realise guest CPUFreq requests
> - Always run at low speed
I see, it looks like a slightly modified userspace governor.
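
To check that I read the four policies correctly, here is a minimal sketch
of how they could be resolved into an OPP. Everything in it is an assumption
made for illustration: the enum, the `resolve_opp()` helper, and the
convention that OPPs are indexed from 0 (lowest) to nr_opps - 1 (highest).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-domain policy, set by the administrator from Dom0. */
enum vcpu_freq_policy {
    POLICY_ALWAYS_FULL,       /* always run at full speed */
    POLICY_FULL_HONOUR_GUEST, /* full speed unless the guest asks otherwise */
    POLICY_LOW_HONOUR_GUEST,  /* low speed unless the guest asks otherwise */
    POLICY_ALWAYS_LOW,        /* always run at low speed */
};

/* Returns the OPP index to use, 0 (lowest) .. nr_opps - 1 (highest). */
static unsigned int resolve_opp(enum vcpu_freq_policy policy,
                                bool has_request, unsigned int requested,
                                unsigned int nr_opps)
{
    unsigned int lowest = 0, highest;

    assert(nr_opps > 0);
    highest = nr_opps - 1;

    switch (policy) {
    case POLICY_ALWAYS_FULL:
        return highest;
    case POLICY_ALWAYS_LOW:
        return lowest;
    case POLICY_FULL_HONOUR_GUEST:
        if (!has_request)
            return highest;
        break;
    case POLICY_LOW_HONOUR_GUEST:
        if (!has_request)
            return lowest;
        break;
    }

    /* A guest request is honoured, clamped to the valid OPP range. */
    return requested > highest ? highest : requested;
}
```

With this reading Xen itself never measures load: the static setting gives
the default and the guest request, where allowed, overrides it.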

>
> So Xen does not need to throw in its own ideas here. Which would avoid
> some of the hard problems we encountered.
I got your points.
Just one question: why does the existing CPUFreq on x86 have its own logic?
Is there something different on ARM which means that having such logic in
Xen doesn't make any sense?

>
>>>> Both variants imply that something like PV CPUFreq should be involved,
>>>> with frontend drivers located in guests. Am I correct?
>>>
>>> And here the SMC mailbox comes into play again, but with a twist. For
>>> guests we create SCPI, mailbox and shmem DT nodes, and use the SMC
>>> mailbox with: method = "hvc";. Xen's HVC handler then redirects this to
>>> the CPUFreq code.
>>> This would be platform agnostic for the guests, while making all CPUFreq
>>> requests ending up in Xen. So there is no need for an extra PV protocol.
>>
>> This idea is indeed interesting.
>>
>> Could you please answer these questions:
>> 1. If I understand correctly, here in Xen we have to emulate all DVFS
>> related commands, I mean, act as an SCP for the guests?
>
> Yes, though "emulate all DVFS commands" sounds more complicated than it
> is, it could be as simple as my ATF implementation:
> https://github.com/apritzel/arm-trusted-firmware/commit/2f6f7d1746f72d0fe4da461ab1b3bfddc082636d
Yes, though for "Xen being an SCP for the guest" we need a little bit more
than what this patch adds, I guess.
But anyway, we have a good example to start from.
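
Just to illustrate the scale of "Xen being an SCP for the guest", a very
rough sketch of such a command handler is below. Note that the command IDs,
status codes, `struct vscpi_domain` and `vscpi_handle()` are all invented
for this example; they are not the values from the SCPI specification and
not code from Andre's ATF patch. The assumption is one virtual DVFS domain
per vCPU whose request is only recorded, never applied to hardware directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative IDs and status codes; NOT the values from the SCPI spec. */
enum vscpi_cmd { VSCPI_GET_DVFS_INFO = 1, VSCPI_SET_DVFS = 2, VSCPI_GET_DVFS = 3 };
enum { VSCPI_OK = 0, VSCPI_ERR_PARAM = 1, VSCPI_ERR_SUPPORT = 2 };

#define VSCPI_NR_OPPS 3u /* number of OPPs exposed to the guest (assumed) */

/* Hypothetical per-vCPU virtual DVFS power domain. */
struct vscpi_domain {
    uint32_t requested_opp; /* last OPP index the guest asked for */
};

/* Handles one guest command; meant to be called from the HVC trap path. */
static int vscpi_handle(struct vscpi_domain *d, uint32_t cmd,
                        uint32_t arg, uint32_t *result)
{
    assert(d != NULL && result != NULL);

    switch (cmd) {
    case VSCPI_GET_DVFS_INFO:
        *result = VSCPI_NR_OPPS;
        return VSCPI_OK;
    case VSCPI_SET_DVFS:
        if (arg >= VSCPI_NR_OPPS)
            return VSCPI_ERR_PARAM;
        d->requested_opp = arg; /* recorded only, no hardware access */
        return VSCPI_OK;
    case VSCPI_GET_DVFS:
        *result = d->requested_opp;
        return VSCPI_OK;
    default:
        return VSCPI_ERR_SUPPORT;
    }
}
```

The recorded request would then be one more input for whatever entity in
Xen decides on the real OPP.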

>
>> 2. How do we recognize from guest's OPP change request on which
>> physical CPU it wants to change frequency?
>
> I think that maps to the DVFS power domains. We could offer one power
> domain per vCPU.
Probably, yes. I don't see why this idea won't work.

>
>>     Do we need to pin guest's vCPU to the respective pCPU?
>
> No, I don't see why. Makes the code and the decision when to switch more
> complicated, of course.
Good.

>
>> 3. Linux "SCPI CPUFreq Interface driver" is tied to "ARM big.LITTLE
>> Platforms CPUFreq driver", so will the latter be "happy"
>>     to play with virtual CPUs a particular guest is running on?
>
> I think so. But possibly SCMI provides a better answer to this.
So, this needs additional investigation.

>
>> 4. Together with creating dummy SCPI nodes for guest we have to insert
>> clock specifier into a CPU node
>>     which we expose to guest (clocks = <&scpi_dvfs 0>;). Correct?
>
> Yes, but that should be easy.
Agree.

>
>> 5. Will there be any possible synchronization issues if two guests send
>> OPP change requests at the same time?
>
> No, this is per a VCPU trap to EL2 and will be handled in context of a
> VCPU and its domain.
Agree there too.

> How this translates to the actual frequency of a
> physical core is a different question, though. One of the reasons I am a
> bit wary of the usefulness of this exercise: because the downclocked
> physical core might be given to another VCPU in another guest shortly
> afterwards, at which point it might need to be clocked up again - or not.
Oh, I see, some scheduler input might be needed...
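
One possible rule, purely as an assumption to make the problem concrete:
since all cores in a cluster share one clock, the cluster could run at the
highest OPP wanted by any vCPU currently scheduled on it, and the rule would
be re-evaluated on every context switch. `cluster_opp()` and `OPP_NONE`
below are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define OPP_NONE (-1) /* marks a pCPU that is idle (no vCPU running) */

/*
 * wanted[i] is the OPP index wanted by the vCPU currently running on
 * pCPU i of the cluster, or OPP_NONE if that pCPU is idle.
 * Returns the highest wanted OPP, or idle_opp if every pCPU is idle.
 */
static int cluster_opp(const int *wanted, size_t nr_pcpus, int idle_opp)
{
    int best = OPP_NONE;
    size_t i;

    assert(wanted != NULL || nr_pcpus == 0);

    for (i = 0; i < nr_pcpus; i++)
        if (wanted[i] > best)
            best = wanted[i];

    return best == OPP_NONE ? idle_opp : best;
}
```

Whether re-evaluating this on each context switch is affordable depends on
the cost of an OPP change, which is exactly the open question here.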

>
>>>>> So I think we would need a combined approach:
>>>>> a) Let an administrator (via tools running in Dom0) tell Xen about power
>>>>> management strategies to use for certain guests. An RTOS could be
>>>>> treated differently (lower, but constant frequency) than an
>>>>> "entertainment" guest (varying frequency, based on guest OS input), also
>>>>> differently than some background guest doing logging, OTA update, etc.
>>>>> (constant high frequency, but putting cores to sleep instead as often as
>>>>> possible).
>>>>> b) Allow some guests (based on policy from (a)) to signal CPUFreq change
>>>>> requests to the hypervisor. Xen takes those into account, though it may
>>>>> decide to not act immediately on it, because it is going to schedule
>>>>> another vCPU, for instance.
>>>>> c) Have some way of actually realising certain OPPs. This could be via
>>>>> an SCPI client in Xen, or some other way. Might be an implementation detail.
>>>>
>>>> Just to clarify if I got the main idea correct:
>>>> 1. Guests have CPUFreq logic, they send OPP change requests to Xen.
>>>> 2. Xen has CPUFreq logic too, but in addition it can take into account OPP
>>>>     change requests from guests. Xen sends final OPP change request.
>>>> Is my understanding correct?
>>>
>>> Yes, I think this sounds like the most flexible. Xen's CPUFreq logic
>>> could be quite simple, possibly starting with some static assignment
>>> based on administrator input, e.g. given at guest creation time.
>>> It might not involve further runtime decisions.
>>>
>>>> Also "Different power management strategies to use for certain guests"
>>>> means that it should be
>>>> hard vCPU->pCPU pinning for each guest together with possibility in
>>>> Xen to have different CPUFreq governors
>>>> running at the same time (each governor for each CPU pool)?
>>>
>>> That would need to be worked out, but I suspect that CPU pinning might
>>> be *one* option for a certain class of guests. This would probably be
>>> related to the CPUFreq policy. Without pinning the decision might become
>>> quite involved: If Xen wants to migrate a vCPU to a different pCPU, it
>>> needs to take the different P-states into account, including the cost to
>>> change the OPP. I am not sure the benefit justifies the effort. Some
>>> numbers would help here.
>>
>> I can't even imagine the development effort of adding the ability to have different
>> CPUFreq policies on different CPUs in Xen. Another point is that if
>> all cores share the same OPP,
>> it is not feasible to realize that, I am afraid.
>>
>> Anyway, I think we should go step-by-step.
>> If the community agrees that the CPUFreq feature in Xen on ARM is needed and
>> that the SCPI/SCMI based approach
>> is the right thing to do in general, I would stick to the following, taking into
>> account Andre's suggestions
>> regarding some guest input:
>>
>> 1. Xen does have CPUFreq logic. It measures CPU utilization by itself.
>> 2. In addition it can collect OPP change requests from the guests:
>>   - There are some policies describing which guest is allowed to send
>> OPP change requests.
>>   - Of course, the involved guests have CPUFreq enabled. All we need is
>> that these OPP change requests don't lead to
>>     any physical changes and are picked up by Xen. Here we could use
>> Andre's idea (SCPI CPUFreq + SMC mailbox with hvc method).
>> 3. Xen makes a decision based on the whole system status it measures
>> periodically and guests input (OPP change requests) if present.
>> 4. Xen actually issues command to change the CPU frequency (sends OPP
>> change request to SCP).
>>
>> How does it sound?
>
> 0. Decide whether CPUFreq justifies 1.-4. in the first place.
Sure,
> That sounds like a lot of work and code, so we should be sure it's worth it.
>
> I wonder if you could provide some input, ideally measurements on the
> actual power savings CPUFreq provides.
Well, I think I will be able to provide some numbers once the firmware
which runs on the SoC
I am using is ready. At the moment I only have an emulator, without
any real freq/volt changes.

>
> Does the wish to have CPUFreq purely come from some "tick-the-box"
> exercise? As in: We have it on native Linux, so we need it in Xen?
As I said before, we are interested in purely embedded use-cases
where power consumption is a concern.
If you know how to save power without having CPUFreq involved I would
appreciate the pointers.

>
> What power savings can we expect from CPUFreq? Can those possible
> savings be transferred into a virtualized environment at all? And do
> those saving justify all the extra code in Xen?
>
> I think those questions need to be answered first, then we can discuss
> about the implementation details.
OK.

>
>>>>>>>> Although these required components (CPUFreq core, governors, etc) already exist in Xen, it is worth to mention that they are ACPI specific. So, a part of the current patch series makes them more generic in order to make possible a CPUFreq usage on architectures without ACPI support in.
>>>>>>>
>>>>>>> Have you looked at how this is used on x86 these days? Can you briefly
>>>>>>> describe how this works and how it's used there?
>>>>>> Xen supports CPUFreq feature on x86 [1]. I don't know how widely it is
>>>>>> used at the moment, but it is another question. So, there are two
>>>>>> possible modes: Domain0 based CPUFreq and Hypervisor based CPUFreq
>>>>>> [2]. As I understand, the second option is more popular.
>>>>>> Two different implementations of "Hypervisor based CPUFreq" are
>>>>>> present: ACPI Processor P-States Driver and AMD Architectural P-state
>>>>>> Driver. You can find both them in xen/arch/x86/acpi/cpufreq/ dir.
>>>>>>
>>>>>> [1] https://wiki.xenproject.org/wiki/Xen_power_management#CPU_P-states_.28cpufreq.29
>>>>>> [2] https://wiki.xenproject.org/wiki/Xen_power_management#Hypervisor_based_cpufreq
>>>>>
>>>>> Thanks for the research and the pointers, will look at it later.
>>>>>
>>>>>>>> But, the main question we have to answer is about frequency changing interface in virtualized system. The frequency changing interface and all dependent components which needed CPUFreq to be functional on ARM are not present in Xen these days. The list of required components is quite big and may change across different ARM SoC vendors. As an example, the following components are involved in DVFS on Renesas Salvator-X board which has R-Car Gen3 SoC installed: generic clock, regulator and thermal frameworks, Vendor’s CPG, PMIC, AVS, THS drivers, i2c support, etc.
>>>>>>>>
>>>>>>>> We were considering a few possible approaches of hypervisor based CPUFreqs on ARM and came to conclusion to base this solution on popular at the moment, already upstreamed to Linux, ARM System Control and Power Interface(SCPI) protocol [1]. We chose SCPI protocol instead of newer ARM System Control and Management Interface (SCMI) protocol [2] since it is widely spread in Linux, there are good examples how to use it, the range of capabilities it has is enough for implementing hypervisor based CPUFreq and, what is more, upstream Linux support for SCMI is missed so far, but SCMI could be used as well.
>>>>>>>>
>>>>>>>> Briefly speaking, the SCPI protocol is used between the System Control Processor(SCP) and the Application Processors(AP). The mailbox feature provides a mechanism for inter-processor communication between SCP and AP. The main purpose of SCP is to offload different PM related tasks from AP and one of the services that SCP provides is Dynamic voltage and frequency scaling (DVFS), it is what we actually need for CPUFreq. I will describe this approach in details down the text.
>>>>>>>>
>>>>>>>> Let me explain a bit more what these possible approaches are:
>>>>>>>>
>>>>>>>> 1. “Xen+hwdom” solution.
>>>>>>>> GlobalLogic team proposed split model [3], where “hwdom-cpufreq” frontend driver in Xen interacts with the “xen-cpufreq” backend driver in Linux hwdom (possibly dom0) in order to scale physical CPUs. This solution hasn’t been accepted by Xen community yet and seems it is not going to be accepted without taking into the account still unanswered major questions and proving that “all-in-Xen” solution, which Xen community considered as more architecturally cleaner option, would be unworkable in practice.
>>>>>>>> The other reasons why we decided not to stick to this approach are complex communication interface between Xen and hwdom: event channel, hypercalls, syscalls, passing CPU info via DT, etc and possible synchronization issues with a proposed solution.
>>>>>>>> Although it is worth to mention that the beauty of this approach was that there wouldn’t be a need to port a lot of things to Xen. All frequency changing interface and all dependent components which needed CPUFreq to be functional were already in place.
>>>>>>>
>>>>>>> Stefano, Julien and I were thinking about this: Wouldn't it be possible
>>>>>>> to come up with some hardware domain, solely dealing with CPUFreq
>>>>>>> changes? This could run a Linux kernel, but no or very little userland.
>>>>>>> All its vCPUs would be pinned to pCPUs and would normally not be
>>>>>>> scheduled by Xen. If Xen wants to change the frequency, it schedules the
>>>>>>> respective vCPU to the right pCPU and passes down the frequency change
>>>>>>> request. Sounds a bit involved, though, and probably doesn't solve the
>>>>>>> problem where this domain needs to share access to hardware with Dom0
>>>>>>> (clocks come to mind).
>>>>>> Yes, another question is how to get this Linux kernel stuff (backend,
>>>>>> top level driver, etc) upstreamed.
>>>>>
>>>>> Well, the idea would be to use already upstream drivers to actually
>>>>> implement OPP changes (via Linux clock and regulator drivers), then use
>>>>> existing interfaces like the userspace governor, for instance, to
>>>>> trigger those. I don't think we need much extra kernel code for that.
>>>> I understand. Backend in userspace sets desired frequency by request
>>>> from frontend in Xen.
>>>
>>> Yeah, something like that. It was just an idea, not fully thought
>>> through yet.
>>>
>>>>>>>> Although this approach is not used, still I picked a few already acked patches which made ACPI specific CPUFreq stuff more generic.
>>>>>>>>
>>>>>>>> 2. “all-in-Xen” solution.
>>>>>>>> This implies that all CPUFreq related stuff should be located in Xen.
>>>>>>>> Community considered this solution as more architecturally cleaner option than “Xen+hwdom” one. No layering violation comparing with the previous approach (letting guest OS manage one or more physical CPUs is more of a layering violation).
>>>>>>>> This solution looks better, but to be honest, we are not in favor of this solution as well. We expect enormous developing effort to get this support in (the scope of required components looks unreal) and maintain it. So, we decided not to stick to this approach as well.
>>>>>>>
>>>>>>> Yes, I even think it's not feasible to implement this. With a modern
>>>>>>> clock implementation there is one driver to control *all* clocks of an
>>>>>>> SoC, so you can't single out the CPU clock easily, for instance. One
>>>>>>> would probably run into synchronisation issues, at best.
>>>>>>>
>>>>>>>> 3. “Xen+SCP(ARM TF)” solution.
>>>>>>>> It is yet another solution based on ARM SCPI protocol. The generic idea here is that there is a firmware, which being a server runs on some dedicated IP core (server), provides different PM services (DVFS, sensors, etc). On the other side there is a CPUFreq driver in Xen, which is running on the AP (client), consumes these services. CPUFreq driver neither changes the CPU frequency/voltage by itself nor cooperates with Linux in order to do such job. It just communicates with SCP directly using SCPI protocol. As I said before, some integrated into a SoC mailbox IP need to be used for IPC (doorbell for triggering action and shared memory region for commands). CPUFreq driver doesn’t even need to know what should be physically changed for the new frequency to take effect. It is a certainly SCP’s responsibility. This all avoid CPUFreq infrastructure in Xen on ARM from diving into each supported SoC internals and as the result having a lot of code.
>>>>>>>>
>>>>>>>> The possible issue here could be in SCP, the problem is that some dedicated IP core may be absent at all or performs other than PM tasks. Fortunately, there is a brilliant solution to teach firmware running in the EL3 exception level (ARM TF) to perform SCP functions and use SMC calls for communications [4]. Exactly this transport implementation I want to bring to Xen the first. Such solution is going to be generic across all ARM platforms that do have firmware running in the EL3 exception level and don’t have candidate for being SCP.
>>>>>>>
>>>>>>> While I feel flattered that you like that idea as well ;-), you should
>>>>>>> mention that this requires actual firmware providing those services.
>>>>>> Yes, some firmware, which provides these services, must be present
>>>>>> on the other end.
>>>>>> It is a firmware which runs on the dedicated IP core(s) in common case.
>>>>>> And it is a firmware which runs on the same core(s) as the hypervisor
>>>>>> in particular case.
>>>>>>
>>>>>>> I
>>>>>>> am not sure there is actually *any* implementation of this at the
>>>>>>> moment, apart from my PoC code for Allwinner.
>>>>>> Your PoC is a good example for writing the firmware side. So, why not
>>>>>> use it as a base for
>>>>>> other platforms?
>>>>>
>>>>> Sure, but normally firmware is provided by the vendor. And until more
>>>>> vendors actually implement this, it's a bit weird to ask Xen users to
>>>>> install this hand-crafted home-brew firmware to use this feature.
>>>>> For a particular embedded use case like yours this might be feasible,
>>>>> though.
>>>> Agree. It is exactly for ARM SoCs with security extensions enabled,
>>>> but where an SCP isn't available.
>>>> And such SoCs do exist.
>>>
>>> Sure, also it depends on the accessibility of firmware. Some SoCs only
>>> run signed firmware, or there is no source code for crucial firmware
>>> components (SoC setup, DRAM init), so changing the firmware might not be
>>> an option.
>>
>> Agree.
>>
>>>
>>>>>>> And from a Xen point of view I am not sure we are in the position to
>>>>>>> force users to use this firmware. This may be feasible in a classic
>>>>>>> embedded scenario, where both firmware and software are provided by the
>>>>>>> same entity, but that should be clearly noted as a restriction.
>>>>>> Agree.
>>>>>>
>>>>>>>
>>>>>>>> Here we have completely synchronous case because of SMC calls nature. SMC triggered mailbox driver emulates a mailbox which signals transmitted data via Secure Monitor Call (SMC) instruction [5]. The mailbox receiver is implemented in firmware and synchronously returns data when it returns execution to the non-secure world again. This would allow us both to trigger a request and transfer execution to the firmware code in a safe and architected way. Like PSCI requests.
>>>>>>>> As you can see this method is free from synchronization issues. What is more, this solution is more architecturally cleaner solution than split model “Xen+hwdom” one. From the security point of view, I hope, everything will be much more correct since the ARM TF, which we want to see in charge of controlling CPU frequency/voltage, is a trusted SW layer. Moreover, ARM TF is responsible for enabling/disabling CPU (PSCI) and nobody complains about it, so let it do DVFS too.
>>>>>>>
>>>>>>> It should be noted that this synchronous nature of the communication can
>>>>>>> actually be a problem: a DVFS request usually involves regulator and PLL
>>>>>>> changes, which could take some time to settle in. Blocking all of this
>>>>>>> time (milliseconds?) in EL3 (probably busy-waiting) might not be desirable.
>>>>>> Agree. I haven't measured time yet to say how long is it, since I
>>>>>> don't have a working firmware at the moment, just an emulator,
>>>>>> but, yes, it will definitely take some time. The whole system won't be
>>>>>> blocked, only the CPU which performs SMC call.
>>>>>> But, if we ask hwdom to change frequency we will wait too? Or if Xen
>>>>>> manages PLL/regulator by itself, it will wait anyway?
>>>>>
>>>>> Normally this is done asynchronously. For instance the OS programs the
>>>>> regulator to change the voltage, then does other things until the
>>>>> regulator signals the change has been realised. Then it re-programs the
>>>>> PLL, again executing other code, eventually being interrupted by a
>>>>> completion interrupt (or by periodically polling a bit). If we need to
>>>>> spend all of this time in EL3, the HV is blocked on this. This might or
>>>>> might not be a problem, but it should be noted.
>>>> Agree.
>>>>
>>>>>
>>>>>>>> I have to admit that I have checked this solution only due to a lack of candidate for being SCP. But, I hope, that other ARM SoCs where dedicated SCP is present (asynchronous case) will work too, but with some limitations. The mailbox IPs for these ARM SoCs must have TX/RX-done irqs. I have described in the corresponding patches why this limitation is present.
>>>>>>>>
>>>>>>>> To be honest I have Renesas R-Car Gen3 SoCs in mind as our nearest target, but I would like to make this solution as generic as possible. I don’t treat proposed solution as world-wide generic, but I hope, this solution may be suitable for other ARM SoCs which meet such requirements. Anyway, having something which works, but doesn’t cover all cases is better than having nothing.
>>>>>>>>
>>>>>>>> I would like to notice that the patches are POC state and I post them just to illustrate in more detail of what I am talking about. Patch series consist of the following parts:
>>>>>>>> 1. GL’s patches which make ACPI specific CPUFreq stuff more generic. Although these patches has been already acked by Xen community and the CPUFreq code base hasn’t changed in a last few years I drop all A-b.
>>>>>>>> 2. A bunch of device-tree helpers and macros.
>>>>>>>> 3. Direct ported SCPI protocol, mailbox infrastructure and the ARM SMC triggered mailbox driver. All components except mailbox driver are in mainline Linux.
>>>>>>>
>>>>>>> Why do you actually need this mailbox framework? Actually I just
>>>>>>> proposed the SMC driver to make it fit into the Linux framework. All we
>>>>>>> actually need for SCPI is to write a simple command into some memory and
>>>>>>> "press a button". I don't see a need to import the whole Linux
>>>>>>> framework, especially as our mailbox usage is actually just a corner
>>>>>>> case of the mailbox's capability (namely a "single-bit" doorbell).
>>>>>>> The SMC use case is trivial to implement, and I believe using the Juno
>>>>>>> mailbox is similarly simple, for instance.
>>>>>> I did a direct port for SCPI protocol. I think, it is something that
>>>>>> should be retained as much as possible.
>>>>>
>>>>> But the actual protocol is really simple. And we just need a subset of
>>>>> it, namely to query and trigger OPPs.
>>>> Yes. I think, that "Sensors service" is needed as well. I think that
>>>> CPUFreq is not completed without thermal feedback.
>>>
>>> Personally I think this should be handled by the SCPI firmware: if the
>>> requested OPP would violate thermal constraint, the firmware would just
>>> not set it. Also (secure) temperature alarm interrupts could lower the OPP.
>>> Doing this in firmware means it would just need to be implemented once,
>>> and I consider this system critical, so firmware is conceptually the
>>> better place for this code.
>>
>> Sounds reasonable for me.
>>
>>>
>>>>>> Protocol relies on mailbox feature, so I ported mailbox too. I think,
>>>>>> it would be much more easy for me to just add
>>>>>> a few required commands handling with issuing SMC call and without any
>>>>>> mailbox infrastructure involved.
>>>>>> But, I want to show what is going on and what place these things come from.
>>>>>
>>>>> I appreciate that, but I think we already have enough "bloated" Linux +
>>>>> glue code in Xen. And in particular the Linux mailbox framework is much
>>>>> more powerful than we need for SCPI, so we have a lot of unneeded
>>>>> functionality.
>>>>> If we just want to support CPUfreq using SCPI via SMC/Juno MHU/Rockchip
>>>>> mailbox, we can get away with a *much* simpler solution.
>>>>
>>>> Agree, but I am afraid that simplifying things now might lead to some
>>>> difficulties when there is a need
>>>> to integrate a slightly different mailbox IP. Also, we need to
>>>> recheck whether SCMI, which we might want to support as well,
>>>> has a similar interface to the mailbox.
>>>>
>>>>> - We would need to port mailbox drivers one-by-one anyway, so we could
>>>>> as well implement the simple "press-the-button" subset for each mailbox
>>>>> separately. The interface between the SCPI code and the mailbox is
>>>>> probably just "signal_mailbox()". For SMC it's trivial, and for the Juno
>>>>> MHU it's also simple, I guess ([1], chapter 3.6).
>>>>> - The SCPI message assembly is easy as well.
>>>>> - The only other code needed is some DT parsing code to be compatible
>>>>> with the existing DTs describing the SCPI implementation. We would claim
>>>>> to have a mailbox driver for those compatibles, but cheat a bit since we
>>>>> only use it for SCPI and just need the single bit subset of the mailbox.
>>>> Yes, I think, we can optimize in a such way.
>>>>
>>>> Just to clarify:
>>>> Proposed "signal_mailbox" is intended for both actions: sending
>>>> request and receiving response?
>>>> So when it returns we will have either response or timeout error or
>>>> some callback will be needed anyway?
>>>>
>>>> I don't have any objections regarding optimizations, we need to
>>>> decide what mailboxes we should stick to (we can support) and in what
>>>> form we should keep
>>>> all this stuff in.
>>>> Also while making a decision, we need to keep in mind "direct ported
>>>> code" advantages:
>>>> - "direct ported code" (SCPI + mailbox) have had a thorough review by
>>>> the Linux community and Xen community
>>>>   may rely on their review.
>>>> - As "direct ported code" wasn't changed heavily, I believe, it would
>>>> be easy to backport fixes/features to Xen.
>>>
>>> I understand that, but as I wrote in the other mail: This is a lean
>>> hypervisor, not a driver and subsystem dump site. The security aspect of
>>>  just having much less code is crucial here.
>>>
>>>> So, let's decide.
>>>>
>>>>>
>>>>>> What is more, I don't want to restrict a usage of this CPUFreq by only
>>>>>> covering single scenario where a
>>>>>> firmware, which provides DVFS service, is in ARM TF. I hope, that this
>>>>>> solution will be suitable for ARM SoCs where a standalone SCP
>>>>>> is present and real mailbox IP, which has asynchronous nature, is used
>>>>>> for IPC. Of course, this mailbox must have TX/RX-done irqs.
>>>>>> This is a limitation at the moment.
>>>>>
>>>>> Sure, see above and the document [1] below.
>>>> Thank you for the link, it seems with MHU we have to poll for the
>>>> last_tx_done (where deasserted interrupt line in a status register is
>>>> a condition for)
>>>> after pressing the button. Or I missed something?
>>>
>>> It depends on whether we care. We could just treat this request in a
>>> fire-and-forget manner. I am not sure in how far Xen really needs to
>>> know the actual OPP used and when it's ready.
>>
>> I got your point.
>>
>> There is a "get" callback for CPUFreq drivers, where the CPUFreq core
>> expects to get current frequency.
>> Current frequency is also needed for initial condition, we might guess
>> it, but why if SCPI does allow to retrieve it.
>
> Well, that means you can read it if you want to know. That doesn't mean
> that an implementation needs to poll the current state to see if has
> been realized already.
> If a system asks for a lower frequency, it might just express the
> possibility to run at this speed, not necessary the hard requirement to
> actually do so.
>
>> Personally I think, that although "fire-and-forget" manner has
>> advantage (a code is much simple) we will never know what is going on
>> in case of errors,
>> there are, I think, a few reasons for the firmware not to process request.
>> I agree, that we could try not to wait for the real TX-done condition
>> at all for asynchronous mailboxes if we are not going to queue
>> requests.
>> Because it is quite clear if we already got a response, that a request
>> has been successfully reached other end,
>> but if we got a timeout error, that something bad had happened and we
>> could treat it as a global connection error, for example.
>> But, responses it is something we should handle.
>
> I think we might care when we want to change it *again* or when a user
> actually asks for the current frequency. But even this might not tell
> you the truth.
> Think of your x86 laptop: It might get boosted without the OS knowing,
> or thermal throttling might actually limit the frequency. Mine at least
> does that all of the time.

I understand that.

To be clear, I was not asking to poll the current state to see if the
CPU frequency has been *physically* changed.
I was just worried that a "fire-and-forget" approach wouldn't allow us to
see any responses from the other side.

Anyway, it is open to discussion.
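
To illustrate the difference between the two approaches discussed here, a
minimal sketch follows. It is not code from this series: the names
(mbox_ops, mbox_send_wait, mbox_send_nowait, fake_scp) and the error codes
are hypothetical, and the "firmware" is simulated. It only shows that a
bounded wait for a response lets the caller tell success, a firmware-side
refusal and a timeout apart, while fire-and-forget learns nothing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical transport hooks. A real port would ring an MHU doorbell
 * or issue an SMC here; none of these names exist in Xen or Linux. */
struct mbox_ops {
    void (*ring)(void *ctx);            /* "press the button" */
    bool (*response_ready)(void *ctx);  /* has the other side answered? */
    int32_t (*read_status)(void *ctx);  /* status word of the response */
    void *ctx;
};

/* Hypothetical result codes for this sketch. */
enum { MBOX_OK = 0, MBOX_ETIMEDOUT = -1, MBOX_EFIRMWARE = -2 };

/* Fire-and-forget: the request is sent, nothing is learnt about it. */
void mbox_send_nowait(const struct mbox_ops *ops)
{
    assert(ops && ops->ring);
    ops->ring(ops->ctx);
}

/* Send a request and poll for the response at most max_polls times. */
int mbox_send_wait(const struct mbox_ops *ops, unsigned int max_polls)
{
    unsigned int i;

    assert(ops && ops->ring && ops->response_ready && ops->read_status);
    ops->ring(ops->ctx);

    for (i = 0; i < max_polls; i++) {
        if (ops->response_ready(ops->ctx))
            return ops->read_status(ops->ctx) == 0 ? MBOX_OK
                                                   : MBOX_EFIRMWARE;
    }

    /* No response in time: could be treated as a connection error. */
    return MBOX_ETIMEDOUT;
}

/* Simulated firmware: answers after a given number of polls. */
struct fake_scp {
    unsigned int polls_until_ready;
    int32_t status;
    bool rung;
};

void fake_ring(void *ctx)
{
    ((struct fake_scp *)ctx)->rung = true;
}

bool fake_ready(void *ctx)
{
    struct fake_scp *s = ctx;

    if (!s->rung)
        return false;
    if (s->polls_until_ready == 0)
        return true;
    s->polls_until_ready--;
    return false;
}

int32_t fake_status(void *ctx)
{
    return ((struct fake_scp *)ctx)->status;
}
```

With the simulated firmware, a request answered within the poll budget
returns MBOX_OK, a non-zero status word returns MBOX_EFIRMWARE, and an
unanswered request returns MBOX_ETIMEDOUT.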

>
> Cheers,
> Andre.
>
>> So, MHU as well as Rockchip and other mailbox IPs which do have
>> RX-done irq, I believe, we will be able to handle.

-- 
Regards,

Oleksandr Tyshchenko

* Re: [RFC PATCH 00/31] CPUFreq on ARM
  2017-11-17 14:55               ` Oleksandr Tyshchenko
@ 2017-11-17 16:41                 ` Andre Przywara
  2017-11-17 17:22                   ` Oleksandr Tyshchenko
  0 siblings, 1 reply; 108+ messages in thread
From: Andre Przywara @ 2017-11-17 16:41 UTC (permalink / raw)
  To: Oleksandr Tyshchenko
  Cc: Edgar E . Iglesias, Stefano Stabellini, Oleksandr Tyshchenko,
	Andrew Cooper, Julien Grall, Jassi Brar, Jan Beulich,
	Sudeep Holla, xen-devel

Hi,

....

>> So Xen does not need to throw in its own ideas here. Which would avoid
>> some of the hard problems we encountered.
> I got all your point.
> Just question. Why does existing CPUFreq on x86 have own logic? Do we have
> something yet another on ARM that having own logic in Xen doesn't make
> any sense?

That's a good question. From quickly poking some people in #xendevel,
Julien learnt that CPUFreq on x86 might not really work well or at least
not as expected.
So the benefit is not even clear there. It just went in the tree once,
and possibly nobody ever revisited it since.
And even if there were good reasons back then, modern CPUs tend to be
quite different in terms of power characteristics.

....

>> 0. Decide whether CPUFreq justifies 1.-4. in the first place.
> Sure,
>> That sounds like a lot of work and code, so we should be sure it's worth it.
>>
>> I wonder if you could provide some input, ideally measurements on the
>> actual power savings CPUFreq provides.
> Well, I think I will be able to provide some numbers when a firmware,
> which runs on the SoC
> I am using, is ready. Actually, currently I have an emulator without
> any real freq/volt changes.

Yes, some actual numbers would very much help the case. I don't think
you need very sophisticated equipment; just running a workload once with
and once without CPUFreq and comparing the power consumption would be a
good start. This could be as easy as measuring the (m)Wh consumed with
some wall-plug type power meter. I use some very cheap USB power
meter[1], which I put between the PSU and some single board computer to
get an idea of what the power consumption is. Surely not really
reliable, but better than nothing.
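
As a rough sketch of how such a comparison could be evaluated (the function
names and the fixed sampling interval are assumptions for illustration, not
part of any tool mentioned here): integrate the power readings of each run
into energy and express the difference relative to the run without CPUFreq.

```c
#include <assert.h>
#include <stddef.h>

/* Integrate power samples (in mW), taken every interval_s seconds,
 * into energy in mWh using a simple rectangle rule. */
double energy_mwh(const double *power_mw, size_t n, double interval_s)
{
    double sum_mws = 0.0;   /* milliwatt-seconds */
    size_t i;

    assert(power_mw != NULL && interval_s > 0.0);
    for (i = 0; i < n; i++)
        sum_mws += power_mw[i] * interval_s;

    return sum_mws / 3600.0;
}

/* Saving of the candidate run relative to the baseline run, in percent.
 * A negative result means the candidate consumed more energy. */
double saving_percent(double baseline_mwh, double candidate_mwh)
{
    assert(baseline_mwh > 0.0);
    return 100.0 * (baseline_mwh - candidate_mwh) / baseline_mwh;
}
```

Both runs need to execute the same workload to completion, otherwise a
slower run at a lower frequency is compared against a shorter one.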

>> Does the wish to have CPUFreq purely come from some "tick-the-box"
>> exercise? As in: We have it on native Linux, so we need it in Xen?
> As I said before, we are interesting in purely embedded use-cases
> where power consumption is a question.
> If you know how to save power without having CPUFreq involved I would
> appreciate the pointers.

As Julien said, I guess idling and CPU offlining/CPU suspend (via PSCI)
would be a good start to look at. You could try to get some numbers on
this as well.

Cheers,
Andre.

[1]
https://www.ebay.co.uk/itm/USB-Charger-Doctor-Voltage-Current-Meter-Mobile-Battery-Tester-Power-Detector-UK/263220956905

>> What power savings can we expect from CPUFreq? Can those possible
>> savings be transferred into a virtualized environment at all? And do
>> those saving justify all the extra code in Xen?
>>
>> I think those questions need to be answered first, then we can discuss
>> about the implementation details.
> OK.
> 

* Re: [RFC PATCH 00/31] CPUFreq on ARM
  2017-11-17 16:41                 ` Andre Przywara
@ 2017-11-17 17:22                   ` Oleksandr Tyshchenko
  0 siblings, 0 replies; 108+ messages in thread
From: Oleksandr Tyshchenko @ 2017-11-17 17:22 UTC (permalink / raw)
  To: Andre Przywara
  Cc: Edgar E . Iglesias, Stefano Stabellini, Oleksandr Tyshchenko,
	Andrew Cooper, Julien Grall, Jassi Brar, Jan Beulich,
	Sudeep Holla, xen-devel

On Fri, Nov 17, 2017 at 6:41 PM, Andre Przywara
<andre.przywara@linaro.org> wrote:
> Hi,
Hi Andre

>
> ....
>
>>> So Xen does not need to throw in its own ideas here. Which would avoid
>>> some of the hard problems we encountered.
>> I got all your point.
>> Just question. Why does existing CPUFreq on x86 have own logic? Do we have
>> something yet another on ARM that having own logic in Xen doesn't make
>> any sense?
>
> That's a good question. From quickly poking some people in #xendevel,
> Julien learnt that CPUFreq on x86 might not really work well or at least
> not as expected.
> So the benefit is not even clear there. It just went in the tree once,
> and possibly nobody ever revisited it since.
> And even if there were good reasons back then, modern CPUs tend to be
> quite different in terms of power characteristics.
Thank you for the clarification. It is clear now.

>
> ....
>
>>> 0. Decide whether CPUFreq justifies 1.-4. in the first place.
>> Sure,
>>> That sounds like a lot of work and code, so we should be sure it's worth it.
>>>
>>> I wonder if you could provide some input, ideally measurements on the
>>> actual power savings CPUFreq provides.
>> Well, I think I will be able to provide some numbers when a firmware,
>> which runs on the SoC
>> I am using, is ready. Actually, currently I have an emulator without
>> any real freq/volt changes.
>
> Yes, some actual numbers would very much help the case. I don't think
> you need very sophisticated equipment, just running a workload once with
> and once without CPUFreq and compare the power consumption would be a
> good start. This could be as easy as measuring the (m)Wh consumed with
> some wall-plug type power meter. I use some very cheap USB power
> meter[1], which I put between the PSU and some single board computer to
> get an idea on what the power consumption is. Surely not really
> reliable, but better than nothing.
Thank you for the pointer. I am afraid it is going to be a question of how
to measure power consumption on my development board :) The most effective
way would be to measure the current on the CPU power rail(s).

I think I could collect some statistics (Px vs. time) for different
use-cases using the xenpm tool.
Here, "without CPUFreq" means just setting the "userspace" governor and
keeping exactly the same frequency that the firmware hands over to us.
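
One way such "Px vs. time" statistics could be condensed into a single
number per use-case is a time-weighted average frequency. The sketch below
is only an illustration: struct px_residency and avg_freq_khz are
hypothetical names, and the residency values would have to be taken from
whatever per-P-state counters the tool reports.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: one P-state and the time spent in it. */
struct px_residency {
    uint32_t freq_khz;      /* frequency of this P-state */
    uint64_t residency_ms;  /* time spent in this P-state */
};

/* Time-weighted average frequency in kHz over all P-states.
 * Returns 0 if no residency has been recorded. */
uint64_t avg_freq_khz(const struct px_residency *px, size_t n)
{
    uint64_t total_ms = 0, weighted = 0;
    size_t i;

    assert(px != NULL || n == 0);
    for (i = 0; i < n; i++) {
        total_ms += px[i].residency_ms;
        weighted += (uint64_t)px[i].freq_khz * px[i].residency_ms;
    }

    return total_ms ? weighted / total_ms : 0;
}
```

For the "userspace" governor baseline all residency falls into one P-state,
so the average equals the fixed frequency; a run with a scaling governor
gives a lower average whenever the CPU spent time in lower P-states.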

>
>>> Does the wish to have CPUFreq purely come from some "tick-the-box"
>>> exercise? As in: We have it on native Linux, so we need it in Xen?
>> As I said before, we are interesting in purely embedded use-cases
>> where power consumption is a question.
>> If you know how to save power without having CPUFreq involved I would
>> appreciate the pointers.
>
> As Julien said, I guess idling and CPU offlining/CPU suspend (via PSCI)
> would be a good start to look at. You could try to get some numbers on
> this as well.
Yes.

>
> Cheers,
> Andre.
>
> [1]
> https://www.ebay.co.uk/itm/USB-Charger-Doctor-Voltage-Current-Meter-Mobile-Battery-Tester-Power-Detector-UK/263220956905
>
>>> What power savings can we expect from CPUFreq? Can those possible
>>> savings be transferred into a virtualized environment at all? And do
>>> those saving justify all the extra code in Xen?
>>>
>>> I think those questions need to be answered first, then we can discuss
>>> about the implementation details.
>> OK.
>>



-- 
Regards,

Oleksandr Tyshchenko

* Re: [RFC PATCH 00/31] CPUFreq on ARM
  2017-11-17 14:01               ` Julien Grall
@ 2017-11-17 18:36                 ` Oleksandr Tyshchenko
  0 siblings, 0 replies; 108+ messages in thread
From: Oleksandr Tyshchenko @ 2017-11-17 18:36 UTC (permalink / raw)
  To: Julien Grall
  Cc: Edgar E . Iglesias, Stefano Stabellini, Oleksandr Tyshchenko,
	Andrew Cooper, Andre Przywara, Jassi Brar, Jan Beulich,
	Sudeep Holla, xen-devel

On Fri, Nov 17, 2017 at 4:01 PM, Julien Grall <julien.grall@linaro.org> wrote:
> Hi,
Hi, Julien.

>
> First of all, thank you Oleksandr for starting a thread around CPUFreq
> support.
Thank you for the valuable comments.

>
> On 11/16/2017 05:04 PM, Andre Przywara wrote:
>>
>> On 16/11/17 14:57, Oleksandr Tyshchenko wrote:
>>>
>>> On Wed, Nov 15, 2017 at 4:28 PM, Andre Przywara
>>> <andre.przywara@linaro.org> wrote:
>>> Anyway, I think we should go step-by-step.
>>> If community agreed that CPUFreq feature in Xen on ARM was needed and
>>> SCPI/SCMI based approach
>>> was the right thing to do in general I would stick to next taking into
>>> the account Andre's suggestions
>>> regarding some guest input:
>>>
>>> 1. Xen do have CPUFreq logic. It measures CPUs utilization by itself.
>>> 2. In addition it can collect OPP change requests from the guests:
>>>    - There are some politics describing which guest is allowed to send
>>> OPP change request.
>>>    - Of course, involved guests have CPUFreq enabled. All we need is
>>> these OPP change requests don't lead to
>>>      any physical changes and be picked up by Xen. Here we could use
>>> Andre's idea here (SCPI CPUFreq + SMC mailbox with hvc method).
>>> 3. Xen makes a decision based on the whole system status it measures
>>> periodically and guests input (OPP change requests) if present.
>>> 4. Xen actually issues command to change the CPU frequency (sends OPP
>>> change request to SCP).
>>>
>>> How does it sound?
>>
>>
>> 0. Decide whether CPUFreq justifies 1.-4. in the first place. That
>> sounds like a lot of work and code, so we should be sure it's worth it.
>>
>> I wonder if you could provide some input, ideally measurements on the
>> actual power savings CPUFreq provides.
>>
>> Does the wish to have CPUFreq purely come from some "tick-the-box"
>> exercise? As in: We have it on native Linux, so we need it in Xen?
>>
>> What power savings can we expect from CPUFreq? Can those possible
>> savings be transferred into a virtualized environment at all? And do
>> those saving justify all the extra code in Xen?
>>
>> I think those questions need to be answered first, then we can discuss
>> about the implementation details.
>
>
> I am going to throw a bit more ideas. From the discussion, it look like to
> me the story is around power saving when using Xen. Am I right?
Yes.

>
> Have you explored some other possibility to save power? I am asking that,
> because the topic is fairly new with Xen.
As for me, no, I haven't.

>
> Once area where power could be saved is the idle loop (see idle_loop in
> arch/arm/domain.c). At the momment only WFI is used. It would be possible to
> go in deeper low-power state by using PSCI.
>
> Similarly, the virtual PSCI implementation for suspend is quite simple. You
> could potentially use those information to decide what to do with the pCPU
> (suspend, turning off...).
Is a vPSCI implementation already present? If so, could you give me
some pointers to look at?

>
> Cheers,
>
> --
> Julien Grall

What I was also thinking about is "boot time".

For example, there is a strict boot time requirement for some
embedded/industrial systems powered by the Xen hypervisor,
so the whole system should be up and running as soon as possible.
Together with other boot time optimization techniques, the CPU boost
feature (if present) can help here.
Usually firmware sets some initial frequency (possibly the nominal
frequency, possibly the highest frequency;
I am not really sure what the "boot" frequencies are across other ARM
SoCs), which is used until CPUFreq comes into play.
And if we don't have CPUFreq in the system, we can't set the highest (or
even turbo) frequency in firmware (or in Xen before starting dom0) to
speed up booting,
because there is nothing in the system that will scale the CPU
frequency down afterwards, and it may be "not safe" from the silicon
viewpoint as well as "not optimal" from the power consumption
viewpoint
to run at such a high frequency "all the time".

-- 
Regards,

Oleksandr Tyshchenko

* Re: [RFC PATCH 01/31] cpufreq: move cpufreq.h file to the xen/include/xen location
  2017-11-09 17:09 ` [RFC PATCH 01/31] cpufreq: move cpufreq.h file to the xen/include/xen location Oleksandr Tyshchenko
@ 2017-12-02  0:35   ` Stefano Stabellini
  0 siblings, 0 replies; 108+ messages in thread
From: Stefano Stabellini @ 2017-12-02  0:35 UTC (permalink / raw)
  To: Oleksandr Tyshchenko
  Cc: Stefano Stabellini, Andrew Cooper, Oleksandr Dmytryshyn,
	Julien Grall, Oleksandr Tyshchenko, Jan Beulich, xen-devel

On Thu, 9 Nov 2017, Oleksandr Tyshchenko wrote:
> From: Oleksandr Dmytryshyn <oleksandr.dmytryshyn@globallogic.com>
> 
> Cpufreq driver should be more generalizable (not ACPI-specific).
> Thus this file should be placed to more convenient location.
> 
> This is a rebased version of the original patch:
> https://lists.xen.org/archives/html/xen-devel/2014-11/msg00938.html
> 
> Signed-off-by: Oleksandr Dmytryshyn <oleksandr.dmytryshyn@globallogic.com>
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> CC: Jan Beulich <jbeulich@suse.com>
> CC: Andrew Cooper <andrew.cooper3@citrix.com>
> CC: Stefano Stabellini <sstabellini@kernel.org>
> CC: Julien Grall <julien.grall@linaro.org>

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>


> ---
>  MAINTAINERS                                  |   1 +
>  xen/arch/x86/acpi/cpu_idle.c                 |   2 +-
>  xen/arch/x86/acpi/cpufreq/cpufreq.c          |   2 +-
>  xen/arch/x86/acpi/cpufreq/powernow.c         |   2 +-
>  xen/arch/x86/acpi/power.c                    |   2 +-
>  xen/arch/x86/cpu/mwait-idle.c                |   2 +-
>  xen/drivers/acpi/pmstat.c                    |   2 +-
>  xen/drivers/cpufreq/cpufreq.c                |   2 +-
>  xen/drivers/cpufreq/cpufreq_misc_governors.c |   2 +-
>  xen/drivers/cpufreq/cpufreq_ondemand.c       |   4 +-
>  xen/drivers/cpufreq/utility.c                |   2 +-
>  xen/include/acpi/cpufreq/cpufreq.h           | 245 --------------------------
>  xen/include/xen/cpufreq.h                    | 248 +++++++++++++++++++++++++++
>  13 files changed, 260 insertions(+), 256 deletions(-)
>  delete mode 100644 xen/include/acpi/cpufreq/cpufreq.h
>  create mode 100644 xen/include/xen/cpufreq.h
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 5b9e123..524e067 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -295,6 +295,7 @@ X:	xen/arch/x86/acpi/boot.c
>  X:	xen/arch/x86/acpi/lib.c
>  F:	xen/drivers/cpufreq/
>  F:	xen/include/acpi/cpufreq/
> +F:	xen/include/xen/cpufreq.h
>  
>  PUBLIC I/O INTERFACES AND PV DRIVERS DESIGNS
>  M:  Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> diff --git a/xen/arch/x86/acpi/cpu_idle.c b/xen/arch/x86/acpi/cpu_idle.c
> index 482b8a7..c66622e 100644
> --- a/xen/arch/x86/acpi/cpu_idle.c
> +++ b/xen/arch/x86/acpi/cpu_idle.c
> @@ -49,7 +49,7 @@
>  #include <xen/softirq.h>
>  #include <public/platform.h>
>  #include <public/sysctl.h>
> -#include <acpi/cpufreq/cpufreq.h>
> +#include <xen/cpufreq.h>
>  #include <asm/apic.h>
>  #include <asm/cpuidle.h>
>  #include <asm/mwait.h>
> diff --git a/xen/arch/x86/acpi/cpufreq/cpufreq.c b/xen/arch/x86/acpi/cpufreq/cpufreq.c
> index 1f8d02a..bd82025 100644
> --- a/xen/arch/x86/acpi/cpufreq/cpufreq.c
> +++ b/xen/arch/x86/acpi/cpufreq/cpufreq.c
> @@ -41,7 +41,7 @@
>  #include <asm/percpu.h>
>  #include <asm/cpufeature.h>
>  #include <acpi/acpi.h>
> -#include <acpi/cpufreq/cpufreq.h>
> +#include <xen/cpufreq.h>
>  
>  enum {
>      UNDEFINED_CAPABLE = 0,
> diff --git a/xen/arch/x86/acpi/cpufreq/powernow.c b/xen/arch/x86/acpi/cpufreq/powernow.c
> index 8f1ac74..79f55a3 100644
> --- a/xen/arch/x86/acpi/cpufreq/powernow.c
> +++ b/xen/arch/x86/acpi/cpufreq/powernow.c
> @@ -35,7 +35,7 @@
>  #include <asm/percpu.h>
>  #include <asm/cpufeature.h>
>  #include <acpi/acpi.h>
> -#include <acpi/cpufreq/cpufreq.h>
> +#include <xen/cpufreq.h>
>  
>  #define CPUID_FREQ_VOLT_CAPABILITIES    0x80000007
>  #define CPB_CAPABLE             0x00000200
> diff --git a/xen/arch/x86/acpi/power.c b/xen/arch/x86/acpi/power.c
> index 1e4e568..beebd5a 100644
> --- a/xen/arch/x86/acpi/power.c
> +++ b/xen/arch/x86/acpi/power.c
> @@ -28,7 +28,7 @@
>  #include <asm/tboot.h>
>  #include <asm/apic.h>
>  #include <asm/io_apic.h>
> -#include <acpi/cpufreq/cpufreq.h>
> +#include <xen/cpufreq.h>
>  
>  uint32_t system_reset_counter = 1;
>  
> diff --git a/xen/arch/x86/cpu/mwait-idle.c b/xen/arch/x86/cpu/mwait-idle.c
> index 762dff1..29f0286 100644
> --- a/xen/arch/x86/cpu/mwait-idle.c
> +++ b/xen/arch/x86/cpu/mwait-idle.c
> @@ -58,7 +58,7 @@
>  #include <asm/hpet.h>
>  #include <asm/mwait.h>
>  #include <asm/msr.h>
> -#include <acpi/cpufreq/cpufreq.h>
> +#include <xen/cpufreq.h>
>  
>  #define MWAIT_IDLE_VERSION "0.4.1"
>  #undef PREFIX
> diff --git a/xen/drivers/acpi/pmstat.c b/xen/drivers/acpi/pmstat.c
> index 2a6c4c7..2dbde1c 100644
> --- a/xen/drivers/acpi/pmstat.c
> +++ b/xen/drivers/acpi/pmstat.c
> @@ -38,7 +38,7 @@
>  #include <xen/acpi.h>
>  
>  #include <public/sysctl.h>
> -#include <acpi/cpufreq/cpufreq.h>
> +#include <xen/cpufreq.h>
>  #include <xen/pmstat.h>
>  
>  DEFINE_PER_CPU_READ_MOSTLY(struct pm_px *, cpufreq_statistic_data);
> diff --git a/xen/drivers/cpufreq/cpufreq.c b/xen/drivers/cpufreq/cpufreq.c
> index 212f48f..ab909e2 100644
> --- a/xen/drivers/cpufreq/cpufreq.c
> +++ b/xen/drivers/cpufreq/cpufreq.c
> @@ -43,7 +43,7 @@
>  #include <asm/processor.h>
>  #include <asm/percpu.h>
>  #include <acpi/acpi.h>
> -#include <acpi/cpufreq/cpufreq.h>
> +#include <xen/cpufreq.h>
>  
>  static unsigned int __read_mostly usr_min_freq;
>  static unsigned int __read_mostly usr_max_freq;
> diff --git a/xen/drivers/cpufreq/cpufreq_misc_governors.c b/xen/drivers/cpufreq/cpufreq_misc_governors.c
> index 746bbcd..4a5510c 100644
> --- a/xen/drivers/cpufreq/cpufreq_misc_governors.c
> +++ b/xen/drivers/cpufreq/cpufreq_misc_governors.c
> @@ -18,7 +18,7 @@
>  #include <xen/init.h>
>  #include <xen/percpu.h>
>  #include <xen/sched.h>
> -#include <acpi/cpufreq/cpufreq.h>
> +#include <xen/cpufreq.h>
>  
>  /*
>   * cpufreq userspace governor
> diff --git a/xen/drivers/cpufreq/cpufreq_ondemand.c b/xen/drivers/cpufreq/cpufreq_ondemand.c
> index fe6c63d..1c384ec 100644
> --- a/xen/drivers/cpufreq/cpufreq_ondemand.c
> +++ b/xen/drivers/cpufreq/cpufreq_ondemand.c
> @@ -1,5 +1,5 @@
>  /*
> - *  xen/arch/x86/acpi/cpufreq/cpufreq_ondemand.c
> + *  xen/drivers/cpufreq/cpufreq_ondemand.c
>   *
>   *  Copyright (C)  2001 Russell King
>   *            (C)  2003 Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>.
> @@ -18,7 +18,7 @@
>  #include <xen/types.h>
>  #include <xen/sched.h>
>  #include <xen/timer.h>
> -#include <acpi/cpufreq/cpufreq.h>
> +#include <xen/cpufreq.h>
>  
>  #define DEF_FREQUENCY_UP_THRESHOLD              (80)
>  #define MIN_FREQUENCY_UP_THRESHOLD              (11)
> diff --git a/xen/drivers/cpufreq/utility.c b/xen/drivers/cpufreq/utility.c
> index 53879fe..a687e5a 100644
> --- a/xen/drivers/cpufreq/utility.c
> +++ b/xen/drivers/cpufreq/utility.c
> @@ -28,7 +28,7 @@
>  #include <xen/sched.h>
>  #include <xen/timer.h>
>  #include <xen/trace.h>
> -#include <acpi/cpufreq/cpufreq.h>
> +#include <xen/cpufreq.h>
>  #include <public/sysctl.h>
>  
>  struct cpufreq_driver   *cpufreq_driver;
> diff --git a/xen/include/acpi/cpufreq/cpufreq.h b/xen/include/acpi/cpufreq/cpufreq.h
> deleted file mode 100644
> index a5cd7d0..0000000
> --- a/xen/include/acpi/cpufreq/cpufreq.h
> +++ /dev/null
> @@ -1,245 +0,0 @@
> -/*
> - *  xen/include/acpi/cpufreq/cpufreq.h
> - *
> - *  Copyright (C) 2001 Russell King
> - *            (C) 2002 - 2003 Dominik Brodowski <linux@brodo.de>
> - *
> - * $Id: cpufreq.h,v 1.36 2003/01/20 17:31:48 db Exp $
> - *
> - * This program is free software; you can redistribute it and/or modify
> - * it under the terms of the GNU General Public License version 2 as
> - * published by the Free Software Foundation.
> - */
> -
> -#ifndef __XEN_CPUFREQ_PM_H__
> -#define __XEN_CPUFREQ_PM_H__
> -
> -#include <xen/types.h>
> -#include <xen/list.h>
> -#include <xen/cpumask.h>
> -
> -#include "processor_perf.h"
> -
> -DECLARE_PER_CPU(spinlock_t, cpufreq_statistic_lock);
> -
> -extern bool_t cpufreq_verbose;
> -
> -struct cpufreq_governor;
> -
> -struct acpi_cpufreq_data {
> -    struct processor_performance *acpi_data;
> -    struct cpufreq_frequency_table *freq_table;
> -    unsigned int arch_cpu_flags;
> -};
> -
> -extern struct acpi_cpufreq_data *cpufreq_drv_data[NR_CPUS];
> -
> -struct cpufreq_cpuinfo {
> -    unsigned int        max_freq;
> -    unsigned int        second_max_freq;    /* P1 if Turbo Mode is on */
> -    unsigned int        min_freq;
> -    unsigned int        transition_latency; /* in 10^(-9) s = nanoseconds */
> -};
> -
> -struct perf_limits {
> -    bool_t no_turbo;
> -    bool_t turbo_disabled;
> -    uint32_t turbo_pct;
> -    uint32_t max_perf_pct; /* max performance in percentage */
> -    uint32_t min_perf_pct; /* min performance in percentage */
> -    uint32_t max_perf;
> -    uint32_t min_perf;
> -    uint32_t max_policy_pct;
> -    uint32_t min_policy_pct;
> -};
> -
> -struct cpufreq_policy {
> -    cpumask_var_t       cpus;          /* affected CPUs */
> -    unsigned int        shared_type;   /* ANY or ALL affected CPUs
> -                                          should set cpufreq */
> -    unsigned int        cpu;           /* cpu nr of registered CPU */
> -    struct cpufreq_cpuinfo    cpuinfo;
> -
> -    unsigned int        min;    /* in kHz */
> -    unsigned int        max;    /* in kHz */
> -    unsigned int        cur;    /* in kHz, only needed if cpufreq
> -                                 * governors are used */
> -    struct perf_limits  limits;
> -    struct cpufreq_governor     *governor;
> -
> -    bool_t              resume; /* flag for cpufreq 1st run
> -                                 * S3 wakeup, hotplug cpu, etc */
> -    s8                  turbo;  /* tristate flag: 0 for unsupported
> -                                 * -1 for disable, 1 for enabled
> -                                 * See CPUFREQ_TURBO_* below for defines */
> -    bool_t              aperf_mperf; /* CPU has APERF/MPERF MSRs */
> -};
> -DECLARE_PER_CPU(struct cpufreq_policy *, cpufreq_cpu_policy);
> -
> -extern int __cpufreq_set_policy(struct cpufreq_policy *data,
> -                                struct cpufreq_policy *policy);
> -
> -#define CPUFREQ_SHARED_TYPE_NONE (0) /* None */
> -#define CPUFREQ_SHARED_TYPE_HW   (1) /* HW does needed coordination */
> -#define CPUFREQ_SHARED_TYPE_ALL  (2) /* All dependent CPUs should set freq */
> -#define CPUFREQ_SHARED_TYPE_ANY  (3) /* Freq can be set from any dependent CPU*/
> -
> -/******************** cpufreq transition notifiers *******************/
> -
> -struct cpufreq_freqs {
> -    unsigned int cpu;    /* cpu nr */
> -    unsigned int old;
> -    unsigned int new;
> -    u8 flags;            /* flags of cpufreq_driver, see below. */
> -};
> -
> -
> -/*********************************************************************
> - *                          CPUFREQ GOVERNORS                        *
> - *********************************************************************/
> -
> -#define CPUFREQ_GOV_START  1
> -#define CPUFREQ_GOV_STOP   2
> -#define CPUFREQ_GOV_LIMITS 3
> -
> -struct cpufreq_governor {
> -    char    name[CPUFREQ_NAME_LEN];
> -    int     (*governor)(struct cpufreq_policy *policy,
> -                        unsigned int event);
> -    bool_t  (*handle_option)(const char *name, const char *value);
> -    struct list_head governor_list;
> -};
> -
> -extern struct cpufreq_governor *cpufreq_opt_governor;
> -extern struct cpufreq_governor cpufreq_gov_dbs;
> -extern struct cpufreq_governor cpufreq_gov_userspace;
> -extern struct cpufreq_governor cpufreq_gov_performance;
> -extern struct cpufreq_governor cpufreq_gov_powersave;
> -
> -extern struct list_head cpufreq_governor_list;
> -
> -extern int cpufreq_register_governor(struct cpufreq_governor *governor);
> -extern struct cpufreq_governor *__find_governor(const char *governor);
> -#define CPUFREQ_DEFAULT_GOVERNOR &cpufreq_gov_dbs
> -
> -/* pass a target to the cpufreq driver */
> -extern int __cpufreq_driver_target(struct cpufreq_policy *policy,
> -                                   unsigned int target_freq,
> -                                   unsigned int relation);
> -
> -#define GOV_GETAVG     1
> -#define USR_GETAVG     2
> -extern int cpufreq_driver_getavg(unsigned int cpu, unsigned int flag);
> -
> -#define CPUFREQ_TURBO_DISABLED      -1
> -#define CPUFREQ_TURBO_UNSUPPORTED   0
> -#define CPUFREQ_TURBO_ENABLED       1
> -
> -extern int cpufreq_update_turbo(int cpuid, int new_state);
> -extern int cpufreq_get_turbo_status(int cpuid);
> -
> -static __inline__ int 
> -__cpufreq_governor(struct cpufreq_policy *policy, unsigned int event)
> -{
> -    return policy->governor->governor(policy, event);
> -}
> -
> -
> -/*********************************************************************
> - *                      CPUFREQ DRIVER INTERFACE                     *
> - *********************************************************************/
> -
> -#define CPUFREQ_RELATION_L 0  /* lowest frequency at or above target */
> -#define CPUFREQ_RELATION_H 1  /* highest frequency below or at target */
> -
> -struct cpufreq_driver {
> -    char   name[CPUFREQ_NAME_LEN];
> -    int    (*init)(struct cpufreq_policy *policy);
> -    int    (*verify)(struct cpufreq_policy *policy);
> -    int    (*setpolicy)(struct cpufreq_policy *policy);
> -    int    (*update)(int cpuid, struct cpufreq_policy *policy);
> -    int    (*target)(struct cpufreq_policy *policy,
> -                     unsigned int target_freq,
> -                     unsigned int relation);
> -    unsigned int    (*get)(unsigned int cpu);
> -    unsigned int    (*getavg)(unsigned int cpu, unsigned int flag);
> -    int    (*exit)(struct cpufreq_policy *policy);
> -};
> -
> -extern struct cpufreq_driver *cpufreq_driver;
> -
> -int cpufreq_register_driver(struct cpufreq_driver *);
> -
> -static __inline__
> -void cpufreq_verify_within_limits(struct cpufreq_policy *policy,
> -                                  unsigned int min, unsigned int max)
> -{
> -    if (policy->min < min)
> -        policy->min = min;
> -    if (policy->max < min)
> -        policy->max = min;
> -    if (policy->min > max)
> -        policy->min = max;
> -    if (policy->max > max)
> -        policy->max = max;
> -    if (policy->min > policy->max)
> -        policy->min = policy->max;
> -    return;
> -}
> -
> -
> -/*********************************************************************
> - *                     FREQUENCY TABLE HELPERS                       *
> - *********************************************************************/
> -
> -#define CPUFREQ_ENTRY_INVALID ~0
> -#define CPUFREQ_TABLE_END     ~1
> -
> -struct cpufreq_frequency_table {
> -    unsigned int    index;     /* any */
> -    unsigned int    frequency; /* kHz - doesn't need to be in ascending
> -                                * order */
> -};
> -
> -int cpufreq_frequency_table_cpuinfo(struct cpufreq_policy *policy,
> -                   struct cpufreq_frequency_table *table);
> -
> -int cpufreq_frequency_table_verify(struct cpufreq_policy *policy,
> -                   struct cpufreq_frequency_table *table);
> -
> -int cpufreq_frequency_table_target(struct cpufreq_policy *policy,
> -                   struct cpufreq_frequency_table *table,
> -                   unsigned int target_freq,
> -                   unsigned int relation,
> -                   unsigned int *index);
> -
> -
> -/*********************************************************************
> - *                     UNIFIED DEBUG HELPERS                         *
> - *********************************************************************/
> -
> -struct cpu_dbs_info_s {
> -    uint64_t prev_cpu_idle;
> -    uint64_t prev_cpu_wall;
> -    struct cpufreq_policy *cur_policy;
> -    struct cpufreq_frequency_table *freq_table;
> -    int cpu;
> -    unsigned int enable:1;
> -    unsigned int stoppable:1;
> -    unsigned int turbo_enabled:1;
> -};
> -
> -int cpufreq_governor_dbs(struct cpufreq_policy *policy, unsigned int event);
> -int get_cpufreq_ondemand_para(uint32_t *sampling_rate_max,
> -                              uint32_t *sampling_rate_min,
> -                              uint32_t *sampling_rate,
> -                              uint32_t *up_threshold);
> -int write_ondemand_sampling_rate(unsigned int sampling_rate);
> -int write_ondemand_up_threshold(unsigned int up_threshold);
> -
> -int write_userspace_scaling_setspeed(unsigned int cpu, unsigned int freq);
> -
> -void cpufreq_dbs_timer_suspend(void);
> -void cpufreq_dbs_timer_resume(void);
> -
> -#endif /* __XEN_CPUFREQ_PM_H__ */
> diff --git a/xen/include/xen/cpufreq.h b/xen/include/xen/cpufreq.h
> new file mode 100644
> index 0000000..ed38a6c
> --- /dev/null
> +++ b/xen/include/xen/cpufreq.h
> @@ -0,0 +1,248 @@
> +/*
> + *  xen/include/acpi/cpufreq/cpufreq.h
> + *
> + *  Copyright (C) 2001 Russell King
> + *            (C) 2002 - 2003 Dominik Brodowski <linux@brodo.de>
> + *
> + * $Id: cpufreq.h,v 1.36 2003/01/20 17:31:48 db Exp $
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#ifndef __XEN_CPUFREQ_PM_H__
> +#define __XEN_CPUFREQ_PM_H__
> +
> +#include <xen/types.h>
> +#include <xen/list.h>
> +#include <xen/percpu.h>
> +#include <xen/spinlock.h>
> +#include <xen/errno.h>
> +#include <xen/cpumask.h>
> +
> +#include <acpi/cpufreq/processor_perf.h>
> +
> +DECLARE_PER_CPU(spinlock_t, cpufreq_statistic_lock);
> +
> +extern bool_t cpufreq_verbose;
> +
> +struct cpufreq_governor;
> +
> +struct acpi_cpufreq_data {
> +    struct processor_performance *acpi_data;
> +    struct cpufreq_frequency_table *freq_table;
> +    unsigned int arch_cpu_flags;
> +};
> +
> +extern struct acpi_cpufreq_data *cpufreq_drv_data[NR_CPUS];
> +
> +struct cpufreq_cpuinfo {
> +    unsigned int        max_freq;
> +    unsigned int        second_max_freq;    /* P1 if Turbo Mode is on */
> +    unsigned int        min_freq;
> +    unsigned int        transition_latency; /* in 10^(-9) s = nanoseconds */
> +};
> +
> +struct perf_limits {
> +    bool_t no_turbo;
> +    bool_t turbo_disabled;
> +    uint32_t turbo_pct;
> +    uint32_t max_perf_pct; /* max performance in percentage */
> +    uint32_t min_perf_pct; /* min performance in percentage */
> +    uint32_t max_perf;
> +    uint32_t min_perf;
> +    uint32_t max_policy_pct;
> +    uint32_t min_policy_pct;
> +};
> +
> +struct cpufreq_policy {
> +    cpumask_var_t       cpus;          /* affected CPUs */
> +    unsigned int        shared_type;   /* ANY or ALL affected CPUs
> +                                          should set cpufreq */
> +    unsigned int        cpu;           /* cpu nr of registered CPU */
> +    struct cpufreq_cpuinfo    cpuinfo;
> +
> +    unsigned int        min;    /* in kHz */
> +    unsigned int        max;    /* in kHz */
> +    unsigned int        cur;    /* in kHz, only needed if cpufreq
> +                                 * governors are used */
> +    struct perf_limits  limits;
> +    struct cpufreq_governor     *governor;
> +
> +    bool_t              resume; /* flag for cpufreq 1st run
> +                                 * S3 wakeup, hotplug cpu, etc */
> +    s8                  turbo;  /* tristate flag: 0 for unsupported
> +                                 * -1 for disable, 1 for enabled
> +                                 * See CPUFREQ_TURBO_* below for defines */
> +    bool_t              aperf_mperf; /* CPU has APERF/MPERF MSRs */
> +};
> +DECLARE_PER_CPU(struct cpufreq_policy *, cpufreq_cpu_policy);
> +
> +extern int __cpufreq_set_policy(struct cpufreq_policy *data,
> +                                struct cpufreq_policy *policy);
> +
> +#define CPUFREQ_SHARED_TYPE_NONE (0) /* None */
> +#define CPUFREQ_SHARED_TYPE_HW   (1) /* HW does needed coordination */
> +#define CPUFREQ_SHARED_TYPE_ALL  (2) /* All dependent CPUs should set freq */
> +#define CPUFREQ_SHARED_TYPE_ANY  (3) /* Freq can be set from any dependent CPU*/
> +
> +/******************** cpufreq transition notifiers *******************/
> +
> +struct cpufreq_freqs {
> +    unsigned int cpu;    /* cpu nr */
> +    unsigned int old;
> +    unsigned int new;
> +    u8 flags;            /* flags of cpufreq_driver, see below. */
> +};
> +
> +
> +/*********************************************************************
> + *                          CPUFREQ GOVERNORS                        *
> + *********************************************************************/
> +
> +#define CPUFREQ_GOV_START  1
> +#define CPUFREQ_GOV_STOP   2
> +#define CPUFREQ_GOV_LIMITS 3
> +
> +struct cpufreq_governor {
> +    char    name[CPUFREQ_NAME_LEN];
> +    int     (*governor)(struct cpufreq_policy *policy,
> +                        unsigned int event);
> +    bool_t  (*handle_option)(const char *name, const char *value);
> +    struct list_head governor_list;
> +};
> +
> +extern struct cpufreq_governor *cpufreq_opt_governor;
> +extern struct cpufreq_governor cpufreq_gov_dbs;
> +extern struct cpufreq_governor cpufreq_gov_userspace;
> +extern struct cpufreq_governor cpufreq_gov_performance;
> +extern struct cpufreq_governor cpufreq_gov_powersave;
> +
> +extern struct list_head cpufreq_governor_list;
> +
> +extern int cpufreq_register_governor(struct cpufreq_governor *governor);
> +extern struct cpufreq_governor *__find_governor(const char *governor);
> +#define CPUFREQ_DEFAULT_GOVERNOR &cpufreq_gov_dbs
> +
> +/* pass a target to the cpufreq driver */
> +extern int __cpufreq_driver_target(struct cpufreq_policy *policy,
> +                                   unsigned int target_freq,
> +                                   unsigned int relation);
> +
> +#define GOV_GETAVG     1
> +#define USR_GETAVG     2
> +extern int cpufreq_driver_getavg(unsigned int cpu, unsigned int flag);
> +
> +#define CPUFREQ_TURBO_DISABLED      -1
> +#define CPUFREQ_TURBO_UNSUPPORTED   0
> +#define CPUFREQ_TURBO_ENABLED       1
> +
> +extern int cpufreq_update_turbo(int cpuid, int new_state);
> +extern int cpufreq_get_turbo_status(int cpuid);
> +
> +static __inline__ int 
> +__cpufreq_governor(struct cpufreq_policy *policy, unsigned int event)
> +{
> +    return policy->governor->governor(policy, event);
> +}
> +
> +
> +/*********************************************************************
> + *                      CPUFREQ DRIVER INTERFACE                     *
> + *********************************************************************/
> +
> +#define CPUFREQ_RELATION_L 0  /* lowest frequency at or above target */
> +#define CPUFREQ_RELATION_H 1  /* highest frequency below or at target */
> +
> +struct cpufreq_driver {
> +    char   name[CPUFREQ_NAME_LEN];
> +    int    (*init)(struct cpufreq_policy *policy);
> +    int    (*verify)(struct cpufreq_policy *policy);
> +    int    (*setpolicy)(struct cpufreq_policy *policy);
> +    int    (*update)(int cpuid, struct cpufreq_policy *policy);
> +    int    (*target)(struct cpufreq_policy *policy,
> +                     unsigned int target_freq,
> +                     unsigned int relation);
> +    unsigned int    (*get)(unsigned int cpu);
> +    unsigned int    (*getavg)(unsigned int cpu, unsigned int flag);
> +    int    (*exit)(struct cpufreq_policy *policy);
> +};
> +
> +extern struct cpufreq_driver *cpufreq_driver;
> +
> +int cpufreq_register_driver(struct cpufreq_driver *);
> +
> +static __inline__
> +void cpufreq_verify_within_limits(struct cpufreq_policy *policy,
> +                                  unsigned int min, unsigned int max)
> +{
> +    if (policy->min < min)
> +        policy->min = min;
> +    if (policy->max < min)
> +        policy->max = min;
> +    if (policy->min > max)
> +        policy->min = max;
> +    if (policy->max > max)
> +        policy->max = max;
> +    if (policy->min > policy->max)
> +        policy->min = policy->max;
> +    return;
> +}
> +
> +
> +/*********************************************************************
> + *                     FREQUENCY TABLE HELPERS                       *
> + *********************************************************************/
> +
> +#define CPUFREQ_ENTRY_INVALID ~0
> +#define CPUFREQ_TABLE_END     ~1
> +
> +struct cpufreq_frequency_table {
> +    unsigned int    index;     /* any */
> +    unsigned int    frequency; /* kHz - doesn't need to be in ascending
> +                                * order */
> +};
> +
> +int cpufreq_frequency_table_cpuinfo(struct cpufreq_policy *policy,
> +                   struct cpufreq_frequency_table *table);
> +
> +int cpufreq_frequency_table_verify(struct cpufreq_policy *policy,
> +                   struct cpufreq_frequency_table *table);
> +
> +int cpufreq_frequency_table_target(struct cpufreq_policy *policy,
> +                   struct cpufreq_frequency_table *table,
> +                   unsigned int target_freq,
> +                   unsigned int relation,
> +                   unsigned int *index);
> +
> +
> +/*********************************************************************
> + *                     UNIFIED DEBUG HELPERS                         *
> + *********************************************************************/
> +
> +struct cpu_dbs_info_s {
> +    uint64_t prev_cpu_idle;
> +    uint64_t prev_cpu_wall;
> +    struct cpufreq_policy *cur_policy;
> +    struct cpufreq_frequency_table *freq_table;
> +    int cpu;
> +    unsigned int enable:1;
> +    unsigned int stoppable:1;
> +    unsigned int turbo_enabled:1;
> +};
> +
> +int cpufreq_governor_dbs(struct cpufreq_policy *policy, unsigned int event);
> +int get_cpufreq_ondemand_para(uint32_t *sampling_rate_max,
> +                              uint32_t *sampling_rate_min,
> +                              uint32_t *sampling_rate,
> +                              uint32_t *up_threshold);
> +int write_ondemand_sampling_rate(unsigned int sampling_rate);
> +int write_ondemand_up_threshold(unsigned int up_threshold);
> +
> +int write_userspace_scaling_setspeed(unsigned int cpu, unsigned int freq);
> +
> +void cpufreq_dbs_timer_suspend(void);
> +void cpufreq_dbs_timer_resume(void);
> +
> +#endif /* __XEN_CPUFREQ_PM_H__ */
> -- 
> 2.7.4
> 
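For readers following the header above, the meaning of the two CPUFREQ_RELATION_* values can be illustrated with a small standalone lookup over a table shaped like struct cpufreq_frequency_table. This is only a sketch under stated assumptions, not the Xen implementation of cpufreq_frequency_table_target(): the names freq_entry, pick_entry, FREQ_ENTRY_INVALID, FREQ_TABLE_END, RELATION_L and RELATION_H are hypothetical local stand-ins, and the policy min/max limits and any fallback handling are deliberately left out.

```c
#include <assert.h>

/* Local stand-ins for CPUFREQ_ENTRY_INVALID / CPUFREQ_TABLE_END (assumed values). */
#define FREQ_ENTRY_INVALID (~0u)
#define FREQ_TABLE_END     (~1u)

/* Local stand-ins for CPUFREQ_RELATION_L / CPUFREQ_RELATION_H. */
#define RELATION_L 0 /* lowest frequency at or above target */
#define RELATION_H 1 /* highest frequency below or at target */

/* Hypothetical mirror of struct cpufreq_frequency_table. */
struct freq_entry {
    unsigned int index;     /* any */
    unsigned int frequency; /* kHz, not necessarily in ascending order */
};

/*
 * Scan the whole table (it may be unsorted) and store in *pos the array
 * position of the best match for the requested relation.
 * Returns 0 on success, -1 if no valid entry satisfies the relation.
 */
static int pick_entry(const struct freq_entry *table, unsigned int target,
                      unsigned int relation, unsigned int *pos)
{
    unsigned int best = 0;
    int found = 0;

    assert(table && pos);

    for (unsigned int i = 0; table[i].frequency != FREQ_TABLE_END; i++) {
        unsigned int f = table[i].frequency;

        if (f == FREQ_ENTRY_INVALID)
            continue; /* skip holes in the table */

        if (relation == RELATION_L) {
            if (f < target)
                continue;
            if (!found || f < best) {
                best = f;
                *pos = i;
                found = 1;
            }
        } else {
            if (f > target)
                continue;
            if (!found || f > best) {
                best = f;
                *pos = i;
                found = 1;
            }
        }
    }

    return found ? 0 : -1;
}
```

With a table holding 1500000, 500000, an invalid entry and 1000000 kHz, a target of 800000 kHz selects 1000000 for RELATION_L and 500000 for RELATION_H. A target above every entry finds nothing for RELATION_L in this sketch.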

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

* Re: [RFC PATCH 02/31] pm: move processor_perf.h file to the xen/include/xen location
  2017-11-09 17:09 ` [RFC PATCH 02/31] pm: move processor_perf.h " Oleksandr Tyshchenko
@ 2017-12-02  0:41   ` Stefano Stabellini
  0 siblings, 0 replies; 108+ messages in thread
From: Stefano Stabellini @ 2017-12-02  0:41 UTC (permalink / raw)
  To: Oleksandr Tyshchenko
  Cc: Stefano Stabellini, Andrew Cooper, Oleksandr Dmytryshyn,
	Julien Grall, Oleksandr Tyshchenko, Jan Beulich, xen-devel

On Thu, 9 Nov 2017, Oleksandr Tyshchenko wrote:
> From: Oleksandr Dmytryshyn <oleksandr.dmytryshyn@globallogic.com>
> 
> The cpufreq driver should be generic rather than ACPI-specific,
> so this file should be moved to a more suitable location.
> 
> This is a rebased version of the original patch:
> https://lists.xen.org/archives/html/xen-devel/2014-11/msg00934.html
> 
> Signed-off-by: Oleksandr Dmytryshyn <oleksandr.dmytryshyn@globallogic.com>
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> CC: Jan Beulich <jbeulich@suse.com>
> CC: Andrew Cooper <andrew.cooper3@citrix.com>
> CC: Stefano Stabellini <sstabellini@kernel.org>
> CC: Julien Grall <julien.grall@linaro.org>

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>


> ---
>  MAINTAINERS                               |  2 +-
>  xen/arch/x86/platform_hypercall.c         |  2 +-
>  xen/include/acpi/cpufreq/processor_perf.h | 63 -------------------------------
>  xen/include/xen/cpufreq.h                 |  2 +-
>  xen/include/xen/processor_perf.h          | 63 +++++++++++++++++++++++++++++++
>  5 files changed, 66 insertions(+), 66 deletions(-)
>  delete mode 100644 xen/include/acpi/cpufreq/processor_perf.h
>  create mode 100644 xen/include/xen/processor_perf.h
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 524e067..9794a81 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -294,8 +294,8 @@ F:	xen/arch/x86/acpi/
>  X:	xen/arch/x86/acpi/boot.c
>  X:	xen/arch/x86/acpi/lib.c
>  F:	xen/drivers/cpufreq/
> -F:	xen/include/acpi/cpufreq/
>  F:	xen/include/xen/cpufreq.h
> +F:	xen/include/xen/processor_perf.h
>  
>  PUBLIC I/O INTERFACES AND PV DRIVERS DESIGNS
>  M:  Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> diff --git a/xen/arch/x86/platform_hypercall.c b/xen/arch/x86/platform_hypercall.c
> index ebc2f39..17c8304 100644
> --- a/xen/arch/x86/platform_hypercall.c
> +++ b/xen/arch/x86/platform_hypercall.c
> @@ -25,7 +25,7 @@
>  #include <xen/symbols.h>
>  #include <asm/current.h>
>  #include <public/platform.h>
> -#include <acpi/cpufreq/processor_perf.h>
> +#include <xen/processor_perf.h>
>  #include <asm/edd.h>
>  #include <asm/mtrr.h>
>  #include <asm/io_apic.h>
> diff --git a/xen/include/acpi/cpufreq/processor_perf.h b/xen/include/acpi/cpufreq/processor_perf.h
> deleted file mode 100644
> index d8a1ba6..0000000
> --- a/xen/include/acpi/cpufreq/processor_perf.h
> +++ /dev/null
> @@ -1,63 +0,0 @@
> -#ifndef __XEN_PROCESSOR_PM_H__
> -#define __XEN_PROCESSOR_PM_H__
> -
> -#include <public/platform.h>
> -#include <public/sysctl.h>
> -#include <xen/acpi.h>
> -
> -#define XEN_PX_INIT 0x80000000
> -
> -int powernow_cpufreq_init(void);
> -unsigned int powernow_register_driver(void);
> -unsigned int get_measured_perf(unsigned int cpu, unsigned int flag);
> -void cpufreq_residency_update(unsigned int, uint8_t);
> -void cpufreq_statistic_update(unsigned int, uint8_t, uint8_t);
> -int  cpufreq_statistic_init(unsigned int);
> -void cpufreq_statistic_exit(unsigned int);
> -void cpufreq_statistic_reset(unsigned int);
> -
> -int  cpufreq_limit_change(unsigned int);
> -
> -int  cpufreq_add_cpu(unsigned int);
> -int  cpufreq_del_cpu(unsigned int);
> -
> -struct processor_performance {
> -    uint32_t state;
> -    uint32_t platform_limit;
> -    struct xen_pct_register control_register;
> -    struct xen_pct_register status_register;
> -    uint32_t state_count;
> -    struct xen_processor_px *states;
> -    struct xen_psd_package domain_info;
> -    uint32_t shared_type;
> -
> -    uint32_t init;
> -};
> -
> -struct processor_pminfo {
> -    uint32_t acpi_id;
> -    uint32_t id;
> -    struct processor_performance    perf;
> -};
> -
> -extern struct processor_pminfo *processor_pminfo[NR_CPUS];
> -
> -struct px_stat {
> -    uint8_t total;        /* total Px states */
> -    uint8_t usable;       /* usable Px states */
> -    uint8_t last;         /* last Px state */
> -    uint8_t cur;          /* current Px state */
> -    uint64_t *trans_pt;   /* Px transition table */
> -    pm_px_val_t *pt;
> -};
> -
> -struct pm_px {
> -    struct px_stat u;
> -    uint64_t prev_state_wall;
> -    uint64_t prev_idle_wall;
> -};
> -
> -DECLARE_PER_CPU(struct pm_px *, cpufreq_statistic_data);
> -
> -int cpufreq_cpu_init(unsigned int cpuid);
> -#endif /* __XEN_PROCESSOR_PM_H__ */
> diff --git a/xen/include/xen/cpufreq.h b/xen/include/xen/cpufreq.h
> index ed38a6c..30c70c9 100644
> --- a/xen/include/xen/cpufreq.h
> +++ b/xen/include/xen/cpufreq.h
> @@ -21,7 +21,7 @@
>  #include <xen/errno.h>
>  #include <xen/cpumask.h>
>  
> -#include <acpi/cpufreq/processor_perf.h>
> +#include <xen/processor_perf.h>
>  
>  DECLARE_PER_CPU(spinlock_t, cpufreq_statistic_lock);
>  
> diff --git a/xen/include/xen/processor_perf.h b/xen/include/xen/processor_perf.h
> new file mode 100644
> index 0000000..d8a1ba6
> --- /dev/null
> +++ b/xen/include/xen/processor_perf.h
> @@ -0,0 +1,63 @@
> +#ifndef __XEN_PROCESSOR_PM_H__
> +#define __XEN_PROCESSOR_PM_H__
> +
> +#include <public/platform.h>
> +#include <public/sysctl.h>
> +#include <xen/acpi.h>
> +
> +#define XEN_PX_INIT 0x80000000
> +
> +int powernow_cpufreq_init(void);
> +unsigned int powernow_register_driver(void);
> +unsigned int get_measured_perf(unsigned int cpu, unsigned int flag);
> +void cpufreq_residency_update(unsigned int, uint8_t);
> +void cpufreq_statistic_update(unsigned int, uint8_t, uint8_t);
> +int  cpufreq_statistic_init(unsigned int);
> +void cpufreq_statistic_exit(unsigned int);
> +void cpufreq_statistic_reset(unsigned int);
> +
> +int  cpufreq_limit_change(unsigned int);
> +
> +int  cpufreq_add_cpu(unsigned int);
> +int  cpufreq_del_cpu(unsigned int);
> +
> +struct processor_performance {
> +    uint32_t state;
> +    uint32_t platform_limit;
> +    struct xen_pct_register control_register;
> +    struct xen_pct_register status_register;
> +    uint32_t state_count;
> +    struct xen_processor_px *states;
> +    struct xen_psd_package domain_info;
> +    uint32_t shared_type;
> +
> +    uint32_t init;
> +};
> +
> +struct processor_pminfo {
> +    uint32_t acpi_id;
> +    uint32_t id;
> +    struct processor_performance    perf;
> +};
> +
> +extern struct processor_pminfo *processor_pminfo[NR_CPUS];
> +
> +struct px_stat {
> +    uint8_t total;        /* total Px states */
> +    uint8_t usable;       /* usable Px states */
> +    uint8_t last;         /* last Px state */
> +    uint8_t cur;          /* current Px state */
> +    uint64_t *trans_pt;   /* Px transition table */
> +    pm_px_val_t *pt;
> +};
> +
> +struct pm_px {
> +    struct px_stat u;
> +    uint64_t prev_state_wall;
> +    uint64_t prev_idle_wall;
> +};
> +
> +DECLARE_PER_CPU(struct pm_px *, cpufreq_statistic_data);
> +
> +int cpufreq_cpu_init(unsigned int cpuid);
> +#endif /* __XEN_PROCESSOR_PM_H__ */
> -- 
> 2.7.4
> 
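As a side note on struct px_stat: trans_pt is a flat uint64_t array, and pmstat.c copies state_count * state_count entries of it to the guest. The sketch below shows one way such a flat table can be addressed. It is a standalone illustration, not Xen code: struct px_trans and the px_trans_* helpers are hypothetical names, and the row-major layout (row = old state, column = new state) is an assumption made for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical container for a count x count transition counter table. */
struct px_trans {
    unsigned int count; /* number of Px states */
    uint64_t *table;    /* count * count counters, assumed row-major */
};

/* Allocate a zeroed table; returns 0 on success, -1 on failure. */
static int px_trans_init(struct px_trans *t, unsigned int count)
{
    if (count == 0)
        return -1;

    t->table = calloc((size_t)count * count, sizeof(*t->table));
    if (!t->table)
        return -1;

    t->count = count;
    return 0;
}

/* Count one transition from state 'from' to state 'to'. */
static void px_trans_record(struct px_trans *t,
                            unsigned int from, unsigned int to)
{
    assert(from < t->count && to < t->count);
    t->table[(size_t)from * t->count + to]++;
}

/* Read the number of recorded transitions from 'from' to 'to'. */
static uint64_t px_trans_get(const struct px_trans *t,
                             unsigned int from, unsigned int to)
{
    assert(from < t->count && to < t->count);
    return t->table[(size_t)from * t->count + to];
}

/* Release the table. */
static void px_trans_free(struct px_trans *t)
{
    free(t->table);
    t->table = NULL;
    t->count = 0;
}
```

Keeping the counters in one contiguous array means the whole table can be handed over with a single copy of count * count elements, which matches the ct*ct copy in pmstat.c.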

* Re: [RFC PATCH 03/31] pmstat: move pmstat.c file to the xen/drivers/pm/stat.c location
  2017-11-09 17:09 ` [RFC PATCH 03/31] pmstat: move pmstat.c file to the xen/drivers/pm/stat.c location Oleksandr Tyshchenko
@ 2017-12-02  0:47   ` Stefano Stabellini
  2018-05-07 15:36   ` Jan Beulich
  1 sibling, 0 replies; 108+ messages in thread
From: Stefano Stabellini @ 2017-12-02  0:47 UTC (permalink / raw)
  To: Oleksandr Tyshchenko
  Cc: Stefano Stabellini, Andrew Cooper, Oleksandr Dmytryshyn,
	Julien Grall, Oleksandr Tyshchenko, Jan Beulich, xen-devel

On Thu, 9 Nov 2017, Oleksandr Tyshchenko wrote:
> From: Oleksandr Dmytryshyn <oleksandr.dmytryshyn@globallogic.com>
> 
> The cpufreq driver should be generic rather than ACPI-specific,
> so this file should be moved to a more suitable location.
> 
> This is a rebased version of the original patch:
> https://lists.xen.org/archives/html/xen-devel/2014-11/msg00935.html
> 
> Signed-off-by: Oleksandr Dmytryshyn <oleksandr.dmytryshyn@globallogic.com>
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> CC: Jan Beulich <jbeulich@suse.com>
> CC: Andrew Cooper <andrew.cooper3@citrix.com>
> CC: Stefano Stabellini <sstabellini@kernel.org>
> CC: Julien Grall <julien.grall@linaro.org>

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>


> ---
>  MAINTAINERS               |   1 +
>  xen/arch/x86/Kconfig      |   1 +
>  xen/common/sysctl.c       |   2 +-
>  xen/drivers/Kconfig       |   2 +
>  xen/drivers/Makefile      |   1 +
>  xen/drivers/acpi/Makefile |   1 -
>  xen/drivers/acpi/pmstat.c | 526 ----------------------------------------------
>  xen/drivers/pm/Kconfig    |   3 +
>  xen/drivers/pm/Makefile   |   1 +
>  xen/drivers/pm/stat.c     | 526 ++++++++++++++++++++++++++++++++++++++++++++++
>  10 files changed, 536 insertions(+), 528 deletions(-)
>  delete mode 100644 xen/drivers/acpi/pmstat.c
>  create mode 100644 xen/drivers/pm/Kconfig
>  create mode 100644 xen/drivers/pm/Makefile
>  create mode 100644 xen/drivers/pm/stat.c
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 9794a81..87ade6f 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -294,6 +294,7 @@ F:	xen/arch/x86/acpi/
>  X:	xen/arch/x86/acpi/boot.c
>  X:	xen/arch/x86/acpi/lib.c
>  F:	xen/drivers/cpufreq/
> +F:	xen/drivers/pm/
>  F:	xen/include/xen/cpufreq.h
>  F:	xen/include/xen/processor_perf.h
>  
> diff --git a/xen/arch/x86/Kconfig b/xen/arch/x86/Kconfig
> index 30c2769..86c8eca 100644
> --- a/xen/arch/x86/Kconfig
> +++ b/xen/arch/x86/Kconfig
> @@ -23,6 +23,7 @@ config X86
>  	select HAS_PDX
>  	select NUMA
>  	select VGA
> +	select HAS_PM
>  
>  config ARCH_DEFCONFIG
>  	string
> diff --git a/xen/common/sysctl.c b/xen/common/sysctl.c
> index a6882d1..ac96347 100644
> --- a/xen/common/sysctl.c
> +++ b/xen/common/sysctl.c
> @@ -171,7 +171,7 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
>          op->u.availheap.avail_bytes <<= PAGE_SHIFT;
>          break;
>  
> -#if defined (CONFIG_ACPI) && defined (CONFIG_HAS_CPUFREQ)
> +#if defined (CONFIG_HAS_PM) && defined (CONFIG_HAS_CPUFREQ)
>      case XEN_SYSCTL_get_pmstat:
>          ret = do_get_pm_info(&op->u.get_pmstat);
>          break;
> diff --git a/xen/drivers/Kconfig b/xen/drivers/Kconfig
> index bc3a54f..ddaec11 100644
> --- a/xen/drivers/Kconfig
> +++ b/xen/drivers/Kconfig
> @@ -12,4 +12,6 @@ source "drivers/pci/Kconfig"
>  
>  source "drivers/video/Kconfig"
>  
> +source "drivers/pm/Kconfig"
> +
>  endmenu
> diff --git a/xen/drivers/Makefile b/xen/drivers/Makefile
> index 1939180..dd0b496 100644
> --- a/xen/drivers/Makefile
> +++ b/xen/drivers/Makefile
> @@ -4,3 +4,4 @@ subdir-$(CONFIG_HAS_PCI) += pci
>  subdir-$(CONFIG_HAS_PASSTHROUGH) += passthrough
>  subdir-$(CONFIG_ACPI) += acpi
>  subdir-$(CONFIG_VIDEO) += video
> +subdir-$(CONFIG_HAS_PM) += pm
> diff --git a/xen/drivers/acpi/Makefile b/xen/drivers/acpi/Makefile
> index 444b11d..6f6470a 100644
> --- a/xen/drivers/acpi/Makefile
> +++ b/xen/drivers/acpi/Makefile
> @@ -5,7 +5,6 @@ subdir-$(CONFIG_X86) += apei
>  obj-bin-y += tables.init.o
>  obj-$(CONFIG_NUMA) += numa.o
>  obj-y += osl.o
> -obj-$(CONFIG_HAS_CPUFREQ) += pmstat.o
>  
>  obj-$(CONFIG_X86) += hwregs.o
>  obj-$(CONFIG_X86) += reboot.o
> diff --git a/xen/drivers/acpi/pmstat.c b/xen/drivers/acpi/pmstat.c
> deleted file mode 100644
> index 2dbde1c..0000000
> --- a/xen/drivers/acpi/pmstat.c
> +++ /dev/null
> @@ -1,526 +0,0 @@
> -/*****************************************************************************
> -#  pmstat.c - Power Management statistic information (Px/Cx/Tx, etc.)
> -#
> -#  Copyright (c) 2008, Liu Jinsong <jinsong.liu@intel.com>
> -#
> -# This program is free software; you can redistribute it and/or modify it 
> -# under the terms of the GNU General Public License as published by the Free 
> -# Software Foundation; either version 2 of the License, or (at your option) 
> -# any later version.
> -#
> -# This program is distributed in the hope that it will be useful, but WITHOUT 
> -# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or 
> -# FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for 
> -# more details.
> -#
> -# You should have received a copy of the GNU General Public License along with
> -# this program; If not, see <http://www.gnu.org/licenses/>.
> -#
> -# The full GNU General Public License is included in this distribution in the
> -# file called LICENSE.
> -#
> -*****************************************************************************/
> -
> -#include <xen/lib.h>
> -#include <xen/errno.h>
> -#include <xen/sched.h>
> -#include <xen/event.h>
> -#include <xen/irq.h>
> -#include <xen/iocap.h>
> -#include <xen/compat.h>
> -#include <xen/guest_access.h>
> -#include <asm/current.h>
> -#include <public/xen.h>
> -#include <xen/cpumask.h>
> -#include <asm/processor.h>
> -#include <xen/percpu.h>
> -#include <xen/domain.h>
> -#include <xen/acpi.h>
> -
> -#include <public/sysctl.h>
> -#include <xen/cpufreq.h>
> -#include <xen/pmstat.h>
> -
> -DEFINE_PER_CPU_READ_MOSTLY(struct pm_px *, cpufreq_statistic_data);
> -
> -/*
> - * Get PM statistic info
> - */
> -int do_get_pm_info(struct xen_sysctl_get_pmstat *op)
> -{
> -    int ret = 0;
> -    const struct processor_pminfo *pmpt;
> -
> -    if ( !op || (op->cpuid >= nr_cpu_ids) || !cpu_online(op->cpuid) )
> -        return -EINVAL;
> -    pmpt = processor_pminfo[op->cpuid];
> -
> -    switch ( op->type & PMSTAT_CATEGORY_MASK )
> -    {
> -    case PMSTAT_CX:
> -        if ( !(xen_processor_pmbits & XEN_PROCESSOR_PM_CX) )
> -            return -ENODEV;
> -        break;
> -    case PMSTAT_PX:
> -        if ( !(xen_processor_pmbits & XEN_PROCESSOR_PM_PX) )
> -            return -ENODEV;
> -        if ( !cpufreq_driver )
> -            return -ENODEV;
> -        if ( !pmpt || !(pmpt->perf.init & XEN_PX_INIT) )
> -            return -EINVAL;
> -        break;
> -    default:
> -        return -ENODEV;
> -    }
> -
> -    switch ( op->type )
> -    {
> -    case PMSTAT_get_max_px:
> -    {
> -        op->u.getpx.total = pmpt->perf.state_count;
> -        break;
> -    }
> -
> -    case PMSTAT_get_pxstat:
> -    {
> -        uint32_t ct;
> -        struct pm_px *pxpt;
> -        spinlock_t *cpufreq_statistic_lock = 
> -                   &per_cpu(cpufreq_statistic_lock, op->cpuid);
> -
> -        spin_lock(cpufreq_statistic_lock);
> -
> -        pxpt = per_cpu(cpufreq_statistic_data, op->cpuid);
> -        if ( !pxpt || !pxpt->u.pt || !pxpt->u.trans_pt )
> -        {
> -            spin_unlock(cpufreq_statistic_lock);
> -            return -ENODATA;
> -        }
> -
> -        pxpt->u.usable = pmpt->perf.state_count - pmpt->perf.platform_limit;
> -
> -        cpufreq_residency_update(op->cpuid, pxpt->u.cur);
> -
> -        ct = pmpt->perf.state_count;
> -        if ( copy_to_guest(op->u.getpx.trans_pt, pxpt->u.trans_pt, ct*ct) )
> -        {
> -            spin_unlock(cpufreq_statistic_lock);
> -            ret = -EFAULT;
> -            break;
> -        }
> -
> -        if ( copy_to_guest(op->u.getpx.pt, pxpt->u.pt, ct) )
> -        {
> -            spin_unlock(cpufreq_statistic_lock);
> -            ret = -EFAULT;
> -            break;
> -        }
> -
> -        op->u.getpx.total = pxpt->u.total;
> -        op->u.getpx.usable = pxpt->u.usable;
> -        op->u.getpx.last = pxpt->u.last;
> -        op->u.getpx.cur = pxpt->u.cur;
> -
> -        spin_unlock(cpufreq_statistic_lock);
> -
> -        break;
> -    }
> -
> -    case PMSTAT_reset_pxstat:
> -    {
> -        cpufreq_statistic_reset(op->cpuid);
> -        break;
> -    }
> -
> -    case PMSTAT_get_max_cx:
> -    {
> -        op->u.getcx.nr = pmstat_get_cx_nr(op->cpuid);
> -        ret = 0;
> -        break;
> -    }
> -
> -    case PMSTAT_get_cxstat:
> -    {
> -        ret = pmstat_get_cx_stat(op->cpuid, &op->u.getcx);
> -        break;
> -    }
> -
> -    case PMSTAT_reset_cxstat:
> -    {
> -        ret = pmstat_reset_cx_stat(op->cpuid);
> -        break;
> -    }
> -
> -    default:
> -        printk("not defined sub-hypercall @ do_get_pm_info\n");
> -        ret = -ENOSYS;
> -        break;
> -    }
> -
> -    return ret;
> -}
> -
> -/*
> - * 1. Get PM parameter
> - * 2. Provide user PM control
> - */
> -static int read_scaling_available_governors(char *scaling_available_governors,
> -                                            unsigned int size)
> -{
> -    unsigned int i = 0;
> -    struct cpufreq_governor *t;
> -
> -    if ( !scaling_available_governors )
> -        return -EINVAL;
> -
> -    list_for_each_entry(t, &cpufreq_governor_list, governor_list)
> -    {
> -        i += scnprintf(&scaling_available_governors[i],
> -                       CPUFREQ_NAME_LEN, "%s ", t->name);
> -        if ( i > size )
> -            return -EINVAL;
> -    }
> -    scaling_available_governors[i-1] = '\0';
> -
> -    return 0;
> -}
> -
> -static int get_cpufreq_para(struct xen_sysctl_pm_op *op)
> -{
> -    uint32_t ret = 0;
> -    const struct processor_pminfo *pmpt;
> -    struct cpufreq_policy *policy;
> -    uint32_t gov_num = 0;
> -    uint32_t *affected_cpus;
> -    uint32_t *scaling_available_frequencies;
> -    char     *scaling_available_governors;
> -    struct list_head *pos;
> -    uint32_t cpu, i, j = 0;
> -
> -    pmpt = processor_pminfo[op->cpuid];
> -    policy = per_cpu(cpufreq_cpu_policy, op->cpuid);
> -
> -    if ( !pmpt || !pmpt->perf.states ||
> -         !policy || !policy->governor )
> -        return -EINVAL;
> -
> -    list_for_each(pos, &cpufreq_governor_list)
> -        gov_num++;
> -
> -    if ( (op->u.get_para.cpu_num  != cpumask_weight(policy->cpus)) ||
> -         (op->u.get_para.freq_num != pmpt->perf.state_count)    ||
> -         (op->u.get_para.gov_num  != gov_num) )
> -    {
> -        op->u.get_para.cpu_num =  cpumask_weight(policy->cpus);
> -        op->u.get_para.freq_num = pmpt->perf.state_count;
> -        op->u.get_para.gov_num  = gov_num;
> -        return -EAGAIN;
> -    }
> -
> -    if ( !(affected_cpus = xzalloc_array(uint32_t, op->u.get_para.cpu_num)) )
> -        return -ENOMEM;
> -    for_each_cpu(cpu, policy->cpus)
> -        affected_cpus[j++] = cpu;
> -    ret = copy_to_guest(op->u.get_para.affected_cpus,
> -                       affected_cpus, op->u.get_para.cpu_num);
> -    xfree(affected_cpus);
> -    if ( ret )
> -        return ret;
> -
> -    if ( !(scaling_available_frequencies =
> -           xzalloc_array(uint32_t, op->u.get_para.freq_num)) )
> -        return -ENOMEM;
> -    for ( i = 0; i < op->u.get_para.freq_num; i++ )
> -        scaling_available_frequencies[i] =
> -                        pmpt->perf.states[i].core_frequency * 1000;
> -    ret = copy_to_guest(op->u.get_para.scaling_available_frequencies,
> -                   scaling_available_frequencies, op->u.get_para.freq_num);
> -    xfree(scaling_available_frequencies);
> -    if ( ret )
> -        return ret;
> -
> -    if ( !(scaling_available_governors =
> -           xzalloc_array(char, gov_num * CPUFREQ_NAME_LEN)) )
> -        return -ENOMEM;
> -    if ( (ret = read_scaling_available_governors(scaling_available_governors,
> -                gov_num * CPUFREQ_NAME_LEN * sizeof(char))) )
> -    {
> -        xfree(scaling_available_governors);
> -        return ret;
> -    }
> -    ret = copy_to_guest(op->u.get_para.scaling_available_governors,
> -                scaling_available_governors, gov_num * CPUFREQ_NAME_LEN);
> -    xfree(scaling_available_governors);
> -    if ( ret )
> -        return ret;
> -
> -    op->u.get_para.cpuinfo_cur_freq =
> -        cpufreq_driver->get ? cpufreq_driver->get(op->cpuid) : policy->cur;
> -    op->u.get_para.cpuinfo_max_freq = policy->cpuinfo.max_freq;
> -    op->u.get_para.cpuinfo_min_freq = policy->cpuinfo.min_freq;
> -    op->u.get_para.scaling_cur_freq = policy->cur;
> -    op->u.get_para.scaling_max_freq = policy->max;
> -    op->u.get_para.scaling_min_freq = policy->min;
> -
> -    if ( cpufreq_driver->name[0] )
> -        strlcpy(op->u.get_para.scaling_driver, 
> -            cpufreq_driver->name, CPUFREQ_NAME_LEN);
> -    else
> -        strlcpy(op->u.get_para.scaling_driver, "Unknown", CPUFREQ_NAME_LEN);
> -
> -    if ( policy->governor->name[0] )
> -        strlcpy(op->u.get_para.scaling_governor, 
> -            policy->governor->name, CPUFREQ_NAME_LEN);
> -    else
> -        strlcpy(op->u.get_para.scaling_governor, "Unknown", CPUFREQ_NAME_LEN);
> -
> -    /* governor specific para */
> -    if ( !strnicmp(op->u.get_para.scaling_governor, 
> -                   "userspace", CPUFREQ_NAME_LEN) )
> -    {
> -        op->u.get_para.u.userspace.scaling_setspeed = policy->cur;
> -    }
> -
> -    if ( !strnicmp(op->u.get_para.scaling_governor, 
> -                   "ondemand", CPUFREQ_NAME_LEN) )
> -    {
> -        ret = get_cpufreq_ondemand_para(
> -            &op->u.get_para.u.ondemand.sampling_rate_max,
> -            &op->u.get_para.u.ondemand.sampling_rate_min,
> -            &op->u.get_para.u.ondemand.sampling_rate,
> -            &op->u.get_para.u.ondemand.up_threshold);
> -    }
> -    op->u.get_para.turbo_enabled = cpufreq_get_turbo_status(op->cpuid);
> -
> -    return ret;
> -}
> -
> -static int set_cpufreq_gov(struct xen_sysctl_pm_op *op)
> -{
> -    struct cpufreq_policy new_policy, *old_policy;
> -
> -    old_policy = per_cpu(cpufreq_cpu_policy, op->cpuid);
> -    if ( !old_policy )
> -        return -EINVAL;
> -
> -    memcpy(&new_policy, old_policy, sizeof(struct cpufreq_policy));
> -
> -    new_policy.governor = __find_governor(op->u.set_gov.scaling_governor);
> -    if (new_policy.governor == NULL)
> -        return -EINVAL;
> -
> -    return __cpufreq_set_policy(old_policy, &new_policy);
> -}
> -
> -static int set_cpufreq_para(struct xen_sysctl_pm_op *op)
> -{
> -    int ret = 0;
> -    struct cpufreq_policy *policy;
> -
> -    policy = per_cpu(cpufreq_cpu_policy, op->cpuid);
> -
> -    if ( !policy || !policy->governor )
> -        return -EINVAL;
> -
> -    switch(op->u.set_para.ctrl_type)
> -    {
> -    case SCALING_MAX_FREQ:
> -    {
> -        struct cpufreq_policy new_policy;
> -
> -        memcpy(&new_policy, policy, sizeof(struct cpufreq_policy));
> -        new_policy.max = op->u.set_para.ctrl_value;
> -        ret = __cpufreq_set_policy(policy, &new_policy);
> -
> -        break;
> -    }
> -
> -    case SCALING_MIN_FREQ:
> -    {
> -        struct cpufreq_policy new_policy;
> -
> -        memcpy(&new_policy, policy, sizeof(struct cpufreq_policy));
> -        new_policy.min = op->u.set_para.ctrl_value;
> -        ret = __cpufreq_set_policy(policy, &new_policy);
> -
> -        break;
> -    }
> -
> -    case SCALING_SETSPEED:
> -    {
> -        unsigned int freq =op->u.set_para.ctrl_value;
> -
> -        if ( !strnicmp(policy->governor->name,
> -                       "userspace", CPUFREQ_NAME_LEN) )
> -            ret = write_userspace_scaling_setspeed(op->cpuid, freq);
> -        else
> -            ret = -EINVAL;
> -
> -        break;
> -    }
> -
> -    case SAMPLING_RATE:
> -    {
> -        unsigned int sampling_rate = op->u.set_para.ctrl_value;
> -
> -        if ( !strnicmp(policy->governor->name,
> -                       "ondemand", CPUFREQ_NAME_LEN) )
> -            ret = write_ondemand_sampling_rate(sampling_rate);
> -        else
> -            ret = -EINVAL;
> -
> -        break;
> -    }
> -
> -    case UP_THRESHOLD:
> -    {
> -        unsigned int up_threshold = op->u.set_para.ctrl_value;
> -
> -        if ( !strnicmp(policy->governor->name,
> -                       "ondemand", CPUFREQ_NAME_LEN) )
> -            ret = write_ondemand_up_threshold(up_threshold);
> -        else
> -            ret = -EINVAL;
> -
> -        break;
> -    }
> -
> -    default:
> -        ret = -EINVAL;
> -        break;
> -    }
> -
> -    return ret;
> -}
> -
> -int do_pm_op(struct xen_sysctl_pm_op *op)
> -{
> -    int ret = 0;
> -    const struct processor_pminfo *pmpt;
> -
> -    if ( !op || op->cpuid >= nr_cpu_ids || !cpu_online(op->cpuid) )
> -        return -EINVAL;
> -    pmpt = processor_pminfo[op->cpuid];
> -
> -    switch ( op->cmd & PM_PARA_CATEGORY_MASK )
> -    {
> -    case CPUFREQ_PARA:
> -        if ( !(xen_processor_pmbits & XEN_PROCESSOR_PM_PX) )
> -            return -ENODEV;
> -        if ( !pmpt || !(pmpt->perf.init & XEN_PX_INIT) )
> -            return -EINVAL;
> -        break;
> -    }
> -
> -    switch ( op->cmd )
> -    {
> -    case GET_CPUFREQ_PARA:
> -    {
> -        ret = get_cpufreq_para(op);
> -        break;
> -    }
> -
> -    case SET_CPUFREQ_GOV:
> -    {
> -        ret = set_cpufreq_gov(op);
> -        break;
> -    }
> -
> -    case SET_CPUFREQ_PARA:
> -    {
> -        ret = set_cpufreq_para(op);
> -        break;
> -    }
> -
> -    case GET_CPUFREQ_AVGFREQ:
> -    {
> -        op->u.get_avgfreq = cpufreq_driver_getavg(op->cpuid, USR_GETAVG);
> -        break;
> -    }
> -
> -    case XEN_SYSCTL_pm_op_set_sched_opt_smt:
> -    {
> -        uint32_t saved_value;
> -
> -        saved_value = sched_smt_power_savings;
> -        sched_smt_power_savings = !!op->u.set_sched_opt_smt;
> -        op->u.set_sched_opt_smt = saved_value;
> -
> -        break;
> -    }
> -
> -    case XEN_SYSCTL_pm_op_set_vcpu_migration_delay:
> -    {
> -        set_vcpu_migration_delay(op->u.set_vcpu_migration_delay);
> -        break;
> -    }
> -
> -    case XEN_SYSCTL_pm_op_get_vcpu_migration_delay:
> -    {
> -        op->u.get_vcpu_migration_delay = get_vcpu_migration_delay();
> -        break;
> -    }
> -
> -    case XEN_SYSCTL_pm_op_get_max_cstate:
> -    {
> -        op->u.get_max_cstate = acpi_get_cstate_limit();
> -        break;
> -    }
> -
> -    case XEN_SYSCTL_pm_op_set_max_cstate:
> -    {
> -        acpi_set_cstate_limit(op->u.set_max_cstate);
> -        break;
> -    }
> -
> -    case XEN_SYSCTL_pm_op_enable_turbo:
> -    {
> -        ret = cpufreq_update_turbo(op->cpuid, CPUFREQ_TURBO_ENABLED);
> -        break;
> -    }
> -
> -    case XEN_SYSCTL_pm_op_disable_turbo:
> -    {
> -        ret = cpufreq_update_turbo(op->cpuid, CPUFREQ_TURBO_DISABLED);
> -        break;
> -    }
> -
> -    default:
> -        printk("not defined sub-hypercall @ do_pm_op\n");
> -        ret = -ENOSYS;
> -        break;
> -    }
> -
> -    return ret;
> -}
> -
> -int acpi_set_pdc_bits(u32 acpi_id, XEN_GUEST_HANDLE_PARAM(uint32) pdc)
> -{
> -    u32 bits[3];
> -    int ret;
> -
> -    if ( copy_from_guest(bits, pdc, 2) )
> -        ret = -EFAULT;
> -    else if ( bits[0] != ACPI_PDC_REVISION_ID || !bits[1] )
> -        ret = -EINVAL;
> -    else if ( copy_from_guest_offset(bits + 2, pdc, 2, 1) )
> -        ret = -EFAULT;
> -    else
> -    {
> -        u32 mask = 0;
> -
> -        if ( xen_processor_pmbits & XEN_PROCESSOR_PM_CX )
> -            mask |= ACPI_PDC_C_MASK | ACPI_PDC_SMP_C1PT;
> -        if ( xen_processor_pmbits & XEN_PROCESSOR_PM_PX )
> -            mask |= ACPI_PDC_P_MASK | ACPI_PDC_SMP_C1PT;
> -        if ( xen_processor_pmbits & XEN_PROCESSOR_PM_TX )
> -            mask |= ACPI_PDC_T_MASK | ACPI_PDC_SMP_C1PT;
> -        bits[2] &= (ACPI_PDC_C_MASK | ACPI_PDC_P_MASK | ACPI_PDC_T_MASK |
> -                    ACPI_PDC_SMP_C1PT) & ~mask;
> -        ret = arch_acpi_set_pdc_bits(acpi_id, bits, mask);
> -    }
> -    if ( !ret && __copy_to_guest_offset(pdc, 2, bits + 2, 1) )
> -        ret = -EFAULT;
> -
> -    return ret;
> -}
> diff --git a/xen/drivers/pm/Kconfig b/xen/drivers/pm/Kconfig
> new file mode 100644
> index 0000000..6d4fda1
> --- /dev/null
> +++ b/xen/drivers/pm/Kconfig
> @@ -0,0 +1,3 @@
> +
> +config HAS_PM
> +	bool
> diff --git a/xen/drivers/pm/Makefile b/xen/drivers/pm/Makefile
> new file mode 100644
> index 0000000..2073683
> --- /dev/null
> +++ b/xen/drivers/pm/Makefile
> @@ -0,0 +1 @@
> +obj-y += stat.o
> diff --git a/xen/drivers/pm/stat.c b/xen/drivers/pm/stat.c
> new file mode 100644
> index 0000000..2dbde1c
> --- /dev/null
> +++ b/xen/drivers/pm/stat.c
> @@ -0,0 +1,526 @@
> +/*****************************************************************************
> +#  pmstat.c - Power Management statistic information (Px/Cx/Tx, etc.)
> +#
> +#  Copyright (c) 2008, Liu Jinsong <jinsong.liu@intel.com>
> +#
> +# This program is free software; you can redistribute it and/or modify it 
> +# under the terms of the GNU General Public License as published by the Free 
> +# Software Foundation; either version 2 of the License, or (at your option) 
> +# any later version.
> +#
> +# This program is distributed in the hope that it will be useful, but WITHOUT 
> +# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or 
> +# FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for 
> +# more details.
> +#
> +# You should have received a copy of the GNU General Public License along with
> +# this program; If not, see <http://www.gnu.org/licenses/>.
> +#
> +# The full GNU General Public License is included in this distribution in the
> +# file called LICENSE.
> +#
> +*****************************************************************************/
> +
> +#include <xen/lib.h>
> +#include <xen/errno.h>
> +#include <xen/sched.h>
> +#include <xen/event.h>
> +#include <xen/irq.h>
> +#include <xen/iocap.h>
> +#include <xen/compat.h>
> +#include <xen/guest_access.h>
> +#include <asm/current.h>
> +#include <public/xen.h>
> +#include <xen/cpumask.h>
> +#include <asm/processor.h>
> +#include <xen/percpu.h>
> +#include <xen/domain.h>
> +#include <xen/acpi.h>
> +
> +#include <public/sysctl.h>
> +#include <xen/cpufreq.h>
> +#include <xen/pmstat.h>
> +
> +DEFINE_PER_CPU_READ_MOSTLY(struct pm_px *, cpufreq_statistic_data);
> +
> +/*
> + * Get PM statistic info
> + */
> +int do_get_pm_info(struct xen_sysctl_get_pmstat *op)
> +{
> +    int ret = 0;
> +    const struct processor_pminfo *pmpt;
> +
> +    if ( !op || (op->cpuid >= nr_cpu_ids) || !cpu_online(op->cpuid) )
> +        return -EINVAL;
> +    pmpt = processor_pminfo[op->cpuid];
> +
> +    switch ( op->type & PMSTAT_CATEGORY_MASK )
> +    {
> +    case PMSTAT_CX:
> +        if ( !(xen_processor_pmbits & XEN_PROCESSOR_PM_CX) )
> +            return -ENODEV;
> +        break;
> +    case PMSTAT_PX:
> +        if ( !(xen_processor_pmbits & XEN_PROCESSOR_PM_PX) )
> +            return -ENODEV;
> +        if ( !cpufreq_driver )
> +            return -ENODEV;
> +        if ( !pmpt || !(pmpt->perf.init & XEN_PX_INIT) )
> +            return -EINVAL;
> +        break;
> +    default:
> +        return -ENODEV;
> +    }
> +
> +    switch ( op->type )
> +    {
> +    case PMSTAT_get_max_px:
> +    {
> +        op->u.getpx.total = pmpt->perf.state_count;
> +        break;
> +    }
> +
> +    case PMSTAT_get_pxstat:
> +    {
> +        uint32_t ct;
> +        struct pm_px *pxpt;
> +        spinlock_t *cpufreq_statistic_lock = 
> +                   &per_cpu(cpufreq_statistic_lock, op->cpuid);
> +
> +        spin_lock(cpufreq_statistic_lock);
> +
> +        pxpt = per_cpu(cpufreq_statistic_data, op->cpuid);
> +        if ( !pxpt || !pxpt->u.pt || !pxpt->u.trans_pt )
> +        {
> +            spin_unlock(cpufreq_statistic_lock);
> +            return -ENODATA;
> +        }
> +
> +        pxpt->u.usable = pmpt->perf.state_count - pmpt->perf.platform_limit;
> +
> +        cpufreq_residency_update(op->cpuid, pxpt->u.cur);
> +
> +        ct = pmpt->perf.state_count;
> +        if ( copy_to_guest(op->u.getpx.trans_pt, pxpt->u.trans_pt, ct*ct) )
> +        {
> +            spin_unlock(cpufreq_statistic_lock);
> +            ret = -EFAULT;
> +            break;
> +        }
> +
> +        if ( copy_to_guest(op->u.getpx.pt, pxpt->u.pt, ct) )
> +        {
> +            spin_unlock(cpufreq_statistic_lock);
> +            ret = -EFAULT;
> +            break;
> +        }
> +
> +        op->u.getpx.total = pxpt->u.total;
> +        op->u.getpx.usable = pxpt->u.usable;
> +        op->u.getpx.last = pxpt->u.last;
> +        op->u.getpx.cur = pxpt->u.cur;
> +
> +        spin_unlock(cpufreq_statistic_lock);
> +
> +        break;
> +    }
> +
> +    case PMSTAT_reset_pxstat:
> +    {
> +        cpufreq_statistic_reset(op->cpuid);
> +        break;
> +    }
> +
> +    case PMSTAT_get_max_cx:
> +    {
> +        op->u.getcx.nr = pmstat_get_cx_nr(op->cpuid);
> +        ret = 0;
> +        break;
> +    }
> +
> +    case PMSTAT_get_cxstat:
> +    {
> +        ret = pmstat_get_cx_stat(op->cpuid, &op->u.getcx);
> +        break;
> +    }
> +
> +    case PMSTAT_reset_cxstat:
> +    {
> +        ret = pmstat_reset_cx_stat(op->cpuid);
> +        break;
> +    }
> +
> +    default:
> +        printk("not defined sub-hypercall @ do_get_pm_info\n");
> +        ret = -ENOSYS;
> +        break;
> +    }
> +
> +    return ret;
> +}
> +
> +/*
> + * 1. Get PM parameter
> + * 2. Provide user PM control
> + */
> +static int read_scaling_available_governors(char *scaling_available_governors,
> +                                            unsigned int size)
> +{
> +    unsigned int i = 0;
> +    struct cpufreq_governor *t;
> +
> +    if ( !scaling_available_governors )
> +        return -EINVAL;
> +
> +    list_for_each_entry(t, &cpufreq_governor_list, governor_list)
> +    {
> +        i += scnprintf(&scaling_available_governors[i],
> +                       CPUFREQ_NAME_LEN, "%s ", t->name);
> +        if ( i > size )
> +            return -EINVAL;
> +    }
> +    scaling_available_governors[i-1] = '\0';
> +
> +    return 0;
> +}
> +
> +static int get_cpufreq_para(struct xen_sysctl_pm_op *op)
> +{
> +    uint32_t ret = 0;
> +    const struct processor_pminfo *pmpt;
> +    struct cpufreq_policy *policy;
> +    uint32_t gov_num = 0;
> +    uint32_t *affected_cpus;
> +    uint32_t *scaling_available_frequencies;
> +    char     *scaling_available_governors;
> +    struct list_head *pos;
> +    uint32_t cpu, i, j = 0;
> +
> +    pmpt = processor_pminfo[op->cpuid];
> +    policy = per_cpu(cpufreq_cpu_policy, op->cpuid);
> +
> +    if ( !pmpt || !pmpt->perf.states ||
> +         !policy || !policy->governor )
> +        return -EINVAL;
> +
> +    list_for_each(pos, &cpufreq_governor_list)
> +        gov_num++;
> +
> +    if ( (op->u.get_para.cpu_num  != cpumask_weight(policy->cpus)) ||
> +         (op->u.get_para.freq_num != pmpt->perf.state_count)    ||
> +         (op->u.get_para.gov_num  != gov_num) )
> +    {
> +        op->u.get_para.cpu_num =  cpumask_weight(policy->cpus);
> +        op->u.get_para.freq_num = pmpt->perf.state_count;
> +        op->u.get_para.gov_num  = gov_num;
> +        return -EAGAIN;
> +    }
> +
> +    if ( !(affected_cpus = xzalloc_array(uint32_t, op->u.get_para.cpu_num)) )
> +        return -ENOMEM;
> +    for_each_cpu(cpu, policy->cpus)
> +        affected_cpus[j++] = cpu;
> +    ret = copy_to_guest(op->u.get_para.affected_cpus,
> +                       affected_cpus, op->u.get_para.cpu_num);
> +    xfree(affected_cpus);
> +    if ( ret )
> +        return ret;
> +
> +    if ( !(scaling_available_frequencies =
> +           xzalloc_array(uint32_t, op->u.get_para.freq_num)) )
> +        return -ENOMEM;
> +    for ( i = 0; i < op->u.get_para.freq_num; i++ )
> +        scaling_available_frequencies[i] =
> +                        pmpt->perf.states[i].core_frequency * 1000;
> +    ret = copy_to_guest(op->u.get_para.scaling_available_frequencies,
> +                   scaling_available_frequencies, op->u.get_para.freq_num);
> +    xfree(scaling_available_frequencies);
> +    if ( ret )
> +        return ret;
> +
> +    if ( !(scaling_available_governors =
> +           xzalloc_array(char, gov_num * CPUFREQ_NAME_LEN)) )
> +        return -ENOMEM;
> +    if ( (ret = read_scaling_available_governors(scaling_available_governors,
> +                gov_num * CPUFREQ_NAME_LEN * sizeof(char))) )
> +    {
> +        xfree(scaling_available_governors);
> +        return ret;
> +    }
> +    ret = copy_to_guest(op->u.get_para.scaling_available_governors,
> +                scaling_available_governors, gov_num * CPUFREQ_NAME_LEN);
> +    xfree(scaling_available_governors);
> +    if ( ret )
> +        return ret;
> +
> +    op->u.get_para.cpuinfo_cur_freq =
> +        cpufreq_driver->get ? cpufreq_driver->get(op->cpuid) : policy->cur;
> +    op->u.get_para.cpuinfo_max_freq = policy->cpuinfo.max_freq;
> +    op->u.get_para.cpuinfo_min_freq = policy->cpuinfo.min_freq;
> +    op->u.get_para.scaling_cur_freq = policy->cur;
> +    op->u.get_para.scaling_max_freq = policy->max;
> +    op->u.get_para.scaling_min_freq = policy->min;
> +
> +    if ( cpufreq_driver->name[0] )
> +        strlcpy(op->u.get_para.scaling_driver, 
> +            cpufreq_driver->name, CPUFREQ_NAME_LEN);
> +    else
> +        strlcpy(op->u.get_para.scaling_driver, "Unknown", CPUFREQ_NAME_LEN);
> +
> +    if ( policy->governor->name[0] )
> +        strlcpy(op->u.get_para.scaling_governor, 
> +            policy->governor->name, CPUFREQ_NAME_LEN);
> +    else
> +        strlcpy(op->u.get_para.scaling_governor, "Unknown", CPUFREQ_NAME_LEN);
> +
> +    /* governor specific para */
> +    if ( !strnicmp(op->u.get_para.scaling_governor, 
> +                   "userspace", CPUFREQ_NAME_LEN) )
> +    {
> +        op->u.get_para.u.userspace.scaling_setspeed = policy->cur;
> +    }
> +
> +    if ( !strnicmp(op->u.get_para.scaling_governor, 
> +                   "ondemand", CPUFREQ_NAME_LEN) )
> +    {
> +        ret = get_cpufreq_ondemand_para(
> +            &op->u.get_para.u.ondemand.sampling_rate_max,
> +            &op->u.get_para.u.ondemand.sampling_rate_min,
> +            &op->u.get_para.u.ondemand.sampling_rate,
> +            &op->u.get_para.u.ondemand.up_threshold);
> +    }
> +    op->u.get_para.turbo_enabled = cpufreq_get_turbo_status(op->cpuid);
> +
> +    return ret;
> +}
> +
> +static int set_cpufreq_gov(struct xen_sysctl_pm_op *op)
> +{
> +    struct cpufreq_policy new_policy, *old_policy;
> +
> +    old_policy = per_cpu(cpufreq_cpu_policy, op->cpuid);
> +    if ( !old_policy )
> +        return -EINVAL;
> +
> +    memcpy(&new_policy, old_policy, sizeof(struct cpufreq_policy));
> +
> +    new_policy.governor = __find_governor(op->u.set_gov.scaling_governor);
> +    if (new_policy.governor == NULL)
> +        return -EINVAL;
> +
> +    return __cpufreq_set_policy(old_policy, &new_policy);
> +}
> +
> +static int set_cpufreq_para(struct xen_sysctl_pm_op *op)
> +{
> +    int ret = 0;
> +    struct cpufreq_policy *policy;
> +
> +    policy = per_cpu(cpufreq_cpu_policy, op->cpuid);
> +
> +    if ( !policy || !policy->governor )
> +        return -EINVAL;
> +
> +    switch(op->u.set_para.ctrl_type)
> +    {
> +    case SCALING_MAX_FREQ:
> +    {
> +        struct cpufreq_policy new_policy;
> +
> +        memcpy(&new_policy, policy, sizeof(struct cpufreq_policy));
> +        new_policy.max = op->u.set_para.ctrl_value;
> +        ret = __cpufreq_set_policy(policy, &new_policy);
> +
> +        break;
> +    }
> +
> +    case SCALING_MIN_FREQ:
> +    {
> +        struct cpufreq_policy new_policy;
> +
> +        memcpy(&new_policy, policy, sizeof(struct cpufreq_policy));
> +        new_policy.min = op->u.set_para.ctrl_value;
> +        ret = __cpufreq_set_policy(policy, &new_policy);
> +
> +        break;
> +    }
> +
> +    case SCALING_SETSPEED:
> +    {
> +        unsigned int freq =op->u.set_para.ctrl_value;
> +
> +        if ( !strnicmp(policy->governor->name,
> +                       "userspace", CPUFREQ_NAME_LEN) )
> +            ret = write_userspace_scaling_setspeed(op->cpuid, freq);
> +        else
> +            ret = -EINVAL;
> +
> +        break;
> +    }
> +
> +    case SAMPLING_RATE:
> +    {
> +        unsigned int sampling_rate = op->u.set_para.ctrl_value;
> +
> +        if ( !strnicmp(policy->governor->name,
> +                       "ondemand", CPUFREQ_NAME_LEN) )
> +            ret = write_ondemand_sampling_rate(sampling_rate);
> +        else
> +            ret = -EINVAL;
> +
> +        break;
> +    }
> +
> +    case UP_THRESHOLD:
> +    {
> +        unsigned int up_threshold = op->u.set_para.ctrl_value;
> +
> +        if ( !strnicmp(policy->governor->name,
> +                       "ondemand", CPUFREQ_NAME_LEN) )
> +            ret = write_ondemand_up_threshold(up_threshold);
> +        else
> +            ret = -EINVAL;
> +
> +        break;
> +    }
> +
> +    default:
> +        ret = -EINVAL;
> +        break;
> +    }
> +
> +    return ret;
> +}
> +
> +int do_pm_op(struct xen_sysctl_pm_op *op)
> +{
> +    int ret = 0;
> +    const struct processor_pminfo *pmpt;
> +
> +    if ( !op || op->cpuid >= nr_cpu_ids || !cpu_online(op->cpuid) )
> +        return -EINVAL;
> +    pmpt = processor_pminfo[op->cpuid];
> +
> +    switch ( op->cmd & PM_PARA_CATEGORY_MASK )
> +    {
> +    case CPUFREQ_PARA:
> +        if ( !(xen_processor_pmbits & XEN_PROCESSOR_PM_PX) )
> +            return -ENODEV;
> +        if ( !pmpt || !(pmpt->perf.init & XEN_PX_INIT) )
> +            return -EINVAL;
> +        break;
> +    }
> +
> +    switch ( op->cmd )
> +    {
> +    case GET_CPUFREQ_PARA:
> +    {
> +        ret = get_cpufreq_para(op);
> +        break;
> +    }
> +
> +    case SET_CPUFREQ_GOV:
> +    {
> +        ret = set_cpufreq_gov(op);
> +        break;
> +    }
> +
> +    case SET_CPUFREQ_PARA:
> +    {
> +        ret = set_cpufreq_para(op);
> +        break;
> +    }
> +
> +    case GET_CPUFREQ_AVGFREQ:
> +    {
> +        op->u.get_avgfreq = cpufreq_driver_getavg(op->cpuid, USR_GETAVG);
> +        break;
> +    }
> +
> +    case XEN_SYSCTL_pm_op_set_sched_opt_smt:
> +    {
> +        uint32_t saved_value;
> +
> +        saved_value = sched_smt_power_savings;
> +        sched_smt_power_savings = !!op->u.set_sched_opt_smt;
> +        op->u.set_sched_opt_smt = saved_value;
> +
> +        break;
> +    }
> +
> +    case XEN_SYSCTL_pm_op_set_vcpu_migration_delay:
> +    {
> +        set_vcpu_migration_delay(op->u.set_vcpu_migration_delay);
> +        break;
> +    }
> +
> +    case XEN_SYSCTL_pm_op_get_vcpu_migration_delay:
> +    {
> +        op->u.get_vcpu_migration_delay = get_vcpu_migration_delay();
> +        break;
> +    }
> +
> +    case XEN_SYSCTL_pm_op_get_max_cstate:
> +    {
> +        op->u.get_max_cstate = acpi_get_cstate_limit();
> +        break;
> +    }
> +
> +    case XEN_SYSCTL_pm_op_set_max_cstate:
> +    {
> +        acpi_set_cstate_limit(op->u.set_max_cstate);
> +        break;
> +    }
> +
> +    case XEN_SYSCTL_pm_op_enable_turbo:
> +    {
> +        ret = cpufreq_update_turbo(op->cpuid, CPUFREQ_TURBO_ENABLED);
> +        break;
> +    }
> +
> +    case XEN_SYSCTL_pm_op_disable_turbo:
> +    {
> +        ret = cpufreq_update_turbo(op->cpuid, CPUFREQ_TURBO_DISABLED);
> +        break;
> +    }
> +
> +    default:
> +        printk("not defined sub-hypercall @ do_pm_op\n");
> +        ret = -ENOSYS;
> +        break;
> +    }
> +
> +    return ret;
> +}
> +
> +int acpi_set_pdc_bits(u32 acpi_id, XEN_GUEST_HANDLE_PARAM(uint32) pdc)
> +{
> +    u32 bits[3];
> +    int ret;
> +
> +    if ( copy_from_guest(bits, pdc, 2) )
> +        ret = -EFAULT;
> +    else if ( bits[0] != ACPI_PDC_REVISION_ID || !bits[1] )
> +        ret = -EINVAL;
> +    else if ( copy_from_guest_offset(bits + 2, pdc, 2, 1) )
> +        ret = -EFAULT;
> +    else
> +    {
> +        u32 mask = 0;
> +
> +        if ( xen_processor_pmbits & XEN_PROCESSOR_PM_CX )
> +            mask |= ACPI_PDC_C_MASK | ACPI_PDC_SMP_C1PT;
> +        if ( xen_processor_pmbits & XEN_PROCESSOR_PM_PX )
> +            mask |= ACPI_PDC_P_MASK | ACPI_PDC_SMP_C1PT;
> +        if ( xen_processor_pmbits & XEN_PROCESSOR_PM_TX )
> +            mask |= ACPI_PDC_T_MASK | ACPI_PDC_SMP_C1PT;
> +        bits[2] &= (ACPI_PDC_C_MASK | ACPI_PDC_P_MASK | ACPI_PDC_T_MASK |
> +                    ACPI_PDC_SMP_C1PT) & ~mask;
> +        ret = arch_acpi_set_pdc_bits(acpi_id, bits, mask);
> +    }
> +    if ( !ret && __copy_to_guest_offset(pdc, 2, bits + 2, 1) )
> +        ret = -EFAULT;
> +
> +    return ret;
> +}
> -- 
> 2.7.4
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [RFC PATCH 04/31] cpufreq: make turbo settings to be configurable
  2017-11-09 17:09 ` [RFC PATCH 04/31] cpufreq: make turbo settings to be configurable Oleksandr Tyshchenko
@ 2017-12-02  1:06   ` Stefano Stabellini
  2017-12-02 17:25     ` Oleksandr Tyshchenko
  2018-05-07 15:39   ` Jan Beulich
  1 sibling, 1 reply; 108+ messages in thread
From: Stefano Stabellini @ 2017-12-02  1:06 UTC (permalink / raw)
  To: Oleksandr Tyshchenko
  Cc: Stefano Stabellini, Andrew Cooper, Oleksandr Dmytryshyn,
	Julien Grall, Oleksandr Tyshchenko, Jan Beulich, xen-devel

On Thu, 9 Nov 2017, Oleksandr Tyshchenko wrote:
> From: Oleksandr Dmytryshyn <oleksandr.dmytryshyn@globallogic.com>
> 
> This setting is not needed on some architectures.
> So make it configurable and use it for the x86
> architecture.
> 
> This is a rebased version of the original patch:
> https://lists.xen.org/archives/html/xen-devel/2014-11/msg00942.html
> 
> Signed-off-by: Oleksandr Dmytryshyn <oleksandr.dmytryshyn@globallogic.com>
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> CC: Jan Beulich <jbeulich@suse.com>
> CC: Andrew Cooper <andrew.cooper3@citrix.com>
> CC: Stefano Stabellini <sstabellini@kernel.org>
> CC: Julien Grall <julien.grall@linaro.org>
> ---
>  xen/arch/x86/Kconfig          |  1 +
>  xen/drivers/cpufreq/Kconfig   |  3 +++
>  xen/drivers/cpufreq/utility.c | 11 ++++++++++-
>  xen/drivers/pm/stat.c         |  6 ++++++
>  xen/include/xen/cpufreq.h     |  6 ++++++
>  5 files changed, 26 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/arch/x86/Kconfig b/xen/arch/x86/Kconfig
> index 86c8eca..c1eac1d 100644
> --- a/xen/arch/x86/Kconfig
> +++ b/xen/arch/x86/Kconfig
> @@ -24,6 +24,7 @@ config X86
>  	select NUMA
>  	select VGA
>  	select HAS_PM
> +	select HAS_CPU_TURBO
>  
>  config ARCH_DEFCONFIG
>  	string
> diff --git a/xen/drivers/cpufreq/Kconfig b/xen/drivers/cpufreq/Kconfig
> index cce80f4..427ea2a 100644
> --- a/xen/drivers/cpufreq/Kconfig
> +++ b/xen/drivers/cpufreq/Kconfig
> @@ -1,3 +1,6 @@
>  
>  config HAS_CPUFREQ
>  	bool
> +
> +config HAS_CPU_TURBO
> +	bool
> diff --git a/xen/drivers/cpufreq/utility.c b/xen/drivers/cpufreq/utility.c
> index a687e5a..25bf983 100644
> --- a/xen/drivers/cpufreq/utility.c
> +++ b/xen/drivers/cpufreq/utility.c
> @@ -209,7 +209,9 @@ int cpufreq_frequency_table_cpuinfo(struct cpufreq_policy *policy,
>  {
>      unsigned int min_freq = ~0;
>      unsigned int max_freq = 0;
> +#ifdef CONFIG_HAS_CPU_TURBO
>      unsigned int second_max_freq = 0;
> +#endif
>      unsigned int i;
>  
>      for (i=0; (table[i].frequency != CPUFREQ_TABLE_END); i++) {
> @@ -221,6 +223,7 @@ int cpufreq_frequency_table_cpuinfo(struct cpufreq_policy *policy,
>          if (freq > max_freq)
>              max_freq = freq;
>      }
> +#ifdef CONFIG_HAS_CPU_TURBO
>      for (i=0; (table[i].frequency != CPUFREQ_TABLE_END); i++) {
>          unsigned int freq = table[i].frequency;
>          if (freq == CPUFREQ_ENTRY_INVALID || freq == max_freq)
> @@ -234,9 +237,13 @@ int cpufreq_frequency_table_cpuinfo(struct cpufreq_policy *policy,
>          printk("max_freq: %u    second_max_freq: %u\n",
>                 max_freq, second_max_freq);
>  
> +    policy->cpuinfo.second_max_freq = second_max_freq;
> +#else /* !CONFIG_HAS_CPU_TURBO */
> +    if (cpufreq_verbose)
> +        printk("max_freq: %u\n", max_freq);
> +#endif /* CONFIG_HAS_CPU_TURBO */
>      policy->min = policy->cpuinfo.min_freq = min_freq;
>      policy->max = policy->cpuinfo.max_freq = max_freq;
> -    policy->cpuinfo.second_max_freq = second_max_freq;
>  
>      if (policy->min == ~0)
>          return -EINVAL;
> @@ -390,6 +397,7 @@ int cpufreq_driver_getavg(unsigned int cpu, unsigned int flag)
>      return policy->cur;
>  }
>  
> +#ifdef CONFIG_HAS_CPU_TURBO
>  int cpufreq_update_turbo(int cpuid, int new_state)
>  {
>      struct cpufreq_policy *policy;
> @@ -430,6 +438,7 @@ int cpufreq_get_turbo_status(int cpuid)
>      policy = per_cpu(cpufreq_cpu_policy, cpuid);
>      return policy && policy->turbo == CPUFREQ_TURBO_ENABLED;
>  }
> +#endif /* CONFIG_HAS_CPU_TURBO */
>  
>  /*********************************************************************
>   *                 POLICY                                            *

I am wondering if we need to go as far as #ifdef'ing
cpufreq_update_turbo. For the sake of reducing the number of #ifdef's,
would it be enough if we only make sure it is disabled?

In other words, I would keep the changes to stat.c but I would leave
utility.c and cpufreq.h pretty much untouched.
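
To illustrate what "only make sure it is disabled" could look like, here
is a minimal standalone sketch. It assumes that a platform without turbo
simply leaves policy->turbo at CPUFREQ_TURBO_UNSUPPORTED. The struct and
the *_sketch function names are hypothetical stand-ins, not the actual
utility.c code, and the error codes are only illustrative.

```c
#include <errno.h>
#include <stddef.h>

/* Values as in the quoted cpufreq.h hunk; DISABLED is -1 per its comment. */
#define CPUFREQ_TURBO_DISABLED     (-1)
#define CPUFREQ_TURBO_UNSUPPORTED    0
#define CPUFREQ_TURBO_ENABLED        1

/* Hypothetical, cut-down stand-in for struct cpufreq_policy. */
struct policy_sketch {
    signed char turbo; /* tristate, see the CPUFREQ_TURBO_* values above */
};

/* Always compiled; refuses the request at run time if turbo is unsupported. */
static int update_turbo_sketch(struct policy_sketch *policy, int new_state)
{
    if (new_state != CPUFREQ_TURBO_ENABLED &&
        new_state != CPUFREQ_TURBO_DISABLED)
        return -EINVAL;

    if (policy == NULL)
        return -EACCES;

    if (policy->turbo == CPUFREQ_TURBO_UNSUPPORTED)
        return -EOPNOTSUPP; /* illustrative error code */

    policy->turbo = (signed char)new_state;
    return 0;
}

/* Reports 1 only when turbo is explicitly enabled, 0 otherwise. */
static int get_turbo_status_sketch(const struct policy_sketch *policy)
{
    return policy != NULL && policy->turbo == CPUFREQ_TURBO_ENABLED;
}
```

With this shape no #ifdef is needed around the functions or the struct
field: a platform that never sets the field away from "unsupported" gets
an error from the update path and 0 from the status query.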


> diff --git a/xen/drivers/pm/stat.c b/xen/drivers/pm/stat.c
> index 2dbde1c..133e64d 100644
> --- a/xen/drivers/pm/stat.c
> +++ b/xen/drivers/pm/stat.c
> @@ -290,7 +290,11 @@ static int get_cpufreq_para(struct xen_sysctl_pm_op *op)
>              &op->u.get_para.u.ondemand.sampling_rate,
>              &op->u.get_para.u.ondemand.up_threshold);
>      }
> +#ifdef CONFIG_HAS_CPU_TURBO
>      op->u.get_para.turbo_enabled = cpufreq_get_turbo_status(op->cpuid);
> +#else
> +    op->u.get_para.turbo_enabled = 0;
> +#endif
>  
>      return ret;
>  }
> @@ -473,6 +477,7 @@ int do_pm_op(struct xen_sysctl_pm_op *op)
>          break;
>      }
>  
> +#ifdef CONFIG_HAS_CPU_TURBO
>      case XEN_SYSCTL_pm_op_enable_turbo:
>      {
>          ret = cpufreq_update_turbo(op->cpuid, CPUFREQ_TURBO_ENABLED);
> @@ -484,6 +489,7 @@ int do_pm_op(struct xen_sysctl_pm_op *op)
>          ret = cpufreq_update_turbo(op->cpuid, CPUFREQ_TURBO_DISABLED);
>          break;
>      }
> +#endif /* CONFIG_HAS_CPU_TURBO */
>  
>      default:
>          printk("not defined sub-hypercall @ do_pm_op\n");
> diff --git a/xen/include/xen/cpufreq.h b/xen/include/xen/cpufreq.h
> index 30c70c9..2e0c16a 100644
> --- a/xen/include/xen/cpufreq.h
> +++ b/xen/include/xen/cpufreq.h
> @@ -39,7 +39,9 @@ extern struct acpi_cpufreq_data *cpufreq_drv_data[NR_CPUS];
>  
>  struct cpufreq_cpuinfo {
>      unsigned int        max_freq;
> +#ifdef CONFIG_HAS_CPU_TURBO
>      unsigned int        second_max_freq;    /* P1 if Turbo Mode is on */
> +#endif
>      unsigned int        min_freq;
>      unsigned int        transition_latency; /* in 10^(-9) s = nanoseconds */
>  };
> @@ -72,9 +74,11 @@ struct cpufreq_policy {
>  
>      bool_t              resume; /* flag for cpufreq 1st run
>                                   * S3 wakeup, hotplug cpu, etc */
> +#ifdef CONFIG_HAS_CPU_TURBO
>      s8                  turbo;  /* tristate flag: 0 for unsupported
>                                   * -1 for disable, 1 for enabled
>                                   * See CPUFREQ_TURBO_* below for defines */
> +#endif
>      bool_t              aperf_mperf; /* CPU has APERF/MPERF MSRs */
>  };
>  DECLARE_PER_CPU(struct cpufreq_policy *, cpufreq_cpu_policy);
> @@ -138,8 +142,10 @@ extern int cpufreq_driver_getavg(unsigned int cpu, unsigned int flag);
>  #define CPUFREQ_TURBO_UNSUPPORTED   0
>  #define CPUFREQ_TURBO_ENABLED       1
>  
> +#ifdef CONFIG_HAS_CPU_TURBO
>  extern int cpufreq_update_turbo(int cpuid, int new_state);
>  extern int cpufreq_get_turbo_status(int cpuid);
> +#endif
>  
>  static __inline__ int 
>  __cpufreq_governor(struct cpufreq_policy *policy, unsigned int event)


^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [RFC PATCH 05/31] pmstat: make pmstat functions more generalizable
  2017-11-09 17:09 ` [RFC PATCH 05/31] pmstat: make pmstat functions more generalizable Oleksandr Tyshchenko
@ 2017-12-02  1:21   ` Stefano Stabellini
  2017-12-04 16:21     ` Oleksandr Tyshchenko
  0 siblings, 1 reply; 108+ messages in thread
From: Stefano Stabellini @ 2017-12-02  1:21 UTC (permalink / raw)
  To: Oleksandr Tyshchenko
  Cc: Stefano Stabellini, Andrew Cooper, Oleksandr Dmytryshyn,
	Julien Grall, Oleksandr Tyshchenko, Jan Beulich, xen-devel

On Thu, 9 Nov 2017, Oleksandr Tyshchenko wrote:
> From: Oleksandr Dmytryshyn <oleksandr.dmytryshyn@globallogic.com>
> 
> ACPI-specific parts are moved under appropriate ifdefs.
> Now pmstat functions can be used on ARM platforms.
> 
> This is a rebased version of the original patch:
> https://lists.xen.org/archives/html/xen-devel/2014-11/msg00941.html

My first, maybe naive, question is: why do we want to disable the C-states
and not the P-states? After all, both are defined in ACPI.

The second question is: instead of #ifdef'ing everything C-states,
couldn't we just rely on XEN_PROCESSOR_PM_CX not being available?


> Signed-off-by: Oleksandr Dmytryshyn <oleksandr.dmytryshyn@globallogic.com>
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> CC: Jan Beulich <jbeulich@suse.com>
> CC: Andrew Cooper <andrew.cooper3@citrix.com>
> CC: Stefano Stabellini <sstabellini@kernel.org>
> CC: Julien Grall <julien.grall@linaro.org>
> ---
>  xen/drivers/pm/stat.c    | 8 +++++++-
>  xen/include/xen/pmstat.h | 2 ++
>  2 files changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/drivers/pm/stat.c b/xen/drivers/pm/stat.c
> index 133e64d..986ba41 100644
> --- a/xen/drivers/pm/stat.c
> +++ b/xen/drivers/pm/stat.c
> @@ -35,7 +35,6 @@
>  #include <asm/processor.h>
>  #include <xen/percpu.h>
>  #include <xen/domain.h>
> -#include <xen/acpi.h>
>  
>  #include <public/sysctl.h>
>  #include <xen/cpufreq.h>
> @@ -132,6 +131,8 @@ int do_get_pm_info(struct xen_sysctl_get_pmstat *op)
>          break;
>      }
>  
> +/* For now those operations can be used only when ACPI is enabled */
> +#ifdef CONFIG_ACPI
>      case PMSTAT_get_max_cx:
>      {
>          op->u.getcx.nr = pmstat_get_cx_nr(op->cpuid);
> @@ -150,6 +151,7 @@ int do_get_pm_info(struct xen_sysctl_get_pmstat *op)
>          ret = pmstat_reset_cx_stat(op->cpuid);
>          break;
>      }
> +#endif /* CONFIG_ACPI */
>  
>      default:
>          printk("not defined sub-hypercall @ do_get_pm_info\n");
> @@ -465,6 +467,7 @@ int do_pm_op(struct xen_sysctl_pm_op *op)
>          break;
>      }
>  
> +#ifdef CONFIG_ACPI
>      case XEN_SYSCTL_pm_op_get_max_cstate:
>      {
>          op->u.get_max_cstate = acpi_get_cstate_limit();
> @@ -476,6 +479,7 @@ int do_pm_op(struct xen_sysctl_pm_op *op)
>          acpi_set_cstate_limit(op->u.set_max_cstate);
>          break;
>      }
> +#endif /* CONFIG_ACPI */
>  
>  #ifdef CONFIG_HAS_CPU_TURBO
>      case XEN_SYSCTL_pm_op_enable_turbo:
> @@ -500,6 +504,7 @@ int do_pm_op(struct xen_sysctl_pm_op *op)
>      return ret;
>  }
>  
> +#ifdef CONFIG_ACPI
>  int acpi_set_pdc_bits(u32 acpi_id, XEN_GUEST_HANDLE_PARAM(uint32) pdc)
>  {
>      u32 bits[3];
> @@ -530,3 +535,4 @@ int acpi_set_pdc_bits(u32 acpi_id, XEN_GUEST_HANDLE_PARAM(uint32) pdc)
>  
>      return ret;
>  }
> +#endif /* CONFIG_ACPI */
> diff --git a/xen/include/xen/pmstat.h b/xen/include/xen/pmstat.h
> index 266bc16..a870c8a 100644
> --- a/xen/include/xen/pmstat.h
> +++ b/xen/include/xen/pmstat.h
> @@ -6,10 +6,12 @@
>  #include <public/sysctl.h>   /* for struct pm_cx_stat */
>  
>  int set_px_pminfo(uint32_t cpu, struct xen_processor_performance *perf);
> +#ifdef CONFIG_ACPI
>  long set_cx_pminfo(uint32_t cpu, struct xen_processor_power *power);
>  uint32_t pmstat_get_cx_nr(uint32_t cpuid);
>  int pmstat_get_cx_stat(uint32_t cpuid, struct pm_cx_stat *stat);
>  int pmstat_reset_cx_stat(uint32_t cpuid);
> +#endif
>  
>  int do_get_pm_info(struct xen_sysctl_get_pmstat *op);
>  int do_pm_op(struct xen_sysctl_pm_op *op);
> -- 
> 2.7.4
> 


^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [RFC PATCH 07/31] xenpm: Clarify xenpm usage
  2017-11-09 17:13   ` Wei Liu
@ 2017-12-02  1:28     ` Stefano Stabellini
  0 siblings, 0 replies; 108+ messages in thread
From: Stefano Stabellini @ 2017-12-02  1:28 UTC (permalink / raw)
  To: Wei Liu
  Cc: Stefano Stabellini, Julien Grall, Ian Jackson,
	Oleksandr Tyshchenko, Oleksandr Tyshchenko, xen-devel

On Thu, 9 Nov 2017, Wei Liu wrote:
> On Thu, Nov 09, 2017 at 07:09:57PM +0200, Oleksandr Tyshchenko wrote:
> > From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> > 
> > CPU frequencies are in kHz. So, correct displayed text.
> > 
> > Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> > CC: Ian Jackson <ian.jackson@eu.citrix.com>
> > CC: Wei Liu <wei.liu2@citrix.com>
> > CC: Stefano Stabellini <sstabellini@kernel.org>
> > CC: Julien Grall <julien.grall@linaro.org>
> > ---
> >  tools/misc/xenpm.c | 6 +++---
> 
> Acked-by: Wei Liu <wei.liu2@citrix.com>

Acked-by: Stefano Stabellini <sstabellini@kernel.org>


^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [RFC PATCH 06/31] cpufreq: make cpufreq driver more generalizable
  2017-11-09 17:09 ` [RFC PATCH 06/31] cpufreq: make cpufreq driver " Oleksandr Tyshchenko
@ 2017-12-02  1:37   ` Stefano Stabellini
  2017-12-04 19:34     ` Oleksandr Tyshchenko
  0 siblings, 1 reply; 108+ messages in thread
From: Stefano Stabellini @ 2017-12-02  1:37 UTC (permalink / raw)
  To: Oleksandr Tyshchenko
  Cc: Stefano Stabellini, Andrew Cooper, Oleksandr Dmytryshyn,
	Julien Grall, Oleksandr Tyshchenko, Jan Beulich, xen-devel

On Thu, 9 Nov 2017, Oleksandr Tyshchenko wrote:
> From: Oleksandr Dmytryshyn <oleksandr.dmytryshyn@globallogic.com>
> 
> First implementation of the cpufreq driver has been
> written with x86 in mind. This patch makes possible
> the cpufreq driver be working on both x86 and arm
> architectures.
> 
> This is a rebased version of the original patch:
> https://lists.xen.org/archives/html/xen-devel/2014-11/msg00932.html
> 
> Signed-off-by: Oleksandr Dmytryshyn <oleksandr.dmytryshyn@globallogic.com>
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> CC: Jan Beulich <jbeulich@suse.com>
> CC: Andrew Cooper <andrew.cooper3@citrix.com>
> CC: Stefano Stabellini <sstabellini@kernel.org>
> CC: Julien Grall <julien.grall@linaro.org>
> ---
>  xen/drivers/cpufreq/cpufreq.c    | 81 +++++++++++++++++++++++++++++++++++++---
>  xen/include/public/platform.h    |  1 +
>  xen/include/xen/processor_perf.h |  6 +++
>  3 files changed, 82 insertions(+), 6 deletions(-)
> 
> diff --git a/xen/drivers/cpufreq/cpufreq.c b/xen/drivers/cpufreq/cpufreq.c
> index ab909e2..64e1ae7 100644
> --- a/xen/drivers/cpufreq/cpufreq.c
> +++ b/xen/drivers/cpufreq/cpufreq.c
> @@ -42,7 +42,6 @@
>  #include <asm/io.h>
>  #include <asm/processor.h>
>  #include <asm/percpu.h>
> -#include <acpi/acpi.h>
>  #include <xen/cpufreq.h>
>  
>  static unsigned int __read_mostly usr_min_freq;
> @@ -206,6 +205,7 @@ int cpufreq_add_cpu(unsigned int cpu)
>      } else {
>          /* domain sanity check under whatever coordination type */
>          firstcpu = cpumask_first(cpufreq_dom->map);
> +#ifdef CONFIG_ACPI
>          if ((perf->domain_info.coord_type !=
>              processor_pminfo[firstcpu]->perf.domain_info.coord_type) ||
>              (perf->domain_info.num_processors !=
> @@ -221,6 +221,19 @@ int cpufreq_add_cpu(unsigned int cpu)
>                  );
>              return -EINVAL;
>          }
> +#else /* !CONFIG_ACPI */
> +        if ((perf->domain_info.num_processors !=
> +            processor_pminfo[firstcpu]->perf.domain_info.num_processors)) {
> +
> +            printk(KERN_WARNING "cpufreq fail to add CPU%d:"
> +                   "incorrect num processors (%"PRIu64"), "
> +                   "expect(%"PRIu64")\n",
> +                   cpu, perf->domain_info.num_processors,
> +                   processor_pminfo[firstcpu]->perf.domain_info.num_processors
> +                );
> +            return -EINVAL;
> +        }
> +#endif /* CONFIG_ACPI */

Why is this necessary? I am asking because I think it would be best to
avoid adding more #ifdef's where we can, and some of the #ifdef'ed code
doesn't look very ACPI-specific (at least at first sight). This change
doesn't look very beneficial. What am I missing?


>      }
>  
>      if (!domexist || hw_all) {
> @@ -380,6 +393,7 @@ int cpufreq_del_cpu(unsigned int cpu)
>      return 0;
>  }
>  
> +#ifdef CONFIG_ACPI
>  static void print_PCT(struct xen_pct_register *ptr)
>  {
>      printk("\t_PCT: descriptor=%d, length=%d, space_id=%d, "
> @@ -387,12 +401,14 @@ static void print_PCT(struct xen_pct_register *ptr)
>             ptr->descriptor, ptr->length, ptr->space_id, ptr->bit_width,
>             ptr->bit_offset, ptr->reserved, ptr->address);
>  }
> +#endif /* CONFIG_ACPI */

Same question as above: why is this #ifdef necessary?


>  static void print_PSS(struct xen_processor_px *ptr, int count)
>  {
>      int i;
>      printk("\t_PSS: state_count=%d\n", count);
>      for (i=0; i<count; i++){
> +#ifdef CONFIG_ACPI
>          printk("\tState%d: %"PRId64"MHz %"PRId64"mW %"PRId64"us "
>                 "%"PRId64"us %#"PRIx64" %#"PRIx64"\n",
>                 i,
> @@ -402,15 +418,26 @@ static void print_PSS(struct xen_processor_px *ptr, int count)
>                 ptr[i].bus_master_latency,
>                 ptr[i].control,
>                 ptr[i].status);
> +#else /* !CONFIG_ACPI */
> +        printk("\tState%d: %"PRId64"MHz %"PRId64"us\n",
> +               i,
> +               ptr[i].core_frequency,
> +               ptr[i].transition_latency);
> +#endif /* CONFIG_ACPI */
>      }
>  }
  
Same question as above: why is this #ifdef necessary?


>  static void print_PSD( struct xen_psd_package *ptr)
>  {
> +#ifdef CONFIG_ACPI
>      printk("\t_PSD: num_entries=%"PRId64" rev=%"PRId64
>             " domain=%"PRId64" coord_type=%"PRId64" num_processors=%"PRId64"\n",
>             ptr->num_entries, ptr->revision, ptr->domain, ptr->coord_type,
>             ptr->num_processors);
> +#else /* !CONFIG_ACPI */
> +    printk("\t_PSD:  domain=%"PRId64" num_processors=%"PRId64"\n",
> +           ptr->domain, ptr->num_processors);
> +#endif /* CONFIG_ACPI */
>  }

Same question as above: why is this #ifdef necessary?


>  static void print_PPC(unsigned int platform_limit)
> @@ -418,13 +445,53 @@ static void print_PPC(unsigned int platform_limit)
>      printk("\t_PPC: %d\n", platform_limit);
>  }
>  
> +static inline bool is_pss_data(struct xen_processor_performance *px)
> +{
> +#ifdef CONFIG_ACPI
> +    return px->flags & XEN_PX_PSS;
> +#else
> +    return px->flags == XEN_PX_DATA;
> +#endif
> +}
> +
> +static inline bool is_psd_data(struct xen_processor_performance *px)
> +{
> +#ifdef CONFIG_ACPI
> +    return px->flags & XEN_PX_PSD;
> +#else
> +    return px->flags == XEN_PX_DATA;
> +#endif
> +}
> +
> +static inline bool is_ppc_data(struct xen_processor_performance *px)
> +{
> +#ifdef CONFIG_ACPI
> +    return px->flags & XEN_PX_PPC;
> +#else
> +    return px->flags == XEN_PX_DATA;
> +#endif
> +}
> +
> +static inline bool is_all_data(struct xen_processor_performance *px)
> +{
> +#ifdef CONFIG_ACPI
> +    return px->flags == ( XEN_PX_PCT | XEN_PX_PSS | XEN_PX_PSD | XEN_PX_PPC );
> +#else
> +    return px->flags == XEN_PX_DATA;
> +#endif
> +}

Could you please explain here and in the commit message the idea behind
this? It looks like we want to get rid of the different flags on
non-ACPI systems? Why can't we reuse the same flags?


>  int set_px_pminfo(uint32_t acpi_id, struct xen_processor_performance *dom0_px_info)
>  {
>      int ret=0, cpuid;
>      struct processor_pminfo *pmpt;
>      struct processor_performance *pxpt;
>  
> +#ifdef CONFIG_ACPI
>      cpuid = get_cpu_id(acpi_id);
> +#else
> +    cpuid = acpi_id;
> +#endif

Rather than an #ifdef here, I would probably generalize the get_cpu_id
function.
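
A minimal sketch of what such a generalization might look like, assuming
the non-ACPI case can treat the ID passed by Dom0 as the logical CPU index
(which is what the #else branch above does). NR_CPUS, the acpi_id_of_cpu
table and its contents are stand-ins invented for illustration, not the
real Xen definitions:

```c
#include <stdint.h>

#define NR_CPUS 8   /* illustrative value, not Xen's */

#ifdef CONFIG_ACPI
/* Hypothetical stand-in for the ACPI ID -> logical CPU mapping. */
static const uint32_t acpi_id_of_cpu[NR_CPUS] = {
    0x10, 0x11, 0x12, 0x13, 0x20, 0x21, 0x22, 0x23
};
#endif

/*
 * Translate the ID passed by Dom0 into a logical CPU index.
 * Returns -1 if the ID does not name a known CPU.
 */
static int get_cpu_id(uint32_t id)
{
#ifdef CONFIG_ACPI
    /* ACPI: search the mapping table for a matching ACPI processor ID. */
    for ( int i = 0; i < NR_CPUS; i++ )
        if ( acpi_id_of_cpu[i] == id )
            return i;
    return -1;
#else
    /* No ACPI: the caller already passes the logical CPU index. */
    return id < NR_CPUS ? (int)id : -1;
#endif
}
```

With something along these lines the #ifdef would live in one helper, and
set_px_pminfo() could call get_cpu_id() unconditionally and keep its
existing "cpuid < 0" check.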


>      if ( cpuid < 0 || !dom0_px_info)
>      {
>          ret = -EINVAL;
> @@ -446,6 +513,8 @@ int set_px_pminfo(uint32_t acpi_id, struct xen_processor_performance *dom0_px_in
>          processor_pminfo[cpuid] = pmpt;
>      }
>      pxpt = &pmpt->perf;
> +
> +#ifdef CONFIG_ACPI
>      pmpt->acpi_id = acpi_id;
>      pmpt->id = cpuid;
>  
> @@ -472,8 +541,9 @@ int set_px_pminfo(uint32_t acpi_id, struct xen_processor_performance *dom0_px_in
>              print_PCT(&pxpt->status_register);
>          }
>      }
> +#endif /* CONFIG_ACPI */
>  
> -    if ( dom0_px_info->flags & XEN_PX_PSS ) 
> +    if ( is_pss_data(dom0_px_info) )
>      {
>          /* capability check */
>          if (dom0_px_info->state_count <= 1)
> @@ -500,7 +570,7 @@ int set_px_pminfo(uint32_t acpi_id, struct xen_processor_performance *dom0_px_in
>              print_PSS(pxpt->states,pxpt->state_count);
>      }
>  
> -    if ( dom0_px_info->flags & XEN_PX_PSD )
> +    if ( is_psd_data(dom0_px_info) )
>      {
>          /* check domain coordination */
>          if (dom0_px_info->shared_type != CPUFREQ_SHARED_TYPE_ALL &&
> @@ -520,7 +590,7 @@ int set_px_pminfo(uint32_t acpi_id, struct xen_processor_performance *dom0_px_in
>              print_PSD(&pxpt->domain_info);
>      }
>  
> -    if ( dom0_px_info->flags & XEN_PX_PPC )
> +    if ( is_ppc_data(dom0_px_info) )
>      {
>          pxpt->platform_limit = dom0_px_info->platform_limit;
>  
> @@ -534,8 +604,7 @@ int set_px_pminfo(uint32_t acpi_id, struct xen_processor_performance *dom0_px_in
>          }
>      }
>  
> -    if ( dom0_px_info->flags == ( XEN_PX_PCT | XEN_PX_PSS |
> -                XEN_PX_PSD | XEN_PX_PPC ) )
> +    if ( is_all_data(dom0_px_info) )
>      {
>          pxpt->init = XEN_PX_INIT;
>  
> diff --git a/xen/include/public/platform.h b/xen/include/public/platform.h
> index 94dbc3f..328579c 100644
> --- a/xen/include/public/platform.h
> +++ b/xen/include/public/platform.h
> @@ -384,6 +384,7 @@ DEFINE_XEN_GUEST_HANDLE(xenpf_getidletime_t);
>  #define XEN_PX_PSS   2
>  #define XEN_PX_PPC   4
>  #define XEN_PX_PSD   8
> +#define XEN_PX_DATA  16
>  
>  struct xen_power_register {
>      uint32_t     space_id;
> diff --git a/xen/include/xen/processor_perf.h b/xen/include/xen/processor_perf.h
> index d8a1ba6..afdccf2 100644
> --- a/xen/include/xen/processor_perf.h
> +++ b/xen/include/xen/processor_perf.h
> @@ -3,7 +3,9 @@
>  
>  #include <public/platform.h>
>  #include <public/sysctl.h>
> +#ifdef CONFIG_ACPI
>  #include <xen/acpi.h>
> +#endif
>  
>  #define XEN_PX_INIT 0x80000000
>  
> @@ -24,8 +26,10 @@ int  cpufreq_del_cpu(unsigned int);
>  struct processor_performance {
>      uint32_t state;
>      uint32_t platform_limit;
> +#ifdef CONFIG_ACPI
>      struct xen_pct_register control_register;
>      struct xen_pct_register status_register;
> +#endif
>      uint32_t state_count;
>      struct xen_processor_px *states;
>      struct xen_psd_package domain_info;
> @@ -35,8 +39,10 @@ struct processor_performance {
>  };
>  
>  struct processor_pminfo {
> +#ifdef CONFIG_ACPI
>      uint32_t acpi_id;
>      uint32_t id;
> +#endif
>      struct processor_performance    perf;
>  };
>  
> -- 
> 2.7.4
> 


^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [RFC PATCH 04/31] cpufreq: make turbo settings to be configurable
  2017-12-02  1:06   ` Stefano Stabellini
@ 2017-12-02 17:25     ` Oleksandr Tyshchenko
  2017-12-04 11:58       ` Andre Przywara
  2017-12-04 22:18       ` Stefano Stabellini
  0 siblings, 2 replies; 108+ messages in thread
From: Oleksandr Tyshchenko @ 2017-12-02 17:25 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Andrew Cooper, Oleksandr Dmytryshyn, Julien Grall,
	Oleksandr Tyshchenko, Jan Beulich, xen-devel

Hi Stefano

On Sat, Dec 2, 2017 at 3:06 AM, Stefano Stabellini
<sstabellini@kernel.org> wrote:
> On Thu, 9 Nov 2017, Oleksandr Tyshchenko wrote:
>> From: Oleksandr Dmytryshyn <oleksandr.dmytryshyn@globallogic.com>
>>
>> This settings is not needed for some architectures.
>> So make it to be configurable and use it for x86
>> architecture.
>>
>> This is a rebased version of the original patch:
>> https://lists.xen.org/archives/html/xen-devel/2014-11/msg00942.html
>>
>> Signed-off-by: Oleksandr Dmytryshyn <oleksandr.dmytryshyn@globallogic.com>
>> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>> CC: Jan Beulich <jbeulich@suse.com>
>> CC: Andrew Cooper <andrew.cooper3@citrix.com>
>> CC: Stefano Stabellini <sstabellini@kernel.org>
>> CC: Julien Grall <julien.grall@linaro.org>
>> ---
>>  xen/arch/x86/Kconfig          |  1 +
>>  xen/drivers/cpufreq/Kconfig   |  3 +++
>>  xen/drivers/cpufreq/utility.c | 11 ++++++++++-
>>  xen/drivers/pm/stat.c         |  6 ++++++
>>  xen/include/xen/cpufreq.h     |  6 ++++++
>>  5 files changed, 26 insertions(+), 1 deletion(-)
>>
>> diff --git a/xen/arch/x86/Kconfig b/xen/arch/x86/Kconfig
>> index 86c8eca..c1eac1d 100644
>> --- a/xen/arch/x86/Kconfig
>> +++ b/xen/arch/x86/Kconfig
>> @@ -24,6 +24,7 @@ config X86
>>       select NUMA
>>       select VGA
>>       select HAS_PM
>> +     select HAS_CPU_TURBO
>>
>>  config ARCH_DEFCONFIG
>>       string
>> diff --git a/xen/drivers/cpufreq/Kconfig b/xen/drivers/cpufreq/Kconfig
>> index cce80f4..427ea2a 100644
>> --- a/xen/drivers/cpufreq/Kconfig
>> +++ b/xen/drivers/cpufreq/Kconfig
>> @@ -1,3 +1,6 @@
>>
>>  config HAS_CPUFREQ
>>       bool
>> +
>> +config HAS_CPU_TURBO
>> +     bool
>> diff --git a/xen/drivers/cpufreq/utility.c b/xen/drivers/cpufreq/utility.c
>> index a687e5a..25bf983 100644
>> --- a/xen/drivers/cpufreq/utility.c
>> +++ b/xen/drivers/cpufreq/utility.c
>> @@ -209,7 +209,9 @@ int cpufreq_frequency_table_cpuinfo(struct cpufreq_policy *policy,
>>  {
>>      unsigned int min_freq = ~0;
>>      unsigned int max_freq = 0;
>> +#ifdef CONFIG_HAS_CPU_TURBO
>>      unsigned int second_max_freq = 0;
>> +#endif
>>      unsigned int i;
>>
>>      for (i=0; (table[i].frequency != CPUFREQ_TABLE_END); i++) {
>> @@ -221,6 +223,7 @@ int cpufreq_frequency_table_cpuinfo(struct cpufreq_policy *policy,
>>          if (freq > max_freq)
>>              max_freq = freq;
>>      }
>> +#ifdef CONFIG_HAS_CPU_TURBO
>>      for (i=0; (table[i].frequency != CPUFREQ_TABLE_END); i++) {
>>          unsigned int freq = table[i].frequency;
>>          if (freq == CPUFREQ_ENTRY_INVALID || freq == max_freq)
>> @@ -234,9 +237,13 @@ int cpufreq_frequency_table_cpuinfo(struct cpufreq_policy *policy,
>>          printk("max_freq: %u    second_max_freq: %u\n",
>>                 max_freq, second_max_freq);
>>
>> +    policy->cpuinfo.second_max_freq = second_max_freq;
>> +#else /* !CONFIG_HAS_CPU_TURBO */
>> +    if (cpufreq_verbose)
>> +        printk("max_freq: %u\n", max_freq);
>> +#endif /* CONFIG_HAS_CPU_TURBO */
>>      policy->min = policy->cpuinfo.min_freq = min_freq;
>>      policy->max = policy->cpuinfo.max_freq = max_freq;
>> -    policy->cpuinfo.second_max_freq = second_max_freq;
>>
>>      if (policy->min == ~0)
>>          return -EINVAL;
>> @@ -390,6 +397,7 @@ int cpufreq_driver_getavg(unsigned int cpu, unsigned int flag)
>>      return policy->cur;
>>  }
>>
>> +#ifdef CONFIG_HAS_CPU_TURBO
>>  int cpufreq_update_turbo(int cpuid, int new_state)
>>  {
>>      struct cpufreq_policy *policy;
>> @@ -430,6 +438,7 @@ int cpufreq_get_turbo_status(int cpuid)
>>      policy = per_cpu(cpufreq_cpu_policy, cpuid);
>>      return policy && policy->turbo == CPUFREQ_TURBO_ENABLED;
>>  }
>> +#endif /* CONFIG_HAS_CPU_TURBO */
>>
>>  /*********************************************************************
>>   *                 POLICY                                            *
>
> I am wondering if we need to go as far as #ifdef'ing
> cpufreq_update_turbo. For the sake of reducing the number of #ifdef's,
> would it be enough if we only make sure it is disabled?
>
> In other words, I would keep the changes to stat.c but I would leave
> utility.c and cpufreq.h pretty much untouched.

Yes. I was actually thinking about dropping this patch altogether. If a
platform doesn't support CPU boost, the platform driver should just
inform the framework about that (policy->turbo =
CPUFREQ_TURBO_UNSUPPORTED).
That's all.

cpufreq_update_turbo() will return -EOPNOTSUPP if someone tries to
enable/disable turbo mode.
cpufreq_get_turbo_status() will return that turbo mode "is not enabled".
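
As a rough standalone illustration of that behaviour (a sketch only:
struct policy_sketch and the *_sketch functions are simplified stand-ins
made up for this mail, and the -EACCES/-EINVAL paths are my assumption of
how the remaining cases are handled):

```c
#include <errno.h>
#include <stddef.h>

#define CPUFREQ_TURBO_DISABLED     (-1)
#define CPUFREQ_TURBO_UNSUPPORTED  0
#define CPUFREQ_TURBO_ENABLED      1

/* Reduced stand-in for struct cpufreq_policy: only the turbo tristate. */
struct policy_sketch {
    signed char turbo;
};

/* Models the outcome of an enable/disable request for one policy. */
static int update_turbo_sketch(struct policy_sketch *policy, int new_state)
{
    if ( policy == NULL )
        return -EACCES;       /* assumed error code for "no policy" */

    if ( policy->turbo == CPUFREQ_TURBO_UNSUPPORTED )
        return -EOPNOTSUPP;   /* platform driver declared "no turbo" */

    if ( new_state != CPUFREQ_TURBO_ENABLED &&
         new_state != CPUFREQ_TURBO_DISABLED )
        return -EINVAL;       /* assumed handling of a bogus state */

    policy->turbo = (signed char)new_state;
    return 0;
}

/* Reports 1 only when turbo is both supported and currently enabled. */
static int get_turbo_status_sketch(const struct policy_sketch *policy)
{
    return policy && policy->turbo == CPUFREQ_TURBO_ENABLED;
}
```

So a driver that leaves policy->turbo at CPUFREQ_TURBO_UNSUPPORTED gets
both behaviours without any #ifdef in the callers.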

Another question is second_max_freq. As I understand it, this is the
highest non-turbo frequency, calculated by the framework to limit the
target frequency when turbo mode is disabled. Xen assumes that
second_max_freq is always P1 if turbo mode is on.
But there might be a case where several of the highest frequencies are
turbo frequencies, so I propose adding an extra flag to handle that.
It would then be each CPUFreq driver's responsibility to mark the
turbo frequency(ies) so that the framework can calculate
second_max_freq properly.

Something like this:

diff --git a/xen/drivers/cpufreq/utility.c b/xen/drivers/cpufreq/utility.c
index 25bf983..122a88b 100644
--- a/xen/drivers/cpufreq/utility.c
+++ b/xen/drivers/cpufreq/utility.c
@@ -226,7 +226,8 @@ int cpufreq_frequency_table_cpuinfo(struct cpufreq_policy *policy,
 #ifdef CONFIG_HAS_CPU_TURBO
     for (i=0; (table[i].frequency != CPUFREQ_TABLE_END); i++) {
         unsigned int freq = table[i].frequency;
-        if (freq == CPUFREQ_ENTRY_INVALID || freq == max_freq)
+        if ((freq == CPUFREQ_ENTRY_INVALID) ||
+            (table[i].flags & CPUFREQ_BOOST_FREQ))
             continue;
         if (freq > second_max_freq)
             second_max_freq = freq;
diff --git a/xen/include/xen/cpufreq.h b/xen/include/xen/cpufreq.h
index 2e0c16a..77b29da 100644
--- a/xen/include/xen/cpufreq.h
+++ b/xen/include/xen/cpufreq.h
@@ -204,7 +204,11 @@ void cpufreq_verify_within_limits(struct cpufreq_policy *policy,
 #define CPUFREQ_ENTRY_INVALID ~0
 #define CPUFREQ_TABLE_END     ~1

+/* Special Values of .flags field */
+#define CPUFREQ_BOOST_FREQ    (1 << 0)
+
 struct cpufreq_frequency_table {
+       unsigned int    flags;
     unsigned int    index;     /* any */
     unsigned int    frequency; /* kHz - doesn't need to be in ascending
                                 * order */
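
To show the intended effect of the diff above outside of Xen, here is a
small standalone sketch. struct freq_entry and second_max_freq_sketch()
are simplified stand-ins for illustration (not the real
cpufreq_frequency_table or framework code); only the flag name and the
two special frequency values are taken from the diff:

```c
#define CPUFREQ_ENTRY_INVALID (~0u)
#define CPUFREQ_TABLE_END     (~1u)
#define CPUFREQ_BOOST_FREQ    (1u << 0)

/* Reduced stand-in for struct cpufreq_frequency_table. */
struct freq_entry {
    unsigned int flags;
    unsigned int frequency;   /* kHz, in no particular order */
};

/*
 * Return the highest frequency that is valid and not marked as a
 * boost (turbo) frequency, or 0 if the table has no such entry.
 */
static unsigned int second_max_freq_sketch(const struct freq_entry *table)
{
    unsigned int second_max = 0;

    for ( unsigned int i = 0; table[i].frequency != CPUFREQ_TABLE_END; i++ )
    {
        unsigned int freq = table[i].frequency;

        /* Skip holes in the table and every boost-flagged entry. */
        if ( freq == CPUFREQ_ENTRY_INVALID ||
             (table[i].flags & CPUFREQ_BOOST_FREQ) )
            continue;

        if ( freq > second_max )
            second_max = freq;
    }

    return second_max;
}
```

With two boost-flagged entries at the top of the table, the result is the
third-highest frequency, which the current "freq == max_freq" check cannot
produce.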

Both existing x86 CPUFreq drivers would just need to mark the P0 frequency
as a turbo frequency if turbo mode is supported. Am I correct?

And the most important question is how Xen on ARM (using the SCPI
protocol) can actually recognize which frequencies are turbo frequencies.
I couldn't find any information about that in the protocol description.
For DT-based CPUFreq it is not an issue, since there is a specific
property, "turbo-mode", to mark the corresponding OPPs [1].
But neither the SCPI DT bindings [2] nor the SCPI protocol itself [3]
mentions it. Perhaps an additional command should be added to pass
such info.

[1] https://www.kernel.org/doc/Documentation/devicetree/bindings/opp/opp.txt
[2] http://elixir.free-electrons.com/linux/v4.15-rc1/source/Documentation/devicetree/bindings/arm/arm,scpi.txt
[3] http://infocenter.arm.com/help/topic/com.arm.doc.dui0922g/scp_message_interface_v1_2_DUI0922G_en.pdf

>
>
>> diff --git a/xen/drivers/pm/stat.c b/xen/drivers/pm/stat.c
>> index 2dbde1c..133e64d 100644
>> --- a/xen/drivers/pm/stat.c
>> +++ b/xen/drivers/pm/stat.c
>> @@ -290,7 +290,11 @@ static int get_cpufreq_para(struct xen_sysctl_pm_op *op)
>>              &op->u.get_para.u.ondemand.sampling_rate,
>>              &op->u.get_para.u.ondemand.up_threshold);
>>      }
>> +#ifdef CONFIG_HAS_CPU_TURBO
>>      op->u.get_para.turbo_enabled = cpufreq_get_turbo_status(op->cpuid);
>> +#else
>> +    op->u.get_para.turbo_enabled = 0;
>> +#endif
>>
>>      return ret;
>>  }
>> @@ -473,6 +477,7 @@ int do_pm_op(struct xen_sysctl_pm_op *op)
>>          break;
>>      }
>>
>> +#ifdef CONFIG_HAS_CPU_TURBO
>>      case XEN_SYSCTL_pm_op_enable_turbo:
>>      {
>>          ret = cpufreq_update_turbo(op->cpuid, CPUFREQ_TURBO_ENABLED);
>> @@ -484,6 +489,7 @@ int do_pm_op(struct xen_sysctl_pm_op *op)
>>          ret = cpufreq_update_turbo(op->cpuid, CPUFREQ_TURBO_DISABLED);
>>          break;
>>      }
>> +#endif /* CONFIG_HAS_CPU_TURBO */
>>
>>      default:
>>          printk("not defined sub-hypercall @ do_pm_op\n");
>> diff --git a/xen/include/xen/cpufreq.h b/xen/include/xen/cpufreq.h
>> index 30c70c9..2e0c16a 100644
>> --- a/xen/include/xen/cpufreq.h
>> +++ b/xen/include/xen/cpufreq.h
>> @@ -39,7 +39,9 @@ extern struct acpi_cpufreq_data *cpufreq_drv_data[NR_CPUS];
>>
>>  struct cpufreq_cpuinfo {
>>      unsigned int        max_freq;
>> +#ifdef CONFIG_HAS_CPU_TURBO
>>      unsigned int        second_max_freq;    /* P1 if Turbo Mode is on */
>> +#endif
>>      unsigned int        min_freq;
>>      unsigned int        transition_latency; /* in 10^(-9) s = nanoseconds */
>>  };
>> @@ -72,9 +74,11 @@ struct cpufreq_policy {
>>
>>      bool_t              resume; /* flag for cpufreq 1st run
>>                                   * S3 wakeup, hotplug cpu, etc */
>> +#ifdef CONFIG_HAS_CPU_TURBO
>>      s8                  turbo;  /* tristate flag: 0 for unsupported
>>                                   * -1 for disable, 1 for enabled
>>                                   * See CPUFREQ_TURBO_* below for defines */
>> +#endif
>>      bool_t              aperf_mperf; /* CPU has APERF/MPERF MSRs */
>>  };
>>  DECLARE_PER_CPU(struct cpufreq_policy *, cpufreq_cpu_policy);
>> @@ -138,8 +142,10 @@ extern int cpufreq_driver_getavg(unsigned int cpu, unsigned int flag);
>>  #define CPUFREQ_TURBO_UNSUPPORTED   0
>>  #define CPUFREQ_TURBO_ENABLED       1
>>
>> +#ifdef CONFIG_HAS_CPU_TURBO
>>  extern int cpufreq_update_turbo(int cpuid, int new_state);
>>  extern int cpufreq_get_turbo_status(int cpuid);
>> +#endif
>>
>>  static __inline__ int
>>  __cpufreq_governor(struct cpufreq_policy *policy, unsigned int event)



-- 
Regards,

Oleksandr Tyshchenko


^ permalink raw reply related	[flat|nested] 108+ messages in thread

* Re: [RFC PATCH 04/31] cpufreq: make turbo settings to be configurable
  2017-12-02 17:25     ` Oleksandr Tyshchenko
@ 2017-12-04 11:58       ` Andre Przywara
  2017-12-05 15:23         ` Oleksandr Tyshchenko
  2017-12-04 22:18       ` Stefano Stabellini
  1 sibling, 1 reply; 108+ messages in thread
From: Andre Przywara @ 2017-12-04 11:58 UTC (permalink / raw)
  To: Oleksandr Tyshchenko, Stefano Stabellini
  Cc: Andrew Cooper, Oleksandr Dmytryshyn, Julien Grall,
	Oleksandr Tyshchenko, Jan Beulich, xen-devel

Hi,

....

> And the most important question is how to recognize in Xen on ARM
> (using SCPI protocol) which frequencies are turbo-frequencies
> actually? I couldn't find any information regarding that in protocol
> description.

So traditionally on ARM there is no notion of a "turbo" frequency. The
idea is to expose the highest possible frequency and let thermal
throttling (possibly in hardware or in firmware) limit the frequency if
the thermal budget is exceeded.
Also, in the ARM world it is expected that an OS has much better
knowledge of how to handle frequencies, for instance when to give more
power to the GPU and when to the CPU.

> For DT-based CPUFreq it is not an issue, since there is a specific
> property "turbo-mode" to mark corresponding OPPs. [1].
> But neither SCPI DT bindings [2] nor the SCPI protocol itself [3]
> mentions about it. Perhaps, additional command should be added to pass
> such info.

The DT binding you mentioned in Linux is a generic one.
In general DT only describes non-discoverable properties. But for SCPI
the OPPs are handled in the SCP and advertised via SCPI calls (3.2.9 Get
DVFS Info, command 0x9).
So the OPP table is not in the DT, and thus you don't have any way of
detecting turbo frequencies.
But as mentioned before, this is so by design, as ARM does not endorse
the concept of turbo frequencies in general.

Now with the advent of more "server-y" chips and ACPI, this might change
in the future. For instance SCMI is designed to be closer to ACPI, so we
might inherit some turbo notion from there.

So we should not completely rule out the idea of turbo, but for a start
we can somewhat assume that an ARM based system does not have turbo per se.

Cheers,
Andre.

> [1] https://www.kernel.org/doc/Documentation/devicetree/bindings/opp/opp.txt
> [2] http://elixir.free-electrons.com/linux/v4.15-rc1/source/Documentation/devicetree/bindings/arm/arm,scpi.txt
> [3] http://infocenter.arm.com/help/topic/com.arm.doc.dui0922g/scp_message_interface_v1_2_DUI0922G_en.pdf
> 
>>
>>
>>> diff --git a/xen/drivers/pm/stat.c b/xen/drivers/pm/stat.c
>>> index 2dbde1c..133e64d 100644
>>> --- a/xen/drivers/pm/stat.c
>>> +++ b/xen/drivers/pm/stat.c
>>> @@ -290,7 +290,11 @@ static int get_cpufreq_para(struct xen_sysctl_pm_op *op)
>>>              &op->u.get_para.u.ondemand.sampling_rate,
>>>              &op->u.get_para.u.ondemand.up_threshold);
>>>      }
>>> +#ifdef CONFIG_HAS_CPU_TURBO
>>>      op->u.get_para.turbo_enabled = cpufreq_get_turbo_status(op->cpuid);
>>> +#else
>>> +    op->u.get_para.turbo_enabled = 0;
>>> +#endif
>>>
>>>      return ret;
>>>  }
>>> @@ -473,6 +477,7 @@ int do_pm_op(struct xen_sysctl_pm_op *op)
>>>          break;
>>>      }
>>>
>>> +#ifdef CONFIG_HAS_CPU_TURBO
>>>      case XEN_SYSCTL_pm_op_enable_turbo:
>>>      {
>>>          ret = cpufreq_update_turbo(op->cpuid, CPUFREQ_TURBO_ENABLED);
>>> @@ -484,6 +489,7 @@ int do_pm_op(struct xen_sysctl_pm_op *op)
>>>          ret = cpufreq_update_turbo(op->cpuid, CPUFREQ_TURBO_DISABLED);
>>>          break;
>>>      }
>>> +#endif /* CONFIG_HAS_CPU_TURBO */
>>>
>>>      default:
>>>          printk("not defined sub-hypercall @ do_pm_op\n");
>>> diff --git a/xen/include/xen/cpufreq.h b/xen/include/xen/cpufreq.h
>>> index 30c70c9..2e0c16a 100644
>>> --- a/xen/include/xen/cpufreq.h
>>> +++ b/xen/include/xen/cpufreq.h
>>> @@ -39,7 +39,9 @@ extern struct acpi_cpufreq_data *cpufreq_drv_data[NR_CPUS];
>>>
>>>  struct cpufreq_cpuinfo {
>>>      unsigned int        max_freq;
>>> +#ifdef CONFIG_HAS_CPU_TURBO
>>>      unsigned int        second_max_freq;    /* P1 if Turbo Mode is on */
>>> +#endif
>>>      unsigned int        min_freq;
>>>      unsigned int        transition_latency; /* in 10^(-9) s = nanoseconds */
>>>  };
>>> @@ -72,9 +74,11 @@ struct cpufreq_policy {
>>>
>>>      bool_t              resume; /* flag for cpufreq 1st run
>>>                                   * S3 wakeup, hotplug cpu, etc */
>>> +#ifdef CONFIG_HAS_CPU_TURBO
>>>      s8                  turbo;  /* tristate flag: 0 for unsupported
>>>                                   * -1 for disable, 1 for enabled
>>>                                   * See CPUFREQ_TURBO_* below for defines */
>>> +#endif
>>>      bool_t              aperf_mperf; /* CPU has APERF/MPERF MSRs */
>>>  };
>>>  DECLARE_PER_CPU(struct cpufreq_policy *, cpufreq_cpu_policy);
>>> @@ -138,8 +142,10 @@ extern int cpufreq_driver_getavg(unsigned int cpu, unsigned int flag);
>>>  #define CPUFREQ_TURBO_UNSUPPORTED   0
>>>  #define CPUFREQ_TURBO_ENABLED       1
>>>
>>> +#ifdef CONFIG_HAS_CPU_TURBO
>>>  extern int cpufreq_update_turbo(int cpuid, int new_state);
>>>  extern int cpufreq_get_turbo_status(int cpuid);
>>> +#endif
>>>
>>>  static __inline__ int
>>>  __cpufreq_governor(struct cpufreq_policy *policy, unsigned int event)
> 
> 
> 


^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [RFC PATCH 05/31] pmstat: make pmstat functions more generalizable
  2017-12-02  1:21   ` Stefano Stabellini
@ 2017-12-04 16:21     ` Oleksandr Tyshchenko
  2017-12-04 22:30       ` Stefano Stabellini
  0 siblings, 1 reply; 108+ messages in thread
From: Oleksandr Tyshchenko @ 2017-12-04 16:21 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Andrew Cooper, Oleksandr Dmytryshyn, Julien Grall,
	Oleksandr Tyshchenko, Jan Beulich, xen-devel

Hi Stefano

On Sat, Dec 2, 2017 at 3:21 AM, Stefano Stabellini
<sstabellini@kernel.org> wrote:
> On Thu, 9 Nov 2017, Oleksandr Tyshchenko wrote:
>> From: Oleksandr Dmytryshyn <oleksandr.dmytryshyn@globallogic.com>
>>
>> ACPI-specific parts are moved under appropriate ifdefs.
>> Now pmstat functions can be used in ARM platform.
>>
>> This is a rebased version of the original patch:
>> https://lists.xen.org/archives/html/xen-devel/2014-11/msg00941.html
>
> My first maybe naive question is: why do we want to disable the C-states
> and not the P-states? After all, they are both defined in ACPI?

Good question. The Xen CPUFreq infrastructure is based on ACPI P-states. We
have to either
completely rework the generic code and existing drivers or integrate into the
"current environment" (so the CPUFreq driver
this patch series adds pretends that it understands what
P-states are). The second option requires much less
development and upstreaming effort (I hope). BTW, with the current
solution you don't have to modify the public sysctl interface or xenpm.
And looking through all the previous discussions [1], I got the feeling that
the original author of this patch had a similar opinion.

[1]
/* RFC v0 */
https://lists.xen.org/archives/html/xen-devel/2014-08/msg02919.html
/* RFC v1 */
https://lists.xenproject.org/archives/html/xen-devel/2014-10/msg00787.html
/* RFC v2 */
https://lists.xenproject.org/archives/html/xen-devel/2014-10/msg01879.html
/* RFC v3 */
https://marc.info/?l=xen-devel&m=141407701110860&w=2
/* RFC v4 */
https://marc.info/?l=xen-devel&m=141510663108037&w=2
/* RFC v5 */
https://lists.xen.org/archives/html/xen-devel/2014-11/msg00940.html

>
> The second question is: instead of #ifdef'ing everything C-states,
> couldn't we just rely on XEN_PROCESSOR_PM_CX not being available?

I am afraid that relying on XEN_PROCESSOR_PM_CX not being available is
not enough.
A few of the functions that the original author of the patch #ifdef'd
are located under arch/x86.
So, I think, the point was to make pm/stat.c compile on ARM.

But I completely agree that the scope of the #ifdefs can be reduced.

1. For the following functions we will be able to omit #ifdef CONFIG_ACPI if we
create corresponding stubs.
- pmstat_get_cx_nr()
- pmstat_get_cx_stat()
- pmstat_reset_cx_stat()
They will never be called if XEN_PROCESSOR_PM_CX is not set.

2. For the following functions we can probably omit #ifdef CONFIG_ACPI, since
the corresponding stubs are already present (see !CONFIG_ACPI_CSTATE in
acpi.h)
- acpi_get_cstate_limit()
- acpi_set_cstate_limit()

But acpi_set_pdc_bits() I would leave under #ifdef CONFIG_ACPI
(CONFIG_X86 ?) or move it to arch/x86.
It is called from arch/x86/platform_hypercall.c and pulls a bunch of
#define-s from pdc_intel.h

Something like that:

diff --git a/xen/drivers/pm/stat.c b/xen/drivers/pm/stat.c
index 133e64d..353d0ab 100644
--- a/xen/drivers/pm/stat.c
+++ b/xen/drivers/pm/stat.c
@@ -500,6 +500,7 @@ int do_pm_op(struct xen_sysctl_pm_op *op)
     return ret;
 }

+#ifdef CONFIG_ACPI /* or CONFIG_X86 ? */
 int acpi_set_pdc_bits(u32 acpi_id, XEN_GUEST_HANDLE_PARAM(uint32) pdc)
 {
     u32 bits[3];
@@ -530,3 +531,4 @@ int acpi_set_pdc_bits(u32 acpi_id, XEN_GUEST_HANDLE_PARAM(uint32) pdc)

     return ret;
 }
+#endif /* CONFIG_ACPI */
diff --git a/xen/include/xen/pmstat.h b/xen/include/xen/pmstat.h
index 266bc16..05d6b7b 100644
--- a/xen/include/xen/pmstat.h
+++ b/xen/include/xen/pmstat.h
@@ -6,10 +6,17 @@
 #include <public/sysctl.h>   /* for struct pm_cx_stat */

 int set_px_pminfo(uint32_t cpu, struct xen_processor_performance *perf);
+#ifdef CONFIG_ACPI /* or CONFIG_X86 ? */
 long set_cx_pminfo(uint32_t cpu, struct xen_processor_power *power);
 uint32_t pmstat_get_cx_nr(uint32_t cpuid);
 int pmstat_get_cx_stat(uint32_t cpuid, struct pm_cx_stat *stat);
 int pmstat_reset_cx_stat(uint32_t cpuid);
+#else
+static inline long set_cx_pminfo(uint32_t cpu, struct xen_processor_power *power) { return 0; }
+static inline uint32_t pmstat_get_cx_nr(uint32_t cpuid) { return 0; }
+static inline int pmstat_get_cx_stat(uint32_t cpuid, struct pm_cx_stat *stat) { return 0; }
+static inline int pmstat_reset_cx_stat(uint32_t cpuid) { return 0; }
+#endif

 int do_get_pm_info(struct xen_sysctl_get_pmstat *op);
 int do_pm_op(struct xen_sysctl_pm_op *op);

What do you think?
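
To show the stub idea in isolation, here is a minimal standalone sketch
(not Xen code). It assumes CONFIG_ACPI is not defined, uses a cut-down
stand-in for struct pm_cx_stat, and the demo_* names and sub-op numbers are
hypothetical. The point is only that the caller's switch needs no #ifdef
once the header provides stubs:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Cut-down stand-in; the real struct pm_cx_stat lives in public/sysctl.h. */
struct pm_cx_stat {
    uint32_t nr;
    uint32_t last;
};

#ifdef CONFIG_ACPI
/* Real implementations are provided elsewhere when ACPI is available. */
uint32_t pmstat_get_cx_nr(uint32_t cpuid);
int pmstat_get_cx_stat(uint32_t cpuid, struct pm_cx_stat *stat);
int pmstat_reset_cx_stat(uint32_t cpuid);
#else
/* Stubs: report "no C-states" and succeed, so callers still compile. */
static inline uint32_t pmstat_get_cx_nr(uint32_t cpuid)
{ (void)cpuid; return 0; }
static inline int pmstat_get_cx_stat(uint32_t cpuid, struct pm_cx_stat *stat)
{ (void)cpuid; (void)stat; return 0; }
static inline int pmstat_reset_cx_stat(uint32_t cpuid)
{ (void)cpuid; return 0; }
#endif

/* Hypothetical sub-operation numbers, for illustration only. */
enum demo_pmstat_op { DEMO_get_max_cx, DEMO_get_cxstat, DEMO_reset_cxstat };

/* The dispatcher has no #ifdef: the header decides what the calls do. */
static int demo_get_pm_info(enum demo_pmstat_op type, uint32_t cpuid,
                            uint32_t *nr, struct pm_cx_stat *stat)
{
    assert(nr != NULL);

    switch (type) {
    case DEMO_get_max_cx:
        *nr = pmstat_get_cx_nr(cpuid);
        return 0;
    case DEMO_get_cxstat:
        return pmstat_get_cx_stat(cpuid, stat);
    case DEMO_reset_cxstat:
        return pmstat_reset_cx_stat(cpuid);
    default:
        return -ENODEV;
    }
}
```

In the real code these cases should not even be reached on ARM, since
XEN_PROCESSOR_PM_CX is not set there, so the stub return values are mostly
a formality.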

>
>
>> Signed-off-by: Oleksandr Dmytryshyn <oleksandr.dmytryshyn@globallogic.com>
>> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>> CC: Jan Beulich <jbeulich@suse.com>
>> CC: Andrew Cooper <andrew.cooper3@citrix.com>
>> CC: Stefano Stabellini <sstabellini@kernel.org>
>> CC: Julien Grall <julien.grall@linaro.org>
>> ---
>>  xen/drivers/pm/stat.c    | 8 +++++++-
>>  xen/include/xen/pmstat.h | 2 ++
>>  2 files changed, 9 insertions(+), 1 deletion(-)
>>
>> diff --git a/xen/drivers/pm/stat.c b/xen/drivers/pm/stat.c
>> index 133e64d..986ba41 100644
>> --- a/xen/drivers/pm/stat.c
>> +++ b/xen/drivers/pm/stat.c
>> @@ -35,7 +35,6 @@
>>  #include <asm/processor.h>
>>  #include <xen/percpu.h>
>>  #include <xen/domain.h>
>> -#include <xen/acpi.h>
>>
>>  #include <public/sysctl.h>
>>  #include <xen/cpufreq.h>
>> @@ -132,6 +131,8 @@ int do_get_pm_info(struct xen_sysctl_get_pmstat *op)
>>          break;
>>      }
>>
>> +/* For now those operations can be used only when ACPI is enabled */
>> +#ifdef CONFIG_ACPI
>>      case PMSTAT_get_max_cx:
>>      {
>>          op->u.getcx.nr = pmstat_get_cx_nr(op->cpuid);
>> @@ -150,6 +151,7 @@ int do_get_pm_info(struct xen_sysctl_get_pmstat *op)
>>          ret = pmstat_reset_cx_stat(op->cpuid);
>>          break;
>>      }
>> +#endif /* CONFIG_ACPI */
>>
>>      default:
>>          printk("not defined sub-hypercall @ do_get_pm_info\n");
>> @@ -465,6 +467,7 @@ int do_pm_op(struct xen_sysctl_pm_op *op)
>>          break;
>>      }
>>
>> +#ifdef CONFIG_ACPI
>>      case XEN_SYSCTL_pm_op_get_max_cstate:
>>      {
>>          op->u.get_max_cstate = acpi_get_cstate_limit();
>> @@ -476,6 +479,7 @@ int do_pm_op(struct xen_sysctl_pm_op *op)
>>          acpi_set_cstate_limit(op->u.set_max_cstate);
>>          break;
>>      }
>> +#endif /* CONFIG_ACPI */
>>
>>  #ifdef CONFIG_HAS_CPU_TURBO
>>      case XEN_SYSCTL_pm_op_enable_turbo:
>> @@ -500,6 +504,7 @@ int do_pm_op(struct xen_sysctl_pm_op *op)
>>      return ret;
>>  }
>>
>> +#ifdef CONFIG_ACPI
>>  int acpi_set_pdc_bits(u32 acpi_id, XEN_GUEST_HANDLE_PARAM(uint32) pdc)
>>  {
>>      u32 bits[3];
>> @@ -530,3 +535,4 @@ int acpi_set_pdc_bits(u32 acpi_id, XEN_GUEST_HANDLE_PARAM(uint32) pdc)
>>
>>      return ret;
>>  }
>> +#endif /* CONFIG_ACPI */
>> diff --git a/xen/include/xen/pmstat.h b/xen/include/xen/pmstat.h
>> index 266bc16..a870c8a 100644
>> --- a/xen/include/xen/pmstat.h
>> +++ b/xen/include/xen/pmstat.h
>> @@ -6,10 +6,12 @@
>>  #include <public/sysctl.h>   /* for struct pm_cx_stat */
>>
>>  int set_px_pminfo(uint32_t cpu, struct xen_processor_performance *perf);
>> +#ifdef CONFIG_ACPI
>>  long set_cx_pminfo(uint32_t cpu, struct xen_processor_power *power);
>>  uint32_t pmstat_get_cx_nr(uint32_t cpuid);
>>  int pmstat_get_cx_stat(uint32_t cpuid, struct pm_cx_stat *stat);
>>  int pmstat_reset_cx_stat(uint32_t cpuid);
>> +#endif
>>
>>  int do_get_pm_info(struct xen_sysctl_get_pmstat *op);
>>  int do_pm_op(struct xen_sysctl_pm_op *op);
>> --
>> 2.7.4
>>



-- 
Regards,

Oleksandr Tyshchenko


* Re: [RFC PATCH 06/31] cpufreq: make cpufreq driver more generalizable
  2017-12-02  1:37   ` Stefano Stabellini
@ 2017-12-04 19:34     ` Oleksandr Tyshchenko
  2017-12-04 22:46       ` Stefano Stabellini
  0 siblings, 1 reply; 108+ messages in thread
From: Oleksandr Tyshchenko @ 2017-12-04 19:34 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Andrew Cooper, Oleksandr Dmytryshyn, Julien Grall,
	Oleksandr Tyshchenko, Jan Beulich, xen-devel

Hi, Stefano

On Sat, Dec 2, 2017 at 3:37 AM, Stefano Stabellini
<sstabellini@kernel.org> wrote:
> On Thu, 9 Nov 2017, Oleksandr Tyshchenko wrote:
>> From: Oleksandr Dmytryshyn <oleksandr.dmytryshyn@globallogic.com>
>>
>> First implementation of the cpufreq driver has been
>> written with x86 in mind. This patch makes possible
>> the cpufreq driver be working on both x86 and arm
>> architectures.
>>
>> This is a rebased version of the original patch:
>> https://lists.xen.org/archives/html/xen-devel/2014-11/msg00932.html
>>
>> Signed-off-by: Oleksandr Dmytryshyn <oleksandr.dmytryshyn@globallogic.com>
>> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>> CC: Jan Beulich <jbeulich@suse.com>
>> CC: Andrew Cooper <andrew.cooper3@citrix.com>
>> CC: Stefano Stabellini <sstabellini@kernel.org>
>> CC: Julien Grall <julien.grall@linaro.org>
>> ---
>>  xen/drivers/cpufreq/cpufreq.c    | 81 +++++++++++++++++++++++++++++++++++++---
>>  xen/include/public/platform.h    |  1 +
>>  xen/include/xen/processor_perf.h |  6 +++
>>  3 files changed, 82 insertions(+), 6 deletions(-)
>>
>> diff --git a/xen/drivers/cpufreq/cpufreq.c b/xen/drivers/cpufreq/cpufreq.c
>> index ab909e2..64e1ae7 100644
>> --- a/xen/drivers/cpufreq/cpufreq.c
>> +++ b/xen/drivers/cpufreq/cpufreq.c
>> @@ -42,7 +42,6 @@
>>  #include <asm/io.h>
>>  #include <asm/processor.h>
>>  #include <asm/percpu.h>
>> -#include <acpi/acpi.h>
>>  #include <xen/cpufreq.h>
>>
>>  static unsigned int __read_mostly usr_min_freq;
>> @@ -206,6 +205,7 @@ int cpufreq_add_cpu(unsigned int cpu)
>>      } else {
>>          /* domain sanity check under whatever coordination type */
>>          firstcpu = cpumask_first(cpufreq_dom->map);
>> +#ifdef CONFIG_ACPI
>>          if ((perf->domain_info.coord_type !=
>>              processor_pminfo[firstcpu]->perf.domain_info.coord_type) ||
>>              (perf->domain_info.num_processors !=
>> @@ -221,6 +221,19 @@ int cpufreq_add_cpu(unsigned int cpu)
>>                  );
>>              return -EINVAL;
>>          }
>> +#else /* !CONFIG_ACPI */
>> +        if ((perf->domain_info.num_processors !=
>> +            processor_pminfo[firstcpu]->perf.domain_info.num_processors)) {
>> +
>> +            printk(KERN_WARNING "cpufreq fail to add CPU%d:"
>> +                   "incorrect num processors (%"PRIu64"), "
>> +                   "expect(%"PRIu64")\n",
>> +                   cpu, perf->domain_info.num_processors,
>> +                   processor_pminfo[firstcpu]->perf.domain_info.num_processors
>> +                );
>> +            return -EINVAL;
>> +        }
>> +#endif /* CONFIG_ACPI */
>
> Why is this necessary? I am asking this question, because I think it
> would be best to avoid more #ifdef's if we can avoid them, and some of
> the code #ifdef'ed doesn't look very acpi specific (at least at first
> sight). It doesn't look like this change is very beneficial. What am I
> missing?

Probably the original author of this patch wanted to avoid touching
code and variables that didn't make sense or wouldn't be
used on non-ACPI systems.

I agree, we can avoid this #ifdef as well as many others. I
don't see an issue, for example, with printing default values for
coord_type/num_entries/revision/etc.

>
>
>>      }
>>
>>      if (!domexist || hw_all) {
>> @@ -380,6 +393,7 @@ int cpufreq_del_cpu(unsigned int cpu)
>>      return 0;
>>  }
>>
>> +#ifdef CONFIG_ACPI
>>  static void print_PCT(struct xen_pct_register *ptr)
>>  {
>>      printk("\t_PCT: descriptor=%d, length=%d, space_id=%d, "
>> @@ -387,12 +401,14 @@ static void print_PCT(struct xen_pct_register *ptr)
>>             ptr->descriptor, ptr->length, ptr->space_id, ptr->bit_width,
>>             ptr->bit_offset, ptr->reserved, ptr->address);
>>  }
>> +#endif /* CONFIG_ACPI */
>
> same question

definitely omit #ifdef

>
>
>>  static void print_PSS(struct xen_processor_px *ptr, int count)
>>  {
>>      int i;
>>      printk("\t_PSS: state_count=%d\n", count);
>>      for (i=0; i<count; i++){
>> +#ifdef CONFIG_ACPI
>>          printk("\tState%d: %"PRId64"MHz %"PRId64"mW %"PRId64"us "
>>                 "%"PRId64"us %#"PRIx64" %#"PRIx64"\n",
>>                 i,
>> @@ -402,15 +418,26 @@ static void print_PSS(struct xen_processor_px *ptr, int count)
>>                 ptr[i].bus_master_latency,
>>                 ptr[i].control,
>>                 ptr[i].status);
>> +#else /* !CONFIG_ACPI */
>> +        printk("\tState%d: %"PRId64"MHz %"PRId64"us\n",
>> +               i,
>> +               ptr[i].core_frequency,
>> +               ptr[i].transition_latency);
>> +#endif /* CONFIG_ACPI */
>>      }
>>  }
>
> same question

same answer)

>
>
>>  static void print_PSD( struct xen_psd_package *ptr)
>>  {
>> +#ifdef CONFIG_ACPI
>>      printk("\t_PSD: num_entries=%"PRId64" rev=%"PRId64
>>             " domain=%"PRId64" coord_type=%"PRId64" num_processors=%"PRId64"\n",
>>             ptr->num_entries, ptr->revision, ptr->domain, ptr->coord_type,
>>             ptr->num_processors);
>> +#else /* !CONFIG_ACPI */
>> +    printk("\t_PSD:  domain=%"PRId64" num_processors=%"PRId64"\n",
>> +           ptr->domain, ptr->num_processors);
>> +#endif /* CONFIG_ACPI */
>>  }
>
> same question

same answer)

>
>
>>  static void print_PPC(unsigned int platform_limit)
>> @@ -418,13 +445,53 @@ static void print_PPC(unsigned int platform_limit)
>>      printk("\t_PPC: %d\n", platform_limit);
>>  }
>>
>> +static inline bool is_pss_data(struct xen_processor_performance *px)
>> +{
>> +#ifdef CONFIG_ACPI
>> +    return px->flags & XEN_PX_PSS;
>> +#else
>> +    return px->flags == XEN_PX_DATA;
>> +#endif
>> +}
>> +
>> +static inline bool is_psd_data(struct xen_processor_performance *px)
>> +{
>> +#ifdef CONFIG_ACPI
>> +    return px->flags & XEN_PX_PSD;
>> +#else
>> +    return px->flags == XEN_PX_DATA;
>> +#endif
>> +}
>> +
>> +static inline bool is_ppc_data(struct xen_processor_performance *px)
>> +{
>> +#ifdef CONFIG_ACPI
>> +    return px->flags & XEN_PX_PPC;
>> +#else
>> +    return px->flags == XEN_PX_DATA;
>> +#endif
>> +}
>> +
>> +static inline bool is_all_data(struct xen_processor_performance *px)
>> +{
>> +#ifdef CONFIG_ACPI
>> +    return px->flags == ( XEN_PX_PCT | XEN_PX_PSS | XEN_PX_PSD | XEN_PX_PPC );
>> +#else
>> +    return px->flags == XEN_PX_DATA;
>> +#endif
>> +}
>
> Could you please explain here and in the commit message the idea behind
> this? It looks like we want to get rid of the different flags on
> non-ACPI systems? Why can't we reuse the same flags?

You are right. Indeed, it looks redundant.
I will drop all these helpers and reuse the existing flags. If we are
pretending to be a P-state driver and uploading the same P-state data
that [1] uploads,
then I will just reuse the existing flags. It will cost me nothing.
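
A minimal standalone sketch of what "reusing the existing flags" means
(not Xen code). The XEN_PX_PSS/PPC/PSD values are the ones visible in the
quoted platform.h hunk; XEN_PX_PCT = 1 is my assumption based on the public
header, and struct demo_px_info with its helpers is hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define XEN_PX_PCT   1  /* assumed value, not shown in the quoted hunk */
#define XEN_PX_PSS   2
#define XEN_PX_PPC   4
#define XEN_PX_PSD   8

/* Hypothetical cut-down version of struct xen_processor_performance. */
struct demo_px_info {
    uint32_t flags;
};

/* Each part is detected by its own bit, on ACPI and non-ACPI alike. */
static bool demo_has(const struct demo_px_info *px, uint32_t flag)
{
    assert(px != NULL);
    return (px->flags & flag) != 0;
}

/* Initialization completes only once all four parts have been uploaded. */
static bool demo_is_complete(const struct demo_px_info *px)
{
    assert(px != NULL);
    return px->flags ==
           (XEN_PX_PCT | XEN_PX_PSS | XEN_PX_PSD | XEN_PX_PPC);
}
```

With this, a non-ACPI uploader simply sets all four bits in one call, and
set_px_pminfo() needs neither XEN_PX_DATA nor the is_*_data() helpers.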

May I ask you to take a look at this patch [2]? It looks like a hack
right now, but what would be the proper way to do it?

[1] https://github.com/torvalds/linux/blob/master/drivers/xen/xen-acpi-processor.c#L210
[2] https://www.mail-archive.com/xen-devel@lists.xen.org/msg128410.html

>
>
>>  int set_px_pminfo(uint32_t acpi_id, struct xen_processor_performance *dom0_px_info)
>>  {
>>      int ret=0, cpuid;
>>      struct processor_pminfo *pmpt;
>>      struct processor_performance *pxpt;
>>
>> +#ifdef CONFIG_ACPI
>>      cpuid = get_cpu_id(acpi_id);
>> +#else
>> +    cpuid = acpi_id;
>> +#endif
>
> Rather than an #ifdef here, I would probably generalize the get_cpu_id
> function.

Would the following stub be enough?

diff --git a/xen/include/xen/acpi.h b/xen/include/xen/acpi.h
index 9409350..4aab41e 100644
--- a/xen/include/xen/acpi.h
+++ b/xen/include/xen/acpi.h
@@ -123,7 +123,11 @@ static inline int acpi_boot_table_init(void)

 #endif         /*!CONFIG_ACPI*/

+#ifdef CONFIG_ACPI
 int get_cpu_id(u32 acpi_id);
+#else
+static inline int get_cpu_id(u32 acpi_id) { return acpi_id; }
+#endif

 unsigned int acpi_register_gsi (u32 gsi, int edge_level, int active_high_low);
 int acpi_gsi_to_irq (u32 gsi, unsigned int *irq);
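
For illustration, a minimal standalone sketch of how a caller behaves with
such an identity stub (not Xen code). DEMO_NR_CPUS and demo_resolve_cpu()
are hypothetical; the upper-bound check is my own addition, shown because
an identity mapping does no range check by itself:

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_NR_CPUS 8  /* hypothetical limit, for illustration only */

#ifdef CONFIG_ACPI
int get_cpu_id(uint32_t acpi_id);
#else
/* Without ACPI the "ACPI ID" passed in is already the logical CPU number. */
static inline int get_cpu_id(uint32_t acpi_id) { return (int)acpi_id; }
#endif

/* Caller side: no #ifdef needed, the same validation covers both cases. */
static int demo_resolve_cpu(uint32_t acpi_id, const void *px_info)
{
    int cpuid = get_cpu_id(acpi_id);

    if (cpuid < 0 || cpuid >= DEMO_NR_CPUS || px_info == NULL)
        return -EINVAL;

    return cpuid;
}
```

So set_px_pminfo() could keep calling get_cpu_id() unconditionally, as long
as something still bounds the result before it is used as an index.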

>
>
>>      if ( cpuid < 0 || !dom0_px_info)
>>      {
>>          ret = -EINVAL;
>> @@ -446,6 +513,8 @@ int set_px_pminfo(uint32_t acpi_id, struct xen_processor_performance *dom0_px_in
>>          processor_pminfo[cpuid] = pmpt;
>>      }
>>      pxpt = &pmpt->perf;
>> +
>> +#ifdef CONFIG_ACPI
>>      pmpt->acpi_id = acpi_id;
>>      pmpt->id = cpuid;
>>
>> @@ -472,8 +541,9 @@ int set_px_pminfo(uint32_t acpi_id, struct xen_processor_performance *dom0_px_in
>>              print_PCT(&pxpt->status_register);
>>          }
>>      }
>> +#endif /* CONFIG_ACPI */

BTW, at first sight we could omit this #ifdef too, as long as we take
care that the space_id check passes.

>>
>> -    if ( dom0_px_info->flags & XEN_PX_PSS )
>> +    if ( is_pss_data(dom0_px_info) )
>>      {
>>          /* capability check */
>>          if (dom0_px_info->state_count <= 1)
>> @@ -500,7 +570,7 @@ int set_px_pminfo(uint32_t acpi_id, struct xen_processor_performance *dom0_px_in
>>              print_PSS(pxpt->states,pxpt->state_count);
>>      }
>>
>> -    if ( dom0_px_info->flags & XEN_PX_PSD )
>> +    if ( is_psd_data(dom0_px_info) )
>>      {
>>          /* check domain coordination */
>>          if (dom0_px_info->shared_type != CPUFREQ_SHARED_TYPE_ALL &&
>> @@ -520,7 +590,7 @@ int set_px_pminfo(uint32_t acpi_id, struct xen_processor_performance *dom0_px_in
>>              print_PSD(&pxpt->domain_info);
>>      }
>>
>> -    if ( dom0_px_info->flags & XEN_PX_PPC )
>> +    if ( is_ppc_data(dom0_px_info) )
>>      {
>>          pxpt->platform_limit = dom0_px_info->platform_limit;
>>
>> @@ -534,8 +604,7 @@ int set_px_pminfo(uint32_t acpi_id, struct xen_processor_performance *dom0_px_in
>>          }
>>      }
>>
>> -    if ( dom0_px_info->flags == ( XEN_PX_PCT | XEN_PX_PSS |
>> -                XEN_PX_PSD | XEN_PX_PPC ) )
>> +    if ( is_all_data(dom0_px_info) )
>>      {
>>          pxpt->init = XEN_PX_INIT;
>>
>> diff --git a/xen/include/public/platform.h b/xen/include/public/platform.h
>> index 94dbc3f..328579c 100644
>> --- a/xen/include/public/platform.h
>> +++ b/xen/include/public/platform.h
>> @@ -384,6 +384,7 @@ DEFINE_XEN_GUEST_HANDLE(xenpf_getidletime_t);
>>  #define XEN_PX_PSS   2
>>  #define XEN_PX_PPC   4
>>  #define XEN_PX_PSD   8
>> +#define XEN_PX_DATA  16
>>
>>  struct xen_power_register {
>>      uint32_t     space_id;
>> diff --git a/xen/include/xen/processor_perf.h b/xen/include/xen/processor_perf.h
>> index d8a1ba6..afdccf2 100644
>> --- a/xen/include/xen/processor_perf.h
>> +++ b/xen/include/xen/processor_perf.h
>> @@ -3,7 +3,9 @@
>>
>>  #include <public/platform.h>
>>  #include <public/sysctl.h>
>> +#ifdef CONFIG_ACPI
>>  #include <xen/acpi.h>
>> +#endif
>>
>>  #define XEN_PX_INIT 0x80000000
>>
>> @@ -24,8 +26,10 @@ int  cpufreq_del_cpu(unsigned int);
>>  struct processor_performance {
>>      uint32_t state;
>>      uint32_t platform_limit;
>> +#ifdef CONFIG_ACPI
>>      struct xen_pct_register control_register;
>>      struct xen_pct_register status_register;
>> +#endif
>>      uint32_t state_count;
>>      struct xen_processor_px *states;
>>      struct xen_psd_package domain_info;
>> @@ -35,8 +39,10 @@ struct processor_performance {
>>  };
>>
>>  struct processor_pminfo {
>> +#ifdef CONFIG_ACPI
>>      uint32_t acpi_id;
>>      uint32_t id;
>> +#endif
>>      struct processor_performance    perf;
>>  };

There will be no changes here either.

>>
>> --
>> 2.7.4
>>


-- 
Regards,

Oleksandr Tyshchenko


* Re: [RFC PATCH 04/31] cpufreq: make turbo settings to be configurable
  2017-12-02 17:25     ` Oleksandr Tyshchenko
  2017-12-04 11:58       ` Andre Przywara
@ 2017-12-04 22:18       ` Stefano Stabellini
  2017-12-05 11:13         ` Oleksandr Tyshchenko
  1 sibling, 1 reply; 108+ messages in thread
From: Stefano Stabellini @ 2017-12-04 22:18 UTC (permalink / raw)
  To: Oleksandr Tyshchenko
  Cc: Stefano Stabellini, Andrew Cooper, Oleksandr Dmytryshyn,
	Julien Grall, Oleksandr Tyshchenko, Jan Beulich, xen-devel

On Sat, 2 Dec 2017, Oleksandr Tyshchenko wrote:
> On Sat, Dec 2, 2017 at 3:06 AM, Stefano Stabellini
> <sstabellini@kernel.org> wrote:
> > On Thu, 9 Nov 2017, Oleksandr Tyshchenko wrote:
> >> From: Oleksandr Dmytryshyn <oleksandr.dmytryshyn@globallogic.com>
> >>
> >> This settings is not needed for some architectures.
> >> So make it to be configurable and use it for x86
> >> architecture.
> >>
> >> This is a rebased version of the original patch:
> >> https://lists.xen.org/archives/html/xen-devel/2014-11/msg00942.html
> >>
> >> Signed-off-by: Oleksandr Dmytryshyn <oleksandr.dmytryshyn@globallogic.com>
> >> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> >> CC: Jan Beulich <jbeulich@suse.com>
> >> CC: Andrew Cooper <andrew.cooper3@citrix.com>
> >> CC: Stefano Stabellini <sstabellini@kernel.org>
> >> CC: Julien Grall <julien.grall@linaro.org>
> >> ---
> >>  xen/arch/x86/Kconfig          |  1 +
> >>  xen/drivers/cpufreq/Kconfig   |  3 +++
> >>  xen/drivers/cpufreq/utility.c | 11 ++++++++++-
> >>  xen/drivers/pm/stat.c         |  6 ++++++
> >>  xen/include/xen/cpufreq.h     |  6 ++++++
> >>  5 files changed, 26 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/xen/arch/x86/Kconfig b/xen/arch/x86/Kconfig
> >> index 86c8eca..c1eac1d 100644
> >> --- a/xen/arch/x86/Kconfig
> >> +++ b/xen/arch/x86/Kconfig
> >> @@ -24,6 +24,7 @@ config X86
> >>       select NUMA
> >>       select VGA
> >>       select HAS_PM
> >> +     select HAS_CPU_TURBO
> >>
> >>  config ARCH_DEFCONFIG
> >>       string
> >> diff --git a/xen/drivers/cpufreq/Kconfig b/xen/drivers/cpufreq/Kconfig
> >> index cce80f4..427ea2a 100644
> >> --- a/xen/drivers/cpufreq/Kconfig
> >> +++ b/xen/drivers/cpufreq/Kconfig
> >> @@ -1,3 +1,6 @@
> >>
> >>  config HAS_CPUFREQ
> >>       bool
> >> +
> >> +config HAS_CPU_TURBO
> >> +     bool
> >> diff --git a/xen/drivers/cpufreq/utility.c b/xen/drivers/cpufreq/utility.c
> >> index a687e5a..25bf983 100644
> >> --- a/xen/drivers/cpufreq/utility.c
> >> +++ b/xen/drivers/cpufreq/utility.c
> >> @@ -209,7 +209,9 @@ int cpufreq_frequency_table_cpuinfo(struct cpufreq_policy *policy,
> >>  {
> >>      unsigned int min_freq = ~0;
> >>      unsigned int max_freq = 0;
> >> +#ifdef CONFIG_HAS_CPU_TURBO
> >>      unsigned int second_max_freq = 0;
> >> +#endif
> >>      unsigned int i;
> >>
> >>      for (i=0; (table[i].frequency != CPUFREQ_TABLE_END); i++) {
> >> @@ -221,6 +223,7 @@ int cpufreq_frequency_table_cpuinfo(struct cpufreq_policy *policy,
> >>          if (freq > max_freq)
> >>              max_freq = freq;
> >>      }
> >> +#ifdef CONFIG_HAS_CPU_TURBO
> >>      for (i=0; (table[i].frequency != CPUFREQ_TABLE_END); i++) {
> >>          unsigned int freq = table[i].frequency;
> >>          if (freq == CPUFREQ_ENTRY_INVALID || freq == max_freq)
> >> @@ -234,9 +237,13 @@ int cpufreq_frequency_table_cpuinfo(struct cpufreq_policy *policy,
> >>          printk("max_freq: %u    second_max_freq: %u\n",
> >>                 max_freq, second_max_freq);
> >>
> >> +    policy->cpuinfo.second_max_freq = second_max_freq;
> >> +#else /* !CONFIG_HAS_CPU_TURBO */
> >> +    if (cpufreq_verbose)
> >> +        printk("max_freq: %u\n", max_freq);
> >> +#endif /* CONFIG_HAS_CPU_TURBO */
> >>      policy->min = policy->cpuinfo.min_freq = min_freq;
> >>      policy->max = policy->cpuinfo.max_freq = max_freq;
> >> -    policy->cpuinfo.second_max_freq = second_max_freq;
> >>
> >>      if (policy->min == ~0)
> >>          return -EINVAL;
> >> @@ -390,6 +397,7 @@ int cpufreq_driver_getavg(unsigned int cpu, unsigned int flag)
> >>      return policy->cur;
> >>  }
> >>
> >> +#ifdef CONFIG_HAS_CPU_TURBO
> >>  int cpufreq_update_turbo(int cpuid, int new_state)
> >>  {
> >>      struct cpufreq_policy *policy;
> >> @@ -430,6 +438,7 @@ int cpufreq_get_turbo_status(int cpuid)
> >>      policy = per_cpu(cpufreq_cpu_policy, cpuid);
> >>      return policy && policy->turbo == CPUFREQ_TURBO_ENABLED;
> >>  }
> >> +#endif /* CONFIG_HAS_CPU_TURBO */
> >>
> >>  /*********************************************************************
> >>   *                 POLICY                                            *
> >
> > I am wondering if we need to go as far as #ifdef'ing
> > cpufreq_update_turbo. For the sake of reducing the number if #ifdef's,
> > would it be enough if we only make sure it is disabled?
> >
> > In other words, I would keep the changes to stat.c but I would leave
> > utility.c and cpufreq.h pretty much untouched.
> 
> Yes. I was thinking about dropping this patch altogether. If the platform
> doesn't support CPU Boost, the platform
> driver should just inform the framework about that (policy->turbo =
> CPUFREQ_TURBO_UNSUPPORTED).
> That's all.

Right


> cpufreq_update_turbo() will return -EOPNOTSUPP if someone tries to
> enable/disable turbo mode.
> cpufreq_get_turbo_status() will return that turbo mode "is not enabled".

Exactly what I was thinking
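
A minimal standalone sketch of the behaviour described above (not Xen
code). The tristate values follow the comment in cpufreq.h (0 unsupported,
-1 disabled, 1 enabled); struct demo_policy and the demo_* functions are
hypothetical, and the -EINVAL returns are assumptions:

```c
#include <errno.h>
#include <stddef.h>

#define CPUFREQ_TURBO_DISABLED      (-1)
#define CPUFREQ_TURBO_UNSUPPORTED   0
#define CPUFREQ_TURBO_ENABLED       1

/* Hypothetical cut-down policy: only the tristate turbo flag. */
struct demo_policy {
    signed char turbo;
};

/* Refuse any state change when the driver reported "unsupported". */
static int demo_update_turbo(struct demo_policy *policy, int new_state)
{
    if (policy == NULL)
        return -EINVAL;
    if (new_state != CPUFREQ_TURBO_ENABLED &&
        new_state != CPUFREQ_TURBO_DISABLED)
        return -EINVAL;
    if (policy->turbo == CPUFREQ_TURBO_UNSUPPORTED)
        return -EOPNOTSUPP;

    policy->turbo = (signed char)new_state;
    return 0;
}

/* Mirrors the quoted status check: only "enabled" reports 1. */
static int demo_get_turbo_status(const struct demo_policy *policy)
{
    return policy && policy->turbo == CPUFREQ_TURBO_ENABLED;
}
```

A platform driver that sets turbo to CPUFREQ_TURBO_UNSUPPORTED therefore
gets both behaviours without any CONFIG_HAS_CPU_TURBO #ifdef.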


> Another question is second_max_freq. As I understand it, it is the highest
> non-turbo frequency, calculated by the framework to limit the target
> frequency when
> turbo mode "is disabled". And Xen assumes that second_max_freq is
> always P1 if turbo mode is on.
> But there might be a case where the few highest frequencies are all
> turbo frequencies. So I propose adding an extra flag to handle
> that.
> It will then be each CPUFreq driver's responsibility to mark the
> turbo frequency(ies) so that the framework can calculate
> second_max_freq properly.

As Andre wrote, we can start simply assuming that ARM doesn't have
turbo. If turbo mode is assumed to be off, I don't think we need the
patch below and the new flag, because second_max_freq == max_freq.


> Something like that:
> 
> diff --git a/xen/drivers/cpufreq/utility.c b/xen/drivers/cpufreq/utility.c
> index 25bf983..122a88b 100644
> --- a/xen/drivers/cpufreq/utility.c
> +++ b/xen/drivers/cpufreq/utility.c
> @@ -226,7 +226,8 @@ int cpufreq_frequency_table_cpuinfo(struct cpufreq_policy *policy,
>  #ifdef CONFIG_HAS_CPU_TURBO
>      for (i=0; (table[i].frequency != CPUFREQ_TABLE_END); i++) {
>          unsigned int freq = table[i].frequency;
> -        if (freq == CPUFREQ_ENTRY_INVALID || freq == max_freq)
> +        if ((freq == CPUFREQ_ENTRY_INVALID) ||
> +            (table[i].flags & CPUFREQ_BOOST_FREQ))
>              continue;
>          if (freq > second_max_freq)
>              second_max_freq = freq;
> diff --git a/xen/include/xen/cpufreq.h b/xen/include/xen/cpufreq.h
> index 2e0c16a..77b29da 100644
> --- a/xen/include/xen/cpufreq.h
> +++ b/xen/include/xen/cpufreq.h
> @@ -204,7 +204,11 @@ void cpufreq_verify_within_limits(struct cpufreq_policy *policy,
>  #define CPUFREQ_ENTRY_INVALID ~0
>  #define CPUFREQ_TABLE_END     ~1
> 
> +/* Special Values of .flags field */
> +#define CPUFREQ_BOOST_FREQ    (1 << 0)
> +
>  struct cpufreq_frequency_table {
> +       unsigned int    flags;
>      unsigned int    index;     /* any */
>      unsigned int    frequency; /* kHz - doesn't need to be in ascending
>                                  * order */
> 
> Both existing x86 CPUFreq drivers just need to mark the P0 frequency as
> a turbo frequency if turbo mode "is supported". Am I correct?
> 
> And the most important question is how to recognize in Xen on ARM
> (using the SCPI protocol) which frequencies are actually
> turbo frequencies. I couldn't find any information about that in the protocol
> description.
> For DT-based CPUFreq it is not an issue, since there is a specific
> property, "turbo-mode", to mark the corresponding OPPs [1].
> But neither the SCPI DT bindings [2] nor the SCPI protocol itself [3]
> mentions it. Perhaps an additional command should be added to pass
> such info.
> 
> [1] https://www.kernel.org/doc/Documentation/devicetree/bindings/opp/opp.txt
> [2] http://elixir.free-electrons.com/linux/v4.15-rc1/source/Documentation/devicetree/bindings/arm/arm,scpi.txt
> [3] http://infocenter.arm.com/help/topic/com.arm.doc.dui0922g/scp_message_interface_v1_2_DUI0922G_en.pdf
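
For reference, the flag-based second_max_freq calculation proposed in the
quoted diff can be sketched standalone as follows (not Xen code). The
CPUFREQ_BOOST_FREQ flag is the one proposed above; the demo_* types are
hypothetical, and the fallback to max_freq when no non-boost entry exists
is an assumption:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define CPUFREQ_ENTRY_INVALID (~0u)
#define CPUFREQ_TABLE_END     (~1u)
#define CPUFREQ_BOOST_FREQ    (1u << 0)  /* flag proposed in the quoted diff */

struct demo_freq_entry {
    unsigned int flags;
    unsigned int frequency;  /* kHz, not necessarily in ascending order */
};

struct demo_cpuinfo {
    unsigned int min_freq;
    unsigned int max_freq;
    unsigned int second_max_freq;  /* highest non-boost frequency */
};

static int demo_table_cpuinfo(const struct demo_freq_entry *table,
                              struct demo_cpuinfo *info)
{
    unsigned int min_freq = ~0u, max_freq = 0, second_max_freq = 0;
    size_t i;

    assert(table != NULL && info != NULL);

    for (i = 0; table[i].frequency != CPUFREQ_TABLE_END; i++) {
        unsigned int freq = table[i].frequency;

        if (freq == CPUFREQ_ENTRY_INVALID)
            continue;
        if (freq < min_freq)
            min_freq = freq;
        if (freq > max_freq)
            max_freq = freq;
        /* Boost entries never count towards the non-turbo ceiling. */
        if (!(table[i].flags & CPUFREQ_BOOST_FREQ) && freq > second_max_freq)
            second_max_freq = freq;
    }

    if (min_freq == ~0u)
        return -EINVAL;  /* no valid entry in the table */
    if (second_max_freq == 0)
        second_max_freq = max_freq;  /* assumed fallback */

    info->min_freq = min_freq;
    info->max_freq = max_freq;
    info->second_max_freq = second_max_freq;
    return 0;
}
```

With no entry marked as boost, this yields second_max_freq == max_freq,
which matches the "assume ARM has no turbo" starting point.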



* Re: [RFC PATCH 05/31] pmstat: make pmstat functions more generalizable
  2017-12-04 16:21     ` Oleksandr Tyshchenko
@ 2017-12-04 22:30       ` Stefano Stabellini
  0 siblings, 0 replies; 108+ messages in thread
From: Stefano Stabellini @ 2017-12-04 22:30 UTC (permalink / raw)
  To: Oleksandr Tyshchenko
  Cc: Stefano Stabellini, Andrew Cooper, Oleksandr Dmytryshyn,
	Julien Grall, Oleksandr Tyshchenko, Jan Beulich, xen-devel

On Mon, 4 Dec 2017, Oleksandr Tyshchenko wrote:
> Hi Stefano
> 
> On Sat, Dec 2, 2017 at 3:21 AM, Stefano Stabellini
> <sstabellini@kernel.org> wrote:
> > On Thu, 9 Nov 2017, Oleksandr Tyshchenko wrote:
> >> From: Oleksandr Dmytryshyn <oleksandr.dmytryshyn@globallogic.com>
> >>
> >> ACPI-specific parts are moved under appropriate ifdefs.
> >> Now pmstat functions can be used in ARM platform.
> >>
> >> This is a rebased version of the original patch:
> >> https://lists.xen.org/archives/html/xen-devel/2014-11/msg00941.html
> >
> > My first maybe naive question is: why do we want to disable the C-states
> > and not the P-states? After all, they are both defined in ACPI?
> 
> Good question. The Xen CPUFreq infrastructure is based on ACPI P-states. We
> have to either
> completely rework the generic code and existing drivers or integrate into the
> "current environment" (so the CPUFreq driver
> this patch series adds pretends that it understands what
> P-states are). The second option requires much less
> development and upstreaming effort (I hope). BTW, with the current
> solution you don't have to modify the public sysctl interface or xenpm.
> And looking through all the previous discussions [1], I got the feeling that
> the original author of this patch had a similar opinion.
> 
> [1]
> /* RFC v0 */
> https://lists.xen.org/archives/html/xen-devel/2014-08/msg02919.html
> /* RFC v1 */
> https://lists.xenproject.org/archives/html/xen-devel/2014-10/msg00787.html
> /* RFC v2 */
> https://lists.xenproject.org/archives/html/xen-devel/2014-10/msg01879.html
> /* RFC v3 */
> https://marc.info/?l=xen-devel&m=141407701110860&w=2
> /* RFC v4 */
> https://marc.info/?l=xen-devel&m=141510663108037&w=2
> /* RFC v5 */
> https://lists.xen.org/archives/html/xen-devel/2014-11/msg00940.html

thank you, it makes sense


> >
> > The second question is: instead of #ifdef'ing everything C-states,
> > couldn't we just rely on XEN_PROCESSOR_PM_CX not being available?
> 
> I am afraid that relying on XEN_PROCESSOR_PM_CX not being available is
> not enough.
> A few of the functions that the original author of the patch #ifdef'd
> are located under arch/x86.
> So, I think, the point was to make pm/stat.c compile on ARM.
> 
> But I completely agree that the scope of the #ifdefs can be reduced.
> 
> 1. For the following functions we will be able to omit #ifdef CONFIG_ACPI if we
> create corresponding stubs.
> - pmstat_get_cx_nr()
> - pmstat_get_cx_stat()
> - pmstat_reset_cx_stat()
> They will never be called if XEN_PROCESSOR_PM_CX is not set.

sounds good


> 2. For the following functions we can probably omit #ifdef CONFIG_ACPI, since
> the corresponding stubs are already present (see !CONFIG_ACPI_CSTATE in
> acpi.h)
> - acpi_get_cstate_limit()
> - acpi_set_cstate_limit()

it looks like it, yes


> But acpi_set_pdc_bits() I would leave under #ifdef CONFIG_ACPI
> (CONFIG_X86 ?) or move it to arch/x86.
> It is called from arch/x86/platform_hypercall.c and pulls a bunch of
> #define-s from pdc_intel.h

Yes, I would move it to arch/x86.


> Something like that:
> 
> diff --git a/xen/drivers/pm/stat.c b/xen/drivers/pm/stat.c
> index 133e64d..353d0ab 100644
> --- a/xen/drivers/pm/stat.c
> +++ b/xen/drivers/pm/stat.c
> @@ -500,6 +500,7 @@ int do_pm_op(struct xen_sysctl_pm_op *op)
>      return ret;
>  }
> 
> +#ifdef CONFIG_ACPI /* or CONFIG_X86 ? */
>  int acpi_set_pdc_bits(u32 acpi_id, XEN_GUEST_HANDLE_PARAM(uint32) pdc)
>  {
>      u32 bits[3];
> @@ -530,3 +531,4 @@ int acpi_set_pdc_bits(u32 acpi_id, XEN_GUEST_HANDLE_PARAM(uint32) pdc)
> 
>      return ret;
>  }
> +#endif /* CONFIG_ACPI */
> diff --git a/xen/include/xen/pmstat.h b/xen/include/xen/pmstat.h
> index 266bc16..05d6b7b 100644
> --- a/xen/include/xen/pmstat.h
> +++ b/xen/include/xen/pmstat.h
> @@ -6,10 +6,17 @@
>  #include <public/sysctl.h>   /* for struct pm_cx_stat */
> 
>  int set_px_pminfo(uint32_t cpu, struct xen_processor_performance *perf);
> +#ifdef CONFIG_ACPI /* or CONFIG_X86 ? */
>  long set_cx_pminfo(uint32_t cpu, struct xen_processor_power *power);
>  uint32_t pmstat_get_cx_nr(uint32_t cpuid);
>  int pmstat_get_cx_stat(uint32_t cpuid, struct pm_cx_stat *stat);
>  int pmstat_reset_cx_stat(uint32_t cpuid);
> +#else
> +static inline long set_cx_pminfo(uint32_t cpu, struct xen_processor_power *power) { return 0; }
> +static inline uint32_t pmstat_get_cx_nr(uint32_t cpuid) { return 0; }
> +static inline int pmstat_get_cx_stat(uint32_t cpuid, struct pm_cx_stat *stat) { return 0; }
> +static inline int pmstat_reset_cx_stat(uint32_t cpuid) { return 0; }
> +#endif
> 
>  int do_get_pm_info(struct xen_sysctl_get_pmstat *op);
>  int do_pm_op(struct xen_sysctl_pm_op *op);
> 
> What do you think?

much better
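
For reference, the stub approach discussed above can be shown outside of
Xen. This is only a minimal standalone sketch, not Xen code: HAVE_CSTATES
stands in for whichever config symbol is finally chosen (CONFIG_ACPI or
CONFIG_X86), and demo_get_cx_nr(), demo_query() and DEMO_PM_CX are
hypothetical names used for illustration.

```c
#include <stdint.h>

/* HAVE_CSTATES is a hypothetical stand-in for CONFIG_ACPI / CONFIG_X86. */
#ifdef HAVE_CSTATES
uint32_t demo_get_cx_nr(uint32_t cpuid);   /* real implementation elsewhere */
#else
/* Stub: keeps callers compilable when C-state support is not built in. */
static inline uint32_t demo_get_cx_nr(uint32_t cpuid)
{
    (void)cpuid;
    return 0;
}
#endif

#define DEMO_PM_CX 1u   /* hypothetical "C-state support present" bit */

/* The caller needs no #ifdef: the run-time capability check guards the call. */
static int demo_query(unsigned int pm_caps, uint32_t cpuid, uint32_t *nr)
{
    if ( !(pm_caps & DEMO_PM_CX) )
        return -1;               /* C-states not available */

    *nr = demo_get_cx_nr(cpuid);
    return 0;
}
```

With the stub in the header, the switch cases in the .c file can stay
unconditional, and the capability check decides at run time whether they
are reachable.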


> >
> >
> >> Signed-off-by: Oleksandr Dmytryshyn <oleksandr.dmytryshyn@globallogic.com>
> >> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> >> CC: Jan Beulich <jbeulich@suse.com>
> >> CC: Andrew Cooper <andrew.cooper3@citrix.com>
> >> CC: Stefano Stabellini <sstabellini@kernel.org>
> >> CC: Julien Grall <julien.grall@linaro.org>
> >> ---
> >>  xen/drivers/pm/stat.c    | 8 +++++++-
> >>  xen/include/xen/pmstat.h | 2 ++
> >>  2 files changed, 9 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/xen/drivers/pm/stat.c b/xen/drivers/pm/stat.c
> >> index 133e64d..986ba41 100644
> >> --- a/xen/drivers/pm/stat.c
> >> +++ b/xen/drivers/pm/stat.c
> >> @@ -35,7 +35,6 @@
> >>  #include <asm/processor.h>
> >>  #include <xen/percpu.h>
> >>  #include <xen/domain.h>
> >> -#include <xen/acpi.h>
> >>
> >>  #include <public/sysctl.h>
> >>  #include <xen/cpufreq.h>
> >> @@ -132,6 +131,8 @@ int do_get_pm_info(struct xen_sysctl_get_pmstat *op)
> >>          break;
> >>      }
> >>
> >> +/* For now those operations can be used only when ACPI is enabled */
> >> +#ifdef CONFIG_ACPI
> >>      case PMSTAT_get_max_cx:
> >>      {
> >>          op->u.getcx.nr = pmstat_get_cx_nr(op->cpuid);
> >> @@ -150,6 +151,7 @@ int do_get_pm_info(struct xen_sysctl_get_pmstat *op)
> >>          ret = pmstat_reset_cx_stat(op->cpuid);
> >>          break;
> >>      }
> >> +#endif /* CONFIG_ACPI */
> >>
> >>      default:
> >>          printk("not defined sub-hypercall @ do_get_pm_info\n");
> >> @@ -465,6 +467,7 @@ int do_pm_op(struct xen_sysctl_pm_op *op)
> >>          break;
> >>      }
> >>
> >> +#ifdef CONFIG_ACPI
> >>      case XEN_SYSCTL_pm_op_get_max_cstate:
> >>      {
> >>          op->u.get_max_cstate = acpi_get_cstate_limit();
> >> @@ -476,6 +479,7 @@ int do_pm_op(struct xen_sysctl_pm_op *op)
> >>          acpi_set_cstate_limit(op->u.set_max_cstate);
> >>          break;
> >>      }
> >> +#endif /* CONFIG_ACPI */
> >>
> >>  #ifdef CONFIG_HAS_CPU_TURBO
> >>      case XEN_SYSCTL_pm_op_enable_turbo:
> >> @@ -500,6 +504,7 @@ int do_pm_op(struct xen_sysctl_pm_op *op)
> >>      return ret;
> >>  }
> >>
> >> +#ifdef CONFIG_ACPI
> >>  int acpi_set_pdc_bits(u32 acpi_id, XEN_GUEST_HANDLE_PARAM(uint32) pdc)
> >>  {
> >>      u32 bits[3];
> >> @@ -530,3 +535,4 @@ int acpi_set_pdc_bits(u32 acpi_id, XEN_GUEST_HANDLE_PARAM(uint32) pdc)
> >>
> >>      return ret;
> >>  }
> >> +#endif /* CONFIG_ACPI */
> >> diff --git a/xen/include/xen/pmstat.h b/xen/include/xen/pmstat.h
> >> index 266bc16..a870c8a 100644
> >> --- a/xen/include/xen/pmstat.h
> >> +++ b/xen/include/xen/pmstat.h
> >> @@ -6,10 +6,12 @@
> >>  #include <public/sysctl.h>   /* for struct pm_cx_stat */
> >>
> >>  int set_px_pminfo(uint32_t cpu, struct xen_processor_performance *perf);
> >> +#ifdef CONFIG_ACPI
> >>  long set_cx_pminfo(uint32_t cpu, struct xen_processor_power *power);
> >>  uint32_t pmstat_get_cx_nr(uint32_t cpuid);
> >>  int pmstat_get_cx_stat(uint32_t cpuid, struct pm_cx_stat *stat);
> >>  int pmstat_reset_cx_stat(uint32_t cpuid);
> >> +#endif
> >>
> >>  int do_get_pm_info(struct xen_sysctl_get_pmstat *op);
> >>  int do_pm_op(struct xen_sysctl_pm_op *op);
> >> --
> >> 2.7.4
> >>
> 
> 
> 
> -- 
> Regards,
> 
> Oleksandr Tyshchenko
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [RFC PATCH 06/31] cpufreq: make cpufreq driver more generalizable
  2017-12-04 19:34     ` Oleksandr Tyshchenko
@ 2017-12-04 22:46       ` Stefano Stabellini
  2017-12-05 19:29         ` Oleksandr Tyshchenko
  0 siblings, 1 reply; 108+ messages in thread
From: Stefano Stabellini @ 2017-12-04 22:46 UTC (permalink / raw)
  To: Oleksandr Tyshchenko
  Cc: Stefano Stabellini, Andrew Cooper, Oleksandr Dmytryshyn,
	Julien Grall, Oleksandr Tyshchenko, Jan Beulich, xen-devel

On Mon, 4 Dec 2017, Oleksandr Tyshchenko wrote:
> Hi, Stefano
> 
> On Sat, Dec 2, 2017 at 3:37 AM, Stefano Stabellini
> <sstabellini@kernel.org> wrote:
> > On Thu, 9 Nov 2017, Oleksandr Tyshchenko wrote:
> >> From: Oleksandr Dmytryshyn <oleksandr.dmytryshyn@globallogic.com>
> >>
> >> First implementation of the cpufreq driver has been
> >> written with x86 in mind. This patch makes possible
> >> the cpufreq driver be working on both x86 and arm
> >> architectures.
> >>
> >> This is a rebased version of the original patch:
> >> https://lists.xen.org/archives/html/xen-devel/2014-11/msg00932.html
> >>
> >> Signed-off-by: Oleksandr Dmytryshyn <oleksandr.dmytryshyn@globallogic.com>
> >> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> >> CC: Jan Beulich <jbeulich@suse.com>
> >> CC: Andrew Cooper <andrew.cooper3@citrix.com>
> >> CC: Stefano Stabellini <sstabellini@kernel.org>
> >> CC: Julien Grall <julien.grall@linaro.org>
> >> ---
> >>  xen/drivers/cpufreq/cpufreq.c    | 81 +++++++++++++++++++++++++++++++++++++---
> >>  xen/include/public/platform.h    |  1 +
> >>  xen/include/xen/processor_perf.h |  6 +++
> >>  3 files changed, 82 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/xen/drivers/cpufreq/cpufreq.c b/xen/drivers/cpufreq/cpufreq.c
> >> index ab909e2..64e1ae7 100644
> >> --- a/xen/drivers/cpufreq/cpufreq.c
> >> +++ b/xen/drivers/cpufreq/cpufreq.c
> >> @@ -42,7 +42,6 @@
> >>  #include <asm/io.h>
> >>  #include <asm/processor.h>
> >>  #include <asm/percpu.h>
> >> -#include <acpi/acpi.h>
> >>  #include <xen/cpufreq.h>
> >>
> >>  static unsigned int __read_mostly usr_min_freq;
> >> @@ -206,6 +205,7 @@ int cpufreq_add_cpu(unsigned int cpu)
> >>      } else {
> >>          /* domain sanity check under whatever coordination type */
> >>          firstcpu = cpumask_first(cpufreq_dom->map);
> >> +#ifdef CONFIG_ACPI
> >>          if ((perf->domain_info.coord_type !=
> >>              processor_pminfo[firstcpu]->perf.domain_info.coord_type) ||
> >>              (perf->domain_info.num_processors !=
> >> @@ -221,6 +221,19 @@ int cpufreq_add_cpu(unsigned int cpu)
> >>                  );
> >>              return -EINVAL;
> >>          }
> >> +#else /* !CONFIG_ACPI */
> >> +        if ((perf->domain_info.num_processors !=
> >> +            processor_pminfo[firstcpu]->perf.domain_info.num_processors)) {
> >> +
> >> +            printk(KERN_WARNING "cpufreq fail to add CPU%d:"
> >> +                   "incorrect num processors (%"PRIu64"), "
> >> +                   "expect(%"PRIu64")\n",
> >> +                   cpu, perf->domain_info.num_processors,
> >> +                   processor_pminfo[firstcpu]->perf.domain_info.num_processors
> >> +                );
> >> +            return -EINVAL;
> >> +        }
> >> +#endif /* CONFIG_ACPI */
> >
> > Why is this necessary? I am asking this question, because I think it
> > would be best to avoid more #ifdef's if we can avoid them, and some of
> > the code #ifdef'ed doesn't look very acpi specific (at least at first
> > sight). It doesn't look like this change is very beneficial. What am I
> > missing?
> 
> Probably, the original author of this patch wanted to avoid touching
> code and variables that make no sense or would not be used on non-ACPI
> systems.
> 
> Agreed, we can avoid this #ifdef as well as many others. I don't see
> an issue, for example, with printing default values for
> coord_type/num_entries/revision/etc.

I agree


> >
> >
> >>      }
> >>
> >>      if (!domexist || hw_all) {
> >> @@ -380,6 +393,7 @@ int cpufreq_del_cpu(unsigned int cpu)
> >>      return 0;
> >>  }
> >>
> >> +#ifdef CONFIG_ACPI
> >>  static void print_PCT(struct xen_pct_register *ptr)
> >>  {
> >>      printk("\t_PCT: descriptor=%d, length=%d, space_id=%d, "
> >> @@ -387,12 +401,14 @@ static void print_PCT(struct xen_pct_register *ptr)
> >>             ptr->descriptor, ptr->length, ptr->space_id, ptr->bit_width,
> >>             ptr->bit_offset, ptr->reserved, ptr->address);
> >>  }
> >> +#endif /* CONFIG_ACPI */
> >
> > same question
> 
> definitely omit #ifdef
> 
> >
> >
> >>  static void print_PSS(struct xen_processor_px *ptr, int count)
> >>  {
> >>      int i;
> >>      printk("\t_PSS: state_count=%d\n", count);
> >>      for (i=0; i<count; i++){
> >> +#ifdef CONFIG_ACPI
> >>          printk("\tState%d: %"PRId64"MHz %"PRId64"mW %"PRId64"us "
> >>                 "%"PRId64"us %#"PRIx64" %#"PRIx64"\n",
> >>                 i,
> >> @@ -402,15 +418,26 @@ static void print_PSS(struct xen_processor_px *ptr, int count)
> >>                 ptr[i].bus_master_latency,
> >>                 ptr[i].control,
> >>                 ptr[i].status);
> >> +#else /* !CONFIG_ACPI */
> >> +        printk("\tState%d: %"PRId64"MHz %"PRId64"us\n",
> >> +               i,
> >> +               ptr[i].core_frequency,
> >> +               ptr[i].transition_latency);
> >> +#endif /* CONFIG_ACPI */
> >>      }
> >>  }
> >
> > same question
> 
> same answer)
> 
> >
> >
> >>  static void print_PSD( struct xen_psd_package *ptr)
> >>  {
> >> +#ifdef CONFIG_ACPI
> >>      printk("\t_PSD: num_entries=%"PRId64" rev=%"PRId64
> >>             " domain=%"PRId64" coord_type=%"PRId64" num_processors=%"PRId64"\n",
> >>             ptr->num_entries, ptr->revision, ptr->domain, ptr->coord_type,
> >>             ptr->num_processors);
> >> +#else /* !CONFIG_ACPI */
> >> +    printk("\t_PSD:  domain=%"PRId64" num_processors=%"PRId64"\n",
> >> +           ptr->domain, ptr->num_processors);
> >> +#endif /* CONFIG_ACPI */
> >>  }
> >
> > same question
> 
> same answer)
> 
> >
> >
> >>  static void print_PPC(unsigned int platform_limit)
> >> @@ -418,13 +445,53 @@ static void print_PPC(unsigned int platform_limit)
> >>      printk("\t_PPC: %d\n", platform_limit);
> >>  }
> >>
> >> +static inline bool is_pss_data(struct xen_processor_performance *px)
> >> +{
> >> +#ifdef CONFIG_ACPI
> >> +    return px->flags & XEN_PX_PSS;
> >> +#else
> >> +    return px->flags == XEN_PX_DATA;
> >> +#endif
> >> +}
> >> +
> >> +static inline bool is_psd_data(struct xen_processor_performance *px)
> >> +{
> >> +#ifdef CONFIG_ACPI
> >> +    return px->flags & XEN_PX_PSD;
> >> +#else
> >> +    return px->flags == XEN_PX_DATA;
> >> +#endif
> >> +}
> >> +
> >> +static inline bool is_ppc_data(struct xen_processor_performance *px)
> >> +{
> >> +#ifdef CONFIG_ACPI
> >> +    return px->flags & XEN_PX_PPC;
> >> +#else
> >> +    return px->flags == XEN_PX_DATA;
> >> +#endif
> >> +}
> >> +
> >> +static inline bool is_all_data(struct xen_processor_performance *px)
> >> +{
> >> +#ifdef CONFIG_ACPI
> >> +    return px->flags == ( XEN_PX_PCT | XEN_PX_PSS | XEN_PX_PSD | XEN_PX_PPC );
> >> +#else
> >> +    return px->flags == XEN_PX_DATA;
> >> +#endif
> >> +}
> >
> > Could you please explain here and in the commit message the idea behind
> > this? It looks like we want to get rid of the different flags on
> > non-ACPI systems? Why can't we reuse the same flags?
> 
> You are right. It does indeed look redundant.
> I will drop all these helpers and reuse the existing flags. If we are
> pretending to be a P-state driver and uploading the same P-state data
> that [1] uploads, then I will just reuse the existing flags. It will
> cost me nothing.

Makes sense
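
As a rough illustration of reusing the existing flags, the checks reduce to
plain bit tests. This is a standalone sketch under stated assumptions: the
XEN_PX_PSS/PPC/PSD values are taken from the quoted platform.h hunk,
XEN_PX_PCT = 1 is assumed because the hunk does not show it, and the
demo_* helpers are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* PSS/PPC/PSD as in the quoted platform.h hunk; PCT = 1 is an assumption. */
#define XEN_PX_PCT 1
#define XEN_PX_PSS 2
#define XEN_PX_PPC 4
#define XEN_PX_PSD 8

/* True when this upload carries the P-state table. */
static bool demo_has_pss(uint32_t flags)
{
    return (flags & XEN_PX_PSS) != 0;
}

/* True only when all four pieces arrive in a single request. */
static bool demo_is_complete(uint32_t flags)
{
    return flags == (XEN_PX_PCT | XEN_PX_PSS | XEN_PX_PSD | XEN_PX_PPC);
}
```

A non-ACPI caller that sets the same four bits would pass both checks
without any new XEN_PX_DATA value.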


> May I ask you to take a look at this patch [2]? It looks like a hack
> right now, but what would be the proper way to do it?
> 
> [1] https://github.com/torvalds/linux/blob/master/drivers/xen/xen-acpi-processor.c#L210
> [2] https://www.mail-archive.com/xen-devel@lists.xen.org/msg128410.html

Regarding [2]:

This is something that needs to be agreed with the x86 maintainers.
However, I would move the copy_from_guest (and everything related to
parsing caller provided arguments) to
xen/arch/x86/platform_hypercall.c:do_platform_op.

Then, I would make set_px_pminfo look like a regular function that
takes regular arguments (no XEN_GUEST_HANDLEs), so that it can be called
on ARM without having to "fake" a hypercall.
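
A possible shape for that split, as a standalone sketch only: struct
demo_px, demo_set_px() and demo_hypercall_set_px() are hypothetical, and
memcpy() stands in for copy_from_guest(). The state_count <= 1 rejection
mirrors the capability check in the quoted patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for struct xen_processor_px. */
struct demo_px {
    uint64_t core_frequency;      /* MHz */
    uint64_t transition_latency;  /* us */
};

/* Core logic: plain arguments only, callable from any in-hypervisor code. */
static int demo_set_px(struct demo_px *dst, size_t max,
                       const struct demo_px *states, size_t count)
{
    if ( !states || count <= 1 || count > max )
        return -1;

    memcpy(dst, states, count * sizeof(*states));
    return 0;
}

/*
 * Hypercall-side wrapper: the only place that touches caller-provided
 * memory. memcpy() stands in for copy_from_guest() in this sketch.
 */
static int demo_hypercall_set_px(struct demo_px *dst, size_t max,
                                 const void *guest_buf, size_t count)
{
    struct demo_px tmp[8];

    if ( !guest_buf || count > 8 )
        return -1;

    memcpy(tmp, guest_buf, count * sizeof(tmp[0]));
    return demo_set_px(dst, max, tmp, count);
}
```

An ARM driver would then call the core function directly with data it
already holds in hypervisor memory.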
 

> >
> >
> >>  int set_px_pminfo(uint32_t acpi_id, struct xen_processor_performance *dom0_px_info)
> >>  {
> >>      int ret=0, cpuid;
> >>      struct processor_pminfo *pmpt;
> >>      struct processor_performance *pxpt;
> >>
> >> +#ifdef CONFIG_ACPI
> >>      cpuid = get_cpu_id(acpi_id);
> >> +#else
> >> +    cpuid = acpi_id;
> >> +#endif
> >
> > Rather than an #ifdef here, I would probably generalize the get_cpu_id
> > function.
> 
> Would the following stub be enough?
> 
> diff --git a/xen/include/xen/acpi.h b/xen/include/xen/acpi.h
> index 9409350..4aab41e 100644
> --- a/xen/include/xen/acpi.h
> +++ b/xen/include/xen/acpi.h
> @@ -123,7 +123,11 @@ static inline int acpi_boot_table_init(void)
> 
>  #endif         /*!CONFIG_ACPI*/
> 
> +#ifdef CONFIG_ACPI
>  int get_cpu_id(u32 acpi_id);
> +#else
> +static inline int get_cpu_id(u32 acpi_id) { return acpi_id; }
> +#endif
> 
>  unsigned int acpi_register_gsi (u32 gsi, int edge_level, int active_high_low);
>  int acpi_gsi_to_irq (u32 gsi, unsigned int *irq);

Yes, I think that's OK.


> >
> >
> >>      if ( cpuid < 0 || !dom0_px_info)
> >>      {
> >>          ret = -EINVAL;
> >> @@ -446,6 +513,8 @@ int set_px_pminfo(uint32_t acpi_id, struct xen_processor_performance *dom0_px_in
> >>          processor_pminfo[cpuid] = pmpt;
> >>      }
> >>      pxpt = &pmpt->perf;
> >> +
> >> +#ifdef CONFIG_ACPI
> >>      pmpt->acpi_id = acpi_id;
> >>      pmpt->id = cpuid;
> >>
> >> @@ -472,8 +541,9 @@ int set_px_pminfo(uint32_t acpi_id, struct xen_processor_performance *dom0_px_in
> >>              print_PCT(&pxpt->status_register);
> >>          }
> >>      }
> >> +#endif /* CONFIG_ACPI */
> 
> BTW, at first sight we could omit this #ifdef too, as long as we take
> care that the space_id check passes.
> 
> >>
> >> -    if ( dom0_px_info->flags & XEN_PX_PSS )
> >> +    if ( is_pss_data(dom0_px_info) )
> >>      {
> >>          /* capability check */
> >>          if (dom0_px_info->state_count <= 1)
> >> @@ -500,7 +570,7 @@ int set_px_pminfo(uint32_t acpi_id, struct xen_processor_performance *dom0_px_in
> >>              print_PSS(pxpt->states,pxpt->state_count);
> >>      }
> >>
> >> -    if ( dom0_px_info->flags & XEN_PX_PSD )
> >> +    if ( is_psd_data(dom0_px_info) )
> >>      {
> >>          /* check domain coordination */
> >>          if (dom0_px_info->shared_type != CPUFREQ_SHARED_TYPE_ALL &&
> >> @@ -520,7 +590,7 @@ int set_px_pminfo(uint32_t acpi_id, struct xen_processor_performance *dom0_px_in
> >>              print_PSD(&pxpt->domain_info);
> >>      }
> >>
> >> -    if ( dom0_px_info->flags & XEN_PX_PPC )
> >> +    if ( is_ppc_data(dom0_px_info) )
> >>      {
> >>          pxpt->platform_limit = dom0_px_info->platform_limit;
> >>
> >> @@ -534,8 +604,7 @@ int set_px_pminfo(uint32_t acpi_id, struct xen_processor_performance *dom0_px_in
> >>          }
> >>      }
> >>
> >> -    if ( dom0_px_info->flags == ( XEN_PX_PCT | XEN_PX_PSS |
> >> -                XEN_PX_PSD | XEN_PX_PPC ) )
> >> +    if ( is_all_data(dom0_px_info) )
> >>      {
> >>          pxpt->init = XEN_PX_INIT;
> >>
> >> diff --git a/xen/include/public/platform.h b/xen/include/public/platform.h
> >> index 94dbc3f..328579c 100644
> >> --- a/xen/include/public/platform.h
> >> +++ b/xen/include/public/platform.h
> >> @@ -384,6 +384,7 @@ DEFINE_XEN_GUEST_HANDLE(xenpf_getidletime_t);
> >>  #define XEN_PX_PSS   2
> >>  #define XEN_PX_PPC   4
> >>  #define XEN_PX_PSD   8
> >> +#define XEN_PX_DATA  16
> >>
> >>  struct xen_power_register {
> >>      uint32_t     space_id;
> >> diff --git a/xen/include/xen/processor_perf.h b/xen/include/xen/processor_perf.h
> >> index d8a1ba6..afdccf2 100644
> >> --- a/xen/include/xen/processor_perf.h
> >> +++ b/xen/include/xen/processor_perf.h
> >> @@ -3,7 +3,9 @@
> >>
> >>  #include <public/platform.h>
> >>  #include <public/sysctl.h>
> >> +#ifdef CONFIG_ACPI
> >>  #include <xen/acpi.h>
> >> +#endif
> >>
> >>  #define XEN_PX_INIT 0x80000000
> >>
> >> @@ -24,8 +26,10 @@ int  cpufreq_del_cpu(unsigned int);
> >>  struct processor_performance {
> >>      uint32_t state;
> >>      uint32_t platform_limit;
> >> +#ifdef CONFIG_ACPI
> >>      struct xen_pct_register control_register;
> >>      struct xen_pct_register status_register;
> >> +#endif
> >>      uint32_t state_count;
> >>      struct xen_processor_px *states;
> >>      struct xen_psd_package domain_info;
> >> @@ -35,8 +39,10 @@ struct processor_performance {
> >>  };
> >>
> >>  struct processor_pminfo {
> >> +#ifdef CONFIG_ACPI
> >>      uint32_t acpi_id;
> >>      uint32_t id;
> >> +#endif
> >>      struct processor_performance    perf;
> >>  };
> 
> There will be no changes here either.



^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [RFC PATCH 09/31] xen/device-tree: Add dt_property_for_each_string macros
  2017-11-09 17:09 ` [RFC PATCH 09/31] xen/device-tree: Add dt_property_for_each_string macros Oleksandr Tyshchenko
@ 2017-12-04 23:24   ` Stefano Stabellini
  2017-12-05 14:19     ` Oleksandr Tyshchenko
  0 siblings, 1 reply; 108+ messages in thread
From: Stefano Stabellini @ 2017-12-04 23:24 UTC (permalink / raw)
  To: Oleksandr Tyshchenko
  Cc: xen-devel, Stefano Stabellini, Julien Grall, Oleksandr Tyshchenko

On Thu, 9 Nov 2017, Oleksandr Tyshchenko wrote:
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> 
> This is a port from Linux.

When you port stuff from Linux you have to retain the original
copyright. Please add the original Signed-off-by lines (you actually
have to use git log and git blame to narrow them down for copyright
reasons):

  Signed-off-by: Stephen Warren <swarren@nvidia.com>
  Signed-off-by: Linus Walleij <linus.walleij@linaro.org>

With those:

Acked-by: Stefano Stabellini <sstabellini@kernel.org>
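
To show what the iterator in this patch walks over: a string-list property
is a run of NUL-terminated strings stored back to back. The sketch below
is standalone C, not Xen code; struct demo_prop is a hypothetical,
cut-down stand-in for struct dt_property, and demo_next_string() follows
the same stepping logic as the quoted dt_property_next_string().

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, cut-down stand-in for struct dt_property. */
struct demo_prop {
    const char *value;   /* NUL-separated strings, back to back */
    size_t length;       /* total bytes, including each NUL */
};

/* Same stepping logic as the quoted dt_property_next_string(). */
static const char *demo_next_string(const struct demo_prop *prop,
                                    const char *cur)
{
    if ( !prop )
        return NULL;

    if ( !cur )
        return prop->value;

    cur += strlen(cur) + 1;
    if ( cur >= prop->value + prop->length )
        return NULL;

    return cur;
}

/* Count the strings by walking the property the way the macro does. */
static unsigned int demo_count_strings(const struct demo_prop *prop)
{
    const char *s;
    unsigned int n = 0;

    for ( s = demo_next_string(prop, NULL); s; s = demo_next_string(prop, s) )
        n++;

    return n;
}
```

Because the first call returns prop->value, a missing property or an empty
value ends the loop before the body runs.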


> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> CC: Stefano Stabellini <sstabellini@kernel.org>
> CC: Julien Grall <julien.grall@linaro.org>
> ---
>  xen/common/device_tree.c      | 18 ++++++++++++++++++
>  xen/include/xen/device_tree.h | 21 +++++++++++++++++++++
>  2 files changed, 39 insertions(+)
> 
> diff --git a/xen/common/device_tree.c b/xen/common/device_tree.c
> index 60b0095..08f8072 100644
> --- a/xen/common/device_tree.c
> +++ b/xen/common/device_tree.c
> @@ -208,6 +208,24 @@ int dt_property_read_string(const struct dt_device_node *np,
>      return 0;
>  }
>  
> +const char *dt_property_next_string(const struct dt_property *prop,
> +                                    const char *cur)
> +{
> +    const void *curv = cur;
> +
> +    if ( !prop )
> +        return NULL;
> +
> +    if ( !cur )
> +        return prop->value;
> +
> +    curv += strlen(cur) + 1;
> +    if ( curv >= prop->value + prop->length )
> +        return NULL;
> +
> +    return curv;
> +}
> +
>  bool_t dt_device_is_compatible(const struct dt_device_node *device,
>                                 const char *compat)
>  {
> diff --git a/xen/include/xen/device_tree.h b/xen/include/xen/device_tree.h
> index 738f1b6..9e0931c 100644
> --- a/xen/include/xen/device_tree.h
> +++ b/xen/include/xen/device_tree.h
> @@ -420,6 +420,27 @@ int dt_property_read_string(const struct dt_device_node *np,
>                              const char *propname, const char **out_string);
>  
>  /**
> + * dt_property_for_each_string - Iterate over an array of strings within
> + * a property with a given name for a given node.
> + *
> + * Example:
> + *
> + * struct dt_property *prop;
> + * const char *s;
> + *
> + * dt_property_for_each_string(np, "propname", prop, s)
> + *     printk("String value: %s\n", s);
> + */
> +const char *dt_property_next_string(const struct dt_property *prop,
> +                                    const char *cur);
> +
> +#define dt_property_for_each_string(np, propname, prop, s)    \
> +    for (prop = dt_find_property(np, propname, NULL),         \
> +        s = dt_property_next_string(prop, NULL);              \
> +        s;                                                    \
> +        s = dt_property_next_string(prop, s))
> +
> +/**
>   * Checks if the given "compat" string matches one of the strings in
>   * the device's "compatible" property
>   */
> -- 
> 2.7.4
> 


^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [RFC PATCH 10/31] xen/device-tree: Add dt_property_read_u32_index helper
  2017-11-09 17:10 ` [RFC PATCH 10/31] xen/device-tree: Add dt_property_read_u32_index helper Oleksandr Tyshchenko
@ 2017-12-04 23:29   ` Stefano Stabellini
  0 siblings, 0 replies; 108+ messages in thread
From: Stefano Stabellini @ 2017-12-04 23:29 UTC (permalink / raw)
  To: Oleksandr Tyshchenko
  Cc: xen-devel, Stefano Stabellini, Julien Grall, Oleksandr Tyshchenko

On Thu, 9 Nov 2017, Oleksandr Tyshchenko wrote:
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> 
> This is a port from Linux.
> 
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> CC: Stefano Stabellini <sstabellini@kernel.org>
> CC: Julien Grall <julien.grall@linaro.org>

Same here: please add the original Signed-off-bys and also the name of
the property in Linux.
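
For readers unfamiliar with the helper: device tree cells are stored
big-endian, so reading the n-th u32 is a bounds check followed by a byte
swap. A minimal standalone sketch, assuming a raw byte buffer instead of
struct dt_property; demo_be32_decode() and demo_read_u32_index() are
hypothetical names, and a single -1 replaces the distinct error codes of
the quoted patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode one big-endian 32-bit cell, independent of host endianness. */
static uint32_t demo_be32_decode(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/*
 * Read the index-th u32 from a raw property buffer of len bytes.
 * Returns 0 on success, -1 if the buffer is missing or too short
 * (the quoted patch distinguishes -EINVAL/-ENODATA/-EOVERFLOW instead).
 */
static int demo_read_u32_index(const uint8_t *value, size_t len,
                               uint32_t index, uint32_t *out)
{
    if ( !value || index >= len / sizeof(uint32_t) )
        return -1;

    *out = demo_be32_decode(value + (size_t)index * sizeof(uint32_t));
    return 0;
}
```

Comparing index against len / sizeof(uint32_t) avoids the wrap-around that
(index + 1) * size could hit for very large index values.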

> ---
>  xen/common/device_tree.c      | 52 +++++++++++++++++++++++++++++++++++++++++++
>  xen/include/xen/device_tree.h | 20 +++++++++++++++++
>  2 files changed, 72 insertions(+)
> 
> diff --git a/xen/common/device_tree.c b/xen/common/device_tree.c
> index 08f8072..0fa654e 100644
> --- a/xen/common/device_tree.c
> +++ b/xen/common/device_tree.c
> @@ -176,6 +176,58 @@ bool_t dt_property_read_u32(const struct dt_device_node *np,
>      return 1;
>  }
>  
> +/**
> + * dt_find_property_value_of_size
> + *
> + * @np:       device node from which the property value is to be read.
> + * @propname: name of the property to be searched.
> + * @min:      minimum allowed length of property value
> + * @max:      maximum allowed length of property value (0 means unlimited)
> + * @len:      if !=NULL, actual length is written to here
> + *
> + * Search for a property in a device node and validate the requested size.
> + * Returns the property value on success, -EINVAL if the property does not
> + * exist, -ENODATA if property does not have a value, and -EOVERFLOW if the
> + * property data is too small or too large.
> + */
> +static void *dt_find_property_value_of_size(const struct dt_device_node *np,
> +                                            const char *propname,
> +                                            u32 min, u32 max, size_t *len)
> +{
> +    const struct dt_property *prop = dt_find_property(np, propname, NULL);
> +
> +    if ( !prop )
> +        return ERR_PTR(-EINVAL);
> +    if ( !prop->value )
> +        return ERR_PTR(-ENODATA);
> +    if ( prop->length < min )
> +        return ERR_PTR(-EOVERFLOW);
> +    if ( max && prop->length > max )
> +        return ERR_PTR(-EOVERFLOW);
> +
> +    if ( len )
> +        *len = prop->length;
> +
> +    return prop->value;
> +}
> +
> +int dt_property_read_u32_index(const struct dt_device_node *np,
> +                               const char *propname,
> +                               u32 index, u32 *out_value)
> +{
> +    const u32 *val =
> +        dt_find_property_value_of_size(np, propname,
> +                                       ((index + 1) * sizeof(*out_value)),
> +                                       0,
> +                                       NULL);
> +
> +    if ( IS_ERR(val) )
> +        return PTR_ERR(val);
> +
> +    *out_value = be32_to_cpup(((__be32 *)val) + index);
> +
> +    return 0;
> +}
>  
>  bool_t dt_property_read_u64(const struct dt_device_node *np,
>                           const char *name, u64 *out_value)
> diff --git a/xen/include/xen/device_tree.h b/xen/include/xen/device_tree.h
> index 9e0931c..87b4b67 100644
> --- a/xen/include/xen/device_tree.h
> +++ b/xen/include/xen/device_tree.h
> @@ -374,6 +374,26 @@ const struct dt_property *dt_find_property(const struct dt_device_node *np,
>   */
>  bool_t dt_property_read_u32(const struct dt_device_node *np,
>                              const char *name, u32 *out_value);
> +
> +/**
> + * dt_property_read_u32_index - Find and read a u32 from a multi-value property.
> + *
> + * @np:        device node from which the property value is to be read.
> + * @propname:  name of the property to be searched.
> + * @index:     index of the u32 in the list of values
> + * @out_value: pointer to return value, modified only if no error.
> + *
> + * Search for a property in a device node and read nth 32-bit value from
> + * it. Returns 0 on success, -EINVAL if the property does not exist,
> + * -ENODATA if property does not have a value, and -EOVERFLOW if the
> + * property data isn't large enough.
> + *
> + * The out_value is modified only if a valid u32 value can be decoded.
> + */
> +int dt_property_read_u32_index(const struct dt_device_node *np,
> +                               const char *propname,
> +                               u32 index, u32 *out_value);
> +
>  /**
>   * dt_property_read_u64 - Helper to read a u64 property.
>   * @np: node to get the value
> -- 
> 2.7.4
> 


^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [RFC PATCH 11/31] xen/device-tree: Add dt_property_count_elems_of_size helper
  2017-11-09 17:10 ` [RFC PATCH 11/31] xen/device-tree: Add dt_property_count_elems_of_size helper Oleksandr Tyshchenko
@ 2017-12-04 23:29   ` Stefano Stabellini
  0 siblings, 0 replies; 108+ messages in thread
From: Stefano Stabellini @ 2017-12-04 23:29 UTC (permalink / raw)
  To: Oleksandr Tyshchenko
  Cc: xen-devel, Stefano Stabellini, Julien Grall, Oleksandr Tyshchenko

On Thu, 9 Nov 2017, Oleksandr Tyshchenko wrote:
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> 
> This is a port from Linux.
> 
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> CC: Stefano Stabellini <sstabellini@kernel.org>
> CC: Julien Grall <julien.grall@linaro.org>

Same here
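
The counting rule in this helper is small enough to show on its own. A
standalone sketch, assuming only the property length is known;
demo_count_elems() is a hypothetical name, and the elem_size == 0 guard is
an addition that the quoted patch does not have.

```c
#include <stddef.h>

/*
 * Number of elem_size-byte elements in len bytes, or -1 if len is not an
 * exact multiple of elem_size. The zero check is an extra guard that is
 * not part of the quoted patch.
 */
static int demo_count_elems(size_t len, size_t elem_size)
{
    if ( elem_size == 0 || len % elem_size != 0 )
        return -1;

    return (int)(len / elem_size);
}
```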

> ---
>  xen/common/device_tree.c      | 20 ++++++++++++++++++++
>  xen/include/xen/device_tree.h | 15 +++++++++++++++
>  2 files changed, 35 insertions(+)
> 
> diff --git a/xen/common/device_tree.c b/xen/common/device_tree.c
> index 0fa654e..7b4cad3 100644
> --- a/xen/common/device_tree.c
> +++ b/xen/common/device_tree.c
> @@ -278,6 +278,26 @@ const char *dt_property_next_string(const struct dt_property *prop,
>      return curv;
>  }
>  
> +int dt_property_count_elems_of_size(const struct dt_device_node *np,
> +                                    const char *propname, int elem_size)
> +{
> +    const struct dt_property *prop = dt_find_property(np, propname, NULL);
> +
> +    if ( !prop )
> +        return -EINVAL;
> +    if ( !prop->value )
> +        return -ENODATA;
> +
> +    if ( prop->length % elem_size != 0 )
> +    {
> +        printk("%s: size of %s is not a multiple of %d\n", np->full_name,
> +               propname, elem_size);
> +        return -EINVAL;
> +    }
> +
> +    return prop->length / elem_size;
> +}
> +
>  bool_t dt_device_is_compatible(const struct dt_device_node *device,
>                                 const char *compat)
>  {
> diff --git a/xen/include/xen/device_tree.h b/xen/include/xen/device_tree.h
> index 87b4b67..e2d7346 100644
> --- a/xen/include/xen/device_tree.h
> +++ b/xen/include/xen/device_tree.h
> @@ -461,6 +461,21 @@ const char *dt_property_next_string(const struct dt_property *prop,
>          s = dt_property_next_string(prop, s))
>  
>  /**
> + * dt_property_count_elems_of_size - Count the number of elements in a property
> + *
> + * @np:        device node from which the property value is to be read.
> + * @propname:  name of the property to be searched.
> + * @elem_size: size of the individual element
> + *
> + * Search for a property in a device node and count the number of elements of
> + * size elem_size in it. Returns number of elements on success, -EINVAL if the
> + * property does not exist or its length does not match a multiple of elem_size
> + * and -ENODATA if the property does not have a value.
> + */
> +int dt_property_count_elems_of_size(const struct dt_device_node *np,
> +                                    const char *propname, int elem_size);
> +
> +/**
>   * Checks if the given "compat" string matches one of the strings in
>   * the device's "compatible" property
>   */
> -- 
> 2.7.4
> 


^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [RFC PATCH 12/31] xen/device-tree: Add dt_property_read_string_helper and friends
  2017-11-09 17:10 ` [RFC PATCH 12/31] xen/device-tree: Add dt_property_read_string_helper and friends Oleksandr Tyshchenko
@ 2017-12-04 23:29   ` Stefano Stabellini
  0 siblings, 0 replies; 108+ messages in thread
From: Stefano Stabellini @ 2017-12-04 23:29 UTC (permalink / raw)
  To: Oleksandr Tyshchenko
  Cc: xen-devel, Stefano Stabellini, Julien Grall, Oleksandr Tyshchenko

On Thu, 9 Nov 2017, Oleksandr Tyshchenko wrote:
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> 
> This is a port from Linux.
> 
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> CC: Stefano Stabellini <sstabellini@kernel.org>
> CC: Julien Grall <julien.grall@linaro.org>

Same here
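
The skip/sz semantics of the helper are easier to follow on a plain buffer.
A standalone sketch, not Xen code: demo_read_strings() is a hypothetical
name, memchr() is used here in place of the patch's strnlen(), and a
single -1 replaces the distinct error codes.

```c
#include <stddef.h>
#include <string.h>

/*
 * Walk a buffer of back-to-back NUL-terminated strings. With out == NULL
 * the strings are only counted; otherwise up to sz pointers are stored,
 * starting at string number skip. Returns the number of strings counted or
 * stored, or -1 if there is nothing to return or a terminator is missing.
 */
static int demo_read_strings(const char *buf, size_t len,
                             const char **out, size_t sz, int skip)
{
    const char *p = buf, *end, *nul;
    int i;

    if ( !buf || skip < 0 )
        return -1;
    end = buf + len;

    for ( i = 0; p < end && (!out || (size_t)i < (size_t)skip + sz); i++ )
    {
        nul = memchr(p, '\0', (size_t)(end - p));
        if ( !nul )
            return -1;           /* unterminated string */
        if ( out && i >= skip )
            *out++ = p;
        p = nul + 1;
    }

    i -= skip;
    return i <= 0 ? -1 : i;
}
```

Passing out == NULL turns the same loop into a counter, which is how one
function can back both the "count" and the "read at index" wrappers.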


> ---
>  xen/common/device_tree.c      | 27 +++++++++++++++
>  xen/include/xen/device_tree.h | 81 +++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 108 insertions(+)
> 
> diff --git a/xen/common/device_tree.c b/xen/common/device_tree.c
> index 7b4cad3..827eadd 100644
> --- a/xen/common/device_tree.c
> +++ b/xen/common/device_tree.c
> @@ -260,6 +260,33 @@ int dt_property_read_string(const struct dt_device_node *np,
>      return 0;
>  }
>  
> +int dt_property_read_string_helper(const struct dt_device_node *np,
> +                                   const char *propname, const char **out_strs,
> +                                   size_t sz, int skip)
> +{
> +    const struct dt_property *prop = dt_find_property(np, propname, NULL);
> +    int l = 0, i = 0;
> +    const char *p, *end;
> +
> +    if ( !prop )
> +        return -EINVAL;
> +    if ( !prop->value )
> +        return -ENODATA;
> +    p = prop->value;
> +    end = p + prop->length;
> +
> +    for ( i = 0; p < end && (!out_strs || i < skip + sz); i++, p += l )
> +    {
> +        l = strnlen(p, end - p) + 1;
> +        if ( p + l > end )
> +            return -EILSEQ;
> +        if ( out_strs && i >= skip )
> +            *out_strs++ = p;
> +    }
> +    i -= skip;
> +    return i <= 0 ? -ENODATA : i;
> +}
> +
>  const char *dt_property_next_string(const struct dt_property *prop,
>                                      const char *cur)
>  {
> diff --git a/xen/include/xen/device_tree.h b/xen/include/xen/device_tree.h
> index e2d7346..7e51a7a 100644
> --- a/xen/include/xen/device_tree.h
> +++ b/xen/include/xen/device_tree.h
> @@ -440,6 +440,87 @@ int dt_property_read_string(const struct dt_device_node *np,
>                              const char *propname, const char **out_string);
>  
>  /**
> + * dt_property_read_string_helper() - Utility helper for parsing string properties
> + * @np:       device node from which the property value is to be read.
> + * @propname: name of the property to be searched.
> + * @out_strs: output array of string pointers.
> + * @sz:       number of array elements to read.
> + * @skip:     Number of strings to skip over at beginning of list.
> + *
> + * Don't call this function directly. It is a utility helper for the
> + * dt_property_read_string*() family of functions.
> + */
> +int dt_property_read_string_helper(const struct dt_device_node *np,
> +                                   const char *propname, const char **out_strs,
> +                                   size_t sz, int skip);
> +
> +/**
> + * dt_property_read_string_array() - Read an array of strings from a multiple
> + *                                   strings property.
> + * @np:       device node from which the property value is to be read.
> + * @propname: name of the property to be searched.
> + * @out_strs: output array of string pointers.
> + * @sz:       number of array elements to read.
> + *
> + * Search for a property in a device tree node and retrieve a list of
> + * terminated string values (pointer to data, not a copy) in that property.
> + *
> + * If @out_strs is NULL, the number of strings in the property is returned.
> + */
> +static inline int dt_property_read_string_array(const struct dt_device_node *np,
> +                                                const char *propname,
> +                                                const char **out_strs,
> +                                                size_t sz)
> +{
> +	return dt_property_read_string_helper(np, propname, out_strs, sz, 0);
> +}
> +
> +/**
> + * dt_property_count_strings() - Find and return the number of strings from a
> + *                               multiple strings property.
> + * @np:       device node from which the property value is to be read.
> + * @propname: name of the property to be searched.
> + *
> + * Search for a property in a device tree node and retrieve the number of null
> + * terminated string contain in it. Returns the number of strings on
> + * success, -EINVAL if the property does not exist, -ENODATA if property
> + * does not have a value, and -EILSEQ if the string is not null-terminated
> + * within the length of the property data.
> + */
> +static inline int dt_property_count_strings(const struct dt_device_node *np,
> +                                            const char *propname)
> +{
> +	return dt_property_read_string_helper(np, propname, NULL, 0, 0);
> +}
> +
> +/**
> + * dt_property_read_string_index() - Find and read a string from a multiple
> + *                                   strings property.
> + * @np:         device node from which the property value is to be read.
> + * @propname:   name of the property to be searched.
> + * @index:      index of the string in the list of strings
> + * @out_string: pointer to null terminated return string, modified only if
> + *              return value is 0.
> + *
> + * Search for a property in a device tree node and retrieve a null
> + * terminated string value (pointer to data, not a copy) in the list of strings
> + * contained in that property.
> + * Returns 0 on success, -EINVAL if the property does not exist, -ENODATA if
> + * property does not have a value, and -EILSEQ if the string is not
> + * null-terminated within the length of the property data.
> + *
> + * The out_string pointer is modified only if a valid string can be decoded.
> + */
> +static inline int dt_property_read_string_index(const struct dt_device_node *np,
> +                                                const char *propname,
> +                                                int index, const char **output)
> +{
> +	int rc = dt_property_read_string_helper(np, propname, output, 1, index);
> +
> +	return rc < 0 ? rc : 0;
> +}
> +
> +/**
>   * dt_property_for_each_string - Iterate over an array of strings within
>   * a property with a given name for a given node.
>   *
> -- 
> 2.7.4
> 
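
For readers following the helper quoted above, here is a minimal standalone sketch of the same walk over a NUL-separated string list, written outside of Xen. The name `read_strings` and its buffer/length parameters are hypothetical stand-ins for the property lookup, and the error codes come from `<errno.h>` rather than Xen's headers; it is an illustration under those assumptions, not the code from the patch.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Walk a buffer of back-to-back NUL-terminated strings, the layout a
 * device-tree string-list property uses. Hypothetical standalone analogue
 * of dt_property_read_string_helper(); not Xen code.
 *
 * Returns the number of strings stored (or counted, when out is NULL),
 * -ENODATA if nothing is left after skipping 'skip' strings, and -EILSEQ
 * if a string is not terminated inside the buffer.
 */
static int read_strings(const char *buf, size_t len,
                        const char **out, size_t sz, int skip)
{
    const char *p = buf, *end = buf + len;
    size_t l = 0;
    int i;

    for ( i = 0; p < end && (!out || (size_t)i < (size_t)skip + sz);
          i++, p += l )
    {
        const char *nul = memchr(p, '\0', (size_t)(end - p));

        /* Length including the terminator; one too many if none found. */
        l = (nul ? (size_t)(nul - p) : (size_t)(end - p)) + 1;
        if ( l > (size_t)(end - p) )
            return -EILSEQ;
        if ( out && i >= skip )
            *out++ = p;            /* pointer into the buffer, not a copy */
    }
    i -= skip;
    return i <= 0 ? -ENODATA : i;
}
```

The three inline wrappers in the patch map onto this as follows: counting passes `out == NULL`, the array read passes `skip == 0`, and the index read passes `sz == 1` with `skip == index`.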

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [RFC PATCH 13/31] xen/arm: Add driver_data field to struct device
  2017-11-09 17:10 ` [RFC PATCH 13/31] xen/arm: Add driver_data field to struct device Oleksandr Tyshchenko
@ 2017-12-04 23:31   ` Stefano Stabellini
  2017-12-05 11:26   ` Julien Grall
  1 sibling, 0 replies; 108+ messages in thread
From: Stefano Stabellini @ 2017-12-04 23:31 UTC (permalink / raw)
  To: Oleksandr Tyshchenko
  Cc: xen-devel, Stefano Stabellini, Julien Grall, Oleksandr Tyshchenko

On Thu, 9 Nov 2017, Oleksandr Tyshchenko wrote:
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> 
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> CC: Stefano Stabellini <sstabellini@kernel.org>
> CC: Julien Grall <julien.grall@linaro.org>

Acked-by: Stefano Stabellini <sstabellini@kernel.org>

> ---
>  xen/include/asm-arm/device.h | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/xen/include/asm-arm/device.h b/xen/include/asm-arm/device.h
> index 6734ae8..3e2f34a 100644
> --- a/xen/include/asm-arm/device.h
> +++ b/xen/include/asm-arm/device.h
> @@ -20,6 +20,7 @@ struct device
>      struct dt_device_node *of_node; /* Used by drivers imported from Linux */
>  #endif
>      struct dev_archdata archdata;
> +    void *driver_data;
>  };
>  
>  typedef struct device device_t;
> -- 
> 2.7.4
> 

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [RFC PATCH 16/31] arm: add SMC wrapper that is compatible with SMCCC
  2017-11-09 17:10 ` [RFC PATCH 16/31] arm: add SMC wrapper that is compatible with SMCCC Oleksandr Tyshchenko
@ 2017-12-05  2:30   ` Stefano Stabellini
  2017-12-05 15:33     ` Volodymyr Babchuk
  2017-12-05 14:58   ` Julien Grall
  1 sibling, 1 reply; 108+ messages in thread
From: Stefano Stabellini @ 2017-12-05  2:30 UTC (permalink / raw)
  To: Oleksandr Tyshchenko
  Cc: Edgar E. Iglesias, xen-devel, Stefano Stabellini,
	Volodymyr Babchuk, Julien Grall

On Thu, 9 Nov 2017, Oleksandr Tyshchenko wrote:
> From: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
> 
> Existing SMC wrapper call_smc() allows only 4 parameters and
> returns only one value. This is enough for existing
> use in PSCI code, but TEE mediator will need a call that is
> fully compatible with ARM SMCCC.
> This patch adds this call for both arm32 and arm64.
> 
> There was similar patch by Edgar E. Iglesias ([1]), but looks
> like it is abandoned.
> 
> [1] https://lists.xenproject.org/archives/html/xen-devel/2017-02/msg00636.html
> 
> CC: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>
> 
> Signed-off-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
> CC: Stefano Stabellini <sstabellini@kernel.org>
> CC: Julien Grall <julien.grall@linaro.org>
> ---
>  xen/arch/arm/arm32/Makefile     |  1 +
>  xen/arch/arm/arm32/smc.S        | 32 ++++++++++++++++++++++++++++++++
>  xen/arch/arm/arm64/Makefile     |  1 +
>  xen/arch/arm/arm64/smc.S        | 29 +++++++++++++++++++++++++++++
>  xen/include/asm-arm/processor.h |  4 ++++
>  5 files changed, 67 insertions(+)
>  create mode 100644 xen/arch/arm/arm32/smc.S
>  create mode 100644 xen/arch/arm/arm64/smc.S
> 
> diff --git a/xen/arch/arm/arm32/Makefile b/xen/arch/arm/arm32/Makefile
> index 0ac254f..a2362f3 100644
> --- a/xen/arch/arm/arm32/Makefile
> +++ b/xen/arch/arm/arm32/Makefile
> @@ -8,6 +8,7 @@ obj-y += insn.o
>  obj-$(CONFIG_LIVEPATCH) += livepatch.o
>  obj-y += proc-v7.o proc-caxx.o
>  obj-y += smpboot.o
> +obj-y += smc.o
>  obj-y += traps.o
>  obj-y += vfp.o
>  
> diff --git a/xen/arch/arm/arm32/smc.S b/xen/arch/arm/arm32/smc.S
> new file mode 100644
> index 0000000..1cc9528
> --- /dev/null
> +++ b/xen/arch/arm/arm32/smc.S
> @@ -0,0 +1,32 @@
> +/*
> + * xen/arch/arm/arm32/smc.S
> + *
> + * Wrapper for Secure Monitors Calls
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + */
> +
> +#include <asm/macros.h>
> +
> +/*
> + * void call_smccc_smc(register_t a0, register_t a1, register_t a2,
> + *                     register_t a3, register_t a4, register_t a5,
> + *                     register_t a6, register_t a7, register_t res[4])
> + */
> +ENTRY(call_smccc_smc)
> +        mov     r12, sp
> +        push    {r4-r7}
> +        ldm     r12, {r4-r7}
> +        smc     #0
> +        pop     {r4-r7}
> +        ldr     r12, [sp, #(4 * 4)]

I haven't run this, but shouldn't it be:

  ldr     r12, [sp, #20]

?


> +        stm     r12, {r0-r3}
> +        bx      lr
> diff --git a/xen/arch/arm/arm64/Makefile b/xen/arch/arm/arm64/Makefile
> index 149b6b3..7831dc1 100644
> --- a/xen/arch/arm/arm64/Makefile
> +++ b/xen/arch/arm/arm64/Makefile
> @@ -8,5 +8,6 @@ obj-y += entry.o
>  obj-y += insn.o
>  obj-$(CONFIG_LIVEPATCH) += livepatch.o
>  obj-y += smpboot.o
> +obj-y += smc.o
>  obj-y += traps.o
>  obj-y += vfp.o
> diff --git a/xen/arch/arm/arm64/smc.S b/xen/arch/arm/arm64/smc.S
> new file mode 100644
> index 0000000..aa44fba
> --- /dev/null
> +++ b/xen/arch/arm/arm64/smc.S
> @@ -0,0 +1,29 @@
> +/*
> + * xen/arch/arm/arm64/smc.S
> + *
> + * Wrapper for Secure Monitors Calls
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + */
> +
> +#include <asm/macros.h>
> +
> +/*
> + * void call_smccc_smc(register_t a0, register_t a1, register_t a2,
> + *                     register_t a3, register_t a4, register_t a5,
> + *                     register_t a6, register_t a7, register_t res[4])
> + */
> +ENTRY(call_smccc_smc)
> +        smc     #0
> +        ldr     x4, [sp]
> +        stp     x0, x1, [x4, 0]
> +        stp     x2, x3, [x4, 16]
> +        ret
> diff --git a/xen/include/asm-arm/processor.h b/xen/include/asm-arm/processor.h
> index 9f7a42f..4ce5bb6 100644
> --- a/xen/include/asm-arm/processor.h
> +++ b/xen/include/asm-arm/processor.h
> @@ -786,6 +786,10 @@ void vcpu_regs_user_to_hyp(struct vcpu *vcpu,
>  int call_smc(register_t function_id, register_t arg0, register_t arg1,
>               register_t arg2);
>  
> +void call_smccc_smc(register_t a0, register_t a1, register_t a2,
> +                    register_t a3, register_t a4, register_t a5,
> +                    register_t a6, register_t a7, register_t res[4]);
> +
>  void do_trap_hyp_serror(struct cpu_user_regs *regs);
>  
>  void do_trap_guest_serror(struct cpu_user_regs *regs);
> -- 
> 2.7.4
> 
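
The offset question raised above for `ldr r12, [sp, #(4 * 4)]` can be checked with a toy model of the arm32 stack. The sketch assumes the standard AAPCS rule that the first four word-sized arguments travel in r0-r3 and the rest are placed on the stack in order, lowest address first, and that `register_t` is 32 bits wide on arm32. `struct stack_model` and the helper names are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

#define STACK_WORDS 16u

/* Toy model of a full-descending arm32 stack, addressed in 32-bit words. */
struct stack_model {
    uint32_t mem[STACK_WORDS];
    unsigned int sp;            /* index of the word sp points at */
};

/*
 * State at function entry under AAPCS: a0-a3 are in r0-r3, and the
 * remaining arguments (a4, a5, a6, a7, res) sit at [sp], [sp+4], ...
 */
static void enter_call(struct stack_model *m, const uint32_t stacked[5])
{
    unsigned int i;

    m->sp = STACK_WORDS - 5;
    for ( i = 0; i < 5; i++ )
        m->mem[m->sp + i] = stacked[i];
}

/* push {r4-r7}: sp drops by four words, lowest register at lowest address. */
static void push4(struct stack_model *m, const uint32_t regs[4])
{
    unsigned int i;

    m->sp -= 4;
    for ( i = 0; i < 4; i++ )
        m->mem[m->sp + i] = regs[i];
}

/* pop {r4-r7}: reload four words and move sp back up. */
static void pop4(struct stack_model *m, uint32_t regs[4])
{
    unsigned int i;

    for ( i = 0; i < 4; i++ )
        regs[i] = m->mem[m->sp + i];
    m->sp += 4;
}

/* ldr rX, [sp, #byte_off] */
static uint32_t load_sp(const struct stack_model *m, unsigned int byte_off)
{
    return m->mem[m->sp + byte_off / 4];
}
```

In this model the `res` pointer is found at byte offset 16 once `pop {r4-r7}` has restored sp to its entry value, which is what `#(4 * 4)` evaluates to; it would be at offset 32 only while the four saved registers are still pushed.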

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [RFC PATCH 04/31] cpufreq: make turbo settings to be configurable
  2017-12-04 22:18       ` Stefano Stabellini
@ 2017-12-05 11:13         ` Oleksandr Tyshchenko
  2017-12-05 19:24           ` Stefano Stabellini
  0 siblings, 1 reply; 108+ messages in thread
From: Oleksandr Tyshchenko @ 2017-12-05 11:13 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Andrew Cooper, Oleksandr Dmytryshyn, Julien Grall,
	Oleksandr Tyshchenko, Jan Beulich, xen-devel

Hi Stefano

On Tue, Dec 5, 2017 at 12:18 AM, Stefano Stabellini
<sstabellini@kernel.org> wrote:
> On Sat, 2 Dec 2017, Oleksandr Tyshchenko wrote:
>> On Sat, Dec 2, 2017 at 3:06 AM, Stefano Stabellini
>> <sstabellini@kernel.org> wrote:
>> > On Thu, 9 Nov 2017, Oleksandr Tyshchenko wrote:
>> >> From: Oleksandr Dmytryshyn <oleksandr.dmytryshyn@globallogic.com>
>> >>
>> >> This settings is not needed for some architectures.
>> >> So make it to be configurable and use it for x86
>> >> architecture.
>> >>
>> >> This is a rebased version of the original patch:
>> >> https://lists.xen.org/archives/html/xen-devel/2014-11/msg00942.html
>> >>
>> >> Signed-off-by: Oleksandr Dmytryshyn <oleksandr.dmytryshyn@globallogic.com>
>> >> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>> >> CC: Jan Beulich <jbeulich@suse.com>
>> >> CC: Andrew Cooper <andrew.cooper3@citrix.com>
>> >> CC: Stefano Stabellini <sstabellini@kernel.org>
>> >> CC: Julien Grall <julien.grall@linaro.org>
>> >> ---
>> >>  xen/arch/x86/Kconfig          |  1 +
>> >>  xen/drivers/cpufreq/Kconfig   |  3 +++
>> >>  xen/drivers/cpufreq/utility.c | 11 ++++++++++-
>> >>  xen/drivers/pm/stat.c         |  6 ++++++
>> >>  xen/include/xen/cpufreq.h     |  6 ++++++
>> >>  5 files changed, 26 insertions(+), 1 deletion(-)
>> >>
>> >> diff --git a/xen/arch/x86/Kconfig b/xen/arch/x86/Kconfig
>> >> index 86c8eca..c1eac1d 100644
>> >> --- a/xen/arch/x86/Kconfig
>> >> +++ b/xen/arch/x86/Kconfig
>> >> @@ -24,6 +24,7 @@ config X86
>> >>       select NUMA
>> >>       select VGA
>> >>       select HAS_PM
>> >> +     select HAS_CPU_TURBO
>> >>
>> >>  config ARCH_DEFCONFIG
>> >>       string
>> >> diff --git a/xen/drivers/cpufreq/Kconfig b/xen/drivers/cpufreq/Kconfig
>> >> index cce80f4..427ea2a 100644
>> >> --- a/xen/drivers/cpufreq/Kconfig
>> >> +++ b/xen/drivers/cpufreq/Kconfig
>> >> @@ -1,3 +1,6 @@
>> >>
>> >>  config HAS_CPUFREQ
>> >>       bool
>> >> +
>> >> +config HAS_CPU_TURBO
>> >> +     bool
>> >> diff --git a/xen/drivers/cpufreq/utility.c b/xen/drivers/cpufreq/utility.c
>> >> index a687e5a..25bf983 100644
>> >> --- a/xen/drivers/cpufreq/utility.c
>> >> +++ b/xen/drivers/cpufreq/utility.c
>> >> @@ -209,7 +209,9 @@ int cpufreq_frequency_table_cpuinfo(struct cpufreq_policy *policy,
>> >>  {
>> >>      unsigned int min_freq = ~0;
>> >>      unsigned int max_freq = 0;
>> >> +#ifdef CONFIG_HAS_CPU_TURBO
>> >>      unsigned int second_max_freq = 0;
>> >> +#endif
>> >>      unsigned int i;
>> >>
>> >>      for (i=0; (table[i].frequency != CPUFREQ_TABLE_END); i++) {
>> >> @@ -221,6 +223,7 @@ int cpufreq_frequency_table_cpuinfo(struct cpufreq_policy *policy,
>> >>          if (freq > max_freq)
>> >>              max_freq = freq;
>> >>      }
>> >> +#ifdef CONFIG_HAS_CPU_TURBO
>> >>      for (i=0; (table[i].frequency != CPUFREQ_TABLE_END); i++) {
>> >>          unsigned int freq = table[i].frequency;
>> >>          if (freq == CPUFREQ_ENTRY_INVALID || freq == max_freq)
>> >> @@ -234,9 +237,13 @@ int cpufreq_frequency_table_cpuinfo(struct cpufreq_policy *policy,
>> >>          printk("max_freq: %u    second_max_freq: %u\n",
>> >>                 max_freq, second_max_freq);
>> >>
>> >> +    policy->cpuinfo.second_max_freq = second_max_freq;
>> >> +#else /* !CONFIG_HAS_CPU_TURBO */
>> >> +    if (cpufreq_verbose)
>> >> +        printk("max_freq: %u\n", max_freq);
>> >> +#endif /* CONFIG_HAS_CPU_TURBO */
>> >>      policy->min = policy->cpuinfo.min_freq = min_freq;
>> >>      policy->max = policy->cpuinfo.max_freq = max_freq;
>> >> -    policy->cpuinfo.second_max_freq = second_max_freq;
>> >>
>> >>      if (policy->min == ~0)
>> >>          return -EINVAL;
>> >> @@ -390,6 +397,7 @@ int cpufreq_driver_getavg(unsigned int cpu, unsigned int flag)
>> >>      return policy->cur;
>> >>  }
>> >>
>> >> +#ifdef CONFIG_HAS_CPU_TURBO
>> >>  int cpufreq_update_turbo(int cpuid, int new_state)
>> >>  {
>> >>      struct cpufreq_policy *policy;
>> >> @@ -430,6 +438,7 @@ int cpufreq_get_turbo_status(int cpuid)
>> >>      policy = per_cpu(cpufreq_cpu_policy, cpuid);
>> >>      return policy && policy->turbo == CPUFREQ_TURBO_ENABLED;
>> >>  }
>> >> +#endif /* CONFIG_HAS_CPU_TURBO */
>> >>
>> >>  /*********************************************************************
>> >>   *                 POLICY                                            *
>> >
>> > I am wondering if we need to go as far as #ifdef'ing
>> > cpufreq_update_turbo. For the sake of reducing the number if #ifdef's,
>> > would it be enough if we only make sure it is disabled?
>> >
>> > In other words, I would keep the changes to stat.c but I would leave
>> > utility.c and cpufreq.h pretty much untouched.
>>
>> Yes. I was thinking about dropping this patch at all. If platform
>> doesn't support CPU Boost, the platform
>> driver should just inform framework about that (policy->turbo =
>> CPUFREQ_TURBO_UNSUPPORTED).
>> That's all.
>
> Right
>
>
>> cpufreq_update_turbo() will return -EOPNOTSUPP if someone tries to
>> enable/disable turbo mode.
>> cpufreq_get_turbo_status() will return that turbo mode "is not enabled".
>
> Exactly what I was thinking

Great, I will drop this patch.
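
To make the agreed behaviour concrete, here is a simplified standalone model of it. The enum, struct and function names are hypothetical reductions of the framework's policy handling, and the `-EACCES`/`-EINVAL` paths are assumptions of this sketch; only the `-EOPNOTSUPP` result and the "is not enabled" status come from the discussion above.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical reduction of the per-policy turbo state. */
enum turbo_state {
    TURBO_UNSUPPORTED,
    TURBO_DISABLED,
    TURBO_ENABLED,
};

struct policy {
    enum turbo_state turbo;
};

/*
 * Try to switch turbo mode. A platform driver that sets the policy to
 * TURBO_UNSUPPORTED makes every later request fail with -EOPNOTSUPP,
 * so no #ifdef is needed around the function itself.
 */
static int update_turbo(struct policy *policy, enum turbo_state new_state)
{
    if ( !policy )
        return -EACCES;                 /* assumption of this sketch */
    if ( policy->turbo == TURBO_UNSUPPORTED )
        return -EOPNOTSUPP;
    if ( new_state == TURBO_UNSUPPORTED )
        return -EINVAL;                 /* assumption of this sketch */

    policy->turbo = new_state;
    return 0;
}

/* Reports 1 only when turbo is enabled; "unsupported" reads as 0. */
static int get_turbo_status(const struct policy *policy)
{
    return policy && policy->turbo == TURBO_ENABLED;
}
```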

>
>
>> Another question is second_max_freq. As I understand, it is highest
>> non-turbo frequency calculated by framework to limit target frequency
>> when
>> turbo mode "is disabled". And Xen assumes that second_max_freq is
>> always P1 if turbo mode is on.
>> But, there might be a case when a few highest frequencies are
>> turbo-frequencies. So, I propose to add an extra flag for handling
>> that.
>> So, each CPUFreq driver responsibility will be to mark
>> turbo-frequency(ies) for the framework to properly calculate
>> second_max_freq.
>
> As Andre wrote, we can start simply assuming that ARM doesn't have
> turbo. If turbo mode is assumed to be off, I don't think we need the
> patch below and the new flag, because second_max_freq == max_freq.

I just want to show you a real example of an ARM SoC that has turbo mode
with more than one turbo frequency:
https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas-bsp.git/tree/arch/arm64/boot/dts/renesas/r8a7795.dtsi?h=v4.9/rcar-3.5.9#n197
As you can see, two frequencies are marked as turbo frequencies:
1600000000 Hz and 1700000000 Hz.

https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas-bsp.git/tree/arch/arm64/boot/dts/renesas/r8a7796.dtsi?h=v4.9/rcar-3.5.9#n166
On the M3 SoC three turbo frequencies are used: 1600000000 Hz,
1700000000 Hz and 1800000000 Hz.

If the patch proposed below is not an option, then we should find another
way to determine second_max_freq.

>
>
>> Something like that:
>>
>> diff --git a/xen/drivers/cpufreq/utility.c b/xen/drivers/cpufreq/utility.c
>> index 25bf983..122a88b 100644
>> --- a/xen/drivers/cpufreq/utility.c
>> +++ b/xen/drivers/cpufreq/utility.c
>> @@ -226,7 +226,8 @@ int cpufreq_frequency_table_cpuinfo(struct
>> cpufreq_policy *policy,
>>  #ifdef CONFIG_HAS_CPU_TURBO
>>      for (i=0; (table[i].frequency != CPUFREQ_TABLE_END); i++) {
>>          unsigned int freq = table[i].frequency;
>> -        if (freq == CPUFREQ_ENTRY_INVALID || freq == max_freq)
>> +        if ((freq == CPUFREQ_ENTRY_INVALID) ||
>> +            (table[i].flags & CPUFREQ_BOOST_FREQ))
>>              continue;
>>          if (freq > second_max_freq)
>>              second_max_freq = freq;
>> diff --git a/xen/include/xen/cpufreq.h b/xen/include/xen/cpufreq.h
>> index 2e0c16a..77b29da 100644
>> --- a/xen/include/xen/cpufreq.h
>> +++ b/xen/include/xen/cpufreq.h
>> @@ -204,7 +204,11 @@ void cpufreq_verify_within_limits(struct
>> cpufreq_policy *policy,
>>  #define CPUFREQ_ENTRY_INVALID ~0
>>  #define CPUFREQ_TABLE_END     ~1
>>
>> +/* Special Values of .flags field */
>> +#define CPUFREQ_BOOST_FREQ    (1 << 0)
>> +
>>  struct cpufreq_frequency_table {
>> +       unsigned int    flags;
>>      unsigned int    index;     /* any */
>>      unsigned int    frequency; /* kHz - doesn't need to be in ascending
>>                                  * order */
>>
>> Both existing on x86 CPUFreq drivers just need to mark P0 frequency as
>> a turbo-frequency if turbo mode "is supported". Am I correct?
>>
>> And the most important question is how to recognize in Xen on ARM
>> (using SCPI protocol) which frequencies are turbo-frequencies
>> actually? I couldn't find any information regarding that in protocol
>> description.
>> For DT-based CPUFreq it is not an issue, since there is a specific
>> property "turbo-mode" to mark corresponding OPPs. [1].
>> But neither SCPI DT bindings [2] nor the SCPI protocol itself [3]
>> mentions about it. Perhaps, additional command should be added to pass
>> such info.
>>
>> [1] https://www.kernel.org/doc/Documentation/devicetree/bindings/opp/opp.txt
>> [2] http://elixir.free-electrons.com/linux/v4.15-rc1/source/Documentation/devicetree/bindings/arm/arm,scpi.txt
>> [3] http://infocenter.arm.com/help/topic/com.arm.doc.dui0922g/scp_message_interface_v1_2_DUI0922G_en.pdf
>
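
As a standalone illustration of the flag-based proposal quoted above, the sketch below computes min, max and second_max over a frequency table in a single pass, skipping boost-flagged entries when picking second_max. The struct, macro and function names are hypothetical, the fallback for an all-boost table is an assumption of this sketch, and it is not the Xen code itself.

```c
#include <stddef.h>

#define ENTRY_INVALID (~0u)
#define TABLE_END     (~1u)
#define BOOST_FREQ    (1u << 0)   /* hypothetical flag, as in the proposal */

struct freq_entry {
    unsigned int flags;
    unsigned int frequency;       /* kHz, in any order */
};

struct freq_limits {
    unsigned int min, max, second_max;
};

/*
 * second_max is the highest frequency that is NOT marked as a boost
 * frequency, so several turbo entries at the top of the table are handled.
 * Returns 0 on success, -1 if the table has no valid entry.
 */
static int compute_limits(const struct freq_entry *table,
                          struct freq_limits *out)
{
    unsigned int min = ~0u, max = 0, second_max = 0;
    size_t i;

    for ( i = 0; table[i].frequency != TABLE_END; i++ )
    {
        unsigned int freq = table[i].frequency;

        if ( freq == ENTRY_INVALID )
            continue;
        if ( freq < min )
            min = freq;
        if ( freq > max )
            max = freq;
        if ( !(table[i].flags & BOOST_FREQ) && freq > second_max )
            second_max = freq;
    }

    if ( min == ~0u )
        return -1;
    if ( second_max == 0 )
        second_max = max;         /* every entry is boost: sketch fallback */

    out->min = min;
    out->max = max;
    out->second_max = second_max;
    return 0;
}
```

With a table whose two highest entries carry the flag, second_max ends up at the highest unflagged entry rather than at the entry just below max.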



-- 
Regards,

Oleksandr Tyshchenko

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [RFC PATCH 13/31] xen/arm: Add driver_data field to struct device
  2017-11-09 17:10 ` [RFC PATCH 13/31] xen/arm: Add driver_data field to struct device Oleksandr Tyshchenko
  2017-12-04 23:31   ` Stefano Stabellini
@ 2017-12-05 11:26   ` Julien Grall
  2017-12-05 12:57     ` Oleksandr Tyshchenko
  1 sibling, 1 reply; 108+ messages in thread
From: Julien Grall @ 2017-12-05 11:26 UTC (permalink / raw)
  To: Oleksandr Tyshchenko, xen-devel; +Cc: Oleksandr Tyshchenko, Stefano Stabellini



On 09/11/17 17:10, Oleksandr Tyshchenko wrote:
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

Please explain the rationale behind adding a new field in struct device.

> 
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> CC: Stefano Stabellini <sstabellini@kernel.org>
> CC: Julien Grall <julien.grall@linaro.org>
> ---
>   xen/include/asm-arm/device.h | 1 +
>   1 file changed, 1 insertion(+)
> 
> diff --git a/xen/include/asm-arm/device.h b/xen/include/asm-arm/device.h
> index 6734ae8..3e2f34a 100644
> --- a/xen/include/asm-arm/device.h
> +++ b/xen/include/asm-arm/device.h
> @@ -20,6 +20,7 @@ struct device
>       struct dt_device_node *of_node; /* Used by drivers imported from Linux */
>   #endif
>       struct dev_archdata archdata;
> +    void *driver_data;
>   };
>   
>   typedef struct device device_t;
> 

Cheers,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [RFC PATCH 13/31] xen/arm: Add driver_data field to struct device
  2017-12-05 11:26   ` Julien Grall
@ 2017-12-05 12:57     ` Oleksandr Tyshchenko
  0 siblings, 0 replies; 108+ messages in thread
From: Oleksandr Tyshchenko @ 2017-12-05 12:57 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, Stefano Stabellini, Oleksandr Tyshchenko

Hi, Julien.

On Tue, Dec 5, 2017 at 1:26 PM, Julien Grall <julien.grall@linaro.org> wrote:
>
>
> On 09/11/17 17:10, Oleksandr Tyshchenko wrote:
>>
>> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>
>
> Please explain the rationale behind adding a new field in struct device.

Basically, it is needed for drivers ported directly from Linux. I
added this field to make the SCPI protocol driver happy, since it
relies on the platform_set_drvdata/platform_get_drvdata helpers.

Sure, I will add a detailed description if we decide to go this way.
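
To illustrate what the field is for, here is a minimal standalone sketch of how Linux-style drvdata helpers sit on top of a `driver_data` pointer. The struct definitions are reduced stand-ins (the real `struct device` has more members), and the helper bodies are written for this sketch rather than copied from Linux or Xen.

```c
#include <stddef.h>

/* Reduced stand-ins for this sketch only. */
struct device {
    void *driver_data;          /* the field added by this patch */
};

struct platform_device {
    struct device dev;
};

/* Store a driver-private pointer in the device. */
static inline void dev_set_drvdata(struct device *dev, void *data)
{
    dev->driver_data = data;
}

/* Fetch the driver-private pointer back. */
static inline void *dev_get_drvdata(const struct device *dev)
{
    return dev->driver_data;
}

/* Platform-device flavours simply forward to the embedded struct device. */
static inline void platform_set_drvdata(struct platform_device *pdev,
                                        void *data)
{
    dev_set_drvdata(&pdev->dev, data);
}

static inline void *platform_get_drvdata(const struct platform_device *pdev)
{
    return dev_get_drvdata(&pdev->dev);
}
```

A ported driver would typically call the setter once in its probe path and the getter wherever it needs its private state again.
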

>
>>
>> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>> CC: Stefano Stabellini <sstabellini@kernel.org>
>> CC: Julien Grall <julien.grall@linaro.org>
>> ---
>>   xen/include/asm-arm/device.h | 1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/xen/include/asm-arm/device.h b/xen/include/asm-arm/device.h
>> index 6734ae8..3e2f34a 100644
>> --- a/xen/include/asm-arm/device.h
>> +++ b/xen/include/asm-arm/device.h
>> @@ -20,6 +20,7 @@ struct device
>>       struct dt_device_node *of_node; /* Used by drivers imported from
>> Linux */
>>   #endif
>>       struct dev_archdata archdata;
>> +    void *driver_data;
>>   };
>>     typedef struct device device_t;
>>
>
> Cheers,
>
> --
> Julien Grall



-- 
Regards,

Oleksandr Tyshchenko

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [RFC PATCH 09/31] xen/device-tree: Add dt_property_for_each_string macros
  2017-12-04 23:24   ` Stefano Stabellini
@ 2017-12-05 14:19     ` Oleksandr Tyshchenko
  0 siblings, 0 replies; 108+ messages in thread
From: Oleksandr Tyshchenko @ 2017-12-05 14:19 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, Julien Grall, Oleksandr Tyshchenko

Hi, Stefano

On Tue, Dec 5, 2017 at 1:24 AM, Stefano Stabellini
<sstabellini@kernel.org> wrote:
> On Thu, 9 Nov 2017, Oleksandr Tyshchenko wrote:
>> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>>
>> This is a port from Linux.
>
> When you port stuff from Linux you have to retain the original
> copyright. Please add the original Signed-off-by lines (you actually
> have to use git log and git blame to narrow them down for copyright
> reasons):
>
>   Signed-off-by: Stephen Warren <swarren@nvidia.com>
>   Signed-off-by: Linus Walleij <linus.walleij@linaro.org>

Sure, I will add the original author(s) here and in all the device-tree
patches I ported.

>
> With those:
>
> Acked-by: Stefano Stabellini <sstabellini@kernel.org>

Thanks.

>
>
>> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>> CC: Stefano Stabellini <sstabellini@kernel.org>
>> CC: Julien Grall <julien.grall@linaro.org>
>> ---
>>  xen/common/device_tree.c      | 18 ++++++++++++++++++
>>  xen/include/xen/device_tree.h | 21 +++++++++++++++++++++
>>  2 files changed, 39 insertions(+)
>>
>> diff --git a/xen/common/device_tree.c b/xen/common/device_tree.c
>> index 60b0095..08f8072 100644
>> --- a/xen/common/device_tree.c
>> +++ b/xen/common/device_tree.c
>> @@ -208,6 +208,24 @@ int dt_property_read_string(const struct dt_device_node *np,
>>      return 0;
>>  }
>>
>> +const char *dt_property_next_string(const struct dt_property *prop,
>> +                                    const char *cur)
>> +{
>> +    const void *curv = cur;
>> +
>> +    if ( !prop )
>> +        return NULL;
>> +
>> +    if ( !cur )
>> +        return prop->value;
>> +
>> +    curv += strlen(cur) + 1;
>> +    if ( curv >= prop->value + prop->length )
>> +        return NULL;
>> +
>> +    return curv;
>> +}
>> +
>>  bool_t dt_device_is_compatible(const struct dt_device_node *device,
>>                                 const char *compat)
>>  {
>> diff --git a/xen/include/xen/device_tree.h b/xen/include/xen/device_tree.h
>> index 738f1b6..9e0931c 100644
>> --- a/xen/include/xen/device_tree.h
>> +++ b/xen/include/xen/device_tree.h
>> @@ -420,6 +420,27 @@ int dt_property_read_string(const struct dt_device_node *np,
>>                              const char *propname, const char **out_string);
>>
>>  /**
>> + * dt_property_for_each_string - Iterate over an array of strings within
>> + * a property with a given name for a given node.
>> + *
>> + * Example:
>> + *
>> + * struct dt_property *prop;
>> + * const char *s;
>> + *
>> + * dt_property_for_each_string(np, "propname", prop, s)
>> + *     printk("String value: %s\n", s);
>> + */
>> +const char *dt_property_next_string(const struct dt_property *prop,
>> +                                    const char *cur);
>> +
>> +#define dt_property_for_each_string(np, propname, prop, s)    \
>> +    for (prop = dt_find_property(np, propname, NULL),         \
>> +        s = dt_property_next_string(prop, NULL);              \
>> +        s;                                                    \
>> +        s = dt_property_next_string(prop, s))
>> +
>> +/**
>>   * Checks if the given "compat" string matches one of the strings in
>>   * the device's "compatible" property
>>   */
>> --
>> 2.7.4
>>
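
For reference, the iterator pattern quoted above can be tried outside of Xen with the sketch below. `struct prop`, `next_string`, `for_each_string` and `count_strings` are hypothetical standalone analogues; like the quoted code, the sketch trusts the last string in the property to be NUL-terminated.

```c
#include <stddef.h>
#include <string.h>

/* Reduced stand-in for a device-tree property holding a string list. */
struct prop {
    const char *value;          /* back-to-back NUL-terminated strings */
    size_t length;              /* total size in bytes, NULs included */
};

/*
 * Return the first string when cur is NULL, otherwise the string that
 * follows cur, or NULL once the end of the property is reached.
 */
static const char *next_string(const struct prop *prop, const char *cur)
{
    const char *next;

    if ( !prop )
        return NULL;
    if ( !cur )
        return prop->value;

    next = cur + strlen(cur) + 1;
    if ( next >= prop->value + prop->length )
        return NULL;

    return next;
}

#define for_each_string(prop, s)                \
    for ( s = next_string(prop, NULL);          \
          s;                                    \
          s = next_string(prop, s) )

/* Example consumer: count the strings in a property. */
static unsigned int count_strings(const struct prop *prop)
{
    const char *s;
    unsigned int n = 0;

    for_each_string(prop, s)
        n++;

    return n;
}
```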



-- 
Regards,

Oleksandr Tyshchenko

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [RFC PATCH 16/31] arm: add SMC wrapper that is compatible with SMCCC
  2017-11-09 17:10 ` [RFC PATCH 16/31] arm: add SMC wrapper that is compatible with SMCCC Oleksandr Tyshchenko
  2017-12-05  2:30   ` Stefano Stabellini
@ 2017-12-05 14:58   ` Julien Grall
  2017-12-05 17:08     ` Volodymyr Babchuk
  1 sibling, 1 reply; 108+ messages in thread
From: Julien Grall @ 2017-12-05 14:58 UTC (permalink / raw)
  To: Oleksandr Tyshchenko, xen-devel
  Cc: Edgar E. Iglesias, Stefano Stabellini, Volodymyr Babchuk

Hi Oleksandr,

On 09/11/17 17:10, Oleksandr Tyshchenko wrote:
> From: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
> 
> Existing SMC wrapper call_smc() allows only 4 parameters and
> returns only one value. This is enough for existing
> use in PSCI code, but TEE mediator will need a call that is
> fully compatible with ARM SMCCC.
> This patch adds this call for both arm32 and arm64.
> 
> There was similar patch by Edgar E. Iglesias ([1]), but looks
> like it is abandoned.
> 
> [1] https://lists.xenproject.org/archives/html/xen-devel/2017-02/msg00636.html
> 
> CC: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>
> 
> Signed-off-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
> CC: Stefano Stabellini <sstabellini@kernel.org>
> CC: Julien Grall <julien.grall@linaro.org>

This patch was sent by Volodymyr a month ago (see [2]) and I had
comments on it. I would appreciate it if you addressed them.

Cheers,

[2] 
https://lists.xenproject.org/archives/html/xen-devel/2017-10/msg01881.html

> ---
>   xen/arch/arm/arm32/Makefile     |  1 +
>   xen/arch/arm/arm32/smc.S        | 32 ++++++++++++++++++++++++++++++++
>   xen/arch/arm/arm64/Makefile     |  1 +
>   xen/arch/arm/arm64/smc.S        | 29 +++++++++++++++++++++++++++++
>   xen/include/asm-arm/processor.h |  4 ++++
>   5 files changed, 67 insertions(+)
>   create mode 100644 xen/arch/arm/arm32/smc.S
>   create mode 100644 xen/arch/arm/arm64/smc.S
> 
> diff --git a/xen/arch/arm/arm32/Makefile b/xen/arch/arm/arm32/Makefile
> index 0ac254f..a2362f3 100644
> --- a/xen/arch/arm/arm32/Makefile
> +++ b/xen/arch/arm/arm32/Makefile
> @@ -8,6 +8,7 @@ obj-y += insn.o
>   obj-$(CONFIG_LIVEPATCH) += livepatch.o
>   obj-y += proc-v7.o proc-caxx.o
>   obj-y += smpboot.o
> +obj-y += smc.o
>   obj-y += traps.o
>   obj-y += vfp.o
>   
> diff --git a/xen/arch/arm/arm32/smc.S b/xen/arch/arm/arm32/smc.S
> new file mode 100644
> index 0000000..1cc9528
> --- /dev/null
> +++ b/xen/arch/arm/arm32/smc.S
> @@ -0,0 +1,32 @@
> +/*
> + * xen/arch/arm/arm32/smc.S
> + *
> + * Wrapper for Secure Monitors Calls
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + */
> +
> +#include <asm/macros.h>
> +
> +/*
> + * void call_smccc_smc(register_t a0, register_t a1, register_t a2,
> + *                     register_t a3, register_t a4, register_t a5,
> + *                     register_t a6, register_t a7, register_t res[4])
> + */
> +ENTRY(call_smccc_smc)
> +        mov     r12, sp
> +        push    {r4-r7}
> +        ldm     r12, {r4-r7}
> +        smc     #0
> +        pop     {r4-r7}
> +        ldr     r12, [sp, #(4 * 4)]
> +        stm     r12, {r0-r3}
> +        bx      lr
> diff --git a/xen/arch/arm/arm64/Makefile b/xen/arch/arm/arm64/Makefile
> index 149b6b3..7831dc1 100644
> --- a/xen/arch/arm/arm64/Makefile
> +++ b/xen/arch/arm/arm64/Makefile
> @@ -8,5 +8,6 @@ obj-y += entry.o
>   obj-y += insn.o
>   obj-$(CONFIG_LIVEPATCH) += livepatch.o
>   obj-y += smpboot.o
> +obj-y += smc.o
>   obj-y += traps.o
>   obj-y += vfp.o
> diff --git a/xen/arch/arm/arm64/smc.S b/xen/arch/arm/arm64/smc.S
> new file mode 100644
> index 0000000..aa44fba
> --- /dev/null
> +++ b/xen/arch/arm/arm64/smc.S
> @@ -0,0 +1,29 @@
> +/*
> + * xen/arch/arm/arm64/smc.S
> + *
> + * Wrapper for Secure Monitors Calls
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + */
> +
> +#include <asm/macros.h>
> +
> +/*
> + * void call_smccc_smc(register_t a0, register_t a1, register_t a2,
> + *                     register_t a3, register_t a4, register_t a5,
> + *                     register_t a6, register_t a7, register_t res[4])
> + */
> +ENTRY(call_smccc_smc)
> +        smc     #0
> +        ldr     x4, [sp]
> +        stp     x0, x1, [x4, 0]
> +        stp     x2, x3, [x4, 16]
> +        ret
> diff --git a/xen/include/asm-arm/processor.h b/xen/include/asm-arm/processor.h
> index 9f7a42f..4ce5bb6 100644
> --- a/xen/include/asm-arm/processor.h
> +++ b/xen/include/asm-arm/processor.h
> @@ -786,6 +786,10 @@ void vcpu_regs_user_to_hyp(struct vcpu *vcpu,
>   int call_smc(register_t function_id, register_t arg0, register_t arg1,
>                register_t arg2);
>   
> +void call_smccc_smc(register_t a0, register_t a1, register_t a2,
> +                    register_t a3, register_t a4, register_t a5,
> +                    register_t a6, register_t a7, register_t res[4]);
> +
>   void do_trap_hyp_serror(struct cpu_user_regs *regs);
>   
>   void do_trap_guest_serror(struct cpu_user_regs *regs);
> 

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [RFC PATCH 04/31] cpufreq: make turbo settings to be configurable
  2017-12-04 11:58       ` Andre Przywara
@ 2017-12-05 15:23         ` Oleksandr Tyshchenko
  0 siblings, 0 replies; 108+ messages in thread
From: Oleksandr Tyshchenko @ 2017-12-05 15:23 UTC (permalink / raw)
  To: Andre Przywara
  Cc: Stefano Stabellini, Andrew Cooper, Oleksandr Dmytryshyn,
	Julien Grall, Oleksandr Tyshchenko, Jan Beulich, xen-devel

On Mon, Dec 4, 2017 at 1:58 PM, Andre Przywara <andre.przywara@arm.com> wrote:
> Hi,
Hi Andre

>
> ....
>
>> And the most important question is how to recognize in Xen on ARM
>> (using SCPI protocol) which frequencies are turbo-frequencies
>> actually? I couldn't find any information regarding that in protocol
>> description.
>
> So traditionally on ARM there is no notion of a "turbo" frequency. The
> idea is to expose the highest possible frequency, and let thermal
> throttling (possibly in hardware or in firmware) limit the frequency if
> the thermal budget is busted.
> Also in the ARM world it is expected that an OS has much better
> knowledge on how to handle frequencies, for instance when to give more
> power to the GPU and when to the CPU.
>
>> For DT-based CPUFreq it is not an issue, since there is a specific
>> property "turbo-mode" to mark corresponding OPPs. [1].
>> But neither SCPI DT bindings [2] nor the SCPI protocol itself [3]
>> mentions about it. Perhaps, additional command should be added to pass
>> such info.
>
> The DT binding you mentioned in Linux is a generic one.
> In general DT only describes non-discoverable properties. But for SCPI
> the OPPs are handled in the SCP and advertised via SCPI calls (3.2.9 Get
> DVFS Info, command 0x9).
> So the OPP table is not in the DT, and thus you don't have any way of
> detecting turbo frequencies.
> But as mentioned before, this is so by design, as ARM does not endorse
> the concept of turbo frequencies in general.
>
> Now with the advent of more "server-y" chips and ACPI, this might change
> in the future. For instance SCMI is designed to be closer to ACPI, so we
> might inherit some turbo notion from there.
>
> So we should not completely rule out the idea of turbo, but for a start
> we can somewhat assume that an ARM based system does not have turbo per se.

Thank you for the detailed explanation. I will take a look at SCMI
documentation.

>
> Cheers,
> Andre.
>
>> [1] https://www.kernel.org/doc/Documentation/devicetree/bindings/opp/opp.txt
>> [2] http://elixir.free-electrons.com/linux/v4.15-rc1/source/Documentation/devicetree/bindings/arm/arm,scpi.txt
>> [3] http://infocenter.arm.com/help/topic/com.arm.doc.dui0922g/scp_message_interface_v1_2_DUI0922G_en.pdf
>>
>>>
>>>
>>>> diff --git a/xen/drivers/pm/stat.c b/xen/drivers/pm/stat.c
>>>> index 2dbde1c..133e64d 100644
>>>> --- a/xen/drivers/pm/stat.c
>>>> +++ b/xen/drivers/pm/stat.c
>>>> @@ -290,7 +290,11 @@ static int get_cpufreq_para(struct xen_sysctl_pm_op *op)
>>>>              &op->u.get_para.u.ondemand.sampling_rate,
>>>>              &op->u.get_para.u.ondemand.up_threshold);
>>>>      }
>>>> +#ifdef CONFIG_HAS_CPU_TURBO
>>>>      op->u.get_para.turbo_enabled = cpufreq_get_turbo_status(op->cpuid);
>>>> +#else
>>>> +    op->u.get_para.turbo_enabled = 0;
>>>> +#endif
>>>>
>>>>      return ret;
>>>>  }
>>>> @@ -473,6 +477,7 @@ int do_pm_op(struct xen_sysctl_pm_op *op)
>>>>          break;
>>>>      }
>>>>
>>>> +#ifdef CONFIG_HAS_CPU_TURBO
>>>>      case XEN_SYSCTL_pm_op_enable_turbo:
>>>>      {
>>>>          ret = cpufreq_update_turbo(op->cpuid, CPUFREQ_TURBO_ENABLED);
>>>> @@ -484,6 +489,7 @@ int do_pm_op(struct xen_sysctl_pm_op *op)
>>>>          ret = cpufreq_update_turbo(op->cpuid, CPUFREQ_TURBO_DISABLED);
>>>>          break;
>>>>      }
>>>> +#endif /* CONFIG_HAS_CPU_TURBO */
>>>>
>>>>      default:
>>>>          printk("not defined sub-hypercall @ do_pm_op\n");
>>>> diff --git a/xen/include/xen/cpufreq.h b/xen/include/xen/cpufreq.h
>>>> index 30c70c9..2e0c16a 100644
>>>> --- a/xen/include/xen/cpufreq.h
>>>> +++ b/xen/include/xen/cpufreq.h
>>>> @@ -39,7 +39,9 @@ extern struct acpi_cpufreq_data *cpufreq_drv_data[NR_CPUS];
>>>>
>>>>  struct cpufreq_cpuinfo {
>>>>      unsigned int        max_freq;
>>>> +#ifdef CONFIG_HAS_CPU_TURBO
>>>>      unsigned int        second_max_freq;    /* P1 if Turbo Mode is on */
>>>> +#endif
>>>>      unsigned int        min_freq;
>>>>      unsigned int        transition_latency; /* in 10^(-9) s = nanoseconds */
>>>>  };
>>>> @@ -72,9 +74,11 @@ struct cpufreq_policy {
>>>>
>>>>      bool_t              resume; /* flag for cpufreq 1st run
>>>>                                   * S3 wakeup, hotplug cpu, etc */
>>>> +#ifdef CONFIG_HAS_CPU_TURBO
>>>>      s8                  turbo;  /* tristate flag: 0 for unsupported
>>>>                                   * -1 for disable, 1 for enabled
>>>>                                   * See CPUFREQ_TURBO_* below for defines */
>>>> +#endif
>>>>      bool_t              aperf_mperf; /* CPU has APERF/MPERF MSRs */
>>>>  };
>>>>  DECLARE_PER_CPU(struct cpufreq_policy *, cpufreq_cpu_policy);
>>>> @@ -138,8 +142,10 @@ extern int cpufreq_driver_getavg(unsigned int cpu, unsigned int flag);
>>>>  #define CPUFREQ_TURBO_UNSUPPORTED   0
>>>>  #define CPUFREQ_TURBO_ENABLED       1
>>>>
>>>> +#ifdef CONFIG_HAS_CPU_TURBO
>>>>  extern int cpufreq_update_turbo(int cpuid, int new_state);
>>>>  extern int cpufreq_get_turbo_status(int cpuid);
>>>> +#endif
>>>>
>>>>  static __inline__ int
>>>>  __cpufreq_governor(struct cpufreq_policy *policy, unsigned int event)
>>
>>
>>



-- 
Regards,

Oleksandr Tyshchenko


^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [RFC PATCH 16/31] arm: add SMC wrapper that is compatible with SMCCC
  2017-12-05  2:30   ` Stefano Stabellini
@ 2017-12-05 15:33     ` Volodymyr Babchuk
  2017-12-05 17:21       ` Stefano Stabellini
  0 siblings, 1 reply; 108+ messages in thread
From: Volodymyr Babchuk @ 2017-12-05 15:33 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Oleksandr Tyshchenko, xen-devel, Edgar E. Iglesias, Julien Grall

Hi Stefano,

On Mon, Dec 04, 2017 at 06:30:13PM -0800, Stefano Stabellini wrote:
> On Thu, 9 Nov 2017, Oleksandr Tyshchenko wrote:
> > From: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
> > 
> > Existing SMC wrapper call_smc() allows only 4 parameters and
> > returns only one value. This is enough for existing
> > use in PSCI code, but TEE mediator will need a call that is
> > fully compatible with ARM SMCCC.
> > This patch adds this call for both arm32 and arm64.
> > 
> > There was similar patch by Edgar E. Iglesias ([1]), but looks
> > like it is abandoned.
> > 
> > [1] https://lists.xenproject.org/archives/html/xen-devel/2017-02/msg00636.html
> > 
> > CC: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>
> > 
> > Signed-off-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
> > CC: Stefano Stabellini <sstabellini@kernel.org>
> > CC: Julien Grall <julien.grall@linaro.org>
> > ---
> >  xen/arch/arm/arm32/Makefile     |  1 +
> >  xen/arch/arm/arm32/smc.S        | 32 ++++++++++++++++++++++++++++++++
> >  xen/arch/arm/arm64/Makefile     |  1 +
> >  xen/arch/arm/arm64/smc.S        | 29 +++++++++++++++++++++++++++++
> >  xen/include/asm-arm/processor.h |  4 ++++
> >  5 files changed, 67 insertions(+)
> >  create mode 100644 xen/arch/arm/arm32/smc.S
> >  create mode 100644 xen/arch/arm/arm64/smc.S
> > 
> > diff --git a/xen/arch/arm/arm32/Makefile b/xen/arch/arm/arm32/Makefile
> > index 0ac254f..a2362f3 100644
> > --- a/xen/arch/arm/arm32/Makefile
> > +++ b/xen/arch/arm/arm32/Makefile
> > @@ -8,6 +8,7 @@ obj-y += insn.o
> >  obj-$(CONFIG_LIVEPATCH) += livepatch.o
> >  obj-y += proc-v7.o proc-caxx.o
> >  obj-y += smpboot.o
> > +obj-y += smc.o
> >  obj-y += traps.o
> >  obj-y += vfp.o
> >  
> > diff --git a/xen/arch/arm/arm32/smc.S b/xen/arch/arm/arm32/smc.S
> > new file mode 100644
> > index 0000000..1cc9528
> > --- /dev/null
> > +++ b/xen/arch/arm/arm32/smc.S
> > @@ -0,0 +1,32 @@
> > +/*
> > + * xen/arch/arm/arm32/smc.S
> > + *
> > + * Wrapper for Secure Monitors Calls
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License as published by
> > + * the Free Software Foundation; either version 2 of the License, or
> > + * (at your option) any later version.
> > + *
> > + * This program is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > + * GNU General Public License for more details.
> > + */
> > +
> > +#include <asm/macros.h>
> > +
> > +/*
> > + * void call_smccc_smc(register_t a0, register_t a1, register_t a2,
> > + *                     register_t a3, register_t a4, register_t a5,
> > + *                     register_t a6, register_t a7, register_t res[4])
> > + */
> > +ENTRY(call_smccc_smc)
> > +        mov     r12, sp
> > +        push    {r4-r7}
> > +        ldm     r12, {r4-r7}
> > +        smc     #0
> > +        pop     {r4-r7}
> > +        ldr     r12, [sp, #(4 * 4)]
> 
> I haven't run this, but shouldn't it be:
> 
>   ldr     r12, [sp, #20]
> 
> ?
> 
I took this code from Linux (arch/arm/kernel/arm-smccc.h).
But, why #20? There are 5 parameters on the stack: a4-a7 and res:
a4:  [sp]
a5:  [sp, #4]
a6:  [sp, #8]
a7:  [sp, #12]
res: [sp, #16]

We need to save the returned values to res, so it looks right. Unless
I'm terribly wrong :)
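
The offset arithmetic above can be sketched in C. This is only an
illustrative model, not Xen code: stack_arg_offset() and both macros are
hypothetical names, and it assumes the usual arm32 calling convention in
which the first four word-sized arguments go in r0-r3 and the rest occupy
consecutive 4-byte slots starting at [sp].

```c
#include <assert.h>

/* Illustrative model only, not Xen code. */
#define NUM_REG_ARGS 4u   /* a0-a3 are passed in r0-r3 */
#define WORD_SIZE    4u   /* bytes per stack slot on arm32 */

/*
 * Byte offset from the current sp of the stack-passed argument with the
 * given zero-based index. words_pushed is the number of words the callee
 * has pushed since entry and not yet popped (0 right after "pop {r4-r7}").
 */
static unsigned int stack_arg_offset(unsigned int arg_index,
                                     unsigned int words_pushed)
{
    assert(arg_index >= NUM_REG_ARGS);   /* register arguments have no slot */
    return (arg_index - NUM_REG_ARGS + words_pushed) * WORD_SIZE;
}
```

With res as argument index 8 and nothing left on the stack after the pop,
the model gives 16, matching #(4 * 4). Had the load been placed before the
pop, the four saved registers would shift it to 32.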

> > +        stm     r12, {r0-r3}
> > +        bx      lr
> > diff --git a/xen/arch/arm/arm64/Makefile b/xen/arch/arm/arm64/Makefile
> > index 149b6b3..7831dc1 100644
> > --- a/xen/arch/arm/arm64/Makefile
> > +++ b/xen/arch/arm/arm64/Makefile
> > @@ -8,5 +8,6 @@ obj-y += entry.o
> >  obj-y += insn.o
> >  obj-$(CONFIG_LIVEPATCH) += livepatch.o
> >  obj-y += smpboot.o
> > +obj-y += smc.o
> >  obj-y += traps.o
> >  obj-y += vfp.o
> > diff --git a/xen/arch/arm/arm64/smc.S b/xen/arch/arm/arm64/smc.S
> > new file mode 100644
> > index 0000000..aa44fba
> > --- /dev/null
> > +++ b/xen/arch/arm/arm64/smc.S
> > @@ -0,0 +1,29 @@
> > +/*
> > + * xen/arch/arm/arm64/smc.S
> > + *
> > + * Wrapper for Secure Monitors Calls
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License as published by
> > + * the Free Software Foundation; either version 2 of the License, or
> > + * (at your option) any later version.
> > + *
> > + * This program is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > + * GNU General Public License for more details.
> > + */
> > +
> > +#include <asm/macros.h>
> > +
> > +/*
> > + * void call_smccc_smc(register_t a0, register_t a1, register_t a2,
> > + *                     register_t a3, register_t a4, register_t a5,
> > + *                     register_t a6, register_t a7, register_t res[4])
> > + */
> > +ENTRY(call_smccc_smc)
> > +        smc     #0
> > +        ldr     x4, [sp]
> > +        stp     x0, x1, [x4, 0]
> > +        stp     x2, x3, [x4, 16]
> > +        ret
> > diff --git a/xen/include/asm-arm/processor.h b/xen/include/asm-arm/processor.h
> > index 9f7a42f..4ce5bb6 100644
> > --- a/xen/include/asm-arm/processor.h
> > +++ b/xen/include/asm-arm/processor.h
> > @@ -786,6 +786,10 @@ void vcpu_regs_user_to_hyp(struct vcpu *vcpu,
> >  int call_smc(register_t function_id, register_t arg0, register_t arg1,
> >               register_t arg2);
> >  
> > +void call_smccc_smc(register_t a0, register_t a1, register_t a2,
> > +                    register_t a3, register_t a4, register_t a5,
> > +                    register_t a6, register_t a7, register_t res[4]);
> > +
> >  void do_trap_hyp_serror(struct cpu_user_regs *regs);
> >  
> >  void do_trap_guest_serror(struct cpu_user_regs *regs);
> > -- 
> > 2.7.4
> > 


^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [RFC PATCH 16/31] arm: add SMC wrapper that is compatible with SMCCC
  2017-12-05 14:58   ` Julien Grall
@ 2017-12-05 17:08     ` Volodymyr Babchuk
  2017-12-05 17:08       ` Julien Grall
  2017-12-05 17:20       ` Oleksandr Tyshchenko
  0 siblings, 2 replies; 108+ messages in thread
From: Volodymyr Babchuk @ 2017-12-05 17:08 UTC (permalink / raw)
  To: Julien Grall, Oleksandr Tyshchenko, xen-devel
  Cc: Edgar E. Iglesias, Stefano Stabellini

Hi Julien,

On 05.12.17 16:58, Julien Grall wrote:
> Hi Oleksandr,
> 
> On 09/11/17 17:10, Oleksandr Tyshchenko wrote:
>> From: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
>>
>> Existing SMC wrapper call_smc() allows only 4 parameters and
>> returns only one value. This is enough for existing
>> use in PSCI code, but TEE mediator will need a call that is
>> fully compatible with ARM SMCCC.
>> This patch adds this call for both arm32 and arm64.
>>
>> There was similar patch by Edgar E. Iglesias ([1]), but looks
>> like it is abandoned.
>>
>> [1] 
>> https://lists.xenproject.org/archives/html/xen-devel/2017-02/msg00636.html 
>>
>>
>> CC: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>
>>
>> Signed-off-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
>> CC: Stefano Stabellini <sstabellini@kernel.org>
>> CC: Julien Grall <julien.grall@linaro.org>
> 
> This patch was sent by Volodymyr a month ago (see [2]) and I had 
> comments on it. I would appreciate if you address them.
I can address your comments and send it as a separate patch to the ML.
Will it be fine?

WBR
-- 
Volodymyr


^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [RFC PATCH 16/31] arm: add SMC wrapper that is compatible with SMCCC
  2017-12-05 17:08     ` Volodymyr Babchuk
@ 2017-12-05 17:08       ` Julien Grall
  2017-12-05 17:20       ` Oleksandr Tyshchenko
  1 sibling, 0 replies; 108+ messages in thread
From: Julien Grall @ 2017-12-05 17:08 UTC (permalink / raw)
  To: Volodymyr Babchuk, Oleksandr Tyshchenko, xen-devel
  Cc: Edgar E. Iglesias, Stefano Stabellini



On 05/12/17 17:08, Volodymyr Babchuk wrote:
> Hi Julien,
> 
> On 05.12.17 16:58, Julien Grall wrote:
>> Hi Oleksandr,
>>
>> On 09/11/17 17:10, Oleksandr Tyshchenko wrote:
>>> From: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
>>>
>>> Existing SMC wrapper call_smc() allows only 4 parameters and
>>> returns only one value. This is enough for existing
>>> use in PSCI code, but TEE mediator will need a call that is
>>> fully compatible with ARM SMCCC.
>>> This patch adds this call for both arm32 and arm64.
>>>
>>> There was similar patch by Edgar E. Iglesias ([1]), but looks
>>> like it is abandoned.
>>>
>>> [1] 
>>> https://lists.xenproject.org/archives/html/xen-devel/2017-02/msg00636.html 
>>>
>>>
>>> CC: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>
>>>
>>> Signed-off-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
>>> CC: Stefano Stabellini <sstabellini@kernel.org>
>>> CC: Julien Grall <julien.grall@linaro.org>
>>
>> This patch was sent by Volodymyr a month ago (see [2]) and I had 
>> comments on it. I would appreciate if you address them.
> I can address your comments and send it as a separate patch to the ML.
> Will it be fine?

Sure.

> 
> WBR

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [RFC PATCH 16/31] arm: add SMC wrapper that is compatible with SMCCC
  2017-12-05 17:08     ` Volodymyr Babchuk
  2017-12-05 17:08       ` Julien Grall
@ 2017-12-05 17:20       ` Oleksandr Tyshchenko
  1 sibling, 0 replies; 108+ messages in thread
From: Oleksandr Tyshchenko @ 2017-12-05 17:20 UTC (permalink / raw)
  To: Volodymyr Babchuk
  Cc: Edgar E. Iglesias, xen-devel, Julien Grall, Stefano Stabellini

On Tue, Dec 5, 2017 at 7:08 PM, Volodymyr Babchuk
<volodymyr_babchuk@epam.com> wrote:
> Hi Julien,
Hi Julien, Volodymyr.

>
> On 05.12.17 16:58, Julien Grall wrote:
>>
>> Hi Oleksandr,
>>
>> On 09/11/17 17:10, Oleksandr Tyshchenko wrote:
>>>
>>> From: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
>>>
>>> Existing SMC wrapper call_smc() allows only 4 parameters and
>>> returns only one value. This is enough for existing
>>> use in PSCI code, but TEE mediator will need a call that is
>>> fully compatible with ARM SMCCC.
>>> This patch adds this call for both arm32 and arm64.
>>>
>>> There was similar patch by Edgar E. Iglesias ([1]), but looks
>>> like it is abandoned.
>>>
>>> [1]
>>> https://lists.xenproject.org/archives/html/xen-devel/2017-02/msg00636.html
>>>
>>> CC: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>
>>>
>>> Signed-off-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
>>> CC: Stefano Stabellini <sstabellini@kernel.org>
>>> CC: Julien Grall <julien.grall@linaro.org>
>>
>>
>> This patch was sent by Volodymyr a month ago (see [2]) and I had comments
>> on it. I would appreciate if you address them.
>
> I can address your comments and send it as a separate patch to the ML.
That would be really great! I will then be able to apply it and test on ARM64.

> Will it be fine?
>
> WBR
> --
> Volodymyr



-- 
Regards,

Oleksandr Tyshchenko


^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [RFC PATCH 16/31] arm: add SMC wrapper that is compatible with SMCCC
  2017-12-05 15:33     ` Volodymyr Babchuk
@ 2017-12-05 17:21       ` Stefano Stabellini
  0 siblings, 0 replies; 108+ messages in thread
From: Stefano Stabellini @ 2017-12-05 17:21 UTC (permalink / raw)
  To: Volodymyr Babchuk
  Cc: Oleksandr Tyshchenko, xen-devel, Stefano Stabellini,
	Edgar E. Iglesias, Julien Grall

On Tue, 5 Dec 2017, Volodymyr Babchuk wrote:
> Hi Stefano,
> 
> On Mon, Dec 04, 2017 at 06:30:13PM -0800, Stefano Stabellini wrote:
> > On Thu, 9 Nov 2017, Oleksandr Tyshchenko wrote:
> > > From: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
> > > 
> > > Existing SMC wrapper call_smc() allows only 4 parameters and
> > > returns only one value. This is enough for existing
> > > use in PSCI code, but TEE mediator will need a call that is
> > > fully compatible with ARM SMCCC.
> > > This patch adds this call for both arm32 and arm64.
> > > 
> > > There was similar patch by Edgar E. Iglesias ([1]), but looks
> > > like it is abandoned.
> > > 
> > > [1] https://lists.xenproject.org/archives/html/xen-devel/2017-02/msg00636.html
> > > 
> > > CC: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>
> > > 
> > > Signed-off-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
> > > CC: Stefano Stabellini <sstabellini@kernel.org>
> > > CC: Julien Grall <julien.grall@linaro.org>
> > > ---
> > >  xen/arch/arm/arm32/Makefile     |  1 +
> > >  xen/arch/arm/arm32/smc.S        | 32 ++++++++++++++++++++++++++++++++
> > >  xen/arch/arm/arm64/Makefile     |  1 +
> > >  xen/arch/arm/arm64/smc.S        | 29 +++++++++++++++++++++++++++++
> > >  xen/include/asm-arm/processor.h |  4 ++++
> > >  5 files changed, 67 insertions(+)
> > >  create mode 100644 xen/arch/arm/arm32/smc.S
> > >  create mode 100644 xen/arch/arm/arm64/smc.S
> > > 
> > > diff --git a/xen/arch/arm/arm32/Makefile b/xen/arch/arm/arm32/Makefile
> > > index 0ac254f..a2362f3 100644
> > > --- a/xen/arch/arm/arm32/Makefile
> > > +++ b/xen/arch/arm/arm32/Makefile
> > > @@ -8,6 +8,7 @@ obj-y += insn.o
> > >  obj-$(CONFIG_LIVEPATCH) += livepatch.o
> > >  obj-y += proc-v7.o proc-caxx.o
> > >  obj-y += smpboot.o
> > > +obj-y += smc.o
> > >  obj-y += traps.o
> > >  obj-y += vfp.o
> > >  
> > > diff --git a/xen/arch/arm/arm32/smc.S b/xen/arch/arm/arm32/smc.S
> > > new file mode 100644
> > > index 0000000..1cc9528
> > > --- /dev/null
> > > +++ b/xen/arch/arm/arm32/smc.S
> > > @@ -0,0 +1,32 @@
> > > +/*
> > > + * xen/arch/arm/arm32/smc.S
> > > + *
> > > + * Wrapper for Secure Monitors Calls
> > > + *
> > > + * This program is free software; you can redistribute it and/or modify
> > > + * it under the terms of the GNU General Public License as published by
> > > + * the Free Software Foundation; either version 2 of the License, or
> > > + * (at your option) any later version.
> > > + *
> > > + * This program is distributed in the hope that it will be useful,
> > > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > > + * GNU General Public License for more details.
> > > + */
> > > +
> > > +#include <asm/macros.h>
> > > +
> > > +/*
> > > + * void call_smccc_smc(register_t a0, register_t a1, register_t a2,
> > > + *                     register_t a3, register_t a4, register_t a5,
> > > + *                     register_t a6, register_t a7, register_t res[4])
> > > + */
> > > +ENTRY(call_smccc_smc)
> > > +        mov     r12, sp
> > > +        push    {r4-r7}
> > > +        ldm     r12, {r4-r7}
> > > +        smc     #0
> > > +        pop     {r4-r7}
> > > +        ldr     r12, [sp, #(4 * 4)]
> > 
> > I haven't run this, but shouldn't it be:
> > 
> >   ldr     r12, [sp, #20]
> > 
> > ?
> > 
> I took this code from linux (arch/arm/kernel/arm-smccc.h).
> But, why #20? There are 5 parameters on the stack: a4-a7 and res:
> a4:  [sp]
> a5:  [sp, #4]
> a6:  [sp, #8]
> a7:  [sp, #12]
> res: [sp, #16]
> 
> We need to save returned values to res. So it looks right. Unless
> I'm terribly wrong :)

Oops, I miscounted.
When taking code from Linux, it would be nice to say where you took it
from. Also, you definitely need to add the right signed-off-by lines for
copyright reasons.

Cheers,

Stefano


^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [RFC PATCH 04/31] cpufreq: make turbo settings to be configurable
  2017-12-05 11:13         ` Oleksandr Tyshchenko
@ 2017-12-05 19:24           ` Stefano Stabellini
  2017-12-06 11:28             ` Oleksandr Tyshchenko
  0 siblings, 1 reply; 108+ messages in thread
From: Stefano Stabellini @ 2017-12-05 19:24 UTC (permalink / raw)
  To: Oleksandr Tyshchenko
  Cc: Stefano Stabellini, Andrew Cooper, Oleksandr Dmytryshyn,
	Julien Grall, Oleksandr Tyshchenko, Jan Beulich, xen-devel

On Tue, 5 Dec 2017, Oleksandr Tyshchenko wrote:
> >> Another question is second_max_freq. As I understand, it is highest
> >> non-turbo frequency calculated by framework to limit target frequency
> >> when
> >> turbo mode "is disabled". And Xen assumes that second_max_freq is
> >> always P1 if turbo mode is on.
> >> But, there might be a case when a few highest frequencies are
> >> turbo-frequencies. So, I propose to add an extra flag for handling
> >> that.
> >> So, each CPUFreq driver responsibility will be to mark
> >> turbo-frequency(ies) for the framework to properly calculate
> >> second_max_freq.
> >
> > As Andre wrote, we can start simply assuming that ARM doesn't have
> > turbo. If turbo mode is assumed to be off, I don't think we need the
> > patch below and the new flag, because second_max_freq == max_freq.
> 
> I just want to show you real example, where we have ARM SoC +
> turbo-mode + > 1 turbo freq
> https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas-bsp.git/tree/arch/arm64/boot/dts/renesas/r8a7795.dtsi?h=v4.9/rcar-3.5.9#n197
> As you can see, there are two freqs marked as turbo-freqs: 1600000000
> Hz and 1700000000 Hz
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas-bsp.git/tree/arch/arm64/boot/dts/renesas/r8a7796.dtsi?h=v4.9/rcar-3.5.9#n166
> For M3 SoC three turbo-freqs are used: 1600000000 Hz, 1700000000 Hz
> and 1800000000 Hz

Oh well, I take that back then :-)


> If a proposed below patch is not an option then we should find another
> way to clarify second_max_freq.

Yes, it looks like there must be better ways to define second_max_freq.
Taking the first frequency below the max seems a bit crude to me.


> >
> >> Something like that:
> >>
> >> diff --git a/xen/drivers/cpufreq/utility.c b/xen/drivers/cpufreq/utility.c
> >> index 25bf983..122a88b 100644
> >> --- a/xen/drivers/cpufreq/utility.c
> >> +++ b/xen/drivers/cpufreq/utility.c
> >> @@ -226,7 +226,8 @@ int cpufreq_frequency_table_cpuinfo(struct
> >> cpufreq_policy *policy,
> >>  #ifdef CONFIG_HAS_CPU_TURBO
> >>      for (i=0; (table[i].frequency != CPUFREQ_TABLE_END); i++) {
> >>          unsigned int freq = table[i].frequency;
> >> -        if (freq == CPUFREQ_ENTRY_INVALID || freq == max_freq)
> >> +        if ((freq == CPUFREQ_ENTRY_INVALID) ||
> >> +            (table[i].flags & CPUFREQ_BOOST_FREQ))
> >>              continue;
> >>          if (freq > second_max_freq)
> >>              second_max_freq = freq;
> >> diff --git a/xen/include/xen/cpufreq.h b/xen/include/xen/cpufreq.h
> >> index 2e0c16a..77b29da 100644
> >> --- a/xen/include/xen/cpufreq.h
> >> +++ b/xen/include/xen/cpufreq.h
> >> @@ -204,7 +204,11 @@ void cpufreq_verify_within_limits(struct
> >> cpufreq_policy *policy,
> >>  #define CPUFREQ_ENTRY_INVALID ~0
> >>  #define CPUFREQ_TABLE_END     ~1
> >>
> >> +/* Special Values of .flags field */
> >> +#define CPUFREQ_BOOST_FREQ    (1 << 0)
> >> +
> >>  struct cpufreq_frequency_table {
> >> +       unsigned int    flags;
> >>      unsigned int    index;     /* any */
> >>      unsigned int    frequency; /* kHz - doesn't need to be in ascending
> >>                                  * order */
> >>
> >> Both existing on x86 CPUFreq drivers just need to mark P0 frequency as
> >> a turbo-frequency if turbo mode "is supported". Am I correct?

Yes, I think it is a better approach than what we have today, even for
x86.
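
To make the difference concrete, here is a minimal standalone sketch that
contrasts the two ways of picking second_max_freq. It is not the Xen
implementation: struct freq_entry and both function names are
hypothetical, the fallback to max_freq in the first function is an
assumption of this sketch, and the three non-turbo entries in
example_table are made-up values. Only the two boost entries (1600000 kHz
and 1700000 kHz) come from the r8a7795 example quoted above.

```c
#include <stddef.h>

/* Values mirror the defines in the quoted patch. */
#define CPUFREQ_ENTRY_INVALID (~0u)
#define CPUFREQ_TABLE_END     (~1u)
#define CPUFREQ_BOOST_FREQ    (1u << 0)   /* flag proposed in this thread */

/* Hypothetical stand-in for struct cpufreq_frequency_table. */
struct freq_entry {
    unsigned int flags;
    unsigned int frequency;   /* kHz, not necessarily in ascending order */
};

/* Skip only the single highest frequency. */
static unsigned int second_max_skip_top(const struct freq_entry *table)
{
    unsigned int max_freq = 0, second_max_freq = 0;
    size_t i;

    for ( i = 0; table[i].frequency != CPUFREQ_TABLE_END; i++ )
    {
        unsigned int freq = table[i].frequency;

        if ( freq != CPUFREQ_ENTRY_INVALID && freq > max_freq )
            max_freq = freq;
    }

    for ( i = 0; table[i].frequency != CPUFREQ_TABLE_END; i++ )
    {
        unsigned int freq = table[i].frequency;

        if ( freq == CPUFREQ_ENTRY_INVALID || freq == max_freq )
            continue;
        if ( freq > second_max_freq )
            second_max_freq = freq;
    }

    /* Assumption of this sketch: a one-entry table falls back to max. */
    return second_max_freq ? second_max_freq : max_freq;
}

/* Skip every entry the driver marked as a boost frequency. */
static unsigned int second_max_skip_boost(const struct freq_entry *table)
{
    unsigned int second_max_freq = 0;
    size_t i;

    for ( i = 0; table[i].frequency != CPUFREQ_TABLE_END; i++ )
    {
        unsigned int freq = table[i].frequency;

        if ( freq == CPUFREQ_ENTRY_INVALID ||
             (table[i].flags & CPUFREQ_BOOST_FREQ) )
            continue;
        if ( freq > second_max_freq )
            second_max_freq = freq;
    }

    return second_max_freq;
}

/* Two boost entries as in the r8a7795 example; the rest is illustrative. */
static const struct freq_entry example_table[] = {
    { 0,                   500000 },
    { 0,                  1000000 },
    { 0,                  1500000 },
    { CPUFREQ_BOOST_FREQ, 1600000 },
    { CPUFREQ_BOOST_FREQ, 1700000 },
    { 0,                  CPUFREQ_TABLE_END },
};

/* A table whose driver marks nothing as boost. */
static const struct freq_entry plain_table[] = {
    { 0,  500000 },
    { 0, 1000000 },
    { 0, 1500000 },
    { 0, CPUFREQ_TABLE_END },
};
```

On example_table the first function returns 1600000 kHz, which is itself
a boost frequency, while the flag-based one returns 1500000 kHz. On
plain_table the flag-based function simply returns the maximum, which
matches the "second_max_freq == max_freq" case for a platform with no
turbo entries.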


> >> And the most important question is how to recognize in Xen on ARM
> >> (using SCPI protocol) which frequencies are turbo-frequencies
> >> actually? I couldn't find any information regarding that in protocol
> >> description.
> >> For DT-based CPUFreq it is not an issue, since there is a specific
> >> property "turbo-mode" to mark corresponding OPPs. [1].
> >> But neither SCPI DT bindings [2] nor the SCPI protocol itself [3]
> >> mentions about it. Perhaps, additional command should be added to pass
> >> such info.
> >>
> >> [1] https://www.kernel.org/doc/Documentation/devicetree/bindings/opp/opp.txt
> >> [2] http://elixir.free-electrons.com/linux/v4.15-rc1/source/Documentation/devicetree/bindings/arm/arm,scpi.txt
> >> [3] http://infocenter.arm.com/help/topic/com.arm.doc.dui0922g/scp_message_interface_v1_2_DUI0922G_en.pdf

If there are no mentions of them, then I would assume that none of the
available frequencies are turbo frequencies.


^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [RFC PATCH 06/31] cpufreq: make cpufreq driver more generalizable
  2017-12-04 22:46       ` Stefano Stabellini
@ 2017-12-05 19:29         ` Oleksandr Tyshchenko
  2017-12-05 20:48           ` Stefano Stabellini
  0 siblings, 1 reply; 108+ messages in thread
From: Oleksandr Tyshchenko @ 2017-12-05 19:29 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Andrew Cooper, Oleksandr Dmytryshyn, Julien Grall,
	Oleksandr Tyshchenko, Jan Beulich, xen-devel

Hi Stefano

On Tue, Dec 5, 2017 at 12:46 AM, Stefano Stabellini
<sstabellini@kernel.org> wrote:
> On Mon, 4 Dec 2017, Oleksandr Tyshchenko wrote:
>> Hi, Stefano
>>
>> On Sat, Dec 2, 2017 at 3:37 AM, Stefano Stabellini
>> <sstabellini@kernel.org> wrote:
>> > On Thu, 9 Nov 2017, Oleksandr Tyshchenko wrote:
>> >> From: Oleksandr Dmytryshyn <oleksandr.dmytryshyn@globallogic.com>
>> >>
>> >> First implementation of the cpufreq driver has been
>> >> written with x86 in mind. This patch makes possible
>> >> the cpufreq driver be working on both x86 and arm
>> >> architectures.
>> >>
>> >> This is a rebased version of the original patch:
>> >> https://lists.xen.org/archives/html/xen-devel/2014-11/msg00932.html
>> >>
>> >> Signed-off-by: Oleksandr Dmytryshyn <oleksandr.dmytryshyn@globallogic.com>
>> >> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>> >> CC: Jan Beulich <jbeulich@suse.com>
>> >> CC: Andrew Cooper <andrew.cooper3@citrix.com>
>> >> CC: Stefano Stabellini <sstabellini@kernel.org>
>> >> CC: Julien Grall <julien.grall@linaro.org>
>> >> ---
>> >>  xen/drivers/cpufreq/cpufreq.c    | 81 +++++++++++++++++++++++++++++++++++++---
>> >>  xen/include/public/platform.h    |  1 +
>> >>  xen/include/xen/processor_perf.h |  6 +++
>> >>  3 files changed, 82 insertions(+), 6 deletions(-)
>> >>
>> >> diff --git a/xen/drivers/cpufreq/cpufreq.c b/xen/drivers/cpufreq/cpufreq.c
>> >> index ab909e2..64e1ae7 100644
>> >> --- a/xen/drivers/cpufreq/cpufreq.c
>> >> +++ b/xen/drivers/cpufreq/cpufreq.c
>> >> @@ -42,7 +42,6 @@
>> >>  #include <asm/io.h>
>> >>  #include <asm/processor.h>
>> >>  #include <asm/percpu.h>
>> >> -#include <acpi/acpi.h>
>> >>  #include <xen/cpufreq.h>
>> >>
>> >>  static unsigned int __read_mostly usr_min_freq;
>> >> @@ -206,6 +205,7 @@ int cpufreq_add_cpu(unsigned int cpu)
>> >>      } else {
>> >>          /* domain sanity check under whatever coordination type */
>> >>          firstcpu = cpumask_first(cpufreq_dom->map);
>> >> +#ifdef CONFIG_ACPI
>> >>          if ((perf->domain_info.coord_type !=
>> >>              processor_pminfo[firstcpu]->perf.domain_info.coord_type) ||
>> >>              (perf->domain_info.num_processors !=
>> >> @@ -221,6 +221,19 @@ int cpufreq_add_cpu(unsigned int cpu)
>> >>                  );
>> >>              return -EINVAL;
>> >>          }
>> >> +#else /* !CONFIG_ACPI */
>> >> +        if ((perf->domain_info.num_processors !=
>> >> +            processor_pminfo[firstcpu]->perf.domain_info.num_processors)) {
>> >> +
>> >> +            printk(KERN_WARNING "cpufreq fail to add CPU%d:"
>> >> +                   "incorrect num processors (%"PRIu64"), "
>> >> +                   "expect(%"PRIu64")\n",
>> >> +                   cpu, perf->domain_info.num_processors,
>> >> +                   processor_pminfo[firstcpu]->perf.domain_info.num_processors
>> >> +                );
>> >> +            return -EINVAL;
>> >> +        }
>> >> +#endif /* CONFIG_ACPI */
>> >
>> > Why is this necessary? I am asking this question, because I think it
>> > would be best to avoid more #ifdef's if we can avoid them, and some of
>> > the code #ifdef'ed doesn't look very acpi specific (at least at first
>> > sight). It doesn't look like this change is very beneficial. What am I
>> > missing?
>>
>> Probably, the original author of this patch wanted to avoid playing
>> with some stuff (code & variables) which didn't make sense/wouldn't be
>> used on non-ACPI systems.
>>
>> Agree here, we are able to avoid this #ifdef as well as many others. I
>> don't see an issue, for example, to print something defaulting for
>> coord_type/num_entries/revision/etc.
>
> I agree
>
>
>> >
>> >
>> >>      }
>> >>
>> >>      if (!domexist || hw_all) {
>> >> @@ -380,6 +393,7 @@ int cpufreq_del_cpu(unsigned int cpu)
>> >>      return 0;
>> >>  }
>> >>
>> >> +#ifdef CONFIG_ACPI
>> >>  static void print_PCT(struct xen_pct_register *ptr)
>> >>  {
>> >>      printk("\t_PCT: descriptor=%d, length=%d, space_id=%d, "
>> >> @@ -387,12 +401,14 @@ static void print_PCT(struct xen_pct_register *ptr)
>> >>             ptr->descriptor, ptr->length, ptr->space_id, ptr->bit_width,
>> >>             ptr->bit_offset, ptr->reserved, ptr->address);
>> >>  }
>> >> +#endif /* CONFIG_ACPI */
>> >
>> > same question
>>
>> definitely omit #ifdef
>>
>> >
>> >
>> >>  static void print_PSS(struct xen_processor_px *ptr, int count)
>> >>  {
>> >>      int i;
>> >>      printk("\t_PSS: state_count=%d\n", count);
>> >>      for (i=0; i<count; i++){
>> >> +#ifdef CONFIG_ACPI
>> >>          printk("\tState%d: %"PRId64"MHz %"PRId64"mW %"PRId64"us "
>> >>                 "%"PRId64"us %#"PRIx64" %#"PRIx64"\n",
>> >>                 i,
>> >> @@ -402,15 +418,26 @@ static void print_PSS(struct xen_processor_px *ptr, int count)
>> >>                 ptr[i].bus_master_latency,
>> >>                 ptr[i].control,
>> >>                 ptr[i].status);
>> >> +#else /* !CONFIG_ACPI */
>> >> +        printk("\tState%d: %"PRId64"MHz %"PRId64"us\n",
>> >> +               i,
>> >> +               ptr[i].core_frequency,
>> >> +               ptr[i].transition_latency);
>> >> +#endif /* CONFIG_ACPI */
>> >>      }
>> >>  }
>> >
>> > same question
>>
>> same answer)
>>
>> >
>> >
>> >>  static void print_PSD( struct xen_psd_package *ptr)
>> >>  {
>> >> +#ifdef CONFIG_ACPI
>> >>      printk("\t_PSD: num_entries=%"PRId64" rev=%"PRId64
>> >>             " domain=%"PRId64" coord_type=%"PRId64" num_processors=%"PRId64"\n",
>> >>             ptr->num_entries, ptr->revision, ptr->domain, ptr->coord_type,
>> >>             ptr->num_processors);
>> >> +#else /* !CONFIG_ACPI */
>> >> +    printk("\t_PSD:  domain=%"PRId64" num_processors=%"PRId64"\n",
>> >> +           ptr->domain, ptr->num_processors);
>> >> +#endif /* CONFIG_ACPI */
>> >>  }
>> >
>> > same question
>>
>> same answer)
>>
>> >
>> >
>> >>  static void print_PPC(unsigned int platform_limit)
>> >> @@ -418,13 +445,53 @@ static void print_PPC(unsigned int platform_limit)
>> >>      printk("\t_PPC: %d\n", platform_limit);
>> >>  }
>> >>
>> >> +static inline bool is_pss_data(struct xen_processor_performance *px)
>> >> +{
>> >> +#ifdef CONFIG_ACPI
>> >> +    return px->flags & XEN_PX_PSS;
>> >> +#else
>> >> +    return px->flags == XEN_PX_DATA;
>> >> +#endif
>> >> +}
>> >> +
>> >> +static inline bool is_psd_data(struct xen_processor_performance *px)
>> >> +{
>> >> +#ifdef CONFIG_ACPI
>> >> +    return px->flags & XEN_PX_PSD;
>> >> +#else
>> >> +    return px->flags == XEN_PX_DATA;
>> >> +#endif
>> >> +}
>> >> +
>> >> +static inline bool is_ppc_data(struct xen_processor_performance *px)
>> >> +{
>> >> +#ifdef CONFIG_ACPI
>> >> +    return px->flags & XEN_PX_PPC;
>> >> +#else
>> >> +    return px->flags == XEN_PX_DATA;
>> >> +#endif
>> >> +}
>> >> +
>> >> +static inline bool is_all_data(struct xen_processor_performance *px)
>> >> +{
>> >> +#ifdef CONFIG_ACPI
>> >> +    return px->flags == ( XEN_PX_PCT | XEN_PX_PSS | XEN_PX_PSD | XEN_PX_PPC );
>> >> +#else
>> >> +    return px->flags == XEN_PX_DATA;
>> >> +#endif
>> >> +}
>> >
>> > Could you please explain here and in the commit message the idea behind
>> > this? It looks like we want to get rid of the different flags on
>> > non-ACPI systems? Why can't we reuse the same flags?
>>
>> You are right. Indeed looks redundant.
>> I will drop all these helpers and reuse existing flags. If we are
>> pretending to be an P-state driver and uploading the same P-state data
>> which [1] uploads
>> then I will just reuse existing flags. It will cost me nothing.
>
> Makes sense
>
>
>> May I ask you to take a look at this patch [2]? It looks like a hack
>> right now, but how to make it in a proper way?
>>
>> [1] https://github.com/torvalds/linux/blob/master/drivers/xen/xen-acpi-processor.c#L210
>> [2] https://www.mail-archive.com/xen-devel@lists.xen.org/msg128410.html
>
> Regarding [2]:
>
> This is something that needs to be agreed with the x86 maintainers.
> However, I would move the copy_from_guest (and everything related to
> parsing caller provided arguments) to
> xen/arch/x86/platform_hypercall.c:do_platform_op.
>
> Then, I would make set_px_pminfo look like a regular function that
> takes regular arguments (no XEN_GUEST_HANDLEs), so that it can be called
> on ARM without having to "fake" an hypercall.

Just to clarify:

The current function interface is:
int set_px_pminfo(uint32_t acpi_id, struct xen_processor_performance
*dom0_px_info)
where the "dom0_px_info" argument contains a XEN_GUEST_HANDLE, which we
would like to avoid dealing with on ARM.

The idea of moving the XEN_GUEST_HANDLE handling (copy_from_guest) out
of the function sounds reasonable.
But what function interface will we end up with?

It looks like we need either to pass each structure field as a separate
argument, so that the "new" function interface would be:
int set_px_pminfo(uint32_t acpi_id, uint32_t flags, ... , struct
xen_processor_px *states, ... , uint32_t shared_type)
or to reuse "struct processor_performance" somehow in order to reduce
the number of arguments...

Or did I miss something?

>
>
>> >
>> >
>> >>  int set_px_pminfo(uint32_t acpi_id, struct xen_processor_performance *dom0_px_info)
>> >>  {
>> >>      int ret=0, cpuid;
>> >>      struct processor_pminfo *pmpt;
>> >>      struct processor_performance *pxpt;
>> >>
>> >> +#ifdef CONFIG_ACPI
>> >>      cpuid = get_cpu_id(acpi_id);
>> >> +#else
>> >> +    cpuid = acpi_id;
>> >> +#endif
>> >
>> > Rather than an #ifdef here, I would probably generalize the get_cpu_id
>> > function.
>>
>> Would a following stub be enough?
>>
>> diff --git a/xen/include/xen/acpi.h b/xen/include/xen/acpi.h
>> index 9409350..4aab41e 100644
>> --- a/xen/include/xen/acpi.h
>> +++ b/xen/include/xen/acpi.h
>> @@ -123,7 +123,11 @@ static inline int acpi_boot_table_init(void)
>>
>>  #endif         /*!CONFIG_ACPI*/
>>
>> +#ifdef CONFIG_ACPI
>>  int get_cpu_id(u32 acpi_id);
>> +#else
>> +static inline int get_cpu_id(u32 acpi_id) { return acpi_id; }
>> +#endif
>>
>>  unsigned int acpi_register_gsi (u32 gsi, int edge_level, int active_high_low);
>>  int acpi_gsi_to_irq (u32 gsi, unsigned int *irq);
>
> Yes, I think that's OK.
>
>
>> >
>> >
>> >>      if ( cpuid < 0 || !dom0_px_info)
>> >>      {
>> >>          ret = -EINVAL;
>> >> @@ -446,6 +513,8 @@ int set_px_pminfo(uint32_t acpi_id, struct xen_processor_performance *dom0_px_in
>> >>          processor_pminfo[cpuid] = pmpt;
>> >>      }
>> >>      pxpt = &pmpt->perf;
>> >> +
>> >> +#ifdef CONFIG_ACPI
>> >>      pmpt->acpi_id = acpi_id;
>> >>      pmpt->id = cpuid;
>> >>
>> >> @@ -472,8 +541,9 @@ int set_px_pminfo(uint32_t acpi_id, struct xen_processor_performance *dom0_px_in
>> >>              print_PCT(&pxpt->status_register);
>> >>          }
>> >>      }
>> >> +#endif /* CONFIG_ACPI */
>>
>> BTW, at the first sight we could omit this #ifdef too with being taken
>> care of space_id check to pass successfully.
>>
>> >>
>> >> -    if ( dom0_px_info->flags & XEN_PX_PSS )
>> >> +    if ( is_pss_data(dom0_px_info) )
>> >>      {
>> >>          /* capability check */
>> >>          if (dom0_px_info->state_count <= 1)
>> >> @@ -500,7 +570,7 @@ int set_px_pminfo(uint32_t acpi_id, struct xen_processor_performance *dom0_px_in
>> >>              print_PSS(pxpt->states,pxpt->state_count);
>> >>      }
>> >>
>> >> -    if ( dom0_px_info->flags & XEN_PX_PSD )
>> >> +    if ( is_psd_data(dom0_px_info) )
>> >>      {
>> >>          /* check domain coordination */
>> >>          if (dom0_px_info->shared_type != CPUFREQ_SHARED_TYPE_ALL &&
>> >> @@ -520,7 +590,7 @@ int set_px_pminfo(uint32_t acpi_id, struct xen_processor_performance *dom0_px_in
>> >>              print_PSD(&pxpt->domain_info);
>> >>      }
>> >>
>> >> -    if ( dom0_px_info->flags & XEN_PX_PPC )
>> >> +    if ( is_ppc_data(dom0_px_info) )
>> >>      {
>> >>          pxpt->platform_limit = dom0_px_info->platform_limit;
>> >>
>> >> @@ -534,8 +604,7 @@ int set_px_pminfo(uint32_t acpi_id, struct xen_processor_performance *dom0_px_in
>> >>          }
>> >>      }
>> >>
>> >> -    if ( dom0_px_info->flags == ( XEN_PX_PCT | XEN_PX_PSS |
>> >> -                XEN_PX_PSD | XEN_PX_PPC ) )
>> >> +    if ( is_all_data(dom0_px_info) )
>> >>      {
>> >>          pxpt->init = XEN_PX_INIT;
>> >>
>> >> diff --git a/xen/include/public/platform.h b/xen/include/public/platform.h
>> >> index 94dbc3f..328579c 100644
>> >> --- a/xen/include/public/platform.h
>> >> +++ b/xen/include/public/platform.h
>> >> @@ -384,6 +384,7 @@ DEFINE_XEN_GUEST_HANDLE(xenpf_getidletime_t);
>> >>  #define XEN_PX_PSS   2
>> >>  #define XEN_PX_PPC   4
>> >>  #define XEN_PX_PSD   8
>> >> +#define XEN_PX_DATA  16
>> >>
>> >>  struct xen_power_register {
>> >>      uint32_t     space_id;
>> >> diff --git a/xen/include/xen/processor_perf.h b/xen/include/xen/processor_perf.h
>> >> index d8a1ba6..afdccf2 100644
>> >> --- a/xen/include/xen/processor_perf.h
>> >> +++ b/xen/include/xen/processor_perf.h
>> >> @@ -3,7 +3,9 @@
>> >>
>> >>  #include <public/platform.h>
>> >>  #include <public/sysctl.h>
>> >> +#ifdef CONFIG_ACPI
>> >>  #include <xen/acpi.h>
>> >> +#endif
>> >>
>> >>  #define XEN_PX_INIT 0x80000000
>> >>
>> >> @@ -24,8 +26,10 @@ int  cpufreq_del_cpu(unsigned int);
>> >>  struct processor_performance {
>> >>      uint32_t state;
>> >>      uint32_t platform_limit;
>> >> +#ifdef CONFIG_ACPI
>> >>      struct xen_pct_register control_register;
>> >>      struct xen_pct_register status_register;
>> >> +#endif
>> >>      uint32_t state_count;
>> >>      struct xen_processor_px *states;
>> >>      struct xen_psd_package domain_info;
>> >> @@ -35,8 +39,10 @@ struct processor_performance {
>> >>  };
>> >>
>> >>  struct processor_pminfo {
>> >> +#ifdef CONFIG_ACPI
>> >>      uint32_t acpi_id;
>> >>      uint32_t id;
>> >> +#endif
>> >>      struct processor_performance    perf;
>> >>  };
>>
>> There will be no changes here as well.
>



-- 
Regards,

Oleksandr Tyshchenko


* Re: [RFC PATCH 06/31] cpufreq: make cpufreq driver more generalizable
  2017-12-05 19:29         ` Oleksandr Tyshchenko
@ 2017-12-05 20:48           ` Stefano Stabellini
  2017-12-06  7:54             ` Jan Beulich
  0 siblings, 1 reply; 108+ messages in thread
From: Stefano Stabellini @ 2017-12-05 20:48 UTC (permalink / raw)
  To: Oleksandr Tyshchenko, andrew.cooper3, jbeulich
  Cc: xen-devel, Oleksandr Dmytryshyn, Stefano Stabellini,
	Julien Grall, Oleksandr Tyshchenko

On Tue, 5 Dec 2017, Oleksandr Tyshchenko wrote:
> Hi Stefano
> 
> Just to clarify:
> 
> The current function interface is:
> int set_px_pminfo(uint32_t acpi_id, struct xen_processor_performance
> *dom0_px_info)
> where the "dom0_px_info" argument contains a XEN_GUEST_HANDLE, which we
> would like to avoid dealing with on ARM.
> 
> The idea of moving the XEN_GUEST_HANDLE handling (copy_from_guest) out
> of the function sounds reasonable.
> But what function interface will we end up with?
> 
> It looks like we need either to pass each structure field as a separate
> argument, so that the "new" function interface would be:
> int set_px_pminfo(uint32_t acpi_id, uint32_t flags, ... , struct
> xen_processor_px *states, ... , uint32_t shared_type)
> or to reuse "struct processor_performance" somehow in order to reduce
> the number of arguments...
> 
> Or did I miss something?

You are right. We need to define a new struct for internal usage, for
example:

struct xen_processor_performance_internal {
    uint32_t flags;     /* flag for Px sub info type */
    uint32_t platform_limit;  /* Platform limitation on freq usage */
    struct xen_pct_register control_register;
    struct xen_pct_register status_register;
    uint32_t state_count;     /* total available performance states */
    struct xen_processor_px *states; /* array of state_count entries */
    struct xen_psd_package domain_info;
    uint32_t shared_type;     /* coordination type of this processor */
};

Jan, Andrew, does this sound like a good approach to you?
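
A minimal sketch of how the split could look, under stated assumptions:
the types below are trimmed stand-ins rather than the real Xen
structures, mock_copy_from_guest() only imitates copy_from_guest(), and
platform_op_set_px() and set_px_pminfo_internal() are hypothetical names.
The point is that only the hypercall path touches the guest handle,
while the core function takes an ordinary pointer and can be called
directly on ARM.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define NR_CPUS_DEMO 4
#define MAX_STATES   32

/* Trimmed stand-in for struct xen_processor_px. */
struct px_state {
    uint64_t core_frequency;     /* MHz */
    uint64_t transition_latency; /* us */
};

/* What a guest passes in: "states" is an opaque guest address. */
struct px_info_guest {
    uint32_t flags;
    uint32_t state_count;
    const void *states_handle;   /* stand-in for a XEN_GUEST_HANDLE */
};

/* Internal form: "states" is an ordinary hypervisor pointer. */
struct px_info_internal {
    uint32_t flags;
    uint32_t state_count;
    struct px_state *states;
};

/* Demo bookkeeping: highest frequency seen per CPU. */
static uint64_t max_freq_mhz[NR_CPUS_DEMO];

/* Imitation of copy_from_guest(); returns 0 on success. */
static int mock_copy_from_guest(void *dst, const void *handle, size_t bytes)
{
    if (!handle)
        return -EFAULT;
    memcpy(dst, handle, bytes);
    return 0;
}

/* Core logic: regular arguments only, no guest handles. */
static int set_px_pminfo_internal(uint32_t cpu,
                                  const struct px_info_internal *info)
{
    uint32_t i;

    /* Mirrors the "state_count <= 1" capability check in the patch. */
    if (cpu >= NR_CPUS_DEMO || !info || !info->states ||
        info->state_count <= 1)
        return -EINVAL;

    max_freq_mhz[cpu] = 0;
    for (i = 0; i < info->state_count; i++)
        if (info->states[i].core_frequency > max_freq_mhz[cpu])
            max_freq_mhz[cpu] = info->states[i].core_frequency;

    return 0;
}

/* Hypercall side: the only place that deals with the guest handle. */
static int platform_op_set_px(uint32_t cpu, const struct px_info_guest *g)
{
    struct px_info_internal info;
    int ret;

    if (!g || g->state_count == 0 || g->state_count > MAX_STATES)
        return -EINVAL;

    info.flags = g->flags;
    info.state_count = g->state_count;
    info.states = calloc(g->state_count, sizeof(*info.states));
    if (!info.states)
        return -ENOMEM;

    ret = mock_copy_from_guest(info.states, g->states_handle,
                               g->state_count * sizeof(*info.states));
    if (!ret)
        ret = set_px_pminfo_internal(cpu, &info);

    free(info.states);
    return ret;
}
```

On ARM, a caller that got its states from the device tree or firmware
would fill a px_info_internal and call set_px_pminfo_internal()
directly, without emulating a hypercall.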


* Re: [RFC PATCH 22/31] xen/arm: Add Xen changes to SCPI protocol
  2017-11-09 17:10 ` [RFC PATCH 22/31] xen/arm: Add Xen changes to SCPI protocol Oleksandr Tyshchenko
@ 2017-12-05 21:20   ` Stefano Stabellini
  2017-12-05 21:41     ` Julien Grall
  0 siblings, 1 reply; 108+ messages in thread
From: Stefano Stabellini @ 2017-12-05 21:20 UTC (permalink / raw)
  To: Oleksandr Tyshchenko
  Cc: xen-devel, Stefano Stabellini, Julien Grall, Oleksandr Tyshchenko

On Thu, 9 Nov 2017, Oleksandr Tyshchenko wrote:
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> 
> Modify the direct ported SCPI Message Protocol driver to be
> functional inside Xen.
> 
> As SCPI Message protocol driver expects mailbox to be registered,
> find and initialize mailbox before probing it.
> 
> Include "wrappers.h" which contains all required things the direct
> ported code relies on.
> 
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> CC: Stefano Stabellini <sstabellini@kernel.org>
> CC: Julien Grall <julien.grall@linaro.org>

As far as drivers ported from Linux go, this looks pretty clean in terms
of changes and nasty glue code required to get it to work.

The wrappers.h header is not too bad. The question remains whether we
should keep the #if 0 blocks to retain "textual compatibility" with Linux,
or just bite the bullet and apply the changes. If we commit them
as a separate patch, we can always dig out the difference between the
original driver and the Xen version using git.
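
For illustration, a minimal sketch of the kind of shim a header such as
wrappers.h might provide. The actual contents of wrappers.h are not
shown in this message, so everything below is an assumption: the macro
and function names mirror Linux helpers, but the bodies are user-space
stand-ins built on the C library.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for the device structure the ported code uses. */
struct device {
    const char *name;
};

/* Linux allocation flag, accepted and ignored by this shim. */
#define GFP_KERNEL 0u

/*
 * Linux-style error print mapped onto fprintf(). The ", ##__VA_ARGS__"
 * form is a GNU extension supported by GCC and Clang.
 */
#define dev_err(dev, fmt, ...) \
    fprintf(stderr, "%s: " fmt, (dev)->name, ##__VA_ARGS__)

/*
 * Stand-in for devm_kzalloc(): returns zeroed memory but, unlike the
 * Linux helper, does not tie the allocation to the device lifetime,
 * so the caller has to free it explicitly.
 */
static inline void *devm_kzalloc(struct device *dev, size_t size,
                                 unsigned int gfp)
{
    (void)dev;
    (void)gfp;
    return calloc(1, size);
}
```

With a shim along these lines, fewer call sites in the ported file need
an #if 0, which bears on how much "textual compatibility" is worth
keeping. The missing device-managed release also matches the TODO in the
patch about releasing resources.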

Julien, what do you think?


> ---
>  xen/arch/arm/cpufreq/arm_scpi.c      | 90 ++++++++++++++++++++++++++++++++++++
>  xen/arch/arm/cpufreq/scpi_protocol.h | 32 +++++++++++++
>  2 files changed, 122 insertions(+)
> 
> diff --git a/xen/arch/arm/cpufreq/arm_scpi.c b/xen/arch/arm/cpufreq/arm_scpi.c
> index 7da9f1b..553a516 100644
> --- a/xen/arch/arm/cpufreq/arm_scpi.c
> +++ b/xen/arch/arm/cpufreq/arm_scpi.c
> @@ -23,8 +23,16 @@
>   *
>   * You should have received a copy of the GNU General Public License along
>   * with this program. If not, see <http://www.gnu.org/licenses/>.
> + *
> + * Based on Linux drivers/firmware/arm_scpi.c
> + * => commit 0d30176819c8738b012ec623c7b3db19df818e70
> + *
> + * Xen modification:
> + * Oleksandr Tyshchenko <Oleksandr_Tyshchenko@epam.com>
> + * Copyright (C) 2017 EPAM Systems Inc.
>   */
>  
> +#if 0
>  #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
>  
>  #include <linux/bitmap.h>
> @@ -44,6 +52,22 @@
>  #include <linux/slab.h>
>  #include <linux/sort.h>
>  #include <linux/spinlock.h>
> +#endif
> +
> +#include <xen/device_tree.h>
> +#include <xen/err.h>
> +#include <xen/vmap.h>
> +#include <xen/sort.h>
> +
> +#include "scpi_protocol.h"
> +#include "mailbox_client.h"
> +#include "mailbox_controller.h"
> +#include "wrappers.h"
> +
> +/*
> + * TODO:
> + * 1. Add releasing resources since devm.
> + */
>  
>  #define CMD_ID_SHIFT		0
>  #define CMD_ID_MASK		0x7f
> @@ -859,6 +883,7 @@ static int scpi_init_versions(struct scpi_drvinfo *info)
>  	return ret;
>  }
>  
> +#if 0
>  static ssize_t protocol_version_show(struct device *dev,
>  				     struct device_attribute *attr, char *buf)
>  {
> @@ -888,6 +913,7 @@ static struct attribute *versions_attrs[] = {
>  	NULL,
>  };
>  ATTRIBUTE_GROUPS(versions);
> +#endif
>  
>  static void
>  scpi_free_channels(struct device *dev, struct scpi_chan *pchan, int count)
> @@ -909,8 +935,10 @@ static int scpi_remove(struct platform_device *pdev)
>  
>  	scpi_info = NULL; /* stop exporting SCPI ops through get_scpi_ops */
>  
> +#if 0
>  	of_platform_depopulate(dev);
>  	sysfs_remove_groups(&dev->kobj, versions_groups);
> +#endif
>  	scpi_free_channels(dev, info->channels, info->num_chans);
>  	platform_set_drvdata(pdev, NULL);
>  
> @@ -1055,11 +1083,15 @@ err:
>  		  FW_REV_PATCH(scpi_info->firmware_version));
>  	scpi_info->scpi_ops = &scpi_ops;
>  
> +#if 0
>  	ret = sysfs_create_groups(&dev->kobj, versions_groups);
>  	if (ret)
>  		dev_err(dev, "unable to create sysfs version group\n");
>  
>  	return of_platform_populate(dev->of_node, NULL, NULL, dev);
> +#else
> +	return 0;
> +#endif
>  }
>  
>  static const struct of_device_id scpi_of_match[] = {
> @@ -1070,6 +1102,7 @@ static const struct of_device_id scpi_of_match[] = {
>  
>  MODULE_DEVICE_TABLE(of, scpi_of_match);
>  
> +#if 0
>  static struct platform_driver scpi_driver = {
>  	.driver = {
>  		.name = "scpi_protocol",
> @@ -1083,3 +1116,60 @@ module_platform_driver(scpi_driver);
>  MODULE_AUTHOR("Sudeep Holla <sudeep.holla@arm.com>");
>  MODULE_DESCRIPTION("ARM SCPI mailbox protocol driver");
>  MODULE_LICENSE("GPL v2");
> +#endif
> +
> +static struct device *scpi_dev;
> +
> +struct device *get_scpi_dev(void)
> +{
> +	return scpi_dev;
> +}
> +
> +int __init scpi_init(void)
> +{
> +	struct dt_device_node *scpi, *mbox;
> +	bool has_mbox = false;
> +	int ret = -ENODEV;
> +
> +	scpi = dt_find_matching_node(NULL, scpi_of_match);
> +	if (!scpi) {
> +		printk("failed to find SCPI node in the device tree\n");
> +		return -ENXIO;
> +	}
> +
> +	/* At first find and initialize mailbox to communicate with SCP */
> +	dt_for_each_device_node(dt_host, mbox) {
> +		ret = device_init(mbox, DEVICE_MAILBOX, NULL);
> +		if (!ret) {
> +			has_mbox = true;
> +			break;
> +		}
> +	}
> +
> +	if (!has_mbox) {
> +		dev_err(&scpi->dev, "failed to init Mailbox interface (%d)\n", ret);
> +		return ret;
> +	}
> +
> +	ret = scpi_probe(scpi);
> +	if (ret) {
> +		/* TODO Do we need to deinit mailbox? */
> +		dev_err(&scpi->dev, "failed to init SCPI Message Protocol (%d)\n", ret);
> +		return ret;
> +	}
> +
> +	scpi_dev = &scpi->dev;
> +
> +	/* TODO Do we need to mark device as used by Xen? */
> +
> +	return 0;
> +}
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 8
> + * indent-tabs-mode: t
> + * End:
> + */
> diff --git a/xen/arch/arm/cpufreq/scpi_protocol.h b/xen/arch/arm/cpufreq/scpi_protocol.h
> index 327d656..0f6dab3 100644
> --- a/xen/arch/arm/cpufreq/scpi_protocol.h
> +++ b/xen/arch/arm/cpufreq/scpi_protocol.h
> @@ -14,8 +14,25 @@
>   *
>   * You should have received a copy of the GNU General Public License along with
>   * this program. If not, see <http://www.gnu.org/licenses/>.
> + *
> + * Based on Linux include/linux/scpi_protocol.h
> + * => commit 45ca7df7c345465dbd2426a33012c9c33d27de62
> + *
> + * Xen modification:
> + * Oleksandr Tyshchenko <Oleksandr_Tyshchenko@epam.com>
> + * Copyright (C) 2017 EPAM Systems Inc.
>   */
> +
> +#ifndef __ARCH_ARM_CPUFREQ_SCPI_PROTOCOL_H__
> +#define __ARCH_ARM_CPUFREQ_SCPI_PROTOCOL_H__
> +
> +#if 0
>  #include <linux/types.h>
> +#endif
> +
> +#include <asm/device.h>
> +
> +#define IS_REACHABLE(CONFIG_ARM_SCPI_PROTOCOL) 1
>  
>  struct scpi_opp {
>  	u32 freq;
> @@ -78,7 +95,22 @@ struct scpi_ops {
>  };
>  
>  #if IS_REACHABLE(CONFIG_ARM_SCPI_PROTOCOL)
> +int scpi_init(void);
> +struct device *get_scpi_dev(void);
>  struct scpi_ops *get_scpi_ops(void);
>  #else
> +static inline int scpi_init(void) { return -1; }
> +static inline struct device *get_scpi_dev(void) { return NULL; }
>  static inline struct scpi_ops *get_scpi_ops(void) { return NULL; }
>  #endif
> +
> +#endif /* __ARCH_ARM_CPUFREQ_SCPI_PROTOCOL_H__ */
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 8
> + * indent-tabs-mode: t
> + * End:
> + */
> -- 
> 2.7.4
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [RFC PATCH 27/31] cpufreq: hack: perf->states isn't a real guest handle on ARM
  2017-11-09 17:10 ` [RFC PATCH 27/31] cpufreq: hack: perf->states isn't a real guest handle on ARM Oleksandr Tyshchenko
@ 2017-12-05 21:34   ` Stefano Stabellini
  0 siblings, 0 replies; 108+ messages in thread
From: Stefano Stabellini @ 2017-12-05 21:34 UTC (permalink / raw)
  To: Oleksandr Tyshchenko
  Cc: Stefano Stabellini, Andrew Cooper, Julien Grall,
	Oleksandr Tyshchenko, Jan Beulich, xen-devel

On Thu, 9 Nov 2017, Oleksandr Tyshchenko wrote:
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> 
> This patch is just a temp solution to highlight a problem which
> should be resolved in a proper way.
> 
> set_px_pminfo() is intended to be called from platform hypercall
> where "perf" argument was entirely filled in by hwdom.
> 
> But unlike x86 we don't get this info from hwdom on ARM,
> we get it from other sources (device tree + firmware). In order to
> retain function interface, we emulate receiving hypercall and
> pass argument which function expects to see. Although "perf->states"
> looks like a guest handle it is not a real handle and we can't use
> copy_from_guest() over it. As only scpi-cpufreq sets XEN_PX_DATA flag
> use it as an indicator to do memcpy.
> 
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> CC: Jan Beulich <jbeulich@suse.com>
> CC: Andrew Cooper <andrew.cooper3@citrix.com>
> CC: Stefano Stabellini <sstabellini@kernel.org>
> CC: Julien Grall <julien.grall@linaro.org>

As a reference, this patch has been discussed here:

https://marc.info/?l=xen-devel&m=151250698607186



> ---
>  xen/drivers/cpufreq/cpufreq.c | 19 +++++++++++++++----
>  1 file changed, 15 insertions(+), 4 deletions(-)
> 
> diff --git a/xen/drivers/cpufreq/cpufreq.c b/xen/drivers/cpufreq/cpufreq.c
> index 64e1ae7..1022cd1 100644
> --- a/xen/drivers/cpufreq/cpufreq.c
> +++ b/xen/drivers/cpufreq/cpufreq.c
> @@ -558,11 +558,22 @@ int set_px_pminfo(uint32_t acpi_id, struct xen_processor_performance *dom0_px_in
>              ret = -ENOMEM;
>              goto out;
>          }
> -        if ( copy_from_guest(pxpt->states, dom0_px_info->states,
> -                             dom0_px_info->state_count) )
> +
> +        if ( dom0_px_info->flags == XEN_PX_DATA )
>          {
> -            ret = -EFAULT;
> -            goto out;
> +            struct xen_processor_px *states = (dom0_px_info->states).p;
> +
> +            memcpy(pxpt->states, states,
> +                   dom0_px_info->state_count * sizeof(struct xen_processor_px));
> +        }
> +        else
> +        {
> +            if ( copy_from_guest(pxpt->states, dom0_px_info->states,
> +                                 dom0_px_info->state_count) )
> +            {
> +                ret = -EFAULT;
> +                goto out;
> +            }
>          }
>          pxpt->state_count = dom0_px_info->state_count;
>  
> -- 
> 2.7.4
> 


* Re: [RFC PATCH 22/31] xen/arm: Add Xen changes to SCPI protocol
  2017-12-05 21:20   ` Stefano Stabellini
@ 2017-12-05 21:41     ` Julien Grall
  2017-12-06 10:08       ` Oleksandr Tyshchenko
  0 siblings, 1 reply; 108+ messages in thread
From: Julien Grall @ 2017-12-05 21:41 UTC (permalink / raw)
  To: Stefano Stabellini, Oleksandr Tyshchenko; +Cc: xen-devel, Oleksandr Tyshchenko



On 05/12/2017 21:20, Stefano Stabellini wrote:
> On Thu, 9 Nov 2017, Oleksandr Tyshchenko wrote:
>> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>>
>> Modify the direct ported SCPI Message Protocol driver to be
>> functional inside Xen.
>>
>> As SCPI Message protocol driver expects mailbox to be registered,
>> find and initialize mailbox before probing it.
>>
>> Include "wrappers.h" which contains all required things the direct
>> ported code relies on.
>>
>> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>> CC: Stefano Stabellini <sstabellini@kernel.org>
>> CC: Julien Grall <julien.grall@linaro.org>
> 
> As far as drivers ported from Linux go, this looks pretty clean in terms
> of changes and nasty glue code required to get it to work.
> 
> The wrappers.h header is not too bad. The question remains on whether we
> should keep the #if 0 to retain "textual compatibility" with Linux, or
> we should just bite the bullet and apply the changes. If we commit them
> as a separate patch, we can always dig out the difference between the
> original driver and the Xen version using git.
> 
> Julien, what do you think?

When I see the diff of that series:

  50 files changed, 4822 insertions(+), 862 deletions(-)

this is a rather huge series for benefits that we still don't know (e.g.
we don't have any numbers). Based on the current discussion, it looks
like the design will change quite a lot. So, in all honesty, I haven't
spent and will not spend much time looking at the code itself until we
get an agreement on the benefits.

However, I had a brief look at the code and raised an eyebrow quite a
few times at the glue code. I saw that a mutex was converted to a
spinlock without any justification (see patch #20).

Anyway, Oleksandr promised to come back with numbers and to look into
the points raised in the discussion. We should probably wait for that
before looking at this series in more detail.

Cheers,

-- 
Julien Grall


* Re: [RFC PATCH 29/31] xen/arm: Introduce CPUFreq Interface component
  2017-11-09 17:10 ` [RFC PATCH 29/31] xen/arm: Introduce CPUFreq Interface component Oleksandr Tyshchenko
@ 2017-12-05 22:25   ` Stefano Stabellini
  2017-12-06 10:54     ` Oleksandr Tyshchenko
  0 siblings, 1 reply; 108+ messages in thread
From: Stefano Stabellini @ 2017-12-05 22:25 UTC (permalink / raw)
  To: Oleksandr Tyshchenko
  Cc: xen-devel, Stefano Stabellini, Julien Grall, Oleksandr Tyshchenko

On Thu, 9 Nov 2017, Oleksandr Tyshchenko wrote:
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> 
> This patch adds an interface component which performs the following steps:
> 1. Initialize everything the SCPI based CPUFreq driver needs to be functional
>    (SCPI Message protocol, mailbox to communicate with SCP, etc).
>    Also do a preliminary check that the SCPI DVFS clock nodes offered by SCP
>    are present in the device tree.
> 2. Register SCPI based CPUFreq driver.
> 3. Populate CPUs. Get DVFS info (OPP list and the latency information)
>    for all DVFS capable CPUs using the SCPI protocol, convert these
>    capabilities into the PM data the CPUFreq framework expects to see,
>    and upload it.
> 
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> CC: Stefano Stabellini <sstabellini@kernel.org>
> CC: Julien Grall <julien.grall@linaro.org>
> ---
>  xen/arch/arm/cpufreq/cpufreq_if.c | 522 ++++++++++++++++++++++++++++++++++++++
>  1 file changed, 522 insertions(+)
>  create mode 100644 xen/arch/arm/cpufreq/cpufreq_if.c
> 
> diff --git a/xen/arch/arm/cpufreq/cpufreq_if.c b/xen/arch/arm/cpufreq/cpufreq_if.c
> new file mode 100644
> index 0000000..2451d00
> --- /dev/null
> +++ b/xen/arch/arm/cpufreq/cpufreq_if.c
> @@ -0,0 +1,522 @@
> +/*
> + * xen/arch/arm/cpufreq/cpufreq_if.c
> + *
> + * CPUFreq interface component
> + *
> + * Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> + * Copyright (c) 2017 EPAM Systems.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include <xen/device_tree.h>
> +#include <xen/err.h>
> +#include <xen/sched.h>
> +#include <xen/cpufreq.h>
> +#include <xen/pmstat.h>
> +#include <xen/guest_access.h>
> +
> +#include "scpi_protocol.h"
> +
> +/*
> + * TODO:
> + * 1. Add __init to required funcs
> + * 2. Put get_cpu_device() into common place
> + */
> +
> +static struct scpi_ops *scpi_ops;
> +
> +extern int scpi_cpufreq_register_driver(void);
> +
> +#define dev_name(dev) dt_node_full_name(dev_to_dt(dev))
> +
> +struct device *get_cpu_device(unsigned int cpu)
> +{
> +    if ( cpu < nr_cpu_ids && cpu_possible(cpu) )
> +        return dt_to_dev(cpu_dt_nodes[cpu]);
> +    else
> +        return NULL;
> +}
> +
> +static bool is_dvfs_capable(unsigned int cpu)
> +{
> +    static const struct dt_device_match scpi_dvfs_clock_match[] =
> +    {
> +        DT_MATCH_COMPATIBLE("arm,scpi-dvfs-clocks"),
> +        { /* sentinel */ },
> +    };
> +    struct device *cpu_dev;
> +    struct dt_phandle_args clock_spec;
> +    struct scpi_dvfs_info *info;
> +    u32 domain;
> +    int i, ret, count;
> +
> +    cpu_dev = get_cpu_device(cpu);
> +    if ( !cpu_dev )
> +    {
> +        printk("cpu%d: failed to get device\n", cpu);
> +        return false;
> +    }
> +
> +    /* First of all find a clock node this CPU is a consumer of */
> +    ret = dt_parse_phandle_with_args(cpu_dev->of_node,
> +                                     "clocks",
> +                                     "#clock-cells",
> +                                     0,
> +                                     &clock_spec);
> +    if ( ret )
> +    {
> +        printk("cpu%d: failed to get clock node\n", cpu);
> +        return false;
> +    }
> +
> +    /* Make sure it is an available DVFS clock node */
> +    if ( !dt_match_node(scpi_dvfs_clock_match, clock_spec.np) ||
> +         !dt_device_is_available(clock_spec.np) )
> +    {
> +        printk("cpu%d: clock node '%s' is either non-DVFS or non-available\n",
> +               cpu, dev_name(&clock_spec.np->dev));
> +        return false;
> +    }
> +
> +    /*
> +     * Actually we already have a power domain id this CPU belongs to,
> +     * it is a stored in args[0] CPU clock specifier, so we could ask SCP
> +     * to provide its DVFS info. But we want to dig a little bit deeper
> +     * to make sure that everything is correct.
> +     */
> +
> +    /* Check how many clock ids a DVFS clock node has */
> +    ret = dt_property_count_elems_of_size(clock_spec.np,
> +                                          "clock-indices",
> +                                          sizeof(u32));
> +    if ( ret < 0 )
> +    {
> +        printk("cpu%d: failed to get clock-indices count in '%s'\n",
> +               cpu, dev_name(&clock_spec.np->dev));
> +        return false;
> +    }
> +    count = ret;
> +
> +    /* Check if a clock id the CPU clock specifier points to is present */
> +    for ( i = 0; i < count; i++ )
> +    {
> +        ret = dt_property_read_u32_index(clock_spec.np,
> +                                         "clock-indices",
> +                                         i,
> +                                         &domain);
> +        if ( ret )
> +        {
> +            printk("cpu%d: failed to get clock index in '%s'\n",
> +                   cpu, dev_name(&clock_spec.np->dev));
> +            return false;
> +        }
> +
> +        /* Match found */
> +        if ( clock_spec.args[0] == domain )
> +            break;
> +    }
> +
> +    if ( i == count )
> +    {
> +        printk("cpu%d: failed to find matching clk_id (pd) %d\n",
> +               cpu, clock_spec.args[0]);
> +        return false;
> +    }
> +
> +    /*
> +     * Check if a SCP is aware of this power domain. SCPI Message protocol
> +     * driver will populate power domain's DVFS info then.
> +     */
> +    info = scpi_ops->dvfs_get_info(domain);
> +    if ( IS_ERR(info) )
> +    {
> +        printk("cpu%d: failed to get DVFS info of pd%u\n", cpu, domain);
> +        return false;
> +    }
> +
> +    printk(XENLOG_DEBUG "cpu%d: is DVFS capable, belongs to pd%u\n",
> +           cpu, domain);
> +
> +    return true;
> +}
> +
> +static int get_sharing_cpus(unsigned int cpu, cpumask_t *mask)
> +{
> +    struct device *cpu_dev = get_cpu_device(cpu), *tcpu_dev;
> +    unsigned int tcpu;
> +    int domain, tdomain;
> +
> +    BUG_ON(!cpu_dev);
> +
> +    domain = scpi_ops->device_domain_id(cpu_dev);
> +    if ( domain < 0 )
> +        return domain;
> +
> +    cpumask_clear(mask);
> +    cpumask_set_cpu(cpu, mask);
> +
> +    for_each_online_cpu( tcpu )
> +    {
> +        if ( tcpu == cpu )
> +            continue;
> +
> +        tcpu_dev = get_cpu_device(tcpu);
> +        if ( !tcpu_dev )
> +            continue;
> +
> +        tdomain = scpi_ops->device_domain_id(tcpu_dev);
> +        if ( tdomain == domain )
> +            cpumask_set_cpu(tcpu, mask);
> +    }
> +
> +    return 0;
> +}
> +
> +static int get_transition_latency(struct device *cpu_dev)
> +{
> +    return scpi_ops->get_transition_latency(cpu_dev);
> +}
> +
> +static struct scpi_dvfs_info *get_dvfs_info(struct device *cpu_dev)
> +{
> +    int domain;
> +
> +    domain = scpi_ops->device_domain_id(cpu_dev);
> +    if ( domain < 0 )
> +        return ERR_PTR(-EINVAL);
> +
> +    return scpi_ops->dvfs_get_info(domain);
> +}
> +
> +static int init_cpufreq_table(unsigned int cpu,
> +                              struct cpufreq_frequency_table **table)
> +{
> +    struct cpufreq_frequency_table *freq_table = NULL;
> +    struct device *cpu_dev = get_cpu_device(cpu);
> +    struct scpi_dvfs_info *info;
> +    struct scpi_opp *opp;
> +    int i;
> +
> +    BUG_ON(!cpu_dev);
> +
> +    info = get_dvfs_info(cpu_dev);
> +    if ( IS_ERR(info) )
> +        return PTR_ERR(info);
> +
> +    if ( !info->opps )
> +        return -EIO;
> +
> +    freq_table = xzalloc_array(struct cpufreq_frequency_table, info->count + 1);
> +    if ( !freq_table )
> +        return -ENOMEM;
> +
> +    for ( opp = info->opps, i = 0; i < info->count; i++, opp++ )
> +    {
> +        freq_table[i].index = i;
> +        /* Convert Hz -> kHz */
> +        freq_table[i].frequency = opp->freq / 1000;
> +    }
> +
> +    freq_table[i].index = i;
> +    freq_table[i].frequency = CPUFREQ_TABLE_END;
> +
> +    *table = &freq_table[0];
> +
> +    return 0;
> +}
> +
> +static void free_cpufreq_table(struct cpufreq_frequency_table **table)
> +{
> +    if ( !table )
> +        return;
> +
> +    xfree(*table);
> +    *table = NULL;
> +}
> +
> +static int upload_cpufreq_data(cpumask_t *mask,
> +                               struct cpufreq_frequency_table *table)
> +{
> +    struct xen_processor_performance *perf;
> +    struct xen_processor_px *states;
> +    uint32_t platform_limit = 0, state_count = 0;
> +    unsigned int max_freq = 0, prev_freq = 0, cpu = cpumask_first(mask);
> +    int i, latency, ret = 0;
> +
> +    perf = xzalloc(struct xen_processor_performance);
> +    if ( !perf )
> +        return -ENOMEM;
> +
> +    /* Check frequency table and find max frequency */
> +    for ( i = 0; (table[i].frequency != CPUFREQ_TABLE_END); i++ )
> +    {
> +        unsigned int freq = table[i].frequency;
> +
> +        if ( freq == CPUFREQ_ENTRY_INVALID )
> +            continue;
> +
> +        if ( table[i].index != state_count || freq <= prev_freq )
> +        {
> +            printk("cpu%d: frequency table format error\n", cpu);
> +            ret = -EINVAL;
> +            goto out;
> +        }
> +
> +        prev_freq = freq;
> +        state_count++;
> +        if ( freq > max_freq )
> +            max_freq = freq;
> +    }
> +
> +    /*
> +     * The frequency table we have is just a temporary place for storing
> +     * provided by SCP DVFS info. Create performance states array.
> +     */
> +    if ( !state_count )
> +    {
> +        printk("cpu%d: no available performance states\n", cpu);
> +        ret = -EINVAL;
> +        goto out;
> +    }
> +
> +    states = xzalloc_array(struct xen_processor_px, state_count);
> +    if ( !states )
> +    {
> +        ret = -ENOMEM;
> +        goto out;
> +    }
> +
> +    set_xen_guest_handle(perf->states, states);

this is the bit that should go away


> +    perf->state_count = state_count;
> +
> +    latency = get_transition_latency(get_cpu_device(cpu));
> +
> +    /* Performance states must start from higher values */
> +    for ( i = 0; (table[i].frequency != CPUFREQ_TABLE_END); i++ )
> +    {
> +        unsigned int freq = table[i].frequency;
> +        unsigned int index = state_count - 1 - table[i].index;
> +
> +        if ( freq == CPUFREQ_ENTRY_INVALID )
> +            continue;
> +
> +        if ( freq == max_freq )
> +            platform_limit = index;
> +
> +        /* Convert kHz -> MHz */
> +        states[index].core_frequency = freq / 1000;
> +        /* Convert ns -> us */
> +        states[index].transition_latency = DIV_ROUND_UP(latency, 1000);

Why are we using DIV_ROUND_UP here and not in all the other frequency
conversions?
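
One possible reading, which is an assumption and not something the patch
states: truncation is harmless for a frequency, whereas a latency should
never be understated, and a sub-microsecond value must not collapse to 0.
A minimal standalone sketch of the two behaviours is below.
hz_to_khz_trunc() and ns_to_us_round_up() are hypothetical helper names,
and DIV_ROUND_UP is redefined locally with its conventional definition so
the snippet builds outside Xen.

```c
#include <assert.h>
#include <stdint.h>

/* Conventional definition, repeated here so the sketch stands alone. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Hz -> kHz with plain integer division: the remainder is dropped. */
static uint32_t hz_to_khz_trunc(uint32_t hz)
{
    return hz / 1000;
}

/* ns -> us rounding up: any non-zero latency yields at least 1 us. */
static uint32_t ns_to_us_round_up(uint32_t ns)
{
    /* Guard against wrap-around in the (n + d - 1) step. */
    assert(ns <= UINT32_MAX - 999);
    return DIV_ROUND_UP(ns, 1000);
}
```

If consistency matters more than the rounding direction, either all
conversions could use the same helper, or the reason for the difference
could be spelled out in a comment next to the latency conversion.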


> +    }
> +
> +    perf->flags = XEN_PX_DATA; /* all info in a one-shot */

Please use existing flags
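
A rough sketch of what relying on the existing flags could look like.
The four XEN_PX_* values below are quoted from memory of
xen/include/public/platform.h and should be checked against the tree;
XEN_PX_ALL and px_flags_complete() are hypothetical names introduced only
for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed to mirror the public platform.h flag bits; verify in the tree. */
#define XEN_PX_PCT 1
#define XEN_PX_PSS 2
#define XEN_PX_PPC 4
#define XEN_PX_PSD 8

/* Hypothetical convenience mask: every piece of Px info supplied at once. */
#define XEN_PX_ALL (XEN_PX_PCT | XEN_PX_PSS | XEN_PX_PPC | XEN_PX_PSD)

/* Hypothetical helper: true once all Px info bits are present. */
static bool px_flags_complete(uint32_t flags)
{
    /* Reject bits outside the known set instead of silently ignoring them. */
    assert((flags & ~(uint32_t)XEN_PX_ALL) == 0);
    return (flags & XEN_PX_ALL) == XEN_PX_ALL;
}
```

With something along these lines the ARM caller would set perf->flags to
the bits describing what it actually filled in, and a one-shot upload is
simply the case where all bits are set. Note that the memcpy() path in
patch 27 keys off flags == XEN_PX_DATA, so it would need a different
indicator.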


> +    perf->platform_limit = platform_limit;
> +    perf->shared_type = CPUFREQ_SHARED_TYPE_ANY;
> +    perf->domain_info.domain = cpumask_first(mask);
> +    perf->domain_info.num_processors = cpumask_weight(mask);
> +
> +    /* Iterate through all CPUs which are on the same boat */
> +    for_each_cpu( cpu, mask )
> +    {
> +        ret = set_px_pminfo(cpu, perf);
> +        if ( ret )
> +        {
> +            printk("cpu%d: failed to set Px states (%d)\n", cpu, ret);
> +            break;
> +        }
> +
> +        printk(XENLOG_DEBUG "cpu%d: set Px states\n", cpu);
> +    }
> +
> +    xfree(states);
> +out:
> +    xfree(perf);
> +
> +    return ret;
> +}
> +
> +static int __init scpi_cpufreq_postinit(void)
> +{
> +    struct cpufreq_frequency_table *freq_table = NULL;
> +    cpumask_t processed_cpus, shared_cpus;
> +    unsigned int cpu;
> +    int ret = -ENODEV;
> +
> +    cpumask_clear(&processed_cpus);
> +
> +    for_each_online_cpu( cpu )
> +    {
> +        if ( cpumask_test_cpu(cpu, &processed_cpus) )
> +            continue;
> +
> +        if ( !is_dvfs_capable(cpu) )
> +            continue;
> +
> +        ret = get_sharing_cpus(cpu, &shared_cpus);
> +        if ( ret )
> +        {
> +            printk("cpu%d: failed to get sharing cpumask (%d)\n", cpu, ret);
> +            return ret;
> +        }
> +
> +        BUG_ON(cpumask_empty(&shared_cpus));
> +        cpumask_or(&processed_cpus, &processed_cpus, &shared_cpus);
> +
> +        /* Create intermediate frequency table */
> +        ret = init_cpufreq_table(cpu, &freq_table);
> +        if ( ret )
> +        {
> +            printk("cpu%d: failed to initialize frequency table (%d)\n",
> +                   cpu, ret);
> +            return ret;
> +        }
> +
> +        ret = upload_cpufreq_data(&shared_cpus, freq_table);
> +        /* Destroy intermediate frequency table */
> +        free_cpufreq_table(&freq_table);
> +        if ( ret )
> +        {
> +            printk("cpu%d: failed to upload cpufreq data (%d)\n", cpu, ret);
> +            return ret;
> +        }
> +
> +        printk(XENLOG_DEBUG "cpu%d: uploaded cpufreq data\n", cpu);
> +    }
> +
> +    return ret;
> +}
> +
> +static int __init scpi_cpufreq_preinit(void)
> +{
> +    struct dt_device_node *scpi, *clk, *dvfs_clk;
> +    int ret;
> +
> +    /* Initialize SCPI Message protocol */
> +    ret = scpi_init();
> +    if ( ret )
> +    {
> +        printk("failed to initialize SCPI (%d)\n", ret);
> +        return ret;
> +    }
> +
> +    /* Sanity check */
> +    if ( !get_scpi_ops() || !get_scpi_dev() )
> +        return -ENXIO;
> +
> +    scpi = get_scpi_dev()->of_node;
> +    scpi_ops = get_scpi_ops();
> +
> +    ret = -ENODEV;
> +
> +    /*
> +     * Check for clock related nodes for now. But it might additional nodes,
> +     * like thermal sensor, etc.
> +     */
> +    dt_for_each_child_node( scpi, clk )

Wouldn't it make sense to have a proper:

DT_DEVICE_START
...
DT_DEVICE_END

block and register the driver that way?


> +    {
> +        /*
> +         * First of all there must be a container node which contains all
> +         * clocks provided by SCP.
> +         */
> +        if ( !dt_device_is_compatible(clk, "arm,scpi-clocks") )
> +            continue;
> +
> +        /*
> +         * As we are interested in DVFS feature only, check for DVFS clock
> +         * sub-node. At the current stage check for it presence only.
> +         * Without it there is no point to register SCPI based CPUFreq. We will
> +         * perform a thorough check later when populating DVFS clock consumers.
> +         */
> +        dt_for_each_child_node( clk, dvfs_clk )
> +        {
> +            if ( !dt_device_is_compatible(dvfs_clk, "arm,scpi-dvfs-clocks") )
> +                continue;
> +
> +            return 0;
> +        }
> +
> +        break;
> +    }
> +
> +    printk("failed to find SCPI DVFS clocks (%d)\n", ret);
> +
> +    return ret;
> +}
> +
> +/* TODO Implement me */

:-)


> +static void scpi_cpufreq_deinit(void)
> +{
> +
> +}
> +
> +static int __init cpufreq_driver_init(void)
> +{
> +    int ret;
> +
> +    if ( cpufreq_controller != FREQCTL_xen )
> +        return 0;
> +
> +    /*
> +     * Initialize everything needed SCPI based CPUFreq driver to be functional
> +     * (SCPI Message protocol, mailbox to communicate with SCP, etc).
> +     * Also preliminary check if SCPI DVFS clock nodes offered by SCP are
> +     * present in a device tree.
> +     */
> +    ret = scpi_cpufreq_preinit();
> +    if ( ret )
> +        goto out;
> +
> +    /* Register SCPI based CPUFreq driver */
> +    ret = scpi_cpufreq_register_driver();
> +    if ( ret )
> +        goto out;
> +
> +    /*
> +     * Populate CPUs. Get DVFS info (OPP list and the latency information)
> +     * for all DVFS capable CPUs using SCPI protocol, convert these capabilities
> +     * into PM data the CPUFreq framework expects to see followed by
> +     * uploading it.
> +     *
> +     * Actually it is almost the same PM data which hwdom uploads in case of
> +     * x86 via platform hypercall after parsing ACPI tables. In our case we
> +     * don't need hwdom to be involved in, since we already have everything in
> +     * hand. Moreover, the hwdom doesn't even know anything about physical CPUs.
> +     * Not completely sure that it is the best place to do so, but certainly
> +     * it must be after driver registration.
> +     */
> +    ret = scpi_cpufreq_postinit();
> +
> +out:
> +    if ( ret )
> +    {
> +        printk("failed to initialize SCPI based CPUFreq (%d)\n", ret);
> +        scpi_cpufreq_deinit();
> +        return ret;
> +    }
> +
> +    printk("initialized SCPI based CPUFreq\n");
> +
> +    return 0;
> +}
> +__initcall(cpufreq_driver_init);
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * tab-width: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
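
For reference, the index reversal done in upload_cpufreq_data() above
can be modelled in isolation: the intermediate table is ascending in kHz,
while the Px array has to start from the highest frequency and is in MHz.
This is only a sketch under simplifying assumptions: struct px_state is a
hypothetical stand-in for struct xen_processor_px, khz_table_to_px() is a
hypothetical helper, and the table is assumed to contain no
CPUFREQ_ENTRY_INVALID entries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for xen_processor_px; only the field used here. */
struct px_state {
    uint64_t core_frequency; /* MHz */
};

/*
 * Fill px[0..n-1] in descending order from khz[0..n-1], which must be
 * strictly ascending (the same property the patch checks with
 * "freq <= prev_freq").
 */
static void khz_table_to_px(const uint32_t *khz, size_t n,
                            struct px_state *px)
{
    for (size_t i = 0; i < n; i++) {
        assert(i == 0 || khz[i] > khz[i - 1]);
        /* Entry i of the ascending table becomes state n - 1 - i. */
        px[n - 1 - i].core_frequency = khz[i] / 1000; /* kHz -> MHz */
    }
}
```

Under these assumptions the maximum frequency always ends up at index 0,
which is also the value the patch computes for platform_limit.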


* Re: [RFC PATCH 00/31] CPUFreq on ARM
  2017-11-09 17:09 [RFC PATCH 00/31] CPUFreq on ARM Oleksandr Tyshchenko
                   ` (32 preceding siblings ...)
  2017-11-13 15:21 ` Andre Przywara
@ 2017-12-05 22:26 ` Stefano Stabellini
  2017-12-06 10:10   ` Oleksandr Tyshchenko
  33 siblings, 1 reply; 108+ messages in thread
From: Stefano Stabellini @ 2017-12-05 22:26 UTC (permalink / raw)
  To: Oleksandr Tyshchenko
  Cc: Edgar E . Iglesias, Stefano Stabellini, Jassi Brar,
	Andrew Cooper, Julien Grall, Andre Przywara,
	Oleksandr Tyshchenko, Jan Beulich, Sudeep Holla, xen-devel

[-- Attachment #1: Type: TEXT/PLAIN, Size: 16357 bytes --]

Hi Oleksandr,

I just wanted to tell you that the patch series is very well organized
and the patches are very nicely split.

Thank you!

- Stefano


On Thu, 9 Nov 2017, Oleksandr Tyshchenko wrote:
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> 
> Hi, all.
> 
> The purpose of this RFC patch series is to add CPUFreq support to Xen on ARM.
> Motivation of hypervisor based CPUFreq is to enable one of the main PM use-cases in virtualized system powered by Xen hypervisor. Rationale behind this activity is that CPU virtualization is done by hypervisor and the guest OS doesn't actually know anything about physical CPUs because it is running on virtual CPUs. It is quite clear that a decision about frequency change should be taken by hypervisor as only it has information about actual CPU load. Although these required components (CPUFreq core, governors, etc) already exist in Xen, it is worth to mention that they are ACPI specific. So, a part of the current patch series makes them more generic in order to make possible a CPUFreq usage on architectures without ACPI support in.
> But, the main question we have to answer is about frequency changing interface in virtualized system. The frequency changing interface and all dependent components which needed CPUFreq to be functional on ARM are not present in Xen these days. The list of required components is quite big and may change across different ARM SoC vendors. As an example, the following components are involved in DVFS on Renesas Salvator-X board which has R-Car Gen3 SoC installed: generic clock, regulator and thermal frameworks, Vendor’s CPG, PMIC, AVS, THS drivers, i2c support, etc.
> 
> We were considering a few possible approaches of hypervisor based CPUFreqs on ARM and came to conclusion to base this solution on popular at the moment, already upstreamed to Linux, ARM System Control and Power Interface(SCPI) protocol [1]. We chose SCPI protocol instead of newer ARM System Control and Management Interface (SCMI) protocol [2] since it is widely spread in Linux, there are good examples how to use it, the range of capabilities it has is enough for implementing hypervisor based CPUFreq and, what is more, upstream Linux support for SCMI is missed so far, but SCMI could be used as well.
> 
> Briefly speaking, the SCPI protocol is used between the System Control Processor(SCP) and the Application Processors(AP). The mailbox feature provides a mechanism for inter-processor communication between SCP and AP. The main purpose of SCP is to offload different PM related tasks from AP and one of the services that SCP provides is Dynamic voltage and frequency scaling (DVFS), it is what we actually need for CPUFreq. I will describe this approach in details down the text.
> 
> Let me explain a bit more what these possible approaches are:
> 
> 1. “Xen+hwdom” solution.
> GlobalLogic team proposed split model [3], where “hwdom-cpufreq” frontend driver in Xen interacts with the “xen-cpufreq” backend driver in Linux hwdom (possibly dom0) in order to scale physical CPUs. This solution hasn’t been accepted by Xen community yet and seems it is not going to be accepted without taking into the account still unanswered major questions and proving that “all-in-Xen” solution, which Xen community considered as more architecturally cleaner option, would be unworkable in practice.
> The other reasons why we decided not to stick to this approach are complex communication interface between Xen and hwdom: event channel, hypercalls, syscalls, passing CPU info via DT, etc and possible synchronization issues with a proposed solution.
> Although it is worth to mention that the beauty of this approach was that there wouldn’t be a need to port a lot of things to Xen. All frequency changing interface and all dependent components which needed CPUFreq to be functional were already in place.
> Although this approach is not used, still I picked a few already acked patches which made ACPI specific CPUFreq stuff more generic.
> 
> 2. “all-in-Xen” solution.
> This implies that all CPUFreq related stuff should be located in Xen.
> Community considered this solution as more architecturally cleaner option than “Xen+hwdom” one. No layering violation comparing with the previous approach (letting guest OS manage one or more physical CPUs is more of a layering violation).
> This solution looks better, but to be honest, we are not in favor of this solution as well. We expect enormous developing effort to get this support in (the scope of required components looks unreal) and maintain it. So, we decided not to stick to this approach as well.
> 
> 3. “Xen+SCP(ARM TF)” solution.
> It is yet another solution, based on the ARM SCPI protocol. The general idea is that firmware running on a dedicated IP core acts as a server and provides various PM services (DVFS, sensors, etc). On the other side, a CPUFreq driver in Xen, running on the AP, acts as a client and consumes these services. The CPUFreq driver neither changes the CPU frequency/voltage by itself nor cooperates with Linux to do so. It just communicates with the SCP directly using the SCPI protocol. As I said before, a mailbox IP integrated into the SoC needs to be used for IPC (a doorbell for triggering an action and a shared memory region for commands). The CPUFreq driver doesn’t even need to know what has to be physically changed for the new frequency to take effect. That is certainly the SCP’s responsibility. All this keeps the CPUFreq infrastructure in Xen on ARM from diving into the internals of each supported SoC and, as a result, from carrying a lot of code.
> 
> A possible issue here is the SCP itself: a dedicated IP core may be absent altogether or may perform tasks other than PM. Fortunately, there is a neat solution: teach the firmware running at the EL3 exception level (ARM TF) to perform SCP functions and use SMC calls for communication [4]. This is exactly the transport implementation I want to bring to Xen first. Such a solution is going to be generic across all ARM platforms that have firmware running at the EL3 exception level and don’t have a candidate for being the SCP.
> 
> Here we have completely synchronous case because of SMC calls nature. SMC triggered mailbox driver emulates a mailbox which signals transmitted data via Secure Monitor Call (SMC) instruction [5]. The mailbox receiver is implemented in firmware and synchronously returns data when it returns execution to the non-secure world again. This would allow us both to trigger a request and transfer execution to the firmware code in a safe and architected way. Like PSCI requests.
> As you can see, this method is free from synchronization issues. What is more, this solution is architecturally cleaner than the split “Xen+hwdom” model. From the security point of view, I hope, everything will be much more correct, since the ARM TF, which we want to see in charge of controlling CPU frequency/voltage, is a trusted SW layer. Moreover, ARM TF is responsible for enabling/disabling CPUs (PSCI) and nobody complains about it, so let it do DVFS too.
> 
> I have to admit that I have checked this solution only due to a lack of candidate for being SCP. But, I hope, that other ARM SoCs where dedicated SCP is present (asynchronous case) will work too, but with some limitations. The mailbox IPs for these ARM SoCs must have TX/RX-done irqs. I have described in the corresponding patches why this limitation is present.
> 
> To be honest, I have Renesas R-Car Gen3 SoCs in mind as our nearest target, but I would like to make this solution as generic as possible. I don’t consider the proposed solution universally generic, but I hope it may be suitable for other ARM SoCs which meet these requirements. Anyway, having something which works but doesn’t cover all cases is better than having nothing.
> 
> I would like to note that the patches are in POC state and I post them just to illustrate in more detail what I am talking about. The patch series consists of the following parts:
> 1. GL’s patches which make ACPI specific CPUFreq stuff more generic. Although these patches have already been acked by the Xen community and the CPUFreq code base hasn’t changed in the last few years, I have dropped all A-b tags.
> 2. A bunch of device-tree helpers and macros.
> 3. Directly ported SCPI protocol, mailbox infrastructure and the ARM SMC triggered mailbox driver. All components except the mailbox driver are in mainline Linux.
> 4. Xen changes to the directly ported code to make it compilable. These changes don’t change functionality.
> 5. Some modifications to the directly ported code which slightly change functionality, or rather restrict it.
> 6. SCPI based CPUFreq driver and CPUFreq interface component.
> 7. Misc patches mostly to ARM subsystem.
> 8. Patch from Volodymyr Babchuk which adds SMC wrapper.
> 
> Most important TODOs regarding the whole patch series:
> 1. Handle devm in the directly ported code. Currently, if any error occurs, previously allocated resources are left unfreed.
> 2. Thermal management integration.
> 3. Don't pass CPUFreq related nodes to dom0. Xen owns SCPI completely.
> 4. Handle CPU_TURBO frequencies if they are supported by HW.
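
For TODO item 1, one conventional way to avoid leaking on error paths without a devm-style managed allocator is explicit goto-based unwinding. The sketch below only illustrates that pattern. It is not taken from the series, and demo_dev, demo_probe, demo_alloc, demo_free and the fail_at parameter are hypothetical names used to inject failures.

```c
#include <assert.h>
#include <stdlib.h>

static int live_allocs;   /* outstanding allocations, for demonstration only */

static void *demo_alloc(size_t n)
{
    void *p = malloc(n);

    if (p)
        live_allocs++;
    return p;
}

static void demo_free(void *p)
{
    if (p) {
        free(p);
        live_allocs--;
    }
}

struct demo_dev {
    void *channels;
    void *xfers;
};

/*
 * fail_at injects a failure: 0 = none, 1 = first allocation,
 * 2 = second allocation, 3 = a late initialisation step.
 * Every failure path releases exactly what was allocated before it.
 */
static int demo_probe(struct demo_dev *dev, int fail_at)
{
    assert(dev);

    dev->channels = (fail_at == 1) ? NULL : demo_alloc(64);
    if (!dev->channels)
        goto out;

    dev->xfers = (fail_at == 2) ? NULL : demo_alloc(128);
    if (!dev->xfers)
        goto free_channels;

    if (fail_at == 3)
        goto free_xfers;

    return 0;

free_xfers:
    demo_free(dev->xfers);
    dev->xfers = NULL;
free_channels:
    demo_free(dev->channels);
    dev->channels = NULL;
out:
    return -1;
}
```

The labels are ordered in reverse of the allocations, so a jump to any label frees everything allocated up to that point and nothing more.
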
> 
> You can find the whole patch series here:
> repo: https://github.com/otyshchenko1/xen.git branch: cpufreq-devel1
> 
> P.S. There is no need to modify the xenpm tool. It works out of the box on ARM.
> 
> [1]
> Linux code:
> http://elixir.free-electrons.com/linux/v4.14-rc6/source/drivers/firmware/arm_scpi.c
> http://elixir.free-electrons.com/linux/v4.14-rc6/source/include/linux/scpi_protocol.h
> http://elixir.free-electrons.com/linux/v4.14-rc6/source/Documentation/devicetree/bindings/arm/arm,scpi.txt
> 
> Recent protocol version:
> http://infocenter.arm.com/help/topic/com.arm.doc.dui0922g/scp_message_interface_v1_2_DUI0922G_en.pdf
> 
> [2]
> http://infocenter.arm.com/help/topic/com.arm.doc.den0056a/DEN0056A_System_Control_and_Management_Interface.pdf
> 
> [3]
> Xen part:
> https://lists.xen.org/archives/html/xen-devel/2014-11/msg00940.html
> Linux part:
> https://lists.xen.org/archives/html/xen-devel/2014-11/msg00944.html
> 
> [4]
> http://linux-sunxi.narkive.com/qYWJqjXU/patch-v2-0-3-mailbox-arm-introduce-smc-triggered-mailbox
> 
> [5]
> http://infocenter.arm.com/help/topic/com.arm.doc.den0028b/ARM_DEN0028B_SMC_Calling_Convention.pdf
> 
> Oleksandr Dmytryshyn (6):
>   cpufreq: move cpufreq.h file to the xen/include/xen location
>   pm: move processor_perf.h file to the xen/include/xen location
>   pmstat: move pmstat.c file to the xen/drivers/pm/stat.c location
>   cpufreq: make turbo settings to be configurable
>   pmstat: make pmstat functions more generalizable
>   cpufreq: make cpufreq driver more generalizable
> 
> Oleksandr Tyshchenko (24):
>   xenpm: Clarify xenpm usage
>   xen/device-tree: Add dt_count_phandle_with_args helper
>   xen/device-tree: Add dt_property_for_each_string macros
>   xen/device-tree: Add dt_property_read_u32_index helper
>   xen/device-tree: Add dt_property_count_elems_of_size helper
>   xen/device-tree: Add dt_property_read_string_helper and friends
>   xen/arm: Add driver_data field to struct device
>   xen/arm: Add DEVICE_MAILBOX device class
>   xen/arm: Store device-tree node per cpu
>   xen/arm: Add ARM System Control and Power Interface (SCPI) protocol
>   xen/arm: Add mailbox infrastructure
>   xen/arm: Introduce ARM SMC based mailbox
>   xen/arm: Add common header file wrappers.h
>   xen/arm: Add rxdone_auto flag to mbox_controller structure
>   xen/arm: Add Xen changes to SCPI protocol
>   xen/arm: Add Xen changes to mailbox infrastructure
>   xen/arm: Add Xen changes to ARM SMC based mailbox
>   xen/arm: Use non-blocking mode for SCPI protocol
>   xen/arm: Don't set txdone_poll flag for ARM SMC mailbox
>   cpufreq: hack: perf->states isn't a real guest handle on ARM
>   xen/arm: Introduce SCPI based CPUFreq driver
>   xen/arm: Introduce CPUFreq Interface component
>   xen/arm: Build CPUFreq components
>   xen/arm: Enable CPUFreq on ARM
> 
> Volodymyr Babchuk (1):
>   arm: add SMC wrapper that is compatible with SMCCC
> 
>  MAINTAINERS                                  |    4 +-
>  tools/misc/xenpm.c                           |    6 +-
>  xen/arch/arm/Kconfig                         |    2 +
>  xen/arch/arm/Makefile                        |    1 +
>  xen/arch/arm/arm32/Makefile                  |    1 +
>  xen/arch/arm/arm32/smc.S                     |   32 +
>  xen/arch/arm/arm64/Makefile                  |    1 +
>  xen/arch/arm/arm64/smc.S                     |   29 +
>  xen/arch/arm/cpufreq/Makefile                |    5 +
>  xen/arch/arm/cpufreq/arm-smc-mailbox.c       |  248 ++++++
>  xen/arch/arm/cpufreq/arm_scpi.c              | 1191 ++++++++++++++++++++++++++
>  xen/arch/arm/cpufreq/cpufreq_if.c            |  522 +++++++++++
>  xen/arch/arm/cpufreq/mailbox.c               |  562 ++++++++++++
>  xen/arch/arm/cpufreq/mailbox.h               |   28 +
>  xen/arch/arm/cpufreq/mailbox_client.h        |   69 ++
>  xen/arch/arm/cpufreq/mailbox_controller.h    |  161 ++++
>  xen/arch/arm/cpufreq/scpi_cpufreq.c          |  328 +++++++
>  xen/arch/arm/cpufreq/scpi_protocol.h         |  116 +++
>  xen/arch/arm/cpufreq/wrappers.h              |  239 ++++++
>  xen/arch/arm/smpboot.c                       |    5 +
>  xen/arch/x86/Kconfig                         |    2 +
>  xen/arch/x86/acpi/cpu_idle.c                 |    2 +-
>  xen/arch/x86/acpi/cpufreq/cpufreq.c          |    2 +-
>  xen/arch/x86/acpi/cpufreq/powernow.c         |    2 +-
>  xen/arch/x86/acpi/power.c                    |    2 +-
>  xen/arch/x86/cpu/mwait-idle.c                |    2 +-
>  xen/arch/x86/platform_hypercall.c            |    2 +-
>  xen/common/device_tree.c                     |  124 +++
>  xen/common/sysctl.c                          |    2 +-
>  xen/drivers/Kconfig                          |    2 +
>  xen/drivers/Makefile                         |    1 +
>  xen/drivers/acpi/Makefile                    |    1 -
>  xen/drivers/acpi/pmstat.c                    |  526 ------------
>  xen/drivers/cpufreq/Kconfig                  |    3 +
>  xen/drivers/cpufreq/cpufreq.c                |  102 ++-
>  xen/drivers/cpufreq/cpufreq_misc_governors.c |    2 +-
>  xen/drivers/cpufreq/cpufreq_ondemand.c       |    4 +-
>  xen/drivers/cpufreq/utility.c                |   13 +-
>  xen/drivers/pm/Kconfig                       |    3 +
>  xen/drivers/pm/Makefile                      |    1 +
>  xen/drivers/pm/stat.c                        |  538 ++++++++++++
>  xen/include/acpi/cpufreq/cpufreq.h           |  245 ------
>  xen/include/acpi/cpufreq/processor_perf.h    |   63 --
>  xen/include/asm-arm/device.h                 |    2 +
>  xen/include/asm-arm/processor.h              |    4 +
>  xen/include/public/platform.h                |    1 +
>  xen/include/xen/cpufreq.h                    |  254 ++++++
>  xen/include/xen/device_tree.h                |  158 ++++
>  xen/include/xen/pmstat.h                     |    2 +
>  xen/include/xen/processor_perf.h             |   69 ++
>  50 files changed, 4822 insertions(+), 862 deletions(-)
>  create mode 100644 xen/arch/arm/arm32/smc.S
>  create mode 100644 xen/arch/arm/arm64/smc.S
>  create mode 100644 xen/arch/arm/cpufreq/Makefile
>  create mode 100644 xen/arch/arm/cpufreq/arm-smc-mailbox.c
>  create mode 100644 xen/arch/arm/cpufreq/arm_scpi.c
>  create mode 100644 xen/arch/arm/cpufreq/cpufreq_if.c
>  create mode 100644 xen/arch/arm/cpufreq/mailbox.c
>  create mode 100644 xen/arch/arm/cpufreq/mailbox.h
>  create mode 100644 xen/arch/arm/cpufreq/mailbox_client.h
>  create mode 100644 xen/arch/arm/cpufreq/mailbox_controller.h
>  create mode 100644 xen/arch/arm/cpufreq/scpi_cpufreq.c
>  create mode 100644 xen/arch/arm/cpufreq/scpi_protocol.h
>  create mode 100644 xen/arch/arm/cpufreq/wrappers.h
>  delete mode 100644 xen/drivers/acpi/pmstat.c
>  create mode 100644 xen/drivers/pm/Kconfig
>  create mode 100644 xen/drivers/pm/Makefile
>  create mode 100644 xen/drivers/pm/stat.c
>  delete mode 100644 xen/include/acpi/cpufreq/cpufreq.h
>  delete mode 100644 xen/include/acpi/cpufreq/processor_perf.h
>  create mode 100644 xen/include/xen/cpufreq.h
>  create mode 100644 xen/include/xen/processor_perf.h
> 
> -- 
> 2.7.4
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [RFC PATCH 06/31] cpufreq: make cpufreq driver more generalizable
  2017-12-05 20:48           ` Stefano Stabellini
@ 2017-12-06  7:54             ` Jan Beulich
  2017-12-06 23:44               ` Stefano Stabellini
  0 siblings, 1 reply; 108+ messages in thread
From: Jan Beulich @ 2017-12-06  7:54 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: andrew.cooper3, Oleksandr Dmytryshyn, Julien Grall,
	Oleksandr Tyshchenko, Oleksandr Tyshchenko, xen-devel

>>> On 05.12.17 at 21:48, <sstabellini@kernel.org> wrote:
> You are right. We need to define a new struct for internal usage, for
> example:
> 
> struct xen_processor_performance_internal {
>     uint32_t flags;     /* flag for Px sub info type */
>     uint32_t platform_limit;  /* Platform limitation on freq usage */
>     struct xen_pct_register control_register;
>     struct xen_pct_register status_register;
>     uint32_t state_count;     /* total available performance states */
>     struct xen_processor_px states;
>     struct xen_psd_package domain_info;
>     uint32_t shared_type;     /* coordination type of this processor */
> };
> 
> Jan, Andrew, does this sound like a good approach to you?

I'm afraid I don't have the time to go through this discussion (and
the original patch) in detail to figure out the full context in which
you raise the question. IOW please summarize things alongside
the proposed structure, or alternatively Oleksandr could simply
submit an updated patch to allow seeing the actual context
(albeit in any case I can't promise timely feedback, given the
number of pending patches plus all the work I still hope to be
able to get done myself eventually).

From a brief check, I can't really figure much of a difference to
the already existing (and internal) struct processor_performance.

Jan



^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [RFC PATCH 22/31] xen/arm: Add Xen changes to SCPI protocol
  2017-12-05 21:41     ` Julien Grall
@ 2017-12-06 10:08       ` Oleksandr Tyshchenko
  0 siblings, 0 replies; 108+ messages in thread
From: Oleksandr Tyshchenko @ 2017-12-06 10:08 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini; +Cc: xen-devel, Oleksandr Tyshchenko

Hi Julien, Stefano

On Tue, Dec 5, 2017 at 11:41 PM, Julien Grall <julien.grall@linaro.org> wrote:
>
>
> On 05/12/2017 21:20, Stefano Stabellini wrote:
>>
>> On Thu, 9 Nov 2017, Oleksandr Tyshchenko wrote:
>>>
>>> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>>>
>>> Modify the direct ported SCPI Message Protocol driver to be
>>> functional inside Xen.
>>>
>>> As SCPI Message protocol driver expects mailbox to be registered,
>>> find and initialize mailbox before probing it.
>>>
>>> Include "wrappers.h" which contains all required things the direct
>>> ported code relies on.
>>>
>>> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>>> CC: Stefano Stabellini <sstabellini@kernel.org>
>>> CC: Julien Grall <julien.grall@linaro.org>
>>
>>
>> As far as drivers ported from Linux go, this looks pretty clean in terms
>> of changes and nasty glue code required to get it to work.
>>
>> The wrappers.h header is not too bad. The question remains on whether we
>> should keep the #if 0 to retain "textual compatibility" with Linux, or
>> we should just bite the bullet and apply the changes. If we commit them
>> as a separate patch, we can always dig out the difference between the
>> original driver and the Xen version using git.
>>
>> Julien, what do you think?
>
>
> When I see the diff of that series:
>
>  50 files changed, 4822 insertions(+), 862 deletions(-)
>
> this is a rather huge series for benefits that we still don't know (e.g. we
> don't have any numbers). Based on the current discussion, it looks like the
> design will change quite a lot. So in all honesty, I haven't spent and
> will not spend much time looking at the code itself until we get an agreement
> on the benefits.
>
> However, I had a brief look at the code and raised an eyebrow quite a few
> times at the glue code. I saw that a mutex was converted to a spinlock without
> any justification (see patch #20).
>
> Anyway, Oleksandr promised to come back with numbers and to follow up on the
> discussion. We should probably wait for that before looking at this series in
> more detail.
Yes, I am working on getting numbers. We will resume discussion when I
provide them.

>
> Cheers,
>
> --
> Julien Grall



-- 
Regards,

Oleksandr Tyshchenko


^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [RFC PATCH 00/31] CPUFreq on ARM
  2017-12-05 22:26 ` Stefano Stabellini
@ 2017-12-06 10:10   ` Oleksandr Tyshchenko
  0 siblings, 0 replies; 108+ messages in thread
From: Oleksandr Tyshchenko @ 2017-12-06 10:10 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Edgar E . Iglesias, Jassi Brar, Andrew Cooper, Julien Grall,
	Andre Przywara, Oleksandr Tyshchenko, Jan Beulich, Sudeep Holla,
	xen-devel

On Wed, Dec 6, 2017 at 12:26 AM, Stefano Stabellini
<sstabellini@kernel.org> wrote:
> Hi Oleksandr,
Hi Stefano

>
> I just wanted to tell you that the patch series is very well organized
> and the patches very nicely split.
Nice to hear. Thank you.

>
> Thank you!
>
> - Stefano



-- 
Regards,

Oleksandr Tyshchenko


^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [RFC PATCH 29/31] xen/arm: Introduce CPUFreq Interface component
  2017-12-05 22:25   ` Stefano Stabellini
@ 2017-12-06 10:54     ` Oleksandr Tyshchenko
  2017-12-07  1:40       ` Stefano Stabellini
  0 siblings, 1 reply; 108+ messages in thread
From: Oleksandr Tyshchenko @ 2017-12-06 10:54 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, Julien Grall, Oleksandr Tyshchenko

Hi Stefano

On Wed, Dec 6, 2017 at 12:25 AM, Stefano Stabellini
<sstabellini@kernel.org> wrote:
> On Thu, 9 Nov 2017, Oleksandr Tyshchenko wrote:
>> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>>
>> This patch adds an interface component which performs the following steps:
>> 1. Initialize everything the SCPI based CPUFreq driver needs to be functional
>>    (SCPI Message protocol, mailbox to communicate with SCP, etc).
>>    Also perform a preliminary check that the SCPI DVFS clock nodes offered
>>    by SCP are present in the device tree.
>> 2. Register the SCPI based CPUFreq driver.
>> 3. Populate CPUs. Get DVFS info (OPP list and the latency information)
>>    for all DVFS capable CPUs using SCPI protocol, convert these capabilities
>>    into the PM data the CPUFreq framework expects to see, and then
>>    upload it.
>>
>> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>> CC: Stefano Stabellini <sstabellini@kernel.org>
>> CC: Julien Grall <julien.grall@linaro.org>
>> ---
>>  xen/arch/arm/cpufreq/cpufreq_if.c | 522 ++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 522 insertions(+)
>>  create mode 100644 xen/arch/arm/cpufreq/cpufreq_if.c
>>
>> diff --git a/xen/arch/arm/cpufreq/cpufreq_if.c b/xen/arch/arm/cpufreq/cpufreq_if.c
>> new file mode 100644
>> index 0000000..2451d00
>> --- /dev/null
>> +++ b/xen/arch/arm/cpufreq/cpufreq_if.c
>> @@ -0,0 +1,522 @@
>> +/*
>> + * xen/arch/arm/cpufreq/cpufreq_if.c
>> + *
>> + * CPUFreq interface component
>> + *
>> + * Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>> + * Copyright (c) 2017 EPAM Systems.
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License
>> + * along with this program; If not, see <http://www.gnu.org/licenses/>.
>> + */
>> +
>> +#include <xen/device_tree.h>
>> +#include <xen/err.h>
>> +#include <xen/sched.h>
>> +#include <xen/cpufreq.h>
>> +#include <xen/pmstat.h>
>> +#include <xen/guest_access.h>
>> +
>> +#include "scpi_protocol.h"
>> +
>> +/*
>> + * TODO:
>> + * 1. Add __init to required funcs
>> + * 2. Put get_cpu_device() into common place
>> + */
>> +
>> +static struct scpi_ops *scpi_ops;
>> +
>> +extern int scpi_cpufreq_register_driver(void);
>> +
>> +#define dev_name(dev) dt_node_full_name(dev_to_dt(dev))
>> +
>> +struct device *get_cpu_device(unsigned int cpu)
>> +{
>> +    if ( cpu < nr_cpu_ids && cpu_possible(cpu) )
>> +        return dt_to_dev(cpu_dt_nodes[cpu]);
>> +    else
>> +        return NULL;
>> +}
>> +
>> +static bool is_dvfs_capable(unsigned int cpu)
>> +{
>> +    static const struct dt_device_match scpi_dvfs_clock_match[] =
>> +    {
>> +        DT_MATCH_COMPATIBLE("arm,scpi-dvfs-clocks"),
>> +        { /* sentinel */ },
>> +    };
>> +    struct device *cpu_dev;
>> +    struct dt_phandle_args clock_spec;
>> +    struct scpi_dvfs_info *info;
>> +    u32 domain;
>> +    int i, ret, count;
>> +
>> +    cpu_dev = get_cpu_device(cpu);
>> +    if ( !cpu_dev )
>> +    {
>> +        printk("cpu%d: failed to get device\n", cpu);
>> +        return false;
>> +    }
>> +
>> +    /* First of all find a clock node this CPU is a consumer of */
>> +    ret = dt_parse_phandle_with_args(cpu_dev->of_node,
>> +                                     "clocks",
>> +                                     "#clock-cells",
>> +                                     0,
>> +                                     &clock_spec);
>> +    if ( ret )
>> +    {
>> +        printk("cpu%d: failed to get clock node\n", cpu);
>> +        return false;
>> +    }
>> +
>> +    /* Make sure it is an available DVFS clock node */
>> +    if ( !dt_match_node(scpi_dvfs_clock_match, clock_spec.np) ||
>> +         !dt_device_is_available(clock_spec.np) )
>> +    {
>> +        printk("cpu%d: clock node '%s' is either non-DVFS or non-available\n",
>> +               cpu, dev_name(&clock_spec.np->dev));
>> +        return false;
>> +    }
>> +
>> +    /*
>> +     * Actually we already have a power domain id this CPU belongs to,
>> +     * it is a stored in args[0] CPU clock specifier, so we could ask SCP
>> +     * to provide its DVFS info. But we want to dig a little bit deeper
>> +     * to make sure that everything is correct.
>> +     */
>> +
>> +    /* Check how many clock ids a DVFS clock node has */
>> +    ret = dt_property_count_elems_of_size(clock_spec.np,
>> +                                          "clock-indices",
>> +                                          sizeof(u32));
>> +    if ( ret < 0 )
>> +    {
>> +        printk("cpu%d: failed to get clock-indices count in '%s'\n",
>> +               cpu, dev_name(&clock_spec.np->dev));
>> +        return false;
>> +    }
>> +    count = ret;
>> +
>> +    /* Check if a clock id the CPU clock specifier points to is present */
>> +    for ( i = 0; i < count; i++ )
>> +    {
>> +        ret = dt_property_read_u32_index(clock_spec.np,
>> +                                         "clock-indices",
>> +                                         i,
>> +                                         &domain);
>> +        if ( ret )
>> +        {
>> +            printk("cpu%d: failed to get clock index in '%s'\n",
>> +                   cpu, dev_name(&clock_spec.np->dev));
>> +            return false;
>> +        }
>> +
>> +        /* Match found */
>> +        if ( clock_spec.args[0] == domain )
>> +            break;
>> +    }
>> +
>> +    if ( i == count )
>> +    {
>> +        printk("cpu%d: failed to find matching clk_id (pd) %d\n",
>> +               cpu, clock_spec.args[0]);
>> +        return false;
>> +    }
>> +
>> +    /*
>> +     * Check if a SCP is aware of this power domain. SCPI Message protocol
>> +     * driver will populate power domain's DVFS info then.
>> +     */
>> +    info = scpi_ops->dvfs_get_info(domain);
>> +    if ( IS_ERR(info) )
>> +    {
>> +        printk("cpu%d: failed to get DVFS info of pd%u\n", cpu, domain);
>> +        return false;
>> +    }
>> +
>> +    printk(XENLOG_DEBUG "cpu%d: is DVFS capable, belongs to pd%u\n",
>> +           cpu, domain);
>> +
>> +    return true;
>> +}
>> +
>> +static int get_sharing_cpus(unsigned int cpu, cpumask_t *mask)
>> +{
>> +    struct device *cpu_dev = get_cpu_device(cpu), *tcpu_dev;
>> +    unsigned int tcpu;
>> +    int domain, tdomain;
>> +
>> +    BUG_ON(!cpu_dev);
>> +
>> +    domain = scpi_ops->device_domain_id(cpu_dev);
>> +    if ( domain < 0 )
>> +        return domain;
>> +
>> +    cpumask_clear(mask);
>> +    cpumask_set_cpu(cpu, mask);
>> +
>> +    for_each_online_cpu( tcpu )
>> +    {
>> +        if ( tcpu == cpu )
>> +            continue;
>> +
>> +        tcpu_dev = get_cpu_device(tcpu);
>> +        if ( !tcpu_dev )
>> +            continue;
>> +
>> +        tdomain = scpi_ops->device_domain_id(tcpu_dev);
>> +        if ( tdomain == domain )
>> +            cpumask_set_cpu(tcpu, mask);
>> +    }
>> +
>> +    return 0;
>> +}
>> +
>> +static int get_transition_latency(struct device *cpu_dev)
>> +{
>> +    return scpi_ops->get_transition_latency(cpu_dev);
>> +}
>> +
>> +static struct scpi_dvfs_info *get_dvfs_info(struct device *cpu_dev)
>> +{
>> +    int domain;
>> +
>> +    domain = scpi_ops->device_domain_id(cpu_dev);
>> +    if ( domain < 0 )
>> +        return ERR_PTR(-EINVAL);
>> +
>> +    return scpi_ops->dvfs_get_info(domain);
>> +}
>> +
>> +static int init_cpufreq_table(unsigned int cpu,
>> +                              struct cpufreq_frequency_table **table)
>> +{
>> +    struct cpufreq_frequency_table *freq_table = NULL;
>> +    struct device *cpu_dev = get_cpu_device(cpu);
>> +    struct scpi_dvfs_info *info;
>> +    struct scpi_opp *opp;
>> +    int i;
>> +
>> +    BUG_ON(!cpu_dev);
>> +
>> +    info = get_dvfs_info(cpu_dev);
>> +    if ( IS_ERR(info) )
>> +        return PTR_ERR(info);
>> +
>> +    if ( !info->opps )
>> +        return -EIO;
>> +
>> +    freq_table = xzalloc_array(struct cpufreq_frequency_table, info->count + 1);
>> +    if ( !freq_table )
>> +        return -ENOMEM;
>> +
>> +    for ( opp = info->opps, i = 0; i < info->count; i++, opp++ )
>> +    {
>> +        freq_table[i].index = i;
>> +        /* Convert Hz -> kHz */
>> +        freq_table[i].frequency = opp->freq / 1000;
>> +    }
>> +
>> +    freq_table[i].index = i;
>> +    freq_table[i].frequency = CPUFREQ_TABLE_END;
>> +
>> +    *table = &freq_table[0];
>> +
>> +    return 0;
>> +}
>> +
>> +static void free_cpufreq_table(struct cpufreq_frequency_table **table)
>> +{
>> +    if ( !table )
>> +        return;
>> +
>> +    xfree(*table);
>> +    *table = NULL;
>> +}
>> +
>> +static int upload_cpufreq_data(cpumask_t *mask,
>> +                               struct cpufreq_frequency_table *table)
>> +{
>> +    struct xen_processor_performance *perf;
>> +    struct xen_processor_px *states;
>> +    uint32_t platform_limit = 0, state_count = 0;
>> +    unsigned int max_freq = 0, prev_freq = 0, cpu = cpumask_first(mask);
>> +    int i, latency, ret = 0;
>> +
>> +    perf = xzalloc(struct xen_processor_performance);
>> +    if ( !perf )
>> +        return -ENOMEM;
>> +
>> +    /* Check frequency table and find max frequency */
>> +    for ( i = 0; (table[i].frequency != CPUFREQ_TABLE_END); i++ )
>> +    {
>> +        unsigned int freq = table[i].frequency;
>> +
>> +        if ( freq == CPUFREQ_ENTRY_INVALID )
>> +            continue;
>> +
>> +        if ( table[i].index != state_count || freq <= prev_freq )
>> +        {
>> +            printk("cpu%d: frequency table format error\n", cpu);
>> +            ret = -EINVAL;
>> +            goto out;
>> +        }
>> +
>> +        prev_freq = freq;
>> +        state_count++;
>> +        if ( freq > max_freq )
>> +            max_freq = freq;
>> +    }
>> +
>> +    /*
>> +     * The frequency table we have is just a temporary place for storing
>> +     * provided by SCP DVFS info. Create performance states array.
>> +     */
>> +    if ( !state_count )
>> +    {
>> +        printk("cpu%d: no available performance states\n", cpu);
>> +        ret = -EINVAL;
>> +        goto out;
>> +    }
>> +
>> +    states = xzalloc_array(struct xen_processor_px, state_count);
>> +    if ( !states )
>> +    {
>> +        ret = -ENOMEM;
>> +        goto out;
>> +    }
>> +
>> +    set_xen_guest_handle(perf->states, states);
>
> this is the bit that should go away

Yes. For context, here are the relevant references.

This patch must be reworked:
https://www.mail-archive.com/xen-devel@lists.xen.org/msg128410.html

We have started discussing how to rework it here:
https://marc.info/?l=xen-devel&m=151250698607186

>
>
>> +    perf->state_count = state_count;
>> +
>> +    latency = get_transition_latency(get_cpu_device(cpu));
>> +
>> +    /* Performance states must start from higher values */
>> +    for ( i = 0; (table[i].frequency != CPUFREQ_TABLE_END); i++ )
>> +    {
>> +        unsigned int freq = table[i].frequency;
>> +        unsigned int index = state_count - 1 - table[i].index;
>> +
>> +        if ( freq == CPUFREQ_ENTRY_INVALID )
>> +            continue;
>> +
>> +        if ( freq == max_freq )
>> +            platform_limit = index;
>> +
>> +        /* Convert kHz -> MHz */
>> +        states[index].core_frequency = freq / 1000;
>> +        /* Convert ns -> us */
>> +        states[index].transition_latency = DIV_ROUND_UP(latency, 1000);
>
> Why are we using DIV_ROUND_UP here and not in all the other frequency
> conversions?

I decided to use DIV_ROUND_UP here because the latency might, in theory,
be less than 1000 ns, in which case plain division would yield 0 us.
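
To illustrate the difference, here is a minimal standalone sketch. The
helper names ns_to_us_trunc() and ns_to_us_round_up() are made up for this
example and are not part of the patch; DIV_ROUND_UP is spelled out with its
usual definition so the snippet compiles on its own.

```c
#include <assert.h>

/* Usual round-up division helper, written out so the snippet is standalone. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Hypothetical helper: truncating ns -> us conversion. */
static unsigned int ns_to_us_trunc(unsigned int ns)
{
    return ns / 1000;
}

/* Hypothetical helper: ns -> us conversion that rounds up. */
static unsigned int ns_to_us_round_up(unsigned int ns)
{
    return DIV_ROUND_UP(ns, 1000);
}
```

With a latency of 500 ns, the truncating variant reports 0 us, while the
rounding variant reports 1 us, so a small but non-zero latency is never
presented to the framework as "no latency".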

>
>
>> +    }
>> +
>> +    perf->flags = XEN_PX_DATA; /* all info in a one-shot */
>
> Please use existing flags

Yes, sure. As we have already agreed here:
https://marc.info/?l=xen-devel&m=151250698607186

>
>
>> +    perf->platform_limit = platform_limit;
>> +    perf->shared_type = CPUFREQ_SHARED_TYPE_ANY;
>> +    perf->domain_info.domain = cpumask_first(mask);
>> +    perf->domain_info.num_processors = cpumask_weight(mask);
>> +
>> +    /* Iterate through all CPUs which are on the same boat */
>> +    for_each_cpu( cpu, mask )
>> +    {
>> +        ret = set_px_pminfo(cpu, perf);
>> +        if ( ret )
>> +        {
>> +            printk("cpu%d: failed to set Px states (%d)\n", cpu, ret);
>> +            break;
>> +        }
>> +
>> +        printk(XENLOG_DEBUG "cpu%d: set Px states\n", cpu);
>> +    }
>> +
>> +    xfree(states);
>> +out:
>> +    xfree(perf);
>> +
>> +    return ret;
>> +}
>> +
>> +static int __init scpi_cpufreq_postinit(void)
>> +{
>> +    struct cpufreq_frequency_table *freq_table = NULL;
>> +    cpumask_t processed_cpus, shared_cpus;
>> +    unsigned int cpu;
>> +    int ret = -ENODEV;
>> +
>> +    cpumask_clear(&processed_cpus);
>> +
>> +    for_each_online_cpu( cpu )
>> +    {
>> +        if ( cpumask_test_cpu(cpu, &processed_cpus) )
>> +            continue;
>> +
>> +        if ( !is_dvfs_capable(cpu) )
>> +            continue;
>> +
>> +        ret = get_sharing_cpus(cpu, &shared_cpus);
>> +        if ( ret )
>> +        {
>> +            printk("cpu%d: failed to get sharing cpumask (%d)\n", cpu, ret);
>> +            return ret;
>> +        }
>> +
>> +        BUG_ON(cpumask_empty(&shared_cpus));
>> +        cpumask_or(&processed_cpus, &processed_cpus, &shared_cpus);
>> +
>> +        /* Create intermediate frequency table */
>> +        ret = init_cpufreq_table(cpu, &freq_table);
>> +        if ( ret )
>> +        {
>> +            printk("cpu%d: failed to initialize frequency table (%d)\n",
>> +                   cpu, ret);
>> +            return ret;
>> +        }
>> +
>> +        ret = upload_cpufreq_data(&shared_cpus, freq_table);
>> +        /* Destroy intermediate frequency table */
>> +        free_cpufreq_table(&freq_table);
>> +        if ( ret )
>> +        {
>> +            printk("cpu%d: failed to upload cpufreq data (%d)\n", cpu, ret);
>> +            return ret;
>> +        }
>> +
>> +        printk(XENLOG_DEBUG "cpu%d: uploaded cpufreq data\n", cpu);
>> +    }
>> +
>> +    return ret;
>> +}
>> +
>> +static int __init scpi_cpufreq_preinit(void)
>> +{
>> +    struct dt_device_node *scpi, *clk, *dvfs_clk;
>> +    int ret;
>> +
>> +    /* Initialize SCPI Message protocol */
>> +    ret = scpi_init();
>> +    if ( ret )
>> +    {
>> +        printk("failed to initialize SCPI (%d)\n", ret);
>> +        return ret;
>> +    }
>> +
>> +    /* Sanity check */
>> +    if ( !get_scpi_ops() || !get_scpi_dev() )
>> +        return -ENXIO;
>> +
>> +    scpi = get_scpi_dev()->of_node;
>> +    scpi_ops = get_scpi_ops();
>> +
>> +    ret = -ENODEV;
>> +
>> +    /*
>> +     * Check for clock related nodes for now. But it might additional nodes,
>> +     * like thermal sensor, etc.
>> +     */
>> +    dt_for_each_child_node( scpi, clk )
>
> Wouldn't it make sense to have a proper:
>
> DT_DEVICE_START
> ...
> DT_DEVICE_END
>
> block and register the driver that way?

I am not sure I fully understood your question.
Which driver needs to be registered that way?
If we had a separate DT-related driver to manage clocks, we would have to
register it in the proposed way.
Here we are just iterating through all SCPI child nodes to make sure that
the DVFS clock sub-node is present.
Let's call it a preliminary check.

BTW, I do register the ARM SMC triggered mailbox driver in the proposed way:
https://www.mail-archive.com/xen-devel@lists.xen.org/msg128411.html
together with a new DEVICE_MAILBOX class:
https://www.mail-archive.com/xen-devel@lists.xen.org/msg128402.html

>
>
>> +    {
>> +        /*
>> +         * First of all there must be a container node which contains all
>> +         * clocks provided by SCP.
>> +         */
>> +        if ( !dt_device_is_compatible(clk, "arm,scpi-clocks") )
>> +            continue;
>> +
>> +        /*
>> +         * As we are interested in DVFS feature only, check for DVFS clock
>> +         * sub-node. At the current stage check for it presence only.
>> +         * Without it there is no point to register SCPI based CPUFreq. We will
>> +         * perform a thorough check later when populating DVFS clock consumers.
>> +         */
>> +        dt_for_each_child_node( clk, dvfs_clk )
>> +        {
>> +            if ( !dt_device_is_compatible(dvfs_clk, "arm,scpi-dvfs-clocks") )
>> +                continue;
>> +
>> +            return 0;
>> +        }
>> +
>> +        break;
>> +    }
>> +
>> +    printk("failed to find SCPI DVFS clocks (%d)\n", ret);
>> +
>> +    return ret;
>> +}
>> +
>> +/* TODO Implement me */
>
> :-)
>
>
>> +static void scpi_cpufreq_deinit(void)
>> +{
>> +
>> +}
>> +
>> +static int __init cpufreq_driver_init(void)
>> +{
>> +    int ret;
>> +
>> +    if ( cpufreq_controller != FREQCTL_xen )
>> +        return 0;
>> +
>> +    /*
>> +     * Initialize everything needed SCPI based CPUFreq driver to be functional
>> +     * (SCPI Message protocol, mailbox to communicate with SCP, etc).
>> +     * Also preliminary check if SCPI DVFS clock nodes offered by SCP are
>> +     * present in a device tree.
>> +     */
>> +    ret = scpi_cpufreq_preinit();
>> +    if ( ret )
>> +        goto out;
>> +
>> +    /* Register SCPI based CPUFreq driver */
>> +    ret = scpi_cpufreq_register_driver();
>> +    if ( ret )
>> +        goto out;
>> +
>> +    /*
>> +     * Populate CPUs. Get DVFS info (OPP list and the latency information)
>> +     * for all DVFS capable CPUs using SCPI protocol, convert these capabilities
>> +     * into PM data the CPUFreq framework expects to see followed by
>> +     * uploading it.
>> +     *
>> +     * Actually it is almost the same PM data which hwdom uploads in case of
>> +     * x86 via platform hypercall after parsing ACPI tables. In our case we
>> +     * don't need hwdom to be involved in, since we already have everything in
>> +     * hand. Moreover, the hwdom doesn't even know anything about physical CPUs.
>> +     * Not completely sure that it is the best place to do so, but certainly
>> +     * it must be after driver registration.
>> +     */
>> +    ret = scpi_cpufreq_postinit();
>> +
>> +out:
>> +    if ( ret )
>> +    {
>> +        printk("failed to initialize SCPI based CPUFreq (%d)\n", ret);
>> +        scpi_cpufreq_deinit();
>> +        return ret;
>> +    }
>> +
>> +    printk("initialized SCPI based CPUFreq\n");
>> +
>> +    return 0;
>> +}
>> +__initcall(cpufreq_driver_init);
>> +
>> +/*
>> + * Local variables:
>> + * mode: C
>> + * c-file-style: "BSD"
>> + * c-basic-offset: 4
>> + * tab-width: 4
>> + * indent-tabs-mode: nil
>> + * End:
>> + */



-- 
Regards,

Oleksandr Tyshchenko


* Re: [RFC PATCH 04/31] cpufreq: make turbo settings to be configurable
  2017-12-05 19:24           ` Stefano Stabellini
@ 2017-12-06 11:28             ` Oleksandr Tyshchenko
  0 siblings, 0 replies; 108+ messages in thread
From: Oleksandr Tyshchenko @ 2017-12-06 11:28 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Andrew Cooper, Oleksandr Dmytryshyn, Julien Grall,
	Oleksandr Tyshchenko, Jan Beulich, xen-devel

Hi, Stefano.

On Tue, Dec 5, 2017 at 9:24 PM, Stefano Stabellini
<sstabellini@kernel.org> wrote:
> On Tue, 5 Dec 2017, Oleksandr Tyshchenko wrote:
>> >> Another question is second_max_freq. As I understand, it is highest
>> >> non-turbo frequency calculated by framework to limit target frequency
>> >> when
>> >> turbo mode "is disabled". And Xen assumes that second_max_freq is
>> >> always P1 if turbo mode is on.
>> >> But, there might be a case when a few highest frequencies are
>> >> turbo-frequencies. So, I propose to add an extra flag for handling
>> >> that.
>> >> So, each CPUFreq driver responsibility will be to mark
>> >> turbo-frequency(ies) for the framework to properly calculate
>> >> second_max_freq.
>> >
>> > As Andre wrote, we can start simply assuming that ARM doesn't have
>> > turbo. If turbo mode is assumed to be off, I don't think we need the
>> > patch below and the new flag, because second_max_freq == max_freq.
>>
>> I just want to show you real example, where we have ARM SoC +
>> turbo-mode + > 1 turbo freq
>> https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas-bsp.git/tree/arch/arm64/boot/dts/renesas/r8a7795.dtsi?h=v4.9/rcar-3.5.9#n197
>> As you can see, there are two freqs marked as turbo-freqs: 1600000000
>> Hz and 1700000000 Hz
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas-bsp.git/tree/arch/arm64/boot/dts/renesas/r8a7796.dtsi?h=v4.9/rcar-3.5.9#n166
>> For M3 SoC three turbo-freqs are used: 1600000000 Hz, 1700000000 Hz
>> and 1800000000 Hz
>
> Oh well, I take that back then :-)
>
>
>> If a proposed below patch is not an option then we should find another
>> way to clarify second_max_freq.
>
> Yes, it looks like there must be better ways to define second_max_freq.
> Taking the first frequency below the max seems a bit crude to me.
>
>
>> >
>> >> Something like that:
>> >>
>> >> diff --git a/xen/drivers/cpufreq/utility.c b/xen/drivers/cpufreq/utility.c
>> >> index 25bf983..122a88b 100644
>> >> --- a/xen/drivers/cpufreq/utility.c
>> >> +++ b/xen/drivers/cpufreq/utility.c
>> >> @@ -226,7 +226,8 @@ int cpufreq_frequency_table_cpuinfo(struct
>> >> cpufreq_policy *policy,
>> >>  #ifdef CONFIG_HAS_CPU_TURBO
>> >>      for (i=0; (table[i].frequency != CPUFREQ_TABLE_END); i++) {
>> >>          unsigned int freq = table[i].frequency;
>> >> -        if (freq == CPUFREQ_ENTRY_INVALID || freq == max_freq)
>> >> +        if ((freq == CPUFREQ_ENTRY_INVALID) ||
>> >> +            (table[i].flags & CPUFREQ_BOOST_FREQ))
>> >>              continue;
>> >>          if (freq > second_max_freq)
>> >>              second_max_freq = freq;
>> >> diff --git a/xen/include/xen/cpufreq.h b/xen/include/xen/cpufreq.h
>> >> index 2e0c16a..77b29da 100644
>> >> --- a/xen/include/xen/cpufreq.h
>> >> +++ b/xen/include/xen/cpufreq.h
>> >> @@ -204,7 +204,11 @@ void cpufreq_verify_within_limits(struct
>> >> cpufreq_policy *policy,
>> >>  #define CPUFREQ_ENTRY_INVALID ~0
>> >>  #define CPUFREQ_TABLE_END     ~1
>> >>
>> >> +/* Special Values of .flags field */
>> >> +#define CPUFREQ_BOOST_FREQ    (1 << 0)
>> >> +
>> >>  struct cpufreq_frequency_table {
>> >> +       unsigned int    flags;
>> >>      unsigned int    index;     /* any */
>> >>      unsigned int    frequency; /* kHz - doesn't need to be in ascending
>> >>                                  * order */
>> >>
>> >> Both existing on x86 CPUFreq drivers just need to mark P0 frequency as
>> >> a turbo-frequency if turbo mode "is supported". Am I correct?
>
> Yes, I think it is a better approach than what we have today, even for
> x86.

OK, if nobody minds, I will prepare patches that include these changes to the
common part, plus changes to the existing x86 CPUFreq drivers (to mark the P0
frequency as a turbo-frequency if turbo mode "is supported").
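
As a rough standalone sketch of the idea (not the final patch), the lookup
below skips every entry flagged as a boost frequency instead of skipping
only the single maximum. The struct freq_entry type, the function
highest_non_boost_freq() and sample_table are hypothetical names for this
illustration; the two boost entries use the 1.6 GHz and 1.7 GHz values
mentioned above, while the lower entries are made-up placeholders.

```c
#include <assert.h>

#define CPUFREQ_ENTRY_INVALID (~0u)
#define CPUFREQ_TABLE_END     (~1u)
#define CPUFREQ_BOOST_FREQ    (1u << 0)

/* Hypothetical stand-in for struct cpufreq_frequency_table with .flags. */
struct freq_entry {
    unsigned int flags;
    unsigned int index;
    unsigned int frequency; /* kHz */
};

/* Return the highest frequency that is not marked as a boost frequency. */
static unsigned int highest_non_boost_freq(const struct freq_entry *table)
{
    unsigned int i, best = 0;

    for ( i = 0; table[i].frequency != CPUFREQ_TABLE_END; i++ )
    {
        unsigned int freq = table[i].frequency;

        if ( freq == CPUFREQ_ENTRY_INVALID ||
             (table[i].flags & CPUFREQ_BOOST_FREQ) )
            continue;

        if ( freq > best )
            best = freq;
    }

    return best;
}

/* Illustrative table: non-boost values are placeholders, not real OPPs. */
static const struct freq_entry sample_table[] = {
    { 0,                  0,  500000 },
    { 0,                  1, 1000000 },
    { 0,                  2, 1500000 },
    { CPUFREQ_BOOST_FREQ, 3, 1600000 },
    { CPUFREQ_BOOST_FREQ, 4, 1700000 },
    { 0,                  5, CPUFREQ_TABLE_END },
};
```

With the "skip only max_freq" logic the result for this table would be
1600000 kHz, which is still a turbo frequency; with the flag check it is
1500000 kHz.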

>
>
>> >> And the most important question is how to recognize in Xen on ARM
>> >> (using SCPI protocol) which frequencies are turbo-frequencies
>> >> actually? I couldn't find any information regarding that in protocol
>> >> description.
>> >> For DT-based CPUFreq it is not an issue, since there is a specific
>> >> property "turbo-mode" to mark corresponding OPPs. [1].
>> >> But neither SCPI DT bindings [2] nor the SCPI protocol itself [3]
>> >> mentions about it. Perhaps, additional command should be added to pass
>> >> such info.
>> >>
>> >> [1] https://www.kernel.org/doc/Documentation/devicetree/bindings/opp/opp.txt
>> >> [2] http://elixir.free-electrons.com/linux/v4.15-rc1/source/Documentation/devicetree/bindings/arm/arm,scpi.txt
>> >> [3] http://infocenter.arm.com/help/topic/com.arm.doc.dui0922g/scp_message_interface_v1_2_DUI0922G_en.pdf
>
> If there are no mentions of them, then I would assume that none of the
> available frequencies are turbo frequencies.



-- 
Regards,

Oleksandr Tyshchenko


* Re: [RFC PATCH 06/31] cpufreq: make cpufreq driver more generalizable
  2017-12-06  7:54             ` Jan Beulich
@ 2017-12-06 23:44               ` Stefano Stabellini
  2017-12-07  8:45                 ` Jan Beulich
  0 siblings, 1 reply; 108+ messages in thread
From: Stefano Stabellini @ 2017-12-06 23:44 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, andrew.cooper3, Oleksandr Dmytryshyn,
	Julien Grall, Oleksandr Tyshchenko, Oleksandr Tyshchenko,
	xen-devel

On Wed, 6 Dec 2017, Jan Beulich wrote:
> >>> On 05.12.17 at 21:48, <sstabellini@kernel.org> wrote:
> > You are right. We need to define a new struct for internal usage, for
> > example:
> > 
> > struct xen_processor_performance_internal {
> >     uint32_t flags;     /* flag for Px sub info type */
> >     uint32_t platform_limit;  /* Platform limitation on freq usage */
> >     struct xen_pct_register control_register;
> >     struct xen_pct_register status_register;
> >     uint32_t state_count;     /* total available performance states */
> >     struct xen_processor_px states;
> >     struct xen_psd_package domain_info;
> >     uint32_t shared_type;     /* coordination type of this processor */
> > };
> > 
> > Jan, Andrew, does this sound like a good approach to you?
> 
> I'm afraid I don't have the time to go through this discussion (and
> the original patch) in detail to figure out the full context in which
> you raise the question. IOW please summarize things alongside
> the proposed structure, or alternatively Oleksandr could simply
> submit an updated patch to allow seeing the actual context
> (albeit in any case I can't promise timely feedback, given the
> number of pending patches plus all the work I still hope to be
> able to get done myself eventually).
> 
> From a brief check, I can't really figure much of a difference to
> the already existing (and internal) struct processor_performance.

Fair enough. Actually you have a good eye for being able to spot your
name in one of so many patch replies :-)


Oleksandr would like to call set_px_pminfo from a non-hypercall context,
meaning that there are no XEN_GUEST_HANDLE parameters. Today, struct
xen_processor_performance contains a

  XEN_GUEST_HANDLE(xen_processor_px_t) states;

field. Instead of "faking" the XEN_GUEST_HANDLE field from Xen, I
suggested modifying set_px_pminfo to take a different struct, one
without any XEN_GUEST_HANDLE field. For example:

 struct xen_processor_performance_internal {
     uint32_t flags;     /* flag for Px sub info type */
     uint32_t platform_limit;  /* Platform limitation on freq usage */
     struct xen_pct_register control_register;
     struct xen_pct_register status_register;
     uint32_t state_count;     /* total available performance states */
     struct xen_processor_px states;   <---- this is the interesting change
     struct xen_psd_package domain_info;
     uint32_t shared_type;     /* coordination type of this processor */
 };

The caller, which in the x86 case is
xen/arch/x86/platform_hypercall.c:do_platform_op, would be responsible
for issuing the copy_from_guest.
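
A very rough standalone sketch of that split follows. Everything in it is a
stand-in rather than real Xen code: struct px_state and struct perf_internal
are trimmed hypothetical types, memcpy() plays the role of copy_from_guest,
and the sketch assumes the internal struct would carry a plain pointer to
the states array.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, trimmed stand-in for struct xen_processor_px. */
struct px_state {
    uint64_t core_frequency;     /* MHz */
    uint64_t transition_latency; /* us */
};

/* Hypothetical internal variant: a plain pointer, no guest handle. */
struct perf_internal {
    uint32_t state_count;
    const struct px_state *states; /* hypervisor memory */
};

/*
 * Stand-in for the common function: it only sees hypervisor-side data and
 * keeps its own copy. Returns 0 on success, -1 on bad input or allocation
 * failure.
 */
static int set_px_pminfo_sketch(const struct perf_internal *perf,
                                struct px_state **stored)
{
    size_t bytes;

    if ( !perf || !perf->states || !perf->state_count )
        return -1;

    bytes = perf->state_count * sizeof(*perf->states);
    *stored = malloc(bytes);
    if ( !*stored )
        return -1;

    memcpy(*stored, perf->states, bytes);

    return 0;
}

/*
 * Stand-in for the hypercall path: the caller copies the states out of
 * "guest" memory first (memcpy here, copy_from_guest in real code), then
 * hands hypervisor-side data to the common function.
 */
static int hypercall_path_sketch(const struct px_state *guest_states,
                                 uint32_t count, struct px_state **stored)
{
    struct perf_internal perf;
    struct px_state *bounce;
    int ret;

    if ( !guest_states || !count )
        return -1;

    bounce = malloc(count * sizeof(*bounce));
    if ( !bounce )
        return -1;

    memcpy(bounce, guest_states, count * sizeof(*bounce));

    perf.state_count = count;
    perf.states = bounce;
    ret = set_px_pminfo_sketch(&perf, stored);

    free(bounce);

    return ret;
}
```

An in-hypervisor caller such as the ARM code would fill the internal struct
directly and skip the copy step entirely.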


* Re: [RFC PATCH 29/31] xen/arm: Introduce CPUFreq Interface component
  2017-12-06 10:54     ` Oleksandr Tyshchenko
@ 2017-12-07  1:40       ` Stefano Stabellini
  0 siblings, 0 replies; 108+ messages in thread
From: Stefano Stabellini @ 2017-12-07  1:40 UTC (permalink / raw)
  To: Oleksandr Tyshchenko
  Cc: xen-devel, Stefano Stabellini, Julien Grall, Oleksandr Tyshchenko

On Wed, 6 Dec 2017, Oleksandr Tyshchenko wrote:
> >> +    perf->platform_limit = platform_limit;
> >> +    perf->shared_type = CPUFREQ_SHARED_TYPE_ANY;
> >> +    perf->domain_info.domain = cpumask_first(mask);
> >> +    perf->domain_info.num_processors = cpumask_weight(mask);
> >> +
> >> +    /* Iterate through all CPUs which are on the same boat */
> >> +    for_each_cpu( cpu, mask )
> >> +    {
> >> +        ret = set_px_pminfo(cpu, perf);
> >> +        if ( ret )
> >> +        {
> >> +            printk("cpu%d: failed to set Px states (%d)\n", cpu, ret);
> >> +            break;
> >> +        }
> >> +
> >> +        printk(XENLOG_DEBUG "cpu%d: set Px states\n", cpu);
> >> +    }
> >> +
> >> +    xfree(states);
> >> +out:
> >> +    xfree(perf);
> >> +
> >> +    return ret;
> >> +}
> >> +
> >> +static int __init scpi_cpufreq_postinit(void)
> >> +{
> >> +    struct cpufreq_frequency_table *freq_table = NULL;
> >> +    cpumask_t processed_cpus, shared_cpus;
> >> +    unsigned int cpu;
> >> +    int ret = -ENODEV;
> >> +
> >> +    cpumask_clear(&processed_cpus);
> >> +
> >> +    for_each_online_cpu( cpu )
> >> +    {
> >> +        if ( cpumask_test_cpu(cpu, &processed_cpus) )
> >> +            continue;
> >> +
> >> +        if ( !is_dvfs_capable(cpu) )
> >> +            continue;
> >> +
> >> +        ret = get_sharing_cpus(cpu, &shared_cpus);
> >> +        if ( ret )
> >> +        {
> >> +            printk("cpu%d: failed to get sharing cpumask (%d)\n", cpu, ret);
> >> +            return ret;
> >> +        }
> >> +
> >> +        BUG_ON(cpumask_empty(&shared_cpus));
> >> +        cpumask_or(&processed_cpus, &processed_cpus, &shared_cpus);
> >> +
> >> +        /* Create intermediate frequency table */
> >> +        ret = init_cpufreq_table(cpu, &freq_table);
> >> +        if ( ret )
> >> +        {
> >> +            printk("cpu%d: failed to initialize frequency table (%d)\n",
> >> +                   cpu, ret);
> >> +            return ret;
> >> +        }
> >> +
> >> +        ret = upload_cpufreq_data(&shared_cpus, freq_table);
> >> +        /* Destroy intermediate frequency table */
> >> +        free_cpufreq_table(&freq_table);
> >> +        if ( ret )
> >> +        {
> >> +            printk("cpu%d: failed to upload cpufreq data (%d)\n", cpu, ret);
> >> +            return ret;
> >> +        }
> >> +
> >> +        printk(XENLOG_DEBUG "cpu%d: uploaded cpufreq data\n", cpu);
> >> +    }
> >> +
> >> +    return ret;
> >> +}
> >> +
> >> +static int __init scpi_cpufreq_preinit(void)
> >> +{
> >> +    struct dt_device_node *scpi, *clk, *dvfs_clk;
> >> +    int ret;
> >> +
> >> +    /* Initialize SCPI Message protocol */
> >> +    ret = scpi_init();
> >> +    if ( ret )
> >> +    {
> >> +        printk("failed to initialize SCPI (%d)\n", ret);
> >> +        return ret;
> >> +    }
> >> +
> >> +    /* Sanity check */
> >> +    if ( !get_scpi_ops() || !get_scpi_dev() )
> >> +        return -ENXIO;
> >> +
> >> +    scpi = get_scpi_dev()->of_node;
> >> +    scpi_ops = get_scpi_ops();
> >> +
> >> +    ret = -ENODEV;
> >> +
> >> +    /*
> >> +     * Check for clock related nodes for now. But it might additional nodes,
> >> +     * like thermal sensor, etc.
> >> +     */
> >> +    dt_for_each_child_node( scpi, clk )
> >
> > Wouldn't it make sense to have a proper:
> >
> > DT_DEVICE_START
> > ...
> > DT_DEVICE_END
> >
> > block and register the driver that way?
> 
> I am not sure that I got your question completely.
> Which driver need to be registered in a such way?
> If we had separate dt-related driver to manage clocks we would have to
> register it in a proposed way.
> Here we just iterating through all SCPI child in order to be sure that
> DVFS clock sub-node is present.
> Let say, preliminary check.
> 
> BTW, in a proposed way I register ARM SMC triggered mailbox driver:
> https://www.mail-archive.com/xen-devel@lists.xen.org/msg128411.html
> With adding new DEVICE_MAILBOX class:
> https://www.mail-archive.com/xen-devel@lists.xen.org/msg128402.html

Fair enough, and I see that it is not even scanning the whole device
tree but only the scpi node. It's fine then.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [RFC PATCH 06/31] cpufreq: make cpufreq driver more generalizable
  2017-12-06 23:44               ` Stefano Stabellini
@ 2017-12-07  8:45                 ` Jan Beulich
  2017-12-07 20:31                   ` Oleksandr Tyshchenko
  0 siblings, 1 reply; 108+ messages in thread
From: Jan Beulich @ 2017-12-07  8:45 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: andrew.cooper3, Oleksandr Dmytryshyn, Julien Grall,
	Oleksandr Tyshchenko, Oleksandr Tyshchenko, xen-devel

>>> On 07.12.17 at 00:44, <sstabellini@kernel.org> wrote:
> Oleksandr would like to call set_px_pminfo from a non-hypercall context,
> meaning that there are no XEN_GUEST_HANDLE parameters. Today, struct
> xen_processor_performance contains a
> 
>   XEN_GUEST_HANDLE(xen_processor_px_t) states;
> 
> field. Instead of "faking" the XEN_GUEST_HANDLE field from Xen, I
> suggested to modify set_px_pminfo to take a different struct, one
> without any XEN_GUEST_HANDLE field. For example:
> 
>  struct xen_processor_performance_internal {
>      uint32_t flags;     /* flag for Px sub info type */
>      uint32_t platform_limit;  /* Platform limitation on freq usage */
>      struct xen_pct_register control_register;
>      struct xen_pct_register status_register;
>      uint32_t state_count;     /* total available performance states */
>      struct xen_processor_px states;   <---- this is the interesting change
>      struct xen_psd_package domain_info;
>      uint32_t shared_type;     /* coordination type of this processor */
>  };
> 
> The caller, which in the x86 case is
> xen/arch/x86/platform_hypercall.c:do_platform_op, would be responsible
> for issuing the copy_from_guest.

I think we don't want yet another variant of the structure: I'd
then prefer to have a function doing the translation from struct
xen_processor_performance to struct processor_performance,
and hand the result to set_px_pminfo(). For consistency I'd then
like to ask though that the same be done for set_cx_pminfo().
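
A minimal sketch of such a translation step, under stated assumptions: the
demo_* types below are hypothetical, trimmed-down stand-ins for struct
xen_processor_performance and struct processor_performance, the guest handle
is modelled as a plain pointer, and demo_copy_from_guest() merely imitates
copy_from_guest(). The flags are returned separately because the internal
structure has no such field.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one performance state entry. */
struct demo_px {
    uint64_t core_frequency;
    uint64_t control;
};

/* Hypercall-side layout: "states" stands in for a XEN_GUEST_HANDLE. */
struct demo_xen_perf {
    uint32_t flags;
    uint32_t platform_limit;
    uint32_t state_count;
    const struct demo_px *states;
};

/* Internal layout: no flags field, "states" is hypervisor-owned memory. */
struct demo_perf {
    uint32_t platform_limit;
    uint32_t state_count;
    struct demo_px *states;
};

/* Imitation of copy_from_guest(): non-zero means the copy failed. */
static int demo_copy_from_guest(void *dst, const void *src, size_t bytes)
{
    if ( src == NULL )
        return 1;
    memcpy(dst, src, bytes);
    return 0;
}

/* Translate the hypercall structure; the caller then hands "out" and
 * "*flags" to the internal consumer, which never sees a guest handle. */
static int demo_translate_px(const struct demo_xen_perf *in,
                             struct demo_perf *out, uint32_t *flags)
{
    assert(in && out && flags);

    memset(out, 0, sizeof(*out));
    *flags = in->flags;
    out->platform_limit = in->platform_limit;

    if ( in->state_count == 0 )
        return 0;

    out->states = calloc(in->state_count, sizeof(*out->states));
    if ( out->states == NULL )
        return -ENOMEM;

    if ( demo_copy_from_guest(out->states, in->states,
                              in->state_count * sizeof(*out->states)) )
    {
        free(out->states);
        out->states = NULL;
        return -EFAULT;
    }

    out->state_count = in->state_count;
    return 0;
}
```

With this split, only the hypercall path performs the guest copy, while a
non-hypercall caller can fill in the internal structure directly.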

Jan



^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [RFC PATCH 06/31] cpufreq: make cpufreq driver more generalizable
  2017-12-07  8:45                 ` Jan Beulich
@ 2017-12-07 20:31                   ` Oleksandr Tyshchenko
  2017-12-08  8:07                     ` Jan Beulich
  0 siblings, 1 reply; 108+ messages in thread
From: Oleksandr Tyshchenko @ 2017-12-07 20:31 UTC (permalink / raw)
  To: Jan Beulich, Stefano Stabellini
  Cc: Andrew Cooper, Oleksandr Dmytryshyn, Julien Grall, xen-devel,
	Oleksandr Tyshchenko

Hi, Stefano, Jan

On Thu, Dec 7, 2017 at 10:45 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>> On 07.12.17 at 00:44, <sstabellini@kernel.org> wrote:
>> Oleksandr would like to call set_px_pminfo from a non-hypercall context,
>> meaning that there are no XEN_GUEST_HANDLE parameters. Today, struct
>> xen_processor_performance contains a
>>
>>   XEN_GUEST_HANDLE(xen_processor_px_t) states;
>>
>> field. Instead of "faking" the XEN_GUEST_HANDLE field from Xen, I
>> suggested to modify set_px_pminfo to take a different struct, one
>> without any XEN_GUEST_HANDLE field. For example:
>>
>>  struct xen_processor_performance_internal {
>>      uint32_t flags;     /* flag for Px sub info type */
>>      uint32_t platform_limit;  /* Platform limitation on freq usage */
>>      struct xen_pct_register control_register;
>>      struct xen_pct_register status_register;
>>      uint32_t state_count;     /* total available performance states */
>>      struct xen_processor_px states;   <---- this is the interesting change
>>      struct xen_psd_package domain_info;
>>      uint32_t shared_type;     /* coordination type of this processor */
>>  };
>>
>> The caller, which in the x86 case is
>> xen/arch/x86/platform_hypercall.c:do_platform_op, would be responsible
>> for issuing the copy_from_guest.
Stefano, thank you for the detailed clarification.

>
> I think we don't want yet another variant of the structure: I'd
> then prefer to have a function doing the translation from struct
> xen_processor_performance to struct processor_performance,
> and hand the result to set_px_pminfo(). For consistency I'd then
> like to ask though that the same be done for set_cx_pminfo().

Jan, Stefano, thank you for the suggestions.

I have some questions that need to be clarified:

If I understood correctly, the new variant of set_px_pminfo is going to
have an extra "flag" argument, since
struct processor_performance doesn't have a "flag" field (it contains
a "state" field instead, which has yet another meaning).
Something like that:
int set_px_pminfo(uint32_t acpi_id, uint32_t flag, struct
processor_performance *dom0_px_info)
Is my understanding correct?

As for set_cx_pminfo(): to which struct should we translate
struct xen_processor_power? (struct acpi_processor_power?)

Briefly looking at set_cx_pminfo(), I got the feeling that in order to
modify it in a "set_px_pminfo() manner"
we need to rework print_cx_pminfo(), set_cx(), check_cx(),
acpi_processor_ffh_cstate_probe() too, since
all these functions have arguments which contain XEN_GUEST_HANDLE. I am
wondering whether it is worth
doing such a rework, taking into account that set_cx_pminfo() is not
going to be called from a non-hypercall context.
Or have I missed something?

>
> Jan
>



-- 
Regards,

Oleksandr Tyshchenko


^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [RFC PATCH 06/31] cpufreq: make cpufreq driver more generalizable
  2017-12-07 20:31                   ` Oleksandr Tyshchenko
@ 2017-12-08  8:07                     ` Jan Beulich
  2017-12-08 12:16                       ` Oleksandr Tyshchenko
  0 siblings, 1 reply; 108+ messages in thread
From: Jan Beulich @ 2017-12-08  8:07 UTC (permalink / raw)
  To: Oleksandr Tyshchenko
  Cc: Stefano Stabellini, Andrew Cooper, Oleksandr Dmytryshyn,
	Julien Grall, Oleksandr Tyshchenko, xen-devel

>>> On 07.12.17 at 21:31, <olekstysh@gmail.com> wrote:
> I have some questions that need to be clarified:
> 
> If I understood correctly, the new variant of set_px_pminfo is going to
> have an extra "flag" argument, since
> struct processor_performance doesn't have a "flag" field (it contains
> a "state" field instead, which has yet another meaning).
> Something like that:
> int set_px_pminfo(uint32_t acpi_id, uint32_t flag, struct
> processor_performance *dom0_px_info)
> Is my understanding correct?

Well, you obviously must not lose information, so having that
extra parameter is unavoidable. Please use common sense
when dealing with such re-structuring. And btw, please also be
precise: There's no "flag" field, but there is a "flags" one. Such
should also be the name of the new parameter - we're talking
about multiple bits here, after all.

> As for set_cx_pminfo(): to which struct should we translate
> struct xen_processor_power? (struct acpi_processor_power?)

Yes, of course.

> Briefly looking at set_cx_pminfo(), I got the feeling that in order to
> modify it in a "set_px_pminfo() manner"
> we need to rework print_cx_pminfo(), set_cx(), check_cx(),
> acpi_processor_ffh_cstate_probe() too, since
> all these functions have arguments which contain XEN_GUEST_HANDLE. I am
> wondering whether it is worth
> doing such a rework, taking into account that set_cx_pminfo() is not
> going to be called from a non-hypercall context.
> Or have I missed something?

Without looking at the details of this, please again use common
sense. If there are good reasons for the two functions to not
follow the same model, please simply state so in the overview
mail of the patch series and/or (briefly, but concisely) in the
specific patch's description. A good reason for example would
be if overly large amounts of other code would need touching.

Jan



^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [RFC PATCH 06/31] cpufreq: make cpufreq driver more generalizable
  2017-12-08  8:07                     ` Jan Beulich
@ 2017-12-08 12:16                       ` Oleksandr Tyshchenko
  0 siblings, 0 replies; 108+ messages in thread
From: Oleksandr Tyshchenko @ 2017-12-08 12:16 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Andrew Cooper, Oleksandr Dmytryshyn,
	Julien Grall, Oleksandr Tyshchenko, xen-devel

Hi Jan

On Fri, Dec 8, 2017 at 10:07 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>> On 07.12.17 at 21:31, <olekstysh@gmail.com> wrote:
>> I have some questions that need to be clarified:
>>
>> If I understood correctly, the new variant of set_px_pminfo is going to
>> have an extra "flag" argument, since
>> struct processor_performance doesn't have a "flag" field (it contains
>> a "state" field instead, which has yet another meaning).
>> Something like that:
>> int set_px_pminfo(uint32_t acpi_id, uint32_t flag, struct
>> processor_performance *dom0_px_info)
>> Is my understanding correct?
>
> Well, you obviously must not lose information, so having that
> extra parameter is unavoidable. Please use common sense
> when dealing with such re-structuring. And btw, please also be
> precise: There's no "flag" field, but there is a "flags" one. Such
> should also be the name of the new parameter - we're talking
> about multiple bits here, after all.
Indeed "flags", sorry for being unclear.

>
>> As for set_cx_pminfo(): to which struct should we translate
>> struct xen_processor_power? (struct acpi_processor_power?)
>
> Yes, of course.
>
>> Briefly looking at set_cx_pminfo(), I got the feeling that in order to
>> modify it in a "set_px_pminfo() manner"
>> we need to rework print_cx_pminfo(), set_cx(), check_cx(),
>> acpi_processor_ffh_cstate_probe() too, since
>> all these functions have arguments which contain XEN_GUEST_HANDLE. I am
>> wondering whether it is worth
>> doing such a rework, taking into account that set_cx_pminfo() is not
>> going to be called from a non-hypercall context.
>> Or have I missed something?
>
> Without looking at the details of this, please again use common
> sense. If there are good reasons for the two functions to not
> follow the same model, please simply state so in the overview
> mail of the patch series and/or (briefly, but concisely) in the
> specific patch's description. A good reason for example would
> be if overly large amounts of other code would need touching.
Agree.

>
> Jan
>



-- 
Regards,

Oleksandr Tyshchenko


^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [RFC PATCH 03/31] pmstat: move pmstat.c file to the xen/drivers/pm/stat.c location
  2017-11-09 17:09 ` [RFC PATCH 03/31] pmstat: move pmstat.c file to the xen/drivers/pm/stat.c location Oleksandr Tyshchenko
  2017-12-02  0:47   ` Stefano Stabellini
@ 2018-05-07 15:36   ` Jan Beulich
  2018-05-18 11:14     ` Oleksandr Tyshchenko
  1 sibling, 1 reply; 108+ messages in thread
From: Jan Beulich @ 2018-05-07 15:36 UTC (permalink / raw)
  To: olekstysh
  Cc: Stefano Stabellini, Andrew Cooper, Oleksandr Dmytryshyn,
	julien.grall, oleksandr_tyshchenko, xen-devel

>>> On 09.11.17 at 18:09, <olekstysh@gmail.com> wrote:
> From: Oleksandr Dmytryshyn <oleksandr.dmytryshyn@globallogic.com>
> 
> The cpufreq driver should be more generalizable (not ACPI-specific).
> Thus this file should be placed in a more convenient location.
> 
> This is a rebased version of the original patch:
> https://lists.xen.org/archives/html/xen-devel/2014-11/msg00935.html 
> 
> Signed-off-by: Oleksandr Dmytryshyn <oleksandr.dmytryshyn@globallogic.com>
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> CC: Jan Beulich <jbeulich@suse.com>
> CC: Andrew Cooper <andrew.cooper3@citrix.com>
> CC: Stefano Stabellini <sstabellini@kernel.org>
> CC: Julien Grall <julien.grall@linaro.org>
> ---
>  MAINTAINERS               |   1 +
>  xen/arch/x86/Kconfig      |   1 +
>  xen/common/sysctl.c       |   2 +-
>  xen/drivers/Kconfig       |   2 +
>  xen/drivers/Makefile      |   1 +
>  xen/drivers/acpi/Makefile |   1 -
>  xen/drivers/acpi/pmstat.c | 526 ----------------------------------------------
>  xen/drivers/pm/Kconfig    |   3 +
>  xen/drivers/pm/Makefile   |   1 +
>  xen/drivers/pm/stat.c     | 526 ++++++++++++++++++++++++++++++++++++++++++++++

I think I'd prefer drivers/power/*, and please try to present movement of files as
renames instead of as delete+create.

> --- a/xen/arch/x86/Kconfig
> +++ b/xen/arch/x86/Kconfig
> @@ -23,6 +23,7 @@ config X86
>  	select HAS_PDX
>  	select NUMA
>  	select VGA
> +	select HAS_PM

Please insert at the right spot.

> +int acpi_set_pdc_bits(u32 acpi_id, XEN_GUEST_HANDLE_PARAM(uint32) pdc)
> +{
> +    u32 bits[3];
> +    int ret;
> +
> +    if ( copy_from_guest(bits, pdc, 2) )
> +        ret = -EFAULT;
> +    else if ( bits[0] != ACPI_PDC_REVISION_ID || !bits[1] )
> +        ret = -EINVAL;
> +    else if ( copy_from_guest_offset(bits + 2, pdc, 2, 1) )
> +        ret = -EFAULT;
> +    else
> +    {
> +        u32 mask = 0;
> +
> +        if ( xen_processor_pmbits & XEN_PROCESSOR_PM_CX )
> +            mask |= ACPI_PDC_C_MASK | ACPI_PDC_SMP_C1PT;
> +        if ( xen_processor_pmbits & XEN_PROCESSOR_PM_PX )
> +            mask |= ACPI_PDC_P_MASK | ACPI_PDC_SMP_C1PT;
> +        if ( xen_processor_pmbits & XEN_PROCESSOR_PM_TX )
> +            mask |= ACPI_PDC_T_MASK | ACPI_PDC_SMP_C1PT;
> +        bits[2] &= (ACPI_PDC_C_MASK | ACPI_PDC_P_MASK | ACPI_PDC_T_MASK |
> +                    ACPI_PDC_SMP_C1PT) & ~mask;
> +        ret = arch_acpi_set_pdc_bits(acpi_id, bits, mask);
> +    }
> +    if ( !ret && __copy_to_guest_offset(pdc, 2, bits + 2, 1) )
> +        ret = -EFAULT;
> +
> +    return ret;
> +}

Looks quite ACPI-specific.

Jan




^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [RFC PATCH 04/31] cpufreq: make turbo settings to be configurable
  2017-11-09 17:09 ` [RFC PATCH 04/31] cpufreq: make turbo settings to be configurable Oleksandr Tyshchenko
  2017-12-02  1:06   ` Stefano Stabellini
@ 2018-05-07 15:39   ` Jan Beulich
  2018-05-18 14:36     ` Oleksandr Tyshchenko
  1 sibling, 1 reply; 108+ messages in thread
From: Jan Beulich @ 2018-05-07 15:39 UTC (permalink / raw)
  To: olekstysh
  Cc: Stefano Stabellini, Andrew Cooper, Oleksandr Dmytryshyn,
	julien.grall, oleksandr_tyshchenko, xen-devel

>>> On 09.11.17 at 18:09, <olekstysh@gmail.com> wrote:
> --- a/xen/drivers/cpufreq/Kconfig
> +++ b/xen/drivers/cpufreq/Kconfig
> @@ -1,3 +1,6 @@
>  
>  config HAS_CPUFREQ
>  	bool
> +
> +config HAS_CPU_TURBO
> +	bool

This is about cpufreq, so HAS_CPUFREQ_TURBO please.

Also please try to limit the number of #ifdef-s, perhaps by way of introducing
a few helpers (ending up empty without that setting enabled).
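
One way to picture the helper approach is the sketch below. It is an
assumption-laden illustration rather than proposed Xen code: struct
demo_policy and the demo_* helpers are hypothetical, and the config symbol is
defined locally only so that the "enabled" branch is exercised.

```c
#include <assert.h>
#include <stdbool.h>

/* Defined here only for the sketch; normally this comes from Kconfig. */
#define CONFIG_HAS_CPUFREQ_TURBO 1

/* Hypothetical, trimmed-down policy structure. */
struct demo_policy {
    bool turbo;              /* turbo mode currently enabled? */
    unsigned int max;        /* highest frequency, kHz */
    unsigned int second_max; /* highest non-turbo frequency, kHz */
};

#ifdef CONFIG_HAS_CPUFREQ_TURBO
static inline unsigned int demo_freq_limit(const struct demo_policy *p)
{
    return p->turbo ? p->max : p->second_max;
}
#else
/* Without turbo support the helper collapses to the plain maximum. */
static inline unsigned int demo_freq_limit(const struct demo_policy *p)
{
    return p->max;
}
#endif

/* The caller needs no #ifdef at all; the helper hides the difference. */
static unsigned int demo_clamp_target(const struct demo_policy *p,
                                      unsigned int target)
{
    unsigned int limit;

    assert(p);
    limit = demo_freq_limit(p);

    return target > limit ? limit : target;
}
```

The conditional compilation is confined to one place, so the common code paths
stay readable in both configurations.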

Jan




^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [RFC PATCH 03/31] pmstat: move pmstat.c file to the xen/drivers/pm/stat.c location
  2018-05-07 15:36   ` Jan Beulich
@ 2018-05-18 11:14     ` Oleksandr Tyshchenko
  2018-05-18 11:35       ` Jan Beulich
  0 siblings, 1 reply; 108+ messages in thread
From: Oleksandr Tyshchenko @ 2018-05-18 11:14 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Andrew Cooper, Oleksandr Dmytryshyn,
	Julien Grall, Oleksandr Tyshchenko, xen-devel

Hi, Jan.

Sorry for the late response.

On Mon, May 7, 2018 at 6:36 PM, Jan Beulich <JBeulich@suse.com> wrote:
>>>> On 09.11.17 at 18:09, <olekstysh@gmail.com> wrote:
>> From: Oleksandr Dmytryshyn <oleksandr.dmytryshyn@globallogic.com>
>>
>> The cpufreq driver should be more generalizable (not ACPI-specific).
>> Thus this file should be placed in a more convenient location.
>>
>> This is a rebased version of the original patch:
>> https://lists.xen.org/archives/html/xen-devel/2014-11/msg00935.html
>>
>> Signed-off-by: Oleksandr Dmytryshyn <oleksandr.dmytryshyn@globallogic.com>
>> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>> CC: Jan Beulich <jbeulich@suse.com>
>> CC: Andrew Cooper <andrew.cooper3@citrix.com>
>> CC: Stefano Stabellini <sstabellini@kernel.org>
>> CC: Julien Grall <julien.grall@linaro.org>
>> ---
>>  MAINTAINERS               |   1 +
>>  xen/arch/x86/Kconfig      |   1 +
>>  xen/common/sysctl.c       |   2 +-
>>  xen/drivers/Kconfig       |   2 +
>>  xen/drivers/Makefile      |   1 +
>>  xen/drivers/acpi/Makefile |   1 -
>>  xen/drivers/acpi/pmstat.c | 526 ----------------------------------------------
>>  xen/drivers/pm/Kconfig    |   3 +
>>  xen/drivers/pm/Makefile   |   1 +
>>  xen/drivers/pm/stat.c     | 526 ++++++++++++++++++++++++++++++++++++++++++++++
>
> I think I'd prefer drivers/power/*, and please try to present movement of files as
> renames instead of as delete+create.
Will do.

>
>> --- a/xen/arch/x86/Kconfig
>> +++ b/xen/arch/x86/Kconfig
>> @@ -23,6 +23,7 @@ config X86
>>       select HAS_PDX
>>       select NUMA
>>       select VGA
>> +     select HAS_PM
>
> Please insert at the right spot.
ok.

>
>> +int acpi_set_pdc_bits(u32 acpi_id, XEN_GUEST_HANDLE_PARAM(uint32) pdc)
>> +{
>> +    u32 bits[3];
>> +    int ret;
>> +
>> +    if ( copy_from_guest(bits, pdc, 2) )
>> +        ret = -EFAULT;
>> +    else if ( bits[0] != ACPI_PDC_REVISION_ID || !bits[1] )
>> +        ret = -EINVAL;
>> +    else if ( copy_from_guest_offset(bits + 2, pdc, 2, 1) )
>> +        ret = -EFAULT;
>> +    else
>> +    {
>> +        u32 mask = 0;
>> +
>> +        if ( xen_processor_pmbits & XEN_PROCESSOR_PM_CX )
>> +            mask |= ACPI_PDC_C_MASK | ACPI_PDC_SMP_C1PT;
>> +        if ( xen_processor_pmbits & XEN_PROCESSOR_PM_PX )
>> +            mask |= ACPI_PDC_P_MASK | ACPI_PDC_SMP_C1PT;
>> +        if ( xen_processor_pmbits & XEN_PROCESSOR_PM_TX )
>> +            mask |= ACPI_PDC_T_MASK | ACPI_PDC_SMP_C1PT;
>> +        bits[2] &= (ACPI_PDC_C_MASK | ACPI_PDC_P_MASK | ACPI_PDC_T_MASK |
>> +                    ACPI_PDC_SMP_C1PT) & ~mask;
>> +        ret = arch_acpi_set_pdc_bits(acpi_id, bits, mask);
>> +    }
>> +    if ( !ret && __copy_to_guest_offset(pdc, 2, bits + 2, 1) )
>> +        ret = -EFAULT;
>> +
>> +    return ret;
>> +}
>
> Looks quite ACPI-specific.
Yes, the current patch just moves the code.

Next patch [1] wraps it in #ifdef CONFIG_ACPI.

However during patch discussion we decided to move this function to arch/x86.
It is called from arch/x86/platform_hypercall.c and pulls a bunch of
#define-s from pdc_intel.h

Sounds ok?

[1] https://lists.xenproject.org/archives/html/xen-devel/2017-11/msg00651.html

>
> Jan
>
>

-- 
Regards,

Oleksandr Tyshchenko


^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [RFC PATCH 03/31] pmstat: move pmstat.c file to the xen/drivers/pm/stat.c location
  2018-05-18 11:14     ` Oleksandr Tyshchenko
@ 2018-05-18 11:35       ` Jan Beulich
  2018-05-18 14:13         ` Oleksandr Tyshchenko
  0 siblings, 1 reply; 108+ messages in thread
From: Jan Beulich @ 2018-05-18 11:35 UTC (permalink / raw)
  To: olekstysh
  Cc: Stefano Stabellini, Andrew Cooper, Oleksandr Dmytryshyn,
	julien.grall, oleksandr_tyshchenko, xen-devel

>>> On 18.05.18 at 13:14, <olekstysh@gmail.com> wrote:
> On Mon, May 7, 2018 at 6:36 PM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>> On 09.11.17 at 18:09, <olekstysh@gmail.com> wrote:
>>> +int acpi_set_pdc_bits(u32 acpi_id, XEN_GUEST_HANDLE_PARAM(uint32) pdc)
>>> +{
>>> +    u32 bits[3];
>>> +    int ret;
>>> +
>>> +    if ( copy_from_guest(bits, pdc, 2) )
>>> +        ret = -EFAULT;
>>> +    else if ( bits[0] != ACPI_PDC_REVISION_ID || !bits[1] )
>>> +        ret = -EINVAL;
>>> +    else if ( copy_from_guest_offset(bits + 2, pdc, 2, 1) )
>>> +        ret = -EFAULT;
>>> +    else
>>> +    {
>>> +        u32 mask = 0;
>>> +
>>> +        if ( xen_processor_pmbits & XEN_PROCESSOR_PM_CX )
>>> +            mask |= ACPI_PDC_C_MASK | ACPI_PDC_SMP_C1PT;
>>> +        if ( xen_processor_pmbits & XEN_PROCESSOR_PM_PX )
>>> +            mask |= ACPI_PDC_P_MASK | ACPI_PDC_SMP_C1PT;
>>> +        if ( xen_processor_pmbits & XEN_PROCESSOR_PM_TX )
>>> +            mask |= ACPI_PDC_T_MASK | ACPI_PDC_SMP_C1PT;
>>> +        bits[2] &= (ACPI_PDC_C_MASK | ACPI_PDC_P_MASK | ACPI_PDC_T_MASK |
>>> +                    ACPI_PDC_SMP_C1PT) & ~mask;
>>> +        ret = arch_acpi_set_pdc_bits(acpi_id, bits, mask);
>>> +    }
>>> +    if ( !ret && __copy_to_guest_offset(pdc, 2, bits + 2, 1) )
>>> +        ret = -EFAULT;
>>> +
>>> +    return ret;
>>> +}
>>
>> Looks quite ACPI-specific.
> Yes, the current patch just moves the code.
> 
> Next patch [1] wraps it in #ifdef CONFIG_ACPI.
> 
> However during patch discussion we decided to move this function to arch/x86.
> It is called from arch/x86/platform_hypercall.c and pulls a bunch of
> #define-s from pdc_intel.h

Not sure - the function may be used by x86 only right now, but is what it
does really x86-specific?

Jan




^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [RFC PATCH 03/31] pmstat: move pmstat.c file to the xen/drivers/pm/stat.c location
  2018-05-18 11:35       ` Jan Beulich
@ 2018-05-18 14:13         ` Oleksandr Tyshchenko
  2018-05-18 14:21           ` Jan Beulich
  0 siblings, 1 reply; 108+ messages in thread
From: Oleksandr Tyshchenko @ 2018-05-18 14:13 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Andrew Cooper, Oleksandr Dmytryshyn,
	Julien Grall, Oleksandr Tyshchenko, xen-devel

Hi,

On Fri, May 18, 2018 at 2:35 PM, Jan Beulich <JBeulich@suse.com> wrote:
>>>> On 18.05.18 at 13:14, <olekstysh@gmail.com> wrote:
>> On Mon, May 7, 2018 at 6:36 PM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>>> On 09.11.17 at 18:09, <olekstysh@gmail.com> wrote:
>>>> +int acpi_set_pdc_bits(u32 acpi_id, XEN_GUEST_HANDLE_PARAM(uint32) pdc)
>>>> +{
>>>> +    u32 bits[3];
>>>> +    int ret;
>>>> +
>>>> +    if ( copy_from_guest(bits, pdc, 2) )
>>>> +        ret = -EFAULT;
>>>> +    else if ( bits[0] != ACPI_PDC_REVISION_ID || !bits[1] )
>>>> +        ret = -EINVAL;
>>>> +    else if ( copy_from_guest_offset(bits + 2, pdc, 2, 1) )
>>>> +        ret = -EFAULT;
>>>> +    else
>>>> +    {
>>>> +        u32 mask = 0;
>>>> +
>>>> +        if ( xen_processor_pmbits & XEN_PROCESSOR_PM_CX )
>>>> +            mask |= ACPI_PDC_C_MASK | ACPI_PDC_SMP_C1PT;
>>>> +        if ( xen_processor_pmbits & XEN_PROCESSOR_PM_PX )
>>>> +            mask |= ACPI_PDC_P_MASK | ACPI_PDC_SMP_C1PT;
>>>> +        if ( xen_processor_pmbits & XEN_PROCESSOR_PM_TX )
>>>> +            mask |= ACPI_PDC_T_MASK | ACPI_PDC_SMP_C1PT;
>>>> +        bits[2] &= (ACPI_PDC_C_MASK | ACPI_PDC_P_MASK | ACPI_PDC_T_MASK |
>>>> +                    ACPI_PDC_SMP_C1PT) & ~mask;
>>>> +        ret = arch_acpi_set_pdc_bits(acpi_id, bits, mask);
>>>> +    }
>>>> +    if ( !ret && __copy_to_guest_offset(pdc, 2, bits + 2, 1) )
>>>> +        ret = -EFAULT;
>>>> +
>>>> +    return ret;
>>>> +}
>>>
>>> Looks quite ACPI-specific.
>> Yes, the current patch just moves the code.
>>
>> Next patch [1] wraps it in #ifdef CONFIG_ACPI.
>>
>> However during patch discussion we decided to move this function to arch/x86.
>> It is called from arch/x86/platform_hypercall.c and pulls a bunch of
>> #define-s from pdc_intel.h
>
> Not sure - the function may be used by x86 only right now, but is what it
> does really x86-specific?

I am not familiar enough with ACPI to answer precisely.
What I see here is that these are named "Intel Processor Driver
Capabilities flags".

However, Section 8.4.1 of document [1] doesn't explicitly say that
"_PDC" is supposed to be an x86-specific thing only.

[1] http://www.uefi.org/sites/default/files/resources/ACPI%206_2_A_Sept29.pdf

So, I can leave acpi_set_pdc_bits() in xen/drivers/pm/stat.c for now,
but definitely wrapped into #ifdef CONFIG_ACPI.

What do you think?

>
> Jan
>
>

-- 
Regards,

Oleksandr Tyshchenko


^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [RFC PATCH 03/31] pmstat: move pmstat.c file to the xen/drivers/pm/stat.c location
  2018-05-18 14:13         ` Oleksandr Tyshchenko
@ 2018-05-18 14:21           ` Jan Beulich
  0 siblings, 0 replies; 108+ messages in thread
From: Jan Beulich @ 2018-05-18 14:21 UTC (permalink / raw)
  To: olekstysh
  Cc: Andrew Cooper, Stefano Stabellini, xen-devel, julien.grall,
	oleksandr_tyshchenko

>>> On 18.05.18 at 16:13, <olekstysh@gmail.com> wrote:
> Hi,
> 
> On Fri, May 18, 2018 at 2:35 PM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>> On 18.05.18 at 13:14, <olekstysh@gmail.com> wrote:
>>> On Mon, May 7, 2018 at 6:36 PM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>>>> On 09.11.17 at 18:09, <olekstysh@gmail.com> wrote:
>>>>> +int acpi_set_pdc_bits(u32 acpi_id, XEN_GUEST_HANDLE_PARAM(uint32) pdc)
>>>>> +{
>>>>> +    u32 bits[3];
>>>>> +    int ret;
>>>>> +
>>>>> +    if ( copy_from_guest(bits, pdc, 2) )
>>>>> +        ret = -EFAULT;
>>>>> +    else if ( bits[0] != ACPI_PDC_REVISION_ID || !bits[1] )
>>>>> +        ret = -EINVAL;
>>>>> +    else if ( copy_from_guest_offset(bits + 2, pdc, 2, 1) )
>>>>> +        ret = -EFAULT;
>>>>> +    else
>>>>> +    {
>>>>> +        u32 mask = 0;
>>>>> +
>>>>> +        if ( xen_processor_pmbits & XEN_PROCESSOR_PM_CX )
>>>>> +            mask |= ACPI_PDC_C_MASK | ACPI_PDC_SMP_C1PT;
>>>>> +        if ( xen_processor_pmbits & XEN_PROCESSOR_PM_PX )
>>>>> +            mask |= ACPI_PDC_P_MASK | ACPI_PDC_SMP_C1PT;
>>>>> +        if ( xen_processor_pmbits & XEN_PROCESSOR_PM_TX )
>>>>> +            mask |= ACPI_PDC_T_MASK | ACPI_PDC_SMP_C1PT;
>>>>> +        bits[2] &= (ACPI_PDC_C_MASK | ACPI_PDC_P_MASK | ACPI_PDC_T_MASK |
>>>>> +                    ACPI_PDC_SMP_C1PT) & ~mask;
>>>>> +        ret = arch_acpi_set_pdc_bits(acpi_id, bits, mask);
>>>>> +    }
>>>>> +    if ( !ret && __copy_to_guest_offset(pdc, 2, bits + 2, 1) )
>>>>> +        ret = -EFAULT;
>>>>> +
>>>>> +    return ret;
>>>>> +}
>>>>
>>>> Looks quite ACPI-specific.
>>> Yes, the current patch just moves the code.
>>>
>>> Next patch [1] wraps it in #ifdef CONFIG_ACPI.
>>>
>>> However during patch discussion we decided to move this function to arch/x86.
>>> It is called from arch/x86/platform_hypercall.c and pulls a bunch of
>>> #define-s from pdc_intel.h
>>
>> Not sure - the function may be used by x86 only right now, but is what it
>> does really x86-specific?
> 
> I am not familiar enough with ACPI to answer precisely.
> What I see here is that these are named "Intel Processor Driver
> Capabilities flags".
> 
> However, Section 8.4.1 of document [1] doesn't explicitly say that
> "_PDC" is supposed to be an x86-specific thing only.
> 
> [1] http://www.uefi.org/sites/default/files/resources/ACPI%206_2_A_Sept29.pdf
> 
> So, I can leave acpi_set_pdc_bits() in xen/drivers/pm/stat.c for now,
> but definitely wrapped into #ifdef CONFIG_ACPI.

Yes please, unless indications of it being x86 specific can be provided.

Jan




^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [RFC PATCH 04/31] cpufreq: make turbo settings to be configurable
  2018-05-07 15:39   ` Jan Beulich
@ 2018-05-18 14:36     ` Oleksandr Tyshchenko
  2018-05-18 14:41       ` Jan Beulich
  0 siblings, 1 reply; 108+ messages in thread
From: Oleksandr Tyshchenko @ 2018-05-18 14:36 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Andrew Cooper, Oleksandr Dmytryshyn,
	Julien Grall, Oleksandr Tyshchenko, xen-devel

Hi, Jan.

Sorry for the late response.

On Mon, May 7, 2018 at 6:39 PM, Jan Beulich <JBeulich@suse.com> wrote:
>>>> On 09.11.17 at 18:09, <olekstysh@gmail.com> wrote:
>> --- a/xen/drivers/cpufreq/Kconfig
>> +++ b/xen/drivers/cpufreq/Kconfig
>> @@ -1,3 +1,6 @@
>>
>>  config HAS_CPUFREQ
>>       bool
>> +
>> +config HAS_CPU_TURBO
>> +     bool
>
> This is about cpufreq, so HAS_CPUFREQ_TURBO please.
>
> Also please try to limit the number of #ifdef-s, perhaps by way of introducing
> a few helpers (ending up empty without that setting enabled).

I would like to inform you that we decided to drop this patch. We can
go on without it. Thanks to Stefano
for the valuable comments.

All we need at the moment regarding "turbo frequencies" is to
"correct the way of defining second_max_freq".
But that is going to be another patch.

BTW, what do you think about the following:

Another question is second_max_freq. As I understand it, this is the highest
non-turbo frequency, calculated by the framework to limit the target frequency
when turbo mode "is disabled". And Xen assumes that second_max_freq is
always P1 if turbo mode is on.
But there might be a case where several of the highest frequencies are
turbo frequencies. So, I propose to add an extra flag for handling
that. It would then be each CPUFreq driver's responsibility to mark
turbo frequency(ies) so that the framework can properly calculate
second_max_freq.

Something like that:

diff --git a/xen/drivers/cpufreq/utility.c b/xen/drivers/cpufreq/utility.c
index 25bf983..122a88b 100644
--- a/xen/drivers/cpufreq/utility.c
+++ b/xen/drivers/cpufreq/utility.c
@@ -226,7 +226,8 @@ int cpufreq_frequency_table_cpuinfo(struct cpufreq_policy *policy,
 #ifdef CONFIG_HAS_CPU_TURBO
     for (i=0; (table[i].frequency != CPUFREQ_TABLE_END); i++) {
         unsigned int freq = table[i].frequency;
-        if (freq == CPUFREQ_ENTRY_INVALID || freq == max_freq)
+        if ((freq == CPUFREQ_ENTRY_INVALID) ||
+            (table[i].flags & CPUFREQ_BOOST_FREQ))
             continue;
         if (freq > second_max_freq)
             second_max_freq = freq;
diff --git a/xen/include/xen/cpufreq.h b/xen/include/xen/cpufreq.h
index 2e0c16a..77b29da 100644
--- a/xen/include/xen/cpufreq.h
+++ b/xen/include/xen/cpufreq.h
@@ -204,7 +204,11 @@ void cpufreq_verify_within_limits(struct cpufreq_policy *policy,
 #define CPUFREQ_ENTRY_INVALID ~0
 #define CPUFREQ_TABLE_END     ~1

+/* Special Values of .flags field */
+#define CPUFREQ_BOOST_FREQ    (1 << 0)
+
 struct cpufreq_frequency_table {
+    unsigned int    flags;
     unsigned int    index;     /* any */
     unsigned int    frequency; /* kHz - doesn't need to be in ascending
                                 * order */

Both existing x86 CPUFreq drivers just need to mark the P0 frequency as
a turbo frequency if turbo mode "is supported".
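
To show the effect of the flag when more than one entry is a turbo frequency,
here is a standalone sketch of the calculation from the diff above. The
demo_* names and the table contents are made up for illustration; only the
skip condition mirrors the proposed change.

```c
#include <assert.h>

/* Local stand-ins for CPUFREQ_ENTRY_INVALID / CPUFREQ_TABLE_END. */
#define DEMO_ENTRY_INVALID (~0u)
#define DEMO_TABLE_END     (~1u)

/* Stand-in for the proposed CPUFREQ_BOOST_FREQ flag. */
#define DEMO_BOOST_FREQ    (1u << 0)

struct demo_freq_entry {
    unsigned int flags;
    unsigned int index;
    unsigned int frequency; /* kHz, not necessarily in ascending order */
};

/*
 * Return the highest frequency that is neither invalid nor marked as a
 * turbo (boost) frequency, or 0 if there is no such entry.
 */
static unsigned int demo_second_max_freq(const struct demo_freq_entry *table)
{
    unsigned int i, second_max = 0;

    assert(table);

    for ( i = 0; table[i].frequency != DEMO_TABLE_END; i++ )
    {
        unsigned int freq = table[i].frequency;

        if ( freq == DEMO_ENTRY_INVALID ||
             (table[i].flags & DEMO_BOOST_FREQ) )
            continue;

        if ( freq > second_max )
            second_max = freq;
    }

    return second_max;
}
```

With two boost entries in the table, skipping only the single maximum would
still pick the second boost entry, whereas the flag-based check skips both.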

>
> Jan
>
>

-- 
Regards,

Oleksandr Tyshchenko


^ permalink raw reply related	[flat|nested] 108+ messages in thread

* Re: [RFC PATCH 04/31] cpufreq: make turbo settings to be configurable
  2018-05-18 14:36     ` Oleksandr Tyshchenko
@ 2018-05-18 14:41       ` Jan Beulich
  0 siblings, 0 replies; 108+ messages in thread
From: Jan Beulich @ 2018-05-18 14:41 UTC (permalink / raw)
  To: olekstysh
  Cc: Andrew Cooper, Stefano Stabellini, xen-devel, julien.grall,
	oleksandr_tyshchenko

>>> On 18.05.18 at 16:36, <olekstysh@gmail.com> wrote:
> BTW, what do you think about the following:
> 
> Another question is second_max_freq. As I understand it, this is the
> highest non-turbo frequency, calculated by the framework to limit the
> target frequency when turbo mode is disabled. Xen assumes that
> second_max_freq is always P1 if turbo mode is on.
> But there might be a case where several of the highest frequencies are
> turbo frequencies. So I propose adding an extra flag to handle that
> case. Each CPUFreq driver would then be responsible for marking its
> turbo frequency (or frequencies) so that the framework can calculate
> second_max_freq properly.

Sounds reasonable at first glance.

Jan

Thread overview: 108+ messages
2017-11-09 17:09 [RFC PATCH 00/31] CPUFreq on ARM Oleksandr Tyshchenko
2017-11-09 17:09 ` [RFC PATCH 01/31] cpufreq: move cpufreq.h file to the xen/include/xen location Oleksandr Tyshchenko
2017-12-02  0:35   ` Stefano Stabellini
2017-11-09 17:09 ` [RFC PATCH 02/31] pm: move processor_perf.h " Oleksandr Tyshchenko
2017-12-02  0:41   ` Stefano Stabellini
2017-11-09 17:09 ` [RFC PATCH 03/31] pmstat: move pmstat.c file to the xen/drivers/pm/stat.c location Oleksandr Tyshchenko
2017-12-02  0:47   ` Stefano Stabellini
2018-05-07 15:36   ` Jan Beulich
2018-05-18 11:14     ` Oleksandr Tyshchenko
2018-05-18 11:35       ` Jan Beulich
2018-05-18 14:13         ` Oleksandr Tyshchenko
2018-05-18 14:21           ` Jan Beulich
2017-11-09 17:09 ` [RFC PATCH 04/31] cpufreq: make turbo settings to be configurable Oleksandr Tyshchenko
2017-12-02  1:06   ` Stefano Stabellini
2017-12-02 17:25     ` Oleksandr Tyshchenko
2017-12-04 11:58       ` Andre Przywara
2017-12-05 15:23         ` Oleksandr Tyshchenko
2017-12-04 22:18       ` Stefano Stabellini
2017-12-05 11:13         ` Oleksandr Tyshchenko
2017-12-05 19:24           ` Stefano Stabellini
2017-12-06 11:28             ` Oleksandr Tyshchenko
2018-05-07 15:39   ` Jan Beulich
2018-05-18 14:36     ` Oleksandr Tyshchenko
2018-05-18 14:41       ` Jan Beulich
2017-11-09 17:09 ` [RFC PATCH 05/31] pmstat: make pmstat functions more generalizable Oleksandr Tyshchenko
2017-12-02  1:21   ` Stefano Stabellini
2017-12-04 16:21     ` Oleksandr Tyshchenko
2017-12-04 22:30       ` Stefano Stabellini
2017-11-09 17:09 ` [RFC PATCH 06/31] cpufreq: make cpufreq driver " Oleksandr Tyshchenko
2017-12-02  1:37   ` Stefano Stabellini
2017-12-04 19:34     ` Oleksandr Tyshchenko
2017-12-04 22:46       ` Stefano Stabellini
2017-12-05 19:29         ` Oleksandr Tyshchenko
2017-12-05 20:48           ` Stefano Stabellini
2017-12-06  7:54             ` Jan Beulich
2017-12-06 23:44               ` Stefano Stabellini
2017-12-07  8:45                 ` Jan Beulich
2017-12-07 20:31                   ` Oleksandr Tyshchenko
2017-12-08  8:07                     ` Jan Beulich
2017-12-08 12:16                       ` Oleksandr Tyshchenko
2017-11-09 17:09 ` [RFC PATCH 07/31] xenpm: Clarify xenpm usage Oleksandr Tyshchenko
2017-11-09 17:13   ` Wei Liu
2017-12-02  1:28     ` Stefano Stabellini
2017-11-09 17:09 ` [RFC PATCH 08/31] xen/device-tree: Add dt_count_phandle_with_args helper Oleksandr Tyshchenko
2017-11-09 17:09 ` [RFC PATCH 09/31] xen/device-tree: Add dt_property_for_each_string macros Oleksandr Tyshchenko
2017-12-04 23:24   ` Stefano Stabellini
2017-12-05 14:19     ` Oleksandr Tyshchenko
2017-11-09 17:10 ` [RFC PATCH 10/31] xen/device-tree: Add dt_property_read_u32_index helper Oleksandr Tyshchenko
2017-12-04 23:29   ` Stefano Stabellini
2017-11-09 17:10 ` [RFC PATCH 11/31] xen/device-tree: Add dt_property_count_elems_of_size helper Oleksandr Tyshchenko
2017-12-04 23:29   ` Stefano Stabellini
2017-11-09 17:10 ` [RFC PATCH 12/31] xen/device-tree: Add dt_property_read_string_helper and friends Oleksandr Tyshchenko
2017-12-04 23:29   ` Stefano Stabellini
2017-11-09 17:10 ` [RFC PATCH 13/31] xen/arm: Add driver_data field to struct device Oleksandr Tyshchenko
2017-12-04 23:31   ` Stefano Stabellini
2017-12-05 11:26   ` Julien Grall
2017-12-05 12:57     ` Oleksandr Tyshchenko
2017-11-09 17:10 ` [RFC PATCH 14/31] xen/arm: Add DEVICE_MAILBOX device class Oleksandr Tyshchenko
2017-11-09 17:10 ` [RFC PATCH 15/31] xen/arm: Store device-tree node per cpu Oleksandr Tyshchenko
2017-11-09 17:10 ` [RFC PATCH 16/31] arm: add SMC wrapper that is compatible with SMCCC Oleksandr Tyshchenko
2017-12-05  2:30   ` Stefano Stabellini
2017-12-05 15:33     ` Volodymyr Babchuk
2017-12-05 17:21       ` Stefano Stabellini
2017-12-05 14:58   ` Julien Grall
2017-12-05 17:08     ` Volodymyr Babchuk
2017-12-05 17:08       ` Julien Grall
2017-12-05 17:20       ` Oleksandr Tyshchenko
2017-11-09 17:10 ` [RFC PATCH 17/31] xen/arm: Add ARM System Control and Power Interface (SCPI) protocol Oleksandr Tyshchenko
2017-11-09 17:10 ` [RFC PATCH 18/31] xen/arm: Add mailbox infrastructure Oleksandr Tyshchenko
2017-11-09 17:10 ` [RFC PATCH 19/31] xen/arm: Introduce ARM SMC based mailbox Oleksandr Tyshchenko
2017-11-09 17:10 ` [RFC PATCH 20/31] xen/arm: Add common header file wrappers.h Oleksandr Tyshchenko
2017-11-09 17:10 ` [RFC PATCH 21/31] xen/arm: Add rxdone_auto flag to mbox_controller structure Oleksandr Tyshchenko
2017-11-09 17:10 ` [RFC PATCH 22/31] xen/arm: Add Xen changes to SCPI protocol Oleksandr Tyshchenko
2017-12-05 21:20   ` Stefano Stabellini
2017-12-05 21:41     ` Julien Grall
2017-12-06 10:08       ` Oleksandr Tyshchenko
2017-11-09 17:10 ` [RFC PATCH 23/31] xen/arm: Add Xen changes to mailbox infrastructure Oleksandr Tyshchenko
2017-11-09 17:10 ` [RFC PATCH 24/31] xen/arm: Add Xen changes to ARM SMC based mailbox Oleksandr Tyshchenko
2017-11-09 17:10 ` [RFC PATCH 25/31] xen/arm: Use non-blocking mode for SCPI protocol Oleksandr Tyshchenko
2017-11-09 17:10 ` [RFC PATCH 26/31] xen/arm: Don't set txdone_poll flag for ARM SMC mailbox Oleksandr Tyshchenko
2017-11-09 17:10 ` [RFC PATCH 27/31] cpufreq: hack: perf->states isn't a real guest handle on ARM Oleksandr Tyshchenko
2017-12-05 21:34   ` Stefano Stabellini
2017-11-09 17:10 ` [RFC PATCH 28/31] xen/arm: Introduce SCPI based CPUFreq driver Oleksandr Tyshchenko
2017-11-09 17:10 ` [RFC PATCH 29/31] xen/arm: Introduce CPUFreq Interface component Oleksandr Tyshchenko
2017-12-05 22:25   ` Stefano Stabellini
2017-12-06 10:54     ` Oleksandr Tyshchenko
2017-12-07  1:40       ` Stefano Stabellini
2017-11-09 17:10 ` [RFC PATCH 30/31] xen/arm: Build CPUFreq components Oleksandr Tyshchenko
2017-11-09 17:10 ` [RFC PATCH 31/31] xen/arm: Enable CPUFreq on ARM Oleksandr Tyshchenko
2017-11-09 17:18 ` [RFC PATCH 00/31] " Andrii Anisov
2017-11-13 19:40   ` Oleksandr Tyshchenko
2017-11-13 15:21 ` Andre Przywara
2017-11-13 19:40   ` Oleksandr Tyshchenko
2017-11-14 10:49     ` Andre Przywara
2017-11-14 20:46       ` Oleksandr Tyshchenko
2017-11-15  3:03         ` Jassi Brar
2017-11-15 13:28           ` Andre Przywara
2017-11-15 15:18             ` Jassi Brar
2017-11-15 14:28         ` Andre Przywara
2017-11-16 14:57           ` Oleksandr Tyshchenko
2017-11-16 17:04             ` Andre Przywara
2017-11-17 14:01               ` Julien Grall
2017-11-17 18:36                 ` Oleksandr Tyshchenko
2017-11-17 14:55               ` Oleksandr Tyshchenko
2017-11-17 16:41                 ` Andre Przywara
2017-11-17 17:22                   ` Oleksandr Tyshchenko
2017-12-05 22:26 ` Stefano Stabellini
2017-12-06 10:10   ` Oleksandr Tyshchenko