openbmc.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 00/13] Introduce PECI subsystem
@ 2021-11-15 18:25 Iwona Winiarska
  2021-11-15 18:25 ` [PATCH v3 01/13] dt-bindings: Add generic bindings for PECI Iwona Winiarska
                   ` (14 more replies)
  0 siblings, 15 replies; 24+ messages in thread
From: Iwona Winiarska @ 2021-11-15 18:25 UTC (permalink / raw)
  To: linux-kernel, openbmc, Greg Kroah-Hartman
  Cc: linux-aspeed, linux-doc, Dave Hansen, Zev Weiss, Jae Hyun Yoo,
	Jonathan Corbet, Pierre-Louis Bossart, Guenter Roeck, devicetree,
	Jean Delvare, Arnd Bergmann, Rob Herring, Borislav Petkov,
	Iwona Winiarska, Dan Williams, Andy Shevchenko, linux-arm-kernel,
	linux-hwmon, Tony Luck, Andrew Jeffery, Randy Dunlap,
	Olof Johansson

Hi,

This is a third round of patches introducing PECI subsystem.
Sorry for the delay between v2 and v3.

The Platform Environment Control Interface (PECI) is a communication
interface between Intel processors and management controllers (e.g.
Baseboard Management Controller, BMC).

This series adds a PECI subsystem and introduces drivers which run in
the Linux instance on the management controller (not the main Intel
processor) and is intended to be used by the OpenBMC [1], a Linux
distribution for BMC devices.
The information exposed over PECI (like processor and DIMM
temperature) refers to the Intel processor and can be consumed by
daemons running on the BMC to, for example, display the processor
temperature in its web interface.

The PECI bus is collection of code that provides interface support
between PECI devices (that actually represent processors) and PECI
controllers (such as the "peci-aspeed" controller) that allow to
access physical PECI interface. PECI devices are bound to PECI
drivers that provides access to PECI services. This series introduces
a generic "peci-cpu" driver that exposes hardware monitoring "cputemp"
and "dimmtemp" using the auxiliary bus.

Exposing "raw" PECI to userspace, either to write userspace drivers or
for debug/testing purpose was left out of this series to encourage
writing kernel drivers instead, but may be pursued in the future.

Introducing PECI to upstream Linux was already attempted before [2].
Since it's been over a year since last revision, and the series
changed quite a bit in the meantime, I've decided to start from v1.

I would also like to give credit to everyone who helped me with
different aspects of preliminary review:
- Pierre-Louis Bossart,
- Tony Luck, 
- Andy Shevchenko,
- Dave Hansen.

[1] https://github.com/openbmc/openbmc
[2] https://lore.kernel.org/openbmc/20191211194624.2872-1-jae.hyun.yoo@linux.intel.com/

Changes v2 -> v3:

* Dropped x86/cpu patches (Boris)
* Dropped pr_fmt() for PECI module (Dan)
* Fixed releasing peci controller device flow (Dan) 
* Improved peci-aspeed commit-msg and Kconfig help (Dan)
* Fixed aspeed_peci_xfer() to use the proper spin_lock function (Dan) 
* Wrapped print_hex_dump_bytes() in CONFIG_DYNAMIC_DEBUG (Dan)
* Removed debug status logs from aspeed_peci_irq_handler() (Dan)
* Renamed functions using devres to start with "devm" (Dan)
* Changed request to be allocated on stack in peci_detect (Dan)
* Removed redundant WARN_ON on invalid PECI addr (Dan)
* Changed peci_device_create() to use device_initialize() + device_add() pattern (Dan)
* Fixed peci_device_destroy() to use kill_device() avoiding double-free (Dan)
* Renamed functions that perform xfer using "peci_xfer_*" prefix (Dan) 
* Renamed peci_request_data_dib(temp) -> peci_request_dib(temp)_read (Dan)
* Fixed thermal margin readings for older Intel processors (Zev) 
* Misc hwmon simplifications (Guenter)
* Used BIT_PER_TYPE to verify macro value constrains (Guenter)
* Improved WARN_ON message to print chan_rank_max and idx_dimm_max (Guenter)
* Improved dimmtemp to not reattempt probe if no dimms are populated

Changes v1 -> v2:

Biggest changes when it comes to diffstat are locking in HWMON
(I decided to clean things up a bit while adding it), switching to
devres usage in more places and exposing sysfs interface in separate patch.

* Moved extending X86 ARCHITECTURE MAINTAINERS earlier in series (Dan)
* Removed "default n" for GENERIC_LIB_X86 (Dan)
* Added vendor prefix for peci-aspeed specific properties (Rob)
* Refactored PECI to use devres consistently (Dan)
* Added missing sysfs documentation and excluded adding peci-sysfs to
  separate patch (Dan)
* Used module_init() instead of subsys_init() for peci module initialization (Dan)
* Removed redundant struct peci_device member (Dan)
* Improved PECI Kconfig help (Randy/Dan)
* Fixed/removed log messages (Dan, Guenter)
* Refactored peci-cputemp and peci-dimmtemp and added missing locks (Guenter)
* Removed unused dev_set_drvdata() in peci-cputemp and peci-dimmtemp (Guenter)
* Fixed used types, names, fixed broken and added additional comments
  to peci-hwmon (Guenter, Zev)
* Refactored peci-dimmtemp to not return -ETIMEDOUT (Guenter)
* Added sanity check for min_peci_revision in peci-hwmon drivers (Zev)
* Added assert for DIMM_NUMS_MAX and additional warning in peci-dimmtemp (Zev)
* Fixed macro names in peci-aspeed (Zev)
* Refactored peci-aspeed sanitizing properties to a single helper function (Zev)
* Fixed peci_cpu_device_ids definition for Broadwell Xeon D (David)
* Refactor peci_request to use a single allocation (Zev)
* Used min_t() to improve code readability (Zev)
* Added macro for PECI_RDENDPTCFG_MMIO_WR_LEN_BASE and fixed adev type
  array name to more descriptive (Zev)
* Fixed peci-hwmon commit-msg and documentation (Zev)

Thanks
-Iwona

Iwona Winiarska (11):
  dt-bindings: Add generic bindings for PECI
  dt-bindings: Add bindings for peci-aspeed
  ARM: dts: aspeed: Add PECI controller nodes
  peci: Add core infrastructure
  peci: Add device detection
  peci: Add sysfs interface for PECI bus
  peci: Add support for PECI device drivers
  peci: Add peci-cpu driver
  hwmon: peci: Add cputemp driver
  hwmon: peci: Add dimmtemp driver
  docs: Add PECI documentation

Jae Hyun Yoo (2):
  peci: Add peci-aspeed controller driver
  docs: hwmon: Document PECI drivers

 Documentation/ABI/testing/sysfs-bus-peci      |  16 +
 .../devicetree/bindings/peci/peci-aspeed.yaml | 109 +++
 .../bindings/peci/peci-controller.yaml        |  33 +
 Documentation/hwmon/index.rst                 |   2 +
 Documentation/hwmon/peci-cputemp.rst          |  90 +++
 Documentation/hwmon/peci-dimmtemp.rst         |  57 ++
 Documentation/index.rst                       |   1 +
 Documentation/peci/index.rst                  |  16 +
 Documentation/peci/peci.rst                   |  51 ++
 MAINTAINERS                                   |  29 +
 arch/arm/boot/dts/aspeed-g4.dtsi              |  14 +
 arch/arm/boot/dts/aspeed-g5.dtsi              |  14 +
 arch/arm/boot/dts/aspeed-g6.dtsi              |  14 +
 drivers/Kconfig                               |   3 +
 drivers/Makefile                              |   1 +
 drivers/hwmon/Kconfig                         |   2 +
 drivers/hwmon/Makefile                        |   1 +
 drivers/hwmon/peci/Kconfig                    |  31 +
 drivers/hwmon/peci/Makefile                   |   7 +
 drivers/hwmon/peci/common.h                   |  58 ++
 drivers/hwmon/peci/cputemp.c                  | 592 ++++++++++++++++
 drivers/hwmon/peci/dimmtemp.c                 | 630 ++++++++++++++++++
 drivers/peci/Kconfig                          |  36 +
 drivers/peci/Makefile                         |  10 +
 drivers/peci/controller/Kconfig               |  17 +
 drivers/peci/controller/Makefile              |   3 +
 drivers/peci/controller/peci-aspeed.c         | 429 ++++++++++++
 drivers/peci/core.c                           | 236 +++++++
 drivers/peci/cpu.c                            | 343 ++++++++++
 drivers/peci/device.c                         | 249 +++++++
 drivers/peci/internal.h                       | 136 ++++
 drivers/peci/request.c                        | 482 ++++++++++++++
 drivers/peci/sysfs.c                          |  82 +++
 include/linux/peci-cpu.h                      |  40 ++
 include/linux/peci.h                          | 110 +++
 35 files changed, 3944 insertions(+)
 create mode 100644 Documentation/ABI/testing/sysfs-bus-peci
 create mode 100644 Documentation/devicetree/bindings/peci/peci-aspeed.yaml
 create mode 100644 Documentation/devicetree/bindings/peci/peci-controller.yaml
 create mode 100644 Documentation/hwmon/peci-cputemp.rst
 create mode 100644 Documentation/hwmon/peci-dimmtemp.rst
 create mode 100644 Documentation/peci/index.rst
 create mode 100644 Documentation/peci/peci.rst
 create mode 100644 drivers/hwmon/peci/Kconfig
 create mode 100644 drivers/hwmon/peci/Makefile
 create mode 100644 drivers/hwmon/peci/common.h
 create mode 100644 drivers/hwmon/peci/cputemp.c
 create mode 100644 drivers/hwmon/peci/dimmtemp.c
 create mode 100644 drivers/peci/Kconfig
 create mode 100644 drivers/peci/Makefile
 create mode 100644 drivers/peci/controller/Kconfig
 create mode 100644 drivers/peci/controller/Makefile
 create mode 100644 drivers/peci/controller/peci-aspeed.c
 create mode 100644 drivers/peci/core.c
 create mode 100644 drivers/peci/cpu.c
 create mode 100644 drivers/peci/device.c
 create mode 100644 drivers/peci/internal.h
 create mode 100644 drivers/peci/request.c
 create mode 100644 drivers/peci/sysfs.c
 create mode 100644 include/linux/peci-cpu.h
 create mode 100644 include/linux/peci.h

-- 
2.31.1


^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH v3 01/13] dt-bindings: Add generic bindings for PECI
  2021-11-15 18:25 [PATCH v3 00/13] Introduce PECI subsystem Iwona Winiarska
@ 2021-11-15 18:25 ` Iwona Winiarska
  2021-11-15 18:25 ` [PATCH v3 02/13] dt-bindings: Add bindings for peci-aspeed Iwona Winiarska
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 24+ messages in thread
From: Iwona Winiarska @ 2021-11-15 18:25 UTC (permalink / raw)
  To: linux-kernel, openbmc, Greg Kroah-Hartman
  Cc: linux-aspeed, linux-doc, Dave Hansen, Zev Weiss, Rob Herring,
	Jae Hyun Yoo, Jonathan Corbet, Pierre-Louis Bossart,
	Guenter Roeck, devicetree, Jean Delvare, Arnd Bergmann,
	Rob Herring, Borislav Petkov, Iwona Winiarska, Dan Williams,
	Andy Shevchenko, linux-arm-kernel, linux-hwmon, Tony Luck,
	Andrew Jeffery, Randy Dunlap, Olof Johansson

Add device tree bindings for the PECI controller.

Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com>
Reviewed-by: Rob Herring <robh@kernel.org>
---
 .../bindings/peci/peci-controller.yaml        | 33 +++++++++++++++++++
 1 file changed, 33 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/peci/peci-controller.yaml

diff --git a/Documentation/devicetree/bindings/peci/peci-controller.yaml b/Documentation/devicetree/bindings/peci/peci-controller.yaml
new file mode 100644
index 000000000000..bbc3d3f3a929
--- /dev/null
+++ b/Documentation/devicetree/bindings/peci/peci-controller.yaml
@@ -0,0 +1,33 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/peci/peci-controller.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Generic Device Tree Bindings for PECI
+
+maintainers:
+  - Iwona Winiarska <iwona.winiarska@intel.com>
+
+description:
+  PECI (Platform Environment Control Interface) is an interface that provides a
+  communication channel from Intel processors and chipset components to external
+  monitoring or control devices.
+
+properties:
+  $nodename:
+    pattern: "^peci-controller(@.*)?$"
+
+  cmd-timeout-ms:
+    description:
+      Command timeout in units of ms.
+
+additionalProperties: true
+
+examples:
+  - |
+    peci-controller@1e78b000 {
+      reg = <0x1e78b000 0x100>;
+      cmd-timeout-ms = <500>;
+    };
+...
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v3 02/13] dt-bindings: Add bindings for peci-aspeed
  2021-11-15 18:25 [PATCH v3 00/13] Introduce PECI subsystem Iwona Winiarska
  2021-11-15 18:25 ` [PATCH v3 01/13] dt-bindings: Add generic bindings for PECI Iwona Winiarska
@ 2021-11-15 18:25 ` Iwona Winiarska
  2021-11-15 18:25 ` [PATCH v3 03/13] ARM: dts: aspeed: Add PECI controller nodes Iwona Winiarska
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 24+ messages in thread
From: Iwona Winiarska @ 2021-11-15 18:25 UTC (permalink / raw)
  To: linux-kernel, openbmc, Greg Kroah-Hartman
  Cc: linux-aspeed, linux-doc, Dave Hansen, Zev Weiss, Rob Herring,
	Jae Hyun Yoo, Jonathan Corbet, Pierre-Louis Bossart,
	Guenter Roeck, devicetree, Jean Delvare, Arnd Bergmann,
	Rob Herring, Borislav Petkov, Iwona Winiarska, Dan Williams,
	Andy Shevchenko, linux-arm-kernel, linux-hwmon, Tony Luck,
	Andrew Jeffery, Randy Dunlap, Olof Johansson

Add device tree bindings for the peci-aspeed controller driver.

Co-developed-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com>
Reviewed-by: Rob Herring <robh@kernel.org>
---
 .../devicetree/bindings/peci/peci-aspeed.yaml | 109 ++++++++++++++++++
 1 file changed, 109 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/peci/peci-aspeed.yaml

diff --git a/Documentation/devicetree/bindings/peci/peci-aspeed.yaml b/Documentation/devicetree/bindings/peci/peci-aspeed.yaml
new file mode 100644
index 000000000000..2929d1e000d8
--- /dev/null
+++ b/Documentation/devicetree/bindings/peci/peci-aspeed.yaml
@@ -0,0 +1,109 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/peci/peci-aspeed.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Aspeed PECI Bus Device Tree Bindings
+
+maintainers:
+  - Iwona Winiarska <iwona.winiarska@intel.com>
+  - Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
+
+allOf:
+  - $ref: peci-controller.yaml#
+
+properties:
+  compatible:
+    enum:
+      - aspeed,ast2400-peci
+      - aspeed,ast2500-peci
+      - aspeed,ast2600-peci
+
+  reg:
+    maxItems: 1
+
+  interrupts:
+    maxItems: 1
+
+  clocks:
+    description:
+      Clock source for PECI controller. Should reference the external
+      oscillator clock.
+    maxItems: 1
+
+  resets:
+    maxItems: 1
+
+  cmd-timeout-ms:
+    minimum: 1
+    maximum: 1000
+    default: 1000
+
+  aspeed,clock-divider:
+    description:
+      This value determines PECI controller internal clock dividing
+      rate. The divider will be calculated as 2 raised to the power of
+      the given value.
+    $ref: /schemas/types.yaml#/definitions/uint32
+    minimum: 0
+    maximum: 7
+    default: 0
+
+  aspeed,msg-timing:
+    description:
+      Message timing negotiation period. This value will determine the period
+      of message timing negotiation to be issued by PECI controller. The unit
+      of the programmed value is four times of PECI clock period.
+    $ref: /schemas/types.yaml#/definitions/uint32
+    minimum: 0
+    maximum: 255
+    default: 1
+
+  aspeed,addr-timing:
+    description:
+      Address timing negotiation period. This value will determine the period
+      of address timing negotiation to be issued by PECI controller. The unit
+      of the programmed value is four times of PECI clock period.
+    $ref: /schemas/types.yaml#/definitions/uint32
+    minimum: 0
+    maximum: 255
+    default: 1
+
+  aspeed,rd-sampling-point:
+    description:
+      Read sampling point selection. The whole period of a bit time will be
+      divided into 16 time frames. This value will determine the time frame
+      in which the controller will sample PECI signal for data read back.
+      Usually in the middle of a bit time is the best.
+    $ref: /schemas/types.yaml#/definitions/uint32
+    minimum: 0
+    maximum: 15
+    default: 8
+
+required:
+  - compatible
+  - reg
+  - interrupts
+  - clocks
+  - resets
+
+additionalProperties: false
+
+examples:
+  - |
+    #include <dt-bindings/interrupt-controller/arm-gic.h>
+    #include <dt-bindings/clock/ast2600-clock.h>
+    peci-controller@1e78b000 {
+      compatible = "aspeed,ast2600-peci";
+      reg = <0x1e78b000 0x100>;
+      interrupts = <GIC_SPI 38 IRQ_TYPE_LEVEL_HIGH>;
+      clocks = <&syscon ASPEED_CLK_GATE_REF0CLK>;
+      resets = <&syscon ASPEED_RESET_PECI>;
+      cmd-timeout-ms = <1000>;
+      aspeed,clock-divider = <0>;
+      aspeed,msg-timing = <1>;
+      aspeed,addr-timing = <1>;
+      aspeed,rd-sampling-point = <8>;
+    };
+...
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v3 03/13] ARM: dts: aspeed: Add PECI controller nodes
  2021-11-15 18:25 [PATCH v3 00/13] Introduce PECI subsystem Iwona Winiarska
  2021-11-15 18:25 ` [PATCH v3 01/13] dt-bindings: Add generic bindings for PECI Iwona Winiarska
  2021-11-15 18:25 ` [PATCH v3 02/13] dt-bindings: Add bindings for peci-aspeed Iwona Winiarska
@ 2021-11-15 18:25 ` Iwona Winiarska
  2021-11-15 18:25 ` [PATCH v3 04/13] peci: Add core infrastructure Iwona Winiarska
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 24+ messages in thread
From: Iwona Winiarska @ 2021-11-15 18:25 UTC (permalink / raw)
  To: linux-kernel, openbmc, Greg Kroah-Hartman
  Cc: linux-aspeed, linux-doc, Dave Hansen, Zev Weiss, Jae Hyun Yoo,
	Jonathan Corbet, Pierre-Louis Bossart, Guenter Roeck, devicetree,
	Jean Delvare, Arnd Bergmann, Rob Herring, Borislav Petkov,
	Iwona Winiarska, Dan Williams, Andy Shevchenko, linux-arm-kernel,
	linux-hwmon, Tony Luck, Andrew Jeffery, Randy Dunlap,
	Olof Johansson

Add PECI controller nodes with all required information.

Co-developed-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com>
---
 arch/arm/boot/dts/aspeed-g4.dtsi | 14 ++++++++++++++
 arch/arm/boot/dts/aspeed-g5.dtsi | 14 ++++++++++++++
 arch/arm/boot/dts/aspeed-g6.dtsi | 14 ++++++++++++++
 3 files changed, 42 insertions(+)

diff --git a/arch/arm/boot/dts/aspeed-g4.dtsi b/arch/arm/boot/dts/aspeed-g4.dtsi
index b313a1cf5f73..cdaf6db4f07d 100644
--- a/arch/arm/boot/dts/aspeed-g4.dtsi
+++ b/arch/arm/boot/dts/aspeed-g4.dtsi
@@ -391,6 +391,20 @@ uart_routing: uart-routing@9c {
 				};
 			};
 
+			peci0: peci-controller@1e78b000 {
+				compatible = "aspeed,ast2400-peci";
+				reg = <0x1e78b000 0x60>;
+				interrupts = <15>;
+				clocks = <&syscon ASPEED_CLK_GATE_REFCLK>;
+				resets = <&syscon ASPEED_RESET_PECI>;
+				cmd-timeout-ms = <1000>;
+				aspeed,clock-divider = <0>;
+				aspeed,msg-timing = <1>;
+				aspeed,addr-timing = <1>;
+				aspeed,rd-sampling-point = <8>;
+				status = "disabled";
+			};
+
 			uart2: serial@1e78d000 {
 				compatible = "ns16550a";
 				reg = <0x1e78d000 0x20>;
diff --git a/arch/arm/boot/dts/aspeed-g5.dtsi b/arch/arm/boot/dts/aspeed-g5.dtsi
index c7049454c7cb..1ff71c182c95 100644
--- a/arch/arm/boot/dts/aspeed-g5.dtsi
+++ b/arch/arm/boot/dts/aspeed-g5.dtsi
@@ -511,6 +511,20 @@ ibt: ibt@140 {
 				};
 			};
 
+			peci0: peci-controller@1e78b000 {
+				compatible = "aspeed,ast2500-peci";
+				reg = <0x1e78b000 0x60>;
+				interrupts = <15>;
+				clocks = <&syscon ASPEED_CLK_GATE_REFCLK>;
+				resets = <&syscon ASPEED_RESET_PECI>;
+				cmd-timeout-ms = <1000>;
+				aspeed,clock-divider = <0>;
+				aspeed,msg-timing = <1>;
+				aspeed,addr-timing = <1>;
+				aspeed,rd-sampling-point = <8>;
+				status = "disabled";
+			};
+
 			uart2: serial@1e78d000 {
 				compatible = "ns16550a";
 				reg = <0x1e78d000 0x20>;
diff --git a/arch/arm/boot/dts/aspeed-g6.dtsi b/arch/arm/boot/dts/aspeed-g6.dtsi
index 5106a424f1ce..efd87ceb40b6 100644
--- a/arch/arm/boot/dts/aspeed-g6.dtsi
+++ b/arch/arm/boot/dts/aspeed-g6.dtsi
@@ -507,6 +507,20 @@ wdt4: watchdog@1e7850c0 {
 				status = "disabled";
 			};
 
+			peci0: peci-controller@1e78b000 {
+				compatible = "aspeed,ast2600-peci";
+				reg = <0x1e78b000 0x100>;
+				interrupts = <GIC_SPI 38 IRQ_TYPE_LEVEL_HIGH>;
+				clocks = <&syscon ASPEED_CLK_GATE_REF0CLK>;
+				resets = <&syscon ASPEED_RESET_PECI>;
+				cmd-timeout-ms = <1000>;
+				aspeed,clock-divider = <0>;
+				aspeed,msg-timing = <1>;
+				aspeed,addr-timing = <1>;
+				aspeed,rd-sampling-point = <8>;
+				status = "disabled";
+			};
+
 			lpc: lpc@1e789000 {
 				compatible = "aspeed,ast2600-lpc-v2", "simple-mfd", "syscon";
 				reg = <0x1e789000 0x1000>;
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v3 04/13] peci: Add core infrastructure
  2021-11-15 18:25 [PATCH v3 00/13] Introduce PECI subsystem Iwona Winiarska
                   ` (2 preceding siblings ...)
  2021-11-15 18:25 ` [PATCH v3 03/13] ARM: dts: aspeed: Add PECI controller nodes Iwona Winiarska
@ 2021-11-15 18:25 ` Iwona Winiarska
  2021-11-15 18:25 ` [PATCH v3 05/13] peci: Add peci-aspeed controller driver Iwona Winiarska
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 24+ messages in thread
From: Iwona Winiarska @ 2021-11-15 18:25 UTC (permalink / raw)
  To: linux-kernel, openbmc, Greg Kroah-Hartman
  Cc: linux-aspeed, linux-doc, Dave Hansen, Jason M Bills, Zev Weiss,
	Jae Hyun Yoo, Jonathan Corbet, Pierre-Louis Bossart,
	Guenter Roeck, devicetree, Jean Delvare, Arnd Bergmann,
	Rob Herring, Borislav Petkov, Iwona Winiarska, Dan Williams,
	Andy Shevchenko, linux-arm-kernel, linux-hwmon, Tony Luck,
	Andrew Jeffery, Randy Dunlap, Olof Johansson

Intel processors provide access for various services designed to support
processor and DRAM thermal management, platform manageability and
processor interface tuning and diagnostics.
Those services are available via the Platform Environment Control
Interface (PECI) that provides a communication channel between the
processor and the Baseboard Management Controller (BMC) or other
platform management device.

This change introduces PECI subsystem by adding the initial core module
and API for controller drivers.

Co-developed-by: Jason M Bills <jason.m.bills@linux.intel.com>
Signed-off-by: Jason M Bills <jason.m.bills@linux.intel.com>
Co-developed-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
---
 MAINTAINERS             |   9 +++
 drivers/Kconfig         |   3 +
 drivers/Makefile        |   1 +
 drivers/peci/Kconfig    |  15 ++++
 drivers/peci/Makefile   |   5 ++
 drivers/peci/core.c     | 158 ++++++++++++++++++++++++++++++++++++++++
 drivers/peci/internal.h |  16 ++++
 include/linux/peci.h    |  99 +++++++++++++++++++++++++
 8 files changed, 306 insertions(+)
 create mode 100644 drivers/peci/Kconfig
 create mode 100644 drivers/peci/Makefile
 create mode 100644 drivers/peci/core.c
 create mode 100644 drivers/peci/internal.h
 create mode 100644 include/linux/peci.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 7a2345ce8521..11a6c0869010 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -14911,6 +14911,15 @@ L:	platform-driver-x86@vger.kernel.org
 S:	Maintained
 F:	drivers/platform/x86/peaq-wmi.c
 
+PECI SUBSYSTEM
+M:	Iwona Winiarska <iwona.winiarska@intel.com>
+R:	Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
+L:	openbmc@lists.ozlabs.org (moderated for non-subscribers)
+S:	Supported
+F:	Documentation/devicetree/bindings/peci/
+F:	drivers/peci/
+F:	include/linux/peci.h
+
 PENSANDO ETHERNET DRIVERS
 M:	Shannon Nelson <snelson@pensando.io>
 M:	drivers@pensando.io
diff --git a/drivers/Kconfig b/drivers/Kconfig
index 0d399ddaa185..8d6cd5d08722 100644
--- a/drivers/Kconfig
+++ b/drivers/Kconfig
@@ -236,4 +236,7 @@ source "drivers/interconnect/Kconfig"
 source "drivers/counter/Kconfig"
 
 source "drivers/most/Kconfig"
+
+source "drivers/peci/Kconfig"
+
 endmenu
diff --git a/drivers/Makefile b/drivers/Makefile
index be5d40ae1488..ed84d5102302 100644
--- a/drivers/Makefile
+++ b/drivers/Makefile
@@ -188,3 +188,4 @@ obj-$(CONFIG_GNSS)		+= gnss/
 obj-$(CONFIG_INTERCONNECT)	+= interconnect/
 obj-$(CONFIG_COUNTER)		+= counter/
 obj-$(CONFIG_MOST)		+= most/
+obj-$(CONFIG_PECI)		+= peci/
diff --git a/drivers/peci/Kconfig b/drivers/peci/Kconfig
new file mode 100644
index 000000000000..71a4ad81225a
--- /dev/null
+++ b/drivers/peci/Kconfig
@@ -0,0 +1,15 @@
+# SPDX-License-Identifier: GPL-2.0-only
+
+menuconfig PECI
+	tristate "PECI support"
+	help
+	  The Platform Environment Control Interface (PECI) is an interface
+	  that provides a communication channel to Intel processors and
+	  chipset components from external monitoring or control devices.
+
+	  If you are building a Baseboard Management Controller (BMC) kernel
+	  for Intel platform say Y here and also to the specific driver for
+	  your adapter(s) below. If unsure say N.
+
+	  This support is also available as a module. If so, the module
+	  will be called peci.
diff --git a/drivers/peci/Makefile b/drivers/peci/Makefile
new file mode 100644
index 000000000000..e789a354e842
--- /dev/null
+++ b/drivers/peci/Makefile
@@ -0,0 +1,5 @@
+# SPDX-License-Identifier: GPL-2.0-only
+
+# Core functionality
+peci-y := core.o
+obj-$(CONFIG_PECI) += peci.o
diff --git a/drivers/peci/core.c b/drivers/peci/core.c
new file mode 100644
index 000000000000..73ad0a47fa9d
--- /dev/null
+++ b/drivers/peci/core.c
@@ -0,0 +1,158 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright (c) 2018-2021 Intel Corporation
+
+#include <linux/bug.h>
+#include <linux/device.h>
+#include <linux/export.h>
+#include <linux/idr.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/peci.h>
+#include <linux/pm_runtime.h>
+#include <linux/property.h>
+#include <linux/slab.h>
+
+#include "internal.h"
+
+static DEFINE_IDA(peci_controller_ida);
+
+static void peci_controller_dev_release(struct device *dev)
+{
+	struct peci_controller *controller = to_peci_controller(dev);
+
+	mutex_destroy(&controller->bus_lock);
+	ida_free(&peci_controller_ida, controller->id);
+	kfree(controller);
+}
+
+struct device_type peci_controller_type = {
+	.release	= peci_controller_dev_release,
+};
+
+static struct peci_controller *peci_controller_alloc(struct device *dev,
+						     struct peci_controller_ops *ops)
+{
+	struct peci_controller *controller;
+	int ret;
+
+	if (!ops->xfer)
+		return ERR_PTR(-EINVAL);
+
+	controller = kzalloc(sizeof(*controller), GFP_KERNEL);
+	if (!controller)
+		return ERR_PTR(-ENOMEM);
+
+	ret = ida_alloc_max(&peci_controller_ida, U8_MAX, GFP_KERNEL);
+	if (ret < 0)
+		goto err;
+	controller->id = ret;
+
+	controller->ops = ops;
+
+	controller->dev.parent = dev;
+	controller->dev.bus = &peci_bus_type;
+	controller->dev.type = &peci_controller_type;
+
+	device_initialize(&controller->dev);
+
+	mutex_init(&controller->bus_lock);
+
+	return controller;
+
+err:
+	kfree(controller);
+	return ERR_PTR(ret);
+}
+
+static void unregister_controller(void *_controller)
+{
+	struct peci_controller *controller = _controller;
+
+	device_unregister(&controller->dev);
+
+	fwnode_handle_put(controller->dev.fwnode);
+
+	pm_runtime_disable(&controller->dev);
+}
+
+/**
+ * devm_peci_controller_add() - add PECI controller
+ * @dev: device for devm operations
+ * @ops: pointer to controller specific methods
+ *
+ * In final stage of its probe(), peci_controller driver calls
+ * devm_peci_controller_add() to register itself with the PECI bus.
+ *
+ * Return: Pointer to the newly allocated controller or ERR_PTR() in case of failure.
+ */
+struct peci_controller *devm_peci_controller_add(struct device *dev,
+						 struct peci_controller_ops *ops)
+{
+	struct peci_controller *controller;
+	int ret;
+
+	controller = peci_controller_alloc(dev, ops);
+	if (IS_ERR(controller))
+		return controller;
+
+	ret = dev_set_name(&controller->dev, "peci-%d", controller->id);
+	if (ret)
+		goto err_put;
+
+	pm_runtime_no_callbacks(&controller->dev);
+	pm_suspend_ignore_children(&controller->dev, true);
+	pm_runtime_enable(&controller->dev);
+
+	device_set_node(&controller->dev, fwnode_handle_get(dev_fwnode(dev)));
+
+	ret = device_add(&controller->dev);
+	if (ret)
+		goto err_fwnode;
+
+	ret = devm_add_action_or_reset(dev, unregister_controller, controller);
+	if (ret)
+		return ERR_PTR(ret);
+
+	return controller;
+
+err_fwnode:
+	fwnode_handle_put(controller->dev.fwnode);
+
+	pm_runtime_disable(&controller->dev);
+
+err_put:
+	put_device(&controller->dev);
+
+	return ERR_PTR(ret);
+}
+EXPORT_SYMBOL_NS_GPL(devm_peci_controller_add, PECI);
+
+struct bus_type peci_bus_type = {
+	.name		= "peci",
+};
+
+static int __init peci_init(void)
+{
+	int ret;
+
+	ret = bus_register(&peci_bus_type);
+	if (ret < 0) {
+		pr_err("peci: failed to register PECI bus type!\n");
+		return ret;
+	}
+
+	return 0;
+}
+module_init(peci_init);
+
+static void __exit peci_exit(void)
+{
+	bus_unregister(&peci_bus_type);
+}
+module_exit(peci_exit);
+
+MODULE_AUTHOR("Jason M Bills <jason.m.bills@linux.intel.com>");
+MODULE_AUTHOR("Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>");
+MODULE_AUTHOR("Iwona Winiarska <iwona.winiarska@intel.com>");
+MODULE_DESCRIPTION("PECI bus core module");
+MODULE_LICENSE("GPL");
diff --git a/drivers/peci/internal.h b/drivers/peci/internal.h
new file mode 100644
index 000000000000..918dea745a86
--- /dev/null
+++ b/drivers/peci/internal.h
@@ -0,0 +1,16 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/* Copyright (c) 2018-2021 Intel Corporation */
+
+#ifndef __PECI_INTERNAL_H
+#define __PECI_INTERNAL_H
+
+#include <linux/device.h>
+#include <linux/types.h>
+
+struct peci_controller;
+
+extern struct bus_type peci_bus_type;
+
+extern struct device_type peci_controller_type;
+
+#endif /* __PECI_INTERNAL_H */
diff --git a/include/linux/peci.h b/include/linux/peci.h
new file mode 100644
index 000000000000..26e0a4e73b50
--- /dev/null
+++ b/include/linux/peci.h
@@ -0,0 +1,99 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/* Copyright (c) 2018-2021 Intel Corporation */
+
+#ifndef __LINUX_PECI_H
+#define __LINUX_PECI_H
+
+#include <linux/device.h>
+#include <linux/kernel.h>
+#include <linux/mutex.h>
+#include <linux/types.h>
+
+/*
+ * Currently we don't support any PECI command over 32 bytes.
+ */
+#define PECI_REQUEST_MAX_BUF_SIZE 32
+
+struct peci_controller;
+struct peci_request;
+
+/**
+ * struct peci_controller_ops - PECI controller specific methods
+ * @xfer: PECI transfer function
+ *
+ * PECI controllers may have different hardware interfaces - the drivers
+ * implementing PECI controllers can use this structure to abstract away those
+ * differences by exposing a common interface for PECI core.
+ */
+struct peci_controller_ops {
+	int (*xfer)(struct peci_controller *controller, u8 addr, struct peci_request *req);
+};
+
+/**
+ * struct peci_controller - PECI controller
+ * @dev: device object to register PECI controller to the device model
+ * @ops: pointer to device specific controller operations
+ * @bus_lock: lock used to protect multiple callers
+ * @id: PECI controller ID
+ *
+ * PECI controllers usually connect to their drivers using non-PECI bus,
+ * such as the platform bus.
+ * Each PECI controller can communicate with one or more PECI devices.
+ */
+struct peci_controller {
+	struct device dev;
+	struct peci_controller_ops *ops;
+	struct mutex bus_lock; /* held for the duration of xfer */
+	u8 id;
+};
+
+struct peci_controller *devm_peci_controller_add(struct device *parent,
+						 struct peci_controller_ops *ops);
+
+static inline struct peci_controller *to_peci_controller(void *d)
+{
+	return container_of(d, struct peci_controller, dev);
+}
+
+/**
+ * struct peci_device - PECI device
+ * @dev: device object to register PECI device to the device model
+ * @controller: manages the bus segment hosting this PECI device
+ * @addr: address used on the PECI bus connected to the parent controller
+ *
+ * A peci_device identifies a single device (i.e. CPU) connected to a PECI bus.
+ * The behaviour exposed to the rest of the system is defined by the PECI driver
+ * managing the device.
+ */
+struct peci_device {
+	struct device dev;
+	u8 addr;
+};
+
+static inline struct peci_device *to_peci_device(struct device *d)
+{
+	return container_of(d, struct peci_device, dev);
+}
+
+/**
+ * struct peci_request - PECI request
+ * @device: PECI device to which the request is sent
+ * @tx: TX buffer specific data
+ * @tx.buf: TX buffer
+ * @tx.len: transfer data length in bytes
+ * @rx: RX buffer specific data
+ * @rx.buf: RX buffer
+ * @rx.len: received data length in bytes
+ *
+ * A peci_request represents a request issued by PECI originator (TX) and
+ * a response received from PECI responder (RX).
+ */
+struct peci_request {
+	struct peci_device *device;
+	struct {
+		u8 buf[PECI_REQUEST_MAX_BUF_SIZE];
+		u8 len;
+	} rx, tx;
+};
+
+#endif /* __LINUX_PECI_H */
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v3 05/13] peci: Add peci-aspeed controller driver
  2021-11-15 18:25 [PATCH v3 00/13] Introduce PECI subsystem Iwona Winiarska
                   ` (3 preceding siblings ...)
  2021-11-15 18:25 ` [PATCH v3 04/13] peci: Add core infrastructure Iwona Winiarska
@ 2021-11-15 18:25 ` Iwona Winiarska
  2021-11-15 18:25 ` [PATCH v3 06/13] peci: Add device detection Iwona Winiarska
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 24+ messages in thread
From: Iwona Winiarska @ 2021-11-15 18:25 UTC (permalink / raw)
  To: linux-kernel, openbmc, Greg Kroah-Hartman
  Cc: linux-aspeed, linux-doc, Dave Hansen, Zev Weiss, Jae Hyun Yoo,
	Jonathan Corbet, Pierre-Louis Bossart, Guenter Roeck, devicetree,
	Jean Delvare, Arnd Bergmann, Rob Herring, Borislav Petkov,
	Iwona Winiarska, Dan Williams, Andy Shevchenko, linux-arm-kernel,
	linux-hwmon, Tony Luck, Andrew Jeffery, Randy Dunlap,
	Olof Johansson

From: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>

ASPEED AST24xx/AST25xx/AST26xx SoCs support the PECI electrical
interface (a.k.a PECI wire) that provides a communication channel with
Intel processors.
This driver allows BMC to discover devices connected to it and
communicate with them using PECI protocol.

Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
Co-developed-by: Iwona Winiarska <iwona.winiarska@intel.com>
Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
---
 MAINTAINERS                           |   9 +
 drivers/peci/Kconfig                  |   6 +
 drivers/peci/Makefile                 |   3 +
 drivers/peci/controller/Kconfig       |  17 +
 drivers/peci/controller/Makefile      |   3 +
 drivers/peci/controller/peci-aspeed.c | 429 ++++++++++++++++++++++++++
 6 files changed, 467 insertions(+)
 create mode 100644 drivers/peci/controller/Kconfig
 create mode 100644 drivers/peci/controller/Makefile
 create mode 100644 drivers/peci/controller/peci-aspeed.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 11a6c0869010..a4da3bba09ba 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2945,6 +2945,15 @@ S:	Maintained
 F:	Documentation/devicetree/bindings/net/asix,ax88796c.yaml
 F:	drivers/net/ethernet/asix/ax88796c_*
 
+ASPEED PECI CONTROLLER
+M:	Iwona Winiarska <iwona.winiarska@intel.com>
+M:	Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
+L:	linux-aspeed@lists.ozlabs.org (moderated for non-subscribers)
+L:	openbmc@lists.ozlabs.org (moderated for non-subscribers)
+S:	Supported
+F:	Documentation/devicetree/bindings/peci/peci-aspeed.yaml
+F:	drivers/peci/controller/peci-aspeed.c
+
 ASPEED PINCTRL DRIVERS
 M:	Andrew Jeffery <andrew@aj.id.au>
 L:	linux-aspeed@lists.ozlabs.org (moderated for non-subscribers)
diff --git a/drivers/peci/Kconfig b/drivers/peci/Kconfig
index 71a4ad81225a..99279df97a78 100644
--- a/drivers/peci/Kconfig
+++ b/drivers/peci/Kconfig
@@ -13,3 +13,9 @@ menuconfig PECI
 
 	  This support is also available as a module. If so, the module
 	  will be called peci.
+
+if PECI
+
+source "drivers/peci/controller/Kconfig"
+
+endif # PECI
diff --git a/drivers/peci/Makefile b/drivers/peci/Makefile
index e789a354e842..926d8df15cbd 100644
--- a/drivers/peci/Makefile
+++ b/drivers/peci/Makefile
@@ -3,3 +3,6 @@
 # Core functionality
 peci-y := core.o
 obj-$(CONFIG_PECI) += peci.o
+
+# Hardware specific bus drivers
+obj-y += controller/
diff --git a/drivers/peci/controller/Kconfig b/drivers/peci/controller/Kconfig
new file mode 100644
index 000000000000..3008b1b81049
--- /dev/null
+++ b/drivers/peci/controller/Kconfig
@@ -0,0 +1,17 @@
+# SPDX-License-Identifier: GPL-2.0-only
+
+config PECI_ASPEED
+	tristate "ASPEED PECI support"
+	depends on ARCH_ASPEED || COMPILE_TEST
+	depends on OF
+	depends on HAS_IOMEM
+	help
+	  This option enables PECI controller driver for ASPEED AST2400,
+	  AST2500 and AST2600 SoCs. It allows BMC to discover devices
+	  connected to it, and communicate with them using PECI protocol.
+
+	  Say Y here if your system runs on ASPEED SoC and you are using it
+	  as BMC for Intel platform.
+
+	  This driver can also be built as a module. If so, the module will
+	  be called peci-aspeed.
diff --git a/drivers/peci/controller/Makefile b/drivers/peci/controller/Makefile
new file mode 100644
index 000000000000..022c28ef1bf0
--- /dev/null
+++ b/drivers/peci/controller/Makefile
@@ -0,0 +1,3 @@
+# SPDX-License-Identifier: GPL-2.0-only
+
+obj-$(CONFIG_PECI_ASPEED)	+= peci-aspeed.o
diff --git a/drivers/peci/controller/peci-aspeed.c b/drivers/peci/controller/peci-aspeed.c
new file mode 100644
index 000000000000..e1ba14fc0d0c
--- /dev/null
+++ b/drivers/peci/controller/peci-aspeed.c
@@ -0,0 +1,429 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright (c) 2012-2017 ASPEED Technology Inc.
+// Copyright (c) 2018-2021 Intel Corporation
+
+#include <linux/bitfield.h>
+#include <linux/clk.h>
+#include <linux/delay.h>
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/iopoll.h>
+#include <linux/jiffies.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/peci.h>
+#include <linux/platform_device.h>
+#include <linux/reset.h>
+
+/* ASPEED PECI Registers */
+/* Control Register */
+#define ASPEED_PECI_CTRL			0x00
+#define   ASPEED_PECI_CTRL_SAMPLING_MASK	GENMASK(19, 16)
+#define   ASPEED_PECI_CTRL_RD_MODE_MASK		GENMASK(13, 12)
+#define     ASPEED_PECI_CTRL_RD_MODE_DBG	BIT(13)
+#define     ASPEED_PECI_CTRL_RD_MODE_COUNT	BIT(12)
+#define   ASPEED_PECI_CTRL_CLK_SOURCE		BIT(11)
+#define   ASPEED_PECI_CTRL_CLK_DIV_MASK		GENMASK(10, 8)
+#define   ASPEED_PECI_CTRL_INVERT_OUT		BIT(7)
+#define   ASPEED_PECI_CTRL_INVERT_IN		BIT(6)
+#define   ASPEED_PECI_CTRL_BUS_CONTENTION_EN	BIT(5)
+#define   ASPEED_PECI_CTRL_PECI_EN		BIT(4)
+#define   ASPEED_PECI_CTRL_PECI_CLK_EN		BIT(0)
+
+/* Timing Negotiation Register */
+#define ASPEED_PECI_TIMING_NEGOTIATION		0x04
+#define   ASPEED_PECI_T_NEGO_MSG_MASK		GENMASK(15, 8)
+#define   ASPEED_PECI_T_NEGO_ADDR_MASK		GENMASK(7, 0)
+
+/* Command Register */
+#define ASPEED_PECI_CMD				0x08
+#define   ASPEED_PECI_CMD_PIN_MONITORING	BIT(31)
+#define   ASPEED_PECI_CMD_STS_MASK		GENMASK(27, 24)
+#define     ASPEED_PECI_CMD_STS_ADDR_T_NEGO	0x3
+#define   ASPEED_PECI_CMD_IDLE_MASK		\
+	  (ASPEED_PECI_CMD_STS_MASK | ASPEED_PECI_CMD_PIN_MONITORING)
+#define   ASPEED_PECI_CMD_FIRE			BIT(0)
+
+/* Read/Write Length Register */
+#define ASPEED_PECI_RW_LENGTH			0x0c
+#define   ASPEED_PECI_AW_FCS_EN			BIT(31)
+#define   ASPEED_PECI_RD_LEN_MASK		GENMASK(23, 16)
+#define   ASPEED_PECI_WR_LEN_MASK		GENMASK(15, 8)
+#define   ASPEED_PECI_TARGET_ADDR_MASK		GENMASK(7, 0)
+
+/* Expected FCS Data Register */
+#define ASPEED_PECI_EXPECTED_FCS		0x10
+#define   ASPEED_PECI_EXPECTED_RD_FCS_MASK	GENMASK(23, 16)
+#define   ASPEED_PECI_EXPECTED_AW_FCS_AUTO_MASK	GENMASK(15, 8)
+#define   ASPEED_PECI_EXPECTED_WR_FCS_MASK	GENMASK(7, 0)
+
+/* Captured FCS Data Register */
+#define ASPEED_PECI_CAPTURED_FCS		0x14
+#define   ASPEED_PECI_CAPTURED_RD_FCS_MASK	GENMASK(23, 16)
+#define   ASPEED_PECI_CAPTURED_WR_FCS_MASK	GENMASK(7, 0)
+
+/* Interrupt Register */
+#define ASPEED_PECI_INT_CTRL			0x18
+#define   ASPEED_PECI_TIMING_NEGO_SEL_MASK	GENMASK(31, 30)
+#define     ASPEED_PECI_1ST_BIT_OF_ADDR_NEGO	0
+#define     ASPEED_PECI_2ND_BIT_OF_ADDR_NEGO	1
+#define     ASPEED_PECI_MESSAGE_NEGO		2
+#define   ASPEED_PECI_INT_MASK			GENMASK(4, 0)
+#define     ASPEED_PECI_INT_BUS_TIMEOUT		BIT(4)
+#define     ASPEED_PECI_INT_BUS_CONTENTION	BIT(3)
+#define     ASPEED_PECI_INT_WR_FCS_BAD		BIT(2)
+#define     ASPEED_PECI_INT_WR_FCS_ABORT	BIT(1)
+#define     ASPEED_PECI_INT_CMD_DONE		BIT(0)
+
+/* Interrupt Status Register */
+#define ASPEED_PECI_INT_STS			0x1c
+#define   ASPEED_PECI_INT_TIMING_RESULT_MASK	GENMASK(29, 16)
+	  /* bits[4..0]: Same bit fields in the 'Interrupt Register' */
+
+/* Rx/Tx Data Buffer Registers */
+#define ASPEED_PECI_WR_DATA0			0x20
+#define ASPEED_PECI_WR_DATA1			0x24
+#define ASPEED_PECI_WR_DATA2			0x28
+#define ASPEED_PECI_WR_DATA3			0x2c
+#define ASPEED_PECI_RD_DATA0			0x30
+#define ASPEED_PECI_RD_DATA1			0x34
+#define ASPEED_PECI_RD_DATA2			0x38
+#define ASPEED_PECI_RD_DATA3			0x3c
+#define ASPEED_PECI_WR_DATA4			0x40
+#define ASPEED_PECI_WR_DATA5			0x44
+#define ASPEED_PECI_WR_DATA6			0x48
+#define ASPEED_PECI_WR_DATA7			0x4c
+#define ASPEED_PECI_RD_DATA4			0x50
+#define ASPEED_PECI_RD_DATA5			0x54
+#define ASPEED_PECI_RD_DATA6			0x58
+#define ASPEED_PECI_RD_DATA7			0x5c
+#define   ASPEED_PECI_DATA_BUF_SIZE_MAX		32
+
+/* Timing Negotiation */
+#define ASPEED_PECI_RD_SAMPLING_POINT_DEFAULT	8
+#define ASPEED_PECI_RD_SAMPLING_POINT_MAX	(BIT(4) - 1)
+#define ASPEED_PECI_CLK_DIV_DEFAULT		0
+#define ASPEED_PECI_CLK_DIV_MAX			(BIT(3) - 1)
+#define ASPEED_PECI_MSG_TIMING_DEFAULT		1
+#define ASPEED_PECI_MSG_TIMING_MAX		(BIT(8) - 1)
+#define ASPEED_PECI_ADDR_TIMING_DEFAULT		1
+#define ASPEED_PECI_ADDR_TIMING_MAX		(BIT(8) - 1)
+
+/* Timeout */
+#define ASPEED_PECI_IDLE_CHECK_TIMEOUT_US	(50 * USEC_PER_MSEC)
+#define ASPEED_PECI_IDLE_CHECK_INTERVAL_US	(10 * USEC_PER_MSEC)
+#define ASPEED_PECI_CMD_TIMEOUT_MS_DEFAULT	(1000)
+#define ASPEED_PECI_CMD_TIMEOUT_MS_MAX		(1000)
+
+struct aspeed_peci {
+	struct peci_controller *controller;
+	struct device *dev;
+	void __iomem *base;
+	struct clk *clk;
+	struct reset_control *rst;
+	int irq;
+	spinlock_t lock; /* to sync completion status handling */
+	struct completion xfer_complete;
+	u32 status;
+	u32 cmd_timeout_ms;
+	u32 msg_timing;
+	u32 addr_timing;
+	u32 rd_sampling_point;
+	u32 clk_div;
+};
+
+static void aspeed_peci_init_regs(struct aspeed_peci *priv)
+{
+	u32 val;
+
+	val = FIELD_PREP(ASPEED_PECI_CTRL_CLK_DIV_MASK, ASPEED_PECI_CLK_DIV_DEFAULT);
+	val |= ASPEED_PECI_CTRL_PECI_CLK_EN;
+	writel(val, priv->base + ASPEED_PECI_CTRL);
+	/*
+	 * Timing negotiation period setting.
+	 * The unit of the programmed value is 4 times of PECI clock period.
+	 */
+	val = FIELD_PREP(ASPEED_PECI_T_NEGO_MSG_MASK, priv->msg_timing);
+	val |= FIELD_PREP(ASPEED_PECI_T_NEGO_ADDR_MASK, priv->addr_timing);
+	writel(val, priv->base + ASPEED_PECI_TIMING_NEGOTIATION);
+
+	/* Clear interrupts */
+	val = readl(priv->base + ASPEED_PECI_INT_STS) | ASPEED_PECI_INT_MASK;
+	writel(val, priv->base + ASPEED_PECI_INT_STS);
+
+	/* Set timing negotiation mode and enable interrupts */
+	val = FIELD_PREP(ASPEED_PECI_TIMING_NEGO_SEL_MASK, ASPEED_PECI_1ST_BIT_OF_ADDR_NEGO);
+	val |= ASPEED_PECI_INT_MASK;
+	writel(val, priv->base + ASPEED_PECI_INT_CTRL);
+
+	val = FIELD_PREP(ASPEED_PECI_CTRL_SAMPLING_MASK, priv->rd_sampling_point);
+	val |= FIELD_PREP(ASPEED_PECI_CTRL_CLK_DIV_MASK, priv->clk_div);
+	val |= ASPEED_PECI_CTRL_PECI_EN;
+	val |= ASPEED_PECI_CTRL_PECI_CLK_EN;
+	writel(val, priv->base + ASPEED_PECI_CTRL);
+}
+
+static inline int aspeed_peci_check_idle(struct aspeed_peci *priv)
+{
+	u32 cmd_sts = readl(priv->base + ASPEED_PECI_CMD);
+
+	if (FIELD_GET(ASPEED_PECI_CMD_STS_MASK, cmd_sts) == ASPEED_PECI_CMD_STS_ADDR_T_NEGO)
+		aspeed_peci_init_regs(priv);
+
+	return readl_poll_timeout(priv->base + ASPEED_PECI_CMD,
+				  cmd_sts,
+				  !(cmd_sts & ASPEED_PECI_CMD_IDLE_MASK),
+				  ASPEED_PECI_IDLE_CHECK_INTERVAL_US,
+				  ASPEED_PECI_IDLE_CHECK_TIMEOUT_US);
+}
+
+static int aspeed_peci_xfer(struct peci_controller *controller,
+			    u8 addr, struct peci_request *req)
+{
+	struct aspeed_peci *priv = dev_get_drvdata(controller->dev.parent);
+	unsigned long timeout = msecs_to_jiffies(priv->cmd_timeout_ms);
+	u32 peci_head;
+	int ret;
+
+	if (req->tx.len > ASPEED_PECI_DATA_BUF_SIZE_MAX ||
+	    req->rx.len > ASPEED_PECI_DATA_BUF_SIZE_MAX)
+		return -EINVAL;
+
+	/* Check command sts and bus idle state */
+	ret = aspeed_peci_check_idle(priv);
+	if (ret)
+		return ret; /* -ETIMEDOUT */
+
+	spin_lock_irq(&priv->lock);
+	reinit_completion(&priv->xfer_complete);
+
+	peci_head = FIELD_PREP(ASPEED_PECI_TARGET_ADDR_MASK, addr) |
+		    FIELD_PREP(ASPEED_PECI_WR_LEN_MASK, req->tx.len) |
+		    FIELD_PREP(ASPEED_PECI_RD_LEN_MASK, req->rx.len);
+
+	writel(peci_head, priv->base + ASPEED_PECI_RW_LENGTH);
+
+	memcpy_toio(priv->base + ASPEED_PECI_WR_DATA0, req->tx.buf, min_t(u8, req->tx.len, 16));
+	if (req->tx.len > 16)
+		memcpy_toio(priv->base + ASPEED_PECI_WR_DATA4, req->tx.buf + 16,
+			    req->tx.len - 16);
+
+#if IS_ENABLED(CONFIG_DYNAMIC_DEBUG)
+	dev_dbg(priv->dev, "HEAD : %#08x\n", peci_head);
+	print_hex_dump_bytes("TX : ", DUMP_PREFIX_NONE, req->tx.buf, req->tx.len);
+#endif
+	priv->status = 0;
+	writel(ASPEED_PECI_CMD_FIRE, priv->base + ASPEED_PECI_CMD);
+	spin_unlock_irq(&priv->lock);
+
+	ret = wait_for_completion_interruptible_timeout(&priv->xfer_complete, timeout);
+	if (ret < 0)
+		return ret;
+
+	if (ret == 0) {
+		dev_dbg(priv->dev, "Timeout waiting for a response!\n");
+		return -ETIMEDOUT;
+	}
+
+	spin_lock_irq(&priv->lock);
+
+	writel(0, priv->base + ASPEED_PECI_CMD);
+
+	if (priv->status != ASPEED_PECI_INT_CMD_DONE) {
+		spin_unlock_irq(&priv->lock);
+		dev_dbg(priv->dev, "No valid response, status: %#02x\n", priv->status);
+		return -EIO;
+	}
+
+	spin_unlock_irq(&priv->lock);
+
+	memcpy_fromio(req->rx.buf, priv->base + ASPEED_PECI_RD_DATA0, min_t(u8, req->rx.len, 16));
+	if (req->rx.len > 16)
+		memcpy_fromio(req->rx.buf + 16, priv->base + ASPEED_PECI_RD_DATA4,
+			      req->rx.len - 16);
+
+#if IS_ENABLED(CONFIG_DYNAMIC_DEBUG)
+	print_hex_dump_bytes("RX : ", DUMP_PREFIX_NONE, req->rx.buf, req->rx.len);
+#endif
+	return 0;
+}
+
+static irqreturn_t aspeed_peci_irq_handler(int irq, void *arg)
+{
+	struct aspeed_peci *priv = arg;
+	u32 status;
+
+	spin_lock(&priv->lock);
+	status = readl(priv->base + ASPEED_PECI_INT_STS);
+	writel(status, priv->base + ASPEED_PECI_INT_STS);
+	priv->status |= (status & ASPEED_PECI_INT_MASK);
+
+	/*
+	 * All commands should be ended up with a ASPEED_PECI_INT_CMD_DONE bit
+	 * set even in an error case.
+	 */
+	if (status & ASPEED_PECI_INT_CMD_DONE)
+		complete(&priv->xfer_complete);
+
+	spin_unlock(&priv->lock);
+
+	return IRQ_HANDLED;
+}
+
+static void aspeed_peci_property_sanitize(struct device *dev, const char *propname,
+					  u32 min, u32 max, u32 default_val, u32 *propval)
+{
+	u32 val;
+	int ret;
+
+	ret = device_property_read_u32(dev, propname, &val);
+	if (ret) {
+		val = default_val;
+	} else if (val > max || val < min) {
+		dev_warn(dev, "Invalid %s: %u, falling back to: %u\n",
+			 propname, val, default_val);
+
+		val = default_val;
+	}
+
+	*propval = val;
+}
+
+static void aspeed_peci_property_setup(struct aspeed_peci *priv)
+{
+	aspeed_peci_property_sanitize(priv->dev, "aspeed,clock-divider",
+				      0, ASPEED_PECI_CLK_DIV_MAX,
+				      ASPEED_PECI_CLK_DIV_DEFAULT, &priv->clk_div);
+	aspeed_peci_property_sanitize(priv->dev, "aspeed,msg-timing",
+				      0, ASPEED_PECI_MSG_TIMING_MAX,
+				      ASPEED_PECI_MSG_TIMING_DEFAULT, &priv->msg_timing);
+	aspeed_peci_property_sanitize(priv->dev, "aspeed,addr-timing",
+				      0, ASPEED_PECI_ADDR_TIMING_MAX,
+				      ASPEED_PECI_ADDR_TIMING_DEFAULT, &priv->addr_timing);
+	aspeed_peci_property_sanitize(priv->dev, "aspeed,rd-sampling-point",
+				      0, ASPEED_PECI_RD_SAMPLING_POINT_MAX,
+				      ASPEED_PECI_RD_SAMPLING_POINT_DEFAULT,
+				      &priv->rd_sampling_point);
+	aspeed_peci_property_sanitize(priv->dev, "cmd-timeout-ms",
+				      1, ASPEED_PECI_CMD_TIMEOUT_MS_MAX,
+				      ASPEED_PECI_CMD_TIMEOUT_MS_DEFAULT, &priv->cmd_timeout_ms);
+}
+
+static struct peci_controller_ops aspeed_ops = {
+	.xfer = aspeed_peci_xfer,
+};
+
+static void aspeed_peci_reset_control_release(void *data)
+{
+	reset_control_assert(data);
+}
+
+static int devm_aspeed_peci_reset_control_deassert(struct device *dev, struct reset_control *rst)
+{
+	int ret;
+
+	ret = reset_control_deassert(rst);
+	if (ret)
+		return ret;
+
+	return devm_add_action_or_reset(dev, aspeed_peci_reset_control_release, rst);
+}
+
+static void aspeed_peci_clk_release(void *data)
+{
+	clk_disable_unprepare(data);
+}
+
+static int devm_aspeed_peci_clk_enable(struct device *dev, struct clk *clk)
+{
+	int ret;
+
+	ret = clk_prepare_enable(clk);
+	if (ret)
+		return ret;
+
+	return devm_add_action_or_reset(dev, aspeed_peci_clk_release, clk);
+}
+
+static int aspeed_peci_probe(struct platform_device *pdev)
+{
+	struct peci_controller *controller;
+	struct aspeed_peci *priv;
+	int ret;
+
+	priv = devm_kzalloc(&pdev->dev, sizeof(*priv), GFP_KERNEL);
+	if (!priv)
+		return -ENOMEM;
+
+	priv->dev = &pdev->dev;
+	dev_set_drvdata(priv->dev, priv);
+
+	priv->base = devm_platform_ioremap_resource(pdev, 0);
+	if (IS_ERR(priv->base))
+		return PTR_ERR(priv->base);
+
+	priv->irq = platform_get_irq(pdev, 0);
+	if (!priv->irq)
+		return priv->irq;
+
+	ret = devm_request_irq(&pdev->dev, priv->irq, aspeed_peci_irq_handler,
+			       0, "peci-aspeed", priv);
+	if (ret)
+		return ret;
+
+	init_completion(&priv->xfer_complete);
+	spin_lock_init(&priv->lock);
+
+	priv->rst = devm_reset_control_get(&pdev->dev, NULL);
+	if (IS_ERR(priv->rst))
+		return dev_err_probe(priv->dev, PTR_ERR(priv->rst),
+				     "failed to get reset control\n");
+
+	ret = devm_aspeed_peci_reset_control_deassert(priv->dev, priv->rst);
+	if (ret)
+		return dev_err_probe(priv->dev, ret, "cannot deassert reset control\n");
+
+	priv->clk = devm_clk_get(priv->dev, NULL);
+	if (IS_ERR(priv->clk))
+		return dev_err_probe(priv->dev, PTR_ERR(priv->clk), "failed to get clk\n");
+
+	ret = devm_aspeed_peci_clk_enable(priv->dev, priv->clk);
+	if (ret)
+		return dev_err_probe(priv->dev, ret, "failed to enable clock\n");
+
+	aspeed_peci_property_setup(priv);
+
+	aspeed_peci_init_regs(priv);
+
+	controller = devm_peci_controller_add(priv->dev, &aspeed_ops);
+	if (IS_ERR(controller))
+		return dev_err_probe(priv->dev, PTR_ERR(controller),
+				     "failed to add aspeed peci controller\n");
+
+	priv->controller = controller;
+
+	return 0;
+}
+
+static const struct of_device_id aspeed_peci_of_table[] = {
+	{ .compatible = "aspeed,ast2400-peci", },
+	{ .compatible = "aspeed,ast2500-peci", },
+	{ .compatible = "aspeed,ast2600-peci", },
+	{ }
+};
+MODULE_DEVICE_TABLE(of, aspeed_peci_of_table);
+
+static struct platform_driver aspeed_peci_driver = {
+	.probe  = aspeed_peci_probe,
+	.driver = {
+		.name           = "peci-aspeed",
+		.of_match_table = aspeed_peci_of_table,
+	},
+};
+module_platform_driver(aspeed_peci_driver);
+
+MODULE_AUTHOR("Ryan Chen <ryan_chen@aspeedtech.com>");
+MODULE_AUTHOR("Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>");
+MODULE_DESCRIPTION("ASPEED PECI driver");
+MODULE_LICENSE("GPL");
+MODULE_IMPORT_NS(PECI);
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v3 06/13] peci: Add device detection
  2021-11-15 18:25 [PATCH v3 00/13] Introduce PECI subsystem Iwona Winiarska
                   ` (4 preceding siblings ...)
  2021-11-15 18:25 ` [PATCH v3 05/13] peci: Add peci-aspeed controller driver Iwona Winiarska
@ 2021-11-15 18:25 ` Iwona Winiarska
  2021-11-15 18:49   ` Greg Kroah-Hartman
  2021-11-15 18:25 ` [PATCH v3 07/13] peci: Add sysfs interface for PECI bus Iwona Winiarska
                   ` (8 subsequent siblings)
  14 siblings, 1 reply; 24+ messages in thread
From: Iwona Winiarska @ 2021-11-15 18:25 UTC (permalink / raw)
  To: linux-kernel, openbmc, Greg Kroah-Hartman
  Cc: linux-aspeed, linux-doc, Dave Hansen, Zev Weiss, Jae Hyun Yoo,
	Jonathan Corbet, Pierre-Louis Bossart, Guenter Roeck, devicetree,
	Jean Delvare, Arnd Bergmann, Rob Herring, Borislav Petkov,
	Iwona Winiarska, Dan Williams, Andy Shevchenko, linux-arm-kernel,
	linux-hwmon, Tony Luck, Andrew Jeffery, Randy Dunlap,
	Olof Johansson

Since PECI devices are discoverable, we can dynamically detect devices
that are actually available in the system.

This change complements the earlier implementation by rescanning PECI
bus to detect available devices. For this purpose, it also introduces the
minimal API for PECI requests.

Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
---
 drivers/peci/Makefile   |   2 +-
 drivers/peci/core.c     |  33 ++++++++++++
 drivers/peci/device.c   | 117 ++++++++++++++++++++++++++++++++++++++++
 drivers/peci/internal.h |  14 +++++
 drivers/peci/request.c  |  55 +++++++++++++++++++
 5 files changed, 220 insertions(+), 1 deletion(-)
 create mode 100644 drivers/peci/device.c
 create mode 100644 drivers/peci/request.c

diff --git a/drivers/peci/Makefile b/drivers/peci/Makefile
index 926d8df15cbd..c5f9d3fe21bb 100644
--- a/drivers/peci/Makefile
+++ b/drivers/peci/Makefile
@@ -1,7 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0-only
 
 # Core functionality
-peci-y := core.o
+peci-y := core.o request.o device.o
 obj-$(CONFIG_PECI) += peci.o
 
 # Hardware specific bus drivers
diff --git a/drivers/peci/core.c b/drivers/peci/core.c
index 73ad0a47fa9d..c3361e6e043a 100644
--- a/drivers/peci/core.c
+++ b/drivers/peci/core.c
@@ -29,6 +29,20 @@ struct device_type peci_controller_type = {
 	.release	= peci_controller_dev_release,
 };
 
+static int peci_controller_scan_devices(struct peci_controller *controller)
+{
+	int ret;
+	u8 addr;
+
+	for (addr = PECI_BASE_ADDR; addr < PECI_BASE_ADDR + PECI_DEVICE_NUM_MAX; addr++) {
+		ret = peci_device_create(controller, addr);
+		if (ret)
+			return ret;
+	}
+
+	return 0;
+}
+
 static struct peci_controller *peci_controller_alloc(struct device *dev,
 						     struct peci_controller_ops *ops)
 {
@@ -64,10 +78,23 @@ static struct peci_controller *peci_controller_alloc(struct device *dev,
 	return ERR_PTR(ret);
 }
 
+static int unregister_child(struct device *dev, void *dummy)
+{
+	peci_device_destroy(to_peci_device(dev));
+
+	return 0;
+}
+
 static void unregister_controller(void *_controller)
 {
 	struct peci_controller *controller = _controller;
 
+	/*
+	 * Detach any active PECI devices. This can't fail, thus we do not
+	 * check the returned value.
+	 */
+	device_for_each_child_reverse(&controller->dev, NULL, unregister_child);
+
 	device_unregister(&controller->dev);
 
 	fwnode_handle_put(controller->dev.fwnode);
@@ -113,6 +140,12 @@ struct peci_controller *devm_peci_controller_add(struct device *dev,
 	if (ret)
 		return ERR_PTR(ret);
 
+	/*
+	 * Ignoring retval since failures during scan are non-critical for
+	 * controller itself.
+	 */
+	peci_controller_scan_devices(controller);
+
 	return controller;
 
 err_fwnode:
diff --git a/drivers/peci/device.c b/drivers/peci/device.c
new file mode 100644
index 000000000000..39f7f6409391
--- /dev/null
+++ b/drivers/peci/device.c
@@ -0,0 +1,117 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright (c) 2018-2021 Intel Corporation
+
+#include <linux/peci.h>
+#include <linux/slab.h>
+
+#include "internal.h"
+
+static int peci_detect(struct peci_controller *controller, u8 addr)
+{
+	/*
+	 * PECI Ping is a command encoded by tx_len = 0, rx_len = 0.
+	 * We expect correct Write FCS if the device at the target address
+	 * is able to respond.
+	 */
+	struct peci_request req = { 0 };
+	int ret;
+
+	mutex_lock(&controller->bus_lock);
+	ret = controller->ops->xfer(controller, addr, &req);
+	mutex_unlock(&controller->bus_lock);
+
+	return ret;
+}
+
+static bool peci_addr_valid(u8 addr)
+{
+	return addr >= PECI_BASE_ADDR && addr < PECI_BASE_ADDR + PECI_DEVICE_NUM_MAX;
+}
+
+static int peci_dev_exists(struct device *dev, void *data)
+{
+	struct peci_device *device = to_peci_device(dev);
+	u8 *addr = data;
+
+	if (device->addr == *addr)
+		return -EBUSY;
+
+	return 0;
+}
+
+int peci_device_create(struct peci_controller *controller, u8 addr)
+{
+	struct peci_device *device;
+	int ret;
+
+	if (!peci_addr_valid(addr))
+		return -EINVAL;
+
+	/* Check if we have already detected this device before. */
+	ret = device_for_each_child(&controller->dev, &addr, peci_dev_exists);
+	if (ret)
+		return 0;
+
+	ret = peci_detect(controller, addr);
+	if (ret) {
+		/*
+		 * Device not present or host state doesn't allow successful
+		 * detection at this time.
+		 */
+		if (ret == -EIO || ret == -ETIMEDOUT)
+			return 0;
+
+		return ret;
+	}
+
+	device = kzalloc(sizeof(*device), GFP_KERNEL);
+	if (!device)
+		return -ENOMEM;
+
+	device_initialize(&device->dev);
+
+	device->addr = addr;
+	device->dev.parent = &controller->dev;
+	device->dev.bus = &peci_bus_type;
+	device->dev.type = &peci_device_type;
+
+	ret = dev_set_name(&device->dev, "%d-%02x", controller->id, device->addr);
+	if (ret)
+		goto err_put;
+
+	ret = device_add(&device->dev);
+	if (ret)
+		goto err_put;
+
+	return 0;
+
+err_put:
+	put_device(&device->dev);
+
+	return ret;
+}
+
+void peci_device_destroy(struct peci_device *device)
+{
+	bool killed;
+
+	device_lock(&device->dev);
+	killed = kill_device(&device->dev);
+	device_unlock(&device->dev);
+
+	if (!killed)
+		return;
+
+	device_unregister(&device->dev);
+}
+
+static void peci_device_release(struct device *dev)
+{
+	struct peci_device *device = to_peci_device(dev);
+
+	kfree(device);
+}
+
+struct device_type peci_device_type = {
+	.release	= peci_device_release,
+};
diff --git a/drivers/peci/internal.h b/drivers/peci/internal.h
index 918dea745a86..57d11a902c5d 100644
--- a/drivers/peci/internal.h
+++ b/drivers/peci/internal.h
@@ -8,6 +8,20 @@
 #include <linux/types.h>
 
 struct peci_controller;
+struct peci_device;
+struct peci_request;
+
+/* PECI CPU address range 0x30-0x37 */
+#define PECI_BASE_ADDR		0x30
+#define PECI_DEVICE_NUM_MAX	8
+
+struct peci_request *peci_request_alloc(struct peci_device *device, u8 tx_len, u8 rx_len);
+void peci_request_free(struct peci_request *req);
+
+extern struct device_type peci_device_type;
+
+int peci_device_create(struct peci_controller *controller, u8 addr);
+void peci_device_destroy(struct peci_device *device);
 
 extern struct bus_type peci_bus_type;
 
diff --git a/drivers/peci/request.c b/drivers/peci/request.c
new file mode 100644
index 000000000000..7dee51c50dd2
--- /dev/null
+++ b/drivers/peci/request.c
@@ -0,0 +1,55 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright (c) 2021 Intel Corporation
+
+#include <linux/export.h>
+#include <linux/peci.h>
+#include <linux/slab.h>
+#include <linux/types.h>
+
+#include "internal.h"
+
+/**
+ * peci_request_alloc() - allocate &struct peci_requests
+ * @device: PECI device to which request is going to be sent
+ * @tx_len: TX length
+ * @rx_len: RX length
+ *
+ * Return: A pointer to a newly allocated &struct peci_request on success or NULL otherwise.
+ */
+struct peci_request *peci_request_alloc(struct peci_device *device, u8 tx_len, u8 rx_len)
+{
+	struct peci_request *req;
+
+	/*
+	 * TX and RX buffers are fixed length members of peci_request, this is
+	 * just a warn for developers to make sure to expand the buffers (or
+	 * change the allocation method) if we go over the current limit.
+	 */
+	if (WARN_ON_ONCE(tx_len > PECI_REQUEST_MAX_BUF_SIZE || rx_len > PECI_REQUEST_MAX_BUF_SIZE))
+		return NULL;
+	/*
+	 * PECI controllers that we are using now don't support DMA, this
+	 * should be converted to DMA API once support for controllers that do
+	 * allow it is added to avoid an extra copy.
+	 */
+	req = kzalloc(sizeof(*req), GFP_KERNEL);
+	if (!req)
+		return NULL;
+
+	req->device = device;
+	req->tx.len = tx_len;
+	req->rx.len = rx_len;
+
+	return req;
+}
+EXPORT_SYMBOL_NS_GPL(peci_request_alloc, PECI);
+
+/**
+ * peci_request_free() - free peci_request
+ * @req: the PECI request to be freed
+ */
+void peci_request_free(struct peci_request *req)
+{
+	kfree(req);
+}
+EXPORT_SYMBOL_NS_GPL(peci_request_free, PECI);
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v3 07/13] peci: Add sysfs interface for PECI bus
  2021-11-15 18:25 [PATCH v3 00/13] Introduce PECI subsystem Iwona Winiarska
                   ` (5 preceding siblings ...)
  2021-11-15 18:25 ` [PATCH v3 06/13] peci: Add device detection Iwona Winiarska
@ 2021-11-15 18:25 ` Iwona Winiarska
  2021-11-15 18:25 ` [PATCH v3 08/13] peci: Add support for PECI device drivers Iwona Winiarska
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 24+ messages in thread
From: Iwona Winiarska @ 2021-11-15 18:25 UTC (permalink / raw)
  To: linux-kernel, openbmc, Greg Kroah-Hartman
  Cc: linux-aspeed, linux-doc, Dave Hansen, Zev Weiss, Jae Hyun Yoo,
	Jonathan Corbet, Pierre-Louis Bossart, Guenter Roeck, devicetree,
	Jean Delvare, Arnd Bergmann, Rob Herring, Borislav Petkov,
	Iwona Winiarska, Dan Williams, Andy Shevchenko, linux-arm-kernel,
	linux-hwmon, Tony Luck, Andrew Jeffery, Randy Dunlap,
	Olof Johansson

PECI devices may not be discoverable at the time when PECI controller is
being added (e.g. BMC can boot up when the Host system is still in S5).
Since we currently don't have the capabilities to figure out the Host
system state inside the PECI subsystem itself, we have to rely on
userspace to do it for us.

In the future, PECI subsystem may be expanded with mechanisms that allow
us to avoid depending on userspace interaction (e.g. CPU presence could
be detected using GPIO, and the information on whether it's discoverable
could be obtained over IPMI).
Unfortunately, those methods may ultimately not be available (support
will vary from platform to platform), which means that we still need
platform independent method triggered by userspace.

Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com>
---
 Documentation/ABI/testing/sysfs-bus-peci | 16 +++++
 drivers/peci/Makefile                    |  2 +-
 drivers/peci/core.c                      |  3 +-
 drivers/peci/device.c                    |  1 +
 drivers/peci/internal.h                  |  5 ++
 drivers/peci/sysfs.c                     | 82 ++++++++++++++++++++++++
 6 files changed, 107 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-bus-peci
 create mode 100644 drivers/peci/sysfs.c

diff --git a/Documentation/ABI/testing/sysfs-bus-peci b/Documentation/ABI/testing/sysfs-bus-peci
new file mode 100644
index 000000000000..56c2b2216bbd
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-bus-peci
@@ -0,0 +1,16 @@
+What:		/sys/bus/peci/rescan
+Date:		July 2021
+KernelVersion:	5.15
+Contact:	Iwona Winiarska <iwona.winiarska@intel.com>
+Description:
+		Writing a non-zero value to this attribute will
+		initiate scan for PECI devices on all PECI controllers
+		in the system.
+
+What:		/sys/bus/peci/devices/<controller_id>-<device_addr>/remove
+Date:		July 2021
+KernelVersion:	5.15
+Contact:	Iwona Winiarska <iwona.winiarska@intel.com>
+Description:
+		Writing a non-zero value to this attribute will
+		remove the PECI device and any of its children.
diff --git a/drivers/peci/Makefile b/drivers/peci/Makefile
index c5f9d3fe21bb..917f689e147a 100644
--- a/drivers/peci/Makefile
+++ b/drivers/peci/Makefile
@@ -1,7 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0-only
 
 # Core functionality
-peci-y := core.o request.o device.o
+peci-y := core.o request.o device.o sysfs.o
 obj-$(CONFIG_PECI) += peci.o
 
 # Hardware specific bus drivers
diff --git a/drivers/peci/core.c b/drivers/peci/core.c
index c3361e6e043a..e993615cf521 100644
--- a/drivers/peci/core.c
+++ b/drivers/peci/core.c
@@ -29,7 +29,7 @@ struct device_type peci_controller_type = {
 	.release	= peci_controller_dev_release,
 };
 
-static int peci_controller_scan_devices(struct peci_controller *controller)
+int peci_controller_scan_devices(struct peci_controller *controller)
 {
 	int ret;
 	u8 addr;
@@ -162,6 +162,7 @@ EXPORT_SYMBOL_NS_GPL(devm_peci_controller_add, PECI);
 
 struct bus_type peci_bus_type = {
 	.name		= "peci",
+	.bus_groups	= peci_bus_groups,
 };
 
 static int __init peci_init(void)
diff --git a/drivers/peci/device.c b/drivers/peci/device.c
index 39f7f6409391..91b1522ce6f7 100644
--- a/drivers/peci/device.c
+++ b/drivers/peci/device.c
@@ -113,5 +113,6 @@ static void peci_device_release(struct device *dev)
 }
 
 struct device_type peci_device_type = {
+	.groups		= peci_device_groups,
 	.release	= peci_device_release,
 };
diff --git a/drivers/peci/internal.h b/drivers/peci/internal.h
index 57d11a902c5d..978e12c8e1d3 100644
--- a/drivers/peci/internal.h
+++ b/drivers/peci/internal.h
@@ -8,6 +8,7 @@
 #include <linux/types.h>
 
 struct peci_controller;
+struct attribute_group;
 struct peci_device;
 struct peci_request;
 
@@ -19,12 +20,16 @@ struct peci_request *peci_request_alloc(struct peci_device *device, u8 tx_len, u
 void peci_request_free(struct peci_request *req);
 
 extern struct device_type peci_device_type;
+extern const struct attribute_group *peci_device_groups[];
 
 int peci_device_create(struct peci_controller *controller, u8 addr);
 void peci_device_destroy(struct peci_device *device);
 
 extern struct bus_type peci_bus_type;
+extern const struct attribute_group *peci_bus_groups[];
 
 extern struct device_type peci_controller_type;
 
+int peci_controller_scan_devices(struct peci_controller *controller);
+
 #endif /* __PECI_INTERNAL_H */
diff --git a/drivers/peci/sysfs.c b/drivers/peci/sysfs.c
new file mode 100644
index 000000000000..db9ef05776e3
--- /dev/null
+++ b/drivers/peci/sysfs.c
@@ -0,0 +1,82 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright (c) 2021 Intel Corporation
+
+#include <linux/device.h>
+#include <linux/kernel.h>
+#include <linux/peci.h>
+
+#include "internal.h"
+
+static int rescan_controller(struct device *dev, void *data)
+{
+	if (dev->type != &peci_controller_type)
+		return 0;
+
+	return peci_controller_scan_devices(to_peci_controller(dev));
+}
+
+static ssize_t rescan_store(struct bus_type *bus, const char *buf, size_t count)
+{
+	bool res;
+	int ret;
+
+	ret = kstrtobool(buf, &res);
+	if (ret)
+		return ret;
+
+	if (!res)
+		return count;
+
+	ret = bus_for_each_dev(&peci_bus_type, NULL, NULL, rescan_controller);
+	if (ret)
+		return ret;
+
+	return count;
+}
+static BUS_ATTR_WO(rescan);
+
+static struct attribute *peci_bus_attrs[] = {
+	&bus_attr_rescan.attr,
+	NULL
+};
+
+static const struct attribute_group peci_bus_group = {
+	.attrs = peci_bus_attrs,
+};
+
+const struct attribute_group *peci_bus_groups[] = {
+	&peci_bus_group,
+	NULL
+};
+
+static ssize_t remove_store(struct device *dev, struct device_attribute *attr,
+			    const char *buf, size_t count)
+{
+	struct peci_device *device = to_peci_device(dev);
+	bool res;
+	int ret;
+
+	ret = kstrtobool(buf, &res);
+	if (ret)
+		return ret;
+
+	if (res && device_remove_file_self(dev, attr))
+		peci_device_destroy(device);
+
+	return count;
+}
+static DEVICE_ATTR_IGNORE_LOCKDEP(remove, 0200, NULL, remove_store);
+
+static struct attribute *peci_device_attrs[] = {
+	&dev_attr_remove.attr,
+	NULL
+};
+
+static const struct attribute_group peci_device_group = {
+	.attrs = peci_device_attrs,
+};
+
+const struct attribute_group *peci_device_groups[] = {
+	&peci_device_group,
+	NULL
+};
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v3 08/13] peci: Add support for PECI device drivers
  2021-11-15 18:25 [PATCH v3 00/13] Introduce PECI subsystem Iwona Winiarska
                   ` (6 preceding siblings ...)
  2021-11-15 18:25 ` [PATCH v3 07/13] peci: Add sysfs interface for PECI bus Iwona Winiarska
@ 2021-11-15 18:25 ` Iwona Winiarska
  2021-11-15 18:25 ` [PATCH v3 09/13] peci: Add peci-cpu driver Iwona Winiarska
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 24+ messages in thread
From: Iwona Winiarska @ 2021-11-15 18:25 UTC (permalink / raw)
  To: linux-kernel, openbmc, Greg Kroah-Hartman
  Cc: linux-aspeed, linux-doc, Dave Hansen, Zev Weiss, Jae Hyun Yoo,
	Jonathan Corbet, Pierre-Louis Bossart, Guenter Roeck, devicetree,
	Jean Delvare, Arnd Bergmann, Rob Herring, Borislav Petkov,
	Iwona Winiarska, Dan Williams, Andy Shevchenko, linux-arm-kernel,
	linux-hwmon, Tony Luck, Andrew Jeffery, Randy Dunlap,
	Olof Johansson

Add support for PECI device drivers, which unlike PECI controller
drivers are actually able to provide functionalities to userspace.

Also, extend peci_request API to allow querying more details about PECI
device (e.g. model/family), that's going to be used to find a compatible
peci_driver.

Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
---
 drivers/peci/core.c     |  44 +++++++++
 drivers/peci/device.c   | 130 ++++++++++++++++++++++++
 drivers/peci/internal.h |  74 ++++++++++++++
 drivers/peci/request.c  | 214 ++++++++++++++++++++++++++++++++++++++++
 include/linux/peci.h    |  19 ++++
 5 files changed, 481 insertions(+)

diff --git a/drivers/peci/core.c b/drivers/peci/core.c
index e993615cf521..9c8cf07e51c7 100644
--- a/drivers/peci/core.c
+++ b/drivers/peci/core.c
@@ -160,8 +160,52 @@ struct peci_controller *devm_peci_controller_add(struct device *dev,
 }
 EXPORT_SYMBOL_NS_GPL(devm_peci_controller_add, PECI);
 
+static const struct peci_device_id *
+peci_bus_match_device_id(const struct peci_device_id *id, struct peci_device *device)
+{
+	while (id->family != 0) {
+		if (id->family == device->info.family &&
+		    id->model == device->info.model)
+			return id;
+		id++;
+	}
+
+	return NULL;
+}
+
+static int peci_bus_device_match(struct device *dev, struct device_driver *drv)
+{
+	struct peci_device *device = to_peci_device(dev);
+	struct peci_driver *peci_drv = to_peci_driver(drv);
+
+	if (dev->type != &peci_device_type)
+		return 0;
+
+	return !!peci_bus_match_device_id(peci_drv->id_table, device);
+}
+
+static int peci_bus_device_probe(struct device *dev)
+{
+	struct peci_device *device = to_peci_device(dev);
+	struct peci_driver *driver = to_peci_driver(dev->driver);
+
+	return driver->probe(device, peci_bus_match_device_id(driver->id_table, device));
+}
+
+static void peci_bus_device_remove(struct device *dev)
+{
+	struct peci_device *device = to_peci_device(dev);
+	struct peci_driver *driver = to_peci_driver(dev->driver);
+
+	if (driver->remove)
+		driver->remove(device);
+}
+
 struct bus_type peci_bus_type = {
 	.name		= "peci",
+	.match		= peci_bus_device_match,
+	.probe		= peci_bus_device_probe,
+	.remove		= peci_bus_device_remove,
 	.bus_groups	= peci_bus_groups,
 };
 
diff --git a/drivers/peci/device.c b/drivers/peci/device.c
index 91b1522ce6f7..0a7660f4fc9b 100644
--- a/drivers/peci/device.c
+++ b/drivers/peci/device.c
@@ -1,11 +1,110 @@
 // SPDX-License-Identifier: GPL-2.0-only
 // Copyright (c) 2018-2021 Intel Corporation
 
+#include <linux/bitfield.h>
 #include <linux/peci.h>
 #include <linux/slab.h>
 
 #include "internal.h"
 
+#define REVISION_NUM_MASK GENMASK(15, 8)
+static int peci_get_revision(struct peci_device *device, u8 *revision)
+{
+	struct peci_request *req;
+	u64 dib;
+
+	req = peci_xfer_get_dib(device);
+	if (IS_ERR(req))
+		return PTR_ERR(req);
+
+	/*
+	 * PECI device may be in a state where it is unable to return a proper
+	 * DIB, in which case it returns 0 as DIB value.
+	 * Let's treat this as an error to avoid carrying on with the detection
+	 * using invalid revision.
+	 */
+	dib = peci_request_dib_read(req);
+	if (dib == 0) {
+		peci_request_free(req);
+		return -EIO;
+	}
+
+	*revision = FIELD_GET(REVISION_NUM_MASK, dib);
+
+	peci_request_free(req);
+
+	return 0;
+}
+
+static int peci_get_cpu_id(struct peci_device *device, u32 *cpu_id)
+{
+	struct peci_request *req;
+	int ret;
+
+	req = peci_xfer_pkg_cfg_readl(device, PECI_PCS_PKG_ID, PECI_PKG_ID_CPU_ID);
+	if (IS_ERR(req))
+		return PTR_ERR(req);
+
+	ret = peci_request_status(req);
+	if (ret)
+		goto out_req_free;
+
+	*cpu_id = peci_request_data_readl(req);
+out_req_free:
+	peci_request_free(req);
+
+	return ret;
+}
+
+static unsigned int peci_x86_cpu_family(unsigned int sig)
+{
+	unsigned int x86;
+
+	x86 = (sig >> 8) & 0xf;
+
+	if (x86 == 0xf)
+		x86 += (sig >> 20) & 0xff;
+
+	return x86;
+}
+
+static unsigned int peci_x86_cpu_model(unsigned int sig)
+{
+	unsigned int fam, model;
+
+	fam = peci_x86_cpu_family(sig);
+
+	model = (sig >> 4) & 0xf;
+
+	if (fam >= 0x6)
+		model += ((sig >> 16) & 0xf) << 4;
+
+	return model;
+}
+
+static int peci_device_info_init(struct peci_device *device)
+{
+	u8 revision;
+	u32 cpu_id;
+	int ret;
+
+	ret = peci_get_cpu_id(device, &cpu_id);
+	if (ret)
+		return ret;
+
+	device->info.family = peci_x86_cpu_family(cpu_id);
+	device->info.model = peci_x86_cpu_model(cpu_id);
+
+	ret = peci_get_revision(device, &revision);
+	if (ret)
+		return ret;
+	device->info.peci_revision = revision;
+
+	device->info.socket_id = device->addr - PECI_BASE_ADDR;
+
+	return 0;
+}
+
 static int peci_detect(struct peci_controller *controller, u8 addr)
 {
 	/*
@@ -75,6 +174,10 @@ int peci_device_create(struct peci_controller *controller, u8 addr)
 	device->dev.bus = &peci_bus_type;
 	device->dev.type = &peci_device_type;
 
+	ret = peci_device_info_init(device);
+	if (ret)
+		goto err_put;
+
 	ret = dev_set_name(&device->dev, "%d-%02x", controller->id, device->addr);
 	if (ret)
 		goto err_put;
@@ -105,6 +208,33 @@ void peci_device_destroy(struct peci_device *device)
 	device_unregister(&device->dev);
 }
 
+int __peci_driver_register(struct peci_driver *driver, struct module *owner,
+			   const char *mod_name)
+{
+	driver->driver.bus = &peci_bus_type;
+	driver->driver.owner = owner;
+	driver->driver.mod_name = mod_name;
+
+	if (!driver->probe) {
+		pr_err("peci: trying to register driver without probe callback\n");
+		return -EINVAL;
+	}
+
+	if (!driver->id_table) {
+		pr_err("peci: trying to register driver without device id table\n");
+		return -EINVAL;
+	}
+
+	return driver_register(&driver->driver);
+}
+EXPORT_SYMBOL_NS_GPL(__peci_driver_register, PECI);
+
+void peci_driver_unregister(struct peci_driver *driver)
+{
+	driver_unregister(&driver->driver);
+}
+EXPORT_SYMBOL_NS_GPL(peci_driver_unregister, PECI);
+
 static void peci_device_release(struct device *dev)
 {
 	struct peci_device *device = to_peci_device(dev);
diff --git a/drivers/peci/internal.h b/drivers/peci/internal.h
index 978e12c8e1d3..52c02e12874f 100644
--- a/drivers/peci/internal.h
+++ b/drivers/peci/internal.h
@@ -19,6 +19,35 @@ struct peci_request;
 struct peci_request *peci_request_alloc(struct peci_device *device, u8 tx_len, u8 rx_len);
 void peci_request_free(struct peci_request *req);
 
+int peci_request_status(struct peci_request *req);
+
+u64 peci_request_dib_read(struct peci_request *req);
+
+u8 peci_request_data_readb(struct peci_request *req);
+u16 peci_request_data_readw(struct peci_request *req);
+u32 peci_request_data_readl(struct peci_request *req);
+u64 peci_request_data_readq(struct peci_request *req);
+
+struct peci_request *peci_xfer_get_dib(struct peci_device *device);
+struct peci_request *peci_xfer_get_temp(struct peci_device *device);
+
+struct peci_request *peci_xfer_pkg_cfg_readb(struct peci_device *device, u8 index, u16 param);
+struct peci_request *peci_xfer_pkg_cfg_readw(struct peci_device *device, u8 index, u16 param);
+struct peci_request *peci_xfer_pkg_cfg_readl(struct peci_device *device, u8 index, u16 param);
+struct peci_request *peci_xfer_pkg_cfg_readq(struct peci_device *device, u8 index, u16 param);
+
+/**
+ * struct peci_device_id - PECI device data to match
+ * @data: pointer to driver private data specific to device
+ * @family: device family
+ * @model: device model
+ */
+struct peci_device_id {
+	const void *data;
+	u16 family;
+	u8 model;
+};
+
 extern struct device_type peci_device_type;
 extern const struct attribute_group *peci_device_groups[];
 
@@ -28,6 +57,51 @@ void peci_device_destroy(struct peci_device *device);
 extern struct bus_type peci_bus_type;
 extern const struct attribute_group *peci_bus_groups[];
 
+/**
+ * struct peci_driver - PECI driver
+ * @driver: inherit device driver
+ * @probe: probe callback
+ * @remove: remove callback
+ * @id_table: PECI device match table to decide which device to bind
+ */
+struct peci_driver {
+	struct device_driver driver;
+	int (*probe)(struct peci_device *device, const struct peci_device_id *id);
+	void (*remove)(struct peci_device *device);
+	const struct peci_device_id *id_table;
+};
+
+static inline struct peci_driver *to_peci_driver(struct device_driver *d)
+{
+	return container_of(d, struct peci_driver, driver);
+}
+
+int __peci_driver_register(struct peci_driver *driver, struct module *owner,
+			   const char *mod_name);
+/**
+ * peci_driver_register() - register PECI driver
+ * @driver: the driver to be registered
+ *
+ * PECI drivers that don't need to do anything special in module init should
+ * use the convenience "module_peci_driver" macro instead
+ *
+ * Return: zero on success, else a negative error code.
+ */
+#define peci_driver_register(driver) \
+	__peci_driver_register(driver, THIS_MODULE, KBUILD_MODNAME)
+void peci_driver_unregister(struct peci_driver *driver);
+
+/**
+ * module_peci_driver() - helper macro for registering a modular PECI driver
+ * @__peci_driver: peci_driver struct
+ *
+ * Helper macro for PECI drivers which do not do anything special in module
+ * init/exit. This eliminates a lot of boilerplate. Each module may only
+ * use this macro once, and calling it replaces module_init() and module_exit()
+ */
+#define module_peci_driver(__peci_driver) \
+	module_driver(__peci_driver, peci_driver_register, peci_driver_unregister)
+
 extern struct device_type peci_controller_type;
 
 int peci_controller_scan_devices(struct peci_controller *controller);
diff --git a/drivers/peci/request.c b/drivers/peci/request.c
index 7dee51c50dd2..9bd1d81c6093 100644
--- a/drivers/peci/request.c
+++ b/drivers/peci/request.c
@@ -1,13 +1,140 @@
 // SPDX-License-Identifier: GPL-2.0-only
 // Copyright (c) 2021 Intel Corporation
 
+#include <linux/bug.h>
 #include <linux/export.h>
 #include <linux/peci.h>
 #include <linux/slab.h>
 #include <linux/types.h>
 
+#include <asm/unaligned.h>
+
 #include "internal.h"
 
+#define PECI_GET_DIB_CMD		0xf7
+#define  PECI_GET_DIB_WR_LEN		1
+#define  PECI_GET_DIB_RD_LEN		8
+
+#define PECI_RDPKGCFG_CMD		0xa1
+#define  PECI_RDPKGCFG_WR_LEN		5
+#define  PECI_RDPKGCFG_RD_LEN_BASE	1
+#define PECI_WRPKGCFG_CMD		0xa5
+#define  PECI_WRPKGCFG_WR_LEN_BASE	6
+#define  PECI_WRPKGCFG_RD_LEN		1
+
+/* Device Specific Completion Code (CC) Definition */
+#define PECI_CC_SUCCESS				0x40
+#define PECI_CC_NEED_RETRY			0x80
+#define PECI_CC_OUT_OF_RESOURCE			0x81
+#define PECI_CC_UNAVAIL_RESOURCE		0x82
+#define PECI_CC_INVALID_REQ			0x90
+#define PECI_CC_MCA_ERROR			0x91
+#define PECI_CC_CATASTROPHIC_MCA_ERROR		0x93
+#define PECI_CC_FATAL_MCA_ERROR			0x94
+#define PECI_CC_PARITY_ERR_GPSB_OR_PMSB		0x98
+#define PECI_CC_PARITY_ERR_GPSB_OR_PMSB_IERR	0x9B
+#define PECI_CC_PARITY_ERR_GPSB_OR_PMSB_MCA	0x9C
+
+#define PECI_RETRY_BIT			BIT(0)
+
+#define PECI_RETRY_TIMEOUT		msecs_to_jiffies(700)
+#define PECI_RETRY_INTERVAL_MIN		msecs_to_jiffies(1)
+#define PECI_RETRY_INTERVAL_MAX		msecs_to_jiffies(128)
+
+static u8 peci_request_data_cc(struct peci_request *req)
+{
+	return req->rx.buf[0];
+}
+
+/**
+ * peci_request_status() - return -errno based on PECI completion code
+ * @req: the PECI request that contains response data with completion code
+ *
+ * It can't be used for Ping(), GetDIB() and GetTemp() - for those commands we
+ * don't expect completion code in the response.
+ *
+ * Return: -errno
+ */
+int peci_request_status(struct peci_request *req)
+{
+	u8 cc = peci_request_data_cc(req);
+
+	if (cc != PECI_CC_SUCCESS)
+		dev_dbg(&req->device->dev, "ret: %#02x\n", cc);
+
+	switch (cc) {
+	case PECI_CC_SUCCESS:
+		return 0;
+	case PECI_CC_NEED_RETRY:
+	case PECI_CC_OUT_OF_RESOURCE:
+	case PECI_CC_UNAVAIL_RESOURCE:
+		return -EAGAIN;
+	case PECI_CC_INVALID_REQ:
+		return -EINVAL;
+	case PECI_CC_MCA_ERROR:
+	case PECI_CC_CATASTROPHIC_MCA_ERROR:
+	case PECI_CC_FATAL_MCA_ERROR:
+	case PECI_CC_PARITY_ERR_GPSB_OR_PMSB:
+	case PECI_CC_PARITY_ERR_GPSB_OR_PMSB_IERR:
+	case PECI_CC_PARITY_ERR_GPSB_OR_PMSB_MCA:
+		return -EIO;
+	}
+
+	WARN_ONCE(1, "Unknown PECI completion code: %#02x\n", cc);
+
+	return -EIO;
+}
+EXPORT_SYMBOL_NS_GPL(peci_request_status, PECI);
+
+static int peci_request_xfer(struct peci_request *req)
+{
+	struct peci_device *device = req->device;
+	struct peci_controller *controller = to_peci_controller(device->dev.parent);
+	int ret;
+
+	mutex_lock(&controller->bus_lock);
+	ret = controller->ops->xfer(controller, device->addr, req);
+	mutex_unlock(&controller->bus_lock);
+
+	return ret;
+}
+
+static int peci_request_xfer_retry(struct peci_request *req)
+{
+	long wait_interval = PECI_RETRY_INTERVAL_MIN;
+	struct peci_device *device = req->device;
+	struct peci_controller *controller = to_peci_controller(device->dev.parent);
+	unsigned long start = jiffies;
+	int ret;
+
+	/* Don't try to use it for ping */
+	if (WARN_ON(!req->rx.buf))
+		return 0;
+
+	do {
+		ret = peci_request_xfer(req);
+		if (ret) {
+			dev_dbg(&controller->dev, "xfer error: %d\n", ret);
+			return ret;
+		}
+
+		if (peci_request_status(req) != -EAGAIN)
+			return 0;
+
+		/* Set the retry bit to indicate a retry attempt */
+		req->tx.buf[1] |= PECI_RETRY_BIT;
+
+		if (schedule_timeout_interruptible(wait_interval))
+			return -ERESTARTSYS;
+
+		wait_interval = min_t(long, wait_interval * 2, PECI_RETRY_INTERVAL_MAX);
+	} while (time_before(jiffies, start + PECI_RETRY_TIMEOUT));
+
+	dev_dbg(&controller->dev, "request timed out\n");
+
+	return -ETIMEDOUT;
+}
+
 /**
  * peci_request_alloc() - allocate &struct peci_requests
  * @device: PECI device to which request is going to be sent
@@ -53,3 +180,90 @@ void peci_request_free(struct peci_request *req)
 	kfree(req);
 }
 EXPORT_SYMBOL_NS_GPL(peci_request_free, PECI);
+
+struct peci_request *peci_xfer_get_dib(struct peci_device *device)
+{
+	struct peci_request *req;
+	int ret;
+
+	req = peci_request_alloc(device, PECI_GET_DIB_WR_LEN, PECI_GET_DIB_RD_LEN);
+	if (!req)
+		return ERR_PTR(-ENOMEM);
+
+	req->tx.buf[0] = PECI_GET_DIB_CMD;
+
+	ret = peci_request_xfer(req);
+	if (ret) {
+		peci_request_free(req);
+		return ERR_PTR(ret);
+	}
+
+	return req;
+}
+EXPORT_SYMBOL_NS_GPL(peci_xfer_get_dib, PECI);
+
+static struct peci_request *
+__pkg_cfg_read(struct peci_device *device, u8 index, u16 param, u8 len)
+{
+	struct peci_request *req;
+	int ret;
+
+	req = peci_request_alloc(device, PECI_RDPKGCFG_WR_LEN, PECI_RDPKGCFG_RD_LEN_BASE + len);
+	if (!req)
+		return ERR_PTR(-ENOMEM);
+
+	req->tx.buf[0] = PECI_RDPKGCFG_CMD;
+	req->tx.buf[1] = 0;
+	req->tx.buf[2] = index;
+	put_unaligned_le16(param, &req->tx.buf[3]);
+
+	ret = peci_request_xfer_retry(req);
+	if (ret) {
+		peci_request_free(req);
+		return ERR_PTR(ret);
+	}
+
+	return req;
+}
+
+u8 peci_request_data_readb(struct peci_request *req)
+{
+	return req->rx.buf[1];
+}
+EXPORT_SYMBOL_NS_GPL(peci_request_data_readb, PECI);
+
+u16 peci_request_data_readw(struct peci_request *req)
+{
+	return get_unaligned_le16(&req->rx.buf[1]);
+}
+EXPORT_SYMBOL_NS_GPL(peci_request_data_readw, PECI);
+
+u32 peci_request_data_readl(struct peci_request *req)
+{
+	return get_unaligned_le32(&req->rx.buf[1]);
+}
+EXPORT_SYMBOL_NS_GPL(peci_request_data_readl, PECI);
+
+u64 peci_request_data_readq(struct peci_request *req)
+{
+	return get_unaligned_le64(&req->rx.buf[1]);
+}
+EXPORT_SYMBOL_NS_GPL(peci_request_data_readq, PECI);
+
+u64 peci_request_dib_read(struct peci_request *req)
+{
+	return get_unaligned_le64(&req->rx.buf[0]);
+}
+EXPORT_SYMBOL_NS_GPL(peci_request_dib_read, PECI);
+
+#define __read_pkg_config(x, type) \
+struct peci_request *peci_xfer_pkg_cfg_##x(struct peci_device *device, u8 index, u16 param) \
+{ \
+	return __pkg_cfg_read(device, index, param, sizeof(type)); \
+} \
+EXPORT_SYMBOL_NS_GPL(peci_xfer_pkg_cfg_##x, PECI)
+
+__read_pkg_config(readb, u8);
+__read_pkg_config(readw, u16);
+__read_pkg_config(readl, u32);
+__read_pkg_config(readq, u64);
diff --git a/include/linux/peci.h b/include/linux/peci.h
index 26e0a4e73b50..dcf1c53f4e40 100644
--- a/include/linux/peci.h
+++ b/include/linux/peci.h
@@ -14,6 +14,14 @@
  */
 #define PECI_REQUEST_MAX_BUF_SIZE 32
 
+#define PECI_PCS_PKG_ID			0  /* Package Identifier Read */
+#define  PECI_PKG_ID_CPU_ID		0x0000  /* CPUID Info */
+#define  PECI_PKG_ID_PLATFORM_ID	0x0001  /* Platform ID */
+#define  PECI_PKG_ID_DEVICE_ID		0x0002  /* Uncore Device ID */
+#define  PECI_PKG_ID_MAX_THREAD_ID	0x0003  /* Max Thread ID */
+#define  PECI_PKG_ID_MICROCODE_REV	0x0004  /* CPU Microcode Update Revision */
+#define  PECI_PKG_ID_MCA_ERROR_LOG	0x0005  /* Machine Check Status */
+
 struct peci_controller;
 struct peci_request;
 
@@ -59,6 +67,11 @@ static inline struct peci_controller *to_peci_controller(void *d)
  * struct peci_device - PECI device
  * @dev: device object to register PECI device to the device model
  * @controller: manages the bus segment hosting this PECI device
+ * @info: PECI device characteristics
+ * @info.family: device family
+ * @info.model: device model
+ * @info.peci_revision: PECI revision supported by the PECI device
+ * @info.socket_id: the socket ID represented by the PECI device
  * @addr: address used on the PECI bus connected to the parent controller
  *
  * A peci_device identifies a single device (i.e. CPU) connected to a PECI bus.
@@ -67,6 +80,12 @@ static inline struct peci_controller *to_peci_controller(void *d)
  */
 struct peci_device {
 	struct device dev;
+	struct {
+		u16 family;
+		u8 model;
+		u8 peci_revision;
+		u8 socket_id;
+	} info;
 	u8 addr;
 };
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v3 09/13] peci: Add peci-cpu driver
  2021-11-15 18:25 [PATCH v3 00/13] Introduce PECI subsystem Iwona Winiarska
                   ` (7 preceding siblings ...)
  2021-11-15 18:25 ` [PATCH v3 08/13] peci: Add support for PECI device drivers Iwona Winiarska
@ 2021-11-15 18:25 ` Iwona Winiarska
  2021-11-15 18:25 ` [PATCH v3 10/13] hwmon: peci: Add cputemp driver Iwona Winiarska
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 24+ messages in thread
From: Iwona Winiarska @ 2021-11-15 18:25 UTC (permalink / raw)
  To: linux-kernel, openbmc, Greg Kroah-Hartman
  Cc: linux-aspeed, linux-doc, Dave Hansen, Zev Weiss, Jae Hyun Yoo,
	Jonathan Corbet, Pierre-Louis Bossart, Guenter Roeck, devicetree,
	Jean Delvare, Arnd Bergmann, Rob Herring, Borislav Petkov,
	Iwona Winiarska, Dan Williams, Andy Shevchenko, linux-arm-kernel,
	linux-hwmon, Tony Luck, Andrew Jeffery, Randy Dunlap,
	Olof Johansson

PECI is an interface that may be used by different types of devices.
Add a peci-cpu driver compatible with Intel processors. The driver is
responsible for handling auxiliary devices that can subsequently be used
by other drivers (e.g. hwmons).

Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
---
 MAINTAINERS              |   1 +
 drivers/peci/Kconfig     |  15 ++
 drivers/peci/Makefile    |   2 +
 drivers/peci/cpu.c       | 343 +++++++++++++++++++++++++++++++++++++++
 drivers/peci/device.c    |   1 +
 drivers/peci/internal.h  |  27 +++
 drivers/peci/request.c   | 213 ++++++++++++++++++++++++
 include/linux/peci-cpu.h |  40 +++++
 include/linux/peci.h     |   8 -
 9 files changed, 642 insertions(+), 8 deletions(-)
 create mode 100644 drivers/peci/cpu.c
 create mode 100644 include/linux/peci-cpu.h

diff --git a/MAINTAINERS b/MAINTAINERS
index a4da3bba09ba..d751590f32e3 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -14927,6 +14927,7 @@ L:	openbmc@lists.ozlabs.org (moderated for non-subscribers)
 S:	Supported
 F:	Documentation/devicetree/bindings/peci/
 F:	drivers/peci/
+F:	include/linux/peci-cpu.h
 F:	include/linux/peci.h
 
 PENSANDO ETHERNET DRIVERS
diff --git a/drivers/peci/Kconfig b/drivers/peci/Kconfig
index 99279df97a78..89872ad83320 100644
--- a/drivers/peci/Kconfig
+++ b/drivers/peci/Kconfig
@@ -16,6 +16,21 @@ menuconfig PECI
 
 if PECI
 
+config PECI_CPU
+	tristate "PECI CPU"
+	select AUXILIARY_BUS
+	help
+	  This option enables peci-cpu driver for Intel processors. It is
+	  responsible for creating auxiliary devices that can subsequently
+	  be used by other drivers in order to perform various
+	  functionalities such as e.g. temperature monitoring.
+
+	  Additional drivers must be enabled in order to use the functionality
+	  of the device.
+
+	  This driver can also be built as a module. If so, the module
+	  will be called peci-cpu.
+
 source "drivers/peci/controller/Kconfig"
 
 endif # PECI
diff --git a/drivers/peci/Makefile b/drivers/peci/Makefile
index 917f689e147a..7de18137e738 100644
--- a/drivers/peci/Makefile
+++ b/drivers/peci/Makefile
@@ -3,6 +3,8 @@
 # Core functionality
 peci-y := core.o request.o device.o sysfs.o
 obj-$(CONFIG_PECI) += peci.o
+peci-cpu-y := cpu.o
+obj-$(CONFIG_PECI_CPU) += peci-cpu.o
 
 # Hardware specific bus drivers
 obj-y += controller/
diff --git a/drivers/peci/cpu.c b/drivers/peci/cpu.c
new file mode 100644
index 000000000000..68eb61c65d34
--- /dev/null
+++ b/drivers/peci/cpu.c
@@ -0,0 +1,343 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright (c) 2021 Intel Corporation
+
+#include <linux/auxiliary_bus.h>
+#include <linux/module.h>
+#include <linux/peci.h>
+#include <linux/peci-cpu.h>
+#include <linux/slab.h>
+
+#include "internal.h"
+
+/**
+ * peci_temp_read() - read the maximum die temperature from PECI target device
+ * @device: PECI device to which request is going to be sent
+ * @temp_raw: where to store the read temperature
+ *
+ * It uses GetTemp PECI command.
+ *
+ * Return: 0 if succeeded, other values in case errors.
+ */
+int peci_temp_read(struct peci_device *device, s16 *temp_raw)
+{
+	struct peci_request *req;
+
+	req = peci_xfer_get_temp(device);
+	if (IS_ERR(req))
+		return PTR_ERR(req);
+
+	*temp_raw = peci_request_temp_read(req);
+
+	peci_request_free(req);
+
+	return 0;
+}
+EXPORT_SYMBOL_NS_GPL(peci_temp_read, PECI_CPU);
+
+/**
+ * peci_pcs_read() - read PCS register
+ * @device: PECI device to which request is going to be sent
+ * @index: PCS index
+ * @param: PCS parameter
+ * @data: where to store the read data
+ *
+ * It uses RdPkgConfig PECI command.
+ *
+ * Return: 0 if succeeded, other values in case errors.
+ */
+int peci_pcs_read(struct peci_device *device, u8 index, u16 param, u32 *data)
+{
+	struct peci_request *req;
+	int ret;
+
+	req = peci_xfer_pkg_cfg_readl(device, index, param);
+	if (IS_ERR(req))
+		return PTR_ERR(req);
+
+	ret = peci_request_status(req);
+	if (ret)
+		goto out_req_free;
+
+	*data = peci_request_data_readl(req);
+out_req_free:
+	peci_request_free(req);
+
+	return ret;
+}
+EXPORT_SYMBOL_NS_GPL(peci_pcs_read, PECI_CPU);
+
+/**
+ * peci_pci_local_read() - read 32-bit memory location using raw address
+ * @device: PECI device to which request is going to be sent
+ * @bus: bus
+ * @dev: device
+ * @func: function
+ * @reg: register
+ * @data: where to store the read data
+ *
+ * It uses RdPCIConfigLocal PECI command.
+ *
+ * Return: 0 if succeeded, other values in case errors.
+ */
+int peci_pci_local_read(struct peci_device *device, u8 bus, u8 dev, u8 func,
+			u16 reg, u32 *data)
+{
+	struct peci_request *req;
+	int ret;
+
+	req = peci_xfer_pci_cfg_local_readl(device, bus, dev, func, reg);
+	if (IS_ERR(req))
+		return PTR_ERR(req);
+
+	ret = peci_request_status(req);
+	if (ret)
+		goto out_req_free;
+
+	*data = peci_request_data_readl(req);
+out_req_free:
+	peci_request_free(req);
+
+	return ret;
+}
+EXPORT_SYMBOL_NS_GPL(peci_pci_local_read, PECI_CPU);
+
+/**
+ * peci_ep_pci_local_read() - read 32-bit memory location using raw address
+ * @device: PECI device to which request is going to be sent
+ * @seg: PCI segment
+ * @bus: bus
+ * @dev: device
+ * @func: function
+ * @reg: register
+ * @data: where to store the read data
+ *
+ * Like &peci_pci_local_read, but it uses RdEndpointConfig PECI command.
+ *
+ * Return: 0 if succeeded, other values in case errors.
+ */
+int peci_ep_pci_local_read(struct peci_device *device, u8 seg,
+			   u8 bus, u8 dev, u8 func, u16 reg, u32 *data)
+{
+	struct peci_request *req;
+	int ret;
+
+	req = peci_xfer_ep_pci_cfg_local_readl(device, seg, bus, dev, func, reg);
+	if (IS_ERR(req))
+		return PTR_ERR(req);
+
+	ret = peci_request_status(req);
+	if (ret)
+		goto out_req_free;
+
+	*data = peci_request_data_readl(req);
+out_req_free:
+	peci_request_free(req);
+
+	return ret;
+}
+EXPORT_SYMBOL_NS_GPL(peci_ep_pci_local_read, PECI_CPU);
+
+/**
+ * peci_mmio_read() - read 32-bit memory location using 64-bit bar offset address
+ * @device: PECI device to which request is going to be sent
+ * @bar: PCI bar
+ * @seg: PCI segment
+ * @bus: bus
+ * @dev: device
+ * @func: function
+ * @address: 64-bit MMIO address
+ * @data: where to store the read data
+ *
+ * It uses RdEndpointConfig PECI command.
+ *
+ * Return: 0 if succeeded, other values in case errors.
+ */
+int peci_mmio_read(struct peci_device *device, u8 bar, u8 seg,
+		   u8 bus, u8 dev, u8 func, u64 address, u32 *data)
+{
+	struct peci_request *req;
+	int ret;
+
+	req = peci_xfer_ep_mmio64_readl(device, bar, seg, bus, dev, func, address);
+	if (IS_ERR(req))
+		return PTR_ERR(req);
+
+	ret = peci_request_status(req);
+	if (ret)
+		goto out_req_free;
+
+	*data = peci_request_data_readl(req);
+out_req_free:
+	peci_request_free(req);
+
+	return ret;
+}
+EXPORT_SYMBOL_NS_GPL(peci_mmio_read, PECI_CPU);
+
+static const char * const peci_adev_types[] = {
+	"cputemp",
+	"dimmtemp",
+};
+
+struct peci_cpu {
+	struct peci_device *device;
+	const struct peci_device_id *id;
+};
+
+static void adev_release(struct device *dev)
+{
+	struct auxiliary_device *adev = to_auxiliary_dev(dev);
+
+	auxiliary_device_uninit(adev);
+
+	kfree(adev->name);
+	kfree(adev);
+}
+
+static struct auxiliary_device *adev_alloc(struct peci_cpu *priv, int idx)
+{
+	struct peci_controller *controller = to_peci_controller(priv->device->dev.parent);
+	struct auxiliary_device *adev;
+	const char *name;
+	int ret;
+
+	adev = kzalloc(sizeof(*adev), GFP_KERNEL);
+	if (!adev)
+		return ERR_PTR(-ENOMEM);
+
+	name = kasprintf(GFP_KERNEL, "%s.%s", peci_adev_types[idx], (const char *)priv->id->data);
+	if (!name) {
+		ret = -ENOMEM;
+		goto free_adev;
+	}
+
+	adev->name = name;
+	adev->dev.parent = &priv->device->dev;
+	adev->dev.release = adev_release;
+	adev->id = (controller->id << 16) | (priv->device->addr);
+
+	ret = auxiliary_device_init(adev);
+	if (ret)
+		goto free_name;
+
+	return adev;
+
+free_name:
+	kfree(name);
+free_adev:
+	kfree(adev);
+	return ERR_PTR(ret);
+}
+
+static void unregister_adev(void *_adev)
+{
+	struct auxiliary_device *adev = _adev;
+
+	auxiliary_device_delete(adev);
+}
+
+static int devm_adev_add(struct device *dev, int idx)
+{
+	struct peci_cpu *priv = dev_get_drvdata(dev);
+	struct auxiliary_device *adev;
+	int ret;
+
+	adev = adev_alloc(priv, idx);
+	if (IS_ERR(adev))
+		return PTR_ERR(adev);
+
+	ret = auxiliary_device_add(adev);
+	if (ret) {
+		auxiliary_device_uninit(adev);
+		return ret;
+	}
+
+	ret = devm_add_action_or_reset(&priv->device->dev, unregister_adev, adev);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+static void peci_cpu_add_adevices(struct peci_cpu *priv)
+{
+	struct device *dev = &priv->device->dev;
+	int ret, i;
+
+	for (i = 0; i < ARRAY_SIZE(peci_adev_types); i++) {
+		ret = devm_adev_add(dev, i);
+		if (ret) {
+			dev_warn(dev, "Failed to register PECI auxiliary: %s, ret = %d\n",
+				 peci_adev_types[i], ret);
+			continue;
+		}
+	}
+}
+
+static int
+peci_cpu_probe(struct peci_device *device, const struct peci_device_id *id)
+{
+	struct device *dev = &device->dev;
+	struct peci_cpu *priv;
+
+	priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
+	if (!priv)
+		return -ENOMEM;
+
+	dev_set_drvdata(dev, priv);
+	priv->device = device;
+	priv->id = id;
+
+	peci_cpu_add_adevices(priv);
+
+	return 0;
+}
+
+static const struct peci_device_id peci_cpu_device_ids[] = {
+	{ /* Haswell Xeon */
+		.family	= 6,
+		.model	= INTEL_FAM6_HASWELL_X,
+		.data	= "hsx",
+	},
+	{ /* Broadwell Xeon */
+		.family	= 6,
+		.model	= INTEL_FAM6_BROADWELL_X,
+		.data	= "bdx",
+	},
+	{ /* Broadwell Xeon D */
+		.family	= 6,
+		.model	= INTEL_FAM6_BROADWELL_D,
+		.data	= "bdxd",
+	},
+	{ /* Skylake Xeon */
+		.family	= 6,
+		.model	= INTEL_FAM6_SKYLAKE_X,
+		.data	= "skx",
+	},
+	{ /* Icelake Xeon */
+		.family	= 6,
+		.model	= INTEL_FAM6_ICELAKE_X,
+		.data	= "icx",
+	},
+	{ /* Icelake Xeon D */
+		.family	= 6,
+		.model	= INTEL_FAM6_ICELAKE_D,
+		.data	= "icxd",
+	},
+	{ }
+};
+MODULE_DEVICE_TABLE(peci, peci_cpu_device_ids);
+
+static struct peci_driver peci_cpu_driver = {
+	.probe		= peci_cpu_probe,
+	.id_table	= peci_cpu_device_ids,
+	.driver		= {
+		.name		= "peci-cpu",
+	},
+};
+module_peci_driver(peci_cpu_driver);
+
+MODULE_AUTHOR("Iwona Winiarska <iwona.winiarska@intel.com>");
+MODULE_DESCRIPTION("PECI CPU driver");
+MODULE_LICENSE("GPL");
+MODULE_IMPORT_NS(PECI);
diff --git a/drivers/peci/device.c b/drivers/peci/device.c
index 0a7660f4fc9b..235815710e5b 100644
--- a/drivers/peci/device.c
+++ b/drivers/peci/device.c
@@ -3,6 +3,7 @@
 
 #include <linux/bitfield.h>
 #include <linux/peci.h>
+#include <linux/peci-cpu.h>
 #include <linux/slab.h>
 
 #include "internal.h"
diff --git a/drivers/peci/internal.h b/drivers/peci/internal.h
index 52c02e12874f..9d75ea54504c 100644
--- a/drivers/peci/internal.h
+++ b/drivers/peci/internal.h
@@ -22,6 +22,7 @@ void peci_request_free(struct peci_request *req);
 int peci_request_status(struct peci_request *req);
 
 u64 peci_request_dib_read(struct peci_request *req);
+s16 peci_request_temp_read(struct peci_request *req);
 
 u8 peci_request_data_readb(struct peci_request *req);
 u16 peci_request_data_readw(struct peci_request *req);
@@ -36,6 +37,32 @@ struct peci_request *peci_xfer_pkg_cfg_readw(struct peci_device *device, u8 inde
 struct peci_request *peci_xfer_pkg_cfg_readl(struct peci_device *device, u8 index, u16 param);
 struct peci_request *peci_xfer_pkg_cfg_readq(struct peci_device *device, u8 index, u16 param);
 
+struct peci_request *peci_xfer_pci_cfg_local_readb(struct peci_device *device,
+						   u8 bus, u8 dev, u8 func, u16 reg);
+struct peci_request *peci_xfer_pci_cfg_local_readw(struct peci_device *device,
+						   u8 bus, u8 dev, u8 func, u16 reg);
+struct peci_request *peci_xfer_pci_cfg_local_readl(struct peci_device *device,
+						   u8 bus, u8 dev, u8 func, u16 reg);
+
+struct peci_request *peci_xfer_ep_pci_cfg_local_readb(struct peci_device *device, u8 seg,
+						      u8 bus, u8 dev, u8 func, u16 reg);
+struct peci_request *peci_xfer_ep_pci_cfg_local_readw(struct peci_device *device, u8 seg,
+						      u8 bus, u8 dev, u8 func, u16 reg);
+struct peci_request *peci_xfer_ep_pci_cfg_local_readl(struct peci_device *device, u8 seg,
+						      u8 bus, u8 dev, u8 func, u16 reg);
+
+struct peci_request *peci_xfer_ep_pci_cfg_readb(struct peci_device *device, u8 seg,
+						u8 bus, u8 dev, u8 func, u16 reg);
+struct peci_request *peci_xfer_ep_pci_cfg_readw(struct peci_device *device, u8 seg,
+						u8 bus, u8 dev, u8 func, u16 reg);
+struct peci_request *peci_xfer_ep_pci_cfg_readl(struct peci_device *device, u8 seg,
+						u8 bus, u8 dev, u8 func, u16 reg);
+
+struct peci_request *peci_xfer_ep_mmio32_readl(struct peci_device *device, u8 bar, u8 seg,
+					       u8 bus, u8 dev, u8 func, u64 offset);
+
+struct peci_request *peci_xfer_ep_mmio64_readl(struct peci_device *device, u8 bar, u8 seg,
+					       u8 bus, u8 dev, u8 func, u64 offset);
 /**
  * struct peci_device_id - PECI device data to match
  * @data: pointer to driver private data specific to device
diff --git a/drivers/peci/request.c b/drivers/peci/request.c
index 9bd1d81c6093..33da90d801d5 100644
--- a/drivers/peci/request.c
+++ b/drivers/peci/request.c
@@ -3,6 +3,7 @@
 
 #include <linux/bug.h>
 #include <linux/export.h>
+#include <linux/pci.h>
 #include <linux/peci.h>
 #include <linux/slab.h>
 #include <linux/types.h>
@@ -15,6 +16,10 @@
 #define  PECI_GET_DIB_WR_LEN		1
 #define  PECI_GET_DIB_RD_LEN		8
 
+#define PECI_GET_TEMP_CMD		0x01
+#define  PECI_GET_TEMP_WR_LEN		1
+#define  PECI_GET_TEMP_RD_LEN		2
+
 #define PECI_RDPKGCFG_CMD		0xa1
 #define  PECI_RDPKGCFG_WR_LEN		5
 #define  PECI_RDPKGCFG_RD_LEN_BASE	1
@@ -22,6 +27,45 @@
 #define  PECI_WRPKGCFG_WR_LEN_BASE	6
 #define  PECI_WRPKGCFG_RD_LEN		1
 
+#define PECI_RDIAMSR_CMD		0xb1
+#define  PECI_RDIAMSR_WR_LEN		5
+#define  PECI_RDIAMSR_RD_LEN		9
+#define PECI_WRIAMSR_CMD		0xb5
+#define PECI_RDIAMSREX_CMD		0xd1
+#define  PECI_RDIAMSREX_WR_LEN		6
+#define  PECI_RDIAMSREX_RD_LEN		9
+
+#define PECI_RDPCICFG_CMD		0x61
+#define  PECI_RDPCICFG_WR_LEN		6
+#define  PECI_RDPCICFG_RD_LEN		5
+#define  PECI_RDPCICFG_RD_LEN_MAX	24
+#define PECI_WRPCICFG_CMD		0x65
+
+#define PECI_RDPCICFGLOCAL_CMD			0xe1
+#define  PECI_RDPCICFGLOCAL_WR_LEN		5
+#define  PECI_RDPCICFGLOCAL_RD_LEN_BASE		1
+#define PECI_WRPCICFGLOCAL_CMD			0xe5
+#define  PECI_WRPCICFGLOCAL_WR_LEN_BASE		6
+#define  PECI_WRPCICFGLOCAL_RD_LEN		1
+
+#define PECI_ENDPTCFG_TYPE_LOCAL_PCI		0x03
+#define PECI_ENDPTCFG_TYPE_PCI			0x04
+#define PECI_ENDPTCFG_TYPE_MMIO			0x05
+#define PECI_ENDPTCFG_ADDR_TYPE_PCI		0x04
+#define PECI_ENDPTCFG_ADDR_TYPE_MMIO_D		0x05
+#define PECI_ENDPTCFG_ADDR_TYPE_MMIO_Q		0x06
+#define PECI_RDENDPTCFG_CMD			0xc1
+#define  PECI_RDENDPTCFG_PCI_WR_LEN		12
+#define  PECI_RDENDPTCFG_MMIO_WR_LEN_BASE	10
+#define  PECI_RDENDPTCFG_MMIO_D_WR_LEN		14
+#define  PECI_RDENDPTCFG_MMIO_Q_WR_LEN		18
+#define  PECI_RDENDPTCFG_RD_LEN_BASE		1
+#define PECI_WRENDPTCFG_CMD			0xc5
+#define  PECI_WRENDPTCFG_PCI_WR_LEN_BASE	13
+#define  PECI_WRENDPTCFG_MMIO_D_WR_LEN_BASE	15
+#define  PECI_WRENDPTCFG_MMIO_Q_WR_LEN_BASE	19
+#define  PECI_WRENDPTCFG_RD_LEN			1
+
 /* Device Specific Completion Code (CC) Definition */
 #define PECI_CC_SUCCESS				0x40
 #define PECI_CC_NEED_RETRY			0x80
@@ -202,6 +246,27 @@ struct peci_request *peci_xfer_get_dib(struct peci_device *device)
 }
 EXPORT_SYMBOL_NS_GPL(peci_xfer_get_dib, PECI);
 
+struct peci_request *peci_xfer_get_temp(struct peci_device *device)
+{
+	struct peci_request *req;
+	int ret;
+
+	req = peci_request_alloc(device, PECI_GET_TEMP_WR_LEN, PECI_GET_TEMP_RD_LEN);
+	if (!req)
+		return ERR_PTR(-ENOMEM);
+
+	req->tx.buf[0] = PECI_GET_TEMP_CMD;
+
+	ret = peci_request_xfer(req);
+	if (ret) {
+		peci_request_free(req);
+		return ERR_PTR(ret);
+	}
+
+	return req;
+}
+EXPORT_SYMBOL_NS_GPL(peci_xfer_get_temp, PECI);
+
 static struct peci_request *
 __pkg_cfg_read(struct peci_device *device, u8 index, u16 param, u8 len)
 {
@@ -226,6 +291,108 @@ __pkg_cfg_read(struct peci_device *device, u8 index, u16 param, u8 len)
 	return req;
 }
 
+static u32 __get_pci_addr(u8 bus, u8 dev, u8 func, u16 reg)
+{
+	return reg | PCI_DEVID(bus, PCI_DEVFN(dev, func)) << 12;
+}
+
+static struct peci_request *
+__pci_cfg_local_read(struct peci_device *device, u8 bus, u8 dev, u8 func, u16 reg, u8 len)
+{
+	struct peci_request *req;
+	u32 pci_addr;
+	int ret;
+
+	req = peci_request_alloc(device, PECI_RDPCICFGLOCAL_WR_LEN,
+				 PECI_RDPCICFGLOCAL_RD_LEN_BASE + len);
+	if (!req)
+		return ERR_PTR(-ENOMEM);
+
+	pci_addr = __get_pci_addr(bus, dev, func, reg);
+
+	req->tx.buf[0] = PECI_RDPCICFGLOCAL_CMD;
+	req->tx.buf[1] = 0;
+	put_unaligned_le24(pci_addr, &req->tx.buf[2]);
+
+	ret = peci_request_xfer_retry(req);
+	if (ret) {
+		peci_request_free(req);
+		return ERR_PTR(ret);
+	}
+
+	return req;
+}
+
+static struct peci_request *
+__ep_pci_cfg_read(struct peci_device *device, u8 msg_type, u8 seg,
+		  u8 bus, u8 dev, u8 func, u16 reg, u8 len)
+{
+	struct peci_request *req;
+	u32 pci_addr;
+	int ret;
+
+	req = peci_request_alloc(device, PECI_RDENDPTCFG_PCI_WR_LEN,
+				 PECI_RDENDPTCFG_RD_LEN_BASE + len);
+	if (!req)
+		return ERR_PTR(-ENOMEM);
+
+	pci_addr = __get_pci_addr(bus, dev, func, reg);
+
+	req->tx.buf[0] = PECI_RDENDPTCFG_CMD;
+	req->tx.buf[1] = 0;
+	req->tx.buf[2] = msg_type;
+	req->tx.buf[3] = 0;
+	req->tx.buf[4] = 0;
+	req->tx.buf[5] = 0;
+	req->tx.buf[6] = PECI_ENDPTCFG_ADDR_TYPE_PCI;
+	req->tx.buf[7] = seg; /* PCI Segment */
+	put_unaligned_le32(pci_addr, &req->tx.buf[8]);
+
+	ret = peci_request_xfer_retry(req);
+	if (ret) {
+		peci_request_free(req);
+		return ERR_PTR(ret);
+	}
+
+	return req;
+}
+
+static struct peci_request *
+__ep_mmio_read(struct peci_device *device, u8 bar, u8 addr_type, u8 seg,
+	       u8 bus, u8 dev, u8 func, u64 offset, u8 tx_len, u8 len)
+{
+	struct peci_request *req;
+	int ret;
+
+	req = peci_request_alloc(device, tx_len, PECI_RDENDPTCFG_RD_LEN_BASE + len);
+	if (!req)
+		return ERR_PTR(-ENOMEM);
+
+	req->tx.buf[0] = PECI_RDENDPTCFG_CMD;
+	req->tx.buf[1] = 0;
+	req->tx.buf[2] = PECI_ENDPTCFG_TYPE_MMIO;
+	req->tx.buf[3] = 0; /* Endpoint ID */
+	req->tx.buf[4] = 0; /* Reserved */
+	req->tx.buf[5] = bar;
+	req->tx.buf[6] = addr_type;
+	req->tx.buf[7] = seg; /* PCI Segment */
+	req->tx.buf[8] = PCI_DEVFN(dev, func);
+	req->tx.buf[9] = bus; /* PCI Bus */
+
+	if (addr_type == PECI_ENDPTCFG_ADDR_TYPE_MMIO_D)
+		put_unaligned_le32(offset, &req->tx.buf[10]);
+	else
+		put_unaligned_le64(offset, &req->tx.buf[10]);
+
+	ret = peci_request_xfer_retry(req);
+	if (ret) {
+		peci_request_free(req);
+		return ERR_PTR(ret);
+	}
+
+	return req;
+}
+
 u8 peci_request_data_readb(struct peci_request *req)
 {
 	return req->rx.buf[1];
@@ -256,6 +423,12 @@ u64 peci_request_dib_read(struct peci_request *req)
 }
 EXPORT_SYMBOL_NS_GPL(peci_request_dib_read, PECI);
 
+s16 peci_request_temp_read(struct peci_request *req)
+{
+	return get_unaligned_le16(&req->rx.buf[0]);
+}
+EXPORT_SYMBOL_NS_GPL(peci_request_temp_read, PECI);
+
 #define __read_pkg_config(x, type) \
 struct peci_request *peci_xfer_pkg_cfg_##x(struct peci_device *device, u8 index, u16 param) \
 { \
@@ -267,3 +440,43 @@ __read_pkg_config(readb, u8);
 __read_pkg_config(readw, u16);
 __read_pkg_config(readl, u32);
 __read_pkg_config(readq, u64);
+
+#define __read_pci_config_local(x, type) \
+struct peci_request * \
+peci_xfer_pci_cfg_local_##x(struct peci_device *device, u8 bus, u8 dev, u8 func, u16 reg) \
+{ \
+	return __pci_cfg_local_read(device, bus, dev, func, reg, sizeof(type)); \
+} \
+EXPORT_SYMBOL_NS_GPL(peci_xfer_pci_cfg_local_##x, PECI)
+
+__read_pci_config_local(readb, u8);
+__read_pci_config_local(readw, u16);
+__read_pci_config_local(readl, u32);
+
+#define __read_ep_pci_config(x, msg_type, type) \
+struct peci_request * \
+peci_xfer_ep_pci_cfg_##x(struct peci_device *device, u8 seg, u8 bus, u8 dev, u8 func, u16 reg) \
+{ \
+	return __ep_pci_cfg_read(device, msg_type, seg, bus, dev, func, reg, sizeof(type)); \
+} \
+EXPORT_SYMBOL_NS_GPL(peci_xfer_ep_pci_cfg_##x, PECI)
+
+__read_ep_pci_config(local_readb, PECI_ENDPTCFG_TYPE_LOCAL_PCI, u8);
+__read_ep_pci_config(local_readw, PECI_ENDPTCFG_TYPE_LOCAL_PCI, u16);
+__read_ep_pci_config(local_readl, PECI_ENDPTCFG_TYPE_LOCAL_PCI, u32);
+__read_ep_pci_config(readb, PECI_ENDPTCFG_TYPE_PCI, u8);
+__read_ep_pci_config(readw, PECI_ENDPTCFG_TYPE_PCI, u16);
+__read_ep_pci_config(readl, PECI_ENDPTCFG_TYPE_PCI, u32);
+
+#define __read_ep_mmio(x, y, addr_type, type1, type2) \
+struct peci_request *peci_xfer_ep_mmio##y##_##x(struct peci_device *device, u8 bar, u8 seg, \
+					   u8 bus, u8 dev, u8 func, u64 offset) \
+{ \
+	return __ep_mmio_read(device, bar, addr_type, seg, bus, dev, func, \
+			      offset, PECI_RDENDPTCFG_MMIO_WR_LEN_BASE + sizeof(type1), \
+			      sizeof(type2)); \
+} \
+EXPORT_SYMBOL_NS_GPL(peci_xfer_ep_mmio##y##_##x, PECI)
+
+__read_ep_mmio(readl, 32, PECI_ENDPTCFG_ADDR_TYPE_MMIO_D, u32, u32);
+__read_ep_mmio(readl, 64, PECI_ENDPTCFG_ADDR_TYPE_MMIO_Q, u64, u32);
diff --git a/include/linux/peci-cpu.h b/include/linux/peci-cpu.h
new file mode 100644
index 000000000000..ff8ae9c26c80
--- /dev/null
+++ b/include/linux/peci-cpu.h
@@ -0,0 +1,40 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/* Copyright (c) 2021 Intel Corporation */
+
+#ifndef __LINUX_PECI_CPU_H
+#define __LINUX_PECI_CPU_H
+
+#include <linux/types.h>
+
+#include "../../arch/x86/include/asm/intel-family.h"
+
+#define PECI_PCS_PKG_ID			0  /* Package Identifier Read */
+#define  PECI_PKG_ID_CPU_ID		0x0000  /* CPUID Info */
+#define  PECI_PKG_ID_PLATFORM_ID	0x0001  /* Platform ID */
+#define  PECI_PKG_ID_DEVICE_ID		0x0002  /* Uncore Device ID */
+#define  PECI_PKG_ID_MAX_THREAD_ID	0x0003  /* Max Thread ID */
+#define  PECI_PKG_ID_MICROCODE_REV	0x0004  /* CPU Microcode Update Revision */
+#define  PECI_PKG_ID_MCA_ERROR_LOG	0x0005  /* Machine Check Status */
+#define PECI_PCS_MODULE_TEMP		9  /* Per Core DTS Temperature Read */
+#define PECI_PCS_THERMAL_MARGIN		10 /* DTS thermal margin */
+#define PECI_PCS_DDR_DIMM_TEMP		14 /* DDR DIMM Temperature */
+#define PECI_PCS_TEMP_TARGET		16 /* Temperature Target Read */
+#define PECI_PCS_TDP_UNITS		30 /* Units for power/energy registers */
+
+struct peci_device;
+
+int peci_temp_read(struct peci_device *device, s16 *temp_raw);
+
+int peci_pcs_read(struct peci_device *device, u8 index,
+		  u16 param, u32 *data);
+
+int peci_pci_local_read(struct peci_device *device, u8 bus, u8 dev,
+			u8 func, u16 reg, u32 *data);
+
+int peci_ep_pci_local_read(struct peci_device *device, u8 seg,
+			   u8 bus, u8 dev, u8 func, u16 reg, u32 *data);
+
+int peci_mmio_read(struct peci_device *device, u8 bar, u8 seg,
+		   u8 bus, u8 dev, u8 func, u64 address, u32 *data);
+
+#endif /* __LINUX_PECI_CPU_H */
diff --git a/include/linux/peci.h b/include/linux/peci.h
index dcf1c53f4e40..ce43705eaac4 100644
--- a/include/linux/peci.h
+++ b/include/linux/peci.h
@@ -14,14 +14,6 @@
  */
 #define PECI_REQUEST_MAX_BUF_SIZE 32
 
-#define PECI_PCS_PKG_ID			0  /* Package Identifier Read */
-#define  PECI_PKG_ID_CPU_ID		0x0000  /* CPUID Info */
-#define  PECI_PKG_ID_PLATFORM_ID	0x0001  /* Platform ID */
-#define  PECI_PKG_ID_DEVICE_ID		0x0002  /* Uncore Device ID */
-#define  PECI_PKG_ID_MAX_THREAD_ID	0x0003  /* Max Thread ID */
-#define  PECI_PKG_ID_MICROCODE_REV	0x0004  /* CPU Microcode Update Revision */
-#define  PECI_PKG_ID_MCA_ERROR_LOG	0x0005  /* Machine Check Status */
-
 struct peci_controller;
 struct peci_request;
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v3 10/13] hwmon: peci: Add cputemp driver
  2021-11-15 18:25 [PATCH v3 00/13] Introduce PECI subsystem Iwona Winiarska
                   ` (8 preceding siblings ...)
  2021-11-15 18:25 ` [PATCH v3 09/13] peci: Add peci-cpu driver Iwona Winiarska
@ 2021-11-15 18:25 ` Iwona Winiarska
  2021-11-16  0:52   ` Guenter Roeck
  2021-11-15 18:25 ` [PATCH v3 11/13] hwmon: peci: Add dimmtemp driver Iwona Winiarska
                   ` (4 subsequent siblings)
  14 siblings, 1 reply; 24+ messages in thread
From: Iwona Winiarska @ 2021-11-15 18:25 UTC (permalink / raw)
  To: linux-kernel, openbmc, Greg Kroah-Hartman
  Cc: linux-aspeed, linux-doc, Dave Hansen, Zev Weiss, Jae Hyun Yoo,
	Jonathan Corbet, Pierre-Louis Bossart, Guenter Roeck, devicetree,
	Jean Delvare, Arnd Bergmann, Rob Herring, Borislav Petkov,
	Iwona Winiarska, Dan Williams, Andy Shevchenko, linux-arm-kernel,
	linux-hwmon, Tony Luck, Andrew Jeffery, Randy Dunlap,
	Olof Johansson

Add peci-cputemp driver for Digital Thermal Sensor (DTS) thermal
readings of the processor package and processor cores that are
accessible via the PECI interface.

The main use case for the driver (and PECI interface) is out-of-band
management, where we're able to obtain the DTS readings from an external
entity connected with PECI, e.g. BMC on server platforms.

Co-developed-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
---
 MAINTAINERS                  |   7 +
 drivers/hwmon/Kconfig        |   2 +
 drivers/hwmon/Makefile       |   1 +
 drivers/hwmon/peci/Kconfig   |  18 ++
 drivers/hwmon/peci/Makefile  |   5 +
 drivers/hwmon/peci/common.h  |  58 ++++
 drivers/hwmon/peci/cputemp.c | 592 +++++++++++++++++++++++++++++++++++
 7 files changed, 683 insertions(+)
 create mode 100644 drivers/hwmon/peci/Kconfig
 create mode 100644 drivers/hwmon/peci/Makefile
 create mode 100644 drivers/hwmon/peci/common.h
 create mode 100644 drivers/hwmon/peci/cputemp.c

diff --git a/MAINTAINERS b/MAINTAINERS
index d751590f32e3..0094370be6c4 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -14920,6 +14920,13 @@ L:	platform-driver-x86@vger.kernel.org
 S:	Maintained
 F:	drivers/platform/x86/peaq-wmi.c
 
+PECI HARDWARE MONITORING DRIVERS
+M:	Iwona Winiarska <iwona.winiarska@intel.com>
+R:	Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
+L:	linux-hwmon@vger.kernel.org
+S:	Supported
+F:	drivers/hwmon/peci/
+
 PECI SUBSYSTEM
 M:	Iwona Winiarska <iwona.winiarska@intel.com>
 R:	Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
diff --git a/drivers/hwmon/Kconfig b/drivers/hwmon/Kconfig
index 64bd3dfba2c4..4858f7cfa81e 100644
--- a/drivers/hwmon/Kconfig
+++ b/drivers/hwmon/Kconfig
@@ -1528,6 +1528,8 @@ config SENSORS_PCF8591
 	  These devices are hard to detect and rarely found on mainstream
 	  hardware. If unsure, say N.
 
+source "drivers/hwmon/peci/Kconfig"
+
 source "drivers/hwmon/pmbus/Kconfig"
 
 config SENSORS_PWM_FAN
diff --git a/drivers/hwmon/Makefile b/drivers/hwmon/Makefile
index baee6a8d4dd1..2c0e210a28b8 100644
--- a/drivers/hwmon/Makefile
+++ b/drivers/hwmon/Makefile
@@ -204,6 +204,7 @@ obj-$(CONFIG_SENSORS_WM8350)	+= wm8350-hwmon.o
 obj-$(CONFIG_SENSORS_XGENE)	+= xgene-hwmon.o
 
 obj-$(CONFIG_SENSORS_OCC)	+= occ/
+obj-$(CONFIG_SENSORS_PECI)	+= peci/
 obj-$(CONFIG_PMBUS)		+= pmbus/
 
 ccflags-$(CONFIG_HWMON_DEBUG_CHIP) := -DDEBUG
diff --git a/drivers/hwmon/peci/Kconfig b/drivers/hwmon/peci/Kconfig
new file mode 100644
index 000000000000..e10eed68d70a
--- /dev/null
+++ b/drivers/hwmon/peci/Kconfig
@@ -0,0 +1,18 @@
+# SPDX-License-Identifier: GPL-2.0-only
+
+config SENSORS_PECI_CPUTEMP
+	tristate "PECI CPU temperature monitoring client"
+	depends on PECI
+	select SENSORS_PECI
+	select PECI_CPU
+	help
+	  If you say yes here you get support for the generic Intel PECI
+	  cputemp driver which provides Digital Thermal Sensor (DTS) thermal
+	  readings of the CPU package and CPU cores that are accessible via
+	  the processor PECI interface.
+
+	  This driver can also be built as a module. If so, the module
+	  will be called peci-cputemp.
+
+config SENSORS_PECI
+	tristate
diff --git a/drivers/hwmon/peci/Makefile b/drivers/hwmon/peci/Makefile
new file mode 100644
index 000000000000..e8a0ada5ab1f
--- /dev/null
+++ b/drivers/hwmon/peci/Makefile
@@ -0,0 +1,5 @@
+# SPDX-License-Identifier: GPL-2.0-only
+
+peci-cputemp-y := cputemp.o
+
+obj-$(CONFIG_SENSORS_PECI_CPUTEMP)	+= peci-cputemp.o
diff --git a/drivers/hwmon/peci/common.h b/drivers/hwmon/peci/common.h
new file mode 100644
index 000000000000..734506b0eca2
--- /dev/null
+++ b/drivers/hwmon/peci/common.h
@@ -0,0 +1,58 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/* Copyright (c) 2021 Intel Corporation */
+
+#include <linux/mutex.h>
+#include <linux/types.h>
+
+#ifndef __PECI_HWMON_COMMON_H
+#define __PECI_HWMON_COMMON_H
+
+#define PECI_HWMON_UPDATE_INTERVAL	HZ
+
+/**
+ * struct peci_sensor_state - PECI state information
+ * @valid: flag to indicate the sensor value is valid
+ * @last_updated: time of the last update in jiffies
+ * @lock: mutex to protect sensor access
+ */
+struct peci_sensor_state {
+	bool valid;
+	unsigned long last_updated;
+	struct mutex lock; /* protect sensor access */
+};
+
+/**
+ * struct peci_sensor_data - PECI sensor information
+ * @value: sensor value in milli units
+ * @state: sensor update state
+ */
+
+struct peci_sensor_data {
+	s32 value;
+	struct peci_sensor_state state;
+};
+
+/**
+ * peci_sensor_need_update() - check whether sensor update is needed or not
+ * @sensor: pointer to sensor data struct
+ *
+ * Return: true if update is needed, false if not.
+ */
+
+static inline bool peci_sensor_need_update(struct peci_sensor_state *state)
+{
+	return !state->valid ||
+	       time_after(jiffies, state->last_updated + PECI_HWMON_UPDATE_INTERVAL);
+}
+
+/**
+ * peci_sensor_mark_updated() - mark the sensor is updated
+ * @sensor: pointer to sensor data struct
+ */
+static inline void peci_sensor_mark_updated(struct peci_sensor_state *state)
+{
+	state->valid = true;
+	state->last_updated = jiffies;
+}
+
+#endif /* __PECI_HWMON_COMMON_H */
diff --git a/drivers/hwmon/peci/cputemp.c b/drivers/hwmon/peci/cputemp.c
new file mode 100644
index 000000000000..d264d6c2f437
--- /dev/null
+++ b/drivers/hwmon/peci/cputemp.c
@@ -0,0 +1,592 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright (c) 2018-2021 Intel Corporation
+
+#include <linux/auxiliary_bus.h>
+#include <linux/bitfield.h>
+#include <linux/bitops.h>
+#include <linux/hwmon.h>
+#include <linux/jiffies.h>
+#include <linux/module.h>
+#include <linux/peci.h>
+#include <linux/peci-cpu.h>
+#include <linux/units.h>
+
+#include "common.h"
+
+#define CORE_NUMS_MAX		64
+
+#define BASE_CHANNEL_NUMS	5
+#define CPUTEMP_CHANNEL_NUMS	(BASE_CHANNEL_NUMS + CORE_NUMS_MAX)
+
+#define TEMP_TARGET_FAN_TEMP_MASK	GENMASK(15, 8)
+#define TEMP_TARGET_REF_TEMP_MASK	GENMASK(23, 16)
+#define TEMP_TARGET_TJ_OFFSET_MASK	GENMASK(29, 24)
+
+#define DTS_MARGIN_MASK		GENMASK(15, 0)
+#define PCS_MODULE_TEMP_MASK	GENMASK(15, 0)
+
+struct resolved_cores_reg {
+	u8 bus;
+	u8 dev;
+	u8 func;
+	u8 offset;
+};
+
+struct cpu_info {
+	struct resolved_cores_reg *reg;
+	u8 min_peci_revision;
+	s32 (*thermal_margin_to_millidegree)(s32 val);
+};
+
+struct peci_temp_target {
+	s32 tcontrol;
+	s32 tthrottle;
+	s32 tjmax;
+	struct peci_sensor_state state;
+};
+
+enum peci_temp_target_type {
+	tcontrol_type,
+	tthrottle_type,
+	tjmax_type,
+	crit_hyst_type,
+};
+
+struct peci_cputemp {
+	struct peci_device *peci_dev;
+	struct device *dev;
+	const char *name;
+	const struct cpu_info *gen_info;
+	struct {
+		struct peci_temp_target target;
+		struct peci_sensor_data die;
+		struct peci_sensor_data dts;
+		struct peci_sensor_data core[CORE_NUMS_MAX];
+	} temp;
+	const char **coretemp_label;
+	DECLARE_BITMAP(core_mask, CORE_NUMS_MAX);
+};
+
+enum cputemp_channels {
+	channel_die,
+	channel_dts,
+	channel_tcontrol,
+	channel_tthrottle,
+	channel_tjmax,
+	channel_core,
+};
+
+static const char * const cputemp_label[BASE_CHANNEL_NUMS] = {
+	"Die",
+	"DTS",
+	"Tcontrol",
+	"Tthrottle",
+	"Tjmax",
+};
+
+static int update_temp_target(struct peci_cputemp *priv)
+{
+	s32 tthrottle_offset, tcontrol_margin;
+	u32 pcs;
+	int ret;
+
+	if (!peci_sensor_need_update(&priv->temp.target.state))
+		return 0;
+
+	ret = peci_pcs_read(priv->peci_dev, PECI_PCS_TEMP_TARGET, 0, &pcs);
+	if (ret)
+		return ret;
+
+	priv->temp.target.tjmax =
+		FIELD_GET(TEMP_TARGET_REF_TEMP_MASK, pcs) * MILLIDEGREE_PER_DEGREE;
+
+	tcontrol_margin = FIELD_GET(TEMP_TARGET_FAN_TEMP_MASK, pcs);
+	tcontrol_margin = sign_extend32(tcontrol_margin, 7) * MILLIDEGREE_PER_DEGREE;
+	priv->temp.target.tcontrol = priv->temp.target.tjmax - tcontrol_margin;
+
+	tthrottle_offset = FIELD_GET(TEMP_TARGET_TJ_OFFSET_MASK, pcs) * MILLIDEGREE_PER_DEGREE;
+	priv->temp.target.tthrottle = priv->temp.target.tjmax - tthrottle_offset;
+
+	peci_sensor_mark_updated(&priv->temp.target.state);
+
+	return 0;
+}
+
+static int get_temp_target(struct peci_cputemp *priv, enum peci_temp_target_type type, long *val)
+{
+	int ret;
+
+	mutex_lock(&priv->temp.target.state.lock);
+
+	ret = update_temp_target(priv);
+	if (ret)
+		goto unlock;
+
+	switch (type) {
+	case tcontrol_type:
+		*val = priv->temp.target.tcontrol;
+		break;
+	case tthrottle_type:
+		*val = priv->temp.target.tthrottle;
+		break;
+	case tjmax_type:
+		*val = priv->temp.target.tjmax;
+		break;
+	case crit_hyst_type:
+		*val = priv->temp.target.tjmax - priv->temp.target.tcontrol;
+		break;
+	default:
+		ret = -EOPNOTSUPP;
+		break;
+	}
+unlock:
+	mutex_unlock(&priv->temp.target.state.lock);
+
+	return ret;
+}
+
+/*
+ * Error codes:
+ *   0x8000: General sensor error
+ *   0x8001: Reserved
+ *   0x8002: Underflow on reading value
+ *   0x8003-0x81ff: Reserved
+ */
+static bool dts_valid(s32 val)
+{
+	return val < 0x8000 || val > 0x81ff;
+}
+
+/*
+ * Processors return a value of DTS reading in S10.6 fixed point format
+ * (16 bits: 10-bit signed magnitude, 6-bit fraction).
+ */
+static s32 dts_ten_dot_six_to_millidegree(s32 val)
+{
+	return sign_extend32(val, 15) * MILLIDEGREE_PER_DEGREE / 64;
+}
+
+/*
+ * For older processors, thermal margin reading is returned in S8.8 fixed
+ * point format (16 bits: 8-bit signed magnitude, 8-bit fraction).
+ */
+static s32 dts_eight_dot_eight_to_millidegree(s32 val)
+{
+	return sign_extend32(val, 15) * MILLIDEGREE_PER_DEGREE / 256;
+}
+
+static int get_die_temp(struct peci_cputemp *priv, long *val)
+{
+	int ret = 0;
+	long tjmax;
+	s16 temp;
+
+	mutex_lock(&priv->temp.die.state.lock);
+	if (!peci_sensor_need_update(&priv->temp.die.state))
+		goto skip_update;
+
+	ret = peci_temp_read(priv->peci_dev, &temp);
+	if (ret)
+		goto err_unlock;
+
+	if (!dts_valid(temp)) {
+		ret = -EIO;
+		goto err_unlock;
+	}
+
+	ret = get_temp_target(priv, tjmax_type, &tjmax);
+	if (ret)
+		goto err_unlock;
+
+	priv->temp.die.value = (s32)tjmax + dts_ten_dot_six_to_millidegree(temp);
+
+	peci_sensor_mark_updated(&priv->temp.die.state);
+
+skip_update:
+	*val = priv->temp.die.value;
+err_unlock:
+	mutex_unlock(&priv->temp.die.state.lock);
+	return ret;
+}
+
+static int get_dts(struct peci_cputemp *priv, long *val)
+{
+	int ret = 0;
+	s32 thermal_margin;
+	long tcontrol;
+	u32 pcs;
+
+	mutex_lock(&priv->temp.dts.state.lock);
+	if (!peci_sensor_need_update(&priv->temp.dts.state))
+		goto skip_update;
+
+	ret = peci_pcs_read(priv->peci_dev, PECI_PCS_THERMAL_MARGIN, 0, &pcs);
+	if (ret)
+		goto err_unlock;
+
+	thermal_margin = FIELD_GET(DTS_MARGIN_MASK, pcs);
+	if (!dts_valid(thermal_margin)) {
+		ret = -EIO;
+		goto err_unlock;
+	}
+
+	ret = get_temp_target(priv, tcontrol_type, &tcontrol);
+	if (ret)
+		goto err_unlock;
+
+	/* Note that the tcontrol should be available before calling it */
+	priv->temp.dts.value =
+		(s32)tcontrol - priv->gen_info->thermal_margin_to_millidegree(thermal_margin);
+
+	peci_sensor_mark_updated(&priv->temp.dts.state);
+
+skip_update:
+	*val = priv->temp.dts.value;
+err_unlock:
+	mutex_unlock(&priv->temp.dts.state.lock);
+	return ret;
+}
+
+static int get_core_temp(struct peci_cputemp *priv, int core_index, long *val)
+{
+	int ret = 0;
+	s32 core_dts_margin;
+	long tjmax;
+	u32 pcs;
+
+	mutex_lock(&priv->temp.core[core_index].state.lock);
+	if (!peci_sensor_need_update(&priv->temp.core[core_index].state))
+		goto skip_update;
+
+	ret = peci_pcs_read(priv->peci_dev, PECI_PCS_MODULE_TEMP, core_index, &pcs);
+	if (ret)
+		goto err_unlock;
+
+	core_dts_margin = FIELD_GET(PCS_MODULE_TEMP_MASK, pcs);
+	if (!dts_valid(core_dts_margin)) {
+		ret = -EIO;
+		goto err_unlock;
+	}
+
+	ret = get_temp_target(priv, tjmax_type, &tjmax);
+	if (ret)
+		goto err_unlock;
+
+	/* Note that the tjmax should be available before calling it */
+	priv->temp.core[core_index].value =
+		(s32)tjmax + dts_ten_dot_six_to_millidegree(core_dts_margin);
+
+	peci_sensor_mark_updated(&priv->temp.core[core_index].state);
+
+skip_update:
+	*val = priv->temp.core[core_index].value;
+err_unlock:
+	mutex_unlock(&priv->temp.core[core_index].state.lock);
+	return ret;
+}
+
+static int cputemp_read_string(struct device *dev, enum hwmon_sensor_types type,
+			       u32 attr, int channel, const char **str)
+{
+	struct peci_cputemp *priv = dev_get_drvdata(dev);
+
+	if (attr != hwmon_temp_label)
+		return -EOPNOTSUPP;
+
+	*str = channel < channel_core ?
+		cputemp_label[channel] : priv->coretemp_label[channel - channel_core];
+
+	return 0;
+}
+
+static int cputemp_read(struct device *dev, enum hwmon_sensor_types type,
+			u32 attr, int channel, long *val)
+{
+	struct peci_cputemp *priv = dev_get_drvdata(dev);
+
+	switch (attr) {
+	case hwmon_temp_input:
+		switch (channel) {
+		case channel_die:
+			return get_die_temp(priv, val);
+		case channel_dts:
+			return get_dts(priv, val);
+		case channel_tcontrol:
+			return get_temp_target(priv, tcontrol_type, val);
+		case channel_tthrottle:
+			return get_temp_target(priv, tthrottle_type, val);
+		case channel_tjmax:
+			return get_temp_target(priv, tjmax_type, val);
+		default:
+			return get_core_temp(priv, channel - channel_core, val);
+		}
+		break;
+	case hwmon_temp_max:
+		return get_temp_target(priv, tcontrol_type, val);
+	case hwmon_temp_crit:
+		return get_temp_target(priv, tjmax_type, val);
+	case hwmon_temp_crit_hyst:
+		return get_temp_target(priv, crit_hyst_type, val);
+	default:
+		return -EOPNOTSUPP;
+	}
+
+	return 0;
+}
+
+static umode_t cputemp_is_visible(const void *data, enum hwmon_sensor_types type,
+				  u32 attr, int channel)
+{
+	const struct peci_cputemp *priv = data;
+
+	if (channel > CPUTEMP_CHANNEL_NUMS)
+		return 0;
+
+	if (channel < channel_core)
+		return 0444;
+
+	if (test_bit(channel - channel_core, priv->core_mask))
+		return 0444;
+
+	return 0;
+}
+
+static int init_core_mask(struct peci_cputemp *priv)
+{
+	struct peci_device *peci_dev = priv->peci_dev;
+	struct resolved_cores_reg *reg = priv->gen_info->reg;
+	u64 core_mask;
+	u32 data;
+	int ret;
+
+	/* Get the RESOLVED_CORES register value */
+	switch (peci_dev->info.model) {
+	case INTEL_FAM6_ICELAKE_X:
+	case INTEL_FAM6_ICELAKE_D:
+		ret = peci_ep_pci_local_read(peci_dev, 0, reg->bus, reg->dev,
+					     reg->func, reg->offset + 4, &data);
+		if (ret)
+			return ret;
+
+		core_mask = (u64)data << 32;
+
+		ret = peci_ep_pci_local_read(peci_dev, 0, reg->bus, reg->dev,
+					     reg->func, reg->offset, &data);
+		if (ret)
+			return ret;
+
+		core_mask |= data;
+
+		break;
+	default:
+		ret = peci_pci_local_read(peci_dev, reg->bus, reg->dev,
+					  reg->func, reg->offset, &data);
+		if (ret)
+			return ret;
+
+		core_mask = data;
+
+		break;
+	}
+
+	if (!core_mask)
+		return -EIO;
+
+	bitmap_from_u64(priv->core_mask, core_mask);
+
+	return 0;
+}
+
+static int create_temp_label(struct peci_cputemp *priv)
+{
+	unsigned long core_max = find_last_bit(priv->core_mask, CORE_NUMS_MAX);
+	int i;
+
+	priv->coretemp_label = devm_kzalloc(priv->dev, core_max * sizeof(char *), GFP_KERNEL);
+	if (!priv->coretemp_label)
+		return -ENOMEM;
+
+	for_each_set_bit(i, priv->core_mask, CORE_NUMS_MAX) {
+		priv->coretemp_label[i] = devm_kasprintf(priv->dev, GFP_KERNEL, "Core %d", i);
+		if (!priv->coretemp_label[i])
+			return -ENOMEM;
+	}
+
+	return 0;
+}
+
+static void check_resolved_cores(struct peci_cputemp *priv)
+{
+	/*
+	 * Failure to resolve cores is non-critical, we're still able to
+	 * provide other sensor data.
+	 */
+
+	if (init_core_mask(priv))
+		return;
+
+	if (create_temp_label(priv))
+		bitmap_zero(priv->core_mask, CORE_NUMS_MAX);
+}
+
+static void sensor_init(struct peci_cputemp *priv)
+{
+	int i;
+
+	mutex_init(&priv->temp.target.state.lock);
+	mutex_init(&priv->temp.die.state.lock);
+	mutex_init(&priv->temp.dts.state.lock);
+
+	for_each_set_bit(i, priv->core_mask, CORE_NUMS_MAX)
+		mutex_init(&priv->temp.core[i].state.lock);
+}
+
+static const struct hwmon_ops peci_cputemp_ops = {
+	.is_visible = cputemp_is_visible,
+	.read_string = cputemp_read_string,
+	.read = cputemp_read,
+};
+
+static const u32 peci_cputemp_temp_channel_config[] = {
+	/* Die temperature */
+	HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_CRIT | HWMON_T_CRIT_HYST,
+	/* DTS margin */
+	HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_CRIT | HWMON_T_CRIT_HYST,
+	/* Tcontrol temperature */
+	HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_CRIT,
+	/* Tthrottle temperature */
+	HWMON_T_LABEL | HWMON_T_INPUT,
+	/* Tjmax temperature */
+	HWMON_T_LABEL | HWMON_T_INPUT,
+	/* Core temperature - for all core channels */
+	[channel_core ... CPUTEMP_CHANNEL_NUMS - 1] = HWMON_T_LABEL | HWMON_T_INPUT,
+	0
+};
+
+static const struct hwmon_channel_info peci_cputemp_temp_channel = {
+	.type = hwmon_temp,
+	.config = peci_cputemp_temp_channel_config,
+};
+
+static const struct hwmon_channel_info *peci_cputemp_info[] = {
+	&peci_cputemp_temp_channel,
+	NULL
+};
+
+static const struct hwmon_chip_info peci_cputemp_chip_info = {
+	.ops = &peci_cputemp_ops,
+	.info = peci_cputemp_info,
+};
+
+static int peci_cputemp_probe(struct auxiliary_device *adev,
+			      const struct auxiliary_device_id *id)
+{
+	struct device *dev = &adev->dev;
+	struct peci_device *peci_dev = to_peci_device(dev->parent);
+	struct peci_cputemp *priv;
+	struct device *hwmon_dev;
+
+	priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
+	if (!priv)
+		return -ENOMEM;
+
+	priv->name = devm_kasprintf(dev, GFP_KERNEL, "peci_cputemp.cpu%d",
+				    peci_dev->info.socket_id);
+	if (!priv->name)
+		return -ENOMEM;
+
+	priv->dev = dev;
+	priv->peci_dev = peci_dev;
+	priv->gen_info = (const struct cpu_info *)id->driver_data;
+
+	/*
+	 * This is just a sanity check. Since we're using commands that are
+	 * guaranteed to be supported on a given platform, we should never see
+	 * revision lower than expected.
+	 */
+	if (peci_dev->info.peci_revision < priv->gen_info->min_peci_revision)
+		dev_warn(priv->dev,
+			 "Unexpected PECI revision %#x, some features may be unavailable\n",
+			 peci_dev->info.peci_revision);
+
+	check_resolved_cores(priv);
+
+	sensor_init(priv);
+
+	hwmon_dev = devm_hwmon_device_register_with_info(priv->dev, priv->name,
+							 priv, &peci_cputemp_chip_info, NULL);
+
+	return PTR_ERR_OR_ZERO(hwmon_dev);
+}
+
+/*
+ * RESOLVED_CORES PCI configuration register may have different location on
+ * different platforms.
+ */
+static struct resolved_cores_reg resolved_cores_reg_hsx = {
+	.bus = 1,
+	.dev = 30,
+	.func = 3,
+	.offset = 0xb4,
+};
+
+static struct resolved_cores_reg resolved_cores_reg_icx = {
+	.bus = 14,
+	.dev = 30,
+	.func = 3,
+	.offset = 0xd0,
+};
+
+static const struct cpu_info cpu_hsx = {
+	.reg		= &resolved_cores_reg_hsx,
+	.min_peci_revision = 0x33,
+	.thermal_margin_to_millidegree = &dts_eight_dot_eight_to_millidegree,
+};
+
+static const struct cpu_info cpu_icx = {
+	.reg		= &resolved_cores_reg_icx,
+	.min_peci_revision = 0x40,
+	.thermal_margin_to_millidegree = &dts_ten_dot_six_to_millidegree,
+};
+
+static const struct auxiliary_device_id peci_cputemp_ids[] = {
+	{
+		.name = "peci_cpu.cputemp.hsx",
+		.driver_data = (kernel_ulong_t)&cpu_hsx,
+	},
+	{
+		.name = "peci_cpu.cputemp.bdx",
+		.driver_data = (kernel_ulong_t)&cpu_hsx,
+	},
+	{
+		.name = "peci_cpu.cputemp.bdxd",
+		.driver_data = (kernel_ulong_t)&cpu_hsx,
+	},
+	{
+		.name = "peci_cpu.cputemp.skx",
+		.driver_data = (kernel_ulong_t)&cpu_hsx,
+	},
+	{
+		.name = "peci_cpu.cputemp.icx",
+		.driver_data = (kernel_ulong_t)&cpu_icx,
+	},
+	{
+		.name = "peci_cpu.cputemp.icxd",
+		.driver_data = (kernel_ulong_t)&cpu_icx,
+	},
+	{ }
+};
+MODULE_DEVICE_TABLE(auxiliary, peci_cputemp_ids);
+
+static struct auxiliary_driver peci_cputemp_driver = {
+	.probe		= peci_cputemp_probe,
+	.id_table	= peci_cputemp_ids,
+};
+
+module_auxiliary_driver(peci_cputemp_driver);
+
+MODULE_AUTHOR("Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>");
+MODULE_AUTHOR("Iwona Winiarska <iwona.winiarska@intel.com>");
+MODULE_DESCRIPTION("PECI cputemp driver");
+MODULE_LICENSE("GPL");
+MODULE_IMPORT_NS(PECI_CPU);
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v3 11/13] hwmon: peci: Add dimmtemp driver
  2021-11-15 18:25 [PATCH v3 00/13] Introduce PECI subsystem Iwona Winiarska
                   ` (9 preceding siblings ...)
  2021-11-15 18:25 ` [PATCH v3 10/13] hwmon: peci: Add cputemp driver Iwona Winiarska
@ 2021-11-15 18:25 ` Iwona Winiarska
  2021-11-15 18:25 ` [PATCH v3 12/13] docs: hwmon: Document PECI drivers Iwona Winiarska
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 24+ messages in thread
From: Iwona Winiarska @ 2021-11-15 18:25 UTC (permalink / raw)
  To: linux-kernel, openbmc, Greg Kroah-Hartman
  Cc: linux-aspeed, linux-doc, Dave Hansen, Zev Weiss, Jae Hyun Yoo,
	Jonathan Corbet, Pierre-Louis Bossart, Guenter Roeck, devicetree,
	Jean Delvare, Arnd Bergmann, Rob Herring, Borislav Petkov,
	Iwona Winiarska, Dan Williams, Andy Shevchenko, linux-arm-kernel,
	linux-hwmon, Tony Luck, Andrew Jeffery, Randy Dunlap,
	Olof Johansson

Add peci-dimmtemp driver for Temperature Sensor on DIMM readings that
are accessible via the processor PECI interface.

The main use case for the driver (and PECI interface) is out-of-band
management, where we're able to obtain thermal readings from an external
entity connected with PECI, e.g. BMC on server platforms.

Co-developed-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
---
 drivers/hwmon/peci/Kconfig    |  13 +
 drivers/hwmon/peci/Makefile   |   2 +
 drivers/hwmon/peci/dimmtemp.c | 630 ++++++++++++++++++++++++++++++++++
 3 files changed, 645 insertions(+)
 create mode 100644 drivers/hwmon/peci/dimmtemp.c

diff --git a/drivers/hwmon/peci/Kconfig b/drivers/hwmon/peci/Kconfig
index e10eed68d70a..9d32a57badfe 100644
--- a/drivers/hwmon/peci/Kconfig
+++ b/drivers/hwmon/peci/Kconfig
@@ -14,5 +14,18 @@ config SENSORS_PECI_CPUTEMP
 	  This driver can also be built as a module. If so, the module
 	  will be called peci-cputemp.
 
+config SENSORS_PECI_DIMMTEMP
+	tristate "PECI DIMM temperature monitoring client"
+	depends on PECI
+	select SENSORS_PECI
+	select PECI_CPU
+	help
+	  If you say yes here you get support for the generic Intel PECI hwmon
+	  driver which provides Temperature Sensor on DIMM readings that are
+	  accessible via the processor PECI interface.
+
+	  This driver can also be built as a module. If so, the module
+	  will be called peci-dimmtemp.
+
 config SENSORS_PECI
 	tristate
diff --git a/drivers/hwmon/peci/Makefile b/drivers/hwmon/peci/Makefile
index e8a0ada5ab1f..191cfa0227f3 100644
--- a/drivers/hwmon/peci/Makefile
+++ b/drivers/hwmon/peci/Makefile
@@ -1,5 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0-only
 
 peci-cputemp-y := cputemp.o
+peci-dimmtemp-y := dimmtemp.o
 
 obj-$(CONFIG_SENSORS_PECI_CPUTEMP)	+= peci-cputemp.o
+obj-$(CONFIG_SENSORS_PECI_DIMMTEMP)	+= peci-dimmtemp.o
diff --git a/drivers/hwmon/peci/dimmtemp.c b/drivers/hwmon/peci/dimmtemp.c
new file mode 100644
index 000000000000..ec435b03e401
--- /dev/null
+++ b/drivers/hwmon/peci/dimmtemp.c
@@ -0,0 +1,630 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright (c) 2018-2021 Intel Corporation
+
+#include <linux/auxiliary_bus.h>
+#include <linux/bitfield.h>
+#include <linux/bitops.h>
+#include <linux/hwmon.h>
+#include <linux/jiffies.h>
+#include <linux/module.h>
+#include <linux/peci.h>
+#include <linux/peci-cpu.h>
+#include <linux/units.h>
+#include <linux/workqueue.h>
+
+#include "common.h"
+
+#define DIMM_MASK_CHECK_DELAY_JIFFIES	msecs_to_jiffies(5000)
+
+/* Max number of channel ranks and DIMM index per channel */
+#define CHAN_RANK_MAX_ON_HSX	8
+#define DIMM_IDX_MAX_ON_HSX	3
+#define CHAN_RANK_MAX_ON_BDX	4
+#define DIMM_IDX_MAX_ON_BDX	3
+#define CHAN_RANK_MAX_ON_BDXD	2
+#define DIMM_IDX_MAX_ON_BDXD	2
+#define CHAN_RANK_MAX_ON_SKX	6
+#define DIMM_IDX_MAX_ON_SKX	2
+#define CHAN_RANK_MAX_ON_ICX	8
+#define DIMM_IDX_MAX_ON_ICX	2
+#define CHAN_RANK_MAX_ON_ICXD	4
+#define DIMM_IDX_MAX_ON_ICXD	2
+
+#define CHAN_RANK_MAX		CHAN_RANK_MAX_ON_HSX
+#define DIMM_IDX_MAX		DIMM_IDX_MAX_ON_HSX
+#define DIMM_NUMS_MAX		(CHAN_RANK_MAX * DIMM_IDX_MAX)
+
+#define CPU_SEG_MASK		GENMASK(23, 16)
+#define GET_CPU_SEG(x)		(((x) & CPU_SEG_MASK) >> 16)
+#define CPU_BUS_MASK		GENMASK(7, 0)
+#define GET_CPU_BUS(x)		((x) & CPU_BUS_MASK)
+
+#define DIMM_TEMP_MAX		GENMASK(15, 8)
+#define DIMM_TEMP_CRIT		GENMASK(23, 16)
+#define GET_TEMP_MAX(x)		(((x) & DIMM_TEMP_MAX) >> 8)
+#define GET_TEMP_CRIT(x)	(((x) & DIMM_TEMP_CRIT) >> 16)
+
+#define NO_DIMM_RETRY_COUNT_MAX	5
+
+struct peci_dimmtemp;
+
+struct dimm_info {
+	int chan_rank_max;
+	int dimm_idx_max;
+	u8 min_peci_revision;
+	int (*read_thresholds)(struct peci_dimmtemp *priv, int dimm_order,
+			       int chan_rank, u32 *data);
+};
+
+struct peci_dimm_thresholds {
+	long temp_max;
+	long temp_crit;
+	struct peci_sensor_state state;
+};
+
+enum peci_dimm_threshold_type {
+	temp_max_type,
+	temp_crit_type,
+};
+
+struct peci_dimmtemp {
+	struct peci_device *peci_dev;
+	struct device *dev;
+	const char *name;
+	const struct dimm_info *gen_info;
+	struct delayed_work detect_work;
+	struct {
+		struct peci_sensor_data temp;
+		struct peci_dimm_thresholds thresholds;
+	} dimm[DIMM_NUMS_MAX];
+	char **dimmtemp_label;
+	DECLARE_BITMAP(dimm_mask, DIMM_NUMS_MAX);
+	u8 no_dimm_retry_count;
+};
+
+static u8 __dimm_temp(u32 reg, int dimm_order)
+{
+	return (reg >> (dimm_order * 8)) & 0xff;
+}
+
+static int get_dimm_temp(struct peci_dimmtemp *priv, int dimm_no, long *val)
+{
+	int dimm_order = dimm_no % priv->gen_info->dimm_idx_max;
+	int chan_rank = dimm_no / priv->gen_info->dimm_idx_max;
+	int ret = 0;
+	u32 data;
+
+	mutex_lock(&priv->dimm[dimm_no].temp.state.lock);
+	if (!peci_sensor_need_update(&priv->dimm[dimm_no].temp.state))
+		goto skip_update;
+
+	ret = peci_pcs_read(priv->peci_dev, PECI_PCS_DDR_DIMM_TEMP, chan_rank, &data);
+	if (ret)
+		goto unlock;
+
+	priv->dimm[dimm_no].temp.value = __dimm_temp(data, dimm_order) * MILLIDEGREE_PER_DEGREE;
+
+	peci_sensor_mark_updated(&priv->dimm[dimm_no].temp.state);
+
+skip_update:
+	*val = priv->dimm[dimm_no].temp.value;
+unlock:
+	mutex_unlock(&priv->dimm[dimm_no].temp.state.lock);
+	return ret;
+}
+
+static int update_thresholds(struct peci_dimmtemp *priv, int dimm_no)
+{
+	int dimm_order = dimm_no % priv->gen_info->dimm_idx_max;
+	int chan_rank = dimm_no / priv->gen_info->dimm_idx_max;
+	u32 data;
+	int ret;
+
+	if (!peci_sensor_need_update(&priv->dimm[dimm_no].thresholds.state))
+		return 0;
+
+	ret = priv->gen_info->read_thresholds(priv, dimm_order, chan_rank, &data);
+	if (ret == -ENODATA) /* Use default or previous value */
+		return 0;
+	if (ret)
+		return ret;
+
+	priv->dimm[dimm_no].thresholds.temp_max = GET_TEMP_MAX(data) * MILLIDEGREE_PER_DEGREE;
+	priv->dimm[dimm_no].thresholds.temp_crit = GET_TEMP_CRIT(data) * MILLIDEGREE_PER_DEGREE;
+
+	peci_sensor_mark_updated(&priv->dimm[dimm_no].thresholds.state);
+
+	return 0;
+}
+
+static int get_dimm_thresholds(struct peci_dimmtemp *priv, enum peci_dimm_threshold_type type,
+			       int dimm_no, long *val)
+{
+	int ret;
+
+	mutex_lock(&priv->dimm[dimm_no].thresholds.state.lock);
+	ret = update_thresholds(priv, dimm_no);
+	if (ret)
+		goto unlock;
+
+	switch (type) {
+	case temp_max_type:
+		*val = priv->dimm[dimm_no].thresholds.temp_max;
+		break;
+	case temp_crit_type:
+		*val = priv->dimm[dimm_no].thresholds.temp_crit;
+		break;
+	default:
+		ret = -EOPNOTSUPP;
+		break;
+	}
+unlock:
+	mutex_unlock(&priv->dimm[dimm_no].thresholds.state.lock);
+
+	return ret;
+}
+
+static int dimmtemp_read_string(struct device *dev,
+				enum hwmon_sensor_types type,
+				u32 attr, int channel, const char **str)
+{
+	struct peci_dimmtemp *priv = dev_get_drvdata(dev);
+
+	if (attr != hwmon_temp_label)
+		return -EOPNOTSUPP;
+
+	*str = (const char *)priv->dimmtemp_label[channel];
+
+	return 0;
+}
+
+static int dimmtemp_read(struct device *dev, enum hwmon_sensor_types type,
+			 u32 attr, int channel, long *val)
+{
+	struct peci_dimmtemp *priv = dev_get_drvdata(dev);
+
+	switch (attr) {
+	case hwmon_temp_input:
+		return get_dimm_temp(priv, channel, val);
+	case hwmon_temp_max:
+		return get_dimm_thresholds(priv, temp_max_type, channel, val);
+	case hwmon_temp_crit:
+		return get_dimm_thresholds(priv, temp_crit_type, channel, val);
+	default:
+		break;
+	}
+
+	return -EOPNOTSUPP;
+}
+
+static umode_t dimmtemp_is_visible(const void *data, enum hwmon_sensor_types type,
+				   u32 attr, int channel)
+{
+	const struct peci_dimmtemp *priv = data;
+
+	if (test_bit(channel, priv->dimm_mask))
+		return 0444;
+
+	return 0;
+}
+
+static const struct hwmon_ops peci_dimmtemp_ops = {
+	.is_visible = dimmtemp_is_visible,
+	.read_string = dimmtemp_read_string,
+	.read = dimmtemp_read,
+};
+
+static int check_populated_dimms(struct peci_dimmtemp *priv)
+{
+	int chan_rank_max = priv->gen_info->chan_rank_max;
+	int dimm_idx_max = priv->gen_info->dimm_idx_max;
+	u32 chan_rank_empty = 0;
+	u64 dimm_mask = 0;
+	int chan_rank, dimm_idx, ret;
+	u32 pcs;
+
+	BUILD_BUG_ON(BITS_PER_TYPE(chan_rank_empty) < CHAN_RANK_MAX);
+	BUILD_BUG_ON(BITS_PER_TYPE(dimm_mask) < DIMM_NUMS_MAX);
+	if (chan_rank_max * dimm_idx_max > DIMM_NUMS_MAX) {
+		WARN_ONCE(1, "Unsupported number of DIMMs - chan_rank_max: %d, dimm_idx_max: %d",
+			  chan_rank_max, dimm_idx_max);
+		return -EINVAL;
+	}
+
+	for (chan_rank = 0; chan_rank < chan_rank_max; chan_rank++) {
+		ret = peci_pcs_read(priv->peci_dev, PECI_PCS_DDR_DIMM_TEMP, chan_rank, &pcs);
+		if (ret) {
+			/*
+			 * Overall, we expect either success or -EINVAL in
+			 * order to determine whether DIMM is populated or not.
+			 * For anything else we fall back to deferring the
+			 * detection to be performed at a later point in time.
+			 */
+			if (ret == -EINVAL) {
+				chan_rank_empty |= BIT(chan_rank);
+				continue;
+			}
+
+			return -EAGAIN;
+		}
+
+		for (dimm_idx = 0; dimm_idx < dimm_idx_max; dimm_idx++)
+			if (__dimm_temp(pcs, dimm_idx))
+				dimm_mask |= BIT(chan_rank * dimm_idx_max + dimm_idx);
+	}
+
+	/*
+	 * If we got all -EINVALs, it means that the CPU doesn't have any
+	 * DIMMs. Unfortunately, it may also happen at the very start of
+	 * host platform boot. Retrying a couple of times lets us make sure
+	 * that the state is persistent.
+	 */
+	if (chan_rank_empty == GENMASK(chan_rank_max - 1, 0)) {
+		if (priv->no_dimm_retry_count < NO_DIMM_RETRY_COUNT_MAX) {
+			priv->no_dimm_retry_count++;
+
+			return -EAGAIN;
+		} else {
+			return -ENODEV;
+		}
+	}
+
+	/*
+	 * It's possible that memory training is not done yet. In this case we
+	 * defer the detection to be performed at a later point in time.
+	 */
+	if (!dimm_mask) {
+		priv->no_dimm_retry_count = 0;
+		return -EAGAIN;
+	}
+
+	dev_dbg(priv->dev, "Scanned populated DIMMs: %#llx\n", dimm_mask);
+
+	bitmap_from_u64(priv->dimm_mask, dimm_mask);
+
+	return 0;
+}
+
+static int create_dimm_temp_label(struct peci_dimmtemp *priv, int chan)
+{
+	int rank = chan / priv->gen_info->dimm_idx_max;
+	int idx = chan % priv->gen_info->dimm_idx_max;
+
+	priv->dimmtemp_label[chan] = devm_kasprintf(priv->dev, GFP_KERNEL,
+						    "DIMM %c%d", 'A' + rank,
+						    idx + 1);
+	if (!priv->dimmtemp_label[chan])
+		return -ENOMEM;
+
+	return 0;
+}
+
+static const u32 peci_dimmtemp_temp_channel_config[] = {
+	[0 ... DIMM_NUMS_MAX - 1] = HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_CRIT,
+	0
+};
+
+static const struct hwmon_channel_info peci_dimmtemp_temp_channel = {
+	.type = hwmon_temp,
+	.config = peci_dimmtemp_temp_channel_config,
+};
+
+static const struct hwmon_channel_info *peci_dimmtemp_temp_info[] = {
+	&peci_dimmtemp_temp_channel,
+	NULL
+};
+
+static const struct hwmon_chip_info peci_dimmtemp_chip_info = {
+	.ops = &peci_dimmtemp_ops,
+	.info = peci_dimmtemp_temp_info,
+};
+
+static int create_dimm_temp_info(struct peci_dimmtemp *priv)
+{
+	int ret, i, channels;
+	struct device *dev;
+
+	/*
+	 * We expect to either find populated DIMMs and carry on with creating
+	 * sensors, or find out that there are no DIMMs populated.
+	 * All other states mean that the platform never reached the state that
+	 * allows to check DIMM state - causing us to retry later on.
+	 */
+	ret = check_populated_dimms(priv);
+	if (ret == -ENODEV) {
+		dev_dbg(priv->dev, "No DIMMs found\n");
+		return 0;
+	} else if (ret) {
+		schedule_delayed_work(&priv->detect_work, DIMM_MASK_CHECK_DELAY_JIFFIES);
+		dev_dbg(priv->dev, "Deferred populating DIMM temp info\n");
+		return ret;
+	}
+
+	channels = priv->gen_info->chan_rank_max * priv->gen_info->dimm_idx_max;
+
+	priv->dimmtemp_label = devm_kzalloc(priv->dev, channels * sizeof(char *), GFP_KERNEL);
+	if (!priv->dimmtemp_label)
+		return -ENOMEM;
+
+	for_each_set_bit(i, priv->dimm_mask, DIMM_NUMS_MAX) {
+		ret = create_dimm_temp_label(priv, i);
+		if (ret)
+			return ret;
+		mutex_init(&priv->dimm[i].thresholds.state.lock);
+		mutex_init(&priv->dimm[i].temp.state.lock);
+	}
+
+	dev = devm_hwmon_device_register_with_info(priv->dev, priv->name, priv,
+						   &peci_dimmtemp_chip_info, NULL);
+	if (IS_ERR(dev)) {
+		dev_err(priv->dev, "Failed to register hwmon device\n");
+		return PTR_ERR(dev);
+	}
+
+	dev_dbg(priv->dev, "%s: sensor '%s'\n", dev_name(dev), priv->name);
+
+	return 0;
+}
+
+static void create_dimm_temp_info_delayed(struct work_struct *work)
+{
+	struct peci_dimmtemp *priv = container_of(to_delayed_work(work),
+						  struct peci_dimmtemp,
+						  detect_work);
+	int ret;
+
+	ret = create_dimm_temp_info(priv);
+	if (ret && ret != -EAGAIN)
+		dev_err(priv->dev, "Failed to populate DIMM temp info\n");
+}
+
+static void remove_delayed_work(void *_priv)
+{
+	struct peci_dimmtemp *priv = _priv;
+
+	cancel_delayed_work_sync(&priv->detect_work);
+}
+
+static int peci_dimmtemp_probe(struct auxiliary_device *adev, const struct auxiliary_device_id *id)
+{
+	struct device *dev = &adev->dev;
+	struct peci_device *peci_dev = to_peci_device(dev->parent);
+	struct peci_dimmtemp *priv;
+	int ret;
+
+	priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
+	if (!priv)
+		return -ENOMEM;
+
+	priv->name = devm_kasprintf(dev, GFP_KERNEL, "peci_dimmtemp.cpu%d",
+				    peci_dev->info.socket_id);
+	if (!priv->name)
+		return -ENOMEM;
+
+	priv->dev = dev;
+	priv->peci_dev = peci_dev;
+	priv->gen_info = (const struct dimm_info *)id->driver_data;
+
+	/*
+	 * This is just a sanity check. Since we're using commands that are
+	 * guaranteed to be supported on a given platform, we should never see
+	 * revision lower than expected.
+	 */
+	if (peci_dev->info.peci_revision < priv->gen_info->min_peci_revision)
+		dev_warn(priv->dev,
+			 "Unexpected PECI revision %#x, some features may be unavailable\n",
+			 peci_dev->info.peci_revision);
+
+	INIT_DELAYED_WORK(&priv->detect_work, create_dimm_temp_info_delayed);
+
+	ret = devm_add_action_or_reset(priv->dev, remove_delayed_work, priv);
+	if (ret)
+		return ret;
+
+	ret = create_dimm_temp_info(priv);
+	if (ret && ret != -EAGAIN) {
+		dev_err(dev, "Failed to populate DIMM temp info\n");
+		return ret;
+	}
+
+	return 0;
+}
+
+static int
+read_thresholds_hsx(struct peci_dimmtemp *priv, int dimm_order, int chan_rank, u32 *data)
+{
+	u8 dev, func;
+	u16 reg;
+	int ret;
+
+	/*
+	 * Device 20, Function 0: IMC 0 channel 0 -> rank 0
+	 * Device 20, Function 1: IMC 0 channel 1 -> rank 1
+	 * Device 21, Function 0: IMC 0 channel 2 -> rank 2
+	 * Device 21, Function 1: IMC 0 channel 3 -> rank 3
+	 * Device 23, Function 0: IMC 1 channel 0 -> rank 4
+	 * Device 23, Function 1: IMC 1 channel 1 -> rank 5
+	 * Device 24, Function 0: IMC 1 channel 2 -> rank 6
+	 * Device 24, Function 1: IMC 1 channel 3 -> rank 7
+	 */
+	dev = 20 + chan_rank / 2 + chan_rank / 4;
+	func = chan_rank % 2;
+	reg = 0x120 + dimm_order * 4;
+
+	ret = peci_pci_local_read(priv->peci_dev, 1, dev, func, reg, data);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+static int
+read_thresholds_bdxd(struct peci_dimmtemp *priv, int dimm_order, int chan_rank, u32 *data)
+{
+	u8 dev, func;
+	u16 reg;
+	int ret;
+
+	/*
+	 * Device 10, Function 2: IMC 0 channel 0 -> rank 0
+	 * Device 10, Function 6: IMC 0 channel 1 -> rank 1
+	 * Device 12, Function 2: IMC 1 channel 0 -> rank 2
+	 * Device 12, Function 6: IMC 1 channel 1 -> rank 3
+	 */
+	dev = 10 + chan_rank / 2 * 2;
+	func = (chan_rank % 2) ? 6 : 2;
+	reg = 0x120 + dimm_order * 4;
+
+	ret = peci_pci_local_read(priv->peci_dev, 2, dev, func, reg, data);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+static int
+read_thresholds_skx(struct peci_dimmtemp *priv, int dimm_order, int chan_rank, u32 *data)
+{
+	u8 dev, func;
+	u16 reg;
+	int ret;
+
+	/*
+	 * Device 10, Function 2: IMC 0 channel 0 -> rank 0
+	 * Device 10, Function 6: IMC 0 channel 1 -> rank 1
+	 * Device 11, Function 2: IMC 0 channel 2 -> rank 2
+	 * Device 12, Function 2: IMC 1 channel 0 -> rank 3
+	 * Device 12, Function 6: IMC 1 channel 1 -> rank 4
+	 * Device 13, Function 2: IMC 1 channel 2 -> rank 5
+	 */
+	dev = 10 + chan_rank / 3 * 2 + (chan_rank % 3 == 2 ? 1 : 0);
+	func = chan_rank % 3 == 1 ? 6 : 2;
+	reg = 0x120 + dimm_order * 4;
+
+	ret = peci_pci_local_read(priv->peci_dev, 2, dev, func, reg, data);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+static int
+read_thresholds_icx(struct peci_dimmtemp *priv, int dimm_order, int chan_rank, u32 *data)
+{
+	u32 reg_val;
+	u64 offset;
+	int ret;
+	u8 dev;
+
+	ret = peci_ep_pci_local_read(priv->peci_dev, 0, 13, 0, 2, 0xd4, &reg_val);
+	if (ret || !(reg_val & BIT(31)))
+		return -ENODATA; /* Use default or previous value */
+
+	ret = peci_ep_pci_local_read(priv->peci_dev, 0, 13, 0, 2, 0xd0, &reg_val);
+	if (ret)
+		return -ENODATA; /* Use default or previous value */
+
+	/*
+	 * Device 26, Offset 224e0: IMC 0 channel 0 -> rank 0
+	 * Device 26, Offset 264e0: IMC 0 channel 1 -> rank 1
+	 * Device 27, Offset 224e0: IMC 1 channel 0 -> rank 2
+	 * Device 27, Offset 264e0: IMC 1 channel 1 -> rank 3
+	 * Device 28, Offset 224e0: IMC 2 channel 0 -> rank 4
+	 * Device 28, Offset 264e0: IMC 2 channel 1 -> rank 5
+	 * Device 29, Offset 224e0: IMC 3 channel 0 -> rank 6
+	 * Device 29, Offset 264e0: IMC 3 channel 1 -> rank 7
+	 */
+	dev = 26 + chan_rank / 2;
+	offset = 0x224e0 + dimm_order * 4 + (chan_rank % 2) * 0x4000;
+
+	ret = peci_mmio_read(priv->peci_dev, 0, GET_CPU_SEG(reg_val), GET_CPU_BUS(reg_val),
+			     dev, 0, offset, data);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+static const struct dimm_info dimm_hsx = {
+	.chan_rank_max	= CHAN_RANK_MAX_ON_HSX,
+	.dimm_idx_max	= DIMM_IDX_MAX_ON_HSX,
+	.min_peci_revision = 0x33,
+	.read_thresholds = &read_thresholds_hsx,
+};
+
+static const struct dimm_info dimm_bdx = {
+	.chan_rank_max	= CHAN_RANK_MAX_ON_BDX,
+	.dimm_idx_max	= DIMM_IDX_MAX_ON_BDX,
+	.min_peci_revision = 0x33,
+	.read_thresholds = &read_thresholds_hsx,
+};
+
+static const struct dimm_info dimm_bdxd = {
+	.chan_rank_max	= CHAN_RANK_MAX_ON_BDXD,
+	.dimm_idx_max	= DIMM_IDX_MAX_ON_BDXD,
+	.min_peci_revision = 0x33,
+	.read_thresholds = &read_thresholds_bdxd,
+};
+
+static const struct dimm_info dimm_skx = {
+	.chan_rank_max	= CHAN_RANK_MAX_ON_SKX,
+	.dimm_idx_max	= DIMM_IDX_MAX_ON_SKX,
+	.min_peci_revision = 0x33,
+	.read_thresholds = &read_thresholds_skx,
+};
+
+static const struct dimm_info dimm_icx = {
+	.chan_rank_max	= CHAN_RANK_MAX_ON_ICX,
+	.dimm_idx_max	= DIMM_IDX_MAX_ON_ICX,
+	.min_peci_revision = 0x40,
+	.read_thresholds = &read_thresholds_icx,
+};
+
+static const struct dimm_info dimm_icxd = {
+	.chan_rank_max	= CHAN_RANK_MAX_ON_ICXD,
+	.dimm_idx_max	= DIMM_IDX_MAX_ON_ICXD,
+	.min_peci_revision = 0x40,
+	.read_thresholds = &read_thresholds_icx,
+};
+
+static const struct auxiliary_device_id peci_dimmtemp_ids[] = {
+	{
+		.name = "peci_cpu.dimmtemp.hsx",
+		.driver_data = (kernel_ulong_t)&dimm_hsx,
+	},
+	{
+		.name = "peci_cpu.dimmtemp.bdx",
+		.driver_data = (kernel_ulong_t)&dimm_bdx,
+	},
+	{
+		.name = "peci_cpu.dimmtemp.bdxd",
+		.driver_data = (kernel_ulong_t)&dimm_bdxd,
+	},
+	{
+		.name = "peci_cpu.dimmtemp.skx",
+		.driver_data = (kernel_ulong_t)&dimm_skx,
+	},
+	{
+		.name = "peci_cpu.dimmtemp.icx",
+		.driver_data = (kernel_ulong_t)&dimm_icx,
+	},
+	{
+		.name = "peci_cpu.dimmtemp.icxd",
+		.driver_data = (kernel_ulong_t)&dimm_icxd,
+	},
+	{ }
+};
+MODULE_DEVICE_TABLE(auxiliary, peci_dimmtemp_ids);
+
+static struct auxiliary_driver peci_dimmtemp_driver = {
+	.probe		= peci_dimmtemp_probe,
+	.id_table	= peci_dimmtemp_ids,
+};
+
+module_auxiliary_driver(peci_dimmtemp_driver);
+
+MODULE_AUTHOR("Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>");
+MODULE_AUTHOR("Iwona Winiarska <iwona.winiarska@intel.com>");
+MODULE_DESCRIPTION("PECI dimmtemp driver");
+MODULE_LICENSE("GPL");
+MODULE_IMPORT_NS(PECI_CPU);
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v3 12/13] docs: hwmon: Document PECI drivers
  2021-11-15 18:25 [PATCH v3 00/13] Introduce PECI subsystem Iwona Winiarska
                   ` (10 preceding siblings ...)
  2021-11-15 18:25 ` [PATCH v3 11/13] hwmon: peci: Add dimmtemp driver Iwona Winiarska
@ 2021-11-15 18:25 ` Iwona Winiarska
  2021-11-15 18:25 ` [PATCH v3 13/13] docs: Add PECI documentation Iwona Winiarska
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 24+ messages in thread
From: Iwona Winiarska @ 2021-11-15 18:25 UTC (permalink / raw)
  To: linux-kernel, openbmc, Greg Kroah-Hartman
  Cc: linux-aspeed, linux-doc, Dave Hansen, Zev Weiss, Jae Hyun Yoo,
	Jonathan Corbet, Pierre-Louis Bossart, Guenter Roeck, devicetree,
	Jean Delvare, Arnd Bergmann, Rob Herring, Borislav Petkov,
	Iwona Winiarska, Dan Williams, Andy Shevchenko, linux-arm-kernel,
	linux-hwmon, Tony Luck, Andrew Jeffery, Randy Dunlap,
	Olof Johansson

From: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>

Add documentation for peci-cputemp driver that provides DTS thermal
readings for CPU packages and CPU cores, and peci-dimmtemp driver that
provides Temperature Sensor on DIMM readings.

Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
Co-developed-by: Iwona Winiarska <iwona.winiarska@intel.com>
Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
---
 Documentation/hwmon/index.rst         |  2 +
 Documentation/hwmon/peci-cputemp.rst  | 90 +++++++++++++++++++++++++++
 Documentation/hwmon/peci-dimmtemp.rst | 57 +++++++++++++++++
 MAINTAINERS                           |  2 +
 4 files changed, 151 insertions(+)
 create mode 100644 Documentation/hwmon/peci-cputemp.rst
 create mode 100644 Documentation/hwmon/peci-dimmtemp.rst

diff --git a/Documentation/hwmon/index.rst b/Documentation/hwmon/index.rst
index 7046bf1870d9..6ebd73a55c26 100644
--- a/Documentation/hwmon/index.rst
+++ b/Documentation/hwmon/index.rst
@@ -156,6 +156,8 @@ Hardware Monitoring Kernel Drivers
    pcf8591
    pim4328
    pm6764tr
+   peci-cputemp
+   peci-dimmtemp
    pmbus
    powr1220
    pxe1610
diff --git a/Documentation/hwmon/peci-cputemp.rst b/Documentation/hwmon/peci-cputemp.rst
new file mode 100644
index 000000000000..fe0422248dc5
--- /dev/null
+++ b/Documentation/hwmon/peci-cputemp.rst
@@ -0,0 +1,90 @@
+.. SPDX-License-Identifier: GPL-2.0-only
+
+Kernel driver peci-cputemp
+==========================
+
+Supported chips:
+	One of Intel server CPUs listed below which is connected to a PECI bus.
+		* Intel Xeon E5/E7 v3 server processors
+			Intel Xeon E5-14xx v3 family
+			Intel Xeon E5-24xx v3 family
+			Intel Xeon E5-16xx v3 family
+			Intel Xeon E5-26xx v3 family
+			Intel Xeon E5-46xx v3 family
+			Intel Xeon E7-48xx v3 family
+			Intel Xeon E7-88xx v3 family
+		* Intel Xeon E5/E7 v4 server processors
+			Intel Xeon E5-16xx v4 family
+			Intel Xeon E5-26xx v4 family
+			Intel Xeon E5-46xx v4 family
+			Intel Xeon E7-48xx v4 family
+			Intel Xeon E7-88xx v4 family
+		* Intel Xeon Scalable server processors
+			Intel Xeon D family
+			Intel Xeon Bronze family
+			Intel Xeon Silver family
+			Intel Xeon Gold family
+			Intel Xeon Platinum family
+
+	Datasheet: Available from http://www.intel.com/design/literature.htm
+
+Author: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
+
+Description
+-----------
+
+This driver implements a generic PECI hwmon feature which provides Digital
+Thermal Sensor (DTS) thermal readings of the CPU package and CPU cores that are
+accessible via the processor PECI interface.
+
+All temperature values are given in millidegree Celsius and will be measurable
+only when the target CPU is powered on.
+
+Sysfs interface
+-------------------
+
+======================= =======================================================
+temp1_label		"Die"
+temp1_input		Provides current die temperature of the CPU package.
+temp1_max		Provides thermal control temperature of the CPU package
+			which is also known as Tcontrol.
+temp1_crit		Provides shutdown temperature of the CPU package which
+			is also known as the maximum processor junction
+			temperature, Tjmax or Tprochot.
+temp1_crit_hyst		Provides the hysteresis value from Tcontrol to Tjmax of
+			the CPU package.
+
+temp2_label		"DTS"
+temp2_input		Provides current temperature of the CPU package scaled
+			to match DTS thermal profile.
+temp2_max		Provides thermal control temperature of the CPU package
+			which is also known as Tcontrol.
+temp2_crit		Provides shutdown temperature of the CPU package which
+			is also known as the maximum processor junction
+			temperature, Tjmax or Tprochot.
+temp2_crit_hyst		Provides the hysteresis value from Tcontrol to Tjmax of
+			the CPU package.
+
+temp3_label		"Tcontrol"
+temp3_input		Provides current Tcontrol temperature of the CPU
+			package which is also known as Fan Temperature target.
+			Indicates the relative value from thermal monitor trip
+			temperature at which fans should be engaged.
+temp3_crit		Provides Tcontrol critical value of the CPU package
+			which is same to Tjmax.
+
+temp4_label		"Tthrottle"
+temp4_input		Provides current Tthrottle temperature of the CPU
+			package. Used for throttling temperature. If this value
+			is allowed and lower than Tjmax - the throttle will
+			occur and reported at lower than Tjmax.
+
+temp5_label		"Tjmax"
+temp5_input		Provides the maximum junction temperature, Tjmax of the
+			CPU package.
+
+temp[6-N]_label		Provides string "Core X", where X is resolved core
+			number.
+temp[6-N]_input		Provides current temperature of each core.
+
+======================= =======================================================
diff --git a/Documentation/hwmon/peci-dimmtemp.rst b/Documentation/hwmon/peci-dimmtemp.rst
new file mode 100644
index 000000000000..e562aed620de
--- /dev/null
+++ b/Documentation/hwmon/peci-dimmtemp.rst
@@ -0,0 +1,57 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+Kernel driver peci-dimmtemp
+===========================
+
+Supported chips:
+	One of Intel server CPUs listed below which is connected to a PECI bus.
+		* Intel Xeon E5/E7 v3 server processors
+			Intel Xeon E5-14xx v3 family
+			Intel Xeon E5-24xx v3 family
+			Intel Xeon E5-16xx v3 family
+			Intel Xeon E5-26xx v3 family
+			Intel Xeon E5-46xx v3 family
+			Intel Xeon E7-48xx v3 family
+			Intel Xeon E7-88xx v3 family
+		* Intel Xeon E5/E7 v4 server processors
+			Intel Xeon E5-16xx v4 family
+			Intel Xeon E5-26xx v4 family
+			Intel Xeon E5-46xx v4 family
+			Intel Xeon E7-48xx v4 family
+			Intel Xeon E7-88xx v4 family
+		* Intel Xeon Scalable server processors
+			Intel Xeon D family
+			Intel Xeon Bronze family
+			Intel Xeon Silver family
+			Intel Xeon Gold family
+			Intel Xeon Platinum family
+
+	Datasheet: Available from http://www.intel.com/design/literature.htm
+
+Author: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
+
+Description
+-----------
+
+This driver implements a generic PECI hwmon feature which provides
+Temperature sensor on DIMM readings that are accessible via the processor PECI interface.
+
+All temperature values are given in millidegree Celsius and will be measurable
+only when the target CPU is powered on.
+
+Sysfs interface
+-------------------
+
+======================= =======================================================
+
+temp[N]_label		Provides string "DIMM CI", where C is DIMM channel and
+			I is DIMM index of the populated DIMM.
+temp[N]_input		Provides current temperature of the populated DIMM.
+temp[N]_max		Provides thermal control temperature of the DIMM.
+temp[N]_crit		Provides shutdown temperature of the DIMM.
+
+======================= =======================================================
+
+Note:
+	DIMM temperature attributes will appear when the client CPU's BIOS
+	completes memory training and testing.
diff --git a/MAINTAINERS b/MAINTAINERS
index 0094370be6c4..0f7216644bd5 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -14925,6 +14925,8 @@ M:	Iwona Winiarska <iwona.winiarska@intel.com>
 R:	Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
 L:	linux-hwmon@vger.kernel.org
 S:	Supported
+F:	Documentation/hwmon/peci-cputemp.rst
+F:	Documentation/hwmon/peci-dimmtemp.rst
 F:	drivers/hwmon/peci/
 
 PECI SUBSYSTEM
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v3 13/13] docs: Add PECI documentation
  2021-11-15 18:25 [PATCH v3 00/13] Introduce PECI subsystem Iwona Winiarska
                   ` (11 preceding siblings ...)
  2021-11-15 18:25 ` [PATCH v3 12/13] docs: hwmon: Document PECI drivers Iwona Winiarska
@ 2021-11-15 18:25 ` Iwona Winiarska
  2021-11-17  3:56 ` [PATCH v3 00/13] Introduce PECI subsystem Zev Weiss
  2021-11-18 12:19 ` Tomer Maimon
  14 siblings, 0 replies; 24+ messages in thread
From: Iwona Winiarska @ 2021-11-15 18:25 UTC (permalink / raw)
  To: linux-kernel, openbmc, Greg Kroah-Hartman
  Cc: linux-aspeed, linux-doc, Dave Hansen, Zev Weiss, Jae Hyun Yoo,
	Jonathan Corbet, Pierre-Louis Bossart, Guenter Roeck, devicetree,
	Jean Delvare, Arnd Bergmann, Rob Herring, Borislav Petkov,
	Iwona Winiarska, Dan Williams, Andy Shevchenko, linux-arm-kernel,
	linux-hwmon, Tony Luck, Andrew Jeffery, Randy Dunlap,
	Olof Johansson

Add a brief overview of PECI and PECI wire interface.
The documentation also contains kernel-doc for PECI subsystem internals
and PECI CPU Driver API.

Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
---
 Documentation/index.rst      |  1 +
 Documentation/peci/index.rst | 16 +++++++++++
 Documentation/peci/peci.rst  | 51 ++++++++++++++++++++++++++++++++++++
 MAINTAINERS                  |  1 +
 4 files changed, 69 insertions(+)
 create mode 100644 Documentation/peci/index.rst
 create mode 100644 Documentation/peci/peci.rst

diff --git a/Documentation/index.rst b/Documentation/index.rst
index 54ce34fd6fbd..7671f2cd474f 100644
--- a/Documentation/index.rst
+++ b/Documentation/index.rst
@@ -137,6 +137,7 @@ needed).
    misc-devices/index
    scheduler/index
    mhi/index
+   peci/index
 
 Architecture-agnostic documentation
 -----------------------------------
diff --git a/Documentation/peci/index.rst b/Documentation/peci/index.rst
new file mode 100644
index 000000000000..989de10416e7
--- /dev/null
+++ b/Documentation/peci/index.rst
@@ -0,0 +1,16 @@
+.. SPDX-License-Identifier: GPL-2.0-only
+
+====================
+Linux PECI Subsystem
+====================
+
+.. toctree::
+
+   peci
+
+.. only::  subproject and html
+
+   Indices
+   =======
+
+   * :ref:`genindex`
diff --git a/Documentation/peci/peci.rst b/Documentation/peci/peci.rst
new file mode 100644
index 000000000000..331b1ec00e22
--- /dev/null
+++ b/Documentation/peci/peci.rst
@@ -0,0 +1,51 @@
+.. SPDX-License-Identifier: GPL-2.0-only
+
+========
+Overview
+========
+
+The Platform Environment Control Interface (PECI) is a communication
+interface between Intel processor and management controllers
+(e.g. Baseboard Management Controller, BMC).
+PECI provides services that allow the management controller to
+configure, monitor and debug platform by accessing various registers.
+It defines a dedicated command protocol, where the management
+controller is acting as a PECI originator and the processor - as
+a PECI responder.
+PECI can be used in both single processor and multiple-processor based
+systems.
+
+NOTE:
+Intel PECI specification is not released as a dedicated document,
+instead it is a part of External Design Specification (EDS) for given
+Intel CPU. External Design Specifications are usually not publicly
+available.
+
+PECI Wire
+---------
+
+PECI Wire interface uses a single wire for self-clocking and data
+transfer. It does not require any additional control lines - the
+physical layer is a self-clocked one-wire bus signal that begins each
+bit with a driven, rising edge from an idle near zero volts. The
+duration of the signal driven high allows to determine whether the bit
+value is logic '0' or logic '1'. PECI Wire also includes variable data
+rate established with every message.
+
+For PECI Wire, each processor package will utilize unique, fixed
+addresses within a defined range and that address should
+have a fixed relationship with the processor socket ID - if one of the
+processors is removed, it does not affect addresses of remaining
+processors.
+
+PECI subsystem internals
+------------------------
+
+.. kernel-doc:: include/linux/peci.h
+.. kernel-doc:: drivers/peci/internal.h
+.. kernel-doc:: drivers/peci/core.c
+.. kernel-doc:: drivers/peci/request.c
+
+PECI CPU Driver API
+-------------------
+.. kernel-doc:: drivers/peci/cpu.c
diff --git a/MAINTAINERS b/MAINTAINERS
index 0f7216644bd5..4db4979db60a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -14935,6 +14935,7 @@ R:	Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
 L:	openbmc@lists.ozlabs.org (moderated for non-subscribers)
 S:	Supported
 F:	Documentation/devicetree/bindings/peci/
+F:	Documentation/peci/
 F:	drivers/peci/
 F:	include/linux/peci-cpu.h
 F:	include/linux/peci.h
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 06/13] peci: Add device detection
  2021-11-15 18:25 ` [PATCH v3 06/13] peci: Add device detection Iwona Winiarska
@ 2021-11-15 18:49   ` Greg Kroah-Hartman
  2021-11-15 22:35     ` Winiarska, Iwona
  0 siblings, 1 reply; 24+ messages in thread
From: Greg Kroah-Hartman @ 2021-11-15 18:49 UTC (permalink / raw)
  To: Iwona Winiarska
  Cc: linux-aspeed, linux-doc, Dave Hansen, Zev Weiss, Jae Hyun Yoo,
	Jonathan Corbet, openbmc, Pierre-Louis Bossart, Guenter Roeck,
	devicetree, Jean Delvare, Arnd Bergmann, Rob Herring,
	Borislav Petkov, Dan Williams, Andy Shevchenko, linux-arm-kernel,
	linux-hwmon, Tony Luck, Andrew Jeffery, Randy Dunlap,
	linux-kernel, Olof Johansson

On Mon, Nov 15, 2021 at 07:25:45PM +0100, Iwona Winiarska wrote:
> +void peci_device_destroy(struct peci_device *device)
> +{
> +	bool killed;
> +
> +	device_lock(&device->dev);
> +	killed = kill_device(&device->dev);

Eeek, why call this?

> +	device_unlock(&device->dev);
> +
> +	if (!killed)
> +		return;

What happened if something changed after you unlocked it?

Why is kill_device() required at all?  That's a very rare function to
call, and one that only one "bus" calls today because it is very
special (i.e. crazy and broken...)

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 06/13] peci: Add device detection
  2021-11-15 18:49   ` Greg Kroah-Hartman
@ 2021-11-15 22:35     ` Winiarska, Iwona
  2021-11-16  6:26       ` gregkh
  0 siblings, 1 reply; 24+ messages in thread
From: Winiarska, Iwona @ 2021-11-15 22:35 UTC (permalink / raw)
  To: gregkh
  Cc: linux-aspeed, linux-doc, Hansen, Dave, zweiss, jae.hyun.yoo,
	corbet, openbmc, pierre-louis.bossart, linux, devicetree,
	jdelvare, arnd, robh+dt, bp, Williams, Dan J, andriy.shevchenko,
	linux-arm-kernel, linux-hwmon, Luck,  Tony, andrew, rdunlap,
	linux-kernel, olof

On Mon, 2021-11-15 at 19:49 +0100, Greg Kroah-Hartman wrote:
> On Mon, Nov 15, 2021 at 07:25:45PM +0100, Iwona Winiarska wrote:
> > +void peci_device_destroy(struct peci_device *device)
> > +{
> > +       bool killed;
> > +
> > +       device_lock(&device->dev);
> > +       killed = kill_device(&device->dev);
> 
> Eeek, why call this?
> 
> > +       device_unlock(&device->dev);
> > +
> > +       if (!killed)
> > +               return;
> 
> What happened if something changed after you unlocked it?

We either killed it, or the other caller killed it.

> 
> Why is kill_device() required at all?  That's a very rare function to
> call, and one that only one "bus" calls today because it is very
> special (i.e. crazy and broken...)

It's used to avoid double-delete in case of races between peci_controller
unregister and "manually" removing the device using sysfs (pointed out by Dan in
v2). We're calling peci_device_destroy() in both callsites.
Other way to solve it would be to just have a peci-specific lock, but
kill_device seemed to be well suited for the problem at hand.
Do you suggest to remove it and just go with the lock?

Thanks
-Iwona

> 
> thanks,
> 
> greg k-h


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 10/13] hwmon: peci: Add cputemp driver
  2021-11-15 18:25 ` [PATCH v3 10/13] hwmon: peci: Add cputemp driver Iwona Winiarska
@ 2021-11-16  0:52   ` Guenter Roeck
  2021-11-17 22:20     ` Winiarska, Iwona
  0 siblings, 1 reply; 24+ messages in thread
From: Guenter Roeck @ 2021-11-16  0:52 UTC (permalink / raw)
  To: Iwona Winiarska, linux-kernel, openbmc, Greg Kroah-Hartman
  Cc: linux-aspeed, linux-doc, Dave Hansen, Zev Weiss, Jae Hyun Yoo,
	Jonathan Corbet, Pierre-Louis Bossart, devicetree, Jean Delvare,
	Arnd Bergmann, Rob Herring, Borislav Petkov, Dan Williams,
	Andy Shevchenko, linux-arm-kernel, linux-hwmon, Tony Luck,
	Andrew Jeffery, Randy Dunlap, Olof Johansson

On 11/15/21 10:25 AM, Iwona Winiarska wrote:
> Add peci-cputemp driver for Digital Thermal Sensor (DTS) thermal
> readings of the processor package and processor cores that are
> accessible via the PECI interface.
> 
> The main use case for the driver (and PECI interface) is out-of-band
> management, where we're able to obtain the DTS readings from an external
> entity connected with PECI, e.g. BMC on server platforms.
> 
> Co-developed-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
> Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
> Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com>
> Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
> ---
>   MAINTAINERS                  |   7 +
>   drivers/hwmon/Kconfig        |   2 +
>   drivers/hwmon/Makefile       |   1 +
>   drivers/hwmon/peci/Kconfig   |  18 ++
>   drivers/hwmon/peci/Makefile  |   5 +
>   drivers/hwmon/peci/common.h  |  58 ++++
>   drivers/hwmon/peci/cputemp.c | 592 +++++++++++++++++++++++++++++++++++
>   7 files changed, 683 insertions(+)
>   create mode 100644 drivers/hwmon/peci/Kconfig
>   create mode 100644 drivers/hwmon/peci/Makefile
>   create mode 100644 drivers/hwmon/peci/common.h
>   create mode 100644 drivers/hwmon/peci/cputemp.c
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index d751590f32e3..0094370be6c4 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -14920,6 +14920,13 @@ L:	platform-driver-x86@vger.kernel.org
>   S:	Maintained
>   F:	drivers/platform/x86/peaq-wmi.c
>   
> +PECI HARDWARE MONITORING DRIVERS
> +M:	Iwona Winiarska <iwona.winiarska@intel.com>
> +R:	Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
> +L:	linux-hwmon@vger.kernel.org
> +S:	Supported
> +F:	drivers/hwmon/peci/
> +
>   PECI SUBSYSTEM
>   M:	Iwona Winiarska <iwona.winiarska@intel.com>
>   R:	Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
> diff --git a/drivers/hwmon/Kconfig b/drivers/hwmon/Kconfig
> index 64bd3dfba2c4..4858f7cfa81e 100644
> --- a/drivers/hwmon/Kconfig
> +++ b/drivers/hwmon/Kconfig
> @@ -1528,6 +1528,8 @@ config SENSORS_PCF8591
>   	  These devices are hard to detect and rarely found on mainstream
>   	  hardware. If unsure, say N.
>   
> +source "drivers/hwmon/peci/Kconfig"
> +
>   source "drivers/hwmon/pmbus/Kconfig"
>   
>   config SENSORS_PWM_FAN
> diff --git a/drivers/hwmon/Makefile b/drivers/hwmon/Makefile
> index baee6a8d4dd1..2c0e210a28b8 100644
> --- a/drivers/hwmon/Makefile
> +++ b/drivers/hwmon/Makefile
> @@ -204,6 +204,7 @@ obj-$(CONFIG_SENSORS_WM8350)	+= wm8350-hwmon.o
>   obj-$(CONFIG_SENSORS_XGENE)	+= xgene-hwmon.o
>   
>   obj-$(CONFIG_SENSORS_OCC)	+= occ/
> +obj-$(CONFIG_SENSORS_PECI)	+= peci/
>   obj-$(CONFIG_PMBUS)		+= pmbus/
>   
>   ccflags-$(CONFIG_HWMON_DEBUG_CHIP) := -DDEBUG
> diff --git a/drivers/hwmon/peci/Kconfig b/drivers/hwmon/peci/Kconfig
> new file mode 100644
> index 000000000000..e10eed68d70a
> --- /dev/null
> +++ b/drivers/hwmon/peci/Kconfig
> @@ -0,0 +1,18 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +
> +config SENSORS_PECI_CPUTEMP
> +	tristate "PECI CPU temperature monitoring client"
> +	depends on PECI
> +	select SENSORS_PECI
> +	select PECI_CPU
> +	help
> +	  If you say yes here you get support for the generic Intel PECI
> +	  cputemp driver which provides Digital Thermal Sensor (DTS) thermal
> +	  readings of the CPU package and CPU cores that are accessible via
> +	  the processor PECI interface.
> +
> +	  This driver can also be built as a module. If so, the module
> +	  will be called peci-cputemp.
> +
> +config SENSORS_PECI
> +	tristate
> diff --git a/drivers/hwmon/peci/Makefile b/drivers/hwmon/peci/Makefile
> new file mode 100644
> index 000000000000..e8a0ada5ab1f
> --- /dev/null
> +++ b/drivers/hwmon/peci/Makefile
> @@ -0,0 +1,5 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +
> +peci-cputemp-y := cputemp.o
> +
> +obj-$(CONFIG_SENSORS_PECI_CPUTEMP)	+= peci-cputemp.o
> diff --git a/drivers/hwmon/peci/common.h b/drivers/hwmon/peci/common.h
> new file mode 100644
> index 000000000000..734506b0eca2
> --- /dev/null
> +++ b/drivers/hwmon/peci/common.h
> @@ -0,0 +1,58 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/* Copyright (c) 2021 Intel Corporation */
> +
> +#include <linux/mutex.h>
> +#include <linux/types.h>
> +
> +#ifndef __PECI_HWMON_COMMON_H
> +#define __PECI_HWMON_COMMON_H
> +
> +#define PECI_HWMON_UPDATE_INTERVAL	HZ
> +
> +/**
> + * struct peci_sensor_state - PECI state information
> + * @valid: flag to indicate the sensor value is valid
> + * @last_updated: time of the last update in jiffies
> + * @lock: mutex to protect sensor access
> + */
> +struct peci_sensor_state {
> +	bool valid;
> +	unsigned long last_updated;
> +	struct mutex lock; /* protect sensor access */
> +};
> +
> +/**
> + * struct peci_sensor_data - PECI sensor information
> + * @value: sensor value in milli units
> + * @state: sensor update state
> + */
> +
> +struct peci_sensor_data {
> +	s32 value;
> +	struct peci_sensor_state state;
> +};
> +
> +/**
> + * peci_sensor_need_update() - check whether sensor update is needed or not
> + * @sensor: pointer to sensor data struct
> + *
> + * Return: true if update is needed, false if not.
> + */
> +
> +static inline bool peci_sensor_need_update(struct peci_sensor_state *state)
> +{
> +	return !state->valid ||
> +	       time_after(jiffies, state->last_updated + PECI_HWMON_UPDATE_INTERVAL);
> +}
> +
> +/**
> + * peci_sensor_mark_updated() - mark the sensor is updated
> + * @sensor: pointer to sensor data struct
> + */
> +static inline void peci_sensor_mark_updated(struct peci_sensor_state *state)
> +{
> +	state->valid = true;
> +	state->last_updated = jiffies;
> +}
> +
> +#endif /* __PECI_HWMON_COMMON_H */
> diff --git a/drivers/hwmon/peci/cputemp.c b/drivers/hwmon/peci/cputemp.c
> new file mode 100644
> index 000000000000..d264d6c2f437
> --- /dev/null
> +++ b/drivers/hwmon/peci/cputemp.c
> @@ -0,0 +1,592 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +// Copyright (c) 2018-2021 Intel Corporation
> +
> +#include <linux/auxiliary_bus.h>
> +#include <linux/bitfield.h>
> +#include <linux/bitops.h>
> +#include <linux/hwmon.h>
> +#include <linux/jiffies.h>
> +#include <linux/module.h>
> +#include <linux/peci.h>
> +#include <linux/peci-cpu.h>
> +#include <linux/units.h>
> +
> +#include "common.h"
> +
> +#define CORE_NUMS_MAX		64
> +
> +#define BASE_CHANNEL_NUMS	5
> +#define CPUTEMP_CHANNEL_NUMS	(BASE_CHANNEL_NUMS + CORE_NUMS_MAX)
> +
> +#define TEMP_TARGET_FAN_TEMP_MASK	GENMASK(15, 8)
> +#define TEMP_TARGET_REF_TEMP_MASK	GENMASK(23, 16)
> +#define TEMP_TARGET_TJ_OFFSET_MASK	GENMASK(29, 24)
> +
> +#define DTS_MARGIN_MASK		GENMASK(15, 0)
> +#define PCS_MODULE_TEMP_MASK	GENMASK(15, 0)
> +
> +struct resolved_cores_reg {
> +	u8 bus;
> +	u8 dev;
> +	u8 func;
> +	u8 offset;
> +};
> +
> +struct cpu_info {
> +	struct resolved_cores_reg *reg;
> +	u8 min_peci_revision;
> +	s32 (*thermal_margin_to_millidegree)(s32 val);
> +};
> +
> +struct peci_temp_target {
> +	s32 tcontrol;
> +	s32 tthrottle;
> +	s32 tjmax;
> +	struct peci_sensor_state state;
> +};
> +
> +enum peci_temp_target_type {
> +	tcontrol_type,
> +	tthrottle_type,
> +	tjmax_type,
> +	crit_hyst_type,
> +};
> +
> +struct peci_cputemp {
> +	struct peci_device *peci_dev;
> +	struct device *dev;
> +	const char *name;
> +	const struct cpu_info *gen_info;
> +	struct {
> +		struct peci_temp_target target;
> +		struct peci_sensor_data die;
> +		struct peci_sensor_data dts;
> +		struct peci_sensor_data core[CORE_NUMS_MAX];
> +	} temp;
> +	const char **coretemp_label;
> +	DECLARE_BITMAP(core_mask, CORE_NUMS_MAX);
> +};
> +
> +enum cputemp_channels {
> +	channel_die,
> +	channel_dts,
> +	channel_tcontrol,
> +	channel_tthrottle,
> +	channel_tjmax,
> +	channel_core,
> +};
> +
> +static const char * const cputemp_label[BASE_CHANNEL_NUMS] = {
> +	"Die",
> +	"DTS",
> +	"Tcontrol",
> +	"Tthrottle",
> +	"Tjmax",
> +};
> +
> +static int update_temp_target(struct peci_cputemp *priv)
> +{
> +	s32 tthrottle_offset, tcontrol_margin;
> +	u32 pcs;
> +	int ret;
> +
> +	if (!peci_sensor_need_update(&priv->temp.target.state))
> +		return 0;
> +
> +	ret = peci_pcs_read(priv->peci_dev, PECI_PCS_TEMP_TARGET, 0, &pcs);
> +	if (ret)
> +		return ret;
> +
> +	priv->temp.target.tjmax =
> +		FIELD_GET(TEMP_TARGET_REF_TEMP_MASK, pcs) * MILLIDEGREE_PER_DEGREE;
> +
> +	tcontrol_margin = FIELD_GET(TEMP_TARGET_FAN_TEMP_MASK, pcs);
> +	tcontrol_margin = sign_extend32(tcontrol_margin, 7) * MILLIDEGREE_PER_DEGREE;
> +	priv->temp.target.tcontrol = priv->temp.target.tjmax - tcontrol_margin;
> +
> +	tthrottle_offset = FIELD_GET(TEMP_TARGET_TJ_OFFSET_MASK, pcs) * MILLIDEGREE_PER_DEGREE;
> +	priv->temp.target.tthrottle = priv->temp.target.tjmax - tthrottle_offset;
> +
> +	peci_sensor_mark_updated(&priv->temp.target.state);
> +
> +	return 0;
> +}
> +
> +static int get_temp_target(struct peci_cputemp *priv, enum peci_temp_target_type type, long *val)

Why is this a long * ? It only uses variables which are declared as s32.

> +{
> +	int ret;
> +
> +	mutex_lock(&priv->temp.target.state.lock);
> +
> +	ret = update_temp_target(priv);
> +	if (ret)
> +		goto unlock;
> +
> +	switch (type) {
> +	case tcontrol_type:
> +		*val = priv->temp.target.tcontrol;
> +		break;
> +	case tthrottle_type:
> +		*val = priv->temp.target.tthrottle;
> +		break;
> +	case tjmax_type:
> +		*val = priv->temp.target.tjmax;
> +		break;
> +	case crit_hyst_type:
> +		*val = priv->temp.target.tjmax - priv->temp.target.tcontrol;
> +		break;
> +	default:
> +		ret = -EOPNOTSUPP;
> +		break;
> +	}
> +unlock:
> +	mutex_unlock(&priv->temp.target.state.lock);
> +
> +	return ret;
> +}
> +
> +/*
> + * Error codes:
> + *   0x8000: General sensor error
> + *   0x8001: Reserved
> + *   0x8002: Underflow on reading value
> + *   0x8003-0x81ff: Reserved
> + */
> +static bool dts_valid(s32 val)
> +{
> +	return val < 0x8000 || val > 0x81ff;
> +}
> +
> +/*
> + * Processors return a value of DTS reading in S10.6 fixed point format
> + * (16 bits: 10-bit signed magnitude, 6-bit fraction).
> + */
> +static s32 dts_ten_dot_six_to_millidegree(s32 val)
> +{
> +	return sign_extend32(val, 15) * MILLIDEGREE_PER_DEGREE / 64;
> +}
> +
> +/*
> + * For older processors, thermal margin reading is returned in S8.8 fixed
> + * point format (16 bits: 8-bit signed magnitude, 8-bit fraction).
> + */
> +static s32 dts_eight_dot_eight_to_millidegree(s32 val)
> +{
> +	return sign_extend32(val, 15) * MILLIDEGREE_PER_DEGREE / 256;
> +}
> +
> +static int get_die_temp(struct peci_cputemp *priv, long *val)
> +{
> +	int ret = 0;
> +	long tjmax;
> +	s16 temp;
> +
> +	mutex_lock(&priv->temp.die.state.lock);
> +	if (!peci_sensor_need_update(&priv->temp.die.state))
> +		goto skip_update;
> +
> +	ret = peci_temp_read(priv->peci_dev, &temp);
> +	if (ret)
> +		goto err_unlock;
> +
> +	if (!dts_valid(temp)) {
> +		ret = -EIO;
> +		goto err_unlock;
> +	}
> +

I am getting lost with all the variable types. temp is s16,
passed to dts_valid() as s32 and thus presumably sign extended.
Since temp is s16, that means that a positive value of 0x8000
or larger isn't really possible, and dts_valid() will never return
false. With that in mind, you might as well drop the check.

> +	ret = get_temp_target(priv, tjmax_type, &tjmax);
> +	if (ret)
> +		goto err_unlock;
> +
> +	priv->temp.die.value = (s32)tjmax + dts_ten_dot_six_to_millidegree(temp);
> +
> +	peci_sensor_mark_updated(&priv->temp.die.state);
> +
> +skip_update:
> +	*val = priv->temp.die.value;
> +err_unlock:
> +	mutex_unlock(&priv->temp.die.state.lock);
> +	return ret;
> +}
> +
> +static int get_dts(struct peci_cputemp *priv, long *val)
> +{
> +	int ret = 0;
> +	s32 thermal_margin;
> +	long tcontrol;
> +	u32 pcs;
> +
> +	mutex_lock(&priv->temp.dts.state.lock);
> +	if (!peci_sensor_need_update(&priv->temp.dts.state))
> +		goto skip_update;
> +
> +	ret = peci_pcs_read(priv->peci_dev, PECI_PCS_THERMAL_MARGIN, 0, &pcs);
> +	if (ret)
> +		goto err_unlock;
> +
> +	thermal_margin = FIELD_GET(DTS_MARGIN_MASK, pcs);
> +	if (!dts_valid(thermal_margin)) { > +		ret = -EIO;
> +		goto err_unlock;
> +	}
> +
> +	ret = get_temp_target(priv, tcontrol_type, &tcontrol);
> +	if (ret)
> +		goto err_unlock;
> +
> +	/* Note that the tcontrol should be available before calling it */
> +	priv->temp.dts.value =
> +		(s32)tcontrol - priv->gen_info->thermal_margin_to_millidegree(thermal_margin);

tcontrol is stored as s32, converted to long, only to be typecasted
back to s32. Any reason for doing that ?

> +
> +	peci_sensor_mark_updated(&priv->temp.dts.state);
> +
> +skip_update:
> +	*val = priv->temp.dts.value;
> +err_unlock:
> +	mutex_unlock(&priv->temp.dts.state.lock);
> +	return ret;
> +}
> +
> +static int get_core_temp(struct peci_cputemp *priv, int core_index, long *val)
> +{
> +	int ret = 0;
> +	s32 core_dts_margin;
> +	long tjmax;
> +	u32 pcs;
> +
> +	mutex_lock(&priv->temp.core[core_index].state.lock);
> +	if (!peci_sensor_need_update(&priv->temp.core[core_index].state))
> +		goto skip_update;
> +
> +	ret = peci_pcs_read(priv->peci_dev, PECI_PCS_MODULE_TEMP, core_index, &pcs);
> +	if (ret)
> +		goto err_unlock;
> +
> +	core_dts_margin = FIELD_GET(PCS_MODULE_TEMP_MASK, pcs);
> +	if (!dts_valid(core_dts_margin)) {
> +		ret = -EIO;
> +		goto err_unlock;
> +	}
> +
> +	ret = get_temp_target(priv, tjmax_type, &tjmax);
> +	if (ret)
> +		goto err_unlock;
> +
> +	/* Note that the tjmax should be available before calling it */
> +	priv->temp.core[core_index].value =
> +		(s32)tjmax + dts_ten_dot_six_to_millidegree(core_dts_margin);
> +
> +	peci_sensor_mark_updated(&priv->temp.core[core_index].state);
> +
> +skip_update:
> +	*val = priv->temp.core[core_index].value;
> +err_unlock:
> +	mutex_unlock(&priv->temp.core[core_index].state.lock);
> +	return ret;
> +}
> +
> +static int cputemp_read_string(struct device *dev, enum hwmon_sensor_types type,
> +			       u32 attr, int channel, const char **str)
> +{
> +	struct peci_cputemp *priv = dev_get_drvdata(dev);
> +
> +	if (attr != hwmon_temp_label)
> +		return -EOPNOTSUPP;
> +
> +	*str = channel < channel_core ?
> +		cputemp_label[channel] : priv->coretemp_label[channel - channel_core];
> +
> +	return 0;
> +}
> +
> +static int cputemp_read(struct device *dev, enum hwmon_sensor_types type,
> +			u32 attr, int channel, long *val)
> +{
> +	struct peci_cputemp *priv = dev_get_drvdata(dev);
> +
> +	switch (attr) {
> +	case hwmon_temp_input:
> +		switch (channel) {
> +		case channel_die:
> +			return get_die_temp(priv, val);
> +		case channel_dts:
> +			return get_dts(priv, val);
> +		case channel_tcontrol:
> +			return get_temp_target(priv, tcontrol_type, val);
> +		case channel_tthrottle:
> +			return get_temp_target(priv, tthrottle_type, val);
> +		case channel_tjmax:
> +			return get_temp_target(priv, tjmax_type, val);
> +		default:
> +			return get_core_temp(priv, channel - channel_core, val);
> +		}
> +		break;
> +	case hwmon_temp_max:
> +		return get_temp_target(priv, tcontrol_type, val);
> +	case hwmon_temp_crit:
> +		return get_temp_target(priv, tjmax_type, val);
> +	case hwmon_temp_crit_hyst:
> +		return get_temp_target(priv, crit_hyst_type, val);
> +	default:
> +		return -EOPNOTSUPP;
> +	}
> +
> +	return 0;
> +}
> +
> +static umode_t cputemp_is_visible(const void *data, enum hwmon_sensor_types type,
> +				  u32 attr, int channel)
> +{
> +	const struct peci_cputemp *priv = data;
> +
> +	if (channel > CPUTEMP_CHANNEL_NUMS)
> +		return 0;
> +
> +	if (channel < channel_core)
> +		return 0444;
> +
> +	if (test_bit(channel - channel_core, priv->core_mask))
> +		return 0444;
> +
> +	return 0;
> +}
> +
> +static int init_core_mask(struct peci_cputemp *priv)
> +{
> +	struct peci_device *peci_dev = priv->peci_dev;
> +	struct resolved_cores_reg *reg = priv->gen_info->reg;
> +	u64 core_mask;
> +	u32 data;
> +	int ret;
> +
> +	/* Get the RESOLVED_CORES register value */
> +	switch (peci_dev->info.model) {
> +	case INTEL_FAM6_ICELAKE_X:
> +	case INTEL_FAM6_ICELAKE_D:
> +		ret = peci_ep_pci_local_read(peci_dev, 0, reg->bus, reg->dev,
> +					     reg->func, reg->offset + 4, &data);
> +		if (ret)
> +			return ret;
> +
> +		core_mask = (u64)data << 32;
> +
> +		ret = peci_ep_pci_local_read(peci_dev, 0, reg->bus, reg->dev,
> +					     reg->func, reg->offset, &data);
> +		if (ret)
> +			return ret;
> +
> +		core_mask |= data;
> +
> +		break;
> +	default:
> +		ret = peci_pci_local_read(peci_dev, reg->bus, reg->dev,
> +					  reg->func, reg->offset, &data);
> +		if (ret)
> +			return ret;
> +
> +		core_mask = data;
> +
> +		break;
> +	}
> +
> +	if (!core_mask)
> +		return -EIO;
> +
> +	bitmap_from_u64(priv->core_mask, core_mask);
> +
> +	return 0;
> +}
> +
> +static int create_temp_label(struct peci_cputemp *priv)
> +{
> +	unsigned long core_max = find_last_bit(priv->core_mask, CORE_NUMS_MAX);
> +	int i;
> +
> +	priv->coretemp_label = devm_kzalloc(priv->dev, core_max * sizeof(char *), GFP_KERNEL);
> +	if (!priv->coretemp_label)
> +		return -ENOMEM;
> +
> +	for_each_set_bit(i, priv->core_mask, CORE_NUMS_MAX) {
> +		priv->coretemp_label[i] = devm_kasprintf(priv->dev, GFP_KERNEL, "Core %d", i);
> +		if (!priv->coretemp_label[i])
> +			return -ENOMEM;
> +	}
> +
> +	return 0;
> +}
> +
> +static void check_resolved_cores(struct peci_cputemp *priv)
> +{
> +	/*
> +	 * Failure to resolve cores is non-critical, we're still able to
> +	 * provide other sensor data.
> +	 */
> +
> +	if (init_core_mask(priv))
> +		return;
> +
> +	if (create_temp_label(priv))
> +		bitmap_zero(priv->core_mask, CORE_NUMS_MAX);
> +}
> +
> +static void sensor_init(struct peci_cputemp *priv)
> +{
> +	int i;
> +
> +	mutex_init(&priv->temp.target.state.lock);
> +	mutex_init(&priv->temp.die.state.lock);
> +	mutex_init(&priv->temp.dts.state.lock);
> +
> +	for_each_set_bit(i, priv->core_mask, CORE_NUMS_MAX)
> +		mutex_init(&priv->temp.core[i].state.lock);
> +}
> +
> +static const struct hwmon_ops peci_cputemp_ops = {
> +	.is_visible = cputemp_is_visible,
> +	.read_string = cputemp_read_string,
> +	.read = cputemp_read,
> +};
> +
> +static const u32 peci_cputemp_temp_channel_config[] = {
> +	/* Die temperature */
> +	HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_CRIT | HWMON_T_CRIT_HYST,
> +	/* DTS margin */
> +	HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_CRIT | HWMON_T_CRIT_HYST,
> +	/* Tcontrol temperature */
> +	HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_CRIT,
> +	/* Tthrottle temperature */
> +	HWMON_T_LABEL | HWMON_T_INPUT,
> +	/* Tjmax temperature */
> +	HWMON_T_LABEL | HWMON_T_INPUT,
> +	/* Core temperature - for all core channels */
> +	[channel_core ... CPUTEMP_CHANNEL_NUMS - 1] = HWMON_T_LABEL | HWMON_T_INPUT,
> +	0
> +};
> +
> +static const struct hwmon_channel_info peci_cputemp_temp_channel = {
> +	.type = hwmon_temp,
> +	.config = peci_cputemp_temp_channel_config,
> +};
> +
> +static const struct hwmon_channel_info *peci_cputemp_info[] = {
> +	&peci_cputemp_temp_channel,
> +	NULL
> +};
> +
> +static const struct hwmon_chip_info peci_cputemp_chip_info = {
> +	.ops = &peci_cputemp_ops,
> +	.info = peci_cputemp_info,
> +};
> +
> +static int peci_cputemp_probe(struct auxiliary_device *adev,
> +			      const struct auxiliary_device_id *id)
> +{
> +	struct device *dev = &adev->dev;
> +	struct peci_device *peci_dev = to_peci_device(dev->parent);
> +	struct peci_cputemp *priv;
> +	struct device *hwmon_dev;
> +
> +	priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
> +	if (!priv)
> +		return -ENOMEM;
> +
> +	priv->name = devm_kasprintf(dev, GFP_KERNEL, "peci_cputemp.cpu%d",
> +				    peci_dev->info.socket_id);
> +	if (!priv->name)
> +		return -ENOMEM;
> +
> +	priv->dev = dev;
> +	priv->peci_dev = peci_dev;
> +	priv->gen_info = (const struct cpu_info *)id->driver_data;
> +
> +	/*
> +	 * This is just a sanity check. Since we're using commands that are
> +	 * guaranteed to be supported on a given platform, we should never see
> +	 * revision lower than expected.
> +	 */
> +	if (peci_dev->info.peci_revision < priv->gen_info->min_peci_revision)
> +		dev_warn(priv->dev,
> +			 "Unexpected PECI revision %#x, some features may be unavailable\n",
> +			 peci_dev->info.peci_revision);
> +
> +	check_resolved_cores(priv);
> +
> +	sensor_init(priv);
> +
> +	hwmon_dev = devm_hwmon_device_register_with_info(priv->dev, priv->name,
> +							 priv, &peci_cputemp_chip_info, NULL);
> +
> +	return PTR_ERR_OR_ZERO(hwmon_dev);
> +}
> +
> +/*
> + * RESOLVED_CORES PCI configuration register may have different location on
> + * different platforms.
> + */
> +static struct resolved_cores_reg resolved_cores_reg_hsx = {
> +	.bus = 1,
> +	.dev = 30,
> +	.func = 3,
> +	.offset = 0xb4,
> +};
> +
> +static struct resolved_cores_reg resolved_cores_reg_icx = {
> +	.bus = 14,
> +	.dev = 30,
> +	.func = 3,
> +	.offset = 0xd0,
> +};
> +
> +static const struct cpu_info cpu_hsx = {
> +	.reg		= &resolved_cores_reg_hsx,
> +	.min_peci_revision = 0x33,
> +	.thermal_margin_to_millidegree = &dts_eight_dot_eight_to_millidegree,
> +};
> +
> +static const struct cpu_info cpu_icx = {
> +	.reg		= &resolved_cores_reg_icx,
> +	.min_peci_revision = 0x40,
> +	.thermal_margin_to_millidegree = &dts_ten_dot_six_to_millidegree,
> +};
> +
> +static const struct auxiliary_device_id peci_cputemp_ids[] = {
> +	{
> +		.name = "peci_cpu.cputemp.hsx",
> +		.driver_data = (kernel_ulong_t)&cpu_hsx,
> +	},
> +	{
> +		.name = "peci_cpu.cputemp.bdx",
> +		.driver_data = (kernel_ulong_t)&cpu_hsx,
> +	},
> +	{
> +		.name = "peci_cpu.cputemp.bdxd",
> +		.driver_data = (kernel_ulong_t)&cpu_hsx,
> +	},
> +	{
> +		.name = "peci_cpu.cputemp.skx",
> +		.driver_data = (kernel_ulong_t)&cpu_hsx,
> +	},
> +	{
> +		.name = "peci_cpu.cputemp.icx",
> +		.driver_data = (kernel_ulong_t)&cpu_icx,
> +	},
> +	{
> +		.name = "peci_cpu.cputemp.icxd",
> +		.driver_data = (kernel_ulong_t)&cpu_icx,
> +	},
> +	{ }
> +};
> +MODULE_DEVICE_TABLE(auxiliary, peci_cputemp_ids);
> +
> +static struct auxiliary_driver peci_cputemp_driver = {
> +	.probe		= peci_cputemp_probe,
> +	.id_table	= peci_cputemp_ids,
> +};
> +
> +module_auxiliary_driver(peci_cputemp_driver);
> +
> +MODULE_AUTHOR("Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>");
> +MODULE_AUTHOR("Iwona Winiarska <iwona.winiarska@intel.com>");
> +MODULE_DESCRIPTION("PECI cputemp driver");
> +MODULE_LICENSE("GPL");
> +MODULE_IMPORT_NS(PECI_CPU);
> 


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 06/13] peci: Add device detection
  2021-11-15 22:35     ` Winiarska, Iwona
@ 2021-11-16  6:26       ` gregkh
  2021-11-17 23:19         ` Winiarska, Iwona
  0 siblings, 1 reply; 24+ messages in thread
From: gregkh @ 2021-11-16  6:26 UTC (permalink / raw)
  To: Winiarska, Iwona
  Cc: linux-aspeed, linux-doc, Hansen, Dave, zweiss, jae.hyun.yoo,
	corbet, openbmc, pierre-louis.bossart, linux, devicetree,
	jdelvare, arnd, robh+dt, bp, Williams, Dan J, andriy.shevchenko,
	linux-arm-kernel, linux-hwmon, Luck, Tony, andrew, rdunlap,
	linux-kernel, olof

On Mon, Nov 15, 2021 at 10:35:23PM +0000, Winiarska, Iwona wrote:
> On Mon, 2021-11-15 at 19:49 +0100, Greg Kroah-Hartman wrote:
> > On Mon, Nov 15, 2021 at 07:25:45PM +0100, Iwona Winiarska wrote:
> > > +void peci_device_destroy(struct peci_device *device)
> > > +{
> > > +       bool killed;
> > > +
> > > +       device_lock(&device->dev);
> > > +       killed = kill_device(&device->dev);
> > 
> > Eeek, why call this?
> > 
> > > +       device_unlock(&device->dev);
> > > +
> > > +       if (!killed)
> > > +               return;
> > 
> > What happened if something changed after you unlocked it?
> 
> We either killed it, or the other caller killed it.
> 
> > 
> > Why is kill_device() required at all?  That's a very rare function to
> > call, and one that only one "bus" calls today because it is very
> > special (i.e. crazy and broken...)
> 
> It's used to avoid double-delete in case of races between peci_controller
> unregister and "manually" removing the device using sysfs (pointed out by Dan in
> v2). We're calling peci_device_destroy() in both callsites.
> Other way to solve it would be to just have a peci-specific lock, but
> kill_device seemed to be well suited for the problem at hand.
> Do you suggest to remove it and just go with the lock?

Yes please, remove it and use the lock.

Also, why are you required to have a sysfs file that can remove the
device?  Who wants that?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 00/13] Introduce PECI subsystem
  2021-11-15 18:25 [PATCH v3 00/13] Introduce PECI subsystem Iwona Winiarska
                   ` (12 preceding siblings ...)
  2021-11-15 18:25 ` [PATCH v3 13/13] docs: Add PECI documentation Iwona Winiarska
@ 2021-11-17  3:56 ` Zev Weiss
  2021-11-17 23:25   ` Winiarska, Iwona
  2021-11-18 12:19 ` Tomer Maimon
  14 siblings, 1 reply; 24+ messages in thread
From: Zev Weiss @ 2021-11-17  3:56 UTC (permalink / raw)
  To: Iwona Winiarska
  Cc: linux-aspeed, linux-doc, Dave Hansen, Jae Hyun Yoo,
	Jonathan Corbet, openbmc, Pierre-Louis Bossart, Guenter Roeck,
	devicetree, Jean Delvare, Arnd Bergmann, Rob Herring,
	Borislav Petkov, Dan Williams, Andy Shevchenko, linux-arm-kernel,
	linux-hwmon, Tony Luck, Andrew Jeffery, Greg Kroah-Hartman,
	Randy Dunlap, linux-kernel, Olof Johansson

On Mon, Nov 15, 2021 at 10:25:39AM PST, Iwona Winiarska wrote:
>Hi,
>
>This is a third round of patches introducing PECI subsystem.
>Sorry for the delay between v2 and v3.
>

Hi Iwona,

I've done some testing of these patches on my AST2500/E-2778G OpenBMC
platform -- I had to do a small bit of hacking to add support for
INTEL_FAM6_KABYLAKE, but with that in place the newly-added code for the
8.8 format seems to work as it should.  Thanks!

In poking at it a bit further I encountered some sub-optimal behavior
w.r.t. to host power state transitions and timeouts though --
essentially, if I ever hit a timeout in aspeed_peci_xfer() (for example
on a read of a hwmon tempX_input file after an unexpected host
shutdown), it seems to get stuck in a state where even if the host comes
back online, all attempted PECI transfers continue just timing out.
(Rebooting the BMC seems to resolve the problem.)  This also happens if
I remove the peci client device via the 'remove' sysfs file, shut down
the host, and then do a rescan via sysfs while the host is off (i.e.
another operation that times out).

Let me know if there's any other info that would be helpful for
debugging.


Thanks,
Zev

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 10/13] hwmon: peci: Add cputemp driver
  2021-11-16  0:52   ` Guenter Roeck
@ 2021-11-17 22:20     ` Winiarska, Iwona
  0 siblings, 0 replies; 24+ messages in thread
From: Winiarska, Iwona @ 2021-11-17 22:20 UTC (permalink / raw)
  To: linux, gregkh, linux-kernel, openbmc
  Cc: linux-aspeed, linux-doc, Hansen, Dave, zweiss, jae.hyun.yoo,
	corbet, pierre-louis.bossart, devicetree, jdelvare, arnd,
	robh+dt, bp, Williams, Dan J, andriy.shevchenko,
	linux-arm-kernel, linux-hwmon, Luck,  Tony, andrew, rdunlap,
	olof

On Mon, 2021-11-15 at 16:52 -0800, Guenter Roeck wrote:
> On 11/15/21 10:25 AM, Iwona Winiarska wrote:
> > Add peci-cputemp driver for Digital Thermal Sensor (DTS) thermal
> > readings of the processor package and processor cores that are
> > accessible via the PECI interface.
> > 
> > The main use case for the driver (and PECI interface) is out-of-band
> > management, where we're able to obtain the DTS readings from an external
> > entity connected with PECI, e.g. BMC on server platforms.
> > 
> > Co-developed-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
> > Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
> > Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com>
> > Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
> > ---
> >   MAINTAINERS                  |   7 +
> >   drivers/hwmon/Kconfig        |   2 +
> >   drivers/hwmon/Makefile       |   1 +
> >   drivers/hwmon/peci/Kconfig   |  18 ++
> >   drivers/hwmon/peci/Makefile  |   5 +
> >   drivers/hwmon/peci/common.h  |  58 ++++
> >   drivers/hwmon/peci/cputemp.c | 592 +++++++++++++++++++++++++++++++++++
> >   7 files changed, 683 insertions(+)
> >   create mode 100644 drivers/hwmon/peci/Kconfig
> >   create mode 100644 drivers/hwmon/peci/Makefile
> >   create mode 100644 drivers/hwmon/peci/common.h
> >   create mode 100644 drivers/hwmon/peci/cputemp.c
> > 
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index d751590f32e3..0094370be6c4 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -14920,6 +14920,13 @@ L:     platform-driver-x86@vger.kernel.org
> >   S:    Maintained
> >   F:    drivers/platform/x86/peaq-wmi.c
> >   
> > +PECI HARDWARE MONITORING DRIVERS
> > +M:     Iwona Winiarska <iwona.winiarska@intel.com>
> > +R:     Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
> > +L:     linux-hwmon@vger.kernel.org
> > +S:     Supported
> > +F:     drivers/hwmon/peci/
> > +
> >   PECI SUBSYSTEM
> >   M:    Iwona Winiarska <iwona.winiarska@intel.com>
> >   R:    Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
> > diff --git a/drivers/hwmon/Kconfig b/drivers/hwmon/Kconfig
> > index 64bd3dfba2c4..4858f7cfa81e 100644
> > --- a/drivers/hwmon/Kconfig
> > +++ b/drivers/hwmon/Kconfig
> > @@ -1528,6 +1528,8 @@ config SENSORS_PCF8591
> >           These devices are hard to detect and rarely found on mainstream
> >           hardware. If unsure, say N.
> >   
> > +source "drivers/hwmon/peci/Kconfig"
> > +
> >   source "drivers/hwmon/pmbus/Kconfig"
> >   
> >   config SENSORS_PWM_FAN
> > diff --git a/drivers/hwmon/Makefile b/drivers/hwmon/Makefile
> > index baee6a8d4dd1..2c0e210a28b8 100644
> > --- a/drivers/hwmon/Makefile
> > +++ b/drivers/hwmon/Makefile
> > @@ -204,6 +204,7 @@ obj-$(CONFIG_SENSORS_WM8350)        += wm8350-hwmon.o
> >   obj-$(CONFIG_SENSORS_XGENE)   += xgene-hwmon.o
> >   
> >   obj-$(CONFIG_SENSORS_OCC)     += occ/
> > +obj-$(CONFIG_SENSORS_PECI)     += peci/
> >   obj-$(CONFIG_PMBUS)           += pmbus/
> >   
> >   ccflags-$(CONFIG_HWMON_DEBUG_CHIP) := -DDEBUG
> > diff --git a/drivers/hwmon/peci/Kconfig b/drivers/hwmon/peci/Kconfig
> > new file mode 100644
> > index 000000000000..e10eed68d70a
> > --- /dev/null
> > +++ b/drivers/hwmon/peci/Kconfig
> > @@ -0,0 +1,18 @@
> > +# SPDX-License-Identifier: GPL-2.0-only
> > +
> > +config SENSORS_PECI_CPUTEMP
> > +       tristate "PECI CPU temperature monitoring client"
> > +       depends on PECI
> > +       select SENSORS_PECI
> > +       select PECI_CPU
> > +       help
> > +         If you say yes here you get support for the generic Intel PECI
> > +         cputemp driver which provides Digital Thermal Sensor (DTS) thermal
> > +         readings of the CPU package and CPU cores that are accessible via
> > +         the processor PECI interface.
> > +
> > +         This driver can also be built as a module. If so, the module
> > +         will be called peci-cputemp.
> > +
> > +config SENSORS_PECI
> > +       tristate
> > diff --git a/drivers/hwmon/peci/Makefile b/drivers/hwmon/peci/Makefile
> > new file mode 100644
> > index 000000000000..e8a0ada5ab1f
> > --- /dev/null
> > +++ b/drivers/hwmon/peci/Makefile
> > @@ -0,0 +1,5 @@
> > +# SPDX-License-Identifier: GPL-2.0-only
> > +
> > +peci-cputemp-y := cputemp.o
> > +
> > +obj-$(CONFIG_SENSORS_PECI_CPUTEMP)     += peci-cputemp.o
> > diff --git a/drivers/hwmon/peci/common.h b/drivers/hwmon/peci/common.h
> > new file mode 100644
> > index 000000000000..734506b0eca2
> > --- /dev/null
> > +++ b/drivers/hwmon/peci/common.h
> > @@ -0,0 +1,58 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +/* Copyright (c) 2021 Intel Corporation */
> > +
> > +#include <linux/mutex.h>
> > +#include <linux/types.h>
> > +
> > +#ifndef __PECI_HWMON_COMMON_H
> > +#define __PECI_HWMON_COMMON_H
> > +
> > +#define PECI_HWMON_UPDATE_INTERVAL     HZ
> > +
> > +/**
> > + * struct peci_sensor_state - PECI state information
> > + * @valid: flag to indicate the sensor value is valid
> > + * @last_updated: time of the last update in jiffies
> > + * @lock: mutex to protect sensor access
> > + */
> > +struct peci_sensor_state {
> > +       bool valid;
> > +       unsigned long last_updated;
> > +       struct mutex lock; /* protect sensor access */
> > +};
> > +
> > +/**
> > + * struct peci_sensor_data - PECI sensor information
> > + * @value: sensor value in milli units
> > + * @state: sensor update state
> > + */
> > +
> > +struct peci_sensor_data {
> > +       s32 value;
> > +       struct peci_sensor_state state;
> > +};
> > +
> > +/**
> > + * peci_sensor_need_update() - check whether sensor update is needed or not
> > + * @sensor: pointer to sensor data struct
> > + *
> > + * Return: true if update is needed, false if not.
> > + */
> > +
> > +static inline bool peci_sensor_need_update(struct peci_sensor_state *state)
> > +{
> > +       return !state->valid ||
> > +              time_after(jiffies, state->last_updated +
> > PECI_HWMON_UPDATE_INTERVAL);
> > +}
> > +
> > +/**
> > + * peci_sensor_mark_updated() - mark the sensor is updated
> > + * @sensor: pointer to sensor data struct
> > + */
> > +static inline void peci_sensor_mark_updated(struct peci_sensor_state
> > *state)
> > +{
> > +       state->valid = true;
> > +       state->last_updated = jiffies;
> > +}
> > +
> > +#endif /* __PECI_HWMON_COMMON_H */
> > diff --git a/drivers/hwmon/peci/cputemp.c b/drivers/hwmon/peci/cputemp.c
> > new file mode 100644
> > index 000000000000..d264d6c2f437
> > --- /dev/null
> > +++ b/drivers/hwmon/peci/cputemp.c
> > @@ -0,0 +1,592 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +// Copyright (c) 2018-2021 Intel Corporation
> > +
> > +#include <linux/auxiliary_bus.h>
> > +#include <linux/bitfield.h>
> > +#include <linux/bitops.h>
> > +#include <linux/hwmon.h>
> > +#include <linux/jiffies.h>
> > +#include <linux/module.h>
> > +#include <linux/peci.h>
> > +#include <linux/peci-cpu.h>
> > +#include <linux/units.h>
> > +
> > +#include "common.h"
> > +
> > +#define CORE_NUMS_MAX          64
> > +
> > +#define BASE_CHANNEL_NUMS      5
> > +#define CPUTEMP_CHANNEL_NUMS   (BASE_CHANNEL_NUMS + CORE_NUMS_MAX)
> > +
> > +#define TEMP_TARGET_FAN_TEMP_MASK      GENMASK(15, 8)
> > +#define TEMP_TARGET_REF_TEMP_MASK      GENMASK(23, 16)
> > +#define TEMP_TARGET_TJ_OFFSET_MASK     GENMASK(29, 24)
> > +
> > +#define DTS_MARGIN_MASK                GENMASK(15, 0)
> > +#define PCS_MODULE_TEMP_MASK   GENMASK(15, 0)
> > +
> > +struct resolved_cores_reg {
> > +       u8 bus;
> > +       u8 dev;
> > +       u8 func;
> > +       u8 offset;
> > +};
> > +
> > +struct cpu_info {
> > +       struct resolved_cores_reg *reg;
> > +       u8 min_peci_revision;
> > +       s32 (*thermal_margin_to_millidegree)(s32 val);
> > +};
> > +
> > +struct peci_temp_target {
> > +       s32 tcontrol;
> > +       s32 tthrottle;
> > +       s32 tjmax;
> > +       struct peci_sensor_state state;
> > +};
> > +
> > +enum peci_temp_target_type {
> > +       tcontrol_type,
> > +       tthrottle_type,
> > +       tjmax_type,
> > +       crit_hyst_type,
> > +};
> > +
> > +struct peci_cputemp {
> > +       struct peci_device *peci_dev;
> > +       struct device *dev;
> > +       const char *name;
> > +       const struct cpu_info *gen_info;
> > +       struct {
> > +               struct peci_temp_target target;
> > +               struct peci_sensor_data die;
> > +               struct peci_sensor_data dts;
> > +               struct peci_sensor_data core[CORE_NUMS_MAX];
> > +       } temp;
> > +       const char **coretemp_label;
> > +       DECLARE_BITMAP(core_mask, CORE_NUMS_MAX);
> > +};
> > +
> > +enum cputemp_channels {
> > +       channel_die,
> > +       channel_dts,
> > +       channel_tcontrol,
> > +       channel_tthrottle,
> > +       channel_tjmax,
> > +       channel_core,
> > +};
> > +
> > +static const char * const cputemp_label[BASE_CHANNEL_NUMS] = {
> > +       "Die",
> > +       "DTS",
> > +       "Tcontrol",
> > +       "Tthrottle",
> > +       "Tjmax",
> > +};
> > +
> > +static int update_temp_target(struct peci_cputemp *priv)
> > +{
> > +       s32 tthrottle_offset, tcontrol_margin;
> > +       u32 pcs;
> > +       int ret;
> > +
> > +       if (!peci_sensor_need_update(&priv->temp.target.state))
> > +               return 0;
> > +
> > +       ret = peci_pcs_read(priv->peci_dev, PECI_PCS_TEMP_TARGET, 0, &pcs);
> > +       if (ret)
> > +               return ret;
> > +
> > +       priv->temp.target.tjmax =
> > +               FIELD_GET(TEMP_TARGET_REF_TEMP_MASK, pcs) *
> > MILLIDEGREE_PER_DEGREE;
> > +
> > +       tcontrol_margin = FIELD_GET(TEMP_TARGET_FAN_TEMP_MASK, pcs);
> > +       tcontrol_margin = sign_extend32(tcontrol_margin, 7) *
> > MILLIDEGREE_PER_DEGREE;
> > +       priv->temp.target.tcontrol = priv->temp.target.tjmax -
> > tcontrol_margin;
> > +
> > +       tthrottle_offset = FIELD_GET(TEMP_TARGET_TJ_OFFSET_MASK, pcs) *
> > MILLIDEGREE_PER_DEGREE;
> > +       priv->temp.target.tthrottle = priv->temp.target.tjmax -
> > tthrottle_offset;
> > +
> > +       peci_sensor_mark_updated(&priv->temp.target.state);
> > +
> > +       return 0;
> > +}
> > +
> > +static int get_temp_target(struct peci_cputemp *priv, enum
> > peci_temp_target_type type, long *val)
> 
> Why is this a long * ? It only uses variables which are declared as s32.

It's going to be used by cputemp_read, which implements hwmon_ops.read (which
uses long *val).

> 
> > +{
> > +       int ret;
> > +
> > +       mutex_lock(&priv->temp.target.state.lock);
> > +
> > +       ret = update_temp_target(priv);
> > +       if (ret)
> > +               goto unlock;
> > +
> > +       switch (type) {
> > +       case tcontrol_type:
> > +               *val = priv->temp.target.tcontrol;
> > +               break;
> > +       case tthrottle_type:
> > +               *val = priv->temp.target.tthrottle;
> > +               break;
> > +       case tjmax_type:
> > +               *val = priv->temp.target.tjmax;
> > +               break;
> > +       case crit_hyst_type:
> > +               *val = priv->temp.target.tjmax - priv->temp.target.tcontrol;
> > +               break;
> > +       default:
> > +               ret = -EOPNOTSUPP;
> > +               break;
> > +       }
> > +unlock:
> > +       mutex_unlock(&priv->temp.target.state.lock);
> > +
> > +       return ret;
> > +}
> > +
> > +/*
> > + * Error codes:
> > + *   0x8000: General sensor error
> > + *   0x8001: Reserved
> > + *   0x8002: Underflow on reading value
> > + *   0x8003-0x81ff: Reserved
> > + */
> > +static bool dts_valid(s32 val)
> > +{
> > +       return val < 0x8000 || val > 0x81ff;
> > +}
> > +
> > +/*
> > + * Processors return a value of DTS reading in S10.6 fixed point format
> > + * (16 bits: 10-bit signed magnitude, 6-bit fraction).
> > + */
> > +static s32 dts_ten_dot_six_to_millidegree(s32 val)
> > +{
> > +       return sign_extend32(val, 15) * MILLIDEGREE_PER_DEGREE / 64;
> > +}
> > +
> > +/*
> > + * For older processors, thermal margin reading is returned in S8.8 fixed
> > + * point format (16 bits: 8-bit signed magnitude, 8-bit fraction).
> > + */
> > +static s32 dts_eight_dot_eight_to_millidegree(s32 val)
> > +{
> > +       return sign_extend32(val, 15) * MILLIDEGREE_PER_DEGREE / 256;
> > +}
> > +
> > +static int get_die_temp(struct peci_cputemp *priv, long *val)
> > +{
> > +       int ret = 0;
> > +       long tjmax;
> > +       s16 temp;
> > +
> > +       mutex_lock(&priv->temp.die.state.lock);
> > +       if (!peci_sensor_need_update(&priv->temp.die.state))
> > +               goto skip_update;
> > +
> > +       ret = peci_temp_read(priv->peci_dev, &temp);
> > +       if (ret)
> > +               goto err_unlock;
> > +
> > +       if (!dts_valid(temp)) {
> > +               ret = -EIO;
> > +               goto err_unlock;
> > +       }
> > +
> 
> I am getting lost with all the variable types. temp is s16,
> passed to dts_valid() as s32 and thus presumably sign extended.
> Since temp is s16, that means that a positive value of 0x8000
> or larger isn't really possible, and dts_valid() will never return
> false. With that in mind, you might as well drop the check.

It is a bug - dts_valid() should operate on unsigned value.
I'll fix it to take u16 instead of s32.
The idea then would be to treat value obtained from peci_temp_read (or any other
function that pulls values in "DTS" S10.6 or S8.8 format) as "raw" u16 and only
convert it to proper signed value after using respective X_to_milidegree
function.

> 
> > +       ret = get_temp_target(priv, tjmax_type, &tjmax);
> > +       if (ret)
> > +               goto err_unlock;
> > +
> > +       priv->temp.die.value = (s32)tjmax +
> > dts_ten_dot_six_to_millidegree(temp);
> > +
> > +       peci_sensor_mark_updated(&priv->temp.die.state);
> > +
> > +skip_update:
> > +       *val = priv->temp.die.value;
> > +err_unlock:
> > +       mutex_unlock(&priv->temp.die.state.lock);
> > +       return ret;
> > +}
> > +
> > +static int get_dts(struct peci_cputemp *priv, long *val)
> > +{
> > +       int ret = 0;
> > +       s32 thermal_margin;
> > +       long tcontrol;
> > +       u32 pcs;
> > +
> > +       mutex_lock(&priv->temp.dts.state.lock);
> > +       if (!peci_sensor_need_update(&priv->temp.dts.state))
> > +               goto skip_update;
> > +
> > +       ret = peci_pcs_read(priv->peci_dev, PECI_PCS_THERMAL_MARGIN, 0,
> > &pcs);
> > +       if (ret)
> > +               goto err_unlock;
> > +
> > +       thermal_margin = FIELD_GET(DTS_MARGIN_MASK, pcs);
> > +       if (!dts_valid(thermal_margin)) { > +           ret = -EIO;
> > +               goto err_unlock;
> > +       }
> > +
> > +       ret = get_temp_target(priv, tcontrol_type, &tcontrol);
> > +       if (ret)
> > +               goto err_unlock;
> > +
> > +       /* Note that the tcontrol should be available before calling it */
> > +       priv->temp.dts.value =
> > +               (s32)tcontrol - priv->gen_info-
> > >thermal_margin_to_millidegree(thermal_margin);
> 
> tcontrol is stored as s32, converted to long, only to be typecasted
> back to s32. Any reason for doing that ?

tcontrol is passed to get_temp_target(), which expects long.

Thanks
-Iwona

> 
> > +
> > +       peci_sensor_mark_updated(&priv->temp.dts.state);
> > +
> > +skip_update:
> > +       *val = priv->temp.dts.value;
> > +err_unlock:
> > +       mutex_unlock(&priv->temp.dts.state.lock);
> > +       return ret;
> > +}
> > +
> > +static int get_core_temp(struct peci_cputemp *priv, int core_index, long
> > *val)
> > +{
> > +       int ret = 0;
> > +       s32 core_dts_margin;
> > +       long tjmax;
> > +       u32 pcs;
> > +
> > +       mutex_lock(&priv->temp.core[core_index].state.lock);
> > +       if (!peci_sensor_need_update(&priv->temp.core[core_index].state))
> > +               goto skip_update;
> > +
> > +       ret = peci_pcs_read(priv->peci_dev, PECI_PCS_MODULE_TEMP,
> > core_index, &pcs);
> > +       if (ret)
> > +               goto err_unlock;
> > +
> > +       core_dts_margin = FIELD_GET(PCS_MODULE_TEMP_MASK, pcs);
> > +       if (!dts_valid(core_dts_margin)) {
> > +               ret = -EIO;
> > +               goto err_unlock;
> > +       }
> > +
> > +       ret = get_temp_target(priv, tjmax_type, &tjmax);
> > +       if (ret)
> > +               goto err_unlock;
> > +
> > +       /* Note that the tjmax should be available before calling it */
> > +       priv->temp.core[core_index].value =
> > +               (s32)tjmax +
> > dts_ten_dot_six_to_millidegree(core_dts_margin);
> > +
> > +       peci_sensor_mark_updated(&priv->temp.core[core_index].state);
> > +
> > +skip_update:
> > +       *val = priv->temp.core[core_index].value;
> > +err_unlock:
> > +       mutex_unlock(&priv->temp.core[core_index].state.lock);
> > +       return ret;
> > +}
> > +
> > +static int cputemp_read_string(struct device *dev, enum hwmon_sensor_types
> > type,
> > +                              u32 attr, int channel, const char **str)
> > +{
> > +       struct peci_cputemp *priv = dev_get_drvdata(dev);
> > +
> > +       if (attr != hwmon_temp_label)
> > +               return -EOPNOTSUPP;
> > +
> > +       *str = channel < channel_core ?
> > +               cputemp_label[channel] : priv->coretemp_label[channel -
> > channel_core];
> > +
> > +       return 0;
> > +}
> > +
> > +static int cputemp_read(struct device *dev, enum hwmon_sensor_types type,
> > +                       u32 attr, int channel, long *val)
> > +{
> > +       struct peci_cputemp *priv = dev_get_drvdata(dev);
> > +
> > +       switch (attr) {
> > +       case hwmon_temp_input:
> > +               switch (channel) {
> > +               case channel_die:
> > +                       return get_die_temp(priv, val);
> > +               case channel_dts:
> > +                       return get_dts(priv, val);
> > +               case channel_tcontrol:
> > +                       return get_temp_target(priv, tcontrol_type, val);
> > +               case channel_tthrottle:
> > +                       return get_temp_target(priv, tthrottle_type, val);
> > +               case channel_tjmax:
> > +                       return get_temp_target(priv, tjmax_type, val);
> > +               default:
> > +                       return get_core_temp(priv, channel - channel_core,
> > val);
> > +               }
> > +               break;
> > +       case hwmon_temp_max:
> > +               return get_temp_target(priv, tcontrol_type, val);
> > +       case hwmon_temp_crit:
> > +               return get_temp_target(priv, tjmax_type, val);
> > +       case hwmon_temp_crit_hyst:
> > +               return get_temp_target(priv, crit_hyst_type, val);
> > +       default:
> > +               return -EOPNOTSUPP;
> > +       }
> > +
> > +       return 0;
> > +}
> > +
> > +static umode_t cputemp_is_visible(const void *data, enum hwmon_sensor_types
> > type,
> > +                                 u32 attr, int channel)
> > +{
> > +       const struct peci_cputemp *priv = data;
> > +
> > +       if (channel > CPUTEMP_CHANNEL_NUMS)
> > +               return 0;
> > +
> > +       if (channel < channel_core)
> > +               return 0444;
> > +
> > +       if (test_bit(channel - channel_core, priv->core_mask))
> > +               return 0444;
> > +
> > +       return 0;
> > +}
> > +
> > +static int init_core_mask(struct peci_cputemp *priv)
> > +{
> > +       struct peci_device *peci_dev = priv->peci_dev;
> > +       struct resolved_cores_reg *reg = priv->gen_info->reg;
> > +       u64 core_mask;
> > +       u32 data;
> > +       int ret;
> > +
> > +       /* Get the RESOLVED_CORES register value */
> > +       switch (peci_dev->info.model) {
> > +       case INTEL_FAM6_ICELAKE_X:
> > +       case INTEL_FAM6_ICELAKE_D:
> > +               ret = peci_ep_pci_local_read(peci_dev, 0, reg->bus, reg-
> > >dev,
> > +                                            reg->func, reg->offset + 4,
> > &data);
> > +               if (ret)
> > +                       return ret;
> > +
> > +               core_mask = (u64)data << 32;
> > +
> > +               ret = peci_ep_pci_local_read(peci_dev, 0, reg->bus, reg-
> > >dev,
> > +                                            reg->func, reg->offset, &data);
> > +               if (ret)
> > +                       return ret;
> > +
> > +               core_mask |= data;
> > +
> > +               break;
> > +       default:
> > +               ret = peci_pci_local_read(peci_dev, reg->bus, reg->dev,
> > +                                         reg->func, reg->offset, &data);
> > +               if (ret)
> > +                       return ret;
> > +
> > +               core_mask = data;
> > +
> > +               break;
> > +       }
> > +
> > +       if (!core_mask)
> > +               return -EIO;
> > +
> > +       bitmap_from_u64(priv->core_mask, core_mask);
> > +
> > +       return 0;
> > +}
> > +
> > +static int create_temp_label(struct peci_cputemp *priv)
> > +{
> > +       unsigned long core_max = find_last_bit(priv->core_mask,
> > CORE_NUMS_MAX);
> > +       int i;
> > +
> > +       priv->coretemp_label = devm_kzalloc(priv->dev, core_max *
> > sizeof(char *), GFP_KERNEL);
> > +       if (!priv->coretemp_label)
> > +               return -ENOMEM;
> > +
> > +       for_each_set_bit(i, priv->core_mask, CORE_NUMS_MAX) {
> > +               priv->coretemp_label[i] = devm_kasprintf(priv->dev,
> > GFP_KERNEL, "Core %d", i);
> > +               if (!priv->coretemp_label[i])
> > +                       return -ENOMEM;
> > +       }
> > +
> > +       return 0;
> > +}
> > +
> > +static void check_resolved_cores(struct peci_cputemp *priv)
> > +{
> > +       /*
> > +        * Failure to resolve cores is non-critical, we're still able to
> > +        * provide other sensor data.
> > +        */
> > +
> > +       if (init_core_mask(priv))
> > +               return;
> > +
> > +       if (create_temp_label(priv))
> > +               bitmap_zero(priv->core_mask, CORE_NUMS_MAX);
> > +}
> > +
> > +static void sensor_init(struct peci_cputemp *priv)
> > +{
> > +       int i;
> > +
> > +       mutex_init(&priv->temp.target.state.lock);
> > +       mutex_init(&priv->temp.die.state.lock);
> > +       mutex_init(&priv->temp.dts.state.lock);
> > +
> > +       for_each_set_bit(i, priv->core_mask, CORE_NUMS_MAX)
> > +               mutex_init(&priv->temp.core[i].state.lock);
> > +}
> > +
> > +static const struct hwmon_ops peci_cputemp_ops = {
> > +       .is_visible = cputemp_is_visible,
> > +       .read_string = cputemp_read_string,
> > +       .read = cputemp_read,
> > +};
> > +
> > +static const u32 peci_cputemp_temp_channel_config[] = {
> > +       /* Die temperature */
> > +       HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_CRIT |
> > HWMON_T_CRIT_HYST,
> > +       /* DTS margin */
> > +       HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_CRIT |
> > HWMON_T_CRIT_HYST,
> > +       /* Tcontrol temperature */
> > +       HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_CRIT,
> > +       /* Tthrottle temperature */
> > +       HWMON_T_LABEL | HWMON_T_INPUT,
> > +       /* Tjmax temperature */
> > +       HWMON_T_LABEL | HWMON_T_INPUT,
> > +       /* Core temperature - for all core channels */
> > +       [channel_core ... CPUTEMP_CHANNEL_NUMS - 1] = HWMON_T_LABEL |
> > HWMON_T_INPUT,
> > +       0
> > +};
> > +
> > +static const struct hwmon_channel_info peci_cputemp_temp_channel = {
> > +       .type = hwmon_temp,
> > +       .config = peci_cputemp_temp_channel_config,
> > +};
> > +
> > +static const struct hwmon_channel_info *peci_cputemp_info[] = {
> > +       &peci_cputemp_temp_channel,
> > +       NULL
> > +};
> > +
> > +static const struct hwmon_chip_info peci_cputemp_chip_info = {
> > +       .ops = &peci_cputemp_ops,
> > +       .info = peci_cputemp_info,
> > +};
> > +
> > +static int peci_cputemp_probe(struct auxiliary_device *adev,
> > +                             const struct auxiliary_device_id *id)
> > +{
> > +       struct device *dev = &adev->dev;
> > +       struct peci_device *peci_dev = to_peci_device(dev->parent);
> > +       struct peci_cputemp *priv;
> > +       struct device *hwmon_dev;
> > +
> > +       priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
> > +       if (!priv)
> > +               return -ENOMEM;
> > +
> > +       priv->name = devm_kasprintf(dev, GFP_KERNEL, "peci_cputemp.cpu%d",
> > +                                   peci_dev->info.socket_id);
> > +       if (!priv->name)
> > +               return -ENOMEM;
> > +
> > +       priv->dev = dev;
> > +       priv->peci_dev = peci_dev;
> > +       priv->gen_info = (const struct cpu_info *)id->driver_data;
> > +
> > +       /*
> > +        * This is just a sanity check. Since we're using commands that are
> > +        * guaranteed to be supported on a given platform, we should never
> > see
> > +        * revision lower than expected.
> > +        */
> > +       if (peci_dev->info.peci_revision < priv->gen_info-
> > >min_peci_revision)
> > +               dev_warn(priv->dev,
> > +                        "Unexpected PECI revision %#x, some features may be
> > unavailable\n",
> > +                        peci_dev->info.peci_revision);
> > +
> > +       check_resolved_cores(priv);
> > +
> > +       sensor_init(priv);
> > +
> > +       hwmon_dev = devm_hwmon_device_register_with_info(priv->dev, priv-
> > >name,
> > +                                                        priv,
> > &peci_cputemp_chip_info, NULL);
> > +
> > +       return PTR_ERR_OR_ZERO(hwmon_dev);
> > +}
> > +
> > +/*
> > + * RESOLVED_CORES PCI configuration register may have different location on
> > + * different platforms.
> > + */
> > +static struct resolved_cores_reg resolved_cores_reg_hsx = {
> > +       .bus = 1,
> > +       .dev = 30,
> > +       .func = 3,
> > +       .offset = 0xb4,
> > +};
> > +
> > +static struct resolved_cores_reg resolved_cores_reg_icx = {
> > +       .bus = 14,
> > +       .dev = 30,
> > +       .func = 3,
> > +       .offset = 0xd0,
> > +};
> > +
> > +static const struct cpu_info cpu_hsx = {
> > +       .reg            = &resolved_cores_reg_hsx,
> > +       .min_peci_revision = 0x33,
> > +       .thermal_margin_to_millidegree =
> > &dts_eight_dot_eight_to_millidegree,
> > +};
> > +
> > +static const struct cpu_info cpu_icx = {
> > +       .reg            = &resolved_cores_reg_icx,
> > +       .min_peci_revision = 0x40,
> > +       .thermal_margin_to_millidegree = &dts_ten_dot_six_to_millidegree,
> > +};
> > +
> > +static const struct auxiliary_device_id peci_cputemp_ids[] = {
> > +       {
> > +               .name = "peci_cpu.cputemp.hsx",
> > +               .driver_data = (kernel_ulong_t)&cpu_hsx,
> > +       },
> > +       {
> > +               .name = "peci_cpu.cputemp.bdx",
> > +               .driver_data = (kernel_ulong_t)&cpu_hsx,
> > +       },
> > +       {
> > +               .name = "peci_cpu.cputemp.bdxd",
> > +               .driver_data = (kernel_ulong_t)&cpu_hsx,
> > +       },
> > +       {
> > +               .name = "peci_cpu.cputemp.skx",
> > +               .driver_data = (kernel_ulong_t)&cpu_hsx,
> > +       },
> > +       {
> > +               .name = "peci_cpu.cputemp.icx",
> > +               .driver_data = (kernel_ulong_t)&cpu_icx,
> > +       },
> > +       {
> > +               .name = "peci_cpu.cputemp.icxd",
> > +               .driver_data = (kernel_ulong_t)&cpu_icx,
> > +       },
> > +       { }
> > +};
> > +MODULE_DEVICE_TABLE(auxiliary, peci_cputemp_ids);
> > +
> > +static struct auxiliary_driver peci_cputemp_driver = {
> > +       .probe          = peci_cputemp_probe,
> > +       .id_table       = peci_cputemp_ids,
> > +};
> > +
> > +module_auxiliary_driver(peci_cputemp_driver);
> > +
> > +MODULE_AUTHOR("Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>");
> > +MODULE_AUTHOR("Iwona Winiarska <iwona.winiarska@intel.com>");
> > +MODULE_DESCRIPTION("PECI cputemp driver");
> > +MODULE_LICENSE("GPL");
> > +MODULE_IMPORT_NS(PECI_CPU);
> > 
> 


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 06/13] peci: Add device detection
  2021-11-16  6:26       ` gregkh
@ 2021-11-17 23:19         ` Winiarska, Iwona
  0 siblings, 0 replies; 24+ messages in thread
From: Winiarska, Iwona @ 2021-11-17 23:19 UTC (permalink / raw)
  To: gregkh
  Cc: linux-aspeed, linux-doc, Hansen,  Dave, zweiss, jae.hyun.yoo,
	corbet, openbmc, pierre-louis.bossart, linux, devicetree,
	jdelvare, arnd, robh+dt, bp, Williams, Dan J, andriy.shevchenko,
	linux-arm-kernel, linux-hwmon, Luck, Tony, andrew, rdunlap,
	linux-kernel, olof

On Tue, 2021-11-16 at 07:26 +0100, gregkh@linuxfoundation.org wrote:
> On Mon, Nov 15, 2021 at 10:35:23PM +0000, Winiarska, Iwona wrote:
> > On Mon, 2021-11-15 at 19:49 +0100, Greg Kroah-Hartman wrote:
> > > On Mon, Nov 15, 2021 at 07:25:45PM +0100, Iwona Winiarska wrote:
> > > > +void peci_device_destroy(struct peci_device *device)
> > > > +{
> > > > +       bool killed;
> > > > +
> > > > +       device_lock(&device->dev);
> > > > +       killed = kill_device(&device->dev);
> > > 
> > > Eeek, why call this?
> > > 
> > > > +       device_unlock(&device->dev);
> > > > +
> > > > +       if (!killed)
> > > > +               return;
> > > 
> > > What happened if something changed after you unlocked it?
> > 
> > We either killed it, or the other caller killed it.
> > 
> > > 
> > > Why is kill_device() required at all?  That's a very rare function to
> > > call, and one that only one "bus" calls today because it is very
> > > special (i.e. crazy and broken...)
> > 
> > It's used to avoid double-delete in case of races between peci_controller
> > unregister and "manually" removing the device using sysfs (pointed out by Dan
> > in
> > v2). We're calling peci_device_destroy() in both callsites.
> > Other way to solve it would be to just have a peci-specific lock, but
> > kill_device seemed to be well suited for the problem at hand.
> > Do you suggest to remove it and just go with the lock?
> 
> Yes please, remove it and use the lock.

Ack.

> 
> Also, why are you required to have a sysfs file that can remove the
> device?  Who wants that?

From the following patch:

"PECI devices may not be discoverable at the time when PECI controller is
being added (e.g. BMC can boot up when the Host system is still in S5).
Since we currently don't have the capabilities to figure out the Host
system state inside the PECI subsystem itself, we have to rely on
userspace to do it for us."

That's about rescan, but userspace might also want to remove the devices e.g.
when Host goes into S5.
It's also useful for development and debug purposes (and also allows us to have
a nice bit of symmetry with rescan).

Thanks
-Iwona

> 
> thanks,
> 
> greg k-h


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 00/13] Introduce PECI subsystem
  2021-11-17  3:56 ` [PATCH v3 00/13] Introduce PECI subsystem Zev Weiss
@ 2021-11-17 23:25   ` Winiarska, Iwona
  0 siblings, 0 replies; 24+ messages in thread
From: Winiarska, Iwona @ 2021-11-17 23:25 UTC (permalink / raw)
  To: zweiss
  Cc: linux-aspeed, linux-doc, Hansen, Dave, jae.hyun.yoo, corbet,
	openbmc, pierre-louis.bossart, linux, devicetree, jdelvare, arnd,
	robh+dt, bp, Williams, Dan J, andriy.shevchenko,
	linux-arm-kernel, linux-hwmon, Luck,  Tony, andrew, gregkh,
	rdunlap, linux-kernel, olof

On Wed, 2021-11-17 at 03:56 +0000, Zev Weiss wrote:
> On Mon, Nov 15, 2021 at 10:25:39AM PST, Iwona Winiarska wrote:
> > Hi,
> > 
> > This is a third round of patches introducing PECI subsystem.
> > Sorry for the delay between v2 and v3.
> > 
> 
> Hi Iwona,
> 
> I've done some testing of these patches on my AST2500/E-2778G OpenBMC
> platform -- I had to do a small bit of hacking to add support for
> INTEL_FAM6_KABYLAKE, but with that in place the newly-added code for the
> 8.8 format seems to work as it should.  Thanks!

Thanks for the report and testing :)

> 
> In poking at it a bit further I encountered some sub-optimal behavior
> w.r.t. to host power state transitions and timeouts though --
> essentially, if I ever hit a timeout in aspeed_peci_xfer() (for example
> on a read of a hwmon tempX_input file after an unexpected host
> shutdown), it seems to get stuck in a state where even if the host comes
> back online, all attempted PECI transfers continue just timing out.
> (Rebooting the BMC seems to resolve the problem.)  This also happens if
> I remove the peci client device via the 'remove' sysfs file, shut down
> the host, and then do a rescan via sysfs while the host is off (i.e.
> another operation that times out).
> 
> Let me know if there's any other info that would be helpful for
> debugging.

That's unexpected. I do have an idea what might have caused that. Let me fix it
in v4.

Thanks
-Iwona

> 
> 
> Thanks,
> Zev


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 00/13] Introduce PECI subsystem
  2021-11-15 18:25 [PATCH v3 00/13] Introduce PECI subsystem Iwona Winiarska
                   ` (13 preceding siblings ...)
  2021-11-17  3:56 ` [PATCH v3 00/13] Introduce PECI subsystem Zev Weiss
@ 2021-11-18 12:19 ` Tomer Maimon
  2021-11-18 21:51   ` Winiarska, Iwona
  14 siblings, 1 reply; 24+ messages in thread
From: Tomer Maimon @ 2021-11-18 12:19 UTC (permalink / raw)
  To: Iwona Winiarska
  Cc: linux-aspeed, linux-doc, Dave Hansen, Zev Weiss, Jae Hyun Yoo,
	Jonathan Corbet, OpenBMC Maillist, Pierre-Louis Bossart,
	Guenter Roeck, devicetree, Jean Delvare, Arnd Bergmann,
	Rob Herring, Borislav Petkov, Dan Williams, Andy Shevchenko,
	Linux ARM, Linux HWMON List, Tony Luck, Andrew Jeffery,
	Greg Kroah-Hartman, Randy Dunlap, Linux Kernel Mailing List,
	Olof Johansson

[-- Attachment #1: Type: text/plain, Size: 9707 bytes --]

Hi Iwona,

My name is Tomer I working as a SW engineer in Nuvoton BMC project.

First, thanks for upstreaming the PECI driver!

Nuvoton (NPCM) PECI driver was in the PECI patchset that has been
handheld by Jae.
https://patchwork.kernel.org/project/linux-arm-kernel/patch/20191211194624.2872-10-jae.hyun.yoo@linux.intel.com/

Could you add Nuvoton (NPCM) PECI driver to your patch set next time you
will send upstream patches to Linux vanilla?
If you agree, we will check your patchset in Nuvoton systems in a few days
and send you NPCM OECI driver and documentation.

Thanks,

Tomer

On Mon, 15 Nov 2021 at 20:28, Iwona Winiarska <iwona.winiarska@intel.com>
wrote:

> Hi,
>
> This is a third round of patches introducing PECI subsystem.
> Sorry for the delay between v2 and v3.
>
> The Platform Environment Control Interface (PECI) is a communication
> interface between Intel processors and management controllers (e.g.
> Baseboard Management Controller, BMC).
>
> This series adds a PECI subsystem and introduces drivers which run in
> the Linux instance on the management controller (not the main Intel
> processor) and is intended to be used by the OpenBMC [1], a Linux
> distribution for BMC devices.
> The information exposed over PECI (like processor and DIMM
> temperature) refers to the Intel processor and can be consumed by
> daemons running on the BMC to, for example, display the processor
> temperature in its web interface.
>
> The PECI bus is collection of code that provides interface support
> between PECI devices (that actually represent processors) and PECI
> controllers (such as the "peci-aspeed" controller) that allow to
> access physical PECI interface. PECI devices are bound to PECI
> drivers that provides access to PECI services. This series introduces
> a generic "peci-cpu" driver that exposes hardware monitoring "cputemp"
> and "dimmtemp" using the auxiliary bus.
>
> Exposing "raw" PECI to userspace, either to write userspace drivers or
> for debug/testing purpose was left out of this series to encourage
> writing kernel drivers instead, but may be pursued in the future.
>
> Introducing PECI to upstream Linux was already attempted before [2].
> Since it's been over a year since last revision, and the series
> changed quite a bit in the meantime, I've decided to start from v1.
>
> I would also like to give credit to everyone who helped me with
> different aspects of preliminary review:
> - Pierre-Louis Bossart,
> - Tony Luck,
> - Andy Shevchenko,
> - Dave Hansen.
>
> [1] https://github.com/openbmc/openbmc
> [2]
> https://lore.kernel.org/openbmc/20191211194624.2872-1-jae.hyun.yoo@linux.intel.com/
>
> Changes v2 -> v3:
>
> * Dropped x86/cpu patches (Boris)
> * Dropped pr_fmt() for PECI module (Dan)
> * Fixed releasing peci controller device flow (Dan)
> * Improved peci-aspeed commit-msg and Kconfig help (Dan)
> * Fixed aspeed_peci_xfer() to use the proper spin_lock function (Dan)
> * Wrapped print_hex_dump_bytes() in CONFIG_DYNAMIC_DEBUG (Dan)
> * Removed debug status logs from aspeed_peci_irq_handler() (Dan)
> * Renamed functions using devres to start with "devm" (Dan)
> * Changed request to be allocated on stack in peci_detect (Dan)
> * Removed redundant WARN_ON on invalid PECI addr (Dan)
> * Changed peci_device_create() to use device_initialize() + device_add()
> pattern (Dan)
> * Fixed peci_device_destroy() to use kill_device() avoiding double-free
> (Dan)
> * Renamed functions that perform xfer using "peci_xfer_*" prefix (Dan)
> * Renamed peci_request_data_dib(temp) -> peci_request_dib(temp)_read (Dan)
> * Fixed thermal margin readings for older Intel processors (Zev)
> * Misc hwmon simplifications (Guenter)
> * Used BIT_PER_TYPE to verify macro value constrains (Guenter)
> * Improved WARN_ON message to print chan_rank_max and idx_dimm_max
> (Guenter)
> * Improved dimmtemp to not reattempt probe if no dimms are populated
>
> Changes v1 -> v2:
>
> Biggest changes when it comes to diffstat are locking in HWMON
> (I decided to clean things up a bit while adding it), switching to
> devres usage in more places and exposing sysfs interface in separate patch.
>
> * Moved extending X86 ARCHITECTURE MAINTAINERS earlier in series (Dan)
> * Removed "default n" for GENERIC_LIB_X86 (Dan)
> * Added vendor prefix for peci-aspeed specific properties (Rob)
> * Refactored PECI to use devres consistently (Dan)
> * Added missing sysfs documentation and excluded adding peci-sysfs to
>   separate patch (Dan)
> * Used module_init() instead of subsys_init() for peci module
> initialization (Dan)
> * Removed redundant struct peci_device member (Dan)
> * Improved PECI Kconfig help (Randy/Dan)
> * Fixed/removed log messages (Dan, Guenter)
> * Refactored peci-cputemp and peci-dimmtemp and added missing locks
> (Guenter)
> * Removed unused dev_set_drvdata() in peci-cputemp and peci-dimmtemp
> (Guenter)
> * Fixed used types, names, fixed broken and added additional comments
>   to peci-hwmon (Guenter, Zev)
> * Refactored peci-dimmtemp to not return -ETIMEDOUT (Guenter)
> * Added sanity check for min_peci_revision in peci-hwmon drivers (Zev)
> * Added assert for DIMM_NUMS_MAX and additional warning in peci-dimmtemp
> (Zev)
> * Fixed macro names in peci-aspeed (Zev)
> * Refactored peci-aspeed sanitizing properties to a single helper function
> (Zev)
> * Fixed peci_cpu_device_ids definition for Broadwell Xeon D (David)
> * Refactor peci_request to use a single allocation (Zev)
> * Used min_t() to improve code readability (Zev)
> * Added macro for PECI_RDENDPTCFG_MMIO_WR_LEN_BASE and fixed adev type
>   array name to more descriptive (Zev)
> * Fixed peci-hwmon commit-msg and documentation (Zev)
>
> Thanks
> -Iwona
>
> Iwona Winiarska (11):
>   dt-bindings: Add generic bindings for PECI
>   dt-bindings: Add bindings for peci-aspeed
>   ARM: dts: aspeed: Add PECI controller nodes
>   peci: Add core infrastructure
>   peci: Add device detection
>   peci: Add sysfs interface for PECI bus
>   peci: Add support for PECI device drivers
>   peci: Add peci-cpu driver
>   hwmon: peci: Add cputemp driver
>   hwmon: peci: Add dimmtemp driver
>   docs: Add PECI documentation
>
> Jae Hyun Yoo (2):
>   peci: Add peci-aspeed controller driver
>   docs: hwmon: Document PECI drivers
>
>  Documentation/ABI/testing/sysfs-bus-peci      |  16 +
>  .../devicetree/bindings/peci/peci-aspeed.yaml | 109 +++
>  .../bindings/peci/peci-controller.yaml        |  33 +
>  Documentation/hwmon/index.rst                 |   2 +
>  Documentation/hwmon/peci-cputemp.rst          |  90 +++
>  Documentation/hwmon/peci-dimmtemp.rst         |  57 ++
>  Documentation/index.rst                       |   1 +
>  Documentation/peci/index.rst                  |  16 +
>  Documentation/peci/peci.rst                   |  51 ++
>  MAINTAINERS                                   |  29 +
>  arch/arm/boot/dts/aspeed-g4.dtsi              |  14 +
>  arch/arm/boot/dts/aspeed-g5.dtsi              |  14 +
>  arch/arm/boot/dts/aspeed-g6.dtsi              |  14 +
>  drivers/Kconfig                               |   3 +
>  drivers/Makefile                              |   1 +
>  drivers/hwmon/Kconfig                         |   2 +
>  drivers/hwmon/Makefile                        |   1 +
>  drivers/hwmon/peci/Kconfig                    |  31 +
>  drivers/hwmon/peci/Makefile                   |   7 +
>  drivers/hwmon/peci/common.h                   |  58 ++
>  drivers/hwmon/peci/cputemp.c                  | 592 ++++++++++++++++
>  drivers/hwmon/peci/dimmtemp.c                 | 630 ++++++++++++++++++
>  drivers/peci/Kconfig                          |  36 +
>  drivers/peci/Makefile                         |  10 +
>  drivers/peci/controller/Kconfig               |  17 +
>  drivers/peci/controller/Makefile              |   3 +
>  drivers/peci/controller/peci-aspeed.c         | 429 ++++++++++++
>  drivers/peci/core.c                           | 236 +++++++
>  drivers/peci/cpu.c                            | 343 ++++++++++
>  drivers/peci/device.c                         | 249 +++++++
>  drivers/peci/internal.h                       | 136 ++++
>  drivers/peci/request.c                        | 482 ++++++++++++++
>  drivers/peci/sysfs.c                          |  82 +++
>  include/linux/peci-cpu.h                      |  40 ++
>  include/linux/peci.h                          | 110 +++
>  35 files changed, 3944 insertions(+)
>  create mode 100644 Documentation/ABI/testing/sysfs-bus-peci
>  create mode 100644 Documentation/devicetree/bindings/peci/peci-aspeed.yaml
>  create mode 100644
> Documentation/devicetree/bindings/peci/peci-controller.yaml
>  create mode 100644 Documentation/hwmon/peci-cputemp.rst
>  create mode 100644 Documentation/hwmon/peci-dimmtemp.rst
>  create mode 100644 Documentation/peci/index.rst
>  create mode 100644 Documentation/peci/peci.rst
>  create mode 100644 drivers/hwmon/peci/Kconfig
>  create mode 100644 drivers/hwmon/peci/Makefile
>  create mode 100644 drivers/hwmon/peci/common.h
>  create mode 100644 drivers/hwmon/peci/cputemp.c
>  create mode 100644 drivers/hwmon/peci/dimmtemp.c
>  create mode 100644 drivers/peci/Kconfig
>  create mode 100644 drivers/peci/Makefile
>  create mode 100644 drivers/peci/controller/Kconfig
>  create mode 100644 drivers/peci/controller/Makefile
>  create mode 100644 drivers/peci/controller/peci-aspeed.c
>  create mode 100644 drivers/peci/core.c
>  create mode 100644 drivers/peci/cpu.c
>  create mode 100644 drivers/peci/device.c
>  create mode 100644 drivers/peci/internal.h
>  create mode 100644 drivers/peci/request.c
>  create mode 100644 drivers/peci/sysfs.c
>  create mode 100644 include/linux/peci-cpu.h
>  create mode 100644 include/linux/peci.h
>
> --
> 2.31.1
>
>

[-- Attachment #2: Type: text/html, Size: 11653 bytes --]

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 00/13] Introduce PECI subsystem
  2021-11-18 12:19 ` Tomer Maimon
@ 2021-11-18 21:51   ` Winiarska, Iwona
  0 siblings, 0 replies; 24+ messages in thread
From: Winiarska, Iwona @ 2021-11-18 21:51 UTC (permalink / raw)
  To: tmaimon77
  Cc: linux-aspeed, linux-doc, Hansen,  Dave, zweiss, jae.hyun.yoo,
	corbet, openbmc, pierre-louis.bossart, linux, devicetree,
	jdelvare, arnd, robh+dt, bp, Williams, Dan J, andriy.shevchenko,
	linux-arm-kernel, linux-hwmon, Luck, Tony, andrew, gregkh,
	rdunlap, linux-kernel, olof

On Thu, 2021-11-18 at 14:19 +0200, Tomer Maimon wrote:
> Hi Iwona,
> 
> My name is Tomer I working as a SW engineer in Nuvoton BMC project.
> 
> First, thanks for upstreaming the PECI driver!
> 
> Nuvoton (NPCM) PECI driver was in the PECI patchset that has been handheld by
> Jae.
> https://patchwork.kernel.org/project/linux-arm-kernel/patch/20191211194624.2872-10-jae.hyun.yoo@linux.intel.com/
> 
> Could you add Nuvoton (NPCM) PECI driver to your patch set next time you will
> send upstream patches to Linux vanilla?

Some (relatively small) changes are going to be needed to adapt that driver to
changes that happened in PECI core.
I want to keep this series as small as possible, but once it gets merged,
Nuvoton driver can be added in a separate series.

> If you agree, we will check your patchset in Nuvoton systems in a few days and
> send you NPCM OECI driver and documentation.
> 

I don't have any hardware to test it on so help will definitely be welcome :)

Thanks
-Iwona

> Thanks,
> 
> Tomer
> 
> On Mon, 15 Nov 2021 at 20:28, Iwona Winiarska <iwona.winiarska@intel.com> wrote:
> > Hi,
> > 
> > This is a third round of patches introducing PECI subsystem.
> > Sorry for the delay between v2 and v3.
> > 
> > The Platform Environment Control Interface (PECI) is a communication
> > interface between Intel processors and management controllers (e.g.
> > Baseboard Management Controller, BMC).
> > 
> > This series adds a PECI subsystem and introduces drivers which run in
> > the Linux instance on the management controller (not the main Intel
> > processor) and is intended to be used by the OpenBMC [1], a Linux
> > distribution for BMC devices.
> > The information exposed over PECI (like processor and DIMM
> > temperature) refers to the Intel processor and can be consumed by
> > daemons running on the BMC to, for example, display the processor
> > temperature in its web interface.
> > 
> > The PECI bus is collection of code that provides interface support
> > between PECI devices (that actually represent processors) and PECI
> > controllers (such as the "peci-aspeed" controller) that allow to
> > access physical PECI interface. PECI devices are bound to PECI
> > drivers that provides access to PECI services. This series introduces
> > a generic "peci-cpu" driver that exposes hardware monitoring "cputemp"
> > and "dimmtemp" using the auxiliary bus.
> > 
> > Exposing "raw" PECI to userspace, either to write userspace drivers or
> > for debug/testing purpose was left out of this series to encourage
> > writing kernel drivers instead, but may be pursued in the future.
> > 
> > Introducing PECI to upstream Linux was already attempted before [2].
> > Since it's been over a year since last revision, and the series
> > changed quite a bit in the meantime, I've decided to start from v1.
> > 
> > I would also like to give credit to everyone who helped me with
> > different aspects of preliminary review:
> > - Pierre-Louis Bossart,
> > - Tony Luck, 
> > - Andy Shevchenko,
> > - Dave Hansen.
> > 
> > [1] https://github.com/openbmc/openbmc
> > [2]
> > https://lore.kernel.org/openbmc/20191211194624.2872-1-jae.hyun.yoo@linux.intel.com/
> > 
> > Changes v2 -> v3:
> > 
> > * Dropped x86/cpu patches (Boris)
> > * Dropped pr_fmt() for PECI module (Dan)
> > * Fixed releasing peci controller device flow (Dan) 
> > * Improved peci-aspeed commit-msg and Kconfig help (Dan)
> > * Fixed aspeed_peci_xfer() to use the proper spin_lock function (Dan) 
> > * Wrapped print_hex_dump_bytes() in CONFIG_DYNAMIC_DEBUG (Dan)
> > * Removed debug status logs from aspeed_peci_irq_handler() (Dan)
> > * Renamed functions using devres to start with "devm" (Dan)
> > * Changed request to be allocated on stack in peci_detect (Dan)
> > * Removed redundant WARN_ON on invalid PECI addr (Dan)
> > * Changed peci_device_create() to use device_initialize() + device_add()
> > pattern (Dan)
> > * Fixed peci_device_destroy() to use kill_device() avoiding double-free (Dan)
> > * Renamed functions that perform xfer using "peci_xfer_*" prefix (Dan) 
> > * Renamed peci_request_data_dib(temp) -> peci_request_dib(temp)_read (Dan)
> > * Fixed thermal margin readings for older Intel processors (Zev) 
> > * Misc hwmon simplifications (Guenter)
> > * Used BIT_PER_TYPE to verify macro value constrains (Guenter)
> > * Improved WARN_ON message to print chan_rank_max and idx_dimm_max (Guenter)
> > * Improved dimmtemp to not reattempt probe if no dimms are populated
> > 
> > Changes v1 -> v2:
> > 
> > Biggest changes when it comes to diffstat are locking in HWMON
> > (I decided to clean things up a bit while adding it), switching to
> > devres usage in more places and exposing sysfs interface in separate patch.
> > 
> > * Moved extending X86 ARCHITECTURE MAINTAINERS earlier in series (Dan)
> > * Removed "default n" for GENERIC_LIB_X86 (Dan)
> > * Added vendor prefix for peci-aspeed specific properties (Rob)
> > * Refactored PECI to use devres consistently (Dan)
> > * Added missing sysfs documentation and excluded adding peci-sysfs to
> >   separate patch (Dan)
> > * Used module_init() instead of subsys_init() for peci module initialization
> > (Dan)
> > * Removed redundant struct peci_device member (Dan)
> > * Improved PECI Kconfig help (Randy/Dan)
> > * Fixed/removed log messages (Dan, Guenter)
> > * Refactored peci-cputemp and peci-dimmtemp and added missing locks (Guenter)
> > * Removed unused dev_set_drvdata() in peci-cputemp and peci-dimmtemp (Guenter)
> > * Fixed used types, names, fixed broken and added additional comments
> >   to peci-hwmon (Guenter, Zev)
> > * Refactored peci-dimmtemp to not return -ETIMEDOUT (Guenter)
> > * Added sanity check for min_peci_revision in peci-hwmon drivers (Zev)
> > * Added assert for DIMM_NUMS_MAX and additional warning in peci-dimmtemp (Zev)
> > * Fixed macro names in peci-aspeed (Zev)
> > * Refactored peci-aspeed sanitizing properties to a single helper function
> > (Zev)
> > * Fixed peci_cpu_device_ids definition for Broadwell Xeon D (David)
> > * Refactor peci_request to use a single allocation (Zev)
> > * Used min_t() to improve code readability (Zev)
> > * Added macro for PECI_RDENDPTCFG_MMIO_WR_LEN_BASE and fixed adev type
> >   array name to more descriptive (Zev)
> > * Fixed peci-hwmon commit-msg and documentation (Zev)
> > 
> > Thanks
> > -Iwona
> > 
> > Iwona Winiarska (11):
> >   dt-bindings: Add generic bindings for PECI
> >   dt-bindings: Add bindings for peci-aspeed
> >   ARM: dts: aspeed: Add PECI controller nodes
> >   peci: Add core infrastructure
> >   peci: Add device detection
> >   peci: Add sysfs interface for PECI bus
> >   peci: Add support for PECI device drivers
> >   peci: Add peci-cpu driver
> >   hwmon: peci: Add cputemp driver
> >   hwmon: peci: Add dimmtemp driver
> >   docs: Add PECI documentation
> > 
> > Jae Hyun Yoo (2):
> >   peci: Add peci-aspeed controller driver
> >   docs: hwmon: Document PECI drivers
> > 
> >  Documentation/ABI/testing/sysfs-bus-peci      |  16 +
> >  .../devicetree/bindings/peci/peci-aspeed.yaml | 109 +++
> >  .../bindings/peci/peci-controller.yaml        |  33 +
> >  Documentation/hwmon/index.rst                 |   2 +
> >  Documentation/hwmon/peci-cputemp.rst          |  90 +++
> >  Documentation/hwmon/peci-dimmtemp.rst         |  57 ++
> >  Documentation/index.rst                       |   1 +
> >  Documentation/peci/index.rst                  |  16 +
> >  Documentation/peci/peci.rst                   |  51 ++
> >  MAINTAINERS                                   |  29 +
> >  arch/arm/boot/dts/aspeed-g4.dtsi              |  14 +
> >  arch/arm/boot/dts/aspeed-g5.dtsi              |  14 +
> >  arch/arm/boot/dts/aspeed-g6.dtsi              |  14 +
> >  drivers/Kconfig                               |   3 +
> >  drivers/Makefile                              |   1 +
> >  drivers/hwmon/Kconfig                         |   2 +
> >  drivers/hwmon/Makefile                        |   1 +
> >  drivers/hwmon/peci/Kconfig                    |  31 +
> >  drivers/hwmon/peci/Makefile                   |   7 +
> >  drivers/hwmon/peci/common.h                   |  58 ++
> >  drivers/hwmon/peci/cputemp.c                  | 592 ++++++++++++++++
> >  drivers/hwmon/peci/dimmtemp.c                 | 630 ++++++++++++++++++
> >  drivers/peci/Kconfig                          |  36 +
> >  drivers/peci/Makefile                         |  10 +
> >  drivers/peci/controller/Kconfig               |  17 +
> >  drivers/peci/controller/Makefile              |   3 +
> >  drivers/peci/controller/peci-aspeed.c         | 429 ++++++++++++
> >  drivers/peci/core.c                           | 236 +++++++
> >  drivers/peci/cpu.c                            | 343 ++++++++++
> >  drivers/peci/device.c                         | 249 +++++++
> >  drivers/peci/internal.h                       | 136 ++++
> >  drivers/peci/request.c                        | 482 ++++++++++++++
> >  drivers/peci/sysfs.c                          |  82 +++
> >  include/linux/peci-cpu.h                      |  40 ++
> >  include/linux/peci.h                          | 110 +++
> >  35 files changed, 3944 insertions(+)
> >  create mode 100644 Documentation/ABI/testing/sysfs-bus-peci
> >  create mode 100644 Documentation/devicetree/bindings/peci/peci-aspeed.yaml
> >  create mode 100644 Documentation/devicetree/bindings/peci/peci-
> > controller.yaml
> >  create mode 100644 Documentation/hwmon/peci-cputemp.rst
> >  create mode 100644 Documentation/hwmon/peci-dimmtemp.rst
> >  create mode 100644 Documentation/peci/index.rst
> >  create mode 100644 Documentation/peci/peci.rst
> >  create mode 100644 drivers/hwmon/peci/Kconfig
> >  create mode 100644 drivers/hwmon/peci/Makefile
> >  create mode 100644 drivers/hwmon/peci/common.h
> >  create mode 100644 drivers/hwmon/peci/cputemp.c
> >  create mode 100644 drivers/hwmon/peci/dimmtemp.c
> >  create mode 100644 drivers/peci/Kconfig
> >  create mode 100644 drivers/peci/Makefile
> >  create mode 100644 drivers/peci/controller/Kconfig
> >  create mode 100644 drivers/peci/controller/Makefile
> >  create mode 100644 drivers/peci/controller/peci-aspeed.c
> >  create mode 100644 drivers/peci/core.c
> >  create mode 100644 drivers/peci/cpu.c
> >  create mode 100644 drivers/peci/device.c
> >  create mode 100644 drivers/peci/internal.h
> >  create mode 100644 drivers/peci/request.c
> >  create mode 100644 drivers/peci/sysfs.c
> >  create mode 100644 include/linux/peci-cpu.h
> >  create mode 100644 include/linux/peci.h
> > 


^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2021-11-18 21:53 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-15 18:25 [PATCH v3 00/13] Introduce PECI subsystem Iwona Winiarska
2021-11-15 18:25 ` [PATCH v3 01/13] dt-bindings: Add generic bindings for PECI Iwona Winiarska
2021-11-15 18:25 ` [PATCH v3 02/13] dt-bindings: Add bindings for peci-aspeed Iwona Winiarska
2021-11-15 18:25 ` [PATCH v3 03/13] ARM: dts: aspeed: Add PECI controller nodes Iwona Winiarska
2021-11-15 18:25 ` [PATCH v3 04/13] peci: Add core infrastructure Iwona Winiarska
2021-11-15 18:25 ` [PATCH v3 05/13] peci: Add peci-aspeed controller driver Iwona Winiarska
2021-11-15 18:25 ` [PATCH v3 06/13] peci: Add device detection Iwona Winiarska
2021-11-15 18:49   ` Greg Kroah-Hartman
2021-11-15 22:35     ` Winiarska, Iwona
2021-11-16  6:26       ` gregkh
2021-11-17 23:19         ` Winiarska, Iwona
2021-11-15 18:25 ` [PATCH v3 07/13] peci: Add sysfs interface for PECI bus Iwona Winiarska
2021-11-15 18:25 ` [PATCH v3 08/13] peci: Add support for PECI device drivers Iwona Winiarska
2021-11-15 18:25 ` [PATCH v3 09/13] peci: Add peci-cpu driver Iwona Winiarska
2021-11-15 18:25 ` [PATCH v3 10/13] hwmon: peci: Add cputemp driver Iwona Winiarska
2021-11-16  0:52   ` Guenter Roeck
2021-11-17 22:20     ` Winiarska, Iwona
2021-11-15 18:25 ` [PATCH v3 11/13] hwmon: peci: Add dimmtemp driver Iwona Winiarska
2021-11-15 18:25 ` [PATCH v3 12/13] docs: hwmon: Document PECI drivers Iwona Winiarska
2021-11-15 18:25 ` [PATCH v3 13/13] docs: Add PECI documentation Iwona Winiarska
2021-11-17  3:56 ` [PATCH v3 00/13] Introduce PECI subsystem Zev Weiss
2021-11-17 23:25   ` Winiarska, Iwona
2021-11-18 12:19 ` Tomer Maimon
2021-11-18 21:51   ` Winiarska, Iwona

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).