linux-kernel.vger.kernel.org archive mirror
* [PATCH v2 00/15] Introduce PECI subsystem
@ 2021-08-03 11:31 Iwona Winiarska
  2021-08-03 11:31 ` [PATCH v2 01/15] x86/cpu: Move intel-family to arch-independent headers Iwona Winiarska
                   ` (15 more replies)
  0 siblings, 16 replies; 49+ messages in thread
From: Iwona Winiarska @ 2021-08-03 11:31 UTC (permalink / raw)
  To: linux-kernel, openbmc, Greg Kroah-Hartman
  Cc: x86, devicetree, linux-aspeed, linux-arm-kernel, linux-hwmon,
	linux-doc, Rob Herring, Joel Stanley, Andrew Jeffery,
	Jean Delvare, Guenter Roeck, Arnd Bergmann, Olof Johansson,
	Jonathan Corbet, Thomas Gleixner, Andy Lutomirski, Ingo Molnar,
	Borislav Petkov, Yazen Ghannam, Mauro Carvalho Chehab,
	Pierre-Louis Bossart, Tony Luck, Andy Shevchenko, Jae Hyun Yoo,
	Dan Williams, Randy Dunlap, Zev Weiss, David Muller,
	Iwona Winiarska

Hi Greg,

This is the second round of patches introducing the PECI subsystem.
I don't think it is ready to be applied right away (we're still
missing r-b's), but I hope we have a chance to complete the discussion
in the 5.15 development cycle. I would appreciate it if you could take
a look.

Note: All changes to arch/x86 are contained within patches 01-02, plus
a small Kconfig change in patch 10 that adds "depends on PECI" to the
GENERIC_LIB_X86 Kconfig entry.

The Platform Environment Control Interface (PECI) is a communication
interface between Intel processors and management controllers (e.g.
Baseboard Management Controller, BMC).

This series adds a PECI subsystem and introduces drivers which run in
the Linux instance on the management controller (not the main Intel
processor) and are intended to be used by OpenBMC [1], a Linux
distribution for BMC devices.
The information exposed over PECI (like processor and DIMM
temperature) refers to the Intel processor and can be consumed by
daemons running on the BMC to, for example, display the processor
temperature in its web interface.
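
As an illustration of the consumer side, a BMC daemon could read such a
value through the hwmon sysfs interface, where temperatures are reported
in millidegrees Celsius. The following is only a minimal sketch: the
helper names are hypothetical and not part of this series, and the
attribute path is supplied by the caller.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Parse the text of a hwmon tempN_input attribute (millidegrees Celsius).
 * Returns 0 on success and stores the value in *mdeg, -1 on malformed input.
 * Hypothetical helper, for illustration only.
 */
static int parse_millidegrees(const char *text, long *mdeg)
{
	char *end;
	long val;

	errno = 0;
	val = strtol(text, &end, 10);
	if (errno || end == text)
		return -1;
	/* sysfs attributes end with a newline; reject anything else after the number. */
	if (*end != '\0' && *end != '\n')
		return -1;

	*mdeg = val;
	return 0;
}

/* Read one attribute file and parse it. Returns 0 on success, -1 on any error. */
static int read_temp_millidegrees(const char *path, long *mdeg)
{
	char buf[32];
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	return parse_millidegrees(buf, mdeg);
}
```

A caller would pass a path of the form /sys/class/hwmon/hwmonN/temp1_input
(N depends on the system) and divide the result by 1000 to get degrees
Celsius.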

The PECI bus is a collection of code that provides interface support
between PECI devices (which actually represent processors) and PECI
controllers (such as the "peci-aspeed" controller) that allow access
to the physical PECI interface. PECI devices are bound to PECI
drivers that provide access to PECI services. This series introduces
a generic "peci-cpu" driver that exposes hardware monitoring "cputemp"
and "dimmtemp" using the auxiliary bus.

Exposing "raw" PECI to userspace, either to write userspace drivers or
for debug/testing purposes, was left out of this series to encourage
writing kernel drivers instead, but may be pursued in the future.

Introducing PECI to upstream Linux was attempted before [2].
Since it's been over a year since the last revision, and the series
changed quite a bit in the meantime, I've decided to start from v1.

I would also like to give credit to everyone who helped me with
different aspects of preliminary review:
- Pierre-Louis Bossart,
- Tony Luck,
- Andy Shevchenko,
- Dave Hansen.

[1] https://github.com/openbmc/openbmc
[2] https://lore.kernel.org/openbmc/20191211194624.2872-1-jae.hyun.yoo@linux.intel.com/

Changes v1 -> v2:

The biggest changes in terms of diffstat are the locking added in HWMON
(I decided to clean things up a bit while adding it), switching to
devres in more places, and exposing the sysfs interface in a separate
patch.

* Moved extending X86 ARCHITECTURE MAINTAINERS earlier in series (Dan)
* Removed "default n" for GENERIC_LIB_X86 (Dan)
* Added vendor prefix for peci-aspeed specific properties (Rob)
* Refactored PECI to use devres consistently (Dan)
* Added missing sysfs documentation and moved adding peci-sysfs to a
  separate patch (Dan)
* Used module_init() instead of subsys_init() for peci module initialization (Dan)
* Removed redundant struct peci_device member (Dan)
* Improved PECI Kconfig help (Randy/Dan)
* Fixed/removed log messages (Dan, Guenter)
* Refactored peci-cputemp and peci-dimmtemp and added missing locks (Guenter)
* Removed unused dev_set_drvdata() in peci-cputemp and peci-dimmtemp (Guenter)
* Fixed types and names, fixed broken comments and added additional
  ones in peci-hwmon (Guenter, Zev)
* Refactored peci-dimmtemp to not return -ETIMEDOUT (Guenter)
* Added sanity check for min_peci_revision in peci-hwmon drivers (Zev)
* Added assert for DIMM_NUMS_MAX and additional warning in peci-dimmtemp (Zev)
* Fixed macro names in peci-aspeed (Zev)
* Refactored peci-aspeed sanitizing properties to a single helper function (Zev)
* Fixed peci_cpu_device_ids definition for Broadwell Xeon D (David)
* Refactored peci_request to use a single allocation (Zev)
* Used min_t() to improve code readability (Zev)
* Added macro for PECI_RDENDPTCFG_MMIO_WR_LEN_BASE and renamed the adev
  type array to something more descriptive (Zev)
* Fixed peci-hwmon commit-msg and documentation (Zev)

Thanks
-Iwona

Iwona Winiarska (13):
  x86/cpu: Move intel-family to arch-independent headers
  x86/cpu: Extract cpuid helpers to arch-independent
  dt-bindings: Add generic bindings for PECI
  dt-bindings: Add bindings for peci-aspeed
  ARM: dts: aspeed: Add PECI controller nodes
  peci: Add core infrastructure
  peci: Add device detection
  peci: Add sysfs interface for PECI bus
  peci: Add support for PECI device drivers
  peci: Add peci-cpu driver
  hwmon: peci: Add cputemp driver
  hwmon: peci: Add dimmtemp driver
  docs: Add PECI documentation

Jae Hyun Yoo (2):
  peci: Add peci-aspeed controller driver
  docs: hwmon: Document PECI drivers

 Documentation/ABI/testing/sysfs-bus-peci      |  16 +
 .../devicetree/bindings/peci/peci-aspeed.yaml | 109 ++++
 .../bindings/peci/peci-controller.yaml        |  33 +
 Documentation/hwmon/index.rst                 |   2 +
 Documentation/hwmon/peci-cputemp.rst          |  90 +++
 Documentation/hwmon/peci-dimmtemp.rst         |  57 ++
 Documentation/index.rst                       |   1 +
 Documentation/peci/index.rst                  |  16 +
 Documentation/peci/peci.rst                   |  48 ++
 MAINTAINERS                                   |  32 +
 arch/arm/boot/dts/aspeed-g4.dtsi              |  14 +
 arch/arm/boot/dts/aspeed-g5.dtsi              |  14 +
 arch/arm/boot/dts/aspeed-g6.dtsi              |  14 +
 arch/x86/Kconfig                              |   1 +
 arch/x86/include/asm/cpu.h                    |   3 -
 arch/x86/include/asm/intel-family.h           | 141 +---
 arch/x86/include/asm/microcode.h              |   2 +-
 arch/x86/kvm/cpuid.h                          |   3 +-
 arch/x86/lib/Makefile                         |   2 +-
 drivers/Kconfig                               |   3 +
 drivers/Makefile                              |   1 +
 drivers/edac/mce_amd.c                        |   3 +-
 drivers/hwmon/Kconfig                         |   2 +
 drivers/hwmon/Makefile                        |   1 +
 drivers/hwmon/peci/Kconfig                    |  31 +
 drivers/hwmon/peci/Makefile                   |   7 +
 drivers/hwmon/peci/common.h                   |  58 ++
 drivers/hwmon/peci/cputemp.c                  | 591 +++++++++++++++++
 drivers/hwmon/peci/dimmtemp.c                 | 614 ++++++++++++++++++
 drivers/peci/Kconfig                          |  37 ++
 drivers/peci/Makefile                         |  10 +
 drivers/peci/controller/Kconfig               |  16 +
 drivers/peci/controller/Makefile              |   3 +
 drivers/peci/controller/peci-aspeed.c         | 445 +++++++++++++
 drivers/peci/core.c                           | 238 +++++++
 drivers/peci/cpu.c                            | 344 ++++++++++
 drivers/peci/device.c                         | 221 +++++++
 drivers/peci/internal.h                       | 137 ++++
 drivers/peci/request.c                        | 477 ++++++++++++++
 drivers/peci/sysfs.c                          |  82 +++
 include/linux/peci-cpu.h                      |  38 ++
 include/linux/peci.h                          | 110 ++++
 include/linux/x86/cpu.h                       |   9 +
 include/linux/x86/intel-family.h              | 146 +++++
 lib/Kconfig                                   |   4 +
 lib/Makefile                                  |   2 +
 lib/x86/Makefile                              |   3 +
 {arch/x86/lib => lib/x86}/cpu.c               |   2 +-
 48 files changed, 4084 insertions(+), 149 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-bus-peci
 create mode 100644 Documentation/devicetree/bindings/peci/peci-aspeed.yaml
 create mode 100644 Documentation/devicetree/bindings/peci/peci-controller.yaml
 create mode 100644 Documentation/hwmon/peci-cputemp.rst
 create mode 100644 Documentation/hwmon/peci-dimmtemp.rst
 create mode 100644 Documentation/peci/index.rst
 create mode 100644 Documentation/peci/peci.rst
 create mode 100644 drivers/hwmon/peci/Kconfig
 create mode 100644 drivers/hwmon/peci/Makefile
 create mode 100644 drivers/hwmon/peci/common.h
 create mode 100644 drivers/hwmon/peci/cputemp.c
 create mode 100644 drivers/hwmon/peci/dimmtemp.c
 create mode 100644 drivers/peci/Kconfig
 create mode 100644 drivers/peci/Makefile
 create mode 100644 drivers/peci/controller/Kconfig
 create mode 100644 drivers/peci/controller/Makefile
 create mode 100644 drivers/peci/controller/peci-aspeed.c
 create mode 100644 drivers/peci/core.c
 create mode 100644 drivers/peci/cpu.c
 create mode 100644 drivers/peci/device.c
 create mode 100644 drivers/peci/internal.h
 create mode 100644 drivers/peci/request.c
 create mode 100644 drivers/peci/sysfs.c
 create mode 100644 include/linux/peci-cpu.h
 create mode 100644 include/linux/peci.h
 create mode 100644 include/linux/x86/cpu.h
 create mode 100644 include/linux/x86/intel-family.h
 create mode 100644 lib/x86/Makefile
 rename {arch/x86/lib => lib/x86}/cpu.c (95%)

-- 
2.31.1



* [PATCH v2 01/15] x86/cpu: Move intel-family to arch-independent headers
  2021-08-03 11:31 [PATCH v2 00/15] Introduce PECI subsystem Iwona Winiarska
@ 2021-08-03 11:31 ` Iwona Winiarska
  2021-10-04 19:03   ` Borislav Petkov
  2021-08-03 11:31 ` [PATCH v2 02/15] x86/cpu: Extract cpuid helpers to arch-independent Iwona Winiarska
                   ` (14 subsequent siblings)
  15 siblings, 1 reply; 49+ messages in thread
From: Iwona Winiarska @ 2021-08-03 11:31 UTC (permalink / raw)

Baseboard management controllers (BMC) often run Linux but are usually
implemented with non-X86 processors. They can use PECI to access package
config space (PCS) registers on the host CPU. Since some information
(e.g. the core count) is obtained from different registers on different
CPU generations, they need to decode the family and model.

Move the data from arch/x86/include/asm/intel-family.h into a new file
include/linux/x86/intel-family.h so that it can be used by other
architectures.

Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
---
To limit tree-wide changes and to help people who expect the
intel-family defines in arch/x86 find them without going through git
history, we're not removing the original header completely; we're
keeping it as a "stub" that includes the new one.
If there is consensus that the tree-wide option is better, we can
switch to that approach.
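
For illustration, code on a non-x86 BMC could use these defines to pick
per-generation behaviour once the host family and model are known. The
sketch below is hypothetical: the enum, table and function names are
made up for this example, and the model values are copied locally so
that the snippet builds on its own (in-tree code would include
<linux/x86/intel-family.h> instead).

```c
#include <stddef.h>

/* Values copied from the intel-family defines moved by this patch. */
#define INTEL_FAM6_HASWELL_X	0x3F
#define INTEL_FAM6_BROADWELL_X	0x4F
#define INTEL_FAM6_BROADWELL_D	0x56
#define INTEL_FAM6_SKYLAKE_X	0x55
#define INTEL_FAM6_ICELAKE_X	0x6A
#define INTEL_FAM6_ICELAKE_D	0x6C

/* Hypothetical tag a BMC-side driver might key its register choice on. */
enum host_cpu_gen {
	HOST_GEN_UNKNOWN,
	HOST_GEN_HSX,
	HOST_GEN_BDX,
	HOST_GEN_SKX,
	HOST_GEN_ICX,
};

struct host_cpu_match {
	unsigned int model;
	enum host_cpu_gen gen;
};

static const struct host_cpu_match host_cpu_table[] = {
	{ INTEL_FAM6_HASWELL_X,   HOST_GEN_HSX },
	{ INTEL_FAM6_BROADWELL_X, HOST_GEN_BDX },
	{ INTEL_FAM6_BROADWELL_D, HOST_GEN_BDX },
	{ INTEL_FAM6_SKYLAKE_X,   HOST_GEN_SKX },
	{ INTEL_FAM6_ICELAKE_X,   HOST_GEN_ICX },
	{ INTEL_FAM6_ICELAKE_D,   HOST_GEN_ICX },
};

/* Map a decoded (family, model) pair to a generation tag. */
static enum host_cpu_gen host_cpu_gen_lookup(unsigned int family,
					     unsigned int model)
{
	size_t i;

	/* All entries in the table are family 6 parts. */
	if (family != 6)
		return HOST_GEN_UNKNOWN;

	for (i = 0; i < sizeof(host_cpu_table) / sizeof(host_cpu_table[0]); i++) {
		if (host_cpu_table[i].model == model)
			return host_cpu_table[i].gen;
	}

	return HOST_GEN_UNKNOWN;
}
```

A table keeps the per-model knowledge in one place, so supporting a new
host CPU in such a scheme would mean adding a single entry.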

 MAINTAINERS                         |   2 +
 arch/x86/include/asm/intel-family.h | 141 +--------------------------
 include/linux/x86/intel-family.h    | 146 ++++++++++++++++++++++++++++
 3 files changed, 149 insertions(+), 140 deletions(-)
 create mode 100644 include/linux/x86/intel-family.h

diff --git a/MAINTAINERS b/MAINTAINERS
index c9467d2839f5..104773d40952 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -9241,6 +9241,7 @@ M:	x86@kernel.org
 L:	linux-kernel@vger.kernel.org
 S:	Supported
 F:	arch/x86/include/asm/intel-family.h
+F:	include/linux/x86/intel-family.h
 
 INTEL DRM DRIVERS (excluding Poulsbo, Moorestown and derivative chipsets)
 M:	Jani Nikula <jani.nikula@linux.intel.com>
@@ -20105,6 +20106,7 @@ T:	git git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git x86/core
 F:	Documentation/devicetree/bindings/x86/
 F:	Documentation/x86/
 F:	arch/x86/
+F:	include/linux/x86/
 
 X86 ENTRY CODE
 M:	Andy Lutomirski <luto@kernel.org>
diff --git a/arch/x86/include/asm/intel-family.h b/arch/x86/include/asm/intel-family.h
index 27158436f322..0d4fe1b4e1f6 100644
--- a/arch/x86/include/asm/intel-family.h
+++ b/arch/x86/include/asm/intel-family.h
@@ -2,145 +2,6 @@
 #ifndef _ASM_X86_INTEL_FAMILY_H
 #define _ASM_X86_INTEL_FAMILY_H
 
-/*
- * "Big Core" Processors (Branded as Core, Xeon, etc...)
- *
- * While adding a new CPUID for a new microarchitecture, add a new
- * group to keep logically sorted out in chronological order. Within
- * that group keep the CPUID for the variants sorted by model number.
- *
- * The defined symbol names have the following form:
- *	INTEL_FAM6{OPTFAMILY}_{MICROARCH}{OPTDIFF}
- * where:
- * OPTFAMILY	Describes the family of CPUs that this belongs to. Default
- *		is assumed to be "_CORE" (and should be omitted). Other values
- *		currently in use are _ATOM and _XEON_PHI
- * MICROARCH	Is the code name for the micro-architecture for this core.
- *		N.B. Not the platform name.
- * OPTDIFF	If needed, a short string to differentiate by market segment.
- *
- *		Common OPTDIFFs:
- *
- *			- regular client parts
- *		_L	- regular mobile parts
- *		_G	- parts with extra graphics on
- *		_X	- regular server parts
- *		_D	- micro server parts
- *
- *		Historical OPTDIFFs:
- *
- *		_EP	- 2 socket server parts
- *		_EX	- 4+ socket server parts
- *
- * The #define line may optionally include a comment including platform or core
- * names. An exception is made for skylake/kabylake where steppings seem to have gotten
- * their own names :-(
- */
-
-/* Wildcard match for FAM6 so X86_MATCH_INTEL_FAM6_MODEL(ANY) works */
-#define INTEL_FAM6_ANY			X86_MODEL_ANY
-
-#define INTEL_FAM6_CORE_YONAH		0x0E
-
-#define INTEL_FAM6_CORE2_MEROM		0x0F
-#define INTEL_FAM6_CORE2_MEROM_L	0x16
-#define INTEL_FAM6_CORE2_PENRYN		0x17
-#define INTEL_FAM6_CORE2_DUNNINGTON	0x1D
-
-#define INTEL_FAM6_NEHALEM		0x1E
-#define INTEL_FAM6_NEHALEM_G		0x1F /* Auburndale / Havendale */
-#define INTEL_FAM6_NEHALEM_EP		0x1A
-#define INTEL_FAM6_NEHALEM_EX		0x2E
-
-#define INTEL_FAM6_WESTMERE		0x25
-#define INTEL_FAM6_WESTMERE_EP		0x2C
-#define INTEL_FAM6_WESTMERE_EX		0x2F
-
-#define INTEL_FAM6_SANDYBRIDGE		0x2A
-#define INTEL_FAM6_SANDYBRIDGE_X	0x2D
-#define INTEL_FAM6_IVYBRIDGE		0x3A
-#define INTEL_FAM6_IVYBRIDGE_X		0x3E
-
-#define INTEL_FAM6_HASWELL		0x3C
-#define INTEL_FAM6_HASWELL_X		0x3F
-#define INTEL_FAM6_HASWELL_L		0x45
-#define INTEL_FAM6_HASWELL_G		0x46
-
-#define INTEL_FAM6_BROADWELL		0x3D
-#define INTEL_FAM6_BROADWELL_G		0x47
-#define INTEL_FAM6_BROADWELL_X		0x4F
-#define INTEL_FAM6_BROADWELL_D		0x56
-
-#define INTEL_FAM6_SKYLAKE_L		0x4E	/* Sky Lake             */
-#define INTEL_FAM6_SKYLAKE		0x5E	/* Sky Lake             */
-#define INTEL_FAM6_SKYLAKE_X		0x55	/* Sky Lake             */
-/*                 CASCADELAKE_X	0x55	   Sky Lake -- s: 7     */
-/*                 COOPERLAKE_X		0x55	   Sky Lake -- s: 11    */
-
-#define INTEL_FAM6_KABYLAKE_L		0x8E	/* Sky Lake             */
-/*                 AMBERLAKE_L		0x8E	   Sky Lake -- s: 9     */
-/*                 COFFEELAKE_L		0x8E	   Sky Lake -- s: 10    */
-/*                 WHISKEYLAKE_L	0x8E       Sky Lake -- s: 11,12 */
-
-#define INTEL_FAM6_KABYLAKE		0x9E	/* Sky Lake             */
-/*                 COFFEELAKE		0x9E	   Sky Lake -- s: 10-13 */
-
-#define INTEL_FAM6_COMETLAKE		0xA5	/* Sky Lake             */
-#define INTEL_FAM6_COMETLAKE_L		0xA6	/* Sky Lake             */
-
-#define INTEL_FAM6_CANNONLAKE_L		0x66	/* Palm Cove */
-
-#define INTEL_FAM6_ICELAKE_X		0x6A	/* Sunny Cove */
-#define INTEL_FAM6_ICELAKE_D		0x6C	/* Sunny Cove */
-#define INTEL_FAM6_ICELAKE		0x7D	/* Sunny Cove */
-#define INTEL_FAM6_ICELAKE_L		0x7E	/* Sunny Cove */
-#define INTEL_FAM6_ICELAKE_NNPI		0x9D	/* Sunny Cove */
-
-#define INTEL_FAM6_LAKEFIELD		0x8A	/* Sunny Cove / Tremont */
-
-#define INTEL_FAM6_ROCKETLAKE		0xA7	/* Cypress Cove */
-
-#define INTEL_FAM6_TIGERLAKE_L		0x8C	/* Willow Cove */
-#define INTEL_FAM6_TIGERLAKE		0x8D	/* Willow Cove */
-
-#define INTEL_FAM6_SAPPHIRERAPIDS_X	0x8F	/* Golden Cove */
-
-#define INTEL_FAM6_ALDERLAKE		0x97	/* Golden Cove / Gracemont */
-#define INTEL_FAM6_ALDERLAKE_L		0x9A	/* Golden Cove / Gracemont */
-
-/* "Small Core" Processors (Atom) */
-
-#define INTEL_FAM6_ATOM_BONNELL		0x1C /* Diamondville, Pineview */
-#define INTEL_FAM6_ATOM_BONNELL_MID	0x26 /* Silverthorne, Lincroft */
-
-#define INTEL_FAM6_ATOM_SALTWELL	0x36 /* Cedarview */
-#define INTEL_FAM6_ATOM_SALTWELL_MID	0x27 /* Penwell */
-#define INTEL_FAM6_ATOM_SALTWELL_TABLET	0x35 /* Cloverview */
-
-#define INTEL_FAM6_ATOM_SILVERMONT	0x37 /* Bay Trail, Valleyview */
-#define INTEL_FAM6_ATOM_SILVERMONT_D	0x4D /* Avaton, Rangely */
-#define INTEL_FAM6_ATOM_SILVERMONT_MID	0x4A /* Merriefield */
-
-#define INTEL_FAM6_ATOM_AIRMONT		0x4C /* Cherry Trail, Braswell */
-#define INTEL_FAM6_ATOM_AIRMONT_MID	0x5A /* Moorefield */
-#define INTEL_FAM6_ATOM_AIRMONT_NP	0x75 /* Lightning Mountain */
-
-#define INTEL_FAM6_ATOM_GOLDMONT	0x5C /* Apollo Lake */
-#define INTEL_FAM6_ATOM_GOLDMONT_D	0x5F /* Denverton */
-
-/* Note: the micro-architecture is "Goldmont Plus" */
-#define INTEL_FAM6_ATOM_GOLDMONT_PLUS	0x7A /* Gemini Lake */
-
-#define INTEL_FAM6_ATOM_TREMONT_D	0x86 /* Jacobsville */
-#define INTEL_FAM6_ATOM_TREMONT		0x96 /* Elkhart Lake */
-#define INTEL_FAM6_ATOM_TREMONT_L	0x9C /* Jasper Lake */
-
-/* Xeon Phi */
-
-#define INTEL_FAM6_XEON_PHI_KNL		0x57 /* Knights Landing */
-#define INTEL_FAM6_XEON_PHI_KNM		0x85 /* Knights Mill */
-
-/* Family 5 */
-#define INTEL_FAM5_QUARK_X1000		0x09 /* Quark X1000 SoC */
+#include <linux/x86/intel-family.h>
 
 #endif /* _ASM_X86_INTEL_FAMILY_H */
diff --git a/include/linux/x86/intel-family.h b/include/linux/x86/intel-family.h
new file mode 100644
index 000000000000..ae4b075c1ab9
--- /dev/null
+++ b/include/linux/x86/intel-family.h
@@ -0,0 +1,146 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_X86_INTEL_FAMILY_H
+#define _LINUX_X86_INTEL_FAMILY_H
+
+/*
+ * "Big Core" Processors (Branded as Core, Xeon, etc...)
+ *
+ * While adding a new CPUID for a new microarchitecture, add a new
+ * group to keep logically sorted out in chronological order. Within
+ * that group keep the CPUID for the variants sorted by model number.
+ *
+ * The defined symbol names have the following form:
+ *	INTEL_FAM6{OPTFAMILY}_{MICROARCH}{OPTDIFF}
+ * where:
+ * OPTFAMILY	Describes the family of CPUs that this belongs to. Default
+ *		is assumed to be "_CORE" (and should be omitted). Other values
+ *		currently in use are _ATOM and _XEON_PHI
+ * MICROARCH	Is the code name for the micro-architecture for this core.
+ *		N.B. Not the platform name.
+ * OPTDIFF	If needed, a short string to differentiate by market segment.
+ *
+ *		Common OPTDIFFs:
+ *
+ *			- regular client parts
+ *		_L	- regular mobile parts
+ *		_G	- parts with extra graphics on
+ *		_X	- regular server parts
+ *		_D	- micro server parts
+ *
+ *		Historical OPTDIFFs:
+ *
+ *		_EP	- 2 socket server parts
+ *		_EX	- 4+ socket server parts
+ *
+ * The #define line may optionally include a comment including platform or core
+ * names. An exception is made for skylake/kabylake where steppings seem to have gotten
+ * their own names :-(
+ */
+
+/* Wildcard match for FAM6 so X86_MATCH_INTEL_FAM6_MODEL(ANY) works */
+#define INTEL_FAM6_ANY			X86_MODEL_ANY
+
+#define INTEL_FAM6_CORE_YONAH		0x0E
+
+#define INTEL_FAM6_CORE2_MEROM		0x0F
+#define INTEL_FAM6_CORE2_MEROM_L	0x16
+#define INTEL_FAM6_CORE2_PENRYN		0x17
+#define INTEL_FAM6_CORE2_DUNNINGTON	0x1D
+
+#define INTEL_FAM6_NEHALEM		0x1E
+#define INTEL_FAM6_NEHALEM_G		0x1F /* Auburndale / Havendale */
+#define INTEL_FAM6_NEHALEM_EP		0x1A
+#define INTEL_FAM6_NEHALEM_EX		0x2E
+
+#define INTEL_FAM6_WESTMERE		0x25
+#define INTEL_FAM6_WESTMERE_EP		0x2C
+#define INTEL_FAM6_WESTMERE_EX		0x2F
+
+#define INTEL_FAM6_SANDYBRIDGE		0x2A
+#define INTEL_FAM6_SANDYBRIDGE_X	0x2D
+#define INTEL_FAM6_IVYBRIDGE		0x3A
+#define INTEL_FAM6_IVYBRIDGE_X		0x3E
+
+#define INTEL_FAM6_HASWELL		0x3C
+#define INTEL_FAM6_HASWELL_X		0x3F
+#define INTEL_FAM6_HASWELL_L		0x45
+#define INTEL_FAM6_HASWELL_G		0x46
+
+#define INTEL_FAM6_BROADWELL		0x3D
+#define INTEL_FAM6_BROADWELL_G		0x47
+#define INTEL_FAM6_BROADWELL_X		0x4F
+#define INTEL_FAM6_BROADWELL_D		0x56
+
+#define INTEL_FAM6_SKYLAKE_L		0x4E	/* Sky Lake             */
+#define INTEL_FAM6_SKYLAKE		0x5E	/* Sky Lake             */
+#define INTEL_FAM6_SKYLAKE_X		0x55	/* Sky Lake             */
+/*                 CASCADELAKE_X	0x55	   Sky Lake -- s: 7     */
+/*                 COOPERLAKE_X		0x55	   Sky Lake -- s: 11    */
+
+#define INTEL_FAM6_KABYLAKE_L		0x8E	/* Sky Lake             */
+/*                 AMBERLAKE_L		0x8E	   Sky Lake -- s: 9     */
+/*                 COFFEELAKE_L		0x8E	   Sky Lake -- s: 10    */
+/*                 WHISKEYLAKE_L	0x8E       Sky Lake -- s: 11,12 */
+
+#define INTEL_FAM6_KABYLAKE		0x9E	/* Sky Lake             */
+/*                 COFFEELAKE		0x9E	   Sky Lake -- s: 10-13 */
+
+#define INTEL_FAM6_COMETLAKE		0xA5	/* Sky Lake             */
+#define INTEL_FAM6_COMETLAKE_L		0xA6	/* Sky Lake             */
+
+#define INTEL_FAM6_CANNONLAKE_L		0x66	/* Palm Cove */
+
+#define INTEL_FAM6_ICELAKE_X		0x6A	/* Sunny Cove */
+#define INTEL_FAM6_ICELAKE_D		0x6C	/* Sunny Cove */
+#define INTEL_FAM6_ICELAKE		0x7D	/* Sunny Cove */
+#define INTEL_FAM6_ICELAKE_L		0x7E	/* Sunny Cove */
+#define INTEL_FAM6_ICELAKE_NNPI		0x9D	/* Sunny Cove */
+
+#define INTEL_FAM6_LAKEFIELD		0x8A	/* Sunny Cove / Tremont */
+
+#define INTEL_FAM6_ROCKETLAKE		0xA7	/* Cypress Cove */
+
+#define INTEL_FAM6_TIGERLAKE_L		0x8C	/* Willow Cove */
+#define INTEL_FAM6_TIGERLAKE		0x8D	/* Willow Cove */
+
+#define INTEL_FAM6_SAPPHIRERAPIDS_X	0x8F	/* Golden Cove */
+
+#define INTEL_FAM6_ALDERLAKE		0x97	/* Golden Cove / Gracemont */
+#define INTEL_FAM6_ALDERLAKE_L		0x9A	/* Golden Cove / Gracemont */
+
+/* "Small Core" Processors (Atom) */
+
+#define INTEL_FAM6_ATOM_BONNELL		0x1C /* Diamondville, Pineview */
+#define INTEL_FAM6_ATOM_BONNELL_MID	0x26 /* Silverthorne, Lincroft */
+
+#define INTEL_FAM6_ATOM_SALTWELL	0x36 /* Cedarview */
+#define INTEL_FAM6_ATOM_SALTWELL_MID	0x27 /* Penwell */
+#define INTEL_FAM6_ATOM_SALTWELL_TABLET	0x35 /* Cloverview */
+
+#define INTEL_FAM6_ATOM_SILVERMONT	0x37 /* Bay Trail, Valleyview */
+#define INTEL_FAM6_ATOM_SILVERMONT_D	0x4D /* Avaton, Rangely */
+#define INTEL_FAM6_ATOM_SILVERMONT_MID	0x4A /* Merriefield */
+
+#define INTEL_FAM6_ATOM_AIRMONT		0x4C /* Cherry Trail, Braswell */
+#define INTEL_FAM6_ATOM_AIRMONT_MID	0x5A /* Moorefield */
+#define INTEL_FAM6_ATOM_AIRMONT_NP	0x75 /* Lightning Mountain */
+
+#define INTEL_FAM6_ATOM_GOLDMONT	0x5C /* Apollo Lake */
+#define INTEL_FAM6_ATOM_GOLDMONT_D	0x5F /* Denverton */
+
+/* Note: the micro-architecture is "Goldmont Plus" */
+#define INTEL_FAM6_ATOM_GOLDMONT_PLUS	0x7A /* Gemini Lake */
+
+#define INTEL_FAM6_ATOM_TREMONT_D	0x86 /* Jacobsville */
+#define INTEL_FAM6_ATOM_TREMONT		0x96 /* Elkhart Lake */
+#define INTEL_FAM6_ATOM_TREMONT_L	0x9C /* Jasper Lake */
+
+/* Xeon Phi */
+
+#define INTEL_FAM6_XEON_PHI_KNL		0x57 /* Knights Landing */
+#define INTEL_FAM6_XEON_PHI_KNM		0x85 /* Knights Mill */
+
+/* Family 5 */
+#define INTEL_FAM5_QUARK_X1000		0x09 /* Quark X1000 SoC */
+
+#endif /* _LINUX_X86_INTEL_FAMILY_H */
-- 
2.31.1



* [PATCH v2 02/15] x86/cpu: Extract cpuid helpers to arch-independent
  2021-08-03 11:31 [PATCH v2 00/15] Introduce PECI subsystem Iwona Winiarska
  2021-08-03 11:31 ` [PATCH v2 01/15] x86/cpu: Move intel-family to arch-independent headers Iwona Winiarska
@ 2021-08-03 11:31 ` Iwona Winiarska
  2021-10-04 19:08   ` Borislav Petkov
  2021-08-03 11:31 ` [PATCH v2 03/15] dt-bindings: Add generic bindings for PECI Iwona Winiarska
                   ` (13 subsequent siblings)
  15 siblings, 1 reply; 49+ messages in thread
From: Iwona Winiarska @ 2021-08-03 11:31 UTC (permalink / raw)

Baseboard management controllers (BMC) often run Linux but are usually
implemented with non-X86 processors. They can use PECI to access package
config space (PCS) registers on the host CPU. Since some information
(e.g. the core count) is obtained from different registers on different
CPU generations, they need to decode the family and model.

The Package Identifier PCS register that describes CPUID information
has the same layout as CPUID_1.EAX, so let's allow the cpuid helpers to
be reused by making them available to other architectures as well.

Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
---
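For readers unfamiliar with the CPUID_1.EAX layout, the decoding can be
sketched in plain C as below. This is an illustrative sketch based on
the documented signature layout (stepping in bits 3:0, model in 7:4,
family in 11:8, extended model in 19:16, extended family in 27:20); the
sig_*() names are made up for this example and are not the helpers
declared in include/linux/x86/cpu.h.

```c
/* Family: base family, plus the extended family when the base is 0xF. */
static unsigned int sig_family(unsigned int sig)
{
	unsigned int fam = (sig >> 8) & 0xf;

	if (fam == 0xf)
		fam += (sig >> 20) & 0xff;

	return fam;
}

/* Model: base model, with the extended model as the high nibble for family >= 6. */
static unsigned int sig_model(unsigned int sig)
{
	unsigned int model = (sig >> 4) & 0xf;

	if (sig_family(sig) >= 0x6)
		model += ((sig >> 16) & 0xf) << 4;

	return model;
}

/* Stepping: the low four bits of the signature. */
static unsigned int sig_stepping(unsigned int sig)
{
	return sig & 0xf;
}
```

For example, a signature of 0x00050654 decodes to family 6, model 0x55
(INTEL_FAM6_SKYLAKE_X) and stepping 4.
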
 MAINTAINERS                      | 1 +
 arch/x86/Kconfig                 | 1 +
 arch/x86/include/asm/cpu.h       | 3 ---
 arch/x86/include/asm/microcode.h | 2 +-
 arch/x86/kvm/cpuid.h             | 3 ++-
 arch/x86/lib/Makefile            | 2 +-
 drivers/edac/mce_amd.c           | 3 +--
 include/linux/x86/cpu.h          | 9 +++++++++
 lib/Kconfig                      | 4 ++++
 lib/Makefile                     | 2 ++
 lib/x86/Makefile                 | 3 +++
 {arch/x86/lib => lib/x86}/cpu.c  | 2 +-
 12 files changed, 26 insertions(+), 9 deletions(-)
 create mode 100644 include/linux/x86/cpu.h
 create mode 100644 lib/x86/Makefile
 rename {arch/x86/lib => lib/x86}/cpu.c (95%)

diff --git a/MAINTAINERS b/MAINTAINERS
index 104773d40952..7cdab7229651 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -20107,6 +20107,7 @@ F:	Documentation/devicetree/bindings/x86/
 F:	Documentation/x86/
 F:	arch/x86/
 F:	include/linux/x86/
+F:	lib/x86/
 
 X86 ENTRY CODE
 M:	Andy Lutomirski <luto@kernel.org>
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 88fb922c23a0..9096593999ba 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -141,6 +141,7 @@ config X86
 	select GENERIC_IRQ_PROBE
 	select GENERIC_IRQ_RESERVATION_MODE
 	select GENERIC_IRQ_SHOW
+	select GENERIC_LIB_X86
 	select GENERIC_PENDING_IRQ		if SMP
 	select GENERIC_PTDUMP
 	select GENERIC_SMP_IDLE_THREAD
diff --git a/arch/x86/include/asm/cpu.h b/arch/x86/include/asm/cpu.h
index 33d41e350c79..2a663a05a795 100644
--- a/arch/x86/include/asm/cpu.h
+++ b/arch/x86/include/asm/cpu.h
@@ -37,9 +37,6 @@ extern int _debug_hotplug_cpu(int cpu, int action);
 
 int mwait_usable(const struct cpuinfo_x86 *);
 
-unsigned int x86_family(unsigned int sig);
-unsigned int x86_model(unsigned int sig);
-unsigned int x86_stepping(unsigned int sig);
 #ifdef CONFIG_CPU_SUP_INTEL
 extern void __init sld_setup(struct cpuinfo_x86 *c);
 extern void switch_to_sld(unsigned long tifn);
diff --git a/arch/x86/include/asm/microcode.h b/arch/x86/include/asm/microcode.h
index ab45a220fac4..4b0eabf63b98 100644
--- a/arch/x86/include/asm/microcode.h
+++ b/arch/x86/include/asm/microcode.h
@@ -2,9 +2,9 @@
 #ifndef _ASM_X86_MICROCODE_H
 #define _ASM_X86_MICROCODE_H
 
-#include <asm/cpu.h>
 #include <linux/earlycpio.h>
 #include <linux/initrd.h>
+#include <linux/x86/cpu.h>
 
 struct ucode_patch {
 	struct list_head plist;
diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
index c99edfff7f82..bf070d2a2175 100644
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -4,10 +4,11 @@
 
 #include "x86.h"
 #include "reverse_cpuid.h"
-#include <asm/cpu.h>
 #include <asm/processor.h>
 #include <uapi/asm/kvm_para.h>
 
+#include <linux/x86/cpu.h>
+
 extern u32 kvm_cpu_caps[NR_KVM_CPU_CAPS] __read_mostly;
 void kvm_set_cpu_caps(void);
 
diff --git a/arch/x86/lib/Makefile b/arch/x86/lib/Makefile
index bad4dee4f0e4..fd73c1b72c3e 100644
--- a/arch/x86/lib/Makefile
+++ b/arch/x86/lib/Makefile
@@ -41,7 +41,7 @@ clean-files := inat-tables.c
 
 obj-$(CONFIG_SMP) += msr-smp.o cache-smp.o
 
-lib-y := delay.o misc.o cmdline.o cpu.o
+lib-y := delay.o misc.o cmdline.o
 lib-y += usercopy_$(BITS).o usercopy.o getuser.o putuser.o
 lib-y += memcpy_$(BITS).o
 lib-$(CONFIG_ARCH_HAS_COPY_MC) += copy_mc.o copy_mc_64.o
diff --git a/drivers/edac/mce_amd.c b/drivers/edac/mce_amd.c
index 27d56920b469..f545f5fad02c 100644
--- a/drivers/edac/mce_amd.c
+++ b/drivers/edac/mce_amd.c
@@ -1,8 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0-only
 #include <linux/module.h>
 #include <linux/slab.h>
-
-#include <asm/cpu.h>
+#include <linux/x86/cpu.h>
 
 #include "mce_amd.h"
 
diff --git a/include/linux/x86/cpu.h b/include/linux/x86/cpu.h
new file mode 100644
index 000000000000..5f383d47886d
--- /dev/null
+++ b/include/linux/x86/cpu.h
@@ -0,0 +1,9 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef _LINUX_X86_CPU_H
+#define _LINUX_X86_CPU_H
+
+unsigned int x86_family(unsigned int sig);
+unsigned int x86_model(unsigned int sig);
+unsigned int x86_stepping(unsigned int sig);
+
+#endif /* _LINUX_X86_CPU_H */
diff --git a/lib/Kconfig b/lib/Kconfig
index 5c9c0687f76d..e538d4d773bd 100644
--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -715,3 +715,7 @@ config PLDMFW
 
 config ASN1_ENCODER
        tristate
+
+config GENERIC_LIB_X86
+	bool
+	depends on X86
diff --git a/lib/Makefile b/lib/Makefile
index 5efd1b435a37..befbd9413432 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -360,3 +360,5 @@ obj-$(CONFIG_CMDLINE_KUNIT_TEST) += cmdline_kunit.o
 obj-$(CONFIG_SLUB_KUNIT_TEST) += slub_kunit.o
 
 obj-$(CONFIG_GENERIC_LIB_DEVMEM_IS_ALLOWED) += devmem_is_allowed.o
+
+obj-$(CONFIG_GENERIC_LIB_X86) += x86/
diff --git a/lib/x86/Makefile b/lib/x86/Makefile
new file mode 100644
index 000000000000..342024c272fc
--- /dev/null
+++ b/lib/x86/Makefile
@@ -0,0 +1,3 @@
+# SPDX-License-Identifier: GPL-2.0-only
+
+obj-y := cpu.o
diff --git a/arch/x86/lib/cpu.c b/lib/x86/cpu.c
similarity index 95%
rename from arch/x86/lib/cpu.c
rename to lib/x86/cpu.c
index 7ad68917a51e..17af59a2fddf 100644
--- a/arch/x86/lib/cpu.c
+++ b/lib/x86/cpu.c
@@ -1,7 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0-only
 #include <linux/types.h>
 #include <linux/export.h>
-#include <asm/cpu.h>
+#include <linux/x86/cpu.h>
 
 unsigned int x86_family(unsigned int sig)
 {
-- 
2.31.1



* [PATCH v2 03/15] dt-bindings: Add generic bindings for PECI
  2021-08-03 11:31 [PATCH v2 00/15] Introduce PECI subsystem Iwona Winiarska
  2021-08-03 11:31 ` [PATCH v2 01/15] x86/cpu: Move intel-family to arch-independent headers Iwona Winiarska
  2021-08-03 11:31 ` [PATCH v2 02/15] x86/cpu: Extract cpuid helpers to arch-independent Iwona Winiarska
@ 2021-08-03 11:31 ` Iwona Winiarska
  2021-08-11 18:11   ` Rob Herring
  2021-08-03 11:31 ` [PATCH v2 04/15] dt-bindings: Add bindings for peci-aspeed Iwona Winiarska
                   ` (12 subsequent siblings)
  15 siblings, 1 reply; 49+ messages in thread
From: Iwona Winiarska @ 2021-08-03 11:31 UTC (permalink / raw)

Add device tree bindings for the PECI controller.

Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com>
---
 .../bindings/peci/peci-controller.yaml        | 33 +++++++++++++++++++
 1 file changed, 33 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/peci/peci-controller.yaml

diff --git a/Documentation/devicetree/bindings/peci/peci-controller.yaml b/Documentation/devicetree/bindings/peci/peci-controller.yaml
new file mode 100644
index 000000000000..bbc3d3f3a929
--- /dev/null
+++ b/Documentation/devicetree/bindings/peci/peci-controller.yaml
@@ -0,0 +1,33 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/peci/peci-controller.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Generic Device Tree Bindings for PECI
+
+maintainers:
+  - Iwona Winiarska <iwona.winiarska@intel.com>
+
+description:
+  PECI (Platform Environment Control Interface) is an interface that provides a
+  communication channel from Intel processors and chipset components to external
+  monitoring or control devices.
+
+properties:
+  $nodename:
+    pattern: "^peci-controller(@.*)?$"
+
+  cmd-timeout-ms:
+    description:
+      Command timeout in units of ms.
+
+additionalProperties: true
+
+examples:
+  - |
+    peci-controller@1e78b000 {
+      reg = <0x1e78b000 0x100>;
+      cmd-timeout-ms = <500>;
+    };
+...
-- 
2.31.1



* [PATCH v2 04/15] dt-bindings: Add bindings for peci-aspeed
  2021-08-03 11:31 [PATCH v2 00/15] Introduce PECI subsystem Iwona Winiarska
                   ` (2 preceding siblings ...)
  2021-08-03 11:31 ` [PATCH v2 03/15] dt-bindings: Add generic bindings for PECI Iwona Winiarska
@ 2021-08-03 11:31 ` Iwona Winiarska
  2021-08-11 18:11   ` Rob Herring
  2021-08-03 11:31 ` [PATCH v2 05/15] ARM: dts: aspeed: Add PECI controller nodes Iwona Winiarska
                   ` (11 subsequent siblings)
  15 siblings, 1 reply; 49+ messages in thread
From: Iwona Winiarska @ 2021-08-03 11:31 UTC (permalink / raw)
  To: linux-kernel, openbmc, Greg Kroah-Hartman
  Cc: x86, devicetree, linux-aspeed, linux-arm-kernel, linux-hwmon,
	linux-doc, Rob Herring, Joel Stanley, Andrew Jeffery,
	Jean Delvare, Guenter Roeck, Arnd Bergmann, Olof Johansson,
	Jonathan Corbet, Thomas Gleixner, Andy Lutomirski, Ingo Molnar,
	Borislav Petkov, Yazen Ghannam, Mauro Carvalho Chehab,
	Pierre-Louis Bossart, Tony Luck, Andy Shevchenko, Jae Hyun Yoo,
	Dan Williams, Randy Dunlap, Zev Weiss, David Muller,
	Iwona Winiarska

Add device tree bindings for the peci-aspeed controller driver.

Co-developed-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com>
---
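Illustration only, not part of the patch: a minimal sketch of the
arithmetic that the property descriptions in this binding spell out. It
assumes nothing beyond the binding text: the divider is 2 raised to the
power of aspeed,clock-divider, and the msg/addr timing values are counted
in units of four PECI clock periods. The function names are hypothetical.

```c
#include <stdint.h>

/* aspeed,clock-divider (0..7): divider = 2 ^ value. */
static uint32_t peci_clk_divider(uint32_t clock_divider)
{
	return UINT32_C(1) << clock_divider;
}

/*
 * aspeed,msg-timing / aspeed,addr-timing (0..255): negotiation period
 * expressed in PECI clock periods (one unit = four clock periods).
 */
static uint32_t peci_nego_clk_periods(uint32_t timing)
{
	return 4 * timing;
}

/* Range check mirroring the minimum/maximum constraints in the schema. */
static int peci_timing_props_valid(uint32_t clock_divider, uint32_t msg_timing,
				   uint32_t addr_timing, uint32_t rd_sampling_point)
{
	return clock_divider <= 7 && msg_timing <= 255 &&
	       addr_timing <= 255 && rd_sampling_point <= 15;
}
```

With the defaults used in the example node (clock-divider 0, msg-timing 1,
addr-timing 1, rd-sampling-point 8), the divider is 1 and each negotiation
period lasts four PECI clock periods.
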
 .../devicetree/bindings/peci/peci-aspeed.yaml | 109 ++++++++++++++++++
 1 file changed, 109 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/peci/peci-aspeed.yaml

diff --git a/Documentation/devicetree/bindings/peci/peci-aspeed.yaml b/Documentation/devicetree/bindings/peci/peci-aspeed.yaml
new file mode 100644
index 000000000000..2929d1e000d8
--- /dev/null
+++ b/Documentation/devicetree/bindings/peci/peci-aspeed.yaml
@@ -0,0 +1,109 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/peci/peci-aspeed.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Aspeed PECI Bus Device Tree Bindings
+
+maintainers:
+  - Iwona Winiarska <iwona.winiarska@intel.com>
+  - Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
+
+allOf:
+  - $ref: peci-controller.yaml#
+
+properties:
+  compatible:
+    enum:
+      - aspeed,ast2400-peci
+      - aspeed,ast2500-peci
+      - aspeed,ast2600-peci
+
+  reg:
+    maxItems: 1
+
+  interrupts:
+    maxItems: 1
+
+  clocks:
+    description:
+      Clock source for PECI controller. Should reference the external
+      oscillator clock.
+    maxItems: 1
+
+  resets:
+    maxItems: 1
+
+  cmd-timeout-ms:
+    minimum: 1
+    maximum: 1000
+    default: 1000
+
+  aspeed,clock-divider:
+    description:
+      This value determines PECI controller internal clock dividing
+      rate. The divider will be calculated as 2 raised to the power of
+      the given value.
+    $ref: /schemas/types.yaml#/definitions/uint32
+    minimum: 0
+    maximum: 7
+    default: 0
+
+  aspeed,msg-timing:
+    description:
+      Message timing negotiation period. This value will determine the period
+      of message timing negotiation to be issued by PECI controller. The unit
+      of the programmed value is four times the PECI clock period.
+    $ref: /schemas/types.yaml#/definitions/uint32
+    minimum: 0
+    maximum: 255
+    default: 1
+
+  aspeed,addr-timing:
+    description:
+      Address timing negotiation period. This value will determine the period
+      of address timing negotiation to be issued by PECI controller. The unit
+      of the programmed value is four times the PECI clock period.
+    $ref: /schemas/types.yaml#/definitions/uint32
+    minimum: 0
+    maximum: 255
+    default: 1
+
+  aspeed,rd-sampling-point:
+    description:
+      Read sampling point selection. The whole period of a bit time will be
+      divided into 16 time frames. This value will determine the time frame
+      in which the controller will sample PECI signal for data read back.
+      Sampling in the middle of a bit time is usually best.
+    $ref: /schemas/types.yaml#/definitions/uint32
+    minimum: 0
+    maximum: 15
+    default: 8
+
+required:
+  - compatible
+  - reg
+  - interrupts
+  - clocks
+  - resets
+
+additionalProperties: false
+
+examples:
+  - |
+    #include <dt-bindings/interrupt-controller/arm-gic.h>
+    #include <dt-bindings/clock/ast2600-clock.h>
+    peci-controller@1e78b000 {
+      compatible = "aspeed,ast2600-peci";
+      reg = <0x1e78b000 0x100>;
+      interrupts = <GIC_SPI 38 IRQ_TYPE_LEVEL_HIGH>;
+      clocks = <&syscon ASPEED_CLK_GATE_REF0CLK>;
+      resets = <&syscon ASPEED_RESET_PECI>;
+      cmd-timeout-ms = <1000>;
+      aspeed,clock-divider = <0>;
+      aspeed,msg-timing = <1>;
+      aspeed,addr-timing = <1>;
+      aspeed,rd-sampling-point = <8>;
+    };
+...
-- 
2.31.1



* [PATCH v2 05/15] ARM: dts: aspeed: Add PECI controller nodes
  2021-08-03 11:31 [PATCH v2 00/15] Introduce PECI subsystem Iwona Winiarska
                   ` (3 preceding siblings ...)
  2021-08-03 11:31 ` [PATCH v2 04/15] dt-bindings: Add bindings for peci-aspeed Iwona Winiarska
@ 2021-08-03 11:31 ` Iwona Winiarska
  2021-08-03 11:31 ` [PATCH v2 06/15] peci: Add core infrastructure Iwona Winiarska
                   ` (10 subsequent siblings)
  15 siblings, 0 replies; 49+ messages in thread
From: Iwona Winiarska @ 2021-08-03 11:31 UTC (permalink / raw)
  To: linux-kernel, openbmc, Greg Kroah-Hartman
  Cc: x86, devicetree, linux-aspeed, linux-arm-kernel, linux-hwmon,
	linux-doc, Rob Herring, Joel Stanley, Andrew Jeffery,
	Jean Delvare, Guenter Roeck, Arnd Bergmann, Olof Johansson,
	Jonathan Corbet, Thomas Gleixner, Andy Lutomirski, Ingo Molnar,
	Borislav Petkov, Yazen Ghannam, Mauro Carvalho Chehab,
	Pierre-Louis Bossart, Tony Luck, Andy Shevchenko, Jae Hyun Yoo,
	Dan Williams, Randy Dunlap, Zev Weiss, David Muller,
	Iwona Winiarska

Add PECI controller nodes with all required information.

Co-developed-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com>
---
 arch/arm/boot/dts/aspeed-g4.dtsi | 14 ++++++++++++++
 arch/arm/boot/dts/aspeed-g5.dtsi | 14 ++++++++++++++
 arch/arm/boot/dts/aspeed-g6.dtsi | 14 ++++++++++++++
 3 files changed, 42 insertions(+)

diff --git a/arch/arm/boot/dts/aspeed-g4.dtsi b/arch/arm/boot/dts/aspeed-g4.dtsi
index c5aeb3cf3a09..87f07d7396d0 100644
--- a/arch/arm/boot/dts/aspeed-g4.dtsi
+++ b/arch/arm/boot/dts/aspeed-g4.dtsi
@@ -385,6 +385,20 @@ ibt: ibt@140 {
 				};
 			};
 
+			peci0: peci-controller@1e78b000 {
+				compatible = "aspeed,ast2400-peci";
+				reg = <0x1e78b000 0x60>;
+				interrupts = <15>;
+				clocks = <&syscon ASPEED_CLK_GATE_REFCLK>;
+				resets = <&syscon ASPEED_RESET_PECI>;
+				cmd-timeout-ms = <1000>;
+				aspeed,clock-divider = <0>;
+				aspeed,msg-timing = <1>;
+				aspeed,addr-timing = <1>;
+				aspeed,rd-sampling-point = <8>;
+				status = "disabled";
+			};
+
 			uart2: serial@1e78d000 {
 				compatible = "ns16550a";
 				reg = <0x1e78d000 0x20>;
diff --git a/arch/arm/boot/dts/aspeed-g5.dtsi b/arch/arm/boot/dts/aspeed-g5.dtsi
index 329eaeef66fb..f54d1a9eba22 100644
--- a/arch/arm/boot/dts/aspeed-g5.dtsi
+++ b/arch/arm/boot/dts/aspeed-g5.dtsi
@@ -506,6 +506,20 @@ ibt: ibt@140 {
 				};
 			};
 
+			peci0: peci-controller@1e78b000 {
+				compatible = "aspeed,ast2500-peci";
+				reg = <0x1e78b000 0x60>;
+				interrupts = <15>;
+				clocks = <&syscon ASPEED_CLK_GATE_REFCLK>;
+				resets = <&syscon ASPEED_RESET_PECI>;
+				cmd-timeout-ms = <1000>;
+				aspeed,clock-divider = <0>;
+				aspeed,msg-timing = <1>;
+				aspeed,addr-timing = <1>;
+				aspeed,rd-sampling-point = <8>;
+				status = "disabled";
+			};
+
 			uart2: serial@1e78d000 {
 				compatible = "ns16550a";
 				reg = <0x1e78d000 0x20>;
diff --git a/arch/arm/boot/dts/aspeed-g6.dtsi b/arch/arm/boot/dts/aspeed-g6.dtsi
index f96607b7b4e2..7fd9eaa02be4 100644
--- a/arch/arm/boot/dts/aspeed-g6.dtsi
+++ b/arch/arm/boot/dts/aspeed-g6.dtsi
@@ -459,6 +459,20 @@ wdt4: watchdog@1e7850c0 {
 				status = "disabled";
 			};
 
+			peci0: peci-controller@1e78b000 {
+				compatible = "aspeed,ast2600-peci";
+				reg = <0x1e78b000 0x100>;
+				interrupts = <GIC_SPI 38 IRQ_TYPE_LEVEL_HIGH>;
+				clocks = <&syscon ASPEED_CLK_GATE_REF0CLK>;
+				resets = <&syscon ASPEED_RESET_PECI>;
+				cmd-timeout-ms = <1000>;
+				aspeed,clock-divider = <0>;
+				aspeed,msg-timing = <1>;
+				aspeed,addr-timing = <1>;
+				aspeed,rd-sampling-point = <8>;
+				status = "disabled";
+			};
+
 			lpc: lpc@1e789000 {
 				compatible = "aspeed,ast2600-lpc-v2", "simple-mfd", "syscon";
 				reg = <0x1e789000 0x1000>;
-- 
2.31.1



* [PATCH v2 06/15] peci: Add core infrastructure
  2021-08-03 11:31 [PATCH v2 00/15] Introduce PECI subsystem Iwona Winiarska
                   ` (4 preceding siblings ...)
  2021-08-03 11:31 ` [PATCH v2 05/15] ARM: dts: aspeed: Add PECI controller nodes Iwona Winiarska
@ 2021-08-03 11:31 ` Iwona Winiarska
  2021-08-25 22:58   ` Dan Williams
  2021-08-03 11:31 ` [PATCH v2 07/15] peci: Add peci-aspeed controller driver Iwona Winiarska
                   ` (9 subsequent siblings)
  15 siblings, 1 reply; 49+ messages in thread
From: Iwona Winiarska @ 2021-08-03 11:31 UTC (permalink / raw)
  To: linux-kernel, openbmc, Greg Kroah-Hartman
  Cc: x86, devicetree, linux-aspeed, linux-arm-kernel, linux-hwmon,
	linux-doc, Rob Herring, Joel Stanley, Andrew Jeffery,
	Jean Delvare, Guenter Roeck, Arnd Bergmann, Olof Johansson,
	Jonathan Corbet, Thomas Gleixner, Andy Lutomirski, Ingo Molnar,
	Borislav Petkov, Yazen Ghannam, Mauro Carvalho Chehab,
	Pierre-Louis Bossart, Tony Luck, Andy Shevchenko, Jae Hyun Yoo,
	Dan Williams, Randy Dunlap, Zev Weiss, David Muller,
	Iwona Winiarska, Jason M Bills

Intel processors provide access to various services designed to support
processor and DRAM thermal management, platform manageability and
processor interface tuning and diagnostics.
Those services are available via the Platform Environment Control
Interface (PECI) that provides a communication channel between the
processor and the Baseboard Management Controller (BMC) or other
platform management device.

This change introduces the PECI subsystem by adding the initial core
module and the API for controller drivers.

Co-developed-by: Jason M Bills <jason.m.bills@linux.intel.com>
Signed-off-by: Jason M Bills <jason.m.bills@linux.intel.com>
Co-developed-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
---
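Illustration only, not part of the patch: a userspace model of the
controller abstraction introduced below. The struct layouts are trimmed
copies of the ones in include/linux/peci.h from this patch (no struct
device, no locking). loopback_xfer() and peci_model_xfer() are hypothetical
names. The kernel code rejects a missing ->xfer at allocation time, while
this sketch checks at dispatch time to stay short.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define PECI_REQUEST_MAX_BUF_SIZE 32

struct peci_controller;

struct peci_request {
	struct {
		uint8_t buf[PECI_REQUEST_MAX_BUF_SIZE];
		uint8_t len;
	} rx, tx;
};

/* Each controller driver supplies its own transfer routine. */
struct peci_controller_ops {
	int (*xfer)(struct peci_controller *controller, uint8_t addr,
		    struct peci_request *req);
};

struct peci_controller {
	const struct peci_controller_ops *ops;
	uint8_t id;
};

/* Hypothetical "hardware": copies TX bytes back, up to the requested rx.len. */
static int loopback_xfer(struct peci_controller *controller, uint8_t addr,
			 struct peci_request *req)
{
	uint8_t n = req->tx.len < req->rx.len ? req->tx.len : req->rx.len;

	(void)controller;
	(void)addr;
	memcpy(req->rx.buf, req->tx.buf, n);
	req->rx.len = n;
	return 0;
}

static const struct peci_controller_ops loopback_ops = {
	.xfer = loopback_xfer,
};

/* Core-side dispatch: callers never touch the hardware-specific routine. */
static int peci_model_xfer(struct peci_controller *controller, uint8_t addr,
			   struct peci_request *req)
{
	if (!controller->ops || !controller->ops->xfer)
		return -EINVAL;
	return controller->ops->xfer(controller, addr, req);
}
```

A caller fills req->tx, sets req->rx.len to the number of bytes it expects
and calls peci_model_xfer(); swapping loopback_ops for another ops table
changes the backend without touching the caller.
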
 MAINTAINERS             |   9 +++
 drivers/Kconfig         |   3 +
 drivers/Makefile        |   1 +
 drivers/peci/Kconfig    |  15 ++++
 drivers/peci/Makefile   |   5 ++
 drivers/peci/core.c     | 155 ++++++++++++++++++++++++++++++++++++++++
 drivers/peci/internal.h |  16 +++++
 include/linux/peci.h    |  99 +++++++++++++++++++++++++
 8 files changed, 303 insertions(+)
 create mode 100644 drivers/peci/Kconfig
 create mode 100644 drivers/peci/Makefile
 create mode 100644 drivers/peci/core.c
 create mode 100644 drivers/peci/internal.h
 create mode 100644 include/linux/peci.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 7cdab7229651..d411974aaa5e 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -14503,6 +14503,15 @@ L:	platform-driver-x86@vger.kernel.org
 S:	Maintained
 F:	drivers/platform/x86/peaq-wmi.c
 
+PECI SUBSYSTEM
+M:	Iwona Winiarska <iwona.winiarska@intel.com>
+R:	Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
+L:	openbmc@lists.ozlabs.org (moderated for non-subscribers)
+S:	Supported
+F:	Documentation/devicetree/bindings/peci/
+F:	drivers/peci/
+F:	include/linux/peci.h
+
 PENSANDO ETHERNET DRIVERS
 M:	Shannon Nelson <snelson@pensando.io>
 M:	drivers@pensando.io
diff --git a/drivers/Kconfig b/drivers/Kconfig
index 8bad63417a50..f472b3d972b3 100644
--- a/drivers/Kconfig
+++ b/drivers/Kconfig
@@ -236,4 +236,7 @@ source "drivers/interconnect/Kconfig"
 source "drivers/counter/Kconfig"
 
 source "drivers/most/Kconfig"
+
+source "drivers/peci/Kconfig"
+
 endmenu
diff --git a/drivers/Makefile b/drivers/Makefile
index 27c018bdf4de..8d96f0c3dde5 100644
--- a/drivers/Makefile
+++ b/drivers/Makefile
@@ -189,3 +189,4 @@ obj-$(CONFIG_GNSS)		+= gnss/
 obj-$(CONFIG_INTERCONNECT)	+= interconnect/
 obj-$(CONFIG_COUNTER)		+= counter/
 obj-$(CONFIG_MOST)		+= most/
+obj-$(CONFIG_PECI)		+= peci/
diff --git a/drivers/peci/Kconfig b/drivers/peci/Kconfig
new file mode 100644
index 000000000000..71a4ad81225a
--- /dev/null
+++ b/drivers/peci/Kconfig
@@ -0,0 +1,15 @@
+# SPDX-License-Identifier: GPL-2.0-only
+
+menuconfig PECI
+	tristate "PECI support"
+	help
+	  The Platform Environment Control Interface (PECI) is an interface
+	  that provides a communication channel to Intel processors and
+	  chipset components from external monitoring or control devices.
+
+	  If you are building a Baseboard Management Controller (BMC) kernel
+	  for an Intel platform, say Y here and also to the specific driver
+	  for your adapter(s) below. If unsure, say N.
+
+	  This support is also available as a module. If so, the module
+	  will be called peci.
diff --git a/drivers/peci/Makefile b/drivers/peci/Makefile
new file mode 100644
index 000000000000..e789a354e842
--- /dev/null
+++ b/drivers/peci/Makefile
@@ -0,0 +1,5 @@
+# SPDX-License-Identifier: GPL-2.0-only
+
+# Core functionality
+peci-y := core.o
+obj-$(CONFIG_PECI) += peci.o
diff --git a/drivers/peci/core.c b/drivers/peci/core.c
new file mode 100644
index 000000000000..7b3938af0396
--- /dev/null
+++ b/drivers/peci/core.c
@@ -0,0 +1,155 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright (c) 2018-2021 Intel Corporation
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include <linux/bug.h>
+#include <linux/device.h>
+#include <linux/export.h>
+#include <linux/idr.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/peci.h>
+#include <linux/pm_runtime.h>
+#include <linux/property.h>
+#include <linux/slab.h>
+
+#include "internal.h"
+
+static DEFINE_IDA(peci_controller_ida);
+
+static void peci_controller_dev_release(struct device *dev)
+{
+	struct peci_controller *controller = to_peci_controller(dev);
+
+	pm_runtime_disable(&controller->dev);
+
+	mutex_destroy(&controller->bus_lock);
+	ida_free(&peci_controller_ida, controller->id);
+	fwnode_handle_put(controller->dev.fwnode);
+	kfree(controller);
+}
+
+struct device_type peci_controller_type = {
+	.release	= peci_controller_dev_release,
+};
+
+static struct peci_controller *peci_controller_alloc(struct device *dev,
+						     struct peci_controller_ops *ops)
+{
+	struct fwnode_handle *node = fwnode_handle_get(dev_fwnode(dev));
+	struct peci_controller *controller;
+	int ret;
+
+	if (!ops->xfer)
+		return ERR_PTR(-EINVAL);
+
+	controller = kzalloc(sizeof(*controller), GFP_KERNEL);
+	if (!controller)
+		return ERR_PTR(-ENOMEM);
+
+	ret = ida_alloc_max(&peci_controller_ida, U8_MAX, GFP_KERNEL);
+	if (ret < 0)
+		goto err;
+	controller->id = ret;
+
+	controller->ops = ops;
+
+	controller->dev.parent = dev;
+	controller->dev.bus = &peci_bus_type;
+	controller->dev.type = &peci_controller_type;
+	controller->dev.fwnode = node;
+	controller->dev.of_node = to_of_node(node);
+
+	device_initialize(&controller->dev);
+
+	mutex_init(&controller->bus_lock);
+
+	pm_runtime_no_callbacks(&controller->dev);
+	pm_suspend_ignore_children(&controller->dev, true);
+	pm_runtime_enable(&controller->dev);
+
+	return controller;
+
+err:
+	kfree(controller);
+	return ERR_PTR(ret);
+}
+
+static void unregister_controller(void *_controller)
+{
+	struct peci_controller *controller = _controller;
+
+	device_unregister(&controller->dev);
+}
+
+/**
+ * devm_peci_controller_add() - add PECI controller
+ * @dev: device for devm operations
+ * @ops: pointer to controller specific methods
+ *
+ * In the final stage of its probe(), a peci_controller driver calls
+ * devm_peci_controller_add() to register itself with the PECI bus.
+ *
+ * Return: Pointer to the newly allocated controller or ERR_PTR() in case of failure.
+ */
+struct peci_controller *devm_peci_controller_add(struct device *dev,
+						 struct peci_controller_ops *ops)
+{
+	struct peci_controller *controller;
+	int ret;
+
+	controller = peci_controller_alloc(dev, ops);
+	if (IS_ERR(controller))
+		return controller;
+
+	ret = dev_set_name(&controller->dev, "peci-%d", controller->id);
+	if (ret)
+		goto err;
+
+	ret = device_add(&controller->dev);
+	if (ret)
+		goto err;
+
+	ret = devm_add_action_or_reset(dev, unregister_controller, controller);
+	if (ret)
+		return ERR_PTR(ret);
+
+	return controller;
+
+err:
+	put_device(&controller->dev);
+
+	return ERR_PTR(ret);
+}
+EXPORT_SYMBOL_NS_GPL(devm_peci_controller_add, PECI);
+
+struct bus_type peci_bus_type = {
+	.name		= "peci",
+};
+
+static int __init peci_init(void)
+{
+	int ret;
+
+	ret = bus_register(&peci_bus_type);
+	if (ret < 0) {
+		pr_err("failed to register PECI bus type!\n");
+		return ret;
+	}
+
+	return 0;
+}
+module_init(peci_init);
+
+static void __exit peci_exit(void)
+{
+	bus_unregister(&peci_bus_type);
+}
+module_exit(peci_exit);
+
+MODULE_AUTHOR("Jason M Bills <jason.m.bills@linux.intel.com>");
+MODULE_AUTHOR("Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>");
+MODULE_AUTHOR("Iwona Winiarska <iwona.winiarska@intel.com>");
+MODULE_DESCRIPTION("PECI bus core module");
+MODULE_LICENSE("GPL");
diff --git a/drivers/peci/internal.h b/drivers/peci/internal.h
new file mode 100644
index 000000000000..918dea745a86
--- /dev/null
+++ b/drivers/peci/internal.h
@@ -0,0 +1,16 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/* Copyright (c) 2018-2021 Intel Corporation */
+
+#ifndef __PECI_INTERNAL_H
+#define __PECI_INTERNAL_H
+
+#include <linux/device.h>
+#include <linux/types.h>
+
+struct peci_controller;
+
+extern struct bus_type peci_bus_type;
+
+extern struct device_type peci_controller_type;
+
+#endif /* __PECI_INTERNAL_H */
diff --git a/include/linux/peci.h b/include/linux/peci.h
new file mode 100644
index 000000000000..26e0a4e73b50
--- /dev/null
+++ b/include/linux/peci.h
@@ -0,0 +1,99 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/* Copyright (c) 2018-2021 Intel Corporation */
+
+#ifndef __LINUX_PECI_H
+#define __LINUX_PECI_H
+
+#include <linux/device.h>
+#include <linux/kernel.h>
+#include <linux/mutex.h>
+#include <linux/types.h>
+
+/*
+ * Currently we don't support any PECI command over 32 bytes.
+ */
+#define PECI_REQUEST_MAX_BUF_SIZE 32
+
+struct peci_controller;
+struct peci_request;
+
+/**
+ * struct peci_controller_ops - PECI controller specific methods
+ * @xfer: PECI transfer function
+ *
+ * PECI controllers may have different hardware interfaces - the drivers
+ * implementing PECI controllers can use this structure to abstract away those
+ * differences by exposing a common interface for PECI core.
+ */
+struct peci_controller_ops {
+	int (*xfer)(struct peci_controller *controller, u8 addr, struct peci_request *req);
+};
+
+/**
+ * struct peci_controller - PECI controller
+ * @dev: device object to register PECI controller to the device model
+ * @ops: pointer to device specific controller operations
+ * @bus_lock: lock used to protect multiple callers
+ * @id: PECI controller ID
+ *
+ * PECI controllers usually connect to their drivers using non-PECI bus,
+ * such as the platform bus.
+ * Each PECI controller can communicate with one or more PECI devices.
+ */
+struct peci_controller {
+	struct device dev;
+	struct peci_controller_ops *ops;
+	struct mutex bus_lock; /* held for the duration of xfer */
+	u8 id;
+};
+
+struct peci_controller *devm_peci_controller_add(struct device *parent,
+						 struct peci_controller_ops *ops);
+
+static inline struct peci_controller *to_peci_controller(void *d)
+{
+	return container_of(d, struct peci_controller, dev);
+}
+
+/**
+ * struct peci_device - PECI device
+ * @dev: device object to register PECI device to the device model
+ * @controller: manages the bus segment hosting this PECI device
+ * @addr: address used on the PECI bus connected to the parent controller
+ *
+ * A peci_device identifies a single device (i.e. CPU) connected to a PECI bus.
+ * The behaviour exposed to the rest of the system is defined by the PECI driver
+ * managing the device.
+ */
+struct peci_device {
+	struct device dev;
+	u8 addr;
+};
+
+static inline struct peci_device *to_peci_device(struct device *d)
+{
+	return container_of(d, struct peci_device, dev);
+}
+
+/**
+ * struct peci_request - PECI request
+ * @device: PECI device to which the request is sent
+ * @tx: TX buffer specific data
+ * @tx.buf: TX buffer
+ * @tx.len: transfer data length in bytes
+ * @rx: RX buffer specific data
+ * @rx.buf: RX buffer
+ * @rx.len: received data length in bytes
+ *
+ * A peci_request represents a request issued by PECI originator (TX) and
+ * a response received from PECI responder (RX).
+ */
+struct peci_request {
+	struct peci_device *device;
+	struct {
+		u8 buf[PECI_REQUEST_MAX_BUF_SIZE];
+		u8 len;
+	} rx, tx;
+};
+
+#endif /* __LINUX_PECI_H */
-- 
2.31.1



* [PATCH v2 07/15] peci: Add peci-aspeed controller driver
  2021-08-03 11:31 [PATCH v2 00/15] Introduce PECI subsystem Iwona Winiarska
                   ` (5 preceding siblings ...)
  2021-08-03 11:31 ` [PATCH v2 06/15] peci: Add core infrastructure Iwona Winiarska
@ 2021-08-03 11:31 ` Iwona Winiarska
  2021-08-26  1:35   ` Dan Williams
  2021-08-03 11:31 ` [PATCH v2 08/15] peci: Add device detection Iwona Winiarska
                   ` (8 subsequent siblings)
  15 siblings, 1 reply; 49+ messages in thread
From: Iwona Winiarska @ 2021-08-03 11:31 UTC (permalink / raw)
  To: linux-kernel, openbmc, Greg Kroah-Hartman
  Cc: x86, devicetree, linux-aspeed, linux-arm-kernel, linux-hwmon,
	linux-doc, Rob Herring, Joel Stanley, Andrew Jeffery,
	Jean Delvare, Guenter Roeck, Arnd Bergmann, Olof Johansson,
	Jonathan Corbet, Thomas Gleixner, Andy Lutomirski, Ingo Molnar,
	Borislav Petkov, Yazen Ghannam, Mauro Carvalho Chehab,
	Pierre-Louis Bossart, Tony Luck, Andy Shevchenko, Jae Hyun Yoo,
	Dan Williams, Randy Dunlap, Zev Weiss, David Muller,
	Iwona Winiarska

From: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>

ASPEED AST24xx/AST25xx/AST26xx SoCs support the PECI electrical
interface (a.k.a. PECI wire).

Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
Co-developed-by: Iwona Winiarska <iwona.winiarska@intel.com>
Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
---
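Illustration only, not part of the patch: a userspace sketch of how the
final ASPEED_PECI_CTRL value written by aspeed_peci_init_regs() is put
together from the bit definitions in this driver. BIT_U32(), GENMASK_U32()
and field_prep() are hypothetical stand-ins for the kernel's BIT(),
GENMASK() and FIELD_PREP(); the masks are copied from the driver below.

```c
#include <stdint.h>

/* Userspace stand-ins for the kernel's BIT() and GENMASK() on 32-bit values. */
#define BIT_U32(n)		(UINT32_C(1) << (n))
#define GENMASK_U32(h, l)	((UINT32_MAX >> (31 - (h))) & (UINT32_MAX << (l)))

#define CTRL_SAMPLING_MASK	GENMASK_U32(19, 16)
#define CTRL_CLK_DIV_MASK	GENMASK_U32(10, 8)
#define CTRL_PECI_EN		BIT_U32(4)
#define CTRL_PECI_CLK_EN	BIT_U32(0)

/*
 * Shift val up to the lowest set bit of mask, then clip it to the mask.
 * The mask must be non-zero.
 */
static uint32_t field_prep(uint32_t mask, uint32_t val)
{
	unsigned int shift = 0;

	while (!((mask >> shift) & 1))
		shift++;
	return (val << shift) & mask;
}

/* Compose the control register value from the two tunables. */
static uint32_t aspeed_peci_ctrl_value(uint32_t rd_sampling_point,
				       uint32_t clk_div)
{
	uint32_t val;

	val = field_prep(CTRL_SAMPLING_MASK, rd_sampling_point);
	val |= field_prep(CTRL_CLK_DIV_MASK, clk_div);
	val |= CTRL_PECI_EN;
	val |= CTRL_PECI_CLK_EN;
	return val;
}
```

With the defaults (rd_sampling_point = 8, clk_div = 0) this yields
0x00080011: sampling point 8 in bits 19:16, plus the PECI enable (bit 4)
and PECI clock enable (bit 0) flags.
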
 MAINTAINERS                           |   9 +
 drivers/peci/Kconfig                  |   6 +
 drivers/peci/Makefile                 |   3 +
 drivers/peci/controller/Kconfig       |  16 +
 drivers/peci/controller/Makefile      |   3 +
 drivers/peci/controller/peci-aspeed.c | 445 ++++++++++++++++++++++++++
 6 files changed, 482 insertions(+)
 create mode 100644 drivers/peci/controller/Kconfig
 create mode 100644 drivers/peci/controller/Makefile
 create mode 100644 drivers/peci/controller/peci-aspeed.c

diff --git a/MAINTAINERS b/MAINTAINERS
index d411974aaa5e..6e9d53ff68ab 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2866,6 +2866,15 @@ S:	Maintained
 F:	Documentation/hwmon/asc7621.rst
 F:	drivers/hwmon/asc7621.c
 
+ASPEED PECI CONTROLLER
+M:	Iwona Winiarska <iwona.winiarska@intel.com>
+M:	Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
+L:	linux-aspeed@lists.ozlabs.org (moderated for non-subscribers)
+L:	openbmc@lists.ozlabs.org (moderated for non-subscribers)
+S:	Supported
+F:	Documentation/devicetree/bindings/peci/peci-aspeed.yaml
+F:	drivers/peci/controller/peci-aspeed.c
+
 ASPEED PINCTRL DRIVERS
 M:	Andrew Jeffery <andrew@aj.id.au>
 L:	linux-aspeed@lists.ozlabs.org (moderated for non-subscribers)
diff --git a/drivers/peci/Kconfig b/drivers/peci/Kconfig
index 71a4ad81225a..99279df97a78 100644
--- a/drivers/peci/Kconfig
+++ b/drivers/peci/Kconfig
@@ -13,3 +13,9 @@ menuconfig PECI
 
 	  This support is also available as a module. If so, the module
 	  will be called peci.
+
+if PECI
+
+source "drivers/peci/controller/Kconfig"
+
+endif # PECI
diff --git a/drivers/peci/Makefile b/drivers/peci/Makefile
index e789a354e842..926d8df15cbd 100644
--- a/drivers/peci/Makefile
+++ b/drivers/peci/Makefile
@@ -3,3 +3,6 @@
 # Core functionality
 peci-y := core.o
 obj-$(CONFIG_PECI) += peci.o
+
+# Hardware specific bus drivers
+obj-y += controller/
diff --git a/drivers/peci/controller/Kconfig b/drivers/peci/controller/Kconfig
new file mode 100644
index 000000000000..6d48df08db1c
--- /dev/null
+++ b/drivers/peci/controller/Kconfig
@@ -0,0 +1,16 @@
+# SPDX-License-Identifier: GPL-2.0-only
+
+config PECI_ASPEED
+	tristate "ASPEED PECI support"
+	depends on ARCH_ASPEED || COMPILE_TEST
+	depends on OF
+	depends on HAS_IOMEM
+	help
+	  This option enables the PECI controller driver for ASPEED AST2400,
+	  AST2500 and AST2600 SoCs.
+
+	  Say Y here if your system runs on an ASPEED SoC and you are using
+	  it as a BMC for an Intel platform.
+
+	  This driver can also be built as a module. If so, the module will
+	  be called peci-aspeed.
diff --git a/drivers/peci/controller/Makefile b/drivers/peci/controller/Makefile
new file mode 100644
index 000000000000..022c28ef1bf0
--- /dev/null
+++ b/drivers/peci/controller/Makefile
@@ -0,0 +1,3 @@
+# SPDX-License-Identifier: GPL-2.0-only
+
+obj-$(CONFIG_PECI_ASPEED)	+= peci-aspeed.o
diff --git a/drivers/peci/controller/peci-aspeed.c b/drivers/peci/controller/peci-aspeed.c
new file mode 100644
index 000000000000..1d708c983749
--- /dev/null
+++ b/drivers/peci/controller/peci-aspeed.c
@@ -0,0 +1,445 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright (C) 2012-2017 ASPEED Technology Inc.
+// Copyright (c) 2018-2021 Intel Corporation
+
+#include <linux/bitfield.h>
+#include <linux/clk.h>
+#include <linux/delay.h>
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/iopoll.h>
+#include <linux/jiffies.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/peci.h>
+#include <linux/platform_device.h>
+#include <linux/reset.h>
+
+#include <asm/unaligned.h>
+
+/* ASPEED PECI Registers */
+/* Control Register */
+#define ASPEED_PECI_CTRL			0x00
+#define   ASPEED_PECI_CTRL_SAMPLING_MASK	GENMASK(19, 16)
+#define   ASPEED_PECI_CTRL_RD_MODE_MASK		GENMASK(13, 12)
+#define     ASPEED_PECI_CTRL_RD_MODE_DBG	BIT(13)
+#define     ASPEED_PECI_CTRL_RD_MODE_COUNT	BIT(12)
+#define   ASPEED_PECI_CTRL_CLK_SOURCE		BIT(11)
+#define   ASPEED_PECI_CTRL_CLK_DIV_MASK		GENMASK(10, 8)
+#define   ASPEED_PECI_CTRL_INVERT_OUT		BIT(7)
+#define   ASPEED_PECI_CTRL_INVERT_IN		BIT(6)
+#define   ASPEED_PECI_CTRL_BUS_CONTENTION_EN	BIT(5)
+#define   ASPEED_PECI_CTRL_PECI_EN		BIT(4)
+#define   ASPEED_PECI_CTRL_PECI_CLK_EN		BIT(0)
+
+/* Timing Negotiation Register */
+#define ASPEED_PECI_TIMING_NEGOTIATION		0x04
+#define   ASPEED_PECI_T_NEGO_MSG_MASK		GENMASK(15, 8)
+#define   ASPEED_PECI_T_NEGO_ADDR_MASK		GENMASK(7, 0)
+
+/* Command Register */
+#define ASPEED_PECI_CMD				0x08
+#define   ASPEED_PECI_CMD_PIN_MONITORING	BIT(31)
+#define   ASPEED_PECI_CMD_STS_MASK		GENMASK(27, 24)
+#define     ASPEED_PECI_CMD_STS_ADDR_T_NEGO	0x3
+#define   ASPEED_PECI_CMD_IDLE_MASK		\
+	  (ASPEED_PECI_CMD_STS_MASK | ASPEED_PECI_CMD_PIN_MONITORING)
+#define   ASPEED_PECI_CMD_FIRE			BIT(0)
+
+/* Read/Write Length Register */
+#define ASPEED_PECI_RW_LENGTH			0x0c
+#define   ASPEED_PECI_AW_FCS_EN			BIT(31)
+#define   ASPEED_PECI_RD_LEN_MASK		GENMASK(23, 16)
+#define   ASPEED_PECI_WR_LEN_MASK		GENMASK(15, 8)
+#define   ASPEED_PECI_TARGET_ADDR_MASK		GENMASK(7, 0)
+
+/* Expected FCS Data Register */
+#define ASPEED_PECI_EXPECTED_FCS		0x10
+#define   ASPEED_PECI_EXPECTED_RD_FCS_MASK	GENMASK(23, 16)
+#define   ASPEED_PECI_EXPECTED_AW_FCS_AUTO_MASK	GENMASK(15, 8)
+#define   ASPEED_PECI_EXPECTED_WR_FCS_MASK	GENMASK(7, 0)
+
+/* Captured FCS Data Register */
+#define ASPEED_PECI_CAPTURED_FCS		0x14
+#define   ASPEED_PECI_CAPTURED_RD_FCS_MASK	GENMASK(23, 16)
+#define   ASPEED_PECI_CAPTURED_WR_FCS_MASK	GENMASK(7, 0)
+
+/* Interrupt Register */
+#define ASPEED_PECI_INT_CTRL			0x18
+#define   ASPEED_PECI_TIMING_NEGO_SEL_MASK	GENMASK(31, 30)
+#define     ASPEED_PECI_1ST_BIT_OF_ADDR_NEGO	0
+#define     ASPEED_PECI_2ND_BIT_OF_ADDR_NEGO	1
+#define     ASPEED_PECI_MESSAGE_NEGO		2
+#define   ASPEED_PECI_INT_MASK			GENMASK(4, 0)
+#define     ASPEED_PECI_INT_BUS_TIMEOUT		BIT(4)
+#define     ASPEED_PECI_INT_BUS_CONTENTION	BIT(3)
+#define     ASPEED_PECI_INT_WR_FCS_BAD		BIT(2)
+#define     ASPEED_PECI_INT_WR_FCS_ABORT	BIT(1)
+#define     ASPEED_PECI_INT_CMD_DONE		BIT(0)
+
+/* Interrupt Status Register */
+#define ASPEED_PECI_INT_STS			0x1c
+#define   ASPEED_PECI_INT_TIMING_RESULT_MASK	GENMASK(29, 16)
+	  /* bits[4..0]: Same bit fields in the 'Interrupt Register' */
+
+/* Rx/Tx Data Buffer Registers */
+#define ASPEED_PECI_WR_DATA0			0x20
+#define ASPEED_PECI_WR_DATA1			0x24
+#define ASPEED_PECI_WR_DATA2			0x28
+#define ASPEED_PECI_WR_DATA3			0x2c
+#define ASPEED_PECI_RD_DATA0			0x30
+#define ASPEED_PECI_RD_DATA1			0x34
+#define ASPEED_PECI_RD_DATA2			0x38
+#define ASPEED_PECI_RD_DATA3			0x3c
+#define ASPEED_PECI_WR_DATA4			0x40
+#define ASPEED_PECI_WR_DATA5			0x44
+#define ASPEED_PECI_WR_DATA6			0x48
+#define ASPEED_PECI_WR_DATA7			0x4c
+#define ASPEED_PECI_RD_DATA4			0x50
+#define ASPEED_PECI_RD_DATA5			0x54
+#define ASPEED_PECI_RD_DATA6			0x58
+#define ASPEED_PECI_RD_DATA7			0x5c
+#define   ASPEED_PECI_DATA_BUF_SIZE_MAX		32
+
+/* Timing Negotiation */
+#define ASPEED_PECI_RD_SAMPLING_POINT_DEFAULT	8
+#define ASPEED_PECI_RD_SAMPLING_POINT_MAX	(BIT(4) - 1)
+#define ASPEED_PECI_CLK_DIV_DEFAULT		0
+#define ASPEED_PECI_CLK_DIV_MAX			(BIT(3) - 1)
+#define ASPEED_PECI_MSG_TIMING_DEFAULT		1
+#define ASPEED_PECI_MSG_TIMING_MAX		(BIT(8) - 1)
+#define ASPEED_PECI_ADDR_TIMING_DEFAULT		1
+#define ASPEED_PECI_ADDR_TIMING_MAX		(BIT(8) - 1)
+
+/* Timeout */
+#define ASPEED_PECI_IDLE_CHECK_TIMEOUT_US	(50 * USEC_PER_MSEC)
+#define ASPEED_PECI_IDLE_CHECK_INTERVAL_US	(10 * USEC_PER_MSEC)
+#define ASPEED_PECI_CMD_TIMEOUT_MS_DEFAULT	(1000)
+#define ASPEED_PECI_CMD_TIMEOUT_MS_MAX		(1000)
+
+struct aspeed_peci {
+	struct peci_controller *controller;
+	struct device *dev;
+	void __iomem *base;
+	struct clk *clk;
+	struct reset_control *rst;
+	int irq;
+	spinlock_t lock; /* to sync completion status handling */
+	struct completion xfer_complete;
+	u32 status;
+	u32 cmd_timeout_ms;
+	u32 msg_timing;
+	u32 addr_timing;
+	u32 rd_sampling_point;
+	u32 clk_div;
+};
+
+static void aspeed_peci_init_regs(struct aspeed_peci *priv)
+{
+	u32 val;
+
+	val = FIELD_PREP(ASPEED_PECI_CTRL_CLK_DIV_MASK, ASPEED_PECI_CLK_DIV_DEFAULT);
+	val |= ASPEED_PECI_CTRL_PECI_CLK_EN;
+	writel(val, priv->base + ASPEED_PECI_CTRL);
+	/*
+	 * Timing negotiation period setting.
+	 * The unit of the programmed value is 4 times the PECI clock period.
+	 */
+	val = FIELD_PREP(ASPEED_PECI_T_NEGO_MSG_MASK, priv->msg_timing);
+	val |= FIELD_PREP(ASPEED_PECI_T_NEGO_ADDR_MASK, priv->addr_timing);
+	writel(val, priv->base + ASPEED_PECI_TIMING_NEGOTIATION);
+
+	/* Clear interrupts */
+	val = readl(priv->base + ASPEED_PECI_INT_STS) | ASPEED_PECI_INT_MASK;
+	writel(val, priv->base + ASPEED_PECI_INT_STS);
+
+	/* Set timing negotiation mode and enable interrupts */
+	val = FIELD_PREP(ASPEED_PECI_TIMING_NEGO_SEL_MASK, ASPEED_PECI_1ST_BIT_OF_ADDR_NEGO);
+	val |= ASPEED_PECI_INT_MASK;
+	writel(val, priv->base + ASPEED_PECI_INT_CTRL);
+
+	val = FIELD_PREP(ASPEED_PECI_CTRL_SAMPLING_MASK, priv->rd_sampling_point);
+	val |= FIELD_PREP(ASPEED_PECI_CTRL_CLK_DIV_MASK, priv->clk_div);
+	val |= ASPEED_PECI_CTRL_PECI_EN;
+	val |= ASPEED_PECI_CTRL_PECI_CLK_EN;
+	writel(val, priv->base + ASPEED_PECI_CTRL);
+}
+
+static inline int aspeed_peci_check_idle(struct aspeed_peci *priv)
+{
+	u32 cmd_sts = readl(priv->base + ASPEED_PECI_CMD);
+
+	if (FIELD_GET(ASPEED_PECI_CMD_STS_MASK, cmd_sts) == ASPEED_PECI_CMD_STS_ADDR_T_NEGO)
+		aspeed_peci_init_regs(priv);
+
+	return readl_poll_timeout(priv->base + ASPEED_PECI_CMD,
+				  cmd_sts,
+				  !(cmd_sts & ASPEED_PECI_CMD_IDLE_MASK),
+				  ASPEED_PECI_IDLE_CHECK_INTERVAL_US,
+				  ASPEED_PECI_IDLE_CHECK_TIMEOUT_US);
+}
+
+static int aspeed_peci_xfer(struct peci_controller *controller,
+			    u8 addr, struct peci_request *req)
+{
+	struct aspeed_peci *priv = dev_get_drvdata(controller->dev.parent);
+	unsigned long flags, timeout = msecs_to_jiffies(priv->cmd_timeout_ms);
+	u32 peci_head;
+	int ret;
+
+	if (req->tx.len > ASPEED_PECI_DATA_BUF_SIZE_MAX ||
+	    req->rx.len > ASPEED_PECI_DATA_BUF_SIZE_MAX)
+		return -EINVAL;
+
+	/* Check command sts and bus idle state */
+	ret = aspeed_peci_check_idle(priv);
+	if (ret)
+		return ret; /* -ETIMEDOUT */
+
+	spin_lock_irqsave(&priv->lock, flags);
+	reinit_completion(&priv->xfer_complete);
+
+	peci_head = FIELD_PREP(ASPEED_PECI_TARGET_ADDR_MASK, addr) |
+		    FIELD_PREP(ASPEED_PECI_WR_LEN_MASK, req->tx.len) |
+		    FIELD_PREP(ASPEED_PECI_RD_LEN_MASK, req->rx.len);
+
+	writel(peci_head, priv->base + ASPEED_PECI_RW_LENGTH);
+
+	memcpy_toio(priv->base + ASPEED_PECI_WR_DATA0, req->tx.buf, min_t(u8, req->tx.len, 16));
+	if (req->tx.len > 16)
+		memcpy_toio(priv->base + ASPEED_PECI_WR_DATA4, req->tx.buf + 16,
+			    req->tx.len - 16);
+
+	dev_dbg(priv->dev, "HEAD : 0x%08x\n", peci_head);
+	print_hex_dump_bytes("TX : ", DUMP_PREFIX_NONE, req->tx.buf, req->tx.len);
+
+	priv->status = 0;
+	writel(ASPEED_PECI_CMD_FIRE, priv->base + ASPEED_PECI_CMD);
+	spin_unlock_irqrestore(&priv->lock, flags);
+
+	ret = wait_for_completion_interruptible_timeout(&priv->xfer_complete, timeout);
+	if (ret < 0)
+		return ret;
+
+	if (ret == 0) {
+		dev_dbg(priv->dev, "Timeout waiting for a response!\n");
+		return -ETIMEDOUT;
+	}
+
+	spin_lock_irqsave(&priv->lock, flags);
+
+	writel(0, priv->base + ASPEED_PECI_CMD);
+
+	if (priv->status != ASPEED_PECI_INT_CMD_DONE) {
+		spin_unlock_irqrestore(&priv->lock, flags);
+		dev_dbg(priv->dev, "No valid response!\n");
+		return -EIO;
+	}
+
+	spin_unlock_irqrestore(&priv->lock, flags);
+
+	memcpy_fromio(req->rx.buf, priv->base + ASPEED_PECI_RD_DATA0, min_t(u8, req->rx.len, 16));
+	if (req->rx.len > 16)
+		memcpy_fromio(req->rx.buf + 16, priv->base + ASPEED_PECI_RD_DATA4,
+			      req->rx.len - 16);
+
+	print_hex_dump_bytes("RX : ", DUMP_PREFIX_NONE, req->rx.buf, req->rx.len);
+
+	return 0;
+}
+
+static irqreturn_t aspeed_peci_irq_handler(int irq, void *arg)
+{
+	struct aspeed_peci *priv = arg;
+	u32 status;
+
+	spin_lock(&priv->lock);
+	status = readl(priv->base + ASPEED_PECI_INT_STS);
+	writel(status, priv->base + ASPEED_PECI_INT_STS);
+	priv->status |= (status & ASPEED_PECI_INT_MASK);
+
+	/*
+	 * In most cases, interrupt bits will be set one by one but also note
+	 * that multiple interrupt bits could be set at the same time.
+	 */
+	if (status & ASPEED_PECI_INT_BUS_TIMEOUT)
+		dev_dbg_ratelimited(priv->dev, "ASPEED_PECI_INT_BUS_TIMEOUT\n");
+
+	if (status & ASPEED_PECI_INT_BUS_CONTENTION)
+		dev_dbg_ratelimited(priv->dev, "ASPEED_PECI_INT_BUS_CONTENTION\n");
+
+	if (status & ASPEED_PECI_INT_WR_FCS_BAD)
+		dev_dbg_ratelimited(priv->dev, "ASPEED_PECI_INT_WR_FCS_BAD\n");
+
+	if (status & ASPEED_PECI_INT_WR_FCS_ABORT)
+		dev_dbg_ratelimited(priv->dev, "ASPEED_PECI_INT_WR_FCS_ABORT\n");
+
+	/*
+	 * All commands should end with the ASPEED_PECI_INT_CMD_DONE bit
+	 * set, even in an error case.
+	 */
+	if (status & ASPEED_PECI_INT_CMD_DONE)
+		complete(&priv->xfer_complete);
+
+	spin_unlock(&priv->lock);
+
+	return IRQ_HANDLED;
+}
+
+static void aspeed_peci_property_sanitize(struct device *dev, const char *propname,
+					  u32 min, u32 max, u32 default_val, u32 *propval)
+{
+	u32 val;
+	int ret;
+
+	ret = device_property_read_u32(dev, propname, &val);
+	if (ret) {
+		val = default_val;
+	} else if (val > max || val < min) {
+		dev_warn(dev, "Invalid %s: %u, falling back to: %u\n",
+			 propname, val, default_val);
+
+		val = default_val;
+	}
+
+	*propval = val;
+}
+
+static void aspeed_peci_property_setup(struct aspeed_peci *priv)
+{
+	aspeed_peci_property_sanitize(priv->dev, "aspeed,clock-divider",
+				      0, ASPEED_PECI_CLK_DIV_MAX,
+				      ASPEED_PECI_CLK_DIV_DEFAULT, &priv->clk_div);
+	aspeed_peci_property_sanitize(priv->dev, "aspeed,msg-timing",
+				      0, ASPEED_PECI_MSG_TIMING_MAX,
+				      ASPEED_PECI_MSG_TIMING_DEFAULT, &priv->msg_timing);
+	aspeed_peci_property_sanitize(priv->dev, "aspeed,addr-timing",
+				      0, ASPEED_PECI_ADDR_TIMING_MAX,
+				      ASPEED_PECI_ADDR_TIMING_DEFAULT, &priv->addr_timing);
+	aspeed_peci_property_sanitize(priv->dev, "aspeed,rd-sampling-point",
+				      0, ASPEED_PECI_RD_SAMPLING_POINT_MAX,
+				      ASPEED_PECI_RD_SAMPLING_POINT_DEFAULT,
+				      &priv->rd_sampling_point);
+	aspeed_peci_property_sanitize(priv->dev, "cmd-timeout-ms",
+				      1, ASPEED_PECI_CMD_TIMEOUT_MS_MAX,
+				      ASPEED_PECI_CMD_TIMEOUT_MS_DEFAULT, &priv->cmd_timeout_ms);
+}
+
+static struct peci_controller_ops aspeed_ops = {
+	.xfer = aspeed_peci_xfer,
+};
+
+static void aspeed_peci_reset_control_release(void *data)
+{
+	reset_control_assert(data);
+}
+
+static int aspeed_peci_reset_control_deassert(struct device *dev, struct reset_control *rst)
+{
+	int ret;
+
+	ret = reset_control_deassert(rst);
+	if (ret)
+		return ret;
+
+	return devm_add_action_or_reset(dev, aspeed_peci_reset_control_release, rst);
+}
+
+static void aspeed_peci_clk_release(void *data)
+{
+	clk_disable_unprepare(data);
+}
+
+static int aspeed_peci_clk_enable(struct device *dev, struct clk *clk)
+{
+	int ret;
+
+	ret = clk_prepare_enable(clk);
+	if (ret)
+		return ret;
+
+	return devm_add_action_or_reset(dev, aspeed_peci_clk_release, clk);
+}
+
+static int aspeed_peci_probe(struct platform_device *pdev)
+{
+	struct peci_controller *controller;
+	struct aspeed_peci *priv;
+	int ret;
+
+	priv = devm_kzalloc(&pdev->dev, sizeof(*priv), GFP_KERNEL);
+	if (!priv)
+		return -ENOMEM;
+
+	priv->dev = &pdev->dev;
+	dev_set_drvdata(priv->dev, priv);
+
+	priv->base = devm_platform_ioremap_resource(pdev, 0);
+	if (IS_ERR(priv->base))
+		return PTR_ERR(priv->base);
+
+	priv->irq = platform_get_irq(pdev, 0);
+	if (priv->irq < 0)
+		return priv->irq;
+
+	ret = devm_request_irq(&pdev->dev, priv->irq, aspeed_peci_irq_handler,
+			       0, "peci-aspeed", priv);
+	if (ret)
+		return ret;
+
+	init_completion(&priv->xfer_complete);
+	spin_lock_init(&priv->lock);
+
+	priv->rst = devm_reset_control_get(&pdev->dev, NULL);
+	if (IS_ERR(priv->rst))
+		return dev_err_probe(priv->dev, PTR_ERR(priv->rst),
+				     "failed to get reset control\n");
+
+	ret = aspeed_peci_reset_control_deassert(priv->dev, priv->rst);
+	if (ret)
+		return dev_err_probe(priv->dev, ret, "cannot deassert reset control\n");
+
+	priv->clk = devm_clk_get(priv->dev, NULL);
+	if (IS_ERR(priv->clk))
+		return dev_err_probe(priv->dev, PTR_ERR(priv->clk), "failed to get clk\n");
+
+	ret = aspeed_peci_clk_enable(priv->dev, priv->clk);
+	if (ret)
+		return dev_err_probe(priv->dev, ret, "failed to enable clock\n");
+
+	aspeed_peci_property_setup(priv);
+
+	aspeed_peci_init_regs(priv);
+
+	controller = devm_peci_controller_add(priv->dev, &aspeed_ops);
+	if (IS_ERR(controller))
+		return dev_err_probe(priv->dev, PTR_ERR(controller),
+				     "failed to add aspeed peci controller\n");
+
+	priv->controller = controller;
+
+	return 0;
+}
+
+static const struct of_device_id aspeed_peci_of_table[] = {
+	{ .compatible = "aspeed,ast2400-peci", },
+	{ .compatible = "aspeed,ast2500-peci", },
+	{ .compatible = "aspeed,ast2600-peci", },
+	{ }
+};
+MODULE_DEVICE_TABLE(of, aspeed_peci_of_table);
+
+static struct platform_driver aspeed_peci_driver = {
+	.probe  = aspeed_peci_probe,
+	.driver = {
+		.name           = "peci-aspeed",
+		.of_match_table = aspeed_peci_of_table,
+	},
+};
+module_platform_driver(aspeed_peci_driver);
+
+MODULE_AUTHOR("Ryan Chen <ryan_chen@aspeedtech.com>");
+MODULE_AUTHOR("Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>");
+MODULE_DESCRIPTION("ASPEED PECI driver");
+MODULE_LICENSE("GPL");
+MODULE_IMPORT_NS(PECI);
-- 
2.31.1



* [PATCH v2 08/15] peci: Add device detection
  2021-08-03 11:31 [PATCH v2 00/15] Introduce PECI subsystem Iwona Winiarska
                   ` (6 preceding siblings ...)
  2021-08-03 11:31 ` [PATCH v2 07/15] peci: Add peci-aspeed controller driver Iwona Winiarska
@ 2021-08-03 11:31 ` Iwona Winiarska
  2021-08-27 19:01   ` Dan Williams
  2021-08-03 11:31 ` [PATCH v2 09/15] peci: Add sysfs interface for PECI bus Iwona Winiarska
                   ` (7 subsequent siblings)
  15 siblings, 1 reply; 49+ messages in thread
From: Iwona Winiarska @ 2021-08-03 11:31 UTC (permalink / raw)
  To: linux-kernel, openbmc, Greg Kroah-Hartman
  Cc: x86, devicetree, linux-aspeed, linux-arm-kernel, linux-hwmon,
	linux-doc, Rob Herring, Joel Stanley, Andrew Jeffery,
	Jean Delvare, Guenter Roeck, Arnd Bergmann, Olof Johansson,
	Jonathan Corbet, Thomas Gleixner, Andy Lutomirski, Ingo Molnar,
	Borislav Petkov, Yazen Ghannam, Mauro Carvalho Chehab,
	Pierre-Louis Bossart, Tony Luck, Andy Shevchenko, Jae Hyun Yoo,
	Dan Williams, Randy Dunlap, Zev Weiss, David Muller,
	Iwona Winiarska

Since PECI devices are discoverable, we can dynamically detect devices
that are actually available in the system.

This change complements the earlier implementation by rescanning the
PECI bus to detect available devices. For this purpose, it also
introduces a minimal API for PECI requests.
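
As a rough illustration of the scan described above, here is a userspace
sketch (not the code from this patch). It probes each address in the CPU
range with a Ping and treats -EIO and -ETIMEDOUT as "nothing answered".
The names scan_bus(), ping_fn and fake_ping() are hypothetical; only the
address range and the error handling follow the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Values taken from the patch: CPU address range 0x30-0x37. */
#define PECI_BASE_ADDR      0x30
#define PECI_DEVICE_NUM_MAX 8

/* Hypothetical stand-in for a controller xfer callback issuing a Ping. */
typedef int (*ping_fn)(uint8_t addr, void *ctx);

/*
 * Probe every address in the CPU range. -EIO and -ETIMEDOUT mean that no
 * device answered and are skipped; any other error aborts the scan.
 * Returns a bitmap of responding sockets (bit n = address 0x30 + n) or a
 * negative errno.
 */
static int scan_bus(ping_fn ping, void *ctx)
{
	int found = 0;

	assert(ping);

	for (uint8_t addr = PECI_BASE_ADDR;
	     addr < PECI_BASE_ADDR + PECI_DEVICE_NUM_MAX; addr++) {
		int ret = ping(addr, ctx);

		if (ret == -EIO || ret == -ETIMEDOUT)
			continue;
		if (ret)
			return ret;

		found |= 1 << (addr - PECI_BASE_ADDR);
	}

	return found;
}

/* Fake bus for illustration: ctx points at a bitmap of populated sockets. */
static int fake_ping(uint8_t addr, void *ctx)
{
	const uint8_t *present = ctx;

	return ((*present >> (addr - PECI_BASE_ADDR)) & 1) ? 0 : -ETIMEDOUT;
}
```

Calling scan_bus(fake_ping, &present) with a bitmap of populated sockets
returns the same bitmap, which mirrors how an absent CPU is skipped
rather than reported as a failure.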

Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
---
 drivers/peci/Makefile   |   2 +-
 drivers/peci/core.c     |  33 ++++++++++++
 drivers/peci/device.c   | 114 ++++++++++++++++++++++++++++++++++++++++
 drivers/peci/internal.h |  14 +++++
 drivers/peci/request.c  |  50 ++++++++++++++++++
 5 files changed, 212 insertions(+), 1 deletion(-)
 create mode 100644 drivers/peci/device.c
 create mode 100644 drivers/peci/request.c

diff --git a/drivers/peci/Makefile b/drivers/peci/Makefile
index 926d8df15cbd..c5f9d3fe21bb 100644
--- a/drivers/peci/Makefile
+++ b/drivers/peci/Makefile
@@ -1,7 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0-only
 
 # Core functionality
-peci-y := core.o
+peci-y := core.o request.o device.o
 obj-$(CONFIG_PECI) += peci.o
 
 # Hardware specific bus drivers
diff --git a/drivers/peci/core.c b/drivers/peci/core.c
index 7b3938af0396..d143f1a7fe98 100644
--- a/drivers/peci/core.c
+++ b/drivers/peci/core.c
@@ -34,6 +34,20 @@ struct device_type peci_controller_type = {
 	.release	= peci_controller_dev_release,
 };
 
+static int peci_controller_scan_devices(struct peci_controller *controller)
+{
+	int ret;
+	u8 addr;
+
+	for (addr = PECI_BASE_ADDR; addr < PECI_BASE_ADDR + PECI_DEVICE_NUM_MAX; addr++) {
+		ret = peci_device_create(controller, addr);
+		if (ret)
+			return ret;
+	}
+
+	return 0;
+}
+
 static struct peci_controller *peci_controller_alloc(struct device *dev,
 						     struct peci_controller_ops *ops)
 {
@@ -76,10 +90,23 @@ static struct peci_controller *peci_controller_alloc(struct device *dev,
 	return ERR_PTR(ret);
 }
 
+static int unregister_child(struct device *dev, void *dummy)
+{
+	peci_device_destroy(to_peci_device(dev));
+
+	return 0;
+}
+
 static void unregister_controller(void *_controller)
 {
 	struct peci_controller *controller = _controller;
 
+	/*
+	 * Detach any active PECI devices. This can't fail, thus we do not
+	 * check the returned value.
+	 */
+	device_for_each_child_reverse(&controller->dev, NULL, unregister_child);
+
 	device_unregister(&controller->dev);
 }
 
@@ -115,6 +142,12 @@ struct peci_controller *devm_peci_controller_add(struct device *dev,
 	if (ret)
 		return ERR_PTR(ret);
 
+	/*
+	 * Ignoring retval since failures during scan are non-critical for
+	 * controller itself.
+	 */
+	peci_controller_scan_devices(controller);
+
 	return controller;
 
 err:
diff --git a/drivers/peci/device.c b/drivers/peci/device.c
new file mode 100644
index 000000000000..32811248997b
--- /dev/null
+++ b/drivers/peci/device.c
@@ -0,0 +1,114 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright (c) 2018-2021 Intel Corporation
+
+#include <linux/peci.h>
+#include <linux/slab.h>
+
+#include "internal.h"
+
+static int peci_detect(struct peci_controller *controller, u8 addr)
+{
+	struct peci_request *req;
+	int ret;
+
+	/*
+	 * PECI Ping is a command encoded by tx_len = 0, rx_len = 0.
+	 * We expect correct Write FCS if the device at the target address
+	 * is able to respond.
+	 */
+	req = peci_request_alloc(NULL, 0, 0);
+	if (!req)
+		return -ENOMEM;
+
+	mutex_lock(&controller->bus_lock);
+	ret = controller->ops->xfer(controller, addr, req);
+	mutex_unlock(&controller->bus_lock);
+
+	peci_request_free(req);
+
+	return ret;
+}
+
+static bool peci_addr_valid(u8 addr)
+{
+	return addr >= PECI_BASE_ADDR && addr < PECI_BASE_ADDR + PECI_DEVICE_NUM_MAX;
+}
+
+static int peci_dev_exists(struct device *dev, void *data)
+{
+	struct peci_device *device = to_peci_device(dev);
+	u8 *addr = data;
+
+	if (device->addr == *addr)
+		return -EBUSY;
+
+	return 0;
+}
+
+int peci_device_create(struct peci_controller *controller, u8 addr)
+{
+	struct peci_device *device;
+	int ret;
+
+	if (WARN_ON(!peci_addr_valid(addr)))
+		return -EINVAL;
+
+	/* Check if we have already detected this device before. */
+	ret = device_for_each_child(&controller->dev, &addr, peci_dev_exists);
+	if (ret)
+		return 0;
+
+	ret = peci_detect(controller, addr);
+	if (ret) {
+		/*
+		 * Device not present or host state doesn't allow successful
+		 * detection at this time.
+		 */
+		if (ret == -EIO || ret == -ETIMEDOUT)
+			return 0;
+
+		return ret;
+	}
+
+	device = kzalloc(sizeof(*device), GFP_KERNEL);
+	if (!device)
+		return -ENOMEM;
+
+	device->addr = addr;
+	device->dev.parent = &controller->dev;
+	device->dev.bus = &peci_bus_type;
+	device->dev.type = &peci_device_type;
+
+	ret = dev_set_name(&device->dev, "%d-%02x", controller->id, device->addr);
+	if (ret)
+		goto err_free;
+
+	ret = device_register(&device->dev);
+	if (ret)
+		goto err_put;
+
+	return 0;
+
+err_put:
+	put_device(&device->dev);
+	return ret;
+err_free:
+	kfree(device);
+	return ret;
+}
+
+void peci_device_destroy(struct peci_device *device)
+{
+	device_unregister(&device->dev);
+}
+
+static void peci_device_release(struct device *dev)
+{
+	struct peci_device *device = to_peci_device(dev);
+
+	kfree(device);
+}
+
+struct device_type peci_device_type = {
+	.release	= peci_device_release,
+};
diff --git a/drivers/peci/internal.h b/drivers/peci/internal.h
index 918dea745a86..57d11a902c5d 100644
--- a/drivers/peci/internal.h
+++ b/drivers/peci/internal.h
@@ -8,6 +8,20 @@
 #include <linux/types.h>
 
 struct peci_controller;
+struct peci_device;
+struct peci_request;
+
+/* PECI CPU address range 0x30-0x37 */
+#define PECI_BASE_ADDR		0x30
+#define PECI_DEVICE_NUM_MAX	8
+
+struct peci_request *peci_request_alloc(struct peci_device *device, u8 tx_len, u8 rx_len);
+void peci_request_free(struct peci_request *req);
+
+extern struct device_type peci_device_type;
+
+int peci_device_create(struct peci_controller *controller, u8 addr);
+void peci_device_destroy(struct peci_device *device);
 
 extern struct bus_type peci_bus_type;
 
diff --git a/drivers/peci/request.c b/drivers/peci/request.c
new file mode 100644
index 000000000000..81b567bc7b87
--- /dev/null
+++ b/drivers/peci/request.c
@@ -0,0 +1,50 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright (c) 2021 Intel Corporation
+
+#include <linux/export.h>
+#include <linux/peci.h>
+#include <linux/slab.h>
+#include <linux/types.h>
+
+#include "internal.h"
+
+/**
+ * peci_request_alloc() - allocate &struct peci_request
+ * @device: PECI device to which request is going to be sent
+ * @tx_len: TX length
+ * @rx_len: RX length
+ *
+ * Return: A pointer to a newly allocated &struct peci_request on success or NULL otherwise.
+ */
+struct peci_request *peci_request_alloc(struct peci_device *device, u8 tx_len, u8 rx_len)
+{
+	struct peci_request *req;
+
+	if (WARN_ON_ONCE(tx_len > PECI_REQUEST_MAX_BUF_SIZE || rx_len > PECI_REQUEST_MAX_BUF_SIZE))
+		return NULL;
+	/*
+	 * PECI controllers that we are using now don't support DMA. This
+	 * should be converted to the DMA API once support for controllers
+	 * that do allow it is added, to avoid an extra copy.
+	 */
+	req = kzalloc(sizeof(*req), GFP_KERNEL);
+	if (!req)
+		return NULL;
+
+	req->device = device;
+	req->tx.len = tx_len;
+	req->rx.len = rx_len;
+
+	return req;
+}
+EXPORT_SYMBOL_NS_GPL(peci_request_alloc, PECI);
+
+/**
+ * peci_request_free() - free peci_request
+ * @req: the PECI request to be freed
+ */
+void peci_request_free(struct peci_request *req)
+{
+	kfree(req);
+}
+EXPORT_SYMBOL_NS_GPL(peci_request_free, PECI);
-- 
2.31.1



* [PATCH v2 09/15] peci: Add sysfs interface for PECI bus
  2021-08-03 11:31 [PATCH v2 00/15] Introduce PECI subsystem Iwona Winiarska
                   ` (7 preceding siblings ...)
  2021-08-03 11:31 ` [PATCH v2 08/15] peci: Add device detection Iwona Winiarska
@ 2021-08-03 11:31 ` Iwona Winiarska
  2021-08-27 19:11   ` Dan Williams
  2021-08-03 11:31 ` [PATCH v2 10/15] peci: Add support for PECI device drivers Iwona Winiarska
                   ` (6 subsequent siblings)
  15 siblings, 1 reply; 49+ messages in thread
From: Iwona Winiarska @ 2021-08-03 11:31 UTC (permalink / raw)
  To: linux-kernel, openbmc, Greg Kroah-Hartman
  Cc: x86, devicetree, linux-aspeed, linux-arm-kernel, linux-hwmon,
	linux-doc, Rob Herring, Joel Stanley, Andrew Jeffery,
	Jean Delvare, Guenter Roeck, Arnd Bergmann, Olof Johansson,
	Jonathan Corbet, Thomas Gleixner, Andy Lutomirski, Ingo Molnar,
	Borislav Petkov, Yazen Ghannam, Mauro Carvalho Chehab,
	Pierre-Louis Bossart, Tony Luck, Andy Shevchenko, Jae Hyun Yoo,
	Dan Williams, Randy Dunlap, Zev Weiss, David Muller,
	Iwona Winiarska

PECI devices may not be discoverable at the time when the PECI
controller is being added (e.g. the BMC can boot up when the Host system
is still in S5). Since we currently don't have the capabilities to figure
out the Host system state inside the PECI subsystem itself, we have to
rely on userspace to do it for us.

In the future, PECI subsystem may be expanded with mechanisms that allow
us to avoid depending on userspace interaction (e.g. CPU presence could
be detected using GPIO, and the information on whether it's discoverable
could be obtained over IPMI).
Unfortunately, those methods may ultimately not be available (support
will vary from platform to platform), which means that we still need
a platform-independent method triggered by userspace.
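
For illustration only, a BMC daemon could trigger the rescan with
something like the userspace sketch below. The helper name
peci_sysfs_trigger() is hypothetical; the attribute path is the one
documented in this patch, and the helper takes the path as a parameter
so it can be pointed at a regular file when testing.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Attribute path documented by this patch. */
#define PECI_RESCAN_PATH "/sys/bus/peci/rescan"

/*
 * Write "1" to a write-only sysfs attribute such as PECI_RESCAN_PATH.
 * Returns 0 on success or a negative errno.
 */
static int peci_sysfs_trigger(const char *path)
{
	static const char val[] = "1\n";
	ssize_t n;
	int fd, err = 0;

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -errno;

	n = write(fd, val, strlen(val));
	if (n < 0)
		err = -errno;
	else if ((size_t)n != strlen(val))
		err = -EIO;

	if (close(fd) < 0 && !err)
		err = -errno;

	return err;
}
```

A daemon that learns the Host has left S5 would call
peci_sysfs_trigger(PECI_RESCAN_PATH). The same helper works for the
per-device remove attribute, given that attribute's path.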

Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com>
---
 Documentation/ABI/testing/sysfs-bus-peci | 16 +++++
 drivers/peci/Makefile                    |  2 +-
 drivers/peci/core.c                      |  3 +-
 drivers/peci/device.c                    |  1 +
 drivers/peci/internal.h                  |  5 ++
 drivers/peci/sysfs.c                     | 82 ++++++++++++++++++++++++
 6 files changed, 107 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-bus-peci
 create mode 100644 drivers/peci/sysfs.c

diff --git a/Documentation/ABI/testing/sysfs-bus-peci b/Documentation/ABI/testing/sysfs-bus-peci
new file mode 100644
index 000000000000..56c2b2216bbd
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-bus-peci
@@ -0,0 +1,16 @@
+What:		/sys/bus/peci/rescan
+Date:		July 2021
+KernelVersion:	5.15
+Contact:	Iwona Winiarska <iwona.winiarska@intel.com>
+Description:
+		Writing a non-zero value to this attribute will
+		initiate a scan for PECI devices on all PECI controllers
+		in the system.
+
+What:		/sys/bus/peci/devices/<controller_id>-<device_addr>/remove
+Date:		July 2021
+KernelVersion:	5.15
+Contact:	Iwona Winiarska <iwona.winiarska@intel.com>
+Description:
+		Writing a non-zero value to this attribute will
+		remove the PECI device and any of its children.
diff --git a/drivers/peci/Makefile b/drivers/peci/Makefile
index c5f9d3fe21bb..917f689e147a 100644
--- a/drivers/peci/Makefile
+++ b/drivers/peci/Makefile
@@ -1,7 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0-only
 
 # Core functionality
-peci-y := core.o request.o device.o
+peci-y := core.o request.o device.o sysfs.o
 obj-$(CONFIG_PECI) += peci.o
 
 # Hardware specific bus drivers
diff --git a/drivers/peci/core.c b/drivers/peci/core.c
index d143f1a7fe98..c473acb3c2a0 100644
--- a/drivers/peci/core.c
+++ b/drivers/peci/core.c
@@ -34,7 +34,7 @@ struct device_type peci_controller_type = {
 	.release	= peci_controller_dev_release,
 };
 
-static int peci_controller_scan_devices(struct peci_controller *controller)
+int peci_controller_scan_devices(struct peci_controller *controller)
 {
 	int ret;
 	u8 addr;
@@ -159,6 +159,7 @@ EXPORT_SYMBOL_NS_GPL(devm_peci_controller_add, PECI);
 
 struct bus_type peci_bus_type = {
 	.name		= "peci",
+	.bus_groups	= peci_bus_groups,
 };
 
 static int __init peci_init(void)
diff --git a/drivers/peci/device.c b/drivers/peci/device.c
index 32811248997b..d77d9dabd51e 100644
--- a/drivers/peci/device.c
+++ b/drivers/peci/device.c
@@ -110,5 +110,6 @@ static void peci_device_release(struct device *dev)
 }
 
 struct device_type peci_device_type = {
+	.groups		= peci_device_groups,
 	.release	= peci_device_release,
 };
diff --git a/drivers/peci/internal.h b/drivers/peci/internal.h
index 57d11a902c5d..978e12c8e1d3 100644
--- a/drivers/peci/internal.h
+++ b/drivers/peci/internal.h
@@ -8,6 +8,7 @@
 #include <linux/types.h>
 
 struct peci_controller;
+struct attribute_group;
 struct peci_device;
 struct peci_request;
 
@@ -19,12 +20,16 @@ struct peci_request *peci_request_alloc(struct peci_device *device, u8 tx_len, u
 void peci_request_free(struct peci_request *req);
 
 extern struct device_type peci_device_type;
+extern const struct attribute_group *peci_device_groups[];
 
 int peci_device_create(struct peci_controller *controller, u8 addr);
 void peci_device_destroy(struct peci_device *device);
 
 extern struct bus_type peci_bus_type;
+extern const struct attribute_group *peci_bus_groups[];
 
 extern struct device_type peci_controller_type;
 
+int peci_controller_scan_devices(struct peci_controller *controller);
+
 #endif /* __PECI_INTERNAL_H */
diff --git a/drivers/peci/sysfs.c b/drivers/peci/sysfs.c
new file mode 100644
index 000000000000..db9ef05776e3
--- /dev/null
+++ b/drivers/peci/sysfs.c
@@ -0,0 +1,82 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright (c) 2021 Intel Corporation
+
+#include <linux/device.h>
+#include <linux/kernel.h>
+#include <linux/peci.h>
+
+#include "internal.h"
+
+static int rescan_controller(struct device *dev, void *data)
+{
+	if (dev->type != &peci_controller_type)
+		return 0;
+
+	return peci_controller_scan_devices(to_peci_controller(dev));
+}
+
+static ssize_t rescan_store(struct bus_type *bus, const char *buf, size_t count)
+{
+	bool res;
+	int ret;
+
+	ret = kstrtobool(buf, &res);
+	if (ret)
+		return ret;
+
+	if (!res)
+		return count;
+
+	ret = bus_for_each_dev(&peci_bus_type, NULL, NULL, rescan_controller);
+	if (ret)
+		return ret;
+
+	return count;
+}
+static BUS_ATTR_WO(rescan);
+
+static struct attribute *peci_bus_attrs[] = {
+	&bus_attr_rescan.attr,
+	NULL
+};
+
+static const struct attribute_group peci_bus_group = {
+	.attrs = peci_bus_attrs,
+};
+
+const struct attribute_group *peci_bus_groups[] = {
+	&peci_bus_group,
+	NULL
+};
+
+static ssize_t remove_store(struct device *dev, struct device_attribute *attr,
+			    const char *buf, size_t count)
+{
+	struct peci_device *device = to_peci_device(dev);
+	bool res;
+	int ret;
+
+	ret = kstrtobool(buf, &res);
+	if (ret)
+		return ret;
+
+	if (res && device_remove_file_self(dev, attr))
+		peci_device_destroy(device);
+
+	return count;
+}
+static DEVICE_ATTR_IGNORE_LOCKDEP(remove, 0200, NULL, remove_store);
+
+static struct attribute *peci_device_attrs[] = {
+	&dev_attr_remove.attr,
+	NULL
+};
+
+static const struct attribute_group peci_device_group = {
+	.attrs = peci_device_attrs,
+};
+
+const struct attribute_group *peci_device_groups[] = {
+	&peci_device_group,
+	NULL
+};
-- 
2.31.1



* [PATCH v2 10/15] peci: Add support for PECI device drivers
  2021-08-03 11:31 [PATCH v2 00/15] Introduce PECI subsystem Iwona Winiarska
                   ` (8 preceding siblings ...)
  2021-08-03 11:31 ` [PATCH v2 09/15] peci: Add sysfs interface for PECI bus Iwona Winiarska
@ 2021-08-03 11:31 ` Iwona Winiarska
  2021-08-27 21:19   ` Dan Williams
  2021-08-03 11:31 ` [PATCH v2 11/15] peci: Add peci-cpu driver Iwona Winiarska
                   ` (5 subsequent siblings)
  15 siblings, 1 reply; 49+ messages in thread
From: Iwona Winiarska @ 2021-08-03 11:31 UTC (permalink / raw)
  To: linux-kernel, openbmc, Greg Kroah-Hartman
  Cc: x86, devicetree, linux-aspeed, linux-arm-kernel, linux-hwmon,
	linux-doc, Rob Herring, Joel Stanley, Andrew Jeffery,
	Jean Delvare, Guenter Roeck, Arnd Bergmann, Olof Johansson,
	Jonathan Corbet, Thomas Gleixner, Andy Lutomirski, Ingo Molnar,
	Borislav Petkov, Yazen Ghannam, Mauro Carvalho Chehab,
	Pierre-Louis Bossart, Tony Luck, Andy Shevchenko, Jae Hyun Yoo,
	Dan Williams, Randy Dunlap, Zev Weiss, David Muller,
	Iwona Winiarska

Here we're adding support for PECI device drivers, which unlike PECI
controller drivers are actually able to provide functionality to
userspace.

We're also extending the peci_request API to allow querying more details
about a PECI device (e.g. model/family), which will be used to find
a compatible peci_driver.
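
To make the matching concrete, below is a userspace sketch (not the code
from this patch) of decoding family/model from a CPUID signature and
walking an id table terminated by family == 0. The struct cpu_id_entry
and the sig_family(), sig_model() and match_id() helpers are
hypothetical names; the decoding rules are the ones x86_family() and
x86_model() apply.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down mirror of an id table entry. */
struct cpu_id_entry {
	uint16_t family;
	uint8_t model;
};

/* Family: base field, plus the extended family when the base is 0xf. */
static unsigned int sig_family(uint32_t sig)
{
	unsigned int family = (sig >> 8) & 0xf;

	if (family == 0xf)
		family += (sig >> 20) & 0xff;

	return family;
}

/* Model: base field, plus the extended model for family 6 and above. */
static unsigned int sig_model(uint32_t sig)
{
	unsigned int model = (sig >> 4) & 0xf;

	if (sig_family(sig) >= 0x6)
		model += ((sig >> 16) & 0xf) << 4;

	return model;
}

/* Walk a table terminated by an entry with family == 0. */
static const struct cpu_id_entry *
match_id(const struct cpu_id_entry *id, uint32_t sig)
{
	for (; id->family != 0; id++) {
		if (id->family == sig_family(sig) &&
		    id->model == sig_model(sig))
			return id;
	}

	return NULL;
}
```

For example, the signature 0x50654 decodes to family 6, model 0x55, so
a table containing { 6, 0x55 } matches it, while a signature whose
family/model pair is not listed yields NULL and no driver is bound.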

Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
---
 drivers/peci/Kconfig    |   1 +
 drivers/peci/core.c     |  49 +++++++++
 drivers/peci/device.c   | 105 ++++++++++++++++++++
 drivers/peci/internal.h |  75 ++++++++++++++
 drivers/peci/request.c  | 214 ++++++++++++++++++++++++++++++++++++++++
 include/linux/peci.h    |  19 ++++
 lib/Kconfig             |   2 +-
 7 files changed, 464 insertions(+), 1 deletion(-)

diff --git a/drivers/peci/Kconfig b/drivers/peci/Kconfig
index 99279df97a78..1d0532e3a801 100644
--- a/drivers/peci/Kconfig
+++ b/drivers/peci/Kconfig
@@ -2,6 +2,7 @@
 
 menuconfig PECI
 	tristate "PECI support"
+	select GENERIC_LIB_X86
 	help
 	  The Platform Environment Control Interface (PECI) is an interface
 	  that provides a communication channel to Intel processors and
diff --git a/drivers/peci/core.c b/drivers/peci/core.c
index c473acb3c2a0..33c07920493d 100644
--- a/drivers/peci/core.c
+++ b/drivers/peci/core.c
@@ -157,8 +157,57 @@ struct peci_controller *devm_peci_controller_add(struct device *dev,
 }
 EXPORT_SYMBOL_NS_GPL(devm_peci_controller_add, PECI);
 
+static const struct peci_device_id *
+peci_bus_match_device_id(const struct peci_device_id *id, struct peci_device *device)
+{
+	while (id->family != 0) {
+		if (id->family == device->info.family &&
+		    id->model == device->info.model)
+			return id;
+		id++;
+	}
+
+	return NULL;
+}
+
+static int peci_bus_device_match(struct device *dev, struct device_driver *drv)
+{
+	struct peci_device *device = to_peci_device(dev);
+	struct peci_driver *peci_drv = to_peci_driver(drv);
+
+	if (dev->type != &peci_device_type)
+		return 0;
+
+	if (peci_bus_match_device_id(peci_drv->id_table, device))
+		return 1;
+
+	return 0;
+}
+
+static int peci_bus_device_probe(struct device *dev)
+{
+	struct peci_device *device = to_peci_device(dev);
+	struct peci_driver *driver = to_peci_driver(dev->driver);
+
+	return driver->probe(device, peci_bus_match_device_id(driver->id_table, device));
+}
+
+static int peci_bus_device_remove(struct device *dev)
+{
+	struct peci_device *device = to_peci_device(dev);
+	struct peci_driver *driver = to_peci_driver(dev->driver);
+
+	if (driver->remove)
+		driver->remove(device);
+
+	return 0;
+}
+
 struct bus_type peci_bus_type = {
 	.name		= "peci",
+	.match		= peci_bus_device_match,
+	.probe		= peci_bus_device_probe,
+	.remove		= peci_bus_device_remove,
 	.bus_groups	= peci_bus_groups,
 };
 
diff --git a/drivers/peci/device.c b/drivers/peci/device.c
index d77d9dabd51e..a78c02399574 100644
--- a/drivers/peci/device.c
+++ b/drivers/peci/device.c
@@ -1,11 +1,85 @@
 // SPDX-License-Identifier: GPL-2.0-only
 // Copyright (c) 2018-2021 Intel Corporation
 
+#include <linux/bitfield.h>
 #include <linux/peci.h>
 #include <linux/slab.h>
+#include <linux/x86/cpu.h>
 
 #include "internal.h"
 
+#define REVISION_NUM_MASK GENMASK(15, 8)
+static int peci_get_revision(struct peci_device *device, u8 *revision)
+{
+	struct peci_request *req;
+	u64 dib;
+
+	req = peci_get_dib(device);
+	if (IS_ERR(req))
+		return PTR_ERR(req);
+
+	/*
+	 * PECI device may be in a state where it is unable to return a proper
+	 * DIB, in which case it returns 0 as DIB value.
+	 * Let's treat this as an error to avoid carrying on with the detection
+	 * using invalid revision.
+	 */
+	dib = peci_request_data_dib(req);
+	if (dib == 0) {
+		peci_request_free(req);
+		return -EIO;
+	}
+
+	*revision = FIELD_GET(REVISION_NUM_MASK, dib);
+
+	peci_request_free(req);
+
+	return 0;
+}
+
+static int peci_get_cpu_id(struct peci_device *device, u32 *cpu_id)
+{
+	struct peci_request *req;
+	int ret;
+
+	req = peci_pkg_cfg_readl(device, PECI_PCS_PKG_ID, PECI_PKG_ID_CPU_ID);
+	if (IS_ERR(req))
+		return PTR_ERR(req);
+
+	ret = peci_request_status(req);
+	if (ret)
+		goto out_req_free;
+
+	*cpu_id = peci_request_data_readl(req);
+out_req_free:
+	peci_request_free(req);
+
+	return ret;
+}
+
+static int peci_device_info_init(struct peci_device *device)
+{
+	u8 revision;
+	u32 cpu_id;
+	int ret;
+
+	ret = peci_get_cpu_id(device, &cpu_id);
+	if (ret)
+		return ret;
+
+	device->info.family = x86_family(cpu_id);
+	device->info.model = x86_model(cpu_id);
+
+	ret = peci_get_revision(device, &revision);
+	if (ret)
+		return ret;
+	device->info.peci_revision = revision;
+
+	device->info.socket_id = device->addr - PECI_BASE_ADDR;
+
+	return 0;
+}
+
 static int peci_detect(struct peci_controller *controller, u8 addr)
 {
 	struct peci_request *req;
@@ -79,6 +153,10 @@ int peci_device_create(struct peci_controller *controller, u8 addr)
 	device->dev.bus = &peci_bus_type;
 	device->dev.type = &peci_device_type;
 
+	ret = peci_device_info_init(device);
+	if (ret)
+		goto err_free;
+
 	ret = dev_set_name(&device->dev, "%d-%02x", controller->id, device->addr);
 	if (ret)
 		goto err_free;
@@ -102,6 +180,33 @@ void peci_device_destroy(struct peci_device *device)
 	device_unregister(&device->dev);
 }
 
+int __peci_driver_register(struct peci_driver *driver, struct module *owner,
+			   const char *mod_name)
+{
+	driver->driver.bus = &peci_bus_type;
+	driver->driver.owner = owner;
+	driver->driver.mod_name = mod_name;
+
+	if (!driver->probe) {
+		pr_err("peci: trying to register driver without probe callback\n");
+		return -EINVAL;
+	}
+
+	if (!driver->id_table) {
+		pr_err("peci: trying to register driver without device id table\n");
+		return -EINVAL;
+	}
+
+	return driver_register(&driver->driver);
+}
+EXPORT_SYMBOL_NS_GPL(__peci_driver_register, PECI);
+
+void peci_driver_unregister(struct peci_driver *driver)
+{
+	driver_unregister(&driver->driver);
+}
+EXPORT_SYMBOL_NS_GPL(peci_driver_unregister, PECI);
+
 static void peci_device_release(struct device *dev)
 {
 	struct peci_device *device = to_peci_device(dev);
diff --git a/drivers/peci/internal.h b/drivers/peci/internal.h
index 978e12c8e1d3..d661e1b65694 100644
--- a/drivers/peci/internal.h
+++ b/drivers/peci/internal.h
@@ -19,6 +19,34 @@ struct peci_request;
 struct peci_request *peci_request_alloc(struct peci_device *device, u8 tx_len, u8 rx_len);
 void peci_request_free(struct peci_request *req);
 
+int peci_request_status(struct peci_request *req);
+u64 peci_request_data_dib(struct peci_request *req);
+
+u8 peci_request_data_readb(struct peci_request *req);
+u16 peci_request_data_readw(struct peci_request *req);
+u32 peci_request_data_readl(struct peci_request *req);
+u64 peci_request_data_readq(struct peci_request *req);
+
+struct peci_request *peci_get_dib(struct peci_device *device);
+struct peci_request *peci_get_temp(struct peci_device *device);
+
+struct peci_request *peci_pkg_cfg_readb(struct peci_device *device, u8 index, u16 param);
+struct peci_request *peci_pkg_cfg_readw(struct peci_device *device, u8 index, u16 param);
+struct peci_request *peci_pkg_cfg_readl(struct peci_device *device, u8 index, u16 param);
+struct peci_request *peci_pkg_cfg_readq(struct peci_device *device, u8 index, u16 param);
+
+/**
+ * struct peci_device_id - PECI device data to match
+ * @data: pointer to driver private data specific to device
+ * @family: device family
+ * @model: device model
+ */
+struct peci_device_id {
+	const void *data;
+	u16 family;
+	u8 model;
+};
+
 extern struct device_type peci_device_type;
 extern const struct attribute_group *peci_device_groups[];
 
@@ -28,6 +56,53 @@ void peci_device_destroy(struct peci_device *device);
 extern struct bus_type peci_bus_type;
 extern const struct attribute_group *peci_bus_groups[];
 
+/**
+ * struct peci_driver - PECI driver
+ * @driver: inherit device driver
+ * @probe: probe callback
+ * @remove: remove callback
+ * @id_table: PECI device match table to decide which device to bind
+ */
+struct peci_driver {
+	struct device_driver driver;
+	int (*probe)(struct peci_device *device, const struct peci_device_id *id);
+	void (*remove)(struct peci_device *device);
+	const struct peci_device_id *id_table;
+};
+
+static inline struct peci_driver *to_peci_driver(struct device_driver *d)
+{
+	return container_of(d, struct peci_driver, driver);
+}
+
+int __peci_driver_register(struct peci_driver *driver, struct module *owner,
+			   const char *mod_name);
+/**
+ * peci_driver_register() - register PECI driver
+ * @driver: the driver to be registered
+ * @owner: owner module of the driver being registered
+ * @mod_name: module name string
+ *
+ * PECI drivers that don't need to do anything special in module init should
+ * use the convenience "module_peci_driver" macro instead.
+ *
+ * Return: zero on success, else a negative error code.
+ */
+#define peci_driver_register(driver) \
+	__peci_driver_register(driver, THIS_MODULE, KBUILD_MODNAME)
+void peci_driver_unregister(struct peci_driver *driver);
+
+/**
+ * module_peci_driver() - helper macro for registering a modular PECI driver
+ * @__peci_driver: peci_driver struct
+ *
+ * Helper macro for PECI drivers which do not do anything special in module
+ * init/exit. This eliminates a lot of boilerplate. Each module may only
+ * use this macro once, and calling it replaces module_init() and module_exit().
+ */
+#define module_peci_driver(__peci_driver) \
+	module_driver(__peci_driver, peci_driver_register, peci_driver_unregister)
+
 extern struct device_type peci_controller_type;
 
 int peci_controller_scan_devices(struct peci_controller *controller);
diff --git a/drivers/peci/request.c b/drivers/peci/request.c
index 81b567bc7b87..fe032d5a5e1b 100644
--- a/drivers/peci/request.c
+++ b/drivers/peci/request.c
@@ -1,13 +1,140 @@
 // SPDX-License-Identifier: GPL-2.0-only
 // Copyright (c) 2021 Intel Corporation
 
+#include <linux/bug.h>
 #include <linux/export.h>
 #include <linux/peci.h>
 #include <linux/slab.h>
 #include <linux/types.h>
 
+#include <asm/unaligned.h>
+
 #include "internal.h"
 
+#define PECI_GET_DIB_CMD		0xf7
+#define  PECI_GET_DIB_WR_LEN		1
+#define  PECI_GET_DIB_RD_LEN		8
+
+#define PECI_RDPKGCFG_CMD		0xa1
+#define  PECI_RDPKGCFG_WR_LEN		5
+#define  PECI_RDPKGCFG_RD_LEN_BASE	1
+#define PECI_WRPKGCFG_CMD		0xa5
+#define  PECI_WRPKGCFG_WR_LEN_BASE	6
+#define  PECI_WRPKGCFG_RD_LEN		1
+
+/* Device Specific Completion Code (CC) Definition */
+#define PECI_CC_SUCCESS				0x40
+#define PECI_CC_NEED_RETRY			0x80
+#define PECI_CC_OUT_OF_RESOURCE			0x81
+#define PECI_CC_UNAVAIL_RESOURCE		0x82
+#define PECI_CC_INVALID_REQ			0x90
+#define PECI_CC_MCA_ERROR			0x91
+#define PECI_CC_CATASTROPHIC_MCA_ERROR		0x93
+#define PECI_CC_FATAL_MCA_ERROR			0x94
+#define PECI_CC_PARITY_ERR_GPSB_OR_PMSB		0x98
+#define PECI_CC_PARITY_ERR_GPSB_OR_PMSB_IERR	0x9B
+#define PECI_CC_PARITY_ERR_GPSB_OR_PMSB_MCA	0x9C
+
+#define PECI_RETRY_BIT			BIT(0)
+
+#define PECI_RETRY_TIMEOUT		msecs_to_jiffies(700)
+#define PECI_RETRY_INTERVAL_MIN		msecs_to_jiffies(1)
+#define PECI_RETRY_INTERVAL_MAX		msecs_to_jiffies(128)
+
+static u8 peci_request_data_cc(struct peci_request *req)
+{
+	return req->rx.buf[0];
+}
+
+/**
+ * peci_request_status() - return -errno based on PECI completion code
+ * @req: the PECI request that contains response data with completion code
+ *
+ * It can't be used for Ping(), GetDIB() and GetTemp() - for those commands we
+ * don't expect completion code in the response.
+ *
+ * Return: 0 if the completion code indicates success, -errno otherwise.
+ */
+int peci_request_status(struct peci_request *req)
+{
+	u8 cc = peci_request_data_cc(req);
+
+	if (cc != PECI_CC_SUCCESS)
+		dev_dbg(&req->device->dev, "ret: %#02x\n", cc);
+
+	switch (cc) {
+	case PECI_CC_SUCCESS:
+		return 0;
+	case PECI_CC_NEED_RETRY:
+	case PECI_CC_OUT_OF_RESOURCE:
+	case PECI_CC_UNAVAIL_RESOURCE:
+		return -EAGAIN;
+	case PECI_CC_INVALID_REQ:
+		return -EINVAL;
+	case PECI_CC_MCA_ERROR:
+	case PECI_CC_CATASTROPHIC_MCA_ERROR:
+	case PECI_CC_FATAL_MCA_ERROR:
+	case PECI_CC_PARITY_ERR_GPSB_OR_PMSB:
+	case PECI_CC_PARITY_ERR_GPSB_OR_PMSB_IERR:
+	case PECI_CC_PARITY_ERR_GPSB_OR_PMSB_MCA:
+		return -EIO;
+	}
+
+	WARN_ONCE(1, "Unknown PECI completion code: %#02x\n", cc);
+
+	return -EIO;
+}
+EXPORT_SYMBOL_NS_GPL(peci_request_status, PECI);
+
+static int peci_request_xfer(struct peci_request *req)
+{
+	struct peci_device *device = req->device;
+	struct peci_controller *controller = to_peci_controller(device->dev.parent);
+	int ret;
+
+	mutex_lock(&controller->bus_lock);
+	ret = controller->ops->xfer(controller, device->addr, req);
+	mutex_unlock(&controller->bus_lock);
+
+	return ret;
+}
+
+static int peci_request_xfer_retry(struct peci_request *req)
+{
+	long wait_interval = PECI_RETRY_INTERVAL_MIN;
+	struct peci_device *device = req->device;
+	struct peci_controller *controller = to_peci_controller(device->dev.parent);
+	unsigned long start = jiffies;
+	int ret;
+
+	/* Don't try to use it for ping */
+	if (WARN_ON(!req->rx.buf))
+		return 0;
+
+	do {
+		ret = peci_request_xfer(req);
+		if (ret) {
+			dev_dbg(&controller->dev, "xfer error: %d\n", ret);
+			return ret;
+		}
+
+		if (peci_request_status(req) != -EAGAIN)
+			return 0;
+
+		/* Set the retry bit to indicate a retry attempt */
+		req->tx.buf[1] |= PECI_RETRY_BIT;
+
+		if (schedule_timeout_interruptible(wait_interval))
+			return -ERESTARTSYS;
+
+		wait_interval = min_t(long, wait_interval * 2, PECI_RETRY_INTERVAL_MAX);
+	} while (time_before(jiffies, start + PECI_RETRY_TIMEOUT));
+
+	dev_dbg(&controller->dev, "request timed out\n");
+
+	return -ETIMEDOUT;
+}
+
 /**
  * peci_request_alloc() - allocate &struct peci_requests
  * @device: PECI device to which request is going to be sent
@@ -48,3 +175,90 @@ void peci_request_free(struct peci_request *req)
 	kfree(req);
 }
 EXPORT_SYMBOL_NS_GPL(peci_request_free, PECI);
+
+struct peci_request *peci_get_dib(struct peci_device *device)
+{
+	struct peci_request *req;
+	int ret;
+
+	req = peci_request_alloc(device, PECI_GET_DIB_WR_LEN, PECI_GET_DIB_RD_LEN);
+	if (!req)
+		return ERR_PTR(-ENOMEM);
+
+	req->tx.buf[0] = PECI_GET_DIB_CMD;
+
+	ret = peci_request_xfer(req);
+	if (ret) {
+		peci_request_free(req);
+		return ERR_PTR(ret);
+	}
+
+	return req;
+}
+EXPORT_SYMBOL_NS_GPL(peci_get_dib, PECI);
+
+static struct peci_request *
+__pkg_cfg_read(struct peci_device *device, u8 index, u16 param, u8 len)
+{
+	struct peci_request *req;
+	int ret;
+
+	req = peci_request_alloc(device, PECI_RDPKGCFG_WR_LEN, PECI_RDPKGCFG_RD_LEN_BASE + len);
+	if (!req)
+		return ERR_PTR(-ENOMEM);
+
+	req->tx.buf[0] = PECI_RDPKGCFG_CMD;
+	req->tx.buf[1] = 0;
+	req->tx.buf[2] = index;
+	put_unaligned_le16(param, &req->tx.buf[3]);
+
+	ret = peci_request_xfer_retry(req);
+	if (ret) {
+		peci_request_free(req);
+		return ERR_PTR(ret);
+	}
+
+	return req;
+}
+
+u8 peci_request_data_readb(struct peci_request *req)
+{
+	return req->rx.buf[1];
+}
+EXPORT_SYMBOL_NS_GPL(peci_request_data_readb, PECI);
+
+u16 peci_request_data_readw(struct peci_request *req)
+{
+	return get_unaligned_le16(&req->rx.buf[1]);
+}
+EXPORT_SYMBOL_NS_GPL(peci_request_data_readw, PECI);
+
+u32 peci_request_data_readl(struct peci_request *req)
+{
+	return get_unaligned_le32(&req->rx.buf[1]);
+}
+EXPORT_SYMBOL_NS_GPL(peci_request_data_readl, PECI);
+
+u64 peci_request_data_readq(struct peci_request *req)
+{
+	return get_unaligned_le64(&req->rx.buf[1]);
+}
+EXPORT_SYMBOL_NS_GPL(peci_request_data_readq, PECI);
+
+u64 peci_request_data_dib(struct peci_request *req)
+{
+	return get_unaligned_le64(&req->rx.buf[0]);
+}
+EXPORT_SYMBOL_NS_GPL(peci_request_data_dib, PECI);
+
+#define __read_pkg_config(x, type) \
+struct peci_request *peci_pkg_cfg_##x(struct peci_device *device, u8 index, u16 param) \
+{ \
+	return __pkg_cfg_read(device, index, param, sizeof(type)); \
+} \
+EXPORT_SYMBOL_NS_GPL(peci_pkg_cfg_##x, PECI)
+
+__read_pkg_config(readb, u8);
+__read_pkg_config(readw, u16);
+__read_pkg_config(readl, u32);
+__read_pkg_config(readq, u64);
diff --git a/include/linux/peci.h b/include/linux/peci.h
index 26e0a4e73b50..dcf1c53f4e40 100644
--- a/include/linux/peci.h
+++ b/include/linux/peci.h
@@ -14,6 +14,14 @@
  */
 #define PECI_REQUEST_MAX_BUF_SIZE 32
 
+#define PECI_PCS_PKG_ID			0  /* Package Identifier Read */
+#define  PECI_PKG_ID_CPU_ID		0x0000  /* CPUID Info */
+#define  PECI_PKG_ID_PLATFORM_ID	0x0001  /* Platform ID */
+#define  PECI_PKG_ID_DEVICE_ID		0x0002  /* Uncore Device ID */
+#define  PECI_PKG_ID_MAX_THREAD_ID	0x0003  /* Max Thread ID */
+#define  PECI_PKG_ID_MICROCODE_REV	0x0004  /* CPU Microcode Update Revision */
+#define  PECI_PKG_ID_MCA_ERROR_LOG	0x0005  /* Machine Check Status */
+
 struct peci_controller;
 struct peci_request;
 
@@ -59,6 +67,11 @@ static inline struct peci_controller *to_peci_controller(void *d)
  * struct peci_device - PECI device
  * @dev: device object to register PECI device to the device model
  * @controller: manages the bus segment hosting this PECI device
+ * @info: PECI device characteristics
+ * @info.family: device family
+ * @info.model: device model
+ * @info.peci_revision: PECI revision supported by the PECI device
+ * @info.socket_id: the socket ID represented by the PECI device
  * @addr: address used on the PECI bus connected to the parent controller
  *
  * A peci_device identifies a single device (i.e. CPU) connected to a PECI bus.
@@ -67,6 +80,12 @@ static inline struct peci_controller *to_peci_controller(void *d)
  */
 struct peci_device {
 	struct device dev;
+	struct {
+		u16 family;
+		u8 model;
+		u8 peci_revision;
+		u8 socket_id;
+	} info;
 	u8 addr;
 };
 
diff --git a/lib/Kconfig b/lib/Kconfig
index e538d4d773bd..7f7972d357c2 100644
--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -718,4 +718,4 @@ config ASN1_ENCODER
 
 config GENERIC_LIB_X86
 	bool
-	depends on X86
+	depends on X86 || PECI
-- 
2.31.1
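
For readers skimming the patch above: the completion-code handling in
peci_request_status() and the backoff in peci_request_xfer_retry() can be
modelled in user space roughly as follows. This is only a sketch, not
part of the patch. The helper names cc_to_errno() and
next_retry_interval_ms() are hypothetical, and the intervals are written
in milliseconds here, whereas the kernel code works in jiffies, so the
real delays depend on HZ.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Completion codes as defined in drivers/peci/request.c in this patch. */
#define PECI_CC_SUCCESS			0x40
#define PECI_CC_NEED_RETRY		0x80
#define PECI_CC_OUT_OF_RESOURCE		0x81
#define PECI_CC_UNAVAIL_RESOURCE	0x82
#define PECI_CC_INVALID_REQ		0x90

/* Retry interval bounds from the patch, expressed here in milliseconds. */
#define RETRY_INTERVAL_MIN_MS	1
#define RETRY_INTERVAL_MAX_MS	128

/* Hypothetical helper: same mapping as peci_request_status(), minus logging. */
int cc_to_errno(uint8_t cc)
{
	switch (cc) {
	case PECI_CC_SUCCESS:
		return 0;
	case PECI_CC_NEED_RETRY:
	case PECI_CC_OUT_OF_RESOURCE:
	case PECI_CC_UNAVAIL_RESOURCE:
		return -EAGAIN;	/* the caller may retry with the retry bit set */
	case PECI_CC_INVALID_REQ:
		return -EINVAL;
	default:
		return -EIO;	/* MCA/parity errors and unknown codes */
	}
}

/* Hypothetical helper: double the wait, capped at the maximum interval. */
long next_retry_interval_ms(long cur_ms)
{
	assert(cur_ms >= RETRY_INTERVAL_MIN_MS);
	return cur_ms * 2 > RETRY_INTERVAL_MAX_MS ? RETRY_INTERVAL_MAX_MS : cur_ms * 2;
}
```

Starting from the minimum, the wait grows as 1, 2, 4, ... up to 128 and
then stays at 128 until the overall 700 ms retry timeout expires.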



* [PATCH v2 11/15] peci: Add peci-cpu driver
  2021-08-03 11:31 [PATCH v2 00/15] Introduce PECI subsystem Iwona Winiarska
                   ` (9 preceding siblings ...)
  2021-08-03 11:31 ` [PATCH v2 10/15] peci: Add support for PECI device drivers Iwona Winiarska
@ 2021-08-03 11:31 ` Iwona Winiarska
  2021-08-03 11:31 ` [PATCH v2 12/15] hwmon: peci: Add cputemp driver Iwona Winiarska
                   ` (4 subsequent siblings)
  15 siblings, 0 replies; 49+ messages in thread
From: Iwona Winiarska @ 2021-08-03 11:31 UTC (permalink / raw)
  To: linux-kernel, openbmc, Greg Kroah-Hartman
  Cc: x86, devicetree, linux-aspeed, linux-arm-kernel, linux-hwmon,
	linux-doc, Rob Herring, Joel Stanley, Andrew Jeffery,
	Jean Delvare, Guenter Roeck, Arnd Bergmann, Olof Johansson,
	Jonathan Corbet, Thomas Gleixner, Andy Lutomirski, Ingo Molnar,
	Borislav Petkov, Yazen Ghannam, Mauro Carvalho Chehab,
	Pierre-Louis Bossart, Tony Luck, Andy Shevchenko, Jae Hyun Yoo,
	Dan Williams, Randy Dunlap, Zev Weiss, David Muller,
	Iwona Winiarska

PECI is an interface that may be used by different types of devices.
This patch adds a peci-cpu driver compatible with Intel processors.
The driver is responsible for creating auxiliary devices that can
subsequently be used by other drivers (e.g. hwmon drivers).

Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
---
 MAINTAINERS              |   1 +
 drivers/peci/Kconfig     |  15 ++
 drivers/peci/Makefile    |   2 +
 drivers/peci/cpu.c       | 344 +++++++++++++++++++++++++++++++++++++++
 drivers/peci/device.c    |   1 +
 drivers/peci/internal.h  |  27 +++
 drivers/peci/request.c   | 213 ++++++++++++++++++++++++
 include/linux/peci-cpu.h |  38 +++++
 include/linux/peci.h     |   8 -
 9 files changed, 641 insertions(+), 8 deletions(-)
 create mode 100644 drivers/peci/cpu.c
 create mode 100644 include/linux/peci-cpu.h
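
Note for reviewers (not part of the commit): the way adev_alloc() below
builds the auxiliary device name and id can be modelled in user space as
in the following sketch. The helper names peci_adev_name() and
peci_adev_id() are hypothetical and exist only for this illustration;
the "%s.%s" format and the (controller id << 16) | address packing are
taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Mirrors "adev->id = (controller->id << 16) | (priv->device->addr)". */
uint32_t peci_adev_id(uint32_t controller_id, uint8_t addr)
{
	return (controller_id << 16) | addr;
}

/*
 * Mirrors kasprintf(GFP_KERNEL, "%s.%s", type, cpu), where type is an entry
 * of peci_adev_types[] and cpu is the .data string from the id table.
 * Returns the length of the formatted name, as snprintf() does.
 */
int peci_adev_name(char *buf, size_t len, const char *type, const char *cpu)
{
	assert(buf && type && cpu);
	return snprintf(buf, len, "%s.%s", type, cpu);
}
```

For a Haswell Xeon matched by the id table, the two auxiliary devices
would be named "cputemp.hsx" and "dimmtemp.hsx". With controller id 1
and a device at address 0x31 (an address picked for the example, not
taken from the patch), the id would be 0x10031.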

diff --git a/MAINTAINERS b/MAINTAINERS
index 6e9d53ff68ab..3f5d48e1d143 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -14519,6 +14519,7 @@ L:	openbmc@lists.ozlabs.org (moderated for non-subscribers)
 S:	Supported
 F:	Documentation/devicetree/bindings/peci/
 F:	drivers/peci/
+F:	include/linux/peci-cpu.h
 F:	include/linux/peci.h
 
 PENSANDO ETHERNET DRIVERS
diff --git a/drivers/peci/Kconfig b/drivers/peci/Kconfig
index 1d0532e3a801..2ea01f43f547 100644
--- a/drivers/peci/Kconfig
+++ b/drivers/peci/Kconfig
@@ -17,6 +17,21 @@ menuconfig PECI
 
 if PECI
 
+config PECI_CPU
+	tristate "PECI CPU"
+	select AUXILIARY_BUS
+	help
+	  This option enables the peci-cpu driver for Intel processors. It is
+	  responsible for creating auxiliary devices that can subsequently
+	  be used by other drivers to provide functionality such as
+	  temperature monitoring.
+
+	  Additional drivers must be enabled in order to use the functionality
+	  of the device.
+
+	  This driver can also be built as a module. If so, the module
+	  will be called peci-cpu.
+
 source "drivers/peci/controller/Kconfig"
 
 endif # PECI
diff --git a/drivers/peci/Makefile b/drivers/peci/Makefile
index 917f689e147a..7de18137e738 100644
--- a/drivers/peci/Makefile
+++ b/drivers/peci/Makefile
@@ -3,6 +3,8 @@
 # Core functionality
 peci-y := core.o request.o device.o sysfs.o
 obj-$(CONFIG_PECI) += peci.o
+peci-cpu-y := cpu.o
+obj-$(CONFIG_PECI_CPU) += peci-cpu.o
 
 # Hardware specific bus drivers
 obj-y += controller/
diff --git a/drivers/peci/cpu.c b/drivers/peci/cpu.c
new file mode 100644
index 000000000000..97b9043be1e2
--- /dev/null
+++ b/drivers/peci/cpu.c
@@ -0,0 +1,344 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright (c) 2021 Intel Corporation
+
+#include <linux/auxiliary_bus.h>
+#include <linux/module.h>
+#include <linux/peci.h>
+#include <linux/peci-cpu.h>
+#include <linux/slab.h>
+#include <linux/x86/intel-family.h>
+
+#include "internal.h"
+
+/**
+ * peci_temp_read() - read the maximum die temperature from PECI target device
+ * @device: PECI device to which request is going to be sent
+ * @temp_raw: where to store the read temperature
+ *
+ * It uses GetTemp PECI command.
+ *
+ * Return: 0 on success, negative error code on failure.
+ */
+int peci_temp_read(struct peci_device *device, s16 *temp_raw)
+{
+	struct peci_request *req;
+
+	req = peci_get_temp(device);
+	if (IS_ERR(req))
+		return PTR_ERR(req);
+
+	*temp_raw = peci_request_data_temp(req);
+
+	peci_request_free(req);
+
+	return 0;
+}
+EXPORT_SYMBOL_NS_GPL(peci_temp_read, PECI_CPU);
+
+/**
+ * peci_pcs_read() - read PCS register
+ * @device: PECI device to which request is going to be sent
+ * @index: PCS index
+ * @param: PCS parameter
+ * @data: where to store the read data
+ *
+ * It uses RdPkgConfig PECI command.
+ *
+ * Return: 0 on success, negative error code on failure.
+ */
+int peci_pcs_read(struct peci_device *device, u8 index, u16 param, u32 *data)
+{
+	struct peci_request *req;
+	int ret;
+
+	req = peci_pkg_cfg_readl(device, index, param);
+	if (IS_ERR(req))
+		return PTR_ERR(req);
+
+	ret = peci_request_status(req);
+	if (ret)
+		goto out_req_free;
+
+	*data = peci_request_data_readl(req);
+out_req_free:
+	peci_request_free(req);
+
+	return ret;
+}
+EXPORT_SYMBOL_NS_GPL(peci_pcs_read, PECI_CPU);
+
+/**
+ * peci_pci_local_read() - read 32-bit memory location using raw address
+ * @device: PECI device to which request is going to be sent
+ * @bus: bus
+ * @dev: device
+ * @func: function
+ * @reg: register
+ * @data: where to store the read data
+ *
+ * It uses RdPCIConfigLocal PECI command.
+ *
+ * Return: 0 on success, negative error code on failure.
+ */
+int peci_pci_local_read(struct peci_device *device, u8 bus, u8 dev, u8 func,
+			u16 reg, u32 *data)
+{
+	struct peci_request *req;
+	int ret;
+
+	req = peci_pci_cfg_local_readl(device, bus, dev, func, reg);
+	if (IS_ERR(req))
+		return PTR_ERR(req);
+
+	ret = peci_request_status(req);
+	if (ret)
+		goto out_req_free;
+
+	*data = peci_request_data_readl(req);
+out_req_free:
+	peci_request_free(req);
+
+	return ret;
+}
+EXPORT_SYMBOL_NS_GPL(peci_pci_local_read, PECI_CPU);
+
+/**
+ * peci_ep_pci_local_read() - read 32-bit memory location using raw address
+ * @device: PECI device to which request is going to be sent
+ * @seg: PCI segment
+ * @bus: bus
+ * @dev: device
+ * @func: function
+ * @reg: register
+ * @data: where to store the read data
+ *
+ * Like &peci_pci_local_read, but it uses RdEndpointConfig PECI command.
+ *
+ * Return: 0 on success, negative error code on failure.
+ */
+int peci_ep_pci_local_read(struct peci_device *device, u8 seg,
+			   u8 bus, u8 dev, u8 func, u16 reg, u32 *data)
+{
+	struct peci_request *req;
+	int ret;
+
+	req = peci_ep_pci_cfg_local_readl(device, seg, bus, dev, func, reg);
+	if (IS_ERR(req))
+		return PTR_ERR(req);
+
+	ret = peci_request_status(req);
+	if (ret)
+		goto out_req_free;
+
+	*data = peci_request_data_readl(req);
+out_req_free:
+	peci_request_free(req);
+
+	return ret;
+}
+EXPORT_SYMBOL_NS_GPL(peci_ep_pci_local_read, PECI_CPU);
+
+/**
+ * peci_mmio_read() - read 32-bit memory location using 64-bit bar offset address
+ * @device: PECI device to which request is going to be sent
+ * @bar: PCI bar
+ * @seg: PCI segment
+ * @bus: bus
+ * @dev: device
+ * @func: function
+ * @address: 64-bit MMIO address
+ * @data: where to store the read data
+ *
+ * It uses RdEndpointConfig PECI command.
+ *
+ * Return: 0 on success, negative error code on failure.
+ */
+int peci_mmio_read(struct peci_device *device, u8 bar, u8 seg,
+		   u8 bus, u8 dev, u8 func, u64 address, u32 *data)
+{
+	struct peci_request *req;
+	int ret;
+
+	req = peci_ep_mmio64_readl(device, bar, seg, bus, dev, func, address);
+	if (IS_ERR(req))
+		return PTR_ERR(req);
+
+	ret = peci_request_status(req);
+	if (ret)
+		goto out_req_free;
+
+	*data = peci_request_data_readl(req);
+out_req_free:
+	peci_request_free(req);
+
+	return ret;
+}
+EXPORT_SYMBOL_NS_GPL(peci_mmio_read, PECI_CPU);
+
+static const char * const peci_adev_types[] = {
+	"cputemp",
+	"dimmtemp",
+};
+
+struct peci_cpu {
+	struct peci_device *device;
+	const struct peci_device_id *id;
+};
+
+static void adev_release(struct device *dev)
+{
+	struct auxiliary_device *adev = to_auxiliary_dev(dev);
+
+	auxiliary_device_uninit(adev);
+
+	kfree(adev->name);
+	kfree(adev);
+}
+
+static struct auxiliary_device *adev_alloc(struct peci_cpu *priv, int idx)
+{
+	struct peci_controller *controller = to_peci_controller(priv->device->dev.parent);
+	struct auxiliary_device *adev;
+	const char *name;
+	int ret;
+
+	adev = kzalloc(sizeof(*adev), GFP_KERNEL);
+	if (!adev)
+		return ERR_PTR(-ENOMEM);
+
+	name = kasprintf(GFP_KERNEL, "%s.%s", peci_adev_types[idx], (const char *)priv->id->data);
+	if (!name) {
+		ret = -ENOMEM;
+		goto free_adev;
+	}
+
+	adev->name = name;
+	adev->dev.parent = &priv->device->dev;
+	adev->dev.release = adev_release;
+	adev->id = (controller->id << 16) | (priv->device->addr);
+
+	ret = auxiliary_device_init(adev);
+	if (ret)
+		goto free_name;
+
+	return adev;
+
+free_name:
+	kfree(name);
+free_adev:
+	kfree(adev);
+	return ERR_PTR(ret);
+}
+
+static void unregister_adev(void *_adev)
+{
+	struct auxiliary_device *adev = _adev;
+
+	auxiliary_device_delete(adev);
+}
+
+static int devm_adev_add(struct device *dev, int idx)
+{
+	struct peci_cpu *priv = dev_get_drvdata(dev);
+	struct auxiliary_device *adev;
+	int ret;
+
+	adev = adev_alloc(priv, idx);
+	if (IS_ERR(adev))
+		return PTR_ERR(adev);
+
+	ret = auxiliary_device_add(adev);
+	if (ret) {
+		auxiliary_device_uninit(adev);
+		return ret;
+	}
+
+	ret = devm_add_action_or_reset(&priv->device->dev, unregister_adev, adev);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+static void peci_cpu_add_adevices(struct peci_cpu *priv)
+{
+	struct device *dev = &priv->device->dev;
+	int ret, i;
+
+	for (i = 0; i < ARRAY_SIZE(peci_adev_types); i++) {
+		ret = devm_adev_add(dev, i);
+		if (ret) {
+			dev_warn(dev, "Failed to register PECI auxiliary: %s, ret = %d\n",
+				 peci_adev_types[i], ret);
+			continue;
+		}
+	}
+}
+
+static int
+peci_cpu_probe(struct peci_device *device, const struct peci_device_id *id)
+{
+	struct device *dev = &device->dev;
+	struct peci_cpu *priv;
+
+	priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
+	if (!priv)
+		return -ENOMEM;
+
+	dev_set_drvdata(dev, priv);
+	priv->device = device;
+	priv->id = id;
+
+	peci_cpu_add_adevices(priv);
+
+	return 0;
+}
+
+static const struct peci_device_id peci_cpu_device_ids[] = {
+	{ /* Haswell Xeon */
+		.family	= 6,
+		.model	= INTEL_FAM6_HASWELL_X,
+		.data	= "hsx",
+	},
+	{ /* Broadwell Xeon */
+		.family	= 6,
+		.model	= INTEL_FAM6_BROADWELL_X,
+		.data	= "bdx",
+	},
+	{ /* Broadwell Xeon D */
+		.family	= 6,
+		.model	= INTEL_FAM6_BROADWELL_D,
+		.data	= "bdxd",
+	},
+	{ /* Skylake Xeon */
+		.family	= 6,
+		.model	= INTEL_FAM6_SKYLAKE_X,
+		.data	= "skx",
+	},
+	{ /* Icelake Xeon */
+		.family	= 6,
+		.model	= INTEL_FAM6_ICELAKE_X,
+		.data	= "icx",
+	},
+	{ /* Icelake Xeon D */
+		.family	= 6,
+		.model	= INTEL_FAM6_ICELAKE_D,
+		.data	= "icxd",
+	},
+	{ }
+};
+MODULE_DEVICE_TABLE(peci, peci_cpu_device_ids);
+
+static struct peci_driver peci_cpu_driver = {
+	.probe		= peci_cpu_probe,
+	.id_table	= peci_cpu_device_ids,
+	.driver		= {
+		.name		= "peci-cpu",
+	},
+};
+module_peci_driver(peci_cpu_driver);
+
+MODULE_AUTHOR("Iwona Winiarska <iwona.winiarska@intel.com>");
+MODULE_DESCRIPTION("PECI CPU driver");
+MODULE_LICENSE("GPL");
+MODULE_IMPORT_NS(PECI);
diff --git a/drivers/peci/device.c b/drivers/peci/device.c
index a78c02399574..9c377d32cec4 100644
--- a/drivers/peci/device.c
+++ b/drivers/peci/device.c
@@ -3,6 +3,7 @@
 
 #include <linux/bitfield.h>
 #include <linux/peci.h>
+#include <linux/peci-cpu.h>
 #include <linux/slab.h>
 #include <linux/x86/cpu.h>
 
diff --git a/drivers/peci/internal.h b/drivers/peci/internal.h
index d661e1b65694..c6bbf18b66cb 100644
--- a/drivers/peci/internal.h
+++ b/drivers/peci/internal.h
@@ -21,6 +21,7 @@ void peci_request_free(struct peci_request *req);
 
 int peci_request_status(struct peci_request *req);
 u64 peci_request_data_dib(struct peci_request *req);
+s16 peci_request_data_temp(struct peci_request *req);
 
 u8 peci_request_data_readb(struct peci_request *req);
 u16 peci_request_data_readw(struct peci_request *req);
@@ -35,6 +36,32 @@ struct peci_request *peci_pkg_cfg_readw(struct peci_device *device, u8 index, u1
 struct peci_request *peci_pkg_cfg_readl(struct peci_device *device, u8 index, u16 param);
 struct peci_request *peci_pkg_cfg_readq(struct peci_device *device, u8 index, u16 param);
 
+struct peci_request *peci_pci_cfg_local_readb(struct peci_device *device,
+					      u8 bus, u8 dev, u8 func, u16 reg);
+struct peci_request *peci_pci_cfg_local_readw(struct peci_device *device,
+					      u8 bus, u8 dev, u8 func, u16 reg);
+struct peci_request *peci_pci_cfg_local_readl(struct peci_device *device,
+					      u8 bus, u8 dev, u8 func, u16 reg);
+
+struct peci_request *peci_ep_pci_cfg_local_readb(struct peci_device *device, u8 seg,
+						 u8 bus, u8 dev, u8 func, u16 reg);
+struct peci_request *peci_ep_pci_cfg_local_readw(struct peci_device *device, u8 seg,
+						 u8 bus, u8 dev, u8 func, u16 reg);
+struct peci_request *peci_ep_pci_cfg_local_readl(struct peci_device *device, u8 seg,
+						 u8 bus, u8 dev, u8 func, u16 reg);
+
+struct peci_request *peci_ep_pci_cfg_readb(struct peci_device *device, u8 seg,
+					   u8 bus, u8 dev, u8 func, u16 reg);
+struct peci_request *peci_ep_pci_cfg_readw(struct peci_device *device, u8 seg,
+					   u8 bus, u8 dev, u8 func, u16 reg);
+struct peci_request *peci_ep_pci_cfg_readl(struct peci_device *device, u8 seg,
+					   u8 bus, u8 dev, u8 func, u16 reg);
+
+struct peci_request *peci_ep_mmio32_readl(struct peci_device *device, u8 bar, u8 seg,
+					  u8 bus, u8 dev, u8 func, u64 offset);
+
+struct peci_request *peci_ep_mmio64_readl(struct peci_device *device, u8 bar, u8 seg,
+					  u8 bus, u8 dev, u8 func, u64 offset);
 /**
  * struct peci_device_id - PECI device data to match
  * @data: pointer to driver private data specific to device
diff --git a/drivers/peci/request.c b/drivers/peci/request.c
index fe032d5a5e1b..eacd6d2513b1 100644
--- a/drivers/peci/request.c
+++ b/drivers/peci/request.c
@@ -3,6 +3,7 @@
 
 #include <linux/bug.h>
 #include <linux/export.h>
+#include <linux/pci.h>
 #include <linux/peci.h>
 #include <linux/slab.h>
 #include <linux/types.h>
@@ -15,6 +16,10 @@
 #define  PECI_GET_DIB_WR_LEN		1
 #define  PECI_GET_DIB_RD_LEN		8
 
+#define PECI_GET_TEMP_CMD		0x01
+#define  PECI_GET_TEMP_WR_LEN		1
+#define  PECI_GET_TEMP_RD_LEN		2
+
 #define PECI_RDPKGCFG_CMD		0xa1
 #define  PECI_RDPKGCFG_WR_LEN		5
 #define  PECI_RDPKGCFG_RD_LEN_BASE	1
@@ -22,6 +27,45 @@
 #define  PECI_WRPKGCFG_WR_LEN_BASE	6
 #define  PECI_WRPKGCFG_RD_LEN		1
 
+#define PECI_RDIAMSR_CMD		0xb1
+#define  PECI_RDIAMSR_WR_LEN		5
+#define  PECI_RDIAMSR_RD_LEN		9
+#define PECI_WRIAMSR_CMD		0xb5
+#define PECI_RDIAMSREX_CMD		0xd1
+#define  PECI_RDIAMSREX_WR_LEN		6
+#define  PECI_RDIAMSREX_RD_LEN		9
+
+#define PECI_RDPCICFG_CMD		0x61
+#define  PECI_RDPCICFG_WR_LEN		6
+#define  PECI_RDPCICFG_RD_LEN		5
+#define  PECI_RDPCICFG_RD_LEN_MAX	24
+#define PECI_WRPCICFG_CMD		0x65
+
+#define PECI_RDPCICFGLOCAL_CMD			0xe1
+#define  PECI_RDPCICFGLOCAL_WR_LEN		5
+#define  PECI_RDPCICFGLOCAL_RD_LEN_BASE		1
+#define PECI_WRPCICFGLOCAL_CMD			0xe5
+#define  PECI_WRPCICFGLOCAL_WR_LEN_BASE		6
+#define  PECI_WRPCICFGLOCAL_RD_LEN		1
+
+#define PECI_ENDPTCFG_TYPE_LOCAL_PCI		0x03
+#define PECI_ENDPTCFG_TYPE_PCI			0x04
+#define PECI_ENDPTCFG_TYPE_MMIO			0x05
+#define PECI_ENDPTCFG_ADDR_TYPE_PCI		0x04
+#define PECI_ENDPTCFG_ADDR_TYPE_MMIO_D		0x05
+#define PECI_ENDPTCFG_ADDR_TYPE_MMIO_Q		0x06
+#define PECI_RDENDPTCFG_CMD			0xc1
+#define  PECI_RDENDPTCFG_PCI_WR_LEN		12
+#define  PECI_RDENDPTCFG_MMIO_WR_LEN_BASE	10
+#define  PECI_RDENDPTCFG_MMIO_D_WR_LEN		14
+#define  PECI_RDENDPTCFG_MMIO_Q_WR_LEN		18
+#define  PECI_RDENDPTCFG_RD_LEN_BASE		1
+#define PECI_WRENDPTCFG_CMD			0xc5
+#define  PECI_WRENDPTCFG_PCI_WR_LEN_BASE	13
+#define  PECI_WRENDPTCFG_MMIO_D_WR_LEN_BASE	15
+#define  PECI_WRENDPTCFG_MMIO_Q_WR_LEN_BASE	19
+#define  PECI_WRENDPTCFG_RD_LEN			1
+
 /* Device Specific Completion Code (CC) Definition */
 #define PECI_CC_SUCCESS				0x40
 #define PECI_CC_NEED_RETRY			0x80
@@ -197,6 +241,27 @@ struct peci_request *peci_get_dib(struct peci_device *device)
 }
 EXPORT_SYMBOL_NS_GPL(peci_get_dib, PECI);
 
+struct peci_request *peci_get_temp(struct peci_device *device)
+{
+	struct peci_request *req;
+	int ret;
+
+	req = peci_request_alloc(device, PECI_GET_TEMP_WR_LEN, PECI_GET_TEMP_RD_LEN);
+	if (!req)
+		return ERR_PTR(-ENOMEM);
+
+	req->tx.buf[0] = PECI_GET_TEMP_CMD;
+
+	ret = peci_request_xfer(req);
+	if (ret) {
+		peci_request_free(req);
+		return ERR_PTR(ret);
+	}
+
+	return req;
+}
+EXPORT_SYMBOL_NS_GPL(peci_get_temp, PECI);
+
 static struct peci_request *
 __pkg_cfg_read(struct peci_device *device, u8 index, u16 param, u8 len)
 {
@@ -221,6 +286,108 @@ __pkg_cfg_read(struct peci_device *device, u8 index, u16 param, u8 len)
 	return req;
 }
 
+static u32 __get_pci_addr(u8 bus, u8 dev, u8 func, u16 reg)
+{
+	return reg | PCI_DEVID(bus, PCI_DEVFN(dev, func)) << 12;
+}
+
+static struct peci_request *
+__pci_cfg_local_read(struct peci_device *device, u8 bus, u8 dev, u8 func, u16 reg, u8 len)
+{
+	struct peci_request *req;
+	u32 pci_addr;
+	int ret;
+
+	req = peci_request_alloc(device, PECI_RDPCICFGLOCAL_WR_LEN,
+				 PECI_RDPCICFGLOCAL_RD_LEN_BASE + len);
+	if (!req)
+		return ERR_PTR(-ENOMEM);
+
+	pci_addr = __get_pci_addr(bus, dev, func, reg);
+
+	req->tx.buf[0] = PECI_RDPCICFGLOCAL_CMD;
+	req->tx.buf[1] = 0;
+	put_unaligned_le24(pci_addr, &req->tx.buf[2]);
+
+	ret = peci_request_xfer_retry(req);
+	if (ret) {
+		peci_request_free(req);
+		return ERR_PTR(ret);
+	}
+
+	return req;
+}
+
+static struct peci_request *
+__ep_pci_cfg_read(struct peci_device *device, u8 msg_type, u8 seg,
+		  u8 bus, u8 dev, u8 func, u16 reg, u8 len)
+{
+	struct peci_request *req;
+	u32 pci_addr;
+	int ret;
+
+	req = peci_request_alloc(device, PECI_RDENDPTCFG_PCI_WR_LEN,
+				 PECI_RDENDPTCFG_RD_LEN_BASE + len);
+	if (!req)
+		return ERR_PTR(-ENOMEM);
+
+	pci_addr = __get_pci_addr(bus, dev, func, reg);
+
+	req->tx.buf[0] = PECI_RDENDPTCFG_CMD;
+	req->tx.buf[1] = 0;
+	req->tx.buf[2] = msg_type;
+	req->tx.buf[3] = 0;
+	req->tx.buf[4] = 0;
+	req->tx.buf[5] = 0;
+	req->tx.buf[6] = PECI_ENDPTCFG_ADDR_TYPE_PCI;
+	req->tx.buf[7] = seg; /* PCI Segment */
+	put_unaligned_le32(pci_addr, &req->tx.buf[8]);
+
+	ret = peci_request_xfer_retry(req);
+	if (ret) {
+		peci_request_free(req);
+		return ERR_PTR(ret);
+	}
+
+	return req;
+}
+
+static struct peci_request *
+__ep_mmio_read(struct peci_device *device, u8 bar, u8 addr_type, u8 seg,
+	       u8 bus, u8 dev, u8 func, u64 offset, u8 tx_len, u8 len)
+{
+	struct peci_request *req;
+	int ret;
+
+	req = peci_request_alloc(device, tx_len, PECI_RDENDPTCFG_RD_LEN_BASE + len);
+	if (!req)
+		return ERR_PTR(-ENOMEM);
+
+	req->tx.buf[0] = PECI_RDENDPTCFG_CMD;
+	req->tx.buf[1] = 0;
+	req->tx.buf[2] = PECI_ENDPTCFG_TYPE_MMIO;
+	req->tx.buf[3] = 0; /* Endpoint ID */
+	req->tx.buf[4] = 0; /* Reserved */
+	req->tx.buf[5] = bar;
+	req->tx.buf[6] = addr_type;
+	req->tx.buf[7] = seg; /* PCI Segment */
+	req->tx.buf[8] = PCI_DEVFN(dev, func);
+	req->tx.buf[9] = bus; /* PCI Bus */
+
+	if (addr_type == PECI_ENDPTCFG_ADDR_TYPE_MMIO_D)
+		put_unaligned_le32(offset, &req->tx.buf[10]);
+	else
+		put_unaligned_le64(offset, &req->tx.buf[10]);
+
+	ret = peci_request_xfer_retry(req);
+	if (ret) {
+		peci_request_free(req);
+		return ERR_PTR(ret);
+	}
+
+	return req;
+}
+
 u8 peci_request_data_readb(struct peci_request *req)
 {
 	return req->rx.buf[1];
@@ -251,6 +418,12 @@ u64 peci_request_data_dib(struct peci_request *req)
 }
 EXPORT_SYMBOL_NS_GPL(peci_request_data_dib, PECI);
 
+s16 peci_request_data_temp(struct peci_request *req)
+{
+	return get_unaligned_le16(&req->rx.buf[0]);
+}
+EXPORT_SYMBOL_NS_GPL(peci_request_data_temp, PECI);
+
 #define __read_pkg_config(x, type) \
 struct peci_request *peci_pkg_cfg_##x(struct peci_device *device, u8 index, u16 param) \
 { \
@@ -262,3 +435,43 @@ __read_pkg_config(readb, u8);
 __read_pkg_config(readw, u16);
 __read_pkg_config(readl, u32);
 __read_pkg_config(readq, u64);
+
+#define __read_pci_config_local(x, type) \
+struct peci_request * \
+peci_pci_cfg_local_##x(struct peci_device *device, u8 bus, u8 dev, u8 func, u16 reg) \
+{ \
+	return __pci_cfg_local_read(device, bus, dev, func, reg, sizeof(type)); \
+} \
+EXPORT_SYMBOL_NS_GPL(peci_pci_cfg_local_##x, PECI)
+
+__read_pci_config_local(readb, u8);
+__read_pci_config_local(readw, u16);
+__read_pci_config_local(readl, u32);
+
+#define __read_ep_pci_config(x, msg_type, type) \
+struct peci_request * \
+peci_ep_pci_cfg_##x(struct peci_device *device, u8 seg, u8 bus, u8 dev, u8 func, u16 reg) \
+{ \
+	return __ep_pci_cfg_read(device, msg_type, seg, bus, dev, func, reg, sizeof(type)); \
+} \
+EXPORT_SYMBOL_NS_GPL(peci_ep_pci_cfg_##x, PECI)
+
+__read_ep_pci_config(local_readb, PECI_ENDPTCFG_TYPE_LOCAL_PCI, u8);
+__read_ep_pci_config(local_readw, PECI_ENDPTCFG_TYPE_LOCAL_PCI, u16);
+__read_ep_pci_config(local_readl, PECI_ENDPTCFG_TYPE_LOCAL_PCI, u32);
+__read_ep_pci_config(readb, PECI_ENDPTCFG_TYPE_PCI, u8);
+__read_ep_pci_config(readw, PECI_ENDPTCFG_TYPE_PCI, u16);
+__read_ep_pci_config(readl, PECI_ENDPTCFG_TYPE_PCI, u32);
+
+#define __read_ep_mmio(x, y, addr_type, type1, type2) \
+struct peci_request *peci_ep_mmio##y##_##x(struct peci_device *device, u8 bar, u8 seg, \
+					   u8 bus, u8 dev, u8 func, u64 offset) \
+{ \
+	return __ep_mmio_read(device, bar, addr_type, seg, bus, dev, func, \
+			      offset, PECI_RDENDPTCFG_MMIO_WR_LEN_BASE + sizeof(type1), \
+			      sizeof(type2)); \
+} \
+EXPORT_SYMBOL_NS_GPL(peci_ep_mmio##y##_##x, PECI)
+
+__read_ep_mmio(readl, 32, PECI_ENDPTCFG_ADDR_TYPE_MMIO_D, u32, u32);
+__read_ep_mmio(readl, 64, PECI_ENDPTCFG_ADDR_TYPE_MMIO_Q, u64, u32);
diff --git a/include/linux/peci-cpu.h b/include/linux/peci-cpu.h
new file mode 100644
index 000000000000..d1b307ec2429
--- /dev/null
+++ b/include/linux/peci-cpu.h
@@ -0,0 +1,38 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/* Copyright (c) 2021 Intel Corporation */
+
+#ifndef __LINUX_PECI_CPU_H
+#define __LINUX_PECI_CPU_H
+
+#include <linux/types.h>
+
+#define PECI_PCS_PKG_ID			0  /* Package Identifier Read */
+#define  PECI_PKG_ID_CPU_ID		0x0000  /* CPUID Info */
+#define  PECI_PKG_ID_PLATFORM_ID	0x0001  /* Platform ID */
+#define  PECI_PKG_ID_DEVICE_ID		0x0002  /* Uncore Device ID */
+#define  PECI_PKG_ID_MAX_THREAD_ID	0x0003  /* Max Thread ID */
+#define  PECI_PKG_ID_MICROCODE_REV	0x0004  /* CPU Microcode Update Revision */
+#define  PECI_PKG_ID_MCA_ERROR_LOG	0x0005  /* Machine Check Status */
+#define PECI_PCS_MODULE_TEMP		9  /* Per Core DTS Temperature Read */
+#define PECI_PCS_THERMAL_MARGIN		10 /* DTS thermal margin */
+#define PECI_PCS_DDR_DIMM_TEMP		14 /* DDR DIMM Temperature */
+#define PECI_PCS_TEMP_TARGET		16 /* Temperature Target Read */
+#define PECI_PCS_TDP_UNITS		30 /* Units for power/energy registers */
+
+struct peci_device;
+
+int peci_temp_read(struct peci_device *device, s16 *temp_raw);
+
+int peci_pcs_read(struct peci_device *device, u8 index,
+		  u16 param, u32 *data);
+
+int peci_pci_local_read(struct peci_device *device, u8 bus, u8 dev,
+			u8 func, u16 reg, u32 *data);
+
+int peci_ep_pci_local_read(struct peci_device *device, u8 seg,
+			   u8 bus, u8 dev, u8 func, u16 reg, u32 *data);
+
+int peci_mmio_read(struct peci_device *device, u8 bar, u8 seg,
+		   u8 bus, u8 dev, u8 func, u64 address, u32 *data);
+
+#endif /* __LINUX_PECI_CPU_H */
diff --git a/include/linux/peci.h b/include/linux/peci.h
index dcf1c53f4e40..ce43705eaac4 100644
--- a/include/linux/peci.h
+++ b/include/linux/peci.h
@@ -14,14 +14,6 @@
  */
 #define PECI_REQUEST_MAX_BUF_SIZE 32
 
-#define PECI_PCS_PKG_ID			0  /* Package Identifier Read */
-#define  PECI_PKG_ID_CPU_ID		0x0000  /* CPUID Info */
-#define  PECI_PKG_ID_PLATFORM_ID	0x0001  /* Platform ID */
-#define  PECI_PKG_ID_DEVICE_ID		0x0002  /* Uncore Device ID */
-#define  PECI_PKG_ID_MAX_THREAD_ID	0x0003  /* Max Thread ID */
-#define  PECI_PKG_ID_MICROCODE_REV	0x0004  /* CPU Microcode Update Revision */
-#define  PECI_PKG_ID_MCA_ERROR_LOG	0x0005  /* Machine Check Status */
-
 struct peci_controller;
 struct peci_request;
 
-- 
2.31.1
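
For reference, __get_pci_addr() in the patch above packs bus, device,
function and register offset into a single 32-bit value. The userspace
sketch below mirrors that layout, assuming the usual kernel definitions
of PCI_DEVFN() and PCI_DEVID(). The sketch_* names are made up for
illustration, and the 12-bit limit on the register offset is an
assumption of the sketch, not a check the patch performs.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's PCI_DEVFN() and PCI_DEVID() macros. */
#define SKETCH_PCI_DEVFN(dev, func)	((((dev) & 0x1f) << 3) | ((func) & 0x07))
#define SKETCH_PCI_DEVID(bus, devfn)	((((uint16_t)(bus)) << 8) | (devfn))

/*
 * Pack the address the same way __get_pci_addr() does:
 * bits 27:20 bus, 19:15 device, 14:12 function, 11:0 register offset.
 */
static uint32_t sketch_pci_addr(uint8_t bus, uint8_t dev, uint8_t func, uint16_t reg)
{
	/* Assumption of this sketch: the offset must stay below the function field. */
	assert(reg < 0x1000);

	return reg | (uint32_t)SKETCH_PCI_DEVID(bus, SKETCH_PCI_DEVFN(dev, func)) << 12;
}
```

With the RESOLVED_CORES location listed for Haswell in the cputemp
patch (bus 1, device 30, function 3, offset 0xb4),
sketch_pci_addr(1, 30, 3, 0xb4) evaluates to 0x1f30b4.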


^ permalink raw reply	[flat|nested] 49+ messages in thread

* [PATCH v2 12/15] hwmon: peci: Add cputemp driver
  2021-08-03 11:31 [PATCH v2 00/15] Introduce PECI subsystem Iwona Winiarska
                   ` (10 preceding siblings ...)
  2021-08-03 11:31 ` [PATCH v2 11/15] peci: Add peci-cpu driver Iwona Winiarska
@ 2021-08-03 11:31 ` Iwona Winiarska
  2021-08-03 15:24   ` Guenter Roeck
  2021-08-03 11:31 ` [PATCH v2 13/15] hwmon: peci: Add dimmtemp driver Iwona Winiarska
                   ` (3 subsequent siblings)
  15 siblings, 1 reply; 49+ messages in thread
From: Iwona Winiarska @ 2021-08-03 11:31 UTC (permalink / raw)
  To: linux-kernel, openbmc, Greg Kroah-Hartman
  Cc: x86, devicetree, linux-aspeed, linux-arm-kernel, linux-hwmon,
	linux-doc, Rob Herring, Joel Stanley, Andrew Jeffery,
	Jean Delvare, Guenter Roeck, Arnd Bergmann, Olof Johansson,
	Jonathan Corbet, Thomas Gleixner, Andy Lutomirski, Ingo Molnar,
	Borislav Petkov, Yazen Ghannam, Mauro Carvalho Chehab,
	Pierre-Louis Bossart, Tony Luck, Andy Shevchenko, Jae Hyun Yoo,
	Dan Williams, Randy Dunlap, Zev Weiss, David Muller,
	Iwona Winiarska

Add peci-cputemp driver for Digital Thermal Sensor (DTS) thermal
readings of the processor package and processor cores that are
accessible via the PECI interface.

The main use case for the driver (and PECI interface) is out-of-band
management, where we're able to obtain the DTS readings from an external
entity connected with PECI, e.g. a BMC on server platforms.

Co-developed-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
---
 MAINTAINERS                  |   7 +
 drivers/hwmon/Kconfig        |   2 +
 drivers/hwmon/Makefile       |   1 +
 drivers/hwmon/peci/Kconfig   |  18 ++
 drivers/hwmon/peci/Makefile  |   5 +
 drivers/hwmon/peci/common.h  |  58 ++++
 drivers/hwmon/peci/cputemp.c | 591 +++++++++++++++++++++++++++++++++++
 7 files changed, 682 insertions(+)
 create mode 100644 drivers/hwmon/peci/Kconfig
 create mode 100644 drivers/hwmon/peci/Makefile
 create mode 100644 drivers/hwmon/peci/common.h
 create mode 100644 drivers/hwmon/peci/cputemp.c
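
The driver below treats DTS readings as S10.6 fixed-point values and
rejects the reserved range 0x8000-0x81ff as error codes. A minimal
userspace sketch of that check and conversion follows, assuming the raw
reading is handled as an unsigned 16-bit quantity. The sketch_* names
are hypothetical and not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MILLIDEGREE_PER_DEGREE	1000
#define SKETCH_DTS_FRACTION		64	/* 2^6: six fractional bits */

/* 0x8000-0x81ff are error codes, not temperatures. */
static bool sketch_dts_valid(uint16_t raw)
{
	return raw < 0x8000 || raw > 0x81ff;
}

/* Convert a valid raw S10.6 reading to millidegrees Celsius. */
static int32_t sketch_dts_to_millidegree(uint16_t raw)
{
	int32_t val = raw;

	assert(sketch_dts_valid(raw));

	/* Sign-extend from bit 15 without relying on narrowing casts. */
	if (val & 0x8000)
		val -= 0x10000;

	/* Integer division truncates toward zero, as in the driver. */
	return val * SKETCH_MILLIDEGREE_PER_DEGREE / SKETCH_DTS_FRACTION;
}
```

For example, a raw value of 0xfd80 is -640 in S10.6, i.e. -10000
millidegrees. The driver adds such an offset to Tjmax to obtain the die
temperature.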

diff --git a/MAINTAINERS b/MAINTAINERS
index 3f5d48e1d143..e36b5c0824e3 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -14512,6 +14512,13 @@ L:	platform-driver-x86@vger.kernel.org
 S:	Maintained
 F:	drivers/platform/x86/peaq-wmi.c
 
+PECI HARDWARE MONITORING DRIVERS
+M:	Iwona Winiarska <iwona.winiarska@intel.com>
+R:	Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
+L:	linux-hwmon@vger.kernel.org
+S:	Supported
+F:	drivers/hwmon/peci/
+
 PECI SUBSYSTEM
 M:	Iwona Winiarska <iwona.winiarska@intel.com>
 R:	Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
diff --git a/drivers/hwmon/Kconfig b/drivers/hwmon/Kconfig
index e3675377bc5d..61c0e3404415 100644
--- a/drivers/hwmon/Kconfig
+++ b/drivers/hwmon/Kconfig
@@ -1507,6 +1507,8 @@ config SENSORS_PCF8591
 	  These devices are hard to detect and rarely found on mainstream
 	  hardware. If unsure, say N.
 
+source "drivers/hwmon/peci/Kconfig"
+
 source "drivers/hwmon/pmbus/Kconfig"
 
 config SENSORS_PWM_FAN
diff --git a/drivers/hwmon/Makefile b/drivers/hwmon/Makefile
index d712c61c1f5e..f52331f212ed 100644
--- a/drivers/hwmon/Makefile
+++ b/drivers/hwmon/Makefile
@@ -202,6 +202,7 @@ obj-$(CONFIG_SENSORS_WM8350)	+= wm8350-hwmon.o
 obj-$(CONFIG_SENSORS_XGENE)	+= xgene-hwmon.o
 
 obj-$(CONFIG_SENSORS_OCC)	+= occ/
+obj-$(CONFIG_SENSORS_PECI)	+= peci/
 obj-$(CONFIG_PMBUS)		+= pmbus/
 
 ccflags-$(CONFIG_HWMON_DEBUG_CHIP) := -DDEBUG
diff --git a/drivers/hwmon/peci/Kconfig b/drivers/hwmon/peci/Kconfig
new file mode 100644
index 000000000000..e10eed68d70a
--- /dev/null
+++ b/drivers/hwmon/peci/Kconfig
@@ -0,0 +1,18 @@
+# SPDX-License-Identifier: GPL-2.0-only
+
+config SENSORS_PECI_CPUTEMP
+	tristate "PECI CPU temperature monitoring client"
+	depends on PECI
+	select SENSORS_PECI
+	select PECI_CPU
+	help
+	  If you say yes here you get support for the generic Intel PECI
+	  cputemp driver which provides Digital Thermal Sensor (DTS) thermal
+	  readings of the CPU package and CPU cores that are accessible via
+	  the processor PECI interface.
+
+	  This driver can also be built as a module. If so, the module
+	  will be called peci-cputemp.
+
+config SENSORS_PECI
+	tristate
diff --git a/drivers/hwmon/peci/Makefile b/drivers/hwmon/peci/Makefile
new file mode 100644
index 000000000000..e8a0ada5ab1f
--- /dev/null
+++ b/drivers/hwmon/peci/Makefile
@@ -0,0 +1,5 @@
+# SPDX-License-Identifier: GPL-2.0-only
+
+peci-cputemp-y := cputemp.o
+
+obj-$(CONFIG_SENSORS_PECI_CPUTEMP)	+= peci-cputemp.o
diff --git a/drivers/hwmon/peci/common.h b/drivers/hwmon/peci/common.h
new file mode 100644
index 000000000000..734506b0eca2
--- /dev/null
+++ b/drivers/hwmon/peci/common.h
@@ -0,0 +1,58 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/* Copyright (c) 2021 Intel Corporation */
+
+#include <linux/mutex.h>
+#include <linux/types.h>
+
+#ifndef __PECI_HWMON_COMMON_H
+#define __PECI_HWMON_COMMON_H
+
+#define PECI_HWMON_UPDATE_INTERVAL	HZ
+
+/**
+ * struct peci_sensor_state - PECI state information
+ * @valid: flag to indicate the sensor value is valid
+ * @last_updated: time of the last update in jiffies
+ * @lock: mutex to protect sensor access
+ */
+struct peci_sensor_state {
+	bool valid;
+	unsigned long last_updated;
+	struct mutex lock; /* protect sensor access */
+};
+
+/**
+ * struct peci_sensor_data - PECI sensor information
+ * @value: sensor value in milli units
+ * @state: sensor update state
+ */
+
+struct peci_sensor_data {
+	s32 value;
+	struct peci_sensor_state state;
+};
+
+/**
+ * peci_sensor_need_update() - check whether sensor update is needed or not
+ * @state: pointer to sensor state struct
+ *
+ * Return: true if update is needed, false if not.
+ */
+
+static inline bool peci_sensor_need_update(struct peci_sensor_state *state)
+{
+	return !state->valid ||
+	       time_after(jiffies, state->last_updated + PECI_HWMON_UPDATE_INTERVAL);
+}
+
+/**
+ * peci_sensor_mark_updated() - mark the sensor as updated
+ * @state: pointer to sensor state struct
+ */
+static inline void peci_sensor_mark_updated(struct peci_sensor_state *state)
+{
+	state->valid = true;
+	state->last_updated = jiffies;
+}
+
+#endif /* __PECI_HWMON_COMMON_H */
diff --git a/drivers/hwmon/peci/cputemp.c b/drivers/hwmon/peci/cputemp.c
new file mode 100644
index 000000000000..9c6858a9fb6d
--- /dev/null
+++ b/drivers/hwmon/peci/cputemp.c
@@ -0,0 +1,591 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright (c) 2018-2021 Intel Corporation
+
+#include <linux/auxiliary_bus.h>
+#include <linux/bitfield.h>
+#include <linux/bitops.h>
+#include <linux/hwmon.h>
+#include <linux/jiffies.h>
+#include <linux/module.h>
+#include <linux/peci.h>
+#include <linux/peci-cpu.h>
+#include <linux/units.h>
+#include <linux/x86/intel-family.h>
+
+#include "common.h"
+
+#define CORE_NUMS_MAX		64
+
+#define BASE_CHANNEL_NUMS	5
+#define CPUTEMP_CHANNEL_NUMS	(BASE_CHANNEL_NUMS + CORE_NUMS_MAX)
+
+#define TEMP_TARGET_FAN_TEMP_MASK	GENMASK(15, 8)
+#define TEMP_TARGET_REF_TEMP_MASK	GENMASK(23, 16)
+#define TEMP_TARGET_TJ_OFFSET_MASK	GENMASK(29, 24)
+
+#define DTS_MARGIN_MASK		GENMASK(15, 0)
+#define PCS_MODULE_TEMP_MASK	GENMASK(15, 0)
+
+#define DTS_FIXED_POINT_FRACTION	64
+
+struct resolved_cores_reg {
+	u8 bus;
+	u8 dev;
+	u8 func;
+	u8 offset;
+};
+
+struct cpu_info {
+	struct resolved_cores_reg *reg;
+	u8 min_peci_revision;
+};
+
+struct peci_temp_target {
+	s32 tcontrol;
+	s32 tthrottle;
+	s32 tjmax;
+	struct peci_sensor_state state;
+};
+
+enum peci_temp_target_type {
+	tcontrol_type,
+	tthrottle_type,
+	tjmax_type,
+	crit_hyst_type,
+};
+
+struct peci_cputemp {
+	struct peci_device *peci_dev;
+	struct device *dev;
+	const char *name;
+	const struct cpu_info *gen_info;
+	struct {
+		struct peci_temp_target target;
+		struct peci_sensor_data die;
+		struct peci_sensor_data dts;
+		struct peci_sensor_data core[CORE_NUMS_MAX];
+	} temp;
+	const char **coretemp_label;
+	DECLARE_BITMAP(core_mask, CORE_NUMS_MAX);
+};
+
+enum cputemp_channels {
+	channel_die,
+	channel_dts,
+	channel_tcontrol,
+	channel_tthrottle,
+	channel_tjmax,
+	channel_core,
+};
+
+static const char * const cputemp_label[BASE_CHANNEL_NUMS] = {
+	"Die",
+	"DTS",
+	"Tcontrol",
+	"Tthrottle",
+	"Tjmax",
+};
+
+static int update_temp_target(struct peci_cputemp *priv)
+{
+	s32 tthrottle_offset, tcontrol_margin;
+	u32 pcs;
+	int ret;
+
+	if (!peci_sensor_need_update(&priv->temp.target.state))
+		return 0;
+
+	ret = peci_pcs_read(priv->peci_dev, PECI_PCS_TEMP_TARGET, 0, &pcs);
+	if (ret)
+		return ret;
+
+	priv->temp.target.tjmax =
+		FIELD_GET(TEMP_TARGET_REF_TEMP_MASK, pcs) * MILLIDEGREE_PER_DEGREE;
+
+	tcontrol_margin = FIELD_GET(TEMP_TARGET_FAN_TEMP_MASK, pcs);
+	tcontrol_margin = sign_extend32(tcontrol_margin, 7) * MILLIDEGREE_PER_DEGREE;
+	priv->temp.target.tcontrol = priv->temp.target.tjmax - tcontrol_margin;
+
+	tthrottle_offset = FIELD_GET(TEMP_TARGET_TJ_OFFSET_MASK, pcs) * MILLIDEGREE_PER_DEGREE;
+	priv->temp.target.tthrottle = priv->temp.target.tjmax - tthrottle_offset;
+
+	peci_sensor_mark_updated(&priv->temp.target.state);
+
+	return 0;
+}
+
+static int get_temp_target(struct peci_cputemp *priv, enum peci_temp_target_type type, long *val)
+{
+	int ret;
+
+	mutex_lock(&priv->temp.target.state.lock);
+
+	ret = update_temp_target(priv);
+	if (ret)
+		goto unlock;
+
+	switch (type) {
+	case tcontrol_type:
+		*val = priv->temp.target.tcontrol;
+		break;
+	case tthrottle_type:
+		*val = priv->temp.target.tthrottle;
+		break;
+	case tjmax_type:
+		*val = priv->temp.target.tjmax;
+		break;
+	case crit_hyst_type:
+		*val = priv->temp.target.tjmax - priv->temp.target.tcontrol;
+		break;
+	default:
+		ret = -EOPNOTSUPP;
+		break;
+	}
+unlock:
+	mutex_unlock(&priv->temp.target.state.lock);
+
+	return ret;
+}
+
+/*
+ * Processors return a value of DTS reading in S10.6 fixed point format
+ * (16 bits: 10-bit signed magnitude, 6-bit fraction).
+ * Error codes:
+ *   0x8000: General sensor error
+ *   0x8001: Reserved
+ *   0x8002: Underflow on reading value
+ *   0x8003-0x81ff: Reserved
+ */
+static bool dts_valid(u16 val)
+{
+	return val < 0x8000 || val > 0x81ff;
+}
+
+static s32 dts_to_millidegree(s32 val)
+{
+	return sign_extend32(val, 15) * MILLIDEGREE_PER_DEGREE / DTS_FIXED_POINT_FRACTION;
+}
+
+static int get_die_temp(struct peci_cputemp *priv, long *val)
+{
+	long tjmax;
+	s16 temp;
+	int ret;
+
+	mutex_lock(&priv->temp.die.state.lock);
+	if (!peci_sensor_need_update(&priv->temp.die.state))
+		goto skip_update;
+
+	ret = peci_temp_read(priv->peci_dev, &temp);
+	if (ret)
+		goto err_unlock;
+
+	if (!dts_valid(temp)) {
+		ret = -EIO;
+		goto err_unlock;
+	}
+
+	ret = get_temp_target(priv, tjmax_type, &tjmax);
+	if (ret)
+		goto err_unlock;
+
+	priv->temp.die.value = (s32)tjmax + dts_to_millidegree(temp);
+
+	peci_sensor_mark_updated(&priv->temp.die.state);
+
+skip_update:
+	*val = priv->temp.die.value;
+	mutex_unlock(&priv->temp.die.state.lock);
+
+	return 0;
+
+err_unlock:
+	mutex_unlock(&priv->temp.die.state.lock);
+	return ret;
+}
+
+static int get_dts(struct peci_cputemp *priv, long *val)
+{
+	s32 dts_margin;
+	long tcontrol;
+	u32 pcs;
+	int ret;
+
+	mutex_lock(&priv->temp.dts.state.lock);
+	if (!peci_sensor_need_update(&priv->temp.dts.state))
+		goto skip_update;
+
+	ret = peci_pcs_read(priv->peci_dev, PECI_PCS_THERMAL_MARGIN, 0, &pcs);
+	if (ret)
+		goto err_unlock;
+
+	dts_margin = FIELD_GET(DTS_MARGIN_MASK, pcs);
+	if (!dts_valid(dts_margin)) {
+		ret = -EIO;
+		goto err_unlock;
+	}
+
+	ret = get_temp_target(priv, tcontrol_type, &tcontrol);
+	if (ret)
+		goto err_unlock;
+
+	/* Tcontrol must be known before the DTS value can be computed */
+	priv->temp.dts.value = (s32)tcontrol - dts_to_millidegree(dts_margin);
+
+	peci_sensor_mark_updated(&priv->temp.dts.state);
+
+skip_update:
+	*val = priv->temp.dts.value;
+	mutex_unlock(&priv->temp.dts.state.lock);
+
+	return 0;
+
+err_unlock:
+	mutex_unlock(&priv->temp.dts.state.lock);
+	return ret;
+}
+
+static int get_core_temp(struct peci_cputemp *priv, int core_index, long *val)
+{
+	s32 core_dts_margin;
+	long tjmax;
+	u32 pcs;
+	int ret;
+
+	mutex_lock(&priv->temp.core[core_index].state.lock);
+	if (!peci_sensor_need_update(&priv->temp.core[core_index].state))
+		goto skip_update;
+
+	ret = peci_pcs_read(priv->peci_dev, PECI_PCS_MODULE_TEMP, core_index, &pcs);
+	if (ret)
+		goto err_unlock;
+
+	core_dts_margin = FIELD_GET(PCS_MODULE_TEMP_MASK, pcs);
+	if (!dts_valid(core_dts_margin)) {
+		ret = -EIO;
+		goto err_unlock;
+	}
+
+	ret = get_temp_target(priv, tjmax_type, &tjmax);
+	if (ret)
+		goto err_unlock;
+
+	/* Tjmax must be known before the core temperature can be computed */
+	priv->temp.core[core_index].value = (s32)tjmax + dts_to_millidegree(core_dts_margin);
+
+	peci_sensor_mark_updated(&priv->temp.core[core_index].state);
+
+skip_update:
+	*val = priv->temp.core[core_index].value;
+	mutex_unlock(&priv->temp.core[core_index].state.lock);
+
+	return 0;
+
+err_unlock:
+	mutex_unlock(&priv->temp.core[core_index].state.lock);
+	return ret;
+}
+
+static int cputemp_read_string(struct device *dev, enum hwmon_sensor_types type,
+			       u32 attr, int channel, const char **str)
+{
+	struct peci_cputemp *priv = dev_get_drvdata(dev);
+
+	if (attr != hwmon_temp_label)
+		return -EOPNOTSUPP;
+
+	*str = channel < channel_core ?
+		cputemp_label[channel] : priv->coretemp_label[channel - channel_core];
+
+	return 0;
+}
+
+static int cputemp_read(struct device *dev, enum hwmon_sensor_types type,
+			u32 attr, int channel, long *val)
+{
+	struct peci_cputemp *priv = dev_get_drvdata(dev);
+
+	switch (attr) {
+	case hwmon_temp_input:
+		switch (channel) {
+		case channel_die:
+			return get_die_temp(priv, val);
+		case channel_dts:
+			return get_dts(priv, val);
+		case channel_tcontrol:
+			return get_temp_target(priv, tcontrol_type, val);
+		case channel_tthrottle:
+			return get_temp_target(priv, tthrottle_type, val);
+		case channel_tjmax:
+			return get_temp_target(priv, tjmax_type, val);
+		default:
+			return get_core_temp(priv, channel - channel_core, val);
+		}
+		break;
+	case hwmon_temp_max:
+		return get_temp_target(priv, tcontrol_type, val);
+	case hwmon_temp_crit:
+		return get_temp_target(priv, tjmax_type, val);
+	case hwmon_temp_crit_hyst:
+		return get_temp_target(priv, crit_hyst_type, val);
+	default:
+		return -EOPNOTSUPP;
+	}
+
+	return 0;
+}
+
+static umode_t cputemp_is_visible(const void *data, enum hwmon_sensor_types type,
+				  u32 attr, int channel)
+{
+	const struct peci_cputemp *priv = data;
+
+	if (channel >= CPUTEMP_CHANNEL_NUMS)
+		return 0;
+
+	if (channel < channel_core)
+		return 0444;
+
+	if (test_bit(channel - channel_core, priv->core_mask))
+		return 0444;
+
+	return 0;
+}
+
+static int init_core_mask(struct peci_cputemp *priv)
+{
+	struct peci_device *peci_dev = priv->peci_dev;
+	struct resolved_cores_reg *reg = priv->gen_info->reg;
+	u64 core_mask;
+	u32 data;
+	int ret;
+
+	/* Get the RESOLVED_CORES register value */
+	switch (peci_dev->info.model) {
+	case INTEL_FAM6_ICELAKE_X:
+	case INTEL_FAM6_ICELAKE_D:
+		ret = peci_ep_pci_local_read(peci_dev, 0, reg->bus, reg->dev,
+					     reg->func, reg->offset + 4, &data);
+		if (ret)
+			return ret;
+
+		core_mask = (u64)data << 32;
+
+		ret = peci_ep_pci_local_read(peci_dev, 0, reg->bus, reg->dev,
+					     reg->func, reg->offset, &data);
+		if (ret)
+			return ret;
+
+		core_mask |= data;
+
+		break;
+	default:
+		ret = peci_pci_local_read(peci_dev, reg->bus, reg->dev,
+					  reg->func, reg->offset, &data);
+		if (ret)
+			return ret;
+
+		core_mask = data;
+
+		break;
+	}
+
+	if (!core_mask)
+		return -EIO;
+
+	bitmap_from_u64(priv->core_mask, core_mask);
+
+	return 0;
+}
+
+static int create_temp_label(struct peci_cputemp *priv)
+{
+	unsigned long core_max = find_last_bit(priv->core_mask, CORE_NUMS_MAX);
+	int i;
+
+	priv->coretemp_label = devm_kzalloc(priv->dev, (core_max + 1) * sizeof(char *), GFP_KERNEL);
+	if (!priv->coretemp_label)
+		return -ENOMEM;
+
+	for_each_set_bit(i, priv->core_mask, CORE_NUMS_MAX) {
+		priv->coretemp_label[i] = devm_kasprintf(priv->dev, GFP_KERNEL, "Core %d", i);
+		if (!priv->coretemp_label[i])
+			return -ENOMEM;
+	}
+
+	return 0;
+}
+
+static void check_resolved_cores(struct peci_cputemp *priv)
+{
+	/*
+	 * Failure to resolve cores is non-critical, we're still able to
+	 * provide other sensor data.
+	 */
+
+	if (init_core_mask(priv))
+		return;
+
+	if (create_temp_label(priv))
+		bitmap_zero(priv->core_mask, CORE_NUMS_MAX);
+}
+
+static void sensor_init(struct peci_cputemp *priv)
+{
+	int i;
+
+	mutex_init(&priv->temp.target.state.lock);
+	mutex_init(&priv->temp.die.state.lock);
+	mutex_init(&priv->temp.dts.state.lock);
+
+	for_each_set_bit(i, priv->core_mask, CORE_NUMS_MAX)
+		mutex_init(&priv->temp.core[i].state.lock);
+}
+
+static const struct hwmon_ops peci_cputemp_ops = {
+	.is_visible = cputemp_is_visible,
+	.read_string = cputemp_read_string,
+	.read = cputemp_read,
+};
+
+static const u32 peci_cputemp_temp_channel_config[] = {
+	/* Die temperature */
+	HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_CRIT | HWMON_T_CRIT_HYST,
+	/* DTS margin */
+	HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_CRIT | HWMON_T_CRIT_HYST,
+	/* Tcontrol temperature */
+	HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_CRIT,
+	/* Tthrottle temperature */
+	HWMON_T_LABEL | HWMON_T_INPUT,
+	/* Tjmax temperature */
+	HWMON_T_LABEL | HWMON_T_INPUT,
+	/* Core temperature - for all core channels */
+	[channel_core ... CPUTEMP_CHANNEL_NUMS - 1] = HWMON_T_LABEL | HWMON_T_INPUT,
+	0
+};
+
+static const struct hwmon_channel_info peci_cputemp_temp_channel = {
+	.type = hwmon_temp,
+	.config = peci_cputemp_temp_channel_config,
+};
+
+static const struct hwmon_channel_info *peci_cputemp_info[] = {
+	&peci_cputemp_temp_channel,
+	NULL
+};
+
+static const struct hwmon_chip_info peci_cputemp_chip_info = {
+	.ops = &peci_cputemp_ops,
+	.info = peci_cputemp_info,
+};
+
+static int peci_cputemp_probe(struct auxiliary_device *adev,
+			      const struct auxiliary_device_id *id)
+{
+	struct device *dev = &adev->dev;
+	struct peci_device *peci_dev = to_peci_device(dev->parent);
+	struct peci_cputemp *priv;
+	struct device *hwmon_dev;
+
+	priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
+	if (!priv)
+		return -ENOMEM;
+
+	priv->name = devm_kasprintf(dev, GFP_KERNEL, "peci_cputemp.cpu%d",
+				    peci_dev->info.socket_id);
+	if (!priv->name)
+		return -ENOMEM;
+
+	priv->dev = dev;
+	priv->peci_dev = peci_dev;
+	priv->gen_info = (const struct cpu_info *)id->driver_data;
+
+	/*
+	 * This is just a sanity check. Since we're using commands that are
+	 * guaranteed to be supported on a given platform, we should never see
+	 * revision lower than expected.
+	 */
+	if (peci_dev->info.peci_revision < priv->gen_info->min_peci_revision)
+		dev_warn(priv->dev,
+			 "Unexpected PECI revision %#x, some features may be unavailable\n",
+			 peci_dev->info.peci_revision);
+
+	check_resolved_cores(priv);
+
+	sensor_init(priv);
+
+	hwmon_dev = devm_hwmon_device_register_with_info(priv->dev, priv->name,
+							 priv, &peci_cputemp_chip_info, NULL);
+
+	return PTR_ERR_OR_ZERO(hwmon_dev);
+}
+
+/*
+ * RESOLVED_CORES PCI configuration register may have different location on
+ * different platforms.
+ */
+static struct resolved_cores_reg resolved_cores_reg_hsx = {
+	.bus = 1,
+	.dev = 30,
+	.func = 3,
+	.offset = 0xb4,
+};
+
+static struct resolved_cores_reg resolved_cores_reg_icx = {
+	.bus = 14,
+	.dev = 30,
+	.func = 3,
+	.offset = 0xd0,
+};
+
+static const struct cpu_info cpu_hsx = {
+	.reg		= &resolved_cores_reg_hsx,
+	.min_peci_revision = 0x33,
+};
+
+static const struct cpu_info cpu_icx = {
+	.reg		= &resolved_cores_reg_icx,
+	.min_peci_revision = 0x40,
+};
+
+static const struct auxiliary_device_id peci_cputemp_ids[] = {
+	{
+		.name = "peci_cpu.cputemp.hsx",
+		.driver_data = (kernel_ulong_t)&cpu_hsx,
+	},
+	{
+		.name = "peci_cpu.cputemp.bdx",
+		.driver_data = (kernel_ulong_t)&cpu_hsx,
+	},
+	{
+		.name = "peci_cpu.cputemp.bdxd",
+		.driver_data = (kernel_ulong_t)&cpu_hsx,
+	},
+	{
+		.name = "peci_cpu.cputemp.skx",
+		.driver_data = (kernel_ulong_t)&cpu_hsx,
+	},
+	{
+		.name = "peci_cpu.cputemp.icx",
+		.driver_data = (kernel_ulong_t)&cpu_icx,
+	},
+	{
+		.name = "peci_cpu.cputemp.icxd",
+		.driver_data = (kernel_ulong_t)&cpu_icx,
+	},
+	{ }
+};
+MODULE_DEVICE_TABLE(auxiliary, peci_cputemp_ids);
+
+static struct auxiliary_driver peci_cputemp_driver = {
+	.probe		= peci_cputemp_probe,
+	.id_table	= peci_cputemp_ids,
+};
+
+module_auxiliary_driver(peci_cputemp_driver);
+
+MODULE_AUTHOR("Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>");
+MODULE_AUTHOR("Iwona Winiarska <iwona.winiarska@intel.com>");
+MODULE_DESCRIPTION("PECI cputemp driver");
+MODULE_LICENSE("GPL");
+MODULE_IMPORT_NS(PECI_CPU);
-- 
2.31.1
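
For reference, update_temp_target() in the patch above derives Tjmax,
Tcontrol and Tthrottle from one 32-bit PCS TEMP_TARGET value. The
userspace sketch below repeats that decoding with the same bit fields
(23:16 reference temperature, 15:8 signed fan temperature margin, 29:24
Tj offset). The struct and function names are hypothetical, and results
are in millidegrees Celsius.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct sketch_temp_target {
	int32_t tjmax;		/* millidegrees Celsius */
	int32_t tcontrol;	/* millidegrees Celsius */
	int32_t tthrottle;	/* millidegrees Celsius */
};

static void sketch_decode_temp_target(uint32_t pcs, struct sketch_temp_target *out)
{
	int32_t tcontrol_margin, tthrottle_offset;

	assert(out != NULL);

	/* Bits 23:16: reference temperature (Tjmax) in degrees. */
	out->tjmax = (int32_t)((pcs >> 16) & 0xff) * 1000;

	/* Bits 15:8: Tcontrol margin in degrees, sign-extended from bit 7. */
	tcontrol_margin = (int32_t)((pcs >> 8) & 0xff);
	if (tcontrol_margin & 0x80)
		tcontrol_margin -= 0x100;
	out->tcontrol = out->tjmax - tcontrol_margin * 1000;

	/* Bits 29:24: unsigned Tthrottle offset in degrees. */
	tthrottle_offset = (int32_t)((pcs >> 24) & 0x3f);
	out->tthrottle = out->tjmax - tthrottle_offset * 1000;
}
```

As an example, a value of 0x05640a00 decodes to Tjmax = 100000,
Tcontrol = 90000 and Tthrottle = 95000.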


^ permalink raw reply	[flat|nested] 49+ messages in thread

* [PATCH v2 13/15] hwmon: peci: Add dimmtemp driver
  2021-08-03 11:31 [PATCH v2 00/15] Introduce PECI subsystem Iwona Winiarska
                   ` (11 preceding siblings ...)
  2021-08-03 11:31 ` [PATCH v2 12/15] hwmon: peci: Add cputemp driver Iwona Winiarska
@ 2021-08-03 11:31 ` Iwona Winiarska
  2021-08-03 15:39   ` Guenter Roeck
  2021-08-03 11:31 ` [PATCH v2 14/15] docs: hwmon: Document PECI drivers Iwona Winiarska
                   ` (2 subsequent siblings)
  15 siblings, 1 reply; 49+ messages in thread
From: Iwona Winiarska @ 2021-08-03 11:31 UTC (permalink / raw)
  To: linux-kernel, openbmc, Greg Kroah-Hartman
  Cc: x86, devicetree, linux-aspeed, linux-arm-kernel, linux-hwmon,
	linux-doc, Rob Herring, Joel Stanley, Andrew Jeffery,
	Jean Delvare, Guenter Roeck, Arnd Bergmann, Olof Johansson,
	Jonathan Corbet, Thomas Gleixner, Andy Lutomirski, Ingo Molnar,
	Borislav Petkov, Yazen Ghannam, Mauro Carvalho Chehab,
	Pierre-Louis Bossart, Tony Luck, Andy Shevchenko, Jae Hyun Yoo,
	Dan Williams, Randy Dunlap, Zev Weiss, David Muller,
	Iwona Winiarska

Add peci-dimmtemp driver for Temperature Sensor on DIMM readings that
are accessible via the processor PECI interface.

The main use case for the driver (and PECI interface) is out-of-band
management, where we're able to obtain thermal readings from an external
entity connected with PECI, e.g. a BMC on server platforms.

Co-developed-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
---
Note that the timeout was completely removed - we're going to probe
for detected DIMMs every 5 seconds until we reach a "stable" state of
either getting correct DIMM data or getting all -EINVAL (which
suggests that the CPU doesn't have any DIMMs).
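
The driver below numbers DIMMs linearly and splits each number into a
channel rank and a position within that channel, then picks one byte
out of the 32-bit PCS reply. A userspace sketch of that arithmetic
follows. The sketch_* names are hypothetical, and the argument checks
are an assumption of the sketch rather than something the driver does.

```c
#include <assert.h>
#include <stdint.h>

struct sketch_dimm_pos {
	int chan_rank;	/* passed as the PCS parameter */
	int dimm_order;	/* position of the DIMM within that channel */
};

/* Split a linear DIMM number using the per-platform DIMMs-per-channel limit. */
static struct sketch_dimm_pos sketch_dimm_locate(int dimm_no, int dimm_idx_max)
{
	struct sketch_dimm_pos pos;

	assert(dimm_no >= 0 && dimm_idx_max > 0);

	pos.chan_rank = dimm_no / dimm_idx_max;
	pos.dimm_order = dimm_no % dimm_idx_max;

	return pos;
}

/* Each DIMM of a channel occupies one byte of the reply, in whole degrees. */
static uint8_t sketch_dimm_temp(uint32_t reg, int dimm_order)
{
	assert(dimm_order >= 0 && dimm_order < 4);

	return (reg >> (dimm_order * 8)) & 0xff;
}
```

With three DIMMs per channel, DIMM number 7 maps to channel rank 2,
position 1, and a reply of 0x00322d28 yields 45 degrees for that
position.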

 drivers/hwmon/peci/Kconfig    |  13 +
 drivers/hwmon/peci/Makefile   |   2 +
 drivers/hwmon/peci/dimmtemp.c | 614 ++++++++++++++++++++++++++++++++++
 3 files changed, 629 insertions(+)
 create mode 100644 drivers/hwmon/peci/dimmtemp.c

diff --git a/drivers/hwmon/peci/Kconfig b/drivers/hwmon/peci/Kconfig
index e10eed68d70a..9d32a57badfe 100644
--- a/drivers/hwmon/peci/Kconfig
+++ b/drivers/hwmon/peci/Kconfig
@@ -14,5 +14,18 @@ config SENSORS_PECI_CPUTEMP
 	  This driver can also be built as a module. If so, the module
 	  will be called peci-cputemp.
 
+config SENSORS_PECI_DIMMTEMP
+	tristate "PECI DIMM temperature monitoring client"
+	depends on PECI
+	select SENSORS_PECI
+	select PECI_CPU
+	help
+	  If you say yes here you get support for the generic Intel PECI hwmon
+	  driver which provides Temperature Sensor on DIMM readings that are
+	  accessible via the processor PECI interface.
+
+	  This driver can also be built as a module. If so, the module
+	  will be called peci-dimmtemp.
+
 config SENSORS_PECI
 	tristate
diff --git a/drivers/hwmon/peci/Makefile b/drivers/hwmon/peci/Makefile
index e8a0ada5ab1f..191cfa0227f3 100644
--- a/drivers/hwmon/peci/Makefile
+++ b/drivers/hwmon/peci/Makefile
@@ -1,5 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0-only
 
 peci-cputemp-y := cputemp.o
+peci-dimmtemp-y := dimmtemp.o
 
 obj-$(CONFIG_SENSORS_PECI_CPUTEMP)	+= peci-cputemp.o
+obj-$(CONFIG_SENSORS_PECI_DIMMTEMP)	+= peci-dimmtemp.o
diff --git a/drivers/hwmon/peci/dimmtemp.c b/drivers/hwmon/peci/dimmtemp.c
new file mode 100644
index 000000000000..6264c29bb6c0
--- /dev/null
+++ b/drivers/hwmon/peci/dimmtemp.c
@@ -0,0 +1,614 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright (c) 2018-2021 Intel Corporation
+
+#include <linux/auxiliary_bus.h>
+#include <linux/bitfield.h>
+#include <linux/bitops.h>
+#include <linux/hwmon.h>
+#include <linux/jiffies.h>
+#include <linux/module.h>
+#include <linux/peci.h>
+#include <linux/peci-cpu.h>
+#include <linux/units.h>
+#include <linux/workqueue.h>
+#include <linux/x86/intel-family.h>
+
+#include "common.h"
+
+#define DIMM_MASK_CHECK_DELAY_JIFFIES	msecs_to_jiffies(5000)
+
+/* Max number of channel ranks and DIMM index per channel */
+#define CHAN_RANK_MAX_ON_HSX	8
+#define DIMM_IDX_MAX_ON_HSX	3
+#define CHAN_RANK_MAX_ON_BDX	4
+#define DIMM_IDX_MAX_ON_BDX	3
+#define CHAN_RANK_MAX_ON_BDXD	2
+#define DIMM_IDX_MAX_ON_BDXD	2
+#define CHAN_RANK_MAX_ON_SKX	6
+#define DIMM_IDX_MAX_ON_SKX	2
+#define CHAN_RANK_MAX_ON_ICX	8
+#define DIMM_IDX_MAX_ON_ICX	2
+#define CHAN_RANK_MAX_ON_ICXD	4
+#define DIMM_IDX_MAX_ON_ICXD	2
+
+#define CHAN_RANK_MAX		CHAN_RANK_MAX_ON_HSX
+#define DIMM_IDX_MAX		DIMM_IDX_MAX_ON_HSX
+#define DIMM_NUMS_MAX		(CHAN_RANK_MAX * DIMM_IDX_MAX)
+
+#define CPU_SEG_MASK		GENMASK(23, 16)
+#define GET_CPU_SEG(x)		(((x) & CPU_SEG_MASK) >> 16)
+#define CPU_BUS_MASK		GENMASK(7, 0)
+#define GET_CPU_BUS(x)		((x) & CPU_BUS_MASK)
+
+#define DIMM_TEMP_MAX		GENMASK(15, 8)
+#define DIMM_TEMP_CRIT		GENMASK(23, 16)
+#define GET_TEMP_MAX(x)		(((x) & DIMM_TEMP_MAX) >> 8)
+#define GET_TEMP_CRIT(x)	(((x) & DIMM_TEMP_CRIT) >> 16)
+
+struct peci_dimmtemp;
+
+struct dimm_info {
+	int chan_rank_max;
+	int dimm_idx_max;
+	u8 min_peci_revision;
+	int (*read_thresholds)(struct peci_dimmtemp *priv, int dimm_order,
+			       int chan_rank, u32 *data);
+};
+
+struct peci_dimm_thresholds {
+	long temp_max;
+	long temp_crit;
+	struct peci_sensor_state state;
+};
+
+enum peci_dimm_threshold_type {
+	temp_max_type,
+	temp_crit_type,
+};
+
+struct peci_dimmtemp {
+	struct peci_device *peci_dev;
+	struct device *dev;
+	const char *name;
+	const struct dimm_info *gen_info;
+	struct delayed_work detect_work;
+	struct {
+		struct peci_sensor_data temp;
+		struct peci_dimm_thresholds thresholds;
+	} dimm[DIMM_NUMS_MAX];
+	char **dimmtemp_label;
+	DECLARE_BITMAP(dimm_mask, DIMM_NUMS_MAX);
+};
+
+static u8 __dimm_temp(u32 reg, int dimm_order)
+{
+	return (reg >> (dimm_order * 8)) & 0xff;
+}
+
+static int get_dimm_temp(struct peci_dimmtemp *priv, int dimm_no, long *val)
+{
+	int dimm_order = dimm_no % priv->gen_info->dimm_idx_max;
+	int chan_rank = dimm_no / priv->gen_info->dimm_idx_max;
+	u32 data;
+	int ret;
+
+	mutex_lock(&priv->dimm[dimm_no].temp.state.lock);
+	if (!peci_sensor_need_update(&priv->dimm[dimm_no].temp.state))
+		goto skip_update;
+
+	ret = peci_pcs_read(priv->peci_dev, PECI_PCS_DDR_DIMM_TEMP, chan_rank, &data);
+	if (ret) {
+		mutex_unlock(&priv->dimm[dimm_no].temp.state.lock);
+		return ret;
+	}
+
+	priv->dimm[dimm_no].temp.value = __dimm_temp(data, dimm_order) * MILLIDEGREE_PER_DEGREE;
+
+	peci_sensor_mark_updated(&priv->dimm[dimm_no].temp.state);
+
+skip_update:
+	*val = priv->dimm[dimm_no].temp.value;
+	mutex_unlock(&priv->dimm[dimm_no].temp.state.lock);
+	return 0;
+}
+
+static int update_thresholds(struct peci_dimmtemp *priv, int dimm_no)
+{
+	int dimm_order = dimm_no % priv->gen_info->dimm_idx_max;
+	int chan_rank = dimm_no / priv->gen_info->dimm_idx_max;
+	u32 data;
+	int ret;
+
+	if (!peci_sensor_need_update(&priv->dimm[dimm_no].thresholds.state))
+		return 0;
+
+	ret = priv->gen_info->read_thresholds(priv, dimm_order, chan_rank, &data);
+	if (ret == -ENODATA) /* Use default or previous value */
+		return 0;
+	if (ret)
+		return ret;
+
+	priv->dimm[dimm_no].thresholds.temp_max = GET_TEMP_MAX(data) * MILLIDEGREE_PER_DEGREE;
+	priv->dimm[dimm_no].thresholds.temp_crit = GET_TEMP_CRIT(data) * MILLIDEGREE_PER_DEGREE;
+
+	peci_sensor_mark_updated(&priv->dimm[dimm_no].thresholds.state);
+
+	return 0;
+}
+
+static int get_dimm_thresholds(struct peci_dimmtemp *priv, enum peci_dimm_threshold_type type,
+			       int dimm_no, long *val)
+{
+	int ret;
+
+	mutex_lock(&priv->dimm[dimm_no].thresholds.state.lock);
+	ret = update_thresholds(priv, dimm_no);
+	if (ret)
+		goto unlock;
+
+	switch (type) {
+	case temp_max_type:
+		*val = priv->dimm[dimm_no].thresholds.temp_max;
+		break;
+	case temp_crit_type:
+		*val = priv->dimm[dimm_no].thresholds.temp_crit;
+		break;
+	default:
+		ret = -EOPNOTSUPP;
+		break;
+	}
+unlock:
+	mutex_unlock(&priv->dimm[dimm_no].thresholds.state.lock);
+
+	return ret;
+}
+
+static int dimmtemp_read_string(struct device *dev,
+				enum hwmon_sensor_types type,
+				u32 attr, int channel, const char **str)
+{
+	struct peci_dimmtemp *priv = dev_get_drvdata(dev);
+
+	if (attr != hwmon_temp_label)
+		return -EOPNOTSUPP;
+
+	*str = (const char *)priv->dimmtemp_label[channel];
+
+	return 0;
+}
+
+static int dimmtemp_read(struct device *dev, enum hwmon_sensor_types type,
+			 u32 attr, int channel, long *val)
+{
+	struct peci_dimmtemp *priv = dev_get_drvdata(dev);
+
+	switch (attr) {
+	case hwmon_temp_input:
+		return get_dimm_temp(priv, channel, val);
+	case hwmon_temp_max:
+		return get_dimm_thresholds(priv, temp_max_type, channel, val);
+	case hwmon_temp_crit:
+		return get_dimm_thresholds(priv, temp_crit_type, channel, val);
+	default:
+		break;
+	}
+
+	return -EOPNOTSUPP;
+}
+
+static umode_t dimmtemp_is_visible(const void *data, enum hwmon_sensor_types type,
+				   u32 attr, int channel)
+{
+	const struct peci_dimmtemp *priv = data;
+
+	if (test_bit(channel, priv->dimm_mask))
+		return 0444;
+
+	return 0;
+}
+
+static const struct hwmon_ops peci_dimmtemp_ops = {
+	.is_visible = dimmtemp_is_visible,
+	.read_string = dimmtemp_read_string,
+	.read = dimmtemp_read,
+};
+
+static int check_populated_dimms(struct peci_dimmtemp *priv)
+{
+	int chan_rank_max = priv->gen_info->chan_rank_max;
+	int dimm_idx_max = priv->gen_info->dimm_idx_max;
+	u32 chan_rank_empty = 0;
+	u64 dimm_mask = 0;
+	int chan_rank, dimm_idx, ret;
+	u32 pcs;
+
+	BUILD_BUG_ON(CHAN_RANK_MAX > 32);
+	BUILD_BUG_ON(DIMM_NUMS_MAX > 64);
+	if (chan_rank_max * dimm_idx_max > DIMM_NUMS_MAX) {
+		WARN_ONCE(1, "Unsupported number of DIMMs");
+		return -EINVAL;
+	}
+
+	for (chan_rank = 0; chan_rank < chan_rank_max; chan_rank++) {
+		ret = peci_pcs_read(priv->peci_dev, PECI_PCS_DDR_DIMM_TEMP, chan_rank, &pcs);
+		if (ret) {
+			/*
+			 * Overall, we expect either success or -EINVAL in
+			 * order to determine whether DIMM is populated or not.
+			 * For anything else - we fall back to deferring the
+			 * detection to be performed at a later point in time.
+			 */
+			if (ret == -EINVAL) {
+				chan_rank_empty |= BIT(chan_rank);
+				continue;
+			}
+
+			return -EAGAIN;
+		}
+
+		for (dimm_idx = 0; dimm_idx < dimm_idx_max; dimm_idx++)
+			if (__dimm_temp(pcs, dimm_idx))
+				dimm_mask |= BIT(chan_rank * dimm_idx_max + dimm_idx);
+	}
+
+	/* If we got all -EINVALs, it means that the CPU doesn't have any DIMMs. */
+	if (chan_rank_empty == GENMASK(chan_rank_max - 1, 0))
+		return -ENODEV;
+
+	/*
+	 * It's possible that memory training is not done yet. In this case we
+	 * defer the detection to be performed at a later point in time.
+	 */
+	if (!dimm_mask)
+		return -EAGAIN;
+
+	dev_dbg(priv->dev, "Scanned populated DIMMs: %#llx\n", dimm_mask);
+
+	bitmap_from_u64(priv->dimm_mask, dimm_mask);
+
+	return 0;
+}
+
+static int create_dimm_temp_label(struct peci_dimmtemp *priv, int chan)
+{
+	int rank = chan / priv->gen_info->dimm_idx_max;
+	int idx = chan % priv->gen_info->dimm_idx_max;
+
+	priv->dimmtemp_label[chan] = devm_kasprintf(priv->dev, GFP_KERNEL,
+						    "DIMM %c%d", 'A' + rank,
+						    idx + 1);
+	if (!priv->dimmtemp_label[chan])
+		return -ENOMEM;
+
+	return 0;
+}
+
+static const u32 peci_dimmtemp_temp_channel_config[] = {
+	[0 ... DIMM_NUMS_MAX - 1] = HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_CRIT,
+	0
+};
+
+static const struct hwmon_channel_info peci_dimmtemp_temp_channel = {
+	.type = hwmon_temp,
+	.config = peci_dimmtemp_temp_channel_config,
+};
+
+static const struct hwmon_channel_info *peci_dimmtemp_temp_info[] = {
+	&peci_dimmtemp_temp_channel,
+	NULL
+};
+
+static const struct hwmon_chip_info peci_dimmtemp_chip_info = {
+	.ops = &peci_dimmtemp_ops,
+	.info = peci_dimmtemp_temp_info,
+};
+
+static int create_dimm_temp_info(struct peci_dimmtemp *priv)
+{
+	int ret, i, channels;
+	struct device *dev;
+
+	/*
+	 * We expect to either find populated DIMMs and carry on with creating
+	 * sensors, or find out that there are no DIMMs populated.
+	 * Any other result means that the platform has not yet reached a state
+	 * that allows checking DIMM state, so we retry later on.
+	 */
+	ret = check_populated_dimms(priv);
+	if (ret == -ENODEV) {
+		dev_dbg(priv->dev, "No DIMMs found\n");
+		return 0;
+	} else if (ret) {
+		schedule_delayed_work(&priv->detect_work, DIMM_MASK_CHECK_DELAY_JIFFIES);
+		dev_dbg(priv->dev, "Deferred populating DIMM temp info\n");
+		return ret;
+	}
+
+	channels = priv->gen_info->chan_rank_max * priv->gen_info->dimm_idx_max;
+
+	priv->dimmtemp_label = devm_kzalloc(priv->dev, channels * sizeof(char *), GFP_KERNEL);
+	if (!priv->dimmtemp_label)
+		return -ENOMEM;
+
+	for_each_set_bit(i, priv->dimm_mask, DIMM_NUMS_MAX) {
+		ret = create_dimm_temp_label(priv, i);
+		if (ret)
+			return ret;
+		mutex_init(&priv->dimm[i].thresholds.state.lock);
+		mutex_init(&priv->dimm[i].temp.state.lock);
+	}
+
+	dev = devm_hwmon_device_register_with_info(priv->dev, priv->name, priv,
+						   &peci_dimmtemp_chip_info, NULL);
+	if (IS_ERR(dev)) {
+		dev_err(priv->dev, "Failed to register hwmon device\n");
+		return PTR_ERR(dev);
+	}
+
+	dev_dbg(priv->dev, "%s: sensor '%s'\n", dev_name(dev), priv->name);
+
+	return 0;
+}
+
+static void create_dimm_temp_info_delayed(struct work_struct *work)
+{
+	struct peci_dimmtemp *priv = container_of(to_delayed_work(work),
+						  struct peci_dimmtemp,
+						  detect_work);
+	int ret;
+
+	ret = create_dimm_temp_info(priv);
+	if (ret && ret != -EAGAIN)
+		dev_err(priv->dev, "Failed to populate DIMM temp info\n");
+}
+
+static void remove_delayed_work(void *_priv)
+{
+	struct peci_dimmtemp *priv = _priv;
+
+	cancel_delayed_work_sync(&priv->detect_work);
+}
+
+static int peci_dimmtemp_probe(struct auxiliary_device *adev, const struct auxiliary_device_id *id)
+{
+	struct device *dev = &adev->dev;
+	struct peci_device *peci_dev = to_peci_device(dev->parent);
+	struct peci_dimmtemp *priv;
+	int ret;
+
+	priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
+	if (!priv)
+		return -ENOMEM;
+
+	priv->name = devm_kasprintf(dev, GFP_KERNEL, "peci_dimmtemp.cpu%d",
+				    peci_dev->info.socket_id);
+	if (!priv->name)
+		return -ENOMEM;
+
+	priv->dev = dev;
+	priv->peci_dev = peci_dev;
+	priv->gen_info = (const struct dimm_info *)id->driver_data;
+
+	/*
+	 * This is just a sanity check. Since we're using commands that are
+	 * guaranteed to be supported on a given platform, we should never see
+	 * revision lower than expected.
+	 */
+	if (peci_dev->info.peci_revision < priv->gen_info->min_peci_revision)
+		dev_warn(priv->dev,
+			 "Unexpected PECI revision %#x, some features may be unavailable\n",
+			 peci_dev->info.peci_revision);
+
+	INIT_DELAYED_WORK(&priv->detect_work, create_dimm_temp_info_delayed);
+
+	ret = devm_add_action_or_reset(priv->dev, remove_delayed_work, priv);
+	if (ret)
+		return ret;
+
+	ret = create_dimm_temp_info(priv);
+	if (ret && ret != -EAGAIN) {
+		dev_err(dev, "Failed to populate DIMM temp info\n");
+		return ret;
+	}
+
+	return 0;
+}
+
+static int
+read_thresholds_hsx(struct peci_dimmtemp *priv, int dimm_order, int chan_rank, u32 *data)
+{
+	u8 dev, func;
+	u16 reg;
+	int ret;
+
+	/*
+	 * Device 20, Function 0: IMC 0 channel 0 -> rank 0
+	 * Device 20, Function 1: IMC 0 channel 1 -> rank 1
+	 * Device 21, Function 0: IMC 0 channel 2 -> rank 2
+	 * Device 21, Function 1: IMC 0 channel 3 -> rank 3
+	 * Device 23, Function 0: IMC 1 channel 0 -> rank 4
+	 * Device 23, Function 1: IMC 1 channel 1 -> rank 5
+	 * Device 24, Function 0: IMC 1 channel 2 -> rank 6
+	 * Device 24, Function 1: IMC 1 channel 3 -> rank 7
+	 */
+	dev = 20 + chan_rank / 2 + chan_rank / 4;
+	func = chan_rank % 2;
+	reg = 0x120 + dimm_order * 4;
+
+	ret = peci_pci_local_read(priv->peci_dev, 1, dev, func, reg, data);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+static int
+read_thresholds_bdxd(struct peci_dimmtemp *priv, int dimm_order, int chan_rank, u32 *data)
+{
+	u8 dev, func;
+	u16 reg;
+	int ret;
+
+	/*
+	 * Device 10, Function 2: IMC 0 channel 0 -> rank 0
+	 * Device 10, Function 6: IMC 0 channel 1 -> rank 1
+	 * Device 12, Function 2: IMC 1 channel 0 -> rank 2
+	 * Device 12, Function 6: IMC 1 channel 1 -> rank 3
+	 */
+	dev = 10 + chan_rank / 2 * 2;
+	func = (chan_rank % 2) ? 6 : 2;
+	reg = 0x120 + dimm_order * 4;
+
+	ret = peci_pci_local_read(priv->peci_dev, 2, dev, func, reg, data);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+static int
+read_thresholds_skx(struct peci_dimmtemp *priv, int dimm_order, int chan_rank, u32 *data)
+{
+	u8 dev, func;
+	u16 reg;
+	int ret;
+
+	/*
+	 * Device 10, Function 2: IMC 0 channel 0 -> rank 0
+	 * Device 10, Function 6: IMC 0 channel 1 -> rank 1
+	 * Device 11, Function 2: IMC 0 channel 2 -> rank 2
+	 * Device 12, Function 2: IMC 1 channel 0 -> rank 3
+	 * Device 12, Function 6: IMC 1 channel 1 -> rank 4
+	 * Device 13, Function 2: IMC 1 channel 2 -> rank 5
+	 */
+	dev = 10 + chan_rank / 3 * 2 + (chan_rank % 3 == 2 ? 1 : 0);
+	func = chan_rank % 3 == 1 ? 6 : 2;
+	reg = 0x120 + dimm_order * 4;
+
+	ret = peci_pci_local_read(priv->peci_dev, 2, dev, func, reg, data);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+static int
+read_thresholds_icx(struct peci_dimmtemp *priv, int dimm_order, int chan_rank, u32 *data)
+{
+	u32 reg_val;
+	u64 offset;
+	int ret;
+	u8 dev;
+
+	ret = peci_ep_pci_local_read(priv->peci_dev, 0, 13, 0, 2, 0xd4, &reg_val);
+	if (ret || !(reg_val & BIT(31)))
+		return -ENODATA; /* Use default or previous value */
+
+	ret = peci_ep_pci_local_read(priv->peci_dev, 0, 13, 0, 2, 0xd0, &reg_val);
+	if (ret)
+		return -ENODATA; /* Use default or previous value */
+
+	/*
+	 * Device 26, Offset 224e0: IMC 0 channel 0 -> rank 0
+	 * Device 26, Offset 264e0: IMC 0 channel 1 -> rank 1
+	 * Device 27, Offset 224e0: IMC 1 channel 0 -> rank 2
+	 * Device 27, Offset 264e0: IMC 1 channel 1 -> rank 3
+	 * Device 28, Offset 224e0: IMC 2 channel 0 -> rank 4
+	 * Device 28, Offset 264e0: IMC 2 channel 1 -> rank 5
+	 * Device 29, Offset 224e0: IMC 3 channel 0 -> rank 6
+	 * Device 29, Offset 264e0: IMC 3 channel 1 -> rank 7
+	 */
+	dev = 26 + chan_rank / 2;
+	offset = 0x224e0 + dimm_order * 4 + (chan_rank % 2) * 0x4000;
+
+	ret = peci_mmio_read(priv->peci_dev, 0, GET_CPU_SEG(reg_val), GET_CPU_BUS(reg_val),
+			     dev, 0, offset, data);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+static const struct dimm_info dimm_hsx = {
+	.chan_rank_max	= CHAN_RANK_MAX_ON_HSX,
+	.dimm_idx_max	= DIMM_IDX_MAX_ON_HSX,
+	.min_peci_revision = 0x33,
+	.read_thresholds = &read_thresholds_hsx,
+};
+
+static const struct dimm_info dimm_bdx = {
+	.chan_rank_max	= CHAN_RANK_MAX_ON_BDX,
+	.dimm_idx_max	= DIMM_IDX_MAX_ON_BDX,
+	.min_peci_revision = 0x33,
+	.read_thresholds = &read_thresholds_hsx,
+};
+
+static const struct dimm_info dimm_bdxd = {
+	.chan_rank_max	= CHAN_RANK_MAX_ON_BDXD,
+	.dimm_idx_max	= DIMM_IDX_MAX_ON_BDXD,
+	.min_peci_revision = 0x33,
+	.read_thresholds = &read_thresholds_bdxd,
+};
+
+static const struct dimm_info dimm_skx = {
+	.chan_rank_max	= CHAN_RANK_MAX_ON_SKX,
+	.dimm_idx_max	= DIMM_IDX_MAX_ON_SKX,
+	.min_peci_revision = 0x33,
+	.read_thresholds = &read_thresholds_skx,
+};
+
+static const struct dimm_info dimm_icx = {
+	.chan_rank_max	= CHAN_RANK_MAX_ON_ICX,
+	.dimm_idx_max	= DIMM_IDX_MAX_ON_ICX,
+	.min_peci_revision = 0x40,
+	.read_thresholds = &read_thresholds_icx,
+};
+
+static const struct dimm_info dimm_icxd = {
+	.chan_rank_max	= CHAN_RANK_MAX_ON_ICXD,
+	.dimm_idx_max	= DIMM_IDX_MAX_ON_ICXD,
+	.min_peci_revision = 0x40,
+	.read_thresholds = &read_thresholds_icx,
+};
+
+static const struct auxiliary_device_id peci_dimmtemp_ids[] = {
+	{
+		.name = "peci_cpu.dimmtemp.hsx",
+		.driver_data = (kernel_ulong_t)&dimm_hsx,
+	},
+	{
+		.name = "peci_cpu.dimmtemp.bdx",
+		.driver_data = (kernel_ulong_t)&dimm_bdx,
+	},
+	{
+		.name = "peci_cpu.dimmtemp.bdxd",
+		.driver_data = (kernel_ulong_t)&dimm_bdxd,
+	},
+	{
+		.name = "peci_cpu.dimmtemp.skx",
+		.driver_data = (kernel_ulong_t)&dimm_skx,
+	},
+	{
+		.name = "peci_cpu.dimmtemp.icx",
+		.driver_data = (kernel_ulong_t)&dimm_icx,
+	},
+	{
+		.name = "peci_cpu.dimmtemp.icxd",
+		.driver_data = (kernel_ulong_t)&dimm_icxd,
+	},
+	{ }
+};
+MODULE_DEVICE_TABLE(auxiliary, peci_dimmtemp_ids);
+
+static struct auxiliary_driver peci_dimmtemp_driver = {
+	.probe		= peci_dimmtemp_probe,
+	.id_table	= peci_dimmtemp_ids,
+};
+
+module_auxiliary_driver(peci_dimmtemp_driver);
+
+MODULE_AUTHOR("Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>");
+MODULE_AUTHOR("Iwona Winiarska <iwona.winiarska@intel.com>");
+MODULE_DESCRIPTION("PECI dimmtemp driver");
+MODULE_LICENSE("GPL");
+MODULE_IMPORT_NS(PECI_CPU);
-- 
2.31.1
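
Annotation: the channel arithmetic in the driver above can be checked in
userspace. The following is a minimal sketch, not part of the patch; the helper
names (dimm_split, dimm_label, dimm_temp_byte, hsx_dev_func, skx_dev_func) are
hypothetical, and only the formulas are copied from get_dimm_temp(),
create_dimm_temp_label(), __dimm_temp(), read_thresholds_hsx() and
read_thresholds_skx().

```c
#include <assert.h>
#include <stdio.h>

/* Split a flat hwmon channel number the same way get_dimm_temp() does. */
static void dimm_split(int dimm_no, int dimm_idx_max,
		       int *chan_rank, int *dimm_order)
{
	assert(dimm_idx_max > 0);
	*chan_rank = dimm_no / dimm_idx_max;
	*dimm_order = dimm_no % dimm_idx_max;
}

/* Build the "DIMM %c%d" label used by create_dimm_temp_label(). */
static void dimm_label(int dimm_no, int dimm_idx_max, char *buf, size_t len)
{
	int rank, idx;

	dimm_split(dimm_no, dimm_idx_max, &rank, &idx);
	snprintf(buf, len, "DIMM %c%d", 'A' + rank, idx + 1);
}

/* Extract one DIMM's byte from a 32-bit DIMM temperature word. */
static unsigned int dimm_temp_byte(unsigned int reg, int dimm_order)
{
	return (reg >> (dimm_order * 8)) & 0xff;
}

/* Device/function for a channel rank, formula from read_thresholds_hsx(). */
static void hsx_dev_func(int chan_rank, int *dev, int *func)
{
	*dev = 20 + chan_rank / 2 + chan_rank / 4;
	*func = chan_rank % 2;
}

/* Device/function for a channel rank, formula from read_thresholds_skx(). */
static void skx_dev_func(int chan_rank, int *dev, int *func)
{
	*dev = 10 + chan_rank / 3 * 2 + (chan_rank % 3 == 2 ? 1 : 0);
	*func = chan_rank % 3 == 1 ? 6 : 2;
}
```

With dimm_idx_max = 2 (the SKX value), channel 5 splits into rank 2 and
index 1, which gives the label "DIMM C2". The two dev/func helpers reproduce
the tables in the driver comments.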


^ permalink raw reply	[flat|nested] 49+ messages in thread

* [PATCH v2 14/15] docs: hwmon: Document PECI drivers
  2021-08-03 11:31 [PATCH v2 00/15] Introduce PECI subsystem Iwona Winiarska
                   ` (12 preceding siblings ...)
  2021-08-03 11:31 ` [PATCH v2 13/15] hwmon: peci: Add dimmtemp driver Iwona Winiarska
@ 2021-08-03 11:31 ` Iwona Winiarska
  2021-08-03 11:31 ` [PATCH v2 15/15] docs: Add PECI documentation Iwona Winiarska
  2021-08-05 12:17 ` [PATCH v2 00/15] Introduce PECI subsystem Greg Kroah-Hartman
  15 siblings, 0 replies; 49+ messages in thread
From: Iwona Winiarska @ 2021-08-03 11:31 UTC (permalink / raw)
  To: linux-kernel, openbmc, Greg Kroah-Hartman
  Cc: x86, devicetree, linux-aspeed, linux-arm-kernel, linux-hwmon,
	linux-doc, Rob Herring, Joel Stanley, Andrew Jeffery,
	Jean Delvare, Guenter Roeck, Arnd Bergmann, Olof Johansson,
	Jonathan Corbet, Thomas Gleixner, Andy Lutomirski, Ingo Molnar,
	Borislav Petkov, Yazen Ghannam, Mauro Carvalho Chehab,
	Pierre-Louis Bossart, Tony Luck, Andy Shevchenko, Jae Hyun Yoo,
	Dan Williams, Randy Dunlap, Zev Weiss, David Muller,
	Iwona Winiarska

From: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>

Add documentation for peci-cputemp driver that provides DTS thermal
readings for CPU packages and CPU cores, and peci-dimmtemp driver that
provides Temperature Sensor on DIMM readings.

Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
Co-developed-by: Iwona Winiarska <iwona.winiarska@intel.com>
Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
---
 Documentation/hwmon/index.rst         |  2 +
 Documentation/hwmon/peci-cputemp.rst  | 90 +++++++++++++++++++++++++++
 Documentation/hwmon/peci-dimmtemp.rst | 57 +++++++++++++++++
 MAINTAINERS                           |  2 +
 4 files changed, 151 insertions(+)
 create mode 100644 Documentation/hwmon/peci-cputemp.rst
 create mode 100644 Documentation/hwmon/peci-dimmtemp.rst

diff --git a/Documentation/hwmon/index.rst b/Documentation/hwmon/index.rst
index bc01601ea81a..cc76b5b3f791 100644
--- a/Documentation/hwmon/index.rst
+++ b/Documentation/hwmon/index.rst
@@ -154,6 +154,8 @@ Hardware Monitoring Kernel Drivers
    pcf8591
    pim4328
    pm6764tr
+   peci-cputemp
+   peci-dimmtemp
    pmbus
    powr1220
    pxe1610
diff --git a/Documentation/hwmon/peci-cputemp.rst b/Documentation/hwmon/peci-cputemp.rst
new file mode 100644
index 000000000000..fe0422248dc5
--- /dev/null
+++ b/Documentation/hwmon/peci-cputemp.rst
@@ -0,0 +1,90 @@
+.. SPDX-License-Identifier: GPL-2.0-only
+
+Kernel driver peci-cputemp
+==========================
+
+Supported chips:
+	One of Intel server CPUs listed below which is connected to a PECI bus.
+		* Intel Xeon E5/E7 v3 server processors
+			Intel Xeon E5-14xx v3 family
+			Intel Xeon E5-24xx v3 family
+			Intel Xeon E5-16xx v3 family
+			Intel Xeon E5-26xx v3 family
+			Intel Xeon E5-46xx v3 family
+			Intel Xeon E7-48xx v3 family
+			Intel Xeon E7-88xx v3 family
+		* Intel Xeon E5/E7 v4 server processors
+			Intel Xeon E5-16xx v4 family
+			Intel Xeon E5-26xx v4 family
+			Intel Xeon E5-46xx v4 family
+			Intel Xeon E7-48xx v4 family
+			Intel Xeon E7-88xx v4 family
+		* Intel Xeon Scalable server processors
+			Intel Xeon D family
+			Intel Xeon Bronze family
+			Intel Xeon Silver family
+			Intel Xeon Gold family
+			Intel Xeon Platinum family
+
+	Datasheet: Available from http://www.intel.com/design/literature.htm
+
+Author: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
+
+Description
+-----------
+
+This driver implements a generic PECI hwmon feature which provides Digital
+Thermal Sensor (DTS) thermal readings of the CPU package and CPU cores that are
+accessible via the processor PECI interface.
+
+All temperature values are given in millidegree Celsius and will be measurable
+only when the target CPU is powered on.
+
+Sysfs interface
+-------------------
+
+======================= =======================================================
+temp1_label		"Die"
+temp1_input		Provides current die temperature of the CPU package.
+temp1_max		Provides thermal control temperature of the CPU package
+			which is also known as Tcontrol.
+temp1_crit		Provides shutdown temperature of the CPU package which
+			is also known as the maximum processor junction
+			temperature, Tjmax or Tprochot.
+temp1_crit_hyst		Provides the hysteresis value from Tcontrol to Tjmax of
+			the CPU package.
+
+temp2_label		"DTS"
+temp2_input		Provides current temperature of the CPU package scaled
+			to match DTS thermal profile.
+temp2_max		Provides thermal control temperature of the CPU package
+			which is also known as Tcontrol.
+temp2_crit		Provides shutdown temperature of the CPU package which
+			is also known as the maximum processor junction
+			temperature, Tjmax or Tprochot.
+temp2_crit_hyst		Provides the hysteresis value from Tcontrol to Tjmax of
+			the CPU package.
+
+temp3_label		"Tcontrol"
+temp3_input		Provides current Tcontrol temperature of the CPU
+			package which is also known as Fan Temperature target.
+			Indicates the relative value from thermal monitor trip
+			temperature at which fans should be engaged.
+temp3_crit		Provides Tcontrol critical value of the CPU package
+			which is the same as Tjmax.
+
+temp4_label		"Tthrottle"
+temp4_input		Provides current Tthrottle temperature of the CPU
+			package, used as the throttling temperature. If this
+			value is allowed and lower than Tjmax, throttling will
+			occur and be reported below Tjmax.
+
+temp5_label		"Tjmax"
+temp5_input		Provides the maximum junction temperature, Tjmax of the
+			CPU package.
+
+temp[6-N]_label		Provides string "Core X", where X is resolved core
+			number.
+temp[6-N]_input		Provides current temperature of each core.
+
+======================= =======================================================
diff --git a/Documentation/hwmon/peci-dimmtemp.rst b/Documentation/hwmon/peci-dimmtemp.rst
new file mode 100644
index 000000000000..e562aed620de
--- /dev/null
+++ b/Documentation/hwmon/peci-dimmtemp.rst
@@ -0,0 +1,57 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+Kernel driver peci-dimmtemp
+===========================
+
+Supported chips:
+	One of Intel server CPUs listed below which is connected to a PECI bus.
+		* Intel Xeon E5/E7 v3 server processors
+			Intel Xeon E5-14xx v3 family
+			Intel Xeon E5-24xx v3 family
+			Intel Xeon E5-16xx v3 family
+			Intel Xeon E5-26xx v3 family
+			Intel Xeon E5-46xx v3 family
+			Intel Xeon E7-48xx v3 family
+			Intel Xeon E7-88xx v3 family
+		* Intel Xeon E5/E7 v4 server processors
+			Intel Xeon E5-16xx v4 family
+			Intel Xeon E5-26xx v4 family
+			Intel Xeon E5-46xx v4 family
+			Intel Xeon E7-48xx v4 family
+			Intel Xeon E7-88xx v4 family
+		* Intel Xeon Scalable server processors
+			Intel Xeon D family
+			Intel Xeon Bronze family
+			Intel Xeon Silver family
+			Intel Xeon Gold family
+			Intel Xeon Platinum family
+
+	Datasheet: Available from http://www.intel.com/design/literature.htm
+
+Author: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
+
+Description
+-----------
+
+This driver implements a generic PECI hwmon feature which provides Temperature
+Sensor on DIMM readings that are accessible via the processor PECI interface.
+
+All temperature values are given in millidegree Celsius and will be measurable
+only when the target CPU is powered on.
+
+Sysfs interface
+-------------------
+
+======================= =======================================================
+
+temp[N]_label		Provides string "DIMM CI", where C is DIMM channel and
+			I is DIMM index of the populated DIMM.
+temp[N]_input		Provides current temperature of the populated DIMM.
+temp[N]_max		Provides thermal control temperature of the DIMM.
+temp[N]_crit		Provides shutdown temperature of the DIMM.
+
+======================= =======================================================
+
+Note:
+	DIMM temperature attributes will appear when the client CPU's BIOS
+	completes memory training and testing.
diff --git a/MAINTAINERS b/MAINTAINERS
index e36b5c0824e3..4861a214d9fe 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -14517,6 +14517,8 @@ M:	Iwona Winiarska <iwona.winiarska@intel.com>
 R:	Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
 L:	linux-hwmon@vger.kernel.org
 S:	Supported
+F:	Documentation/hwmon/peci-cputemp.rst
+F:	Documentation/hwmon/peci-dimmtemp.rst
 F:	drivers/hwmon/peci/
 
 PECI SUBSYSTEM
-- 
2.31.1
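
Annotation: the attributes documented above report millidegrees Celsius as
decimal text. The following is a minimal userspace sketch, not part of the
patch, of how a BMC daemon might read one of them. The function names
parse_hwmon_temp and read_hwmon_temp are hypothetical, and the attribute path
is supplied by the caller because the hwmon device number depends on the
system.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Parse the text of a tempN_input/tempN_max/tempN_crit attribute.
 * Returns 0 and stores millidegrees Celsius on success, -1 otherwise.
 */
static int parse_hwmon_temp(const char *text, long *millideg)
{
	char *end;
	long v;

	assert(text && millideg);

	errno = 0;
	v = strtol(text, &end, 10);
	if (errno || end == text)
		return -1;
	/* sysfs values normally end with a newline; reject trailing junk */
	if (*end != '\n' && *end != '\0')
		return -1;

	*millideg = v;
	return 0;
}

/* Read one attribute file; 'path' is chosen by the caller. */
static int read_hwmon_temp(const char *path, long *millideg)
{
	char buf[32];
	FILE *f = fopen(path, "r");
	int ret = -1;

	if (!f)
		return -1;
	if (fgets(buf, sizeof(buf), f))
		ret = parse_hwmon_temp(buf, millideg);
	fclose(f);
	return ret;
}
```

A reader should treat a failed read as "no data yet" rather than as a fatal
error. As the note in peci-dimmtemp.rst says, the DIMM attributes only appear
once memory training has completed, and all values are measurable only while
the target CPU is powered on.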


^ permalink raw reply	[flat|nested] 49+ messages in thread

* [PATCH v2 15/15] docs: Add PECI documentation
  2021-08-03 11:31 [PATCH v2 00/15] Introduce PECI subsystem Iwona Winiarska
                   ` (13 preceding siblings ...)
  2021-08-03 11:31 ` [PATCH v2 14/15] docs: hwmon: Document PECI drivers Iwona Winiarska
@ 2021-08-03 11:31 ` Iwona Winiarska
  2021-08-05 12:17 ` [PATCH v2 00/15] Introduce PECI subsystem Greg Kroah-Hartman
  15 siblings, 0 replies; 49+ messages in thread
From: Iwona Winiarska @ 2021-08-03 11:31 UTC (permalink / raw)
  To: linux-kernel, openbmc, Greg Kroah-Hartman
  Cc: x86, devicetree, linux-aspeed, linux-arm-kernel, linux-hwmon,
	linux-doc, Rob Herring, Joel Stanley, Andrew Jeffery,
	Jean Delvare, Guenter Roeck, Arnd Bergmann, Olof Johansson,
	Jonathan Corbet, Thomas Gleixner, Andy Lutomirski, Ingo Molnar,
	Borislav Petkov, Yazen Ghannam, Mauro Carvalho Chehab,
	Pierre-Louis Bossart, Tony Luck, Andy Shevchenko, Jae Hyun Yoo,
	Dan Williams, Randy Dunlap, Zev Weiss, David Muller,
	Iwona Winiarska

Add a brief overview of PECI and PECI wire interface.
The documentation also contains kernel-doc for PECI subsystem internals
and PECI CPU Driver API.

Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
---
 Documentation/index.rst      |  1 +
 Documentation/peci/index.rst | 16 ++++++++++++
 Documentation/peci/peci.rst  | 48 ++++++++++++++++++++++++++++++++++++
 MAINTAINERS                  |  1 +
 4 files changed, 66 insertions(+)
 create mode 100644 Documentation/peci/index.rst
 create mode 100644 Documentation/peci/peci.rst

diff --git a/Documentation/index.rst b/Documentation/index.rst
index 54ce34fd6fbd..7671f2cd474f 100644
--- a/Documentation/index.rst
+++ b/Documentation/index.rst
@@ -137,6 +137,7 @@ needed).
    misc-devices/index
    scheduler/index
    mhi/index
+   peci/index
 
 Architecture-agnostic documentation
 -----------------------------------
diff --git a/Documentation/peci/index.rst b/Documentation/peci/index.rst
new file mode 100644
index 000000000000..989de10416e7
--- /dev/null
+++ b/Documentation/peci/index.rst
@@ -0,0 +1,16 @@
+.. SPDX-License-Identifier: GPL-2.0-only
+
+====================
+Linux PECI Subsystem
+====================
+
+.. toctree::
+
+   peci
+
+.. only::  subproject and html
+
+   Indices
+   =======
+
+   * :ref:`genindex`
diff --git a/Documentation/peci/peci.rst b/Documentation/peci/peci.rst
new file mode 100644
index 000000000000..a12c8e10c4a9
--- /dev/null
+++ b/Documentation/peci/peci.rst
@@ -0,0 +1,48 @@
+.. SPDX-License-Identifier: GPL-2.0-only
+
+========
+Overview
+========
+
+The Platform Environment Control Interface (PECI) is a communication
+interface between Intel processors and management controllers
+(e.g. Baseboard Management Controller, BMC).
+PECI provides services that allow the management controller to
+configure, monitor and debug the platform by accessing various registers.
+It defines a dedicated command protocol, where the management
+controller is acting as a PECI originator and the processor - as
+a PECI responder.
+PECI can be used in both single processor and multiple-processor based
+systems.
+
+NOTE:
+The Intel PECI specification is not released as a dedicated document;
+instead, it is part of the External Design Specification (EDS) for a
+given Intel CPU. External Design Specifications are usually not publicly
+available.
+
+PECI Wire
+---------
+
+PECI Wire interface uses a single wire for self-clocking and data
+transfer. It does not require any additional control lines - the
+physical layer is a self-clocked one-wire bus signal that begins each
+bit with a driven, rising edge from an idle near zero volts. The
+duration of the signal driven high determines whether the bit value
+is logic '0' or logic '1'. PECI Wire also includes a variable data
+rate established with every message.
+
+For PECI Wire, each processor package uses a unique, fixed address
+within a defined range, and that address should have a fixed
+relationship with the processor socket ID - if one of the
+processors is removed, it does not affect the addresses of the remaining
+processors.
+
+PECI subsystem internals
+------------------------
+
+.. kernel-doc:: include/linux/peci.h
+
+PECI CPU Driver API
+-------------------
+.. kernel-doc:: include/linux/peci-cpu.h
diff --git a/MAINTAINERS b/MAINTAINERS
index 4861a214d9fe..c50d4d0005e3 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -14527,6 +14527,7 @@ R:	Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
 L:	openbmc@lists.ozlabs.org (moderated for non-subscribers)
 S:	Supported
 F:	Documentation/devicetree/bindings/peci/
+F:	Documentation/peci/
 F:	drivers/peci/
 F:	include/linux/peci-cpu.h
 F:	include/linux/peci.h
-- 
2.31.1
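
Annotation: the fixed relationship between a PECI Wire address and the socket
ID described in peci.rst can be sketched as below. This is not part of the
patch. The base address 0x30 and the limit of 8 devices are assumptions that
are not stated in this document, and the macro and function names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_PECI_BASE_ADDR	0x30	/* assumed first client address */
#define EXAMPLE_PECI_DEVICE_MAX	8	/* assumed size of the address range */

/* Map a PECI Wire address to a socket ID, or -1 if outside the range. */
static int example_peci_addr_to_socket_id(uint8_t addr)
{
	if (addr < EXAMPLE_PECI_BASE_ADDR ||
	    addr >= EXAMPLE_PECI_BASE_ADDR + EXAMPLE_PECI_DEVICE_MAX)
		return -1;

	return addr - EXAMPLE_PECI_BASE_ADDR;
}

/* Map a socket ID back to its fixed address, or -1 if out of range. */
static int example_peci_socket_id_to_addr(int socket_id)
{
	assert(EXAMPLE_PECI_DEVICE_MAX > 0);

	if (socket_id < 0 || socket_id >= EXAMPLE_PECI_DEVICE_MAX)
		return -1;

	return EXAMPLE_PECI_BASE_ADDR + socket_id;
}
```

Because the mapping is a pure function of the socket ID, removing one
processor leaves the addresses of the others unchanged, which is the property
the documentation describes.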


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 12/15] hwmon: peci: Add cputemp driver
  2021-08-03 11:31 ` [PATCH v2 12/15] hwmon: peci: Add cputemp driver Iwona Winiarska
@ 2021-08-03 15:24   ` Guenter Roeck
  2021-08-04 10:43     ` Winiarska, Iwona
  0 siblings, 1 reply; 49+ messages in thread
From: Guenter Roeck @ 2021-08-03 15:24 UTC (permalink / raw)
  To: Iwona Winiarska, linux-kernel, openbmc, Greg Kroah-Hartman
  Cc: x86, devicetree, linux-aspeed, linux-arm-kernel, linux-hwmon,
	linux-doc, Rob Herring, Joel Stanley, Andrew Jeffery,
	Jean Delvare, Arnd Bergmann, Olof Johansson, Jonathan Corbet,
	Thomas Gleixner, Andy Lutomirski, Ingo Molnar, Borislav Petkov,
	Yazen Ghannam, Mauro Carvalho Chehab, Pierre-Louis Bossart,
	Tony Luck, Andy Shevchenko, Jae Hyun Yoo, Dan Williams,
	Randy Dunlap, Zev Weiss, David Muller

On 8/3/21 4:31 AM, Iwona Winiarska wrote:
> Add peci-cputemp driver for Digital Thermal Sensor (DTS) thermal
> readings of the processor package and processor cores that are
> accessible via the PECI interface.
> 
> The main use case for the driver (and PECI interface) is out-of-band
> management, where we're able to obtain the DTS readings from an external
> entity connected with PECI, e.g. BMC on server platforms.
> 
> Co-developed-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
> Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
> Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com>
> Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
> ---
>   MAINTAINERS                  |   7 +
>   drivers/hwmon/Kconfig        |   2 +
>   drivers/hwmon/Makefile       |   1 +
>   drivers/hwmon/peci/Kconfig   |  18 ++
>   drivers/hwmon/peci/Makefile  |   5 +
>   drivers/hwmon/peci/common.h  |  58 ++++
>   drivers/hwmon/peci/cputemp.c | 591 +++++++++++++++++++++++++++++++++++
>   7 files changed, 682 insertions(+)
>   create mode 100644 drivers/hwmon/peci/Kconfig
>   create mode 100644 drivers/hwmon/peci/Makefile
>   create mode 100644 drivers/hwmon/peci/common.h
>   create mode 100644 drivers/hwmon/peci/cputemp.c
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 3f5d48e1d143..e36b5c0824e3 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -14512,6 +14512,13 @@ L:	platform-driver-x86@vger.kernel.org
>   S:	Maintained
>   F:	drivers/platform/x86/peaq-wmi.c
>   
> +PECI HARDWARE MONITORING DRIVERS
> +M:	Iwona Winiarska <iwona.winiarska@intel.com>
> +R:	Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
> +L:	linux-hwmon@vger.kernel.org
> +S:	Supported
> +F:	drivers/hwmon/peci/
> +
>   PECI SUBSYSTEM
>   M:	Iwona Winiarska <iwona.winiarska@intel.com>
>   R:	Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
> diff --git a/drivers/hwmon/Kconfig b/drivers/hwmon/Kconfig
> index e3675377bc5d..61c0e3404415 100644
> --- a/drivers/hwmon/Kconfig
> +++ b/drivers/hwmon/Kconfig
> @@ -1507,6 +1507,8 @@ config SENSORS_PCF8591
>   	  These devices are hard to detect and rarely found on mainstream
>   	  hardware. If unsure, say N.
>   
> +source "drivers/hwmon/peci/Kconfig"
> +
>   source "drivers/hwmon/pmbus/Kconfig"
>   
>   config SENSORS_PWM_FAN
> diff --git a/drivers/hwmon/Makefile b/drivers/hwmon/Makefile
> index d712c61c1f5e..f52331f212ed 100644
> --- a/drivers/hwmon/Makefile
> +++ b/drivers/hwmon/Makefile
> @@ -202,6 +202,7 @@ obj-$(CONFIG_SENSORS_WM8350)	+= wm8350-hwmon.o
>   obj-$(CONFIG_SENSORS_XGENE)	+= xgene-hwmon.o
>   
>   obj-$(CONFIG_SENSORS_OCC)	+= occ/
> +obj-$(CONFIG_SENSORS_PECI)	+= peci/
>   obj-$(CONFIG_PMBUS)		+= pmbus/
>   
>   ccflags-$(CONFIG_HWMON_DEBUG_CHIP) := -DDEBUG
> diff --git a/drivers/hwmon/peci/Kconfig b/drivers/hwmon/peci/Kconfig
> new file mode 100644
> index 000000000000..e10eed68d70a
> --- /dev/null
> +++ b/drivers/hwmon/peci/Kconfig
> @@ -0,0 +1,18 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +
> +config SENSORS_PECI_CPUTEMP
> +	tristate "PECI CPU temperature monitoring client"
> +	depends on PECI
> +	select SENSORS_PECI
> +	select PECI_CPU
> +	help
> +	  If you say yes here you get support for the generic Intel PECI
> +	  cputemp driver which provides Digital Thermal Sensor (DTS) thermal
> +	  readings of the CPU package and CPU cores that are accessible via
> +	  the processor PECI interface.
> +
> +	  This driver can also be built as a module. If so, the module
> +	  will be called peci-cputemp.
> +
> +config SENSORS_PECI
> +	tristate
> diff --git a/drivers/hwmon/peci/Makefile b/drivers/hwmon/peci/Makefile
> new file mode 100644
> index 000000000000..e8a0ada5ab1f
> --- /dev/null
> +++ b/drivers/hwmon/peci/Makefile
> @@ -0,0 +1,5 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +
> +peci-cputemp-y := cputemp.o
> +
> +obj-$(CONFIG_SENSORS_PECI_CPUTEMP)	+= peci-cputemp.o
> diff --git a/drivers/hwmon/peci/common.h b/drivers/hwmon/peci/common.h
> new file mode 100644
> index 000000000000..734506b0eca2
> --- /dev/null
> +++ b/drivers/hwmon/peci/common.h
> @@ -0,0 +1,58 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/* Copyright (c) 2021 Intel Corporation */
> +
> +#include <linux/mutex.h>
> +#include <linux/types.h>
> +
> +#ifndef __PECI_HWMON_COMMON_H
> +#define __PECI_HWMON_COMMON_H
> +
> +#define PECI_HWMON_UPDATE_INTERVAL	HZ
> +
> +/**
> + * struct peci_sensor_state - PECI state information
> + * @valid: flag to indicate the sensor value is valid
> + * @last_updated: time of the last update in jiffies
> + * @lock: mutex to protect sensor access
> + */
> +struct peci_sensor_state {
> +	bool valid;
> +	unsigned long last_updated;
> +	struct mutex lock; /* protect sensor access */
> +};
> +
> +/**
> + * struct peci_sensor_data - PECI sensor information
> + * @value: sensor value in milli units
> + * @state: sensor update state
> + */
> +
> +struct peci_sensor_data {
> +	s32 value;
> +	struct peci_sensor_state state;
> +};
> +
> +/**
> + * peci_sensor_need_update() - check whether sensor update is needed or not
> + * @sensor: pointer to sensor data struct
> + *
> + * Return: true if update is needed, false if not.
> + */
> +
> +static inline bool peci_sensor_need_update(struct peci_sensor_state *state)
> +{
> +	return !state->valid ||
> +	       time_after(jiffies, state->last_updated + PECI_HWMON_UPDATE_INTERVAL);
> +}
> +
> +/**
> + * peci_sensor_mark_updated() - mark the sensor is updated
> + * @sensor: pointer to sensor data struct
> + */
> +static inline void peci_sensor_mark_updated(struct peci_sensor_state *state)
> +{
> +	state->valid = true;
> +	state->last_updated = jiffies;
> +}
> +
> +#endif /* __PECI_HWMON_COMMON_H */
> diff --git a/drivers/hwmon/peci/cputemp.c b/drivers/hwmon/peci/cputemp.c
> new file mode 100644
> index 000000000000..9c6858a9fb6d
> --- /dev/null
> +++ b/drivers/hwmon/peci/cputemp.c
> @@ -0,0 +1,591 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +// Copyright (c) 2018-2021 Intel Corporation
> +
> +#include <linux/auxiliary_bus.h>
> +#include <linux/bitfield.h>
> +#include <linux/bitops.h>
> +#include <linux/hwmon.h>
> +#include <linux/jiffies.h>
> +#include <linux/module.h>
> +#include <linux/peci.h>
> +#include <linux/peci-cpu.h>
> +#include <linux/units.h>
> +#include <linux/x86/intel-family.h>
> +
> +#include "common.h"
> +
> +#define CORE_NUMS_MAX		64
> +
> +#define BASE_CHANNEL_NUMS	5
> +#define CPUTEMP_CHANNEL_NUMS	(BASE_CHANNEL_NUMS + CORE_NUMS_MAX)
> +
> +#define TEMP_TARGET_FAN_TEMP_MASK	GENMASK(15, 8)
> +#define TEMP_TARGET_REF_TEMP_MASK	GENMASK(23, 16)
> +#define TEMP_TARGET_TJ_OFFSET_MASK	GENMASK(29, 24)
> +
> +#define DTS_MARGIN_MASK		GENMASK(15, 0)
> +#define PCS_MODULE_TEMP_MASK	GENMASK(15, 0)
> +
> +#define DTS_FIXED_POINT_FRACTION	64
> +
> +struct resolved_cores_reg {
> +	u8 bus;
> +	u8 dev;
> +	u8 func;
> +	u8 offset;
> +};
> +
> +struct cpu_info {
> +	struct resolved_cores_reg *reg;
> +	u8 min_peci_revision;
> +};
> +
> +struct peci_temp_target {
> +	s32 tcontrol;
> +	s32 tthrottle;
> +	s32 tjmax;
> +	struct peci_sensor_state state;
> +};
> +
> +enum peci_temp_target_type {
> +	tcontrol_type,
> +	tthrottle_type,
> +	tjmax_type,
> +	crit_hyst_type,
> +};
> +
> +struct peci_cputemp {
> +	struct peci_device *peci_dev;
> +	struct device *dev;
> +	const char *name;
> +	const struct cpu_info *gen_info;
> +	struct {
> +		struct peci_temp_target target;
> +		struct peci_sensor_data die;
> +		struct peci_sensor_data dts;
> +		struct peci_sensor_data core[CORE_NUMS_MAX];
> +	} temp;
> +	const char **coretemp_label;
> +	DECLARE_BITMAP(core_mask, CORE_NUMS_MAX);
> +};
> +
> +enum cputemp_channels {
> +	channel_die,
> +	channel_dts,
> +	channel_tcontrol,
> +	channel_tthrottle,
> +	channel_tjmax,
> +	channel_core,
> +};
> +
> +static const char * const cputemp_label[BASE_CHANNEL_NUMS] = {
> +	"Die",
> +	"DTS",
> +	"Tcontrol",
> +	"Tthrottle",
> +	"Tjmax",
> +};
> +
> +static int update_temp_target(struct peci_cputemp *priv)
> +{
> +	s32 tthrottle_offset, tcontrol_margin;
> +	u32 pcs;
> +	int ret;
> +
> +	if (!peci_sensor_need_update(&priv->temp.target.state))
> +		return 0;
> +
> +	ret = peci_pcs_read(priv->peci_dev, PECI_PCS_TEMP_TARGET, 0, &pcs);
> +	if (ret)
> +		return ret;
> +
> +	priv->temp.target.tjmax =
> +		FIELD_GET(TEMP_TARGET_REF_TEMP_MASK, pcs) * MILLIDEGREE_PER_DEGREE;
> +
> +	tcontrol_margin = FIELD_GET(TEMP_TARGET_FAN_TEMP_MASK, pcs);
> +	tcontrol_margin = sign_extend32(tcontrol_margin, 7) * MILLIDEGREE_PER_DEGREE;
> +	priv->temp.target.tcontrol = priv->temp.target.tjmax - tcontrol_margin;
> +
> +	tthrottle_offset = FIELD_GET(TEMP_TARGET_TJ_OFFSET_MASK, pcs) * MILLIDEGREE_PER_DEGREE;
> +	priv->temp.target.tthrottle = priv->temp.target.tjmax - tthrottle_offset;
> +
> +	peci_sensor_mark_updated(&priv->temp.target.state);
> +
> +	return 0;
> +}
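
For illustration only, the field extraction in update_temp_target() can be sketched in plain userspace C. The bit positions are taken from the TEMP_TARGET_* masks in this patch; the struct and function names (temp_target, decode_temp_target) are hypothetical and not part of the driver.

```c
#include <assert.h>
#include <stdint.h>

#define MILLIDEGREE_PER_DEGREE 1000

/* Hypothetical container for the three decoded values, in millidegrees. */
struct temp_target {
	int32_t tjmax;
	int32_t tcontrol;
	int32_t tthrottle;
};

static struct temp_target decode_temp_target(uint32_t pcs)
{
	struct temp_target t;
	int32_t margin = (pcs >> 8) & 0xff;	/* bits 15:8, signed 8-bit */
	int32_t offset = (pcs >> 24) & 0x3f;	/* bits 29:24, unsigned */

	if (margin & 0x80)			/* sign-extend from bit 7 */
		margin -= 0x100;

	t.tjmax = (int32_t)((pcs >> 16) & 0xff) * MILLIDEGREE_PER_DEGREE; /* bits 23:16 */
	t.tcontrol = t.tjmax - margin * MILLIDEGREE_PER_DEGREE;
	t.tthrottle = t.tjmax - offset * MILLIDEGREE_PER_DEGREE;

	assert(t.tthrottle <= t.tjmax);	/* offset is never negative */
	return t;
}
```

With a made-up register value of 0x05640A00 this gives Tjmax 100000, Tcontrol 90000 and Tthrottle 95000 millidegrees.
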
> +
> +static int get_temp_target(struct peci_cputemp *priv, enum peci_temp_target_type type, long *val)
> +{
> +	int ret;
> +
> +	mutex_lock(&priv->temp.target.state.lock);
> +
> +	ret = update_temp_target(priv);
> +	if (ret)
> +		goto unlock;
> +
> +	switch (type) {
> +	case tcontrol_type:
> +		*val = priv->temp.target.tcontrol;
> +		break;
> +	case tthrottle_type:
> +		*val = priv->temp.target.tthrottle;
> +		break;
> +	case tjmax_type:
> +		*val = priv->temp.target.tjmax;
> +		break;
> +	case crit_hyst_type:
> +		*val = priv->temp.target.tjmax - priv->temp.target.tcontrol;
> +		break;
> +	default:
> +		ret = -EOPNOTSUPP;
> +		break;
> +	}
> +unlock:
> +	mutex_unlock(&priv->temp.target.state.lock);
> +
> +	return ret;
> +}
> +
> +/*
> + * Processors return a value of DTS reading in S10.6 fixed point format
> + * (16 bits: 10-bit signed magnitude, 6-bit fraction).
> + * Error codes:
> + *   0x8000: General sensor error
> + *   0x8001: Reserved
> + *   0x8002: Underflow on reading value
> + *   0x8003-0x81ff: Reserved
> + */
> +static bool dts_valid(s32 val)
> +{
> +	return val < 0x8000 || val > 0x81ff;
> +}
> +
> +static s32 dts_to_millidegree(s32 val)
> +{
> +	return sign_extend32(val, 15) * MILLIDEGREE_PER_DEGREE / DTS_FIXED_POINT_FRACTION;
> +}
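
As a worked example of the S10.6 conversion described in the comment above, here is a minimal userspace sketch. The function names are hypothetical. It takes the raw reading as an unsigned 16-bit value so that the error-code range check happens before sign extension; that ordering is an assumption about the intent, not something the patch states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MILLIDEGREE_PER_DEGREE   1000
#define DTS_FIXED_POINT_FRACTION 64	/* 2^6: six fractional bits */

/* Raw values 0x8000..0x81ff are error codes, everything else is data. */
static bool dts_raw_valid(uint16_t raw)
{
	return raw < 0x8000 || raw > 0x81ff;
}

/* Convert a raw S10.6 reading to (relative) millidegrees Celsius. */
static int32_t dts_raw_to_millidegree(uint16_t raw)
{
	int32_t v = raw;

	assert(dts_raw_valid(raw));
	if (v & 0x8000)			/* sign-extend from bit 15 */
		v -= 0x10000;

	return v * MILLIDEGREE_PER_DEGREE / DTS_FIXED_POINT_FRACTION;
}
```

For instance, a raw value of 0xFFC0 is -64 in fixed point, i.e. -1 degree, so the result is -1000; the callers above then combine that relative value with Tjmax or Tcontrol.
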
> +
> +static int get_die_temp(struct peci_cputemp *priv, long *val)
> +{
> +	long tjmax;
> +	s16 temp;
> +	int ret;
> +
> +	mutex_lock(&priv->temp.die.state.lock);
> +	if (!peci_sensor_need_update(&priv->temp.die.state))
> +		goto skip_update;
> +
> +	ret = peci_temp_read(priv->peci_dev, &temp);
> +	if (ret)
> +		goto err_unlock;
> +
> +	if (!dts_valid(temp)) {
> +		ret = -EIO;
> +		goto err_unlock;
> +	}
> +
> +	ret = get_temp_target(priv, tjmax_type, &tjmax);
> +	if (ret)
> +		goto err_unlock;
> +
> +	priv->temp.die.value = (s32)tjmax + dts_to_millidegree(temp);
> +
> +	peci_sensor_mark_updated(&priv->temp.die.state);
> +
> +skip_update:
> +	*val = priv->temp.die.value;
> +	mutex_unlock(&priv->temp.die.state.lock);
> +
> +	return 0;
> +
> +err_unlock:
> +	mutex_unlock(&priv->temp.die.state.lock);
> +	return ret;
> +}
> +
> +static int get_dts(struct peci_cputemp *priv, long *val)
> +{
> +	s32 dts_margin;
> +	long tcontrol;
> +	u32 pcs;
> +	int ret;
> +
> +	mutex_lock(&priv->temp.dts.state.lock);
> +	if (!peci_sensor_need_update(&priv->temp.dts.state))
> +		goto skip_update;
> +
> +	ret = peci_pcs_read(priv->peci_dev, PECI_PCS_THERMAL_MARGIN, 0, &pcs);
> +	if (ret)
> +		goto err_unlock;
> +
> +	dts_margin = FIELD_GET(DTS_MARGIN_MASK, pcs);
> +	if (!dts_valid(dts_margin)) {
> +		ret = -EIO;
> +		goto err_unlock;
> +	}
> +
> +	ret = get_temp_target(priv, tcontrol_type, &tcontrol);
> +	if (ret)
> +		goto err_unlock;
> +
> +	/* Note that the tcontrol should be available before calling it */
> +	priv->temp.dts.value = (s32)tcontrol - dts_to_millidegree(dts_margin);
> +
> +	peci_sensor_mark_updated(&priv->temp.dts.state);
> +
> +skip_update:
> +	*val = priv->temp.dts.value;
> +	mutex_unlock(&priv->temp.dts.state.lock);
> +
> +	return 0;
> +
> +err_unlock:
> +	mutex_unlock(&priv->temp.dts.state.lock);
> +	return ret;

Simplify (see below)

> +}
> +
> +static int get_core_temp(struct peci_cputemp *priv, int core_index, long *val)
> +{
> +	s32 core_dts_margin;
> +	long tjmax;
> +	u32 pcs;
> +	int ret;

	int ret = 0;

to handle the simplification below.

> +
> +	mutex_lock(&priv->temp.core[core_index].state.lock);
> +	if (!peci_sensor_need_update(&priv->temp.core[core_index].state))
> +		goto skip_update;
> +
> +	ret = peci_pcs_read(priv->peci_dev, PECI_PCS_MODULE_TEMP, core_index, &pcs);
> +	if (ret)
> +		goto err_unlock;
> +
> +	core_dts_margin = FIELD_GET(PCS_MODULE_TEMP_MASK, pcs);
> +	if (!dts_valid(core_dts_margin)) {
> +		ret = -EIO;
> +		goto err_unlock;
> +	}
> +
> +	ret = get_temp_target(priv, tjmax_type, &tjmax);
> +	if (ret)
> +		goto err_unlock;
> +
> +	/* Note that the tjmax should be available before calling it */
> +	priv->temp.core[core_index].value = (s32)tjmax + dts_to_millidegree(core_dts_margin);
> +
> +	peci_sensor_mark_updated(&priv->temp.core[core_index].state);
> +
> +skip_update:
> +	*val = priv->temp.core[core_index].value;
> +	mutex_unlock(&priv->temp.core[core_index].state.lock);
> +
> +	return 0;
> +
> +err_unlock:
> +	mutex_unlock(&priv->temp.core[core_index].state.lock);
> +	return ret;

Simplify:

skip_update:
	*val = priv->temp.core[core_index].value;
err_unlock:
	mutex_unlock(&priv->temp.core[core_index].state.lock);
	return ret;
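
The same single-exit shape, as a self-contained userspace sketch. Everything here is a hypothetical stand-in (the lock is a plain counter instead of a mutex, fake_hw_read() replaces the PECI read); it only demonstrates why ret has to start at 0 once the skip path falls through to the common unlock.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

struct cached_sensor {
	int lock_depth;	/* stand-in for a mutex: 1 while held */
	bool valid;
	long value;
};

static void sensor_lock(struct cached_sensor *s)
{
	assert(s->lock_depth == 0);
	s->lock_depth++;
}

static void sensor_unlock(struct cached_sensor *s)
{
	assert(s->lock_depth == 1);
	s->lock_depth--;
}

/* Stand-in for the hardware read; fails on request. */
static int fake_hw_read(bool fail, long *out)
{
	if (fail)
		return -EIO;
	*out = 42000;
	return 0;
}

static int cached_sensor_get(struct cached_sensor *s, bool fail, long *val)
{
	int ret = 0;	/* the skip path never assigns ret */
	long fresh;

	sensor_lock(s);
	if (s->valid)
		goto skip_update;

	ret = fake_hw_read(fail, &fresh);
	if (ret)
		goto unlock;	/* *val is left untouched on error */

	s->value = fresh;
	s->valid = true;

skip_update:
	*val = s->value;
unlock:
	sensor_unlock(s);
	return ret;
}
```

There is one unlock and one return, and the error path skips only the store to *val.
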

> +}
> +
> +static int cputemp_read_string(struct device *dev, enum hwmon_sensor_types type,
> +			       u32 attr, int channel, const char **str)
> +{
> +	struct peci_cputemp *priv = dev_get_drvdata(dev);
> +
> +	if (attr != hwmon_temp_label)
> +		return -EOPNOTSUPP;
> +
> +	*str = channel < channel_core ?
> +		cputemp_label[channel] : priv->coretemp_label[channel - channel_core];
> +
> +	return 0;
> +}
> +
> +static int cputemp_read(struct device *dev, enum hwmon_sensor_types type,
> +			u32 attr, int channel, long *val)
> +{
> +	struct peci_cputemp *priv = dev_get_drvdata(dev);
> +
> +	switch (attr) {
> +	case hwmon_temp_input:
> +		switch (channel) {
> +		case channel_die:
> +			return get_die_temp(priv, val);
> +		case channel_dts:
> +			return get_dts(priv, val);
> +		case channel_tcontrol:
> +			return get_temp_target(priv, tcontrol_type, val);
> +		case channel_tthrottle:
> +			return get_temp_target(priv, tthrottle_type, val);
> +		case channel_tjmax:
> +			return get_temp_target(priv, tjmax_type, val);
> +		default:
> +			return get_core_temp(priv, channel - channel_core, val);
> +		}
> +		break;
> +	case hwmon_temp_max:
> +		return get_temp_target(priv, tcontrol_type, val);
> +	case hwmon_temp_crit:
> +		return get_temp_target(priv, tjmax_type, val);
> +	case hwmon_temp_crit_hyst:
> +		return get_temp_target(priv, crit_hyst_type, val);
> +	default:
> +		return -EOPNOTSUPP;
> +	}
> +
> +	return 0;
> +}
> +
> +static umode_t cputemp_is_visible(const void *data, enum hwmon_sensor_types type,
> +				  u32 attr, int channel)
> +{
> +	const struct peci_cputemp *priv = data;
> +
> +	if (channel > CPUTEMP_CHANNEL_NUMS)
> +		return 0;
> +
> +	if (channel < channel_core)
> +		return 0444;
> +
> +	if (test_bit(channel - channel_core, priv->core_mask))
> +		return 0444;
> +
> +	return 0;
> +}
> +
> +static int init_core_mask(struct peci_cputemp *priv)
> +{
> +	struct peci_device *peci_dev = priv->peci_dev;
> +	struct resolved_cores_reg *reg = priv->gen_info->reg;
> +	u64 core_mask;
> +	u32 data;
> +	int ret;
> +
> +	/* Get the RESOLVED_CORES register value */
> +	switch (peci_dev->info.model) {
> +	case INTEL_FAM6_ICELAKE_X:
> +	case INTEL_FAM6_ICELAKE_D:
> +		ret = peci_ep_pci_local_read(peci_dev, 0, reg->bus, reg->dev,
> +					     reg->func, reg->offset + 4, &data);
> +		if (ret)
> +			return ret;
> +
> +		core_mask = (u64)data << 32;
> +
> +		ret = peci_ep_pci_local_read(peci_dev, 0, reg->bus, reg->dev,
> +					     reg->func, reg->offset, &data);
> +		if (ret)
> +			return ret;
> +
> +		core_mask |= data;
> +
> +		break;
> +	default:
> +		ret = peci_pci_local_read(peci_dev, reg->bus, reg->dev,
> +					  reg->func, reg->offset, &data);
> +		if (ret)
> +			return ret;
> +
> +		core_mask = data;
> +
> +		break;
> +	}
> +
> +	if (!core_mask)
> +		return -EIO;
> +
> +	bitmap_from_u64(priv->core_mask, core_mask);
> +
> +	return 0;
> +}
> +
> +static int create_temp_label(struct peci_cputemp *priv)
> +{
> +	unsigned long core_max = find_last_bit(priv->core_mask, CORE_NUMS_MAX);
> +	int i;
> +
> +	priv->coretemp_label = devm_kzalloc(priv->dev, core_max * sizeof(char *), GFP_KERNEL);
> +	if (!priv->coretemp_label)
> +		return -ENOMEM;
> +
> +	for_each_set_bit(i, priv->core_mask, CORE_NUMS_MAX) {
> +		priv->coretemp_label[i] = devm_kasprintf(priv->dev, GFP_KERNEL, "Core %d", i);
> +		if (!priv->coretemp_label[i])
> +			return -ENOMEM;
> +	}
> +
> +	return 0;
> +}
> +
> +static void check_resolved_cores(struct peci_cputemp *priv)
> +{
> +	/*
> +	 * Failure to resolve cores is non-critical, we're still able to
> +	 * provide other sensor data.
> +	 */
> +
> +	if (init_core_mask(priv))
> +		return;
> +
> +	if (create_temp_label(priv))
> +		bitmap_zero(priv->core_mask, CORE_NUMS_MAX);
> +}
> +
> +static void sensor_init(struct peci_cputemp *priv)
> +{
> +	int i;
> +
> +	mutex_init(&priv->temp.target.state.lock);
> +	mutex_init(&priv->temp.die.state.lock);
> +	mutex_init(&priv->temp.dts.state.lock);
> +
> +	for_each_set_bit(i, priv->core_mask, CORE_NUMS_MAX)
> +		mutex_init(&priv->temp.core[i].state.lock);
> +}
> +
> +static const struct hwmon_ops peci_cputemp_ops = {
> +	.is_visible = cputemp_is_visible,
> +	.read_string = cputemp_read_string,
> +	.read = cputemp_read,
> +};
> +
> +static const u32 peci_cputemp_temp_channel_config[] = {
> +	/* Die temperature */
> +	HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_CRIT | HWMON_T_CRIT_HYST,
> +	/* DTS margin */
> +	HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_CRIT | HWMON_T_CRIT_HYST,
> +	/* Tcontrol temperature */
> +	HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_CRIT,
> +	/* Tthrottle temperature */
> +	HWMON_T_LABEL | HWMON_T_INPUT,
> +	/* Tjmax temperature */
> +	HWMON_T_LABEL | HWMON_T_INPUT,
> +	/* Core temperature - for all core channels */
> +	[channel_core ... CPUTEMP_CHANNEL_NUMS - 1] = HWMON_T_LABEL | HWMON_T_INPUT,
> +	0
> +};
> +
> +static const struct hwmon_channel_info peci_cputemp_temp_channel = {
> +	.type = hwmon_temp,
> +	.config = peci_cputemp_temp_channel_config,
> +};
> +
> +static const struct hwmon_channel_info *peci_cputemp_info[] = {
> +	&peci_cputemp_temp_channel,
> +	NULL
> +};
> +
> +static const struct hwmon_chip_info peci_cputemp_chip_info = {
> +	.ops = &peci_cputemp_ops,
> +	.info = peci_cputemp_info,
> +};
> +
> +static int peci_cputemp_probe(struct auxiliary_device *adev,
> +			      const struct auxiliary_device_id *id)
> +{
> +	struct device *dev = &adev->dev;
> +	struct peci_device *peci_dev = to_peci_device(dev->parent);
> +	struct peci_cputemp *priv;
> +	struct device *hwmon_dev;
> +
> +	priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
> +	if (!priv)
> +		return -ENOMEM;
> +
> +	priv->name = devm_kasprintf(dev, GFP_KERNEL, "peci_cputemp.cpu%d",
> +				    peci_dev->info.socket_id);
> +	if (!priv->name)
> +		return -ENOMEM;
> +
> +	priv->dev = dev;
> +	priv->peci_dev = peci_dev;
> +	priv->gen_info = (const struct cpu_info *)id->driver_data;
> +
> +	/*
> +	 * This is just a sanity check. Since we're using commands that are
> +	 * guaranteed to be supported on a given platform, we should never see
> +	 * revision lower than expected.
> +	 */
> +	if (peci_dev->info.peci_revision < priv->gen_info->min_peci_revision)
> +		dev_warn(priv->dev,
> +			 "Unexpected PECI revision %#x, some features may be unavailable\n",
> +			 peci_dev->info.peci_revision);
> +
> +	check_resolved_cores(priv);
> +
> +	sensor_init(priv);
> +
> +	hwmon_dev = devm_hwmon_device_register_with_info(priv->dev, priv->name,
> +							 priv, &peci_cputemp_chip_info, NULL);
> +
> +	return PTR_ERR_OR_ZERO(hwmon_dev);
> +}
> +
> +/*
> + * RESOLVED_CORES PCI configuration register may have different location on
> + * different platforms.
> + */
> +static struct resolved_cores_reg resolved_cores_reg_hsx = {
> +	.bus = 1,
> +	.dev = 30,
> +	.func = 3,
> +	.offset = 0xb4,
> +};
> +
> +static struct resolved_cores_reg resolved_cores_reg_icx = {
> +	.bus = 14,
> +	.dev = 30,
> +	.func = 3,
> +	.offset = 0xd0,
> +};
> +
> +static const struct cpu_info cpu_hsx = {
> +	.reg		= &resolved_cores_reg_hsx,
> +	.min_peci_revision = 0x33,
> +};
> +
> +static const struct cpu_info cpu_icx = {
> +	.reg		= &resolved_cores_reg_icx,
> +	.min_peci_revision = 0x40,
> +};
> +
> +static const struct auxiliary_device_id peci_cputemp_ids[] = {
> +	{
> +		.name = "peci_cpu.cputemp.hsx",
> +		.driver_data = (kernel_ulong_t)&cpu_hsx,
> +	},
> +	{
> +		.name = "peci_cpu.cputemp.bdx",
> +		.driver_data = (kernel_ulong_t)&cpu_hsx,
> +	},
> +	{
> +		.name = "peci_cpu.cputemp.bdxd",
> +		.driver_data = (kernel_ulong_t)&cpu_hsx,
> +	},
> +	{
> +		.name = "peci_cpu.cputemp.skx",
> +		.driver_data = (kernel_ulong_t)&cpu_hsx,
> +	},
> +	{
> +		.name = "peci_cpu.cputemp.icx",
> +		.driver_data = (kernel_ulong_t)&cpu_icx,
> +	},
> +	{
> +		.name = "peci_cpu.cputemp.icxd",
> +		.driver_data = (kernel_ulong_t)&cpu_icx,
> +	},
> +	{ }
> +};
> +MODULE_DEVICE_TABLE(auxiliary, peci_cputemp_ids);
> +
> +static struct auxiliary_driver peci_cputemp_driver = {
> +	.probe		= peci_cputemp_probe,
> +	.id_table	= peci_cputemp_ids,
> +};
> +
> +module_auxiliary_driver(peci_cputemp_driver);
> +
> +MODULE_AUTHOR("Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>");
> +MODULE_AUTHOR("Iwona Winiarska <iwona.winiarska@intel.com>");
> +MODULE_DESCRIPTION("PECI cputemp driver");
> +MODULE_LICENSE("GPL");
> +MODULE_IMPORT_NS(PECI_CPU);
> 



* Re: [PATCH v2 13/15] hwmon: peci: Add dimmtemp driver
  2021-08-03 11:31 ` [PATCH v2 13/15] hwmon: peci: Add dimmtemp driver Iwona Winiarska
@ 2021-08-03 15:39   ` Guenter Roeck
  2021-08-04 10:46     ` Winiarska, Iwona
  0 siblings, 1 reply; 49+ messages in thread
From: Guenter Roeck @ 2021-08-03 15:39 UTC (permalink / raw)
  To: Iwona Winiarska
  Cc: linux-kernel, openbmc, Greg Kroah-Hartman, x86, devicetree,
	linux-aspeed, linux-arm-kernel, linux-hwmon, linux-doc,
	Rob Herring, Joel Stanley, Andrew Jeffery, Jean Delvare,
	Arnd Bergmann, Olof Johansson, Jonathan Corbet, Thomas Gleixner,
	Andy Lutomirski, Ingo Molnar, Borislav Petkov, Yazen Ghannam,
	Mauro Carvalho Chehab, Pierre-Louis Bossart, Tony Luck,
	Andy Shevchenko, Jae Hyun Yoo, Dan Williams, Randy Dunlap,
	Zev Weiss, David Muller

On Tue, Aug 03, 2021 at 01:31:32PM +0200, Iwona Winiarska wrote:
> Add peci-dimmtemp driver for Temperature Sensor on DIMM readings that
> are accessible via the processor PECI interface.
> 
> The main use case for the driver (and PECI interface) is out-of-band
> management, where we're able to obtain thermal readings from an external
> entity connected with PECI, e.g. BMC on server platforms.
> 
> Co-developed-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
> Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
> Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com>
> Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
> ---
> Note that the timeout was completely removed - we're going to probe
> for detected DIMMs every 5 seconds until we reach "stable" state of
> either getting correct DIMM data or getting all -EINVAL (which
> suggest that the CPU doesn't have any DIMMs).
> 
>  drivers/hwmon/peci/Kconfig    |  13 +
>  drivers/hwmon/peci/Makefile   |   2 +
>  drivers/hwmon/peci/dimmtemp.c | 614 ++++++++++++++++++++++++++++++++++
>  3 files changed, 629 insertions(+)
>  create mode 100644 drivers/hwmon/peci/dimmtemp.c
> 
> diff --git a/drivers/hwmon/peci/Kconfig b/drivers/hwmon/peci/Kconfig
> index e10eed68d70a..9d32a57badfe 100644
> --- a/drivers/hwmon/peci/Kconfig
> +++ b/drivers/hwmon/peci/Kconfig
> @@ -14,5 +14,18 @@ config SENSORS_PECI_CPUTEMP
>  	  This driver can also be built as a module. If so, the module
>  	  will be called peci-cputemp.
>  
> +config SENSORS_PECI_DIMMTEMP
> +	tristate "PECI DIMM temperature monitoring client"
> +	depends on PECI
> +	select SENSORS_PECI
> +	select PECI_CPU
> +	help
> +	  If you say yes here you get support for the generic Intel PECI hwmon
> +	  driver which provides Temperature Sensor on DIMM readings that are
> +	  accessible via the processor PECI interface.
> +
> +	  This driver can also be built as a module. If so, the module
> +	  will be called peci-dimmtemp.
> +
>  config SENSORS_PECI
>  	tristate
> diff --git a/drivers/hwmon/peci/Makefile b/drivers/hwmon/peci/Makefile
> index e8a0ada5ab1f..191cfa0227f3 100644
> --- a/drivers/hwmon/peci/Makefile
> +++ b/drivers/hwmon/peci/Makefile
> @@ -1,5 +1,7 @@
>  # SPDX-License-Identifier: GPL-2.0-only
>  
>  peci-cputemp-y := cputemp.o
> +peci-dimmtemp-y := dimmtemp.o
>  
>  obj-$(CONFIG_SENSORS_PECI_CPUTEMP)	+= peci-cputemp.o
> +obj-$(CONFIG_SENSORS_PECI_DIMMTEMP)	+= peci-dimmtemp.o
> diff --git a/drivers/hwmon/peci/dimmtemp.c b/drivers/hwmon/peci/dimmtemp.c
> new file mode 100644
> index 000000000000..6264c29bb6c0
> --- /dev/null
> +++ b/drivers/hwmon/peci/dimmtemp.c
> @@ -0,0 +1,614 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +// Copyright (c) 2018-2021 Intel Corporation
> +
> +#include <linux/auxiliary_bus.h>
> +#include <linux/bitfield.h>
> +#include <linux/bitops.h>
> +#include <linux/hwmon.h>
> +#include <linux/jiffies.h>
> +#include <linux/module.h>
> +#include <linux/peci.h>
> +#include <linux/peci-cpu.h>
> +#include <linux/units.h>
> +#include <linux/workqueue.h>
> +#include <linux/x86/intel-family.h>
> +
> +#include "common.h"
> +
> +#define DIMM_MASK_CHECK_DELAY_JIFFIES	msecs_to_jiffies(5000)
> +
> +/* Max number of channel ranks and DIMM index per channel */
> +#define CHAN_RANK_MAX_ON_HSX	8
> +#define DIMM_IDX_MAX_ON_HSX	3
> +#define CHAN_RANK_MAX_ON_BDX	4
> +#define DIMM_IDX_MAX_ON_BDX	3
> +#define CHAN_RANK_MAX_ON_BDXD	2
> +#define DIMM_IDX_MAX_ON_BDXD	2
> +#define CHAN_RANK_MAX_ON_SKX	6
> +#define DIMM_IDX_MAX_ON_SKX	2
> +#define CHAN_RANK_MAX_ON_ICX	8
> +#define DIMM_IDX_MAX_ON_ICX	2
> +#define CHAN_RANK_MAX_ON_ICXD	4
> +#define DIMM_IDX_MAX_ON_ICXD	2
> +
> +#define CHAN_RANK_MAX		CHAN_RANK_MAX_ON_HSX
> +#define DIMM_IDX_MAX		DIMM_IDX_MAX_ON_HSX
> +#define DIMM_NUMS_MAX		(CHAN_RANK_MAX * DIMM_IDX_MAX)
> +
> +#define CPU_SEG_MASK		GENMASK(23, 16)
> +#define GET_CPU_SEG(x)		(((x) & CPU_SEG_MASK) >> 16)
> +#define CPU_BUS_MASK		GENMASK(7, 0)
> +#define GET_CPU_BUS(x)		((x) & CPU_BUS_MASK)
> +
> +#define DIMM_TEMP_MAX		GENMASK(15, 8)
> +#define DIMM_TEMP_CRIT		GENMASK(23, 16)
> +#define GET_TEMP_MAX(x)		(((x) & DIMM_TEMP_MAX) >> 8)
> +#define GET_TEMP_CRIT(x)	(((x) & DIMM_TEMP_CRIT) >> 16)
> +
> +struct peci_dimmtemp;
> +
> +struct dimm_info {
> +	int chan_rank_max;
> +	int dimm_idx_max;
> +	u8 min_peci_revision;
> +	int (*read_thresholds)(struct peci_dimmtemp *priv, int dimm_order,
> +			       int chan_rank, u32 *data);
> +};
> +
> +struct peci_dimm_thresholds {
> +	long temp_max;
> +	long temp_crit;
> +	struct peci_sensor_state state;
> +};
> +
> +enum peci_dimm_threshold_type {
> +	temp_max_type,
> +	temp_crit_type,
> +};
> +
> +struct peci_dimmtemp {
> +	struct peci_device *peci_dev;
> +	struct device *dev;
> +	const char *name;
> +	const struct dimm_info *gen_info;
> +	struct delayed_work detect_work;
> +	struct {
> +		struct peci_sensor_data temp;
> +		struct peci_dimm_thresholds thresholds;
> +	} dimm[DIMM_NUMS_MAX];
> +	char **dimmtemp_label;
> +	DECLARE_BITMAP(dimm_mask, DIMM_NUMS_MAX);
> +};
> +
> +static u8 __dimm_temp(u32 reg, int dimm_order)
> +{
> +	return (reg >> (dimm_order * 8)) & 0xff;
> +}
> +
> +static int get_dimm_temp(struct peci_dimmtemp *priv, int dimm_no, long *val)
> +{
> +	int dimm_order = dimm_no % priv->gen_info->dimm_idx_max;
> +	int chan_rank = dimm_no / priv->gen_info->dimm_idx_max;
> +	u32 data;
> +	int ret;

	int ret = 0;

> +
> +	mutex_lock(&priv->dimm[dimm_no].temp.state.lock);
> +	if (!peci_sensor_need_update(&priv->dimm[dimm_no].temp.state))
> +		goto skip_update;
> +
> +	ret = peci_pcs_read(priv->peci_dev, PECI_PCS_DDR_DIMM_TEMP, chan_rank, &data);
> +	if (ret) {
> +		mutex_unlock(&priv->dimm[dimm_no].temp.state.lock);
> +		return ret;
> +	}

	if (ret)
		goto unlock;

> +
> +	priv->dimm[dimm_no].temp.value = __dimm_temp(data, dimm_order) * MILLIDEGREE_PER_DEGREE;
> +
> +	peci_sensor_mark_updated(&priv->dimm[dimm_no].temp.state);
> +
> +skip_update:
> +	*val = priv->dimm[dimm_no].temp.value;

unlock:
> +	mutex_unlock(&priv->dimm[dimm_no].temp.state.lock);
> +	return 0;

	return ret;

> +}
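
For reference, the index arithmetic used by get_dimm_temp() can be sketched in userspace as follows. The helper and struct names are hypothetical; the arithmetic mirrors dimm_order, chan_rank and __dimm_temp() from the patch, with one temperature byte per DIMM in the 32-bit register.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical pair describing where a flat DIMM number lives. */
struct dimm_pos {
	int chan_rank;	/* which register to read */
	int dimm_order;	/* which byte within that register */
};

static struct dimm_pos dimm_no_to_pos(int dimm_no, int dimm_idx_max)
{
	struct dimm_pos p;

	assert(dimm_no >= 0 && dimm_idx_max > 0);
	p.chan_rank = dimm_no / dimm_idx_max;
	p.dimm_order = dimm_no % dimm_idx_max;
	return p;
}

/* Extract the temperature byte (in degrees) for one DIMM. */
static uint8_t dimm_temp_byte(uint32_t reg, int dimm_order)
{
	assert(dimm_order >= 0 && dimm_order < 4);
	return (reg >> (dimm_order * 8)) & 0xff;
}
```

With dimm_idx_max = 3, DIMM number 7 maps to channel rank 2, byte 1; for a made-up register value of 0x00231E28 that byte is 0x1E, i.e. 30 degrees.
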
> +
> +static int update_thresholds(struct peci_dimmtemp *priv, int dimm_no)
> +{
> +	int dimm_order = dimm_no % priv->gen_info->dimm_idx_max;
> +	int chan_rank = dimm_no / priv->gen_info->dimm_idx_max;
> +	u32 data;
> +	int ret;
> +
> +	if (!peci_sensor_need_update(&priv->dimm[dimm_no].thresholds.state))
> +		return 0;
> +
> +	ret = priv->gen_info->read_thresholds(priv, dimm_order, chan_rank, &data);
> +	if (ret == -ENODATA) /* Use default or previous value */
> +		return 0;
> +	if (ret)
> +		return ret;
> +
> +	priv->dimm[dimm_no].thresholds.temp_max = GET_TEMP_MAX(data) * MILLIDEGREE_PER_DEGREE;
> +	priv->dimm[dimm_no].thresholds.temp_crit = GET_TEMP_CRIT(data) * MILLIDEGREE_PER_DEGREE;
> +
> +	peci_sensor_mark_updated(&priv->dimm[dimm_no].thresholds.state);
> +
> +	return 0;
> +}
> +
> +static int get_dimm_thresholds(struct peci_dimmtemp *priv, enum peci_dimm_threshold_type type,
> +			       int dimm_no, long *val)
> +{
> +	int ret;
> +
> +	mutex_lock(&priv->dimm[dimm_no].thresholds.state.lock);
> +	ret = update_thresholds(priv, dimm_no);
> +	if (ret)
> +		goto unlock;
> +
> +	switch (type) {
> +	case temp_max_type:
> +		*val = priv->dimm[dimm_no].thresholds.temp_max;
> +		break;
> +	case temp_crit_type:
> +		*val = priv->dimm[dimm_no].thresholds.temp_crit;
> +		break;
> +	default:
> +		ret = -EOPNOTSUPP;
> +		break;
> +	}
> +unlock:
> +	mutex_unlock(&priv->dimm[dimm_no].thresholds.state.lock);
> +
> +	return ret;
> +}
> +
> +static int dimmtemp_read_string(struct device *dev,
> +				enum hwmon_sensor_types type,
> +				u32 attr, int channel, const char **str)
> +{
> +	struct peci_dimmtemp *priv = dev_get_drvdata(dev);
> +
> +	if (attr != hwmon_temp_label)
> +		return -EOPNOTSUPP;
> +
> +	*str = (const char *)priv->dimmtemp_label[channel];
> +
> +	return 0;
> +}
> +
> +static int dimmtemp_read(struct device *dev, enum hwmon_sensor_types type,
> +			 u32 attr, int channel, long *val)
> +{
> +	struct peci_dimmtemp *priv = dev_get_drvdata(dev);
> +
> +	switch (attr) {
> +	case hwmon_temp_input:
> +		return get_dimm_temp(priv, channel, val);
> +	case hwmon_temp_max:
> +		return get_dimm_thresholds(priv, temp_max_type, channel, val);
> +	case hwmon_temp_crit:
> +		return get_dimm_thresholds(priv, temp_crit_type, channel, val);
> +	default:
> +		break;
> +	}
> +
> +	return -EOPNOTSUPP;
> +}
> +
> +static umode_t dimmtemp_is_visible(const void *data, enum hwmon_sensor_types type,
> +				   u32 attr, int channel)
> +{
> +	const struct peci_dimmtemp *priv = data;
> +
> +	if (test_bit(channel, priv->dimm_mask))
> +		return 0444;
> +
> +	return 0;
> +}
> +
> +static const struct hwmon_ops peci_dimmtemp_ops = {
> +	.is_visible = dimmtemp_is_visible,
> +	.read_string = dimmtemp_read_string,
> +	.read = dimmtemp_read,
> +};
> +
> +static int check_populated_dimms(struct peci_dimmtemp *priv)
> +{
> +	int chan_rank_max = priv->gen_info->chan_rank_max;
> +	int dimm_idx_max = priv->gen_info->dimm_idx_max;
> +	u32 chan_rank_empty = 0;
> +	u64 dimm_mask = 0;
> +	int chan_rank, dimm_idx, ret;
> +	u32 pcs;
> +
> +	BUILD_BUG_ON(CHAN_RANK_MAX > 32);
> +	BUILD_BUG_ON(DIMM_NUMS_MAX > 64);

I don't immediately see the value of those build bugs. What happens if
CHAN_RANK_MAX > 32 or DIMM_NUMS_MAX > 64? Where do those limits come
from?
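
One possible reading, which is an assumption and not something the patch states: the limits follow from the mask types, since chan_rank_empty is a u32 with one bit per channel rank and dimm_mask is a u64 with one bit per DIMM. A userspace sketch of that reading, with hypothetical helper names and the HSX maxima from the defines above:

```c
#include <assert.h>
#include <stdint.h>

#define CHAN_RANK_MAX	8
#define DIMM_IDX_MAX	3
#define DIMM_NUMS_MAX	(CHAN_RANK_MAX * DIMM_IDX_MAX)

/* Assumed rationale: each index must fit in the mask that stores it. */
_Static_assert(CHAN_RANK_MAX <= 32, "one bit per channel rank in a 32-bit mask");
_Static_assert(DIMM_NUMS_MAX <= 64, "one bit per DIMM in a 64-bit mask");

static uint32_t chan_rank_bit(int chan_rank)
{
	assert(chan_rank >= 0 && chan_rank < CHAN_RANK_MAX);
	return UINT32_C(1) << chan_rank;
}

static uint64_t dimm_bit(int chan_rank, int dimm_idx)
{
	int n = chan_rank * DIMM_IDX_MAX + dimm_idx;

	assert(n >= 0 && n < DIMM_NUMS_MAX);
	return UINT64_C(1) << n;
}
```

If that is the intent, a short comment next to the BUILD_BUG_ON()s saying so would answer the question.
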

> +	if (chan_rank_max * dimm_idx_max > DIMM_NUMS_MAX) {
> +		WARN_ONCE(1, "Unsupported number of DIMMs");

Maybe display the values (chan_rank_max and dimm_idx_max).
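
Something along these lines, shown as a userspace sketch with snprintf() standing in for the kernel's printf-style WARN_ONCE() arguments. The helper name and the exact wording of the message are only suggestions.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format the warning with both offending values included. */
static int format_dimm_warning(char *buf, size_t len,
			       int chan_rank_max, int dimm_idx_max)
{
	assert(buf && len > 0);
	return snprintf(buf, len,
			"Unsupported number of DIMMs (chan_rank_max=%d, dimm_idx_max=%d)\n",
			chan_rank_max, dimm_idx_max);
}
```
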

> +		return -EINVAL;
> +	}
> +
> +	for (chan_rank = 0; chan_rank < chan_rank_max; chan_rank++) {
> +		ret = peci_pcs_read(priv->peci_dev, PECI_PCS_DDR_DIMM_TEMP, chan_rank, &pcs);
> +		if (ret) {
> +			/*
> +			 * Overall, we expect either success or -EINVAL in
> +			 * order to determine whether DIMM is populated or not.
> +			 * For anything else - we fall back to deferring the

Why " - " ?

> +			 * detection to be performed at a later point in time.
> +			 */
> +			if (ret == -EINVAL) {
> +				chan_rank_empty |= BIT(chan_rank);
> +				continue;
> +			}
> +
> +			return -EAGAIN;
> +		}
> +
> +		for (dimm_idx = 0; dimm_idx < dimm_idx_max; dimm_idx++)
> +			if (__dimm_temp(pcs, dimm_idx))
> +				dimm_mask |= BIT(chan_rank * dimm_idx_max + dimm_idx);
> +	}
> +
> +	/* If we got all -EINVALs, it means that the CPU doesn't have any DIMMs. */
> +	if (chan_rank_empty == GENMASK(chan_rank_max - 1, 0))
> +		return -ENODEV;
> +
> +	/*
> +	 * It's possible that memory training is not done yet. In this case we
> +	 * defer the detection to be performed at a later point in time.
> +	 */
> +	if (!dimm_mask)
> +		return -EAGAIN;
> +
> +	dev_dbg(priv->dev, "Scanned populated DIMMs: %#llx\n", dimm_mask);
> +
> +	bitmap_from_u64(priv->dimm_mask, dimm_mask);
> +
> +	return 0;
> +}
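
To make the three outcomes of check_populated_dimms() easier to follow, here is a userspace sketch of the same decision logic. The function name and the array-based interface are hypothetical; the per-channel read results are passed in instead of coming from peci_pcs_read().

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * results[c] is the outcome of reading channel rank c (0 or a negative
 * errno); regs[c] is the register value for successful reads.
 */
static int classify_dimms(const int *results, const uint32_t *regs,
			  int chan_rank_max, int dimm_idx_max,
			  uint64_t *mask_out)
{
	uint64_t mask = 0;
	int empty = 0;
	int c, d;

	assert(chan_rank_max > 0 && dimm_idx_max > 0 && dimm_idx_max <= 4);
	assert(chan_rank_max * dimm_idx_max <= 64);

	for (c = 0; c < chan_rank_max; c++) {
		if (results[c] == -EINVAL) {	/* channel not populated */
			empty++;
			continue;
		}
		if (results[c])			/* anything else: retry later */
			return -EAGAIN;

		for (d = 0; d < dimm_idx_max; d++)
			if ((regs[c] >> (d * 8)) & 0xff)
				mask |= UINT64_C(1) << (c * dimm_idx_max + d);
	}

	if (empty == chan_rank_max)	/* all -EINVAL: no DIMMs at all */
		return -ENODEV;
	if (!mask)			/* e.g. memory training not done */
		return -EAGAIN;

	*mask_out = mask;
	return 0;
}
```

So -ENODEV is final, -EAGAIN reschedules the detection, and 0 means the mask is usable.
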
> +
> +static int create_dimm_temp_label(struct peci_dimmtemp *priv, int chan)
> +{
> +	int rank = chan / priv->gen_info->dimm_idx_max;
> +	int idx = chan % priv->gen_info->dimm_idx_max;
> +
> +	priv->dimmtemp_label[chan] = devm_kasprintf(priv->dev, GFP_KERNEL,
> +						    "DIMM %c%d", 'A' + rank,
> +						    idx + 1);
> +	if (!priv->dimmtemp_label[chan])
> +		return -ENOMEM;
> +
> +	return 0;
> +}
> +
> +static const u32 peci_dimmtemp_temp_channel_config[] = {
> +	[0 ... DIMM_NUMS_MAX - 1] = HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_CRIT,
> +	0
> +};
> +
> +static const struct hwmon_channel_info peci_dimmtemp_temp_channel = {
> +	.type = hwmon_temp,
> +	.config = peci_dimmtemp_temp_channel_config,
> +};
> +
> +static const struct hwmon_channel_info *peci_dimmtemp_temp_info[] = {
> +	&peci_dimmtemp_temp_channel,
> +	NULL
> +};
> +
> +static const struct hwmon_chip_info peci_dimmtemp_chip_info = {
> +	.ops = &peci_dimmtemp_ops,
> +	.info = peci_dimmtemp_temp_info,
> +};
> +
> +static int create_dimm_temp_info(struct peci_dimmtemp *priv)
> +{
> +	int ret, i, channels;
> +	struct device *dev;
> +
> +	/*
> +	 * We expect to either find populated DIMMs and carry on with creating
> +	 * sensors, or find out that there are no DIMMs populated.
> +	 * All other states mean that the platform never reached the state that
> +	 * allows to check DIMM state - causing us to retry later on.
> +	 */
> +	ret = check_populated_dimms(priv);
> +	if (ret == -ENODEV) {
> +		dev_dbg(priv->dev, "No DIMMs found\n");
> +		return 0;
> +	} else if (ret) {
> +		schedule_delayed_work(&priv->detect_work, DIMM_MASK_CHECK_DELAY_JIFFIES);
> +		dev_dbg(priv->dev, "Deferred populating DIMM temp info\n");
> +		return ret;
> +	}
> +
> +	channels = priv->gen_info->chan_rank_max * priv->gen_info->dimm_idx_max;
> +
> +	priv->dimmtemp_label = devm_kzalloc(priv->dev, channels * sizeof(char *), GFP_KERNEL);
> +	if (!priv->dimmtemp_label)
> +		return -ENOMEM;
> +
> +	for_each_set_bit(i, priv->dimm_mask, DIMM_NUMS_MAX) {
> +		ret = create_dimm_temp_label(priv, i);
> +		if (ret)
> +			return ret;
> +		mutex_init(&priv->dimm[i].thresholds.state.lock);
> +		mutex_init(&priv->dimm[i].temp.state.lock);
> +	}
> +
> +	dev = devm_hwmon_device_register_with_info(priv->dev, priv->name, priv,
> +						   &peci_dimmtemp_chip_info, NULL);
> +	if (IS_ERR(dev)) {
> +		dev_err(priv->dev, "Failed to register hwmon device\n");
> +		return PTR_ERR(dev);
> +	}
> +
> +	dev_dbg(priv->dev, "%s: sensor '%s'\n", dev_name(dev), priv->name);
> +
> +	return 0;
> +}
> +
> +static void create_dimm_temp_info_delayed(struct work_struct *work)
> +{
> +	struct peci_dimmtemp *priv = container_of(to_delayed_work(work),
> +						  struct peci_dimmtemp,
> +						  detect_work);
> +	int ret;
> +
> +	ret = create_dimm_temp_info(priv);
> +	if (ret && ret != -EAGAIN)
> +		dev_err(priv->dev, "Failed to populate DIMM temp info\n");
> +}
> +
> +static void remove_delayed_work(void *_priv)
> +{
> +	struct peci_dimmtemp *priv = _priv;
> +
> +	cancel_delayed_work_sync(&priv->detect_work);
> +}
> +
> +static int peci_dimmtemp_probe(struct auxiliary_device *adev, const struct auxiliary_device_id *id)
> +{
> +	struct device *dev = &adev->dev;
> +	struct peci_device *peci_dev = to_peci_device(dev->parent);
> +	struct peci_dimmtemp *priv;
> +	int ret;
> +
> +	priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
> +	if (!priv)
> +		return -ENOMEM;
> +
> +	priv->name = devm_kasprintf(dev, GFP_KERNEL, "peci_dimmtemp.cpu%d",
> +				    peci_dev->info.socket_id);
> +	if (!priv->name)
> +		return -ENOMEM;
> +
> +	priv->dev = dev;
> +	priv->peci_dev = peci_dev;
> +	priv->gen_info = (const struct dimm_info *)id->driver_data;
> +
> +	/*
> +	 * This is just a sanity check. Since we're using commands that are
> +	 * guaranteed to be supported on a given platform, we should never see
> +	 * revision lower than expected.
> +	 */
> +	if (peci_dev->info.peci_revision < priv->gen_info->min_peci_revision)
> +		dev_warn(priv->dev,
> +			 "Unexpected PECI revision %#x, some features may be unavailable\n",
> +			 peci_dev->info.peci_revision);
> +
> +	INIT_DELAYED_WORK(&priv->detect_work, create_dimm_temp_info_delayed);
> +
> +	ret = devm_add_action_or_reset(priv->dev, remove_delayed_work, priv);
> +	if (ret)
> +		return ret;
> +
> +	ret = create_dimm_temp_info(priv);
> +	if (ret && ret != -EAGAIN) {
> +		dev_err(dev, "Failed to populate DIMM temp info\n");
> +		return ret;
> +	}
> +
> +	return 0;
> +}
> +
> +static int
> +read_thresholds_hsx(struct peci_dimmtemp *priv, int dimm_order, int chan_rank, u32 *data)
> +{
> +	u8 dev, func;
> +	u16 reg;
> +	int ret;
> +
> +	/*
> +	 * Device 20, Function 0: IMC 0 channel 0 -> rank 0
> +	 * Device 20, Function 1: IMC 0 channel 1 -> rank 1
> +	 * Device 21, Function 0: IMC 0 channel 2 -> rank 2
> +	 * Device 21, Function 1: IMC 0 channel 3 -> rank 3
> +	 * Device 23, Function 0: IMC 1 channel 0 -> rank 4
> +	 * Device 23, Function 1: IMC 1 channel 1 -> rank 5
> +	 * Device 24, Function 0: IMC 1 channel 2 -> rank 6
> +	 * Device 24, Function 1: IMC 1 channel 3 -> rank 7
> +	 */
> +	dev = 20 + chan_rank / 2 + chan_rank / 4;
> +	func = chan_rank % 2;
> +	reg = 0x120 + dimm_order * 4;
> +
> +	ret = peci_pci_local_read(priv->peci_dev, 1, dev, func, reg, data);
> +	if (ret)
> +		return ret;
> +
> +	return 0;
> +}
> +
> +static int
> +read_thresholds_bdxd(struct peci_dimmtemp *priv, int dimm_order, int chan_rank, u32 *data)
> +{
> +	u8 dev, func;
> +	u16 reg;
> +	int ret;
> +
> +	/*
> +	 * Device 10, Function 2: IMC 0 channel 0 -> rank 0
> +	 * Device 10, Function 6: IMC 0 channel 1 -> rank 1
> +	 * Device 12, Function 2: IMC 1 channel 0 -> rank 2
> +	 * Device 12, Function 6: IMC 1 channel 1 -> rank 3
> +	 */
> +	dev = 10 + chan_rank / 2 * 2;
> +	func = (chan_rank % 2) ? 6 : 2;
> +	reg = 0x120 + dimm_order * 4;
> +
> +	ret = peci_pci_local_read(priv->peci_dev, 2, dev, func, reg, data);
> +	if (ret)
> +		return ret;
> +
> +	return 0;
> +}
> +
> +static int
> +read_thresholds_skx(struct peci_dimmtemp *priv, int dimm_order, int chan_rank, u32 *data)
> +{
> +	u8 dev, func;
> +	u16 reg;
> +	int ret;
> +
> +	/*
> +	 * Device 10, Function 2: IMC 0 channel 0 -> rank 0
> +	 * Device 10, Function 6: IMC 0 channel 1 -> rank 1
> +	 * Device 11, Function 2: IMC 0 channel 2 -> rank 2
> +	 * Device 12, Function 2: IMC 1 channel 0 -> rank 3
> +	 * Device 12, Function 6: IMC 1 channel 1 -> rank 4
> +	 * Device 13, Function 2: IMC 1 channel 2 -> rank 5
> +	 */
> +	dev = 10 + chan_rank / 3 * 2 + (chan_rank % 3 == 2 ? 1 : 0);
> +	func = chan_rank % 3 == 1 ? 6 : 2;
> +	reg = 0x120 + dimm_order * 4;
> +
> +	ret = peci_pci_local_read(priv->peci_dev, 2, dev, func, reg, data);
> +	if (ret)
> +		return ret;
> +
> +	return 0;
> +}
> +
> +static int
> +read_thresholds_icx(struct peci_dimmtemp *priv, int dimm_order, int chan_rank, u32 *data)
> +{
> +	u32 reg_val;
> +	u64 offset;
> +	int ret;
> +	u8 dev;
> +
> +	ret = peci_ep_pci_local_read(priv->peci_dev, 0, 13, 0, 2, 0xd4, &reg_val);
> +	if (ret || !(reg_val & BIT(31)))
> +		return -ENODATA; /* Use default or previous value */
> +
> +	ret = peci_ep_pci_local_read(priv->peci_dev, 0, 13, 0, 2, 0xd0, &reg_val);
> +	if (ret)
> +		return -ENODATA; /* Use default or previous value */
> +
> +	/*
> +	 * Device 26, Offset 224e0: IMC 0 channel 0 -> rank 0
> +	 * Device 26, Offset 264e0: IMC 0 channel 1 -> rank 1
> +	 * Device 27, Offset 224e0: IMC 1 channel 0 -> rank 2
> +	 * Device 27, Offset 264e0: IMC 1 channel 1 -> rank 3
> +	 * Device 28, Offset 224e0: IMC 2 channel 0 -> rank 4
> +	 * Device 28, Offset 264e0: IMC 2 channel 1 -> rank 5
> +	 * Device 29, Offset 224e0: IMC 3 channel 0 -> rank 6
> +	 * Device 29, Offset 264e0: IMC 3 channel 1 -> rank 7
> +	 */
> +	dev = 26 + chan_rank / 2;
> +	offset = 0x224e0 + dimm_order * 4 + (chan_rank % 2) * 0x4000;
> +
> +	ret = peci_mmio_read(priv->peci_dev, 0, GET_CPU_SEG(reg_val), GET_CPU_BUS(reg_val),
> +			     dev, 0, offset, data);
> +	if (ret)
> +		return ret;
> +
> +	return 0;
> +}
> +
> +static const struct dimm_info dimm_hsx = {
> +	.chan_rank_max	= CHAN_RANK_MAX_ON_HSX,
> +	.dimm_idx_max	= DIMM_IDX_MAX_ON_HSX,
> +	.min_peci_revision = 0x33,
> +	.read_thresholds = &read_thresholds_hsx,
> +};
> +
> +static const struct dimm_info dimm_bdx = {
> +	.chan_rank_max	= CHAN_RANK_MAX_ON_BDX,
> +	.dimm_idx_max	= DIMM_IDX_MAX_ON_BDX,
> +	.min_peci_revision = 0x33,
> +	.read_thresholds = &read_thresholds_hsx,
> +};
> +
> +static const struct dimm_info dimm_bdxd = {
> +	.chan_rank_max	= CHAN_RANK_MAX_ON_BDXD,
> +	.dimm_idx_max	= DIMM_IDX_MAX_ON_BDXD,
> +	.min_peci_revision = 0x33,
> +	.read_thresholds = &read_thresholds_bdxd,
> +};
> +
> +static const struct dimm_info dimm_skx = {
> +	.chan_rank_max	= CHAN_RANK_MAX_ON_SKX,
> +	.dimm_idx_max	= DIMM_IDX_MAX_ON_SKX,
> +	.min_peci_revision = 0x33,
> +	.read_thresholds = &read_thresholds_skx,
> +};
> +
> +static const struct dimm_info dimm_icx = {
> +	.chan_rank_max	= CHAN_RANK_MAX_ON_ICX,
> +	.dimm_idx_max	= DIMM_IDX_MAX_ON_ICX,
> +	.min_peci_revision = 0x40,
> +	.read_thresholds = &read_thresholds_icx,
> +};
> +
> +static const struct dimm_info dimm_icxd = {
> +	.chan_rank_max	= CHAN_RANK_MAX_ON_ICXD,
> +	.dimm_idx_max	= DIMM_IDX_MAX_ON_ICXD,
> +	.min_peci_revision = 0x40,
> +	.read_thresholds = &read_thresholds_icx,
> +};
> +
> +static const struct auxiliary_device_id peci_dimmtemp_ids[] = {
> +	{
> +		.name = "peci_cpu.dimmtemp.hsx",
> +		.driver_data = (kernel_ulong_t)&dimm_hsx,
> +	},
> +	{
> +		.name = "peci_cpu.dimmtemp.bdx",
> +		.driver_data = (kernel_ulong_t)&dimm_bdx,
> +	},
> +	{
> +		.name = "peci_cpu.dimmtemp.bdxd",
> +		.driver_data = (kernel_ulong_t)&dimm_bdxd,
> +	},
> +	{
> +		.name = "peci_cpu.dimmtemp.skx",
> +		.driver_data = (kernel_ulong_t)&dimm_skx,
> +	},
> +	{
> +		.name = "peci_cpu.dimmtemp.icx",
> +		.driver_data = (kernel_ulong_t)&dimm_icx,
> +	},
> +	{
> +		.name = "peci_cpu.dimmtemp.icxd",
> +		.driver_data = (kernel_ulong_t)&dimm_icxd,
> +	},
> +	{ }
> +};
> +MODULE_DEVICE_TABLE(auxiliary, peci_dimmtemp_ids);
> +
> +static struct auxiliary_driver peci_dimmtemp_driver = {
> +	.probe		= peci_dimmtemp_probe,
> +	.id_table	= peci_dimmtemp_ids,
> +};
> +
> +module_auxiliary_driver(peci_dimmtemp_driver);
> +
> +MODULE_AUTHOR("Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>");
> +MODULE_AUTHOR("Iwona Winiarska <iwona.winiarska@intel.com>");
> +MODULE_DESCRIPTION("PECI dimmtemp driver");
> +MODULE_LICENSE("GPL");
> +MODULE_IMPORT_NS(PECI_CPU);
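
For anyone cross-checking the rank tables in the read_thresholds_*() comments, the device/function arithmetic can be replayed in userspace. This is only a sketch: the helper names below (hsx_dev(), skx_func(), icx_offset(), ...) are made up for illustration and are not part of the patch. The expressions themselves are copied from the quoted code.

```c
#include <stdint.h>

/* Expressions from read_thresholds_hsx(): devices 20, 21, 23, 24 (22 is skipped). */
static uint8_t hsx_dev(int chan_rank)
{
	return 20 + chan_rank / 2 + chan_rank / 4;
}

static uint8_t hsx_func(int chan_rank)
{
	return chan_rank % 2;
}

/* Expressions from read_thresholds_bdxd(): devices 10 and 12, functions 2 and 6. */
static uint8_t bdxd_dev(int chan_rank)
{
	return 10 + chan_rank / 2 * 2;
}

static uint8_t bdxd_func(int chan_rank)
{
	return (chan_rank % 2) ? 6 : 2;
}

/* Expressions from read_thresholds_skx(): three ranks per IMC. */
static uint8_t skx_dev(int chan_rank)
{
	return 10 + chan_rank / 3 * 2 + (chan_rank % 3 == 2 ? 1 : 0);
}

static uint8_t skx_func(int chan_rank)
{
	return chan_rank % 3 == 1 ? 6 : 2;
}

/* Expressions from read_thresholds_icx(): MMIO offset instead of a PCI function. */
static uint8_t icx_dev(int chan_rank)
{
	return 26 + chan_rank / 2;
}

static uint64_t icx_offset(int chan_rank, int dimm_order)
{
	return 0x224e0 + dimm_order * 4 + (chan_rank % 2) * 0x4000;
}
```

Each helper returns the value listed in the corresponding comment table, e.g. HSX rank 4 maps to device 23, function 0, and SKX rank 4 maps to device 12, function 6.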

* Re: [PATCH v2 12/15] hwmon: peci: Add cputemp driver
  2021-08-03 15:24   ` Guenter Roeck
@ 2021-08-04 10:43     ` Winiarska, Iwona
From: Winiarska, Iwona @ 2021-08-04 10:43 UTC (permalink / raw)
  To: linux, gregkh, linux-kernel, openbmc
  Cc: corbet, jae.hyun.yoo, linux-hwmon, Lutomirski, Andy, Luck, Tony,
	andrew, andriy.shevchenko, mchehab, jdelvare, mingo, olof,
	rdunlap, devicetree, tglx, linux-aspeed, linux-doc, arnd,
	yazen.ghannam, zweiss, robh+dt, linux-arm-kernel, joel,
	d.mueller, bp, pierre-louis.bossart, x86, Williams, Dan J

On Tue, 2021-08-03 at 08:24 -0700, Guenter Roeck wrote:
> On 8/3/21 4:31 AM, Iwona Winiarska wrote:
> > Add peci-cputemp driver for Digital Thermal Sensor (DTS) thermal
> > readings of the processor package and processor cores that are
> > accessible via the PECI interface.
> > 
> > The main use case for the driver (and PECI interface) is out-of-band
> > management, where we're able to obtain the DTS readings from an external
> > entity connected with PECI, e.g. BMC on server platforms.
> > 
> > Co-developed-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
> > Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
> > Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com>
> > Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
> > ---
> >   MAINTAINERS                  |   7 +
> >   drivers/hwmon/Kconfig        |   2 +
> >   drivers/hwmon/Makefile       |   1 +
> >   drivers/hwmon/peci/Kconfig   |  18 ++
> >   drivers/hwmon/peci/Makefile  |   5 +
> >   drivers/hwmon/peci/common.h  |  58 ++++
> >   drivers/hwmon/peci/cputemp.c | 591 +++++++++++++++++++++++++++++++++++
> >   7 files changed, 682 insertions(+)
> >   create mode 100644 drivers/hwmon/peci/Kconfig
> >   create mode 100644 drivers/hwmon/peci/Makefile
> >   create mode 100644 drivers/hwmon/peci/common.h
> >   create mode 100644 drivers/hwmon/peci/cputemp.c
> > 
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index 3f5d48e1d143..e36b5c0824e3 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -14512,6 +14512,13 @@ L:     platform-driver-x86@vger.kernel.org
> >   S:    Maintained
> >   F:    drivers/platform/x86/peaq-wmi.c
> >   
> > +PECI HARDWARE MONITORING DRIVERS
> > +M:     Iwona Winiarska <iwona.winiarska@intel.com>
> > +R:     Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
> > +L:     linux-hwmon@vger.kernel.org
> > +S:     Supported
> > +F:     drivers/hwmon/peci/
> > +
> >   PECI SUBSYSTEM
> >   M:    Iwona Winiarska <iwona.winiarska@intel.com>
> >   R:    Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
> > diff --git a/drivers/hwmon/Kconfig b/drivers/hwmon/Kconfig
> > index e3675377bc5d..61c0e3404415 100644
> > --- a/drivers/hwmon/Kconfig
> > +++ b/drivers/hwmon/Kconfig
> > @@ -1507,6 +1507,8 @@ config SENSORS_PCF8591
> >           These devices are hard to detect and rarely found on mainstream
> >           hardware. If unsure, say N.
> >   
> > +source "drivers/hwmon/peci/Kconfig"
> > +
> >   source "drivers/hwmon/pmbus/Kconfig"
> >   
> >   config SENSORS_PWM_FAN
> > diff --git a/drivers/hwmon/Makefile b/drivers/hwmon/Makefile
> > index d712c61c1f5e..f52331f212ed 100644
> > --- a/drivers/hwmon/Makefile
> > +++ b/drivers/hwmon/Makefile
> > @@ -202,6 +202,7 @@ obj-$(CONFIG_SENSORS_WM8350)        += wm8350-hwmon.o
> >   obj-$(CONFIG_SENSORS_XGENE)   += xgene-hwmon.o
> >   
> >   obj-$(CONFIG_SENSORS_OCC)     += occ/
> > +obj-$(CONFIG_SENSORS_PECI)     += peci/
> >   obj-$(CONFIG_PMBUS)           += pmbus/
> >   
> >   ccflags-$(CONFIG_HWMON_DEBUG_CHIP) := -DDEBUG
> > diff --git a/drivers/hwmon/peci/Kconfig b/drivers/hwmon/peci/Kconfig
> > new file mode 100644
> > index 000000000000..e10eed68d70a
> > --- /dev/null
> > +++ b/drivers/hwmon/peci/Kconfig
> > @@ -0,0 +1,18 @@
> > +# SPDX-License-Identifier: GPL-2.0-only
> > +
> > +config SENSORS_PECI_CPUTEMP
> > +       tristate "PECI CPU temperature monitoring client"
> > +       depends on PECI
> > +       select SENSORS_PECI
> > +       select PECI_CPU
> > +       help
> > +         If you say yes here you get support for the generic Intel PECI
> > +         cputemp driver which provides Digital Thermal Sensor (DTS) thermal
> > +         readings of the CPU package and CPU cores that are accessible via
> > +         the processor PECI interface.
> > +
> > +         This driver can also be built as a module. If so, the module
> > +         will be called peci-cputemp.
> > +
> > +config SENSORS_PECI
> > +       tristate
> > diff --git a/drivers/hwmon/peci/Makefile b/drivers/hwmon/peci/Makefile
> > new file mode 100644
> > index 000000000000..e8a0ada5ab1f
> > --- /dev/null
> > +++ b/drivers/hwmon/peci/Makefile
> > @@ -0,0 +1,5 @@
> > +# SPDX-License-Identifier: GPL-2.0-only
> > +
> > +peci-cputemp-y := cputemp.o
> > +
> > +obj-$(CONFIG_SENSORS_PECI_CPUTEMP)     += peci-cputemp.o
> > diff --git a/drivers/hwmon/peci/common.h b/drivers/hwmon/peci/common.h
> > new file mode 100644
> > index 000000000000..734506b0eca2
> > --- /dev/null
> > +++ b/drivers/hwmon/peci/common.h
> > @@ -0,0 +1,58 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +/* Copyright (c) 2021 Intel Corporation */
> > +
> > +#include <linux/mutex.h>
> > +#include <linux/types.h>
> > +
> > +#ifndef __PECI_HWMON_COMMON_H
> > +#define __PECI_HWMON_COMMON_H
> > +
> > +#define PECI_HWMON_UPDATE_INTERVAL     HZ
> > +
> > +/**
> > + * struct peci_sensor_state - PECI state information
> > + * @valid: flag to indicate the sensor value is valid
> > + * @last_updated: time of the last update in jiffies
> > + * @lock: mutex to protect sensor access
> > + */
> > +struct peci_sensor_state {
> > +       bool valid;
> > +       unsigned long last_updated;
> > +       struct mutex lock; /* protect sensor access */
> > +};
> > +
> > +/**
> > + * struct peci_sensor_data - PECI sensor information
> > + * @value: sensor value in milli units
> > + * @state: sensor update state
> > + */
> > +
> > +struct peci_sensor_data {
> > +       s32 value;
> > +       struct peci_sensor_state state;
> > +};
> > +
> > +/**
> > + * peci_sensor_need_update() - check whether sensor update is needed or not
> > + * @sensor: pointer to sensor data struct
> > + *
> > + * Return: true if update is needed, false if not.
> > + */
> > +
> > +static inline bool peci_sensor_need_update(struct peci_sensor_state *state)
> > +{
> > +       return !state->valid ||
> > +              time_after(jiffies, state->last_updated + PECI_HWMON_UPDATE_INTERVAL);
> > +}
> > +
> > +/**
> > + * peci_sensor_mark_updated() - mark the sensor is updated
> > + * @sensor: pointer to sensor data struct
> > + */
> > +static inline void peci_sensor_mark_updated(struct peci_sensor_state *state)
> > +{
> > +       state->valid = true;
> > +       state->last_updated = jiffies;
> > +}
> > +
> > +#endif /* __PECI_HWMON_COMMON_H */
> > diff --git a/drivers/hwmon/peci/cputemp.c b/drivers/hwmon/peci/cputemp.c
> > new file mode 100644
> > index 000000000000..9c6858a9fb6d
> > --- /dev/null
> > +++ b/drivers/hwmon/peci/cputemp.c
> > @@ -0,0 +1,591 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +// Copyright (c) 2018-2021 Intel Corporation
> > +
> > +#include <linux/auxiliary_bus.h>
> > +#include <linux/bitfield.h>
> > +#include <linux/bitops.h>
> > +#include <linux/hwmon.h>
> > +#include <linux/jiffies.h>
> > +#include <linux/module.h>
> > +#include <linux/peci.h>
> > +#include <linux/peci-cpu.h>
> > +#include <linux/units.h>
> > +#include <linux/x86/intel-family.h>
> > +
> > +#include "common.h"
> > +
> > +#define CORE_NUMS_MAX          64
> > +
> > +#define BASE_CHANNEL_NUMS      5
> > +#define CPUTEMP_CHANNEL_NUMS   (BASE_CHANNEL_NUMS + CORE_NUMS_MAX)
> > +
> > +#define TEMP_TARGET_FAN_TEMP_MASK      GENMASK(15, 8)
> > +#define TEMP_TARGET_REF_TEMP_MASK      GENMASK(23, 16)
> > +#define TEMP_TARGET_TJ_OFFSET_MASK     GENMASK(29, 24)
> > +
> > +#define DTS_MARGIN_MASK                GENMASK(15, 0)
> > +#define PCS_MODULE_TEMP_MASK   GENMASK(15, 0)
> > +
> > +#define DTS_FIXED_POINT_FRACTION       64
> > +
> > +struct resolved_cores_reg {
> > +       u8 bus;
> > +       u8 dev;
> > +       u8 func;
> > +       u8 offset;
> > +};
> > +
> > +struct cpu_info {
> > +       struct resolved_cores_reg *reg;
> > +       u8 min_peci_revision;
> > +};
> > +
> > +struct peci_temp_target {
> > +       s32 tcontrol;
> > +       s32 tthrottle;
> > +       s32 tjmax;
> > +       struct peci_sensor_state state;
> > +};
> > +
> > +enum peci_temp_target_type {
> > +       tcontrol_type,
> > +       tthrottle_type,
> > +       tjmax_type,
> > +       crit_hyst_type,
> > +};
> > +
> > +struct peci_cputemp {
> > +       struct peci_device *peci_dev;
> > +       struct device *dev;
> > +       const char *name;
> > +       const struct cpu_info *gen_info;
> > +       struct {
> > +               struct peci_temp_target target;
> > +               struct peci_sensor_data die;
> > +               struct peci_sensor_data dts;
> > +               struct peci_sensor_data core[CORE_NUMS_MAX];
> > +       } temp;
> > +       const char **coretemp_label;
> > +       DECLARE_BITMAP(core_mask, CORE_NUMS_MAX);
> > +};
> > +
> > +enum cputemp_channels {
> > +       channel_die,
> > +       channel_dts,
> > +       channel_tcontrol,
> > +       channel_tthrottle,
> > +       channel_tjmax,
> > +       channel_core,
> > +};
> > +
> > +static const char * const cputemp_label[BASE_CHANNEL_NUMS] = {
> > +       "Die",
> > +       "DTS",
> > +       "Tcontrol",
> > +       "Tthrottle",
> > +       "Tjmax",
> > +};
> > +
> > +static int update_temp_target(struct peci_cputemp *priv)
> > +{
> > +       s32 tthrottle_offset, tcontrol_margin;
> > +       u32 pcs;
> > +       int ret;
> > +
> > +       if (!peci_sensor_need_update(&priv->temp.target.state))
> > +               return 0;
> > +
> > +       ret = peci_pcs_read(priv->peci_dev, PECI_PCS_TEMP_TARGET, 0, &pcs);
> > +       if (ret)
> > +               return ret;
> > +
> > +       priv->temp.target.tjmax =
> > +               FIELD_GET(TEMP_TARGET_REF_TEMP_MASK, pcs) * MILLIDEGREE_PER_DEGREE;
> > +
> > +       tcontrol_margin = FIELD_GET(TEMP_TARGET_FAN_TEMP_MASK, pcs);
> > +       tcontrol_margin = sign_extend32(tcontrol_margin, 7) * MILLIDEGREE_PER_DEGREE;
> > +       priv->temp.target.tcontrol = priv->temp.target.tjmax - tcontrol_margin;
> > +
> > +       tthrottle_offset = FIELD_GET(TEMP_TARGET_TJ_OFFSET_MASK, pcs) * MILLIDEGREE_PER_DEGREE;
> > +       priv->temp.target.tthrottle = priv->temp.target.tjmax - tthrottle_offset;
> > +
> > +       peci_sensor_mark_updated(&priv->temp.target.state);
> > +
> > +       return 0;
> > +}
> > +
> > +static int get_temp_target(struct peci_cputemp *priv, enum peci_temp_target_type type, long *val)
> > +{
> > +       int ret;
> > +
> > +       mutex_lock(&priv->temp.target.state.lock);
> > +
> > +       ret = update_temp_target(priv);
> > +       if (ret)
> > +               goto unlock;
> > +
> > +       switch (type) {
> > +       case tcontrol_type:
> > +               *val = priv->temp.target.tcontrol;
> > +               break;
> > +       case tthrottle_type:
> > +               *val = priv->temp.target.tthrottle;
> > +               break;
> > +       case tjmax_type:
> > +               *val = priv->temp.target.tjmax;
> > +               break;
> > +       case crit_hyst_type:
> > +               *val = priv->temp.target.tjmax - priv->temp.target.tcontrol;
> > +               break;
> > +       default:
> > +               ret = -EOPNOTSUPP;
> > +               break;
> > +       }
> > +unlock:
> > +       mutex_unlock(&priv->temp.target.state.lock);
> > +
> > +       return ret;
> > +}
> > +
> > +/*
> > + * Processors return a value of DTS reading in S10.6 fixed point format
> > + * (16 bits: 10-bit signed magnitude, 6-bit fraction).
> > + * Error codes:
> > + *   0x8000: General sensor error
> > + *   0x8001: Reserved
> > + *   0x8002: Underflow on reading value
> > + *   0x8003-0x81ff: Reserved
> > + */
> > +static bool dts_valid(s32 val)
> > +{
> > +       return val < 0x8000 || val > 0x81ff;
> > +}
> > +
> > +static s32 dts_to_millidegree(s32 val)
> > +{
> > +       return sign_extend32(val, 15) * MILLIDEGREE_PER_DEGREE / DTS_FIXED_POINT_FRACTION;
> > +}
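
To make the S10.6 conversion concrete, here is a small userspace sketch. It assumes a raw 16-bit reading and mirrors the quoted helpers; sign_extend16() is a hypothetical stand-in for the kernel's sign_extend32(val, 15), and the sample values mentioned below are illustrative, not taken from hardware.

```c
#include <stdbool.h>
#include <stdint.h>

#define MILLIDEGREE_PER_DEGREE   1000
#define DTS_FIXED_POINT_FRACTION 64	/* 2^6: six fractional bits */

/* Hypothetical stand-in for sign_extend32(val, 15): bit 15 is the sign bit. */
static int32_t sign_extend16(uint32_t raw)
{
	raw &= 0xffff;
	return (raw & 0x8000) ? (int32_t)raw - 0x10000 : (int32_t)raw;
}

/* 0x8000..0x81ff are error codes, everything else is a reading. */
static bool dts_valid(uint32_t raw)
{
	return raw < 0x8000 || raw > 0x81ff;
}

/* Scale the signed fixed-point value to millidegrees Celsius. */
static int32_t dts_to_millidegree(uint32_t raw)
{
	return sign_extend16(raw) * MILLIDEGREE_PER_DEGREE / DTS_FIXED_POINT_FRACTION;
}
```

With these definitions, 0x0040 converts to 1000 (one degree) and 0xfd80 converts to -10000 (ten degrees below the reference), while 0x8002 is rejected as an error code.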
> > +
> > +static int get_die_temp(struct peci_cputemp *priv, long *val)
> > +{
> > +       long tjmax;
> > +       s16 temp;
> > +       int ret;
> > +
> > +       mutex_lock(&priv->temp.die.state.lock);
> > +       if (!peci_sensor_need_update(&priv->temp.die.state))
> > +               goto skip_update;
> > +
> > +       ret = peci_temp_read(priv->peci_dev, &temp);
> > +       if (ret)
> > +               goto err_unlock;
> > +
> > +       if (!dts_valid(temp)) {
> > +               ret = -EIO;
> > +               goto err_unlock;
> > +       }
> > +
> > +       ret = get_temp_target(priv, tjmax_type, &tjmax);
> > +       if (ret)
> > +               goto err_unlock;
> > +
> > +       priv->temp.die.value = (s32)tjmax + dts_to_millidegree(temp);
> > +
> > +       peci_sensor_mark_updated(&priv->temp.die.state);
> > +
> > +skip_update:
> > +       *val = priv->temp.die.value;
> > +       mutex_unlock(&priv->temp.die.state.lock);
> > +
> > +       return 0;
> > +
> > +err_unlock:
> > +       mutex_unlock(&priv->temp.die.state.lock);
> > +       return ret;
> > +}
> > +
> > +static int get_dts(struct peci_cputemp *priv, long *val)
> > +{
> > +       s32 dts_margin;
> > +       long tcontrol;
> > +       u32 pcs;
> > +       int ret;
> > +
> > +       mutex_lock(&priv->temp.dts.state.lock);
> > +       if (!peci_sensor_need_update(&priv->temp.dts.state))
> > +               goto skip_update;
> > +
> > +       ret = peci_pcs_read(priv->peci_dev, PECI_PCS_THERMAL_MARGIN, 0, &pcs);
> > +       if (ret)
> > +               goto err_unlock;
> > +
> > +       dts_margin = FIELD_GET(DTS_MARGIN_MASK, pcs);
> > +       if (!dts_valid(dts_margin)) {
> > +               ret = -EIO;
> > +               goto err_unlock;
> > +       }
> > +
> > +       ret = get_temp_target(priv, tcontrol_type, &tcontrol);
> > +       if (ret)
> > +               goto err_unlock;
> > +
> > +       /* Note that the tcontrol should be available before calling it */
> > +       priv->temp.dts.value = (s32)tcontrol - dts_to_millidegree(dts_margin);
> > +
> > +       peci_sensor_mark_updated(&priv->temp.dts.state);
> > +
> > +skip_update:
> > +       *val = priv->temp.dts.value;
> > +       mutex_unlock(&priv->temp.dts.state.lock);
> > +
> > +       return 0;
> > +
> > +err_unlock:
> > +       mutex_unlock(&priv->temp.dts.state.lock);
> > +       return ret;
> 
> Simplify (see below)
> 
> > +}
> > +
> > +static int get_core_temp(struct peci_cputemp *priv, int core_index, long *val)
> > +{
> > +       s32 core_dts_margin;
> > +       long tjmax;
> > +       u32 pcs;
> > +       int ret;
> 
>         int ret = 0;
> 
> to handle simplification below.
> 
> > +
> > +       mutex_lock(&priv->temp.core[core_index].state.lock);
> > +       if (!peci_sensor_need_update(&priv->temp.core[core_index].state))
> > +               goto skip_update;
> > +
> > +       ret = peci_pcs_read(priv->peci_dev, PECI_PCS_MODULE_TEMP, core_index, &pcs);
> > +       if (ret)
> > +               goto err_unlock;
> > +
> > +       core_dts_margin = FIELD_GET(PCS_MODULE_TEMP_MASK, pcs);
> > +       if (!dts_valid(core_dts_margin)) {
> > +               ret = -EIO;
> > +               goto err_unlock;
> > +       }
> > +
> > +       ret = get_temp_target(priv, tjmax_type, &tjmax);
> > +       if (ret)
> > +               goto err_unlock;
> > +
> > +       /* Note that the tjmax should be available before calling it */
> > +       priv->temp.core[core_index].value = (s32)tjmax + dts_to_millidegree(core_dts_margin);
> > +
> > +       peci_sensor_mark_updated(&priv->temp.core[core_index].state);
> > +
> > +skip_update:
> > +       *val = priv->temp.core[core_index].value;
> > +       mutex_unlock(&priv->temp.core[core_index].state.lock);
> > +
> > +       return 0;
> > +
> > +err_unlock:
> > +       mutex_unlock(&priv->temp.core[core_index].state.lock);
> > +       return ret;
> 
> Simplify:
> 
> skip_update:
>         *val = priv->temp.core[core_index].value;
> err_unlock:
>         mutex_unlock(&priv->temp.core[core_index].state.lock);
>         return ret;

Sure, I'll use the same pattern in other places as well.
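
For reference, a minimal userspace sketch of that single-exit pattern. Everything here is a stand-in made up for illustration (struct fake_lock, struct sensor, sensor_hw_read(), sensor_get()); it is not the driver code, it only shows how initializing ret to 0 lets the cached path and the error path share one unlock and one return.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical lock that only counts depth, so balance can be checked. */
struct fake_lock {
	int depth;
};

static void fake_lock_acquire(struct fake_lock *l) { l->depth++; }
static void fake_lock_release(struct fake_lock *l) { l->depth--; }

struct sensor {
	struct fake_lock lock;
	bool valid;	/* cached value is usable */
	long value;	/* cached value */
	int hw_error;	/* error to inject, 0 for success */
	long hw_value;	/* value the fake hardware returns */
};

/* Hypothetical hardware access: fails with the injected error, if any. */
static int sensor_hw_read(struct sensor *s, long *out)
{
	if (s->hw_error)
		return s->hw_error;
	*out = s->hw_value;
	return 0;
}

static int sensor_get(struct sensor *s, long *val)
{
	long raw;
	int ret = 0;	/* stays 0 on the cached path */

	fake_lock_acquire(&s->lock);
	if (s->valid)
		goto skip_update;

	ret = sensor_hw_read(s, &raw);
	if (ret)
		goto err_unlock;

	s->value = raw;
	s->valid = true;

skip_update:
	*val = s->value;
err_unlock:
	fake_lock_release(&s->lock);
	return ret;
}
```

On error the output argument is left untouched and the lock is still released exactly once, which is the behaviour the two separate exit paths had before.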

Thank you
-Iwona

> 
> > +}
> > +
> > +static int cputemp_read_string(struct device *dev, enum hwmon_sensor_types type,
> > +                              u32 attr, int channel, const char **str)
> > +{
> > +       struct peci_cputemp *priv = dev_get_drvdata(dev);
> > +
> > +       if (attr != hwmon_temp_label)
> > +               return -EOPNOTSUPP;
> > +
> > +       *str = channel < channel_core ?
> > +               cputemp_label[channel] : priv->coretemp_label[channel - channel_core];
> > +
> > +       return 0;
> > +}
> > +
> > +static int cputemp_read(struct device *dev, enum hwmon_sensor_types type,
> > +                       u32 attr, int channel, long *val)
> > +{
> > +       struct peci_cputemp *priv = dev_get_drvdata(dev);
> > +
> > +       switch (attr) {
> > +       case hwmon_temp_input:
> > +               switch (channel) {
> > +               case channel_die:
> > +                       return get_die_temp(priv, val);
> > +               case channel_dts:
> > +                       return get_dts(priv, val);
> > +               case channel_tcontrol:
> > +                       return get_temp_target(priv, tcontrol_type, val);
> > +               case channel_tthrottle:
> > +                       return get_temp_target(priv, tthrottle_type, val);
> > +               case channel_tjmax:
> > +                       return get_temp_target(priv, tjmax_type, val);
> > +               default:
> > +                       return get_core_temp(priv, channel - channel_core, val);
> > +               }
> > +               break;
> > +       case hwmon_temp_max:
> > +               return get_temp_target(priv, tcontrol_type, val);
> > +       case hwmon_temp_crit:
> > +               return get_temp_target(priv, tjmax_type, val);
> > +       case hwmon_temp_crit_hyst:
> > +               return get_temp_target(priv, crit_hyst_type, val);
> > +       default:
> > +               return -EOPNOTSUPP;
> > +       }
> > +
> > +       return 0;
> > +}
> > +
> > +static umode_t cputemp_is_visible(const void *data, enum hwmon_sensor_types type,
> > +                                 u32 attr, int channel)
> > +{
> > +       const struct peci_cputemp *priv = data;
> > +
> > +       if (channel > CPUTEMP_CHANNEL_NUMS)
> > +               return 0;
> > +
> > +       if (channel < channel_core)
> > +               return 0444;
> > +
> > +       if (test_bit(channel - channel_core, priv->core_mask))
> > +               return 0444;
> > +
> > +       return 0;
> > +}
> > +
> > +static int init_core_mask(struct peci_cputemp *priv)
> > +{
> > +       struct peci_device *peci_dev = priv->peci_dev;
> > +       struct resolved_cores_reg *reg = priv->gen_info->reg;
> > +       u64 core_mask;
> > +       u32 data;
> > +       int ret;
> > +
> > +       /* Get the RESOLVED_CORES register value */
> > +       switch (peci_dev->info.model) {
> > +       case INTEL_FAM6_ICELAKE_X:
> > +       case INTEL_FAM6_ICELAKE_D:
> > +               ret = peci_ep_pci_local_read(peci_dev, 0, reg->bus, reg->dev,
> > +                                            reg->func, reg->offset + 4, &data);
> > +               if (ret)
> > +                       return ret;
> > +
> > +               core_mask = (u64)data << 32;
> > +
> > +               ret = peci_ep_pci_local_read(peci_dev, 0, reg->bus, reg->dev,
> > +                                            reg->func, reg->offset, &data);
> > +               if (ret)
> > +                       return ret;
> > +
> > +               core_mask |= data;
> > +
> > +               break;
> > +       default:
> > +               ret = peci_pci_local_read(peci_dev, reg->bus, reg->dev,
> > +                                         reg->func, reg->offset, &data);
> > +               if (ret)
> > +                       return ret;
> > +
> > +               core_mask = data;
> > +
> > +               break;
> > +       }
> > +
> > +       if (!core_mask)
> > +               return -EIO;
> > +
> > +       bitmap_from_u64(priv->core_mask, core_mask);
> > +
> > +       return 0;
> > +}
> > +
> > +static int create_temp_label(struct peci_cputemp *priv)
> > +{
> > +       unsigned long core_max = find_last_bit(priv->core_mask, CORE_NUMS_MAX);
> > +       int i;
> > +
> > +       priv->coretemp_label = devm_kzalloc(priv->dev, core_max * sizeof(char *), GFP_KERNEL);
> > +       if (!priv->coretemp_label)
> > +               return -ENOMEM;
> > +
> > +       for_each_set_bit(i, priv->core_mask, CORE_NUMS_MAX) {
> > +               priv->coretemp_label[i] = devm_kasprintf(priv->dev, GFP_KERNEL, "Core %d", i);
> > +               if (!priv->coretemp_label[i])
> > +                       return -ENOMEM;
> > +       }
> > +
> > +       return 0;
> > +}
> > +
> > +static void check_resolved_cores(struct peci_cputemp *priv)
> > +{
> > +       /*
> > +        * Failure to resolve cores is non-critical, we're still able to
> > +        * provide other sensor data.
> > +        */
> > +
> > +       if (init_core_mask(priv))
> > +               return;
> > +
> > +       if (create_temp_label(priv))
> > +               bitmap_zero(priv->core_mask, CORE_NUMS_MAX);
> > +}
> > +
> > +static void sensor_init(struct peci_cputemp *priv)
> > +{
> > +       int i;
> > +
> > +       mutex_init(&priv->temp.target.state.lock);
> > +       mutex_init(&priv->temp.die.state.lock);
> > +       mutex_init(&priv->temp.dts.state.lock);
> > +
> > +       for_each_set_bit(i, priv->core_mask, CORE_NUMS_MAX)
> > +               mutex_init(&priv->temp.core[i].state.lock);
> > +}
> > +
> > +static const struct hwmon_ops peci_cputemp_ops = {
> > +       .is_visible = cputemp_is_visible,
> > +       .read_string = cputemp_read_string,
> > +       .read = cputemp_read,
> > +};
> > +
> > +static const u32 peci_cputemp_temp_channel_config[] = {
> > +       /* Die temperature */
> > +       HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_CRIT | HWMON_T_CRIT_HYST,
> > +       /* DTS margin */
> > +       HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_CRIT | HWMON_T_CRIT_HYST,
> > +       /* Tcontrol temperature */
> > +       HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_CRIT,
> > +       /* Tthrottle temperature */
> > +       HWMON_T_LABEL | HWMON_T_INPUT,
> > +       /* Tjmax temperature */
> > +       HWMON_T_LABEL | HWMON_T_INPUT,
> > +       /* Core temperature - for all core channels */
> > +       [channel_core ... CPUTEMP_CHANNEL_NUMS - 1] = HWMON_T_LABEL |
> > HWMON_T_INPUT,
> > +       0
> > +};
> > +
> > +static const struct hwmon_channel_info peci_cputemp_temp_channel = {
> > +       .type = hwmon_temp,
> > +       .config = peci_cputemp_temp_channel_config,
> > +};
> > +
> > +static const struct hwmon_channel_info *peci_cputemp_info[] = {
> > +       &peci_cputemp_temp_channel,
> > +       NULL
> > +};
> > +
> > +static const struct hwmon_chip_info peci_cputemp_chip_info = {
> > +       .ops = &peci_cputemp_ops,
> > +       .info = peci_cputemp_info,
> > +};
> > +
> > +static int peci_cputemp_probe(struct auxiliary_device *adev,
> > +                             const struct auxiliary_device_id *id)
> > +{
> > +       struct device *dev = &adev->dev;
> > +       struct peci_device *peci_dev = to_peci_device(dev->parent);
> > +       struct peci_cputemp *priv;
> > +       struct device *hwmon_dev;
> > +
> > +       priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
> > +       if (!priv)
> > +               return -ENOMEM;
> > +
> > +       priv->name = devm_kasprintf(dev, GFP_KERNEL, "peci_cputemp.cpu%d",
> > +                                   peci_dev->info.socket_id);
> > +       if (!priv->name)
> > +               return -ENOMEM;
> > +
> > +       priv->dev = dev;
> > +       priv->peci_dev = peci_dev;
> > +       priv->gen_info = (const struct cpu_info *)id->driver_data;
> > +
> > +       /*
> > +        * This is just a sanity check. Since we're using commands that are
> > +        * guaranteed to be supported on a given platform, we should never
> > see
> > +        * revision lower than expected.
> > +        */
> > +       if (peci_dev->info.peci_revision < priv->gen_info->min_peci_revision)
> > +               dev_warn(priv->dev,
> > +                        "Unexpected PECI revision %#x, some features may be unavailable\n",
> > +                        peci_dev->info.peci_revision);
> > +
> > +       check_resolved_cores(priv);
> > +
> > +       sensor_init(priv);
> > +
> > +       hwmon_dev = devm_hwmon_device_register_with_info(priv->dev, priv->name,
> > +                                                        priv, &peci_cputemp_chip_info, NULL);
> > +
> > +       return PTR_ERR_OR_ZERO(hwmon_dev);
> > +}
> > +
> > +/*
> > + * RESOLVED_CORES PCI configuration register may have different location on
> > + * different platforms.
> > + */
> > +static struct resolved_cores_reg resolved_cores_reg_hsx = {
> > +       .bus = 1,
> > +       .dev = 30,
> > +       .func = 3,
> > +       .offset = 0xb4,
> > +};
> > +
> > +static struct resolved_cores_reg resolved_cores_reg_icx = {
> > +       .bus = 14,
> > +       .dev = 30,
> > +       .func = 3,
> > +       .offset = 0xd0,
> > +};
> > +
> > +static const struct cpu_info cpu_hsx = {
> > +       .reg            = &resolved_cores_reg_hsx,
> > +       .min_peci_revision = 0x33,
> > +};
> > +
> > +static const struct cpu_info cpu_icx = {
> > +       .reg            = &resolved_cores_reg_icx,
> > +       .min_peci_revision = 0x40,
> > +};
> > +
> > +static const struct auxiliary_device_id peci_cputemp_ids[] = {
> > +       {
> > +               .name = "peci_cpu.cputemp.hsx",
> > +               .driver_data = (kernel_ulong_t)&cpu_hsx,
> > +       },
> > +       {
> > +               .name = "peci_cpu.cputemp.bdx",
> > +               .driver_data = (kernel_ulong_t)&cpu_hsx,
> > +       },
> > +       {
> > +               .name = "peci_cpu.cputemp.bdxd",
> > +               .driver_data = (kernel_ulong_t)&cpu_hsx,
> > +       },
> > +       {
> > +               .name = "peci_cpu.cputemp.skx",
> > +               .driver_data = (kernel_ulong_t)&cpu_hsx,
> > +       },
> > +       {
> > +               .name = "peci_cpu.cputemp.icx",
> > +               .driver_data = (kernel_ulong_t)&cpu_icx,
> > +       },
> > +       {
> > +               .name = "peci_cpu.cputemp.icxd",
> > +               .driver_data = (kernel_ulong_t)&cpu_icx,
> > +       },
> > +       { }
> > +};
> > +MODULE_DEVICE_TABLE(auxiliary, peci_cputemp_ids);
> > +
> > +static struct auxiliary_driver peci_cputemp_driver = {
> > +       .probe          = peci_cputemp_probe,
> > +       .id_table       = peci_cputemp_ids,
> > +};
> > +
> > +module_auxiliary_driver(peci_cputemp_driver);
> > +
> > +MODULE_AUTHOR("Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>");
> > +MODULE_AUTHOR("Iwona Winiarska <iwona.winiarska@intel.com>");
> > +MODULE_DESCRIPTION("PECI cputemp driver");
> > +MODULE_LICENSE("GPL");
> > +MODULE_IMPORT_NS(PECI_CPU);
> > 
> 



* Re: [PATCH v2 13/15] hwmon: peci: Add dimmtemp driver
  2021-08-03 15:39   ` Guenter Roeck
@ 2021-08-04 10:46     ` Winiarska, Iwona
  2021-08-04 17:33       ` Guenter Roeck
  0 siblings, 1 reply; 49+ messages in thread
From: Winiarska, Iwona @ 2021-08-04 10:46 UTC (permalink / raw)
  To: linux
  Cc: corbet, jae.hyun.yoo, Williams, Dan J, linux-hwmon, andrew, Luck,
	Tony, Lutomirski, Andy, andriy.shevchenko, mchehab, jdelvare,
	linux-kernel, olof, mingo, rdunlap, devicetree, tglx,
	linux-aspeed, arnd, linux-doc, yazen.ghannam, zweiss, robh+dt,
	openbmc, gregkh, joel, d.mueller, linux-arm-kernel,
	pierre-louis.bossart, x86, bp

On Tue, 2021-08-03 at 08:39 -0700, Guenter Roeck wrote:
> On Tue, Aug 03, 2021 at 01:31:32PM +0200, Iwona Winiarska wrote:
> > Add peci-dimmtemp driver for Temperature Sensor on DIMM readings that
> > are accessible via the processor PECI interface.
> > 
> > The main use case for the driver (and PECI interface) is out-of-band
> > management, where we're able to obtain thermal readings from an external
> > entity connected with PECI, e.g. BMC on server platforms.
> > 
> > Co-developed-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
> > Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
> > Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com>
> > Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
> > ---
> > Note that the timeout was completely removed - we're going to probe
> > for detected DIMMs every 5 seconds until we reach "stable" state of
> > either getting correct DIMM data or getting all -EINVAL (which
> > suggests that the CPU doesn't have any DIMMs).
> > 
> >  drivers/hwmon/peci/Kconfig    |  13 +
> >  drivers/hwmon/peci/Makefile   |   2 +
> >  drivers/hwmon/peci/dimmtemp.c | 614 ++++++++++++++++++++++++++++++++++
> >  3 files changed, 629 insertions(+)
> >  create mode 100644 drivers/hwmon/peci/dimmtemp.c
> > 
> > diff --git a/drivers/hwmon/peci/Kconfig b/drivers/hwmon/peci/Kconfig
> > index e10eed68d70a..9d32a57badfe 100644
> > --- a/drivers/hwmon/peci/Kconfig
> > +++ b/drivers/hwmon/peci/Kconfig
> > @@ -14,5 +14,18 @@ config SENSORS_PECI_CPUTEMP
> >           This driver can also be built as a module. If so, the module
> >           will be called peci-cputemp.
> >  
> > +config SENSORS_PECI_DIMMTEMP
> > +       tristate "PECI DIMM temperature monitoring client"
> > +       depends on PECI
> > +       select SENSORS_PECI
> > +       select PECI_CPU
> > +       help
> > +         If you say yes here you get support for the generic Intel PECI
> > hwmon
> > +         driver which provides Temperature Sensor on DIMM readings that are
> > +         accessible via the processor PECI interface.
> > +
> > +         This driver can also be built as a module. If so, the module
> > +         will be called peci-dimmtemp.
> > +
> >  config SENSORS_PECI
> >         tristate
> > diff --git a/drivers/hwmon/peci/Makefile b/drivers/hwmon/peci/Makefile
> > index e8a0ada5ab1f..191cfa0227f3 100644
> > --- a/drivers/hwmon/peci/Makefile
> > +++ b/drivers/hwmon/peci/Makefile
> > @@ -1,5 +1,7 @@
> >  # SPDX-License-Identifier: GPL-2.0-only
> >  
> >  peci-cputemp-y := cputemp.o
> > +peci-dimmtemp-y := dimmtemp.o
> >  
> >  obj-$(CONFIG_SENSORS_PECI_CPUTEMP)     += peci-cputemp.o
> > +obj-$(CONFIG_SENSORS_PECI_DIMMTEMP)    += peci-dimmtemp.o
> > diff --git a/drivers/hwmon/peci/dimmtemp.c b/drivers/hwmon/peci/dimmtemp.c
> > new file mode 100644
> > index 000000000000..6264c29bb6c0
> > --- /dev/null
> > +++ b/drivers/hwmon/peci/dimmtemp.c
> > @@ -0,0 +1,614 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +// Copyright (c) 2018-2021 Intel Corporation
> > +
> > +#include <linux/auxiliary_bus.h>
> > +#include <linux/bitfield.h>
> > +#include <linux/bitops.h>
> > +#include <linux/hwmon.h>
> > +#include <linux/jiffies.h>
> > +#include <linux/module.h>
> > +#include <linux/peci.h>
> > +#include <linux/peci-cpu.h>
> > +#include <linux/units.h>
> > +#include <linux/workqueue.h>
> > +#include <linux/x86/intel-family.h>
> > +
> > +#include "common.h"
> > +
> > +#define DIMM_MASK_CHECK_DELAY_JIFFIES  msecs_to_jiffies(5000)
> > +
> > +/* Max number of channel ranks and DIMM index per channel */
> > +#define CHAN_RANK_MAX_ON_HSX   8
> > +#define DIMM_IDX_MAX_ON_HSX    3
> > +#define CHAN_RANK_MAX_ON_BDX   4
> > +#define DIMM_IDX_MAX_ON_BDX    3
> > +#define CHAN_RANK_MAX_ON_BDXD  2
> > +#define DIMM_IDX_MAX_ON_BDXD   2
> > +#define CHAN_RANK_MAX_ON_SKX   6
> > +#define DIMM_IDX_MAX_ON_SKX    2
> > +#define CHAN_RANK_MAX_ON_ICX   8
> > +#define DIMM_IDX_MAX_ON_ICX    2
> > +#define CHAN_RANK_MAX_ON_ICXD  4
> > +#define DIMM_IDX_MAX_ON_ICXD   2
> > +
> > +#define CHAN_RANK_MAX          CHAN_RANK_MAX_ON_HSX
> > +#define DIMM_IDX_MAX           DIMM_IDX_MAX_ON_HSX
> > +#define DIMM_NUMS_MAX          (CHAN_RANK_MAX * DIMM_IDX_MAX)
> > +
> > +#define CPU_SEG_MASK           GENMASK(23, 16)
> > +#define GET_CPU_SEG(x)         (((x) & CPU_SEG_MASK) >> 16)
> > +#define CPU_BUS_MASK           GENMASK(7, 0)
> > +#define GET_CPU_BUS(x)         ((x) & CPU_BUS_MASK)
> > +
> > +#define DIMM_TEMP_MAX          GENMASK(15, 8)
> > +#define DIMM_TEMP_CRIT         GENMASK(23, 16)
> > +#define GET_TEMP_MAX(x)                (((x) & DIMM_TEMP_MAX) >> 8)
> > +#define GET_TEMP_CRIT(x)       (((x) & DIMM_TEMP_CRIT) >> 16)
> > +
> > +struct peci_dimmtemp;
> > +
> > +struct dimm_info {
> > +       int chan_rank_max;
> > +       int dimm_idx_max;
> > +       u8 min_peci_revision;
> > +       int (*read_thresholds)(struct peci_dimmtemp *priv, int dimm_order,
> > +                              int chan_rank, u32 *data);
> > +};
> > +
> > +struct peci_dimm_thresholds {
> > +       long temp_max;
> > +       long temp_crit;
> > +       struct peci_sensor_state state;
> > +};
> > +
> > +enum peci_dimm_threshold_type {
> > +       temp_max_type,
> > +       temp_crit_type,
> > +};
> > +
> > +struct peci_dimmtemp {
> > +       struct peci_device *peci_dev;
> > +       struct device *dev;
> > +       const char *name;
> > +       const struct dimm_info *gen_info;
> > +       struct delayed_work detect_work;
> > +       struct {
> > +               struct peci_sensor_data temp;
> > +               struct peci_dimm_thresholds thresholds;
> > +       } dimm[DIMM_NUMS_MAX];
> > +       char **dimmtemp_label;
> > +       DECLARE_BITMAP(dimm_mask, DIMM_NUMS_MAX);
> > +};
> > +
> > +static u8 __dimm_temp(u32 reg, int dimm_order)
> > +{
> > +       return (reg >> (dimm_order * 8)) & 0xff;
> > +}
> > +
> > +static int get_dimm_temp(struct peci_dimmtemp *priv, int dimm_no, long
> > *val)
> > +{
> > +       int dimm_order = dimm_no % priv->gen_info->dimm_idx_max;
> > +       int chan_rank = dimm_no / priv->gen_info->dimm_idx_max;
> > +       u32 data;
> > +       int ret;
> 
>         int ret = 0;
> 
> > +
> > +       mutex_lock(&priv->dimm[dimm_no].temp.state.lock);
> > +       if (!peci_sensor_need_update(&priv->dimm[dimm_no].temp.state))
> > +               goto skip_update;
> > +
> > +       ret = peci_pcs_read(priv->peci_dev, PECI_PCS_DDR_DIMM_TEMP,
> > chan_rank, &data);
> > +       if (ret) {
> > +               mutex_unlock(&priv->dimm[dimm_no].temp.state.lock);
> > +               return ret;
> > +       }
> 
>         if (ret)
>                 goto unlock;
> 
> > +
> > +       priv->dimm[dimm_no].temp.value = __dimm_temp(data, dimm_order) *
> > MILLIDEGREE_PER_DEGREE;
> > +
> > +       peci_sensor_mark_updated(&priv->dimm[dimm_no].temp.state);
> > +
> > +skip_update:
> > +       *val = priv->dimm[dimm_no].temp.value;
> 
> unlock:
> > +       mutex_unlock(&priv->dimm[dimm_no].temp.state.lock);
> > +       return 0;
> 
>         return ret;

Ack.
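
For reference, the single-exit shape suggested above can be tried outside
the kernel. What follows is only a minimal user-space sketch, not the driver
code: struct toy_sensor, toy_lock_acquire(), toy_lock_release() and
toy_read() are hypothetical stand-ins for the sensor state, the mutex and
peci_pcs_read(). It illustrates that with ret initialized to 0 and a single
unlock label, the lock is released exactly once on every path and a read
error is propagated without touching *val.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel mutex: counts how often it is held. */
struct toy_lock {
	int depth;
};

static void toy_lock_acquire(struct toy_lock *l)
{
	l->depth++;
}

static void toy_lock_release(struct toy_lock *l)
{
	assert(l->depth > 0);	/* releasing an unheld lock is a bug */
	l->depth--;
}

/* Hypothetical stand-in for the cached sensor state. */
struct toy_sensor {
	struct toy_lock lock;
	bool need_update;
	unsigned int raw;	/* value the fake read returns, in degrees */
	int read_err;		/* error the fake read returns, 0 on success */
	long value;		/* cached value, in millidegrees */
};

/* Hypothetical stand-in for peci_pcs_read(). */
static int toy_read(const struct toy_sensor *s, unsigned int *data)
{
	if (s->read_err)
		return s->read_err;
	*data = s->raw;
	return 0;
}

static int toy_get_temp(struct toy_sensor *s, long *val)
{
	unsigned int data;
	int ret = 0;

	toy_lock_acquire(&s->lock);
	if (!s->need_update)
		goto skip_update;

	ret = toy_read(s, &data);
	if (ret)
		goto unlock;	/* error: skip the cache and *val updates */

	s->value = (long)data * 1000;
	s->need_update = false;

skip_update:
	*val = s->value;
unlock:
	toy_lock_release(&s->lock);
	return ret;
}
```

Both the cached path and the error path fall through the same unlock label,
so there is no second mutex_unlock() call to keep in sync.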

> 
> > +}
> > +
> > +static int update_thresholds(struct peci_dimmtemp *priv, int dimm_no)
> > +{
> > +       int dimm_order = dimm_no % priv->gen_info->dimm_idx_max;
> > +       int chan_rank = dimm_no / priv->gen_info->dimm_idx_max;
> > +       u32 data;
> > +       int ret;
> > +
> > +       if (!peci_sensor_need_update(&priv->dimm[dimm_no].thresholds.state))
> > +               return 0;
> > +
> > +       ret = priv->gen_info->read_thresholds(priv, dimm_order, chan_rank,
> > &data);
> > +       if (ret == -ENODATA) /* Use default or previous value */
> > +               return 0;
> > +       if (ret)
> > +               return ret;
> > +
> > +       priv->dimm[dimm_no].thresholds.temp_max = GET_TEMP_MAX(data) *
> > MILLIDEGREE_PER_DEGREE;
> > +       priv->dimm[dimm_no].thresholds.temp_crit = GET_TEMP_CRIT(data) *
> > MILLIDEGREE_PER_DEGREE;
> > +
> > +       peci_sensor_mark_updated(&priv->dimm[dimm_no].thresholds.state);
> > +
> > +       return 0;
> > +}
> > +
> > +static int get_dimm_thresholds(struct peci_dimmtemp *priv, enum
> > peci_dimm_threshold_type type,
> > +                              int dimm_no, long *val)
> > +{
> > +       int ret;
> > +
> > +       mutex_lock(&priv->dimm[dimm_no].thresholds.state.lock);
> > +       ret = update_thresholds(priv, dimm_no);
> > +       if (ret)
> > +               goto unlock;
> > +
> > +       switch (type) {
> > +       case temp_max_type:
> > +               *val = priv->dimm[dimm_no].thresholds.temp_max;
> > +               break;
> > +       case temp_crit_type:
> > +               *val = priv->dimm[dimm_no].thresholds.temp_crit;
> > +               break;
> > +       default:
> > +               ret = -EOPNOTSUPP;
> > +               break;
> > +       }
> > +unlock:
> > +       mutex_unlock(&priv->dimm[dimm_no].thresholds.state.lock);
> > +
> > +       return ret;
> > +}
> > +
> > +static int dimmtemp_read_string(struct device *dev,
> > +                               enum hwmon_sensor_types type,
> > +                               u32 attr, int channel, const char **str)
> > +{
> > +       struct peci_dimmtemp *priv = dev_get_drvdata(dev);
> > +
> > +       if (attr != hwmon_temp_label)
> > +               return -EOPNOTSUPP;
> > +
> > +       *str = (const char *)priv->dimmtemp_label[channel];
> > +
> > +       return 0;
> > +}
> > +
> > +static int dimmtemp_read(struct device *dev, enum hwmon_sensor_types type,
> > +                        u32 attr, int channel, long *val)
> > +{
> > +       struct peci_dimmtemp *priv = dev_get_drvdata(dev);
> > +
> > +       switch (attr) {
> > +       case hwmon_temp_input:
> > +               return get_dimm_temp(priv, channel, val);
> > +       case hwmon_temp_max:
> > +               return get_dimm_thresholds(priv, temp_max_type, channel,
> > val);
> > +       case hwmon_temp_crit:
> > +               return get_dimm_thresholds(priv, temp_crit_type, channel,
> > val);
> > +       default:
> > +               break;
> > +       }
> > +
> > +       return -EOPNOTSUPP;
> > +}
> > +
> > +static umode_t dimmtemp_is_visible(const void *data, enum
> > hwmon_sensor_types type,
> > +                                  u32 attr, int channel)
> > +{
> > +       const struct peci_dimmtemp *priv = data;
> > +
> > +       if (test_bit(channel, priv->dimm_mask))
> > +               return 0444;
> > +
> > +       return 0;
> > +}
> > +
> > +static const struct hwmon_ops peci_dimmtemp_ops = {
> > +       .is_visible = dimmtemp_is_visible,
> > +       .read_string = dimmtemp_read_string,
> > +       .read = dimmtemp_read,
> > +};
> > +
> > +static int check_populated_dimms(struct peci_dimmtemp *priv)
> > +{
> > +       int chan_rank_max = priv->gen_info->chan_rank_max;
> > +       int dimm_idx_max = priv->gen_info->dimm_idx_max;
> > +       u32 chan_rank_empty = 0;
> > +       u64 dimm_mask = 0;
> > +       int chan_rank, dimm_idx, ret;
> > +       u32 pcs;
> > +
> > +       BUILD_BUG_ON(CHAN_RANK_MAX > 32);
> > +       BUILD_BUG_ON(DIMM_NUMS_MAX > 64);
> 
> I don't immediately see the value of those build bugs. What happens if
> CHAN_RANK_MAX > 32 or DIMM_NUMS_MAX > 64 ? Where do those limits come
> from ?

Supported HW doesn't come near the limits for now. They are just "artificial"
limits imposed by the types of the variables we're using (u64 for dimm_mask
and u32 for chan_rank_empty).
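
To make the origin of those limits concrete, here is a small user-space
sketch (an illustration under assumptions, not the driver code).
static_assert stands in for BUILD_BUG_ON, the fixed-width types stand in for
u32/u64, and toy_dimm_bit() is a hypothetical helper that uses the
compile-time DIMM_IDX_MAX where the driver uses the per-generation
dimm_idx_max.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Values taken from the patch (HSX is the largest supported layout). */
#define CHAN_RANK_MAX	8
#define DIMM_IDX_MAX	3
#define DIMM_NUMS_MAX	(CHAN_RANK_MAX * DIMM_IDX_MAX)

/* One bit per channel rank has to fit in the 32-bit chan_rank_empty. */
static_assert(CHAN_RANK_MAX <= sizeof(uint32_t) * CHAR_BIT,
	      "chan_rank_empty is too narrow for CHAN_RANK_MAX");

/* One bit per DIMM has to fit in the 64-bit dimm_mask. */
static_assert(DIMM_NUMS_MAX <= sizeof(uint64_t) * CHAR_BIT,
	      "dimm_mask is too narrow for DIMM_NUMS_MAX");

/* Hypothetical helper: bit position of a DIMM inside the 64-bit mask. */
static uint64_t toy_dimm_bit(int chan_rank, int dimm_idx)
{
	/* 64-bit shift so that positions above 31 stay representable. */
	return UINT64_C(1) << (chan_rank * DIMM_IDX_MAX + dimm_idx);
}
```

With the current values the highest bit in use is 23, which is why the
checks only matter if a future platform grows the maximums.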

> 
> > +       if (chan_rank_max * dimm_idx_max > DIMM_NUMS_MAX) {
> > +               WARN_ONCE(1, "Unsupported number of DIMMs");
> 
> Maybe display the values (chan_rank_max and dimm_idx_max).

Ok.
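
Something along these lines, shown here as a user-space sketch rather than
the final kernel change: check_dimm_limits() and its message buffer are
hypothetical, and snprintf() stands in for WARN_ONCE(), which accepts a
printf-style format in the same way. The exact message wording is an
assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

#define DIMM_NUMS_MAX	24	/* CHAN_RANK_MAX (8) * DIMM_IDX_MAX (3) in the patch */

/*
 * Hypothetical helper: validate the per-generation limits and, when they
 * exceed what the driver can track, report both offending values in msg.
 */
static int check_dimm_limits(int chan_rank_max, int dimm_idx_max,
			     char *msg, size_t len)
{
	assert(msg && len > 0);

	if (chan_rank_max * dimm_idx_max > DIMM_NUMS_MAX) {
		snprintf(msg, len,
			 "Unsupported number of DIMMs: chan_rank_max=%d, dimm_idx_max=%d",
			 chan_rank_max, dimm_idx_max);
		return -EINVAL;
	}

	msg[0] = '\0';	/* nothing to report */
	return 0;
}
```

Printing both factors makes it clear from the log alone which generation
table entry is out of range.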

> 
> > +               return -EINVAL;
> > +       }
> > +
> > +       for (chan_rank = 0; chan_rank < chan_rank_max; chan_rank++) {
> > +               ret = peci_pcs_read(priv->peci_dev, PECI_PCS_DDR_DIMM_TEMP,
> > chan_rank, &pcs);
> > +               if (ret) {
> > +                       /*
> > +                        * Overall, we expect either success or -EINVAL in
> > +                        * order to determine whether DIMM is populated or not.
> > +                        * For anything else - we fall back to deferring the
> 
> Why " - " ?

Hmm... Reading it again now, I have no idea.
I'll drop it.

Thank you
-Iwona

> 
> > +                        * detection to be performed at a later point in
> > time.
> > +                        */
> > +                       if (ret == -EINVAL) {
> > +                               chan_rank_empty |= BIT(chan_rank);
> > +                               continue;
> > +                       }
> > +
> > +                       return -EAGAIN;
> > +               }
> > +
> > +               for (dimm_idx = 0; dimm_idx < dimm_idx_max; dimm_idx++)
> > +                       if (__dimm_temp(pcs, dimm_idx))
> > +                               dimm_mask |= BIT(chan_rank * dimm_idx_max +
> > dimm_idx);
> > +       }
> > +
> > +       /* If we got all -EINVALs, it means that the CPU doesn't have any
> > DIMMs. */
> > +       if (chan_rank_empty == GENMASK(chan_rank_max - 1, 0))
> > +               return -ENODEV;
> > +
> > +       /*
> > +        * It's possible that memory training is not done yet. In this case
> > we
> > +        * defer the detection to be performed at a later point in time.
> > +        */
> > +       if (!dimm_mask)
> > +               return -EAGAIN;
> > +
> > +       dev_dbg(priv->dev, "Scanned populated DIMMs: %#llx\n", dimm_mask);
> > +
> > +       bitmap_from_u64(priv->dimm_mask, dimm_mask);
> > +
> > +       return 0;
> > +}
> > +
> > +static int create_dimm_temp_label(struct peci_dimmtemp *priv, int chan)
> > +{
> > +       int rank = chan / priv->gen_info->dimm_idx_max;
> > +       int idx = chan % priv->gen_info->dimm_idx_max;
> > +
> > +       priv->dimmtemp_label[chan] = devm_kasprintf(priv->dev, GFP_KERNEL,
> > +                                                   "DIMM %c%d", 'A' + rank,
> > +                                                   idx + 1);
> > +       if (!priv->dimmtemp_label[chan])
> > +               return -ENOMEM;
> > +
> > +       return 0;
> > +}
> > +
> > +static const u32 peci_dimmtemp_temp_channel_config[] = {
> > +       [0 ... DIMM_NUMS_MAX - 1] = HWMON_T_LABEL | HWMON_T_INPUT |
> > HWMON_T_MAX | HWMON_T_CRIT,
> > +       0
> > +};
> > +
> > +static const struct hwmon_channel_info peci_dimmtemp_temp_channel = {
> > +       .type = hwmon_temp,
> > +       .config = peci_dimmtemp_temp_channel_config,
> > +};
> > +
> > +static const struct hwmon_channel_info *peci_dimmtemp_temp_info[] = {
> > +       &peci_dimmtemp_temp_channel,
> > +       NULL
> > +};
> > +
> > +static const struct hwmon_chip_info peci_dimmtemp_chip_info = {
> > +       .ops = &peci_dimmtemp_ops,
> > +       .info = peci_dimmtemp_temp_info,
> > +};
> > +
> > +static int create_dimm_temp_info(struct peci_dimmtemp *priv)
> > +{
> > +       int ret, i, channels;
> > +       struct device *dev;
> > +
> > +       /*
> > +        * We expect to either find populated DIMMs and carry on with creating
> > +        * sensors, or find out that there are no DIMMs populated.
> > +        * All other states mean that the platform never reached the state that
> > +        * allows us to check DIMM state - causing us to retry later on.
> > +        */
> > +       ret = check_populated_dimms(priv);
> > +       if (ret == -ENODEV) {
> > +               dev_dbg(priv->dev, "No DIMMs found\n");
> > +               return 0;
> > +       } else if (ret) {
> > +               schedule_delayed_work(&priv->detect_work,
> > DIMM_MASK_CHECK_DELAY_JIFFIES);
> > +               dev_dbg(priv->dev, "Deferred populating DIMM temp info\n");
> > +               return ret;
> > +       }
> > +
> > +       channels = priv->gen_info->chan_rank_max * priv->gen_info->dimm_idx_max;
> > +
> > +       priv->dimmtemp_label = devm_kzalloc(priv->dev, channels *
> > sizeof(char *), GFP_KERNEL);
> > +       if (!priv->dimmtemp_label)
> > +               return -ENOMEM;
> > +
> > +       for_each_set_bit(i, priv->dimm_mask, DIMM_NUMS_MAX) {
> > +               ret = create_dimm_temp_label(priv, i);
> > +               if (ret)
> > +                       return ret;
> > +               mutex_init(&priv->dimm[i].thresholds.state.lock);
> > +               mutex_init(&priv->dimm[i].temp.state.lock);
> > +       }
> > +
> > +       dev = devm_hwmon_device_register_with_info(priv->dev, priv->name,
> > priv,
> > +                                                  &peci_dimmtemp_chip_info,
> > NULL);
> > +       if (IS_ERR(dev)) {
> > +               dev_err(priv->dev, "Failed to register hwmon device\n");
> > +               return PTR_ERR(dev);
> > +       }
> > +
> > +       dev_dbg(priv->dev, "%s: sensor '%s'\n", dev_name(dev), priv->name);
> > +
> > +       return 0;
> > +}
> > +
> > +static void create_dimm_temp_info_delayed(struct work_struct *work)
> > +{
> > +       struct peci_dimmtemp *priv = container_of(to_delayed_work(work),
> > +                                                 struct peci_dimmtemp,
> > +                                                 detect_work);
> > +       int ret;
> > +
> > +       ret = create_dimm_temp_info(priv);
> > +       if (ret && ret != -EAGAIN)
> > +               dev_err(priv->dev, "Failed to populate DIMM temp info\n");
> > +}
> > +
> > +static void remove_delayed_work(void *_priv)
> > +{
> > +       struct peci_dimmtemp *priv = _priv;
> > +
> > +       cancel_delayed_work_sync(&priv->detect_work);
> > +}
> > +
> > +static int peci_dimmtemp_probe(struct auxiliary_device *adev, const struct
> > auxiliary_device_id *id)
> > +{
> > +       struct device *dev = &adev->dev;
> > +       struct peci_device *peci_dev = to_peci_device(dev->parent);
> > +       struct peci_dimmtemp *priv;
> > +       int ret;
> > +
> > +       priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
> > +       if (!priv)
> > +               return -ENOMEM;
> > +
> > +       priv->name = devm_kasprintf(dev, GFP_KERNEL, "peci_dimmtemp.cpu%d",
> > +                                   peci_dev->info.socket_id);
> > +       if (!priv->name)
> > +               return -ENOMEM;
> > +
> > +       priv->dev = dev;
> > +       priv->peci_dev = peci_dev;
> > +       priv->gen_info = (const struct dimm_info *)id->driver_data;
> > +
> > +       /*
> > +        * This is just a sanity check. Since we're using commands that are
> > +        * guaranteed to be supported on a given platform, we should never
> > see
> > +        * revision lower than expected.
> > +        */
> > +       if (peci_dev->info.peci_revision < priv->gen_info->min_peci_revision)
> > +               dev_warn(priv->dev,
> > +                        "Unexpected PECI revision %#x, some features may be unavailable\n",
> > +                        peci_dev->info.peci_revision);
> > +
> > +       INIT_DELAYED_WORK(&priv->detect_work,
> > create_dimm_temp_info_delayed);
> > +
> > +       ret = devm_add_action_or_reset(priv->dev, remove_delayed_work,
> > priv);
> > +       if (ret)
> > +               return ret;
> > +
> > +       ret = create_dimm_temp_info(priv);
> > +       if (ret && ret != -EAGAIN) {
> > +               dev_err(dev, "Failed to populate DIMM temp info\n");
> > +               return ret;
> > +       }
> > +
> > +       return 0;
> > +}
> > +
> > +static int
> > +read_thresholds_hsx(struct peci_dimmtemp *priv, int dimm_order, int
> > chan_rank, u32 *data)
> > +{
> > +       u8 dev, func;
> > +       u16 reg;
> > +       int ret;
> > +
> > +       /*
> > +        * Device 20, Function 0: IMC 0 channel 0 -> rank 0
> > +        * Device 20, Function 1: IMC 0 channel 1 -> rank 1
> > +        * Device 21, Function 0: IMC 0 channel 2 -> rank 2
> > +        * Device 21, Function 1: IMC 0 channel 3 -> rank 3
> > +        * Device 23, Function 0: IMC 1 channel 0 -> rank 4
> > +        * Device 23, Function 1: IMC 1 channel 1 -> rank 5
> > +        * Device 24, Function 0: IMC 1 channel 2 -> rank 6
> > +        * Device 24, Function 1: IMC 1 channel 3 -> rank 7
> > +        */
> > +       dev = 20 + chan_rank / 2 + chan_rank / 4;
> > +       func = chan_rank % 2;
> > +       reg = 0x120 + dimm_order * 4;
> > +
> > +       ret = peci_pci_local_read(priv->peci_dev, 1, dev, func, reg, data);
> > +       if (ret)
> > +               return ret;
> > +
> > +       return 0;
> > +}
> > +
> > +static int
> > +read_thresholds_bdxd(struct peci_dimmtemp *priv, int dimm_order, int
> > chan_rank, u32 *data)
> > +{
> > +       u8 dev, func;
> > +       u16 reg;
> > +       int ret;
> > +
> > +       /*
> > +        * Device 10, Function 2: IMC 0 channel 0 -> rank 0
> > +        * Device 10, Function 6: IMC 0 channel 1 -> rank 1
> > +        * Device 12, Function 2: IMC 1 channel 0 -> rank 2
> > +        * Device 12, Function 6: IMC 1 channel 1 -> rank 3
> > +        */
> > +       dev = 10 + chan_rank / 2 * 2;
> > +       func = (chan_rank % 2) ? 6 : 2;
> > +       reg = 0x120 + dimm_order * 4;
> > +
> > +       ret = peci_pci_local_read(priv->peci_dev, 2, dev, func, reg, data);
> > +       if (ret)
> > +               return ret;
> > +
> > +       return 0;
> > +}
> > +
> > +static int
> > +read_thresholds_skx(struct peci_dimmtemp *priv, int dimm_order, int
> > chan_rank, u32 *data)
> > +{
> > +       u8 dev, func;
> > +       u16 reg;
> > +       int ret;
> > +
> > +       /*
> > +        * Device 10, Function 2: IMC 0 channel 0 -> rank 0
> > +        * Device 10, Function 6: IMC 0 channel 1 -> rank 1
> > +        * Device 11, Function 2: IMC 0 channel 2 -> rank 2
> > +        * Device 12, Function 2: IMC 1 channel 0 -> rank 3
> > +        * Device 12, Function 6: IMC 1 channel 1 -> rank 4
> > +        * Device 13, Function 2: IMC 1 channel 2 -> rank 5
> > +        */
> > +       dev = 10 + chan_rank / 3 * 2 + (chan_rank % 3 == 2 ? 1 : 0);
> > +       func = chan_rank % 3 == 1 ? 6 : 2;
> > +       reg = 0x120 + dimm_order * 4;
> > +
> > +       ret = peci_pci_local_read(priv->peci_dev, 2, dev, func, reg, data);
> > +       if (ret)
> > +               return ret;
> > +
> > +       return 0;
> > +}
> > +
> > +static int
> > +read_thresholds_icx(struct peci_dimmtemp *priv, int dimm_order, int
> > chan_rank, u32 *data)
> > +{
> > +       u32 reg_val;
> > +       u64 offset;
> > +       int ret;
> > +       u8 dev;
> > +
> > +       ret = peci_ep_pci_local_read(priv->peci_dev, 0, 13, 0, 2, 0xd4,
> > &reg_val);
> > +       if (ret || !(reg_val & BIT(31)))
> > +               return -ENODATA; /* Use default or previous value */
> > +
> > +       ret = peci_ep_pci_local_read(priv->peci_dev, 0, 13, 0, 2, 0xd0,
> > &reg_val);
> > +       if (ret)
> > +               return -ENODATA; /* Use default or previous value */
> > +
> > +       /*
> > +        * Device 26, Offset 224e0: IMC 0 channel 0 -> rank 0
> > +        * Device 26, Offset 264e0: IMC 0 channel 1 -> rank 1
> > +        * Device 27, Offset 224e0: IMC 1 channel 0 -> rank 2
> > +        * Device 27, Offset 264e0: IMC 1 channel 1 -> rank 3
> > +        * Device 28, Offset 224e0: IMC 2 channel 0 -> rank 4
> > +        * Device 28, Offset 264e0: IMC 2 channel 1 -> rank 5
> > +        * Device 29, Offset 224e0: IMC 3 channel 0 -> rank 6
> > +        * Device 29, Offset 264e0: IMC 3 channel 1 -> rank 7
> > +        */
> > +       dev = 26 + chan_rank / 2;
> > +       offset = 0x224e0 + dimm_order * 4 + (chan_rank % 2) * 0x4000;
> > +
> > +       ret = peci_mmio_read(priv->peci_dev, 0, GET_CPU_SEG(reg_val),
> > GET_CPU_BUS(reg_val),
> > +                            dev, 0, offset, data);
> > +       if (ret)
> > +               return ret;
> > +
> > +       return 0;
> > +}
> > +
> > +static const struct dimm_info dimm_hsx = {
> > +       .chan_rank_max  = CHAN_RANK_MAX_ON_HSX,
> > +       .dimm_idx_max   = DIMM_IDX_MAX_ON_HSX,
> > +       .min_peci_revision = 0x33,
> > +       .read_thresholds = &read_thresholds_hsx,
> > +};
> > +
> > +static const struct dimm_info dimm_bdx = {
> > +       .chan_rank_max  = CHAN_RANK_MAX_ON_BDX,
> > +       .dimm_idx_max   = DIMM_IDX_MAX_ON_BDX,
> > +       .min_peci_revision = 0x33,
> > +       .read_thresholds = &read_thresholds_hsx,
> > +};
> > +
> > +static const struct dimm_info dimm_bdxd = {
> > +       .chan_rank_max  = CHAN_RANK_MAX_ON_BDXD,
> > +       .dimm_idx_max   = DIMM_IDX_MAX_ON_BDXD,
> > +       .min_peci_revision = 0x33,
> > +       .read_thresholds = &read_thresholds_bdxd,
> > +};
> > +
> > +static const struct dimm_info dimm_skx = {
> > +       .chan_rank_max  = CHAN_RANK_MAX_ON_SKX,
> > +       .dimm_idx_max   = DIMM_IDX_MAX_ON_SKX,
> > +       .min_peci_revision = 0x33,
> > +       .read_thresholds = &read_thresholds_skx,
> > +};
> > +
> > +static const struct dimm_info dimm_icx = {
> > +       .chan_rank_max  = CHAN_RANK_MAX_ON_ICX,
> > +       .dimm_idx_max   = DIMM_IDX_MAX_ON_ICX,
> > +       .min_peci_revision = 0x40,
> > +       .read_thresholds = &read_thresholds_icx,
> > +};
> > +
> > +static const struct dimm_info dimm_icxd = {
> > +       .chan_rank_max  = CHAN_RANK_MAX_ON_ICXD,
> > +       .dimm_idx_max   = DIMM_IDX_MAX_ON_ICXD,
> > +       .min_peci_revision = 0x40,
> > +       .read_thresholds = &read_thresholds_icx,
> > +};
> > +
> > +static const struct auxiliary_device_id peci_dimmtemp_ids[] = {
> > +       {
> > +               .name = "peci_cpu.dimmtemp.hsx",
> > +               .driver_data = (kernel_ulong_t)&dimm_hsx,
> > +       },
> > +       {
> > +               .name = "peci_cpu.dimmtemp.bdx",
> > +               .driver_data = (kernel_ulong_t)&dimm_bdx,
> > +       },
> > +       {
> > +               .name = "peci_cpu.dimmtemp.bdxd",
> > +               .driver_data = (kernel_ulong_t)&dimm_bdxd,
> > +       },
> > +       {
> > +               .name = "peci_cpu.dimmtemp.skx",
> > +               .driver_data = (kernel_ulong_t)&dimm_skx,
> > +       },
> > +       {
> > +               .name = "peci_cpu.dimmtemp.icx",
> > +               .driver_data = (kernel_ulong_t)&dimm_icx,
> > +       },
> > +       {
> > +               .name = "peci_cpu.dimmtemp.icxd",
> > +               .driver_data = (kernel_ulong_t)&dimm_icxd,
> > +       },
> > +       { }
> > +};
> > +MODULE_DEVICE_TABLE(auxiliary, peci_dimmtemp_ids);
> > +
> > +static struct auxiliary_driver peci_dimmtemp_driver = {
> > +       .probe          = peci_dimmtemp_probe,
> > +       .id_table       = peci_dimmtemp_ids,
> > +};
> > +
> > +module_auxiliary_driver(peci_dimmtemp_driver);
> > +
> > +MODULE_AUTHOR("Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>");
> > +MODULE_AUTHOR("Iwona Winiarska <iwona.winiarska@intel.com>");
> > +MODULE_DESCRIPTION("PECI dimmtemp driver");
> > +MODULE_LICENSE("GPL");
> > +MODULE_IMPORT_NS(PECI_CPU);


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 13/15] hwmon: peci: Add dimmtemp driver
  2021-08-04 10:46     ` Winiarska, Iwona
@ 2021-08-04 17:33       ` Guenter Roeck
  2021-08-05 21:48         ` Winiarska, Iwona
  0 siblings, 1 reply; 49+ messages in thread
From: Guenter Roeck @ 2021-08-04 17:33 UTC (permalink / raw)
  To: Winiarska, Iwona
  Cc: corbet, jae.hyun.yoo, Williams, Dan J, linux-hwmon, andrew, Luck,
	Tony, Lutomirski, Andy, andriy.shevchenko, mchehab, jdelvare,
	linux-kernel, olof, mingo, rdunlap, devicetree, tglx,
	linux-aspeed, arnd, linux-doc, yazen.ghannam, zweiss, robh+dt,
	openbmc, gregkh, joel, d.mueller, linux-arm-kernel,
	pierre-louis.bossart, x86, bp

On 8/4/21 3:46 AM, Winiarska, Iwona wrote:
> On Tue, 2021-08-03 at 08:39 -0700, Guenter Roeck wrote:
>> On Tue, Aug 03, 2021 at 01:31:32PM +0200, Iwona Winiarska wrote:
>>> Add peci-dimmtemp driver for Temperature Sensor on DIMM readings that
>>> are accessible via the processor PECI interface.
>>>
>>> The main use case for the driver (and PECI interface) is out-of-band
>>> management, where we're able to obtain thermal readings from an external
>>> entity connected with PECI, e.g. BMC on server platforms.
>>>
>>> Co-developed-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
>>> Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
>>> Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com>
>>> Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
>>> ---
>>> Note that the timeout was completely removed - we're going to probe
>>> for detected DIMMs every 5 seconds until we reach "stable" state of
>>> either getting correct DIMM data or getting all -EINVAL (which
>>> suggests that the CPU doesn't have any DIMMs).
>>>
>>>   drivers/hwmon/peci/Kconfig    |  13 +
>>>   drivers/hwmon/peci/Makefile   |   2 +
>>>   drivers/hwmon/peci/dimmtemp.c | 614 ++++++++++++++++++++++++++++++++++
>>>   3 files changed, 629 insertions(+)
>>>   create mode 100644 drivers/hwmon/peci/dimmtemp.c
>>>
>>> diff --git a/drivers/hwmon/peci/Kconfig b/drivers/hwmon/peci/Kconfig
>>> index e10eed68d70a..9d32a57badfe 100644
>>> --- a/drivers/hwmon/peci/Kconfig
>>> +++ b/drivers/hwmon/peci/Kconfig
>>> @@ -14,5 +14,18 @@ config SENSORS_PECI_CPUTEMP
>>>            This driver can also be built as a module. If so, the module
>>>            will be called peci-cputemp.
>>>   
>>> +config SENSORS_PECI_DIMMTEMP
>>> +       tristate "PECI DIMM temperature monitoring client"
>>> +       depends on PECI
>>> +       select SENSORS_PECI
>>> +       select PECI_CPU
>>> +       help
>>> +         If you say yes here you get support for the generic Intel PECI hwmon
>>> +         driver which provides Temperature Sensor on DIMM readings that are
>>> +         accessible via the processor PECI interface.
>>> +
>>> +         This driver can also be built as a module. If so, the module
>>> +         will be called peci-dimmtemp.
>>> +
>>>   config SENSORS_PECI
>>>          tristate
>>> diff --git a/drivers/hwmon/peci/Makefile b/drivers/hwmon/peci/Makefile
>>> index e8a0ada5ab1f..191cfa0227f3 100644
>>> --- a/drivers/hwmon/peci/Makefile
>>> +++ b/drivers/hwmon/peci/Makefile
>>> @@ -1,5 +1,7 @@
>>>   # SPDX-License-Identifier: GPL-2.0-only
>>>   
>>>   peci-cputemp-y := cputemp.o
>>> +peci-dimmtemp-y := dimmtemp.o
>>>   
>>>   obj-$(CONFIG_SENSORS_PECI_CPUTEMP)     += peci-cputemp.o
>>> +obj-$(CONFIG_SENSORS_PECI_DIMMTEMP)    += peci-dimmtemp.o
>>> diff --git a/drivers/hwmon/peci/dimmtemp.c b/drivers/hwmon/peci/dimmtemp.c
>>> new file mode 100644
>>> index 000000000000..6264c29bb6c0
>>> --- /dev/null
>>> +++ b/drivers/hwmon/peci/dimmtemp.c
>>> @@ -0,0 +1,614 @@
>>> +// SPDX-License-Identifier: GPL-2.0-only
>>> +// Copyright (c) 2018-2021 Intel Corporation
>>> +
>>> +#include <linux/auxiliary_bus.h>
>>> +#include <linux/bitfield.h>
>>> +#include <linux/bitops.h>
>>> +#include <linux/hwmon.h>
>>> +#include <linux/jiffies.h>
>>> +#include <linux/module.h>
>>> +#include <linux/peci.h>
>>> +#include <linux/peci-cpu.h>
>>> +#include <linux/units.h>
>>> +#include <linux/workqueue.h>
>>> +#include <linux/x86/intel-family.h>
>>> +
>>> +#include "common.h"
>>> +
>>> +#define DIMM_MASK_CHECK_DELAY_JIFFIES  msecs_to_jiffies(5000)
>>> +
>>> +/* Max number of channel ranks and DIMM index per channel */
>>> +#define CHAN_RANK_MAX_ON_HSX   8
>>> +#define DIMM_IDX_MAX_ON_HSX    3
>>> +#define CHAN_RANK_MAX_ON_BDX   4
>>> +#define DIMM_IDX_MAX_ON_BDX    3
>>> +#define CHAN_RANK_MAX_ON_BDXD  2
>>> +#define DIMM_IDX_MAX_ON_BDXD   2
>>> +#define CHAN_RANK_MAX_ON_SKX   6
>>> +#define DIMM_IDX_MAX_ON_SKX    2
>>> +#define CHAN_RANK_MAX_ON_ICX   8
>>> +#define DIMM_IDX_MAX_ON_ICX    2
>>> +#define CHAN_RANK_MAX_ON_ICXD  4
>>> +#define DIMM_IDX_MAX_ON_ICXD   2
>>> +
>>> +#define CHAN_RANK_MAX          CHAN_RANK_MAX_ON_HSX
>>> +#define DIMM_IDX_MAX           DIMM_IDX_MAX_ON_HSX
>>> +#define DIMM_NUMS_MAX          (CHAN_RANK_MAX * DIMM_IDX_MAX)
>>> +
>>> +#define CPU_SEG_MASK           GENMASK(23, 16)
>>> +#define GET_CPU_SEG(x)         (((x) & CPU_SEG_MASK) >> 16)
>>> +#define CPU_BUS_MASK           GENMASK(7, 0)
>>> +#define GET_CPU_BUS(x)         ((x) & CPU_BUS_MASK)
>>> +
>>> +#define DIMM_TEMP_MAX          GENMASK(15, 8)
>>> +#define DIMM_TEMP_CRIT         GENMASK(23, 16)
>>> +#define GET_TEMP_MAX(x)                (((x) & DIMM_TEMP_MAX) >> 8)
>>> +#define GET_TEMP_CRIT(x)       (((x) & DIMM_TEMP_CRIT) >> 16)
>>> +
>>> +struct peci_dimmtemp;
>>> +
>>> +struct dimm_info {
>>> +       int chan_rank_max;
>>> +       int dimm_idx_max;
>>> +       u8 min_peci_revision;
>>> +       int (*read_thresholds)(struct peci_dimmtemp *priv, int dimm_order,
>>> +                              int chan_rank, u32 *data);
>>> +};
>>> +
>>> +struct peci_dimm_thresholds {
>>> +       long temp_max;
>>> +       long temp_crit;
>>> +       struct peci_sensor_state state;
>>> +};
>>> +
>>> +enum peci_dimm_threshold_type {
>>> +       temp_max_type,
>>> +       temp_crit_type,
>>> +};
>>> +
>>> +struct peci_dimmtemp {
>>> +       struct peci_device *peci_dev;
>>> +       struct device *dev;
>>> +       const char *name;
>>> +       const struct dimm_info *gen_info;
>>> +       struct delayed_work detect_work;
>>> +       struct {
>>> +               struct peci_sensor_data temp;
>>> +               struct peci_dimm_thresholds thresholds;
>>> +       } dimm[DIMM_NUMS_MAX];
>>> +       char **dimmtemp_label;
>>> +       DECLARE_BITMAP(dimm_mask, DIMM_NUMS_MAX);
>>> +};
>>> +
>>> +static u8 __dimm_temp(u32 reg, int dimm_order)
>>> +{
>>> +       return (reg >> (dimm_order * 8)) & 0xff;
>>> +}
>>> +
>>> +static int get_dimm_temp(struct peci_dimmtemp *priv, int dimm_no, long *val)
>>> +{
>>> +       int dimm_order = dimm_no % priv->gen_info->dimm_idx_max;
>>> +       int chan_rank = dimm_no / priv->gen_info->dimm_idx_max;
>>> +       u32 data;
>>> +       int ret;
>>
>>          int ret = 0;
>>
>>> +
>>> +       mutex_lock(&priv->dimm[dimm_no].temp.state.lock);
>>> +       if (!peci_sensor_need_update(&priv->dimm[dimm_no].temp.state))
>>> +               goto skip_update;
>>> +
>>> +       ret = peci_pcs_read(priv->peci_dev, PECI_PCS_DDR_DIMM_TEMP, chan_rank, &data);
>>> +       if (ret) {
>>> +               mutex_unlock(&priv->dimm[dimm_no].temp.state.lock);
>>> +               return ret;
>>> +       }
>>
>>          if (ret)
>>                  goto unlock;
>>
>>> +
>>> +       priv->dimm[dimm_no].temp.value = __dimm_temp(data, dimm_order) * MILLIDEGREE_PER_DEGREE;
>>> +
>>> +       peci_sensor_mark_updated(&priv->dimm[dimm_no].temp.state);
>>> +
>>> +skip_update:
>>> +       *val = priv->dimm[dimm_no].temp.value;
>>
>> unlock:
>>> +       mutex_unlock(&priv->dimm[dimm_no].temp.state.lock);
>>> +       return 0;
>>
>>          return ret;
> 
> Ack.
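
Taken together, the inline suggestions above amount to a single unlock path.
A standalone userspace sketch of that shape follows. It is not the driver
code: struct fake_mutex, struct sensor, read_ok() and read_fail() are
hypothetical stand-ins for the kernel mutex, the per-DIMM state and
peci_pcs_read(), and the "valid" flag is assumed to mirror
!peci_sensor_need_update().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct mutex; tracks whether it is held. */
struct fake_mutex { int held; };
static void fake_lock(struct fake_mutex *m)   { assert(m->held == 0); m->held = 1; }
static void fake_unlock(struct fake_mutex *m) { assert(m->held == 1); m->held = 0; }

/* Hypothetical stand-in for one cached DIMM reading. */
struct sensor {
	struct fake_mutex lock;
	bool valid;	/* stand-in for !peci_sensor_need_update() */
	long value;	/* millidegrees Celsius */
};

/* Hypothetical stand-ins for peci_pcs_read(): 0 on success, -errno on error. */
static int read_ok(uint32_t *data)   { *data = 0x002a1e; return 0; }
static int read_fail(uint32_t *data) { (void)data; return -EIO; }

static int get_temp(struct sensor *s, int (*read_reg)(uint32_t *),
		    int dimm_order, long *val)
{
	uint32_t data;
	int ret = 0;

	fake_lock(&s->lock);
	if (s->valid)
		goto skip_update;

	ret = read_reg(&data);
	if (ret)
		goto unlock;	/* error: *val stays untouched */

	/* One byte per DIMM in the register, converted to millidegrees. */
	s->value = (long)((data >> (dimm_order * 8)) & 0xff) * 1000;
	s->valid = true;

skip_update:
	*val = s->value;
unlock:
	fake_unlock(&s->lock);
	return ret;
}
```

With ret initialized to 0, the cached path and the freshly read path share
the same exit, and the error path no longer needs its own unlock.
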
> 
>>
>>> +}
>>> +
>>> +static int update_thresholds(struct peci_dimmtemp *priv, int dimm_no)
>>> +{
>>> +       int dimm_order = dimm_no % priv->gen_info->dimm_idx_max;
>>> +       int chan_rank = dimm_no / priv->gen_info->dimm_idx_max;
>>> +       u32 data;
>>> +       int ret;
>>> +
>>> +       if (!peci_sensor_need_update(&priv->dimm[dimm_no].thresholds.state))
>>> +               return 0;
>>> +
>>> +       ret = priv->gen_info->read_thresholds(priv, dimm_order, chan_rank, &data);
>>> +       if (ret == -ENODATA) /* Use default or previous value */
>>> +               return 0;
>>> +       if (ret)
>>> +               return ret;
>>> +
>>> +       priv->dimm[dimm_no].thresholds.temp_max = GET_TEMP_MAX(data) * MILLIDEGREE_PER_DEGREE;
>>> +       priv->dimm[dimm_no].thresholds.temp_crit = GET_TEMP_CRIT(data) * MILLIDEGREE_PER_DEGREE;
>>> +
>>> +       peci_sensor_mark_updated(&priv->dimm[dimm_no].thresholds.state);
>>> +
>>> +       return 0;
>>> +}
>>> +
>>> +static int get_dimm_thresholds(struct peci_dimmtemp *priv, enum peci_dimm_threshold_type type,
>>> +                              int dimm_no, long *val)
>>> +{
>>> +       int ret;
>>> +
>>> +       mutex_lock(&priv->dimm[dimm_no].thresholds.state.lock);
>>> +       ret = update_thresholds(priv, dimm_no);
>>> +       if (ret)
>>> +               goto unlock;
>>> +
>>> +       switch (type) {
>>> +       case temp_max_type:
>>> +               *val = priv->dimm[dimm_no].thresholds.temp_max;
>>> +               break;
>>> +       case temp_crit_type:
>>> +               *val = priv->dimm[dimm_no].thresholds.temp_crit;
>>> +               break;
>>> +       default:
>>> +               ret = -EOPNOTSUPP;
>>> +               break;
>>> +       }
>>> +unlock:
>>> +       mutex_unlock(&priv->dimm[dimm_no].thresholds.state.lock);
>>> +
>>> +       return ret;
>>> +}
>>> +
>>> +static int dimmtemp_read_string(struct device *dev,
>>> +                               enum hwmon_sensor_types type,
>>> +                               u32 attr, int channel, const char **str)
>>> +{
>>> +       struct peci_dimmtemp *priv = dev_get_drvdata(dev);
>>> +
>>> +       if (attr != hwmon_temp_label)
>>> +               return -EOPNOTSUPP;
>>> +
>>> +       *str = (const char *)priv->dimmtemp_label[channel];
>>> +
>>> +       return 0;
>>> +}
>>> +
>>> +static int dimmtemp_read(struct device *dev, enum hwmon_sensor_types type,
>>> +                        u32 attr, int channel, long *val)
>>> +{
>>> +       struct peci_dimmtemp *priv = dev_get_drvdata(dev);
>>> +
>>> +       switch (attr) {
>>> +       case hwmon_temp_input:
>>> +               return get_dimm_temp(priv, channel, val);
>>> +       case hwmon_temp_max:
>>> +               return get_dimm_thresholds(priv, temp_max_type, channel, val);
>>> +       case hwmon_temp_crit:
>>> +               return get_dimm_thresholds(priv, temp_crit_type, channel, val);
>>> +       default:
>>> +               break;
>>> +       }
>>> +
>>> +       return -EOPNOTSUPP;
>>> +}
>>> +
>>> +static umode_t dimmtemp_is_visible(const void *data, enum hwmon_sensor_types type,
>>> +                                  u32 attr, int channel)
>>> +{
>>> +       const struct peci_dimmtemp *priv = data;
>>> +
>>> +       if (test_bit(channel, priv->dimm_mask))
>>> +               return 0444;
>>> +
>>> +       return 0;
>>> +}
>>> +
>>> +static const struct hwmon_ops peci_dimmtemp_ops = {
>>> +       .is_visible = dimmtemp_is_visible,
>>> +       .read_string = dimmtemp_read_string,
>>> +       .read = dimmtemp_read,
>>> +};
>>> +
>>> +static int check_populated_dimms(struct peci_dimmtemp *priv)
>>> +{
>>> +       int chan_rank_max = priv->gen_info->chan_rank_max;
>>> +       int dimm_idx_max = priv->gen_info->dimm_idx_max;
>>> +       u32 chan_rank_empty = 0;
>>> +       u64 dimm_mask = 0;
>>> +       int chan_rank, dimm_idx, ret;
>>> +       u32 pcs;
>>> +
>>> +       BUILD_BUG_ON(CHAN_RANK_MAX > 32);
>>> +       BUILD_BUG_ON(DIMM_NUMS_MAX > 64);
>>
>> I don't immediately see the value of those build bugs. What happens if
>> CHAN_RANK_MAX > 32 or DIMM_NUMS_MAX > 64 ? Where do those limits come
>> from ?
> 
> Supported HW doesn't come near the limit for now - it's just an "artificial"
> limit imposed by variables we're using (u64 for dimm_mask and u32 for
> chan_rank_empty).
> 

Please use a value derived from the size of those variables for the check
to clarify and explain the constraints.

Thanks,
Guenter

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 00/15] Introduce PECI subsystem
  2021-08-03 11:31 [PATCH v2 00/15] Introduce PECI subsystem Iwona Winiarska
                   ` (14 preceding siblings ...)
  2021-08-03 11:31 ` [PATCH v2 15/15] docs: Add PECI documentation Iwona Winiarska
@ 2021-08-05 12:17 ` Greg Kroah-Hartman
  15 siblings, 0 replies; 49+ messages in thread
From: Greg Kroah-Hartman @ 2021-08-05 12:17 UTC (permalink / raw)
  To: Iwona Winiarska
  Cc: linux-kernel, openbmc, x86, devicetree, linux-aspeed,
	linux-arm-kernel, linux-hwmon, linux-doc, Rob Herring,
	Joel Stanley, Andrew Jeffery, Jean Delvare, Guenter Roeck,
	Arnd Bergmann, Olof Johansson, Jonathan Corbet, Thomas Gleixner,
	Andy Lutomirski, Ingo Molnar, Borislav Petkov, Yazen Ghannam,
	Mauro Carvalho Chehab, Pierre-Louis Bossart, Tony Luck,
	Andy Shevchenko, Jae Hyun Yoo, Dan Williams, Randy Dunlap,
	Zev Weiss, David Muller

On Tue, Aug 03, 2021 at 01:31:19PM +0200, Iwona Winiarska wrote:
> Hi Greg,
> 
> This is a second round of patches introducing PECI subsystem.
> I don't think it is ready to be applied right away (we're still
> missing r-b's), but I hope we have chance to complete discussion in
> the 5.15 development cycle. I would appreciate if you could take
> a look.

I will wait to review this when you all feel it is ready so as to not
waste my time finding things that you already know need to be resolved.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 13/15] hwmon: peci: Add dimmtemp driver
  2021-08-04 17:33       ` Guenter Roeck
@ 2021-08-05 21:48         ` Winiarska, Iwona
  0 siblings, 0 replies; 49+ messages in thread
From: Winiarska, Iwona @ 2021-08-05 21:48 UTC (permalink / raw)
  To: linux
  Cc: corbet, jae.hyun.yoo, pierre-louis.bossart, linux-hwmon,
	Lutomirski, Andy, Luck, Tony, andrew, x86, mchehab, jdelvare,
	linux-kernel, olof, mingo, rdunlap, devicetree, tglx,
	linux-aspeed, arnd, linux-doc, yazen.ghannam, zweiss, robh+dt,
	openbmc, gregkh, joel, bp, d.mueller, andriy.shevchenko,
	Williams, Dan J, linux-arm-kernel

On Wed, 2021-08-04 at 10:33 -0700, Guenter Roeck wrote:
> On 8/4/21 3:46 AM, Winiarska, Iwona wrote:
> > On Tue, 2021-08-03 at 08:39 -0700, Guenter Roeck wrote:
> > > On Tue, Aug 03, 2021 at 01:31:32PM +0200, Iwona Winiarska wrote:
> > > > Add peci-dimmtemp driver for Temperature Sensor on DIMM readings that
> > > > are accessible via the processor PECI interface.
> > > > 
> > > > The main use case for the driver (and PECI interface) is out-of-band
> > > > management, where we're able to obtain thermal readings from an external
> > > > entity connected with PECI, e.g. BMC on server platforms.
> > > > 
> > > > Co-developed-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
> > > > Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
> > > > Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com>
> > > > Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
> > > > ---
> > > > Note that the timeout was completely removed - we're going to probe
> > > > for detected DIMMs every 5 seconds until we reach "stable" state of
> > > > either getting correct DIMM data or getting all -EINVAL (which
> > > > suggests that the CPU doesn't have any DIMMs).
> > > > 
> > > >   drivers/hwmon/peci/Kconfig    |  13 +
> > > >   drivers/hwmon/peci/Makefile   |   2 +
> > > >   drivers/hwmon/peci/dimmtemp.c | 614 ++++++++++++++++++++++++++++++++++
> > > >   3 files changed, 629 insertions(+)
> > > >   create mode 100644 drivers/hwmon/peci/dimmtemp.c
> > > > 
> > > > diff --git a/drivers/hwmon/peci/Kconfig b/drivers/hwmon/peci/Kconfig
> > > > index e10eed68d70a..9d32a57badfe 100644
> > > > --- a/drivers/hwmon/peci/Kconfig
> > > > +++ b/drivers/hwmon/peci/Kconfig
> > > > @@ -14,5 +14,18 @@ config SENSORS_PECI_CPUTEMP
> > > >            This driver can also be built as a module. If so, the module
> > > >            will be called peci-cputemp.
> > > >   
> > > > +config SENSORS_PECI_DIMMTEMP
> > > > +       tristate "PECI DIMM temperature monitoring client"
> > > > +       depends on PECI
> > > > +       select SENSORS_PECI
> > > > +       select PECI_CPU
> > > > +       help
> > > > +         If you say yes here you get support for the generic Intel PECI hwmon
> > > > +         driver which provides Temperature Sensor on DIMM readings that are
> > > > +         accessible via the processor PECI interface.
> > > > +
> > > > +         This driver can also be built as a module. If so, the module
> > > > +         will be called peci-dimmtemp.
> > > > +
> > > >   config SENSORS_PECI
> > > >          tristate
> > > > diff --git a/drivers/hwmon/peci/Makefile b/drivers/hwmon/peci/Makefile
> > > > index e8a0ada5ab1f..191cfa0227f3 100644
> > > > --- a/drivers/hwmon/peci/Makefile
> > > > +++ b/drivers/hwmon/peci/Makefile
> > > > @@ -1,5 +1,7 @@
> > > >   # SPDX-License-Identifier: GPL-2.0-only
> > > >   
> > > >   peci-cputemp-y := cputemp.o
> > > > +peci-dimmtemp-y := dimmtemp.o
> > > >   
> > > >   obj-$(CONFIG_SENSORS_PECI_CPUTEMP)     += peci-cputemp.o
> > > > +obj-$(CONFIG_SENSORS_PECI_DIMMTEMP)    += peci-dimmtemp.o
> > > > diff --git a/drivers/hwmon/peci/dimmtemp.c b/drivers/hwmon/peci/dimmtemp.c
> > > > new file mode 100644
> > > > index 000000000000..6264c29bb6c0
> > > > --- /dev/null
> > > > +++ b/drivers/hwmon/peci/dimmtemp.c
> > > > @@ -0,0 +1,614 @@
> > > > +// SPDX-License-Identifier: GPL-2.0-only
> > > > +// Copyright (c) 2018-2021 Intel Corporation
> > > > +
> > > > +#include <linux/auxiliary_bus.h>
> > > > +#include <linux/bitfield.h>
> > > > +#include <linux/bitops.h>
> > > > +#include <linux/hwmon.h>
> > > > +#include <linux/jiffies.h>
> > > > +#include <linux/module.h>
> > > > +#include <linux/peci.h>
> > > > +#include <linux/peci-cpu.h>
> > > > +#include <linux/units.h>
> > > > +#include <linux/workqueue.h>
> > > > +#include <linux/x86/intel-family.h>
> > > > +
> > > > +#include "common.h"
> > > > +
> > > > +#define DIMM_MASK_CHECK_DELAY_JIFFIES  msecs_to_jiffies(5000)
> > > > +
> > > > +/* Max number of channel ranks and DIMM index per channel */
> > > > +#define CHAN_RANK_MAX_ON_HSX   8
> > > > +#define DIMM_IDX_MAX_ON_HSX    3
> > > > +#define CHAN_RANK_MAX_ON_BDX   4
> > > > +#define DIMM_IDX_MAX_ON_BDX    3
> > > > +#define CHAN_RANK_MAX_ON_BDXD  2
> > > > +#define DIMM_IDX_MAX_ON_BDXD   2
> > > > +#define CHAN_RANK_MAX_ON_SKX   6
> > > > +#define DIMM_IDX_MAX_ON_SKX    2
> > > > +#define CHAN_RANK_MAX_ON_ICX   8
> > > > +#define DIMM_IDX_MAX_ON_ICX    2
> > > > +#define CHAN_RANK_MAX_ON_ICXD  4
> > > > +#define DIMM_IDX_MAX_ON_ICXD   2
> > > > +
> > > > +#define CHAN_RANK_MAX          CHAN_RANK_MAX_ON_HSX
> > > > +#define DIMM_IDX_MAX           DIMM_IDX_MAX_ON_HSX
> > > > +#define DIMM_NUMS_MAX          (CHAN_RANK_MAX * DIMM_IDX_MAX)
> > > > +
> > > > +#define CPU_SEG_MASK           GENMASK(23, 16)
> > > > +#define GET_CPU_SEG(x)         (((x) & CPU_SEG_MASK) >> 16)
> > > > +#define CPU_BUS_MASK           GENMASK(7, 0)
> > > > +#define GET_CPU_BUS(x)         ((x) & CPU_BUS_MASK)
> > > > +
> > > > +#define DIMM_TEMP_MAX          GENMASK(15, 8)
> > > > +#define DIMM_TEMP_CRIT         GENMASK(23, 16)
> > > > +#define GET_TEMP_MAX(x)                (((x) & DIMM_TEMP_MAX) >> 8)
> > > > +#define GET_TEMP_CRIT(x)       (((x) & DIMM_TEMP_CRIT) >> 16)
> > > > +
> > > > +struct peci_dimmtemp;
> > > > +
> > > > +struct dimm_info {
> > > > +       int chan_rank_max;
> > > > +       int dimm_idx_max;
> > > > +       u8 min_peci_revision;
> > > > +       int (*read_thresholds)(struct peci_dimmtemp *priv, int dimm_order,
> > > > +                              int chan_rank, u32 *data);
> > > > +};
> > > > +
> > > > +struct peci_dimm_thresholds {
> > > > +       long temp_max;
> > > > +       long temp_crit;
> > > > +       struct peci_sensor_state state;
> > > > +};
> > > > +
> > > > +enum peci_dimm_threshold_type {
> > > > +       temp_max_type,
> > > > +       temp_crit_type,
> > > > +};
> > > > +
> > > > +struct peci_dimmtemp {
> > > > +       struct peci_device *peci_dev;
> > > > +       struct device *dev;
> > > > +       const char *name;
> > > > +       const struct dimm_info *gen_info;
> > > > +       struct delayed_work detect_work;
> > > > +       struct {
> > > > +               struct peci_sensor_data temp;
> > > > +               struct peci_dimm_thresholds thresholds;
> > > > +       } dimm[DIMM_NUMS_MAX];
> > > > +       char **dimmtemp_label;
> > > > +       DECLARE_BITMAP(dimm_mask, DIMM_NUMS_MAX);
> > > > +};
> > > > +
> > > > +static u8 __dimm_temp(u32 reg, int dimm_order)
> > > > +{
> > > > +       return (reg >> (dimm_order * 8)) & 0xff;
> > > > +}
> > > > +
> > > > +static int get_dimm_temp(struct peci_dimmtemp *priv, int dimm_no, long *val)
> > > > +{
> > > > +       int dimm_order = dimm_no % priv->gen_info->dimm_idx_max;
> > > > +       int chan_rank = dimm_no / priv->gen_info->dimm_idx_max;
> > > > +       u32 data;
> > > > +       int ret;
> > > 
> > >          int ret = 0;
> > > 
> > > > +
> > > > +       mutex_lock(&priv->dimm[dimm_no].temp.state.lock);
> > > > +       if (!peci_sensor_need_update(&priv->dimm[dimm_no].temp.state))
> > > > +               goto skip_update;
> > > > +
> > > > +       ret = peci_pcs_read(priv->peci_dev, PECI_PCS_DDR_DIMM_TEMP, chan_rank, &data);
> > > > +       if (ret) {
> > > > +               mutex_unlock(&priv->dimm[dimm_no].temp.state.lock);
> > > > +               return ret;
> > > > +       }
> > > 
> > >          if (ret)
> > >                  goto unlock;
> > > 
> > > > +
> > > > +       priv->dimm[dimm_no].temp.value = __dimm_temp(data, dimm_order) * MILLIDEGREE_PER_DEGREE;
> > > > +
> > > > +       peci_sensor_mark_updated(&priv->dimm[dimm_no].temp.state);
> > > > +
> > > > +skip_update:
> > > > +       *val = priv->dimm[dimm_no].temp.value;
> > > 
> > > unlock:
> > > > +       mutex_unlock(&priv->dimm[dimm_no].temp.state.lock);
> > > > +       return 0;
> > > 
> > >          return ret;
> > 
> > Ack.
> > 
> > > 
> > > > +}
> > > > +
> > > > +static int update_thresholds(struct peci_dimmtemp *priv, int dimm_no)
> > > > +{
> > > > +       int dimm_order = dimm_no % priv->gen_info->dimm_idx_max;
> > > > +       int chan_rank = dimm_no / priv->gen_info->dimm_idx_max;
> > > > +       u32 data;
> > > > +       int ret;
> > > > +
> > > > +       if (!peci_sensor_need_update(&priv->dimm[dimm_no].thresholds.state))
> > > > +               return 0;
> > > > +
> > > > +       ret = priv->gen_info->read_thresholds(priv, dimm_order, chan_rank, &data);
> > > > +       if (ret == -ENODATA) /* Use default or previous value */
> > > > +               return 0;
> > > > +       if (ret)
> > > > +               return ret;
> > > > +
> > > > +       priv->dimm[dimm_no].thresholds.temp_max = GET_TEMP_MAX(data) * MILLIDEGREE_PER_DEGREE;
> > > > +       priv->dimm[dimm_no].thresholds.temp_crit = GET_TEMP_CRIT(data) * MILLIDEGREE_PER_DEGREE;
> > > > +
> > > > +       peci_sensor_mark_updated(&priv->dimm[dimm_no].thresholds.state);
> > > > +
> > > > +       return 0;
> > > > +}
> > > > +
> > > > +static int get_dimm_thresholds(struct peci_dimmtemp *priv, enum peci_dimm_threshold_type type,
> > > > +                              int dimm_no, long *val)
> > > > +{
> > > > +       int ret;
> > > > +
> > > > +       mutex_lock(&priv->dimm[dimm_no].thresholds.state.lock);
> > > > +       ret = update_thresholds(priv, dimm_no);
> > > > +       if (ret)
> > > > +               goto unlock;
> > > > +
> > > > +       switch (type) {
> > > > +       case temp_max_type:
> > > > +               *val = priv->dimm[dimm_no].thresholds.temp_max;
> > > > +               break;
> > > > +       case temp_crit_type:
> > > > +               *val = priv->dimm[dimm_no].thresholds.temp_crit;
> > > > +               break;
> > > > +       default:
> > > > +               ret = -EOPNOTSUPP;
> > > > +               break;
> > > > +       }
> > > > +unlock:
> > > > +       mutex_unlock(&priv->dimm[dimm_no].thresholds.state.lock);
> > > > +
> > > > +       return ret;
> > > > +}
> > > > +
> > > > +static int dimmtemp_read_string(struct device *dev,
> > > > +                               enum hwmon_sensor_types type,
> > > > +                               u32 attr, int channel, const char **str)
> > > > +{
> > > > +       struct peci_dimmtemp *priv = dev_get_drvdata(dev);
> > > > +
> > > > +       if (attr != hwmon_temp_label)
> > > > +               return -EOPNOTSUPP;
> > > > +
> > > > +       *str = (const char *)priv->dimmtemp_label[channel];
> > > > +
> > > > +       return 0;
> > > > +}
> > > > +
> > > > +static int dimmtemp_read(struct device *dev, enum hwmon_sensor_types type,
> > > > +                        u32 attr, int channel, long *val)
> > > > +{
> > > > +       struct peci_dimmtemp *priv = dev_get_drvdata(dev);
> > > > +
> > > > +       switch (attr) {
> > > > +       case hwmon_temp_input:
> > > > +               return get_dimm_temp(priv, channel, val);
> > > > +       case hwmon_temp_max:
> > > > +               return get_dimm_thresholds(priv, temp_max_type, channel, val);
> > > > +       case hwmon_temp_crit:
> > > > +               return get_dimm_thresholds(priv, temp_crit_type, channel, val);
> > > > +       default:
> > > > +               break;
> > > > +       }
> > > > +
> > > > +       return -EOPNOTSUPP;
> > > > +}
> > > > +
> > > > +static umode_t dimmtemp_is_visible(const void *data, enum hwmon_sensor_types type,
> > > > +                                  u32 attr, int channel)
> > > > +{
> > > > +       const struct peci_dimmtemp *priv = data;
> > > > +
> > > > +       if (test_bit(channel, priv->dimm_mask))
> > > > +               return 0444;
> > > > +
> > > > +       return 0;
> > > > +}
> > > > +
> > > > +static const struct hwmon_ops peci_dimmtemp_ops = {
> > > > +       .is_visible = dimmtemp_is_visible,
> > > > +       .read_string = dimmtemp_read_string,
> > > > +       .read = dimmtemp_read,
> > > > +};
> > > > +
> > > > +static int check_populated_dimms(struct peci_dimmtemp *priv)
> > > > +{
> > > > +       int chan_rank_max = priv->gen_info->chan_rank_max;
> > > > +       int dimm_idx_max = priv->gen_info->dimm_idx_max;
> > > > +       u32 chan_rank_empty = 0;
> > > > +       u64 dimm_mask = 0;
> > > > +       int chan_rank, dimm_idx, ret;
> > > > +       u32 pcs;
> > > > +
> > > > +       BUILD_BUG_ON(CHAN_RANK_MAX > 32);
> > > > +       BUILD_BUG_ON(DIMM_NUMS_MAX > 64);
> > > 
> > > I don't immediately see the value of those build bugs. What happens if
> > > CHAN_RANK_MAX > 32 or DIMM_NUMS_MAX > 64 ? Where do those limits come
> > > from ?
> > 
> > Supported HW doesn't come near the limit for now - it's just an "artificial"
> > limit imposed by variables we're using (u64 for dimm_mask and u32 for
> > chan_rank_empty).
> > 
> 
> Please use a value derived from the size of those variables for the check
> to clarify and explain the constraints.

Sure, I'll use BITS_PER_TYPE.
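
Roughly along these lines. This is an untested userspace sketch, not the
code for the next revision: BITS_PER_TYPE() and BUILD_BUG_ON() are local
stand-ins for the kernel macros, and dimm_mask_bit() is a hypothetical
helper that exists only to give the checks a home.

```c
#include <limits.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's BITS_PER_TYPE() and BUILD_BUG_ON(). */
#define BITS_PER_TYPE(type)	(sizeof(type) * CHAR_BIT)
#define BUILD_BUG_ON(cond)	_Static_assert(!(cond), "BUILD_BUG_ON: " #cond)

/* Values from the patch: the HSX limits are the largest supported ones. */
#define CHAN_RANK_MAX	8
#define DIMM_IDX_MAX	3
#define DIMM_NUMS_MAX	(CHAN_RANK_MAX * DIMM_IDX_MAX)

/* Hypothetical helper: the bit that represents one DIMM in a u64 mask. */
static inline uint64_t dimm_mask_bit(int chan_rank, int dimm_idx, int dimm_idx_max)
{
	/* One bit per channel rank has to fit in the u32 chan_rank_empty. */
	BUILD_BUG_ON(CHAN_RANK_MAX > BITS_PER_TYPE(uint32_t));
	/* One bit per DIMM has to fit in the u64 dimm_mask. */
	BUILD_BUG_ON(DIMM_NUMS_MAX > BITS_PER_TYPE(uint64_t));

	return UINT64_C(1) << (chan_rank * dimm_idx_max + dimm_idx);
}
```

That way the limit follows the variable type instead of a bare 32 or 64.
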

Thanks
-Iwona

> 
> Thanks,
> Guenter


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 03/15] dt-bindings: Add generic bindings for PECI
  2021-08-03 11:31 ` [PATCH v2 03/15] dt-bindings: Add generic bindings for PECI Iwona Winiarska
@ 2021-08-11 18:11   ` Rob Herring
  0 siblings, 0 replies; 49+ messages in thread
From: Rob Herring @ 2021-08-11 18:11 UTC (permalink / raw)
  To: Iwona Winiarska
  Cc: Tony Luck, Borislav Petkov, linux-aspeed, openbmc, Arnd Bergmann,
	Andy Shevchenko, Randy Dunlap, linux-kernel, devicetree,
	Jae Hyun Yoo, David Muller, x86, Yazen Ghannam, Jonathan Corbet,
	Rob Herring, Olof Johansson, linux-arm-kernel, Andrew Jeffery,
	Joel Stanley, linux-doc, Mauro Carvalho Chehab, Thomas Gleixner,
	Andy Lutomirski, Greg Kroah-Hartman, linux-hwmon, Jean Delvare,
	Zev Weiss, Ingo Molnar, Guenter Roeck, Pierre-Louis Bossart,
	Dan Williams

On Tue, 03 Aug 2021 13:31:22 +0200, Iwona Winiarska wrote:
> Add device tree bindings for the PECI controller.
> 
> Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com>
> ---
>  .../bindings/peci/peci-controller.yaml        | 33 +++++++++++++++++++
>  1 file changed, 33 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/peci/peci-controller.yaml
> 

Reviewed-by: Rob Herring <robh@kernel.org>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 04/15] dt-bindings: Add bindings for peci-aspeed
  2021-08-03 11:31 ` [PATCH v2 04/15] dt-bindings: Add bindings for peci-aspeed Iwona Winiarska
@ 2021-08-11 18:11   ` Rob Herring
  0 siblings, 0 replies; 49+ messages in thread
From: Rob Herring @ 2021-08-11 18:11 UTC (permalink / raw)
  To: Iwona Winiarska
  Cc: linux-aspeed, Guenter Roeck, Andy Shevchenko, Tony Luck,
	Andy Lutomirski, Thomas Gleixner, x86, Olof Johansson,
	Jonathan Corbet, Andrew Jeffery, Randy Dunlap, linux-kernel,
	Greg Kroah-Hartman, Jae Hyun Yoo, linux-arm-kernel, linux-hwmon,
	linux-doc, Ingo Molnar, Dan Williams, Arnd Bergmann,
	Borislav Petkov, Pierre-Louis Bossart, Rob Herring, devicetree,
	Joel Stanley, Jean Delvare, Zev Weiss, openbmc, Yazen Ghannam,
	David Muller, Mauro Carvalho Chehab

On Tue, 03 Aug 2021 13:31:23 +0200, Iwona Winiarska wrote:
> Add device tree bindings for the peci-aspeed controller driver.
> 
> Co-developed-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
> Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
> Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com>
> ---
>  .../devicetree/bindings/peci/peci-aspeed.yaml | 109 ++++++++++++++++++
>  1 file changed, 109 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/peci/peci-aspeed.yaml
> 

Reviewed-by: Rob Herring <robh@kernel.org>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 06/15] peci: Add core infrastructure
  2021-08-03 11:31 ` [PATCH v2 06/15] peci: Add core infrastructure Iwona Winiarska
@ 2021-08-25 22:58   ` Dan Williams
  2021-08-26 22:40     ` Winiarska, Iwona
  0 siblings, 1 reply; 49+ messages in thread
From: Dan Williams @ 2021-08-25 22:58 UTC (permalink / raw)
  To: Iwona Winiarska
  Cc: Linux Kernel Mailing List, openbmc, Greg Kroah-Hartman, X86 ML,
	Device Tree, linux-aspeed, Linux ARM, linux-hwmon,
	Linux Doc Mailing List, Rob Herring, Joel Stanley,
	Andrew Jeffery, Jean Delvare, Guenter Roeck, Arnd Bergmann,
	Olof Johansson, Jonathan Corbet, Thomas Gleixner,
	Andy Lutomirski, Ingo Molnar, Borislav Petkov, Yazen Ghannam,
	Mauro Carvalho Chehab, Pierre-Louis Bossart, Tony Luck,
	Andy Shevchenko, Jae Hyun Yoo, Randy Dunlap, Zev Weiss,
	David Muller, Jason M Bills

On Tue, Aug 3, 2021 at 4:35 AM Iwona Winiarska
<iwona.winiarska@intel.com> wrote:
>
> Intel processors provide access for various services designed to support
> processor and DRAM thermal management, platform manageability and
> processor interface tuning and diagnostics.
> Those services are available via the Platform Environment Control
> Interface (PECI) that provides a communication channel between the
> processor and the Baseboard Management Controller (BMC) or other
> platform management device.
>
> This change introduces PECI subsystem by adding the initial core module
> and API for controller drivers.
>
> Co-developed-by: Jason M Bills <jason.m.bills@linux.intel.com>
> Signed-off-by: Jason M Bills <jason.m.bills@linux.intel.com>
> Co-developed-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
> Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
> Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com>
> Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
> ---
>  MAINTAINERS             |   9 +++
>  drivers/Kconfig         |   3 +
>  drivers/Makefile        |   1 +
>  drivers/peci/Kconfig    |  15 ++++
>  drivers/peci/Makefile   |   5 ++
>  drivers/peci/core.c     | 155 ++++++++++++++++++++++++++++++++++++++++
>  drivers/peci/internal.h |  16 +++++
>  include/linux/peci.h    |  99 +++++++++++++++++++++++++
>  8 files changed, 303 insertions(+)
>  create mode 100644 drivers/peci/Kconfig
>  create mode 100644 drivers/peci/Makefile
>  create mode 100644 drivers/peci/core.c
>  create mode 100644 drivers/peci/internal.h
>  create mode 100644 include/linux/peci.h
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 7cdab7229651..d411974aaa5e 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -14503,6 +14503,15 @@ L:     platform-driver-x86@vger.kernel.org
>  S:     Maintained
>  F:     drivers/platform/x86/peaq-wmi.c
>
> +PECI SUBSYSTEM
> +M:     Iwona Winiarska <iwona.winiarska@intel.com>
> +R:     Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
> +L:     openbmc@lists.ozlabs.org (moderated for non-subscribers)
> +S:     Supported
> +F:     Documentation/devicetree/bindings/peci/
> +F:     drivers/peci/
> +F:     include/linux/peci.h
> +
>  PENSANDO ETHERNET DRIVERS
>  M:     Shannon Nelson <snelson@pensando.io>
>  M:     drivers@pensando.io
> diff --git a/drivers/Kconfig b/drivers/Kconfig
> index 8bad63417a50..f472b3d972b3 100644
> --- a/drivers/Kconfig
> +++ b/drivers/Kconfig
> @@ -236,4 +236,7 @@ source "drivers/interconnect/Kconfig"
>  source "drivers/counter/Kconfig"
>
>  source "drivers/most/Kconfig"
> +
> +source "drivers/peci/Kconfig"
> +
>  endmenu
> diff --git a/drivers/Makefile b/drivers/Makefile
> index 27c018bdf4de..8d96f0c3dde5 100644
> --- a/drivers/Makefile
> +++ b/drivers/Makefile
> @@ -189,3 +189,4 @@ obj-$(CONFIG_GNSS)          += gnss/
>  obj-$(CONFIG_INTERCONNECT)     += interconnect/
>  obj-$(CONFIG_COUNTER)          += counter/
>  obj-$(CONFIG_MOST)             += most/
> +obj-$(CONFIG_PECI)             += peci/
> diff --git a/drivers/peci/Kconfig b/drivers/peci/Kconfig
> new file mode 100644
> index 000000000000..71a4ad81225a
> --- /dev/null
> +++ b/drivers/peci/Kconfig
> @@ -0,0 +1,15 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +
> +menuconfig PECI
> +       tristate "PECI support"
> +       help
> +         The Platform Environment Control Interface (PECI) is an interface
> +         that provides a communication channel to Intel processors and
> +         chipset components from external monitoring or control devices.
> +
> +         If you are building a Baseboard Management Controller (BMC) kernel
> +         for Intel platform say Y here and also to the specific driver for
> +         your adapter(s) below. If unsure say N.
> +
> +         This support is also available as a module. If so, the module
> +         will be called peci.
> diff --git a/drivers/peci/Makefile b/drivers/peci/Makefile
> new file mode 100644
> index 000000000000..e789a354e842
> --- /dev/null
> +++ b/drivers/peci/Makefile
> @@ -0,0 +1,5 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +
> +# Core functionality
> +peci-y := core.o
> +obj-$(CONFIG_PECI) += peci.o
> diff --git a/drivers/peci/core.c b/drivers/peci/core.c
> new file mode 100644
> index 000000000000..7b3938af0396
> --- /dev/null
> +++ b/drivers/peci/core.c
> @@ -0,0 +1,155 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +// Copyright (c) 2018-2021 Intel Corporation
> +
> +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt

This looks like overkill for only one print statement in this module,
especially when the dev_ print helpers offer more detail.

> +
> +#include <linux/bug.h>
> +#include <linux/device.h>
> +#include <linux/export.h>
> +#include <linux/idr.h>
> +#include <linux/module.h>
> +#include <linux/of.h>
> +#include <linux/peci.h>
> +#include <linux/pm_runtime.h>
> +#include <linux/property.h>
> +#include <linux/slab.h>
> +
> +#include "internal.h"
> +
> +static DEFINE_IDA(peci_controller_ida);
> +
> +static void peci_controller_dev_release(struct device *dev)
> +{
> +       struct peci_controller *controller = to_peci_controller(dev);
> +
> +       pm_runtime_disable(&controller->dev);

This seems late to be disabling power management; the device is about
to be freed. Keep in mind the lifetime of this object can be
artificially prolonged. I expect this to be done when the device is
unregistered from the bus.
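
A rough user-space sketch of the concern, not kernel code: struct
fake_dev and its helpers are made-up stand-ins for a refcounted struct
device. It only illustrates that a stray reference delays ->release(),
so teardown that must happen when the device leaves the bus belongs in
the unregister path.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a refcounted struct device. */
struct fake_dev {
	int refcount;
	bool registered;
	bool pm_enabled;
	bool released;
};

static void fake_dev_init(struct fake_dev *d)
{
	d->refcount = 1;	/* reference held on behalf of the bus */
	d->registered = true;
	d->pm_enabled = true;
	d->released = false;
}

static void fake_dev_get(struct fake_dev *d)
{
	d->refcount++;
}

/* Models ->release(): runs only when the last reference is dropped. */
static void fake_dev_put(struct fake_dev *d)
{
	assert(d->refcount > 0);
	if (--d->refcount == 0)
		d->released = true;
}

/* Models unregistration: disable PM here, then drop the bus reference. */
static void fake_dev_unregister(struct fake_dev *d)
{
	d->pm_enabled = false;
	d->registered = false;
	fake_dev_put(d);
}
```

With one extra fake_dev_get() outstanding, fake_dev_unregister() already
clears pm_enabled while released stays false until the final
fake_dev_put().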

> +
> +       mutex_destroy(&controller->bus_lock);
> +       ida_free(&peci_controller_ida, controller->id);
> +       fwnode_handle_put(controller->dev.fwnode);

Shouldn't the get / put of this handle reference be bound to specific
accesses, not held for the entire lifetime of the object? At a minimum
it seems to be a reference that can be taken at registration and dropped
at unregistration.

> +       kfree(controller);
> +}
> +
> +struct device_type peci_controller_type = {
> +       .release        = peci_controller_dev_release,
> +};
> +
> +static struct peci_controller *peci_controller_alloc(struct device *dev,
> +                                                    struct peci_controller_ops *ops)
> +{
> +       struct fwnode_handle *node = fwnode_handle_get(dev_fwnode(dev));
> +       struct peci_controller *controller;
> +       int ret;
> +
> +       if (!ops->xfer)
> +               return ERR_PTR(-EINVAL);
> +
> +       controller = kzalloc(sizeof(*controller), GFP_KERNEL);
> +       if (!controller)
> +               return ERR_PTR(-ENOMEM);
> +
> +       ret = ida_alloc_max(&peci_controller_ida, U8_MAX, GFP_KERNEL);
> +       if (ret < 0)
> +               goto err;
> +       controller->id = ret;
> +
> +       controller->ops = ops;
> +
> +       controller->dev.parent = dev;
> +       controller->dev.bus = &peci_bus_type;
> +       controller->dev.type = &peci_controller_type;
> +       controller->dev.fwnode = node;
> +       controller->dev.of_node = to_of_node(node);
> +
> +       device_initialize(&controller->dev);
> +
> +       mutex_init(&controller->bus_lock);
> +
> +       pm_runtime_no_callbacks(&controller->dev);
> +       pm_suspend_ignore_children(&controller->dev, true);
> +       pm_runtime_enable(&controller->dev);

Per above, are you sure unregistered devices need pm_runtime enabled?

Rest looks ok to me.

* Re: [PATCH v2 07/15] peci: Add peci-aspeed controller driver
  2021-08-03 11:31 ` [PATCH v2 07/15] peci: Add peci-aspeed controller driver Iwona Winiarska
@ 2021-08-26  1:35   ` Dan Williams
  2021-08-26 23:54     ` Winiarska, Iwona
  0 siblings, 1 reply; 49+ messages in thread
From: Dan Williams @ 2021-08-26  1:35 UTC (permalink / raw)
  To: Iwona Winiarska
  Cc: Linux Kernel Mailing List, openbmc, Greg Kroah-Hartman, X86 ML,
	Device Tree, linux-aspeed, Linux ARM, linux-hwmon,
	Linux Doc Mailing List, Rob Herring, Joel Stanley,
	Andrew Jeffery, Jean Delvare, Guenter Roeck, Arnd Bergmann,
	Olof Johansson, Jonathan Corbet, Thomas Gleixner,
	Andy Lutomirski, Ingo Molnar, Borislav Petkov, Yazen Ghannam,
	Mauro Carvalho Chehab, Pierre-Louis Bossart, Tony Luck,
	Andy Shevchenko, Jae Hyun Yoo, Randy Dunlap, Zev Weiss,
	David Muller

On Tue, Aug 3, 2021 at 4:35 AM Iwona Winiarska
<iwona.winiarska@intel.com> wrote:
>
> From: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
>
> ASPEED AST24xx/AST25xx/AST26xx SoCs supports the PECI electrical
> interface (a.k.a PECI wire).

Maybe a one-sentence blurb here and in the Kconfig reminding people
why they should care whether they have a PECI driver or not?

>
> Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
> Co-developed-by: Iwona Winiarska <iwona.winiarska@intel.com>
> Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com>
> Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
> ---
>  MAINTAINERS                           |   9 +
>  drivers/peci/Kconfig                  |   6 +
>  drivers/peci/Makefile                 |   3 +
>  drivers/peci/controller/Kconfig       |  16 +
>  drivers/peci/controller/Makefile      |   3 +
>  drivers/peci/controller/peci-aspeed.c | 445 ++++++++++++++++++++++++++
>  6 files changed, 482 insertions(+)
>  create mode 100644 drivers/peci/controller/Kconfig
>  create mode 100644 drivers/peci/controller/Makefile
>  create mode 100644 drivers/peci/controller/peci-aspeed.c
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index d411974aaa5e..6e9d53ff68ab 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -2866,6 +2866,15 @@ S:       Maintained
>  F:     Documentation/hwmon/asc7621.rst
>  F:     drivers/hwmon/asc7621.c
>
> +ASPEED PECI CONTROLLER
> +M:     Iwona Winiarska <iwona.winiarska@intel.com>
> +M:     Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
> +L:     linux-aspeed@lists.ozlabs.org (moderated for non-subscribers)
> +L:     openbmc@lists.ozlabs.org (moderated for non-subscribers)
> +S:     Supported
> +F:     Documentation/devicetree/bindings/peci/peci-aspeed.yaml
> +F:     drivers/peci/controller/peci-aspeed.c
> +
>  ASPEED PINCTRL DRIVERS
>  M:     Andrew Jeffery <andrew@aj.id.au>
>  L:     linux-aspeed@lists.ozlabs.org (moderated for non-subscribers)
> diff --git a/drivers/peci/Kconfig b/drivers/peci/Kconfig
> index 71a4ad81225a..99279df97a78 100644
> --- a/drivers/peci/Kconfig
> +++ b/drivers/peci/Kconfig
> @@ -13,3 +13,9 @@ menuconfig PECI
>
>           This support is also available as a module. If so, the module
>           will be called peci.
> +
> +if PECI
> +
> +source "drivers/peci/controller/Kconfig"
> +
> +endif # PECI
> diff --git a/drivers/peci/Makefile b/drivers/peci/Makefile
> index e789a354e842..926d8df15cbd 100644
> --- a/drivers/peci/Makefile
> +++ b/drivers/peci/Makefile
> @@ -3,3 +3,6 @@
>  # Core functionality
>  peci-y := core.o
>  obj-$(CONFIG_PECI) += peci.o
> +
> +# Hardware specific bus drivers
> +obj-y += controller/
> diff --git a/drivers/peci/controller/Kconfig b/drivers/peci/controller/Kconfig
> new file mode 100644
> index 000000000000..6d48df08db1c
> --- /dev/null
> +++ b/drivers/peci/controller/Kconfig
> @@ -0,0 +1,16 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +
> +config PECI_ASPEED
> +       tristate "ASPEED PECI support"
> +       depends on ARCH_ASPEED || COMPILE_TEST
> +       depends on OF
> +       depends on HAS_IOMEM
> +       help
> +         This option enables PECI controller driver for ASPEED AST2400,
> +         AST2500 and AST2600 SoCs.
> +
> +         Say Y here if your system runs on ASPEED SoC and you are using it
> +         as BMC for Intel platform.
> +
> +         This driver can also be built as a module. If so, the module will
> +         be called peci-aspeed.
> diff --git a/drivers/peci/controller/Makefile b/drivers/peci/controller/Makefile
> new file mode 100644
> index 000000000000..022c28ef1bf0
> --- /dev/null
> +++ b/drivers/peci/controller/Makefile
> @@ -0,0 +1,3 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +
> +obj-$(CONFIG_PECI_ASPEED)      += peci-aspeed.o
> diff --git a/drivers/peci/controller/peci-aspeed.c b/drivers/peci/controller/peci-aspeed.c
> new file mode 100644
> index 000000000000..1d708c983749
> --- /dev/null
> +++ b/drivers/peci/controller/peci-aspeed.c
> @@ -0,0 +1,445 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +// Copyright (C) 2012-2017 ASPEED Technology Inc.
> +// Copyright (c) 2018-2021 Intel Corporation

Why different copyright capitalization?

> +
> +#include <linux/bitfield.h>
> +#include <linux/clk.h>
> +#include <linux/delay.h>
> +#include <linux/interrupt.h>
> +#include <linux/io.h>
> +#include <linux/iopoll.h>
> +#include <linux/jiffies.h>
> +#include <linux/module.h>
> +#include <linux/of.h>
> +#include <linux/peci.h>
> +#include <linux/platform_device.h>
> +#include <linux/reset.h>
> +
> +#include <asm/unaligned.h>

Why is this included?

> +
> +/* ASPEED PECI Registers */
> +/* Control Register */
> +#define ASPEED_PECI_CTRL                       0x00
> +#define   ASPEED_PECI_CTRL_SAMPLING_MASK       GENMASK(19, 16)
> +#define   ASPEED_PECI_CTRL_RD_MODE_MASK                GENMASK(13, 12)
> +#define     ASPEED_PECI_CTRL_RD_MODE_DBG       BIT(13)
> +#define     ASPEED_PECI_CTRL_RD_MODE_COUNT     BIT(12)
> +#define   ASPEED_PECI_CTRL_CLK_SOURCE          BIT(11)
> +#define   ASPEED_PECI_CTRL_CLK_DIV_MASK                GENMASK(10, 8)
> +#define   ASPEED_PECI_CTRL_INVERT_OUT          BIT(7)
> +#define   ASPEED_PECI_CTRL_INVERT_IN           BIT(6)
> +#define   ASPEED_PECI_CTRL_BUS_CONTENTION_EN   BIT(5)
> +#define   ASPEED_PECI_CTRL_PECI_EN             BIT(4)
> +#define   ASPEED_PECI_CTRL_PECI_CLK_EN         BIT(0)
> +
> +/* Timing Negotiation Register */
> +#define ASPEED_PECI_TIMING_NEGOTIATION         0x04
> +#define   ASPEED_PECI_T_NEGO_MSG_MASK          GENMASK(15, 8)
> +#define   ASPEED_PECI_T_NEGO_ADDR_MASK         GENMASK(7, 0)
> +
> +/* Command Register */
> +#define ASPEED_PECI_CMD                                0x08
> +#define   ASPEED_PECI_CMD_PIN_MONITORING       BIT(31)
> +#define   ASPEED_PECI_CMD_STS_MASK             GENMASK(27, 24)
> +#define     ASPEED_PECI_CMD_STS_ADDR_T_NEGO    0x3
> +#define   ASPEED_PECI_CMD_IDLE_MASK            \
> +         (ASPEED_PECI_CMD_STS_MASK | ASPEED_PECI_CMD_PIN_MONITORING)
> +#define   ASPEED_PECI_CMD_FIRE                 BIT(0)
> +
> +/* Read/Write Length Register */
> +#define ASPEED_PECI_RW_LENGTH                  0x0c
> +#define   ASPEED_PECI_AW_FCS_EN                        BIT(31)
> +#define   ASPEED_PECI_RD_LEN_MASK              GENMASK(23, 16)
> +#define   ASPEED_PECI_WR_LEN_MASK              GENMASK(15, 8)
> +#define   ASPEED_PECI_TARGET_ADDR_MASK         GENMASK(7, 0)
> +
> +/* Expected FCS Data Register */
> +#define ASPEED_PECI_EXPECTED_FCS               0x10
> +#define   ASPEED_PECI_EXPECTED_RD_FCS_MASK     GENMASK(23, 16)
> +#define   ASPEED_PECI_EXPECTED_AW_FCS_AUTO_MASK        GENMASK(15, 8)
> +#define   ASPEED_PECI_EXPECTED_WR_FCS_MASK     GENMASK(7, 0)
> +
> +/* Captured FCS Data Register */
> +#define ASPEED_PECI_CAPTURED_FCS               0x14
> +#define   ASPEED_PECI_CAPTURED_RD_FCS_MASK     GENMASK(23, 16)
> +#define   ASPEED_PECI_CAPTURED_WR_FCS_MASK     GENMASK(7, 0)
> +
> +/* Interrupt Register */
> +#define ASPEED_PECI_INT_CTRL                   0x18
> +#define   ASPEED_PECI_TIMING_NEGO_SEL_MASK     GENMASK(31, 30)
> +#define     ASPEED_PECI_1ST_BIT_OF_ADDR_NEGO   0
> +#define     ASPEED_PECI_2ND_BIT_OF_ADDR_NEGO   1
> +#define     ASPEED_PECI_MESSAGE_NEGO           2
> +#define   ASPEED_PECI_INT_MASK                 GENMASK(4, 0)
> +#define     ASPEED_PECI_INT_BUS_TIMEOUT                BIT(4)
> +#define     ASPEED_PECI_INT_BUS_CONTENTION     BIT(3)
> +#define     ASPEED_PECI_INT_WR_FCS_BAD         BIT(2)
> +#define     ASPEED_PECI_INT_WR_FCS_ABORT       BIT(1)
> +#define     ASPEED_PECI_INT_CMD_DONE           BIT(0)
> +
> +/* Interrupt Status Register */
> +#define ASPEED_PECI_INT_STS                    0x1c
> +#define   ASPEED_PECI_INT_TIMING_RESULT_MASK   GENMASK(29, 16)
> +         /* bits[4..0]: Same bit fields in the 'Interrupt Register' */
> +
> +/* Rx/Tx Data Buffer Registers */
> +#define ASPEED_PECI_WR_DATA0                   0x20
> +#define ASPEED_PECI_WR_DATA1                   0x24
> +#define ASPEED_PECI_WR_DATA2                   0x28
> +#define ASPEED_PECI_WR_DATA3                   0x2c
> +#define ASPEED_PECI_RD_DATA0                   0x30
> +#define ASPEED_PECI_RD_DATA1                   0x34
> +#define ASPEED_PECI_RD_DATA2                   0x38
> +#define ASPEED_PECI_RD_DATA3                   0x3c
> +#define ASPEED_PECI_WR_DATA4                   0x40
> +#define ASPEED_PECI_WR_DATA5                   0x44
> +#define ASPEED_PECI_WR_DATA6                   0x48
> +#define ASPEED_PECI_WR_DATA7                   0x4c
> +#define ASPEED_PECI_RD_DATA4                   0x50
> +#define ASPEED_PECI_RD_DATA5                   0x54
> +#define ASPEED_PECI_RD_DATA6                   0x58
> +#define ASPEED_PECI_RD_DATA7                   0x5c
> +#define   ASPEED_PECI_DATA_BUF_SIZE_MAX                32
> +
> +/* Timing Negotiation */
> +#define ASPEED_PECI_RD_SAMPLING_POINT_DEFAULT  8
> +#define ASPEED_PECI_RD_SAMPLING_POINT_MAX      (BIT(4) - 1)
> +#define ASPEED_PECI_CLK_DIV_DEFAULT            0
> +#define ASPEED_PECI_CLK_DIV_MAX                        (BIT(3) - 1)
> +#define ASPEED_PECI_MSG_TIMING_DEFAULT         1
> +#define ASPEED_PECI_MSG_TIMING_MAX             (BIT(8) - 1)
> +#define ASPEED_PECI_ADDR_TIMING_DEFAULT                1
> +#define ASPEED_PECI_ADDR_TIMING_MAX            (BIT(8) - 1)
> +
> +/* Timeout */
> +#define ASPEED_PECI_IDLE_CHECK_TIMEOUT_US      (50 * USEC_PER_MSEC)
> +#define ASPEED_PECI_IDLE_CHECK_INTERVAL_US     (10 * USEC_PER_MSEC)
> +#define ASPEED_PECI_CMD_TIMEOUT_MS_DEFAULT     (1000)
> +#define ASPEED_PECI_CMD_TIMEOUT_MS_MAX         (1000)
> +
> +struct aspeed_peci {
> +       struct peci_controller *controller;
> +       struct device *dev;
> +       void __iomem *base;
> +       struct clk *clk;
> +       struct reset_control *rst;
> +       int irq;
> +       spinlock_t lock; /* to sync completion status handling */
> +       struct completion xfer_complete;
> +       u32 status;
> +       u32 cmd_timeout_ms;
> +       u32 msg_timing;
> +       u32 addr_timing;
> +       u32 rd_sampling_point;
> +       u32 clk_div;
> +};
> +
> +static void aspeed_peci_init_regs(struct aspeed_peci *priv)
> +{
> +       u32 val;
> +
> +       val = FIELD_PREP(ASPEED_PECI_CTRL_CLK_DIV_MASK, ASPEED_PECI_CLK_DIV_DEFAULT);
> +       val |= ASPEED_PECI_CTRL_PECI_CLK_EN;
> +       writel(val, priv->base + ASPEED_PECI_CTRL);
> +       /*
> +        * Timing negotiation period setting.
> +        * The unit of the programmed value is 4 times of PECI clock period.
> +        */
> +       val = FIELD_PREP(ASPEED_PECI_T_NEGO_MSG_MASK, priv->msg_timing);
> +       val |= FIELD_PREP(ASPEED_PECI_T_NEGO_ADDR_MASK, priv->addr_timing);
> +       writel(val, priv->base + ASPEED_PECI_TIMING_NEGOTIATION);
> +
> +       /* Clear interrupts */
> +       val = readl(priv->base + ASPEED_PECI_INT_STS) | ASPEED_PECI_INT_MASK;
> +       writel(val, priv->base + ASPEED_PECI_INT_STS);
> +
> +       /* Set timing negotiation mode and enable interrupts */
> +       val = FIELD_PREP(ASPEED_PECI_TIMING_NEGO_SEL_MASK, ASPEED_PECI_1ST_BIT_OF_ADDR_NEGO);
> +       val |= ASPEED_PECI_INT_MASK;
> +       writel(val, priv->base + ASPEED_PECI_INT_CTRL);
> +
> +       val = FIELD_PREP(ASPEED_PECI_CTRL_SAMPLING_MASK, priv->rd_sampling_point);
> +       val |= FIELD_PREP(ASPEED_PECI_CTRL_CLK_DIV_MASK, priv->clk_div);
> +       val |= ASPEED_PECI_CTRL_PECI_EN;
> +       val |= ASPEED_PECI_CTRL_PECI_CLK_EN;
> +       writel(val, priv->base + ASPEED_PECI_CTRL);
> +}
> +
> +static inline int aspeed_peci_check_idle(struct aspeed_peci *priv)
> +{
> +       u32 cmd_sts = readl(priv->base + ASPEED_PECI_CMD);
> +
> +       if (FIELD_GET(ASPEED_PECI_CMD_STS_MASK, cmd_sts) == ASPEED_PECI_CMD_STS_ADDR_T_NEGO)
> +               aspeed_peci_init_regs(priv);
> +
> +       return readl_poll_timeout(priv->base + ASPEED_PECI_CMD,
> +                                 cmd_sts,
> +                                 !(cmd_sts & ASPEED_PECI_CMD_IDLE_MASK),
> +                                 ASPEED_PECI_IDLE_CHECK_INTERVAL_US,
> +                                 ASPEED_PECI_IDLE_CHECK_TIMEOUT_US);
> +}
> +
> +static int aspeed_peci_xfer(struct peci_controller *controller,
> +                           u8 addr, struct peci_request *req)
> +{
> +       struct aspeed_peci *priv = dev_get_drvdata(controller->dev.parent);
> +       unsigned long flags, timeout = msecs_to_jiffies(priv->cmd_timeout_ms);
> +       u32 peci_head;
> +       int ret;
> +
> +       if (req->tx.len > ASPEED_PECI_DATA_BUF_SIZE_MAX ||
> +           req->rx.len > ASPEED_PECI_DATA_BUF_SIZE_MAX)
> +               return -EINVAL;
> +
> +       /* Check command sts and bus idle state */
> +       ret = aspeed_peci_check_idle(priv);
> +       if (ret)
> +               return ret; /* -ETIMEDOUT */
> +
> +       spin_lock_irqsave(&priv->lock, flags);
> +       reinit_completion(&priv->xfer_complete);
> +
> +       peci_head = FIELD_PREP(ASPEED_PECI_TARGET_ADDR_MASK, addr) |
> +                   FIELD_PREP(ASPEED_PECI_WR_LEN_MASK, req->tx.len) |
> +                   FIELD_PREP(ASPEED_PECI_RD_LEN_MASK, req->rx.len);
> +
> +       writel(peci_head, priv->base + ASPEED_PECI_RW_LENGTH);
> +
> +       memcpy_toio(priv->base + ASPEED_PECI_WR_DATA0, req->tx.buf, min_t(u8, req->tx.len, 16));
> +       if (req->tx.len > 16)
> +               memcpy_toio(priv->base + ASPEED_PECI_WR_DATA4, req->tx.buf + 16,
> +                           req->tx.len - 16);
> +
> +       dev_dbg(priv->dev, "HEAD : 0x%08x\n", peci_head);
> +       print_hex_dump_bytes("TX : ", DUMP_PREFIX_NONE, req->tx.buf, req->tx.len);

On CONFIG_DYNAMIC_DEBUG=n builds the kernel will do all the work of
reading through this buffer, but skip emitting it. Are you sure you
want to pay that overhead for every transaction?

> +
> +       priv->status = 0;
> +       writel(ASPEED_PECI_CMD_FIRE, priv->base + ASPEED_PECI_CMD);
> +       spin_unlock_irqrestore(&priv->lock, flags);
> +
> +       ret = wait_for_completion_interruptible_timeout(&priv->xfer_complete, timeout);

spin_lock_irqsave() says "I don't know if interrupts are disabled
already, so I'll save the state, whatever it is, and restore later"

wait_for_completion_interruptible_timeout() says "I know I am in a
sleepable context where interrupts are enabled"

So, one of those is wrong, i.e. should it be spin_{lock,unlock}_irq()?
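
To spell out the difference, here is a toy user-space model (the
model_* names are invented for illustration; the real primitives also
take a lock, which is omitted here). It tracks only the interrupt-enable
state that the two lock flavours manipulate.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the CPU's interrupt-enable flag. */
static bool irqs_enabled = true;

/* Models spin_lock_irqsave(): remember the current state, then disable. */
static bool model_lock_irqsave(void)
{
	bool flags = irqs_enabled;

	irqs_enabled = false;
	return flags;
}

/* Models spin_unlock_irqrestore(): put back whatever state was saved. */
static void model_unlock_irqrestore(bool flags)
{
	irqs_enabled = flags;
}

/*
 * Models spin_lock_irq(): by convention the caller knows interrupts are
 * enabled. The assert makes that explicit; the real primitive does not
 * check.
 */
static void model_lock_irq(void)
{
	assert(irqs_enabled);
	irqs_enabled = false;
}

/* Models spin_unlock_irq(): unconditionally re-enable interrupts. */
static void model_unlock_irq(void)
{
	irqs_enabled = true;
}

/* Models a sleeping wait: only legal with interrupts enabled. */
static bool model_may_sleep(void)
{
	return irqs_enabled;
}
```

If the xfer path can only ever be entered with interrupts enabled (it
sleeps right after dropping the lock), the _irq pair states that
assumption directly, whereas the _irqsave pair suggests the caller's
state is unknown.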


> +       if (ret < 0)
> +               return ret;
> +
> +       if (ret == 0) {
> +               dev_dbg(priv->dev, "Timeout waiting for a response!\n");
> +               return -ETIMEDOUT;
> +       }
> +
> +       spin_lock_irqsave(&priv->lock, flags);
> +
> +       writel(0, priv->base + ASPEED_PECI_CMD);
> +
> +       if (priv->status != ASPEED_PECI_INT_CMD_DONE) {
> +               spin_unlock_irqrestore(&priv->lock, flags);
> +               dev_dbg(priv->dev, "No valid response!\n");
> +               return -EIO;
> +       }
> +
> +       spin_unlock_irqrestore(&priv->lock, flags);
> +
> +       memcpy_fromio(req->rx.buf, priv->base + ASPEED_PECI_RD_DATA0, min_t(u8, req->rx.len, 16));
> +       if (req->rx.len > 16)
> +               memcpy_fromio(req->rx.buf + 16, priv->base + ASPEED_PECI_RD_DATA4,
> +                             req->rx.len - 16);
> +
> +       print_hex_dump_bytes("RX : ", DUMP_PREFIX_NONE, req->rx.buf, req->rx.len);
> +
> +       return 0;
> +}
> +
> +static irqreturn_t aspeed_peci_irq_handler(int irq, void *arg)
> +{
> +       struct aspeed_peci *priv = arg;
> +       u32 status;
> +
> +       spin_lock(&priv->lock);
> +       status = readl(priv->base + ASPEED_PECI_INT_STS);
> +       writel(status, priv->base + ASPEED_PECI_INT_STS);
> +       priv->status |= (status & ASPEED_PECI_INT_MASK);
> +
> +       /*
> +        * In most cases, interrupt bits will be set one by one but also note
> +        * that multiple interrupt bits could be set at the same time.
> +        */
> +       if (status & ASPEED_PECI_INT_BUS_TIMEOUT)
> +               dev_dbg_ratelimited(priv->dev, "ASPEED_PECI_INT_BUS_TIMEOUT\n");
> +
> +       if (status & ASPEED_PECI_INT_BUS_CONTENTION)
> +               dev_dbg_ratelimited(priv->dev, "ASPEED_PECI_INT_BUS_CONTENTION\n");
> +
> +       if (status & ASPEED_PECI_INT_WR_FCS_BAD)
> +               dev_dbg_ratelimited(priv->dev, "ASPEED_PECI_INT_WR_FCS_BAD\n");
> +
> +       if (status & ASPEED_PECI_INT_WR_FCS_ABORT)
> +               dev_dbg_ratelimited(priv->dev, "ASPEED_PECI_INT_WR_FCS_ABORT\n");

Are you sure these would not be better as tracepoints? If you're
debugging an interrupt related failure, the ratelimiting might get in
your way when you really need to know when one of these error
interrupts fires relative to another event.

> +
> +       /*
> +        * All commands should be ended up with a ASPEED_PECI_INT_CMD_DONE bit
> +        * set even in an error case.
> +        */
> +       if (status & ASPEED_PECI_INT_CMD_DONE)
> +               complete(&priv->xfer_complete);

Hmm, no need to check if there was a sequencing error, like a command
was never submitted?
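
Something along these lines is what I have in mind, shown as a
user-space model rather than a patch. The cmd_in_flight flag and the
model_* names are assumptions of mine and are not in the posted driver;
only the CMD_DONE bit position mirrors ASPEED_PECI_INT_CMD_DONE.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_INT_CMD_DONE 0x1u /* mirrors ASPEED_PECI_INT_CMD_DONE, BIT(0) */

/* Hypothetical driver state; cmd_in_flight is not in the posted patch. */
struct model_peci {
	bool cmd_in_flight;
	uint32_t status;
	unsigned int completions;
	unsigned int spurious;
};

/* Models the xfer path firing a command. */
static void model_fire(struct model_peci *p)
{
	p->status = 0;
	p->cmd_in_flight = true;
}

/* Models the IRQ handler: complete only a command that was submitted. */
static void model_irq(struct model_peci *p, uint32_t hw_status)
{
	if (!p->cmd_in_flight) {
		/* Sequencing error: interrupt with no command outstanding. */
		p->spurious++;
		return;
	}

	p->status |= hw_status;
	if (hw_status & MODEL_INT_CMD_DONE) {
		p->cmd_in_flight = false;
		p->completions++;
	}
}
```

An interrupt that arrives before model_fire() is counted as spurious
and never signals a completion, so a stale CMD_DONE cannot wake a later
transfer.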

> +
> +       spin_unlock(&priv->lock);
> +
> +       return IRQ_HANDLED;
> +}
> +
> +static void aspeed_peci_property_sanitize(struct device *dev, const char *propname,
> +                                         u32 min, u32 max, u32 default_val, u32 *propval)
> +{
> +       u32 val;
> +       int ret;
> +
> +       ret = device_property_read_u32(dev, propname, &val);
> +       if (ret) {
> +               val = default_val;
> +       } else if (val > max || val < min) {
> +               dev_warn(dev, "Invalid %s: %u, falling back to: %u\n",
> +                        propname, val, default_val);
> +
> +               val = default_val;
> +       }
> +
> +       *propval = val;
> +}
> +
> +static void aspeed_peci_property_setup(struct aspeed_peci *priv)
> +{
> +       aspeed_peci_property_sanitize(priv->dev, "aspeed,clock-divider",
> +                                     0, ASPEED_PECI_CLK_DIV_MAX,
> +                                     ASPEED_PECI_CLK_DIV_DEFAULT, &priv->clk_div);
> +       aspeed_peci_property_sanitize(priv->dev, "aspeed,msg-timing",
> +                                     0, ASPEED_PECI_MSG_TIMING_MAX,
> +                                     ASPEED_PECI_MSG_TIMING_DEFAULT, &priv->msg_timing);
> +       aspeed_peci_property_sanitize(priv->dev, "aspeed,addr-timing",
> +                                     0, ASPEED_PECI_ADDR_TIMING_MAX,
> +                                     ASPEED_PECI_ADDR_TIMING_DEFAULT, &priv->addr_timing);
> +       aspeed_peci_property_sanitize(priv->dev, "aspeed,rd-sampling-point",
> +                                     0, ASPEED_PECI_RD_SAMPLING_POINT_MAX,
> +                                     ASPEED_PECI_RD_SAMPLING_POINT_DEFAULT,
> +                                     &priv->rd_sampling_point);
> +       aspeed_peci_property_sanitize(priv->dev, "cmd-timeout-ms",
> +                                     1, ASPEED_PECI_CMD_TIMEOUT_MS_MAX,
> +                                     ASPEED_PECI_CMD_TIMEOUT_MS_DEFAULT, &priv->cmd_timeout_ms);
> +}
> +
> +static struct peci_controller_ops aspeed_ops = {
> +       .xfer = aspeed_peci_xfer,
> +};
> +
> +static void aspeed_peci_reset_control_release(void *data)
> +{
> +       reset_control_assert(data);
> +}
> +
> +int aspeed_peci_reset_control_deassert(struct device *dev, struct reset_control *rst)

I'd recommend naming this devm_aspeed_peci_reset_control_deassert(),
because I came looking here from reading probe for why there was no
reassertion of reset on driver ->remove().
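
For readers less familiar with the pattern, a toy user-space model of
what the devm_ prefix promises. Everything named model_* is invented
for illustration, and the -1 error stands in for the -ENOMEM the real
devm_add_action_or_reset() would return.

```c
#include <stddef.h>

#define MODEL_MAX_ACTIONS 8

typedef void (*model_action_fn)(void *data);

/* Hypothetical stand-in for a device's list of managed actions. */
struct model_dev {
	model_action_fn fn[MODEL_MAX_ACTIONS];
	void *data[MODEL_MAX_ACTIONS];
	size_t count;
};

/*
 * Models devm_add_action_or_reset(): queue the action, or run it
 * immediately and return an error if it cannot be queued.
 */
static int model_add_action_or_reset(struct model_dev *dev,
				     model_action_fn fn, void *data)
{
	if (dev->count == MODEL_MAX_ACTIONS) {
		fn(data);
		return -1;
	}
	dev->fn[dev->count] = fn;
	dev->data[dev->count] = data;
	dev->count++;
	return 0;
}

/* Models driver detach: run queued actions in reverse order. */
static void model_release_all(struct model_dev *dev)
{
	while (dev->count > 0) {
		dev->count--;
		dev->fn[dev->count](dev->data[dev->count]);
	}
}

/* Models reset_control_assert() on a plain flag. */
static void model_reset_assert(void *data)
{
	*(int *)data = 1;
}

/* The devm_ prefix signals that the undo is automatic on detach. */
static int devm_model_reset_deassert(struct model_dev *dev, int *asserted)
{
	*asserted = 0;
	return model_add_action_or_reset(dev, model_reset_assert, asserted);
}
```

A reader who sees devm_model_reset_deassert() in probe knows the reset
is reasserted by model_release_all() at detach, which is why no
->remove() callback is needed.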

> +{
> +       int ret;
> +
> +       ret = reset_control_deassert(rst);
> +       if (ret)
> +               return ret;
> +
> +       return devm_add_action_or_reset(dev, aspeed_peci_reset_control_release, rst);
> +}
> +
> +static void aspeed_peci_clk_release(void *data)
> +{
> +       clk_disable_unprepare(data);
> +}
> +
> +static int aspeed_peci_clk_enable(struct device *dev, struct clk *clk)

...ditto on the devm prefix, just to speed readability.

> +{
> +       int ret;
> +
> +       ret = clk_prepare_enable(clk);
> +       if (ret)
> +               return ret;
> +
> +       return devm_add_action_or_reset(dev, aspeed_peci_clk_release, clk);
> +}
> +
> +static int aspeed_peci_probe(struct platform_device *pdev)
> +{
> +       struct peci_controller *controller;
> +       struct aspeed_peci *priv;
> +       int ret;
> +
> +       priv = devm_kzalloc(&pdev->dev, sizeof(*priv), GFP_KERNEL);
> +       if (!priv)
> +               return -ENOMEM;
> +
> +       priv->dev = &pdev->dev;
> +       dev_set_drvdata(priv->dev, priv);
> +
> +       priv->base = devm_platform_ioremap_resource(pdev, 0);
> +       if (IS_ERR(priv->base))
> +               return PTR_ERR(priv->base);
> +
> +       priv->irq = platform_get_irq(pdev, 0);
> +       if (!priv->irq)
> +               return priv->irq;
> +
> +       ret = devm_request_irq(&pdev->dev, priv->irq, aspeed_peci_irq_handler,
> +                              0, "peci-aspeed", priv);
> +       if (ret)
> +               return ret;
> +
> +       init_completion(&priv->xfer_complete);
> +       spin_lock_init(&priv->lock);
> +
> +       priv->rst = devm_reset_control_get(&pdev->dev, NULL);
> +       if (IS_ERR(priv->rst))
> +               return dev_err_probe(priv->dev, PTR_ERR(priv->rst),
> +                                    "failed to get reset control\n");
> +
> +       ret = aspeed_peci_reset_control_deassert(priv->dev, priv->rst);
> +       if (ret)
> +               return dev_err_probe(priv->dev, ret, "cannot deassert reset control\n");
> +
> +       priv->clk = devm_clk_get(priv->dev, NULL);
> +       if (IS_ERR(priv->clk))
> +               return dev_err_probe(priv->dev, PTR_ERR(priv->clk), "failed to get clk\n");
> +
> +       ret = aspeed_peci_clk_enable(priv->dev, priv->clk);
> +       if (ret)
> +               return dev_err_probe(priv->dev, ret, "failed to enable clock\n");
> +
> +       aspeed_peci_property_setup(priv);
> +
> +       aspeed_peci_init_regs(priv);
> +
> +       controller = devm_peci_controller_add(priv->dev, &aspeed_ops);
> +       if (IS_ERR(controller))
> +               return dev_err_probe(priv->dev, PTR_ERR(controller),
> +                                    "failed to add aspeed peci controller\n");
> +
> +       priv->controller = controller;
> +
> +       return 0;
> +}
> +
> +static const struct of_device_id aspeed_peci_of_table[] = {
> +       { .compatible = "aspeed,ast2400-peci", },
> +       { .compatible = "aspeed,ast2500-peci", },
> +       { .compatible = "aspeed,ast2600-peci", },
> +       { }
> +};
> +MODULE_DEVICE_TABLE(of, aspeed_peci_of_table);
> +
> +static struct platform_driver aspeed_peci_driver = {
> +       .probe  = aspeed_peci_probe,
> +       .driver = {
> +               .name           = "peci-aspeed",
> +               .of_match_table = aspeed_peci_of_table,
> +       },
> +};
> +module_platform_driver(aspeed_peci_driver);
> +
> +MODULE_AUTHOR("Ryan Chen <ryan_chen@aspeedtech.com>");
> +MODULE_AUTHOR("Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>");
> +MODULE_DESCRIPTION("ASPEED PECI driver");
> +MODULE_LICENSE("GPL");
> +MODULE_IMPORT_NS(PECI);
> --
> 2.31.1
>

* Re: [PATCH v2 06/15] peci: Add core infrastructure
  2021-08-25 22:58   ` Dan Williams
@ 2021-08-26 22:40     ` Winiarska, Iwona
  0 siblings, 0 replies; 49+ messages in thread
From: Winiarska, Iwona @ 2021-08-26 22:40 UTC (permalink / raw)
  To: Williams, Dan J
  Cc: jason.m.bills, linux-hwmon, tglx, robh+dt, zweiss, Luck, Tony,
	Lutomirski, Andy, olof, jae.hyun.yoo, devicetree, openbmc,
	corbet, linux-kernel, yazen.ghannam, andrew,
	pierre-louis.bossart, jdelvare, x86, d.mueller, rdunlap, bp,
	arnd, mchehab, joel, andriy.shevchenko, gregkh, mingo,
	linux-aspeed, linux-arm-kernel, linux-doc, linux

On Wed, 2021-08-25 at 15:58 -0700, Dan Williams wrote:
> On Tue, Aug 3, 2021 at 4:35 AM Iwona Winiarska
> <iwona.winiarska@intel.com> wrote:
> > 
> > Intel processors provide access for various services designed to support
> > processor and DRAM thermal management, platform manageability and
> > processor interface tuning and diagnostics.
> > Those services are available via the Platform Environment Control
> > Interface (PECI) that provides a communication channel between the
> > processor and the Baseboard Management Controller (BMC) or other
> > platform management device.
> > 
> > This change introduces PECI subsystem by adding the initial core module
> > and API for controller drivers.
> > 
> > Co-developed-by: Jason M Bills <jason.m.bills@linux.intel.com>
> > Signed-off-by: Jason M Bills <jason.m.bills@linux.intel.com>
> > Co-developed-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
> > Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
> > Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com>
> > Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
> > ---
> >  MAINTAINERS             |   9 +++
> >  drivers/Kconfig         |   3 +
> >  drivers/Makefile        |   1 +
> >  drivers/peci/Kconfig    |  15 ++++
> >  drivers/peci/Makefile   |   5 ++
> >  drivers/peci/core.c     | 155 ++++++++++++++++++++++++++++++++++++++++
> >  drivers/peci/internal.h |  16 +++++
> >  include/linux/peci.h    |  99 +++++++++++++++++++++++++
> >  8 files changed, 303 insertions(+)
> >  create mode 100644 drivers/peci/Kconfig
> >  create mode 100644 drivers/peci/Makefile
> >  create mode 100644 drivers/peci/core.c
> >  create mode 100644 drivers/peci/internal.h
> >  create mode 100644 include/linux/peci.h
> > 
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index 7cdab7229651..d411974aaa5e 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -14503,6 +14503,15 @@ L:     platform-driver-x86@vger.kernel.org
> >  S:     Maintained
> >  F:     drivers/platform/x86/peaq-wmi.c
> > 
> > +PECI SUBSYSTEM
> > +M:     Iwona Winiarska <iwona.winiarska@intel.com>
> > +R:     Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
> > +L:     openbmc@lists.ozlabs.org (moderated for non-subscribers)
> > +S:     Supported
> > +F:     Documentation/devicetree/bindings/peci/
> > +F:     drivers/peci/
> > +F:     include/linux/peci.h
> > +
> >  PENSANDO ETHERNET DRIVERS
> >  M:     Shannon Nelson <snelson@pensando.io>
> >  M:     drivers@pensando.io
> > diff --git a/drivers/Kconfig b/drivers/Kconfig
> > index 8bad63417a50..f472b3d972b3 100644
> > --- a/drivers/Kconfig
> > +++ b/drivers/Kconfig
> > @@ -236,4 +236,7 @@ source "drivers/interconnect/Kconfig"
> >  source "drivers/counter/Kconfig"
> > 
> >  source "drivers/most/Kconfig"
> > +
> > +source "drivers/peci/Kconfig"
> > +
> >  endmenu
> > diff --git a/drivers/Makefile b/drivers/Makefile
> > index 27c018bdf4de..8d96f0c3dde5 100644
> > --- a/drivers/Makefile
> > +++ b/drivers/Makefile
> > @@ -189,3 +189,4 @@ obj-$(CONFIG_GNSS)          += gnss/
> >  obj-$(CONFIG_INTERCONNECT)     += interconnect/
> >  obj-$(CONFIG_COUNTER)          += counter/
> >  obj-$(CONFIG_MOST)             += most/
> > +obj-$(CONFIG_PECI)             += peci/
> > diff --git a/drivers/peci/Kconfig b/drivers/peci/Kconfig
> > new file mode 100644
> > index 000000000000..71a4ad81225a
> > --- /dev/null
> > +++ b/drivers/peci/Kconfig
> > @@ -0,0 +1,15 @@
> > +# SPDX-License-Identifier: GPL-2.0-only
> > +
> > +menuconfig PECI
> > +       tristate "PECI support"
> > +       help
> > +         The Platform Environment Control Interface (PECI) is an interface
> > +         that provides a communication channel to Intel processors and
> > +         chipset components from external monitoring or control devices.
> > +
> > +         If you are building a Baseboard Management Controller (BMC) kernel
> > +         for Intel platform say Y here and also to the specific driver for
> > +         your adapter(s) below. If unsure say N.
> > +
> > +         This support is also available as a module. If so, the module
> > +         will be called peci.
> > diff --git a/drivers/peci/Makefile b/drivers/peci/Makefile
> > new file mode 100644
> > index 000000000000..e789a354e842
> > --- /dev/null
> > +++ b/drivers/peci/Makefile
> > @@ -0,0 +1,5 @@
> > +# SPDX-License-Identifier: GPL-2.0-only
> > +
> > +# Core functionality
> > +peci-y := core.o
> > +obj-$(CONFIG_PECI) += peci.o
> > diff --git a/drivers/peci/core.c b/drivers/peci/core.c
> > new file mode 100644
> > index 000000000000..7b3938af0396
> > --- /dev/null
> > +++ b/drivers/peci/core.c
> > @@ -0,0 +1,155 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +// Copyright (c) 2018-2021 Intel Corporation
> > +
> > +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> 
> This looks like overkill for only one print statement in this module,
> especially when the dev_ print helpers offer more detail.

Ok, I'll remove it.

> 
> > +
> > +#include <linux/bug.h>
> > +#include <linux/device.h>
> > +#include <linux/export.h>
> > +#include <linux/idr.h>
> > +#include <linux/module.h>
> > +#include <linux/of.h>
> > +#include <linux/peci.h>
> > +#include <linux/pm_runtime.h>
> > +#include <linux/property.h>
> > +#include <linux/slab.h>
> > +
> > +#include "internal.h"
> > +
> > +static DEFINE_IDA(peci_controller_ida);
> > +
> > +static void peci_controller_dev_release(struct device *dev)
> > +{
> > +       struct peci_controller *controller = to_peci_controller(dev);
> > +
> > +       pm_runtime_disable(&controller->dev);
> 
> This seems late to be disabling power management, the device is about
> to be freed. Keep in mind the lifetime of this object can be
> artificially prolonged. I expect this to be done when the device is
> unregistered from the bus.

Makes sense.

> 
> > +
> > +       mutex_destroy(&controller->bus_lock);
> > +       ida_free(&peci_controller_ida, controller->id);
> > +       fwnode_handle_put(controller->dev.fwnode);
> 
> Shouldn't the get / put of this handle reference be bound to specific
> accesses not held for the entire lifetime of the object? At a minimum
> it seems to be a reference that can be taken at registration and dropped
> at unregistration.

I'll move it to take ref at registration and to drop it at unregistration.
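
(Illustrative aside, not part of the patch.) The two comments above both
come down to the same ordering: take resources at registration, give them
back at unregistration, and leave only the final free to the release
callback. A minimal userspace sketch of that ordering follows. Every name
in it (fake_fwnode, fake_controller, fake_controller_register() and so on)
is hypothetical and only models the kernel objects with plain counters; it
is not the PECI core API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these names come from the patch. */
struct fake_fwnode {
	int refs;			/* models fwnode_handle_get()/put() */
};

struct fake_controller {
	int refs;			/* models the struct device refcount */
	bool pm_enabled;		/* models pm_runtime_enable()/disable() */
	struct fake_fwnode *node;
	bool *freed;			/* lets a caller observe the final free */
};

/* Models put_device(): the release step runs only when the last ref goes. */
static void fake_controller_put(struct fake_controller *c)
{
	assert(c->refs > 0);
	if (--c->refs == 0) {
		/* release(): nothing left to do here except free memory */
		*c->freed = true;
		free(c);
	}
}

static struct fake_controller *
fake_controller_register(struct fake_fwnode *node, bool *freed)
{
	struct fake_controller *c = calloc(1, sizeof(*c));

	if (!c)
		return NULL;

	c->refs = 1;
	c->node = node;
	node->refs++;			/* reference taken at registration */
	c->pm_enabled = true;		/* PM enabled once the device is live */
	c->freed = freed;
	*freed = false;

	return c;
}

static void fake_controller_unregister(struct fake_controller *c)
{
	c->pm_enabled = false;		/* PM off as soon as it leaves the bus */
	c->node->refs--;		/* reference dropped at unregistration */
	c->node = NULL;
	fake_controller_put(c);		/* drop the registration reference */
}
```

If some other holder still has a reference when unregister runs, the object
stays allocated, but power management is already off and the fwnode
reference is already gone, which is the behaviour asked for in the review.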

> 
> > +       kfree(controller);
> > +}
> > +
> > +struct device_type peci_controller_type = {
> > +       .release        = peci_controller_dev_release,
> > +};
> > +
> > +static struct peci_controller *peci_controller_alloc(struct device *dev,
> > +                                                    struct peci_controller_ops *ops)
> > +{
> > +       struct fwnode_handle *node = fwnode_handle_get(dev_fwnode(dev));
> > +       struct peci_controller *controller;
> > +       int ret;
> > +
> > +       if (!ops->xfer)
> > +               return ERR_PTR(-EINVAL);
> > +
> > +       controller = kzalloc(sizeof(*controller), GFP_KERNEL);
> > +       if (!controller)
> > +               return ERR_PTR(-ENOMEM);
> > +
> > +       ret = ida_alloc_max(&peci_controller_ida, U8_MAX, GFP_KERNEL);
> > +       if (ret < 0)
> > +               goto err;
> > +       controller->id = ret;
> > +
> > +       controller->ops = ops;
> > +
> > +       controller->dev.parent = dev;
> > +       controller->dev.bus = &peci_bus_type;
> > +       controller->dev.type = &peci_controller_type;
> > +       controller->dev.fwnode = node;
> > +       controller->dev.of_node = to_of_node(node);
> > +
> > +       device_initialize(&controller->dev);
> > +
> > +       mutex_init(&controller->bus_lock);
> > +
> > +       pm_runtime_no_callbacks(&controller->dev);
> > +       pm_suspend_ignore_children(&controller->dev, true);
> > +       pm_runtime_enable(&controller->dev);
> 
> Per above, are you sure unregistered devices need pm_runtime enabled?
> 
> Rest looks ok to me.

Thanks
-Iwona



* Re: [PATCH v2 07/15] peci: Add peci-aspeed controller driver
  2021-08-26  1:35   ` Dan Williams
@ 2021-08-26 23:54     ` Winiarska, Iwona
  2021-08-27 16:24       ` Dan Williams
  0 siblings, 1 reply; 49+ messages in thread
From: Winiarska, Iwona @ 2021-08-26 23:54 UTC (permalink / raw)
  To: Williams, Dan J
  Cc: corbet, jae.hyun.yoo, d.mueller, linux-hwmon, andrew, Luck, Tony,
	Lutomirski, Andy, andriy.shevchenko, mchehab, jdelvare,
	linux-kernel, olof, mingo, rdunlap, devicetree, tglx,
	linux-aspeed, arnd, linux-doc, linux, zweiss, robh+dt, openbmc,
	gregkh, joel, yazen.ghannam, linux-arm-kernel,
	pierre-louis.bossart, x86, bp

On Wed, 2021-08-25 at 18:35 -0700, Dan Williams wrote:
> On Tue, Aug 3, 2021 at 4:35 AM Iwona Winiarska
> <iwona.winiarska@intel.com> wrote:
> > 
> > From: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
> > 
> > ASPEED AST24xx/AST25xx/AST26xx SoCs support the PECI electrical
> > interface (a.k.a PECI wire).
> 
> Maybe a one sentence blurb here and in the Kconfig reminding people
> why they should care if they have a PECI driver or not?

Ok, I'll expand it a bit.

> 
> > 
> > Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
> > Co-developed-by: Iwona Winiarska <iwona.winiarska@intel.com>
> > Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com>
> > Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
> > ---
> >  MAINTAINERS                           |   9 +
> >  drivers/peci/Kconfig                  |   6 +
> >  drivers/peci/Makefile                 |   3 +
> >  drivers/peci/controller/Kconfig       |  16 +
> >  drivers/peci/controller/Makefile      |   3 +
> >  drivers/peci/controller/peci-aspeed.c | 445 ++++++++++++++++++++++++++
> >  6 files changed, 482 insertions(+)
> >  create mode 100644 drivers/peci/controller/Kconfig
> >  create mode 100644 drivers/peci/controller/Makefile
> >  create mode 100644 drivers/peci/controller/peci-aspeed.c
> > 
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index d411974aaa5e..6e9d53ff68ab 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -2866,6 +2866,15 @@ S:       Maintained
> >  F:     Documentation/hwmon/asc7621.rst
> >  F:     drivers/hwmon/asc7621.c
> > 
> > +ASPEED PECI CONTROLLER
> > +M:     Iwona Winiarska <iwona.winiarska@intel.com>
> > +M:     Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
> > +L:     linux-aspeed@lists.ozlabs.org (moderated for non-subscribers)
> > +L:     openbmc@lists.ozlabs.org (moderated for non-subscribers)
> > +S:     Supported
> > +F:     Documentation/devicetree/bindings/peci/peci-aspeed.yaml
> > +F:     drivers/peci/controller/peci-aspeed.c
> > +
> >  ASPEED PINCTRL DRIVERS
> >  M:     Andrew Jeffery <andrew@aj.id.au>
> >  L:     linux-aspeed@lists.ozlabs.org (moderated for non-subscribers)
> > diff --git a/drivers/peci/Kconfig b/drivers/peci/Kconfig
> > index 71a4ad81225a..99279df97a78 100644
> > --- a/drivers/peci/Kconfig
> > +++ b/drivers/peci/Kconfig
> > @@ -13,3 +13,9 @@ menuconfig PECI
> > 
> >           This support is also available as a module. If so, the module
> >           will be called peci.
> > +
> > +if PECI
> > +
> > +source "drivers/peci/controller/Kconfig"
> > +
> > +endif # PECI
> > diff --git a/drivers/peci/Makefile b/drivers/peci/Makefile
> > index e789a354e842..926d8df15cbd 100644
> > --- a/drivers/peci/Makefile
> > +++ b/drivers/peci/Makefile
> > @@ -3,3 +3,6 @@
> >  # Core functionality
> >  peci-y := core.o
> >  obj-$(CONFIG_PECI) += peci.o
> > +
> > +# Hardware specific bus drivers
> > +obj-y += controller/
> > diff --git a/drivers/peci/controller/Kconfig b/drivers/peci/controller/Kconfig
> > new file mode 100644
> > index 000000000000..6d48df08db1c
> > --- /dev/null
> > +++ b/drivers/peci/controller/Kconfig
> > @@ -0,0 +1,16 @@
> > +# SPDX-License-Identifier: GPL-2.0-only
> > +
> > +config PECI_ASPEED
> > +       tristate "ASPEED PECI support"
> > +       depends on ARCH_ASPEED || COMPILE_TEST
> > +       depends on OF
> > +       depends on HAS_IOMEM
> > +       help
> > +         This option enables PECI controller driver for ASPEED AST2400,
> > +         AST2500 and AST2600 SoCs.
> > +
> > +         Say Y here if your system runs on ASPEED SoC and you are using it
> > +         as BMC for Intel platform.
> > +
> > +         This driver can also be built as a module. If so, the module will
> > +         be called peci-aspeed.
> > diff --git a/drivers/peci/controller/Makefile b/drivers/peci/controller/Makefile
> > new file mode 100644
> > index 000000000000..022c28ef1bf0
> > --- /dev/null
> > +++ b/drivers/peci/controller/Makefile
> > @@ -0,0 +1,3 @@
> > +# SPDX-License-Identifier: GPL-2.0-only
> > +
> > +obj-$(CONFIG_PECI_ASPEED)      += peci-aspeed.o
> > diff --git a/drivers/peci/controller/peci-aspeed.c b/drivers/peci/controller/peci-aspeed.c
> > new file mode 100644
> > index 000000000000..1d708c983749
> > --- /dev/null
> > +++ b/drivers/peci/controller/peci-aspeed.c
> > @@ -0,0 +1,445 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +// Copyright (C) 2012-2017 ASPEED Technology Inc.
> > +// Copyright (c) 2018-2021 Intel Corporation
> 
> Why different copyright capitalization?

I'll make them consistent.

> 
> > +
> > +#include <linux/bitfield.h>
> > +#include <linux/clk.h>
> > +#include <linux/delay.h>
> > +#include <linux/interrupt.h>
> > +#include <linux/io.h>
> > +#include <linux/iopoll.h>
> > +#include <linux/jiffies.h>
> > +#include <linux/module.h>
> > +#include <linux/of.h>
> > +#include <linux/peci.h>
> > +#include <linux/platform_device.h>
> > +#include <linux/reset.h>
> > +
> > +#include <asm/unaligned.h>
> 
> Why is this included?

Leftover - I'll remove it.

> 
> > +
> > +/* ASPEED PECI Registers */
> > +/* Control Register */
> > +#define ASPEED_PECI_CTRL                       0x00
> > +#define   ASPEED_PECI_CTRL_SAMPLING_MASK       GENMASK(19, 16)
> > +#define   ASPEED_PECI_CTRL_RD_MODE_MASK                GENMASK(13, 12)
> > +#define     ASPEED_PECI_CTRL_RD_MODE_DBG       BIT(13)
> > +#define     ASPEED_PECI_CTRL_RD_MODE_COUNT     BIT(12)
> > +#define   ASPEED_PECI_CTRL_CLK_SOURCE          BIT(11)
> > +#define   ASPEED_PECI_CTRL_CLK_DIV_MASK                GENMASK(10, 8)
> > +#define   ASPEED_PECI_CTRL_INVERT_OUT          BIT(7)
> > +#define   ASPEED_PECI_CTRL_INVERT_IN           BIT(6)
> > +#define   ASPEED_PECI_CTRL_BUS_CONTENTION_EN   BIT(5)
> > +#define   ASPEED_PECI_CTRL_PECI_EN             BIT(4)
> > +#define   ASPEED_PECI_CTRL_PECI_CLK_EN         BIT(0)
> > +
> > +/* Timing Negotiation Register */
> > +#define ASPEED_PECI_TIMING_NEGOTIATION         0x04
> > +#define   ASPEED_PECI_T_NEGO_MSG_MASK          GENMASK(15, 8)
> > +#define   ASPEED_PECI_T_NEGO_ADDR_MASK         GENMASK(7, 0)
> > +
> > +/* Command Register */
> > +#define ASPEED_PECI_CMD                                0x08
> > +#define   ASPEED_PECI_CMD_PIN_MONITORING       BIT(31)
> > +#define   ASPEED_PECI_CMD_STS_MASK             GENMASK(27, 24)
> > +#define     ASPEED_PECI_CMD_STS_ADDR_T_NEGO    0x3
> > +#define   ASPEED_PECI_CMD_IDLE_MASK            \
> > +         (ASPEED_PECI_CMD_STS_MASK | ASPEED_PECI_CMD_PIN_MONITORING)
> > +#define   ASPEED_PECI_CMD_FIRE                 BIT(0)
> > +
> > +/* Read/Write Length Register */
> > +#define ASPEED_PECI_RW_LENGTH                  0x0c
> > +#define   ASPEED_PECI_AW_FCS_EN                        BIT(31)
> > +#define   ASPEED_PECI_RD_LEN_MASK              GENMASK(23, 16)
> > +#define   ASPEED_PECI_WR_LEN_MASK              GENMASK(15, 8)
> > +#define   ASPEED_PECI_TARGET_ADDR_MASK         GENMASK(7, 0)
> > +
> > +/* Expected FCS Data Register */
> > +#define ASPEED_PECI_EXPECTED_FCS               0x10
> > +#define   ASPEED_PECI_EXPECTED_RD_FCS_MASK     GENMASK(23, 16)
> > +#define   ASPEED_PECI_EXPECTED_AW_FCS_AUTO_MASK        GENMASK(15, 8)
> > +#define   ASPEED_PECI_EXPECTED_WR_FCS_MASK     GENMASK(7, 0)
> > +
> > +/* Captured FCS Data Register */
> > +#define ASPEED_PECI_CAPTURED_FCS               0x14
> > +#define   ASPEED_PECI_CAPTURED_RD_FCS_MASK     GENMASK(23, 16)
> > +#define   ASPEED_PECI_CAPTURED_WR_FCS_MASK     GENMASK(7, 0)
> > +
> > +/* Interrupt Register */
> > +#define ASPEED_PECI_INT_CTRL                   0x18
> > +#define   ASPEED_PECI_TIMING_NEGO_SEL_MASK     GENMASK(31, 30)
> > +#define     ASPEED_PECI_1ST_BIT_OF_ADDR_NEGO   0
> > +#define     ASPEED_PECI_2ND_BIT_OF_ADDR_NEGO   1
> > +#define     ASPEED_PECI_MESSAGE_NEGO           2
> > +#define   ASPEED_PECI_INT_MASK                 GENMASK(4, 0)
> > +#define     ASPEED_PECI_INT_BUS_TIMEOUT                BIT(4)
> > +#define     ASPEED_PECI_INT_BUS_CONTENTION     BIT(3)
> > +#define     ASPEED_PECI_INT_WR_FCS_BAD         BIT(2)
> > +#define     ASPEED_PECI_INT_WR_FCS_ABORT       BIT(1)
> > +#define     ASPEED_PECI_INT_CMD_DONE           BIT(0)
> > +
> > +/* Interrupt Status Register */
> > +#define ASPEED_PECI_INT_STS                    0x1c
> > +#define   ASPEED_PECI_INT_TIMING_RESULT_MASK   GENMASK(29, 16)
> > +         /* bits[4..0]: Same bit fields in the 'Interrupt Register' */
> > +
> > +/* Rx/Tx Data Buffer Registers */
> > +#define ASPEED_PECI_WR_DATA0                   0x20
> > +#define ASPEED_PECI_WR_DATA1                   0x24
> > +#define ASPEED_PECI_WR_DATA2                   0x28
> > +#define ASPEED_PECI_WR_DATA3                   0x2c
> > +#define ASPEED_PECI_RD_DATA0                   0x30
> > +#define ASPEED_PECI_RD_DATA1                   0x34
> > +#define ASPEED_PECI_RD_DATA2                   0x38
> > +#define ASPEED_PECI_RD_DATA3                   0x3c
> > +#define ASPEED_PECI_WR_DATA4                   0x40
> > +#define ASPEED_PECI_WR_DATA5                   0x44
> > +#define ASPEED_PECI_WR_DATA6                   0x48
> > +#define ASPEED_PECI_WR_DATA7                   0x4c
> > +#define ASPEED_PECI_RD_DATA4                   0x50
> > +#define ASPEED_PECI_RD_DATA5                   0x54
> > +#define ASPEED_PECI_RD_DATA6                   0x58
> > +#define ASPEED_PECI_RD_DATA7                   0x5c
> > +#define   ASPEED_PECI_DATA_BUF_SIZE_MAX                32
> > +
> > +/* Timing Negotiation */
> > +#define ASPEED_PECI_RD_SAMPLING_POINT_DEFAULT  8
> > +#define ASPEED_PECI_RD_SAMPLING_POINT_MAX      (BIT(4) - 1)
> > +#define ASPEED_PECI_CLK_DIV_DEFAULT            0
> > +#define ASPEED_PECI_CLK_DIV_MAX                        (BIT(3) - 1)
> > +#define ASPEED_PECI_MSG_TIMING_DEFAULT         1
> > +#define ASPEED_PECI_MSG_TIMING_MAX             (BIT(8) - 1)
> > +#define ASPEED_PECI_ADDR_TIMING_DEFAULT                1
> > +#define ASPEED_PECI_ADDR_TIMING_MAX            (BIT(8) - 1)
> > +
> > +/* Timeout */
> > +#define ASPEED_PECI_IDLE_CHECK_TIMEOUT_US      (50 * USEC_PER_MSEC)
> > +#define ASPEED_PECI_IDLE_CHECK_INTERVAL_US     (10 * USEC_PER_MSEC)
> > +#define ASPEED_PECI_CMD_TIMEOUT_MS_DEFAULT     (1000)
> > +#define ASPEED_PECI_CMD_TIMEOUT_MS_MAX         (1000)
> > +
> > +struct aspeed_peci {
> > +       struct peci_controller *controller;
> > +       struct device *dev;
> > +       void __iomem *base;
> > +       struct clk *clk;
> > +       struct reset_control *rst;
> > +       int irq;
> > +       spinlock_t lock; /* to sync completion status handling */
> > +       struct completion xfer_complete;
> > +       u32 status;
> > +       u32 cmd_timeout_ms;
> > +       u32 msg_timing;
> > +       u32 addr_timing;
> > +       u32 rd_sampling_point;
> > +       u32 clk_div;
> > +};
> > +
> > +static void aspeed_peci_init_regs(struct aspeed_peci *priv)
> > +{
> > +       u32 val;
> > +
> > +       val = FIELD_PREP(ASPEED_PECI_CTRL_CLK_DIV_MASK, ASPEED_PECI_CLK_DIV_DEFAULT);
> > +       val |= ASPEED_PECI_CTRL_PECI_CLK_EN;
> > +       writel(val, priv->base + ASPEED_PECI_CTRL);
> > +       /*
> > +        * Timing negotiation period setting.
> > +        * The unit of the programmed value is 4 times of PECI clock period.
> > +        */
> > +       val = FIELD_PREP(ASPEED_PECI_T_NEGO_MSG_MASK, priv->msg_timing);
> > +       val |= FIELD_PREP(ASPEED_PECI_T_NEGO_ADDR_MASK, priv->addr_timing);
> > +       writel(val, priv->base + ASPEED_PECI_TIMING_NEGOTIATION);
> > +
> > +       /* Clear interrupts */
> > +       val = readl(priv->base + ASPEED_PECI_INT_STS) | ASPEED_PECI_INT_MASK;
> > +       writel(val, priv->base + ASPEED_PECI_INT_STS);
> > +
> > +       /* Set timing negotiation mode and enable interrupts */
> > +       val = FIELD_PREP(ASPEED_PECI_TIMING_NEGO_SEL_MASK, ASPEED_PECI_1ST_BIT_OF_ADDR_NEGO);
> > +       val |= ASPEED_PECI_INT_MASK;
> > +       writel(val, priv->base + ASPEED_PECI_INT_CTRL);
> > +
> > +       val = FIELD_PREP(ASPEED_PECI_CTRL_SAMPLING_MASK, priv->rd_sampling_point);
> > +       val |= FIELD_PREP(ASPEED_PECI_CTRL_CLK_DIV_MASK, priv->clk_div);
> > +       val |= ASPEED_PECI_CTRL_PECI_EN;
> > +       val |= ASPEED_PECI_CTRL_PECI_CLK_EN;
> > +       writel(val, priv->base + ASPEED_PECI_CTRL);
> > +}
> > +
> > +static inline int aspeed_peci_check_idle(struct aspeed_peci *priv)
> > +{
> > +       u32 cmd_sts = readl(priv->base + ASPEED_PECI_CMD);
> > +
> > +       if (FIELD_GET(ASPEED_PECI_CMD_STS_MASK, cmd_sts) == ASPEED_PECI_CMD_STS_ADDR_T_NEGO)
> > +               aspeed_peci_init_regs(priv);
> > +
> > +       return readl_poll_timeout(priv->base + ASPEED_PECI_CMD,
> > +                                 cmd_sts,
> > +                                 !(cmd_sts & ASPEED_PECI_CMD_IDLE_MASK),
> > +                                 ASPEED_PECI_IDLE_CHECK_INTERVAL_US,
> > +                                 ASPEED_PECI_IDLE_CHECK_TIMEOUT_US);
> > +}
> > +
> > +static int aspeed_peci_xfer(struct peci_controller *controller,
> > +                           u8 addr, struct peci_request *req)
> > +{
> > +       struct aspeed_peci *priv = dev_get_drvdata(controller->dev.parent);
> > +       unsigned long flags, timeout = msecs_to_jiffies(priv->cmd_timeout_ms);
> > +       u32 peci_head;
> > +       int ret;
> > +
> > +       if (req->tx.len > ASPEED_PECI_DATA_BUF_SIZE_MAX ||
> > +           req->rx.len > ASPEED_PECI_DATA_BUF_SIZE_MAX)
> > +               return -EINVAL;
> > +
> > +       /* Check command sts and bus idle state */
> > +       ret = aspeed_peci_check_idle(priv);
> > +       if (ret)
> > +               return ret; /* -ETIMEDOUT */
> > +
> > +       spin_lock_irqsave(&priv->lock, flags);
> > +       reinit_completion(&priv->xfer_complete);
> > +
> > +       peci_head = FIELD_PREP(ASPEED_PECI_TARGET_ADDR_MASK, addr) |
> > +                   FIELD_PREP(ASPEED_PECI_WR_LEN_MASK, req->tx.len) |
> > +                   FIELD_PREP(ASPEED_PECI_RD_LEN_MASK, req->rx.len);
> > +
> > +       writel(peci_head, priv->base + ASPEED_PECI_RW_LENGTH);
> > +
> > +       memcpy_toio(priv->base + ASPEED_PECI_WR_DATA0, req->tx.buf, min_t(u8, req->tx.len, 16));
> > +       if (req->tx.len > 16)
> > +               memcpy_toio(priv->base + ASPEED_PECI_WR_DATA4, req->tx.buf + 16,
> > +                           req->tx.len - 16);
> > +
> > +       dev_dbg(priv->dev, "HEAD : 0x%08x\n", peci_head);
> > +       print_hex_dump_bytes("TX : ", DUMP_PREFIX_NONE, req->tx.buf, req->tx.len);
> 
> On CONFIG_DYNAMIC_DEBUG=n builds the kernel will do all the work of
> reading through this buffer, but skip emitting it. Are you sure you
> want to pay that overhead for every transaction?

I can remove it or I can add something like:

#if IS_ENABLED(CONFIG_PECI_DEBUG)
#define peci_debug(fmt, ...) pr_debug(fmt, ##__VA_ARGS__)
#else
#define peci_debug(...) do { } while (0)
#endif

(and similar peci_trace with trace_printk for usage in IRQ handlers and such).

What do you think?
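
(Illustrative aside, not part of the patch.) The point of such a macro is
that, when the option is off, the arguments are never evaluated, so any
formatting work vanishes along with the print. A minimal userspace sketch
of that effect follows. PECI_DEBUG is a hypothetical stand-in for
CONFIG_PECI_DEBUG, fprintf() stands in for pr_debug(), and toy_hex_dump()
and toy_xfer() are made-up helpers that only exist to make the cost
visible.

```c
#include <stddef.h>
#include <stdio.h>

/* PECI_DEBUG is a hypothetical userspace stand-in for CONFIG_PECI_DEBUG. */
#ifdef PECI_DEBUG
#define peci_debug(fmt, ...) fprintf(stderr, fmt, ##__VA_ARGS__)
#else
#define peci_debug(...) do { } while (0)
#endif

/* Counts how often the (deliberately costly) formatter actually runs. */
int toy_dump_calls;

/* Formats up to 32 bytes as hex; stands in for the work of a hex dump. */
const char *toy_hex_dump(const unsigned char *buf, size_t len)
{
	static char out[3 * 32 + 1];
	size_t i, off = 0;

	toy_dump_calls++;
	for (i = 0; i < len && off + 3 < sizeof(out); i++)
		off += (size_t)snprintf(out + off, sizeof(out) - off,
					"%02x ", buf[i]);
	out[off] = '\0';

	return out;
}

/* Pretend transfer: returns a byte sum so the buffer is always used. */
unsigned int toy_xfer(const unsigned char *tx, size_t len)
{
	unsigned int sum = 0;
	size_t i;

	/* With PECI_DEBUG undefined, toy_hex_dump() is never called. */
	peci_debug("TX : %s\n", toy_hex_dump(tx, len));

	for (i = 0; i < len; i++)
		sum += tx[i];

	return sum;
}
```

Built without -DPECI_DEBUG, toy_dump_calls stays at zero after any number
of transfers. The trade-off of the empty-macro form is that the format
string and arguments are not type-checked in that configuration.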

> 
> > +
> > +       priv->status = 0;
> > +       writel(ASPEED_PECI_CMD_FIRE, priv->base + ASPEED_PECI_CMD);
> > +       spin_unlock_irqrestore(&priv->lock, flags);
> > +
> > +       ret = wait_for_completion_interruptible_timeout(&priv->xfer_complete, timeout);
> 
> spin_lock_irqsave() says "I don't know if interrupts are disabled
> already, so I'll save the state, whatever it is, and restore later"
> 
> wait_for_completion_interruptible_timeout() says "I know I am in a
> sleepable context where interrupts are enabled"
> 
> So, one of those is wrong, i.e. should it be spin_{lock,unlock}_irq()?

You're right - I'll fix it.
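
(Illustrative aside, not part of the patch.) The difference between the two
locking flavours can be shown with a toy model in which a single global
flag stands in for the CPU's interrupt-enable state. All toy_* names are
hypothetical; the real primitives also take the spinlock, which this sketch
leaves out.

```c
#include <stdbool.h>

/* Toy model: one flag stands in for the CPU's interrupt-enable state. */
static bool toy_irqs_enabled = true;

/* Models spin_lock_irqsave(): remember the previous state, then disable. */
static bool toy_lock_irqsave(void)
{
	bool flags = toy_irqs_enabled;

	toy_irqs_enabled = false;
	return flags;
}

/* Models spin_unlock_irqrestore(): put back whatever was saved. */
static void toy_unlock_irqrestore(bool flags)
{
	toy_irqs_enabled = flags;
}

/* Models spin_lock_irq(): the caller promises interrupts are enabled now. */
static void toy_lock_irq(void)
{
	toy_irqs_enabled = false;
}

/* Models spin_unlock_irq(): unconditionally re-enable interrupts. */
static void toy_unlock_irq(void)
{
	toy_irqs_enabled = true;
}
```

In a sleepable context, where interrupts are known to be enabled, both
pairs leave the state as they found it, so the plain _irq pair is enough
and needs no flags variable. Only the _irqsave pair is safe when the caller
might already have interrupts disabled, because the _irq pair would turn
them back on behind the caller's back.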

> 
> 
> > +       if (ret < 0)
> > +               return ret;
> > +
> > +       if (ret == 0) {
> > +               dev_dbg(priv->dev, "Timeout waiting for a response!\n");
> > +               return -ETIMEDOUT;
> > +       }
> > +
> > +       spin_lock_irqsave(&priv->lock, flags);
> > +
> > +       writel(0, priv->base + ASPEED_PECI_CMD);
> > +
> > +       if (priv->status != ASPEED_PECI_INT_CMD_DONE) {
> > +               spin_unlock_irqrestore(&priv->lock, flags);
> > +               dev_dbg(priv->dev, "No valid response!\n");
> > +               return -EIO;
> > +       }
> > +
> > +       spin_unlock_irqrestore(&priv->lock, flags);
> > +
> > +       memcpy_fromio(req->rx.buf, priv->base + ASPEED_PECI_RD_DATA0, min_t(u8, req->rx.len, 16));
> > +       if (req->rx.len > 16)
> > +               memcpy_fromio(req->rx.buf + 16, priv->base + ASPEED_PECI_RD_DATA4,
> > +                             req->rx.len - 16);
> > +
> > +       print_hex_dump_bytes("RX : ", DUMP_PREFIX_NONE, req->rx.buf, req->rx.len);
> > +
> > +       return 0;
> > +}
> > +
> > +static irqreturn_t aspeed_peci_irq_handler(int irq, void *arg)
> > +{
> > +       struct aspeed_peci *priv = arg;
> > +       u32 status;
> > +
> > +       spin_lock(&priv->lock);
> > +       status = readl(priv->base + ASPEED_PECI_INT_STS);
> > +       writel(status, priv->base + ASPEED_PECI_INT_STS);
> > +       priv->status |= (status & ASPEED_PECI_INT_MASK);
> > +
> > +       /*
> > +        * In most cases, interrupt bits will be set one by one but also note
> > +        * that multiple interrupt bits could be set at the same time.
> > +        */
> > +       if (status & ASPEED_PECI_INT_BUS_TIMEOUT)
> > +               dev_dbg_ratelimited(priv->dev, "ASPEED_PECI_INT_BUS_TIMEOUT\n");
> > +
> > +       if (status & ASPEED_PECI_INT_BUS_CONTENTION)
> > +               dev_dbg_ratelimited(priv->dev, "ASPEED_PECI_INT_BUS_CONTENTION\n");
> > +
> > +       if (status & ASPEED_PECI_INT_WR_FCS_BAD)
> > +               dev_dbg_ratelimited(priv->dev, "ASPEED_PECI_INT_WR_FCS_BAD\n");
> > +
> > +       if (status & ASPEED_PECI_INT_WR_FCS_ABORT)
> > +               dev_dbg_ratelimited(priv->dev, "ASPEED_PECI_INT_WR_FCS_ABORT\n");
> 
> Are you sure these would not be better as tracepoints? If you're
> debugging an interrupt related failure, the ratelimiting might get in
> your way when you really need to know when one of these error
> interrupts fire relative to another event.

Tracepoints are ABI(ish), and using a full-blown tracepoint just for IRQ status
would probably be too much.
I was thinking about something like trace_printk hidden under a
"CONFIG_PECI_DEBUG" (see above), but perhaps that's something for a future
improvement?

> 
> > +
> > +       /*
> > +        * All commands should be ended up with a ASPEED_PECI_INT_CMD_DONE bit
> > +        * set even in an error case.
> > +        */
> > +       if (status & ASPEED_PECI_INT_CMD_DONE)
> > +               complete(&priv->xfer_complete);
> 
> Hmm, no need to check if there was a sequencing error, like a command
> was never submitted?

It's handled by checking if HW is idle in xfer before a command is sent, where
we just expect a single interrupt per command.

> 
> > +
> > +       spin_unlock(&priv->lock);
> > +
> > +       return IRQ_HANDLED;
> > +}
> > +
> > +static void aspeed_peci_property_sanitize(struct device *dev, const char *propname,
> > +                                         u32 min, u32 max, u32 default_val, u32 *propval)
> > +{
> > +       u32 val;
> > +       int ret;
> > +
> > +       ret = device_property_read_u32(dev, propname, &val);
> > +       if (ret) {
> > +               val = default_val;
> > +       } else if (val > max || val < min) {
> > +               dev_warn(dev, "Invalid %s: %u, falling back to: %u\n",
> > +                        propname, val, default_val);
> > +
> > +               val = default_val;
> > +       }
> > +
> > +       *propval = val;
> > +}
> > +
> > +static void aspeed_peci_property_setup(struct aspeed_peci *priv)
> > +{
> > +       aspeed_peci_property_sanitize(priv->dev, "aspeed,clock-divider",
> > +                                     0, ASPEED_PECI_CLK_DIV_MAX,
> > +                                     ASPEED_PECI_CLK_DIV_DEFAULT, &priv->clk_div);
> > +       aspeed_peci_property_sanitize(priv->dev, "aspeed,msg-timing",
> > +                                     0, ASPEED_PECI_MSG_TIMING_MAX,
> > +                                     ASPEED_PECI_MSG_TIMING_DEFAULT, &priv->msg_timing);
> > +       aspeed_peci_property_sanitize(priv->dev, "aspeed,addr-timing",
> > +                                     0, ASPEED_PECI_ADDR_TIMING_MAX,
> > +                                     ASPEED_PECI_ADDR_TIMING_DEFAULT, &priv->addr_timing);
> > +       aspeed_peci_property_sanitize(priv->dev, "aspeed,rd-sampling-point",
> > +                                     0, ASPEED_PECI_RD_SAMPLING_POINT_MAX,
> > +                                     ASPEED_PECI_RD_SAMPLING_POINT_DEFAULT,
> > +                                     &priv->rd_sampling_point);
> > +       aspeed_peci_property_sanitize(priv->dev, "cmd-timeout-ms",
> > +                                     1, ASPEED_PECI_CMD_TIMEOUT_MS_MAX,
> > +                                     ASPEED_PECI_CMD_TIMEOUT_MS_DEFAULT, &priv->cmd_timeout_ms);
> > +}
> > +
> > +static struct peci_controller_ops aspeed_ops = {
> > +       .xfer = aspeed_peci_xfer,
> > +};
> > +
> > +static void aspeed_peci_reset_control_release(void *data)
> > +{
> > +       reset_control_assert(data);
> > +}
> > +
> > +int aspeed_peci_reset_control_deassert(struct device *dev, struct reset_control *rst)
> 
> I'd recommend naming this devm_aspeed_peci_reset_control_deassert(),
> because I came looking here from reading probe for why there was no
> reassertion of reset on driver ->remove().

Ok.

> 
> > +{
> > +       int ret;
> > +
> > +       ret = reset_control_deassert(rst);
> > +       if (ret)
> > +               return ret;
> > +
> > +       return devm_add_action_or_reset(dev, aspeed_peci_reset_control_release, rst);
> > +}
> > +
> > +static void aspeed_peci_clk_release(void *data)
> > +{
> > +       clk_disable_unprepare(data);
> > +}
> > +
> > +static int aspeed_peci_clk_enable(struct device *dev, struct clk *clk)
> 
> ...ditto on the devm prefix, just to speed readability.

Ok.
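
(Illustrative aside, not part of the patch.) The reason a devm_ prefix helps
the reader is that these helpers queue their own undo step, so probe needs
no matching teardown in ->remove(). A minimal userspace sketch of that
mechanism follows. The toy_* names are hypothetical, and the fixed-size
array only models the kernel's devres list, which likewise runs its actions
in reverse order of registration when the driver detaches.

```c
#include <stddef.h>

#define TOY_MAX_ACTIONS 8

struct toy_action {
	void (*fn)(void *);
	void *data;
};

/* Hypothetical device with a small list of queued cleanup actions. */
struct toy_device {
	struct toy_action actions[TOY_MAX_ACTIONS];
	size_t count;
};

/* Models devm_add_action_or_reset(): on failure, run the action at once. */
static int toy_devm_add_action_or_reset(struct toy_device *dev,
					void (*fn)(void *), void *data)
{
	if (dev->count == TOY_MAX_ACTIONS) {
		fn(data);
		return -1;
	}

	dev->actions[dev->count].fn = fn;
	dev->actions[dev->count].data = data;
	dev->count++;

	return 0;
}

/* Models driver detach: queued actions run in reverse order. */
static void toy_devres_release_all(struct toy_device *dev)
{
	while (dev->count > 0) {
		dev->count--;
		dev->actions[dev->count].fn(dev->actions[dev->count].data);
	}
}

/* Records the teardown order as one character per action. */
static char toy_log[TOY_MAX_ACTIONS + 1];
static size_t toy_log_len;

static void toy_record(void *data)
{
	if (toy_log_len < TOY_MAX_ACTIONS)
		toy_log[toy_log_len++] = *(const char *)data;
}
```

Queuing a reset action first and a clock action second, as probe does in
the patch, means the clock action runs before the reset action on detach,
without any code in a remove callback.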

Thanks
-Iwona

> 
> > +{
> > +       int ret;
> > +
> > +       ret = clk_prepare_enable(clk);
> > +       if (ret)
> > +               return ret;
> > +
> > +       return devm_add_action_or_reset(dev, aspeed_peci_clk_release, clk);
> > +}
> > +
> > +static int aspeed_peci_probe(struct platform_device *pdev)
> > +{
> > +       struct peci_controller *controller;
> > +       struct aspeed_peci *priv;
> > +       int ret;
> > +
> > +       priv = devm_kzalloc(&pdev->dev, sizeof(*priv), GFP_KERNEL);
> > +       if (!priv)
> > +               return -ENOMEM;
> > +
> > +       priv->dev = &pdev->dev;
> > +       dev_set_drvdata(priv->dev, priv);
> > +
> > +       priv->base = devm_platform_ioremap_resource(pdev, 0);
> > +       if (IS_ERR(priv->base))
> > +               return PTR_ERR(priv->base);
> > +
> > +       priv->irq = platform_get_irq(pdev, 0);
> > +       if (!priv->irq)
> > +               return priv->irq;
> > +
> > +       ret = devm_request_irq(&pdev->dev, priv->irq, aspeed_peci_irq_handler,
> > +                              0, "peci-aspeed", priv);
> > +       if (ret)
> > +               return ret;
> > +
> > +       init_completion(&priv->xfer_complete);
> > +       spin_lock_init(&priv->lock);
> > +
> > +       priv->rst = devm_reset_control_get(&pdev->dev, NULL);
> > +       if (IS_ERR(priv->rst))
> > +               return dev_err_probe(priv->dev, PTR_ERR(priv->rst),
> > +                                    "failed to get reset control\n");
> > +
> > +       ret = aspeed_peci_reset_control_deassert(priv->dev, priv->rst);
> > +       if (ret)
> > +               return dev_err_probe(priv->dev, ret, "cannot deassert reset control\n");
> > +
> > +       priv->clk = devm_clk_get(priv->dev, NULL);
> > +       if (IS_ERR(priv->clk))
> > +               return dev_err_probe(priv->dev, PTR_ERR(priv->clk), "failed to get clk\n");
> > +
> > +       ret = aspeed_peci_clk_enable(priv->dev, priv->clk);
> > +       if (ret)
> > +               return dev_err_probe(priv->dev, ret, "failed to enable clock\n");
> > +
> > +       aspeed_peci_property_setup(priv);
> > +
> > +       aspeed_peci_init_regs(priv);
> > +
> > +       controller = devm_peci_controller_add(priv->dev, &aspeed_ops);
> > +       if (IS_ERR(controller))
> > +               return dev_err_probe(priv->dev, PTR_ERR(controller),
> > +                                    "failed to add aspeed peci controller\n");
> > +
> > +       priv->controller = controller;
> > +
> > +       return 0;
> > +}
> > +
> > +static const struct of_device_id aspeed_peci_of_table[] = {
> > +       { .compatible = "aspeed,ast2400-peci", },
> > +       { .compatible = "aspeed,ast2500-peci", },
> > +       { .compatible = "aspeed,ast2600-peci", },
> > +       { }
> > +};
> > +MODULE_DEVICE_TABLE(of, aspeed_peci_of_table);
> > +
> > +static struct platform_driver aspeed_peci_driver = {
> > +       .probe  = aspeed_peci_probe,
> > +       .driver = {
> > +               .name           = "peci-aspeed",
> > +               .of_match_table = aspeed_peci_of_table,
> > +       },
> > +};
> > +module_platform_driver(aspeed_peci_driver);
> > +
> > +MODULE_AUTHOR("Ryan Chen <ryan_chen@aspeedtech.com>");
> > +MODULE_AUTHOR("Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>");
> > +MODULE_DESCRIPTION("ASPEED PECI driver");
> > +MODULE_LICENSE("GPL");
> > +MODULE_IMPORT_NS(PECI);
> > --
> > 2.31.1
> > 



* Re: [PATCH v2 07/15] peci: Add peci-aspeed controller driver
  2021-08-26 23:54     ` Winiarska, Iwona
@ 2021-08-27 16:24       ` Dan Williams
  2021-08-29 19:42         ` Winiarska, Iwona
  0 siblings, 1 reply; 49+ messages in thread
From: Dan Williams @ 2021-08-27 16:24 UTC (permalink / raw)
  To: Winiarska, Iwona
  Cc: corbet, jae.hyun.yoo, d.mueller, linux-hwmon, andrew, Luck, Tony,
	Lutomirski, Andy, andriy.shevchenko, mchehab, jdelvare,
	linux-kernel, olof, mingo, rdunlap, devicetree, tglx,
	linux-aspeed, arnd, linux-doc, linux, zweiss, robh+dt, openbmc,
	gregkh, joel, yazen.ghannam, linux-arm-kernel,
	pierre-louis.bossart, x86, bp

On Thu, Aug 26, 2021 at 4:55 PM Winiarska, Iwona
<iwona.winiarska@intel.com> wrote:
>
> On Wed, 2021-08-25 at 18:35 -0700, Dan Williams wrote:
> > On Tue, Aug 3, 2021 at 4:35 AM Iwona Winiarska
> > <iwona.winiarska@intel.com> wrote:
> > >
> > > From: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
> > >
> > > ASPEED AST24xx/AST25xx/AST26xx SoCs supports the PECI electrical
> > > interface (a.k.a PECI wire).
> >
> > Maybe a one sentence blurb here and in the Kconfig reminding people
> > why they should care if they have a PECI driver or not?
>
> Ok, I'll expand it a bit.
[..]
> > > +static int aspeed_peci_xfer(struct peci_controller *controller,
> > > +                           u8 addr, struct peci_request *req)
> > > +{
> > > +       struct aspeed_peci *priv = dev_get_drvdata(controller->dev.parent);
> > > +       unsigned long flags, timeout = msecs_to_jiffies(priv->cmd_timeout_ms);
> > > +       u32 peci_head;
> > > +       int ret;
> > > +
> > > +       if (req->tx.len > ASPEED_PECI_DATA_BUF_SIZE_MAX ||
> > > +           req->rx.len > ASPEED_PECI_DATA_BUF_SIZE_MAX)
> > > +               return -EINVAL;
> > > +
> > > +       /* Check command sts and bus idle state */
> > > +       ret = aspeed_peci_check_idle(priv);
> > > +       if (ret)
> > > +               return ret; /* -ETIMEDOUT */
> > > +
> > > +       spin_lock_irqsave(&priv->lock, flags);
> > > +       reinit_completion(&priv->xfer_complete);
> > > +
> > > +       peci_head = FIELD_PREP(ASPEED_PECI_TARGET_ADDR_MASK, addr) |
> > > +                   FIELD_PREP(ASPEED_PECI_WR_LEN_MASK, req->tx.len) |
> > > +                   FIELD_PREP(ASPEED_PECI_RD_LEN_MASK, req->rx.len);
> > > +
> > > +       writel(peci_head, priv->base + ASPEED_PECI_RW_LENGTH);
> > > +
> > > +       memcpy_toio(priv->base + ASPEED_PECI_WR_DATA0, req->tx.buf, min_t(u8, req->tx.len, 16));
> > > +       if (req->tx.len > 16)
> > > +               memcpy_toio(priv->base + ASPEED_PECI_WR_DATA4, req->tx.buf + 16,
> > > +                           req->tx.len - 16);
> > > +
> > > +       dev_dbg(priv->dev, "HEAD : 0x%08x\n", peci_head);
> > > +       print_hex_dump_bytes("TX : ", DUMP_PREFIX_NONE, req->tx.buf, req->tx.len);
> >
> > On CONFIG_DYNAMIC_DEBUG=n builds the kernel will do all the work of
> > reading through this buffer, but skip emitting it. Are you sure you
> > want to pay that overhead for every transaction?
>
> I can remove it or I can add something like:
>
> #if IS_ENABLED(CONFIG_PECI_DEBUG)
> #define peci_debug(fmt, ...) pr_debug(fmt, ##__VA_ARGS__)
> #else
> #define peci_debug(...) do { } while (0)
> #endif

It's the hex dump I'm worried about, not the debug statements as much.

I think the choices are: remove the print_hex_dump_bytes(), put it
behind an IS_ENABLED(CONFIG_DYNAMIC_DEBUG) check to ensure the overhead
is skipped in the CONFIG_DYNAMIC_DEBUG=n case, or live with the overhead
if this is not a fast path or is infrequently used.
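
To make the second option concrete, here is a minimal userspace sketch of
guarding an expensive dump behind a compile-time constant. Everything named
demo_* and DEMO_DYNAMIC_DEBUG is hypothetical and only stands in for
IS_ENABLED(CONFIG_DYNAMIC_DEBUG) and print_hex_dump_bytes(); it is not the
driver's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for IS_ENABLED(CONFIG_DYNAMIC_DEBUG); 0 models a =n build. */
#ifndef DEMO_DYNAMIC_DEBUG
#define DEMO_DYNAMIC_DEBUG 0
#endif

/* Counts how often the dump helper actually walked a buffer. */
static unsigned int demo_dump_calls;

/* Hypothetical stand-in for print_hex_dump_bytes(): walks the whole buffer. */
static void demo_hex_dump(const char *prefix, const unsigned char *buf, size_t len)
{
	size_t i;

	assert(buf || !len);
	demo_dump_calls++;
	printf("%s", prefix);
	for (i = 0; i < len; i++)
		printf("%02x ", buf[i]);
	printf("\n");
}

/* The condition is a constant, so a 0 value lets the compiler drop the call. */
static void demo_xfer_debug(const unsigned char *tx, size_t len)
{
	if (DEMO_DYNAMIC_DEBUG)
		demo_hex_dump("TX : ", tx, len);
}
```

Using a plain if () on a constant, rather than #ifdef, keeps the dump call
visible to the compiler for type checking in both configurations.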

>
> (and similar peci_trace with trace_printk for usage in IRQ handlers and such).
>
> What do you think?

In general, no, don't wrap the base level print routines with
driver-specific ones. Also, trace_printk() is only for debug builds.
Note that tracepoints are built to have even less overhead than
dev_dbg(), so there's no overhead concern with disabled tracepoints;
they literally translate to nops when disabled.

>
> >
> > > +
> > > +       priv->status = 0;
> > > +       writel(ASPEED_PECI_CMD_FIRE, priv->base + ASPEED_PECI_CMD);
> > > +       spin_unlock_irqrestore(&priv->lock, flags);
> > > +
> > > +       ret = wait_for_completion_interruptible_timeout(&priv->xfer_complete, timeout);
> >
> > spin_lock_irqsave() says "I don't know if interrupts are disabled
> > already, so I'll save the state, whatever it is, and restore later"
> >
> > wait_for_completion_interruptible_timeout() says "I know I am in a
> > sleepable context where interrupts are enabled"
> >
> > So, one of those is wrong, i.e. should it be spin_{lock,unlock}_irq()?
>
> You're right - I'll fix it.
>
> >
> >
> > > +       if (ret < 0)
> > > +               return ret;
> > > +
> > > +       if (ret == 0) {
> > > +               dev_dbg(priv->dev, "Timeout waiting for a response!\n");
> > > +               return -ETIMEDOUT;
> > > +       }
> > > +
> > > +       spin_lock_irqsave(&priv->lock, flags);
> > > +
> > > +       writel(0, priv->base + ASPEED_PECI_CMD);
> > > +
> > > +       if (priv->status != ASPEED_PECI_INT_CMD_DONE) {
> > > +               spin_unlock_irqrestore(&priv->lock, flags);
> > > +               dev_dbg(priv->dev, "No valid response!\n");
> > > +               return -EIO;
> > > +       }
> > > +
> > > +       spin_unlock_irqrestore(&priv->lock, flags);
> > > +
> > > +       memcpy_fromio(req->rx.buf, priv->base + ASPEED_PECI_RD_DATA0, min_t(u8, req->rx.len, 16));
> > > +       if (req->rx.len > 16)
> > > +               memcpy_fromio(req->rx.buf + 16, priv->base + ASPEED_PECI_RD_DATA4,
> > > +                             req->rx.len - 16);
> > > +
> > > +       print_hex_dump_bytes("RX : ", DUMP_PREFIX_NONE, req->rx.buf, req->rx.len);
> > > +
> > > +       return 0;
> > > +}
> > > +
> > > +static irqreturn_t aspeed_peci_irq_handler(int irq, void *arg)
> > > +{
> > > +       struct aspeed_peci *priv = arg;
> > > +       u32 status;
> > > +
> > > +       spin_lock(&priv->lock);
> > > +       status = readl(priv->base + ASPEED_PECI_INT_STS);
> > > +       writel(status, priv->base + ASPEED_PECI_INT_STS);
> > > +       priv->status |= (status & ASPEED_PECI_INT_MASK);
> > > +
> > > +       /*
> > > +        * In most cases, interrupt bits will be set one by one but also note
> > > +        * that multiple interrupt bits could be set at the same time.
> > > +        */
> > > +       if (status & ASPEED_PECI_INT_BUS_TIMEOUT)
> > > +               dev_dbg_ratelimited(priv->dev, "ASPEED_PECI_INT_BUS_TIMEOUT\n");
> > > +
> > > +       if (status & ASPEED_PECI_INT_BUS_CONTENTION)
> > > +               dev_dbg_ratelimited(priv->dev, "ASPEED_PECI_INT_BUS_CONTENTION\n");
> > > +
> > > +       if (status & ASPEED_PECI_INT_WR_FCS_BAD)
> > > +               dev_dbg_ratelimited(priv->dev, "ASPEED_PECI_INT_WR_FCS_BAD\n");
> > > +
> > > +       if (status & ASPEED_PECI_INT_WR_FCS_ABORT)
> > > +               dev_dbg_ratelimited(priv->dev, "ASPEED_PECI_INT_WR_FCS_ABORT\n");
> >
> > Are you sure these would not be better as tracepoints? If you're
> > debugging an interrupt related failure, the ratelimiting might get in
> > your way when you really need to know when one of these error
> > interrupts fire relative to another event.
>
> Tracepoints are ABI(ish), and using a full blown tracepoint just for IRQ status
> would probably be too much.

Tracepoints become ABI once someone ships tooling that depends on them
being there. These don't look attractive for a tool, and they don't
look difficult to maintain if the interrupt handler needs to be
reworked. I.e. it would be trivial to keep a dead tracepoint around, if
worst came to worst, to keep a tool from failing to load.

> I was thinking about something like trace_printk hidden under a
> "CONFIG_PECI_DEBUG" (see above), but perhaps that's something for the future
> improvement?

Again trace_printk() is only for private builds.

>
> >
> > > +
> > > +       /*
> > > +        * All commands should be ended up with a ASPEED_PECI_INT_CMD_DONE bit
> > > +        * set even in an error case.
> > > +        */
> > > +       if (status & ASPEED_PECI_INT_CMD_DONE)
> > > +               complete(&priv->xfer_complete);
> >
> > Hmm, no need to check if there was a sequencing error, like a command
> > was never submitted?
>
> It's handled by checking if HW is idle in xfer before a command is sent, where
> we just expect a single interrupt per command.

I'm asking how you determine whether this status was spurious, or
whether there was a sequencing error in the driver.
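
One possible shape for such a check, as a userspace sketch only: track
whether a command is in flight and treat a completion interrupt that arrives
without one as spurious. The cmd_in_flight flag, the demo_* names and the
bit value are all assumptions made for illustration; a real driver would
also update this state under its spinlock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_INT_CMD_DONE 0x1u	/* hypothetical bit value, for illustration only */

struct demo_peci {
	bool cmd_in_flight;		/* set by xfer just before firing a command */
	uint32_t status;		/* accumulated interrupt status for the command */
	unsigned int completions;	/* stands in for complete(&xfer_complete) */
	unsigned int spurious;		/* interrupts seen with no command pending */
};

/* Models the submit side of xfer: exactly one command may be outstanding. */
static void demo_submit(struct demo_peci *p)
{
	assert(!p->cmd_in_flight);
	p->status = 0;
	p->cmd_in_flight = true;
}

/* Models the IRQ handler; returns true if the interrupt belonged to a command. */
static bool demo_irq(struct demo_peci *p, uint32_t hw_status)
{
	if (!p->cmd_in_flight) {
		/* Nothing was submitted: count it instead of completing a waiter. */
		p->spurious++;
		return false;
	}

	p->status |= hw_status;
	if (hw_status & DEMO_INT_CMD_DONE) {
		p->cmd_in_flight = false;
		p->completions++;
	}

	return true;
}
```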


* Re: [PATCH v2 08/15] peci: Add device detection
  2021-08-03 11:31 ` [PATCH v2 08/15] peci: Add device detection Iwona Winiarska
@ 2021-08-27 19:01   ` Dan Williams
  2021-11-15 22:18     ` Winiarska, Iwona
  0 siblings, 1 reply; 49+ messages in thread
From: Dan Williams @ 2021-08-27 19:01 UTC (permalink / raw)
  To: Iwona Winiarska
  Cc: Linux Kernel Mailing List, openbmc, Greg Kroah-Hartman, X86 ML,
	Device Tree, linux-aspeed, Linux ARM, linux-hwmon,
	Linux Doc Mailing List, Rob Herring, Joel Stanley,
	Andrew Jeffery, Jean Delvare, Guenter Roeck, Arnd Bergmann,
	Olof Johansson, Jonathan Corbet, Thomas Gleixner,
	Andy Lutomirski, Ingo Molnar, Borislav Petkov, Yazen Ghannam,
	Mauro Carvalho Chehab, Pierre-Louis Bossart, Tony Luck,
	Andy Shevchenko, Jae Hyun Yoo, Randy Dunlap, Zev Weiss,
	David Muller

On Tue, Aug 3, 2021 at 4:35 AM Iwona Winiarska
<iwona.winiarska@intel.com> wrote:
>
> Since PECI devices are discoverable, we can dynamically detect devices
> that are actually available in the system.
>
> This change complements the earlier implementation by rescanning PECI
> bus to detect available devices. For this purpose, it also introduces the
> minimal API for PECI requests.
>
> Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com>
> Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
> ---
>  drivers/peci/Makefile   |   2 +-
>  drivers/peci/core.c     |  33 ++++++++++++
>  drivers/peci/device.c   | 114 ++++++++++++++++++++++++++++++++++++++++
>  drivers/peci/internal.h |  14 +++++
>  drivers/peci/request.c  |  50 ++++++++++++++++++
>  5 files changed, 212 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/peci/device.c
>  create mode 100644 drivers/peci/request.c
>
> diff --git a/drivers/peci/Makefile b/drivers/peci/Makefile
> index 926d8df15cbd..c5f9d3fe21bb 100644
> --- a/drivers/peci/Makefile
> +++ b/drivers/peci/Makefile
> @@ -1,7 +1,7 @@
>  # SPDX-License-Identifier: GPL-2.0-only
>
>  # Core functionality
> -peci-y := core.o
> +peci-y := core.o request.o device.o
>  obj-$(CONFIG_PECI) += peci.o
>
>  # Hardware specific bus drivers
> diff --git a/drivers/peci/core.c b/drivers/peci/core.c
> index 7b3938af0396..d143f1a7fe98 100644
> --- a/drivers/peci/core.c
> +++ b/drivers/peci/core.c
> @@ -34,6 +34,20 @@ struct device_type peci_controller_type = {
>         .release        = peci_controller_dev_release,
>  };
>
> +static int peci_controller_scan_devices(struct peci_controller *controller)
> +{
> +       int ret;
> +       u8 addr;
> +
> +       for (addr = PECI_BASE_ADDR; addr < PECI_BASE_ADDR + PECI_DEVICE_NUM_MAX; addr++) {
> +               ret = peci_device_create(controller, addr);
> +               if (ret)
> +                       return ret;
> +       }
> +
> +       return 0;
> +}
> +
>  static struct peci_controller *peci_controller_alloc(struct device *dev,
>                                                      struct peci_controller_ops *ops)
>  {
> @@ -76,10 +90,23 @@ static struct peci_controller *peci_controller_alloc(struct device *dev,
>         return ERR_PTR(ret);
>  }
>
> +static int unregister_child(struct device *dev, void *dummy)
> +{
> +       peci_device_destroy(to_peci_device(dev));
> +
> +       return 0;
> +}
> +
>  static void unregister_controller(void *_controller)
>  {
>         struct peci_controller *controller = _controller;
>
> +       /*
> +        * Detach any active PECI devices. This can't fail, thus we do not
> +        * check the returned value.
> +        */
> +       device_for_each_child_reverse(&controller->dev, NULL, unregister_child);
> +
>         device_unregister(&controller->dev);
>  }
>
> @@ -115,6 +142,12 @@ struct peci_controller *devm_peci_controller_add(struct device *dev,
>         if (ret)
>                 return ERR_PTR(ret);
>
> +       /*
> +        * Ignoring retval since failures during scan are non-critical for
> +        * controller itself.
> +        */
> +       peci_controller_scan_devices(controller);
> +
>         return controller;
>
>  err:
> diff --git a/drivers/peci/device.c b/drivers/peci/device.c
> new file mode 100644
> index 000000000000..32811248997b
> --- /dev/null
> +++ b/drivers/peci/device.c
> @@ -0,0 +1,114 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +// Copyright (c) 2018-2021 Intel Corporation
> +
> +#include <linux/peci.h>
> +#include <linux/slab.h>
> +
> +#include "internal.h"
> +
> +static int peci_detect(struct peci_controller *controller, u8 addr)
> +{
> +       struct peci_request *req;
> +       int ret;
> +
> +       /*
> +        * PECI Ping is a command encoded by tx_len = 0, rx_len = 0.
> +        * We expect correct Write FCS if the device at the target address
> +        * is able to respond.
> +        */
> +       req = peci_request_alloc(NULL, 0, 0);
> +       if (!req)
> +               return -ENOMEM;

Seems a waste to do a heap allocation for this routine. Why not:

       /*
        * PECI Ping is a command encoded by tx_len = 0, rx_len = 0.
        * We expect correct Write FCS if the device at the target address
        * is able to respond.
        */
       struct peci_request req = { 0 };

> +
> +       mutex_lock(&controller->bus_lock);
> +       ret = controller->ops->xfer(controller, addr, req);
> +       mutex_unlock(&controller->bus_lock);
> +
> +       peci_request_free(req);
> +
> +       return ret;
> +}
> +
> +static bool peci_addr_valid(u8 addr)
> +{
> +       return addr >= PECI_BASE_ADDR && addr < PECI_BASE_ADDR + PECI_DEVICE_NUM_MAX;
> +}
> +
> +static int peci_dev_exists(struct device *dev, void *data)
> +{
> +       struct peci_device *device = to_peci_device(dev);
> +       u8 *addr = data;
> +
> +       if (device->addr == *addr)
> +               return -EBUSY;
> +
> +       return 0;
> +}
> +
> +int peci_device_create(struct peci_controller *controller, u8 addr)
> +{
> +       struct peci_device *device;
> +       int ret;
> +
> +       if (WARN_ON(!peci_addr_valid(addr)))

The WARN_ON is overkill, especially as there is only one caller of
this and it loops through valid addresses.

> +               return -EINVAL;
> +
> +       /* Check if we have already detected this device before. */
> +       ret = device_for_each_child(&controller->dev, &addr, peci_dev_exists);
> +       if (ret)
> +               return 0;
> +
> +       ret = peci_detect(controller, addr);
> +       if (ret) {
> +               /*
> +                * Device not present or host state doesn't allow successful
> +                * detection at this time.
> +                */
> +               if (ret == -EIO || ret == -ETIMEDOUT)
> +                       return 0;
> +
> +               return ret;
> +       }
> +
> +       device = kzalloc(sizeof(*device), GFP_KERNEL);
> +       if (!device)
> +               return -ENOMEM;
> +
> +       device->addr = addr;
> +       device->dev.parent = &controller->dev;
> +       device->dev.bus = &peci_bus_type;
> +       device->dev.type = &peci_device_type;
> +
> +       ret = dev_set_name(&device->dev, "%d-%02x", controller->id, device->addr);
> +       if (ret)
> +               goto err_free;

It's cleaner to just have one unified error exit using put_device().
Use the device_initialize() + device_add() pattern, not
device_register().
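
A userspace model of that pattern, as a sketch under stated assumptions:
the demo_* names are hypothetical stand-ins for device_initialize(),
device_add(), put_device() and the ->release() callback, and the fail_*
parameters only exist to inject errors. The point is that once the object is
initialized, every failure takes the same exit and the release callback is
the only place that frees it.

```c
#include <assert.h>
#include <stdlib.h>

static unsigned int demo_released;	/* how many times release() ran */

struct demo_device {
	int refcount;
	void (*release)(struct demo_device *dev);
};

/* Models the device_type ->release() callback: the only place that frees. */
static void demo_release(struct demo_device *dev)
{
	demo_released++;
	free(dev);
}

/* Models device_initialize(): from here on the object is refcounted. */
static void demo_device_initialize(struct demo_device *dev)
{
	dev->refcount = 1;
	dev->release = demo_release;
}

/* Models put_device(): the final put frees through release(). */
static void demo_put_device(struct demo_device *dev)
{
	assert(dev->refcount > 0);
	if (--dev->refcount == 0)
		dev->release(dev);
}

/* fail_name and fail_add inject errors at the two fallible steps. */
static int demo_device_create(int fail_name, int fail_add, struct demo_device **out)
{
	struct demo_device *dev = calloc(1, sizeof(*dev));
	int ret;

	if (!dev)
		return -1;

	demo_device_initialize(dev);

	ret = fail_name ? -2 : 0;	/* stands in for dev_set_name() */
	if (ret)
		goto err_put;

	ret = fail_add ? -3 : 0;	/* stands in for device_add() */
	if (ret)
		goto err_put;

	*out = dev;
	return 0;

err_put:
	demo_put_device(dev);	/* single exit: no separate kfree() path */
	return ret;
}
```

With a single err_put label there is no fall-through from a put into a
second free, which is what the quoted err_put/err_free pair would do.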


> +
> +       ret = device_register(&device->dev);
> +       if (ret)
> +               goto err_put;
> +
> +       return 0;
> +
> +err_put:
> +       put_device(&device->dev);
> +err_free:
> +       kfree(device);
> +
> +       return ret;
> +}
> +
> +void peci_device_destroy(struct peci_device *device)
> +{
> +       device_unregister(&device->dev);

No clear value for this wrapper; in fact, in one caller it forces a
to_peci_device() conversion just so this helper can undo that up-cast.

> +}
> +
> +static void peci_device_release(struct device *dev)
> +{
> +       struct peci_device *device = to_peci_device(dev);
> +
> +       kfree(device);
> +}
> +
> +struct device_type peci_device_type = {
> +       .release        = peci_device_release,
> +};
> diff --git a/drivers/peci/internal.h b/drivers/peci/internal.h
> index 918dea745a86..57d11a902c5d 100644
> --- a/drivers/peci/internal.h
> +++ b/drivers/peci/internal.h
> @@ -8,6 +8,20 @@
>  #include <linux/types.h>
>
>  struct peci_controller;
> +struct peci_device;
> +struct peci_request;
> +
> +/* PECI CPU address range 0x30-0x37 */
> +#define PECI_BASE_ADDR         0x30
> +#define PECI_DEVICE_NUM_MAX    8
> +
> +struct peci_request *peci_request_alloc(struct peci_device *device, u8 tx_len, u8 rx_len);
> +void peci_request_free(struct peci_request *req);
> +
> +extern struct device_type peci_device_type;
> +
> +int peci_device_create(struct peci_controller *controller, u8 addr);
> +void peci_device_destroy(struct peci_device *device);
>
>  extern struct bus_type peci_bus_type;
>
> diff --git a/drivers/peci/request.c b/drivers/peci/request.c
> new file mode 100644
> index 000000000000..81b567bc7b87
> --- /dev/null
> +++ b/drivers/peci/request.c
> @@ -0,0 +1,50 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +// Copyright (c) 2021 Intel Corporation
> +
> +#include <linux/export.h>
> +#include <linux/peci.h>
> +#include <linux/slab.h>
> +#include <linux/types.h>
> +
> +#include "internal.h"
> +
> +/**
> + * peci_request_alloc() - allocate &struct peci_requests
> + * @device: PECI device to which request is going to be sent
> + * @tx_len: TX length
> + * @rx_len: RX length
> + *
> + * Return: A pointer to a newly allocated &struct peci_request on success or NULL otherwise.
> + */
> +struct peci_request *peci_request_alloc(struct peci_device *device, u8 tx_len, u8 rx_len)
> +{
> +       struct peci_request *req;
> +
> +       if (WARN_ON_ONCE(tx_len > PECI_REQUEST_MAX_BUF_SIZE || rx_len > PECI_REQUEST_MAX_BUF_SIZE))

WARN_ON_ONCE() should only be here to help other kernel developers not
make this mistake. However, another way to enforce this is to stop
exporting peci_request_alloc() and instead export helpers for specific
command types, and keep this detail internal to the core. If you keep
this, it needs a comment that it is only here to warn other
peci-client developers of their bug before it goes upstream.
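
A rough userspace sketch of the "helpers per command type" idea. The
demo_* names, DEMO_MAX_BUF, and the opcode and lengths used for the
temperature command are illustrative assumptions, not the series' API. The
generic allocator stays private and callers can only reach it through
helpers that hard-code valid lengths.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_MAX_BUF 32	/* hypothetical stand-in for PECI_REQUEST_MAX_BUF_SIZE */

struct demo_request {
	uint8_t tx_len;
	uint8_t rx_len;
	uint8_t tx_buf[DEMO_MAX_BUF];
	uint8_t rx_buf[DEMO_MAX_BUF];
};

/* Internal to the core: the only place where raw lengths are accepted. */
static struct demo_request *demo_request_alloc(uint8_t tx_len, uint8_t rx_len)
{
	struct demo_request *req;

	if (tx_len > DEMO_MAX_BUF || rx_len > DEMO_MAX_BUF)
		return NULL;

	req = calloc(1, sizeof(*req));
	if (!req)
		return NULL;

	req->tx_len = tx_len;
	req->rx_len = rx_len;

	return req;
}

/* What would be exported: lengths are fixed by the command, not the caller. */
static struct demo_request *demo_cmd_ping(void)
{
	return demo_request_alloc(0, 0);
}

/* Illustrative command with a one-byte opcode and a two-byte reply. */
static struct demo_request *demo_cmd_get_temp(void)
{
	struct demo_request *req = demo_request_alloc(1, 2);

	if (req) {
		assert(req->tx_len >= 1);
		req->tx_buf[0] = 0x01;	/* illustrative opcode */
	}

	return req;
}
```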


* Re: [PATCH v2 09/15] peci: Add sysfs interface for PECI bus
  2021-08-03 11:31 ` [PATCH v2 09/15] peci: Add sysfs interface for PECI bus Iwona Winiarska
@ 2021-08-27 19:11   ` Dan Williams
  2021-11-15 22:19     ` Winiarska, Iwona
  0 siblings, 1 reply; 49+ messages in thread
From: Dan Williams @ 2021-08-27 19:11 UTC (permalink / raw)
  To: Iwona Winiarska
  Cc: Linux Kernel Mailing List, openbmc, Greg Kroah-Hartman, X86 ML,
	Device Tree, linux-aspeed, Linux ARM, linux-hwmon,
	Linux Doc Mailing List, Rob Herring, Joel Stanley,
	Andrew Jeffery, Jean Delvare, Guenter Roeck, Arnd Bergmann,
	Olof Johansson, Jonathan Corbet, Thomas Gleixner,
	Andy Lutomirski, Ingo Molnar, Borislav Petkov, Yazen Ghannam,
	Mauro Carvalho Chehab, Pierre-Louis Bossart, Tony Luck,
	Andy Shevchenko, Jae Hyun Yoo, Randy Dunlap, Zev Weiss,
	David Muller

On Tue, Aug 3, 2021 at 4:35 AM Iwona Winiarska
<iwona.winiarska@intel.com> wrote:
>
> PECI devices may not be discoverable at the time when PECI controller is
> being added (e.g. BMC can boot up when the Host system is still in S5).
> Since we currently don't have the capabilities to figure out the Host
> system state inside the PECI subsystem itself, we have to rely on
> userspace to do it for us.
>
> In the future, PECI subsystem may be expanded with mechanisms that allow
> us to avoid depending on userspace interaction (e.g. CPU presence could
> be detected using GPIO, and the information on whether it's discoverable
> could be obtained over IPMI).

Thanks for this detail.

> Unfortunately, those methods may ultimately not be available (support
> will vary from platform to platform), which means that we still need
> platform independent method triggered by userspace.
>
> Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com>
> ---
>  Documentation/ABI/testing/sysfs-bus-peci | 16 +++++
>  drivers/peci/Makefile                    |  2 +-
>  drivers/peci/core.c                      |  3 +-
>  drivers/peci/device.c                    |  1 +
>  drivers/peci/internal.h                  |  5 ++
>  drivers/peci/sysfs.c                     | 82 ++++++++++++++++++++++++
>  6 files changed, 107 insertions(+), 2 deletions(-)
>  create mode 100644 Documentation/ABI/testing/sysfs-bus-peci
>  create mode 100644 drivers/peci/sysfs.c
>
> diff --git a/Documentation/ABI/testing/sysfs-bus-peci b/Documentation/ABI/testing/sysfs-bus-peci
> new file mode 100644
> index 000000000000..56c2b2216bbd
> --- /dev/null
> +++ b/Documentation/ABI/testing/sysfs-bus-peci
> @@ -0,0 +1,16 @@
> +What:          /sys/bus/peci/rescan
> +Date:          July 2021
> +KernelVersion: 5.15
> +Contact:       Iwona Winiarska <iwona.winiarska@intel.com>
> +Description:
> +               Writing a non-zero value to this attribute will
> +               initiate scan for PECI devices on all PECI controllers
> +               in the system.
> +
> +What:          /sys/bus/peci/devices/<controller_id>-<device_addr>/remove
> +Date:          July 2021
> +KernelVersion: 5.15
> +Contact:       Iwona Winiarska <iwona.winiarska@intel.com>
> +Description:
> +               Writing a non-zero value to this attribute will
> +               remove the PECI device and any of its children.
> diff --git a/drivers/peci/Makefile b/drivers/peci/Makefile
> index c5f9d3fe21bb..917f689e147a 100644
> --- a/drivers/peci/Makefile
> +++ b/drivers/peci/Makefile
> @@ -1,7 +1,7 @@
>  # SPDX-License-Identifier: GPL-2.0-only
>
>  # Core functionality
> -peci-y := core.o request.o device.o
> +peci-y := core.o request.o device.o sysfs.o
>  obj-$(CONFIG_PECI) += peci.o
>
>  # Hardware specific bus drivers
> diff --git a/drivers/peci/core.c b/drivers/peci/core.c
> index d143f1a7fe98..c473acb3c2a0 100644
> --- a/drivers/peci/core.c
> +++ b/drivers/peci/core.c
> @@ -34,7 +34,7 @@ struct device_type peci_controller_type = {
>         .release        = peci_controller_dev_release,
>  };
>
> -static int peci_controller_scan_devices(struct peci_controller *controller)
> +int peci_controller_scan_devices(struct peci_controller *controller)
>  {
>         int ret;
>         u8 addr;
> @@ -159,6 +159,7 @@ EXPORT_SYMBOL_NS_GPL(devm_peci_controller_add, PECI);
>
>  struct bus_type peci_bus_type = {
>         .name           = "peci",
> +       .bus_groups     = peci_bus_groups,
>  };
>
>  static int __init peci_init(void)
> diff --git a/drivers/peci/device.c b/drivers/peci/device.c
> index 32811248997b..d77d9dabd51e 100644
> --- a/drivers/peci/device.c
> +++ b/drivers/peci/device.c
> @@ -110,5 +110,6 @@ static void peci_device_release(struct device *dev)
>  }
>
>  struct device_type peci_device_type = {
> +       .groups         = peci_device_groups,
>         .release        = peci_device_release,
>  };
> diff --git a/drivers/peci/internal.h b/drivers/peci/internal.h
> index 57d11a902c5d..978e12c8e1d3 100644
> --- a/drivers/peci/internal.h
> +++ b/drivers/peci/internal.h
> @@ -8,6 +8,7 @@
>  #include <linux/types.h>
>
>  struct peci_controller;
> +struct attribute_group;
>  struct peci_device;
>  struct peci_request;
>
> @@ -19,12 +20,16 @@ struct peci_request *peci_request_alloc(struct peci_device *device, u8 tx_len, u
>  void peci_request_free(struct peci_request *req);
>
>  extern struct device_type peci_device_type;
> +extern const struct attribute_group *peci_device_groups[];
>
>  int peci_device_create(struct peci_controller *controller, u8 addr);
>  void peci_device_destroy(struct peci_device *device);
>
>  extern struct bus_type peci_bus_type;
> +extern const struct attribute_group *peci_bus_groups[];

To me, sysfs.c is small enough to just fold into core.c, then no need
to declare public attribute arrays like this, but up to you if you
prefer the sysfs.c split.

>
>  extern struct device_type peci_controller_type;
>
> +int peci_controller_scan_devices(struct peci_controller *controller);
> +
>  #endif /* __PECI_INTERNAL_H */
> diff --git a/drivers/peci/sysfs.c b/drivers/peci/sysfs.c
> new file mode 100644
> index 000000000000..db9ef05776e3
> --- /dev/null
> +++ b/drivers/peci/sysfs.c
> @@ -0,0 +1,82 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +// Copyright (c) 2021 Intel Corporation
> +
> +#include <linux/device.h>
> +#include <linux/kernel.h>
> +#include <linux/peci.h>
> +
> +#include "internal.h"
> +
> +static int rescan_controller(struct device *dev, void *data)
> +{
> +       if (dev->type != &peci_controller_type)
> +               return 0;
> +
> +       return peci_controller_scan_devices(to_peci_controller(dev));
> +}
> +
> +static ssize_t rescan_store(struct bus_type *bus, const char *buf, size_t count)
> +{
> +       bool res;
> +       int ret;
> +
> +       ret = kstrtobool(buf, &res);
> +       if (ret)
> +               return ret;
> +
> +       if (!res)
> +               return count;
> +
> +       ret = bus_for_each_dev(&peci_bus_type, NULL, NULL, rescan_controller);
> +       if (ret)
> +               return ret;
> +
> +       return count;
> +}
> +static BUS_ATTR_WO(rescan);
> +
> +static struct attribute *peci_bus_attrs[] = {
> +       &bus_attr_rescan.attr,
> +       NULL
> +};
> +
> +static const struct attribute_group peci_bus_group = {
> +       .attrs = peci_bus_attrs,
> +};
> +
> +const struct attribute_group *peci_bus_groups[] = {
> +       &peci_bus_group,
> +       NULL
> +};
> +
> +static ssize_t remove_store(struct device *dev, struct device_attribute *attr,
> +                           const char *buf, size_t count)
> +{
> +       struct peci_device *device = to_peci_device(dev);
> +       bool res;
> +       int ret;
> +
> +       ret = kstrtobool(buf, &res);
> +       if (ret)
> +               return ret;
> +
> +       if (res && device_remove_file_self(dev, attr))
> +               peci_device_destroy(device);

How do you solve races between sysfs device remove and controller
device remove? Looks like a double-free at first glance. Have a look at
the kill_device() helper as one way to resolve this double-delete
race.
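
The idea behind that helper, as far as I understand it, is "first caller
marks the device dead, everyone else backs off". Below is a userspace sketch
of that idea only. Note the swapped-in technique: kill_device() relies on
the device lock being held, while this model uses a C11 atomic exchange to
get the same only-one-winner behaviour without a lock. All demo_* names are
hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_device {
	atomic_bool dead;		/* models the driver core's private "dead" flag */
	unsigned int unregistered;	/* how many times teardown actually ran */
};

/* Models kill_device(): true only for the caller that flipped the flag. */
static bool demo_kill_device(struct demo_device *dev)
{
	assert(dev != NULL);
	return !atomic_exchange(&dev->dead, true);
}

/* Shared by the sysfs "remove" path and the controller teardown path. */
static bool demo_device_remove_once(struct demo_device *dev)
{
	if (!demo_kill_device(dev))
		return false;	/* someone else is already tearing it down */

	dev->unregistered++;	/* stands in for device_unregister() */
	return true;
}
```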

> +
> +       return count;
> +}
> +static DEVICE_ATTR_IGNORE_LOCKDEP(remove, 0200, NULL, remove_store);
> +
> +static struct attribute *peci_device_attrs[] = {
> +       &dev_attr_remove.attr,
> +       NULL
> +};
> +
> +static const struct attribute_group peci_device_group = {
> +       .attrs = peci_device_attrs,
> +};
> +
> +const struct attribute_group *peci_device_groups[] = {
> +       &peci_device_group,
> +       NULL
> +};
> --
> 2.31.1
>


* Re: [PATCH v2 10/15] peci: Add support for PECI device drivers
  2021-08-03 11:31 ` [PATCH v2 10/15] peci: Add support for PECI device drivers Iwona Winiarska
@ 2021-08-27 21:19   ` Dan Williams
  2021-11-15 22:20     ` Winiarska, Iwona
  0 siblings, 1 reply; 49+ messages in thread
From: Dan Williams @ 2021-08-27 21:19 UTC (permalink / raw)
  To: Iwona Winiarska
  Cc: Linux Kernel Mailing List, openbmc, Greg Kroah-Hartman, X86 ML,
	Device Tree, linux-aspeed, Linux ARM, linux-hwmon,
	Linux Doc Mailing List, Rob Herring, Joel Stanley,
	Andrew Jeffery, Jean Delvare, Guenter Roeck, Arnd Bergmann,
	Olof Johansson, Jonathan Corbet, Thomas Gleixner,
	Andy Lutomirski, Ingo Molnar, Borislav Petkov, Yazen Ghannam,
	Mauro Carvalho Chehab, Pierre-Louis Bossart, Tony Luck,
	Andy Shevchenko, Jae Hyun Yoo, Randy Dunlap, Zev Weiss,
	David Muller

On Tue, Aug 3, 2021 at 4:36 AM Iwona Winiarska
<iwona.winiarska@intel.com> wrote:
>
> Here we're adding support for PECI device drivers, which unlike PECI

s/Here we're adding/Add/

> controller drivers are actually able to provide functionalities to
> userspace.

>
> We're also extending peci_request API to allow querying more details

s/We're also extending/Also, extend/

...for the most part, the imperative mood is preferred by upstream
maintainers for changelogs.

> about PECI device (e.g. model/family), that's going to be used to find
> a compatible peci_driver.
>
> Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com>
> Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
> ---
>  drivers/peci/Kconfig    |   1 +
>  drivers/peci/core.c     |  49 +++++++++
>  drivers/peci/device.c   | 105 ++++++++++++++++++++
>  drivers/peci/internal.h |  75 ++++++++++++++
>  drivers/peci/request.c  | 214 ++++++++++++++++++++++++++++++++++++++++
>  include/linux/peci.h    |  19 ++++
>  lib/Kconfig             |   2 +-
>  7 files changed, 464 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/peci/Kconfig b/drivers/peci/Kconfig
> index 99279df97a78..1d0532e3a801 100644
> --- a/drivers/peci/Kconfig
> +++ b/drivers/peci/Kconfig
> @@ -2,6 +2,7 @@
>
>  menuconfig PECI
>         tristate "PECI support"
> +       select GENERIC_LIB_X86

GENERIC_LIB_X86 has dependencies, so this 'select' will make kbuild
unhappy when that dependency is not met. Given that this symbol is
already selected by X86, it seems this just wants a "depends on
GENERIC_LIB_X86".

>         help
>           The Platform Environment Control Interface (PECI) is an interface
>           that provides a communication channel to Intel processors and
> diff --git a/drivers/peci/core.c b/drivers/peci/core.c
> index c473acb3c2a0..33c07920493d 100644
> --- a/drivers/peci/core.c
> +++ b/drivers/peci/core.c
> @@ -157,8 +157,57 @@ struct peci_controller *devm_peci_controller_add(struct device *dev,
>  }
>  EXPORT_SYMBOL_NS_GPL(devm_peci_controller_add, PECI);
>
> +static const struct peci_device_id *
> +peci_bus_match_device_id(const struct peci_device_id *id, struct peci_device *device)
> +{
> +       while (id->family != 0) {
> +               if (id->family == device->info.family &&
> +                   id->model == device->info.model)
> +                       return id;
> +               id++;
> +       }
> +
> +       return NULL;
> +}
> +
> +static int peci_bus_device_match(struct device *dev, struct device_driver *drv)
> +{
> +       struct peci_device *device = to_peci_device(dev);
> +       struct peci_driver *peci_drv = to_peci_driver(drv);
> +
> +       if (dev->type != &peci_device_type)
> +               return 0;
> +
> +       if (peci_bus_match_device_id(peci_drv->id_table, device))
> +               return 1;

Save a couple lines and do:

    return peci_bus_match_device_id(...)
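
A minimal sketch of that simplification, using stand-in types rather
than the real PECI structures (struct dev_id, struct dev_info, match_id()
and bus_match() are all hypothetical names). Since the bus match callback
returns int while the lookup returns a pointer, the sketch converts
explicitly with "!= NULL":

```c
#include <stddef.h>

/* Hypothetical stand-in for struct peci_device_id. */
struct dev_id {
	unsigned short family;
	unsigned char model;
};

/* Hypothetical stand-in for the family/model part of the device info. */
struct dev_info {
	unsigned short family;
	unsigned char model;
};

/* Walk a table terminated by an entry with family == 0. */
static const struct dev_id *match_id(const struct dev_id *id,
				     const struct dev_info *info)
{
	for (; id->family != 0; id++) {
		if (id->family == info->family && id->model == info->model)
			return id;
	}

	return NULL;
}

/* Returns 1 on a match and 0 otherwise, without the if/return pair. */
static int bus_match(const struct dev_id *table, const struct dev_info *info)
{
	return match_id(table, info) != NULL;
}
```

The explicit comparison keeps the return value at exactly 0 or 1, which
is what the original two-branch version produced.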

> +
> +       return 0;
> +}
> +
> +static int peci_bus_device_probe(struct device *dev)
> +{
> +       struct peci_device *device = to_peci_device(dev);
> +       struct peci_driver *driver = to_peci_driver(dev->driver);
> +
> +       return driver->probe(device, peci_bus_match_device_id(driver->id_table, device));
> +}
> +
> +static int peci_bus_device_remove(struct device *dev)

Note, in linux-next this prototype has changed to:

    void (*remove)(struct device *dev);

https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/include/linux/device/bus.h


> +{
> +       struct peci_device *device = to_peci_device(dev);
> +       struct peci_driver *driver = to_peci_driver(dev->driver);
> +
> +       if (driver->remove)
> +               driver->remove(device);
> +
> +       return 0;
> +}
> +
>  struct bus_type peci_bus_type = {
>         .name           = "peci",
> +       .match          = peci_bus_device_match,
> +       .probe          = peci_bus_device_probe,
> +       .remove         = peci_bus_device_remove,
>         .bus_groups     = peci_bus_groups,
>  };
>
> diff --git a/drivers/peci/device.c b/drivers/peci/device.c
> index d77d9dabd51e..a78c02399574 100644
> --- a/drivers/peci/device.c
> +++ b/drivers/peci/device.c
> @@ -1,11 +1,85 @@
>  // SPDX-License-Identifier: GPL-2.0-only
>  // Copyright (c) 2018-2021 Intel Corporation
>
> +#include <linux/bitfield.h>
>  #include <linux/peci.h>
>  #include <linux/slab.h>
> +#include <linux/x86/cpu.h>
>
>  #include "internal.h"
>
> +#define REVISION_NUM_MASK GENMASK(15, 8)
> +static int peci_get_revision(struct peci_device *device, u8 *revision)
> +{
> +       struct peci_request *req;
> +       u64 dib;
> +
> +       req = peci_get_dib(device);

I would expect peci_get_dib() to return @dib.

> +       if (IS_ERR(req))
> +               return PTR_ERR(req);
> +
> +       /*
> +        * PECI device may be in a state where it is unable to return a proper
> +        * DIB, in which case it returns 0 as DIB value.
> +        * Let's treat this as an error to avoid carrying on with the detection
> +        * using invalid revision.
> +        */
> +       dib = peci_request_data_dib(req);

I would expect peci_request_data_dib() to make a request.

A stack allocated peci_request passed to peci_get_dib() that returns
an error code would seem to be cleaner than this current organization.
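
Roughly what I have in mind, as a userspace sketch under loud
assumptions: struct request, fake_xfer(), get_dib(), request_data_dib()
and get_revision() are hypothetical stand-ins, the request carries its
buffers inline so it can live on the stack, and the transfer just
returns a fixed DIB. Only the 0xf7 command byte, the zero-DIB check and
the bits 15:8 revision field come from the patch:

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define DIB_LEN 8

/* Hypothetical request with inline buffers, so no allocation is needed. */
struct request {
	uint8_t tx[1];
	uint8_t rx[DIB_LEN];
};

/* Stand-in for the controller transfer; fills rx with a fixed DIB. */
static int fake_xfer(struct request *req)
{
	static const uint8_t dib[DIB_LEN] = { 0x00, 0x40 };

	memcpy(req->rx, dib, sizeof(dib));
	return 0;
}

/* Issue GetDIB() on a caller-provided request; return only an errno. */
static int get_dib(struct request *req)
{
	req->tx[0] = 0xf7; /* PECI_GET_DIB_CMD in the patch */
	return fake_xfer(req);
}

/* Pure accessor: assemble the little-endian 64-bit DIB from rx. */
static uint64_t request_data_dib(const struct request *req)
{
	uint64_t val = 0;
	int i;

	for (i = DIB_LEN - 1; i >= 0; i--)
		val = (val << 8) | req->rx[i];

	return val;
}

static int get_revision(uint8_t *revision)
{
	struct request req = { { 0 }, { 0 } };
	uint64_t dib;
	int ret;

	ret = get_dib(&req);
	if (ret)
		return ret;

	/* A zero DIB means the device could not answer properly. */
	dib = request_data_dib(&req);
	if (dib == 0)
		return -EIO;

	*revision = (dib >> 8) & 0xff; /* bits 15:8, as REVISION_NUM_MASK */
	return 0;
}
```

With this shape there is no peci_request_free() on any exit path, and
the IS_ERR()/PTR_ERR() dance in the caller goes away.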

> +       if (dib == 0) {
> +               peci_request_free(req);
> +               return -EIO;
> +       }
> +
> +       *revision = FIELD_GET(REVISION_NUM_MASK, dib);
> +
> +       peci_request_free(req);
> +
> +       return 0;
> +}
> +
> +static int peci_get_cpu_id(struct peci_device *device, u32 *cpu_id)
> +{
> +       struct peci_request *req;
> +       int ret;
> +
> +       req = peci_pkg_cfg_readl(device, PECI_PCS_PKG_ID, PECI_PKG_ID_CPU_ID);
> +       if (IS_ERR(req))
> +               return PTR_ERR(req);
> +
> +       ret = peci_request_status(req);
> +       if (ret)
> +               goto out_req_free;
> +
> +       *cpu_id = peci_request_data_readl(req);
> +out_req_free:
> +       peci_request_free(req);
> +
> +       return ret;
> +}
> +
> +static int peci_device_info_init(struct peci_device *device)
> +{
> +       u8 revision;
> +       u32 cpu_id;
> +       int ret;
> +
> +       ret = peci_get_cpu_id(device, &cpu_id);
> +       if (ret)
> +               return ret;
> +
> +       device->info.family = x86_family(cpu_id);
> +       device->info.model = x86_model(cpu_id);
> +
> +       ret = peci_get_revision(device, &revision);
> +       if (ret)
> +               return ret;
> +       device->info.peci_revision = revision;
> +
> +       device->info.socket_id = device->addr - PECI_BASE_ADDR;
> +
> +       return 0;
> +}
> +
>  static int peci_detect(struct peci_controller *controller, u8 addr)
>  {
>         struct peci_request *req;
> @@ -79,6 +153,10 @@ int peci_device_create(struct peci_controller *controller, u8 addr)
>         device->dev.bus = &peci_bus_type;
>         device->dev.type = &peci_device_type;
>
> +       ret = peci_device_info_init(device);
> +       if (ret)
> +               goto err_free;
> +
>         ret = dev_set_name(&device->dev, "%d-%02x", controller->id, device->addr);
>         if (ret)
>                 goto err_free;
> @@ -102,6 +180,33 @@ void peci_device_destroy(struct peci_device *device)
>         device_unregister(&device->dev);
>  }
>
> +int __peci_driver_register(struct peci_driver *driver, struct module *owner,
> +                          const char *mod_name)
> +{
> +       driver->driver.bus = &peci_bus_type;
> +       driver->driver.owner = owner;
> +       driver->driver.mod_name = mod_name;
> +
> +       if (!driver->probe) {
> +               pr_err("peci: trying to register driver without probe callback\n");
> +               return -EINVAL;
> +       }
> +
> +       if (!driver->id_table) {
> +               pr_err("peci: trying to register driver without device id table\n");
> +               return -EINVAL;
> +       }
> +
> +       return driver_register(&driver->driver);
> +}
> +EXPORT_SYMBOL_NS_GPL(__peci_driver_register, PECI);
> +
> +void peci_driver_unregister(struct peci_driver *driver)
> +{
> +       driver_unregister(&driver->driver);
> +}
> +EXPORT_SYMBOL_NS_GPL(peci_driver_unregister, PECI);
> +
>  static void peci_device_release(struct device *dev)
>  {
>         struct peci_device *device = to_peci_device(dev);
> diff --git a/drivers/peci/internal.h b/drivers/peci/internal.h
> index 978e12c8e1d3..d661e1b65694 100644
> --- a/drivers/peci/internal.h
> +++ b/drivers/peci/internal.h
> @@ -19,6 +19,34 @@ struct peci_request;
>  struct peci_request *peci_request_alloc(struct peci_device *device, u8 tx_len, u8 rx_len);
>  void peci_request_free(struct peci_request *req);
>
> +int peci_request_status(struct peci_request *req);
> +u64 peci_request_data_dib(struct peci_request *req);
> +
> +u8 peci_request_data_readb(struct peci_request *req);
> +u16 peci_request_data_readw(struct peci_request *req);
> +u32 peci_request_data_readl(struct peci_request *req);
> +u64 peci_request_data_readq(struct peci_request *req);
> +
> +struct peci_request *peci_get_dib(struct peci_device *device);
> +struct peci_request *peci_get_temp(struct peci_device *device);
> +
> +struct peci_request *peci_pkg_cfg_readb(struct peci_device *device, u8 index, u16 param);
> +struct peci_request *peci_pkg_cfg_readw(struct peci_device *device, u8 index, u16 param);
> +struct peci_request *peci_pkg_cfg_readl(struct peci_device *device, u8 index, u16 param);
> +struct peci_request *peci_pkg_cfg_readq(struct peci_device *device, u8 index, u16 param);
> +
> +/**
> + * struct peci_device_id - PECI device data to match
> + * @data: pointer to driver private data specific to device
> + * @family: device family
> + * @model: device model
> + */
> +struct peci_device_id {
> +       const void *data;
> +       u16 family;
> +       u8 model;
> +};
> +
>  extern struct device_type peci_device_type;
>  extern const struct attribute_group *peci_device_groups[];
>
> @@ -28,6 +56,53 @@ void peci_device_destroy(struct peci_device *device);
>  extern struct bus_type peci_bus_type;
>  extern const struct attribute_group *peci_bus_groups[];
>
> +/**
> + * struct peci_driver - PECI driver
> + * @driver: inherit device driver
> + * @probe: probe callback
> + * @remove: remove callback
> + * @id_table: PECI device match table to decide which device to bind
> + */
> +struct peci_driver {
> +       struct device_driver driver;
> +       int (*probe)(struct peci_device *device, const struct peci_device_id *id);
> +       void (*remove)(struct peci_device *device);
> +       const struct peci_device_id *id_table;
> +};
> +
> +static inline struct peci_driver *to_peci_driver(struct device_driver *d)
> +{
> +       return container_of(d, struct peci_driver, driver);
> +}
> +
> +int __peci_driver_register(struct peci_driver *driver, struct module *owner,
> +                          const char *mod_name);
> +/**
> + * peci_driver_register() - register PECI driver
> + * @driver: the driver to be registered
> + * @owner: owner module of the driver being registered
> + * @mod_name: module name string
> + *
> + * PECI drivers that don't need to do anything special in module init should
> + * use the convenience "module_peci_driver" macro instead
> + *
> + * Return: zero on success, else a negative error code.
> + */
> +#define peci_driver_register(driver) \
> +       __peci_driver_register(driver, THIS_MODULE, KBUILD_MODNAME)
> +void peci_driver_unregister(struct peci_driver *driver);
> +
> +/**
> + * module_peci_driver() - helper macro for registering a modular PECI driver
> + * @__peci_driver: peci_driver struct
> + *
> + * Helper macro for PECI drivers which do not do anything special in module
> + * init/exit. This eliminates a lot of boilerplate. Each module may only
> + * use this macro once, and calling it replaces module_init() and module_exit()
> + */
> +#define module_peci_driver(__peci_driver) \
> +       module_driver(__peci_driver, peci_driver_register, peci_driver_unregister)
> +
>  extern struct device_type peci_controller_type;
>
>  int peci_controller_scan_devices(struct peci_controller *controller);
> diff --git a/drivers/peci/request.c b/drivers/peci/request.c
> index 81b567bc7b87..fe032d5a5e1b 100644
> --- a/drivers/peci/request.c
> +++ b/drivers/peci/request.c
> @@ -1,13 +1,140 @@
>  // SPDX-License-Identifier: GPL-2.0-only
>  // Copyright (c) 2021 Intel Corporation
>
> +#include <linux/bug.h>
>  #include <linux/export.h>
>  #include <linux/peci.h>
>  #include <linux/slab.h>
>  #include <linux/types.h>
>
> +#include <asm/unaligned.h>
> +
>  #include "internal.h"
>
> +#define PECI_GET_DIB_CMD               0xf7
> +#define  PECI_GET_DIB_WR_LEN           1
> +#define  PECI_GET_DIB_RD_LEN           8
> +
> +#define PECI_RDPKGCFG_CMD              0xa1
> +#define  PECI_RDPKGCFG_WR_LEN          5
> +#define  PECI_RDPKGCFG_RD_LEN_BASE     1
> +#define PECI_WRPKGCFG_CMD              0xa5
> +#define  PECI_WRPKGCFG_WR_LEN_BASE     6
> +#define  PECI_WRPKGCFG_RD_LEN          1
> +
> +/* Device Specific Completion Code (CC) Definition */
> +#define PECI_CC_SUCCESS                                0x40
> +#define PECI_CC_NEED_RETRY                     0x80
> +#define PECI_CC_OUT_OF_RESOURCE                        0x81
> +#define PECI_CC_UNAVAIL_RESOURCE               0x82
> +#define PECI_CC_INVALID_REQ                    0x90
> +#define PECI_CC_MCA_ERROR                      0x91
> +#define PECI_CC_CATASTROPHIC_MCA_ERROR         0x93
> +#define PECI_CC_FATAL_MCA_ERROR                        0x94
> +#define PECI_CC_PARITY_ERR_GPSB_OR_PMSB                0x98
> +#define PECI_CC_PARITY_ERR_GPSB_OR_PMSB_IERR   0x9B
> +#define PECI_CC_PARITY_ERR_GPSB_OR_PMSB_MCA    0x9C
> +
> +#define PECI_RETRY_BIT                 BIT(0)
> +
> +#define PECI_RETRY_TIMEOUT             msecs_to_jiffies(700)
> +#define PECI_RETRY_INTERVAL_MIN                msecs_to_jiffies(1)
> +#define PECI_RETRY_INTERVAL_MAX                msecs_to_jiffies(128)
> +
> +static u8 peci_request_data_cc(struct peci_request *req)
> +{
> +       return req->rx.buf[0];
> +}
> +
> +/**
> + * peci_request_status() - return -errno based on PECI completion code
> + * @req: the PECI request that contains response data with completion code
> + *
> + * It can't be used for Ping(), GetDIB() and GetTemp() - for those commands we
> + * don't expect completion code in the response.
> + *
> + * Return: -errno
> + */
> +int peci_request_status(struct peci_request *req)
> +{
> +       u8 cc = peci_request_data_cc(req);
> +
> +       if (cc != PECI_CC_SUCCESS)
> +               dev_dbg(&req->device->dev, "ret: %#02x\n", cc);
> +
> +       switch (cc) {
> +       case PECI_CC_SUCCESS:
> +               return 0;
> +       case PECI_CC_NEED_RETRY:
> +       case PECI_CC_OUT_OF_RESOURCE:
> +       case PECI_CC_UNAVAIL_RESOURCE:
> +               return -EAGAIN;
> +       case PECI_CC_INVALID_REQ:
> +               return -EINVAL;
> +       case PECI_CC_MCA_ERROR:
> +       case PECI_CC_CATASTROPHIC_MCA_ERROR:
> +       case PECI_CC_FATAL_MCA_ERROR:
> +       case PECI_CC_PARITY_ERR_GPSB_OR_PMSB:
> +       case PECI_CC_PARITY_ERR_GPSB_OR_PMSB_IERR:
> +       case PECI_CC_PARITY_ERR_GPSB_OR_PMSB_MCA:
> +               return -EIO;
> +       }
> +
> +       WARN_ONCE(1, "Unknown PECI completion code: %#02x\n", cc);
> +
> +       return -EIO;
> +}
> +EXPORT_SYMBOL_NS_GPL(peci_request_status, PECI);
> +
> +static int peci_request_xfer(struct peci_request *req)
> +{
> +       struct peci_device *device = req->device;
> +       struct peci_controller *controller = to_peci_controller(device->dev.parent);
> +       int ret;
> +
> +       mutex_lock(&controller->bus_lock);
> +       ret = controller->ops->xfer(controller, device->addr, req);
> +       mutex_unlock(&controller->bus_lock);
> +
> +       return ret;
> +}
> +
> +static int peci_request_xfer_retry(struct peci_request *req)
> +{
> +       long wait_interval = PECI_RETRY_INTERVAL_MIN;
> +       struct peci_device *device = req->device;
> +       struct peci_controller *controller = to_peci_controller(device->dev.parent);
> +       unsigned long start = jiffies;
> +       int ret;
> +
> +       /* Don't try to use it for ping */
> +       if (WARN_ON(!req->rx.buf))
> +               return 0;
> +
> +       do {
> +               ret = peci_request_xfer(req);
> +               if (ret) {
> +                       dev_dbg(&controller->dev, "xfer error: %d\n", ret);
> +                       return ret;
> +               }
> +
> +               if (peci_request_status(req) != -EAGAIN)
> +                       return 0;
> +
> +               /* Set the retry bit to indicate a retry attempt */
> +               req->tx.buf[1] |= PECI_RETRY_BIT;
> +
> +               if (schedule_timeout_interruptible(wait_interval))
> +                       return -ERESTARTSYS;
> +
> +               wait_interval = min_t(long, wait_interval * 2, PECI_RETRY_INTERVAL_MAX);
> +       } while (time_before(jiffies, start + PECI_RETRY_TIMEOUT));
> +
> +       dev_dbg(&controller->dev, "request timed out\n");
> +
> +       return -ETIMEDOUT;
> +}
> +
>  /**
>   * peci_request_alloc() - allocate &struct peci_requests
>   * @device: PECI device to which request is going to be sent
> @@ -48,3 +175,90 @@ void peci_request_free(struct peci_request *req)
>         kfree(req);
>  }
>  EXPORT_SYMBOL_NS_GPL(peci_request_free, PECI);
> +
> +struct peci_request *peci_get_dib(struct peci_device *device)
> +{
> +       struct peci_request *req;
> +       int ret;
> +
> +       req = peci_request_alloc(device, PECI_GET_DIB_WR_LEN, PECI_GET_DIB_RD_LEN);
> +       if (!req)
> +               return ERR_PTR(-ENOMEM);
> +
> +       req->tx.buf[0] = PECI_GET_DIB_CMD;
> +
> +       ret = peci_request_xfer(req);
> +       if (ret) {
> +               peci_request_free(req);
> +               return ERR_PTR(ret);
> +       }
> +
> +       return req;
> +}
> +EXPORT_SYMBOL_NS_GPL(peci_get_dib, PECI);
> +
> +static struct peci_request *
> +__pkg_cfg_read(struct peci_device *device, u8 index, u16 param, u8 len)
> +{
> +       struct peci_request *req;
> +       int ret;
> +
> +       req = peci_request_alloc(device, PECI_RDPKGCFG_WR_LEN, PECI_RDPKGCFG_RD_LEN_BASE + len);
> +       if (!req)
> +               return ERR_PTR(-ENOMEM);
> +
> +       req->tx.buf[0] = PECI_RDPKGCFG_CMD;
> +       req->tx.buf[1] = 0;
> +       req->tx.buf[2] = index;
> +       put_unaligned_le16(param, &req->tx.buf[3]);
> +
> +       ret = peci_request_xfer_retry(req);
> +       if (ret) {
> +               peci_request_free(req);
> +               return ERR_PTR(ret);
> +       }
> +
> +       return req;
> +}
> +
> +u8 peci_request_data_readb(struct peci_request *req)
> +{
> +       return req->rx.buf[1];
> +}
> +EXPORT_SYMBOL_NS_GPL(peci_request_data_readb, PECI);
> +
> +u16 peci_request_data_readw(struct peci_request *req)
> +{
> +       return get_unaligned_le16(&req->rx.buf[1]);
> +}
> +EXPORT_SYMBOL_NS_GPL(peci_request_data_readw, PECI);
> +
> +u32 peci_request_data_readl(struct peci_request *req)
> +{
> +       return get_unaligned_le32(&req->rx.buf[1]);
> +}
> +EXPORT_SYMBOL_NS_GPL(peci_request_data_readl, PECI);
> +
> +u64 peci_request_data_readq(struct peci_request *req)
> +{
> +       return get_unaligned_le64(&req->rx.buf[1]);
> +}
> +EXPORT_SYMBOL_NS_GPL(peci_request_data_readq, PECI);
> +
> +u64 peci_request_data_dib(struct peci_request *req)
> +{
> +       return get_unaligned_le64(&req->rx.buf[0]);
> +}
> +EXPORT_SYMBOL_NS_GPL(peci_request_data_dib, PECI);
> +
> +#define __read_pkg_config(x, type) \
> +struct peci_request *peci_pkg_cfg_##x(struct peci_device *device, u8 index, u16 param) \
> +{ \
> +       return __pkg_cfg_read(device, index, param, sizeof(type)); \
> +} \
> +EXPORT_SYMBOL_NS_GPL(peci_pkg_cfg_##x, PECI)
> +
> +__read_pkg_config(readb, u8);
> +__read_pkg_config(readw, u16);
> +__read_pkg_config(readl, u32);
> +__read_pkg_config(readq, u64);
> diff --git a/include/linux/peci.h b/include/linux/peci.h
> index 26e0a4e73b50..dcf1c53f4e40 100644
> --- a/include/linux/peci.h
> +++ b/include/linux/peci.h
> @@ -14,6 +14,14 @@
>   */
>  #define PECI_REQUEST_MAX_BUF_SIZE 32
>
> +#define PECI_PCS_PKG_ID                        0  /* Package Identifier Read */
> +#define  PECI_PKG_ID_CPU_ID            0x0000  /* CPUID Info */
> +#define  PECI_PKG_ID_PLATFORM_ID       0x0001  /* Platform ID */
> +#define  PECI_PKG_ID_DEVICE_ID         0x0002  /* Uncore Device ID */
> +#define  PECI_PKG_ID_MAX_THREAD_ID     0x0003  /* Max Thread ID */
> +#define  PECI_PKG_ID_MICROCODE_REV     0x0004  /* CPU Microcode Update Revision */
> +#define  PECI_PKG_ID_MCA_ERROR_LOG     0x0005  /* Machine Check Status */
> +
>  struct peci_controller;
>  struct peci_request;
>
> @@ -59,6 +67,11 @@ static inline struct peci_controller *to_peci_controller(void *d)
>   * struct peci_device - PECI device
>   * @dev: device object to register PECI device to the device model
>   * @controller: manages the bus segment hosting this PECI device
> + * @info: PECI device characteristics
> + * @info.family: device family
> + * @info.model: device model
> + * @info.peci_revision: PECI revision supported by the PECI device
> + * @info.socket_id: the socket ID represented by the PECI device
>   * @addr: address used on the PECI bus connected to the parent controller
>   *
>   * A peci_device identifies a single device (i.e. CPU) connected to a PECI bus.
> @@ -67,6 +80,12 @@ static inline struct peci_controller *to_peci_controller(void *d)
>   */
>  struct peci_device {
>         struct device dev;
> +       struct {
> +               u16 family;
> +               u8 model;
> +               u8 peci_revision;
> +               u8 socket_id;
> +       } info;
>         u8 addr;
>  };
>
> diff --git a/lib/Kconfig b/lib/Kconfig
> index e538d4d773bd..7f7972d357c2 100644
> --- a/lib/Kconfig
> +++ b/lib/Kconfig
> @@ -718,4 +718,4 @@ config ASN1_ENCODER
>
>  config GENERIC_LIB_X86
>         bool
> -       depends on X86
> +       depends on X86 || PECI

This looks broken. What in the GENERIC_LIB_X86 implementation depends on PECI?

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 07/15] peci: Add peci-aspeed controller driver
  2021-08-27 16:24       ` Dan Williams
@ 2021-08-29 19:42         ` Winiarska, Iwona
  0 siblings, 0 replies; 49+ messages in thread
From: Winiarska, Iwona @ 2021-08-29 19:42 UTC (permalink / raw)
  To: Williams, Dan J
  Cc: corbet, jae.hyun.yoo, x86, Lutomirski, Andy, linux-hwmon, Luck,
	Tony, andrew, mchehab, jdelvare, linux-kernel, mingo, rdunlap,
	bp, devicetree, tglx, linux-aspeed, olof, arnd, linux, linux-doc,
	robh+dt, openbmc, zweiss, d.mueller, gregkh, joel,
	linux-arm-kernel, andriy.shevchenko, yazen.ghannam,
	pierre-louis.bossart

On Fri, 2021-08-27 at 09:24 -0700, Dan Williams wrote:
> On Thu, Aug 26, 2021 at 4:55 PM Winiarska, Iwona
> <iwona.winiarska@intel.com> wrote:
> > 
> > On Wed, 2021-08-25 at 18:35 -0700, Dan Williams wrote:
> > > On Tue, Aug 3, 2021 at 4:35 AM Iwona Winiarska
> > > <iwona.winiarska@intel.com> wrote:
> > > > 
> > > > From: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
> > > > 
> > > > > ASPEED AST24xx/AST25xx/AST26xx SoCs support the PECI electrical
> > > > > interface (a.k.a. PECI wire).
> > > 
> > > Maybe a one sentence blurb here and in the Kconfig reminding people
> > > why they should care if they have a PECI driver or not?
> > 
> > Ok, I'll expand it a bit.
> [..]
> > > > +static int aspeed_peci_xfer(struct peci_controller *controller,
> > > > +                           u8 addr, struct peci_request *req)
> > > > +{
> > > > +       struct aspeed_peci *priv = dev_get_drvdata(controller-
> > > > >dev.parent);
> > > > +       unsigned long flags, timeout = msecs_to_jiffies(priv-
> > > > > cmd_timeout_ms);
> > > > +       u32 peci_head;
> > > > +       int ret;
> > > > +
> > > > +       if (req->tx.len > ASPEED_PECI_DATA_BUF_SIZE_MAX ||
> > > > +           req->rx.len > ASPEED_PECI_DATA_BUF_SIZE_MAX)
> > > > +               return -EINVAL;
> > > > +
> > > > +       /* Check command sts and bus idle state */
> > > > +       ret = aspeed_peci_check_idle(priv);
> > > > +       if (ret)
> > > > +               return ret; /* -ETIMEDOUT */
> > > > +
> > > > +       spin_lock_irqsave(&priv->lock, flags);
> > > > +       reinit_completion(&priv->xfer_complete);
> > > > +
> > > > +       peci_head = FIELD_PREP(ASPEED_PECI_TARGET_ADDR_MASK, addr) |
> > > > +                   FIELD_PREP(ASPEED_PECI_WR_LEN_MASK, req->tx.len) |
> > > > +                   FIELD_PREP(ASPEED_PECI_RD_LEN_MASK, req->rx.len);
> > > > +
> > > > +       writel(peci_head, priv->base + ASPEED_PECI_RW_LENGTH);
> > > > +
> > > > +       memcpy_toio(priv->base + ASPEED_PECI_WR_DATA0, req->tx.buf,
> > > > min_t(u8, req->tx.len, 16));
> > > > +       if (req->tx.len > 16)
> > > > +               memcpy_toio(priv->base + ASPEED_PECI_WR_DATA4, req-
> > > > >tx.buf +
> > > > 16,
> > > > +                           req->tx.len - 16);
> > > > +
> > > > +       dev_dbg(priv->dev, "HEAD : 0x%08x\n", peci_head);
> > > > +       print_hex_dump_bytes("TX : ", DUMP_PREFIX_NONE, req->tx.buf,
> > > > req-
> > > > > tx.len);
> > > 
> > > On CONFIG_DYNAMIC_DEBUG=n builds the kernel will do all the work of
> > > reading through this buffer, but skip emitting it. Are you sure you
> > > want to pay that overhead for every transaction?
> > 
> > I can remove it or I can add something like:
> > 
> > #if IS_ENABLED(CONFIG_PECI_DEBUG)
> > #define peci_debug(fmt, ...) pr_debug(fmt, ##__VA_ARGS__)
> > #else
> > #define peci_debug(...) do { } while (0)
> > #endif
> 
> It's the hex dump I'm worried about, not the debug statements as much.
> 
> I think the choices are: remove the print_hex_dump_bytes(), put it
> behind an IS_ENABLED(CONFIG_DYNAMIC_DEBUG) check so the overhead is
> skipped in the CONFIG_DYNAMIC_DEBUG=n case, or live with the overhead
> if this is not a fast path or is used infrequently.

I will place it behind IS_ENABLED(CONFIG_DYNAMIC_DEBUG).
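
Something along these lines, shown here as a userspace sketch rather
than the driver change itself. DEBUG_DUMP is a hypothetical build-time
switch standing in for IS_ENABLED(CONFIG_DYNAMIC_DEBUG), and hex_dump(),
xfer() and bytes_formatted are made-up names for illustration:

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical build-time switch standing in for CONFIG_DYNAMIC_DEBUG. */
#ifndef DEBUG_DUMP
#define DEBUG_DUMP 0
#endif

/* Counts how many bytes were actually walked by the dump. */
static unsigned int bytes_formatted;

/* Formatting walks the whole buffer, which is the cost to avoid. */
static void hex_dump(const char *prefix, const unsigned char *buf, size_t len)
{
	size_t i;

	printf("%s", prefix);
	for (i = 0; i < len; i++) {
		printf("%02x ", buf[i]);
		bytes_formatted++;
	}
	printf("\n");
}

static int xfer(const unsigned char *tx, size_t len)
{
	/*
	 * A constant condition keeps the call type-checked while letting
	 * the compiler drop it entirely when the switch is off.
	 */
	if (DEBUG_DUMP)
		hex_dump("TX : ", tx, len);

	return 0;
}
```

Using a plain "if" on a constant instead of an #ifdef keeps the dump
call compiled in every configuration, so it cannot silently bit-rot.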

> 
> > 
> > (and similar peci_trace with trace_printk for usage in IRQ handlers and
> > such).
> > 
> > What do you think?
> 
> In general, no, don't wrap the base level print routines with
> driver-specific ones. Also, trace_printk() is only for debug builds.
> Note that tracepoints are built to have even less overhead than
> dev_dbg(), so there's no overhead concern with disabled tracepoints;
> they literally translate to nops when disabled.

Ack.

> 
> > 
> > > 
> > > > +
> > > > +       priv->status = 0;
> > > > +       writel(ASPEED_PECI_CMD_FIRE, priv->base + ASPEED_PECI_CMD);
> > > > +       spin_unlock_irqrestore(&priv->lock, flags);
> > > > +
> > > > +       ret = wait_for_completion_interruptible_timeout(&priv-
> > > > > xfer_complete, timeout);
> > > 
> > > spin_lock_irqsave() says "I don't know if interrupts are disabled
> > > already, so I'll save the state, whatever it is, and restore later"
> > > 
> > > wait_for_completion_interruptible_timeout() says "I know I am in a
> > > sleepable context where interrupts are enabled"
> > > 
> > > So, one of those is wrong, i.e. should it be spin_{lock,unlock}_irq()?
> > 
> > You're right - I'll fix it.
> > 
> > > 
> > > 
> > > > +       if (ret < 0)
> > > > +               return ret;
> > > > +
> > > > +       if (ret == 0) {
> > > > +               dev_dbg(priv->dev, "Timeout waiting for a response!\n");
> > > > +               return -ETIMEDOUT;
> > > > +       }
> > > > +
> > > > +       spin_lock_irqsave(&priv->lock, flags);
> > > > +
> > > > +       writel(0, priv->base + ASPEED_PECI_CMD);
> > > > +
> > > > +       if (priv->status != ASPEED_PECI_INT_CMD_DONE) {
> > > > +               spin_unlock_irqrestore(&priv->lock, flags);
> > > > +               dev_dbg(priv->dev, "No valid response!\n");
> > > > +               return -EIO;
> > > > +       }
> > > > +
> > > > +       spin_unlock_irqrestore(&priv->lock, flags);
> > > > +
> > > > +       memcpy_fromio(req->rx.buf, priv->base + ASPEED_PECI_RD_DATA0,
> > > > min_t(u8, req->rx.len, 16));
> > > > +       if (req->rx.len > 16)
> > > > +               memcpy_fromio(req->rx.buf + 16, priv->base +
> > > > ASPEED_PECI_RD_DATA4,
> > > > +                             req->rx.len - 16);
> > > > +
> > > > +       print_hex_dump_bytes("RX : ", DUMP_PREFIX_NONE, req->rx.buf,
> > > > req-
> > > > > rx.len);
> > > > +
> > > > +       return 0;
> > > > +}
> > > > +
> > > > +static irqreturn_t aspeed_peci_irq_handler(int irq, void *arg)
> > > > +{
> > > > +       struct aspeed_peci *priv = arg;
> > > > +       u32 status;
> > > > +
> > > > +       spin_lock(&priv->lock);
> > > > +       status = readl(priv->base + ASPEED_PECI_INT_STS);
> > > > +       writel(status, priv->base + ASPEED_PECI_INT_STS);
> > > > +       priv->status |= (status & ASPEED_PECI_INT_MASK);
> > > > +
> > > > +       /*
> > > > +        * In most cases, interrupt bits will be set one by one but also
> > > > note
> > > > +        * that multiple interrupt bits could be set at the same time.
> > > > +        */
> > > > +       if (status & ASPEED_PECI_INT_BUS_TIMEOUT)
> > > > +               dev_dbg_ratelimited(priv->dev,
> > > > "ASPEED_PECI_INT_BUS_TIMEOUT\n");
> > > > +
> > > > +       if (status & ASPEED_PECI_INT_BUS_CONTENTION)
> > > > +               dev_dbg_ratelimited(priv->dev,
> > > > "ASPEED_PECI_INT_BUS_CONTENTION\n");
> > > > +
> > > > +       if (status & ASPEED_PECI_INT_WR_FCS_BAD)
> > > > +               dev_dbg_ratelimited(priv->dev,
> > > > "ASPEED_PECI_INT_WR_FCS_BAD\n");
> > > > +
> > > > +       if (status & ASPEED_PECI_INT_WR_FCS_ABORT)
> > > > +               dev_dbg_ratelimited(priv->dev,
> > > > "ASPEED_PECI_INT_WR_FCS_ABORT\n");
> > > 
> > > Are you sure these would not be better as tracepoints? If you're
> > > debugging an interrupt related failure, the ratelimiting might get in
> > > your way when you really need to know when one of these error
> > > interrupts fire relative to another event.
> > 
> > > Tracepoints are ABI(ish), and using a full blown tracepoint just for
> > > IRQ status would probably be too much.
> 
> Tracepoints become ABI once someone ships tooling that depends on them
> being there. These don't look attractive for a tool, and they don't
> look difficult to maintain if the interrupt handler needs to be
> reworked. I.e. it would be trivial to keep a dead tracepoint around if
> worse came to worst to keep a tool from failing to load.

After more consideration, I would prefer to remove these logs for now - in case
of error I'll log full status in xfer().

> 
> > I was thinking about something like trace_printk hidden under a
> > "CONFIG_PECI_DEBUG" (see above), but perhaps that's something for the future
> > improvement?
> 
> Again trace_printk() is only for private builds.
> 
> > 
> > > 
> > > > +
> > > > +       /*
> > > > +        * All commands should be ended up with a
> > > > ASPEED_PECI_INT_CMD_DONE
> > > > bit
> > > > +        * set even in an error case.
> > > > +        */
> > > > +       if (status & ASPEED_PECI_INT_CMD_DONE)
> > > > +               complete(&priv->xfer_complete);
> > > 
> > > Hmm, no need to check if there was a sequencing error, like a command
> > > was never submitted?
> > 
> > > It's handled by checking if HW is idle in xfer before a command is
> > > sent, where we just expect a single interrupt per command.
> 
> I'm asking how do you determine if this status was spurious, or there
> was a sequencing error in the driver?

I don't think we have any means to determine it.
PECI itself doesn't provide any mechanism to verify it (there is no sequence
number or tag to match request/response).
We're relying on the fact that BMC is a requester and initiates communication
with CPU - the interrupt won't be generated if BMC doesn't send any request.

Thanks
-Iwona

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 01/15] x86/cpu: Move intel-family to arch-independent headers
  2021-08-03 11:31 ` [PATCH v2 01/15] x86/cpu: Move intel-family to arch-independent headers Iwona Winiarska
@ 2021-10-04 19:03   ` Borislav Petkov
  2021-10-11 19:21     ` Winiarska, Iwona
  0 siblings, 1 reply; 49+ messages in thread
From: Borislav Petkov @ 2021-10-04 19:03 UTC (permalink / raw)
  To: Iwona Winiarska
  Cc: linux-kernel, openbmc, Greg Kroah-Hartman, x86, devicetree,
	linux-aspeed, linux-arm-kernel, linux-hwmon, linux-doc,
	Rob Herring, Joel Stanley, Andrew Jeffery, Jean Delvare,
	Guenter Roeck, Arnd Bergmann, Olof Johansson, Jonathan Corbet,
	Thomas Gleixner, Andy Lutomirski, Ingo Molnar, Yazen Ghannam,
	Mauro Carvalho Chehab, Pierre-Louis Bossart, Tony Luck,
	Andy Shevchenko, Jae Hyun Yoo, Dan Williams, Randy Dunlap,
	Zev Weiss, David Muller

On Tue, Aug 03, 2021 at 01:31:20PM +0200, Iwona Winiarska wrote:
> Baseboard management controllers (BMC) often run Linux but are usually
> implemented with non-X86 processors. They can use PECI to access package
> config space (PCS) registers on the host CPU and since some information,
> e.g. figuring out the core count, can be obtained using different
> registers on different CPU generations, they need to decode the family
> and model.
> 
> Move the data from arch/x86/include/asm/intel-family.h into a new file
> include/linux/x86/intel-family.h so that it can be used by other
> architectures.
> 
> Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> Reviewed-by: Dan Williams <dan.j.williams@intel.com>
> ---
> To limit tree-wide changes and help people that were expecting
> intel-family defines in arch/x86 to find it more easily without going
> through git history, we're not removing the original header
> completely, we're keeping it as a "stub" that includes the new one.
> If there is a consensus that the tree-wide option is better,
> we can choose this approach.

Why can't the linux/ namespace header include the x86 one so that
nothing changes for arch/x86/?

And if it is really only a handful of families you need, you might just
as well copy them into the peci headers and slap a comment above it
saying where they come from and save yourself all that churn...

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 02/15] x86/cpu: Extract cpuid helpers to arch-independent
  2021-08-03 11:31 ` [PATCH v2 02/15] x86/cpu: Extract cpuid helpers to arch-independent Iwona Winiarska
@ 2021-10-04 19:08   ` Borislav Petkov
  2021-10-11 19:32     ` Winiarska, Iwona
  0 siblings, 1 reply; 49+ messages in thread
From: Borislav Petkov @ 2021-10-04 19:08 UTC (permalink / raw)
  To: Iwona Winiarska
  Cc: linux-kernel, openbmc, Greg Kroah-Hartman, x86, devicetree,
	linux-aspeed, linux-arm-kernel, linux-hwmon, linux-doc,
	Rob Herring, Joel Stanley, Andrew Jeffery, Jean Delvare,
	Guenter Roeck, Arnd Bergmann, Olof Johansson, Jonathan Corbet,
	Thomas Gleixner, Andy Lutomirski, Ingo Molnar, Yazen Ghannam,
	Mauro Carvalho Chehab, Pierre-Louis Bossart, Tony Luck,
	Andy Shevchenko, Jae Hyun Yoo, Dan Williams, Randy Dunlap,
	Zev Weiss, David Muller

On Tue, Aug 03, 2021 at 01:31:21PM +0200, Iwona Winiarska wrote:
> Baseboard management controllers (BMC) often run Linux but are usually
> implemented with non-X86 processors. They can use PECI to access package
> config space (PCS) registers on the host CPU and since some information,
> e.g. figuring out the core count, can be obtained using different
> registers on different CPU generations, they need to decode the family
> and model.
> 
> The format of Package Identifier PCS register that describes CPUID
> information has the same layout as CPUID_1.EAX, so let's allow to reuse
> cpuid helpers by making it available for other architectures as well.
> 
> Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> Reviewed-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  MAINTAINERS                      | 1 +
>  arch/x86/Kconfig                 | 1 +
>  arch/x86/include/asm/cpu.h       | 3 ---
>  arch/x86/include/asm/microcode.h | 2 +-
>  arch/x86/kvm/cpuid.h             | 3 ++-
>  arch/x86/lib/Makefile            | 2 +-
>  drivers/edac/mce_amd.c           | 3 +--
>  include/linux/x86/cpu.h          | 9 +++++++++
>  lib/Kconfig                      | 4 ++++
>  lib/Makefile                     | 2 ++
>  lib/x86/Makefile                 | 3 +++
>  {arch/x86/lib => lib/x86}/cpu.c  | 2 +-
>  12 files changed, 26 insertions(+), 9 deletions(-)
>  create mode 100644 include/linux/x86/cpu.h
>  create mode 100644 lib/x86/Makefile
>  rename {arch/x86/lib => lib/x86}/cpu.c (95%)

AFAICT, all that churn is done for x86_family() and x86_model() which
are used *exactly* *once* and which are almost trivial anyway.

What's wrong with simply computing the family and model "by hand", so to
speak, in peci_device_info_init() and do away with that diffstat

 12 files changed, 26 insertions(+), 9 deletions(-)

?

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 49+ messages in thread
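
A minimal sketch of what computing the family and model "by hand", as
suggested in the message above, might look like. It assumes only that the
value read over PECI has the same layout as CPUID_1.EAX, as the commit
message states. The helper names peci_sig_family() and peci_sig_model() are
hypothetical and are not taken from the series.

```c
#include <stdint.h>

/*
 * Hypothetical PECI-local helpers (illustrative names only).
 * Both decode a 32-bit value laid out like CPUID.1:EAX.
 */
static inline unsigned int peci_sig_family(uint32_t sig)
{
	unsigned int family = (sig >> 8) & 0xf;	/* base family, bits 11:8 */

	/* The extended family (bits 27:20) is added only when base == 0xf. */
	if (family == 0xf)
		family += (sig >> 20) & 0xff;

	return family;
}

static inline unsigned int peci_sig_model(uint32_t sig)
{
	unsigned int model = (sig >> 4) & 0xf;	/* base model, bits 7:4 */

	/* The extended model (bits 19:16) supplies the high nibble for family >= 6. */
	if (peci_sig_family(sig) >= 0x6)
		model += ((sig >> 16) & 0xf) << 4;

	return model;
}
```

With these two helpers the decode is a few shifts and masks; the trade-off
discussed in the thread is whether duplicating that logic outside arch/x86
is preferable to sharing it.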

* Re: [PATCH v2 01/15] x86/cpu: Move intel-family to arch-independent headers
  2021-10-04 19:03   ` Borislav Petkov
@ 2021-10-11 19:21     ` Winiarska, Iwona
  2021-10-11 19:40       ` Dave Hansen
  2021-10-11 20:06       ` Borislav Petkov
  0 siblings, 2 replies; 49+ messages in thread
From: Winiarska, Iwona @ 2021-10-11 19:21 UTC (permalink / raw)
  To: bp
  Cc: corbet, jae.hyun.yoo, d.mueller, linux-hwmon, andrew, Luck, Tony,
	Lutomirski, Andy, andriy.shevchenko, mchehab, jdelvare,
	linux-kernel, olof, mingo, rdunlap, devicetree, tglx,
	linux-aspeed, arnd, linux-doc, linux, zweiss, robh+dt, openbmc,
	gregkh, joel, yazen.ghannam, linux-arm-kernel,
	pierre-louis.bossart, x86, Williams, Dan J

On Mon, 2021-10-04 at 21:03 +0200, Borislav Petkov wrote:
> On Tue, Aug 03, 2021 at 01:31:20PM +0200, Iwona Winiarska wrote:
> > Baseboard management controllers (BMC) often run Linux but are usually
> > implemented with non-X86 processors. They can use PECI to access package
> > config space (PCS) registers on the host CPU and since some information,
> > e.g. figuring out the core count, can be obtained using different
> > registers on different CPU generations, they need to decode the family
> > and model.
> > 
> > Move the data from arch/x86/include/asm/intel-family.h into a new file
> > include/linux/x86/intel-family.h so that it can be used by other
> > architectures.
> > 
> > Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com>
> > Reviewed-by: Tony Luck <tony.luck@intel.com>
> > Reviewed-by: Dan Williams <dan.j.williams@intel.com>
> > ---
> > To limit tree-wide changes and help people that were expecting
> > intel-family defines in arch/x86 to find it more easily without going
> > through git history, we're not removing the original header
> > completely, we're keeping it as a "stub" that includes the new one.
> > If there is a consensus that the tree-wide option is better,
> > we can choose this approach.
> 
> Why can't the linux/ namespace header include the x86 one so that
> nothing changes for arch/x86/?

Same reason why PECI can't just include arch/x86 directly (we're building for
ARM, not x86).

> And if it is really only a handful of families you need, you might just
> as well copy them into the peci headers and slap a comment above it
> saying where they come from and save yourself all that churn...

It's a handful of families for now - but I do expect the list to grow once new
platforms are introduced (and with that - duplicates have to be added in both
places).

Since the churn is relatively low I wanted to start with trying to keep things
clean first.
If you're against that - sure, we can duplicate. 

Thanks
-Iwona

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 02/15] x86/cpu: Extract cpuid helpers to arch-independent
  2021-10-04 19:08   ` Borislav Petkov
@ 2021-10-11 19:32     ` Winiarska, Iwona
  0 siblings, 0 replies; 49+ messages in thread
From: Winiarska, Iwona @ 2021-10-11 19:32 UTC (permalink / raw)
  To: bp
  Cc: corbet, jae.hyun.yoo, d.mueller, linux-hwmon, andrew, Luck, Tony,
	Lutomirski, Andy, andriy.shevchenko, mchehab, jdelvare,
	linux-kernel, olof, mingo, rdunlap, devicetree, tglx,
	linux-aspeed, arnd, linux-doc, linux, zweiss, robh+dt, openbmc,
	gregkh, joel, yazen.ghannam, linux-arm-kernel,
	pierre-louis.bossart, x86, Williams, Dan J

On Mon, 2021-10-04 at 21:08 +0200, Borislav Petkov wrote:
> On Tue, Aug 03, 2021 at 01:31:21PM +0200, Iwona Winiarska wrote:
> > Baseboard management controllers (BMC) often run Linux but are usually
> > implemented with non-X86 processors. They can use PECI to access package
> > config space (PCS) registers on the host CPU and since some information,
> > e.g. figuring out the core count, can be obtained using different
> > registers on different CPU generations, they need to decode the family
> > and model.
> > 
> > The format of Package Identifier PCS register that describes CPUID
> > information has the same layout as CPUID_1.EAX, so let's allow to reuse
> > cpuid helpers by making it available for other architectures as well.
> > 
> > Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com>
> > Reviewed-by: Tony Luck <tony.luck@intel.com>
> > Reviewed-by: Dan Williams <dan.j.williams@intel.com>
> > ---
> >  MAINTAINERS                      | 1 +
> >  arch/x86/Kconfig                 | 1 +
> >  arch/x86/include/asm/cpu.h       | 3 ---
> >  arch/x86/include/asm/microcode.h | 2 +-
> >  arch/x86/kvm/cpuid.h             | 3 ++-
> >  arch/x86/lib/Makefile            | 2 +-
> >  drivers/edac/mce_amd.c           | 3 +--
> >  include/linux/x86/cpu.h          | 9 +++++++++
> >  lib/Kconfig                      | 4 ++++
> >  lib/Makefile                     | 2 ++
> >  lib/x86/Makefile                 | 3 +++
> >  {arch/x86/lib => lib/x86}/cpu.c  | 2 +-
> >  12 files changed, 26 insertions(+), 9 deletions(-)
> >  create mode 100644 include/linux/x86/cpu.h
> >  create mode 100644 lib/x86/Makefile
> >  rename {arch/x86/lib => lib/x86}/cpu.c (95%)
> 
> AFAICT, all that churn is done for x86_family() and x86_model() which
> are used *exactly* *once* and which are almost trivial anyway.

Correct.

> What's wrong with simply computing the family and model "by hand", so to
> speak, in peci_device_info_init() and do away with that diffstat
> 
>  12 files changed, 26 insertions(+), 9 deletions(-)
> 
> ?

Nothing wrong - just a trade-off between churn and keeping things tidy and not
duplicated, similar to patch 1.
And just like in patch 1, if you have a strong opinion against it - we can
duplicate. 

Thanks
-Iwona


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 01/15] x86/cpu: Move intel-family to arch-independent headers
  2021-10-11 19:21     ` Winiarska, Iwona
@ 2021-10-11 19:40       ` Dave Hansen
  2021-10-11 20:53         ` Winiarska, Iwona
  2021-10-11 20:06       ` Borislav Petkov
  1 sibling, 1 reply; 49+ messages in thread
From: Dave Hansen @ 2021-10-11 19:40 UTC (permalink / raw)
  To: Winiarska, Iwona, bp
  Cc: corbet, jae.hyun.yoo, d.mueller, linux-hwmon, andrew, Luck, Tony,
	Lutomirski, Andy, andriy.shevchenko, mchehab, jdelvare,
	linux-kernel, olof, mingo, rdunlap, devicetree, tglx,
	linux-aspeed, arnd, linux-doc, linux, zweiss, robh+dt, openbmc,
	gregkh, joel, yazen.ghannam, linux-arm-kernel,
	pierre-louis.bossart, x86, Williams, Dan J

On 10/11/21 12:21 PM, Winiarska, Iwona wrote:
> On Mon, 2021-10-04 at 21:03 +0200, Borislav Petkov wrote:
>> On Tue, Aug 03, 2021 at 01:31:20PM +0200, Iwona Winiarska wrote:
>>> Baseboard management controllers (BMC) often run Linux but are usually
>>> implemented with non-X86 processors. They can use PECI to access package
>>> config space (PCS) registers on the host CPU and since some information,
>>> e.g. figuring out the core count, can be obtained using different
>>> registers on different CPU generations, they need to decode the family
>>> and model.
>>>
>>> Move the data from arch/x86/include/asm/intel-family.h into a new file
>>> include/linux/x86/intel-family.h so that it can be used by other
>>> architectures.
>>>
>>> Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com>
>>> Reviewed-by: Tony Luck <tony.luck@intel.com>
>>> Reviewed-by: Dan Williams <dan.j.williams@intel.com>
>>> ---
>>> To limit tree-wide changes and help people that were expecting
>>> intel-family defines in arch/x86 to find it more easily without going
>>> through git history, we're not removing the original header
>>> completely, we're keeping it as a "stub" that includes the new one.
>>> If there is a consensus that the tree-wide option is better,
>>> we can choose this approach.
>> Why can't the linux/ namespace header include the x86 one so that
>> nothing changes for arch/x86/?
> Same reason why PECI can't just include arch/x86 directly (we're building for
> ARM, not x86).
If you're in include/linux/x86-hacks.h, what prevents you from doing

#include "../../arch/x86/include/asm/intel-family.h"

?

In the end, to the compiler, it's just a file in a weird location in the
tree.  I think I'd prefer one weird include to moving that file out of
arch/x86.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 01/15] x86/cpu: Move intel-family to arch-independent headers
  2021-10-11 19:21     ` Winiarska, Iwona
  2021-10-11 19:40       ` Dave Hansen
@ 2021-10-11 20:06       ` Borislav Petkov
  2021-10-11 20:38         ` Winiarska, Iwona
  1 sibling, 1 reply; 49+ messages in thread
From: Borislav Petkov @ 2021-10-11 20:06 UTC (permalink / raw)
  To: Winiarska, Iwona
  Cc: corbet, jae.hyun.yoo, d.mueller, linux-hwmon, andrew, Luck, Tony,
	Lutomirski, Andy, andriy.shevchenko, mchehab, jdelvare,
	linux-kernel, olof, mingo, rdunlap, devicetree, tglx,
	linux-aspeed, arnd, linux-doc, linux, zweiss, robh+dt, openbmc,
	gregkh, joel, yazen.ghannam, linux-arm-kernel,
	pierre-louis.bossart, x86, Williams, Dan J

On Mon, Oct 11, 2021 at 07:21:26PM +0000, Winiarska, Iwona wrote:
> Same reason why PECI can't just include arch/x86 directly (we're building for
> ARM, not x86).

Aha.

So what do you need those INTEL_FAM6* defines for?

I see peci_cpu_device_ids[] which are used to match the CPU so at least
that thing must be loading on x86 hardware... reading your 0th message,
it sounds like that peci-cpu thing is loaded on an x86 CPU and it then
exposes those interfaces which a PECI controller accesses.

And then I see in init_core_mask() the single usage of INTEL_FAM6* and
that drivers/hwmon/peci/cputemp.c is a CPU temp monitoring client so
that thing probably runs on x86 too.

Or?

If it does, then you don't need the code move.

But it looks like I'm missing something...

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 01/15] x86/cpu: Move intel-family to arch-independent headers
  2021-10-11 20:06       ` Borislav Petkov
@ 2021-10-11 20:38         ` Winiarska, Iwona
  2021-10-11 21:31           ` Borislav Petkov
  0 siblings, 1 reply; 49+ messages in thread
From: Winiarska, Iwona @ 2021-10-11 20:38 UTC (permalink / raw)
  To: bp
  Cc: corbet, jae.hyun.yoo, x86, Lutomirski, Andy, linux-hwmon, Luck,
	Tony, andrew, Williams, Dan J, mchehab, jdelvare, linux-kernel,
	mingo, rdunlap, devicetree, tglx, linux-aspeed, olof, arnd,
	linux, linux-doc, robh+dt, openbmc, zweiss, d.mueller, gregkh,
	joel, linux-arm-kernel, andriy.shevchenko, yazen.ghannam,
	pierre-louis.bossart

On Mon, 2021-10-11 at 22:06 +0200, Borislav Petkov wrote:
> On Mon, Oct 11, 2021 at 07:21:26PM +0000, Winiarska, Iwona wrote:
> > Same reason why PECI can't just include arch/x86 directly (we're building for
> > ARM, not x86).
> 
> Aha.
> 
> So what do you need those INTEL_FAM6* defines for?

To identify the x86 CPU and use that as a match for binding PECI drivers.

> I see peci_cpu_device_ids[] which are used to match the CPU so at least
> that thing must be loading on x86 hardware... reading your 0th message,
> it sounds like that peci-cpu thing is loaded on an x86 CPU and it then
> exposes those interfaces which a PECI controller accesses.

Everything that's part of this series runs on the BMC (Baseboard Management
Controller). There's nothing ARM specific to it - it's just that the BMC
hardware we're currently supporting is ARM-based.
PECI is an interface that's exposed by some x86 CPUs - but that's a hardware
interface (available completely independent from whatever is actually running on
the x86 CPU).

> 
> And then I see in init_core_mask() the single usage of INTEL_FAM6* and
> that drivers/hwmon/peci/cputemp.c is a CPU temp monitoring client so
> that thing probably runs on x86 too.

That also runs on the BMC; it uses functionality offered by peci-cpu to expose
a hwmon interface to userspace.
Userspace that makes use of that hwmon interface also runs on the BMC and
exposes sensor data to the user (via the Redfish API or a web-based interface).

> Or?
> 
> If it does, then you don't need the code move.
> 
> But it looks like I'm missing something...
> 

I'm sorry - it looks like my description in the cover letter wasn't clear
enough.

Thanks
-Iwona


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 01/15] x86/cpu: Move intel-family to arch-independent headers
  2021-10-11 19:40       ` Dave Hansen
@ 2021-10-11 20:53         ` Winiarska, Iwona
  2021-10-11 23:12           ` Dave Hansen
  0 siblings, 1 reply; 49+ messages in thread
From: Winiarska, Iwona @ 2021-10-11 20:53 UTC (permalink / raw)
  To: Hansen, Dave, bp
  Cc: corbet, jae.hyun.yoo, x86, Lutomirski, Andy, linux-hwmon, Luck,
	Tony, andrew, Williams, Dan J, mchehab, jdelvare, linux-kernel,
	mingo, rdunlap, devicetree, tglx, linux-aspeed, olof, arnd,
	linux, linux-doc, robh+dt, openbmc, zweiss, d.mueller, gregkh,
	joel, linux-arm-kernel, andriy.shevchenko, yazen.ghannam,
	pierre-louis.bossart

On Mon, 2021-10-11 at 12:40 -0700, Dave Hansen wrote:
> On 10/11/21 12:21 PM, Winiarska, Iwona wrote:
> > On Mon, 2021-10-04 at 21:03 +0200, Borislav Petkov wrote:
> > > On Tue, Aug 03, 2021 at 01:31:20PM +0200, Iwona Winiarska wrote:
> > > > Baseboard management controllers (BMC) often run Linux but are usually
> > > > implemented with non-X86 processors. They can use PECI to access package
> > > > config space (PCS) registers on the host CPU and since some information,
> > > > e.g. figuring out the core count, can be obtained using different
> > > > registers on different CPU generations, they need to decode the family
> > > > and model.
> > > > 
> > > > Move the data from arch/x86/include/asm/intel-family.h into a new file
> > > > include/linux/x86/intel-family.h so that it can be used by other
> > > > architectures.
> > > > 
> > > > Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com>
> > > > Reviewed-by: Tony Luck <tony.luck@intel.com>
> > > > Reviewed-by: Dan Williams <dan.j.williams@intel.com>
> > > > ---
> > > > To limit tree-wide changes and help people that were expecting
> > > > intel-family defines in arch/x86 to find it more easily without going
> > > > through git history, we're not removing the original header
> > > > completely, we're keeping it as a "stub" that includes the new one.
> > > > If there is a consensus that the tree-wide option is better,
> > > > we can choose this approach.
> > > Why can't the linux/ namespace header include the x86 one so that
> > > nothing changes for arch/x86/?
> > > Same reason why PECI can't just include arch/x86 directly (we're building for
> > > ARM, not x86).
> If you're in include/linux/x86-hacks.h, what prevents you from doing
> 
> #include "../../arch/x86/include/asm/intel-family.h"
> 
> ?
> 
> In the end, to the compiler, it's just a file in a weird location in the
> tree.  I think I'd prefer one weird include to moving that file out of
> arch/x86.

Using relative includes in include/linux is uncommon (I can see just one usage
in libfdt.h pulling stuff from scripts), so I thought I couldn't use it this way
(it seems slightly hacky to pull stuff from outside the include path).

But if that would be ok, it looks like a good alternative to avoid duplication
in this case.

Thanks
-Iwona

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 01/15] x86/cpu: Move intel-family to arch-independent headers
  2021-10-11 20:38         ` Winiarska, Iwona
@ 2021-10-11 21:31           ` Borislav Petkov
  2021-10-12 23:15             ` Winiarska, Iwona
  0 siblings, 1 reply; 49+ messages in thread
From: Borislav Petkov @ 2021-10-11 21:31 UTC (permalink / raw)
  To: Winiarska, Iwona
  Cc: corbet, jae.hyun.yoo, x86, Lutomirski, Andy, linux-hwmon, Luck,
	Tony, andrew, Williams, Dan J, mchehab, jdelvare, linux-kernel,
	mingo, rdunlap, devicetree, tglx, linux-aspeed, olof, arnd,
	linux, linux-doc, robh+dt, openbmc, zweiss, d.mueller, gregkh,
	joel, linux-arm-kernel, andriy.shevchenko, yazen.ghannam,
	pierre-louis.bossart

On Mon, Oct 11, 2021 at 08:38:43PM +0000, Winiarska, Iwona wrote:
> Everything that's part of this series runs on the BMC (Baseboard
> Management Controller). There's nothing ARM specific to it - it's just
> that the BMC hardware we're currently supporting is ARM-based. PECI is
> an interface that's exposed by some x86 CPUs - but that's a hardware
> interface (available completely independent from whatever is actually
> running on the x86 CPU).

Aha, I think I got it: so this whole PECI pile is supposed to run on
the BMC - which can be ARM but doesn't have to be, i.e., code should be
generic enough - and the interfaces to the x86 CPU do get exposed to the
Linux running on the BMC.

Which brings me to the answer to your other mail:

On Mon, Oct 11, 2021 at 07:32:38PM +0000, Winiarska, Iwona wrote:
> Nothing wrong - just a trade-off between churn and keeping things tidy
> and not duplicated, similar to patch 1. And just like in patch 1, if
> you have a strong opinion against it - we can duplicate.

So it is not about strong opinion. Rather, it is about whether this
exporting would be disadvantageous for x86 freedom. And I think it will
be:

Because if you exported those and then we went and changed those
interfaces and defines (changed their naming, function arguments,
whatever) and something outside of x86 used them, we will break that
something.

And usually we go and fix those users too but I doubt anyone has access
to that PECI hw to actually test fixes, etc, etc.

So I'd prefer the small amount of duplication vs external stuff using
x86 facilities any day of the week. And so I'd suggest you simply copy
the handful of functions and defines you're gonna be needing and be
done with it.

Dave's idea makes sense to me too but lately it keeps happening that
we change something in x86-land and it turns out something "from the
outside" is using it and it breaks, so it is a lot easier if things are
independent.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 49+ messages in thread
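
As a rough illustration of the duplication approach suggested in the message
above, a PECI-local header could carry its own copies of the few model
numbers it needs, with a comment naming their origin. The macro names, the
peci_model_known() helper, and the particular selection of models are
assumptions made for this sketch; the numeric values mirror the INTEL_FAM6_*
model numbers in arch/x86/include/asm/intel-family.h.

```c
#include <stdbool.h>

/*
 * Hypothetical PECI-local copies of Intel family 6 model numbers.
 * Values mirror arch/x86/include/asm/intel-family.h; keep them in sync
 * by hand when new platforms are added.
 */
#define PECI_INTEL_FAM6_HASWELL_X	0x3F
#define PECI_INTEL_FAM6_BROADWELL_X	0x4F
#define PECI_INTEL_FAM6_SKYLAKE_X	0x55
#define PECI_INTEL_FAM6_ICELAKE_X	0x6A
#define PECI_INTEL_FAM6_ICELAKE_D	0x6C

/* Illustrative helper: is this family 6 model one of the local copies? */
static inline bool peci_model_known(unsigned int model)
{
	switch (model) {
	case PECI_INTEL_FAM6_HASWELL_X:
	case PECI_INTEL_FAM6_BROADWELL_X:
	case PECI_INTEL_FAM6_SKYLAKE_X:
	case PECI_INTEL_FAM6_ICELAKE_X:
	case PECI_INTEL_FAM6_ICELAKE_D:
		return true;
	default:
		return false;
	}
}
```

The cost, as noted earlier in the thread, is that every new platform has to
be added in both places; the benefit is that changes inside arch/x86 cannot
break the PECI code.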

* Re: [PATCH v2 01/15] x86/cpu: Move intel-family to arch-independent headers
  2021-10-11 20:53         ` Winiarska, Iwona
@ 2021-10-11 23:12           ` Dave Hansen
  0 siblings, 0 replies; 49+ messages in thread
From: Dave Hansen @ 2021-10-11 23:12 UTC (permalink / raw)
  To: Winiarska, Iwona, bp
  Cc: corbet, jae.hyun.yoo, x86, Lutomirski, Andy, linux-hwmon, Luck,
	Tony, andrew, Williams, Dan J, mchehab, jdelvare, linux-kernel,
	mingo, rdunlap, devicetree, tglx, linux-aspeed, olof, arnd,
	linux, linux-doc, robh+dt, openbmc, zweiss, d.mueller, gregkh,
	joel, linux-arm-kernel, andriy.shevchenko, yazen.ghannam,
	pierre-louis.bossart

On 10/11/21 1:53 PM, Winiarska, Iwona wrote:
>> If you're in include/linux/x86-hacks.h, what prevents you from doing
>>
>> #include "../../arch/x86/include/asm/intel-family.h"
>>
>> ?
>>
>> In the end, to the compiler, it's just a file in a weird location in the
>> tree.  I think I'd prefer one weird include to moving that file out of
>> arch/x86.
> Using relative includes in include/linux is uncommon (I can see just one usage
> in libfdt.h pulling stuff from scripts), so I thought I couldn't use it this way
> (it seems slightly hacky to pull stuff from outside the include path).
> 
> But if that would be ok, it looks like a good alternative to avoid duplication
> in this case.

If you don't want to do it from a header, you can also do it directly
from a .c file that's outside of arch/x86.

I think that's a much better alternative than moving stuff elsewhere.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 01/15] x86/cpu: Move intel-family to arch-independent headers
  2021-10-11 21:31           ` Borislav Petkov
@ 2021-10-12 23:15             ` Winiarska, Iwona
  2021-10-13  6:42               ` Borislav Petkov
  0 siblings, 1 reply; 49+ messages in thread
From: Winiarska, Iwona @ 2021-10-12 23:15 UTC (permalink / raw)
  To: bp
  Cc: corbet, jae.hyun.yoo, pierre-louis.bossart, linux-hwmon,
	Lutomirski, Andy, Luck, Tony, andrew, mchehab, yazen.ghannam,
	jdelvare, linux-kernel, mingo, rdunlap, devicetree, tglx,
	linux-aspeed, olof, arnd, linux, linux-doc, robh+dt, openbmc,
	zweiss, d.mueller, gregkh, joel, linux-arm-kernel,
	andriy.shevchenko, x86, Williams, Dan J

On Mon, 2021-10-11 at 23:31 +0200, Borislav Petkov wrote:
> On Mon, Oct 11, 2021 at 08:38:43PM +0000, Winiarska, Iwona wrote:
> > Everything that's part of this series runs on the BMC (Baseboard
> > Management Controller). There's nothing ARM specific to it - it's just
> > that the BMC hardware we're currently supporting is ARM-based. PECI is
> > an interface that's exposed by some x86 CPUs - but that's a hardware
> > interface (available completely independent from whatever is actually
> > running on the x86 CPU).
> 
> Aha, I think I got it: so this whole PECI pile is supposed to run on
> the BMC - which can be ARM but doesn't have to be, i.e., code should be
> generic enough - and the interfaces to the x86 CPU do get exposed to the
> Linux running on the BMC.
> 
> Which brings me to the answer to your other mail:
> 
> On Mon, Oct 11, 2021 at 07:32:38PM +0000, Winiarska, Iwona wrote:
> > Nothing wrong - just a trade-off between churn and keeping things tidy
> > and not duplicated, similar to patch 1. And just like in patch 1, if
> > you have a strong opinion against it - we can duplicate.
> 
> So it is not about strong opinion. Rather, it is about whether this
> exporting would be disadvantageous for x86 freedom. And I think it will
> be:
> 
> Because if you exported those and then we went and changed those
> interfaces and defines (changed their naming, function arguments,
> whatever) and something outside of x86 used them, we will break that
> something.
> 
> And usually we go and fix those users too but I doubt anyone has access
> to that PECI hw to actually test fixes, etc, etc.

We (OpenBMC) do have PECI HW, so that shouldn't be a problem.

> So I'd prefer the small amount of duplication vs external stuff using
> x86 facilities any day of the week. And so I'd suggest you simply copy
> the handful of functions and defines you're gonna be needing and be
> done with it.
> 
> Dave's idea makes sense to me too but lately it keeps happening that
> we change something in x86-land and it turns out something "from the
> outside" is using it and it breaks, so it is a lot easier if things are
> independent.

Both CPUID.EAX=1 decoding and definitions in intel-family are pretty
"well-defined". I understand the scenario that you're describing, but in order
to break an outside user there would need to be some "logic" behind the
pulled-in concepts (if, for example, I used something like the X86_MATCH_*
defines in PECI).

Thanks
-Iwona

> 
> Thx.
> 


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 01/15] x86/cpu: Move intel-family to arch-independent headers
  2021-10-12 23:15             ` Winiarska, Iwona
@ 2021-10-13  6:42               ` Borislav Petkov
  0 siblings, 0 replies; 49+ messages in thread
From: Borislav Petkov @ 2021-10-13  6:42 UTC (permalink / raw)
  To: Winiarska, Iwona
  Cc: corbet, jae.hyun.yoo, pierre-louis.bossart, linux-hwmon,
	Lutomirski, Andy, Luck, Tony, andrew, mchehab, yazen.ghannam,
	jdelvare, linux-kernel, mingo, rdunlap, devicetree, tglx,
	linux-aspeed, olof, arnd, linux, linux-doc, robh+dt, openbmc,
	zweiss, d.mueller, gregkh, joel, linux-arm-kernel,
	andriy.shevchenko, x86, Williams, Dan J

On Tue, Oct 12, 2021 at 11:15:00PM +0000, Winiarska, Iwona wrote:
> We (OpenBMC) do have PECI HW, so that shouldn't be a problem.

Yeah, don't take it personally, but asking people to test stuff for you
doesn't really work, in practice.

> Both CPUID.EAX=1 decoding and definitions in intel-family are pretty
> "well-defined".

Sure, they are "well-defined" until we change them for whatever reason.
Then they will be "well-defined" again. But different.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 08/15] peci: Add device detection
  2021-08-27 19:01   ` Dan Williams
@ 2021-11-15 22:18     ` Winiarska, Iwona
  0 siblings, 0 replies; 49+ messages in thread
From: Winiarska, Iwona @ 2021-11-15 22:18 UTC (permalink / raw)
  To: Williams, Dan J
  Cc: corbet, jae.hyun.yoo, d.mueller, linux-hwmon, andrew, Luck, Tony,
	Lutomirski, Andy, andriy.shevchenko, mchehab, jdelvare,
	linux-kernel, olof, mingo, rdunlap, devicetree, tglx,
	linux-aspeed, arnd, linux-doc, linux, zweiss, robh+dt, openbmc,
	gregkh, joel, yazen.ghannam, linux-arm-kernel,
	pierre-louis.bossart, x86, bp

On Fri, 2021-08-27 at 12:01 -0700, Dan Williams wrote:
> On Tue, Aug 3, 2021 at 4:35 AM Iwona Winiarska
> <iwona.winiarska@intel.com> wrote:
> > 
> > Since PECI devices are discoverable, we can dynamically detect devices
> > that are actually available in the system.
> > 
> > This change complements the earlier implementation by rescanning PECI
> > bus to detect available devices. For this purpose, it also introduces the
> > minimal API for PECI requests.
> > 
> > Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com>
> > Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
> > ---
> >  drivers/peci/Makefile   |   2 +-
> >  drivers/peci/core.c     |  33 ++++++++++++
> >  drivers/peci/device.c   | 114 ++++++++++++++++++++++++++++++++++++++++
> >  drivers/peci/internal.h |  14 +++++
> >  drivers/peci/request.c  |  50 ++++++++++++++++++
> >  5 files changed, 212 insertions(+), 1 deletion(-)
> >  create mode 100644 drivers/peci/device.c
> >  create mode 100644 drivers/peci/request.c
> > 
> > diff --git a/drivers/peci/Makefile b/drivers/peci/Makefile
> > index 926d8df15cbd..c5f9d3fe21bb 100644
> > --- a/drivers/peci/Makefile
> > +++ b/drivers/peci/Makefile
> > @@ -1,7 +1,7 @@
> >  # SPDX-License-Identifier: GPL-2.0-only
> > 
> >  # Core functionality
> > -peci-y := core.o
> > +peci-y := core.o request.o device.o
> >  obj-$(CONFIG_PECI) += peci.o
> > 
> >  # Hardware specific bus drivers
> > diff --git a/drivers/peci/core.c b/drivers/peci/core.c
> > index 7b3938af0396..d143f1a7fe98 100644
> > --- a/drivers/peci/core.c
> > +++ b/drivers/peci/core.c
> > @@ -34,6 +34,20 @@ struct device_type peci_controller_type = {
> >         .release        = peci_controller_dev_release,
> >  };
> > 
> > +static int peci_controller_scan_devices(struct peci_controller *controller)
> > +{
> > +       int ret;
> > +       u8 addr;
> > +
> > +       for (addr = PECI_BASE_ADDR; addr < PECI_BASE_ADDR + PECI_DEVICE_NUM_MAX; addr++) {
> > +               ret = peci_device_create(controller, addr);
> > +               if (ret)
> > +                       return ret;
> > +       }
> > +
> > +       return 0;
> > +}
> > +
> >  static struct peci_controller *peci_controller_alloc(struct device *dev,
> >                                                      struct peci_controller_ops *ops)
> >  {
> > @@ -76,10 +90,23 @@ static struct peci_controller *peci_controller_alloc(struct device *dev,
> >         return ERR_PTR(ret);
> >  }
> > 
> > +static int unregister_child(struct device *dev, void *dummy)
> > +{
> > +       peci_device_destroy(to_peci_device(dev));
> > +
> > +       return 0;
> > +}
> > +
> >  static void unregister_controller(void *_controller)
> >  {
> >         struct peci_controller *controller = _controller;
> > 
> > +       /*
> > +        * Detach any active PECI devices. This can't fail, thus we do not
> > +        * check the returned value.
> > +        */
> > +       device_for_each_child_reverse(&controller->dev, NULL, unregister_child);
> > +
> >         device_unregister(&controller->dev);
> >  }
> > 
> > @@ -115,6 +142,12 @@ struct peci_controller *devm_peci_controller_add(struct device *dev,
> >         if (ret)
> >                 return ERR_PTR(ret);
> > 
> > +       /*
> > +        * Ignoring retval since failures during scan are non-critical for
> > +        * controller itself.
> > +        */
> > +       peci_controller_scan_devices(controller);
> > +
> >         return controller;
> > 
> >  err:
> > diff --git a/drivers/peci/device.c b/drivers/peci/device.c
> > new file mode 100644
> > index 000000000000..32811248997b
> > --- /dev/null
> > +++ b/drivers/peci/device.c
> > @@ -0,0 +1,114 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +// Copyright (c) 2018-2021 Intel Corporation
> > +
> > +#include <linux/peci.h>
> > +#include <linux/slab.h>
> > +
> > +#include "internal.h"
> > +
> > +static int peci_detect(struct peci_controller *controller, u8 addr)
> > +{
> > +       struct peci_request *req;
> > +       int ret;
> > +
> > +       /*
> > +        * PECI Ping is a command encoded by tx_len = 0, rx_len = 0.
> > +        * We expect correct Write FCS if the device at the target address
> > +        * is able to respond.
> > +        */
> > +       req = peci_request_alloc(NULL, 0, 0);
> > +       if (!req)
> > +               return -ENOMEM;
> 
> Seems a waste to do a heap allocation for this routine. Why not:
> 
>        /*
>         * PECI Ping is a command encoded by tx_len = 0, rx_len = 0.
>         * We expect correct Write FCS if the device at the target address
>         * is able to respond.
>         */
>        struct peci_request req = { 0 };

Done.
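
For illustration, a minimal userspace sketch of the stack-allocated ping
suggested above. The struct layout, PECI_REQUEST_MAX_BUF_SIZE value and
mock_xfer() are assumptions made up for this example and are not taken from
the patch; the point is only that a zero-initialized request on the stack
already encodes tx_len = 0, rx_len = 0.

```c
#include <errno.h>
#include <stdint.h>

#define PECI_REQUEST_MAX_BUF_SIZE 32	/* assumed size, illustration only */

/* Simplified stand-in for the kernel's struct peci_request (assumed layout). */
struct peci_request {
	struct {
		uint8_t buf[PECI_REQUEST_MAX_BUF_SIZE];
		uint8_t len;
	} rx, tx;
};

/* Hypothetical transfer hook: pretends that only address 0x30 answers. */
static int mock_xfer(uint8_t addr, struct peci_request *req)
{
	if (req->tx.len != 0 || req->rx.len != 0)
		return -EINVAL;		/* not a ping */

	return addr == 0x30 ? 0 : -ETIMEDOUT;
}

/* Ping needs no payload, so a zeroed request on the stack is sufficient. */
static int peci_detect_sketch(uint8_t addr)
{
	struct peci_request req = { 0 };

	return mock_xfer(addr, &req);
}
```

With no allocation there is no -ENOMEM path and nothing to free afterwards.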

> 
> > +
> > +       mutex_lock(&controller->bus_lock);
> > +       ret = controller->ops->xfer(controller, addr, req);
> > +       mutex_unlock(&controller->bus_lock);
> > +
> > +       peci_request_free(req);
> > +
> > +       return ret;
> > +}
> > +
> > +static bool peci_addr_valid(u8 addr)
> > +{
> > +       return addr >= PECI_BASE_ADDR && addr < PECI_BASE_ADDR +
> > PECI_DEVICE_NUM_MAX;
> > +}
> > +
> > +static int peci_dev_exists(struct device *dev, void *data)
> > +{
> > +       struct peci_device *device = to_peci_device(dev);
> > +       u8 *addr = data;
> > +
> > +       if (device->addr == *addr)
> > +               return -EBUSY;
> > +
> > +       return 0;
> > +}
> > +
> > +int peci_device_create(struct peci_controller *controller, u8 addr)
> > +{
> > +       struct peci_device *device;
> > +       int ret;
> > +
> > +       if (WARN_ON(!peci_addr_valid(addr)))
> 
> The WARN_ON is overkill, especially as there is only one caller of
> this and it loops through valid addresses.

Done.

> 
> > +               return -EINVAL;
> > +
> > +       /* Check if we have already detected this device before. */
> > +       ret = device_for_each_child(&controller->dev, &addr,
> > peci_dev_exists);
> > +       if (ret)
> > +               return 0;
> > +
> > +       ret = peci_detect(controller, addr);
> > +       if (ret) {
> > +               /*
> > +                * Device not present or host state doesn't allow successful
> > +                * detection at this time.
> > +                */
> > +               if (ret == -EIO || ret == -ETIMEDOUT)
> > +                       return 0;
> > +
> > +               return ret;
> > +       }
> > +
> > +       device = kzalloc(sizeof(*device), GFP_KERNEL);
> > +       if (!device)
> > +               return -ENOMEM;
> > +
> > +       device->addr = addr;
> > +       device->dev.parent = &controller->dev;
> > +       device->dev.bus = &peci_bus_type;
> > +       device->dev.type = &peci_device_type;
> > +
> > +       ret = dev_set_name(&device->dev, "%d-%02x", controller->id, device->addr);
> > +       if (ret)
> > +               goto err_free;
> 
> It's cleaner to just have one unified error exit using put_device().
> Use the device_initialize() + device_add() pattern, not
> device_register().

Done.
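
For illustration, a userspace model of the single-exit pattern described
above. Everything prefixed with mock_ is hypothetical and only imitates the
reference counting of device_initialize(), device_add() and put_device();
it is a sketch of the idea, not the code that went into the next revision.

```c
#include <errno.h>
#include <stdlib.h>

/* Userspace stand-in for struct device: a refcount plus a release hook. */
struct mock_device {
	int refcount;
	void (*release)(struct mock_device *dev);
};

static int released_count;	/* lets a caller observe that release ran */

static void mock_release(struct mock_device *dev)
{
	released_count++;
	free(dev);
}

/* Models device_initialize(): from here on the object holds one reference. */
static void mock_device_initialize(struct mock_device *dev)
{
	dev->refcount = 1;
	dev->release = mock_release;
}

/* Models put_device(): dropping the last reference calls release. */
static void mock_put_device(struct mock_device *dev)
{
	if (--dev->refcount == 0)
		dev->release(dev);
}

/* Models device_add(); fail_add forces the error path. */
static int mock_device_add(struct mock_device *dev, int fail_add)
{
	(void)dev;
	return fail_add ? -EEXIST : 0;
}

static int mock_device_create(int fail_name, int fail_add,
			      struct mock_device **out)
{
	struct mock_device *dev = calloc(1, sizeof(*dev));
	int ret;

	if (!dev)
		return -ENOMEM;

	mock_device_initialize(dev);

	ret = fail_name ? -ENOMEM : 0;	/* stands in for dev_set_name() */
	if (ret)
		goto err_put;

	ret = mock_device_add(dev, fail_add);
	if (ret)
		goto err_put;

	*out = dev;
	return 0;

err_put:
	/* One exit for every failure: the release hook does the freeing. */
	mock_put_device(dev);
	return ret;
}
```

Because the object is reference counted as soon as it is initialized, no
error path needs a bare kfree(), which is what makes the unified exit safe.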

> 
> 
> > +
> > +       ret = device_register(&device->dev);
> > +       if (ret)
> > +               goto err_put;
> > +
> > +       return 0;
> > +
> > +err_put:
> > +       put_device(&device->dev);
> > +err_free:
> > +       kfree(device);
> > +
> > +       return ret;
> > +}
> > +
> > +void peci_device_destroy(struct peci_device *device)
> > +{
> > +       device_unregister(&device->dev);
> 
> > No clear value for this wrapper; in fact, in one caller it forces a
> > to_peci_device() just so this helper can undo that up-cast.

It gains value after extending it with kill_device().

> 
> > +}
> > +
> > +static void peci_device_release(struct device *dev)
> > +{
> > +       struct peci_device *device = to_peci_device(dev);
> > +
> > +       kfree(device);
> > +}
> > +
> > +struct device_type peci_device_type = {
> > +       .release        = peci_device_release,
> > +};
> > diff --git a/drivers/peci/internal.h b/drivers/peci/internal.h
> > index 918dea745a86..57d11a902c5d 100644
> > --- a/drivers/peci/internal.h
> > +++ b/drivers/peci/internal.h
> > @@ -8,6 +8,20 @@
> >  #include <linux/types.h>
> > 
> >  struct peci_controller;
> > +struct peci_device;
> > +struct peci_request;
> > +
> > +/* PECI CPU address range 0x30-0x37 */
> > +#define PECI_BASE_ADDR         0x30
> > +#define PECI_DEVICE_NUM_MAX    8
> > +
> > +struct peci_request *peci_request_alloc(struct peci_device *device, u8
> > tx_len, u8 rx_len);
> > +void peci_request_free(struct peci_request *req);
> > +
> > +extern struct device_type peci_device_type;
> > +
> > +int peci_device_create(struct peci_controller *controller, u8 addr);
> > +void peci_device_destroy(struct peci_device *device);
> > 
> >  extern struct bus_type peci_bus_type;
> > 
> > diff --git a/drivers/peci/request.c b/drivers/peci/request.c
> > new file mode 100644
> > index 000000000000..81b567bc7b87
> > --- /dev/null
> > +++ b/drivers/peci/request.c
> > @@ -0,0 +1,50 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +// Copyright (c) 2021 Intel Corporation
> > +
> > +#include <linux/export.h>
> > +#include <linux/peci.h>
> > +#include <linux/slab.h>
> > +#include <linux/types.h>
> > +
> > +#include "internal.h"
> > +
> > +/**
> > + * peci_request_alloc() - allocate &struct peci_requests
> > + * @device: PECI device to which request is going to be sent
> > + * @tx_len: TX length
> > + * @rx_len: RX length
> > + *
> > + * Return: A pointer to a newly allocated &struct peci_request on success
> > or NULL otherwise.
> > + */
> > +struct peci_request *peci_request_alloc(struct peci_device *device, u8
> > tx_len, u8 rx_len)
> > +{
> > +       struct peci_request *req;
> > +
> > +       if (WARN_ON_ONCE(tx_len > PECI_REQUEST_MAX_BUF_SIZE || rx_len >
> > PECI_REQUEST_MAX_BUF_SIZE))
> 
> > WARN_ON_ONCE() should only be here to help other kernel developers not
> > make this mistake. However, another way to enforce this is to stop
> exporting peci_request_alloc() and instead export helpers for specific
> command types, and keep this detail internal to the core. If you keep
> this, it needs a comment that it is only here to warn other
> peci-client developers of their bug before it goes upstream.

Added comment.

Thanks
-Iwona



* Re: [PATCH v2 09/15] peci: Add sysfs interface for PECI bus
  2021-08-27 19:11   ` Dan Williams
@ 2021-11-15 22:19     ` Winiarska, Iwona
  0 siblings, 0 replies; 49+ messages in thread
From: Winiarska, Iwona @ 2021-11-15 22:19 UTC (permalink / raw)
  To: Williams, Dan J
  Cc: corbet, jae.hyun.yoo, d.mueller, linux-hwmon, andrew, Luck, Tony,
	Lutomirski, Andy, andriy.shevchenko, mchehab, jdelvare,
	linux-kernel, olof, mingo, rdunlap, devicetree, tglx,
	linux-aspeed, arnd, linux-doc, linux, zweiss, robh+dt, openbmc,
	gregkh, joel, yazen.ghannam, linux-arm-kernel,
	pierre-louis.bossart, x86, bp

On Fri, 2021-08-27 at 12:11 -0700, Dan Williams wrote:
> On Tue, Aug 3, 2021 at 4:35 AM Iwona Winiarska
> <iwona.winiarska@intel.com> wrote:
> > 
> > PECI devices may not be discoverable at the time when PECI controller is
> > being added (e.g. BMC can boot up when the Host system is still in S5).
> > Since we currently don't have the capabilities to figure out the Host
> > system state inside the PECI subsystem itself, we have to rely on
> > userspace to do it for us.
> > 
> > In the future, PECI subsystem may be expanded with mechanisms that allow
> > us to avoid depending on userspace interaction (e.g. CPU presence could
> > be detected using GPIO, and the information on whether it's discoverable
> > could be obtained over IPMI).
> 
> Thanks for this detail.
> 
> > Unfortunately, those methods may ultimately not be available (support
> > will vary from platform to platform), which means that we still need
> > platform independent method triggered by userspace.
> > 
> > Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com>
> > ---
> >  Documentation/ABI/testing/sysfs-bus-peci | 16 +++++
> >  drivers/peci/Makefile                    |  2 +-
> >  drivers/peci/core.c                      |  3 +-
> >  drivers/peci/device.c                    |  1 +
> >  drivers/peci/internal.h                  |  5 ++
> >  drivers/peci/sysfs.c                     | 82 ++++++++++++++++++++++++
> >  6 files changed, 107 insertions(+), 2 deletions(-)
> >  create mode 100644 Documentation/ABI/testing/sysfs-bus-peci
> >  create mode 100644 drivers/peci/sysfs.c
> > 
> > diff --git a/Documentation/ABI/testing/sysfs-bus-peci
> > b/Documentation/ABI/testing/sysfs-bus-peci
> > new file mode 100644
> > index 000000000000..56c2b2216bbd
> > --- /dev/null
> > +++ b/Documentation/ABI/testing/sysfs-bus-peci
> > @@ -0,0 +1,16 @@
> > +What:          /sys/bus/peci/rescan
> > +Date:          July 2021
> > +KernelVersion: 5.15
> > +Contact:       Iwona Winiarska <iwona.winiarska@intel.com>
> > +Description:
> > +               Writing a non-zero value to this attribute will
> > +               initiate scan for PECI devices on all PECI controllers
> > +               in the system.
> > +
> > +What:          /sys/bus/peci/devices/<controller_id>-<device_addr>/remove
> > +Date:          July 2021
> > +KernelVersion: 5.15
> > +Contact:       Iwona Winiarska <iwona.winiarska@intel.com>
> > +Description:
> > +               Writing a non-zero value to this attribute will
> > +               remove the PECI device and any of its children.
> > diff --git a/drivers/peci/Makefile b/drivers/peci/Makefile
> > index c5f9d3fe21bb..917f689e147a 100644
> > --- a/drivers/peci/Makefile
> > +++ b/drivers/peci/Makefile
> > @@ -1,7 +1,7 @@
> >  # SPDX-License-Identifier: GPL-2.0-only
> > 
> >  # Core functionality
> > -peci-y := core.o request.o device.o
> > +peci-y := core.o request.o device.o sysfs.o
> >  obj-$(CONFIG_PECI) += peci.o
> > 
> >  # Hardware specific bus drivers
> > diff --git a/drivers/peci/core.c b/drivers/peci/core.c
> > index d143f1a7fe98..c473acb3c2a0 100644
> > --- a/drivers/peci/core.c
> > +++ b/drivers/peci/core.c
> > @@ -34,7 +34,7 @@ struct device_type peci_controller_type = {
> >         .release        = peci_controller_dev_release,
> >  };
> > 
> > -static int peci_controller_scan_devices(struct peci_controller *controller)
> > +int peci_controller_scan_devices(struct peci_controller *controller)
> >  {
> >         int ret;
> >         u8 addr;
> > @@ -159,6 +159,7 @@ EXPORT_SYMBOL_NS_GPL(devm_peci_controller_add, PECI);
> > 
> >  struct bus_type peci_bus_type = {
> >         .name           = "peci",
> > +       .bus_groups     = peci_bus_groups,
> >  };
> > 
> >  static int __init peci_init(void)
> > diff --git a/drivers/peci/device.c b/drivers/peci/device.c
> > index 32811248997b..d77d9dabd51e 100644
> > --- a/drivers/peci/device.c
> > +++ b/drivers/peci/device.c
> > @@ -110,5 +110,6 @@ static void peci_device_release(struct device *dev)
> >  }
> > 
> >  struct device_type peci_device_type = {
> > +       .groups         = peci_device_groups,
> >         .release        = peci_device_release,
> >  };
> > diff --git a/drivers/peci/internal.h b/drivers/peci/internal.h
> > index 57d11a902c5d..978e12c8e1d3 100644
> > --- a/drivers/peci/internal.h
> > +++ b/drivers/peci/internal.h
> > @@ -8,6 +8,7 @@
> >  #include <linux/types.h>
> > 
> >  struct peci_controller;
> > +struct attribute_group;
> >  struct peci_device;
> >  struct peci_request;
> > 
> > @@ -19,12 +20,16 @@ struct peci_request *peci_request_alloc(struct
> > peci_device *device, u8 tx_len, u
> >  void peci_request_free(struct peci_request *req);
> > 
> >  extern struct device_type peci_device_type;
> > +extern const struct attribute_group *peci_device_groups[];
> > 
> >  int peci_device_create(struct peci_controller *controller, u8 addr);
> >  void peci_device_destroy(struct peci_device *device);
> > 
> >  extern struct bus_type peci_bus_type;
> > +extern const struct attribute_group *peci_bus_groups[];
> 
> To me, sysfs.c is small enough to just fold into core.c, then no need
> to declare public attribute arrays like this, but up to you if you
> prefer the sysfs.c split.

Left the sysfs split for now.

> 
> > 
> >  extern struct device_type peci_controller_type;
> > 
> > +int peci_controller_scan_devices(struct peci_controller *controller);
> > +
> >  #endif /* __PECI_INTERNAL_H */
> > diff --git a/drivers/peci/sysfs.c b/drivers/peci/sysfs.c
> > new file mode 100644
> > index 000000000000..db9ef05776e3
> > --- /dev/null
> > +++ b/drivers/peci/sysfs.c
> > @@ -0,0 +1,82 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +// Copyright (c) 2021 Intel Corporation
> > +
> > +#include <linux/device.h>
> > +#include <linux/kernel.h>
> > +#include <linux/peci.h>
> > +
> > +#include "internal.h"
> > +
> > +static int rescan_controller(struct device *dev, void *data)
> > +{
> > +       if (dev->type != &peci_controller_type)
> > +               return 0;
> > +
> > +       return peci_controller_scan_devices(to_peci_controller(dev));
> > +}
> > +
> > +static ssize_t rescan_store(struct bus_type *bus, const char *buf, size_t
> > count)
> > +{
> > +       bool res;
> > +       int ret;
> > +
> > +       ret = kstrtobool(buf, &res);
> > +       if (ret)
> > +               return ret;
> > +
> > +       if (!res)
> > +               return count;
> > +
> > +       ret = bus_for_each_dev(&peci_bus_type, NULL, NULL,
> > rescan_controller);
> > +       if (ret)
> > +               return ret;
> > +
> > +       return count;
> > +}
> > +static BUS_ATTR_WO(rescan);
> > +
> > +static struct attribute *peci_bus_attrs[] = {
> > +       &bus_attr_rescan.attr,
> > +       NULL
> > +};
> > +
> > +static const struct attribute_group peci_bus_group = {
> > +       .attrs = peci_bus_attrs,
> > +};
> > +
> > +const struct attribute_group *peci_bus_groups[] = {
> > +       &peci_bus_group,
> > +       NULL
> > +};
> > +
> > +static ssize_t remove_store(struct device *dev, struct device_attribute
> > *attr,
> > +                           const char *buf, size_t count)
> > +{
> > +       struct peci_device *device = to_peci_device(dev);
> > +       bool res;
> > +       int ret;
> > +
> > +       ret = kstrtobool(buf, &res);
> > +       if (ret)
> > +               return ret;
> > +
> > +       if (res && device_remove_file_self(dev, attr))
> > +               peci_device_destroy(device);
> 
> How do you solve races between sysfs device remove and controller
> device remove? Looks like double-free at first glance. Have a look at
> > the kill_device() helper as one way to resolve this double-delete
> > race.

Done.
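
For illustration, a userspace sketch of the "first caller wins" idea behind
kill_device(). The kernel helper marks the device dead under the device
lock; this model swaps that lock for a C11 atomic flag so it stays
self-contained, and every mock_ name is hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct mock_device {
	atomic_flag dead;	/* stands in for the device's "dead" marker */
	int unregister_calls;	/* counts how often teardown really ran */
};

/* Returns true only for the first caller, mirroring kill_device(). */
static bool mock_kill_device(struct mock_device *dev)
{
	return !atomic_flag_test_and_set(&dev->dead);
}

/*
 * Both the sysfs "remove" path and the controller teardown path would go
 * through this function; whichever loses the race returns early.
 */
static void mock_device_destroy(struct mock_device *dev)
{
	if (!mock_kill_device(dev))
		return;

	dev->unregister_calls++;	/* stands in for device_unregister() */
}
```

Calling mock_device_destroy() twice on the same object performs the teardown
only once, which is the property that avoids the double delete.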

Thanks
-Iwona

> 
> > +
> > +       return count;
> > +}
> > +static DEVICE_ATTR_IGNORE_LOCKDEP(remove, 0200, NULL, remove_store);
> > +
> > +static struct attribute *peci_device_attrs[] = {
> > +       &dev_attr_remove.attr,
> > +       NULL
> > +};
> > +
> > +static const struct attribute_group peci_device_group = {
> > +       .attrs = peci_device_attrs,
> > +};
> > +
> > +const struct attribute_group *peci_device_groups[] = {
> > +       &peci_device_group,
> > +       NULL
> > +};
> > --
> > 2.31.1
> > 


* Re: [PATCH v2 10/15] peci: Add support for PECI device drivers
  2021-08-27 21:19   ` Dan Williams
@ 2021-11-15 22:20     ` Winiarska, Iwona
  0 siblings, 0 replies; 49+ messages in thread
From: Winiarska, Iwona @ 2021-11-15 22:20 UTC (permalink / raw)
  To: Williams, Dan J
  Cc: corbet, jae.hyun.yoo, d.mueller, linux-hwmon, andrew, Luck, Tony,
	Lutomirski, Andy, andriy.shevchenko, mchehab, jdelvare,
	linux-kernel, olof, mingo, rdunlap, devicetree, tglx,
	linux-aspeed, arnd, linux-doc, linux, zweiss, robh+dt, openbmc,
	gregkh, joel, yazen.ghannam, linux-arm-kernel,
	pierre-louis.bossart, x86, bp

On Fri, 2021-08-27 at 14:19 -0700, Dan Williams wrote:
> On Tue, Aug 3, 2021 at 4:36 AM Iwona Winiarska
> <iwona.winiarska@intel.com> wrote:
> > 
> > Here we're adding support for PECI device drivers, which unlike PECI
> 
> s/Here we're adding/Add/
> 
> > controller drivers are actually able to provide functionalities to
> > userspace.

Done.

> 
> > 
> > We're also extending peci_request API to allow querying more details
> 
> s/We're also extending/Also, extend/

Done.

> 
> > ...for the most part, the imperative mood is what upstream
> > maintainers prefer for changelogs.
> 
> > about PECI device (e.g. model/family), that's going to be used to find
> > a compatible peci_driver.
> > 
> > Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com>
> > Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
> > ---
> >  drivers/peci/Kconfig    |   1 +
> >  drivers/peci/core.c     |  49 +++++++++
> >  drivers/peci/device.c   | 105 ++++++++++++++++++++
> >  drivers/peci/internal.h |  75 ++++++++++++++
> >  drivers/peci/request.c  | 214 ++++++++++++++++++++++++++++++++++++++++
> >  include/linux/peci.h    |  19 ++++
> >  lib/Kconfig             |   2 +-
> >  7 files changed, 464 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/peci/Kconfig b/drivers/peci/Kconfig
> > index 99279df97a78..1d0532e3a801 100644
> > --- a/drivers/peci/Kconfig
> > +++ b/drivers/peci/Kconfig
> > @@ -2,6 +2,7 @@
> > 
> >  menuconfig PECI
> >         tristate "PECI support"
> > +       select GENERIC_LIB_X86
> 
> GENERIC_LIB_X86 has dependencies, so this 'select' will make kbuild
> > unhappy when that dependency is not met. Given that this symbol is
> > already selected by X86, it seems this just wants a "depends on
> GENERIC_LIB_X86".

Not applicable anymore after patches 1 and 2 got dropped following feedback from
arch/x86.

> 
> >         help
> >           The Platform Environment Control Interface (PECI) is an interface
> >           that provides a communication channel to Intel processors and
> > diff --git a/drivers/peci/core.c b/drivers/peci/core.c
> > index c473acb3c2a0..33c07920493d 100644
> > --- a/drivers/peci/core.c
> > +++ b/drivers/peci/core.c
> > @@ -157,8 +157,57 @@ struct peci_controller *devm_peci_controller_add(struct
> > device *dev,
> >  }
> >  EXPORT_SYMBOL_NS_GPL(devm_peci_controller_add, PECI);
> > 
> > +static const struct peci_device_id *
> > +peci_bus_match_device_id(const struct peci_device_id *id, struct
> > peci_device *device)
> > +{
> > +       while (id->family != 0) {
> > +               if (id->family == device->info.family &&
> > +                   id->model == device->info.model)
> > +                       return id;
> > +               id++;
> > +       }
> > +
> > +       return NULL;
> > +}
> > +
> > +static int peci_bus_device_match(struct device *dev, struct device_driver
> > *drv)
> > +{
> > +       struct peci_device *device = to_peci_device(dev);
> > +       struct peci_driver *peci_drv = to_peci_driver(drv);
> > +
> > +       if (dev->type != &peci_device_type)
> > +               return 0;
> > +
> > +       if (peci_bus_match_device_id(peci_drv->id_table, device))
> > +               return 1;
> 
> Save a couple lines and do:
> 
>     return peci_bus_match_device_id(...)

Done.
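
For illustration, a userspace sketch of the shortened match callback. The
struct definitions are simplified copies of those in the patch, and
example_ids is a made-up table used only here. Since the lookup returns a
pointer and the bus callback returns an integer, the sketch makes the
conversion explicit with a NULL comparison.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct peci_device_id {
	const void *data;
	uint16_t family;
	uint8_t model;
};

/* Simplified stand-in for the family/model part of the device info. */
struct peci_device_info {
	uint16_t family;
	uint8_t model;
};

/* Illustrative table; an entry with family == 0 terminates it. */
static const struct peci_device_id example_ids[] = {
	{ NULL, 6, 0x55 },
	{ NULL, 6, 0x6a },
	{ NULL, 0, 0 },
};

static const struct peci_device_id *
match_device_id(const struct peci_device_id *id,
		const struct peci_device_info *info)
{
	while (id->family != 0) {
		if (id->family == info->family && id->model == info->model)
			return id;
		id++;
	}

	return NULL;
}

/* The match callback collapses to a single return statement. */
static bool bus_device_match(const struct peci_device_id *table,
			     const struct peci_device_info *info)
{
	return match_device_id(table, info) != NULL;
}
```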

> 
> > +
> > +       return 0;
> > +}
> > +
> > +static int peci_bus_device_probe(struct device *dev)
> > +{
> > +       struct peci_device *device = to_peci_device(dev);
> > +       struct peci_driver *driver = to_peci_driver(dev->driver);
> > +
> > +       return driver->probe(device, peci_bus_match_device_id(driver->id_table, device));
> > +}
> > +
> > +static int peci_bus_device_remove(struct device *dev)
> 
> Note, in linux-next this prototype has changed to:
> 
>     void (*remove)(struct device *dev);
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/include/linux/device/bus.h
> 
> 
> > +{
> > +       struct peci_device *device = to_peci_device(dev);
> > +       struct peci_driver *driver = to_peci_driver(dev->driver);
> > +
> > +       if (driver->remove)
> > +               driver->remove(device);
> > +
> > +       return 0;
> > +}
> > +
> >  struct bus_type peci_bus_type = {
> >         .name           = "peci",
> > +       .match          = peci_bus_device_match,
> > +       .probe          = peci_bus_device_probe,
> > +       .remove         = peci_bus_device_remove,
> >         .bus_groups     = peci_bus_groups,
> >  };
> > 
> > diff --git a/drivers/peci/device.c b/drivers/peci/device.c
> > index d77d9dabd51e..a78c02399574 100644
> > --- a/drivers/peci/device.c
> > +++ b/drivers/peci/device.c
> > @@ -1,11 +1,85 @@
> >  // SPDX-License-Identifier: GPL-2.0-only
> >  // Copyright (c) 2018-2021 Intel Corporation
> > 
> > +#include <linux/bitfield.h>
> >  #include <linux/peci.h>
> >  #include <linux/slab.h>
> > +#include <linux/x86/cpu.h>
> > 
> >  #include "internal.h"
> > 
> > +#define REVISION_NUM_MASK GENMASK(15, 8)
> > +static int peci_get_revision(struct peci_device *device, u8 *revision)
> > +{
> > +       struct peci_request *req;
> > +       u64 dib;
> > +
> > +       req = peci_get_dib(device);
> 
> I would expect peci_get_dib() to return @dib.

get_dib is a PECI command name - I changed the naming scheme slightly to make
things more clear, e.g. peci_xfer_get_dib().

> 
> > +       if (IS_ERR(req))
> > +               return PTR_ERR(req);
> > +
> > +       /*
> > +        * PECI device may be in a state where it is unable to return a
> > proper
> > +        * DIB, in which case it returns 0 as DIB value.
> > +        * Let's treat this as an error to avoid carrying on with the
> > detection
> > +        * using invalid revision.
> > +        */
> > +       dib = peci_request_data_dib(req);
> 
> I would expect peci_request_data_dib() to make a request.

Changed peci_request_data_dib to peci_request_dib_read and
peci_request_data_temp to peci_request_temp_read to align with
peci_request_data_read*.
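
For illustration, a userspace sketch of the revision extraction from the
DIB with an error-code return and an out parameter. The mask matches
GENMASK(15, 8) from the patch; revision_from_dib() and REVISION_NUM_SHIFT
are hypothetical names introduced for this example.

```c
#include <errno.h>
#include <stdint.h>

#define REVISION_NUM_SHIFT	8
#define REVISION_NUM_MASK	(UINT64_C(0xff) << REVISION_NUM_SHIFT)	/* bits 15:8 */

/*
 * A DIB of 0 means the device could not return proper data, so it is
 * reported as -EIO rather than decoded into a bogus revision.
 */
static int revision_from_dib(uint64_t dib, uint8_t *revision)
{
	if (dib == 0)
		return -EIO;

	*revision = (uint8_t)((dib & REVISION_NUM_MASK) >> REVISION_NUM_SHIFT);

	return 0;
}
```

Returning the status and writing the value through a pointer keeps the
caller free of any request lifetime handling.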

> 
> A stack allocated peci_request passed to peci_get_dib() that returns
> an error code would seem to be cleaner than this current organization.
> 
> > +       if (dib == 0) {
> > +               peci_request_free(req);
> > +               return -EIO;
> > +       }
> > +
> > +       *revision = FIELD_GET(REVISION_NUM_MASK, dib);
> > +
> > +       peci_request_free(req);
> > +
> > +       return 0;
> > +}
> > +
> > +static int peci_get_cpu_id(struct peci_device *device, u32 *cpu_id)
> > +{
> > +       struct peci_request *req;
> > +       int ret;
> > +
> > +       req = peci_pkg_cfg_readl(device, PECI_PCS_PKG_ID,
> > PECI_PKG_ID_CPU_ID);
> > +       if (IS_ERR(req))
> > +               return PTR_ERR(req);
> > +
> > +       ret = peci_request_status(req);
> > +       if (ret)
> > +               goto out_req_free;
> > +
> > +       *cpu_id = peci_request_data_readl(req);
> > +out_req_free:
> > +       peci_request_free(req);
> > +
> > +       return ret;
> > +}
> > +
> > +static int peci_device_info_init(struct peci_device *device)
> > +{
> > +       u8 revision;
> > +       u32 cpu_id;
> > +       int ret;
> > +
> > +       ret = peci_get_cpu_id(device, &cpu_id);
> > +       if (ret)
> > +               return ret;
> > +
> > +       device->info.family = x86_family(cpu_id);
> > +       device->info.model = x86_model(cpu_id);
> > +
> > +       ret = peci_get_revision(device, &revision);
> > +       if (ret)
> > +               return ret;
> > +       device->info.peci_revision = revision;
> > +
> > +       device->info.socket_id = device->addr - PECI_BASE_ADDR;
> > +
> > +       return 0;
> > +}
> > +
> >  static int peci_detect(struct peci_controller *controller, u8 addr)
> >  {
> >         struct peci_request *req;
> > @@ -79,6 +153,10 @@ int peci_device_create(struct peci_controller
> > *controller, u8 addr)
> >         device->dev.bus = &peci_bus_type;
> >         device->dev.type = &peci_device_type;
> > 
> > +       ret = peci_device_info_init(device);
> > +       if (ret)
> > +               goto err_free;
> > +
> >         ret = dev_set_name(&device->dev, "%d-%02x", controller->id, device->addr);
> >         if (ret)
> >                 goto err_free;
> > @@ -102,6 +180,33 @@ void peci_device_destroy(struct peci_device *device)
> >         device_unregister(&device->dev);
> >  }
> > 
> > +int __peci_driver_register(struct peci_driver *driver, struct module
> > *owner,
> > +                          const char *mod_name)
> > +{
> > +       driver->driver.bus = &peci_bus_type;
> > +       driver->driver.owner = owner;
> > +       driver->driver.mod_name = mod_name;
> > +
> > +       if (!driver->probe) {
> > +               pr_err("peci: trying to register driver without probe
> > callback\n");
> > +               return -EINVAL;
> > +       }
> > +
> > +       if (!driver->id_table) {
> > +               pr_err("peci: trying to register driver without device id
> > table\n");
> > +               return -EINVAL;
> > +       }
> > +
> > +       return driver_register(&driver->driver);
> > +}
> > +EXPORT_SYMBOL_NS_GPL(__peci_driver_register, PECI);
> > +
> > +void peci_driver_unregister(struct peci_driver *driver)
> > +{
> > +       driver_unregister(&driver->driver);
> > +}
> > +EXPORT_SYMBOL_NS_GPL(peci_driver_unregister, PECI);
> > +
> >  static void peci_device_release(struct device *dev)
> >  {
> >         struct peci_device *device = to_peci_device(dev);
> > diff --git a/drivers/peci/internal.h b/drivers/peci/internal.h
> > index 978e12c8e1d3..d661e1b65694 100644
> > --- a/drivers/peci/internal.h
> > +++ b/drivers/peci/internal.h
> > @@ -19,6 +19,34 @@ struct peci_request;
> >  struct peci_request *peci_request_alloc(struct peci_device *device, u8
> > tx_len, u8 rx_len);
> >  void peci_request_free(struct peci_request *req);
> > 
> > +int peci_request_status(struct peci_request *req);
> > +u64 peci_request_data_dib(struct peci_request *req);
> > +
> > +u8 peci_request_data_readb(struct peci_request *req);
> > +u16 peci_request_data_readw(struct peci_request *req);
> > +u32 peci_request_data_readl(struct peci_request *req);
> > +u64 peci_request_data_readq(struct peci_request *req);
> > +
> > +struct peci_request *peci_get_dib(struct peci_device *device);
> > +struct peci_request *peci_get_temp(struct peci_device *device);
> > +
> > +struct peci_request *peci_pkg_cfg_readb(struct peci_device *device, u8
> > index, u16 param);
> > +struct peci_request *peci_pkg_cfg_readw(struct peci_device *device, u8
> > index, u16 param);
> > +struct peci_request *peci_pkg_cfg_readl(struct peci_device *device, u8
> > index, u16 param);
> > +struct peci_request *peci_pkg_cfg_readq(struct peci_device *device, u8
> > index, u16 param);
> > +
> > +/**
> > + * struct peci_device_id - PECI device data to match
> > + * @data: pointer to driver private data specific to device
> > + * @family: device family
> > + * @model: device model
> > + */
> > +struct peci_device_id {
> > +       const void *data;
> > +       u16 family;
> > +       u8 model;
> > +};
> > +
> >  extern struct device_type peci_device_type;
> >  extern const struct attribute_group *peci_device_groups[];
> > 
> > @@ -28,6 +56,53 @@ void peci_device_destroy(struct peci_device *device);
> >  extern struct bus_type peci_bus_type;
> >  extern const struct attribute_group *peci_bus_groups[];
> > 
> > +/**
> > + * struct peci_driver - PECI driver
> > + * @driver: inherit device driver
> > + * @probe: probe callback
> > + * @remove: remove callback
> > + * @id_table: PECI device match table to decide which device to bind
> > + */
> > +struct peci_driver {
> > +       struct device_driver driver;
> > +       int (*probe)(struct peci_device *device, const struct peci_device_id
> > *id);
> > +       void (*remove)(struct peci_device *device);
> > +       const struct peci_device_id *id_table;
> > +};
> > +
> > +static inline struct peci_driver *to_peci_driver(struct device_driver *d)
> > +{
> > +       return container_of(d, struct peci_driver, driver);
> > +}
> > +
> > +int __peci_driver_register(struct peci_driver *driver, struct module
> > *owner,
> > +                          const char *mod_name);
> > +/**
> > + * peci_driver_register() - register PECI driver
> > + * @driver: the driver to be registered
> > + * @owner: owner module of the driver being registered
> > + * @mod_name: module name string
> > + *
> > + * PECI drivers that don't need to do anything special in module init
> > should
> > + * use the convenience "module_peci_driver" macro instead
> > + *
> > + * Return: zero on success, else a negative error code.
> > + */
> > +#define peci_driver_register(driver) \
> > +       __peci_driver_register(driver, THIS_MODULE, KBUILD_MODNAME)
> > +void peci_driver_unregister(struct peci_driver *driver);
> > +
> > +/**
> > + * module_peci_driver() - helper macro for registering a modular PECI
> > driver
> > + * @__peci_driver: peci_driver struct
> > + *
> > + * Helper macro for PECI drivers which do not do anything special in module
> > + * init/exit. This eliminates a lot of boilerplate. Each module may only
> > + * use this macro once, and calling it replaces module_init() and
> > module_exit()
> > + */
> > +#define module_peci_driver(__peci_driver) \
> > +       module_driver(__peci_driver, peci_driver_register,
> > peci_driver_unregister)
> > +
> >  extern struct device_type peci_controller_type;
> > 
> >  int peci_controller_scan_devices(struct peci_controller *controller);
> > diff --git a/drivers/peci/request.c b/drivers/peci/request.c
> > index 81b567bc7b87..fe032d5a5e1b 100644
> > --- a/drivers/peci/request.c
> > +++ b/drivers/peci/request.c
> > @@ -1,13 +1,140 @@
> >  // SPDX-License-Identifier: GPL-2.0-only
> >  // Copyright (c) 2021 Intel Corporation
> > 
> > +#include <linux/bug.h>
> >  #include <linux/export.h>
> >  #include <linux/peci.h>
> >  #include <linux/slab.h>
> >  #include <linux/types.h>
> > 
> > +#include <asm/unaligned.h>
> > +
> >  #include "internal.h"
> > 
> > +#define PECI_GET_DIB_CMD               0xf7
> > +#define  PECI_GET_DIB_WR_LEN           1
> > +#define  PECI_GET_DIB_RD_LEN           8
> > +
> > +#define PECI_RDPKGCFG_CMD              0xa1
> > +#define  PECI_RDPKGCFG_WR_LEN          5
> > +#define  PECI_RDPKGCFG_RD_LEN_BASE     1
> > +#define PECI_WRPKGCFG_CMD              0xa5
> > +#define  PECI_WRPKGCFG_WR_LEN_BASE     6
> > +#define  PECI_WRPKGCFG_RD_LEN          1
> > +
> > +/* Device Specific Completion Code (CC) Definition */
> > +#define PECI_CC_SUCCESS                                0x40
> > +#define PECI_CC_NEED_RETRY                     0x80
> > +#define PECI_CC_OUT_OF_RESOURCE                        0x81
> > +#define PECI_CC_UNAVAIL_RESOURCE               0x82
> > +#define PECI_CC_INVALID_REQ                    0x90
> > +#define PECI_CC_MCA_ERROR                      0x91
> > +#define PECI_CC_CATASTROPHIC_MCA_ERROR         0x93
> > +#define PECI_CC_FATAL_MCA_ERROR                        0x94
> > +#define PECI_CC_PARITY_ERR_GPSB_OR_PMSB                0x98
> > +#define PECI_CC_PARITY_ERR_GPSB_OR_PMSB_IERR   0x9B
> > +#define PECI_CC_PARITY_ERR_GPSB_OR_PMSB_MCA    0x9C
> > +
> > +#define PECI_RETRY_BIT                 BIT(0)
> > +
> > +#define PECI_RETRY_TIMEOUT             msecs_to_jiffies(700)
> > +#define PECI_RETRY_INTERVAL_MIN                msecs_to_jiffies(1)
> > +#define PECI_RETRY_INTERVAL_MAX                msecs_to_jiffies(128)
> > +
> > +static u8 peci_request_data_cc(struct peci_request *req)
> > +{
> > +       return req->rx.buf[0];
> > +}
> > +
> > +/**
> > + * peci_request_status() - return -errno based on PECI completion code
> > + * @req: the PECI request that contains response data with completion code
> > + *
> > + * It can't be used for Ping(), GetDIB() and GetTemp() - for those commands
> > we
> > + * don't expect completion code in the response.
> > + *
> > + * Return: -errno
> > + */
> > +int peci_request_status(struct peci_request *req)
> > +{
> > +       u8 cc = peci_request_data_cc(req);
> > +
> > +       if (cc != PECI_CC_SUCCESS)
> > +               dev_dbg(&req->device->dev, "ret: %#02x\n", cc);
> > +
> > +       switch (cc) {
> > +       case PECI_CC_SUCCESS:
> > +               return 0;
> > +       case PECI_CC_NEED_RETRY:
> > +       case PECI_CC_OUT_OF_RESOURCE:
> > +       case PECI_CC_UNAVAIL_RESOURCE:
> > +               return -EAGAIN;
> > +       case PECI_CC_INVALID_REQ:
> > +               return -EINVAL;
> > +       case PECI_CC_MCA_ERROR:
> > +       case PECI_CC_CATASTROPHIC_MCA_ERROR:
> > +       case PECI_CC_FATAL_MCA_ERROR:
> > +       case PECI_CC_PARITY_ERR_GPSB_OR_PMSB:
> > +       case PECI_CC_PARITY_ERR_GPSB_OR_PMSB_IERR:
> > +       case PECI_CC_PARITY_ERR_GPSB_OR_PMSB_MCA:
> > +               return -EIO;
> > +       }
> > +
> > +       WARN_ONCE(1, "Unknown PECI completion code: %#02x\n", cc);
> > +
> > +       return -EIO;
> > +}
> > +EXPORT_SYMBOL_NS_GPL(peci_request_status, PECI);
> > +
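
For anyone following the mapping in peci_request_status(), a minimal userspace sketch of the same completion-code classification may help. The 0x93..0x9C values are the ones visible in the hunk above; the others (0x40, 0x80, 0x81, 0x82, 0x90, 0x91) are defined earlier in the patch and are restated here as assumptions. The name cc_to_errno() and the CC_* macros are hypothetical, and the sketch folds the WARN_ONCE() path for unknown codes into the default case.

```c
#include <errno.h>
#include <stdint.h>

/* Assumed values: these are defined outside the quoted hunk. */
#define CC_SUCCESS           0x40
#define CC_NEED_RETRY        0x80
#define CC_OUT_OF_RESOURCE   0x81
#define CC_UNAVAIL_RESOURCE  0x82
#define CC_INVALID_REQ       0x90

/* Classify a PECI completion code the way the quoted switch does. */
static int cc_to_errno(uint8_t cc)
{
	switch (cc) {
	case CC_SUCCESS:
		return 0;
	case CC_NEED_RETRY:
	case CC_OUT_OF_RESOURCE:
	case CC_UNAVAIL_RESOURCE:
		return -EAGAIN;	/* transient, caller may retry */
	case CC_INVALID_REQ:
		return -EINVAL;
	default:
		/* MCA and parity errors (0x91..0x9C) and unknown codes */
		return -EIO;
	}
}
```

Only the -EAGAIN class matters to the retry loop further down; every other result ends the request.
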
> > +static int peci_request_xfer(struct peci_request *req)
> > +{
> > +       struct peci_device *device = req->device;
> > +       struct peci_controller *controller = to_peci_controller(device->dev.parent);
> > +       int ret;
> > +
> > +       mutex_lock(&controller->bus_lock);
> > +       ret = controller->ops->xfer(controller, device->addr, req);
> > +       mutex_unlock(&controller->bus_lock);
> > +
> > +       return ret;
> > +}
> > +
> > +static int peci_request_xfer_retry(struct peci_request *req)
> > +{
> > +       long wait_interval = PECI_RETRY_INTERVAL_MIN;
> > +       struct peci_device *device = req->device;
> > +       struct peci_controller *controller = to_peci_controller(device->dev.parent);
> > +       unsigned long start = jiffies;
> > +       int ret;
> > +
> > +       /* Don't try to use it for ping */
> > +       if (WARN_ON(!req->rx.buf))
> > +               return 0;
> > +
> > +       do {
> > +               ret = peci_request_xfer(req);
> > +               if (ret) {
> > +                       dev_dbg(&controller->dev, "xfer error: %d\n", ret);
> > +                       return ret;
> > +               }
> > +
> > +               if (peci_request_status(req) != -EAGAIN)
> > +                       return 0;
> > +
> > +               /* Set the retry bit to indicate a retry attempt */
> > +               req->tx.buf[1] |= PECI_RETRY_BIT;
> > +
> > +               if (schedule_timeout_interruptible(wait_interval))
> > +                       return -ERESTARTSYS;
> > +
> > +               wait_interval = min_t(long, wait_interval * 2, PECI_RETRY_INTERVAL_MAX);
> > +       } while (time_before(jiffies, start + PECI_RETRY_TIMEOUT));
> > +
> > +       dev_dbg(&controller->dev, "request timed out\n");
> > +
> > +       return -ETIMEDOUT;
> > +}
> > +
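
The retry loop above doubles the sleep interval from PECI_RETRY_INTERVAL_MIN up to PECI_RETRY_INTERVAL_MAX and gives up once PECI_RETRY_TIMEOUT has elapsed. The sketch below simulates that schedule in plain milliseconds. It is an idealized model only: it assumes the device always answers with a retry code, that transfers take no time, and that jiffies have millisecond resolution (in the kernel the real values depend on HZ). simulate_backoff() and struct backoff_result are hypothetical names.

```c
#define RETRY_TIMEOUT_MS       700
#define RETRY_INTERVAL_MIN_MS  1
#define RETRY_INTERVAL_MAX_MS  128

struct backoff_result {
	int attempts;		/* number of transfers issued */
	long elapsed_ms;	/* total time spent sleeping */
};

/* Walk the backoff schedule until the timeout window is exhausted. */
static struct backoff_result simulate_backoff(void)
{
	struct backoff_result r = { 0, 0 };
	long wait = RETRY_INTERVAL_MIN_MS;

	do {
		r.attempts++;		/* one transfer per iteration */
		r.elapsed_ms += wait;	/* sleep before the next attempt */

		/* Double the interval, capped at the maximum. */
		wait = wait * 2 < RETRY_INTERVAL_MAX_MS ?
		       wait * 2 : RETRY_INTERVAL_MAX_MS;
	} while (r.elapsed_ms < RETRY_TIMEOUT_MS);

	return r;
}
```

In this model the sleeps are 1, 2, 4, ..., 64 ms followed by repeated 128 ms waits, which gives 12 transfers before the loop exits. Real hardware will see fewer attempts, since each transfer also consumes part of the window.
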
> >  /**
> >   * peci_request_alloc() - allocate &struct peci_requests
> >   * @device: PECI device to which request is going to be sent
> > @@ -48,3 +175,90 @@ void peci_request_free(struct peci_request *req)
> >         kfree(req);
> >  }
> >  EXPORT_SYMBOL_NS_GPL(peci_request_free, PECI);
> > +
> > +struct peci_request *peci_get_dib(struct peci_device *device)
> > +{
> > +       struct peci_request *req;
> > +       int ret;
> > +
> > +       req = peci_request_alloc(device, PECI_GET_DIB_WR_LEN, PECI_GET_DIB_RD_LEN);
> > +       if (!req)
> > +               return ERR_PTR(-ENOMEM);
> > +
> > +       req->tx.buf[0] = PECI_GET_DIB_CMD;
> > +
> > +       ret = peci_request_xfer(req);
> > +       if (ret) {
> > +               peci_request_free(req);
> > +               return ERR_PTR(ret);
> > +       }
> > +
> > +       return req;
> > +}
> > +EXPORT_SYMBOL_NS_GPL(peci_get_dib, PECI);
> > +
> > +static struct peci_request *
> > +__pkg_cfg_read(struct peci_device *device, u8 index, u16 param, u8 len)
> > +{
> > +       struct peci_request *req;
> > +       int ret;
> > +
> > +       req = peci_request_alloc(device, PECI_RDPKGCFG_WR_LEN, PECI_RDPKGCFG_RD_LEN_BASE + len);
> > +       if (!req)
> > +               return ERR_PTR(-ENOMEM);
> > +
> > +       req->tx.buf[0] = PECI_RDPKGCFG_CMD;
> > +       req->tx.buf[1] = 0;
> > +       req->tx.buf[2] = index;
> > +       put_unaligned_le16(param, &req->tx.buf[3]);
> > +
> > +       ret = peci_request_xfer_retry(req);
> > +       if (ret) {
> > +               peci_request_free(req);
> > +               return ERR_PTR(ret);
> > +       }
> > +
> > +       return req;
> > +}
> > +
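
To make the request layout in __pkg_cfg_read() explicit: byte 0 carries the command, byte 1 is cleared and later has the retry bit set by the retry loop, byte 2 is the index, and bytes 3-4 hold the parameter in little-endian order. A standalone sketch of that layout follows. The command value 0xa1 is an assumption, since PECI_RDPKGCFG_CMD is defined outside the quoted hunk; the length of 5 simply follows from the bytes written; build_rdpkgcfg() is a hypothetical helper.

```c
#include <stdint.h>

#define RDPKGCFG_CMD     0xa1	/* assumption: defined outside the quoted hunk */
#define RDPKGCFG_WR_LEN  5	/* bytes 0..4 are written below */
#define RETRY_BIT        0x01	/* BIT(0), as in PECI_RETRY_BIT */

/* Fill a RdPkgConfig request buffer the way the quoted code does. */
static void build_rdpkgcfg(uint8_t tx[RDPKGCFG_WR_LEN], uint8_t index,
			   uint16_t param)
{
	tx[0] = RDPKGCFG_CMD;
	tx[1] = 0;			/* retry bit clear on the first attempt */
	tx[2] = index;
	tx[3] = param & 0xff;		/* parameter, low byte first */
	tx[4] = param >> 8;
}

/* Mark the request as a retry, mirroring tx.buf[1] |= PECI_RETRY_BIT. */
static void mark_retry(uint8_t tx[RDPKGCFG_WR_LEN])
{
	tx[1] |= RETRY_BIT;
}
```
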
> > +u8 peci_request_data_readb(struct peci_request *req)
> > +{
> > +       return req->rx.buf[1];
> > +}
> > +EXPORT_SYMBOL_NS_GPL(peci_request_data_readb, PECI);
> > +
> > +u16 peci_request_data_readw(struct peci_request *req)
> > +{
> > +       return get_unaligned_le16(&req->rx.buf[1]);
> > +}
> > +EXPORT_SYMBOL_NS_GPL(peci_request_data_readw, PECI);
> > +
> > +u32 peci_request_data_readl(struct peci_request *req)
> > +{
> > +       return get_unaligned_le32(&req->rx.buf[1]);
> > +}
> > +EXPORT_SYMBOL_NS_GPL(peci_request_data_readl, PECI);
> > +
> > +u64 peci_request_data_readq(struct peci_request *req)
> > +{
> > +       return get_unaligned_le64(&req->rx.buf[1]);
> > +}
> > +EXPORT_SYMBOL_NS_GPL(peci_request_data_readq, PECI);
> > +
> > +u64 peci_request_data_dib(struct peci_request *req)
> > +{
> > +       return get_unaligned_le64(&req->rx.buf[0]);
> > +}
> > +EXPORT_SYMBOL_NS_GPL(peci_request_data_dib, PECI);
> > +
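
The accessors above imply two response layouts: for commands that return a completion code, byte 0 is the code and the little-endian payload starts at byte 1, while GetDIB() has no completion code and its 8-byte payload starts at byte 0. A portable sketch of both reads, without the kernel's get_unaligned_le*() helpers, could look like this; rx_cc(), rx_readl() and rx_dib() are hypothetical names.

```c
#include <stdint.h>

/* Completion code lives in the first response byte. */
static uint8_t rx_cc(const uint8_t *rx)
{
	return rx[0];
}

/* 32-bit little-endian payload that follows the completion code. */
static uint32_t rx_readl(const uint8_t *rx)
{
	return (uint32_t)rx[1] |
	       (uint32_t)rx[2] << 8 |
	       (uint32_t)rx[3] << 16 |
	       (uint32_t)rx[4] << 24;
}

/* 64-bit little-endian GetDIB() payload, starting at byte 0. */
static uint64_t rx_dib(const uint8_t *rx)
{
	uint64_t val = 0;
	int i;

	for (i = 7; i >= 0; i--)
		val = val << 8 | rx[i];

	return val;
}
```

Building the value byte by byte avoids any alignment or host-endianness assumptions, which is the same property the kernel helpers provide.
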
> > +#define __read_pkg_config(x, type) \
> > +struct peci_request *peci_pkg_cfg_##x(struct peci_device *device, u8 index, u16 param) \
> > +{ \
> > +       return __pkg_cfg_read(device, index, param, sizeof(type)); \
> > +} \
> > +EXPORT_SYMBOL_NS_GPL(peci_pkg_cfg_##x, PECI)
> > +
> > +__read_pkg_config(readb, u8);
> > +__read_pkg_config(readw, u16);
> > +__read_pkg_config(readl, u32);
> > +__read_pkg_config(readq, u64);
> > diff --git a/include/linux/peci.h b/include/linux/peci.h
> > index 26e0a4e73b50..dcf1c53f4e40 100644
> > --- a/include/linux/peci.h
> > +++ b/include/linux/peci.h
> > @@ -14,6 +14,14 @@
> >   */
> >  #define PECI_REQUEST_MAX_BUF_SIZE 32
> > 
> > +#define PECI_PCS_PKG_ID                        0  /* Package Identifier Read */
> > +#define  PECI_PKG_ID_CPU_ID            0x0000  /* CPUID Info */
> > +#define  PECI_PKG_ID_PLATFORM_ID       0x0001  /* Platform ID */
> > +#define  PECI_PKG_ID_DEVICE_ID         0x0002  /* Uncore Device ID */
> > +#define  PECI_PKG_ID_MAX_THREAD_ID     0x0003  /* Max Thread ID */
> > +#define  PECI_PKG_ID_MICROCODE_REV     0x0004  /* CPU Microcode Update Revision */
> > +#define  PECI_PKG_ID_MCA_ERROR_LOG     0x0005  /* Machine Check Status */
> > +
> >  struct peci_controller;
> >  struct peci_request;
> > 
> > @@ -59,6 +67,11 @@ static inline struct peci_controller *to_peci_controller(void *d)
> >   * struct peci_device - PECI device
> >   * @dev: device object to register PECI device to the device model
> >   * @controller: manages the bus segment hosting this PECI device
> > + * @info: PECI device characteristics
> > + * @info.family: device family
> > + * @info.model: device model
> > + * @info.peci_revision: PECI revision supported by the PECI device
> > + * @info.socket_id: the socket ID represented by the PECI device
> >   * @addr: address used on the PECI bus connected to the parent controller
> >   *
> >   * A peci_device identifies a single device (i.e. CPU) connected to a PECI bus.
> > @@ -67,6 +80,12 @@ static inline struct peci_controller *to_peci_controller(void *d)
> >   */
> >  struct peci_device {
> >         struct device dev;
> > +       struct {
> > +               u16 family;
> > +               u8 model;
> > +               u8 peci_revision;
> > +               u8 socket_id;
> > +       } info;
> >         u8 addr;
> >  };
> > 
> > diff --git a/lib/Kconfig b/lib/Kconfig
> > index e538d4d773bd..7f7972d357c2 100644
> > --- a/lib/Kconfig
> > +++ b/lib/Kconfig
> > @@ -718,4 +718,4 @@ config ASN1_ENCODER
> > 
> >  config GENERIC_LIB_X86
> >         bool
> > -       depends on X86
> > +       depends on X86 || PECI
> 
> This looks broken, what in the GENERIC_LIB_X86 implementation depends on peci?

Not applicable anymore.

Apologies for the huge delay in my replies.

Thanks
-Iwona



end of thread, other threads:[~2021-11-16  0:11 UTC | newest]

Thread overview: 49+ messages
2021-08-03 11:31 [PATCH v2 00/15] Introduce PECI subsystem Iwona Winiarska
2021-08-03 11:31 ` [PATCH v2 01/15] x86/cpu: Move intel-family to arch-independent headers Iwona Winiarska
2021-10-04 19:03   ` Borislav Petkov
2021-10-11 19:21     ` Winiarska, Iwona
2021-10-11 19:40       ` Dave Hansen
2021-10-11 20:53         ` Winiarska, Iwona
2021-10-11 23:12           ` Dave Hansen
2021-10-11 20:06       ` Borislav Petkov
2021-10-11 20:38         ` Winiarska, Iwona
2021-10-11 21:31           ` Borislav Petkov
2021-10-12 23:15             ` Winiarska, Iwona
2021-10-13  6:42               ` Borislav Petkov
2021-08-03 11:31 ` [PATCH v2 02/15] x86/cpu: Extract cpuid helpers to arch-independent Iwona Winiarska
2021-10-04 19:08   ` Borislav Petkov
2021-10-11 19:32     ` Winiarska, Iwona
2021-08-03 11:31 ` [PATCH v2 03/15] dt-bindings: Add generic bindings for PECI Iwona Winiarska
2021-08-11 18:11   ` Rob Herring
2021-08-03 11:31 ` [PATCH v2 04/15] dt-bindings: Add bindings for peci-aspeed Iwona Winiarska
2021-08-11 18:11   ` Rob Herring
2021-08-03 11:31 ` [PATCH v2 05/15] ARM: dts: aspeed: Add PECI controller nodes Iwona Winiarska
2021-08-03 11:31 ` [PATCH v2 06/15] peci: Add core infrastructure Iwona Winiarska
2021-08-25 22:58   ` Dan Williams
2021-08-26 22:40     ` Winiarska, Iwona
2021-08-03 11:31 ` [PATCH v2 07/15] peci: Add peci-aspeed controller driver Iwona Winiarska
2021-08-26  1:35   ` Dan Williams
2021-08-26 23:54     ` Winiarska, Iwona
2021-08-27 16:24       ` Dan Williams
2021-08-29 19:42         ` Winiarska, Iwona
2021-08-03 11:31 ` [PATCH v2 08/15] peci: Add device detection Iwona Winiarska
2021-08-27 19:01   ` Dan Williams
2021-11-15 22:18     ` Winiarska, Iwona
2021-08-03 11:31 ` [PATCH v2 09/15] peci: Add sysfs interface for PECI bus Iwona Winiarska
2021-08-27 19:11   ` Dan Williams
2021-11-15 22:19     ` Winiarska, Iwona
2021-08-03 11:31 ` [PATCH v2 10/15] peci: Add support for PECI device drivers Iwona Winiarska
2021-08-27 21:19   ` Dan Williams
2021-11-15 22:20     ` Winiarska, Iwona
2021-08-03 11:31 ` [PATCH v2 11/15] peci: Add peci-cpu driver Iwona Winiarska
2021-08-03 11:31 ` [PATCH v2 12/15] hwmon: peci: Add cputemp driver Iwona Winiarska
2021-08-03 15:24   ` Guenter Roeck
2021-08-04 10:43     ` Winiarska, Iwona
2021-08-03 11:31 ` [PATCH v2 13/15] hwmon: peci: Add dimmtemp driver Iwona Winiarska
2021-08-03 15:39   ` Guenter Roeck
2021-08-04 10:46     ` Winiarska, Iwona
2021-08-04 17:33       ` Guenter Roeck
2021-08-05 21:48         ` Winiarska, Iwona
2021-08-03 11:31 ` [PATCH v2 14/15] docs: hwmon: Document PECI drivers Iwona Winiarska
2021-08-03 11:31 ` [PATCH v2 15/15] docs: Add PECI documentation Iwona Winiarska
2021-08-05 12:17 ` [PATCH v2 00/15] Introduce PECI subsystem Greg Kroah-Hartman
