linux-kernel.vger.kernel.org archive mirror
* [PATCH v2 0/8] Add hardware prefetch control driver for arm64 and x86
@ 2022-03-11 10:19 Kohei Tarumizu
  2022-03-11 10:19 ` [PATCH v2 1/8] drivers: base: Add hardware prefetch control core driver Kohei Tarumizu
                   ` (8 more replies)
  0 siblings, 9 replies; 19+ messages in thread
From: Kohei Tarumizu @ 2022-03-11 10:19 UTC (permalink / raw)
  To: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
	linux-arm-kernel, linux-kernel
  Cc: tarumizu.kohei

This patch series adds a sysfs interface to control the CPU's hardware
prefetch behavior from userspace for performance tuning on arm64 and
x86 (on supported CPUs).

Changes from v1:
  - split the attribute file so that each file holds one value
    - example of old attribute file
      /sys/devices/system/cpu/cpu*/cache/index[0,2]/prefetch_control
    - example of new attribute file
      /sys/devices/system/cpu/cpu*/cache/index[0,2]/prefetch_control/*_prefetcher_*
  - remove the description of "default m" in arm64's Kconfig
https://lore.kernel.org/lkml/20220125071414.811344-1-tarumizu.kohei@fujitsu.com/

[Background]
============
A64FX and some Intel processors have implementation-dependent registers
for controlling the CPU's hardware prefetch behavior. A64FX has
IMP_PF_STREAM_DETECT_CTRL_EL0[1], and Intel processors have MSR 0x1a4
(MSR_MISC_FEATURE_CONTROL)[2]. These registers cannot be accessed
directly from userspace.

[1]https://github.com/fujitsu/A64FX/tree/master/doc/
   A64FX_Specification_HPC_Extension_v1_EN.pdf

[2]https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html
    Volume 4

Tuning these registers can improve performance. As an example, the
results of running the STREAM benchmark on the A64FX are described in
section [Merit].

For MSR 0x1a4, it is also possible to change the value from userspace
via the MSR driver. However, using the MSR driver is not recommended,
so a proper kernel interface is needed[3].

[3]https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/about/

For these reasons, we provide a new proper kernel interface to control
both IMP_PF_STREAM_DETECT_CTRL_EL0 and MSR 0x1a4.

[Overall design]
================
The source code for this driver is divided into a common part
(drivers/base/pfctl.c) and architecture parts (arch/XXX/XXX/pfctl.c).
The common part implements architecture-independent processing, such as
creating the sysfs files.
The architecture parts implement architecture-dependent processing. Each
must describe at least which types of hardware prefetcher are supported
and how to read and write the register. This information is passed to
the common part through its registration function.
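
As a rough illustration of the registration step, the following
userspace C sketch models the argument checks performed by
pfctl_register_driver() in patch1. The enum and struct mirror
include/linux/pfctl.h. The names model_register_driver(), fake_read(),
fake_write(), fake_reg and fake_driver are hypothetical, and the CPU
hotplug setup done by the real function is omitted.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Prefetcher type bits, mirroring enum prefetcher in include/linux/pfctl.h. */
enum prefetcher {
	HWPF  = 1 << 0,	/* Hardware Prefetcher */
	IPPF  = 1 << 1,	/* IP Prefetcher */
	ACLPF = 1 << 2,	/* Adjacent Cache Line Prefetcher */
	SDPF  = 1 << 3,	/* Stream Detect Prefetcher */
};

/* Attribute selectors, mirroring enum pfctl_attr. */
enum pfctl_attr {
	HWPF_ENABLE, IPPF_ENABLE, ACLPF_ENABLE,
	SDPF_ENABLE, SDPF_STRONG, SDPF_DIST,
};

struct pfctl_driver {
	unsigned int supported_l1d_prefetcher;
	unsigned int supported_l2_prefetcher;
	int (*read_pfreg)(enum pfctl_attr pattr, unsigned int cpu,
			  unsigned int level, uint64_t *val);
	int (*write_pfreg)(enum pfctl_attr pattr, unsigned int cpu,
			   unsigned int level, uint64_t val);
};

static struct pfctl_driver *pdriver;

/* Hypothetical userspace model of the checks in pfctl_register_driver(). */
static int model_register_driver(struct pfctl_driver *drv)
{
	if (pdriver)
		return -EEXIST;		/* only one driver may be registered */
	if (drv->supported_l1d_prefetcher == 0 &&
	    drv->supported_l2_prefetcher == 0)
		return -EINVAL;		/* no prefetcher type declared */
	if (!drv->read_pfreg || !drv->write_pfreg)
		return -EINVAL;		/* register accessors are mandatory */
	pdriver = drv;
	return 0;
}

/* Hypothetical backend: a single fake register, one bit per attribute. */
static uint64_t fake_reg;

static int fake_read(enum pfctl_attr pattr, unsigned int cpu,
		     unsigned int level, uint64_t *val)
{
	(void)cpu;
	(void)level;
	*val = (fake_reg >> pattr) & 1;
	return 0;
}

static int fake_write(enum pfctl_attr pattr, unsigned int cpu,
		      unsigned int level, uint64_t val)
{
	(void)cpu;
	(void)level;
	if (val > 1)
		return -EINVAL;
	fake_reg = (fake_reg & ~(1ULL << pattr)) | (val << pattr);
	return 0;
}

/* What an architecture part would hand to the common part. */
static struct pfctl_driver fake_driver = {
	.supported_l1d_prefetcher = HWPF | IPPF,
	.supported_l2_prefetcher = HWPF | ACLPF,
	.read_pfreg = fake_read,
	.write_pfreg = fake_write,
};
```

In this model, registering a driver with no supported prefetcher or
without accessors fails with -EINVAL, and a second registration fails
with -EEXIST, matching the checks in the posted code.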

This driver creates a "prefetch_control" directory and some attribute
files in every CPU's cache/index[0,2] directory, if the CPU supports
hardware prefetch control. Each attribute file corresponds to the cache
level of the parent index directory.

A detailed description of this sysfs interface is in
Documentation/ABI/testing/sysfs-devices-system-cpu (patch8).
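
For a userspace tool, the directory layout above can be captured in a
small helper. This is only a sketch: pfctl_attr_path() is a
hypothetical name, and the path format is taken from the examples in
this cover letter.

```c
#include <stdio.h>

/*
 * Build the path of one prefetch_control attribute file.
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int pfctl_attr_path(char *buf, size_t len, unsigned int cpu,
			   unsigned int index, const char *attr)
{
	int n = snprintf(buf, len,
			 "/sys/devices/system/cpu/cpu%u/cache/index%u/"
			 "prefetch_control/%s", cpu, index, attr);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

The resulting path can then be opened and written like any other sysfs
file. The attribute files are created with admin-only permissions, so
root is required.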

This driver needs the cache sysfs directory and cache level/type
information. On ARM processors, this information can be obtained
from registers even without ACPI PPTT.
Patch5 adds processing to create a cache/index directory using only the
information from the registers if the machine does not support ACPI
PPTT and the Kconfig option for hardware prefetch control
(CONFIG_HWPF_CONTROL) is enabled.
This change causes a problem, which is described in [Known problem].

[Examples]
==========
This section provides an example of using this sysfs interface on the
x86 model INTEL_FAM6_BROADWELL_X.

On this model, MSR 0x1a4 has the following bit layout:

[0]    L2 Hardware Prefetcher Disable (R/W)
[1]    L2 Adjacent Cache Line Prefetcher Disable (R/W)
[2]    DCU Hardware Prefetcher Disable (R/W)
[3]    DCU IP Prefetcher Disable (R/W)
[63:4] Reserved

In this case, index0 (L1d cache) corresponds to bits [2,3] and index2
(L2 cache) corresponds to bits [0,1]. The attribute files of index0 and
index2 for CPU1 on BROADWELL_X are as follows:

```
# ls /sys/devices/system/cpu/cpu1/cache/index0/prefetch_control/

hardware_prefetcher_enable
ip_prefetcher_enable

# ls /sys/devices/system/cpu/cpu1/cache/index2/prefetch_control/

adjacent_cache_line_prefetcher_enable
hardware_prefetcher_enable
```

To disable the L2 adjacent cache line prefetcher on CPU1 (the prefetcher
controlled by "L2 Adjacent Cache Line Prefetcher Disable (R/W)"), write 0
to its enable file:

```
# echo 0 > /sys/devices/system/cpu/cpu1/cache/index2/prefetch_control/adjacent_cache_line_prefetcher_enable
```
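
The relation between the "*_enable" files and the "Disable" bits of the
register can be sketched in C as follows. This is a minimal userspace
model, assuming the x86 part inverts the bit in the same way the A64FX
code in patch3 does. The macro and function names are hypothetical.

```c
#include <stdint.h>

/* Bit positions of MSR 0x1a4 on BROADWELL_X, as listed above. */
#define MSR_L2_HWPF_DISABLE	(1ULL << 0)
#define MSR_L2_ACLPF_DISABLE	(1ULL << 1)
#define MSR_DCU_HWPF_DISABLE	(1ULL << 2)
#define MSR_DCU_IPPF_DISABLE	(1ULL << 3)

/* Value an "*_enable" file would show: 1 when the disable bit is clear. */
static unsigned int enable_from_msr(uint64_t msr, uint64_t disable_bit)
{
	return (msr & disable_bit) ? 0 : 1;
}

/* MSR value after writing `enable` (0 or 1) to an "*_enable" file. */
static uint64_t msr_with_enable(uint64_t msr, uint64_t disable_bit,
				unsigned int enable)
{
	return enable ? (msr & ~disable_bit) : (msr | disable_bit);
}
```

Under this model, the echo command above sets bit 1 of the register and
leaves the other three bits unchanged.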

As another example, the attribute files of index0 on A64FX are as follows:

```
# ls /sys/devices/system/cpu/cpu1/cache/index0/prefetch_control/

stream_detect_prefetcher_dist
stream_detect_prefetcher_enable
stream_detect_prefetcher_strong
```
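
The value written to stream_detect_prefetcher_dist is rounded up to a
multiple of a per-level unit before it is stored in the register. The
sketch below is a userspace restatement of a64fx_modify_sdpf_dist() and
a64fx_get_sdpf_dist() from patch3. The unit values and the 4-bit field
width come from that patch, while the function names and
A64FX_SDPF_DIST_FIELD_MAX are hypothetical.

```c
#include <errno.h>
#include <stdint.h>

#define A64FX_SDPF_MIN_DIST_L1		256u
#define A64FX_SDPF_MIN_DIST_L2		1024u
#define A64FX_SDPF_DIST_FIELD_MAX	0xfu	/* both DIST fields are 4 bits */

/* Return the per-level unit, or 0 for an unsupported cache level. */
static uint64_t a64fx_dist_unit(unsigned int level)
{
	switch (level) {
	case 1:
		return A64FX_SDPF_MIN_DIST_L1;
	case 2:
		return A64FX_SDPF_MIN_DIST_L2;
	default:
		return 0;
	}
}

/* Convert a requested dist to the register field value, or -EINVAL. */
static int a64fx_dist_to_field(unsigned int level, uint64_t dist)
{
	uint64_t unit = a64fx_dist_unit(level);
	uint64_t field;

	if (!unit)
		return -EINVAL;

	/* Round up to a whole number of units. */
	field = dist / unit + (dist % unit != 0);
	if (field > A64FX_SDPF_DIST_FIELD_MAX)
		return -EINVAL;

	return (int)field;
}

/* Convert a register field value back to the dist shown in sysfs. */
static uint64_t a64fx_field_to_dist(unsigned int level, unsigned int field)
{
	return (uint64_t)field * a64fx_dist_unit(level);
}
```

With these units, a request of 300 at L1 is stored as field value 2 and
reads back as 512. A dist of 0 corresponds to PFCTL_DIST_AUTO_VAL and is
shown as "auto".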

[Patch organizations]
=====================
This patch series adds a hardware prefetch control core driver for ARM64
and x86. It also adds support for FUJITSU_CPU_PART_A64FX on ARM64 and
BROADWELL_X on x86.

- patch1: Add hardware prefetch core driver

  This driver provides register/unregister functions to create the
  "prefetch_control" directory and some attribute files in every CPU's
  cache/index[0,2] directory.
  If the architecture can control the CPU's hardware prefetch
  behavior, use these functions to create the sysfs files. When
  registering, it is necessary to provide which types of hardware
  prefetcher are supported and how to read/write the register.

- patch2: Add Kconfig/Makefile to build hardware prefetch control core
  driver

- patch3: Add support for ARM64

  This adds module init/exit code, and creates the sysfs attribute files
  "stream_detect_prefetcher_enable", "stream_detect_prefetcher_strong"
  and "stream_detect_prefetcher_dist" for ARM64. At this point, this
  driver works only if the part number is FUJITSU_CPU_PART_A64FX.

- patch4: Add Kconfig/Makefile to build driver for arm64

- patch5: Create cache sysfs directory without ACPI PPTT for hardware
  prefetch control

  The hardware prefetch control driver needs the cache sysfs directory
  and cache level/type information. On ARM processors, this information
  can be obtained from registers even without PPTT. Therefore, we set
  cpu_map_populated to true to create the cache sysfs directory if the
  machine doesn't have PPTT.

- patch6: Add support for x86

  This adds module init/exit code, and creates the sysfs attribute files
  "hardware_prefetcher_enable", "ip_prefetcher_enable" and
  "adjacent_cache_line_prefetcher_enable" for x86. At this point, this
  driver works only if the model is INTEL_FAM6_BROADWELL_X.

- patch7: Add Kconfig/Makefile to build driver for x86

- patch8: Add documentation for the new sysfs interface


[Known problem]
===============
- The `lscpu` command terminates with -ENOENT because the cache/index
  directory exists but the shared_cpu_map file does not. This is due to
  patch5, which creates a cache/index directory containing only level
  and type when ACPI PPTT is absent.

[Merit]
=======
For reference, here is the result of STREAM Triad when tuning with
the "dist" attribute file (stream_detect_prefetcher_dist) of the L1 and
L2 caches on A64FX.

| dist combination  | Pattern A   | Pattern B   |
|-------------------|-------------|-------------|
| L1:256,  L2:1024  | 234505.2144 | 114600.0801 |
| L1:1536, L2:1024  | 279172.8742 | 118979.4542 |
| L1:256,  L2:10240 | 247716.7757 | 127364.1533 |
| L1:1536, L2:10240 | 283675.6625 | 125950.6847 |

In pattern A, we set the size of the array to 174720, which is about
half the size of the L1d cache. In pattern B, we set the size of the
array to 10485120, which is about twice the size of the L2 cache.

In pattern A, a change of dist at L1 has a larger effect. On the other
hand, in pattern B, the change of dist at L2 has a larger effect.
As described above, the optimal dist combination depends on the
characteristics of the application. Therefore, such a sysfs interface
is useful for performance tuning.

Best regards,
Kohei Tarumizu

Kohei Tarumizu (8):
  drivers: base: Add hardware prefetch control core driver
  drivers: base: Add Kconfig/Makefile to build hardware prefetch control
    core driver
  arm64: Add hardware prefetch control support for ARM64
  arm64: Add Kconfig/Makefile to build hardware prefetch control driver
  arm64: Create cache sysfs directory without ACPI PPTT for hardware
    prefetch control
  x86: Add hardware prefetch control support for x86
  x86: Add Kconfig/Makefile to build hardware prefetch control driver
  docs: ABI: Add sysfs documentation interface of hardware prefetch
    control driver

 .../ABI/testing/sysfs-devices-system-cpu      |  89 ++++
 MAINTAINERS                                   |   8 +
 arch/arm64/Kconfig                            |   7 +
 arch/arm64/kernel/Makefile                    |   1 +
 arch/arm64/kernel/cacheinfo.c                 |  29 ++
 arch/arm64/kernel/pfctl.c                     | 368 ++++++++++++++++
 arch/x86/Kconfig                              |   7 +
 arch/x86/kernel/cpu/Makefile                  |   2 +
 arch/x86/kernel/cpu/pfctl.c                   | 314 +++++++++++++
 drivers/base/Kconfig                          |  13 +
 drivers/base/Makefile                         |   1 +
 drivers/base/pfctl.c                          | 412 ++++++++++++++++++
 include/linux/pfctl.h                         |  41 ++
 13 files changed, 1292 insertions(+)
 create mode 100644 arch/arm64/kernel/pfctl.c
 create mode 100644 arch/x86/kernel/cpu/pfctl.c
 create mode 100644 drivers/base/pfctl.c
 create mode 100644 include/linux/pfctl.h

-- 
2.27.0



* [PATCH v2 1/8] drivers: base: Add hardware prefetch control core driver
  2022-03-11 10:19 [PATCH v2 0/8] Add hardware prefetch control driver for arm64 and x86 Kohei Tarumizu
@ 2022-03-11 10:19 ` Kohei Tarumizu
  2022-03-11 10:19 ` [PATCH v2 2/8] drivers: base: Add Kconfig/Makefile to build " Kohei Tarumizu
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 19+ messages in thread
From: Kohei Tarumizu @ 2022-03-11 10:19 UTC (permalink / raw)
  To: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
	linux-arm-kernel, linux-kernel
  Cc: tarumizu.kohei

This driver adds the register/unregister functions to create the
"prefetch_control" directory and some attribute files in every CPU's
cache/index[0,2] directory.

Which attribute files exist depends on the kind of processor and the
cache level. For example, on an INTEL_FAM6_BROADWELL_X:

    /sys/devices/system/cpu/cpu0/cache/index0/prefetch_control
        hardware_prefetcher_enable
        ip_prefetcher_enable

    /sys/devices/system/cpu/cpu0/cache/index2/prefetch_control
        adjacent_cache_line_prefetcher_enable
        hardware_prefetcher_enable

If the architecture can control the CPU's hardware prefetcher
behavior, use these functions to create the sysfs files. When
registering, it is necessary to provide which types of hardware
prefetcher are supported and how to read/write the register.

The following patches add support for ARM64 and x86.

Signed-off-by: Kohei Tarumizu <tarumizu.kohei@fujitsu.com>
---
 drivers/base/pfctl.c  | 412 ++++++++++++++++++++++++++++++++++++++++++
 include/linux/pfctl.h |  41 +++++
 2 files changed, 453 insertions(+)
 create mode 100644 drivers/base/pfctl.c
 create mode 100644 include/linux/pfctl.h

diff --git a/drivers/base/pfctl.c b/drivers/base/pfctl.c
new file mode 100644
index 000000000000..9335d513f55f
--- /dev/null
+++ b/drivers/base/pfctl.c
@@ -0,0 +1,412 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright 2022 FUJITSU LIMITED
+ *
+ * This driver provides tunable sysfs interface for Hardware Prefetch Control.
+ * See Documentation/ABI/testing/sysfs-devices-system-cpu for more information.
+ *
+ * This code provides architecture-independent functions, such as creating and
+ * removing attribute files.
+ * The implementation of reads and writes to the Hardware Prefetch Control
+ * register is architecture-dependent. Therefore, each architecture registers
+ * callbacks to read and write the register via pfctl_register_driver().
+ */
+
+#include <linux/cacheinfo.h>
+#include <linux/cpu.h>
+#include <linux/device.h>
+#include <linux/pfctl.h>
+#include <linux/parser.h>
+#include <linux/slab.h>
+
+#ifdef pr_fmt
+#undef pr_fmt
+#endif
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+static DEFINE_PER_CPU(struct device *, cache_device_pcpu);
+#define per_cpu_cache_device(cpu) (per_cpu(cache_device_pcpu, cpu))
+
+struct pfctl_driver *pdriver;
+enum cpuhp_state hp_online;
+
+static const char dist_auto_string[] = "auto";
+
+static bool prefetcher_is_available(unsigned int level, enum cache_type type,
+				    int prefetcher)
+{
+	if ((level == 1) && (type == CACHE_TYPE_DATA)) {
+		if (pdriver->supported_l1d_prefetcher & prefetcher)
+			return true;
+	} else if ((level == 2) && (type == CACHE_TYPE_UNIFIED)) {
+		if (pdriver->supported_l2_prefetcher & prefetcher)
+			return true;
+	}
+
+	return false;
+}
+
+#define pfctl_enable_show(prefetcher, pattr)				\
+static ssize_t								\
+prefetcher##_enable_show(struct device *dev,				\
+			 struct device_attribute *attr, char *buf)	\
+{									\
+	int ret;							\
+	u64 val;							\
+	unsigned int cpu;						\
+	struct cacheinfo *this_leaf = dev_get_drvdata(dev->parent);	\
+									\
+	cpu = dev->parent->parent->parent->id;				\
+									\
+	ret = pdriver->read_pfreg(pattr, cpu, this_leaf->level, &val);	\
+	if (ret < 0)							\
+		return ret;						\
+									\
+	if ((val == PFCTL_ENABLE_VAL) || (val == PFCTL_DISABLE_VAL))	\
+		return sysfs_emit(buf, "%llu\n", val);			\
+	else								\
+		return -EINVAL;						\
+}
+
+pfctl_enable_show(hardware_prefetcher, HWPF_ENABLE);
+pfctl_enable_show(ip_prefetcher, IPPF_ENABLE);
+pfctl_enable_show(adjacent_cache_line_prefetcher, ACLPF_ENABLE);
+pfctl_enable_show(stream_detect_prefetcher, SDPF_ENABLE);
+
+static ssize_t
+stream_detect_prefetcher_strong_show(struct device *dev,
+				     struct device_attribute *attr, char *buf)
+{
+	int ret;
+	u64 val;
+	unsigned int cpu;
+	struct cacheinfo *this_leaf = dev_get_drvdata(dev->parent);
+
+	cpu = dev->parent->parent->parent->id;
+
+	ret = pdriver->read_pfreg(SDPF_STRONG, cpu, this_leaf->level, &val);
+	if (ret < 0)
+		return ret;
+
+	if ((val == PFCTL_STRONG_VAL) || (val == PFCTL_WEAK_VAL))
+		return sysfs_emit(buf, "%llu\n", val);
+	else
+		return -EINVAL;
+
+}
+
+static ssize_t
+stream_detect_prefetcher_dist_show(struct device *dev,
+				   struct device_attribute *attr, char *buf)
+{
+	int ret;
+	u64 val;
+	unsigned int cpu;
+	struct cacheinfo *this_leaf = dev_get_drvdata(dev->parent);
+
+	cpu = dev->parent->parent->parent->id;
+
+	ret = pdriver->read_pfreg(SDPF_DIST, cpu, this_leaf->level, &val);
+	if (ret < 0)
+		return ret;
+
+	if (val == PFCTL_DIST_AUTO_VAL)
+		return sysfs_emit(buf, "%s\n", dist_auto_string);
+	else
+		return sysfs_emit(buf, "%llu\n", val);
+}
+
+#define pfctl_enable_store(prefetcher, pattr)				\
+static ssize_t								\
+prefetcher##_enable_store(struct device *dev,				\
+			  struct device_attribute *attr,		\
+			  const char *buf, size_t count)		\
+{									\
+	int ret;							\
+	u64 val;							\
+	unsigned int cpu;						\
+	struct cacheinfo *this_leaf = dev_get_drvdata(dev->parent);	\
+									\
+	ret = kstrtoull(buf, 10, &val);					\
+	if (ret < 0)							\
+		return -EINVAL;						\
+									\
+	if ((val != PFCTL_ENABLE_VAL) && (val != PFCTL_DISABLE_VAL))	\
+		return -EINVAL;						\
+									\
+	cpu = dev->parent->parent->parent->id;				\
+									\
+	ret = pdriver->write_pfreg(pattr, cpu, this_leaf->level, val);	\
+	if (ret < 0)							\
+		return ret;						\
+									\
+	return count;							\
+}
+
+pfctl_enable_store(hardware_prefetcher, HWPF_ENABLE);
+pfctl_enable_store(ip_prefetcher, IPPF_ENABLE);
+pfctl_enable_store(adjacent_cache_line_prefetcher, ACLPF_ENABLE);
+pfctl_enable_store(stream_detect_prefetcher, SDPF_ENABLE);
+
+static ssize_t
+stream_detect_prefetcher_strong_store(struct device *dev,
+				      struct device_attribute *attr,
+				      const char *buf, size_t count)
+{
+	int ret;
+	u64 val;
+	unsigned int cpu;
+	struct cacheinfo *this_leaf = dev_get_drvdata(dev->parent);
+
+	ret = kstrtoull(buf, 10, &val);
+	if (ret < 0)
+		return -EINVAL;
+
+	if ((val != PFCTL_STRONG_VAL) && (val != PFCTL_WEAK_VAL))
+		return -EINVAL;
+
+	cpu = dev->parent->parent->parent->id;
+
+	ret = pdriver->write_pfreg(SDPF_STRONG, cpu, this_leaf->level, val);
+	if (ret < 0)
+		return ret;
+
+	return count;
+}
+
+static ssize_t
+stream_detect_prefetcher_dist_store(struct device *dev,
+				    struct device_attribute *attr,
+				    const char *buf, size_t count)
+{
+	int ret;
+	u64 val;
+	unsigned int cpu;
+	struct cacheinfo *this_leaf = dev_get_drvdata(dev->parent);
+
+	if (sysfs_streq(buf, dist_auto_string)) {
+		val = PFCTL_DIST_AUTO_VAL;
+	} else {
+		ret = kstrtoull(buf, 10, &val);
+		if (ret < 0)
+			return -EINVAL;
+	}
+
+	cpu = dev->parent->parent->parent->id;
+
+	ret = pdriver->write_pfreg(SDPF_DIST, cpu, this_leaf->level, val);
+	if (ret < 0)
+		return ret;
+
+	return count;
+}
+
+static DEVICE_ATTR_ADMIN_RW(hardware_prefetcher_enable);
+static DEVICE_ATTR_ADMIN_RW(ip_prefetcher_enable);
+static DEVICE_ATTR_ADMIN_RW(adjacent_cache_line_prefetcher_enable);
+static DEVICE_ATTR_ADMIN_RW(stream_detect_prefetcher_enable);
+static DEVICE_ATTR_ADMIN_RW(stream_detect_prefetcher_strong);
+static DEVICE_ATTR_ADMIN_RW(stream_detect_prefetcher_dist);
+
+static umode_t
+pfctl_attrs_is_visible(struct kobject *kobj, struct attribute *attr, int unused)
+{
+	struct device *dev = kobj_to_dev(kobj);
+	struct cacheinfo *this_leaf = dev_get_drvdata(dev->parent);
+	umode_t mode = attr->mode;
+
+	if ((attr == &dev_attr_hardware_prefetcher_enable.attr) &&
+	    (prefetcher_is_available(this_leaf->level, this_leaf->type, HWPF)))
+		return mode;
+
+	if ((attr == &dev_attr_ip_prefetcher_enable.attr) &&
+	    (prefetcher_is_available(this_leaf->level, this_leaf->type, IPPF)))
+		return mode;
+
+	if ((attr == &dev_attr_adjacent_cache_line_prefetcher_enable.attr) &&
+	    (prefetcher_is_available(this_leaf->level, this_leaf->type, ACLPF)))
+		return mode;
+
+	if (((attr == &dev_attr_stream_detect_prefetcher_enable.attr) ||
+	     (attr == &dev_attr_stream_detect_prefetcher_strong.attr) ||
+	     (attr == &dev_attr_stream_detect_prefetcher_dist.attr)) &&
+	    (prefetcher_is_available(this_leaf->level, this_leaf->type, SDPF)))
+		return mode;
+
+	return 0;
+}
+
+static struct attribute *pfctl_attrs[] = {
+	&dev_attr_hardware_prefetcher_enable.attr,
+	&dev_attr_ip_prefetcher_enable.attr,
+	&dev_attr_adjacent_cache_line_prefetcher_enable.attr,
+	&dev_attr_stream_detect_prefetcher_enable.attr,
+	&dev_attr_stream_detect_prefetcher_strong.attr,
+	&dev_attr_stream_detect_prefetcher_dist.attr,
+	NULL,
+};
+
+static const struct attribute_group pfctl_group = {
+	.attrs = pfctl_attrs,
+	.is_visible = pfctl_attrs_is_visible,
+};
+
+static const struct attribute_group *pfctl_groups[] = {
+	&pfctl_group,
+	NULL,
+};
+
+static int find_cache_device(unsigned int cpu)
+{
+	struct device *cpu_dev = get_cpu_device(cpu);
+	struct device *cache_dev;
+
+	cache_dev = device_find_child_by_name(cpu_dev, "cache");
+	if (!cache_dev)
+		return -ENODEV;
+	per_cpu_cache_device(cpu) = cache_dev;
+
+	return 0;
+}
+
+static int _remove_pfctl_attr(struct device *dev, void *data)
+{
+	struct cacheinfo *leaf = dev_get_drvdata(dev);
+	struct device *pfctl_dev;
+
+	if (!prefetcher_is_available(leaf->level, leaf->type, ANYPF))
+		return 0;
+
+	pfctl_dev = device_find_child_by_name(dev, "prefetch_control");
+	if (!pfctl_dev)
+		return 0;
+
+	device_unregister(pfctl_dev);
+	return 0;
+}
+
+static void remove_pfctl_attr(unsigned int cpu)
+{
+	struct device *cache_dev = per_cpu_cache_device(cpu);
+
+	if (!cache_dev)
+		return;
+
+	device_for_each_child(cache_dev, NULL, _remove_pfctl_attr);
+}
+
+static int _create_pfctl_attr(struct device *dev, void *data)
+{
+	struct cacheinfo *leaf = dev_get_drvdata(dev);
+	struct device *pfctl_dev;
+
+	if (!prefetcher_is_available(leaf->level, leaf->type, ANYPF))
+		return 0;
+
+	pfctl_dev = cpu_device_create(dev, NULL, pfctl_groups,
+				      "prefetch_control");
+	if (IS_ERR(pfctl_dev))
+		return PTR_ERR(pfctl_dev);
+
+	return 0;
+}
+
+static int create_pfctl_attr(unsigned int cpu)
+{
+	int ret;
+	struct device *cache_dev = per_cpu_cache_device(cpu);
+
+	if (!cache_dev)
+		return -ENODEV;
+
+	ret = device_for_each_child(cache_dev, NULL, _create_pfctl_attr);
+	if (ret < 0)
+		return ret;
+
+	return 0;
+}
+
+static int pfctl_online(unsigned int cpu)
+{
+	int ret;
+
+	ret = find_cache_device(cpu);
+	if (ret < 0)
+		return ret;
+
+	ret = create_pfctl_attr(cpu);
+	if (ret < 0)
+		return ret;
+
+	return 0;
+}
+
+static int pfctl_prepare_down(unsigned int cpu)
+{
+	remove_pfctl_attr(cpu);
+
+	return 0;
+}
+
+/**
+ * pfctl_register_driver - register a Hardware Prefetch Control driver
+ * @driver_data: struct pfctl_driver must contain the supported prefetcher type
+ *               and function pointers for reading and writing the hardware
+ *               prefetch register. Otherwise, this function returns an error.
+ *
+ * Note: This function must be called after the cache device is initialized
+ * because it requires access to the cache device.
+ * (e.g. Call at the late_initcall)
+ *
+ * Context: Any context.
+ * Return: 0 on success, negative error code on failure.
+ */
+int pfctl_register_driver(struct pfctl_driver *driver_data)
+{
+	int ret;
+
+	if (pdriver)
+		return -EEXIST;
+
+	if ((driver_data->supported_l1d_prefetcher == 0) &&
+	    (driver_data->supported_l2_prefetcher == 0))
+		return -EINVAL;
+
+	if (!driver_data->read_pfreg || !driver_data->write_pfreg)
+		return -EINVAL;
+
+	pdriver = driver_data;
+
+	ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "base/pfctl:online",
+				pfctl_online, pfctl_prepare_down);
+	if (ret < 0) {
+		pr_err("failed to register hotplug callbacks\n");
+		pdriver = NULL;
+		return ret;
+	}
+
+	hp_online = ret;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(pfctl_register_driver);
+
+/**
+ * pfctl_unregister_driver - unregister the Hardware Prefetch Control driver
+ * @driver_data: Used to verify that this function is called by the driver that
+ *               called pfctl_register_driver by determining if driver_data is
+ *               the same.
+ *
+ * Context: Any context.
+ * Return: nothing.
+ */
+void pfctl_unregister_driver(struct pfctl_driver *driver_data)
+{
+	if (!pdriver || (driver_data != pdriver))
+		return;
+
+	cpuhp_remove_state(hp_online);
+
+	pdriver = NULL;
+}
+EXPORT_SYMBOL_GPL(pfctl_unregister_driver);
diff --git a/include/linux/pfctl.h b/include/linux/pfctl.h
new file mode 100644
index 000000000000..607442606a95
--- /dev/null
+++ b/include/linux/pfctl.h
@@ -0,0 +1,41 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_PFCTL_H
+#define _LINUX_PFCTL_H
+
+#define PFCTL_ENABLE_VAL		1
+#define PFCTL_DISABLE_VAL		0
+#define PFCTL_STRONG_VAL		1
+#define PFCTL_WEAK_VAL			0
+#define PFCTL_DIST_AUTO_VAL		0
+
+enum pfctl_attr {
+	HWPF_ENABLE,
+	IPPF_ENABLE,
+	ACLPF_ENABLE,
+	SDPF_ENABLE,
+	SDPF_STRONG,
+	SDPF_DIST,
+};
+
+enum prefetcher {
+	HWPF	= BIT(0), /* Hardware Prefetcher */
+	IPPF	= BIT(1), /* IP Prefetcher */
+	ACLPF	= BIT(2), /* Adjacent Cache Line Prefetcher */
+	SDPF	= BIT(3), /* Stream Detect Prefetcher */
+	ANYPF	= HWPF|IPPF|ACLPF|SDPF,
+};
+
+struct pfctl_driver {
+	unsigned int supported_l1d_prefetcher;
+	unsigned int supported_l2_prefetcher;
+
+	int (*read_pfreg)(enum pfctl_attr pattr, unsigned int cpu,
+			  unsigned int level, u64 *val);
+	int (*write_pfreg)(enum pfctl_attr pattr, unsigned int cpu,
+			   unsigned int level, u64 val);
+};
+
+int pfctl_register_driver(struct pfctl_driver *driver_data);
+void pfctl_unregister_driver(struct pfctl_driver *driver_data);
+
+#endif
-- 
2.27.0



* [PATCH v2 2/8] drivers: base: Add Kconfig/Makefile to build hardware prefetch control core driver
  2022-03-11 10:19 [PATCH v2 0/8] Add hardware prefetch control driver for arm64 and x86 Kohei Tarumizu
  2022-03-11 10:19 ` [PATCH v2 1/8] drivers: base: Add hardware prefetch control core driver Kohei Tarumizu
@ 2022-03-11 10:19 ` Kohei Tarumizu
  2022-03-11 10:19 ` [PATCH v2 3/8] arm64: Add hardware prefetch control support for ARM64 Kohei Tarumizu
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 19+ messages in thread
From: Kohei Tarumizu @ 2022-03-11 10:19 UTC (permalink / raw)
  To: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
	linux-arm-kernel, linux-kernel
  Cc: tarumizu.kohei

This adds Kconfig/Makefile to build the hardware prefetch control core
driver. This also adds a MAINTAINERS entry.

Signed-off-by: Kohei Tarumizu <tarumizu.kohei@fujitsu.com>
---
 MAINTAINERS           |  6 ++++++
 drivers/base/Kconfig  | 13 +++++++++++++
 drivers/base/Makefile |  1 +
 3 files changed, 20 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 05fd080b82f3..213537cea2e2 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8452,6 +8452,12 @@ F:	include/linux/hwmon*.h
 F:	include/trace/events/hwmon*.h
 K:	(devm_)?hwmon_device_(un)?register(|_with_groups|_with_info)
 
+HARDWARE PREFETCH CONTROL DRIVERS
+M:	Kohei Tarumizu <tarumizu.kohei@fujitsu.com>
+S:	Maintained
+F:	drivers/base/pfctl.c
+F:	include/linux/pfctl.h
+
 HARDWARE RANDOM NUMBER GENERATOR CORE
 M:	Matt Mackall <mpm@selenic.com>
 M:	Herbert Xu <herbert@gondor.apana.org.au>
diff --git a/drivers/base/Kconfig b/drivers/base/Kconfig
index 6f04b831a5c0..d146604b5b3a 100644
--- a/drivers/base/Kconfig
+++ b/drivers/base/Kconfig
@@ -230,4 +230,17 @@ config GENERIC_ARCH_NUMA
 	  Enable support for generic NUMA implementation. Currently, RISC-V
 	  and ARM64 use it.
 
+config ARCH_HAS_HWPF_CONTROL
+	bool
+
+config HWPF_CONTROL
+	bool "Hardware Prefetch Control driver"
+	depends on ARCH_HAS_HWPF_CONTROL && SYSFS
+	help
+	  This driver allows user to control CPU's Hardware Prefetch behavior.
+	  If the machine supports this behavior, it provides a sysfs interface.
+
+	  See Documentation/ABI/testing/sysfs-devices-system-cpu for more
+	  information.
+
 endmenu
diff --git a/drivers/base/Makefile b/drivers/base/Makefile
index 02f7f1358e86..13f3a0ddf3d1 100644
--- a/drivers/base/Makefile
+++ b/drivers/base/Makefile
@@ -25,6 +25,7 @@ obj-$(CONFIG_DEV_COREDUMP) += devcoredump.o
 obj-$(CONFIG_GENERIC_MSI_IRQ_DOMAIN) += platform-msi.o
 obj-$(CONFIG_GENERIC_ARCH_TOPOLOGY) += arch_topology.o
 obj-$(CONFIG_GENERIC_ARCH_NUMA) += arch_numa.o
+obj-$(CONFIG_HWPF_CONTROL)	+= pfctl.o
 
 obj-y			+= test/
 
-- 
2.27.0



* [PATCH v2 3/8] arm64: Add hardware prefetch control support for ARM64
  2022-03-11 10:19 [PATCH v2 0/8] Add hardware prefetch control driver for arm64 and x86 Kohei Tarumizu
  2022-03-11 10:19 ` [PATCH v2 1/8] drivers: base: Add hardware prefetch control core driver Kohei Tarumizu
  2022-03-11 10:19 ` [PATCH v2 2/8] drivers: base: Add Kconfig/Makefile to build " Kohei Tarumizu
@ 2022-03-11 10:19 ` Kohei Tarumizu
  2022-03-30 22:11   ` Rob Herring
  2022-03-11 10:19 ` [PATCH v2 4/8] arm64: Add Kconfig/Makefile to build hardware prefetch control driver Kohei Tarumizu
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 19+ messages in thread
From: Kohei Tarumizu @ 2022-03-11 10:19 UTC (permalink / raw)
  To: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
	linux-arm-kernel, linux-kernel
  Cc: tarumizu.kohei

This adds module init/exit code, and creates the sysfs attribute files
"stream_detect_prefetcher_enable", "stream_detect_prefetcher_strong"
and "stream_detect_prefetcher_dist". At this point, this driver works
only if the part number is FUJITSU_CPU_PART_A64FX. The registers read
and written in this patch are described in the following document.

"https://github.com/fujitsu/A64FX/tree/master/doc/"
    A64FX_Specification_HPC_Extension_v1_EN.pdf

Signed-off-by: Kohei Tarumizu <tarumizu.kohei@fujitsu.com>
---
 arch/arm64/kernel/pfctl.c | 368 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 368 insertions(+)
 create mode 100644 arch/arm64/kernel/pfctl.c

diff --git a/arch/arm64/kernel/pfctl.c b/arch/arm64/kernel/pfctl.c
new file mode 100644
index 000000000000..0487c763b206
--- /dev/null
+++ b/arch/arm64/kernel/pfctl.c
@@ -0,0 +1,368 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright 2022 FUJITSU LIMITED
+ *
+ * ARM64 Hardware Prefetch Control support
+ */
+
+#include <asm/cputype.h>
+#include <linux/bitfield.h>
+#include <linux/cacheinfo.h>
+#include <linux/module.h>
+#include <linux/pfctl.h>
+#include <linux/parser.h>
+
+struct pfctl_driver arm64_pfctl_driver;
+
+/**************************************
+ * FUJITSU A64FX support
+ **************************************/
+
+/*
+ * Constants for these add the "A64FX_SDPF" prefix to the name described in
+ * section "1.3.4.2. IMP_PF_STREAM_DETECT_CTRL_EL0" of "A64FX specification".
+ * (https://github.com/fujitsu/A64FX/tree/master/doc/A64FX_Specification_HPC_Extension_v1_EN.pdf)
+ * See this document for register specification details.
+ */
+#define A64FX_SDPF_IMP_PF_STREAM_DETECT_CTRL_EL0	sys_reg(3, 3, 11, 4, 0)
+#define A64FX_SDPF_V					BIT_ULL(63)
+#define A64FX_SDPF_L1PF_DIS				BIT_ULL(59)
+#define A64FX_SDPF_L2PF_DIS				BIT_ULL(58)
+#define A64FX_SDPF_L1W					BIT_ULL(55)
+#define A64FX_SDPF_L2W					BIT_ULL(54)
+#define A64FX_SDPF_L1_DIST				GENMASK_ULL(27, 24)
+#define A64FX_SDPF_L2_DIST				GENMASK_ULL(19, 16)
+
+#define A64FX_SDPF_MIN_DIST_L1				256
+#define A64FX_SDPF_MIN_DIST_L2				1024
+
+struct a64fx_read_info {
+	enum pfctl_attr pattr;
+	u64 val;
+	unsigned int level;
+	int ret;
+};
+
+struct a64fx_write_info {
+	enum pfctl_attr pattr;
+	u64 val;
+	unsigned int level;
+	int ret;
+};
+
+static int a64fx_get_sdpf_enable(u64 reg, unsigned int level)
+{
+	u64 val;
+
+	switch (level) {
+	case 1:
+		val = FIELD_GET(A64FX_SDPF_L1PF_DIS, reg);
+		break;
+	case 2:
+		val = FIELD_GET(A64FX_SDPF_L2PF_DIS, reg);
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	if (val == 0)
+		return PFCTL_ENABLE_VAL;
+	else if (val == 1)
+		return PFCTL_DISABLE_VAL;
+	else
+		return -EINVAL;
+}
+
+static int a64fx_modify_sdpf_enable(u64 *reg, unsigned int level, u64 val)
+{
+	if (val == PFCTL_ENABLE_VAL)
+		val = 0;
+	else
+		val = 1;
+
+	switch (level) {
+	case 1:
+		*reg &= ~A64FX_SDPF_L1PF_DIS;
+		*reg |= FIELD_PREP(A64FX_SDPF_L1PF_DIS, val);
+		break;
+	case 2:
+		*reg &= ~A64FX_SDPF_L2PF_DIS;
+		*reg |= FIELD_PREP(A64FX_SDPF_L2PF_DIS, val);
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static int a64fx_get_sdpf_strong(u64 reg, unsigned int level)
+{
+	u64 val;
+
+	switch (level) {
+	case 1:
+		val = FIELD_GET(A64FX_SDPF_L1W, reg);
+		break;
+	case 2:
+		val = FIELD_GET(A64FX_SDPF_L2W, reg);
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	if (val == 0)
+		return PFCTL_STRONG_VAL;
+	else if (val == 1)
+		return PFCTL_WEAK_VAL;
+	else
+		return -EINVAL;
+}
+
+static int a64fx_modify_sdpf_strong(u64 *reg, unsigned int level, u64 val)
+{
+	if (val == PFCTL_STRONG_VAL)
+		val = 0;
+	else
+		val = 1;
+
+	switch (level) {
+	case 1:
+		*reg &= ~A64FX_SDPF_L1W;
+		*reg |= FIELD_PREP(A64FX_SDPF_L1W, val);
+		break;
+	case 2:
+		*reg &= ~A64FX_SDPF_L2W;
+		*reg |= FIELD_PREP(A64FX_SDPF_L2W, val);
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static int a64fx_get_sdpf_dist(u64 reg, unsigned int level)
+{
+	switch (level) {
+	case 1:
+		return FIELD_GET(A64FX_SDPF_L1_DIST, reg) *
+			A64FX_SDPF_MIN_DIST_L1;
+	case 2:
+		return FIELD_GET(A64FX_SDPF_L2_DIST, reg) *
+			A64FX_SDPF_MIN_DIST_L2;
+	default:
+		return -EINVAL;
+	}
+}
+
+static int a64fx_modify_sdpf_dist(u64 *reg, unsigned int level, u64 val)
+{
+	switch (level) {
+	case 1:
+		val = roundup(val, A64FX_SDPF_MIN_DIST_L1) /
+			A64FX_SDPF_MIN_DIST_L1;
+		if (!FIELD_FIT(A64FX_SDPF_L1_DIST, val))
+			return -EINVAL;
+		*reg &= ~A64FX_SDPF_L1_DIST;
+		*reg |= FIELD_PREP(A64FX_SDPF_L1_DIST, val);
+		break;
+	case 2:
+		val = roundup(val, A64FX_SDPF_MIN_DIST_L2) /
+			A64FX_SDPF_MIN_DIST_L2;
+		if (!FIELD_FIT(A64FX_SDPF_L2_DIST, val))
+			return -EINVAL;
+		*reg &= ~A64FX_SDPF_L2_DIST;
+		*reg |= FIELD_PREP(A64FX_SDPF_L2_DIST, val);
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static void a64fx_enable_sdpf_verify(u64 *reg)
+{
+	*reg &= ~A64FX_SDPF_V;
+	*reg |= FIELD_PREP(A64FX_SDPF_V, 1);
+}
+
+static int a64fx_get_sdpf_params(enum pfctl_attr pattr, u64 reg,
+				 unsigned int level, u64 *val)
+{
+	int ret;
+
+	switch (pattr) {
+	case SDPF_ENABLE:
+		ret = a64fx_get_sdpf_enable(reg, level);
+		break;
+	case SDPF_STRONG:
+		ret = a64fx_get_sdpf_strong(reg, level);
+		break;
+	case SDPF_DIST:
+		ret = a64fx_get_sdpf_dist(reg, level);
+		break;
+	default:
+		return -ENOENT;
+	}
+
+	if (ret < 0)
+		return ret;
+	*val = ret;
+
+	return 0;
+}
+
+static int a64fx_modify_pfreg_val(enum pfctl_attr pattr, u64 *reg,
+				  unsigned int level, u64 val)
+{
+	int ret;
+
+	switch (pattr) {
+	case SDPF_ENABLE:
+		ret = a64fx_modify_sdpf_enable(reg, level, val);
+		break;
+	case SDPF_STRONG:
+		ret = a64fx_modify_sdpf_strong(reg, level, val);
+		break;
+	case SDPF_DIST:
+		ret = a64fx_modify_sdpf_dist(reg, level, val);
+		break;
+	default:
+		return -ENOENT;
+	}
+
+	if (ret < 0)
+		return ret;
+
+	a64fx_enable_sdpf_verify(reg);
+
+	return 0;
+}
+
+static void _a64fx_read_pfreg(void *info)
+{
+	u64 reg;
+	struct a64fx_read_info *rinfo = info;
+
+	reg = read_sysreg_s(A64FX_SDPF_IMP_PF_STREAM_DETECT_CTRL_EL0);
+
+	rinfo->ret = a64fx_get_sdpf_params(rinfo->pattr, reg, rinfo->level,
+					   &rinfo->val);
+}
+
+static int a64fx_read_pfreg(enum pfctl_attr pattr, unsigned int cpu,
+			    unsigned int level, u64 *val)
+{
+	struct a64fx_read_info info = {
+		.level = level,
+		.pattr = pattr,
+	};
+
+	smp_call_function_single(cpu, _a64fx_read_pfreg, &info, true);
+
+	if (info.ret < 0)
+		return info.ret;
+
+	*val = info.val;
+	return 0;
+}
+
+static void _a64fx_write_pfreg(void *info)
+{
+	int ret;
+	u64 reg;
+	struct a64fx_write_info *winfo = info;
+
+	reg = read_sysreg_s(A64FX_SDPF_IMP_PF_STREAM_DETECT_CTRL_EL0);
+
+	ret = a64fx_modify_pfreg_val(winfo->pattr, &reg, winfo->level,
+				     winfo->val);
+	if (ret < 0) {
+		winfo->ret = ret;
+		return;
+	}
+
+	write_sysreg_s(reg, A64FX_SDPF_IMP_PF_STREAM_DETECT_CTRL_EL0);
+
+	winfo->ret = 0;
+}
+
+static int a64fx_write_pfreg(enum pfctl_attr pattr, unsigned int cpu,
+			     unsigned int level, u64 val)
+{
+	struct a64fx_write_info info = {
+		.level = level,
+		.pattr = pattr,
+		.val = val,
+	};
+
+	smp_call_function_single(cpu, _a64fx_write_pfreg, &info, true);
+	return info.ret;
+}
+
+/***** end of FUJITSU A64FX support *****/
+
+/*
+ * This driver returns a negative value if it does not support the Hardware
+ * Prefetch Control or if it is running on a VM guest.
+ */
+static int __init setup_pfctl_driver_params(void)
+{
+	unsigned long implementor = read_cpuid_implementor();
+	unsigned long part_number = read_cpuid_part_number();
+
+	if (!is_kernel_in_hyp_mode())
+		return -EINVAL;
+
+	switch (implementor) {
+	case ARM_CPU_IMP_FUJITSU:
+		switch (part_number) {
+		case FUJITSU_CPU_PART_A64FX:
+			/* A64FX register requires EL2 access */
+			if (!has_vhe())
+				return -EINVAL;
+
+			arm64_pfctl_driver.supported_l1d_prefetcher = SDPF;
+			arm64_pfctl_driver.supported_l2_prefetcher = SDPF;
+			arm64_pfctl_driver.read_pfreg = a64fx_read_pfreg;
+			arm64_pfctl_driver.write_pfreg = a64fx_write_pfreg;
+			break;
+		default:
+			return -ENODEV;
+		}
+		break;
+	default:
+		return -ENODEV;
+	}
+
+	return 0;
+}
+
+static int __init arm64_pfctl_init(void)
+{
+	int ret;
+
+	ret = setup_pfctl_driver_params();
+	if (ret < 0)
+		return ret;
+
+	ret = pfctl_register_driver(&arm64_pfctl_driver);
+	if (ret < 0)
+		return ret;
+
+	return 0;
+}
+
+static void __exit arm64_pfctl_exit(void)
+{
+	pfctl_unregister_driver(&arm64_pfctl_driver);
+}
+
+late_initcall(arm64_pfctl_init);
+module_exit(arm64_pfctl_exit);
+
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("FUJITSU LIMITED");
+MODULE_DESCRIPTION("ARM64 Hardware Prefetch Control Driver");
-- 
2.27.0



* [PATCH v2 4/8] arm64: Add Kconfig/Makefile to build hardware prefetch control driver
  2022-03-11 10:19 [PATCH v2 0/8] Add hardware prefetch control driver for arm64 and x86 Kohei Tarumizu
                   ` (2 preceding siblings ...)
  2022-03-11 10:19 ` [PATCH v2 3/8] arm64: Add hardware prefetch control support for ARM64 Kohei Tarumizu
@ 2022-03-11 10:19 ` Kohei Tarumizu
  2022-03-11 10:19 ` [PATCH v2 5/8] arm64: Create cache sysfs directory without ACPI PPTT for hardware prefetch control Kohei Tarumizu
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 19+ messages in thread
From: Kohei Tarumizu @ 2022-03-11 10:19 UTC (permalink / raw)
  To: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
	linux-arm-kernel, linux-kernel
  Cc: tarumizu.kohei

This adds the Kconfig and Makefile entries needed to build the hardware
prefetch control driver for arm64. This also adds a MAINTAINERS entry.

Signed-off-by: Kohei Tarumizu <tarumizu.kohei@fujitsu.com>
---
 MAINTAINERS                | 1 +
 arch/arm64/Kconfig         | 7 +++++++
 arch/arm64/kernel/Makefile | 1 +
 3 files changed, 9 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 213537cea2e2..7eb530f5b301 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8455,6 +8455,7 @@ K:	(devm_)?hwmon_device_(un)?register(|_with_groups|_with_info)
 HARDWARE PREFETCH CONTROL DRIVERS
 M:	Kohei Tarumizu <tarumizu.kohei@fujitsu.com>
 S:	Maintained
+F:	arch/arm64/kernel/pfctl.c
 F:	drivers/base/pfctl.c
 F:	include/linux/pfctl.h
 
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 09b885cc4db5..da6bf7e75df6 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -36,6 +36,7 @@ config ARM64
 	select ARCH_HAS_SET_DIRECT_MAP
 	select ARCH_HAS_SET_MEMORY
 	select ARCH_STACKWALK
+	select ARCH_HAS_HWPF_CONTROL
 	select ARCH_HAS_STRICT_KERNEL_RWX
 	select ARCH_HAS_STRICT_MODULE_RWX
 	select ARCH_HAS_SYNC_DMA_FOR_DEVICE
@@ -2027,6 +2028,12 @@ config STACKPROTECTOR_PER_TASK
 	def_bool y
 	depends on STACKPROTECTOR && CC_HAVE_STACKPROTECTOR_SYSREG
 
+config ARM64_HWPF_CONTROL
+	tristate "ARM64 Hardware Prefetch Control support"
+	depends on HWPF_CONTROL
+	help
+	  This adds Hardware Prefetch driver control support for ARM64.
+
 endmenu
 
 menu "Boot options"
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 88b3e2a21408..d5eb1dc6bfa6 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -73,6 +73,7 @@ obj-$(CONFIG_ARM64_PTR_AUTH)		+= pointer_auth.o
 obj-$(CONFIG_ARM64_MTE)			+= mte.o
 obj-y					+= vdso-wrap.o
 obj-$(CONFIG_COMPAT_VDSO)		+= vdso32-wrap.o
+obj-$(CONFIG_ARM64_HWPF_CONTROL)	+= pfctl.o
 
 obj-y					+= probes/
 head-y					:= head.o
-- 
2.27.0



* [PATCH v2 5/8] arm64: Create cache sysfs directory without ACPI PPTT for hardware prefetch control
  2022-03-11 10:19 [PATCH v2 0/8] Add hardware prefetch control driver for arm64 and x86 Kohei Tarumizu
                   ` (3 preceding siblings ...)
  2022-03-11 10:19 ` [PATCH v2 4/8] arm64: Add Kconfig/Makefile to build hardware prefetch control driver Kohei Tarumizu
@ 2022-03-11 10:19 ` Kohei Tarumizu
  2022-03-30 22:14   ` Rob Herring
  2022-03-11 10:19 ` [PATCH v2 6/8] x86: Add hardware prefetch control support for x86 Kohei Tarumizu
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 19+ messages in thread
From: Kohei Tarumizu @ 2022-03-11 10:19 UTC (permalink / raw)
  To: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
	linux-arm-kernel, linux-kernel
  Cc: tarumizu.kohei

This patch creates the cache sysfs directory even without an ACPI PPTT
if CONFIG_HWPF_CONTROL is enabled.

The hardware prefetch control driver needs the cache sysfs directory and
cache level/type information. On ARM processors, this information can be
obtained from registers even without a PPTT. Therefore, set
cpu_map_populated to true so that the cache sysfs directory is created
when the machine doesn't have a PPTT.

Signed-off-by: Kohei Tarumizu <tarumizu.kohei@fujitsu.com>
---
 arch/arm64/kernel/cacheinfo.c | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)

diff --git a/arch/arm64/kernel/cacheinfo.c b/arch/arm64/kernel/cacheinfo.c
index 587543c6c51c..039ec32d0b3d 100644
--- a/arch/arm64/kernel/cacheinfo.c
+++ b/arch/arm64/kernel/cacheinfo.c
@@ -43,6 +43,21 @@ static void ci_leaf_init(struct cacheinfo *this_leaf,
 	this_leaf->type = type;
 }
 
+#if defined(CONFIG_HWPF_CONTROL)
+static bool acpi_has_pptt(void)
+{
+	struct acpi_table_header *table;
+	acpi_status status;
+
+	status = acpi_get_table(ACPI_SIG_PPTT, 0, &table);
+	if (ACPI_FAILURE(status))
+		return false;
+
+	acpi_put_table(table);
+	return true;
+}
+#endif
+
 int init_cache_level(unsigned int cpu)
 {
 	unsigned int ctype, level, leaves, fw_level;
@@ -95,5 +110,19 @@ int populate_cache_leaves(unsigned int cpu)
 			ci_leaf_init(this_leaf++, type, level);
 		}
 	}
+
+#if defined(CONFIG_HWPF_CONTROL)
+	/*
+	 * Hardware prefetch functions need the cache sysfs directory and cache
+	 * level/type information. On ARM processors, this information can be
+	 * obtained from registers even without a PPTT. Therefore, set
+	 * cpu_map_populated to true so that the cache sysfs directory is
+	 * created when the machine doesn't have a PPTT.
+	 */
+	if (!acpi_disabled)
+		if (!acpi_has_pptt())
+			this_cpu_ci->cpu_map_populated = true;
+#endif
+
 	return 0;
 }
-- 
2.27.0



* [PATCH v2 6/8] x86: Add hardware prefetch control support for x86
  2022-03-11 10:19 [PATCH v2 0/8] Add hardware prefetch control driver for arm64 and x86 Kohei Tarumizu
                   ` (4 preceding siblings ...)
  2022-03-11 10:19 ` [PATCH v2 5/8] arm64: Create cache sysfs directory without ACPI PPTT for hardware prefetch control Kohei Tarumizu
@ 2022-03-11 10:19 ` Kohei Tarumizu
  2022-03-14 21:05   ` Dave Hansen
  2022-03-11 10:19 ` [PATCH v2 7/8] x86: Add Kconfig/Makefile to build hardware prefetch control driver Kohei Tarumizu
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 19+ messages in thread
From: Kohei Tarumizu @ 2022-03-11 10:19 UTC (permalink / raw)
  To: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
	linux-arm-kernel, linux-kernel
  Cc: tarumizu.kohei

This adds module init/exit code and creates the sysfs attribute files
"hardware_prefetcher_enable", "ip_prefetcher_enable" and
"adjacent_cache_line_prefetcher_enable" for x86. At this point the
driver works only if the model is INTEL_FAM6_BROADWELL_X.

To support a new model with the same register specification as
INTEL_FAM6_BROADWELL_X, add the model to the broadwell_cpu_ids[] array.

The details of the registers to be read and written in this patch are
described below:

"https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html"
    Volume 4

Signed-off-by: Kohei Tarumizu <tarumizu.kohei@fujitsu.com>
---
 arch/x86/kernel/cpu/pfctl.c | 314 ++++++++++++++++++++++++++++++++++++
 1 file changed, 314 insertions(+)
 create mode 100644 arch/x86/kernel/cpu/pfctl.c

diff --git a/arch/x86/kernel/cpu/pfctl.c b/arch/x86/kernel/cpu/pfctl.c
new file mode 100644
index 000000000000..be2dce644808
--- /dev/null
+++ b/arch/x86/kernel/cpu/pfctl.c
@@ -0,0 +1,314 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright 2022 FUJITSU LIMITED
+ *
+ * x86 Hardware Prefetch Control support
+ */
+
+#include <linux/bitfield.h>
+#include <linux/cacheinfo.h>
+#include <linux/pfctl.h>
+#include <linux/module.h>
+#include <linux/slab.h>
+#include <asm/cpu_device_id.h>
+#include <asm/intel-family.h>
+#include <asm/msr.h>
+
+struct pfctl_driver x86_pfctl_driver;
+
+/**************************************
+ * Intel BROADWELL support
+ **************************************/
+
+/*
+ * The register specification for each bit of Intel BROADWELL is as
+ * follows:
+ *
+ * [0]    L2 Hardware Prefetcher Disable (R/W)
+ * [1]    L2 Adjacent Cache Line Prefetcher Disable (R/W)
+ * [2]    DCU Hardware Prefetcher Disable (R/W)
+ * [3]    DCU IP Prefetcher Disable (R/W)
+ * [63:4] Reserved
+ *
+ * See "Intel 64 and IA-32 Architectures Software Developer's Manual"
+ * (https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html)
+ * for register specification details.
+ */
+#define BROADWELL_L2_HWPF_FIELD		BIT_ULL(0)
+#define BROADWELL_L2_ACLPF_FIELD	BIT_ULL(1)
+#define BROADWELL_DCU_HWPF_FIELD	BIT_ULL(2)
+#define BROADWELL_DCU_IPPF_FIELD	BIT_ULL(3)
+
+static int broadwell_get_hwpf_enable(u64 reg, unsigned int level)
+{
+	u64 val;
+
+	switch (level) {
+	case 1:
+		val = FIELD_GET(BROADWELL_DCU_HWPF_FIELD, reg);
+		break;
+	case 2:
+		val = FIELD_GET(BROADWELL_L2_HWPF_FIELD, reg);
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	if (val == 0)
+		return PFCTL_ENABLE_VAL;
+	else if (val == 1)
+		return PFCTL_DISABLE_VAL;
+	else
+		return -EINVAL;
+}
+
+static int broadwell_modify_hwpf_enable(u64 *reg, unsigned int level, u64 val)
+{
+	if (val == PFCTL_ENABLE_VAL)
+		val = 0;
+	else
+		val = 1;
+
+	switch (level) {
+	case 1:
+		*reg &= ~BROADWELL_DCU_HWPF_FIELD;
+		*reg |= FIELD_PREP(BROADWELL_DCU_HWPF_FIELD, val);
+		break;
+	case 2:
+		*reg &= ~BROADWELL_L2_HWPF_FIELD;
+		*reg |= FIELD_PREP(BROADWELL_L2_HWPF_FIELD, val);
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static int broadwell_get_ippf_enable(u64 reg, unsigned int level)
+{
+	u64 val;
+
+	switch (level) {
+	case 1:
+		val = FIELD_GET(BROADWELL_DCU_IPPF_FIELD, reg);
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	if (val == 0)
+		return PFCTL_ENABLE_VAL;
+	else if (val == 1)
+		return PFCTL_DISABLE_VAL;
+	else
+		return -EINVAL;
+}
+
+static int broadwell_modify_ippf_enable(u64 *reg, unsigned int level, u64 val)
+{
+	if (val == PFCTL_ENABLE_VAL)
+		val = 0;
+	else
+		val = 1;
+
+	switch (level) {
+	case 1:
+		*reg &= ~BROADWELL_DCU_IPPF_FIELD;
+		*reg |= FIELD_PREP(BROADWELL_DCU_IPPF_FIELD, val);
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static int broadwell_get_aclpf_enable(u64 reg, unsigned int level)
+{
+	u64 val;
+
+	switch (level) {
+	case 2:
+		val = FIELD_GET(BROADWELL_L2_ACLPF_FIELD, reg);
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	if (val == 0)
+		return PFCTL_ENABLE_VAL;
+	else if (val == 1)
+		return PFCTL_DISABLE_VAL;
+	else
+		return -EINVAL;
+}
+
+static int broadwell_modify_aclpf_enable(u64 *reg, unsigned int level, u64 val)
+{
+	if (val == PFCTL_ENABLE_VAL)
+		val = 0;
+	else
+		val = 1;
+
+	switch (level) {
+	case 2:
+		*reg &= ~BROADWELL_L2_ACLPF_FIELD;
+		*reg |= FIELD_PREP(BROADWELL_L2_ACLPF_FIELD, val);
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static int broadwell_get_pfctl_params(enum pfctl_attr pattr, u64 reg,
+				      unsigned int level, u64 *val)
+{
+	int ret;
+
+	switch (pattr) {
+	case HWPF_ENABLE:
+		ret = broadwell_get_hwpf_enable(reg, level);
+		break;
+	case IPPF_ENABLE:
+		ret = broadwell_get_ippf_enable(reg, level);
+		break;
+	case ACLPF_ENABLE:
+		ret = broadwell_get_aclpf_enable(reg, level);
+		break;
+	default:
+		return -ENOENT;
+	}
+
+	if (ret < 0)
+		return ret;
+	*val = ret;
+
+	return 0;
+}
+
+static int broadwell_modify_pfreg(enum pfctl_attr pattr, u64 *reg,
+				  unsigned int level, u64 val)
+{
+	int ret;
+
+	switch (pattr) {
+	case HWPF_ENABLE:
+		ret = broadwell_modify_hwpf_enable(reg, level, val);
+		break;
+	case IPPF_ENABLE:
+		ret = broadwell_modify_ippf_enable(reg, level, val);
+		break;
+	case ACLPF_ENABLE:
+		ret = broadwell_modify_aclpf_enable(reg, level, val);
+		break;
+	default:
+		return -ENOENT;
+	}
+
+	if (ret < 0)
+		return ret;
+
+	return 0;
+}
+
+static int broadwell_read_pfreg(enum pfctl_attr pattr, unsigned int cpu,
+				unsigned int level, u64 *val)
+{
+	int ret;
+	u64 reg;
+
+	ret = rdmsrl_on_cpu(cpu, MSR_MISC_FEATURE_CONTROL, &reg);
+	if (ret)
+		return ret;
+
+	ret = broadwell_get_pfctl_params(pattr, reg, level, val);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+static int broadwell_write_pfreg(enum pfctl_attr pattr, unsigned int cpu,
+				 unsigned int level, u64 val)
+{
+	int ret;
+	u64 reg;
+
+	ret = rdmsrl_on_cpu(cpu, MSR_MISC_FEATURE_CONTROL, &reg);
+	if (ret)
+		return ret;
+
+	ret = broadwell_modify_pfreg(pattr, &reg, level, val);
+	if (ret < 0)
+		return ret;
+
+	ret = wrmsrl_on_cpu(cpu, MSR_MISC_FEATURE_CONTROL, reg);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+/*
+ * In addition to BROADWELL_X, NEHALEM and others have the same register
+ * specification as that represented by BROADWELL_XXX_FIELD.
+ * To add support for these processors, add the new target model
+ * here.
+ */
+static const struct x86_cpu_id broadwell_cpu_ids[] = {
+	X86_MATCH_INTEL_FAM6_MODEL(BROADWELL_X, NULL),
+	{}
+};
+
+/***** end of Intel BROADWELL support *****/
+
+/*
+ * This driver returns a negative value if it does not support the Hardware
+ * Prefetch Control or if it is running on a VM guest.
+ */
+static int __init setup_pfctl_driver_params(void)
+{
+	if (boot_cpu_has(X86_FEATURE_HYPERVISOR))
+		return -EINVAL;
+
+	if (x86_match_cpu(broadwell_cpu_ids)) {
+		x86_pfctl_driver.supported_l1d_prefetcher = HWPF|IPPF;
+		x86_pfctl_driver.supported_l2_prefetcher = HWPF|ACLPF;
+		x86_pfctl_driver.read_pfreg = broadwell_read_pfreg;
+		x86_pfctl_driver.write_pfreg = broadwell_write_pfreg;
+	} else {
+		return -ENODEV;
+	}
+
+	return 0;
+}
+
+static int __init x86_pfctl_init(void)
+{
+	int ret;
+
+	ret = setup_pfctl_driver_params();
+	if (ret < 0)
+		return ret;
+
+	ret = pfctl_register_driver(&x86_pfctl_driver);
+	if (ret < 0)
+		return ret;
+
+	return 0;
+}
+
+static void __exit x86_pfctl_exit(void)
+{
+	pfctl_unregister_driver(&x86_pfctl_driver);
+}
+
+late_initcall(x86_pfctl_init);
+module_exit(x86_pfctl_exit);
+
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("FUJITSU LIMITED");
+MODULE_DESCRIPTION("x86 Hardware Prefetch Control Driver");
-- 
2.27.0



* [PATCH v2 7/8] x86: Add Kconfig/Makefile to build hardware prefetch control driver
  2022-03-11 10:19 [PATCH v2 0/8] Add hardware prefetch control driver for arm64 and x86 Kohei Tarumizu
                   ` (5 preceding siblings ...)
  2022-03-11 10:19 ` [PATCH v2 6/8] x86: Add hardware prefetch control support for x86 Kohei Tarumizu
@ 2022-03-11 10:19 ` Kohei Tarumizu
  2022-03-11 10:19 ` [PATCH v2 8/8] docs: ABI: Add sysfs documentation interface of " Kohei Tarumizu
  2022-03-14 19:19 ` [PATCH v2 0/8] Add hardware prefetch control driver for arm64 and x86 Dave Hansen
  8 siblings, 0 replies; 19+ messages in thread
From: Kohei Tarumizu @ 2022-03-11 10:19 UTC (permalink / raw)
  To: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
	linux-arm-kernel, linux-kernel
  Cc: tarumizu.kohei

This adds the Kconfig and Makefile entries needed to build the hardware
prefetch control driver for x86. This also adds a MAINTAINERS entry.

Signed-off-by: Kohei Tarumizu <tarumizu.kohei@fujitsu.com>
---
 MAINTAINERS                  | 1 +
 arch/x86/Kconfig             | 7 +++++++
 arch/x86/kernel/cpu/Makefile | 2 ++
 3 files changed, 10 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 7eb530f5b301..1d2b4ba82500 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8456,6 +8456,7 @@ HARDWARE PREFETCH CONTROL DRIVERS
 M:	Kohei Tarumizu <tarumizu.kohei@fujitsu.com>
 S:	Maintained
 F:	arch/arm64/kernel/pfctl.c
+F:	arch/x86/kernel/pfctl.
 F:	drivers/base/pfctl.c
 F:	include/linux/pfctl.h
 
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 9f5bd41bf660..65235d25b6f1 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -26,6 +26,7 @@ config X86_64
 	depends on 64BIT
 	# Options that are inherently 64-bit kernel only:
 	select ARCH_HAS_GIGANTIC_PAGE
+	select ARCH_HAS_HWPF_CONTROL
 	select ARCH_SUPPORTS_INT128 if CC_HAS_INT128
 	select ARCH_USE_CMPXCHG_LOCKREF
 	select HAVE_ARCH_SOFT_DIRTY
@@ -1378,6 +1379,12 @@ config X86_CPUID
 	  with major 203 and minors 0 to 31 for /dev/cpu/0/cpuid to
 	  /dev/cpu/31/cpuid.
 
+config X86_HWPF_CONTROL
+	tristate "x86 Hardware Prefetch Control support"
+	depends on HWPF_CONTROL
+	help
+	  This adds Hardware Prefetch driver control support for X86.
+
 choice
 	prompt "High Memory Support"
 	default HIGHMEM4G
diff --git a/arch/x86/kernel/cpu/Makefile b/arch/x86/kernel/cpu/Makefile
index 9661e3e802be..aec62a6b37d2 100644
--- a/arch/x86/kernel/cpu/Makefile
+++ b/arch/x86/kernel/cpu/Makefile
@@ -56,6 +56,8 @@ obj-$(CONFIG_X86_LOCAL_APIC)		+= perfctr-watchdog.o
 obj-$(CONFIG_HYPERVISOR_GUEST)		+= vmware.o hypervisor.o mshyperv.o
 obj-$(CONFIG_ACRN_GUEST)		+= acrn.o
 
+obj-$(CONFIG_X86_HWPF_CONTROL)		+= pfctl.o
+
 ifdef CONFIG_X86_FEATURE_NAMES
 quiet_cmd_mkcapflags = MKCAP   $@
       cmd_mkcapflags = $(CONFIG_SHELL) $(srctree)/$(src)/mkcapflags.sh $@ $^
-- 
2.27.0



* [PATCH v2 8/8] docs: ABI: Add sysfs documentation interface of hardware prefetch control driver
  2022-03-11 10:19 [PATCH v2 0/8] Add hardware prefetch control driver for arm64 and x86 Kohei Tarumizu
                   ` (6 preceding siblings ...)
  2022-03-11 10:19 ` [PATCH v2 7/8] x86: Add Kconfig/Makefile to build hardware prefetch control driver Kohei Tarumizu
@ 2022-03-11 10:19 ` Kohei Tarumizu
  2022-03-14 16:39   ` Jonathan Cameron
  2022-03-14 19:19 ` [PATCH v2 0/8] Add hardware prefetch control driver for arm64 and x86 Dave Hansen
  8 siblings, 1 reply; 19+ messages in thread
From: Kohei Tarumizu @ 2022-03-11 10:19 UTC (permalink / raw)
  To: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
	linux-arm-kernel, linux-kernel
  Cc: tarumizu.kohei

This describes the sysfs interface implemented on the hardware prefetch
control driver.

Signed-off-by: Kohei Tarumizu <tarumizu.kohei@fujitsu.com>
---
 .../ABI/testing/sysfs-devices-system-cpu      | 89 +++++++++++++++++++
 1 file changed, 89 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
index 61f5676a7429..c1f6aa1322da 100644
--- a/Documentation/ABI/testing/sysfs-devices-system-cpu
+++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
@@ -681,3 +681,92 @@ Description:
 		(RO) the list of CPUs that are isolated and don't
 		participate in load balancing. These CPUs are set by
 		boot parameter "isolcpus=".
+
+What:		/sys/devices/system/cpu/cpu*/cache/index[0,2]/prefetch_control
+		/sys/devices/system/cpu/cpu*/cache/index[0,2]/prefetch_control/hardware_prefetcher_enable
+		/sys/devices/system/cpu/cpu*/cache/index[0,2]/prefetch_control/ip_prefetcher_enable
+		/sys/devices/system/cpu/cpu*/cache/index[0,2]/prefetch_control/adjacent_cache_line_prefetcher_enable
+		/sys/devices/system/cpu/cpu*/cache/index[0,2]/prefetch_control/stream_detect_prefetcher_enable
+		/sys/devices/system/cpu/cpu*/cache/index[0,2]/prefetch_control/stream_detect_prefetcher_strong
+		/sys/devices/system/cpu/cpu*/cache/index[0,2]/prefetch_control/stream_detect_prefetcher_dist
+Date:		March 2022
+Contact:	Linux kernel mailing list <linux-kernel@vger.kernel.org>
+Description:	Parameters for CPU's hardware prefetch control
+
+		This sysfs interface provides Hardware Prefetch control
+		attribute file by using implementation defined registers.
+		These files exist in every CPU's cache/index[0,2] directory,
+		and these affect the cache level of the parent index directory.
+		Each attribute file exists depending on kind of processor and
+		cache level.
+
+		*_prefetcher_enable:
+		    (RW) control this prefetcher's enablement state.
+		    Read returns current status:
+			0: this prefetcher is disabled
+			1: this prefetcher is enabled
+
+		stream_detect_prefetcher_strong:
+		    (RW) control prefetcher operation's strongness state.
+		    Strong prefetch operation is surely executed, if there is
+		    no corresponding data in cache.
+		    Weak prefetch operation allows the hardware not to execute
+		    operation depending on hardware state.
+
+		    Read returns current status:
+			0: prefetch operation is weak
+			1: prefetch operation is strong
+
+		stream_detect_prefetcher_dist:
+		    (RW) control the prefetcher distance value.
+		    Read returns the current prefetcher distance value in bytes
+		    or the string "auto".
+
+		    Write either a value in bytes or the string "auto" to this
+		    parameter. If you write a value that is not a multiple of a
+		    specific value, it is rounded up.
+
+		    The value 0 and the string "auto" are the same and have
+		    a special meaning. This means that instead of setting
+		    dist to a user-specified value, it operates using
+		    hardware-specific values.
+
+		- Supported processors
+
+		    This sysfs interface is available on several processors, x86
+		    and ARM64. Currently, the following processors are supported:
+
+			- x86 processor
+			    - INTEL_FAM6_BROADWELL_X
+
+			- ARM64 processor
+			    - FUJITSU_CPU_PART_A64FX
+
+		- Attribute mapping
+
+		    Some Intel processors have MSR 0x1a4. This register has several
+		    specifications depending on the model. This interface provides
+		    a one-to-one attribute file to control all the tunable
+		    parameters the CPU provides of the following.
+
+			- "* Hardware Prefetcher Disable (R/W)"
+			    corresponds to the "hardware_prefetcher_enable"
+
+			- "* Adjacent Cache Line Prefetcher Disable (R/W)"
+			    corresponds to the "adjacent_cache_line_prefetcher_enable"
+
+			- "* IP Prefetcher Disable (R/W)"
+			    corresponds to the "ip_prefetcher_enable"
+
+		    The processor A64FX has register IMP_PF_STREAM_DETECT_CTRL_EL0
+		    for Hardware Prefetch Control. This attribute maps each
+		    specification to the following.
+
+			- "L*PF_DIS": enablement of hardware prefetcher
+			    corresponds to the "stream_detect_prefetcher_enable"
+
+			- "L*W": strongness of hardware prefetcher
+			    corresponds to the "stream_detect_prefetcher_strong"
+
+			- "L*_DIST": distance of hardware prefetcher
+			    corresponds to the "stream_detect_prefetcher_dist"
-- 
2.27.0



* Re: [PATCH v2 8/8] docs: ABI: Add sysfs documentation interface of hardware prefetch control driver
  2022-03-11 10:19 ` [PATCH v2 8/8] docs: ABI: Add sysfs documentation interface of " Kohei Tarumizu
@ 2022-03-14 16:39   ` Jonathan Cameron
  2022-03-16 12:56     ` tarumizu.kohei
  0 siblings, 1 reply; 19+ messages in thread
From: Jonathan Cameron @ 2022-03-14 16:39 UTC (permalink / raw)
  To: Kohei Tarumizu
  Cc: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
	linux-arm-kernel, linux-kernel

On Fri, 11 Mar 2022 19:19:40 +0900
Kohei Tarumizu <tarumizu.kohei@fujitsu.com> wrote:

> This describes the sysfs interface implemented on the hardware prefetch
> control driver.
> 
> Signed-off-by: Kohei Tarumizu <tarumizu.kohei@fujitsu.com>
Hi,

I'm going to review this with only a fairly basic knowledge of prefetchers
and with no particular design in mind (though I'll point at ARM docs
because they are generally good and easy to find ;) Key thing on an ABI like this
is to maintain flexibility for other implementations.

It makes me a bit nervous to see an interface for something like this
being defined with only a couple of implementations.  There are others
with public documentation such as the ARM N2.

https://developer.arm.com/documentation/102099/0000/The-Neoverse-N2--core

As is clear from below, not a lot of this is shared between CPUs so far
so it might make more sense to document them separately?

> ---
>  .../ABI/testing/sysfs-devices-system-cpu      | 89 +++++++++++++++++++
>  1 file changed, 89 insertions(+)
> 
> diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
> index 61f5676a7429..c1f6aa1322da 100644
> --- a/Documentation/ABI/testing/sysfs-devices-system-cpu
> +++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
> @@ -681,3 +681,92 @@ Description:
>  		(RO) the list of CPUs that are isolated and don't
>  		participate in load balancing. These CPUs are set by
>  		boot parameter "isolcpus=".
> +
> +What:		/sys/devices/system/cpu/cpu*/cache/index[0,2]/prefetch_control
> +		/sys/devices/system/cpu/cpu*/cache/index[0,2]/prefetch_control/hardware_prefetcher_enable
> +		/sys/devices/system/cpu/cpu*/cache/index[0,2]/prefetch_control/ip_prefetcher_enable
> +		/sys/devices/system/cpu/cpu*/cache/index[0,2]/prefetch_control/adjacent_cache_line_prefetcher_enable
> +		/sys/devices/system/cpu/cpu*/cache/index[0,2]/prefetch_control/stream_detect_prefetcher_enable
> +		/sys/devices/system/cpu/cpu*/cache/index[0,2]/prefetch_control/stream_detect_prefetcher_strong
> +		/sys/devices/system/cpu/cpu*/cache/index[0,2]/prefetch_control/stream_detect_prefetcher_dist

I'd be tempted to have this as multiple blocks.  A lot of the documentation is not generic
to all of them and that approach tends to give documentation files that are easier to add
to later.

> +Date:		March 2022
> +Contact:	Linux kernel mailing list <linux-kernel@vger.kernel.org>
> +Description:	Parameters for CPU's hardware prefetch control
> +
> +		This sysfs interface provides Hardware Prefetch control
> +		attribute file by using implementation defined registers.

No need to say how this is implemented.  If someone wants to do it with a mailbox
for a particular CPU that would be fine at the ABI level.

> +		These files exist in every CPU's cache/index[0,2] directory,

Can see where they are from above.  No need to repeat that bit.

> +		and these affect the cache level of the parent index directory.
> +		Each attribute file exists depending on kind of processor and
> +		cache level.

Perhaps
"Attributes are only present if the particular cache implements the relevant
 prefetcher controls".  Or maybe just "All controls are optional".

> +
> +		*_prefetcher_enable:
> +		    (RW) control this prefetcher's enablement state.
> +		    Read returns current status:
> +			0: this prefetcher is disabled
> +			1: this prefetcher is enabled
> +
> +		stream_detect_prefetcher_strong:
> +		    (RW) control prefetcher operation's strongness state.
> +		    Strong prefetch operation is surely executed, if there is
> +		    no corresponding data in cache.
> +		    Weak prefetch operation allows the hardware not to execute
> +		    operation depending on hardware state.
> +
> +		    Read returns current status:
> +			0: prefetch operation is weak
> +			1: prefetch operation is strong

How likely is it that other prefetcher implementations will allow more
than two levels for this?  Can we define this ABI more broadly to allow
that?  Assuming such a scale might exist, this needs renaming to
stream_detect_prefetcher_strength (as no longer on or off)

Easiest way to do that would probably be to use a separate
stream_detect_prefetcher_strength_available that lists
possible values (or min, step, max if that makes more sense).
Here,

0 1

With my very limited knowledge of the details, a multilevel approach
would map better to controls like RPF_MODE in the N2
IMP_CPUECTLR_EL1 register which has 4 levels for example (though
I have no input on whether those levels could map to 'strength').
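
To make the suggestion concrete, here is a minimal userspace sketch. It
assumes a hypothetical stream_detect_prefetcher_strength_available
attribute whose content is a space-separated list of allowed values such
as "0 1". The attribute name follows the proposal above, the helper name
strength_is_available() is made up for illustration, and nothing here
exists in the posted series.

```c
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical helper: return true if 'want' appears in 'list', the
 * space-separated content of an assumed *_strength_available attribute
 * (for example "0 1" today, "0 1 2 3" for a four-level prefetcher).
 */
static bool strength_is_available(const char *list, long want)
{
	const char *p = list;
	char *end;

	for (;;) {
		long v = strtol(p, &end, 10);

		if (end == p)		/* no more numbers to parse */
			return false;
		if (v == want)
			return true;
		p = end;
	}
}
```

A tool would read the list once and check the requested level against it
before writing, so the same code keeps working if a later CPU exposes
more than two levels.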

> +
> +		stream_detect_prefetcher_dist:
> +		    (RW) control the prefetcher distance value.
> +		    Read returns the current prefetcher distance value in bytes
> +		    or the string "auto".
> +
> +		    Write either a value in bytes or the string "auto" to this
> +		    parameter. If you write a value that is not a multiple of a
> +		    specific value, it is rounded up.
> +
> +		    The value 0 and the string "auto" are the same and have
> +		    a special meaning. This means that instead of setting
> +		    dist to a user-specified value, it operates using
> +		    hardware-specific values.

Having two possible ways of representing 'auto' seems likely to cause
testing complexity for no particular benefit. I'd just not allow
one of them.

> +
> +		- Supported processors
> +
> +		    This sysfs interface is available on several processors, x86
> +		    and ARM64. Currently, the following processors are supported:

I would not list supported processors in here.  It will get out of sync if this
becomes popular and it should be easy to see if it is supported by whether the
sysfs attribute exists or not.
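
As a rough illustration of that point, userspace could probe for support by testing whether the attribute file exists. The sketch below assumes the v2 path layout from the cover letter, which may change in later versions; pfctl_attr_present() is a hypothetical helper name, not part of the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: build the path of one prefetch control attribute
 * and report whether it exists. The directory layout follows the v2
 * cover letter and is an assumption here.
 * Returns 1 if present, 0 if absent, -1 if the path does not fit.
 */
static int pfctl_attr_present(unsigned int cpu, unsigned int index,
			      const char *attr)
{
	char path[256];
	int n;

	assert(attr != NULL);

	n = snprintf(path, sizeof(path),
		     "/sys/devices/system/cpu/cpu%u/cache/index%u/prefetch_control/%s",
		     cpu, index, attr);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	/* access(2) with F_OK only checks that the file exists. */
	return access(path, F_OK) == 0 ? 1 : 0;
}
```

For example, pfctl_attr_present(0, 2, "stream_detect_prefetcher_dist") would tell a tuning tool whether the distance control is available for that cache on CPU 0, without consulting any list of processor models.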

> +
> +			- x86 processor
> +			    - INTEL_FAM6_BROADWELL_X
> +
> +			- ARM64 processor
> +			    - FUJITSU_CPU_PART_A64FX
> +
> +		- Attribute mapping
> +
> +		    Some Intel processors have MSR 0x1a4. This register has several
> +		    specifications depending on the model. This interface provides
> +		    a one-to-one attribute file to control all the tunable
> +		    parameters the CPU provides of the following.
> +
> +			- "* Hardware Prefetcher Disable (R/W)"
> +			    corresponds to the "hardware_prefetcher_enable"
> +
> +			- "* Adjacent Cache Line Prefetcher Disable (R/W)"
> +			    corresponds to the "adjacent_cache_line_prefetcher_enable"
> +
> +			- "* IP Prefetcher Disable (R/W)"
> +			    corresponds to the "ip_prefetcher_enable"

I'm not sure on whether this should be here or not.  It seems like a path
to some very long documentation once 10+ processor families are supported.
However, there may be no better place to put this information.

Jonathan


> +
> +		    The processor A64FX has register IMP_PF_STREAM_DETECT_CTRL_EL0
> +		    for Hardware Prefetch Control. This attribute maps each
> +		    specification to the following.
> +
> +			- "L*PF_DIS": enablement of hardware prefetcher
> +			    corresponds to the "stream_detect_prefetcher_enable"
> +
> +			- "L*W": strongness of hardware prefetcher
> +			    corresponds to the "stream_detect_prefetcher_strong"
> +
> +			- "L*_DIST": distance of hardware prefetcher
> +			    corresponds to the "stream_detect_prefetcher_dist"


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 0/8] Add hardware prefetch control driver for arm64 and x86
  2022-03-11 10:19 [PATCH v2 0/8] Add hardware prefetch control driver for arm64 and x86 Kohei Tarumizu
                   ` (7 preceding siblings ...)
  2022-03-11 10:19 ` [PATCH v2 8/8] docs: ABI: Add sysfs documentation interface of " Kohei Tarumizu
@ 2022-03-14 19:19 ` Dave Hansen
  2022-03-18  6:34   ` tarumizu.kohei
  8 siblings, 1 reply; 19+ messages in thread
From: Dave Hansen @ 2022-03-14 19:19 UTC (permalink / raw)
  To: Kohei Tarumizu, catalin.marinas, will, tglx, mingo, bp,
	dave.hansen, x86, hpa, linux-arm-kernel, linux-kernel

On 3/11/22 02:19, Kohei Tarumizu wrote:
> The advantage of using this is improved performance. As an example of
> performance improvements, the results of running the Stream benchmark
> on the A64FX are described in section [Merit].

I take it that there are users out there today that are sufficiently
motivated by the increased performance that they just do "wrmsr 0x1a4
0x1234".

You talked about this in the "[Merit]" section.  But, that's a _little_
unconvincing.  I don't doubt that there is *a* workload out there that
can benefit from hardware prefetcher tweaks.

Do we really expect end users to run their workloads and tweak these
values to find something optimal for them?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 6/8] x86: Add hardware prefetch control support for x86
  2022-03-11 10:19 ` [PATCH v2 6/8] x86: Add hardware prefetch control support for x86 Kohei Tarumizu
@ 2022-03-14 21:05   ` Dave Hansen
  2022-03-18  6:41     ` tarumizu.kohei
  0 siblings, 1 reply; 19+ messages in thread
From: Dave Hansen @ 2022-03-14 21:05 UTC (permalink / raw)
  To: Kohei Tarumizu, catalin.marinas, will, tglx, mingo, bp,
	dave.hansen, x86, hpa, linux-arm-kernel, linux-kernel

On 3/11/22 02:19, Kohei Tarumizu wrote:
> +static int broadwell_write_pfreg(enum pfctl_attr pattr, unsigned int cpu,
> +				 unsigned int level, u64 val)
> +{
> +	int ret;
> +	u64 reg;
> +
> +	ret = rdmsrl_on_cpu(cpu, MSR_MISC_FEATURE_CONTROL, &reg);
> +	if (ret)
> +		return ret;
> +
> +	ret = broadwell_modify_pfreg(pattr, &reg, level, val);
> +	if (ret < 0)
> +		return ret;
> +
> +	ret = wrmsrl_on_cpu(cpu, MSR_MISC_FEATURE_CONTROL, reg);
> +	if (ret)
> +		return ret;
> +
> +	return 0;
> +}

This needs to integrate _somehow_ with the pseudo_lock.c code.  Right
now, I suspect that code would just overwrite any MSR changes made by
this code.
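
The suspected interaction can be modelled in a few lines. This is a userspace simulation under assumptions, not the pseudo_lock.c code: fake_msr stands in for MSR 0x1a4, and both function names are hypothetical. It only contrasts "write zero afterwards" with "save and restore".

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for MSR 0x1a4 on one CPU; real code would use rdmsrl/wrmsrl. */
static uint64_t fake_msr;

/* Suspected pattern: disable prefetchers, then write 0 afterwards. */
static void critical_section_clobber(uint64_t disable_bits)
{
	assert(disable_bits != 0);
	fake_msr = disable_bits;
	/* ... work that needs the prefetchers off ... */
	fake_msr = 0;		/* user-selected bits are lost here */
}

/* Alternative: save the current value and restore it afterwards. */
static void critical_section_restore(uint64_t disable_bits)
{
	uint64_t saved = fake_msr;

	fake_msr = saved | disable_bits;
	/* ... work that needs the prefetchers off ... */
	fake_msr = saved;	/* user-selected bits survive */
}
```

If a user had disabled one prefetcher through sysfs, the first pattern silently re-enables it, while the second leaves the user's setting intact.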

^ permalink raw reply	[flat|nested] 19+ messages in thread

* RE: [PATCH v2 8/8] docs: ABI: Add sysfs documentation interface of hardware prefetch control driver
  2022-03-14 16:39   ` Jonathan Cameron
@ 2022-03-16 12:56     ` tarumizu.kohei
  0 siblings, 0 replies; 19+ messages in thread
From: tarumizu.kohei @ 2022-03-16 12:56 UTC (permalink / raw)
  To: 'Jonathan Cameron'
  Cc: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
	linux-arm-kernel, linux-kernel

> Hi,
> 
> I'm going to review this with only a fairly basic knowledge of prefetchers and with
> no particular design in mind (though I'll point at ARM docs because they are
> generally good and easy to find ;) Key thing on an ABI like this is to maintain
> flexibility for other implementations.
> 
> It makes me a bit nervous to see an interface for something like this being defined
> with only a couple of implementations.  There are others with public
> documentation such as the ARM N2.
> 
> https://developer.arm.com/documentation/102099/0000/The-Neoverse-N2--core
> 
> As is clear from below, not a lot of this is shared between CPUs so far so it might
> make more sense to document them separately?

> 
> I'd be tempted to have this as multiple blocks.  A lot of the documentation is not
> generic to all of them and that approach tends to give documentation files that are
> easier to add to later.

As you commented, there are no shared attributes between different CPUs.
Therefore, I will split the document into separate blocks for x86 and arm64.

> No need to say how this is implemented.  If someone wants to do it with a
> mailbox for a particular CPU that would be fine at the ABI level.
> 
> > +		These files exists in every CPU's cache/index[0,2] directory,
> 
> Can see where they are from above.  No need to repeat that bit.

I will remove these descriptions in the next version.

> Perhaps
> "Attributes are only present if the particular cache implements the relevant
> prefetcher controls".  Or maybe just "All controls are optional".

The former description is more accurate. I will use it in the next version.

> How likely is it that other prefetcher implementations will allow more than two
> levels for this?  Can we define this ABI more broadly to allow that?  Assuming
> such a scale might exist, this needs renaming to stream_detect_prefetcher_strength
> (as no longer on or off)
> 
> Easiest way to do that would probably be to use a separate
> stream_detect_prefetcher_strength_available that lists possible values (or min,
> step, max if that makes more sense).
> Here,
> 
> 0 1
> 
> With my very limited knowledge of the details, a multilevel approach would map
> better to controls like RPF_MODE in the N2
> IMP_CPUECTLR_EL1 register which has 4 levels for example (though I have no
> input on whether those levels could map to 'strength').
> 

As you commented, the 'strong' attribute may need more than the current two
levels (strong or weak) in the future. To allow that, I will reconsider
the interface specification.

> Having two possible ways of representing 'auto' seems likely to cause testing
> complexity for no particular benefit. I'd just not allow one of them.

I will modify it in the next version so that only 'auto' is allowed and 0 is rejected.
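
A minimal sketch of what such a store-side parser could look like, written as plain userspace C so it can be tried in isolation. Everything here is an assumption for illustration: parse_dist(), DIST_AUTO, and the `unit` parameter (a placeholder for the hardware granularity that values are rounded up to) are hypothetical and not taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

#define DIST_AUTO 0UL	/* assumed internal encoding of "auto" */

/*
 * Hypothetical parser for a value written to the dist attribute.
 * Accepts "auto" or a positive byte count, and rejects "0" so that
 * "auto" has exactly one spelling. Byte counts are rounded up to a
 * multiple of `unit`. Returns 0 and stores the result in *out, or
 * -EINVAL on bad input.
 */
static int parse_dist(const char *buf, unsigned long unit, unsigned long *out)
{
	char *end;
	unsigned long val;

	assert(buf != NULL && out != NULL && unit > 0);

	if (strcmp(buf, "auto") == 0 || strcmp(buf, "auto\n") == 0) {
		*out = DIST_AUTO;
		return 0;
	}

	/* Require a leading digit so that signs and whitespace are rejected. */
	if (buf[0] < '0' || buf[0] > '9')
		return -EINVAL;

	errno = 0;
	val = strtoul(buf, &end, 10);
	if (errno != 0)
		return -EINVAL;
	if (*end == '\n')	/* allow the newline that echo appends */
		end++;
	if (*end != '\0')
		return -EINVAL;
	if (val == 0)		/* "0" is not an alias for "auto" */
		return -EINVAL;
	if (val > ULONG_MAX - (unit - 1))
		return -EINVAL;	/* rounding up would overflow */

	/* Round up to the next multiple of unit. */
	*out = (val + unit - 1) / unit * unit;
	return 0;
}
```

With a single spelling for "auto", a test only has to cover one special input plus the numeric range, which is the testing simplification asked for in the review.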

> I would not list supported processors in here.  It will get out of sync if this
> becomes popular and it should be easy to see if it is supported by whether the
> sysfs attribute exists or not.

I will remove the description of supported processors, because support can
be determined by whether the sysfs attribute exists.

> > +
> > +			- "* Hardware Prefetcher Disable (R/W)"
> > +			    corresponds to the "hardware_prefetcher_enable"
> > +
> > +			- "* Adjacent Cache Line Prefetcher Disable (R/W)"
> > +			    corresponds to the
> "adjacent_cache_line_prefetcher_enable"
> > +
> > +			- "* IP Prefetcher Disable (R/W)"
> > +			    corresponds to the "ip_prefetcher_enable"
> 
> I'm not sure on whether this should be here or not.  It seems like a path to some
> very long documentation once 10+ processor families are supported.
> However, there may be no better place to put this information.

The current MSR specification is defined by a combination of the above
three types of prefetcher. Therefore, I do not expect the amount of
text to grow any further.

For reference, MSR 0x1A4 falls into the following three types,
depending on the processor model.

* type A (INTEL_FAM6_BROADWELL, etc.)

[0]    L2 Hardware Prefetcher Disable (R/W)
[1]    L2 Adjacent Cache Line Prefetcher Disable (R/W)
[2]    DCU Hardware Prefetcher Disable (R/W)
[3]    DCU IP Prefetcher Disable (R/W)
[63:4] Reserved

* type B (INTEL_FAM6_XEON_PHI_KNL, etc.)

[0]    DCU Hardware Prefetcher Disable (R/W)
[1]    L2 Hardware Prefetcher Disable (R/W)
[63:2] Reserved

* type C (INTEL_FAM6_ATOM_SILVERMONT_D, etc.)

[0]    L2 Hardware Prefetcher Disable (R/W)
[1]    Reserved
[2]    DCU Hardware Prefetcher Disable (R/W)
[63:3] Reserved
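
Because the register bits mean "disable" while the proposed attributes mean "enable", the mapping involves an inversion. A small sketch for the type A layout only; the macro and function names are hypothetical and are not the ones used in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions for the "type A" layout listed above. */
#define TYPE_A_L2_HW_PF_DIS		(UINT64_C(1) << 0)
#define TYPE_A_L2_ADJ_LINE_PF_DIS	(UINT64_C(1) << 1)
#define TYPE_A_DCU_HW_PF_DIS		(UINT64_C(1) << 2)
#define TYPE_A_DCU_IP_PF_DIS		(UINT64_C(1) << 3)

/* The MSR holds "disable" bits; the attribute reports "enable". */
static bool pf_enabled(uint64_t msr, uint64_t dis_bit)
{
	assert(dis_bit != 0);
	return (msr & dis_bit) == 0;
}

/*
 * Return the new register value with one prefetcher enabled or
 * disabled. All other bits are preserved.
 */
static uint64_t pf_set_enabled(uint64_t msr, uint64_t dis_bit, bool enable)
{
	return enable ? (msr & ~dis_bit) : (msr | dis_bit);
}
```

Types B and C would need their own bit definitions, since bit 0 and bit 1 change meaning between layouts.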

^ permalink raw reply	[flat|nested] 19+ messages in thread

* RE: [PATCH v2 0/8] Add hardware prefetch control driver for arm64 and x86
  2022-03-14 19:19 ` [PATCH v2 0/8] Add hardware prefetch control driver for arm64 and x86 Dave Hansen
@ 2022-03-18  6:34   ` tarumizu.kohei
  0 siblings, 0 replies; 19+ messages in thread
From: tarumizu.kohei @ 2022-03-18  6:34 UTC (permalink / raw)
  To: 'Dave Hansen',
	catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
	linux-arm-kernel, linux-kernel

> I take it that there are users out there today that are sufficiently motivated by the
> increased performance that they just do "wrmsr 0x1a4 0x1234".
> 
> You talked about this in the "[Merit]" section.  But, that's a _little_ unconvincing.
> I don't doubt that there is *a* workload out there that can benefit from hardware
> prefetcher tweaks.
> 
> Do we really expect end users to run their workloads and tweak these values to
> find something optimal for them?

In addition to the sample benchmarks in the [Merit] section, we assume
that some workloads will benefit from tuning the prefetchers. We expect that
users can find the best parameters by using a tunable interface from userspace.

I will look for a specific workload whose performance improves.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* RE: [PATCH v2 6/8] x86: Add hardware prefetch control support for x86
  2022-03-14 21:05   ` Dave Hansen
@ 2022-03-18  6:41     ` tarumizu.kohei
  0 siblings, 0 replies; 19+ messages in thread
From: tarumizu.kohei @ 2022-03-18  6:41 UTC (permalink / raw)
  To: 'Dave Hansen',
	catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
	linux-arm-kernel, linux-kernel

> > +static int broadwell_write_pfreg(enum pfctl_attr pattr, unsigned int cpu,
> > +				 unsigned int level, u64 val)
> > +{
> > +	int ret;
> > +	u64 reg;
> > +
> > +	ret = rdmsrl_on_cpu(cpu, MSR_MISC_FEATURE_CONTROL, &reg);
> > +	if (ret)
> > +		return ret;
> > +
> > +	ret = broadwell_modify_pfreg(pattr, &reg, level, val);
> > +	if (ret < 0)
> > +		return ret;
> > +
> > +	ret = wrmsrl_on_cpu(cpu, MSR_MISC_FEATURE_CONTROL, reg);
> > +	if (ret)
> > +		return ret;
> > +
> > +	return 0;
> > +}
> 
> This needs to integrate _somehow_ with the pseudo_lock.c code.  Right now, I
> suspect that code would just overwrite any MSR changes made by this code.

I had not considered the pseudo_lock.c code. I will try to integrate
with that code.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 3/8] arm64: Add hardware prefetch control support for ARM64
  2022-03-11 10:19 ` [PATCH v2 3/8] arm64: Add hardware prefetch control support for ARM64 Kohei Tarumizu
@ 2022-03-30 22:11   ` Rob Herring
  2022-04-04 11:56     ` tarumizu.kohei
  0 siblings, 1 reply; 19+ messages in thread
From: Rob Herring @ 2022-03-30 22:11 UTC (permalink / raw)
  To: Kohei Tarumizu
  Cc: Catalin Marinas, Will Deacon, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, X86 ML, H. Peter Anvin,
	linux-arm-kernel, linux-kernel

On Fri, Mar 11, 2022 at 4:23 AM Kohei Tarumizu
<tarumizu.kohei@fujitsu.com> wrote:
>
> This adds module init/exit code, and creates sysfs attribute files for
> "stream_detect_prefetcher_enable", "stream_detect_prefetcher_strong"
> and "stream_detect_prefetcher_dist". This driver works only if part
> number is FUJITSU_CPU_PART_A64FX at this point. The details of the
> registers to be read and written in this patch are described below.
>
> "https://github.com/fujitsu/A64FX/tree/master/doc/"
>     A64FX_Specification_HPC_Extension_v1_EN.pdf
>
> Signed-off-by: Kohei Tarumizu <tarumizu.kohei@fujitsu.com>
> ---
>  arch/arm64/kernel/pfctl.c | 368 ++++++++++++++++++++++++++++++++++++++

This has nothing to do with arm64 arch other than you access registers
as sysregs. That's not enough of a reason to put in arch/arm64. Move
this to drivers/ assuming it continues. I agree that this seems
questionable to expose to userspace in the first place...

Rob

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 5/8] arm64: Create cache sysfs directory without ACPI PPTT for hardware prefetch control
  2022-03-11 10:19 ` [PATCH v2 5/8] arm64: Create cache sysfs directory without ACPI PPTT for hardware prefetch control Kohei Tarumizu
@ 2022-03-30 22:14   ` Rob Herring
  2022-04-04 11:48     ` tarumizu.kohei
  0 siblings, 1 reply; 19+ messages in thread
From: Rob Herring @ 2022-03-30 22:14 UTC (permalink / raw)
  To: Kohei Tarumizu
  Cc: Catalin Marinas, Will Deacon, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, X86 ML, H. Peter Anvin,
	linux-arm-kernel, linux-kernel

On Fri, Mar 11, 2022 at 4:23 AM Kohei Tarumizu
<tarumizu.kohei@fujitsu.com> wrote:
>
> This patch create a cache sysfs directory without ACPI PPTT if the
> CONFIG_HWPF_CONTROL is true.
>
> Hardware prefetch control driver need cache sysfs directory and cache
> level/type information. In ARM processor, these information can be
> obtained from the register even without PPTT.

What registers? CCSIDR register is no longer used. You must use DT or PPTT.

> Therefore, we set the
> cpu_map_populated to true to create cache sysfs directory if the
> machine doesn't have PPTT.
>
> Signed-off-by: Kohei Tarumizu <tarumizu.kohei@fujitsu.com>
> ---
>  arch/arm64/kernel/cacheinfo.c | 29 +++++++++++++++++++++++++++++
>  1 file changed, 29 insertions(+)

^ permalink raw reply	[flat|nested] 19+ messages in thread

* RE: [PATCH v2 5/8] arm64: Create cache sysfs directory without ACPI PPTT for hardware prefetch control
  2022-03-30 22:14   ` Rob Herring
@ 2022-04-04 11:48     ` tarumizu.kohei
  0 siblings, 0 replies; 19+ messages in thread
From: tarumizu.kohei @ 2022-04-04 11:48 UTC (permalink / raw)
  To: 'Rob Herring'
  Cc: Catalin Marinas, Will Deacon, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, X86 ML, H. Peter Anvin,
	linux-arm-kernel, linux-kernel

> What registers?

>> Hardware prefetch control driver need cache sysfs directory and cache
>> level/type information. In ARM processor, these information can be
>> obtained from the register even without PPTT.

The register I mean is CLIDR_EL1.

> CCSIDR register is no longer used. You must use DT or PPTT.

I know that commit "a8d4636f96ad" (arm64: cacheinfo: Remove CCSIDR-based
cache information probing) removed the code to read the CCSIDR from the
kernel.
Therefore, I only use level and type information that can be read from
CLIDR_EL1. Are there similar concerns when using only CLIDR_EL1
information?
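
For readers unfamiliar with the register: CLIDR_EL1 encodes one 3-bit Ctype field per cache level, which is where the level and type information comes from. The sketch below decodes a CLIDR_EL1 value passed in by the caller (reading the register itself requires EL1). The helper names are hypothetical, and this does not settle whether relying on CLIDR_EL1 alone is acceptable.

```c
#include <assert.h>
#include <stdint.h>

/* Ctype<n> encodings from the Arm architecture definition of CLIDR_EL1. */
enum ctype {
	CTYPE_NONE	= 0,	/* no cache at this level */
	CTYPE_INST	= 1,	/* instruction cache only */
	CTYPE_DATA	= 2,	/* data cache only */
	CTYPE_SEPARATE	= 3,	/* separate instruction and data caches */
	CTYPE_UNIFIED	= 4,	/* unified cache */
};

/* Extract Ctype<level> (level 1..7) from a CLIDR_EL1 value. */
static unsigned int clidr_ctype(uint64_t clidr, unsigned int level)
{
	assert(level >= 1 && level <= 7);
	return (clidr >> (3 * (level - 1))) & 0x7;
}

/* Count implemented levels: stop at the first level with no cache. */
static unsigned int clidr_num_levels(uint64_t clidr)
{
	unsigned int level;

	for (level = 1; level <= 7; level++)
		if (clidr_ctype(clidr, level) == CTYPE_NONE)
			break;
	return level - 1;
}
```

Note that CLIDR_EL1 gives no size, associativity, or sharing information; those were what the removed CCSIDR probing tried to supply.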

^ permalink raw reply	[flat|nested] 19+ messages in thread

* RE: [PATCH v2 3/8] arm64: Add hardware prefetch control support for ARM64
  2022-03-30 22:11   ` Rob Herring
@ 2022-04-04 11:56     ` tarumizu.kohei
  0 siblings, 0 replies; 19+ messages in thread
From: tarumizu.kohei @ 2022-04-04 11:56 UTC (permalink / raw)
  To: 'Rob Herring'
  Cc: Catalin Marinas, Will Deacon, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, X86 ML, H. Peter Anvin,
	linux-arm-kernel, linux-kernel

> This has nothing to do with arm64 arch other than you access registers as sysregs.
> That's not enough of a reason to put in arch/arm64. Move this to drivers/
> assuming it continues. I agree that this seems questionable to expose to
> userspace in the first place...

I will move it under drivers/ in the next version.

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2022-04-04 11:56 UTC | newest]

Thread overview: 19+ messages
2022-03-11 10:19 [PATCH v2 0/8] Add hardware prefetch control driver for arm64 and x86 Kohei Tarumizu
2022-03-11 10:19 ` [PATCH v2 1/8] drivers: base: Add hardware prefetch control core driver Kohei Tarumizu
2022-03-11 10:19 ` [PATCH v2 2/8] drivers: base: Add Kconfig/Makefile to build " Kohei Tarumizu
2022-03-11 10:19 ` [PATCH v2 3/8] arm64: Add hardware prefetch control support for ARM64 Kohei Tarumizu
2022-03-30 22:11   ` Rob Herring
2022-04-04 11:56     ` tarumizu.kohei
2022-03-11 10:19 ` [PATCH v2 4/8] arm64: Add Kconfig/Makefile to build hardware prefetch control driver Kohei Tarumizu
2022-03-11 10:19 ` [PATCH v2 5/8] arm64: Create cache sysfs directory without ACPI PPTT for hardware prefetch control Kohei Tarumizu
2022-03-30 22:14   ` Rob Herring
2022-04-04 11:48     ` tarumizu.kohei
2022-03-11 10:19 ` [PATCH v2 6/8] x86: Add hardware prefetch control support for x86 Kohei Tarumizu
2022-03-14 21:05   ` Dave Hansen
2022-03-18  6:41     ` tarumizu.kohei
2022-03-11 10:19 ` [PATCH v2 7/8] x86: Add Kconfig/Makefile to build hardware prefetch control driver Kohei Tarumizu
2022-03-11 10:19 ` [PATCH v2 8/8] docs: ABI: Add sysfs documentation interface of " Kohei Tarumizu
2022-03-14 16:39   ` Jonathan Cameron
2022-03-16 12:56     ` tarumizu.kohei
2022-03-14 19:19 ` [PATCH v2 0/8] Add hardware prefetch control driver for arm64 and x86 Dave Hansen
2022-03-18  6:34   ` tarumizu.kohei
